(Cornelius Lanczos) Linear Differential Operators


Linear Differential

Operators
SIAM's Classics in Applied Mathematics series consists of books that were
previously allowed to go out of print. These books are republished by SIAM as a
professional service because they continue to be important resources for
mathematical scientists.

Editor-in-Chief
Gene H. Golub, Stanford University

Editorial Board
Richard A. Brualdi, University of Wisconsin-Madison
Herbert B. Keller, California Institute of Technology
Ingram Olkin, Stanford University
Robert E. O'Malley, Jr., University of Washington

Classics in Applied Mathematics


C. C. Lin and L. A. Segel, Mathematics Applied to Deterministic Problems in the Natural
Sciences
Johan G. F. Belinfante and Bernard Kolman, A Survey of Lie Groups and Lie Algebras with
Applications and Computational Methods
James M. Ortega, Numerical Analysis: A Second Course
Anthony V. Fiacco and Garth P. McCormick, Nonlinear Programming: Sequential
Unconstrained Minimization Techniques
F. H. Clarke, Optimization and Nonsmooth Analysis
George F. Carrier and Carl E. Pearson, Ordinary Differential Equations
Leo Breiman, Probability
R. Bellman and G. M. Wing, An Introduction to Invariant Imbedding
Abraham Berman and Robert J. Plemmons, Nonnegative Matrices in the
Mathematical Sciences
Olvi L. Mangasarian, Nonlinear Programming
*Carl Friedrich Gauss, Theory of the Combination of Observations Least Subject to Errors:
Part One, Part Two, Supplement. Translated by G. W. Stewart
Richard Bellman, Introduction to Matrix Analysis
U. M. Ascher, R. M. M. Mattheij, and R. D. Russell, Numerical Solution of Boundary Value
Problems for Ordinary Differential Equations
K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution of Initial-Value Problems
in Differential-Algebraic Equations
Charles L. Lawson and Richard J. Hanson, Solving Least Squares Problems
J. E. Dennis, Jr. and Robert B. Schnabel, Numerical Methods for Unconstrained Optimization
and Nonlinear Equations
Richard E. Barlow and Frank Proschan, Mathematical Theory of Reliability
Cornelius Lanczos, Linear Differential Operators

*First time in print.


Linear Differential
Operators
Cornelius Lanczos

Society for Industrial and Applied Mathematics


Philadelphia
Copyright © 1996 by the Society for Industrial and Applied Mathematics.

This SIAM edition is an unabridged, corrected republication of the work first published by D. Van Nostrand Company, Ltd., London, England, 1961.

SIAM would like to thank Bart Childs, Texas A&M University, for suggesting
the corrections made to this edition.


All rights reserved. Printed in the United States of America. No part of this
book may be reproduced, stored, or transmitted in any manner without the
written permission of the publisher. For information, write to the Society for
Industrial and Applied Mathematics, 3600 University City Science Center,
Philadelphia, PA 19104-2688.

Library of Congress Cataloging-in-Publication Data

Lanczos, Cornelius, 1893-
Linear differential operators / Cornelius Lanczos.
p. cm. -- (Classics in applied mathematics ; 18)
Originally published: London ; New York : Van Nostrand, 1961.
Includes bibliographical references and index.
ISBN 0-89871-370-6 (pbk.)
1. Calculus, Operational. 2. Differential equations, Linear.
I. Title. II. Series.
QA432.L3 1996
515'.7242--dc20 96-14933

The royalties from the sales of this book are being placed in a fund to help
students attend SIAM meetings and other SIAM related activities. This fund is
administered by SIAM and qualified individuals are encouraged to write
directly to SIAM for guidelines.

SIAM is a registered trademark.
To the apostle of universal humanity, Father Pire, whose charity knows no limits.

CONTENTS

PAGE
PREFACE xiii
BIBLIOGRAPHY xvii
1. INTERPOLATION
1. Introduction 1
2. The Taylor expansion 2
3. The finite Taylor series with the remainder term 3
4. Interpolation by polynomials 5
5. The remainder of Lagrangian interpolation formula 6
6. Equidistant interpolation 8
7. Local and global interpolation 11
8. Interpolation by central differences 13
9. Interpolation around the midpoint of the range 16
10. The Laguerre polynomials 17
11. Binomial expansions 21
12. The decisive integral transform 24
13. Binomial expansions of the hypergeometric type 26
14. Recurrence relations 27
15. The Laplace transform 29
16. The Stirling expansion 32
17. Operations with the Stirling functions 34
18. An integral transform of the Fourier type 35
19. Recurrence relations associated with the Stirling series 37
20. Interpolation of the Fourier transform 40
21. The general integral transform associated with the Stirling
series 42
22. Interpolation of the Bessel functions 45
2. HARMONIC ANALYSIS
1. Introduction 49
2. The Fourier series for differentiable functions 50
3. The remainder of the finite Fourier expansion 53
4. Functions of higher differentiability 56
5. An alternative method of estimation 58
6. The Gibbs oscillations of the finite Fourier series 60
7. The method of the Green's function 66
8. Non-differentiable functions. Dirac's delta function 68
9. Smoothing of the Gibbs oscillations by Fejer's method 71
10. The remainder of the arithmetic mean method 72
11. Differentiation of the Fourier series 74
12. The method of the sigma factors 75
13. Local smoothing by integration 76
14. Smoothing of the Gibbs oscillations by the sigma method 78
15. Expansion of the delta function 80
16. The triangular pulse 81
17. Extension of the class of expandable functions 83
18. Asymptotic relations for the sigma factors 84
19. The method of trigonometric interpolation 89
20. Error bounds for the trigonometric interpolation method 91
21. Relation between equidistant trigonometric and polynomial
interpolations 93
22. The Fourier series in curve fitting 98

3. MATRIX CALCULUS
1. Introduction 100
2. Rectangular matrices 102
3. The basic rules of matrix calculus 103
4. Principal axis transformation of a symmetric matrix 106
5. Decomposition of a symmetric matrix 111
6. Self-adjoint systems 113
7. Arbitrary n x m systems 115
8. Solvability of the general n x m system 118
9. The fundamental decomposition theorem 120
10. The natural inverse of a matrix 124
11. General analysis of linear systems 127
12. Error analysis of linear systems 129
13. Classification of linear systems 134
14. Solution of incomplete systems 139
15. Over-determined systems 141
16. The method of orthogonalisation 142
17. The use of over-determined systems 144
18. The method of successive orthogonalisation 148
19. The bilinear identity 152
20. Minimum property of the smallest eigenvalue 158
4. THE FUNCTION SPACE
1. Introduction 163
2. The viewpoint of pure and applied mathematics 164
3. The language of geometry 165
4. Metrical spaces of infinitely many dimensions 166
5. The function as a vector 167
6. The differential operator as a matrix 170
7. The length of a vector 173
8. The scalar product of two vectors 175
9. The closeness of the algebraic approximation 175
10. The adjoint operator 179
11. The bilinear identity 181
12. The extended Green's identity 182
13. The adjoint boundary conditions 184
14. Incomplete systems 187
15. Over-determined systems 190
16. Compatibility under inhomogeneous boundary conditions 192
17. Green's identity in the realm of partial differential operators 195
18. The fundamental field operations of vector analysis 198
19. Solution of incomplete systems 201
5. THE GREEN'S FUNCTION
1. Introduction 206
2. The role of the adjoint equation 207
3. The role of Green's identity 208
4. The delta function δ(x, ξ) 208
5. The existence of the Green's function 211
6. Inhomogeneous boundary conditions 217
7. The Green's vector 220
8. Self-adjoint systems 225
9. The calculus of variations 229
10. The canonical equations of Hamilton 230
11. The Hamiltonisation of partial operators 237
12. The reciprocity theorem 239
13. Self-adjoint problems. Symmetry of the Green's function 241
14. Reciprocity of the Green's vector 241
15. The superposition principle of linear operators 244
16. The Green's function in the realm of ordinary differential
operators 247
17. The change of boundary conditions 255
18. The remainder of the Taylor series 256
19. The remainder of the Lagrangian interpolation formula 258
20. Lagrangian interpolation with double points 263
21. Construction of the Green's vector 266
22. The constrained Green's function 270
23. Legendre's differential equation 275
24. Inhomogeneous boundary conditions 278
25. The method of over-determination 281
26. Orthogonal expansions 286
27. The bilinear expansion 291
28. Hermitian problems 299
29. The completion of linear operators 308
6. COMMUNICATION PROBLEMS
1. Introduction 315
2. The step function and related functions 315
3. The step function response and higher order responses 320
4. The input-output relation of a galvanometer 323
5. The fidelity problem of the galvanometer response 325
6. Fidelity damping 327
7. The error of the galvanometer recording 328
8. The input-output relation of linear communication devices 330
9. Frequency analysis 334
10. The Laplace transform 336
11. The memory time 337
12. Steady state analysis of music and speech 339
13. Transient analysis of noise phenomena 342
7. STURM-LIOUVILLE PROBLEMS
1. Introduction 348
2. Differential equations of fundamental significance 349
3. The weighted Green's identity 352
4. Second order operators in self-adjoint form 356
5. Transformation of the dependent variable 359
6. The Green's function of the general second order differential
equation 364
7. Normalisation of second order problems 368
8. Riccati's differential equation 370
9. Periodic solutions 371
10. Approximate solution of a differential equation of second
order 374
11. The joining of regions 376
12. Bessel functions and the hypergeometric series 378
13. Asymptotic properties of J_p(z) in the complex domain 380
14. Asymptotic expression of J_p(x) for large values of x 382
15. Behaviour of J_p(z) along the imaginary axis 384
16. The Bessel functions of the order 1/3 385
17. Jump conditions for the transition "exponential-periodic" 387
18. Jump conditions for the transition "periodic-exponential" 388
19. Amplitude and phase in the periodic domain 389
20. Eigenvalue problems 390
21. Hermite's differential equation 391
22. Bessel's differential equation 394
23. The substitute functions in the transitory range 400
24. Tabulation of the four substitute functions 404
25. Increased accuracy in the transition domain 405
26. Eigensolutions reducible to the hypergeometric series 409
27. The ultraspherical polynomials 410
28. The Legendre polynomials 412
29. The Laguerre polynomials 418
30. The exact amplitude equation 420
31. Sturm-Liouville problems and the calculus of variations 425
8. BOUNDARY VALUE PROBLEMS
1. Introduction 432
2. Inhomogeneous boundary conditions 435
3. The method of the "separation of variables" 438
4. The potential equation of the plane 439
5. The potential equation in three dimensions 448
6. Vibration problems 464
7. The problem of the vibrating string 456
8. The analytical nature of hyperbolic differential operators 464
9. The heat flow equation 469
10. Minimum problems with constraints 472
11. Integral equations in the service of boundary value problems 476
12. The conservation laws of mechanics 479
13. Unconventional boundary value problems 486
14. The eigenvalue λ = 0 as a limit point 487
15. Variational motivation of the parasitic spectrum 494
16. Examples for the parasitic spectrum 498
17. Physical boundary conditions 504
18. A universal approach to the theory of boundary value
problems 508
9. NUMERICAL SOLUTION OF TRAJECTORY PROBLEMS
1. Introduction 512
2. Differential equations in normal form 513
3. Trajectory problems 514
4. Local expansions 515
5. The method of undetermined coefficients 517
6. Lagrangian interpolation in terms of double points 520
7. Extrapolations of maximum efficiency 521
8. Extrapolations of minimum round-off 521
9. Estimation of the truncation error 524
10. End-point extrapolation 526
11. Mid-point interpolations 527
12. The problem of starting values 529
13. The accumulation of truncation errors 531
14. The method of Gaussian quadrature 534
15. Global integration by Chebyshev polynomials 536
16. Numerical aspects of the method of global integration 540
17. The method of global correction 546
Appendix 551
Index 555
PREFACE

In one of the (unfortunately lost) comedies of Aristophanes the Voice of the Mathematician appeared, as it descended from a snow-capped mountain
peak, pronouncing in a ponderous sing-song—in words which to the
audience sounded like complete gibberish—his eternal Theorems, Lemmas,
and Corollaries. The laughter of the listeners was enhanced by the
implication that in fifty years' time another Candidate of Eternity would
pronounce from the same snow-capped mountain peak exactly the same
theorems, although in a modified but scarcely less ponderous and in-
comprehensible language.
Since the days of antiquity it has been the privilege of the mathematician
to engrave his conclusions, expressed in a rarefied and esoteric language,
upon the rocks of eternity. While this method is excellent for the codifica-
tion of mathematical results, it is not so acceptable to the many addicts of
mathematics, for whom the science of mathematics is not a logical game,
but the language in which the physical universe speaks to us, and whose
mastery is inevitable for the comprehension of natural phenomena.
In his previous books the author endeavoured to establish a more
discursive manner of presentation in which the esoteric shorthand formulation
of mathematical deductions and results was replaced by a more philosophic
exposition, putting the emphasis on ideas and concepts and their mutual
interrelations, rather than on the mere manipulation of formulae. Our
symbolic mechanism is eminently useful and powerful, but the danger is
ever-present that we become drowned in a language which has its well-
defined grammatical rules but eventually loses all content and becomes a
nebulous sham. Hence the author's constant desire to penetrate below the
manipulative surface and comprehend the hidden springs of mathematical
equations.
To the author's surprise this method (which, of course, is not his
monopoly) was well received and made many friends and few enemies. It
is thus his hope that the present book, which is devoted to the fundamental
aspects of the theory of Linear Differential Operators, will likewise find its
adherents. The book is written at advanced level but does not require
any specific knowledge which goes beyond the boundaries of the customary
introductory courses, since the necessary tools of the subject are developed
as the narration proceeds.
Indeed, the first three chapters are of an introductory nature, exploring
some of the technical tools which will be required in the later treatment.
Since the algebraic viewpoint will be the principal beacon throughout our
journey, the problem of obtaining a function from a discrete set of values, that is the problem of Interpolation, is the first station on our itinerary. We
investigate the properties of the Gregory-Newton and Stirling type of
interpolations and their limitations, encountering more than one strange
phenomenon which is usually left unheeded. The second station is Harmonic
Analysis. Here we have a chance to study at close range the remarkable
manner in which a series of orthogonal functions, terminated after a finite
number of terms, approximates a function. We can hardly find a better
introduction to those "orthogonal expansions", which will play such a vital
role in our later studies, than by studying the nature of the Fourier series.
The third station is Matrix Calculus. Here we encounter for the first time
that fundamental "decomposition theorem" which in proper re-interpreta-
tion will become the theme-song of our later explorations.
Through the concept of the Function Space we establish the link between
matrices and the continuous domain, and we proceed to the central problem
of the Green's Function which—being the inverse operator—plays such a
central role in the solution of differential equations. Certain elementary
aspects of Communication Problems provide a number of interesting applica-
tions of the Green's function method in engineering problems, whence we
proceed to the Sturm-Liouville Problems which played such a remarkable
role in the historical development of mathematical physics. In the chapter
on Boundary Value Problems we get acquainted with the classical examples
of the solution method known as the "separation of variables", but we add
some highly non-traditional types of boundary value problems which bring
the peculiar "parasitic spectrum" in appearance. The book comes to a
close with a brief chapter on the Numerical Solution of Trajectory Problems.
One may well ask why it was necessary to add another treatise on
differential equations to the many excellent textbooks which are already on
the market. It is the author's contention, however, that neither the
treatment nor the selection of the material duplicates any of the standard
treatises. Points are stressed which often find scanty attention, while
large fields are neglected which take a prominent part in other treatments.
The emphasis is constantly on the one question: what are the basic and
characteristic properties of linear differential operators? Manipulative skill is
relegated to a more secondary place—although it is the author's conviction
that the student who works seriously on the 350 "Problems" posed (and
solved) in the course of discussions, will in fact develop a "feel" for the
peculiarities of differential equations which will enable him to try his hands
on more specialised and technically more involved problems, encountered
in physical and industrial research. (The designation "Problem" instead
of "Exercise" may be resented as too pretentious. However, the author
does not feel himself in the role of a teacher, who hands out "home-work"
to the student, in order to prepare him for examinations. These "Problems"
arise naturally from the proceedings in the form of questions or puzzles
which deserve an answer. They often complement the text on insufficiently
treated details and induce the student to ask questions of his own, "flying
off at a tangent", if necessary. At the same time they force him to develop
those manipulative skills, without which the successful study of mathematics
is not conceivable.)
It is the author's hope that his book will stimulate discussions and
research at the graduate level. Although the scope of the book is restricted
to certain fundamental aspects of the theory of linear differential operators,
the thorough and comprehensive study of these aspects seemed to him well
worth pursuing. By a peculiar quirk of historical development the brilliant
researches of Fredholm and Hilbert in the field of integral equations over-
shadowed the importance of differential operators, and the tendency is
widespread to transform a given differential equation immediately into an
integral equation, and particularly an integral equation of the Fredholm
type which in algebraic language is automatically equivalent to the n x n
type of matrices. This is a tendency which completely overlooks the true
nature of partial differential operators. The present book departs sharply
from the preconceived notion of "well-posed" problems and puts the
general—that is arbitrarily over-determined or under-determined—case in
the focus of interest. The properties of differential operators are thus
examined on an unbiased basis and a theory is developed which submerges
the "well-posed " type of problems in a much more comprehensive framework.
The author apologises to the purist and the modernist that his language
is that of classical mathematics to which he is bound by tradition and
conviction. In his opinion the classical methods can go a long way in the
investigation of the fundamental problems which arise in the field of
differential operators. This is not meant, however, as a slight on those
who with more powerful tools may reach much more sweeping results.
Yet, there was still another viewpoint which militated against an overly
"modernistic" treatment. This book is written primarily for the natural
scientist and engineer to whom a problem in ordinary or partial differential
equations is not a problem of logical acrobatism, but a problem in the
exploration of the physical universe. To get an explicit solution of a given
boundary value problem is in this age of large electronic computers no
longer a basic question. The problem can be coded for the machine and
the numerical answer obtained. But of what value is the numerical answer
if the scientist does not understand the peculiar analytical properties and
idiosyncrasies of the given operator? The author hopes that this book will
help him in this task by telling him something about the manifold aspects
of a fascinating field which is still far from being properly explored.
Acknowledgements. In the Winter Semester 1957-58 the author had the
privilege to give a course on "Selected Topics of Applied Analysis" in the
Graduate Seminar of Professor A. Lonseth, Oregon State College, Corvallis,
Oregon. The lecture notes of that course form the basic core from which the
present book took its start.
By the generous invitation of Professor R. E. Langer the excellent
research facilities and stimulating associations of the Mathematics Research
Center of the U.S. Army in Madison, Wis., were opened to the author, in
the winter of 1959-60. He is likewise indebted to Professor John W. Carr III, Director of the Computation Center, University of North Carolina,
Chapel Hill, N.C., for the memorable time spent with him and his graduate
group.
These opportunities "far from the home base " were happily complemented
by the stimulating daily "tea-time" discussions with the junior and senior
staff of the School of Theoretical Physics, Dublin Institute for Advanced
Studies, in particular with Professor John L. Synge, Director of the School,
whose animated inquiries brought to the surface and elucidated many
hidden corners of the subject.
Finally the author wishes to express his heartfelt thanks to his publishers,
the Van Nostrand Company, for their unfailing courtesy and understanding.
Dublin, November 1960 C. L.
BIBLIOGRAPHY

The following textbooks, written in the English language, and selected from a very extensive literature, contain material which in parts parallels the
discussions of the present volume, and which can be recommended for
collateral or more advanced reading. Additional sources are listed at the
end of each chapter. References in braces { } refer to the books of the
general Bibliography, those in brackets [ ] to the books of the chapter
bibliographies.
{1} Courant, R. and D. Hilbert, Methods of Mathematical Physics, Vol. 1
(Interscience Publishers, New York, 1953)
{2} Duff, G. F. D., Partial Differential Equations (University of Toronto Press,
1956)
{3} Friedman, B., Principles and Techniques of Applied Mathematics (John
Wiley & Sons, 1957)
{4} Ince, E. L., Ordinary Differential Equations (Dover, New York, 1944)
{5} Jeffreys, H. and B. S. Jeffreys, Mathematical Methods of Physics (Cambridge
University Press, 1956)
{6} Margenau, H. and G. M. Murphy, The Mathematics of Physics and
Chemistry, 2nd Ed. (D. Van Nostrand, 1956)
{7} Morse, P. M. and H. Feshbach, Methods of Theoretical Physics (McGraw-
Hill, 1953)
{8} Page, C. H., Physical Mathematics (D. Van Nostrand, 1955)
{9} Sneddon, I. N., Elements of Partial Differential Equations (McGraw-Hill,
1957)
{10} Sommerfeld, A., Partial Differential Equations of Physics (Academic Press,
New York, 1949)
{11} Webster, A. G., Partial Differential Equations of Mathematical Physics
(Hafner, New York, 1941)
{12} Whittaker, E. T., and G. N. Watson, A Course of Modern Analysis
(Cambridge University Press, 1940)

CHAPTER 1

INTERPOLATION

Synopsis. We investigate the two types of equidistant interpolation procedures, corresponding to the Gregory-Newton and the Stirling
formulae. We get acquainted with the Laguerre polynomials and their
intimate relation to equidistant interpolation. We learn that only a
very restricted class of functions, characterised by a certain integral
transform, allows the Gregory-Newton type of interpolation, while
another integral transform is characteristic for the Stirling type of
interpolation.

1.1. Introduction
The art of interpolation goes back to the early Hindu algebraists. The
idea of "linear interpolation" was in fact known by the early Egyptians and
Babylonians and belongs to the earliest arithmetic experiences of mankind.
But the science of interpolation in its more intricate forms starts with the
time of Newton and Wallis. The art of table-making brought into the
foreground the idea of obtaining some intermediate values of the tabulated
function in terms of the calculated tabular values, and the aim was to
achieve an accuracy which could match the accuracy of the basic values.
Since these values were often obtained with a large number of significant
figures, the art of interpolation had to be explored with great circumspection.
And thus we see the contemporaries of Newton, particularly Gregory,
Stirling, and Newton himself, developing the fundamental tools of the
calculus of interpolation.
The unsettled question remained, to what extent can we trust the
convergence of the various interpolation formulas. This question could not
be settled without the evolution of that exact "limit concept" which came
about in the beginning of the 19th century, through the efforts of Cauchy
and Gauss. But the true nature of equidistant interpolation was discovered
even later, around 1900, through the investigations of Runge and Borel.
Our aim in the present chapter will be to discuss some of the fundamental
aspects of the theory of interpolation, in particular those features of the
theory which can be put to good use in the later study of differential
equations. As a general introduction to the processes of higher analysis
one could hardly find a more suitable subject than the theory of interpolation.

1.2. The Taylor expansion


One of the most fundamental tools of higher mathematics is the well-
known "Taylor expansion" which is known on the one hand as an infinite
series and on the other as a finite series with a remainder term. We assume
that the function f(x) is "analytical" in the neighbourhood of a certain
point x = a. This means that f(x + iy), considered as a function of the
complex variable x + iy, possesses a unique derivative f'(x + iy) at that particular point x = a, y = 0. In that case the derivatives of all orders exist and we can consider the infinity of values

    f(a), f'(a), f''(a), . . . , f^(k)(a), . . .    (1)

from which we can construct the infinite series

    F(z) = f(a) + (z − a) f'(a) + (z − a)²/2! f''(a) + (z − a)³/3! f'''(a) + ⋯    (2)

Although by formal differentiation on both sides we can prove that F(z) coincides with f(z) in all its derivatives at the point z = a, this does not prove that the infinite series (2)* is meaningful and that it represents f(z).
But it is shown in the theory of analytical functions that in fact the infinite
series (2) does converge in a certain domain of the complex variable
z = x + iy and actually converges to f(z) at every point of the domain of
convergence. We can say even more and designate the domain of con-
vergence quite accurately. It is determined by the inside of a circle whose
centre is at the point z = a of the complex plane and whose radius extends
to the nearest "singular" point of the function, that is a point in which
the analytical character of f(z) ceases to exist; (f(z) might become infinite
for example). If it so happens that f(z) is an "entire function"—which
remains analytical for all finite values of z—then the radius of convergence
becomes infinite, i.e. the Taylor series converges for all values of z.
Generally, however, the radius of convergence is restricted to a definite value
beyond which the expansion (2) diverges and loses its meaning.
Exactly on the circle of convergence the series may or may not converge,
depending on the individuality of the function.
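A small numerical illustration may help here (a sketch of ours, not part of the original text): for f(x) = 1/(1 + x²), a function perfectly smooth on the real axis, the singular points z = ±i limit the circle of convergence about a = 0 to the radius 1, and the partial sums betray this boundary quite visibly.

```python
# Partial sums of the Taylor series of f(x) = 1/(1 + x**2) about a = 0:
# f_n(x) = sum_{k=0}^{n-1} (-1)**k * x**(2k), a geometric series in -x**2.
# The nearest singularities sit at z = +i and -i, so r = 1.

def partial_sum(x, n):
    return sum((-1)**k * x**(2*k) for k in range(n))

for x in (0.5, 0.9, 1.1):
    exact = 1.0 / (1.0 + x*x)
    sums = [partial_sum(x, n) for n in (5, 15, 30)]
    print(f"x = {x}: partial sums {sums}, exact {exact:.6f}")
# For |x| < 1 the sums settle toward the exact value; at x = 1.1,
# beyond the circle of convergence, they oscillate with growing amplitude.
```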
The remarkable feature of the expansion (2) is that the given data are
taken from the infinitesimal neighbourhood of the point x = a, the "centre
of expansion". If f(x) is given between x = a — e and x = a + e—no
matter how small e is chosen—we can form the successive difference co-
efficients of first, second, third, . . . order with a Ax which is sufficiently
small and which converges to zero. Hence our data do not involve more than an infinitesimal element of the function and yet we can predict what the value of f(x) will be outside of the point x = a, within a circle of the complex plane whose centre is at x = a. The Taylor series is thus not an interpolating but an extrapolating series.

* Equations encountered in the current section are quoted by the last digit only; hence (2), encountered in Section 2, refers to equation (1.2.2). Equations quoted by two digits refer to another section of the same chapter; e.g. (4.1) refers to equation (1.4.1) of the present chapter, while the same equation, if encountered in a later chapter, would be quoted as (1.4.1). "Chapter 5.18" refers to section 18 of Chapter 5. The author's book Applied Analysis (Prentice-Hall, 1957) is quoted by A. A.
Problem 1. Given the following function:

Find the convergence radius of the Taylor series if the centre of expansion is
at x = π.
[Answer: r = 2π]
Problem 2. The mere existence of all the derivatives f^(k)(a) on the real axis is not sufficient for the existence of the Taylor series. Show that the function

    f(x) = e^(−1/x²)    (x ≠ 0),    f(0) = 0    (4)

possesses derivatives of all order at x = 0 (if f(x) is considered as function of the real variable x), and that all these derivatives vanish. The corresponding Taylor series vanishes identically and does not converge to f(x), except at the single point x = 0. Show that f(x + iy) is not analytical at the point x = 0, y = 0.
Problem 3. Find the radius of convergence of the Taylor expansion of (4), if
the centre of expansion is at the point x = 4.
[Answer: r = 4]

1.3. The finite Taylor series with the remainder term


In the early days of calculus an "infinite series" was taken literally, viz.
the actual sum of an infinity of terms. The exact limit theory developed
by Gauss and Cauchy attached a more precise meaning to the expression
"infinity of terms". It is obviously impossible to add up an infinity of
terms and what we actually mean is that we add more and more terms and
thus hope to approach f(x) more and more. Under no circumstances can
an infinite series be conceived to be more than a never ending approximation
process. No matter how many terms of the series we have added, we still
have not obtained f(x) exactly. However, the "convergence" of an infinite
series permits us to make the remaining difference between the sum f_n(x) and f(x) itself as small as we wish, although we cannot make it zero. This is the meaning of the statement that the infinite series gives us in the limit f(x):

    lim_{n→∞} f_n(x) = f(x)    (1)

The unfortunate feature of this symbolism is that the equality sign is used
for an infinite process in which in fact equality never occurs.
Instead of operating with the infinite Taylor series we may prefer the
use of the finite series

    f_n(x) = f(a) + (x − a) f'(a) + (x − a)²/2! f''(a) + ⋯ + (x − a)^(n−1)/(n−1)! f^(n−1)(a)    (2)

together with an estimation of the "remainder" of the series, defined by

    η_n(x) = f(x) − f_n(x)    (3)

We shall see later (cf. Chapter 5.18) that on the basis of the general theory of differential operators we can derive a very definite expression for η_n(x) in the form of the following definite integral:

    η_n(x) = 1/(n−1)! ∫_a^x (x − t)^(n−1) f^(n)(t) dt    (4)

which we may put in the frequently more convenient form

    η_n(x) = (x − a)^n/(n−1)! ∫_0^1 (1 − t)^(n−1) f^(n)(a + (x − a)t) dt    (5)

Now the remainder of an approximation process need not be known with full accuracy. What we want is merely an estimation of the error. This is
possible in the case of (5), on the basis of the mean value theorem of integral calculus:

    ∫_a^b f(x) p(x) dx = f(x̄) ∫_a^b p(x) dx    (6)

which holds if p(x) does not change its sign in the interval [a, b] and f(x) is continuous; x̄ is some unknown point of the interval [a, b]. These conditions are satisfied in the case of (5) if we identify f(t) with f^(n)(a + (x − a)t) and p(t) with (1 − t)^(n−1). Hence we obtain the estimation

    η_n(x) = (x − a)^n/n! f^(n)(x̄)    (7)

where x̄ is some unknown point of the interval [a, x].


The finite expansion (2) with the remainder (7) gives more information
than the infinite series (2.2). The analytical nature of f(x + iy) is no longer
assumed. It is not even demanded that the derivatives of all order exist at
the point x = a since derivatives of higher than nth order do not appear in either f_n(x) or η_n(x). Nor is the convergence of the series (2) with increasing n demanded. It may happen that η_n(x) decreases up to a certain point and then increases again. In fact, it is even possible that η_n(x) increases to infinity with increasing n. We may yet obtain a very close value of f(x)—that is a very small error η_n(x)—if we stop at the proper value of n.
The difference between a convergent and a divergent series is not that the
first one yields the right value of f(x) while the second has to be discarded
as mathematically valueless. What is true is that they are both approxima-
tions of/(#) with an error which in the first case can be made as small as we
wish, while in the second case the error cannot be reduced below a certain
finite minimum.
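The integral form of the remainder can be checked numerically. The following sketch (ours; the test function e^x and the centre a = 0 are merely assumed for illustration) compares the error of the finite expansion (2) with the integral (4), evaluated by a crude midpoint rule.

```python
import math

def f_n(x, n):                           # truncated Taylor series of e**x
    return sum(x**k / math.factorial(k) for k in range(n))

def remainder_integral(x, n, steps=100000):
    # eta_n(x) = 1/(n-1)! * integral_a^x f^(n)(t) * (x - t)**(n-1) dt,
    # with a = 0 and f^(n)(t) = e**t for f = exp; midpoint quadrature.
    h = x / steps
    total = sum(math.exp((i + 0.5) * h) * (x - (i + 0.5) * h)**(n - 1)
                for i in range(steps))
    return total * h / math.factorial(n - 1)

x, n = 2.0, 5
print(math.exp(x) - f_n(x, n))           # error of the finite expansion (2)
print(remainder_integral(x, n))          # remainder integral (4): same value
```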
Problem 4. Prove the truth of the following statement: "If |f^(n)(x)| has a maximum at the centre of expansion, the error of the truncated Taylor series is smaller than the first neglected term."

Problem 5. Consider the Taylor expansion around x = 0 of the function

for x > 0. Show that for any n > 4 the remainder of the series is smaller than
the first neglected term.
Problem 6. The infinite Taylor series of the function (8) converges only up to x = 1. Let us assume that we want to obtain f(2). How many terms of the series shall we employ for maximum accuracy, and what error bound do we obtain for it? Demonstrate by the discussion of the error term (4) that the error can be greatly diminished by adding the first neglected term with the weight ½.
[Answer:
f_9(2) = 46.8984, corrected by ½ of next term: f*_9(2) = 46.7617
correct value: f(2) = 46.7654]

1.4. Interpolation by polynomials


The finite Taylor series of n terms can be interpreted as a polynomial
approximation of f(x) which has the property that the functional value and
the derivatives up to the order n — 1 coincide at the centre of expansion
x = a. We can equally say that we have constructed a polynomial of
n — 1st order which has the property that it coincides with f(x) at n points
which are infinitely near to the point x = a.
An obvious generalisation of this problem can be formulated as follows:
Construct a polynomial of the order n — 1 which shall coincide with f(x) at
the n arbitrarily given points

    x = x_1, x_2, x_3, . . . , x_n    (1)

This problem was solved with great ingenuity by Lagrange, who proceeded
as follows.
We construct the "fundamental polynomial" Fn(x) by multiplying all
the root factors:

Dividing synthetically by the root factors x − x_k we now construct the n auxiliary polynomials

    S_k(x) = F_n(x)/(x − x_k)    (3)

and

    p_k(x) = S_k(x)/S_k(x_k)    (4)

These auxiliary polynomials p_k(x) have the property that they give zero at all the root points x_i, except at x_k where they give the value 1:

    p_k(x_i) = δ_{ik}    (5)

where δ_{ik} is "Kronecker's symbol"

    δ_{ik} = 1 if i = k,    δ_{ik} = 0 if i ≠ k    (6)

If now we form the sum

    P_{n−1}(x) = Σ_{k=1}^{n} f(x_k) p_k(x)    (7)

we obtain a polynomial which has the following properties: its order is n − 1 and it assumes the values f(x_k) at the prescribed points (1). Hence it solves the Lagrangian interpolation problem from which we started. The formula (7) may also be written in the form

    P_{n−1}(x) = F_n(x) Σ_{k=1}^{n} f(x_k)/[(x − x_k) F_n'(x_k)]    (8)

It is called "Lagrange's interpolation formula".

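The construction lends itself readily to computation. The following sketch (ours; the sample points and the test function sin x are assumed only for illustration) builds the auxiliary polynomials p_k(x) numerically and forms the sum (7).

```python
import math

def lagrange(points, values, x):
    total = 0.0
    for k, xk in enumerate(points):
        pk = 1.0                         # p_k(x): product of root factors,
        for i, xi in enumerate(points):  # normalised so that p_k(x_k) = 1
            if i != k:
                pk *= (x - xi) / (xk - xi)
        total += values[k] * pk
    return total

xs = [0.0, 1.0, 2.0, 4.0]
ys = [math.sin(x) for x in xs]
print(lagrange(xs, ys, 1.5), math.sin(1.5))   # interpolated vs true value
```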

Problem 7. Construct the Lagrangian polynomials p_k(x) for the following
distribution of points:

Obtain the interpolating polynomial for the following function:

[Answer:

Problem 8. Show that if the x_k are evenly distributed around the origin (i.e. every x_k appears with + and − signs), the interpolating polynomial contains only even powers if f(x) is an even function: f(x) = f(−x), and only odd powers if f(x) is an odd function: f(x) = −f(−x). Show that this is not the case if the x_k are not evenly distributed.

1.5. The remainder of the Lagrangian interpolation formula


If a function y = f(x) is given and we have constructed a polynomial of
the order n − 1 which fits the functional values y_k = f(x_k) at the n pre-
scribed points (4.1), this does not mean that the interpolation will necessarily
be very close at the points between the points of interpolation. It will be
our task to find an estimation for the "remainder", or "error" of our
interpolating polynomial P_{n−1}(x), that is the difference

    η_n(x) = f(x) − P_{n−1}(x)    (1)

For this purpose we want to assume that f(x) is n times differentiable, although it is obvious that we may approximate a function by a polynomial
which does not satisfy this condition (cf. for example Problem 7).
If we differentiate the equation (1) n times, the second term on the right side will drop out since the nth derivative of any polynomial of not higher than n − 1st order vanishes. Accordingly we obtain

    η_n^(n)(x) = f^(n)(x)    (2)

We can consider this differential equation as the defining equation for η_n(x), although a differential equation of nth order cannot have a unique solution without adding n "boundary conditions". These conditions are provided by the added information that η_n(x) vanishes at the n points of interpolation x = x_k:

    η_n(x_k) = 0    (k = 1, 2, . . . , n)    (3)

Although these are inside conditions rather than boundary conditions, they
make our problem uniquely determined.
At this point we anticipate something that will be fully proved in Chapter 5.
The solution of our problem (2) (with the given auxiliary conditions (3)) can be obtained with the help of an auxiliary function called the "Green's function" G(x, ξ), which is constructed according to definite rules. It is quite independent of the given "right side" of the differential equation (2). The solution appears in the form of a definite integral:

    η_n(x) = ∫_{x_1}^{x_n} G(x, ξ) f^(n)(ξ) dξ    (4)

We have assumed that the points of interpolation x_k are arranged in increasing magnitude:

    x_1 < x_2 < ⋯ < x_n    (5)

and that x is some point inside the interval [x_1, x_n]:

    x_1 ≤ x ≤ x_n    (6)

As we shall demonstrate later, the function G(x, ξ), considered as a function of ξ, has the property that it does not change its sign throughout the interval [x_1, x_n]. But then we can again make use of the mean value theorem (3.6) of integral calculus and obtain

    η_n(x) = f^(n)(x̄) ∫_{x_1}^{x_n} G(x, ξ) dξ    (7)

where x̄ is some unknown point of the interval [x_1, x_n]. The second factor does not depend on ξ any more, but is a pure function of x, which is independent of f(x). Hence we can evaluate it by choosing any f(x) we like. We will choose for f(x) the special function

    f(x) = F_n(x)/n!    (8)

where F_n(x) is the fundamental polynomial (4.2). This function has the property that it vanishes at all the points of interpolation and thus the interpolating polynomial P_{n−1}(x) vanishes identically. Hence η_n(x) becomes f(x) itself. Moreover, this choice has the advantage that it eliminates the unknown position of x̄ since here the nth derivative of f(x) is simply 1 throughout the range. Hence we obtain from (7):

    ∫_{x_1}^{x_n} G(x, ξ) dξ = F_n(x)/n!    (9)

The second factor of the right side of (7) is now determined and we obtain the estimation

    η_n(x) = f^(n)(x̄) F_n(x)/n!    (10)

This is Lagrange's form of the remainder of a polynomial interpolation.


The result can be extended to the case of a point x which lies outside the
realm [x_1, x_n], in which case we cannot speak of interpolation any more but of extrapolation. The only difference is that the point x̄ becomes now some unknown point of the interval [x_1, x] if x > x_n and of the interval [x, x_n] if x < x_1.
The disadvantage of the formula (10) from the numerical standpoint is
that it demands the knowledge of a derivative of high order which is
frequently difficult to evaluate.
Problem 9. Deduce the remainder (3.7) of the truncated Taylor series from the
general Lagrangian remainder formula (10).

1.6. Equidistant interpolation


Particular interest is attached to the case of equidistantly placed points
x_k:

    x_k = x_0 + k·Δx    (k = 0, 1, 2, . . .)    (1)

If a function f(x) is tabulated, we shall almost always give the values of f(x)
in equidistant arguments (1). Furthermore, if a function is observed by
physical measurements, our measuring instruments (for example clock
mechanisms) will almost exclusively provide us with functional values which
belong to equidistant intervals. Hence the interpolation between equi-
distant arguments was from the beginning of interpolation theory treated
as the most important special case of Lagrangian interpolation. Here we
need not operate with the general formulae of Lagrangian interpolation
(although in Chapter 2.21 we shall discover a particularly interesting property
of equidistant interpolation exactly on the basis of the general formula of
Lagrangian interpolation) but can develop a specific solution of our problem
by a certain operational approach which uses the Taylor series as its model
and translates the operational properties of this series into the realm of
difference calculus.
In the calculus of finite differences it is customary to normalise the given
constant interval Δx of the independent variable to 1. If originally the tabulation occurred in intervals of Δx = h, we change the original x to a new independent variable x/h and thus make the new Δx equal to 1. We
will assume that this normalisation has already been accomplished.
The fundamental operation of the calculus of finite differences is the
difference quotient Δ/Δx which takes the place of the derivative d/dx of infinitesimal calculus. But if Δx is normalised to 1, it suffices to consider the operation

    Δf(x) = f(x + 1) − f(x)    (2)

without any denominator, which simplifies our formulae greatly. The operation can obviously be repeated, for example

    Δ²f(x) = Δ(Δf(x)) = f(x + 2) − 2f(x + 1) + f(x)    (3)

and so on.
Now let us start with the truncated Taylor series, choosing the centre of
expansion as the point x = 0:

    f(x) = f(0) + x f'(0) + x²/2! f''(0) + ⋯ + x^(n−1)/(n−1)! f^(n−1)(0) + η_n(x)    (4)

By differentiating on both sides and putting x = 0 we can prove that η_n(x) has the property that it vanishes at the point x = 0, together with all of its derivatives, up to the order n − 1. The proof is based on the fact that the functions

    p_k(x) = x^k/k!    (5)

satisfy the following functional equation

    p_k'(x) = p_{k−1}(x)    (6)

together with the boundary condition

    p_k(0) = 0    (k = 1, 2, . . .)    (7)

while

    p_0(x) = 1    (8)

If now we can find a corresponding set of polynomials which satisfy the fundamental equation

    Δq_k(x) = q_{k−1}(x)    (9)

together with the same boundary conditions (7) and (8), then we can translate the Taylor series into the calculus of finite differences by putting

    f(x) = Σ_{k=0}^{n−1} Δ^k f(0) q_k(x) + η_n(x)    (10)

and proving that η_n(x) vanishes at x = 0, together with its first, second, . . ., n − 1st differences. But this means that the polynomial

    P_{n−1}(x) = Σ_{k=0}^{n−1} Δ^k f(0) q_k(x)    (11)

coincides with f(x) in all its differences up to the order n − 1 and since these differences are formed in terms of the functional values

    f(0), f(1), f(2), . . . , f(n − 1)    (12)

we see that P_{n−1}(x) coincides with f(x) at the points

    x = 0, 1, 2, . . . , n − 1    (13)

and thus solves the problem of equidistant interpolation. Our problem is thus reduced to the solution of the functional equation (9), in conjunction
with the boundary conditions (7) and (8).
Now the application of the A operation to the Newtonian "binomial
coefficients"

shows that these functions indeed satisfy the functional equation (9):

with the proper boundary (actually initial) conditions. Hence we have in


(14) the proper auxiliary functions which take the place of the functions (5)
of the Taylor series, and we can write down the "Gregory-Newton inter-
polation formula"

The successive differences Δf(0), Δ²f(0), . . . are obtainable by setting up a "difference table" and reading off the values which belong to the line x = 0.* But they are equally obtainable by the following "binomial weighting" of the original functional values:

    Δ^k f(0) = Σ_{i=0}^{k} (−1)^(k−i) C(k, i) f(i),    C(k, i) = k!/(i!(k − i)!)    (17)

or in symbolic notation

    Δ^k f(0) = (f − 1)^k    (18)

with the understanding that in the expansion on the right side we replace f^m by f(m).
* Cf. A. A., p. 308.
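The whole mechanism, the binomial weighting (17) together with the series (16), can be condensed into a few lines of code. The sketch below (ours) uses the function e^(−x), tabulated at x = 0, 1, . . . , 4, as assumed test data; the same data return in Problem 13 below.

```python
import math

def forward_difference(values, k):        # Delta^k f(0) by the weighting (17)
    return sum((-1)**(k - i) * math.comb(k, i) * values[i] for i in range(k + 1))

def q(x, k):                              # the binomial functions q_k(x) of (14)
    p = 1.0
    for j in range(k):
        p *= x - j
    return p / math.factorial(k)

def gregory_newton(values, x):            # the interpolation formula (16)
    return sum(forward_difference(values, k) * q(x, k)
               for k in range(len(values)))

data = [math.exp(-k) for k in range(5)]   # e**-x tabulated at x = 0, 1, ..., 4
print(gregory_newton(data, 0.5))          # approximately 0.61197
print(math.exp(-0.5))                     # true value 0.60653
```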

Problem 10. Show that the function

satisfies the functional equation

Problem 11. Show that Newton's formula of the "binomial expansion"

    (1 + t)^m = 1 + q_1(m) t + q_2(m) t² + ⋯ + q_m(m) t^m

can be conceived as a special application of the general interpolation formula (16) (with n = m + 1).

1.7. Local and global interpolation


Let a certain continuous and differentiable function y = f(x) be tabulated in equidistant intervals, normalised to Δx = 1. Let our table start with the value x = 0 and continue with x = 1, 2, . . . , n − 1. We now want to obtain f(x) at a point ξ which is between zero and 1. The simplest form of interpolation is "linear interpolation":

    f(ξ) ≈ f(0) + ξ Δf(0)    (1)

Here we have connected the functional values f(0) and f(1) by a straight line. We may want greater accuracy, obtainable by laying a parabola through the points f(0), f(1), f(2). We now get the "quadratic interpolation"

    f(ξ) ≈ f(0) + ξ Δf(0) + ξ(ξ − 1)/2 Δ²f(0)    (2)

The procedure can obviously be continued to polynomials of higher and higher order by taking into account more and more terms of the interpolation
formula (6.16). The analogy with the Taylor series would induce us to
believe that we constantly gain in accuracy as we go to polynomials of ever-
increasing order. There is, however, an essential difference between the two
series. In the case of the Taylor series we are at a constant distance from
the centre of expansion, while here we stay at a certain point ξ but the range in which our polynomials interpolate increases all the time. For a
linear interpolation only the two neighbouring points play a role. But if
we operate with a polynomial of the order n — 1, we use n successive points
of interpolation and lay a polynomial of high order through points which are
in fact quite far from the point at which we want to obtain the functional
value. We can generally not expect that the error oscillations will
necessarily decrease by this process. That we take in more data, is an
advantage. But our approximating polynomial spreads over an ever-
increasing range and that may counteract the beneficial effect of more data.
Indeed, in the case of the Taylor series the functions (6.5) have a strongly
decreasing tendency since every new function provides the added factor x
in the numerator and the factor k in the denominator. On the other hand,
the functions (6.14) yield likewise the factor k in the denominator but in the

numerator we now obtain x + 1 − k and thus we see that for large k we have gained a factor which is not much smaller than 1. Convergence can only be expected on account of the successive differences Δ^k f(0). These
differences may go down for a while but then a minimum may be reached
and afterwards the terms will perhaps go up again. In this case we have to
stop with the proper order n − 1 of the interpolating polynomial since the
addition of further terms will increase rather than decrease the error of
interpolation.
We see that under such circumstances we cannot trust the automatic
functioning of the interpolation formula (6.16). We have to use our
judgement in deciding how far we should go in the series to obtain the closest
approximation. We stop at a term which just precedes the minimum term.
We may also add the minimum term with half weight, thus giving much
higher accuracy but losing the chance of estimating the committed error.
The first neglected term of the series does not yield necessarily a safe error
bound of our interpolation, except if the first two neglected terms are of
opposite signs and f^(n+1)(x) does not change its sign in the range of inter-
polation (cf. 5.10). (Safe error bounds for monotonously increasing or
decreasing series are not easily available.)
A good illustration is provided by the example studied by Runge. Let the function

    y = f(x) = 100/(4 + x²)    (3)

be given at the integer points x = 0, ±1, ±2, . . . , ±10. We want to obtain f(x) at the point x = 9.5.
First of all we shift the point of reference to the point x = 10 and count the x-values backward. Hence we make a table of the given y-values, starting with y_10, and continuing with y_9, y_8, y_7, . . . . We then evaluate the successive Δ^k f(0), by setting up a difference table—or quicker by binomial weighting according to the formula (6.17)—and apply the Gregory-Newton interpolation formula (6.16) for x = 0.5. The successive terms and their sum are tabulated below:

0.961538 0.961538
0.107466 1.069004
-0.009898 1.059106
0.002681 1.061788
-0.001251 1.060537
0.000851 1.061388
-0.000739 1.060649
0.000568 1.061217
0.000995 1.062212
[correct: f(9.5) = 1.061008]
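The computation behind this table is easily reproduced. The sketch below (ours) regenerates the nine terms from the function (3), with the reference point shifted to x = 10 as described, and exhibits the semi-convergent behaviour.

```python
import math

f = lambda x: 100.0 / (4.0 + x*x)         # the function (3), as reconstructed

def forward_difference(values, k):        # Delta^k f(0), eq. (6.17)
    return sum((-1)**(k - i) * math.comb(k, i) * values[i] for i in range(k + 1))

def q(x, k):                              # binomial functions q_k(x) of (6.14)
    p = 1.0
    for j in range(k):
        p *= x - j
    return p / math.factorial(k)

data = [f(10 - k) for k in range(21)]     # y_10, y_9, ..., y_-10

s = 0.0
for k in range(9):                        # the nine terms tabulated above
    term = forward_difference(data, k) * q(0.5, k)
    s += term
    print(f"{term:12.6f}  {s:12.6f}")
print("correct:", f(9.5))                 # 1.061008
```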

We observe that in the beginning the error fluctuates with alternating sign
and has the tendency to decrease. After 8 steps a minimum is reached and
from then on the terms increase again and have no tendency to converge.
In fact the differences of high order become enormous. In view of the
change of sign of the seventh and eighth terms we can estimate that the
correct value will lie between 1.061388 and 1.060649. The arithmetic mean
of these two bounds: 1.061018, approaches in fact the correct functional
value 1.061008 with a high degree of accuracy. Beyond that, however, we
cannot go.
Runge's discovery was that this pattern of the error behaviour cannot be
remedied by adding more and more data in between, thus reducing Δx to smaller and smaller values. No matter how dense our data are, the interpolation for some x-value in between will show the same general character:
reduction of the error to a certain finite minimum, which cannot be surpassed
since afterwards the errors increase again and in fact become exceedingly
large.
In the present problem our aim has been to obtain the functional value f(x)
at a given point x. This is the problem of local interpolation. We can use
our judgement how far we should go with the interpolating series, that is
how many of our data we should actually use for a minimisation of our
error. We may have, however, quite a different problem. We may want
an analytical expression which should fit the function y = f(x) with reason-
able accuracy in an extended range of x, for example in the entire range
[−10, +10] of our data. Here we can no longer stop with the interpolation formula at a judiciously chosen point. For example in our previous procedure, where we wanted to obtain f(9.5), we decided to stop with n = 6 or 7.
This means that we used a polynomial of 5th or 6th order which fits our
data between x = 4 (or 5) and 10. But this polynomial would completely
fail in the representation of f(x) for values which are between —10 and 0.
On the other hand, if we use a polynomial of the order 20 in order to include
all our data, we would get for f(9.5) a completely absurd value because now
we would have to engage that portion of the Gregory-Newton interpolation
formula which does not converge at all. We thus come to the conclusion
that interpolation in the large by means of high order polynomials is not
obtainable by Lagrangian interpolation of equidistant data. If we fit our
data exactly by a Lagrangian polynomial of high order we shall generally
encounter exceedingly large error oscillations around the end of the range.
In order to obtain a truly well fitting polynomial of high order, we have to
make systematic errors in the data points. We will return to this puzzling
behaviour of equidistant polynomial interpolation when we can elucidate it
from an entirely different angle (cf. Chapter 2.21).

1.8. Interpolation by central differences


The Gregory-Newton formula (6.16) takes into account the given data in
the sequence f(0), f(1), f(2), . . . . If for example we want to obtain f(0.4)
and we employ 5 terms of the series, we operate in fact with a polynomial
of fourth order which fits the functional values given at x = 0, 1, 2, 3, 4.


Now the point x = 4 is rather far from the point x = 0.4 at which the
function is desired. We might imagine that it would be preferable to
use data which are nearer to the desired point. This would require that
our data proceed in both directions from x = 0.4 and we could use the
functional data at x = −2, −1, 0, 1, 2. This interpolation procedure is
associated with the name of Stirling. In Stirling's formula we employ
the data symmetrically to the left and to the right and thus gain greatly in
convergence.
The difference table we now set up is known as a "central difference
table".* It is still the previous difference table but in new arrangement.
The fundamental operation on which Stirling's formula is based is called the "second central difference" and is traditionally denoted by δ². This notation is operationally misleading since δ² is in fact a basic operation which cannot be conceived as the square of the operation δ. For this reason we will deviate from the commonly accepted notation and denote the traditional δ² by δ:

    δf(x) = f(x + 1) − 2f(x) + f(x − 1)    (1)

The even part of the function f(x) can be expanded in the even Stirling series:

    f(x) = f(0) + δf(0) S_2(x) + δ²f(0) S_4(x) + δ³f(0) S_6(x) + ⋯    (2)

where the Stirling functions S_{2k}(x) are defined as follows:

    S_{2k}(x) = x²(x² − 1)(x² − 4) ⋯ (x² − (k − 1)²)/(2k)!    (3)

The odd part of f(x) can be made even by multiplication by x and thus expanding the function xf(x) according to (2). The final result is expressible in the form of the following expansion:

    f(x) = Σ_{k=0}^{∞} [δ^k f(0) S_{2k}(x) + γδ^k f(0) S_{2k+1}(x)]    (4)

The operation γ has the following significance:

    γf(x) = ½ [f(x + 1) − f(x − 1)]    (5)

* See A. A., pp. 309-310.

The formula (4) shows that the odd Stirling functions S_{2k+1}(x) have to be defined as follows:

    S_{2k+1}(x) = x(x² − 1)(x² − 4) ⋯ (x² − k²)/(2k + 1)!    (6)

A comparison with (3) shows that

    x S_{2k+1}(x) = (2k + 2) S_{2k+2}(x)    (7)

Now let f(x) be odd: f(−x) = −f(x). Then the Stirling expansion becomes

    f(x) = γf(0) S_1(x) + γδf(0) S_3(x) + γδ²f(0) S_5(x) + ⋯    (8)

but at the same time g(x) = xf(x) is even and permits the expansion

    g(x) = δg(0) S_2(x) + δ²g(0) S_4(x) + ⋯    (9)

Dividing on both sides by x we obtain, in view of (7):

    γδ^k f(0) = δ^(k+1) g(0)/(2k + 2)    (10)

This means that the coefficients of the odd terms are obtainable in terms of the δ operation alone if this operation is applied to the function xf(x), and the final result divided by 2k + 2.
Here again the direct construction of a central difference table can be avoided in favour of a direct weighting of the functional values, in analogy to (6.18). We now obtain

    δ^k f(0) = Σ_{j=0}^{2k} (−1)^j C(2k, j) f(k − j)    (11)

For example:

    δ²f(0) = f(2) − 4f(1) + 6f(0) − 4f(−1) + f(−2)    (12)

(The operation γδ²f(0) is equally obtainable by applying the operation δ³ to the function xf(x) and dividing by 2·3 = 6.)
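The following sketch (ours) assembles a five-term Stirling interpolation out of the operations δ and γ just defined, taking as assumed test data the five central values of e^(−x) which appear again in Problem 13 below.

```python
import math

# delta is the *second* central difference of (1), gamma the averaged
# first difference of (5), S(n, x) the Stirling functions of (3) and (6).
f = {-2: 7.389056, -1: 2.718282, 0: 1.0, 1: 0.367879, 2: 0.135335}

def delta(g):          # (delta g)(x) = g(x + 1) - 2 g(x) + g(x - 1)
    return {x: g[x + 1] - 2*g[x] + g[x - 1]
            for x in g if x - 1 in g and x + 1 in g}

def gamma0(g):         # gamma g(0) = (g(1) - g(-1)) / 2
    return (g[1] - g[-1]) / 2

def S(n, x):           # Stirling functions S_n(x)
    if n == 0:
        return 1.0
    k, p = (n // 2, x * x) if n % 2 == 0 else ((n - 1) // 2, x)
    top = k - 1 if n % 2 == 0 else k
    for j in range(1, top + 1):
        p *= x*x - j*j
    return p / math.factorial(n)

x = 0.5
d = delta(f)
approx = (S(0, x)*f[0] + S(1, x)*gamma0(f) + S(2, x)*d[0]
          + S(3, x)*gamma0(d) + S(4, x)*delta(d)[0])
print(approx)                  # 0.61873..., cf. Problem 13
print(math.exp(-0.5))          # true value 0.60653
```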
Problem 12. Show that the expansion (4) is equivalent to the application of the
Gregory-Newton formula if we shift the centre of expansion to the point −n:

    f(x) = f(−n) + q_1(x + n) Δf(−n) + q_2(x + n) Δ²f(−n) + ⋯


Problem 13. The exponential function y = e^(−x) is given at the following points:
f(0) = 1, f(1) = 0.367879, f(2) = 0.135335, f(3) = 0.049787, f(4) = 0.018316
Obtain f(0.5) by the Gregory-Newton formula of five terms. Then, adding the data
f(−1) = 2.718282, f(−2) = 7.389056
obtain f(0.5) by the Stirling formula of five terms (omitting f(3) and f(4)).
[Answer: Gregory-Newton: f(0.5) = 0.61197
Stirling: f(0.5) = 0.61873
correct value: f(0.5) = 0.60653
The value obtained by the Stirling interpolation is here less accurate than the G.-N. value.]

1.9. Interpolation around the midpoint of the range


We will return once more to our problem (7.3), examined before in
Section 7. We have assumed that our data were given in the 21 integer
points between x = −10 and x = 10. We have seen that around the end
of the range the Gregory-Newton formula had a "semi-convergent"
behaviour: the terms decrease up to a certain point and then increase again.
We approach the functional value only if we do not go too far in the series.
Quite different is the behaviour of the interpolating series if we stay near
to the middle of the range and operate with central differences. Let us
investigate the convergence behaviour of the Stirling series by trying to
obtain f(0.5) by interpolation on the basis of central differences.
First of all we notice that the Stirling functions S_k(x)—defined by (8.3) and (8.6)—have better convergence properties than the Gregory-Newton functions (6.14). The factor we gain as we go from 2k to 2k + 2 is

    (x² − k²)/((2k + 1)(2k + 2))

and that is better than ¼. In order to obtain a comparison with the Gregory-Newton formula we should multiply the functions S_k(x) by 2^k and divide the central differences by 2^k. In our problem f(x) is even and thus only the even differences δ^k will appear, in conjunction with the functions S_{2k}(x). Hence we will divide δ^k f(0) by 4^k and multiply the associated S_{2k}(x) by 4^k. In this fashion we keep better track of the order of magnitude of the successive terms which form the final sum (8.2).
In this instance we do not encounter that divergent behaviour of the
interpolating series that we have encountered earlier in Section 7. But now
we encounter another strange phenomenon, namely that the convergence is
too rapid. The terms we obtain are all negative, with the exception of the
first term. Hence we approach the final value monotonously from above.
The successive terms diminish rapidly and there is no reason to stop before
the contributions of all our data are taken into account. But the peculiar
thing is that the peripheral values contribute too little to the sum so that we

get the impression that we are much nearer to the correct value than is
actually the case. The successive convergents are given in the following
table (the notation f*(0.5) refers to the interpolated values, obtained on the
basis of 1, 3, 5, . . ., 21 data, going from the centre to the periphery, and
taking into account the data to the right and to the left in pairs):

4^k Φ_{2k}(0.5)   δ^{2k}f(0)/4^k   f*(0.5)
1                 25               25
0.5               -2.5             23.75
-0.125            0.9375           23.63281
0.0625            -0.54086         23.59901
-0.03906          0.37860          23.58422
0.02734           -0.29375         23.57619
-0.02051          0.24234          23.57122
0.01611           -0.20805         23.56786
-0.01309          0.18357          23.56546
0.01091           -0.16521         23.56366
-0.00927          0.15092          23.56226

As we watch the successive convergents, we should think that the correct
value can be guaranteed to at least two decimal places, while in actual fact
even the second decimal is still in error.
The great distance of f*_{21}(0.5) from the correct value makes it doubtful
whether the addition of the data f(±11), f(±12), . . ., out to infinity, would
be able to bridge the gap. A closer analysis corroborates this impression.
The series (8.2) remains convergent as n tends to infinity but the limit does
not coincide with f(x) at the point x = 0.5 (see Chapter 2.21).
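The spurious convergence can be reproduced in a few lines. The key-values in the middle column (25, then δ^2 f(0) = −2.5·4 = −10, δ^4 f(0) = 0.9375·16 = 15, and so on) are consistent with the function f(x) = 100/(4 + x²); taking this to be the function of problem (7.3) is our reconstruction, not something stated on this page, but under that assumption the sketch below (Python) regenerates the table:

    import math

    f = lambda x: 100.0 / (4.0 + x*x)   # assumed test function of Section 7
    x = 0.5

    term_phi, partial = 1.0, 0.0         # term_phi = 4^k * Phi_{2k}(x)
    for k in range(11):                  # k = 0 .. 10 uses the data f(-10)..f(10)
        # central difference delta^{2k} f(0) by binomial weighting of f(-k)..f(k)
        d2k = sum((-1)**i * math.comb(2*k, i) * f(k - i) for i in range(2*k + 1))
        partial += d2k / 4**k * term_phi
        print(f"{term_phi:11.5f} {d2k / 4**k:11.5f} {partial:12.5f}")
        # Phi_{2k+2}(x) = Phi_{2k}(x) * (x^2 - k^2)/((2k+1)(2k+2)); keep the 4^k scale
        term_phi *= 4.0 * (x*x - k*k) / ((2*k + 1) * (2*k + 2))

    print("correct value:", f(x))        # 23.5294..., far below the convergents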

1.10. The Laguerre polynomials


The experience of the last section brings us to the following problem.
Our previous discussions were devoted to a function y = f(x) which was
tabulated in a finite interval. We have studied the behaviour of a poly-
nomial interpolation in this interval. But let us now assume that our
function is in fact tabulated in an infinite interval, and first we will assume
that this interval is [0, ∞], the function being given at the points

Accordingly we will operate with the Gregory-Newton formula, form the


successive differences


and construct the formal infinite series*

What can we say about the behaviour of this series? Will it converge and
if so, will it converge to f(x)? (We will ask the corresponding problem for
the interval [−∞, +∞] and the use of central differences somewhat later,
in Section 16.)
This problem is closely related to the properties of a remarkable set of
polynomials, called the "Laguerre polynomials". We will thus begin our
study with the exposition of the basic properties of these polynomials
shaped to the aims of the interpolation theory.
We define our function y = f(x) in the interval [0, ∞] in the following
specific manner:

We form the successive differences (2), either by setting up a difference


table, or by directly taking the functional values y_j and weighting them
binomially according to the formula (6.17). We thus obtain

These are the so-called "normalised Laguerre polynomials", which we will


denote by L_k(t).† For example:

and so on. These polynomials have the remarkable property that they are
orthogonal to each other in the interval [0, ∞], with respect to the weight
factor e^{−t}, while their norm is 1:

* The notation f*(x) in the sense of an "approximation of f(x)" seems rather ill-
chosen, in view of the convention that the asterisk denotes in algebra the "complex
conjugate" of a complex number. An ambiguity need not be feared, however, because
in all instances when this notation occurs, f(x) is a real function of x.
† The customary notation L_n(t) refers to the present L_n(t), multiplied by n!

They thus form an "ortho-normal" set of functions. Moreover, in view of
the fact that the powers of t come in succession, from t^0 onward, these
functions form a complete ortho-normal set.
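A quick numerical check of this ortho-normality is instructive. The sketch below (Python) builds L_k(t) from the difference definition, taking, as stated later in Section 12, L_k(t) to be the kth difference of the function t^ξ/ξ! at ξ = 0, and tests the integrals by Gauss-Laguerre quadrature:

    import math
    import numpy as np
    from numpy.polynomial import polynomial as P
    from numpy.polynomial.laguerre import laggauss

    K = 6
    # L_k(t) = Delta^k f(0) with f(j) = t^j / j!  (cf. Section 12), so the
    # coefficient of t^j in L_k is (-1)^(k-j) C(k,j) / j!
    L = [np.array([(-1.0)**(k - j) * math.comb(k, j) / math.factorial(j)
                   for j in range(k + 1)]) for k in range(K)]

    t, w = laggauss(40)          # nodes/weights for  integral_0^inf e^{-t} (.) dt
    gram = np.array([[np.dot(w, P.polyval(t, L[i]) * P.polyval(t, L[j]))
                      for j in range(K)] for i in range(K)])
    print(np.round(gram, 10))    # ~ the identity matrix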
Traditionally this property of the Laguerre polynomials is proved on the
basis of the differential equation which they satisfy. But we can demonstrate
this property directly on the basis of the definition (5). We form the
Gregory-Newton series

We do not know yet whether this series will converge or not (we will give
the proof in the next section), but for any integer value x = m the series
terminates after m + 1 terms and the question of convergence does not arise.
For such values the series (8) is an algebraic identity, no matter how the
key-values f(j) (j = 0, 1, 2, . . ., m) may be prescribed.
Let us apply this expansion to the function (4), obtaining

We now multiply on both sides by Δ^m f(0)·e^{−t}, obtaining

Since integration and differencing are two independent operations, we can
on the left side multiply by f(ξ), perform the integration, and then take the
mth difference at ξ = 0:

Now, making use of the fundamental relation (6.15), our variable now being ξ,
we obtain

and putting ξ = 0 we obtain for the left side of (10):

Substituting in (10), we get



which at once yields the orthogonality relation (7). (It is important to
emphasise that the convergence of the infinite expansion (8) is not involved
in this argument. What we need are only integer values of x, for which the
convergence is automatic.)
A natural generalisation of (4) offers itself. Let us define the fundamental
function f(x) as follows

where p is an arbitrary positive constant. The successive differences (2)
once more define an infinite set of polynomials, called the "generalised
Laguerre polynomials" L_k^{(p)}(t). For example:

and so on. The formula (9) is now to be modified as follows:

Moreover, if we multiply on both sides by L_m^{(p)}(t) t^p e^{−t}, and integrate with
respect to t between 0 and ∞, we obtain on the left side

This leads, by exactly the same reasoning as before in the case of (12), to:

This means

and in view of (17):



We thus obtain the orthogonality of the generalised Laguerre polynomials
L_k^{(p)}(t) in the following sense:

Problem 15. Show that all these relations remain valid if the condition p > 0
is generalised to p > −1.
Problem 16. The hypergeometric series

is convergent for all (real or complex) |x| < 1. Put x = z/β and let β go to
infinity. Show that the new series, called the "confluent hypergeometric
series", convergent for all z, becomes

Show that the Laguerre polynomials L_k^{(p)}(t) are special cases of this series, namely

1.11. Binomial expansions


The infinite Taylor series

expands the function f(z) into a power series, in terms of the value of f(z)
and all its derivatives at the centre of expansion z = 0. We obtain a
counterpart of the infinite Taylor series, replacing differentiation d/dx by
differencing Δ, in the form of the Gregory-Newton series

which substitutes for the functions z^k/k! the "binomial coefficients"

and for the successive derivatives of f(z) at z = 0 the successive differences
of f(z) at z = 0.
There is, however, a deep-seated difference between these two types of
expansions. The Taylor series (1) has a very general validity since it holds
for all analytical functions within a certain convergence radius |z| < r of
the complex variable z. On the other hand, the example of Section 7 has

demonstrated that we cannot expect the convergence of the series (2) even
under completely analytical conditions. What we can expect, however, is
that there may exist a definite class of functions which will allow representa-
tion with the help of the infinite expansion (2).
In order to find this class, we are going to make use of the orthogonality
and completeness of the Laguerre polynomials in the range [0, oo]. Let us
assume that f(t) is a function which is absolutely integrable in any finite
range of the interval [0, ∞] while its behaviour at infinity is such that

Then the function f(t)e^{−t/2} can be expanded into the orthogonal Laguerre
functions L_k(t)e^{−t/2}, which leaves us with an expansion of f(t) itself into the
Laguerre polynomials:

where

As an example let us consider the expansion of the function

in Laguerre polynomials, where x is any positive constant, or in fact any
complex constant whose real part is positive or zero:

For this purpose we need the expansion coefficients

which we will now evaluate. For this purpose we imagine that we replace
Lk(t) by the actual power series, integrating term by term:

where p_k(x) is some polynomial in x.


We have very definite information about the roots of this polynomial.
Let us namely assume that in the integral (8) x takes the value of any
integer less than k. In view of the fact that any power t^j can be conceived

as a certain linear combination of the Laguerre polynomials L_a(t) (a ≤ j),
and that L_k(t) is orthogonal to all the polynomials L_j(t) of lower order, we
observe that all these integrals must vanish. Hence p_k(x) is zero for
x = 0, 1, 2, . . ., (k − 1). This identifies p_k(x) to

and the only remaining uncertainty is the constant C. But we know from
the definition of L_k(t) that the coefficient of t^k is 1/(k!). Therefore, if we
let x go to infinity, the coefficient of x^k must become 1/k!. This determines
the constant C and we obtain
in agreement with our earlier formula (10.13). The expansion (4) thus
becomes

We have now found a special case of a function which permits the infinite
Gregory-Newton expansion, and thus the interpolation by powers on the
basis of the functional values f(m), given between 0 and ∞.
We will draw two conclusions from this result. First of all, let us put
t = 1. Then we obtain the expansion

The factorial x! itself goes by far too rapidly to infinity to allow the
Gregory-Newton type of interpolation. But the reciprocal of the factorial is
amenable to such an interpolation. If we let x go toward zero, we obtain
in the limit an interesting approximation of the celebrated "Euler's constant"

because the derivative of x! at x = 0 is −γ:

The convergence of this series is very slow and of no practical significance.


But a similar method, applied to the series (13) at some integer point x = m
instead of x = 0, yields expansions of much quicker convergence.
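The claim about the reciprocal factorial can be tested directly: the Gregory-Newton series built from the integer samples 1/m! converges, though slowly, to 1/Γ(x + 1) = 1/x!. A sketch (Python; the choice of 25 samples is arbitrary):

    import math

    def binom(x, k):                      # generalised C(x, k) for real x
        out = 1.0
        for j in range(k):
            out *= (x - j) / (j + 1)
        return out

    x, N = 0.5, 25
    samples = [1.0 / math.factorial(m) for m in range(N)]

    # forward differences Delta^k f(0) of the samples
    diffs, col = [], samples[:]
    for _ in range(N):
        diffs.append(col[0])
        col = [b - a for a, b in zip(col, col[1:])]

    partial = sum(binom(x, k) * d for k, d in enumerate(diffs))
    print(partial, 1.0 / math.gamma(x + 1))   # ~1.1284; the convergence is slow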
Another conclusion can be drawn concerning the interpolation of the
exponential function eax. We can write

which can be expanded in a binomial series according to Newton's formula
[cf. (6.16)], provided that

that is, a < 0.69315 . . . (= log 2). If, however, we divide by x! and interpolate
the new function, we obtain convergence for all values of a.
Problem 17. The values of e^x for x = 0, 1, 2, . . ., 7 are:
y0 = 1
y1 = 2.71828
y2 = 7.38906
y3 = 20.08554
y4 = 54.59815
y5 = 148.41316
y6 = 403.42879
y7 = 1096.63316
Obtain an upper and lower bound for e^{1/2} by Gregory-Newton interpolation,
without and with the weight factor (x!)^{−1}. (The latter series is convergent but
the convergence is very slow.)
[Answer:
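The answer values are lost in this copy, but both computations are quickly repeated. The sketch below (Python) prints the successive Gregory-Newton partial sums, first for the raw data, which oscillate around e^{1/2} and so supply alternating upper and lower bounds, then with the weight factor (x!)^{−1}, the result being multiplied back by Γ(1.5):

    import math

    def binom(x, k):
        out = 1.0
        for j in range(k):
            out *= (x - j) / (j + 1)
        return out

    def gn_partials(samples, x):
        """Partial sums of the Gregory-Newton series at x."""
        col, out, s = samples[:], [], 0.0
        for k in range(len(samples)):
            s += binom(x, k) * col[0]
            out.append(s)
            col = [b - a for a, b in zip(col, col[1:])]
        return out

    y = [math.exp(m) for m in range(8)]
    raw = gn_partials(y, 0.5)
    wgt = [v * math.gamma(1.5)
           for v in gn_partials([ym / math.factorial(m)
                                 for m, ym in enumerate(y)], 0.5)]
    print(raw)   # oscillates around e^0.5 = 1.64872: alternating bounds
    print(wgt)   # approaches 1.64872, though slowly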

Problem 18. By an argument quite similar to that used in the proof of (11),
but now applied to the generalised Laguerre polynomials L_k^{(p)}(t), show the
validity of the following relation

and deduce the expansion

1.12. The decisive integral transform


We will now proceed to the construction of a certain integral transform
which is fundamental in answering our initial problem: "Find the class of
functions which allow the Gregory-Newton type of expansion (11.2)."
For our construction we make use of an auxiliary function g(t), defined
in the range [0, ∞], which satisfies the condition (11.3) and thus permits an
expansion into Laguerre polynomials. We define our integral transform as
follows:

Let us now expand g(t) in Laguerre polynomials:



If this expansion is substituted in (1) and we integrate term by term, we
obtain, in view of the relation (11.11):

But

and in view of the fact that L_k(t) can be conceived as the kth difference of
the function t^ξ/ξ! (at ξ = 0), we can replace L_k(t) by this function, integrate,
and then take the kth difference (considering ξ as a variable), finally
replacing ξ by 0. The result of the integration becomes

Substituting this value of g_k in (3) we obtain the infinite binomial expansion

which shows that the integral transform (1) defines a class of functions which
allows the infinite binomial expansion of Gregory-Newton.
On the other hand, let us assume that we have a Gregory-Newton ex-
pansion which is convergent:

Then we define a function g(t) by the infinite sum

and obtain f(x) by constructing the integral transform (1). Hence we see
that the integral transform (1) is sufficiently general to characterise the
entire class of functions which allow the Gregory-Newton type of interpolation
in the infinite interval [0, ∞].
The analytical form (1) of the function f(x) shows that it is in fact an
analytical function of x, throughout the right complex plane R(x) > 0.
Moreover, the interpolation formula (7) remains valid not only on the positive
real axis but everywhere in the right complex z-plane R(z) > 0. Hence we
have obtained an expansion which not only interpolates properly the discrete
functional values f(m) to the values f(x) between the given data, but also
extrapolates f(z) properly at every point z of the right complex half plane.
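A concrete instance makes this tangible. The displayed transform itself is lost in this copy; the derivation around (4) and (5) indicates the form f(x) = (1/x!) ∫_0^∞ g(t) t^x e^{−t} dt, which we assume here. With the choice g(t) = e^{−t} the transform gives f(x) = 2^{−x−1}, and the binomial expansion (7) becomes an ordinary convergent Newton series; the sketch below (Python, scipy) checks both statements numerically:

    import math
    from scipy.integrate import quad

    def f(x):    # assumed form of transform (1), here with g(t) = exp(-t)
        val, _ = quad(lambda t: math.exp(-t) * t**x * math.exp(-t), 0, math.inf)
        return val / math.gamma(x + 1)

    x = 0.5
    print(f(x), 2.0**(-x - 1))           # both 0.35355...

    def binom(x, k):
        out = 1.0
        for j in range(k):
            out *= (x - j) / (j + 1)
        return out

    # Gregory-Newton from the integer key-values f(m) = 2^-(m+1):
    # Delta^k f(0) = (1/2)(-1/2)^k, so the series is a Newton binomial in -1/2
    s = sum(binom(x, k) * 0.5 * (-0.5)**k for k in range(40))
    print(s)                             # 0.35355... : the expansion converges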

Problem 19. Carry through the procedure with respect to the generalised
integral transform

expanding g(t) in the polynomials L_k^{(p)}(t). Show that f_p(x) allows the expansion
(7) throughout the right complex half plane. The expansion may also be
written in the form

with

Problem 20. Comparing the integral transforms (1) and (10), demonstrate the
following theorem: If a function allows the infinite Gregory-Newton expansion, it
allows that expansion also if the centre of expansion is shifted by an arbitrary
amount to the right.

1.13. Binomial expansions of the hypergeometric type


Some special choices of the function g(t) lead to a number of interesting
binomial expansions which are closely related to the hypergeometric function
F(α, β, γ; z) [cf. (10.23)].
Problem 21. Choose the function g(t) of the integral transform (12.1) to be

and obtain the following binomial expansion:

Problem 22. Employ the same function in the integral transform (10) and
derive the following binomial expansion:

Problem 23. Show that the right sides of (2) and (3) are in the following relation
to the hypergeometric series (10.23):

Problem 24. In the expansion (2) substitute μ = ε and make ε infinitesimal.
Obtain in this fashion the following binomial expansion for the "logarithmic
derivative of the gamma function":

Problem 25. Doing the same in the expansion (5) obtain the following
generalisation of (7):

1.14. Recurrence relations


The Taylor series (11.1) has the property that the operation of
"differentiation" leaves its form unchanged, merely shifting the coefficients
by one unit to the left. We can ask for the operations which will leave the
form of the binomial expansion

invariant. First of all we have the "differencing operation" Δ [cf. (6.1)]
at our disposal. If we apply this operation to the series, we obtain, in view
of (6.10):

Hence the operation Δ on the function has the effect that the coefficient g_k
is changed to g_{k+1}. This operation can be repeated, of course, any number
of times, obtaining each time a jump in the index of g_k by one. Particularly
important is the operation

and consequently

There is a second fundamental operation which leaves the form of the
series (1) unchanged. Let us namely multiply f(x) by x but take f(x) at
the point x − 1. This operation shall be denoted by the symbol Γ:

If we perform this operation on the binomial coefficients, we find


Hence

Here we find a shift of the subscript of g to the left, together with a
multiplication by k.
The operation

is a consequence of the fundamental definitions. Accordingly

We see that by the combination of these two operations Δ and Γ we can
express any linear combination of the functional values f(x + m), multiplied
by any polynomials of x.
If now the function f(x), which can be expanded binomially, satisfies some
linear functional equation between the values f(x + m) whose coefficients
are rational functions of x, this relation will find a counterpart in a
corresponding recurrence relation between the coefficients g_k of its binomial
expansion.
Let us consider for example the expansion (11.12):

Here the expansion coefficients g_k become L_k(t). The function on the left
satisfies the following simple functional equation:

that is:

According to the rules (3) and (8) we can write this equation in the form

Translated to the coefficients g_k the corresponding relation becomes:

which yields the following recurrence relation for the Laguerre polynomials
L_k(t):
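The recurrence relation itself is lost in this copy; in the modern normalisation it is the familiar three-term relation (k + 1)L_{k+1}(t) = (2k + 1 − t)L_k(t) − kL_{k−1}(t), up to the sign convention of Section 10. A quick verification (Python):

    import numpy as np
    from numpy.polynomial.laguerre import Laguerre

    t = np.linspace(0.0, 10.0, 7)
    for k in range(1, 6):
        Lkm1 = Laguerre.basis(k - 1)(t)
        Lk   = Laguerre.basis(k)(t)
        Lkp1 = Laguerre.basis(k + 1)(t)
        lhs = (k + 1) * Lkp1
        rhs = (2*k + 1 - t) * Lk - k * Lkm1
        assert np.allclose(lhs, rhs)
    print("three-term recurrence verified")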

Problem 26. Show that the operations Δ and Γ are not commutative: ΓΔ ≠ ΔΓ.

Problem 27. In the expansion (11.20) the function f_p(x) satisfies the following
functional equation:

Translate this equation into the realm of the expansion coefficients and obtain
the following recurrence relation for the generalised Laguerre polynomials L_k^{(p)}(t):

Problem 28. The left side of the expansion (13.2) satisfies the functional equation

Find the corresponding recurrence relation for the expansion coefficients and
verify its validity.
[Answer:

Problem 29. Do the same for the binomial expansion (13.3).


[Answer:

Problem 30. The general hypergeometric series (10.23) for F(−x, β, γ; z) can
be conceived as a binomial expansion in x, considering β, γ, z as mere parameters.
The coefficients of this expansion:

satisfy the following recurrence relation:

Translate this relation into a functional equation for F, considered as a function
of x. Write down the resulting formula in the usual notation F(α, β, γ; z).
[Answer:

1.15. The Laplace transform


Let us choose the input function of the integral transform (12.1) in the form

In order to satisfy the condition (11.3), it is necessary and sufficient that
the real part of a shall be larger than ½:

With this choice of g(t) the integral transform (12.1) becomes

We can evaluate this integral by making the substitution at = t₁,
obtaining

If we expand the right side by Newton's binomial formula, we obtain

On the other hand, according to the general theory [cf. (12.3-4)]:

with

Hence we obtain, in view of (5):

The integral transform

called the "Laplace transform", is one of the most important transforms of
applied analysis, fundamental in many problems of mathematical physics
and engineering (cf. Chapter 6.10). The left side of (8) is by definition the
Laplace transform of L_k(t). The right side yields this transform in a remark-
ably simple explicit form.
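The explicit form referred to here is, in the modern normalisation, ∫_0^∞ e^{−at} L_k(t) dt = (a − 1)^k/a^{k+1}. The sketch below (Python) confirms it by Gauss-Laguerre quadrature, writing e^{−at} = e^{−t}·e^{−(a−1)t} so that the Laguerre weight can be reused:

    import numpy as np
    from numpy.polynomial.laguerre import Laguerre, laggauss

    t, w = laggauss(60)                  # integral_0^inf e^{-t} (.) dt
    for a in (0.7, 1.5, 3.0):            # Re a > 1/2, as the text requires
        for k in range(6):
            lhs = np.dot(w, np.exp(-(a - 1.0) * t) * Laguerre.basis(k)(t))
            rhs = (a - 1.0)**k / a**(k + 1)
            assert abs(lhs - rhs) < 1e-6, (a, k, lhs, rhs)
    print("Laplace transforms of L_k verified")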
The result (8) has the following important consequence. Let us expand
the input function g(t) of the Laplace transform into Laguerre polynomials:

If this sum is introduced in (9) and we integrate term by term, we get:

This expansion converges for all values of the complex variable a whose
real part is greater than zero.

Sometimes our aim is to obtain the input function g(t) from a known
Laplace transform f(a). In this case the expansion of g(t) in Laguerre
polynomials would not be feasible, since this expansion goes beyond all
bounds as t goes to infinity. But if it so happens that g(t) is quadratically
integrable without the weight factor e^{−t}, that is, ∫ g²(t) dt = finite, then we can
expand g(t) into the orthonormal Laguerre functions, obtained by multiplying
the Laguerre polynomials by e^{−t/2}. In this case:

and

If we now introduce a new variable ξ by putting

and expand the function

in a Taylor series around the centre 1:

the coefficients of this series yield directly the coefficients of the series (12).
This procedure is frequently satisfactory even from the numerical standpoint.*
Does the Laplace transform permit the Gregory-Newton type of interpolation?
This is indeed the case, as we can see if we consider that the
function e^{−xξ} allows a binomial expansion, on account of Newton's formula:

If we multiply by Φ(ξ) and integrate between 0 and ∞ term by term
(assuming that Φ(ξ) goes to infinity weaker than e^{εξ}, ε being an arbitrarily
small positive constant), then our expansion remains convergent and yields
the Gregory-Newton expansion of the Laplace transform

* Cf. A. A., p. 292.



Problem 31. Show that the condition (2) yields for the convergence of the
binomial expansion (5) the condition

Problem 32. Choose the input function of the integral transform (12.10) in the
form (1) and deduce the following relation:

This gives the Laplace transform of t^p L_k^{(p)}(t) in explicit form.


Problem 33. Find the input function g(t) of the integral transform (12.1) which
leads to the Laplace transform (18).
[Answer:

Problem 34. Obtain the Gregory-Newton expansion of the Laplace transform


(18) whose input function is


Problem 35. Show that a Laplace transform allows binomial interpolation in
a still different sense, namely by applying the Gregory-Newton expansion to the
function f(x)/x!. [Hint: Assume that in the integral transform (12.1) g(t) = 0
for t > 1.]
Problem 36. Show that the binomial interpolation of the Laplace transform
is possible not only with f(m) as key-values but with f(βm) as key-values, where
β is an arbitrary positive constant.

1.16. The Stirling expansion


The interpolation by central differences led to the Stirling type of
expansion (cf. Section 8). Here again the question arises whether this
method of interpolation could be used unlimitedly on tabular values of a
function which is tabulated in equidistant intervals between — oo and + oo.
Hence the key-values are now/( + ra) where m assumes the values 0, 1, 2,. . . ,
to infinity. Once more it is clear that only a restricted class of functions
will submit itself to this kind of interpolation and our aim will be to
circumscribe this class.
If in our previous discussions the fundamental functions were the Laguerre
polynomials (which represented a special class of hypergeometric functions),
we will once more turn to the hypergeometric series and consider two
particular cases, characterised by the following choice of the parameters:

and

If we multiply numerator and denominator by 2^k, we get rid of the fractions
in the denominator, and the product γ(γ + 1) . . . (γ + k − 1) becomes
1·3·5 . . . (2k − 1) in the first case and 1·3·5 . . . (2k + 1) in the second.
Furthermore, the k! in the denominator can be written in the form
2·4 . . . (2k), if we multiply once more numerator and denominator by 2^k.
The two factors of the denominator combine into (2k)! in the first case and
(2k + 1)! in the second. We thus obtain the two expansions

where Φ_{2k}(x) and Φ_{2k+1}(x) are the Stirling functions, encountered earlier in
(8.3) and (8.6).
The hypergeometric functions represented by these expansions are
obtainable in closed form. Let us consider the differential equation of
Gauss which defines the hypergeometric function

For the special case (1) this differential equation becomes

while the choice (2) yields

Problem 37. Transform t into the new variable θ by the transformation

Show that in the new variable the differential equation (6) becomes


while in the case (7) we get


for

If we adopt the new angle variable θ for the expansions (3) and (4), we observe
that the functions F(−x, x, ½; sin² θ/2) and F(−x + 1, x + 1, 3/2; sin² θ/2) are
even functions of θ. Hence in the general solution of (9):
y = A cos xθ + B sin xθ
the sine-part must drop out, while the constant A must be chosen as 1, since
for θ = 0 the right side is reduced to 1. We thus obtain

and by a similar argument

Hence
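The two closed forms, F(−x, x, ½; sin² θ/2) = cos xθ and F(1 − x, 1 + x, 3/2; sin² θ/2) = sin xθ/(x sin θ), are classical identities, and a few lines (Python, using mpmath) confirm them at sample points:

    import math
    from mpmath import hyp2f1

    for x in (0.5, 1.7, 3.0):
        for theta in (0.3, 1.0, 2.0):
            u = math.sin(theta / 2) ** 2
            assert abs(hyp2f1(-x, x, 0.5, u) - math.cos(x * theta)) < 1e-12
            assert abs(hyp2f1(1 - x, 1 + x, 1.5, u)
                       - math.sin(x * theta) / (x * math.sin(theta))) < 1e-12
    print("closed forms (14) and (15) verified")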

1.17. Operations with the Stirling functions


The two fundamental operations in the process of central differencing are
(cf. Section 8):

Let us perform these operations on the left sides of the series (14) and (15):

If we apply the same operation term by term on the right sides, we obtain
the following operational equations (considering t = sin² θ/2 as a variable
and equating powers of t):

Furthermore, the identities

yield the following operational relations:

On the basis of these relations we see that, if we put

we must have

and

Problem 38. Show that

and

This establishes the two hypergeometric series (16.14) and (16.15) as infinite
Stirling expansions.
1.18. An integral transform of the Fourier type
On the basis of our previous results we can now establish a particular but
important class of functions which allow the infinite Stirling expansion.
First of all we will combine the two series (16.14) and (16.15) in the following
complex form:

with

Since the hypergeometric series converges for all |t| = |sin² θ/2| < 1, we can

make use of this series for any θ which varies between −π and +π. If we
now multiply by an absolutely integrable function φ(θ) and integrate between
the limits −π, +π, we obtain the following integral transform:

This f(x) allows the infinite Stirling expansion, which means that f(x) is
uniquely determined if it is given at all integer values x = ±m. We can
now form the successive central differences δ^{2k}f(0) and μδ^{2k+1}f(0), also
obtainable according to (8.11) and (8.12) by a binomial weighting of the
functional values f(±m) themselves, and expand f(x) in an infinite series:

However, the formulae (2-4) show that the coefficients g_k are also obtainable
by evaluating the following definite integrals:

The integral transform (4) is a special case of the so-called "Fourier
transform" which is defined quite similarly to (4) but with the limits ±∞.
We can conceive the transform (4) as that case of the Fourier transform for
which φ(θ) vanishes everywhere outside the limits ±π. The analytical form
of (4) shows that it represents an analytical function of the complex variable x,
throughout the entire complex plane. Furthermore, we know from the
nature of the hypergeometric series that the series (16.14-15) remain valid for
arbitrary complex values of x. Hence the series (6) not only interpolates
the functional values f(±m) on the real axis, but extrapolates them to any
value of the complex plane.
Problem 39. Given the following data. The function f(x) = cos πx assumes at
integer points the values f(±m) = (−1)^m. Moreover, the function allows the
Stirling type of interpolation. Show that these data are sufficient for the unique
determination of cos πx at all points x. [Hint: derive the series (16.14) (for
θ = π) from the given data by forming the successive central differences.]
Problem 40. Given the following data. The function sin πx vanishes at all
integer points. At x = 0 it goes to zero like πx. It is, if divided by x,
expandable into an infinite Stirling series. Show that these data are sufficient
for obtaining sin πx at all points. [Hint: Consider the Stirling expansion of
sin πx/(πx) and derive the following series:

Problem 41. Assume the input function φ(θ) of the integral transform (4) in
the form

(m = integer). Then the function f(x) becomes

The values of f(x) at integer points are all zero, except at x = m where the
function assumes the value (−1)^m. Hence the binomial weighting of the
functional values is particularly simple. Derive the expansion

which, if written in the general form (17.11), possesses the following expansion
coefficients:

The same coefficients are obtainable, however, on the basis of the integrals (7)
and (8). Hence obtain the following formulae:

(The second integral is reducible to the first. Show the consistency of the
two expressions.)
Problem 42. Show that the first 2m − 1 terms of the Stirling expansion (12)
drop out, because their coefficients are zero.
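The case m = 0 of Problem 41 is the function sin πx/(πx) of Problem 40 and makes a convenient numerical illustration of the whole section: the integer samples are f(m) = δ_{m0}, so the binomial weighting gives δ^{2k}f(0) = (−1)^k C(2k, k), and with the standard form Φ_{2k}(x) = x²(x² − 1²) · · · (x² − (k − 1)²)/(2k)! (our reading of (8.3)) the successive factors combine into a simple ratio. A sketch (Python):

    import math

    x = 0.5                       # interpolate halfway between the samples
    term, s = 1.0, 0.0            # term_k = delta^{2k} f(0) * Phi_{2k}(x)
    for k in range(4000):
        s += term
        # delta^{2k}f(0) = (-1)^k C(2k,k) and the Phi ratio combine into:
        term *= (k*k - x*x) / (k + 1.0)**2
    # converges slowly and monotonically from above, as observed in Section 9
    print(s, math.sin(math.pi * x) / (math.pi * x))   # both -> 2/pi = 0.63662...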

1.19. Recurrence relations associated with the Stirling series


As in the case of the Gregory-Newton series (cf. Section 14), we can once
more ask for those operations which leave the Stirling series invariant. We
have found already two such operations: μ and δ. They had the property
that if they operate on the functions Φ_k, they generate a linear combination
of these functions, without changing the form of the series. They merely
re-arrange the coefficients g_k. We will employ the following notation.
If we write δg_k, this should mean: the change of the g_k, due to the operation

δf(x). Hence, e.g., the equation δg_k = g_{k+2} shall signify that in consequence
of the operation δf(x) the coefficient g_k of the expansion is to be replaced by
g_{k+2}. With this convention we obtain from the operational equations
(17.5, 6, 9, 10):

Now the two operations μ and δ can be combined and repeated any number
of times.
Since by definition

we obtain

and hence we can obtain an arbitrary f(x ± m) with the help of the two
operations μ and δ. But we still need another operation we possessed in the
case of simple differences, namely the multiplication by x (cf. 14.8). This
operation is obtainable by the differentiation of the series (16.14) and (16.15).
For this purpose we return to our original variable t (cf. 16.8), but multiplied
by −4:

Then the series (16.14) and (16.15) become

and now, differentiating the first series with respect to τ and subtracting the
second series, after multiplying it by x/2, we obtain

which leads to the relation, encountered before (cf. 8.7):


Let us now differentiate the second series with respect to τ and multiply
it by sin² θ = −τ(1 + τ/4). This gives

and, moving over the first term to the right side:

This, in view of (9), yields the relation
Accordingly, we can extend the rules (1-3) by the two additional rules:

We see that any linear recurrence relation which may exist between the
functional values f(x + m), with coefficients which are polynomials of x, can be
translated into a linear recurrence relation for the coefficients of the Stirling
expansion.

Problem 43. Show that the two operations δ and μ are commutative:

Moreover show that the operation μ² is reducible to the operation δ, according
to the relation

Problem 44. Find a recurrence relation for the expansion coefficients (18.13) on
the basis of a recurrence relation for the function (18.11). Verify this relation.
[Answer:

(since sin πx vanishes at all integer points).

Problem 45. Obtain the recurrence relations corresponding to the functional
equation

and show that both the g_{2k} (representing the even part of (18.11)) and the
g_{2k+1} (representing the odd part of (18.11)) satisfy the appropriate relation.

[Answer:

1.20. Interpolation of the Fourier transform


We have discussed in detail the Stirling expansion of the integral transform
(18.4) which was a special example of the class of functions which permit the
infinite Stirling series. It so happens, however, that the same transform can
be interpolated in a still different manner, although employing once more
the same key-values f(±m). We will expand φ(θ) in an infinite Fourier
series (cf. Chapter 2):*

Then we obtain (cf. 18.11) the series

The coefficients c_k of the expansion (1) are the Fourier coefficients

Hence the expansion (2) becomes

This series is very different from the Stirling series since the functions of
interpolation are not polynomials in x but the trigonometric functions

which are bounded by ±1. Moreover, the functional values f(±m) appear
in themselves, and not in binomial weighting. The convergence of the new
series is thus much stronger than that of the Stirling series.
* The notation ±∞ as summation limits means that the terms k = ±m are taken
into account in succession while m assumes the values 1, 2, 3, . . . (the initial value
k = 0 is taken only once).

If we separate the even and the odd parts of the function f(x), the
expansion (4) will appear in the following form:

(The prime in the first sum refers to the convention that the term k = 0
should be taken with half weight.)
The function φ(θ) of the transform (18.4) may be chosen in the following
extreme fashion: φ(θ) vanishes everywhere, except in the infinitesimal
neighbourhood of the point θ = θ_1. With this choice of φ(θ) we see that
e^{−iθ_1 x} itself may be considered as a Fourier transform which permits the
expansion (6), provided that θ_1 is smaller than π.
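In modern terminology the expansion (4) is the cardinal series of the sampling theorem: for φ(θ) supported inside (−π, π) one has f(x) = Σ_m f(m) sin π(x − m)/(π(x − m)). This is our reading of the lost display; it agrees with the stated properties, namely that the values f(±m) enter directly and that the interpolating functions are bounded by ±1. A numerical sketch for the extreme choice just described, f(x) = cos θ_1 x with θ_1 < π:

    import math

    def cardinal(f_at_integers, x, N):
        """Truncated cardinal series  sum_{|m|<=N} f(m) sinc(x - m)."""
        s = 0.0
        for m in range(-N, N + 1):
            u = math.pi * (x - m)
            s += f_at_integers(m) * (math.sin(u) / u if u else 1.0)
        return s

    theta1 = 2.0                      # any value below pi
    f = lambda m: math.cos(theta1 * m)
    x = 0.5
    for N in (10, 100, 1000, 10000):
        print(N, cardinal(f, x, N))   # truncation error decays only like 1/N
    print("exact:", math.cos(theta1 * x))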

Problem 46. Obtain the following expansions:

Problem 47. Obtain the following expansions:



Problem 48. The limiting value θ = π is still permissible for the expansion of
cos θx. Obtain the series

Problem 49. Consider the Fourier transform

and show that it permits an interpolation by powers and also by trigonometric
functions in terms of the key-values f(±½k). Write down the infinite Stirling
series associated with this function.
[Answer:

Problem 50. Show that the integral transform (18.4) allows the Stirling expansion
also in the key-values x = ±βm, where β is any positive number between 0 and 1.

1.21. The general integral transform associated with the Stirling series
The two series (16.3) and (16.4) had been of great value for the derivation
of the fundamental operational properties of the Stirling functions. The
same series were used in the construction of the integral transform (18.4)
which characterised a large class of functions which permitted the Stirling
kind of interpolation (and extrapolation) in an infinite domain. We will
now generalise our construction to an integral transform which shall include
the entire class of functions to which the infinite Stirling expansion is
applicable. We first consider the even part of the function: ½[f(x) + f(−x)],
which can be expanded with the help of the even Stirling functions Φ_{2k}(x).
Once more we use the special series (16.14), but without abandoning the
original variable t which shall now be considered as a complex variable −z
whose absolute value is smaller than 1:

In a similar manner as before, we multiply by an auxiliary function g(z) and


integrate over a certain path. However, instead of choosing as the path of
integration the real axis between z = — 1 and 0, we will now choose a closed
circle of the complex plane:

We then obtain the integral transform

We can approach the limit ρ = 1, without losing convergence. If we do
so, we get

with

Conversely, let us assume that we have an infinite Stirling series

which converges. Then we define the function g(θ) by the infinite Fourier
series

because, in view of the orthogonality of the Fourier functions e^{ikθ}, the
integral (5) becomes indeed g_{2k}. The integral transform (4) is thus not only
sufficient but even necessary for the characterisation of an even function
which possesses an infinite convergent Stirling expansion.
The function g(θ) is closely related to a function of the complex variable
z defined as follows:

Then on the unit circle we obtain g(θ)/e^{iθ}, and if the series converges at
|z| = 1, it will certainly converge also for |z| > 1. Hence the integral
transform (4) may also be written in the form

with the understanding that the range of integration is any closed curve on
which G(z) is analytical, and which includes all singularities of G(z), but
excludes the point z = −1 (which is the point of singularity of the function
(1)). The function G(z) is analytical everywhere outside the unit circle and
will frequently remain analytical even inside, except for certain singular
points.

As to the odd part ½[f(x) − f(−x)] of the function f(x), we have seen that
the Stirling expansion of an odd function is formally identical with the
Stirling expansion of an even function (with the absolute term zero) divided
by x (cf. Section 8). Hence the general representation of the class of
functions which permits the Stirling kind of interpolation in an infinite
domain may be given in the form of the following integral transform:

where G_1(z), G_2(z) are arbitrary functions, analytical outside and on the
unit circle, and satisfying the auxiliary condition

Problem 51. Let the function f(x) be defined as one of the Stirling functions
Φ_{2k}(x), respectively Φ_{2k−1}(x). Find the corresponding generating functions
G_1(z), G_2(z).
[Answer:

Problem 52. Show that, if f(x) is an even polynomial of the order 2n, the
generating function G_1(z) is a polynomial of the order n + 1 in z^{−1}, while
G_2(z) = 0. If f(x) is an odd polynomial of the order 2n − 1, the same is true
of G_2(z) (with the term z^{−1} missing), while G_1(z) = 0.
Problem 53. Find the generating functions of the functions (16.12) and (16.13).
[Answer:

Problem 54. Find the generating functions of the integral transform (18.4).

In the case that φ(θ) is differentiable, G_2(z) may be written as follows:



Problem 55. Find the generating functions of (18.11).


[Answer:

1.22. Interpolation of the Bessel functions


Our previous discussions have shown that the interpolation of an equi-
distantly tabulated function with the help of central differences is not
necessarily a convergent process. In fact, only a very limited class of entire
analytical functions, which allow representation in the form (21.10), can be
interpolated in the Stirling fashion. We frequently encounter integral
transforms of a different type which may allow interpolation by completely
different tools. A good example is provided by the Bessel functions J_p(x)
which depend on the variable x, but also on the order p. Let us first
consider the Bessel functions of integer order J_n(x). They are defined by
the integral transform

We see that the Bessel function J_n(x) is an entire function of x which has the
form of the Fourier transform (18.4) if cos φ is introduced as a new variable
θ. Consequently the conditions for the applicability of the interpolation
in central differences are fulfilled.
Quite different is the situation with respect to the order p of the Bessel
functions. If p is not an integer, the definition (1) does not hold, but has
to be replaced by the following definition:
where

Now the function cos (x sin φ), considered as a function of φ, is an even
function of φ and it is periodic, with respect to the period 2π. Such a
function can be expanded into a Fourier cosine series:

where

as we can see from the definition (1) of the Bessel functions, for n = 2k.
Hence we obtain the series

If we substitute this series in (3), and integrate term by term, we obtain

where we have put

These integrals are available in closed form:

and thus, going back to the original J_p(x) according to (2), we obtain the
following interpolation of an arbitrary J_p(x) in terms of the Bessel functions
of even order:

This formula can be conceived as a generalisation of the recurrence relation

which is a special case of (10), for p = 1. The series on the right terminates
for any integer value of p and expresses the function J_n(x)x^{−n} as a certain
weighted mean of the Bessel functions of even order, up to J_{2n}(x), with
coefficients which are independent of x.

Problem 56. What is the maximum tabulation interval Δx = β for the key-
values J_n(βm) to allow convergence of the Stirling interpolation? What is the
same interval for interpolation by simple differences?
[Answer:

Problem 57. Answer the same questions if the tabulated function is ex.
[Answer:

Problem 58. The Harvard Tables* give the following values of the Bessel
functions of even order at the point x = 3.5:

Obtain J_{3.5}(3.5) by interpolation, and compare the result with the correct value.
The Bessel functions of half-order are expressible in closed form in terms of
elementary functions, in particular:

[Answer: 0.293783539
Correct value: 0.293783454]
Another aspect of the interpolation properties of the Bessel functions
reveals itself if we write the formula (2), (3) in the following form:

We fix x and consider p as the variable. Then we have an integral transform
which has clearly the form (12.10) if we introduce t = ½x cos² φ as a
new integration variable and consider g(t) beyond the upper limit ½x as
identically zero. This shows that J_p(x), considered as a function of p,
belongs to that class of functions which allows the application of the Gregory-
Newton type of interpolation. The Bessel functions of non-integer order are
thus calculable in terms of the Bessel functions of integer order of the
same argument, using the method of simple differences.
Problem 59. Obtain J_{3.5}(3.5) by Gregory-Newton interpolation, using the values
of J_3(3.5), J_4(3.5), . . ., J_{14}(3.5). We complete the table (14) by the following
tabular values:

* The Annals of the Computation Laboratory of Harvard University (Harvard
University Press, 1947).

[Answer: 0.2941956626 (observe the very slow convergence, compared with the
result in Problem 58)]
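The computation is easy to repeat with a modern Bessel routine in place of the Harvard Tables. The sketch below (Python, scipy) forms the simple differences of J_{3+k}(3.5) and sums the Gregory-Newton series at the half-integer order:

    from scipy.special import jv

    def binom(x, k):                     # generalised C(x, k)
        out = 1.0
        for j in range(k):
            out *= (x - j) / (j + 1)
        return out

    x_arg = 3.5
    col = [jv(3 + k, x_arg) for k in range(12)]   # J_3(3.5) ... J_14(3.5)

    s = 0.0
    for k in range(12):
        s += binom(0.5, k) * col[0]      # Delta^k f(0) with f(k) = J_{3+k}(3.5)
        col = [b - a for a, b in zip(col, col[1:])]

    print(s)                 # ~0.29420, against the correct J_{3.5}(3.5) below
    print(jv(3.5, x_arg))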
Problem 60. Riemann's zeta-function can be defined by the following definite
integral, valid for all z > 0:

a) Show that ζ(z + 1) has a simple pole at z = 0.
b) Show that zζ(z + 1) allows in the right half plane the Gregory-Newton
type of interpolation.

BIBLIOGRAPHY

[1] Jordan, Ch., Calculus of Finite Differences (Chelsea, New York, 1950)
[2] Milne, W. E., Numerical Calculus (Princeton University Press, 1949)
[3] Milne-Thomson, L. M., The Calculus of Finite Differences (Macmillan,
London, 1933)
[4] Whittaker, E. T., and G. Robinson, The Calculus of Observations (Blackie &
Sons, London, 1924)
[5] Whittaker, J. M., Interpolatory Function Theory (Cambridge University
Press, 1935)
CHAPTER 2

HARMONIC ANALYSIS

Synopsis. The Fourier series was historically the first example of an


expansion into orthogonal functions and retained its supreme import-
ance as the most universal tool of applied mathematics. We study in
this chapter some of its conspicuous properties and investigate particu-
larly the "Gibbs oscillations" which arise on truncating the series after
a finite number of terms. By the method of the "sigma smoothing"
the convergence of the series is increased, due to a reduction of the
amplitudes of the Gibbs oscillations. This brings us to a brief investi-
gation of the interesting asymptotic properties of the sigma factors.

2.1. Introduction
In the first chapter we studied the properties of polynomial approximations
and came to the conclusion that the powers of x are not well suited to the
approximation of equidistant data. A function tabulated or observed at
equidistant points does not lend itself easily to polynomial interpolation,
even if the points are closely spaced. We have no guarantee that the error
oscillations between the points of interpolation will decrease with an increase
of the order of the interpolating polynomial. To the contrary, only a very
restricted class of functions allows unlimited approximation by powers. If
the function does not belong to this special class of functions, the error
oscillations will decrease up to a certain point and then increase again.
In marked contrast to the powers are the trigonometric functions which we
will study in the present chapter. These functions show a remarkable
flexibility in their ability to interpolate even under adverse conditions. At
the same time they have no "extrapolating" faculty. The validity of the
approximation is strictly limited to the real range.
The approximations obtainable by trigonometric functions fall into two
categories: we may have the function f(x) given in a finite range and our
aim may be to find a close approximation—and in the limit representation—
with the help of a trigonometric series; or we may have f(x) given in a
discrete number of equidistant points and our aim is to construct a well-
approximating trigonometric series, in terms of the given discrete data.
In the first case the theory of the Fourier series is involved; in the second
case, the theory of trigonometric interpolation.

The basic theory of harmonic analysis is concerned with the convergence


properties of the Fourier series. But in the actual applications of the
Fourier series we have to be concerned not only with the convergence of the
infinite series but with the error bounds of the finite series. It is not enough
to know that, taking more and more terms of the series, the error—that is
the difference between function and series—tends to zero. We must be
able to estimate what the maximum error of the finite expansion is, if we
truncate the Fourier series at an arbitrarily given point. We must also
have proper estimates in the case of trigonometric interpolation. The
present chapter is devoted to problems of this kind.
2.2. The Fourier series for differentiable functions
The elementary theory of the Fourier series proceeds in the following
manner. We take the infinite expansion

for granted. We assume that this series is valid in the range [−π, +π],
and is uniformly convergent in that range. If we multiply on both sides
by cos kx, respectively sin kx, and integrate term by term, we obtain, in
view of the orthogonality of the Fourier functions, the well-known expressions

These coefficients can be constructed if f(x) is merely integrable, without
demanding differentiability. We do not know yet, however, whether the
infinite series (1) thus constructed will truly converge and actually represent
f(x). This is in fact not necessarily the case, even if f(x) is everywhere
continuous.
Let us, however, assume that f(x) is not only continuous but even
sectionally differentiable throughout the range, which means that f′(x) exists
everywhere, although the continuity of f′(x) is not demanded. Furthermore,
let us assume the existence of the boundary condition

Then the coefficients (2) (omitting the constant term ½a_0) are expressible
with the help of f′(x), by using the method of integrating by parts:

The boundary terms vanish (the first automatically, the second in view of
the boundary condition (3)) and now, substituting back in the formal
series (1), we obtain the infinite sum

The question is whether or not this infinite sum will converge to
f(x) − ½a_0 at all points of the interval.
This calls our attention to the investigation of the following infinite sum
which depends on the two variables x and ξ but becomes in fact a function
of the single variable

alone:

If this sum converges uniformly, then a term by term integration is
permitted and the sum (5) becomes replaceable by the definite integral

The function G_1(ξ − x) is called the "kernel" of this integral.


Now the simple law of the coefficients of the infinite sum permits us to
actually perform the summation and obtain the sum in closed form. The
result is as follows (cf. Problem 63):

The convergence is uniform at all points θ, excluding only the point of
discontinuity θ = 0, where the series gives zero, which is the arithmetic
mean of the two limiting ordinates.
We can now proceed to the evaluation of the integral (8). For this and
later purposes it is of great convenience to extend the realm of validity of

the function f(x) beyond the original range [ — IT, + TT]. We do that by
defining f(x) as a periodic function of the period 2ir:

By this law f(x) is now uniquely determined everywhere. Then the integral
(8) can now be put in the following form, introducing £ — x = 0 as a new
integration variable and realising that the integral over a full period can
always be normalised to the limits — TT, + TT :

In the second term G′_1(θ) can be replaced by the constant 1/2π. In the
first term, in view of the discontinuity of G_1(θ), we have to take the boundary
term between −π and 0⁻ and again between 0⁺ and π. In view of the
periodicity of the boundary term the contribution from the two boundaries
at ±π vanishes and what remains becomes

Hence

We have thus shown that any continuous and sectionally differentiable
function which satisfies the boundary condition (3) allows a uniformly
convergent Fourier expansion at every point of the range.
Problem 61. Let f(x) be defined between 0 and π. How must we define f(−x) if
a) all cosine terms
b) all sine terms
c) all even harmonics
d) all odd harmonics
shall drop out.
[Answer:

Problem 62. What symmetry conditions are demanded of f(x) if we want


a) a sine series with even harmonics
b) a sine series with odd harmonics
c) a cosine series with even harmonics
d) a cosine series with odd harmonics
[Answer:

Problem 63. Consider the Taylor expansion of log (1 − z):

which converges everywhere inside and on the unit circle, excluding the point
z = 1. Put z = e^{iθ} and obtain the infinite sums

[Answer:
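The two sums asked for are the classical results Σ_{k≥1} cos kθ/k = −log(2 sin θ/2) and Σ_{k≥1} sin kθ/k = (π − θ)/2, valid for 0 < θ < 2π; the second supplies the closed form of the kernel in (9). A quick numerical confirmation (Python):

    import math

    def partial_sums(theta, n):
        c = sum(math.cos(k * theta) / k for k in range(1, n + 1))
        s = sum(math.sin(k * theta) / k for k in range(1, n + 1))
        return c, s

    theta = 1.0
    c, s = partial_sums(theta, 200000)
    print(c, -math.log(2 * math.sin(theta / 2)))   # both ~ 0.0420
    print(s, (math.pi - theta) / 2)                # both ~ 1.0708 (slow convergence)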

2.3. The remainder of the finite Fourier expansion


To show the uniform convergence of an infinite expansion is not enough.
It merely demonstrates that taking in a sufficient number of terms we can
make the difference between f(x) and the nth sum f_n(x) as small as we wish.
It does not answer the more decisive question: how near are we to f(x) if we
stop with a definite f_n(x), where n is not too small but not arbitrarily large
either? We can answer this question by taking advantage of the favourable
circumstance that the kernel G_1(ξ − x) depends on the single variable
ξ − x = θ only. Let us assume that we can evaluate with sufficient
accuracy the infinite sum

Then we will immediately possess a definite expression for the remainder of
the finite Fourier series

in the following form:

This integral can now be used for estimation purposes, by replacing the
integrand by its absolute value:

The second factor is quite independent of the function f(x) and a mere
numerical constant for every n. Hence we can put

and obtain the following estimation of the remainder at any point of the
range:

Our problem is thus reduced to the evaluation of the infinite sum (1).
We shall have frequent occasion to find the sum of terms which appear as
the product of a periodic function times another function which changes
slowly as we go from term to term. For example the change of 1/x is slow
if we go from 1/(n + k) to 1/(n + k + 1), assuming that n is large. Let us
assume that we have to obtain a sum of the following general character:

where φ(k) changes slowly from k to k + 1. Let us integrate around the
point ξ = k between k + ½ and k − ½, making use of the fact that φ(k)
remains practically constant in this interval:

This means that the summation over k is replaceable by the following
definite integral:

Applying this procedure to the series (1) we obtain (in good approximation),
for θ > 0:

where Si (x) is the so-called "sine-integral"

(If n is large, we can replace n + ½ by n with a small error.) For estimation


purposes it is unnecessary to employ complete accuracy. The expression
(10) shows that, except for very small angles, we are almost immediately in
the "asymptotic range" of Si (x) where we can put

Under these circumstances we obtain with sufficient accuracy:

and

Problem 64. If the summation on the left side of (3.9) extends only to k = n,
the upper limit of the integral becomes n + ½. Derive by this integration
method the following trigonometric identities, and check them by the sum
formula of a geometrical series:

Problem 65. Prove that the "mean square error"

of the finite Fourier expansion is in the following relation to the Fourier


coefficients:

Problem 66. Evaluate the mean square error of the Fourier series (2.7) and
prove that, while the maximum of the local error η_n(x) remains constantly ½,
the mean square error converges to zero with n^{−1/2}.
[Answer:

2.4. Functions of higher differentiability


It may happen that f(x) belongs to a class of functions of still higher
differentiability. Let us assume that f(x) is twice differentiable, although
the continuity of f″(x) is not assumed, so that the continuity of f′(x) and
the sectional existence of f″(x) suffice. We now demand the two boundary
conditions:

In fact, we will immediately proceed to the general case, in which the
existence of the mth derivative is assumed, coupled with the boundary
conditions:

(The existence of f^{(m)}(x) without these boundary conditions is of no avail
since we have to extend f(x) beyond the original range by the periodicity
condition (2.10). Without the conditions (2) the mth derivative would fail
to exist at the point x = ±π.)
We can again follow the reasoning of the previous section, the only
difference being that now the integration by parts (2.4) can be repeated,
and applied m times. We thus obtain the coefficients a_k, b_k in terms of the
mth derivative. It is particularly convenient to consider the combination
a_k + ib_k and use the trigonometric functions in complex form:

(We omit k = 0 since it is always understood that our f(x) is the modified
function f(x) − ½a_0 which has no area.) We can write even the entire
Fourier series in complex form, namely

with the understanding that we keep only the real part of this expression.
With this convention we can once more put

where

Once more our aim will be to obtain an error bound for the finite expansion
(3.2) and for that purpose we can again put

where g_n^m(θ) is now defined by the real part of the infinite sum

The method of changing this sum to an integral is once more available and
we obtain

Again we argue that with the exception of a very small range around θ = 0
the asymptotic stage is quickly reached and here we can put

But then, repeating the argument of the previous section, we get for not
too small n:

and

A simpler method of estimation is based on "Cauchy's inequality"



which avoids the use of the absolute value. Applying this fundamental
inequality to the integral (7) we can make use of the orthogonality of the
Fourier functions and obtain the simple expression (which holds exactly
for all ri):

Changing this sum to an integral we obtain the close approximation

and thus we can deduce the following estimation for the local error η_n(x)
at any point of the range:

where N_m is the so-called "norm" of the mth derivative of f(x):

Problem 67. Show that the approximation (15) is "safe" for estimation purposes
because the sum on the left side is always smaller than the result of the integration
given on the right side.
Problem 68. Prove the following inequalities for any f(x) which satisfies the
boundary conditions (4.2) and whose total area is zero:

where β_m is a numerical constant, defined by

The B_{2m} are the Bernoulli numbers: 1/6, 1/30, 1/42, 1/30, . . . (starting with m = 1).

2.5. An alternative method of estimation


If we consider the expression (4.7), we notice that the remainder of a finite
Fourier series appears as a definite integral over the product of two factors.
The one is the mth derivative of the given f(x), the other is a function which
is independent of x for each given n. Our policy has been to estimate the
error on the basis that we took the integral over the absolute value of g_n(θ),
which is a mere constant depending on n, say C_n. This C_n, multiplied by
the maximum value of |f^{(m)}(x)|, gave us an upper bound for the remainder.

We may reverse, however, our procedure by exchanging the role of the
two factors. Under certain circumstances we may fare better by integrating
over the absolute value of f′(x), and multiplying by the maximum of |g_n(θ)|.
It may happen, for example, that f′(x) becomes infinite at some point of the
range, while the integral over |f′(x)| remains finite. In such a case it is
clear that the second method will be preferable to the first.
What can we say about the maximum of |g_n(θ)|? If for the moment we
skip the case of m = 1 and proceed to the case of a twice or more differentiable
function, considered in Section 4, we arrive at the function (4.8) which
has its maximum at θ = 0. By changing the sum to an integral we obtain
with sufficient accuracy:

which yields the estimation

However, in the case of a function which is only once differentiable, we
do not succeed by a similar method, because we lose the factor n in the
denominator and the error no longer goes to zero. The kernel G_1(θ), as we
have seen in (2.11), has a point of discontinuity at θ = 0 with which the
successive approximations are unable to cope. Hence the maximum of
g_n^1(θ) no longer goes to zero with increasing n but remains constantly ½.
And yet even here we succeed if we use the proper precaution. We
divide the range of integration into two parts, namely the realm of small θ
and the realm of large θ. In particular, let us integrate between θ = 0 and
θ = 1/√n and then between θ = 1/√n and θ = π; on the negative side we
do the same. Now in the second realm the function (4) has already attained
its asymptotic value and its maximum is available:

In the central section we use the maximum ½. In this way we obtain as an
error bound for

The first integral is small because the range of integration is small. The
second integral is small on account of the √n in the denominator.
The estimation (4) is, of course, more powerful than the previous estimation
(3.14), although the numerical factor C_n was smaller in the previous case.

Even a jump in the function f(x) is now permitted which would make f'(x)
infinite at the point of the jump, but the integral

remains finite. We see, however, that in the immediate vicinity of the jump
we cannot expect a small error, on account of the first term which remains
finite in this vicinity. We also see that under such conditions the estimated
error decreases very slowly with n.
We fare better in such a case if we first remove the jump in the function
by adding to f(x) the special function G₁(x − x₁), multiplied by a proper
factor a. Since the function aG₁(x − x₁) makes the jump −a at the point
x = x₁, we can compensate for the jump a of the function f(x) at x = x₁
and reduce f(x) to a new function φ(x) which is free of any discontinuities.
For the new function the more efficient estimation of Section 3 (cf. 3.14)
can be employed, while the remainder of the special function aG₁(x − x₁)
is explicitly at our disposal and can be considered separately.
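
A small numerical sketch of this jump-removal device may be helpful; the illustration is my own, with the classical sawtooth x = 2 Σ (−1)^{k+1} sin kx / k playing the role of the auxiliary jump function.

    import numpy as np

    # f(x) = x + x^2/2 on (-pi, pi), extended periodically, jumps at x = pi.
    # Subtracting the sawtooth x, which carries the same jump, leaves the
    # continuous x^2/2, whose coefficients fall off one power faster.
    N = 4096
    x = -np.pi + 2*np.pi*np.arange(N)/N

    def coeff_magnitudes(y, kmax=8):
        c = np.fft.rfft(y) / len(y)          # trigonometric coefficients
        return 2*np.abs(c[1:kmax+1])

    print(coeff_magnitudes(x + 0.5*x**2))    # decay ~ 1/k   (jump present)
    print(coeff_magnitudes(0.5*x**2))        # decay ~ 1/k^2 (jump removed)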
2.6. The Gibbs oscillations of the finite Fourier series
If f(x) is a truly periodic and analytical function, it can be differentiated
any number of times. But it happens much more frequently that the
Fourier series is applied to the representation of a function f(x) which is
given only between −π and +π, and which is made artificially periodic by
extending it beyond the original range. Then we have to insure the boundary
conditions (4.2) by artificial means and usually we do not succeed beyond a
certain m. This means that we have constructed a periodic function which
is m times differentiable but the mth derivative becomes discontinuous at
the point x = x₁. Under such conditions we can put this lack of continuity
to good advantage for an efficient estimation of the remainder of the finite
series and obtain a very definite picture of the manner in which the truncated
series f_n(x) approximates the true function f(x).
We will accordingly assume that f(x) possesses all the derivatives up to
the order m, but f^(m)(x) becomes discontinuous at a certain point x = x₁ of
the range (if the same occurs in several points, we repeat our procedure for
each point separately and obtain the resulting error oscillations by super-
position). Now the formula (4.7) shows that it is not f^(m)(x) in itself, but
the integral over f^(m)(x) which determines the remainder η_n(x) of the truncated
series. Hence, instead of stopping with the mth derivative, we could proceed
to the (m + 1)st derivative and consider the jump in the mth derivative as a
jump in the integral of the (m + 1)st derivative. This has the consequence
that the major part of the integral which determines η_n(x) is reducible to
the immediate neighbourhood of the point x = x\. The same will happen
in the case of a function whose (m + 1)st derivative does not necessarily become
infinite but is merely very large compared with the values in the rest of
the range.
Since our function f(x) became periodic by extending it beyond the

original range of definition, we can shift the origin of the period to any
point x = xi. Hence we do not lose in generality but gain in simplicity if
we place the point of infinity of the (m + 1)st derivative at the point x = 0.
The integration in the immediate vicinity of the point ξ = 0 gives (cf. (4.7)):

The second factor is the jump A of the mth derivative at the point x = 0:

Moreover, since f^(m+1)(ξ) is generally regular and becomes so extreme only
in the vicinity of x = 0 (the same holds if f^(m+1)(ξ) has merely a strong
maximum at x = 0), we can consider the remaining part of the integral as
small, compared with the contribution of the neighbourhood of ξ = 0.
But then the integral (4.7)—replacing m by m + 1—becomes reduced to

We will normalise the magnitude of the jump to 1, since the multiplication
by a constant can be left to the end. Our aim will be to pay closer attention
to the integral (4.9) which is a sufficiently close approximation of the sum
(4.8). We will try to obtain a satisfactory approximation of this integral
in terms of elementary functions. Replacing θ by x, the integral we want
to approximate may be written as follows:

Let us now consider the following function:

(n′ stands for n + ½). Differentiating logarithmically we obtain

The last term in the numerator is very nearly −1/(n′ + ½), on account of
the largeness of n′. We fail only in the domain of very small x but even
there the loss is not too serious if we exclude the case m = 0 which we will
consider separately. But then the effect of this substitution is that the m
of the previous term changes to m + 1, with the consequence that now
numerator and denominator cancel out and the second factor becomes 1.
The resulting expression is now exactly the integrand of (4). We have thus

succeeded with the integration and have merely to substitute the limits 0
and ∞, obtaining

which yields for g_n(x) the expression

Only the real part of this expression must be taken, for positive x. The
transition to negative x occurs according to the following rules:

Let us assume that m is odd:

    m = 2s + 1,
and the real part of (8) yields the even function

where

On the other hand, if m is even: m = 2s, the real part of (8) yields the odd
function

The formulae (11) and (13) demonstrate in explicit form the remarkable
manner in which the truncated series f_n(x) (which terminates with the terms
sin nx, cos nx), approximates the true function f(x). The approximation
winds itself around the true function in the form of high frequency oscilla-
tions (frequently referred to as the "Gibbs oscillations"), which are super-
imposed on the smooth course of f(x) [by definition f_n(x) = f(x) − η_n(x)].
These oscillations appear as of the angular frequency (n + ½), with slowly
changing phase and amplitude. The phase α starts with the value 0 at
x = 0 and quickly increases to nearly π/2, if n is not too small. Accordingly,
the nodal points of the sine-oscillations and the maxima-minima of the
cosine oscillations are near to the points

These points divide the interval between 0 and π into n + 1 nearly equal
sections.

The amplitude of the Gibbs oscillations decreases slowly as we proceed
from the point x = 0—where the break in the mth derivative occurs—
towards the point x = π, where the amplitude is smallest. The decrease is
slow, however, and the order of the amplitudes is constantly of the magnitude

where A is the discontinuity of the mth derivative (cf. (2)). Only in the
immediate vicinity of the point x = 0 are the oscillations of slightly larger
amplitude.
We see that the general phenomenon of the Gibbs oscillations is inde-
pendent of the order m of the derivative in which the discontinuity occurs.
Only the magnitude of the oscillations is strongly diminished as m increases.
But the slow change in amplitude and phase remains of the same character,
whatever m is, provided that n is not too small relative to m.
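
These statements are easy to check numerically. In the sketch below (my own test case, not the book's), f(x) = |x| has a break in the first derivative at x = 0; the zero crossings of the remainder are spaced by roughly π/(n + ½), and away from the break the amplitude shrinks by about a factor 4 each time n is doubled.

    import numpy as np

    x = np.linspace(-np.pi, np.pi, 20001)
    f = np.abs(x)                      # break in f' at x = 0, so m = 1

    def truncated(n):                  # |x| = pi/2 - (4/pi) sum cos(kx)/k^2, odd k
        s = np.full_like(x, np.pi/2)
        for k in range(1, n + 1, 2):
            s -= 4/np.pi * np.cos(k*x) / k**2
        return s

    for n in (8, 16, 32):
        eta = f - truncated(n)
        zeros = np.nonzero(np.diff(np.sign(eta)))[0]
        spacing = np.mean(np.diff(x[zeros]))           # oscillation spacing
        mid_amp = np.abs(eta[np.abs(x - np.pi/2) < 0.5]).max()
        print(n, spacing, np.pi/(n + 0.5), mid_amp)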
The phase-shift between even and odd m—sine vibrations in the first case,
cosine vibrations in the second—is also open to a closer analysis. Let us
write the function f(x) as the arithmetic mean of the even function
g(x) = f(x) + f(−x) and the odd function h(x) = f(x) − f(−x). Now the
Fourier functions of an even function are pure cosines, those of an odd
function pure sines. Hence the remainder η_n(x) shares with the function
its even or odd character. Furthermore, the behaviour of an even,
respectively odd function in the neighbourhood of x = 0 is such that if an
even derivative becomes discontinuous at x = 0, the discontinuity must
belong to h(x). On the other hand, if an odd derivative becomes dis-
continuous at x = 0, that discontinuity must belong to g(x). In the first
case g(x) is smooth compared with h(x), in the second h(x) is smooth com-
pared with g(x). Hence in the first case the cosine oscillations of the
remainder are negligible compared with the sine oscillations, while in the
second case the reverse is true. And thus the discontinuity in an even
derivative at x = 0 makes the error oscillations to an odd function, the dis-
continuity in an odd derivative to an even function.
We can draw a further conclusion from the formulae (11) and (13). If s
is even and thus m of the form 4μ + 1, the error oscillations will start with
a minimum at x = 0, while if s is odd and thus m of the form 4μ + 3, with
a maximum. Consequently the arrow which goes from f(0) to f_n(0) points
in the direction of the break if that break occurs in the first, fifth, ninth,
. . . derivative, and away from the break if it occurs in the third, seventh,
eleventh, . . . derivative. Similarly, if the break occurs in the second, sixth,
tenth, . . . derivative, the tangent of f_n(x) at x = 0 is directed towards the
break; if it occurs in the fourth, eighth, twelfth, . . . derivative, away from
the break (see Figure).
The case m = 0. If the discontinuity occurs in the function itself, we
have the case m = 0. Here the formula (8) loses its significance in the realm
of small x. On the other hand, we have obtained the g_n1(x) of formula

(4.8) (which belongs to our case m = 0) at an earlier occasion, in Section 3
(cf. 3.10), with the following result:

and thus

where A is the discontinuity of the function f(x) at the point x = 0:

In this instance we have an error pattern which is in apparent contra-
diction to the error pattern observed for general values of m. Since m = 0
is an even number, we have to subordinate our case to the sine vibrations of
the formula (13). There we have seen that the tangent of f_n(x) at the point
of discontinuity is oriented alternately toward and away from the dis-
continuity. Since m = 2 realises the "toward" case, m = 0 should realise
the "away" case. And yet, f_n(x) starts its course in the direction of the
break, rather than away from it.
A closer analysis reveals that even here the contradiction is only apparent.
At the very beginning, during the steep ascent of f_n(x), we cannot speak of
"Gibbs oscillations" since these oscillations develop only after the first nodal
point at nx = 1.9264 has been reached. If we continue these oscillations
backwards to the point x = 0, we see that once more the tangent of the
sine oscillation points away from the jump, in full agreement with the
general behaviour of the Gibbs oscillations, established for arbitrary m.
The discontinuity in the function merely adds to these oscillations a single
steep peak in the neighbourhood of the singularity which is of a non-
oscillatory character and is superimposed on the regular pattern of the
Gibbs oscillations.
Although the discussions of the present section were devoted to the case
of a single break in the mth derivative, actually the Gibbs oscillations of a
large class of functions behave quite similarly. A function may not show
any discontinuity in its higher derivatives. It is frequently possible,
however, to approximate this function quite effectively by local polynomials
which fit together smoothly. If we differentiate our approximation, we
notice that the higher derivatives become less and less smooth and a certain
derivative of the order m becomes composed of mere delta functions. This
means that the Gibbs oscillations of this function can be studied by super-
imposing a relatively small number of oscillation patterns of the type we have
exhibited in this section. The resulting pattern will once more show the
same features of a fundamentally constant frequency with amplitudes and
phases which are nearly constant.
The fact that the Gibbs oscillations are of fairly constant period has the
following interesting and important consequence. We do not commit any

serious change of f_n(x) if we shift the nodal points of its error oscillations into
exactly equidistant points. Then we no longer have our original truncated
Fourier series fn(x) but another trigonometric series with slightly modified
coefficients whose sum, however, does not give anything very different from
what we had before. Now this new series has a very definite significance.
It is obtainable by the process of trigonometric interpolation because we can
interpret the new series as a trigonometric expansion of n terms which has
zero error in n equidistantly prescribed points, in other words, which fits
exactly the functional values of f(x) at the "points of interpolation". This
is no longer a problem in infinite series but the algebraic problem of solving
n linear equations for n unknowns; it is solvable by mere summation,
without any integration. In particular we may prescribe the points of
interpolation very near to the nodal points of the Gibbs oscillations if we
agree that the odd part of the function shall be given in the points

and the even part of the function in the points

The approximating series thus obtained have strictly equidistant Gibbs
oscillations, with amplitude modulations, but without phase modulations
(cf. Section 18).
Problem 69. Study the Gibbs oscillations by expanding the function

in a sine series of 12 terms. The discontinuity appears here in the second
derivative at x = ±π. Compare the results with the predictions according to
the formula (13) (multiplied with the magnitude of the jump).
Problem 70. Study the Gibbs oscillations by expanding the function

in a cosine series of 12 terms. The discontinuity appears here in the third
derivative at x = ±π.

2.7. The method of the Green's function


Although we have not mentioned it explicitly, the method we have
employed in the various remainder estimations is closely related to the
method followed in Chapter 1.5, when we were interested in the problem of
Lagrangian interpolation and wanted to estimate the closeness of the approxi-
mation. It is a method which plays a fundamental role in the solution of
differential equations (as we will see later in Chapter 5), and goes under the
name of solving a differential equation by means of the "Green's function".

If a function f(x) is m times differentiable, we can consider f(x) as the
solution of the differential equation

    y^(m)(x) = f^(m)(x),   (1)
where on the right side we have the given mth derivative of the function.
This, however, is not enough for a unique characterisation of f(x) since a
differential equation of mth order demands m additional boundary conditions
to make the problem unique. In the Lagrangian case we succeeded in
characterising the remainder itself by the differential equation (1) and the
added boundary conditions—they were in fact inside conditions—followed
from the added information that the remainder vanishes at the points of
interpolation. In our present problem we will not proceed immediately to
the remainder of the Fourier series but stay first with the function f(x) itself.
We add as boundary conditions the periodicity conditions (4.2) which are
demanded by the nature of the Fourier series. With these added conditions
the function f(x) is now uniquely determined, except for an additional
constant which is left undetermined. We eliminate this freedom by adding
one more condition, namely

    ∫_{−π}^{+π} f(x) dx = 0.   (2)

(This condition is justified since we can always replace f(x) by f(x) − ½a₀,
which indeed satisfies the condition (2).)
Now under these conditions—as we will prove later—we can solve the
given problem in terms of the "Green's function" G(x, ξ):

    f(x) = ∫_{−π}^{+π} G(x, ξ) f^(m)(ξ) dξ,   (3)
where the auxiliary function G(x, ξ) is quite independent of f(x) and
constructed according to certain rules. In our present case it so happens
that this function is reducible to a function of a single variable because we
can show that in fact G(x, ξ) = G(ξ − x).
The operation with the Green's function has the following great advantage.
We want to study the degree of approximation obtainable by a given set
of functions, such as for example the Fourier functions. Then the resolution
(3) has the consequence that we can completely concentrate on the study of
the special function G(x, £), instead of dealing with the general function
f(x). If we know how close we can come in approximating G(x, ξ), we also
know what can be expected for the general function f(x) because, having the
remainder g_n(x, ξ) for the Green's function G(x, ξ), we at once obtain the
remainder for f(x), in form of the definite integral

    η_n(x) = ∫_{−π}^{+π} g_n(x, ξ) f^(m)(ξ) dξ.   (4)
This is in fact how we derived earlier Lagrange's remainder formula (1.5.10).
But in the Lagrangian case we could take advantage of the fact that g_n(x, ξ),
considered as a function of ξ, did not change its sign in the entire interval.
Hence it was not necessary to take the absolute value of g_n(x, ξ) and the
simple integral

could be obtained in closed analytical form. This is not the case now
because in the case of the Fourier series the remainder of the Green's
function has an oscillatory character. In order to estimate the value of (4)
on the basis of the maximum of f^(m)(ξ) we have to replace g_n(ξ − x) by
|g_n(ξ − x)|, which makes the evaluation of the numerical factor in the
remainder formula (4.12) difficult but does not interfere with the fact that
an effective estimation of the error is possible. Moreover, the Lagrangian
remainder formula (1.3.7) operates specifically with the nth derivative while
in our case the order of the derivative m and the order n of the approximating
Fourier series are independent of each other.
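
The resolution (3) can also be imitated numerically. The construction below is my own sketch, not the book's derivation: the kernel is assembled directly from the Fourier series G = (1/2π) Σ_{k≠0} e^{ik(x−ξ)}/(ik)^m, which is one concrete way of realising a G(x, ξ) with the required property for zero-mean periodic functions (here m = 2, and all names are mine).

    import numpy as np

    m, K, N = 2, 60, 2048
    xi = -np.pi + 2*np.pi*np.arange(N)/N
    x0 = 0.7                                     # point at which f is recovered

    f  = lambda t: np.cos(t) + 0.3*np.sin(2*t)   # zero mean, periodic
    fm = lambda t: -np.cos(t) - 1.2*np.sin(2*t)  # its second derivative

    k = np.concatenate([np.arange(-K, 0), np.arange(1, K + 1)])
    phases = np.exp(1j*np.outer(k, x0 - xi))
    G = (phases / (1j*k[:, None])**m).sum(axis=0).real / (2*np.pi)

    integral = (G * fm(xi)).sum() * (2*np.pi/N)  # quadrature on the periodic grid
    print(integral, f(x0))                       # the two values agree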
Problem 71. Derive the Green's function G₂(θ) of (4.6) (for the case m = 2)
from the G₁(θ) of Section 2 (cf. 2.9) on the basis that the infinite sum (4.6)
which defines G₂(θ) is the negative integral of the sum (2.7), together with the
validity of the condition (2).
[Answer:

2.8. Non-differentiable functions. Dirac's delta function


We had no difficulty in proving the convergence of the Fourier series for
functions which could be differentiated at least once. We could also give
good error estimates for such functions, in terms of the highest existing
derivative. However, the function f(x) may not have a derivative at every
point of the range. For example the function G₁(θ), encountered in Section 2,
had a point of discontinuity at the point θ = 0. If we insisted on
treating this function as differentiable, we would have to consider the
derivative at θ = 0 as infinite and thus the estimation of the error on the
basis of the maximum value of |f'(x)| goes out of bounds. And yet, this
function possesses a Fourier series which converges to f(x) at every point x,
even at the point of discontinuity, if we agree that at such a point the value
of f(x) shall be defined by the arithmetic mean of the two limiting ordinates.
How can we estimate the remainder of the Fourier series for an f(x) of this
type? The Green's function method is not applicable here since f(x), if
not differentiable, cannot be considered as the solution of a differential
equation.
The method employed in Section 2 is applicable even without the device
of integrating by parts, if we stay with a finite expansion. The truncated

Fourier series (3.2), which does not give f(x) but merely defines a certain
f_n(x) associated with f(x), can be written in the following form

    f_n(x) = ∫_{−π}^{+π} K_n(ξ − x) f(ξ) dξ,   (1)

where

    K_n(ξ − x) = (1/π) [½ + Σ_{k=1}^{n} cos k(ξ − x)].   (2)

According to the trigonometric identity (3.15) we can obtain the function
K_n(ξ − x), called the "Dirichlet kernel", in closed form:

    K_n(ξ − x) = sin (n + ½)(ξ − x) / [2π sin ½(ξ − x)].   (3)
At any fixed value of x and ξ—and thus also of θ = ξ − x—this function
remains finite (excluding the point x = ξ), no matter how large n may
become. However, the function does not approach any limit as n increases
but keeps oscillating within the same bounds for ever. It is therefore
difficult to apply the function K_n(ξ − x) to an efficient estimation of the
error of the Fourier series. We will emphasise specifically that the con-
vergence of the series (2) is not demanded for the convergence of the
Fourier series. The latter demands only that the integral (1) shall con-
verge. But it is entirely possible that the integration over ξ, with the
weight factor f(ξ), smooths out the strong Gibbs oscillations of the kernel
K(ξ, x) and yields a definite limit. Hence we should not consider K_n(ξ, x)
as a function but rather as an operator, in particular as an integral operator
which has to operate on the function f(ξ) in the sense of a term by term
integration. In this sense it is entirely legitimate to write down the infinite
sum

    K(ξ − x) = (1/π) [½ + Σ_{k=1}^{∞} cos k(ξ − x)],   (4)
which has no meaning as a value, but has meaning as an operator. The
equation

    f(x) = ∫_{−π}^{+π} K(ξ − x) f(ξ) dξ   (5)
can be considered as a correct operational equation for all functions f(x)
which allow a Fourier expansion.
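
As a quick check of this at the finite level, the sketch below (my own, with names of my choosing) integrates a square wave against the closed form (3) of the Dirichlet kernel and compares the result with the directly summed partial series.

    import numpy as np

    N = 4096
    xi = -np.pi + 2*np.pi*np.arange(N)/N
    f = np.where(xi >= 0, 1.0, -1.0)             # square wave

    def K_n(theta, n):                           # closed form (3)
        s = np.sin(theta/2)
        safe = np.where(np.abs(s) < 1e-12, 1.0, s)
        return np.where(np.abs(s) < 1e-12, (2*n + 1)/(2*np.pi),
                        np.sin((n + 0.5)*theta)/(2*np.pi*safe))

    x0, n = 0.4, 7
    via_kernel = (K_n(xi - x0, n) * f).sum() * (2*np.pi/N)
    direct = sum(4/np.pi*np.sin(k*x0)/k for k in range(1, n + 1, 2))
    print(via_kernel, direct)                    # the two values agree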
In the literature of modern physics this cautious approach to the problem
of a divergent series is often replaced by a more direct approach in which

the divergent series (4) assumes a more concrete meaning. Although we
realise that the sum (4), as it stands, has no immediate significance, we may
interpret it in a different manner. We may find a function δ(ξ, x) which
happens to have the series

    δ(ξ, x) = (1/π) [½ + Σ_{k=1}^{∞} cos k(ξ − x)]   (6)
as its Fourier expansion. Then we could replace the infinite sum (4) by
this function and interpret K(ξ − x) not merely as an operator but as an
actual function.
Now the general law (2.2) of the Fourier coefficients tells us that this
hypothetical function δ(ξ, x) must satisfy the following conditions:

    ∫_{−π}^{+π} δ(ξ, x) cos kξ dξ = cos kx,  ∫_{−π}^{+π} δ(ξ, x) sin kξ dξ = sin kx.   (7)
Let us consider a function H(ξ, x) with the following properties. It is an
even function of the single variable ξ − x = θ:

    H(θ) = H(−θ).

Moreover, this function is everywhere zero, with the only exception of the
infinitesimal neighbourhood ±ε of the point θ = 0. Then the expansion
coefficients a_k, b_k of this function become:

where ξ̄ is some intermediate value between x − ε and x + ε. If we
demand that

    ∫_{−π}^{+π} H(ξ, x) dξ = 1

and now let ε go towards zero, the point ξ̄ becomes in the limit equal to x
and we have actually obtained the desired expansion coefficients (7). The
function thus constructed is Dirac's celebrated "delta function" δ(x, ξ) =
δ(ξ, x) = δ(ξ − x). It is comparable to an infinitely sharp needle which
pinpoints one definite value f(x) of the function, if used under the integral
sign as an operator:

    ∫_{−π}^{+π} δ(ξ − x) f(ξ) dξ = f(x).   (11)
Here the previous equation (5), which was an elegant operational method

of writing the Fourier series of a function f(x) (provided that the series
converges), now appears in consequence of the definition of the delta
function. But we come back to the previous equation (5) if we replace
the delta function by its Fourier expansion (6).
Neither the "delta function" nor its Fourier expansion (6) is a legitimate
concept if we divest them of their significance as operators. The delta
function is not a legitimate function because we cannot define a function by
a limit process which does not possess a limit. Nor is the infinite series (6)
a legitimate Fourier series because an infinite sum which does not converge
to a limit is not a legitimate series. Yet this is entirely immaterial if these
constructions are used under the integral sign, since it suffices that the
limits of the performed operations shall exist.
2.9. Smoothing of the Gibbs oscillations by Fejér's method
We have mentioned before that the Gibbs oscillations of the Dirichlet
kernel (8.3) interfere with an efficient estimation of the remainder η_n(x) of
the finite Fourier series. Dirichlet succeeded in proving the convergence
of the Fourier series if certain restricting conditions, called the "Dirichlet
conditions", are demanded of f(x). But a much more sweeping result was
obtained by Fejér who succeeded in extending the validity of the Fourier
series to a much larger class of functions than those which satisfy the
Dirichlet conditions. This generalisation became possible by a modification
of the summation procedure by which the Fourier series is obtained.
The straightforward method by which the coefficients (2.2) of the Fourier
series are derived may lead us to believe that this is the only way by which
a trigonometric series can be constructed. And yet this is by no means so.
What we have proved is only the following: assuming that we possess a
never ending sequence of terms with definite coefficients a_k, b_k whose sum shall
converge to f(x), then these coefficients can be nothing but the Fourier
coefficients. This, however, does not interfere with the possibility that for
a certain finite n we may find much more suitable expansion coefficients
since here we are interested in making the error small for that particular n,
and not in constructing an infinite series with rigid coefficients which in the
limit must give us f(x). We may gain greatly in the efficiency of our
approximating series if we constantly modify the expansion coefficients as
n increases to larger and larger values, instead of operating with a fixed set
of coefficients. And in fact this gives us the possibility by which a much
larger class of functions becomes expandable than if we operate with fixed
coefficients.
Fejér's method of increasing the convergence of a Fourier series consists
in the following device. Instead of merely terminating the series after n
terms (the terms with a_k and b_k always act together, hence we will unite
them as one term) and being satisfied with their sum f_n(x), we will construct
a new sequence by taking the arithmetic means of the original sequence:

    S_n(x) = [f₀(x) + f₁(x) + ⋯ + f_n(x)] / (n + 1).   (1)
This new S_n(x) (the construction of which does not demand the knowledge
of the coefficients a_k, b_k beyond k = n) has better convergence properties
than the original f_n(x). But this S_n(x) may be preferable to f_n(x) quite
apart from the question of convergence. The truncated Fourier series f_n(x)
has the property that it oscillates around the true course of f(x). These
"Gibbs oscillations" sometimes interfere with an efficient operation of the
Fourier series. Fejér's arithmetic mean method has an excellent influence
on these oscillations, by reducing their amplitude and frequently even
eliminating them altogether, making the approach to f(x) entirely smooth,
without any oscillations.
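
The effect is easy to see numerically. In the sketch below (mine, not the book's), the partial sums of the square-wave series overshoot by the familiar Gibbs amount, while the arithmetic means stay inside the limits of the function.

    import numpy as np

    x = np.linspace(-np.pi, np.pi, 4001)

    def partial(n):                      # truncated series of the square wave
        s = np.zeros_like(x)
        for k in range(1, n + 1, 2):
            s += 4/np.pi * np.sin(k*x) / k
        return s

    n = 20
    S_n = np.mean([partial(j) for j in range(n + 1)], axis=0)
    print(partial(n).max())              # about 1.18: Gibbs overshoot
    print(S_n.max())                     # below 1: the overshoot is gone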
Problem 72. Apply the arithmetic mean method to the Dirichlet kernel (8.3)
and show that it becomes transformed into the new kernel (cf. 3.18):

    Φ_n(θ) = [1 / (2π(n + 1))] [sin ((n + 1)θ/2) / sin (θ/2)]²,   (2)

called "Fejér's kernel".


Problem 73. Show that Fejér's arithmetic mean method is equivalent to a
certain weighting of the Fourier coefficients which depends on n:

    a_k → w_k a_k,  b_k → w_k b_k,   (3)

with

    w_k = 1 − k/(n + 1).   (4)
2.10. The remainder of the arithmetic mean method


The great advantage of Fejér's kernel (9.2) compared with the Dirichlet
kernel (8.3) is its increased focusing power. In the denominator we now
find sin² (θ/2) instead of sin (θ/2). Since for small angles the sine is replaceable
by the angle itself, we can say that the Gibbs oscillations of the new kernel
decrease according to the law θ⁻² instead of θ⁻¹. Hence the new kernel
possesses more pronouncedly the properties of the delta function than the
Dirichlet kernel. Let us see how we can now estimate the remainder of an
nth order approximation.
For this purpose we apply the method that we have studied in Section 5.
We have to estimate the integral

    S_n(x) = ∫_{−π}^{+π} Φ_n(θ) f(x + θ) dθ.   (1)

This is now the nth approximation itself and not the remainder of that
approximation. In order to come to the remainder, let us write f(x + θ)
as follows:

    f(x + θ) = f(x) + φ(x, θ).   (2)
Substituting we obtain

    S_n(x) = f(x) ∫_{−π}^{+π} Φ_n(θ) dθ + ∫_{−π}^{+π} Φ_n(θ) φ(x, θ) dθ.   (3)
Now the area under the kernel Φ_n(θ) is 1 because Φ_n(θ) is a weighted sum
of cosines (see 9.4), but each one gives the area zero, except the absolute
term 1/2π which has not been changed by weighting (the corresponding k
being zero) and still gives the area 1. Hence the first term is f(x) and thus
the second term has now to be interpreted as −η_n(x). In this second term
we divide the range of integration into two parts: very small θ and larger θ.
For the realm of larger θ we can again obtain an expression like the last
term of (5.4), although |f'(θ)| is now to be replaced by |f(θ) − f(x)| and
we have to choose the limiting value of θ which separates the two domains
not proportional to 1/√n but proportional to 1/∛n. In order that this
term shall go to zero with increasing n it is only necessary that

    ∫_{−π}^{+π} |f(x)| dx  shall exist.   (4)

Fejér's method demands solely the absolute integrability of f(x), without any
further conditions.
Now we come to the central region and here we cannot use the estimation
based on the maximum of g_n(θ)—which in our earlier case was ½—
because Φ_n(θ) grows out of bounds at θ = 0, as n increases to infinity. But
in this region we can interchange the role of the two factors and take the
maximum value of |φ(x, θ)| multiplied by the integral over the absolute
value of Φ_n(θ), which is certainly less than 1 (Fejér's kernel is everywhere
positive and needs no change on account of the "absolute value" demand).
And thus in the inner domain we have a contribution which is less than the
maximum of the absolute value of φ(x, θ).
Here we cannot argue that this contribution will be small on account of
the small range of integration. But we can argue that we are very near to
the point θ = 0 and have to examine the maximum of the quantity

    |φ(x, θ)| = |f(x + θ) − f(x)|.   (5)
Now the continuity of f(x) is not demanded for the fulfilment of the condition
(4). But if f(x) is not continuous at the point x, then the quantity (5)
will not be small and the arithmetic mean method will not converge to f(x).
If, however, we are at a point where f(x) is continuous, then by the very
definition of continuity (without demanding differentiability), the quantity
(5) becomes arbitrarily small as the domain of θ shrinks to zero. And thus
we have proved that the arithmetic mean method converges to the proper
f(x) at any point in which f(x) is continuous, the only restricting condition
on the class of admissible f(x) being the absolute integrability (4) of the
function.
Problem 74. Carry through the same argument for the case that the continuity
of f(x) holds separately to the right of x and to the left of x but f(x+) and f(x−)
have two different values. Show that in this case the series of Fejér converges
to

    ½ [f(x+) + f(x−)],

that is, the arithmetic mean of the two limiting ordinates.

2.11. Differentiation of the Fourier series


Let us assume that the given function f(x) is m times differentiable at all
points of the range (including the boundaries). The derivative f'(x) of the
original function is a new function in its own right which can also be
expanded into a Fourier series. The same series is obtainable by term-by-
term differentiation of the original series for f(x).
In Section 6 we have studied the error oscillations of an m times
differentiable function. But f'(x) is only m − 1 times differentiable and
thus we lose the factor n + ½ in the denominator of (6.8). The error os-
cillations of f'(x) have thus increased by the factor n + ½. The same will
happen whenever we differentiate again, until after m − 1 steps we have
a function left which is only once differentiable. Then we know from the
result of Section 2 that f_n^(m−1)(x) will still converge uniformly to f^(m−1)(x)
at every point of the range but this is the limit to which we can go with-
out losing the assurance of convergence. And yet it may happen that f(x)
can be differentiated many more times—perhaps even an arbitrary number
of times—except in certain points in which the derivative ceases to exist.
It is in the global nature of the Fourier series—shared by all orthogonal
expansions—that a single sufficiently strong infinity of the function at any
point of the range suffices to destroy the convergence of the series at all
points. This is the reason that we have to stop with the process of differen-
tiation if even one point exists in which the function becomes discontinuous
and thus the derivative goes too strongly to infinity.
If we could somehow avoid the magnification of the error oscillations by
the factor n in each differentiation, we could obviously greatly increase the
usefulness of the Fourier series. Then it would be possible to counteract
the global features of the Fourier series and transform it into a locally
convergent series. It would not be necessary to demand any differentiability
properties of the function f(x). Let us assume that f(x) is merely integrable.
Then there exists a function F(x) whose derivative is f(x). This F(x) is now
differentiable and thus its Fourier series converges uniformly at all points.
Now we differentiate this series to obtain f(x). If the error oscillations have
not increased in magnitude, the resulting series would give us f(x), possibly
with the exception of some singular points in which f(x) ceases to exist.
And even less would be sufficient. Let us assume that one integration would
not be sufficient to arrive at an absolutely integrable function, but this
would happen after m integrations. Then we could obtain the Fourier
series of this m times integrated function and return to the original f(x) by
m differentiations. The validity of the Fourier series could thus be greatly
extended. We will see in the next section how this can actually be
accomplished.

2.12. The method of the sigma factors


If we write the Fourier series in the complex form (4.4) and study the
remainder of the truncated series, we see that the factor e^{inx} can be taken
in front of the sum. The expression (4.8) obtained for the kernel g_nm(ξ − x)
of the integral (4.7) demonstrates that this kernel contains the rapidly
oscillating factor e^{in(ξ−x)}, multiplied by a second factor which is relatively
slowly changing. If we differentiate, we get the factor n on account of the
derivative of the first factor, and not on account of the derivative of the
second factor. The increase of the error oscillations by the factor n in
each step of differentiation is thus caused by the rapidly oscillating character
of the Gibbs oscillations. Now we can take advantage of the fortunate
circumstance that the functions cos nx and sin nx have the exact period
2π/n. Let us write the remainder of the finite Fourier series f_n(x) in the
complex form

(with the understanding that only the real part of this expression is to be
taken). Instead of the usual differentiation we will now introduce a
"curly 𝒟 process", defined as follows:

    𝒟_n f(x) = [f(x + π/n) − f(x − π/n)] / (2π/n).   (2)

This is in fact a differencing device, with a Δx which is strictly adjusted to
the number of terms with which we operate. While this process introduces
an error, it is an error which goes to zero with increasing n and in the limit
n → ∞ coincides with the ordinary derivative of f(x) if this derivative exists:

    lim_{n→∞} 𝒟_n f(x) = f'(x).   (3)

The operation 𝒟_n applied to the functions cos nx and sin nx has the
following effect:

    𝒟_n cos nx = 0,  𝒟_n sin nx = 0,   (4)

and we see that the functions cos nx and sin nx behave like constants with
respect to the operation 𝒟.
Let us apply this operation to the remainder (1) of the Fourier series.
Neglecting quantities of the order n⁻⁴ we obtain

The Gibbs oscillations have not increased in order of magnitude in conse-
quence of this operation, because we have avoided the differentiation of the
first factor which would have given the magnification factor n.

If we examine what happens to the terms of the Fourier series in conse-
quence of this operation, we find the following:

    𝒟_n cos kx = −σ_k k sin kx,  𝒟_n sin kx = σ_k k cos kx.   (6)

We can express the result in the following more striking form. We intro-
duce the following set of factors, called the "sigma factors":

    σ_k = sin (kπ/n) / (kπ/n),  k = 0, 1, . . ., n.   (7)

We apply the ordinary differentiation process to a modified Fourier series
whose coefficients are multiplied by the sigma factors:

    𝒟_n f_n(x) = d/dx [½a₀ + Σ_{k=1}^{n} σ_k (a_k cos kx + b_k sin kx)].   (8)

Notice that this operation leaves the coefficient ½a₀ unchanged (since
σ₀ = 1) while the last terms with a_n, b_n drop out, because σ_n = 0.
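
A minimal numerical sketch (mine; note that numpy's sinc(z) = sin(πz)/(πz), so the sigma factors are simply sinc(k/n)) shows the damping effect of the weighting on the Gibbs oscillations of the square-wave series:

    import numpy as np

    x = np.linspace(-np.pi, np.pi, 4001)
    n = 16
    k = np.arange(1, n + 1)
    b = np.where(k % 2 == 1, 4/(np.pi*k), 0.0)   # square-wave coefficients
    sigma = np.sinc(k/n)                          # sin(k*pi/n)/(k*pi/n)

    f_n     = (b[:, None] * np.sin(np.outer(k, x))).sum(axis=0)
    f_n_sig = ((sigma*b)[:, None] * np.sin(np.outer(k, x))).sum(axis=0)
    print(f_n.max(), f_n_sig.max())   # ~1.18 overshoot vs ~1.02 after weighting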
Problem 75. Apply the 𝒟_n process to the Dirichlet kernel (8.3), assuming that
n is large.
[Answer:

2.13. Local smoothing by integration


The application of the 𝒟_n process to the coefficients of a Fourier series
can be conceived as the result of two operations. The one is that we
differentiate in the ordinary manner, the other is that we multiply the
coefficients of the Fourier series by the sigma factors. If we know what the
significance of the second process is, we have also found the significance of
the operation 𝒟_n. Let us now consider the following integration process:

    f̄(x) = I_n f(x) = (n/2π) [F(x + π/n) − F(x − π/n)],   (1)

where F(x) is the indefinite integral of f(x). The meaning of this operation
is that we replace the value of f(x) by the arithmetic mean of all the values
in the neighbourhood of f(x), between the limits ±π/n.
The operation f̄(x) may be expressed in the following way:

    f̄(x) = ∫_{−π}^{+π} δ_n(ξ − x) f(ξ) dξ,   (2)

where the "kernel" δ_n(ξ − x) is defined as the "square pulse" of the width
2π/n:

    δ_n(ξ − x) = n/2π  for |ξ − x| ≤ π/n,  and zero otherwise.   (3)
The result of local smoothing is that the analytical regularity of f(x) has
been increased by one degree. If f(x) was discontinuous, f̄(x) becomes
continuous. If f(x) was differentiable n times, f̄(x) is differentiable
n + 1 times. Moreover, f̄(x) approaches f(x) more and more as n increases
to infinity and becomes in the limit equal to f(x) at all points in which f(x)
is continuous.
We shall write (2) operationally in the form

    f̄(x) = I_n f(x).   (4)

A comparison with the equation (8.11) shows that Dirac's "delta function"
can be conceived as the limit of the function δ_n(ξ, x), since the equation
(8.11) can now be written (at all points x in which f(x) is continuous) in the
form:

    f(x) = lim_{n→∞} ∫_{−π}^{+π} δ_n(ξ − x) f(ξ) dξ.   (5)
The effect of the operation I_n on the Fourier coefficients a_k, b_k is that they
become multiplied by the σ factors, according to (12.8); and vice versa: the
operation of multiplying the Fourier coefficients by the σ factors is equivalent
to submitting f(x) to the I_n operation.
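
This equivalence can be verified directly. The following sketch (mine, with hypothetical parameter choices) applies the moving average (1) to a single harmonic cos kx and recovers the factor σ_k:

    import numpy as np

    N, n, k = 8192, 10, 7
    x = 2*np.pi*np.arange(N)/N
    f = np.cos(k*x)

    w = N // (2*n)                # half-width pi/n, measured in grid steps
    f_bar = sum(np.roll(f, s) for s in range(-w, w + 1)) / (2*w + 1)

    print(f_bar.max())            # amplitude of the smoothed harmonic
    print(np.sinc(k/n))           # sigma_7 = 0.3679 for n = 10: they agree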
Problem 76. Show that local smoothing leaves a straight line portion of f(x)
unchanged. What is the effect of local smoothing on the parabola f(x) = x²?
Express the result in terms of the second derivative.
[Answer:

    f̄(x) = x² + π²/3n² = f(x) + (1/6)(π/n)² f″(x).   (6)]
Problem 77. What is the effect of local smoothing on the amplitude, phase
and frequency of the oscillation a cos (ωx + φ)?
[Answer: Amplitude changes to

    a sin (ωπ/n) / (ωπ/n).

Frequency and phase remain unchanged.] Show the validity of (6) for small ω.

Problem 78a. Show that at a point of discontinuity f̄(x) approaches in the limit
the arithmetic mean of the two limiting ordinates.

Problem 78b. Show directly from the definition (12.2) of the 𝒟 process the
validity of the operational equation

    𝒟_n f(x) = D I_n f(x) = I_n D f(x)

(the last equation only if f(x) is differentiable).

2.14. Smoothing of the Gibbs oscillations by the sigma method


We have seen in Section 13 that the operation 𝒟_n could be conceived as
the ordinary D operation on a modified function f̄(x) which was obtained
by local smoothing. Hence we have the operational equation

    𝒟_n = D I_n.   (1)
The excellent qualities of the I_n operator in relation to the Fourier series
are based on the strict coordination of the width 2π/n of the function
δ_n(ξ − x) to the number n of the terms of the truncated Fourier series.
We have discussed in Section 12 the fact that by ordinary differentiation the
Gibbs oscillations of f_n(x) increase by the factor n, while the 𝒟_n process
avoids this increase. Now, since 𝒟_n itself is nothing but the operation
D I_n and the operation D increases the Gibbs oscillations by the factor n,
the operation I_n must have the effect of decreasing the Gibbs oscillations by
the factor n. Hence the multiplication (12.8) of the Fourier coefficients by
the sigma factors has the beneficial effect of reducing the Gibbs oscillations
by a considerable factor. The convergence of the Fourier series can thus
be greatly increased.

Let us assume that the remainder of the truncated series is once more
given in the form (12.1):

Furthermore, let this η_n(x) be the derivative of another function

This yields the relation

Now by definition the application of the sigma factors has the following
effect on the remainder (see 12.5):

On the other hand, the differential equation (4) may be solved without
integration for sufficiently large n asymptotically, by expanding into
reciprocal powers of n:

which in view of (5) yields:

Comparison with the original Gibbs oscillations (2) shows the following
changes: The phase of the oscillations has changed by π/2; the amplitude of
the oscillations has decreased by the factor n, but coupled with a change
of the law of decrease which is no longer γ_n(x) but γ′_n(x).
The modified remainder η̄_n(x) can be conceived as the true Fourier
remainder of a modified function f̄(x) = I_n f(x), obtained by the process of
local smoothing:

While it is frequently of great advantage that we cut down on the amplitudes
of the Gibbs oscillations, we have to sacrifice somewhat on the fidelity of
the representation since it is not f(x) but the slightly modified f̄(x) which the
truncated Fourier series, weighted by the sigma factors, represents.

Problem 79. Apply the sigma method to the function (2.9). Show that the
jump at x = 0 is changed to a steep but finite slope of the magnitude −n/2π.
Show that the Gibbs oscillations (3.10) now decrease with 1/n², instead of 1/n.

Find the position and magnitude of the first two maxima of η̄_n(θ). (The
asymptotic procedure (6) is here not applicable, since we are near to the singular
point at θ = 0; but cf. (3.10).)
[Answer: with nθ = t:
Position of extremum determined by the condition

Expression of η̄_n(t):

Numerical solution:

Minimum between at

2.15. Expansion of the delta function


The formal series (8.4) is void of any direct meaning since it diverges at
every point. It represents the Fourier series of Dirac's delta function.
But let us consider the Fourier series of the function δ_n(ξ − x); cf. (13.3).
Here we obtain the finite sum, weighted by the sigma factors:

    ρ_n(θ) = (1/π) [½ + Σ_{k=1}^{n} σ_k cos kθ],   (1)
which has entirely different properties. If n goes to infinity, ρ_n(θ) approaches
a very definite limit at every point of the interval [−π, +π], with the only
exception of the point θ = 0. At that point ρ_n(θ) goes strongly to infinity.
At all other points, however, ρ_n(θ) converges to zero. Moreover, the area
under the curve is constantly 1:

    ∫_{−π}^{+π} ρ_n(θ) dθ = 1.   (2)

Hence ρ_n(θ), as n grows to infinity, satisfies all the conditions of Dirac's
delta function and can be considered as the trigonometric expansion of the
delta function. We cannot call it the "Fourier series" of the delta function
since the coefficients of the expansion are not universal coefficients but the
universal coefficients weighted by the sigma factors. It is this weighting
which makes the series convergent.
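
The delta-like behaviour of this expansion is easy to exhibit; the sketch below (my own) checks that the area stays at 1, the peak at θ = 0 grows, and the values away from the origin die out as n increases.

    import numpy as np

    theta = np.linspace(-np.pi, np.pi, 20001)
    dtheta = theta[1] - theta[0]
    mid = len(theta)//2                    # index of theta = 0

    def rho(n):
        k = np.arange(1, n + 1)
        s = (np.sinc(k/n)[:, None] * np.cos(np.outer(k, theta))).sum(axis=0)
        return (0.5 + s) / np.pi

    for n in (8, 32, 128):
        r = rho(n)
        print(n, r[:-1].sum()*dtheta,      # area stays ~1
              r[mid],                      # peak grows with n
              np.abs(r[np.abs(theta) > 1]).max())   # tail dies out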

The application of local smoothing to Dirichlet's kernel (8.3) yields the
new kernel

This kernel has the same advantageous properties as Fejér's kernel. The
same reasoning we employed in Section 10 for proving that Fejér's method
ensures the convergence of the Fourier series at all points where f(x) exists,
and for all functions which are absolutely integrable, is once more applicable.
Hence we obtain the result that the application of the sigma factors makes
the Fourier series of any absolutely integrable function convergent at all points
in which f(x) approaches a definite limit, at least in the sense of f(x+) and
f(x−). At points where these two limits are different, the series approaches
the arithmetic mean of the two limiting ordinates.
The operator I_n can be repeated, of course, which means that now the
coefficients a_k, b_k will become multiplied by σ_k². At each step the con-
vergence becomes stronger by the factor n. We must not forget, however,
that the operation of local smoothing distorts the function and we obtain
quicker convergence not to the original but to the modified function. From
the standpoint of going to the limit n → ∞ all these series converge eventually
to f(x). But from the standpoint of the finite series of n terms we have to
compromise between the decrease of the Gibbs oscillations and the modifica-
tion of the given function due to smoothing. It is an advantage to cut down
on the error oscillations, but the price we have to pay is that the basic
function to which these oscillations refer is no longer f(x) but I_n f(x),
respectively I_n^k f(x), if we multiply by the kth power of the sigma factors.
The proper optimum will
depend on the nature of the given problem.
Problem 80. Show that the function δ_n(θ) is obtainable by applying the
operation I_n to the function G₁(θ) (cf. 2.9), taken with a negative sign. Obtain
the Gibbs oscillations of the series (1) and the position and magnitude of the
maximum amplitude.
[Answer:

Maximum at θ = 0:

For θ not too small:

Compare these oscillations with those of the Dirichlet kernel (8.3) and the
Fejér kernel (9.2).

2.16. The triangular pulse


In the case of the δ_n-function it is worth while to go one step further
and consider the doubly smoothed modification of the original delta function.

We now obtain a function which has a triangular shape, instead of the
square pulse of Section 13. It is defined by

    δ̄_n(θ) = (n/2π) (1 − n|θ|/2π)  for |θ| ≤ 2π/n,  and zero otherwise,   (1)

comparable to an infinitely sharp needle, as n goes to infinity, which pinpoints
the special value f(x), if used as an integral operator.
The Fourier series of this new δ̄_n(θ) function becomes:

    δ̄_n(θ) = (1/π) [½ + Σ_{k=1}^{n} σ_k² cos kθ].   (2)
Problem 81. Obtain the doubly smoothed Dirichlet kernel K̄_n(θ) by applying
the operator I_n to (6). Find again the maximum amplitude of the Gibbs
oscillations.
[Answer:

2.17. Extension of the class of expandable functions


The smoothing properties of the sigma factors permit us to differentiate a
Fourier series without increasing the order of magnitude of the Gibbs
oscillations. But this means that a convergent Fourier series can be safely
differentiated without losing its convergence. This process can be repeated
any number of times, if at each step we apply a multiplication by the σ_k.
Hence a differentiation m times will demand the application of the factors
σ_k^m.
As an example let us consider the infinite series of Problem 63 (cf. 2.15):

    Σ_{k=1}^{∞} cos kθ / k = −log (2 sin θ/2).   (1)

This series converges everywhere, except at θ = 0 where the function goes
to infinity. If we differentiate on both sides formally, we obtain the
completely divergent series

    Σ_{k=1}^{∞} sin kθ = ½ cot (θ/2).   (2)
With the sole exception of the points θ = ±π, this series diverges everywhere.
Nor can we expect a Fourier series for the function cot θ/2, which is no
longer integrable since the area under the curve goes logarithmically to
infinity. Hence the Fourier coefficients cannot be evaluated. The applica-
tion of the σ_k factors, however, makes the series convergent:

    ½ cot (θ/2) ≈ Σ_{k=1}^{n} σ_k sin kθ.   (3)
This weighted series converges at all points of θ, except at the point of
singularity θ = 0. The cotangent, being an odd function, is expanded in a
sine series which at θ = 0 gives zero (this can be conceived as the arithmetic
mean of the two limiting ordinates ±∞). In spite of this zero, if we
prescribe an arbitrarily small θ = ε, the series (3) manages to rise to an
exceedingly large value, if the proper number of terms—excessively large
if ε is very small—is summed. Then the increase slows down and eventually
converges in very small steps to the final value, which is the proper value
of cot ε/2.
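
Numerically the weighted series behaves exactly as described; the sketch below (mine) evaluates (3) with n = 100 terms and compares it with ½ cot(θ/2):

    import numpy as np

    n = 100
    k = np.arange(1, n + 1)
    for theta in (0.1, 0.5, 1.0, 2.0):
        weighted = np.sum(np.sinc(k/n) * np.sin(k*theta))
        print(theta, weighted, 0.5/np.tan(theta/2))

Close to θ = 0 the agreement deteriorates, just as the slow climb near the singularity described above leads one to expect; away from the origin the two columns agree well.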
If we differentiate again and apply the σ operation a second time, we
obtain the series

    −¼ / sin² (θ/2) ≈ Σ_{k=1}^{n} σ_k² k cos kθ,   (4)

which is now an even series and which again converges to the proper value
at every point of the range, excluding the origin θ = 0.
We see that the sigma factors provide us with a tool for extending the
class of functions which allow a harmonic analysis to a much wider domain,
including functions which go very strongly to infinity and are far from being
absolutely integrable. These series are weighted Fourier series which for every
given n behave exactly like an ordinary Fourier series, with the only exception
that their coefficients constantly change as we increase n to larger and
larger values, because the weight factors by which a rigid set of coefficients is
multiplied keep constantly changing.

Problem 82. Show that the remainder of the series (3) becomes asymptotically

Demonstrate the formula numerically for n = 10, at the point θ = π/4 (remember-
ing that η̄_n(θ) is not f(θ) − f_n(θ) but f̄(θ) − f_n(θ), cf. (14.8). For a table of the
sigma factors see Appendix).

[Answer: predicted

actual

2.18. Asymptotic relations for the sigma factors


In view of the powerful convergence producing faculties of the σ_k factors
we can expect that they should have many interesting mathematical
properties. These properties are of an "asymptotic" character, i.e., they
hold with increasing accuracy as n increases.
We can make use of the series (15.1) and (16.2) for the δ_n(θ) and δ̄_n(θ)
functions, to derive two asymptotic relations for the σ-factors. In the first
case we derive from (15.5), in the second case from (16.4) the asymptotic
relations

Further relations are obtainable by substituting in these series for θ some
other value. For any fixed value which is not zero, we must get
asymptotically zero, since the delta function converges at all points, excluding
the origin, to zero. Moreover, in such a relation the σ_k can be replaced by
any power of σ_k, since the limit value of the series is not influenced by the
degree of smoothing. For example the value θ = π gives

The values of θ = π/2, π/3, 2π/3 yield the relations

which hold likewise for all powers of the σ_k. Additional asymptotic relations
can be derived from the series (17.3) by substituting for θ the values π/2,
π/3, 2π/3, π/4:

However, these asymptotic relations can be made much more conclusive if
we include the remainder, making use of the previously discussed asymptotic
treatment (see Section 14). For example the formula (15.4) was obtained
by applying the I_n operation to the function

This function originated from the function (see 3.15)

We have replaced cot θ/2 by 2/θ, which is permissible for small θ. A more
accurate treatment would proceed as follows. We put

and make use of the Taylor series of the second factor, on the basis of the
series

Generally the following operations are encountered in the asymptotic
treatment of the Gibbs oscillations. We may know the remainder of a
certain fundamental series which we may put in the complex form (14.2).
Now we may want to integrate this series and investigate the new remainder.
For this purpose we have to integrate the differential equation (14.4):

We do that by the asymptotic expansion*

where D denotes the operation d/dx.
Another basic operation is the smoothing by the sigma factors. This
operation has the following effect on the remainder (neglecting higher than
third powers of the operator D):

In the case of double smoothing—that is, if we multiply the Fourier
coefficients by σ_k²—this operator has to be squared.
As an example let us start with the Gibbs oscillations of the delta function.
We will apply the delta function (multiplied by a proper constant) at the
point x = 0, and the negative delta function at the point x = π, investigating
the following function:

The truncated Fourier series associated with this function becomes

    f_n(x) = cos x + cos 3x + cos 5x + ⋯ + cos (n − 1)x,
with the remainder

where n = 2ν is even. The integral of the function (15) yields the "square
wave" of the constant value π/4 for x > 0 and −π/4 for x < 0, with a point
* A series of this kind converges for sufficiently large n up to a certain point, although
it diverges later on. The more descriptive term "semi-convergent" is unfortunately
not common in English mathematical literature.

of discontinuity at x = 0. We will pay particular attention to the point
x = π/2. Here the truncated Fourier series

    f_n(x) = sin x + (sin 3x)/3 + ⋯ + (sin (n − 1)x)/(n − 1)

yields for π/4 the truncated Leibniz series:

    π/4 ≈ 1 − 1/3 + 1/5 − ⋯ + (−1)^{ν−1}/(2ν − 1).

Now in the neighbourhood of x = π/2 we will put

which yields for the remainder η_n(x), if written in complex form:

Now the Taylor expansion of the last factor yields

Applying the operator (11) to this expansion we notice first of all that
only the even powers of D have to be considered (since we focus our attention
on the point θ = 0). The result is that the new remainder becomes

This yields the following correction of the slowly convergent Leibniz series:

The effectiveness of this correction becomes evident if we employ it for the
first five terms of the Leibniz series, i.e., ν = 5, n = 10:

The new error is only 3.6 units in the sixth decimal place.
We now come to the application of the sigma factors. This means that
the operations (11) and (12) have to be combined, with the following result:

We see that here the sigma smoothing reduced the Gibbs oscillations
quadratically, instead of linearly, in n. The reason is that before smoothing
the point x = π/2 was a point of maximum amplitude. The shift by 90°
changes this maximum to a nodal point, with the result that the term with
n⁻² drops out and the error becomes of third order in 1/n. We thus obtain,
up to quantities of the order n⁻⁵:

A second smoothing causes a second phase shift by 90° and the maximum
amplitude is once more restored. The reduction by the factor n² will
cause an error of the order n⁻³, as we had it in the case of simple smoothing
(since here we do not profit by the privileged position of the point x = π/2).
The result of the operation is

and thus

Compared with simple smoothing we have not gained more than the factor
2. (As a numerical check, let us apply our formulas to the sigma-weighted
Leibniz series, for ν = 5, n = 10. The formula gives (π/4) − 0.0009822 =
0.7844160, against the actual value of 0.7844133, while the calculated value
of the doubly smoothed series yields (π/4) − 0.0004322 = 0.7849660, against
the actual value of 0.7849681.)
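
The quoted "actual" figure of the singly smoothed series is easy to reproduce (sketch mine; only the σ-weighted value is computed here, the predicted corrections being those of the text):

    import numpy as np

    n = 10
    k = np.arange(1, n, 2)                   # odd k: 1, 3, 5, 7, 9 (nu = 5)
    signs = (-1.0)**((k - 1)//2)
    plain    = np.sum(signs / k)             # truncated Leibniz series
    weighted = np.sum(np.sinc(k/n) * signs / k)

    print(np.pi/4)     # 0.7853982
    print(plain)       # 0.8349206
    print(weighted)    # 0.7844133, the "actual value" quoted above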
While in this example we started from a point of maximum amplitude
and thus the sigma smoothing gained two powers of n (due to the shift to
a nodal point), it may equally happen that we start from a nodal point,
in which case the sigma smoothing will not decrease but possibly even
increase the local error at that particular point. An example of this kind
is encountered in the formulae (31, 32) of Problem 84.
Problem 83. Show the following exact relations to be valid for the σ-factors:

Problem 84. Obtain the following improved asymptotic expressions (which
hold with an error of the order n^{−μ−2} if n^{−μ} is the last power included), and
check them numerically for n = 10 (making use of the Table of the sigma
factors of the Appendix).

Numerical check:

Problem 85. Explain why the third of the asymptotic relations (3) will hold
with increasing accuracy as m increases from 1 to n, but ceases to hold if m
becomes larger than n. Demonstrate the situation numerically for n = 6.
[Answer:

2.19. The method of trigonometric interpolation


The exceptional flexibility of the Fourier series in representing a very
extensive class of functions makes harmonic analysis one of the most
successful tools of applied mathematics. There is, however, one drawback
from the standpoint of practical application: the formulae (2.2) demand
the evaluation of a definite integral for every a_k and b_k. If f(x) is not a

series/n_i(z). But this precaution involves the inconvenience that we have


to separate the even and the odd parts of the function and prescribe their
values in two different sets of points. We do not lose essentially if we
abandon this device and choose the same points of interpolation for the
entire series (usually with the choice β = 0).* The coefficients (5) of the
finite trigonometric series (4) are, of course, not identical with the coefficients
a_k, b_k of the truncated Fourier series (cf. 2.2), obtained by integrations.
But the error oscillations—although strictly equidistant now, approximately
equidistant before—are remarkably analogous in both cases and the ampli-
tudes of these oscillations have not increased by the fact that we replace
the original truncated Fourier series by the new series (4), obtained by
interpolation. The greatly simplified numerical scheme is of inestimable
value in the case of complicated functions or empirically observed data.
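A short Python sketch of the scheme (ours; the book's formulas (19.4)–(19.5) are not reproduced legibly in this copy, so the standard equidistant form with 2n sample points and β = 0 is assumed):

```python
import numpy as np

def trig_interp(f, n):
    """Interpolate f at the 2n equidistant points x_a = a*pi/n, a = 0..2n-1."""
    x = np.arange(2*n) * np.pi / n
    y = f(x)
    k = np.arange(n + 1)
    A = (y @ np.cos(np.outer(x, k))) / n    # cosine sums
    B = (y @ np.sin(np.outer(x, k))) / n    # sine sums; B[n] = 0 automatically
    A[0] /= 2; A[n] /= 2                    # endpoint terms carry weight 1/2
    def series(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return np.cos(np.outer(t, k)) @ A + np.sin(np.outer(t, k)) @ B
    return series

s = trig_interp(lambda x: np.exp(np.sin(x)), 8)
print(s(0.3), np.exp(np.sin(0.3)))          # the two values agree closely
```

Only plain sums over the ordinates are needed—no integrations—which is the simplification praised above; note also that the last sine coefficient vanishes identically, in the spirit of Problem 86 below.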
Problem 86. Show that the last two coefficients a_n, b_n, of the scheme (5) are not
independent of each other but related by the condition

Problem 87. Consider the function

between 0 and π, once defined as an odd, and once as an even function. (In the
first case the function is discontinuous at x = π where we define it as f(π) = 0.)
Expand this function by interpolation in a sine and cosine series for n = 9
(β = 0) and compare the resulting Gibbs oscillations with the Gibbs oscillations
of the truncated Fourier series with n = 8.

2.20. Error bounds for the trigonometric interpolation method


Once more it will be our aim to obtain efficient error estimates for the
difference

between the true function and the approximation obtained by trigonometric


interpolation. Once more the "method of the Green's function", discussed
in Section 7, becomes applicable. We have seen that in the case of an
m-times differentiable function we could obtain f(x) in terms of an auxiliary
"kernel function" G(x, ξ) (cf. 7.3):

Now let us assume that we have examined the special function G_m(x, ξ),
considered as a function of x, and determined the remainder η_n(x) for this
special function

* For the numerical aspects of trigonometric interpolation, cf. A. A., Chapter 4,
Sections 11-15.

We have interpolated in the points x_α and thus η_n(x) will be automatically


zero at all points x = x_α. But this fact will not be changed by integrating
over ξ. Hence in the end we again have nodal points at the points x = x_α
and this means that the function f_n(x) thus determined is exactly the
trigonometric series obtained by interpolation, whose deviation from f(x)
is thus obtainable with the help of the definite integral

Then again we can make use of "Cauchy's inequality" (4.13) and obtain

Now the second factor is once more the square of the "norm" of f^{(m)}(x).
In the first factor we encounter conditions which are very similar to those
encountered before, when dealing with the Fourier series (cf. 4.14), and in
fact the result of the analysis is that the error bound (4.16), found before for
the Fourier series, remains valid for the case of trigonometric interpolation.
This result proves once more that the method of trigonometric interpolation
is not inferior to the Fourier series of a comparable number of terms. The
actual coefficients of the two series may differ considerably but the closeness
of approximation is nearly the same in both cases.
We can proceed still differently in our problem of comparing the remainder
of the trigonometric interpolation with the remainder of the corresponding
Fourier series. We will write f(x) in the form

where η_{n-1}(x) is the remainder of the truncated series f_{n-1}(x). Let us now
apply the method of trigonometric interpolation to f(x). This can be done
by interpolating f_{n-1}(x) and η_{n-1}(x) and forming the sum. But the inter-
polation of f_{n-1}(x) must coincide with f_{n-1}(x) itself, as we can see from the
fact that f_{n-1}(x) is already a finite trigonometric series of the form (19.4),
and the uniqueness of the coefficients (19.5) of trigonometric interpolation
demonstrates that only one such series can exist. Hence it suffices to
interpolate the remainder η_{n-1}(x). Since this remainder is small relative
to f_{n-1}(x), we would be inclined to believe that this in itself is enough
to demonstrate that the series obtained by trigonometric interpolation
cannot differ from either f_{n-1}(x) or f(x) by more than a negligibly small
amount.
That this argument is deceptive, is shown by the example of equidistant
polynomial interpolation, considered earlier in Chapter 1. If the zeros of
interpolation are chosen in a definite non-equidistant fashion—namely as the

zeros of the Chebyshev polynomials T_n(x)*—we shall obtain a polynomial


p_n(x) which approximates the given continuous function f(x) (of bounded
variation) to any degree of accuracy. Hence we can put

where η_n(x) can be made as small as we wish, by going with n to infinity.


Now, if we apply equidistant polynomial interpolation, in n + 1 points
to f(x), we can again divide our task by applying the procedure to p_n(x) and
to η_n(x) and then forming the sum. By the same argument as before, the
interpolation of p_n(x) once more reproduces p_n(x). If the interpolation of
the uniformly small η_n(x) were to remain small, we would not get into
difficulties. But the peculiar paradox holds that, although η_n(x) converges
to zero with increasing n at all points of the range, its equidistant inter-
polation boosts up the amplitudes of the oscillations to such an extent that
they go to infinity on the periphery of the range.
The paradox comes about by the strong disharmony which exists between
the well-distributed zeros and the equidistant zeros. In the case of the
trigonometric functions no such disharmony exists because the natural
Gibbs oscillations of the truncated Fourier series are automatically nearly
equidistant and imitate the behaviour of the imposed equidistancy of the
error oscillations of the interpolated series. The natural and the imposed
error oscillations are so nearly the same that the interpolation of η_{n-1}(x) of
the equation (6) can have no strong influence on the error oscillations of
f_{n-1}(x). If the remainder η_{n-1}(x), caused by the truncation of the Fourier
series, is small, it remains small even in its interpolated form.

2.21. Relation between equidistant trigonometric and polynomial interpolations

We have discussed earlier the peculiar fact that a polynomial interpolation
of high order was an eminently unsuitable tool for the representation of
equidistant data because the error oscillations had the tendency to go com-
pletely out of bound around the two ends of the region, even in the case of
analytical functions, while in the case of non-analytical functions the
method fails completely. In strange contrast to this phenomenon we find
that the trigonometric kind of interpolation gives error oscillations which
are practically uniform throughout the range and which are small even in
cases when the function can be differentiated to a very limited degree only.
The great superiority of trigonometric versus polynomial interpolation is
thus demonstrated.
The more surprising then is the fact that these two important types of
interpolation are in fact closely related to each other. The Lagrangian type
of equidistant interpolation problem can in fact be reformulated as a
trigonometric type of interpolation problem.
We return to Lagrangian interpolation and construct the fundamental
* Cf. A. A., p. 245.

polynomial F(x). Our data shall be given at the 2n + 1 points x_k = 0,


±1, ±2, . . ., ±n. Then

Now we will make use of a fundamental theorem in the theory of the gamma
function [compare the two expansions (1.13.2) (putting μ = -x) and
(1.18.9)]:

This relation, applied to (1), yields

We introduce the auxiliary function

and write F(x) in the following form—remembering that a constant factor


in F(x) is immaterial:

Now we come to the construction of Lagrange's interpolation formula:

The function Q_n(x) does not vanish in the given interval. It is the last
factor of F(x) which vanishes at the points x = ±k. Hence

and we can rewrite the interpolation formula (6) as an interpolation for a


new function φ(x), defined by

We obtain

where φ*(x) denotes the interpolated value of φ(x).



We see that Lagrangian polynomial interpolation of f(x), applied to equi-


distant data, is equivalent to a trigonometric interpolation of the transformed
function φ(x), defined by (8).
We have encountered a similar interpolation formula earlier in Section 1.20
(cf. 1.20.4), when dealing with the Fourier transform. There the limits of
summation were infinite, while now they extend only between -n and +n.
But this only means that we define all φ(k) for |k| > n as zero. The formula
(9) is closely related to the formula (19.3) of trigonometric interpolation. Let
us choose β = 0. Moreover, let us introduce a new variable t, defined by

Then

Now, if n increases to infinity, we obtain

and the U(x) of (19.3) becomes

which agrees with (9), except for the limits of summation which do not go
beyond ± n, in view of the fact that all the later f(k) vanish.
The relation here established between equidistant polynomial and equi-
distant trigonometric interpolation permits us to make use of the theory of
trigonometric interpolation for the discussion of the error oscillations of the
polynomial interpolations of high order. Moreover, the interpolation
formula (9) is in fact even numerically much simpler than the original
Lagrangian formula and may be preferable to it in some cases.
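Since the displayed formulas (4), (8) and (9) are lost in this copy, the following Python sketch works with an assumed reconstruction consistent with the surrounding text: Q_n(x) = Γ(n+1+x)Γ(n+1-x)/(n!)^2, φ(x) = f(x)/Q_n(x), and f*(x) = Q_n(x) Σ_{k=-n}^{n} φ(k) sin π(x-k)/(π(x-k)). On this assumption the formula reproduces the classical Lagrangian interpolant exactly, which the sketch verifies:

```python
import math
import numpy as np

def Q(x, n):
    # assumed normalisation of Q_n(x); Q_n(0) = 1
    return math.gamma(n + 1 + x) * math.gamma(n + 1 - x) / math.gamma(n + 1)**2

def f_star(x, f, n):
    # assumed form of (9); np.sinc(u) = sin(pi*u)/(pi*u)
    ks = range(-n, n + 1)
    return Q(x, n) * sum(f(k) / Q(k, n) * np.sinc(x - k) for k in ks)

def lagrange(x, f, n):
    # classical Lagrangian interpolation at the nodes -n, ..., n
    ks = list(range(-n, n + 1))
    return sum(f(k) * np.prod([(x - j) / (k - j) for j in ks if j != k])
               for k in ks)

f = lambda t: 1.0 / (1.0 + 0.1 * t * t)
print(f_star(2.5, f, 5), lagrange(2.5, f, 5))   # identical values
```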
We can now find a new interpretation for the very large error oscillations
of high order polynomial approximations. As far as the transformed
function φ(x) goes, the Gibbs oscillations remain throughout the range of
practically constant amplitude. However, when we return to the original
function f(x), we have to multiply by the function Q_n(x) defined by (4).
This multiplies also the remainder η_n(x). Now Q_n(x) can be closely
estimated by Stirling's formula:

which shows that Q_n(x) is very nearly the nth power of a universal function
of x/n:

where

and

The general trend of Q_n(x) is nearly e^{x^2/n} which shows the very strong increase
of Q_n(x) with increasing x, until the maximum 4^n is reached at x = n. It
is this exponential magnification of the fairly uniform Gibbs oscillations
which renders high order polynomial interpolation so inefficient if we leave
the central range of interpolation.
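A brief numerical sketch (ours, under the same assumed form of Q_n(x) as above) of this growth:

```python
import math

n = 10
Q = lambda x: math.gamma(n + 1 + x) * math.gamma(n + 1 - x) / math.gamma(n + 1)**2
print(Q(n), 4**n / math.sqrt(math.pi * n))   # 184756.0 vs 187078.9...: order 4^n at x = n
for x in (1, 2, 3, 4):
    print(Q(x), math.exp(x * x / n))         # follows the central trend exp(x^2/n)
```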
The transformed series (9) can be of great help if our aim is to obtain the
limit of an infinite Stirling series. In Chapter 1.9 we have encountered an
interpolation problem in which the successive terms seemed to converge as
more and more data were taken into account, but it seemed questionable
that the limit thus obtained would coincide with the desired functional
value. In the original form of the Stirling series it is by no means easy to
see what happens as more and more terms of the series are taken into account.
We fare much better by transforming the series into the form (9) and then
making the transition to the limit n → ∞. It is true that this procedure
demands a transition from the function f(x) to the new function φ(x). But
this transformation becomes particularly simple if n is very large and
converges to infinity. The relation between f(x) and φ(x), as given by (8),
requires that we should divide f(x) by Q_n(x) which for any finite x and very
large n becomes

We see that for any finite point x the functions f(x) and φ(x) coincide in the
limit, as n grows to infinity. This does not absolve us from the obligation
to investigate the possible contribution of the points in infinity. But if the
nature of the function f(x) is such that we know in advance that the
contribution of the points in infinity converges to zero, then it suffices to
find the limit of the infinite sum

In the problem of Chapter 1.9 the given equidistant values had the form
(cf. 1.7.3)

(with C = 100, a = 2). Substitution in (20) yields terms of the following


character:

Let us assume that we are able to sum the series

Then we will have obtained f*(x) in the form

Now the function f(x) = 1 certainly allows the Stirling kind of interpolation


in an infinite domain and thus, in view of (20), we must have

which gives p(x) in the form

Substitution in (24) yields

This shows that the interpolated value f*(x) does not coincide with f(x) at
any point, except at the integer points x = k which provided the key-values
of the interpolation procedure. The infinite Stirling expansion of our
problem does approach a limit, but it is not the desired limit. In our specific
problem we have a = 2, and we have interpolated at the point x = 1/2.
Since

we obtain

The difference is small and yet significant. It would easily escape our
attention if we were to trust the numerical procedure blindly, without
backing it up by the power of a thorough analytical study.
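The identity invoked for f(x) = 1 above—that the sinc sums taken over all integers add up to unity—can be checked directly; a minimal Python sketch (ours):

```python
import numpy as np

x = 0.5
for N in (10, 100, 1000, 10000):
    k = np.arange(-N, N + 1)
    print(N, np.sum(np.sinc(x - k)))   # partial sums tend to 1
```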

Problem 88. Given the even function

in the range [-5, +5]. Apply 11-point Lagrangian interpolation on the basis


of the formula (9). Evaluate φ*(x) at the half-integer points and obtain the
error η_5(x) at these points. Then return to the original f(x) by multiplying by
Q_5(x) and demonstrate the strong increase of the error oscillations.

2.22. The Fourier series in curve fitting


In the frequent problem of curve fitting of equidistant data we can make
excellent use of the uniform error oscillations of the Fourier series, in
contrast to the exponential increase of the error oscillations which occurs if
we try to interpolate by powers. However, we have to overcome the
difficulty that the given
function in most cases does not satisfy any definite boundary conditions.
Furthermore, it is frequently difficult to measure the derivative of the
function at the endpoints of the range and we have to rely on the given
equidistant ordinates, without further information.
In such cases we can prepare our problem for the application of the Fourier
series by the following artifice. We normalise the range of our data to
[0,1], by a proper choice of the independent variable. We then replace
f(x) by

The new g(x) satisfies the boundary conditions

Then we reflect g(x) as an odd function:

and consider the range [-1, +1]. On the boundaries we find that the
conditions

are automatically satisfied and thus at least function and first derivative
can be conceived as continuous. The first break will occur in the second
derivative.
Under these circumstances we will use the sine series

for the representation of our data (adding in the end the linear correction to
come back to f(x)). The coefficients b_k are evaluated according to the
formula (19.5) (replacing, however, sin kx_α by sin πkx_α). The expression
(6.8) shows that the amplitude of the error oscillations will be of the order
of magnitude n^{-3}, except near x = 0 and x = π where a larger error of the

order n^{-2} can be expected. Hence a Fourier series of 10 to 12 terms will


generally give satisfactory accuracy, comparable to the accuracy of the data.*
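A minimal Python sketch of the device (ours; the coefficient formula is taken as the plain discrete sine sums in the spirit of (19.5), which is not legible in this copy):

```python
import numpy as np

def fit_sine_series(y):
    """y given at x = 0, 1/N, ..., 1; returns the fitted function."""
    N = len(y) - 1
    x = np.arange(N + 1) / N
    g = y - (y[0] + x * (y[-1] - y[0]))          # g(0) = g(1) = 0
    k = np.arange(1, N)
    S = np.sin(np.pi * np.outer(x[1:N], k))      # sin(pi*k*alpha/N)
    b = (2.0 / N) * (g[1:N] @ S)                 # discrete sine coefficients
    def fitted(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return np.sin(np.pi * np.outer(t, k)) @ b + y[0] + t * (y[-1] - y[0])
    return fitted

F = fit_sine_series(np.exp(np.arange(13) / 12.0))   # 13 ordinates of e^x
print(F(0.375), np.exp(0.375))                      # close agreement mid-range
```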

Problem 89. A set of 13 equidistant data are given at x = 0, 1/12, . . . , 1. They


happen to lie on the curve

Fit these data according to the method of Section 22. Study the Gibbs
oscillations of the interpolation obtained.
Problem 90. Let f(x) be given at n equidistant points between 0 and 1 and let
also f'(0) and f'(1) be known. What method of curve fitting could we use
under these circumstances?
[Answer: define

and use the cosine series.


Demonstrate that the error oscillations are now of the order n^{-4}, respectively
near to the ends of the range of the order n^{-3}.]

* See also A. A., Chapter 5.11, 12.
CHAPTER 3

MATRIX CALCULUS

Synopsis. The customary theory of matrices is restricted to n x n


square matrices. But the general theory of linear operators can only be
based on the general case of n x m matrices in which the number of
equations and number of unknowns do not harmonise, giving rise to
either over-determined (too many equations) or under-determined (too
few equations) systems. We learn about the two spaces (of n and m
dimensions) to which our n x m matrix belongs and the p-dimensional
"eigenspace", in which the matrix is activated. We succeed in extend-
ing the customary "principal axis transformation" of symmetric
matrices to arbitrary n x m matrices and arrive at a fundamental
"decomposition theorem" of arbitrary matrices which elucidates the
behaviour of arbitrarily over-determined or under-determined systems.
Every matrix has a unique inverse because every matrix is complete
within its own space of activation. Consequently every linear system
has in proper interpretation a unique solution. For the existence of
this solution it is necessary and sufficient that both right and left
vectors shall lie completely within the activated p-dimensional sub-
spaces (one for the right and one for the left vector), which are uniquely
associated with the given matrix.

3.1. Introduction
It was around the middle of the last century that Cayley introduced the
matrix as an algebraic operator. This concept has become so universal in
the meantime that we often forget its great philosophical significance.
What Cayley did here parallels the algebraisation of arithmetic processes by
the Hindus. While in arithmetic we are interested in getting the answer to
a given arithmetic operation, in algebra we are no longer interested in the
individual problem and its solution but start to investigate the properties
of these operations and their effect on the given numbers. In a similar
way, before Cayley's revolutionary innovation one was merely interested in
the actual numerical solution of a given set of algebraic equations, without
paying much attention to the general algebraic properties of the solution.
Now came Cayley who said: "Let us write down the scheme of coefficients
which appear in a set of linear equations and consider this scheme as one
unity":

To call this scheme by the letter A was much more than a matter of notation.
It had the significance that we are no longer interested in the numerical
values of the coefficients a_{11} . . . a_{nn}. In fact, these numerical values are
without any significance in themselves. Their significance becomes estab-
lished only in the moment when this scheme operates on something. The
matrix A was thus divested of its arithmetic significance and became an
algebraic operator, similar to a complex number a + ib, although character-
ised by a much larger number of components. A large set of linear equations
could be written down in the simple form

where y and 6 are no longer simple numbers but a set of numbers, called a
"vector". That one could operate with sets of numbers in a similar way
as with single numbers was the great discovery of Cayley's algebraisation of
a matrix and the subsequent development of "matrix calculus".
This development had great repercussions for the field of differential
equations. The problems of mathematical physics, and later the constantly
expanding industrial research demanded the solution of certain linear differ-
ential equations, with given boundary conditions. One could concentrate
on these particular equations and develop methods which led to their
solution, either in closed form, or in the form of some infinite expansions.
But with the advent of the big electronic computers the task of finding the
numerical solution of a given boundary value problem is taken over by
the machine. We can thus turn to the wider problem of investigating the
general analytical properties of the differential operator itself, instead of
trying to find the answer to a given individual problem. If we understand
these properties, then we can hope that we may develop methods for the
given individual case which will finally lead to the desired numerical answer.
In this search for "properties" the methods of matrix calculus can serve
as our guiding light. A linear differential equation does not differ funda-
mentally from a set of ordinary algebraic equations. The masters of 18th
century analysis, Euler and Lagrange, again and again drew exceedingly
valuable inspiration from the fact that a differential quotient is not more
than a difference coefficient whose Δx can be made as small as we wish.
This means that a linear differential equation can be approximated to any
degree of accuracy by a set of ordinary linear algebraic equations. But
these equations fall in the domain of matrix calculus. The "matrix" of
these equations is determined by the differential operator itself. And thus
the study of linear differential operators and the study of matrices as
algebraic operators is in the most intimate relation to one another. The
present chapter deals with those aspects of matrix calculus which are of
particular importance for the study of linear differential operators. One of

the basic things we have to remember in this connection is that the trans-
formation of a differential equation into an algebraic set of equations demands
a limit process in which the number of equations goes to infinity. Hence we
can use only those features of matrix calculus which retain their significance
if the order of the matrix increases to infinity.
3.2. Rectangular matrices
The scheme (1.1) pictures the matrix of a linear set of equations in which
the number of equations is n and the number of unknowns likewise n.
Hence we have here an "n x n matrix". From the standpoint of solving
a set of equations it seems natural enough to demand that we shall have
just as many equations as unknowns. If the number of equations is smaller
than the number of unknowns, our data are not sufficient for a unique
characterisation of the solution. On the other hand, if the number of
equations is larger than the number of unknowns, we do not have enough
quantities to satisfy all the given data and our equations are generally not
solvable. For this reason we consider in the matrix calculus of linear
algebraic systems almost exclusively only square matrices. However, for
the general study of differential operators this restriction is a severe handicap.
A differential operator such as y'' for example requires the addition of two
"boundary conditions" in order to make the associated differential equation
well determined. But we may want to study the differential operator y''
itself, without any additional conditions. In this case we have to deal with
a system of equations in which the number of unknowns exceeds the number
of equations by 2. In the realm of partial differential operators the dis-
crepancy is even more pronounced. We might have to deal with the
operation "divergence" which associates a scalar field with a given vector
field:

The associated set of linear equations is strongly "under-determined", i.e.


the number of equations is much smaller than the number of unknowns.
On the other hand, consider another operator called the "gradient", which
associates a vector field with a given scalar field:

The associated set of linear equations is here strongly "over-determined",


i.e. the number of equations is much larger than the number of unknowns.
Under these circumstances we have to break down our preference for
n x n matrices and extend our consideration to "n x m" matrices of n
rows and m columns. The number n of equations and the number m of
unknowns are here no longer matched but generally
n < m (under-determined)
n = m (even-determined)
n > m (over-determined)

and accordingly we speak of under-determined, even-determined and over-


determined linear systems. Our general studies will thus be devoted to the
case of rectangular matrices of n rows and m columns (called briefly an
"n x m matrix") in which the two numbers n and m are left entirely free.
The operation (1.2) can now be pictured in the following manner, if we
choose for the sake of illustration the case n < m:

It illustrates the following general situation: "The n x m matrix A operates


on a vector of an m-dimensional space and transforms it into a vector of an
n-dimensional space." The vector y on the left side of the equation and
the vector b on the right side of the equation belong generally to two different
spaces. This feature of a general matrix operation is disguised by the
special n x n case in which case both y and b belong to the same
n-dimensional space.
In the following section we summarise for the sake of completeness the
fundamental operational rules with matrices which we assume to be known.*

3.3. The basic rules of matrix calculus


Originally a matrix was conceived as a two-dimensional scheme of n x n
numbers, in contrast to a "vector" which is characterised by a single row
of numbers. In fact, however, the operations of matrix calculus become
greatly simplified if we consider a matrix as a general scheme of n x m
numbers. Accordingly a vector itself is not more than a one row n-column
matrix (called a "row vector"). Since we have the operation "trans-
position" at our disposal (denoted by the "tilde" ~), it will be our policy to
consider a vector basically as a column vector and write a row vector x in
the form x̃. (Transposition means: exchange of rows and columns and, if
the elements of the matrix are complex numbers, simultaneous change of
every i to — i.)
Basic rule of multiplying matrices: Two general matrices A and B can
only be multiplied if A is an n x m and B an m x r matrix; the product
AB is an n x r matrix. Symbolically:

* Cf. A. A., Chapter 2.


104 MATRIX CALCULUS CHAP. 3

Matrix multiplication is generally not commutative.

but always associative:

Fundamental transposition rule:

A row vector times a column vector (of the same number of elements) gives
a scalar (a 1 x 1 matrix), called the "scalar product" of the two vectors:

(The asterisk means: "conjugate complex".) The transpose of a scalar


coincides with itself, except for a change of i to — i. This leads to the
following fundamental identity, called the "bilinear identity"

where x is n x 1, A is n x m, and y is m x 1.
Two fundamental matrices of special significance: the "zero matrix"
whose elements are all zero, and the "unit matrix", defined by

whose diagonal elements are all 1, all other elements zero:

A symmetric—or in the complex case Hermitian—matrix is characterised


by the property

An orthogonal matrix U is defined by the property

If U is not an n x n but an n x p matrix (p < n), we will call it "semi-


orthogonal" if

but in that case

A triangular matrix is defined by the property that all its elements above
the main diagonal are zero.
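A small computational sketch (ours) of the shape rule, the transposition rule, and semi-orthogonality, using a real 3 x 2 example:

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.], [5., 6.]])   # n x m = 3 x 2
B = np.array([[1., 0., 2.], [0., 1., 1.]])     # m x r = 2 x 3
print((A @ B).shape)                            # (3, 3): an n x r matrix
print(np.allclose((A @ B).T, B.T @ A.T))        # transposition rule: True
U = np.linalg.qr(A)[0]                          # 3 x 2, orthonormal columns
print(np.allclose(U.T @ U, np.eye(2)))          # semi-orthogonal: U~U = I, True
print(np.allclose(U @ U.T, np.eye(3)))          # but UU~ is not I: False
```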

If A is an arbitrary n x n square matrix, we call the equation

the "eigenvalue problem" associated with A. The scalars λ_1, λ_2, . . . , λ_n for
which the equation is solvable, are called the "eigenvalues" (or "character-
istic values") of A while the vectors x_1, x_2, . . . , x_n are called the
"eigenvectors" (or "principal axes") of A. The eigenvalues λ_i satisfy the
characteristic equation

This algebraic equation of nth order has always n generally complex roots.
If they are all distinct, the eigenvalue problem (13) yields n distinct eigen-
vectors, whose length can be normalised to 1 by the condition

If some of the eigenvalues coincide, the equation (13) may or may not have
n linearly independent solutions. If the number of independent solutions is
less than n, the matrix is "defective" in certain eigenvectors and is thus
called a "defective matrix".
Any square matrix satisfies its own characteristic equation (the
"Hamilton-Cayley identity"):

Moreover, this is the identity of lowest order satisfied by A, if the λ_i are all
distinct. If, however, only p of the eigenvalues are distinct, the identity
of lowest order in the case of a non-defective matrix becomes

Defective matrices, however, demand that some of the root factors shall
appear in higher than first power. The difference between the lowest order
at which the identity appears and p gives the number of eigenvectors in
which the matrix is defective.
Problem 91. Let an n x n matrix M have the property that it commutes with
any n x n matrix. Show that M must be of the form M = αI.
Problem 92. Show that if A is an eigenvalue of the problem (13), it is also an
eigenvalue of the "adjoint" problem

Problem 93. Let the eigenvalues λ_1, λ_2, . . ., λ_n of A be all distinct. Show that
the matrix

has a zero eigenvalue which is μ-fold.



Problem 94. Show that the eigenvalues of A^m are the mth power of the original
eigenvalues, while the eigenvectors remain unchanged.
Problem 95. Show that the (complex) eigenvalues of an orthogonal matrix (10)
must lie on the unit circle |z| = 1.
Problem 96. Show that the following properties of a square matrix A remain
unchanged by squaring, cubing, . . . , of the matrix:
a) symmetry
b) orthogonality
c) triangular quality.
Problem 97. Show that if two non-defective matrices A and B coincide in
eigenvalues and eigenvectors, they coincide altogether: A - B = 0.
Problem 98. Show that two defective matrices A and B which have the same
eigenvalues and eigenvectors, need not coincide. (Hint: operate with two
triangular matrices whose diagonal elements are all equal.)
Problem 99. Show that A^m = 0 does not imply A = 0. Show that if A is an
n x n matrix which does not vanish identically, it can happen that A^2 = 0,
or A^3 = 0, . . . , or A^n = 0 without any of the lower powers being zero.
Problem 100. Investigate the eigenvalue problem of a triangular matrix whose
diagonal elements are all equal. Show that by a small modification of the
diagonal elements all the eigenvalues can be made distinct, and that the eigen-
vectors thus created are very near to each other in magnitude and direction,
collapsing into one as the perturbation goes to zero.
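A short sketch (ours) of the phenomenon described in Problem 100:

```python
import numpy as np

A = np.array([[2., 1.], [0., 2.]])      # defective: one eigenvector only
for eps in (1e-2, 1e-4, 1e-6):
    w, V = np.linalg.eig(A + np.diag([0.0, eps]))
    cosine = abs(V[:, 0] @ V[:, 1])     # cosine of the angle between the two
    print(w, cosine)                    # eigenvalues split; cosine tends to 1
```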

3.4. Principal axis transformation of a symmetric matrix


The principal axis transformation of a symmetric matrix is one of the
most fundamental tools of mathematical analysis which became of central
importance in the study of linear differential and integral operators. We
summarise in this section the formalism of this theory, for the sake of later
applications.
While a matrix is primarily an algebraic tool, we gain greatly if the purely
algebraic operations are complemented by a geometrical picture. In this
picture we consider the scalar equation

where S is a symmetric (more generally Hermitian) n x n matrix:

and the components of the vector

are conceived as the rectangular coordinates of a point x in an n-dimensional


Euclidean space with the distance expression

In particular the distance from the origin is given by

The equation (1) can be conceived as the equation of a second order surface
in an n-dimensional space. The eigenvalue problem

characterises those directions in space in which the radius vector and the
normal to the surface become parallel. Moreover in consequence of (1) we
obtain

which means that these eigenvalues λ_i can be interpreted as the reciprocal


square of the distance of those points of the surface in which radius vector
and normal are parallel. Hence the λ_i are interpreted in terms of inherent
properties of the second order surface which proves that they are independent
of any special reference system and thus invariants of an arbitrary orthogonal
transformation, a transformation being "orthogonal" if it leaves the distance
expression (5) invariant.
The freedom of choosing our coordinate system is of greatest importance
in both physics and mathematics. Many mathematical problems are solved,
or at least greatly reduced in complexity, by formulating them in a properly
chosen system of coordinates. The transformation law

for an n x n matrix where U is an arbitrary orthogonal matrix, defined


by (3.10), has the consequence that an arbitrary relation between n x n
matrices which involves any combination of the operations addition,
multiplication and transposition, remains valid in all frames of reference.
The n solutions of the eigenvalue problem (3.13) can be combined into
the single matrix equation

where the eigenvectors defined by (6) are now arranged as successive


columns of the matrix U, while the diagonal matrix A is composed of the
eigenvalues λ_1, λ_2, . . . , λ_n:

While in the case of a general matrix A the eigenvalues λ_i are generally
complex numbers and we cannot guarantee even the existence of n eigen-
vectors—they may all collapse into one vector—here we can make much
more definite predictions. The eigenvalues λ_i are always real and the
eigenvectors are always present to the full number n. Moreover, they are
in the case of distinct eigenvalues automatically orthogonal to each other,
while in the case of multiple roots they can be orthogonalised—with an
arbitrary rotation remaining free in a definite μ-dimensional subspace if μ
is the multiplicity of the eigenvalue λ_i. Furthermore, the length of the
eigenvectors can be normalised to 1, in which case U becomes an orthogonal
matrix:

But then we can introduce a new reference system in which the eigenvectors
—that is the columns of U—are introduced as a new set of coordinate axes
(called the "principal axes"). This means the transformation

Introducing this transformation in the equation (1) we see that the same
equation formulated in the new (primed) reference system becomes

where

On the other hand, premultiplication of (10) by Ũ gives the fundamental


relation

This means that in the new reference system (the system of the principal
axes), the matrix 8 is reduced to a diagonal matrix and the equation of the
second order surface becomes

Now we can make use of the fact that the λ_i are invariants of an orthogonal
transformation. Since the coefficients of an algebraic equation are
expressible in terms of the roots λ_i, we see that the entire characteristic
equation (3.14) is an invariant of an orthogonal transformation. This means
that we obtain n invariants associated with an orthogonal transformation
because the coefficient of every power of λ is an invariant. The most
important of these invariants are the coefficient of (-λ)^0 and the coefficient
of (-λ)^{n-1}. The former is obtainable by putting λ = 0 and this gives the
determinant of the coefficients of S, simply called the "determinant of S"

and denoted by ||S||. The latter is called the "spur" of the matrix and is
equal to the sum of the diagonal terms:

But in the reference system of the principal axes the determinant of S'
becomes the product of all the λ_i, and thus

while the "spur" of S' is equal to the sum of the λ_i and thus

Problem 101. Writing a linear transformation of the coordinates in the form

show that the invariance of (5) demands that U satisfy the orthogonality
conditions (3.10).
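These two chief invariants are easily checked numerically; a small sketch (ours, with an example matrix of our own):

```python
import numpy as np

S = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 2.]])
lam = np.linalg.eigvalsh(S)
print(np.linalg.det(S), np.prod(lam))   # determinant = product of eigenvalues
print(np.trace(S), np.sum(lam))         # spur = sum of eigenvalues
```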
Problem 102. Show that by considering the principal axis transformation of
S, S^2, S^3, . . . , S^n, we can obtain all the n invariants of S by taking the spur
of these matrices

Investigate in particular the case k = 2 and show that this invariant is equal to
the sum of the squares of the absolute values of all the elements of the matrix S:

Problem 103. Show that the following properties of a matrix are invariants of
an arbitrary rotation (orthogonal transformation):
a) symmetry
b) anti-symmetry
c) orthogonality
d) the matrices 0 and I
e) the scalar product xy of two vectors.
Problem 104. Show that for the invariance of the determinant and the spur
the symmetry of the matrix is not demanded: they are invariants of an
orthogonal transformation for any matrix.
Problem 105. Show that the eigenvalues of a real anti-symmetric matrix
Ã = -A are purely imaginary and come in pairs: λ_i = ±iβ_i. Show that one
of the eigenvalues of an anti-symmetric matrix of odd order is always zero.
Problem 106. Show that if all the eigenvalues of a symmetric matrix S collapse
into one: λ_i = α, that matrix must become S = αI.

Problem 107. Find the eigenvalues and principal axes of the following matrix
and demonstrate explicitly the transformation theorem (16), together with the
validity of the spur equations (21):

[Answer:

Problem 108. Find the eigenvalues and principal axes of the following Hermitian
matrix and demonstrate once more the validity of the three spur equations (21):

[Answer:

Problem 109. Show that in every principal axis of a Hermitian matrix a


complex phase factor of the form e^{iθ_i} remains arbitrary.

Problem 110. Show that if a matrix is simultaneously symmetric and orthogonal,


its eigenvalues can only be ± 1.

Problem 111. Show that the following class of n x n matrices are simultaneously
symmetric and orthogonal (cf. Section 2.19):

Show that for all even n the multiplicity of the eigenvalues ±1 is even, while for
odd n the multiplicity of +1 surpasses the multiplicity of -1 by one unit.
Problem 112. Construct another class of n x n symmetric and orthogonal
matrices by writing down the elements

and bordering them by the


upper horizontal and left vertical,

lower horizontal and right vertical

elements.

The resulting matrix is multiplied by the scalar

Problem 113. Consider the cases n = 2 and 3. Show that here the sine and
cosine matrices coincide. Obtain the principal axes for these cases.
[Answer:

Problem 114. Show that an arbitrary matrix which is simultaneously symmetric


and orthogonal, can be constructed by taking the product

where U is an arbitrary orthogonal matrix, while the diagonal elements


of Λ are ±1, in arbitrary sequence.

3.5. Decomposition of a symmetric matrix


The defining equation (4.9) of the principal axes gives rise to another
fundamental relation if we do not post-multiply but pre-multiply by Ũ,
taking into account the orthogonality of the matrix U:

This shows that an arbitrary symmetric matrix can be obtained as the product
of three factors: the orthogonal matrix U, the diagonal matrix Λ, and the
transposed orthogonal matrix Ũ.
A further important fact comes into evidence if it so happens that one

or more of the eigenvalues λ_i are zero. Let us then separate the zero eigen-
values from the non-zero eigenvalues:

We do the same with the eigenvectors u_i of the matrix U:

We consider the product (1) and start with post-multiplying U by Λ.


This means by the rules of matrix multiplication that the successive columns
of U become multiplied in succession by λ_1, λ_2, . . . , λ_p, 0, 0, . . . , 0. In
consequence we have the p columns λ_1u_1, λ_2u_2, . . . , λ_pu_p, while the rest of
the columns drop out identically. Now we come to the post-multiplication
by Ũ. This means that we should multiply the rows of our previous
construction by the columns of Ũ which, however, is equivalent to the
row by row multiplication of UΛ with U. We observe that all the vectors
u_{p+1}, u_{p+2}, . . . , u_n are obliterated and the result can be formulated in terms
of the semi-orthogonal matrix U_p which is composed of the first p columns
of the full matrix U, without any further columns. Hence it is not an
n x n but an n x p matrix. We likewise omit all the zero eigenvalues of
the diagonal matrix Λ and reduce it to the p x p diagonal matrix

Our decomposition theorem now becomes

which generates the symmetric n x n matrix S as a product of the semi-


orthogonal n x p matrix U_p, the p x p diagonal matrix Λ_p and the p x n
matrix Ũ_p which is the transpose of the first factor.
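A numerical sketch (ours) of this decomposition for a symmetric matrix with a zero eigenvalue:

```python
import numpy as np

S = np.array([[1., 1., 0.], [1., 1., 0.], [0., 0., 2.]])   # eigenvalues 0, 2, 2
lam, U = np.linalg.eigh(S)
keep = np.abs(lam) > 1e-12                 # discard the zero axis
Up, Lp = U[:, keep], np.diag(lam[keep])
print(Up.shape)                            # (3, 2): the semi-orthogonal U_p
print(np.allclose(Up @ Lp @ Up.T, S))      # S rebuilt from 2 axes alone: True
```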

Problem 115. Demonstrate the validity of the decomposition theorem (1) for
the matrix (4.23) of Problem 107.
Problem 116. Demonstrate the validity of the decomposition theorem (5) for
the matrix (4.25) of Problem 108.

3.6. Self-adjoint systems


If we have an arbitrary linear system of equations

we obtain the "adjoint system"

by transposing the matrix of the original system. If the matrices of the


two systems coincide, we speak of a "self-adjoint system". In that case
we have the condition Ã = A which means that the matrix of the linear
system is a symmetric (in the case of complex elements Hermitian) matrix.
But such a matrix has special properties which can immediately be utilised
for the solution of a system of linear equations.
We can introduce a new reference system by rotating the original axes
into the principal axes of the matrix A. This means the transformation

where

Hence in the new reference system the linear system (1) appears in the
form

Since Λ is a mere diagonal matrix, our equations are now separated and
immediately solvable—provided that they are in fact solvable. This is
certainly the case if none of the eigenvalues of A is zero. In that case the
inverse matrix Λ^{-1}

exists and we obtain the solution y' in the form


This shows that a self-adjoint linear system whose matrix is free of zero
eigenvalues is always solvable and the solution is unique.
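A short sketch (ours) of this solution scheme:

```python
import numpy as np

S = np.array([[4., 1.], [1., 3.]])
b = np.array([1., 2.])
lam, U = np.linalg.eigh(S)       # principal axes: S = U diag(lam) U~
b_prime = U.T @ b                # right side in the principal-axis system
y_prime = b_prime / lam          # separated equations: lam_i * y'_i = b'_i
y = U @ y_prime                  # rotate back to the original axes
print(np.allclose(S @ y, b))     # True
```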
But what happens if some of the eigenvalues λ_i vanish? Since any
number multiplied by zero gives zero, the equation

is only solvable for λ_i = 0 if b'_i = 0. Now b'_i can be interpreted as the


ith component of the vector b in the reference system of the principal axes:

and thus the condition

has the geometrical significance that the vector b is orthogonal to the ith
principal axis. That principal axis was defined by the eigenvalue equation

which, in view of the vanishing of λ_i, is now reduced to

If more than one eigenvalue vanishes, then the equation

has more than one linearly independent solution. And since the condition
(11) has to hold for every vanishing eigenvalue, while on the other hand
these are all the conditions demanded for the solvability of the linear system
(1), we obtain the fundamental result that the necessary and sufficient
condition for the solvability of a self-adjoint linear system is that the right side
is orthogonal to every linearly independent solution of the homogeneous equation
Ay = 0.
Coupled with these "compatibility conditions" (11) goes a further
peculiarity of a zero eigenvalue. The equation

is solvable for any arbitrary y'_i. The solution of a linear system with a
vanishing eigenvalue is no longer unique. But the appearance of the free
component y'_i in the solution means from the standpoint of the original
reference system that the product y'_i u_i can be added to any valid solution
of the given linear system. In the case of several principal axes of zero
eigenvalue an arbitrary linear combination of these axes can be added and
we still have a solution of our linear system. On the other hand, this is
all the freedom left in the solution. But "an arbitrary linear combination

of the zero axes" means, on the other hand, an arbitrary solution of the
homogeneous equation (14). And thus we obtain another fundamental
result: " The general solution of a compatible self-adjoint system is obtained by
adding to an arbitrary particular solution of the system an arbitrary solution
of the homogeneous equation Ay = 0."

Problem 117. Show that the last result holds for any distributive operator A.

3.7. Arbitrary n x m systems


We now come to the investigation of an arbitrary n x m linear system

where the matrix A has n rows and m columns and transforms the column
vector y of m components into the column vector b of n components. Such
a matrix is obviously associated with two spaces, the one of n, the other
of m dimensions. We will briefly call them the N-space and the M-space.
These two spaces are in a duality relation to each other. If the vector y
of the M-space is given, the operator A operates on it and transplants it
into the N-space. On the other hand, if our aim is to solve the linear
system (1), we are given the vector b of the N-space and our task is to find
the vector y of the M-space which has generated it through the operator A.
However, in the present section we shall not be concerned with any
method of solving the system (1) but rather with a general investigation of
the basic properties of such systems. Our investigation will not be based
on the determinant approach that Kronecker and Frobenius employed in
their algebraic treatment of linear systems, but on an approach which
carries over without difficulty into the field of continuous linear operators.
The central idea which will be basic for all our discussions of the behaviour
of linear operators is the following. We will not consider the linear system
(1) in isolation but enlarge it by the adjoint m x n system

The matrix Ã has m rows and n columns and accordingly the vectors x
and c are in a reciprocity relation to the vectors y and b, x and b being
vectors of the N-space, y and c vectors of the M-space.
The addition of the system (2) has no effect on the system (1) since the
vectors x and c are entirely independent of the vectors y and b, and vice
versa. But the addition of the system (2) to (1) enlarges our viewpoint
and has profound consequences for the deeper understanding of the properties
of linear systems.
We combine the systems (1) and (2) into the larger scheme

where we now introduce a new (n + m) by (n + m) symmetric (respectively


Hermitian) matrix S, defined as follows:

The linear system (3) can now be pictured as follows:

The vectors (x, y) combine into the single vector z from the standpoint of
the larger system, just as the vectors (b, c) combine into the larger vector a.
However, for our present purposes we shall prefer to maintain the
individuality of the vectors (x, y) and formulate all our results in vector
pairs, although they are derived from the properties of the unified system (5).
Since the unified system has a symmetric matrix, we can immediately
apply all the results we have found in Sections 4, 5, and 6. First of all,
we shall be interested in the principal axis transformation of the matrix 8.
For this purpose we have to establish the fundamental eigenvalue equation

which in view of the specific character of our matrix (4) appears in the
following form, putting w = (u, v):

We will call this pair of equations the "shifted eigenvalue problem", since
on the right side the vectors u and v are in shifted position, compared with
the more familiar eigenvalue problem (3.13), (3.18). It is of interest to
observe that the customary eigenvalue problem loses its meaning for n x m
matrices, due to the heterogeneous spaces to which u and v belong, while
the shifted eigenvalue problem (7) is always meaningful. We know in
advance that it must be meaningful and yield real eigenvalues since it is
merely the formulation of the standard eigenvalue problem associated with
a symmetric matrix, which is always a meaningful and completely solvable
problem. We also know in advance that we shall obtain n + m mutually
orthogonal eigenvectors, belonging to n + m independent eigenvalues,
although the eigenvalues may not be all distinct (the characteristic equation
can have multiple roots).
The orthogonality of two w_i eigenvectors now takes the form

but we can immediately add an interesting consequence of the equations (7).


Let λ_i be a non-zero eigenvalue. Then, together with the solution (v, u; λ)
goes the solution (v, -u; -λ) and thus—combining the solutions for λ_i and
-λ_i—we can complement the relation (8) by the equation

which yields

thus demonstrating that the vectors u_i and v_i in themselves form an


orthogonal set of vectors. This holds so far only for all λ_i which are not
zero. But we can extend the result to all u_i and v_i vectors if we premultiply
the first equation (7) by Ã, respectively the second equation by A. This
shows that the vectors u_i, and likewise the vectors v_j, can be formulated
independently of each other, as solutions of the eigenvalue problems

Now AÃ is in itself a symmetric n x n matrix which operates in the N-space,


while ÃA is a symmetric m x m matrix which operates in the M-space;
(the symmetry of these matrices follows at once by applying the transposition
rule (3.4) to a product). Consequently we must obtain n mutually orthogonal
u_i vectors as a result of (11) and m mutually orthogonal v_j vectors as a

result of (12). These vectors can serve as an orthogonal set of base vectors
which span the entire N-, respectively, M-space.
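A short sketch (ours): forming the unified matrix S of (4) for a small rectangular A of our own choosing exhibits the paired ±λ_i eigenvalues together with the zero axes, the positive λ_i agreeing with the singular values of A:

```python
import numpy as np

A = np.array([[1., 0., 2.], [0., 1., 1.]])          # n x m = 2 x 3
n, m = A.shape
S = np.block([[np.zeros((n, n)), A],
              [A.T, np.zeros((m, m))]])             # the unified matrix
print(np.sort(np.linalg.eigvalsh(S)))               # -l2, -l1, 0, l1, l2
print(np.linalg.svd(A, compute_uv=False))           # l1, l2 once more
```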
We will picture these two spaces by their base vectors which are arranged
in successive columns. We thus obtain two square matrices, namely the
U-matrix, formed out of the n vectors u_1, u_2, . . . , u_n, and the V-matrix,
formed out of the m vectors v_1, v_2, . . . , v_m. While these two spaces are
quite independent of each other, yet the two matrices U and V are related
by the coupling which exists between them due to the original eigenvalue
problem (7) which may also be formulated in terms of the matrix equations

This coupling must exist for every non-zero eigenvalue λ_i while for a zero
eigenvalue the two equations

become independent of each other.


We will now separate the zero eigenvalue from the non-zero eigenvalues.
Let us assume that our eigenvalue problem (7) has p independent solutions
if we set up the condition that only positive eigenvalues λ_i are admitted.
They give us p independent solutions (u_i, v_i; λ_i), to which we can add the
p additional solutions (u_i, -v_i; -λ_i), thus providing us with 2p solutions
of the principal axis problem (6). On the other hand, the eigenvalue
problem (11) must have n independent solutions and, since p was the total
number of non-vanishing eigenvalues, the remaining n - p axes must belong
to the eigenvalue zero. Hence the number of independent solutions of (15)
must be n - p. By exactly the same reasoning the number of independent
solutions of (14) must be m - p. And since these axes are not paired—
that is they appear in the form (u_j, 0), respectively (0, v_j)—the zero eigen-
value has the multiplicity

which, together with the 2p "paired" axes actually generate the demanded
m + n principal axes of the full matrix (4).
Problem 118. By applying the orthogonality condition (8) to the pair (u_i, v_i; λ_i),
(u_i, -v_i; -λ_i), (λ_i ≠ 0), demonstrate that the normalisation of the length of u_i
to 1 automatically normalises the length of the associated v_i to 1 (or vice versa).

3.8. Solvability of the general n x m system


The extended matrix S puts us in the position to give a complete answer
to the problem of solving arbitrary n x m linear systems. This answer was
found in Section 6 for arbitrary self-adjoint systems, but now it so happens
that the unification of the two problems (7.1) and (7.2) in the form (7.5)

actually yields a self-adjoint system and thus the results of our previous
investigation become directly applicable. In particular we can state
explicitly what are the compatibility conditions to be satisfied by the right
side (b, c) of the unified system (7.5) which will make a solution possible.
This condition appeared in the form (6.11) and thus demands the generation
of the eigenvectors (u_i, v_i) associated with the eigenvalue zero. The
necessary and sufficient condition for the solvability of the system (7.5)
will thus appear in the general form

where (u_i, v_i) is any principal axis associated with the eigenvalue λ = 0:

But now we have seen that these equations fall apart into the two inde-
pendent sets of solutions

and

Consequently the conditions (1) separate into the two sets:

and these compatibility conditions can be interpreted as follows: the


necessary and sufficient condition for the solvability of an arbitrary n x m
system is that the right side is orthogonal to all linearly independent solutions
of the adjoint homogeneous system.
The theorem concerning the uniqueness or non-uniqueness ("deficiency")
of a solution remains the same as that found before in Section 6: The general
solution of the linear system (1.1) is obtained by adding to an arbitrary particular
solution an arbitrary solution of the homogeneous equation Ay = 0. The
number m — p, which characterises the number of linearly independent
solutions of the homogeneous system Ay = 0 is called the "degree of
deficiency" of the given system (7.1).
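A small sketch (ours) of this compatibility test on an over-determined example of our own:

```python
import numpy as np

A = np.array([[1., 2.], [2., 4.], [0., 1.]])    # over-determined 3 x 2
w, V = np.linalg.eigh(A @ A.T)                  # u-axes of the N-space
U0 = V[:, np.abs(w) < 1e-12]                    # zero axes: solutions of A~u = 0
b_good = A @ np.array([1., 1.])                 # lies in the activated subspace
b_bad = b_good + U0[:, 0]                       # violates the condition
for b in (b_good, b_bad):
    print(np.allclose(U0.T @ b, 0))             # True, then False
```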
Problem 119. The customary definition of the rank p of a matrix is: "The
maximum order of all minors constructed out of the matrix which do not vanish
in their totality." In our discussion the number p appeared as "the number
of positive eigenvalues for which the shifted eigenvalue problem (7.7) is solvable".
Show that these two definitions agree; hence our p coincides with the customary
"rank" of a matrix.
Problem 120. Show that the eigenvalue zero can only be avoided if n = m.
Problem 121. Show that an under-determined system (n < m) may or may not
demand compatibility conditions, while an over-determined system (n > m)
always demands at least n — m compatibility conditions.

Problem 122. Show that the number p must lie between 1 and the smaller of
the two numbers n and m:

Problem 123. Prove the following theorem: "The sum of the square of the
absolute values of all the elements of an arbitrary (real) n x m matrix is
equal to the sum of the squares of the eigenvalues λ_1, λ_2, . . . , λ_p."
Problem 124. Given the following 4 x 5 matrix:

Determine the deficiencies (and possibly compatibility conditions) of the linear


system Ay = 6. Obtain the complete solution of the shifted eigenvalue problem
(7.7), to 5 decimal places.
[Hint: Obtain first the solution of the homogeneous equations Av = 0 and
Au = 0. This gives 3 + 2 = 5 axes and p = 2. The 4 x 4 system AAu — Xzu
is reducible to a 2 x 2 system by making use of the orthogonality of u to the
two zero-solutions, thus obtaining A2 by solving a quadratic equation.]
[Answer:
Deficiency:
Compatibility:
λ_1 = 15.434743     λ_2 = 6.911490

        u_1           u_2         u_3   u_4
    0.28682778   -0.46661531       1     2
    0.15656598    0.52486565       2     1
    0.59997974    0.58311599      -1     0
    0.73023153   -0.40836497       0    -1

        v_1           v_2         v_3   v_4   v_5
    0.43935348   -0.01703335      -2     6    14
    0.33790963   -0.77644357       3    -3    -1
   -0.01687774    0.28690800       8     0     0
    0.40559800    0.55678265       0    -4     0
    0.72662988    0.06724708       0     0    -8

(Note that the zero axes have not been orthogonalised and normalised.) ]

3.9. The fundamental decomposition theorem


In Section 4, when dealing with the properties of symmetric matrices,
we derived the fundamental result that by a proper rotation of the frame of
reference an arbitrary symmetric matrix can be transformed into a diagonal
matrix Λ (cf. 4.15). Later, in Section 5, we obtained the result that an
arbitrary symmetric matrix could be decomposed into a product of 3 matrices:
the semi-orthogonal matrix U_p, the diagonal matrix Λ_p, and the transpose
Ũ_p of the first matrix. The eigenvalue λ = 0 was eliminated in this
decomposition theorem (5.5). An entirely analogous development is possible
for an arbitrary n x m matrix, on the basis of the eigenvalue problem (7.13).
We have seen that both matrices U and V are orthogonal and thus satisfy
the orthogonality conditions (3.10). If now we pre-multiply the first
equation by Ũ we obtain

The matrix Ũ is n x n, the matrix A n x m, the matrix V m x m.


Hence the product is an n x m matrix, and so is the diagonal matrix Λ,
whose elements are all zero, except for the diagonal elements, where we find
the positive eigenvalues λ_1, λ_2, . . . , λ_p and, if the diagonal contains more
elements than p, the remaining elements are all zero.
The equation (1) takes the place of the previous equation (4.15). In the
special case that A is a symmetric n x n matrix, the two orthogonal matrices
U and V become equal and Λ becomes a diagonal square matrix. In the
general case the transformation of A into a diagonal form requires pre-
multiplication and post-multiplication by two different orthogonal matrices
U and V.
We can, however, also post-multiply the first equation by Ṽ, thus
obtaining

The matrix A is now obtained as the product of the n x n orthogonal


matrix U, the n x m diagonal matrix Λ and the m x m transposed
orthogonal matrix Ṽ.
Here again we want to separate the p positive eigenvalues of A from the
remaining zero eigenvalues. We define Λ_p as the positive square matrix

The product UΛ requires that the successive columns of U be multiplied


by λ_1, λ_2, . . . , λ_p, while the remaining part of the matrix vanishes
identically. Now we shall multiply this product with the columns of Ṽ, or
we may also say: with the rows of V. But then all the columns of V
beyond v_p are automatically eliminated and what remains is

where Up is the semi-orthogonal n x p matrix which is formed out of the
column vectors u1, u2, ..., up, while Vp is a semi-orthogonal m x p matrix,
formed out of the column vectors v1, v2, ..., vp. The matrix A is thus
obtained as a product of an n x p, p x p, and p x m matrix which actually
gives an n x m matrix.
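In modern terminology the decomposition (4) is the "singular value
decomposition". As a purely illustrative sketch (the 2 x 3 matrix below is an
assumed example, and the Python/NumPy library stands in for hand
computation), the factorisation can be verified numerically:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [4.0, 2.0, 0.0]])           # an assumed 2 x 3 matrix of rank p = 1

    U_full, s, Vt_full = np.linalg.svd(A)     # full orthogonal U (n x n) and Ṽ (m x m)
    p = int(np.sum(s > 1e-12))                # number of positive eigenvalues λ1, ..., λp

    U_p = U_full[:, :p]                       # essential axes u1, ..., up (n x p)
    V_p = Vt_full[:p, :].T                    # essential axes v1, ..., vp (m x p)
    L_p = np.diag(s[:p])                      # the positive square matrix Λp

    assert np.allclose(A, U_p @ L_p @ V_p.T)  # equation (4): A = Up Λp Ṽp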
In order to understand the true significance of the decomposition theorem
(4), let us once more pay attention to the two spaces: the N-space and
the M-space, with which the matrix A is associated. We represent these
spaces with the help of the n eigenvectors u1, u2, ..., un, respectively the
m eigenvectors v1, v2, ..., vm. However, we will make a sharp division
line beyond the subscript p. The first p vectors ui, vi, although belonging
to two different spaces, are paired with each other. They form the matrices
Up and Vp but we shall prefer to drop the subscript p and call them simply
U and V, while the remaining portions of the N-space and M-space,
associated with the eigenvalue zero, will be included in the matrices U0
and V0:

        U = (u1, ..., up),    U0 = (up+1, ..., un)
        V = (v1, ..., vp),    V0 = (vp+1, ..., vm)   (5)
The fundamental matrix decomposition theorem (4) appears now in the form

        A = U Λ Ṽ                                    (6)

and reveals the remarkable fact that the operator A can be generated without
any knowledge of the principal axes associated with the zero eigenvalue, that
is without any knowledge of the solutions of the homogeneous equations
Av = 0 and Ãu = 0. These solutions gave us vital information concerning
the compatibility and deficiency of the linear system Ay = b, but exactly
these solutions are completely ignored by the operator A.

We can understand this peculiar phenomenon and obtain an illuminating
view concerning the general character of a linear operator if we form the
concept of the "proper space" or "eigenspace" or "operational space"
associated with the matrix A as an operator. It is true that the matrix
A is associated with an n-dimensional and an m-dimensional vector space.
But it so happens that the matrix A is not activated in all dimensions of these
two spaces but only in a definite subspace which is in both cases p-dimensional.
Only in the limiting case n = m = p can it happen that the operator A
includes the entire N-space and the entire M-space. If n < m (under-
determined system) the entire N-space (the space of the right side) may be
included but the M-space (the space of the solution) is only partially
represented. If n > m (over-determined system) the entire M-space (the
space of the solution) may be included but the N-space (the space of the
right side) is only partially represented. The reason for the necessity of
compatibility conditions and for the deficiency of linear systems is exactly
this partial activation of the operator A. We will call the principal axes
belonging to the positive eigenvalues the "essential axes" of the matrix
since they are in themselves sufficient for the construction of A. The
remaining axes which belong to the eigenvalue zero (the "zero-field") are
the "deficient axes" in which the matrix is not activated, which are in fact
ignored by the matrix. The matrix A has in a sense a "blind spot" in all
these dimensions. We could use the picture that the field spanned by the
p axes u1, u2, ..., up and v1, v2, ..., vp is "illuminated" by the operator
while the remaining fields are left in the dark. In this sense the "eigen-space"
of the matrix A as an operator is limited to the p-dimensional subspaces
which are spanned by the columns of the matrices U and V while the spaces
spanned by the columns of U0 and V0 fall outside the operational space of A.
This concept of "activation" is very useful for the understanding of the
peculiarities of linear systems. The concept retains its usefulness in the
study of continuous operators (differential and integral operators) where
the same phenomenon occurs under analogous circumstances.
The separation of the fields U0, V0 from the fields U, V has the further
advantage that it yields a particularly simple formulation of the com-
patibility problem and the deficiency problem associated with the solution
of the general linear system (7.1). The compatibility conditions can now
be written down in the form of the single matrix equation

        Ũ0 b = 0                                     (7)

while the deficiency of the system appears in the form of

        ȳ = y + V0 η                                 (8)

where y is any particular solution, and η an arbitrary column vector of
m − p components.
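Both statements can be checked numerically. The following sketch (the same
assumed example as above) obtains U0 and V0 from the full decomposition,
tests the compatibility condition (7), and exhibits the deficiency (8):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [4.0, 2.0, 0.0]])
    U_full, s, Vt_full = np.linalg.svd(A)
    p = int(np.sum(s > 1e-12))

    U0 = U_full[:, p:]                        # zero axes of the N-space
    V0 = Vt_full[p:, :].T                     # zero axes of the M-space

    b = np.array([1.0, 2.0])                  # lies along u1, hence compatible
    print(U0.T @ b)                           # compatibility test (7): Ũ0 b = 0

    y = np.linalg.pinv(A) @ b                 # one particular solution
    eta = np.array([0.3, -1.0])               # arbitrary vector of m - p components
    y_general = y + V0 @ eta                  # the deficiency (8): ȳ = y + V0 η
    print(A @ y_general - b)                  # still solves the system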
Problem 125. Apply the general decomposition theorem (6) to a row vector a,
considered as a 1 x m matrix. Do the same for the column vector a, considered
as an n x 1 matrix.

[Answer:

Problem 126. Obtain the decomposition of the following 2 x 3 matrix:

[Answer:

Problem 127. Construct the 4 x 5 matrix (8.6) with the help of the two non-
zero eigensolutions, belonging to λ1 and λ2, of the table (8.9).
[Answer: Carry out numerically the row-by-row operation

3.10. The natural inverse of a matrix


The ordinary inverse of a matrix A is defined by the equation

        A A⁻¹ = A⁻¹ A = I                            (1)
Such an inverse exists only in the limiting case m = n = p, that is when
the eigen-space of the matrix includes the complete vector spaces N and M.
In that case the relations

        Ũ U = I,    Ṽ V = I                          (2)

(which are always valid because U and V are always semi-orthogonal) are
reversible:

        U Ũ = I,    V Ṽ = I                          (3)
and U and V become fully orthogonal. Then we can complement the
construction of A out of U and V according to (9.6) by the construction of
a new matrix B according to the rule

        B = V Λ⁻¹ Ũ                                  (4)

This construction is always possible since the matrix

        Λ⁻¹ = diag(1/λ1, 1/λ2, ..., 1/λp)            (5)

always exists. Moreover, the products AB and BA can always be formed:

        AB = U Ũ                                     (6)
        BA = V Ṽ                                     (7)

But these products become I only in the case that the relations (3) hold
and that is only true if m = n = p.
Has the matrix B any significance in the general case in which p is not
equal to m and n? Indeed, this is the case and we have good reasons to
consider the matrix B as the natural inverse of A, even in the general case.
Let us namely take in consideration that the operation of the matrix A is
restricted to the spaces spanned by the matrices U and V. The spaces U0
and V0 do not exist as far as the operator A is concerned. Now the unit
matrix I has the property that, operating on any arbitrary vector u or v, it
leaves that vector unchanged. Since, however, the concept of an "arbitrary
vector" is meaningless in relation to the operator A—whose operation is
restricted to the eigen-spaces U, V—it is entirely sufficient and adequate to
replace the unit matrix I by a less demanding matrix which leaves any
vector belonging to the subspaces U and V unchanged.
The product AB, being an n x n matrix, can only operate on a vector
of the N-space and if we want this vector to belong to the subspace U,
we have to set it up in the form

        b = U η                                      (8)

where η is an arbitrary column vector of p elements. But then, we obtain
in view of (6):

        AB b = U Ũ U η = U η = b                     (9)

which demonstrates that the product AB has actually the property to
leave any vector belonging to the U-space unchanged.
On the other hand, the product BA is an m x m matrix which can only
operate on a vector of the M-space. If again we want to restrict this
vector to the subspace V, we have to set it up in the form

        y = V η                                      (10)

where once more η is an arbitrary column vector of p elements. Then we
obtain in view of (7):

        BA y = V Ṽ V η = V η = y                     (11)

and once more we have demonstrated that the product BA has actually the
property to leave any vector belonging to the V-space unchanged.
The matrix B is thus the natural substitute for the non-existent "strict
inverse", defined by (1), and may aptly be called the "natural inverse" of
the matrix A. It is an operator which is uniquely associated with A and
whose domain of operation coincides with that of A. It ignores completely
the fields U0 and V0. If B operates on any vector of the subspace U0, it
annihilates that vector. Similarly, if B operates on any vector of the sub-
space V0, it likewise annihilates that vector.
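The rule (4) coincides with what is nowadays called the Moore-Penrose
pseudoinverse. A sketch under the same assumed example, verifying the
construction and the identities (9) and (11):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [4.0, 2.0, 0.0]])
    U_full, s, Vt_full = np.linalg.svd(A)
    p = int(np.sum(s > 1e-12))
    U, V = U_full[:, :p], Vt_full[:p, :].T

    B = V @ np.diag(1.0 / s[:p]) @ U.T        # B = V Λ⁻¹ Ũ, the natural inverse
    assert np.allclose(B, np.linalg.pinv(A))  # agrees with the Moore-Penrose inverse

    eta = np.random.rand(p)
    b = U @ eta                               # any vector of the U-space ...
    assert np.allclose(A @ B @ b, b)          # ... is left unchanged by AB, as in (9)
    y = V @ eta                               # any vector of the V-space ...
    assert np.allclose(B @ A @ y, y)          # ... is left unchanged by BA, as in (11)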
Let us now see what happens if we solve the linear equation

        A y = b                                      (12)

by

        y = B b                                      (13)

Have we found the solution of our equation? Substitution in (12) yields the
condition

        AB b = b                                     (14)

But this condition is actually satisfied—as we have just seen—if b belongs
to the subspace U. This, however, was exactly the compatibility condition
of our linear system. The orthogonality of b to all u-vectors with the
eigenvalue zero means that b has no projection in the U0 space and that
again means that it belongs completely to the U-space. And thus we see
that we have found the solution of our system (12), whenever a solution is
possible at all. In fact, more can be said. If b does not satisfy the com-
patibility condition (9.7), then there is still a solution possible in the sense
of least squares. That is, while the difference Ay − b cannot be made zero,
we can at least minimise the length of the error vector, that is we can make
the scalar

        (A y − b)²

as small as possible. It so happens that the solution (13)—which blots out
all projections of b into the U0 space—automatically coincides with the
desired least square solution in the case of incompatible systems.
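This least-square property is easy to test numerically. In the assumed 3 x 2
example below the first two equations contradict each other, yet y = Bb
reproduces the least square solution:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])                # over-determined 3 x 2 system, p = 2
    b = np.array([1.0, 3.0, 5.0])             # incompatible: rows 1 and 2 disagree

    y = np.linalg.pinv(A) @ b                 # the solution (13), y = Bb
    y_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
    assert np.allclose(y, y_ls)               # coincides with the least square solution
    print(y)                                  # [2. 5.]: the error length is minimised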
But what can we say about the possible deficiency of the system (12)?
If our solution is not unique but allows an infinity of solutions, there is
obviously no mathematical trick by which we can overcome this deficiency.
And yet the solution (13) seems to give a definite answer to our problem.
Now it is actually true that we have not eliminated the deficiency of our
system. If the homogeneous equation Ay = 0 has solutions, any such
solution may be added to (13). These, however, are exactly the vectors
which constitute the space V0. And so the deficiency of our system means
that our solution is uniquely determined only as far as the space V goes, in
which the operator is activated. The solution may have an additional
projection into the V0 space which is ignored by the operator A. This is
a piece of information which our linear system (12) is unable to give since
the space V0 is outside of its competence. Under these circumstances it is
natural to normalise our solution by putting it entirely into the well-confined
space V in which the operator is activated and in which the solution is
unique. The projection outside of that space, being outside the competence
of the given operator, is equated to zero. This is the significance of the
uniqueness of the solution (13) which has the role of a natural normalisation
of the solution in the case of incomplete (deficient) systems. The lacking
projection of y into V0 has to be obtained by additional information which
our system (12) is unable to give.
Problem 128. Consider the equation Ay = b where A is the 4 x 5 matrix (8.6)
of Problem 124, while b is given as follows:

Obtain the normalised least square solution (13) of this system, without making
use of the complete eigenvalue analysis contained in the table (8.9). Then
check the result by constructing the matrix B with the help of the two essential
axes belonging to λ1 and λ2.
[Hint: Make the right side b orthogonal to u3 and u4. Make the solution y
orthogonal to v3, v4, v5, thus reducing the system to a 2 x 2 system which has a
unique solution.]
[Answer:

Problem 129. The least square solution of the problem Ay = b is equivalent to
the solution of the system

        Ã A y = Ã b                                  (19)

Show that for this system the compatibility condition (9.7) is automatically
fulfilled. Show also that for an over-determined system which is free of
deficiencies the solution (13), constructed with the help of the B matrix,
coincides with the solution of the system (19).

3.11. General analysis of linear systems


The study of linear systems with a symmetric matrix is greatly facilitated
by the fact that a mere rotation of the reference system can diagonalise the
matrix. The equations are then separated and can be solved at once. The
decomposition theorem (9.6) puts us in the position to extend these
advantages to arbitrary n x m systems. We cannot expect, of course,
that a mere "change of the frame of reference" shall suffice since our problem
involves two spaces of generally different dimensionality. We succeed,
however, in our endeavour if we apply a proper rotation in the one space
and in the other space, although these two rotations are generally quite
different. The case of a symmetric n x n matrix is then distinguished only
in that respect that here the vector b and the vector y can be subjected to
the same orthogonal transformation, while in the general case the two
orthogonal transformations do not coincide.
We write down our linear equation (7.1), but substituting for A its
decomposition (9.6):

        U Λ Ṽ y = b                                  (1)

Let us now perform the following orthogonal transformations:

        b = U b',    y = V y'                        (2), (3)

Then, in view of the semi-orthogonality of U and V we obtain, if we pre-
multiply by Ũ:

        Λ y' = b'                                    (4)
Here we have the full counterpart of the equation (6.6) which we have en-
countered earlier in the study of n x n linear systems whose matrix was
symmetric. Now we have succeeded in generalising the procedure to arbitrary
non-symmetric matrices of the general n x m type.
But let us notice the peculiar fact that the new equation (4) is a p x p
system while the original system was an n x m system. How did this
reduction come about?
We understand the nature of this reduction if we study more closely the
nature of the two orthogonal transformations (2) and (3). Since generally
the U and V matrices are not full orthogonal matrices but n x p respectively
m x p matrices, the transformations (2) and (3) put a definite bias on the
vectors b and y. We can interpret these two equations as saying that b is
inside the U-space, y inside the V-space. The first statement is not
necessarily true but if it is not true, then our system is incompatible and
allows no solution. The second statement again is not necessarily true
since our system may be incomplete in which case the general solution
appears in the form (9.8) which shows that the solution y can have an
arbitrary projection into V0. However, we take this deficiency for granted
and are satisfied if we find a particular solution of our system which can be
later augmented by an arbitrary solution of the homogeneous equation.
We distinguish this particular solution by the condition that it stays entirely
within the V-space. This condition makes our solution unique.
Since the subspaces U and V are both p-dimensional, it is now under-
standable that our problem was reducible from the original n x m system
to a p x p system. Moreover, the equations of the new system are separated
and are solvable at once:

        y' = Λ⁻¹ b'                                  (5)

The matrix Λ⁻¹—encountered before in (10.5)—always exists since the
diagonal elements of Λ can never be zero. We obtain one and only one
solution. Going back to our original y by rotating back to the original
system we obtain

        y = V y' = V Λ⁻¹ b'                          (6)

Moreover, the premultiplication of (2) by Ũ gives

        b' = Ũ b                                     (7)

and thus

        y = V Λ⁻¹ Ũ b = B b                          (8)
We have thus obtained exactly the same solution that we have encountered
before in (10.13) when we were studying the properties of the "natural
inverse" of a matrix.
We should well remember, however, the circumstances which brought this
unique solution in existence:
1. We took it for granted that the compatibility conditions of the system
are satisfied. This demands that the right side b shall lie inside the
p-dimensional subspace U of the full N-space.
2. We placed the solution in the eigenspace of the matrix A, and that is
the p-dimensional subspace V of the full M-space.
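The whole procedure of this section (rotate b into the frame of the essential
u-axes, divide by the eigenvalues, rotate back by V) can be sketched in a few
lines; the example matrix is again an assumption:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [4.0, 2.0, 0.0]])
    b = np.array([1.0, 2.0])                  # compatible right side

    U_full, s, Vt_full = np.linalg.svd(A)
    p = int(np.sum(s > 1e-12))
    U, V = U_full[:, :p], Vt_full[:p, :].T

    b1 = U.T @ b                              # b' = Ũ b, equation (7)
    y1 = b1 / s[:p]                           # y' = Λ⁻¹ b', the separated system (5)
    y = V @ y1                                # rotate back: y = V Λ⁻¹ Ũ b = Bb
    print(A @ y - b)                          # ~ 0: the compatible system is solved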
Problem 130. Show that for the solution of the adjoint system (7.2) the role
of the spaces U and V is exactly reversed. Obtain the reduction of this m x n
system to the p x p system of equation (4).

3.12. Error analysis of linear systems


The spectacular advances in the design of large-scale digital computers
brought the actual numerical solution of many previously purely theoretically
solved problems into the limelight of interest. While a generation ago
systems of 10 to 20 simultaneous linear equations taxed our computing
facilities to the utmost, we can now tackle the task of solving linear systems
with hundreds of unknowns. Together with this development goes, how-
ever, the obligation to understand the peculiarities and idiosyncrasies of
large-scale linear systems in order to preserve us from a possible misuse
of the big machines, caused not by any technical defects of the machines but
by the defects of the given mathematical situation.
The problem very frequently encountered can be described as follows.
Our goal is to solve the linear system

        A y = b                                      (1)

We have taken care of the proper degree of determination by having just as
many equations as unknowns. Furthermore, we are assured that the
determinant of the system is not zero and thus the possibility of a zero
eigenvalue is eliminated. We then have the ideal case realised:

        n = m = p                                    (2)
and according to the rules of algebra our system must have one and only one
solution. The problem is entrusted to a big computing outfit which carries
through the calculations and comes back with the answer. The engineer
looks at the solution and shakes his head. A number which he knows to be
positive came out as negative. Something seems to be wrong also with
the decimal point since certain components of y go into thousands when
he knows that they cannot exceed 20. All this is very provoking and he
tells the computer that he must have made a mistake. The computer
points out that he has checked the solution, and the equations checked with
an accuracy which goes far beyond that of the data. The meeting breaks
up in mutual disgust.

What happened here? It is certainly true that the ideal case (2)
guarantees a unique and finite answer. It is also true that with our present-
day electronic facilities that answer is obtainable with an accuracy which
goes far beyond the demands of the engineer or the physicist. Then how
could anything go wrong?
The vital point in the mathematical analysis of our problem is that the
data of our problem are not mere numbers, obtainable with any accuracy
we like, but the results of measurements, obtainable only with a limited
accuracy, let us say an accuracy of 0.1%. On the other hand, the engineer
is quite satisfied if he gets the solution with a 10% accuracy, and why
should that be difficult with data which are 100 times as good?
The objection is well excusable. The peculiar paradoxes of linear systems
have not penetrated yet to the practical engineer whose hands are full with
other matters and who argues on the basis of experiences which hold good
in many situations but fail in the present instance. We are in the fortunate
position that we can completely analyse the problem and trace the failure
of that solution to its origins, showing the engineer point by point how the
mishap occurred.
We assume the frequent occurrence that the matrix A itself is known with
a high degree of accuracy while the right side b is the result of measurements.
Then the correct equation (1) is actually not at our disposal but rather the
modified equation

        A ȳ = b̄                                     (3)

where

        b̄ = b + β                                   (4)

the given right side, differs from the "true" right side b by the "error
vector" β. From the known performance of our measuring instruments we
can definitely tell that the length of β cannot be more than a small percentage
of the measured vector b—let us say 0.1%.
Now the quantity the computer obtains from the data given by the
engineer is the vector

        ȳ = y + η                                    (5)

since he has solved the equation (3) instead of the correct equation (1). By
substituting (5) in (3) we obtain for the error vector η of the solution the
following determining equation:

        A η = β                                      (6)
The question now is whether the relative smallness of β will have the relative
smallness of η in its wake, and this is exactly the point which has to be
answered by "no".
As in the previous section, we can once more carry through our analysis
most conveniently in a frame of reference which will separate our equations.

Once more, as before in (11.2) and (11.3), we introduce the orthogonal
transformation

        β = U β',    η = V η'                        (7)

and reduce our system to the diagonal form

        Λ η' = β'                                    (8)
Of course, we are now in a better position than before. The U and V
matrices are now full orthogonal matrices, we have no reduction in the
dimensions, all our matrices remain of the size n x n and thus we need
not bother with either compatibility conditions or deficiency conditions—at
least this is what we assume.
In actual fact the situation is much less rosy. Theoretically it is of the
greatest importance that none of the eigenvalues λi becomes zero. This
makes the system (8) solvable and uniquely solvable. But this fact can
give us little consolation if we realise that a very small λi will cause a
tremendous magnification of the error β'i in the direction of that principal
axis, if we come to the evaluation of the corresponding error caused in the
solution:

        η'i = β'i / λi                               (9)
Let us observe that the transformation (7) is an orthogonal transformation
in both β and η. The length of neither β nor η is influenced by this
transformation. Hence

        |β'| = |β|,    |η'| = |η|                    (10)
Let us assume that the eigenvalue λi in the direction of a certain axis
happens to be 0.001. Then, according to (9), the small error 0.01 in the
data causes immediately the large error 10 in the solution. What does that
mean percentage-wise? It means the following. Let us assume that in
some other directions the eigenvalues are of the order of magnitude of unity.
Then we can say that the order of magnitude of the solution vector y and
that of the data vector 6 is about the same. But the small error of 1% in
one component of the data vector has caused the intolerably large error of 1000%
in the solution vector. That is, the error caused by an inaccuracy of our
data which is not more than 1%, has been blown up by solving the linear
system to such an extent that it appears in the solution as a vector which is
ten times as big as the entire solution. It is clear that under such circumstances
the "solution" is completely valueless. It is also clear that it was exactly
the correct mathematical inversion of the matrix A which caused the trouble
since it put the tremendous premium of 1000 on that axis and thus magnified
the relatively small error in that direction beyond all proportions. Had we
used a computing technique which does not invert the matrix but obtains

the solution of a linear system in successive approximations, bringing the
small eigenvalues gradually in appearance, we would have fared much better
because we could have stopped at a proper point, before the trouble with
the very small eigenvalues developed.*
We see that the critical quantity to which we have to pay attention, is
the ratio of the largest to the smallest eigenvalue, also referred to as the
"condition number" of the matrix:

        C = λmax / λmin                              (11)
This popular expression is not without its dangers since it creates the
impression that an "ill-conditioned" matrix is merely in a certain mathe-
matical "condition" which could be remedied by the proper know-how. In
actual fact we should recognise the general principle that a lack of information
cannot be remedied by any mathematical trickery. If we ponder on our
problem a little longer, we discover that it is actually the lack of information
that causes the difficulty. In order to understand what a small eigenvalue
means, let us first consider what a zero eigenvalue means. If in one of the
equations of the system (11.4), for example the i-th equation, we let λi converge
to zero, this means that the component y'i appears in our linear system with
the weight zero. We can trace back this component to the original vector
y, on account of the equation

        y' = Ṽ y                                     (12)

By the rules of matrix multiplication we obtain the component y'i by
multiplying the i-th row of Ṽ—and that means the i-th column of V—by y.
And thus we have

        y'i = ṽi y                                   (13)
Hence we can give a very definite meaning to a vanishing eigenvalue. By
forming the scalar product (13), we obtain a definite linear combination of
the unknowns y1, y2, ..., ym, which is not represented in the given linear
system and which cannot be recovered by any tricks. We have to take it
for granted that this information is denied us. We see now that the
determination of all the linearly independent solutions of the homogeneous
equation Av = 0 has the added advantage that it gives very definite
information concerning those combinations of the unknowns which are not
represented in our system. We have found for example that the 4 x 5
matrix (8.6) had the three zero-axes v3, v4, v5, tabulated in (8.9). We can
now add that if we try to solve the system

        A y = b                                      (14)

there will be three linear combinations of the 5 unknowns, which are a priori
* See the author's paper on "Iterative solution of large-scale linear systems" in the
Journal of SIAM 6, 91 (1958).

unobtainable because they are simply not represented in the system. They
are:

        −2y1 + 3y2 + 8y3,    6y1 − 3y2 − 4y4,    14y1 − y2 − 8y5        (15)

Of course, any linear combinations of these three quantities are likewise
unavailable and so we can state the non-existent combinations in infinitely
many formulations. If we are interested in the solution of the adjoint
system

        Ã x = c                                      (16)

here the axes u3 and u4 come in operation and we see that the two
combinations

        x1 + 2x2 − x3,    2x1 + x2 − x4              (17)

are a priori un-determinable (or any linear aggregate of these two expressions).
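For any given matrix these lost combinations can be read off from the zero-axes
of the M-space. A sketch with an assumed rank-deficient 3 x 3 matrix (not the
book's example (8.6)):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 1.0, 1.0]])           # rank 2: row 2 is twice row 1
    U_full, s, Vt_full = np.linalg.svd(A)
    p = int(np.sum(s > 1e-10))

    V0 = Vt_full[p:, :].T                     # zero axes vp+1, ..., vm
    print(V0.T)                               # each row is a combination ṽi y that the
                                              # system cannot determine; here it is
                                              # proportional to -y1 + 2y2 - y3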
Now we also understand what a very small eigenvalue means. A certain
linear combination of the unknowns, which can be determined in advance,
does not drop out completely but is very weakly represented in our system.
If the data of our system could be trusted with absolute accuracy, then the
degree of weakness would be quite immaterial. As long as that combination
is present at all, be it ever so weakly, we can solve our system and it is
merely a question of numerical skill to obtain the solution with any degree
of accuracy. But the situation is very different if our data are of limited
accuracy. Then the very meagre information that our numerical system
gives with respect to certain linear combinations of the unknowns is not
only unreliable—because the errors of the data do not permit us to make
any statement concerning their magnitude—but the indiscriminate handling
of these axes ruins our solution even with respect to that information that
we could otherwise usefully employ. If we took the values of these weak
combinations from some other information—for example by casting a
horoscope or by clairvoyance or some other tool of para-psychology—we
should probably fare much better because we might be right at least in the
order of magnitude, while the mathematically correct solution has no hope
of being adequate even in roughest approximation.
This analysis shows how important it is to get a reliable estimate
concerning the "condition number" of our system and to reject linear
systems whose condition number (11) surpasses a certain danger point,
depending on the accuracy of our data. If we admit such systems at all,
we should be aware of the fact that they are only theoretically square
systems. In reality they are n x m systems (n < m) which are deficient in
certain combinations of the unknowns and which are useful only for those
combinations of the unknowns which belong to eigenvalues which do not go

below a certain limit. Experience shows that large-scale systems especially


are often prone to be highly skew-angular ("ill-conditioned") and special
care is required in the perusal of their results.
Problem 131. Given the following 3 x 3 matrix:

what combination of the coordinates is not obtainable by solving the system
Ay = b, respectively Ãx = c?
[Answer:

Problem 132. Give the following interpretation of the non-obtainable (or
almost non-obtainable) coordinate combinations. Rotate the position vector
(y1, y2, ..., ym) of the M-space into the (orthonormal) reference system of the
principal axes (v1, v2, ..., vm); since the axes vp+1, ..., vm are not represented
(or almost not represented) in the operator A, the components y'p+1, ..., y'm
become multiplied by zero (or exceedingly small numbers) and thus disappear.

3.13. Classification of linear systems


One of the peculiarities of linear systems is that our naive notions con-
cerning enumeration of equations and unknowns fail to hold. On the
surface we would think that n equations suffice for the determination of n
unknowns. We would also assume that having less equations than un-
knowns our system will have an infinity of solutions. Both notions can
easily be disproved. The following system of three equations with three
unknowns

is clearly unable to determine the 3 unknowns x, y, z since in fact we have
only two equations, the second equation being a mere repetition of the first.
Moreover, the following system of two equations for three unknowns

is far from having an infinity of solutions. It has no solution at all since
the second equation contradicts the first one. This can obviously happen
with any number of unknowns, and thus an arbitrarily small number of
equations (beyond 1) with an arbitrarily large number of unknowns may be
self-contradictory.

The only thing we can be sure of is that a linear system can have no
unique solution if the number of equations is less than the number of
unknowns. Beyond that, however, we can come to definite conclusions
only if in our analysis we pay attention to three numbers associated with a
matrix:
1. The number of equations: n
2. The number of unknowns: m
3. The rank of the matrix: p.
It is the relation of p to n and m which decides the general character of a
given linear system.
The "rank" p can be decided by studying the totality of linearly
independent solutions of the homogeneous equation

or

The analysis of Section 9 has shown that these two numbers are not
independent of each other. If we have found that the first equation has //,
independent solutions, then we know at once the rank of the matrix since

and thus

On the other hand, if the number of linearly independent solutions of the


second equation is v, then

and thus

According to the relation of p to n and m we can put the linear systems
into various classes. The following two viewpoints in particular are
decisive:
1. Are our data sufficient for a unique characterisation of the solution?
If so, we will call our system "completely determined", if not, "under-
determined".
2. Are our data independent of each other and thus freely choosable, or
are there some linear relations between them so that we could have dropped
some of our data without losing essential information? In the first case we
will call our system "free" or "unconstrained", in the second case "over-
determined" or "constrained". Over-determination and under-determina-
tion can go together since some of our data may be merely linear functions
of some basic data (and thus superfluous) and yet the totality of the basic
data may not be sufficient for a unique solution.

These two viewpoints give rise to four different classes of linear systems:
1. Free and complete. The right side can be chosen freely and the solution
is unique. In this case the eigen-space of the operator includes the entire
M and N spaces and we have the ideal case

        p = m = n

Such a system is sometimes called "well-posed", adopting an expression
that J. Hadamard used for the corresponding type of boundary value
problems in his "Lectures on the Cauchy-problem". It so happens, how-
ever, that so many "ill-posed" problems can be transformed into the
"well-posed" category that we prefer to characterise this case as the "well-
determined" case.
2. Constrained and complete. The right side is subjected to compatibility
conditions but the solution is unique. The operator still includes the
entire M-space but the N-space extends beyond the confines of the eigen-
space U of the operator. Here we have the case

        p = m < n
3. Free and incomplete. The right side is not subjected to any conditions
but the solution is not unique. The operator now includes the entire
N-space but the M-space extends beyond the confines of the eigen-space V
of the operator. Here we have the case

        p = n < m

The number of equations is smaller than the number of unknowns. Such
systems are under-determined.
4. Constrained and incomplete. The right side is subjected to compatibility
conditions and the solution is not unique. The eigen-space of the operator
is more restricted than either the M- or the N-space. Here we have the case

        p < m,    p < n

The relation of m to n, however, remains undecided and we may have the
three sub-cases

        n < m,    n = m,    n > m

Irrespective of this relation, a system of this kind is simultaneously over-
determined and under-determined because in some dimensions we have given
too much, in some others too little.

Numerical illustrations for these categories are given in the following
problems.
Problem 133. Show that the following system is free and complete (well-
determined):

[Answer:

Problem 134. Show that the following system is constrained and complete:

[Hint: Use successive eliminations.]


[Answer:
2 compat. cond.:

Solution:
Problem 135. Show that the following system is free and incomplete:

[Answer:

Problem 136. Show that the following system is constrained and incomplete:

[Answer:
Compatibility condition:
Solution:

3.14. Solution of incomplete systems


Since incomplete systems do not allow a unique solution but have an
infinity of solutions, we would be inclined to think that such systems are
of no significance and need no attention. We cannot take this attitude,
however, if we study the properties of differential operators because we
encounter very fundamental differential operators in the realm of field
operators which from the standpoint of linear equation systems are highly
deficient. For example one of the most important linear differential operators
in the realm of partial differentiation is the "divergence" of a vector field:

        div v = b                                    (1)

This operator transforms a vector of the n-dimensional space into a scalar.
Hence from the standpoint of solving this linear differential equation we
have the case of determining a vector field from a scalar field which is clearly
a highly under-determined problem. And yet, we do encounter the
divergence operator as one of the fundamental operators of mathematical
physics and it is by no means meaningless to ask: "What consequences can
we draw from the fact that the divergence of a vector field is given?"
Now in the previous section we solved such systems by giving an arbitrary
particular solution and then adding the general solution of the homogeneous
equation. While this procedure is formally correct, it has the disadvantage
that the "particular solution" from which we start, can be chosen with a
high degree of arbitrariness. This hides the fact that our operator gives a
very definite answer to our linear problem in all those dimensions in which
the operator is activated, and, on the other hand, fails completely in all those
dimensions in which the operator is not activated. But then it seems more
adequate to the nature of the operator to give the unique solution in all
those dimensions in which this solution exists and ignore those dimensions
which are outside the realm of the operator. If we give some particular
solution, this condition is not fulfilled because in all probability our solution
will have some projection in the field V0 which is not included by the
operator. On the other hand, the solution found in Section 10 with the
help of the B matrix, considered as the "natural inverse" of A, is unique and
satisfies the condition that we make only statements about those co-
ordinate combinations which are not outside the realm of the given operator.
But to generate this "natural inverse" in every case by going through the
task of first solving the shifted eigenvalue problem for all the non-zero axes
and then constructing the matrix as a product of three factors, would
be a formidable endeavour which we would like to avoid. There may
be a simpler method of obtaining the desired particular solution.
This is indeed possible, in terms of a "generating vector". Let us put

        y = Ã w                                      (2)

and shift the task of determining y to the task of determining w. Now we
know from the decomposition theorem (9.6) that

        Ã = V Λ Ũ                                    (3)

But then the vector Ãw—no matter what w may be—is automatically of
the form Vq, that is we have a vector which lies completely within the
activated field of the operator. The deficiency is thus eliminated and we
obtain exactly the solution which we desire to get. The auxiliary vector w
may not be unique but the solution y becomes unique.
We can thus eliminate the deficiency of any linear system and obtain the
"natural solution" of that system by adopting the following method of
solution:

        A Ã w = b,    y = Ã w                        (4)

The matrix A Ã is an n x n square matrix. The original n x m system
(m > n) is thus transformed into a "well-posed" n x n system which can
be solved by successive eliminations. If the vector w allows several solu-
tions, this uncertainty is wiped out by the multiplication by Ã. The new
solution y is unique, no matter how incomplete our original system has been.
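The method (4) is easily sketched numerically; the assumed example below is
as under-determined as a system can be, one equation in three unknowns:

    import numpy as np

    A = np.array([[1.0, 1.0, 1.0]])           # one equation, three unknowns
    b = np.array([6.0])

    w = np.linalg.solve(A @ A.T, b)           # the n x n system A Ã w = b
    y = A.T @ w                               # the generating substitution y = Ã w
    print(y)                                  # [2. 2. 2.]: the natural solution
    assert np.allclose(y, np.linalg.pinv(A) @ b)   # agrees with y = Bb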
For example, in the case of the above-mentioned divergence operator we
shall see later that the significance of the substitution (2) is

        v = −grad φ                                  (5)

with the boundary condition

        φ = 0 on the boundary                        (6)

Hence the problem (1) becomes uniquely solvable because now we obtain

        −∇²φ = b                                     (7)

which is Poisson's equation and which has (under the condition (6)) a
unique solution. This is in fact the traditional method of solving the
problem (1). But we now see the deeper significance of the method: we
gave a unique solution in all those dimensions of the function space in
which the operator is activated and ignored all the other dimensions.
Problem 137. Obtain the solution of the incomplete system (13.20) by the
method (2) and show that we obtain a unique solution which is orthogonal to
the zero-vectors of the field V0.
[Answer:

Problem 138. Do the same for the 4 x 4 system (13.22) and show that the
deficiency of w has no influence on the solution y.
[Answer:

3.15. Over-determined systems


A similar development is possible for n x m systems in which the
number of equations surpasses the number of unknowns: n > m. Such
systems came first into prominence when Gauss found an ingenious method
of adjusting physical measurements. It often happened that the constants
of a certain mathematical law, applied to a physical phenomenon, had to
be obtained by actual physical measurements. If more measurements were
made than the number of constants warranted, the equation system used for
the determination of the unknown parameters became redundant. If the
measurements had been free of observational errors, this redundant system
would be compatible, and any combination of the minimum number of
equations would yield the same results. However, the observational errors
have the consequence that each combination yields a different solution and
we are in a quandary how to make our choice among the various possibilities.
Gauss established the principle (found independently also by Legendre)
that it is preferable to keep the redundant system in its totality and determine
the "most probable" values of the parameters with the help of the principle
that the sum of the squares of the left sides (after reducing the system to
zero) should be made as small as possible ("method of least squares"). In
the case that our equations are of a linear form, we obtain the principle of
minimising the necessarily non-negative scalar quantity

        (A y − b)²                                   (1)
If the system

        A y = b                                      (2)

happens to be compatible, then the minimum of (1) is zero and we obtain the
correct solution of the system (2). Hence we have not lost anything by
replacing the system (2) by the minimisation of (1) which yields

        Ã A y = Ã b                                  (3)

We have gained, however, by the fact that the new system is always
solvable, no matter how incompatible the original system might have been.
The reason is that the decomposition (14.3) of Ã shows that the new right
side

        b' = Ã b                                     (4)

cannot have any component in the field V0, because Ãb = V Λ Ũ b is
composed of the columns of V alone. The system (3) is thus automatically

compatible. Moreover, the new system is an ra x m square system, com-


pared with the larger n x m (n > m) system of the original problem. Hence
the new problem is ivell-posed, provided that the original problem was only
over-determined but not incomplete. As an example let us consider the
over-determined system

which transforms a scalar field tp into the vector field F. Let us apply the
method (3) to this over-determined system. In Section 14 we have en-
countered the operator "div" and mentioned that its transpose is the
operator " — grad". Accordingly the transpose of the operator "grad" is
the operator " —div". Hence the least-square reformulation of the original
equation (5) becomes

which yields the scalar equation

and once more we arrive at Poisson's equation. Here again the procedure
agrees with the customary method of solving the field equation (5) but we
get a deeper insight into the significance of this procedure by seeing that
we have applied the least-square reformulation of the original problem.
If we survey the results of the last two sections, we see that we have
found the proper remedy against both under-determination and over-
determination. In both cases the transposed operator Ã played a vital role.
We have eliminated under-determination by transforming the original y
into the new unknown w by the transformation y = Ãw and we have
eliminated over-determination (and possibly incompatibility) by the method
of multiplying both sides of the given equation by Ã. The unique solution
thus obtained coincides with the solution (10.13), generated with the help
of the "natural inverse" B.
Problem 139. Two quantities ξ and η are measured in such a way that their
sum is measured μ times, their difference ν times. Find the most probable
values of ξ and η. Solve the same problem with the help of the matrix B and
show the agreement of the two solutions.
[Answer: Let the arithmetic mean of the sum measurements be α, the arithmetic
mean of the difference measurements be β. Then

        ξ = (α + β)/2,    η = (α − β)/2.]
Problem 140. Form the product Ãb for the system of Problem 128 and show
that the vector thus obtained is orthogonal to the zero vectors v3, v4 and v5
(cf. Problem 124).
3.16. The method of orthogonalisation
We can give still another formulation of the problem of removing
deficiencies and constraints from our system. The characteristic feature of
the solution (14.2) is that the solution is made orthogonal to the field V0
which is composed of the zero-axes of the M-field. On the other hand the
characteristic feature of the least square solution (15.3) is that the right
side b is made orthogonal to the field U0 which is composed of the zero axes
of the N-field. If we possess all the zero axes—that is we know all the
solutions of the homogeneous equations Av = 0 and Ãu = 0—then we can
remove the deficiencies and constraints of our system and transform it to a
uniquely solvable system by carrying through the demanded orthogonalisation
in direct fashion.
1. Removal of the deficiency of the solution. Let y0 be a particular solution
of our problem

        A y0 = b                                     (1)

We want to find the properly normalised solution y which is orthogonal to
the zero field V0. For this purpose we consider the general solution

        y = y0 + V0 q                                (2)

and determine the vector q by the orthogonality condition

        Ṽ0 y = 0                                     (3)

This equation is solvable for q, in the form of a well-posed (m − p) x (m − p)
system:

        Ṽ0 V0 q = −Ṽ0 y0                             (4)

and having obtained q, we substitute in (2) and obtain the desired normalised
solution.
2. Removal of the incompatibility of the right side. We can proceed
similarly with the orthogonalisation of the right side b of a constrained
system. We must not change anything on the projection of b into the field
U but we have to subtract the projection into U0. Hence we can put

        b' = b + U0 q                                (5)

and utilise the condition

        Ũ0 b' = 0                                    (6)

which again yields for the determination of q the well-posed (n − p) x
(n − p) system

        Ũ0 U0 q = −Ũ0 b                              (7)

Substituting in (5) we obtain the new right side of the equation

        A y = b'                                     (8)

which satisfies the necessary compatibility conditions. Now we can omit
the surplus equations beyond n = m and handle our problem as an m x m
system.
By this method of orthogonalisation we can remain with the original
formulation of our linear system and still obtain a solution which is unique
and which satisfies the principle of least squares.
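The first of the two orthogonalisations may be sketched as follows (assumed
example; the particular solution y0 is deliberately chosen with a large
projection into the zero field):

    import numpy as np

    A = np.array([[1.0, 1.0, 1.0]])           # deficient system with a zero field V0
    b = np.array([6.0])
    y0 = np.array([6.0, 0.0, 0.0])            # an arbitrary particular solution

    U_full, s, Vt_full = np.linalg.svd(A)
    p = int(np.sum(s > 1e-12))
    V0 = Vt_full[p:, :].T                     # m x (m - p) matrix of zero axes

    # condition (3) on y = y0 + V0 q leads to the (m - p) x (m - p) system (4)
    q = np.linalg.solve(V0.T @ V0, -V0.T @ y0)
    y = y0 + V0 @ q
    print(y)                                  # [2. 2. 2.]: the normalised solution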

Problem 141. Obtain the solution (14.8) of Problem 137 by orthogonalisation.
(For the zero-field V0, cf. (13.21).) Do the same for the problem (13.22), and
check it with the solution (14.9).
Problem 142. A certain quantity y has been measured n times:

Find the least square solution of this system (the arithmetic mean of the right
sides) by orthogonalisation.

3.17. The use of over-determined systems


In many problems of mathematical analysis, over-determined systems can
be used to great advantage. One of the most fundamental theorems in the
theory of analytical functions is "Cauchy's integral theorem" which permits
us to obtain the value of an analytical function f(z) inside of a given domain
of analyticity if f(ζ) is known everywhere along the boundary:

        f(z) = (1/2πi) ∮_C f(ζ)/(ζ − z) dζ           (1)

The "given data" are in this case the boundary values of f(ζ) along the
boundary curve C and the relation (1) may be conceived as the solution of
the Cauchy-Riemann differential equations, under the boundary condition
that f(z) assumes the values f(ζ) on the boundary. These "given values",
however, are by no means freely choosable. It is shown in the theory of
analytical functions that giving f(ζ) along an arbitrarily small section of C
is sufficient to determine f(z) everywhere inside the domain included by C.
Hence the values f(ζ) along the curve C are by no means independent of
each other. They have in fact to satisfy the compatibility conditions

        ∮_C f(ζ) g(ζ) dζ = 0                         (2)

where g(z) may be chosen as any analytical function of z which is free of
singularities inside and on C. There are infinitely many such functions and

thus our system is infinitely over-determined. And yet the relation (1) is
one of the most useful theorems in the theory of analytical functions.
Another example is provided by the theory of Newtonian potential,
which satisfies the Laplace equation

        ∇²φ = 0                                      (3)

inside of a closed domain. We have then a theorem which is closely analogous
to Cauchy's theorem (1):

        φ(r) = (1/4π) ∮_S [G(r, S) ∂φ/∂n − φ(S) ∂G/∂n] dS        (4)

where

        G(r, S) = 1/R_rS                             (5)

R_rS being the distance between the fixed point r inside of C and a point S
of the boundary surface C.

The given data here are the functional values φ(S) along the boundary
surface S, and the values of the normal derivative ∂φ/∂n along the boundary
surface S. This is in fact too much since φ(S) alone, or (∂φ/∂n)(S) alone,
would suffice to determine φ(r) everywhere inside the domain. And thus
our problem is once more infinitely over-determined. The given data have
to satisfy the compatibility conditions

        ∮_S [g ∂φ/∂n − φ ∂g/∂n] dS = 0               (6)

where g(r) is any potential function which satisfies the Laplace equation (3)
everywhere inside and on S, free of singularities. There are infinitely
many such functions.
On the surface this abundance of data seems superfluous and handicapped
by the constraints to which they are submitted. But the great advantage
of the method is that we can operate with such a simple function as the
reciprocal distance between two points as our auxiliary function G(r, S). If
we want to succeed with the minimum of data, we have first to construct
the "Green's function" of our problem—as we will see later—which is for
complicated boundaries a very difficult task. On the other hand, of what
help is our solution, if in fact our observations give only φ(S) and (∂φ/∂n)(S)
is not known?
In that case we use the compatibility conditions themselves to obtain the
surplus data. We can consider the system (17.6) as an infinite set of linear
equations for the determination of ∂φ/∂n. If we succeed in solving these
equations, then we possess now the surplus data (∂φ/∂n)(S) and the relation
(4) solves the problem of obtaining φ(r) at all inside points.
This solution method is of actual practical advantage since the integration
is replaceable by a summation of a large but finite number of terms. We
then obtain a finite system of simultaneous algebraic equations of the form

Moreover, we can make this system well-conditioned by emphasising the
diagonal terms. We do that by choosing for g(r) the function

        g(r) = 1/R_rS'                               (8)

where S' is a point outside of the boundary C but very near to it. The
sharp increase of this function near to the point S' makes it possible to put
the spotlight rather strongly on that particular value (∂φ/∂n)(S) which
belongs to an S directly opposite to S' (see Figure). Although we did not
succeed in separating (∂φ/∂n)(S), yet we have a well-conditioned linear
system for its evaluation which can be solved with the help of the large
electronic computers.
We will demonstrate the value of over-determination by an example
within the realm of algebraic equations. Let us assume that we have to
solve the system Ay = b which shall be of the order 2n, i.e. we consider A
as a 2n x 2n matrix. We now separate our 2n unknowns (y1, y2, ..., y2n)
into two groups:

        ξ = (y1, ..., yn),    η = (yn+1, ..., y2n)   (9)

All the columns associated with the second group are carried over to the
right side, which means that we write our equation in the form

        X ξ = b − Y η                                (10)

where X is now a (2n x n) matrix and so is Y. The new system is now
over-determined because the unknowns η are considered as given quantities.
Hence the right side has to be orthogonal to every solution of the adjoint
homogeneous equation

        X̃ u = 0                                     (11)

There are altogether n such solutions, which can be combined into the
2n x n matrix U0. The compatibility conditions become
        Ũ0 (b − Y η) = 0                             (12)

which gives us n equations for the determination of η. This requires the
inversion of the n x n matrix Ũ0 Y. Then, having obtained η, we go back
to (10) but omitting all equations beyond n, and obtaining ξ by inverting
the matrix X1, where X1 is an n x n matrix, composed of the first n rows of X.

What we needed in this process, is two inversions of the order n, instead
of one inversion of the order 2n, provided that we possess the solutions of
the system (11). This, however, can again be given on the basis of the
inversion of X1. The vector u has 2n components which can be split into
the two column vectors u1 and u2, thus writing (11) in the form

        X̃1 u1 + X̃2 u2 = 0                           (13)

which gives—assuming that we possess X1⁻¹, the inverse of X1:

        u1 = −X̃1⁻¹ X̃2 u2                            (14)

The full 2n x n matrix U0 shall also be split into the two n x n matrices
U1 and U2, writing U2 below U1. We have complete freedom in choosing
the matrix U2, as long as its determinant is not zero. We will identify it
with the unit matrix I. Then

        U1 = −X̃1⁻¹ X̃2                               (15)

and

        Ũ1 = −X2 X1⁻¹ = Q                            (16)
If we split the matrix Y and the vector b in a similar way:

        Y = (Y1; Y2),    b = (b1; b2)

we can write the equation (12) in the form

        (Q Y1 + Y2) η = Q b1 + b2

This requires the inversion of the n x n matrix Q Y1 + Y2. Then, having
solved this equation, we return to the first half of (10) and obtain

        ξ = X1⁻¹ (b1 − Y1 η)                         (17)

We have thus demonstrated that the 2n x 2n system

        A y = b                                      (18)

can be solved by two n x n inversions and substitutions.
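The whole partitioned procedure can be traced numerically; in the sketch below
A and b are assumed random data, and the final line checks the result against a
direct solution of the full system:

    import numpy as np

    n = 3
    rng = np.random.default_rng(0)
    A = rng.standard_normal((2*n, 2*n))       # a generic 2n x 2n system
    b = rng.standard_normal(2*n)

    X, Y = A[:, :n], A[:, n:]                 # X ξ = b - Y η, equation (10)
    X1, X2 = X[:n, :], X[n:, :]
    Y1, Y2 = Y[:n, :], Y[n:, :]
    b1, b2 = b[:n], b[n:]

    Q = -X2 @ np.linalg.inv(X1)               # first n x n inversion, as in (16)
    eta = np.linalg.solve(Q @ Y1 + Y2, Q @ b1 + b2)  # second n x n inversion
    xi = np.linalg.solve(X1, b1 - Y1 @ eta)   # equation (17), re-using X1
    assert np.allclose(np.concatenate([xi, eta]), np.linalg.solve(A, b))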


Problem 143. Apply this method to the solution of the following 6 x 6 system:

[Answer:

3.18. The method of successive orthogonalisation


The matrix decomposition studied in Section 9 showed us that we can
construct the matrix A without paying any attention to the solution of the
homogeneous equations

        A v = 0                                      (1)

or

        Ã u = 0                                      (2)

since the principal axes associated with the zero eigenvalue do not participate
in the generation of the matrix. If we possess all the p "essential axes",
associated with the non-zero eigenvalues, we possess everything for the
solution of the linear system

        A y = b                                      (3)

because, together with A, we have obtained also the matrix B which we
could conceive as the natural inverse of A. Hence we could give an explicit
solution of the system (3).
The drawback of this method is, however, that it requires a very elaborate
mechanism to put it into operation. The matrix decomposition theorem
(9.6) is a very valuable tool in the study of the general algebraic and
analytical properties of linear operators but its actual application to the
solution of large-scale linear systems and to the solution of linear differential
equations demands the complete solution of an eigenvalue problem which is
explicitly possible only under greatly simplified conditions.

Under these circumstances it is of great interest that there is a remarkable
reciprocity between the "essential" principal axes which form the eigen-
space in which the matrix operates, and the remaining axes which form the
spaces U0 and V0 in which the operator is not activated. Although these
spaces are apparently ignored by the operator, we can in fact make good
use of them and go a long way toward the actual solution of our problem.
The great advantage of the essential axes has been that they bring
directly into evidence those p-dimensional subspaces U and V in which the
matrix is activated. Hence the original n x m problem can immediately
be reformulated as a p x p problem and we have at once overcome the
handicap of unnecessary surplus equations on the one hand, and deficiency
on the other. But in Section 16, we have studied a method which allows
the same reduction, purely on the basis of solving the homogeneous equations
(1) and (2), without solving the basic eigenvalue problem. We can make
this method more efficient by a process known as the "successive
orthogonalisation of a set of vectors". This process has the advantage that
it proceeds in successive steps, each step being explicitly solvable.
We will apply this orthogonalisation process to both sets of vectors vi and
ui, found as a solution of the homogeneous systems (1) and (2). Let us
observe that in this process the unknown vector y and the right side b of
the equation (3) do not come at first into evidence.
We assume that we have found all the linearly independent solutions of
the homogeneous system (1). Then any linear combination of these solutions
is again a solution. We will use this feature of a homogeneous linear
operator to transform the original set of solutions

        vp+1, vp+2, ..., vm                          (4)

into a more useful set. We start with vp+1 which we keep unchanged,
except that we divide it by the length of the vector:

        v'p+1 = vp+1 / |vp+1|                        (5)

Now we come to vp+2 which we want to make orthogonal to v'p+1. We put

        v'p+2 = γ (vp+2 − α v'p+1)                   (6)

Pre-multiplying by ṽ'p+1 we obtain the condition:

        α = ṽ'p+1 vp+2                               (7)

Moreover, the condition that the length of v'p+2 shall become 1 yields

        γ = 1 / |vp+2 − α v'p+1|                     (8)

which determines γ (except for the sign). The process can be continued.
At the k-th step we have
        v'p+k = γ (vp+k − α1 v'p+1 − ... − αk−1 v'p+k−1)        (9)

The orthogonality conditions give

        αi = ṽ'p+i vp+k                              (10)

The normalisation condition gives

        γ = 1 / |vp+k − α1 v'p+1 − ... − αk−1 v'p+k−1|          (11)
After m − p steps the entire set (4) is replaced by a new, orthogonal and
normalised set of vectors v'p+1, ..., v'm. But we need not stop here.
We can continue by choosing p more vectors in any way we like, as long as
they are linearly independent of the previous set. They too can be
orthogonalised, giving us the p additional

        v'1, v'2, ..., v'p                           (12)

At the end of the process we possess a set of m orthogonal vectors which
can be arranged as the columns of an m x m orthogonal matrix.
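The successive orthogonalisation (5)-(11) is the classical Gram-Schmidt
process; a short sketch (the three starting vectors are assumptions):

    import numpy as np

    def successive_orthogonalisation(vectors):
        """Orthogonalise and normalise a list of vectors, one at a time."""
        result = []
        for v in vectors:
            for u in result:
                v = v - (u @ v) * u           # subtract projections, as in (9)-(10)
            v = v / np.linalg.norm(v)         # the normalisation step (11)
            result.append(v)
        return result

    vs = [np.array([1.0, 1.0, 0.0]),
          np.array([1.0, 0.0, 1.0]),
          np.array([0.0, 1.0, 1.0])]
    V = np.column_stack(successive_orthogonalisation(vs))
    assert np.allclose(V.T @ V, np.eye(3))    # columns form an orthogonal matrix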
We can proceed in identical fashion with the solutions of the homogeneous
equation (2):

        up+1, up+2, ..., un                          (13)

They too can be orthogonalised and here again we can continue the process,
giving us the p additional vectors

        u'1, u'2, ..., u'p                           (14)

We obtain an n x n orthogonal matrix.


Having accomplished this task we are going to establish a relation between
these matrices and the critical vectors y and b of our problem (3). We
wanted to put the vector y completely into the eigen-space V of the operator.
This means that y has to become orthogonal to all the vectors v'p+1, v'p+2,
..., v'm. This again means that y has to become a linear combination of
the vectors (12). Similarly the right side b has to be orthogonal to the
vectors (13) by reason of compatibility and thus must be reducible to a
linear combination of the vectors (14). We can express these two statements
by constructing the semi-orthogonal m x p matrix Vp, formed of the column
vectors (12) and the semi-orthogonal n x p matrix Up, formed of the
column vectors (14). Then

        y = Vp y',    b = Up b'                      (15)

Both y' and b' are free column vectors of p components.
With this transformation our linear system (3) becomes

        A Vp y' = Up b'                              (16)

and, pre-multiplying by Ũp:

        Ũp A Vp y' = b'                              (17)

If we put

the new matrix A' is an n x n square matrix.


On the basis of these deductions we come to the following conclusion.
Although we have not solved any eigenvalue problem except for the complete
solution of the problem associated with the eigenvalue λ = 0, we have
succeeded in reducing our original arbitrarily over-determined and deficient
n x m system to a new well-determined p x p system which has a unique
solution. This is much less than what we attained before in Section 7,
where the matrices U_p and V_p were formed in terms of the principal axes
of A. There the new matrix A' became a diagonal matrix and our equations
appeared in separated form. This is not so now because we still have to
solve a p x p linear system. On the other hand, we are relieved of the
heavy burden of obtaining the complete solution of the principal axis
problem and can use other methods for the solution of a linear system.
This reduction to a free and complete p x p system holds even if our
original n x m system was incompatible. The transformation

    b' = Ũ_p b

automatically eliminates all the "forbidden" components of b, without
altering the "permissible" components. The solution obtained coincides
with the least square solution of our incompatible problem. It also coincides
with the solution obtained with the help of the natural inverse B of the
original matrix A.
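As a hedged numeric illustration of this reduction (numpy assumed, with U_p and V_p taken here from the singular value decomposition rather than from the orthogonalisation process, and all data illustrative), one can check that the p x p system reproduces the natural-inverse solution:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [1.0, 2.0]])                 # n = 3, m = 2, rank p = 1
b = np.array([1.0, 2.0, 1.0])              # a compatible right side

U, s, Vt = np.linalg.svd(A)
p = int(np.sum(s > 1e-12))
Up, Vp = U[:, :p], Vt[:p, :].T             # semi-orthogonal n x p and m x p

A_red = Up.T @ A @ Vp                      # the p x p matrix A'
b_red = Up.T @ b                           # "forbidden" components drop out
y = Vp @ np.linalg.solve(A_red, b_red)     # solution inside the eigen-space V
print(np.allclose(y, np.linalg.pinv(A) @ b))   # matches the natural inverse
```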
Problem 144. Apply the method of successive orthogonalisation to the zero-
solutions of the system (8.6) (cf. Problem 124) and solve Problem 128 by
reducing it to a 2 x 2 system. Show that the solution coincides with the
solution (10.17, 18) obtained before.
Problem 145. Apply the same procedure to the 4 x 4 overdetermined and
incomplete system (13.22) and reduce it to a well-determined 3 x 3 system.
Problem 146. Formulate the method of successive orthogonalisation (5-11)
if we do not normalise the lengths of the successive vectors, thus avoiding the
taking of square roots.
Problem 147. Demonstrate the following properties of the reduced set (19):
a) The matrix A' has no zero eigenvalues.
b) The eigenvalues of A', belonging to the shifted eigenvalue problem,
coincide with the non-zero eigenvalues of the original matrix A.
c) The principal axes of A are obtainable from the principal axes of A' by the
following transformation:

    u_i = U_p u'_i,    v_i = V_p v'_i

3.19. The bilinear identity


In all our dealings with matrices we have to keep in mind that only such
methods and results are applicable to the study of differential equations
which remain interpretable even if the number of equations increases to
infinity. A differential equation can be conceived as a difference equation
with a mesh-size which converges to zero. But with decreasing mesh-size
the number of equations is constantly on the increase and, going to the limit
where ε becomes arbitrarily small, the corresponding number of equations
becomes arbitrarily large. Fredholm in his fundamental investigation of a
certain class of integral equations was able to operate with determinants
whose order increased to infinity. While this was a great achievement,
its use is limited to a very definite class of problems. A much wider out-
look is obtained if we completely avoid the use of determinants and operate
with other concepts which carry over to the continuous case in a natural
way. The ordinary process of matrix inversion is not amenable to the
proper re-interpretation if our linear system has the property that more
and more equations have to be taken into consideration, without end. We
can base, however, the solution of linear systems on concepts which take no
recourse to either determinants or successive eliminations.
The one method which carries over in its totality to the continuous field,
is the method of the principal axes (studied in Section 7), together with the
decomposition of the operator A into a product of three matrices. But
there is also the alternative method of studying the solutions of the homo-
geneous equation—corresponding to the eigenvalue zero—from which
valuable results are obtainable. In fact, this "homogeneous" method can
be extended to a point where it can be successfully employed not only for
the general exploration of the system—its compatibility and deficiency—
but even for its actual solution. In this method a certain identity, the so-
called "bilinear identity", plays a pivotal role. It is of central importance
for the entire theory of linear differential operators, no matter whether they
belong to the ordinary or the partial type.
The scalar product of two vectors, i.e. ũv, gives a 1 x 1 matrix, that is a
pure number. The transpose of this number coincides with itself, or, if it
happens to be a complex number, the transposition changes it to its complex
conjugate. Let us apply this principle to the product x̃Ay where x is an
n-component, y an m-component column vector:

    ỹ Ã x = (x̃ A y)*                                             (1)

(the asterisk refers to "conjugate complex"). We call this relation an
"identity" because it holds (for a given matrix A) for any choice of the two
vectors x and y.
The power of this identity lies in its great generality. First of all, it may
serve even to identify the transposed matrix Ã. In the realm of finite
matrices it is easy enough to transpose the rows and columns of a matrix
and thus define the elements of the transposed matrix Ã:

    ã_ik = a_ki                                                   (2)

In the realm of continuous operators, where the matrices grow beyond all
size, the original definition of Ã loses its meaning. But the identity (1)
maintains its meaning and can serve for the purpose of defining Ã.
The next fundamental application of (1) is the derivation of the
compatibility conditions of the system Ay = b. We will extend this
system—as we have done before in Section 7—by the adjoint system,
considering the complete system

    A y = b                                                       (3)
    Ã u = c                                                       (4)

We can now ask: "Can we prescribe the right sides b and c freely?" The
application of the bilinear identity yields the relation

    ũ b = ỹ c                                                     (5)

which holds, whatever the vectors b and c may be. We have the right to
specify our vectors in any way we like. Let us choose c = 0. Then we
have no longer an identity but an equation which holds for a special case,
namely:

    ũ b = 0                                                       (6)

where u is any solution of the adjoint homogeneous equation

    Ã u = 0                                                       (7)
The result means that the right side of the system (3) must be orthogonal to
any solution of the transposed (adjoint) homogeneous equation. The same
result can be derived for the vector c by making b equal to zero.
We have seen in the general theory of linear systems that these are the
only compatibility conditions that the right side b has to satisfy. In the
special case that the adjoint homogeneous system (7) has no non-vanishing
solution, the system (3) becomes unconstrained (the vector b can be chosen
freely). These results, obtained before by different tools, follow at once
from the bilinear identity (1), which is in fact the only identity that can be
established between the matrix A and its transpose Ã.
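A minimal numeric check of the identity and of the compatibility condition it implies, assuming numpy and staying in the real domain (the example data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
x, y = rng.standard_normal(5), rng.standard_normal(3)

# The identity (1), real case: x.(Ay) = y.(A~x) for every x and y.
print(np.isclose(x @ (A @ y), y @ (A.T @ x)))

# Hence any u with A~u = 0 is orthogonal to a compatible right side b.
u = np.linalg.svd(A)[0][:, 3:] @ rng.standard_normal(2)   # A.T @ u = 0
b = A @ y                                                 # compatible b
print(np.isclose(u @ b, 0.0))
```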
We will now go one step further and consider a linear system which is
either well-determined or over-determined but not under-determined:

    p = m,    n ≥ m                                               (8)

This situation can always be achieved because in the case of an incomplete
system we can add the m − p further equations

    ṽ_i y = 0        (i = p + 1, . . . , m)                       (9)

(orthogonality to the zero-field) and obtain a new system which is complete
(cf. Problem 137). In the case of over-determination we assume that the
compatibility conditions are in fact fulfilled. Now we are interested in
finding the value of one particular component (coordinate) of y, for example
y_i. For this purpose we add to our previous equation (3) one more equation,
considering the complete system

    A y = b,    (0, 0, . . . , 1, . . . , 0) y = y_i               (10)

This means that we consider the value of y_i as one of our data. This, of
course, cannot be done freely, otherwise our problem would not have a
unique solution. But the system (10) is a legitimate over-determined system
and it is now the compatibility condition which will provide the solution by
giving us a linear relation between the components of the right side, i.e. a
linear relation between the vector b and an additional component which is in fact y_i.
This method is of great interest because it brings into evidence a general
feature of solving linear systems which we will encounter again and again
in the study of linear differential equations. There it is called the "method
of the Green's function". It consists in constructing an auxiliary function
which has nothing to do with the data but is in fact entirely determined by
the operator itself. Moreover, this auxiliary function is obtained by solving
a certain homogeneous equation.
According to the general theory the compatibility of the system (10)
demands in the usual way that we solve the adjoint homogeneous equation

    (Ã, e_i) u = 0                                                (11)

and then we must have

    ũ (b, y_i) = 0                                                (12)

for the solvability of our system. But in our case the matrix A has been
extended by an additional row. This row has all zeros, except the single
element 1 in the ith place. Considering this row as a row-vector, the
geometrical significance of such a vector is that it points in the direction of
the ith coordinate axis. Hence we will call it the ith "base vector" and
denote it by e_i, in harmony with our general custom of considering every
vector as a column vector:

    ẽ_i = (0, 0, . . . , 1, . . . , 0)        (1 in the ith place)          (13)

Our system (10) can thus be rewritten in the form

    A y = b,    ẽ_i y = y_i                                       (14)
Now we have to find the vector u of the equation (11). It is a column
vector of n + 1 elements since the matrix of our extended system is an
(n + 1) x m matrix. We will separate the last element of this vector and
write it in the form

    u = (g, γ)                                                    (15)

where g is a column vector of n elements. Hence the determining equation
(11) for the extended vector u becomes

    Ã g + γ e_i = 0                                               (16)

while the compatibility condition (12), applied to our system (14), becomes

    g̃ b + γ y_i = 0                                              (17)
Now the equation (16) represents a homogeneous linear system in the
unknowns g, γ which leaves one factor free. Assuming that γ is not zero
we can divide by it, which is equivalent to saying that γ can be chosen as 1,
or equally as −1. Hence only the two cases γ = 0 and γ = −1 are of essential
interest. But if γ = 0, we merely get the compatibility conditions of the
original system and those have been investigated in advance and found
satisfied. Hence we can assume without loss of generality that γ = −1 and
then the system (16), (17) can be written in the inhomogeneous form

    Ã g = e_i                                                     (18)

and

    y_i = g̃ b                                                    (19)
We see that we could obtain the coordinate y_i on the basis of constructing
an auxiliary vector g which satisfies a definite inhomogeneous equation.
But the inhomogeneity came about by transferring a term from the left to
the right side. In principle we have only made use of the compatibility
condition of a linear system and that again was merely an application of the
bilinear identity. We thus see how the bilinear identity becomes of leading
significance for the solution problem of linear equations and it is this method
which carries over in a most natural and systematic manner to the realm
of linear differential operators.
We still have to answer the question whether the system (18) is always
solvable. This is indeed so because the adjoint homogeneous equation
Av = 0 has no solution (apart from v = 0) according to our basic assumptions.
Another interesting point is that if our original system was an n x m
over-determined system (n > m), the adjoint system (18) is accordingly an
m x n under-determined system which allows an infinity of solutions. And
yet we know from the well-determined character of our problem that y_i
must have a definite value and thus the freedom in the solution g of the
equation (18) can have no influence on y_i. This is indeed the case if we
realise that according to the general theory the general solution of (18) can
be written in the form

    g = g_1 + g_0                                                 (20)

where g_1 is any particular solution and g_0 the solution of the homogeneous
equation

    Ã g_0 = 0                                                     (21)

But then the new value of y_i, obtained on the basis of a g which is different
from g_1, becomes

    y_i = g̃ b = g̃_1 b + g̃_0 b                                   (22)

However, the second term vanishes since b satisfies the compatibility con-
ditions of the original system. And thus the insensitivity of y_i relative to
the freedom of choosing any solution of (18) is demonstrated.
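A small sketch of the whole procedure, assuming numpy and illustrative data: it extracts one component y_i through the auxiliary vector g and confirms that the arbitrariness in g does not affect the result.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))    # over-determined, full column rank
y_true = rng.standard_normal(3)
b = A @ y_true                      # compatible by construction

i = 1
e_i = np.eye(3)[i]
g = np.linalg.lstsq(A.T, e_i, rcond=None)[0]   # one solution of A~g = e_i
print(np.isclose(g @ b, y_true[i]))            # equation (19): y_i = g~b

# Adding any solution of A~g0 = 0 leaves y_i unchanged:
g0 = np.linalg.svd(A)[0][:, 3:] @ rng.standard_normal(2)
print(np.isclose((g + g0) @ b, y_true[i]))
```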
In Section 17 we have encountered Cauchy's integral theorem (17.1)
which was the prototype of a fundamentally important over-determined
system. Here the "auxiliary vector g" is taken over by the auxiliary function

where z is the fixed point at which f(z) shall be obtained. But the conditions
demanded of G(ζ, z) are much less strict than to yield the particular function
(23). In fact we could add to this special function any function g(ζ) which
remains analytical within the domain bounded by C. But the contribution
generated by this additional function is

and this quantity is zero according to (17.2), due to the nature of the
admissible boundary values f(ζ). Quite similar is the situation concerning
the boundary value problem (17.4) where again the function G(T,S) need
not be chosen according to (17.5) but we could add any solution of the
Laplacian equation (17.3) which remains everywhere regular in the given
domain. It is exactly this great freedom in solving the under-determined
equation (21) which renders the over-determined systems so valuable from
the standpoint of obtaining explicit solutions. If the system is well-
determined, the equation (18) becomes likewise well-determined and we have
to obtain a unique, highly specified vector g. This is in the realm of partial
differential equations frequently a difficult task.
We return to our original matrix problem (14). Since the solution of
the system (16) changes with i—which assumes in succession the values
1, 2, . . . , m, if our aim is to obtain the entire y vector—we should indicate
this dependence by the subscript i. Instead of one single equation (18) we
now obtain m equations which can be solved in succession:

    Ã g_i = e_i        (i = 1, 2, . . . , m)                      (25)
These m vectors of n components can be arranged as columns of an n x m
matrix G. Then the equations (19) (for i = 1, 2, . . . , m) can be replaced
by the matrix equation

    y = G̃ b                                                      (26)

Moreover, the base vectors e_1, e_2, . . . , e_m, arranged as columns of a matrix,
yield the m x m unit matrix I:

    (e_1, e_2, . . . , e_m) = I                                   (27)

The m defining equations (25) for the vectors g_1, g_2, . . . , g_m can be united
in the single matrix equation

    Ã G = I                                                       (28)

Taking the transpose of this equation we can write it in the form

    G̃ A = I                                                      (29)

which shows that G̃ can be conceived as the inverse of A:

    A⁻¹ = G̃                                                      (30)

and the solution (26) may be written in the form

    y = A⁻¹ b                                                     (31)

But the sequence of the factors in (29) is absolutely essential and cannot be
changed to

    A G̃ = I                                                      (32)
In fact, if we try to obtain a solution of the equation (32), we see that it is
impossible because the equation (18) would now appear in the form

    A g_i = e_i                                                   (33)

The solvability of this equation for all i would demand that the adjoint
homogeneous equation

    Ã u = 0                                                       (34)

has no non-vanishing solution. But this is not the case if our system is
over-determined (n > m).
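A brief check of this asymmetry, assuming numpy and using the pseudo-inverse as a stand-in for G̃ (the data are illustrative):

```python
import numpy as np

A = np.random.default_rng(2).standard_normal((5, 3))
Gt = np.linalg.pinv(A)                   # stand-in for the matrix G~
print(np.allclose(Gt @ A, np.eye(3)))    # the left inverse (29) exists
print(np.allclose(A @ Gt, np.eye(5)))    # False: no right inverse (32)
```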
Problem 147. Assume that we have found a matrix C such that for all x and y

    ỹ C x = x̃ A y

Show that in this case we have of necessity

    C = Ã
Problem 148. Show with the help of the natural inverse (10.4) that in the case
of a unique but over-determined linear system the "left-inverse" (30) exists
but the "right inverse" (32) does not exist.
Problem 149. Apply the solution method of this section to the solution of the
over-determined (but complete) system (13.18) and demonstrate numerically
that the freedom in the construction of G has no effect on the solution.

Problem 150. Do the same for the over-determined but incomplete system
(13.22) of Problem 136, after removing the deficiency by adding as a fifth
equation the condition

Check the solution with the previous solution (14.9).

3.20. Minimum property of the smallest eigenvalue


We consider a complete n x m system which is either well-determined
or over-determined (p = m, n ≥ m), and we focus our attention on the
smallest eigenvalue of the shifted eigenvalue problem (7.7). Since we have
agreed that we enumerate the λ_k (which are all positive) in decreasing order,
the smallest eigenvalue of our problem will be λ_p for which we may also
write λ_m since we excluded the possibility p < m. This eigenvalue can be
characterised for itself, as a solution of a minimum problem.
We ask for the smallest possible value of the ratio

    (Ay)² / y²                                                    (1)
Equivalent with this problem is the following formulation. Find the
minimum of the quantity (Ay)² if the length of the vector y is normalised
to 1:

    y² = 1                                                        (2)

The solution of this minimum problem is

    Ã A y = λ² y                                                  (3)

for which we can also put

    A y = λ u,    Ã u = λ y                                       (4)

with the added condition that we choose among the possible solutions of
this problem the absolutely smallest λ = λ_m. In consequence of this
minimum problem we have for any arbitrary choice of y:

    (Ay)² ≥ λ_m² y²                                               (5)

or

    |Ay| ≥ λ_m |y|                                                (6)
This minimum (or maximum) property of the mth eigenvalue is frequently
of great value in estimating the error of an algebraic system which approxi-
mates the solution of a continuous problem (cf. Chapter 4.9).
We can parallel this extremum property of λ_m by a corresponding
extremum property of the largest eigenvalue λ_1. We can define λ_1 as the
solution of the problem of making the ratio (1) a maximum. Once more
we obtain the equations (4) with the added condition that we choose among
all possible solutions the one belonging to the absolutely largest λ = λ_1. In
consequence of this maximum property we obtain for an arbitrary choice
of y:

    (Ay)² ≤ λ_1² y²                                               (7)
And thus we can establish an upper and lower bound for the ratio (1):

    λ_m² ≤ (Ay)² / y² ≤ λ_1²                                      (8)

or also

    λ_m ≤ |Ay| / |y| ≤ λ_1                                        (9)
These inequalities lead to an interesting consequence concerning the
elements of the inverse matrix A⁻¹. Let us assume that A is a given n x n
square matrix with non-vanishing determinant. Then the inverse A⁻¹ is
likewise an n x n square matrix. Occasionally the elements of this matrix
are very large numbers which makes the numerical determination of the
inverse matrix cumbersome and prone to rounding errors. Now the elements
of the inverse matrix cannot be arbitrarily large but they cannot be
arbitrarily small either. We obtain a definite check on the inverse matrix
if the two extreme eigenvalues λ_1 and λ_m are known.
We have seen in Section 19 that the ith row of the inverse matrix A⁻¹
can be characterised by the following equation:

    Ã g_i = e_i                                                   (10)
Let us apply the inequality (9) to this particular vector y = g_i, keeping in
mind that the replacement of A by Ã has no effect on the eigenvalues λ_1
and λ_m, in view of the symmetry of the shifted eigenvalue problem (4):

    λ_m ≤ |Ã g_i| / |g_i| ≤ λ_1                                   (11)

or, since Ã g_i = e_i,

    λ_m ≤ |e_i| / |g_i| ≤ λ_1                                     (12)

Since by definition the base vector e_i has the length 1 (cf. 19.13), we obtain
for the ith row of the inverse matrix the two bounds

    λ_1⁻² ≤ g_i² ≤ λ_m⁻²                                          (13)

This means that the sum of the squares of the elements of any row of A⁻¹ is
included between the lower bound λ_1⁻² and the upper bound λ_m⁻². Exactly
the same holds for the sum of the squares of the elements of any column,
as we can see by replacing A by Ã.
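These bounds are easy to test numerically; a hedged sketch assuming numpy, with the singular values playing the role of the λ_k and illustrative random data:

```python
import numpy as np

A = np.random.default_rng(3).standard_normal((4, 4))
lam = np.linalg.svd(A, compute_uv=False)       # lam[0] >= ... >= lam[-1] > 0
rows2 = np.sum(np.linalg.inv(A)**2, axis=1)    # squared row lengths of A**-1
print(np.all(rows2 >= lam[0]**-2), np.all(rows2 <= lam[-1]**-2))
```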
Similar bounds can be established for A itself. By considering A as the
inverse of A⁻¹ we obtain for the square of any row or any column of A the
inequalities

    λ_m² ≤ Σ_k a_ik² ≤ λ_1²                                       (14)

(In the case of Hermitian matrices a_ik² is to be replaced by |a_ik|².)


While up to now we have singled out the special case of n x n non-
singular square matrices, we shall now extend our discussion to the
completely general case of an arbitrary n x m matrix A. Here the
"natural inverse" of A (cf. 10.4) can again be characterised by the equation
(10) but with the following restriction. The right side of the equation
cannot be simply e_i, in view of the compatibility condition which demands
that the right side must be orthogonal to every solution of the transposed
homogeneous system, if such solutions exist. We will assume that we have
constructed all the m − p linearly independent solutions v_α of the equation

    A v_α = 0        (α = 1, 2, . . . , m − p)                    (15)
Furthermore, we want to assume that we have orthogonalised and normalised
these solutions (cf. Section 18), so that

    ṽ_α v_β = δ_αβ                                                (16)

Then the ith row of the natural inverse (10.4) of the matrix can be
characterised by the following equation:

    Ã g_i = e_i − Σ_α v_iα v_α                                    (17)

where v_iα denotes the ith component of the vector v_α. By adding the
correction term on the right side we have blotted out the projection of the
base vector e_i into the non-activated portion of the V-space, without
changing in the least the projection into the activated portion.
The equation (17) has a unique solution if we add the further condition
that g_i must be made orthogonal to all the n − p independent solutions of
the homogeneous equation

    Ã u = 0                                                       (18)

It is this vector g_i which provides us with the ith row of the inverse B. By
letting i assume the values 1, 2, . . . , m we obtain in succession all the rows
of B.
Now we will once more apply the inequality (11) to this vector g_i and
once more an upper and a lower bound can be obtained for the length of the
vector. The difference is only that now the denominator, which appears in
(11), is no longer 1 but the square of the right side of (17) for which we obtain

    (e_i − Σ_α v_iα v_α)² = 1 − Σ_α v_iα²                         (19)

And thus the relation (11) now becomes

    λ_1⁻² (1 − Σ_α v_iα²) ≤ g_i² ≤ λ_p⁻² (1 − Σ_α v_iα²)          (20)
Here again we can extend this inequality to the columns and we can
likewise return to the original matrix A which differs from A⁻¹ only in
having λ_i instead of λ_i⁻¹ as eigenvalues, which reverses the role of λ_1 and λ_m.
We obtain a particularly adequate general scheme if we arrange the
(orthogonalised and normalised) zero-solutions, together with the given
matrix A, in the following fashion.

The u⁰ vectors are arranged in successive columns, the v⁰ vectors in successive
rows. We now have to form a ratio whose numerator is the square of a
certain row of the matrix A while the denominator is the square of the
complementary row, subtracted from 1:

    λ_p² ≤ a_i² / (1 − u_i²) ≤ λ_1²                               (21)

where

    a_i² = Σ_k a_ik²,    u_i² = Σ_β (u⁰_iβ)²                      (22)

A similar relation is obtainable for the columns:

    λ_p² ≤ a'_k² / (1 − v_k²) ≤ λ_1²                              (23)

where

    a'_k² = Σ_i a_ik²,    v_k² = Σ_α (v⁰_kα)²                     (24)
In the special case m = n = p the complementary fields disappear and we
are right back at the previous formulae (13), (14). If n = p, m > n, the
right complement disappears but not the complement below. If m = p,
n > m, the complement below disappears but not the right complement.
The remarkable feature of these inequalities is their great generality and
that they provide simultaneously an upper and a lower bound for the
square of every row and every column of an arbitrary n x m matrix—and
likewise its inverse—in terms of the two characteristic numbers λ_1 and λ_p
(irrespective of the presence of an arbitrary number of zero eigenvalues,
since λ_p is defined as the smallest positive eigenvalue for which the equation
(4) is solvable).
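A hedged numeric test of the row inequality for a rank-deficient matrix, assuming numpy; the data are illustrative and the u⁰ vectors are read off from the singular value decomposition:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])               # rank p = 2
U, s, Vt = np.linalg.svd(A)
p = int(np.sum(s > 1e-12))
U0 = U[:, p:]                                 # zero-solutions of A~u = 0

ratio = np.sum(A**2, axis=1) / (1.0 - np.sum(U0**2, axis=1))
print(np.all(ratio >= s[p - 1]**2 - 1e-9))    # lower bound lam_p**2
print(np.all(ratio <= s[0]**2 + 1e-9))        # upper bound lam_1**2
```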

CHAPTER 4

THE FUNCTION SPACE

Synopsis. A piecewise continuous function need not be given in an
infinity of points but in a sufficiently dense discrete set of points. This
leads to the picture of a space of many dimensions in which a function
is represented by a vector. By this geometrical image the link is
established between matrices and linear differential operators, and the
previous analysis of linear systems becomes translatable to the analysis
of linear differential equations. We learn about the fundamental
importance of the "adjoint equation" and likewise about the funda-
mental role of the given boundary conditions. From now on the given
boundary conditions will not be considered as accidental accessories,
added to the given differential equation, but the left sides of the
boundary conditions become integral parts of the operator. From
now on, if we write D for a differential operator, we will automatically
assume that this operator includes the left sides of the given boundary
conditions, no matter how few or how many such conditions may have
been prescribed.

4.1. Introduction
The close relation which exists between the solution of differential
equations and systems of algebraic equations was recognised by the early
masters of calculus. David Bernoulli solved the problem of the completely
flexible chain by considering the equilibrium problem of a chain which was
composed of a large number of rigid rods of small lengths. Lagrange solved
the problem of the vibrating string by considering the motion of discrete
masses of finite size, separated by small but finite intervals. It seemed
self-evident that this algebraic approach to the problem of the continuum
must lead to the right results. In particular, the solution of a problem in
partial differential equations seemed obtained if the following conditions
prevailed:
1. We replace the continuum of functional values by a dense set of
discontinuous values.
2. The partial derivatives are replaced by the corresponding difference
coefficients, taken between points which can approach each other as much
as we like.
3. We solve the resulting algebraic system and study the behaviour of

the solution as the discrete set of points becomes denser and denser, thus
approaching the continuum as much as we wish.
4. We observe that under these conditions the solution of the algebraic
system approaches a definite limit.
5. Then this limit is automatically the desired solution of our original
problem.
The constantly increasing demands on rigour have invalidated some of
the assumptions which seemed self-evident even a century ago. To give an
exact existence theorem for the solution of a complicated boundary value
problem in partial differential equations can easily tax our mathematical
faculties to the utmost. In the realm of ordinary differential equations
Cauchy succeeded with the proof that the limit of the substitute algebraic
problem actually yields the solution of the original continuous problem.
But the method of Cauchy does not carry over into the realm of partial
differential operators, and even relatively simple partial differential equations
require a thorough investigation if a rigorous proof is required of the kind
of boundary conditions which can guarantee a solution. We do not possess
any sweeping methods which would be applicable to all partial differential
equations, even if we restrict ourselves to the realm of linear differential
operators.

4.2. The viewpoint of pure and applied mathematics


From the standpoint of pure mathematics the existence of the solution of
a certain problem in partial differential equations may be more important
than the actual construction of the solution. But if our aim is to apply
mathematics to the realm of natural phenomena, the shift of emphasis
becomes clearly visible. In a given physical situation we feel certain in
advance that the solution of a certain boundary value problem must exist
since the physical quantity realised in nature is itself the desired solution.
And thus we often encounter the expression: "We know for physical reasons
that the solution exists." This expression is actually based on wrong
premises. While we cannot doubt that a certain physical quantity in fact
realises the solution of a certain boundary value problem, we have no
guarantee that the mathematical formulation of that problem is correct to the
last dot. We have neglected so many accessories of the problem, we have
simplified so much on the given physical situation that we know in advance
that the field equation with which we operate cannot be considered as the
final truth. If, instead of getting zero on the right side we get an ε which is
perhaps of the order 10⁻⁶, we are still perfectly satisfied since we know that
an accuracy of this order goes far beyond our measuring faculties and also
beyond the accuracy of our description. From the standpoint of exact
mathematics no error of any order can be tolerated and even the possession
of a limit which tolerates an error in the equation which can be made as
small as we wish, may not be enough to demonstrate the existence of the
solution of our original problem.

Hence, while on the one hand we have no right to claim that a certain
mathematically formulated boundary value problem must have a solution
"for physical reasons", we can, on the other hand, dispense with the
rigorous existence proofs of pure mathematics, in favour of a more flexible
approach which proves the existence of certain boundary value problems
under simplified conditions. Pure mathematics would like to extend these
conditions to much more extreme conditions and the value of such
investigations cannot be doubted. From the applied standpoint, however,
we are satisfied if we succeed with the solution of a fairly general class of
problems with data which are not too irregular.
The present book is written from the applied angle and is thus not
concerned with the establishment of existence proofs. Our aim is not the
solution of a given differential equation but rather the exploration of the
general properties of linear differential operators. The solution of a given
differential equation is of more accidental significance. But we can hardly
doubt that the study of the properties of linear differential operators can
be of considerable value if we are confronted with the task of solving a
given differential equation because we shall be able to tell in advance what
we may and may not expect. Moreover, certain results of these purely
symptomatic studies can give us clues which may be even of practical help
in the actual construction of the solution.

4.3. The language of geometry


Certain purely analytical relations can gain greatly in lucidity if we
express their content in geometrical language. We have investigated the
properties of n x m matrices and have seen how beneficial it was to associate
the two orthogonal vector spaces N and M with the matrix. Although
originally the unknown y is composed of the elements y_1, y_2, . . . , y_m, and
represents an aggregate of m numbers, we translate this analytical picture
into a geometrical one by thinking of these numbers as the successive
rectangular components of a vector. This vector belongs to a certain space
of m dimensions while the right side b belongs to another space of n
dimensions. By seeing these spaces in front of us and by populating them
with the orthogonal vectors v_1, v_2, . . . , v_m in the one case, and u_1, u_2, . . . , u_n
in the other, we have gained greatly in our understanding of the basic
analytical structure of the matrix A.
The same geometrical ideas will again be of great value in the study of
linear differential operators. Originally we have a certain function

    y = f(x)

and a differential equation by which this function is characterised. Or we
may have a more-dimensional continuum, for example a three-dimensional
space, characterised by the coordinates x, y, z, and we may be interested in
finding a certain function

    u = f(x, y, z)

of these coordinates by solving a certain partial differential equation, such
as the "Laplacian equation"

    ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0
In the study of such problems great clarification can be obtained by a
certain unifying procedure which interprets the originally given problem in
geometrical language and emphasises certain features of our problem which
hold universally for all linear differential operators. The given special
problem becomes submerged in a much wider class of problems and we extend
certain basic tools of analysis to a much larger class of investigations.

4.4. Metrical spaces of infinitely many dimensions


The characteristic feature of this unification is that the language employed
operates with geometrical concepts, taken from our ordinary space con-
ceptions but generalised to spaces of a more abstract character. If we
speak of "geometrical concepts", we do not mean that actual geometrical
constructions will be performed. We accept Descartes' Analytic Geometry
which transforms a given geometrical problem into a problem of algebra,
through the use of coordinates. By the method of translating our problem
into the language of geometry we make the tools of analytical geometry
available to the investigation of the properties of differential operators which
originally are far from any direct geometrical significance.
This great tie-up between the analytical geometry of spaces and the study
of differential operators came about by the concept of the "function space"
which evolved in consequence of Hilbert's fundamental investigation of a
certain class of integral equations. While Fredholm, the originator of the
theory, formulated the problem in essentially algebraic language, Hilbert
recognised the close relation of the problem with the analytic geometry of
second-order surfaces in a Euclidean space of many (strictly speaking
infinitely many) dimensions.
The structure of the space with which we are going to operate, will be of
the ordinary Euclidean kind. The characteristic feature of this space is
that it is homogeneous, by having the same properties at all points and in all
directions, and by allowing a similarity transformation without changing
anything in the inner relations of figures. The only difference compared
with our ordinary space is that the number of dimensions is not 3 as in our
ordinary space, but an arbitrary number, let us say n. And thus we speak
of the "analytic geometry of n-dimensional spaces", which means that a
"point" of this space has not 3 but n Cartesian coordinates x_1, x_2, . . . , x_n;
and that the "distance" from the origin is not given by the Pythagorean law

    s² = x_1² + x_2² + x_3²

but by the generalised Pythagorean law

    s² = x_1² + x_2² + . . . + x_n²


While in our geometrical imagination we are somewhat handicapped by not
being able to visualise spaces of more than 3 dimensions, for our analytical
operations it is entirely irrelevant whether we have to extend a sum over
3 or over a thousand terms. The large number of dimensions is a character-
istic feature of all these investigations, in fact, we have to keep in mind
all the time that strictly speaking the function space has an infinity of
dimensions. The essential difference between a discrete algebraic operator
and a continuous operator is exactly this: that the operation of a continuous
operator has to be pictured in a space of infinitely many dimensions.
However, the transition can be done gradually. We can start with a space
of a large number of dimensions and increase that number all the time,
letting n grow beyond all bounds. We then investigate the limits to which
certain quantities tend. These limits are the things in which we are really
interested, but the very fact that these limits exist means that the difference
between the continuous operator and the discrete operator associated with
a space of many dimensions can be made as small as we wish.

4.5. The function as a vector


The fundamental point of departure is the method according to which we
tabulate a certain function y = f(x). Although x is a continuous variable
which can assume any values within a certain interval, we select a series of
discrete values at which we tabulate f(x). Schematically our tabulation
looks as follows:

    x:  x_1  x_2  x_3  . . .  x_n
    y:  y_1  y_2  y_3  . . .  y_n

For example in the interval between 0 and 1 we might have chosen 2001
equidistant values of x and tabulated the corresponding functional values of
y = eˣ. In that case the x_i values are defined by

    x_i = 0.0005 (i − 1)        (i = 1, 2, . . . , 2001)
while the y-values are the 2001 tabulated values of the exponential function,
starting from y = 1, and ending with y = 2.718281828.
We will now associate with this tabulation the following geometrical
picture. We imagine that we have at our disposal a space of 2001
dimensions. We assign the successive dimensions of this space to the
x-values 0, 0.0005, 0.001, . . . , i.e. we set up 2001 mutually orthogonal co-
ordinate axes which we may denote as the axes X_1, X_2, . . . , X_n. Along
these axes we plot the functional values y_i = f(x_i), evaluated at the points
x = x_i. These y_1, y_2, . . . , y_n can be conceived as the coordinates of a
certain point Y of an n-dimensional space. We may likewise connect the
point Y with the origin 0 by a straight line and arrive at the picture of the
vector OY. The "components" or "projections" of this vector on the
successive axes give the successive functional values y_1, y_2, . . . , y_n.
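A small illustration of this tabulation picture, assuming numpy (the numbers match the example above):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)   # x_i = 0.0005 (i - 1)
y = np.exp(x)                      # the vector OY, component by component
print(y[0], y[-1])                 # 1.0 and 2.718281828...
print(y[500], np.exp(0.25))        # the projection on the axis X_501
```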
At first sight it seems that the independent variable x has dropped
completely out of this picture. We have plotted the functional values y_i
as the components of the vector OY but where are the values x_1, x_2, . . . , x_n?
In fact, these values are present in latent form. The role of the independent
variable x is that it provides an ordering principle for the cataloguing of the
functional values. If we want to know for example what the value f(0.25)
is, we have to identify this x = 0.25 with one of our axes. Suppose we find
that x = 0.25 belongs to the axis X_501, then we single out that particular
axis and see what the projection of the vector OY is on that axis. Our
construction is actually isomorphic with every detail of the original tabula-
tion and repeats that tabulation in a new geometrical interpretation.
We can now proceed even further and include in our construction functions
which depend on more than one variable. Let a function f(x, y) depend on
two variables x and y. We tabulate this function in certain intervals, for
example in similar intervals as before, but now, proceeding in equal intervals
Ax and Ay, independently. If before we needed 2000 entries to cover the
interval [0, 1], we may now need 4 million entries to cover the square
0 ≤ x ≤ 1, 0 ≤ y ≤ 1. But in principle the manner of tabulation has not
changed. The independent variables x, y serve merely as an ordering
principle for the arrangement of the tabular values. We can make a
catalogue in which we enumerate all the possible combinations of x, y
values in which our function has been tabulated, starting the enumeration
with 1 and ending with, let us say, 4 million. Then we imagine a space of
4 million dimensions and again we plot the successive functional values of
u = f(x, y) as components of a vector. This one vector is again a perfect
substitute for our table of 4 million entries.
We observe that the dimensionality of our original problem is of no
immediate concern for the resulting vector picture. The fact that we have
replaced a continuum by a discrete set of values abolishes the fundamental
difference between functions of one or more variables. No matter how many
independent variables we had, as soon as we begin to tabulate, we automatically
begin to atomise the continuum and by this process we can line up any
number of dimensions as a one-dimensional sequence of values.
Our table may become very bulky but in principle our procedure never
changes. We need two things: a catalogue which associates a definite
cardinal number
with the various "cells" in which our continuum has been broken, and a
table which associates a definite functional value with these cardinal numbers,
from 1 to n, where n may be a tremendously large number. Now we take
all these functional values and construct a definite vector of the n-dimensional
space which is a perfect representation of our function. Another function
belonging to the same domain will find its representation in the same
w-dimensional space, but will be represented by another vector because the
functional values, which are the components of the new vector along the
various axes, are different from what they were before.
This concept of a function as a vector looks strange and artificial at the
first moment and yet it is an eminently useful tool in the study of differential
and integral operators. We can understand the inner necessity of this
concept if we approach the problem in the same way as Bernoulli and
Euler and Lagrange approached the solution of differential equations.
Since the derivative is defined as the limit of a difference coefficient, the
replacement of a differential equation by a difference equation involves a
certain error which, however, can be reduced to as little as we like by
making the Δx between the arguments sufficiently small. But it is this
replacement of a differential equation by a difference equation which has a
profound effect on the nature of our problem. So far as the solution is
concerned, we know that we have modified the solution of our problem by a
negligibly small amount. But ideologically it makes a very great difference
to be confronted by a new problem in which everything is formulated in
algebraic terms. The unknown is no longer a continuous function of the
variables. We have selected a discrete set of points in which we want to
obtain the values of the function and thus we have transformed a problem
of infinitely many degrees of freedom to a problem of a finite number of
degrees of freedom. The same occurs with partial differential equations
in which the independent variables form a more than one-dimensional
manifold. In the problem of a vibrating membrane for example we should
find the displacement of an elastic membrane which depends on the three
variables x, y, and t. But if we assume that the material particles of the
membrane are strictly speaking not distributed continuously over a surface
but actually lumped in a large number of "mass-points" which exist in
isolated spots, then we have the right picture which corresponds to the
concepts of the "function space". Because now the displacement of the
membrane is no longer a continuous function of x, y, t but a displacement
which exists only in a large but finite number of grid-points, namely the
points in which the mass-points are concentrated. The new problem is
mathematically completely different from the original problem. We are no
longer confronted with a partial differential equation but with a large
number of ordinary differential equations, because we have to describe the
elastic vibrations that the n mass-points describe under the influence of the
elastic forces which act between them. But now we can go one step still
further. We can carry through the idea of atomisation not only with
respect to space but also with respect to time. If we atomise the time variable,
these N ordinary differential equations break up into a large number of
ordinary algebraic difference equations and the concept of the "derivative"
disappears. And, yet, if our atomisation is sufficiently microscopic by a
sufficiently fine grid-work of points, the difference between the new algebraic
problem and the original continuous problem becomes imperceptible.
4.6. The differential operator as a matrix
Let us elucidate the general conditions with the help of a simple but
characteristic example. We consider the differential equation

    y″(x) = b(x)                                                  (1)
where b(x) is a given function of x. In accordance with the general pro-
cedure we are going to "atomise" this equation by breaking it into a large
number of ordinary algebraic equations. For this purpose we replace the
continuum of x-values by a dense but discrete set of points.
Let us assume that we are interested in the interval x = [0, 1]. We
start by replacing this continuum of values by the discrete set

    x_1 = 0,  x_2 = ε,  x_3 = 2ε,  . . . ,  x_n = (n − 1) ε = 1    (2)
We have chosen for the sake of simplicity an equidistant set of points, which
is not demanded since generally speaking our Δx = ε could change from
point to point. But a constant Δx is simpler and serves our aims equally well.
Now the function y(x) will also be atomised. We are no longer interested
in the infinity of values y(x) but only in the values of y(x) at the selected
points x_i:

    y_i = y(x_i)                                                  (3)
In transcribing the given differential equation we make use of the definition
of a derivative as the limit of a difference coefficient. We do not go, however,
to the limit ε = 0 but let ε be a small but finite quantity. Then the
operation y″(x) has to be interpreted as follows:

    y″(x_i) = [y(x_i + ε) − 2 y(x_i) + y(x_i − ε)] / ε²            (4)
and now we have everything for the reformulation of our problem as an
algebraic problem.
First of all we notice that we cannot write down our equation at the two
endpoints x = 0 and x = 1 because we do not possess the left, respectively
right neighbours y(−ε) and y(1 + ε) which go beyond the limitations of
the given range. Hence we will write down only the equations at the
n − 2 points x_2, x_3, . . . , x_{n−1}:

    (y_{i+1} − 2 y_i + y_{i−1}) / ε² = b_i        (i = 2, 3, . . . , n − 1)      (5)
Here we have a simultaneous set of linear algebraic equations, exactly of
the type studied in the previous chapter:

    A y = b                                                       (6)

The values y_1, y_2, y_3, . . . , y_n can be conceived as the components of a
vector of an n-dimensional space and the same can be said of the "given
right side" b_2, b_3, . . . , b_{n−1} of our equation. We now recognise how natural
it is to conceive the values of y(x) at the selected points x = x_i as the
components of a vector and to do the same with the right side b(x) of the
original differential equation. We have re-formulated our original problem
as a problem of algebra. In this re-formulation the original "function"
y(x) disappeared and became transformed into the vector

    y = (y_1, y_2, . . . , y_n)

The same happened with the given right side b(x) of the differential equation.
This b(x) too disappeared in its original entity and re-appeared on the
platform as the vector

    b = (b_2, b_3, . . . , b_{n−1})
Closer inspection reveals that strictly speaking these two vectors do not
belong to the same spaces. Our algebraic system (5) is in fact not an
n x n system but an (n — 2) x n system. The number of unknowns
surpasses the number of equations by 2. Here we observe already a
characteristic feature of differential operators: They represent in themselves
without further data, an incomplete system of equations which cannot have
a unique solution. In order to remove the deficiency, we have to give some
further data and we usually do that by adding some proper boundary
conditions, that is certain data concerning the behaviour of the solution
at the boundaries. For example, we could prescribe the values of y(0) and
y(1). But we may also give the values y′(x) at the two endpoints which in
our algebraic transcription means the two values

    (y_2 − y_1) / ε,    (y_n − y_{n−1}) / ε
There are many other possibilities and we may give two conditions at the
point x = 0 without any conditions at x = 1, or perhaps two conditions at
the point x = 1 without any conditions at x = 0, or two conditions which
involve both endpoints simultaneously. The important point is that the
differential equation alone, without boundary conditions, cannot give a unique
solution. This is caused by the fact that a linear differential operator of the
order r represents a linear relation between r + 1 functional values.
Hence, letting the operator operate at every point x = x_i, the operator
would make use of r additional functional values which go beyond the limits
of the given interval. Hence we have to cross out r of the equations which
makes the algebraic transcription of an rth order linear differential operator
to a deficient system of n — r equations between n unknowns. The
remaining r equations have to be made up by r additional boundary conditions.


(In the case of partial differential operators the situation is much more
complicated and we cannot enumerate so easily the degree of deficiency of
a given operator.) In any physical problem the boundary conditions, no
less than the differential operators, are dictated by the physical situation.
For example, if a differential equation is deducible from the "principle of
least action", this principle provides also the natural boundary conditions of
the problem (cf. Chapter 8.17).
But what happened in our algebraic transcription to the differential
operator itself? The function y(x) became a vector and the same happened
to the function b(x). The given differential equation (1) became resolved
in a set of linear algebraic equations (5). This set of equations has a definite
matrix A and it is this matrix which is determined by the given differential
operator. Indeed, the coefficients of the linear system (5) came about on
account of the fact that the differential operator y"(x) was replaced by the
corresponding difference coefficient and the equation written down a
sufficient number of times.
It is important to emphasise, however, that the matrix of the final system
is determined not only by the differential operator but by the given boundary
conditions. For example the matrix of the linear system (5) is an (n — 2) x n
matrix which is thus 2 equations short of a well-determined system. But
let us now add the boundary conditions

    y(0) = y_1 = 0,    y(1) = y_n = 0

In this case y_1 and y_n disappear on the left side since they are no longer
unknowns. We now obtain a system of n − 2 rows and columns and the
resulting matrix is a square matrix which can be written out as follows, if
we take out the common factor 1/ε²:

    | −2    1    0    .  .  .    0    0 |
    |  1   −2    1    .  .  .    0    0 |
    |  0    1   −2    .  .  .    0    0 |
    |  .    .    .    .  .  .    .    . |
    |  0    0    0    .  .  .    1   −2 |
Although we have chosen a particularly simple example, it is clear that
by the same method an arbitrarily complicated ordinary differential equation
(augmented by the proper boundary conditions) can be transcribed into a
linear algebraic system of equations, of the general form (6).
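As a hedged sketch of the whole transcription, assuming numpy: the code below builds the tridiagonal matrix just described for y″ = b with y(0) = y(1) = 0 and checks the solution against a case where the exact answer is known.

```python
import numpy as np

n = 101                                   # grid points, eps = 1/(n - 1)
eps = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
b = np.sin(np.pi * x[1:-1])               # right side at the interior points

# The (n - 2) x (n - 2) matrix: -2 on the diagonal, 1 on both co-diagonals.
A = (np.diag(-2.0 * np.ones(n - 2)) +
     np.diag(np.ones(n - 3), 1) +
     np.diag(np.ones(n - 3), -1)) / eps**2

y = np.linalg.solve(A, b)                  # y'' = b with y(0) = y(1) = 0
exact = -np.sin(np.pi * x[1:-1]) / np.pi**2
print(np.max(np.abs(y - exact)))           # small: O(eps**2) error
```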
Problem 151. Find the matrix of the previous problem (1) if the boundary
conditions are modified as follows: y(0) = 0, y′(0) = 0. Show that the matrix
A now becomes triangular.

Problem 152. Transcribe the differential equation

into a matrix equation, assuming a constant Δx = ε. Let x range between
1 and 2 and let the boundary condition be: y(1) = 0.
Problem 153. Find the matrix formulation of Problem 151 if the differential
equation (1) is given as a pair of first order equations:

    y_1′(x) = y_2(x),    y_2′(x) = b(x)

[Hint: Combine y_1(x), y_2(x) into one column vector of 2n components.]

4.7. The length of a vector


Up to now we have been studying the transcription of a given differential
equation into the language of algebra by breaking up the continuum into a
dense but discrete set of points. We have not yet discussed what is going
to happen as we bring our gridpoints closer and closer together. We are
obviously interested in the limit to which our construction tends as the
number of gridpoints increases to infinity by letting Δx = ε decrease to
zero. This limit process is a characteristically new feature which has to be
added to our ordinary matrix algebra, in order to include the study of
linear differential operators within the framework of matrix calculus.
Let us recall that a matrix A operates in conjunction with two vector
spaces. The operation Ay = 6 transforms the vector y into the vector b
and thus we have the space M in which y is located and the space N in
which b is located. We were able to span these spaces with certain mutually
orthogonal principal axes which established a natural "frame of reference"
in these spaces. It will be our aim to re-interpret these results in relation
to those matrices which can be associated with linear differential operators.
However, a certain round-aboutness cannot be avoided since we had to
atomise the continuum in order to arrive at the algebraic matrix picture and
now we have to try to bridge the gap by letting the basic grid-parameter e
go to zero.
The characteristic feature of a Euclidean metrical space is that the
various coordinate axes are on an equal level. We can label our axes by
the sequence 1, 2, 3, . . . , n, but this labelling is arbitrary and does not
correspond to an inherent geometrical property of space. The space is
homogeneous, it has the same properties in every direction and thus allows
arbitrary rotations of the basic axes. Hence the components y_1, y_2, . . . , y_n
of a vector are of an accidental character but there is a definite "invariant"
in existence, namely the length of the vector which is of an absolute and
immutable character. This is given by the operation

It must be our aim to save this valuable feature of a metrical space in relation
to the study of continuous operators. If we proceed in the way as we have
done before, it is clear that we will not arrive at a useful concept. If we
define the components y_i of the vector y simply as the values of y(x) at the
points x_i:

    y_i = y(x_i)                                                  (2)
the quantity (1) would be bare of any meaning as Δx = ε approaches zero
since we would get a steadily increasing quantity which grows to infinity.
We succeed, however, by a slight modification of the previous procedure.
Let us agree that we include the Δx in the definition of y_i by putting

    y_i = y(x_i) √(Δx_i)                                           (3)

(the equality of all Δx_i = x_{i+1} − x_i is not demanded). If we now form the
scalar product (1), we actually get something very valuable, namely

    Σ_i y_i² = Σ_i y²(x_i) Δx_i                                    (4)

and this quantity approaches a very definite limit as Δx_i decreases to zero,
namely the definite integral

    ∫ y²(x) dx                                                     (5)
The same quantity retains its significance in the multi-dimensional case,
studied in Section 5. We have broken up the continuum into a large
number of "cells" which cover the multi-dimensional range of the inde-
pendent variables x, y, z, . . . . In defining the vector-components Φ_i
associated with this manifold we do not take merely the value of
Φ(x, y, z, . . .) at the midpoint of the cell as our component Φ_i but again we
multiply by the square root of ε_i where this ε_i is now the volume of the ith
cell:

    Φ_i = Φ(x_i, y_i, z_i, . . .) √ε_i                             (6)

This definition has once more the great value that in the limit it leads to a
very definite invariant associated with the multi-dimensional function
Φ(x, y, z, . . .), namely the definite integral

    ∫ Φ² dτ                                                        (7)
where dτ is the volume element of the region and the integration is extended
over the complete range of all the variables.
In fact, this generalisation to the multi-dimensional case is so natural
that we often prefer to cover the general case with the same symbolism,
denoting by x an arbitrary point of the given multi-dimensional region and
by dx the volume element of that region. The formula (7) may then be
written in the form

    ∫ Φ²(x) dx                                                     (8)

in full analogy to the formula (5), although the symbol x refers now to a
much more complicated domain and the integration is extended over a
multi-dimensional region.
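A short numeric check of this limit process, assuming numpy: with the √Δx scaling the squared length of the vector tends to the integral of y²(x), here for y = eˣ on [0, 1].

```python
import numpy as np

for n in (10, 100, 1000):
    dx = 1.0 / n
    x = np.linspace(0.0, 1.0, n, endpoint=False)   # one point per cell
    y = np.exp(x) * np.sqrt(dx)                     # scaled components (3)
    print(n, np.sum(y**2))            # tends to (e**2 - 1)/2 = 3.19452...
```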
4.8. The scalar product of two vectors
If two vectors f and g are placed in a Euclidean space, their mutual
position gives rise to a particularly important invariant, the "scalar
product" of these two vectors, expressible in matrix language by the product

In particular, if this product is zero, the two vectors are orthogonal to each
other. We can expect that the same operation applied to the space of
functions will give us a particularly valuable quantity which will be of
fundamental significance in the study of differential operators. If we
return to Section 7, where we have found the proper definition of the vector
components in the space of functions, we obtain—by the same reasoning
that gave us the "norm" (length square) of a function—that the "scalar
product" of the two functions f(x) and g(x) has the following significance:

The same holds in the multi-dimensional case if we interpret the point "x"
in the sense of the formula (7.8) and dx as the volume-element of the domain:

    ∫ f(x) g(x) dx                                                 (3)
We speak of the "orthogonality" of two functions if this integral over the
given domain of the one (one or more) variables comes out as zero.
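For example (a sketch assuming numpy), sin πx and sin 2πx are orthogonal over [0, 1], and the discrete scalar product with the √dx scaling reproduces this:

```python
import numpy as np

n = 1000
dx = 1.0 / n
x = np.linspace(0.0, 1.0, n, endpoint=False)
f = np.sin(np.pi * x) * np.sqrt(dx)
g = np.sin(2.0 * np.pi * x) * np.sqrt(dx)
print(f @ g)     # practically zero: the two functions are orthogonal
```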

4.9. The closeness of the algebraic approximation


The ε-method which replaces a given linear differential equation by a
finite system of algebraic equations, is not more than an approximation
procedure whose error we hope to reduce to a negligibly small quantity. Is
it possible to estimate how nearly the solution of the algebraic system will
approach the solution of the original problem? In order to answer this
problem we will consider the exact solution y(x) of the given differential
equation, but restricted to those specific x_i values which came into existence
by the atomisation of the continuum. We form the difference coefficients
which in the algebraic treatment take the place of the original derivatives.
Writing down all the algebraic equations of the substitute system, we can
restore the original differential equation (formulated at the selected discrete
points x = Xi) if we put on the right side of the substitute system the
difference between derivative and difference coefficient. This difference can
be estimated on the basis of the repeated application of the truncated
Taylor series

    y(x + ε) = y(x) + ε y′(x) + (ε²/2) y″(x_a)

(where x_a denotes some intermediate point), which may be written in the
form

    [y(x + ε) − y(x)] / ε − y′(x) = (ε/2) y″(x_a)
Higher derivatives can be conceived as repeated applications of the d/dx
process and thus the same estimation procedure is applicable for the
replacement of higher order derivatives by the corresponding algebraic
difference quotients.
Under these circumstances we can obtain definite error bounds for the
algebraic system

    A y = b                                                        (3)

which—through the process of atomisation—takes the place of the original
differential equation

    D y(x) = b(x)                                                  (4)
The vector y of the algebraic equation should represent y(x) at the selected
points x = x_i but in actual fact y cannot be more than an approximation
of y(x_i). Hence we will replace y by ȳ and write the algebraic system (3)
in the form

$A\,\bar{y} = b$

while the correct y = y(x_i) satisfies the equation

$A\,y = b + \delta$
The error vector δ can be estimated on the basis of the given differential
operator D and the right side b(x). Assuming continuity of b(x) and
excluding any infinities in the coefficients of the operator Dy(x), we can
establish an error bound for the value of the component δ_i at the point
x = x_i:

$|\delta_i| \le h_i\,\beta_i$
where h_i is the grid parameter which becomes smaller and smaller as N,
the number of grid points in our atomisation process, increases, while β_i is
a constant for the entire domain which remains under a certain finite bound.
The fact that the individual error in each algebraic equation can be made
as small as we wish by reducing the grid parameter to an appropriately small
value (at all points x_i), does not guarantee that the corresponding error in
y will also remain small. We will put

$\eta = y - \bar{y}$

obtaining for the vector η = (η₁, η₂, ..., η_N) the equation
$A\,\eta = \delta \qquad (9)$
Now the general method of solving an algebraic system has to be modified
to some extent for our present problem, in view of the fact that N, the
number of equations, grows to infinity. Without adequate precautions the
lengths of the vectors would grow to infinity and we should not be able to
obtain any finite limits as N grows larger and larger. We are in fact
interested in a definite point x = (x₁, x₂, ..., x_s) of the continuum, although
its algebraic labelling x_i changes all the time, in view of the constantly
increasing number of points at which the equation is applied.
We avoid the difficulty by a somewhat more flexible formulation of the
bilinear identity that we have discussed earlier in Chapter 3.19. In that
development the "scalar product" of two vectors x and y was defined on
the basis of

$x \cdot y = \sum_{i=1}^{N} x_i\,y_i$
(omitting the asterisk since we want to stay in the real domain). However,
the bilinear identity remains unaffected if we agree that this definition shall
be generalised as follows:

$x \cdot y = \sum_{i=1}^{N} p_i\,x_i\,y_i \qquad (11)$

where the weight factors p₁, p₂, ..., p_N are freely at our disposal, although
we will restrict them in advance by the condition that we will admit only
positive numbers as weights.
Now in the earlier treatment we made use of the bilinear identity for the
purpose of solving the algebraic system (3). We constructed a solution of
the equation

Then the bilinear identity gave us

and we obtained the special component y_i by defining the vector as the
base vector (3.19.13), in which case we got

But now, if we agree that the scalar product shall be defined on the basis
of (11), we will once more obtain the solution with the help of (14) but the
definition of g_i has to occur on the basis of

Now the discrete point x_i was connected with a definite cell of our continuum
of s dimensions. That cell had the volume τ_i, while the total volume τ of
the continuum is the sum total of all the elementary cells:

$\tau = \sum_{i=1}^{N} \tau_i$

With increasing N each one of the individual volumes τ_i shrinks to zero,
while the total volume τ remains unchanged.
We will now dispose of our p_i in the following manner. We will identify
them with the elementary volumes τ_i:

$p_i = \tau_i$
With this definition the lengths of our vectors do not increase to infinity
any more, in spite of the infinity of N. For example the square of the
error vector δ on the right side of (9) now becomes

$\|\delta\|^2 = \sum_{i=1}^{N} \tau_i\,\delta_i{}^2$

This can be put in the form

if we define h² by

This h is a quantity which cannot become larger than the largest of all the h_i;
hence it goes to zero with ever increasing N.
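
The effect of the weights p_i = τ_i can be watched directly. In the sketch below (an editorial illustration with an arbitrary sample function on [0, 1]) the unweighted square length of the sampled vector grows with N, while the weighted square length settles down to the integral of the squared function:

    import numpy as np

    for N in (10, 100, 1000, 10000):
        x = (np.arange(N) + 0.5) / N
        tau = 1.0 / N                  # elementary cell volume tau_i
        f = np.exp(x)
        print(N, np.dot(f, f), tau * np.dot(f, f))
    # The weighted length converges to the integral of e^(2x) over [0, 1],
    # namely (e^2 - 1)/2 = 3.1945...; the unweighted length grows like N.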
Let us now solve the system (9) for the ith component, on the basis of (14):

Applying to this sum the algebraic form of Cauchy's inequality (2.4.13) we obtain

As far as the second sum goes, we have just obtained the bound (20). In
order to bound the first sum we follow the procedure of Section 3.20, with
the only modification that now the ratio (3.20.1) should be defined in
harmony with our extended definition of the scalar product:

However, the problem of minimising this ratio yields once more exactly
the same solution as before, namely the eigenvalue problem (3.20.4), to be
solved for the smallest eigenvalue λ_m. And thus we obtain, as before in
(3.20.6):

This relation, if applied to the vector y = g_i, yields the estimate


and substitution in (23) gives the upper bound

We see that the difference between the algebraic and the continuous
solution converges to zero as N increases to infinity, provided that λ_m remains
finite with increasing N. If it so happens that λ_m converges to zero as N
increases to infinity, the convergence of the algebraic solution to the correct
solution can no longer be established. The behaviour of the smallest
eigenvalue of the matrix A with increasing order N of the matrix is thus of
vital importance.*
4.10. The adjoint operator
Throughout our treatment of linear systems in Chapter 3 we have pointed
out the fundamental importance of the transposed matrix Ã and the
associated transposed equation Ãu = 0. Now we deal with the matrix
aspects of linear differential operators and the question of the significance
of Ã has to be raised. Since the differential operator itself played the role
of the matrix A, we have to expect that the transposed matrix Ã has to
be interpreted as another linear differential operator which is somehow
uniquely associated with A.
In order to find this operator, we could proceed in the following fashion.
By the method of atomisation we transcribe the given differential equation
into a finite system of algebraic equations. Now we abstract from these
equations the matrix A itself. We transpose A by exchanging rows and
columns. This gives rise to a new linear system and now we watch what
happens as ε goes to zero. In the limit we obtain a new differential
equation and the operator of this equation will give the adjoint differential
operator (the word "adjoint" taking the place of "transposed").
While this process is rather cumbersome, it actually works and in
principle we could obtain in this fashion the adjoint of any given linear
differential operator. In practice we can achieve our aim much more
simply by a method which we will discuss in the following section. It is,
however, of interest to construct the associated operator by actual matrix
transposition.
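
For a toy operator the transposition can be carried out literally. The sketch below (an editorial illustration, assuming a crude forward-difference atomisation of Dv = v' with the boundary condition v(0) = 0 built into the matrix) exhibits the behaviour discussed in the following paragraphs: the transposed matrix is the atomised form of −u', with the boundary condition moved to the other endpoint:

    import numpy as np

    N = 5
    h = 1.0 / N
    # Row i of A represents the difference coefficient (v_i - v_(i-1)) / h;
    # the condition v_0 = 0 is absorbed into the first row.
    A = (np.eye(N) - np.eye(N, k=-1)) / h

    # Row i of the transpose represents (u_i - u_(i+1)) / h, i.e. the
    # difference coefficient of -u'(x), with u_N = 0 absorbed into the last row.
    print(A)
    print(A.T)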
The following general observations should be added. The "transposed"
operator is called "adjoint" (instead of transposed). Moreover, it is
important to observe that the matrix of the transcribed algebraic system is
decisively influenced by the boundary conditions of the problem. The same
differential operator with different boundary conditions yields a different
matrix. For example the problem (6.1) with the boundary conditions (6.10)
yields the matrix (6.11) which is symmetric. Hence the transposed matrix
coincides with the original one and our problem is "self-adjoint". But the
* It should be pointed out that our conclusion does not prove the existence of the
solution of the original differential equation (2). What we have proved is only that
the algebraic solution converges to the desired solution, provided that that solution exists.

change of the boundary conditions to those of Problem 151 transforms the
matrix to a triangular one and the problem ceases to be self-adjoint. Hence
it is important to keep in mind that the boundary conditions belong to the
operator and cannot be separated from it. Differential equation plus boundary
conditions determine the matrix A of the problem and correspondingly also
the transposed matrix Ã. If we speak of the "adjoint operator", we have
in mind a certain differential operator together with certain boundary
conditions which are corollaries of the originally given boundary conditions.
The originally given boundary conditions may be either of the homogeneous
or the inhomogeneous type, that is, certain boundary values may be prescribed
as zero ("homogeneous boundary conditions") or as certain given non-
vanishing values ("inhomogeneous boundary conditions"). However, the
given value of a certain quantity always belongs to the right side of the
equation and not to the left side where we find the matrix A of our problem.
Hence from the standpoint of finding the transposed matrix Ã it is entirely
immaterial how the boundary values are prescribed, whether the demand is
that a certain combination of function and derivatives on the boundary
shall take the value zero or some other value. Hence in the problem of
finding the adjoint operator we can always dispense with inhomogeneous
boundary conditions and replace them by the corresponding homogeneous
conditions. Our problem becomes self-adjoint if the adjoint differential
operator coincides with the original one, and the adjoint boundary conditions
coincide with the original ones. For example in the problem (6.11) the
adjoint operator coincides with the original operator (the matrix being
symmetric) and the adjoint boundary conditions become

That the original boundary conditions (6.10) had the non-zero values /?i, j8n
on the right side, is operationally immaterial since given numerical values
can be no parts of an operator. What is prescribed is the decisive question;
the accidental numerical values on the right side are immaterial.
Problem 154. Denoting the adjoint operator by D̃u(x) find the adjoints of the
following problems:
1.
[Answer:
2.
[Answer:
3.

[Answer:
4. no boundary conditions
[Answer:

4.11. The bilinear identity


We now turn to the fundamental bilinear identity which provides us with
a much simpler method of obtaining the adjoint of a given linear differential
operator. In the matrix field we encountered this identity in Chapter 3,
Section 19. It expresses a fundamental relation between an arbitrary matrix
A and its transpose Ã. This relation can be utilised even as a definition of
Ã, because, if we find a matrix Ã which satisfies the identity

$u \cdot (A v) = v \cdot (\tilde{A} u) \qquad (1)$

we know that Ã is the transposed matrix. In the realm of finite matrices
this procedure is of smaller importance because the exchange of rows and
columns is a simple enough operation which needs no substitute. But if we
have matrices whose order increases to infinity, the operation loses its
simplicity, and its replacement by the bilinear identity may lead to some-
thing which can be handled with much greater ease. In fact, we arrive here
at a point which becomes of central significance for the general theory of
differential operators. By translating the identity (1) into the realm of
differential operators we put the basic isomorphism between linear differential
operators and matrices to good use, without in every single case having to
transcribe the operator into a discrete matrix and then go to the limit where
the basic grid-parameter ε goes to zero. We can stay from the beginning
in the field of differential operators.
There is, however, a peculiarity of differential operators to which special
attention has to be paid. In the identity (1) the vectors u and v are com-
pletely freely choosable. But if we think of the function space in which
the vector u becomes a function u(x) and the vector v a function v(x), the
demands of continuity and differentiability enter. A given differential
operator Dv(x) cannot operate on an arbitrary function v(x) but only on a
function v(x) which is in fact differentiable to the extent demanded by the
given operator. For example, if Du(x) is given as u"(x), it is self-evident
that the function u(x) must be twice differentiable in order that u"(x) shall
have a meaning. Here we observe the first important deviation from the
algebraic case, caused by the fact that the "arbitrary vectors" u and v
have to be re-interpreted as "arbitrary functions within a certain class of
continuous and differentiable functions". The "identity" will then refer
to the existence of a certain relation which holds for any pair of functions
chosen from that class.
Another restriction is caused by the conditions on the boundary. We
have seen that "the matrix A " involves more than the differential operator
Dv(x). It involves also certain boundary conditions demanded of v(x). In
a similar way the transposed matrix Ã will involve more than the adjoint
operator D̃u(x). It will also involve certain boundary conditions demanded
of u(x). Hence Ã includes the adjoint operator together with the adjoint
boundary conditions. But then it will be of definite advantage to split
our task into two parts by finding first the adjoint differential operator and

then the adjoint boundary conditions. Accordingly we shall find two
formulations of the bilinear identity (1) useful, when transcribing it into the
realm of differential operators, the one referring to functions u(x), v(x) which
satisfy the natural continuity and differentiability conditions demanded by
the operators Dv(x) and D̃u(x), without specific boundary conditions and
another referring to functions u(x), v(x) which satisfy the demanded
differentiability conditions within the domain and the boundary conditions
on the boundary of the domain.

4.12. The extended Green's identity


We have seen in Section 8 how the scalar product of two vectors has to
be formulated in relation to the demands of the function space. Accordingly
we will have to rewrite the bilinear identity (11.1) in the following form

$(u, Dv) = (v, \tilde{D}u) \qquad (1)$

which means

$\int u^*(x)\,Dv(x)\,dx = \int v(x)\,[\tilde{D}u(x)]^*\,dx \qquad (2)$

and for the case of real functions and operators:

$\int [u(x)\,Dv(x) - v(x)\,\tilde{D}u(x)]\,dx = 0 \qquad (3)$
This fundamental identity, which transcribes the bilinear identity of matrix
calculus into the realm of function space, is called "Green's identity". It
has the following significance. To any given linear differential operator
(ordinary or partial) D we can find a uniquely determined adjoint operator
D̃ such that the definite integral of the left side, extended over the given
domain, gives zero for any pair of functions u(x), v(x) which are sufficiently
differentiable and which satisfy the proper boundary conditions.
A closer analysis of the general complex case (2) reveals that even in the
presence of complex elements the formulation (3) of Green's identity has its
advantages. Let us observe that the functions u(x), v(x) of the identity (2)
are arbitrary functions (except for some boundary conditions and the general
conditions of continuity and limited differentiability). Hence we could
replace u*(x) by u(x) without loss of generality. If we do so we notice that
the entire difference between (2) and (3) is that the D̃ of equation (3) is
replaced by D̃*. Let us now agree to call the operator D̃, as defined by (3),
the "algebraic adjoint" of D. Then, if we want to obtain the "Hermitian
adjoint" of D (and "self-adjoint" always refers to the identity of the given
operator with its Hermitian adjoint), all we have to do is to change in D̃
every i to −i. For this reason we will henceforth drop the formulation (2)
of Green's identity and operate consistently with the formulation (3), with
the understanding that the Hermitian adjoint of D shall be denoted by D̃*,
while D̃ refers to the algebraic adjoint.
Before we come to closer grips with Green's identity, we will first formulate
it in somewhat more general terms. We now assume a pair of functions

u(x), v(x) which satisfy the demanded differentiability conditions but are
not subjected to any specific boundary conditions:

$\int [u(x)\,Dv(x) - v(x)\,\tilde{D}u(x)]\,dx = \text{boundary term} \qquad (4)$
The result of the integration is no longer zero but something that depends
solely on the values of u(x), v(x)—and some of their derivatives—taken on
the boundary of the region. This is the meaning of the expression "boundary
term" on the right side of (4). The fundamental identity (4) is called the
"extended Green's identity".
In order to see the significance of this fundamental theorem let us first
restrict ourselves to the case of a single independent variable x. The given
operator Dv(x) is now an ordinary differential operator, involving the
derivatives of v(x) with respect to x. Let us assume that we succeed in
showing the validity of the following bilinear relation:

$u\,Dv(x) - v\,\tilde{D}u(x) = \frac{d}{dx}\,F(u, v) \qquad (5)$
where on the right side F(u, v) is an abbreviation for some bilinear function
of u(x) and v(x), and their derivatives. If we are able to prove (5), we shall
at once have (4) because, integrating with respect to x between the limits a
and b we obtain

$\int_a^b [u\,Dv - v\,\tilde{D}u]\,dx = F(u, v)\Big|_a^b \qquad (6)$
and this equation is exactly of the form (4). Let us then concentrate on the
proof of (5).
The operator Dv(x) is generally of the form

$Dv(x) = \sum_{k=0}^{n} p_k(x)\,v^{(k)}(x) \qquad (7)$

and it suffices to consider a typical term. The following relation is familiar
from the method of integrating by parts:

$f(x)\,g'(x) = \frac{d}{dx}\,[f(x)\,g(x)] - f'(x)\,g(x)$

If we identify g(x) with v(x) and f(x) with p_k(x)u(x), we obtain

$p_k(x)\,u(x)\,v'(x) = \frac{d}{dx}\,[p_k(x)\,u(x)\,v(x)] - [p_k(x)\,u(x)]'\,v(x)$

and we see that we have obtained the adjoint operator associated with the
term p_k(x)v^{(k)}(x):
$(-1)^k\,\frac{d^k}{dx^k}\,[p_k(x)\,u(x)]$
If we repeat the same procedure with every term, the entire operator D̃u(x)
will be constructed.
We have thus obtained a simple and powerful mechanism by which to
any given Dv(x) the corresponding D̃u(x) can be obtained. The process
requires no integrations but only differentiations and combinations of terms.
We can even write down explicitly the adjoint of the operator (7):

$\tilde{D}u(x) = \sum_{k=0}^{n} (-1)^k\,\frac{d^k}{dx^k}\,[p_k(x)\,u(x)]$
The adjoint boundary conditions, however, have not yet been obtained.
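
The explicit rule lends itself to mechanical verification. The following sketch (an editorial check, assuming a general second-order operator with coefficients p₀, p₁, p₂) lets a computer-algebra system perform the differentiations; its output should agree with the answer to Problem 155 below:

    import sympy as sp

    x = sp.symbols('x')
    u = sp.Function('u')(x)
    p = [sp.Function('p%d' % k)(x) for k in range(3)]

    # Adjoint of Dv = p0 v + p1 v' + p2 v'': the term p_k v^(k) contributes
    # (-1)^k d^k/dx^k (p_k u).
    Du = sum((-1)**k * sp.diff(p[k] * u, x, k) for k in range(3))
    print(sp.expand(Du))
    # p2 u'' + (2 p2' - p1) u' + (p2'' - p1' + p0) u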
Problem 155. Consider the most general linear differential operator of second
order:

Find the adjoint operator.


[Answer:

Problem 156. Find the most general linear differential operator of the second
order which is self-adjoint.
[Answer:

Problem 157. Find the adjoint of the system (6.13):

[Answer:

4.13. The adjoint boundary conditions


We now come to the investigation of the adjoint boundary conditions
which are of necessity associated with the adjoint operator. Although in
our mind differential operator and boundary conditions are separated, in
actual fact the boundary conditions are inseparable ingredients of the adjoint
operator without which it loses its significance. We have merely divided
our task by operating first with the extended Green's identity (12.4). This
identity avoids the question of boundary conditions by subjecting u(x) and
v(x) to some differentiability conditions only, without boundary conditions.
But our final goal is to arrive at the theorem (12.3), the Green's identity
without boundary term, which represents the true transcription of the bilinear
identity and thus the true definition of the adjoint operator. The extended
Green's theorem (12.4) will change over into the homogeneous form (12.3)

if we subject the functions u(x), v(x) to the proper boundary conditions.
As far as the function v(x) goes, certain boundary conditions will probably
be prescribed since a differential equation without boundary conditions
represents an incomplete system which can have no unique solution. For our
present purposes we take no notice of the prescribed values (which belong
to the right side of the equation and have no operational significance). Any
inhomogeneous boundary condition is replaced by the corresponding homo-
geneous condition. For example, if the value of v(x) is prescribed as 1 at
x = a and as −1 at x = b, we will temporarily replace these conditions by
the fictitious conditions

$v(a) = 0, \qquad v(b) = 0$
We shall see later that by knowing how to handle homogeneous boundary
conditions we also know how to handle inhomogeneous boundary conditions.
We now examine the boundary term of the extended Green's identity
(12.6). Certain terms will vanish on account of the prescribed boundary
conditions for v(x) and their derivatives. Some other terms will not drop
out. We will now apply the following general principle: we impose on u(x)
and its derivatives the minimum number of conditions which are necessary
and sufficient for the vanishing of the boundary term.
By this principle the adjoint boundary conditions are uniquely determined,
irrespective of how complete or incomplete the original set of boundary
conditions has been. The transpose of a matrix always exists, no matter
how incomplete or over-determined the original matrix may be. From the
fact that the transpose of an n × m matrix is an m × n matrix we can
conclude that the degree of determination of D and D̃ will be in a reciprocal
relation to each other: the more over-determined the original operator D is,
the more under-determined is D̃ and vice versa. It might happen for example
that the original problem has so many boundary conditions prescribed that
the boundary term of the extended Green's identity (12.4) vanishes without
any further conditions. In this case the adjoint operator D̃u(x) is not
subjected to any boundary conditions. Again, it might happen that the
original problem is completely free of any boundary conditions. In this case
the adjoint problem will acquire an overload of boundary conditions, in
order to make the boundary term of (12.4) vanish without the help of the
function v(x).
As an example let us consider Newton's equation of motion: "mass times
acceleration equals moving force". We normalise the mass to 1 and
consider motion in one dimension only:

$v''(x) = \beta(x) \qquad (2)$

(x has the significance of the time t). The boundary conditions are that
at x = 0 the displacement v(x) and the velocity v'(x) are zero:

$v(0) = 0, \qquad v'(0) = 0 \qquad (3)$
The range of interest is x = [0, 1].



We go through the regular procedure of Section 12, obtaining

$\int_0^1 (u\,v'' - v\,u'')\,dx = \big[u\,v' - u'\,v\big]_0^1 \qquad (5)$
We investigate the boundary term of the right side. The given boundary
conditions are such that the contribution at the lower limit x = 0 becomes
zero while at the upper limit x = 1 we have

$u(1)\,v'(1) - u'(1)\,v(1)$

Nothing is said about v(1) and v'(1). Hence the vanishing of the boundary
term on the right side of (5) demands

$u(1) = 0, \qquad u'(1) = 0$
We have thus found the adjoint operator of the given problem:

$\tilde{D}u(x) = u''(x)$

with the boundary conditions

$u(1) = 0, \qquad u'(1) = 0$

We notice that our problem is not self-adjoint because, although D̃ and D
agree, the boundary conditions for v(x) and u(x) do not agree since the
conditions for v(x) are prescribed at x = 0, the conditions for u(x) at x = 1.
Let us now assume that we are interested in constructing a mechanism
which will guarantee that at the time moment x = 1 the moving mass
returns to the origin, with zero velocity. In this case we have imposed on
our problem the two further boundary conditions

$v(1) = 0, \qquad v'(1) = 0 \qquad (10)$

Now the boundary term of (5) vanishes automatically and we do not get any
boundary conditions for u(x). The adjoint problem D̃u is now characterised
by the operator u''(x) alone, without any boundary conditions.
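
The identity (5) on which this example rests is easily confirmed symbolically; the following sketch (an editorial check) verifies that u v'' − v u'' is the exact derivative of the bilinear function F(u, v) = u v' − u' v:

    import sympy as sp

    x = sp.symbols('x')
    u = sp.Function('u')(x)
    v = sp.Function('v')(x)

    lhs = u * sp.diff(v, x, 2) - v * sp.diff(u, x, 2)   # u Dv - v (adjoint D)u
    F = u * sp.diff(v, x) - sp.diff(u, x) * v           # boundary function F(u, v)
    print(sp.simplify(lhs - sp.diff(F, x)))             # 0: the identity holds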
Problem 158. The results of Problem 154 were obtained by direct matrix
transposition. Obtain the same results now on the basis of Green's identity.
Problem 159. Consider the following differential operator:

Obtain the adjoint operator D̃u(x) and the adjoint boundary conditions under
the following circumstances:

[Answer:

Problem 160. Consider the self-adjoint second-order operator (12.14), A(x) ≠ 0,
range x = [a, b], with the boundary conditions

Find the adjoint boundary conditions. When will the system become self-
adjoint?
[Answer:

Condition of self-adjointness:

4.14. Incomplete systems


In our general discussions of linear systems we encountered three
characteristic numbers which were of decisive importance for the general
behaviour of the system: the number of equations n, the number of
unknowns m, and the order of the matrix p. But now we are faced with
problems which from the standpoint of algebra represent an infinity of
equations for an infinity of unknowns. The numbers n and m, and likewise
the number p, lose their direct significance. We have seen, however, in
Section 7 of Chapter 3 that the solution of the homogeneous equations

$A v = 0 \qquad (1)$

$\tilde{A} u = 0 \qquad (2)$

give us a good substitute for the direct definition of n, m, and p. The
number of independent solutions of the system (1) is always m − p, that of
the system (2) n − p. These solutions tell us some fundamental facts about
the given system, even before we proceed to the task of actually finding the
solution. The system (1) decides the unique or not unique character of
the solution while the system (2) yields the compatibility conditions of the
system.

The role of the system (1) was: "Add to a particular solution an arbitrary
solution of the system (1), in order to obtain the general solution of the given
system."
The role of the system (2) was: "The given system is solvable if and only
if the right side is orthogonal to every independent solution of (2)."
These results are immediately applicable to the problem of solving linear
differential equations or systems of such equations. Before we attempt a
solution, there are two questions which we will want to decide in advance:
1. Will the solution be unique? 2. Are the given data such that a
solution is possible?
Let us first discuss the question of the uniqueness of the solution. We
have seen that an rth order differential equation alone, without additional
boundary conditions, represents an (m − r) × m system and is thus r
equations short of a square matrix. But even the addition of r boundary
conditions need not necessarily guarantee that the solution will be unique.
For example the system

represents a second order system with two boundary conditions and thus
we would assume that the problem is well-determined. And yet this is not
so because the homogeneous problem

with the boundary conditions (4) has the solution

Hence we could have added one more condition to the system, in order to
make it uniquely determined, e.g.

But if we have no particular reason for adding a condition of this kind, it is
more natural to remove the deficiency in the fashion we have done before
in Chapter 3, Section 14 by requiring that the solution shall be orthogonal to
every solution of the homogeneous equation. In our example it means that
we remove the deficiency of our system by adding the condition

We might think that deficiency is a purely mathematical phenomenon,
caused by incomplete information. But this is by no means so. We
encounter incomplete systems in well-defined physical situations. Let us
consider for example the problem of a loaded elastic bar. The mathematical
description of this problem leads to a differential equation of fourth order

which may be formulated in the form of two simultaneous differential
equations of second order:

$I(x)\,v_1''(x) = v_2(x), \qquad v_2''(x) = \beta(x) \qquad (9)$

(v₁(x) is the deflection of the bar, I(x) is the inertial moment of the generally
variable cross section, β(x) the load density.) We assume that the bar
extends from x = 0 to x = l. We will also assume that the bar is not
supported at the two endpoints but at points between and we include the
forces of support as part of the load distribution, considering them as
negative loads.
Now the boundary conditions of a bar which is free at the two endpoints
are

$v_2(0) = v_2(l) = 0, \qquad v_2'(0) = v_2'(l) = 0 \qquad (10)$
These four conditions seem to suffice for a well-determined solution since
the deflection of a bar satisfies a differential equation of fourth order. But
in actual fact the homogeneous system

$I(x)\,v_1'' = v_2, \qquad v_2'' = 0$

under the boundary conditions (10) demands only the vanishing of v₂(x) while
for v₁(x) we obtain two independent solutions

$v_1(x) = 1 \qquad \text{and} \qquad v_1(x) = x$
The physical significance of these two solutions is that the bar may be
translated as a whole and also rotated rigidly as a whole. These two degrees
of freedom are in the nature of the problem and not artificially imposed
from outside. We can eliminate this uncertainty by adding the two
orthogonality conditions

$\int_0^l v_1(x)\,dx = 0, \qquad \int_0^l x\,v_1(x)\,dx = 0$

These conditions remove the arbitrariness of the frame of reference in which
the vertical displacements of the bar are measured, by putting the origin
of the Z-axis in the centre of mass of the bar and orienting the Z-axis
perpendicular to the "neutral plane" of the bar.
Other conditions can also be chosen for the removal of the deficiency of
the system, provided that they eliminate the solutions of the homogeneous
system.

Problem 161. Show that the added conditions a) or b) are permissible, the
conditions c) not permissible for the elimination of the deficiency of the system
(9), (10):

Problem 162. Find the adjoint system of the problem (9), (10).
[Answer:

Boundary conditions:

4.15. Over-determined systems


A counterpart of the homogeneous equation Av = 0 is the homogeneous
equation Ãu = 0. If this equation has non-vanishing solutions, this is an
indication that the given data have to satisfy some definite compatibility
conditions, and that again means that we have given too many data which
cannot be chosen independently of each other. Consider for example the
problem of the "free bar", discussed in the previous section. Here the
adjoint homogeneous system D̃u(x) = 0 (cf. 14.18) under the boundary
conditions (14.19) yields

This system has two independent solutions:

The orthogonality of the right side of (14.9) to these solutions demands the
following two compatibility conditions:

$\int_0^l \beta(x)\,dx = 0 \qquad (3)$

$\int_0^l x\,\beta(x)\,dx = 0 \qquad (4)$
These conditions have a very definite physical significance. They express
the fact that the loaded elastic bar can be in equilibrium if and only if the sum
of all the loads and the moments of all the loads is zero.
In Section 13 we considered the motion of a mass 1 under the influence
of a force β(x). We assumed that the motion started with zero displacement
and zero velocity (cf. 13.3). But later we assumed that we succeeded
with a mechanism which brought the mass back to the origin with zero
velocity at the time moment x = 1. Hence we could add the two surplus
conditions (13.10) with the result that the adjoint homogeneous system
became

$u''(x) = 0 \qquad (5)$
without any additional conditions. Accordingly the adjoint homogeneous
system has the two independent solutions u(x) = 1 and u(x) = x, and we obtain
the two compatibility conditions

$\int_0^1 \beta(x)\,dx = 0, \qquad \int_0^1 x\,\beta(x)\,dx = 0 \qquad (6)$
They have the following significance. The forces employed by our
mechanism satisfy of necessity the two conditions that the time integral of
the force and the first moment of the force vanish.
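
A small numerical experiment (an editorial sketch, choosing β(x) = cos 2πx, which satisfies both conditions) shows the mechanism at work: integrating such a force twice from rest returns the mass to the origin with zero velocity at x = 1:

    import numpy as np

    x = np.linspace(0.0, 1.0, 100001)
    h = x[1] - x[0]
    beta = np.cos(2 * np.pi * x)    # time integral and first moment both vanish

    v1 = np.cumsum(beta) * h        # velocity v'(x), starting from v'(0) = 0
    v = np.cumsum(v1) * h           # displacement v(x), starting from v(0) = 0
    print(v1[-1], v[-1])            # both nearly 0: zero final velocity and position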
Problem 163. Find the compatibility conditions of the following system

[Answer:

Problem 164. In Problem 161 the addition of the conditions (14.17) was
considered not permissible for the removal of the deficiency of the system.
What new compatibility condition is generated by the addition of these boundary
conditions? (Assume I(x) = const.)
[Answer:

Problem 165. Given the following system:

Find the compatibility conditions of this system.


[Answer:

4.16. Compatibility under inhomogeneous boundary conditions


In our previous discussions we have assumed that the boundary conditions
of our problem were given in homogeneous form: some linear combinations
of function and derivatives were given as zero on the boundary. This may
generally not be so. We may prescribe certain linear aggregates of function
and derivative on the boundary as some given value which is not zero. These
values belong now to the given "right side" of the equation. The con-
struction of the homogeneous equation Dv = 0 and likewise the construction
of the adjoint homogeneous equation D̃u = 0 is not influenced by the
inhomogeneity of the given boundary conditions. What is influenced, how-
ever, are the compatibility conditions between the data. The given
inhomogeneous boundary values belong to the data and participate in the
compatibility conditions.
The orthogonality of the given right side to every independent solution
of the adjoint homogeneous equation is a direct consequence of "Green's
identity" (12.3):

$\int \beta(x)\,u(x)\,dx = 0$

if

$Dv(x) = \beta(x), \qquad \tilde{D}u(x) = 0$
But let us now assume that the prescribed boundary conditions for v(x) are
of the inhomogeneous type. In that case we have to change over to the
extended Green's identity (12.6). The boundary term on the right side will
now be different from zero but it will be expressible in terms of the given
boundary values and the boundary values of the auxiliary function u(x)
which we found by solving the adjoint homogeneous equation:

As an example let us consider once more the problem of the free elastic
bar, introduced before in Section 14 (cf. 14.9-10). Let us change the
boundary conditions (14.10) as follows:

and let us see what change will occur in the two compatibility conditions
(15.3-4). For this purpose we have to investigate the boundary term of the
extended Green's identity which in our case becomes

The first two terms drop out on account of the adjoint boundary conditions.
The last two terms dropped out earlier on account of the given homogeneous
boundary conditions (14.10) while at present we get

and thus the compatibility conditions (15.3) and (15.4) must now be extended
as follows:

These conditions again have a simple physical significance. We could
assume that our bar extends by a very small quantity ε beyond the limits
0 and l. Then we could conceive the two conditions (7) and (8) as the
previous homogeneous conditions

which express the mechanical principles that the equilibrium of the bar
demands that the sum of all loads and the sum of the moments of all loads
must be zero. The loads which we have added to the previous loads are

We call these loads "point-loads" since they are practically concentrated in


one point. In fact the load p% at x = I and the load — pi at I — 0 are
exactly the supporting loads we have to apply at the two endpoints of the
bar p in order to keep the load distribution f3(x) in equilibrium.
As a second example let us modify the boundary conditions (4) as follows:

Here the boundary term (5) becomes

and the new compatibility conditions (15.3) and (15.4) become:

Once more we can conceive these equations as the equilibrium conditions
(9) of all the forces, extending the integration from 0 to l + ε, and defining
the added load distribution by the following two conditions:

What we have here is a single force at the end of the bar and a force couple
acting at the end of the bar, to balance out the sum of the forces and the
moments of the forces distributed along the bar. This means in physical
interpretation that the bar is free at the left end x = 0 but clamped at the
right end x = l since the clamping can provide that single force and force
couple which is needed to establish equilibrium. Support without clamping
can provide only a point force which is not enough for equilibrium except
if a second supporting force is applied at some other point, for instance at
x = 0, and that was the case in our previous example.
What we have seen here is quite typical for the behaviour of inhomo-
geneous boundary conditions. Such conditions can always be interpreted
as extreme distributions of the right side of the differential equation, taking
recourse to point loads and possibly force couples of first and higher order
which have the same physical effect as the given inhomogeneous boundary
conditions.
Problem 166. Extend the compatibility conditions (15.6) to the case that the
boundary conditions (13.3) and (13.10) of the problem (13.2) are changed as
follows:

[Answer:

Problem 167. Consider the following system:

with the boundary conditions

Obtain the compatibility condition of this system and show that it is identical
with the Taylor series with the remainder term.

Problem 168. Consider once more the system of the previous problem, but
changing the boundary condition at b to

where k is some integer between 1 and n − 1:


[Answer:

4.17. Green's identity in the realm of partial differential operators


The fundamental Green's identity (12.2) represents the transcription of
the bilinear identity which holds in the realm of matrices. Since the
isomorphism between matrices and linear differential operators is not
restricted to problems of a single variable but holds equally in multi-
dimensional domains, we must be able to obtain the Green's identity if our
problem involves partial instead of ordinary differentiation.
Here again we will insert an intermediate step by not requiring
immediately that the functions u(x), v(x)—where x now stands for a point
of a multi-dimensional domain—shall satisfy the proper boundary con-
ditions. We will again introduce the extended Green's identity (12.4) where
the right side is not zero but a certain "boundary term" which involves
the values of u(x) and v(x) and some of their partial derivatives on the
boundary. We have seen that in the realm of ordinary differentiation the
integral relation (12.4) could be replaced by the simpler relation (12.5)
which involved no integration. It was this relation which gave rise to a
boundary term after integrating on both sides with respect to x. But it
was also this relation by which the adjoint operator D̃u(x) could be obtained
(leaving aside for the present the question of the adjoint boundary conditions).
Now the procedure in the realm of partial operators will be quite similar
to the previous one. The only difference is that on the right side of (12.5) we
shall not have a single term of the form d/dx but a sum of terms of the form
∂/∂x_α. The quantity F(u, v), which was bilinear in u and v, will now become
a vector which has the components

$F_1(u, v),\ F_2(u, v),\ \ldots,\ F_\mu(u, v) \qquad (1)$

if μ is the number of independent variables. We will thus write down the
fundamental relation which defines the adjoint operator D̃u(x) as follows:

$u\,Dv - v\,\tilde{D}u = \sum_{\alpha=1}^{\mu} \frac{\partial}{\partial x_\alpha}\,F_\alpha(u, v) \qquad (2)$

Let us assume that we have succeeded with the task of constructing the
adjoint operator D̃u(x) on this basis. Then we can immediately multiply
by the volume-element dx on both sides and integrate over the given domain.
On the right side we apply the Gaussian integral transformation:

$\int \sum_{\alpha=1}^{\mu} \frac{\partial F_\alpha}{\partial x_\alpha}\,dx = \oint_S \sum_{\alpha=1}^{\mu} F_\alpha\,\nu_\alpha\,dS \qquad (3)$

where ν_α is the outside normal (of the length 1) of the boundary surface S.
We thus obtain once more the extended Green's theorem in the form

$\int [u\,Dv - v\,\tilde{D}u]\,dx = \oint_S \sum_{\alpha=1}^{\mu} F_\alpha(u, v)\,\nu_\alpha\,dS \qquad (4)$
The " boundary term " appears now as an integral extended over the boundary
surface. From here we continue exactly as we have done before: we impose
the minimum number of conditions on u(x) which are necessary and sufficient
to make the boundary integral on the right side of (4) vanish. This provides
us with the adjoint boundary conditions. We have thus obtained the
differential operator D̃u(x) and the proper boundary conditions which
together form the adjoint operator.
Let us then examine the equation (2). An arbitrary linear differential
operator Dv(x) is composed of terms which contain v(x) and its derivatives
linearly. Let us pick out a typical term which we may write in the form

$A(x)\,\frac{\partial w}{\partial x_j} \qquad (5)$

where A(x) is a given function of the x_j while w stands for some partial
derivative of v with respect to any number of x_j. We now multiply by
u(x) and use the method of "integrating by parts":

$u\,A\,\frac{\partial w}{\partial x_j} = \frac{\partial}{\partial x_j}\,(A\,u\,w) - w\,\frac{\partial (A\,u)}{\partial x_j} \qquad (6)$
What we have achieved now is that w is no longer differentiated. We might
say that we have "liberated" w from the process of differentiation. We can
obviously repeat this process any number of times. At every step we
reduce the order of differentiation by one, until eventually we must end
with a term which contains v itself, without any derivatives. The factor
of v is then the contribution of that term to the adjoint operator D̃u(x).
Let us consider for example a term given as follows:

$A\,\frac{\partial^4 v}{\partial x\,\partial y^2\,\partial z} \qquad (7)$

Then the "process of liberation" proceeds as follows:

The result of the process is that v is finally liberated and the roles of u and
v are exchanged: originally u was multiplied by a derivative of v, now v is
multiplied by a derivative of u. Hence the adjoint operator has been
obtained as follows:

$\tilde{D}u = \frac{\partial^4 (A\,u)}{\partial x\,\partial y^2\,\partial z}$

while the vector F_α(u, v) of the general relation (2) has in our case the
following significance:

We have followed a certain sequence in this process of reducing the degree
of differentiation, viz. the sequence x, y, y, z. We might have followed
another sequence, in which case the vector F_α(u, v) would have come out
differently but the adjoint operator D̃u(x) would still be the same. It is
the peculiarity of partial differentiation that the same boundary term may
appear in various forms, although the complete boundary integral (3) is
the same. The vector F_α has no absolute significance, only the complete
integral which appears on the right side of the extended Green's identity (4).
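
The liberation process of this example can be checked mechanically. The sketch below assumes the term has the form A ∂⁴v/∂x∂y²∂z (consistent with the sequence x, y, y, z mentioned above) and verifies that u Dv − v D̃u is a pure divergence whose components are the terms collected in the four liberation steps:

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    u = sp.Function('u')(x, y, z)
    v = sp.Function('v')(x, y, z)
    A = sp.Function('A')(x, y, z)
    Au = A * u

    lhs = u * A * sp.diff(v, x, y, y, z) - v * sp.diff(Au, x, y, y, z)

    # Divergence components F_x, F_y, F_z collected in the steps x, y, y, z:
    Fx = Au * sp.diff(v, y, y, z)
    Fy = -sp.diff(Au, x) * sp.diff(v, y, z) + sp.diff(Au, x, y) * sp.diff(v, z)
    Fz = -sp.diff(Au, x, y, y) * v
    div = sp.diff(Fx, x) + sp.diff(Fy, y) + sp.diff(Fz, z)
    print(sp.simplify(lhs - div))   # 0: the difference is a pure divergence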
Problem 169. Obtain the adjoint operator of the term

once in the sequence x, y and once in the sequence y, x. Show that the equality
of the resulting boundary integral can be established as a consequence of the
following identity:

Prove this identity by changing it to a volume integral with the help of Gauss's
theorem (3).

4.18. The fundamental field operations of vector analysis


In the vector analysis of three dimensional domains

the following operations appear as fundamental in the formulation of the
field problems of classical physics:
1. The vector

$\operatorname{grad} \Phi$

2. The scalar

$\operatorname{div} V$

3. The vector

$\operatorname{curl} V$
Amongst the combinations of these operations of particular importance is
the "Laplacian operator"

$\Delta = \operatorname{div}\operatorname{grad}$

which appears in the following constructions:

$\Delta \Phi = \operatorname{div}\operatorname{grad}\Phi, \qquad \Delta V = \operatorname{grad}\operatorname{div} V - \operatorname{curl}\operatorname{curl} V$
According to the general rules we can construct the adjoints of these
operators:
1. Adjoint of the gradient:

$-\operatorname{div} U \qquad (8)$

Boundary term:

$\oint_S \Phi\,U_\nu\,dS \qquad (9)$

2. Adjoint of the divergence:

$-\operatorname{grad} u \qquad (10)$

Boundary term:

$\oint_S u\,V_\nu\,dS \qquad (11)$

3. Adjoint of the curl:

$\operatorname{curl} U \qquad (12)$

Boundary term:

$\oint_S (V \times U) \cdot \nu\,dS \qquad (13)$
Many important conclusions can be drawn from these formulae concerning
the deficiency and compatibility of some basic field equations. The following
two identities are here particularly helpful:

$\operatorname{div}\operatorname{curl} V = 0, \qquad \operatorname{curl}\operatorname{grad}\Phi = 0$
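
Both identities can be confirmed for completely arbitrary fields; the following sketch (an editorial check using sympy's vector calculus) does so in a few lines:

    import sympy as sp
    from sympy.vector import CoordSys3D, curl, divergence, gradient

    R = CoordSys3D('R')
    phi = sp.Function('phi')(R.x, R.y, R.z)
    B = (sp.Function('B1')(R.x, R.y, R.z) * R.i +
         sp.Function('B2')(R.x, R.y, R.z) * R.j +
         sp.Function('B3')(R.x, R.y, R.z) * R.k)

    print(sp.simplify(divergence(curl(B))))   # 0
    print(curl(gradient(phi)))                # the zero vector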
Consider for example the problem of obtaining the scalar Φ from the
vector field

$\operatorname{grad} \Phi = F \qquad (16)$

What can we say concerning the deficiency of this equation? The only
solution of the homogeneous equation

$\operatorname{grad} \Phi = 0$

is the constant Φ. Hence the function Φ will be obtainable from (16), except for an additive
constant.
What can we say concerning the compatibility of the equation (16)? For
this purpose we have to solve the adjoint homogeneous problem. The
boundary term (9) yields the boundary condition

$U_\nu = 0 \quad \text{on } S \qquad (19)$

since we gave no boundary condition for Φ itself. Hence we have to solve
the field equation

$\operatorname{div} U = 0 \qquad (20)$

with the added condition (19). Now the equation (20) is solvable by putting

$U = \operatorname{curl} B \qquad (21)$

where B is a freely choosable vector field, except for the boundary condition
(19) which demands that the normal component of curl B vanishes at all
points of the boundary surface S:

$(\operatorname{curl} B)_\nu = 0 \quad \text{on } S \qquad (22)$

The compatibility of our problem demands that the right side of (16) is
orthogonal to every solution of the adjoint homogeneous system. This means

$\int F \cdot \operatorname{curl} B\,d\tau = 0$

But now the formulae (12), (13) give

$\int F \cdot \operatorname{curl} B\,d\tau = \int B \cdot \operatorname{curl} F\,d\tau + \oint_S (B \times F) \cdot \nu\,dS \qquad (23)$

and, since the vector B is freely choosable inside the domain τ, we obtain
the condition

$\operatorname{curl} F = 0 \qquad (24)$
as a necessary condition for the compatibility of the system (16). Then we
can show that in consequence of this condition the last term on the right
side of (23) vanishes too. Hence the condition (24) is necessary and sufficient
for the solvability of (16).
It is of interest to pursue this problem one step further by making our
problem still more over-determined. We will demand that on the boundary
surface S the function Φ vanishes. In that case we get no boundary
condition for the adjoint problem since now the boundary term (9) vanishes
on account of the given boundary condition.
Here now we obtain once more the condition (24) but this is not enough.
The vector B can now be freely chosen on the surface S and the vanishing
of the last term of (23) demands

$F \times \nu = 0 \quad \text{on } S$

that is the vector F must be perpendicular to S at every point of the surface.


This is indeed an immediate consequence of the fact that the gradient of a
potential surface Φ = const. is orthogonal to the surface but it is of interest
to obtain this condition systematically on the basis of the general
compatibility theory of linear systems.
Problem 170. Find the adjoint operator (together with the adjoint boundary
conditions) of the following problem:

with the boundary conditions on S:



Show that in all these cases the problem is self-adjoint. Historically this
problem is particularly interesting since "Green's identity" was in fact estab-
lished for this particular problem (George Green, 1793-1841).
Problem 171. Investigate the deficiency and compatibility of the following
problem:

Boundary condition: V = V₀ on S.
[Answer:
V uniquely determined. Compatibility conditions:

Problem 172. Given the following system

What boundary conditions are demanded in order that V shall become the
gradient of a scalar?
[Answer: Only V_ν can be prescribed on S, with the condition

4.19. Solution of incomplete systems


We have seen in the general treatment of linear systems that certain
linear aggregates of the unknowns may appear in the given system with
zero weight which means that we cannot give a complete solution of the
problem, because our problem does not have enough information for a full
determination of the unknowns. Such incomplete systems allow nevertheless
a unique solution if we agree that we make our solution orthogonal to all
the non-activated dimensions. This means that we give the solution in all
those dimensions in which a solution is possible and add nothing concerning
the remaining dimensions. This unique solution is always obtainable with
the help of a "generating function", namely by putting

$v = \tilde{D}u \qquad (1)$
We will consider a few characteristic examples for this procedure from the
realm of partial differential operators.
Let us possess the equation

$\operatorname{div} V = \rho \qquad (2)$

without any further information. Here the adjoint operator is −grad Φ
with the boundary condition

$\Phi = 0 \quad \text{on } S \qquad (3)$

If we put

$V = -\operatorname{grad} \Phi$

we obtain the potential equation

$\Delta \Phi = -\rho$
with the added boundary condition (3) and this equation has indeed a
unique solution.
As a second example let us consider the system

with the compatibility condition

but without further conditions. Here the adjoint system becomes

with the boundary conditions

(that is U orthogonal to S) and

According to the general rule we have to put

Substitution in (5) gives first of all the potential equation

with the boundary condition (9) and this problem has a unique solution.
Let us furthermore put

where we can determine the scalar field χ by the conditions

This leaves a vector W which is free of divergence and which vanishes on S.
Substitution in the first equation of (5) yields

We have for every component of W the potential equation with the
boundary condition that each component must vanish on S. This problem
has a unique solution. Hence we have again demonstrated that the method
of the generating function leads to a unique solution.
Finally we consider a system of equations called the "Maxwellian
equations" of electromagnetism. In particular we consider only the first
group of the Maxwellian equations, which represents 4 equations for 6
quantities, the electric and the magnetic field strengths E and H:

$\operatorname{curl} H - \frac{\partial E}{\partial t} = J, \qquad \operatorname{div} E = \rho \qquad (15)$

(J = current density, ρ = charge density). The right sides have to satisfy
the compatibility condition

$\operatorname{div} J + \frac{\partial \rho}{\partial t} = 0 \qquad (16)$
Our system is obviously under-determined since we have only 4 equations
for 6 unknowns and the compatibility condition (16) reduces these 4 relations
to essentially 3 relations.
Let us now obtain the adjoint system and solve our problem in terms of a
generating function. The multiplication of the left side of (15) by an
undetermined vector U and a scalar u yields:

Hence we obtain the adjoint operator D̃u of our system in the following form:

At this point, however, we have to take into consideration a fundamental
result of the Theory of Relativity. That theory has shown that the true
fourth variable of the physical universe is not the time t but

$ict$
The proper field quantities of the electromagnetic field are not E and H

but iE and H and the proper way of writing the Maxwellian equations is
as follows:

Accordingly the undetermined multipliers of our equations will be U and
iu, and we obtain the (algebraically) adjoint system in the following form:

The solution of the system (20) in terms of a generating function becomes
accordingly:

or, replacing U by the more familiar notation A and u by Φ:

$E = -\operatorname{grad}\Phi - \frac{\partial A}{\partial t}, \qquad H = \operatorname{curl} A \qquad (22)$
Here we have the customary representation of E and H in terms of the
"vector-potential" A and the "scalar potential" Φ. Customarily this
representation is the consequence of the second set of Maxwellian equations:

$\operatorname{curl} E + \frac{\partial H}{\partial t} = 0, \qquad \operatorname{div} H = 0 \qquad (23)$
It is of interest to observe that the same representation is obtainable by a
natural normalisation of the solution of the first set of Maxwellian equations.
This means that the second set of the Maxwellian equations can be inter-
preted in the following terms: The electromagnetic field strength iE, H has no
components in those dimensions of the function space which are not activated
by the first set of Maxwellian equations.

Problem 173. Making use of the Hermitian definition of the adjoint operator
according to (12.2) show that the following operator in the four variables x, y, z,
t is self-adjoint:

Problem 174. Given the differential equation

without boundary conditions. Obtain a unique solution with the help of a
generating function and show that the same solution is obtainable by making
v(x) orthogonal to the homogeneous solutions

[Answer:

with the boundary conditions

Problem 175. Given the differential equation

without boundary conditions. Obtain a unique solution with the help of a
generating function.
[Answer:

with the boundary conditions

CHAPTER 5

THE GREEN'S FUNCTION

Synopsis. With this chapter we arrive at the central issue of our
theory. The "Green's Function" represents in fact the inverse of the
given differential operator. In analogy to the theory of matrices, we
can establish—in proper interpretation—a unique inverse to any given
sufficiently regular differential operator. The Green's function can
thus be defined under very general conditions. In a large class of
problems the Green's function appears in the form of a "kernel
function" which depends on the position of two points of the given
domain. This function can be defined as the solution of a certain
differential equation which has Dirac's "delta function" on the right
side. The "reciprocity theorem" makes it possible to define the
Green's function either in terms of the adjoint, or the given differential
operator. Over-determined or under-determined systems lead to the
concept of the "constrained" Green's function which is constrained to
that subspace in which the operator is activated. (In Chapter 8 we will
encounter strange cases, in which the inverse operator exists, without
being, however, reducible to a kernel function of the type of the
Green's function.)

5.1. Introduction
In the domain of simultaneous linear algebraic equations we possess
methods by which in a finite number of steps an explicit solution of the
given system can be obtained in all cases when such a solution exists. In
the domain of linear differential operators we are not in a similarly fortunate
position. We have various methods by which we can approximate the
solution of a given problem in the realm of ordinary or partial differential
equations. But the actual numerical solution of such a specific problem—
although perhaps of great importance for the solution of a certain problem
of physics or industry—may tell us very little about the interesting analytical
properties of the given problem. From the analytical standpoint we are
not interested in the numerical answer of an accidentally encountered
problem but in the general properties of the solution. The analytical tools
by which a solution is obtained may be of little practical significance but of
great importance if our aim is to arrive at a deeper understanding of the
nature of linear differential operators and the theoretical conclusions we can

draw concerning their behaviour under various circumstances. From the
beginning of the nineteenth century the focus of interest shifted from the
solution of certain special problems by means of more or less ingenious
artifices to a much broader outlook which led to a universal method of solving
linear differential equations with the help of an auxiliary function called the
"Green's function". This concept was destined to play a central role in
the later development. It is this construction which we shall discuss in the
present chapter together with a number of typical examples which illustrate
the principal properties of this important tool of analysis.

5.2. The role of the adjoint equation


We have seen in the preliminary investigation of a linear system that the
solution of the adjoint homogeneous equation played a major role in the
problem of deciding whether a given linear system is solvable or not. If
the adjoint homogeneous equation had non-vanishing solutions, the
orthogonality of the right side to these solutions was the necessary and
sufficient condition for the solvability of our problem.
This investigation has not yet touched on the task of actually finding
the solution; we have merely arrived at a point where we could decide
whether a solution exists or not. In actual fact, however, we are already
quite near to the solution problem and we need only a small modification
of the previous procedure to adapt it to the more pretentious task of actually
obtaining the solution. We can once more use the artifice of artificial
over-determination in order to reduce the solution problem to the previous
compatibility problem. Let us namely add to our data the additional
equation

$v(x_1) = p \qquad (1)$

The point x₁ is a special point of our domain in which we should like to
obtain the value of v(x). Of course, if our problem is well-determined, we
have no right to choose p freely. The fact that our problem has a unique
solution means that p is uniquely determined by the data of our problem.
But the extended system

$Dv(x) = \beta(x), \qquad v(x_1) = p \qquad (2)$
(where we assume that the first equation includes the given boundary
conditions) is a perfectly legitimate linear system whose compatibility we
can investigate. For this purpose we have to form the adjoint homogeneous
equation

and express the compatibility of the system (2) by making the right side
orthogonal to the solution u(x) of the system (3) (we know in advance that
the adjoint system (3) must have a solution since otherwise the right side
of (2) would be freely choosable and that would mean that p is not deter-
mined). But then the orthogonality of the right side of (2) to the solution
u(x) will give a relation of the following form:
$\int \beta(x)\,u(x)\,dx + p\,u_0 = 0 \qquad (5.2.4)$

and this means that we can obtain p in terms of the given β(x) by the relation

$p = -\frac{1}{u_0}\int \beta(x)\,u(x)\,dx \qquad (5.2.5)$

This is the basic idea which leads to the construction of the Green's
function and we see the close relation of the construction to the solution of
the adjoint homogeneous equation.

5.3. The role of Green's identity


Instead of making use of the adjoint homogeneous equation we can
return to the fundamental Green's identity (4.12.3) and utilise this identity
for our purposes:
l\u(x)Dv(x] - v(x)Bu(x)]dx = 0 (5.3.1)
Since Dv(x) is given as )S(z), we obtain at once the orthogonality of fi(x) to
u(x) if u(x) satisfies the homogeneous equation
bu(x) = 0 (5.3.2)
But in our case not only β(x) is given but also v(x) at the point x = x₁.
Consequently the vanishing of D̃u(x) is not demanded everywhere. It
suffices if D̃u(x) is zero everywhere except at the point x = x₁, since v(x) at
x₁ is given as p. Hence the artificial over-determination of our equation in
the sense of (2) leads to a relaxation of the conditions for the adjoint equation
inasmuch as the homogeneous adjoint equation has to be satisfied everywhere
except at the point x = x₁.

5.4. The delta function δ(x, ξ)


In view of the continuous nature of our operator Dv(x) a certain
precaution is required. The integration over x brings in the volume-
element dx of our domain and we have to pay attention to the difference
between "load" and "load density" that we have encountered earlier in
our study of the elastic bar. We would like to pinpoint the particular
value v(xi), singling out the definite point x = x\. This, however, does not
harmonise with the continuity of our operator. A point-load has to be
understood as the result of a limit-process. We actually distribute our
load over a small but finite domain and then shrink the dimensions of this
domain, until in the limit the entire load is applied at the point x = x\.
We shall thus proceed as follows. We take advantage of the continuity
of the solution v(x) which is demanded by the very fact that the differential
operator D must be able to operate on v(x). But if v(x] is continuous at
SEC. 5.4 THE DELTA FUNCTION 8(x, £) 209

x = #1, then our assumption that v(x) assumes the value p at the point
x = xi can be extended to the immediate neighbourhood of x = x\, with
an error which can be made as small as we wish. We will thus extend
our system (2) in the following sense. The equation

is given everywhere, including the given boundary conditions. We add the


equation

in the sense that now xi is not a definite point but a point with its
neighbourhood which extends over an arbitrarily small v-dimensional domain
of the small but finite volume e that surrounds the point x = x\. Accord-
ingly in Green's identity (3.1) we can exempt Du(x) from being zero not
only at the point x = x\ but also in its immediate neighbourhood. This
statement we will write down in the following form

The function 8£(xi, x) has the property that it vanishes everywhere outside
of the small neighbourhood e of the point x = x\. Inside of the small
neighbourhood e we will assume that Be(xi> %) does not change its sign: it is
either positive or zero.
Then Green's identity yields the compatibility condition of our system
(1), (2) in the following form:

We will now take advantage of the linearity of the operator T)u(x). A


mere multiplication of u(x) by a constant brings this constant in front of
Du(x). We can use the freedom of this constant to a proper normalisation
of the right side of (3). If we interpret the right side of (3) as a load density,
we will make the total load equal to 1:

The integration extends only over the neighbourhood e of the point x = x\,
since outside of this neighbourhood the integrand vanishes.
Now we will proceed as follows. In (4) we can replace p(x) by v(x) since
the significance of p(x) is in fact v(x). Then we can make use of the mean
value theorem of integral calculus, namely that in the second integral we
can take out v(x) in front of the integral sign, replacing a; by a certain x\
where xi is some unknown point inside of the domain e. Then we obtain

We notice that by this procedure we have not obtained v(x\) exactly. We


have obtained v(x) at a point x\ which can come to x = x\ as near as we
wish and thus also the uncertainty of v(xi) can be made as small as we wish,
15—L.D.O.
210 THE GREEN'S FUNCTION CHAP. 5
considering the continuity of v(x) at x = x\. But we cannot get around
this limit procedure. We have to start out with the small neighbourhood
e and then see what happens as the domain e shrinks more and more to zero.
For a better description of this limit process, we will employ a more
adequate symbolism. We will introduce the symbol £ as an integration
variable. Then we need not distinguish between x and x\ since for the
present x will be an arbitrary fixed point of our domain—later to be identified
with that particular point at which the value of v(x] shall be obtained.
Hence x becomes a "parameter" of our problem while the active variable
becomes £. Furthermore, we wish to indicate that the function u(g) con-
structed on the basis of (3) depends on the position of the chosen point x.
It also depends on the choice of e. Accordingly we will rewrite the
equation (3) as follows:

The previous x has changed to £, the previous x\ to x. The function u(x)


has changed to a new function G(x, £) in which both x and £ appear on
equal footing because, although originally a; is a mere parameter and £ the
variable, we can alter the value of the parameter x at will and that means that
the auxiliary function u(g) will change with x. (The continuity with respect
to x is not claimed at this stage.)
Adopting this new symbolism, the equation (6) appears now in the
following form

The uncertainty caused by the fact that x does not coincide with x can
now be eliminated by studying the limit approached as e goes to zero.
This limit may exist, even if Gf(x, £) does not approach any definite limit
since the integration has a smoothing effect and it is conceivable that
Ge(x, |), considered as a function of £, would not approach a definite limit
at any point of the domain of integration and yet the integral (8) may
approach a definite limit at every point of the domain. In actual fact,
however, in the majority of the boundary value problems encountered in
mathematical physics the function Gf(x, £) itself approaches a definite limit
(possibly with the exception of certain singular points where Gf(x, £) may go
to infinity). This means that in the limit process

the function G£(x, £) with decreasing e approaches a definite limit, called the
"Green's function":

In that case we obtain from (8):


SEC. 5.5 THE EXISTENCE OF THE GREEN'S FUNCTION 211

The limit process here involved is often abbreviated to the equation

where on the right side we have "Dirac's delta function", introduced by


the great physicist Paul Dirac in his wave-mechanical investigations. Here
a function of £ is considered which vanishes everywhere outside the point
£ = x while at the point £ = x the function goes to infinity in such a manner
that the total volume under the function is 1. Dirac's delta function is a
mathematical counterpart of a concept that Clerk Maxwell employed in his
elastic investigations where he replaced a continuous distribution of loads
by a "point-load" applied at the point x. The elastic deflection caused by
this point load, observed at the point £, was called by Maxwell the
"influence function". It corresponds to the same function that is today
designated as the "Green's function" of the given elastic problem.
The literature of contemporary physics frequently advocates discarding
all scruples concerning the use of the delta function and assigning an actual
meaning to it, without regard to the e-process from which it originated.
In this case we assign a limit to a process which in fact has no limit since
a function cannot become zero everywhere except in one single point, and
yet be associated with an integral which is not zero but 1. However,
instead of extending the classical limit concept in order to include something
which is basically foreign to it, we can equally well consider all formulas
in which the delta function occurs as shorthand notation of a more elaborate
statement in which the legitimate Be(x, £) appears, with the added condition
that e converges to zero. For example, assuming that f(x) is continuous,
we have

where x is a point in the e-neighbourhood of x. Now, if e converges to


zero, we get:

We can now write this limit relation in the abbreviated form:

but what we actually mean is a limit process, in which the delta function
never appears as an actual entity but only in the legitimate form 8f(x, £),
e converging to zero. The truth of the statement (14) does not demand
that the limit of 8e(x, £) shall exist. The same is true of all equations in
which the symbol 8(x, £) appears.

5.5. The existence of the Green's function


We know from the general matrix treatment of linear systems that a
linear system with arbitrarily prescribed right side is solvable if and only
if the adjoint homogeneous equation has no solution which is not identically
212 THE GREEN'S FUNCTION CHAP. 5
zero. This principle, applied to the equation (4.7), demands that the
equation

shall have no non-zero solution. And that again means that the problem

under the given boundary conditions, should not have more than one solution.
Hence the given linear system must belong to the "complete" category,
that is the F-space must be completely filled out by the given operator.
There is no condition involved concerning the C7-space. Our problem can
be arbitrarily over-determined. Only under-determination has to be avoided.
The reason for this condition follows from the general idea of a Green's
function. To find the solution of a given differential equation (with the
proper boundary conditions) means that we establish a linear relation
between v(x) and the given data. But if our system is incomplete, then
such a relation does not exist because the value of v(x) (at the special point x)
may be added freely to the given data. Under such circumstances the
existence of the Green's function cannot be expected.
A good example is provided by the strongly over-determined system

where F is a given vector field. We have encountered this problem earlier,


in Chapter 4.18, and found that the compatibility of the system demands
the condition

In spite of the strong over-determination, the problem is under-determined


in one single direction of the function space inasmuch as the homogeneous
equation permits the solution

Let us now formulate the problem of the Green's function. The adjoint
operator now is

with the boundary condition

(v being the normal at the boundary point considered). Now the differential
equation

certainly demands very little of the vector field U since only a scalar
condition is prescribed at every point of the field and we have a vector at
our disposal to satisfy that condition. And yet the equation (8) has no
SEC. 5.5 THE EXISTENCE OF THE GEEEN'S FUNCTION 213

solution because the application of the Gaussian integral transformation to


the equation (8) yields, in view of the boundary condition (7):

while the definition of the delta function demands that the same integral
shall have the value 1.
Let us, however, complete the differential equation (3) by the added
condition

In this case the adjoint operator (6) ceases to be valid at the point £ = 0
and we have to modify the defining equation of the Green's function as
follows

The condition that the integral over the right side must be zero, determines
a to —1 and we obtain finally the determining equation for the Green's
function of our problem in the following form:

This strongly-undetermined problem has infinitely many solutions but


every one of these solutions can serve as the Green's function of our problem,
giving the solution of our original problem (3), (10) in the form

(on the right side the product FG refers to the scalar product of the two
vector fields).
One particularly simple solution of the differential equation (12) can be
obtained as follows: We choose a narrow tube of constant cross-section q
214 THE GREEN'S FUNCTION CHAP. 5
which shall connect the points £ = 0 and £ = x. We define the vector
field 0(x, g) as zero everywhere outside of this tube while inside of the tube
we assume that G(t-) is everywhere perpendicular to the cross-section of the
tube and of constant length. We have no difficulty in showing that the
divergence of the vector field thus constructed vanishes everywhere, except
in the neighbourhood of the points £ = 0 and £ = x. But these are exactly
the neighbourhoods where we do not want div 6r(£) to vanish since the right
side of (12) is not zero in the small neighbourhood e of these points. The
infinitesimal volume e in which the delta function 8e(x, g) is not zero is
given by the product qh. In this volume we can assign to Sf(x, g) the
constant value

The vector field G(g) starts at the lower end of the tube with zero, grows
linearly with h and attains the value Ijq at the end of the shaded volume.
Then it maintains this length throughout the tube T, arrives with this
constant value at the upper shaded volume, diminishes again linearly with
h and becomes zero at the upper end of the tube.
Let us now form the integral (13). This integral is reduced to the very
narrow tube in which G($) is different from zero. If we introduce the
line-element ds of the central line of the tube and consider it as an
—>
infinitesimal vector ds whose length is ds while its direction is tangential to
the line, then the product G(i-)d£ is replaceable by:

because in the present problem dg—the infinitesimal volume of integration—


is qds, while the length of the vector G(g) is the constant I/q. We see that,
as € shrinks to zero, the volume integral (19) becomes more and more the
line-integral

and this is the well-known elementary solution of our problem, but here
obtained quite systematically on the basis of the general theory of the
Green's function.
Our problem is interesting from more than one aspect. It demonstrates
the correctness of the Green's function method hi a case where the Green's
function itself evaporates into nothingness. The function G((x, £) is a
perfectly legitimate function of £ as long as e is finite but it does not
approach any limit as e approaches zero. It assumes in itself the nature of
a delta function. But this is in fact immaterial. The decisive question is
not whether G(x, £) exists as e goes to zero but whether the integral (13)
approaches a definite limit as e goes to zero and in our example this is the
case, although the limit of Gf(x, $) itself does not exist.
SEC. 5.5 THE EXISTENCE OF THE GREEN'S FUNCTION 215

Our example demonstrated that in the limit process, as e goes to zero,


peculiar things may happen, in fact the Green's function may lose its signifi-
cance by not approaching any limit exactly in that region of space over
which the integration is extended. Such conditions are often encountered
in the integration of the "hyperbolic type" of partial differential equations.
The integration is then often extended over a space which is of lower
dimensionality than the full space in which the differential equation is
stated. We do not hesitate, however, to maintain the concept of the
Green's function in such cases by reducing the integration from the
beginning to the proper subspace. For example in our present problem we
can formulate the solution of our problem in the form

and define the Green's function G(x, s) as

where Ts is the tangent vector to the line s of the length 1.


Even with this allowance we encounter situations in which the existence
of the Green's function cannot be saved. Let us consider the partial
differential equation in two variables:

which is the differential equation of heat conduction in a rod, t being the


time, x the distance measured along the rod. We assume that the two
ends of the rod, that is the points x — 0 and x = I, are kept constantly on
the temperature v — 0. This gives the boundary conditions

Usually we assume that at the time moment t = 0 we observe the tempera-


ture distribution v(x, 0) along the rod and now want to calculate v(x, t) at
some later time moment. Since, however, there is a one-to-one correspond-
ence between the functions v(x, 0) and v(x, t), it must be possible to salvage
v(x, 0) if by accident we have omitted to observe the temperature distribution
at t = 0 and made instead our observation at the later time moment t = T.
We now give the "end-condition" (instead of initial condition)

and now our problem is to obtain v(x, t) at the previous time moments
between 0 and T.
In harmony with the general procedure we are going to construct the
adjoint problem with its boundary conditions, replacing any given in-
homogeneous boundary conditions by the corresponding homogeneous
216 THE GREEN'S FUNCTION CHAP. 5

boundary conditions. The adjoint operator becomes if we carry through


the routine Green's identity procedure:

with the boundary conditions

and the initial condition

The Green's function of our problem is defined, according to the general


method of Section 4, by the differential equation

together with the homogeneous boundary conditions (23), (24). This


problem has no solution and the difficulty is not caused by the extreme
nature of the delta function. Even if we replace 8(x, t; £, t) by 8f(x, t; g, r),
we can find no solution to our problem. The physical significance of the
Green's function defined by (25) and the given boundary conditions would
be a heat distribution caused by a unit heat source applied at the point
£ = x, r = t, with the added condition that the temperature is zero all
along the rod at a later time moment (instead of an earlier time moment).
The non-reversibility of heat phenomena prevents the possibility of such a
heat distribution.
We have here an example of a boundary value problem which is entirely
SEC. 5.6 INHOMOGENEOUS BOUNDARY CONDITIONS 217

reasonable and which in the case of properly given data has a solution and
hi fact a unique solution. That solution, however, cannot be given in
terms of an auxiliary function which satisfies a given non-homogeneous
differential equation. The concept of the Green's function has to be
extended to a more general operator in order to include the solution of such
problems, as we shall see later in Chapter 8.
Problem 176. Remove the over-determination of the problem (3), (10), by the
least square method (application of t> on both sides of the equation) and
characterise the Green's function of the new problem.
[Answer:

Problem 177. Define the Green's function for the Laplacian operator Av, if the
given boundary conditions are

[Answer:

no boundary conditions.]

5.6. Inhomogeneous boundary conditions


In the construction of the adjoint operator T)u—which was needed for the
definition of the Green's function G(x, £)—we have always followed the
principle that we have ignored the given right sides and replaced in-
homogeneous boundary conditions by the corresponding homogeneous
boundary conditions. It so happens that the solution of an inhomogeneous
boundary value problem occurs with the help of the same Green's function
G(x, £) that holds in the case of homogeneous boundary conditions. The
difference is only that the given right side of the differential equation gives
rise to a volume integral, extended over the entire domain of the variables,
while the given non-zero boundary values give rise to a surface integral,
extended over the boundary surface S.
Let us return to Green's identity (3.1). The solution (4.11) in terms of
the Green's function was an immediate consequence of that identity. But
the right side of this identity is zero only if v(£) is subjected to the given
boundary conditions in their homogeneous form, that is if the boundary
values are prescribed as zero. If the given boundary conditions have non-
zero values on the right side, then the "zero" on the right side of Green's
218 THE GREEN'S FUNCTION CHAP. 5
identity (3.1) has to be replaced by a certain integral over the boundary
surface S. In this integral only the given (inhomogeneous) boundary values
will appear, in conjunction with the Green's function G(x, 8} and its deriva-
tives on the boundary. Hence the problem of inhomogeneous boundary
conditions does not pose a problem which goes beyond the problem of
solving the inhomogeneous differential equation with homogeneous boundary
conditions.
As an example let us consider the solution of the homogeneous differential
equation

with the inhomogeneous boundary condition

We go through the regular routine, replacing the condition (2) by the


condition v(8) = 0 and thus obtaining the adjoint operator

with the boundary condition

Now the extended Green's identity—which does not demand any boundary
conditions of u and v—becomes in our case:

(v = outward normal of S). We shall apply this identity to our function


v(£), denned by the conditions (1) and (2), while u(£) will be replaced by
the Green's function G(x, £), denned by

The integration (5) on the left side yields minus v(x), because Av = 0, and
Au is reduced to the delta function which exists only in the e-neighbourhood
of the point £ = x, putting in the limit the spotlight on v(x). On the right
side the first term drops out due to the boundary condition (7) and the
final result becomes

If it so happens that our problem is to solve the inhomogeneous differential


equation

with the inhomogeneous boundary condition (2), then we obtain the solution
SEC. 5.6 INHOMOGENEOUS BOUNDAHY CONDITIONS 219

in the form of a sum of two integrals, the one extended over the given volume
T, the.other over the boundary surface S:

This form of the solution shows that the inhomogeneous boundary values
v(S) can be interpreted as equivalent to a double layer of surface charges
placed on the boundary surface 8.

Problem 178. Obtain the solution of the boundary value problem (5.28), with

[Answer:

Problem 179. Obtain the solution of the heat conduction problem (5.19),
(5.20), but replacing the end condition (5.21) by the initial condition

[Answer:

where the Green's function O(x, t; £, r) satisfies the differential equation (5.25)
with the boundary conditions (5.23) and the end-condition

Problem 180. Consider the problem of the elastic bar (4.14.9), with the load
distribution (3(x) = 0 and the inhomogeneous boundary conditions

Obtain the deflection of the bar.


[Answer:
boundary conditions

Problem 181. Solve the same problem with the boundary conditions

[Answer:

(All differentiations with respect to the second argument.)


220 THE GREEN'S FUNCTION CHAP. 5
5.7. The Green's vector
More than once we have encountered in our discussions the situation that
it was not one function but a set of functions to which the given differential
operator was applied. In the question of obtaining the solution of such a
system, we have to specify which one of the functions we want to obtain.
Consider for example the Laplacian operator

The second order differential equation

can be replaced by the following equivalent system of first order equations,


denoting v by vi:

These systems of four equations for the four unknowns (vi, v%, vs, v^) is
equivalent to the equation (2), as we can see if we substitute for v2, #3, #4
their values into the last equation. But we can equally consider the given
system as a simultaneous system of four equations in four unknowns. In
the latter case the focus of interest is no longer on vi alone. We may
equally consider v2, ^3, t>4 as unknowns. And thus we are confronted with
a new situation to which our previous discussions have to be extended.
How are we going to construct in this case the "Green's function" which will
serve as the auxiliary function for the generation of the solution?
We return once more to the fundamental idea which led to the concept
of the "function space" (see Section 4.5). The continuous variables
x1, x2, . . . , xs—briefly denoted by the symbol x—were replaced by a set of
discrete values in which the function v(x, y, z) was tabulated. Each one of
these tabulated values opened up a new dimension in that abstract "function
space" in which the function v(x) became represented by a single vector.
Now, if we have not merely one such function but a set of functions

we can absorb all these functions in our function space, without giving up
the idea of a single vector as the representation of a function. We could
think, of course, of the p functions vi, vz, • • . , «V as p. different vectors of
SBC. 5.7 THE GREEN'S VECTOR 221
the function space. But we can also do something else. We can add to
our variable x a new variable} which can only assume the values 1,2, . . . , /*,
and is thus automatically a discrete variable. Here then it is unnecessary
to add a limit process by constantly increasing the density of points in
which the function is tabulated. The variable j automatically conforms to
the demands of matrix algebra since it is from the very beginning an algebraic
quantity. If we replace Vj by the symbol v(x, j) we have only introduced a
new notation but this notation suggests a new interpretation. We now
extend the previous function space by added dimensions along which we
plot the values of vi, v%, . . . , v^ at all tabulated points and the entire set
of vectors is once more represented by one single vector. Whether we write
the given operator in the form of DVJ(X), or Dv(x,j), we know at once what
we mean: we are going to operate with the dimensions (1, 2, . . . , ju,) as if they
belonged to a surplus variable j which can only assume the p, discrete values
1, 2, . . . , /*.
Let us see what consequences this viewpoint has for the construction of
the Green's function. We have first of all to transcribe to our extended
problem the solution (4.11) of a differential equation, in terms of the Green's
function. For this purpose we shall write down the system (3) in more
adequate form. On the right side of this system we have (0, 0, 0, jS). We
can conceive this set of values as accidental and will replace them by the
more general set (]8i, fa, fa, fa)- This means that we will consider a general
system of differential equations (ordinary or partial) in the symbolic form

Yet this is not enough. We do not want to exclude from our considerations
over-determined systems* of the type

where the unknown is a scalar function <2> while the right side is a vector of
s components. In order to cover the general case we have to introduce
two discontinuous variables k and j, k for the function v(x) and j for the
right side )3(a;):

Another thing to remember is that the process of summation is not replaced


by integration if it comes to the variables k and j, but remains a summation.
* We exclude under-determined systems from our present considerations. For such
systems the "constrained Green's vector" comes into operation, as we shall see in
Section 22.
222 THE GREEN'S FUNCTION CHAP. 5
The question is now how to obtain the proper interpretation of Green's
identity (3.1), in view of our extended system:

In order to answer this question, we prefer to write the equation (7) in a


more familiar form, replacing the discontinuous variables k and j by
subscripts:

Furthermore, we can conceive the set of functions

as a "vector" of a jn-dimensional space, associated with the left side of the


differential equation. Similarly, we can conceive the set of functions

as a "vector" of a v-dimensional space, associated with the right side of the


differential equation. But now the danger exists that we have overstressed
the use of the word "vector". We have the "function space" in which
the entire set of functions v\(x), . . . , v^x) is represented by one single
vector. And then we have two additional spaces of p, respectively v
dimensions. In order to avoid misunderstandings, we will use the word
"left-vector" to indicate a /u-dimensional vector of the type (10), and "right-
vector" to indicate a v-dimensional vector of the type (11). Furthermore,
we have to remember that the "adjoint operator" Du(x,j) amounts to a
transposition of rows and columns which has the consequence that a left
vector changes to a right vector and a right vector to a left vector, that is: in the
differential equation which characterises the adjoint operator, we have on
the left side a right-vector and on the right side a left-vector. This means
that the adjoint system of (9) will have to be written in the following form:

This adjoint operator is obtainable with the help of the Green's identity
which we can now write down in more definite terms:

Now in the construction of the Green's function we know that this


function, as far as the "active variable" £ is concerned, is nothing but
Uj($). The "passive variable" x plays purely the role of a parameter in
the defining differential equation (4.7). It will be advisable to put the
subscript j next to £ since in fact the subscript j is a portion of the variable
£. Hence we will use the notation $(£)/ in relation to the differential
equation which defines the Green's function while the subscript k which is
SEC. 5.7 THE GREEN'S VECTOR 223
associated with the passive variable x will accordingly appear in conjunction
with x. This leads to the notation

Instead of a "Green's function" we can now speak of a "Green's vector".


It is a ^-dimensional left vector with respect to the variable x and a
v-dimensional right vector with respect to the variable £. It operates
simultaneously in the left space of the vector VK(X) and the right space of
the vector £/(£), obtaining the solution of our general system (9) in the
following form:

What remains is the transcription of the equation (4.10) which defines


the Green's function (in our case Green's vector). It now appears in the
following form

On the right side all reference to the subscript j disappears. The delta function
of the right side represents a pure right-vector, at both points x and £. The
subscripts k and K run through the same set of values 1, 2, . . . , ^. The
definition of the delta-function on the right side of (16) is given as follows:

where S^K is again "Kronecker's symbol" which is 1, if k = K and zero


otherwise, while Dirac's delta function B(x, £) is once more to be construed
by the usual e-process.
In view of the significance of 8jcK it is in fact unnecessary to keep the
subscripts k and K apart. We can write the equation (16) in the simple form

with the following interpretation of the right side. We denote with 8k(x, £)
a right side which is composed of zeros, except one single equation, namely
the fcth equation which has the delta function 8(x, £) as its right side. In
order to construct the entire vector Gjc(x, £)j, k = 1, 2, . . . , jn, we have to
solve a system of /Lt simultaneous differential equations for v functions
(ju. < v), not once but ^ times. We let the 8(0;, £) function on the right
side glide down gradually from the first to the /nth equation and thus obtain
in succession the components G\(x, £)j, GZ(X, £)j, . . . , G^(x, £); of the
complete Green's vector G^x, £)j (which is in fact a "vector" in a double
sense: /^-dimensional hi x and v-dimensional in £).
In frequent applications a system of differential equations originates
from one single differential equation of higher order which is transformed
into a system of first order equations by the method of surplus variables.
In such cases the "given right side" J3j(£) of the system consists of one
single function £(£) in the jth equation while the right sides of the remaining
224 THE GREEN'S FUNCTION CHAP. 5

equations are all zero. In this case the sum on the right side of (7.15) is
reduced to one single term and we obtain

If we concentrate on the specific function Vk(x), disregarding all the others,


the right side of (19) has the form (4.11) of the general theory and we see
that the specific component Ojc(x, g)j plays the role of the "Green's function"
of our problem.

Problem 182. Discuss the solution of the problem (7.6)—with the added
condition <Z>(0) = 0—from the standpoint of the "Green's vector" and compare
it with the solution obtained in section 5.

Problem 183. Write the problem of the elastic bar (4.14.9) in the form of a
first order system:

and solve this system for v^(x) under the boundary conditions

[Answer: Adjoint system:

where a(x) and fi(x) are determined by the boundary conditions prescribed for
G(x, $) (active variable £):

Problem 184. Solve the same problem for the function vi(x) (the elastic
deflection).
[Answer:

with the boundary conditions (26).]


SEC. 5.8 SELF-ADJOINT SYSTEMS 225

5.8. Self-adjoint systems


The differential equation of the elastic bar can be given in a variety of
different forms. We can give it in the form of the fourth order equation

The same differential equation can be formulated as a pair of second order


equations (cf. 4.14.9) and we can go still further and replace this second
order system by a system of four first order equations, as we have done in
(7.20). The adjoint system (7.22) does not agree with the original system.
But let us now formulate exactly the same system in the following sequence:

If now we multiply these equations in succession by the undetermined


factors ui(x), uz(x), us(x), u^(x), form the sum and go through the usual
routine of "liberating " the Vi(x), we obtain the adjoint system in the following
form:

with the boundary term

The boundary conditions (7.21) (the bar clamped on both ends) demand for
the adjoint system

and we observe that the new system is self-adjoint, inasmuch as the adjoint
operator and the adjoint boundary conditions coincide with the original
operator and its boundary conditions.
The self-adjointness of a certain problem in linear differential equations
is a very valuable property which corresponds to the symmetry A = A of
the associated matrix problem. Such a symmetry, however, can be destroyed
if the equations of the system Ay = b are not written down in the proper
order, or even if they are multiplied by wrong factors. Hence it is under-
standable that the system (7.20) was not self-adjoint, although in proper
formulation the problem is in fact self-adjoint. We have to find a method
by which we can guarantee in advance that the self-adjoint character of a
system will not be destroyed by a false ordering of the equations.
16—L.D.O.
226 THE GREEN'S FUNCTION CHAP. 5
The majority of the differential equations encountered in mathematical
physics belong to the self-adjoint variety. The reason is that all the
equations of mathematical physics which do not involve any energy losses
are deducible from a "principle of least action", that is the principle of
making a certain scalar quantity a minimum or maximum. All the linear
differential equations which are deducible from minimising or maximising
a certain quantity, are automatically self-adjoint and vice versa: all
differential equations which are self-adjoint, are deducible from a minimum-
maximum principle.
In order to study these problems, we will first investigate their algebraic
counterpart. Let A be a (real) symmetric matrix

and let us form the scalar quantity

We will change the vector y by an arbitrary infinitesimal amount 8y, called


"variation oft/". The corresponding infinitesimal change of s becomes

The second term can be transformed in view of the bilinear identity (3.3.6)
which in the real case reads

and thus, in view of (6) we have

If now we modify the scalar s to

we obtain

and the equation

can be conceived as the consequence of making the variation 8S equal to


zero for arbitrary infinitesimal variations of y. The condition

is equally demanded for maximising or minimising the quantity s and will


not necessarily make s either a maximum or a minimum; we may have a
maximum in some directions and a minimum in others and we may even
have a "point of inflection" which has an extremum property only if we
consider the right and left sides independently. However, the condition
(14) can be conceived as a necessary and sufficient condition of a summit
in the local sense, staying in the infinitesimal neighbourhood of that particular
SEC. 5.8 SELF-ADJOINT SYSTEMS 227

point in which the condition (13) is satisfied. Such a summit in the local
sense is called a "stationary value", in order to distinguish it from a true
maximum or minimum. The technique of finding such a stationary value of
the scalar quantity § is that we put the infinitesimal variation of s caused by
a free infinitesimal variation of y, equal to zero.
It is of interest to see what happens to the scalar s in the case of a general
(non-symmetric) matrix A. In that case

That is, the variational method automatically symmetrises the matrix A


by replacing it by its symmetric part. An arbitrary square matrix A can
be written in the form

where the first term is symmetric, the second anti-symmetric

Accordingly the quadratic form s becomes

But the second term vanishes identically, due to the bilinear identity

This explains why the variational method automatically ignores the anti-
symmetric part of the matrix A. Exactly the same results remain valid in
the complex case, if we replace "symmetric" by "Hermitian" and "anti-
symmetric" by "anti-Hermitian". We have to remember, of course, that
in the complex case the scalar

(which is real for the case A = A) comes about by changing in the first
factor every i to — i since this change is included in the operation y.
These relations can be re-interpreted for the case of differential operators.
If Dv is a self-adjoint operator, the differential equation

can be conceived as the result of a variational problem. For this purpose


we have to form a scalar quantity Q (replacing the notation s by the more
convenient Q) which under the present circumstances is defined as

in the real case and


228 THE GREEN'S FUNCTION CHAP. 5
in the complex case. However, the characteristic feature of differential
operators is that in the re-interpretation of a matrix relation sometimes an
integration by parts has to be performed.
Consider for example the problem of the elastic bar. By imitating the
matrix procedure we should form the basic scalar s in the form

But now in the first term an integration by parts is possible by which the
order of differentiation can be reduced. The first term is replaceable
(apart from a boundary term which is variationally irrelevant) by

and applying the method a second time, by

The original fourth order operator which appeared in (24), could be replaced
by a second order operator. The quadratic dependence on y(x) has not
changed. Generally an operator of the order 2n can be gradually reduced
to an operator of the order n.
The integral (26) represents the "elastic energy" of the bar which in the
state of equilibrium becomes a minimum. The additional term in Q,
caused by the load distribution f3(x):

represents the potential energy of the gravitational forces. The minimisa-


tion of (24) expresses the mechanical principle that the state of equilibrium
can be characterised as that particular configuration in which the potential
energy of the system is a minimum.
Problem 185. Find the variational integral ("action integral") associated with
the Laplacian operator

[Answer:

Problem 186. Do the same for the "bi-harmonic operator"

[Answer:
SBC. 5.9 THE CALCULUS OF VARIATIONS 229

Problem 187. Show that the system

is deducible from the principle 8Q = 0 if Q is chosen as follows:

Problem 188. Show that the following variational integral yields no differential
equation but only boundary conditions:

5.9. The calculus of variations


The problem of finding the extremum value of a certain integral is the
subject matter of the " calculus of variations ". For our purposes it suffices
to obtain the "stationary value" of a certain integral, irrespective of whether
it leads to a real minimum or maximum. What is necessary is only that
the infinitesimal change 8Q, caused by an arbitrary infinitesimal change
(called "variation") of the function v(x), shall become zero. To achieve
this, we need certain techniques which are more systematically treated in
the calculus of variations.* The following three procedures are of particular
importance:
1. The method of integrating by parts. When we were dealing with the
problem of finding the adjoint operator f)u(x) on the basis of Green's
identity, we employed the technique of "liberation": the derivatives
vW(x) could be reduced to v(x) itself, by integrating by parts. Finally the
resulting factor of v(x) gave us the adjoint operator Du(x) (cf. Chapter 4.12
for the case of ordinary and 4.17 for the case of partial differential operators).
Exactly the same procedure applies to our problem. Whenever a variation
of the form Sv^(x) is encountered, we employ the same procedure of
integrating by parts, until we have Sv(x) itself. Then the factor of Sv(x)
put equal to zero is the sufficient and necessary condition of an extremum
if augmented by the boundary conditions which follow from the requirement
that also the boundary term has to vanish.
2. The method of the Lagrangian multiplier. It may happen that our
task is to find an extremum value (or stationary value) but with certain
restricting conditions (called auxiliary conditions or "constraints"), which
have to be observed during the process of variation. The method of the
Lagrangian multiplier requires that we should add the left sides of the
auxiliary conditions (assumed to be reduced to zero), each one multiplied by
an undetermined factor A, to the given variational integrand, and handle the
new variational problem as a free problem, without auxiliary conditions.
3. The elimination of algebraic variables. Let us assume that the
variational integrand L, called the "Lagrangian function" :

* Cf., e.g., the author's book [6], quoted in the Bibliography of this chapter.
230 THE GREEN'S FUNCTION CHAP. 5
depends on a certain variable w which is present in L, without any derivatives.
Such a variable can be eliminated in advance, by solving for w the equation

and substituting the w thus obtained into L.

Problem 189. By using integration by parts, obtain the differential equations


associated with the following form of the Lagrangian function L:

[Answer:

Problem 190. Derive the differential equation and boundary conditions of the
elastic bar which is free at the two ends (i.e. no imposed boundary conditions)
by minimizing the integral

[Answer:

Problem 191. Do the same for the supported bar; this imposes the boundary
conditions

(and consequently 8v(0) = 8v(l) = 0).


[Answer: Differential equation (6), together with

Problem 192. Show that all linear differential equations which are deducible
from a variational principle, are automatically self-adjoint in both operator and
boundary Conditions.
[Hint: Replace 8v(x) by u(x) and make use of Green's identity.]

5.10. The canonical equations of Hamilton


W. R. Hamilton in 1834 invented an ingenious method by which all
ordinary differential equations which are derivable from a variational
principle can be put in a normal form, called the "canonical form", which
does not involve higher than first derivatives. The problem of the elastic
bar (9.5) is well suited to the elucidation of the method.
SEC. 5.10 THE CANONICAL EQUATIONS OF HAMILTON 231

The Lagrangian of our problem is

We can make this function purely algebraic by introducing the first and the
second derivatives of v(x) as new variables. We do that by putting

and writing L in the form

But in the new formulation we have to add the two auxiliary conditions

which will be treated by the Lagrangian multiplier method. This means


that the original L is to be replaced by

denoting the two Lagrangian multipliers by pi and p%. The new


Lagrangian is of the form

where in our problem

Since, however, the variable ^3 is purely algebraic, we can eliminate it in


advance, according to the method (9.2):

Exactly the same procedure is applicable to differential operators of any


order and systems of such operators. By introducing the proper number
of surplus variables we shall always end with a new Lagrangian L' of the
following form
232 THE GREEN'S FUNCTION CHAP. 5
where H, the "Hamiltonian function", is an explicitly given function of
the variables qt and pt, and of the independent variable x:

Now the process of variation can be applied to the integral

and we obtain the celebrated "canonical equations" of Hamilton:

(The variables were employed in the sequence p\. . . pn; <?i. . . qn>) The
boundary term has the form

Moreover, in the case of linear differential equations the function H cannot


contain the variables pi, qt in higher than second order. The linear part of
H gives rise to a "right side" of the Hamiltonian equations (since the
resulting terms become constants with respect to the pt, qi). The quadratic
part gives rise to a matrix of the following structure. Let us combine the
variables (p\, p%, . . . , pn) to the vector p, and the variables (qi, qz, . . . , qn)
to the vector q. Then the Hamiltonian system (14) may be written in the
following manner:

where the matrices P and Q are symmetric n x n matrices: P = P,Q = Q-


In fact, we can go one step further and put the canonical system in a
still more harmonious form. We will unite the two vectors p and q into
a single vector of a 2w-dimensional space (called the "phase-space"). The
components of this extended vector shall be denoted in homogeneous
notation as follows:
(Pi, P2, • • • , Pn, qi, <ll, • • • , qn) = (Pi, P2, • - - , Pn, Pn+l, Pn+Z, • • • , PZn)
Then the matrices P, R, R, Q can be combined into one single 2n x 2n
symmetric matrix
SEC. 5.10 THE CANONICAL EQUATIONS OF HAMILTON 233

and the entire canonical system may be written in the form of one unified
scheme:

if we agree that the notation pzn+i shall have the following significance:

Hamilton's discovery enables us to put all variational problems into a


particularly simple and powerful normal form, namely the "canonical"
form. Hence the proper realm of the canonical equations is first of all the
field of self-adjoint problems. In fact, however, any arbitrary non-self-
adjoint problem can be conceived as the result of a variational problem and
thus the canonical equations assume universal significance. Indeed, an
arbitrary linear differential equation Dv(x) = fi(x) can be conceived as the
result of the following variational principle:

because the variation of u(x) gives at once:

while the variation of v(x) adds the equation

which can be added without any harm since it has no effect on the solution
of the equation (21).
To obtain the resulting canonical scheme we proceed as follows. By the
method of surplus variables we introduce the first, second, . . . , n — 1st
derivative as new variables. We will illustrate the operation of the
principle by considering the general linear differential equation of third
order:
Ai(x)vm(x) + A2(x)v"(x) + A3(x)v'(x) + AI(X}V(X) = P(x) (5.10.23)
We denote v by #4 and introduce two new variables p$ and p$ by putting

Hence A\v" = p$. The first term of (23) may be written in a slightly
modified form:

and thus the given differential equation (23) may now be formulated as the
following first order system:
234 THE GKEEN'S FUNCTION CHAP. 5
If now we multiply by the undetermined factors pi, p2, pz and apply in
the first term the usual integration by parts technique, we obtain the
adjoint system in the form

The two systems (26) and (27) can be combined into the single system

where the matrix Ctk has the following elements

This is a special case of the general canonical scheme (16), with a matrix
(17) in which the n x n matrices P and Q are missing and thus the operators
D and D fall apart, without any coupling between them. But the canonical
equations of Hamilton are once more valid.
The procedure we followed here has one disadvantage. In the case that
the given differential operator is self-adjoint, we may destroy the self-adjoint
nature of our equation and thus unnecessarily double the number of equations,
in order to obtain the canonical scheme. An example was given in the
discussion of the problem of the elastic bar (cf. Section 7), there we used the
method of surplus variables and succeeded in reducing the given fourth
order equation into a system of first order equations (7.20), which, however,
were not self-adjoint, although the system (8.2) demonstrated that the same
system can also be given in self-adjoint form. This form was deducible if
we knew the action integral from which the problem originated by the
process of variation. Then the Hamiltonian method gave us the canonical
system (8.2). But let us assume that we do not know in advance that our
problem can be put in self-adjoint form. Our equations are given in the
non-self-adjoint form (7.20). Is there a way of transforming this system into
the proper canonical form of four instead of eight equations?
SEC. 5.10 THE CANONICAL EQUATIONS OF HAMILTON 235

If we can assume that a certain linear differential equation is deducible


from a variational principle, the action integral of that principle must have
the form (8.22). Hence we can obtain the Lagrangian of the alleged
variational problem by putting

We can now go through with the Hamiltonian scheme and finally, after
obtaining the resulting self-adjoint system, compare it with the given
system and see whether the two systems are in fact equivalent or not.
In our problem (7.20) we have given our differential equation in the form

with the three auxiliary conditions

Hence our Lagrangian L, modified by the Lagrangian multipliers, now


becomes:

(v = vi since the fundamental variable of our problem is Vi).


The first term is variationally equivalent to

and then L' appears in the following form:

Let us replace pi by the notation pi and put

Let us likewise replace pz by pz and put

With these substitutions L' becomes

But now ^4 is purely algebraic and putting the partial derivative with respect
to ^4 equal to zero we obtain the condition

due to which L' is reducible to


236 THE GREEN'S FUNCTION CHAP. 5
and replacing pz + \v% by a new pz '

Here we have obtained the canonical Lagrangian function, further simplifiable


by the fact that v$ is purely algebraic which leads to the elimination of v%
bv the condition

What remains can be written in the form

with

The canonical equations become, replacing vi and vz by p$ and p^:

The equivalence of this canonical system with the original system (31) and
(32) is easily established if we make the following identifications:

The matrix C becomes in our case:

The symmetry of this matrix expresses the self-adjoint character of our


system.
Problem 193. Assume the boundary conditions

Show the self-adjoint character of the system (14), (15), denoting the adjoint
variables by p^ <ft. Do the same for the boundary conditions
SEC. 5.11 THE HAMILTONISATION OF PAETIAL OPERATORS 237

Problem 194. The planetary motion is characterised by a variational principle


whose integral is of the following form:

(m = mass of planet, considered as point; r, 8 = polar coordinates of its position,


V(r) =' potential energy of the central force). Obtain the Hamiltonian form of
the equations of motion by considering r, 6 as q\, q% and introducing r', 6' as
added variables 33, 54.
[Answer:

5.11. The Hamiltonisation of partial operators


Exactly the same method is applicable to the realm of partial differential
operators, permitting us to reduce differential equations of arbitrary order
to first order differential equations which are once more of the self-adjoint
type if both differential equation and boundary conditions are the result of
a variational principle. For example the potential equation

is derivable by minimising the variational integral

We introduce the partial derivatives of <p as new variables

considering these equations as auxiliary conditions of our variational


problem. Then the application of the Lagrangian multiplier method yields
the new Lagrangian
238 THE GREEN'S FUNCTION CHAP. 5

with

Since rpi, 952, <ps, are purely algebraic variables (their derivatives do not
appear in U), they can be eliminated, obtaining:

The Hamiltonian system for the four variables pi, pz, ps, <p becomes:

with the boundary term

Problem 195. According to Problem 170 (Chapter 4.18) the differential equation

with the boundary condition c):

belongs to a self-adjoint operator. Hence this system must be deducible from a


variational principle &Q = 0. Find the action integral Q of this principle.
[Answer:

Problem 196. Reduce the biharmonic equation

(cf. Problem 186, Section 8) to a Hamiltonian system of first order equations.


[Answer:
SEC. 5.12 THE RECIPROCITY THEOREM 239

5.12. The reciprocity theorem


The general treatment of Green's function has shown that the solution
of a differential equation by the Green's function method is not restricted
to even-determined systems but equally applicable to arbitrarily over-
determined systems. The only condition necessary for the existence of a
Green's function was that the system shall be complete, that is the equation

shall have no solution other than the trivial solution v(x) = 0. If the
adjoint homogeneous system

possessed solutions which did not vanish identically, the Green's function
method did not lose its significance. It was merely necessary that the
given right side should be orthogonal to every independent solution of the
equation (2):

We have seen examples where a finite or an infinite number of such


"compatibility conditions" had to be satisfied.
However, in a very large number of problems the situation prevails that
neither (1) nor (2) has non-vanishing solutions. This cases realises in
matrix language the ideal condition

The number of equations, the number of unknowns and the rank of the
matrix all coincide. In that case we have Hadamard's "well-posed"
problem: the solution is unique and the given right side can be chosen freely.
Under such conditions the Green's function possesses a special property
which leads to important consequences. Let us consider, together with the
problem

the adjoint problem

Since the homogeneous equation (2) has (according to our assumption) no


240 THE GREEN'S FUNCTION CHAP. 5
non-vanishing solutions, the general condition for the existence of a Green's
function is satisfied and we obtain the solution of (6) in the form

We have chosen the notation &(x, £) to indicate that the Green's function
of the adjoint problem is meant. The defining equation of this function is
(in view of the fact that the adjoint of the adjoint is the original operator):

The active variable is again £, while a; is a mere parameter.


This equation, however, can be conceived as a special case of the equation

whose solution is obtainable with the help of the Green's function 0(x, |),
although we have now to replace x by £ and consequently choose another
symbol, say a, for the integration variable:

If we now identify jS(a) with S(x, a) in order to apply the general solution
method (10) to the solution of the special equation (8), the integration over
CT is reduced to the immediate neighbourhood of the point a = x and we
obtain in the limit (as e converges to zero), on the right side of (10):

while the left side v(£) is by definition @(x, £). This gives the fundamental
result

which has the following significance: The solution of the adjoint problem (6)
can be given with the help of the same Green's function G(x, £) which solved the
original problem. All we have to do is to exchange the role of "fixed point"
and '' variable point''.

In a similar manner we can give the solution of the original problem in


terms of the Green's function of the adjoint problem

This result can be expressed in still different interpretation. The defining


equation of &(x, £) was the equation (8). Since @(x, £) is replaceable by
G(£, x), the running variable £ becomes now the first variable of the function
G(x, £). We need not change our notation G(x, £) if we now agree to consider
x as the active variable and £ as a mere parameter. The equation (8) can
then be written in the following form (in view of the fact that the delta
function 8 (x, £) is symmetric in x and £):
SEC. 5.13 SELF-ADJOINT PROBLEMS 241

The underlining of x shall indicate that it is no longer £ but x which became


the active variable of our problem while the point £ is kept fixed during
the solution of the equation.
This remarkable reciprocity between the solution of the original and the
adjoint equations leads to a new interpretation of the Green's function in
the case of well-posed problems. Originally we had to construct the adjoint
differential equation and solve it, with the delta function on the right side.
Now we see that we can remain with the original equation and solve it
once more with the delta function on the right side, but with the difference
that in the first case £ is the variable and x the fixed point; while in the
second case x is the variable and £ the fixed point. In both cases the same
G(x, £) is obtained.
Problem 197. Explain why the reciprocity relation (12) cannot be generalised
(without modification) to problems which do not satisfy the two conditions
that neither (1) nor (2) shall possess non-vanishing solutions.

5.13. Self-adjoint problems. Symmetry of the Green's function


If the given problem is self-adjoint, the operators D and D coincide.
In this case the Green's function G(x, g) of the adjoint problem must coincide
with Q(x, g) since the solution of the given differential equation is unique
and allows the existence of a single Green's function only. The theorem
(12.12) now takes the form

This is a fundamental result which has far-reaching consequences. It
means that the position of the two points x and ξ can be exchanged without
changing the value of the kernel function G(x, ξ). Since all differential
equations derivable from a variational principle are automatically self-
adjoint, and the equations of elastic equilibrium are derivable from the
variational principle that the potential energy of the elastic forces must
become a minimum, the equations of elastic deflection provide an example
of a self-adjoint system. It was in conjunction with this example that the
symmetry of the Green's function was first enunciated by J. C. Maxwell,
who expressed this fundamental theorem in physical language: "The elastic
deflection generated at the point P by a point load located at Q is equal to
the elastic deflection generated at the point Q by a point load located at the
point P." This "reciprocity theorem" of Maxwell is equivalent to the
statement (1), since Maxwell's "point load" is a complete physical counter-
part of Dirac's delta function δ(x, ξ).
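A minimal numerical sketch of this symmetry, assuming nothing beyond the discretised operator −d²/dx² with vanishing boundary values as a stand-in for a self-adjoint problem:

```python
import numpy as np

# Discrete stand-in for a self-adjoint problem: the operator -d^2/dx^2 with
# vanishing boundary values becomes a symmetric tridiagonal matrix A, and the
# Green's function becomes G = A^{-1}.  Column j of G is the "deflection"
# produced by a discrete point load at node j, and the symmetry of A forces
# G[i, j] = G[j, i] -- Maxwell's reciprocity in matrix form.
n, h = 50, 1.0 / 51
A = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
G = np.linalg.inv(A)
print(np.allclose(G, G.T))        # True
```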

5.14. Reciprocity of the Green's vector


In the case of systems of differential equations we had to introduce the
concept of a "Green's vector" Gₖ(x, ξ)ⱼ, and generally the number of
k-components: k = 1, 2, . . . , μ, and the number of j-components: j = 1, 2,
. . . , ν did not agree. But if we restrict ourselves to the "well-posed" case
of Section 12, we must assume that now the number of equations ν and the
number of unknowns μ coincide:

μ = ν     (1)
We will once more assume that neither the given homogeneous system, nor
the adjoint homogeneous system has non-vanishing solutions. We can now
investigate the role of the reciprocity theorem (12.12), if the Green's function
G(x, ξ) is changed to the Green's vector Gₖ(x, ξ)ⱼ. This question can be
answered without further discussion since we have seen that the subscripts
k and j can be conceived as extended dimensions of the variables x and ξ.
Accordingly the theorem (12.12) will now take the form

and this means, if we return to the customary subscript notation:

G̃ₖ(x, ξ)ⱼ = Gⱼ(ξ, x)ₖ     (3)
Moreover, in transcribing the equation (12.15) to the present case, we now
obtain the result that the same Green's vector Gₖ(x, ξ)ⱼ may be characterised
in two different ways, once by solving the vectorial system

D̃ u(ξ) = δ(x, ξ)ₖ     (4)

considering the variable ξ as an integration variable while the point x is a
mere parameter (and so is the subscript k which changes only by shifting
the delta function on the right side from equation to equation), and once by
solving the system

D v(x̲) = δ(x, ξ)ⱼ     (5)
where the active variable is now x—together with the subscript k—while ξ
and j are constants during the process of integration.
This means that the same Green's vector Gₖ(x, ξ)ⱼ can be obtained in two
different ways. In the first case we write down the adjoint system, putting
the delta function in the kth equation and solving the system for uⱼ(ξ). In
the second case we write down the given system, putting the delta function
in the jth equation and solving the system for vₖ(x). The general theory
shows that both definitions yield the same function Gₖ(x, ξ)ⱼ. But the
two definitions do not coincide in a trivial fashion since in one case ξ, in the
other x, is the active variable. In a special case it may not even be simple
to demonstrate that the two definitions coincide without actually con-
structing the explicit solution.
If our system is self-adjoint, the second and the first system of equations
become identical and we obtain the symmetry condition of the Green's
vector in the form:

Gₖ(x, ξ)ⱼ = Gⱼ(ξ, x)ₖ     (6)

In proper interpretation the reciprocity theorem (12.12)—and its general-
ised form (14.3)—can be conceived as a special application of the symmetry
theorem (14.6), although the latter theorem is restricted to self-adjoint
equations. It so happens that every equation becomes self-adjoint if we
complement it by the adjoint equation. Let us consider the system of two
simultaneous differential equations, arising from uniting (12.5) and (12.6)
into a single system:

D v(x) = β(x)
D̃ u(x) = γ(x)     (7)
Now we have a self-adjoint problem in the variables u, v (the sequence is
important: u is the first, v the second variable), with a pair of equations.
Accordingly we have to solve this system with a Green's vector which has
altogether 4 components, viz. G₁(x, ξ)₁,₂ and G₂(x, ξ)₁,₂. Considering ξ as
the active variable we need the two components G₁(x, ξ)₁,₂ for the solution
of the first function—that is u(x)—and the two components G₂(x, ξ)₁,₂ for
the solution of the second function, that is v(x). The first two components
are obtained by putting the delta function in the first equation and solving
the system for u(ξ), v(ξ). Since, however, there is no coupling between the
two equations, we obtain u(ξ) = 0, while in the case of the last two com-
ponents (when the delta function is in the second equation), v(ξ) = 0.
This yields:

G₁(x, ξ)₁ = 0,   G₂(x, ξ)₂ = 0     (8)
For the remaining two components we have the symmetry condition

G₁(x, ξ)₂ = G₂(ξ, x)₁     (9)

But in the original interpretation G₁(x, ξ)₂ was denoted by G(x, ξ), while
G₂(x, ξ)₁ was denoted by G̃(x, ξ). And thus the symmetry relation (14.9)
expresses in fact the reciprocity theorem

G̃(x, ξ) = G(ξ, x)     (10)

obtained earlier (in Section 12) on a different basis. By the same reasoning
the generalised reciprocity theorem (3) can be conceived as an application
of the symmetry relation (6), if again we complement the given vectorial
system by the adjoint vectorial system, which makes the resultant system
self-adjoint.
Problem 198. Define the Green's function G(x, ξ) of Problem 183 (cf. (7.20)) by
considering x as the active variable.
[Answer:

with the boundary conditions


Problem 199. Consider the same problem in the self-adjoint form (8.2) and
carry through the same procedure. Show the validity of the two definitions
(7.25-26) and (14.11-12) by considering once ξ and once x as the active variable.

5.15. The superposition principle of linear operators


The fact that Dv(x) is a linear operator has some far-reaching consequences,
irrespective of the dimensionality of the point x. In particular, the
linearity of the operator D finds expression in two fundamental operational
equations which are lost in the case that D is non-linear:

D(v₁ + v₂) = Dv₁ + Dv₂     (1)
D(αv) = α Dv     (2)

provided that α is a constant throughout the given range. We thus speak
of the superposition principle of linear operators which finds its expression
in the equations (1) and (2). In consequence of these two properties we
see that if

β(x) = α₁β₁(x) + α₂β₂(x) + · · · + α_p β_p(x)     (3)

we can solve the equation

Dv(x) = β(x)     (4)

by solving the special equations

Dvᵢ(x) = βᵢ(x)      (i = 1, 2, . . . , p)     (5)

Then

v(x) = α₁v₁(x) + α₂v₂(x) + · · · + α_p v_p(x)     (6)
We can extend this method to the case that p goes to infinity. Let us
assume that

β(x) = α₁β₁(x) + α₂β₂(x) + · · ·     (7)

We solve the equations

Dvᵢ(x) = βᵢ(x)      (i = 1, 2, . . .)     (8)

and form the sum

v_N(x) = Σ from i = 1 to N of αᵢvᵢ(x)     (9)

Then, if v_N(x) approaches a definite limit as N increases to infinity:

v(x) = lim (N → ∞) v_N(x)     (10)
this v(x) becomes the solution of the equation (4).


Now an arbitrary sectionally continuous function can be approximated
with the help of "pulses" which are of short duration and which follow
each other in proper sequence. We can illustrate the principle by consider-
ing the one-dimensional case in which x varies between the limits a and b.
The "unit-pulse" has the width ε and the height 1/ε. The centre of the
pulse can be shifted to any point we like and if we write δ_ε[x, a + (ε/2)], this
will mean that the unit pulse, illustrated between the points O and P of
the figure, is shifted to the point x = a, extending between the points
x = a and x = a + ε, and being zero everywhere else. The height of this
pulse becomes 1 if we multiply by ε and we obtain the first panel of our
figure by writing

f(a) ε δ_ε[x, a + (ε/2)]
Similarly the second panel of our figure is obtained by writing

f(a + ε) ε δ_ε[x, a + (3ε/2)]

The third panel becomes

f(a + 2ε) ε δ_ε[x, a + (5ε/2)]
and so on. That our continuous function f(x) is crudely represented by
these panels should not disturb us since we have ε in our hand. By making
ε smaller and smaller the difference between f(x) and the approximation
by pulses can be made as small as we wish and in the limit, as ε decreases to
zero, we obtain f(x) with the help of a succession of delta functions. We
have to sum, of course, over all these panels, in order to obtain f(x), but this
sum changes to an integral as ε recedes to zero. And thus we can generate
f(x) by a succession of pulses, according to the equation

f(x) = ∫ from a to b of f(ξ) δ(x, ξ) dξ     (11)
The multi-dimensional case is quite similar and the equation (11) expresses
the generation of f(x) by a superposition of pulses if we omit the limits a
and b and replace them by the convention that our integral is to be
extended over the entire given domain of our problem, dξ denoting the
volume-element of the domain.
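A small numerical illustration of the pulse construction (a sketch only; the test function sin x and the grid are arbitrary choices): the staircase of panels f(a + iε)·ε·δ_ε approaches f as ε shrinks.

```python
import numpy as np

# Each panel contributes f(a + i*eps) * eps * delta_eps; at any point x exactly
# one pulse is non-zero (with height 1/eps), so the sum of all panels is simply
# the staircase value f(a + i*eps) of the panel covering x.
def panel_sum(f, a, eps, x):
    i = np.floor((x - a) / eps)
    return f(a + i * eps)

x = np.linspace(0.0, 1.0, 1000, endpoint=False)
for eps in (0.1, 0.01, 0.001):
    err = np.max(np.abs(panel_sum(np.sin, 0.0, eps, x) - np.sin(x)))
    print(eps, err)               # the maximum deviation shrinks with eps
```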
Let us now return to our equation (3). We will generate the right side
β(x) in terms of pulses, according to the equation

β(x) = ∫ β(ξ) δ(x, ξ) dξ     (12)

The previous αᵢ corresponds to β(ξ), the previous βᵢ(x) to δ(x, ξ). Accordingly
the equation (5) now becomes

D v(x, ξ) = δ(x, ξ)     (13)

replacing the notation vᵢ(x) by v(x, ξ). In order to bring into evidence that
we have constructed a special auxiliary function which depends not only on
x but also on the position of the point ξ (which is a mere constant from the
standpoint of solving the differential equation (13)), we will replace the
notation v(x, ξ) by G(x, ξ):

D G(x, ξ) = δ(x, ξ)     (14)

The superposition principle (9) now becomes

v(x) = ∫ G(x, ξ) β(ξ) dξ     (15)
Once more we have obtained the standard solution of a linear differential
equation, in terms of the Green's function. But now all reference to the
adjoint system has disappeared. We have defined G(x, ξ) solely on the
basis of the given equation, making use of the superposition principle of
linear operators. The result is the same as that obtained in Section 12,
but now re-interpreted in the light of the superposition principle. It is of
interest to observe that the solvability of (14) demands that the adjoint
equation D̃u(x) = 0 shall have no non-vanishing solutions, while the definition
of G(x, ξ) on the basis of the adjoint equation demanded that Dv(x) = 0
should have no non-vanishing solutions. Hence the definition of G(x, ξ) as
a function of ξ excludes under-determination, the definition of G(x, ξ) as a
function of x excludes over-determination.
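A discrete sketch of the definitions (14)-(15), assuming the concrete operator −d²/dx² with v(0) = v(1) = 0 (any well-posed choice would do): the columns of the inverted difference matrix are the responses to discrete unit pulses, and the superposition integral (15) turns into a matrix-vector product.

```python
import numpy as np

# -v'' = beta with v(0) = v(1) = 0 on a uniform grid.  Column j of A^{-1}/h is
# the response to the discrete unit pulse at x_j (the grid Green's function),
# and the superposition integral (15) becomes Gmat @ beta * h.
n, h = 200, 1.0 / 201
x = np.linspace(h, 1 - h, n)
A = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
Gmat = np.linalg.inv(A) / h                  # Gmat[i, j] ~ G(x_i, xi_j)
beta = np.sin(np.pi * x)
v = Gmat @ beta * h                          # v(x) = integral of G(x, xi) beta(xi)
print(np.max(np.abs(v - beta / np.pi**2)))   # small: the exact solution is recovered
```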
In the case of a system of differential equations (cf. (7.9)):

the right side can once more be obtained as a superposition of pulses. But
now we have to write

where again δ(x, ξ)ⱼ denotes that the delta function δ(x, ξ) is put in the
jth equation (while all the other right sides are zero). Once more the result
agrees with the corresponding result (14.5) of our previous discussion, but
here again obtained on the basis of the superposition principle.
Problem 200. On the basis of the superposition principle find the solution of the
following boundary value problems:

Explain the peculiar behaviour of c).
[Answer:

c) holds only if ρ = 2k = even integer, because the adjoint homogeneous
equation has a non-zero solution and thus the right side of c) cannot be pre-
scribed freely; the compatibility condition holds only for the special case ρ = 2k.]
Problem 201. Given once more the differential equation (20), with the boundary
conditions

Find the compatibility condition between α and β and explain the situation
found above (why is ρ = k forbidden?).
[Answer: By Green's identity:

5.16. The Green's function in the realm of ordinary differential equations


While it is simple enough to formulate the differential equation by which
the Green's function can be defined, it is by no means so simple to find the
explicit solution of the defining equation. In the realm of partial differential
equations we have only a few examples which allow an explicit construction
of the Green's function. In the realm of ordinary differential equations,
however, we are in a more fortunate position and we can find numerous
examples in which we succeed with the actual construction of the Green's
function. This is particularly so if the given differential equation has
constant coefficients.
Let us then assume that our x belongs to a one-dimensional manifold
which extends from a to b. Then only ordinary derivatives occur. More-
over, the homogeneous equation—with zero on the right side—now possesses
only a finite number of solutions, depending on the order of the differential
equation. An nth order equation allows n constants of integration and has
thus n independent solutions.
of n equations. These homogeneous solutions play an important part in
the construction of the Green's function, in fact, if these homogeneous
solutions are known, the construction of the Green's function is reducible to
the solution of a simultaneous set of ordinary algebraic equations.
Our aim is the solution of the differential equation

D̃ u(ξ) = δ(x, ξ)     (1)

or else the solution of the differential equation

D v(x̲) = δ(x, ξ)     (2)

provided that the homogeneous equations Dv(x) = 0 and D̃u(x) = 0 under
the given homogeneous boundary conditions have no non-vanishing solutions.
This we want to assume for the time being.
We will now make use of the fact that the delta function on the right
side of (1) and (2) is everywhere zero, except in the infinitesimal neighbour-
hood of the point x = ξ. This has the following consequence. Let us
focus our attention on the equation (1), considering ξ as the active variable.
Then we can divide the range of ξ into two separate parts: a < ξ < x, and
x < ξ < b. In both realms the homogeneous equation

D̃ u(ξ) = 0     (3)
is valid and thus, if we forget about boundary conditions, we can obtain
the solution by the superposition of n particular solutions, each one multi-
plied by an undetermined constant cᵢ. This gives apparently n degrees of
freedom. In actual fact, however, there is a dividing line between the two
realms, at the point ξ = x which is a common boundary between the
two regions. In consequence of the delta function which exists at the
point ξ = x, we cannot assume that the same analytical solution will exist
on both sides. We have to assume one homogeneous solution to the left
of the point ξ = x, and another homogeneous solution to the right of the
point ξ = x. This means 2n free constants of integration, obtaining u(ξ)
as a superposition of n particular solutions to the left, and n particular
solutions to the right:

u(ξ) = A₁u₁(ξ) + · · · + Aₙuₙ(ξ)      (a ≤ ξ < x)
u(ξ) = B₁u₁(ξ) + · · · + Bₙuₙ(ξ)      (x < ξ ≤ b)     (4)
The n prescribed homogeneous boundary conditions will yield n simultaneous
homogeneous algebraic equations for the 2n constants Aᵢ and Bᵢ. This
leaves us with n remaining degrees of freedom which have to be obtained by
joining the two solutions (4) at the common boundary ξ = x. Let us
investigate the nature of this joining.
As we have done before, we will replace the delta function δ(x, ξ) by the
finite pulse δ_ε(x, ξ), that is a pulse of the width ε and the height 1/ε. We
will now solve the differential equation

y′(ξ) = δ_ε(x, ξ)     (5)

Since the delta function is zero everywhere to the left of x, up to the point
ξ = x − (ε/2), y(ξ) must be a constant C in this region. But the delta
function is zero also to the right of x, beyond the point ξ = x + (ε/2). Hence
y(ξ) must again be a constant in this region. But will it be the same
constant C we had on the left side? No, because the presence of the pulse
in the region between ξ = x − (ε/2) and x + (ε/2) changes the course of
the function y(ξ). In this region we get

y(ξ) = C + [ξ − x + (ε/2)]/ε     (6)

and thus we arrive at the point ξ = x + (ε/2) with the value C + 1. If
now we let ε go to zero, the rate of increase between the points x ± (ε/2)
becomes steeper and steeper, without changing the final value C + 1. In
the limit ε = 0 the solution y(ξ) = C extends up to the point x coming from
the left, and y(ξ) = C + 1 coming from the right, with a point of dis-
continuity at the point ξ = x. The magnitude of the jump is 1.
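The limit process is easy to watch numerically (a sketch; the point x = 0.7 is an arbitrary choice): integrating the finite pulse gives a ramp of width ε, and the total rise stays exactly 1 however small ε becomes.

```python
import numpy as np

# y'(xi) = delta_eps(x, xi) integrated exactly: y is the constant C to the left
# of the pulse, climbs linearly across [x - eps/2, x + eps/2], and is C + 1 to
# the right; the climb steepens as eps -> 0 but the jump is always 1.
def y(xi, x, eps, C=0.0):
    return C + np.clip((xi - (x - eps/2)) / eps, 0.0, 1.0)

x = 0.7
for eps in (0.5, 0.05, 0.005):
    print(eps, y(x + eps, x, eps) - y(x - eps, x, eps))   # always 1.0
```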
Now let us change our differential equation to

y′(ξ) + a(ξ) y(ξ) = δ_ε(x, ξ)     (7)

Then y(ξ) is no longer a constant on the two sides of the point x but a
function of ξ, obtainable by solving the homogeneous equation

y′(ξ) + a(ξ) y(ξ) = 0     (8)

But in the narrow region between ξ = x ± (ε/2) we can write our equation
in the form

y′(ξ) = δ_ε(x, ξ) − a(ξ) y(ξ)     (9)

and we find that now the increase of y(ξ) in this narrow region is not exactly
1, in view of the presence of the second term, but the additional term is
proportional to ε and goes to zero with ε going to zero. Hence the previous
result that y(ξ) suffers a jump of the magnitude 1 at the point ξ = x remains
once more true. Nor is any change encountered if we add on the left side
terms which are still smoother than y(ξ), namely y⁽⁻¹⁾(ξ), y⁽⁻²⁾(ξ), . . . , where
these functions are the first, second, . . . , integrals of y(ξ). As in (9), all
these terms can be transferred to the right side and their contribution to
the jump of y(ξ) at ξ = x is of the order ε², ε³, . . . , with the limit zero as ε
goes to zero.
Now we can go one step further still and assume that the coefficient of
y′(ξ) in the equation (7) is not 1 but a certain continuous function p(ξ)
which we will assume to remain of the same sign throughout the range [a, b].
In the neighbourhood of the point ξ = x this function can be replaced by
the constant value p(x). By dividing the equation by this constant we
change the height of the delta function by the factor p⁻¹(x). Accordingly
the jump of y(ξ) at the point ξ = x will no longer be 1 but 1/p(x).
Let us now consider an arbitrary ordinary linear differential operator
D̃u(ξ) of nth order. This equation is exactly of the type considered before,
if we identify u⁽ⁿ⁾(ξ) with y′(ξ), that is y(ξ) with u⁽ⁿ⁻¹⁾(ξ). Translating our
previous result to the new situation we arrive at the following result: The
presence of the delta function at the point ξ = x has the consequence that u(ξ),
u′(ξ), u″(ξ), . . . , up to u⁽ⁿ⁻²⁾(ξ) remain continuous at the point ξ = x, while
u⁽ⁿ⁻¹⁾(ξ) suffers a jump of the magnitude 1/p(x), if p(ξ) is the coefficient of the
highest derivative u⁽ⁿ⁾(ξ) of the given differential operator.
In view of this result we can dispense with the delta function in the case
of ordinary differential equations, and characterise the Green's function
G(x, ξ) by the homogeneous differential equation

D̃ G(x, ξ) = 0      (ξ ≠ x)     (10)

to the left and to the right from the point ξ = x, if we complement this
equation by the aforementioned continuity and discontinuity conditions.
We can now return to our previous only partially solved problem of
obtaining the 2n undetermined constants Aᵢ, Bᵢ of the system (4). The
given boundary conditions yield n homogeneous algebraic equations. Now
we add n further conditions. The condition that u(x), u′(x), . . . , u⁽ⁿ⁻²⁾(x)
must remain continuous, whether we come from the left or from the right,
yields n − 1 homogeneous algebraic equations between the Aᵢ and the Bᵢ.
The last condition is that u⁽ⁿ⁻¹⁾(x) is not continuous but makes a jump of
the magnitude

1/p(x)     (11)

p(ξ) being the coefficient of u⁽ⁿ⁾(ξ) of the adjoint operator D̃u(ξ). This last
condition is the only non-homogeneous equation of our algebraic system of 2n
unknowns Aᵢ, Bᵢ.
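The recipe is mechanical enough to hand to a linear-equation solver. The following sketch carries it out for the simplest illustrative case u″ = δ(x, ξ) with u(0) = u(l) = 0 (the fundamental solutions 1 and ξ are those of u″ = 0): two boundary rows, one continuity row and one jump row determine the four constants.

```python
import numpy as np

# Fundamental solutions of u'' = 0 and their derivatives.
u  = (lambda t: 1.0, lambda t: t)
du = (lambda t: 0.0, lambda t: 1.0)
l, x = 1.0, 0.3
M = np.array([
    [u[0](0), u[1](0), 0, 0],                    # left branch: u(0) = 0
    [0, 0, u[0](l), u[1](l)],                    # right branch: u(l) = 0
    [u[0](x), u[1](x), -u[0](x), -u[1](x)],      # continuity of u at xi = x
    [du[0](x), du[1](x), -du[0](x), -du[1](x)],  # jump of u': left minus right = -1
])
A1, A2, B1, B2 = np.linalg.solve(M, np.array([0, 0, 0, -1.0]))
xi = 0.2                                          # a point to the left of x
print(A1*u[0](xi) + A2*u[1](xi), -(1 - x)*xi)     # matches the known kernel
```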
As an example let us construct the Green's function of the differential
equation of the vibrating spring

v″(x) + ρ² v(x) = β(x)     (12)

with the boundary conditions

v(0) = v(l),   v′(0) = v′(l)     (13)

This problem is self-adjoint, that is u(ξ) satisfies the same differential
equation and the same boundary conditions as v(ξ). The homogeneous
equation has the two solutions

u₁(ξ) = cos ρξ,   u₂(ξ) = sin ρξ     (14)

and thus we set up the system

u(ξ) = A₁ cos ρξ + A₂ sin ρξ      (ξ < x)
u(ξ) = B₁ cos ρξ + B₂ sin ρξ      (ξ > x)     (15)

The boundary conditions yield the two relations

A₁ = B₁ cos ρl + B₂ sin ρl
A₂ = −B₁ sin ρl + B₂ cos ρl     (16)
The continuity of u(ξ) at ξ = x yields the further condition

A₁ cos ρx + A₂ sin ρx = B₁ cos ρx + B₂ sin ρx     (17)

and finally the jump-condition of u′(ξ) at ξ = x demands the relation

ρ(−B₁ sin ρx + B₂ cos ρx) − ρ(−A₁ sin ρx + A₂ cos ρx) = 1     (18)

These four equations determine the four constants A₁, A₂, B₁, B₂ of our
solution as follows:

and thus

G(x, ξ) = cos ρ(x − ξ − l/2) / (2ρ sin ρl/2)      (ξ ≤ x)
G(x, ξ) = cos ρ(ξ − x − l/2) / (2ρ sin ρl/2)      (ξ ≥ x)     (20)
We can now test the symmetry condition of the Green's function

G(x, ξ) = G(ξ, x)     (21)

which must hold in view of the self-adjoint character of our problem. This
does not mean that the expressions (20) must remain unchanged for an
exchange of x and ξ. An exchange of x and ξ causes the point ξ to come
to the right of x if it was originally to the left, and vice versa. What is
demanded then is that an exchange of x and ξ changes the left Green's
function to the right Green's function, and vice versa:

This, however, is actually the case in our problem since the second expression
of (20) may also be written in the form

cos ρ(x − ξ + l/2) / (2ρ sin ρl/2)     (23)
We also observe that the Green's function ceases to exist if the constant
ρ assumes the values

ρ = 2πk/l      (k = 1, 2, 3, . . .)     (24)

because then the denominator becomes zero. But then we have violated
the general condition always required for the existence of the Green's
function, namely that the homogeneous equation (under the given boundary
conditions) must have no solutions which do not vanish identically. If the
condition (24) is satisfied, then the homogeneous solutions (14) satisfy the
boundary conditions (13) and the homogeneous problem now has non-zero
solutions. The modification of our treatment to problems of this kind will
be studied in a later section (see Section 22).
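A numerical spot-check of the expressions (20)-(24) as reconstructed here (a sketch; ρ = 1.3 is an arbitrary non-critical value): the kernel is symmetric, fits the periodic boundary conditions (13), and its ξ-derivative jumps by 1 at ξ = x.

```python
import numpy as np

rho, l = 1.3, 1.0                    # any rho away from the critical values (24)

def G(x, xi):                        # the two branches of (20) in one formula
    return np.cos(rho*(abs(x - xi) - l/2)) / (2*rho*np.sin(rho*l/2))

x, h = 0.4, 1e-6
print(G(x, 0.7) - G(0.7, x))         # ~0: symmetry (21)
print(G(x, 0.0) - G(x, l))           # ~0: periodicity in xi
slope_r = (G(x, x + 2*h) - G(x, x + h)) / h      # dG/dxi just right of x
slope_l = (G(x, x - h) - G(x, x - 2*h)) / h      # dG/dxi just left of x
print(slope_r - slope_l)             # ~1: the prescribed jump 1/p(x)
```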
Problem 202. Find the Green's function for the problem

[Answer:

Problem 203. Do the same for (25) with the boundary conditions

[Answer:

Problem 204. Find the Green's function for the problem of the vibrating
spring excited by an external force:

Obtain the result by considering once ξ and once x as the active variable.
[Answer:

Problem 205. Find the Green's function for the motion of the "ballistic
galvanometer"
[Answer:

Problem 206. If the constant γ in the previous problem becomes very large, the
first term of the differential equation becomes practically negligible and we obtain

Demonstrate this result by solving (30) with the help of the Green's function.
Problem 207. Solve the differential equation of the vibrating spring in a resisting
medium:

with the help of the Green's function and discuss particularly the case of
"critical damping" p = a.
[Answer:

Problem 208. Solve with the help of the Green's function the differential
equation

and compare the solution with that obtained by the "variation of the constants"
method.
[Answer:

Problem 209. The differential equation of the loaded elastic bar of uniform
cross-section [cf. Section 4.14, with I(x) = 1] is given by

Obtain the Green's function under the boundary conditions

(bar free at ξ = l, clamped at ξ = 0).


[Answer:

5.17. The change of boundary conditions


Let us assume that we have obtained the Green's function of an ordinary
differential equation problem under certain boundary conditions, and we
are now interested in the Green's function of the same differential operator,
but under some other boundary conditions. Then it is unnecessary to
repeat the entire calculation. In both cases we have to solve the same
differential equation (16.1) and consequently the difference of the two
Green's functions satisfies the homogeneous equation

D̃ [G₂(x, ξ) − G₁(x, ξ)] = 0     (1)
We thus see that we can come from the one Green's function to the other
by adding some solution of the homogeneous equation. The coefficients of
this solution have to be adjusted in such a way that the new boundary
conditions shall become satisfied.
For example the Green's function of the clamped-free uniform bar came
out in the form (16.38). Let us now obtain the Green's function of the
clamped-clamped bar:

u(0) = u′(0) = 0,   u(l) = u′(l) = 0     (2)

For this purpose we will add to the previous G(x, ξ) an arbitrary solution
of the homogeneous equation

u⁗(ξ) = 0     (3)

that is

α + βξ + γξ² + δξ³     (4)
with the undetermined constants α, β, γ, δ. These four constants have to
be determined by the condition that at ξ = 0 and ξ = l the given boundary
conditions (2) shall be satisfied (our problem is self-adjoint and thus the
boundary conditions for u(ξ) are the same as those for v(ξ)).
Now at ξ = 0 we are to the left of x and thus the upper of the two
expressions (16.38) has to be chosen:

At this point the previous boundary conditions remain unchanged (namely
u(0) = u′(0) = 0), and thus α and β have to be chosen as zero. We now
come to the point ξ = l. Here the lower of the two expressions (16.38)
comes into operation since now ξ is to the right of x:

The new boundary conditions u(l) = u′(l) = 0 determine γ and δ, and we
obtain the additional term in the form

This is a symmetric function of x and ξ, as is indeed demanded by the
symmetry of the Green's function G(x, ξ) = G(ξ, x), since the added term is
analytic and belongs equally to the left and to the right expression of the
Green's function. The resulting Green's function of the elastic uniform
bar, clamped at both ends, thus becomes:

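The procedure can be checked numerically. The sketch below assumes the standard cantilever kernel G(x, ξ) = x²(3ξ − x)/6 for x ≤ ξ (clamped at 0, free at l = 1) as the starting Green's function, a stand-in for (16.38), which is not reproduced here, and adds the cubic γx² + δx³ so that the clamped conditions at x = 1 hold; the corrected kernel comes out symmetric, as the argument above demands.

```python
import numpy as np

def G_cf(x, xi):                     # clamped-free kernel, assumed form
    lo, hi = np.minimum(x, xi), np.maximum(x, xi)
    return lo**2 * (3*hi - lo) / 6.0

def G_cc(x, xi):                     # clamped-clamped kernel via the added cubic
    g1, dg1 = xi**2*(3 - xi)/6.0, xi**2/2.0      # G_cf and its x-derivative at x = 1
    delta = 2*g1 - dg1                           # solve gamma + delta = -g1,
    gamma = -g1 - delta                          #       2*gamma + 3*delta = -dg1
    return G_cf(x, xi) + gamma*x**2 + delta*x**3

xs = np.linspace(0.0, 1.0, 11)
GM = G_cc(xs[:, None], xs[None, :])
print(np.allclose(GM, GM.T))                     # True: symmetric in x and xi
print(G_cc(1.0, 0.4), G_cc(0.0, 0.4))            # ~0 at both clamped ends
```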
Problem 210. Obtain by this method the expression (16.27) from (16.26).
Problem 211. The boundary conditions of a bar simply supported on both
ends are

Obtain the Green's function of the simply supported bar from (16.38).
[Answer: Added term:

Problem 212. Obtain from the Green's function (16.29) the Green's function
of the same problem but modifying the boundary conditions to

Explain why now the added term is not symmetric in x and ξ.


[Answer: Added term:

5.18. The remainder of the Taylor series


Let us expand a function f(x) around the origin x = 0 in a power series:

f*(x) = f(0) + f′(0)x + f″(0)x²/2! + · · · + f⁽ⁿ⁻¹⁾(0)xⁿ⁻¹/(n − 1)!     (1)

By terminating the series after n terms we cannot expect that f*(x) shall
coincide with f(x). But we can introduce the difference between f(x) and
f*(x) as the "remainder" of the series. Let us call it v(x):

v(x) = f(x) − f*(x)     (2)
This function v(x) can be uniquely characterised by a certain differential
equation, with the proper boundary conditions. Since f*(x) is a polynomial
of the order n − 1, the nth derivative of this function vanishes and thus we
obtain

v⁽ⁿ⁾(x) = f⁽ⁿ⁾(x)     (3)

The unique characterisation of v(x) requires n boundary conditions but
these conditions are available by comparing the derivatives of the polynomial
(1) with the derivatives of f(x) at the point x = 0. By construction the
functional value and the values of the first n − 1 derivatives coincide at
the point x = 0 and this gives for v(x) the n boundary conditions

v(0) = v′(0) = · · · = v⁽ⁿ⁻¹⁾(0) = 0     (4)

We will now find the Green's function of our differential equation (3)
and accordingly obtain the solution in the range [0, x] by the integral

v(x) = ∫ from 0 to x of G(x, ξ) f⁽ⁿ⁾(ξ) dξ     (5)
Two methods are at our disposal: we can consider G(x, ξ) as a function of
ξ and operate with the adjoint equation, or we can consider G(x, ξ) as a
function of x and operate with the given equation. We will follow the latter
course and solve the equation

v⁽ⁿ⁾(x) = δ(x, ξ)     (6)

with the boundary conditions (4).
Now between the points x = 0 and x = ξ we have to solve the homo-
geneous equation

v⁽ⁿ⁾(x) = 0     (7)

This means that v(x) can be an arbitrary polynomial of the order n − 1:

v(x) = a₀ + a₁x + · · · + aₙ₋₁xⁿ⁻¹     (8)

But then the boundary conditions (4) make every one of these coefficients
equal to zero and thus

v(x) = 0      (x < ξ)     (9)

Now we come to the range x > ξ. Here again the homogeneous equation
has to be solved and we can write our polynomial in the form

v(x) = b₀ + b₁(x − ξ) + · · · + bₙ₋₁(x − ξ)ⁿ⁻¹     (10)

The conditions of continuity demand that v(ξ), v′(ξ), . . . , v⁽ⁿ⁻²⁾(ξ) shall be
zero since the function on the left side vanishes identically. This makes
b₀ = b₁ = b₂ = · · · = bₙ₋₂ = 0, and what remains is

v(x) = bₙ₋₁(x − ξ)ⁿ⁻¹     (11)

According to the general properties of the Green's function, discussed in
Section 16, the (n − 1)st derivative must make the jump 1 at the point x = ξ
(since in our problem p(x) = 1). And thus we get

(n − 1)! bₙ₋₁ = 1     (12)

or

bₙ₋₁ = 1/(n − 1)!     (13)

which gives

v(x) = (x − ξ)ⁿ⁻¹/(n − 1)!      (x > ξ)     (14)

We thus have obtained the Green's function of our problem:

G(x, ξ) = 0                          (x < ξ)
G(x, ξ) = (x − ξ)ⁿ⁻¹/(n − 1)!      (x > ξ)     (15)

Returning to our solution (5) we get

v(x) = ∫ from 0 to x of [(x − ξ)ⁿ⁻¹/(n − 1)!] f⁽ⁿ⁾(ξ) dξ     (16)
and this is the Lagrangian remainder of the truncated Taylor series which, as
we have seen in Chapter 1.3, can be used for an estimation of the error of
the truncated Taylor series.
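A quick numerical check of (16) (a sketch; f = eˣ, n = 3 and x = 0.8 are arbitrary choices): the integral reproduces the difference between f and its truncated Taylor polynomial.

```python
import numpy as np
from math import factorial, exp

n, x = 3, 0.8
xi = np.linspace(0.0, x, 100001)
remainder = np.trapz((x - xi)**(n-1) / factorial(n-1) * np.exp(xi), xi)
taylor = sum(x**k / factorial(k) for k in range(n))
print(remainder, exp(x) - taylor)    # the two values agree
```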
Problem 213. Carry through the same treatment on the basis of the adjoint
equation, operating with G(x, ξ).

5.19. The remainder of the Lagrangian interpolation formula


An interesting application of the Green's function method occurs in
connection with the Lagrangian interpolation of a given function f(x) by
a polynomial of the order n − 1, the points of interpolation being given at
the arbitrary points (arranged in increasing order)

x₁ < x₂ < · · · < xₙ     (1)

of the interval [a, b].
This is a problem we have discussed in the first chapter. We obtained
the interpolating polynomial in the form

f*(x) = f(x₁)φ₁(x) + f(x₂)φ₂(x) + · · · + f(xₙ)φₙ(x)     (2)

where φₖ(x) are the Lagrangian "interpolation coefficients"

φₖ(x) = F(x) / [F′(xₖ)(x − xₖ)]     (3)

F(x) being the fundamental root polynomial (x − x₁)(x − x₂) · · · (x − xₙ).


Our task will now be to obtain an estimation of the error of the approxi-
mation. For this purpose we deduce a differential equation for the
remainder, in full analogy to the procedure of the previous section. If we
form the difference

v(x) = f(x) − f*(x)     (4)

we first of all notice that at the points of interpolation (1) f(x) and f*(x)
coincide and thus

v(x₁) = v(x₂) = · · · = v(xₙ) = 0     (5)

These conditions take the place of the previous boundary conditions (18.4).
Furthermore, if we differentiate (4) n times and consider that the nth
derivative of f*(x) (being a polynomial of the order n − 1) vanishes, we
once more obtain the differential equation

v⁽ⁿ⁾(x) = f⁽ⁿ⁾(x)     (6)

We thus have to solve the differential equation (6) with the inside conditions

v(xₖ) = 0      (k = 1, 2, . . . , n)     (7)
The unusual feature of our problem is that no boundary conditions are
prescribed. Instead of them the value of the function is prescribed as zero
in n inside points of the given interval [a, b]. These conditions determine
our problem just as effectively as if n boundary conditions were given. The
homogeneous equation

v⁽ⁿ⁾(x) = 0     (8)

allows as a solution an arbitrary polynomial of the order n − 1. But such
a polynomial cannot be zero in n points without vanishing identically, since
a polynomial of the order n − 1 which is not zero everywhere cannot have
more than n − 1 roots.
We now come to the construction of the Green's function G(x, ξ). We
will do that by considering G(x, ξ) as a function of ξ. Hence we will
construct the adjoint problem. For this purpose we form the extended
Green's identity:

Since there are no boundary conditions given for v(x) at x = a and x = b,
the vanishing of the boundary term demands the conditions

u(a) = u′(a) = · · · = u⁽ⁿ⁻¹⁾(a) = 0
u(b) = u′(b) = · · · = u⁽ⁿ⁻¹⁾(b) = 0     (10)
These are altogether 2n boundary conditions and thus our problem is
apparently strongly over-determined. However, we have not yet taken into
consideration the n inside conditions (7). The fact that v(x) is given in n
points has the consequence that the adjoint differential equation

(−1)ⁿ u⁽ⁿ⁾(ξ) = 0     (11)

is put out of action at these points, just as we have earlier conceived the
appearance of the delta function at the point ξ = x as a consequence of the
fact that we have added v(x) to the data of our problem.
The full adjoint problem can thus be described as follows. The delta
function δ(x, ξ) appears not only at the point ξ = x but also at the points
ξ = x₁, x₂, . . . , xₙ. The strength with which the delta function appears at
these points has to be left free. The completed adjoint differential equation
of our problem will thus become

(−1)ⁿ u⁽ⁿ⁾(ξ) = δ(x, ξ) + α₁δ(x₁, ξ) + α₂δ(x₂, ξ) + · · · + αₙδ(xₙ, ξ)     (12)
We have gained the n new constants α₁, α₂, . . . , αₙ which remove the
over-determination since we now have these constants at our disposal,
together with the n constants of integration associated with a differential
equation of the nth order. They suffice to take care of the 2n boundary
conditions (10).
First we will satisfy the n boundary conditions

u(b) = u′(b) = · · · = u⁽ⁿ⁻¹⁾(b) = 0     (13)

together with the differential equation

(−1)ⁿ u⁽ⁿ⁾(ξ) = δ(x, ξ)     (14)
This is the problem of the Green's function G(x, ξ), solved in the previous
section as a function of x, but now considered as a function of ξ. We can
take over the previous solution:

G(x, ξ) = (x − ξ)ⁿ⁻¹/(n − 1)!      (ξ < x)
G(x, ξ) = 0                          (ξ > x)     (15)

We will combine these two solutions into one single analytical expression
by using the symbol [tⁿ⁻¹] for a function which is equal to tⁿ⁻¹ for all
positive values of t but zero for all negative values of t:

[tⁿ⁻¹] = tⁿ⁻¹  (t > 0),   [tⁿ⁻¹] = 0  (t < 0)     (16)

With this convention, and making use of the superposition principle of
linear operators, the solution of the differential equation (12) under the
boundary conditions (13) becomes:

u(ξ) = { [(x − ξ)ⁿ⁻¹] + α₁[(x₁ − ξ)ⁿ⁻¹] + · · · + αₙ[(xₙ − ξ)ⁿ⁻¹] } / (n − 1)!     (17)
Now we go to the point ξ = a and satisfy the remaining boundary con-
ditions

u(a) = u′(a) = · · · = u⁽ⁿ⁻¹⁾(a) = 0     (18)

This has the consequence that the entire polynomial

Qₙ₋₁(ξ) = (x − ξ)ⁿ⁻¹ + α₁(x₁ − ξ)ⁿ⁻¹ + · · · + αₙ(xₙ − ξ)ⁿ⁻¹     (19)

must vanish identically. We thus see that the Green's function G(x, ξ) will
not extend from a to b but only from x₁ to xₙ if the point x is inside the
interval [x₁, xₙ], or from x to xₙ if x is to the left of x₁, and from x₁ to x if
x is to the right of xₙ.
The vanishing of Qₙ₋₁(ξ) demands the fulfilment of the following n
equations:

α₁x₁ᵐ + α₂x₂ᵐ + · · · + αₙxₙᵐ = −xᵐ      (m = 0, 1, . . . , n − 1)     (20)
These equations are solvable by the method of undetermined coefficients.
We multiply the equations in succession by the undetermined factors
c₀, c₁, c₂, . . . , cₙ₋₁, and form the sum. Then the last term becomes some
polynomial Pₙ₋₁(x) which can be chosen at will, while the factor of αₖ
becomes the same polynomial, taken at the point x = xₖ:

α₁Pₙ₋₁(x₁) + α₂Pₙ₋₁(x₂) + · · · + αₙPₙ₋₁(xₙ) = −Pₙ₋₁(x)     (21)

We will now choose for Pₙ₋₁(x) the Lagrangian polynomials φₖ(x), defined
by (3). Then the factor of αₖ becomes 1, the factor of all the other αⱼ zero,
and we obtain

αₖ = −φₖ(x)     (22)
Hence we see that the strength with which the delta functions are
represented at the points of interpolation xₖ depends on the position of the
variable point x. Moreover, this strength is given by exactly the same
interpolation coefficients, taken with a negative sign, which appear in the
Lagrangian interpolation formula.
We have now constructed our Green's function in explicit form:

G(x, ξ) = { [(x − ξ)ⁿ⁻¹] − φ₁(x)[(x₁ − ξ)ⁿ⁻¹] − · · · − φₙ(x)[(xₙ − ξ)ⁿ⁻¹] } / (n − 1)!     (23)

and obtain the remainder v(x) of the Lagrangian interpolation problem in
the form of a definite integral

v(x) = ∫ from x₁ to xₙ of G(x, ξ) f⁽ⁿ⁾(ξ) dξ     (24)
(We have assumed that x is an inside point of the interval [x₁, xₙ]; if x is
outside of this interval and to the left of x₁, the lower limit of integration
becomes x instead of x₁; if x is outside of [x₁, xₙ] and to the right of xₙ, the
upper limit of integration becomes x instead of xₙ.)
It will hardly be possible to actually evaluate this integral. We can use
it, however, for an estimation of the error η(x) of the Lagrangian interpolation

(this r)(x) is now our v(x)). In particular, we have seen in the first chapter
that we can deduce the Lagrangian error formula (1.5.10) if we can show
that the function G(x, g), taken as a function of £, does not change its sign
throughout the interval of its existence which in our case will be between
xi and xn, although the cases x to xn or x\ to x can be handled quite
analogously.
First of all we know that G(x, £) = u(g) vanishes at the limiting points
£ = xi and | = xn with all its derivatives up to the order n — 2; w
go beyond n — 2 because the (n — l)st derivative makes a jump at x\
and xn, due to the presence of 8(2:1, £) and 8(xn, £) in the wth derivative
(cf. (12)), and thus starts and ends with a finite value. Now u(£), being a
continuous function of £ and starting and ending with the value zero, must
have at least one maximum or minimum in the given interval [xi, xn~\. But
if u(j;) were to change its sign and thus pass through zero, the number of
extremum values would be at least two. Accordingly the derivative u'(£)
must vanish at least once and if we can show that it vanishes in fact only
once, we have established the non-vanishing of u(£) inside the critical
interval. Continuing this reasoning we can say that w(*>(£) must vanish
inside the critical interval at least k times and if we can show that the
number of zeros is indeed exactly k and not more, the non-vanishing of u(i-)
is once more established.
Now let us proceed up to the (n − 2)nd derivative and investigate its
behaviour. Since the nth derivative of u(ξ) is composed of delta functions,
the (n − 1)st derivative is composed of step functions. Hence it is the
(n − 2)nd derivative where we first encounter a continuous function composed
of straight zig-zag lines, drawn between the n + 1 points x₁, x₂, . . . , x,
. . . , xₙ. The number of intervals is n and since no crossings occur in the
first and in the last interval, the number of zeros cannot exceed n − 2.
Hence u⁽ⁿ⁻²⁾(ξ) cannot vanish more than n − 2 times, while our previous
reasoning has shown that it cannot vanish less than n − 2 times. This
establishes the number of zeros as exactly n − 2, which again has the conse-
quence that an arbitrary kth derivative has exactly k zeros within the given
interval, while G(x, ξ) itself does not change its sign as ξ varies between x₁
and xₙ. The theorem on which the estimation of Lagrange was based is
thus established. Furthermore, the formula (23) puts us in the position to
construct G(x, ξ) explicitly, on the basis of Lagrange's interpolation formula.
The Lagrangian interpolation coefficients φₖ(x), ordinarily multiplied by
f(xₖ), are now multiplied by the functional values of the special function
[(x − ξ)ⁿ⁻¹]/(n − 1)! taken at the points x = xₖ (considering ξ as a mere
parameter).
We see that the Green's function of Lagrangian interpolation can be
conceived as the remainder of the Lagrangian interpolation of the special
function [(x − ξ)ⁿ⁻¹]/(n − 1)!.
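The formula (23) can be exercised numerically (a sketch; the nodes and the choice f = eˣ, whose nth derivative is again eˣ, are arbitrary): the integral (24) reproduces the interpolation error f(x) − f*(x).

```python
import numpy as np
from math import factorial, exp

nodes = np.array([0.0, 0.3, 0.7, 1.0]); n = len(nodes)

def phi(k, x):                        # interpolation coefficients (3)
    others = np.delete(nodes, k)
    return np.prod((x - others) / (nodes[k] - others))

def bracket(t):                       # [t^(n-1)]: zero for negative t
    return np.where(t > 0, t, 0.0)**(n - 1)

def G(x, xi):                         # the Green's function (23)
    s = sum(phi(k, x) * bracket(nodes[k] - xi) for k in range(n))
    return (bracket(x - xi) - s) / factorial(n - 1)

x = 0.5
xi = np.linspace(nodes[0], nodes[-1], 200001)
v = np.trapz(G(x, xi) * np.exp(xi), xi)                  # the remainder (24)
fstar = sum(phi(k, x) * exp(nodes[k]) for k in range(n))
print(v, exp(x) - fstar)              # the two values agree
```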
Problem 214. Show that if x > xₙ and ξ is a point between xₙ and x:

If x < x₁ and ξ is a point between x and x₁:

Problem 215. Show from the definition (23) that G(x, ξ) vanishes at all points
x = xₖ.
Problem 216. Show from the definition (23) that G(x, ξ) vanishes at all values of
ξ which are outside the realm of the n + 1 points [x₁, x₂, . . . , xₙ, x].

5.20. Lagrangian interpolation with double points


The fundamental polynomial of Lagrangian interpolation:

F(x) = (x − x₁)(x − x₂) · · · (x − xₙ)     (1)

contains all the root factors once and only once. But in a similar way as
in algebra, where a root may become a multiple root through the collapsing
of several single roots, something similar may happen in the process of
interpolation. Let us assume that the functional value of f(x) is prescribed
in the two points x = xₖ ± ε which are very close together, due to the smallness
of ε. Then we can put

f(xₖ ± ε) = f(xₖ) ± ε f′(xₖ)     (2)

Moreover, the fundamental polynomial may be written as follows, by
separating the two root factors which belong to the critical points:

F(x) = (x − xₖ − ε)(x − xₖ + ε) Φ(x)     (3)

(Φ(x) is composed of all the other root factors.) The two terms associated
with the two critical points become

The sign ± shall mean that we should take the sum of two expressions,
the one with the upper, the other with the lower sign. The factor of f(xₖ)
becomes

while the factor of f′(xₖ) becomes:

In the limit, as ε goes to zero, we have obtained an interpolation which fits
not only f(xₖ) but also f′(xₖ) at the critical point xₖ. At the same time the
root factor x − xₖ of the fundamental polynomial appears now in second
power. The point x = xₖ of the interpolation becomes a double point. If
all the points of interpolation are made double points, we fit at every point
x = xᵢ the functional value f(xᵢ) and the derivative f′(xᵢ) correctly, that
is, the interpolating polynomial of the order 2n − 1 coincides with f(x) at
all points x = xᵢ, and in addition the derivative of the interpolating poly-
nomial coincides with f′(x) at all points x = xᵢ. The fundamental poly-
nomial now becomes a square:

F(x) = [(x − x₁)(x − x₂) · · · (x − xₙ)]²     (7)
which remains positive throughout the range, instead of alternating in sign
from point to point.
From the standpoint of a universal formula it will be preferable to operate
with a single fundamental polynomial F(x), counting every root factor with
its proper multiplicity (one for single points, two for double points). The
expression (6) can now be written in the form

and this becomes the factor of f′(xₖ). On the other hand, the expression (5)
can now be written in the form

and thus the Lagrangian interpolation formula in the presence of double
points has to be generalised as follows:
Contribution of a single point x = xᵢ:

Contribution of a double point x = xₖ:



For the estimation of an error bound the Lagrangian formula (1.5.10) holds
again, counting every root factor with the proper multiplicity.
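The double-point fit is easy to carry out numerically without the explicit formula (a sketch of the setting of Problem 218 below, using only a linear solve on the confluent Vandermonde system; the test function sin(π/2)x is that of Problem 219):

```python
import numpy as np

f  = lambda x: np.sin(np.pi/2 * x)
fp = lambda x: np.pi/2 * np.cos(np.pi/2 * x)
rows, rhs = [], []
for xk in (-1.0, 0.0, 1.0):                      # value conditions
    rows.append([xk**m for m in range(5)]); rhs.append(f(xk))
for xk in (-1.0, 1.0):                           # derivative conditions (double points)
    rows.append([m*xk**(m-1) if m > 0 else 0.0 for m in range(5)])
    rhs.append(fp(xk))
c = np.linalg.solve(np.array(rows), np.array(rhs))   # coefficients of the order-4 fit
x = np.linspace(-1.0, 1.0, 2001)
p = sum(c[m] * x**m for m in range(5))
print(np.max(np.abs(p - f(x))))                  # maximum error over [-1, 1]
```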
Problem 217. By analysing the expression (11) demonstrate that the inter-
polating polynomial assumes the value f(xₖ) at the point x = xₖ, while its
derivative assumes the value f′(xₖ) at the point x = xₖ.
Problem 218. Obtain a polynomial approximation of the order 4 by fitting
f(x) at the points x = ±1, 0, and f′(x) at the points x = ±1.
[Answer:

Problem 219. Apply this formula to an approximation of sin (π/2)x and cos (π/2)x
and estimate the maximum error at any point of the range [−1, 1].
[Answer:

Problem 220. Explain why the error bound for the cosine function can in
fact be reduced.

[Answer: The point x = 0 can be considered as a double point if f(x) is even.
But then F(x) = x²(x² − 1)² and the estimated maximum error becomes
greatly reduced.]
Problem 221. Obtain the Green's function for the remainder of this approxi-
mation.
[Answer:

Problem 222. Reduce the interpolating polynomial to the order 3 by dropping
the point x = 0. Show that for any odd function f(−x) = −f(x) the result
must agree with that obtained in Problem 218.
[Answer:

Problem 223. Apply this interpolation once more to the functions sin (π/2)x and
cos (π/2)x and estimate the maximum errors.
[Answer:

Problem 224. Obtain the Green's function for the remainder of this inter-
polation.
[Answer:

Problem 225. Show that this Green's function is characterised by exactly the
same conditions as the Green's function of the clamped bar, considered before
in Section 17, except that the new domain extends from −1 to +1 while the
range of the bar was normalised to [0, l]. Replacing x by x + 1, ξ by ξ + 1,
show that the expression (19) is in fact equivalent to (17.8), if we put l = 2.

5.21. Construction of the Green's vector


We will now generalise our discussions concerning the explicit construction
of the Green's function to the case of systems of differential equations. Since
an arbitrary scalar or vectorial kind of problem can be normalised to the
Hamiltonian canonical form (cf. Section 10), we will assume that our system
is already transformed into the canonical form, that is, we will deal with the
system (10.18):

In order to obtain the solution of this system in the form

we need first of all the explicit construction of the Green's vector Gₖ(x, ξ)ⱼ.
This means that—considering ξ as the active variable—we should put the
delta function in the kth equation

while all the other equations have zero on the right side.
This again means that the homogeneous equations

are satisfied in both regions ξ < x and ξ > x. Assuming that we possess
the homogeneous solution with all its 2n constants of integration, we can
set up the solution for ξ < x with one set of constants and the solution for
ξ > x with another set of constants, exactly as we have done in Section 16.
The 2n boundary conditions of our problem provide us with 2n linear algebraic
relations between the 4n free constants. Now we come to the joining of the
two regions at the point ξ = x. In view of the fact that the delta function
exists solely in the kth equation, we obtain continuity in all components pᵢ(x),
with the only exception of the component pₙ₊ₖ where we get a jump of 1 in
going from the left to the right:

pₙ₊ₖ(x + 0) − pₙ₊ₖ(x − 0) = 1     (5)
These continuity conditions yield 2n − 1 additional linear homogeneous
algebraic relations between the constants of integration, plus one inhomo-
geneous relation, in consequence of the jump-condition (5).
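The bookkeeping behind (5) can be watched numerically (a sketch: an arbitrary smooth 2 × 2 coefficient matrix, with the pulse placed in the first equation): integrating across a pulse of width ε, only the component whose derivative carries the pulse jumps, and its jump tends to 1.

```python
import numpy as np

M = np.array([[0.0, 1.0], [-4.0, 0.0]])    # any smooth coefficient matrix
k, x0 = 0, 0.5                             # pulse in the k-th equation, at xi = x0

def jump(eps, steps=400000):
    h, u = 1.0/steps, np.array([0.3, -0.2])        # arbitrary initial state
    u_left = None
    for i in range(steps):                          # forward Euler on u' = M u + pulse
        t = i*h
        if u_left is None and t >= x0 - eps/2:
            u_left = u.copy()                       # state just before the pulse
        f = M @ u
        if x0 - eps/2 <= t < x0 + eps/2:
            f = f + np.eye(2)[k] / eps
        u = u + h*f
        if t + h >= x0 + eps/2:
            return u - u_left                       # rise across the pulse

for eps in (0.2, 0.02, 0.002):
    print(eps, jump(eps))   # tends to (1, 0): only the pulsed component jumps
```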
As an example we return once more to our problem (16.12) for which we
have already constructed the Green's function (cf. 16.20). We will deal
with the same problem, but now presented in the canonical form:

with the boundary conditions

(The previous v is now p₂.) The two fundamental solutions of the homo-
geneous system are

Accordingly we establish the two separate solutions to the left and to the
right from the point ξ = x in the form

The boundary conditions (7) yield the following two relations between the
four constants A₁, A₂, B₁, B₂:

Let us first obtain the two components G₁(x, ξ)₁,₂. Then p₁ is continuous
at ξ = x while p₂(x) makes a jump of the magnitude 1. This yields the two
further relations

The solution of this algebraic system gives the solution

Hence we have now obtained the following two components of the full
Green's vector:

with the convention that the upper sign holds for ξ < x, the lower sign for
ξ > x.
We now come to the construction of the remaining two components
G₂(x, ξ)₁,₂, characterised by the condition that now p₂ remains continuous
while −p₁ makes a jump of 1 at the point ξ = x. The equations (9) remain
unchanged but the equations (10) have to be modified as follows:

The solution yields the required components in the following form:

Now we have constructed the complete Green's vector (although the
solution of the system (6) does not require the two components G₁,₂(x, ξ)₁,
since β₁(ξ) is zero in our problem). We can now demonstrate the
characteristic symmetry properties of the Green's vector. First of all the
components G₁(x, ξ)₁ and G₂(x, ξ)₂ must be symmetric in themselves. This
means that an exchange of x and ξ must change the left expression to the
right expression, or vice versa. This is indeed the case. In these com-
ponents the exchange of x and ξ merely changes the sign of x − ξ, but the
simultaneous change of the sign of l/2 restores the earlier value, the cosine
being an even function. Then we have the symmetry relation

G₁(x, ξ)₂ = G₂(ξ, x)₁

This relation is also satisfied by our solution because an exchange of x and
ξ in the second expression of (12) is equivalent to a minus sign in front of
the entire expression, together with a change of −l/2 to +l/2. But this
agrees with the first expression of (14), if we take the lower sign, in view of
the fact that left and right have to be exchanged. Hence we have actually
tested all the symmetry conditions of the Green's vector.
Problem 226. The components of the Green's vector of our example show the
following peculiarities:

Explain these relations on the basis of the differential equation (21.6), assuming
the right sides in the form γ(x), β(x), instead of 0, β(x).
Problem 227. Consider the canonical system (8.2) for the clamped elastic bar
(boundary conditions (7.21)). In (7.24) the Green's function for the solution
v₀(x) was defined and the determining equations (7.25-26) deduced, while later
the application of the reciprocity theorem gave the determining equations
(14.11-12), considering x as the active variable. From the standpoint of the
Green's vector the component G₃(x, ξ)₁ is demanded, that is, we should evaluate
u₁(ξ), putting the delta function in the third equation (which means a jump of
1 at ξ = x in the function v₂(ξ)). We can equally operate with G₁(x, ξ)₃, that is,
evaluate u₃(ξ), putting the delta function in the first equation (which means a
jump of −1 at ξ = x in u₄(ξ)). In the latter method we have to exchange in the
end x and ξ. Having obtained the result—for the sake of simplicity assume a bar
of uniform cross-section, i.e., put I(ξ) = const. = I—we can verify the previously
obtained properties of the Green's function, deduced on the basis of the defining
differential equation but without explicit construction:
1a). Considered as a function of ξ, G(x, ξ) and its first derivative must vanish
at the two endpoints ξ = 0 and ξ = l.
1b). The coefficients of ξ² and ξ³ must remain continuous at the point ξ = x.
1c). The function G(x, ξ) must pass through the point ξ = x continuously
but the first derivative must make a jump of 1 at the point ξ = x.
2a). Considered as a function of x the dependence can only be linear in x,
with a jump of 1 in the first derivative at the point x = ξ.
2b). If we integrate this function twice from the point x = 0, we must wind
up at the endpoint x = l with the value zero for the integral and its first
derivative.
[Answer:

5.22. The constrained Green's function


Up to now we have assumed that we have a "well-posed" system, that is,
neither the given, nor the adjoint equation could have non-vanishing
homogeneous solutions. If we combine both equations to the unified self-
adjoint system (as we have done before in Section 14):

D v(x) = β(x),   D̃ u(x) = γ(x)     (1)

both conditions are included in the statement that the homogeneous system

D v(x) = 0,   D̃ u(x) = 0     (2)

(under the given boundary conditions) has no non-vanishing solutions.


On the other hand, we have seen in the treatment of general n × m
matrices that the insistence on the "well-posed" case is analytically not
justified. If the homogeneous system has non-zero solutions, this fact can
be interpreted in a natural way: The solutions of the homogeneous equation (2)
trace out those dimensions of the function space in which the given differential
operator is not activated. If we restrict ourselves to the proper subspace,
viz. the "eigen-space" of the operator, ignoring the rest of space, we find
that within this subspace the operator behaves exactly like a "well-posed"
operator. Within this space the solution is unique and within this space
the right side can be given freely. If we do not leave this space, we do not
even notice that there is anything objectionable in the given differential
operator and the associated differential equation.
From the standpoint of the full space, however, this restriction to the
eigen-space of the operator entails some definite conditions or constraints.
These conditions are two-fold:
1. They restrict the choice of the right side (β, γ) by demanding that the
vector (β, γ) must lie completely within the activated subspace associated with
the operator. This again means that the right side has no components in
the direction of the non-activated dimensions, that is in the direction of
the homogeneous solutions.
Now we must agree on a notation for the possible homogeneous solutions
of the system (2). We may have one or more such solutions. In the case
of ordinary differential equations the number of independent solutions cannot
be large, but in the case of partial differential equations it can be arbitrarily
large, or even infinite. We could use subscripts for the designation of these
independent solutions but this is not convenient since our operator Dv(x)
may involve several components vᵢ(x) of the unknown function v(x) and the
subscript notation has been absorbed for the notation of vector components.
On the other hand, we have not used any upper indices and thus we will
agree that the homogeneous solutions shall be indicated in the following
fashion:

v¹(x), v²(x), v³(x), . . .     (3)

and likewise

u¹(x), u²(x), u³(x), . . .     (4)

The required orthogonality of the right sides to the homogeneous solutions
finds expression in the conditions

∫ β(x) uʲ(x) dx = 0,   ∫ γ(x) vᵏ(x) dx = 0     (5)

These are the compatibility conditions of our system, without which a


solution cannot exist.
While we have thus formulated the conditions to which the given right
sides have to be submitted for the existence of a solution, we will now add
some further conditions which will make our solution unique. This is done
by demanding that the solution shall likewise be completely within the
eigen-space of the operator, not admitting any components in the non-
activated dimensions. Hence we will complement the conditions (5) by
the added conditions

Under these conditions our equation is once more solvable and the solution
is unique. Hence we can expect that the solution will again be obtainable
with the help of a Green's function G(x, £), called the "constrained Green's
function":
272 THE GBEEN'S FUNCTION CHAP. 5
The question arises how to define these functions. This definition will
occur exactly along the principles we have applied before, if only we
remember that our operations are now restricted to the activated subspace
of the function space. Accordingly we cannot put simply the delta function
on the right side of the equation. We have to put something on the right
side which excludes any components in the direction of the homogeneous
solutions, although keeping everything unchanged in the activated dimen-
sions. We will thus put once more the delta function on the right side of the
defining equation, but subtracting its projection into the unwanted dimensions.
This means the following type of equation:

D̃ G(x, ξ) = δ(x, ξ) − Σₖ αₖ vᵏ(ξ)     (8)

and likewise

D G̃(x, ξ) = δ(x, ξ) − Σⱼ ρⱼ uʲ(ξ)     (9)

Now we have to find the undetermined constants ρⱼ, αₖ. This is simple
if we assume that the homogeneous solutions vᵏ(ξ), respectively uʲ(ξ), have
been orthogonalised and normalised, i.e. we use such linear combinations of
the homogeneous solutions that the resulting solutions shall satisfy the
orthogonality and normalisation conditions

∫ uⁱ(x) uʲ(x) dx = δᵢⱼ,   ∫ vⁱ(x) vʲ(x) dx = δᵢⱼ     (10)
Then the demanded orthogonality conditions (5) permit us to determine
the constants ρⱼ, αₖ explicitly and independently of each other, by forming
the integrals (5) over the right sides of (8) and (9):

These integrals reduce to something very simple, in view of the extreme
nature of the delta function: the integration is restricted to the immediate
neighbourhood of the point ξ = x, but there uʲ(ξ) is replaceable by uʲ(x),
which comes before the integral sign, while the integral over the delta
function itself gives 1. And thus

ρⱼ = uʲ(x),   αₖ = vᵏ(x)     (12)

We thus see that the definition of the Green's function of constrained systems
has to occur as follows:

D̃ G(x, ξ) = δ(x, ξ) − Σₖ vᵏ(x) vᵏ(ξ)     (13)
D G̃(x, ξ) = δ(x, ξ) − Σⱼ uʲ(x) uʲ(ξ)     (14)

These equations are now solvable (under the proper boundary conditions)
but the solution will generally not be unique. The uniqueness is restored,
however, by submitting the solution to the orthogonality conditions (6).
As an example we return to the problem we have discussed in Section 16.
We have seen that the solution went out of order if the constant ρ satisfied
the condition (16.24). These were exactly the values which led to the
homogeneous solutions

cos (2πk/l)ξ   and   sin (2πk/l)ξ

These two solutions are already orthogonal and thus we can leave them as
they are, except for the normalisation condition, which implies in our case
the factor √(2/l):

u¹(ξ) = √(2/l) cos (2πk/l)ξ,   u²(ξ) = √(2/l) sin (2πk/l)ξ

Hence for the exceptional values (16.24) the defining equation for the
Green's function now becomes

G″(x, ξ) + ρ² G(x, ξ) = δ(x, ξ) − (2/l) cos (2πk/l)(x − ξ)

with the previous boundary conditions (16.13).


Now the solution of this equation is particularly simple if we possess the
general solution for arbitrary values of ρ² (the "eigenvalues" of the differential
equation, with which we will deal later in greater detail). Anticipating a
later result, to be proved in Section 29, we describe here the procedure
itself, restricting ourselves to self-adjoint systems. We put

assuming that ε is small. The difficulty arises only for ε = 0; for any
finite ε the problem is solvable. Now we let ε go to zero. Then there is a
term which is independent of ε and a term which goes with 1/ε to infinity.
We omit the latter term, while we keep the constant term. It is this constant
term which automatically yields the Green's function of the constrained problem.
For example in our problem we can put

ρ = (2πk/l) + ε
in which case we know already the solution of the equation (17) since we
are in the possession of the Green's function for arbitrary values of ρ (cf.
(16.20); we neglect the negligible powers of ε):

The constant term becomes:

Here then is the Green's function of our problem, obtained by a limit process
from a slightly modified problem which is unconditioned and hence subjected
to the usual treatment, without any modification of the delta function on the
right side. This function would go to infinity without the proper pre-
cautions. By modifying the right side in the sense of (16) we counteract
the effect of the term which goes to infinity and obtain a finite result. The
symmetry of the Green's function:

G(x, ξ) = G(ξ, x)

remains unaffected because our operator, although it is inactive in two
particular dimensions of the function space, behaves within its own activated
space exactly like any other self-adjoint operator.
The physical significance of our problem is an exciting force acting on an
elastic spring which is periodic and of a period which is exactly k times the
characteristic period of the spring. Under such circumstances the amplitudes
of the oscillations would constantly increase and could never attain a steady
state value, except if the exciting force is such that its Fourier analysis has
a nodal point at the period of the spring. This is the significance of the
orthogonality conditions (5) which in our problem assume the form

∫ from 0 to l of β(x) cos (2πk/l)x dx = 0,   ∫ from 0 to l of β(x) sin (2πk/l)x dx = 0

The special character of a steady state solution under resonance conditions
is mathematically expressed by the fact that the boundary conditions
(16.13)—which are equivalent to the condition of a steady state—can only
be satisfied if the right side is orthogonal to the two homogeneous solutions,
that is, the characteristic vibrations of the spring.
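In matrix language the constrained Green's function plays the role of the Moore-Penrose pseudoinverse, and the defining equations (13)-(14) can be checked in that discrete setting (a sketch; the 6 × 6 symmetric matrix with a two-dimensional null space is randomly generated):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6)); A = B + B.T      # a symmetric "operator"
w, V = np.linalg.eigh(A)
A -= w[0]*np.outer(V[:, 0], V[:, 0]) + w[1]*np.outer(V[:, 1], V[:, 1])  # kill 2 eigenvalues
P = np.outer(V[:, 0], V[:, 0]) + np.outer(V[:, 1], V[:, 1])             # null-space projector
G = np.linalg.pinv(A)                             # constrained Green's "function"
print(np.allclose(A @ G, np.eye(6) - P))          # delta minus its projection, as in (13)
print(np.allclose(P @ G, 0), np.allclose(G, G.T)) # constraints (6) and symmetry hold
```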
Problem 228. Apply this method to the determination of the constrained
Green's function of the following problem:

Show that the solution satisfies the given boundary conditions, the defining
differential equation (13), the symmetry condition, and the orthogonality to the
homogeneous solution.
[Answer:

Compatibility condition:

5.23. Legendre's differential equation


An interesting situation arises in the case of Legendre's differential
equation which defines the Legendre polynomials. Here the differential
equation is (for the case n = 0):

[(1 − x²) v′(x)]′ = β(x)     (1)

The variable x ranges between −1 and +1 and the coefficient of the highest
derivative, 1 − x², vanishes at the two endpoints of the range. This has a
peculiar consequence. We do not prescribe any boundary conditions for
v(x). Then we would expect that the adjoint problem will be over-
determined by having to demand 4 boundary conditions. Yet this is not
the case. The boundary term in our case becomes

(1 − x²)(u′v − uv′), taken between the limits −1 and +1     (2)
The boundary term vanishes automatically, due to the vanishing of the first
factor. Hence we are confronted with the puzzling situation that the
adjoint equation likewise remains without boundary conditions, which makes
our problem self-adjoint, since both the differential operator and the boundary
conditions remain the same for the given and the adjoint problem.
In actual fact the lack of boundary conditions is only apparent. The
vanishing of the highest coefficient of a differential operator at a certain
point makes that point a singular point of the differential equation, where
the solution will generally go out of bounds. By demanding finiteness (but
not vanishing) of the solution we have already imposed a restriction on our
solution which is equivalent to a boundary condition. Since the same occurs
at the other endpoint, we have in fact imposed two boundary conditions on
our problem by demanding finiteness of the solution at the two points
x = ±1.
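The nature of this hidden boundary condition can be verified directly: the two homogeneous solutions are the constant 1 and atanh x = ½ log((1 + x)/(1 − x)), and only the second goes out of bounds at the endpoints. A small sympy check (our own illustration):

```python
import sympy as sp

x = sp.symbols('x')
# the two homogeneous solutions of d/dx[(1 - x^2) v'] = 0
for cand in (sp.Integer(1), sp.atanh(x)):
    print(sp.simplify(sp.diff((1 - x**2) * sp.diff(cand, x), x)))   # 0 in both cases
# the log solution blows up at the endpoints, so finiteness discards it:
print(sp.limit(sp.atanh(x), x, 1, dir='-'))                          # oo
```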
Our aim is now to find the Green's function of our problem. Since our
differential equation is self-adjoint, we know in advance that the Green's
function G(x, ξ) will become symmetric in x and ξ. There is, however, the
further complication that the homogeneous equation has the solution

Accordingly we have to make use of the extended definition (22.8) for the
Green's function. The normalised homogeneous solution becomes

since the integral of ½ between the limits ±1 becomes 1. The defining


equation of the Green's function thus becomes:

First of all we obtain the general solution of the homogeneous equation:

We also need the solution of the inhomogeneous equation

which yields

The complete solution of (5) will be a superposition of the solution (8) of


the inhomogeneous equation, plus the solution of the homogeneous equation,
adjusted to the boundary conditions and the joining conditions at the point
ξ = x. We will start on the left side: ξ < x. Here we get:

Now at the left endpoint ξ = −1 of our range the function log (1 − ξ)


remains regular, but the function log (1 + ξ) goes to infinity. Hence the
factor of this function must vanish, which gives the condition

By the same reasoning, if we set up our solution on the right side: ξ > x:

we obtain

because it is now the function log (1 − ξ) which goes to infinity and which
thus has to be omitted.
So far we have obtained

The condition of continuity at the point ξ = x adds the condition

The discontinuity of the derivative at the point x = ξ becomes

But this is exactly what the delta function on the right side demands: the
jump of the (n − 1)st derivative must be 1 divided by the coefficient of the
highest derivative at the point ξ = x (which in our problem is 1 − x²).
We have now satisfied the differential equation (5) and the boundary
conditions (since our solution remains finite at both points ξ = ±1). And
yet we have not obtained a complete solution, because the two constants B1 and B2
have to satisfy only the single condition (14). We can put

Then

In fact we know that this uncertainty has to be expected. We still have


to satisfy the condition that the constrained Green's function must become
orthogonal to the homogeneous solution (3):

This yields the condition

and substituting back in (13) we obtain finally the uniquely determined


Green's function of our problem:

Problem 229. Assuming that β(x) is an even function: β(x) = β(−x), the
integration is reducible to the range [0, 1]. The same holds if β(x) is odd:
β(x) = −β(−x). In the first case we get the boundary condition at x = 0:

In the second case:

Carry through the process for the half range with the new boundary conditions
(at x = 1 the previous finiteness condition remains), and show that the new
result agrees with the result obtained above.
[Answer:

Compatibility condition:

(Right side free.) ]

5.24. Inhomogeneous boundary conditions


Throughout our discussions we have assumed that the given boundary
conditions were of the homogeneous type, that is, that certain data on the
boundary were prescribed as zero. Indeed, for the definition of the Green's
function the assumption of homogeneous boundary conditions is absolutely
essential. This does not mean, however, that the Green's function is
applicable solely to problems with homogeneous boundary conditions. In
fact, the same Green's function which solved the inhomogeneous equation
with homogeneous boundary conditions solves likewise the general case of

an inhomogeneous differential equation with inhomogeneous boundary


conditions, as we have seen before in Section 6 of this chapter. The only
difference is that the inhomogeneous boundary data contribute their own
share to the result, not in the form of a volume integral but in the form
of a surface integral, extended over the boundary surface. In the case of
ordinary differential equations there is no integration at all; each one of
the given inhomogeneous boundary values contributes a term to the solution,
expressible with the help of the Green's function G(x, ξ) and its derivatives
with respect to ξ, substituting for ξ the value ξ = b at the upper limit
and ξ = a at the lower limit.
We shall discuss, however, the case of constrained systems which under
the given but homogenised boundary conditions possess non-zero solutions.
We have seen the rules of constructing the Green's function under these
circumstances (cf. Section 22). The operation with this Green's function
remains the same as that for unconstrained systems: once more the integra-
tion over the domain is complemented by the proper boundary terms,
necessitated by the presence of inhomogeneous data on the boundary. The
same data influence, however, the compatibility conditions to which the
right side is subjected. The orthogonality to the homogeneous solution
involves now the boundary terms, which have to be added to the integration
over the given domain. It is now the sum of the volume integral, plus the
properly constructed surface integral—all obtained with the help of the
homogeneous solution as auxiliary function—which has to vanish. This
consideration shows that an incompatible system, whose right side is not
orthogonal to the homogeneous solution, can be made compatible by
changing some of the given homogeneous boundary conditions to inhomo-
geneous boundary conditions.
A good example is provided by the problem of the vibrating spring
under resonance conditions (cf. 16.24). In order to satisfy the homogeneous
boundary conditions

(which express the fact that the system constantly repeats its motion under
the influence of the periodic exciting force) it is necessary and sufficient
that the orthogonality conditions

are satisfied. But let us assume that these conditions are not satisfied.
Then instead of saying that now our given problem is unsolvable, we can
go through the regular routine of the solution exactly as before. But
finally, when checking the compatibility conditions, we find that we have to
make allowances in the boundary conditions in order to make our problem
solvable. The problem of the vibrating spring, kept in motion by a periodic
external force, is a very real physical problem in which we know in advance
that the solution exists. But in the case of resonance the return of the
system to the original position cannot be expected and that means that the
conditions (1) will no longer hold.
Now in the construction of the adjoint system we went through the
following moves. We multiplied the given operator Dv(x) by an un-
determined factor u(x) and succeeded in "liberating" v(x), which was now
multiplied by a new operator D̃u(x). In this process a boundary term
appeared on the right side. For example in the problem of the vibrating
spring:

Now under the homogeneous boundary conditions (1) the right side dropped
out and we obtained the compatibility condition

for the case that the homogeneous equation under the given homogeneous
boundary conditions possessed non-zero solutions. But if we allow that
the given boundary conditions (1) have something on the right side, let us
say p1, p2, then the compatibility condition (4) has to be modified as follows:

This means for the case of resonance:

which gives

But now the solution with the help of the Green's function (22.20) has
also to be modified, because the inhomogeneous boundary values p1 and p2
will contribute something to the solution. The Green's identity (3) comes
once more into operation and we find that we have to add to our previous
solution the right side of (3), but with a negative sign:

In this manner we can separate the contribution of the resonance terms


from the steady state terms (the latter being generated by the non-resonant
Fourier components of the exciting force).
Problem 230. Obtain the solution of Problem 228 under the assumption that
the boundary conditions are modified as follows:

[Answer: Compatibility condition:

Added term in solution:

Problem 231. The boundary conditions of an elastic bar, free on both ends,
can be given according to (4.14.10) in the form:

where

and

By modifying these conditions to

obtain the point load and the point torque demanded at the point x = 0, to
keep the bar—which is free at the other endpoint x = l—in equilibrium.
[Answer:

5.25. The method of over-determination


In Section 22 we have dealt with constrained systems for which the
standard definition of the Green's function had to be modified because the
given differential equation (under the given boundary conditions) did not
allow a unique solution. This happened, whenever the homogeneous
equation under homogeneous boundary conditions had solutions which did
not vanish identically. As an example we studied the differential equation
of the vibrating spring with the boundary conditions

In the case of resonance, characterised by the condition (16.24), the homo-


geneous equation under the homogeneous boundary conditions

allowed two non-vanishing solutions. In such problems we can modify the


previous procedure to obtain the solution in a simpler manner.
Since the given problem allows the addition of two solutions with two
free constants, we do not lose anything if we add two more boundary
conditions, chosen in such a manner that the problem now becomes
uniquely solvable. For example we could choose the additional conditions

in which case the boundary conditions (1) become

Our problem is now apparently strongly over-determined, since we have


prescribed four instead of two boundary conditions. And yet we did not alter
our problem, except that we disposed of the two integration constants
which were left free in the original formulation.
In the new formulation the previous difficulty with the solution of the
adjoint equation

does not occur any more. We need not modify the right side in order to
make the equation solvable. In fact, another strange phenomenon is now
encountered. The method of the Green's identity now shows that the
adjoint equation becomes

without any boundary conditions. Any solution of (6) is acceptable as a


Green's function. We may choose for example

which satisfies the boundary conditions u(0) = u'(0) = 0. With this


solution we obtain v(x) in the form of the following integral:

By putting x = l we see that the boundary conditions at x = l are automatic-


ally fulfilled. On the other hand, by putting x = 0 we obtain:

but in view of the relation (16.24) between p and l we can put

By demanding that these two values vanish we arrive at the previous


compatibility conditions (24.7), now simply obtained by applying the
solution to the boundary at x = 0:

The solution (8) does not coincide with the earlier solution (22.20)—
complemented by (24.8)—because we followed a different policy in normal-
ising the free homogeneous solutions. But the simpler Green's function
(7)—obtained along the usual lines of constructing the Green's function
without any modification of the right side—is just as good from the stand-
point of solving the originally given problem as the more elaborate function
(22.20). If we want to normalise the final solution in a different way, we
can still add the homogeneous solution

and determine A and B by two further conditions.


But we can go one step further still and transform our G1(x, ξ) into that
unique Green's function (22.20) which is symmetric in x and ξ and which
has the property of being orthogonal to the homogeneous solutions in both
variables x and ξ. For this purpose we solve the defining equation (22.16)
which holds for that function. This means (by the superposition principle
of linear operators) that we add to G1(x, ξ) the solution of the equation

This can be done since the auxiliary function G1(x, ξ) puts us in the position
to solve the inhomogeneous equation (14) (remembering, however, that we
have to consider x as the integration variable; this x should preferably
be called x1 in order to distinguish it from the previous x, which is a mere
constant during the integration process. The ξ on the right side of (14)
likewise becomes x1). In our problem we obtain:

At this stage the constrained Green's function appears in the form

But this is not yet the final answer. We still have the uncertainty of the
homogeneous solution

We remove this uncertainty by demanding that the resulting function


G(x, ξ) (considered as a function of ξ while x is a mere parameter) be
orthogonal to both cos pξ and sin pξ.
We can simplify our task by writing the solution (15) somewhat differently.
Since an arbitrary homogeneous solution can be added to (15), we can omit
the last term while the first term may be replaced by

We do that because the natural variable of our problem is not ξ but

Hence the function we want to orthogonalise becomes

where the symbol [ ] shall again indicate (cf. (19.16)) that all negative values

inside the bracket are to be replaced by zero. Furthermore, the un-


determined combination (17) can equally be replaced by

(with different constants A and B). The sum of (20) plus (21) has to satisfy
the condition of orthogonality, integrating with respect to the range
ξ = [0, l], that is

We thus get the two conditions for A and B:

This gives

and the final result becomes

where the upper sign holds for ξ < x, the lower sign for ξ > x. The
symmetry in x, ξ is evident. Moreover, a comparison with the earlier
expression (22.20) shows perfect agreement.
Problem 232. Apply the method of over-determination to Problem 228 by
adding the boundary condition
v(0) = 0
Find the Green's function of this problem and construct with its help the
constrained Green's function (22.24).
[Answer:
Problem 233. Consider the problem of the free elastic bar of constant cross-
section, putting I(x) = 1 (cf. Chapter 4.14 and Problem 231), with the added
boundary conditions

Obtain the Green's function of this problem and construct with its help the
constrained Green's function G(x, ξ) of the free elastic bar.
[Answer:

5.26. Orthogonal expansions


In the realm of ordinary differential equations the explicit construction
of the Green's function is a relatively simple task, always solvable if
we are in possession of the homogeneous solutions of the given differential
equation (that is, the right side is made zero while the boundary conditions
are left free). In the realm of partial differential operators the conditions
are much less favourable for the explicit construction of the Green's function
and we know in fact only a few examples in which the Green's function is
known in finite and closed form. Here another approach is frequently
more adequate which elucidates the nature of the Green's function from an
entirely different angle.
If we recall our basic discussions in introducing the Green's function
(cf. Section 2), we observe that our deductions were essentially based on
the absence of something. By adding the value of v(x) at the point x to our
data, we have added a surplus dimension to the U-space and obtained a
compatibility condition for our over-determined system by demanding that
the right side must be orthogonal to a principal axis which is associated
with the eigenvalue zero, because that axis is not included in the eigen-space
of the operator. It was this compatibility condition between v(x) and the
given data which led to the solution of our system in terms of the Green's
function. It is thus understandable that the defining equation of the
Green's function is almost completely the adjoint homogeneous equation
excluding only an arbitrarily small neighbourhood of the point ξ = x at
which the functional value v(x) was prescribed.

On the other hand, in our discussions of matrices we have seen that a


matrix as an operator could be completely characterised in terms of those
principal axes which are positively represented in the matrix. To find all
the principal axes of the eigenvalue problem (3.7.7) is generally a much
more elaborate task than solving the linear system (3.6.1) by direct matrix
inversion. But if we possess all these principal axes, then we possess also
the solution of our linear system, without having recourse to determinants
or matrix inversion. In Chapter 3.11 we have seen that by a proper
orthogonal transformation of both the unknown vector and the right side
an arbitrary n × m matrix could be diagonalised. In the diagonal form the
equations are separated and automatically solvable.
Here we have a method of solving a linear system which omits the zero
axes—in which the operator is not activated—and operates solely with
those axes which are actively present in the operator. These axes were
characterised by the "shifted eigenvalue problem" (3.7.7):

which, if translated into the field of differential operators, has to be


interpreted as the following pair of equations:

Here v(x) is subjected to the given homogeneous (or homogenised) boundary


conditions, u(x) to the adjoint homogeneous conditions.
The new feature which enters, in contrast to the matrix problem, is
that the function space has an infinity of dimensions and accordingly we
obtain an infinity of solutions for the system (2). The eigenvalues λi of our
system usually remain discrete, that is, our system is solvable only for a
definite sequence of discrete eigenvalues*

which we will arrange in increasing order, starting with the smallest eigen-
value λ1, and continuing with the larger eigenvalues, which eventually
become arbitrarily large. In harmony with our previous policy we will omit
all the negative λi and also all the zero eigenvalues, which in the case of
partial operators may be present with infinite multiplicity.
Now the corresponding eigenfunctions

and

* We consider solely finite domains; the confluence of eigenvalues in an infinite space


is outside the scope of our discussions.
form an ortho-normal set of functions if we agree that the "length" of all
these functions (cf. Chapter 4.7) shall be normalised to 1:

(The normalisation of the ui(x) automatically entails the normalisation of


the vi(x) and vice versa; cf. Problem 118.)
These functions represent in function space an orthogonal set of base
vectors of length 1. The fact that their number is infinite does not
automatically guarantee that they include the entire function space. In
fact, if the eigenvalue problem (2) allows solutions for λ = 0—which makes
our system either over-determined or incomplete or both—we know in
advance that by throwing away these solutions our function system (4)
will not cover the entire function space. But these functions will cover
the entire eigen-space of the operator and this is sufficient for the solution
of the problem

or

The omission of the zero-axes is a property of the operator itself and not
the fault of the (generally incomplete) orthogonal system (4)—which,
however, must not omit any of the eigenfunctions which belong to a positive
λi, including possible multiplicities on account of two or more eigenvalues
collapsing into one, in which case the associated eigenfunctions have to be
properly ortho-normalised.
The eigenfunctions ui(x) are present in sufficient number to allow an
expansion of β(x) into these functions, in the form of an infinite convergent
series:

We want to assume that β(x) belongs to a class of functions for which the
expansion converges (if β(x) is everywhere in the given domain finite,
sectionally continuous, and of bounded variation, this condition is certainly
satisfied. It likewise suffices that β(x) shall be sectionally differentiable).
We now multiply this expansion by a certain uk(x) and integrate over the
given domain term by term. This is permissible, as we know from the
theory of convergent infinite series. Then on the right side every term
except the kth drops out, in consequence of the orthogonality conditions (5),
while in the kth term we get βk. And thus

On the other hand, the unknown function v(x) can likewise be expanded,
but here we have to use the functions vi(x) (which may belong to a completely

different functional domain, for example v(x) may be a scalar, β(x) a vector,
cf. Section 5, and Problem 236):

Here again we obtain for the expansion coefficients

but this relation is of no help if v(x) is an unknown function. However,


the expansion coefficients βi are available if β(x) is a given function, and
now the differential equation (6) establishes the following simple relation
between the expansion coefficients bk and βk:

from which

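The matrix counterpart of this recipe is worth keeping in mind: expand the right side in the left principal axes, divide by the eigenvalues, and recombine with the right axes, omitting the zero axes throughout. A numpy sketch with an illustrative rank-deficient matrix (our own example, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4)) @ np.diag([3., 2., 1., 0.])   # rank 3 by construction
b = A @ rng.standard_normal(4)                                # a compatible right side

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A v_i = lambda_i u_i
active = s > 1e-10                     # omit the zero axes (operator not activated)
beta = U.T[active] @ b                 # expansion coefficients of the right side
v = Vt[active].T @ (beta / s[active])  # divide by the eigenvalues, recombine

print(np.allclose(A @ v, b))           # True: the expansion solves the system
```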
We will assume that the eigenvalue spectrum (3) starts with a definite finite
smallest eigenvalue λ1; this is not self-evident, since the eigenvalue spectrum
may have a "condensation point" or "limit point" at λ = 0, in which case
we have an infinity of eigenvalues which come arbitrarily near to λ = 0
and a minimum does not exist. Such problems will be our concern in a
later chapter. For the present we exclude the possibility of a limit point
at λ = 0. Then the infinite series

converges even better than the original series (8)—because we divide by
the λi, which increase to infinity—and we can consider the series (14) as the
solution of our problem. This solution is unique, since we have put the
solution into the eigen-space of the operator, in case the homogeneous
equation
solution into the eigen-space of the operator, in case the homogeneous
equation

allows solutions which do not vanish identically. The uniqueness is


established by demanding that v(x) shall be orthogonal to every solution of
the homogeneous equation:

The corresponding condition for the given right side:

where u1(x) is any independent solution of the adjoint homogeneous equation:


is demanded by the compatibility of the system: the right side must have
no components in those dimensions of the function space which are not
included by the operator. We have to test the given function β(x) as to
the validity of these conditions because, if these conditions are not fulfilled,
we know in advance that the given problem is not solvable.
Problem 234. Given the partial differential equation

in a closed two-dimensional domain with the boundary condition

on the boundary S. Find the compatibility conditions of this problem.

[Answer:

for every fixed y1.]


Problem 235. Given the same homogeneous problem

with inhomogeneous boundary conditions

find the compatibility conditions for the boundary data f(S).


[Answer:

for all y = y1.]



Problem 236. Formulate the eigenvalue problem for the scalar-vector problem
(5.3) and obtain the orthogonal expansions associated with it.
[Answer:

with the boundary condition

where

(dτ = volume element of 3-dimensional space).

5.27. The bilinear expansion


We consider the solution of the differential equation (26.6) which we
have obtained in the form

where

If we substitute this value of βi into the expansion (1), we obtain:

In this sum the process of summation and integration is not necessarily


interchangeable. But if we do not go with i to infinity but only up to n,
the statement of equation (3) can be formulated as follows:

where
the auxiliary function Gn(x, ξ) being defined as follows:

From the fact that sn(x) converges to a definite limit—namely v(x)—


we cannot conclude that the sequence Gn(x, ξ) must converge to a definite
limit. Even if the limit of Gn(x, ξ) (n growing to infinity) does not exist,
the integration over ξ, after multiplying by β(ξ), will entail convergence.
But even assuming that Gn(x, ξ) diverges, this divergence is so weak that
a very small modification of the coefficients 1/λi assures convergence. This
modification is equivalent to a local smoothing of the same kind that changes
δ(x, ξ) to δε(x, ξ). Hence we can say that with an arbitrarily small modifica-
tion of the coefficients the sum (6) converges to a function Gε(x, ξ) which
differs only by the order of magnitude ε from a definite function G(x, ξ).
With this understanding it is justified to put

and replace the sum (5) in the limit by

But then we are back at the standard solution of a differential equation by


the Green's function and we see that our function (7) has to be identified
with the Green's function of the given differential operator. The important
expansion (7) is called the "bilinear expansion" since it is linear in the
functions vi(x) and likewise in the functions ui(ξ).
In the literature the bilinear expansion of the Green's function appears
usually in the form

and is restricted to self-adjoint operators. The eigenvalues λi and the


associated eigenfunctions vi(x) are then defined in terms of the traditional
eigenvalue problem

once more omitting the solutions for λ = 0 but keeping all the positive and
negative λi for which a solution is possible. The transition to the "shifted
eigenvalue problems" (26.2) permits us to generalise the usual self-adjoint
expansion to a much wider class of operators, which includes not only
"well-posed", although not self-adjoint, problems but even the case of
arbitrarily over-determined or under-determined problems. Hence the
functions vi(x), ui(ξ) need not belong to the same domain of the function
space but may operate in completely different domains.
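The self-adjoint form (9) can be tested on the simplest textbook operator, −v″ on [0, 1] with v(0) = v(1) = 0, whose Green's function is known in closed form; this standard example is our own illustration, not one of the book's numbered formulas:

```python
import numpy as np

# eigenfunctions sqrt(2) sin(k*pi*x), eigenvalues (k*pi)^2
def G_series(x, xi, n=2000):
    k = np.arange(1, n + 1)
    return np.sum(2 * np.sin(k*np.pi*x) * np.sin(k*np.pi*xi) / (k*np.pi)**2)

def G_closed(x, xi):
    # known closed form of the Green's function of -v'' with v(0) = v(1) = 0
    return x * (1 - xi) if x <= xi else xi * (1 - x)

for x, xi in [(0.2, 0.7), (0.5, 0.5), (0.9, 0.3)]:
    print(G_series(x, xi), G_closed(x, xi))   # agree to ~1e-4 at n = 2000
```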

For example in our Problem 236 the functions vi(x) are the scalar functions
φi(x), while the functions ui(ξ) are the vectorial functions grad φi(ξ). The
Green's function of our problem—which is a scalar with respect to the
point x and a vector with respect to the point ξ—is obtainable with the help
of the following infinite expansion:

If we compare this Green's function with the much simpler solution


found in Section 5, we are struck by the simplicity of the previous result and
the complexity of the new result. The use of the Green's function (11)
entails an integration over a three-dimensional domain:

while previously a simple line-integral (5.16) gave the answer. Moreover,


the previous Green's function could be given explicitly while the new Green's
function can only be given as the limit of an infinite sum whose actual
construction would require an exceedingly elaborate scheme of calculations.
What is the cause of this discrepancy?
The problem we have studied is strongly over-determined since we have
given a vector field for the determination of a scalar field. This means that
the UM = U + U0 space of the right side extends far beyond the confines
of the U space in which the vector is activated. The Green's function (11)
is that particular Green's function which spans the complete eigen-space of
the operator but has no components in any of the dimensions which go
beyond the limitations of the U space. On the other hand, there is no
objection to the use of a Green's function which spills over into the U0
space. We can add to our constrained G(x, ξ), defined by (11), an arbitrary
sum of the type

where the pi(x) can be chosen freely as any functions of x. This additional
sum will not contribute anything to the solution v(x) since the right side
satisfies the compatibility conditions

(cf. 26.17) and thus automatically annuls the contribution from the added
sum (13). As we have seen in Section 5, over-determined systems possess
the great advantage that we can choose our Green's function much more
liberally than we can in a well-determined problem n = m = p, where in
fact the Green's function is uniquely defined.
Another interesting conclusion can be drawn from the bilinear expansion
concerning the reciprocity theorem of the Green's function, encountered
earlier in Section 12. Let us assume that we want to solve the adjoint
equation (26.7). Then our shifted eigenvalue problem (26.2) shows at once
that in this case we get exactly the same eigenvalues and eigenfunctions, with
the only difference that the role of the functions ui(x) and vi(x) is now
exchanged. Hence the bilinear expansion of the new Green's function
G̃(x, ξ) becomes:

but this is exactly the previous expansion (7), except that the points x and
ξ are exchanged; and thus

which is in fact the fundamental reciprocity theorem of the Green's function.


In the case of self-adjoint systems the bilinear expansion (9) becomes in
itself symmetric in x and ξ and we obtain directly the symmetry theorem of
the Green's function for such systems:

All these results hold equally for ordinary and for partial differential
operators, since they express a basic behaviour which is common to all linear
operators. But in the case of ordinary differential equations a further
result can be obtained. We have mentioned that generally the convergence
of the bilinear expansion (7) cannot be guaranteed without the proper
modifications. The difficulty arises from the fact that the Green's function
of an arbitrary differential operator need not be a very smooth function.
If we study the character of the bilinear expansion, we notice that we can
conceive it as an ordinary orthogonal expansion into the ortho-normal system
vi(x), if we consider x as a variable and keep the point ξ fixed, or another
orthogonal expansion into the eigenfunctions ui(ξ), if we consider ξ as the
variable and x as a fixed point. The expandability of G(x, ξ) into a con-
vergent bilinear series will then depend on whether or not the function
G(x, ξ) belongs to that class of functions which allow an orthogonal expansion
into a complete system of ortho-normal functions. This "completeness" is
at present of a restricted kind, since the functions vi(x) and ui(x) are generally
complete only with respect to a certain subspace of the function space.
However, this subspace coincides with the space in which the constrained
Green's function finds its place. Hence we have no difficulty on account of
the completeness of our functions. The difficulty arises from the fact that
G(x, ξ) may not be quadratically integrable or may be for other reasons too
unsmooth to allow an orthogonal expansion.
In the domain of ordinary differential equations, however, such an
unsmoothness is excluded by the fact that the Green's function, considered
as a function of x, satisfies the homogeneous differential equation Dv(x) = 0
with the only exception of the point x = ξ. Hence G(x, ξ) is automatically

a sectionally continuous and even differentiable function which remains


everywhere finite. The discontinuity at the point x = ξ (in the case of
first order operators) is the only point where the smoothness of the function
suffers. But this discontinuity is not sufficient to destroy the convergence
of an orthogonal expansion, although naturally the convergence cannot be
uniform at the point of discontinuity. And thus we come to the conclusion
that in the case of ordinary differential operators we can count on the
convergence (and even uniform convergence if the point x = ξ is excluded)
of the bilinear expansion, without modifying the coefficients of the expansion
by local smoothing.

Problem 237. In Section 23 we have studied Legendre's differential operator


which is self-adjoint. Its eigenvalues are

with the normalised eigenfunctions

where Pk(x) are the "Legendre polynomials", defined by

These polynomials are alternately even and odd, e.g.

They have the common characteristic that they assume at x = 1 their maximum
value 1.
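These stated properties are easily verified numerically before turning to the expansions below; a small sympy check of the eigenvalue relation, the normalisation integral, and the endpoint value (standard conventions assumed):

```python
import sympy as sp

x = sp.symbols('x')
for k in range(5):
    Pk = sp.legendre(k, x)
    # eigenvalue relation: -d/dx[(1 - x^2) Pk'] = k(k+1) Pk
    residual = sp.expand(-sp.diff((1 - x**2) * sp.diff(Pk, x), x) - k*(k+1)*Pk)
    norm = sp.integrate(Pk**2, (x, -1, 1))      # equals 2/(2k+1)
    print(k, residual, norm, Pk.subs(x, 1))     # 0, 2/(2k+1), 1
```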
On the basis of the results of Section 23 obtain the following infinite
expansions:

Problem 238. Legendre's differential operator can be obtained by starting with


the first order operator

Then the operator on the left side of (23.1) becomes −D̃Dv, which shows that
the eigenvalues of the shifted eigenvalue problem (26.2) associated with (28)
are equal to

Obtain the Green's function of the operator (28) for the range [0, 1], with the
boundary condition

and apply to it the bilinear expansion (6).


[Answer:

Problem 239. Show that at the point of discontinuity ξ = 0 the series (32)
yields the arithmetic mean of the two limiting ordinates, and thus:

Problem 240. Obtain the Green's function and its bilinear expansion for the
following operator:

Do the same for the operator D̃D.



[Answer:

Problem 241. Solve the same problem for the boundary condition

[Answer:

Problem 242. Solve the same problem for the boundary condition

[Answer:

Problem 243. Consider the same problem with the boundary condition

where a is an arbitrary real constant (excluding the value a = 1 which was


treated before). Study particularly the expansions for the point ξ = x.
[Answer:
Define an angle λ0 between ±π by putting

Then the eigenvalues and associated eigenfunctions become:

5.28. Hermitian problems


In all our previous dealings we have restricted ourselves to the case of
real operators with real boundary conditions. However, in applied problems
the more general case of complex elements is of frequent occurrence. For
example the fundamental operations of wave-mechanics have the imaginary
unit i inherently built into them. These operations are self-adjoint in the
Hermitian sense. But even in classical physics we encounter the need for
complex elements. In all diffraction problems we solve the time-dependent
wave equation by taking out the factor e^{iωt}, thus reducing the wave equation
to the equation
This equation does not reveal any complex elements. However, the
boundary condition of "outgoing waves" demands in infinity the condition

This condition would be self-adjoint in the algebraic sense but is not self-
adjoint in the Hermitian sense since the adjoint boundary condition becomes

If we want to make use of the method of eigenfunctions for the solution of


our diffraction problem, we have to complement the given problem by the
adjoint problem which demands incoming instead of outgoing waves.
The general procedure of obtaining the adjoint operator in the presence
of complex elements is as follows. We go through the regular procedure
of obtaining the adjoint operator D̃ and the adjoint boundary conditions,
paying no attention to the fact that some of the coefficients encountered
in this process are complex numbers. Now, after obtaining our D̃u, we
consider this expression as a preliminary result and obtain the final D* by
changing every i to −i. For example in the above diffraction problem the
given differential operator is self-adjoint and thus

There is no change here since the imaginary unit does not occur anywhere.
However, the adjoint boundary condition—obtained in the usual fashion,
with the help of the extended Green's identity—becomes

This condition has to be changed to

and we see that our problem loses its self-adjoint character. Without this
change of i to −i, however, our eigenvalue problem would lose its significance
by not yielding real eigenvalues or possibly not yielding any eigensolutions
at all. On the other hand, we know in advance from the general analytical
theory that the shifted eigenvalue problem with the proper boundary
conditions will yield an infinity of real eigenvalues and a corresponding set
of eigenfunctions which, although complex in themselves, form an ortho-
normal set of functions in the sense that

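The matrix analogue makes the i → −i rule transparent: the proper adjoint of a complex matrix is its conjugate transpose, and the shifted eigenvalue problem built with it still yields real eigenvalues with Hermitian ortho-normal axes. A numpy sketch (random illustrative matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# transposing alone is the wrong adjoint; conjugation (i -> -i) completes it
U, s, Vh = np.linalg.svd(A)
V = Vh.conj().T
print(np.allclose(A @ V, U * s))              # A v_i = lambda_i u_i, lambda_i real
print(np.allclose(A.conj().T @ U, V * s))     # A* u_i = lambda_i v_i
print(np.allclose(V.conj().T @ V, np.eye(3))) # Hermitian ortho-normality
```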
In this section we will study the nature of such problems with the help
of an over-simplified model which is nevertheless instructive by demonstrating

the basic principles without serious technical complications. We return to


the previous Problem 243 with the only change that in the boundary condition
(27.56) we will now assume that a is a complex constant. Since the adjoint
boundary condition came out previously in the form

we now have to change this condition to

Since the differential operator has not changed, we know in advance


that our eigensolutions vk(x) will once more be of the form

and the λk must again become positive real numbers. Moreover, the shifted
eigenvalue problem yields

The two boundary conditions (27.56) and (8) give the following two conditions:

which, if expanded, gives the determinant condition

and thus

We see that in spite of the complex nature of a the eigenvalues λk become
real. In fact, we obtain once more the same system of eigenvalues as before
in (27.58):

where λ0 is defined as an angle between 0 and π, satisfying the equation

The eigenfunctions are likewise similarly constructed as those tabulated in


(27.58), but the previous angle θ0 now becomes complex, being determined
by the relation

The eigenfunctions vk(x), uk(x) now become

where the upper sign holds for λk = 2kπ + λ0 and the lower sign for
λk = 2kπ − λ0. The amplitude factor Ak follows from the condition

This condition leads to the following normalisation of the functions v(x)


and u(x):

with

The boundary conditions (12) establish the following relation between λ0,
γ and the original complex constant a:

(For the sake of formal simplicity we have departed from our usual con-
vention of consistently positive eigenvalues. If we want to operate with
consistently positive λk, we have to change the sign of λ, θ0, and γ for the
second group of eigenvalues which belong to the negative sign of the formula
(21).)
The Green's function can again be constructed along the usual lines.
However, in view of the complex elements of the operator (which in our
problem come into evidence only in the boundary conditions) some character-
istic modifications have to be observed. First of all, the Green's function
corresponds to the inverse operator, and this feature remains unaltered even
in the presence of complex elements. Since the proper algebraic adjoint
is not the Hermitian adjoint D* but D̃, the definition of the Green's function
—considered as a function of ξ—must occur once more in terms of D̃:

Furthermore, if the homogeneous equation

possesses non-vanishing (orthogonalised and normalised) solutions, the


modification of the right side has to be made as follows:

while in the case of the adjoint Green's function G̃(x, ξ) the corresponding
equation becomes:

(Notice that the asterisk appears consistently in connection with the variable
ξ.) The bilinear expansion (27.7) is now modified as follows:

The corresponding expansion of the adjoint Green's function appears in


the form

while the symmetry theorem of a self-adjoint problem becomes

The expansion of the right side β(x) of the differential equation

occurs once more in terms of the eigenfunctions ui(x):

but the expansion coefficients are obtained in terms of the integrals

The procedure is similar with respect to the expansion of v(x) into the
ortho-normal eigenfunctions vi(x).
We will apply these formulae to our problem (8-9). Let us first construct
the Green's function associated with the given operator Dv = v'. The rule
(24) demonstrates that we obtain once more the result of Problem 243
(cf. 27.59), although the constant a is now complex:

We now come to the construction of the self-adjoint operator D̃Dv(x).


Here the defining differential equation becomes

with the boundary conditions

The solution of the four conditions at ξ = 0, ξ = 1, and ξ = x yields

A comparison with the previous expression (27.61) demonstrates that for


the case of real values of a the previous result is once more obtained. We
can demonstrate, furthermore, that our G(x, ξ) satisfies the given boundary
conditions in the variable x, while in the variable ξ the same holds if every
i is changed to −i. We also see that the symmetry condition (31) of a
Hermitian Green's function is satisfied.
We now come to the study of the bilinear expansion (28) of the Green's
function (35). Making use of the eigenfunctions (20) and separating real
and imaginary parts we encounter terms of the following kind:
Real part:

Imaginary part:

If now we divide by λ and form the sum, we notice that the result is
expressible in terms of two functions f(t) and g(t):

Then the real part of the sum becomes

and the imaginary part

In order to identify the two functions f(t) and g(t), we will make use of
the fact that both systems vk(x) and uk(x) represent a complete ortho-
normal function system, suitable for the representation of arbitrary section-
ally continuous and differentiable functions. Let us choose the function

and expand it into a series of vk(x) functions:

where

In our problem we obtain

with the abbreviations

Since the imaginary part of the left side of (48) must vanish, we get

and taking out the constants cos θ0 and sin θ0 in the trigonometric sums
(49), we finally obtain for the two sums (41) and (42):

Now we return to our formulae (43), (44), substituting the proper values
for f(x) and g(x). Assuming that ξ < x (and x + ξ < 1), we obtain for (43):

and for (44):

Combining real and imaginary parts into one complex quantity we finally
obtain

which in view of (23) yields

in full accordance with the value of G(x, ξ) for ξ < x (cf. 34). If ξ > x,
the only change is that the first term of (53) changes its sign and we obtain
the correct value of G(x, ξ) for ξ > x.
Problem 244. In the above proof the restricting condition x + ξ < 1 was
made, although in fact x + ξ varies between 0 and 2. Complement the proof
by obtaining the values of f(t) and g(t) for the interval 1 < t < 2. (Hint: put
x = 1 + x'.) Show that at the point of discontinuity t = 1 the series yield
the arithmetic mean of the two limiting ordinates.
[Answer:

Problem 245. By specifying the values of x to 0 and ξ obtain from (51) and
(52) generalisations of the Leibniz series

[Answer:

In particular:

Problem 246. Consider a as purely imaginary, a = iω, and demonstrate for this
case the validity of the bilinear expansion of the second Green's function (38).
Problem 247. Obtain for the interval x = [0, 1] the most general Hermitian
operator of first order and find its Green's function G(x, ξ).
[Answer:

Boundary condition:

where

(a an arbitrary real constant).

where

Problem 248. Putting

where p(x) is a monotonically increasing function, prove that the following set of
functions form a complete Hermitian ortho-normal set in the interval [0, 1]:

Boundary condition:

Problem 249. Choose

and obtain an expansion of the function

into the ortho-normal functions vk(x). Investigate the behaviour of the


expansion at the two endpoints x = 0 and x = 1.
[Answer:

where

In particular, if we put, we obtain the interesting series

which explains the numerical closeness of


the series gives 3/2
the series gives 3/2.]

Problem 250. By choosing

obtain a similar expansion for the function (1 + x)P.


[Answer:

Problem 251. Make the implicit transformation of x into t by

and show that the expansion into the functions (68) is equivalent to the Fourier
series in its complex form.

5.29. The completion of linear operators


We have had many occasions to point out that the eigen-space of a linear
operator is generally incomplete, including only p dimensions of the
m-dimensional V-space and likewise p dimensions of the n-dimensional
U-space. We have considered a linear system incomplete only if p < m
because the condition p < n had merely the consequence that the right side
of the given system had to be subjected to the compatibility conditions of
the system but had no influence on the uniqueness of the solution. If
p = m, the solution of our problem was unique, irrespective of whether p
was smaller or equal to n. From the standpoint of the operator it makes no
difference whether the eigen-space of the operator omits certain dimensions
in either the one or the other space, or possibly in both spaces. The operator
is incomplete in all these cases.
We will now discuss the remarkable fact that an arbitrarily small

modification of an incomplete operator suffices to make the operator complete


in all possible dimensions of both U and V spaces.
First we restrict ourselves to the self-adjoint case. Instead of considering
the equation

we will modify our equation by putting

where ε is a small parameter which we have at our disposal. We see at


once that the new eigenvalue problem

has exactly the same eigenfunctions as our previous problem, while the new
eigenvalues λ'i have changed by the constant amount ε:

Now we have assumed that the eigenvalue λ = 0 shall not be a limit-


point of the eigenvalue spectrum, that is, the eigenvalue λ = 0 is a discrete
value of the eigenvalue spectrum, of arbitrarily high multiplicity. Then
the addition of ε to the eigenvalues, if ε is sufficiently small, definitely
eliminates the eigenvalue λ = 0. But it was precisely the presence of the
eigenvalue λ = 0 which caused the incompleteness of the operator D.
The new modified operator D + ε is free of any incompleteness and includes
the entire function space. The associated eigenfunction system is now
complete, and the right side need no longer satisfy the condition of being
orthogonal to all the solutions vi(x) of the homogeneous equation

These solutions now belong to the eigenvalue ε and cease to play an


exceptional role. This remains so even if we diminish ε to smaller and smaller
values. No matter how small ε becomes, the modified operator includes the
entire function space.
The previous (constrained) Green's function could be expanded into the
eigenfunctions vi(x), without the v1(x):

The new Green's function becomes:

This function is defined in the usual fashion, paying no attention to the


modification demanded by the existence of homogeneous solutions:
We may find it more convenient to solve this equation—in spite of the ε-
term on the left side—because it has the δ-function alone on the right side,
without any modifications. Then we can return to the Green's function (6)
of the modified problem by the following limit process. We consider ε as
a small parameter and expand our G'(x, ξ) into powers of ε. There is first
a term which is inversely proportional to ε. Then there is a term which is
independent of ε, and then there are higher order terms, proportional to
ε, ε², . . . , which are of no concern. We need not go beyond the first two
powers: ε⁻¹ and ε⁰. We omit the term with ε⁻¹ and keep only the constant
term. This gives us automatically the correct Green's function (6),
characterised by the fact that it contains no components in the direction of
the missing axes. In Section 22 we made use of this method for obtaining
the Green's function of the constrained problem.
If we now investigate the solution of our problem (2), it is in fact true
that the right side β(x) is no longer subjected to any constraints and is freely
choosable. We can expand it into the complete function system vi(x), v1(x):

with

It is only when we come to the solution v(x) and the limit process involved
in the gradual decrease of ε that the difficulties arise:

For every finite ε the solution is unique and finite. But this solution does
not approach any limit as ε converges to zero, except if all the β1 disappear,
which means the conditions

We are thus back at our usual compatibility conditions but here


approached from a different angle. By a small modification of the operator
we have restored to it all the missing axes and extended our operator to
the utmost limits of the function space. We have no difficulty any more
with the eigenvalue zero, which is in fact abolished. But now we watch
what happens to our solution as ε decreases to zero. We observe that in
the limit a unique solution is obtained, but only if the right side satisfies the
demanded compatibility conditions. If any of these conditions is not ful-
filled, the solution does not converge to any limit and our original problem
(which corresponds to ε = 0) has in fact no solution.
We will now extend our considerations to arbitrary linear operators, no

matter how under-determined or over-determined they may be. We start


out with the equation (1) which we complement, however, by the adjoint
homogeneous equation. We thus consider the pair of equations

By this procedure we have done no harm to our problem since the second
equation is completely independent of the first one and can be solved by the
trivial solution

But now we will establish a weak coupling between the two equations by
modifying our system as follows:

This means from the standpoint of the shifted eigenvalue problem

that once more the eigenfunctions have remained unchanged while the
eigenvalues have changed by the constant amount ε, exactly as in (4).
Once more the previous eigenvalue λ = 0 has changed to the eigenvalue
λ = ε, and the zero eigenvalue can be avoided by making ε sufficiently small.
Hence the previously incomplete operator becomes once more complete and
spans the entire U space and the entire V space. We know from the general
theory that now the right side can be given freely and the solution becomes
unique, no matter how small ε may be chosen. In matrix language we
have changed our original n × m matrix of rank p to an (n + m) ×
(n + m) matrix of rank n + m. The conditions of a "well-determined"
and "well-posed" problem are now fulfilled: the solution is unique and the
right side can be chosen freely.
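A finite-dimensional sketch shows the mechanism; the coupled block system below (our own 3 × 2 rank-1 example, with the coupling written as εu + Av = b, Aᵀu + εv = 0) is uniquely solvable for every small ε, and the behaviour of u as ε → 0 signals whether the compatibility conditions hold:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.],
              [0., 0.]])                    # rank 1: incomplete in both spaces
b_good = A @ np.array([1., 1.])             # compatible right side
b_bad  = b_good + np.array([0., 0., 1.])    # component outside the range of A

n, m = A.shape
for name, b in (("compatible", b_good), ("incompatible", b_bad)):
    for eps in (1e-3, 1e-5):
        M = np.block([[eps * np.eye(n), A],
                      [A.T, eps * np.eye(m)]])   # all eigenvalues shifted by eps
        sol = np.linalg.solve(M, np.concatenate([b, np.zeros(m)]))
        u, v = sol[:n], sol[n:]
        print(name, eps, np.linalg.norm(u), v)
# compatible: u -> 0 and v -> the constrained solution;
# incompatible: |u| grows like 1/eps, signalling the violated compatibility.
```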
The right side β(x) can be analysed in terms of the complete ortho-
normal function system ui(x), u1(x):

where
while the solution u(x), v(x) can be analysed in terms of the complete
ortho-normal function systems ui(x), uj(x), respectively vi(x), vk(x):

Then the differential equation (15) establishes the following relation between
the expansion coefficients ai, bi on the one hand and βi, βj on the other

These formulae hold without exception, including the eigenvalue λ = 0,


for which our previous conventions employed the upper indices j and k.
Hence we have to complement the formulae (20) by the additional formulae
(substituting λi = 0):

The second equation shows that none of the eigenfunctions vk(x) which are
not represented in the operator Dv(x) appear in the expansion (19). The
normalisation we have employed before, namely to put the solution com-
pletely into the activated V-space of the operator, is upheld by the perturbed
system (15), which keeps the solution constantly in the normalised position,
without adding components in the non-activated dimensions.
The new system includes the function u(x) on an equal footing with the
function v(x). Now the first formula of (20) shows that the solution u(x)
is weakly excited in all the activated dimensions of the U-space and converges
to zero with ε going to zero. This, however, is not the case with respect to
the non-activated dimensions uj(x). Here the first formula of (21) shows
that the solution increases to infinity with ε going to zero, except if the
compatibility conditions βj = 0 are satisfied. Once more we approach our
problem from a well-posed and well-determined standpoint which does not
involve any constraints. These constraints have to be added, however, if
we want our solution to approach a definite limit with ε going to zero.
These results can also be stated in terms of a Green's function which now
becomes a "Green's vector" because a pair of equations is involved. Since
the second equation of the system (15) has zero on the right side, only the
two components G1(x, ξ)1 and G2(x, ξ)1 are demanded:

The formulae (20) and (21) establish the following bilinear expansions for
the two components of the Green's function:

On the other hand, if our aim is to solve the adjoint equation

with the perturbation (15), we need the other two components of the Green's
vector:

The relation

is once more fulfilled. We can, as usual, define the Green's function G2(x, ξ)1
by considering ξ as the active variable and solving the adjoint equation
D̃u(ξ) = δ(x, ξ). But in our case that equation takes the form

and we obtain a new motivation for the modification of the right side which
is needed in the case of a constrained system. The expression (25) for
G2(x, ξ)2 shows that the first term goes to zero while the second term is
proportional to 1/ε, and thus εv(ξ) will contribute a finite term. If we write
(27) in the form

we obtain, as ε goes to zero:

Hence we are back at the earlier equation (22.13) of Section 22, which
defined the differential equation of the constrained Green's function. We
see that the correction term which appears on the right side of the equation
can actually be conceived as belonging to the left side, due to the small
modification of the operator by the ε-method, which changes the constrained
operator to a free operator and makes its Green's function amenable to the
general definition in terms of the delta function. Hence the special position
of a constrained operator disappears and returns only when we demand
that the solution shall approach a definite limit as ε converges to zero.
We will add one more remark in view of a certain situation which we
shall encounter later. It can happen that the eigenvalue λ = 0 has the
further property that it is a limit point of the eigenvalue spectrum. This
means that λ = 0 is not an isolated eigenvalue of the eigenvalue spectrum
but there exists an infinity of λi-values which come arbitrarily near to zero.
In this case we find an infinity of eigenvalues between 0 and ε, no matter
how small we may choose ε. We are then unable to eliminate the eigen-
value λ = 0 by the ε-method discussed above.
The difficulty can be avoided, however, by choosing ε as purely imaginary,
that is, by replacing ε by −iε. In this case the solution v(x) remains real,
while u(x) becomes purely imaginary. The Green's functions (23) now
become

Although the eigenvalues of the problem (16) have now the complex values
λk − iε, this is in no way damaging, as the expressions (30) demonstrate.
The eigenvalues of the modified problem cannot be smaller in absolute value
than |ε|, and the infinity of eigenvalues which originally crowded around
λ = 0 now crowd around the eigenvalue −iε but cannot interfere with the
existence of the Green's function and its bilinear expansion in the sense of
(30). We are thus able to handle problems—as we shall see later—for which
the ordinary Green's function method loses its significance, on account of
the limit point of the eigenvalue spectrum at λ = 0.
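A one-line numerical illustration of why the imaginary shift succeeds (an artificial spectrum of our own, crowding around zero from both sides):

```python
import numpy as np

k = np.arange(1, 100001)
lam = np.concatenate([1.0 / k, -1.0 / k])   # eigenvalues with limit point at 0
eps = 1e-3
print(np.abs(lam + eps).min())              # ~0: the real shift is defeated
print(np.abs(lam - 1j * eps).min())         # >= eps: bounded away from zero
```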
Problem 252. The Green's function (28.65) goes out of bounds for ω = 0 but
at the same time λ = 0 becomes an eigenvalue. Obtain for this case the proper
expression for the constrained Green's function.
[Answer:

BIBLIOGRAPHY
[1] Cf. {1}, pp. 351-96
[2] Cf. {3}, Chapter 3 (pp. 134-94)
[3] Cf. {7}, Part I, pp. 791-895
[4] Fox, C., An Introduction to the Calculus of Variations (Oxford University
Press, 1950)
[5] Kellogg, O. D., Foundations of Potential Theory (Springer, Berlin, 1929)
[6] Lanczos, C., The Variational Principles of Mechanics (University of Toronto
Press, 1949)
CHAPTER 6

COMMUNICATION PROBLEMS

Synopsis. Heaviside's "unit step function response" was the first
appearance of a Green's function in electrical engineering. The input-
output relations of electric networks provide characteristic examples
for the application of the method of the Green's function, although
frequently the auxiliary functions employed are the first or second
integrals of the mathematical Green's function. We inquire particu-
larly into the "fidelity problem" of communication devices which can
be analysed in terms of the Green's function, paying special attention
to the theory of the galvanometer. This leads to a brief discussion of
the fidelity problem of acoustical engineering. The steady state versus
transient analysis demonstrates the much more stringent requirements
which are demanded for the fidelity recording of noise, compared with the
proper recording of the sustained notes of symphonic instruments.

6.1. Introduction
Even before the Green's function received such a prominent position in
the mathematical literature of our days, a parallel development took place
in electrical engineering, through the outstanding discoveries of the English
engineer O. Heaviside (1850-1925). Although his scientific work did not receive
immediate recognition—due to faulty presentation and to some extent also
due to personal feuds—his later influence on the theory of electric networks
was profound. The input-output relation of electric networks can be
conceived as an excellent example of the general theory of the Green's
function and Green's vector, and the relation of Heaviside's method to the
standard Green's function method will be our concern in this chapter.
Furthermore, we shall include the general mathematical treatment of the
galvanometer problem, as an interesting example of a mathematically well-
defined problem in differential equations which has immediate significance
in the design of scientific instruments. This has repercussions also in the
fidelity problem of acoustical recording techniques.

6.2. The step function and related functions


We have seen in the general theory of the Green's function that the right
side of a differential equation could be conceived as a linear superposition
of delta functions (in physical interpretation "pulses", cf. Chapter 5.5). In
the pulse we recognise a fundamental building block from which even
the most complicated functions may be generated. If then we know the
solution of the differential equation for the pulse as input, we also know the
solution for arbitrary right sides.
This idea of a "fundamental building block" has further implications,
particularly in the realm of ordinary differential equations where the
independent variable x covers a simple one-dimensional manifold (which in
electric network theory has the significance of the "time t"). When we
wrote f(x) in the form

    f(x) = \int_a^b \delta(x, \xi)\, f(\xi)\, d\xi                        (1)

we expressed the construction of f(x) as a superposition of unit-pulses in
mathematical form. The "pulses" δ(x, ξ) which appear in this construction
are comparable to infinitely sharp and infinitely thin needles. They are
far from having any analytical properties; in fact they cannot be conceived as
legitimate functions in the proper sense of the word. In order to interpret
the equation (1) properly, a double limit process has to be employed. The
integral is defined as the limit of a sum, but the notation δ(x, ξ) itself hides
a second limit process since in fact we should operate with δ_ε(x, ξ) and let ε
converge to zero.
While we have succeeded in introducing a universal building block in the
form of the delta function δ(x, ξ), this function cannot be interpreted
properly without the inconvenience of a limit process. The fundamental
building block used by Heaviside in generating functions, namely the
"unit step function", is free of this objection. It is in a very simple
relation to the delta function by being its integral. The basic character of
the construction remains: once more we are in the position of generating
f(x) as a linear superposition of a base function which is transported from
point to point, multiplied by a suitable constant and then integrated.
This new base function, introduced by Heaviside, is defined as follows:

    \delta_1(x, \xi) = 0 \quad (a \le x < \xi), \qquad
    \delta_1(x, \xi) = 1 \quad (\xi \le x \le b)                          (2)

It remains zero between a and ξ, and then jumps to the constant value 1
between ξ and b. Exactly as the pulse was a universal function which was
rigidly transported to the point ξ—which made δ(x, ξ) into δ(x − ξ)—the
same can be said of the new function δ₁(x, ξ), which can be written in the form
δ₁(x − ξ), where the universal function δ₁(t) is defined as follows:

    \delta_1(t) = 0 \quad (t < 0), \qquad \delta_1(t) = 1 \quad (t \ge 0)    (3)

The derivative of δ₁(t) is the delta function δ(t).


Let us integrate by parts in the formula (1). Since δ(x − ξ) = −(d/dξ) δ₁(x − ξ),
we obtain

    f(x) = -\Big[\delta_1(x - \xi)\, f(\xi)\Big]_{\xi=a}^{\xi=b}
           + \int_a^b \delta_1(x - \xi)\, f'(\xi)\, d\xi                  (4)

The boundary term vanishes at ξ = b (since x − b is negative), while at the
lower limit δ₁(x − a) becomes 1. Thus

    f(x) = f(a) + \int_a^b \delta_1(x - \xi)\, f'(\xi)\, d\xi             (5)

The significance of this superposition principle becomes clear if we conceive
the integral as the limit of a sum. We then see that f(x) is generated as a
superposition of small step functions of the height f'(ξ)Δξ.

The disadvantage of Heaviside's method is that it presupposes the
differentiability of f(x), while before only the continuity of f(x)—in fact
not more than piecewise continuity—was demanded. The advantage is
that δ₁(t) is a legitimate function which requires no limit process for its
definition.
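
The superposition (5) is easy to try out numerically. The following sketch is
an added illustration (the choice of sin x on [0, 3] and the grid size are my
own assumptions, not the book's):

    import math

    a, b, n = 0.0, 3.0, 30                  # interval [a, b] and number of steps
    f, df = math.sin, math.cos              # sample function and its derivative

    def step(t):                            # Heaviside's unit step delta_1(t) of (3)
        return 1.0 if t >= 0.0 else 0.0

    def f_approx(x):
        # discretised form of (5): f(a) plus steps of height f'(xi) * dxi
        dxi = (b - a) / n
        xis = [a + (k + 0.5) * dxi for k in range(n)]
        return f(a) + sum(step(x - xi) * df(xi) * dxi for xi in xis)

    for x in (0.5, 1.5, 2.5):
        print("x=%3.1f  f=%8.5f  approximation=%8.5f" % (x, f(x), f_approx(x)))

The approximation improves as the number of steps n grows, exactly as the
limit-of-a-sum interpretation of (5) suggests.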

Although we have now generated f(x) by a superposition of step functions,
the basic building block is still rather rugged. We need a very large number
of these building blocks for a fairly satisfactory approximation of the
function f(x), although f(x) itself is not only continuous but even
differentiable. We will now repeat the process and integrate a second time,
assuming that even the second derivative of f(x) exists:

    f(x) = f(a) - \Big[\delta_2(x - \xi)\, f'(\xi)\Big]_{\xi=a}^{\xi=b}
           + \int_a^b \delta_2(x - \xi)\, f''(\xi)\, d\xi                 (7)

The new building block is now the integral of the previous step function.
This new function δ₂(t) is already continuous, although its tangent is dis-
continuous at t = 0:

    \delta_2(t) = 0 \quad (t < 0), \qquad \delta_2(t) = t \quad (t \ge 0)    (8)

The boundary term of (7) becomes f'(a)(x − a), which yields the following
formula:

    f(x) = f(a) + f'(a)(x - a) + \int_a^b \delta_2(x - \xi)\, f''(\xi)\, d\xi   (9)

Now our function is put together with the help of straight line portions and
the ruggedness has greatly decreased. We will now succeed with a much
smaller number of building blocks for a satisfactory approximation of f(x).



One further integration leads us to the new building block

    \delta_3(t) = 0 \quad (t < 0), \qquad \delta_3(t) = \tfrac{1}{2} t^2 \quad (t \ge 0)    (11)

Here even the discontinuity of the tangent is eliminated and only the
curvature becomes discontinuous at t = 0. We now get by integrating by
parts

    f(x) = f(a) + f'(a)(x - a) - \Big[\delta_3(x - \xi)\, f''(\xi)\Big]_{\xi=a}^{\xi=b}
           + \int_a^b \delta_3(x - \xi)\, f'''(\xi)\, d\xi                (12)

and the resulting formula becomes

    f(x) = f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2
           + \int_a^b \delta_3(x - \xi)\, f'''(\xi)\, d\xi                (13)

Once more we have a universal function δ₃(t) which is shifted from point to
point, multiplied by the proper weight factor and the sum formed. But
now the base function is a parabolic arc which avoids discontinuity of either
function or tangent. The resulting curve is so smooth that a small number
of parabolic arcs can cover rather large portions of the curve. Since the
second derivative of δ₃(t) is a constant, our construction amounts to an
approximation of f(x) in which f''(x) in a certain range is replaced by its
average value and the same procedure is repeated from section to section.

In all these constructions the idea of a "building block" maintained its
significance: we have a universal function which can be rigidly shifted from
point to point, multiplied at each point with the proper weight factor and
then the integral formed.
Problem 253. Show that the generation of a function in terms of parabolic arcs
is equivalent to a solution of the differential equation

    f''(x) = \gamma                                                       (15)

where γ is a constant which jumps from section to section.

Problem 254. Obtain the three parabolic building blocks for the generation of
a function f(x) defined as follows:

[Answer:

Problem 255. Approximate the function f(x) = x³ in the range [0, 2] with the
help of two parabolic arcs in the interval x = [0, 1] and x = [1, 2], chosen in
such a way that the constants of the differential equation (15) shall coincide
with the second derivative of f(x) at the middle of the respective intervals.
[Answer:

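
As an added numerical illustration of this construction (the joining
conditions—continuity of value and slope at x = 1—are my own assumption;
the book's answer may normalise the arcs differently), the two arcs prescribed
in Problem 255 can be generated and compared with x³:

    # f(x) = x**3 approximated by two parabolic arcs on [0, 1] and [1, 2];
    # each arc solves f'' = gamma with gamma = 6 * (interval midpoint).
    def make_arc(gamma, x0, v0, s0):
        # arc with curvature gamma, value v0 and slope s0 at x = x0
        return lambda x: v0 + s0 * (x - x0) + 0.5 * gamma * (x - x0) ** 2

    arc1 = make_arc(6 * 0.5, 0.0, 0.0, 0.0)        # starts like x**3 at x = 0
    s1 = 6 * 0.5 * 1.0                             # slope of arc1 at x = 1
    arc2 = make_arc(6 * 1.5, 1.0, arc1(1.0), s1)   # continue value and slope

    for x in (0.5, 1.0, 1.5, 2.0):
        approx = arc1(x) if x <= 1.0 else arc2(x)
        print("x=%3.1f   x**3=%6.3f   two-arc approximation=%6.3f" % (x, x ** 3, approx))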
6.3. The step function response and higher order responses


Let us see how we can utilise these constructions in the problem of solving
differential equations. In Chapter 5.5 the Green's function method was
conceived as an application of the superposition principle. Then it was
unnecessary to construct the adjoint equation. We obtain the solution of
the differential equation

    D\, G(x, \xi) = \delta(x, \xi)                                        (1)

(considering x as the active variable). Then, making use of the generation
of β(x) in terms of δ(x, ξ):

    \beta(x) = \int_a^b \delta(x, \xi)\, \beta(\xi)\, d\xi                (2)

we have obtained the solution of the differential equation

    D\, v(x) = \beta(x)                                                   (3)

(augmented by the proper boundary conditions to make the solution unique),
in the form

    v(x) = \int_a^b G(x, \xi)\, \beta(\xi)\, d\xi                         (4)

Exactly the same principle holds if it so happens that a μ times
differentiable function is generated in the form

    \beta(x) = \sum_{k=0}^{\mu-1} \frac{\beta^{(k)}(a)}{k!}\,(x - a)^k
             + \int_a^b \delta_\mu(x, \xi)\, \beta^{(\mu)}(\xi)\, d\xi    (5)

Then we construct the new Green's function by solving the equation

    D\, G_\mu(x, \xi) = \delta_\mu(x, \xi)                                (6)

and obtain

In the one-dimensional case (ordinary differential equations) we have the
further advantage that the base functions δ_μ(x, ξ) are in fact functions of
one variable only, since they depend solely on the difference x − ξ = t:

    \delta_\mu(x, \xi) = \delta_\mu(x - \xi)                              (8)
Of particular interest is the case μ = 1 which leads to Heaviside's method
of obtaining the input-output relation of electric networks. Here the
function δ₁(t) of Section 2 comes into operation, which is in fact Heaviside's
"unit step function". The associated Green's function is defined according
to (6) and now the formula (2.5) yields the following representation of the
solution of the given differential equation (3):

From this solution we can return to the standard solution in terms of the
Green's function G(x, ξ), if we integrate by parts with respect to β'(ξ):

This shows the following relation between Heaviside's Green's function and
the standard Green's function, which is the pulse response:

    G(x, \xi) = -\frac{\partial G_1(x, \xi)}{\partial \xi}

We observe, furthermore, the necessity of the boundary condition

    G_1(x, b) = 0

which comes about in consequence of the fact that the right side of (6)
vanishes throughout the given range if δ₁(x, ξ) moves out into the end
point ξ = b.
Historically the Green's function defined with the help of the unit step
function rather than the unit pulse played an important role in the theoretical
researches of electrical engineering, since Heaviside, the ingenious originator
of the Green's function in electrical engineering, used consistently the unit
step function as input, instead of the unit pulse (i.e. the delta function).
He thus established the formula (9) instead of the formula (4). In the
later years of his life Heaviside became aware of the theoretical superiority
of the pulse response compared with the unit step function response. How-
ever, the use of the unit step function became firmly established in
engineering, although in recent years the pulse response has gained more and
more ground in advanced engineering research.
From the practical standpoint the engineer's preference for the unit step
function is well understandable. It means that at a certain time moment
t = 0 the constant voltage 1 is applied to a certain network and the output
observed. To imitate the unit pulse (in the sense of the delta function)
with any degree of accuracy is physically much less realisable than to
produce the unit step function and observe its effect on the physical system.
The pulse response is a much more elusive and strictly speaking only
theoretically available quantity.
There are situations, involving the motion of mechanical components,
when even the step function response is experimentally unavailable because
even the step function as input function is too unsmooth for practical
operations. Let us consider for example a servo-mechanism installed on an
aeroplane which coordinates the motion of a foot-pedal in the pilot's cockpit
and the induced motion of the rudder at the rear of the aeroplane. The
servo-mechanism involves hydraulic, mechanical, and electrical parts. To
use the step function as input function would mean that the foot-pedal is
pushed out suddenly into its extreme position. This is physically impossible
since it would break the mechanism. Here we have to be satisfied with a
Green's function which is one step still further removed from the traditional
Green's function, by applying the linear input function (2.8). The foot-
pedal is pushed out with uniform speed into its extreme position and the
response observed. This function G₂(x, ξ) is the negative integral of the step
function response G₁(x, ξ), considering ξ as the variable.
Problem 256. Obtain the solution of (3) in terms of G₂(x, ξ) and G₃(x, ξ), making
use of the formulae (2.9) and (2.13).
[Answer:

Problem 257. Obtain the relation of G₂(x, ξ) and G₃(x, ξ) to the standard
Green's function G(x, ξ).
[Answer:

6.4. The input-output relation of a galvanometer


Up to now we have not specified the nature of the ordinary differential
operator, except for its linearity. It can involve derivatives of any order,
the unknown function can be a scalar or a vector, and the coefficients may
be any piecewise continuous functions of x. In electric network problems
the further simplification occurs that the coefficients of the operator D are
in fact constants. Furthermore, the given boundary conditions are all
initial conditions because the given physical situation is such that at the
time moment x = 0 all the functions involved have prescribed values.
Under these conditions the mathematical problem is greatly simplified.
The solution of the equation (3.6) is then reducible to the parameter-free
equation

    D\, G_\mu(t) = \delta_\mu(t)                                          (1)

because now the Green's function shares the property of the base function
δ_μ(x − ξ) of becoming a function of the single variable t = x − ξ only:

    G_\mu(x, \xi) = G_\mu(x - \xi)                                        (2)

The solution of the differential equation (3.3)—apart from a boundary
term—will now occur in the form

    v(x) = \int G_\mu(x - \xi)\, \beta^{(\mu)}(\xi)\, d\xi                (3)

As a characteristic example we will discuss in this section the problem
of the "galvanometer". The galvanometer is a recording instrument for
the measurement of an electric current. A light mirror is suspended with
the help of a very thin wire whose torsion provides the restoring force.
Damping is provided by the motion in air or by electromagnetic damping.
The differential equation of the galvanometer is identical with that of the
vibrating spring with energy loss due to friction:

    v'' + 2\alpha v' + p^2 v = \beta(x)                                   (4)

The constant α is called the "damping constant", while p is called the
"stiffness constant"; the quantity −p²v(x) is frequently referred to as the
"restoring force". In the case of the galvanometer the variable x represents
the time, β(x) is the input current and v(x) is the scale reading which may be
recorded in a photographic way.
We are here interested in the galvanometer as a recording instrument.
It represents a prototype of instruments which are used in many branches of
physics. For example the pendulum used for the recording of earthquakes
in seismographic research is a similar measuring device. Generally we
speak of a "device of the galvanometer type" if an input-output relation is
involved which is describable by a differential equation of the form (4).
In our present investigation we shall be particularly interested in the
galvanometer type of recording from the standpoint of comparing the output
v(x) with the input β(x) and analysing the "fidelity" of the recording.
First of all we shall obtain the "Green's function" of our problem. This
can be done according to the standard techniques discussed in Chapter 5.
In fact, we have obtained the solution before, in Chapter 5.16 (cf. Problem
207, equation 5.16.33):

    G(t) = \frac{1}{\nu}\, e^{-\alpha t} \sin \nu t, \qquad \nu = \sqrt{p^2 - \alpha^2}
    G(x, \xi) = G(x - \xi), \qquad G(x, \xi) = 0 \quad (\xi > x)          (5)

The second equation tells us at once that the upper limit of integration
will not be b but x, since the integrand vanishes for all ξ > x. The lower
limit of the integral is zero since we started our observations with the time
moment x = 0.
For the sake of formal simplification we will introduce a natural time
scale into the galvanometer problem by normalising the stiffness constant to
1. Furthermore, we will also normalise the output v(x)—the scale reading—
by introducing a proper amplitude factor. In order to continue with our
standard notations we will agree that the original x should be denoted by x̄,
the original v(x) by v̄(x̄). Then we put

    x = p\bar x, \qquad v(x) = p^2\, \bar v(\bar x)                       (6)

With this transformation the original differential equation (4) (in which
x is replaced by x̄ and v by v̄) appears now in the form

    v'' + 2\kappa v' + v = \beta(x), \qquad \kappa = \alpha/p             (7)
Our problem depends now on one parameter only, namely on the "damping
ratio" κ, for which we want to introduce an auxiliary angle γ, defined by

    \kappa = \cos \gamma                                                  (8)

This angle is limited to the range π/2 to 0 as α increases from 0 to p. If α
surpasses the critical value p, γ becomes purely imaginary, but our formulae
do not lose their validity, although the sines and cosines now change to
hyperbolic sines and cosines. The limiting value γ = 0 (κ = 1), which
marks the transition from the periodic to the aperiodic range, is usually
referred to as "critical damping".

If we write down the Green's function in the new variables, we obtain the
expression

    G(t) = \frac{e^{-t \cos\gamma}\, \sin(t \sin\gamma)}{\sin\gamma}      (9)

which yields the output in the form of the integral

    v(x) = \int_0^x G(x - \xi)\, \beta(\xi)\, d\xi                        (10)
Problem 258. Obtain the step function response G₁(t) of the galvanometer
problem.
[Answer:

Problem 259. Obtain the linear response G₂(t) and the parabolic response G₃(t)
of the galvanometer problem.
[Answer:

Problem 260. Obtain G(t) and G₁(t) for critical damping.
[Answer:

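The responses asked for in these problems can be cross-checked by direct
integration of the normalised equation (4.7). The following sketch is my own
illustration (crude Euler stepping, with a narrow rectangle of width ε
imitating the unit pulse), not the book's answers:

    import math

    def simulate(kappa, beta, T=20.0, n=200000):
        # Euler integration of v'' + 2*kappa*v' + v = beta(x), v(0) = v'(0) = 0
        h, v, s, x = T / n, 0.0, 0.0, 0.0
        traj = []
        for _ in range(n):
            a = beta(x) - 2.0 * kappa * s - v
            v, s, x = v + h * s, s + h * a, x + h
            traj.append((x, v))
        return traj

    kappa = 1.0 / math.sqrt(2.0)              # fidelity damping (Section 6.6)
    gamma = math.acos(kappa)                  # kappa = cos(gamma), cf. (8)

    def G(t):                                 # pulse response, formula (9)
        return math.exp(-t * math.cos(gamma)) * math.sin(t * math.sin(gamma)) / math.sin(gamma)

    eps = 1e-3                                # rectangle of area 1 imitating the pulse
    pulse = simulate(kappa, lambda x: 1.0 / eps if x < eps else 0.0)
    step = simulate(kappa, lambda x: 1.0)     # Heaviside unit step input

    for i in (20000, 60000, 180000):
        x, v = pulse[i]
        print("x=%5.2f  pulse response %8.5f   formula (9) gives %8.5f" % (x, v, G(x)))
    print("step response at x=%.0f: %.5f (tends to the input level 1)" % step[-1])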
6.5. The fidelity problem of the galvanometer response


We will now investigate to what extent we may hope to find a close
resemblance between the input function β(x) and the output v(x). At first
sight we can see no reason why there should be any resemblance between
the two functions since the formula (4.10) shows that v(x) is obtainable
from β(x) by the process of integration. Hence v(x) will not depend on the
local value of β(ξ) at the point ξ = x, but on all the values of β(ξ) between
0 and x. The galvanometer is an integrating device and what we get will be
a certain weighted average of all the values of β(ξ) between 0 and x. However,
this weight factor—which is in fact G(x, ξ)—has a strongly biased character.
Indeed, the weight factor

    G(t) = \frac{e^{-t \cos\gamma}\, \sin(t \sin\gamma)}{\sin\gamma}      (1)

is such that it will emphasise the region around t = 0 while the region of
very large values of t will be practically blotted out. The galvanometer
has a "memory" by retaining the earlier values β(x − t) before the
instantaneous value β(x), but this memory is of short duration if the damping
ratio is sufficiently large. For very small damping the memory will be so
extended that the focusing power on small values of t is lost. In that case
we cannot hope for any resemblance between v(x) and β(x).
Apart from these general, more qualitative results we do not obtain much
information from our solution (4.10), based on a Green's function which
was defined as the pulse response. We will now make our input function
less extreme by using Heaviside's unit step function as input function.
Then the response appeared—for the normalised form (4.7) of the differential
equation—in the form (4.11). If we plot this function, we obtain a graph
of the following character:

From this graph we learn that the output G₁(t), although it does not resemble
the input in the beginning, will eventually reproduce the input β(t) with a
gradually decreasing error. This shows that in our normalisation the
proportionality factor between v(x) and β(x) will be 1. The original v̄(x̄)
of the general galvanometer equation (4.4) had to be multiplied by p² in
order to give the new v(x). Hence in the original form of the equation the
proportionality factor of the output v̄(x̄) will become p², in order to compare
it with the input β(x̄).
Still more information can be obtained if we use as input function the
linear function β(t) = t of Problem 259 (cf. 4.12). Here we obtain a graph
of the following character:

From this graph we learn that the output—apart from the initial disturbance
—follows the input with a constant time lag of the amount

    2\kappa

If our input function is composed of straight line sections which follow
each other in intervals which are long compared with the time 1/p, i.e. the
reciprocal stiffness constant of the galvanometer, this galvanometer will
reproduce the curve with slight disturbances in the neighbourhood of the
points where the sections meet. The constant time lag, with which the
output follows the input, is in most cases not damaging and can be taken
for granted. We then have to know that it is not p²v̄(x̄) but p²v̄(x̄ + a)
which will correspond to β(x̄).
Problem 261. Show that for no value of the damping constant α can the
response curve (2) intersect the t-axis (apart from the point t = 0). Show also
that the approach to the line v = 1 is monotonous for all values α > p.
Problem 262. Show that for no value of the damping constant α can the response
curve (3) intersect the t-axis (apart from the origin t = 0).

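The amount of the lag can be verified by direct substitution into the
normalised equation (4.7)—a short added check, not in the original. With the
ramp input β(x) = x, try the delayed ramp v(x) = x − c:

    v'' + 2\kappa v' + v = 0 + 2\kappa + (x - c) = x
    \quad\Longrightarrow\quad c = 2\kappa

so the steady part of the output is the input delayed by exactly 2κ, in
agreement with the graph.
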
6.6. Fidelity damping


Even this information does not suffice to decide whether or not some
particular value of the damping constant is preferable to other values in
order to obtain maximum fidelity. We will now go one step further still
and generate the input function with the help of parabolic arcs. This
means that the input function (2.11) has to be used. The solution was
obtained in Problem 259 (4.13), in the form of an expression which started
with

plus further terms which go to zero with increasing t. We now see that
there is indeed a distinguished value of γ, namely the value

    \gamma = \frac{\pi}{4} \qquad \left(\kappa = \cos\gamma = \frac{1}{\sqrt 2}\right)

which will be of particular advantage from the standpoint of fidelity. With
this choice of the damping ratio we obtain for the damping constant α the
value

    \alpha = \frac{p}{\sqrt 2}                                            (3)

which is only 71% of the critical damping. Now we have fidelity for any
curve of arbitrary parabolic arcs, which follow each other in intervals large
compared with the time 1/p (since there is a quickly damped disturbance
at the intersection of the arcs, due to the excitation of the eigen-vibration
of the galvanometer). It is clear that under such conditions the fidelity of
the galvanometer recording can be greatly increased since the parabolic
arcs, which approximate a function, can be put much further apart than
mere straight line sections, which are too rigid for an effective approximation
of a smooth function. Hence we will denote the choice (3) of the damping
constant α as "fidelity damping".
Problem 263. Obtain the parabolic response G₃(t) of the galvanometer for
fidelity damping and demonstrate that at t = 0 function and derivatives vanish
up to (and inclusive of) the third derivative.
[Answer:

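The distinguished role of κ = 1/√2 can be confirmed by an elementary
substitution (an added check, not the book's own derivation). Feeding the
parabola β(x) = x²/2 into the normalised equation (4.7), one finds that

    v(x) = \tfrac{1}{2}(x - 2\kappa)^2 + (2\kappa^2 - 1)
    \quad\text{satisfies}\quad v'' + 2\kappa v' + v = \tfrac{1}{2} x^2

so that, apart from transients, the output is the delayed input plus the
constant error 2κ² − 1. This error vanishes precisely for κ = cos γ = 1/√2:
parabolic arcs are then reproduced faithfully, except for the time-lag.
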
6.7. The error of the galvanometer recording


It will be our aim to obtain a suitable error bound for the fidelity of
galvanometer recording. We will start with the case of fidelity damping
which assures the highest degree of fidelity recording. We have seen in
the foregoing section that an input signal which can be sectionally repre-
sented by any polynomial of second order will be faithfully reproduced—
except for a constant time-lag—if the sections in which the representations
hold are sufficiently separated, since it is inevitable that with a jump in the
second derivative the damped eigen-vibrations of the galvanometer will be
excited, thus causing a short-lived disturbance at the points where the
sections join.
In order to estimate the error of a galvanometer with fidelity damping,
we will now employ as input signal the function

    \beta(x) = (x + \sqrt 2)^3                                            (1)

The time-lag for fidelity damping is 2κ = √2 and thus the differential
equation

    v'' + \sqrt 2\, v' + v = (x + \sqrt 2)^3                              (2)

would be satisfied by v(x) = x³, if absolute fidelity could be expected. In
actual fact the solution of our equation becomes

    v(x) = x^3 + 2\sqrt 2                                                 (3)

and this can be interpreted as

    v(x) = \beta(x - \sqrt 2) + \frac{\sqrt 2}{3}\, \beta'''(x)           (4)
We can make our error estimation still more accurate by allowing a certain
time-lag at which the third derivative β'''(x) is to be taken. Our error
estimation can thus be made accurate for any polynomial of the order four.
Let us use as input function

Then the corresponding solution v(x) becomes

which can be interpreted as follows:

The estimated error for critical damping is considerably higher since it is
proportional to the second derivative of β(x). If we use as input function

    \beta(x) = (x + 2)^2                                                  (8)

and solve the differential equation

    v'' + 2v' + v = (x + 2)^2                                             (9)

we obtain

    v(x) = x^2 + 2                                                        (10)

which can be interpreted in the form

    v(x) = \beta(x - 2) + \beta''(x)                                      (11)
Let us now return to the original formulation (4.4) of the galvanometer
equation. Our results can now be summarised as follows:
Fidelity damping:

    p^2\, \bar v\!\left(\bar x + \frac{\sqrt 2}{p}\right)
        = \beta(\bar x) + \frac{\sqrt 2}{3 p^3}\, \beta'''(\bar x)        (12)

Critical damping:

    p^2\, \bar v\!\left(\bar x + \frac{2}{p}\right)
        = \beta(\bar x) + \frac{1}{p^2}\, \beta''(\bar x)                 (13)
Problem 264. Given the stiffness constants p = 5 and p = 10. Calculate the
relative errors of the galvanometer response (disregarding the initial disturbance
caused by the excitation of the eigen-vibration), for the input signal

for both fidelity and critical damping. Compare these values with those
predicted by the error formulae (12) and (13).
[Answer:

Problem 265. Apply the method of this section to the error estimation of the
general case in which the damping ratio κ = α/p is arbitrary (but not near to
the critical value
[Answer:

6.8. The input-output relation of linear communication devices


The galvanometer is the prototype of a group of much more elaborate
mechanisms which have certain characteristics in common. We can call
them "communication devices" since their function is to transmit certain
information from one place to another. We may think for example of a
broadcasting station which receives and transmits a speech given in acoustical
signals and transformed into electric signals. These electric signals are
radiated out into space and are received again by a listener's radio set
which transforms the electric signals once more into acoustic signals. The
microphone-receiver system of an ordinary telephone is another example of
a device of the galvanometer type, the "input" being the speech which
brings the microphone into vibration, the "output" being the air vibration
generated by the membrane of the receiver and communicated to the
listener.
Other situations may differ in appearance but belong mathematically to
the same group of problems. Consider for example the movements of an
aeroplane pilot who operates certain mechanisms on the instrument-panel
of his cockpit. These movements are transmitted with the help of "servo-
mechanisms" into a corresponding motion of the aileron, rudder, etc.
The whole field of "servo-mechanism" can thus be considered as a special
example of a communication device.
The mechanisms involved may be of an arbitrarily complicated type.
They may involve mechanical or hydraulic or electric components. They
may consist of an arbitrary number of coupled electric networks. They
may contain a loudspeaker whose action is described by a partial differential
equation and which is thus replaceable by an infinity of coupled vibrating
springs. In order to indicate that we are confronted with an unknown
mechanism whose structure is left unspecified, we speak of a "black box"
which contains some mechanism whose nature is not revealed to us. How-
ever, there are two ends to this black box: at the one end some signal goes
in as a function of the time t which in our notation will be called x. At the
other end some new function of the time comes out which we will call the
"output". We will adhere to our previous notations and call the input
signal β(x), the output response v(x). In the case of the galvanometer
β(x) had the significance of the electric current which entered the galvano-
meter, v(x) that of the galvanometer reading. The coupling between these two
functions was established by the differential equation of the damped vibrating
spring. In the general case we still have our two functions β(x) and v(x) as input and output but
their significance may be totally different and the coupling between the two
functions need not necessarily be established by a differential equation but
can be of a much more general character. Generally we cannot assume that
there is necessarily a close resemblance between β(x) and v(x). We may
have the mathematical problem of restoring the input function β(x) if v(x)
is given and this may lead to the solution of a certain integral equation.
But frequently our problem is to investigate to what extent we can improve
the resemblance of the output v(x) to the input β(x) by putting the proper
mechanism inside the black box.

a) Linear devices. Although we have no direct knowledge of the
happenings inside the black box, we can establish certain general features
of this mechanism by experimenting with various types of input functions
β(x) and observing the corresponding outputs v(x). We will assume that
by a number of preliminary experiments we establish the following two
fundamental properties of the input-output relation:
1. Let us use an arbitrary input function β(x) and observe the corre-
sponding output v(x). We now change β(x) to β₁(x) = αβ(x), where α is an
arbitrary constant. Then we observe that the new output v₁(x) becomes
v₁(x) = αv(x).
2. Let β₁(x) and β₂(x) be two arbitrary input functions and v₁(x), v₂(x) be
the corresponding outputs. We now use β₁(x) + β₂(x) as an input. Then
we observe that the output becomes v₁(x) + v₂(x); this means that the super-
position principle holds. The simultaneous application of two inputs generates
the sum of the corresponding outputs, without any mutual interference.
If our communication device satisfies these two fundamental conditions,
we will say that our device is of the linear type. As a consequence of the
superposition principle and the proportionality principle we can state that
the output which corresponds to the input function

    \beta(x) = \alpha_1 \beta_1(x) + \alpha_2 \beta_2(x) + \cdots + \alpha_n \beta_n(x)    (1)

becomes

    v(x) = \alpha_1 v_1(x) + \alpha_2 v_2(x) + \cdots + \alpha_n v_n(x)   (2)

where v₁(x), v₂(x), ..., vₙ(x) are the outputs which correspond to β₁(x),
β₂(x), ..., βₙ(x) as inputs.
Now we have seen before (cf. equations (5.4.13-15)) that an arbitrary
continuous function f(x) can be considered as a linear superposition of delta
functions. If now we obtain the response of the black box mechanism to
the delta function as input function, we can obtain the response to an
arbitrary input function β(x), in a similar manner as the solution of a linear
differential equation was obtained with the help of the Green's function
G(x, ξ):

    v(x) = \int_0^b G(x, \xi)\, \beta(\xi)\, d\xi                         (3)

We will assume that the lower limit of our variable x is x = 0 because
we want to measure the time from the time moment x = 0, the start of
our observations. The upper limit ξ = b of our interval can be chosen
arbitrarily but we can immediately add a further property of the function
G(x, ξ). As a consequence of the causality principle, the output has to
follow the input. Hence G(x, ξ) must vanish at any value of x which comes
before the time of applying the unit pulse, that is before x = ξ:

    G(x, \xi) = 0 \qquad (x < \xi)                                        (4)

We can equally say—considering ξ as the variable—that G(x, ξ) vanishes
for all values ξ > x. Hence the integration in (3) does not extend to
ξ = b but only to ξ = x:

    v(x) = \int_0^x G(x, \xi)\, \beta(\xi)\, d\xi                         (5)
b) Time-independent mechanisms. We will now add a further property of
our communication device. We assume that the unknown components
inside the black box do not change in time. They have physical character-
istics which are time independent. If this is the case, it cannot make any
difference at what time moment ξ the unit pulse is applied; the output v(x)
will always be the same, except for a shift in the time scale. This additional
property of the communication device can be mathematically formulated
as follows. If β(x) changes to β(x + a), v(x) changes to v(x + a). Now
the unit pulse applied at the time moment ξ is equal to the unit pulse
applied at the time moment 0, with a shift of the time scale:

    \delta(x, \xi) = \delta(x - \xi, 0)                                   (6)

Hence

    G(x, \xi) = G(x - \xi, 0)                                             (7)

It is thus sufficient to observe the output G(x) which follows the unit pulse
input, applied at the time moment x = 0. This function is different from
zero for positive values of x only, while for all negative values G(x) is zero:

    G(x) = 0 \qquad (x < 0)                                               (8)

The function of two variables G(x, ξ) is thus reducible to a function of a
single variable G(x) only and the relation (5) now becomes:

    v(x) = \int_0^x G(x - \xi)\, \beta(\xi)\, d\xi                        (9)

We will once more introduce the variable

    t = x - \xi                                                           (10)

as we have done before in the theory of the galvanometer (cf. (4.9-10)).
We replace ξ by the new variable t, according to

    \xi = x - t                                                           (11)

and write the input-output relation (9) in the form

    v(x) = \int_0^x G(t)\, \beta(x - t)\, dt                              (12)
We see that the general theory of an arbitrary linear communication
device is remarkably close to the theory of the galvanometer. The only
difference is that in the case of the galvanometer the function G(t) had a
very definite form, given by (4.9). In the general case the function G(t)
will be determined by the more or less complex mechanism which is inside
the black box.
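
In discrete form the relation (12) is a single summation. The sketch below is
an added illustration (the pulse response e^{−t} and the delayed unit step
input are arbitrary choices of mine):

    import math

    h, N = 0.01, 2000                        # grid spacing and number of points
    G = [math.exp(-k * h) for k in range(N)]                  # assumed pulse response e^(-t)
    beta = [1.0 if k * h >= 1.0 else 0.0 for k in range(N)]   # unit step applied at x = 1

    def output(i):
        # discrete form of (12): v(x_i) = sum over t_k of G(t_k) * beta(x_i - t_k) * h
        return h * sum(G[k] * beta[i - k] for k in range(i + 1))

    for i in (50, 150, 500, 1500):
        print("x=%5.2f   v=%8.5f" % (i * h, output(i)))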
c) The dissipation of energy. A mechanism of the following type may be
considered. At the input end a switch is pulled. At the output end an electric
bulb is lighted. If we are not familiar with the content of the black box,
we would think that a perpetuum mobile kind of device had been invented.
In actual fact the box contains an "internal generator" which constantly
puts out energy. We want to assume that our device does not contain an
internal generator. The input energy will be gradually dissipated by
transforming it into heat. Hence the response to the unit pulse will be such
that it will not last forever but disappear eventually. It is possible that
strictly speaking G(t) will never become exactly zero but approach zero
asymptotically, as t grows to infinity:

    \lim_{t \to \infty} G(t) = 0                                          (13)
This is the behaviour we have observed in the case of the galvanometer
(cf. (4.9)), and the same will be true if the black box mechanism is composed
of an arbitrary number of electric circuits or vibrating springs. The name
"passive network" is applied to this kind of mechanism.
We see that we have gradually reduced the arbitrariness of the black box
by the following restricting conditions which, however, are valid in most
communication problems:
1. The device is linear.
2. The output follows the input.
3. The components of the device do not change with the time.
4. The device dissipates energy.

6.9. Frequency analysis


An entirely different kind of analysis is frequently of great practical and
theoretical interest. We know from the theory of the Fourier series that a
very large class of functions, defined in a finite or even infinite interval,
can be resolved into purely periodic components. Instead of considering
β(x) as a superposition of pulses, as we have done before, we can equally
consider β(x) as a superposition of periodic functions of the form
cos ωx and sin ωx. If we know how our communication device responds to
the input function cos ωx or sin ωx, we shall be able to generate the entire
output as a superposition of these responses. In this analysis the basic
building block for the generation of a function is not the pulse, shifted from
point to point, but the periodic functions cos ωx and sin ωx, with arbitrary
values of ω.
For mathematical purposes it will be convenient to combine cos ωx and
sin ωx in the complex form

    e^{i\omega x} = \cos \omega x + i \sin \omega x                       (1)

If we know what the response of our device is to the complex input function
(1), we shall immediately have the response to both cos ωx and sin ωx as
input functions, by merely separating the real and imaginary parts of the
response.
Before carrying out the computation we will make a small change in the
formula (8.12). The limits of integration were 0 and x. The lower limit
came about in view of G(t) being zero for all negative values of t. The
upper limit x came about since we have started our observation at the time
moment x = 0. Now the input signal β(x) did not exist before the time
moment x = 0, which means that we can define β(x) as zero for all negative
values of x:

    \beta(x) = 0 \qquad (x < 0)                                           (2)

But then we need not stop with the integration at t = x but can continue
up to t = ∞:

    v(x) = \int_0^\infty G(t)\, \beta(x - t)\, dt                         (3)

The condition (2) automatically reduces the integral (3) to the previous
form (8.12). The new form has the advantage that we can now drop the
condition (2) and assume that the input β(x) started at an arbitrary time
moment before or after the time moment x = 0.
This will be important for our present purposes because we shall assume
that the periodic function (1) existed already for a very long—mathematically
infinitely long—time. This will not lead to any difficulties in view of the
practically finite memory time of our device.
We introduce now

    \beta(x) = e^{i\omega x}                                              (4)

into our integral (3) and obtain

    v(x) = e^{i\omega x} \int_0^\infty G(t)\, e^{-i\omega t}\, dt         (5)

The factor of e^{iωx} can be split into a real and an imaginary part:

    F(\omega) = \int_0^\infty G(t)\, e^{-i\omega t}\, dt = A(\omega) - iB(\omega)    (6)

with

    A(\omega) = \int_0^\infty G(t) \cos \omega t\, dt                     (7)

    B(\omega) = \int_0^\infty G(t) \sin \omega t\, dt                     (8)

Moreover, we may write the complex number (6) in "polar form", with the
"amplitude" ρ(ω) and the "argument" θ(ω):

    F(\omega) = \rho(\omega)\, e^{-i\theta(\omega)}                       (9)

Then the output (5) becomes

    v(x) = \rho(\omega)\, e^{i(\omega x - \theta(\omega))}                (10)
The significance of this relation can be formulated as follows. If we use
a harmonic vibration cos ωx or sin ωx as input function, the output will again
be a harmonic vibration of the same frequency but modified amplitude and phase.

The fact that a harmonic vibration remains a harmonic vibration with
unchanged frequency is characteristic for the linearity of a communication
device with time-independent elements. We can in fact test our device for
linearity by using a sine or cosine function of arbitrary frequency as input
and observing the output. The necessary and sufficient condition for the
linearity of the device is that a harmonic vibration remains a harmonic
vibration with unchanged frequency, although with modified amplitude and
a certain shift of the phase.
It may happen that in a large range of ω the amplitude factor ρ(ω) comes
out as practically independent of ω. Moreover, the phase-shift θ(ω) may
come out as simply proportional to ω:

    \rho(\omega) = \rho_0, \qquad \theta(\omega) = a\omega                (11)

In this case we can write for (10):

    v(x) = \rho_0\, e^{i\omega(x - a)}                                    (12)

Now the output follows the input with the time-lag a but reproduces the
input with the proportionality factor ρ₀. If these conditions hold for a
sufficiently large range of the frequency ω, the output will represent a high-
fidelity reproduction of the input, apart from the constant time-lag a which
for many purposes is not damaging.
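
For the galvanometer these quantities can be written down explicitly:
substituting β(x) = e^{iωx} into the normalised equation (4.7) and looking for
the steady state v = F(ω)e^{iωx} gives F(ω) = 1/(1 − ω² + 2iκω). The snippet
below is an added check of mine, tabulating the amplitude ρ(ω) and the phase
lag θ(ω) for fidelity and critical damping:

    import cmath, math

    def F(omega, kappa):
        # steady state factor of the normalised galvanometer equation (4.7)
        return 1.0 / complex(1.0 - omega ** 2, 2.0 * kappa * omega)

    for kappa, label in ((1.0 / math.sqrt(2.0), "fidelity"), (1.0, "critical")):
        for omega in (0.1, 0.25, 0.5):
            rho, arg = cmath.polar(F(omega, kappa))
            # arg is the argument of F, i.e. minus the phase lag theta(omega) of (9)
            print("%-8s  omega=%4.2f  rho=%7.5f  theta=%7.5f" % (label, omega, rho, -arg))

The output shows how much flatter ρ(ω) stays near 1 for κ = 1/√2 than for
critical damping, in line with the fidelity discussion of Section 6.6.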

6.10. The Laplace transform


The complex quantity (9.9) which characterises the amplitude factor and
phase-shift of the response is called the "transfer function". It is in a
definite relation to the pulse-response G(t) of our device and uniquely
determined by that function. The relation between the two functions,
viz. the pulse response and the transfer function, is intimately connected
with a fundamental functional transformation, called the "Laplace trans-
form", encountered earlier in Chapter 1.15. Let the function G(t) be given
in the interval [0, ∞]. We introduce a new function of the new variable
p by putting

    L(p) = \int_0^\infty G(t)\, e^{-pt}\, dt                              (1)

This function L(p) has generally no resemblance to the original function
G(t). In fact it is a much more regular function of p than G(t) was of t.
The function G(t) need not be analytical at all. It can be prescribed freely
as a generally continuous and absolutely integrable function, with a finite
number of discontinuities and a finite number of maxima and minima in any
finite interval. The function L(p), on the other hand, is an analytical
function of the variable p, not only for real values of p but for arbitrary
complex values of p, as long as they lie in the right half of the complex plane
(i.e. the real part of p is positive). The function L(p) is thus an eminently
regular and lawful function and we have no right to prescribe it freely, even
in an arbitrarily small interval.
If now we consider the integral (9.6) which defines the transfer function

    F(\omega) = \int_0^\infty G(t)\, e^{-i\omega t}\, dt                  (2)

we see that by the definition of the Laplace transform we obtain the funda-
mental relation

    F(\omega) = L(i\omega)                                                (3)

The transfer function F(ω) which determines the frequency response of our
device is thus obtained as the Laplace transform of the pulse response, taken
along the imaginary axis.
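
As an added worked instance (it bears on Problem 266 below; the integration
is elementary), the normalised pulse response (4.9), with κ = cos γ,
transforms into

    L(p) = \int_0^\infty \frac{e^{-\kappa t} \sin\big(t\sqrt{1-\kappa^2}\big)}
                              {\sqrt{1-\kappa^2}}\, e^{-pt}\, dt
         = \frac{1}{(p + \kappa)^2 + 1 - \kappa^2}
         = \frac{1}{p^2 + 2\kappa p + 1}

whose poles p = −κ ± i√(1 − κ²) lie in the left half plane for every κ > 0;
accordingly F(ω) = L(iω) = 1/(1 − ω² + 2iκω), reproducing the steady state
factor found in the preceding section.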
Problem 266. Consider the pulse response of a galvanometer given by (4.9).
Obtain the Laplace transform of this function and show that the singularity of
L(p) lies in the negative half plane, no matter what the value of the damping
ratio κ = cos γ (between 0 and ∞) may be.
Problem 267. Obtain the transfer function (3) and study it from the standpoint
of the fidelity problem. Explain the special role of the value κ = 1/√2 (fidelity
damping).
[Answer:

The amplitude response for this critical value becomes (1 + ω⁴)^{−1/2}, the dis-
tortion being of fourth instead of second order in ω.
Problem 268. Show that in the case of fidelity damping the maximum error
of the galvanometer response for any ω between 0 and 1/(2√2) does not surpass
2% and that the error prediction on the basis of (7.12) in this frequency range
holds with an accuracy of over 97.5%.
Problem 269. Show that from the standpoint of smallest phase distortion the
value κ = √3/2 (γ = π/6 = 30°) represents the most advantageous damping
ratio.
Problem 270. Assuming that ω varies between 0 and 1/(4κ) (κ > ½), show that
the maximum phase distortion of the galvanometer in that frequency range
does not surpass the value

(excluding κ values which are near to the critical √3/2).

6.11. The memory time


We have mentioned before that our communication device absorbs the
input energy, without creating energy of its own. Hence the response G(t)
to the input pulse will go asymptotically to zero as t goes to infinity. In
actual fact it would be mere luxury to integrate out to infinity. Although
G(t) will disappear exactly in the theoretical sense only at t = ∞, it will
be zero practically much sooner. No physical device can be taken with
absolute mathematical exactitude since there are always disturbing accidental
circumstances, called "noise", which interfere with the operation of the
exact mathematical law by superimposing on our results some additional
random phenomena which do not belong to the phenomenon we want to
investigate. Hence we cannot aim at absolute accuracy. If we observe
the function G(t) which eventually becomes arbitrarily small but reaches
zero strictly speaking at t = ∞, we come after a finite (and in practice
usually very short) time T into a region which is of the same order of
magnitude as the superimposed noise. We can then discard everything
beyond t = T as of no physical significance.
We see that under these circumstances a passive communication device
possesses a definite "memory time" T which puts an upper limit to our
integration. Instead of integrating out to infinity, it suffices to integrate
only up to T:

    v(x) = \int_0^T G(t)\, \beta(x - t)\, dt                              (1)
Correspondingly the transfer function F(ω) will now become

    F_T(\omega) = \int_0^T G(t)\, e^{-i\omega t}\, dt                     (2)

The error thus induced in F(ω) can be estimated by the well-known integral
theorem

    \left| \int f(t)\, dt \right| \le \int |f(t)|\, dt                    (3)

which in our case gives

    |F(\omega) - F_T(\omega)| \le \eta                                    (4)

where

    \eta = \int_T^\infty |G(t)|\, dt                                      (5)
Although the "memory time" of an instrument is not an exact mathematical


concept, to every given accuracy a definite memory time can be assigned.
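
For critical damping the normalised pulse response is G(t) = t e^{−t}, whose
total area is 1 and whose tail area beyond T is (T + 1)e^{−T}; the 3%
requirement of the problem which follows can therefore be solved numerically
(an added sketch of mine):

    import math

    def tail(T):
        # area of G(t) = t*exp(-t) beyond T; the total area (T = 0) is 1
        return (T + 1.0) * math.exp(-T)

    lo, hi = 0.0, 20.0
    while hi - lo > 1e-10:               # bisection on the monotone tail
        mid = 0.5 * (lo + hi)
        if tail(mid) > 0.03:
            lo = mid
        else:
            hi = mid
    print("memory time T = %.2f" % lo)   # prints T = 5.36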

Problem 271. Find the memory time of a galvanometer with critical damping
if we desire that the error limit (11.5) shall be 3% of the total area under G(t):

[Answer: T = 5.36.]
Problem 272. Consider the Laplace transform L(p) of the function

Consider on the other hand the same transform L₁(p) but integrating only up to
t = 4. Show that the analytical behaviour of L₁(p) is very different from that
of L(p) by being free of singularities in the entire complex plane, while L(p) has
a pole at the points p = −1 ± i. And yet, in the entire right half plane,
including the imaginary axis p = iω, the relative error of L₁(p) does not exceed
2%.

6.12. Steady state analysis of music and speech


The fact that every kind of input function β(x) can be analysed in terms
of pure sine and cosine vibrations leads sometimes to wrong interpretations.
It is an experimental fact that if a sustained vibration of the form

    \sin(\omega x - \theta)

is presented to the human ear, the ear is not sensitive to the presence of
the phase angle θ. It cannot differentiate between the input function
sin ωx and the input function sin(ωx − θ). From this fact the inference
is drawn that the phase shift induced by the complex transfer function F(ω)
is altogether of no importance. Since it makes no difference whether v(x) is
presented to the ear or that superposition of sin ωx and cos ωx functions
which in their sum are equivalent to v(x), it seems that we can completely
discard the investigation of the phase-shift. The superposition principle
holds, the ear does not respond to the phase of any of the components,
hence it is altogether immaterial what phase angles are present in the output
v(x). According to this reasoning the two functions

and

are equivalent as far as our hearing goes. The ear will receive the same
impression whether v(x) or v₁(x) is presented to it.
This reasoning is in actual fact erroneous. The experimental fact that
our perception is insensitive to the phase angle θ holds only if sustained
musical sounds are involved. It is true that any combination of sine or
cosine functions which are perceived as a continuous musical sound can be
altered freely by adding arbitrary phase angles to every one of the com-
ponents. But we have "noise" phenomena which are not of a periodic
kind and which are not received by the ear as musical sounds. Such
"noise" sequences can still be resolved into a superposition of steady state
sine and cosine functions which have no beginning and no end. But the
ear perceives these noises as noises and the steady state analysis is no
longer adequate, although it is a mathematical possibility. Now it is no
longer true that the periodic components into which the noise has been
resolved can be altered by arbitrary phase shifts. Such an alteration would
profoundly influence the noise and the perception of the noise. For noise
phenomena which represent a succession of transients and cannot be
perceived as a superposition of musical notes, the phase angle becomes of
supreme importance. A phase shift which is not merely proportional to ω
(giving a mere time-lag) will have no appreciable effect on the strictly
"musical" portions of recorded music and speech but a very strong effect
on the "transient" portions of the recording.
Even in pure music the transient phenomena are by no means of
subordinate importance. The sustained tone of a violin has a beginning
and an end. The ear receives the impression of a sustained tone after the
first cycles of tone generation are over but the transition from one tone to
the other represents a transient phenomenon which has to be considered
separately. The same holds for any other instrument. It is the experience
of many musically trained persons that they recognise the tone of a certain
type of instrument much more by the transitions from tone to tone than by
the sustained notes. The older acoustical investigations devoted the focus
of attention almost completely to the distribution of "overtones" in a
sustained note. Our ear perceives a musical sound under the aspects of
"loudness", "pitch", and "tone quality". The loudness or intensity of
the tone is determined by the amplitude of the air pressure vibrations which
create in the ear the impression of a tone. The "pitch" of the tone is
determined by the frequency of these vibrations. The "tone quality", i.e.
the more or less pleasant or harmonious impression we get of a musical sound,
is determined by the distribution of overtones which are present on account
of the tone-producing mechanism. In strictly periodic tone excitement the
overtones are in the frequency ratios 1:2:3:..., in view of the Fourier
analysis to which a periodic function of time can be submitted. The
presence of high overtones can give the sound a harsh and unpleasant
quality. We must not forget, however, that generally the amplitude of the
high overtones decreases rapidly and that in addition the sensitivity of our
ear to high frequencies decreases rapidly. Hence it is improbable that in
the higher frequency range anything beyond the third or fourth overtone
is of actual musical significance. In the low notes we may perceive the
influence of overtones up to the order 6 or 7. It would be a mistake to
believe, however, that we actually perceive the overtones as separate tones
since then the impression of the tone would be one of a chord rather than
that of a single tone. It is the weakness of the overtones which prevents
the ear from hearing a chord but their existence is nevertheless perceived
and recorded as "tone quality".

In certain instruments which do not produce a sustained note, the value
of an overtone analysis becomes doubtful since we do not have that
periodicity which is a prerequisite for the application of the Fourier series
to a physical phenomenon. The musical tone generated is here more a
succession of transients than a steady state phenomenon. This is the case
even with the piano tone which is the result of the sudden striking of a
hammer and consists of damped vibrations. This is the reason that the
piano tone does not blend well with the tone of string instruments; the
hammer strikes the strings only once and produces a quickly damped
vibration instead of a continuous tone. The damping is particularly
strongly noticeable in the upper octaves. The transient is here more
predominant than the steady state part of the tone, in fact one can hardly
find a natural basis for a steady state analysis. Similar conditions hold
for military percussion instruments and some jazz instruments.
Now the claim is often made in technical literature that the piano tone
possesses "very high overtones". The tinny and flat sounds encountered
in the piano recordings of older vintage have been explained on the basis
that the amplitude response of the older recording instruments was un-
satisfactory and failed to reproduce the very high overtones which accompany
the tone generated by the piano. In actual fact these claims cannot be
substantiated. Our ear is practically deaf to any overtones beyond about
8000 cycles per second. If the piano tone did possess such high overtones,
they would still not be perceptible to our ear. The decisive factor is not
the amplitude response but the phase response to which usually very little
attention is paid. It was the strong phase distortion in the older recording
instruments which interfered with the proper reproduction of transient
phenomena, and this caused instruments of the piano type and all the other
instruments with strong transient components to be poorly represented.
Similar phenomena are encountered in speech recording. Here the vowels
represent the "sustained musical notes" while the consonants represent the
noise part or transient part of speech. The construction of the larynx and
the shape of the other organs which participate in sound production is such
that the vowel sounds fall in a band of frequencies which vary between 300
and 3000 cycles per second, the most important range being that between
1000 and 2500 cycles. Our ordinary telephone is one of the most ingenious
and most important communication devices. The ordinary telephone
transmits frequencies between about 250 and 2750 cycles per second and is
thus well qualified for the transmission of the steady part of our speech.
At the same time the telephone is a poor device if it comes to the trans-
mission of consonants as we can demonstrate by trying to telephone in a
foreign language in which we are not perfectly at home. The good audibility
of speech through the telephone is achieved through the method of
"inferential extrapolation". We do not hear in the telephone what we
believe we hear but we make up for the deficient information by recon-
structing the distorted speech, because of familiarity with the language.
If this familiarity is lacking, we are unable to use the method of inferential
342 COMMUNICATION PROBLEMS CHAP. 6

extrapolation and we are at a loss to understand the received communication.
Consonants such as b, p, v lose their identity if spoken into the telephone,
and thus in the spelling of unfamiliar words we have to take refuge in the
method of "associative identification" by saying: "B for Billy, P for Peter,
V for Vincent."
The steady state parts of the speech, that is the vowels, do not suffer
much in the process of telephone transmission and are easily recognisable.
The reason is that these vowels are essentially composed of low-frequency
trains of waves. Moreover, the phase distortion, which is a vital handicap
in the transmission of noise, has practically no influence on the perception
by the ear, as long as we stay within the boundaries of steady state analysis.
6.13. Transient analysis of noise phenomena
The resolution of signals into harmonic components is such a vital part
of electrical engineering that it becomes second nature to the engineer. For
him the signal f(t) does not exist as f(t) but as F(ω), that is, as its harmonic
resolution into the components e^{iωt}, acted upon by the communication
device. The communication device applies its own weight factor to each
one of the harmonic components and after this weighting the component
parts are put together again. Hence the input f(t) is resolved into its
harmonic components by Fourier analysis; the output v(x), on the other
hand, is assembled again out of its constituent parts, with the help of
Fourier synthesis.
In this procedure the weighting of the amplitudes is traditionally considered
as of vital importance. The amplitude response is technically much more easily
made the subject of exact measurements than the phase response. The latter is
frequently regarded as of negligible importance, in view of the fact that neither
the perception of the sustained notes in music, nor the perception of the
sustained elements of speech (i.e. primarily the vowel sounds), is sensitive
to the presence or absence of arbitrary phase shifts. Our ear does not
recognise the phase of a musical note and hence the question of phase
fidelity is immaterial.

This picture changes, however, very considerably if we now come to the
investigation of the second problem, viz. the reproduction of transient
phenomena. A consonant such as p, q, r, and so on, presents itself primarily
as a certain function of time which is of short duration and non-repetitive.
We can speak of a certain "noise profile" by imagining that the air pressure
generated by a consonant—or likewise by the onset of a musical tone—is
plotted as a graph which characterises that particular element of speech or
music. This graph can be of a rather irregular shape and the question arises
how to obtain a faithful reproduction of it.
That this reproduction cannot be perfect, is clear from the outset.
Perfect reproduction would mean that a delta function as input is recorded
as a delta function as output, except for a certain constant time-lag. This
is obviously impossible, because of the inertia of the mechanical components
of the instrument. In fact we know that even the much more regular unit
step function cannot be faithfully reproduced since no physical instrument
can make a sudden jump. Nor is such an absolutely faithful reproduction
necessary if we realise that our ear itself is not a perfect recording instrument.
No recording instrument can respond to a definite local value of the input
signal without a certain averaging over the neighbouring values. Our ear
is likewise a recording instrument of the integrating type and thus unable to
perceive an arbitrarily rugged f(t). A certain amount of smoothing must
characterise our acoustical perception, and it is this smoothing quality of
the ear on which we can bank if we want to set up some reasonable standards
in the fidelity analysis of noise.
We recognise from the beginning that the reproduction of a given rather
complicated noise profile of short duration will be a much more exacting
task than the recording of musical sounds of relatively low frequencies.
But the question is how far have we to go in order to reproduce noise with
a fidelity which is in harmony with the high, but nevertheless not arbitrarily
high, capabilities of the human ear.
This seems to a certain degree a physiological question, but we can make
some plausible assumptions towards its solution. We will not go far wrong
if we assume that the ear smooths out the very rugged peaks of an input
function by averaging. This averaging is in all probability very similar
to that method of "local smoothing" that we have studied in the theory
of the Fourier series and that we could employ so effectively towards an
increased convergence of the Fourier series by cutting down the contribution
of the terms of very high frequency. The method of local averaging attaches
to the vibration e^{iωt} the weight factor

    \frac{\sin(\omega \tau / 2)}{\omega \tau / 2}

(cf. 2.13.8), if we denote by τ the smoothing time. Our insensibility to
higher overtones may easily be due to this smoothing operation of our ear.
Since we do not perceive frequencies beyond 10,000 per second, we shall
probably not go far wrong if we put the smoothing time of the human ear
in the neighbourhood of 1/10,000 = 10⁻⁴ second. If we use the galvano-
meter response (4.9) as an operating model, we see from the formula (7.16)
that the output does not give simply the local value β(x − 2κ), but there is
a correction term added which is of the same nature as that found in the
344 COMMUNICATION PROBLEMS CHAP. 6

theory of the Fourier series, caused by local smoothing (cf. 2.13.6). The
smoothing time r can thus be established by the relation

or, returning to the general time scale of the galvanometer equation (4.4):

With advancing age the smoothing time increases, partly due to an increase
in the damping constant a, and partly due to a decrease in the stiffness
constant p, considering the gradual relaxation of the elastic properties of
living tissues. This explains the reduction of the frequency range to which
the ear is sensitive with advancing age.
We often treat the phenomenon of "hard of hearing" as a decline of
the hearing nerve in perceiving a certain sound intensity. The purpose of the
hearing aid is to amplify the incoming sound, thus counteracting the
reduction of sound intensity caused by the weakening of the hearing nerve.
A second factor is often omitted, although it is of no smaller importance.
This is the reduced resolution power of the ear in perceiving a given noise
profile (1). As the smoothing time r increases, the fine kinks of the noise
profile (1) become more and more blurred. Hence it becomes increasingly
more difficult to identify certain consonants, with their characteristic noise
profiles. If in a large auditorium the lecturer gets the admonition "louder"
from the back benches, he will not only raise his voice but he will instinctively
talk slower and more distinctly. This makes it possible to recognise certain
noise patterns which would otherwise be lost, due to smoothing. If we
listen to a lecture delivered in a language with which we are not very
familiar, we try to sit in the front rows. This has not merely the effect of
higher tone intensity. It also contributes to the better resolution of noise
patterns which in the back seats are blurred on account of the acoustical
echo-effects of the auditorium. (Cockney English, however, cannot be
resolved by this method.)
From this discussion certain consequences can be drawn concerning the
fidelity analysis of noise patterns. In the literature of high fidelity sound
reproducing instruments we encounter occasional claims of astonishing
magnitude. One reads for example that the "flat top amplitude character-
istics" (i.e. lack of any appreciable amplitude distortion) has been extended
to 100,000 cycles per second. If this fact is technically possible, the question
can be raised whether it is of any practical necessity (the distance travelled
by the needle in the playing of a 12-in. record during the time of
1/100,000 = 10~5 second is not more than 0.005 mm). Taking into account
the limited resolution power of the human ear, what are the actual fidelity
requirements in the reproduction of noise patterns? Considering the
inevitable smoothing operation of the ear, it is certainly unnecessary to try
to reproduce all the kinks and irregularities of the given noise profile since
SEC. 6.13 TRANSIENT ANALYSIS OF NOISE PHENOMENA 345
the small details are obliterated anyway. In this situation we can to
some extent relax the exact mathematical conditions usually observed in the
generation of functions. In Section 2 we have recognised the unit step
function as a fundamental building block in the generation of f(x). The
formula (2.5) obtained f(x) by an integration over the base function
&l(x — £). This integral could be conceived as the limit of the sum (2.6),
reducing the Axt to smaller and smaller values. In the presence of smoothing,

however, we can leave this sum as a sum, without going to a limit. We


need not diminish the Ax{ below the smoothing time r since smaller details
will be blotted out by the ear. Under these circumstances we can generate
the function f(x) as a succession of "smoothed" unit step functions which
do not jump up from zero to 1 but have a linear section between. The
following figure demonstrates how such a function—except for the usual
time-lag—can be faithfully reproduced by a galvanometer of critical damping.

We have used 12 units for the horizontal part of the curve, in order to reduce
the disturbance at the beginning and the end of the linear section to
practically negligible amounts. Hence we have solved the problem of
faithfully reproducing a noise pattern which is composed of straight line
sections of the duration r. Since the units used were normalised time units,
the return to the general galvanometer equation (4.4) establishes a stiffness
constant p which is in the following relation to the smoothing time r:
346 COMMUNICATION PROBLEMS CHAP. 6

Now the transfer function (10.4) of a galvanometer demonstrates that we


get a fairly undistorted amplitude and phase response—the amplitude
remaining constant, the phase increasing linearly with co—if we stay with
(a within the limits 0 and \ (for K near unity) and that means in the general
time scale the range

If instead of the angular frequency a) we employ the number v of cycles per


second, we obtain as an upper bound of the faithfulness requirement

This result indicates that we need not extend the fidelity requirement in
amplitude and phase beyond the limit I/T.
A similar result is deducible from a still different approach. We have
seen in the treatment of the Fourier series that the infinite Fourier series
could be replaced with a practically small error by a series which terminates
with n terms, if we modify the Fourier coefficients by the sigma factors and
at the same time replace f(x) by /(#), obtained by local smoothing. The
smoothing time of the local averaging process was

or, expressing once more everything in terms of cycles per second, the last
term of the finite Fourier series becomes an cos 2-nvx + bn sin 2-n-vx while r
becomes 1/v. We may equally reverse our argument and ask for the smooth-
ing time T which will make it possible to obtain f(x) with sufficient accuracy
by using frequencies which do not go beyond v. The answer is r = 1/v.
Here again we come to the conclusion that it is unnecessary to insist on
the fidelity of amplitude and phase response beyond the upper limit v = I/T.
The result of our analysis is that the apparently very stringent fidelity
requirements of a transient recording are in fact not as stringent as we
thought in the first moment. If amplitude and phase distortion can be
avoided up to about 10 or perhaps 15 thousand cycles per second, we can be
pretty sure that we have attained everything that can be expected in the
realm of high-fidelity sound reproduction. The great advances made in
recent years in the field of high-fidelity equipment is not so much due to a
spectacular extension of the amplitude fidelity to much higher frequencies
but due to a straightening out of the phase response which is of vital importance
for the high-fidelity reproduction of noise, although not demanded for the
recording of sustained sounds. The older instruments suffered by strong
phase distortion in the realm of high frequencies. To have eliminated this
distortion up to frequencies of 10,000 cycles per second has improved the
reproduction of the transients of music and speech to an admirable degree.
BIBLIOGRAPHY 347

BIBLIOGRAPHY
[1] Bush, V., Operational Circuit Analysis (Wiley, New York, 1929)
[2] Carson, J. R., Electric Circuit Theory and Operational Analysis (McGraw-Hill,
New York, 1920)
[3] Pipes, L. A., Applied Mathematics for Physicists and Engineers (McGraw-Hill,
New York, 1946)
CHAPTER 7

S T U R M - L I O U V I L L E PROBLEMS

Synopsis. The solution of ordinary second-order differential equations


has played a fundamental role in the evolution of mathematical physics,
starting with the eigenvibrations of a string, and culminating in the
atomic vibrations of Schrodinger's wave equation. The separation of
the fundamental differential operators of physics into functions of a
single variable leads to a large class of important second order
equations. While the solution of these equations cannot be given in
closed form, except in special cases, it is possible to obtain an eminently
useful approximation in terms of a mere quadrature. In the study of
this method we encounter a certain refinement which permits us to
deduce expressions in terms of elementary functions, which approximate
some of the fundamentally important function classes of mathematical
physics (such as the Bessel functions, and the Legendre, Hermite, and
Laguerre type of polynomials), with a remarkably high degree of
accuracy.

7.1. Introduction
There exists an infinite variety of differential equations which we may
want to investigate. Certain differential equations came into focus during
the evolution of mathematical theories owing to their vital importance in
the description of physical phenomena. The "potential equation", which
involved the Laplacian operator A, is in the foremost line among these
equations. The separation of the Laplacian operator in various types of
coordinates unearthed a wealth of material which demanded some kind of
universal treatment. This was found during the nineteenth century,
through the discovery of orthogonal expansions which generalised the out-
standing properties of the Fourier series to a much wider class of functions.
The early introduction of the astonishing hypergeometric series by Euler
was one of the pivotal points of the development. Almost all the important
function classes which came into use during the last two centuries are in
some way related to the hypergeometric series. These function classes are
characterised by ordinary differential equations of the second order. The
importance of these function classes was first discovered by two French
mathematicians, J. Ch. F. Sturm (1803-55) and J. Liouville (1809-82).
These special types of differential operators are thus referred to as belonging
348
SBC. 7.2 DIFFERENTIAL EQUATIONS OF FUNDAMENTAL SIGNIFICANCE 349

to the "Sturm-Liouville type". Lord Bayleigh in his famous acoustical


investigations (1898) came also near to the general theory of orthogonal
expansions. The final unifying viewpoint came into being somewhat later,
through the fundamentally new departure of I. Fredholm (1900), and its
later development by D. Hilbert (1912). It will not be our aim here to go
into any detailed study of this great branch of analysis which gave rise to
a very extended literature. We will deal with the Sturm-Liouville type of
differential equations primarily as an illustration of the general theory.
Furthermore, we will pay detailed attention to a specific feature of second-
order operators, namely that they are reducible to the solution of a first-order
non-linear differential equation. This method of solution played a decisive
role in the early development of quantum theory, characterised by Bohr's
atom model. But certain refinements of the method are well adapted to a
more exact approximation of the Sturm-Liouville type of function classes.
7.2. Differential equations of fundamental significance
The most general linear differential operator of second order can be written
down in the following form:

where A(x), B(x), G(x) are given functions of x. We will assume that A(x)
does not go through zero in the domain of investigation since that would
lead to a "singular point" of the differential equation in which generally
the function v(x) goes out of bound. It can happen, however, that A(x)
may become zero on the boundary of our domain.
We will enumerate a few of the particularly well investigated and
significant differential equations of pure and applied analysis. The majority
of the fundamental problems of mathematical physics are in one way or
another related to these special types of ordinary second-order differential
equations.
1. If

the differential equation (1) may be written in the form

and we have obtained the Sturm-Liouville type of differential operators


which have given rise to very extensive investigations.
2. Bessel's differential equation:

the solution of which are the "Bessel functions"


v(x) = Jp(x)
350 STUBM-LIOUVILLE PROBLEMS CHAP. 7

where the index p, called the "order of the Bessel function", may be an
integer, or in general any real or even complex number.
3. Mathieu's differential equation

which depends on the two constants a and b.


4. The differential equation of Gauss:

the solution of which is the hypergeometric series

This differential equation embraces many of the others since almost all
functions of mathematical physics are obtainable from the hypergeometric
series by the proper specialisation of the constants a, p, y, and the proper
transformation of the variable x.
5. The differential equation of the Jacobi polynomials, obtained from the
Gaussian differential equation by identifying a with a negative integer — n:

These polynomials depend on the two parameters y and 8.


6. A special subclass of these polynomials are the "ultraspherical poly-
nomials"

which correspond to the choice $ = 2y and thus depend on the single


parameter y. They are those Jacobi polynomials which possess left-right
symmetry if x is transformed into x\ = 1 — 2x. In this new variable we
obtain the differential equation

7. The choice y = 1 leads to "Legendre's differential equation"

which defines the "Legendre polynomials"


SEC. 7.2 DIFFERENTIAL EQUATIONS OF FUNDAMENTAL SIGNIFICANCE 351

8. The choice y = \ leads to the "Chebyshev polynomials" :

whose differential equation is

9. Still another class of polynomials, associated with the range [0, oo], is
established by the "differential equation of Laguerre" :

which defines the Laguerre polynomials Ln(x). These polynomials can be


conceived as limiting cases of the Jacobi polynomials, by choosing y = I
and letting 8 go to infinity. This necessitates, however, a corresponding
change of x, in the following sense:

where p. goes to zero. We thus obtain:

10. The "differential equation of Hermite"

is associated with the range [—00, +00] and defines the "Hermitian
polynomials" Hn(x). These polynomials are limiting cases of the ultra-
spherical polynomials (10), by letting y go to infinity and correspondingly
changing x in the following sense:

where /x goes once more to zero. We thus obtain:

Problem 273. By substituting x = {$x\ in Bessel's differential equation, show


that the function Jp(fix) satisfies the following differential equation:

Problem 274. By the added substitution x = x\a show that the function

satisfies the following differential equation:


352 STUBM-LIOTJVILLE PROBLEMS CHAP. 7

Problem 275. By making the further substitution v(x) = x~vvi(x) show that the
function

satisfies the following differential equation:*

Problem 276. Choose a. = £, y = —p/2, j8 = 1. Show that the function

satisfies the following differential equation:

Problem 277. By differentiating this equation n times, prove the following


theorem: Let vp be an arbitrary solution of Bessel's differential equation (4).
Then an arbitrary solution of the same differential equation for p\ — p + n is
obtainable as follows:

Problem 278. By substituting in (25) the value p = — £ show that Bessel's


differential equation for p — \ is solvable by

Problem 279. Obtain the general solution of Bessel's differential equation for
the order p = n — J (n an arbitrary integer) in terms of elementary functions
as follows:

with the understanding that we take an arbitrary linear combination of the


real and imaginary parts of this function.

7.3. The weighted Green's identity


In the matrix approach to the general theory of differential operators the
independent variable plays a relatively minor role. We have represented
the function / (x) as a vector in a many-dimensional space (see Chapter 4.5).
In this mode of representation the independent variable x provides merely
the cardinal number of the axis on which the functional value f(x) may be
found if we project the vector on that particular axis. The matrix associated
with a given linear differential operator D came about by breaking up the
continuous variable x into a large but finite set of discrete points and
replacing the derivatives by the corresponding difference coefficients. But
* This is a particularly useful generalised form of Bessel's differential equation. It
is listed, together with many other differential equations which are solvable in terms of
Bessel functions, in Jahnke-Emde (see Bibliography [2], pp. 214-15).
SEC. 7.3 THE WEIGHTED GKEEN's IDENTITY 353

if we examine the procedure of Chapter 4.6, we shall notice that the correla-
tion of the matrix A to the operator D is not unique. It depends on the
method by which the continuum is atomised. For example the differential
operator was translated into the matrix (4.6.11). In this translation the Axi
of the atomisation process were considered as constants and put equal to e.
If these Axi varied from point to point, the associated matrix would be quite
different. In particular, the originally symmetric matrix would now
become non-symmetric. Now a variation of the Axi could also be conceived
as keeping them once more as constants, but changing the independent
variable x to some other variable t by the transformation

because then

A uniform change Afy = e of the new variable t does not yield any longer a
corresponding uniform change in x. Since the properties of the associated
matrix A are of fundamental importance in the study of the differential
operator D, we see that a transformation of the independent variable x
into a new variable t according to (1) is not a trivial but in fact an essential
operation. It happens to be of particular importance in the study of
second order operators.
We have seen in Chapter 5.26 that the eigenfunctions Ui(x), Vi(x), which
came about by the solution of the shifted eigenvalue problem (5.26.2), had
the orthogonality property

If we introduce the new variable t, the same orthogonality changes its


appearance, due to the substitution

and we now obtain

(With a similar relation for the functions V{.) This modified orthogonality
property can be avoided if we use x instead of t as our independent variable.
But occasionally we have good reasons to operate with the variable t rather
than x. For example the functions ui(x) may become polynomials if
expressed in the variable t, while in the original variable x this property
would be lost.
Under these circumstances it will be of advantage to insist on a complete
freedom in choosing our independent variable x, allowing for an arbitrary
transformation to any new variable t. For this purpose we can survey our
previous results and every time we encounter a dx, replace it by <p'(t)dt.
Since, however, we would like to adhere to our standard notation x for our
24—L.D.O.
354 STURM-LIOT7VILLE PROBLEMS CHAP. 7

independent variable, we will prefer to call the original variable t and the
transformed variable x. Hence we prefer to re-write (4) in the form

Furthermore, we want to introduce a new symbol for <p' (x) by putting

In order to preserve the one-to-one correspondence between the two


variables x and t, we will demand that the sign of w(x) shall not change
within the interval of consideration [a, 6], although we will permit it to
become zero at the endpoints of the range. We do not lose in generality if
we normalise the sign of w(x) by assuming that it is everywhere positive
inside of our range.
This w(x) yields a new degree of freedom which is particularly valuable
in the study of second-order differential operators. Now the generalised
orthogonality relation (5) can be written as follows:

We call it "weighted orthogonality", or "orthogonality with respect to the


weight factor w(x)". We have encountered such a weighted orthogonality
earlier, Avhen dealing with the Laguerre polynomials (cf. 1.10.7), which were
orthogonal with respect to the weight factor e~x. The fundamental Green's
identity appears now in the generalised variable x in the following form:

which again leads to the following generalisation of the relation (4.12.5):

We will call the identity (9), which demies the adjoint operator t) with
respect to the weight factor w(x), the "weighted Green's identity". We
see that the definition of the adjoint operator is vitally influenced by the
choice of the independent variable. If we transform the independent
variable t to a new variable x, and thus express Dv in terms of x, we obtain
the adjoint operator D«—likewise expressed in the new variable x—by
defining this operator with the help of the weighted Green's identity (9).
The definition of the Green's function G(x, £) is likewise influenced by the
weight factor w(x), because the delta function 8(x, £) is not an invariant of
a coordinate transformation. The definition of S(x — £) = S(t) demanded
that
SEC. 7.3 THE WEIGHTED GREEN'S IDENTITY 355

If the independent variable t is changed to x according to the transformation


(6), the condition (11) takes in the new variable the form

Hence in the generalised variable the delta function has to be written in


the form

and the definition of the Green's function must occur on the basis of the
equation (cf. 5.4.12)

or, if we want to define the same function in terms of the original operator
(cf. 5.12.15):

The shifted eigenvalue problem

remains unchanged and the construction of the Green's function with the
help of the bilinear expansion (5.27.7) is once more valid:

(We exclude zero eigenvalues since we assume that the given problem is
complete and unconstrained.) However, in the solution with the help of
the Green's function the weight factor w(x) appears again:

Problem 280. Given the differential equation

with the boundary conditions

Transform t into the new variable x by putting


356 STUBM-LIOUVILLE PROBLEMS CHAP. 7

Formulate the given problem in the new variable x and solve it with the help
of the weight factor

Demonstrate the equivalence of the results in t and in x.


[Answer:

7.4. Second-order operators in self-adjoint form


We have seen in the function space treatment of linear differential operators
that the principal axis transformation of such an operator leads to two sets
of orthogonal functions, operating in two different spaces. The one set
can be used for the expansion of the right side, the other for the expansion
of the solution. The relation between these two expansions was established
by the eigenvalues associated with these two sets of eigenfunctions. But in
the case where the operator is self-adjoint, a great simplification takes place
because then these two spaces collapse into one. Then the functions on the
left side and the right side can be analysed in the same set of orthogonal
functions and the shifted eigenvalue problem becomes reduced to half the
number of equations. The operator D does not now require pre-
multiplication by I) to make it self-ad joint; it is self-ad joint in itself.
Now in the case of second-order differential operators we are in the
fortunate position that the self-adjointness is an automatic corollary of the
operator, provided that we employ the proper independent variable for
the formulation of the problem, or else operate with an arbitrary variable
but change from ordinary orthogonality to weighted orthogonality, in
the sense of the previous section.
Let us consider the general form (2.1) of a linear differential operator.
On the other hand, the most general self-adjoint operator has the form

If we change from the variable t to the variable x by means of the trans-


formation (3.6), we obtain in the new variable:

We can dispose of the weight function w(x) in such a way that the genera]
SEC. 7.4 SECOND-ORDER OPERATORS IN SELF-ADJOINT FORM 357

operator (2.1) becomes transformed into the self-adjoint form (2). For
this purpose we must define the functions A\ and w according to the following
conditions:

This yields for w the condition

With the substitution wA = p we obtain

and thus

which means, if we return to the weight function w(x):

Here we have the weight factor which makes the general second order operator
(2.1) self-adjoint.
The boundary term of the weighted Green's identity becomes

We have a wide choice of boundary conditions which will insure self-


adjointness. Let us write the two homogeneous boundary conditions im-
posed on v(x) as follows:

Then the four constants of these relations can be chosen freely, except for
the single condition:

Conditions which involve no interaction between the two boundary


points, can be derived from (9) by a limit process. Let us put

and let us go with e toward zero. Then the quantity piq% — pzqi is reduced
to pie, while qi is freely at our disposal. Hence in the limit, as e goes to
zero, we obtain the boundary condition
358 STURM-LIOUVILLE PROBLEMS CHAP. 7

where v is arbitrary. This condition takes now the place of the second
condition (9). At the same time, as e goes to zero, pi must go to infinity.
This implies that the first condition (9) becomes in the limit:

Hence the boundary conditions (12) and (13)—with arbitrary p. and v—are
permissible self-adjoint boundary conditions.
Our operator is now self-adjoint, with respect to the weight factor w(x).
Hence the shifted eigenvalue problem (3.16) becomes simplified to

in view of the fact that the functions Ui(x) and vt(x) coincide. The resultant
eigensolutions form an infinite set of ortho-normal functions, orthogonal
with respect to the weight factor w(x):

While generally the eigenvalue problem (14) need not have any solutions,
and, even if the solutions exist, the A$ will be generally complex numbers,
the situation is quite different with second order (ordinary) differential
operators, provided that the boundary conditions satisfy the condition (10).
Here the eigenvalues are always real, the eigensolutions exist in infinite
number and span the entire action space of the operator. The orthogonality
of the eigenfunctions holds, however, with respect to the weight factor
w(x), defined by (7).
Problem 281. Obtain the weight factor w(x) for Mathieu's differential operator
(2.5) and perform the transformation of the independent variable explicitly.
[Answer:

Problem 282. Consider Laguerre's differential equation (2.16). Obtain its


weight factor w(x) and show that for the realm [0, oo] the condition (10) is ful-
filled for any choice of the four constants of (9).
[Answer:

The boundary condition is nevertheless present by the demand that v(x) must
not grow to infinity stronger than ex/2.
Problem 283. Show that boundary conditions involving the point a alone (or
6 alone) cannot be self-adjoint.
SEC. 7.5 TRANSFORMATION OF THE DEPENDENT VARIABLE 359

Problem 284. Find the condition, under which a self-adjoint periodic solution

becomes possible.
[Answer:

7.5. Transformation of the dependent variable


There is still another method of securing the self-adjointness of a second
order differential equation. Instead of transforming the independent
variable x, we may transform the function v(x). We do not want to lose the
linearity of our problem and thus our hands are tied. The transformation
we can consider is the multiplication of v(x) by some function of x. This,
however, is actually sufficient to make any second order problem self-adjoint.
We start with the weighted Green's identity

Let us introduce the new functions v(x) and u(x) by putting

At the same time we transform the operator D to a new operator D by


putting

Then the transcription of (1) into the new variables yields:

We are back at Green's identity without any weight factor. The trans-
formation (2), (3) absorbed the weight factor w(x) and the new operator has
become self-adjoint without any weighting. The solution of the eigenvalue
•nrohlfim

yields an ortho-normal function system for which the simple orthogonality


condition holds:

If we introduce this transformation into the general second order


360 STURM-LIOUVILLE PROBLEMS CHAP. 7

equation (2.1), we obtain the new self-adjoint operator D in the following


form:

In summary we can say that the eigenvalue problem of a second order


linear differential operator can be formulated in three different ways, in
each case treating the problem as self-adjoint:
1. We operate with the given operator without any modification and
switch from ordinary orthogonality to weighted orthogonality.
2. We transform the independent variable to make the operator self-
adjoint, without weighting.
3. We transform the dependent variable and the operator to a self-adjoint
operator, without weighting.
In all three cases the same eigenfunction system is obtained, in different
interpretations, and the eigenvalues remain unchanged. There is, however,
a fourth method by which unweighted self-adjointness can be achieved.
We have written the general second-order operator in the form (4.2). Let
us now multiply the given differential equation Dv = ft on both sides by
w(x). We then obtain

and the new equation is automatically self-adjoint, although neither the


independent nor the dependent variable has been transformed. The
eigenfunctions and eigenvalues of the new problem are generally quite
different from those of the original problem, since in fact we are now solving
the eigenvalue problem

For the purpose of solving the differential equation, however, these eigen-
functions are equally applicable, and may result in a simpler solution
method.
As an illustrative example we will consider the following differential
equation:

where k is an integer. This differential equation occurs in the problem of


the loaded membrane (see Chapter 8.6), x signifying the polar coordinate r
of the membrane, while v(x) is the displacement, caused by the load density
SEC. 7.5 TRANSFORMATION OF THE DEPENDENT VARIABLE 361

j8(z). We assume as boundary condition that at x = 1—where the


membrane is fixed—the displacement must vanish:

The eigenvalue problem associated with this equation becomes

which is a special case of the differential equation (2.20), and thus solvable
in thfi form

(We shall see in Section 12 that the second fundamental solution of Bessel's
differential equation, viz. Jk(x), becomes infinite at x = 0 and is thus
ineligible as eigenfunction.) The boundary condition (11) demands

and this means that V — X has to be identified with any of the zeros of the
Bessel function of the order Jc (that is those #-values at which Jjc(x) vanishes).
If these zeros are called xm, we obtain for \m the selection principle

There is an infinity of such zeros, as we must expect from the fact that the
eigenvalue problem of a self-adjoint differential operator (representing a
symmetric matrix of infinite order) must possess an infinity of solutions.
The weight function w(x) of our eigenvalue problem becomes (cf. 4.7)

and we have to normalise our functions (13) by a constant factor Am,


defined by

The solution of the differential equation (10) can now occur by the standard
method of expanding in eigenfunctions. First we expand the right side in
our ortho-normal functions:

where

Then we obtain the solution by a similar expansion, except for the factor
A^1:
362 STUBM-LIOUVILLE PROBLEMS CHAP. 7

The same problem is solvable, however, in terms of elementary functions


if we proceed in a slightly different manner. Making use of the transformation
(2.22) with y = — £ we first replace v(x) by a new function vi(x), defined by

Our fundamental differential equation (10) appears now in the form

The new weight factor is w(x) = x2 and we will make our equation self-
adjoint by multiplying through by xz:

In this new form the eigenvalue problem becomes

This equation is solvable by putting

with the following condition for n:

where we have put

Then the general solution of (24), written in real form, becomes

In contradistinction to the previous solution, in which the origin x = 0


remained a regular point since one of the solutions of Bessel's differential
equation remained regular at x = 0, at present both solutions become
singular and we are forced to exclude the point x = 0 from our domain,
starting with the point x = e and demanding at this point the boundary
condition

Since the differential equation (23) shows that in the neighbourhood of


x — 0 the function v\(x) starts with the power #3/2, we see that we do not
commit a substantial error by demanding the boundary condition (29) at
x = e instead of x = 0, provided that we choose e small enough.
Now we have the two boundary conditions
SEC. 7.5 TRANSFORMATION OF THE DEPENDENT VARIABLE 363

which determines v\(x) uniquely. We obtain for the normalised eigen-


functioris

and thus

and

The new eigenfunctions oscillate with an even amplitude, which was not
the case in our earlier solutions Jfc(#). In fact, we recognise in the new
solution of the eigenvalue problem the Fourier functions, if we introduce
instead of # a new variable t by putting

If, in addition, we change j8(#) to

the expansion of the right side into eigenfunctions becomes a regular Fourier
sine analysis of the function /3i(ee*).
The freedom of transforming the function v(x) by multiplying it by a
proper factor, plus the freedom of multiplying the given differential equation
by a suitable factor, can thus become of great help in simplifying our task
of solving a given second order differential equation. The eigenfunctions
and eigenvalues are vitally influenced by these transformations, and we may
wonder what eigenfunction system we may adopt as the "proper" system
associated with a given differential operator. Mathematically the answer
is not unique but in all problems of physical significance there is in fact
a unique answer because in these problems—whether they occur in hydro-
dynamics, or elasticity, or atomic physics—the eigensolutions of a given
physical system are determined by the separation of a time dependent
differential operator with respect to the time t, thus reducing the problem to
an eigenvalue problem in the space variables. We will discuss such problems
in great detail in Chapter 8.
Problem 285. Transform Laguerre's differential equation (2.16) into a self-
adjoint form, with the help of the transformation (2), (3).
[Answer:
364 STUBM-LIOUVILLB PROBLEMS CHAP. 7

Problem 286. Transform Hermite's differential equation (2.18) into a self-


adjoint form.
[Answer:

Problem 287. Transform Chebyshev's differential equation (2.15) into a self-


adjoint form.
[Answer:

Problem 288. Making use of the differential equation (2.23) find the normalised
eigenfunctions and eigenvalues of the following differential operator:

Range: x — [0, a], boundary condition: v'(a) = 0.


[Answer:

7.6. The Green's function of the general second-order differential equation


We will now proceed to the construction of the Green's function for the
general second-order differential operator (2.1). We have seen in (5.8) that
a simple multiplication of the given differential equation

by w(x) on both sides changes this equation to a self-adjoint equation.


Hence it suffices to deal with our problem in the form

This means

for which we can also put

It will thus be sufficient to construct the Green's function for the self-adjoint
equation (4).
SBO. 7.6 GREEN'S FUNCTION OF SECOND-ORDER DIFFERENTIAL EQUATION 365
Green's identity now becomes

As boundary conditions we prescribe the conditions

These are self-adjoint conditions, as we have seen in Section 4. Hence


we have a self-adjoint operator with self-adjoint boundary conditions which
makes our problem self-adjoint. The symmetry of the resulting Green's
function is thus assured.
We proceed to the construction of the Green's function hi the standard
fashion, considering £ as the variable and x as the fixed point. To the left
of x we will have a solution of the homogeneous differential equation which
satisfies the boundary condition at the point £ = a. Now the homogeneous
second-order differential equation has two fundamental solutions, let us say
vi(£) and vz(£) and the general solution is a linear superposition of these two
solutions:

By satisfying the boundary condition at the point £ = a one of the constants


can be reduced to the other and we obtain a certain linear combination of
vi(g) and v%(t;) (multiplied by an arbitrary constant), which satisfies not only
the differential equation but also the boundary condition at the point £ = a.
We shall denote this linear combination by vi(g), and similarly the other
linear combination which satisfies the boundary condition at £ = 6, by
vz(£)' Hence the two analytical expressions which give us the Green's
function to the left and to the right of the point £ = x, become

with the two undetermined constants C\ and Cz which now have to be


obtained from the conditions at the dividing point £ = x. At this point we
have continuity of the function. This gives

We have a jump, however, in the first derivative. The magnitude of this


jump, going from the left to the right, is equal to 1 divided by the coefficient
of v"(g) at the point £ = x. We thus get a second condition in the form
366 STUBM-UOFVTLLB PEOBLEMS CHAP. 7

The solution of the two simultaneous algebraic equations (9) and (10) for
the constants C± and C% yields:

The quantity which appears in the denominator of the second factor is


called the "Wronskian" of the two functions vi(x) and v%(x). We will
denote it by W(x):

We thus obtain as final result:

Now the symmetry of the Green's function demands that an exchange of


x and £ in the first expression shall give the second expression. In the
numerator we find that this is automatically fulfilled. In the denominator,
however, we obtain the condition

Since the point £ can be chosen arbitrarily, no matter how we have fixed
the point x, we obtain the condition

or, considering the expression (4.7) for w(x):

We have obtained this result from the symmetry of the Green's function
G(x, |) but we can deduce it more directly from the weighted Green's identity

Since both vi and v^ satisfy the homogeneous differential equation Dv = 0,


we can identify v with v% and u with vi. Then the relation (16) becomes

which is exactly the relation we need for the symmetry of G(x, £).
An important consequence of this result can be deduced if we consider
the equation
SEC. 7.6 GREEN'S FUNCTION or SECOND-ORDER DIFFERENTIAL EQUATION 367
as a differential equation for vz(x), assuming that v\(x) is given. We can
integrate this equation by the method of the "variation of constants".
We put

Considering C as a function of x we obtain

and thus

and

We see that a second-order differential equation has the remarkable


property that the second fundamental solution is obtainable by quadrature if
the first fundamental solution is given.
Problem 289. Consider the differential equation of the vibrating spring:

Given the solution

Obtain the second independent solution with the help of (23).


Problem 290. Consider Chebyshev's differential equation (2.14). Given the
fundamental solution

if

Find the second fundamental solution with the help of (23).


[Answer:

Problem 291. Consider Legendre's differential equation (2.12) for n = 2.


Given the solution

Find the second fundamental solution by quadrature.


[Answer:

Problem 292. Consider Laguerre's differential equation (2.16) which defines the
polynomials Ln(x). Show that the second solution of the differential equation
goes for large x to infinity with the strength ex.
Problem 293. The proof of the symmetry of the Green's function, as discussed
in the present section, seems to hold universally while we know that it holds
368 STUKM-LIOUVTLLE PROBLEMS CHAP. 7

only under self-ad joint boundary conditions. What part of the proof is
invalidated by the not-self-ad joint nature of the boundary conditions?
[Answer: As far as vi(£) and v%(J;) goes, they are always solutions of the homo-
geneous equation. But the dependence on x need not be of the form f(x)v(£),
as we found it on the basis of our self-adjoint boundary conditions.]
7.7. Normalisation of second order problems
The application of a weight factor w(x) to a given second-order differential
equation is a powerful tool in the investigation of the analytical properties
of the solution and is often of great advantage in obtaining an approximate
solution in cases which do not allow an explicit solution in terms of elementary
functions. In the previous sections we have encountered the method of the
weight factor in two different aspects. The one was to multiply the entire
equation by a properly chosen weight factor w(x) (cf. 6.2). If w(x) is chosen
according to (4.7), the left side of the equation is transformed into

Another method was to multiply and obtain a differential


operator for

But here we have not only made the substitution

but also changed the operator D to Z>i (cf. 5.3). The new differential
equation thus constructed became [cf. (5.7)]:

where

We shall now combine these two methods in order to normalise the


general second-order differential equation into a form which is particularly
well suited to further studies. First of all we will divide the entire equation
by the coefficient of v"(x):

We thus obtain two new functions which we will denote by b(x) and c(x):
SEC. 7.7 NORMALISATION OF SECOND ORDER PROBLEMS 369

Our equation to be solved has now the form

Then we apply the transformation (3-5) which is now simplified due to the
fact that A(x) = 1:

The new form of the differential operator has the conspicuous property
that the term with the first derivative is missing. The new differential operator
is of the Sturm-Liouville type (2.3) but with A(x) — 1. It is characterised
by only one function which we will call U(x):

The homogeneous equation has now the form

If we know how to solve this differential equation, we have also the solution
of an arbitrary second order equation since the solution of an arbitrary
second order differential equation can be transformed into the form (12)
which is often called the "normal form" of a linear homogeneous differential
equation of second order.
Problem 294. Transform Besael's differential equation (2.4) into the normal
form (12).
[Answer:

Problem 295. Transform Laguerre's differential equation (2.16) into the normal
form.
[Answer:

Problem 296. Transform Hermite's differential equation (2.18) into the normal
form.
[Answer:

25—L.D.O.
370 STUBM-LIOtTVILLE PROBLEMS CHAP. 7

7.8. Riccati's differential equation


The following exponential transformation has played a decisive role in
the development of wave mechanics and in many other problems of
mathematical physics. We put

and consider y>(x) as a new function for which a differential equation is to


be obtained.
We now have

The differential equation

yields the following equation for tp(x):

If we put

we obtain for y(x) the following first order equation:

This is a non-linear inhomogeneous differential equation of the first order


for y(x), called "Riccati's differential equation".
It is a fundamental property of this differential equation that the general
solution is obtainable by quadratures if two independent particular solu-
tions are given. Let y\ and y% be two such solutions, then

yield two particular solutions of the homogeneous linear equation (2), in


the form

But then the general solution of (2) becomes

which gives for the associated function cp(x):

where AI and A% are two arbitrary constants.


SEO. 7.9 PERIODIC SOLUTIONS 371

Problem 297. In Riccati's differential equation (4) assume U = const. = — Cz.


Then the differential equation is explicitly solvable by the method of the
"separation of variables". Demonstrate the validity of the result (8), choosing
as particular solutions: y\ = C, yz = —C.

7.9. Periodic solutions


In a very large number of problems in which second order differential
equations are involved, some kinds of vibrations occur. If the differential
equation is brought into the form (8.2), we can see directly that we shall
obtain two fundamentally different types of solutions, according to the
positive or negative sign of U(x). If U(x) were a constant, the solution of
the differential equation (8.2) would be of the form

if U is positive and

if U is negative. Although these solutions will not hold if U(x) is a function


of x, nevertheless, for a sufficiently small interval of x, U(x) could still be
considered as nearly constant and the general character of the solution will
not be basically different from the two forms (1) or (2). We can say quite
generally that the solution of the differential equation (8.2) has a periodic
character if U(x) is positive and an exponential character if U(x) is negative.
In most applied problems the case of a positive U(x) is of primary interest.
If U(x) is negative and the solution becomes exponential, the exponentially
increasing solution is usually excluded by the given physical situation while
the exponentially decreasing solution exists only for a short interval, beyond
which the function is practically zero. The case of positive U(x) is thus
of much more frequent occurrence.
Now in the case of positive U(x) we usually look for solutions of Riccati's
equation which are not real. A real solution would lead to an infinity of
<p(x) if v(x) vanishes. Hence at every point at which a periodic oscillation
goes to zero, the associated solution of the Riccati equation would become
singular. This is not the case, however, if the solution is complex, i.e. if
y(x) is of the form

Such a solution has the further advantage that it immediately supplies a


second solution since we know in advance that also a(x) — if$(x) must be a
solution in the case that U(x) is real. And thus any complex solution of
Riccati's equation (8.4) is equivalent to a complete solution of the associated
equation (8.2).
Let us now introduce the complex quantity (3) in the equation (8.4).
This equation now separates into the two real equations
372 STURM-LIOUVTLLE PROBLEMS CHAP. 7

The second equation is integrable and gives

thus

But then

and

Since both the real and the imaginary part of this solution must be a
solution of our equation (8.2) (assuming that U(x) is real), we obtain the
general solution in the form

for which we may also write

where 0 is an arbitrary phase angle. Since, however, the indefinite integral


$fidx has already a constant of integration, it suffices to put

This form of the solution leads to the remarkable consequence that the
solution of an arbitrary linear homogeneous differential equation of second
order for which the associated U(x) is positive in a certain interval, may be
conceived as a periodic oscillation with a variable frequency and variable
amplitude. Ordinarily we think of a vibration in the sense of a function of
the form

where A, o>, and 6 are constants. If the amplitude A becomes a function


of x, we can still conceive our function as a periodic vibration with a variable
amplitude. But we can go still further and envisage that even the frequency
CD is no longer a constant. Now if the argument of the sine function is no
longer of the simple form <ax but some arbitrary function of x, we could
introduce the concept of an "instantaneous frequency" which we can define
as the derivative of the argument. But then we see that the solution (12)
has the significance of an oscillation with the instantaneous frequency j8.
Furthermore, we come to the important conclusion that the amplitude and
SEC. 7.9 PERIODIC SOLUTIONS 373

frequency of this oscillation are necessarily coupled with each other. If the
frequency of the oscillation is a constant, the amplitude is also a constant.
But if the frequency changes, the amplitude must also change according to
a definite law. The amplitude of the vibration is always inversely pro-
portional to the square root of the instantaneous frequency. If we study the
distribution of zeros in the oscillations of the Bessel functions or the Jacobi
polynomials or the Laguerre or Hermite type of polynomials, the law of the
zeros is not independent of the law according to which the maxima of the successive
oscillations change. The law of the amplitudes is uniquely related to the
law of the zeros and vice versa.
This association of a certain vibration of varying amplitude and frequency
with a solution of a second-order differential equation is not unique, however.
The solution (11) contains two free constants of integration, viz. the phase
constant 6 and the amplitude constant A. This is all we need for the
general solution of a second-order differential equation. And yet, if we
consider the equations (4) and (5), which determine j8(z), we notice that we
get for j8(x) a differential equation of second order, thus leaving two further
constants free. This in itself is not so surprising, however, if we realise that
we now have a complex solution of the given second-order differential
equation, with the freedom of prescribing V(XQ) and V'(XQ) at a certain point
x = XQ as two complex values, which in fact means four constants of
integration. But if we take the real part of the solution for itself:

then we see that the freedom of choosing j3(#o) and P'(XQ) freely must lead
to a redundancy because to any given V(XQ), V'(XQ) we can determine the
constants A and 6 and, having done so, the further course of the function
v(x) is uniquely determined, no matter how fi(x) may behave. This means
that the separation of our solution in amplitude and frequency cannot be
unique but may occur in infinitely many ways.
Let us assume, for example, that at x = XQ v(x) vanishes. This means
that, if the integral under the cosine starts from x = XQ, the phase angle
becomes 7r/2. Now in this situation the choice of /?'(#o) can have no effect
on the resulting solution, while the choice of j8(#o) can change only a factor
of proportionality. And yet the course of j8(a;)—and thus the instantaneous
frequency and the separation into amplitude and frequency—is profoundly
influenced by these choices.
Problem 298. Investigate the differential equation

with the boundary condition

which suggests a unique separation into a vibration of constant frequency and


constant amplitude, while in actual fact this separation is not unique. The
374 STUBM-LIOUVTLLE PEOBLEMS CHAP. 7

associated Biccati equation is here completely integrable (cf. Problem 297).


Obtain the general solution of the problem in terms of instantaneous frequency
and amplitude and show that the choice of /S'(0) has no effect on the solution,
while the choice of )8(0) changes only a multiplicative factor.
[Answer:

where p and A are arbitrary constants.

7.10. Approximate solution of a differential equation of second order


We have seen that an arbitrary linear homogeneous differential equation
of the second order can be transformed into the form (8.2). That form,
furthermore, could be transformed into the Riccati equation (8.4). We
have no explicit method of solving this equation exactly. We have seen,
however, that the solution is reducible, as far as the function v(x) is con-
cerned, to the form (9.11) and is thus obtainable by mere quadrature if
fi(x) is at our disposal. This would demand an integration of the differential
equation (9.4), expressing a(x) in terms of fi(x) according to the relation
(9.5). We can make good use, however, of the redundancy of this differential
equation which permits us to choose any particular solution of the differential
equation and still obtain a complete solution of the original equation (8.2)
While the equation for fi(x) is in fact a second-order differential equation,
we shall handle it in a purely algebraic manner, neglecting altogether the
terms with «' and a2, thus putting simply

The resulting solution for v(x):

is of great value in many problems of atomic physics and gives very


satisfactory approximations for many important function classes, arising
out of the Sturm-Liouville type of problems. It is called the "Kramers-
Wentzel-Brillouin approximation", or briefly the "KWB solution". The
corresponding solution in the range of negative U becomes

The condition for the successful applicability of the simplified solution (1)
is that U must be sufficiently large. Our solution will certainly fail in the
neighbourhood of U = 0. It so happens, however, that in a large class of
SEC. 7.10 SOLUTION OF A DIFFERENTIAL EQUATION OF SECOND ORDER 375

problems U ascends rather steeply from the value U = 0 and thus the
range in which the KWB approximation fails, is usually limited to a relatively
small neighbourhood of the point at which U(x) vanishes.
In order to estimate the accuracy of the KWB solution, we shall substitute
in Riccati's differential equation

where yo is the KWB solution, namely

Then we obtain for 77(3) the differential equation

Since 77 is small, 7?2 is negligible. Moreover, we expect that 7? will be small


relative to t/o- This makes even 77' negligible in comparison to 2yo-r). Hence
we can put with sufficient accuracy for estimation purposes:

(We have replaced in the denominator yo by its leading term.)


Here we have an estimation of the error of the solution of Riccati's
differential equation if we accept the simplified solution (1) for fi(x) (and
correspondingly the solution (2) for v(x)). This is the quantity which will
decide how closely the KWB solution approximates the true solution. The
KWB solution will only be applicable in a domain in which the quantity (7)
is sufficiently small.
So far we have only obtained the error of y. We have to go further in
order to estimate the error of v(x). Here the solution of Riccati's differential
equation appears in the exponent and requires an additional quadrature.
Hence we need the quantity

if our realm starts with x = a and we assume that v(x) is at that point
adjusted to the proper value v(a) in amplitude and phase. To carry through
an exact quadrature with the help of (7) as integrand will seldom be possible.
But an approximate estimation is still possible if we realise that the second
term in the numerator of (7) is of second order and can thus be considered
as small. If, in addition, we replace in the denominator Vl7 by its
minimum value between a and x, we obtain the following estimation of the
relative error of the KWB approximation:
376 STUEM-LIOTJVILLE PROBLEMS CHAP. 7

(We have assumed that (log U)" does not go through zero in our domain,
otherwise we have to sectionalise the error estimation.)
Problem 299. Given the differential equation

In this example one of the fundamental solutions is explicitly available in


exact form:

which makes an exact error analysis possible. Obtain the solution by the
KWB method and compare it with the exact solution in the realm x = [3, oo].
Choosing a = oo, estimate the maximum error of y(x) and v(x) on the basis of
the formulae (7) and (9) and compare them with the actual errors.
[Answer:
Maximum error of y(x) (which occurs at x = 3): 77(8) = — 0.0123
(formula (7) gives - 0.0107)
Maximum relative error of
(formula (9) gives 0.0237)]
Problem 300. For what choice of U(x) will the KWB approximation become
accurate? Obtain the solution for this case.
[Answer:

where a and & are arbitrary constants.

Problem 301. Obtain the KWB-approximation for Bessel's differential equation,


transformed to the normal form.
[Answer:

Problem 302. Obtain the KWB-approximation for Hermite's differential


equation (of. 7.15).
[Answer:

7.11. The joining of regions


In many important applications of second-order differential equations, in
particular in the eigenvalue problems of atomic physics, the situation is
encountered that the function U(x), which appears in the normal form of a
differential equation of second order, changes its sign in the domain under
SEC. 7.11 THE JOINING OF REGIONS 377

consideration. This change of sign does not lead to any singularity but it
does have a profound effect on the general character of the solution since the
solution has a periodic character if U(x) is positive and an exponential
character if U(x) is negative. The question arises how we can continue our
solution from the one side to the other, in view of the changed behaviour of
the function. The KWB approximation is often of inestimable value in
giving a good overall picture of the solution. The accuracy is not excessive
but an error of a few per cent can often be tolerated and the KWB method
has frequently an accuracy of this order of magnitude. The method fails,
however, in the neighbourhood of U(x) = 0 and in this interval a different
approach will be demanded. Frequently the transitory region is of limited
extension because U(x) has a certain steepness in changing from the negative
to the positive domain (or vice versa) and the interval in which U(x)
becomes too small for an effective approximation by the KWB method is
often sufficiently reduced to allow an approximation of a different kind.
If in the figure the KWB approximation fails between the points A and B,
we see that on the other hand U(x) changes in this interval gently enough to
permit a linear approximation of the form a(x − x₀). Since we can always
transfer the origin of our reference system into the point x = x₀, we will
pay particular attention to the differential equation

If we have the solution of this particular differential equation, we shall have


the link we need in order to bridge the gap between the exponential and the
periodic domains. This differential equation can be conceived as a special
case of (2.23) which is solvable in terms of Bessel functions. For this
purpose we have to choose the constants α, β, γ, p as follows:

We shall choose β = 1 and thus put a = 9/4. The differential equation



can thus be solved in the following form:

In order to study the behaviour of this solution for both positive and
negative values of x, a brief outline of the basic analytical properties of the
Bessel functions will be required.
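In modern computational notation the solutions of this special differential equation are the Airy functions, which are built from Bessel functions of the order ±1/3 in precisely this manner. A minimal numerical check of the classical identity Ai(−x) = (√x/3)[J₁/₃(ζ) + J₋₁/₃(ζ)], ζ = (2/3)x^{3/2}, for the normalised equation v″ + xv = 0 (a sketch, assuming Python with the scipy library):

    import numpy as np
    from scipy.special import airy, jv

    # Ai(-x) = sqrt(x)/3 * (J_{1/3}(z) + J_{-1/3}(z)),  z = (2/3) x^(3/2)
    x = np.linspace(0.2, 3.0, 8)
    z = (2.0 / 3.0) * x**1.5
    lhs = airy(-x)[0]                     # airy returns (Ai, Ai', Bi, Bi')
    rhs = np.sqrt(x) / 3.0 * (jv(1.0/3.0, z) + jv(-1.0/3.0, z))
    print(np.max(np.abs(lhs - rhs)))      # agreement to machine precision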

7.12. Bessel functions and the hypergeometric series


The hypergeometric series (2.7) is closely related to the Bessel functions.
Let us put α = β and x = α⁻²x₁. We will thus consider the infinite expansion

If in this expansion we let α go to infinity, we obtain a definite limit:

This series defines a function of x which is in fact an "entire function",


i.e. it has no singular point anywhere in the entire complex plane. Indeed,
the only singular point of the hypergeometric function F(α, β, γ; x) is the
point x = 1 but the substitution x/α² for x pushes the singular point out
into infinity and makes the series (2) converge for all real or complex values
of x. The parameter γ is still at our disposal.
Now the substitution α = β, and

transforms the Gaussian differential equation (2.6) as follows:

If we now go with α to infinity, we obtain in the limit for the function


F(γ; x), defined by the expansion (2), the following differential equation
(dividing through by x₁, which we denote once more by x):

This differential equation can again be conceived as a special case of (2.23).


For this purpose the following choices have to be made (replacing the γ of
(2.23) by γ₁ since the present γ refers to a different quantity):

This means:

and thus we see that the entire function F(γ; x) satisfies Bessel's differential
equation if we make the following correlation:

The same correlation may be written in the following form:

Since, furthermore, the constant p appears in (2.4) solely in the form p²,
we obtain the general solution of Bessel's differential equation as follows:

(The second solution is invalidated, however, if p is an integer n, since the
γ of the hypergeometric series must not be zero or a negative integer.)
Bessel's function of the order p is specifically defined as follows:

We will define the following entire function of x²:

Then we can put

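In modern terminology the limit function F(γ; x) is the confluent hypergeometric limit function ₀F₁, and the correlation presumably reads γ = p + 1 with the argument −x²/4, which is the standard form of the limit. A quick check (a sketch, assuming Python with scipy):

    import numpy as np
    from scipy.special import hyp0f1, jv, gamma

    # Standard confluent limit: J_p(x) = (x/2)^p / p! * 0F1(p+1; -x^2/4)
    p, x = 1.5, 2.7
    from_limit = (x/2.0)**p / gamma(p + 1.0) * hyp0f1(p + 1.0, -x*x/4.0)
    print(from_limit, jv(p, x))   # the two values coincide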
If p happens to be a (positive) integer n, the function Jp(x) is in itself an
"entire function", i.e. a function which is analytical throughout the complex
plane z = x + iy. But if p is not an integer, the factor x^p will interfere

with the analytical nature of Jp(x) and require that along some half-ray of
the complex plane, between r = 0 and ∞, a "cut" is made and we must
not pass from one border to the other. However, this cut is unnecessary
if we stay with x in the right complex half plane:

It is in fact unnecessary to pass to the negative half plane. The function


Jp(z) is—as we see from the definition (10)—reducible to the basic function
Ap(z²) which assumes the same values for ±z. Hence Jp(−z) is reducible
to Jp(z). Moreover, the Taylor expansion (10) of Jp(z), having real co-
efficients, demonstrates that Jp(x − iy) is simply the complex conjugate of
Jp(x + iy). Hence it suffices to study the analytical behaviour of Jp(z) in
the right upper quarter of the complex plane, letting the polar angle θ of
the complex number z = re^{iθ} vary only between 0 and π/2.
Problem 303. Show the following relation on the basis of the definition (10) of
Jp(z):

7.13. Asymptotic properties of Jp(z) in the complex domain


The behaviour of the Bessel functions in the outer regions of the complex
plane is of considerable interest. We need this behaviour for our problem
of joining the periodic and exponential solutions of a differential equation
of second order. We will not go into the profound function-theoretical
investigations of K. Neumann and H. Hankel which have shed so much
light on the remarkable analytical properties of the Bessel and related
functions.* Our aim will be to stay more closely to the defining differential
equation and draw our conclusions accordingly. In particular, we will use
the KWB method of solving this differential equation in the complex, for a
simple derivation of the asymptotic properties of the function Jp(z) for
large complex values of the argument z.
For this purpose we write Bessel's differential equation once more in the
normal form:

with

We now introduce the complex variable z in polar form:

* Cf. the chapters on Bessel functions in the fundamental literature: Courant-Hilbert,


Whittaker-Watson, Watson, quoted in the Bibliography.

We want to move along a large circle with the radius r = r₀, the angle θ
changing between 0 and π/2. This demands the substitution

In the new variable θ our differential equation (1) becomes

Now we have to make the substitution (cf. 7.9)

thus bringing (5) into the normal form

Now the function

is large, in view of the largeness of r₀, and is thus amenable to the KWB
solution. In fact, p² is negligible in comparison to the first term (except if
the order of the Bessel function is very large), which shows that the
asymptotic behaviour of the Bessel functions will be similar for all orders p.
The KWB solution (10.3) becomes in our case (considering that r₀ is a
constant which can be united with the constants A₁ and A₂):

Returning to the original v(θ) (cf. 13.6) and writing our result in terms of
z we obtain

and finally, in view of (2):

The undetermined constants A₁ and A₂ have to be adjusted to the initial
values, as they exist at the point θ = 0, that is on the real axis:

If we know how Jp(x) behaves for large real values of the argument, the
equation (11) will tell us how it behaves for large complex values. Our
problem is thus reduced to the investigation of Jp(x) for large real values
of x.

7.14. Asymptotic expression of Jp(x) for large values of x


In our investigation of the interpolation problem of the Bessel functions
(cf. Chapter 1.22), we encountered the following integral representation
of the Bessel functions which holds for any positive p:

With the transformation sin φ = t the same integral may be written in the
form

Moreover, since the integrand is an even function, the same integral may
also be written in the complex form:

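A standard (Poisson-type) form of this representation is J_p(x) = (x/2)^p/(√π Γ(p+½)) ∫₋₁¹ (1 − t²)^{p−1/2} e^{ixt} dt; a numerical confirmation, assuming Python with scipy (a sketch, not the text's own display):

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import jv, gamma

    # Only the even (cosine) part of e^{ixt} survives on [-1, 1]
    p, x = 2.0, 5.0
    integral, _ = quad(lambda t: (1.0 - t*t)**(p - 0.5) * np.cos(x*t), -1.0, 1.0)
    print((x/2.0)**p / (np.sqrt(np.pi)*gamma(p + 0.5)) * integral, jv(p, x))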
Now let x be large. Then we will modify the path of integration of the
variable t as follows:

We will first investigate the contribution of the path CD. Let us put

Then the function e^{−ixt} becomes

The first factor is a mere constant with respect to the integration in τ.
The second factor diminishes very rapidly to zero, due to the largeness of
x. Hence only the immediate neighbourhood of τ = 0 contributes to the
value of the integral. But if τ is small, we can put

and

The result of the integration can be written down as the constant factor

multiplied by the integral

(The limit of integration can be extended to infinity because the largeness of
x blots out everything beyond a small neighbourhood of τ = 0.) This
integral becomes Euler's integral of the factorial function if we introduce
xτ = ξ as a new variable:

The path AB contributes the same, except that all i have to be changed
to −i. The path BC contributes nothing since here the integrand becomes
arbitrarily small. The final result of our calculation is that the integral (3)
becomes

and thus we obtain the following asymptotic representation of Jp(x) for
large positive values of x:

    Jp(x) ≈ √(2/(πx)) cos(x − pπ/2 − π/4)
Although we have proved this asymptotic behaviour only for positive p,


it holds in fact for all p, as we can see from the recurrence relation

which holds for negative orders as well and which carries the asymptotic
relation (11) over into the realm of negative p.
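The quality of this representation is easily tested; a minimal numerical sketch, assuming Python with scipy (the asymptotic form is the standard one quoted above):

    import numpy as np
    from scipy.special import jv

    # J_p(x) ~ sqrt(2/(pi x)) * cos(x - p*pi/2 - pi/4) for large x
    x = 50.0
    for p in (0, 1, 5):
        asym = np.sqrt(2.0/(np.pi*x)) * np.cos(x - p*np.pi/2 - np.pi/4)
        print(p, jv(p, x), asym)   # close for small p; the error grows with p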

7.15. Behaviour of Jp(z) along the imaginary axis


The asymptotic relation (13.11) shows that Jp(z) for purely imaginary
values of z will have an exponentially increasing or decreasing character.
Whenever A₂ is present, the exponentially increasing part will swamp the
first term and the function will go exponentially to infinity for large values
of z = iy. Only for one particular linear combination of the two Bessel
functions Jp(x) and J−p(x) can we obtain an exponentially decreasing course.
This will happen if the constant A₂ is obliterated and that again demands—
as we can see from (13.12)—that the function shall go to infinity for real
values of x like e^{ix}/√x, without any intermingling of e^{−ix}/√x. Now, if
we write (11) in complex form:

we notice that the e^{−ix} part of Jp(x) can only be obliterated if we choose the
following linear combination of Jp(x) and J−p(x):

This function becomes for large values of x:

Accordingly in the formula (13.11) the constant A₂ drops out and we obtain
along the imaginary axis z = iy:

Apart from an arbitrary complex constant, the function Kp(z) is the only
linear combination of Bessel functions which decreases exponentially along
the positive imaginary axis.
Problem 304. Show on the basis of (12.10) that the function Kp(z) is real
everywhere along the imaginary axis.
Problem 305. Obtain the asymptotic value of the function

along the imaginary axis.


[Answer:

Problem 306. Obtain the asymptotic value of Mp(−iy) (cf. 12.14).
[Answer:

Problem 307. Obtain the asymptotic value of the combination

along the real and imaginary axes.


[Answer:

7.16. The Bessel functions of the order 1/3


We are now sufficiently prepared to return to the problem of Section 11.
We wanted to study the solution of the special differential equation (11.3)
which will enable us to join the periodic and exponential branches of the
KWB solution in the case that the function U(x) changes its sign. We have
found the solution in the form (11.4). Now we must dispose of the constants
A₁ and A₂ properly.
First of all we notice on the basis of (12.10-12) that our solution is in the
following relation to the entire function Ap(x²):

We thus see that v(x) is for any choice of the constants A₁ and A₂ an entire
analytical function of the complex variable z, regular throughout the complex
plane. The cuts needed in the study of the Bessel functions disappear com-
pletely in the resulting function (11.4).
We will now choose as fundamental solutions of our differential equation
the following combinations of Bessel functions:

The asymptotic behaviour of these two functions can be predicted on the


basis of the general asymptotic behaviour of the Bessel functions. First
of all we will move along the positive x-axis. Then we obtain for large
values of x:


We see that in the positive range of x the functions f(x) and g(x) represent
two oscillations (of variable amplitude and frequency) which have the
constant phase shift of π/2 relative to one another.
We now come to the study of the negative range of x, starting with the
function f(x). As we see from (1) (considering x as a positive quantity):

We will now make the correlation

Then by definition:

and thus

But then, making use of the asymptotic behaviour of Kp(iy) (see 15.4), we
obtain for large values of x:

We proceed quite similarly for g(−x):

For large values of x we obtain (on the basis of the asymptotic behaviour
of the function Ip(x) of Problem 307 (cf. 15.10)):

7.17. Jump conditions for the transition "exponential-periodic"


The differential equation (11.3), on which we have concentrated, is
characterised by

Hence

In the negative range we will put

The general KWB approximation in the negative range can be written as


follows. We select the point x = x₀ (in which U(x₀) = 0) as our point of
reference and write

In the positive range, on the other hand, we put

The four constants of these solutions cannot be independent of each other.


If we start with the exponential region and know from certain boundary
conditions the values of A₁ and A₂, the transition to the periodic region is
uniquely established, and thus A′₁ and A′₂ must be expressible in terms of
A₁ and A₂. On the other hand, if we start from the periodic region and
know on the basis of some boundary conditions the values of A′₁ and A′₂,
the transition back to the exponential range is once more uniquely deter-
mined and thus A₁ and A₂ must be expressible in terms of A′₁, A′₂.
If we could neglect the singular character of the point U(x₀) = 0, we
could argue that we should have

since in the exponents √(−U) and i√U are identical expressions. But this
argument is in fact wrong, because there is a gulf between the two types of
solutions which cannot be bridged without the proper precautions. We
have to use our function f(x) as a test function for the coefficient A₁ and
g(x) as a test function for the coefficient A₂. The comparison of the formulae

(16.9) and (16.3)—the latter written in complex form—yields the following


relation:

On the other hand, the comparison of the formulae (16.11) and (16.4)—
the latter written in complex form—yields:

and thus the complete relation between the two pairs of constants becomes

which means

The inverse relations become

7.18. Jump conditions for the transition "periodic-exponential"


If it so happens that U(x) changes its sign from plus to minus during its
transition through the critical point x₀, we can write the approximate
solutions once more in the form (17.7) and (17.8), but using now the formula
(17.8) for x < x₀ and the formula (17.7) for x > x₀. Once more we can
ask for the relation between the two pairs of constants A₁, A₂ on the one
hand and A′₁, A′₂ on the other hand. The new situation is reducible to the
previous one by changing x to −x. This means that A₁ and A₂, and simi-
larly A′₁ and A′₂, become interchanged. Accordingly, the relations (17.13)
and (17.14) have now to be formulated as follows:

while the reciprocal relations become

7.19. Amplitude and phase in the periodic domain


In the exponential domain the two branches of the KWB approximation
define two completely different analytical functions whose separation is
often of vital importance. The one is an exponentially increasing, the
other an exponentially decreasing function. In the periodic domain, how-
ever, the two branches of the solution represent two analytical functions of
the same order of magnitude whose separation is not advocated on natural
grounds. In most practical applications we deal with real solutions while
the two branches of the KWB approximation operate with complex quantities.
Hence it is preferable to combine the two complex branches into one real
branch by writing the solution in the form

The solution appears in the form of a vibration of variable amplitude and


frequency. The two constants of integration A′₁ and A′₂ are now replaced
by the amplitude factor C and the phase angle θ. The relation between the
two types of constants is obtained if we write (1) in complex form, obtaining

The relations (17.14) which hold for the transition from the exponential
to the periodic domain, now become:

and thus

These are the formulae by which the constants of the exponential domain are
determined if the constants of the periodic domain are given, and vice versa.
The transition occurs in the sequence: exponential-periodic. If the sequence
is the reverse, viz. periodic-exponential, we have to utilize the formulae
(18.2) which now give

and

7.20. Eigenvalue problems


In the application of differential equations to the problems of atomic
physics we often have problems characterised by certain homogeneous
boundary conditions. We demand for example that the function shall
vanish at the two end-points of the domain. Or we may have the situation
encountered in the differential equation of Gauss where the boundary points
are singular points of the domain and we demand that the function shall
remain bounded at the two end-points of our range. Generally such con-
ditions cannot be met without the freedom of a certain parameter, called
the "eigenvalue" of the differential equation. In the case of the hyper-
geometric series, for example, the two fundamental solutions go out of
bound at either the lower or the upper boundary. To demand that they
shall remain finite at both boundaries means that a certain restriction has to
be fulfilled by the parameters of the solution. Generally, if

is a homogeneous linear differential operator and we add the proper number


of homogeneous boundary conditions, we shall get as the only possible
solution the trivial solution u = 0, because the "right side" is missing
which excites the various eigenfunctions of the operator. These eigen-
functions are in the case of a self-adjoint operator (and all second-order
operators are self-adjoint, if we include the proper weight factor), defined by
the modified differential equation

which is once more homogeneous but now contains the parameter λ,


adjustable in such a way that the given homogeneous boundary conditions
shall be satisfied. That this is indeed possible, and with an infinity of
real λ's, follows from the general matrix treatment of linear differential
operators (as we have seen in Chapters 4 and 5). In fact, the solution of
the equation (2) yields those "principal axes" of the operator which establish
an orthogonal frame of reference in function space and make an expansion of
both right side and solution into eigenfunctions possible.
The very same eigenfunctions have, however, direct physical significance
in all vibration problems. It is the time-dependent part of the differential
operators of mathematical physics which provides the λv-term of the
differential equation (2) and thus yields those "characteristic vibrations",
which the physical system is capable of performing. In atomic physics the
eigenvalues have the significance of the various energy values associated with
the possible states of the atom, while the eigensolutions represent the various
"quantum states" in which the atom may maintain itself.
Before the advent of Schrödinger's wave-equation, in the epoch of Bohr's
atomic model, the various quantum states were interpreted on the basis of
certain "quantum conditions" which led to definite stable configurations
of the atom, associated with definite energy values. These "quantum
conditions" are closely related to the KWB approximation applicable to
Schrödinger's wave equation—although this connection was not known
before the discovery of wave-mechanics. In particular, the selection
principle which brings about the quantum conditions within the framework
of the KWB approximation lies in the condition that in the exponential
domain only the exponentially decreasing solution is permitted, since the
other solution would increase to infinity and is void of physical significance.
The close agreement of the older results of Bohr's atomic theory with the
later eigenvalue theory demonstrates the usefulness of the KWB method
in atomic problems. Although the wave-mechanical solution replaced the
approximative treatment by the exact solution and the exact eigenvalues,
the KWB method retains its great usefulness. Many of the fundamentally
important function classes of mathematical physics arise from Sturm-
Liouville problems, since the separation of the wave-equation in various
systems of coordinates leads to ordinary second-order differential equations
(as we will see in Chapter 8). Although the hypergeometric series provides
us with the exact eigenvalues and exact eigenfunctions in all these cases,
this is of little help if our aim is to study the actual course of the function
since the sum of too many terms would be demanded for this purpose
(except for the eigenfunctions of lowest order). We fare much better if
we possess a solution in closed form which, although only approximative,
can be handled in explicit terms. In the following sections we will see how
some of the fundamental function classes of mathematical physics can be
represented in good approximation with the help of elementary functions, on
the basis of the KWB solution.
7.21. Hermite's differential equation
The differential equation (2.18) of Hermite, if transformed in the normal
form, has a U(x) given by (5.39). It is this differential operator which

describes in wave-mechanics an "atomic oscillator". The differential


operator is here simply

and the associated eigenvalue problem becomes

The domain of our solution is the infinite range x = [−∞, +∞] and we
demand that the function shall not go to infinity as we approach the two
end-points x = ±∞.
Now this eigenvalue problem is solvable with the help of the hyper-
geometric series, after the proper transformations. The solution is well
known in terms of the "Hermitian polynomials" Hn(x) (cf. 2.19). But let
us assume that this transformation had escaped us and we would tackle our
problem by the KWB method. We see that if x stays within the limits
±√λ, we have a periodic, outside of those limits an exponential domain.
The KWB method requires the following integration:

We put

For the exponential range we need the integral

Here we make the substitution

Now the exponential solution associated with the constant A₁ increases
to infinity, while the exponential solution associated with A₂ decreases to
zero. It is this second solution which we admit but not the first one.
Hence we obtain a very definite selection principle by demanding that the
constant A₁ must become zero. This condition imposes a definite demand
on the phase constant θ which exists in the periodic range. We have now
(for x > 0) the sequence "periodic-exponential" and have to make use of
the relations (19.5). The requirement A₁ = 0 means

or

where k is an arbitrary integer. This means that the periodic solution must
arrive at the critical point U = 0 with a definite phase angle.
The solution appears in the periodic range according to (19.1) in the
following form:

So far only one of the constants, namely θ, has been restricted, but we
still have the constant C at our disposal and thus a solution seems possible
for all λ. We have to realise, however, that x assumes both positive and
negative values and the transition to the exponential domain occurs at both
points x = ±√λ. The second point adds its own condition, except if
v(x) is either an even or an odd function, in which case the two conditions
on the left and on the right collapse into one, on account of the left-right
symmetry of the given differential operator. In fact, this is the only chance
of satisfying both conditions. Now the solution (7) yields in the numerator

The condition that our function shall become even demands the vanishing of
the second term, which means

and thus

Similarly, the condition that our function shall become odd demands the
vanishing of the first term, which means

and thus

where k is an arbitrary integer. But then, since

we obtain for λ the following selection rules:


(even function)
(odd function)
Both conditions are included if we put

    λ = 2n + 1
where an even n belongs to the even, an odd n to the odd eigenfunctions.


A comparison with the exact treatment shows that in the present example
we obtained an exact result which is more than we should have expected,
in view of the purely approximative character of the KWB method. The
solution v(x) is in the present problem obtainable in terms of the "Hermitian
polynomials" Hn(x), in the form

The eigenvalue of Hermite's differential equation is exactly 2n + 1, in


agreement with our result (15).
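The selection rule can also be read as the phase-integral condition ∫√(λ − x²) dx = (n + ½)π taken between the turning points ±√λ; since the integral equals πλ/2, the choice λ = 2n + 1 satisfies it exactly. A numerical sketch (assuming Python with scipy; an illustration, not the text's own derivation):

    import numpy as np
    from scipy.integrate import quad

    # Phase integral between the turning points for U = lambda - x^2
    for n in range(4):
        lam = 2*n + 1
        a = np.sqrt(lam)
        phase, _ = quad(lambda t: np.sqrt(lam - t*t), -a, a)
        print(n, phase/np.pi)    # prints n + 0.5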

7.22. Bessel's differential equation


Another interesting example is provided by the celebrated differential
equation (2.4) which defines the Bessel functions. We bring this differential
equation in the normal form (7.13) and have thus

We consider the order p of the Bessel function to be an arbitrary real


number which, however, shall not be too small (not less than about 4), since
otherwise our approximation procedure becomes too inaccurate (the Bessel
functions of low order can be made amenable to the KWB method if we
first transform the defining differential equation into the more general form
(2.23). An example was provided by the Bessel functions of the order 1/3).
First of all we carry through the necessary integrations:

The second integral becomes manageable by the substitution


and we obtain

This holds in the periodic domain x > p₁. In the exponential domain
x < p₁ we obtain similarly

The substitution

reduces the second integral to

and the complete result, valid in the exponential domain, may be written
in the following form

The two fundamental KWB approximations in the exponential region can


thus be put in the following form:

The general solution has two free constants B₁ and B₂, associated with the
± signs in the exponent.
The point x = 0 is a singular point of our differential equation in which
U(x) becomes infinite. However, our approximation does not fail badly in
this neighbourhood. If we let x go towards zero, we find

We must remember that our function v(x) is not the Bessel function Jp(x)
but Jp(x)√x. Moreover, the Bessel functions Jp(x), respectively J−p(x),
assume by definition in the vicinity of zero the values

Hence we conclude that the two branches of our approximation belong to
the two Bessel functions Jp(x) and J−p(x). If B₁ alone is present, we shall
obtain Jp(x), if B₂ alone is present, we shall obtain J−p(x).* The error of
our approximation is very small in this neighbourhood since p and p₁ (for
not too small p) are very nearly equal (for example for p = 5 the relative
error is only ½%). The values of B₁ and B₂ can be determined by the
initial values (6), obtaining

We will now see what happens if we come to the end of the exponential
domain and enter the periodic domain. In order to make the transition,
we must put our solution in the form (17.7). But let the upper branch
with the + sign in the exponent be given in the more general form

Then the form (17.7) demands that we shall write this solution as follows:

which gives

Similarly

Now in our problem the point x₀ in which U(x) vanishes becomes the
point x = p₁. The value of K(x₀) can be taken from the form (3) of our
solution:

Let us first consider the case of Jp(x). Here only B₁ is present and we
obtain

The transition to the periodic range occurs according to the formulae (19.4)
which now gives

* That we were rash in this identification will be seen a little later.


Hence the representation of Jp(x)√x beyond the point x = p₁ becomes,
in view of (2):

It is of interest to compare this result with the previously obtained


asymptotic estimation (14.11) of the Bessel functions for large values of the
argument. If x goes to infinity, the argument of the cosine function in (15)
approaches

which differs from the corresponding quantity in (14.11) only by the fact
that p is replaced by p₁. This involves a very small error, as we have seen
before.
Another change can be noticed in the amplitude constant C. In the
traditional estimation C should have the value

while our result is

Now p₁ is very near to p and p! can be estimated on the basis of Stirling's
formula:

    p! ≈ √(2πp) (p/e)^p
Hence our C is near to

which is in agreement with the correct asymptotic value. Here again the
error is fairly small, e.g. for p = 5 not more than 5.7%. It is frequently
more advisable, however, to make the amplitude factor C correct in the
periodic range and transfer the amplitude error to the exponential domain.
In this case the approximation of Jp(x) will be given as follows:
for x > p₁:

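A plausible closed form of this representation (with the phase integral ∫ from p₁ to x of √(1 − p₁²/t²) dt evaluated explicitly, and the phase shift −π/4 supplied by the transition) is J_p(x) ≈ √(2/π)(x² − p₁²)^{−1/4} cos(√(x² − p₁²) − p₁ arccos(p₁/x) − π/4), with p₁ = √(p² − ¼). A numerical sketch of this reconstruction, assuming Python with scipy:

    import numpy as np
    from scipy.special import jv

    def jp_kwb(p, x):
        # KWB approximation beyond the turning point x = p1, amplitude
        # made correct in the periodic range
        p1 = np.sqrt(p*p - 0.25)
        phase = np.sqrt(x*x - p1*p1) - p1*np.arccos(p1/x) - np.pi/4
        return np.sqrt(2.0/np.pi) * (x*x - p1*p1)**(-0.25) * np.cos(phase)

    for x in (10.0, 15.0, 20.0):
        print(x, jp_kwb(5.0, x), jv(5, x))   # errors of a few per cent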
We now come to the discussion of J−p(x). Here we obtain instead of (13)


the new constants

The transition to the periodic range occurs once more on the basis of the
formulae (19.4) which now gives

The representation of J−p(x)√x beyond the point x = p₁ becomes

For large values of x the cosine part of the function becomes

which is replaceable by

while in actual fact the periodicity factor of J−p(x) should come out as

Hence we have arrived in the periodic range with the wrong phase.
Let us investigate the value of the constant C. Disregarding the small
difference which exists between p and p₁ and making use of Stirling's formula
(19) we obtain

Now we can take advantage of the reflection theorem of the Gamma function
which gives

Hence

Here again our result is erroneous since the amplitude factor of √x J−p(x)
at infinity is √(2/π), without the factor sin pπ.
The phenomenon here encountered is of considerable interest. We have
tried to identify a certain solution of Bessel's differential equation by
starting from a point where the solution went to infinity. But the differential
equation has two solutions, the one remaining finite (in fact going to zero
with the power x^p), the other going to infinity. Now the solution which
remains finite at x = 0 allows a unique identification since the condition of
finiteness automatically excludes the second solution. If, however, we try
to identify the second solution by fitting it in the neighbourhood of the
singular point x = 0, we cannot be sure that we have in fact obtained the
right solution since any admixture of the regular solution would remain
undetected. The solution which goes out of bound swamps the regular
solution.
What we have obtained by our approximation is thus not necessarily
J−p(x) but

where a is an arbitrary constant. The behaviour of v(x) in the periodic


range shows that it is the Neumann function Np(x) to which our solution is
proportional:

The constant a in (30) is thus identified as −cos pπ. The Neumann


function Np(x) has the property that for large x its periodicity is given by

Hence the function −Np(x) has a periodicity which agrees with (26).
Moreover, the amplitude factor (29) is explained by the fact that it is not
−Np(x) itself but −(sin pπ)Np(x) that has been represented by our
approximation.
If again we agree that the amplitude factor C shall become correct in the
periodic region and only approximately correct in the exponential region,
we obtain the following approximate representation of the Neumann function
Np(x):
for x > p₁:

7.23. The substitute functions in the transitory range


While we have obtained a useful approximative solution of a second
order differential equation (after we have transformed it in its normal form),
there is nevertheless the difficulty that our solution loses its significance if
we come near to the point U(x₀) = 0 where the amplitude factor becomes
infinite. This difficulty can be overcome, however, by the following artifice.
Before U(x) vanishes, there is a certain finite region in which U(x) allows
a linear approximation. In this region we were able to solve the resulting
differential equation (11.1) with the help of Bessel functions of the order
1/3. But if this is so, we can proceed as follows. Before our approximate
solution goes out of hand, we have a KWB approximation which can be
matched to the KWB approximation of the differential equation (11.1).
Having performed the matching we can now discard our KWB approxima-
tion and substitute for it the solution of the differential equation (11.1).
This substitute solution is now valid in the transitory region and with this
solution we have no difficulty, even if we come to the critical point x = x₀
and go beyond. It will thus be sufficient to tabulate four auxiliary functions
which we have to substitute in the place of the KWB approximation in the
vicinity of the critical point x = x₀. We need four such functions because
we have in the exponential range the constants A₁ and A₂ which are multi-
plied by the two branches of the KWB solution and we have to know what
functions we have to substitute as factors of A₁ and A₂. We will call these
functions φ₁e(x), respectively φ₂e(x). But then we have in addition the
solution in the periodic range, of the form

For this we may put

and thus we have to tabulate the two substitute functions which will become
the factors of C cos θ and C sin θ in the transitory region. These latter
functions shall be called φ₁p(x) and φ₂p(x).
Let us first normalise the constant a of the differential equation (11.1) to
9/4, as we have done in (11.3). We have chosen the two functions (16.2)
as the two fundamental solutions of our differential equation. Moreover,
we have seen by the formula (16.9) that the upper branch of the KWB
approximation will go with f(−x). However, the asymptotic solution
should become

The comparison with (16.9) shows that the function φ₁e(x) should become
identified with

It will be more natural, however, to normalise the constant a of the


differential equation (11.1) to 1. This change of the constant is equivalent
to the transformation

Furthermore, in order to arrive at the proper amplitude in the asymptotic


range, we have to multiply our function by (2/3)^{1/3} and the final expression
for φ₁e(x) becomes

In a similar fashion we obtain with the help of (16.11):

In the periodic range the formulae (16.3) and (16.4) come in operation
and our final result becomes:

The method of computing these four functions will be given in the following
section.
Generally the differential equation which is valid in the transitory range
will be of the form (11.1) with a constant a which is not 1 but U′(x₀). The
transition to the general case means that our previous x has to be replaced by

Hence in the general case, where the value of x₀ and U′(x₀) is arbitrary, the
substitute functions have to be taken with the argument U′(x₀)^{1/3}(x − x₀).
Moreover, a constant factor has to be applied in order to bring the
asymptotic representation of these functions in harmony with the KWB
approximation. This factor is |U′(x₀)|^{−1/6}.
Let us first assume that U′(x₀) > 0. We then have the transition

exponential to periodic. We can now obtain the substitute functions of the


transitory region according to the following table:
1. factor of A₁:

2. factor of A₂:

3. factor of C cos θ:

4. factor of C sin θ:

Let us assume, on the other hand, that the transition occurs in the sequence
periodic to exponential. In this case the correlation occurs as follows:
1. factor of A₁:

2. factor of A₂:

3. factor of C cos θ:

4. factor of C sin θ:

As an example let us consider the case of the Bessel functions Jp(x) (with
positive p), studied before. We want to determine the value of Jp(p₁), or
still better the value of the function v(x) = √x Jp(x), at the transition point
x = x₀. Here we have

Furthermore, the definition of the functions f(x) and g(x) according to


(16.2) yields

We substitute these values in (8) and (9):

Now in our problem we have according to (22.21):

and thus

We have to make use of the table (11) which now gives

and going back to Jp(x):

If, on the other hand, the Neumann function Np(x) is in question, we have
(cf. (22.33))

and the table (11) gives (in view of (15))

Example. As an example we substitute for p the values 4, 6, 8, 10, and 12,


and make the comparison of our approximation with the corresponding

exact values of the Bessel functions Jp(x), taken at the point p₁. The
values of p₁ now become:
3.9686, 5.9791, 7.9843, 9.9875, 11.990
Substitution in the formula (16) yields

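At the turning point itself the substitute functions reduce, in modern (Airy) language, to the value Ai(0), which suggests the rough estimate J_p(p₁) ≈ (2/p)^{1/3} Ai(0); a sketch of the comparison, assuming Python with scipy (an illustration of the same table, not the book's formula (16)):

    import numpy as np
    from scipy.special import airy, jv

    AI0 = airy(0.0)[0]             # Ai(0) = 0.35503
    for p in (4, 6, 8, 10, 12):
        p1 = np.sqrt(p*p - 0.25)   # the transition points listed above
        print(p, round(p1, 4), round((2.0/p)**(1.0/3.0)*AI0, 4),
              round(jv(p, p1), 4))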
7.24. Tabulation of the four substitute functions


The numerical evaluation of the four functions φ₁e(x), φ₂e(x), φ₁p(x),
φ₂p(x) causes no difficulties since all these functions are entire functions
and possess a Taylor expansion which in the region concerned has satis-
factory convergence. Certain linear combinations of these four functions
exist, however, in tabulated form* and we can make use of these tables
in order to obtain our functions. The actually tabulated functions have a
real and an imaginary part, denoted by R(h₁) and I(h₁). We have to take
the following linear combinations of the tabulated functions in order to
obtain our functions φ(x):

This means in numerical terms:

The Table III of the Appendix gives the numerical values of these four
* Tables of the modified Hankel functions of order one-third and of their derivatives
(Harvard University Press, Cambridge, Mass., 1945); cf. in particular the case y = 0
on pp. 2 and 3.

functions, in intervals of 0.1, for the range x = [0, −3], respectively [0, 3].
Beyond this range no substitution is demanded since the KWB approxima-
tion becomes sufficiently accurate.
Problem 308. Obtain with the help of the tables the values of J₅(4.7), J₅(5),
J₅(5.2) and likewise J₁₀(9.5), J₁₀(10), J₁₀(10.5).
[Answer:
J₅(4.7) = 0.2307    J₅(5) = 0.2661    J₅(5.2) = 0.2977
exact:  (0.2213)    (0.2611)          (0.2865)
J₁₀(9.5) = 0.1695   J₁₀(10) = 0.2086  J₁₀(10.5) = 0.2459
exact:  (0.1650)    (0.2075)          (0.2477) ]
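The same values can be reproduced with the modern Airy form of the transition approximation, J_p(p + t·p^{1/3}) ≈ (2/p)^{1/3} Ai(−2^{1/3} t); a sketch, assuming Python with scipy (an equivalent of the tables under this assumption, not the tables themselves):

    import numpy as np
    from scipy.special import airy, jv

    def jp_transition(p, x):
        # Airy-type approximation in the transition region around x = p
        t = (x - p) / p**(1.0/3.0)
        return (2.0/p)**(1.0/3.0) * airy(-2.0**(1.0/3.0) * t)[0]

    for p, x in [(5, 4.7), (5, 5.0), (5, 5.2),
                 (10, 9.5), (10, 10.0), (10, 10.5)]:
        print(p, x, round(jp_transition(p, x), 4), round(jv(p, x), 4))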
Problem 309. The x-value at which φ₁p + φ₂p, respectively φ₁p − φ₂p, first
vanishes, is x = 2.3381, resp. 1.1737. Find accordingly the approximate position
of the first zero of Jp(x) and Np(x). In the case of Jp(x) compare the result
with the exact position of the first zero for p = 5, 7, 9, and 10.
[Answer:

The relatively poor agreement indicates the presence of a systematic error which
will be investigated in the next section.]

7.25. Increased accuracy in the transition domain


In using the substitute functions φ(x) in the vicinity of the point x₀ we
observe that we obtain good results only in a very narrow interval around
x₀. As soon as we depart from the transition point x = x₀ to a substantial
degree, our results deviate considerably from the true functional values.
For example the position of the first zero of the Bessel functions involves
the x-value 2.34 of the functions φ₁p(x) and φ₂p(x) but the answers do not
agree well with the actual position of the zeros.
The reason for this phenomenon is that the purely linear approximation
of U(x) in the neighbourhood of x = x₀ holds only within a very small
interval. The question arises whether we could not extend this interval to
a somewhat larger domain, without losing too much in simplicity.
If we expand U(x) in the neighbourhood of U(x₀) = 0, we obtain

The higher order terms shall be neglected but not the quadratic term. We
will briefly put

Then the given differential equation becomes in the transitory region:

On the other hand, our four functions φ(x) satisfied the differential equation

It will be our aim to bring the two differential equations (3) and (4) in
harmony with each other. For this purpose we establish a relation between
the variables x and ξ. While before we have assumed that x and ξ are
simply proportional to each other, we will now apply a correction term and
put

considering ε as a small quantity the square of which may be neglected.


Let us first investigate the transformation of the independent variable in
general terms. The transformation

means that the operation of differentiating has to be changed as follows:

Our differential equation

thus becomes in the new variable ξ:

which gives

Now we make use of the method of taking out a proper factor in order to
obliterate the first order term (cf. 7.9-10). We put

and obtain for v₁(ξ) the new equation

where

We obtain from (10):

Moreover, our postulated relation (5) gives

and thus, neglecting quantities of second order in ε:

Since ε is a constant and thus ε′ = 0 (and since ε² is negligible, being of
second order in ε), the differential equation (11) becomes:

The factor of v(ξ) becomes, neglecting quantities of second order:

The corresponding factor of the differential equation (3) may be written in the
form

and we obtain agreement by the choice

Finally, going back to the relation (5):

Considering (12) the final result becomes that in the transition domain we
have to use the φ-functions in the following manner:

For the sake of increased accuracy the tables (23.11) and (23.12) have to
be modified according to this correction. The correction is in fact quite
effective. Let us obtain for example once more the first zero of Jp(x).
This demands the first zero of the function φ₁p + φ₂p which is at the point
x = 2.3381. Now in the present case

and we obtain the condition

which yields with sufficient accuracy

The same condition for the Neumann functions Np(x) becomes

The last term is the correction which has to be added to our previous formulae
(24.5). The corrected values of the zeros obtained in Problem 309 now
become

The agreement is now quite satisfactory.
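The corrected condition reproduces, to this order, the standard expansion of the first Bessel zero, j_{p,1} ≈ p + 1.85575 p^{1/3} + 1.03315 p^{−1/3}; note that 1.85575 = 2.33811/2^{1/3}, the x-value 2.3381 above being, in Airy language, the first zero of Ai(−x). A check against the exact zeros, assuming Python with scipy (the expansion coefficients are the standard ones, not the book's notation):

    import numpy as np
    from scipy.special import jn_zeros

    for p in (5, 7, 9, 10):
        approx = p + 1.855757*p**(1.0/3.0) + 1.033150*p**(-1.0/3.0)
        print(p, round(approx, 4), round(jn_zeros(p, 1)[0], 4))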


Problem 310. Obtain with the help of the corrected formula the values of
J₅(4.5) and J₅(5.5) and likewise the values of J₁₀(9), J₁₀(9.5), J₁₀(10.5), J₁₀(11).
[Answer:
J₅(4.5) = 0.1993    J₅(5.5) = 0.3118
        (0.1947)            (0.3209)
J₁₀(9) = 0.1262, J₁₀(9.5) = 0.1664, J₁₀(10.5) = 0.2493, J₁₀(11) = 0.2817
       (0.1247)           (0.1650)            (0.2477)          (0.2804)]

7.26. Eigensolutions reducible to the hypergeometric series


The differential operator associated with the hypergeometric series (2.7)
is given as follows:

where we have put

The factor A(x) of v″(x) vanishes at the two points x = 0 and 1. These
points are thus natural boundary points of the operator which limits the
range of x to [0, 1]. According to the general theory the weight factor
w(x) of our operator becomes (cf. 4.7)

The eigenvalue equation associated with our operator assumes the form

(if we agree that the eigenvalues are denoted by −λ in order to make λ
positive). According to the differential equation of Gauss (cf. 2.6) this
equation is solvable in terms of the hypergeometric function F(α, b − α − 1,
a; x) if the following identification is made:

The boundary term of Green's identity associated with our operator


becomes

This term has the peculiarity that it vanishes without any imposed conditions
on u and v, due to the vanishing of the first factor. In fact, however, this
implies that v(x) and v′(x) remain finite at the points x = 0 and 1. Since the
points x = 0 and 1 are singular points of our differential operator where the
solution goes out of bound if no special precautions are taken, the very
condition that v(x) and v'(x) must remain finite at the two end-points of the
range, represents two homogeneous boundary conditions of our differential
operator which selects its eigenvalues and eigenfunctions. In particular,
the hypergeometric function F(α, β, γ; x) goes to infinity at the point x = 1,
except in the special case that the series terminates automatically after a
finite number of terms. This happens if the parameter α (or equally β but
this does not give anything new since F is completely symmetric in α and
β) is equated to a negative integer −n. Then the eigenvalues λₙ become
(according to (5)):

Hence the eigenfunctions of our operator become simply polynomials of the


order n:

The integer n can assume all values between 0 and infinity:

The polynomials thus generated are called "Jacobi polynomials".


Certain choices of the parameters a and b of the operator (1) lead to
particularly interesting function classes. For example the choice a = 1
brings about the weight factor

This weighting has the peculiarity that it puts the emphasis on the
neighbourhood of the origin x = 0 (for b > 2). With increasing b we
obtain a weight factor which is practically exponential since for large b
we obtain practically

This kind of weighting occurs in the Laguerre polynomials which we will


consider in Section 29.
Problem 311. By substituting in the differential equation (2.6) of Gauss

show that a second solution is obtainable in the form

Problem 312. Let in (10) b go to infinity like 1/μ, but at the same time transform
x to μx₁. Show that in the limit the weight factor w(x) becomes e^{−x}.

7.27. The ultraspherical polynomials


Let us put the origin of our reference system in the middle of the range by
putting x = ½ + ξ. Now the weight factor (26.3) changes to

A particularly interesting mode of weighting arises if we maintain left-right


symmetry. This demands the condition

which means

or in terms of the original parameters (26.2):

It is more convenient to change the range ±½ of ξ to ±1, and that means
that our original x is transformed into

Now the operator (1) becomes in the new variable and under the condition
(3):

The weight factor w(x) is now

and the eigensolutions are given by the polynomials

with the eigenvalues

This special class of Jacobi polynomials received the name "ultraspherical".


These polynomials are related to each other by many interesting analytical
properties but from the standpoint of applications two special cases are of
particular importance. The one corresponds to the choice γ = ½ (and thus
the weight factor (1 − x²)^{−1/2}):

They are called "Chebyshev polynomials". They have many outstanding


advantages in approximation problems* and have the further advantage
that they are representable in terms of elementary functions. If we put
x = cos θ, these polynomials become simply

thus establishing a relation between polynomial expansions and the Fourier


cosine series.
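The identity Tn(cos θ) = cos nθ is immediate to confirm numerically; a minimal sketch, assuming Python with scipy:

    import numpy as np
    from scipy.special import eval_chebyt

    # T_n(cos theta) = cos(n theta): Chebyshev expansions are Fourier
    # cosine series in disguise
    theta = np.linspace(0.1, 3.0, 7)
    for n in (3, 6):
        print(n, np.max(np.abs(eval_chebyt(n, np.cos(theta)) - np.cos(n*theta))))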
The other fundamentally important choice is γ = 1, in which case the
weight factor w(x) becomes 1 and we obtain a set of polynomials which are
orthogonal in the range ±1 without any weighting. They are called
"Legendre polynomials".

These polynomials have important applications in potential theory, in


least square problems, in Gaussian quadrature, in statistical investigations,
and hence deserve special attention. We will apply the KWB method to
the defining differential equation and thus obtain an approximation of these
important functions which sheds interesting light on their general analytic
behaviour. The fact that these functions are obtainable in polynomial
form, does not mean that we can handle them easily since a polynomial of
high or even medium order is not directly amenable to analytical studies.
* Cf. A. A., Chapter 7.

7.28. The Legendre polynomials


We shall not assume in advance that the eigenvalues of Legendre's
differential equation are given. We shall prefer to consider the equation
in the form

(replacing the undetermined constant λ by α²) and find the eigenvalues in


the course of our investigation. Our first aim will be to transform our
equation into the normal form in which the first derivative is missing. We
could do that by splitting away a proper factor as we have seen it in the
discussion of Hermite's differential equation and likewise in dealing with
Bessel's differential equation. However, we will prefer to follow a slightly
different course. Instead of splitting away a proper factor we want to
change the independent variable. Let us multiply the entire equation by
(1 − x²):

If we now introduce a new variable ξ by putting

then our equation in ξ becomes

It will be convenient to introduce an additional angle variable θ by putting

Then

Moreover,

The constant β can be determined if we consider that an increase of x from
0 to 1 means that θ decreases from π/2 to 0. If we want to integrate from

x = 0, we have to put β = α(π/2) and we obtain our approximate solution


in the form

where the constant c is arbitrary and the cosine has to be chosen if an


even, the sine if an odd function of x is involved.
Up to now we see no reason why α should be restricted to some exceptional
values. On the other hand, we do not know yet how our solution will
behave if we approach the critical point θ = 0. Here our approximation
goes out of hand because we come to the point x₀ in which U(x₀) vanishes.
Generally this point causes no difficulty and leads to no singularity. We
merely replace our solution by the substitute functions φ(x) which we have
studied in the previous sections. At present, however, this method is not
available since the vanishing of

occurs at the point ξ = ∞ and thus recedes to infinity. We have to find
some other method to study our solution as ξ becomes large.
Now for large ξ we have practically

since the 1 becomes negligible in the denominator of (9) if ξ grows large.


Let us then put

Hence in the new variable t our differential equation (for large ξ) becomes

But this is Bessel's differential equation for the order n = 0, taken at the
point 2αt (cf. 2.20):

Generally a differential equation of second order has two solutions and in


actual fact the differential equation (11) has a second solution N₀(t) which,

however, goes logarithmically to infinity at t = 0, i.e. ξ = ∞. The demand


that our solution shall remain finite at x = 1—i.e. ξ = ∞—excludes the
second solution and restricts the function v(t) to J₀(2αt), except for a factor
of proportionality. But this factor is in our case equal to 1 because the
Legendre polynomials Pn(x) are normalised by the condition

    Pn(1) = 1
This now means that v(0) must become 1, but then the factor of proportion-
ality of J₀ must become 1 in view of the fact that Jp(x) starts with the value

    (x/2)^p / p!

which becomes 1 for the case of J₀(x).


Now J₀(x) assumes, for an x which is not too small, the asymptotic law

    J₀(x) ≈ √(2/(πx)) cos(x − π/4)

which in our case becomes

Since for small values of θ the tangent becomes practically the angle itself,
we obtain for small θ:

This solution can be linked to the solution (8) which is valid in the domain
of larger θ. Let us first assume the case of an even function. Then the
matching of the cosine factors demands the condition

where k is an arbitrary integer. This gives

The constant c is then determined to



Let us now assume the case of an odd function. The solution can now be
written in the form

and the matching of the cosine factor yields the condition

which gives

while c becomes once more

The case of both even and odd functions is included in the selection rule

where n is an integer, with the understanding that all even n demand the
choice cosine and all odd n the choice sine in the general expression (8).
The determination of the eigenvalues α according to (22) is not exact
but very close. The exact law for α is

    α = √(n(n + 1))
Hence the approximation (22) is quite satisfactory.


Of particular interest is the distribution of the zeros of Pn(x). In order
to study this distribution and make a comparison with the zeros of the
Chebyshev polynomials, it will be appropriate to change θ to the comple-
mentary angle by putting

Then the expression for the polynomials of even order becomes

while the polynomials of odd order become



The corresponding expressions for the Chebyshev polynomials become (not


in approximation but exactly):

The zeros of the Chebyshev polynomials of odd order follow the law

while the Gaussian zeros follow (in close approximation) the law

The difference is that in the Chebyshev case the full circle is divided into
4μ + 2 = 2n equal parts and the points projected down on the diameter.
In the Gaussian case the full circle is divided into 4μ + 3 = 2n + 1 equal
parts and again the points projected down on the diameter.
In the case of the polynomials of even order the zeros of the Chebyshev
polynomials become

while the Gaussian zeros become

Now the half circle is divided into 4μ = 2n, respectively 4μ + 1 = 2n + 1
equal parts and the points are projected down on the diameter, but skipping
all points of even order 2, 4, ... and keeping only the points of the order
1, 3, 5, ..., n − 1.
The asymptotic law of the zeros is remarkably well represented even for
small n. We obtain for example for n = 4, 5, 6, 7, 8 the following
distribution of zeros, as compared with the exact Gaussian zeros:
n = 4              n = 5
0.3420 (0.3400)    0.5406 (0.5385)
0.8660 (0.8611)    0.9096 (0.9062)

n = 6              n = 7              n = 8
0.2393 (0.2386)    0.4067 (0.4058)    0.1837 (0.1834)
0.6631 (0.6612)    0.7431 (0.7415)    0.5265 (0.5255)
0.9350 (0.9325)    0.9510 (0.9491)    0.7980 (0.7967)
                                      0.9618 (0.9608)
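The projection rule is easy to apply directly: the positive zeros of Pn(x) are approximately sin(kπ/(2n + 1)) with k = n − 1, n − 3, ... > 0. A comparison with the exact Gaussian zeros, assuming Python with scipy (a sketch of the rule described above, which reproduces the table):

    import numpy as np
    from scipy.special import roots_legendre

    for n in (4, 5, 8):
        k = np.arange(n - 1, 0, -2)                  # k of the parity of n-1
        approx = np.sort(np.sin(k*np.pi/(2*n + 1)))  # projected circle points
        exact = roots_legendre(n)[0]
        print(n, np.round(approx, 4), np.round(exact[exact > 1e-9], 4))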
In order to test the law of the amplitudes we will substitute x = 0. Then
SEC. 7.28 THE LEGENDRE POLYNOMIALS 417

we obtain for the polynomials of even order n = 2μ the following starting
values:

while the derivative of the polynomials of odd order n = 2μ + 1 at the


point x = 0 becomes:

The Legendre polynomials have the property that they are derivable from
a "generating function" in the following way:

    1/√(1 − 2xt + t²) = Σ Pn(x) tⁿ        (n = 0, 1, 2, ...)

From this property of Pn(x) we derive the following exact values of P₂μ(0)
and P′₂μ₊₁(0):

If we make use of Stirling's formula (22.19) which approximates n!
remarkably well even in the realm of fairly small n, we obtain

Within this accuracy the value of P₂μ(0) coincides with that given in (32),
except that μ + ¼ is replaced by μ, while in the case of P′₂μ₊₁(0) we find
that 4μ + 3 is replaced by 4μ + 4.
The numerical comparison shows that in the realm of n = 5 to n = 10
we obtain on the basis of the asymptotic formula the following initial
values, respectively initial derivatives (the numbers in parenthesis give the
corresponding exact values):
n = 6              n = 8              n = 10
Pₙ(0) = −0.3130    0.2737             −0.2462
       (−0.3125)   (0.2734)           (−0.2461)

n = 5              n = 7              n = 9
P′ₙ(0) = 1.8712    −2.1851            2.4592
        (1.8750)   (−2.1875)          (2.4609)
We see that the asymptotic law gives very good results even in the realm
of small n (starting from n = 5).
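With the KWB amplitude (32) read as P₂μ(0) ≈ (−1)^μ/√(π(μ + ¼)) (a reading consistent with the numbers in the table), the comparison is quickly repeated; a sketch, assuming Python with scipy:

    import numpy as np
    from scipy.special import eval_legendre

    # KWB starting value against the exact polynomial value at x = 0
    for mu in (3, 4, 5):
        n = 2*mu
        kwb = (-1)**mu / np.sqrt(np.pi*(mu + 0.25))
        print(n, round(kwb, 4), round(eval_legendre(n, 0.0), 4))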

7.29. The Laguerre polynomials


Another eminently important class of polynomials which is closely related
to the theory of the Laplace transform, was introduced in analytical research
by the French mathematician Laguerre. The Laguerre polynomials are
associated with the range x = [0, ∞] and they are orthogonal with respect
to the weight factor e^{−x}. The Laguerre functions are thus the functions

They are orthogonal in themselves, without any weight factor:

Moreover, they have the remarkable property that their "norm" becomes 1:

if we prescribe the following simple initial condition for the polynomials
Lₖ(x):

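Both properties are quickly confirmed numerically with the customary normalisation Lₖ(0) = 1 (which is also scipy's, and which indeed makes the norm 1); a sketch, assuming Python with scipy:

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import eval_laguerre

    # Orthonormality of the Laguerre functions e^(-x/2) L_k(x) on [0, inf)
    for j, k in [(2, 2), (2, 5), (4, 4)]:
        val, _ = quad(lambda x: np.exp(-x)*eval_laguerre(j, x)*eval_laguerre(k, x),
                      0.0, np.inf)
        print(j, k, round(val, 6))    # 1 for j = k, 0 otherwise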
We have encountered these polynomials earlier, in Chapter 1, when dealing


with the Gregory-Newton type of equidistant interpolation, on account of
their outstanding properties in the calculus of finite differences. At that
time we could draw our conclusions without any reference to the differential
equation that these polynomials satisfy. In our present discussions we
consider them as a special case of the eigenfunctions which arise from the
hypergeometric series and will carry through the same kind of treatment
which we applied to a close approximation of the other kinds of hyper-
geometric polynomials.
The Laguerre polynomials are in many respects a counterpart to the
Legendre polynomials. Any given finite range x = [a, b] can be normalised
to the range [−1, +1] by a simple linear transformation of x.
infinite range, however, such a transformation is not possible. It is still
possible, however, to change any range x = [a, oo] into the range [0, oo]
by a proper linear transformation of x. Since the powers of x are unbounded
and grow to infinity if x approaches infinity, we cannot use them for
approximation purposes without a proper weight factor which will cut down
the contribution from the domain which is very far out. The most natural
weighting for this purpose is the factor e~ax and since a can always be
normalised to 1 by a proper scaling of x, we come automatically to the
definition of the Laguerre polynomials and the associated Laguerre functions
(1). We have seen in (7.14) that the function

satisfies the following differential equation:

We prefer once more to write this equation in the form of an eigenvalue


problem:

leaving the eigenvalue λ at first undetermined.


In the present problem

    U(x) = 1/(4x²) + (λ + ½)/x − ¼,

and we see that for small values of x we are in the periodic, for large values
in the exponential domain. The dividing point U(x₀) = 0 is determined by
the root of a quadratic equation. We will simplify our task, however, by
the observation that for a sufficiently large λ the first term of U(x) will
quickly become negligible. We fare better if we do not neglect the first
term but combine it with the second term in the form

Then, if x is not too small, we have practically

and we can consider

as a sufficiently close representation of our problem, except in the realm


of very small x. Here, however, another simplification takes place. In the
range of very small x the constant term ¼ becomes negligible and we can put

But then we have a differential equation which belongs to a recognisable


class. It is of the form (2.23) which is solvable in terms of Bessel functions.
In order to reduce (11) to the form (2.23), the following choice of the constants
has to be made:

This gives

and our solution becomes

The second solution, √x N₀(√λ x), has to be rejected since we know that
x^{−1/2} v(x) must remain finite at x = 0. Moreover, the solution (13) is
already properly normalised in view of the condition (4) which holds for all
Laguerre polynomials. We have thus a situation similar to that encountered
in the study of the Legendre polynomials around the point x = 1.
We now come to the U(x) given by (10) and the KWB approximation
associated with it. We will put

Then

and the KWB solution becomes

with the two constants of integration C and δ.


On the other hand, the asymptotic law of the Bessel functions gives on
account of (13):

and we have to link the solution (17) to (18) for sufficiently small values of x.
Now small values of x mean small values of φ. But in the realm of
small φ the argument of the cosine function in (17) is replaceable by 2λφ − δ,
and this quantity becomes, if we go back to the original variable x on the
basis of (14)—neglecting in this realm the small constant μ/2:

The comparison with (17) shows that we must put

On the other hand, the amplitude factor becomes according to (18):



This determines the factor C of (17) as

and we obtain the solution

Now the transition point x₀ at which U(x₀) becomes zero belongs to the
value φ = π/2. Hence the phase angle δ becomes, according to (21.7):

The transition into the exponential domain must be such that the branch
with the positive exponent vanishes, since otherwise our solution would go
exponentially to infinity; this is prohibited since we know that our solution
must go exponentially to zero as x grows to infinity. This means A₁ = 0,
and we see from (19.5) that this condition demands

where n is an arbitrary integer. The comparison of (22) and (23) gives the
selection rule

    λ = n,
and this is the exact eigenvalue of Laguerre's differential equation (6). Once
more, as in the solution of Hermite's differential equation, the KWB method
leads to an exact determination of the eigenvalue which is generally not to
be expected, in view of the approximate nature of our integration procedure.
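That λ = n is exact can be confirmed numerically. The sketch below
(assuming Python with numpy; it uses the normal form
U(x) = 1/(4x²) + (λ + ½)/x − ¼ as reconstructed above) checks that
v = √x e^{−x/2} Lₙ(x) annihilates v″ + Uv:

    # Residual of v'' + U v for the Laguerre function in normal form.
    import numpy as np
    from numpy.polynomial.laguerre import Laguerre

    n, h = 4, 1e-4
    x = np.linspace(0.5, 12.0, 24)

    def v(t):
        return np.sqrt(t) * np.exp(-t / 2) * Laguerre.basis(n)(t)

    vxx = (v(x + h) - 2 * v(x) + v(x - h)) / h**2     # central difference
    U = 0.25 / x**2 + (n + 0.5) / x - 0.25
    print(np.max(np.abs(vxx + U * v(x))))             # ~ 0 up to O(h^2)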
The constant of the negative branch of the exponential domain becomes,
according to (19.5):

The transition occurs at the point

For the exponential range we will make the substitution



The solution in the exponential range becomes

In order to obtain the Laguerre functions Ψₙ(x), we have to divide by √x.


The final solution can thus be written down as follows:

Problem 313. Obtain the value of Ψₙ(x) at the transition point x₀.
[Answer:

Problem 314. Obtain the position of the last zero of Ψₙ(x).


[Answer:

7.30. The exact amplitude equation


The representation of the solution of a second-order linear differential
equation in terms of an oscillation of variable frequency and variable
amplitude was based on a definite approximation procedure, called the KWB
method. We may ask whether this separation into a purely oscillatory
part and a variable amplitude part could not be put on a rigorous basis,
omitting all reference to an approximation. This is indeed possible and can
be achieved as follows. We start again with the normal form of our
differential equation:

    v″ + U v = 0
(omitting any specific reference to the argument x which we take for


granted). We shall take out a proper amplitude factor C(x) by putting

Then

and our differential equation becomes, if we divide through by C:



Let us put

    U₁ = U + C″/C.
We will now again make use of the exponential transformation

obtaining once more a Riccati type of differential equation:

But now we can assume that φ is purely imaginary since we have split away
the amplitude C(x) and what remains is a pure oscillation with constant
amplitude. But then we can put

considering y₁ as a real quantity. Then the differential equation (6)


separates into a real and imaginary part and splits into the two equations

The second equation is integrable in the form

    C² y₁ = C₀,

where C₀ is an arbitrary constant.


We have thus obtained the following solution:

which we may write in the real form

with the two constants of integration C₀ and θ.


We recognise the solution (11) once more as the KWB solution of the
given differential equation, if we are in the periodic range U₁ > 0. The

difference is, however, that the function U(x) is replaced by U₁(x), defined
according to (4). The solution (11) is no longer an approximate but an
exact solution of the given problem. In order to obtain this solution we
must first obtain the function C(x) by solving the non-linear second-order
differential equation

    C″ + U C = C₀²/C³.
The customary KWB approximation results if we neglect C″/C in comparison


to U. But if we do not neglect this term, we obtain a tool by which we
can study with complete accuracy the change of the amplitude of our
vibration in a given limited range. Once more we see that the separation
of our solution into amplitude and frequency is not unique. The differential
equation (12) allows the prescription of the initial value of C and C′ at the
point of start. In addition we have the value of C₀ at our disposal but this
constant is irrelevant since it can be absorbed into the initial value of C.
Hence we can put C₀ = 1 without loss of generality. But the initial values
of C and C′ are still arbitrary.
While it will generally not be possible to solve the differential equation
(12), we may draw important conclusions from it concerning the law accord-
ing to which the amplitude C(x) changes. We may want to know for example
whether the ratio of a maximum amplitude to the next maximum is smaller
or larger than 1. For this purpose we can start at a point x₁ at which the
cosine factor of the solution (11) assumes its maximum value 1. This
means that the integration of √U₁(x) will start from that particular point
x = x₁:

We ensure a maximum of v(x) by choosing the initial values of C(x) as


follows:

Then also C″(x₁) = 0 and we see that C(x) will change very slowly in this
neighbourhood. We may be able to proceed to the next maximum on the
basis of a local expansion of a few terms and estimate the increase or decrease
of the maximum without actually integrating the amplitude equation (12).
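The exactness of this split is easy to demonstrate numerically. The sketch
below (assuming Python with numpy and scipy; the choice U(x) = 1 + 0.3x
is an arbitrary test case, not from the text) integrates the amplitude
equation in the form C″ + UC = C₀²/C³ together with C²ψ′ = C₀ and compares
C cos ψ with a direct integration of v″ + Uv = 0:

    # Exact amplitude-frequency split versus direct integration.
    import numpy as np
    from scipy.integrate import solve_ivp

    U = lambda x: 1.0 + 0.3 * x

    def direct(x, y):                  # y = (v, v')
        return [y[1], -U(x) * y[0]]

    def amplitude(x, y):               # y = (C, C', psi), with C0 = 1
        C, Cp, psi = y
        return [Cp, 1.0 / C**3 - U(x) * C, 1.0 / C**2]

    xs = np.linspace(0, 20, 400)
    v = solve_ivp(direct, (0, 20), [1.0, 0.0], t_eval=xs, rtol=1e-10).y[0]
    C, _, psi = solve_ivp(amplitude, (0, 20), [1.0, 0.0, 0.0],
                          t_eval=xs, rtol=1e-10).y
    print(np.max(np.abs(v - C * np.cos(psi))))   # agreement to solver accuracy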
Problem 315. In the discussion of the differential equation (9.15) of the
vibrating spring a solution was obtained which could be interpreted as an
oscillation of variable amplitude C(x) and frequency β(x). Show that the
variable amplitude [β(x)]^{−1/2} satisfies the exact amplitude equation (12).

7.31. Sturm-Liouville problems and the calculus of variations


We have seen that all linear second-order differential operators are self-
adjoint if the proper weight factor is applied. On the other hand, self-adjoint
differential equations are derivable from a variational principle. This
means that the given differential equation can be conceived as the solution
of the problem of making a certain definite integral a minimum, or at
least stationary. The existence of such a minimum principle is of
great advantage because it often enables us to establish the existence of a
minimum by tools which do not require the solution of a differential equation.
Furthermore, the actual numerical solution of a differential equation is
frequently greatly facilitated by trying to minimise the variational integral
directly, paying no attention to the differential equation which is the
consequence of the variational problem.
In the case of a second-order operator the integrand of the variational
principle is particularly simple. We have here

    L = ½(A v′² − C v²) + βv.                                       (1)

Making use of the standard technique discussed in Chapter 5.9 (see particu-
larly the second equation of the system (5.9.4)), we arrive at the differential
equation

    (A v′)′ + C v = β,                                              (2)
which coincides with the standard form (4.1) of a self-adjoint second-order


differential equation.
This in itself, however, does not establish the full solution of the given
variational problem. During the process an integration by parts takes
place which gives rise to the following boundary term:

    [A v′ δv]ₐᵇ.                                                    (3)

The differential equation (2) is a necessary but not a sufficient condition for
the stationary value of the variational integral. A further condition is
required by the vanishing of the boundary term. This establishes the
boundary conditions without which our problem is not fully determined.
Now it is possible that our variational problem is constrained by certain
conditions which have to be observed during the process of variation. For
example we may demand the minimum of the integral

but under the restricting condition that some definite boundary values are
prescribed for v(x):

In this case the variation of v(x) vanishes on the boundary:

    δv(a) = δv(b) = 0,
because v(x) remains fixed at the two end-points during the process of
variation. Then the boundary term (3) vanishes automatically and we get
no further boundary conditions through the variational procedure.
It is equally possible, however, that we have no prescribed boundary
conditions, that is, the variational integral (4) is free of further constraints.
In that case 8v(x) is not constrained on the boundary but can be chosen
freely. Here the vanishing of the boundary term (3) introduces two
"natural boundary conditions" (caused by the variational principle itself,
without outside interference), namely—assuming that A(x) does not vanish
on the boundary:

    v′(a) = v′(b) = 0.
We may also consider the case that the problem is partly constrained, by
prescribing one boundary condition, for example

Then δv(x) vanishes on the lower boundary but is free on the upper boundary,
and the variational principle provides the second boundary condition in the
form

    v′(b) = 0.
In Section 4 we dealt in full generality with the question of self-adjoint


boundary conditions and came to the conclusion that a wide class of
boundary conditions satisfy the condition of self-adjointness. Assuming
any two linear relations between the four quantities v(b), v'(b), v(a), v'(a),
we could eliminate v(b) and v'(b) in terms of v(a), v'(a); then the condition
(4.10) demanded only one single condition between the four constants of
these relations. In view of the equivalence between self-adjointness and
existence of a variational principle we must be able to prove the same result
on a variational basis.
In Chapter 5 we have found that any variational problem can be trans-
formed into the Hamiltonian normal or "canonical" form (cf. particularly
Chapter 5.10). In our problem we bring about the canonical form by
adding v′(x) as surplus variable w(x), by the auxiliary condition

    v′ − w = 0.                                                     (10)

Then our variational integrand becomes

    L = ½(A w² − C v²) + βv,                                        (11)
with the auxiliary condition (10). This condition can be united, however,
with the Lagrangian by the method of the Lagrangian multiplier (cf.
Chapter 5.9), which we want to denote by p:

    L′ = ½(A w² − C v²) + βv + p(v′ − w).                           (12)
This adds to our original problem the two new variables p and w, but w is
purely algebraic and can be eliminated, by the equation

    ∂L′/∂w = A w − p = 0.                                           (13)

This yields

    w = p/A

and

    L′ = p v′ − H,

where

    H = p²/(2A) + ½C v² − βv.                                       (16)
The resulting canonical equations

    v′ = ∂H/∂p = p/A,        p′ = −∂H/∂v = −C v + β                 (17)
are equivalent to the single equation (2).


These equations are not influenced, however, by adding to L′ a complete
derivative of an arbitrary function, since such a term is equivalent to a
mere boundary term in Q which has no effect on any inside point of the
range. If we choose the quantity −½(pv)′ as such a term, our L′ is changed
to the equivalent

    L₁ = ½(p v′ − p′ v) − H.                                        (18)
With this L₁ the boundary term of the process of variation becomes

    ½ [p δv − v δp]ₐᵇ,                                              (19)

and if we assume the boundary conditions

(which demands the same conditions for the variations δv(x) and δp(x),
omitting the constants γ₁ and γ₂ on the right side), we see that the boundary
term (19) vanishes if the single condition

is fulfilled. Considering the relation p = Av' we see that this result is hi


perfect agreement with our previous result, except for the notations.

The canonical equations (17) lead to an interesting conclusion. In view
of these equations the Lagrangian L₁ becomes for the actual solution

    L₁ = ½ (p ∂H/∂p + v ∂H/∂v) − H.                                 (22)

We will consider the case of the homogeneous differential equation and thus
put β(x) = 0. Then the Hamiltonian function (16) becomes a homogeneous
algebraic form of the second order in the variables p and v, and thus by
Euler's formula of homogeneous forms:

    p ∂H/∂p + v ∂H/∂v = 2H,                                         (23)

which in view of (22) yields

    L₁ = 0

and also

    Q = ∫ₐᵇ L₁ dx = 0.                                              (25)
Hence the variational integral becomes zero for the actual solution. It is
worth remarking that this result is independent of any boundary conditions.
It is a mere consequence of the canonical equations which in themselves do
not guarantee a stationary value of Q.
Let us now consider the eigenvalue problem associated with our operator:

    (A v′)′ + (C + λ) v = 0.                                        (26)

(In the boundary conditions (20) we now have to replace γ₁ and γ₂ by zero.)
This equation can be conceived as a consequence of our variational principle
if in our Lagrangian (1) we put β(x) = 0 and replace C(x) by C(x) + λ. But
another interpretation is equally possible. We normalise the solution v(x)
by the condition

    ∫ₐᵇ v² dx = 1,                                                  (27)
which is considered as a given constraint during the process of variation,


without changing our L. Then the Lagrangian multiplier method of
Chapter 5.9 gives again a modification of L and, if the multiplier is denoted
by −½λ, the new Lagrangian becomes

    ½(A v′² − C v²) − ½λ v²,                                        (28)

which is entirely equivalent to a modification of C(x) by the constant λ.


Now we make use of the result (25) which has shown that as the result of the
variational equations the variational integral vanishes for the actual
solution:

    ∫ₐᵇ [½(A v′² − C v²) − ½λ v²] dx = 0.                           (29)

This again yields in view of the constraint (27):

    λ = 2Q = ∫ₐᵇ (A v′² − C v²) dx.                                 (30)
The significance of this result is as follows. The eigenvalues and eigen-


functions associated with our self-adjoint operator D can be obtained by the
solution of the following variational problem: "Make the variational integral
Q stationary, under the normalisation condition (27)." Any solution of
this problem yields one of the eigenfunctions of the operator D. Moreover,
the eigenvalue λ itself is equal to twice the stationary value of the given
integral Q.
More definite statements can be made if it so happens that not only is
A(x) everywhere positive inside the range [a, b] but C(x) is everywhere
negative. Then we can put C(x) = −K(x) and write our Lagrangian in the
following form:

    L = ½(A v′² + K v²).                                            (31)

This expression is now a positive definite quadratic form of v and v′ which
cannot take negative values for any choice of v(x). The same holds then of
the integral Q and—in view of (30)—of the eigenvalues λᵢ. Hence the
eigenvalues associated with a positive definite Lagrangian can only be positive
numbers.
But we can go further and speak of the absolute minimum obtainable for
Q under the constraint (27). This absolute minimum (apart from the
factor 2) will give us the smallest eigenvalue of our eigenvalue spectrum.
This process can be continued. After obtaining the lowest eigenvalue and
the associated eigenfunction v₁(x), we now minimise once more the integral
Q with the auxiliary condition (27), but now we add the further constraint
that we move orthogonally to the first eigenfunction v₁(x):

    ∫ₐᵇ v v₁ dx = 0.                                                (32)

The absolute minimum of this new variational problem (which excludes the
previous solution v₁(x) due to the constraint (32)) yields the second lowest
eigenvalue λ₂ and its eigenfunction v₂(x). We can continue this process,
always keeping all the previous constraints, plus one new constraint, namely
the orthogonality to the last eigenfunction obtained. In this fashion we
activate more and more dimensions of the function space and the eigenvalues
enter automatically in the natural arithmetical order.
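A discretised model shows the same behaviour. The sketch below (assuming
Python with numpy; the choice A = 1, C = −K = 0 on [0, π] with
v(0) = v(π) = 0 is an illustration, not from the text) recovers the
eigenvalues 1, 4, 9, . . . in increasing order and verifies λ = 2Q for the
minimiser:

    # Minimum principle for -v'' = lambda v on [0, pi], v(0) = v(pi) = 0.
    import numpy as np

    N = 400
    h = np.pi / N
    main = 2.0 * np.ones(N - 1) / h**2          # finite-difference -d^2/dx^2
    off = -1.0 * np.ones(N - 2) / h**2
    D = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    lam, vec = np.linalg.eigh(D)
    print(lam[:4])                              # ~ 1, 4, 9, 16

    v = vec[:, 0] / np.sqrt(h * np.sum(vec[:, 0]**2))  # int v^2 dx = 1
    Q = 0.5 * h * np.sum(np.gradient(v, h)**2)         # Q = (1/2) int v'^2 dx
    print(2 * Q, lam[0])                               # lambda = 2Q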
We will now give an example which seems to contradict our previous
result by yielding a negative eigenvalue, in spite of a positive definite
Lagrangian. We make the simple choice

and prescribe the self-adjoint boundary conditions



The solution

obviously satisfies the given boundary conditions, together with the


differential equation

where λ has the negative value λ = −a².


The explanation of this apparent paradox lies in the fact that the
minimisation of our Lagrangian integral (1) cannot lead to boundary
conditions of the form (34), in spite of their self-adjoint character. Only
such boundary conditions can be prescribed which make the boundary term

to zero. For boundary conditions of a more general type we had to subtract
the term ½(pv)′ from L, that is, our variational integral becomes now

The added term amounts to a mere boundary term but this term is no longer
positive definite, in fact it may become negative and counteract the first
positive term to such an extent that the resulting Q becomes negative.
This is what actually happens in the example (33-34).
Problem 316. Obtain the complete ortho-normal system associated with the
problem (33), (34).
[Answer:

Problem 317. Obtain the eigenfunctions and eigenvalues which belong to the
Lagrangian

[Answer:
Added boundary condition:

where

Problem 318. Show that for any μ > 2 the integral Q can be made arbitrarily
small by a function of the type e^{−ax} where a is very large. Hence the minimum
problem cannot lead to a discrete smallest eigenvalue.
Problem 319. Obtain the eigenfunctions and eigenvalues for the limiting case
μ = 2.
[Answer: The eigenvalues become continuous since only the boundary condition
at v(1) gives a selection principle:

The eigenfunctions cannot be normalised since they are not square-integrable.


However, by cutting out an arbitrarily small neighbourhood around the singular
point x = 0, the continuous spectrum changes to a dense line spectrum and the
functions become once more normalisable.]
Problem 320. Find the solution of the variational problem associated with the
integral

with the constraints (27) and (34). (Although this problem seems to coincide
with Problem 316, this is in fact not the case because now the given Lagrangian
is positive definite and the boundary conditions (34) must be treated as constraints.
A negative eigenvalue is not possible under these conditions.)
[Answer: The method of the Lagrangian multiplier yields the differential
equation

with the following interpretation. A strict minimum is not possible under the
given constraints. The minimum comes arbitrarily near to the solution of the
eigenvalue problem which belongs to the boundary conditions v′(0) = v′(π) = 0,
without actually reaching it.
This example shows that a variational problem does not allow any tampering
with its inherent boundary conditions (which demand the vanishing of the boundary
term (37)). If constraints are prescribed which do not harmonise with the
inherent boundary conditions, the resulting differential equation is put out of
action at the end-points, in order to allow the fulfilment of the inherent boundary
conditions which must be rigidly maintained.]
BIBLIOGRAPHY
[1] Cf. {1}, pp. 82-97, 324-36, 466-510, 522-35
[2] Cf. {12}, Chapters XIV-XVII (pp. 281-385)
[3] Jahnke, E., and F. Emde, Tables of Functions with Formulae and Curves
(Dover, New York, 1943)
[4] McLachlan, N. W., Bessel Functions for Engineers (Clarendon Press, Oxford,
1934)
[5] Magnus, W., and F. Oberhettinger, Formulas and Theorems of the Special
Functions of Mathematical Physics (Chelsea, New York, 1949)
[6] Szego, G., Orthogonal Polynomials (Am. Math. Soc. Colloq. Pub., 23, 1939)
[7] Watson, G. N., A Treatise on the Theory of Bessel Functions (Cambridge
University Press, 1944)
CHAPTER 8

BOUNDARY VALUE PROBLEMS

Synopsis. For irregular boundaries the boundary value problem of


even simple differential operators becomes practically unmanageable,
if our aim is to arrive at an analytical solution. For certain regular
boundaries, however, the fundamental partial differential equations of
mathematical physics become solvable by the method of the "separation
of variables ". While the number of explicitly solvable boundary value
problems is thus very restricted, the study of these problems has had a
profound impact on the general theory of partial differential equations,
and given rise to a large class of auxiliary function systems which have
a wide field of application. The present chapter discusses the standard
type of boundary value problems which played such a decisive role in
the understanding of the general theory of second order operators. We
enlarge, however, our field of interest by the addition of certain
unconventional types of boundary value problems which are not less
solvable than the conventional types, although they do not submit to
the customary conditions of a "well-posed" problem. We encounter
the "parasitic spectrum" with its ensuing consequences. Finally we
arrive at a perturbation method which transforms all non-conformist
problems into the traditional "well-posed" ones and obtains the solution
by a certain limit process.

8.1. Introduction
In all the previous chapters we were primarily concerned with the general
theory of linear differential operators which were characterised by homo-
geneous boundary conditions; that is, certain linear combinations of the
unknown function v(x) and its derivatives were prescribed as zero on the
boundary, while, on the other hand, the "right side" of the
differential equation was prescribed as some given function β(x) of the
domain. Historically, another problem received much more elaborate
attention. The given differential equation itself is homogeneous by having
zero on the right side (hence β(x) = 0). On the other hand, some linear
combinations of the unknown function and its partial derivatives are now
prescribed as given values, generally different from zero. Instead of an
inhomogeneous differential equation with homogeneous boundary conditions
we now have a homogeneous differential equation with inhomogeneous

boundary conditions. Problems of this type occur particularly frequently


in mathematical physics and in fact the entire theory of boundary value
problems took its departure from the exploration of natural phenomena.
For example one of the best investigated partial differential equations of
mathematics, the "Laplacian equation", or "potential equation":

    ΔV = ∂²V/∂x² + ∂²V/∂y² + ∂²V/∂z² = 0,                           (1)
originated from the Newtonian theory of gravitation, but it occurred equally


in hydrodynamics, in elasticity, in electrostatics, in heat conduction. Again,
it was the problem of sound propagation which induced Riemann (around
1860) to discover a completely different method for the investigation of
another type of differential equation which later received the name
"hyperbolic". Based on earlier results of G. Monge (1795), P. du Bois-
Reymond introduced (in 1889) the classification of second-order differential
operators into the "elliptic", "parabolic", and "hyperbolic" types. The
potential equation is characteristic for the "elliptic" type. The heat flow
equation

    ∂V/∂t = a² ∂²V/∂x²,

or more generally

    ∂V/∂t = a² ΔV,

represents the "parabolic" type, while the equation of the vibrating string:

    ∂²V/∂t² = c² ∂²V/∂x²,

or more generally

    ∂²V/∂t² = c² ΔV        (in two dimensions)

(the problem of the vibrating membrane) and

    ∂²V/∂t² = c² ΔV        (in three dimensions)

(called the "wave equation"), belong to the "hyperbolic" type. With the
advent of wave-mechanics the Schrödinger equation

    Δψ + (2m/ħ²)(E − V)ψ = 0
and many allied equations came in the focus of interest. Here the question
of boundary values is often of subordinate importance since the entire space
is the domain of integration. But in atomic scattering problems we encounter
once more the same kind of boundary value problems which occur in optical
and electro-magnetic diffraction phenomena, associated with the Maxwellian
equations.

In the investigation of this particular class of second-order differential


equations the observation was made that the analytical nature of the
solution differed widely according to the "type" of the given differential
operator. Furthermore, each type required its own kind of boundary
conditions. For example the elliptic type of differential equations required
conditions all along the boundary,* while the hyperbolic type required either
initial or end conditions, the parabolic type initial conditions, excluding
end conditions. Under such circumstances a truly universal approach to
the theory of boundary value problems seemed out of the question. In
every single case one had to establish first, what kind of boundary conditions
are appropriate to the given problem, in order to make the problem
analytically feasible. Hadamard in his famous Lectures on the Cauchy
Problem (cf. Chapter Bibliography [5]), defined the conditions which a given
boundary value problem had to satisfy in order to be admitted as a "well-
posed" problem, with the implication that problems not satisfying these
conditions are analytically inadmissible.
In marked contrast with these views we find that the fundamental
"matrix decomposition theorem", which we have encountered in Chapter
3.9—and which in proper interpretation carries over into the field of
arbitrary linear differential operators, with arbitrarily given boundary
conditions—provides a universal platform for the theory of boundary value
problems, irrespective of the "type" to which the differential operator
belongs. In this treatment the subclass of "well-posed" problems is
distinguished by a very definite property of the eigenvalue spectrum
associated with the given operator. This property is that the eigenvalue
λ = 0 is excluded from the eigenvalue spectrum, both with respect to the
U-space (thus excluding over-determination) and the V-space (thus excluding
under-determination). Then the operator is activated in the entire function
space and not only in a restricted sub-space of that function space. But
even this condition is not enough. The further condition has to be made
that the eigenvalue λ = 0 must not be a limit point of the eigenvalue spectrum,
that is the spectrum of the eigenvalues must start with a definite finite
eigenvalue λ₁ > ε, while the segment between λ = 0 and λ = ε must
remain free of eigenvalues.
From the standpoint of the numerical solution of a given boundary
value problem Hadamard's "well-posed" condition is well justified, since
problems which do not satisfy this condition lead to numerical instability,
which is an undesirable situation. From the analytical standpoint, how-
ever, it is hardly advantageous to restrict our investigation by a condition
which does not harmonise with the nature of a linear operator. Many
fundamentally important operators of mathematical physics—as we have
seen in Chapter 4—are not activated in the entire function space. They may
omit axes in either the U or the V space, or in both. Accordingly our data
* Such conditions we will call "peripheral conditions", in order to avoid collision
with the more general term "boundary conditions" which includes all classes of
boundary data.

cannot be chosen freely but are subject to a finite or infinite number of


compatibility conditions. Moreover, the solution of the given problem may
not be unique, the uncertainty occurring in a finite or even infinite number of
dimensions of the full V-space. But these apparent deficiencies are fully
compensated by the fact that the given operator is within its own field of
activation both complete and unconstrained. Our difficulties arise solely by
going outside the field of operation which is inherently allotted to the
operator. If we stay within that field, all our difficulties disappear and
neither over-determination nor under-determination comes into existence.
What remains is the possibility of the eigenvalue λ = 0 as a limit point.
This happens if boundary conditions are given which are not "according to
type", that is initial (or end) conditions for the elliptic type of equations,
or peripheral conditions for the hyperbolic type, or end-conditions for the
parabolic type of equations. But if we have admitted the previous
restriction of the given data to a definite subspace of the U-space, then we
cannot exclude this new class of problems from our considerations either,
because they represent a natural modification of the previous compatibility
conditions. If before we required our data to have no projections in the
"forbidden" (that is inactive) dimensions of the function space, now we
have to demand a less stringent condition, because in those dimensions which
belong to arbitrarily small eigenvalues, the projections of the data need
not be zero, but only sufficiently weak.
With these restrictions any kind of boundary value problem, whose data
are properly given, has a solution, and in fact a unique solution. Finally
we shall encounter a further fact which tends to disprove the unique position
of the "well-posed" problems. By a finite but arbitrarily weak perturbation
we can wipe out the eigenvalue A = 0 and its infinitesimal neighbourhood,
thus transforming any arbitrarily "ill-posed" problem to a "well-posed" one
(cf. Section 18).

8.2. Inhomogeneous boundary conditions


Although we shall deal specifically in this chapter with solution methods
adapted to the case of inhomogeneous boundary conditions associated with
a homogeneous differential equation, it is of greatest theoretical importance
that our previous theory which assumed an inhomogeneous differential
equation with homogeneous boundary conditions, includes the solution of the
present problem.
We have recognised the expansion into eigenfunctions as a particularly
powerful and universal method of solving linear differential equations. The
presence of inhomogeneous boundary conditions seems to put this method
out of action since the eigenfunctions cannot satisfy anything but homo-
geneous boundary conditions and the same holds for any linear combination
of them. By the following detour, however, we get round the difficulty.
First of all we find a function v₀(x) which need not satisfy any differential
equation but is chosen in such a way that it shall satisfy the given boundary
conditions. We require a certain smoothness of this function v₀(x), because
the operator D must be able to operate on it and thus it must be
differentiable to the proper degree. This may require a certain smoothness
of the boundary data which is more than necessary for the existence of a
solution but puts us on the safe side. Now we will put

    v(x) = V(x) + v₀(x).                                            (1)
Then, if the given problem is to solve the homogeneous differential equation

    Dv = 0                                                          (2)

with inhomogeneous boundary conditions, the substitution of (1) in our
equation yields for the function V(x) the inhomogeneous equation

    DV = −Dv₀(x),                                                   (3)
but the given inhomogeneous boundary conditions are absorbed by v₀(x),


with the consequence that V(x) must satisfy the same kind of boundary
conditions which have been prescribed for v(x), but with zero on the right side.
And thus we have succeeded in transforming our originally given homogeneous
differential equation with inhomogeneous boundary conditions into an
inhomogeneous differential equation with homogeneous boundary conditions.
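A one-dimensional toy problem makes the mechanism transparent. The sketch
below (assuming Python with numpy; the problem v″ = 0, v(0) = 1, v(1) = 2
and the choice v₀ = 1 + x² are illustrations, not from the text) carries out
the substitution and recovers the exact solution v = 1 + x:

    # v = V + v0:  v0 matches the boundary values, V solves V'' = -v0'' = -2
    # with V(0) = V(1) = 0, expanded in the eigenfunctions sin(k pi x).
    import numpy as np

    x = np.linspace(0, 1, 2001)
    v0 = 1 + x**2
    V = np.zeros_like(x)
    for k in range(1, 400):
        beta_k = 2 * np.trapz(-2.0 * np.sin(k * np.pi * x), x)
        V += -beta_k / (k * np.pi)**2 * np.sin(k * np.pi * x)
    print(np.max(np.abs(v0 + V - (1 + x))))    # ~ 0: exact solution is 1 + x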
Let us see how the method of the eigenfunctions would operate under
these circumstances. We assume that we possess the complete system of
eigenfunctions and eigenvalues associated with the eigenvalue problem

Then V(x) can be expanded into the eigenfunctions vᵢ(x):

The coefficients of this expansion are obtainable in terms of the expansion


coefficients of the right side:

where

Now we make use of Green's identity (4.17.4):

which yields

The second term is extended over the boundary surface σ and involves the
boundary values of uᵢ(σ) and its partial derivatives, together with the
boundary values of v₀(x), which in fact coincide with the given boundary
values for v(x).
If we separate the first term on the right side of (9) and examine its
contribution to the function, we find the infinite sum

and it seems that we have simply obtained −v₀(x) which compensates the
v₀(x) we find on the right side of the expression (1). It seems, therefore,
that the whole detour of separating the preliminary function v₀(x) is
unnecessary. But in fact the function v₀(x) does not belong to the functions
which can be expanded into the eigenfunctions vᵢ(x), since it does not
satisfy the necessary homogeneous boundary conditions. Moreover, the
separation of this term from the sum (9) might make the remaining series
divergent. In spite of the highly arbitrary nature of v₀(x) and the inde-
pendence of the final solution (2) of the choice of v₀(x), this separation is
nevertheless necessary if we want to make use of the expansion of the
solution into a convergent series of eigenfunctions vᵢ(x).
The method of the Green's function is likewise applicable to our problem
and here we do not have to split away an auxiliary function but can apply
the given inhomogeneous boundary values directly, in spite of the fact that
the Green's function is defined in terms of homogeneous boundary con-
ditions. We have defined the Green's function by the differential equation

(the operator D̃ includes the adjoint homogeneous boundary conditions;


moreover, the right side has to be modified, if necessary, by the proper
constraints, as we have seen in (5.22.9)). While, however, in our previous
problem (when v(x) was characterised by homogeneous boundary conditions),
Green's identity appeared in the form

and led to the solution

now this volume integral is zero, due to the vanishing of β(x). On the
other hand, we now have to make use of the "extended Green's identity"
(4.17.4):

and obtain v(x) in the form of an integral extended over the boundary
surface σ:

Here the functions F_σ(u, v) have automatically the property that they
blot out all those boundary values which have not been prescribed. What
remains is a surface integral involving all the "given right sides" of the
boundary conditions. We may denote them by

Then the general form of the solution can be written as follows:

where the auxiliary functions G₁(x, σ), . . ., G_p(x, σ) are formed with the
help of the Green's function G(x, ξ) and its partial derivatives, applied to the
boundary surface σ.
The unique solution thus obtained is characterised by the following
properties. If our operator is incomplete by allowing solutions vᵢ(x) of the
homogeneous equation

    Dvᵢ(x) = 0,
our solution is made orthogonal to all these solutions. Furthermore, if the


adjoint homogeneous equation possesses non-zero solutions:

    D̃uᵢ(x) = 0,

we assume that the given boundary conditions f_α(σ) are such that the
integral conditions

are automatically fulfilled since otherwise the boundary data are in-
compatible and the given problem is unsolvable.
Problem 321. Assume the existence of non-zero solutions of (18) and the ful-
filment of the required compatibility conditions (19). Now apply the method
of transforming the given inhomogeneous boundary value problem into a homo-
geneous boundary value problem with inhomogeneous differential equation.
Show that now the orthogonality of the right side to the "forbidden" axes uᵢ
is automatically satisfied.

8.3. The method of the "separation of variables"


Although the general theory of the Green's function and the associated
double set of eigenfunctions is of great value for the general analytical
investigation of boundary value problems, the actual solution of such
problems can often be accomplished by simpler tools. The solution of an
inhomogeneous differential equation involves a right side which is given in
an n-dimensional domain, while the boundary data of a homogeneous
differential equation belong to the boundary surface, i.e. a domain of only
n — 1 dimensions. Hence we can expect that under the proper circum-

stances a boundary value problem is solvable without the full knowledge of


the complete set of eigenfunctions which are associated with the given
differential operator.
In many of the particularly important differential operators of mathe-
matical physics a fortunate circumstance exists, without which our
knowledge concerning the nature of boundary value problems would be
much more restricted. It consists of an artifice first employed by
D. Bernoulli (1775) which has retained its fundamental importance to our
day. We try to reduce the given partial differential equation to the
solution of a set of ordinary differential equations, which depend on a single
variable only. We do that by trying a solution which is set up as a product
of functions of one single variable only:

    V(x₁, x₂, . . ., xₙ) = V₁(x₁) V₂(x₂) · · · Vₙ(xₙ).
That such an experiment succeeds is by no means self-evident. It is merely


a fortunate circumstance that most of the basic differential operators of
mathematical physics actually allow such a separation in the variables, in
fact in many cases a separation is possible in a great variety of coordinates
(for example the Laplacian operator Δ can be separated in rectangular,
polar, cylindrical, parabolic, and many other coordinates). Furthermore,
in most cases in which the separation succeeds, we obtain an infinite set of
particular solutions and by a linear superposition of all these particular
solutions the complete solution of the given differential equation can be
accomplished, inside of a properly chosen domain. The coefficients of this
linear expansion are obtainable in terms of the given boundary values, by
integrating over the boundary surface.
The drawback of this method is only that the domain of the validity of
these expansions is restricted to boundary surfaces of great regularity, such
as a sphere, a cylinder, a parallelepiped—and their counterparts in two
dimensions—occasionally also surfaces or curves of second order. For
boundaries of irregular shape the method loses its applicability and in such
cases we are frequently forced to take recourse to purely numerical methods.

8.4. The potential equation of the plane


The Laplacian equation in two dimensions:

    ∂²u/∂x² + ∂²u/∂y² = 0                                           (1)

has many exceptional properties, due to its close relation to the celebrated
"Cauchy-Riemann differential equations", which are at the foundation of
the theory of analytical functions. A function of the complex variable
z = x + iy:

    f(z) = u(x, y) + iv(x, y)                                       (2)
has the property that its real and imaginary parts are related to each other
by the two partial differential equations

    ∂u/∂x = ∂v/∂y,        ∂u/∂y = −∂v/∂x;                           (3)

these are the differential equations (first discovered independently by
Cauchy and by Riemann), which have the consequence that both the real
and imaginary part of f(z) satisfy the potential equation (1):

    Δu = 0,        Δv = 0.                                          (4)

Hence an arbitrary function f(z), if separated into real and imaginary parts,


solves the potential equation of the plane. Furthermore, the Cauchy-
Riemann equations (3) can be conceived as a "conformal mapping" of the
plane on itself, that is a mapping which preserves angles but not lengths.
Such a mapping can be used to transform an irregular closed boundary into
a simple boundary, in particular into a circle. The Cauchy-Riemann
equations—and thus also the potential equation—are invariants of such a
mapping. Hence in principle—as it was shown by Riemann—any simply
connected domain of not too great irregularity can be mapped into the
inside of a circle, although unfortunately we do not possess the tools for the
explicit construction of such a mapping, except in a few simple cases.
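The harmonicity of real and imaginary parts is easy to check symbolically
for any particular analytic function. A sketch (assuming Python with
sympy; f(z) = z³ is an arbitrary example):

    # Real and imaginary parts of an analytic function are plane potentials.
    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    f = sp.expand((x + sp.I * y)**3)
    u, v = sp.re(f), sp.im(f)
    print(sp.simplify(sp.diff(u, x, 2) + sp.diff(u, y, 2)))   # 0
    print(sp.simplify(sp.diff(v, x, 2) + sp.diff(v, y, 2)))   # 0
    print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))         # Cauchy-Riemann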
For a circle it is advantageous to change from the rectangular coordinates
x, y, to polar coordinates r, θ, by the transformation

    x = r cos θ,        y = r sin θ.                                (5)
In these coordinates the equations (3) appear in the form

    ∂u/∂r = (1/r) ∂v/∂θ,        ∂v/∂r = −(1/r) ∂u/∂θ.               (6)

Now we try separation by putting

    u = U₁(r) U₂(θ),        v = V₁(r) V₂(θ),                        (7)

which yields the conditions



But a function of r can only be equal to a function of θ if in fact both
functions are reduced to mere constants. Hence the equations (8) separate
into the ordinary differential equations

The second set of equations yields

solvable by exponential functions:

(with a similar solution for V₂). But now we have to demand that our
solution be periodic in θ with the period 2π, since two points which belong
to the angles θ and θ + 2π in fact coincide, and without the required
periodicity our solution would not be single-valued. This condition restricts
the possible values of the product αβ to k², where k is an arbitrary positive
integer, or zero:

    αβ = k².                                                        (12)
We thus obtain the solutions

We now come to the solution of the first set of conditions (9). Here we
obtain for U₁ alone the differential equation

    r² U₁″ + r U₁′ − k² U₁ = 0,                                     (14)

which has the solution

    U₁ = a rᵏ + b r^{−k}.                                           (15)

But the second term becomes infinite at r = 0 and has to be rejected.
Moreover, the arbitrary constant a does not add anything new to the free
constants A and B of (13), and may be normalised to 1. Hence the method
of separation, applied to polar coordinates, yields the following class of
particular solutions of the problem (6):

    uₖ(r, θ) = rᵏ(Aₖ cos kθ + Bₖ sin kθ),                           (16)

while the first of the conditions (9), combined with (13), gives

From these particular solutions we proceed to form by linear superposition
the infinite sum

    u + iv = Σ_{k=0}^∞ Cₖ rᵏ e^{ikθ},                               (18)

where the complex constant Cₖ stands for

It is shown in the theory of analytical functions that this is indeed the


general solution of the Cauchy-Riemann differential equations, inside of a
circle with the radius r.
Let us now consider the function u(r, θ) alone, without reference to
v(r, θ). It can be considered as a solution of the Laplacian equation (1)
which in polar coordinates becomes

    ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ² = 0.                     (20)
The infinite sum

    u(r, θ) = Σ_{k=0}^∞ rᵏ(Aₖ cos kθ + Bₖ sin kθ)                   (21)

represents once more the complete solution of the equation (20), if we stay
inside a circle within which the equation holds. Let us normalise the
radius of that circle to r = 1 and prescribe on the periphery of the circle
the boundary values

    u(1, θ) = f(θ).                                                 (22)
Then our problem is to satisfy the equation

    f(θ) = Σ_{k=0}^∞ (Aₖ cos kθ + Bₖ sin kθ),                       (23)

which is equivalent to the problem of expanding a given function into a
trigonometric series. We have solved this problem in Chapter 2.2, with the
following result:

    A₀ = (1/2π) ∫₀^{2π} f(t) dt,
    Aₖ = (1/π) ∫₀^{2π} f(t) cos kt dt,    Bₖ = (1/π) ∫₀^{2π} f(t) sin kt dt.   (24)

The entire solution can be combined into the single equation

    u(r, θ) = (1/2π) ∫₀^{2π} f(t) [1 + 2 Σ_{k=1}^∞ rᵏ cos k(t − θ)] dt.   (25)

We can arrive at a Green's function type of solution if we succeed in getting
a closed expression for the infinite sum

    1 + 2 Σ_{k=1}^∞ rᵏ cos k(t − θ).                                (26)

This can be done in the present case since we have the real part of an infinite
series which is summable by the formula of the geometrical series:

and thus, putting t − θ = s:

    u(r, θ) = (1/2π) ∫₀^{2π} f(t) (1 − r²)/(1 − 2r cos s + r²) dt.  (28)
This Green's function is not identical with the full Green's function
G(x, ξ) of the potential equation, but the two functions are closely related
to each other. We have seen in (2.16) that the auxiliary functions
G_α(x, σ) are expressible in terms of the Green's function G(x, ξ) and its
partial derivatives, taken on the boundary surface σ, which in our case of
two dimensions is reduced to a boundary curve. But in our simple problem
of a circle we have no difficulty in constructing even the full Green's function
G(x, ξ) which satisfies the differential equation

    ΔG(x, ξ) = δ(x − ξ),

together with the homogeneous boundary condition

    G(x, ξ) = 0        (x on the boundary).
The high symmetry of the Laplacian operator permits us to study the


Green's function of the potential operator in much more detail than that of
an arbitrary operator, and for a few sufficiently regular domains we can
actually construct the Green's function in explicit form.
The following properties of the Laplacian operator

    ΔV = 0                                                          (31)
hold in spaces of arbitrary dimensions, although the spaces of two, three, and
four dimensions are of primary interest from the applied standpoint:
1. The operator remains invariant with respect to arbitrary translations
and rotations. Hence any solution of the Laplacian equation remains a

solution if we translate it to any other point of space, or rotate it rigidly


around an arbitrary axis by an arbitrary angle.
2. The Laplacian operator is separable in polar coordinates and a solution
exists which is a function of r only. For this solution the defining
differential equation becomes

    V″ + ((n − 1)/r) V′ = 0,                                        (32)

whose solution is

    V = a + b r^{−(n−2)}.                                           (33)

In two dimensions the corresponding solution becomes

    V = a + b log r.                                                (34)
3. If we apply to (31) the Gaussian integral transformation

    ∫_τ ΔV dτ = ∮_σ (∂V/∂ν) dσ,                                     (35)

we obtain that in any "analytical" domain (in which the differential
equation (31) is satisfied without any singularities) for any closed surface σ
the fundamental relation holds:

    ∮_σ (∂V/∂ν) dσ = 0.                                             (36)
On the other hand, if we identify V with that negative power of r which
satisfies the Laplacian equation (in the plane we have to choose log r),
there is a point of singularity at r = 0 which has to be excluded from our
analytical domain. Then the theorem (36) still holds, but we now have an
inner and an outer boundary; the integral extended over the inner plus the
outer boundary is zero, which means that the integral extended over the
outer boundary alone becomes a constant. We can evaluate this constant
by integrating over a sphere (in the plane a circle) of the radius r. For
example, choosing V = log r, and integrating over a circle of the radius
r, we obtain for the integral (36):

    ∮_σ (∂V/∂ν) dσ = 2π,                                            (37)

while in three dimensions, choosing V = r^{−1}, we obtain

    ∮_σ (∂V/∂ν) dσ = −4π.                                           (38)
4. Let us consider the solution of the inhomogeneous equation

    ΔV = β(x),                                                      (39)

and let us apply the Gaussian integral transformation to this equation. We
now obtain the integral theorem (36) in the more general form:

    ∮_σ (∂V/∂ν) dσ = ∫_τ β dτ.                                      (40)
Let us assume in particular that β(x) is a pure function of r. Then a
solution of (39) can be found which is likewise a function of r only, while the
general solution will be this particular V₁(r), plus a solution of the homo-
geneous equation

    ΔV = 0.                                                         (41)
5. We will assume that β(r) is different from zero only within a certain
(n-dimensional) sphere of the radius r = ε. Then our particular solution
V₁(r) outside of this sphere must be of the form (33) and the constant b will
be determined by the application of the relation (40), integrating over the
inner sphere:

    B = ∫_τ β dτ.                                                   (42)

For example in two dimensions we get

    V₁ = (B/2π) log r;                                              (43)

in three dimensions

    V₁ = −(B/4π) (1/r);                                             (44)

in four dimensions

    V₁ = −(B/4π²) (1/r²);                                           (45)
and in a space of the arbitrary dimensionality n, depending on the even or


odd character of n:

for n = 2k:

for n = 2k + 1:

6. Let us now investigate the solution of the equation

    ΔV = δ(x).

By the definition of the delta function the integral over the right side
becomes 1. The delta function can be assumed to be spherically symmetric
and concentrated in a small sphere—whose centre is at the origin—with
the radius ε which shrinks to zero. Hence the constant B is now 1 and

the validity of our solution (43-46) applies in the limit to any point outside
the point r = 0. We thus obtain for the Green's function G(x, ξ):
in two dimensions:

    G(x, ξ) = (1/2π) log ρ + V(ξ);                                  (48)

in three dimensions:

    G(x, ξ) = −(1/4π)(1/ρ) + V(ξ);                                  (49)

in four dimensions:

    G(x, ξ) = −(1/4π²)(1/ρ²) + V(ξ)                                 (50)

(ρ denoting the distance between the points x and ξ), where V(ξ) is a
solution of the Laplacian equation (41) which is regular


throughout the given domain. This part of the Green's function will be
uniquely determined by the homogeneous boundary conditions prescribed
for G(x, ξ).
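The normalisation of (48) and (49) can be checked by an elementary flux
computation; a sketch (assuming Python with numpy, not part of the text):

    # Outward flux of grad G through a circle / sphere of radius r equals 1,
    # as required by Delta G = delta(x - xi).
    import numpy as np

    for r in (0.5, 1.0, 3.0):
        flux_2d = (2 * np.pi * r) * (1 / (2 * np.pi * r))        # G = (1/2pi) log r
        flux_3d = (4 * np.pi * r**2) * (1 / (4 * np.pi * r**2))  # G = -1/(4 pi r)
        print(flux_2d, flux_3d)                                  # 1.0  1.0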
With this excursion into the general properties of the Green's function
associated with a Laplacian equation in arbitrary dimensions, we now
return to the boundary value problem (22) which we have previously solved
by the method of the separation of variables. Now we will solve the same
problem on the basis of the Green's function. For this purpose we make use
of the geometrical property of the so-called "conjugate points" of a circle
or sphere. The "conjugate" of a point x with respect to the unit circle
lies on the same radius vector but has the reciprocal distance l/r from the
origin (in Figure (51) x = P, x' = P', while the running point is £ = Q).

The contribution of the point P is fixed (according to (48)) to (1/2π) log ρ.


A similar contribution from the point P' satisfies the condition of the
additional function V(ξ), since the singularity of this function is outside the
circle and thus does not violate the condition that V(ξ) has to be analytical
everywhere inside the circle.
By the laws of geometry we have

and we observe that ρ and r₀ρ′ become equal on the unit circle. Hence
the linear superposition

    G(x, ξ) = (1/2π) [log ρ − log(r₀ρ′)]                            (53)
yields a solution of the Laplacian equation which vanishes on the boundary


and satisfies all the other conditions of the Green's function. The same
holds in three dimensions if we choose the solution

    G(x, ξ) = −(1/4π) [1/ρ − 1/(r₀ρ′)].                             (54)
Hence we have obtained the Green's function of the "first boundary value
problem of potential theory" (when the values of V(σ) are prescribed on the
boundary), for the case that the boundary is the unit circle or the unit
sphere. The result is for the case of two dimensions:

while in three dimensions we obtain the expression

Furthermore, in our problem the extended Green's identity becomes (8.4.56)

which leads to the solution

If—for the case of the circle—we substitute in this formula the expression
(55) (σ corresponds to r = 1), we obtain

Finally, if the point x does not have the coordinates r₀, 0 but r, θ, we can
reduce this problem to the previous one by a mere rotation of our reference
system. The final formula becomes, if the integration variable is denoted
by t:

    u(r, θ) = (1/2π) ∫₀^{2π} f(t) (1 − r²)/(1 − 2r cos(t − θ) + r²) dt,
which is in full agreement with our previous result (28).
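A numerical test of this formula (a sketch assuming Python with numpy; the
boundary function f(t) = cos t is an arbitrary example whose harmonic
extension is r cos θ):

    # Poisson integral on the unit circle, evaluated by the midpoint rule.
    import numpy as np

    def poisson(r, theta, f, m=2000):
        t = np.linspace(0, 2 * np.pi, m, endpoint=False)
        kernel = (1 - r**2) / (1 - 2 * r * np.cos(t - theta) + r**2)
        return np.mean(f(t) * kernel)            # = (1/2pi) * integral

    for r, th in [(0.3, 0.7), (0.8, 2.0)]:
        print(poisson(r, th, np.cos), r * np.cos(th))   # the two agree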


Problem 322. By a method which corresponds to that employed in the derivation
of the equations (23-28), find the solution of the "second boundary value
problem of potential theory" (the "Neumann problem"), in which the prescribed
boundary values—in our case specified to the unit circle—belong to ∂V/∂ν
instead of V, that is

[Answer:

Constraint:

Problem 323. Construct the complete Green's function G(x, ξ) of this problem
(constrained on account of (64)) and show that the solution (63) is identical
with the solution obtained on the basis of the Green's function method.
Demonstrate the symmetry of G(x, ξ).
(Hint: Use again a proper linear combination of the contributions of the point x
and its conjugate x′.)
[Answer:
Definition of G(x, ξ):

Solution:

Problem 324. Obtain special solutions of the potential equation (20) by assuming
that V(r, θ) is the sum of a function of r and a function of θ:

Formulate the result as the real part of a function f(z) of the complex variable
z = re^{iθ}.
[Answer:

8.5. The potential equation in three dimensions


The Laplacian equation in three dimensions:

    ∂²V/∂x² + ∂²V/∂y² + ∂²V/∂z² = 0                                 (1)

leads to the definition of a number of important function classes. It can be
separated in a great variety of coordinates, such as rectangular, polar,
cylindrical, elliptic, and parabolic coordinates, every one of these separations
occurring in actual physical situations. We will restrict ourselves to the
case of polar coordinates. In these coordinates r, θ, φ the Laplacian
equation appears in the following form*:

    (1/r²) ∂/∂r (r² ∂V/∂r) + (1/(r² sin θ)) ∂/∂θ (sin θ ∂V/∂θ)
        + (1/(r² sin²θ)) ∂²V/∂φ² = 0.                               (2)
We separate first of all in the variable r by writing V in the form

    V(r, θ, φ) = R(r) Y(θ, φ).                                      (3)

This leads to the ordinary differential equation

    (r² R′)′ − α R = 0                                              (4)

and the following partial differential equation for the function Y(θ, φ):

    (1/sin θ) ∂/∂θ (sin θ ∂Y/∂θ) + (1/sin²θ) ∂²Y/∂φ² + α Y = 0.     (5)
This differential equation (5) has an independent significance of its own.


Its left side represents the Laplacian operator ΔY, written down for the
surface of a sphere, instead of a plane. The radius of this sphere is 1. The
entire equation expresses the eigenvalue problem associated with the self-
adjoint operator −ΔY, α being the eigenvalue. But we know from the
general theory that the eigenvalue spectrum of a finite domain must be a
discrete spectrum (except for the case of singular operators which lead to
non-normalisable eigenfunctions). We can thus state in advance that the
separation constant α must be restricted to an infinity of discrete values.
For the purpose of studying our eigenvalue problem in more detail we will
apply the method of separating the variables once more:

    Y(θ, φ) = Q(θ) S(φ),                                            (6)

obtaining the two ordinary differential equations

* The formulation of invariant differential operators in arbitrary curvilinear co-
ordinates is the subject matter of "tensor calculus", or "absolute calculus" (cf., e.g.,
the books [7], [9], [10] of the Chapter Bibliography). For a brief introduction into the
principles of tensor calculus see the author's article "Tensor Calculus" in the Handbook
of Physics (Condon and Odishaw) (McGraw-Hill, 1958), Part 1, pp. 111-122.

Now the new constant β is also an eigenvalue, belonging to the operator
−S″. Exactly as in Section 4, the only possible values of β become

    β = m²        (m = 0, 1, 2, . . .),                             (9)

in view of the fact that S(φ) must become a periodic function of φ. The
associated eigenfunctions are

    S_m(φ) = cos mφ,  sin mφ.                                       (10)
We start with m = 0. Then the equation (8) becomes, in the new variable

    x = cos θ,                                                      (11)

identical with Legendre's differential equation (cf. (7.2.12)):

    (1 − x²) v″ − 2x v′ + α v = 0.                                  (12)

We have encountered this differential equation earlier as a special case of
the hypergeometric differential equation, and seen that the singularity at
x = ±1 can only be avoided if the hypergeometric series terminates after a
finite number of terms. This requires the selection principle

    α = n(n + 1)        (n = 0, 1, 2, . . .).                       (13)
The functions Qₙ(x) then become the Legendre polynomials, expressed in
x = cos θ.
We have now obtained a special class of spherical harmonics which are
independent of the azimuth angle φ:

    Yₙ(θ) = Pₙ(cos θ),                                              (14)

and we will return to the equation (4), in order to obtain the full solution
of V(r, θ, φ):

    (r² R′)′ = n(n + 1) R.                                          (15)
The two solutions of this differential equation are obtainable by putting

    R = r^μ,                                                        (16)

which yields for μ the determining equation

    μ(μ + 1) = n(n + 1),                                            (17)

with the two solutions

    μ₁ = n,        μ₂ = −(n + 1).                                   (18)
The second solution has to be rejected because it leads to a point of infinity
at r = 0. On the other hand, if our aim is to solve the potential equation
outside of a certain sphere, then only the second solution must be kept and
the first one dropped, since rⁿ goes to infinity with increasing r.
We can now obtain by superposition the complete solution of the
Laplacian equation inside of a certain sphere with the radius r = a, if that
solution has cylindrical symmetry by not depending on the azimuth angle
φ:

    V(r, θ) = Σ_{n=0}^∞ cₙ rⁿ Pₙ(cos θ).                            (19)
Legendre's differential equation gives valuable clues even toward the
general problem of obtaining the spherical harmonics which do depend on the
azimuth angle φ. Let us differentiate Legendre's differential equation

    (1 − x²) v″ − 2x v′ + n(n + 1) v = 0                            (20)

m times. We obtain then for the function y = v^{(m)}(x) the following
differential equation:

    (1 − x²) y″ − 2(m + 1) x y′ + [n(n + 1) − m(m + 1)] y = 0.      (21)
Now the substitution

    y = u(x) w(x)                                                   (22)

yields for w(x) the following differential equation:

and we will dispose of u(x) in such manner that the factor of w′ shall remain
−2x. For this purpose we have to put

    u(x) = (1 − x²)^{−m/2}.                                         (24)

The factor of w in (23) now becomes

and we obtain for

    w(x) = (1 − x²)^{m/2} v^{(m)}(x)                                (26)

the following differential equation:

    (1 − x²) w″ − 2x w′ + [n(n + 1) − m²/(1 − x²)] w = 0.           (27)
But this is exactly the differential equation (8) for β = m² and α = n(n + 1).
We have thus obtained the following particular solutions of the Laplacian
differential equation:

    V = rⁿ (1 − x²)^{m/2} Pₙ^{(m)}(x) e^{±imφ}        (x = cos θ).  (28)
Now a polynomial of the order n cannot be differentiated more than n
times (the higher derivatives vanishing identically). Hence the integer m
in (28) can only assume the values 0, 1, 2, . . ., n. With the exception of
m = 0 every one of these values leads to two solutions, in view of the ± sign
of the last factor. Hence the total multiplicity of the eigenvalue n is 2n + 1.

Problem 325. Show that to any solution V(r, θ, φ) of the Laplacian equation
a second solution can be constructed by putting

\[ \bar V(r, \theta, \varphi) = \frac{1}{r}\, V\Bigl(\frac{1}{r}, \theta, \varphi\Bigr) \]

(excluding the point r = 0).


Problem 326. Show that the particular solutions (28) are all polynomials in
the rectangular coordinates x, y, z, where

\[ x = r \sin\theta \cos\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\theta. \]

(The previous notation x for cos θ is here discarded.)


Problem 327. Show the orthogonality of the function system (26), for fixed m
and variable n.
Problem 328. Assume that V(r, θ, φ) has cylindrical symmetry (i.e. is
φ-independent). Given are the boundary values of V(r, θ) on the unit sphere:

\[ V(1, \theta) = f(\theta). \]

Obtain the coefficients c_n of the expansion (19) in terms of the given boundary
values.
[Answer:

\[ c_n = \frac{2n + 1}{2} \int_0^{\pi} f(\theta)\, P_n(\cos\theta)\, \sin\theta\, d\theta. \;] \]
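As a numerical illustration of this answer (an added sketch with an assumed
boundary function, not part of the text), the coefficients can be computed by
Gauss-Legendre quadrature after the substitution u = cos θ, and the expansion
(19) then gives the potential at any interior point:

    import numpy as np
    from scipy.special import eval_legendre

    u, w = np.polynomial.legendre.leggauss(400)     # nodes/weights on [-1, 1]

    def coeffs(f, N):
        # c_n = (2n+1)/2 * integral of f(theta) P_n(cos theta) sin theta
        fu = f(np.arccos(u))
        return np.array([(2*n + 1)/2 * np.sum(w * fu * eval_legendre(n, u))
                         for n in range(N)])

    f = lambda th: np.where(th < np.pi/2, 1.0, -1.0)  # assumed boundary values
    c = coeffs(f, 40)

    r, th = 0.5, np.pi/3                              # an interior point
    V = sum(c[n] * r**n * eval_legendre(n, np.cos(th)) for n in range(40))
    print(V)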

Problem 329. Obtain particular solutions of the potential equation (2) by
assuming a V(r, θ) which is a sum of a function of r and a function of θ:

\[ V(r, \theta) = f(r) + g(\theta). \]

Answer:

\[ f(r) = \alpha \log r, \qquad g(\theta) = \alpha \log \sin\theta + \beta \log \tan\frac{\theta}{2}. \]

In particular for β = ±α we obtain the solution

\[ V = \alpha \log \bigl[ r\,(1 \mp \cos\theta) \bigr]. \tag{35} \]
Problem 330. Choose the upper sign in (35) and demonstrate the following
property of this solution. We select the point r = a, θ = 0 on the positive
z-axis and construct a sphere of the radius b < a around the point (a, 0) as
centre. Then the normal derivative of V(r, θ) along this sphere becomes

Problem 331. The Green's function for the case of a unit sphere was obtained
in (4.56) as far as the "first boundary value problem of potential theory" is
concerned. Solve the same problem for the "second boundary value problem"
("Neumann problem"). Hint: the operation with the conjugate points is not
enough, but we succeed if the result (36) is taken into account.
[Answer:
Definition of Green's function:

Constraint:

Solution (cf. Fig. (4.51)):

(The additional constant has no effect on the integration and may be omitted.)]
Problem 332. Show that the solution thus obtained automatically satisfies the
conditions

provided that the compatibility condition (38) is satisfied. Explain the origin
of the condition a).
[Answer: the vanishing of the coefficient c₀ in the expansion (19).]
Problem 333. Consider the problem of minimising the integral

\[ \int_\tau (\nabla V)^2\, d\tau \tag{42} \]

inside of a certain domain τ, with prescribed values of V(σ) on a certain portion
C of the boundary surface σ, while no restrictions are imposed on the comple-
mentary boundary C′. Show that the solution of this variational problem is the
following boundary value problem:

\[ \Delta V = 0 \ \text{in } \tau; \qquad V \ \text{given on } C, \qquad \frac{\partial V}{\partial \nu} = 0 \ \text{on } C' \]

("mixed boundary value problem").



Problem 334. Given the boundary value problem

\[ \Delta V = 0, \tag{44} \]

\[ \frac{\partial V}{\partial \nu} + \gamma(\sigma)\, V = g(\sigma), \tag{45} \]

where γ(σ) is a given function on the boundary surface σ.
a) Show that this problem is self-adjoint and thus deducible from a variational
problem.
b) Find the boundary integral which has to be added to the volume integral
(42) in order to obtain the boundary condition (45) as the inherent boundary
condition of the variational problem.
c) Consider the eigenvalue problem of (44), putting g(σ) = 0, and show that
for γ(σ) ≥ 0 the eigenvalues become all positive, with the possible exception of
λ = 0 in the extreme case that γ(σ) vanishes identically.
[Answer:
Added boundary term:

\[ \oint_\sigma \bigl( \gamma(\sigma)\, V^2 - 2\, g(\sigma)\, V \bigr)\, d\sigma. \;] \]
8.6. Vibration problems


Partial differential equations which involve the time t are frequently of
the following structure. Space and time are separated. We have an
operator D which involves the space variables only and whose coefficients
are independent of time. This operator D originates by the minimisation
of a certain space integral which is positive definite. Hence D is a self-
adjoint operator (with self-adjoint boundary conditions), whose eigenvalues
are all positive. Moreover, the time enters only by an added term which
contains the second derivative of v with respect to t:

\[ Dv + \frac{\partial^2 v}{\partial t^2} = 0. \tag{1} \]
We will assume that we possess the complete function system associated
with the eigenvalue problem

\[ D v_i = \lambda_i\, v_i. \tag{2} \]

Now we expand v(x, t) into the complete ortho-normal eigenfunction system
v_i, with coefficients which are functions of t:

\[ v(x, t) = \sum_i c_i(t)\, v_i(x). \tag{3} \]

Our differential equation (1) now separates into the ordinary differential
equation

\[ c_i''(t) + \lambda_i\, c_i(t) = 0, \tag{4} \]

which is solved by

\[ c_i(t) = a_i \cos \sqrt{\lambda_i}\, t + b_i \sin \sqrt{\lambda_i}\, t. \tag{5} \]

Our problem is thus reduced to the determination of the coefficients
a_i, b_i. This can be done by prescribing the proper initial conditions at the
time moment t = 0. Let us assume that we possess the data

\[ v(x, 0) = f(x), \qquad v_t(x, 0) = g(x). \]

This means that at the time moment t = 0 the initial displacements and the
initial velocities of the vibrating medium are given. Then we can expand
both f(x) and g(x) into the eigenfunction system v_i(x):

\[ f(x) = \sum_i f_i\, v_i(x), \qquad g(x) = \sum_i g_i\, v_i(x), \]

with the coefficients

\[ f_i = \int f\, v_i\, dx, \qquad g_i = \int g\, v_i\, dx. \]

But then the comparison with (5) shows that we have in fact obtained the
coefficients a_i, b_i explicitly:

\[ a_i = f_i, \qquad b_i = \frac{g_i}{\sqrt{\lambda_i}}, \]

and we can consider our problem as solved.
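The recipe is easily carried out numerically. The sketch below is an added
illustration under assumed data, for the concrete operator D = −d²/dx² on
[0, l] with fixed ends, whose orthonormal eigenfunctions are √(2/l) sin(iπx/l)
with λ_i = (iπ/l)²; it computes a_i, b_i by quadrature and synthesises v(x, t):

    import numpy as np

    l, N = 1.0, 64
    x = np.linspace(0.0, l, 2001)

    def v_i(i):                                   # orthonormal eigenfunctions
        return np.sqrt(2.0/l) * np.sin(i*np.pi*x/l)

    lam = (np.arange(1, N + 1)*np.pi/l)**2        # eigenvalues lambda_i

    f = x*(l - x)                                 # assumed initial displacement
    g = np.zeros_like(x)                          # assumed initial velocity

    def integ(y):                                 # trapezoidal quadrature
        return float(np.sum(0.5*(y[1:] + y[:-1])*np.diff(x)))

    a = np.array([integ(f*v_i(i)) for i in range(1, N + 1)])
    b = np.array([integ(g*v_i(i)) for i in range(1, N + 1)]) / np.sqrt(lam)

    def v(t):                                     # synthesis of the solution
        c = a*np.cos(np.sqrt(lam)*t) + b*np.sin(np.sqrt(lam)*t)
        return sum(c[i-1]*v_i(i) for i in range(1, N + 1))

    print(v(0.3)[::500])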

Problem 335. Let us assume that the homogeneous equation

\[ Dv = 0 \]

possesses non-zero solutions under the given homogeneous boundary conditions.
Show that f(x) can still be prescribed freely, while g(x) is constrained by the
orthogonality to the homogeneous solutions:

\[ \int g(x)\, v_0(x)\, dx = 0 \quad \text{for every solution } v_0 \text{ of } D v_0 = 0. \]
Problem 336. Show that the Green's function G(x, t; ξ, τ) of the problem (1),
with f(x) = g(x) = 0, can be constructed as follows:

\[ G(x, t; \xi, \tau) = \sum_i \frac{v_i(x)\, v_i(\xi)}{\sqrt{\lambda_i}}\, \sin \sqrt{\lambda_i}\,(t - \tau) \qquad (t > \tau). \tag{13} \]

Problem 337. In the case of a vibrating membrane the differential operator D
becomes the negative Laplacian operator of the plane:

\[ Dv = -\Delta v = -\Bigl( \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} \Bigr). \]

Assume a circular membrane of the radius 1 and determine the eigenfrequencies
and vibrational modes of the membrane, under the boundary condition that the
membrane is fixed at r = 1:

\[ v(1, \varphi, t) = 0. \]

[Answer:

\[ v_{km}(r, \varphi) = J_k\bigl(\sqrt{\lambda_{km}}\, r\bigr)\,(\cos k\varphi \ \text{or} \ \sin k\varphi), \]

where the eigenvalues λ_km are determined by the condition

\[ \sqrt{\lambda_{km}} = \xi_{km}, \]

if ξ_km denotes the (infinitely many) zeros of the Bessel function of the order k:

\[ J_k(\xi_{km}) = 0. \;] \]
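Since the zeros ξ_km are tabulated, the eigenfrequencies are immediately
available; e.g. with scipy (an added sketch, not part of the problem):

    from scipy.special import jn_zeros

    # sqrt(lambda_km) = xi_km, the m-th zero of the Bessel function J_k
    for k in range(3):
        print(k, jn_zeros(k, 3))    # first three zeros of J_0, J_1, J_2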

Problem 338. Show that a vibration problem with inhomogeneous but time-
independent boundary conditions can be solved as follows. We first solve the
given inhomogeneous boundary value problem for the differential equation

\[ D v_0 = 0. \]

Then we replace the given inhomogeneous boundary conditions by the
corresponding homogeneous conditions and follow the previous procedure,
merely replacing the initial displacement function f(x) by f(x) − v₀(x). The
resulting solution of our problem becomes

\[ v(x, t) = v_0(x) + \bar v(x, t), \]

where v̄(x, t) is the solution of the homogeneous problem which belongs to the
initial displacement f(x) − v₀(x) and the initial velocity g(x).
8.7. The problem of the vibrating string


The problem of the "vibrating string" (which, in fact, as we will see,
is far from "vibrating"), is one of the historically most interesting examples
of a boundary value problem. It was in connection with this problem
that D. Bernoulli discovered the method of eigensolutions and their super-
position, thus obtaining a complete solution of a given inhomogeneous
boundary value problem. The same problem brought about the remarkable
controversy concerning the nature of the trigonometric series, which involved
D'Alembert, Lagrange, Euler, Bernoulli, and finally Fourier.
This problem is once more associated with the Laplacian operator, which,
however, is here reduced to a single dimension. The differential equation

leads to physically important applications, whether the Laplacian operator



Δ involves one, two, or three rectangular space coordinates. The simplest
case of one single coordinate leads (in proper normalisation of units) to the
differential equation of the vibrating string:

\[ \frac{\partial^2 v}{\partial x^2} = \frac{\partial^2 v}{\partial t^2}. \tag{2} \]

The string is fixed at the points x = 0 and x = l, which imposes the
boundary conditions

\[ v(0, t) = v(l, t) = 0. \tag{3} \]
According to (6.13) we can proceed immediately to the construction of the
Green's function of our problem. For this purpose we have to solve the
eigenvalue problem (6.2) which in our case becomes

\[ -v'' = \lambda\, v. \]

In view of the boundary conditions (3) we obtain the normalised eigen-
solutions

\[ v_n(x) = \sqrt{\frac{2}{l}}\, \sin \frac{n\pi x}{l}, \]

with

\[ \sqrt{\lambda_n} = \frac{n\pi}{l} \qquad (n = 1, 2, 3, \ldots). \]
Then, in view of (6.13), we obtain for the Green's function G(x, t; ξ, τ),
which we prefer to denote by G(x, ξ; t, τ):

\[ G(x, \xi; t, \tau) = \sum_{n=1}^{\infty} \frac{2}{n\pi}\, \sin \frac{n\pi (t - \tau)}{l}\, \sin \frac{n\pi x}{l}\, \sin \frac{n\pi \xi}{l}. \tag{7} \]

If we denote

\[ t - \tau = t', \]

and replace the product of two sine functions by the difference of two
cosine functions, we obtain the general term of the expansion (7) in the
following form:

\[ \frac{1}{2n\pi} \Bigl[ \sin \frac{n\pi(t' + x - \xi)}{l} + \sin \frac{n\pi(t' - x + \xi)}{l} - \sin \frac{n\pi(t' + x + \xi)}{l} - \sin \frac{n\pi(t' - x - \xi)}{l} \Bigr]. \]
following form:

Hence it suffices to evaluate one universal function F(p), defined by

\[ F(p) = \sum_{n=1}^{\infty} \frac{1}{2n\pi}\, \sin \frac{n\pi p}{l}. \tag{10} \]

In terms of this function the resulting Green's function becomes:

\[ G = F(t' + x - \xi) + F(t' - x + \xi) - F(t' + x + \xi) - F(t' - x - \xi). \tag{11} \]
The Green's function of our problem is not only a mathematically import-
ant function, but has a very definite and simple physical significance. The
delta function on the right side of the defining differential equation means
in physical terms that at the point x = ξ and the time moment t = τ an
infinitely sharp hammer blow is applied, of infinitely short duration. This
hammer blow conveys to the particles of the string a mechanical momentum
of the magnitude 1, localised to the immediate neighbourhood of the point
x = ξ. Then the string is left alone and starts to perform its motion on the
basis of the principles of mechanics—more precisely the "principle of least
action", which demands that the time integral of T − V (where T is the
kinetic energy of the string and V its potential energy), shall be made a
minimum, or at least a stationary value:

\[ \delta \int (T - V)\, dt = 0, \]

where

\[ T = \frac{1}{2} \int_0^l v_t{}^2\, dx, \qquad V = \frac{1}{2} \int_0^l v_x{}^2\, dx. \tag{13} \]
The solution of this variational problem is the differential equation (2), with
the boundary conditions (3), caused by the constraints which are imposed
on the string by the forces which prevent it from moving at the two fixed
ends.
The Green's function itself is the displacement of the string, as a function
of x and t, after the hammer blow is over. The expression (7) shows at
once an interesting property of the motion of the string, observed by the
early masters of acoustical research: "an overtone which has a nodal point
at the point where the hammer strikes cannot be present in the harmonic
spectrum of the vibrating string". Indeed, the sum (7) represents a
harmonic resolution of the motion of any particle of the string, if we consider
x as fixed and describe the motion as a function of time. The overtones
have frequencies which are integer multiples of the fundamental frequency
1/(2l), and any overtone receives the weight zero if the last factor has a nodal
point at the point x = ξ.
It is this harmonic analysis in time which leads to the notion that the
string performs some kind of "vibration", as the name "vibrating string"
indicates. That such vibrations are possible is clear from the mathematical

form of the eigenfunctions, if they are taken separately. But this does not
mean that under the ordinary conditions of bringing the string into motion,
some kind of vibration will occur. It is a curious fact that our "physical
intuition" can easily mislead us if we try to make predictions without the
aid of the exact mathematical theory. What happens if we strike the
string with a hammer ? How will the disturbance propagate? The answer
frequently given is that some kind of "wave" will propagate along the
string, similar to the waves observed in a pond if a stone is dropped in the
water. Another guess may be made from the way in which an electro-
magnetic disturbance is propagated in space: it spreads out on an
ever-increasing sphere at the velocity of light, giving a short disturbance
at the points swept over by this expanding sphere. The picture derived
from this analogy would be a narrow but sharp "hump" on the wire
which will propagate with constant velocity (in fact with the velocity 1
due to the normalisation of our differential equation), both to the right
and to the left from the point of excitation.
Both guesses are in fact wrong. In order to study the phenomenon in
greater detail, we have to find first of all the significance of the sum (10).
We have encountered the same sum earlier (cf. 2.2.10), and making use of
the result there obtained we get

\[ F(p) = \frac{1}{4}\Bigl(1 - \frac{p}{l}\Bigr) \qquad (0 < p < 2l), \tag{14} \]

together with the two further conditions

\[ F(-p) = -F(p), \qquad F(p + 2l) = F(p), \]

which implies

\[ F(2l - p) = -F(p). \]
Let us now apply the hammer blow first of all at the centre of the string,
that is the point ξ = l/2, and examine the displacement of the string, given
by (11), at the points x = l/2 ± x₁. By the principle of symmetry we
know in advance that the disturbance must spread symmetrically to the
right and to the left, and thus the displacement v(l/2 ± x₁) which belongs to
these points must be the same. If we substitute the result (14) in the
general formula (11), we get

\[ v = F(t + x_1) + F(t - x_1) - F(t + x_1 + l) - F(t - x_1 - l). \]

The picture we derive from this formula can be characterised as follows.
The disturbance propagates from the point of striking in the form of a
hill of constant height ½, which spreads out with the velocity 1 to larger and
larger portions of the string. After arriving at the end-points 0 and l, the
hill recedes with the same velocity and finally collapses to zero after the
time l, when it jumps over to the height −½ and repeats the same cycle
over once more, on the negative side. After the time of 2l the full cycle is
repeated in identical terms.

The remarkable feature of this result is that with the limited kinetic
energy of the initial blow larger and larger portions of the string are excited,
which seems to contradict the conservation law of energy. In actual fact
the expression (13) shows that the accumulated potential energy is zero
because the hill is of constant height and thus ∂v/∂x = 0. The local kinetic
energy is likewise zero because the points of the hill, after rising to the
constant height of ½ (or −½), remain perfectly still, their velocity dropping to
zero. The exchange between potential and kinetic energy takes place
solely at the end of the hill which travels out more and more but repeats the
same phenomenon in identical terms.
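This behaviour can be reproduced directly from the eigenfunction series (7).
The partial sum below (an added sketch; l, ξ, t and the number of terms are
assumed values) settles to the height 1/2 on the interval |x − ξ| < t and to
zero outside it:

    import numpy as np

    l, xi, t = 1.0, 0.5, 0.3
    x = np.linspace(0.0, l, 801)
    n = np.arange(1, 4001)[:, None]               # summation index
    G = np.sum(2.0/(n*np.pi) * np.sin(n*np.pi*x/l) * np.sin(n*np.pi*xi/l)
               * np.sin(n*np.pi*t/l), axis=0)
    print(G[::100].round(3))   # about 1/2 for |x - 1/2| < 0.3, about 0 elsewhere

Small oscillations of Gibbs type remain near the two edges of the hill, which
is exactly the discontinuity discussed in Section 8.8 below.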
If the hammer blow is applied away from the centre ξ = l/2, the only
change is that the reflection of the hill at the two end-points now occurs at
different time moments and thus the receding of the hill starts at one end
at a time when the hill is still moving forward on the other side. The
collapsing and reversal of sign now occurs at the mirror image of the point
of excitation, but again after the time 2l has passed from the time moment
of excitation.
The actual motion of the particles is far from a harmonic oscillation. It
consists in a sudden rise from the equilibrium position into a maximum
position, staying there for a while and then falling back once more into the
equilibrium position, with a repetition of the same motion in reversed
sequence and reversed sign, until after the time 2l the entire cycle is com-
pleted and the play starts again. In Figure (19) a few characteristic motion
forms are graphed, with the hammer striking in the middle, and half way
from the middle.
Under these circumstances we may wonder how a hammer-blow instru-
ment such as a piano can serve as a musical instrument at all. We have to
remember, however, that any periodic disturbance of the air will be
recorded by the ear as a musical tone of definite pitch, while the more or
less regular shape of the disturbance influences merely the "tone quality".
Furthermore, the sounding board and the resonating cavities of the piano
body put weight factors on the partial harmonics, with the result that the
air particles which bring the ear membrane into forced vibrations perform a
much more regular motion than the jerky motions illustrated in Figure (19).
Another method of exciting the string is by "plucking", as practised in
instruments such as the harp, the guitar, the lute, and many others: the
string is pulled out with the finger, giving it a nearly triangular shape;
then it is released. Here the motion starts with a certain shape f(x) at the
time moment t = 0, while the velocity of the particles at t — 0 can be
considered as zero:

Now the method of separating the variables yields the solution

\[ v(x, t) = \frac{1}{2}\,\bigl[ f(x + t) + f(x - t) \bigr], \tag{21} \]

where f(x) denotes the initial shape, extended beyond the range [0, l] as an
odd function of the period 2l.
This formula can be interpreted as follows. The point P = (x, t) is
projected down on the X-axis by drawing two straight lines downward at
the angles of ±45°. We arrive at the two points x₁ and x₂. Then we
take the arithmetic mean of the two values f(x₁) and f(x₂) as the solution
v(x, t). If f(x) is plotted geometrically, we can obtain the solution by a
purely geometrical construction.
If we pluck the string at the centre, the shape of the string at t = 0 will
be an isosceles triangle. How will this triangle move, if we release the
string? Will the entire triangle move up and down as a unit, vibrating in
unison? This is not the case. The geometrical construction according to
Figure (22) demonstrates that the outer contour of the triangle remains at rest
but a straight line moves down with uniform speed, truncating the triangle
to a quadrangle of diminishing height, until the figure collapses into the
axis OL. Then the same phenomenon is repeated downward in reversed
sequence, building up the triangle gradually, until the mirror image of the
original triangle is restored. We are now at the time moment t = l and the
half-cycle of the entire period is accomplished. The second half-cycle
repeats the same motion with opposite sign.
If the plucking occurs at a point away from the centre, the descending
straight line will not be horizontal. Furthermore, the triangle which
develops below the zero-line is the result of two reflections, about the X
and about the Y axes. In other respects the phenomenon is quite similar
to the previous case.

If we follow the motion of a single particle of the string as a function of
time, we observe that the motion is now less jerky than it was in (19) but
still far from a regular vibration. The following figure plots a few character-
istic motion patterns for central and non-central plucking.
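The geometrical construction of (21) translates into a few lines of code.
The sketch below (an added illustration; the plucking point, height, and the
evaluation time are assumed values) builds the odd 2l-periodic extension of
the initial shape and averages the two shifted copies:

    import numpy as np

    l, p, h = 1.0, 0.3, 1.0        # assumed length, plucking point and height

    def f(x):                      # triangular initial shape of the pluck
        x = np.asarray(x, dtype=float)
        return np.where(x < p, h*x/p, h*(l - x)/(l - p))

    def f_ext(x):                  # odd, 2l-periodic extension forced by (3)
        x = np.mod(x, 2*l)
        return np.where(x <= l, f(x), -f(2*l - x))

    def v(x, t):                   # v(x, t) = [f(x + t) + f(x - t)] / 2
        return 0.5*(f_ext(x + t) + f_ext(x - t))

    x = np.linspace(0.0, l, 11)
    print(v(x, 0.25).round(3))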

Problem 339. Obtain the solution of the boundary value problem (2), (3), with
the initial conditions

\[ v(x, 0) = f(x), \qquad v_t(x, 0) = g(x), \]

in terms of the Green's function G(x, ξ; t) (τ = 0).
[Answer:

\[ v(x, t) = \int_0^l G(x, \xi; t)\, g(\xi)\, d\xi + \frac{\partial}{\partial t} \int_0^l G(x, \xi; t)\, f(\xi)\, d\xi. \;] \]
Problem 340. a) By making the transformation

\[ u = x + t, \qquad w = x - t, \]

obtain the general solution of the differential equation (2) in the form
P(x + t) + Q(x − t), which may also be written in the form

\[ v(x, t) = A(x + t) + B(x - t). \]

b) Show that the boundary conditions (3) demand the following extension of
the functions A(p) and B(p) beyond the original range [0, l]:

\[ B(p) = -A(-p), \qquad A(p + 2l) = A(p). \]

c) Obtain the solutions (17) and (21) on the basis of this method, without
any eigenfunction analysis.

Problem 341. Show geometrically that the conservation law of energy:

\[ T + V = \text{const} \]

is satisfied for the plucked string (24).


[Answer:
Putting OA = a, AL = b,

8.8. The analytical nature of hyperbolic differential operators


If we examine the solutions obtained in the previous sections, we can
derive some conclusions of principal significance. The differential equation
(7.1) shows a characteristic property. The term in which the time t appears
enters the differential equation with the opposite sign, compared with the
other terms. Whether the Laplacian operator Δ contains only one space
variable (vibrating string), or two (vibrating membrane) or three coordinates
(the wave equation), in all cases the space terms enter the equation with
a negative, the time with a positive sign.
has a profound effect on the analytical character of the solutions. When
we were dealing with the solution of the potential equation (in Sections 4
and 5), the boundary values on the surface of a sphere (or in two dimensions
on a circle) were analysed in terms of functions of the Fourier type. The
expansion of the boundary values into these orthogonal functions does not
require any great regularity. If the boundary data are sectionally continuous
functions, with arbitrary discontinuities at the common boundaries between
the continuous regions, this is entirely sufficient to guarantee the con-
vergence of the expansion. If then we investigate the solution inside the
domain, we find that the potential function shows a surprisingly high
degree of regularity. Although the boundary data could not even be
differentiated since the first derivatives went to infinity at the points of
discontinuity, yet at any inside point the function V(x) is differentiable any
number of times, with respect to any of the coordinates. The integration
with the help of the Green's function has this remarkable smoothing effect
as its consequence. We have found explicit expressions for this Green's
function and these expressions show that G(x, £), considered as a function of
x, is an analytical function of x, which can be differentiated partially any
number of times. The only point of singularity is the point x = g, but this
equality can never occur, since x is in the inside, £ on the boundary of the
given domain.
The solution of a hyperbolic type of differential equation behaves very
differently. Here the singularities propagate from the boundary into the
inner domain. The propagation occurs along the so-called "character-
istics". For example in the case of the vibrating string the characteristics
are two straight lines at 45° which emanate from the singular point. We
can demonstrate this behaviour if we examine the example of the plucked
string. Initially the string had the shape (23), with a singular point at B
in which the function f(x) is continuous but its tangent becomes dis-
continuous. With increasing time this point moved with constant velocity
to the left and to the right, changing the contour of the string to a
quadrangle. The discontinuity in the tangent remained.
Let us now pay attention to the fact that the operator on the left side of
(7.2) requires the formation of the second derivative with respect to t and
x. This means that not only must v(x, t) be a continuous function of x and
t, but even the first derivative must be everywhere continuous since otherwise
the second derivative would become infinite at certain points and our
operator would lose its significance. Hence the shape of the string as
pictured in (24) is not a mathematically or physically acceptable initial
condition. We have to subject the initial displacement v(x, 0) to the
constraint that it must be a continuous and differentiable function, with a
piecewise continuous second derivative. Hence we must assume that the
sharp corner shown in the figure is in fact replaced by a round corner which
changes the tangent sharply but not instantly. This smoothing can occur
along a microscopically small portion of the string and causes a merely local
disturbance which has no effect in the distance. But without this smoothing
the nature of the differential equation is violated.
The same objection has to be raised even more pointedly against the
form (7.18) of the Green's function. Here the function itself becomes dis-
continuous at two points which spread to the right and to the left with the
velocity 1. We should remember, however, that the Green's function is
not more than an auxiliary function with the help of which the solution is
obtained. The right side 8(x, £) of the defining differential equation prevents
the function G(x, £) from satisfying the differential equation at the critical
point x = £. But we would expect that it does satisfy the given differential
equation everywhere else, in view of the local nature of the delta function.
And this is indeed the case with the Green's functions (4.55) and (4.56)
of the potential equation. But if we make a comparison with the Green's
function (7.18), we observe an important difference. The previous Green's
function remained a regular solution of the homogeneous differential
equation everywhere except at the point x — £, where the function went out
of bound. The new Green's function fails to satisfy the differential equation
not only at the source x = £, but all along the two characteristics which
emanate from the point x = £. And even more can be said. The delta
function is not a legitimate input function, as we have emphasized at several
occasions. It is the limit of a legitimate input function. An arbitrarily
small local smoothing changed the extreme nature of the delta function to a
legitimate function. Then we had a right side which was generally zero but
jumped from zero to the high constant value 1/e in the vicinity c of the point

x = ξ. In our present problem, if such a function is applied as input
function, this means in physical terms that a sharp hammer blow of short
(but not instantaneous) duration hits a small (but finite) portion of the
string, in the neighbourhood of the point x = ξ. The resulting solution
becomes quite similar to the previous figure (7.18), but the two ends of the
hill descend now with a finite tangent and the discontinuity of the function
is avoided. This, however, is not enough. We have just seen that the
operator (7.2) requires that the solution must have a continuous first
derivative in both x and t. Hence the solution (7.18) is not acceptable as
the solution of the given differential equation, even if the ends are slanted.
We have to demand that the sharp corners of the figure are rounded off.
Hence the Green's function is not one but two smoothings away from a
permissible solution of the differential equation and that again means that
the right side of a hyperbolic type of differential equation cannot be prescribed
with the same freedom that is permitted in the case of an elliptic (or, as we
will see later, a parabolic) equation. There the right side need not be more
than piecewise continuous. Now we have to require the right side to be a
continuous and even differentiable function. Without this restriction our
problem becomes unsolvable because, while we should obtain a solution v(x, t)
which is in itself continuous, the first partial derivative vx or vt, or both,
would go out of bound.
This phenomenon must give us a certain disquietude if we look at it
from the standpoint of eigenfunction expansion. A piecewise continuous
function can certainly be expanded into a convergent series of orthogonal
functions and now, if we construct the solution by dividing by the eigenvalues
\i, the convergence should become better rather than worse, since the eigen-
functions of high order—which may cause divergence—are divided by
numbers which grow to infinity. Now it is true that the difficulty is not
caused by the function v(x, t) itself, but by the partial derivatives vx and vt
which are less smooth than v itself. But a closer analysis reveals that we
cannot escape by this argument. We have seen in Chapter 5.11 that partial
differential equations can be reduced to a "canonical form" in which no
higher than first derivatives appear. This means in our problem that the
partial derivatives vx and vt can be added to the function v as new inde-
pendent variables, with the result that we get a coupled system of three
equations instead of the previous single equation. In this system we need
no differentiation any more to get the quantities vx and vt; they are fully-
fledged components of the solution, which now consists of the three
quantities v, vx, vt.
Let us carry out the actual procedure according to the general technique
discussed in Chapter 5.11. We introduce the two partial derivatives as new
variables:

\[ p_1 = v_x, \qquad p_2 = v_t \]

(making use of the simplified notation v_x for ∂v/∂x, v_t for ∂v/∂t). Now our

Lagrangian becomes, according to (7.12), if we multiply by −1 for the
sake of convenience:

We modify the Lagrangian according to the method of the Lagrangian
multiplier:

with

But the algebraic variables w₁ and w₂ can be eliminated, which reduces H to

and thus our final Lagrangian becomes

Variation with respect to p₁, p₂, and v yields the system

with the boundary conditions

and the initial conditions

These latter conditions arise from the fact that we want to employ the
method discussed in Section 2 which transforms the inhomogeneous boundary
value problem (in our case initial value problem) with a homogeneous
differential equation into an inhomogeneous differential equation with
homogeneous boundary conditions. The "right side" β(x, t) of this in-
homogeneous equation has to be put in the new formulation (7) in the place
of the zero of the third equation, while the right sides of the first and the
second equation remain zero. If we eliminate p₁ and p₂ from these two
equations and substitute in the third equation, we are back at the single
equation

\[ \frac{\partial^2 v}{\partial t^2} - \frac{\partial^2 v}{\partial x^2} = \beta(x, t). \]
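The elimination can be checked symbolically. The sketch below assumes a
natural reading of the system (p₁ = v_x, p₂ = v_t, with ∂p₂/∂t − ∂p₁/∂x = β
as the third equation), which may differ in inessential details from the printed
form of (7):

    import sympy as sp

    x, t = sp.symbols('x t')
    v = sp.Function('v')(x, t)
    beta = sp.Function('beta')(x, t)

    p1 = sp.diff(v, x)                        # assumed: p1 = v_x
    p2 = sp.diff(v, t)                        # assumed: p2 = v_t
    third = sp.diff(p2, t) - sp.diff(p1, x)   # third equation of the system

    # eliminating p1, p2 reproduces the single second-order equation
    print(sp.simplify(third - (sp.diff(v, t, 2) - sp.diff(v, x, 2))))  # -> 0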

We will now place the paradox in sharp focus by formulating it in exact
quantitative terms. We assume that we have solved the eigenvalue problem
associated with our system (7) (whose differential operator is self-adjoint
but not the boundary conditions (9)). We will call the orthogonal eigen-
functions of our problem U_i(x) and V_i(x), each one of these functions
representing in fact a vector, with the three components

\[ (p_{1i},\; p_{2i},\; v_i). \]

We prescribe β(x, t) as a finite, sectionally continuous function of bounded
variation, with a finite number of discontinuities in the variable x, if we
freeze t to a constant value (the same could be done with the variable t,
considering x as a constant). We expand the vector of the right side
(0, 0, β) into the eigenfunctions U_i(x, t):

\[ (0, 0, \beta) = \sum_i c_i\, U_i(x, t), \]

with

\[ c_i = \bigl( (0, 0, \beta),\; U_i \bigr). \]
Correspondingly the vector of the solution can be expanded—in accordance
with the general theory—into the following series:

\[ (p_1, p_2, v) = \sum_i \frac{c_i}{\lambda_i}\, V_i(x, t). \]

Now, by the properties of orthogonal expansions we obtain for the "length"
or "norm" of the function β(x, t) (cf. Chapter 4.7):

\[ \|\beta\|^2 = \sum_i c_i{}^2, \tag{15} \]

while the corresponding norm of the solution becomes

\[ \sum_i \frac{c_i{}^2}{\lambda_i{}^2}. \tag{16} \]
This quantity, however, cannot converge under the given circumstances.
We have found in our previous discussion that a discontinuity of β(x, t) with
respect to x causes p₁ = v_x to become infinite at that point, so that the
solution loses its quadratic integrability. Hence the sum (16) must go to
infinity, although the sum (15) converged. But how can it happen that the
solution is less convergent than the right side, when in fact we divide the c_i
by the λ_i which with increasing i go to infinity?
We leave the resolution of this apparent contradiction to Section 16,
where we will be able to understand it as a special case of a much more
general phenomenon.

8.9. The heat flow equation


The differential equation which describes the conduction of heat is
formally very similar to the wave-equation in one, two, or three dimensions.
The only difference is that the second derivative with respect to t is replaced
by the first derivative. The fundamental differential equation thus becomes

\[ DV + \frac{\partial V}{\partial t} = 0, \tag{1} \]

with the proper boundary and initial conditions. If we once more solve the
time-independent eigenvalue problem

\[ D v_i = \lambda_i\, v_i, \tag{2} \]

we can once more expand V(τ, t)—where τ symbolises the coordinates of
space—in the following form:

\[ V(\tau, t) = \sum_i c_i(t)\, v_i(\tau), \]

obtaining for c_i(t) the ordinary differential equation

\[ c_i'(t) + \lambda_i\, c_i(t) = 0, \]

which has the solution

\[ c_i(t) = c_i\, e^{-\lambda_i t}. \]

Hence the complete solution becomes

\[ V(\tau, t) = \sum_i c_i\, e^{-\lambda_i t}\, v_i(\tau). \tag{6} \]

The undetermined coefficients c_i can be obtained by prescribing the initial
condition

\[ V(\tau, 0) = f(\tau). \]

Then

\[ c_i = \int f(\tau)\, v_i(\tau)\, d\tau. \]

Although formally our solution is very similar to that of the vibrating


string (or membrane, or space), there is in fact a fundamental difference
between the two solutions, because the appearance of an exponential in place
of the earlier periodic function in time changes the analytical nature of the
solution profoundly. We have observed in the problem of the vibrating
string that any discontinuity in the initial position or its tangent propagated
into the inside of the domain and maintained its character unchanged.
Such behaviour is now out of the question. The infinite sum (6) has
exceedingly good convergence because for large values of i also λ_i becomes
large and, as i goes to infinity, the factor e^{−λ_i t} cuts down so effectively the
contribution of the high order terms that the sum (6) and even all its
derivatives of arbitrary order (with respect to x or t) remain convergent, for all
values t > 0. Hence an initial discontinuity in function or derivative is
immediately smoothed out by the phenomenon of heat conduction and the
initially irregular function f(τ) is transformed into a function V(τ, t) which is
analytical in all its derivatives. In fact, Weierstrass made use of this
property of the phenomenon of heat conduction to demonstrate that an
arbitrary non-analytical, although single-valued and sectionally continuous
function can be approximated to any degree by strictly analytical functions
(for example polynomials).
If we consider a one-dimensional heat flow in a rod whose ends at x = 0
and x = l are kept at zero temperature, we have a problem which is quite
analogous to the motion of the vibrating string. The function v(x, t)
satisfies once more the boundary conditions

\[ v(0, t) = v(l, t) = 0 \tag{9} \]

and the differential equation

\[ \frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial x^2}, \tag{10} \]

which is once more solvable by a similar separation of variables as employed in
(7.21):

\[ v(x, t) = \sum_{n=1}^{\infty} c_n\, e^{-(n\pi/l)^2 t}\, \sin \frac{n\pi x}{l}, \]

with

\[ c_n = \frac{2}{l} \int_0^l f(\xi)\, \sin \frac{n\pi \xi}{l}\, d\xi. \]

This solution may also be written in terms of a Green's function G(x, ξ; t):

\[ v(x, t) = \int_0^l G(x, \xi; t)\, f(\xi)\, d\xi, \]

where

\[ G(x, \xi; t) = E(x - \xi;\, t) - E(x + \xi;\, t), \tag{14} \]

if we define the following function of two variables:

\[ E(p; t) = \frac{1}{l} \sum_{n=1}^{\infty} e^{-(n\pi/l)^2 t}\, \cos \frac{n\pi p}{l}. \tag{15} \]

The significance of the Green's function (14) is the heat flow generated by
a heat source of the intensity 1, applied during an infinitesimal time at
t = 0, and in the infinitesimal neighbourhood of the point x = ξ. Such a
heat flow would occur even if the body extended on both sides to infinity.
We can thus ask for the limiting value of the Green's function if l goes to
infinity. Accordingly, we will place the point ξ at the midpoint of the rod:
ξ = l/2, and put once more (as we have done in the problem of the vibrating
string):

\[ x = \frac{l}{2} + x_1. \]

Then the function we want to get becomes

\[ G = E(x_1; t) - E(x_1 + l; t), \]

letting l go to infinity. But then the sum on the right side of (15) becomes
more and more an integral; in the limit, as l grows to infinity, we obtain

\[ E(p; t) = \frac{1}{\pi} \int_0^{\infty} e^{-u^2 t}\, \cos up\; du. \]

The integrand is the real part of

\[ e^{-u^2 t + iup}. \]

The integration is thus reducible to the definite integral

\[ \int_0^{\infty} e^{-s^2}\, ds = \frac{\sqrt{\pi}}{2}, \]

and we obtain

\[ E(p; t) = \frac{1}{\sqrt{4\pi t}}\, e^{-p^2/4t}. \]

Thus the Green's function of the heat flow equation in one dimension for
an infinitely extended medium becomes

\[ G(x, \xi; t) = \frac{1}{\sqrt{4\pi t}}\, e^{-(x - \xi)^2/4t}. \tag{22} \]

This is the fundamental solution of the heat conduction equation, com-
parable to the solution −(4πr)⁻¹ in the case of the three-dimensional
potential equation.
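The approach of the finite-rod function (14)-(15) to the Gaussian (22) is very
rapid and can be seen numerically; the sketch below (added here, with assumed
values of l and t) compares the truncated sine-series form of the rod's Green's
function with the infinite-medium kernel:

    import numpy as np

    l, t = 40.0, 0.5
    xi = l/2.0                                     # source at the midpoint
    x = xi + np.linspace(-5.0, 5.0, 11)
    n = np.arange(1, 4001)[:, None]
    G_rod = np.sum(2.0/l * np.exp(-(n*np.pi/l)**2 * t)
                   * np.sin(n*np.pi*x/l) * np.sin(n*np.pi*xi/l), axis=0)
    G_inf = np.exp(-(x - xi)**2/(4*t)) / np.sqrt(4*np.pi*t)
    print(np.max(np.abs(G_rod - G_inf)))           # -> very small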
Problem 342. Define the Green's function of the heat flow equation by the
standard technique (making use of the adjoint equation which does not coincide
with the original one) and show that the Green's function G(x, ξ; t, τ) of the
problem (7), (9), (10) is in the following relation to the Green's function (14):

\[ G(x, \xi; t, \tau) = G(x, \xi; t - \tau). \]
Problem 343. Change the boundary conditions (9) to

\[ v_x(0, t) = v_x(l, t) = 0, \]

which hold if the two ends of the rod are insulated against heat losses.

a) Show that any solution of the heat flow equation under these boundary
conditions satisfies the condition

\[ \frac{d}{dt} \int_0^l v(x, t)\, dx = 0. \]

b) Show that, as t increases to infinity, we get in the limit

\[ \lim_{t \to \infty} v(x, t) = c_0, \]

where

\[ c_0 = \frac{1}{l} \int_0^l f(x)\, dx. \]

c) Demonstrate that the solution (22), if integrated with respect to x between
±∞, gives 1.
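Parts a) and b) can be illustrated with the cosine eigenfunctions that belong
to insulated ends (an added sketch under assumed initial data):

    import numpy as np

    l, N = 1.0, 200
    x = np.linspace(0.0, l, 2001); dx = x[1] - x[0]
    f = (x < 0.3).astype(float)                    # assumed initial temperature
    n = np.arange(1, N)[:, None]

    c0 = np.sum(f)*dx / l                          # the mean value of f
    cn = (2.0/l)*np.sum(f*np.cos(n*np.pi*x/l), axis=1)*dx

    def v(t):   # cosine series: eigenfunctions of -d^2/dx^2 with v'(0)=v'(l)=0
        decay = np.exp(-(n*np.pi/l)**2 * t)
        return c0 + np.sum(cn[:, None]*decay*np.cos(n*np.pi*x/l), axis=0)

    for t in (0.0, 0.01, 0.1, 1.0):
        print(t, round(float(np.sum(v(t))*dx), 4))  # integral of v is conserved

The printed integral stays at the same value for every t, while the profile
itself flattens toward the constant mean value c₀.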

8.10. Minimum problems with constraints


Many of the previously considered boundary value problems were
characterised by self-adjoint operators (and boundary conditions) and were
thus derivable by minimising a certain integral. This is particularly the
case if the equilibrium position of a mechanical system is to be found,
composed of continuously distributed masses. The potential energy of
such a system is given by a definite integral, extended over the domain of
the masses. For example the equilibrium of a stretched membrane requires
the minimisation of the integral

\[ \iint \Bigl[ \Bigl(\frac{\partial v}{\partial x}\Bigr)^2 + \Bigl(\frac{\partial v}{\partial y}\Bigr)^2 \Bigr]\, dx\, dy, \tag{1} \]

with prescribed boundary values, determined by the closed space curve which
terminates the membrane. We will consider the simple case of a plane
circular membrane of unit radius whose frame is kept at the constant
distance a from the horizontal plane. Here the solution

\[ v = a \]

makes the potential energy a minimum, namely zero.


We will now restrict the movability of the membrane by a vertical peg
of circular cross-section which pins the membrane down to the horizontal
plane. Let ε be the radius of the peg and let us make ε smaller and smaller.
We then approach the limit in which only one point of the membrane is
pinned down. For the sake of simplicity we assume that the peg is centrally
applied.
Our problem is a "minimum problem with constraints" since the potential
SBC. 8.10 MINIMUM PROBLEMS WITH CONSTRAINTS 473

energy has to be minimised under the auxiliary condition that v(r, 6) is


fixed not only at r = I but also at r = e. The differential equation of the
free membrane (4.20)

will hold between r = [e, 1] while between r = 0 and e we have to put

where the right side is proportional to the force density required for the
pinning down of the membrane.
Now we have seen that the only possible solution of (3) under circular
symmetry is

and since the constraints demand the two conditions

we obtain the solution

Let us now multiply the inhomogeneous equation (4) by the area element
2irrdr and integrate between 0 and e. We then obtain

The quantity on the left side is proportional to the total force required for
pinning the membrane down. We see that this force is becoming smaller
and smaller as the radius ε of the peg decreases. At the same time the
solution (7) shows that the indentation caused by the peg becomes more and
more local, since for very small ε, v(r) becomes practically v = a, except in the
immediate neighbourhood of r = ε. In the limit, as ε recedes to zero, we
obtain the following solution: the membrane is everywhere horizontal but
it is pinned down at r = 0. This means that v(r) assumes everywhere the
constant value v = a, except at r = 0, where v(0) = 0.
While this solution exists as a limit, it is not a legitimate solution because
it is a function which cannot be differentiated at r = 0 and hence does not
belong to that class of functions which are demanded in the process of
minimising the integral (1). We thus encounter the peculiar situation that
if we require the minimisation of the integral (1) with the boundary con-
dition v(1) = a and the inside condition

\[ v(0) = 0, \]

this problem has no solution. We can make the given integral as small
as we wish but not zero, and thus no definite minimum can be found under the
given conditions.
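The rate at which the infimum sinks to zero follows from (7) and (8): the
energy of the solution (7) is of the order 1/log(1/ε). A few numbers (an added
sketch; the factor 2π corresponds to the integral (1) as written above):

    import numpy as np

    a = 1.0
    for eps in (1e-1, 1e-3, 1e-6, 1e-12):
        # Dirichlet energy of v(r) = a*log(r/eps)/log(1/eps) on eps <= r <= 1
        E = 2*np.pi*a**2 / np.log(1.0/eps)
        print(eps, E)
    # E -> 0 as eps -> 0, yet no admissible function attains the value zero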
The situation is quite different, however, if the membrane is not pinned
down at a point but along a line. A line has the same dimensionality as the
boundary and a constraint along a line can in fact be considered as a boundary
condition, if we add the line of constraint to the outer boundary. Constraints
of this type are of frequent occurrence in physical problems. We may
consider for example a three-dimensional flow problem of a fluid which is
forced to flow around an obstacle which is given in the form of a surface.
Or we may have a problem in electrostatics in which the potential along an
inner surface is given as zero, since the surface is earthed. Again, in a
diffraction problem which requires the solution of the differential equation

\[ \Delta F + k^2 F = 0, \]

the function F along an absorbing screen may be prescribed as zero.


In problems of this type it is often desirable to consider the added
conditions as constraints, rather than as parts of the boundary conditions.
But then we have to take into consideration that the maintaining of a con-
straint demands an external force—corresponding to the appearance of a
"right side" of the given differential equation—and this is equivalent to
saying that at the points of constraint the differential equation will be violated.
The differential equation now to be solved can be written in the form

\[ \Delta v = \beta(x), \tag{12} \]

where β(x) is not zero in the domain of the constraint and the symbolic
notation x refers again to an arbitrary point of an n-dimensional manifold.
The solution of our new problem will be once more a function v₀(x) which
satisfies the given inhomogeneous boundary data, together with the homo-
geneous differential equation, but to this solution we now have to add the
solution of the inhomogeneous equation (12), with homogeneous boundary
conditions. This can be done in terms of the Green's function of our problem
and thus the complete solution of our problem can be given as follows:

\[ v(x) = v_0(x) + \int_{\tau} G(x, \xi)\, \beta(\xi)\, d\xi. \tag{13} \]

But now we must make use of the fact that the given constraint exists in a
certain sub-domain of our space, more precisely on a given inner surface
which we want to denote by σ′, in distinction to the boundary surface σ.
Accordingly we have to rewrite the equation (13) as follows:

\[ v(x) = v_0(x) + \int_{\sigma'} G(x, \sigma')\, \beta(\sigma')\, d\sigma'. \tag{14} \]

The quantity β(σ′) is proportional to the density of the surface force which
is required for the fulfilment of the given constraint. But this force is not a
given quantity. What is given is the constraint, which demands that v(x)
becomes zero on the surface σ′, or more generally that v(x) becomes some
prescribed function on this surface. The force needed for the maintenance
of this condition adjusts itself in such a way that the constraint is satisfied.
Let us express this physical situation in mathematical terms. We will
denote by s′ an arbitrarily selected point of the inner surface σ′. Then
our constraint demands that the following equation shall be satisfied:

\[ v(s') = v_0(s') + \int_{\sigma'} G(s', \sigma')\, \beta(\sigma')\, d\sigma'. \tag{15} \]

The peculiar feature of this equation—called an "integral equation"—is that
it is not β(s′) that is given to us (in order to obtain v(s′) by the process of
integration), but β(σ′) is the unknown function and the left side v(s′) is the
given function (together with v₀(s′) which can be transferred to the left side
and combined with v(s′)). The function G(s′, σ′) is called the "kernel" of
the integral equation. The general form of an integral equation is thus

\[ \int K(x, \xi)\, f(\xi)\, d\xi = g(x), \tag{16} \]

where f(ξ) is the unknown and g(x) the given function.


Generally integration is a smoothing operation and the function on the
left and on the right side of an integral equation cannot belong to the same
class of functions. If the kernel function is everywhere bounded, the result
of the integration on the left side of (16) is that we obtain a continuous and
even differentiable function. Hence the given function g(x) must be pre-
scribed as a function of sufficient regularity to make the integral equation
solvable. But in the potential problem discussed in the beginning of this
section the situation is quite different. Our kernel K(x, ξ) is here the
Green's function G(x, ξ) of the potential equation which goes out of bound
at the point x = ξ. For example in two dimensions the Green's function
goes to infinity at the critical point with the strength log r_xξ, in three
dimensions with the strength r_xξ⁻¹. Furthermore, the integration is
restricted to a lower dimensional manifold—in two dimensions to a curve,
in three dimensions to a surface—which increases the strength of the
singularity. Integral equations with such kernels are called "singular
integral equations". The singularity of the kernel counterbalances the
usual discrepancy which exists between the smoothness of the functions
f(x) and g(x).
Problem 344. Consider the minimum problem (1) in three dimensions, with
the boundary condition v = a on the sphere r = 1, and the added constraint
v(0) = 0. Show that the minimum can be made as small as we wish, but not
zero.
[Answer:
Choose

\[ v(r) = a\, \frac{1 - \varepsilon/r}{1 - \varepsilon} \quad (\varepsilon \le r \le 1), \qquad v(r) = 0 \quad (r < \varepsilon). \]

Then

\[ \int (\nabla v)^2\, d\tau = \frac{4\pi a^2\, \varepsilon}{1 - \varepsilon} \to 0 \qquad (\varepsilon \to 0). \;] \]
8.11. Integral equations in the service of boundary value problems


The method of the "separation of variables" is of eminent theoretical
importance since it yields almost all the fundamental function classes of
mathematical physics. As a tool for solving boundary value problems it is
of limited applicability because it is restricted to boundaries of simple shape.
Moreover, even in the case of boundaries of high symmetry the kind of
boundary conditions prescribed must be of considerable simplicity. For
example in the case of the potential equation we have succeeded in solving
the boundary value problem of the first and the second kind. But if we
consider the more general boundary condition (5.45), we do not succeed
with the separation in polar coordinates because we do not obtain any
explicit expression for the determination of the expansion coefficients.
Under these circumstances it is of great advantage that the theory of
"integral equations" can help solve boundary value problems. Any
boundary value problem can be formulated as an integral equation and thus
the methods for solving integral equations are applicable to the solution of
boundary value problems.
We have seen in the previous section how certain constraints on an inner
boundary could be solved in two ways. We could extend the outer boundary
by the inner boundary and consider the entire problem as one single boundary
value problem; the constraints then appear as boundary data. But we
could also use a more direct approach, replacing the given constraints on the
inner boundary by an integral equation. We can now extend this method
to the outer boundary and reduce our entire problem to the solution of an
integral equation.
We will illustrate the method by the example of the potential equation,
although it is equally applicable to boundary value problems of the parabolic
or hyperbolic type. Let our problem be the solution of the Laplacian
equation in a given closed domain. It
would be enough to prescribe the boundary values of the potential function,
or the normal derivatives, but we want to over-determine the problem by
giving both sets of values. This cannot be done freely, of course. If we
prescribe improper boundary values, the forces of constraint will come into
operation, alter the differential equation in the vicinity of the boundary,
and we shall not obtain what we want. If, however, the prescribed sur-
plus data are the correct data, we have done no harm to the given problem.
The question is merely from where to take these surplus data, but we will find
a solution to this problem.
The giving of these surplus data has the following fortunate effect. We
can solve our problem in terms of the Green's function. But in the original
formulation the construction of the Green's function is not an easy task.
We have that part of the Green's function which becomes singular at the
point x = ξ, but to this part we had to add a regular solution of the potential
equation (cf. (4.47-49)), chosen in such a way that on the boundary σ
the required homogeneous boundary conditions of the Green's function are
satisfied. This is now quite different in our present problem. The adjoint
equation—which defines the Green's function—is now strongly under-
determined and in fact we obtain no boundary conditions of any kind for the
function G(x, ξ). Hence we can choose the added function V(x) as any
solution of the potential equation, even as zero. Hence the over-determina-
tion of the problem has the fortunate consequence that the Green's function
can be explicitly given in the form of a simple power of the distance r_xξ
(times a constant), in particular in two dimensions

\[ G(x, \xi) = \frac{1}{2\pi}\, \log r_{x\xi}, \]

and in three dimensions

\[ G(x, \xi) = -\frac{1}{4\pi\, r_{x\xi}}. \]
Then the solution of our over-determined problem appears in the following
form (making use of the extended Green's identity (4.57), but generalised
to any boundary surface instead of a sphere):

\[ V(x) = \oint_{\sigma} \Bigl[ V(s)\, \frac{\partial G(x, s)}{\partial \nu} - G(x, s)\, \frac{\partial V(s)}{\partial \nu} \Bigr]\, d\sigma. \tag{3} \]

Now the fact that our data have been properly given has the following
consequence. If we approach with the inside point x any point s on the
boundary, we actually approach the given boundary value V(s). Hence we
can consider as the criterion of properly given boundary values that the
equation (3) remains valid even in the limit, when the point x coincides
with the boundary point s. Then we get the following integral relation,
valid for any point s of the boundary surface σ:

\[ V(s) = \oint_{\sigma} \Bigl[ V(\sigma)\, \frac{\partial G(s, \sigma)}{\partial \nu} - G(s, \sigma)\, \frac{\partial V(\sigma)}{\partial \nu} \Bigr]\, d\sigma. \tag{4} \]

Now, instead of using this relation as a check on the prescribed boundary
data, we can use it for the determination of the surplus data. Let us assume
for example that the data corresponding to the Neumann problem are given.
Then the second integral on the right side of (4) is at our disposal and we
obtain for the data V(s) on the boundary the following integral equation:

\[ V(s) - \oint_{\sigma} V(\sigma)\, \frac{\partial G(s, \sigma)}{\partial \nu}\, d\sigma = -\oint_{\sigma} G(s, \sigma)\, \frac{\partial V(\sigma)}{\partial \nu}\, d\sigma. \tag{5} \]

This integral equation is of the following general form:

\[ f(x) + \oint K(x, \xi)\, f(\xi)\, d\xi = g(x), \tag{6} \]

to be solved for f(x), with given g(x).
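For a bounded kernel, an equation of the form (6) can be attacked by
straightforward collocation: the integral is replaced by a quadrature sum, and
a finite linear system results. The sketch below is an added illustration with
an invented smooth kernel and manufactured data on an interval (the singular
boundary kernels of the text require more careful quadrature):

    import numpy as np

    M = 60
    s, w = np.polynomial.legendre.leggauss(M)
    s = 0.5*(s + 1.0); w = 0.5*w                  # quadrature nodes on [0, 1]

    K = lambda x, t: np.exp(-np.abs(x - t))       # an invented bounded kernel
    f_true = lambda t: np.sin(np.pi*t)            # manufactured "unknown"

    A = np.eye(M) + K(s[:, None], s[None, :])*w[None, :]   # f + K f = g
    g = A @ f_true(s)                              # synthesize the given data
    f = np.linalg.solve(A, g)                      # solve the collocated system
    print(np.max(np.abs(f - f_true(s))))           # recovers f at the nodes

The second-kind form, with the identity term present, is what keeps the
discretised system well conditioned.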



The general theory of integral equations of this type was developed by
I. Fredholm (1900) and subsequently the same subject gave rise to a very
extensive literature.* All the methods which have been designed for the
solution of the Fredholm type of integral equations are immediately
applicable to the solution of boundary value problems, under much more
general conditions than those under which a solution is obtainable by the
separation of variables, or by an explicit construction of the Green's function.
The basic method may be characterised as follows. It is often possible to
give the solution of the differential equation

if we do not demand any additional boundary conditions. This can be


achieved by over-determining the original problem by the addition of
surplus data. These data are obtained by solving an integral equation for
the boundary surface.
We add one more example by considering the general boundary value
problem (5.45) for an arbitrary boundary surface σ. For this purpose we
write the integral relation (4) as follows:

\[ V(s) = \oint_{\sigma} \Bigl[ V(\sigma)\, \frac{\partial G(s, \sigma)}{\partial \nu} - G(s, \sigma)\, \bigl( g(\sigma) - \gamma(\sigma)\, V(\sigma) \bigr) \Bigr]\, d\sigma, \]

or, putting the given data to the right side:

\[ V(s) - \oint_{\sigma} V(\sigma)\, \Bigl[ \frac{\partial G(s, \sigma)}{\partial \nu} + \gamma(\sigma)\, G(s, \sigma) \Bigr]\, d\sigma = -\oint_{\sigma} G(s, \sigma)\, g(\sigma)\, d\sigma. \]

This is once more a Fredholm type of integral equation, only the kernel
K(x, ξ) has changed, compared with the previous problem (5).
From the standpoint of obtaining an explicit solution in numerical terms
we may fare better if we avoid the solution of an integral equation whose
kernel goes out of bound at the point s = σ. The surplus data are also
obtainable by making use of the compatibility conditions which have to be
satisfied by our data. We then have a greater flexibility at our disposal
because the compatibility conditions appear in the form

\[ \oint_{\sigma} \Bigl[ u(\sigma)\, \frac{\partial V(\sigma)}{\partial \nu} - V(\sigma)\, \frac{\partial u(\sigma)}{\partial \nu} \Bigr]\, d\sigma = 0, \]

where u(σ) can be chosen as any function which satisfies the homogeneous
equation

\[ \Delta u = 0 \]

and is free of any singularities inside the given domain. We can once
more choose as our u(σ) the reciprocal distance 1/r_σx, provided that the fixed
point x is chosen as any point outside the boundary surface. By putting the
point x sufficiently near to the surface, yet not directly on the surface, we
* For a more thorough study of the theory of integral equations, cf. [8] and [11] of
the Chapter Bibliography.

avoid the singularity of the kernel and reduce the determination of the
surplus data numerically to the solution of a well-conditioned large-scale
system of ordinary linear equations.
Problem 345. Show on the basis of (3) that the potential function V(τ) is
everywhere inside the domain τ an analytical function of the rectangular
coordinates (x, y, z) (that is, the partial derivatives of all orders exist), although
the boundary values themselves need not be analytical.

8.12. The conservation laws of mechanics


The boundary values associated with a given homogeneous differential
equation are not always freely at our disposal. We have just seen that in a
potential problem, if we prescribe both the function and its normal derivative
on the boundary, we have strongly over-determined our problem and
accordingly we have to satisfy an infinity of compatibility conditions. But
even without over-determination our data may be subject to constraints.
We have seen for example that the data of the Neumann problem had to
satisfy the condition (5.38), which for an arbitrary boundary surface σ
becomes

\[ \oint_{\sigma} \frac{\partial V}{\partial \nu}\, d\sigma = 0. \]
Under all circumstances we have an unfailing method by which we can


decide how much or how little the given data are constrained. The decision
lies with the adjoint homogeneous equation (cf. Section 8.2, particularly
(2.18)). To every independent non-zero solution of the adjoint homo-
geneous equation belongs a definite compatibility condition, and vice versa,
these are all the compatibility conditions that our data have to satisfy. In
physical problems these compatibility conditions have frequently an
important significance, as they express the conservation of some physical
quantity. Of particularly fundamental importance are the conservation
laws of mechanics, which in the case of continuously distributed masses are
the consequence of certain compatibility conditions of partial differential
equations.
The physical state of such masses is characterised by a fundamental set
of quantities, called the "matter tensor". The components of this tensor
form a matrix and are thus characterised by two subscripts. We can
conceive the components of the matter tensor as an n × n matrix of the
n-dimensional space, whose components are continuous and differentiable
functions of the coordinates. We will denote this tensor by T_ik, with the
understanding that the subscripts i and k assume independently the values
1, 2 in two dimensions, 1, 2, 3 in three dimensions, and 1, 2, 3, 4 in four
dimensions (the fourth dimension is in close relation to the time t). This
matter tensor has two fundamental properties. First of all it has the
algebraic property that the components of the matter tensor form a matrix
which is symmetric:

\[ T_{ik} = T_{ki}, \tag{2} \]

and for this reason we call the matter tensor a "symmetric tensor". Hence
the number of independent components is reduced from n2 to n(n + l)/2,
which means in 2, 3, and 4 dimensions respectively 3, 6, and 10 independent
components. A further fundamental property of the matter tensor is that
its divergence vanishes at all points:

This represents a vectorial system of n homogeneous partial differential


equations of first order. Since only n equations are prescribed for
n(n + 1)/2 quantities, our system is obviously strongly under-determined,
and accordingly the adjoint system strongly over-determined. This,
however, does not mean that the adjoint system does not possess non-zero
solutions. In fact such solutions exist and each one of them yields a
condition between the boundary values which has an important physical
significance.
Making use of the usual technique of obtaining the adjoint equation by
multiplying the given system by an undetermined factor and then
"liberating" the original functions by the method of integrating by parts,
we now have to apply as undetermined factor a vector Φ_i of n components.
We obtain

\[ \int_{\tau} \Phi_i\, \frac{\partial T_{ik}}{\partial x_k}\, d\tau = \oint_{\sigma} \Phi_i\, T_{ik}\, \nu_k\, d\sigma - \frac{1}{2} \int_{\tau} \Bigl( \frac{\partial \Phi_i}{\partial x_k} + \frac{\partial \Phi_k}{\partial x_i} \Bigr)\, T_{ik}\, d\tau \]

(summation over repeated subscripts).
Hence the adjoint homogeneous equation becomes

\[ \frac{\partial \Phi_i}{\partial x_k} + \frac{\partial \Phi_k}{\partial x_i} = 0. \tag{5} \]

To every independent solution of this equation a condition between the
boundary values is obtained, of the form

\[ \oint_{\sigma} \Phi_i\, T_{ik}\, \nu_k\, d\sigma = 0. \tag{6} \]

The integration is extended over the boundary surface σ which encloses
the domain τ in which the equations (3) hold.
Now the equations (5) do not possess many solutions, as we can imagine
if we realise that the vector Φ_i of n components is subjected to n(n + 1)/2
conditions. First of all we have the solutions

\[ \Phi_i = \text{const.} \]

These solutions can be systematised by putting first Φ₁ = 1, all other
Φ_i = 0; then Φ₂ = 1, all other Φ_i = 0; . . .; finally Φ_n = 1, all other

Φ_i = 0. Accordingly the boundary conditions (6) for these special solutions
become

\[ \oint_{\sigma} T_{ik}\, \nu_k\, d\sigma = 0 \qquad (i = 1, 2, \ldots, n). \tag{8} \]
If the matter tensor were not symmetric, these would be all the adjoint
solutions, since in that case the operator on the left side of (5) would be
replaced by ∂Φ_i/∂x_k alone. The symmetry of the matter tensor has, how-
ever, another class of solutions in its wake. We choose an arbitrary pair
of subscripts, for example i and k, and put

\[ \Phi_i = x_k, \qquad \Phi_k = -x_i, \]

while all the other Φ_a are equated to zero. These solutions, whose total
number is n(n − 1)/2, give us an additional set of boundary conditions,
namely

\[ \oint_{\sigma} (x_k\, T_{ij} - x_i\, T_{kj})\, \nu_j\, d\sigma = 0. \tag{10} \]

The total number of independent boundary conditions is thus n(n + 1)/2.
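That these rotational factors really satisfy the adjoint equation is a one-line
check; the sketch below (added here, taking (5) in the symmetrised form
∂Φ_i/∂x_k + ∂Φ_k/∂x_i = 0 derived above) verifies it symbolically:

    import sympy as sp

    X = sp.symbols('x1 x2 x3')
    i, k = 0, 1                                   # an arbitrary pair of subscripts
    Phi = [sp.Integer(0)]*3
    Phi[i], Phi[k] = X[k], -X[i]                  # Phi_i = x_k, Phi_k = -x_i

    ok = all(sp.diff(Phi[a], X[b]) + sp.diff(Phi[b], X[a]) == 0
             for a in range(3) for b in range(3))
    print(ok)                                     # -> True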


We now come to the discussion of the physical significance of these
conditions. We begin with the case n = 3. This means that we consider
a physical system in equilibrium, because the omission of the fourth co-
ordinate means that everything is time-independent; in other words, the
masses are at rest. Let us now surround a mass at rest with the surface σ
which terminates the mass distribution, and let us apply to this surface
the condition (8). The coordinates x₁, x₂, x₃ have the significance of
rectangular coordinates, usually denoted by x, y, z. The components
ν₁, ν₂, ν₃ are the three components of the outward normal ν. Furthermore,
the vector

\[ F_i = T_{ik}\, \nu_k \]

has the following physical significance. In the field theoretical description
of events it represents the external force which is impressed from the outside
on the material body, per unit surface. The integration over the entire
surface represents accordingly the resulting force of all the forces which act
on the body, from the surrounding field.
Now the boundary condition

\[ \oint_{\sigma} F_i\, d\sigma = 0 \tag{12} \]

expresses the fact that a material body can be in equilibrium only if the
resultant of the external forces acting on it is zero.
Let us now turn to the second class of boundary conditions of the type
32—L.D.O.
482 BOUNDARY VALUE PBOBLBMS CHAP. 8

(10), which involve a pair of indices * and k. In three dimensions we have


only the three possible combinations 2,3; 3,1; 1,2. Moreover, the three
quantities

\[ M_1 = x_2 F_3 - x_3 F_2, \qquad M_2 = x_3 F_1 - x_1 F_3, \qquad M_3 = x_1 F_2 - x_2 F_1 \tag{13} \]

form the components of a vector, called the "moment" M of the force F.
Hence the second set of boundary conditions:

\[ \oint_\sigma (x_i F_k - x_k F_i)\, d\sigma = 0 \tag{14} \]

obtain the following physical significance:

\[ \oint_\sigma M_i\, d\sigma = 0 \tag{15} \]
which means that a material body can be in equilibrium only if the resultant
moment of the external forces acting on the body is zero. The conditions (12)
and (15) are fundamental in the statics of rigid or any other kind of bodies.
We have obtained them as the compatibility conditions of a partial differential
equation, namely the equation which expresses the divergence-free nature
of the matter tensor. Earlier, in Chapter 4.15, when dealing with an elastic
bar which is free at the two end-points, we found that the differential
equation of the elastic displacement was only solvable if two compatibility
conditions were satisfied: the sum of the forces and the sum of the moments
of the forces had to be zero. At that time we had an ordinary differential
equation of fourth order; now we have a system of three partial differential
equations of first order which leads in a more general setting to similar
compatibility conditions.
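For the bar those two conditions take a one-dimensional form. Writing p(x) for the load density (a notation chosen here for illustration; that of Chapter 4.15 may differ in detail), they read

\[ \int_0^l p(x)\, dx = 0, \qquad \int_0^l x\, p(x)\, dx = 0, \]

the vanishing of the resultant force and of the resultant moment of the load.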
We will now leave the realm of statics and enter the realm of dynamics.
Einstein in his celebrated "Theory of Relativity" has shown that space
and time belong inseparably together by forming a single manifold.
Minkowski demonstrated that the separation of the physical world into space
and time is purely accidental. All the equations of mathematical physics
can be written down in a form in which not merely the three space variables
x_1, x_2, x_3 (corresponding to the three rectangular coordinates x, y, z) play an
equivalent role, but these three coordinates are supplemented by the fourth
coordinate x_4, which in physical interpretation corresponds to the product
ict, where c is the velocity of light:

\[ x_4 = ict \tag{16} \]

The extension of the matter tensor to four dimensions introduces a fourth
row and column T_{i4} = T_{4i}, which means four new quantities. In view of
the imaginary character of x_4 we must assume that the three components
T_{i4} (i = 1, 2, 3) are purely imaginary, while T_{44} is real.
The change from statics to dynamics means that the divergence condition
of the matter tensor is extended from n = 3 to n = 4. Instead of
equilibrium conditions we shall now get principles which govern the motion
of material bodies. We shall have to interpret the four boundary conditions
of the type (8), and the 4 × 3/2 = 6 conditions of the type (10).
The operations in the space-time world of Relativity require some special
experience which we will not assume at this phase of our discussion. We
shall prefer to formulate our results in the usual fashion which separates
space and time, although in the basic equations they play a similar role.
In relativistic deductions the index i runs from 1 to 4, while we will restrict
i to the values 1, 2, 3, and write down separately the terms which belong
to the dimension "time" (in our formalism x_4). Similarly the Gaussian
integral transformation will be restricted to a volume of the ordinary three-
dimensional space and not to a four-dimensional volume. The "boundary
surface" σ thus remains a surface of our ordinary space, although now no
longer in equilibrium but in some form of motion.
Our fundamental equations are the four equations

\[ \frac{\partial T_{ik}}{\partial x_k} = 0 \qquad (i = 1, \ldots, 4) \tag{17} \]

with the six symmetry conditions

\[ T_{ik} = T_{ki} \tag{18} \]
First of all we consider the four conditions (8). In view of the added
terms we have to complement the previous surface integrals by further
integrals which are extended over the entire volume of the masses. We
will introduce the following four quantities, three of which correspond to the
three components of a vector and the fourth to a scalar:

Since we have integrated over the total volume of our domain, these four
quantities are no longer functions of x_1, x_2, x_3, but they are still functions of
x_4. Now the equation (12) appears in the following more general form:

to which we have to add a fourth equation in the form

Let us now remember that the first term of (20) was physically interpreted
as the "total force" exerted on the body by the outside forces. Since
Newton's law of motion states that "the time rate of change of the total
momentum is equal to the external force", we come in harmony with this
law if we interpret

as the "total momentum" contained in the volume T. But the surface


integral of the first term of (20) allows the interpretation that it is the
"momentum flux" through the boundary surface a and in this interpretation
the equation (20) can be conceived as the conservation law of momentum.
But then the equation (21) must also have the significance of a conservation
law since in Relativity a vector has four instead of three components and the
three equations (20) and the equation (21) form an inseparable unity. In
analogy to (11) we have to define as the "fourth component of the external
force" the quantity

and now write (21) in the form

And since it is shown in Relativity that momentum and energy go inseparably
together, we must interpret the equation (21) as the conservation law of
energy. Accordingly we must interpret the quantity E as the total energy
of the body (or the material system), while the first term of (21)—if
multiplied by ic—represents the energy flux through the boundary surface.
But now we make use of the fundamental symmetry of the matter tensor
which has the consequence that the components of the energy flux become
identical with the components of the momentum density—namely iT_4k—
multiplied by c². And, since the conservation of mass is expressed by exactly
the same equation, except that E is replaced by mc², we arrive at the
monumental identification of mass and energy, according to the celebrated
equation of Einstein:

\[ E = mc^2 \tag{25} \]
Up to now the "momentum" pi had no specific significance. We have


called it "momentum" but a motion law will only result if we succeed in
interpreting this quantity in kinematic terms. Now we still have six more
conservation laws, corresponding to the second set (9) of the solutions of the
adjoint homogeneous equation. We will in particular choose $4 = XK and
thus multiply the fourth equation of the system (17) by #&. Then this
equation may be written as follows:
Now we define the "centre of energy " or "centre of mass" of our mechanical
system by putting

For physical reasons the energy density ^44 has to be assumed as a


necessarily positive quantity at every point of the domain. Due to this
property of the energy (or mass) the centre of mass & is necessarily inside
the domain T.
If we integrate (26) over the volume τ, the second term becomes, in view
of (19), −p_k, while the last term becomes, on account of the definition (27)

We can get rid of the last term by subtracting the equation (21), multiplied
by ξ_k, thus obtaining

Finally, dividing the equation (29) by — ic, we obtain:

In this equation we recognise "Newton's first law of motion", applied to
an arbitrary mechanical system: "the total momentum of a mechanical
system is equal to the total mass, multiplied by the velocity of the centre
of mass of the system". Actually the last term adds a small correction
term in the form of an added momentum which is not of kinematic origin
but caused by the external field.
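In condensed form, a plausible rendering of the final law in the quantities p_k, E and ξ_k introduced above is

\[ p_k = \frac{E}{c^2}\, \frac{d\xi_k}{dt} + (\text{correction term of field origin}), \]

with E/c² playing the role of the total mass of the system.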
We still have three additional conservation laws which correspond to an
extension of the "law of moments" of statics. They correspond to the
choice (9) for the vector Φ_i, putting Φ_4 = 0, and choosing for i, k a pair of
space indices. The conservation law of momentum is thus complemented
by the "conservation of angular momentum" which leads to the following
extension of the law of moments: "the time rate of change of the angular
momentum is equal to the resulting total moment of the forces acting on the
system ". This is the fundamental dynamical law which governs the motion
of rotating bodies.
Here then are the ten fundamental laws of mechanics, which take their
origin (in the field theoretical description of matter) in the divergence-free
nature of the matter tensor with its associated solutions of the adjoint
homogeneous equation, giving rise to ten constraints between the boundary
values of the matter tensor. These constraints, expressed in physical
terms, give rise to the following ten conservation laws:
the three equations of the conservation of momentum (Newton's second
law of motion);
the one equation of the conservation of energy (which is also the con-
servation of mass);
the three equations of the conservation of angular momentum (Euler's
equations for rotating bodies);
and finally the three equations which give a kinematic interpretation of
momentum, in accordance with Newton's first law.
The fundamental laws of dynamics played a decisive role in the evolution
of physics, starting with Newton's particle mechanics in which the field
concept is not yet present, and culminating in Einstein's General Relativity,
in which the divergence-free quality of the matter tensor (interpreted in
terms of Riemannian geometry, instead of the flat space-time world of
Minkowski) is no longer an external postulate but an inevitable consequence
of the space-time structure of the physical universe.

8.13. Unconventional boundary value problems


In the previous sections we have studied some of the historically interesting
boundary value problems and discussed the analytical methods employed
for their solution. We shall now return to more fundamental questions and
investigate the theory of boundary value problems from a general standpoint.
Historically the differential operators of second order were classified into
the three types of elliptic, parabolic, and hyperbolic differential operators
and parallel with this classification went the prescription that elliptic
differential equations required peripheral boundary conditions, while the
parabolic and hyperbolic type of equations had to be characterised by initial
type of boundary conditions.
We shall now consider three plausible physical situations which seem to
allow a unique mathematical answer and yet do not satisfy the customary
conditions.
1. The cooling of a bar is observed. By an oversight the temperature
distribution of the bar was not recorded at the time moment t = 0 but at a
somewhat later time t = T. We should like to find by calculation what
the temperature distribution was at t = 0. Since there is a one-to-one
correspondence between v(x, 0) and v(x, T), it must be possible to restore
the first function by giving the second one. The problem does not fit the
conventional pattern since we have the heat-flow equation with an end-
condition instead of an initial condition; a numerical illustration of this
reversal follows the present list.
2. The vibrating string starts its motion at t = 0. Instead of giving
v(x, 0) and v_t(x, 0), we take two snapshots of the string at the time moments
t = 0 and t = T, obtaining v(x, 0) and v(x, T). If T is sufficiently small,
the difference v(x, T) − v(x, 0) cannot be far from Tv_t(x, 0) and our data
must be sufficient to restore the missing quantity vt(x, 0). But as a boundary
value problem we have violated the condition that a hyperbolic differential
equation should not be characterised by peripheral data.
3. The values of the potential V(x, y, z) are given on a very flat ellipsoid
σ. By calculation we have obtained V in the neighbourhood of the origin
x = y = z = 0, in the form of an infinite Taylor expansion which, however,
does not converge beyond a certain small radius r = ρ at which the sphere
r = ρ touches the ellipsoid. By an accident we have lost the original
boundary values whose knowledge is very precious to us. We want to
restore the original data from the given Taylor expansion. We know that
the solution exists and is in fact obtainable by the method of analytical
continuation. But considered as a boundary value problem we can say that
V and dV/dν are given on the inner boundary r = ρ, while no boundary
values are given on the outer boundary σ. These are initial type of boundary
conditions for an elliptic differential equation, in contradiction to the general
rules.
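The first of the three situations is easily made quantitative. The following sketch rests on purely illustrative assumptions (a bar of unit length and unit diffusivity with its ends held at zero temperature, so that the k-th mode decays by the factor e^{−(kπ)²t}); it shows that exact data at t = T restore v(x, 0) perfectly, while an arbitrarily small error in the data is amplified without bound in the high modes:

import numpy as np

# Backward heat conduction in Fourier space.  Illustrative assumptions:
# unit bar, diffusivity 1, ends held at zero, so that mode k decays by
# exp(-(k*pi)**2 * T) between t = 0 and t = T.
T = 0.05
k = np.arange(1, 21)                    # the first twenty modes
decay = np.exp(-(k * np.pi) ** 2 * T)

c0 = 1.0 / k**3                         # coefficients of a smooth v(x, 0)
cT = c0 * decay                         # coefficients of the observed v(x, T)

# With exact data the initial coefficients are restored perfectly:
assert np.allclose(cT / decay, c0)

# But an error of size eps in the data of v(x, T) grows by 1/decay:
eps = 1e-8
amplified = eps / decay
print(amplified[0])                     # harmless in the lowest mode
print(amplified[-1])                    # astronomically large for k = 20

The correspondence is thus one-to-one, exactly as asserted above, and yet it violates the continuity requirement formulated below as Hadamard's Condition C.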
From the standpoint of the general analytical theory we have the right
to ask what motivations are behind these prohibitions. The answer was
given by J. Hadamard who, in his celebrated "Lectures on the Cauchy
Problem" (cf. [5]), introduced the concept of a "well-posed" or "correctly
set" problem ("un probleme correctement pose"), by postulating certain
conditions that a properly formulated boundary value problem should
satisfy. The context of his discussions demonstrates that he considers
both under-determined and over-determined problems as not-well-posed.
In the under-determined case the solution is not unique, while in the over-
determined case the given data are not freely choosable but restricted by the
necessary compatibility conditions. Hence Hadamard's "well-posed"
problem represents in the language of algebra the case of an n x n linear
system with non-vanishing determinant which establishes a one-to-one
correspondence between the left side and the right side.
There is, however, a third condition demanded by Hadamard which has
no analogy in the algebraic situation. We will call this the Condition C:
"an arbitrarily small perturbation of the data should not cause a finite change
in the solution". It is this condition to which we have to pay particular
attention when dealing with the general theory of boundary value problems,
in which we abandon the restrictions which go with the special class of
"well-posed" problems.

8.14. The eigenvalue A = 0 as a limit point


In our general dealings with partial differential operators we came to the
conclusion that we do injustice to the nature of such an operator if we try
to impose on it the n x n condition. Generally the function on which the
operator operates and the result of the operation belong to completely
different manifolds and the condition of a one-to-one correspondence between
left side and right side is not satisfied. But a deeper analysis revealed that
in proper interpretation every linear operator establishes a one-to-one corre-
spondence between the U-space in which it is activated and the V-space in
which it is activated. If we do not move out of the space of activation of
the operator (the "eigenspace" associated with the operator), we do not
observe anything that could give rise to something "not-well-posed". The
condition is merely that both solution and given right side shall belong to
the eigenspace of the operator. If this condition is satisfied, the relation
between right and left sides is unique and one-to-one.
In Chapters 4 and 5 we have seen numerous examples for under-determined
and over-determined systems and the manner in which these systems
subordinate themselves to the general theory. It will thus be of consider-
able interest to ask: what happens if we depart from the customary type of
"well-posed" boundary value problems and assume data which do not
harmonise with the traditional prescriptions? From the very beginning it
has been our policy to consider the boundary conditions as an integrating
part of the operator. The actual numerical values of the boundary data are
of no significance as far as the operator goes—just as the "right side" of a
differential equation does not belong to the operator—but the question is:
what kind of boundary data are given? Hence it is the left side of the
boundary conditions which are integrating parts of the operator, and
changing these left sides also changes our operator profoundly. Hence the
same differential operator, once complemented by peripheral and once by
the initial type of boundary conditions, represents in fact two completely
different operators. Yet so far as the general theory is concerned, we can
see no reason why the one operator should be less amenable to the applica-
tion of the general principles than the other.
Let us recall briefly the main features of this theory, in order to see
whether or not it can serve as a sufficiently broad basis if we venture out into
the field of non-conventional boundary conditions. Our basic departure
point was the "shifted eigenvalue problem" (5.26.1) which led to the
following decomposition of the operator D into eigenfunctions:

\[ D(x, \xi) = \sum_i \lambda_i\, u_i(x)\, v_i(\xi) \tag{1} \]

This is a purely symbolic equation which has no direct significance since the
right side represents a necessarily divergent infinite sum. But the signifi-
cance of this sum was that the operation Dv(x) could be obtained with the
help of the following integral operation:

\[ Dv(x) = \sum_i \lambda_i\, u_i(x) \int v(\xi)\, v_i(\xi)\, d\xi \tag{2} \]

This is a meaningful operation since on the right side we can integrate
term by term. The resulting new sum is no longer meaningless if v(x)
belongs to the "permissible" class of functions which are sufficiently regular
and differentiable that the operator D can actually operate on them. In
that case the smallness of the definite integral

\[ c_i = \int v(\xi)\, v_i(\xi)\, d\xi \tag{3} \]

more than compensates for the largeness of λ_i and even λ_i c_i converges to
zero. The resulting sum

\[ u(x) = \sum_i \lambda_i c_i\, u_i(x) \tag{4} \]
converges and represents the function u(x) which came about as the result
of the operation Dv(x).
The sum (1) finds its natural counterpart in another infinite sum which
represents the eigenfunction decomposition of the inverse operator:

\[ D^{-1}(x, \xi) = \sum_i \frac{v_i(x)\, u_i(\xi)}{\lambda_i} \tag{5} \]

This too is a sum which need not have an immediate significance. What
we mean by it is once more that this sum operates in the sense of a term by
term integration on the function u(ξ), in order to obtain v(x):

\[ v(x) = \sum_i \frac{v_i(x)}{\lambda_i} \int u(\xi)\, u_i(\xi)\, d\xi \tag{6} \]
We can thus go from the left to the right, starting with v(x) and obtaining
u(x) on the basis of the operation (2), or we can go from the right to the
left, obtaining v(x) on the basis of the operation (6). So far as the analytical
theory of linear differential operators is concerned, both operations are of
equal interest, although usually we consider only the second operation, if our
task is to "solve" the given differential equation (with the given boundary
conditions).
The operator D^{-1} is much nearer to an actual value than D itself. In
many problems the infinite sum (5) converges and defines a definite function
of the two points x and ξ: the "Green's function" of the problem. But even
if the sum (5) did not converge in itself, we could arrive at the Green's
function by a proper limit process.
This general exposition has to be complemented by the remark that we
have omitted from our expansions the eigenvalue λ = 0. The significance
of the zero eigenvalue was that certain dimensions of the function space
were not represented in the operator and exactly for this reason the omission
of these axes was justified. However, the zero axes of the U-space were
not immaterial. We had to check our data concerning their orthogonality
with respect to these axes since otherwise our problem was self-contradictory
and thus unsolvable.
What would happen now to this theory if we applied it to the case of
boundary data which in the customary sense are injudiciously chosen?
Will the fundamental eigenvalue problem (5.26.1) go out of action? We
have seen that neither under-determination nor over-determination can
interfere with the shifted eigenvalue problem since the fortunate circum-
stance prevails that the more over-determined the original operator D is, the
more under-determined becomes the adjoint operator D̃—and vice versa—
balancing completely in the final system. Hence it seems hardly possible
that an injudicious choice of our boundary data could put our eigenvalue
problem out of action, and indeed this is not the case. In every one of the
"ill-posed" problems mentioned above, the associated eigenvalue problem
is solvable, and provides the necessary and sufficient system of eigenfunctions
for expansion purposes. And yet it so happens that the Green's function in
the ordinary sense of the word does not exist in any one of these problems.
For example in the problem of the cooling bar with given end-condition the
definition of the Green's function requires a heat source at a certain point
x = ξ, t = τ with the added condition that the temperature distribution shall
become zero at a time t = T which is beyond the time t = τ. Such a function
does not exist. Nor does the Green's function exist in any of the other
ill-posed problems. In fact, if the Green's function did exist, we could
solve our problem in the conventional manner, and there would be no
chance of choosing our boundary values injudiciously.
If we examine the sum (5) closer, we observe that here too the eigenvalue
spectrum reveals a danger spot. The infinite sum (4) could not converge
because the eigenvalues λ_i increase to infinity. Now the same danger that
λ = ∞ represents for the sum (4), is represented by the value λ = 0 for the
sum (5). It is true that division by zero cannot occur since we have excluded
the eigenvalue λ = 0 from our eigenvalue spectrum (being non-existent so
far as the given operator is concerned). But we have to envisage the
possibility that λ = 0 may be a limit point of the eigenvalue spectrum.
This means that although λ = 0 is excluded as an eigenvalue, we may have
an infinity of λ_i which come to zero as near as we wish. If this is the case,
the sum (5) cannot converge under any circumstances, and the non-existence
of the Green's function is explained. For example, if the λ-spectrum
contains a set of numbers which follow the law

where k is an integer which increases to infinity, our eigenvalue spectrum
remains discrete and positive, but our spectrum has a "point of condensa-
tion" or "limit point" at λ = 0, although λ = 0 is never reached exactly.
We will call a spectrum of this kind—which is the characteristic feature of
all "ill-posed" types of boundary value problems which violate the
"Condition C" (cf. Section 13) of Hadamard—a "parasitic spectrum". It
is characterised by an infinity of discrete eigenvalues which do not collapse
into zero but come arbitrarily near to zero, and thus "crowd" around zero
in a parasitic way.
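As a concrete specimen (one illustrative choice among many), the law

\[ \lambda_k = e^{-k} \qquad (k = 1, 2, 3, \ldots) \]

produces strictly positive, mutually distinct eigenvalues, infinitely many of which lie in any interval (0, ε).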
It is this parasitic spectrum which distinguishes the non-traditional type
of boundary value problems from the traditional ones. As far as the
eigenfunctions go, they represent once more a complete function system
within the activated space of the operator. But the eigenvalues show the
peculiarity that they fall in two categories. We can start from a certain
λ_1 > ε where ε may be chosen as small as we wish, and now arrange our
eigenvalues in a sequence of increasing magnitude:

\[ \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \tag{8} \]
In the boundary value problems of the conventional type we exhaust the
entire λ-spectrum by this procedure. In a non-conventional type of problem,
however, a second infinite set of eigenvalues remains which has to be
arranged in decreasing order:

\[ \lambda'_1 \ge \lambda'_2 \ge \lambda'_3 \ge \cdots \qquad (\lambda'_i \to 0) \tag{9} \]
These eigenvalues—denoted by λ'_i in order to distinguish them from the
regular eigenvalues λ_i of the normal spectrum—cause the non-existence of
the Green's function because the infinite sum

\[ \sum_i \frac{v'_i(x)\, u'_i(\xi)}{\lambda'_i} \tag{10} \]

cannot converge, exactly as the sum (4) could not converge on account of
the limit point λ_i = ∞ of the regular spectrum.
But here again the divergence of the sum (10) does not mean that the
solution (6) has to go out of bound. The substitution of a permissible
function in (2) had the consequence that the right side of (4) approached a
definite limit which was u(x). Now, if we go backward by starting with
u(x) as the given function, we shall obtain the right side of (5) as a con-
vergent sum because the expansion coefficients

\[ \gamma'_i = \int u(\xi)\, u'_i(\xi)\, d\xi \tag{11} \]

become sufficiently small to compensate for the smallness of the denominator
λ'_i, and even γ'_i/λ'_i converges to zero.
We see that now the given right side u(x) cannot be chosen freely from the
class of sectionally continuous functions of bounded variation but has to be
submitted to more stringent constraints, in order to make the solution v(x)
possible. This, however, cannot be considered as objectionable since we
have accepted the fact that our data may have to satisfy some given
constraints. They had to be strictly orthogonal to all the zero-solutions of
the adjoint equation and such solutions may be present in infinite number
if the eigenvalue λ = 0 has infinite multiplicity. Now our requirements are
less stringent. We do not demand that the expansion coefficients of u(x) in
the direction of the eigenfunctions u'_i(x) shall vanish. It suffices if they
are sufficiently small, in order to make the sum

\[ \sum_i \frac{\gamma'_i}{\lambda'_i}\, v'_i(x) \tag{12} \]

convergent at all points x of our domain.
The following treatment of our problem is then possible. We separate
the parasitic spectrum (9) from the regular spectrum (8). As far as the
regular spectrum goes, we can obtain our solution in the usual fashion:

\[ v_1(x) = \sum_i \frac{\beta_i}{\lambda_i}\, v_i(x) \tag{13} \]

with

\[ \beta_i = \int u(\xi)\, u_i(\xi)\, d\xi \tag{14} \]

In fact, for this part of the solution even a Green's function can be con-
structed and we can put

\[ G(x, \xi) = \sum_i \frac{v_i(x)\, u_i(\xi)}{\lambda_i} \tag{15} \]
because the sum (5), extended only over the eigenfunctions v_i(x), even if it
does not converge immediately, will converge after an arbitrarily small
smoothing.
We now come to the parasitic spectrum for which a solution in terms of a
Green's function is not possible. Here the sum

\[ v_2(x) = \sum_i \frac{\gamma'_i}{\lambda'_i}\, v'_i(x) \tag{16} \]

has to remain in the form of a sum and we have to require the convergence
of this sum at all points x of our domain. This implies—since v(x) must be
quadratically integrable—that we should have

\[ \sum_i \Bigl( \frac{\gamma'_i}{\lambda'_i} \Bigr)^2 < \infty \tag{17} \]
as a necessary condition to which our data have to be submitted. This
condition need not be sufficient, however, to insure pointwise convergence
of the sum (16) and we may have to ask the fulfilment of the further
condition

But these conditions are much milder than strict orthogonality to the
parasitic spectrum which would make the right sides of (17) and (18) not
finite but zero.
The resulting solution v(x) of our problem is now the sum of the contribu-
tions of the regular and the parasitic spectrum:

\[ v(x) = v_1(x) + v_2(x) \]
Problem 346. Consider the problem of the cooling bar (9.9-10), but replacing
the initial condition (9.7) by the end condition

\[ v(x, T) = F(x) \]
Obtain the compatibility condition of the function F(x), on the basis of the
Fourier expansion (9.11).
[Answer:

where
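In outline, under the assumption that (9.11) has the form v(x, t) = Σ_k c_k e^{−k²t} sin kx with the bar length normalised to π, the condition demands that

\[ F(x) = \sum_k b_k \sin kx, \qquad c_k = b_k\, e^{k^2 T}, \]

with the c_k remaining the coefficients of a permissible function; that is, the Fourier coefficients b_k of F(x) must decay at least as fast as e^{−k²T}.]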

Problem 347. The Cauchy-Riemann differential equations (4.3) can be
written in the form of a single equation for the complex function u + iv = f(z):

\[ \frac{\partial f}{\partial x} + i\, \frac{\partial f}{\partial y} = 0 \tag{25} \]

Transform this equation to polar coordinates (r, θ) and assume that f(1, θ) is
given as a (complex) function of θ:

\[ f(1, \theta) = \varphi(\theta) \]

Given the further information that f(z) is analytical between the circles r = 1
and r = R. Find the compatibility condition to be satisfied by φ(θ).
[Answer:

where
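A plausible shape of the condition, assuming the Fourier representation φ(θ) = Σ_n a_n e^{inθ} of the boundary values: since f(z) = Σ_n a_n z^n must converge throughout the annulus, the coefficients of positive index must decay geometrically,

\[ |a_n| = O(R^{-n}) \qquad (n \to +\infty), \]

while the coefficients of negative index remain essentially unrestricted.]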

Problem 348. Given the values of the two-dimensional potential function
V(r, θ) and its normal derivative on the unit circle:

V(1, θ) and ∂V/∂r (1, θ)

Find again the compatibility condition of this problem under the same
assumptions as those of the previous problem. Interpret the result in terms of
the Cauchy-Riemann equations (25).
[Answer:

where
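In the same tentative outline as before: with the harmonic expansion V(r, θ) = Σ_{n≠0} (a_n r^{|n|} + b_n r^{−|n|}) e^{inθ}, the data on the unit circle determine

\[ a_n = \frac{1}{2}\Bigl( \hat{V}_n + \frac{1}{|n|}\, \hat{W}_n \Bigr), \]

where V̂_n and Ŵ_n denote the n-th Fourier coefficients of the given function and of its normal derivative. The continuation out to r = R again requires |a_n| = O(R^{−|n|}); since V is the real part of an analytical f(z), this is the same geometric decay as in the previous problem, in agreement with the Cauchy-Riemann equations (25).]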
8.15. Variational motivation of the parasitic spectrum

The early masters of calculus assumed that the initial values (8.7.26) of
the problem of the vibrating string have to be prescribed as analytical
functions of x. They were led to this assumption by the decomposition of
the motion into eigenfunctions which are all analytical and thus any linear
combination of them is likewise analytical. Later the exact limit theory
of Cauchy revealed the flaw in this argument which comes from the fact that
an infinite series, composed of analytical terms, can in the limit approach a
function which is non-analytical. But exactly the same argument can also
be interpreted in the sense that a non-analytical function can be replaced
by an analytical function with an error which can be made at each point
as small as we wish. Hence we would think that the diiference between
demanding analytical or non-analytical boundary values cannot be of too
great importance. And yet, the decisive difference between the well-posed
and ill-posed type of boundary value problems lies exactly in the question
whether the nature of the given problem allows non-analytical data, or
demands analytical data. An analytical function is characterised by a very
high degree of consistency, inasmuch as the knowledge of the function
along an arbitrarily small arc uniquely determines the course of the function
along the large arc, while a non-analytical function may change its course
capriciously any number of times. But then, if an analytical function has
such a high degree of predictability, we recognise at once that a boundary
value problem which requires analytical data will be automatically over-
determined to an infinite degree, and we can expect that conditions will
prevail which deviate radically from the "well-posed" type of problems
whose data need not be prescribed with such a high degree of regularity.
In this section we shall show that the parasitic spectrum comes into existence
automatically in problems of this type.
As a starting point we will consider the first unusual boundary value
problem listed in Section 13, the cooling of a bar whose temperature distribu-
tion has been observed at the time moment t = T, while our aim is to find
by calculation what the temperature distribution was at the earlier time
moment t = 0. We realise, of course, that the function v(x, T) = F(x) is by
no means freely at our disposal. But we can assume that F(x) is given to
us as the result of measurements and we have the right to idealise the physical
situation by postulating that our recording instrument provides the course
of F(x) free of errors, to any degree of accuracy we want. Hence the
compatibility of our data is assured in advance. We have given the function
v(x, T) which has developed from an initially given non-analytical but
permissible temperature distribution v(x, 0) = f(x). If we can obtain
v(x, T) from v(x, 0), we must also be able to obtain v(x, 0) from v(x, T).
And in fact the solution (9.11) is reversible. By obtaining the coefficients
c_i from the given initial distribution we could obtain v(x, t) at any later time
moment. But if we start with v(x, T), the expansion coefficients of this
function will give us c_i multiplied by an exponential function and thus the
coefficients c_i themselves require a multiplication by the same exponential
function, but changing the sign of the exponent to the opposite. The
original f(x) thus becomes

This sum would diverge, of course, if we had started with the wrong data,
but it remains convergent if F(x) has been properly given.
We will now investigate the eigenvalue spectrum associated with our
problem. This problem can be conceived as the solution of a minimum
problem. The shifted eigenvalue problem

\[ Dv = \lambda u, \qquad \tilde{D}u = \lambda v \tag{2} \]

yields for v alone the differential equation

\[ \tilde{D}Dv = \lambda^2 v \tag{3} \]

and for u alone the differential equation

\[ D\tilde{D}u = \lambda^2 u \tag{4} \]
The differential equation (3) can be conceived as the solution of the
following minimum problem. Minimise the positive definite variational
integral

\[ Q = \int (Dv)^2\, d\tau \tag{5} \]

with the auxiliary condition

\[ \int v^2\, d\tau = 1 \tag{6} \]

Similarly the differential equation (4) is derivable from the variational
integral

\[ \tilde{Q} = \int (\tilde{D}u)^2\, d\tau \tag{7} \]

with the auxiliary condition

\[ \int u^2\, d\tau = 1 \tag{8} \]
The eigenvalue λ² has the following striking significance: it is equal to the
value of the variational integral if we substitute in it the solution of the
variational problem. The eigenvalue λ² is thus the minimum itself,
obtained by evaluating the integral Q (resp. Q̃) for the actual solution of the
given variational problem.
We will ask in particular for the smallest possible minimum of Q,
respectively Q̃; then we will obtain the smallest eigenvalue with which our
eigenvalue spectrum starts. Since both problems (5) and (7) lead to the
same eigenvalue λ_1², we can obtain λ_1² in two different ways:

\[ \lambda_1^2 = \min Q = \min \tilde{Q} \tag{9} \]
This reasoning would not hold in the case where the minimum is zero
because Dv(x) = 0, or D̃u(x) = 0 may have non-vanishing solutions, although
the other equation may have no such solution. We assume, however, that
λ = 0 is not included in the eigenvalue spectrum.
This condition is satisfied in our cooling problem. The boundary condition
for v(x, t) is

\[ v(x, T) = 0 \tag{10} \]

and no regular solution of the heat flow equation exists which would give a
uniformly vanishing solution at t = T, without vanishing identically. The
same can be said of the adjoint equation D̃u = 0, under the boundary
condition

\[ u(x, 0) = 0 \tag{11} \]
But the analytical nature of the heat flow equation for any t > 0 allows a
much more sweeping conclusion. Let us assume that v(x, T) is not given
along the entire rod between x = 0 and x = l, but only on a part of the rod.
Then the corresponding boundary condition (10) will now involve only the
range x = [0, l − ε]. Yet even that is enough for the conclusion that the
homogeneous equation has no non-vanishing solution, because an analytical
function must vanish identically if it vanishes on an arbitrarily small arc.
Then our minimum problem requires that we shall minimise the integral
(5) (with the auxiliary condition (6)), under the boundary condition

\[ v(x, T) = 0 \qquad (0 \le x \le l - \varepsilon) \tag{12} \]
This condition is less stringent than the previous condition (10) which required
the vanishing of v(x, T) for the complete range of x. This greater freedom
in choosing our function v(x, t) must give us a better minimum than before;
that is, the smallest eigenvalue must decrease. But let us view exactly the
same problem from the standpoint of the adjoint equation. Here the
shrinking of the boundary for v(x, t) increases the boundary value for u(x, t),
because now it is not enough to require that u(x, 0) shall be zero. It has
to be zero also on that portion of the upper boundary on which v(x, T)
remained free. We have thus a more restricted minimum problem which
must lead to an increase of the smallest eigenvalue. And thus we come to
the contradictory conclusion that the same eigenvalue must on the one hand
decrease, on the other hand increase. We have tacitly assumed in our
reasoning that there exists a smallest eigenvalue. The contradiction at
which we have arrived forces us to renounce this assumption, and this can
only mean that the eigenvalue spectrum can become as small as we wish,
because in that case there is no smallest eigenvalue. And thus we have
been able to demonstrate the existence of the parasitic spectrum in the
given cooling problem by a purely logical argument, without any explicit
calculations.
Quite similar is the situation concerning the third of the problems
enumerated in Section 13. Here the potential function was characterised
by giving the function and its normal derivative along a certain portion
of the boundary surface σ (for example on an inner boundary σ′ which,
however, can be considered as part of the boundary surface). Here again
the existence of the parasitic spectrum follows once more by the same
argument that we employed in the case of the parabolic heat flow equation,
since the potential function is likewise an analytical function everywhere
inside the boundaries. This shows that here again the Green's function in
the ordinary sense does not exist, since it has to be complemented by an
infinite sum which cannot converge, as discussed in Section 14. The solution
of our problem exists, however, if the boundary data are properly given.
Yet in this problem we encounter a situation which is even more
surprising. Let us consider the following minimum problem. Minimise
the integral

\[ \int (\Delta v)^2\, d\tau \tag{13} \]

under the constraints that the values of v and ∂v/∂ν are prescribed on the
boundary surface σ. The problem leads to the differential equation

\[ \Delta\Delta v = 0 \tag{14} \]

The associated eigenvalue problem

\[ \Delta\Delta v = \mu v \tag{15} \]

is identical with that obtained for the functions v, u of the previous para-
graph, if we identify μ with λ². Our problem seems "well-posed", and in
fact it is, if the given boundary data extend over the complete boundary.
Then the λ-spectrum starts with a definite finite λ_1 and the parasitic
spectrum does not appear. The solution is unique and the data freely
choosable. But let us now assume that once more the same minimum
problem is given, but with boundary data which omit one part of the boundary,
be that part ever so small. At this moment the situation changes com-
pletely. The smallest eigenvalue falls to an infinitesimal quantity; we get
the parasitic spectrum, and the problem becomes unsolvable with boundary
data which are not properly given. This means that our minimum problem
has no solution. We can approach a certain minimum as near as we wish
but a definite minimum cannot be obtained.
Indeed, the analytical solution demands once more the fulfilment of the
differential equation (14), with the boundary conditions

\[ v, \ \frac{\partial v}{\partial \nu} \ \text{ prescribed on the given portion of } \sigma \tag{16} \]

to which the variational principle adds the boundary conditions

\[ \Delta v = 0, \qquad \frac{\partial\, \Delta v}{\partial \nu} = 0 \qquad \text{(on the free portion of } \sigma) \tag{17} \]

The equation (14) can now be written in the form

\[ \Delta u = 0, \qquad u = \Delta v \tag{18} \]

and in the boundary conditions (17) we can replace Δv by u. But we
know from the analytical nature of the potential function that u must
vanish identically if the boundary conditions (17) hold even along an
arbitrarily small portion of the boundary. And thus the analytical solution
of our minimum problem demands such boundary data as make the equation

\[ \Delta v = 0 \tag{19} \]
possible. In that case we get zero for the requested minimum. But for
any other choice of boundary data our problem becomes unsolvable. And
yet the solution exists immediately if we add further conditions to our
problem, for example by requiring that v and ∂v/∂ν shall vanish on the
remaining portion S of the boundary surface σ.
The "method of least squares" is based on the principle that a function
of some parameters which is everywhere positive must have a minimum
for some values of the parameters. This theorem is true in algebra, where
the number of parameters is finite. It seemed reasonable to assume that
the same theorem will hold in the realm of positive definite differential
operators. Hence the attempt was made to demonstrate the existence of
the solution of the boundary value problems of potential theory on this
basis. This principle is called (although with no historical justification)
"Dirichlet's principle". In the case of the potential equation this principle
is actually applicable, no matter whether the boundary values are prescribed
on the total boundary or only on some parts of it. But our result con-
cerning the "biharmonic equation" (14) shows that this principle can have
no universal validity. It holds in all cases in which the parasitic spectrum
does not exist. But here we have an example of a completely "elliptic"
type of differential equation, with apparently well-chosen peripheral
boundary conditions, which is in fact "ill-posed" in Hadamard's sense.
This peculiarity of our problem is then traceable to the appearance of the
parasitic spectrum which again is closely related to the failure of Dirichlet's
principle.

8.16. Examples for the parasitic spectrum


An explicit construction of the parasitic spectrum requires in most
cases heavy calculations, because in most physical situations we are led to
the solution of differential equations of fourth order which are less familiar
to us than differential equations of the second order. In the case of heat
conduction, however, we are in the fortunate position that the explicit
solution is obtainable with the help of simple tools. In the one-dimensional
problem of the cooling bar our equation is separable in x and t. The separa-
tion in x reduces the problem to an ordinary differential equation of first
order in t alone.
For the sake of simplicity we will normalise the length of the bar to π
and put

\[ v(x, t) = v(t) \sin kx, \qquad u(x, t) = u(t) \sin kx \qquad (k = \text{integer}) \tag{2} \]

Now the shifted eigenvalue problem appears in the following form:

\[ -v'(t) - k^2 v(t) = \lambda u(t), \qquad u'(t) - k^2 u(t) = \lambda v(t) \tag{3} \]

In order to familiarise ourselves with the nature of the problem, we will
first treat the case of the "regular" eigenvalue spectrum, in which v(x, t)
is given at the initial moment t = 0. Then the boundary conditions of the
system (3) become:

\[ v(0) = 0, \qquad u(T) = 0 \tag{4} \]

We assume both u and v in exponential form:

\[ v(t) = A e^{\alpha t}, \qquad u(t) = B e^{\alpha t} \tag{5} \]

Substitution in (3) yields the two conditions

\[ -(\alpha + k^2) A = \lambda B, \qquad (\alpha - k^2) B = \lambda A \tag{6} \]

from which

\[ \alpha^2 = k^4 - \lambda^2 \tag{7} \]

and thus

\[ \alpha = \sqrt{k^4 - \lambda^2} \]

To every given λ two exponents are obtained, namely ±α, if we agree that
α is defined as the positive value of the square root appearing in (7). The
full solution now becomes

\[ v(t) = A_1 e^{\alpha t} + A_2 e^{-\alpha t} \tag{8} \]

The first boundary condition (4) demands A_2 = −A_1 and hence we can
put:

\[ v(t) = A \sinh \alpha t, \qquad \lambda u(t) = -A\,(\alpha \cosh \alpha t + k^2 \sinh \alpha t) \tag{9} \]

Now the relation (7) does not make α necessarily real. For any λ which
is larger than k², α becomes imaginary. We can show at once that indeed
this is the only possibility. In the former case we see from (9) that u(t) is
a monotonously increasing function of t which cannot vanish for any value
of t, in contradiction to the second boundary condition (4). This shows that
of necessity

\[ \lambda > k^2 \tag{10} \]

and the possibility of a parasitic spectrum is excluded.


Let us now assume that we have given v(x, t) at the end point t = T,
instead of the initial point t = 0. Then our new boundary conditions
become

\[ v(T) = 0, \qquad u(0) = 0 \tag{11} \]

and exactly with the same reasoning as above we obtain the solution

\[ u(t) = B \sinh \alpha t, \qquad \lambda v(t) = B\,(\alpha \cosh \alpha t - k^2 \sinh \alpha t) \tag{12} \]

In this case the possibility of a real α cannot be ruled out. The first
boundary condition (11) requires the condition

\[ \tanh \alpha T = \frac{\alpha}{k^2} \tag{13} \]

Since we are only interested in the possibility of very small λ_k (which we
shall denote by λ'_k), we can put, in view of (7):

\[ \alpha \approx k^2 - \frac{\lambda'^2_k}{2k^2} \tag{14} \]

and obtain for sufficiently large k the relation

\[ \frac{\lambda'^2_k}{2k^4} \approx 2\, e^{-2k^2 T} \tag{15} \]

and thus

\[ \lambda'_k \approx 2k^2\, e^{-k^2 T} \tag{16} \]

For large k the λ'_k decrease rapidly and come arbitrarily near to zero. We
have thus proved the existence of a parasitic spectrum.
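The asymptotic law can also be checked numerically. The following small experiment (based on the condition (13) and the asymptotic form (16); the values of T and k are arbitrary illustrative choices) solves the transcendental condition and compares the result with the asymptotic formula:

import numpy as np
from scipy.optimize import brentq

# Solve tanh(alpha*T) = alpha/k**2 for alpha just below k**2, recover
# lambda'_k from alpha**2 = k**4 - lambda**2, and compare with the
# asymptotic value 2*k**2*exp(-k**2*T).
T = 1.0
for k in [2, 3, 4]:
    f = lambda a, k=k: np.tanh(a * T) - a / k**2
    alpha = brentq(f, 1e-6, k**2 * (1.0 - 1e-12))
    lam = np.sqrt(k**4 - alpha**2)
    print(k, lam, 2 * k**2 * np.exp(-k**2 * T))

Already for k = 4 the two values agree to several figures, and their common magnitude 32e^{−16} (about 3.6 · 10⁻⁶) shows how rapidly the parasitic eigenvalues crowd toward zero.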
The analysis of this solution shows two characteristic features:
1. The parasitic spectrum is a one-dimensional sequence; to every k
(for large k) only one λ'_k can be found, while the regular eigenvalue spectrum
is two-dimensional (to every k an infinity of periodic solutions can be found).
2. The division by a very small λ'_k makes the data exceedingly vulnerable
to small errors and in principle our data have to be given with infinite
accuracy, in order to solve the given problem. Hadamard's "Condition C"
(see Section 13), is not fulfilled. But a closer examination reveals that the
very small λ'_k belong to very high k values. If the time T is sufficiently
small, then the dangerously small λ'_k will occur at such large k values that
even the complete omission of the parasitic spectrum will cause a minor
error, provided that the initial temperature distribution is sufficiently
smooth. Under such circumstances we can restore from our data v(x, T)
the initial temperature distribution v(x, 0), if our data are given with
sufficiently high, but not infinitely high accuracy. The time T of the
backward extrapolation depends on the accuracy of our data, and it is
clear that for large T an excessive (but still not infinite) accuracy is demanded,
if we want to obtain v(x, 0) with a given finite accuracy. An absolute
accuracy of the data would only be required if we do not tolerate any error
in the finding of v(x, 0).
In this example the parasitic spectrum came into existence on account of
the unconventional type of boundary value problem from which we started.
Much more surprising is the appearance of this spectrum in a perfectly
regular and "well-posed" problem, namely the Cauchy-problem (initial
value problem) associated with the vibrating string. The peculiar riddles
which we have encountered in the last part of Section 8, find their resolution
in the unexpected fact that even in this very well-posed problem the
parasitic spectrum cannot be avoided, if we formulate our problem in that
"canonical form" which operates solely with first derivatives, the derivatives
of higher order being absorbed by the introduction of surplus variables
(cf. Chapter 5.11).
We have formulated the canonical system associated with our problem in
the equations (8.7). The eigenvalue problem becomes (in view of the
self-adjoint character of the differential operator):

We can first of all separate in the variable x by putting

(k = integer; the length of the string is normalised to π).
The new system becomes

The first two horizontal lines can be solved algebraically for p_1, p_2, q_1, q_2,
obtaining

Substitution in the third line yields the two simultaneous equations:


Assuming an exponential form of the solution we can put

which yields for the constants A and B the relations

We put

and obtain

Accordingly the full solution of our problem becomes

where α_1² and α_2² are determined by the two roots of the equation

The free constants of our solution will be determined by the boundary
conditions which have to be fulfilled. Our original problem demanded the
boundary conditions

These, however, are not the boundary conditions of our canonical problem.
The derivative v'(t) was absorbed by the new variable p_2, similarly u'(t) by
q_2. The conditions

demand now (in view of (20)), the boundary conditions

We want to find out whether or not these conditions can be met by very
small λ-values. In that case λ³ becomes negligible on the right side of (27)
and we have to solve the equation

Since α must become imaginary, we will put α = iβ
and write (26) in trigonometric rather than exponential form. The first
of the boundary conditions (29) reduces the free constants of our solution
to only three constants:

The second boundary condition (30) prescribed at t = 0 demands

which in view of (32) becomes

and thus

At this point our problem is reduced to but two constants of integration
but we still have to satisfy the two conditions (30) at the point t = T.
The first condition gives directly

while the second condition yields, by the same reasoning that led to (34)
and (35):

The simultaneous fulfilment of these two conditions is only possible if

This means

(m = integer). We see that for any choice of the integer m the eigenvalue
λ'_mk can be made as small as we wish by choosing k sufficiently large. The
existence of a very extended parasitic spectrum is thus demonstrated and
we now understand why the solution of the canonical system (8.7) is less
smooth than the right side β(x), put in the place of zero in the third equation.
The propagation of singularities along the characteristics—which is in such
strange contradiction to our expectations if we approach the problem from
the standpoint of expanding both right side and solution into their respective
eigenfunctions—can now be traced to the properties of the parasitic spectrum
which emerges unexpectedly in this problem.
Problem 349. Obtain the parasitic spectrum for the following (non-conventional)
boundary value problem:

[Answer:

for small λ and large k:

Problem 350. The analytical function f(z) (see equation (14.23)) is known to
be analytical in the strip between y = 0 and y = 1. Moreover, it is known to be
periodic with the period 2π:

\[ f(z + 2\pi) = f(z) \]

The value of f(z) is given on the line y = 0:

Find the parasitic spectrum of this problem.


[Answer:

For large k and small λ:

8.17. Physical boundary conditions


The solution of a differential equation with data given on the boundary is
primarily a mathematical problem and the associated shifted eigenvalue
problem need not have any direct physical significance. But in the vibration
problems of mathematical physics an actual physical situation is encountered
which puts the eigenvalue problem in action, not as a mathematical device
for the solution of a given problem, but as a natural phenomenon, such as
the elastic vibrations of solids and fluids, the electromagnetic vibrations of
antennae or wave-guides, or the atomic vibrations of wave mechanics.
Here one may ask what significance should be attached to the "boundary
conditions" which play such a vital role in the mathematical solution of
eigenvalue problems.
And here we have first of all to record the fact that from the physical
standpoint a "boundary condition" is always a simplified description of an
unknown mechanism which acts upon our system from the outside. A
completely isolated system would not be subjected to any boundary con-
ditions. The mathematical boundary conditions of a certain vibration
problem would follow automatically from the underlying mechanical
SEC. 8.17 PHYSICAL BOUNDARY CONDITIONS 505

principles which provide not only the equations of motion but also the
"natural boundary conditions" of the given physical problem. Imposed
boundary conditions are merely circumscribed interventions from outside
which express in simplified language the coupling which in fact exists
between the given system and the outer world. The actual forces which
act on the system, modify the potential energy of the inner forces and the
physical phenomenon is in reality not a modification of the boundary
conditions of the isolated system but a modification of its potential energy.
Hence it is the differential operator which in reality should be modified and
not the boundary conditions which actually remain the previous "natural
boundary conditions".
As a concrete example let us consider the vibrations of a membrane for
which the boundary condition

\[ v(s) = 0 \tag{1} \]

is prescribed, where s indicates the points of a certain closed curve along
which the membrane is fixed, thus making its displacement zero on the
boundary. Such a condition cannot be taken with full rigour. We would
need infinitely large forces for the exact fulfilment of this condition. In
fact we have large but not infinitely large forces acting on the boundary.
Hence the question arises, how we could take into account more realistically
the actual physical situation. For this purpose we will replace the given
condition (1) by another condition which is completely equivalent to the
original formulation. We will demand the fulfilment of the integral condition

\[ \oint v^2(s)\, ds = 0 \tag{2} \]
where the integration is extended over the entire boundary. Although our
condition is now of a global character, whereas before we demanded a condi-
tion that had to hold at every point of the boundary, yet our new condition
can only hold if the integrand becomes zero at every point, and thus we are
back at the original formulation.
But now we will make use of the "Lagrangian multiplier method" that
we have employed so often for the variational treatment of auxiliary con-
ditions. Our original variational integral was given in the form

\[ \int_{t_1}^{t_2} (T - V)\, dt \tag{3} \]

and required—according to the "principle of least action"—that we should
minimise the time integral of T − V, where T is the kinetic, V the
potential energy of the system. The method of the Lagrangian multiplier
requires that we modify our variational integral in the following sense:

\[ \int_{t_1}^{t_2} \Bigl( T - V - \frac{\mu}{2} \oint v^2(s)\, ds \Bigr)\, dt \tag{4} \]
Let us observe that we would obtain the same Lagrangian if the condition
(2) were replaced by the less extreme condition
\[ \oint v^2(s)\, ds = \varepsilon \tag{5} \]

where ε is not zero but small. It is the magnitude of the constant μ which
will decide what the value of the right side of the condition (5) shall become.
With increasing μ the constant ε decreases and would become zero if μ
grows to infinity.
The term that we have added to the Lagrangian L = T − V:

\[ -\frac{\mu}{2} \oint v^2(s)\, ds \tag{6} \]
represents in physical interpretation the potential energy of the forces which
maintain the constraint (5). Hence we cannot let μ go to infinity but must
consider it as a large but finite constant. The motion law of the membrane
now becomes

which is the same partial differential equation we had before. The added
term comes in evidence only when we establish the natural boundary
condition of our problem, which now becomes

\[ \frac{\partial v}{\partial \nu} + \mu v = 0 \qquad \text{(on } s\text{)} \tag{8} \]
This again shows that the exact condition (1) would come about if μ
became infinite, which is prohibited for physical reasons. The changed
boundary condition (8) instead of (1) would eventually come into evidence
in the vibrational modes of extremely high frequencies.
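The effect of a finite μ can be watched on a one-dimensional analogue, an illustrative finite-difference sketch not taken from the text: a string on 0 ≤ x ≤ 1, fixed at x = 0 and elastically restrained at x = 1 by the natural condition v′(1) + μv(1) = 0 in place of the rigid condition v(1) = 0.

import numpy as np

# Lowest eigenvalue of -v'' = lam * v on (0, 1) with v(0) = 0 and the
# elastic condition v'(1) + mu * v(1) = 0.  As mu grows, the eigenvalue
# climbs toward the rigidly clamped (Dirichlet) value pi**2.
def lowest_eigenvalue(mu, n=400):
    h = 1.0 / n
    A = np.zeros((n, n))
    for i in range(n - 1):              # interior nodes x = h, ..., 1 - h
        A[i, i] = 2.0 / h**2
        if i > 0:
            A[i, i - 1] = -1.0 / h**2
        A[i, i + 1] = -1.0 / h**2
    # boundary node x = 1: eliminate the ghost value via v'(1) = -mu*v(1)
    A[n - 1, n - 2] = -2.0 / h**2
    A[n - 1, n - 1] = 2.0 / h**2 + 2.0 * mu / h
    return np.sort(np.linalg.eigvals(A).real)[0]

for mu in [1.0, 10.0, 100.0, 1000.0]:
    print(mu, lowest_eigenvalue(mu))
print("clamped limit:", np.pi ** 2)

For μ = 1000 the value already lies within a fraction of a percent of π², illustrating how the imposed condition (1) emerges as the μ → ∞ limit of the natural condition (8), always from below.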
However, even so, we cannot be satisfied by the expression (6). If the
potential energy of the elastic forces requires an integration over the two-
dimensional domain of the coordinates (x, y) we cannot assume that the
boundary forces will be concentrated on a line. Although apparently the
membrane is fixed on the boundary line only, physically there is always a
small but finite band along the boundary on which the external forces act.
Accordingly we have to introduce the potential energy of the forces which
maintain the constraint on the boundary in the form

\[ \frac{1}{2} \iint W(x, y)\, v^2\, dx\, dy \tag{9} \]

where the function W(x, y) has the property that it vanishes everywhere
except in a very thin strip of the width ε in the immediate vicinity of the
boundary curve s, where W(x, y) assumes a large constant value:

In fact, however, we cannot be sure that the force acting on the boundary
has the same strength at all points. We could have started our discussion
by replacing the boundary condition (1) by the integral condition
\[ \oint \rho(s)\, v^2(s)\, ds = 0 \tag{11} \]

where the weight factor ρ(s) is everywhere positive but not necessarily
constant. Accordingly we cannot claim that the large positive value of
W(x, y) in a thin strip along the boundary will be necessarily a constant
along the boundary curve s. It may be a function of s, depending on the
physical circumstances which prevail on the boundary. For the macro-
scopic situation this function is of no avail, since practically we are entitled
to operate with the strict boundary condition (1). But the method we
have outlined—and which can be applied to every one of the given boundary
conditions, for example to the two conditions v = 0 and ∂v/∂ν = 0 in the
case of a clamped plate, which now entails the addition of two expressions
of the form (9)—has the great advantage that it brings into play the actual
physical mechanism which is hidden behind a mathematical boundary
condition. We have modified the potential energy of our system and thus
the given differential operator with which we have to work. The "imposed
boundary conditions" are now gone. They have been absorbed by the
modification of the differential operator. The actual boundary conditions
follow from the variational problem itself and become the natural boundary
conditions of the given variational integral.
We can now answer the question whether the parasitic spectrum
encountered in our previous discussions might not have been caused by the
imposition of artificial boundary conditions, and might not disappear if
we operate with the actual physical situation in which only natural boundary
conditions occur. The answer is negative: the parasitic spectrum cannot
be removed by the replacement of the imposed boundary conditions with the
potential energy of forces which maintain that condition. Indeed, the
replacement of the given constraint by a potential energy weakens the
constraint. Hence the chances of a miiiimum under the given conditions
have improved and the eigenvalue must be lowered rather than increased.
The parasitic spectrum must thus remain, with a very small alteration toward
smaller A'*. And thus we arrive at a strange conclusion. We have seen
that the smallest eigenvalue of the shifted eigenvalue problem can always
be defined as the minimum of a certain positive definite variational integral—
in fact as the absolute minimum of that integral. We should think that at
least under natural boundary conditions a definite minimum must exist.
Now we see that this is not so. In the large class of problems in which the
parasitic spectrum makes its appearance (and that includes not only the
non-conventional type of boundary value problems, but the well-posed
hyperbolic type of problems in which the parasitic spectrum is a natural
occurrence if the problem is formulated in its canonical form), we obtain no
definite minimum, in spite of the regular nature of the given differential
operator, the finiteness of the domain, and the fact that we do not impose
any external boundary conditions on the problem. "Dirichlet's principle"
fails to hold in this large class of problems. The minimum we wanted to
get can only be reached as a limit, since we obtain an infinity of stationary
values which come to zero as near as we wish, without ever attaining the
value zero.
An interesting situation arises in connection with the celebrated
Schrödinger wave equation (1.7) for the hydrogen atom. Here

\[ V = -\frac{e^2}{r} \]

We know that there exists a negative eigenvalue spectrum, given by

\[ \lambda_n = -\frac{A}{n^2} \]

where A is a universal constant and n an integer. As n goes to infinity,
λ = 0 becomes a limit point. And yet the usual phenomena which
accompany a parasitic spectrum, do not come into evidence. The Green's
function of the differential equation exists and we do not experience the
infinite sensibility of the solution relative to a small perturbation of the
inhomogeneous equation.
A closer examination reveals that the parasitic spectrum is not genuine
in this instance. It comes into being solely by the infinity of the domain
in which we have solved our problem. If we enclose the hydrogen atom in
a sphere of a large but finite radius R, we sweep a certain small but finite
range around λ = 0 free of eigenvalues. The negative energy states are
now present in a finite number only and the positive energy states form a
dense but discrete spectrum which starts with a definite ε > 0. Under
these circumstances it is clear that the Green's function cannot go out of
existence and that the original limit point λ = 0 has to be conceived as the
result of a limit process, by letting R go to infinity.

8.18. A universal approach to the theory of boundary value problems


We have travelled a long way since the beginning of this chapter and
encountered many strange phenomena on our journey. In retrospection we
will summarise our findings. We have seen that the method of the separation
of variables is an eminently useful tool for the solution of some particularly
interesting differential equations, if the boundary is of sufficient regularity.
For the general understanding of the basic properties of boundary value
problems, however, another approach was more powerful which made use of
an auxiliary function of considerable generality. This function had to
satisfy the given inhomogeneous boundary conditions but was otherwise
free of constraints. Hence the differential equation remained unsatisfied,
but the function performed the programme of transforming the originally
given homogeneous differential equation with inhomogeneous boundary
conditions into an inhomogeneous differential equation with homogeneous
boundary conditions. To this problem we could apply directly the usual
analysis in eigenfunctions, expanding both right side and solution into their
proper eigenfunctions. Hence the eigenfunction analysis remained our basic
frame of reference and included within its scope all boundary value problems,
irrespective of how judiciously or injudiciously the given boundary data

may have been chosen. The basic problem thus remained the solution of
the inhomogeneous differential equation

    Dv(x) = β(x)                                                    (1)

with the proper homogeneous boundary conditions. The "given boundary
data" are thus transformed into the "given right side" β(x) of the differential
equation (1).
We obtained a unique solution by demanding orthogonality of v(x) to all
the zero-axes of the V-space:

    (v, vᵢ⁰) = 0.                                                   (2)

Furthermore, the solvability of our problem demanded that the right side
β(x) should be orthogonal to all the zero-axes of the U-space:

    βᵢ⁰ = (β, uᵢ⁰) = 0.                                             (3)

These conditions can in fact be replaced by the "completeness relation"

    ∫ β²(x) dx = Σᵢ βᵢ²                                             (4)

where

    βᵢ = ∫ β(x) uᵢ(x) dx                                            (5)

and the zero axes are omitted. Indeed, if β(x) had projections in the
direction of the uᵢ⁰-axes, we should have to add to the right side the sum

    Σᵢ (βᵢ⁰)².

The omission of this sum is only justified if we have

    Σᵢ (βᵢ⁰)² = 0,

which is only possible if each one of the βᵢ⁰ defined by (3) vanishes. Hence
the compatibility of the data can be replaced by the single scalar condition
(4), which makes no reference to the missing axes.
This, however, is generally not enough. In addition to the regular
spectrum whose eigenvalues increase to infinity, we may have a "parasitic
spectrum", whose eigenvalues λ'ᵢ converge to zero. While the given
function β(x) need not be orthogonal to the axes u'ᵢ(x) of the parasitic
spectrum, it is necessary that the projections in the direction of these axes
shall be sufficiently weak to make the following sum convergent:

    Σᵢ (β'ᵢ/λ'ᵢ)².

Beyond this condition—which is necessary but not always sufficient—we
have to demand the pointwise convergence of the infinite sum

    Σᵢ (β'ᵢ/λ'ᵢ) v'ᵢ(x)

at all points x of the domain.


In this general theory we have moved far from the restricted class of
"well-posed" problems which forbid the eigenvalue zero with respect to
both V- and U-spaces and forbid also the parasitic spectrum. Hence it is
of interest to see that even in the most "ill-posed" problems we are in fact
not far from a "well-posed" problem, because a small but finite perturbation
transforms all ill-posed problems into the well-posed category. We do that
by a device discussed earlier in Chapter 5.29, establishing a weak coupling
between the given and the adjoint operator. We consider the given
equation (1) as the limit of the following system (letting ε go to zero):

The operator on the left side has exactly the same eigenfunctions as the
original one but the eigenvalues are shifted by a small amount. This shift
eliminates the zero eigenvalue and also its immediate neighbourhood—that
is the parasitic spectrum. We now have a complete and unconstrained
operator which satisfies all the requirements of a "well-posed" problem: the
solution is unique, the right side β(x) is not subjected to compatibility
conditions and the solution is not infinitely sensitive to small changes of
the data. The Green's function exists and we can find the solution in the
usual fashion with the help of this function. The eigenfunction analysis is
likewise applicable and we need not distinguish between small and large
eigenvalues since none of the eigenvalues becomes smaller in absolute value
than ε.
The solution of the modified problem exists even for data which from the
standpoint of our original problem are improperly given. Moreover, the
solution is unique. We analyse the given β(x) in terms of the eigenfunctions
uᵢ(x):

Then the differential equation (10) yields:

The difference between a well-posed and an ill-posed problem has disappeared



in this approach and we have arrived at a universal basis for the treatment
of arbitrarily over-determined or under-determined systems. The distinction
between properly and improperly given data comes into appearance only if
we investigate what happens if ε converges to zero. The criterion for
properly given data becomes that the solution (v_ε(x), u_ε(x)) must approach
a definite limit:

The fact that v(x) is the limit of v_ε(x) may also be interpreted by saying
that, for an ε which is sufficiently small, the difference between v_ε(x) and
v(x) can be made at all points x as small as we wish. We thus come to the
following result: "If the data are given adequately, the difference between
an arbitrarily ill-posed and a well-posed problem can be reduced at every
point of the domain to an arbitrarily small amount."

CHAPTER 9

NUMERICAL SOLUTION OF TRAJECTORY PROBLEMS

Synopsis. In this chapter we deal with the numerical solution of


ordinary differential equations, transformed into a first order system.
We study the step-by-step procedures which start from a given initial
value and advance in equidistant small steps from point to point, on
the basis of local Taylor expansions, truncated to a finite number of
terms. Although the truncation error may be small at every step, we
have no control over the possible accumulation of these errors beyond
a danger point. It is thus advisable to complement the local integration
method by a global method which considers the entire range of the
independent variable as one unified whole and adds a linear correction
to the preliminary solution obtained by the step-by-step procedure.
9.1. Introduction
The rapid and spectacular development of the large electronic computers
provided a new and powerful tool for the solution of problems which before
had to be left unsolved. The physicist and the construction engineer face
frequently the situation that they need the numerical solution of a differential
equation which is not simple enough to allow a purely analytical solution.
The coding for the electronic computer makes many of these problems
solvable in purely numerical terms. But purely numerical computations
remain a groping in the dark if they are not complemented by the principles
of analysis. The adequate translation of a mathematical problem into
machine language can only occur under the proper guidance of analysis.
The numerical solution of partial differential equations requires much
circumspection and elaborate preparation which goes beyond the framework
of our studies and has to be left to the specialised literature dealing with this
subject.* The much more circumscribed problem of ordinary differential
equations, however, is closely related to some of the topics we have
encountered in the early phases of our studies, particularly the problems of
interpolation and harmonic analysis. It will thus be our aim to give a
comprehensive view of the basic analytical principles which lead to an
adequate numerical treatment of ordinary differential equations.
* Cf. particularly the comprehensive textbooks [2] and [6] of the Chapter Bibliography.

9.2. Differential equations in normal form


The problems of mechanics are subordinated to a fundamental principle,
called the "principle of least action", which demands that the time integral
of a certain function, called the "Lagrangian" of the variational principle,
shall be made a minimum, or at least a stationary value. This Lagrangian
L is a given function of certain variables, called the "generalised co-
ordinates" of the mechanical system and usually denoted by q\, qz, . . . ,qn-
It contains also the "velocities" of the generalised coordinates, i.e.,
<h> 22, • • • , <?«> and it may happen that the independent variable (the
time t) is also explicitly present in L:

For example in the problem of a missile we may consider the missile as a


rigid body whose position in space requires six parameters for its characterisa-
tion, namely the three rectangular coordinates of its centre of mass, plus
three angles which fix the orientation of the missile relative to the centre of
mass, since the missile is capable of translation and of rotation. We say
that the missile has "six degrees of freedom" because its mechanical state
requires that six variables shall be given as definite functions of the time t.
Here the number of "generalised coordinates" is six and the Lagrangian L
(which is defined as the difference between the kinetic and the potential
energy of the body) becomes a given function of the six parameters
q₁, q₂, ..., q₆ and their time derivatives q̇₁, q̇₂, ..., q̇₆.
The direct minimisation of Q by the principles of variational calculus
leads to the "Lagrangian equations of motion" which form a system of n
(in our case six) differential equations of the second order. We have seen,
however, in Chapter 5.10, how by the method of surplus variables we can
always avoid the appearance of derivatives of higher than first order. We
do that by transforming the equations of motion into the "Hamiltonian
form":

The characteristic feature of these equations is that on the right side the
function H (the "Hamiltonian function") is only a function of the variables
qi, pi (and possibly the time t), without any derivatives.
If the basic differential equation—or system of such equations—is not
derivable from a variational principle, or if we deal with a mechanical
system in which frictional forces are present (which do not allow a variational
treatment), we shall nevertheless succeed in reducing our problem to a
first order system by the proper introduction of surplus variables. No
matter how complicated our original equations have been and what order

derivatives appear in them, we can always introduce surplus variables which


will finally reduce the system to the following normal form:

    v'ᵢ = Fᵢ(v₁, v₂, ..., vₙ; x)    (i = 1, 2, ..., n),             (3)

where the Fᵢ are given functions of the dynamical variables vᵢ and x. The
equations (2) can be conceived as special cases of the general system (3),
considering the complete set of variables q₁, ..., qₘ; p₁, ..., pₘ as our vᵢ
and thus n = 2m. Moreover, it is a specific feature of the Hamiltonian
system that the right sides Fᵢ can be given in terms of a single scalar
function H, while in the general case (3) such a function cannot be found.
However, while for the general analytical theory of the Hamiltonian
"canonical equations" the existence of H is of greatest importance, for the
numerical treatment this fact is generally of no particular advantage.
The only important point is that we reduce our system to the normal form (3).
Let us assume for example that the given differential equation is a single
equation of the order n. The general form of such an equation is

    F(y, y', y'', ..., y⁽ⁿ⁾; x) = 0.                                (4)

We now solve this equation for y⁽ⁿ⁾(x) and write it in the explicit form

    y⁽ⁿ⁾ = f(y, y', ..., y⁽ⁿ⁻¹⁾; x).                                (5)

Then we introduce the derivatives of y(x) as new variables and replace the
original single equation of the order n by the following system of n equations
of the first order—denoting y by v₁:

    v'₁ = v₂,  v'₂ = v₃,  ...,  v'ₙ₋₁ = vₙ,  v'ₙ = f(v₁, v₂, ..., vₙ; x).   (6)

We have thus succeeded in formulating our problem in the normal form (3).
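As a concrete illustration, here is a minimal Python sketch of the reduction (6); the sample right side f (a damped oscillator y'' = -y - 0.1y') is our own stand-in, not an equation from the text:

    def make_normal_form(f):
        """Return F(v, x) realising v_i' = v_{i+1} and v_n' = f(v; x)."""
        def F(v, x):
            # v_i' = v_{i+1} for i < n, and the last equation supplies f
            return list(v[1:]) + [f(v, x)]
        return F

    F = make_normal_form(lambda v, x: -v[0] - 0.1 * v[1])  # y'' = -y - 0.1 y'
    print(F([1.0, 0.0], 0.0))   # the tangent direction at the initial point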
9.3. Trajectory problems
For the numerical solution of the system (2.3) it is necessary to start
from a definite initial position of the system. This means that the quantities
v₁, v₂, ..., vₙ have to be given at the initial time moment x = 0:

In a mechanical problem this condition is usually satisfied. In other cases


we may have different conditions and some of the boundary values may be
given at the other end point x = l. For example in the problem of the
elastic bar, the bar may be clamped at both ends which means that two of
the four boundary conditions are given at the one end point, and the other
two at the other end point. If our problem is linear, we can follow a
systematic treatment and make use of the superposition principle of linear

operators. We can adjust our solution to arbitrary boundary conditions if


we proceed as follows. We make n separate runs, with the initial conditions

By taking an arbitrary linear superposition of these n solution systems we


have the n constants C₁, C₂, ..., Cₙ at our disposal which we can adjust
to any given boundary conditions. Hence a maximum of n separate runs
with a subsequent determination of n constants will solve our problem.
In actual fact the number of runs is smaller since the m initial conditions
given at x = 0 will immediately determine m of the constants and we need
only n − m runs with the subsequent determination of n − m constants,
obtained by satisfying the n − m boundary conditions at the point x = l.
If our problem is non-linear, we cannot make use of the superposition
principle and we have to change our parameters by trial and error until
finally all the required boundary conditions of the problem are satisfied.
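The superposition procedure can be sketched in a few lines; the integrator below is a deliberately crude tangent-polygon method and the sample linear equation v'' = v with v(0) = 1, v(1) = 0 is our own illustration:

    import numpy as np

    def euler_run(v0, h, steps):
        # crude polygon integration of the linear system v1' = v2, v2' = v1
        v = np.array(v0, float)
        for _ in range(steps):
            v = v + h * np.array([v[1], v[0]])
        return v

    # one initial value is known, so a single free constant remains
    # (n - m = 1 extra run)
    h, steps = 0.001, 1000
    u = euler_run([1.0, 0.0], h, steps)   # run with v'(0) = 0
    w = euler_run([0.0, 1.0], h, steps)   # run with v'(0) = 1
    C = (0.0 - u[0]) / w[0]               # adjust u + C*w at the point x = l
    print("required initial slope v'(0) =", C)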
We can give a geometrical interpretation of our system (2.3) by imagining
a space of n dimensions in which a general point P has the rectangular
coordinates v₁, v₂, ..., vₙ. Then the actual solution

represents a definite curve or trajectory of this n-dimensional space. The


initial conditions (1) express the fact that the trajectory takes its origin
from a definite initial point

of the n-dimensional space, while the differential equation (2.3) determines


the tangent with which the motion starts. The entire mechanical system is
thus absorbed by a single point of an imaginary space of n dimensions, the
"configuration space" E.
After moving for a very short time x = h along this tangent, the direction
of the tangent will change, because now we have to substitute in the
equations (2.3) the values of vᵢ(h), x = h instead of vᵢ(0), x = 0. We
continue along the new tangent up to the time moment x = 2h and once more
we have to change our course. Continuing in this fashion we obtain a
space polygon of small sides which approaches more and more to a
continuous curve as h tends to zero. It is this limiting curve that we want
to obtain by our calculations.

9.4. Local expansions


The successive construction we have indicated in terms of consecutive
tangents, can actually yield a numerical solution but at the expense of

much labour and very limited accuracy. We could change our differential
equation to a difference equation:

    Δvᵢ/Δx = Fᵢ(v₁, ..., vₙ; x),                                    (1)

considering Δx = h as a small quantity which converges to zero. Then we
have a simple step-by-step procedure in which the solution at the point xₖ:

    vᵢ(xₖ),

yields directly the solution at the point xₖ₊₁:

    vᵢ(xₖ₊₁),

because of the relation

    vᵢ(xₖ₊₁) = vᵢ(xₖ) + h Fᵢ(v₁(xₖ), ..., vₙ(xₖ); xₖ).
While this procedure seems on the surface appealing on account of its great
simplicity, it has the drawback that for an efficient approximation of a
curve with the help of polygons we would have to make the step-size h
excessively small, but then the inevitable numerical rounding errors would
swamp our results.
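In Python the polygon procedure is only a few lines; the test equation v' = v is our own choice, and the printed errors show the slow, first-order shrinkage that motivates the higher-order expansions below:

    import math

    def polygon_solve(F, v0, h, x_end):
        """Step-by-step polygon method: v(x_{k+1}) = v(x_k) + h*F(v(x_k), x_k)."""
        x, v = 0.0, list(v0)
        while x < x_end - 1e-12:
            f = F(v, x)
            v = [vi + h * fi for vi, fi in zip(v, f)]
            x += h
        return v

    # v' = v, v(0) = 1: the error shrinks only like h
    for h in (0.1, 0.01, 0.001):
        v = polygon_solve(lambda v, x: [v[0]], [1.0], h, 1.0)
        print(h, v[0] - math.e)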
For this reason our aim must be to render our approximating curve
more flexible by changing it from a straight line to a parabola or a polynomial
of third or even fourth order. In this manner we can increase the step-size
h and succeed with a smaller number of intermediate points.
All methods for the numerical integration of ordinary differential equations
operate with such approximations. Instead of representing the immediate
neighbourhood of our curve as a straight line:

    v(xₖ + h) = v(xₖ) + h v'(xₖ),

we prefer to expand every one of the functions vᵢ(x) in a polynomial of a
certain order (usually 3 to 5) and obtain vᵢ(xₖ₊₁) on the basis of this poly-
nomial. For this purpose we could make use of the Taylor expansion
around the point xₖ:

    v(xₖ + h) = v(xₖ) + h v'(xₖ) + (h²/2!) v''(xₖ) + (h³/3!) v'''(xₖ) + ...   (6)
The coefficients of this expansion are in principle obtainable with the help
of the given differential equation, by successive differentiations. But in
practice this method would be too unwieldy and has to be replaced by more
suitable means. The various methods of step-by-step integration endeavour
to obtain the expansion coefficients of the local Taylor series by numerically
appropriate methods which combine high accuracy with ease of computation
and avoidance of an undue accumulation of rounding errors (these are
caused by the limited accuracy of numerical computations and should not
be confounded with the truncation errors, which are of purely analytical
origin). A certain compromise is inevitable since the physical universe
operates with the continuum as an actuality while for our mental faculties

the continuum is accessible only as a limit which we may approach but never
reach. All our numerical operations are discrete operations which can never
be fully adequate to the nature of the continuum. And thus we must be
reconciled to the fact that every step in the step-by-step procedure is not
more than approximate. The calculation with a limited number of decimal
places involves a definite rounding error in all our computations. But even
if our computations had absolute accuracy, another error is inevitable
because an infinite expansion of the form (6) is replaced by a finite expansion.
We thus speak of a "truncation error" caused by truncating an infinite
series to a finite number of terms. Such a truncation error is inevitable
in every step of our local integration process. No matter how small this
error may be, its effect can accumulate to a large error which is no longer
negligible (as we shall see in Section 13). Hence it is not enough to take
into account the possible damage caused by the accumulation of numerical
rounding errors and devise methods which are "numerically stable", i.e.
free from an accumulation of rounding errors. We must reckon with the
damaging effect of the accumulation of truncation errors which is beyond
our control and which may upset the apparently high local accuracy of our
step-by-step procedure. The only way of counteracting this danger is not
to consider our local procedure as the final answer, but to complement it by
a global process in which considerations in the large come into play, against
the purely local expansions which operate with the truncated Taylor series.
This we shall do in Section 17.
For the time being we will study the possibilities of local integration
from the principal point of view. Our procedure must be based on the
method of interpolation. We have at our disposal a certain portion of the
curve which we can interpolate with sufficient accuracy by a polynomial of
not too high order, provided that we choose the step-size Ax = h small
enough. This polynomial can now be applied for extrapolating to the next
point xₖ₊₁. Then we repeat the procedure by including this new point and
dropping the extreme left point of the previous step. In this fashion there
is always the same length of curve under the search light, while this light
moves slowly forward to newer and newer regions, until the entire range of
x is exhausted.

9.5. The method of undetermined coefficients


The first step-by-step integration process which became widely known was
the so-called "Runge-Kutta method".* This method requires a great
amount of wasted labour since we do not make use of the information
available from the previous part of the curve, but start directly at the
point xₖ and extrapolate to the point xₖ₊₁.
The much more efficient "Method of Milne"† is based on Simpson's
quadrature formula and makes use of four consecutive equidistant values of
v(x), namely v(xₖ₋₃), v(xₖ₋₂), v(xₖ₋₁), and v(xₖ), in order to extrapolate to
* Cf. [4], pp. 233, 236; [8], p. 72.
† Cf. [6], p. 66.

v(xₖ₊₁). A preliminary extrapolation, which is not accurate enough, is


followed by a correction. Every new point of the curve is thus the result of
two operations: a prediction and a correction.
We will approach our problem from an entirely general angle and take
into account all potentialities. We assume that our procedure will be based
on local Taylor expansions. We also assume that our scheme is self-
perpetuating. These two postulates allow us to develop a definite programme.
Let us assume that by some extrapolation method we have arrived at a
certain point x = kh. This means that we have now the functional values
vᵢₖ = vᵢ(kh) at our disposal. But then, by substituting on the right side
of the differential equation

    v'ᵢ = Fᵢ(v₁, v₂, ..., vₙ; x),                                   (1)

we can evaluate the derivatives v'ᵢₖ = v'ᵢ(kh). From now on we will omit
the subscript i of vᵢ(x) since the same interpolation and extrapolation
process will be applicable to all our vᵢ(x). Hence v(x) may mean any of
the generalised coordinates vᵢ(x) of our problem.
It is clear that in a local process only a limited number of ordinates can
be used. This number cannot become too large without making the step-
size h excessively small since we want to stay in the immediate vicinity of
the point in question. We will not go beyond three or four or perhaps
five successive ordinates. We will leave this number optional, however, and
assume that we have at our disposal the following 2m data:

    yₖ = v(x − (m − k)h),   y'ₖ = v'(x − (m − k)h)    (k = 0, 1, ..., m − 1).   (2)

We choose as our point of reference the point x into which we want to
extrapolate. Hence we want to determine yₘ = v(x) on the basis of our
data. For this purpose we expand v(x) in the vicinity of the point x in a
Taylor series, leaving the number of terms of the expansion undetermined:

To these expressions we will add the expansions of the first derivatives:

Now we multiply all these equations by some undetermined factors α₁, α₂,
..., αₘ as far as the first group is concerned and −β₁, −β₂, ..., −βₘ as
far as the second group is concerned. Then we add all these equations.
On the left side we get the sum

On the right side the factor of v(x) becomes

    α₁ + α₂ + ... + αₘ.                                             (6)

This factor we want to make equal to 1 because our aim is to predict v(x)
(and that is yₘ) in terms of all the previous yᵢ and y'ᵢ:

Our programme cannot be carried out with absolute accuracy but it can be
accomplished with a high degree of accuracy if we succeed in obliterating on
the right side the factors of h, h², and so on. We have 2m coefficients at
our disposal and thus 2m degrees of freedom. One degree of freedom is
absorbed by making the sum (6) equal to 1. The remaining 2m − 1 degrees
of freedom can be used to obliterate on the right side the powers of h, up
to the order h²ᵐ⁻¹. Hence we shall obtain the extrapolated value yₘ with
an error which is proportional to h²ᵐ. An extrapolation will be possible on
the basis of m functional values and m derivatives, which is of the order
2m if we denote the "order of the approximation" according to the power
of h to which the error is proportional. Hence extrapolation on the basis
of 1, 2, 3, and 4 points will be of the order 2, 4, 6, and 8.
The linear system of equations obtained for the determination of the
coefficients αᵢ, βᵢ is given as follows:
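The system itself is straightforward to set up and solve numerically. The sketch below (our own setup, with h normalised to 1 and the data points placed at x = -1, -2) imposes exactness on the monomials x^p, p = 0, ..., 2m-1 for the case m = 2; the resulting weights -4, 5, 4, 2 reproduce a classical maximal-order two-step formula whose instability is discussed in Section 8:

    import numpy as np

    # demand exactness of  y(0) = a1*y(-1) + a2*y(-2) + b1*y'(-1) + b2*y'(-2)
    # for the monomials x^p, p = 0, ..., 3  (m = 2, error O(h^4))
    nodes = [-1.0, -2.0]
    rows, rhs = [], []
    for p in range(4):
        row = [x**p for x in nodes] + \
              [p * x**(p - 1) if p else 0.0 for x in nodes]
        rows.append(row)
        rhs.append(1.0 if p == 0 else 0.0)   # value of x^p at x = 0
    a1, a2, b1, b2 = np.linalg.solve(np.array(rows), np.array(rhs))
    print(a1, a2, b1, b2)                    # -4, 5, 4, 2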

9.6. Lagrangian interpolation in terms of double points


Instead of solving the linear algebraic system (5.8) for the coefficients
αᵢ, βᵢ we will follow a somewhat different line of approach which sheds new
light on the nature of our approximation. We have dealt in Chapter 5.20
with a Lagrangian interpolation problem in which every point of inter-
polation was used as a double point, making use of function and derivative
at the points of interpolation. This is exactly what we are trying to do in
our present problem in which we want to make a prediction on the basis
of our data which list the values of yᵢ and y'ᵢ. The formula (5.20.11) gave
the contribution of a double point x = xₖ in explicit terms. For this
purpose we have to start with the construction of the fundamental poly-
nomial (5.20.7) which in our case becomes

Moreover, we want to obtain the value of f(x) at the critical point x = 0


to which we want to extrapolate.
The second and third derivatives of F(x) in the formula (5.20.11) come
into play because the root factors (x − xₖ) are squared. If we write

with

we get the simpler formula, immediately applied to x = 0:

If we translate this formula to our present problem, we obtain the following


result:

where

and

The expression in braces { } is taken on the understanding that the term
1/(k − k) is omitted. The last term of (5) estimates the error of our extra-
polation, in accordance with the general Lagrangian remainder formula
(1.5.10), here applied to the case of double points; ξ refers to a point which
is somewhere between x − mh and x.
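The same double-point (Hermite) interpolation can be carried out numerically with divided differences over repeated nodes; the following sketch is our own construction, with the exponential as a stand-in test function, extrapolating to x = 0 from two double points:

    import numpy as np

    def hermite_extrapolate(xs, ys, dys, x_new):
        """Newton divided differences with each node doubled: interpolate f
        and f' at the points xs, then evaluate the polynomial at x_new."""
        z = np.repeat(xs, 2)                 # each point used as a double point
        m = len(z)
        q = np.zeros((m, m))
        q[:, 0] = np.repeat(ys, 2)
        for i in range(1, m):
            for j in range(i, m):
                if z[j] == z[j - i]:         # repeated node: use the derivative
                    q[j, i] = dys[j // 2]    # (only hit when i == 1 here)
                else:
                    q[j, i] = (q[j, i-1] - q[j-1, i-1]) / (z[j] - z[j-i])
        p, w = q[0, 0], 1.0
        for i in range(1, m):                # evaluate the Newton form
            w *= (x_new - z[i - 1])
            p += q[i, i] * w
        return p

    xs = np.array([-1.0, -2.0]); ys = np.exp(xs)   # f = f' = exp
    print(hermite_extrapolate(xs, ys, ys, 0.0), np.exp(0.0))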

9.7. Extrapolations of maximum efficiency


If we identify m with 1, 2, 3, 4, . . . , we obtain a sequence of formulae
which extrapolate with maximum accuracy, the error being of the order
h²ᵐ. The resulting formulae up to m = 4 follow:

9.8. Extrapolations of minimum round-off


In algorithms of a repetitive nature the danger exists that small rounding
errors may rapidly accumulate. For example in the construction of a
difference table the rounding error of half a unit in the last decimal place
rapidly increases as we go to higher and higher differences and quickly
destroys the reliability of high order differences. In our step-by-step
process we repeat the same extrapolation formula again and again, and we
must insist that a rounding error in the last decimal place has no cumulative
effect as we continue our process. We speak of "numerical stability", if
this condition is satisfied. The examination of the formulae of the previous
section reveals that this condition is far from being fulfilled. The ordinates
yᵢ are multiplied by numerical factors which are greater than 1 and thus in a
few steps the original rounding error of a half unit in the last decimal place
would rapidly advance to the lower decimals. Our process would quickly
come to a standstill, because of the intolerably large increase of rounding
errors. This does not mean that the formulae of Section 7 are necessarily
computationally useless. As we shall see later, we can make very good
use of them as long as they are not used in a repetitive algorithm, but only a
few times and with the proper precaution. For our regular algorithm,
however, we have to abandon our hopes of gaining a large power of h by
the use of double points.

But then we can take advantage of the flexibility of the scheme (5.5)
and add conditions which will guarantee numerical stability. The danger
does not come from the coefficients βᵢ which are multiplied by the small
factor h and thus can hardly cause any harm from the standpoint of rounding
errors. Hence the m degrees of freedom of the βᵢ are still at our disposal,
thus giving us a chance to reduce the error at least to the order of magnitude
hᵐ⁺¹. As far as the αᵢ go, they have to satisfy the condition

    α₁ + α₂ + ... + αₘ = 1,                                         (1)

but otherwise they are freely at our disposal. Now it seems reasonable to
make all the αᵢ uniformly small by choosing them all equal:

    αᵢ = 1/m.                                                       (2)
By taking the arithmetic mean of the ordinates

    M = (y₀ + y₁ + ... + yₘ₋₁)/m                                    (3)

we have minimised the effect of rounding errors since the best statistical
averaging of random errors is obtainable by taking the arithmetic mean of
the data.
There is still another reason why a small value of the αᵢ is desirable. It
should be our policy to put the centre of gravity of our interpolation
formula on the βᵢ and not on the αᵢ since the βᵢ are multiplied by the
derivatives rather than the ordinates themselves. But these very derivatives
are determined by the given differential equation and it is clear that the
less we rely on this equation, the more we shall lose in accuracy and vice
versa. Hence we should emphasise the role of the βᵢ as much as possible
at the expense of the αᵢ. This we accomplish by choosing all the αᵢ as
uniformly small.
The question of the propagation of a small numerical error during the
iterative algorithm (5.7) is answered as follows. The second term is
negligible, in view of the smallness of the step-size h. We have to investigate
the roots of the algebraic equation

    F(λ) = λᵐ − α₁λᵐ⁻¹ − α₂λᵐ⁻² − ... − αₘ = 0.                     (4)

One of the roots is of necessity equal to 1. The condition of numerical
stability demands that all the other roots must remain in absolute value
smaller than 1. Now with the choice (2) our equation becomes

    mλᵐ = λᵐ⁻¹ + λᵐ⁻² + ... + λ + 1.                                (5)

We know that the absolute value of a complex number cannot be greater
than the sum of the absolute values. This yields

    m|λ|ᵐ ≤ |λ|ᵐ⁻¹ + |λ|ᵐ⁻² + ... + 1                               (6)

and thus

    |λ|ᵐ⁻¹ + |λ|ᵐ⁻² + ... + 1 − m|λ|ᵐ ≥ 0.                          (7)

The assumption |λ| > 1 would make the left side negative, in contradiction
to the inequality (7). Moreover, the assumption |λ| = 1 but λ ≠ 1 would
exclude the equal sign of (7) and is thus likewise eliminated. The only
remaining chance, that λ = 1 is a double root, is disproved by the fact that
F'(λ) cannot be zero at λ = 1. The numerical stability of our process is
thus established.
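This conclusion is easy to verify numerically; a brief check with numpy's root finder (the layout of the coefficient array is the only assumption):

    import numpy as np

    # roots of m*L^m = L^(m-1) + ... + L + 1: one root is L = 1, and all
    # others should lie strictly inside the unit circle
    for m in range(1, 6):
        c = np.zeros(m + 1)
        c[0] = m                    # leading coefficient of m*L^m
        c[1:] = -1.0                # minus (L^(m-1) + ... + 1)
        roots = np.roots(c)
        inside = [r for r in roots if abs(r - 1.0) > 1e-9]
        print(m, max(abs(r) for r in inside) if inside else "-")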
The coefficients βᵢ of the formula

    yₘ = M + h(β₁y'₀ + β₂y'₁ + ... + βₘy'ₘ₋₁)                       (8)

can be obtained by solving the algebraic system (5.8) for the βᵢ
(substituting for the αᵢ the constant value (2)). But we can also conceive
our problem as an ordinary Lagrangian interpolation problem for f'(x)
instead of f(x), with the m points of interpolation x = −1, −2, ..., −m
(for the sake of convenience we can normalise h to 1). Then f(x) is
obtainable by integration.
Let us consider for example the case m = 2. Then

    f'(x) = y'₁(x + 2) − y'₀(x + 1).                                (9)

Integrating with respect to x we determine the constant of integration by
the condition

    f(−1) + f(−2) = 0.

This gives

    f(x) = y'₁(x²/2 + 2x) − y'₀(x²/2 + x) + (7y'₁ − y'₀)/4,

and extrapolating to x = 0:

    f(0) = (7y'₁ − y'₀)/4.

To this we have to add the term ½(y₀ + y₁).
In this manner the following formulae can be established for the cases
m = 1 to m = 5:

9.9. Estimation of the truncation error


The estimation of the truncation error on the basis of the Lagrangian
remainder formula is of no more than theoretical significance; in practice
we do not possess the derivatives of higher than first order in explicit form.
We have to find some other method for the numerical estimation of the
local truncation error in each step of our process. This can be done in the
following way. Together with the extrapolation to yₘ = v(x) we extra-
polate also to the next point yₘ₊₁ = v(x + h). The error of this ordinate will
be naturally much larger than that of v(x). We shall not use this yₘ₊₁ as
an actual value of v(x + h) but store it for the next step when we shall get
v(x + h) anyway. The difference between the preliminary value—which we
will denote by ȳₘ₊₁—and the later obtained value yₘ₊₁ can serve for the
estimation of the truncation error. The basis of this estimation is the
assumption that y⁽ᵐ⁺¹⁾(x) has not changed drastically as we proceeded
from yₘ to yₘ₊₁.
The following table gives the evaluation of ȳₘ₊₁, together with the
estimated truncation error η, in terms of Δ, which denotes the difference
between the preliminary ȳₘ₊₁ and the later obtained yₘ₊₁ = v(x + h):

The standard term

remains unaltered in all these formulae.



A different principle for the estimation of the local error can be established
by checking up on the accuracy with which we have satisfied the given
differential equation. The polynomial by which we have extrapolated yₘ
can also extrapolate the derivative y'ₘ. Let us call this extrapolated value
ȳ'ₘ. In the absence of errors this ȳ'ₘ would coincide with the y'ₘ obtained
by substituting the yₘ values of all the functions vᵢ(x) into Fᵢ(vₖ, x). In
view of the discrepancy between the two values we can say that we have
not solved the differential equation

    v'ᵢ = Fᵢ(v₁, ..., vₙ; x)

but the differential equation

    v'ᵢ = Fᵢ(v₁, ..., vₙ; x) + μᵢ(x),                               (5)

where μᵢ is the difference

    μᵢ = ȳ'ₘ − y'ₘ,

applied to the ith function vᵢ(x). The smallness of μᵢ is not necessarily an
indication for the smallness of the error of vᵢ(x) itself, as we shall see in
Section 13. But the increase of the estimated error beyond reasonable
limits can serve as a warning signal that our h has become too large and
should be reduced to a smaller amount.
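The monitoring idea can be sketched in a few lines of Python; the formulas below are simple stand-ins of our own (a polygon step, and a cruder look-ahead that reuses the old derivative), not the tabulated coefficients of the book:

    f = lambda y: -y                      # test equation y' = -y
    h, y = 0.1, 1.0
    stored = None                         # preliminary value of next ordinate
    for k in range(10):
        y_new = y + h * f(y)              # the regular step
        if stored is not None:
            delta = y_new - stored        # Delta: new ordinate minus guess
            print(k, "%+.3e" % delta)     # the estimate is proportional to it
        stored = y_new + h * f(y)         # cruder extrapolation one step ahead
        y = y_new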

The following table contains the formulae for the calculation of the
extrapolated value ȳ'ₘ.

9.10. End-point extrapolation

Another possible choice of the coefficients αᵢ is that we make our
prediction solely on the basis of the last ordinate yₘ₋₁. This means the
choice

    α₁ = α₂ = ... = αₘ₋₁ = 0,    αₘ = 1.
This method is also distinguished by complete numerical stability and is


known as the "Method of Adams".*
The formulae of this method are once more deducible on the basis of
integrating the interpolation for f'(x) (cf. Section 8), the only difference
being that the constant of integration is now determined by the condition

    f(−1) = 0.

For example, integrating (9.8.9) for the case m = 2 we now obtain

    f(x) = y'₁(x²/2 + 2x) − y'₀(x²/2 + x) + (3y'₁ − y'₀)/2

and thus the extrapolation to x = 0 yields

    y₂ = y₁ + h(3y'₁ − y'₀)/2.
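A minimal sketch of this m = 2 formula in Python; the test equation y' = -y is our own choice, and the exact value supplies the second starting ordinate:

    import math

    f = lambda y: -y
    h = 0.1
    y0, y1 = 1.0, math.exp(-h)           # starting values at x = 0 and x = h
    for k in range(2, 11):
        y2 = y1 + h * (3 * f(y1) - f(y0)) / 2
        y0, y1 = y1, y2
    print(y1, math.exp(-1.0))            # compare with the exact value at x = 1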

Carrying through the calculations systematically up to m = 5, we obtain
the following table which also contains the extrapolation to ȳₘ₊₁ for the
sake of a numerical estimation of the truncation error, in full analogy to
the previous table (9.3). The estimation of the error in the differential
equation by comparing ȳ'ₘ with y'ₘ can once more occur on the basis of
the table (9.7) which remains applicable to the present method.
* Cf. [6], pp. 3 and 53.

9.11. Mid-point interpolations


The step-by-step algorithm discussed in the previous sections can be
described as follows. Let us assume that we have at our disposal m
equidistant ordinates y₀, y₁, ..., yₘ₋₁ and the corresponding derivatives
y'₀, y'₁, ..., y'ₘ₋₁. Then the formulae of Section 8 or those of Section 10
permit us to evaluate the next ordinate yₘ as a certain weighted
average of the given data yⱼ and y'ⱼ. We do that for every one of the n
functions vᵢ(x) of the system (5.1). We have now obtained the next
system of values vᵢ(xₘ) and substituting in the functions Fᵢ(v₁, ..., vₙ; xₘ)
we immediately obtain also the corresponding v'ᵢ(xₘ). Now we proceed by
the same algorithm to the next point vᵢ(xₘ₊₁) by omitting our previous y₀
and y'₀ and operating with the 2m new values (y₁, y₂, ..., yₘ) and
(y'₁, y'₂, ..., y'ₘ). Once more we obtain all the vᵢ(xₘ₊₁) by the weighted
averaging, substitute again in the functions Fᵢ(v₁, ..., vₙ; xₘ₊₁), obtain
y'ₘ₊₁ = v'ᵢ(xₘ₊₁), and thus we continue the process in identical terms, all

the time estimating also the effect of the truncation error in the solution
or in the differential equation or both. Now it may happen that this
error estimation indicates that the error begins to increase beyond a danger
point which necessitates the choice of a smaller h. We shall then stop and
continue our algorithm with an h which has been reduced to half its previous
value. This can be done without any difficulty if we use our interpolating
polynomial for the evaluation of the functional values half way between the
gridpoints. Then we substitute the values thus obtained into the functions
Fᵢ of our differential equation (5.1), thus obtaining the mid-point values
of the y'ₖ. The new points combined with the old points yield a consecutive
sequence of points of the step-size h/2 and we can continue our previous
algorithm with the new reduced step-size.
The following table contains in matrix form the evaluation of the mid-point
ordinates, using as α-term the arithmetic mean of all the ordinates yₖ. The
notation y₀₁ refers to the mid-point value of y half way between y₀ and y₁,
similarly y₁₂ to the mid-point value of y half way between y₁ and y₂, and
so on. It is tacitly understood that the arithmetic mean M is added to
the tabular products. For example the line y₂₃, for m = 4, has the following
significance. Obtain the ordinate at an x-value which is half way between
that of y₂ and y₃ by the following calculation:

The common denominator is listed on the right side of the table.
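A sketch of the halving device in Python (numpy's polynomial fit stands in for the book's tabulated rational weights, and the sample ordinates are our own):

    import numpy as np

    h = 0.1
    xs = np.arange(4) * h
    ys = np.exp(-xs)                      # pretend these came from the integrator
    poly = np.polyfit(xs, ys, 3)          # cubic through the four points
    mid = np.polyval(poly, xs[:-1] + h/2) # ordinates half way between gridpoints
    print(np.abs(mid - np.exp(-(xs[:-1] + h/2))))   # interpolation error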



9.12. The problem of starting values


The step-by-step methods of the previous sections are self-perpetuating if
we are in possession of m consecutive equidistant ordinates of the mutual
distance h. For the sake of starting our scheme we must first of all generate
these ordinates. For this purpose we can make use of those "extrapolations
of maximum efficiency" which we have studied hi Section 7. Although
the accumulation of rounding errors prohibits the use of these formulae on
a repetitive basis, there is no objection to their use on a small scale if care
is taken for added accuracy in this preliminary phase of our work. This
can be done without substantial numerical hardships if we take into account
that the change of values from one point to the next is small and thus we
may take out a constant of our computations and concentrate completely
on the cliange of this constant, using the full decimal accuracy in the
computation of the change. The unification of the constant with its change
occurs only afterward, when we are through with the first three or four
stages of our calculations. Under these circumstances we are prepared to
make use of unstable formulae in the evaluation of the starting values
which precede the application of the regular step-by-step algorithm.
The initial value problem associated with the differential equation (5.1)
prescribes the values of vᵢ(x) at the initial point x = 0:

To obtain the next value vᵢ(h) we have to rely on a local Taylor expansion
around the origin x = 0. However, it would be generally quite cumbersome
to obtain the high order derivatives of the functions Fᵢ(vₖ, x) by successive
differentiations. We can avoid this difficulty by starting our numerical
scheme with a particularly small value of h. For this purpose we transform
our independent variable x into a new variable t by putting

Such a transformation may appear artificial and unnecessary at the present


stage of our investigation. It so happens, however, that somewhat later,
when we come to the discussion of the weaknesses of a purely local integration
process, and look around for a "global" process in complementation of the
local procedure, we shall automatically encounter the necessity of trans-
forming the independent variable x into a new angle variable θ. This
transformation entails around the origin x = 0 a change of variable of the
type (2).
The transformation (2) has the following beneficial effect. We want to
assume that it is not too difficult to obtain the partial derivatives of the Fᵢ
functions with respect to the vₖ in explicit form:

    Aᵢₖ = ∂Fᵢ/∂vₖ.                                                  (3)

This is in fact unavoidable if our aim is to correct a small error in the vₖ
by an added correction scheme.
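When the explicit form (3) is awkward to obtain, the matrix can also be approximated by finite differences; a small sketch (the sample system is our own illustration):

    import numpy as np

    def jacobian(F, v, x, eps=1e-6):
        """Approximate A_ik = dF_i/dv_k by forward differences."""
        v = np.asarray(v, float)
        F0 = np.asarray(F(v, x))
        A = np.zeros((len(F0), len(v)))
        for k in range(len(v)):
            dv = v.copy(); dv[k] += eps
            A[:, k] = (np.asarray(F(dv, x)) - F0) / eps
        return A

    F = lambda v, x: [v[1], -v[0] - 0.1 * v[1]]    # illustrative system
    print(jacobian(F, [1.0, 0.0], 0.0))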

The Aᵢₖ form the components of an n × n matrix. By implicit differentia-
tion we can now form the second derivative of vᵢ(x) with respect to x:

    v''ᵢ = ∂Fᵢ/∂x + Σₖ Aᵢₖ Fₖ.                                       (4)

The knowledge of the second derivatives of all the vᵢ(x) would still not be
sufficient, however, to obtain vᵢ(h) with sufficient accuracy. But the
situation is quite different if we abandon the original variable x and consider
from now on the functions vᵢ as functions of the new variable t. Then the
expansion around the origin appears as follows:

(We will agree that we denote differentiation with respect to x in the previous
manner by a dash, while derivatives with respect to t shall be denoted by a
dot.) Hence in the new variable t we obtain

We have thus obtained v(t) at the second point t = h with an error which is
of sixth order in h, without differentiating with respect to x more than
twice. Then, by substituting these vₖ(h) into the functions Fᵢ(vₖ(h); h) we
obtain also

We now come to the third point t = 2h. Here we can make use of the
six data v(0), v̇(0), v̈(0), d³v(0)/dt³ and v(h), v̇(h), on the basis of the (unstable)
formula:

(in our case this formula is simplified on account of v̇(0) = d³v(0)/dt³ = 0). Then
again we evaluate by substitution the quantities v̇ᵢ(2h).
From here we proceed to the fourth point t = 3h on the basis of the
(likewise unstable) formula

Now we are already in the possession of four ordinates (and their derivatives)
and we can start with the regular algorithm of Section 8 or 10, if we are

satisfied with m = 4, which leads to an error of the order h⁵. If, however,
we prefer an accuracy of one higher order (m = 5), we can repeat the
process (9) once more, obtaining vᵢ(4h) on the basis of the points h, 2h, 3h.
Then we arrive at the five points which are demanded for the step-by-step
algorithm with m = 5.
We add four further formulae which may be of occasional interest, two
of them stable, the other two unstable:
Stable

Unstable

9.13. The accumulation of truncation errors


The accumulation of numerical rounding errors can be avoided by choosing
an algorithm which satisfies the condition of numerical stability. The
accumulation of truncation errors is quite a different matter. The presence
of truncation errors came into evidence by the discrepancy which existed
between the extrapolated value ȳ'ₘ and the actual value y'ₘ obtained by
substituting the obtained ordinates in the given differential equation. We
could interpret this difference as an error term to be applied on the right
side of the given differential equation. What our step-by-step procedure
has given can thus be interpreted as the solution of a differential equation
whose right side differs from the correct right side by a very small error at
every point of the range. The question is now: what effect will a very
small error committed in the differential equation have on the solution?
Since we have constantly used the notation vᵢ(x) for the function obtained
by the local extrapolation process although this function satisfied not the
correct differential equation (5.1) but the modified differential equation
(9.5), we shall employ the notation v*ᵢ(x) for the correct solution of the

given differential equation (5.1). We can assume that the difference
between v*ᵢ(x) and our preliminary vᵢ(x) is small enough for a linear
perturbation method, neglecting the second and higher powers of the
difference. We will thus put

    v*ᵢ(x) = vᵢ(x) + uᵢ(x),                                         (1)

where the correction uᵢ(x) is of the order of the small error ε, and assume
that terms of the order ε² are negligible. This has the great
advantage that we obtain a linear differential equation for the correction
uᵢ(x). If we substitute the expression (1) in (5.1) and make use of the
notation Aᵢₖ (cf. (12.3)) for the partial derivatives of Fᵢ with respect to
the vₖ, we obtain for uᵢ the following differential equation:

    u'ᵢ = Σₖ Aᵢₖ uₖ − μᵢ(x).                                         (2)
Originally the Aᵢₖ are given as functions of the vᵢ and x. But we can
assume that we have substituted for vᵢ(x) the explicit functions found in our
step-by-step process. This makes the Aᵢₖ mere functions of x.
Now we want to know how a local error, committed in the jth equation
at a certain point x = ξ, will influence the solution. This means that we
want to solve the equation

    u'ᵢ = Σₖ Aᵢₖ uₖ + δⱼ(x, ξ)                                       (3)

(δⱼ(x, ξ) denotes the delta function put in the jth equation, while the other
equations have zero on their right side). We know from the properties of
the Green's function that all uᵢ(x) will vanish up to the point x = ξ, while
beyond that point we have to take a certain linear combination of the
homogeneous solutions which satisfy the condition that all the uᵢ(x) vanish
at x = ξ, except for uⱼ(x) which becomes 1 at the point x = ξ:

Now we cannot solve the homogeneous equation

    u'ᵢ = Σₖ Aᵢₖ uₖ                                                  (5)

in explicit analytical form. But an approximate solution can be found by
considering the Aᵢₖ (which are actually functions of x) in the small
neighbourhood of a point as constants. Then the solution can be given as a
linear superposition of exponential functions of the form

    uᵢ(x) = cᵢ e^(λx).                                              (6)

Substitution of (6) in the system (5) yields for λ the characteristic equation

    det(Aᵢₖ − λ δᵢₖ) = 0.                                            (7)

The n solutions of this algebraic equation yield n (generally complex) values
for λ and to each λ = λₖ a system of the form (6) can be found, with a free
universal factor Cₖ remaining in each solution. An arbitrary linear super-
position of these n particular solutions yields the general (approximate)
solution of (5).
The decisive question is now how the real parts Re λₖ of these n roots λₖ
behave. If all Re λₖ are negative, then the solutions are exponentially
decaying functions and this means that a small error in the differential
equation will quickly extinguish itself. If, however, one or more of the λₖ
have a positive real part, then a small right side of (2) will induce a solution
which increases exponentially, and will cause a large error of vᵢ(x). We
have no way of preventing this error since the nature of the roots of (7) is
entirely dictated by the given differential operator which cannot be altered.
Whether this accumulation of local errors is damaging or not will depend
on the nature of the solution. The solution itself may increase ex-
ponentially with a speed which is equal to or even larger than the largest of
the Re λₖ. In that case the relative error of vᵢ(x) does not increase unduly.
But if it so happens that the solution locks itself on an exponent which is
smaller than the largest Re λₖ, then the speed with which the errors
accumulate in the solution will eventually cause an intolerably large relative
error, in spite of the smallness of local errors observed during the step-by-step
process.
Consider for example the situation exemplified by the Bessel functions
along the imaginary axis. The transformation of Bessel's differential
equation into the normal form yields a pair of first order equations. The
characteristic equation (7)—for not too small x—yields a positive and a
negative root. Now the given initial value problem may be such that it
calls for the exponentially decreasing function. Then it will inevitably
happen that the truncation errors cause the appearance of the other,
exponentially increasing solution which will sooner or later overpower the
desired solution.
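The situation is easy to reproduce numerically; the sketch below (our own example) freezes the matrix Aᵢₖ of the normal form of y'' = y and inspects the real parts of the roots of (7), i.e. of the eigenvalues of A:

    import numpy as np

    A = np.array([[0.0, 1.0],
                  [1.0, 0.0]])     # normal form of y'' = y: v1' = v2, v2' = v1
    lams = np.linalg.eigvals(A)    # roots of det(A - lambda*I) = 0
    print(lams, [lam.real > 0 for lam in lams])   # one growing, one decaying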
Under these circumstances we must ask ourselves whether any step-by-
step procedure can truly solve our integration problem and whether it
would not be more adequate to consider the solution thus obtained as a
preliminary solution which should be corrected by a global method. In this
global method we abandon the idea of breaking up our domain into small
local sections, but consider the entire range of x between x = 0 and x = I
as one integral unit. But then what procedure will be at our disposal?
Shall we replace the approximations by low order polynomials in the
neighbourhood of a point by a high order polynomial in the entire range?
We have seen that such a procedure cannot succeed in the case of equidistant

data because the interpolating polynomial of high order will generally


diverge between the points of interpolation and cannot be applied for an
approximation in the large. We have to abandon the programme of
equidistant steps and introduce a properly chosen unequal distribution of
ordinates. The first outstanding example of such an integration process was
discovered by Gauss, who introduced an exceptionally powerful method
for the global evaluation of a definite integral. This method is closely
related to those "extrapolations of maximum efficiency" which we have
studied in Sections 6 and 7. In fact, the entire Gaussian quadrature
method can be conceived as a special case of Lagrangian interpolation in
terms of double points, but distributing these points in a particularly
judicious manner.
9.14. The method of Gaussian quadrature
We consider once more the method of Section 6 in which not only the
functional values but also their derivatives are employed for the purpose
of interpolation and extrapolation. Let us consider a function f(x) which
is the indefinite integral of a given function g(x). Then

    f'(x) = g(x).
From the standpoint of f(x) the given functional values represent the
derivatives f'(xₖ) which appeared in the formula (5.20.11). But then the
difficulty arises that the ordinates f(xₖ) themselves, which enter the first
term of the formula, are not known. We can overcome, however, this
difficulty by the ingenious device of putting the points xₖ in such positions
that their weights automatically vanish. Then our interpolation (or extra-
polation) formula will not contain any other data than those which are
given to us.
Let us assume that we want to obtain the definite integral

    A = ∫₋₁⁺¹ g(x) dx.                                              (2)

Since in f(x) we have a constant of integration free, we can define

    f(−1) = 0

and consider this condition as additional data, to be added to the n
ordinates f'(xₖ) = g(xₖ). This means that the fundamental polynomial
F(x) becomes of the order 2n + 1 because we have added the single point
x = −1 to the double points xₖ (we consider x = −1 as single since we have
no reason to demand that f'(x) must be given at this point). Hence

    F(x) = (x + 1)(x − x₁)²(x − x₂)² ... (x − xₙ)².
Now the vanishing of the factor of f(xₖ) demands the following condition
(cf. (5.20.11)):

We cannot satisfy this condition simultaneously for various x-values.
But our aim is—according to (2)—to obtain f(1) and thus we can identify
the point x with x = 1. Let us put

    G(x) = (x − x₁)(x − x₂) ... (x − xₙ).

Then

    F(x) = (x + 1) G²(x).

Since G(xₖ) = 0, we obtain

and the condition (5) (for x = 1) demands

or
We may modify this condition to

    (1 − x²) G''(x) − 2x G'(x) + n(n + 1) G(x) = 0,                 (11)

because the addition of the last term does not change anything, since
G(xₖ) = 0. But now the differential operator in front of G(x) has the property
that it obliterates the power xⁿ and thus transforms a polynomial of the
order n into a polynomial of lower order. But a polynomial of an order
less than n cannot vanish at n points without vanishing identically. And
thus the differential equation (11)—which is Legendre's differential equation
—must hold for G(x) not only at the points xₖ but everywhere. This
identifies G(x) as the nth Legendre polynomial Pₙ(x), except for an
irrelevant factor of proportionality:

The zeros of the Gaussian quadrature are thus identified as the zeros of the
nth Legendre polynomial.
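In modern terms these abscissas and weights are tabulated by standard libraries; a brief check with numpy (the sample integrand eˣ is our own choice):

    import numpy as np

    # Gaussian abscissas (zeros of P_n) and weights on [-1, 1]
    n = 5
    xk, wk = np.polynomial.legendre.leggauss(n)
    g = lambda x: np.exp(x)
    print(np.dot(wk, g(xk)), np.exp(1) - np.exp(-1))  # exact for degree <= 2n-1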
In this approach to the problem of Gaussian quadrature we obtain the
weight factors of the Gaussian quadrature formula in a form which differs
from the traditional expression. The traditional weights of the Gaussian
quadrature formula

    A = w₁ g(x₁) + w₂ g(x₂) + ... + wₙ g(xₙ)                        (13)

are given as follows*:

while the second term of the formula (5.20.11) gives, without any integration,
in view of the formulae (7):

The remainder of the Gaussian quadrature is likewise obtainable without


any integration, on the basis of the Lagrangian remainder formula associated
with the interpolation of f(x):

The factor cₙ in (12) is determined by the highest power of x, which is 1 on
the left side, while the highest coefficient of Pₙ(x) becomes (2n)!(n!)⁻²2⁻ⁿ.
Hence

    cₙ = (n!)² 2ⁿ/(2n)!

and we obtain

which agrees with the traditional expression of ηₙ.


9.15. Global integration by Chebyshev polynomials
The Gaussian quadrature method possesses an exceptionally high degree
of accuracy since we accomplish with n ordinates what otherwise would
require 2n ordinates. This was the original motivation in the Gaussian
discovery. In actual fact more is accomplished than a mere saving of
ordinates. The Gaussian choice of the zeros yields a convergent method
of integration since the Legendre polynomials form an orthogonal set of
functions. The more ordinates we take into account, the nearer we come
to the true value of A. This is by no means so if equidistant ordinates
are employed, even if we are willing to go through the double amount of
computational labour. Equidistant ordinates do not have the tendency to
converge—as we have seen in Chapter 1—except if f(x) belongs to a very
limited class of functions. The Gaussian quadrature, on the other hand,
converges even if applied to non-analytical functions. If we want to
operate with polynomials of high order, we must dispense with the use of
equidistant ordinates and replace them by ordinates which are related to
some orthogonal set of functions.
The Gaussian method works only for the evaluation of a definite integral.
If an indefinite integral is in question, we must follow a somewhat different
approach. Here we cannot expect the very high degree of accuracy that
* Cf. e.g. A. A. (6.10.4) and (8), p. 398.

characterises the Gaussian quadrature. But we can save the other outstand-
ing feature of the Gaussian method, namely that it operates with orthogonal
polynomials and thus provides a global approximation which converges
better and better as the degree of the polynomial increases.
We obtain a perfect counterpart of the Gaussian quadrature for the case
of an indefinite integral if we replace the Legendre polynomials by another
outstanding set of polynomials, called "Chebyshev polynomials".* The
operation with these polynomials is equivalent to the transformation

    x = (l/2)(1 − cos θ),                                           (1)

which transforms the range [0, l] of x into the range [0, π] of θ. The
originally given function g(x), if viewed from the variable θ, becomes a
periodic function of θ of the period 2π, which can be expanded in a Fourier
series. Moreover, it is an even function of θ which requires only cosine
functions for its representation. We assume that g(x) is given at n points
which are equidistant in the variable θ but not equidistant in the variable x.
Two distributions are in particular of interest:

    θₖ = kπ/n    (k = 0, 1, ..., n)                                 (2)

and

    θₖ = (2k − 1)π/2n    (k = 1, 2, ..., n).                        (3)

The irrational distribution of points from the standpoint of the variable x
causes a certain inconvenience in the case of tabulated functions but is not
objectionable if the integration of a differential equation is involved where
no tabulated functions occur, and the unknown functions are to be deter-
mined on the basis of the differential equation itself. Here we have no
difficulty in abandoning the variable x from the beginning and introducing
immediately the new variable θ.
We will first solve the integration problem associated with the Fourier
series, paying no attention to the specific conditions which prevail in our
problem on account of the even character of the function involved. We
start with the trigonometric identity

$$\frac{1}{2} + \cos t + \cos 2t + \cdots + \cos (n-1)t + \frac{1}{2}\cos nt = \frac{\sin nt}{2\tan (t/2)} \tag{4}$$

and observe that a function φ(θ), given in any 2n equidistant points θ_k of
the mutual distance π/n, allows the following trigonometric interpolation:

$$\varphi_n(\theta) = \sum_{k} \varphi(\theta_k)\,\delta(\theta - \theta_k) \tag{5}$$

* Cf. A. A., p. 245.



where

$$\delta(t) = \frac{\sin nt}{2n \tan (t/2)} \tag{6}$$

Indeed, the form of the function (6) shows that the sum (5) represents a
Fourier series of sine and cosine terms up to the order n. Moreover, if we
put θ = θ_k, we obtain

$$\varphi_n(\theta_k) = \varphi(\theta_k)$$

and thus we have actually interpolated the given 2n data with the help
of a Fourier series of lowest order which fits these data.
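Both the identity (4) and the interpolation property are easy to check numerically. A small sketch, assuming the kernel δ(t) as reconstructed above (the printed displays are not legible in this copy):

```python
import numpy as np

n = 6
t = 0.37                                   # any t that is not a node
lhs = 0.5 + sum(np.cos(j * t) for j in range(1, n)) + 0.5 * np.cos(n * t)
rhs = np.sin(n * t) / (2.0 * np.tan(t / 2.0))
print(abs(lhs - rhs))                      # ~1e-16: the identity (4)

def delta(t):
    # kernel (6), written with sin in the denominator; the removable
    # singularity at t = 0 is patched to its limit value 1
    t = np.asarray(t, dtype=float)
    s = np.sin(t / 2.0)
    safe = np.where(np.abs(s) < 1e-12, 1.0, s)
    val = np.sin(n * t) * np.cos(t / 2.0) / (2.0 * n * safe)
    return np.where(np.abs(s) < 1e-12, 1.0, val)

theta = np.arange(-n + 1, n + 1) * np.pi / n          # 2n equidistant points
data = np.cos(2 * theta) + 0.3 * np.sin(theta)        # arbitrary test data
recon = np.array([np.sum(data * delta(tk - theta)) for tk in theta])
print(np.max(np.abs(recon - data)))        # ~1e-16: the data are reproduced
```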
Now we want to integrate the function φ(θ), denoting the indefinite
integral by Φ(θ):

$$\Phi(\theta) = \int_0^{\theta} \varphi(\tau)\, d\tau$$

For this purpose we replace the exact φ(θ) by its approximation on the
basis of trigonometric interpolation, that is φ_n(θ). Then the formula (5)
yields

$$\Phi_n(\theta) = \sum_k \varphi(\theta_k)\, W_k(\theta)$$

where

$$W_k(\theta) = \int_0^{\theta} \delta(\tau - \theta_k)\, d\tau$$

The integral in the last line is for large n very nearly equal to Si (nt) where
Si x is the sine integral, defined by (2.3.11). It is preferable to introduce a
slightly different function that we want to denote by K(t):

and put

The function K(t) has the following two fundamental properties:

We now return to our original global integration problem in the variable
x. We want to obtain

$$G(x) = \int_0^{x} g(\xi)\, d\xi$$

The problem of integration can then be solved as follows. We first define
our fundamental data by putting

$$y_k = \frac{\pi}{2n}\,\sin\theta_k\, g(x_k) \tag{16}$$

The subscript k runs from 0 to n in the case of the distribution (2) and
from 1 to n in the case of the distribution (3) (in the first case y_0 = y_n = 0;
this means that we lose our two end-data g(0) and g(l), which enter all our
calculations with the weight zero. The loss is not serious, however, if n is
sufficiently large). We must extend our domain of y_k values toward negative
k, in order to have a full cycle. Hence we define

$$y_{-k} = -y_k \tag{17}$$

The full range of k extends now from -n to +n (including k = 0 in the
first distribution and excluding it in the second).
Now the indefinite integral of g(x), expressed in the variable θ, becomes:

Although the data (16) have to be weighted for every value of θ separately,
yet the formula (18) shows that it suffices to give a one-dimensional sequence
of weight factors for every n. Let us assume that we want to obtain G(θ)
at the data points. Then

and it suffices to evaluate the n coefficients

(with the added condition W_{-s} = -W_s), in order to obtain

We may prefer to obtain the integral not at the data points but half-way
between these points. This is advisable in order to minimise the errors.
The error oscillations of trigonometric interpolation follow the law

$$\eta(\theta) = A(\theta)\,\sin (n\theta + \varphi)$$

where φ is a constant phase angle (which is zero for the first and π/2 for the
second distribution of data points), while the amplitude A(θ) changes
slowly, compared with the rapid oscillations of the second factor. Hence
the approximate law of the error oscillations of the integral becomes

$$-\frac{A(\theta)}{n}\,\cos (n\theta + \varphi)$$

which is zero half-way between the data points. At these points we can
expect a particularly great accuracy, gaining the factor n compared with the
average amplitude of the error oscillations. For these points the coefficients
W_s have to be defined as follows:

(the prime attached to sigma shall indicate that the last term is to be taken
with half weight). The symmetry pattern of this weighting is somewhat
different from the previous pattern, due to the exclusion of the subscript
s = 0.

9.16. Numerical aspects of the method of global integration


The global method of integration is based on the great efficiency of the
Chebyshev polynomials for the global representation of functions, defined in
a finite range. However, the actual expansion of g(x) (and also G(x)) into
Chebyshev polynomials does not appear in explicit form but remains latent
in the technique of trigonometric interpolation. The range of applicability
of this method is wider than that of the usual "point to point integration
techniques" which are based on Simpson's formula or some formula of a
similar type. Such techniques assume the existence of derivatives up to a
certain order. Here the analytical nature of the function g(x) is not required.
Functions of the type log x, √x, and other similar functions which are
integrable but not differentiable, are included in the validity domain of the
method of global integration, in full analogy to the Gaussian quadrature
method which, however, is restricted to the evaluation of a definite integral.
Compared with the Gaussian method we lose the advantage of halving the
number of ordinates; that is, the accuracy obtained is comparable to the
Gaussian accuracy with half as many ordinates. However, the saving of
ordinates is not as decisive as the advantage of a global technique which
avoids the accumulation of local truncation errors.
We will study the numerical aspects of the method for n = 12. First of
all we will generate the sequence of weight factors W_s, defined by (15.20)
and (15.24). Since all the weights are near to ±1/2, we will omit the
constant 1/2 from our W_s. Hence we will put (for positive s):

$$w_s = W_s + \tfrac{1}{2}$$

and retain the definition

$$w_{-s} = -w_s$$


For the set (15.24) we obtain the following sequence:

These values represent very nearly the oscillations of the function

and the amplitudes are particularly small since we are near to the nodal
points of these oscillations, namely half-way between the consecutive
minima and maxima which occur at the points

Hence we obtain only a small correction of the principal contributions of
the weights W_s, which is -1/2 for positive s and +1/2 for negative s. This
contribution can be taken into account separately. Apart from an additive
constant which is irrelevant—since the indefinite integral contains a free
additive constant anyway—it amounts to the summing of the ordinates
y_1, y_2, . . . , up to y_m.
For a more concrete elucidation of the numerical technique we will
normalise the range of x to [0, 1] and we will assume that the data points
are given as the n points

$$x_k = \frac{1 - \cos\theta_k}{2}, \qquad \theta_k = \frac{(2k-1)\pi}{2n}$$

which corresponds to the second distribution (9.15.3). We wish to obtain
the indefinite integral at the midpoints between the data points, that is at
the values

$$x_m = \frac{1 - \cos\theta_m}{2}, \qquad \theta_m = \frac{m\pi}{n}$$

In order to normalise the constant of integration, we assume that the
initial value G(0) of the integral is given.

Now the formula (15.21) expresses a numerical method known as the
"movable strip technique".* Let us assume that the data y_k are put in
proper succession on a horizontal strip which is fixed. The weights w_k
* Cf. A. A., p. 13.

are put on a parallel strip which gradually moves to the right. The initial
position of these two strips is as follows:

We multiply corresponding elements and form the sum. Let the result be
G_0.

We now move the strip one place to the right, obtaining the following picture:

In view of the cyclic nature of our method we should rather think of a
movable "band" which has no beginning and no end. The number w_n
which disappeared at the right end comes back at the left end. The
movable strip is thus all the time filled up with the same set of numbers;
it is only the left starting point which gradually moves from -n to n, then
to n - 1, n - 2, . . . , as the strip moves forward.

Once more we form the products of corresponding elements and their
sum, noting down the result as G_1. Then the strip moves forward again,
giving rise to G_2, and so on (this kind of procedure is particularly well
adapted to the task of coding for the electronic computer). Generally

$$G_m = \sum_k y_k\, w_{k-m}$$

The integral of the given function g(x) is now evaluated as follows:

$$G(x_m) = G(0) + \sum_{k=1}^{m} y_k + (G_m - G_0)$$
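The G_m above is, in present-day terms, a cyclic correlation of the fixed data band with the moving weight band. A minimal sketch; the starting alignment of the two bands is an assumption, since the strip pictures are not legible in this copy:

```python
import numpy as np

def movable_strip(y_band, w_band):
    """Cyclic movable-strip sums G_m = sum_k y_k * w_(k-m).

    y_band, w_band: equal-length 1-D arrays holding one full cycle of the
    data and of the weights, assumed aligned at the start (m = 0).
    """
    N = len(y_band)
    # np.roll(w_band, m) shifts the weight band m places to the right,
    # so position k then carries w_(k-m), with cyclic wrap-around
    return np.array([np.dot(y_band, np.roll(w_band, m)) for m in range(N)])
```

Each forward move of the strip is one np.roll, each G_m one scalar product — which is why the scheme is so well adapted to coding for the electronic computer.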

As a concrete numerical example we will consider the following function of
the range [0, 1]:

$$g(x) = \frac{1}{1 + (2x-1)^2} \tag{13}$$

First of all we need our data y_k and for this purpose we have to form the
products (15.16) for n = 12, θ_k = 7.5°, 22.5°, 37.5°, 52.5°, 67.5°, 82.5°.

Since g(x) is symmetric with respect to the centre point x = 1/2 (or θ = 90°),
the second set of data k = 7 to 12 merely repeats the first set in reverse
order. Hence we will list the y_k only up to k = 6.

k      y_k
1    0.00861632
2    0.02702547
3    0.04890525
4    0.07577005
5    0.10548729
6    0.12760580
If now we carry out the calculations according to the movable strip technique,
with the data y_k on the fixed strip and the weights (3) on the movable strip,
we obtain the following results. The exact integral is available in our
example:

$$G^*(x) = \frac{1}{2}\left[\arctan (2x-1) + \arctan 1\right]$$

which is a tabulated function.* Hence we have no difficulties in checking
our numerical results. The following table contains the exact values
(given to 8 decimal places) of the function

$$G^*(x_m) - G^*(0)$$

(evaluated at points which are in the variable θ half-way between the data
points), and the corresponding values obtained by global integration. The
sum of the ordinates is listed separately, in order to show the effect of the
correction.

m    G*(x_m) - G*(0)    Σ y_k          G_m - G_0       G(x_m) - G(0)
1      0.00866532       0.00861632     0.00004918      0.00866550
2      0.03583689       0.03564178     0.00019590      0.03583747
3      0.08495923       0.08454704     0.00041242      0.08495945
4      0.16087528       0.16031709     0.00055811      0.16087520
5      0.26606830       0.26580438     0.00026414      0.26606852
6      0.39269908       0.39341018    -0.00071101      0.39269917
                                                         (9.16.17)
The "sum of the ordinates" 2y* corresponds to the simple "trapezoidal
rule" of obtaining an area. Since the correction is small, the global integra-
tion method can only be effective if the ordinates are sufficiently close to
allow a satisfactory application of the trapezoidal rule (although it is of
interest to remember that in our case we would get exact results for any
g(x) which is given as an arbitrary polynomial not exceeding the order 11).
* "Tables of the Arc tan x", NBS, Applied Mathematical Series 26 (U.S. Government
Printing Office, Washington, B.C., 1953).
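This example can be checked end to end. The sketch below is a reconstruction, not the book's own computation: the test function g(x) = 1/(1 + (2x-1)^2), the exact integral G*(x) = (1/2)[arctan(2x-1) + arctan 1] and the data formula y_k = (π/2n) sin θ_k g(x_k) are inferred from the tabulated values above (they reproduce the listed y_k to the printed digits), and instead of the tabulated weights the trigonometric interpolant of dG/dθ is integrated directly by a fine trapezoidal rule — mathematically the operation the weights perform:

```python
import numpy as np

n = 12
h = np.pi / n
k = np.arange(1, n + 1)
theta_k = (2 * k - 1) * h / 2                    # second distribution (15.3)
x_k = (1 - np.cos(theta_k)) / 2

g = lambda x: 1.0 / (1.0 + (2.0 * x - 1.0) ** 2)            # assumed g(x)
G = lambda x: 0.5 * (np.arctan(2.0 * x - 1.0) + np.pi / 4)  # assumed integral

f_k = 0.5 * np.sin(theta_k) * g(x_k)             # dG/dtheta at the data points
th_all = np.concatenate([-theta_k[::-1], theta_k])  # odd extension: full cycle
f_all = np.concatenate([-f_k[::-1], f_k])

def cardinal(t):
    # sin(n t)/(2 n tan(t/2)): 1 at t = 0, 0 at the other nodes
    s = np.sin(t / 2.0)
    safe = np.where(np.abs(s) < 1e-13, 1.0, s)
    val = np.sin(n * t) * np.cos(t / 2.0) / (2.0 * n * safe)
    return np.where(np.abs(s) < 1e-13, 1.0, val)

tg = np.linspace(0.0, np.pi, 100001)             # dense grid in theta
interp = sum(fj * cardinal(tg - tj) for fj, tj in zip(f_all, th_all))
Phi = np.concatenate([[0.0],
                      np.cumsum(0.5 * (interp[1:] + interp[:-1]) * np.diff(tg))])

for m in range(1, 7):                            # midpoints theta_m = m pi / n
    x_m = (1 - np.cos(m * h)) / 2
    print(m, np.interp(m * h, tg, Phi), G(x_m) - G(0.0))
# the two columns agree to roughly the accuracy shown in table (9.16.17)
```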

The correction, although very small, is highly effective and extends the
accuracy to the seventh decimal place.

We will, however, add another distribution of our data points which
corresponds to (15.2) and which is more suitable in view of the application
to the solution of differential equations. We will now assume that our data
are given at multiples of the angle π/n and that the evaluation of the
integral shall occur at the same points. Then we need the weights which
correspond to the definition (15.20). These weights are contained in the
following table.

s       w_s              s       w_s
0       0                7    -0.01012223
1    -0.08891091         8     0.00762260
2     0.04742642         9    -0.00547171
3    -0.03134027        10     0.00354076
4     0.02267382        11    -0.00174001
5    -0.01712979        12     0
6     0.01317377

The values in this table oscillate with much larger amplitudes than those of
the previous table (3) since at present we are at the points of the minima
and maxima of the function (4). Special attention has to be given to the
value w_0 which according to the general rule should be listed as 1/2 instead of
zero. But again we will take into account the effect of this large constant
separately. It amounts to the following modification of the previous law
of the ordinates. Instead of adding up the y_i according to (12), we have to
modify the sum by taking the two limiting ordinates with half weight:

$$\tfrac{1}{2} y_0 + y_1 + \cdots + y_{m-1} + \tfrac{1}{2} y_m$$

Once more we have to construct our data according to the formula
(15.16), but now associated with the angles θ_k = 15°, 30°, 45°, 60°, 75°, 90°.
Once more we need not list our data beyond k = 6 because they return in
reversed order (but the symmetry point now is at k = 6 which is not repeated;
y_7 = y_5, y_8 = y_4, etc.).

k      y_k
0      0
1    0.01752670
2    0.03739991
3    0.06170671
4    0.09068997
5    0.11850131
6    0.13089969

The movable strip (8) comes again into operation but with a slightly modified
symmetry pattern since k = 0 is now included in our subscripts:

We list the results once more in tabular form, in full analogy to the previous
table (17).

m    G*(x_m) - G*(0)    Σ' y_k         G_m - G_0       G(x_m) - G(0)
1      0.00866532       0.00876336    -0.00010103      0.00866232
2      0.03583689       0.03622666    -0.00038977      0.03583689
3      0.08495923       0.08577996    -0.00082367      0.08495629
4      0.16087528       0.16197830    -0.00110330      0.16087500
5      0.26606830       0.26657394    -0.00050791      0.26606603
6      0.39269908       0.39127445    +0.00142313      0.39269758
                                                         (9.16.22)
The error has now moved up to the sixth decimal place which is a consider-
able increase compared with the much smaller errors of Table (17); this is
what we have expected on the basis of the error behaviour of the Fourier
series.
Actually the two sets of weights (3) and (18), if applied to the same sets of
data, give redundant results inasmuch as they both define the same function
G(6), computed at two sets of points. Since this function can be generated
in terms of trigonometric interpolation by one set of data, we can predict
the results of the weighting by the other set of coefficients if we apply
half-way interpolation to the computed points, on the basis of the formula
(15.5). Hence the much less accurate ordinates which have been obtained
by the weights (18), are nevertheless able to restore the much more accurate
ordinates half-way between (which are also available by applying the
weights (3) to the data), through the medium of trigonometric interpolation.
The objection can be raised that the chosen function (13) is too smooth for
a true testing of the method since it belongs to that class of functions which
are amenable even to equidistant interpolation. A more characteristic
choice would have been a similar function, encountered in Chapter 1:

where a had the value 0.2 instead of 1. However, the more or less smooth
character of the function merely changes the number of ordinates needed
for a certain accuracy. In the present example we are in possession
of an exact error analysis and we can show that the accuracy of the tables
(17), respectively (22), could have been matched with the much less smooth
function (23) (with a = 0.2), but at the expense of a much larger number of
ordinates since the present number n = 12 would have to be raised to
n = 50. Our aim was merely to demonstrate the numerical technique and
for that purpose a simpler example seemed more adequate.
9.17. The method of global correction
In the previous methods of solving trajectory problems by inching
forward step by step on the basis of local Taylor expansions, we did not
succeed in exhibiting an explicit solution in the form of a set of continuous
functions v_i(x). The point of reference was constantly shifting and we could
not arrive at a solution which could be truly tested whether or not it satisfied
the given differential equation. Even if we did find that the differential
equation is actually satisfied at every point with a high degree of accuracy,
we would still have to convince ourselves that these small local errors will
not accumulate and possibly cause a large error in the solution. But the
principal objection to the method of shifting centres of expansion is that
we have obtained a set of discrete values v_i(x_m) instead of true functions of
x: v_i(x). We usually get round the difficulty in a purely empirical fashion
by reducing the step-size h and making repeated trials, until we come to the
point where the y_m-values become "stabilised", by approaching more and
more definite limits. But a real "solution" in the sense of testable functions
has not been achieved.
This situation is quite different, however, if we change from the variable
x to the angle variable θ, obtaining once more the solution by the previous
step-by-step process. Although we have once more only a discrete sequence
of ordinates, we can now combine all these ordinates into a continuous and
differentiable function by the method of trigonometric interpolation. Now
we have actually obtained our v_i(θ) and can test explicitly to what extent
the given differential equation has been satisfied.

It is more satisfactory to start with the derivatives y'_m, which we possess
in all the data points. By trigonometric interpolation we can combine these
data to a true function v'(θ) and then, by the technique of integration
discussed in the sections 15 and 16, we also obtain v(θ). In contradistinction
to the previous problem our "data" are now more simply constructed. In
the previous case (15.16) the factor 1/2 sin θ_k appeared because we had to
obtain dG/dθ in terms of the given dG/dx. In the present problem we have
abandoned from the very beginning the operation with the original variable
x and changed over to the variable θ. Hence we possess dv/dθ without any
transformation and our y_k now become simply

We apply the "movable strip technique" and arrive at the functional


values Vi(9m). These vt(dm) will generally not agree, however, with the
previous ^-values obtained in the course of the step-by-step process. We
have thus to distinguish between three functions: the function Vi(6), found
by the step-by-step procedure at the data points 6m (and combinable to a
true function V{(0) by the process of trigonometric interpolation): then the

function v̄_i(θ), found by integrating the derivative data y'_m; and finally the
true function v*_i(θ) which actually satisfies the given differential equation
(5.1). The difference

can be explicitly obtained at all the data points because the previous process
gave us v_i(θ_m) and now we have found the new v̄_i(θ_m). We assume that the
difference ρ_i(θ_m) is small. Then we have a good indication that the function
v̄_i(θ) will need only a small correction in order to obtain the true function
v*_i(θ). Hence we will put

The substitution of this expression in our differential equation (9.5.1) yields:

Then, in view of the smallness of u_i and ρ_i, we can be satisfied with the
solution of the linear perturbation problem

If the second term were absent, we could immediately integrate this equation
by the previous global integration technique. The second term has the
effect that instead of an explicit solution u_i(θ_m) at the data points we obtain
a large-scale linear system of algebraic equations for the determination of
the u_i(θ_m). Since the matrix of this system is nearly triangular, we have
no difficulty in solving our system by the usual successive approximations.
We have then obtained the global correction which has to be added to the
preliminary step-by-step solution, and we have the added advantage that
now we really possess the functions v*_i(θ) at all points, instead of a discrete
set of ordinates which exist only in a selected sequence of isolated points.
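The "usual successive approximations" can be indicated schematically: writing the algebraic system as (I - K)u = b with K nearly triangular and small, one iterates u ← b + Ku. The matrix below is a hypothetical stand-in, since the system (5) itself is not reproduced in this copy; for a strictly lower triangular K the iteration even terminates exactly, K being nilpotent:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 12
# hypothetical stand-in for the nearly triangular matrix of the system
K = np.tril(0.2 * rng.standard_normal((N, N)), k=-1)   # strictly lower triangular
b = rng.standard_normal(N)

u = np.zeros(N)
for _ in range(N):
    u = b + K @ u                       # one sweep of successive approximation
print(np.allclose(u, np.linalg.solve(np.eye(N) - K, b)))   # True
```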
Great difficulties arise, however, if it so happens that the linear algebraic
system associated with the linear differential equation (5) is "badly
conditioned". Then the smallness of the right side cannot guarantee the
smallness of u_i(θ). Not only is our perturbation method then in danger
of being put out of action, but the further danger exists that the solution of
the given problem (5.1) is exceedingly sensitive to very small changes of
the initial data. The solution of such problems is very difficult and we may
have to resort to the remedy of sectionalising our range and applying the
step-by-step method in combination with the global correction technique
separately in every section, thus reducing the damaging influence of
explosive error accumulations.
Numerical Example. The following numerical example characterises the
intrinsic properties of the local and global integration procedures. We
choose a differential equation whose solution is known in analytical form
and which is tabulated with great accuracy.* Bessel's differential equation
for imaginary argument ix and the order p = 0 can be written in the
normal form (2.3) as a pair of first order equations:

We choose the interval x = [1, 5] and give at x = 1 the initial conditions

for the Hankel function $iH_0^{(1)}(ix)$, which has the property that it decreases
exponentially to zero as x increases to infinity, while the second solution
$iH_0^{(2)}(ix)$—which will enter in consequence of the truncation and rounding
errors—increases exponentially.
Our aim is to study the error pattern of the local and the global procedure,
excluding the accumulation of the purely numerical rounding errors. The
results are thus tabulated to 6 decimal places, while the computations were
made to 10 decimal places. In the local procedure the interval h = 0.2 was
chosen, which means that the total interval was covered by 21 points. It
is convenient to change the independent variable x to the new variable
x_1 = 1.2x. In the new variable h becomes 0.24 and the formula for the
step-by-step integration, with an error of the order h^5, becomes, if we use
Adams' method (cf. 10.5):

Since our aim is to study the gradual accumulation of truncation errors,
we assume optimum starting conditions and give the first 4 values of u(x)
and v(x) with 10 place accuracy, taken from the NBS tables.
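The character of this experiment is easy to reproduce. The sketch below assumes a reconstructed form of the pair: with v = u' the normal form gives u' = v, v' = u - v/x, whose decaying solution is u = (2/π)K_0(x), v = -(2/π)K_1(x) — this matches the tabulated starting values u(1) = 0.268032, -v(1) = 0.383186. The classical fourth-order Adams–Bashforth formula stands in for the book's Adams method (its printed display is not legible here), and the integration is carried out in x rather than in the book's scaled variable x_1, so the digits will not coincide exactly; the exponential error growth, however, is the same:

```python
import numpy as np
from scipy.special import k0, k1

def f(x, y):
    u, v = y                            # assumed pair: v = u'
    return np.array([v, u - v / x])

h = 0.2
xs = np.arange(1.0, 5.0 + h / 2, h)     # 21 points on [1, 5]
ys = np.zeros((len(xs), 2))
for i in range(4):                      # exact (optimum) starting values
    ys[i] = (2 / np.pi) * np.array([k0(xs[i]), -k1(xs[i])])

# classical 4th-order Adams-Bashforth step, local error O(h^5)
for i in range(3, len(xs) - 1):
    ys[i + 1] = ys[i] + (h / 24) * (
        55 * f(xs[i], ys[i]) - 59 * f(xs[i - 1], ys[i - 1])
        + 37 * f(xs[i - 2], ys[i - 2]) - 9 * f(xs[i - 3], ys[i - 3]))

for x, (u, v) in zip(xs, ys):
    print(f"{x:3.1f}  u_l={u: .6f}  u*={(2 / np.pi) * k0(x): .6f}  -v_l={-v: .6f}")
```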
In the global procedure we will not use 20 sections since this would make
the analytical errors so small that the only remaining errors would be the
numerical rounding errors. We will employ half the number of points by
dividing the interval into 10 sections only, satisfying the given differential
equation in the 10 equidistant points

where

While the local procedure uses parabolas of fourth order with a constantly
shifting origin, the global procedure uses the same trigonometric cosine
polynomial of tenth order. It is this sameness which prevents the gradual
increase of the truncation errors. The trigonometric interpolation is
characterised by periodic rather than exponentially increasing errors.
* Tables of the Bessel Functions Y_0(z) and Y_1(z) for Complex Arguments (Computation
Laboratory, National Bureau of Standards; Columbia University Press, New York,
1950).
In the following table u_l denotes the successive u(x) values, obtained by
the step-by-step procedure, while u_g denotes the values obtained by the
global method. The correct values u(x) (taken from the NBS tables) are
listed under u*. The function u(x) corresponds to the negative real part
of Y_0(ix) of the tables (on p. 364). The same notations hold for the function
v(x) which corresponds to the negative imaginary part of Y_1(ix).

x      u_l        u_g        u*        -v_l       -v_g       -v*

1.0   0.268032   0.268032   0.268032   0.383186   0.383186   0.383186
1.2   0.202769   0.202779   0.202769   0.276670   0.276687   0.276670
1.4   0.155116   0.155117   0.155116   0.204250   0.204280   0.204250
1.6   0.119656   0.119642   0.119656   0.153192   0.153181   0.153192
1.8   0.093298   0.092887   0.092903   0.117695   0.116220   0.116261
2.0   0.072447   0.072502   0.072507   0.090559   0.089029   0.089041
2.2   0.056892   0.056831   0.056830   0.070781   0.068732   0.068689
2.4   0.044176   0.044693   0.044701   0.055069   0.053363   0.053301
2.6   0.034641   0.035237   0.035268   0.043698   0.041585   0.041561
2.8   0.026656   0.027850   0.027897   0.034486   0.032506   0.032539
3.0   0.020664   0.022071   0.022116   0.027903   0.025510   0.025564
3.2   0.015487   0.017543   0.017567   0.022502   0.020131   0.020144
3.4   0.011537   0.013947   0.013979   0.018755   0.015881   0.015915
3.6   0.007987   0.011133   0.011140   0.015695   0.012704   0.012602
3.8   0.005170   0.008850   0.008891   0.013724   0.010068   0.009999
4.0   0.002501   0.007020   0.007104   0.012201   0.007944   0.007947
4.2   0.000224   0.005580   0.005683   0.011446   0.006305   0.006327
4.4  -0.002078   0.004467   0.004551   0.011042   0.005118   0.005044
4.6  -0.004244   0.003574   0.003648   0.011241   0.004208   0.004027
4.8  -0.006663   0.002787   0.002927   0.011785   0.003315   0.003218
5.0  -0.008996   0.002265   0.002350   0.012917   0.002780   0.002575

We observe that the errors of u_l and v_l become gradually worse as we
approach the end of the range and eventually cease to constitute even a
rough approximation, although the trend of the successive values remains
smooth and a purely numerical examination would not lead us to suspect
that something had gone out of order. Actually, around the end of the range
even the sign of u_l(x) changes from positive to negative. Moreover, the
negative values would increase more and more if we had continued our
calculations for larger values of x, although the actual function goes to zero.
This is inevitable since the truncation errors invoke the exponentially
increasing solution with greater and greater force. We observe the same
phenomenon in -v_l(x) where a certain minimum is reached and then the
values increase again.
However, the global approximation behaves quite differently. We
observe the presence of small periodic errors from the beginning, but later a
secular error of exponentially increasing strength becomes manifest. This
seems puzzling since on the basis of the properties of the finite Fourier series
we have been expecting solely periodic errors. And yet even here the
unwanted exponentially increasing solution seems to make itself felt around
the end of the range.

The reason for this phenomenon is that at the start x = 1 we should make


allowance for the existence of a small periodic error. We have suppressed this
error by satisfying the given initial conditions exactly. The consequence is
that the undesired solution is excited with a small constant factor, thus
causing a gradually increasing secular error which is imposed on the small
periodic errors. There is, however, a fundamental difference between this
error and the error of the step-by-step procedure. In the latter case the
truncation errors provide a constant energy source which feeds the
exponentially increasing solution all the time. Hence the undesired solution
comes into play in constantly increasing strength, while in the global method
this strength remains constant. Moreover, this constant strength recedes
rapidly as the number of points increases. Had we used 15 instead of
10 points in the given interval, the secular error would have disappeared in
the first six decimal places.

BIBLIOGRAPHY
[1] Bennett, A. A., W. E. Milne, and H. Bateman, Numerical Integration of
Differential Equations (Dover Publications, New York, 1956)
[2] Collatz, L., The Numerical Treatment of Differential Equations (Springer,
Berlin, 1960)
[3] Fox, L., The Numerical Solution of Two-Point Problems in Ordinary Differ-
ential Equations (Clarendon Press, Oxford, 1957)
[4] Hildebrand, F. B., Introduction to Numerical Analysis (McGraw-Hill, 1956)
[5] Levy, H., and E. A. Baggot, Numerical Studies in Differential Equations
(Watts, London, 1934)
[6] Milne, W. E., Numerical Solution of Differential Equations (Wiley, New
York, 1953)
[7] Morris, M., and O. E. Brown, Differential Equations (Prentice-Hall, New
York, 1952)
APPENDIX

TABLE I: Smoothing of the Gibbs oscillations of a Fourier series; 9 decimal
place table of the sigma factors

$$\sigma_k = \frac{\sin (k\pi/n)}{k\pi/n}$$

for n = 2 to n = 20; cf. Chapter 2.14.

TABLE II: Double smoothing of the Gibbs oscillations of a Fourier series;


9 decimal place table of the square of the sigma factors, for n = 2 to n = 20.
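A few lines regenerate any block of Tables I and II from the σ_k formula restored above; for instance:

```python
import numpy as np

for n in (2, 7):                       # any of the tabulated n = 2, ..., 20
    for k in range(1, n):
        s = np.sin(k * np.pi / n) / (k * np.pi / n)
        print(f"n = {n:2d}  k = {k}  {s:.9f}  {s * s:.9f}")
# n = 2, k = 1 gives 0.636619772 and 0.405284735, as in the first block below
```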

k    σ_k         σ_k²          k    σ_k         σ_k²


n =2 n =2 n =7 n =7

1 0.636619772 0.405284735 1 0.966766385 0.934637243


2 0.871026416 0.758687017
n =3 n =3 3 0.724101450 0.524322909
4 0.543076087 0.294931636
1 0.826993343 0.683917990 5 0.348410566 0.121389923
2 0.413496672 0.170979497 6 0.161127731 0.025962146

n =4 n =4 n =8 n =8

1 0.900316316 0.810569469 1 0.974495358 0.949641203


2 0.636619772 0.405284735 2 0.900316316 0.810569469
3 0.300105439 0.090063274 3 0.784213304 0.614990506
4 0.636619772 0.405284735
n =5 n =5 5 0.470527982 0.221396582
6 0.300105439 0.090063274
1 0.935489284 0.875140200 7 0.139213623 0.019380433
2 0.756826729 0.572786697
3 0.504551153 0.254571865 n =9 n =9
4 0.233872321 0.054696263
1 0.979815536 0.960038485
n =6 n =6 2 0.920725429 0.847735315
3 0.826993343 0.683917990
1 0.954929659 0.911890653 4 0.705316598 0.497471504
2 0.826993343 0.683917990 5 0.564253279 0.318381763
3 0.636619772 0.405284735 6 0.413496672 0.170979497
4 0.413496672 0.170979497 7 0.263064408 0.069202883
5 0.190985932 0.036475626 8 0.122476942 0.015000601


k    σ_k         σ_k²          k    σ_k         σ_k²


n= 10 n = 10 n= 14 n= 14
1 0.983631643 0.967531210 1 0.991628584 0.983327250
2 0.935489284 0.875140200 2 0.966766385 0.934637243
3 0.858393691 0.736839729 3 0.926160517 0.857773303
4 0.756826729 0.572786697 4 0.871026416 0.758687017
5 0.636619772 0.405284735 5 0.803004434 0.644816121
6 0.504551153 0.254571865 6 0.724101450 0.524322909
7 0.367883011 0.135337909 7 0.636619772 0.405284735
8 0.233872321 0.054696263 8 0.543076087 0.294931636
9 0.109292405 0.011944830 9 0.446113574 0.199017321
10 0.348410566 0.121389923
n = 11 n= 11 11 0.252589232 0.063801320
12 0.161127731 0.025962146
1 0.986463589 0.973110413 13 0.076279122 0.005818504
2 0.946502244 0.895866498
3 0.882062724 0.778034649 n = 15 n= 15
4 0.796248357 0.634011446
5 0.693153891 0.480462317 1 0.992705200 0.985463613
6 0.577628242 0.333654387 2 0.971012209 0.942864710
7 0.454999061 0.207024145 3 0.935489284 0.875140200
8 0.330773521 0.109411122 4 0.887063793 0.786882173
9 0.210333832 0.044240321 5 0.826993343 0.683917990
10 0.098646084 0.009731050 6 0.756826729 0.572786697
7 0.678356039 0.460166915
n = 12 n= 12 8 0.593561534 0.352315294
9 0.504551153 0.254571865
1 0.988615929 0.977361456 10 0.413496672 0.170979497
2 0.954929659 0.911890653 11 0.322568652 0.104050535
3 0.900316316 0.810569469 12 0.233872321 0.054696263
4 0.826993343 0.683917990 13 0.149386494 0.022316325
5 0.737912976 0.544515560 14 0.070907514 0.005027876
6 0.636619772 0.405284735
7 0.527080697 0.277814061 n= 16 n = 16
8 0.413496672 0.170979497
9 0.300105439 0.090063274 1 0.993586851 0.987214831
10 0.190985932 0.036475626 2 0.974495358 0.949641203
11 0.089874175 0.008077367 3 0.943165321 0.889560823
4 0.900316316 0.810569469
n = 13 n= 13 5 0.846927993 0.717287025
6 0.784213304 0.614990506
1 0.990295044 0.980684275 7 0.713585488 0.509204249
2 0.961518870 0.924518537 8 0.636619772 0.405284735
3 0.914673491 0.836627595 9 0.555010935 0.308037138
4 0.851382677 0.724852462 10 0.470527982 0.221396582
5 0.773824776 0.598804783 11 0.384967270 0.148199799
6 0.684642939 0.468735954 12 0.300105439 0.090063274
7 0.586836804 0.344377435 13 0.217653536 0.047373062
8 0.483640485 0.233908118 14 0.139213623 0.019380433
9 0.378392301 0.143180734 15 0.066239124 0.004387621
10 0.274402047 0.075296483
11 0.174821613 0.030562596
12 0.082524587 0.006810307

k    σ_k         σ_k²          k    σ_k         σ_k²


n = 17 n = 17 n = 19 n = 19
1 0.994317898 0.988668081 1 0.995449621 0.990919947
2 0.977387746 0.955286806 2 0.981872985 0.964074559
3 0.949555184 0.901655048 3 0.959492150 0.920625185
4 0.911386930 0.830626137 4 0.928672398 0.862432423
5 0.863657027 0.745903461 5 0.889915138 0.791948953
6 0.807328089 0.651778643 6 0.843848160 0.712079718
7 0.743528055 0.552833969 7 0.791213480 0.626018771
8 0.673523068 0.453633324 8 0.732853009 0.537073533
9 0.598687172 0.358426330 9 0.669692359 0.448487856
10 0.520469639 0.270888645 10 0.602723123 0.363275163
11 0.440360776 0.193917613 11 0.532984007 0.284071952
12 0.359857095 0.129497129 12 0.461541197 0.213020277
13 0.280426748 0.078639161 13 0.389468382 0.151685620
14 0.203476111 0.041402528 14 0.317826835 0.101013897
15 0.130318366 0.016982876 15 0.247645973 0.061328528
16 0.062144869 0.003861985 16 0.179904778 0.032365729
17 0.115514469 0.013343593
n = 18 n = 18 18 0.055302757 0.003058395
1 0.994930770 0.989887237 n = 20 n = 20
2 0.979815536 0.960038485
3 0.954929659 0.911890653 1 0.995892735 0.991802340
4 0.920725429 0.847735315 2 0.983631643 0.967531210
5 0.877822270 0.770571938 3 0.963397762 0.928135248
6 0.826993343 0.683917990 4 0.935489284 0.875140200
7 0.769148875 0.591589991 5 0.900316316 0.810569469
8 0.705316598 0.497471504 6 0.858393691 0.736839729
9 0.636619772 0.405284735 7 0.810331958 0.656637882
10 0.564253279 0.318381763 8 0.756826728 0.572786697
11 0.489458375 0.239569501 9 0.698646585 0.488107151
12 0.413496672 0.170979497 10 0.636619772 0.405284735
13 0.337623950 0.113989932 11 0.571619933 0.326749348
14 0.263064408 0.069202883 12 0.504551152 0.254571865
15 0.190985932 0.036475626 13 0.436332593 0.190386132
16 0.122476942 0.015000601 14 0.367883011 0.135337909
17 0.058525340 0.003425215 15 0.300105439 0.090063274
16 0.233872321 0.054696263
17 0.170011370 0.028903866
18 0.109292405 0.011944830
19 0.052415407 0.002747375


TABLE III: The four transition functions from the exponential to the
periodic domain and vice versa; KWB method; cf. Chapter 7.23, 24.

x      φ_1e(x)    φ_2e(x)      x      φ_1p(x)    φ_2p(x)


0 1.258542 1.089930 0 1.215658 -0.325735
-0.1 1.166995 1.169574 0.1 1.191711 -0.237065
-0.2 1.079780 1.250403 0.2 1.166576 -0.148173
-0.3 0.988343 1.333746 0.3 1.139111 -0.058999
-0.4 0.903038 1.421106 0.4 1.108239 0.030336
-0.5 0.821332 1.514167 0.5 1.072938 0.119535
-0.6 0.743722 1.614818 0.6 1.032281 0.208122
-0.7 0.670563 1.725180 0.7 0.985441 0.295447
-0.8 0.602089 1.847645 0.8 0.931716 0.380690
-0.9 0.538425 1.984923 0.9 0.870551 0.462875
-1.0 0.479599 2.140103 1.0 0.801567 0.540885
-1.1 0.425564 2.316729 1.1 0.724587 0.613477
-1.2 0.376206 2.518894 1.2 0.639658 0.679316
-1.3 0.331359 2.751352 1.3 0.547075 0.737001
-1.4 0.290817 3.019658 1.4 0.447403 0.785106
-1.5 0.254345 3.330337 1.5 0.341493 0.822226
-1.6 0.221688 3.691089 1.6 0.230484 0.847022
-1.7 0.192576 4.111043 1.7 0.115812 0.858282
-1.8 0.166739 4.601059 1.8 - 0.000807 0.854970
-1.9 0.143903 5.174106 1.9 -0.117393 0.836294
-2.0 0.123803 5.845721 2.0 -0.231732 0.801758
-2.1 0.106179 6.634567 2.1 -0.341425 0.751220
-2.2 0.090786 7.563125 2.2 -0.443943 0.684944
-2.3 0.077392 8.658546 2.3 -0.536697 0.603639
-2.4 0.065780 9.953694 2.4 -0.617115 0.508492
-2.5 0.055747 11.488444 2.5 -0.682740 0.401183
-2.6 0.047109 13.311284 2.6 -0.731322 0.283883
-2.7 0.039698 15.482199 2.7 -0.760925 0.159239
-2.8 0.033359 18.070646 2.8 -0.770025 0.030326
-2.9 0.027956 21.167626 2.9 -0.757616 -0.099413
-3.0 0.023365 24.880518 3.0 -0.723292 -0.226254
INDEX

Absolute calculus, 449 Ballistic galvanometer, 253


Accumulation, truncation errors, 531 Green's function, 254
Action integral, 228, 230 Bar, elastic, 189, 224, 225, 230, 236
of Laplace operator, 228 Base vectors, 154, 157
Active variable, 222 Bateman, H., 550
Adams' method, 526 Bennett, A. A., 550
Adjoint equation, 207 Bernoulli, D., 163, 169, 439, 456
matter tensor, 480 Bernoulli numbers, 58
Adjoint operator, 183, 198, 199, 480 Bessel functions: 349
Hermitian, 182, 205 approximation, 394
includes boundary conditions, 181 asymptotic properties, 380
as transposed matrix, 179 first zero, 408
Adjoint system, 115 half order, 352
under-determined, 155 hypergeometric series, 378
Algebraic adjoint, 182 interpolation, 45
Algebraic approximation, closeness of, large imaginary values of x, 384
175 large real values of x, 383
Algebraic variables, eliminated, 229 order 1/3, 385
Amplitude: equation, exact, 422 recurrence relation, 46
variable, 372 Bessel's differential equation, 349
Analytical data, 494 generalised, 352
Analytical domain, 444 KWB solution, 376
Analytical function, 144 normal form, 369, 380, 394
Angular momentum, conserved, 485 Bi-harmonic equation, 497
Anti-Hermitian, 227 canonical form, 238
Approximation: algebraic, 175 Bi-harmonic operator, 228
KWB, 387 Bilinear expansion, 291
sine, cosine, 265 convergence, 295
Arbitrary system, n x m, 115 examples, 295
enlarged by transposed, 115 Bilinear function, 183
Argument, 335 Bilinear identity, 181
Aristophanes, xiii defines A, 152
Arithmetic mean: method, 71 Binomial coefficients, 10, 21
of limiting ordinates, 74, 78, 306 Binomial expansion, 11, 26
Associated polynomials, Legendre, 451 Black box, 330
Asterisk = approximate functional value, Blind spot, 123
18 Bohr, N., 349, 391
= conjugate complex, 152 Borel, E., 1
= exact functional value, 531 Boundary data, injudicious, 489
rule in Hermitian problems, 303 Boundary conditions, 9, 102, 171, 181,
Atomic oscillator, 392 425, 434
Atomisation of continuum, 168, 170 adjoint, 184
Auxiliary function, 154, 156 inhomogeneous, 217, 278, 735
Auxiliary vector, 155 natural, 172, 507
Axes: essential, 149 as part of operator, 180
ignored, 122 physical, 504

Boundary surface, 439 Complex variable, 439


Boundary term, 183, 427 Condensation point, see limit point
partial operators, 196 Condition C, 487, 490, 500
Bounded kernel, 475 Condition number, 133
Brillouin, L., 374 Configuration space, 515
Bush, V., 347 Conformal mapping, 440
Conjugate points, 446
Calculus, variations, 229, 425 Conservation laws of mechanics, 479
Canonical equations, 232 angular momentum, 485
bar problem, 236 energy, momentum, 484
non-selfadjoint systems, 234 Consonants, fidelity, 342
planetary motion, 237 Constants: of integration, 248
Canonical formulation, 230, 236, 426 variations of, 254, 367
vibrating string, 466, 501 Constrained Green's function, 272, 273,
Carr III, John W., xvi 274, 313
Carson, J. R., 347 Constrained systems, 136, 137
Catalogue of values, 168 Constraints, 472
Cauchy, A. L., 1, 164, 440 maintained by potential energy, 506
Cauchy problem, 434, 487, 501 Continuum, atomised, 168
Cauchy's inequality, 57, 92, 178 Contour, vibrating string, 462
Cauchy's integral theorem, 156 Convergence, 4, 295, 489, 492, 510
Cauchy-Riemann differential equations, Convergents, successive, 17
440, 442, 493 Cooling bar, 470, 493
Causality principle, 332 Cosine series, Fourier, 99
Cayley, A., 100, 101 Courant, R., xvii, 314, 380, 431
Cells, atomisation of continuum, 169 Critical damping, 324
Central blow, vibrating string, 459 Curl, 198, 199, 212
Central differences, 13 Curly D process, 75
operation y, 8, 14 Cut, 380
replaced by weighting, 15
table, 14 D'Alembert, J., 456
Centre of mass, energy, 485 Damping: critical, 328, 345
Characteristic equation, 105, 108, 533 fidelity, 329
Characteristic values, see eigenvalues ratio, 324
Chebyshev differential equation, 351 Data, 144
self-adjoint form, 364 measured, 494
Chebyshev polynomials, 351, 411 Decomposition: arbitrary matrix, 122
in global integration, 536 inverse operator, 489
Churchill, R. V., 99 operator, 488
Circular membrane, 361, 456 symmetric matrix, 112
Class of functions, 22, 488 Deficiency, 123, 126
Collatz, L., 458, 550 degree of, 119
Column vector, 103 eliminated, 140, 143
Communication Problems, 315-347 grad equation, 199
Compatibility, 141 Delta function, 70, 77, 80, 208, 211, 223,
of grad equation, 199 445
Compatibility conditions, 114, 119, 123, as point load, 241
194, 271 transformation of, 355
examples, 191 A operation, 27
Complete linear system, 135 Ax, normalised, 9
Complete ortho-normal system, 430 Dependent variable, transformed, 359
Completeness in activated space, 294, Derivatives as new variables, 514
435 Descartes, R., 166
Completeness relation, 509 Determinant, 108
Completion, linear operators, 308 Diagonal matrix, 107
Complex Fourier series, 308 positive, 121
Difference: calculus, 8 Equation: characteristic, 105, 108, 533
equations, 170 Poisson's, 140, 142
table, simple, 12 potential, 145, 202, 238, 439, 448
Differences, central, 13 Equation of motion, Newton's, 185
operations y, 8, 14 Equations of motion, Lagrangian, 513
table, 14 Equidistant interpolation, 8
replaced by weighting, 15 central differences, 14
Differencing, 27 of high order, 13
Differential equation, Equilibrium, as minimum of potential
Bessel, Legendre, etc., see under respec- energy, 228
tive names Equilibrium conditions: free bar, 190
elastic bar, 189, 224, 225, 230, 236 of mechanics, 481, 482
hypergeometric function, 350 Error: bounds, 12
loaded membrane, 360 estimation, 4
Differential operator: elliptic, 433 mean square, 55
hyperbolic, 464 Error behaviour: Fourier series, 545
as matrix, 170 polynomial interpolation, 13, 17
parabolic, 469 Error vector, 130
Dirac, P., 68, 70, 77, 80, 211, 223, 241 magnified by small eigenvalue, 131
Dirichlet's kernel, 69, 72, 81, 82 Essential axes of matrix, 149
Dirichlet's principle, 498 Euclidean space, 166
failure of, 498, 507 Euler, L., 101, 169, 348, 456
Discontinuity, n— 1st derivative, 251 Euler's constant, 23
Dissipation of energy, 333 Euler's integral, 383
Distance, Pythagorean, 166 Expansion: asymptotic, 86
Divergence, 102, 139, 198, 199, 212 left side function, 289
of matter tensor, 480 right side function, 288
Double orthogonal vector system, 117, 165 Expansions, orthogonal, 286
Dubois-Reymond, P., 433 examples, 291, 295, 296
Duff, G. F. D., xvii Extended Green's identity, 437
Extension of expandable functions, 83
Eigenspace, 123, 293 Extrapolation: from end-point, 626
always well-posed, 270, 488 by Gregory-Newton series, 25
completely spanned, 288 of maximum accuracy, 521
Eigensolutions, physical, 364 of minimum round-off, 521
Eigenvalue: equation, 105, 116, 117 Even-determined, 102
largest as maximum, 158
problem, 105, 107, 362, 390, 449 Factorial function, 383
shifted, 117, 287, 311, 510 Factorial, reciprocal of, 23
spectrum, 490 Fejer, L., 71
zero, 122, 489 Fejer's kernel, 72, 73
Eigenvalues, real, 108 Ferrar, W. L., 162
Eigenvector, 107 Feshbach, H., 314
Einstein, A., 482, 484, 486 Fidelity damping, 327
Elastic bar: loaded, 189, 224, 225, 230, 236 Fidelity of: galvanometer response, 325
Green's function, 254, 255, 256 noise reproduction, 346
Electronic computer, 146, 512 truncated Fourier series, 79
Element, infinitesimal, 3 Finite Fourier series, remainder, 53
Elimination of algebraic variables, 229 First boundary value problem, 447
Elliptic differential operators, 433 Fixed point, variable point, exchanged,
Emde, F., 352, 431 240
End condition, 215, 494 Flat top amplitude response, 344
Energy: as eigenvalue, 391 Focusing power, Fejer's kernel, 72
flux, 484 Forbidden: axes, 435
Energy-mass equivalence, 484 components, 151
Entire function, 378, 379 Force, maintaining constraints, 506
Fourier, J., 456 General solution, 115
Fourier functions, 363 Generalised Bessel equation, 419
Fourier series: complex form, 56 Generalised Laguerre polynomials, 20
convergence of, 52 Generating function, 201
in curve fitting, 98 Generating vector, 139
for differentiable functions, 50, 56 Geometrical language, 165
error bounds, 55, 57, 59 Gibbs oscillations, 60
increased convergence, 78 amplitude, phase, 63
weighted, 76, 84 smoothed by Fejer's method, 71
Fourier: cosine series, 45 smoothed by sigma factors, 78
sine series, 98, 363 Given right side, 221, 223
Fourth variable, relativity, 203 orthogonality to adjoint solution, 153
Fox, C., 314 Global correction, 546
Fox, L., 550 Global integration method, 533, 536
Franklin, Ph., 99 numerical example, 548
Fredholm, I., xi, 152, 166, 349, 478 Gradient, 142, 198, 212
Free bar, equilibrium, 190 Green, G., 201
Frequency: analysis, 334 Green's Function, 206-314
instantaneous, 372 bilinear expansion, 292
response, pulse response, 337 existence, 211
Friedman, B., xvii, 314 as function of active variable, 210
Frobenius, F. G., 115 as function of passive variable, 246
Function: analytical, 2 function of single variable, 323
of complex variable, 448 as inverse operator, 302
entire, 2, 378 non-existence, 215, 216, 490
Space, 163-205, 220, 222 reciprocity theorem, 240, 243, 294
as vector, 167, 169 symmetry, 241, 303
Function generated by: parabolic arcs, weighted, 355
319 Green's function of: Heaviside, 321
pulses, 245, 316 heat equation, 471
step functions, 317 ordinary differential equations, 247,
straight lines, 318 251
Functional equation, 9 examples, 251
for hypergeometric series, 29 of second order, 366
Fundamental: building block, 316 potential equation, 146, 446
decomposition theorem, 122 vibrating string, 458
polynomial, 5 Green's function, constrained, 270, 273,
squared, 264 293, 309, 310
examples, 285
Galvanometer error, 328 reciprocity theorem, 294
critical damping, 329 symmetry, 274
fidelity damping, 329 Green's function method in error estima-
Galvanometer: problem, 323-330 tion,
memory, 325 Fourier series, 66
response, 328 Lagrangian interpolation, 7, 261
Gamma function, 94 trigonometric interpolation, 91
logarithmic derivative, 27 Green's identity, 182, 208
Gauss, C. F., 1, 141, 411, 444, 534 extended, 183, 195, 437
Gauss' differential equation, 350 weighted, 352
Gauss' integral theorem, 196, 444 Green's vector, 223, 266
Gaussian quadrature, 534 reciprocity theorem, 242
remainder, 536 symmetry, 242
weight factors, 535 Gregory, J., 1
zeros, 416 Gregory-Newton series, 19, 23
General n x m system, solvability, 118 associated integral transform, 24
General: relativity, 486 interpolation of Jp(x), 47
Hadamard, J., 136, 239, 434, 487, 490, 500 Identity, bilinear, 152
Halmos, P. R., 205 Ignored axes, 122
Hamilton, W. R., 230, 232 Ill-conditioned matrix, 134, 547
Hamiltonian function, 428 Ill-posed problem, 136, 490, 498, 510
Hamilton's canonical equations, 232, 513 Ince, E. L., xvii
in bar problem, 225, 236 Incompatibility, 143
in planetary motion, 237 Incompatible system, 151
Hamiltonisation of partial operators, 237 Incomplete system, 136, 137, 139, 187
vibrating string, 467 completed, 153
Hankel, H., 380 solved by generating function, 201
Hankel functions, 404 Independent variable: as ordering prin-
Hard of hearing, 344 ciple, 168
Harmonic Analysis, 49-99 transformed, 354, 529
Harmonic vibration, 335 Inequality, Cauchy's, 57, 92, 178
Heat equation, 469 Inferential extrapolation, 341
Green's function, 471 Information insufficient, 132, 133
smoothing property, 470 Inherent (natural), boundary conditions,
Heaviside, O., 315, 316, 317, 321, 322 172, 431, 507
Hermite's differential equation, 351 Inhomogeneous boundary conditions, 217,
approximate solution, 376 278
eigenvalues, 394 homogenised, 185, 436
normal form, 369 Initial conditions, 455
self-adjoint form, 364 Initial position, 514
weight factor, 358 Input signal, 330
Hermitian adjoint, 182, 205 Integral equation, 331
Hermitian functions, 364, 394 first kind, 475
approximate solution, 392 kernel, 475
differential equation, 364, 392 second (Fredholm) kind, 478
Hermitian operator, first order, 307 singular, 475
Hermitian complete ortho-normal set, 307 Integral operator, 69
Hermitian polynomials, 351, 392 Integral theorem: Cauchy's, 144
Hermitian problems, 299-308 Gauss', 444
High-fidelity reproduction, 336 Integral transform: Fourier type, 35
High order polynomial in interpolation, associated with Gregory-Newton series,
533 24
Hilbert, D., xv, xvii, 166, 314, 349, 380, Stirling series, 44
431 Integration: by Chebyshev polynomials,
Homogeneous boundary conditions, 185, 537
436, 437, 509 by parts, 183, 196, 228, 229
Homogeneous solutions, 114, 118, 119, step-by-step, 516
148, 149, 289 Internal generator, 333
elimination of, 309, 311 Interpolation, 1-48
orthogonalisation of, 149 Interpolation, equidistant, 8
Householder, A. S., 162 by Gregory-Newton formula, 10
Hyperbolic differential operator, 215, 433, of high order, 13
464 related to trigonometric interpolation, 95
Hypergeometric function, 26 by Stirling formula, 14, 15
differential equation, 350 used for extrapolation, 517
Hypergeometric series, 21, 26 used for midpoint values, 527
confluent, 21 Interpolation, Lagrangian, 5
Hypergeometric series and: remainder, 8, 265
Bessel functions, 378 with double points, 263, 520
Hermite polynomials, 351 Interpolation: linear, quadratic, 11
Jacobi polynomials, 410 Inverse of matrix: left, 157
Laguerre polynomials, 21 natural, 124, 151
ultraspherical polynomials, 411 right, 157

Inverse operator, 302, 489 Langer, R. E., xv


Inversion of matrix, 147 Laplace (potential) equation, 145, 166,
Isomorphism, linear differential operator 348, 433
and matrix, 181 separated in polar coordinates, 449
Laplace operator, 198, 220, 439, 448, 469,
Jacobi polynomials, 350, 410 473, 478
weight factor, 409 general properties, 443
Jackson, D., 99 Laplace transform, 29, 30, 336
Jahnke, E., 352, 431 and Gregory-Newton series, 31
Jeffreys, H., xvii key-values, 32
Jeffreys, B. S., xvii Least action principle, 226, 513
Jordan, Ch., 48 Least squares, 126, 141
Left inverse, 157
Left vector, 222
Kellogg, O.D., 314 Legendre polynomials, 295, 350, 411
Kernel, 51, 91 Legendre's differential equation, 275, 350,
bounded, 475 450
Dirichlet's, 69, 72, 81, 82 Green's function, 278
Fejer's, 72, 73 second solution, 367
of integral equation, 475 Leibniz series, 306
Kramers, H. A., 374 Length of vector as invariant, 173
Kronecker, L., 115 Liberation method, 196, 229, 480
Kronecker's symbol, 6, 223 Limit: of infinite series, 3
KWB method, 374 point, 289, 434, 508
complemented for transitory range, 400, process, delta function, 273
405 of Stirling series, 96
estimated error, 375 differs from expected value, 97
in real form, 391 Limit method in boundary value problems,
KWB solution, 374 511
accurate, 376 Line integral, 214
of Bessel functions, 394 Linear algebraic system, 116, 171
of Hermite functions, 392 badly conditioned, 547
of Laguerre functions, 418 classification, 135
of Legendre polynomials, 412 error analysis, 129
of Neumann functions, 399 Linear communication devices, 330
Linear input function (galvanometer), 325
Lagrange, J., 5, 6, 101, 169, 456 Linear operators: completed, 308
Lagrangian function, 229, 230, 235-238, superposition principle, 244
425-430, 467, 513 Liouville, J., 348
Lagrangian interpolation, 5 Local: expansions, 515
by double points, 263, 520 indention, 473
Green's function, 261 smoothing, 76, 343
remainder, 6, 520 Lonseth, A., xv
Lagrangian multiplier, 229, 231, 235, 237,
426, 428, 431, 464 MacDuffee, C. C., 162
Lagrangian multiplier method, for bound- MacLachlan, N. W., 431
ary conditions, 505 Magnus, W., 431
Laguerre functions, 418 Margenau, H., xvii
Laguerre polynomials, 21, 351, 418 Mass-energy equivalence, 484
recurrence relation, 28, 29 Mathieu's differential equation, 350, 358
Laguerre's differential equation, 351 Matrix calculus, 100-162
eigenvalues, 421 Matrix: defective, 105
in normal form, 369 inequalities, 162
in self-adjoint form, 363 inverse, natural, 124
second solution, 367 as operator, 101
Lanczos, C., 2, 314, 449 rank, 119
Matrix: cont. Noise: band, 338
symmetric and orthogonal, 110 profile, 342
Matter tensor, 479 reproduction, 344
Maxwell, J. C., 241 Non-central blow (vibr. string), 460
Maxwell's equations, 203, 204 Non-selfadjoint equations:
Maxwell's reciprocity theorem, 241 variational principle, 233
Mechanisms, time-independent, 332 made self-adjoint, 234
Membrane, circular, 456 Non-zero eigenvalues, 112
vibration of, 361 Norm of function, 58, 175, 468
Memory (galvanometer), 325 Normal derivative, 145
time, 337, 338, 339 Normal form: diff. eqn. of second order,
Method of: Adams, 526 369
global integration, 536 trajectory problem, 514
least squares, 126, 141 Numerical example:
liberation, 196, 229, 480 accumulation of truncation errors, 549
Milne, 517 global integration, 542
Runge-Kutta, 517 Numerical Solution of Trajectory Prob-
separation of variables, 439 lems, 512-548
sigma factors, 75
undetermined multipliers, 204, 519 Oberhettinger, FM 431
variation of constants, 254, 367 Operation: y, §, 14
Milne, W. E., 48, 550 A, 27
Milne's method, 517 Operational space, 123
Milne-Thomson, L. M., 48 Operator: Laplacian, 198, 220, 439, 448,
Minimum: not obtained, 431, 474 469, 473, 478
as limit, 507 bi-harmonic, 228
of potential energy, 228 Operator, linear: adjoint, 179
Mixed boundary value problem, 453 complete, unconstrained, 510
Momentum: angular, 485 includes boundary conditions, 180
conservation, 484 excludes zero axes, 122, 489
defined kinematically, 485 inverse, 302, 489
flux, 484 completed, 309
Momenta as Lagrangian multipliers, 231 Operators grad, div, curl, 198
Monge, G., 433 Order of approximation, 519
Morse, P. M., 314 Orthogonal expansions, 286
Movable strip, 541, 546 examples, 291, 295, 296
Multiple eigenvalues, 288, 452 Orthogonal: transformations, 128
Multiplier, Lagrangian, 229, 231, 235, vector system, double, 117, 165
237, 426, 428, 431, 464 Orthogonalisation: successive, 149
Murphy, G. M., xvii to zero-field, 142
Orthogonality: conditions of compatibility,
n-dimensional space, trajectory, 515 114
n = inner normal, 145 of Fourier functions, 58
ν = outer normal, 196 Hermitian, 175
Natural (inherent), boundary conditions, weighted, 354
172, 431, 507 Ortho-normal system, 454
Natural inverse, 139, 151 complete, 430
Neumann, K., 380 Ortho-normal zero axes, 160, 161
Neumann functions, 399, 403 Oscillations, 371
first zero, 408 instantaneous frequency, 373
Neumann problem, 448 Oscillator, atomic, 392
Newton, I., 1 Output, 330
Newtonian potential, 145 Over-determined system, 102, 135, 144,
Newton's definition of momentum, 485 146, 281
Newton's law of motion, 185, 483 Over-under-determined reciprocity, 185
Nodal point, 458 Overtones, 340

Page, C. H., xvii Principal axes, 113


Paired axes, 118 as frame of reference, 173
Parabolic arc: as building block, 319 Principle of: least action, 226, 513
as input function, 325, 327 least squares, 126, 141
Parabolic differential equation, 469 minimum potential energy, 228
Parabolic differential operator, 433 stationary value, 227
Paradox: bi-harmonic equation, 497 relativity, 203, 482
hyperbolic equation, 467 Propagation of singularity, 464
interpolation theory, 93 Proper space, 123
matrix inversion, 131 Pulse: square, 77
negative eigenvalue, 430 triangular, 82
Parasitic spectrum, 490, 503 Pythagorean distance, 166
examples, 498
in Schrodinger eqn., 508 Quadratic form, positive definite, 429
variational motivation, 494 Quadratic input function (galvanometer),
Partial operators: Green's identity, 195 327
Hamiltonian form, 237 Quadratic surface, 107
method of liberation, 196 Quadratically integrable, 294, 468, 492
Particular solution, 128, 143 Quadrature: Gaussian, 534
Passive network, 334 second solution obtained by, 367
Passive variable, 222 Quantum conditions, 391
Periodic input function, 335
Peripheral conditions, 434 Radius of convergence, 2
Permissible components, 151 Bank of matrix, 135
Permissible functions, 488 Rayleigh, Lord, 349
Phase response, 341 Reciprocity, bilinear expansion, 294
importance of, 346 Reciprocity, over-under-determination,
Physical boundary conditions, 504 185
Physical intuition, 161, 459 Reciprocity theorem: Green's function,
Pipes, L. A., 347 241
Plucking of string, 461 Green's vector, 242
Point: in n-space, 166 Maxwell's, 241
load, 194, 241, 281 Recurrence relation: Bessel functions, 46
torque, 281 hypergeometric function, 29
Poisson's equation, 140, 142 Laguerre polynomials, 28
Polar coordinates, 440 Relativity, 203, 482
Polynomial approximation (step-by-step Remainder: Gaussian quadrature, 536
process), 516 Lagrangian interpolation, 8, 265
Polynomial, fundamental, 5 Taylor series, 3, 194, 256
squared, 264 Resolution power of ear, 344
Polynomial interpolation, 5, 263, 520 Restoring force, 323
of high order, 11, 13, 537 Riccati's differential equation, 370, 423
Polynomials, Jacobi, Legendre, etc.: see Riemann, B., 440
under respective names Riemann's zeta function, 48
Positive definite: Lagrangian, 429 Right inverse, 157
operator, 454 Right side, 119, 130, 432, 466
Positive eigenvalues, 118,121,287,292,454 Right vector, 223
Potential energy: minimum, 228 Robinson, G., 48
boundary condition maintained by, 505, Role of homogeneous systems, 188
506 Rotating bodies, 485
Potential equation, 202 Rotations: of basic axes, 173
canonical form, 238 in two spaces, 127
three dimensions, 448 Rounding errors, 517
two dimensions, 439 Row vector, 103
Potential function analytical, 479 Runge, C., 1, 13
Potential, Newtonian, 145 Runge-Kutta method, 517
Scalar potential, 204 Stirling, J., 14, 15
Scalar product of functions, 175 Stirling series, 14, 15
Schrodinger, E., 348, 391 approaches incorrect limit, 17
Schrodinger's wave equation, 391, 433, 508 examples, 41
Second boundary value problem, 448 of hypergeometric function, 33, 35
Green's function, 453 recurrence relations, 35, 37, 39
Second-order diff. equations, 349 Sturm, J. Ch. F., 348
in normal form, 369 Sturm-Liouville Problems, 348-431
Second-order operator, self-adjoint, 356 generated by variation, 425
by multiplication, 360 Subscript as discrete variable, 221
by weight factor, 357 Subspace, 129, 294
Self-adjoint boundary conditions, 358 Successive orthogonalisation of vectors,
Self-adjoint operator, 454 148
completed, 309 Superposition principle, linear operators,
Self-adjoint systems, 113, 225 244
Self-adjointness destroyed, by improper Superposition of special solutions, 515
boundary conditions, 186 Surplus data, 477
Semi-convergent, 16 compatibility conditions, 478
Separation of variables, 438 Surplus variables, 513
Sequence of operations, 197 Symbolic: equation, 488
Series: extrapolating, 3 notation, 10
finite, 3 Symmetric matrix, 479
Fourier, 50 in canonical equations, 232
Taylor, 2 determinant, 108
Servo-mechanisms, 330 principal axes, 106
Shifted eigenvalue problem, 117, 287, 311, spur, 109
510 Symmetry of Green's function, 274, 366
Sigma factors, 76 shown by bilinear expansion, 294
asymptotic relations, 84, 89 Synge, J. L., xv, 205
exact relations, 88 Synthetic division, 5
Sine integral, 55, 538 Systems: arbitrary n x m, 115
Sine series, 98, 363 classification, 134
Singular integral equations, 475 incomplete, 139, 187
Smallest eigenvalue: minimum property, over-determined, 141, 190
158 Szego, G., 431
non-existence of, 496
Smoothing, local, 79, 465 Tabulation, equidistant, 9
time, 343 infinite interval, 17
Sneddon, I. N., xvii, 99 maximum Ax, 47
Sommerfeld, A., xvii Taylor, B., 2
Space, atomised, 169 Taylor expansions, local, 518
Space and time, 482 Taylor series, 2
Spectrum: parasitic, 490 remainder, 3, 194, 256
regular, 492 Technique, movable strip, 541
Sphere (n-dimensional), 445 Telephone, fidelity, 341
Spur of matrix, 109 Ten conservation laws, 486
Square matrices, U and V, 118 Tensor calculus, 449
Square matrix, positive, 121 Tilde = transposed, 103
Square pulse, 77 Time atomised, 169
Square wave, 86 Time as fourth coordinate, 203, 482
Starting values, 529 Time-independent mechanisms, 332
Stationary value, 227 Time lag, galvanometer, 327, 328
Steady state analysis, 339 Tone quality, 461
Step function, 262 Total force, 190, 483
Step-by-step integration, 516 Total load, 209
Stiffness constant, 323 Total moment, 190, 485

Total momentum, 484 Variational problems, 227, 229, 454


Trajectory Problems, 512-548 in canonical form, 230, 237
Transfer function, 336 Vector, 101, 103
Transformation of: axes, 108 length, 173
independent variable, 354, 529 Vector analysis, field operations, 198
Transient analysis, 342 Vector potential, 204
Transition: exponential-periodic, 387 Vibrating membrane, 433
periodic-exponential, 388 Vibrating spring, 253
Transitory domain: 377 resisting medium, 254
increased accuracy, 405 Vibrating string, 433, 456
substitute functions, 400 canonical system, 501
tabulation, 404, 554 Vibration problems, 454
Transposed operator, 142 Vowels, fidelity, 342
Transposition rule, matrix, 104
Trapezoidal rule, 543 Wallis, J., 1
Triangular pulse, 82 Watson, G. N., xvii, 380, 431
Trigonometric functions, 49 Wave equation, 433
Trigonometric interpolation, 66, 91, 537, Webster, A. G., xvii
546 Webster, A. G., xvii
Truncated series, 62 Weierstrass, K., 470
Truncation errors, 517 Weight factor: for self-adjointness, 357
accumulation, 531 Jacobi polynomials, 409
estimation, 524 Weight factors: Gaussian quadrature, 535
global integration, 537, 541, 544
Weighted orthogonality, 354
Ultraspherical polynomials, 350, 410 Well-posed problem (Hadamard), 136,140,
weight factor, 411 142, 239, 434, 510
Unconstrained, 435 Wentzel, G., 374
Under-determined system, 102, 135 Whittaker, E. T., xvii, 48, 380, 431
Undetermined multiplier, 204, 519 Whittaker, J. M., 48
Unequal distribution, ordinates, 534 Wronskian, 366
Unit step function, 316
integrated, 318 Zero eigenvalue, 122, 489
Universal theory, boundary value prob- Zero fields, 123
lems, 434 activated, 309, 510
Unpaired axes, 118 Zeros: Gaussian, 416
well-distributed, 93, 537
Variation of constants, 254, 367 Zeta function, Riemann, 48
Variational calculus, 229, 425, 494 Zig-zag lines, 262
