Springer Undergraduate Mathematics Series
Viorel Barbu
Differential
Equations
Springer Undergraduate Mathematics Series
Advisory Board
M.A.J. Chaplain, University of St. Andrews, St. Andrews, Scotland, UK
A. MacIntyre, Queen Mary University of London, London, England, UK
S. Scott, King’s College London, London, England, UK
N. Snashall, University of Leicester, Leicester, England, UK
E. Süli, University of Oxford, Oxford, England, UK
M.R. Tehranchi, University of Cambridge, Cambridge, England, UK
J.F. Toland, University of Cambridge, Cambridge, England, UK
The Springer Undergraduate Mathematics Series (SUMS) is a series designed for
undergraduates in mathematics and the sciences worldwide. From core foundational
material to final year topics, SUMS books take a fresh and modern approach.
Textual explanations are supported by a wealth of examples, problems and
fully-worked solutions, with particular attention paid to universal areas of difficulty.
These practical and concise texts are designed for a one- or two-semester course but
the self-study approach makes them ideal for independent use.
More information about this series at http://www.springer.com/series/3423
Viorel Barbu
Differential Equations
Viorel Barbu
Department of Mathematics
Alexandru Ioan Cuza University
Iaşi
Romania
ISSN 1615-2085
ISSN 2197-4144 (electronic)
Springer Undergraduate Mathematics Series
ISBN 978-3-319-45260-9
ISBN 978-3-319-45261-6 (eBook)
DOI 10.1007/978-3-319-45261-6
Library of Congress Control Number: 2016954710
Mathematics Subject Classification (2010): 34A12, 34A30, 34A60, 34D20, 35F05
Translation from the Romanian language edition: Ecuaţii diferenţiale by Viorel Barbu © 1985 Junimea,
Iaşi, Romania. All Rights Reserved.
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The present book is devoted to the study of differential equations. It is well known
that to write a monograph or a textbook in a classical subject is a difficult enterprise
requiring a rigorous selection of topics and exposition techniques. Part of mathematical analysis, the theory of differential equations is a fundamental discipline that
carries a considerable weight in the professional development of mathematicians,
physicists and engineers and, to a lesser extent, that of biologists and economists.
Through its topics and investigative techniques, this discipline has a broad scope,
touching diverse areas such as topology, functional analysis, mechanics, mathematical and theoretical physics, and differential geometry.
Although this book is structured around the main problems and results of the
theory of ordinary differential equations, it contains several more recent results
which have had a significant impact on research in this area. In fact, even when
studying classical problems we have opted for techniques that highlight the functional methods and which are also applicable to evolution equations in
infinite-dimensional spaces, thus smoothing the way towards a deeper understanding of the modern methods in the theory of partial differential equations.
We wrote this work bearing in mind the fact that differential equations represent,
in truth, a branch of applied mathematics and that the vast majority of such
equations have their origin in the mathematical modelling of phenomena in nature
or society. We tried to offer the reader a large sample of such examples and
applications in the hope that they will stimulate his interest and will provide a
strong motivation for the study of this theory. Each chapter ends with a number of
exercises and problems, theoretical or applied and of varying difficulty.
Since this book is not a treatise, and since we wanted to keep it within a
reasonable size, we had to conscientiously omit several problems and subjects that
are typically found in classical texts devoted to ordinary differential equations such
as periodic systems of differential equations, Carathéodory solutions,
delay-differential equations, differential equations on manifolds, and Sturm–
Liouville problems. The willing reader can take up these topics at a later stage. In
fact, the list of references contains the titles of several monographs or college
textbooks that can compensate for these omissions and complement the present book.
In closing, I would like to thank my colleagues Dr. Nicolae Luca and
Dr. Gheorghe Morosanu, who read the manuscript and made pertinent comments
that I took into account.
Iaşi, Romania
December 1982
Viorel Barbu
Acknowledgements
This is the English translation of the Romanian edition of the book published with
Junimea Publishing House, Iaşi, 1985. The English translation was done by Prof.
Liviu Nicolaescu from Notre Dame University (USA). I want to thank him for this
and also for some improvements he included in this new version. Many thanks are
due also to Dr. Gabriela Marinoschi and Professor Cătălin Lefter, who corrected
errors and suggested several improvements.
Iaşi, Romania
May 2016
Contents
1 Introduction
   1.1 The Concept of a Differential Equation
   1.2 Elementary Methods of Solving Differential Equations
       1.2.1 Equations with Separable Variables
       1.2.2 Homogeneous Equations
       1.2.3 First-Order Linear Differential Equations
       1.2.4 Exact Differential Equations
       1.2.5 Riccati Equations
       1.2.6 Lagrange Equations
       1.2.7 Clairaut Equations
       1.2.8 Higher Order ODEs Explicitly Solvable
   1.3 Mathematical Models Described by Differential Equations
       1.3.1 Radioactive Decay
       1.3.2 Population Growth Models
       1.3.3 Epidemic Models
       1.3.4 The Harmonic Oscillator
       1.3.5 The Motion of a Particle in a Conservative Field
       1.3.6 The Schrödinger Equation
       1.3.7 Oscillatory Electrical Circuits
       1.3.8 Solitons
       1.3.9 Bipartite Biological Systems
       1.3.10 Chemical Reactions
   1.4 Integral Inequalities
2 Existence and Uniqueness for the Cauchy Problem
   2.1 Existence and Uniqueness for First-Order ODEs
   2.2 Existence and Uniqueness for Systems of First-Order ODEs
   2.3 Existence and Uniqueness for Higher Order ODEs
   2.4 Peano's Existence Theorem
   2.5 Global Existence and Uniqueness
   2.6 Continuous Dependence on Initial Conditions and Parameters
   2.7 Differential Inclusions
3 Systems of Linear Differential Equations
   3.1 Notation and Some General Results
   3.2 Homogeneous Systems of Linear Differential Equations
   3.3 Nonhomogeneous Systems of Linear Differential Equations
   3.4 Higher Order Linear Differential Equations
   3.5 Higher Order Linear Differential Equations with Constant Coefficients
       3.5.1 The Harmonic Oscillator
   3.6 Linear Differential Systems with Constant Coefficients
   3.7 Differentiability in Initial Data
   3.8 Distribution Solutions of Linear Differential Equations
4 Stability Theory
   4.1 The Concept of Stability
   4.2 Stability of Linear Differential Systems
   4.3 Stability of Perturbed Linear Systems
   4.4 The Lyapunov Function Technique
   4.5 Stability of Control Systems
   4.6 Stability of Dissipative Systems
5 Prime Integrals and First-Order Partial Differential Equations
   5.1 Prime Integrals of Autonomous Differential Systems
       5.1.1 Hamiltonian Systems
   5.2 Prime Integrals of Non-autonomous Differential Systems
   5.3 First-Order Quasilinear Partial Differential Equations
       5.3.1 The Cauchy Problem
   5.4 Conservation Laws
   5.5 Nonlinear Partial Differential Equations
   5.6 Hamilton–Jacobi Equations
Appendix
References
Index
Frequently Used Notation
• Z — the set of integers.
• Z_k — the set of integers ≥ k.
• R — the set of real numbers.
• (a, b) := {x ∈ R; a < x < b}.
• R_+ := (0, ∞), R̄_+ := [0, ∞).
• C — the set of complex numbers.
• i := √−1.
• (·, ·) := the Euclidean inner product on R^n,
  (x, y) := Σ_{j=1}^n xj yj, x = (x1, . . . , xn), y = (y1, . . . , yn).
• ‖·‖e := the Euclidean norm on R^n,
  ‖x‖e = √(x, x) = √(x1² + ⋯ + xn²), x = (x1, . . . , xn) ∈ R^n.
• 2^S — the collection of all subsets of the set S.
Chapter 1
Introduction
This chapter is devoted to the concept of a solution to the Cauchy problem for various
classes of differential equations and to the description of several classical differential
equations that can be explicitly solved. Additionally, a substantial part is devoted to
the discussion of some physical problems that lead to differential equations. Much
of the material is standard and can be found in many books; notably [1, 4, 6, 9].
1.1 The Concept of a Differential Equation
Loosely speaking, a differential equation is an equation which describes a relationship between an unknown function, depending on one or several variables, and its
derivatives up to a certain order. The highest order of the derivatives of the unknown
function that are involved in this equation is called the order of the differential equation. If the unknown function depends on several variables, then the equation is
called a partial differential equation, or PDE. If the unknown function depends on a
single variable, the equation is called an ordinary differential equation, or ODE. In
this book, we will be mostly interested in ODEs. We investigate first-order PDEs in
Chap. 5.
A first-order ODE has the general form
F(t, x, x′) = 0,    (1.1)

where t is the argument of the unknown function x = x(t), x′(t) = dx/dt is its derivative, and F is a real-valued function defined on a domain of the space R³.
We define a solution of (1.1) on the interval I = (a, b) of the real axis to be a continuously differentiable function x : I → R that satisfies Eq. (1.1) on I, that is,

F(t, x(t), x′(t)) = 0, ∀t ∈ I.
When I is an interval of the form [a, b], [a, b) or (a, b], the concept of a solution on
I is defined similarly.
In certain situations, the implicit function theorem allows us to reduce (1.1) to an
equation of the form
x′ = f(t, x),    (1.2)
where f : Ω → R, with Ω an open subset of R2 . In the sequel, we will investigate
exclusively equations in the form (1.2), henceforth referred to as normal form.
From a geometric viewpoint, a solution of (1.1) is a curve in the (t, x)-plane,
having at each point a tangent line that varies continuously with the point. Such a
curve is called an integral curve of Eq. (1.1). In general, the set of solutions of (1.1)
is infinite and we will (loosely) call it the general solution of (1.1). We can specify
a solution of (1.1) by imposing certain conditions. The most frequently used is the
initial condition or the Cauchy condition
x(t0 ) = x0 ,
(1.3)
where t0 ∈ I and x0 ∈ R are a priori given and are called initial values.
The Cauchy problem associated with (1.1) asks to find a solution x = x(t) of (1.1)
satisfying the initial condition (1.3). Geometrically, the Cauchy problem amounts to
finding an integral curve of (1.1) that passes through a given point (t0 , x0 ) ∈ R2 .
Phenomena in nature or society usually have a pronounced dynamical character.
They are, in fact, processes that evolve in time according to their own laws. The
movement of a body along a trajectory, a chemical reaction, an electrical circuit,
biological or social groups are the simplest examples. The investigation of such a
process amounts to following the evolution of a finite number of parameters that
characterize the corresponding process or system.
Mathematically, such a group of parameters describes the state of the system or
the process and represents a group of functions that depend on time. For example,
the movement of a point in space is completely determined by its coordinates x(t) =
(x1 (t), x2 (t), x3 (t)). These parameters characterize the state of this system.
The state of a biological population is naturally characterized by the number of
individuals it is made of. The state of a chemical reaction could be, depending on
the context, the temperature or the concentration of one or several of the substances
that participate in the reaction.
The state is rarely described by an explicit function of time. In most cases, the
state of a system is a solution of an equation governing the corresponding phenomenon according to its own laws. Modelling a dynamical phenomenon boils down
to discovering this equation, which very often is a differential equation. The initial
condition (1.3) signifies that the state of the system at time t0 is prescribed. This leads
us to expect that this initial condition uniquely determines the solution of equation
(1.1) or (1.2). In other words, given the pair (t0 , x0 ) ∈ I × R we expect that (1.2)
has only one solution satisfying (1.3). This corresponds to the deterministic point of
view in the natural sciences according to which the evolution of a process is uniquely
determined by its initial state. We will see that under sufficiently mild assumptions on
the function f this is indeed true. The precise statement and the proof of this result,
known as the existence and uniqueness theorem, will be given in the next chapter.
The above discussion extends naturally to first-order differential systems of the
form
xi′ = fi(t, x1, . . . , xn), i = 1, . . . , n, t ∈ I,    (1.4)
where f 1 , . . . , f n are functions defined on an open subset of Rn+1 . By a solution
of system (1.4) we understand a collection of continuously differentiable functions
{x1 (t), . . . , xn (t)} on the interval I ⊂ R that satisfy (1.4) on this interval, that is,
xi′(t) = fi(t, x1(t), . . . , xn(t)), i = 1, . . . , n, t ∈ I.    (1.5)

To (1.5) we add the initial conditions

xi(t0) = xi^0, i = 1, . . . , n,    (1.6)
where t0 ∈ I and (x10 , . . . , xn0 ) is a given point in Rn . Just as in the scalar case, we
will refer to (1.5)–(1.6) as the Cauchy problem associated with (1.4), named after
the French mathematician A.L. Cauchy (1789–1857), who first defined this problem
and proved the existence and uniqueness of its solutions.
From a geometric point of view, a solution of system (1.4) is a curve in the space R^n. In many situations or phenomena modelled by differential systems of type (1.4), the collection (x1(t), . . . , xn(t)) represents the coordinates of the state of a system at time t, and thus the trajectory t ↦ (x1(t), . . . , xn(t)) describes the evolution of that particular system. For this reason, the solutions of a differential system are often called the trajectories of the system.
Consider now ordinary differential equations of order n, that is, having the form
F(t, x, x′, . . . , x^(n)) = 0,    (1.7)

where F is a given function. Assuming it is possible to solve for x^(n), we can reduce the above equation to its normal form

x^(n) = f(t, x, . . . , x^(n−1)).    (1.8)

By a solution of (1.8) on the interval I we understand a function of class C^n on I (that is, a function n-times differentiable on I with continuous derivatives up to order n) that satisfies (1.8) at every t ∈ I. The Cauchy problem associated with (1.8) asks to find a solution of (1.8) that satisfies the conditions

x(t0) = x0^0, x′(t0) = x1^0, . . . , x^(n−1)(t0) = x_{n−1}^0,    (1.9)

where t0 ∈ I and x0^0, x1^0, . . . , x_{n−1}^0 are given.
Via a simple transformation we can reduce Eq. (1.8) to a system of type (1.4).
To this end, we introduce the new unknown functions x1 , . . . , xn using the unknown
function x by setting
x1 := x, x2 := x ′ , . . . , xn := x (n−1) .
(1.10)
In this notation, Eq. (1.8) becomes the differential system
x1′ = x2
x2′ = x3
. . .
xn′ = f(t, x1, . . . , xn).    (1.11)
Conversely, any solution of (1.11) defines via (1.10) a solution of (1.8). The change
of variables (1.10) transforms the initial conditions (1.9) into

xi(t0) = x_{i−1}^0, i = 1, . . . , n.
The above procedure can also be used to transform differential systems of order n
(that is, differential systems containing derivatives up to order n) into differential
systems of order 1. There are more general ordinary differential equations where the
unknown functions appears in the equation with different arguments. This is the case
for differential equations of the form
x′(t) = f(t, x(t), x(t − h)), t ∈ I,    (1.12)

where h is a positive constant. Such an equation is called a delay-differential equation.
It models certain phenomena that have “memory”, in other words, physical processes
in which the present state is determined by the state at a certain period of time in
the past. It is a proven fact that many phenomena in nature and society fall into
this category. For example in physics, the hysteresis and visco-elasticity phenomena
display such behaviors. It is also true that phenomena displaying “memory” are
sometimes described by more complicated functional equations, such as the integro-differential equations of Volterra type:

x′(t) = f0(t, x(t)) + ∫_a^t K(t, s, x(s)) ds,    (1.13)
named after the mathematician V. Volterra (1860–1940) who introduced and studied
them for the first time. The integral term in (1.13) incorporates the “history” of the
phenomenon. If we take into account that the derivative dx/dt at t is computed using
the values of x at t and at infinitesimally close moments t − ε, we can even say that
the differential equation (1.1) or (1.2) “has memory”, only in this case we are talking
of “short-term-memory” phenomena.
There exist processes or phenomena whose states cannot be determined by finitely
many functions of time. For example, the concentration of a substance in a chemical
reaction or the amplitude of a vibrating elastic system are functions that depend both
on time and on the location in the space where the process is taking place. Such
processes are called distributed and are typically modelled by partial differential
equations.
1.2 Elementary Methods of Solving Differential Equations
The first differential equation was solved as soon as differential and integral calculus
was invented in the 17th century by I. Newton (1642–1727) and G.W. Leibniz (1646–
1716). We are speaking, of course, of the problem
x ′ (t) = f (t), t ∈ I,
(1.14)
where f is a continuous function. As we know, its solutions are given by
x(t) = x0 + ∫_{t0}^t f(s) ds, t ∈ I.
The 18th century and a large part of the 19th century were dominated by the
efforts of mathematicians such as J. Bernoulli (1667–1748), L. Euler (1707–1783),
J. Lagrange (1736–1813) and many others to construct explicit solutions of differential equations. In other words, they tried to express the general solution of a
differential equation as elementary functions or as primitives (antiderivatives) of
such functions. Without trying to force an analogy, these efforts can be compared
to those of algebraists who were trying to solve higher-degree polynomial equations by radicals. Whereas in algebra E. Galois (1811–1832) completely solved the
problem of solvability by radicals of algebraic equations, in analysis the problem
of explicit solvability of differential equations has lost its importance and interest
due to the introduction of qualitative and numerical methods of investigating differential equations. Even when we are interested only in approximating solutions of a
differential equation, or understanding their qualitative features, having an explicit
formula, whenever possible, is always welcome.
1.2.1 Equations with Separable Variables
These are equations of the form
dx/dt = f(t)g(x), t ∈ I = (a, b),    (1.15)
where f is a continuous function on (a, b) and g is a continuous function on a, possibly unbounded, interval (x1, x2), with g ≠ 0. Note that we can rewrite equation
(1.15) as
dx/g(x) = f(t) dt.
Integrating from t0 to t, where t0 is an arbitrary point in I , we deduce that
∫_{x0}^{x(t)} dτ/g(τ) = ∫_{t0}^t f(s) ds.    (1.16)

We set

G(x) := ∫_{x0}^x dτ/g(τ).    (1.17)
The function G is obviously continuous and monotone on the interval (x1 , x2 ). It is
thus invertible and its inverse has the same properties. We can rewrite (1.16) as
x(t) = G⁻¹( ∫_{t0}^t f(s) ds ), t ∈ I.    (1.18)
We have thus obtained a formula describing the solution of (1.15) satisfying the
Cauchy condition x(t0 ) = x0 . Conversely, the function x given by equality (1.18) is
continuously differentiable on I and its derivative satisfies
x′(t) = f(t)/G′(x(t)) = f(t) g(x(t)).
In other words, x is a solution of (1.15). Of course, x(t) is only defined for those values of t such that ∫_{t0}^t f(s) ds lies in the range of G.
By way of illustration, consider the ODE

x′ = (2 − x) tan t,  t ∈ (0, π/2),

where x ∈ (−∞, 2) ∪ (2, ∞). Arguing as in the general case, we rewrite this equation in the form

dx/(2 − x) = tan t dt.

Integrating,

∫_{x0}^x dθ/(2 − θ) = ∫_{t0}^t tan s ds,  t0, t ∈ (0, π/2), x(t0) = x0,

we deduce that

ln( |x(t) − 2|/|x0 − 2| ) = ln( |cos t|/|cos t0| ).

If we set C := (x0 − 2)/cos t0, we deduce that the general solution is given by

x(t) = C cos t + 2,  t ∈ (0, π/2),

where C is an arbitrary constant.
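A minimal sketch, assuming the SymPy library is available, cross-checks this computation symbolically:

    # Sketch (assumes SymPy): symbolic check of the separable example above.
    import sympy as sp

    t = sp.symbols("t")
    x = sp.Function("x")

    ode = sp.Eq(x(t).diff(t), (2 - x(t)) * sp.tan(t))
    print(sp.dsolve(ode))  # Eq(x(t), C1*cos(t) + 2), i.e. x(t) = C cos t + 2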
1.2.2 Homogeneous Equations
Consider the differential equation
x ′ = h(x/t),
(1.19)
where h is a continuous function defined on an interval (h1, h2). We will assume that h(r) ≠ r for any r ∈ (h1, h2). Equation (1.19) is called a homogeneous differential
equation. It can be solved by introducing a new unknown function u defined by the
equality x = tu. The new function u satisfies the separable differential equation
tu ′ = h(u) − u,
which can be solved by the method described in Sect. 1.2.1. We have to mention
that many first-order ODEs can be reduced by simple substitutions to separable or
homogeneous differential equations. Consider, for example, the differential equation
x′ = (at + bx + c)/(a1t + b1x + c1),
where a, b, c and a1 , b1 , c1 are constants. This equation can be reduced to a homogeneous equation of the form
dy/ds = (as + by)/(a1s + b1y)
by making the change of variables
s := t − t0 , y := x − x0 ,
where (t0 , x0 ) is a solution of the linear algebraic system
at0 + bx0 + c = a1t0 + b1x0 + c1 = 0.
1.2.3 First-Order Linear Differential Equations
Consider the differential equation
x ′ = a(t)x + b(t),
(1.20)
where a and b are continuous functions on the, possibly unbounded, interval (t1 , t2 ).
To solve (1.20), we multiply both sides of this equation by
exp(−∫_{t0}^t a(s) ds),
where t0 is some point in (t1 , t2 ). We obtain
(d/dt)( exp(−∫_{t0}^t a(s) ds) x(t) ) = b(t) exp(−∫_{t0}^t a(s) ds).
Hence, the general solution of (1.20) is given by
x(t) = exp( ∫_{t0}^t a(s) ds ) ( x0 + ∫_{t0}^t b(s) exp(−∫_{t0}^s a(τ) dτ) ds ),    (1.21)
where x0 is an arbitrary real number. Conversely, differentiating (1.21), we deduce
that the function x defined by this equality is the solution of (1.20) satisfying the
Cauchy condition x(t0 ) = x0 .
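To illustrate formula (1.21) concretely, here is a minimal sketch (assuming SymPy; the data a(t) = 1, b(t) = t, t0 = 0, x0 = 1 are a hypothetical choice):

    # Sketch (assumes SymPy): formula (1.21) with a(t) = 1, b(t) = t, t0 = 0, x0 = 1.
    import sympy as sp

    t, s = sp.symbols("t s")
    x0 = 1
    x = sp.exp(t) * (x0 + sp.integrate(s * sp.exp(-s), (s, 0, t)))
    x = sp.simplify(x)

    print(x)                                 # 2*exp(t) - t - 1
    print(sp.simplify(x.diff(t) - (x + t)))  # 0, so x' = a(t)x + b(t) holds
    print(x.subs(t, 0))                      # 1, the Cauchy condition x(0) = x0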
Consider now the differential equation
x′ = a(t)x + b(t)x^α,    (1.22)

where α is a real number not equal to 0 or 1. Equation (1.22) is called a Bernoulli type equation and can be reduced to a linear equation using the substitution y = x^(1−α).
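The reduction can be verified mechanically; a minimal sketch, assuming SymPy, with a, b and α kept symbolic:

    # Sketch (assumes SymPy): y = x**(1 - alpha) turns (1.22) into a linear equation.
    import sympy as sp

    t, alpha = sp.symbols("t alpha")
    a, b, x = sp.Function("a"), sp.Function("b"), sp.Function("x")

    y = x(t) ** (1 - alpha)
    # substitute x' = a(t)x + b(t)x**alpha into y' = (1 - alpha) x**(-alpha) x':
    dy = y.diff(t).subs(x(t).diff(t), a(t) * x(t) + b(t) * x(t) ** alpha)

    # y satisfies the linear equation y' = (1 - alpha)(a(t) y + b(t)):
    print(sp.simplify(sp.expand(dy - (1 - alpha) * (a(t) * y + b(t)))))  # -> 0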
1.2.4 Exact Differential Equations
Consider the equation
x′ = g(t, x)/h(t, x),    (1.23)
where g and h are continuous functions defined on an open set Ω ⊂ R². We assume additionally that h ≠ 0 in Ω and that the expression h dx − g dt is an exact differential. This means that there exists a differentiable function F ∈ C¹(Ω) such that dF = h dx − g dt, that is,
(∂F/∂x)(t, x) = h(t, x),  (∂F/∂t)(t, x) = −g(t, x), ∀(t, x) ∈ Ω.    (1.24)
Equation (1.23) becomes
(d/dt) F(t, x(t)) = 0.

Hence every solution x of (1.23) satisfies the equality

F(t, x(t)) = C,
(1.25)
where C is an arbitrary constant. Conversely, for any constant C, equality (1.25) defines via the implicit function theorem (recall that ∂F/∂x = h ≠ 0 in Ω) a unique function x = x(t), defined on some interval (t1, t2), which is a solution of (1.23).
1.2.5 Riccati Equations
Named after J. Riccati (1676–1754), these equations have the general form
x′ = a(t)x + b(t)x² + c(t), t ∈ I,
(1.26)
where a, b, c are continuous functions on the interval I . In general, Eq. (1.26) is not
explicitly solvable but it enjoys several interesting properties which we will dwell
upon later. Here we only want to mention that if we know a particular solution ϕ(t)
of (1.26), then, using the substitution y = x − ϕ, we can reduce Eq. (1.26) to a
Bernoulli type equation in y. We leave to the reader the task of verifying this fact.
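For readers who want to see the verification carried out, a minimal sketch (assuming SymPy) performs the substitution symbolically:

    # Sketch (assumes SymPy): if phi solves (1.26), then y = x - phi solves a
    # Bernoulli equation with exponent 2.
    import sympy as sp

    t = sp.symbols("t")
    a, b, c, phi, y = (sp.Function(n) for n in ("a", "b", "c", "phi", "y"))

    x = phi(t) + y(t)
    res = x.diff(t) - a(t) * x - b(t) * x**2 - c(t)
    # phi is a particular solution, so phi' = a*phi + b*phi**2 + c:
    res = res.subs(phi(t).diff(t), a(t) * phi(t) + b(t) * phi(t) ** 2 + c(t))

    print(sp.collect(sp.expand(res), y(t)))
    # -> y' - (a + 2*b*phi)*y - b*y**2, a Bernoulli equation in y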
1.2.6 Lagrange Equations
These are equations of the form
x = tϕ(x ′ ) + ψ(x ′ ),
(1.27)
where ϕ and ψ are two continuously differentiable functions defined on a certain interval of the real axis such that ϕ(p) ≠ p, ∀p. Assuming that x is a solution of
(1.27) on the interval I ⊂ R, we deduce after differentiating that
x′ = ϕ(x′) + tϕ′(x′)x′′ + ψ′(x′)x′′,    (1.28)

where x′′ = d²x/dt². We denote by p the function x′ and we observe that (1.28) implies that
dt/dp = ( ϕ′(p)/(p − ϕ(p)) ) t + ψ′(p)/(p − ϕ(p)).    (1.29)
We can interpret (1.29) as a linear ODE with unknown t, viewed as a function of p.
Solving this equation by using formula (1.21), we obtain for t an expression of the
form
t = A( p, C),
(1.30)
where C is an arbitrary constant. Using this in (1.27), we deduce that
x = A( p, C)ϕ( p) + ψ( p).
(1.31)
If we interpret p as a parameter, equalities (1.30) and (1.31) define a parametrization
of the curve in the (t, x)-plane described by the graph of the function x. In other
words, the above method leads to a parametric representation of the solution of (1.27).
1.2.7 Clairaut Equations
Named after A.C. Clairaut (1713–1765), these equations correspond to the degenerate case ϕ( p) ≡ p of (1.27) and they have the form
x = t x ′ + ψ(x ′ ).
(1.32)
Differentiating the above equality, we deduce that
x ′ = t x ′′ + x ′ + ψ ′ (x ′ )x ′′
and thus
x′′ ( t + ψ′(x′) ) = 0.
(1.33)
We distinguish two types of solutions. The first type is defined by the equation x ′′ = 0.
Hence
x = C1 t + C2 ,
(1.34)
where C1 and C2 are arbitrary constants. Using (1.34) in (1.32), we see that C1 and
C2 are not independent but are related by the equality
C2 = ψ(C1 ).
Therefore,
x = C1 t + ψ(C1 ),
(1.35)
where C1 is an arbitrary constant. This is the general solution of the Clairaut equation.
A second type of solution is obtained from (1.33),
t + ψ ′ (x ′ ) = 0.
(1.36)
Proceeding as in the case of Lagrange equations, we set p := x ′ and we obtain from
(1.36) and (1.32) the parametric equations
t = −ψ ′ ( p),
x = −ψ ′ ( p) p + ψ( p)
(1.37)
that describe a function called the singular solution of the Clairaut equation (1.32). It
is not difficult to see that the solution (1.37) does not belong to the family of solutions
(1.35). Geometrically, the curve defined by (1.37) is the envelope of the family of
lines described by (1.35).
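A minimal sketch (assuming SymPy, with the hypothetical choice ψ(p) = p²) computes the singular solution from (1.37) and confirms that it coincides with the envelope of the lines (1.35):

    # Sketch (assumes SymPy; psi(p) = p**2 is a hypothetical choice): the
    # parametric singular solution (1.37) versus the envelope of the lines (1.35).
    import sympy as sp

    t, p, C = sp.symbols("t p C")
    psi = p**2

    t_par = -sp.diff(psi, p)                   # t = -psi'(p) = -2p
    x_par = -sp.diff(psi, p) * p + psi         # x = -psi'(p)p + psi(p) = -p**2

    p_of_t = sp.solve(t_par - t, p)[0]         # p = -t/2
    print(sp.simplify(x_par.subs(p, p_of_t)))  # -t**2/4, the singular solution

    # envelope of x = C t + C**2: eliminate C using d/dC (C t + C**2) = 0:
    C_env = sp.solve(sp.diff(C * t + C**2, C), C)[0]  # C = -t/2
    print(sp.expand((C * t + C**2).subs(C, C_env)))   # -t**2/4 again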
1.2.8 Higher Order ODEs Explicitly Solvable
We discuss here several classes of higher order ODEs that can be reduced to lower
order ODEs using elementary changes of variables.
One first example is supplied by equations of the form
F(t, x^(k), x^(k+1), . . . , x^(n)) = 0,    (1.38)

where 0 < k < n. Using the substitution y := x^(k), we reduce (1.38) to

F(t, y, . . . , y^(n−k)) = 0.
If we can determine y from the above equation, the unknown x can be obtained from
the equation
x^(k) = y    (1.39)

via repeated integration, and we obtain

x(t) = Σ_{j=0}^{k−1} Cj (t − a)^j/j! + (1/(k−1)!) ∫_a^t (t − s)^(k−1) y(s) ds.    (1.40)
Consider now ODEs of the form
F(x, x ′ , . . . , x (n) ) = 0.
(1.41)
We set p := x ′ and we think of p as our new unknown function depending on the
independent variable x. We now have the obvious equalities
x′′ = dp/dt = (dp/dx) p,  x′′′ = (d/dx)( (dp/dx) p ) · p.
In general, x^(k) can be expressed as a nonlinear function of p, dp/dx, . . . , d^(k−1)p/dx^(k−1). When
we replace x (k) by this expression in (1.41), we obtain an ODE of order (n − 1) in
p and x. In particular, the second-order ODE
F(x, x ′ , x ′′ ) = 0
(1.42)
reduces via the above substitution to a first-order equation
F(x, p, ṗ) = 0,  ṗ := dp/dx.
For example, the Van der Pol equation (B. Van der Pol (1889–1959))

x′′ + (x² − 1)x′ + x = 0,

reduces to the first-order ODE

ṗ + x p⁻¹ = 1 − x².
1.3 Mathematical Models Described by Differential Equations
In this section, we will present certain classical or more recent examples of ODEs or
systems of ODEs that model certain physical phenomena. Naturally, we are speaking
of dynamical models. Many more examples can be found in the book Braun (1978)
from our list of references.
1.3.1 Radioactive Decay
It has been experimentally verified that the rate of radioactive decay is proportional
to the number of atoms in the decaying radioactive substance. Thus, if x(t) denotes
the quantity of radioactive substance that is available at time t, then the rate of decay
x ′ (t) is proportional to x(t), that is,
− x ′ (t) = αx(t),
(1.43)
where α is a positive constant that depends on the nature of the radioactive substance.
In other words, x(t) satisfies a linear equation of type (1.20) and thus
x(t) = x0 exp(−α(t − t0)), t ∈ R.
(1.44)
Usually, the rate of decay is measured by the so-called “half-life”, that is, the time it
takes for the substance to radiate half of its mass. Equality (1.44) implies easily that
the half-life, denoted by T , is given by the formula
T = (ln 2)/α.
In particular, the well-known radiocarbon dating method is based on this equation.
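A small numerical sketch of the dating method (the 5730-year half-life of carbon-14 is the commonly quoted value; the 60% retention level is a hypothetical reading):

    # Sketch: radiocarbon dating from (1.44) and T = ln 2 / alpha.
    import math

    T = 5730.0               # half-life of carbon-14, in years
    alpha = math.log(2) / T  # decay constant

    # a sample retaining 60% of its original carbon-14 has age t with
    # exp(-alpha*t) = 0.6:
    t = -math.log(0.6) / alpha
    print(round(t))          # about 4223 years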
1.3.2 Population Growth Models
Let p(t) be the population of a certain species at time t and let d(t, p) be the difference between the birth rate and the mortality rate. Assume that the population is isolated, that is, there is no emigration or immigration. Then the rate of growth of the population is proportional to d(t, p). A simplified population growth model assumes that
d(t, p) is proportional to the population p. In other words, p satisfies the differential
equation
p′ = αp, α ≡ constant.    (1.45)
The solution of (1.45) is, therefore,
p(t) = p0 e^(α(t−t0)).    (1.46)
This leads to the Malthusian law of population growth.
A more realistic model was proposed by the Belgian mathematician P. Verhulst (1804–1849) in 1837. In Verhulst's model, the difference d(t, p) is assumed to be αp − βp², where β is a positive constant, much smaller than α. This nonlinear growth model, which takes into account the interactions between the individuals of the species and, more precisely, the inhibitive effect of crowding, leads to the ODE
p′ = αp − βp².    (1.47)
It is interesting that the above equation also models the spread of technological
innovations.
Equation (1.47) is a separable ODE and, following the general strategy for dealing
with such equations, we obtain the solution
p(t) = αp0 / ( βp0 + (α − βp0) exp(−α(t − t0)) ),    (1.48)
where (t0 , p0 ) are the initial conditions.
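A minimal sketch (assuming NumPy and SciPy, with hypothetical parameters) compares formula (1.48) against a direct numerical integration of (1.47):

    # Sketch (assumes NumPy/SciPy): formula (1.48) versus solve_ivp on (1.47).
    import numpy as np
    from scipy.integrate import solve_ivp

    alpha, beta, t0, p0 = 1.0, 0.1, 0.0, 0.5   # hypothetical values

    ts = np.linspace(t0, 15.0, 200)
    num = solve_ivp(lambda t, p: alpha * p - beta * p**2,
                    (t0, ts[-1]), [p0], t_eval=ts, rtol=1e-9).y[0]
    exact = alpha * p0 / (beta * p0 + (alpha - beta * p0) * np.exp(-alpha * (ts - t0)))

    print(np.max(np.abs(num - exact)))  # tiny; both tend to alpha/beta = 10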
A more complex biological system is that in which two species S1 and S2 share the
same habitat so that the individuals of the species S2 , the predators, feed exclusively
on the individuals of the species S1 , the prey. If we denote by N1 (t) and N2 (t) the
number of individuals in, respectively, the first and second species at time t, then
a mathematical model of the above biological system is described by the Lotka–
Volterra system of ODEs
N1′ (t) = a N1 − bN1 N2 ,
N2′ (t) = −cN2 + d N1 N2 ,
(1.49)
where a, b, c, d are positive constants. This system, often called the “predator-prey”
system, is the inspiration for more sophisticated models of the above problem that
lead to more complex equations that were brought to scientists’ attention by V.
Volterra in a classical monograph published in 1931.
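A minimal sketch (assuming SciPy, with hypothetical coefficients and initial populations) integrates (1.49) numerically:

    # Sketch (assumes NumPy/SciPy): integrating the predator-prey system (1.49).
    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, c, d = 1.0, 0.1, 1.5, 0.075          # hypothetical coefficients

    def predator_prey(t, N):
        N1, N2 = N
        return [a * N1 - b * N1 * N2, -c * N2 + d * N1 * N2]

    sol = solve_ivp(predator_prey, (0.0, 30.0), [10.0, 5.0],
                    t_eval=np.linspace(0.0, 30.0, 400))
    print(sol.y[:, -1])   # both populations oscillate periodically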
1.3.3 Epidemic Models
We present here a classical mathematical model for epidemic spread proposed in
1927 by W.O. Kermack and A.G. McKendrick.
Consider a population consisting of n individuals and an infectious disease that
spreads through direct contact. We assume that the infected individuals will either be
isolated, or they will become immune after recovering from the disease. Therefore,
at a given moment of time t, the population is comprised of three categories of
individuals: uninfected individuals, infected individuals roaming freely, and isolated
individuals. Let the sizes of these categories be x(t), y(t) and z(t), respectively.
We will assume that the infection rate −x ′ (t) is proportional to the number x y
which represents the number of possible contacts between uninfected and infected
individuals. Also, we assume that the infected individuals are being isolated at a rate
proportional to their number y. Therefore, the equations governing this process are
x′ = −βxy,  y′ = βxy − γy,    (1.50)

x + y + z = n.    (1.51)
Using (1.50), we deduce that
x′/y′ = −βx/(βx − γ)  ⟹  dy/dx = (γ − βx)/(βx).
Integrating, we get
y(x) = y0 + x0 − x + (γ/β) ln(x/x0),
where x0 , y0 are the initial values of x and y. Invoking (1.51), we can now also
express z as a function of x. To find x, y, z as functions of t it suffices to substitute
y as described above into the first equation of (1.50) and then integrate the newly
obtained equation.
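A minimal sketch (assuming SciPy, with hypothetical rates) carries out this program numerically:

    # Sketch (assumes NumPy/SciPy): the epidemic system (1.50)-(1.51);
    # z is recovered from the conservation law z = n - x - y.
    import numpy as np
    from scipy.integrate import solve_ivp

    n, beta, gamma = 1000.0, 0.0005, 0.1       # hypothetical values

    def epidemic(t, u):
        x, y = u
        return [-beta * x * y, beta * x * y - gamma * y]

    sol = solve_ivp(epidemic, (0.0, 100.0), [990.0, 10.0],
                    t_eval=np.linspace(0.0, 100.0, 200))
    x, y = sol.y
    z = n - x - y
    print(x[-1], y[-1], z[-1])  # final sizes of the three categories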
1.3.4 The Harmonic Oscillator
Consider the motion of a particle of mass m that is moving along the x-axis under
the influence of an elastic force directed towards the origin. We denote by x(t) the
position of the particle on the x-axis at time t. Newton’s second law of dynamics
implies that
mx′′ = F.    (1.52)
On the other hand, F being an elastic force, it is of the form F = −ω 2 x. We conclude
that the motion of the particle is described by the second-order ODE
Fig. 1.1 A pendulum
mx ′′ + ω 2 x = 0.
(1.53)
A more sophisticated model of this motion is one in which we allow for the presence
of a resistance force of the form −bx ′ and of an additional external force f (t) acting
on the particle. We obtain the differential equation
mx ′′ + bx ′ + ω 2 x = f.
(1.54)
If the force F is not elastic but depends on the position x according to a more
complicated law, Eqs. (1.53) and (1.54) will be nonlinear
mx ′′ + bx ′ + F(x) = f.
Consider, for example, the motion of a pendulum of mass m with rigid arm of length
l that moves in a vertical plane; see Fig. 1.1. We denote by x(t) the angle between the
arm and the vertical direction. The motion is due to the gravitational force F1 = mg,
where g is the gravitational acceleration. This force has an active component F,
tangent to the circular trajectory, and of size F = −mg sin x. Invoking Newton’s
second law again, we deduce that this force must equal mlx ′′ , the product of the
mass and the acceleration. Therefore, the equation of motion is
lx ′′ + g sin x = 0.
(1.55)
1.3.5 The Motion of a Particle in a Conservative Field
Consider a particle of mass m that moves under the influence of a force field F :
R3 → R3 . We denote by F1 , F2 , F3 the components of the vector field F and by
x1 (t), x2 (t) and x3 (t) the coordinates of the location of the particle at time t. Newton’s
second law then implies
m xi′′(t) = Fi(x1(t), x2(t), x3(t)), i = 1, 2, 3.    (1.56)
The vector field F is called conservative if there exists a C 1 function U : R3 → R
such that
F = −∇U  ⟺  Fi = −∂U/∂xi, ∀i = 1, 2, 3.    (1.57)
The function U is called the potential energy of the field F. An elementary computation involving (1.56) yields the equality
(d/dt)[ (m/2) Σ_{i=1}^3 (xi′(t))² + U(x1(t), x2(t), x3(t)) ] = 0.

In other words, along the trajectory of the system, the energy

E = (m/2) Σ_{i=1}^3 (xi′(t))² + U(x1(t), x2(t), x3(t))
is conserved.
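This conservation law is easy to observe numerically; a minimal sketch (assuming SciPy) checks it for the one-dimensional instance m = 1, U(x) = x²/2:

    # Sketch (assumes NumPy/SciPy): energy conservation for m = 1, U(x) = x**2/2.
    import numpy as np
    from scipy.integrate import solve_ivp

    def motion(t, u):            # u = (x, x'); x'' = -U'(x) = -x
        x, v = u
        return [v, -x]

    sol = solve_ivp(motion, (0.0, 20.0), [1.0, 0.0],
                    t_eval=np.linspace(0.0, 20.0, 200), rtol=1e-10)
    x, v = sol.y
    E = 0.5 * v**2 + 0.5 * x**2  # E = (m/2)(x')**2 + U(x)
    print(E.max() - E.min())     # ~0: E is constant along the trajectory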
The harmonic oscillator discussed earlier corresponds to a linear force field
F(x) = −ω 2 x, x = (x1 , x2 , x3 ). A vector field F : R3 → R3 is called central if
it has the form
F(x) = f(‖x‖e) x,    (1.58)

where f : R → R is a given function and ‖x‖e is the Euclidean norm

‖x‖e := √(x1² + x2² + x3²).
Without a doubt, the most important example of a central field is the gravitational
force field. If the Sun is placed at the origin of the space R3 , then it generates a
gravitational force field of the form
F(x) = −( gmM/‖x‖e³ ) x,    (1.59)

where M is the mass of the Sun, and m is the mass of a planet situated in this field. We are dealing with a conservative field with potential U(x) = −gmM/‖x‖e.
1.3.6 The Schrödinger Equation
In a potential field U , the steady states of the one-dimensional motion of a particle
of mass m are described by the second-order ODE
ψ′′ + (2m/ℏ²)( E − U(x) )ψ = 0.    (1.60)
Fig. 1.2 An RLC circuit
Equation (1.60), first described in 1926, is called the Schrödinger equation after its
discoverer (the physicist E. Schrödinger (1887–1961)). It is the fundamental equation
of non-relativistic quantum mechanics. In (1.60), ℏ is Planck's constant and E is
the energy level of the steady state. The function ψ is called the wave function of
the particle. This means that for any α < β the probability that the particle can be detected in the interval [α, β] is ∫_α^β |ψ(x)|² dx.
1.3.7 Oscillatory Electrical Circuits
Consider an electrical circuit made of a coil with inductance L, a resistor with resistance R, a capacitor or condenser with capacitance C, and a source of electricity
which produces a potential difference (or voltage) U (Fig. 1.2).
If we denote by I (t) the intensity of the electric current, and by Uab , Ubc , Ucd ,
Uda the potential drops across the segments ab, bc, cd and da respectively, we obtain
from the basic laws of electricity the following equations
Uab(t) = L I′(t),  Ubc(t) = R I(t),  C U′cd(t) = I(t),  Uda = −U.
Kirchhoff's second law implies that Uab + Ubc + Ucd + Uda = 0. Hence
L I′′(t) + R I′(t) + (1/C) I(t) = f(t),    (1.61)
where f (t) = U ′ (t). It is remarkable that Eq. (1.54) of the harmonic oscillator is
formally identical to Eq. (1.61) of the oscillatory electrical circuit.
1.3.8 Solitons
Consider the function u = u(x, t) depending on the variables t and x. We denote
by u tt and u x x the second-order partial derivatives of u with respect to t and x,
respectively. Suppose that u satisfies the partial differential equation
u tt − C 2 u x x = 0, t ≥ 0, x ∈ R,
(1.62)
where C is a real constant. The variable x is the “position” variable while t is
the “time” variable. Equation (1.62) is known in mathematical physics as the wave
equation. It describes, among other things, the vibration of an elastic string. The
function u(x, t) is the amplitude of the vibration at location x on the string and at
time t.
A traveling wave of (1.62) is a solution of the form
u(x, t) = ϕ(x + µt),
(1.63)
where µ is a constant and ϕ is a one-variable C 2 -function. Using u defined by (1.63)
in Eq. (1.62), we deduce that µ = ±C. In other words, Eq. (1.62) admits two families
of solutions of type (1.63), and D’Alembert’s principle states that the general solution
of (1.62) is
u(x, t) = ϕ(x + Ct) + ψ(x − Ct),
(1.64)
where ϕ, ψ : R → R are arbitrary C²-functions. In general, a solution of type (1.63)
for a given equation is called a soliton. The importance of solitons in mathematical
physics stems from the fact that their general features tend to survive interactions. We
want to emphasize that the solitons are found by solving certain ordinary differential
equations.
Consider for example the Korteweg–de Vries equation
u t + uu x + u x x x = 0,
(1.65)
that was initially proposed to model the propagation of ocean surface waves that
appear due to gravity. Subsequently, it was discovered that (1.65) is also relevant in
modeling a large range of phenomena, among which we mention the propagation of
heat in solids (the Fermi–Pasta–Ulam model).
The solitons of (1.65) satisfy the third-order ODE
ϕ′′′ + ϕϕ′ + µϕ′ = 0,
(1.66)
where μ is a real parameter and ϕ′ = dϕ/dx.
Note that (1.66) is an equation of type (1.41) and we will use the techniques presented at the end of Sect. 1.2.8 to solve it. Setting p := ϕ′, we have

ϕ′′ = dp/dx = (dp/dϕ)(dϕ/dx) = (dp/dϕ) p,

and thus

ϕ′′′ = (d/dϕ)( (dp/dϕ) p ) · p.
Equation (1.66) reduces to a second-order ODE
(d/dϕ)( p (dp/dϕ) ) · p + pϕ + μp = 0,

that is,

(d/dϕ)( p (dp/dϕ) ) + ϕ + μ = 0.

Integrating, we deduce that

p² + (1/3)ϕ³ + μϕ² + C1ϕ + C2 = 0,
where C1 and C2 are arbitrary constants. We deduce that the solution ϕ is given by
the equation¹

±√3 ∫_{ϕ0}^ϕ du/√(−u³ − 3μu² − 3C1u − 3C2) = x + C3.    (∗)
An equation related to (1.65) is
u t + uu x = νu x x .
(1.67)
It is called the Burgers equation and it is used as a mathematical model of turbulence.
If we make the change in variables v x := u and then integrate with respect to x, we
obtain the equivalent equation
1
v t + |v x |2 = νv x x
2
(1.68)
which in its turn has multiple physical interpretations.
The method we employed above can be used to produce explicit
solutions
to
(1.68). More precisely, we seek solutions of the form v(t, x) = ψ t + y(x) .
Computing the partial derivatives v t , v x and substituting them into (1.68), we
deduce that
( νψ′′(t + y(x)) − (1/2) ψ′(t + y(x))² ) / ψ′(t + y(x)) = ( 1 − νy′′(x) ) / y′(x)².
¹ N.T.: Equality (∗) shows that ϕ is an elliptic function.
Thus, for v to satisfy (1.68) it suffices that y and ψ satisfy the ordinary differential
equations
C(y′)² + νy′′ = 1,    (1.69)

νψ′′ − (1/2)(ψ′)² = Cψ′,    (1.70)
where C is an arbitrary constant.
To solve (1.69), we set z := y ′ and we obtain in this fashion the first-order separable equation
C z 2 = 1 − νz ′
which is explicitly solvable. Equation (1.70) can be solved in a similar way and we
obtain an explicit solution v of (1.68) and, indirectly, an explicit solution u of (1.67).
1.3.9 Bipartite Biological Systems
We consider here the problem of finding the concentration of a chemical substance
(e.g., a medical drug) in a system consisting of two compartments separated by a
membrane.
The drug can pass through the membrane in both directions, that is, from compartment I to compartment II and, reversely, from compartment II to compartment I, but
it can also flow out from compartment II to an exterior system. If the compartments
have volumes v 1 and v 2 , respectively, and x1 (t) and x2 (t) denote the amount of the
drug in compartment I and II, respectively, then the rate dx1/dt of transfer of the drug
from compartment I to compartment II is proportional to the area A of the membrane
and the concentration x1/v1 of the drug in compartment I. Similarly, the rate of transfer of the drug from compartment II is proportional to the product A·(x2/v2). Therefore, there
exist positive constants α, β such that
dx1/dt = −βA (x1/v1) + αA (x2/v2).    (1.71)
We obtain a similar equation describing the evolution of x2 . More precisely,
we have
dx2/dt = βA (x1/v1) − αA (x2/v2) − γx2,    (1.72)
where γx2 represents the rate of transfer of the drug from compartment II to the
exterior system. In particular, γ is also a positive constant.
The differential system (1.71), (1.72) describes the evolution of the amount of the
drug in the above bipartite system. This system can be solved by expressing one of
the unknowns in terms of the other, thus reducing the system to a second-order ODE.
We will discuss more general methods in Chap. 3.
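Pending the general methods of Chap. 3, the system can be integrated numerically; a minimal sketch, assuming SciPy and using hypothetical constants:

    # Sketch (assumes NumPy/SciPy): the compartment system (1.71)-(1.72).
    import numpy as np
    from scipy.integrate import solve_ivp

    alpha, beta, gamma, A, v1, v2 = 0.3, 0.5, 0.2, 1.0, 2.0, 1.0  # hypothetical

    def compartments(t, x):
        x1, x2 = x
        return [-beta * A * x1 / v1 + alpha * A * x2 / v2,
                beta * A * x1 / v1 - alpha * A * x2 / v2 - gamma * x2]

    sol = solve_ivp(compartments, (0.0, 40.0), [5.0, 0.0],
                    t_eval=np.linspace(0.0, 40.0, 100))
    print(sol.y[:, -1])   # both amounts decay as the drug leaves the system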
1.3.10 Chemical Reactions
Consider n chemical substances reacting with each other. Let us denote by x1 (t), . . . ,
xn (t) their respective concentrations at time t. The rate of change in the concentration
xi is, in general, a function of the concentrations x1 , . . . , xn , that is,
dxi/dt = fi(x1, . . . , xn), i = 1, . . . , n.    (1.73)
Let us illustrate this in some special cases.
If the chemical reaction is unimolecular and irreversible, of type A → B with rate constant k (the
chemical substance A is converted by the reaction into the chemical substance B),
then the equation modeling this reaction is
−(d/dt)[A] = k[A]  or  (d/dt)[B] = k[A],    (1.74)
where we have denoted by [A] and [B] the concentrations of the substance A and B,
respectively.
If the reaction is reversible, A ⇄ B, with rate constants k1 (forward) and k2 (backward), then we have the equations
(d/dt)[A] = −k1[A] + k2[B],  (d/dt)[B] = k1[A] − k2[B],
or, if we set x1 := [A], x2 := [B], then
dx1/dt = −k1x1 + k2x2,  dx2/dt = k1x1 − k2x2.
Consider now the case of a bimolecular reaction A + B → P in which a moles of the
chemical substance A and b moles of the chemical substance B combine to produce
the output P.
If we denote by x(t) the number of moles per liter from A and B that enter into
the reaction at time t, then according to a well-known chemical law (the law of mass
action) the speed of reaction dx/dt is proportional to the product of the concentrations
of the chemicals participating in the reaction at time t. In other words,
dx/dt = k(a − x)(b − x).

Similarly, a chemical reaction involving three chemical substances, A + B + C → P, is described by the ODE

dx/dt = k(a − x)(b − x)(c − x).
1.4 Integral Inequalities
This section is devoted to the investigation of the following linear integral inequality
x(t) ≤ ϕ(t) + ∫_a^t ψ(s)x(s) ds, t ∈ [a, b],    (1.75)
where the functions x, ϕ and ψ are continuous on [a, b] and ψ(t) ≥ 0, ∀t ∈ [a, b].
Lemma 1.1 (Gronwall’s lemma) If the above conditions are satisfied, then x(t)
satisfies the inequality
x(t) ≤ ϕ(t) + ∫_a^t ϕ(s)ψ(s) exp( ∫_s^t ψ(τ) dτ ) ds.    (1.76)
In particular, if ϕ ≡ C, (1.76) reduces to
x(t) ≤ C exp( ∫_a^t ψ(s) ds ), ∀t ∈ [a, b].    (1.77)

Proof We set

y(t) := ∫_a^t ψ(s)x(s) ds.
Then y ′ (t) = ψ(t)x(t) and (1.75) can be restated as x(t) ≤ ϕ(t) + y(t).
Since ψ(t) ≥ 0, we have
ψ(t)x(t) ≤ ψ(t)ϕ(t) + ψ(t)y(t),
and we deduce that
y ′ (t) = ψ(t)x(t) ≤ ψ(t)ϕ(t) + ψ(t)y(t).
We multiply both sides of the above inequality by exp(−∫_a^t ψ(s) ds) to obtain

(d/dt)( y(t) exp(−∫_a^t ψ(s) ds) ) ≤ ψ(t)ϕ(t) exp(−∫_a^t ψ(s) ds).
Integrating, we obtain
y(t) ≤ ∫_a^t ϕ(s)ψ(s) exp( ∫_s^t ψ(τ) dτ ) ds.    (1.78)
We reach the desired conclusion by recalling that x(t) ≤ ϕ(t) + y(t).
Corollary 1.1 Let x : [a, b] → R be a continuous nonnegative function satisfying
the inequality
x(t) ≤ M + ∫_a^t ψ(s)x(s) ds,    (1.79)
where M is a positive constant and ψ : [a, b] → R is a continuous nonnegative
function. Then
x(t) ≤ M exp( ∫_a^t ψ(s) ds ), ∀t ∈ [a, b].    (1.80)
Remark 1.1 The above inequality is optimal in the following sense: if we have
equality in (1.79), then we have equality in (1.80) as well. Note also that we can
identify the right-hand side of (1.80) as the unique solution of the linear Cauchy
problem
x ′ (t) = ψ(t)x(t), x(a) = M.
This Cauchy problem corresponds to the equality case in (1.79).
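A minimal numerical sketch (assuming NumPy; the functions are hypothetical choices with ψ ≡ 1, M = 1 on [0, 2]) illustrates both the bound (1.80) and the equality case:

    # Sketch (assumes NumPy): Gronwall's bound (1.80) on [a, b] = [0, 2]
    # with psi = 1 and M = 1.
    import numpy as np

    t = np.linspace(0.0, 2.0, 201)
    bound = np.exp(t)                 # M * exp(int_0^t psi(s) ds) = e**t

    x_strict = np.exp(t / 2)          # satisfies x <= 1 + int_0^t x(s) ds strictly
    x_equal = np.exp(t)               # the equality case x(t) = 1 + int_0^t x(s) ds

    print(np.all(x_strict <= bound))        # True, with room to spare
    print(np.max(np.abs(x_equal - bound)))  # 0.0: the bound is attained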
We will frequently use Gronwall’s inequality to produce a priori estimates of
solutions of ODEs and systems of ODEs. In the remainder of this section, we will
discuss two slight generalizations of this inequality.
Proposition 1.1 (Bihari) Let x : [a, b] → [0, ∞) be a continuous function satisfying the inequality
x(t) ≤ M + ∫_a^t ψ(s) ω(x(s)) ds, ∀t ∈ [a, b],    (1.81)
where ω : [0, ∞) → (0, ∞) is a continuous nondecreasing function. Define Φ :
[0, ∞) → R by setting
Φ(u) = ∫_{u0}^u ds/ω(s), u0 ≥ 0.    (1.82)

Then

x(t) ≤ Φ⁻¹( Φ(M) + ∫_a^t ψ(s) ds ), ∀t ∈ [a, b].    (1.83)
Proof We set
y(t) := ∫_a^t ω(x(s)) ψ(s) ds.
Inequality (1.81) implies that x(t) ≤ M + y(t), ∀t ∈ [a, b]. Since ω is nondecreasing, we deduce that y′(t) ≤ ω(M + y(t)) ψ(t). Integrating the last inequality over [a, t], we have
∫_M^{y(t)+M} dτ/ω(τ) = ∫_{y(a)+M}^{y(t)+M} dτ/ω(τ) = ∫_{y(a)}^{y(t)} ds/ω(M + s) ≤ ∫_a^t ψ(s) ds,

that is,

Φ( y(t) + M ) ≤ Φ(M) + ∫_a^t ψ(s) ds.
The last inequality is equivalent to inequality (1.83).
Proposition 1.2 Let x : [a, b] → R be a continuous function that satisfies the
inequality
(1/2) x(t)² ≤ (1/2) x0² + ∫_a^t ψ(s)|x(s)| ds, ∀t ∈ [a, b],    (1.84)
where ψ : [a, b] → (0, ∞) is a continuous nonnegative function. Then x(t) satisfies
the inequality
|x(t)| ≤ |x0| + ∫_a^t ψ(s) ds, ∀t ∈ [a, b].    (1.85)
Proof For ε > 0 we define
yε(t) := (1/2)(x0² + ε²) + ∫_a^t ψ(s)|x(s)| ds, ∀t ∈ [a, b].
Using (1.84), we get
x(t)² ≤ 2yε(t), ∀t ∈ [a, b].    (1.86)
Combining this with the equality
yε′ (t) = ψ(t)|x(t)|
and (1.84), we conclude that
yε′(t) ≤ √(2yε(t)) ψ(t), ∀t ∈ [a, b].
Integrating from a to t, we obtain
√(2yε(t)) ≤ √(2yε(a)) + ∫_a^t ψ(s) ds, ∀t ∈ [a, b].
Using (1.86), we get
|x(t)| ≤ √(2yε(a)) + ∫_a^t ψ(s) ds ≤ |x0| + ε + ∫_a^t ψ(s) ds, ∀t ∈ [a, b].
Letting ε → 0 in the above inequality, we obtain (1.85).
Problems
1.1 A reservoir contains ℓ liters of salt water with the concentration c0 . Salt water
is flowing into the reservoir at a rate of ℓ0 -liters per minute and with a concentration
α0 . The same amount of salt water is leaving the reservoir every minute. Assuming
that the salt in the water is uniformly distributed, find the time evolution of the
concentration of salt in the water.
Hint. If we let the state of the system be the concentration x(t) of salt in the water,
the data in the problem lead to the ODE
ℓ x′(t) = (α0 − x(t)) ℓ0,
(1.87)
and the initial condition x(0) = c0 . This is a linear and separable ODE that can be
solved using the methods outlined in Sect. 1.2.1.
1.2 Prove that any solution of the ODE
x′ = ∛( (x² + 1)/(t⁴ + 1) )
has two horizontal asymptotes.
Hint. Write the above equation as
∫_0^{x(t)} dy/∛(y² + 1) = ∫_0^t ds/∛(s⁴ + 1)
and study lim_{t→±∞} x(t).
1.3 Find the plane curve with the property that the distance from the origin to any
tangent line to the curve is equal to the x-coordinate of the tangency point.
1.4 (Rocket motion) A body of mass m is launched from the surface of the Earth
with initial velocity v 0 along the vertical line corresponding to the launching point.
Assuming that the air resistance is negligible and taking into account that the gravitational force that acts on the body at altitude x is equal to mgR²/(x + R)² (R is the radius of
the earth), we deduce from Newton’s second law that the altitude x(t) of the body
satisfies the ODE
x′′ = −gR²/(x + R)².    (1.88)
Solve (1.88) and determine the minimal initial velocity such that the body never
returns to Earth.
Hint. Equation (1.88) is of type (1.42) and, using the substitution indicated in
Sect. 1.2.8, it can be reduced to a first-order separable ODE.
1.5 Find the solution of the ODE
3x²x′ + 16t = 2tx³
that is bounded on the positive semi-axis [0, ∞).
Hint. Via the substitution x³ = y, obtain for y the linear equation y′ − 2ty + 16t = 0.
1.6 Prove that the ODE
x ′ + ωx = f (t),
(1.89)
where ω is a positive constant and f : R → R is continuous and bounded, has a
unique solution that is bounded on R. Find this solution x and prove that if f is
periodic, then x is also periodic, with the same period.
Hint. Start with the general solution x(t) = e^(−ωt)( C + ∫_{−∞}^t e^(ωs) f(s) ds ).
1.7 Consider the ODE
t x ′ + ax = f (t),
where a is a positive constant and limt→0 f (t) = α. Prove that there exists a unique
solution of this equation that has finite limit as t → 0 and then find this solution.
1.8 According to Newton’s heating and cooling law, the rate of decrease in temperature of a body that is cooling is proportional to the difference between the temperature
of the body and the temperature of the ambient surrounding. Find the equation that
models the cooling phenomenon.
1.9 Let f : [0, ∞) → R be a continuous function such that limt→∞ f (t) = 0. Prove
that any solution of (1.89) goes to 0 as t → ∞.
Hint. Start with the general solution of the linear ODE (1.89).
1.10 Let f : [0, ∞) → R be a continuous function such that ∫_0^∞ |f(t)| dt < ∞.
Prove that the solutions of the ODE

x′ + (ω + f(t)) x = 0, ω > 0,

converge to 0 as t → ∞.
1.11 Solve the (Lotka–Volterra) ODE
dy/dx = y(d x − c) / ( x(a − by) ).
1.12 Solve the ODE
x ′ = k(a − x)(b − x).
(Such an equation models certain chemical reactions.)
1.13 Find the solution of the ODE
x ′ sin t = 2(x + cos t)
that stays bounded as t → ∞.
Hint. Solve it as a linear ODE.
1.14 Prove that, by using the substitution y = x′/x, we can reduce the second-order ODE x′′ = a(t)x to the Riccati-type equation

y′ = −y² + a(t).    (1.90)
1.15 Prove that, if x1 (t), x2 (t), x3 (t), x4 (t) are solutions of a Riccati-type ODE, then
the cross-ratio
( x3(t) − x1(t) )/( x3(t) − x2(t) ) : ( x4(t) − x1(t) )/( x4(t) − x2(t) )
is independent of t.
1.16 Find the plane curves such that the area of the triangle formed by any tangent
with the coordinate axes is a given constant a 2 .
1.17 Consider the family of curves in the (t, x)-plane described by

F(t, x, λ) = 0,  λ ∈ R.    (1.91)

(a) Find the curves that are orthogonal to all the curves in this family.
(b) Find the curves that are orthogonal to all those in the family x = λeᵗ.
Hint. Since the tangent line to a curve of the family is parallel to the vector (1, −F_t/F_x), the orthogonal curves are solutions of the differential equation

x′ = F_x/F_t,    (1.92)

where λ has been replaced by its value λ = λ(t, x) determined from (1.91).
1.18 Find the solitons of the Klein–Gordon equation

u_tt − u_xx + u + u³ = 0.    (1.93)

Hint. For u(x, t) = ϕ(x + µt) we get for ϕ the ODE (µ² − 1)ϕ″ + ϕ + ϕ³ = 0.
Fig. 1.3 An RC circuit (resistor R, capacitor C, voltage source U)
1.19 Find the differential equation that models the behavior of an RC electric circuit
as in Fig. 1.3.
Hint. If we denote by Q the electric charge of the capacitor, then we have C⁻¹Q + RI = U, where I = dQ/dt denotes the electric current. Thus Q satisfies the ODE

R dQ/dt + C⁻¹Q = U.    (1.94)
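For a constant source voltage U (an illustrative special case, with made-up component values), equation (1.94) is linear with constant coefficients and can be solved in closed form; a minimal sketch:

```python
import numpy as np

# Illustrative values; for constant U, (1.94) gives
# Q(t) = C*U + (Q0 - C*U) * exp(-t / (R*C)).
R, C, U, Q0 = 2.0, 0.5, 10.0, 0.0

def Q(t):
    return C * U + (Q0 - C * U) * np.exp(-t / (R * C))

for t in (0.0, R * C, 5 * R * C):
    print(t, Q(t))   # the charge approaches C*U after a few time constants
```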
1.20 Find the system of ODEs that models the behavior of the electrical circuit in
Fig. 1.4.
Hint. Denote by Q_i the electrical charge of the capacitor C_i, i = 1, 2, and by I_i the corresponding electrical currents. Kirchhoff's laws yield the equations

C₂⁻¹Q₂ + RI₁ = U₂,  −C₂⁻¹Q₂ + R₁I₂ + C₁⁻¹Q₁ = 0,
dQ₁/dt = I₂,  dQ₂/dt = I₁ − I₂.    (1.95)

Using as a state of the system the pair x₁ = Q₁ and x₂ = Q₂, we obtain a system of first-order ODEs in (x₁, x₂).
Fig. 1.4 A more complex RC circuit (resistors R, R₁, capacitors C₁, C₂, voltage sources U₁, U₂)
Chapter 2
Existence and Uniqueness for the Cauchy Problem
In this chapter, we will present some results concerning the existence, uniqueness and
dependence on data of the solutions to the Cauchy problem for ODEs and systems of
ODEs. From a mathematical point of view, this is a fundamental issue in the theory
of differential equations. If we view a differential equation as a mathematical model
of a physical theory, the existence of solutions to the Cauchy problem represents one
of the first means of testing the validity of the model and, ultimately, of the physical
theory. An existence result highlights the states and the minimal physical parameters
that determine the evolution of a process and, often having a constructive character,
it leads to numerical procedures for approximating the solutions. Basic references
for this chapter are [1, 6, 9, 11, 18].
2.1 Existence and Uniqueness for First-Order ODEs
We begin by investigating the existence and uniqueness of solutions to a Cauchy
problem in a special case, namely that of the scalar ODE (1.2) defined in a rectangle
centered at (t0 , x0 ) ∈ R2 . In other words, we consider the Cauchy problem
x ′ = f (t, x), x(t0 ) = x0 ,
(2.1)
where f is a real-valued function defined in the domain
∆ := {(t, x) ∈ R²; |t − t₀| ≤ a, |x − x₀| ≤ b}.    (2.2)
The central existence result for problem (2.1) is stated in our next theorem.
Theorem 2.1 Assume that the following hold:
(i) The function f is continuous on ∆.
(ii) The function f satisfies the Lipschitz condition in the variable x, that is, there exists an L > 0 such that

|f(t, x) − f(t, y)| ≤ L|x − y|,  ∀(t, x), (t, y) ∈ ∆.    (2.3)

Then there exists a unique solution x = x(t) to the Cauchy problem (2.1) defined on the interval |t − t₀| ≤ δ, where

δ := min{a, b/M},  M := sup_{(t,x)∈∆} |f(t, x)|.    (2.4)
Proof We begin by observing that problem (2.1) is equivalent to the integral equation

x(t) = x₀ + ∫_{t₀}^t f(s, x(s)) ds.    (2.5)
Indeed, if the continuous function x(t) satisfies (2.5) on an interval I, then it is clearly a C¹-function and satisfies the initial condition x(t₀) = x₀. The equality x′(t) = f(t, x(t)) is then an immediate consequence of the Fundamental Theorem of Calculus. Conversely, any solution of (2.1) is also a solution of (2.5). Hence,
to prove the theorem it suffices to show that (2.5) has a unique continuous solution
on the interval I := [t0 − δ, t0 + δ].
We will rely on the method of successive approximations used by many mathematicians, starting with Newton, to solve algebraic and transcendental equations. For
the problem at hand, this method was successfully pioneered by E. Picard (1856–
1941).
Consider the sequence of functions xₙ : I → R, n = 0, 1, . . . , defined iteratively as follows:

x₀(t) = x₀,  ∀t ∈ I,
x_{n+1}(t) = x₀ + ∫_{t₀}^t f(s, xₙ(s)) ds,  ∀t ∈ I, ∀n = 0, 1, . . . .    (2.6)
It is easy to see that the functions xₙ are continuous and, moreover,

|xₙ(t) − x₀| ≤ Mδ ≤ b,  ∀t ∈ I, n = 1, 2, . . . .    (2.7)
This proves that the sequence {xn }n≥0 is well defined. We will prove that this sequence
converges uniformly to a solution of (2.5). Using (2.6) and the Lipschitz condition
(2.3), we deduce that
|xₙ(t) − x_{n−1}(t)| ≤ |∫_{t₀}^t |f(s, x_{n−1}(s)) − f(s, x_{n−2}(s))| ds|
≤ L |∫_{t₀}^t |x_{n−1}(s) − x_{n−2}(s)| ds|.    (2.8)
Iterating (2.8) and using (2.7), we have

|xₙ(t) − x_{n−1}(t)| ≤ (M L^{n−1}/n!) |t − t₀|ⁿ ≤ M L^{n−1} δⁿ/n!,  ∀n, ∀t ∈ I.    (2.9)
Observe that the sequence {xₙ}_{n≥0} is uniformly convergent on I if and only if the telescopic series

Σ_{n≥1} (xₙ(t) − x_{n−1}(t))

is uniformly convergent on this interval. The uniform convergence of this series follows from (2.9) by invoking Weierstrass' M-test: the above series is majorized by the convergent numerical series

Σ_{n≥1} M L^{n−1} δⁿ/n!.
Hence the limit

x(t) = lim_{n→∞} xₙ(t)

exists uniformly on the interval I. The function x(t) is continuous and, from the uniform continuity of the function f(t, x) on ∆, we deduce that

f(t, x(t)) = lim_{n→∞} f(t, xₙ(t)),

uniformly in t ∈ I. We can pass to the limit in the integral that appears in (2.6) and we deduce that

x(t) = x₀ + ∫_{t₀}^t f(s, x(s)) ds,  ∀t ∈ I.    (2.10)

In other words, x(t) is a solution of (2.5).
To prove the uniqueness, we argue by contradiction and assume that x(t), y(t) are two solutions of (2.5) on I. Thus

|x(t) − y(t)| = |∫_{t₀}^t (f(s, x(s)) − f(s, y(s))) ds| ≤ L |∫_{t₀}^t |x(s) − y(s)| ds|,  ∀t ∈ I.

Using Gronwall's Lemma 1.1 with ϕ ≡ 0 and ψ ≡ L, we deduce x(t) = y(t), ∀t ∈ I.
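The successive approximations (2.6) are easy to run numerically. The following sketch (an illustration, not the book's code) computes the iterates on a uniform grid with the trapezoidal rule, for the assumed test problem x′ = x, x(0) = 1, whose iterates converge uniformly to eᵗ:

```python
import numpy as np

def picard_iterates(f, t0, x0, delta, n_iter, n_grid=1000):
    """Picard iterates x_{n+1}(t) = x0 + int_{t0}^t f(s, x_n(s)) ds,
    approximated on a uniform grid over [t0, t0 + delta]."""
    t = np.linspace(t0, t0 + delta, n_grid)
    x = np.full_like(t, x0)                      # x_0(t) = x0
    iterates = [x]
    for _ in range(n_iter):
        integrand = f(t, x)
        # cumulative trapezoidal integral from t0 to each grid point
        integral = np.concatenate(([0.0],
            np.cumsum((integrand[1:] + integrand[:-1]) / 2 * np.diff(t))))
        x = x0 + integral
        iterates.append(x)
    return t, iterates

# Example: x' = x, x(0) = 1; the iterates converge to exp(t)
t, its = picard_iterates(lambda t, x: x, 0.0, 1.0, delta=1.0, n_iter=8)
print(np.max(np.abs(its[-1] - np.exp(t))))       # small uniform error
```

The uniform error shrinks from one iterate to the next roughly as predicted by the bound (2.9).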
Remark 2.1 In particular, the Lipschitz condition (2.3) is satisfied if the function f has a partial derivative ∂f/∂x that is continuous on the rectangle ∆ or, more generally, bounded on this rectangle.
Remark 2.2 We note that Theorem 2.1 is a local existence and uniqueness result for the Cauchy problem (2.1); that is, existence and uniqueness were proved on an interval [t₀ − δ, t₀ + δ] which may be smaller than the interval [t₀ − a, t₀ + a] on which the functions t ↦ f(t, x) are defined.
2.2 Existence and Uniqueness for Systems of First-Order ODEs
Consider the differential system
xi′ = f i (t, x1 , . . . , xn ), i = 1, . . . , n,
(2.11)
together with the initial conditions
xi (t0 ) = xi0 , i = 1, . . . , n,
(2.12)
where the functions f i are defined on a parallelepiped
∆ := {(t, x1 , . . . , xn ) ∈ Rn+1 ; |t − t0 | ≤ a, |xi − xi0 | ≤ b, i = 1, . . . , n}.
(2.13)
Theorem 2.1 generalizes to differential systems of type (2.11).
Theorem 2.2 Assume that the following hold:
(i) The functions f i are continuous on ∆ for any i = 1, . . . , n.
(ii) The functions f i are Lipschitz in x = (x1 , . . . , xn ) on ∆, that is, there exists an
L > 0 such that
|f_i(t, x₁, . . . , xₙ) − f_i(t, y₁, . . . , yₙ)| ≤ L max_{1≤k≤n} |x_k − y_k|,    (2.14)
for any i = 1, . . . , n and any (t, x1 , . . . , xn ), (t, y1 , . . . , yn ) ∈ ∆.
Then there exists a unique solution x_i = ϕ_i(t), i = 1, . . . , n, of the Cauchy problem (2.11) and (2.12) defined on the interval

I := [t₀ − δ, t₀ + δ],  δ := min{a, b/M},    (2.15)

where M := max{|f_i(t, x)|; (t, x) ∈ ∆, i = 1, . . . , n}.
Proof The proof of Theorem 2.2 is based on an argument very similar to the one
used in the proof of Theorem 2.1. For this reason, we will only highlight the main
steps.
We observe that the Cauchy problem (2.11) and (2.12) is equivalent to the system of integral equations

x_i(t) = x_i⁰ + ∫_{t₀}^t f_i(s, x₁(s), . . . , xₙ(s)) ds,  i = 1, . . . , n.    (2.16)
To construct a solution to this system, we again use successive approximations:

x_i⁰(t) ≡ x_i⁰,  i = 1, . . . , n,
x_i^k(t) = x_i⁰ + ∫_{t₀}^t f_i(s, x₁^{k−1}(s), . . . , xₙ^{k−1}(s)) ds,  i = 1, . . . , n, k ≥ 1.    (2.17)
Arguing as in the proof of Theorem 2.1, we deduce that the functions t ↦ x_i^k(t) are well defined and continuous on the interval I. An elementary argument based on the Lipschitz condition yields the following counterpart of (2.9):

max_{1≤i≤n} |x_i^k(t) − x_i^{k−1}(t)| ≤ M L^{k−1} δ^k/k!,  ∀k ≥ 1, t ∈ I.
Invoking as before the Weierstrass M-test, we deduce that the limits

ϕ_i(t) = lim_{k→∞} x_i^k(t),  1 ≤ i ≤ n,
exist and are uniform on I . Letting k → ∞ in (2.17), we deduce that (ϕ1 , . . . , ϕn )
is a solution of system (2.16), and thus also a solution of the Cauchy problem (2.11)
and (2.12).
The uniqueness follows from Gronwall’s Lemma (Lemma 1.1) via an argument
similar to the one in the proof of Theorem 2.1.
Both the statement and the proof of the existence and uniqueness theorem for
systems do not seem to display meaningful differences when compared to the scalar
case. Once we adopt the vector notation, we will see that there are not even formal
differences between these two cases.
Consider the vector space Rⁿ of vectors x = (x₁, . . . , xₙ) equipped with the norm (see Appendix A)

‖x‖ := max_{1≤i≤n} |x_i|,  x = (x₁, . . . , xₙ).    (2.18)
On the space Rⁿ equipped with the above (or any other) norm, we can develop a differential and integral calculus similar to the familiar one involving scalar functions. Given an interval I, we define a vector-valued function x : I → Rⁿ of the form

x(t) = (x₁(t), . . . , xₙ(t)),
where xi (t) are scalar functions defined on I . The function x : I → Rn is called
continuous if all its components {xi (t); i = 1, . . . , n} are continuous. The function
x is called differentiable at t0 if all its components xi have this property. The derivative
of x(t) at the point t, denoted by x′(t), is the vector

x′(t) := (x₁′(t), . . . , xₙ′(t)).
We can define the integral of the vector function in a similar fashion. More precisely,

∫_a^b x(t) dt := (∫_a^b x₁(t) dt, . . . , ∫_a^b xₙ(t) dt) ∈ Rⁿ.
The sequence {x_ν} of vector-valued functions

x_ν : I → Rⁿ,  ν = 0, 1, 2, . . . ,

is said to converge uniformly (respectively pointwise) to x : I → Rⁿ as ν → ∞ if
each component sequence has these properties. The space of continuous functions
x : I → Rn is denoted by C(I ; Rn ).
All the above notions have an equivalent formulation involving the norm ‖·‖ of the space Rⁿ; see Appendix A. For example, the continuity of x : I → Rⁿ at t₀ ∈ I is equivalent to

lim_{t→t₀} ‖x(t) − x(t₀)‖ = 0.
The derivative, integral and the concept of convergence can be defined along similar
lines.
Returning to the differential system (2.11), observe that, if we denote by x(t) the vector-valued function

x(t) = (x₁(t), . . . , xₙ(t))

and by f : ∆ → Rⁿ the function

f(t, x) = (f₁(t, x), . . . , fₙ(t, x)),

then we can rewrite (2.11) as

x′ = f(t, x),    (2.19)

while the initial condition (2.12) becomes

x(t₀) = x⁰ := (x₁⁰, . . . , xₙ⁰).    (2.20)

In vector notation, Theorem 2.2 can be rephrased as follows.
Theorem 2.3 Assume that the following hold:
(i) The function f : ∆ → Rn is continuous.
(ii) The function f is Lipschitz in the variable x on ∆.
Then there exists a unique solution x = ϕ(t) of the system (2.19) satisfying the initial condition (2.20) and defined on the interval

I := [t₀ − δ, t₀ + δ],  δ := min{a, b/M},  M := sup_{(t,x)∈∆} ‖f(t, x)‖.
In this formulation, Theorem 2.3 can be proved by following word for word the proof of Theorem 2.1, with one obvious exception: where appropriate, we need to replace the absolute value |·| with the norm ‖·‖. In the sequel, we will systematically use the vector notation when working with systems of differential equations.
2.3 Existence and Uniqueness for Higher Order ODEs
Consider the differential equation of order n,
x (n) = g(t, x, x ′ , . . . , x (n−1) ),
(2.21)
together with the Cauchy condition (see Sect. 1.1)

x(t₀) = x₀⁰,  x′(t₀) = x₁⁰, . . . , x^{(n−1)}(t₀) = x⁰_{n−1},    (2.22)

where (t₀, x₀⁰, x₁⁰, . . . , x⁰_{n−1}) ∈ Rⁿ⁺¹ is fixed and the function g satisfies the following conditions.
(I) The function g is defined and continuous on the set

∆ = {(t, x₁, . . . , xₙ) ∈ Rⁿ⁺¹; |t − t₀| ≤ a, |x_i − x⁰_{i−1}| ≤ b, ∀i = 1, . . . , n}.

(II) There exists an L > 0 such that

|g(t, x) − g(t, y)| ≤ L‖x − y‖,  ∀(t, x), (t, y) ∈ ∆.    (2.23)
Theorem 2.4 Assume that conditions (I) and (II) above hold. Then the Cauchy problem (2.21) and (2.22) admits a unique solution on the interval

I := [t₀ − δ, t₀ + δ],  δ := min{a, b/M},

where

M := sup_{(t,x)∈∆} max{|g(t, x)|, |x₂|, . . . , |xₙ|}.
Proof As explained before (see (1.10) and (1.11)), using the substitutions

x₁ := x,  x₂ := x′, . . . , xₙ := x^{(n−1)},

the differential equation (2.21) reduces to the system of ODEs

x₁′ = x₂,
x₂′ = x₃,
. . .
xₙ′ = g(t, x₁, . . . , xₙ),    (2.24)

while the Cauchy condition becomes

x_i(t₀) = x⁰_{i−1},  ∀i = 1, . . . , n.    (2.25)

In view of (I) and (II), Theorem 2.4 becomes a special case of Theorem 2.2.
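To make the substitution concrete, here is a short sketch (illustrative, not taken from the book) that builds the equivalent first-order system (2.24) from a given g and integrates it with a crude Euler loop for the assumed example x″ = −x, x(0) = 0, x′(0) = 1:

```python
import numpy as np

# The substitution (x1, ..., xn) := (x, x', ..., x^{(n-1)}) turns
# x^{(n)} = g(t, x, ..., x^{(n-1)}) into the first-order system (2.24).
def to_first_order(g):
    def F(t, y):                      # y = (x1, ..., xn)
        return np.append(y[1:], g(t, *y))
    return F

# Assumed example: x'' = -x (harmonic oscillator)
F = to_first_order(lambda t, x1, x2: -x1)

h, y = 1e-4, np.array([0.0, 1.0])     # x(0) = 0, x'(0) = 1
for j in range(int(np.pi / h)):       # integrate up to t = pi
    y = y + h * F(j * h, y)
print(y[0], np.sin(np.pi))            # x(pi) ~ sin(pi) = 0
```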
2.4 Peano’s Existence Theorem
We will prove an existence result for the Cauchy problem due to G. Peano (1858–
1932). Roughly speaking, it states that the continuity of f alone suffices to guarantee
that the Cauchy problem (2.11) and (2.12) has a solution in a neighborhood of the
initial point. Beyond its theoretical significance, this result will offer us the opportunity to discuss another important technique for investigating and approximating the
solutions of an ODE. We are talking about the polygonal method, due essentially to
L. Euler (1707–1783).
Theorem 2.5 Let f : ∆ → Rⁿ be a continuous function defined on

∆ := {(t, x) ∈ Rⁿ⁺¹; |t − t₀| ≤ a, ‖x − x⁰‖ ≤ b}.

Then the Cauchy problem (2.11) and (2.12) admits at least one solution on the interval

I := [t₀ − δ, t₀ + δ],  δ := min{a, b/M},  M := sup_{(t,x)∈∆} ‖f(t, x)‖.
Proof We will prove the existence on the interval [t₀, t₀ + δ]. The existence on [t₀ − δ, t₀] follows by a similar argument.
Fix ε > 0. Since f is uniformly continuous on ∆, there exists an η(ε) > 0 such that

‖f(t, x) − f(s, y)‖ ≤ ε,

for any (t, x), (s, y) ∈ ∆ such that

|t − s| ≤ η(ε),  ‖x − y‖ ≤ η(ε).

Consider the uniform subdivision t₀ < t₁ < · · · < t_{N(ε)} = t₀ + δ, where t_j = t₀ + j h_ε, for j = 0, . . . , N(ε), and N(ε) is chosen large enough so that

h_ε = δ/N(ε) ≤ min{η(ε), η(ε)/M}.    (2.26)
We consider the polygonal line, that is, the piecewise linear function ϕ_ε : [t₀, t₀ + δ] → Rⁿ defined by

ϕ_ε(t₀) = x⁰,
ϕ_ε(t) = ϕ_ε(t_j) + (t − t_j) f(t_j, ϕ_ε(t_j)),  t_j < t ≤ t_{j+1}.    (2.27)

Notice that if t ∈ [t₀, t₀ + δ], then

‖ϕ_ε(t) − x⁰‖ ≤ Mδ ≤ b.

Thus (t, ϕ_ε(t)) ∈ ∆, ∀t ∈ [t₀, t₀ + δ], so that equalities (2.27) are consistent. Equalities (2.27) also imply the estimates

‖ϕ_ε(t) − ϕ_ε(s)‖ ≤ M|t − s|,  ∀t, s ∈ [t₀, t₀ + δ].    (2.28)
In particular, inequality (2.28) shows that the family of functions (ϕ_ε)_{ε>0} is uniformly bounded and equicontinuous on the interval [t₀, t₀ + δ]. Arzelà's theorem (see Appendix A.3) shows that there exist a continuous function ϕ : [t₀, t₀ + δ] → Rⁿ and a subsequence (ϕ_{ε_ν}), ε_ν ↘ 0, such that

lim_{ν→∞} ϕ_{ε_ν}(t) = ϕ(t)  uniformly on [t₀, t₀ + δ].    (2.29)
We will prove that ϕ(t) is a solution of the Cauchy problem (2.11) and (2.12).
With this goal in mind, we consider the sequence of functions

g_{ε_ν}(t) := ϕ′_{ε_ν}(t) − f(t, ϕ_{ε_ν}(t)),  t ≠ t_j^ν,
g_{ε_ν}(t) := 0,  t = t_j^ν,  j = 0, 1, . . . , N(ε_ν),    (2.30)

where t_j^ν, j = 0, 1, . . . , N(ε_ν), are the nodes of the subdivision corresponding to ε_ν. Equality (2.27) implies that
ϕ′_{ε_ν}(t) = f(t_j^ν, ϕ_{ε_ν}(t_j^ν)),  ∀t ∈ ]t_j^ν, t_{j+1}^ν[,

and thus, invoking (2.26), we deduce that

‖g_{ε_ν}(t)‖ ≤ ε_ν,  ∀t ∈ [t₀, t₀ + δ].    (2.31)
On the other hand, the function g_{ε_ν}, though discontinuous, is Riemann integrable on [t₀, t₀ + δ] since its set of discontinuity points, {t_j^ν}_{0≤j≤N(ε_ν)}, is finite. Integrating both sides of (2.30), we get

ϕ_{ε_ν}(t) = x⁰ + ∫_{t₀}^t f(s, ϕ_{ε_ν}(s)) ds + ∫_{t₀}^t g_{ε_ν}(s) ds,  ∀t ∈ [t₀, t₀ + δ].    (2.32)
Since f is continuous on ∆ and ϕ_{ε_ν} converge uniformly on [t₀, t₀ + δ], we have

lim_{ν→∞} f(s, ϕ_{ε_ν}(s)) = f(s, ϕ(s))  uniformly in s ∈ [t₀, t₀ + δ].

Invoking (2.31), we can pass to the limit in (2.32) and we obtain the equality

ϕ(t) = x⁰ + ∫_{t₀}^t f(s, ϕ(s)) ds.

In other words, the function ϕ(t) is a solution of the Cauchy problem (2.11) and (2.12). This completes the proof of Theorem 2.5.
An alternative proof. Consider a sequence f_ε : ∆ → Rⁿ of continuously differentiable functions such that

‖f_ε(t, x) − f_ε(t, y)‖ ≤ L_ε ‖x − y‖,  ∀(t, x), (t, y) ∈ ∆,    (2.33)

lim_{ε→0} ‖f_ε(t, x) − f(t, x)‖ = 0,  uniformly on ∆.    (2.34)

(An example of such an approximation f_ε of f is

f_ε(t, x) = ε⁻ⁿ ∫_{{y; ‖y−x⁰‖≤b}} f(t, y) ρ((x − y)/ε) dy,  ∀(t, x) ∈ ∆,

where ρ : Rⁿ → R is a differentiable function such that ∫_{Rⁿ} ρ(x) dx = 1 and ρ(x) = 0 for ‖x‖ ≥ 1.) Then, by Theorem 2.3, the Cauchy problem

dx/dt(t) = f_ε(t, x(t)),  t ∈ [t₀ − δ, t₀ + δ],
x(t₀) = x⁰,    (2.35)
has a unique solution x_ε on the interval [t₀ − δ, t₀ + δ]. By (2.34) and (2.35), it follows that

‖(d/dt)x_ε(t)‖ ≤ C,  ∀t ∈ [t₀ − δ, t₀ + δ], ∀ε > 0,

where C is independent of ε. This implies that the family of functions {x_ε} is uniformly bounded and equicontinuous on [t₀ − δ, t₀ + δ] and so, by Arzelà's theorem, there is a subsequence {x_{εₙ}} which converges uniformly, as εₙ → 0, to a continuous function x : [t₀ − δ, t₀ + δ] → Rⁿ. Then, by (2.34) and (2.35), it follows that

x(t) = x⁰ + ∫_{t₀}^t f(s, x(s)) ds,  ∀t ∈ [t₀ − δ, t₀ + δ],

and so x is a solution to the Cauchy problem (2.11) and (2.12).
Remark 2.3 (Nonuniqueness in the Cauchy problem) We cannot deduce the uniqueness of the solution from the above proof, since the family (ϕ_ε)_{ε>0} may contain several uniformly convergent subsequences, each with its own limit. In general, assuming only the continuity of f, we cannot expect uniqueness in the Cauchy problem. One example of nonuniqueness is offered by the Cauchy problem

x′ = x^{1/3},  x(0) = 0.    (2.36)

This equation has an obvious solution x(t) = 0, ∀t. On the other hand, as easily seen, the function

ϕ(t) = (2t/3)^{3/2} for t ≥ 0,  ϕ(t) = 0 for t < 0,

is also a solution of (2.36).
Remark 2.4 (Numerical approximations) If, in Theorem 2.5, we assume that f = f(t, x) is Lipschitz in x, then according to Theorem 2.3 the Cauchy problem (2.19)–(2.20) has a unique solution. Thus, necessarily,

lim_{ε↘0} ϕ_ε(t) = ϕ(t)  uniformly on [t₀ − δ, t₀ + δ],    (2.37)

because any sequence of the family (ϕ_ε)_{ε>0} contains a subsequence that converges uniformly to ϕ(t). Thus, the above procedure leads to a numerical approximation scheme for the solution of the Cauchy problem (2.19)–(2.20), or equivalently (2.11) and (2.12). If h is fixed, h = δ/N, and

t_j := t₀ + jh,  j = 0, 1, . . . , N,
then we compute the approximations ϕ_j of the values of ϕ(t) at the nodes t_j using (2.27), that is,

ϕ_{j+1} = ϕ_j + h f(t_j, ϕ_j),  j = 0, 1, . . . , N − 1.    (2.38)

The iterative formulae (2.38) are known in numerical analysis as the Euler scheme and they form the basis of an important class of numerical methods for solving the Cauchy problem. Equalities (2.38) are also known as difference equations.
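A minimal implementation of the scheme (2.38) (a sketch with made-up data, not the book's code); here it is applied to the Cauchy problem x′ = x² − 1, x(0) = 2, which reappears in Sect. 2.5:

```python
import numpy as np

def euler_scheme(f, t0, x0, delta, N):
    """Euler scheme (2.38): phi_{j+1} = phi_j + h f(t_j, phi_j), h = delta/N.
    Returns the nodes t_j and the approximations phi_j."""
    h = delta / N
    t = t0 + h * np.arange(N + 1)
    phi = np.empty((N + 1,) + np.shape(x0))
    phi[0] = x0
    for j in range(N):
        phi[j + 1] = phi[j] + h * np.asarray(f(t[j], phi[j]))
    return t, phi

# Assumed example: x' = x**2 - 1, x(0) = 2 (cf. Fig. 2.2 in Sect. 2.5)
t, phi = euler_scheme(lambda t, x: x**2 - 1, 0.0, 2.0, delta=0.5, N=1000)
print(phi[-1])    # the approximation grows rapidly as t nears (1/2) log 3
```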
2.5 Global Existence and Uniqueness
We consider the system of differential equations described in vector notation by
x ′ = f (t, x),
(2.39)
where the function f : Ω → Rn is continuous on the open subset Ω ⊂ Rn+1 . Additionally, we will assume that f is locally Lipschitz in x on Ω, that is, for any compact
set K ⊂ Ω, there exists an L K > 0 such that
‖f(t, x) − f(t, y)‖ ≤ L_K ‖x − y‖,  ∀(t, x), (t, y) ∈ K.    (2.40)
If A, B ⊂ Rᵐ, then the distance between them is defined by

dist(A, B) = inf{‖a − b‖; a ∈ A, b ∈ B}.
It is useful to remark that, if K is a compact subset of Ω, then the distance dist(K , ∂Ω)
from K to the boundary ∂Ω of Ω is strictly positive. Indeed, suppose that (x ν ) is a
sequence in K and ( yν ) is a sequence in ∂Ω such that
lim_{ν→∞} ‖x_ν − y_ν‖ = dist(K, ∂Ω).    (2.41)
Since K is compact, the sequence (x ν ) is bounded. Using (2.41), we deduce that the
sequence ( yν ) is also bounded. The Bolzano–Weierstrass theorem now implies that
there exist subsequences (x_{ν_k}) and (y_{ν_k}) converging to x₀ and y₀, respectively. Since both K and ∂Ω are closed, we deduce that x₀ ∈ K, y₀ ∈ ∂Ω, and

‖x₀ − y₀‖ = lim_{k→∞} ‖x_{ν_k} − y_{ν_k}‖ = dist(K, ∂Ω).
Since K ∩ ∂Ω = ∅, we conclude that dist(K , ∂Ω) > 0.
Returning to the differential system (2.39), consider (t₀, x⁰) ∈ Ω and a parallelepiped ∆ ⊂ Ω of the form

∆ = ∆_{a,b} := {(t, x) ∈ Rⁿ⁺¹; |t − t₀| ≤ a, ‖x − x⁰‖ ≤ b}.

(Since Ω is open, ∆_{a,b} ⊂ Ω if a and b are sufficiently small.)
Applying Theorem 2.3 to system (2.39) restricted to ∆, we deduce the existence and uniqueness of a solution x = ϕ(t) satisfying the initial condition ϕ(t₀) = x⁰ and defined on an interval [t₀ − δ, t₀ + δ], where

δ = min{a, b/M},  M = sup_{(t,x)∈∆} ‖f(t, x)‖.
In other words, we have the following local existence result.
Theorem 2.6 Let Ω ⊂ Rn+1 be an open set and assume that the function f =
f (t, x) : Ω → Rn is continuous and locally Lipschitz as a function of x. Then for
any (t0 , x 0 ) ∈ Ω there exists a unique solution x(t) = x(t; t0 , x 0 ) of (2.39) defined
on a neighborhood of t0 and satisfying the initial condition
x(t; t₀, x⁰)|_{t=t₀} = x⁰.
We must emphasize the local character of the above result. As mentioned earlier in Remark 2.2, both the existence and the uniqueness for the Cauchy problem hold only in a neighborhood of the initial moment t₀. However, we expect the uniqueness
to have a global nature, that is, if two solutions x = x(t) and y = y(t) of (2.39) are
equal at a point t0 , then they should coincide on the common interval of existence.
(Their equality on a neighborhood of t0 follows from the local uniqueness result.)
The next theorem, which is known in the literature as the global uniqueness
theorem, states that global uniqueness holds under the assumptions of Theorem 2.6.
Theorem 2.7 Assume that f : Ω → Rn satisfies the assumptions in Theorem 2.6.
If x, y are two solutions of (2.39) defined on the open intervals I and J , respectively,
and if x(t0 ) = y(t0 ) for some t0 ∈ I ∩ J , then x(t) = y(t), ∀t ∈ I ∩ J .
Proof Let (t₁, t₂) = I ∩ J. We will prove that x(t) = y(t), ∀t ∈ [t₀, t₂). The equality to the left of t₀ is proved in a similar fashion. Let

𝒯 := {τ ∈ [t₀, t₂); x(t) = y(t), ∀t ∈ [t₀, τ]}.

Then 𝒯 ≠ ∅ and we set T := sup 𝒯. We claim that T = t₂.
To prove the claim, we argue by contradiction. Assume that T < t₂. Then x(t) = y(t), ∀t ∈ [t₀, T], and since x(t) and y(t) are both solutions of (2.39), we deduce from Theorem 2.6 that there exists an ε > 0 such that x(t) = y(t), ∀t ∈ [T, T + ε]. This contradicts the maximality of T and concludes the proof of the theorem.
Remark 2.5 If the function f : Ω → Rⁿ is of class C^{k−1} on the domain Ω, then, obviously, the local solution of system (2.39) is of class C^k on the interval on which it is defined. Moreover, if f is real analytic on Ω, that is, it is C^∞ and the Taylor series of f at any point (t₀, x⁰) ∈ Ω converges to f in a neighborhood of that point, then any solution x of (2.39) is also real analytic.
This follows by direct computation from equations (2.39), using the fact that a real function g = g(x₁, . . . , x_m) defined on a domain D of Rᵐ is real analytic if and only if, for any compact set K ⊂ D, there exists a positive constant M = M(K) such that for any multi-index α = (α₁, . . . , α_m) ∈ Z^m_{≥0} we have

|∂^{|α|} g(x)/∂x₁^{α₁} · · · ∂x_m^{α_m}| ≤ M^{|α|} α!,  ∀x = (x₁, . . . , x_m) ∈ K,

where

|α| := α₁ + · · · + α_m,  α! := α₁! · · · α_m!.
A solution x = ϕ(t) of (2.39) defined on the interval I = [a, b] is called extendible if there exists a solution ψ(t) of (2.39), defined on an interval J that strictly contains I, such that ϕ = ψ on I. The solution ϕ is called right-extendible if there exist b′ > b and a solution ψ of (2.39), defined on [a, b′], such that ψ = ϕ on [a, b]. The notion of left-extendible solutions is defined analogously. A solution that is not extendible is called saturated. In other words, a solution ϕ defined on an interval I is saturated if I is its maximal domain of existence. Similarly, a solution that is not right-extendible (respectively left-extendible) is called right-saturated (respectively left-saturated).
Theorem 2.6 implies that a maximal interval on which a saturated solution is
defined must be an open interval. If a solution ϕ is right-saturated, then the interval
on which it is defined is open on the right. Similarly, if a solution ϕ is left-saturated,
then the interval on which it is defined is open on the left.
Indeed, if ϕ : [a, b) → Rⁿ is a solution of (2.39) defined on an interval that is not open on the left, then Theorem 2.6 implies that there exists a solution ϕ̃(t) defined on an interval [a − δ, a + δ] and satisfying the initial condition ϕ̃(a) = ϕ(a). The local uniqueness theorem implies that ϕ̃ = ϕ on [a, a + δ] and thus the function

ϕ₀(t) = ϕ(t) for t ∈ [a, b),  ϕ₀(t) = ϕ̃(t) for t ∈ [a − δ, a],

is a solution of (2.39) on [a − δ, b) that extends ϕ, showing that ϕ is not left-saturated.
As an illustration, consider the ODE

x′ = x² + 1,

with the initial condition x(t₀) = x₀. This is a separable ODE and we find that

x(t) = tan(t − t₀ + arctan x₀).

It follows that, on the right, the maximal existence interval is [t₀, t₀ + π/2 − arctan x₀), while on the left, the maximal existence interval is (t₀ − π/2 − arctan x₀, t₀]. Thus, the saturated solution is defined on the interval (t₀ − π/2 − arctan x₀, t₀ + π/2 − arctan x₀).
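This maximal interval can also be observed numerically (an illustration assuming SciPy is available; not from the book): integrating until the solution escapes a large ball, the escape time approaches t₀ + π/2 − arctan x₀:

```python
import numpy as np
from scipy.integrate import solve_ivp

t0, x0 = 0.0, 1.0
blowup = lambda t, x: abs(x[0]) - 1e8     # event: |x| reaches 1e8
blowup.terminal = True                    # stop the integration there

sol = solve_ivp(lambda t, x: x**2 + 1, (t0, t0 + 2.0), [x0],
                events=blowup, rtol=1e-10, atol=1e-10)
print(sol.t_events[0][0], t0 + np.pi / 2 - np.arctan(x0))  # nearly equal
```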
Our next result characterizes the right-saturated solutions. In the remainder of
this section, we will assume that Ω ⊂ Rn+1 is an open subset and f : Ω → Rn is a
continuous map that is also locally Lipschitz in the variable x ∈ Rn .
Theorem 2.8 Let ϕ : [t₀, t₁) → Rⁿ be a solution to system (2.39). Then the following are equivalent.
(i) The solution ϕ is right-extendible.
(ii) The graph of ϕ,

Γ := {(t, ϕ(t)); t ∈ [t₀, t₁)},

is contained in a compact subset of Ω.
Proof (i) ⇒ (ii). Assume that ϕ is right-extendible. Thus, there exists a solution ψ(t) of (2.39) defined on an interval [t₀, t₁ + δ), δ > 0, such that

ψ(t) = ϕ(t),  ∀t ∈ [t₀, t₁).

In particular, it follows that Γ is contained in Γ̃, the graph of the restriction of ψ to [t₀, t₁]. Now, observe that Γ̃ is a compact subset of Ω because it is the image of the compact interval [t₀, t₁] under the continuous map t ↦ (t, ψ(t)).
(ii) ⇒ (i) Assume that Γ ⊂ K , where K is a compact subset of Ω. We will prove that
ϕ(t) can be extended to a solution of (2.39) on an interval of the form [t0 , t1 + δ],
for some δ > 0.
Since ϕ(t) is a solution, we have

ϕ(t) = ϕ(t₀) + ∫_{t₀}^t f(s, ϕ(s)) ds,  ∀t ∈ [t₀, t₁).

We deduce that

‖ϕ(t) − ϕ(t′)‖ ≤ |∫_{t′}^t ‖f(s, ϕ(s))‖ ds| ≤ M_K |t − t′|,  ∀t, t′ ∈ [t₀, t₁),

where M_K := sup_{(s,x)∈K} ‖f(s, x)‖. Cauchy's characterization of convergence now shows that ϕ(t) has a (finite) limit as t ↗ t₁ and we set

ϕ(t₁) := lim_{t↗t₁} ϕ(t).

We have thus extended ϕ to a continuous function on [t₀, t₁] that we continue to denote by ϕ. The continuity of f implies that

ϕ′(t₁ − 0) = lim_{t↗t₁} ϕ′(t) = lim_{t↗t₁} f(t, ϕ(t)) = f(t₁, ϕ(t₁)).    (2.42)
On the other hand, according to Theorem 2.6, there exists a solution ψ(t) of (2.39) defined on an interval [t₁ − δ, t₁ + δ] and satisfying the initial condition ψ(t₁) = ϕ(t₁). Consider the function

ϕ̃(t) = ϕ(t) for t ∈ [t₀, t₁],  ϕ̃(t) = ψ(t) for t ∈ (t₁, t₁ + δ].

Obviously, by (2.42),

ϕ̃′(t₁ + 0) = ψ′(t₁) = f(t₁, ψ(t₁)) = f(t₁, ϕ(t₁)) = ϕ′(t₁ − 0).

This proves that ϕ̃ is C¹ and satisfies the differential equation (2.39). Clearly, ϕ̃ extends ϕ to the right.
The next result shows that any solution can be extended to a saturated solution.
Theorem 2.9 Any solution ϕ of (2.39) admits a unique extension to a saturated
solution.
Proof The uniqueness is a consequence of Theorem 2.7 on global uniqueness. To prove the extendibility to a saturated solution, we will limit ourselves to proving the extendibility to a right-saturated solution.
We denote by A the set of all solutions ψ of (2.39) that extend ϕ to the right. The set A is ordered by the inclusion of the domains of definition of the solutions ψ and, gluing the elements of A along their common domains of definition, we obtain an upper bound ϕ̃ of A. This is a right-saturated solution of (2.39).
We will next investigate the behavior of the saturated solutions of (2.39) in a neighborhood of the boundary ∂Ω of the domain Ω where (2.39) is defined. For simplicity, we only discuss the case of right-saturated solutions. The case of left-saturated solutions is identical.
Theorem 2.10 Let ϕ(t) be a right-saturated solution of (2.39) defined on the interval [t₀, T). Then any limit point as t ↗ T of the graph

Γ := {(t, ϕ(t)); t₀ ≤ t < T}

is either the point at infinity of Rⁿ⁺¹, or a point on ∂Ω.
Proof The theorem states that, if (τ_ν) is a sequence in [t₀, T) such that the limit lim_{ν→∞}(τ_ν, ϕ(τ_ν)) exists, then
(i) either T = ∞,
(ii) or T < ∞ and lim_{ν→∞} ‖ϕ(τ_ν)‖ = ∞,
(iii) or T < ∞, x* = lim_{ν→∞} ϕ(τ_ν) ∈ Rⁿ and (T, x*) ∈ ∂Ω.
We argue by contradiction. Assume that all three options are violated. Since (i), (ii)
do not hold, we deduce that T < ∞ and that the limit limν→∞ ϕ(τν ) exists and is
a point x* ∈ Rⁿ. Since (iii) is also violated, we deduce that (T, x*) ∈ Ω. Thus, for r > 0 sufficiently small, the closed ball

S := {(t, x) ∈ Rⁿ⁺¹; |t − T| ≤ r, ‖x − x*‖ ≤ r}

is contained in Ω; see Fig. 2.1.

Fig. 2.1 The behavior of a right-saturated solution (the graph x = ϕ(t) inside Ω, the ball S around (T, x*), and the compact set K)
If η := dist(S, ∂Ω) > 0, we deduce that for any (s₀, y₀) ∈ S the parallelepiped

∆ := {(t, x) ∈ Rⁿ⁺¹; |t − s₀| ≤ η/4, ‖x − y₀‖ ≤ η/4}    (2.43)

is contained in the compact subset of Ω

K := {(t, x) ∈ Rⁿ⁺¹; |t − T| ≤ r + η/2, ‖x − x*‖ ≤ r + η/2}.
(See Fig. 2.1.) We set
δ := min{η/4, η/(4M)},  M := sup_{(t,x)∈K} ‖f(t, x)‖.
Appealing to the existence and uniqueness theorem (Theorem 2.3), with ∆ defined in (2.43), it follows that, for any (s₀, y₀) ∈ S, there exists a unique solution ψ_{s₀,y₀}(t) of (2.39) defined on the interval [s₀ − δ, s₀ + δ] and satisfying the initial condition ψ_{s₀,y₀}(s₀) = y₀.
Fix ν sufficiently large so that

(τ_ν, ϕ(τ_ν)) ∈ S and |τ_ν − T| ≤ δ/2,

and define y_ν := ϕ(τ_ν),
ϕ̃(t) := ϕ(t) for t₀ ≤ t ≤ τ_ν,  ϕ̃(t) := ψ_{τ_ν,y_ν}(t) for τ_ν < t ≤ τ_ν + δ.

Then ϕ̃(t) is a solution of (2.39) defined on the interval [t₀, τ_ν + δ]. This interval strictly contains the interval [t₀, T] and ϕ̃ = ϕ on [t₀, T). This contradicts our assumption that ϕ is a right-saturated solution, and completes the proof of Theorem 2.10.
Theorem 2.11 Let Ω = Rⁿ⁺¹ and ϕ(t) be a right-saturated solution of (2.39) defined on [0, T). Then only the following two options are possible:
(i) either T = ∞,
(ii) or T < ∞ and lim_{t↗T} ‖ϕ(t)‖ = ∞.
Proof From Theorem 2.10 it follows that any limit point as t ↗ T of the graph Γ of ϕ is the point at infinity. If T < ∞, then necessarily

lim_{t↗T} ‖ϕ(t)‖ = ∞.
Theorems 2.10 and 2.11 are useful in determining the maximal existence interval
of a solution. Loosely speaking, Theorem 2.11 states that a solution ϕ is either defined
on the whole positive semi-axis, or it “blows up” in finite time. This phenomenon is
commonly referred to as the finite-time blowup phenomenon.
To illustrate Theorem 2.11, we depict in Fig. 2.2 the graph of the saturated solution of the Cauchy problem

x′ = x² − 1,  x(0) = 2.

Its maximal existence interval on the right is [0, T), T = (1/2) log 3.
In the following examples, we describe other applications of these theorems.
Fig. 2.2 A finite-time blowup

Example 2.1 Consider the ODE

x′ = f(x),    (2.44)
where f : Rⁿ → Rⁿ is locally Lipschitz and satisfies

(f(x), x) ≤ γ₁‖x‖²_e + γ₂,  ∀x ∈ Rⁿ,    (2.45)

where γ₁, γ₂ ∈ R. (Here (·, ·) is the Euclidean scalar product on Rⁿ and ‖·‖_e is the Euclidean norm.) According to the existence and uniqueness theorem, for any
(t₀, x⁰) ∈ Rⁿ⁺¹ there exists a unique solution ϕ(t) = x(t; t₀, x⁰) of (2.44) satisfying ϕ(t₀) = x⁰ and defined on a maximal interval [t₀, T). We want to prove that under the above assumptions we have T = ∞.
To show this, we multiply scalarly both sides of (2.44) by ϕ(t). Using (2.45), we deduce that

(1/2)(d/dt)‖ϕ(t)‖²_e = (ϕ(t), ϕ′(t)) = (f(ϕ(t)), ϕ(t)) ≤ γ₁‖ϕ(t)‖²_e + γ₂,  ∀t ∈ [t₀, T),

and, therefore,

‖ϕ(t)‖²_e ≤ ‖ϕ(t₀)‖²_e + 2γ₁ ∫_{t₀}^t ‖ϕ(s)‖²_e ds + 2γ₂ T,  ∀t ∈ [t₀, T).

Then, by Gronwall's lemma (Lemma 1.1), we get

‖ϕ(t)‖²_e ≤ (‖ϕ(t₀)‖²_e + 2γ₂ T) exp(2γ₁ T),  ∀t ∈ [t₀, T).

Thus, the solution ϕ(t) is bounded, and so there is no blowup: T = ∞.
It should be noted that, in particular, condition (2.45) holds if f is globally Lipschitz on Rn .
Example 2.2 Consider the Riccati equation

x′ = a(t)x + b(t)x² + c(t),    (2.46)

where a, b, c : [0, ∞) → R are continuous functions. We associate with (2.46) the Cauchy condition

x(t₀) = x₀,    (2.47)

where t₀ ≥ 0. We will prove the following result.
If x₀ ≥ 0 and

b(t) ≤ 0,  c(t) ≥ 0,  ∀t ≥ 0,

then the Cauchy problem (2.46)–(2.47) admits a unique solution x = ϕ(t) defined on the semi-axis [t₀, ∞). Moreover, ϕ(t) ≥ 0, ∀t ≥ t₀.
We begin by proving the result under the stronger assumption

c(t) > 0,  ∀t ≥ 0.
Let ϕ(t) denote the right-saturated solution of (2.46) and (2.47). It is defined on a maximal interval [t₀, T). We will first prove that

ϕ(t) ≥ 0,  ∀t ∈ [t₀, T).    (2.48)

Note that, if x₀ = 0, then ϕ′(t₀) = c(t₀) > 0, so ϕ(t) > 0 for t in a small interval [t₀, t₀ + δ], δ > 0. This reduces the problem to the case when the initial condition is positive. Assume, therefore, that x₀ > 0. There exists a maximal interval [t₀, T₁) ⊂ [t₀, T) on which ϕ(t) is nonnegative. Clearly, either T₁ = T, or T₁ < T and ϕ(T₁) = 0. If T₁ < T, then arguing as above we can extend ϕ past T₁ while keeping it nonnegative. This contradicts the maximality of T₁, thus proving (2.48).
To prove that T = ∞, we will rely on Theorem 2.11 and show that ϕ(t) cannot blow up in finite time. Using the equality

ϕ′(t) = a(t)ϕ(t) + b(t)ϕ(t)² + c(t),  ∀t ∈ [t₀, T),

and the inequalities b(t) ≤ 0, ϕ(t) ≥ 0, we deduce that

|ϕ(t)| = ϕ(t) ≤ ϕ(t₀) + ∫_{t₀}^t c(s) ds + ∫_{t₀}^t |a(s)| |ϕ(s)| ds,

where we set β(t) := ϕ(t₀) + ∫_{t₀}^t c(s) ds. We can invoke Gronwall's lemma to conclude that

|ϕ(t)| ≤ β(t) + ∫_{t₀}^t β(s)|a(s)| exp(∫_s^t |a(τ)| dτ) ds.

The function on the right-hand side of the above inequality is continuous on [t₀, ∞), showing that ϕ(t) cannot blow up in finite time. Hence T = ∞.
To deal with the general case, when c(t) ≥ 0, ∀t ≥ 0, we consider the equation
x ′ = a(t)x + b(t)x 2 + c(t) + ε,
(2.49)
where ε > 0. According to the results proven so far, this equation has a unique solution x_ε(t; t₀, x₀) satisfying (2.47) and defined on [t₀, ∞).
Denote by x(t; t₀, x₀) the right-saturated solution of (2.46) and (2.47), defined on a maximal interval [t₀, T). According to the forthcoming Theorem 2.15, we have

lim_{ε↘0} x_ε(t; t₀, x₀) = x(t; t₀, x₀),  ∀t ∈ [t₀, T).
We conclude that x(t; t0 , x0 ) ≥ 0, ∀t ∈ [t0 , T ). Using Gronwall’s lemma as before,
we deduce that x(t; t0 , x0 ) cannot blow up in finite time, and thus T = ∞.
Example 2.3 Let A be a real n × n matrix and Q, X₀ be two real symmetric, nonnegative definite n × n matrices. We recall that a symmetric n × n matrix S is called nonnegative definite if

(Sv, v) ≥ 0,  ∀v ∈ Rⁿ,

where (·, ·) denotes the canonical scalar product on Rⁿ. The symmetric matrix S is called positive definite if

(Sv, v) > 0,  ∀v ∈ Rⁿ \ {0}.

We denote by A* the adjoint (transpose) of A and we consider the matrix differential equation

X′(t) + A*X(t) + X(t)A + X(t)² = Q,    (2.50)

together with the initial condition

X(t₀) = X₀.    (2.51)
By a solution of Eq. (2.50), we understand a matrix-valued map

X : I → R^{n²},  t ↦ X(t) = (x_{ij}(t))_{1≤i,j≤n},

of class C¹ that satisfies (2.50) everywhere on I. Thus (2.50) is a system of ODEs involving n² unknown functions. When n = 1, Eq. (2.50) reduces to (2.46). Equation (2.50) is called the matrix-valued Riccati type equation and it plays an important role in the theory of control systems with quadratic cost functions. In such problems, one is interested in finding global solutions X(t) of (2.50) such that X(t) is symmetric and nonnegative definite for any t. (See Eq. (5.114).)
Theorem 2.12 Under the above assumptions, the Cauchy problem (2.50) and (2.51)
admits a unique solution X = X (t) defined on the semi-axis [t0 , ∞). Moreover, X (t)
is symmetric and nonnegative definite for any t ≥ t0 .
Proof From Theorem 2.3, we deduce the existence and uniqueness of a right-saturated solution of this Cauchy problem defined on a maximal interval [t₀, T). Taking the adjoints of both sides of (2.50) and using the fact that X₀ and Q are symmetric matrices, we deduce that X*(t) is also a solution of the same Cauchy problem (2.50) and (2.51). This proves that X(t) = X*(t), that is, X(t) is symmetric for any t ∈ [t₀, T).
Let us prove that
(i) the matrix X (t) is also nonnegative definite for any t ∈ [t0 , T ), and
(ii) T = ∞.
We distinguish two cases.
1. The matrix Q is positive definite. Recall that X (t0 ) = X 0 is nonnegative definite.
We set
T′ := sup{τ ∈ [t₀, T); X(t) ≥ 0, ∀t ∈ [t₀, τ)}.

We have to prove that T′ = T. If T′ < T, then X(T′) is nonnegative definite and there exist sequences (t_k) in (T′, T) and (v_k) in Rⁿ with the following properties:
• lim_{k→∞} t_k = T′;
• ‖v_k‖_e = 1 and (X(t_k)v_k, v_k) < 0, ∀k, where ‖·‖_e is the standard Euclidean norm on Rⁿ;
• there exists v* ∈ Rⁿ such that v* = lim_{k→∞} v_k and X(T′)v* = 0.
For each v ∈ Rⁿ, we define the functions

ϕ_v, ψ_v : [t₀, T) → R,  ϕ_v(t) = (X(t)v, v),  ψ_v(t) = (X(t)v, Av).

Since X(t) is symmetric, from (2.50) we see that ϕ_v(t) satisfies the ODE

ϕ_v′(t) = −2ψ_v(t) − ‖X(t)v‖²_e + (Qv, v).    (2.52)

Moreover, we have

ϕ_{v*}(T′) = ψ_{v*}(T′) = 0

and

ϕ′_{v*}(T′) = (Qv*, v*) > 0.    (2.53)
Using the mean value theorem, we deduce that for any k there exists an s_k ∈ (T′, t_k) such that

ϕ′_{v_k}(s_k) = (ϕ_{v_k}(t_k) − ϕ_{v_k}(T′))/(t_k − T′).

We note that, by the definition of T′, ϕ_{v_k}(T′) ≥ 0. Since ϕ_{v_k}(t_k) < 0, we deduce that ϕ′_{v_k}(s_k) < 0. Observing that

lim_{k→∞} ϕ′_{v_k}(s_k) = lim_{k→∞} (X′(s_k)v_k, v_k) = (X′(T′)v*, v*) = ϕ′_{v*}(T′),

we deduce that ϕ′_{v*}(T′) ≤ 0. This contradicts (2.53) and proves that X(t) ≥ 0, ∀t ∈ [t₀, T).
According to Theorem 2.11, to prove that T = ∞ it suffices to show that for any v ∈ Rⁿ there exists a continuous function f_v : [t₀, ∞) → R such that

ϕ_v(t) ≤ f_v(t),  ∀t ∈ [t₀, T).

Fix v ∈ Rⁿ. Using the Cauchy–Schwarz inequality (Lemma A.4),

|ψ_v(t)| = |(X(t)v, Av)| ≤ ‖X(t)v‖_e · ‖Av‖_e =: g_v(t) · C_v.
Using this in (2.52), we get

ϕ_v′(t) ≤ 2C_v g_v(t) − g_v(t)² + (Qv, v)

and, therefore,

ϕ_v(t) ≤ f_v(t) := ϕ_v(t₀) + (t − t₀)(Qv, v) + ∫_{t₀}^t (2C_v g_v(s) − g_v(s)²) ds,    (2.54)

∀t ∈ [t₀, T). Since 2C_v g_v − g_v² ≤ C_v², the right-hand side of (2.54) is dominated by a function continuous on [t₀, ∞); this proves that T = ∞.
2. The matrix Q is only nonnegative definite. For any ε > 0, we set Q_ε := Q + ε1ₙ, where 1ₙ denotes the identity n × n matrix. Denote by X_ε(t) the right-saturated solution of the Cauchy problem

X′(t) + A*X(t) + X(t)A + X(t)² = Q_ε,  X_ε(t₀) = X₀.

According to the previous considerations, X_ε(t) is defined on [t₀, ∞) and it is nonnegative definite on this interval. Moreover, for any v ∈ Rⁿ, any ε > 0 and any t ≥ t₀, we have

(X_ε(t)v, v) ≤ f_v^ε(t) := (X₀v, v) + (t − t₀)(Q_ε v, v) + ∫_{t₀}^t (2C_v g_v^ε(s) − g_v^ε(s)²) ds,    (2.55)

where g_v^ε(t) := ‖X_ε(t)v‖_e.
From Theorem 2.15, we deduce that

lim_{ε↘0} X_ε(t) = X(t),  ∀t ∈ [t₀, T).

If we now let ε → 0 in (2.55), we deduce that

(X(t)v, v) ≤ f_v(t),  ∀t ∈ [t₀, T),

where f_v(t) is defined in (2.54). This implies that T = ∞.
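As a numerical illustration of Theorem 2.12 (a sketch under assumed random data, not part of the book), one can rewrite (2.50) as X′ = Q − A*X − XA − X² and integrate it, checking that the computed X(t) remains symmetric and nonnegative definite:

```python
import numpy as np
from scipy.integrate import solve_ivp

def riccati_rhs(t, x_flat, A, Q):
    """Right-hand side of (2.50) rewritten as X' = Q - A^T X - X A - X^2."""
    n = A.shape[0]
    X = x_flat.reshape(n, n)
    return (Q - A.T @ X - X @ A - X @ X).ravel()

n = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))          # arbitrary real matrix
B = rng.standard_normal((n, n))
Q = B @ B.T                              # symmetric nonnegative definite
X0 = np.zeros((n, n))                    # symmetric nonnegative definite

sol = solve_ivp(riccati_rhs, (0.0, 10.0), X0.ravel(), args=(A, Q), rtol=1e-8)
XT = sol.y[:, -1].reshape(n, n)
print(np.max(np.abs(XT - XT.T)))         # symmetry is preserved (~0)
print(np.min(np.linalg.eigvalsh(XT)))    # eigenvalues stay >= 0 (up to error)
```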
Example 2.4 (Dissipative systems of ODEs) As a final application, we consider
dissipative, autonomous differential systems, that is, systems of ordinary differential
equations of the form
x ′ = f (x),
(2.56)
where f : Rⁿ → Rⁿ is a continuous map satisfying the dissipativity condition

(f(x) − f(y), x − y) ≤ 0,  ∀x, y ∈ Rⁿ,    (2.57)
where, as usual, (−, −) denotes the canonical Euclidean scalar product on Rn . We
associate with (2.56) the initial condition
x(t0 ) = x 0 ,
(2.58)
where (t0 , x 0 ) is a given point in Rn+1 .
Mathematical models of a large class of physical phenomena, such as diffusion,
lead to dissipative differential systems. In the case n = 1, the monotonicity condition (2.57) is equivalent to the requirement that f be monotonically nonincreasing.
For dissipative systems, we have the following interesting existence and uniqueness
result.
Theorem 2.13 If the continuous map f : Rⁿ → Rⁿ is dissipative, then for any (t₀, x⁰) ∈ Rⁿ⁺¹ the Cauchy problem (2.56) and (2.58) admits a unique solution x = x(t; t₀, x⁰) defined on [t₀, ∞). Moreover, the map

S : [0, ∞) × Rⁿ → Rⁿ,  (t, x⁰) ↦ S(t)x⁰ := x(t; 0, x⁰),

satisfies the following properties:

S(0)x⁰ = x⁰,  ∀x⁰ ∈ Rⁿ,    (2.59)

S(t + s)x⁰ = S(t)S(s)x⁰,  ∀x⁰ ∈ Rⁿ, t, s ≥ 0,    (2.60)

‖S(t)x⁰ − S(t)y⁰‖_e ≤ ‖x⁰ − y⁰‖_e,  ∀t ≥ 0, x⁰, y⁰ ∈ Rⁿ.    (2.61)
Proof According to Peano's theorem, for any (t₀, x⁰) ∈ Rⁿ⁺¹ there exists a solution x = ϕ(t) to the Cauchy problem (2.56) and (2.58) defined on a maximal interval [t₀, T). To prove its uniqueness, we argue by contradiction and assume that this Cauchy problem admits another solution x = ϕ̃(t). On their common domain of existence [t₀, t₁), the functions ϕ and ϕ̃ satisfy the differential system

(ϕ(t) − ϕ̃(t))′ = f(ϕ(t)) − f(ϕ̃(t)).    (2.62)
Taking the scalar product of both sides of (2.62) with ϕ(t) − ϕ̃(t), we get, by Lemma A.6 and (2.57),

(1/2)(d/dt)‖ϕ(t) − ϕ̃(t)‖²_e = ((ϕ(t) − ϕ̃(t))′, ϕ(t) − ϕ̃(t))
= (f(ϕ(t)) − f(ϕ̃(t)), ϕ(t) − ϕ̃(t)) ≤ 0,  ∀t ∈ [t₀, t₁).    (2.63)

Thus

‖ϕ(t) − ϕ̃(t)‖²_e ≤ ‖ϕ(t₀) − ϕ̃(t₀)‖²_e,  ∀t ∈ [t₀, t₁).    (2.64)

This proves that ϕ = ϕ̃ on [t₀, t₁), since ϕ̃(t₀) = ϕ(t₀).
To prove that ϕ is defined on the entire semi-axis [t₀, ∞), we first prove that it is bounded on [t₀, T). To achieve this, we take the scalar product of

ϕ′(t) = f(ϕ(t))

with ϕ(t) and, using (2.57), we deduce that

(1/2)(d/dt)‖ϕ(t)‖²_e = (f(ϕ(t)), ϕ(t)) = (f(ϕ(t)) − f(0), ϕ(t)) + (f(0), ϕ(t))
≤ ‖f(0)‖_e · ‖ϕ(t)‖_e,  ∀t ∈ [t₀, T).

Integrating this inequality on [t₀, t] and setting u(t) := ‖ϕ(t)‖_e, C := ‖f(0)‖_e, we deduce that

(1/2)u(t)² ≤ (1/2)‖x⁰‖²_e + C ∫_{t₀}^t u(s) ds,  ∀t ∈ [t₀, T).

From Proposition 1.2 we deduce that

‖ϕ(t)‖_e = u(t) ≤ ‖x⁰‖_e + ‖f(0)‖_e (t − t₀),  ∀t ∈ [t₀, T).    (2.65)
Since we have not assumed that the function f is locally Lipschitz, we cannot invoke Theorems 2.10 or 2.11 directly. However, inequality (2.65) implies in a similar fashion the equality T = ∞. Here are the details.
We argue by contradiction and assume that T < ∞. Inequality (2.65) implies that there exist an increasing sequence (t_k) and v ∈ Rⁿ such that

lim_{k→∞} t_k = T,  lim_{k→∞} ϕ(t_k) = v.

According to the facts established so far, there exists a solution ψ of (2.56) defined on [T − δ, T + δ] and satisfying the initial condition ψ(T) = v.
On the interval [T − δ, T) we have

ϕ′(t) − ψ′(t) = f(ϕ(t)) − f(ψ(t)).

Taking the scalar product of this equality with ϕ(t) − ψ(t) and using the dissipativity condition (2.57), we deduce as before that

(1/2)(d/dt)‖ϕ(t) − ψ(t)‖²_e ≤ 0,  ∀t ∈ [T − δ, T).

Hence

‖ϕ(t) − ψ(t)‖²_e ≤ ‖ϕ(t_k) − ψ(t_k)‖²_e,  ∀t ∈ [t_k, T).
Since lim_{k→∞} ‖ϕ(t_k) − ψ(t_k)‖_e = 0, we conclude that ϕ(t) − ψ(t) → 0 as t ↗ T, so that ψ is a proper extension of the solution ϕ. This contradicts the maximality of the interval [t₀, T). Thus T = ∞.
To prove (2.60), we observe that both functions

y₁(t) = S(t + s)x⁰ and y₂(t) = S(t)S(s)x⁰

satisfy equation (2.56) and have identical values at t = 0. The uniqueness of the Cauchy problem for (2.56) now implies that y₁(t) = y₂(t), ∀t ≥ 0.
Inequality (2.61) now follows from (2.64), where ϕ̃(t) = x(t; 0, y⁰).
Remark 2.6 A family of maps S(t) : Rⁿ → Rⁿ, t ≥ 0, satisfying (2.59), (2.60) and (2.61) is called a continuous semigroup of contractions on the space Rⁿ. The function f : Rⁿ → Rⁿ is called the generator of the semigroup S(t).
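The contraction property (2.61) is easy to observe numerically. A sketch (with the assumed dissipative field f(x) = −x³, acting componentwise, and SciPy's integrator; not taken from the book):

```python
import numpy as np
from scipy.integrate import solve_ivp

# f(x) = -x**3 componentwise is dissipative: (f(x)-f(y), x-y) <= 0,
# since (a**3 - b**3)*(a - b) >= 0 for all real a, b.
f = lambda t, x: -x**3

x0 = np.array([2.0, -1.0])
y0 = np.array([1.5, -0.5])
ts = np.linspace(0.0, 5.0, 50)
solx = solve_ivp(f, (0.0, 5.0), x0, t_eval=ts, rtol=1e-9)
soly = solve_ivp(f, (0.0, 5.0), y0, t_eval=ts, rtol=1e-9)

dist = np.linalg.norm(solx.y - soly.y, axis=0)
print(np.all(np.diff(dist) <= 1e-9))   # distance is nonincreasing, cf. (2.61)
```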
2.6 Continuous Dependence on Initial Conditions and Parameters
We now return to the differential system (2.39) defined on the open subset Ω ⊂ Rn+1 .
We will assume as in the previous section that the function f : Ω → Rn is continuous
in the variables (t, x), and locally Lipschitz in the variable x. Theorem 2.6 shows
that for any (t0 , x 0 ) ∈ Ω there exists a unique solution x = x(t; t0 , x 0 ) of system
(2.39) that satisfies the initial condition x(t0 ) = x 0 . The solution x(t; t0 , x 0 ), which
we will assume to be saturated, is defined on an interval typically dependent on the
point (t0 , x 0 ). For simplicity, we will assume the initial moment t0 to be fixed.
It is reasonable to expect that, as v varies in a neighborhood of x⁰, the corresponding solution x(t; t₀, v) will not stray too far from the solution x(t; t₀, x⁰). The next theorem confirms that this is the case, in a rather precise form. To state this result, let us denote by B(x⁰, η) the closed ball of center x⁰ and radius η in Rⁿ, that is,

B(x⁰, η) := {v ∈ Rⁿ; ‖v − x⁰‖ ≤ η}.
Theorem 2.14 (Continuous dependence on initial data) Let [t₀, T) be the maximal interval of existence on the right of the solution x(t; t₀, x⁰) of (2.39). Then, for any T′ ∈ [t₀, T), there exists an η = η(T′) > 0 such that, for any v ∈ B(x⁰, η), the solution x(t; t₀, v) is defined on the interval [t₀, T′]. Moreover, the correspondence

B(x⁰, η) ∋ v ↦ x(·; t₀, v) ∈ C([t₀, T′]; Rⁿ)

is a continuous map from the ball B(x⁰, η) to the space C([t₀, T′]; Rⁿ) of continuous maps from [t₀, T′] to Rⁿ. In other words, for any sequence (v_k) in B(x⁰, η) that
converges to v ∈ B(x⁰, η), the sequence of functions x(t; t₀, v_k) converges uniformly on [t₀, T′] to x(t; t₀, v).

Fig. 2.3 Isolating a compact portion of an integral curve (the compact set K around the curve inside Ω)
Proof Fix T′ ∈ [t₀, T). The restriction to [t₀, T′] of x(t; t₀, x⁰) is continuous and, therefore, the graph of this restriction is compact. We can find an open set Ω′ whose closure Ω̄′ is compact and contained in Ω and such that

{(t, x(t; t₀, x⁰)); t₀ ≤ t ≤ T′} ⊂ Ω̄′,  dist(Ω̄′, ∂Ω) =: δ > 0.    (2.66)

We denote by K the compact subset of Ω defined by (see Fig. 2.3)

K := {(t, x) ∈ Ω; dist((t, x), Ω̄′) ≤ δ/2}.    (2.67)
For any (t₀, v) ∈ Ω′, there exists a maximal T̃ ∈ (t₀, T′] such that the solution x(t; t₀, v) exists for all t ∈ [t₀, T̃] and {(t, x(t; t₀, v)); t₀ ≤ t ≤ T̃} ⊂ K. On the interval [t₀, T̃], we have the equality

x(t; t₀, x⁰) − x(t; t₀, v) = x⁰ − v + ∫_{t₀}^t (f(s, x(s; t₀, x⁰)) − f(s, x(s; t₀, v))) ds.

Because the graphs of x(s; t₀, x⁰) and x(s; t₀, v) over [t₀, T̃] are contained in the compact set K, the locally Lipschitz assumption implies that there exists a constant L_K > 0 such that

‖x(t; t₀, x⁰) − x(t; t₀, v)‖ ≤ ‖x⁰ − v‖ + L_K ∫_{t₀}^t ‖x(s; t₀, x⁰) − x(s; t₀, v)‖ ds.
Gronwall's lemma now implies

‖x(t; t₀, x⁰) − x(t; t₀, v)‖ ≤ e^{L_K(t−t₀)} ‖x⁰ − v‖,  ∀t ∈ [t₀, T̃].    (2.68)
We can now prove that, given T′ ∈ [t₀, T), there exists an η = η(T′) > 0 such that, for any v ∈ B(x⁰, η),
(a) the solution x(t; t₀, v) is defined on [t₀, T′], and
(b) the graph of this solution is contained in K.
We argue by contradiction. We can find a sequence (v_j)_{j≥1} in Rⁿ such that
• ‖x⁰ − v_j‖ ≤ 1/j, ∀j ≥ 1, and
• the maximal closed interval [t₀, T_j], with the property that the graph of x(t; t₀, v_j) is contained in K, is a subinterval of the half-open interval [t₀, T′).
Using (2.68), we deduce that

‖x(t; t₀, x⁰) − x(t; t₀, v_j)‖ ≤ e^{L_K(t−t₀)}/j,  ∀t₀ ≤ t ≤ T_j.    (2.69)
Thus, if

j ≥ 4e^{L_K(T′−t₀)}/δ,

then the distance between the graph of x(t; t₀, v_j) and the graph of x(t; t₀, x⁰) over [t₀, T_j] is at most δ/4. Conditions (2.66) and (2.67) imply that

dist((t, x(t; t₀, v_j)), ∂K) ≥ δ/4,  ∀t ∈ [t₀, T_j].
We conclude that the function x(t; t0 , v j ) can be extended slightly to the right of T j
as a solution of (2.39) so that its graph continues to be inside K . This violates the
maximality of T j . This proves the existence of η(T ′ ) with the postulated properties
(a) and (b) above.
Consider now two solutions x(t; t₀, u), x(t; t₀, v), where u, v ∈ B(x⁰, η(T′)). For t ∈ [t₀, T′], we have

x(t; t₀, u) − x(t; t₀, v) = u − v + ∫_{t₀}^t (f(s, x(s; t₀, u)) − f(s, x(s; t₀, v))) ds.

Using the local Lipschitz condition, we deduce as before that

‖x(t; t₀, u) − x(t; t₀, v)‖ ≤ ‖u − v‖ + L_K ∫_{t₀}^t ‖x(s; t₀, u) − x(s; t₀, v)‖ ds    (2.70)

and, invoking Gronwall's lemma again, we obtain

‖x(t; t₀, u) − x(t; t₀, v)‖ ≤ e^{L_K(t−t₀)} ‖u − v‖,  ∀t ∈ [t₀, T′].    (2.71)
The last inequality proves the continuity of the mapping v → x(t; t0 , v) on the ball
B(x 0 , η(T ′ )). This completes the proof of Theorem 2.14.
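The Gronwall-type estimate (2.71) can be checked numerically on a simple example (illustrative choices, not from the book: f(t, x) = sin x, for which the Lipschitz constant is L = 1):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Two nearby initial data; (2.71) predicts
# |x(t; u) - x(t; v)| <= e^{L t} |u - v| with L = 1 for f = sin.
f = lambda t, x: np.sin(x)
u, v, T = np.array([0.30]), np.array([0.31]), 3.0
ts = np.linspace(0.0, T, 31)
xu = solve_ivp(f, (0, T), u, t_eval=ts, rtol=1e-10).y[0]
xv = solve_ivp(f, (0, T), v, t_eval=ts, rtol=1e-10).y[0]
print(np.all(np.abs(xu - xv) <= np.exp(ts) * abs(u[0] - v[0]) + 1e-12))
```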
Let us now consider the special case when system (2.39) is autonomous, that is,
the map f is independent of t.
More precisely, we assume that f : Rn → Rn is a locally Lipschitz function. One
should think of f as a vector field on Rn .
For any u ∈ Rⁿ, we set

S(t)u := x(t; 0, u),

where x(t; 0, u) is the unique saturated solution of the system

x′ = f(x),    (2.72)
satisfying the initial condition x(0) = u. Theorem 2.14 shows that for any x 0 ∈ Rn
there exists a T > 0 and a neighborhood U0 = B(x 0 , η) of x 0 such that S(t)u is well
defined for any u ∈ U0 and any |t| ≤ T . Moreover, the resulting maps
U0 ∋ u → S(t)u ∈ Rn
are continuous for any |t| ≤ T. From the local existence and uniqueness theorem, we deduce that the family of maps S(t) : U₀ → Rⁿ, −T ≤ t ≤ T, has the following properties:

S(0)u = u,  ∀u ∈ U₀,    (2.73)

S(t + s)u = S(t)S(s)u,  ∀s, t ∈ [−T, T] such that |t + s| ≤ T, S(s)u ∈ U₀,    (2.74)

lim_{t→0} S(t)u = u,  ∀u ∈ U₀.    (2.75)
The family of maps {S(t)}_{|t|≤T} is called the local flow or the continuous local one-parameter group generated by the vector field f : Rⁿ → Rⁿ. From the definition of S(t), we deduce that

f(u) = lim_{t→0} (1/t)(S(t)u − u),  ∀u ∈ U₀.    (2.76)
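The flow properties (2.74) and (2.76) can be verified numerically; a sketch with an assumed pendulum vector field, where SciPy's solve_ivp stands in for the exact flow:

```python
import numpy as np
from scipy.integrate import solve_ivp

def S(t, u, f):
    """Approximate the local flow S(t)u = x(t; 0, u) of x' = f(x)."""
    if t == 0.0:
        return np.asarray(u, dtype=float)
    sol = solve_ivp(lambda s, x: f(x), (0.0, t), u, rtol=1e-10, atol=1e-12)
    return sol.y[:, -1]

f = lambda x: np.array([x[1], -np.sin(x[0])])   # pendulum vector field
u = np.array([0.5, 0.0])
t, s = 0.3, 0.2
print(np.linalg.norm(S(t + s, u, f) - S(t, S(s, u, f), f)))  # ~0, cf. (2.74)
print((S(1e-6, u, f) - u) / 1e-6, f(u))                      # cf. (2.76)
```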
Consider now the differential system
x ′ = f (t, x, λ), λ ∈ Λ ⊂ Rm ,
(2.77)
where f : Ω × Λ → Rn is a continuous function, Ω is an open subset of Rn+1 , and
Λ is an open subset of Rm . Additionally, we will assume that f is locally Lipschitz
in (x, λ) on Ω × Λ. In other words, for any compact sets K 1 ⊂ Ω and K 2 ⊂ Λ there
exists a positive constant L such that

‖f(t, x, λ) − f(t, y, µ)‖ ≤ L(‖x − y‖ + ‖λ − µ‖),  ∀(t, x), (t, y) ∈ K₁, λ, µ ∈ K₂.    (2.78)
Above, we denoted by the same symbol the norms ‖·‖ on Rᵐ and Rⁿ.
For any (t0 , x 0 ) ∈ Ω, and λ ∈ Λ, the system (2.77) admits a unique solution
x = x(t; t0 , x 0 , λ) satisfying the initial condition x(t0 ) = x 0 . Loosely speaking, our
next result states that the correspondence λ → x(−; t0 , x 0 , λ) is continuous.
Theorem 2.15 (Continuous dependence on parameters) Fix a point (t₀, x⁰, λ₀) ∈ Ω × Λ. Let [t₀, T) be the maximal interval of existence on the right of the solution x(t; t₀, x⁰, λ₀). Then, for any T′ ∈ [t₀, T), there exists an η = η(T′) > 0 such that for any λ ∈ B(λ₀, η) the solution x(t; t₀, x⁰, λ) is defined on [t₀, T′]. Moreover, the map

B(λ₀, η) ∋ λ ↦ x(·; t₀, x⁰, λ) ∈ C([t₀, T′]; Rⁿ)

is continuous.
Proof The above result is a special case of Theorem 2.14 on the continuous dependence on initial data.
Indeed, if we denote by z the (n + m)-dimensional vector (x, λ) ∈ Rⁿ⁺ᵐ, and we define

f̃ : Ω × Λ → Rⁿ⁺ᵐ,  f̃(t, x, λ) = (f(t, x, λ), 0) ∈ Rⁿ × Rᵐ,

then system (2.77) can be rewritten as

z′(t) = f̃(t, z(t)),    (2.79)

while the initial condition becomes

z(t₀) = z⁰ := (x⁰, λ).    (2.80)

We have thus reduced the problem to investigating the dependence of the solutions z(t) of (2.79) on the initial data. Our assumptions on f show that f̃ satisfies the assumptions of Theorem 2.14.
2.7 Differential Inclusions
One of the possible extensions of the concept of a differential equation is to consider
instead of the function f : Ω ⊂ Rⁿ⁺¹ → Rⁿ a set-valued, or multi-valued, map

F : Ω → 2^{Rⁿ},

where we recall that for any set S we denote by 2^S the collection of its subsets. In this case, system (2.39) becomes a differential inclusion

x′(t) ∈ F(t, x(t)),  t ∈ I,    (2.81)
to which we associate the initial condition
x(t0 ) = x 0 .
(2.82)
In general, we cannot expect the existence of a continuously differentiable solution of the Cauchy problem (2.81) and (2.82). Consider, for example, the differential inclusion

x′ ∈ Sign x,    (2.83)

where Sign : R → 2^R is given by

Sign(x) = {−1} for x < 0,  Sign(x) = [−1, 1] for x = 0,  Sign(x) = {1} for x > 0.
Note that, if x₀ > 0, then the function

x(t) = t − t₀ + x₀ for t ≥ t₀ − x₀,  x(t) = 0 for t < t₀ − x₀,

is the unique solution of (2.83) on R \ {t₀ − x₀}. However, it is not a C¹-function, since its derivative has a discontinuity at t₀ − x₀. Thus, the above function is not a solution in the sense we have adopted so far.
This simple example suggests the need to extend the concept of solution.
Definition 2.1 The function x : [t0 , T ] → Rn is called a Carathéodory solution of
the differential inclusion (2.81) if the following hold.
(i) The function x(t) is absolutely continuous on [t0 , T ].
(ii) There exists a negligible set N ⊂ [t0 , T ] such that, for any t ∈ [t0 , T ] \ N , the
function x(t) is differentiable at t and x ′ (t) ∈ F(t, x(t)).
According to Lebesgue’s theorem (see e.g. [12, Sect. 33]), an absolutely continuous function x : [t0 , T ] → Rn is almost everywhere differentiable on the interval
[t0 , T ].
Differential inclusions naturally appear in the modern theory of the calculus of variations and of control systems. An important source of differential inclusions is represented by differential equations with a discontinuous right-hand side. More precisely, if f = f(t, x) is discontinuous in x, then the Cauchy problem (2.1) may fail to have a Carathéodory solution, but such a solution may exist if we extend f to a multi-valued mapping (t, x) ↦ F(t, x). (This happens for Eq. (2.83), where the discontinuous function x/|x| was extended to Sign x.) In this section, we will investigate a special class of differential inclusions known as evolution variational inequalities. They were introduced in mathematics, in a more general context, by G. Stampacchia (1922–1978) and J.L. Lions (1928–2001). To state and solve such problems, we need to make a brief digression into (finite-dimensional) convex analysis.
Recall that a subset C ⊂ Rn is convex if
(1 − t)x + t y ∈ C, ∀x, y ∈ C, ∀t ∈ [0, 1].
Geometrically, this means that for any two points in C the line segment connecting
them is entirely contained in C. Given a closed convex set C ⊂ Rn and x 0 ∈ C, we
set
NC (x 0 ) := w ∈ Rn ; (w, y − x 0 ) ≤ 0, ∀ y ∈ C .
(2.84)
The set NC (x 0 ) is a closed convex cone called the (outer) normal cone of C at the
n
point x 0 ∈ C. We extend NC to a multi-valued map NC : Rn → 2R by setting
NC (x) = ∅, ∀x ∈ Rn \ C.
Example 2.5 (a) If C is a convex domain in Rn with smooth boundary and x 0 is a
point on the boundary, then NC (x 0 ) is the cone spanned by the unit outer normal to
∂C at x 0 . If x 0 is in the interior of C, then NC (x 0 ) = {0}.
(b) If C ⊂ Rn is a vector subspace, then for any x ∈ C we have NC (x) = C ⊥ , the
orthogonal complement of C in Rn .
To any closed convex set C ⊂ Rn there corresponds a projection
PC : Rn → C
that associates to each x ∈ Rn the point in C closest to x with respect to the Euclidean
distance. The next result makes this precise.
Lemma 2.1 Let C be a closed convex subset of Rⁿ. Then the following hold.
(a) For any x ∈ Rⁿ there exists a unique point y ∈ C such that

‖x − y‖_e = dist(x, C) := inf_{z∈C} ‖x − z‖_e.    (2.85)

We denote by P_C(x) this unique point in C, and we will refer to the resulting map P_C : Rⁿ → C as the orthogonal projection onto C.
(b) The map P_C : Rⁿ → C satisfies the following properties:

x − P_C x ∈ N_C(P_C x), that is,
(x − P_C x, y − P_C x) ≤ 0,  ∀x ∈ Rⁿ, y ∈ C;    (2.86)

‖P_C x − P_C z‖_e ≤ ‖x − z‖_e,  ∀x, z ∈ Rⁿ.    (2.87)
Proof (a) There exists a sequence (yν) in C such that
dist(x, C) ≤ ‖x − yν‖_e ≤ dist(x, C) + 1/ν. (2.88)
The sequence ( yν ) is obviously bounded and thus it has a convergent subsequence
( yνk ). Its limit y is a point in C since C is closed. Moreover, inequalities (2.88) imply
that
‖x − y‖_e = dist(x, C).
This completes the proof of the existence part of (a).
Let us prove the uniqueness statement. Let y1, y2 ∈ C be such that
‖x − y1‖_e = ‖x − y2‖_e = dist(x, C).
Since C is convex, we deduce that
y0 := (1/2)( y1 + y2 ) ∈ C.
From the triangle inequality we deduce that
dist(x, C) ≤ ‖x − y0‖_e ≤ (1/2)( ‖x − y1‖_e + ‖x − y2‖_e ) = dist(x, C).
Hence
‖x − y0‖_e = ‖x − y1‖_e = ‖x − y2‖_e = dist(x, C). (2.89)
On the other hand, we have the parallelogram identity
‖(1/2)( y + z )‖²_e + ‖(1/2)( y − z )‖²_e = (1/2)( ‖y‖²_e + ‖z‖²_e ), ∀y, z ∈ Rn.
If in the above equality we let y = x − y1 , z = x − y2 , then we conclude from (2.89)
that
‖y1 − y2‖²_e = 0.
This completes the proof of the uniqueness.
(b) To prove (2.86), we start with the defining inequality
‖x − P_C x‖²_e ≤ ‖x − y‖²_e, ∀y ∈ C.
Consider now the function
f_y : [0, 1] → R, f_y(t) = ‖x − y_t‖²_e − ‖x − P_C x‖²_e,
where
y_t = (1 − t)P_C x + t y = P_C(x) + t( y − P_C x ).
We have f_y(t) ≥ 0, ∀t ∈ [0, 1], and f_y(0) = 0. Thus
f_y′(0) ≥ 0.
Observing that
f_y(t) = ‖(x − P_C x) − t( y − P_C x )‖²_e − ‖x − P_C x‖²_e = t²‖y − P_C x‖²_e − 2t( x − P_C x, y − P_C x ),
we deduce that
f_y′(0) = −2( x − P_C x, y − P_C x ) ≥ 0, ∀y ∈ C.
This proves (2.86).
To prove (2.87), let z ∈ Rn and set
u := x − PC x, v := z − PC z, w := PC z − PC x.
From (2.86) we deduce that
u ∈ NC (PC x), v ∈ NC (PC z),
so that (u, w) ≤ 0 ≤ (v, w) and thus
(w, u − v) ≤ 0. (2.90)
On the other hand, we have x − z = u − v − w, so that
‖x − z‖²_e = ‖(u − v) − w‖²_e = ‖w‖²_e + ‖u − v‖²_e − 2( w, u − v ) ≥ ‖w‖²_e = ‖P_C x − P_C z‖²_e,
where the last inequality follows from (2.90).
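For readers who like to experiment, here is a minimal numerical illustration of Lemma 2.1 (a sketch in Python, not part of the original text): we take C to be the closed unit ball of R², for which the projection admits the explicit formula P_C(x) = x/max(1, ‖x‖_e), and check properties (2.86) and (2.87) on random samples.

import numpy as np

def proj_ball(x):
    """P_C for C the closed unit ball: P_C(x) = x / max(1, |x|_e)."""
    return x / max(1.0, np.linalg.norm(x))

rng = np.random.default_rng(0)
for _ in range(1000):
    x, z = rng.normal(size=2), rng.normal(size=2)
    y = proj_ball(rng.normal(size=2))            # an arbitrary point of C
    # (2.86): (x - P_C x, y - P_C x) <= 0 for every y in C
    assert np.dot(x - proj_ball(x), y - proj_ball(x)) <= 1e-12
    # (2.87): the projection is nonexpansive
    assert np.linalg.norm(proj_ball(x) - proj_ball(z)) <= np.linalg.norm(x - z) + 1e-12
print("properties (2.86) and (2.87) verified on random samples")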
Suppose that K is a closed convex subset of Rn . Fix real numbers t0 < T , a
continuous function g : [t0 , T ] → Rn and a (globally) Lipschitz map f : Rn → Rn .
We want to investigate the differential inclusion
x′(t) ∈ f(x(t)) + g(t) − N_K(x(t)), a.e. t ∈ (t0, T),
x(t0) = x0. (2.91)
This differential inclusion can be rewritten as an evolution variational inequality
x(t) ∈ K, ∀t ∈ [t0, T], (2.92)
( x′(t) − f(x(t)) − g(t), y − x(t) ) ≥ 0, a.e. t ∈ (t0, T), ∀y ∈ K, (2.93)
x(t0) = x0. (2.94)
Theorem 2.16 Suppose that x 0 ∈ K and g : [t0 , T ] → Rn is a continuously differentiable function. Then the initial value problem (2.91) admits a unique Carathéodory
solution x : [t0 , T ] → Rn . Moreover,
x ′ (t) = f (x(t)) + g(t) − PN K (x(t)) ( f (x(t)) + g(t)), a.e. t ∈ (t0 , T ).
(2.95)
Proof For simplicity, we denote by P the orthogonal projection PK onto K defined
in Lemma 2.1. Define the map
Γ : Rn → Rn , Γ x = P x − x.
Note that −Γx ∈ N_K(Px) and ‖Γx‖_e = dist(x, K). Moreover, Γ is dissipative, that is,
( Γx − Γy, x − y ) ≤ 0, ∀x, y ∈ Rn. (2.96)
Indeed, by (2.87),
( Γx − Γy, x − y ) = ( Px − Py, x − y ) − ‖x − y‖²_e ≤ ‖Px − Py‖_e · ‖x − y‖_e − ‖x − y‖²_e ≤ 0.
We will obtain the solution of (2.91) as the limit of the solutions {xε}ε>0 of the approximating Cauchy problem
x′ε(t) = f(xε(t)) + g(t) + (1/ε)Γxε(t),
xε(t0) = x0. (2.97)
For any ε > 0, the map Fε : Rn → Rn, Fε(x) = f(x) + (1/ε)Γx, is Lipschitz. Hence,
the Cauchy problem (2.97) has a unique right-saturated solution x ε (t) defined on
an interval [t0 , Tε ). Since Fε is globally Lipschitz, it follows that x ε is defined over
[t0 , T ]. (See Example 2.1.)
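The penalized problem (2.97) is easy to simulate. The following sketch (with made-up data f(x) = −x, g(t) = cos t − 2 and K = [0, ∞), so that Γx = max(x, 0) − x) integrates (2.97) by explicit Euler for several values of ε; since the penalty term is stiff, the step size must be much smaller than ε. In line with the estimate dist(xε(t), K) ≤ C2ε proved below, the computed trajectories leave K by an amount that shrinks proportionally to ε.

import numpy as np

def solve_penalized(eps, dt=1e-4, T=6.0, x0=1.0):
    f = lambda x: -x                   # hypothetical Lipschitz f
    g = lambda t: np.cos(t) - 2.0      # hypothetical C^1 forcing
    gamma = lambda x: max(x, 0.0) - x  # Gamma x = P_K x - x for K = [0, oo)
    x, xs = x0, []
    for t in np.arange(0.0, T, dt):
        xs.append(x)
        x += dt * (f(x) + g(t) + gamma(x) / eps)
    return np.array(xs)

for eps in (1e-1, 1e-2, 1e-3):
    xs = solve_penalized(eps)
    print(f"eps = {eps:.0e}:  min x_eps = {xs.min():+.5f}")  # -> 0 as eps -> 0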
Taking the scalar product of (2.97) with x ε (t) − x 0 , and observing that Γ x 0 = 0,
x 0 ∈ K , we deduce from (2.96) that
(1/2) d/dt ‖xε(t) − x0‖²_e ≤ ( f(xε(t)) + g(t), xε(t) − x0 )
≤ ‖f(xε(t)) − f(x0)‖_e · ‖xε − x0‖_e + ( ‖f(x0)‖_e + ‖g(t)‖_e ) · ‖xε − x0‖_e.
Hence, if L denotes the Lipschitz constant of f, we have
(1/2) d/dt ‖xε(t) − x0‖²_e ≤ L‖xε(t) − x0‖²_e + (1/2)‖xε(t) − x0‖²_e + (1/2)( ‖f(x0)‖_e + ‖g(t)‖_e )².
We set
M := sup_{t∈[t0,T]} ( ‖f(x0)‖_e + ‖g(t)‖_e )², L′ = 2L + 1,
and we get
d/dt ‖xε(t) − x0‖²_e ≤ L′‖xε(t) − x0‖²_e + M.
Gronwall’s lemma now yields the following ε-independent upper bound:
‖xε(t) − x0‖²_e ≤ M e^{L′(t−t0)}, ∀t ∈ [t0, T]. (2.98)
Using (2.97), we deduce that
d/dt ( xε(t + h) − xε(t) ) = f(xε(t + h)) − f(xε(t)) + g(t + h) − g(t) + (1/ε)( Γxε(t + h) − Γxε(t) ).
Let h > 0. Taking the scalar product with x ε (t + h) − x ε (t) of both sides of the
above equality and using the dissipativity of Γ , we deduce that for any t ∈ [t0 , T ]
we have
(1/2) d/dt ‖xε(t + h) − xε(t)‖²_e ≤ ( f(xε(t + h)) − f(xε(t)), xε(t + h) − xε(t) ) + ( g(t + h) − g(t), xε(t + h) − xε(t) )
≤ L‖xε(t + h) − xε(t)‖²_e + (1/2)( ‖g(t + h) − g(t)‖²_e + ‖xε(t + h) − xε(t)‖²_e ),
so that, setting again L ′ = 2L + 1, we obtain by integration
‖xε(t + h) − xε(t)‖²_e ≤ ‖xε(t0 + h) − x0‖²_e + ∫_{t0}^t ‖g(s + h) − g(s)‖²_e ds + L′ ∫_{t0}^t ‖xε(s + h) − xε(s)‖²_e ds,
for all t ∈ [t0, T − h]. Since g is a C¹-function, there exists a C0 > 0 such that
‖g(s + h) − g(s)‖_e ≤ C0 h, ∀s ∈ [t0, T − h].
Hence, ∀t ∈ [t0, T − h], we have
‖xε(t + h) − xε(t)‖²_e ≤ ‖xε(t0 + h) − x0‖²_e + C0²h²(T − t0) + L′ ∫_{t0}^t ‖xε(s + h) − xε(s)‖²_e ds.
Using Gronwall’s lemma once again, we deduce that
‖xε(t + h) − xε(t)‖²_e ≤ ( ‖xε(t0 + h) − x0‖²_e + C0²h²(T − t0) ) e^{L′(t−t0)} ≤ ( M e^{L′h} + C0²h²(T − t0) ) e^{L′(t−t0)},
where the last step uses (2.98).
Thus, for some constant C1 > 0, independent of ε and h, we have
‖xε(t + h) − xε(t)‖_e ≤ C1 h, ∀t ∈ [t0, T − h].
Thus
‖x′ε(t)‖_e ≤ C1, a.e. t ∈ [t0, T]. (2.99)
From the equality
xε(t) − xε(s) = ∫_s^t x′ε(τ) dτ
we find that
‖xε(t) − xε(s)‖_e ≤ C1|t − s|, ∀t, s ∈ [t0, T]. (2.100)
This shows that the family {x ε }ε>0 is uniformly bounded and equicontinuous on
[t0 , T ]. Arzelà’s theorem now implies that there exists a subsequence (for simplicity,
denoted by ε) and a continuous function x : [t0 , T ] → Rn such that x ε (t) converges
uniformly to x(t) on [t0 , T ] as ε → 0.
Passing to the limit in (2.100), we deduce that the limit function x(t) is Lipschitz on [t0 , T ]. In particular, x(t) is absolutely continuous and almost everywhere
differentiable on this interval. From (2.97) and (2.99), it follows that there exists a
constant C2 > 0, independent of ε, such that
dist(xε(t), K) = ‖Γxε(t)‖_e ≤ C2 ε, ∀t ∈ [t0, T].
This proves that dist(x(t), K ) = 0, ∀t, that is, x(t) ∈ K , ∀t ∈ [t0 , T ].
We can now prove inequality (2.93). To do this, we fix a point t where the function
x is differentiable (we saw that this happens for almost any t ∈ [t0 , T ]). From (2.97)
and (2.86), we deduce that for almost all s ∈ [t0 , T ] and any z ∈ K we have
(1/2) d/ds ‖xε(s) − z‖²_e ≤ ( f(xε(s)) + g(s), xε(s) − z ).
Integrating from t to t + h, we deduce that
(1/2)( ‖xε(t + h) − z‖²_e − ‖xε(t) − z‖²_e ) ≤ ∫_t^{t+h} ( f(xε(s)) + g(s), xε(s) − z ) ds, ∀z ∈ K.
Now, let us observe that, for any u, v ∈ Rn , we have
(1/2)( ‖u + v‖²_e − ‖v‖²_e ) ≥ (u, v).
Using this inequality with u = xε(t + h) − xε(t), v = xε(t) − z, we get
(1/h)( xε(t + h) − xε(t), xε(t) − z ) ≤ (1/2h)( ‖xε(t + h) − z‖²_e − ‖xε(t) − z‖²_e )
≤ (1/h) ∫_t^{t+h} ( f(xε(s)) + g(s), xε(s) − z ) ds, ∀z ∈ K.
Letting ε → 0, we find
(1/h)( x(t + h) − x(t), x(t) − z ) ≤ (1/h) ∫_t^{t+h} ( f(x(s)) + g(s), x(s) − z ) ds, ∀z ∈ K.
Finally, letting h → 0, we obtain
( x′(t) − f(x(t)) − g(t), x(t) − z ) ≤ 0, ∀z ∈ K.
This is precisely (2.93).
The uniqueness of the solution now follows easily. Suppose that x, y are solutions of (2.92)–(2.94). We obtain from (2.93) that
( x′(t) − f(x(t)) − g(t), x(t) − y(t) ) ≤ 0,
( y′(t) − f(y(t)) − g(t), y(t) − x(t) ) ≤ 0,
so that
( x′(t) − y′(t) − ( f(x(t)) − f(y(t)) ), x(t) − y(t) ) ≤ 0,
which finally implies
(1/2) d/dt ‖x(t) − y(t)‖²_e ≤ L‖x(t) − y(t)‖²_e,
for almost all t ∈ [t0, T]. Integrating and using the fact that t → ‖x(t) − y(t)‖²_e is Lipschitz, we deduce that
‖x(t) − y(t)‖²_e ≤ 2L ∫_{t0}^t ‖x(s) − y(s)‖²_e ds.
Gronwall’s lemma now implies x(t) = y(t), ∀t. We have one last thing left to prove,
namely, (2.95). Let us observe that (2.91) implies
(d/ds) x(t + s) − f(x(t + s)) − g(t + s) ∈ −N_K(x(t + s))
for almost all t, s. On the other hand, using (2.84), we deduce that
( u − v, x(t + s) − x(t) ) ≥ 0, ∀u ∈ N_K(x(t + s)), v ∈ N_K(x(t)).
Hence
(1/2) d/ds ‖x(t + s) − x(t)‖²_e = ( (d/ds) x(t + s), x(t + s) − x(t) ) ≤ ( −v + f(x(t + s)) + g(t + s), x(t + s) − x(t) ),
∀v ∈ N K (x(t)). Integrating with respect to s on [0, h], we deduce that
(1/2)‖x(t + h) − x(t)‖²_e ≤ ∫_0^h ( −v + f(x(t + s)) + g(t + s), x(t + s) − x(t) ) ds
≤ ∫_0^h ‖−v + f(x(t + s)) + g(t + s)‖_e · ‖x(t + s) − x(t)‖_e ds.
Using Proposition 1.2, we conclude that
‖x(t + h) − x(t)‖_e ≤ ∫_0^h ‖−v + f(x(t + s)) + g(t + s)‖_e ds, ∀h, ∀v ∈ N_K(x(t)).
Dividing by h > 0 and letting h → 0, we deduce that
‖x′(t)‖_e ≤ ‖−v + f(x(t)) + g(t)‖_e, ∀v ∈ N_K(x(t)).
This means that f (x(t)) + g(t) − x ′ (t) is the point in N K (x(t)) closest to f (x(t)) +
g(t). This is precisely the statement (2.95).
Remark 2.7 If the solution ϕ(t) of the Cauchy problem
ϕ′ = f (ϕ) + g, ∀t ∈ [t0 , T ], ϕ(t0 ) = x 0 ,
(2.101)
stays in K for all t ∈ [t0 , T ], then ϕ coincides with the unique solution x(t) of
(2.92)–(2.94).
Indeed, if we subtract (2.101) from (2.91) and we take the scalar product with
x(t) − ϕ(t), then we obtain the inequality
(1/2) d/dt ‖x(t) − ϕ(t)‖²_e ≤ L‖x(t) − ϕ(t)‖²_e, ∀t ∈ [t0, T].
Gronwall’s lemma now implies x ≡ ϕ.
Example 2.6 Consider a particle of unit mass that is moving in a planar domain
K ⊂ R2 under the influence of a homogeneous force field F(t). We assume that K
is convex; see Fig. 2.4.
If we denote by g(t) an antiderivative of F(t), and by x(t) the position of the
particle at time t, then, intuitively, the motion ought to be governed by the differential
equations
x′(t) = g(t), if x(t) ∈ int K,
x′(t) = g(t) − P_{N(x(t))} g(t), if x(t) ∈ ∂K, (2.102)
where N(x(t)) is the half-line starting at x(t), normal to ∂K and pointing towards the exterior of K; see Fig. 2.4. (If ∂K is not smooth, then N_K(x(t)) is a cone pointed at x(t).) Thus x(t) is the solution of the evolution variational inequality
x(t) ∈ K, ∀t ≥ 0,
x′(t) ∈ g(t) − N(x(t)), ∀t > 0.
Fig. 2.4 Motion of a particle confined to a convex region
Theorem 2.16 confirms that the motion of the particle is indeed the one described
above. Moreover, the proof of Theorem 2.16 offers a way of approximating its trajectory.
Example 2.7 Let us have another look at the radioactive disintegration model we
discussed in Sect. 1.3.1. If x(t) denotes the quantity of radioactive material, then the
evolution of this quantity is governed by the ODE
x ′ (t) = −αx(t) + g(t),
(2.103)
where g(t) denotes the amount of radioactive material that is added or extracted per
unit of time at time t. Clearly, x(t) is a solution of (2.103) only for those t such that
x(t) > 0. That is why it is more appropriate to assume that x satisfies the following
equations
x(t) ≥ 0, ∀t ≥ 0,
x′(t) = −αx(t) + g(t), ∀t ∈ E_x, (2.104)
x′(t) = max{ g(t), 0 }, ∀t ∈ [0, ∞) \ E_x,
where the set
E_x := { t ≥ 0; x(t) > 0 }
is also one of the unknowns in the above problem. This is a so-called “free boundary” problem. Let us observe that (2.104) is equivalent to the variational inequality (2.93) with
K := { x ∈ R; x ≥ 0 }.
More precisely, (2.104) is equivalent to
( x′(t) + αx(t) − g(t) )( x(t) − y ) ≤ 0, ∀y ∈ K, (2.105)
for almost all t ≥ 0.
In formulation (2.105), the set E x has disappeared, but we have to pay a price,
namely, the new equation is a differential inclusion.
Example 2.8 Consider a factory consisting of n production units, each generating
only one type of output. We denote by xi (t) the size of the output of unit i at time
t, by ci (t) the demand for the product i at time t, and by pi (t) the rate at which the
output i is produced. The demands and stocks define, respectively, the vector-valued
maps
c(t) := ( c1(t), …, cn(t) )ᵀ, x(t) := ( x1(t), …, xn(t) )ᵀ,
and we will assume that the demand vector depends linearly on the stock vector, that
is,
c(t) = C x(t) + d(t),
where C is an n × n matrix, and d : [0, T ] → Rn is a differentiable map. For i =
1, . . . , n, define
E_{xi} := { t ∈ [0, T]; xi(t) > 0 }.
Obviously, the functions xi satisfy the following equations
xi(t) ≥ 0, ∀t ∈ [0, T],
x′i(t) = pi(t) − ( C x(t) )i − di(t), t ∈ E_{xi}, (2.106)
x′i(t) − pi(t) + ( C x(t) )i + di(t) ≥ 0, ∀t ∈ [0, T] \ E_{xi}.
We can now see that the solutions of the variational problem (2.92) and (2.93) with
f (x) = −C x, g(t) = p(t) − d(t),
and
K = { x ∈ Rn; xi ≥ 0, ∀i = 1, …, n },
are also solutions of (2.106).
Remark 2.8 Theorem 2.16 extends to differential inclusions of the form
x ′ (t) ∈ f (x(t)) + φ(x(t)) + g(t), a.e. t ∈ (0, T ),
x(0) = x 0 ,
(2.107)
where f : Rn → Rn is Lipschitz and φ : D ⊂ Rn → 2^{Rn} is a maximal dissipative mapping, that is,
( v1 − v2, u1 − u2 ) ≤ 0, ∀vi ∈ φ(ui), i = 1, 2,
and the range of the map u → u + λφ(u) is all of Rn for λ > 0. The method of proof is essentially the same as that of Theorem 2.16. Namely, one approximates (2.107) by
x′(t) = f(x(t)) + φε(x(t)) + g(t), t ∈ (0, T),
x(0) = x0, (2.108)
where φε is the Lipschitz mapping φε(x) = (1/ε)( (I − εφ)^{−1}x − x ), ε > 0, x ∈ Rn. Then, one obtains the solution to (2.107) as x(t) = lim_{ε→0} xε(t), where xε ∈ C¹([0, T]; Rn) is the solution to (2.108). We refer to [2] for details and more
general results. We note, however, that this result applies to the Cauchy problem
with discontinuous monotone functions φ. For instance, if φ is discontinuous in x 0 ,
then one fills the jump at x 0 by redefining φ as
φ̃(x) = φ(x) for x ≠ x0, φ̃(x0) = lim_{y→x0} φ(y).
Clearly, φ̃ is maximal dissipative.
Problems
2.1 Find the maximal interval of existence for the solution of the Cauchy problem
x ′ = −x 2 + t + 1,
x(0) = 1,
and then find the first three Picard iterations of this problem.
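A quick symbolic computation of these iterates (a sketch using Python's sympy, starting from the constant approximation x0(t) ≡ 1 and iterating x_{k}(t) = 1 + ∫_0^t ( −x_{k−1}(s)² + s + 1 ) ds):

import sympy as sp

t, s = sp.symbols('t s')
x = sp.Integer(1)                    # x_0(t) = x(0) = 1
for k in range(1, 4):
    # x_k(t) = 1 + int_0^t ( -x_{k-1}(s)^2 + s + 1 ) ds
    x = 1 + sp.integrate(-x.subs(t, s)**2 + s + 1, (s, 0, t))
    print(f"x_{k}(t) =", sp.expand(x))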
2.2 Consider the Cauchy problem
x ′ = f (t, x),
x(t0 ) = x0 , (t0 , x0 ) ∈ Ω ⊂ R2 ,
(2.109)
where the function f is continuous in (t, x) and locally Lipschitz in x. Prove that
if x0 ≥ 0 and f (t, 0) > 0 for any t ≥ t0 , then the saturated solution x(t; t0 , x0 ) is
nonnegative for any t ≥ t0 in the existence interval.
2.3 Consider the system
x ′ = f (t, x),
x(t0 ) = x 0 , t0 ≥ 0,
where the function f : [0, ∞) × Rn → Rn is continuous in (t, x), locally Lipschitz
in x and satisfies the condition
( f(t, x), P x ) ≤ 0, ∀t ≥ 0, x ∈ Rn, (2.110)
where P is a real, symmetric and positive definite n × n matrix. Prove that any
right-saturated solution of the system is defined on the semi-axis [t0 , ∞).
Hint. Imitate the argument used in the proof of Theorem 2.12. Another approach is
to replace the scalar product of Rn by
⟨u, v⟩ = (u, Pv), ∀u, v ∈ Rn,
and argue as in the proof of Theorem 2.13.
2.4 Consider the Cauchy problem
x ′′ + ax + f (x ′ ) = 0, x(t0 ) = x0 , x ′ (t0 ) = x1 ,
(2.111)
where a is a positive constant, and f : R → R is a locally Lipschitz function satisfying
y f (y) ≥ 0, ∀y ∈ R.
Prove that any right-saturated solution of (2.111) is defined on the semi-axis [t0 , ∞).
Hint. Multiplying (2.111) by x ′ we deduce
(1/2) d/dt ( |x′(t)|² + a|x(t)|² ) ≤ 0, ∀t ≥ t0.
Conclude that the functions x(t) and x ′ (t) are bounded and then use Theorem 2.11.
2.5 In the anisotropic theory of relativity due to V.G. Boltyanski, the propagation of
light in a neighborhood of a mass m located at the origin of R3 is described by the
equation
x′ = −( mγ / ‖x‖³_e ) x + u(t), (2.112)
where γ is a positive constant, u : [0, ∞) → R3 is a continuous and bounded function, that is,
∃C > 0; ‖u(t)‖_e ≤ C, ∀t ≥ 0,
and x(t) ∈ R3 is the location of the photon at time t. Prove that there exists an r > 0
such that all the trajectories of (2.112), which start at t = 0 in the ball
Br := { x ∈ R³; ‖x‖_e < r },
will stay inside the ball as long as they are defined. (Such a ball is called a black hole
in astrophysics.)
Hint. Take the scalar product of (2.112) with x(t) to obtain the differential inequality
(1/2) d/dt ‖x(t)‖²_e = −( mγ / ‖x(t)‖_e ) + ( u(t), x(t) ) ≤ −( mγ / ‖x(t)‖_e ) + C‖x(t)‖_e.
Use this differential inequality to obtain an upper estimate for ‖x(t)‖_e.
2.6 (Wintner’s extendibility test) Prove that, if the continuous function f =
f (t, x) : R × R → R is locally Lipschitz in x and satisfies the inequality
| f(t, x)| ≤ µ(|x|), ∀(t, x) ∈ R × R, (2.113)
where µ : (0, ∞) → (0, ∞) satisfies
∫_0^∞ dr/µ(r) = ∞, (2.114)
then all the solutions of x′ = f(t, x) are defined on the whole axis R.
Hint. According to Theorem 2.11, it suffices to show that all the solutions of x′ = f(t, x) are bounded. To prove this, we conclude from (2.113) that
∫_{x0}^{x(t)} dr/µ(r) ≤ |t − t0|, ∀t,
and then invoke (2.114).
2.7 Prove that the saturated solution of the Cauchy problem
x′ = e^{−x²} + t², x(0) = 1, (2.115)
is defined on the interval [0, 1/2]. Use Euler’s method with step size h = 10⁻² to find an approximation of this solution at the nodes tj = jh, j = 1, …, 50.
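A direct implementation of the requested Euler scheme (a sketch; the printed values are approximations, not exact solution values):

import numpy as np

h, x = 1e-2, 1.0                     # step size and x(0) = 1
for j in range(1, 51):
    t = (j - 1) * h                  # explicit Euler: x_j = x_{j-1} + h f(t_{j-1}, x_{j-1})
    x += h * (np.exp(-x**2) + t**2)
    if j % 10 == 0:
        print(f"t = {j*h:4.2f}   x ~ {x:.6f}")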
2.8 Let f : R → R be a continuous and nonincreasing function. Consider the
Cauchy problem
x ′ (t) = f (x), ∀t ≥ 0,
(2.116)
x(0) = x0 .
According to Theorem 2.13, this problem has a unique solution x(t) which exists on
[0, ∞).
(a) Prove that, for any λ > 0, the function
1 − λ f : R → R, x → x − λ f (x),
is bijective. For any integer n > 0, we set
(1 − λ f)^{−n} := (1 − λ f)^{−1} ∘ ⋯ ∘ (1 − λ f)^{−1} (n times).
(b) Prove that x(t) is given by the formula
x(t) = lim_{n→∞} ( 1 − (t/n) f )^{−n} x0, ∀t ≥ 0. (2.117)
Hint. Fix t > 0, n > 0, set h_n := t/n, and define iteratively
( x_i^n − x_{i−1}^n ) / h_n = f(x_i^n), i = 1, …, n, x_0^n = x0,
that is,
x_i^n = ( 1 − h_n f )^{−1} x_{i−1}^n = ( 1 − (t/n) f )^{−i} x0. (2.118)
Let x^n : [0, t] → R be the unique continuous function which is linear on each of the intervals [(i − 1)h_n, i h_n] and satisfies
x^n(i h_n) = x_i^n, ∀i = 0, …, n.
Argue as in the proof of Peano’s theorem to show that x^n converges uniformly to x on [0, t] as n → ∞. Equality (2.117) now follows from (2.118).
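The following sketch illustrates formula (2.117) for the made-up choice f(x) = −x³, which is continuous and nonincreasing; each factor (1 − (t/n) f)^{−1} is evaluated by solving y + (t/n)y³ = x with Newton's method, and the iterates are compared with the exact solution x(t) = x0/√(1 + 2t x0²) of x′ = −x³.

import numpy as np

def resolvent(x, h, steps=50):
    """Solve y + h*y**3 = x by Newton's method, i.e. y = (1 - h f)^(-1) x."""
    y = x
    for _ in range(steps):
        y -= (y + h * y**3 - x) / (1.0 + 3.0 * h * y**2)
    return y

def exponential_formula(t, x0, n):
    x = x0
    for _ in range(n):               # n resolvent steps of size t/n
        x = resolvent(x, t / n)
    return x

t, x0 = 1.0, 2.0
for n in (10, 100, 1000):
    print(n, exponential_formula(t, x0, n))
print("exact:", x0 / np.sqrt(1.0 + 2.0 * t * x0**2))   # solution of x' = -x^3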
2.9 Consider the Cauchy problem
x ′ = f (x), t ≥ 0
x(0) = x0 ,
(2.119)
where f : R → R is a continuous nonincreasing function. Let x = ϕ(t) be a solution
of (2.119). Prove that, if the set
F := { y ∈ R; f(y) = 0 }
is nonempty, then the following hold.
(i) The function t → |x ′ (t)| is nonincreasing on [0, ∞).
(ii) limt→∞ |x ′ (t)| = 0.
(iii) There exists an x∞ ∈ F such that limt→∞ x(t) = x∞ .
Hint. Since f is nonincreasing we have
(1/2) d/dt ( x(t + h) − x(t) )² = ( x′(t + h) − x′(t) )( x(t + h) − x(t) ) = ( f(x(t + h)) − f(x(t)) )( x(t + h) − x(t) ) ≤ 0.
Hence, for any h ≥ 0 and t2 ≥ t1 ≥ 0, we have
|x(t2 + h) − x(t2)| ≤ |x(t1 + h) − x(t1)|.
This proves (i). To prove (ii) multiply both sides of (2.119) by x(t) − y0 , where
y0 ∈ F. Conclude, similarly, that
d/dt ( x(t) − y0 )² ≤ 0,
showing that lim_{t→∞} ( x(t) − y0 )² exists. Next multiply the equation by x′(t) to obtain
∫_0^t |x′(s)|² ds = g(x(t)) − g(x0),
where g is an antiderivative of f. We deduce that
∫_0^∞ |x′(t)|² dt < ∞,
which when combined with (i) yields (ii). Next, pick a subsequence tn → ∞ such that x(tn) → y0. From (i) and (ii), it follows that y0 ∈ F. You can now conclude that
lim_{t→∞} x(t) = y0.
2.10 Prove that the conclusions of Problem 2.9 remain true for the system
x ′ = f (x),
x(0) = x 0 ,
where f : Rn → Rn is a dissipative and continuous mapping of the form f = ∇g,
where g : Rn → R is of class C 1 and g ≥ 0.
Hint. One proceeds as above by taking into account that
d/dt g(x(t)) = ( x′(t), f(x(t)) ), ∀t ≥ 0.
2.11 Consider the system
x ′ = f (x) − λx + f 0 ,
x(0) = x 0 ,
where f : Rn → Rn is continuous and dissipative, λ > 0 and f 0 , x 0 ∈ Rn . Prove
that
(a) limt→∞ x(t) exists (call this limit x ∞ ),
(b) limt→∞ x ′ (t) = 0,
(c) λx ∞ − f (x ∞ ) = f 0 .
Hint. For each h > 0, one has
(1/2) d/dt ‖x(t + h) − x(t)‖²_e + λ‖x(t + h) − x(t)‖²_e ≤ 0,
which implies that (a) holds and that
‖x′(t)‖_e ≤ e^{−λt} ‖x′(0)‖_e, ∀t ≥ 0.
2.12 Prove that (2.117) remains true for solutions x to the Cauchy problem (2.116),
where f : Rn → Rn is continuous and dissipative.
Hint. By Problem 2.11(c), the function 1 − λ f : Rn → Rn is bijective and the Euler
scheme (2.38) is equivalent to
x_i^k = ( 1 − (t/k) f )^{−i} x0 for t ∈ ( i h_k, (i + 1)h_k ).
Then, by the convergence of this scheme, one obtains
x(t) = lim_{k→∞} ( 1 − (t/k) f )^{−k} x0, ∀t ≥ 0.
2.13 Consider the equation x′ = f(t, x), where f : R² → R is a function continuous in (t, x) and locally Lipschitz in x, and satisfies the growth constraint
| f(t, x)| ≤ α(t)|x|, ∀(t, x) ∈ R²,
where
∫_{t0}^∞ α(t) dt < ∞.
(a) Prove that any solution of the equation has finite limit as t → ∞.
(b) Prove that if, additionally, f satisfies the Lipschitz condition
| f(t, x) − f(t, y)| ≤ α(t)|x − y|, ∀t ∈ R, x, y ∈ R,
then there exists a bijective correspondence between the initial values of the solutions
and their limits at ∞.
Hint. Use Theorem 2.11 as in Example 2.1.
2.14 Prove that the maximal existence interval of the Cauchy problem
x ′ = ax 2 + t 2 ,
x(0) = x0 ,
(2.120)
(a is a positive constant) is bounded from above. Compare this with the situation
encountered in Example 2.2.
Hint. Let x0 ≥ 0 and [0, T) be the maximal interval of definition on the right. Then, on this interval,
x(t; 0, x0) ≥ x0 + t³/3, 1/x(t; 0, x0) ≤ 1/x0 − at.
Hence, T ≤ (ax0)^{−1}.
2.15 Consider the Volterra integral equation
x(t) = g(t) + ∫_a^t f(s, x(s)) ds, t ∈ [a, b],
where g ∈ C([a, b]; Rn), f : [a, b] × Rn → Rn is continuous and
‖ f(s, x) − f(s, y)‖ ≤ L‖x − y‖, ∀s ∈ [a, b], x, y ∈ Rn.
Prove that there is a unique solution x ∈ C([a, b]; Rn ).
Hint. One proceeds as in the proofs of Theorems 2.1 and 2.4 by the method of
successive approximations
x_{n+1}(t) = g(t) + ∫_a^t f(s, x_n(s)) ds, t ∈ [a, b],
and proving that {x_n} is uniformly convergent.
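On a grid, these successive approximations can be carried out numerically. A sketch with hypothetical data g(t) ≡ 1, f(s, x) = sin x on [a, b] = [0, 2], using the trapezoidal rule for the integral:

import numpy as np

a, b, N = 0.0, 2.0, 2000
t = np.linspace(a, b, N + 1)
dt = (b - a) / N
g = np.ones_like(t)                  # g(t) = 1 (hypothetical data)

x = g.copy()                         # x_0 = g
for k in range(40):
    fx = np.sin(x)                   # f(s, x) = sin x is Lipschitz with L = 1
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (fx[1:] + fx[:-1]) * dt)))
    x_new = g + integral             # x_{n+1}(t) = g(t) + int_a^t f(s, x_n(s)) ds
    if np.max(np.abs(x_new - x)) < 1e-12:
        break
    x = x_new
print("iterations:", k + 1, "  x(b) ~", x[-1])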
Chapter 3
Systems of Linear Differential Equations
The study of linear systems of ODEs offers an example of a well-put-together theory,
based on methods and results from linear algebra. As we will see, there exist many
similarities between the theory of systems of linear algebraic equations and the theory
of systems of linear ODEs. In applications, linear systems appear most often as “first
approximations” of more complex processes.
3.1 Notation and Some General Results
A system of first-order linear ODEs has the form
x′i(t) = Σ_{j=1}^n aij(t) xj(t) + bi(t), i = 1, …, n, t ∈ I, (3.1)
where I is an interval of the real axis and aij , bi : I → R are continuous functions.
System (3.1) is called nonhomogeneous. If bi (t) ≡ 0, ∀i, then the system is called
homogeneous. In this case, it has the form
x′i(t) = Σ_{j=1}^n aij(t) xj(t), i = 1, …, n, t ∈ I. (3.2)
Using the vector notation as in Sect. 2.2, we can rewrite (3.1) and (3.2) in the form
x′ (t) = A(t)x(t) + b(t), t ∈ I
(3.3)
x′ (t) = A(t)x(t), t ∈ I,
(3.4)
where
x(t) := ( x1(t), …, xn(t) )ᵀ, b(t) := ( b1(t), …, bn(t) )ᵀ,
and A(t) is the n × n matrix A(t) := ( aij(t) )_{1≤i,j≤n}.
Obviously, the local existence and uniqueness theorem (Theorem 2.2), as well as
the results concerning global existence and uniqueness apply to system (3.3). Thus,
for any t0 ∈ I, and any x0 ∈ Rn , there exists a unique saturated solution of (3.1)
satisfying the initial condition
x(t0 ) = x0 .
(3.5)
In this case, the domain of existence of the saturated solution coincides with the
interval I. In other words, we have the following result.
Theorem 3.1 The saturated solution x = ϕ(t) of the Cauchy problem (3.1) and
(3.5) is defined on the entire interval I.
Proof Let (α, β) ⊂ I = (t1 , t2 ) be the interval of definition of the solution x = ϕ(t).
According to Theorem 2.10 applied to system (3.3), with Ω = (t1, t2) × Rn and f(t, x) = A(t)x + b(t), if β < t2 or α > t1, then the function ϕ(t) is unbounded in the neighborhood of β, respectively α. Suppose that β < t2. From the integral identity
ϕ(t) = x0 + ∫_{t0}^t A(s)ϕ(s) ds + ∫_{t0}^t b(s) ds, t0 ≤ t < β,
we obtain, by passing to norms,
‖ϕ(t)‖ ≤ ‖x0‖ + ∫_{t0}^t ‖A(s)‖ · ‖ϕ(s)‖ ds + ∫_{t0}^t ‖b(s)‖ ds, t0 ≤ t < β. (3.6)
The functions t → ‖A(t)‖ and t → ‖b(t)‖ are continuous on [t0, t2) and thus are bounded on [t0, β]. Hence, there exists an M > 0 such that
‖A(t)‖ + ‖b(t)‖ ≤ M, ∀t ∈ [t0, β]. (3.7)
From inequalities (3.6) and (3.7) and Gronwall’s lemma, we deduce that
‖ϕ(t)‖ ≤ ( ‖x0‖ + (β − t0)M ) e^{(β−t0)M}, ∀t ∈ [t0, β).
We have reached a contradiction which shows that, necessarily, β = t2 . The
equality α = t1 can be proven in a similar fashion. This completes the proof of
Theorem 3.1.
3.2 Homogeneous Systems of Linear Differential Equations
In this section, we will investigate system (3.2) (equivalently, (3.4)). We begin with
a theorem on the structure of the set of solutions.
Theorem 3.2 The set of solutions of system (3.2) is a real vector space of dimension n.
Proof The set of solutions is obviously a real vector space. Indeed, the sum of two
solutions of (3.2) and the multiplication by a scalar of a solution are also solutions
of this system.
To prove that the dimension of this vector space is n, we will show that there exists
a linear isomorphism between the space E of solutions of (3.2) and the space Rn . Fix
a point t0 ∈ I and denote by Γt0 the map E → Rn that associates to a solution x ∈ E its value at t0, that is,
Γt0(x) = x(t0) ∈ Rn, ∀x ∈ E.
The map Γt0 is obviously linear. The existence and uniqueness theorem concerning
the Cauchy problems associated with (3.2) implies that Γt0 is also surjective and
injective. This completes the proof of Theorem 3.2.
The above theorem shows that the space E of solutions of (3.2) admits a basis
consisting of n solutions. Let
x1 , x2 , . . . , xn
be one such basis. In particular, x1 , x2 , . . . , xn are n linearly independent solutions
of (3.2), that is, the only constants c1 , c2 , . . . , cn such that
c1 x1 (t) + c2 x2 (t) + · · · + cn xn (t) = 0, ∀t ∈ I,
are the null ones, c1 = c2 = ⋯ = cn = 0. The matrix X(t) whose columns are given by the functions x1(t), x2(t), …, xn(t),
X(t) := [ x1(t), x2(t), …, xn(t) ], t ∈ I,
is called a fundamental matrix. It is easy to see that the matrix X(t) is a solution to
the differential equation
X′(t) = A(t)X(t), t ∈ I, (3.8)
where we denote by X ′ (t) the matrix whose entries are the derivatives of the
corresponding entries of X(t).
A fundamental matrix is not unique. Let us observe that any matrix Y(t) = X(t)C,
where C is a constant nonsingular matrix, is also a fundamental matrix of (3.2).
Conversely, any fundamental matrix Y(t) of system (3.2) can be represented as a
product
82
3 Systems of Linear Differential Equations
Y(t) = X(t)C, ∀t ∈ I,
where C is a constant, nonsingular n × n matrix. This follows from the next simple
result.
Corollary 3.1 Let X(t) be a fundamental matrix of the system (3.2). Then any
solution x(t) of (3.2) has the form
x(t) = X(t)c, t ∈ I,
(3.9)
where c is a vector in Rn (that depends on x(t)).
Proof Equality (3.9) follows from the fact that the columns of X(t) form a basis of the space of solutions of system (3.2). Equality (3.9) simply states that any solution of (3.2) is a linear combination of solutions forming a basis for the space of solutions.
Given a collection {x1 , . . . , xn } of solutions of (3.2), we define the Wronskian of
this collection to be the determinant
W(t) := det X(t), (3.10)
where X(t) denotes the matrix with columns x1(t), …, xn(t). The next result, due
to the Polish mathematician H. Wronski (1778–1853), explains the relevance of the
quantity W (t).
Theorem 3.3 The collection of solutions {x1 , . . . , xn } of (3.2) is linearly independent
if and only if its Wronskian W (t) is nonzero at a point of the interval I (equivalently,
on the entire interval I).
Proof Clearly, if the collection is linearly dependent, then
det X(t) = W (t) = 0, ∀t ∈ I,
where X(t) is the matrix [x1 (t), . . . , xn (t)].
Conversely, suppose that the collection is linearly independent. We argue by contradiction, and assume that W (t0 ) = 0 at some t0 ∈ I. Consider the linear homogeneous system
X(t0)c = 0. (3.11)
Since det X(t0 ) = W (t0 ) = 0, system (3.11) admits a nontrivial solution c0 ∈ Rn .
The function y(t) := X(t)c0 is obviously a solution of (3.2) vanishing at t0 . The
uniqueness theorem implies
y(t) = 0, ∀t ∈ I.
Hence X(t)c0 = 0, ∀t ∈ I, which contradicts the linear independence of the collection
{x1 , . . . , xn }. This completes the proof of the theorem.
Theorem 3.4 (J. Liouville (1809–1882)) Let W (t) be the Wronskian of a collection
of n solutions of system (3.2). Then we have the equality
W(t) = W(t0) exp( ∫_{t0}^t tr A(s) ds ), ∀t0, t ∈ I, (3.12)
where tr A(t) denotes the trace of the matrix A(t),
tr A(t) = Σ_{i=1}^n aii(t).
Proof Without loss of generality, we can assume that W (t) is the Wronskian of a
linearly independent collection of solutions {x1 , . . . , xn }. (Otherwise, equality (3.12)
would follow trivially from Theorem 3.3.) Denote by X(t) the fundamental matrix
with columns x1 (t), . . . , xn (t).
From the definition of the derivative, we deduce that for any t ∈ I we have
X(t + ε) = X(t) + εX ′ (t) + o(ε), as ε → 0.
From (3.8), it follows that
X(t + ε) = X(t) + εA(t)X(t) + o(ε), ∀t ∈ I.
(3.13)
In (3.13), we take the determinant of both sides and we find that
W(t + ε) = det( 1 + εA(t) + o(ε) ) W(t) = ( 1 + ε tr A(t) + o(ε) ) W(t).
Letting ε → 0 in the above equality, we obtain
W ′ (t) = tr A(t) W (t).
Integrating the above linear differential equation, we get (3.12), as claimed.
Remark 3.1 From Liouville’s theorem, we deduce in particular the fact that, if the
Wronskian is nonzero at a point, then it is nonzero everywhere.
Taking into account that the determinant of a matrix is the oriented volume of the
parallelepiped determined by its columns, Liouville’s formula (3.12) describes the
variation in time of the volume of the parallelepiped determined by {x1 (t), . . . , xn (t)}.
In particular, if tr A(t) = 0, then this volume is conserved along the trajectories of
system (3.2).
This fact admits a generalization to nonlinear differential systems of the form
x′ = f (t, x), t ∈ I,
(3.14)
where f : I × Rn → Rn is a C 1 -map such that
divx f (t, x) ≡ 0.
(3.15)
Assume that for any x0 ∈ Rn there exists a solution S(t; t0, x0) = x(t; t0, x0) of system (3.14) satisfying the initial condition x(t0) = x0 and defined on the interval I. Let D be a domain in Rn and set
D(t) = S(t)D := { S(t; t0, x0); x0 ∈ D }.
Liouville’s theorem from statistical physics states that the volume of D(t) is constant.
The proof goes as follows.
For any t ∈ I, we have
Vol D(t) = ∫_D det( ∂S(t, x)/∂x ) dx. (3.16)
On the other hand, Theorem 3.14 to come shows that ∂S(t, x)/∂x is the solution of the Cauchy problem
(d/dt)( ∂S(t, x)/∂x ) = (∂f/∂x)( t, S(t, x) ) ( ∂S(t, x)/∂x ), ∂S(t0, x)/∂x = 1.
We deduce that
∂S(t, x)/∂x = 1 + (t − t0)(∂f/∂x)(t0, x) + o(t − t0).
Hence
det( ∂S(t, x)/∂x ) = 1 + (t − t0) tr( (∂f/∂x)(t0, x) ) + o(t − t0) = 1 + (t − t0) div_x f(t0, x) + o(t − t0).
This yields
Vol D(t) = Vol D(t0) + o(t − t0).
In other words, (d/dt) Vol D(t) ≡ 0, and thus Vol D(t) is constant.
In particular, Liouville’s theorem applies to the Hamiltonian systems
dx/dt = (∂H/∂p)(x, p), dp/dt = −(∂H/∂x)(x, p),
where H : Rn × Rn → R is a C 1 -function.
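Liouville's formula (3.12) is easy to test numerically. The sketch below (a hypothetical 2 × 2 example, not from the text) integrates X′ = A(t)X with X(t0) = 1 and compares det X(t) with exp( ∫_{t0}^t tr A(s) ds ); for a Hamiltonian vector field the divergence vanishes, so the corresponding determinant would remain identically 1.

import numpy as np
from scipy.integrate import quad, solve_ivp

A = lambda t: np.array([[np.sin(t), 1.0], [-1.0, np.cos(t)]])

def rhs(t, y):
    # y holds the 4 entries of the 2x2 matrix X; X' = A(t) X
    return (A(t) @ y.reshape(2, 2)).ravel()

t0, t1 = 0.0, 3.0
sol = solve_ivp(rhs, (t0, t1), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
W = np.linalg.det(sol.y[:, -1].reshape(2, 2))          # det X(t1), W(t0) = 1
liouville = np.exp(quad(lambda s: np.trace(A(s)), t0, t1)[0])
print(W, liouville)                                     # the two values agree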
3.3 Nonhomogeneous Systems of Linear Differential Equations
In this section, we will investigate the nonhomogeneous system (3.1) or, equivalently,
system (3.3). Our first result concerns the structure of the set of solutions.
Theorem 3.5 Let X(t) be a fundamental matrix of the homogeneous system (3.2) and x̃(t) a given solution of the nonhomogeneous system (3.3). Then the general solution of system (3.3) has the form
x(t) = X(t)c + x̃(t), t ∈ I, (3.17)
where c is an arbitrary vector in Rn.
Proof Obviously, any function x(t) of the form (3.17) is a solution of (3.3). Conversely, let y(t) be an arbitrary solution of system (3.3) determined by its initial condition y(t0) = y0, where t0 ∈ I and y0 ∈ Rn. Consider the linear algebraic system
X(t0)c = y0 − x̃(t0).
Since det X(t0) ≠ 0, the above system has a unique solution c0. Then the function X(t)c0 + x̃(t) is a solution of (3.3) and has the value y0 at t0. The existence and uniqueness theorem then implies that
y(t) = X(t)c0 + x̃(t), ∀t ∈ I.
In other words, the arbitrary solution y(t) has the form (3.17).
The next result clarifies the statement of Theorem 3.5 by offering a representation
formula for a particular solution of (3.3).
Theorem 3.6 (Variation of constants formula) Let X(t) be a fundamental matrix for
the homogeneous system (3.2). Then the general solution of the nonhomogeneous
system (3.3) admits the integral representation
t
x(t) = X(t)c +
t0
X(t)X(s)−1 b(s)ds, t ∈ I,
(3.18)
where t0 ∈ I, c ∈ Rn .
Proof We seek a particular solution
x(t) of (3.3) of the form
x(t) = X(t)γ(t), t ∈ I,
(3.19)
x(t) is supposed to be a
where γ : I → Rn is a function to be determined. Since
solution of (3.3), we have
X′(t)γ(t) + X(t)γ′(t) = A(t)X(t)γ(t) + b(t).
Using equality (3.8), we have X ′ (t) = A(t)X(t) and we deduce that
γ ′ (t) = X(t)−1 b(t), ∀t ∈ I,
and thus we can choose γ(t) of the form
γ(t) = ∫_{t0}^t X(s)^{−1} b(s) ds, t ∈ I, (3.20)
where t0 is some fixed point in I. The representation formula (3.18) now follows
from (3.17), (3.19) and (3.20).
Remark 3.2 From (3.18), it follows that the solution of system (3.3) satisfying the
Cauchy condition x(t0 ) = x0 is given by the formula
x(t) = X(t)X(t0)^{−1} x0 + ∫_{t0}^t X(t)X(s)^{−1} b(s) ds, t ∈ I. (3.21)
In mathematical systems theory the matrix U(t, s) := X(t)X(s)−1 , s, t ∈ I, is often
called the transition matrix.
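For a constant matrix A one has X(t) = e^{tA} and U(t, s) = e^{(t−s)A}, and formula (3.21) can then be verified directly. A sketch with made-up data (the integral is approximated by a fine trapezoidal rule):

import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
b = lambda t: np.array([np.sin(t), 1.0])
t0, t1, x0 = 0.0, 2.0, np.array([1.0, 0.0])

# direct numerical integration of x' = A x + b(t)
sol = solve_ivp(lambda t, x: A @ x + b(t), (t0, t1), x0, rtol=1e-10, atol=1e-12)

# formula (3.21): x(t) = e^{(t-t0)A} x0 + int_{t0}^t e^{(t-s)A} b(s) ds
s = np.linspace(t0, t1, 4001)
vals = np.array([expm((t1 - si) * A) @ b(si) for si in s])
x_formula = expm((t1 - t0) * A) @ x0 + np.trapz(vals, s, axis=0)
print(sol.y[:, -1], x_formula)      # the two vectors agree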
3.4 Higher Order Linear Differential Equations
Consider the linear homogeneous differential equation of order n
x (n) + a1 (t)x (n−1) (t) + · · · + an (t)x(t) = 0, t ∈ I,
(3.22)
and the associated nonhomogeneous equation
x (n) + a1 (t)x (n−1) (t) + · · · + an (t)x(t) = f (t), t ∈ I,
(3.23)
where ai , i = 1, . . . , n, and f are continuous functions on an interval I.
Using the general procedure of reducing a higher order ODE to a system of first-order ODEs, we set
x1 := x, x2 := x ′ , . . . , xn := x (n−1) .
The homogeneous equation (3.22) is equivalent to the first-order linear differential
system
x′1 = x2,
x′2 = x3,
⋮
x′n = −an x1 − an−1 x2 − ⋯ − a1 xn. (3.24)
In other words, the map Λ defined by
x → Λx := ( x, x′, …, x^{(n−1)} )ᵀ
defines a linear isomorphism between the set of solutions of (3.22) and the set of
solutions of the linear system (3.24). From Theorem 3.2, we deduce the following
result.
Theorem 3.7 The set of solutions to (3.22) is a real vector space of dimension n.
Let us fix a basis {x1 , . . . , xn } of the space of solutions to (3.22).
Corollary 3.2 The general solution to (3.22) has the form
x(t) = c1 x1 (t) + · · · + cn xn (t),
(3.25)
where c1 , . . . , cn are arbitrary constants.
Just as in the case of linear differential systems, a collection of n linearly independent solutions of (3.22) is called a fundamental system (or collection) of solutions.
Using the isomorphism Λ, we can define the concept of the Wronskian of a collection of n solutions of (3.22). If {x1 , . . . , xn } is such a collection, then its Wronskian
is the function W : I → R defined by
W(t) := det
( x1 ⋯ xn
  x′1 ⋯ x′n
  ⋮
  x1^{(n−1)} ⋯ xn^{(n−1)} ). (3.26)
Theorem 3.3 has the following immediate consequence.
Theorem 3.8 The collection of solutions {x1 , . . . , xn } to (3.22) is fundamental if and
only if its Wronskian is nonzero at a point or, equivalently, everywhere on I.
Taking into account the special form of the matrix A(t) corresponding to the
system (3.24), we have the following consequence of Liouville’s theorem.
Theorem 3.9 For any t0 , t ∈ I we have
W(t) = W(t0) exp( −∫_{t0}^t a1(s) ds ), (3.27)
Theorem 3.5 shows that the general solution of the nonhomogeneous equation
(3.23) has the form
x(t) = c1 x1(t) + ⋯ + cn xn(t) + x̃(t), (3.28)
where {x1, …, xn} is a fundamental collection of solutions of the homogeneous equation (3.22), and x̃(t) is a particular solution of the nonhomogeneous equation (3.23).
We seek the particular solution using the method of variation of constants already
employed in the investigation of linear differential systems. In other words, we seek x̃(t) of the form
x̃(t) = c1(t)x1(t) + ⋯ + cn(t)xn(t), (3.29)
where {x1 , . . . , xn } is a fundamental collection of solutions of the homogeneous
equation (3.22), and c1 , . . . , cn are unknown functions determined from the system
c′1 x1 + ⋯ + c′n xn = 0,
c′1 x′1 + ⋯ + c′n x′n = 0,
⋮
c′1 x1^{(n−1)} + ⋯ + c′n xn^{(n−1)} = f(t). (3.30)
The determinant of the above system is the Wronskian of the collection {x1 , . . . , xn }
and it is nonzero since this is a fundamental collection. Thus the above system has a
unique solution. It is now easy to verify that the function x̃ given by (3.29) and (3.30) is indeed a solution to (3.23).
We conclude this section with a brief discussion on the power-series method of
solving higher order differential equations. For simplicity, we will limit our discussion to second-order ODEs,
x ′′ (t) + p(t)x ′ (t) + q(t)x(t) = 0, t ∈ I.
(3.31)
We will assume that the functions p(t) and q(t) are real analytic on I and so, for each
t0 , ∃R > 0 such that
p(t) = Σ_{n=0}^∞ pn(t − t0)^n, q(t) = Σ_{n=0}^∞ qn(t − t0)^n, ∀|t − t0| < R. (3.32)
We seek a solution of (3.31) satisfying the Cauchy conditions
x(t0 ) = x0 , x ′ (t0 ) = x1 ,
(3.33)
described by a power series
x(t) = Σ_{n=0}^∞ αn(t − t0)^n. (3.34)
Equation (3.31) leads to the recurrence relations
(k + 2)(k + 1)α_{k+2} + Σ_{j=0}^k [ (j + 1) p_{k−j} α_{j+1} + q_{k−j} α_j ] = 0, ∀k = 0, 1, …. (3.35)
These relations together with the initial conditions (3.33) determine the coefficients
αk uniquely. One can verify directly that the resulting power series (3.34) has positive
radius of convergence. Thus, the Cauchy problem (3.31) and (3.33) admits a unique
real analytic solution.
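The recurrence (3.35) translates directly into code. In the sketch below we take the hypothetical data p(t) ≡ 0, q(t) ≡ 1 and x(t0) = 1, x′(t0) = 0, for which (3.31) becomes x″ + x = 0 and the generated coefficients must be those of cos(t − t0).

import numpy as np

K = 12                                          # number of recurrence steps
p = np.zeros(K); q = np.zeros(K); q[0] = 1.0    # p(t) = 0, q(t) = 1
alpha = np.zeros(K + 2)
alpha[0], alpha[1] = 1.0, 0.0                   # Cauchy data (3.33): x0 = 1, x1 = 0

for k in range(K):                              # recurrence (3.35)
    s = sum((j + 1) * p[k - j] * alpha[j + 1] + q[k - j] * alpha[j]
            for j in range(k + 1))
    alpha[k + 2] = -s / ((k + 2) * (k + 1))

print(alpha[:8])   # 1, 0, -1/2, 0, 1/24, 0, -1/720, 0: the Taylor coefficients of cos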
3.5 Higher Order Linear Differential Equations
with Constant Coefficients
In this section, we will deal with the problem of finding a fundamental collection of
solutions for the differential equation
x (n) + a1 x (n−1) + · · · + an−1 x ′ + an x = 0,
(3.36)
where a1 , . . . , an are real constants. The characteristic polynomial of the differential
equation (3.36) is the algebraic polynomial
L(λ) = λn + a1 λn−1 + · · · + an .
(3.37)
To any polynomial of degree ≤ n,
P(λ) = Σ_{k=0}^n pk λ^k,
we associate the differential operator
P(D) = Σ_{k=0}^n pk D^k, D^k := d^k/dt^k. (3.38)
This acts on the space C n (R) of functions n-times differentiable on R with continuous
n-th order derivatives according to the rule
x → P(D)x := Σ_{k=0}^n pk D^k x = Σ_{k=0}^n pk ( d^k x/dt^k ). (3.39)
Note that (3.36) can be rewritten in the compact form
L(D)x = 0.
The key fact for the problem at hand is the following equality
L(D)eλt = L(λ)eλt , ∀t ∈ R, λ ∈ C.
(3.40)
From equality (3.40), it follows that, if λ is a root of the characteristic polynomial,
then eλt is a solution of (3.36). If λ is a root of multiplicity m(λ) of L(λ), then we
define
Sλ := { e^{λt}, t e^{λt}, …, t^{m(λ)−1} e^{λt} }, if λ ∈ R,
Sλ := { Re e^{λt}, Im e^{λt}, …, t^{m(λ)−1} Re e^{λt}, t^{m(λ)−1} Im e^{λt} }, if λ ∈ C \ R,
where Re and respectively Im denote the real and respectively imaginary part of a
complex number. Note that, since the coefficients a1 , . . . , an are real, we have
Sλ = Sλ̄ ,
for any root λ of L, where λ̄ denotes the complex conjugate of λ. Moreover, if
λ = a + ib, b ≠ 0, is a root with multiplicity m(λ), then
Sλ = { e^{at} cos bt, e^{at} sin bt, …, t^{m(λ)−1} e^{at} cos bt, t^{m(λ)−1} e^{at} sin bt }.
Theorem 3.10 Let RL be the set of roots of the characteristic polynomial L(λ). For each λ ∈ RL we denote by m(λ) its multiplicity. Then the collection
S := ∪_{λ∈RL} Sλ (3.41)
is a fundamental collection of solutions of equation (3.36).
Proof The proof relies on the following generalization of the product formula.
Lemma 3.1 For any x, y ∈ C n (R), we have
L(D)(xy) = Σ_{ℓ=0}^n (1/ℓ!) L^{(ℓ)}(D)x · D^ℓ y, (3.42)
where L^{(ℓ)}(D) is the differential operator associated with the polynomial L^{(ℓ)}(λ) := (d^ℓ/dλ^ℓ) L(λ).
Proof Using the product formula, we deduce that L(D)(xy) has the form
L(D)(xy) = Σ_{ℓ=0}^n Lℓ(D)x · D^ℓ y, (3.43)
where Lℓ(λ) are certain polynomials of degree ≤ n − ℓ. In (3.43) we let x = e^{λt}, y = e^{µt}, where λ, µ are arbitrary complex numbers. From (3.40) and (3.43), we obtain the equality
L(λ + µ) = e^{−(λ+µ)t} L(D)e^{(λ+µ)t} = Σ_{ℓ=0}^n Lℓ(λ) µ^ℓ. (3.44)
On the other hand, Taylor’s formula implies
L(λ + µ) = Σ_{ℓ=0}^n (1/ℓ!) L^{(ℓ)}(λ) µ^ℓ, ∀λ, µ ∈ C.
Comparing the last equality with (3.44), we deduce that Lℓ(λ) = (1/ℓ!) L^{(ℓ)}(λ).
Let us now prove that any function in the collection (3.41) is indeed a solution of
(3.36). Let
x(t) = t^r e^{λt}, λ ∈ RL, 0 ≤ r < m(λ).
Lemma 3.1 implies that
L(D)x = Σ_{ℓ=0}^n (1/ℓ!) L^{(ℓ)}(D)e^{λt} D^ℓ t^r = Σ_{ℓ=0}^r (1/ℓ!) L^{(ℓ)}(λ) e^{λt} D^ℓ t^r = 0.
If λ is a complex number, then the above equality also implies L(D) Re x =
L(D) Im x = 0.
Since the complex roots of L come in conjugate pairs, we conclude that the set S in
(3.41) consists of exactly n real solutions of (3.36). To prove the theorem, it suffices
to show that the functions in S are linearly independent. We argue by contradiction
and we assume that they are linearly dependent. Observing that for any λ ∈ RL we
have
Re( t^r e^{λt} ) = (t^r/2)( e^{λt} + e^{λ̄t} ), Im( t^r e^{λt} ) = (t^r/2i)( e^{λt} − e^{λ̄t} ),
and that the roots of L come in conjugate pairs, we deduce from the assumed linear
dependence of the collection S that there exists a collection of complex polynomials
Pλ (t), λ ∈ RL , not all trivial, such that
Σ_{λ∈RL} Pλ(t) e^{λt} = 0.
The following elementary result shows that such nontrivial polynomials do not exist.
Lemma 3.2 Suppose that µ1 , . . . , µk are pairwise distinct complex numbers. If
P1 (t), . . . , Pk (t) are complex polynomials such that
P1(t)e^{µ1 t} + ⋯ + Pk(t)e^{µk t} = 0, ∀t ∈ R,
then P1 (t) ≡ · · · ≡ Pk (t) ≡ 0.
Proof We argue by induction on k. The result is obviously true for k = 1. Assuming that the result is true for k, we prove that it is true for k + 1. Suppose that
µ0 , µ1 , . . . , µk are pairwise distinct complex numbers and P0 (t), P1 (t), . . . , Pk (t)
are complex polynomials such that
P0(t)e^{µ0 t} + P1(t)e^{µ1 t} + ⋯ + Pk(t)e^{µk t} ≡ 0. (3.45)
Set
m := max{ deg P0, deg P1, …, deg Pk }. (3.46)
We deduce that
P0(t) + Σ_{j=1}^k Pj(t) e^{zj t} ≡ 0, zj := µj − µ0.
We differentiate the above equality (m + 1) times and, using Lemma 3.1, we deduce that
0 ≡ Σ_{j=1}^k ( Σ_{ℓ=0}^{m+1} (zj^ℓ/ℓ!) Pj^{(ℓ)}(t) ) e^{zj t} = Σ_{j=1}^k Pj(t + zj) e^{zj t},
where the last equality follows from Taylor’s formula and (3.46).
The induction assumption implies that Pj(t + zj) ≡ 0, ∀j = 1, …, k. Using this fact in (3.45), we deduce that
P0(t)e^{µ0 t} ≡ 0,
so that P0(t) is also identically zero.
This completes the proof of Theorem 3.10.
Let us briefly discuss the nonhomogeneous equation associated with (3.36), that
is, the equation
x^{(n)} + a1 x^{(n−1)} + ⋯ + an x = f(t), t ∈ I. (3.47)
We have seen that the knowledge of a fundamental collection of solutions to the
homogeneous equations allows us to determine a solution of the nonhomogeneous
equation by using the method of variation of constants. When the equation has
constant coefficients and f (t) has the special form described below, this process
simplifies.
A complex-valued function f : I → C is called a quasipolynomial if it is a
linear combination, with complex coefficients, of functions of the form t k eµt , where
k ∈ Z≥0 , µ ∈ C. A real-valued function f : I → R is called a quasipolynomial if it
is the real part of a complex quasipolynomial. For example, the functions t^k e^{at} cos bt and t^k e^{at} sin bt are real quasipolynomials.
We want to explain how to find a complex-valued solution x(t) of the differential
equation
L(D)x = f (t),
where f (t) is a complex quasipolynomial. Since L(D) has real coefficients, and x(t)
is a solution of the above equation, we have
L(D) Re x = Re f (t).
By linearity, we can reduce the problem to the special situation when
f (t) = P(t)eγt ,
(3.48)
where P(t) is a complex polynomial and γ ∈ C.
Suppose that γ is a root of order ℓ of the characteristic polynomial L(λ). (When ℓ = 0, this means that L(γ) ≠ 0.) We seek a solution of the form
x(t) = t^ℓ Q(t) e^{γt},
(3.49)
where Q is a complex polynomial to be determined. Using Lemma 3.1, we deduce
from the equality L(D)x = f (t) that
P(t) = Σ_{k=0}^n (1/k!) L^{(k)}(γ) D^k( t^ℓ Q(t) ) = Σ_{k=ℓ}^n (1/k!) L^{(k)}(γ) D^k( t^ℓ Q(t) ). (3.50)
The last equality leads to an upper triangular linear system in the coefficients of Q(t)
which can then be determined in terms of the coefficients of P(t).
We will illustrate the above general considerations on a physical model described
by a second-order linear differential equation.
3.5.1 The Harmonic Oscillator
Consider the equation of the harmonic oscillator in the presence of friction (see
Sect. 1.3.4)
mx″ + bx′ + ω²x = f(t), t ∈ R, (3.51)
where m, b, ω² are positive constants. The associated characteristic equation
mλ² + bλ + ω² = 0
has roots
λ1,2 = −b/(2m) ± √( (b/(2m))² − ω²/m ).
We distinguish several cases.
1. b2 − 4mω 2 > 0. This corresponds to the case where the friction coefficient b is
“large”, λ1 and λ2 are real, and the general solution of (3.51) has the form
x(t) = C1 e^{λ1 t} + C2 e^{λ2 t} + x̃(t),
where C1 and C2 are arbitrary constants, and x̃(t) is a particular solution of the nonhomogeneous equation.
The function x̃(t) is called a “forced solution” of the equation. Since λ1 and λ2
are negative, in the absence of the external force f , the motion dies down fast,
converging exponentially to 0.
2. b² − 4mω² = 0. In this case,
λ1 = λ2 = −b/(2m),
and the general solution of equation (3.51) has the form
x(t) = C1 e^{−bt/(2m)} + C2 t e^{−bt/(2m)} + x̃(t).
3. b² − 4mω² < 0. This is the most interesting case from a physics viewpoint. In this case,
λ1 = −b/(2m) + iβ, λ2 = −b/(2m) − iβ,
where β² = −( b/(2m) )² + ω²/m.
According to the general theory, the general solution of (3.51) has the form
x(t) = ( C1 cos βt + C2 sin βt ) e^{−bt/(2m)} + x̃(t). (3.52)
Let us assume that the external force has a harmonic character as well, that is,
f (t) = a cos νt or f (t) = a sin νt,
where the frequency ν and the amplitude a are real, nonzero, constants. We seek a
particular solution of the form
x̃(t) = a1 cos νt + a2 sin νt.
When f = a cos νt, we find
x̃(t) = ( a(ω² − mν²) cos νt + abν sin νt ) / ( (ω² − mν²)² + b²ν² ). (3.53)
Interestingly, as t → ∞, the general solution (3.52) is asymptotic to the particular
solution (3.53), that is,
lim_{t→∞} ( x(t) − x̃(t) ) = 0,
so that, for t sufficiently large, the general solution is practically indistinguishable from the particular forced solution x̃.
Consider now the case when the frequency ν of the external perturbation is equal
to the characteristic frequency of the oscillatory system, that is, ν = ω/√m. Then
x̃(t) = ( a/(bν) ) sin νt. (3.54)
As can be seen from (3.54), the frequency of x̃ is then equal to the characteristic frequency of the oscillatory system and so, if b ≈ 0, that is, the friction is practically zero, then the amplitude a/(bν) ≈ ∞. This is the resonance phenomenon often encountered in
oscillatory mechanical systems. Theoretically, it manifests itself when the friction
is negligible and the frequency of the external force is equal to the characteristic
frequency of the system.
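The mechanism is visible in the amplitude of the forced solution (3.53), which equals a/√( (ω² − mν²)² + b²ν² ) and reduces to a/(bν) at ν = ω/√m. A small sketch with illustrative values m = ω² = a = 1 (the parameter values are not from the text):

import numpy as np

m, omega2, a = 1.0, 1.0, 1.0        # illustrative values
amplitude = lambda nu, b: a / np.sqrt((omega2 - m * nu**2) ** 2 + (b * nu) ** 2)

nu_res = np.sqrt(omega2 / m)        # characteristic frequency omega/sqrt(m)
for b in (1.0, 0.1, 0.01):
    print(f"b = {b:5.2f}: amplitude at resonance = {amplitude(nu_res, b):8.1f}")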
Equally interesting due to its practical applications is the resonance phenomenon
in the differential equation (1.61) of the oscillatory electrical circuit,
LI ′′ + RI ′ + C −1 I = f (t).
(3.55)
In this case, the characteristic frequency is
ν = (LC)^{−1/2},
and the resonance phenomenon appears when the source f = U ′ is a function of the
form
f (t) = a cos νt + b sin νt,
and the resistance R of the system is negligible. If Eq. (3.55) describes the behavior of an oscillatory electrical circuit in a receiver, U = ∫ f(t) dt is the difference of
potential between the antenna and the ground due to a certain source broadcasting
electromagnetic signals with frequency ν. Then the electrical current of the form
similar to (3.53),
I0(t) = ( a(C^{−1} − Lν²) cos νt + aRν sin νt ) / ( (C^{−1} − Lν²)² + R²ν² ),
develops inside the system while the other components of the general solution (3.52)
are “dying down” after a sufficiently long period of time. The amplitude of the
oscillation I0 depends on the frequency ν of the broadcasting source, but also on the
internal parameters L (inductance), R (resistance) and C (capacitance). To optimally
select a signal from among the signals coming from several broadcast sources it is
necessary to maximize the amplitude of the current I0 . This is achieved by triggering
the phenomenon of resonance, that is, by choosing a capacitance C so that the internal
frequency matches the broadcast frequency ν, more specifically, C = (Lν 2 )−1 . In
this way, the operation of tuning-in boils down to inducing a resonance.
3.6 Linear Differential Systems with Constant Coefficients
We will investigate the differential system
x′ = Ax, t ∈ R,
(3.56)
where A = (aij )1≤i,j≤n is a constant, real, n × n matrix. We denote by SA (t) the
fundamental matrix of (3.56) uniquely determined by the initial condition
SA (0) = 1,
where 1 denotes the identity matrix.
Proposition 3.1 The family {SA (t); t ∈ R} satisfies the following properties.
(i) SA (t + s) = SA (t)SA (s), ∀t, s ∈ R.
(ii) SA (0) = 1.
(iii) limt→t0 SA (t)x = SA (t0 )x, ∀x ∈ Rn , t0 ∈ R.
Proof The group property was already established in Theorem 2.12 and follows
from the uniqueness of solutions of the Cauchy problems associated with (3.56): the
functions Z(t) = SA (t)SA (s) and Y (t) = SA(t + s) both satisfy (3.56) with the initial
condition Y (0) = Z(0) = SA (s). Property (ii) follows from the definition, while (iii)
follows from the fact that the function t → SA (t)x is a solution of (3.56) and, in
particular, it is continuous.
Proposition 3.1 expresses the fact that the family {SA (t); t ∈ R} is a one-parameter
group of linear transformations of the space Rn . Equality (iii), which can be easily
seen to hold in the stronger sense of the norm of the space of n×n matrices, expresses
the continuity property of the group SA (t). The map t → SA (t) satisfies the differential
equation
(d/dt) SA(t)x = A SA(t)x, ∀t ∈ R, ∀x ∈ Rn,
and thus
Ax = (d/dt) SA(t)x |_{t=0} = lim_{t→0} (1/t)( SA(t)x − x ). (3.57)
Equality (3.57) expresses the fact that A is the generator of the one-parameter group
SA (t).
We next investigate the structure of the fundamental matrix SA (t) and the ways
we can compute it. To do this, we need to digress briefly and discuss series of n × n
matrices.
Matrix-valued functions. To a sequence of n × n matrices {Ak }k≥0 , we associate
the formal series
Σ_{k=0}^∞ Ak
and we want to give a precise meaning to equalities of the form
A = Σ_{k=0}^∞ Ak. (3.58)
We recall (see (A.5) for more details) that the norm of an n × n matrix A = (aij)_{1≤i,j≤n} is defined by
‖A‖ := max_i Σ_{j=1}^n |aij|.
Definition 3.1 We say that the series of n × n matrices Σ_{k=0}^∞ Ak converges to A, and we express this as in (3.58), if the sequence of partial sums
B_N = Σ_{k=0}^N Ak
converges in norm to A, that is,
lim_{N→∞} ‖B_N − A‖ = 0. (3.59)
Since the convergence (3.59) is equivalent to entry-wise convergence, it is not
hard to see that Cauchy’s theorem on the convergence of numerical series is also
valid in the case of matrix series. In particular, we have the following convergence
criterion.
Proposition 3.2 If the matrix series Σ_{k=0}^∞ Ak is majorized by a convergent numerical series, that is,
‖Ak‖ ≤ ak, where Σ_{k=0}^∞ ak < ∞,
then the series Σ_{k=0}^∞ Ak is convergent.
Using the concept of convergent matrix series, we can define certain functions
with matrices as arguments. For example, if f is a numerical analytic function of the
form
f(λ) = Σ_{j=0}^∞ aj λ^j, |λ| < R,
we then define
f(A) := Σ_{j=0}^∞ aj A^j, ‖A‖ < R. (3.60)
According to Proposition 3.2, the series (3.60) is convergent for ‖A‖ < R. In this fashion, we can extend to matrices the exponential, logarithm, cos, sin, etc. In particular, for any t ∈ R we can define
e^{tA} = Σ_{j=0}^∞ (t^j/j!) A^j, (3.61)
where A is an arbitrary n × n matrix. Proposition 3.2 implies that series (3.61)
converges for any t and any A.
Theorem 3.11 e^{tA} = SA(t).
Proof Since e^{tA}|_{t=0} = 1, it suffices to show that e^{tA} satisfies the differential equation (3.56).
Using classical analysis results, we deduce that series (3.61) can be term-by-term
differentiated and we have
(d/dt) e^{tA} = A e^{tA}, ∀t ∈ R, (3.62)
which proves the claim in the theorem.
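Truncating the series (3.61) gives a simple (if not numerically optimal) way of computing e^{tA}. The sketch below compares the partial sums with SciPy's built-in matrix exponential for the rotation generator A = [[0, 1], [−1, 0]]:

import numpy as np
from scipy.linalg import expm

def exp_series(A, t, terms=60):
    """Partial sum of (3.61): sum_j t^j A^j / j!"""
    S = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms):
        term = term @ (t * A) / j
        S = S + term
    return S

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # e^{tA} is the rotation by angle t
t = np.pi / 3
print(np.max(np.abs(exp_series(A, t) - expm(t * A))))   # ~ 1e-16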
Remark 3.3 It is worth mentioning that, in the special case of the constant coefficients
system (3.56), the variation of constants formula (3.21) becomes
x(t) = e^{(t−t0)A} x0 + ∫_{t0}^t e^{(t−s)A} b(s) ds, ∀t ∈ R.
We can now prove a theorem on the structure of the fundamental matrix etA .
Theorem 3.12 The (i, j)-entry of the matrix etA has the form
Σ_{ℓ=1}^k p_{ℓ,i,j}(t) e^{λℓ t}, i, j = 1, …, n, (3.63)
where λℓ are the roots of the equation det(λ1 − A) = 0, pℓ,i,j is an algebraic polynomial of degree at most mℓ − 1, and mℓ is the multiplicity of the root λℓ.
Proof Denote by y(t) a column of the matrix etA and by P(λ) the characteristic
polynomial of the matrix A, that is, P(λ) = det(λ1 − A). From the Cayley–Hamilton theorem,
we deduce that P(A) = 0. Using the fact that y(t) satisfies (3.56), we deduce that the
column y(t) satisfies the linear differential system with constant coefficients
P(D)y(t) = P(A)y(t) = 0,
where the differential operator P(D) is defined as in (3.38). Invoking Theorem 3.10,
we deduce that the components of y(t) have the form (3.63).
We now want to give an explicit formula for computing etA in the form of an
integral in the complex plane. Denote by λ1 , . . . , λk the eigenvalues of the matrix A
and by m1 , . . . , mk their (algebraic) multiplicities.
Let Γ denote a closed contour in the complex plane that surrounds the eigenvalues
λ1 , . . . , λk ; see Fig. 3.1.
The next theorem is the main result of this section.
Fig. 3.1 A contour surrounding the spectrum of A
Theorem 3.13 We have the equality
e^{tA} = (1/2πi) ∫_Γ e^{tλ} (λ1 − A)^{−1} dλ, ∀t ∈ R. (3.64)
Proof We denote by X(t) the matrix
X(t) := (1/2πi) ∫_Γ e^{tλ} (λ1 − A)^{−1} dλ, ∀t ∈ R. (3.65)
To prove (3.64), it suffices to verify the equalities
X(0) = 1, X′(t) = AX(t), ∀t ∈ R. (3.66)
The equality (3.65) implies that
X′(t) = (1/2πi) ∫_Γ e^{tλ} λ(λ1 − A)^{−1} dλ, ∀t ∈ R.
Using the elementary equality
λ(λ1 − A)^{−1} = A(λ1 − A)^{−1} + 1, ∀λ ∈ Γ,
we deduce that
X′(t) = AX(t) + (1/2πi)( ∫_Γ e^{tλ} dλ ) 1.
From the residue formula, we deduce that
∫_Γ e^{tλ} dλ = 0,
and thus X(t) satisfies the second identity in (3.66).
Let us now compute X(0). Note that we have the equality
(λ1 − A)^{−1} = λ^{−1} Σ_{k=0}^∞ λ^{−k} A^k, ∀|λ| > ‖A‖. (3.67)
To prove this equality, observe first that the series in the right-hand side is convergent for |λ| > ‖A‖. Next, a simple computation shows that
λ^{−1}(λ1 − A) Σ_{k=0}^∞ λ^{−k} A^k = 1,
which obviously implies (3.67).
Taking (3.67) into account, we deduce
X(0) = (1/2πi) ∫_Γ (λ1 − A)^{−1} dλ = Σ_{k=0}^∞ (1/2πi) A^k ∫_Γ λ^{−k−1} dλ. (3.68)
Without loss of generality, we can assume that the contour Γ is the circle
{λ ∈ C; |λ| = ‖A‖ + ε}.
The residue formula then implies
(1/2πi) ∫_Γ λ^{−k−1} dλ = 1 for k = 0, and 0 for k > 0. (3.69)
Using the last equality in (3.68), we deduce X(0) = 1. This completes the proof of
Theorem 3.13.
Remark 3.4 Equality (3.64) can be used to give an alternative proof of the structure
theorem (Theorem 3.12), and also as a method of computing e^{tA}.
We set D(λ) := det(λ1 − A), A(λ) := adj(λ1 − A) = the adjoint matrix 1 of
(λ1 − A). We can rewrite (3.64) in the form
e^{tA} = (1/2πi) ∫_Γ (e^{tλ}/D(λ)) A(λ) dλ = (1/2πi) ∫_Γ e^{tλ}/((λ − λ1)^{m1} · · · (λ − λk)^{mk}) A(λ) dλ.
The residue formula then implies that

e^{tA} = Σ_{j=1}^k R(λj),   (3.70)
where we denote by R(λj) the residue of the matrix-valued function

λ → (e^{tλ}/D(λ)) A(λ).
This residue can be computed using the well-known formula

R(λj) = (1/(mj − 1)!) d^{mj−1}/dλ^{mj−1} [ e^{tλ} ((λ − λj)^{mj}/D(λ)) A(λ) ]|_{λ=λj}.   (3.71)

¹ The (i, j)-entry of A(λ) is the (j, i)-cofactor of (λ1 − A), so that (λ1 − A)A(λ) = D(λ)1.
Equalities (3.70) and (3.71) imply Theorem 3.12. Moreover, the above process shows
that the computation of etA can be performed by using algebraic operations.
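As an illustration, the residue computation (3.70)–(3.71) can be carried out symbolically. The Python/SymPy sketch below (the sample matrix is an arbitrary choice, not from the text) assembles e^{tA} from the residues of e^{tλ}A(λ)/D(λ) and checks the result against SymPy's built-in matrix exponential.

```python
import sympy as sp

t, lam = sp.symbols('t lambda')
A = sp.Matrix([[2, -1], [-2, 3]])     # sample matrix; eigenvalues 1 and 4
n = A.shape[0]
M = lam*sp.eye(n) - A
D = M.det()                            # D(lambda) = det(lambda*1 - A)
Adj = M.adjugate()                     # A(lambda), satisfying M*Adj = D*1

etA = sp.zeros(n, n)
for lj, mj in sp.roots(sp.Poly(D, lam)).items():
    # residue formula (3.71) at the eigenvalue lj of multiplicity mj
    g = sp.exp(t*lam) * (lam - lj)**mj / D
    etA += (g * Adj).applyfunc(
        lambda e: sp.diff(e, lam, mj - 1).subs(lam, lj) / sp.factorial(mj - 1))

print(sp.simplify(etA - (A*t).exp()))  # the zero matrix
```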
3.7 Differentiability in Initial Data
In this section, we have a new look at the problem investigated in Sect. 2.6. Consider the Cauchy problem

x′ = f(t, x), (t, x) ∈ Ω ⊂ R^{n+1},  x(t0) = x0,   (3.72)

where f : Ω → Rn is continuous in (t, x) and locally Lipschitz in x. We denote by x(t; t0, x0) the right-saturated solution of the Cauchy problem (3.72) defined on the right-maximal existence interval [t0, T). We proved in Theorem 2.14 that for any T′ ∈ [t0, T) there exists an η > 0 such that for any

ξ ∈ B(x0, η) = {x ∈ Rn; ‖x − x0‖ ≤ η}

the solution x(t; t0, ξ) is defined on [t0, T′] and the resulting map

B(x0, η) ∋ ξ → x(·; t0, ξ) ∈ C([t0, T′]; Rn)
is continuous. We now investigate the differentiability of the above map.
Theorem 3.14 Under the same assumptions as in Theorem 2.14, assume additionally that the function f is differentiable with respect to x and the differential f_x is continuous with respect to (t, x). Then the function x(t; t0, ξ) is differentiable with respect to ξ on B(x0, η) and its differential X(t) := x_ξ(t; t0, ξ) is the fundamental matrix of the linear system (the variation equation)

y′ = f_x(t, x(t; t0, ξ))y, t0 ≤ t ≤ T′,   (3.73)

satisfying the initial condition

X(t0) = 1.   (3.74)
Proof To prove that the function x(t; t0, ξ) is differentiable in ξ, we consider two arbitrary vectors ξ, ξ̃ ∈ B(x0, η). We have the equality

x(t; t0, ξ̃) − x(t; t0, ξ) − X(t)(ξ̃ − ξ) = ∫_{t0}^t [ f(s, x(s; t0, ξ̃)) − f(s, x(s; t0, ξ)) − f_x(s, x(s; t0, ξ)) X(s)(ξ̃ − ξ) ] ds,   (3.75)
where X(t) is the fundamental matrix of the linear system (3.73) that satisfies the initial condition (3.74). On the other hand, the mean value theorem implies that

f(s, x(s; t0, ξ̃)) − f(s, x(s; t0, ξ)) = f_x(s, x(s; t0, ξ))(x(s; t0, ξ̃) − x(s; t0, ξ)) + R(s, ξ̃, ξ).   (3.76)
Since f is locally Lipschitz, there exists a constant L > 0 such that, ∀t ∈ [t0, T′], we have

‖x(t; t0, ξ̃) − x(t; t0, ξ)‖ ≤ ‖ξ̃ − ξ‖ + L ∫_{t0}^t ‖x(s; t0, ξ̃) − x(s; t0, ξ)‖ ds.

Invoking Gronwall's lemma, we deduce that

‖x(s; t0, ξ̃) − x(s; t0, ξ)‖ ≤ ‖ξ̃ − ξ‖ e^{L(s−t0)}, ∀s ∈ [t0, T′].   (3.77)
Inequality (3.77) and the continuity of the differential f_x imply that the remainder R in (3.76) satisfies the estimate

‖R(s, ξ̃, ξ)‖ ≤ ω(ξ̃, ξ)‖ξ̃ − ξ‖,   (3.78)

where

lim_{‖ξ̃−ξ‖→0} ω(ξ̃, ξ) = 0.
Using (3.76) in (3.75), we obtain the estimate

z(t) ≤ (T′ − t0)ω(ξ̃, ξ)‖ξ̃ − ξ‖ + L1 ∫_{t0}^t z(s)ds, ∀t ∈ [t0, T′],   (3.79)

where

z(t) := ‖x(t; t0, ξ̃) − x(t; t0, ξ) − X(t)(ξ̃ − ξ)‖.
Invoking Gronwall's lemma again, we deduce that

‖x(t; t0, ξ̃) − x(t; t0, ξ) − X(t)(ξ̃ − ξ)‖ ≤ (T′ − t0)ω(ξ̃, ξ)‖ξ̃ − ξ‖ e^{L1(T′−t0)} = o(‖ξ̃ − ξ‖).   (3.80)
The last inequality implies (see Appendix A.5) that x_ξ(t; t0, ξ) = X(t).
We next investigate the differentiability with respect to a parameter λ of the
solution x(t; t0 , x0 , λ) of the Cauchy problem
x′ = f(t, x, λ), (t, x) ∈ Ω ⊂ R^{n+1}, λ ∈ U ⊂ R^m,   (3.81)

x(t0) = x0.   (3.82)
The parameter λ = (λ1, . . . , λm) varies in a bounded open subset U of R^m. Fix λ0 ∈ U. Assume that the right-saturated solution x(t; t0, x0, λ0) of (3.81) and (3.82) corresponding to λ = λ0 is defined on the right-maximal interval [t0, T). We have
Theorem 3.15 Let f : Ω × U → Rn be a continuous function, differentiable in the x and λ variables, with the differentials f_x, f_λ continuous in (t, x, λ). Then, for any T′ ∈ [t0, T), there exists a δ > 0 such that the following hold.

(i) The solution x(t; t0, x0, λ) is defined on [t0, T′] for any

λ ∈ B(λ0, δ) := {λ ∈ R^m; ‖λ − λ0‖ ≤ δ}.

(ii) For any t ∈ [t0, T′], the map B(λ0, δ) ∋ λ → x(t; t0, x0, λ) ∈ Rn is differentiable, and the differential y(t) := x_λ(t; t0, x0, λ), an n × m matrix-valued function on [t0, T′], is uniquely determined by the (matrix-valued) linear Cauchy problem

y′(t) = f_x(t, x(t; t0, x0, λ), λ)y(t) + f_λ(t, x(t; t0, x0, λ), λ), ∀t ∈ [t0, T′],   (3.83)

y(t0) = 0.   (3.84)
Proof As in the proof of Theorem 2.15, we can write (3.81) and (3.82) as a new Cauchy problem,

z′ = F(t, z),  z(t0) = ζ := (ξ, λ),   (3.85)

where z = (x, λ) and F(t, z) = (f(t, x, λ), 0) ∈ Rn × Rm. According to Theorem 3.14, the map

ζ → (x(t; t0, ξ, λ), λ) =: z(t; t0, ζ)
is differentiable and its differential

Z(t) := ∂z/∂ζ (t; t0, ζ) = [ ∂x/∂ξ(t; t0, ξ, λ)  ∂x/∂λ(t; t0, ξ, λ) ; 0  1m ]

(1m is the identity m × m matrix) satisfies the differential equation

Z′(t) = F_z(t, z)Z(t), Z(t0) = 1_{n+m}.   (3.86)
Taking into account the description of Z(t) and the equality

F_z(t, z) = [ f_x(t, x, λ)  f_λ(t, x, λ) ; 0  0 ],
we conclude from (3.86) that y(t) := xλ (t; t0 , x0 , λ) satisfies the Cauchy problem
(3.83) and (3.84).
Remark 3.5 The matrix xλ (t; t0 , x0 , λ) is sometimes called the sensitivity matrix and
its entries are known as sensitivity functions. Measuring the changes in the solution
under small variations of the parameter λ, this matrix is an indicator of the robustness
of the system.
Theorem 3.15 is especially useful in the approximation of solutions of differential
systems via the so-called small-parameter method.
Let us denote by x(t, λ) the solution x(t; t0, x0, λ) of the Cauchy problem (3.81) and (3.82). We then have the first-order approximation

x(t, λ) = x(t, λ0) + x_λ(t, λ0)(λ − λ0) + o(‖λ − λ0‖),   (3.87)

where y(t) = x_λ(t, λ0) is the solution of the variation equation (3.83) and (3.84). Thus, in a neighborhood of the parameter λ0, we have

x(t, λ) ≈ x(t, λ0) + x_λ(t, λ0)(λ − λ0).
We have thus reduced the approximation problem to solving a linear differential system. Let us illustrate the technique on the following example:

x′ = x + λtx² + 1, x(0) = 1,   (3.88)

where λ is a sufficiently small parameter. Equation (3.88) is an equation of Riccati type and cannot be solved explicitly. However, for λ = 0 it reduces to a linear equation, whose solution is

x(t, 0) = 2e^t − 1.
According to formula (3.87), the solution x(t, λ) admits the approximation

x(t, λ) = 2e^t − 1 + λy(t) + o(|λ|),

where y(t) = x_λ(t, 0) is the solution of the variation equation

y′ = y + t(2e^t − 1)², y(0) = 0.

Hence

y(t) = ∫_0^t s(2e^s − 1)² e^{t−s} ds.
Thus, for small values of the parameter λ, the solution to problem (3.88) is well approximated by

2e^t − 1 + λe^t(4te^t − 4e^t − 2t² + 5 − te^{−t} − e^{−t}).
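The quality of this approximation is easy to test numerically. The following Python sketch (the value of λ and the time interval are arbitrary choices) integrates (3.88) and the variation equation with SciPy and compares the exact solution with the first-order approximation.

```python
import numpy as np
from scipy.integrate import solve_ivp

lam = 0.01                                            # a small parameter
riccati   = lambda t, x: x + lam*t*x**2 + 1           # equation (3.88)
variation = lambda t, y: y + t*(2*np.exp(t) - 1)**2   # y' = y + t(2e^t - 1)^2

ts = np.linspace(0, 1, 50)
x_exact  = solve_ivp(riccati,   (0, 1), [1.0], t_eval=ts, rtol=1e-10, atol=1e-12).y[0]
y        = solve_ivp(variation, (0, 1), [0.0], t_eval=ts, rtol=1e-10, atol=1e-12).y[0]
x_approx = 2*np.exp(ts) - 1 + lam*y

print(np.max(np.abs(x_exact - x_approx)))             # small, of order lam^2
```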
3.8 Distribution Solutions of Linear Differential Equations
The notion of distribution is a relatively recent extension of the concept of function (introduced by L. Schwartz (1915–2002)) that leads to a rethinking of the foundations of mathematical analysis. Despite its rather abstract character, the concept of distribution has, as will be seen shortly, solid physical underpinnings, which were behind its development.
Given an open interval I of the real axis R (in particular, I could be the whole axis), we denote by C∞(I) the space of infinitely differentiable (or smooth) functions on I. For any ϕ ∈ C∞(I), we denote by supp ϕ its support

supp ϕ = cl{t ∈ I; ϕ(t) ≠ 0},

where cl D denotes the closure in I of the subset D ⊂ I. We denote by C0∞(I) the set of smooth functions ϕ ∈ C∞(I) such that supp ϕ is a compact subset of I. It is not hard to see that C0∞(I) ≠ ∅. Indeed, if I = (a, b) and t0 ∈ I, then the function
ϕ(t) = exp(1/(|t − t0|² − ε²)) if |t − t0| < ε,  ϕ(t) = 0 if |t − t0| ≥ ε,

where 0 < ε < min{t0 − a, b − t0}, belongs to C0∞(I).
The set C0∞ (I) is obviously a vector space over R (or over C if we are dealing with
complex-valued functions) with respect to the usual addition and scalar multiplication
operations. This space can be structured as a topological space by equipping it with
a notion of convergence.
Given a sequence {ϕn}n≥1 ⊂ C0∞(I), we say that it converges to ϕ ∈ C0∞(I), and we denote this by ϕn ⇒ ϕ, if there exists a compact subset K ⊂ I such that

supp ϕn ⊂ K, ∀n,   (3.89)

d^j ϕn/dt^j (t) → d^j ϕ/dt^j (t) uniformly on K, ∀j,   (3.90)

as n → ∞.
We define a distribution (or generalized function) on I to be a linear and continuous
functional on C0∞ (I), that is, a linear map u : C0∞ (I) → R such that
lim_{n→∞} u(ϕn) = u(ϕ)
for any sequence {ϕn } in C0∞ (I) such that ϕn ⇒ ϕ. We denote by u(ϕ) the value of
this functional at ϕ. The set of distributions, which is itself a vector space, is denoted
by D′ (I). We stop the flow of definitions to discuss several important examples of
distributions.
Example 3.1
(a) Any function f : I → R which is Riemann or Lebesgue integrable on any
compact subinterval of I canonically determines a distribution uf : C0∞ (I) → R
defined by

u_f(ϕ) := ∫_I f(t)ϕ(t)dt, ∀ϕ ∈ C0∞(I).   (3.91)
It is not difficult to verify that uf is a linear functional and
uf (ϕn ) → uf (ϕ),
for any sequence ϕn ⇒ ϕ.
On the other hand, if uf (ϕ) = 0, for any ϕ ∈ C0∞ (I), then obviously f (t) = 0
almost everywhere (a.e.) on I. Therefore, the correspondence f → uf is an
injection of the space of locally integrable functions into the space of distributions
on I. In this fashion, we can identify any locally integrable function on I with a
distribution.
(b) The Dirac distribution. Let t0 ∈ R. We define the distribution δ_{t0} by

δ_{t0}(ϕ) = ϕ(t0), ∀ϕ ∈ C0∞(R).   (3.92)
It is easy to see that δt0 is indeed a linear and continuous functional on C0∞ (R),
that is, δt0 ∈ D′ (R). This is called the Dirac distribution (concentrated) at t0 .
Historically speaking, the distribution δt0 is the first nontrivial example of a
distribution and was introduced in 1930 by the physicist P.A.M. Dirac (1902–
1984). In Dirac’s interpretation, δt0 had to be a “function” on R which is zero
everywhere but at t0 , and whose integral had to be 1. Obviously, from a physical
point of view, such a “function” had to represent an impulsive force, but it did
not fit into the traditional setup of mathematical analysis. We could view such
a “function” as the limit as ε ց 0 of the family of functions defined by (see Fig. 3.2)

η_ε(t) := (1/ε) η((t − t0)/ε),   (3.93)

where η ∈ C0∞(R) is a function such that

supp η ⊂ (−1, 1),  ∫_R η(t)dt = 1.
Fig. 3.2 Approximating Dirac's δ-function
Then

supp η_ε ⊂ (t0 − ε, t0 + ε),  ∫_R η_ε(t)dt = 1,
and we can regard ηε as approximating the “function” δt0 . We cannot expect that
ηε converges in any conventional way to a function that we could consider to be
δ_{t0}. However, we have the equality

lim_{ε ց 0} ∫_R η_ε(t)ϕ(t)dt = ϕ(t0) = δ_{t0}(ϕ), ∀ϕ ∈ C0∞(R).
We say that the family {ηε } converges in the sense of distributions to δt0 . It is in
this sense that we can restore Dirac’s original intuition.
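This convergence can also be observed numerically. In the Python sketch below (the bump profile η, the concentration point t0 and the test function ϕ are all arbitrary choices), the integrals ∫ η_ε(t)ϕ(t)dt approach ϕ(t0) as ε decreases.

```python
import numpy as np
from scipy.integrate import quad

bump = lambda s: np.exp(-1.0/(1.0 - s*s)) if abs(s) < 1 else 0.0
c, _ = quad(bump, -1, 1)                  # normalization so that eta has integral 1
eta  = lambda s: bump(s)/c                # supp eta in (-1, 1), integral 1

t0, phi = 0.5, np.cos                     # concentration point, smooth test function
for eps in (0.5, 0.1, 0.01):
    val, _ = quad(lambda t: eta((t - t0)/eps)/eps * phi(t), t0 - eps, t0 + eps)
    print(eps, val)                       # values approach phi(t0) = cos(0.5)
```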
An important property of distributions is that they can be differentiated as many times as we please. For any positive integer j, we define the j-th order derivative of the distribution u ∈ D′(I) to be the distribution u^{(j)} given by

u^{(j)}(ϕ) := (−1)^j u(ϕ^{(j)}), ∀ϕ ∈ C0∞(I).   (3.94)
Based on the definition, it is not difficult to verify that the linear functional
u(j) : C0∞ (I) → R
is indeed continuous and thus defines a distribution on I.
If u is a distribution associated with a C¹-function f, that is,

u(ϕ) = u_f(ϕ) = ∫_I f(t)ϕ(t)dt, ∀ϕ ∈ C0∞(I),

then its distributional derivative is the distribution determined by the derivative of f, that is,

u′_f(ϕ) = −∫_I f(t)ϕ′(t)dt = ∫_I f′(t)ϕ(t)dt = u_{f′}(ϕ), ∀ϕ ∈ C0∞(I).
In general, if f is a locally integrable function, its distributional derivative is no longer a function, but the distribution u′_f defined by

u′_f(ϕ) = −∫_I f(t)ϕ′(t)dt, ∀ϕ ∈ C0∞(I).
Example 3.2 (Heaviside function) Consider the Heaviside function H : R → R,

H(t) = 1 if t ≥ 0,  H(t) = 0 if t < 0.
Its distributional derivative, which we continue to denote by H′, is thus defined by

H′(ϕ) = −∫_R H(t)ϕ′(t)dt = −∫_0^∞ ϕ′(t)dt = ϕ(0), ∀ϕ ∈ C0∞(R).
Hence H ′ = δ0 .
Given a smooth function a ∈ C ∞ (I) and a distribution u ∈ D′ (I), we can define
the product au ∈ D′ (I) by the equality
(au)(ϕ) := u(aϕ), ∀ϕ ∈ C0∞ (I).
We denote by D′(I; Rn) the space of vector-valued distributions y = (y1, . . . , yn)^T, where yi ∈ D′(I), ∀i = 1, . . . , n. We set y′ := (y1′, . . . , yn′)^T.
Consider the differential equation with constant coefficients
x (n) + a1 x (n−1) + · · · + an x = f on I,
(3.95)
where f ∈ D′(I). A solution of (3.95) is by definition a distribution x ∈ D′(I) that satisfies (3.95) in the sense of distributions, that is,

x((−1)^n ϕ^{(n)} + (−1)^{n−1} a1 ϕ^{(n−1)} + · · · + an ϕ) = f(ϕ), ∀ϕ ∈ C0∞(I).
More generally, given the differential system
y′ = Ay + f ,
(3.96)
where f ∈ D′(I; Rn) and A = (aij) is an n × n real matrix, we say that y ∈ D′(I; Rn) is a solution of (3.96) if for any ϕ ∈ C0∞(I) we have

−yi(ϕ′) = Σ_{j=1}^n aij yj(ϕ) + fi(ϕ), ∀i = 1, . . . , n.
Clearly, x ∈ D′(I) is a solution of (3.95) if and only if

y = (x, x′, . . . , x^{(n−1)})^T

is a solution of system (3.24).
If f is continuous, then, as shown in Sect. 3.5, the set of C^n-solutions (let us call them classical) of Eq. (3.95) is given by

C1x1 + · · · + Cnxn + x̃,   (3.97)

where {x1, . . . , xn} is a fundamental collection of solutions of the homogeneous equation and x̃ is a particular solution of the nonhomogeneous equation. It is then natural to ask if there are distribution solutions not contained in the family (3.97). The answer turns out to be negative and is the object of Theorem 3.16.
Theorem 3.16 If f is a continuous function on I, then the only solutions of equation
(3.95) (respectively system (3.96)) are the classical ones.
Proof We show first that the only distribution solutions of the equation

x′ = 0   (3.98)

are the classical ones, that is, the constant functions. Indeed, if x ∈ D′(R) is a solution of (3.98), then, by definition, we have

x(ϕ′) = 0, ∀ϕ ∈ C0∞(R).
Hence

x(ψ) = 0   (3.99)

for any function ψ ∈ C0∞(R) of the form ψ = ϕ′, ϕ ∈ C0∞(R), that is, for any ψ ∈ C0∞(R) such that

∫_R ψ(t)dt = 0.

Fix an arbitrary function χ ∈ C0∞(R) such that

∫_R χ(t)dt = 1.
For any ϕ ∈ C0∞(R), we have

ϕ(t) = χ(t) ∫_R ϕ(s)ds + [ϕ(t) − χ(t) ∫_R ϕ(s)ds],

and thus, according to (3.99),

x(ϕ) = x(χ) ∫_R ϕ(t)dt + x(ϕ − χ ∫_R ϕ(t)dt) = x(χ) ∫_R ϕ(t)dt.
If we set C := x(χ), we deduce that

x(ϕ) = ∫_R Cϕ(t)dt, ∀ϕ ∈ C0∞(R),

and thus x = C.
Consider now the equation

x′ = g ∈ C(I).   (3.100)

This equation can also be rewritten in the form

(x − G)′ = 0,  G(t) := ∫_{t0}^t g(s)ds, t0 ∈ I.
From the previous discussion, we infer that x = G + C, where C is a constant.
If g ∈ C(I; Rn) and y ∈ D′(I; Rn) satisfies y′ = g, then, as above, we conclude that

(y − G)′ = 0,  where G(t) = ∫_{t0}^t g(s)ds, t0 ∈ I,
and we infer as before that
y = G + c,
with c a constant vector in Rn .
Consider now a solution y ∈ D′(I; Rn) of system (3.96), where f ∈ C(I; Rn). If we define the product e^{−tA}y ∈ D′(I; Rn) as above, we observe that we have the equality of distributions

(e^{−tA}y)′ = (e^{−tA})′y + e^{−tA}y′,

and thus

(e^{−tA}y)′ = e^{−tA}f in D′(I; Rn).

The above discussion shows that

e^{−tA}y = c + ∫_{t0}^t e^{−sA}f(s)ds on I.
Thus, the distribution solutions of (3.96) are the classical solutions given by the formula of variation of constants (3.21). In particular, it follows that the only distribution
solutions of (3.95) are the classical ones, given by (3.97).
We close this section with some examples.
Example 3.3 Consider the differential equation

x′ + ax = µδ0 on R,   (3.101)

where δ0 denotes the Dirac distribution concentrated at the origin and µ ∈ R. If x ∈ D′(R) is a solution of (3.101), then, by definition,

−x(ϕ′) + ax(ϕ) = µϕ(0), ∀ϕ ∈ C0∞(R).   (3.102)
If, in (3.102), we first choose ϕ such that supp ϕ ⊂ (−∞, 0), and then ϕ such that supp ϕ ⊂ (0, ∞), we deduce that

x′ + ax = 0 on R \ {0}

in the sense of distributions. Thus, according to Theorem 3.16, x is a classical solution to x′ + ax = 0 on R \ {0}. Hence, there exist constants C1, C2 ∈ R such that

x(t) = C1 e^{−at} if t > 0,  x(t) = C2 e^{−at} if t < 0.   (3.103)
On the other hand, the function x defined by (3.103) is locally integrable on R and
of class C 1 on R \ {0}. We have, therefore,
x(ϕ′) = ∫_R x(t)ϕ′(t)dt = ∫_{−∞}^0 x(t)ϕ′(t)dt + ∫_0^∞ x(t)ϕ′(t)dt

= −∫_{−∞}^0 ẋ(t)ϕ(t)dt + x(0−)ϕ(0) − x(0+)ϕ(0) − ∫_0^∞ ẋ(t)ϕ(t)dt   (3.104)

= −∫_R ẋ(t)ϕ(t)dt + ϕ(0)(x(0−) − x(0+)),
where ẋ denotes the usual derivative of x on R \ {0}. We rewrite (3.102) in the form

x(ϕ′) = a ∫_R x(t)ϕ(t)dt − µϕ(0), ∀ϕ ∈ C0∞(R).   (3.105)
Using (3.104) in the above equality, and taking into account that ẋ = −ax on R \ {0}, we deduce that

ϕ(0)(x(0−) − x(0+)) = −µϕ(0), ∀ϕ ∈ C0∞(R).

Hence

x(0+) − x(0−) = µ,
and thus

x(t) = C e^{−at} if t > 0,  x(t) = (C − µ)e^{−at} if t < 0,

where C is a constant. As was to be expected, the solutions of (3.101) are not of class C¹; they are not even continuous. They have a jump of size µ at t = 0.
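One can observe this jump numerically by smearing the Dirac distribution into a tall pulse, in the spirit of (3.93). A minimal Python sketch follows; the values of a, µ and the pulse width are arbitrary choices, and we start from the solution branch x ≡ 0 for t < 0 (that is, C = µ above), so that x(t) = µe^{−at} for t > 0.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, mu, eps = 1.0, 2.0, 1e-3
# approximate mu*delta_0 by a pulse of height mu/eps supported on (0, eps)
rhs = lambda t, x: -a*x + (mu/eps if 0.0 < t < eps else 0.0)

sol = solve_ivp(rhs, (-0.5, 1.0), [0.0], max_step=eps/5)
print(sol.y[0, -1], mu*np.exp(-a*1.0))   # both close to mu*e^{-a}: jump of size mu at t = 0
```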
Example 3.4 Consider the second-order differential equation

mx″ + bx′ + ω²x = µδ0.   (3.106)

Taking into account the interpretation of the Dirac distribution δ0, Eq. (3.106) describes the elastic motion of a particle of mass m acted upon by an impulsive force of size µ at time t = 0.
Such phenomena appear, for example, in collisions or in cases when a cord or
an elastic membrane is hit. A physical analysis of this phenomenon leads us to the
conclusion that the effect of the impulsive force µδ0 is to produce a jump in the
velocity, that is, an instantaneous change in the momentum of the particle. We will
reach a similar conclusion theoretically, by solving Eq. (3.106).
If the distribution x is a solution of (3.106), then

∫_R x(t)(mϕ″(t) − bϕ′(t) + ω²ϕ(t))dt = µϕ(0), ∀ϕ ∈ C0∞(R).   (3.107)
Choosing ϕ such that supp ϕ ⊂ (−∞, 0), and then such that supp ϕ ⊂ (0, ∞), we deduce that x satisfies, in the sense of distributions, the equation

mx″ + bx′ + ω²x = 0 on R \ {0}.
According to Theorem 3.16, the distribution x is a classical solution of this equation on each of the intervals (−∞, 0) and (0, ∞). Integrating by parts, we deduce that

∫_R x(t)ϕ″(t)dt = ∫_{−∞}^0 x(t)ϕ″(t)dt + ∫_0^∞ x(t)ϕ″(t)dt
= (x(0−) − x(0+))ϕ′(0) + (ẋ(0+) − ẋ(0−))ϕ(0) + ∫_{−∞}^0 ẍ(t)ϕ(t)dt + ∫_0^∞ ẍ(t)ϕ(t)dt,

and

∫_R x(t)ϕ′(t)dt = (x(0−) − x(0+))ϕ(0) − ∫_{−∞}^0 ẋ(t)ϕ(t)dt − ∫_0^∞ ẋ(t)ϕ(t)dt,
where ẋ and ẍ denote, respectively, the first and the second classical derivative of x on R \ {0}. Taking into account the equalities

mẍ + bẋ + ω²x = 0 on R \ {0}

and (3.107), we deduce that

m(x(0−) − x(0+))ϕ′(0) + m(ẋ(0+) − ẋ(0−))ϕ(0) + b(x(0−) − x(0+))ϕ(0) = µϕ(0), ∀ϕ ∈ C0∞(R).
Hence

x(0−) = x(0+),  ẋ(0+) − ẋ(0−) = µ/m.

Thus, the function x is continuous, but its derivative ẋ is discontinuous at the origin, where it has a jump of size µ/m. In other words, x is the usual solution of the following system:
system
⎧
⎪
mẍ(t) + bẋ(t) + ω 2 x(t) = 0, t < 0,
⎪
⎪
⎪
⎨x(0) = x0 , ẋ(0) = x1 ,
mẍ(t) + bẋ(t) + ω 2 x(t) = 0, t > 0,
⎪
⎪
⎪
⎪
⎩x(0) = x0 , ẋ(0) = x1 + µ .
m
Problems
3.1 Consider the differential system

x′ = A(t)x, t ≥ 0,   (3.108)

where A(t) is an n × n matrix whose entries depend continuously on t ∈ [0, ∞) and which satisfies the condition

lim inf_{t→∞} ∫_0^t tr A(s)ds > −∞.   (3.109)
Let X(t) be a fundamental matrix of system (3.108) that is bounded as a function of t ∈ [0, ∞). Prove that the function

[0, ∞) ∋ t → ‖X(t)^{−1}‖ ∈ (0, ∞)

is also bounded.

Hint. Use Liouville's theorem.
3.2 Prove that, if all the solutions of system (3.108) are bounded on [0, ∞) and
(3.109) holds, then any solution of the system
x′ = B(t)x, t ≥ 0,
(3.110)
is bounded on [0, ∞). Above, B(t) is an n × n matrix whose entries depend continuously on t ≥ 0 and satisfy the condition

∫_0^∞ ‖B(s) − A(s)‖ ds < ∞.
Hint. Rewrite system (3.110) in the form x′ = A(t)x + (B(t) − A(t))x, and then use the formula of variation of constants.
3.3 Prove that all the solutions of the differential equation

x′ + (1 + 2/(t(1 + t²)))x = 0

are bounded on [0, ∞).

Hint. Interpret the function f(t) = −2x(t)(t(1 + t²))^{−1} as a nonhomogeneous term and then use the formula of variation of constants.
3.4 Express as a power series the solution of the Cauchy problem
x ′′ − tx = 0, x(0) = 0, x ′ (0) = 1.
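A sketch of how the coefficients can be generated (assuming the ansatz x = Σ a_k t^k, which, after matching coefficients, gives a_{k+2} = a_{k−1}/((k+2)(k+1))) in Python with SymPy:

```python
import sympy as sp

# x'' = t x, x(0) = 0, x'(0) = 1: matching coefficients of t^k gives
# 2*a_2 = 0 and (k+2)(k+1) a_{k+2} = a_{k-1} for k >= 1
N = 12
a = [sp.Integer(0)] * (N + 1)
a[1] = sp.Integer(1)
for k in range(1, N - 1):
    a[k + 2] = a[k - 1] / ((k + 2) * (k + 1))

t = sp.symbols('t')
print(sp.Add(*[a[k] * t**k for k in range(N + 1)]))  # t + t**4/12 + t**7/504 + ...
```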
3.5 Consider the linear second-order equation
x ′′ + a1 (t)x ′ + a2 (t)x = 0, t ∈ I := [α, β],
(3.111)
where ai : I → R, i = 1, 2, are continuous functions. A zero of a solution x(t) is a
point t0 ∈ I such that x(t0 ) = 0. Prove that the following hold.
(i) The set of zeros of a nonzero solution is at most countable, and contains only
isolated points.
(ii) The zero sets of two linearly independent solutions x, y separate each other, that
is, between any two consecutive zeros of x there exists precisely one zero of y.
(This result is due to J. Sturm (1803–1855).)
Hint.
(i) Follows by the uniqueness of solutions to the Cauchy problem for Eq. (3.111).
(ii) Let t1, t2 be two consecutive zeros of x. If y(t) ≠ 0 on [t1, t2], then the function ϕ(t) = x(t)/y(t) is C¹ on [t1, t2]. Use Rolle's theorem to reach a contradiction.
3.6 Prove that the equation x ′′ = a(t)x is non-oscillatory (that is, it admits solutions
with only finitely many zeros) if and only if the Riccati equation y′ = −y2 + a(t)
admits solutions defined on the entire semi-axis [0, ∞).
Hint. Use Problem 1.14.
3.7 Consider the second-order equations

x″ + a(t)x = 0,   (3.112)
x″ + b(t)x = 0,   (3.113)
where a, b are continuous functions on an interval I = [t1, t2]. Prove that, if ϕ(t) is a solution of (3.112) and ψ(t) is a solution of (3.113), then we have the identity

(ϕ(t2)ψ′(t2) − ϕ′(t2)ψ(t2)) − (ϕ(t1)ψ′(t1) − ϕ′(t1)ψ(t1)) = ∫_{t1}^{t2} (a(t) − b(t))ϕ(t)ψ(t)dt.   (3.114)
3.8 (Sturm’s comparison theorem) Under the same assumptions as in Problem 3.7,
prove that, if a(t) ≤ b(t), ∀t ∈ I, then between any two consecutive zeros of the
solution ϕ(t) there exists at least one zero of the solution ψ(t).
Hint. Use identity (3.114).
3.9 Find all the values of the complex parameter λ such that the boundary value
problem
x ′′ + λx = 0, t ∈ [0, 1],
(3.115)
x(0) = x(1) = 0
(3.116)
Fig. 3.3 An elastic chain with two masses m1, m2
admits nontrivial solutions. (A boundary value problem as above is known as a
Sturm–Liouville problem. The corresponding λ’s are called the eigenvalues of the
problem.)
Hint. Prove first that λ has to be a nonnegative real number. Solve (3.115) separately
in the cases λ < 0, λ = 0 and λ > 0 and then impose condition (3.116) to find that
λ = (nπ)2 , n ∈ Z>0 .
3.10 The differential system
m1 x1′′ + ω1 x1 − ω2 (x2 − x1 ) = 0,
m2 x2′′ + ω2 (x2 − x1 ) = f
(3.117)
describes the motion of a mechanical system made of two particles of masses m1
and m2 serially connected to a fixed point through two elastic springs with elasticity
constants ω1 and ω2 ; see Fig. 3.3.
Solve the system in the special case m1 = m2 = m, ω1 = ω2 = ω and f = 0.
3.11 Solve the differential equation
x ′′ + ω 2 x + ε−1 min(x, 0) = 0, t ≥ 0,
x(0) = x0 , x ′ (0) = 0,
(3.118)
where x0 ≥ 0, ε > 0 and investigate the behavior of the solution as ε ց 0.
Hint. Consider separately the cases x0 > 0 and x0 = 0. The limit case ε = 0 models the harmonic motion in the presence of an obstacle at x = 0. In the limit case ε = 0, the solution formally satisfies the system

(x″(t) + ω²x(t))x(t) = 0,  x(t) ≥ 0,  x″(t) + ω²x(t) ≥ 0, ∀t ≥ 0,

which is a differential variational inequality of the form (2.91).
3.12 Prove that the matrix X(t) = e^{(ln t)A} is a fundamental matrix of the system tx′ = Ax, t > 0.
3.13 Prove that, for any n × n matrix A and any t ∈ R, we have

(e^{tA})* = e^{tA*},

where * indicates the transpose of a matrix.

Hint. Use formula (3.61).
3.14 Prove that the solution X(t) of the matrix-valued differential equation
X ′ (t) = AX(t) + X(t)B, t ∈ R,
satisfying the initial condition X(0) = C is given by the formula
X(t) = etA CetB ,
where A, B, C are n × n matrices.
Hint. Use equation (3.8).
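A quick numerical sanity check of the claimed formula, using random matrices and a centered finite-difference derivative (all specific choices here are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

X = lambda s: expm(s*A) @ C @ expm(s*B)     # candidate solution X(t) = e^{tA} C e^{tB}
t, h = 0.7, 1e-6
dX = (X(t + h) - X(t - h)) / (2*h)           # centered finite difference
print(np.max(np.abs(dX - (A @ X(t) + X(t) @ B))))  # small (finite-difference error only)
```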
3.15 Prove that all the entries of e^{tA} are nonnegative for any t ≥ 0 if and only if

aij ≥ 0, ∀i ≠ j,   (3.119)

where aij are the entries of the n × n matrix A.

Hint. From formula (3.61), with t → 0, we see that condition (3.119) is necessary. Conversely, if (3.119) holds, then there exists an α > 0 such that all the entries of α1 + A are nonnegative. Next, use the equality e^{tA} = e^{−αt}e^{t(α1+A)}.
3.16 Prove that, if (3.119) holds, then the solution x of the Cauchy problem

x′ = Ax + f(t), x(0) = x0,

where x0 ≥ 0 and f(t) ≥ 0, is nonnegative, that is, all its components are nonnegative.

Hint. Use formula (3.21).
3.17 Prove that, if the n × n matrices A, B commute, that is, AB = BA, then
eA eB = e(A+B) = eB eA .
Hint. Prove that e(A+B)t = eAt eBt , ∀t ≥ 0, using Theorem 3.11 and (3.62).
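A numerical illustration of this statement (taking B = A², one convenient matrix that commutes with A; the random seed is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = A @ A                                            # commutes with A
print(np.allclose(expm(A) @ expm(B), expm(A + B)))   # True

C = rng.standard_normal((3, 3))                      # generically does NOT commute with A
print(np.allclose(expm(A) @ expm(C), expm(A + C)))   # typically False
```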
3.18 Let A be a constant n × n matrix.

(i) Prove that there exist a real number ω and a positive constant M such that

‖e^{tA}‖ ≤ Me^{ω|t|}, ∀t ∈ R.   (3.120)

(ii) Prove that, for any complex number λ such that Re λ > ω, the matrix λ1 − A is nonsingular and we have

(λ1 − A)^{−1} = ∫_0^∞ e^{−λt} e^{tA} dt.   (3.121)

Hint. Use formula (3.62).
3.19 Let A be an n × n matrix. Prove (Trotter's formula)

e^{tA} = lim_{k→∞} (1 + (t/k)A)^k = lim_{k→∞} (1 − (t/k)A)^{−k}, ∀t ∈ R.   (3.122)
Hint. Prove first that

(1 + (t/k)A)^k = (1/2πi) ∫_Γ (1 + tλ/k)^k (λ1 − A)^{−1} dλ,   (3.123)

where Γ is a contour as in the proof of Theorem 3.13. Next, let k → ∞.
3.20 Let {Aj}j≥1 be a sequence of n × n matrices that converges in norm to the n × n matrix A. Prove that

lim_{j→∞} e^{tAj} = e^{tA}, ∀t ∈ R.   (3.124)

Hint. Use the representation formula (3.64), taking into account that Aj → A implies that

(λ1 − Aj)^{−1} → (λ1 − A)^{−1}, ∀λ ∉ spec(A).
3.21 Find the general solution of the equation

x^{(3)} + x = δ1.   (3.125)

Hint. Use the fact that x^{(3)} + x = 0 on (−∞, 1) ∪ (1, +∞) and proceed as in Example 3.4.
3.22 Let A be a real n × n matrix, and D ⊂ Rn be an invariant linear subspace of A,
that is, AD ⊂ D. Prove that, if x0 ∈ D, then the solution x(t) of the Cauchy problem
x′ = Ax, x(0) = x0
stays in D for any t ∈ R.
Hint. Use Theorem 3.11 and formula (3.62).
3.23 Let A be a real n × n matrix and B an n × m matrix. Prove that, if
rank [B, AB, A2 B, . . . , An−1 B] = n,
(3.126)
then the only vector x ∈ Rn such that

B* e^{A*t} x = 0, ∀t ≥ 0,   (3.127)

is the null vector.

Hint. Differentiating equality (3.127) and setting t = 0, we deduce that

B*x = B*A*x = · · · = B*(A*)^{n−1} x = 0.

The algebraic condition (3.126) then implies that x = 0. Condition (3.126) was introduced by Kalman (b. 1930) and plays an important role in the theory of controllability of linear differential systems.
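Condition (3.126) is easy to test in practice. A minimal Python sketch (the double-integrator pair (A, B) is an arbitrary example, not from the text):

```python
import numpy as np

def kalman_rank(A, B):
    """Rank of the controllability matrix [B, AB, ..., A^{n-1}B] from (3.126)."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks))

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # a double integrator
B = np.array([[0.0], [1.0]])
print(kalman_rank(A, B))                 # 2 = n, so condition (3.126) holds
```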
3.24 Let A be an n × n matrix. Study the domain of definition of the matrix-valued function

sin(tA) = Σ_{k=0}^∞ ((−1)^k t^{2k+1}/(2k + 1)!) A^{2k+1},

and prove that, for any x0 ∈ Rn, the function x(t) = sin(tA)x0 is a solution of the second-order linear differential system x″ + A²x = 0.

Hint. Use Proposition 3.2.
3.25 Compute e^{tA} when A is one of the following matrices:

[ 2 −1 ; −2 3 ],  [ −1 0 3 ; −8 1 12 ; −2 0 4 ],  [ −1 2 −1 ; −1 −4 1 ; −1 −2 −1 ].

Hint. Use (3.64).
3.26 Prove that, using the substitution t = eτ , the Euler equation
t n x (n) + a1 t n−1 x (n−1) + · · · + an x = 0,
where a1 , . . . , an are real constants, reduces to a linear differential equation of order
n with constant coefficients.
3.27 Prove that, if A, B are m × m real matrices, then we have Lie's formula (S. Lie (1842–1899))

e^{t(A+B)} = lim_{n→∞} (e^{(t/n)A} e^{(t/n)B})^n, ∀t ≥ 0.   (3.128)
Hint. For any positive integer n, the matrix

Yn(t) = (e^{(t/n)A} e^{(t/n)B})^n

satisfies the differential equation

Yn′(t) = (A + B)Yn(t) + (e^{(t/n)A} B e^{−(t/n)A} − B)Yn(t).

Then, by (3.21),

Yn(t) = e^{t(A+B)} + ∫_0^t e^{(t−s)(A+B)} (e^{(s/n)A} B e^{−(s/n)A} − B) Yn(s) ds,

from which we can conclude that Yn(t) → e^{t(A+B)} as n → ∞.
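The rate of convergence in Lie's formula can be observed numerically (the random matrices below are an arbitrary choice; the error behaves roughly like 1/n):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
t = 1.0
target = expm(t * (A + B))
for n in (1, 10, 100, 1000):
    Yn = np.linalg.matrix_power(expm(t/n * A) @ expm(t/n * B), n)
    print(n, np.max(np.abs(Yn - target)))   # error decays roughly like 1/n
```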
Remark 3.6 Formula (3.128) is equivalent to the convergence of the fractional step method

x_ε′(t) = Ax_ε(t), t ∈ [iε, (i + 1)ε];  x_ε(iε) = y_ε(ε),
y_ε′(t) = By_ε(t), t ∈ [0, ε];  y_ε(0) = x_ε(iε − 0),

to the solution x of the Cauchy problem x′ = (A + B)x, x(0) = x0.
Chapter 4
Stability Theory
The concept of stability has its origin in the problem of equilibrium of conservative
mechanical systems. Generally speaking, a motion is stable if “small” perturbations
of the initial conditions lead to only “small” variations of the motion over an infinite
period of time.
Interest in the mathematical theory of stability of motion was stimulated by late
nineteenth century research in celestial mechanics. The works of H. Poincaré (1854–
1912) and J.C. Maxwell (1831–1879) represent pioneering contributions in this field.
We recall that Maxwell used this concept in the study of Saturn’s rings, discovering
that the only configurations that are mathematically stable correspond to a discontinuous structure of the rings.
The modern theory of stability of differential systems is due to the Russian mathematician A.M. Lyapunov (1857–1918) and has as its starting point his doctoral dissertation, defended in 1892 and entitled “On the general problem of the stability of
motion”. In the last few decades, the theory has been enriched by many important
results, some motivated by the wide range of applications of the theory of stability
in the study and design of control systems.
In this chapter, we will describe only the basics of this theory, together with some
illustrations on certain systems of interest in physics. Basic references for this chapter
are [1, 6, 15].
4.1 The Concept of Stability
Consider the differential system
x′ = f(t, x),   (4.1)
where f : Ω → Rn is defined on the region Ω = {(t, x) ∈ (0, ∞) × Rn; ‖x‖ < a}. We assume that f is continuous in (t, x) and locally Lipschitz in x.
From the existence, uniqueness and extendibility theorems, we deduce that, for
any (t0 , x0 ) ∈ Ω, system (4.1) admits a unique right-saturated solution x(t; t0 , x0 )
defined on a maximal interval [t0 , T ). Suppose that ϕ(t) is a solution of (4.1) defined
on the entire semi-axis [t0 , ∞).
Definition 4.1 The solution ϕ is called stable if, for any ε > 0, there exists δ(ε, t0) > 0 such that, for any x0 satisfying ‖x0‖ < a and ‖x0 − ϕ(t0)‖ < δ(ε, t0), the following hold.

(i) The solution x(t; t0, x0) is defined on [t0, ∞).
(ii) ‖x(t; t0, x0) − ϕ(t)‖ ≤ ε, ∀t ≥ t0.   (4.2)
The solution is called uniformly stable if the above δ(ε, t0 ) can be chosen independent
of t0 .
Definition 4.2 (a) The solution ϕ is called asymptotically stable if it is stable and there exists a µ(t0) > 0 such that

lim_{t→∞} ‖x(t; t0, x0) − ϕ(t)‖ = 0,   (4.3)

for any x0 such that ‖x0 − ϕ(t0)‖ ≤ µ(t0).
(b) The solution ϕ is called uniformly asymptotically stable if it is uniformly stable and there exists a µ0 > 0, independent of t0, such that, if ‖x0 − ϕ(t0)‖ ≤ µ0, then

lim_{t→∞} ‖x(t; t0, x0) − ϕ(t)‖ = 0,

uniformly with respect to t0.
Roughly speaking, stability means continuity and low sensitivity of solutions on
an infinite interval of time with respect to the initial data.
In many concrete situations, one is interested in stationary (or equilibrium) solutions of the differential systems, that is, solutions constant in time, of the form ϕ(t) ≡ c. These solutions describe the stationary regimes of various mechanisms and processes. Since, in practical situations, the equilibrium can only be determined approximately, the only physically significant equilibrium states are the stable ones.
Using a simple algebraic trick, we can reduce the study of the stability of a solution ϕ(t) of system (4.1) to the study of the stability of the trivial (or null) solution x ≡ 0 of a different system. Indeed, the substitution y := x − ϕ in (4.1) leads us to the differential system

y′ = f(t, y + ϕ) − ϕ′, t ≥ 0.   (4.4)
The solution ϕ of (4.1) corresponds to the trivial solution of system (4.4). Thus,
without loss of generality, we can restrict our attention to the stability of the trivial
solution. For this reason, in the sequel, we will assume that the function f satisfies the additional condition

f(t, 0) = 0, ∀t ≥ t0,   (4.5)

which amounts to saying that x ≡ 0 is a solution of (4.1). In this case, Definition 4.1 for ϕ = 0 takes the following form.
Definition 4.3 The trivial solution ϕ = 0 is called stable if for any ε > 0 there exists δ(ε, t0) > 0 such that, for any x0 satisfying ‖x0‖ < a and ‖x0‖ < δ(ε, t0), the following hold.

(i) The solution x(t; t0, x0) is defined on [t0, ∞).
(ii) ‖x(t; t0, x0)‖ ≤ ε, ∀t ≥ t0.   (4.6)
The trivial solution is called asymptotically stable if it is stable and there exists a µ(t0) > 0 such that

lim_{t→∞} x(t; t0, x0) = 0,   (4.7)

for any x0 such that ‖x0‖ ≤ µ(t0).
Given the n-th order differential equation

x^{(n)} = g(t, x, x′, . . . , x^{(n−1)}),   (4.8)

we will say that its trivial solution (if it exists) is stable if the trivial solution of the associated system

x1′ = x2,  x2′ = x3,  . . . ,  xn′ = g(t, x1, . . . , xn)   (4.9)

is stable.
Remark 4.1 We want to emphasize that stability is a property of a solution, not of the system. It is possible that the same system has both stable and unstable solutions. Take, for example, the pendulum equation

x″ + sin x = 0, t ≥ 0.   (4.10)

This equation admits two stationary solutions, ϕ1(t) = 0 and ϕ2(t) = π, corresponding to the two equilibrium positions of the pendulum.
The solution ϕ1 is stable. Indeed, multiplying (4.10) by x′ and integrating, we find that

|x′(t)|² − 2 cos x(t) = |x′(0)|² − 2 cos x(0), ∀t ≥ 0,   (4.11)

for any solution x(t) of (4.10). If |x′(0)| + |x(0)| < δ, then (4.11) implies that
|x′(t)|² + 2(1 − cos x(t)) ≤ δ² + 2(1 − cos δ), ∀t ≥ 0.

Given ε > 0, we choose δ = δ(ε) > 0 such that

δ² + 2(1 − cos δ) ≤ ε²,

and we deduce that

|x′(t)|² + 4 sin²(x(t)/2) ≤ ε², ∀t ≥ 0.
This shows that the trivial solution to Eq. (4.10) is stable. (We ought to point out that
the trivial solution is not asymptotically stable.)
To study the stability of the solution ϕ2, we use the substitution y := x − π and reduce the problem to the study of the trivial solution of

y″ − sin y = 0.   (4.12)
Let y(t) be an arbitrary solution of the above equation. Arguing as above, we obtain the equality

|y′(t)|² + 2 cos y(t) = |y′(0)|² + 2 cos y(0), ∀t ≥ 0.   (4.13)
Consider now the solution of the Cauchy problem with the initial conditions

y′(0) = δ, y(0) = 2 arcsin(δ/2).

Then

cos y(0) = 1 − 2 sin²(arcsin(δ/2)) = 1 − δ²/2,  |y′(0)|² + 2 cos y(0) = 2.
Using (4.13), we deduce that

|y′(t)|² = 2(1 − cos y(t)) = 4 sin²(y(t)/2).
This implies that y(t) is the solution of the Cauchy problem

y′ = 2 sin(y/2), y(0) = 2 arcsin(δ/2).
This equation can be integrated and we deduce that

y(t) = 4 arctan(Ce^t), C = tan(y(0)/4).
Since y(t) → 2π as t → ∞, there exists no δ > 0 such that the conditions |y(0)|, |y′(0)| ≤ δ imply

|y(t)|, |y′(t)| ≤ 1, ∀t ≥ 0.

Hence the solution ϕ2 is unstable.
These conclusions are in perfect agreement with observed reality.
Definition 4.4 If any solution of system (4.1) is defined on [0, ∞) and converges to
0 as t → ∞, then we say that system (4.1) is globally asymptotically stable.
In the next section, we will encounter many examples of globally asymptotically
stable systems.
4.2 Stability of Linear Differential Systems
Consider the linear homogeneous system
x′ = A(t)x, t ≥ 0,
(4.14)
where A(t) is an n × n matrix whose entries are continuous functions [0, ∞) → R.
Proposition 4.1 If the trivial solution of system (4.14) is stable (respectively uniformly stable, or asymptotically stable), then any other solution of this system is
stable (respectively uniformly stable or asymptotically stable).
Proof If x = ϕ(t) is an arbitrary solution of the system, then via the substitution
y := x − ϕ(t) we reduce it to the trivial solution y = 0 of the same linear system.
The stability of ϕ(t) is thus identical to the stability of the trivial solution.
We deduce that, in the linear case, the stability property of the trivial solution is
a property of the system. In other words, all solutions of this system enjoy the same
stability properties: a solution is stable if and only if all the solutions are stable, etc.
We will say that system (4.14) is stable (respectively asymptotically stable) if all the
solutions of this system are such. The central result of this section is the following
characterization of the stability of linear differential systems.
Theorem 4.1 (a) The linear differential system (4.14) is stable if and only if there
exists a fundamental matrix X(t) of the system which is bounded on [0, ∞). If
this happens, all the fundamental matrices are bounded on [0, ∞).
(b) The linear differential system (4.14) is asymptotically stable if and only if there exists a fundamental matrix X(t) such that

lim_{t→∞} ‖X(t)‖ = 0.
If such a fundamental matrix exists, then all the fundamental matrices satisfy the above property. Above, ‖X‖ denotes the norm of the matrix X, defined as in Appendix A.
Proof (a) Suppose first that system (4.14) admits a fundamental matrix X(t) that is bounded on the semi-axis [0, ∞), that is,

∃M > 0 : ‖X(t)‖ ≤ M, ∀t ≥ 0.

For any x0 ∈ Rn and any t0 ≥ 0, the solution x(t; t0, x0) is given by (3.9):

x(t; t0, x0) = X(t)X(t0)^{−1} x0.

Hence

‖x(t; t0, x0)‖ ≤ ‖X(t)‖ · ‖X(t0)^{−1}‖ · ‖x0‖ ≤ M‖X(t0)^{−1}‖ · ‖x0‖, ∀t ≥ 0.   (4.15)

Thus

‖x(t; t0, x0)‖ ≤ ε, ∀t ≥ 0,

as soon as

‖x0‖ ≤ δ(ε) := ε/(M‖X(t0)^{−1}‖).
Conversely, let us assume that the trivial solution 0 is stable. Let X(t) denote the fundamental matrix determined by the initial condition X(0) = 1. Since the trivial solution is stable, we deduce that there exists a δ > 0 such that, for any x0 ∈ Rn satisfying ‖x0‖ ≤ δ, we have

‖X(t)x0‖ = ‖x(t; 0, x0)‖ ≤ 1, ∀t ≥ 0.

We deduce that

‖X(t)‖ ≤ 1/δ, ∀t ≥ 0.
Since any other fundamental matrix Y (t) is related to X(t) by a linear equation
Y (t) = X(t)C, ∀t ≥ 0,
where C is a nonsingular, time-independent n × n matrix, we deduce that Y (t) is also
bounded on [0, ∞).
Part (b) is proved in a similar fashion using estimate (4.15), where the constant
M is replaced by a positive function M(t), such that limt→∞ M(t) = 0.
Consider now the case when A(t) ≡ A is independent of t. A matrix A is called
Hurwitzian if all its eigenvalues have negative real parts. Theorem 4.1 implies the following criterion of asymptotic stability for linear systems with constant coefficients.
Theorem 4.2 Let A be a real n × n matrix. Then the linear system
x′ = Ax
(4.16)
is asymptotically stable if and only if A is a Hurwitzian matrix.
Proof According to Theorem 4.1, system (4.16) is asymptotically stable if and only if

lim_{t→∞} ‖e^{tA}‖ = 0.

On the other hand, Theorem 3.12 shows that, for any 1 ≤ i, j ≤ n, the (i, j)-entry of e^{tA} has the form

Σ_{λ∈spec(A)} p_{i,j,λ}(t) e^{tλ},   (4.17)

where, for any eigenvalue λ ∈ spec(A), we denote by p_{i,j,λ}(t) a polynomial in t of degree smaller than the algebraic multiplicity of λ. The above equality shows that

lim_{t→∞} e^{tA} = 0 if and only if Re λ < 0, ∀λ ∈ spec(A).
Corollary 4.1 If A is a Hurwitzian matrix, then system (4.16) is asymptotically stable. Moreover, for any positive number ω such that

ω < min{−Re λ; λ ∈ spec(A)},

there exists an M > 0 such that, for any x0 ∈ Rn and any t0 ≥ 0, we have

‖x(t; t0, x0)‖ ≤ Me^{−ω(t−t0)} ‖x0‖, ∀t ≥ t0.   (4.18)
Proof The asymptotic stability statement follows from Theorem 4.2. Next, observe
that
x(t; t0 , x0 ) = e(t−t0 )A x0 .
Estimate (4.18) now follows from the structural equalities (4.17).
From the structural equalities (4.17), we obtain the following stability criterion.
Corollary 4.2 If all the eigenvalues of A have nonpositive real parts, and the ones with zero real parts are simple, then system (4.16) is stable.
To be able to apply Theorem 4.2 in concrete situations and for differential systems of dimensions ≥ 3, we need to know criteria deciding when a polynomial is
Hurwitzian, that is, all its roots have negative real parts. Such a criterion was found
by A. Hurwitz (1859–1919) and can be found in many classical algebra books, e.g.
[10, Chap. XV, Sect. 6]. For degree 3 polynomials
p(λ) = λ3 + a1 λ2 + a2 λ + a3
it reads as follows: the polynomial p(λ) is Hurwitzian if and only if
a1 > 0, a3 > 0, a1 a2 > a3 .
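For a cubic, this criterion is a one-liner to implement and to cross-check against the roots themselves; the coefficients below are an arbitrary example.

```python
import numpy as np

def hurwitz_cubic(a1, a2, a3):
    # p(l) = l^3 + a1 l^2 + a2 l + a3 is Hurwitzian iff:
    return a1 > 0 and a3 > 0 and a1*a2 > a3

a1, a2, a3 = 3.0, 3.0, 1.0           # p(l) = (l + 1)^3
roots = np.roots([1.0, a1, a2, a3])
print(hurwitz_cubic(a1, a2, a3), np.all(roots.real < 0))   # True True
```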
Finally, let us comment on higher order linear differential equations with constant coefficients

x^{(n)} + a1 x^{(n−1)} + · · · + an x = 0.   (4.19)

Taking into account the equivalence between such equations and linear differential systems of dimension n with constant coefficients, and the fact that the eigenvalues of the associated matrix are the roots of the characteristic equation

λ^n + a1 λ^{n−1} + · · · + an = 0,   (4.20)

we obtain from Theorem 4.2 the following result.
Corollary 4.3 The trivial solution of (4.19) is asymptotically stable if and only if
all the roots of the characteristic equation (4.20) have negative real parts.
The previous discussion shows that, if the trivial solution of Eq. (4.19) is asymptotically stable, then all the other solutions converge exponentially to 0 as t → ∞,
together with all their derivatives of orders ≤ n.
4.3 Stability of Perturbed Linear Systems
In this section, we will investigate the stability of the trivial solutions of differential systems of the form

x′ = Ax + F(t, x), t ≥ 0,   (4.21)

where A is a fixed n × n real matrix and F : [0, ∞) × Rn → Rn is a function continuous in the cylinder

Ω = {(t, x) ∈ [0, ∞) × Rn; ‖x‖ < r},

locally Lipschitz in x, and such that F(t, 0) ≡ 0. Such a system is called a perturbed linear system and F is called a perturbation.
Systems of the form (4.21) arise naturally when linearizing arbitrary differential systems at the trivial solution. For sufficiently small perturbations F, we expect
the stability, or asymptotic stability, of the trivial solution of the linear system to “propagate” to the perturbed systems as well. This is indeed the case, and the next result states this in a precise fashion.
Theorem 4.3 (Lyapunov–Poincaré) Suppose that A is a Hurwitzian matrix. Fix M, ω > 0 such that

‖e^{tA}‖ ≤ Me^{−ωt}, ∀t ≥ 0.   (4.22)

If there exists an L > 0 such that

L < ω/M   (4.23)

and

‖F(t, x)‖ ≤ L‖x‖, ∀(t, x) ∈ Ω,   (4.24)

then the trivial solution of system (4.21) is asymptotically stable.
Proof Fix (t0, x0) ∈ Ω and denote by x(t; t0, x0) the right-saturated solution x(t) of (4.21) satisfying the initial condition x(t0) = x0. Let [t0, T) denote the right-maximal existence interval of this solution. Interpreting (4.21) as a nonhomogeneous linear system and using the formula of variation of constants (3.21), we deduce that

x(t; t0, x0) = e^{(t−t0)A} x0 + ∫_{t0}^t e^{(t−s)A} F(s, x(s; t0, x0)) ds, ∀t ∈ [t0, T).   (4.25)
Taking the norm of both sides, and using (4.22) and (4.24), we deduce that

‖x(t; t0, x0)‖ ≤ Me^{−(t−t0)ω} ‖x0‖ + ML ∫_{t0}^t e^{−(t−s)ω} ‖x(s; t0, x0)‖ ds, ∀t ∈ [t0, T).   (4.26)
We set

y(t) := e^{tω} ‖x(t; t0, x0)‖.

From inequality (4.26), we get

y(t) ≤ Me^{t0ω} ‖x0‖ + LM ∫_{t0}^t y(s)ds, ∀t ∈ [t0, T).
Invoking Gronwall's Lemma, we deduce that

y(t) ≤ M‖x0‖ e^{LM(t−t0)+ωt0},

so that

‖x(t; t0, x0)‖ ≤ M‖x0‖ e^{(LM−ω)(t−t0)}, ∀t ∈ [t0, T).   (4.27)
We set δ = ω − LM and we observe that (4.23) implies δ > 0. We deduce that

‖x(t; t0, x0)‖ ≤ Me^{−δ(t−t0)} ‖x0‖, ∀t ∈ [t0, T).   (4.28)

From the above estimate, we deduce that, if ‖x0‖ < r/(2M), then

‖x(t; t0, x0)‖ < r/2, ∀t ∈ [t0, T).

Thus the solution x(t; t0, x0) stays in the interior of the cylinder Ω on its right-maximal existence interval. Invoking Theorem 2.10, we deduce that T = ∞. The asymptotic stability of the trivial solution now follows immediately from (4.28).
Theorem 4.4 Suppose that A is a Hurwitzian matrix and the perturbation F satisfies the condition

‖F(t, x)‖ ≤ L(‖x‖) · ‖x‖,   (4.29)

where L(r) → 0 as r ց 0. Then the trivial solution of system (4.21) is asymptotically stable.

Proof Choose r0 > 0 such that L(r) < ω/M, ∀0 < r < r0, where M, ω are as in (4.22). The desired conclusion follows by applying Theorem 4.3 to the restriction of F to the cylinder Ω = (0, ∞) × {‖x‖ < r0}.
Theorem 4.4 is the basis of the so-called first-order approximation method of
investigating the stability of the solutions of differential systems.
Consider the autonomous differential system

x′ = f(x),   (4.30)

where

f : D → Rn,  D := {x ∈ Rn; ‖x‖ < a},

is a C¹-map. We assume additionally that f(0) = 0 and

the Jacobian matrix f_x(0) is Hurwitzian.   (4.31)
Theorem 4.5 Under the above assumptions, the trivial solution of (4.30) is asymptotically stable.

Proof Since f is C¹, we have the equality

f(x) = f(0) + f_x(0)x + F(x) =: Ax + F(x),   (4.32)

where ‖F(x)‖ ≤ L(‖x‖) · ‖x‖, L(r) := sup_{‖θ‖≤r} ‖f_x(θ) − f_x(0)‖. The stated conclusion is now a direct consequence of Theorem 4.4.
Remark 4.2 Theorem 4.5 admits an obvious generalization to non-autonomous systems.
Example 4.1 We illustrate the general results on a second-order ODE (see L. Pontryagin, [17]),

Lx″ + Rx′ + C^{−1}x = C^{−1}f(x′),   (4.33)

which describes the behavior of an oscillatory electrical circuit with resistance R, capacitance C and inductance L that has a nonlinear perturbation f(x′) (explicitly, a triode). Equation (4.33) is equivalent to the system

x′ = y,
y′ = (CL)^{−1}f(y) − (CL)^{−1}x − RL^{−1}y.   (4.34)
The above system admits the stationary solution

x = f(0), y = 0.   (4.35)
We are interested in the stability of this solution. Making the change of variables

x1 := x − f(0), x2 := y,

we are led to the study of the trivial solution of the system

x1′ = x2,
x2′ = (CL)^{−1}(f(x2) − f(0) − x1) − RL^{−1}x2.   (4.36)
The linearization at 0 of this system is described by the matrix

A = (1/CL) [ 0  CL ; −1  f′(0) − RC ].

This matrix is Hurwitzian if and only if

f′(0) < RC.   (4.37)
According to Theorem 4.5, condition (4.37) implies the asymptotic stability of the
stationary solution x = f (0) of Eq. (4.33).
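Condition (4.37) can be checked directly on the linearization. In the Python sketch below, the circuit constants and the slope f′(0) are arbitrary sample values (not from the text).

```python
import numpy as np

R, C, L = 2.0, 0.5, 1.0
fp0 = 0.4                                    # assumed slope f'(0) of the nonlinearity
A = np.array([[0.0,          1.0],
              [-1.0/(C*L), (fp0 - R*C)/(C*L)]])
print(np.linalg.eigvals(A))                  # real parts < 0 precisely because f'(0) < RC
```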
4.4 The Lyapunov Function Technique
In this section, we will describe a general method for investigating stability, known
in the literature as the Lyapunov function technique. It can be used without having
to solve the corresponding system of ODEs.
The real function V, defined on the cylinder

Ω = {(t, x) ∈ R × Rn; t > 0, ‖x‖ < a},

is called positive definite if there exists a continuous, nondecreasing function

ω : [0, ∞) → [0, ∞)   (4.38)

such that

ω(0) = 0, ω(r) > 0, ∀r > 0,

and

V(t, x) ≥ ω(‖x‖), ∀(t, x) ∈ Ω.   (4.39)
We say that V is negative definite if −V is positive definite.
Remark 4.3 The function ω in the definition of positive definiteness satisfies the following property:

∀ε > 0, ∃δ = δ(ε) > 0 such that, for any r > 0 satisfying ω(r) < δ, we have r < ε.   (4.40)
Definition 4.5 The function V : Ω → [0, ∞) is called a Lyapunov function of system (4.1) if it satisfies the following conditions.

(i) The function V is C¹ on Ω.
(ii) The function V is positive definite and V(t, 0) = 0, ∀t > 0.
(iii) ∂_t V(t, x) + (grad_x V(t, x), f(t, x)) ≤ 0, ∀(t, x) ∈ Ω.

Above, (−, −) denotes the canonical Euclidean scalar product on Rn and grad_x V is the gradient of V as a function of x, that is,

grad_x V := (∂_{x1} V, . . . , ∂_{xn} V)^T.
The main result of this section, known as Lyapunov’s stability theorem, is the following.
Theorem 4.6 (Lyapunov stability) Consider system (4.1) satisfying the general conditions in Sect. 4.1.
(a) If system (4.1) admits a Lyapunov function V (t, x), then the trivial solution of
this system is stable.
(b) Suppose that system (4.1) admits a Lyapunov function V(t, x) such that the function

W(t, x) := ∂_t V(t, x) + (grad_x V(t, x), f(t, x))

is negative definite on Ω and

V(t, x) ≤ µ(‖x‖), ∀(t, x) ∈ Ω,   (4.41)

where µ : [0, ∞) → [0, ∞) is a continuous function which vanishes at the origin. Then the trivial solution of (4.1) is asymptotically stable.
Proof (a) For (t0, x0) ∈ Ω, consider the right-saturated solution x(t; t0, x0) of (4.1) satisfying the initial condition x(t0) = x0. Let [t0, T) denote the right-maximal existence interval of this solution. Using property (iii) of a Lyapunov function, we deduce the inequality

d/dt V(t, x(t; t0, x0)) ≤ 0, ∀t ∈ [t0, T).

Integrating, we deduce that

ω(‖x(t; t0, x0)‖) ≤ V(t, x(t; t0, x0)) ≤ V(t0, x0), ∀t ∈ [t0, T).   (4.42)
Using Remark 4.3 and the continuity of V, we deduce that for any ε > 0 there exists an r = r(ε) > 0 such that, for any x0 satisfying ‖x0‖ < r(ε), we have

ω(‖x(t; t0, x0)‖) ≤ V(t0, x0) < δ(ε), ∀t ∈ [t0, T),

where δ(ε) is defined in (4.40). Thus

‖x(t; t0, x0)‖ < ε, ∀t ∈ [t0, T), ∀‖x0‖ < r(ε).   (4.43)
If we choose ε < a/2, we deduce that, if the initial condition x0 satisfies ‖x0‖ < r(ε), then the solution x(t; t0, x0) stays inside Ω throughout its existence interval. This shows that T = ∞. Since, in (4.43), the parameter ε > 0 can be chosen arbitrarily, we deduce that the trivial solution of (4.1) is stable.
(b) Assume now that there exists an increasing function λ that is continuous and vanishes at the origin, such that

∂_t V(t, x) + (grad_x V(t, x), f(t, x)) ≤ −λ(‖x‖), ∀(t, x) ∈ Ω,   (4.44)

and, moreover, inequality (4.41) holds. We deduce that
d/dt V(t, x(t; t0, x0)) + λ(‖x(t; t0, x0)‖) ≤ 0, ∀t ≥ t0.   (4.45)

Integrating, we deduce that

V(t, x(t; t0, x0)) + ∫_{t0}^t λ(‖x(s; t0, x0)‖) ds ≤ V(t0, x0), ∀t ≥ t0.   (4.46)
From (4.45) and (4.46), it follows that the limit

ℓ = lim_{t→∞} V(t, x(t; t0, x0))

exists and is finite, and that the function t → λ(‖x(t; t0, x0)‖) is integrable on the semi-axis [t0, ∞). Hence, there exists a sequence tn → ∞ such that

lim_{n→∞} λ(‖x(tn; t0, x0)‖) = 0.

In other words, x(tn; t0, x0) → 0. From (4.41), we get

lim_{n→∞} V(tn, x(tn; t0, x0)) = 0,

so that ℓ = 0. Using (4.39) and (4.40), we deduce that

lim_{t→∞} x(t; t0, x0) = 0.
This completes the proof of Theorem 4.6.
Theorem 4.7 If Ω = (0, ∞) × Rn, and the function ω in (4.39) has the property

lim_{r→∞} ω(r) = ∞,   (4.47)

then, under the same assumptions as in Theorem 4.6 (b), the trivial solution of system (4.1) is globally asymptotically stable.
Proof Let x0 ∈ Rn and set

C := sup{r ≥ 0; ω(r) ≤ V(t0, x0)}.

From (4.47), we see that C < ∞, while (4.42) implies that

‖x(t; t0, x0)‖ ≤ C, ∀t ∈ [t0, T).
This proves that T = ∞. Using inequality (4.46), we obtain, as in the proof of Theorem 4.6, that

lim_{t→∞} x(t; t0, x0) = 0.
For autonomous systems

x′ = f(x),   (4.48)

we can look for time-independent Lyapunov functions. More precisely, if

f : D = {x ∈ Rn; ‖x‖ < a ≤ ∞} → Rn

is locally Lipschitz, we define, in agreement with Definition 4.5, a Lyapunov function on D to be a function V : D → R satisfying the following conditions.

(L1) V ∈ C¹(D), V(0) = 0.
(L2) V(x) > 0, ∀x ≠ 0.
(L3) (grad V(x), f(x)) ≤ 0, ∀x ∈ D.

Let us observe that condition (L2) is equivalent to the fact that V is positive definite on any domain of the form

D0 := {x ∈ Rn; ‖x‖ ≤ b < a}.
Indeed, if V satisfies (L2), then

V(x) ≥ ω(‖x‖), ∀x ∈ D0,

where ω : [0, ∞) → [0, ∞) is defined by

ω(r) := inf_{r≤‖x‖≤b} V(x) if 0 ≤ r ≤ b,  ω(r) := ω(b) if r > b.
One can easily check that ω(r) is continuous, nondecreasing, and satisfies (4.38) and (4.39). We should also mention that, in this special case, assumption (4.41) is automatically satisfied with µ : [0, ∞) → [0, ∞) given by

µ(r) := sup_{‖x‖≤r} V(x) if 0 ≤ r ≤ b,  µ(r) := µ(b) if r > b.
Theorems 4.6 and 4.7 have the following immediate consequence.
Corollary 4.4 If there exists a function V satisfying (L1), (L2), (L3), then the trivial solution of (4.48) is stable. If, additionally, V satisfies

(grad V(x), f(x)) < 0, ∀x ∈ D \ {0},   (4.49)

then the trivial solution is asymptotically stable. Furthermore, if V is coercive, that is,

lim_{‖x‖→∞} V(x) = ∞,

then the trivial solution is globally asymptotically stable.
Remark 4.4 It is worth remembering that, for time-independent functions V, the positivity condition (4.39) is equivalent to the condition V(x) > 0, ∀x ≠ 0. This is easier to verify in concrete situations.
Example 4.2 The Lyapunov function technique can sometimes clarify situations that are undecidable using the first approximation method. Consider, for example, the differential system
x1′ = 2x2 (x3 − 2),
x2′ = −x1 (x3 − 1),
x3′ = x1 x2 .
We want to investigate the stability of the trivial solution (0, 0, 0). The linearization at the origin of this system is described by the matrix

[ 0 −4 0 ; 1 0 0 ; 0 0 0 ],

whose eigenvalues are

λ1 = 0, λ2 = 2i, λ3 = −2i.
Hence, this matrix is not Hurwitzian and thus we cannot apply Theorem 4.5.
We seek a Lyapunov function of the form

V(x1, x2, x3) = αx1² + βx2² + γx3²,

where α, β, γ are positive constants. Then

grad V = (2αx1, 2βx2, 2γx3).

If we denote by f the vector defining the right-hand side of our system, we deduce that

(grad V, f) = 4αx1x2(x3 − 2) − 2βx1x2(x3 − 1) + 2γx1x2x3.

Note that, if

α = 1/2, β = 2, γ = 1,

then

(grad V, f) = 0 in R³,
and thus V is a Lyapunov function of the system and, therefore, the trivial solution
is stable.
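The computation above is mechanical and can be verified symbolically; the sketch below is a direct transcription of the system and of V into SymPy.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f = sp.Matrix([2*x2*(x3 - 2), -x1*(x3 - 1), x1*x2])
V = sp.Rational(1, 2)*x1**2 + 2*x2**2 + x3**2
gradV = sp.Matrix([sp.diff(V, v) for v in (x1, x2, x3)])
print(sp.expand(gradV.dot(f)))    # 0: V is constant along trajectories
```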
Theorem 4.8 (Lyapunov) System (4.14) with A(t) ≡ A (constant matrix) is asymptotically stable if and only if there exists an n × n, symmetric, positive definite matrix P satisfying the equality

A*P + PA = −1,   (4.50)

where, as usual, we denote by A* the adjoint (transpose) of A, and by 1 the identity matrix.
Proof Assume first that Eq. (4.50) admits a symmetric, positive definite solution P. Consider the function V : Rn → R defined by

V(x) = (1/2)(Px, x), ∀x ∈ Rn.   (4.51)

Observing that grad V(x) = Px, ∀x ∈ Rn, we obtain, using (4.50), that

2(grad V(x), Ax) = 2(Px, Ax) = (A*Px, x) + (x, PAx) = −‖x‖²_e.   (4.52)
On the other hand, since P is positive definite, we deduce (see Lemma A.3) that there exists a ν > 0 such that

V(x) ≥ ν‖x‖²_e, ∀x ∈ Rn.

The Cauchy–Schwarz inequality implies

2V(x) ≤ ‖P‖_e ‖x‖²_e, ∀x ∈ Rn,

where

‖P‖_e := sup_{‖x‖_e=1} ‖Px‖_e.
Thus, V is a Lyapunov function for the differential system

x′ = Ax.   (4.53)

Since the Euclidean norm ‖·‖_e is equivalent to the norm ‖·‖ (see Lemma A.1), we deduce that the function x → (grad V(x), Ax) is negative definite. Theorem 4.7 now implies that the trivial solution of (4.53) is asymptotically stable.
Conversely, suppose that A is a Hurwitzian matrix. We define a matrix P by the equality

P := ∫_0^∞ e^{tA*} e^{tA} dt.   (4.54)
The above integral is well defined since the real matrix A* is also Hurwitzian, because its characteristic polynomial coincides with that of A; the structural equalities (4.17) then show that the entries of both e^{tA} and e^{tA*} decay exponentially to zero as t → ∞. Note that (e^{tA})* = e^{tA*}, so that P is symmetric and

(Px, x) = ∫_0^∞ (e^{tA}x, e^{tA}x) dt = ∫_0^∞ ‖e^{tA}x‖²_e dt, ∀x ∈ Rn.   (4.55)
The above equality implies that P is positive definite. Using (4.54), we get

A*P = ∫_0^∞ A* e^{tA*} e^{tA} dt = ∫_0^∞ (d/dt e^{tA*}) e^{tA} dt.
Integrating by parts, we obtain

A*P = −1 − ∫_0^∞ e^{tA*} (d/dt e^{tA}) dt = −1 − ∫_0^∞ e^{tA*} e^{tA} A dt = −1 − PA.
Thus P also satisfies equality (4.50). This completes the proof of Theorem 4.8.
Remark 4.5 Theorem 4.8 can be alternatively rephrased as follows: the matrix A is
Hurwitzian if and only if Eq. (4.50) admits a symmetric and positive definite solution
P.
Observe that, if P is a positive definite, symmetric n × n matrix, then the function

(−, −)_P : Rn × Rn → R, (x, y)_P := (Px, y),

is a Euclidean scalar product on Rn. Theorem 4.8 essentially states that A is Hurwitzian if and only if there exists a Euclidean scalar product ⟨−, −⟩ on Rn such that the resulting symmetric bilinear form

Q_A(x, y) = ⟨Ax, y⟩ + ⟨x, Ay⟩

is negative definite.
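Equation (4.50) is a linear (Lyapunov) equation for P, and standard libraries solve it directly. A sketch with SciPy follows; the matrix A is an arbitrary Hurwitzian example, and we rely on the fact that solve_continuous_lyapunov(X, Q) solves X P + P Xᵀ = Q, so Aᵀ is passed as the first argument.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-2.0, -3.0]])        # Hurwitzian: eigenvalues -1, -2
P = solve_continuous_lyapunov(A.T, -np.eye(2))  # solves A^T P + P A = -1
print(np.linalg.eigvalsh(P))                     # positive: P is positive definite
print(A.T @ P + P @ A)                           # recovers minus the identity
```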
We have to point out that there are no general recipes or techniques for constructing
Lyapunov functions. This takes ingenuity and experience. However, the physical
intuition behind a given model often suggests candidates for Lyapunov functions.
For example, in the case of mechanical, electrical or thermodynamical systems, the
energy tends to be a Lyapunov function. In the case of thermodynamical systems, the
entropy, taken with the opposite sign, is also a Lyapunov function. We will illustrate
this principle in the case of conservative mechanical systems.
Example 4.3 Consider the differential equation
x ′′ + g(x) = 0.
(4.56)
As explained in Sect. 1.3.5, Eq. (4.56) describes the one-dimensional motion of a unit
mass particle under the influence of a force field of the form −g(x), x(t) denoting
the position of the particle at time t. (We have discussed the case g(x) = sin x a bit
earlier.) Eq. (4.56) is equivalent to the differential system
x ′ = y,
y′ = −g(x).
(4.57)
We assume that g : R → R is continuous and satisfies the condition
xg(x) > 0, ∀x ∈ R \ {0}.
Let $G$ be the antiderivative of $g$ determined by
$$G(x) := \int_0^x g(r)\, dr, \quad \forall x \in \mathbb{R}.$$
The function $G$ is $C^1$ and positive away from the origin and thus, according to Remark 4.4, it is positive definite. It is now easy to verify that the function
$$V(x, y) = \frac{1}{2} y^2 + G(x)$$
is conserved along the trajectories of (4.57), which shows that it is a Lyapunov function of this system.
Let us observe that, in the present case, the Lyapunov function V is none other
than the total energy of system (4.56). Our assumption on g, which showed that the
potential energy G is positive definite, simply states that x = 0 is a minimum of G.
We have thus recovered through Lyapunov’s theorem the well-known principle of
Lagrange for conservative systems: an equilibrium position that is a minimum point
for the potential energy is a stable equilibrium.
However, as we observed earlier, the trivial solution of system (4.57) is not an
asymptotically stable solution. Indeed, if (x(t), y(t)) is a solution of (4.57), then it
satisfies the energy conservation law
$$\frac{d}{dt} V(x(t), y(t)) = 0, \quad \forall t \geq 0,$$
and hence
$$\frac{1}{2}|y(t)|^2 + G(x(t)) = \frac{1}{2}|y(0)|^2 + G(x(0)), \quad \forall t \geq 0.$$
If
$$\lim_{t \to \infty} x(t) = \lim_{t \to \infty} y(t) = 0,$$
then
$$\frac{1}{2}|y(0)|^2 + G(x(0)) = 0.$$
This implies that x(0) = y(0) = 0. Thus, the only solution that approaches the trivial
solution as t → ∞ is the trivial solution. As a matter of fact, this happens for all
physical systems which conserve energy.
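A quick numerical experiment confirms this behavior. The sketch below (in Python, for the pendulum case $g(x) = \sin x$, so $G(x) = 1 - \cos x$) integrates system (4.57) and checks that the energy $V$ neither grows nor decays to zero:

```python
# Integrate x' = y, y' = -sin(x) and sample the energy V = y^2/2 + (1 - cos x):
# it stays constant, so the origin is stable but not asymptotically stable.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, u):
    x, y = u
    return [y, -np.sin(x)]

V = lambda x, y: 0.5 * y**2 + (1.0 - np.cos(x))

sol = solve_ivp(rhs, (0.0, 50.0), [0.5, 0.0], rtol=1e-10, atol=1e-12,
                dense_output=True)
print([round(V(*sol.sol(t)), 8) for t in np.linspace(0.0, 50.0, 6)])
```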
For the gradient differential systems, that is, systems of the form
x′ + grad f (x) = 0,
(4.58)
an isolated equilibrium point is asymptotically stable if and only if it is a local
minimum of the function f : Rn → R. More precisely, we have the following result.
Theorem 4.9 Let f : Rn → R be a C 2 -function. If x0 is an isolated critical point
of f which is also a local minimum, then the stationary solution x = x0 is an
asymptotically stable solution of system (4.58).
Proof Without loss of generality, we can assume that $x_0 = 0$ and $f(0) = 0$. There exists then a real number $r > 0$ such that
$$f(x) > 0, \quad \operatorname{grad} f(x) \neq 0, \quad \forall\, 0 < \|x\| < r.$$
This shows that the restriction of $f$ to the disk $\{\|x\| < r\}$ is a Lyapunov function for system (4.58) restricted to this disk. The theorem now follows from Corollary 4.4.
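As an illustration, the following sketch (with the hypothetical choice $f(x_1, x_2) = x_1^2 + 2x_2^2$, whose only critical point is the origin, a global minimum) integrates the gradient system (4.58) numerically:

```python
# Solutions of x' = -grad f(x) with f = x1^2 + 2*x2^2 decay to the minimum
# (0, 0), in agreement with Theorem 4.9.
import numpy as np
from scipy.integrate import solve_ivp

grad_f = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])

sol = solve_ivp(lambda t, x: -grad_f(x), (0.0, 10.0), [1.0, -1.0])
print(sol.y[:, -1])   # both components are close to 0
```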
Remark 4.6 Consider the functions
$$f : \mathbb{R} \to \mathbb{R}, \quad f(x) = \begin{cases} x^5\left(\sin\dfrac{1}{x} + 1\right), & x \neq 0, \\ 0, & x = 0, \end{cases} \qquad F(x) = \int_0^x f(t)\, dt.$$
Note that F is a C 2 -function with a unique minimum at x = 0, which is a global
minimum. The function F is an antiderivative of f and has infinitely many critical
points
$$x_n = \frac{1}{-\frac{\pi}{2} + 2n\pi}, \quad n \in \mathbb{Z} \setminus \{0\},$$
which are neither local minima nor local maxima and accumulate at the origin. The
trivial solution of the equation
x ′ + F ′ (x) = 0
is stable, but not asymptotically stable. The reason is that the origin is isolated as a
local minimum of F, but not isolated as a critical point of F.
Remark 4.7 Theorem 4.3 can be obtained by an alternate proof based on the Lyapunov function method. Indeed, if the matrix $A$ and the perturbation $F$ satisfy the assumptions in Theorem 4.3, then, according to Theorem 4.8, the equation
$$A^*P + PA = -1$$
admits a symmetric, positive definite solution $P$. The function $V(x) = \frac{1}{2}(Px, x)$ is then a Lyapunov function for system (4.21). Indeed,
$$(Ax + F(t, x), \operatorname{grad} V(x)) = (Ax + F(t, x), Px) = -\frac{1}{2}\|x\|_e^2 + (F(t, x), Px)$$
$$\leq -\frac{1}{2}\|x\|_e^2 + \|F(t, x)\|_e \cdot \|Px\|_e \leq -\frac{1}{2}\|x\|_e^2 + CL\|x\|^2.$$
Thus, if the constant $L$ is sufficiently small, there exists an $\alpha > 0$ such that $\left(Ax + F(t, x), \operatorname{grad} V(x)\right) \leq -\alpha\|x\|^2$, and so Theorem 4.6 now implies the stability of the trivial solution.
4.5 Stability of Control Systems
In this section, we will present a few applications of the Lyapunov function method to the theory of stability of automatic control systems.
Maxwell’s 1868 paper “On Governors” and I.A. Vishnegradski’s 1877 work on
Watt’s centrifugal governor represent pioneering contributions to this field. The modern mathematical theory of automatic control systems was put together during the
last few decades, and stability theory plays a central part in it.
A linear control system is described mathematically by the differential system
$$x' = Ax + Bu, \qquad (4.59)$$
where $A$ is an $n \times n$ matrix, $B$ is an $n \times m$ matrix and $u$ is an $\mathbb{R}^m$-valued function called the control or input. The function $x$ is called the state of the system.
In certain situations, the state of the system is not known directly, but only indirectly through measurements of a certain function C(x) of the state. In such cases,
we associate to the control system the equation
y = C(x).
(4.60)
Fig. 4.1 A closed-loop system: the plant $x' = Ax + Bu$ with the feedback loop $u = Dx$
The quantity y is called the output. The system of Eqs. (4.59) and (4.60) is called an
observed control system. The control u can also be given as a function of the state x
of system (4.59)
u = D(x),
(4.61)
where D : Rn → Rm is a continuous map.
An expression as in (4.61) is called a feedback synthesis or feedback controller
and it arises frequently in the theory of automatic control systems. Putting u, given
in (4.61), into system (4.59), we obtain the so-called “closed-loop” system
x′ = Ax + BD(x).
(4.62)
We can schematically describe this closed-loop system as in Fig. 4.1.
Example 4.4 Consider the following simple control system
x ′ (t) = u(t),
that could describe the evolution of the volume of fluid in a reservoir where the fluid
is added or removed at a rate u(t). Suppose that we want to find the size of the flow
u(t) so that, in the long run, the volume will approach a prescribed value x∞ . This
goal can be achieved by selecting the flow according to the feedback rule
$$u(t) = \alpha\left(x_\infty - x(t)\right),$$
where $\alpha$ is a positive constant. Indeed, the general solution of the above linear equation is
$$x(t) = x(0)e^{-\alpha t} + x_\infty\left(1 - e^{-\alpha t}\right),$$
and thus
$$\lim_{t \to \infty} x(t) = x_\infty.$$
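A short simulation of this closed loop (with the arbitrary illustration values $\alpha = 0.8$ and $x_\infty = 2$) confirms the computation:

```python
# Reservoir model from Example 4.4: x' = alpha*(x_inf - x) drives the volume
# to the prescribed value x_inf regardless of the initial volume.
from scipy.integrate import solve_ivp

alpha, x_inf = 0.8, 2.0
sol = solve_ivp(lambda t, x: alpha * (x_inf - x), (0.0, 20.0), [0.0])
print(sol.y[0, -1])   # approximately 2.0, matching the explicit solution above
```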
We say that the feedback synthesis (4.61) stabilizes system (4.59) if the corresponding closed-loop system (4.62) is asymptotically stable.
As a matter of fact, the transformation of the original system into a stable or
asymptotically stable one is the main motivation in engineering for the design of
feedback controls. In practical situations, the feedback control often takes the form
of a regulator of the system, making it stable under perturbations, possibly endowing
it with other desirable features.
An example available to almost everyone is the classical thermostat that controls
the temperature in heated systems. In fact, the main systems in nature and society
are stable closed-loop systems.
Example 4.5 Consider the following situation
mx ′′ + ω 2 x = u(t), t ≥ 0,
(4.63)
that describes the frictionless elastic motion of a particle of mass m acted upon by
an external force u(t). If we regard u(t) as a control, then we can rewrite (4.63) in
the form (4.59), where the matrix $A$ is given by
$$A := \begin{pmatrix} 0 & 1 \\ -\frac{\omega^2}{m} & 0 \end{pmatrix},$$
and the $2 \times 1$ matrix $B$ is given by
$$B := \begin{pmatrix} 0 \\ \frac{1}{m} \end{pmatrix}.$$
As observed in the previous section, the trivial solution is not asymptotically stable.
To stabilize this system, we consider a feedback control of the form
u(t) = −αx ′ (t),
(4.64)
where α is an arbitrary positive constant. The resulting closed-loop system
mx ′′ + αx ′ + ω 2 x = 0
is obviously asymptotically stable. Concretely, such a system is produced by creating
at every moment of time t > 0 a resistance force proportional to the velocity of the
particle. In other words, we artificially introduce friction into the system.
Consider next the linear control system
x′ = Ax + u(t)b, t ≥ 0, b ∈ Rn ,
(4.65)
with output σ given by the differential equation
σ ′ = (c, x) − αϕ(σ),
(4.66)
and the nonlinear feedback law
u = ϕ(σ).
(4.67)
Above, A is an n × n Hurwitzian matrix, b, c ∈ Rn are control parameters and
ϕ : R → R is a C 1 -function satisfying
$$\sigma\varphi(\sigma) > 0, \quad \forall \sigma \neq 0, \qquad (4.68)$$
$$\lim_{\sigma \to \infty} \int_0^\sigma \varphi(r)\, dr = \infty. \qquad (4.69)$$
We seek conditions that will guarantee the global asymptotic stability of the control system (4.65)–(4.67), that is,
$$\lim_{t \to \infty} x(t) = 0, \quad \lim_{t \to \infty} \sigma(t) = 0, \qquad (4.70)$$
for any (unknown) function $\varphi$ satisfying (4.68) and (4.69).
Such a problem is known as a Lurie–Postnikov problem, named after the authors
who first investigated it; see e.g. [5, Sect. 6.2] or [9].
To solve this problem, we will seek a Lyapunov function for the system (4.65)
and (4.66) of the form
$$V(x, \sigma) = \frac{1}{2}(Px, x) + \int_0^\sigma \varphi(r)\, dr, \qquad (4.71)$$
where $P$ is a symmetric and positive definite solution of the matrix equation (4.50). From Theorem 4.8 and (4.68)–(4.69), we deduce that the function $V$ is positive definite and
$$\lim_{\|x\| + |\sigma| \to \infty} V(x, \sigma) = \infty.$$
If we denote by $f(x, \sigma)$ the right-hand side of the system (4.65), (4.66) and by $\langle -, - \rangle$ the canonical scalar product on $\mathbb{R}^{n+1}$, we deduce that
$$\langle \operatorname{grad} V(x, \sigma), f(x, \sigma) \rangle = (Ax, Px) + \varphi(\sigma)(Px, b) + \varphi(\sigma)(c, x) - \alpha\varphi(\sigma)^2$$
$$= -\frac{1}{2}\|x\|_e^2 - \alpha\varphi(\sigma)^2 + \varphi(\sigma)(x, Pb + c)$$
$$\leq -\frac{1}{2}\|x\|_e^2 - \alpha\varphi(\sigma)^2 + \frac{1}{2}\left(\|x\|_e^2 + \varphi(\sigma)^2 \|Pb + c\|_e^2\right)$$
$$= \frac{1}{2}\varphi(\sigma)^2 \left(\|Pb + c\|_e^2 - 2\alpha\right).$$
It follows that the function $\langle \operatorname{grad} V, f \rangle$ is negative definite as soon as $b, c$ satisfy the controllability inequality
$$\|Pb + c\|_e < \sqrt{2\alpha}.$$
We have thus found a sufficient condition for the asymptotic stability involving the
parameters α, b, c.
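The inequality is easy to test numerically. The sketch below uses hypothetical data $A$, $b$, $c$, $\alpha$ (chosen only for illustration) and checks the condition $\|Pb + c\|_e < \sqrt{2\alpha}$:

```python
# Solve A^T P + P A = -1 and test the controllability inequality of the
# Lurie-Postnikov problem for sample parameters.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])    # Hurwitzian
b = np.array([0.1, 0.0])
c = np.array([0.0, 0.1])
alpha = 1.0

P = solve_continuous_lyapunov(A.T, -np.eye(2))
lhs, rhs = np.linalg.norm(P @ b + c), np.sqrt(2.0 * alpha)
print(lhs, rhs, lhs < rhs)    # the sufficient condition holds for these data
```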
Remark 4.8 The object of the theory of control systems of the form (4.59) involves
problems broader than the one we discussed in this section. The main problem of control systems can be loosely described as follows: find the control function u(t) from
a class of admissible functions such that the corresponding output x has a prescribed
property. In particular, asymptotic stability could be one of those properties.
In certain situations, the control parameter is chosen according to certain optimality conditions, such as, to minimize a functional of the form
$$\int_0^T L\left(x(t), u(t)\right) dt$$
on a collection of functions u, where x is the corresponding output. Such a control
is called an optimal control.
4.6 Stability of Dissipative Systems
We will investigate the asymptotic properties of the system
x′ = f (x),
(4.72)
where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous dissipative map (see (2.57)):
$$\langle f(x) - f(y), x - y \rangle \leq 0, \quad \forall x, y \in \mathbb{R}^n. \qquad (4.73)$$
We have seen in Theorem 2.13 that, for any $x_0 \in \mathbb{R}^n$, system (4.72) admits a unique solution $x(t; t_0, x_0)$ satisfying the initial condition $x(t_0) = x_0$ and defined on $[t_0, \infty)$.
If we denote by S(t) : Rn → Rn the semigroup of transformations
S(t)x0 = x(t; 0, x0 ), ∀t ≥ 0, x0 ∈ Rn ,
then, obviously,
x(t; t0 , x0 ) = S(t − t0 )x0 , ∀t ≥ t0 , x0 ∈ Rn .
Moreover, according to Theorem 2.13, we have
$$\|x(t; t_0, x_0) - x(t; t_0, y_0)\|_e \leq \|x_0 - y_0\|_e, \quad \forall t \geq t_0,\ x_0, y_0 \in \mathbb{R}^n. \qquad (4.74)$$
By the orbit or trajectory of the system through the point x0 , we understand the set
$$\gamma(x_0) := \{x \in \mathbb{R}^n;\ x = S(t)x_0,\ t \geq 0\}.$$
The ω-limit set of the orbit γ(x0 ) is the set ω(x0 ) of the limit points of γ(x0 ).
More precisely, x ∈ ω(x0 ) if and only if there exists a sequence tk → ∞ such that
S(tk )x0 → x.
Example 4.6 Consider, for example, the solution
(x1 , x2 ) = (sin t, cos t)
of the differential system
x1′ = x2 ,
x2′ = −x1 ,
with the initial condition x1 (0) = 0, x2 (0) = 1. In this case, the ω-limit set of (0, 1)
is the unit circle in R2 , centered at the origin.
A sufficient condition for the set ω(x0 ) to be nonempty is the boundedness of
the trajectory $\gamma(x_0)$. Let $F \subset \mathbb{R}^n$ be the set of all the stationary solutions of system (4.72), that is,
$$F := \{x \in \mathbb{R}^n;\ f(x) = 0\}. \qquad (4.75)$$
Lemma 4.1 The set F is closed and convex.
Proof The continuity of f implies that F is closed.
Let $x, y \in F$ and $t \in [0, 1]$. Since $f$ is dissipative, we see that
$$\langle f(\xi), \xi - x \rangle \leq 0, \quad \langle f(\xi), \xi - y \rangle \leq 0, \quad \forall \xi \in \mathbb{R}^n.$$
Hence
$$\langle f(\xi), \xi - x_t \rangle \leq 0, \quad \forall \xi \in \mathbb{R}^n,\ t \in [0, 1],$$
where $x_t = tx + (1-t)y$. Let us now choose $\xi := \varepsilon x_t + (1-\varepsilon)z$, where $0 < \varepsilon < 1$ and $z$ is arbitrary in $\mathbb{R}^n$. By (4.73), we have
$$\langle f(\varepsilon x_t + (1-\varepsilon)z), z - x_t \rangle \leq 0, \quad \forall z \in \mathbb{R}^n,$$
and, letting $\varepsilon \to 1$, we conclude that
$$\langle f(x_t), z - x_t \rangle \leq 0, \quad \forall z \in \mathbb{R}^n.$$
Now, choose $z := x_t + f(x_t)$ to deduce that $\|f(x_t)\|_e^2 \leq 0$, that is, $x_t \in F$, as claimed.
Proposition 4.2 We have the equality
$$F = \{x \in \mathbb{R}^n;\ S(t)x = x,\ \forall t \geq 0\}. \qquad (4.76)$$
(4.76)
Proof Clearly, if x0 ∈ F, then S(t)x0 = x0 , ∀t ≥ 0, because, according to Theorem
2.13, system (4.72) has a unique solution x satisfying x(0) = x0 , namely, the constant
solution.
Conversely, let us assume that $S(t)x_0 = x_0$, $\forall t \geq 0$. Then
$$f(x_0) = \frac{d}{dt} S(t)x_0 \Big|_{t=0} = \lim_{t \searrow 0} \frac{1}{t}\left(S(t)x_0 - x_0\right) = 0.$$
This proves that $f(x_0) = 0$, that is, $x_0 \in F$.
Theorem 4.10 Suppose that the set $F$ is nonempty. Then the following hold.
(i) For any $x_0 \in \mathbb{R}^n$ the set $\omega(x_0)$ is compact.
(ii) $S(t)\omega(x_0) \subset \omega(x_0)$, $\forall t \geq 0$, $\forall x_0 \in \mathbb{R}^n$.
(iii) $\|S(t)x - S(t)y\|_e = \|x - y\|_e$, $\forall x, y \in \omega(x_0)$, $t \geq 0$.
(iv) For any $y \in F$ there exists an $r \geq 0$ such that
$$\omega(x_0) \subset \{x \in \mathbb{R}^n;\ \|x - y\|_e = r\}.$$
If, additionally, $\omega(x_0) \subset F$, then the limit $x_\infty = \lim_{t\to\infty} S(t)x_0$ exists and belongs to the set $F$.
Proof (i) Since $F \neq \emptyset$, the trajectory $\gamma(x_0)$ is bounded for any $x_0$. Indeed, if we take the scalar product of the equation
$$\frac{d}{dt} S(t)x_0 = f(S(t)x_0), \quad \forall t \geq 0,$$
with $S(t)x_0 - y_0$, where $y_0 \in F$, then we deduce from (4.75) and the dissipativity of $f$ that
$$\frac{d}{dt} \|S(t)x_0 - y_0\|_e^2 \leq 0, \quad \forall t \geq 0.$$
Hence
$$\|S(t)x_0 - y_0\|_e \leq \|x_0 - y_0\|_e, \quad \forall t \geq 0,$$
so that
$$\|S(t)x_0\|_e \leq \|y_0\|_e + \|x_0 - y_0\|_e, \quad \forall t \geq 0.$$
Since the trajectory γ(x0 ) is bounded, we deduce that the ω-limit set ω(x0 ) is bounded
as well. Let us show that the set ω(x0 ) is also closed.
Consider a sequence $\{x_j\} \subset \omega(x_0)$ that converges to $\xi \in \mathbb{R}^n$. For any $x_j$ there exists a sequence of positive real numbers $\{t_{j,k}\}_{k \geq 1}$ such that
$$\lim_{k \to \infty} t_{j,k} = \infty \quad \text{and} \quad x_j = \lim_{k \to \infty} S(t_{j,k})x_0.$$
Hence, for any $j$ and any $\varepsilon > 0$, there exists a $K = K(\varepsilon, j)$ such that
$$\|S(t_{j,k})x_0 - x_j\|_e \leq \varepsilon, \quad \forall k \geq K(\varepsilon, j).$$
Thus
$$\|S(t_{j,k})x_0 - \xi\|_e \leq \varepsilon + \|x_j - \xi\|_e, \quad \forall k \geq K(\varepsilon, j).$$
Since $x_j \to \xi$ as $j \to \infty$, we deduce that there exists a $j(\varepsilon) > 0$ such that
$$\|x_j - \xi\|_e \leq \varepsilon, \quad \forall j \geq j(\varepsilon).$$
In particular, for $j = j(\varepsilon)$ and $k := k(\varepsilon) = K(\varepsilon, j(\varepsilon))$, we have
$$\|S(t_{j(\varepsilon),k(\varepsilon)})x_0 - \xi\|_e \leq 2\varepsilon.$$
This proves that $\xi$ is a limit point of the sequence $\{S(t_{j,k})x_0\}_{j,k \geq 1}$, that is, $\xi \in \omega(x_0)$.
(ii) Let $\xi \in \omega(x_0)$. There exists a sequence $t_k \to \infty$ such that
$$\lim_{k \to \infty} S(t_k)x_0 = \xi.$$
According to Theorem 2.13, the map $S(t) : \mathbb{R}^n \to \mathbb{R}^n$ is continuous for any $t \geq 0$ and thus
$$\lim_{k \to \infty} S(t + t_k)x_0 = \lim_{k \to \infty} S(t)S(t_k)x_0 = S(t)\xi, \quad \forall t \geq 0.$$
Hence $S(t)\xi \in \omega(x_0)$, so that $S(t)\omega(x_0) \subset \omega(x_0)$.
(iii) Let us first prove that, if $x \in \omega(x_0)$, then there exist $s_k \to \infty$ such that $S(s_k)x \to x$ as $k \to \infty$. Indeed, there exist $t_k \to \infty$ such that
$$t_{k+1} - t_k \to \infty, \quad S(t_k)x_0 \to x.$$
If we set $s_k := t_{k+1} - t_k$, then we get $S(s_k)S(t_k) = S(t_{k+1})$ and
$$\|S(s_k)x - x\|_e \leq \|S(s_k)x - S(s_k)S(t_k)x_0\|_e + \|S(t_{k+1})x_0 - x\|_e \overset{(2.61)}{\leq} \|x - S(t_k)x_0\|_e + \|S(t_{k+1})x_0 - x\|_e \to 0.$$
Thus ω(x0 ) ⊂ ω(x), ∀x ∈ ω(x0 ). The opposite inclusion is obvious and thus
$$\omega(x_0) = \omega(x), \quad \forall x \in \omega(x_0).$$
Let $x, y \in \omega(x_0)$. Then $y \in \omega(x)$ and there exist sequences $s_k, \tau_k \to \infty$ such that
$$x = \lim_{k \to \infty} S(s_k)x, \quad y = \lim_{k \to \infty} S(\tau_k)x.$$
From the semigroup and the contraction properties of the family $S(t)$, we deduce the following string of inequalities:
$$\|S(s_k)y - y\|_e \leq \|S(s_k)y - S(\tau_k + s_k)x\|_e + \|S(\tau_k + s_k)x - S(\tau_k)x\|_e + \|S(\tau_k)x - y\|_e \leq 2\|S(\tau_k)x - y\|_e + \|S(s_k)x - x\|_e \to 0.$$
Hence
$$x = \lim_{k \to \infty} S(s_k)x, \quad y = \lim_{k \to \infty} S(s_k)y,$$
and we deduce that
$$\|x - y\|_e = \lim_{k \to \infty} \|S(s_k)x - S(s_k)y\|_e = \lim_{k \to \infty} \|S(s_k - t)S(t)x - S(s_k - t)S(t)y\|_e \leq \|S(t)x - S(t)y\|_e.$$
By Theorem 2.13, the map $S(t)$ is a contraction, so that
$$\|S(t)x - S(t)y\|_e \leq \|x - y\|_e.$$
This proves the equality (iii).
(iv) Let $y \in F$. Scalarly multiplying the equation
$$\frac{d}{dt}\left(S(t)x_0 - y\right) = f(S(t)x_0), \quad t \geq 0,$$
by $S(t)x_0 - y$ and using the dissipativity condition (4.73), we find that
$$\frac{d}{dt}\|S(t)x_0 - y\|_e^2 \leq 0, \quad \forall t \geq 0.$$
Thus the limit
$$r := \lim_{t \to \infty} \|S(t)x_0 - y\|_e$$
exists. From the definition of the set $\omega(x_0)$, we get that
$$\|x - y\|_e = r, \quad \forall x \in \omega(x_0). \qquad (4.77)$$
Suppose now that $\omega(x_0) \subset F$. Let $y$ be an arbitrary point in $\omega(x_0)$. From the above discussion, it follows that there exists an $r \geq 0$ such that
$$\|x - y\|_e = r, \quad \forall x \in \omega(x_0).$$
Since $y \in \omega(x_0)$, it follows that $r = 0$ and thus $\omega(x_0)$ is reduced to a single point
$$x_\infty = \lim_{t \to \infty} S(t)x_0.$$
This completes the proof of Theorem 4.10.
Theorem 4.10 is a very valuable tool for investigating asymptotic stability in
certain situations not covered by the Lyapunov function method, or the first approximation method.
Example 4.7 Consider the dissipative system (4.72) satisfying the additional conditions
$$f(0) = 0, \quad \langle f(x), x \rangle < 0, \quad \forall x \neq 0. \qquad (4.78)$$
We want to show that, under these assumptions, the trivial solution of system (4.72) is globally asymptotically stable, that is,
$$\lim_{t \to \infty} x(t; 0, x_0) = 0, \quad \forall x_0 \in \mathbb{R}^n. \qquad (4.79)$$
Taking into account (4.74), we can rewrite the above equality as
$$\lim_{t \to \infty} S(t)x_0 = 0, \quad \forall x_0 \in \mathbb{R}^n.$$
In view of Theorem 4.10, it suffices to show that
$$\omega(x_0) = F = \{0\}. \qquad (4.80)$$
Assumption (4.78) implies that $F = \{0\}$. Let us first observe that (4.78) also implies that any trajectory $\gamma(x_0)$ of system (4.72) is bounded. Indeed, if $x(t)$ is an arbitrary solution of (4.72), we have
$$\left(x(t), x'(t)\right) = \frac{1}{2}\frac{d}{dt}\|x(t)\|_e^2 = \langle f(x(t)), x(t) \rangle \leq 0.$$
Hence
$$\|x(t)\|_e \leq \|x(0)\|_e, \quad \forall t \geq 0.$$
This shows that $\gamma(x_0)$ is bounded and thus $\omega(x_0) \neq \emptyset$.
Fix an arbitrary $y_0 \in \omega(x_0)$ and set $y(t) := S(t)y_0$. We have
$$y'(t) = f(y(t)), \quad \forall t \geq 0,$$
and, arguing as above, we deduce that
$$\frac{1}{2}\frac{d}{dt}\|y(t)\|_e^2 = \langle f(y(t)), y(t) \rangle, \quad \forall t \geq 0. \qquad (4.81)$$
Integrating the above equality, we deduce that
$$\frac{1}{2}\|y(t)\|_e^2 = \int_0^t \langle f(y(s)), y(s) \rangle\, ds + \frac{1}{2}\|y_0\|_e^2, \quad \forall t \geq 0.$$
Since $\langle f(y(s)), y(s) \rangle \leq 0$, it follows from the above equality that the integral
$$\int_0^\infty \langle f(y(s)), y(s) \rangle\, ds$$
is convergent and thus there exists a sequence $s_k \to \infty$ such that $y(s_k) \to y_\infty$ and
$$0 = \lim_{k \to \infty} \langle f(y(s_k)), y(s_k) \rangle = \langle f(y_\infty), y_\infty \rangle.$$
Thus $y_\infty \in \omega(x_0)$ and $\langle f(y_\infty), y_\infty \rangle = 0$. Assumption (4.78) implies that $y_\infty = 0$ and thus $0 \in \omega(x_0)$. Theorem 4.10(iii) implies that
$$\|y(t)\|_e = \|y_0\|_e, \quad \forall t \geq 0.$$
Using this in (4.81), we deduce that
$$\langle f(y(t)), y(t) \rangle = 0, \quad \forall t \geq 0,$$
and thus $y(t) = 0$, $\forall t \geq 0$. Hence, $y_0 \in F = \{0\}$ for every $y_0 \in \omega(x_0)$. This proves (4.80).
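A one-dimensional sketch of this example, with the dissipative map $f(x) = -x^3$ (so that $\langle f(x), x \rangle = -x^4 < 0$ for $x \neq 0$), shows the predicted global convergence:

```python
# Every trajectory of x' = -x^3 tends to the unique stationary point 0.
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, x: -x**3, (0.0, 100.0), [2.0])
print(sol.y[0, -1])   # close to 0: the trivial solution attracts all solutions
```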
Example 4.8 Consider now system (4.72) in the special case $n = 2$. Suppose that $f(0) = 0$ and $F \neq \{0\}$. Since $F$ is convex, it contains at least one line segment; see Fig. 4.2.

Fig. 4.2 Asymptotic behavior of a two-dimensional dissipative system: the $\omega$-limit points $x_1, x_2$ lie on a circle centered at a point of the segment $F$
On the other hand, according to Theorem 4.10(iv), the set ω(x0 ) is situated on a
circle centered at some arbitrary point of the set F. Thus, ω(x0 ) contains at most two
points x1 , x2 as in Fig. 4.2. Since
$$S(t)\omega(x_0) \subset \omega(x_0) \quad \text{and} \quad \lim_{t \to 0} S(t)x = x,$$
it follows that $S(t)x_1 = x_1$, $S(t)x_2 = x_2$, $\forall t \geq 0$. Thus, $x_1, x_2 \in F$, which implies that $x_1 = x_2 = x_\infty$ and, therefore,
$$\lim_{t \to \infty} x(t; 0, x_0) = x_\infty.$$
If F = {0}, then ω(x0 ) is contained in a circle of radius r ≥ 0 centered at the
origin. If r = 0, then ω(x0 ) = {0}.
Remark 4.9 The method of $\omega$-limit sets represents a more recent contribution to the development of stability theory, due mainly to G.D. Birkhoff, J.P. LaSalle, C. Dafermos and others.
Problems
4.1 Prove that the matrix $A$ is Hurwitzian if and only if any solution $x(t)$ of the system $x' = Ax$ is absolutely integrable on $[0, \infty)$, that is,
$$\int_0^\infty \|x(t)\|\, dt < \infty. \qquad (4.82)$$
Hint. Let $x(t, x_0) = e^{tA}x_0$ and $\varphi(x_0) = \int_0^\infty \|x(t, x_0)\|\, dt$. By (4.82), it follows that
$$\sup_{t \geq 0} \|x(t, x_0)\| < \infty$$
and, taking into account that $x(t + s, x_0) = x(t, x(s, x_0))$, $\forall t, s \geq 0$, we get
$$\varphi(x(t, x_0)) = \int_t^\infty \|x(s, x_0)\|\, ds \to 0 \ \text{ as } t \to \infty.$$
Hence, any limit point $x_\infty = \lim_{t_n \to \infty} x(t_n, x_0)$ satisfies $\varphi(x_\infty) = 0$, that is, $x_\infty = 0$.
4.2 Find the stationary solutions of the equation
$$x'' + ax' + 2bx + 3x^2 = 0, \quad a, b > 0, \qquad (4.83)$$
and then investigate their stability.
Hint. The stationary solutions are $x_1(t) = 0$ and $x_2(t) = -\frac{2b}{3}$.
4.3 Find the stationary solutions of the systems below and then investigate their
stability using the first-order-approximation method.
$$x_1' = \sin(x_1 + x_2), \quad x_2' = e^{x_1} - 1, \qquad (4.84)$$
$$x_1' = 1 - x_1 x_2, \quad x_2' = x_1 - x_2, \qquad (4.85)$$
$$x_1' = x_1 x_2 + x_2 \cos(x_1^2 + x_2^2), \quad x_2' = -x_1^2 + x_2 \cos(x_1^2 + x_2^2). \qquad (4.86)$$
4.4 Investigate the stability of the stationary solutions of the Lotka–Volterra system
$$x_1' = x_1(1 - x_2), \quad x_2' = x_2(x_1 - 1). \qquad (4.87)$$
Hint. System (4.87) has two stationary solutions, $(0, 0)$ and $(1, 1)$. The first solution is not stable. Translating the solution $(1, 1)$ to the origin, we obtain that the function
$$V(y_1, y_2) = y_1 + y_2 - \ln\left((1 + y_1)(1 + y_2)\right)$$
is a Lyapunov function of the system thus obtained.
4.5 The differential system
mϕ′′ = mn2 ω 2 sin ϕ cos ϕ − mg sin ϕ − bϕ′
λω ′ = k cos ϕ − F
(4.88)
describes the dynamics of J. Watt’s centrifugal governor; see L.C. Pontryagin [17].
Use the first-order-approximation method to study the asymptotic stability of the
stationary solution ϕ = ϕ0 , ω = ω0 , to (4.88).
4.6 Let $H : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a positive definite $C^1$-function that is zero at $(0, 0)$. Prove that the trivial solution of the Hamiltonian system
$$x' = \frac{\partial H}{\partial p}(x, p), \quad p' = -\frac{\partial H}{\partial x}(x, p), \qquad (4.89)$$
is stable but not asymptotically stable.
Hint. The Hamiltonian function $H$ is a Lyapunov function of system (4.89).
4.7 Prove that the null solution of the system
x ′ = y, y′ = − sin x + y,
is not stable.
4.8 Investigate the stability of the stationary solutions of the system
x ′ = y − f (x), y′ = −x,
where f : R → R is a C 1 function.
(4.90)
Hint. System (4.90) is equivalent to the Liénard equation
$$x'' + x' f'(x) + x = 0$$
that arises in the theory of electrical circuits.
In the special case f (x) = x 3 − x, the equation is known as the Van der Pol
equation. The first-order-approximation method shows that the stationary solution
(0, f (0)) is stable if f ′ (0) > 0. One can reach the same conclusion from Theorem
4.6 by constructing a Lyapunov function of the form
V (x, y) = αx 2 + βy2 , α, β > 0.
4.9 Use the Lyapunov function method to prove that the trivial solution of the
damped pendulum equation
x ′′ + bx ′ + sin x = 0,
(4.91)
where b > 0, is asymptotically stable.
Hint. The equivalent system
x ′ = y, y′ = −by − sin x,
admits Lyapunov functions of the form
V (x, y) = αy2 + β(1 − cos x) + γxy,
(4.92)
for some suitable positive constants α, β, γ.
4.10 Let A be a real n × n matrix that is nonpositive, that is,
(Ax, x) ≤ 0, ∀x ∈ Rn ,
(4.93)
and let B be a real n × m matrix such that
rank [B, AB, A2 B, . . . , An−1 B] = n.
(4.94)
Prove that the matrix A − BB∗ is Hurwitzian.
Hint. As shown in Problem 3.23, assumption (4.94) implies that
$$B^* e^{tA^*} x = 0, \quad \forall t \geq 0 \quad \text{if and only if} \quad x = 0. \qquad (4.95)$$
It suffices to show that the matrix $A^* - BB^*$ is Hurwitzian. To this end, scalarly multiply the system
$$y' = (A^* - BB^*)y \qquad (4.96)$$
by $y$ and then integrate over $[0, \infty)$. We deduce that
$$\int_0^\infty \|B^* y(t)\|_e^2\, dt < \infty$$
for any solution of system (4.96). Assumption (4.95) then implies that
$$V(x) = \int_0^\infty \|B^* e^{t(A - BB^*)}x\|_e^2\, dt$$
is a Lyapunov function for system (4.96) and $V(y(t)) \to 0$ as $t \to \infty$.
Remark 4.10 The preceding problem shows that, under assumptions (4.93) and
(4.94), the system x′ = Ax + Bu can be stabilized using the feedback controller
u = −B∗ x.
(4.97)
4.11 Prove, using a suitable Lyapunov function, that the trivial solution of the system
$$x_1' = -2x_1 + 5x_2 + x_2^2, \quad x_2' = -4x_1 - 2x_2 + x_1^2$$
is asymptotically stable.
4.12 Using the Lyapunov function method, investigate the stability of the trivial
solution of the ODE
x ′′ + a(t)x ′ + b(t)x = 0.
Hint. Seek a Lyapunov function of the form
$$V(x_1, x_2) = x_1^2 + \frac{x_2^2}{b(t)}.$$
4.13 Consider the differential system
$$x' = f(x) + \sum_{i=1}^N u_i B_i(x), \quad x \in \mathbb{R}^n, \qquad (4.98)$$
where $u_i$, $i = 1, \ldots, N$, are real parameters and $f : \mathbb{R}^n \to \mathbb{R}^n$, $B_i : \mathbb{R}^n \to \mathbb{R}^n$, $i = 1, \ldots, N$, are locally Lipschitz functions. We assume that there exists a positive definite function $V$ such that
$$\langle f(x), \operatorname{grad} V(x) \rangle \leq 0, \quad \forall x \in \mathbb{R}^n, \qquad (4.99)$$
and the functions $\langle f(x), \operatorname{grad} V(x) \rangle$, $\langle B_i(x), \operatorname{grad} V(x) \rangle$, $i = 1, \ldots, N$, are not simultaneously zero on $\mathbb{R}^n$. Prove that the feedback controller
$$u_i := -\langle B_i(x), \operatorname{grad} V(x) \rangle$$
stabilizes system (4.98).
Hint. Verify that $V$ is a Lyapunov function for the differential system
$$x' = f(x) - \sum_{i=1}^N \langle B_i(x), \operatorname{grad} V(x) \rangle B_i(x).$$
4.14 The differential system
$$N' = -\frac{\alpha}{\ell}(T - T_0)N, \quad mCT' = N - N_0, \qquad (4.100)$$
is a simplified model for the behavior of a nuclear reactor, $N = N(t)$ denoting the power of the reactor at time $t$, $T = T(t)$ is the temperature, $\ell$ is the lifetime of the neutrons, $m$ is the mass of radioactive material and $\alpha$, $C$ are some positive parameters. Investigate the stability of the stationary solution $N = N_0 > 0$, $T = T_0$.
Hint. Making the change of variables
$$x_1 = \ln\frac{N}{N_0}, \quad x_2 = T - T_0,$$
the problem can be reduced to investigating the stability of the null solution of the differential system
$$x_1' = -\mu x_2, \quad x_2' = e^{x_1} - 1, \quad \mu := \frac{\alpha m C}{N_0 \ell},$$
which satisfies the assumptions of Theorem 4.7 for a Lyapunov function of the form
$$V(x_1, x_2) = \frac{\mu}{2}x_2^2 + \int_0^{x_1}\left(e^s - 1\right) ds.$$
4.15 Consider the control system
$$x' + ax = u, \quad t \geq 0, \qquad (4.101)$$
with the feedback synthesis $u = -\dfrac{\rho x}{|x|}$, where $a, \rho$ are positive constants. Prove that the solutions of the system that are not zero at $t = 0$ will reach the value zero in a finite amount of time. Find a physical model for this system.
Hint. Multiplying (4.101) by $\dfrac{x}{|x|}$, it follows that
$$\frac{d}{dt}|x(t)| + a|x(t)| + \rho = 0 \quad \text{on } \{t \geq 0;\ x(t) \neq 0\},$$
which implies that $x(t) = 0$ for $t \geq T = \dfrac{1}{a}\log\left(\dfrac{a|x(0)|}{\rho} + 1\right)$.
4.16 Consider the system
x′ + gradf (x) = 0,
(4.102)
where $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^2$-function such that, for any $\lambda \in \mathbb{R}$, the set
$$\{x \in \mathbb{R}^n;\ f(x) \leq \lambda\}$$
is bounded and the equation $\operatorname{grad} f(x) = 0$ has finitely many solutions $x_1, \ldots, x_m$.
Prove that any solution x = ϕ(t) of (4.102) is defined on the entire semi-axis [0, ∞)
and limt→∞ ϕ(t) exists and is equal to one of the stationary points x1 , . . . , xm .
Hint. Scalarly multiplying the system (4.102) by $\varphi'(t)$, we deduce that
$$\|\varphi'(t)\|_e^2 + \frac{d}{dt} f(\varphi(t)) = 0$$
on the maximal existence interval $[0, T)$. Thus
$$\int_0^t \|\varphi'(s)\|_e^2\, ds + f(\varphi(t)) = f(\varphi(0)), \quad \forall t \in [0, T).$$
Invoking Theorem 3.10, we deduce from the above inequality that $T = \infty$, $\varphi(t)$ is bounded on $[0, \infty)$ and $\lim_{t\to\infty} \varphi'(t) = 0$. Then, one applies Theorem 4.10.
Chapter 5
Prime Integrals and First-Order Partial
Differential Equations
In this chapter, we will investigate the concept of a prime integral of a system of ODEs
and some of its consequences in the theory of first-order partial differential equations.
An important part of this chapter is devoted to the study of the Cauchy problem for
such partial differential equations. These play an important role in mathematical
physics, mechanics and the calculus of variations. The treatment of such problems
is essentially of a geometric nature and it is based on properties of systems of ODEs.
5.1 Prime Integrals of Autonomous Differential Systems
Consider the autonomous system
x′ = f (x), x = (x1 , . . . , xn ),
(5.1)
where f : D → Rn is a C 1 -map on an open subset D ⊂ Rn . We begin by defining the
concept of a prime integral.
Definition 5.1 The scalar C 1 -function U(x) = U(x1 , . . . , xn ) defined on an open set
D0 ⊂ D is called a prime integral of system (5.1) if it is not identically constant, but
U ϕ(t) is constant for any trajectory x = ϕ(t) of system (5.1) that stays in D0 .
Theorem 5.1 The $C^1$-function $U$ on $D_0$ is a prime integral of system (5.1) if and only if
$$\langle \operatorname{grad} U(x), f(x) \rangle = 0, \quad \forall x \in D_0. \qquad (5.2)$$
Proof If U is a prime integral, then U(ϕ(t)) is constant for any solution ϕ(t) of
system (5.1). Thus
$$0 = \frac{d}{dt}U(\varphi(t)) = \sum_{i=1}^n \frac{\partial U}{\partial x_i}(\varphi(t))\, f_i(\varphi(t)) = \left(\operatorname{grad} U(\varphi(t)), f(\varphi(t))\right). \qquad (5.3)$$
Since any point of $D_0$ is contained in a trajectory of (5.1), we deduce that (5.2) holds on $D_0$. Conversely, (5.2) implies (5.3), which in turn implies that $U(\varphi(t))$ is constant for any solution $\varphi(t)$ of system (5.1).
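Criterion (5.2) is convenient to verify symbolically. A minimal sketch, applied to the rotation system of Example 4.6 with the prime integral $U = x_1^2 + x_2^2$:

```python
# Check <grad U, f> = 0 for f = (x2, -x1) and U = x1^2 + x2^2 using sympy.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = sp.Matrix([x2, -x1])
U = x1**2 + x2**2

print(sp.simplify(sp.Matrix([U]).jacobian([x1, x2]) @ f))   # Matrix([[0]])
```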
To investigate the existence of prime integrals, we need to introduce several concepts.
A point $a \in \mathbb{R}^n$ is called a critical point of system (5.1) if $f(a) = 0$. The point is called regular if $f(a) \neq 0$.
The $C^1$-functions $U_1, \ldots, U_k$, $k \leq n$, are called independent in a neighborhood of $a \in \mathbb{R}^n$ if the Jacobian matrix
$$\left(\frac{\partial U_i}{\partial x_j}(a)\right)_{1 \leq i \leq k,\ 1 \leq j \leq n} \qquad (5.4)$$
has rank $k$. Equivalently, this means that the vectors $\operatorname{grad} U_1(a), \ldots, \operatorname{grad} U_k(a)$ are linearly independent.
Theorem 5.2 In a neighborhood of a regular point $a \in \mathbb{R}^n$ of system (5.1), there exist exactly $(n-1)$ independent prime integrals.
Proof Let $a \in \mathbb{R}^n$ be such that $f(a) \neq 0$. Assume that the $n$-th component of the vector $f(a)$ is not zero:
$$f_n(a) \neq 0. \qquad (5.5)$$
We first prove that there exist at most $(n-1)$ independent prime integrals in the neighborhood of $a$. We argue by contradiction and assume that there exist $n$ independent prime integrals $U_1, \ldots, U_n$. From (5.2), we obtain that
$$\begin{aligned} \frac{\partial U_1}{\partial x_1}(a)f_1(a) + \cdots + \frac{\partial U_1}{\partial x_n}(a)f_n(a) &= 0, \\ &\ \ \vdots \\ \frac{\partial U_n}{\partial x_1}(a)f_1(a) + \cdots + \frac{\partial U_n}{\partial x_n}(a)f_n(a) &= 0. \end{aligned} \qquad (5.6)$$
Interpreting (5.6) as a linear homogeneous system with unknowns $f_1(a), \ldots, f_n(a)$, not all equal to zero, it follows that the determinant of this system must be zero, showing that the functions $U_1, \ldots, U_n$ cannot be independent in a neighborhood of $a$.
Let us prove that there exist $(n-1)$ independent prime integrals near $a = (a_1, \ldots, a_n)$. Denote by $x = \varphi(t; \lambda_1, \ldots, \lambda_{n-1})$ the solution of system (5.1) that satisfies the initial condition
$$x(0) = (\lambda_1, \ldots, \lambda_{n-1}, a_n).$$
More explicitly, we have
$$x_i = \varphi_i(t; \lambda_1, \ldots, \lambda_{n-1}), \quad i = 1, 2, \ldots, n. \qquad (5.7)$$
We get
$$\lambda_i = \varphi_i(0; \lambda_1, \ldots, \lambda_{n-1}), \quad i = 1, \ldots, n-1, \qquad a_n = \varphi_n(0; \lambda_1, \ldots, \lambda_{n-1}). \qquad (5.8)$$
Using Theorem 3.14, we deduce that the map
$$(t, \lambda_1, \ldots, \lambda_{n-1}) \mapsto \varphi(t; \lambda_1, \ldots, \lambda_{n-1})$$
is a $C^1$-function of $\lambda_1, \ldots, \lambda_{n-1}$. Moreover, using (5.8), we see that its Jacobian at the point $(0, a_1, \ldots, a_{n-1})$ is
$$\frac{D(\varphi_1, \ldots, \varphi_n)}{D(t, \lambda_1, \ldots, \lambda_{n-1})}(0, a_1, \ldots, a_{n-1}) = f_n(a) \neq 0. \qquad (5.9)$$
The inverse function theorem implies that in a neighborhood $\mathcal{V}(a)$ of the point $\varphi(0; a_1, \ldots, a_{n-1}) = a$ there exist $C^1$-functions $U_1, \ldots, U_{n-1}, V$ such that
$$\lambda_i = U_i(x), \quad i = 1, \ldots, n-1, \qquad t = V(x), \quad x \in \mathcal{V}(a). \qquad (5.10)$$
By construction, the functions $U_1, \ldots, U_{n-1}, V$ are independent on a neighborhood of $a$ so, in particular, the functions $U_1, \ldots, U_{n-1}$ are independent in a neighborhood of $a$.
Let us prove that $U_1, \ldots, U_{n-1}$ are prime integrals, that is,
$$U_i(\varphi(t)) = \text{constant}, \quad i = 1, \ldots, n-1,$$
for any solution $x = \varphi(t)$ of system (5.1). From (5.7), (5.8) and (5.10), it follows that $U_i(\varphi(t)) \equiv \lambda_i$, $i = 1, \ldots, n-1$, for any solution $\varphi(t)$ whose initial value is of the form $\varphi(0) = (\tilde\lambda, a_n)$, where $\tilde\lambda := (\lambda_1, \ldots, \lambda_{n-1})$. Consider now an arbitrary solution $x = \varphi(t; 0, x_0)$ of the system (5.1) which stays in $\mathcal{V}(a)$ and has the value $x_0$ at $t = 0$. As indicated in Sect. 2.5, the uniqueness theorem implies the group property
$$\varphi(t + \tau; 0, x_0) = \varphi\left(t; 0, \varphi(\tau; 0, x_0)\right). \qquad (5.11)$$
On the other hand, for $x_0 \in \mathcal{V}(a)$, the system
$$x_0 = \varphi\left(\tau; 0, (\tilde\lambda, a_n)\right)$$
has a unique solution $(\tau^0, \tilde\lambda^0)$. Using (5.11), we get
$$\varphi(t; 0, x_0) = \varphi\big(t; 0, \varphi(\tau^0; 0, (\tilde\lambda^0, a_n))\big) = \varphi\left(t + \tau^0; 0, (\tilde\lambda^0, a_n)\right) =: \psi(t).$$
From the above discussion, we deduce that $U_i(\psi(t)) = \text{constant}$. Hence
$$U_i\left(\varphi(t; 0, x_0)\right) = \text{constant}, \quad i = 1, \ldots, n-1.$$
This proves Theorem 5.2.
Roughly speaking, a system of (n − 1)-independent prime integrals plays the same
role for system (5.1) as a fundamental system of solutions for a linear differential
system. This follows from our next theorem.
Theorem 5.3 Let U1 , . . . , Un−1 be prime integrals of system (5.1) which are independent in a neighborhood V(a) of the point a ∈ Rn . Let W be an arbitrary prime
integral of system (5.1) defined on some neighborhood of a. Then there exists an open
neighborhood U of the point
U1 (a), . . . , Un−1 (a) ∈ Rn−1 ,
an open neighborhood W ⊂ V of a and a C 1 -function F : U → R, such that
W (x) = F U1 (x), . . . , Un−1 (x) , ∀x ∈ W.
(5.12)
Proof Fix a function $U_n \in C^1(\mathbb{R}^n)$ such that the system $\{U_1, \ldots, U_n\}$ is independent on a neighborhood $\mathcal{V}' \subset \mathcal{V}$ of $a$. The inverse function theorem implies that the $C^1$-map
$$\mathcal{V}' \ni x \mapsto \Phi(x) := (U_1(x), \ldots, U_n(x)) \in \mathbb{R}^n$$
is locally invertible near $a$. This means that there exists an open neighborhood $\mathcal{W}$ of $a$ such that $\Phi|_{\mathcal{W}}$ is a bijection onto a neighborhood $\hat{U}$ of $\Phi(a)$ and its inverse $\Phi^{-1} : \hat{U} \to \mathcal{W}$ is also $C^1$. The inverse is described by a collection of functions $W_1, \ldots, W_n \in C^1(\hat{U})$,
$$\hat{U} \ni u := (u_1, \ldots, u_n) \mapsto \Phi^{-1}(u) = (W_1(u), \ldots, W_n(u)) \in \mathcal{W}.$$
We deduce that
$$x_i = W_i(U_1(x), \ldots, U_n(x)), \quad \forall x = (x_1, \ldots, x_n) \in \mathcal{W},\ i = 1, \ldots, n, \qquad (5.13)$$
$$u_k = U_k(W_1(u), \ldots, W_n(u)), \quad \forall u \in \hat{U},\ k = 1, \ldots, n. \qquad (5.14)$$
Now, define the function $G \in C^1(\hat{U})$ by setting
$$G(u) := W(W_1(u), \ldots, W_n(u)), \quad \forall u \in \hat{U}. \qquad (5.15)$$
Equalities (5.13) imply that
$$W(x) = G(U_1(x), \ldots, U_n(x)), \quad \forall x \in \mathcal{W}. \qquad (5.16)$$
On the other hand, $G(u) = G(u_1, \ldots, u_n)$ is independent of $u_n$. Indeed, we have
$$\frac{\partial G}{\partial u_n} = \sum_{i=1}^n \frac{\partial W}{\partial x_i} \frac{\partial W_i}{\partial u_n}.$$
By Theorem 5.2, the system $W, U_1, \ldots, U_{n-1}$ is dependent, so $\operatorname{grad} W$ is a linear combination of $\operatorname{grad} U_1, \ldots, \operatorname{grad} U_{n-1}$. Hence, there exist functions $a_1(x), \ldots, a_{n-1}(x)$, defined on an open neighborhood of $a$, such that
$$\frac{\partial W}{\partial x_i} = \sum_{k=1}^{n-1} a_k(x) \frac{\partial U_k}{\partial x_i}, \quad i = 1, \ldots, n.$$
Hence
$$\frac{\partial G}{\partial u_n} = \sum_{i=1}^n \left(\sum_{k=1}^{n-1} a_k(x) \frac{\partial U_k}{\partial x_i}\right) \frac{\partial W_i}{\partial u_n} = \sum_{k=1}^{n-1} a_k(x) \sum_{i=1}^n \frac{\partial U_k}{\partial x_i} \frac{\partial W_i}{\partial u_n}.$$
From (5.14), we deduce that, for any $k = 1, \ldots, n-1$,
$$\sum_{i=1}^n \frac{\partial U_k}{\partial x_i} \frac{\partial W_i}{\partial u_n} = \frac{\partial u_k}{\partial u_n} = 0.$$
Thus, the function $F(u_1, \ldots, u_{n-1}) := G(u_1, \ldots, u_{n-1}, u_n)$ satisfies all the postulated conditions.
5.1.1 Hamiltonian Systems
A mechanical system with n degrees of freedom is completely determined by the vector q(t) = (q1 (t), . . . , qn (t)), representing the generalized coordinates of the system,
and its derivative q′ (t), representing the generalized velocity.
The behavior of the system is determined by a function
L : R2n → R, L = L(q, q′ )
called the Lagrangian of the system. More precisely, according to Hamilton's principle, any trajectory (or motion) of the system during the time interval $[0, T]$ is an extremal of the functional
$$S = \int_0^T L(q(t), q'(t))\, dt,$$
and, as such, it satisfies the Euler–Lagrange equation
$$\frac{d}{dt}\left(\frac{\partial L}{\partial q'}\right) - \frac{\partial L}{\partial q} = 0, \quad \forall t \in [0, T]. \qquad (5.17)$$
The functions $p_i := \frac{\partial L}{\partial q_i'}$ are called generalized momenta, while the functions $\frac{\partial L}{\partial q_i}$ are called generalized forces; see e.g. [13].
The function
$$H : \mathbb{R}^{2n} \to \mathbb{R}, \quad H = H(q, p),$$
defined via the Legendre transform (see Appendix A.6)
$$H(q, p) := \sup_{\tilde q \in \mathbb{R}^n} \left\{ (p, \tilde q) - L(q, \tilde q) \right\}, \qquad (5.18)$$
is called the generalized Hamiltonian. The definition of $H$ shows that we have the equality
$$H(q, p) + L(q, \tilde q) = (p, \tilde q), \quad \text{where } p = \frac{\partial L}{\partial \tilde q}(q, \tilde q). \qquad (5.19)$$
Thus, by setting
$$p := \frac{\partial L}{\partial q'}(q, q'),$$
we can rewrite (5.17) as follows:
$$p'(t) = -\frac{\partial H}{\partial q}(q(t), p(t)), \qquad q'(t) = \frac{\partial H}{\partial p}(q(t), p(t)), \quad t \in [0, T]. \qquad (5.20)$$
System (5.20) is called the Hamiltonian system associated with the mechanical system. Theorem 5.1 implies that the Hamiltonian function H is a prime integral of
system (5.20). In other words,
$$H(q(t), p(t)) = \text{constant}, \qquad (5.21)$$
for any trajectory $(q(t), p(t))$ of system (5.20).
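This can be checked symbolically for any concrete Hamiltonian. A sketch with the sample choice $H = \frac{1}{2}p^2 + 1 - \cos q$ (the pendulum energy):

```python
# For the Hamiltonian field (q', p') = (H_p, -H_q), the derivative of H along
# trajectories, H_q*q' + H_p*p', vanishes identically.
import sympy as sp

q, p = sp.symbols('q p')
H = p**2 / 2 + 1 - sp.cos(q)
dq, dp = sp.diff(H, p), -sp.diff(H, q)          # right-hand side of (5.20)

print(sp.simplify(sp.diff(H, q) * dq + sp.diff(H, p) * dp))   # 0
```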
In classical mechanics, the Hamiltonian of a system of $n$ particles of masses $m_1, \ldots, m_n$ has the form
$$H(q_1, \ldots, q_n; p_1, \ldots, p_n) = \sum_{k=1}^n \frac{1}{2m_k} p_k^2 + V(q_1, \ldots, q_n), \qquad (5.22)$$
where $V(q_1, \ldots, q_n)$ is the potential energy of the system. In other words, $H(q(t), p(t))$ is the total energy of the system at time $t$, and (5.21) is none other than the conservation of energy law.
The Hamiltonian systems are the most general conservative differential systems,
that is, for which the energy is a prime integral (see also (1.56) and (1.57)).
In the special case of conservative systems with a single degree of freedom, normalizing the mass to be 1, the Hamiltonian has the form
$$H(q, p) = \frac{1}{2}p^2 + G(q), \quad G(q) = \int_0^q g(r)\, dr,$$
and system (5.20) reduces to Newton's equation
$$x'' + g(x) = 0, \qquad (5.23)$$
or, equivalently, to the system
$$x' = p, \quad p' = -g(x), \qquad (5.24)$$
that we have already investigated. In this case, equality (5.21) becomes
$$\frac{1}{2}x'(t)^2 + G(x(t)) = C, \qquad (5.25)$$
where
$$C := \frac{1}{2}p_0^2 + G(x_0),$$
and $(p_0, x_0)$ are initial data for system (5.24). Integrating (5.25), we get
$$\sqrt{2}\, t = \int_{x_0}^{x(t)} \frac{dr}{\sqrt{C - G(r)}}, \quad x(0) = x_0. \qquad (5.26)$$
Equality (5.25) (respectively, (5.26)) describes a curve called the energy level.
Let us, additionally, assume that $g$ is $C^1$ and satisfies
$$u g(u) > 0, \quad \forall u \neq 0, \qquad g(-u) = -g(u), \quad \forall u \in \mathbb{R}. \qquad (5.27)$$
One can prove that, under these assumptions, for C sufficiently small, the solution
of (5.26) is periodic; see [3].
Equation (5.23) describes a general class of second-order ODEs. As we have
already seen, when g is linear, we obtain the harmonic oscillator equation, while, in
the case g(x) = sin x, we obtain the pendulum equation.
If $g(x) = \omega^2 x + \beta x^3$, then (5.23) is called the Duffing equation.
5.2 Prime Integrals of Non-autonomous Differential
Systems
Consider the differential system
x′ = f (t, x),
(5.28)
where f : Ω ⊂ Rn+1 → Rn is continuous, differentiable with respect to x, and the
derivative f x is continuous on the open set Ω. Imitating the preceding section, we will
say that the function V = V (t, x) : Ω → R is a prime integral of the system (5.28)
on an open subset Ω0 ⊂ Ω if V is C 1 on Ω0 , it is not identically constant on Ω0
and V (t, ϕ(t)) = constant for any solution ϕ(t) of (5.28) whose graph is contained
in Ω0 . The proof of the following characterization theorem is similar to the proof of
Theorem 5.1 and, therefore, we omit it.
Theorem 5.4 The $C^1$-function $V$ on $\Omega_0$ is a prime integral of system (5.28) if and only if it satisfies the equality
$$\frac{\partial V}{\partial t}(t, x) + \sum_{i=1}^n \frac{\partial V}{\partial x_i}(t, x)\, f_i(t, x) = 0, \quad \forall (t, x) \in \Omega_0. \qquad (5.29)$$
To prove other properties of the prime integrals of system (5.28), it suffices to observe
that this system can be regarded as an (n + 1)-dimensional autonomous differential
system.
Indeed, interpreting $(t, x)$ as unknown functions and introducing a new real variable $s$, we can rewrite (5.28) in the form
$$\frac{dx}{ds} = f(t, x), \qquad \frac{dt}{ds} = 1. \qquad (5.30)$$
In this fashion, a prime integral of (5.28) becomes a prime integral of the autonomous
system (5.30). Theorems 5.2 and 5.3 imply the following result.
Theorem 5.5 In a neighborhood of the point $(t_0, a_0) \in \Omega$, system (5.28) admits exactly $n$ independent prime integrals $V_1, \ldots, V_n$. Any other prime integral $V(t, x)$ has the form
$$V(t, x) = F\left(V_1(t, x), \ldots, V_n(t, x)\right), \qquad (5.31)$$
where $F(v_1, \ldots, v_n)$ is a differentiable function defined in a neighborhood of the point $\left(V_1(t_0, a_0), \ldots, V_n(t_0, a_0)\right) \in \mathbb{R}^n$.
The knowledge of $k$ independent prime integrals, $k < n$, allows the reduction of the dimension of the system. Indeed, if $U_1(t, x), \ldots, U_k(t, x)$ are $k$ independent prime integrals of system (5.28), then, locally, we have the equalities
$$U_1(t, x_1, \ldots, x_n) = C_1, \quad U_2(t, x_1, \ldots, x_n) = C_2, \quad \ldots, \quad U_k(t, x_1, \ldots, x_n) = C_k, \qquad (5.32)$$
where $x(t) = (x_1(t), \ldots, x_n(t))$ is a trajectory of the system, and $C_1, \ldots, C_k$ are constants.
Since the functions $U_1, \ldots, U_k$ are independent, we may assume that the functional determinant
$$\frac{D(U_1, \ldots, U_k)}{D(x_1, \ldots, x_k)}$$
is nonzero. In other words, the implicit system (5.32) can be solved with respect to $(x_1, \ldots, x_k)$; see Theorem A.3. We deduce that
$$x_1 = \varphi_1(t, x_{k+1}, \ldots, x_n; C_1, \ldots, C_k), \quad \ldots, \quad x_k = \varphi_k(t, x_{k+1}, \ldots, x_n; C_1, \ldots, C_k). \qquad (5.33)$$
In this fashion, the only unknown variables left in (5.28) are $x_{k+1}, \ldots, x_n$. In particular, the knowledge of $n$ independent prime integrals of the system is equivalent to solving it.
Example 5.1 Consider, for example, the differential system
$$x_1' = x_2^2, \quad x_2' = x_1 x_2.$$
Rewriting it in the symmetric form
$$\frac{dx_1}{x_2^2} = \frac{dx_2}{x_1 x_2},$$
we observe that $U(x_1, x_2) = x_1^2 - x_2^2$ is a prime integral. In other words, the general solution admits the representation
$$x_1^2 - x_2^2 = C, \qquad (5.34)$$
where $C$ is a real constant. An explicit form of the solution in the space $(x_1, x_2, t)$ can be found by setting $x_2^2 = x_1'$ and using this in (5.34). We obtain in this fashion a differential equation in $x_1$ and $t$ which, upon solving, yields an explicit description of $x_1$ as a function of $t$.
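Numerically, the conservation of $U$ is easy to observe. A minimal sketch (initial data chosen arbitrarily):

```python
# Along solutions of x1' = x2^2, x2' = x1*x2 the quantity U = x1^2 - x2^2
# remains constant.
from scipy.integrate import solve_ivp

rhs = lambda t, x: [x[1]**2, x[0] * x[1]]
sol = solve_ivp(rhs, (0.0, 0.5), [1.0, 0.5], rtol=1e-10, atol=1e-12)

U = sol.y[0]**2 - sol.y[1]**2
print(U[0], U[-1])    # both close to 0.75 = 1^2 - 0.5^2
```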
5.3 First-Order Quasilinear Partial Differential Equations
In this section, we will investigate the equation
$$\sum_{i=1}^n a_i(x, z)\, z_{x_i} = a(x, z), \quad x = (x_1, \ldots, x_n), \qquad (5.35)$$
with the unknown function $z = z(x)$, where, for $i = 1, \ldots, n$, the functions $a_i$ are $C^1$ on an open set $\Omega \subset \mathbb{R}^{n+1}$ and satisfy the condition
$$\sum_{i=1}^n a_i(x, z)^2 \neq 0, \quad \forall (x, z) \in \Omega. \qquad (5.36)$$
We denote by $z_{x_i}$ the partial derivatives $\frac{\partial z}{\partial x_i}$, $i = 1, \ldots, n$. Equation (5.35) is called a first-order, quasilinear partial differential equation.
Definition 5.2 A solution of (5.35) on an open set D ⊂ Rn is a function z ∈ C 1 (D)
that satisfies equality (5.35) for all x ∈ D.
Geometrically, the graph of a solution of (5.35) is a hypersurface in Rn+1 with the
property that the vector field (a1 , . . . , an , a) is tangent to this hypersurface at all of
its points.
We associate with (5.35) the system of ODEs
$$\frac{dx_i}{ds} = a_i(x, z), \quad i = 1, \ldots, n, \qquad \frac{dz}{ds} = a(x, z), \qquad (5.37)$$
called the characteristics equation. The solutions of (5.37), for which the existence theory presented in Sect. 2.1 applies, are called the characteristic curves of Eq. (5.35). We seek a solution of (5.35) described implicitly by an equation of the form
$$u(x, z) = 0.$$
Then
$$z_{x_i} = -\frac{u_{x_i}}{u_z}, \quad \forall i = 1, \ldots, n,$$
and thus we can rewrite (5.35) in the form
$$\sum_{i=1}^n a_i(x, z)\, u_{x_i} + a(x, z)\, u_z = 0, \quad (x, z) \in \Omega. \qquad (5.38)$$
Theorem 5.1 characterizing the prime integrals of autonomous systems of ODEs
shows that a function u is a solution of (5.38) if and only if it is a prime integral of
the characteristics equation (5.37).
Given our assumptions on the functions ai and a, we deduce that system (5.37)
admits n independent prime integrals U1 , . . . , Un on an open subset Ω ′ ⊂ Ω. The
general solution of (5.38) has the form (see Theorem 5.5)
$$u(x, z) = F\left(U_1(x, z), \ldots, U_n(x, z)\right), \quad (x, z) \in \Omega', \qquad (5.39)$$
where F is an arbitrary C 1 -function. Thus, solving (5.38), and indirectly (5.35),
reduces to solving the characteristics equation, that is, finding n independent prime
integrals of system (5.37).
Example 5.2 Consider the following first-order quasilinear PDE
$$x_1 z\, z_{x_1} + x_2 z\, z_{x_2} = -x_1 x_2. \qquad (5.40)$$
The characteristics equation is given by the system
$$\frac{dx_1}{ds} = x_1 z, \quad \frac{dx_2}{ds} = x_2 z, \quad \frac{dz}{ds} = -x_1 x_2,$$
or, in symmetric form,
$$\frac{dx_1}{x_1 z} = \frac{dx_2}{x_2 z} = -\frac{dz}{x_1 x_2}. \qquad (5.41)$$
From the first equality, we deduce that
$$\frac{dx_1}{x_1} = \frac{dx_2}{x_2}.$$
Hence
$$U_1(x_1, x_2, z) = \frac{x_1}{x_2}$$
is a prime integral of the characteristics system. System (5.41) also implies the equality
$$2z\, dz = -d(x_1 x_2).$$
Hence, the function
$$U_2(x_1, x_2, z) = z^2 + x_1 x_2$$
is another prime integral of (5.41) that is obviously independent of $U_1$. Thus, the general solution of (5.40) is given implicitly by the equation
$$F\left(\frac{x_1}{x_2},\ z^2 + x_1 x_2\right) = 0, \quad (x_1, x_2) \in \mathbb{R}^2,$$
where $F : \mathbb{R}^2 \to \mathbb{R}$ is an arbitrary $C^1$-function.
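Both prime integrals can be confirmed symbolically with a few lines; the check below computes $\langle \operatorname{grad} U, f \rangle$ for the characteristic field $f = (x_1 z,\, x_2 z,\, -x_1 x_2)$:

```python
# Verify that U1 = x1/x2 and U2 = z^2 + x1*x2 are prime integrals of (5.41).
import sympy as sp

x1, x2, z = sp.symbols('x1 x2 z')
f = sp.Matrix([x1 * z, x2 * z, -x1 * x2])

for U in (x1 / x2, z**2 + x1 * x2):
    gradU = sp.Matrix([U]).jacobian([x1, x2, z])
    print(sp.simplify((gradU @ f)[0]))    # 0 in both cases
```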
As in the case of systems of ODEs, when investigating partial differential
equations, we are especially interested in solutions satisfying additional conditions
or having prescribed values on certain parts of their domains of definition. In the
remainder of this section, we will study one such condition, which is a natural generalization of the initial condition in the case of ODEs.
5.3.1 The Cauchy Problem
In the space Rn+1 , consider the (n − 1)-dimensional submanifold Γ defined by the
equations
xi = ϕi (u1 , . . . , un−1 ), i = 1, . . . , n,
z = ϕ(u1 , . . . , un−1 ), u := (u1 , . . . , un−1 ) ∈ U ⊂ Rn−1 ,
(5.42)
where $\varphi$ and $\varphi_i$ are $C^1$-functions defined on the open set $U$. We will assume that
$$\det\left(\frac{\partial \varphi_i}{\partial u_j}\right)_{1 \leq i, j \leq n-1} \neq 0.$$
A solution of Eq. (5.35) satisfying the Cauchy condition (5.42) is a solution of (5.35) whose graph contains the manifold $\Gamma$, that is,
$$\varphi(u) = z\left(\varphi_1(u), \ldots, \varphi_n(u)\right), \quad \forall u \in U. \qquad (5.43)$$
We will prove the following existence result.
Theorem 5.6 Suppose that the following nondegeneracy condition is satisfied
$$\Delta = \det \begin{pmatrix} a_1(\varphi_1(u), \ldots, \varphi_n(u), \varphi(u)) & \cdots & a_n(\varphi_1(u), \ldots, \varphi_n(u), \varphi(u)) \\[4pt] \dfrac{\partial \varphi_1}{\partial u_1}(u) & \cdots & \dfrac{\partial \varphi_n}{\partial u_1}(u) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial \varphi_1}{\partial u_{n-1}}(u) & \cdots & \dfrac{\partial \varphi_n}{\partial u_{n-1}}(u) \end{pmatrix} \neq 0, \quad \forall u \in U. \qquad (5.44)$$
Then, the Cauchy problem (5.35), (5.42) has a unique solution defined in a neighborhood of the manifold $\Gamma$.
Proof Let us observe that the characteristics equation (5.37) defines locally an $(n+1)$-dimensional family of solutions
$$x_i = x_i(s; s_0, x_0, z_0), \quad i = 1, \ldots, n, \qquad z = z(s; s_0, x_0, z_0), \quad s \in I,$$
where $I \subset \mathbb{R}$ is an interval. In the above equations, we let $(x_0, z_0) \in \Gamma$, that is,
$$x_0 = \left(\varphi_1(u), \ldots, \varphi_n(u)\right), \quad z_0 = \varphi(u), \quad u \in U.$$
The quantities $(s, u)$ are solutions of the nonlinear system
$$x_i = x_i\left(s; s_0, \varphi_1(u), \ldots, \varphi_n(u), \varphi(u)\right), \qquad z = z\left(s; s_0, \varphi_1(u), \ldots, \varphi_n(u), \varphi(u)\right), \quad u \in U. \qquad (5.45)$$
From the characteristics equation (5.37), we deduce that
$$\frac{D(x_1, \ldots, x_n)}{D(s, u_1, \ldots, u_{n-1})} = \Delta \neq 0 \quad \text{for } s = s_0,\ x = x_0.$$
The inverse function theorem shows that the correspondence
$$(s, u_1, \ldots, u_{n-1}) \mapsto (x_1, \ldots, x_n)$$
defined by (5.45) is a diffeomorphism of an open set in the $(s, u)$-space onto a neighborhood of $\Gamma$. Thus, there exist $C^1$-functions $\Phi, \Phi_1, \ldots, \Phi_{n-1}$ defined on a neighborhood of $\Gamma$ such that
$$u_i = \Phi_i(x_1, \ldots, x_n), \quad i = 1, \ldots, n-1, \qquad s = \Phi(x_1, \ldots, x_n). \qquad (5.46)$$
This expresses the quantities $u_1, \ldots, u_{n-1}$ as functions of $x$ and, using this in the second equation in (5.45), we obtain a function $z = z(x)$ whose graph, by design, contains $\Gamma$. We want to prove that $z$ is a solution of (5.35).
Indeed, by construction, $z$ is part of the solution of the characteristics equation (5.37), so that
$$\frac{dz}{ds} = a(x, z).$$
On the other hand, from equalities (5.37) and (5.45), we deduce that
$$\frac{dz}{ds} = \sum_{i=1}^n \frac{\partial z}{\partial x_i} \frac{dx_i}{ds} = \sum_{i=1}^n z_{x_i}\, a_i(x, z).$$
This shows that $z$ is a solution of (5.35).
The uniqueness follows from the fact that any solution of the Cauchy problem (5.35), (5.42) is necessarily obtained by the above procedure. Indeed, if $\tilde z(x_1, \ldots, x_n)$ is another solution of the Cauchy problem (5.35), (5.42), then, for any initial vector $(s_0, x_0) \in \mathbb{R}^{n+1}$, the Cauchy problem
$$\frac{dx_i}{ds} = a_i\left(x, \tilde z(x)\right), \quad x_i(s_0) = x_i^0, \quad i = 1, \ldots, n, \qquad (5.47)$$
admits a local solution $x = x(s)$. Since $\tilde z(x)$ is a solution of (5.35), we deduce that
$$\frac{d}{ds} \tilde z(x(s)) = \sum_{i=1}^n \tilde z_{x_i}(x(s))\, a_i\left(x(s), \tilde z(x(s))\right) = a\left(x(s), \tilde z(x(s))\right).$$
In other words, the curve
$$s \mapsto \left(x(s), \tilde z(x(s))\right)$$
is a characteristic curve of Eq. (5.35). If $x_i^0 = \varphi_i(u)$, $i = 1, \ldots, n$, then necessarily $\tilde z(x(s_0)) = \varphi(u)$. From the uniqueness of the Cauchy problem for system (5.37), we see that
$$\tilde z\left(x_1(s), \ldots, x_n(s)\right) = z\left(x_1(s), \ldots, x_n(s)\right),$$
where $z = z(x)$ is the solution constructed earlier via equations (5.45). Hence $z = \tilde z$.
Remark 5.1 Hypothesis (5.44) is essential for the existence and uniqueness of the
Cauchy problem for (5.35). If Δ = 0 along the submanifold Γ , then the Cauchy
problem admits a solution only if Γ is a characteristic submanifold of Eq. (5.35),
that is, at every point of $\Gamma$ the vector field
$$\left(a_1(x, z), \ldots, a_n(x, z), a(x, z)\right)$$
is tangent to $\Gamma$. However, in this case the solution is not unique.
Let us also mention that, in the case n = 2, Eq. (5.35) reduces to
P(x, y, z)zx + Q(x, y, z)zy = R(x, y, z),
(5.48)
and the Cauchy problem consists in finding a function z = z(x, y) whose graph contains the curve
x = ϕ(u), y = ψ(u), z = χ(u), u ∈ I,
where P, Q, R are C 1 -functions on a domain of R3 , and ϕ, ψ, χ are C 1 -functions on
an interval I ⊂ R.
Example 5.3 Let us find a function $z = z(x, y)$ satisfying the first-order quasilinear PDE
$$x z_x + z z_y = y,$$
and such that its graph contains the line
$$y = 2z, \quad x + 2y = z. \qquad (5.49)$$
This line can be parameterized by the equations
$$x = -3u, \quad y = 2u, \quad z = u, \quad u \in I,$$
and we observe that assumption (5.44) in Theorem 5.6 is satisfied for $u \neq 0$. The characteristics equation is
$$\frac{dx}{ds} = x, \quad \frac{dy}{ds} = z, \quad \frac{dz}{ds} = y,$$
and its general solution is $x = x_0 e^s$, $y = \frac{1}{2}\left(e^s(y_0 + z_0) + e^{-s}(y_0 - z_0)\right)$, $z = \frac{1}{2}\left(e^s(y_0 + z_0) - e^{-s}(y_0 - z_0)\right)$. The graph of $z$ is filled by the characteristic curves originating at points on the line (5.49) and, as such, it admits the parametrization
$$x = -3ue^s, \quad y = \frac{u}{2}\left(3e^s + e^{-s}\right), \quad z = \frac{u}{2}\left(3e^s - e^{-s}\right).$$
5.4 Conservation Laws
A large number of problems in physics lead to partial differential equations of the form
$$z_x + a(z)\, z_y = 0, \quad (x, y) \in [0, \infty) \times \mathbb{R}, \qquad (5.50)$$
with the Cauchy condition
$$z(0, y) = \varphi(y), \quad \forall y \in \mathbb{R}. \qquad (5.51)$$
Here, a, ϕ : R → R are C 1 -functions. Equation (5.50) is known as a conservation
law equation and arises most frequently in the mathematical modeling of certain
dynamical phenomena that imply the conservation of certain quantities such as mass,
energy, momentum, etc.
If we denote by ρ(y, t) the density of that quantity at the point y ∈ R and at the
moment of time t ≥ 0, and by q(y, t) the flux per unit of time, then the conservation
law for that quantity takes the form
$$\frac{d}{dt}\int_{y_1}^{y_2} \rho(y, t)\, dy + q(y_2, t) - q(y_1, t) = 0.$$
Letting $y_2 \to y_1$, we deduce that
$$\frac{\partial \rho}{\partial t} + \frac{\partial q}{\partial y} = 0, \quad y \in \mathbb{R},\ t \geq 0. \qquad (5.52)$$
If the flux q is a function of ρ, that is,
q = Q(ρ),
(5.53)
then equation (5.52) becomes an equation of type (5.50)
ρt + Q′ (ρ)ρy = 0.
(5.54)
Let us illustrate this abstract model with several concrete examples (see, e.g., [19]).
Example 5.4 (Large waves in rivers) Consider a rectangular channel directed along the $y$-axis and of constant width. Denote by $q(y, t)$ the flux of water per unit of width and by $h(y, t)$ the height of the water wave in the channel at the point $y$ and moment $t$. The conservation of mass leads to (5.52) with $\rho = h$. Between the flux and the depth $h$, we have a dependency of type (5.53),
$$q = Q(h),$$
and, experimentally, it is found that $Q$ is given by
$$Q(h) = \alpha h^{3/2}, \quad h \geq 0.$$
In this case, the conservation of mass equation becomes
$$h_t + \frac{3\alpha}{2}\, h^{1/2}\, h_y = 0. \qquad (5.55)$$
The same Eq. (5.54) models the movement of icebergs. In this case, Q has the form
Q(h) = ChN , N ∈ (3, 5).
Example 5.5 (Traffic flow) Consider the traffic flow of cars on a highway directed
along the y-axis. If ρ(y, t) is the density of cars (number of cars per unit of length)
and v is the velocity, then the flux is given by q(y, t) = ρ(y, t)v(y, t). If the velocity
is a function of ρ, v = V (ρ), then the conservation law equation leads to
ρt + W (ρ)ρy = 0,
where W (ρ) = V (ρ) + ρV ′ (ρ).
Let us now return to the Cauchy problem (5.50), (5.51) and try to solve it using the method of characteristics described in the previous section. The characteristics equation for (5.50) has the form
$$\frac{dx}{ds} = 1, \quad \frac{dy}{ds} = a(z), \quad \frac{dz}{ds} = 0, \qquad (5.56)$$
while the curve that appears in (5.51) has the parametric description
$$x = 0, \quad y = t, \quad z = \varphi(t). \qquad (5.57)$$
The general solution of (5.56) is
$$x = s + x_0, \quad y = a(z_0)s + y_0, \quad z = z_0.$$
Thus, the graph of $z$ admits the parametrization
$$x = s, \quad y = a(\varphi(t))s + t, \quad z = \varphi(t),$$
or, equivalently,
$$z = \varphi\left(y - x a(z)\right), \quad x \geq 0,\ y \in \mathbb{R}. \qquad (5.58)$$
According to the implicit function theorem (Theorem A.3), Eq. (5.58) defines a $C^1$-function $z$ in a neighborhood of any point $(x_0, y_0)$ such that
$$x_0\, \varphi'\left(y_0 - x_0 a(z_0)\right) a'(z_0) \neq -1.$$
Thus, there exists a unique solution $z = z(x, y)$ to (5.50)–(5.51) defined on a tiny rectangle $[0, \delta] \times [-b, b] \subset \mathbb{R}^2$.
From the above construction, we can draw several important conclusions concerning the solution of (5.50). We think of the variable x as time and, for this reason,
we will relabel it t. Observe first that, if the function ϕ : R → R is bounded,
$$|\varphi(y)| \leq M, \quad \forall y \in \mathbb{R},$$
then (5.58) shows that the value of $z$ at the point $(t, \bar y)$ depends only on the initial condition on the interval
$$\{w \in \mathbb{R};\ |w - \bar y| \leq Ct\},$$
where
$$C := \sup_{|z| \leq M} |a(z)|.$$
In particular, if the initial data $\varphi$ is supported on the interval $[-R, R]$, then the solution $z(t, y)$ is supported in the region
$$\{(t, y);\ |y| \leq R + Ct\},$$
that is, z(t, y) = 0 for |y| > R + Ct. This property of the solution is called finite speed
propagation.
An interesting phenomenon involving the solutions of the conservation law equations is the appearance of singularities. More precisely, for large values of t, the
function z = z(t, y) defined by (5.58), that is,
z = ϕ(y − ta(z)), t ≥ 0, y ∈ R,
(5.59)
can become singular and even multivalued.
Take, for example, Eq. (5.55) where $\varphi(y) = y^2 + 1$. In this case, the equation (5.59) becomes
$$z = \left(y - \frac{3\alpha}{2}\, t z^{1/2}\right)^2 + 1.$$
The above equation describes a surface in the $(t, y, z)$-space which is the graph of a function $z = z(t, y)$ provided that
$$0 < t < \frac{2(y^2 + 1)}{3\alpha y}, \quad y > 0.$$
The solution becomes singular along the curve
$$3\alpha t y = 2(y^2 + 1).$$
It is interesting to point out that the formation of this singularity is in perfect agreement with physical reality. Let us recall that, in this case, z = h(t, y) represents the
height of the water wave at time t and at the point y in the channel, and ϕ(y) describes
the initial shape of the wave; see the left-hand side of Fig. 5.1.
Fig. 5.1 The evolution of water waves: the initial profile $h = \varphi(y)$ at $t = 0$ (left) and the wave $h = h(y, t)$ at $t > 0$ (right)
At the location $y$ along the channel, the wave will break at time
$$T(y) = \frac{2(y^2 + 1)}{3\alpha y}.$$
This corresponds to the formation of a singularity in the function y → h(t, y); see
the right-hand side of Fig. 5.1.
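The mechanism behind the breaking is transparent on the characteristics: each characteristic carries a constant value $z = \varphi(y_0)$ and travels with speed $a(z)$, so where $\varphi$ decreases the faster characteristics catch up with the slower ones ahead of them. A numerical sketch (with the illustration value $\alpha = 1$):

```python
# For phi(y) = y^2 + 1 and a(z) = (3/2)*sqrt(z), the map y0 -> y(t, y0) along
# the characteristics of (5.55) is monotone for small t but not for large t,
# at which point the profile has become multivalued.
import numpy as np

phi = lambda y: y**2 + 1.0
a = lambda z: 1.5 * np.sqrt(z)

y0 = np.linspace(-3.0, 3.0, 601)
for t in (0.1, 2.0):
    y = y0 + a(phi(y0)) * t                   # positions of the characteristics
    print(t, bool(np.all(np.diff(y) > 0)))    # True at t = 0.1, False at t = 2.0
```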
This example shows that the formation of singularities in the solutions to (5.50)
is an unavoidable reality that has physical significance. Any theory of these types
of equations would have to take into account the existence of solutions that satisfy
the equations in certain regions and become singular along some curves in the $(t, y)$-plane. Such functions, which are sometimes called “shocks”, cannot be solutions in
the classical sense and force upon us the need to extend the concept of solutions for
(5.50) using the concept of distribution or generalized function, as we did in Sect. 3.8.
Definition 5.3 A locally integrable function $z$ in the domain
$$D = \{(x, y) \in \mathbb{R}^2;\ x \geq 0\}$$
is called a weak solution of equation (5.50) if, for any $C^1$-function $\psi$ with compact support in $D$, we have
$$\iint_D \left(z\psi_x + A(z)\psi_y\right) dx\, dy + \int_{-\infty}^{\infty} \varphi(y)\psi(0, y)\, dy = 0, \qquad (5.60)$$
where
$$A(z) := \int_0^z a(r)\, dr.$$
Integrating by parts, we see that any C 1 -solution of (5.50) (let’s call it a classical
solution) is also a weak solution. On the other hand, a weak solution need not even be
continuous. Let us remark that, when dealing with weak solutions, we can allow the
initial function ϕ(y) to have discontinuities. Such situations can appear frequently
in real life examples.
Let us assume that the weak solution z of (5.50) is C 1 outside a curve
$$\Gamma = \{(x, y) \in D;\ y = \ell(x)\}.$$
Fig. 5.2 A weak solution with singularities along a curve $\Gamma$ separating the regions $D^-$ and $D^+$
We denote by $D^-$ and $D^+$ the two regions in which $D$ is divided by $\Gamma$; see Fig. 5.2. If, in (5.60), we first choose $\psi$ to have support in $D^-$ and then to have support in $D^+$, we deduce that
$$z_x(x, y) + a(z)\, z_y(x, y) = 0, \quad \forall (x, y) \in D^-, \qquad (5.61)$$
$$z_x(x, y) + a(z)\, z_y(x, y) = 0, \quad \forall (x, y) \in D^+, \qquad (5.62)$$
$$z(0, y) = \varphi(y), \quad \forall y \in \mathbb{R}. \qquad (5.63)$$
We set
$$z^\pm(x, \ell(x)) = \lim_{\substack{(x_1, y_1) \to (x, \ell(x)) \\ (x_1, y_1) \in D^\pm}} z(x_1, y_1).$$
Let ψ be an arbitrary C 1 function with compact support on D. Multiplying successively each of equations (5.61) and (5.62) by ψ, and integrating on D− and respectively D+ , we deduce that
0=
D−
+
0=
0
D+
−
0
zx + A(z)zy ψdxdy = −
∞
zψx + A(z)ψy dxdy
ℓ′ (x)z− − A(z− ) ψ(x, ℓ(x))dx −
zx + A(z)zy ψdxdy = −
∞
D−
′
+
+
D−
∞
ϕ(y)ψ(0, y)dy.
0
zψx + A(z)ψy dxdy
ℓ (x)z − A(z ) ψ(x, ℓ(x) )dx.
If we add the last two equalities and then use (5.60), we deduce that

∫_0^∞ [ ( ℓ′(x)z_− − A(z_−) ) − ( ℓ′(x)z_+ − A(z_+) ) ] ψ(x, ℓ(x)) dx = 0.

Since ψ is arbitrary, we deduce the pointwise equality

A(z_+(x, ℓ(x))) − A(z_−(x, ℓ(x))) = ℓ′(x)( z_+(x, ℓ(x)) − z_−(x, ℓ(x)) ),   (5.64)
∀x ≥ 0. In other words, along Γ we have the jump condition

A(z_+) − A(z_−) = ν(z_+ − z_−),   (5.65)

where ν = ℓ′(x) is the slope of the curve Γ. Equality (5.65) is called the Rankine–Hugoniot relation and describes the jump in velocity when crossing a shock curve; see [7]. For instance, for Burgers' equation, where a(z) = z and A(z) = z²/2, relation (5.65) says that a shock travels with slope ν = (z_+ + z_−)/2, the average of the two states.
Example 5.6 Consider the equation

z_x + z² z_y = 0   (5.66)

satisfying the Cauchy condition

z(0, y) = 0 for y ≤ 0, z(0, y) = 1 for y > 0.   (5.67)
Let us observe that the function

z(x, y) := 0 for y ≤ x/3, z(x, y) := 1 for y > x/3,   (5.68)

is a weak solution of the Cauchy problem (5.66), (5.67). Indeed, in this case A(z) = z³/3, so A(z) = 1/3 on the region y > x/3, and equality (5.60) reads

∫_0^∞ dx ∫_{x/3}^∞ ( ψ_x(x, y) + (1/3) ψ_y(x, y) ) dy + ∫_0^∞ ψ(0, y) dy = 0,

which is verified directly.
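Since "verified directly" can hide sign slips, the identity above is also easy to check numerically. The following sketch (our own illustration; the particular bump ψ = f(x)g(y) and the quadrature grid are arbitrary choices, not from the text) evaluates the left-hand side of (5.60) for the shock solution (5.68):

import numpy as np

# psi(x, y) = f(x) g(y): a C^1 bump, with support in |x| < 4, |y - 1| < 3
def f(x):  return np.where(np.abs(x) < 4, (1 - (x / 4)**2)**4, 0.0)
def fp(x): return np.where(np.abs(x) < 4, -(x / 2) * (1 - (x / 4)**2)**3, 0.0)
def g(y):  return np.where(np.abs(y - 1) < 3, (1 - ((y - 1) / 3)**2)**4, 0.0)
def gp(y): return np.where(np.abs(y - 1) < 3, -(8 * (y - 1) / 9) * (1 - ((y - 1) / 3)**2)**3, 0.0)

x = np.linspace(0.0, 4.0, 801); y = np.linspace(-2.0, 4.0, 1201)
dx, dy = x[1] - x[0], y[1] - y[0]
X, Y = np.meshgrid(x, y, indexing="ij")
# z = 1 and A(z) = 1/3 above the shock line y = x/3; both vanish below it
area = np.where(Y > X / 3, fp(X) * g(Y) + f(X) * gp(Y) / 3, 0.0).sum() * dx * dy
init = np.where(y > 0, g(y), 0.0).sum() * dy          # f(0) = 1, phi = 1 for y > 0
print(area + init)   # approximately 0, up to the O(h) quadrature error at the shock line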
The weak solutions are not unique in general. Besides (5.68), the Cauchy problem (5.66), (5.67) also admits the solution

z(x, y) := 0 for y/x < 0, z(x, y) := (y/x)^{1/2} for 0 ≤ y/x ≤ 1, z(x, y) := 1 for y/x > 1.
The nonuniqueness of weak solutions requires finding criteria that select, from the collection of all possible weak solutions of a given problem, those that have a physical significance. A criterion frequently used is the entropy criterion, according to which we choose only the solutions for which the entropy of the system is increasing; for problem (5.66), (5.67), it is the second, continuous solution above that is selected in this way.
5.5 Nonlinear Partial Differential Equations
In this section, we will investigate first-order nonlinear PDEs of the form
F(x1 , x2 , . . . , xn , z, p1 , p2 , . . . , pn ) = 0,
(5.69)
where F is a C 2 function on an open set Ω ⊂ R2n+1 and we denote by pi the functions
pi (x1 , . . . , xn ) := zxi (x1 , . . . , xn ), i = 1, 2, . . . , n.
(We note that (5.35) is a particular case of (5.69).)
By a solution of (5.69), we understand a C 1 -function z = z(x1 , . . . , xn ) defined
on an open set D ⊂ Rn and satisfying (5.69) for all x = (x1 , . . . , xn ) ∈ D. Such a
solution is called an integral manifold of equation (5.69).
Consider an (n − 1)-dimensional submanifold Γ of the Euclidean space Rn+1
with coordinates (x1 , . . . , xn , z) described parametrically by equations (5.42), that
is,
xi = ϕi (u), i = 1, . . . , n, u = (u1 , . . . , un−1 ) ∈ U,
(5.70)
z = ϕ(u).
As in the case of quasilinear partial differential equations, we define a solution of
the Cauchy problem associated with equation (5.69) and the manifold Γ to be a
solution of (5.69) whose graph contains the manifold Γ . In the sequel, we will use
the notations
Z := F_z, X_i := F_{x_i}, P_i := F_{p_i}, i = 1, . . . , n,

and we will impose the nondegeneracy condition

det [ P_1             P_2             · · ·  P_n
      ∂ϕ_1/∂u_1       ∂ϕ_2/∂u_1       · · ·  ∂ϕ_n/∂u_1
      ...             ...             · · ·  ...
      ∂ϕ_1/∂u_{n−1}   ∂ϕ_2/∂u_{n−1}   · · ·  ∂ϕ_n/∂u_{n−1} ] ≠ 0 on U.   (5.71)
Theorem 5.7 Under the above assumptions, the Cauchy problem (5.69), (5.70) has
a unique solution defined in a neighborhood of the manifold Γ .
Proof The proof has a constructive character and highlights a method known in the literature as the method of characteristics, or Cauchy's method, already used for equation (5.35).
We associate to equation (5.69) a system of ODEs, the so-called characteristics
equation
dx_i/ds = P_i, i = 1, . . . , n,

dz/ds = Σ_{i=1}^n p_i P_i, s ∈ I,   (5.72)

dp_i/ds = −(p_i Z + X_i), i = 1, . . . , n.
Since the functions P_i, X_i and Z are C¹, the existence and uniqueness theorem implies that, for any s₀ ∈ I and (x^0, z^0, p^0) ∈ R^{2n+1}, system (5.72) with the initial conditions

x_i(s₀) = x_i^0, z(s₀) = z^0, p_i(s₀) = p_i^0,

admits a unique solution

x_i = x_i(s; s₀, x^0, z^0, p^0), i = 1, . . . , n,
z = z(s; s₀, x^0, z^0, p^0),   (5.73)
p_i = p_i(s; s₀, x^0, z^0, p^0), i = 1, . . . , n.
The function s → ( xi (s), z(s) ) is called a characteristic curve.
The integral manifold we seek is determined by the family of characteristic curves
originating at the moment s = s0 at points on the initial manifold Γ (see Fig. 5.3).
Thus, we are led to define
x_i^0 = ϕ_i(u_1, . . . , u_{n−1}), i = 1, . . . , n,
z^0 = ϕ(u_1, . . . , u_{n−1}).   (5.74)
The fact that the function whose graph is this surface has to satisfy (5.69) adds another constraint,

F( ϕ_1(u), . . . , ϕ_n(u), ϕ(u), p_1^0, . . . , p_n^0 ) = 0, ∀u ∈ U.   (5.75)
Fig. 5.3 The integral manifold is filled by the characteristic curves x_i = x_i(s), z = z(s) emanating from Γ.
Finally, we will eliminate the last degrees of freedom by imposing the compatibility conditions

Σ_{i=1}^n p_i^0 ∂ϕ_i(u)/∂u_j = ∂ϕ(u)/∂u_j, ∀u ∈ U, j = 1, . . . , n − 1.   (5.76)
The geometric significance of equations (5.76) should be clear: since the collection of vectors

( ∂ϕ_1/∂u_j, . . . , ∂ϕ_n/∂u_j, ∂ϕ/∂u_j )^T, j = 1, . . . , n − 1,

forms a basis of the tangent space of the manifold Γ at (ϕ_1(u), . . . , ϕ_n(u), ϕ(u)), conditions (5.76) state that the vector

( p_1^0, . . . , p_n^0, −1 )^T

is normal to Γ at that point.
Taking into account the nondegeneracy assumption (5.71), we deduce from the implicit function theorem that the system (5.75), (5.76) uniquely determines a system of C¹-functions

p_i^0 = p_i^0(u), u ∈ U₀ ⊂ U, i = 1, . . . , n,   (5.77)

defined on some open subset U₀ ⊂ U. Substituting equations (5.74) and (5.77) into (5.73), we obtain for the functions x_i, z, p_i expressions of the type

x_i = A_i(s, u), i = 1, . . . , n,
z = B(s, u), s ∈ I₀, u ∈ U₀,   (5.78)
p_i = E_i(s, u), i = 1, . . . , n,

where A_i, B and E_i are C¹-functions on the domain I₀ × U₀ ⊂ I × U.
We will show that the first (n + 1) equations in (5.78) define parametrically a solution z = z(x) of the Cauchy problem (5.69), (5.70) and that the vector field (E_1, . . . , E_n, −1) is normal to the graph of this function; more precisely,

E_i(s, u) = z_{x_i}(s, u), ∀(s, u) ∈ I₀ × U₀, i = 1, . . . , n.   (5.79)
Indeed, from equations (5.70) and (5.71), we deduce

D(A_1, . . . , A_n)/D(u_1, . . . , u_{n−1}, s) ≠ 0 for s = s₀,

since at s = s₀ this Jacobian coincides, up to a sign, with the determinant in (5.71). We deduce that the system formed by the first n equations in (5.78) can be solved uniquely for (s, u) in terms of x. Substituting the resulting functions s = s(x), u = u(x) into the definition of z, we obtain a function z = z(x).
To prove equalities (5.79), we start from the obvious equalities

B_s = Σ_{i=1}^n z_{x_i} ∂A_i/∂s, u ∈ U₀, s ∈ I₀,

B_{u_j} = Σ_{i=1}^n z_{x_i} ∂A_i/∂u_j, j = 1, . . . , n − 1, u ∈ U₀, s ∈ I₀.
Hence, equalities (5.79) are equivalent to the following system of equations

B_s = Σ_{i=1}^n E_i (A_i)_s, u ∈ U₀, s ∈ I₀,   (5.80)

B_{u_j} = Σ_{i=1}^n E_i (A_i)_{u_j}, u ∈ U₀, s ∈ I₀, j = 1, . . . , n − 1.   (5.81)
Equation (5.80) follows immediately from the characteristics equation (5.72). To prove (5.81), we introduce the functions L_j : I₀ × U₀ → R,

L_j(s, u) = Σ_{i=1}^n E_i (A_i)_{u_j} − B_{u_j}, j = 1, . . . , n − 1.   (5.82)

From equations (5.76), we deduce that

L_j(s₀, u) = 0, ∀u ∈ U₀, j = 1, . . . , n − 1.   (5.83)
On the other hand, from equalities (5.72), (5.78) and (5.82), we deduce that

∂L_j/∂s = Σ_{i=1}^n (E_i)_s (A_i)_{u_j} + Σ_{i=1}^n E_i ∂²A_i/∂u_j∂s − ∂²B/∂u_j∂s
        = − Σ_{i=1}^n (E_i Z + X_i)(A_i)_{u_j} + Σ_{i=1}^n E_i (P_i)_{u_j} − Σ_{i=1}^n (E_i P_i)_{u_j}.   (5.84)
Let us now observe that, for any (s, u) ∈ I₀ × U₀, we have the equality

F( A_1(s, u), . . . , A_n(s, u), B(s, u), E_1(s, u), . . . , E_n(s, u) ) = 0.   (5.85)

Indeed, for s = s₀, equality (5.85) reduces to (5.75). On the other hand, using again system (5.72), we deduce that

∂/∂s F( A_1(s, u), . . . , A_n(s, u), B(s, u), E_1(s, u), . . . , E_n(s, u) )
 = Σ_{i=1}^n X_i (A_i)_s − Σ_{i=1}^n P_i (E_i Z + X_i) + Z Σ_{i=1}^n E_i P_i = 0, ∀(s, u) ∈ I₀ × U₀,

which immediately implies (5.85).
Differentiating (5.85) with respect to u_j, we deduce that

Σ_{i=1}^n X_i (A_i)_{u_j} + Z B_{u_j} + Σ_{i=1}^n P_i (E_i)_{u_j} = 0.
Using the above equality in (5.84), we find that

∂L_j/∂s = ( B_{u_j} − Σ_{i=1}^n E_i (A_i)_{u_j} ) Z = −L_j(s, u) Z(s, u),
∀j = 1, . . . , n − 1, (s, u) ∈ I₀ × U₀.
The above equality defines a first-order linear ODE (in the variable s) for each fixed u ∈ U₀. Since, by (5.83), L_j(s₀, u) = 0, we obtain that

L_j(s, u) = 0, ∀(s, u) ∈ I₀ × U₀, j = 1, . . . , n − 1.

This proves equalities (5.81), and thus z = z(x) is a solution of (5.69). Equalities (5.74) show that the graph of this function contains the manifold Γ.
The uniqueness of the solution of the Cauchy problem can be proved using the
same method employed in the proof of Theorem 5.6. This completes the proof of
Theorem 5.7.
Remark 5.2 In the case n = 2, the Cauchy problem specializes to the following: Find the surface z = z(x, y) that satisfies the nonlinear first-order PDE

F(x, y, z, p, q) = 0,   (5.86)

and contains the curve

x = ϕ(t), y = ψ(t), z = χ(t), t ∈ I,   (5.87)

where F is a C²-function defined on a domain Ω ⊂ R⁵, and ϕ, ψ, χ are C¹-functions on a real interval I.
If we denote by X, Y, Z, P, Q the functions

X = F_x, Y = F_y, Z = F_z, P = F_p, Q = F_q,

then the characteristic system (5.72) becomes

dx/ds = P, dy/ds = Q, dz/ds = pP + qQ,
dp/ds = −(pZ + X), dq/ds = −(qZ + Y), s ∈ I.   (5.88)
The sought-after surface will be spanned by a one-parameter family of characteristic curves, that is, solutions of (5.88):

x = x(s; s₀, x₀, y₀, z₀, p₀, q₀),
y = y(s; s₀, x₀, y₀, z₀, p₀, q₀),
z = z(s; s₀, x₀, y₀, z₀, p₀, q₀), s ∈ I₀,   (5.89)
p = p(s; s₀, x₀, y₀, z₀, p₀, q₀),
q = q(s; s₀, x₀, y₀, z₀, p₀, q₀),

where, according to the general procedure (see Eqs. (5.74), (5.75), (5.76)), the initial vector (x₀, y₀, z₀, p₀, q₀) is determined by the conditions

x₀ = ϕ(t), y₀ = ψ(t), z₀ = χ(t), t ∈ I,
F(x₀, y₀, z₀, p₀, q₀) = 0,
p₀ϕ′(t) + q₀ψ′(t) = χ′(t).

In this fashion, system (5.89) defines the sought-after surface in parametric form

x = A(s, t), y = B(s, t), z = C(s, t).

Let us observe that, in this case, condition (5.71) becomes

det [ P(ϕ(t), ψ(t), χ(t), p₀, q₀)   Q(ϕ(t), ψ(t), χ(t), p₀, q₀)
      ϕ′(t)                          ψ′(t)                       ] ≠ 0.
Example 5.7 Let us illustrate the above techniques on an optimal-time problem

T(x₀) := inf { T; x(T) = x₀ },   (5.90)
where x₀ is an arbitrary point in Rⁿ and the infimum is taken over all the solutions x ∈ C¹([0, T], Rⁿ) of the differential inclusion

x′(t) ∈ U(x(t)), t ∈ [0, T], x(0) = 0.   (5.91)

Above, for any x, U(x) is a compact subset of Rⁿ. In the sequel, we assume that the function T : Rⁿ → R is C¹ on Rⁿ \ {0}.
If x* ∈ C¹([0, T*], Rⁿ) is an optimal arc in problem (5.90), that is, a solution of the multivalued system of ODEs (5.91) such that the infimum in (5.90) is attained for this solution, then we obviously have

T(x*(t)) = t, ∀t ∈ [0, T*],

because T* is optimal. Hence

( ∇T(x*(t)), x*′(t) ) = 1, ∀t ∈ [0, T*].

In particular, for t = T*, setting u₀ := x*′(T*) ∈ U(x₀), we have

( ∇T(x₀), u₀ ) = 1.   (5.92)
On the other hand, for any solution of system (5.91) satisfying x(T ) = x0 , we
have
T (x(t)) ≤ t − s + T (x(s)), 0 ≤ s ≤ t ≤ T .
Hence
(∇T (x(t)), x′ (t)) ≤ 1, ∀t ∈ [0, T ].
It follows that, for t = T , we have
(∇T (x0 ), u) ≤ 1, ∀u ∈ U(x0 ).
(5.93)
We set z(x₀) = T(x₀) and we define the function H₀ : Rⁿ × Rⁿ → R,

H₀(x, w) = sup { (w, u); u ∈ U(x) }.

From equalities (5.92) and (5.93), we deduce that z is a solution of the equation

H₀(x, ∇z(x)) = 1, x ≠ 0.   (5.94)
Consider the special case in which

U(x) := { u ∈ Rⁿ; ‖u‖_e ≤ v(x) }, x ∈ Rⁿ,

where v : Rⁿ → (0, ∞) is a given C¹-function. In this case, we have

H₀(x, w) = ‖w‖_e v(x),

and so, Eq. (5.94) becomes

‖∇z(x)‖²_e = 1/v(x)², ∀x ≠ 0.   (5.95)
In the special case where n = 3 and v(x) is the speed of light in a nonhomogeneous
medium, (5.95) is the fundamental equation of geometric optics and it is called the
eikonal equation.
From Fermat’s principle, it follows that the surface z(x1 , x2 , x3 ) = λ is the wave
front of light propagation in a nonhomogeneous medium.
Equation (5.95), that is,

z²_{x₁} + z²_{x₂} + z²_{x₃} = n(x) := v(x)^{−2},

has the form (5.69) and can be solved via the method described above. The solutions x_i = x_i(s) of the characteristic system associated with (5.95),

dx_i/ds = 2p_i, i = 1, 2, 3, dz/ds = 2(p₁² + p₂² + p₃²) = 2n,
dp_i/ds = n_{x_i}, i = 1, 2, 3

(taking F = p₁² + p₂² + p₃² − n(x) in (5.72)),
represent, in the geometric optics context, light rays. Consider in the space R4 the
surface Γ described by the parametrization

x₁ = ϕ₁(u₁, u₂), x₂ = ϕ₂(u₁, u₂), x₃ = ϕ₃(u₁, u₂), z = ϕ(u₁, u₂).   (5.96)

We want to solve the Cauchy problem associated with equation (5.94) and the surface Γ. We assume, for simplicity, that the medium is homogeneous, that is, n is a constant function. The characteristic curves of equation (5.95) are given by

x_i = x_i^0 + 2p_i^0 s, i = 1, 2, 3, z = z^0 + 2ns, p_i = p_i^0, i = 1, 2, 3,   (5.97)
and the initial conditions at s = 0 are determined by constraints (5.74), (5.75), (5.76), that is,

x₁^0 = ϕ₁(u₁, u₂), x₂^0 = ϕ₂(u₁, u₂), x₃^0 = ϕ₃(u₁, u₂), z^0 = ϕ(u₁, u₂),

(p₁^0)² + (p₂^0)² + (p₃^0)² = n,

p₁^0 ∂ϕ₁/∂u₁ + p₂^0 ∂ϕ₂/∂u₁ + p₃^0 ∂ϕ₃/∂u₁ = ∂ϕ/∂u₁,
p₁^0 ∂ϕ₁/∂u₂ + p₂^0 ∂ϕ₂/∂u₂ + p₃^0 ∂ϕ₃/∂u₂ = ∂ϕ/∂u₂,
from which we obtain a parametrization of the graph of the solution z = z(x1 , x2 , x3 ),
which contains the manifold Γ .
5.6 Hamilton–Jacobi Equations
This section is devoted to the following problem:

z_t + H₀(x, z_x, t) = 0, t ∈ I ⊂ R, x ∈ D ⊂ R^{n−1},   (5.98)

z(x, 0) = ϕ(x), ∀x ∈ D,   (5.99)
where x = (x1 , . . . , xn−1 ), zx = (zx1 , . . . , zxn−1 ), H0 is a C 2 -function on a domain
Ω ⊂ R2n−1 , and ϕ ∈ C 1 (D).
Problem (5.98), (5.99) is of type (5.69), (5.70), where the variable xn was denoted
by t and
F(x1 , ..., xn−1 , t, z, p1 , ..., pn−1 , pn ) = pn + H0 (x1 , ..., xn−1 , p1 , ..., pn−1 , t),
ϕi (u1 , ..., un−1 ) = ui , ∀i = 1, ..., n − 1.
We deduce by Theorem 5.7 that problem (5.98), (5.99) has a unique (locally defined)
solution that can be determined using the strategy presented in the previous section.
Equation (5.98) is called the Hamilton–Jacobi equation and occupies a special
place amongst the equations of mathematical physics being, among many other
things, the fundamental equation of analytical mechanics. This equation appears in
many other contexts, more often variational, which will be briefly described below.
Consider the C¹-function L : R^{n−1} × R^{n−1} × R → R and define H : R^{n−1} × R^{n−1} × R → R by setting

H(x, p, t) := sup_{v ∈ R^{n−1}} { (p, v) − L(x, v, t) },   (5.100)
where (−, −) denotes, as usual, the canonical scalar product on R^{n−1}. In the sequel, we make the following assumption.

(A₁) The function H is C² on a domain Ω ⊂ R^{2n−1}.

If we interpret the function L at time t as a Lagrangian on the space R^{n−1} × R^{n−1}, then the function H is, for any t, the corresponding Hamiltonian function (see (5.18)).
Consider the solution w = w(x, t) of the equation
wt − H(x, −wx , t) = 0, x ∈ Rn−1 , t ∈ [0, T ],
(5.101)
satisfying the Cauchy condition
w(x, T ) = ϕ(x), x ∈ Rn−1 .
(5.102)
Using the change of variables

w(x, t) = z(x, T − t),

we see that equation (5.101) reduces to (5.98) with

H₀(x, p, t) = H(x, −p, T − t).

Thus, we see that (5.101), (5.102) is equivalent to problem (5.98), (5.99).
Consider the following variational problem

inf { ∫_0^T L(x(s), x′(s), s) ds; x ∈ C¹([0, T], R^{n−1}), x(0) = x₀ },   (5.103)

and the function S : R^{n−1} × [0, T] → R given by

S(x₀, t) = inf { ∫_t^T L(x(s), x′(s), s) ds + ϕ(x(T)); x ∈ C¹([0, T], R^{n−1}), x(t) = x₀ }.   (5.104)
In particular, S(x0 , 0) coincides with the infimum in (5.103). In analytical mechanics,
the function S is called the action functional of the Lagrangian. A function x that
realizes the infimum in (5.103) or (5.104) is called an optimal arc.
The connection between equation (5.101) and the infimum (5.104) will be
described in Theorem 5.8 below. With this in mind, we consider the solution
w = w(x, t) of problem (5.101), (5.102). We denote by U(x, t) the vector where
the supremum in (5.100) is attained in the case p = −wx , that is,
H( x, −w_x(x, t), t ) = −( w_x(x, t), U(x, t) ) − L( x, U(x, t), t ).   (5.105)
Concerning the function U : R^{n−1} × [0, T] → R^{n−1}, we will make the following assumption.

(A₂) The function U is continuous and, for any (x₀, t) ∈ R^{n−1} × [0, T], the Cauchy problem

x′(s) = U(x(s), s), t ≤ s ≤ T, x(t) = x₀,   (5.106)

has a unique C¹-solution on the interval [t, T].
Theorem 5.8 Under the above assumptions, let w : R^{n−1} × [0, T] → R be the solution of (5.101), (5.102). Then

S(x₀, t) = w(x₀, t), ∀x₀ ∈ R^{n−1}, t ∈ [0, T],   (5.107)

and, for any (x₀, t) ∈ R^{n−1} × [0, T], the solution x = x̃(s) of problem (5.106) is an optimal arc for problem (5.104).
Proof Let x ∈ C¹([t, T], R^{n−1}) be an arbitrary function such that x(t) = x₀. From the obvious equality

d/ds w(x(s), s) = w_s(x(s), s) + ( w_x(x(s), s), x′(s) ), ∀s ∈ [t, T],   (5.108)

and from (5.100), (5.101), it follows that

d/ds w(x(s), s) = H( x(s), −w_x(x(s), s), s ) + ( w_x(x(s), s), x′(s) ) ≥ −L( x(s), x′(s), s ), ∀s ∈ [t, T].

Integrating over [t, T], we obtain

w(x₀, t) ≤ ∫_t^T L(x(s), x′(s), s) ds + ϕ(x(T)).
Since the function x is arbitrary, we deduce that
w(x0 , t) ≤ S(x0 , t).
(5.109)
Consider now the solution x = x̃(s) of the Cauchy problem (5.106). From equality (5.105), it follows that

d/ds w(x̃(s), s) = H( x̃(s), −w_x(x̃(s), s), s ) + ( w_x(x̃(s), s), x̃′(s) ) = −L( x̃(s), U(x̃(s), s), s ), ∀s ∈ [t, T].
Integrating over the interval [t, T], we deduce that

w(x₀, t) = ∫_t^T L( x̃(s), U(x̃(s), s), s ) ds + ϕ( x̃(T) ) ≥ S(x₀, t).

The last inequality, coupled with (5.109), shows that w ≡ S. On the other hand, from the last inequality we also deduce that the solution x̃ of problem (5.106) is an optimal arc for problem (5.104).
Let us now assume that, besides (A₁) and (A₂), we know that, for any (x, t) ∈ R^{n−1} × [0, T], the function v → L(x, v, t) is convex. According to Proposition A.2, we have

L(x, v, t) + H(x, p, t) = (p, v) for v = H_p(x, p, t).   (5.110)

The function U from (5.105) is thus given by

U(x, t) = H_p( x, −w_x(x, t), t ), ∀x ∈ R^{n−1}, t ∈ [0, T].   (5.111)
In the particular case

L(x, v, t) = (1/2)( Q(t)x, x ) + (1/2)‖v‖²_e, ∀(x, v, t) ∈ R^{n−1} × R^{n−1} × [0, T],

where Q(t) is an (n − 1) × (n − 1) matrix, we have

H(x, p, t) = (1/2)‖p‖²_e − (1/2)( Q(t)x, x ), ∀(x, p, t) ∈ R^{n−1} × R^{n−1} × [0, T],

and we deduce that

U(x, t) = −w_x(x, t), ∀(x, t) ∈ R^{n−1} × [0, T].

Equation (5.101) becomes

w_t − (1/2)‖w_x‖²_e = −(1/2)( Q(t)x, x ).   (5.112)
If we seek for (5.112) a solution of the form

w(x, t) = (1/2)( P(t)x, x ), x ∈ R^{n−1}, t ∈ [0, T],   (5.113)

where P(t) is an (n − 1) × (n − 1) real symmetric matrix, then (5.112) becomes

P′(t) − P(t)² = −Q(t), t ∈ [0, T].   (5.114)

This is a Riccati equation of the type (2.50) investigated earlier. From the uniqueness of the Cauchy problem for (5.112), we deduce that any solution of (5.112) has the form (5.113).
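Equation (5.114) is straightforward to integrate numerically. The sketch below is our own illustration; the terminal condition P(T) = P_T, corresponding to a hypothetical quadratic final cost ϕ(x) = (1/2)(P_T x, x), is an assumption, since the text leaves the side condition unspecified:

import numpy as np
from scipy.integrate import solve_ivp

m = 3                                    # the dimension n - 1 in the text
Q = lambda t: (1.0 + t) * np.eye(m)      # a sample symmetric Q(t)
PT = 0.5 * np.eye(m)                     # assumed terminal value P(T)

def rhs(t, p_flat):
    P = p_flat.reshape(m, m)
    return (P @ P - Q(t)).ravel()        # (5.114): P'(t) = P(t)^2 - Q(t)

T = 1.0
sol = solve_ivp(rhs, (T, 0.0), PT.ravel(), max_step=1e-3)  # integrate backwards
P0 = sol.y[:, -1].reshape(m, m)
print(np.max(np.abs(P0 - P0.T)))         # symmetry is preserved, approximately 0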
Returning to the variational problem (5.103), and recalling that any trajectory of a mechanical system with n − 1 degrees of freedom is an optimal arc for a problem of type (5.103), we deduce from Theorem 5.8 that the Hamilton–Jacobi equation (5.101) represents, together with the Hamiltonian systems (5.21), another way of describing the motions of classical mechanics. There exists, therefore, an intimate relationship between the Hamiltonian system

p′(t) = −(∂H/∂x)( x(t), p(t) ),
x′(t) = (∂H/∂p)( x(t), p(t) ), t ∈ [0, T],   (5.115)

and the Hamilton–Jacobi equation (5.101).
In the case of the one-dimensional motion of n − 1 particles of masses m₁, . . . , m_{n−1}, the Hamiltonian is given by (5.22), that is,

H(x₁, . . . , x_{n−1}; p₁, . . . , p_{n−1}) = Σ_{k=1}^{n−1} (1/(2m_k)) p_k² + V(x₁, . . . , x_{n−1}),   (5.116)

and the corresponding Hamilton–Jacobi equation becomes

W_t(x, t) − Σ_{i=1}^{n−1} (1/(2m_i)) W²_{x_i}(x, t) = V(x), x ∈ R^{n−1}, t ∈ [0, T].   (5.117)
In quantum mechanics, the motion of a particle is defined by its wave function ψ(x, t), which has the following significance: the integral

∫_E |ψ(x, t)|² dx

represents the probability that the particle is located at time t in the region E ⊂ R³.
In classical mechanics, the Hamiltonian of a particle of mass m is given by

H(x, p) = (1/(2m)) p² + V(x), (x, p) ∈ R²,   (5.118)

where p is the momentum of the particle during the motion. In quantum mechanics, the momentum of the particle is represented by the differential operator

p := −iℏ ∂/∂x,

where ℏ is Planck's constant.
Using the analogy with classical mechanics, the quantum Hamiltonian is defined to be the differential operator

H = (1/(2m)) p · p + V(x) = −(ℏ²/(2m)) ∂²/∂x² + V(x),

and, formally, the Hamilton–Jacobi equation becomes

iℏ ψ_t(x, t) + (ℏ²/(2m)) ∂²ψ/∂x² − V(x)ψ = 0,
or, equivalently,

ψ_t(x, t) − (iℏ/(2m)) ψ_xx(x, t) + (i/ℏ) V(x)ψ(x, t) = 0.   (5.119)
Equation (5.119) satisfied by the wave function is called Schrödinger’s equation.
It is the fundamental equation of quantum mechanics and we have included it here to
highlight its similarities with the Hamilton–Jacobi equation in classical mechanics.
Remark 5.3 In general, the Hamilton–Jacobi equation (5.98)–(5.99) does not have
a global C 1 -solution, and the best one can obtain for this problem is a weak or
generalized solution. One such concept of solution for which one has existence
and uniqueness under certain continuity and growth conditions on the Hamiltonian
function H is that of the viscosity solution introduced by M.G. Crandall and P.L. Lions
[8].
Problems
5.1 Determine two independent prime integrals of the differential system
x1′ = x22 , x2′ = x2 x3 , x3′ = −x22 .
(5.120)
Hint. Use Theorem 5.2 and the construction given in its proof.
5.2 The differential system
I1 x1′ = (I2 − I3 )x2 x3 ,
I2 x2′ = (I3 − I1 )x3 x1 ,
I3 x3′ = (I1 − I2 )x1 x2 ,
(5.121)
describes the motion of a rigid body with a fixed point. Find a prime integral of this
system.
Hint. Check equation (5.2) for U(x1 , x2 , x3 ) = I1 x12 + I2 x22 + I3 x32 .
5.3 Find the general solution of the linear, first-order PDE
zx + αzy = 0.
(5.122)
5.4 Find the integral surface of the equation
(x − z)zx + (y − z)zy = 2z,
(5.123)
with the property that it contains the curve
Γ := {x − y = 2, x + z = 1} ⊂ R3 .
Hint. Use the method described in Sect. 5.3.
5.5 Let A be an n × n real matrix and
f : Rn × R → R, ϕ : Rn → R,
be C 1 -functions. Prove that the solution of the Cauchy problem
z_t(x, t) − ( Ax, z_x(x, t) ) = f(x, t), x ∈ Rⁿ, t ∈ R,
z(x, 0) = ϕ(x), x ∈ Rⁿ,   (5.124)

is given by the formula

z(x, t) = ϕ( e^{tA} x ) + ∫_0^t f( e^{(t−s)A} x, s ) ds, ∀(x, t) ∈ Rⁿ × R.   (5.125)
5.6 Let A be an n × n real matrix. Using the successive approximations method, prove the existence and uniqueness of the solution of the Cauchy problem

z_t(x, t) − ( Ax, z_x(x, t) ) = F( x, t, z(x, t) ), (x, t) ∈ Rⁿ × R,
z(x, 0) = ϕ(x),   (5.126)

where ϕ : Rⁿ → R and F : Rⁿ × R × R → R are C¹-functions.

Hint. Using (5.125), we can transform (5.126) into the integral equation

z(x, t) = ϕ( e^{tA} x ) + ∫_0^t F( e^{(t−s)A} x, s, z(e^{(t−s)A} x, s) ) ds,

which can be solved using the successive approximations method.
5.7 Prove that the solution z = z(x₁, . . . , xₙ, t) of the Cauchy problem

z_t + Σ_{i=1}^n a_i z_{x_i} = f(x, t), x = (x₁, . . . , xₙ) ∈ Rⁿ, t ∈ R,   (5.127)
z(x, 0) = ϕ(x), x ∈ Rⁿ,

where the a_i are real constants and f : Rⁿ × R → R, ϕ : Rⁿ → R are C¹-functions, is given by the formula

z(x, t) = ϕ(x − ta) + ∫_0^t f( x − (t − s)a, s ) ds, a := (a₁, . . . , aₙ).

Hint. Use the method of characteristics in Sect. 5.3.
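For readers who want to check the formula symbolically before proving it, here is a sketch (our own, for n = 1, with sample data of our own choosing) that substitutes it into (5.127) with sympy:

import sympy as sp

x, t, s = sp.symbols('x t s')
a = 2                                  # a sample constant coefficient
phi = sp.sin(x)                        # sample initial datum
f = sp.exp(-t) * x                     # sample source term

# the claimed solution z(x, t) = phi(x - t a) + int_0^t f(x - (t - s) a, s) ds
z = phi.subs(x, x - t*a) + sp.integrate(f.subs({x: x - (t - s)*a, t: s}), (s, 0, t))
print(sp.simplify(sp.diff(z, t) + a*sp.diff(z, x) - f))   # 0: (5.127) holds
print(sp.simplify(z.subs(t, 0) - phi))                    # 0: initial condition holds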
5.8 The equation

∂n/∂t + Σ_{i=1}^3 v_i ∂n/∂x_i = ∫_{R³} K(x, θ, v) n(x, θ, t) dθ,
n(x, v, 0) = n₀(x, v), x = (x₁, x₂, x₃), v = (v₁, v₂, v₃),

is called the Boltzmann transport equation and describes the motion of neutrons; here n = n(x, v, t) is the density of neutrons having velocity v at time t and at the point x ∈ R³. The function

K : R³ × R³ × R³ → R

is continuous and absolutely integrable.

Using the result in the previous exercise, prove an existence and uniqueness result for the transport equation by relying on the successive approximations method or Banach's contraction principle (Theorem A.2).
5.9 Prove that, if the function ϕ is C¹ and increasing on R, then the solution of the Cauchy problem

z_x + z z_y = 0, x ≥ 0, y ∈ R,   (5.128)
z(0, y) = ϕ(y), y ∈ R,

exists on the entire half-plane {(x, y) ∈ R²; x ≥ 0} and is given by the formula

z(x, y) = ϕ( (1 + xϕ)^{−1}(y) ),

where (1 + xϕ)^{−1} denotes the inverse of the function t ∈ R → t + xϕ(t) ∈ R.

Generalize this result to equations of type (5.50) and then use it to solve the Cauchy problem

w_x + (1/2)|w_y|² = 0, x ≥ 0, y ∈ R,   (5.129)
w(0, y) = ψ(y), y ∈ R,

where ψ : R → R is a convex C²-function.

Hint. Differentiating (5.129) with respect to y, we obtain an equation of type (5.128).
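A quick numerical sanity check of the stated formula (our own sketch; ϕ = tanh is an arbitrary increasing C¹ profile, not from the text):

import numpy as np
from scipy.optimize import brentq

phi = np.tanh                                  # an increasing C^1 initial profile

def z(x, y):
    # invert t + x*phi(t) = y (a strictly increasing map), then z = phi(t)
    t = brentq(lambda t: t + x * phi(t) - y, -50.0, 50.0)
    return phi(t)

x0, y0, h = 1.0, 0.3, 1e-5                     # sample point and finite-difference step
zx = (z(x0 + h, y0) - z(x0 - h, y0)) / (2 * h)
zy = (z(x0, y0 + h) - z(x0, y0 - h)) / (2 * h)
print(zx + z(x0, y0) * zy)                     # approximately 0: z solves (5.128)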
5.10 The Hamilton–Jacobi equation
ht (x, t) + Q(hx (x, t)) = 0, x ∈ R, t ≥ 0,
(5.130)
was used to model the process of mountain erosion, where h(x, t) is the altitude of
(a section of) the mountain at time t and at location x. Solve equation (5.130) with
the Cauchy condition h(x, 0) = 1 − x 2 in the case Q(u) = u4 .
5.11 Extend Theorem 5.8 to the case when problem (5.103) is replaced by an optimal control problem of the form

inf_u { ∫_0^T L(x(s), u(s), s) ds + ϕ(x(T)) },   (5.131)

where

u ∈ C([0, T]; R^m), x ∈ C¹([0, T]; R^{n−1}),

and

x′(s) = f( s, x(s), u(s) ), x(0) = x₀,

f : [0, T] × R^{n−1} × R^m → R^{n−1} being a C¹-function. Determine the Hamilton–Jacobi equation associated with (5.131) in the special case when m = n − 1 and

L(x, u) = (1/2)( Qx, x ) + (1/2)‖u‖²_e, f(s, x, u) = Ax + u,   (5.132)

where A, Q are (n − 1) × (n − 1) real matrices.

Hint. Theorem 5.8 continues to hold with the Hamiltonian function given by the formula

H(x, p, t) := sup_{u ∈ R^m} { ( p, f(t, x, u) ) − L(x, u, t) }.
In the special case when L and f are defined by (5.132), the Hamilton–Jacobi equation has the form

w_t(x, t) + ( Ax, w_x(x, t) ) − (1/2)‖w_x(x, t)‖²_e = −(1/2)( Qx, x ).   (5.133)

Substituting into (5.133) a function w of the form (5.113), we obtain the Riccati equation (see (5.114))

P′(t) + A*P(t) + P(t)A − P²(t) = −Q, t ∈ [0, T].   (5.134)
Appendix
A.1 Finite-dimensional Normed Spaces
Let Rn be the standard real vector space of dimension n. Its elements are
n-dimensional vectors of the form x = (x1 , . . . , xn ). Often we will represent x
as a column vector. The real numbers x1 , . . . , xn are called the coordinates of the
vector x. The addition of vectors is given by coordinatewise addition, and the multiplication of a vector by a real scalar is defined analogously. The space R1 is identified
with the real line R.
To a real matrix of size n × m (n-rows, m-columns), we associate a linear map
B : Rm → Rn ,
given by the formula x → Bx, where Bx denotes the column vector

Bx := ( Σ_{j=1}^m b_{1j}x_j, Σ_{j=1}^m b_{2j}x_j, . . . , Σ_{j=1}^m b_{nj}x_j )^T,

where b_{ij} denotes the entry of B situated at the intersection of the i-th row with the j-th column, 1 ≤ i ≤ n, 1 ≤ j ≤ m. Conversely, any linear map R^m → Rⁿ has this form.
The adjoint of B is the m × n matrix B ∗ with entries
b∗ji := bi j , ∀1 ≤ i ≤ n, 1 ≤ j ≤ m.
In particular, any n × n real matrix A induces a linear self-map of Rⁿ. The matrix
A is called nonsingular if its determinant is non-zero. In this case, there exists an
inverse matrix, denoted by A−1 and uniquely determined by the requirements
A · A−1 = A−1 · A = 1n ,
where we denoted by 1n the identity n × n matrix.
Definition A.1 A norm on Rⁿ is a real function on Rⁿ, usually denoted by ‖ − ‖, satisfying the following requirements.
(i) ‖x‖ ≥ 0, ∀x ∈ Rⁿ.
(ii) ‖x‖ = 0 if and only if x = 0.
(iii) ‖λx‖ = |λ| ‖x‖, ∀λ ∈ R, x ∈ Rⁿ.
(iv) ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀x, y ∈ Rⁿ.
There exist infinitely many norms on the space Rⁿ. We leave the reader to verify that the following functions on Rⁿ are norms:

‖x‖ := max_{1≤i≤n} |x_i|,   (A.1)

‖x‖₁ := Σ_{i=1}^n |x_i|,   (A.2)

‖x‖_e := ( Σ_{i=1}^n x_i² )^{1/2}.   (A.3)
Any norm on Rⁿ induces a topology on Rⁿ that allows us to define the concept of convergence and the notion of open ball. Thus, the sequence (x_j)_{j≥1} in Rⁿ converges to x in Rⁿ in the norm ‖ − ‖ if

lim_{j→∞} ‖x_j − x‖ = 0.

The open ball of center a ∈ Rⁿ and radius r is the set

{ x ∈ Rⁿ; ‖x − a‖ < r },

and the closed ball of center a and radius r is the set

{ x ∈ Rⁿ; ‖x − a‖ ≤ r }.
We can define the topological notions of closure, open and closed sets, and continuity. Thus, a set D ⊂ Rⁿ is called open if, for any point x^0 ∈ D, there exists an open ball centered at x^0 and contained in D. The set C ⊂ Rⁿ is called closed if, for any convergent sequence of points in C, the limit is also a point in C. The set K ⊂ Rⁿ is called compact if any sequence of points in K contains a subsequence that converges to a point in K. Finally, a set D is called bounded if it is contained in some ball. For a more detailed investigation of the space Rⁿ, we refer the reader to a textbook on real analysis in several variables, e.g., [14].
It is not too hard to see from the above that convergence in any norm is equivalent to coordinatewise convergence. More precisely, this means that, given a sequence (x_j) in Rⁿ, x_j = (x_1^j, . . . , x_n^j), then

lim_{j→∞} x_j = x = (x₁, . . . , xₙ) ⟺ lim_{j→∞} x_k^j = x_k, ∀k = 1, . . . , n.

This fact can be expressed briefly by saying that any two norms on Rⁿ are equivalent. The next result phrases this fact in an equivalent form.
Lemma A.1 Let ‖ − ‖₁ and ‖ − ‖₂ be two arbitrary norms on Rⁿ. Then there exists a constant C > 1 such that

(1/C)‖x‖₂ ≤ ‖x‖₁ ≤ C‖x‖₂.   (A.4)

Thus, the notions of open, closed and compact subsets of Rⁿ are the same for all the norms on Rⁿ.
Given a real n × n matrix A with entries {a_{ij}; 1 ≤ i, j ≤ n}, its norm is the real number

‖A‖ = max_i Σ_{j=1}^n |a_{ij}|.   (A.5)

In the vector space of real n × n matrices (which is a linear space of dimension n² and thus isomorphic to R^{n²}), the map A → ‖A‖ satisfies all the conditions (i)–(iv) in Definition A.1:
(i) ‖A‖ ≥ 0.
(ii) ‖A‖ = 0 if and only if A = 0.
(iii) ‖λA‖ = |λ| ‖A‖, ∀λ ∈ R.
(iv) ‖A + B‖ ≤ ‖A‖ + ‖B‖.
We will say that the sequence of matrices (A_j)_{j≥1} converges to the matrix A as j → ∞, and we will denote this by A = lim_{j→∞} A_j, if

lim_{j→∞} ‖A_j − A‖ = 0.

If we denote by a_{kℓ}^j, 1 ≤ k, ℓ ≤ n, the entries of the matrix A_j, and by a_{kℓ}, 1 ≤ k, ℓ ≤ n, the entries of the matrix A, then

lim_{j→∞} A_j = A ⟺ lim_{j→∞} a_{kℓ}^j = a_{kℓ}, ∀k, ℓ.

Let us point out that, if ‖ − ‖ is the norm on Rⁿ defined by (A.1), then we have the inequality

‖Ax‖ ≤ ‖A‖ ‖x‖, ∀x ∈ Rⁿ.   (A.6)
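As a quick illustration (a sketch of ours, not from the text), the three norms (A.1)–(A.3), the matrix norm (A.5) and the compatibility inequality (A.6) can be evaluated directly:

import numpy as np

x = np.array([1.0, -2.0, 3.0])
print(np.max(np.abs(x)))            # (A.1), the max norm
print(np.sum(np.abs(x)))            # (A.2), the norm ||x||_1
print(np.sqrt(np.sum(x ** 2)))      # (A.3), the Euclidean norm ||x||_e

A = np.array([[1.0, 2.0, 0.0],
              [3.0, -4.0, 1.0],
              [0.0, 1.0, 2.0]])
normA = np.max(np.sum(np.abs(A), axis=1))   # (A.5), the maximum absolute row sum
# (A.6): ||Ax|| <= ||A|| ||x|| in the max norm (A.1)
print(np.max(np.abs(A @ x)) <= normA * np.max(np.abs(x)))   # True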
A.2 Euclidean Spaces and Symmetric Operators
Given two vectors

x = (x₁, . . . , xₙ), y = (y₁, . . . , yₙ) ∈ Rⁿ,

we define their scalar or inner product to be the real number

(x, y) := Σ_{k=1}^n x_k y_k.   (A.7)

We view the scalar product as a function (−, −) : Rⁿ × Rⁿ → R. It is not hard to see that it satisfies the following properties:

(x, y) = (y, x), ∀x, y ∈ Rⁿ,   (A.8)
(x, y + z) = (x, y) + (x, z), (λx, y) = λ(x, y), ∀x, y, z ∈ Rⁿ, λ ∈ R,   (A.9)
(x, x) ≥ 0, ∀x ∈ Rⁿ, (x, x) = 0 ⟺ x = 0.   (A.10)

We observe that the function ‖ − ‖_e defined by

‖x‖_e := (x, x)^{1/2}, ∀x ∈ Rⁿ,   (A.11)

is precisely the norm in (A.3). The vector space Rⁿ equipped with the scalar product (A.7) and the norm (A.11) is called the (real) n-dimensional Euclidean space.
In Euclidean spaces, one can define the concepts of orthogonality, symmetric operators and, in general, one can successfully extend a large part of classical Euclidean
geometry. Here we will limit ourselves to presenting only a few elementary results.
Lemma A.2 Let A be an n × n matrix and A∗ its adjoint. Then
(Ax, y) = (x, A∗ y), ∀x, y ∈ Rn .
(A.12)
The proof of the above result is by direct computation. In particular, we deduce
that, if A is a symmetric matrix, A = A∗ , then we have the equality
(Ax, y) = (x, A y), ∀x, y ∈ Rn .
(A.13)
A real, symmetric n × n matrix P is called positive definite if
(P x, x) > 0, ∀x ∈ Rn \ {0}.
(A.14)
The matrix P is called positive if
(P x, x) ≥ 0, ∀x ∈ Rn .
Lemma A.3 A real, symmetric n × n matrix P is positive definite if and only if there exists a positive constant ω such that

(Px, x) ≥ ω‖x‖², ∀x ∈ Rⁿ.   (A.15)
Proof Obviously, condition (A.15) implies (A.14). To prove the converse, consider the set

M := { x ∈ Rⁿ; ‖x‖ = 1 },

and the function q : M → R, q(x) = (Px, x). The set M is compact and the function q is continuous, and thus there exists an x^0 ∈ M such that

q(x^0) ≤ q(x), ∀x ∈ M.

Condition (A.14) implies q(x^0) > 0. Hence

(Px, x) ≥ q₀ := q(x^0), ∀x ∈ M.

Equivalently, this means that

( P(x/‖x‖), x/‖x‖ ) ≥ q₀, ∀x ∈ Rⁿ \ {0}.   (A.16)

The last inequality combined with the properties of the scalar product implies (A.15) with ω = q₀.
Lemma A.4 (The Cauchy–Schwarz inequality) If P is a real, n × n symmetric and positive matrix, then

(Px, y) ≤ (Px, x)^{1/2} (Py, y)^{1/2}, ∀x, y ∈ Rⁿ.   (A.17)

In particular, for P = 1ₙ, we have

(x, y) ≤ ‖x‖_e ‖y‖_e, ∀x, y ∈ Rⁿ.   (A.18)
Proof Consider the real-valued function

ψ(λ) = ( P(x + λy), x + λy ), λ ∈ R.

From the symmetry of P and the properties of the scalar product, we deduce that, for any x, y ∈ Rⁿ, the function ψ is the quadratic polynomial

ψ(λ) = λ²(Py, y) + 2λ(Px, y) + (Px, x), λ ∈ R.

Since ψ(λ) ≥ 0, ∀λ ∈ R, it follows that the discriminant of the quadratic polynomial ψ is non-positive. This is precisely the content of (A.17).
Lemma A.5 Let P be a real, symmetric, positive n × n matrix. There exists a C > 0 such that

‖Px‖²_e ≤ C (Px, x)^{1/2} ‖x‖_e, ∀x ∈ Rⁿ.   (A.19)

Proof From Lemma A.4, we deduce that

‖Px‖²_e = (Px, Px) ≤ (Px, x)^{1/2} (P²x, Px)^{1/2} ≤ ‖P‖^{3/2} ‖x‖_e (Px, x)^{1/2}.
Lemma A.6 Let x : [a, b] → Rⁿ be a C¹-function and P a real, symmetric n × n matrix. Then

(1/2) d/dt ( Px(t), x(t) ) = ( Px(t), x′(t) ), ∀t ∈ [a, b].   (A.20)
Proof By definition,

d/dt (Px(t), x(t)) = lim_{h→0} (1/h) [ (Px(t + h), x(t + h)) − (Px(t), x(t)) ]
 = lim_{h→0} [ ( P (x(t + h) − x(t))/h, x(t + h) ) + ( Px(t), (x(t + h) − x(t))/h ) ]
 = (Px′(t), x(t)) + (Px(t), x′(t)) = 2(Px(t), x′(t)), ∀t ∈ [a, b].
A.3 The Arzelà Theorem

Let I = [a, b] be a compact interval of the real axis. We will denote by C(I; Rⁿ) or C([a, b]; Rⁿ) the space of continuous functions I → Rⁿ. The space C(I; Rⁿ) has a natural vector space structure with respect to the natural operations on functions. It is equipped with the uniform norm

‖x‖_u := sup_{t∈I} ‖x(t)‖.   (A.21)
It is not hard to see that the convergence in norm (A.21) of a sequence of functions
{x j } ⊂ C(I ; Rn ) is equivalent to the uniform convergence of this sequence on the
interval I .
Definition A.2 A set M ⊂ C(I ; Rn ) is called bounded if there exists a constant
M > 0 such that
(A.22)
xu ≤ M, ∀x ∈ M.
The set M is called uniformly equicontinuous if
∀ε > 0, ∃δ = δ(ε) > 0 : x(t) − x(s) ≤ ε, ∀x ∈ M,
∀t, s ∈ I, |t − s| ≤ δ.
(A.23)
Our next theorem, due to C. Arzelà (1847–1912), is a compactness result in the space
C(I, Rn ) similar to the well-known Bolzano–Weierstrass theorem.
Theorem A.1 Suppose that M ⊂ C(I, Rn ) is bounded and uniformly equicontinuous. Then, any sequence of functions in M contains a subsequence that is uniformly convergent on I .
Proof The set Q ∩ I of rational numbers in the interval I is countable and thus we can describe it as consisting of the terms of a sequence (r_k)_{k≥1}. Consider the set

M₁ := { x(r₁); x ∈ M } ⊂ Rⁿ.

The set M₁ is obviously bounded and thus, according to the Bolzano–Weierstrass theorem, it admits a convergent subsequence

x_1^1(r₁), x_1^2(r₁), . . . , x_1^m(r₁), . . . .

Next, consider the set

M₂ := { x_1^1(r₂), x_1^2(r₂), . . . , x_1^m(r₂), . . . } ⊂ Rⁿ.

It is bounded, and we deduce again that it contains a convergent subsequence {x_2^j(r₂)}_{j≥1}. Iterating this procedure, we obtain an infinite array
x_1^1  x_1^2  · · ·  x_1^m  · · ·
x_2^1  x_2^2  · · ·  x_2^m  · · ·
· · ·  · · ·  · · ·  · · ·  · · ·   (A.24)
x_m^1  x_m^2  · · ·  x_m^m  · · ·
· · ·  · · ·  · · ·  · · ·  · · ·

Every row of this array is a subsequence of the row immediately above it, and the sequence of functions on the m-th row converges on the finite set {r₁, . . . , r_m}. We deduce that the diagonal sequence {x_m^m}_{m≥1} converges on Q ∩ I. We will prove that this sequence converges uniformly on I.
The uniform equicontinuity condition shows that, for any ε > 0, there exists a δ(ε) > 0 such that

‖x_m^m(t) − x_m^m(s)‖ ≤ ε, ∀|t − s| ≤ δ(ε), ∀m.   (A.25)

Since Q is dense in R, we deduce that there exists an N = N(ε) with the property that any t ∈ I is within δ(ε) of at least one of the points r₁, . . . , r_{N(ε)},

min_{1≤i≤N(ε)} |t − r_i| ≤ δ(ε).

Inequalities (A.25) show that, for arbitrary k, ℓ and any t ∈ I, we have

‖x_k^k(t) − x_ℓ^ℓ(t)‖ ≤ min_{1≤i≤N(ε)} ( ‖x_k^k(t) − x_k^k(r_i)‖ + ‖x_k^k(r_i) − x_ℓ^ℓ(r_i)‖ + ‖x_ℓ^ℓ(r_i) − x_ℓ^ℓ(t)‖ )
 ≤ 2ε + max_{1≤i≤N(ε)} ‖x_k^k(r_i) − x_ℓ^ℓ(r_i)‖.

The sequence (x_k^k(r_i))_{k≥1} converges for any 1 ≤ i ≤ N(ε). It is thus Cauchy for each such i. In particular, we can find K = K(ε) > 0 such that

max_{1≤i≤N(ε)} ‖x_k^k(r_i) − x_ℓ^ℓ(r_i)‖ ≤ ε, ∀k, ℓ ≥ K(ε).

Hence

‖x_k^k(t) − x_ℓ^ℓ(t)‖ ≤ 3ε, ∀t ∈ I, k, ℓ ≥ K(ε).   (A.26)

The above inequalities show that the sequence of functions (x_k^k)_{k≥1} satisfies the conditions in Cauchy's criterion of uniform convergence, and thus it is uniformly convergent on I.
Remark A.1 Consider a set M ⊂ C(I; Rⁿ) consisting of differentiable functions such that there exists a C > 0 with the property

‖x′(t)‖ ≤ C, ∀t ∈ I, ∀x ∈ M.

Then M is uniformly equicontinuous. Indeed, for any x ∈ M and any t, s ∈ I, s < t,

‖x(t) − x(s)‖ ≤ ∫_s^t ‖x′(τ)‖ dτ ≤ C|t − s|.
A.1 Does the family of functions {sin nt; n = 1, 2, . . . } satisfy the assumptions
of Arzelà’s theorem on [0, π]?
A.4 The Contraction Principle

Suppose that X is a set and d : X × X → [0, ∞) is a nonnegative function on the Cartesian product X × X. We say that d defines a metric on X if it satisfies the following conditions:

d(x, y) ≥ 0, ∀x, y ∈ X,   (A.27)
d(x, y) = 0 ⟺ x = y,   (A.28)
d(x, y) = d(y, x), ∀x, y ∈ X,   (A.29)
d(x, z) ≤ d(x, y) + d(y, z), ∀x, y, z ∈ X.   (A.30)
A set equipped with a metric is called a metric space.
A metric space (X, d) is equipped with a natural topology. Indeed, every point
x0 ∈ X admits a system of neighborhoods consisting of sets of the form S(x0 , r ),
where S(x₀, r) is the open ball of radius r centered at x₀, that is,

S(x₀, r) := { x ∈ X; d(x, x₀) < r }.   (A.31)
In particular, we can define the concept of convergence. We say that the sequence
{xn }n≥1 ⊂ X converges to x ∈ X as n → ∞ if
lim d(xn , x) = 0.
n→∞
The sequence {xn }n≥1 is called fundamental if for any ε > 0 there exists an N (ε) > 0
such that
d(xn , xm ) ≤ ε, ∀m, n ≥ N (ε).
It is not hard to see that any convergent sequence is fundamental. The converse is
not necessarily true.
Definition A.3 A metric space (X, d) is called complete if any fundamental sequence
in X is convergent.
Example A.1 (i) The space X = Rⁿ with the metric d(x, y) = ‖x − y‖ is complete.
(ii) For any real number α, the space X = C([a, b]; Rⁿ) equipped with the metric

d(x, y) = sup_{t∈[a,b]} ‖x(t) − y(t)‖ e^{αt}, x, y ∈ X,   (A.32)

is complete. Let us observe that convergence in the metric (A.32) is equivalent to uniform convergence on the compact interval [a, b].
Definition A.4 Let (X, d) be a metric space. A mapping Γ : X → X is called a contraction if there exists a ρ ∈ (0, 1) such that

d(Γx, Γy) ≤ ρ d(x, y), ∀x, y ∈ X.

An element x₀ ∈ X is called a fixed point of Γ if Γx₀ = x₀.
The next theorem is known as the contraction principle or Banach’s fixed point
theorem (S. Banach (1892–1945)).
Theorem A.2 If (X, d) is a complete metric space and Γ : X → X is a contraction
on X , then Γ admits a unique fixed point.
Proof Fix x₁ ∈ X and consider the sequence of successive approximations

x_{n+1} = Γx_n, n = 1, 2, . . . .   (A.33)

Since Γ is a contraction, we deduce that

d(x_{n+1}, x_n) = d(Γx_n, Γx_{n−1}) ≤ ρ d(x_n, x_{n−1}) ≤ · · · ≤ ρ^{n−1} d(x₂, x₁).

Using the triangle inequality (A.30) iteratively, we deduce that

d(x_{n+p}, x_n) ≤ Σ_{j=n}^{n+p−1} d(x_{j+1}, x_j) ≤ ( Σ_{j=n}^{n+p−1} ρ^{j−1} ) d(x₂, x₁),   (A.34)

for any positive integers n, p. Since ρ ∈ (0, 1), the geometric series Σ_{j=0}^∞ ρ^j is convergent, and (A.34) implies that the sequence {x_n}_{n≥1} is fundamental. The space X is complete, and thus this sequence converges as n → ∞ to some point x_∞ ∈ X. Since Γ is a contraction, we deduce that

d(Γx_n, Γx_∞) ≤ ρ d(x_n, x_∞), ∀n ≥ 1.

Letting n → ∞ in the above inequality, we have lim_{n→∞} Γx_n = Γx_∞. Letting n → ∞ in (A.33), we deduce that x_∞ = Γx_∞. Thus, x_∞ is a fixed point of Γ.
The uniqueness of the fixed point follows from the contraction property. Indeed, if x₀, y₀ are two fixed points, then

d(x₀, y₀) = d(Γx₀, Γy₀) ≤ ρ d(x₀, y₀).

Since ρ ∈ (0, 1), we conclude that d(x₀, y₀) = 0, and thus x₀ = y₀.
Example A.2 As an application of the contraction principle, we present an alternative proof of the existence and uniqueness result in Theorem 2.4. We will make the same assumptions and use the same notations as in Theorem 2.4. We consider the set

X := { x ∈ C(I; Rⁿ); ‖x(t) − x₀‖ ≤ b }, I = [t₀ − δ, t₀ + δ],   (A.35)

equipped with the metric

d(x, y) = sup_{t∈I} ‖x(t) − y(t)‖ e^{−2Lt},   (A.36)

where L is the Lipschitz constant of f. On X, we define the operator

(Γx)(t) := x₀ + ∫_{t₀}^t f(s, x(s)) ds, t ∈ I.   (A.37)

Observe that (2.15) implies that

‖(Γx)(t) − x₀‖ ≤ b, ∀t ∈ I,

that is, Γ maps the space X into itself. On the other hand, an elementary computation based on the inequality (2.14) leads to

‖(Γx)(t) − (Γy)(t)‖ ≤ L | ∫_{t₀}^t ‖x(s) − y(s)‖ ds |,

and, using (A.36), we deduce

d(Γx, Γy) ≤ (1/2) d(x, y), ∀x, y ∈ X.

From Theorem A.2, it follows that there exists a unique x ∈ X such that Γx = x. In other words, the integral equation

x(t) = x₀ + ∫_{t₀}^t f(s, x(s)) ds, t ∈ I,

has a unique solution in X.
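The successive approximations in this proof are perfectly computable. The sketch below (our own illustration, not from the text) runs the Picard iteration x_{n+1}(t) = x₀ + ∫_0^t f(s, x_n(s)) ds for the scalar problem x′ = x, x(0) = 1, whose fixed point is e^t:

import numpy as np

t = np.linspace(0.0, 1.0, 2001)
h = t[1] - t[0]
x = np.ones_like(t)                      # first guess: the constant x_1(t) = x0 = 1

def cumtrapz(y, h):
    # cumulative trapezoidal integral, with value 0 at t = 0
    return np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) / 2.0) * h))

for _ in range(30):                      # x_{n+1}(t) = 1 + int_0^t x_n(s) ds
    x = 1.0 + cumtrapz(x, h)

print(np.max(np.abs(x - np.exp(t))))     # approximately 0, up to quadrature error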
A.5 Differentiable Functions and the Implicit Function Theorem
Let f : Ω → Rm be a function defined on an open subset Ω ⊂ Rn and valued
in Rm . There will be no danger of confusion if we denote by the same symbol the
norms in Rn and Rm .
If x^0 ∈ Ω, we say that lim_{x→x^0} f(x) = u if

lim_{x→x^0} ‖f(x) − u‖ = 0.

The function f is called continuous at x^0 if lim_{x→x^0} f(x) = f(x^0).

The function f is called differentiable at x^0 if there exists a linear map Rⁿ → R^m, denoted by f′(x^0), f_x(x^0), or (∂f/∂x)(x^0), and called the derivative of f at x^0, such that

lim_{x→x^0} ‖f(x) − f(x^0) − f′(x^0)(x − x^0)‖ / ‖x − x^0‖ = 0.   (A.38)
Lemma A.1 shows that the above definition is independent of the norms on the spaces
Rn and Rm . When m = 1, so that f is scalar-valued, we set
grad f := f ′ ,
and we will refer to this derivative as the gradient of f .
The function f : Ω → R^m is said to be continuously differentiable, or C¹, if it is differentiable at every point x ∈ Ω and the resulting map

x → f′(x) ∈ Hom(Rⁿ, R^m) := the vector space of linear operators Rⁿ → R^m
is continuous. More precisely, if f is represented as a column vector

f(x) = ( f₁(x), . . . , f_m(x) )^T,

then the continuity and differentiability of f are, respectively, equivalent to the continuity and differentiability of each of the components f_i(x), i = 1, . . . , m. Moreover, the derivative f′(x) is none other than the Jacobian matrix of the system of functions f₁, . . . , f_m, that is,

f′(x) = ( ∂f_i/∂x_j (x) )_{1≤i≤m, 1≤j≤n}.   (A.39)
Theorem A.3 (The implicit function theorem) Suppose that U ⊂ R^m and V ⊂ Rⁿ are two open sets and F : U × V → R^m is a C¹-mapping. Assume, additionally, that there exists a point (x^0, y^0) ∈ U × V such that

F(x^0, y^0) = 0, det F_x(x^0, y^0) ≠ 0.   (A.40)

Then there exist an open neighborhood Ω of y^0 in V, an open neighborhood O of x^0 in U, and a continuous function f : Ω → O such that f(y^0) = x^0 and

F(x, y) = 0, (x, y) ∈ O × Ω ⟺ x = f(y).   (A.41)

Moreover,

det F_x( f(y), y ) ≠ 0, ∀y ∈ Ω,

the function f is C¹ on Ω, and

f′(y) = −F_x( f(y), y )^{−1} F_y( f(y), y ), ∀y ∈ Ω.   (A.42)
Proof We denote by ‖ − ‖_m and ‖ − ‖_n two norms in R^m and Rⁿ, respectively. Without loss of generality, we can assume that
• x^0 = 0 and y^0 = 0;
• the set U is an open ball of radius r_U < 1 in R^m centered at x^0, and the set V is an open ball of radius r_V < 1 in Rⁿ centered at y^0;
• for any (x, y) ∈ U × V, the partial derivative F_x(x, y) is invertible.

We have the equality

F(x, y) = F_x(0, 0)x + F_y(0, 0)y + R(x, y), ∀(x, y) ∈ U × V,   (A.43)

where F_x and F_y are the partial derivatives of F with respect to x and y, respectively. The function R(x, y) is obviously C¹ on U × V and, additionally, we have

R(0, 0) = 0, R_x(0, 0) = 0, R_y(0, 0) = 0.   (A.44)
Since R′ = (R_x, R_y) is continuous on U × V, we deduce that for any ε > 0 there exists a δ(ε) > 0 such that

‖R′(x, y)‖ ≤ ε, ∀‖x‖_m + ‖y‖_n ≤ δ(ε).

Taking into account definition (A.38) of the derivative R′, we deduce that, for any (x_i, y_i) ∈ U × V, i = 1, 2, such that

‖x_i‖_m + ‖y_i‖_n ≤ δ(ε), i = 1, 2,

we have

‖R(x₁, y₁) − R(x₂, y₂)‖_m ≤ ε( ‖x₁ − x₂‖_m + ‖y₁ − y₂‖_n ).   (A.45)
Let G : U × V → R^m be the function

G(x, y) = −A F_y(0, 0)y − A R(x, y), A := F_x(0, 0)^{−1}.   (A.46)

The equation F(x, y) = 0 we are interested in can be rewritten as the fixed point problem

x = G(x, y).

From (A.45) and the equality G(0, 0) = 0, we deduce that

‖G(x, y)‖_m ≤ ‖A‖ ‖F_y(0, 0)‖ ‖y‖_n + ε‖A‖( ‖x‖_m + ‖y‖_n )   (A.47)

and

‖G(x₁, y) − G(x₂, y)‖_m ≤ ε‖A‖ ‖x₁ − x₂‖_m,   (A.48)
for ‖x‖_m + ‖y‖_n ≤ δ(ε) and ‖x_i‖_m + ‖y‖_n ≤ δ(ε), i = 1, 2. We set

ε := (1/2) min( r_U, ‖A‖^{−1} ), η := min( δ(ε), r_V ) / ( 2( ‖F_y(0, 0)‖ ε^{−1} + 1 ) ),   (A.49)

and we consider the open balls

O := { x ∈ U; ‖x‖_m < δ(ε) }, Ω := { y ∈ V; ‖y‖_n < η }.   (A.50)

We denote their closures by Ō and Ω̄, respectively. From (A.47) and (A.48), we deduce that, for any y ∈ Ω̄, we have

G(x, y) ∈ Ō, ∀x ∈ Ō,   (A.51)

‖G(x₁, y) − G(x₂, y)‖_m ≤ (1/2) ‖x₁ − x₂‖_m, ∀x₁, x₂ ∈ Ō.   (A.52)
This shows that, for any y ∈ Ω̄, the map

T_y : Ō → Ō, x → T_y x := G(x, y),

is a contraction. It has a unique fixed point, which we denote by f(y). Note that a point (x, y) ∈ O × Ω is a solution of the equation F(x, y) = 0 if and only if x = f(y).
Let us observe that the map f : Ω → O is Lipschitz continuous. Indeed, if y₁, y₂ ∈ Ω, then, using (A.45) and ε‖A‖ ≤ 1/2,

‖f(y₁) − f(y₂)‖_m = ‖G(f(y₁), y₁) − G(f(y₂), y₂)‖_m
 ≤ ‖A‖ ‖F_y(0, 0)‖ ‖y₁ − y₂‖_n + ε‖A‖( ‖f(y₁) − f(y₂)‖_m + ‖y₁ − y₂‖_n )
 ≤ ‖A‖( ε + ‖F_y(0, 0)‖ ) ‖y₁ − y₂‖_n + (1/2) ‖f(y₁) − f(y₂)‖_m,

so that

‖f(y₁) − f(y₂)‖_m ≤ 2‖A‖( ε + ‖F_y(0, 0)‖ ) ‖y₁ − y₂‖_n.
To prove that the function f is differentiable on Ω, we use the defining property of f(y),

F( f(y), y ) = 0.

Since f is Lipschitz continuous, we deduce that for y^0 ∈ Ω we have

0 = F( f(y^0 + h), y^0 + h ) − F( f(y^0), y^0 )
 = F_x( f(y^0), y^0 )( f(y^0 + h) − f(y^0) ) + F_y( f(y^0), y^0 )h + o(‖h‖_n).

Hence,

f(y^0 + h) − f(y^0) = −F_x( f(y^0), y^0 )^{−1} F_y( f(y^0), y^0 )h + F_x( f(y^0), y^0 )^{−1} o(‖h‖_n).

This shows that f is differentiable at y^0 and its derivative is given by (A.42).
Corollary A.1 Let g : U → R^m be a C¹-function defined on the open subset U ⊂ R^m. We assume that there exists an x^0 ∈ U such that

det g′(x^0) ≠ 0.   (A.53)

Then there exists an open neighborhood O of x^0 in U such that g maps O bijectively onto a neighborhood Ω of g(x^0). Moreover, the inverse map g^{−1} : Ω → O is also C¹. In other words, g induces a C¹-diffeomorphism O → Ω.
Proof Apply the implicit function theorem to the function F(x, y) = y − g(x).
Remark A.2 In more concrete terms, Theorem A.3 states that the solutions of the (underdetermined) system

F₁(x₁, . . . , x_m; y₁, . . . , yₙ) = 0,
. . .   (A.54)
F_m(x₁, . . . , x_m; y₁, . . . , yₙ) = 0,

form a family with n independent parameters y₁, . . . , yₙ, while the remaining unknowns x₁, . . . , x_m can be described as differentiable functions of the parameters y,

x_i = f_i(y₁, . . . , yₙ), i = 1, . . . , m,   (A.55)

in a neighborhood of a point

(x*, y*) = (x₁*, . . . , x_m*; y₁*, . . . , yₙ*)

such that

F_i(x*, y*) = 0, ∀i = 1, . . . , m,   (A.56)

det [ D(F₁, . . . , F_m)/D(x₁, . . . , x_m) ](x*, y*) ≠ 0.   (A.57)

To use the classical terminology, we say that the functions f₁, . . . , f_m are implicitly defined by system (A.54).
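A concrete one-dimensional instance (our own sketch; the particular F below is a hypothetical example, not from the text) shows formula (A.42) in action:

import numpy as np
from scipy.optimize import brentq

# F(x, y) = x^3 + x - y; F_x = 3x^2 + 1 > 0 everywhere, so x = f(y) is globally defined
def f(y):
    return brentq(lambda x: x**3 + x - y, -10.0, 10.0)

y0, h = 2.0, 1e-6
fd = (f(y0 + h) - f(y0 - h)) / (2 * h)   # finite-difference derivative of f
x0 = f(y0)                               # here x0 = 1, since 1 + 1 = 2
formula = -(-1.0) / (3 * x0**2 + 1)      # (A.42): f'(y) = -F_x^{-1} F_y, with F_y = -1
print(fd, formula)                       # both approximately 0.25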
A.6 Convex Sets and Functions
A subset K of the space Rn is called convex if, for any two points x, y ∈ K , the line
segment they determine is also contained in K . In other words, if x, y ∈ K , then
t x + (1 − t) y ∈ K , ∀t ∈ [0, 1].
(A.58)
A function f : Rn → (−∞, ∞] is called convex if
f t x + (1 − t) y ≤ t f (x) + (1 − t) f ( y), ∀t ∈ [0, 1], x, y ∈ Rn .
(A.59)
The Legendre transform associates to a convex function f : Rⁿ → (−∞, ∞] the function f* : Rⁿ → (−∞, ∞] given by

f*(p) = sup { (p, x) − f(x); x ∈ Rⁿ }.   (A.60)

The function f* is called the conjugate of f. In (A.60), (−, −) denotes the canonical scalar product on Rⁿ.
We mention without proof the following result.
Proposition A.1 A convex function f : Rn → (−∞, ∞] that is everywhere finite
is continuous.
Let us assume that the function f : Rⁿ → (−∞, ∞] satisfies the growth condition

lim_{‖x‖→∞} f(x)/‖x‖ = ∞.   (A.61)
Proposition A.2 If the convex function f : Rⁿ → (−∞, ∞) satisfies the growth condition (A.61), then the conjugate function f* is convex and everywhere finite. If, additionally, the functions f and f* are both C¹ on Rⁿ, then we have the equality

f*(p) + f( f*_p(p) ) = ( p, f*_p(p) ), ∀p ∈ Rⁿ.   (A.62)
Proof By Proposition A.1, the function f is continuous. If f satisfies (A.61), then we deduce from (A.60) that

−∞ < f*(p) < ∞, ∀p ∈ Rⁿ.

On the other hand, f*(p) is convex, since it is the supremum of a family of convex (indeed, affine) functions of p. Since it is everywhere finite, it is continuous by Proposition A.1.

Let us now observe that for any p ∈ Rⁿ there exists an x* ∈ Rⁿ such that

f*(p) = (p, x*) − f(x*).   (A.63)

Indeed, there exists a sequence {x_ν}_{ν≥1} ⊂ Rⁿ such that

(p, x_ν) − f(x_ν) ≤ f*(p) ≤ (p, x_ν) − f(x_ν) + 1/ν, ∀ν = 1, 2, . . . .   (A.64)

The growth condition implies that the sequence {x_ν}_{ν≥1} is bounded. The Bolzano–Weierstrass theorem implies that this sequence contains a subsequence x_{ν_k} that converges to some x* ∈ Rⁿ. Passing to the limit along this subsequence in (A.64), we deduce (A.63).
On the other hand, from (A.60) we deduce that

f(x*) ≥ (x*, q) − f*(q), ∀q ∈ Rⁿ.

The last inequality, together with (A.63), shows that

f(x*) = sup { (x*, q) − f*(q); q ∈ Rⁿ } = (x*, p) − f*(p).

In other words, p is a maximum point for the function Φ(q) = (x*, q) − f*(q) and, according to Fermat's theorem, we must have Φ_q(p) = 0, that is,

x* = f*_p(p).

Equality (A.62) now follows by invoking (A.63).
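Numerically, the conjugate is simply a maximization over a grid. The sketch below is our own example, with f(x) = x⁴/4, whose conjugate is f*(p) = (3/4)|p|^{4/3}; it also checks the identity (A.62):

import numpy as np

xs = np.linspace(-10.0, 10.0, 200001)
fv = xs**4 / 4.0

def conj(p):
    return np.max(p * xs - fv)                  # (A.60) evaluated on a grid

ps = np.linspace(-3.0, 3.0, 13)
err = max(abs(conj(p) - 0.75 * abs(p)**(4.0/3.0)) for p in ps)
print(err)                                      # approximately 0

# check (A.62) at p = 2: f*_p(p) = p^{1/3}, so f*(p) + f(p^{1/3}) = p * p^{1/3}
p = 2.0
xstar = p**(1.0 / 3.0)
print(conj(p) + xstar**4 / 4.0 - p * xstar)     # approximately 0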
References

1. Arnold, V.I.: Ordinary Differential Equations. Universitext. Springer-Verlag, Berlin (1992)
2. Barbu, V.: Nonlinear Semigroups and Differential Equations in Banach Spaces. Noordhoff, Leyden (1976)
3. Brauer, F., Nohel, J.A.: Qualitative Theory of Ordinary Differential Equations. W.A. Benjamin, New York, Amsterdam (1969)
4. Braun, M.: Differential Equations and Their Applications, 4th edn. Springer, New York (1993)
5. McClamroch, N.H.: State Models of Dynamical Systems. Springer, New York, Heidelberg, Berlin (1980)
6. Corduneanu, C.: Principles of Differential and Integral Equations. Chelsea Publishing Company, The Bronx, New York (1977)
7. Courant, R.: Partial Differential Equations. John Wiley & Sons, New York, London (1962)
8. Crandall, M.G., Lions, P.L.: Viscosity solutions of Hamilton-Jacobi equations. Trans. Amer. Math. Soc. 277, 1–42 (1983)
9. Halanay, A.: Differential Equations: Stability, Oscillations, Time Lags. Academic Press, New York (1966)
10. Gantmacher, F.R.: Matrix Theory, vol. 1–2. AMS Chelsea Publishing, American Mathematical Society (1987)
11. Hale, J.: Ordinary Differential Equations. Wiley-Interscience, New York, London, Sydney, Toronto (1969)
12. Kolmogorov, A.N., Fomin, S.V.: Introductory Real Analysis. Dover, New York (1975)
13. Landau, L.D., Lifshitz, E.M.: Mechanics. Course of Theoretical Physics, vol. 1. Pergamon Press, Oxford (1960)
14. Lang, S.: Undergraduate Analysis, 2nd edn. Springer, New York (2005)
15. LaSalle, J.P., Lefschetz, S.: Stability by Lyapunov's Direct Method with Applications. Academic Press, New York (1961)
16. Lee, E.B., Markus, L.: Foundations of Optimal Control Theory. John Wiley & Sons, New York (1967)
17. Pontryagin, L.S.: Ordinary Differential Equations. Addison-Wesley, Reading (1962)
18. Vrabie, I.: Differential Equations: An Introduction to Basic Concepts, Results and Applications. World Scientific, New Jersey, London, Singapore (2011)
19. Whitham, G.B.: Lectures on Wave Propagation. Springer, Tata Institute of Fundamental Research, Berlin, Heidelberg, New York (1979)
Index

Symbols
‖ − ‖, ‖ − ‖e , 50
A
A.e., 107
Action functional, 191
B
Ball
closed, 202
open, 202
Blowup, 46
Burgers’ equation, 19
C
C ∞ (I ), 106
C0∞ (I ), 106
Carathéodory solution, see ODE
Cauchy
condition, 6, 47
problem, 2, 3, 29, 35, 84, 102
Centrifugal governor, 155
Characteristic curves, 170
Characteristic polynomial, 89
Characteristic submanifold, 174
Characteristics equation, 170, 182
Condition
Cauchy, 2, 6
dissipativity, 51
Lipschitz, 30, 32
Contraction, 209
continuous semigroup, 54
principle, 210
Control, 143
optimal, 147
system
automatic, 144
input, 143
linear, 143
observed, 144
output, 144
state, 143
Convex
function, 216
conjugate of, 217
set, 60, 68, 216
D
D’Alembert’s principle, 18
Difference equations, 40
Differential inclusion, 58
Distributed processes, 4
Distribution, 106
Dirac, 107
E
Envelope, 11
Epidemic models, 14
Equation
Boltzmann, 197
conservation law, 176
Duffing, 168
eikonal, 189
Euler–Lagrange, 166
Hamilton–Jacobi, 190, 198
Hamiltonian, 166
harmonic oscillator, 14, 94
Klein–Gordon, 27
Korteweg–de Vries, 18
Liénard, 156
Newton, 167
of variation, 102
Riccati, 9, 47, 49, 194, 199
Schrödinger, 16, 195
Van der Pol, 156
Euclidean space, 204
Euler scheme, 40
Extendible, see ODE
F
Feedback controller, 144
Feedback synthesis, 144
Fixed point, 210
Flow
local, 57
Formula
Lie, 120
residue, 100, 101
Taylor, 91
variation of constants, 85, 115, 131
Function
coercive, 137
generalized, 106
Heaviside, 109
Lipschitz, 30, 32, 35, 39
locally Lipschitz, 40, 57, 102, 130
Lyapunov, 134, 138
negative definite, 134
positive definite, 134, 155, 157
smooth, 106
support of, 106
G
γ(x 0 ), 148
General solution, 2, 6, 8, 10
Globally asymptotically stable, see ODE
H
ℏ, 17
Hamiltonian
generalized, 166
Harmonic oscillator, 14, 94
I
Inequality
Schwarz, 205
Initial
condition, 2
values, 2
Input, 143
Integral curve, 2
L
Lagrangian, 166
Legendre transform, 166
Lemma
Bihari, 23
Gronwall, 22, 31, 48, 55, 64, 65, 67, 80,
103, 131
Lurie–Postnikov problem, 146
Lyapunov function, see function
M
Matrix, 201
adjoint of, 201
fundamental, 81, 127
Hurwitzian, 129, 140, 146, 156
Jacobian, 162
nonnegative definite, 49
nonsingular, 202
positive, 205
positive definite, 49, 205
Metric, 209
space, 209
complete, 209
N
Normal cone, 60
Normal form, 2, 3
O
ODE, 1
autonomous, 51
Bernoulli, 8
Clairaut, 10
singular solution, 11
delay-differential, 4
dissipative, 51
Euler, 120
exact, 8
general solution, 2
globally asymptotically stable, 127
Hamiltonian, 166
higher order, 3, 86
homogeneous, 7
Lagrange, 9
Liénard, 156
linear, 7, 86
constant coefficients, 89
Lotka–Volterra system, 13, 155
multivalued, 58
normal form, 2
order of an, 3
predator-prey, 13
Riccati, 9, 27, 47, 105, 116
separable, 5, 13
solution, 1
asymptotically stable, 124
Carathéodory, 59
extendible, 42
power series, 88
stable, 124, 125
uniformly asymptotically stable, 124
uniformly stable, 124
system of linear, 79
constant coefficients, 96
fundamental matrix, 81, 86
homogeneous, 79, 81
nonhomogeneous, 85
Van der Pol, 12, 156
One-parameter group, 97
generator, 97
local, 57
Orbit, 148
Output, 144
P
PDE, 1
Cauchy problem, 172
first-order quasilinear, 170
Pendulum, 15
Point
critical, 162
regular, 162
Potential, 16
Prime integral, 161, 168
Q
Quasipolynomial, 93
R
Radioactive decay, 12
Rankine–Hugoniot relation, 181
Real analytic, 88
S
2^S , 58
Sensitivity
functions, 105
matrix, 105
Sequence
convergent, 209
fundamental, 209
Set
bounded, 203, 207
closed, 203
compact, 203
open, 203
uniformly equicontinuous, 207
Small parameter, 105
Soliton, 18
Stable solution, see ODE
Sturm–Liouville problem, 117
System
Hamiltonian, 166
Lotka–Volterra, 13, 155
System of linear ODE, see ODE
T
Theorem
Arzelà, 37, 65, 207
Banach fixed point, 210
Bolzano–Weierstrass, 40, 207, 217
Cauchy, 98
Cayley, 99
continuous dependence
on initial data, 54
on parameters, 58
existence and uniqueness, 3, 47, 52, 124,
183
global uniqueness, 41
implicit function, 9, 213
inverse function, 163
Lebesgue, 59
Liouville, 83, 84, 87, 115
local existence and uniqueness, 32, 41,
57, 80
Lyapunov, 139
Lyapunov stability, 134
Lyapunov–Poincaré, 131
mean value, 103
on the stability of linear systems, 127
Peano, 36, 52
Rolle, 116
Sturm, 116
Sturm comparison, 116
Trajectory, 3, 148
Transition matrix, 86
Traveling wave, 18
V
Variational inequalities, 59
Vector
coordinates, 201
Vector field, 15, 57
central, 16
conservative, 15
potential, 16
Volterra
integro-differential equations, 4
W
Wave equation, 18
Weak solution, 179
Wronskian, 82, 87