
Lecture Notes

Applied Differential Equations

Thomas Götz

Version: December 21, 2022


Contents

1 Introduction
  1.1 Modeling with Differential Equations
  1.2 Basic Notations
  1.3 The Directional Field of an ODE
  1.4 Separation of Variables
  1.5 Linear First Order ODEs

2 Existence and Uniqueness
  2.1 Some Tools from Functional Analysis
  2.2 Banach Fixpoint Theorem
  2.3 Picard–Lindelöf Theorem
  2.4 Peano's Theorem
  2.5 Upper– and Lower Functions

3 Linear Differential Equations
  3.1 Homogeneous Linear ODE Systems
  3.2 Inhomogeneous Systems
  3.3 The Matrix–Exponential Function
  3.4 Higher Order Differential Equations

4 Numerical Methods
  4.1 Convergence Theory
  4.2 Runge–Kutta Methods
  4.3 Extrapolation Methods
  4.4 Adaptive Step Size Selection

5 Stability
  5.1 Continuous Dependence on Initial Data
  5.2 Asymptotic Behavior of Solutions
  5.3 Lyapunov Functions

6 Partial Differential Equations
  6.1 Definition of PDEs
  6.2 Method of Characteristics

A Preliminaries
  A.1 Eigenvalues and Eigenvectors
  A.2 Taylor's Theorem
Bibliography

[Wal98] Walter, Ordinary Differential Equations, Springer 1998, doi: 10.1007/978-1-4612-0601-9.

[DB02] Deuflhard, Bornemann, Scientific Computing with Ordinary Differential Equations, Springer 2002, doi: 10.1007/978-0-387-21582-2.

[DH03] Deuflhard, Hohmann, Numerical Analysis in Modern Scientific Computing, Springer 2003.

[Python] Python, available for download, see www.python.org.

[LL16] Linge, Langtangen, Programming for Computations – Python, Springer 2016, doi: 10.1007/978-3-319-32428-9.

[Sun20] Sundnes, Introduction to Scientific Programming with Python, Springer 2020, doi: 10.1007/978-3-030-50356-7.

The above list is just a small selection of literature and references suitable for self-study, intensive reading or exam preparation. This list does not claim to be complete.

Important Note: These lecture notes represent a summary of the contents of the works mentioned in the bibliography or similar works. The lecture notes are only intended for internal use within the University of Koblenz. The lecture notes do not claim to be free of errors or complete. No liability is assumed for the validity or content of external links. In case of doubt, please consult the works listed in the bibliography.

Chapter 1

Introduction

1.1 Modeling with Differential Equations

Example 1.1 (Discrete population model). We consider the growth of a colony of bacteria. In a first step we consider discrete time, i.e. we look at the population $y_k$ at discrete time points $t_k = t_0 + k \cdot \delta t$. Here, $t_0$ denotes the initial time of observation and $\delta t > 0$ denotes the (fixed) time interval. The number of bacteria observed at time $t_k$ is called $y_k = y(t_k)$.
The population at the next time point $t_{k+1} = t_k + \delta t$ is given by
$$y_{k+1} = y_k + \text{increase}\,.$$
Assuming the increase to be proportional to both the current population $y_k$ and the length of the time interval $\delta t$, we obtain
$$y_{k+1} = y_k + r \cdot \delta t \cdot y_k\,.$$
The constant $r \in \mathbb{R}$ is called the growth rate (in case of $r > 0$); it measures the number of newborns per individual and per unit time step. In case of $r < 0$, the constant is called the death or mortality rate.
Given an initial population $y_0$ at initial time $t_0$, we can predict the population at any later time based on the above recursion or difference equation.

Python Example 1.1 (Simulation of discrete population growth).

from numpy import *
import matplotlib.pyplot as plt

# Parameters
n = 40
T = 1.0
dt = T/n
t = linspace(0, T, n+1)

y = zeros(n+1)
r = 1.2
y0 = 1000

y[0] = y0
for k in range(n):
    y[k+1] = y[k] + r*dt*y[k]

# Plot
plt.figure()
plt.plot(t, y, 'bo-')
plt.xlabel('Time $t$')
plt.ylabel('Population $y$')
plt.grid()
plt.show()

Example 1.2 (Exponential growth). If the time interval $\delta t$ shrinks to zero, we get
$$\frac{y(t+\delta t) - y(t)}{\delta t} = r \cdot y(t) \quad \xrightarrow{\;\delta t \to 0\;} \quad \frac{dy}{dt} = r \cdot y(t)\,,$$
a so–called differential equation
$$y' = ry \tag{1.1}$$
with initial value $y(t_0) = y_0$.


Definition 1.1 (Initial value problem). Let $I \subset \mathbb{R}$ be an interval and $t_0 \in I$. The problem: Find a (continuously diff'able) function $y : I \to \mathbb{R}$, satisfying the (ordinary) differential equation (ODE)
$$\frac{dy}{dt} = r \cdot y(t)$$
for all $t \in I$ and satisfying the initial condition $y(t_0) = y_0$, is called an initial value problem (IVP).

The constant $r \in \mathbb{R}$ determines the behavior of the population model (1.1):

• For $r = 0$ the population remains constant.
• For $r > 0$ the population grows in time.
• For $r < 0$ the population decreases in time.

An educated guess helps finding a solution to Eqn. (1.1). We use the ansatz $y(t) = c \cdot e^{\alpha t}$ with two yet undetermined real constants $c$ and $\alpha$. Plugging this ansatz into the ODE yields
$$y'(t) = c\alpha e^{\alpha t} \overset{!}{=} ry = rce^{\alpha t}\,,$$
which leads us to
$$0 = ce^{\alpha t}(\alpha - r)\,.$$
It follows that either $c = 0$ and hence $y \equiv 0$, or, as second possibility, we have to consider $\alpha = r$. Using the initial value $y(t_0) = y_0$, we obtain
$$y(t_0) = ce^{rt_0} \overset{!}{=} y_0\,.$$
Therefore $c = y_0 e^{-rt_0}$ and finally
$$y(t) = y_0\, e^{r(t-t_0)}\,.$$

Example 1.3 (Logistic growth model). We can improve the previous population model by assuming that the reproduction rate $r$ is not constant, but decreases with the current population, e.g. $r = K \cdot (M - y)$. The term $M - y$ can be interpreted as the available resources. For a population $y = M$, no resources are available and the reproduction rate equals zero. The threshold $M$ is also called the carrying capacity of the model. The resulting ODE reads as
$$y' = K(M - y)\,y\,.$$
This equation is also known as the logistic (differential) equation.

Introducing the scalings $t = s/(KM)$ and $y(t) = M u(s)$ in the logistic equation, i.e. introducing the new time scale $s$ and measuring the population $y$ in fractions of the carrying capacity $M$, we get
$$0 = \frac{dy}{dt} - K(M-y)y = KM^2\,\frac{du}{ds} - K(M - Mu)\,Mu = KM^2\left[u' - (1-u)u\right]\,.$$
The prototypic equation is given by
$$u' = u(1-u)\,.$$

A solution to the above ODE can be obtained as follows (a systematic approach will be given later on). Using partial fractions,
$$1 = \frac{u'}{u(1-u)} = \frac{u'}{u} + \frac{u'}{1-u} = (\ln u)' - (\ln(1-u))' = \left(\ln\frac{u}{1-u}\right)'\,.$$
Integration yields
$$t + c_0 = \ln\frac{u}{1-u} + c_1\,, \qquad ce^t = \frac{u}{1-u}\,, \qquad u = \frac{ce^t}{1+ce^t} = \frac{1}{1+\tilde{c}e^{-t}}\,.$$
Python Example 1.2 (Numerical solution of the logistic equation).

from numpy import *
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

u0 = 0.3
T = 5

def rhs(t, u):
    return u*(1. - u)

Sol = solve_ivp(rhs, [0, T], [u0],
                t_eval=linspace(0, T, 101))
t = Sol.t
u = Sol.y[0, :]

plt.figure()
plt.plot(t, u, 'b-')
plt.xlabel('$t$')
plt.ylabel('$u$')
plt.grid()
plt.show()

Example 1.4 (Predator–prey model). We consider two interacting species: a predator and a prey. The prey $u$ grows in the absence of predators according to the first population model and gets diminished by the predators. The number of removed individuals is assumed to be proportional to both the available prey population and the current predator population. The predators themselves die out in the absence of the prey. Their reproduction is assumed to be proportional to the available prey and to their own population. These assumptions lead to the following system of coupled ODEs for the prey $u$ and predators $v$:
$$u' = \alpha u - \beta uv\,, \qquad v' = -\delta v + \gamma uv\,.$$

Considering a stationary equilibrium, i.e. solutions $(u, v) = \text{const.}$, we obtain from $u' = 0 = v'$ the following conditions:
$$u' = u(\alpha - \beta v) \overset{!}{=} 0 \;\Rightarrow\; u = 0 \text{ or } v = \alpha/\beta\,,$$
$$v' = v(\gamma u - \delta) \overset{!}{=} 0 \;\Rightarrow\; v = 0 \text{ or } u = \delta/\gamma\,.$$

This leads to two possible equilibria or stationary points:

• The trivial equilibrium $u = 0 = v$, i.e. no population at all.
• The coexistence equilibrium $(u^*, v^*) = \left(\frac{\delta}{\gamma}, \frac{\alpha}{\beta}\right)$.

How do small perturbations affect the coexistence equilibrium $(u^*, v^*)$?
Let us consider a linearization of the equations around $(u^*, v^*)$. Using the ansatz
$$u = u^* + \text{small perturbation} = \frac{\delta}{\gamma} + x \qquad \text{and} \qquad v = \frac{\alpha}{\beta} + y\,,$$
where $x, y \ll 1$, i.e. both functions $x(t)$ and $y(t)$ are assumed to be so small that quadratic terms like $x^2$, $xy$, $y^2$ can be neglected compared to the linear ones. Plugging this into the ODE and keeping only terms up to first order, we arrive at
$$x' = -\frac{\beta\delta}{\gamma}\,y = -k_1 y\,, \qquad y' = \frac{\alpha\gamma}{\beta}\,x = k_2 x\,.$$
Differentiating again leads to
$$x'' = -k_1 k_2\, x = -\alpha\delta\, x\,.$$
Later on, we will see that this differential equation of second order describes the so–called harmonic oscillator $x'' + k^2 x = 0$, and its solutions are periodic functions of the form
$$x(t) = c_1 \cos kt + c_2 \sin kt\,.$$
This implies that our solutions $u$ and $v$ oscillate periodically around the equilibrium $(u^*, v^*)$.
Drawing the two equilibria as well as the qualitative behaviour of the solutions, i.e. the sign of $u'$ and $v'$ in the $uv$–plane, we obtain what is called the phase portrait of the ODE.
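The oscillation around the coexistence equilibrium can also be observed numerically. The following sketch integrates the predator–prey system with scipy's solve_ivp and draws one trajectory in the $uv$–plane; the rates and the initial state are hypothetical, chosen only for illustration.

from numpy import linspace
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

alpha, beta, gamma, delta = 1.0, 1.0, 1.0, 1.0   # hypothetical rates

def rhs(t, z):
    u, v = z
    return [alpha*u - beta*u*v, -delta*v + gamma*u*v]

# start close to the coexistence equilibrium (delta/gamma, alpha/beta) = (1, 1)
Sol = solve_ivp(rhs, [0, 20], [1.5, 1.0], t_eval=linspace(0, 20, 500))

plt.plot(Sol.y[0, :], Sol.y[1, :])   # a closed orbit around (1, 1)
plt.xlabel('Prey $u$')
plt.ylabel('Predators $v$')
plt.grid()
plt.show()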

Example 1.5 (Mathematical epidemiology). Let us consider the spread of an infection like influenza in a population. We can subdivide the total population $N$ into three sub–compartments:

• The susceptible individuals $S = u_1$.
• The infected individuals $I = u_2$.
• The recovered individuals $R = u_3$.

The transmission of the disease proceeds stepwise:

• If a susceptible and an infected individual meet, the susceptible one gets also infected at a rate $\beta$.
• An infected individual recovers at a rate $\gamma$.
• Recovered and hence immune individuals lose their immunity at a rate $\delta$ and get susceptible again.

In mathematical terms, this leads to the ODE–system (called SIR–model)
$$S' = \underbrace{-\beta SI}_{\text{infection}} + \underbrace{\delta R}_{\text{loss of immunity}}\,, \qquad I' = \beta SI - \underbrace{\gamma I}_{\text{recovery}}\,, \qquad R' = +\gamma I - \delta R\,.$$
The total population $N = S + I + R$ remains constant, since
$$N' = S' + I' + R' = 0\,.$$
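As a quick numerical illustration, the following sketch integrates the SIR model with solve_ivp; the rates $\beta$, $\gamma$, $\delta$ and the initial values are hypothetical and chosen only for demonstration.

from numpy import linspace
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

beta, gamma, delta = 0.5, 0.2, 0.05   # hypothetical rates

def rhs(t, z):
    S, I, R = z
    return [-beta*S*I + delta*R,
            beta*S*I - gamma*I,
            gamma*I - delta*R]

# initial values: 99% susceptible, 1% infected (fractions of N = 1)
Sol = solve_ivp(rhs, [0, 100], [0.99, 0.01, 0.0],
                t_eval=linspace(0, 100, 500))

for comp, label in zip(Sol.y, ['S', 'I', 'R']):
    plt.plot(Sol.t, comp, label=label)
plt.xlabel('Time $t$')
plt.legend()
plt.grid()
plt.show()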

Example 1.6 (Pendulum). We consider a pendulum, i.e. a mass $m$ fixed to a (massless) rod of length $l$ and moving on a circular orbit of radius $l$ around the point of fixation (the origin). The angle $\varphi$ describes the deflection from the position at rest $\varphi = 0$. Gravitation exerts a force $F = -mg\sin\varphi$ and hence an acceleration $ml \cdot \frac{d^2\varphi}{dt^2}$ driving the mass back to its position at rest. Newton's law "Mass × Acceleration = Force" leads to the ODE
$$\varphi'' = -\frac{g}{l}\sin\varphi\,.$$
For small angles $\varphi \ll 1$ we linearize $\sin\varphi \approx \varphi$ and hence
$$\varphi'' = -\frac{g}{l}\varphi\,.$$
Again, we arrive at a differential equation of second order of the form
$$\varphi'' + k^2\varphi = 0\,,$$
where $k^2 = g/l > 0$. A straightforward calculation shows that the periodic functions
$$\varphi(t) = c_1\cos kt + c_2\sin kt$$
are solutions for $c_1, c_2 \in \mathbb{R}$. The constant $k = \sqrt{g/l}$ equals the frequency of the harmonic oscillator, i.e. the pendulum.
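Writing the second–order pendulum equation as a first–order system in $(\varphi, \varphi')$, it can be integrated numerically. The following sketch compares the nonlinear equation with its linearization for a large initial deflection; the parameter values are chosen only for illustration.

from numpy import linspace, sin, pi
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

g, l = 9.81, 1.0                      # illustrative parameters
phi0, omega0 = 0.9*pi/2, 0.0          # large initial deflection, at rest

def nonlinear(t, z):
    phi, omega = z
    return [omega, -g/l*sin(phi)]

def linearized(t, z):
    phi, omega = z
    return [omega, -g/l*phi]

tt = linspace(0, 10, 500)
SolN = solve_ivp(nonlinear, [0, 10], [phi0, omega0], t_eval=tt)
SolL = solve_ivp(linearized, [0, 10], [phi0, omega0], t_eval=tt)

plt.plot(SolN.t, SolN.y[0, :], label='nonlinear')
plt.plot(SolL.t, SolL.y[0, :], '--', label='linearized')
plt.xlabel('Time $t$')
plt.ylabel('$\\varphi$')
plt.legend()
plt.grid()
plt.show()

For large deflections the nonlinear pendulum oscillates with a visibly longer period than its linearization.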

1.2 Basic Notations


Definition 1.2 (Ordinary differential equation). Let $I \subset \mathbb{R}$ be an interval. We call an equation
$$F(x, u, u', u'', \ldots, u^{(n-1)}, u^{(n)}) = 0 \quad \forall x \in I \tag{1.2}$$
an ordinary differential equation (ODE) of order $n$. A function $u : I \to \mathbb{R}$ is called a solution of (1.2) on the interval $I$, if $u$ is $n$–times continuously differentiable in $I$ and (1.2) holds for all $x \in I$.

Definition 1.3 (Explicit ODE). The ODE (1.2) is called explicit, if the highest derivative is given explicitly in terms of the lower ones,
$$u^{(n)} = f(x, u, u', u'', \ldots, u^{(n-1)})\,; \tag{1.3}$$
otherwise we call the ODE implicit.

Definition 1.4 (Linear ODE). An ODE is called linear, if the function $F$ is linear in $u, u', \ldots, u^{(n)}$, i.e. (1.2) can be written in the form
$$a_n(x)u^{(n)} + \cdots + a_1(x)u' + a_0(x)u + b(x) = 0\,. \tag{1.4}$$
The coefficients $a_k$ are allowed to be functions, i.e. $a_k : I \to \mathbb{R}$. If all the coefficients $a_k$ are constant, i.e. $a_k = \text{const.}$ for all $k = 0, \ldots, n$, we call the equation a linear ODE with constant coefficients. The equation is called homogeneous, if $b(x) \equiv 0$; otherwise we call it inhomogeneous.

Definition 1.5 (Autonomous ODE). An ODE of the form $F(u, u', \ldots, u^{(n)}) = 0$ is called autonomous; in this case $F$ does not depend explicitly on $x$.

Definition 1.6. Let $I \subset \mathbb{R}$ be an interval. If the functions $u, u', \ldots, u^{(n)}$ and $F$ in (1.2) are vector–valued (in $\mathbb{R}^m$), we call (1.2) a system of $m$ ODEs of order $n$. A vector–valued function $u : I \to \mathbb{R}^m$ is called a solution, if it is $n$–times continuously diff'able and the ODE (1.2) holds for all $x \in I$.

The complex–valued case $u, u', \ldots, u^{(n)}, F \in \mathbb{C}^m$ can also be summarized under this definition using the identification $\mathbb{C} \simeq \mathbb{R}^2$.

Remark 1.1. In the script we write (most of the time) vectors $u \in \mathbb{R}^n$ in boldface and matrices $\mathbf{A} \in \mathbb{R}^{m\times n}$ in bold upright letters.

Remark 1.2. In physical or technological applications where the independent variable $t$ can be interpreted as time, the derivative is often written with a dot, i.e. $\dot{u} = \frac{du}{dt}$ and for the second derivative $\ddot{u} = \frac{d^2u}{dt^2}$. We will not use this notation.

Example 1.7 (Some ODEs).
$$\begin{aligned}
&u' = ku && \text{linear growth equation}\\
&u'' + ku' + \omega^2 u = f(t) && \text{oscillator}\\
&u'' + \mu(u^2 - 1)u' + u = 0 && \text{van der Pol's equation}\\
&u' = (a - bv)u\,,\quad v' = (-c + du)v && \text{predator–prey model}\\
&y' = \mathbf{A}y \text{ where } \mathbf{A} \in \mathbb{R}^{n\times n},\; y(t) \in \mathbb{R}^n && \text{lin. hom. system of order 1}\\
&y' + \ln y' = x && \text{implicit ODE, order 1}
\end{aligned}$$
But the Laplace equation $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$ is a partial differential equation (PDE).

1.3 The Directional Field of an ODE

Let $\Omega \subset \mathbb{R}^2$ be non–empty and $f : \Omega \to \mathbb{R}$. We consider the first order explicit ODE
$$u' = f(x, u)\,. \tag{1.6}$$
Let $I \subset \mathbb{R}$ be an interval (open or half-open). A function $u : I \to \mathbb{R}$ is a solution of the ODE (1.6), if $u$ is diff'able in $I$, $\operatorname{graph} u \subset \Omega$ and if (1.6) holds, i.e.
$$(x, u(x)) \in \Omega \quad \text{and} \quad u'(x) = f(x, u(x)) \quad \text{for all } x \in I\,.$$
Assume the solution curve (trajectory) passes through a point $(x_0, u_0) \in \Omega$, i.e. $u(x_0) = u_0$. Then the slope at that point is given by $u'(x_0) = f(x_0, u_0)$. The ODE determines the slopes of the trajectories at the points $(x, u)$.
Drawing for "all" points $(x_0, u_0) \in \Omega$ a line element with slope $f(x_0, u_0)$, we obtain the directional field of the ODE (1.6), see Fig. 1.1.
A solution to the ODE "fits" to the directional field: the direction of the tangents to the solution curve is given by the line elements of the directional field.

[Figure 1.1: Directional field of the ODE $u' = x + u$.]
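A directional field like the one in Fig. 1.1 can be drawn with matplotlib's quiver command. The following sketch does this for $u' = x + u$ on the unit square; grid resolution and plot range are arbitrary choices.

from numpy import meshgrid, linspace, sqrt
import matplotlib.pyplot as plt

def f(x, u):                      # right hand side of u' = x + u
    return x + u

X, U = meshgrid(linspace(0, 1, 15), linspace(-1, 1, 15))
S = f(X, U)                       # slopes of the line elements
N = sqrt(1 + S**2)                # normalize to equal length

plt.quiver(X, U, 1/N, S/N, angles='xy')
plt.xlabel('$x$')
plt.ylabel('$u$')
plt.show()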

The above geometrical concept of the directional field raises the question:
Given an arbitrary point (ξ, η) ∈ Ω. Does there exist a unique solution u passing
through that point (ξ, η) ?
This leads us to the following
Definition 1.7 (Initial value problem). Let $I \subset \mathbb{R}$ be an interval, $\Omega = I \times \mathbb{R}$, $f : \Omega \to \mathbb{R}$ and $(\xi, \eta) \in \Omega$. The problem: Find a diff'able function $u : I \to \mathbb{R}$, such that
$$u'(x) = f(x, u(x)) \quad \forall x \in I\,, \qquad u(\xi) = \eta \tag{IVP}$$
is called an initial value problem (IVP). The second equation $u(\xi) = \eta$ is called the initial condition.
Python Example 1.3 (Numerical solution of an IVP).
To solve the IVP $u' = f(t, u)$, $u(t_0) = u_0$ for a scalar–valued right hand side $f : [t_0, T] \times \mathbb{R} \to \mathbb{R}$ using python, we can use the following commands:

from scipy.integrate import solve_ivp

def f(t, u):                       # definition of rhs
    ...

Sol = solve_ivp(f, [t0, T], [u0])  # solve
t = Sol.t                          # t-coordinates
u = Sol.y[0, :]                    # solution values

In chapter 2, Section 2.3 and 2.4 we will deal with the existence and uniqueness
of solutions to initial value problems (IVP).
Example 1.8 ($u' = f(x)$). We consider the IVP
$$u' = f(x)\,, \qquad u(\xi) = \eta\,,$$
where $f : I \to \mathbb{R}$ is continuous, $\xi \in I$ and we define $\Omega := I \times \mathbb{R}$.

According to the fundamental theorem of calculus, we get
$$\Phi(x) = \int_\xi^x f(s)\,ds$$
as a solution of the ODE. We obtain all solutions in the form
$$u = u(x; C) := \Phi(x) + C$$
for arbitrary constants $C \in \mathbb{R}$, the so–called general solution of the ODE.

[Figure 1.2: Directional field of $u' = 2x$. Initial value $(\xi, \eta) = (0, -0.4)$.]

Using the initial condition $u(\xi) = \eta$ leads to $C = \eta$ and hence
$$u(x) = \eta + \int_\xi^x f(s)\,ds\,.$$
The solution exists for all $x \in I$.

Remark 1.3. If $I$ is not compact (i.e. not bounded and closed), $f$ may be unbounded on $I$ and hence not integrable over $I$. However, the integral $\Phi(x)$ exists for all $x \in I$, since $[\xi, x]$ is a compact interval and $f$, as a continuous function, is integrable on this part.

Example 1.9 ($u' = g(u)$). We consider the IVP
$$u' = g(u)\,, \qquad u(\xi) = \eta\,,$$
where $g$ is continuous on an interval $I$.

Formal calculations yield for $g(u) \neq 0$
$$\frac{du}{dx} = g(u) \iff \frac{du}{g(u)} = dx\,,$$
and integration leads to the implicit form
$$\int \frac{du}{g(u)} = \int dx = x + C\,.$$
Using the initial condition, we obtain
$$x(u) = \xi + \int_\eta^u \frac{dv}{g(v)}\,.$$

[Figure 1.3: Directional field of $u' = 2u$. Initial value $(\xi, \eta) = (0, 0.2)$.]

This is an implicit solution $x = x(u)$ of the IVP. Applying the inverse function theorem leads to the solution $u = u(x)$. Examples can be seen in the tutorials.
Example 1.10 (A non–unique ODE). As a special case of the previous example 1.9 we consider the ODE
$$u' = \sqrt{|u|}\,. \tag{1.7}$$
Thanks to symmetry, $v(x) := -u(-x)$ is also a solution, if $u(x)$ is a solution. Wlog we just consider positive solutions and obtain formally
$$\int \frac{du}{\sqrt{u}} = 2\sqrt{u} = x + C\,, \qquad u(x; C) = \frac{(x+C)^2}{4} \quad \text{for } x > -C \in \mathbb{R}\,.$$
Furthermore $u \equiv 0$ is a solution and $-u(-x; C)$ are negative solutions. Using these three possibilities, we can construct even more solutions by concatenating them smoothly, i.e. such that they are at least diff'able.
Considering the initial value $u(2) = 1$, then all solutions are of the form
$$u(x; K) = \begin{cases} x^2/4 & \text{for } x > 0\,,\\ 0 & \text{for } K \le x \le 0\,,\\ -(x-K)^2/4 & \text{for } x < K\,, \end{cases}$$
where $K \le 0$ is arbitrary.
The IVP (1.7) allows for all initial values $u(\xi) = \eta$ non-unique solutions. For $\eta = 0$ they branch immediately at $x = \xi$, for $\eta \neq 0$ at some distance from the initial value. We say that the IVP has locally unique solutions for $\eta \neq 0$. For $\eta = 0$ the solutions are not even locally unique.
[Figure 1.4: Directional field of the ODE $u' = \sqrt{|u|}$ and solution trajectories for the initial value $(\xi, \eta) = (2, 1)$.]

1.4 Separation of Variables

We consider an initial value problem of the type
$$u'(x) = f(x)\,g(u)\,, \qquad u(\xi) = \eta\,. \tag{1.8}$$

Example 1.11 (Separable ODE). Examples for such ODEs with separated variables or separable ODEs are:

(1) $u'(x) = f(x)$.
(2) $u'(x) = g(u)$.
(3) $u'(x) = f(ax + bu + c)$. Here, one may introduce $v(x) = ax + bu(x) + c$ and derive an ODE for $v$.
(4) $u'(x) = f(u/x)$. Let $v(x) = u(x)/x$ and consider an ODE for $v$.
(5) $u' = f\left(\frac{\alpha x + \beta u + \gamma}{ax + bu + c}\right)$.
(6) The linear first order ODE $u' + a(x)u = b(x)$.

We will treat those examples in the tutorials.

The following theorem deals with the solution of (1.8).

Theorem 1.1. Let $I_x \subset \mathbb{R}$ and $I_u \subset \mathbb{R}$ be two intervals and let $f : I_x \to \mathbb{R}$, $g : I_u \to \mathbb{R}$ be continuous. Furthermore let $\xi \in I_x$ and $\eta \in \operatorname{int} I_u$. Given that
$$g(\eta) \neq 0\,,$$
then there exists a neighborhood $U_\xi \subset I_x$ of $\xi$ (one–sided, if $\xi \in \partial I_x$) such that the IVP (1.8) admits a unique solution $u : U_\xi \to \mathbb{R}$. This solution is given implicitly by
$$\int_\eta^{u(x)} \frac{ds}{g(s)} = \int_\xi^x f(t)\,dt\,. \tag{1.9}$$

Proof. Assume (1.9) holds and define $G(u) = \int_\eta^u \frac{ds}{g(s)}$ and $F(x) = \int_\xi^x f(t)\,dt$. Then $G : I_u \to \mathbb{R}$ exists in a neighborhood of $\eta$, since $g(\eta) \neq 0$. Furthermore, $G$ is diff'able in this neighborhood and it holds that $G'(u) = 1/g(u)$. Due to the inverse function theorem there exists a function $H = G^{-1} : \mathbb{R} \to I_u$ with $u = H(G(u))$. Due to (1.9) we get $u = H(F(x))$ or $F(x) = G(u(x))$.
Differentiation w.r.t. $x$ yields $G'(u) \cdot u' = F'$, thus $u' = f(x) \cdot g(u)$. So $u$ satisfies the ODE (1.8).
Moreover $G(\eta) = 0$, $F(\xi) = 0$ and $H(0) = \eta$, therefore $u(\xi) = H(F(\xi)) = \eta$; hence also the initial condition of (1.8) holds.
Let $v$ be another solution, i.e. $v'(x)/g(v(x)) = f(x)$. Integration w.r.t. $x$ yields
$$\int_\xi^x f(t)\,dt = \int_\xi^x \frac{v'(t)\,dt}{g(v(t))} = \int_{v(\xi)}^{v(x)} \frac{ds}{g(s)}\,.$$
Therefore $F(x) = G(v(x))$ and $v(x) = H(F(x)) = u(x)$.
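Separable IVPs of the form (1.8) can also be solved symbolically. The following sketch uses sympy's dsolve on the separable example $u' = x\,u^2$ with $u(0) = 1$, chosen only for illustration.

from sympy import Function, dsolve, Eq, symbols

x = symbols('x')
u = Function('u')

# separable ODE u' = x*u^2 with initial condition u(0) = 1
ode = Eq(u(x).diff(x), x*u(x)**2)
sol = dsolve(ode, u(x), ics={u(0): 1})
print(sol)     # Eq(u(x), -2/(x**2 - 2)), i.e. u = 2/(2 - x^2)

Note that the symbolic solution blows up at $x = \sqrt{2}$, in line with the local (not global) existence guaranteed by Theorem 1.1.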

Question 1.1. What happens in case of $g(\eta) = 0$?
One solution is given by $u(x) \equiv \eta$. But do there exist other solutions?

Theorem 1.2. Consider the same situation as in Thm. 1.1, but let $g(\eta) = 0$ and $g(u) \neq 0$ for $\eta < u \le \eta + \alpha$ resp. $\eta - \alpha \le u < \eta$ for $\alpha > 0$. Furthermore assume
$$\int_\eta^{\eta+\alpha} \frac{ds}{g(s)} = \infty \qquad \text{or resp.} \qquad \int_{\eta-\alpha}^{\eta} \frac{ds}{g(s)} = \infty\,.$$
Then there exists no solution $u(x)$ approaching the constant $u \equiv \eta$ from above or below.

Proof. See [Wal98, Chapter I, §1.VIII].

Corollary 1.3. Let $u(x)$ be a solution of (1.8) where $u(x_0) \lessgtr \eta$ for some $x_0 \in I_x$. Then the estimate $u(x) \lessgtr \eta$ holds true for all $x \in I_x$.

If $\eta \in \operatorname{int} I_u$ and both integrals in Thm. 1.2 diverge, then the IVP (1.8) admits a locally unique solution.
This is the case, if $g(s)$ has an isolated root at $\eta$ and if $g$ is Lipschitz–continuous around $\eta$, i.e. $|g(u) - g(\eta)| = |g(u)| \le K\,|u - \eta|$; e.g. if $g'(\eta)$ exists.

Remark 1.4. The opposite direction of Thm. 1.2 is in general not true, i.e. the convergence of both integrals does not imply that the IVP is not uniquely solvable.

1.5 Linear First Order ODEs

A special case of an ODE with separated variables is given by the linear ODE of first order
$$u'(x) = a(x)u + b(x)\,. \tag{1.10}$$
To construct solutions of (1.10) we start with the homogeneous problem
$$u'(x) = a(x)u\,. \tag{1.11}$$
Using Thm. 1.1 we can write down the general solution
$$u(x) = Ce^{A(x)}\,, \tag{1.12}$$
where $A(x) = \int a(x)\,dx$. The solution of the IVP (1.11) with $u(\xi) = \eta$ is given by
$$u(x) = \eta\, e^{A(x)}\,, \quad \text{where } A(x) = \int_\xi^x a(t)\,dt\,. \tag{1.13}$$
To construct the solution of the inhomogeneous problem (1.10), we use the "variation of constants". This approach goes back to Lagrange and starts from the general solution (1.12). We replace the constant $C$ by a function $C(x)$ and obtain
$$u(x) = C(x)e^{A(x)}\,, \qquad u'(x) = C'(x)e^{A(x)} + C(x)A'(x)e^{A(x)}\,.$$
Inserting the ODE (1.10) we get
$$b(x) = u'(x) - a(x)u = \left[C'(x) + C(x)A'(x) - a(x)C(x)\right]e^{A(x)}\,.$$
Since $A(x) = \int a(x)\,dx$, i.e. $A'(x) = a(x)$, we derive
$$C'(x) = b(x)e^{-A(x)}\,, \qquad C(x) = C_0 + \int b(t)e^{-A(t)}\,dt\,.$$
The solution of the IVP $u'(x) = a(x)u + b(x)$, $u(\xi) = \eta$ can now be written as
$$u(x) = \left[\eta + \int_\xi^x b(t)e^{-A(t)}\,dt\right]e^{A(x)} = \eta\, e^{A(x)} + \int_\xi^x b(t)e^{A(x)-A(t)}\,dt\,,$$
where $A(t) = \int_\xi^t a(\tau)\,d\tau$ and especially $A(x) - A(t) = \int_t^x a(\tau)\,d\tau$.
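As a sanity check, the variation–of–constants formula can be compared against a numerical solution. The sketch below does this for the hypothetical example $u' = -u + x$, $u(0) = 1$, where $A(x) = -x$ and the formula gives $u(x) = 2e^{-x} + x - 1$.

from numpy import linspace, exp
from scipy.integrate import solve_ivp

# hypothetical example: a(x) = -1, b(x) = x, u(0) = 1, so A(x) = -x
def rhs(x, u):
    return -u + x

xx = linspace(0, 5, 101)
Sol = solve_ivp(rhs, [0, 5], [1.0], t_eval=xx, rtol=1e-8)

exact = 2*exp(-xx) + xx - 1             # variation of constants formula
print(abs(Sol.y[0, :] - exact).max())   # small discretization error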
Chapter 2

Existence and Uniqueness

2.1 Some Tools from Functional Analysis


Definition 2.1 (Norm). Let $V$ be a real vector space. A function $\|\cdot\| : V \to \mathbb{R}$ is called a norm, if

(1) (definiteness) $\|u\| \ge 0$ and $\|u\| = 0$ if and only if $u = 0$,
(2) (homogeneity) $\|\lambda u\| = |\lambda|\,\|u\|$,
(3) (triangle inequality) $\|u + v\| \le \|u\| + \|v\|$

hold for all $u, v \in V$ and $\lambda \in \mathbb{R}$.

Definition 2.2 (Norms on continuous functions). Let $I \subset \mathbb{R}$ be a compact interval and $m \in \mathbb{N}$. We define the space of continuous scalar (or vector–valued) functions on $I$ as
$$C = C(I, \mathbb{R}^m) := \{u : I \to \mathbb{R}^m,\; u \text{ continuous}\}\,.$$
One can check that $C$ is a real vector space.
We define the maximum (or infinity) norm $\|\cdot\|_\infty$ on $C$ by
$$\|u\|_\infty := \max_{x \in I} |u(x)|_\infty = \max_{x \in I}\,\max_{i=1\ldots m} |u_i(x)|\,.$$
One can check that $\|\cdot\|_\infty$ defines a norm on $C$ and that $\|u\|_\infty < \infty$ holds for all $u \in C$.
Let $p : I \to \mathbb{R}$ be a weighting function satisfying $0 < a \le p(x) \le b < \infty$ for all $x \in I$. Then
$$\|u\|_{p,\infty} := \|u\,p\|_\infty = \max_{x \in I}\left(p(x) \cdot |u(x)|_\infty\right)$$
is called the weighted maximum norm. One can check that $\|\cdot\|_{p,\infty}$ also defines a norm on $C$.
As a special case we consider the exponential weighting function $p(x) := e^{-\alpha|x-\xi|}$ for $\alpha > 0$ and $\xi \in I$. In the sequel we frequently make use of the norm induced by this weight and use the notation
$$\|u\|_{\alpha,\infty} := \max_{x \in I} |u(x)|_\infty \cdot e^{-\alpha|x-\xi|}\,.$$

Definition 2.3 (Banach space). Let $(V, \|\cdot\|)$ be a normed vector space. We call $V$ complete or a Banach space, if all Cauchy sequences converge in $V$.

Lemma 2.1. Let $I \subset \mathbb{R}$ be a compact interval and $m \in \mathbb{N}$. The vector space $\left(C(I, \mathbb{R}^m), \|\cdot\|_{p,\infty}\right)$ is complete for any weighting function $p : I \to \mathbb{R}$.
In particular, for any $\alpha > 0$, the spaces $(C(I, \mathbb{R}^m), \|\cdot\|_\infty)$ and $\left(C(I, \mathbb{R}^m), \|\cdot\|_{\alpha,\infty}\right)$ are Banach spaces.

Proof. As an exercise.
Definition 2.4 (Operator). Let $(V, \|\cdot\|_V)$ and $(W, \|\cdot\|_W)$ be two real normed vector spaces. Let $D \subset V$ and $T : D \to W$ be a mapping (operator). We call $T$ a functional, if $W = \mathbb{R}$ or $\mathbb{C}$. Furthermore, we call $T$

linear, if $D$ is a subspace of $V$ and $T[\alpha u + \beta v] = \alpha T[u] + \beta T[v]$ for all $\alpha, \beta \in \mathbb{R}$ and all $u, v \in D$.
affine linear, if $D$ is a subspace of $V$ and $T[\lambda u + (1-\lambda)v] = \lambda T[u] + (1-\lambda)T[v]$ for all $\lambda \in \mathbb{R}$ and all $u, v \in D$.
continuous in $u_0 \in D$, if $T[u_n] \to T[u_0]$ for all sequences $(u_n)_{n\in\mathbb{N}} \to u_0$.
Lipschitz–continuous (L–cont.) in $D$ with Lipschitz–constant $q \ge 0$, if $\|T[u] - T[v]\|_W \le q\,\|u - v\|_V$ for all $u, v \in D$.
a contraction, if $T$ is L–continuous with L–constant $q < 1$.

Remark 2.1. Let $T$ be a linear operator. Then the following holds true:
$$T \text{ cont. in } u_0 \in V \iff T \text{ cont. in } 0\,, \qquad T \text{ L–cont.} \iff \|T[u]\|_W \le q\,\|u\|_V \;\forall u \in D\,.$$
Lipschitz–continuous operators are continuous. The minimal L–constant of a linear operator is called the operator norm of $T$.
Note that the definitions of continuity and L–continuity involve two different norms: $\|\cdot\|_V$ for the arguments and $\|\cdot\|_W$ for the images.
Example 2.1. Let $I \subset \mathbb{R}$ be a compact interval. Let $V = W = C(I, \mathbb{R}^m)$. The integral operator
$$T[u](x) := \eta + \int_\xi^x u(s)\,ds$$
is affine linear and L–continuous w.r.t. the $\|\cdot\|_{\alpha,\infty}$–norm in $V$ and $W$ with the L–constant $1/\alpha$. Check this as an exercise.

2.2 Banach Fixpoint Theorem

An important tool to prove the Picard–Lindelöf Theorem on existence and uniqueness of solutions to initial value problems for ordinary differential equations is the following

Theorem 2.2 (Banach Fixpoint Theorem). Let $B$ be a Banach space and let $D \subset B$ be closed and non–empty. Furthermore let $T : D \to B$ be a contraction and a self–mapping, i.e. $T(D) \subset D$. Then there exists a unique fixpoint $u^*$ of $T$ in $D$, i.e. the fixpoint equation
$$u = T[u]$$
has a unique solution $u^* \in D$. The fixpoint–iteration
$$u_{n+1} = T[u_n]$$
converges for all initial guesses $u_0 \in D$ to the fixpoint $u^*$ and
$$\|u_n - u^*\| \le \frac{1}{1-q}\,\|u_{n+1} - u_n\| \le \frac{q^n}{1-q}\,\|u_1 - u_0\|\,,$$
where $\|\cdot\|$ denotes the norm in $B$ and $q < 1$ equals the L–constant of $T$.

Proof. (1) Since $T$ is a self–mapping, it holds that $u_n \in D$.

(2) We show
$$\|u_{n+1} - u_n\| \le q^n\,\|u_1 - u_0\|\,. \tag{$*$}$$
For $n = 0$ this holds true and
$$\|u_{n+2} - u_{n+1}\| \le \|T[u_{n+1}] - T[u_n]\| \le q\,\|u_{n+1} - u_n\| \le q^{n+1}\,\|u_1 - u_0\|\,.$$

(3) Let $u, v \in D$. Then
$$\|u - v\| \le \|u - T[u]\| + \|T[u] - T[v]\| + \|v - T[v]\| \le \|u - T[u]\| + q\,\|u - v\| + \|v - T[v]\|\,,$$
hence
$$\|u - v\| \le \frac{1}{1-q}\left[\|u - T[u]\| + \|v - T[v]\|\right]\,. \tag{$**$}$$

(4) Uniqueness: Let $u, v$ be two fixpoints. Due to $(**)$ we get $u = v$.

(5) Existence: Set $u = u_{n+p}$ and $v = u_n$ for $p, n \in \mathbb{N}$ in $(**)$, then
$$\|u_{n+p} - u_n\| \le \frac{1}{1-q}\left[\|u_{n+p+1} - u_{n+p}\| + \|u_{n+1} - u_n\|\right]$$
and using $(*)$
$$\le \frac{1}{1-q}\left(q^{n+p} + q^n\right)\|u_1 - u_0\| \le \frac{2\,\|u_1 - u_0\|}{1-q}\,q^n\,;$$
hence $(u_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $D \subset B$. Since $B$ is a Banach space, the sequence $u_n$ converges to some $u^* \in B$. The set $D \subset B$ is closed and $u_n \in D$, hence the limit $u^*$ is also an element of $D$. The operator $T$ is continuous, therefore $T[u_n]$ converges to $T[u^*]$ and we obtain the fixpoint equation $u^* = T[u^*]$.

(6) Let $u = u_n$ and $v = u^*$ in $(**)$, then $\|u_n - u^*\| \le \frac{1}{1-q}\,\|u_{n+1} - u_n\|$ and using $(*)$ we obtain the estimate $\|u_n - u^*\| \le \frac{q^n}{1-q}\,\|u_1 - u_0\|$.
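As a small numerical illustration of the fixpoint iteration and the a priori error bound, consider the contraction $T[u] = \cos(u)$ on $D = [0, 1]$, a standard toy example with $q = \sin(1) \approx 0.84$ (not an integral operator, but the same mechanism).

from math import cos, sin

q = sin(1.0)              # Lipschitz constant of cos on [0, 1]
u = 1.0                   # initial guess u_0
d0 = abs(cos(u) - u)      # ||u_1 - u_0||

for n in range(25):
    u = cos(u)            # fixpoint iteration u_{n+1} = T[u_n]

# a priori bound: ||u_n - u*|| <= q^n/(1-q) * ||u_1 - u_0||
bound = q**25/(1 - q)*d0
print(u, bound)           # u ~ 0.7390851..., well within the bound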

2.3 Picard–Lindelöf Theorem

An important result is the following theorem by Picard and Lindelöf. We consider the IVP
$$u' = f(x, u)\,, \qquad u(\xi) = \eta\,. \tag{IVP}$$
Here, $I \subset \mathbb{R}$ is a compact interval, $x, \xi \in I$, $\eta \in \mathbb{R}^m$ and $f : I \times \mathbb{R}^m \to \mathbb{R}^m$ is assumed to be continuous. This means, we consider a scalar ($m = 1$), complex ($m = 2$, $\mathbb{C} \simeq \mathbb{R}^2$) or vectorial ($m > 1$) IVP for a first order ODE.

Theorem 2.3 (Picard–Lindelöf). Given (IVP) and assume $f$ to be continuous w.r.t. $x$ and L–continuous w.r.t. $u$ on $I \times \mathbb{R}^m$, i.e.
$$\|f(x, u) - f(x, v)\| \le L\,\|u - v\| \qquad \forall u, v \in \mathbb{R}^m\,.$$
Then the initial value problem (IVP) admits a unique solution on $I$.

Proof. The IVP is equivalent to the fixpoint–problem
$$u(x) = T[u](x) := \eta + \int_\xi^x f(s, u(s))\,ds\,. \tag{FPP}$$
Let $u$ be a continuous solution of (FPP), then $u$ satisfies the initial condition. Since the right hand side of (FPP) is continuously diff'able, $u$ is also continuously diff'able and hence $u' = f(x, u)$. On the other hand, let $u$ be a diff'able solution of (IVP). Then the continuity of $f$ implies the continuity of $\varphi(x) := f(x, u(x))$ and hence $u$ is continuously diff'able. Due to the fundamental theorem of calculus the fixpoint equation (FPP) holds.

To apply the Banach Fixpoint Theorem, we let $B = C(I, \mathbb{R}^m)$ and use the exponentially weighted norm $\|\cdot\|_{\alpha,\infty}$ on $B$ for some $\alpha > 0$ to be specified later on.

• $B$ is a Banach space w.r.t. this norm, cf. Lemma 2.1.
• $D = B$ is closed and non–empty.
• $T(D) \subset D$, hence $T$ is a self–mapping.
• Contraction: Wlog we consider $x \ge \xi$. Let $u, v \in D$. Then
$$\begin{aligned}
|T[u](x) - T[v](x)|_\infty &= \left|\int_\xi^x f(s, u(s)) - f(s, v(s))\,ds\right|_\infty\\
&\le \int_\xi^x |f(s, u(s)) - f(s, v(s))|_\infty\,ds\\
&\le L\int_\xi^x |u(s) - v(s)|_\infty\,e^{-\alpha|s-\xi|}\,e^{\alpha|s-\xi|}\,ds\\
&\le L\,\|u - v\|_{\alpha,\infty}\,\frac{e^{\alpha|x-\xi|} - 1}{\alpha}\\
&\le \frac{L}{\alpha}\,\|u - v\|_{\alpha,\infty}\,e^{\alpha|x-\xi|}\,,
\end{aligned}$$
hence
$$|T[u](x) - T[v](x)|_\infty\,e^{-\alpha|x-\xi|} \le \frac{L}{\alpha}\,\|u - v\|_{\alpha,\infty} \quad \forall x \in I\,, \qquad \text{i.e.} \qquad \|T[u] - T[v]\|_{\alpha,\infty} \le \frac{L}{\alpha}\,\|u - v\|_{\alpha,\infty}\,.$$
Choosing $\alpha = 2L$, we get
$$\|T[u] - T[v]\|_{\alpha,\infty} \le q\,\|u - v\|_{\alpha,\infty} \quad \text{for } q = \frac{1}{2}\,.$$

Hence we can apply the Banach Fixpoint Theorem to (FPP) and obtain the desired result.
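The proof is constructive: iterating the integral operator $T$ is the classical Picard iteration. As a small symbolic sketch (using sympy), for $u' = u$, $u(0) = 1$, i.e. $f(s, u) = u$, $\xi = 0$, $\eta = 1$, the iterates are exactly the partial sums of the exponential series.

from sympy import symbols, integrate, sympify

x, s = symbols('x s')

u = sympify(1)                                   # initial guess u_0 = eta = 1
for n in range(5):
    # Picard iteration: u_{n+1}(x) = eta + int_0^x f(s, u_n(s)) ds, with f = u
    u = 1 + integrate(u.subs(x, s), (s, 0, x))
    print(u)
# prints 1 + x, then 1 + x + x**2/2, ... : partial sums of exp(x)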

In many cases f is not defined on the entire set I × Rm but just on a subset I × G.
Corollary 2.4. Let $I \subset \mathbb{R}$ be a compact interval. Let $G \subset \mathbb{R}^m$ be compact and simply connected. Let $\Omega = I \times G$. Assume that $f : \Omega \to \mathbb{R}^m$ is continuous w.r.t. $x \in I$ and Lipschitz–continuous w.r.t. $u \in G$.
Then the following holds:
The initial value problem (IVP) admits a unique solution existing at least on an interval $J = I \cap \{|x - \xi| \le \delta/\|f\|_\infty\}$ where $\delta = \operatorname{dist}(\eta, \partial G)$.

Proof. Let $x \in J$. Then $|x - \xi| \le \delta/\|f\|_\infty$ and
$$|T[u](x) - \eta|_\infty \le \int_\xi^x |f(s, u(s))|_\infty\,ds \le |x - \xi|\,\|f\|_\infty \le \delta\,.$$
Hence $T[u](x) \in G$, since $\delta = \operatorname{dist}(\eta, \partial G)$. We modify the proof of Picard–Lindelöf as follows:

• Let $B = C(J, \mathbb{R}^m)$. Then $B$ is again a Banach space.
• Let $D = \{u \in B : \operatorname{graph} u \subset \Omega\}$. Then $D$ is non–empty and closed.
• We have already shown that $T[u](x) \in G$, hence $\operatorname{graph} T[u] \subset \Omega$ for all $u \in D$ and therefore $T(D) \subset D$.

The contraction property of $T$ remains unchanged.

The following result deals with the global existence and uniqueness of solutions.

Corollary 2.5. Let $\Omega \subset \mathbb{R}^{m+1}$ be a domain (non–empty, open and simply connected). Let $f : \Omega \to \mathbb{R}^m$ be continuous and Lipschitz–continuous w.r.t. $u$ for all $(x, u) \in \Omega$. Then:
The initial value problem (IVP) has a unique solution. This solution can be extended up to the boundary of $\Omega$.

Proof. See [Wal98, §10, Thm. VI].

Remark 2.2. The above corollary states that the solution $u$ exists e.g. on an interval $\xi \le x \le b$ right of $\xi$, where $b = \infty$ is also possible. Depending on the value of $b$ we have one of the following cases:

(1) $b = \infty$: The solution exists for all $x \ge \xi$, i.e. the solution can be extended up to the boundary of $\Omega$ in $x$–direction.
(2) $b < \infty$ and $\sup |u(x)| \to \infty$ for $x \to b$: The solution blows up, i.e. it can be extended up to the boundary of $\Omega$ in $u$–direction.
(3) $b < \infty$ and $\operatorname{dist}((x, u(x)), \partial\Omega) \to 0$ for $x \to b$: The solution gets arbitrarily close to the boundary of $\Omega$.

In particular this implies that the maximal interval of existence of the solution $u$ is open. The same holds true for values of $x$ left of $\xi$.

All these results also hold true for initial value problems for ordinary differential equations of order $n$. Consider an IVP of order $n$,
$$F\left(x, u, u', \ldots, u^{(n)}\right) = 0\,,$$
with initial conditions
$$u^{(i)}(\xi) = \eta^{(i)} \quad \text{for } i = 0, \ldots, n-1\,.$$
We introduce the auxiliary functions $y_i = u^{(i)}$ for $i = 0, \ldots, n-1$ and obtain the new IVP
$$y_i' = y_{i+1} \quad \text{for } i = 0, \ldots, n-2\,, \qquad F\left(x, y_0, y_1, \ldots, y_{n-1}, y_{n-1}'\right) = 0\,,$$
with the initial conditions given by
$$y_i(\xi) = \eta^{(i)} \quad \text{for } i = 0, \ldots, n-1\,.$$
With this trick, we have transformed the scalar valued IVP of order $n$ into a system of $n$ first–order IVPs.

2.4 Peano's Theorem

The Picard–Lindelöf Theorem ensures the existence and uniqueness of a solution to an IVP provided the right hand side $f$ is Lipschitz–continuous. But what happens, if $f$ is no longer assumed to be Lipschitz–continuous?
As we have already seen in Example 1.10, the uniqueness may get lost. However, in this example solutions still exist. The Peano existence theorem ensures the existence of solutions provided the right hand side is at least continuous.

Theorem 2.6 (Peano Existence Theorem). Let $I \subset \mathbb{R}$ be a compact interval and $m \in \mathbb{N}$. Let $G \subset \mathbb{R}^m$ be closed and simply connected. Let $\Omega = I \times G$ and $f \in C(\Omega)$ be bounded. Let $(\xi, \eta) \in \Omega$. Then the IVP
$$u' = f(x, u)\,, \qquad u(\xi) = \eta \tag{IVP}$$
has at least one solution existing on an interval $J = I \cap \{|x - \xi| < \delta/\|f\|_\infty\}$ where $\delta = \operatorname{dist}(\eta, \partial G)$.

Proof. The proof utilizes Schauder's fixpoint theorem instead of Banach's. For details we refer to [Wal98, §7].

Corollary 2.7. Let $\Omega \subset \mathbb{R}^{m+1}$ be a domain. Let $f \in C(\Omega, \mathbb{R}^m)$. For any initial value $(\xi, \eta) \in \Omega$ there exists a solution to the IVP (IVP). This solution can be extended to the boundary of $\Omega$.

Proof. Without proof.

2.5 Upper– and Lower Functions

In this section we consider just real and scalar ODEs. All functions appearing are assumed to be real–valued.
The concept of upper and lower functions allows to derive estimates for the solutions of ODEs. Hence we can analyze the qualitative behavior of the solutions without solving the ODE explicitly.
Wlog we just consider solutions to the right of the initial point $\xi$. Mutatis mutandis the same holds true left of $\xi$.

Lemma 2.8. Let $\Phi$ and $\Psi$ be two diff'able functions on the interval $I_0 := \{\xi < x \le \xi + a\}$. Let $\Phi(x) < \Psi(x)$ on $\xi < x < \xi + \varepsilon$ for some $\varepsilon > 0$. Then we are in one of the following cases:

(1) Either $\Phi < \Psi$ on the entire interval $I_0$, or
(2) there exists $x_0 \in I_0$ such that $\Phi(x) < \Psi(x)$ on $\xi < x < x_0$ and $\Phi(x_0) = \Psi(x_0)$, $\Phi'(x_0) \ge \Psi'(x_0)$.

Proof. If Case 1 is not true, then due to the assumptions and the continuity of $\Phi$ and $\Psi$ there exists $x_0 \in I_0$ such that $\Phi(x) < \Psi(x)$ for $\xi < x < x_0$ and $\Phi(x_0) = \Psi(x_0)$. It remains to show that $\Phi'(x_0) \ge \Psi'(x_0)$. Consider the left–sided difference quotient at $x_0$,
$$\frac{\Phi(x_0) - \Phi(x)}{x_0 - x} > \frac{\Psi(x_0) - \Psi(x)}{x_0 - x}\,,$$
where $\xi < x < x_0$. Passing to the limit $x \nearrow x_0$ yields $\Phi'(x_0) \ge \Psi'(x_0)$.

Definition 2.5. Let $\Omega \subset \mathbb{R}^2$, $f \in C(\Omega)$. We consider the IVP
$$u' = f(x, u)\,, \qquad u(\xi) = \eta \tag{IVP}$$
on an interval $I = [\xi, \xi + a]$ for some $a > 0$.
A function $v$ resp. $w$ is called a lower resp. upper function of the IVP, if
$$v' < f(x, v(x)) \text{ on } I \text{ and } v(\xi) \le \eta\,, \qquad \text{resp.} \qquad w' > f(x, w(x)) \text{ on } I \text{ and } w(\xi) \ge \eta\,.$$
2.5. UPPER– AND LOWER FUNCTIONS 28

Proposition 2.9. Let $u$ be a solution of (IVP) and let $v$ and $w$ be lower and upper functions. Then
$$v(x) < u(x) < w(x) \quad \text{on } I_0 = (\xi, \xi + a)\,.$$

Proof. For $v$: If $v(\xi) < \eta$, then due to the assumption $v' < f(x, v(x))$ and Lemma 2.8 it holds that $v(x) < u(x)$ for all $x \in I_0$.
If $v(\xi) = \eta$, then $v'(\xi) < f(\xi, u(\xi)) = f(\xi, \eta) = u'(\xi)$ implies the existence of $\tilde{\xi} > \xi$ where $v(\tilde{\xi}) < u(\tilde{\xi})$. Now we are in the situation of the first case.

In applications the following theorem is widely used.

Theorem 2.10. Let $I = [\xi, \xi + a]$, $R \subset \mathbb{R}$ be an interval and $\Omega = I \times R$. Let $f \in C(\Omega)$ and $g, h \in C(\Omega)$ be L–continuous. Furthermore let $u$ be a solution of (IVP) and assume
$$v' = g(x, v)\,, \quad v(\xi) \le \eta\,, \quad g(x, v) \le f(x, v) \quad \forall x \in I\,,$$
$$w' = h(x, w)\,, \quad w(\xi) \ge \eta\,, \quad h(x, w) \ge f(x, w) \quad \forall x \in I\,.$$
Then
$$v(x) \le u(x) \le w(x) \quad \forall x \in I\,.$$
The functions $v$ and $w$ are again called lower and upper functions.

Proof. We consider the auxiliary problem $u_n'(x) = f(x, u_n) + \frac{1}{n}$, $u_n(\xi) = \eta + \frac{1}{n}$ for $n \in \mathbb{N}$. Since the assumptions of Proposition 2.9 are satisfied for all $n \in \mathbb{N}$, we obtain $v(x) < u_{n+1}(x) < u_n(x)$ and $u(x) < u_{n+1}(x) < u_n(x)$ for all $x \in I$. The sequence $(u_n)_n$ is pointwise monotonically decreasing and bounded; hence it converges pointwise. Moreover it holds that
$$v(x) \le u(x) = \lim_{n\to\infty} u_n(x) \quad \forall x \in I\,.$$
For $w$ we proceed analogously.

Remark 2.3. The detour of introducing the sequence $u_n$ is necessary, since the equality sign in the assumptions does not allow for a direct proof (why?).

Example 2.2. Let $I = [0, 2]$. We consider the IVP
$$u'(x) = -\frac{u}{1+x^2}\,, \qquad u(0) = 1\,.$$
The ODE $v' = -v$ together with $v(0) = 1$ yields $v(x) = e^{-x}$ as a lower function.
The ODE $w' = -w/5$ with $w(0) = 1$ yields the upper function $w(x) = e^{-x/5}$.
The solution of the original IVP is given by $u(x) = \exp(-\arctan x)$.

[Figure: Exact solution $u$, lower function $v$ and upper function $w$.]
Chapter 3

Linear Differential Equations

Example 3.1 (Semidiscretization of the heat equation). We consider the one–dimensional heat equation
$$\frac{\partial u}{\partial t} - \kappa\,\frac{\partial^2 u}{\partial x^2} = f\,, \tag{3.1}$$
for $t \in \mathbb{R}^+$ and $0 < x < 1$, supplemented by the initial condition
$$u(t = 0, x) = u_0(x)$$
and the boundary conditions
$$u(t, x = 0) = u_L(t)\,, \qquad u(t, x = 1) = u_R(t)\,.$$

The idea behind semidiscretization is the following: We introduce a spatial grid $x_i = ih$, $h = 1/n$ for $i = 0, \ldots, n$ and approximate the spatial derivative by a finite difference
$$\frac{\partial^2 u}{\partial x^2}(t, x_i) \approx \frac{u_{i+1}(t) - 2u_i(t) + u_{i-1}(t)}{h^2}$$
using $u_i(t) = u(t, x_i)$. Plugging this approximation into the heat equation (3.1), we obtain a system of ordinary differential equations
$$u_i' - \frac{\kappa}{h^2}\left(u_{i+1} - 2u_i + u_{i-1}\right) = f_i(t)\,, \tag{3.2a}$$
for $i = 1, \ldots, n-1$ and with initial conditions $u_i(0) = u_{0,i}$. For the nodes $i = 1$ and $i = n-1$ next to the boundary we use the boundary conditions for $u_0$ and $u_n$ and obtain
$$u_1' - \frac{\kappa}{h^2}\left(u_2 - 2u_1\right) = f_1(t) + \frac{\kappa}{h^2}\,u_L(t)\,, \tag{3.2b}$$
$$u_{n-1}' - \frac{\kappa}{h^2}\left(-2u_{n-1} + u_{n-2}\right) = f_{n-1}(t) + \frac{\kappa}{h^2}\,u_R(t)\,. \tag{3.2c}$$

Introducing the vectors $u(t) = (u_1(t), \ldots, u_{n-1}(t))$ and $f(t) = (f_1(t), \ldots, f_{n-1}(t))$, we can write (3.2) as
$$u' = \mathbf{A}u + b\,, \tag{3.3}$$
using the tridiagonal matrix and vector
$$\mathbf{A} = \frac{\kappa}{h^2}\begin{pmatrix} -2 & 1 & & 0\\ 1 & \ddots & \ddots & \\ & \ddots & \ddots & 1\\ 0 & & 1 & -2 \end{pmatrix} \qquad \text{and} \qquad b = f + \begin{pmatrix} \kappa/h^2\;u_L(t)\\ 0\\ \vdots\\ 0\\ \kappa/h^2\;u_R(t) \end{pmatrix}.$$
The dimension of this ODE system depends on the number $n$ of spatial grid points; in applications one often works with $n = 10^6$ or even more nodes.
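A compact numerical sketch of this semidiscretization, assuming homogeneous boundary values $u_L = u_R = 0$, no source ($f = 0$) and a hypothetical initial profile $u_0(x) = \sin(\pi x)$:

from numpy import linspace, diag, ones, sin, pi
from scipy.integrate import solve_ivp

kappa, n = 1.0, 50
h = 1.0/n
x = linspace(0, 1, n+1)[1:-1]          # interior nodes x_1, ..., x_{n-1}

# tridiagonal matrix A = kappa/h^2 * tridiag(1, -2, 1)
A = kappa/h**2*(diag(-2*ones(n-1)) + diag(ones(n-2), 1) + diag(ones(n-2), -1))

u0 = sin(pi*x)                          # initial temperature profile

Sol = solve_ivp(lambda t, u: A @ u, [0, 0.1], u0)
print(Sol.y[:, -1].max())               # profile decays roughly like exp(-pi^2 t)

For large $n$ one would of course store $\mathbf{A}$ in a sparse format instead of a dense array.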

In this chapter we consider linear ODE systems of the following form:
$$\begin{aligned}
u_1' &= a_{11}(t)u_1 + \cdots + a_{1n}(t)u_n + b_1(t)\,, & u_1(\xi) &= \eta_1\,,\\
&\;\;\vdots & &\\
u_n' &= a_{n1}(t)u_1 + \cdots + a_{nn}(t)u_n + b_n(t)\,, & u_n(\xi) &= \eta_n\,.
\end{aligned} \tag{3.4}$$
Introducing the notations $u = (u_1, \ldots, u_n) : I \to \mathbb{R}^n$, $b = (b_1, \ldots, b_n) : I \to \mathbb{R}^n$, $\mathbf{A} = (a_{ij})_{i,j} : I \to \mathbb{R}^{n\times n}$ and $\eta = (\eta_1, \ldots, \eta_n) \in \mathbb{R}^n$, we can write (3.4) as
$$u' = \mathbf{A}(t)u + b(t)\,, \qquad u(\xi) = \eta\,. \tag{3.5}$$

Definition 3.1. Let $\mathbf{A} \in C^1(I, \mathbb{R}^{n\times n})$ be a continuously diff'able, matrix valued function on an interval $I \subset \mathbb{R}$. We introduce
$$\mathbf{A}'(t) := \left(a_{ij}'(t)\right)_{i,j}\,, \qquad \int \mathbf{A}(t)\,dt := \left(\int a_{ij}(t)\,dt\right)_{i,j}\,, \qquad \operatorname{tr}\mathbf{A}(t) := \sum_{i=1}^n a_{ii}(t) \;\; \text{(trace of the matrix)}\,.$$

Definition 3.2 (Matrix norms). Let $|\cdot|$ be a vector norm in $\mathbb{R}^n$ and $\|\cdot\|$ a matrix norm in $\mathbb{R}^{n\times n} \simeq \mathbb{R}^{n^2}$. The norm $\|\cdot\|$ is called compatible with $|\cdot|$, if
$$\|\mathbf{A}\mathbf{B}\| \le \|\mathbf{A}\|\,\|\mathbf{B}\| \quad \text{(submultiplicative)}\,, \qquad |\mathbf{A}x| \le \|\mathbf{A}\|\,|x| \quad \text{(compatible)}\,.$$

Example 3.2. The max–norm $|x| = |x|_\infty = \max_i |x_i|$ is compatible with the row sum norm $\|\mathbf{A}\|_\infty = \max_i \sum_j |a_{ij}|$.
The $L^1$–norm $|x| = |x|_1 = \sum_i |x_i|$ is compatible with the column sum norm $\|\mathbf{A}\|_1 = \max_j \sum_i |a_{ij}|$.
The Euclidean norm $|x| = |x|_2 = \left(\sum_i |x_i|^2\right)^{1/2}$ is compatible with the spectral norm $\|\mathbf{A}\|_2 = \sqrt{\rho(\mathbf{A}^H\mathbf{A})}$. Here $\mathbf{A}^H = \overline{\mathbf{A}^T}$ denotes the conjugate transposed (Hermitian) matrix and $\rho(\mathbf{A})$ denotes the spectral radius, i.e. the magnitude of the largest eigenvalue.
Theorem 3.1. Let $I \subset \mathbb{R}$ be a compact interval, $\mathbf{A} : I \to \mathbb{R}^{n\times n}$, $b : I \to \mathbb{R}^n$ continuous and $\xi \in I$, $\eta \in \mathbb{R}^n$. Then the IVP
$$u' = \mathbf{A}(t)u + b(t)\,, \qquad u(\xi) = \eta \tag{3.6}$$
has a unique solution $u(t)$ existing on the entire interval $I$.
If the estimates
$$\|\mathbf{A}(t)\|_\infty \le L\,, \quad |b(t)|_\infty \le \delta \quad \forall t \in I \qquad \text{and} \qquad |\eta|_\infty \le \gamma$$
hold, the solution $u(t)$ of (3.6) is bounded by
$$|u(t)|_\infty \le \gamma\, e^{L|t-\xi|} + \frac{\delta}{L}\left(e^{L|t-\xi|} - 1\right)\,. \tag{3.7}$$

Proof. Existence and uniqueness can be shown using Picard–Lindelöf. The estimate (3.7) is left as an exercise.

Remark 3.1. Complex valued systems can be treated similarly. Separating the real and imaginary parts we obtain a real valued system of double the dimension.
Recall: The ODE $u' = \mathbf{A}u$ is called homogeneous and $u' = \mathbf{A}u + b$ is called the inhomogeneous equation.

3.1 Homogeneous Linear ODE Systems

If not stated otherwise, we assume $I \subset \mathbb{R}$ to be an interval, $\xi \in I$, $\mathbf{A} : I \to \mathbb{R}^{n\times n}$ continuous and $\eta \in \mathbb{R}^n$. We consider the IVP
$$u' = \mathbf{A}(t)u\,, \qquad u(\xi) = \eta\,. \tag{3.8}$$
Thanks to the Picard–Lindelöf Theorem 2.3, there exists for every initial value $u(\xi) = \eta \in \mathbb{R}^n$ a unique solution $u(\cdot\,; \eta)$ of the homogeneous IVP (3.8). Let
$$S_H := \left\{u \in C^1(I, \mathbb{R}^n) : u' = \mathbf{A}u\right\}$$
denote the set of all solutions to the homogeneous linear ODE. Now, we can define the mapping $\Phi : \mathbb{R}^n \to S_H$ by $\Phi(\eta) := u(\cdot\,; \eta)$; in other words, $\Phi$ maps an initial value $\eta$ onto the solution $u(\cdot\,; \eta)$ of the homogeneous IVP (3.8). It is immediate to see that this map is a bijection and hence the set $S_H$ can be viewed as a vector space of dimension $n$.
Lemma 3.2. Let $u \in S_H$ and $u(t) = 0$ for some $t \in I$. Then $u \equiv 0$.

Definition 3.3 (Solution matrix and fundamental matrix). A family $u_1(t), \ldots, u_n(t)$ of linearly independent solutions to the linear homogeneous ODE $u' = \mathbf{A}(t)u$ is called a fundamental system. The matrix
$$\mathbf{U}(t) = \left(u_1(t)\,|\,\cdots\,|\,u_n(t)\right) \in \mathbb{R}^{n\times n}$$
composed out of a fundamental system is called a solution matrix of the ODE. It is immediate to see that the solution matrix is not unique.
A particular fundamental system is given by the solutions of the initial value problems $x_i(\xi) = e_i$, i.e. $x_i = \Phi(e_i)$. The according solution matrix $\mathbf{X}(t)$ is also called the fundamental matrix. With the help of the fundamental matrix, the unique solution of (3.8) can be written as
$$u(t) = \mathbf{X}(t)\,\eta\,.$$

Definition 3.4 (Wronski–determinant). Let $\mathbf{U}(t) = (u_1(t)\,|\,\cdots\,|\,u_n(t))$ be a solution matrix. We call its determinant $w(t) = \det\mathbf{U}(t)$ the Wronski–determinant.

Lemma 3.3. The Wronski–determinant satisfies the homogeneous linear ODE
$$w' = \operatorname{tr}\mathbf{A}(t)\,w\,.$$
It holds that either $w \equiv 0$ or $w \neq 0$ on the entire interval $I$. If $w \neq 0$, the columns $u_1(t), \ldots, u_n(t)$ of the solution matrix define a fundamental system.

Proof. See [Wal98, Ch. IV, §15 III].

3.2 Inhomogeneous Systems

Given the inhomogeneous problem
$$u' = \mathbf{A}(t)u + b(t)\,, \qquad u(\xi) = \eta\,, \tag{3.9}$$
we define the set of all solutions to the inhomogeneous ODE $u' = \mathbf{A}(t)u + b(t)$ as
$$S_I := \left\{u \in C^1(I, \mathbb{R}^n) : u' = \mathbf{A}u + b\right\}\,.$$

Proposition 3.4. Let $\bar{u}$ denote an arbitrary solution of the inhomogeneous ODE $u' = \mathbf{A}(t)u + b(t)$. Then $S_I = S_H + \bar{u}$, i.e. for every solution $u$ of the inhomogeneous equation there exists a solution $z$ of the homogeneous problem such that $u = z + \bar{u}$.
(General solution of the inhomogeneous problem = general solution of the homogeneous problem + particular solution.)

To obtain a particular solution of the inhomogeneous problem, we use the variation of constants.
Let $\mathbf{X}(t)$ be the fundamental matrix of the homogeneous equation $x' = \mathbf{A}(t)x$. Then, the general solution of the homogeneous problem equals $z(t) = \mathbf{X}(t)\,c$ for some constant $c \in \mathbb{R}^n$. Analogously to the scalar case (see Section 1.5) we seek the solution of the inhomogeneous problem (3.9) using the ansatz $u(t) = \mathbf{X}(t)\,c(t)$, where the constant $c$ is replaced by some function $c(t)$. It holds that
$$u'(t) = \mathbf{X}'(t)\,c(t) + \mathbf{X}(t)\,c'(t) = \mathbf{A}(t)\,\mathbf{X}(t)\,c(t) + \mathbf{X}(t)\,c'(t) \overset{!}{=} \mathbf{A}(t)\,u(t) + b(t)$$
and hence
$$b(t) = \mathbf{X}(t)\,c'(t)\,.$$
Since $\mathbf{X}(t)$ is a fundamental matrix, we get $\det\mathbf{X}(t) \neq 0$, and
$$c'(t) = \mathbf{X}^{-1}(t)\,b(t)\,.$$
Integration leads to
$$c(t) = c(\xi) + \int_\xi^t \mathbf{X}^{-1}(s)\,b(s)\,ds\,.$$
This shows the following

Theorem 3.5. The unique solution of the inhomogeneous initial value problem (3.9) is given by
$$u(t) = \mathbf{X}(t)\left(\eta + \int_\xi^t \mathbf{X}^{-1}(s)\,b(s)\,ds\right)\,.$$

Remark 3.2. We compare the scalar and the vectorial case (both with constant coefficients):

homogeneous: scalar $x' = ax$, $x(0) = \eta$ with solution $x = e^{at}\eta$; vectorial $x' = \mathbf{A}x$, $x(0) = \eta$ with solution $x = \mathbf{X}(t)\,\eta$.
inhomogeneous: scalar $u' = au + b(t)$, $u(0) = \eta$ with solution $u = e^{at}\left(\eta + \int_0^t e^{-as}b(s)\,ds\right)$; vectorial $u' = \mathbf{A}u + b(t)$, $u(0) = \eta$ with solution $u = \mathbf{X}(t)\left(\eta + \int_0^t \mathbf{X}^{-1}(s)\,b(s)\,ds\right)$.

How to obtain the fundamental matrix $\mathbf{X}(t)$ of a system of homogeneous linear ODEs with constant coefficients $x' = \mathbf{A}x$?
Does there exist an analogue to the scalar case, i.e. to the exponential $e^{at}$?

3.3 The Matrix–Exponential Function

In this section we consider a system of linear ODEs with constant coefficients. The fundamental matrix $\mathbf{X}(t)$ is the solution of the matrix valued initial value problem
$$\mathbf{X}'(t) = \mathbf{A}\,\mathbf{X}(t)\,, \qquad \mathbf{X}(0) = \mathbf{E}\,, \tag{3.10}$$
for $\mathbf{A} \in \mathbb{R}^{n\times n}$ constant. Wlog we assume $\xi = 0$.
According to Picard–Lindelöf (Thm. 2.3) and Banach (Thm. 2.2) we know:

• The solution is unique and exists for all $t \in \mathbb{R}$.
• The fixpoint iteration
$$\mathbf{X}_{k+1} = \mathbf{E} + \int_0^t \mathbf{A}\,\mathbf{X}_k\,ds$$
converges for every initial guess $\mathbf{X}_0 \in \mathbb{R}^{n\times n}$ to the solution $\mathbf{X}(t)$ of (3.10), i.e. to the fundamental matrix.

The $k$-th fixpoint iterate for the initial guess $\mathbf{X}_0 = \mathbf{E}$ reads as
$$\mathbf{X}_k(t) = \mathbf{E} + \mathbf{A}t + \frac{\mathbf{A}^2t^2}{2} + \cdots + \frac{\mathbf{A}^kt^k}{k!}\,.$$
Its scalar analogue
$$x_k(t) = 1 + at + \frac{a^2t^2}{2} + \cdots + \frac{a^kt^k}{k!}$$
converges to $x(t) = e^{at}$. This motivates the following
Definition 3.5 (Matrix–Exponential–Function). The matrix–valued series
$$\sum_{k=0}^{\infty} \frac{\mathbf{A}^k}{k!}$$
converges for all constant matrices $\mathbf{A} \in \mathbb{R}^{n\times n}$ or $\mathbb{C}^{n\times n}$. It defines the so–called matrix–exponential–function
$$\exp(\mathbf{A}) = e^{\mathbf{A}} := \sum_{k=0}^{\infty} \frac{\mathbf{A}^k}{k!} \;\in\; \mathbb{R}^{n\times n} \text{ or } \mathbb{C}^{n\times n}\,.$$

Theorem 3.6. The fundamental matrix of the homogeneous ODE with constant coefficients is given by
$$\mathbf{X}(t) = e^{\mathbf{A}t}\,.$$

Remark 3.3. Based on the analogy to the scalar case, one may assume that the fundamental matrix for a problem with non–constant coefficients
$$\mathbf{X}' = \mathbf{A}(t)\,\mathbf{X}$$
is also given in the form
$$\mathbf{X}(t) = \exp\left[\int_0^t \mathbf{A}(s)\,ds\right]\,.$$
This is in general not true!
However, if $\mathbf{A}(t)$ and $\int \mathbf{A}(s)\,ds$ commute, then the above formula is true.
Example 3.3. The solution of the linear ODE system
$$x' = \mathbf{A}(t)\,x \qquad \text{where } \mathbf{A}(t) = \begin{pmatrix} 1 & 2t\\ 0 & 0 \end{pmatrix}$$
is not given by
$$x(t) = \exp\left(\int_0^t \mathbf{A}(s)\,ds\right)\,.$$
Details of this will be worked out in the tutorials.

Theorem 3.7 (Properties of $e^{\mathbf{A}}$). Analogous to the scalar case it holds that:

(1) $\frac{d}{dt}e^{\mathbf{A}t} = \mathbf{A} \cdot e^{\mathbf{A}t}$.
(2) Let $\mathbf{\Lambda} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ be a diagonal matrix. Then $e^{\mathbf{\Lambda}} = \operatorname{diag}\left(e^{\lambda_1}, \ldots, e^{\lambda_n}\right)$.
(3) If $\mathbf{B}$ and $\mathbf{C}$ commute, i.e. $\mathbf{B}\mathbf{C} = \mathbf{C}\mathbf{B}$, then $e^{\mathbf{B}+\mathbf{C}} = e^{\mathbf{B}}e^{\mathbf{C}} = e^{\mathbf{C}}e^{\mathbf{B}}$.
(4) If $\mathbf{C}$ is invertible, i.e. $\det\mathbf{C} \neq 0$, then $e^{\mathbf{C}\mathbf{B}\mathbf{C}^{-1}} = \mathbf{C}\,e^{\mathbf{B}}\,\mathbf{C}^{-1}$.
(5) $\left(e^{\mathbf{A}}\right)^{-1} = e^{-\mathbf{A}}$.
(6) $e^{\mathbf{A}(s+t)} = e^{\mathbf{A}s}e^{\mathbf{A}t} = e^{\mathbf{A}t}e^{\mathbf{A}s}$.
(7) $e^{\mathbf{A}+\lambda\mathbf{E}} = e^{\mathbf{A}}e^{\lambda} = e^{\lambda}e^{\mathbf{A}}$.

The properties (2), (4) and (7) show an efficient way to compute $e^{\mathbf{A}}$.

Proposition 3.8. Let $\mathbf{A}$ be diagonalizable, i.e. there exists $\mathbf{Q} \in \mathbb{C}^{n\times n}$, $\det\mathbf{Q} \neq 0$, such that $\mathbf{A} = \mathbf{Q} \cdot \operatorname{diag}(\lambda_1, \ldots, \lambda_n) \cdot \mathbf{Q}^{-1}$. Then
$$e^{\mathbf{A}} = \mathbf{Q} \cdot \operatorname{diag}\left(e^{\lambda_1}, \ldots, e^{\lambda_n}\right) \cdot \mathbf{Q}^{-1}\,.$$

What to do, if $\mathbf{A}$ is not diagonalizable? Use the Jordan normal form (Thm. A.3).

Proposition 3.9. Let $\mathbf{J}_k \in \mathbb{C}^{m_k\times m_k}$ be a Jordan block of dimension $m_k$ to the eigenvalue $\lambda_k$. Then
$$e^{\mathbf{J}_k t} = e^{\lambda_k t}\begin{pmatrix} 1 & t & t^2/2 & \cdots & t^{m_k-1}/(m_k-1)!\\ & \ddots & \ddots & \ddots & \vdots\\ & & \ddots & \ddots & t^2/2\\ & & & \ddots & t\\ 0 & & & & 1 \end{pmatrix}\,. \tag{3.11}$$

Proof. As an exercise.

Theorem 3.10. Let $\mathbf{A} = \mathbf{Q} \cdot \operatorname{diag}(\mathbf{J}_1, \ldots, \mathbf{J}_k) \cdot \mathbf{Q}^{-1}$ be the Jordan normal form of $\mathbf{A}$. Then
$$e^{\mathbf{A}t} = \mathbf{Q} \cdot \operatorname{diag}\left(e^{\mathbf{J}_1 t}, \ldots, e^{\mathbf{J}_k t}\right) \cdot \mathbf{Q}^{-1}\,.$$

Proof. As an exercise.

How to do this in a practical example?


Example 3.4. We consider the ODE
$$x' = \mathbf{A}x\,, \qquad \text{where } \mathbf{A} = \begin{pmatrix} 1 & -1\\ 4 & -3 \end{pmatrix}\,.$$
The characteristic polynomial of $\mathbf{A}$ is given by $\chi_{\mathbf{A}}(\lambda) = (\lambda+1)^2$. Hence $\lambda = -1$ is the only eigenvalue, with algebraic multiplicity 2. If $\mathbf{A}$ is diagonalizable, we expect each component of the solution to be of the form $ce^{-t}$. On the other hand, if $\mathbf{A}$ is not diagonalizable, the Jordan form (cf. (3.11)) yields each component to be of the form $e^{-t}$ times a polynomial of degree less or equal to one. Hence, in both cases we seek the solution of the ODE in the form
$$\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} a + bt\\ c + dt \end{pmatrix} e^{-t}\,,$$
where the coefficients $a, b, c$ and $d$ are still to be determined. Inserting this into the ODE yields
$$x' = \begin{pmatrix} -a + b - bt\\ -c + d - dt \end{pmatrix} e^{-t} \overset{!}{=} \mathbf{A}x = \begin{pmatrix} a - c + (b - d)t\\ 4a - 3c + (4b - 3d)t \end{pmatrix} e^{-t}\,.$$
Comparing coefficients yields (up to scalar multiples) the following two cases:

(1) $b = 0$, $d = 0$, $a = 1$, $c = 2$, hence $\begin{pmatrix} x_1\\ y_1 \end{pmatrix} = \begin{pmatrix} 1\\ 2 \end{pmatrix} e^{-t}$.
(2) $b = 1$, $d = 2$, $a = 0$, $c = -1$, hence $\begin{pmatrix} x_2\\ y_2 \end{pmatrix} = \begin{pmatrix} t\\ -1 + 2t \end{pmatrix} e^{-t}$.

The according solution matrix $\mathbf{Y}(t)$ reads as
$$\mathbf{Y}(t) = \begin{pmatrix} 1 & t\\ 2 & -1 + 2t \end{pmatrix} e^{-t}\,, \qquad \mathbf{Y}(0) = \begin{pmatrix} 1 & 0\\ 2 & -1 \end{pmatrix}\,.$$
Column operations (linear combinations of fundamental solutions) finally yield the fundamental matrix
$$\mathbf{X}(t) = \begin{pmatrix} 1 + 2t & -t\\ 4t & 1 - 2t \end{pmatrix} e^{-t}\,, \qquad \mathbf{X}(0) = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} = \mathbf{E}\,.$$
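Since $\mathbf{X}(t) = e^{\mathbf{A}t}$ by Theorem 3.6, the result can be checked against scipy's matrix exponential expm; a quick sketch:

from numpy import array, exp, allclose
from scipy.linalg import expm

A = array([[1., -1.],
           [4., -3.]])

t = 0.7
X = exp(-t)*array([[1 + 2*t, -t],
                   [4*t, 1 - 2*t]])   # fundamental matrix from Example 3.4

print(allclose(expm(A*t), X))          # True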

Corollary 3.11 (to Theorem 3.5). The inhomogeneous system
$$y' = \mathbf{A}y + b(t)\,, \qquad y(\xi) = \eta$$
has the unique solution
$$y = e^{\mathbf{A}(t-\xi)}\eta + \int_\xi^t e^{\mathbf{A}(t-s)}\,b(s)\,ds\,.$$
3.4 Higher Order Differential Equations

The linear $n$–th order ODE
$$u^{(n)} + a_{n-1}(t)u^{(n-1)} + \cdots + a_1(t)u' + a_0(t)u = b(t) \tag{3.12}$$
is equivalent to the system
$$u' = \mathbf{A}(t)\,u + b(t)\,,$$
setting $u = \left(u, u', \ldots, u^{(n-1)}\right)$ and $b(t) = (0, \ldots, 0, b(t))$ as well as the matrix
$$\mathbf{A}(t) = \begin{pmatrix} 0 & 1 & & 0\\ & \ddots & \ddots & \\ & & 0 & 1\\ -a_0(t) & -a_1(t) & \cdots & -a_{n-1}(t) \end{pmatrix}\,.$$

Theorem 3.12. Let $a_k : I \to \mathbb{R}$, $k = 0, \ldots, n-1$, and $b : I \to \mathbb{R}$ be continuous functions. Then the linear ODE (3.12) of order $n$ admits for any initial value
$$u(\xi) = \eta_0\,, \quad u'(\xi) = \eta_1\,, \quad \ldots\,, \quad u^{(n-1)}(\xi) = \eta_{n-1}$$
a unique solution.
The set $S_H := \left\{u \in C^n(I) : u^{(n)} + a_{n-1}(t)u^{(n-1)} + \cdots + a_1(t)u' + a_0(t)u = 0\right\}$ of all solutions to the homogeneous problem is a vector space of dimension $n$.
Let $\bar{u}$ be an arbitrary solution of the inhomogeneous problem (3.12). Then
$$S_I := \left\{u \in C^n(I) : u \text{ satisfies (3.12)}\right\} = \bar{u} + S_H$$
is the affine linear space of all solutions of the inhomogeneous problem.

Let $x_0, \ldots, x_{n-1}$ be a fundamental system to the initial values $x_i^{(k)}(0) = \delta_{ik}$ for $i, k = 0, \ldots, n-1$. The resulting fundamental matrix is given by
$$\mathbf{X}(t) = \begin{pmatrix} x_0(t) & x_1(t) & \cdots & x_{n-1}(t)\\ x_0'(t) & x_1'(t) & & x_{n-1}'(t)\\ \vdots & & & \vdots\\ x_0^{(n-1)}(t) & x_1^{(n-1)}(t) & \cdots & x_{n-1}^{(n-1)}(t) \end{pmatrix}\,.$$
The Wronskian $w(t) = \det\mathbf{X}(t)$ satisfies the ODE
$$w' = -a_{n-1}(t)\,w\,.$$
In the sequel we restrict to the case of constant coefficients, i.e. we assume the matrix $\mathbf{A} \in \mathbb{R}^{n\times n}$ to be constant.
Let $b : I \to \mathbb{R}$ be continuous and $a_0, \ldots, a_{n-1} \in \mathbb{R}$. We consider the ODE
$$u^{(n)} + a_{n-1}u^{(n-1)} + \cdots + a_1u' + a_0u = b(t) \tag{†}$$
or the equivalent system
$$u' = \mathbf{A}\,u + b(t)\,,$$
where
$$\mathbf{A} = \begin{pmatrix} 0 & 1 & & 0\\ & \ddots & \ddots & \\ 0 & & 0 & 1\\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{pmatrix}\,.$$
The characteristic polynomial $\chi_{\mathbf{A}}(\lambda) = \det(\mathbf{A} - \lambda\mathbf{E})$ is also called the characteristic polynomial of the ODE (†).

Lemma 3.13. The characteristic polynomial of the ODE (†) is given by
$$\chi_{\mathbf{A}}(\lambda) = (-1)^n\left(\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0\right)\,.$$

Proof. Left as an exercise.

Theorem 3.14. Let $\lambda_1, \ldots, \lambda_m$ be the roots of the characteristic polynomial of the ODE (†) with multiplicities $k_1, \ldots, k_m$. Then the functions
$$\left\{e^{\lambda_1 t}, te^{\lambda_1 t}, \ldots, t^{k_1-1}e^{\lambda_1 t};\; \ldots;\; e^{\lambda_m t}, te^{\lambda_m t}, \ldots, t^{k_m-1}e^{\lambda_m t}\right\}$$
constitute a fundamental system of (†).

Remark 3.4. There might appear complex roots $\lambda = \alpha + i\omega$. Since all the coefficients $a_0, \ldots, a_{n-1}$ are assumed to be real, the roots appear in pairs with their complex conjugate $\bar{\lambda} = \alpha - i\omega$. A real–valued fundamental system can be obtained replacing the pairs of complex conjugate solutions
$$\left\{e^{\lambda t}, te^{\lambda t}, \ldots, t^{k-1}e^{\lambda t};\; e^{\bar{\lambda}t}, te^{\bar{\lambda}t}, \ldots, t^{k-1}e^{\bar{\lambda}t}\right\}$$
by their real counterparts
$$e^{\alpha t} \cdot \left\{\cos(\omega t), t\cos(\omega t), \ldots, t^{k-1}\cos(\omega t);\; \sin(\omega t), t\sin(\omega t), \ldots, t^{k-1}\sin(\omega t)\right\}\,.$$



Example 3.5 (Linear second order ODE with constant coefficients). The ODE
$$u'' + 2au' + bu = 0 \tag{3.13}$$
can be used to model a damped oscillator.
A mass $m > 0$ is attached to a spring with spring constant $k > 0$; $z(t)$ denotes the deviation from equilibrium:
$$mz'' = -kz\,.$$
According to Stokes' law, the viscous friction in the fluid equals $F_R = 6\pi\eta r v$, where $\eta$ is the viscosity of the fluid, $r$ denotes the radius and $v = z'$ the velocity. Hence we obtain
$$mz'' = -\beta z' - kz\,,$$
using $\beta = 6\pi\eta r > 0$.

[Figure 3.1: Sketch of a damped oscillator.]

The characteristic polynomial of ODE (3.13) is given by $\lambda^2 + 2a\lambda + b$. Its roots are
$$\lambda_{1,2} = -a \pm \sqrt{a^2 - b}\,, \qquad \delta^2 = a^2 - b\,.$$
Based on the discriminant $\delta$ we distinguish three cases:

(1) The overdamped case $\delta^2 > 0$. Here we have two real roots $\lambda_1 = -a + \delta$ and $\lambda_2 = -a - \delta$. The fundamental system reads as $u_1 = e^{\lambda_1 t}$ and $u_2 = e^{\lambda_2 t}$. The general solution is therefore given by
$$u = \alpha e^{\lambda_1 t} + \beta e^{\lambda_2 t}\,,$$
where $\alpha, \beta \in \mathbb{R}$.

(2) The critically damped case $\delta^2 = 0$ with one root $\lambda = -a$ of multiplicity two. The fundamental system equals $u_1 = e^{\lambda t}$ and $u_2 = te^{\lambda t}$ and the general solution is given by
$$u = \alpha e^{\lambda t} + \beta te^{\lambda t}\,,$$
where $\alpha, \beta \in \mathbb{R}$.

(3) The oscillatory case for $\delta^2 < 0$. In this situation we encounter two complex conjugate roots $\lambda, \bar{\lambda} = -a \pm i|\delta|$. The complex–valued fundamental system
$$u_1 = e^{\lambda t} = e^{-at + i|\delta|t} \qquad \text{and} \qquad u_2 = e^{\bar{\lambda}t} = e^{-at - i|\delta|t}$$
leads to the following real–valued fundamental system
$$\tilde{u}_1 = (u_1 + u_2)/2 = e^{-at}\cos(|\delta|t)\,, \qquad \tilde{u}_2 = (u_1 - u_2)/(2i) = e^{-at}\sin(|\delta|t)$$
and the general solution ($\alpha, \beta \in \mathbb{R}$)
$$u = \alpha e^{-at}\cos(|\delta|t) + \beta e^{-at}\sin(|\delta|t)\,.$$
In the physically relevant case $a > 0$, this solution describes a damped oscillation.

The following Figure 3.2 shows the solutions using $b = 1$, $u(0) = 2$, $u'(0) = 0$ and damping parameter $a = \sqrt{2}$ (overdamped), $a = 1$ (critically damped) and $a = 1/4$ (damped oscillation). The left hand side shows the solution $u(t)$; the right hand side depicts the trajectories in the $uu'$–phase space.

[Figure 3.2: Solution trajectories (left) and phase portrait (right) of oscillator (3.13). Overdamped ($a = \sqrt{2}$; dashed), critically damped ($a = 1$; dash–dotted) and oscillating ($a = 1/4$; solid line) case.]
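A sketch reproducing the left panel numerically, converting (3.13) into the first–order system $(u, u')$:

from numpy import linspace, sqrt
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

b = 1.0
tt = linspace(0, 20, 500)

for a, style in [(sqrt(2), '--'), (1.0, '-.'), (0.25, '-')]:
    rhs = lambda t, z: [z[1], -2*a*z[1] - b*z[0]]   # (u, u')' = (u', -2au' - bu)
    Sol = solve_ivp(rhs, [0, 20], [2.0, 0.0], t_eval=tt)
    plt.plot(Sol.t, Sol.y[0, :], style, label=f'a = {a:.2f}')

plt.xlabel('Time $t$')
plt.ylabel('$u$')
plt.legend()
plt.grid()
plt.show()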
Chapter 4

Numerical Methods

In the sequel we consider the following initial value problem for a first order ODE
$$u' = f(t, u)\,, \qquad u(t_0) = u_0 \tag{4.1}$$
on an interval $t \in I = [t_0, T]$. The right hand side $f : \Omega \to \mathbb{R}^n$ is assumed to be smooth, i.e. continuously differentiable up to the needed orders and hence in particular locally Lipschitz–continuous on $\Omega \subset I \times \mathbb{R}^n$.

Example 4.1. For $a \in \mathbb{R}$, the solution of the IVP
$$u' = au\,, \qquad u(0) = 1$$
is given by $u(t) = \exp(at)$. Using python, the following code snippet solves this IVP on the interval $t \in [0, 3]$.

from numpy import *
from scipy.integrate import solve_ivp

def f(t, u, a):
    return a*u

a = 1.5
Sol = solve_ivp(f, [0, 3], [1], args=(a,))
t = Sol.t
u = Sol.y[0, :]
print(t)
print(u)

When running this code, we observe that the $t$–values are strangely spaced inside the interval $[0, 3]$. For different values of the parameter $a$, the solver solve_ivp even chooses different time points for computing the solution. Why?

Definition 4.1 (Evolution). Consider a differential equation $u' = f(t, u)$ with Lipschitz–continuous right hand side $f : \Omega \to \mathbb{R}^n$. For an initial value $u(t_0) = u_0$ we denote the according unique solution $u(t)$ as
$$u(t) = \Phi^{t,t_0}u_0\,.$$
The operator $\Phi^{t,t_0} : \mathbb{R}^n \to \mathbb{R}^n$ maps the initial value $u_0 \in \mathbb{R}^n$ at time $t_0$ onto the solution $u(t) \in \mathbb{R}^n$ evaluated at time $t$. The two–parameter family of maps $\Phi^{\cdot,\cdot}$ is called the evolution of the differential equation.

Lemma 4.1. Let $\Phi$ be the evolution of a differential equation. Then for all initial values $(t_0, u_0) \in \Omega$ the following holds:

(1) $\Phi^{t_0,t_0}u_0 = u_0$,
(2) $\Phi^{t,s}\,\Phi^{s,t_0}u_0 = \Phi^{t,t_0}u_0$ for all $t, s \in I$.

Proof. As an exercise.

Example 4.2 (Euler method (idea 1768)). To determine the evolution $\Phi^{t,t_0}u_0$ of a differential equation, we use an approximation of the derivative
$$f(t, u(t)) = u'(t) \simeq \frac{u(t+h) - u(t)}{h}$$
for $h$ small. This yields the following linear approximation of $u(t+h)$:
$$u(t+h) = u(t) + hf(t, u(t))\,.$$
Iterating this, we can construct an approximate solution $u_k \simeq u(t_k)$ at discrete time points $t_k = t_0 + kh$, $k \in \mathbb{N}$, by
$$u_{k+1} = u_k + hf(t_k, u_k)\,.$$
Graphically, we may interpret this approximate solution as a piecewise linear curve in the directional field of the differential equation. This piecewise linear curve is also known as the Euler polygon.

Python Example 4.1 (Euler-method).

from numpy import *

def euler(f, t, u0):
    # f : f(t, u), t scalar, u vector; rhs of the ODE
    # t : vector of discrete time points
    # u0: vector of initial values
    # returns U: solution as matrix, rows = components

    n = len(t)
    m = len(u0)
    U = zeros((m, n))
    U[:, 0] = u0

    for k in range(n - 1):
        h = t[k+1] - t[k]
        U[:, k+1] = U[:, k] + h * f(t[k], U[:, k])

    return U
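A hypothetical usage example (not from the notes): since euler handles vector-valued u, we can apply it to the harmonic oscillator u″ = −u written as a first order system:

from numpy import array, linspace, cos

t = linspace(0, 10, 201)
f = lambda t, u: array([u[1], -u[0]])    # (u, u')' = (u', -u)
U = euler(f, t, [1.0, 0.0])
print(U[0, -1], cos(10))                 # Euler drifts in amplitude; compare with cos(10)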

The approximation introduced above is an example of an explicit one-step method, since the new value u_{k+1} can be computed explicitly from the previous value u_k; no system of equations needs to be solved.
Alternatively, we may write the ODE as an integral equation

u(t) = u(t0) + ∫_{t0}^{t} f(s, u(s)) ds

and try to evaluate the integral over an interval [t0 , t0 + h] by a suitable numerical
quadrature.
The left-sided rectangle rule

∫_{t0}^{t0+h} f(s, u(s)) ds ≈ h · f(t0, u(t0))

leads to

u(t0 + h) ≈ u(t0) + h f(t0, u(t0)) .

We recover the previous (explicit) Euler–method.


The right-sided rectangle rule

∫_{t0}^{t0+h} f(s, u(s)) ds ≈ h · f(t0 + h, u(t0 + h))

leads to the so-called implicit Euler-method

u(t0 + h) ≈ u(t0) + h f(t0 + h, u(t0 + h)) .



Here the new value u(t0 + h) appears on both sides of the equation. Hence it is only given implicitly, and in general we need to solve a non-linear system for u(t0 + h).
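As an illustration (a minimal sketch, not from the notes; the function name implicit_euler_step is ours), one implicit Euler step can be realized by handing the defining nonlinear equation to a root finder such as scipy.optimize.fsolve:

from scipy.optimize import fsolve

def implicit_euler_step(f, t, u, h):
    # one implicit Euler step: solve v = u + h*f(t+h, v) for v
    G = lambda v: v - u - h * f(t + h, v)
    return fsolve(G, u + h * f(t, u))     # explicit Euler step as initial guess

# one step of size h = 0.1 for the stiff test problem u' = -50 u, u(0) = 1
print(implicit_euler_step(lambda t, u: -50.0 * u, 0.0, 1.0, 0.1))  # ~ 1/6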
The midpoint rule

∫_{t0}^{t0+h} f(s, u(s)) ds ≈ h · f(t0 + h/2, u(t0 + h/2))

leads to

u(t0 + h) ≈ u(t0) + h · f(t0 + h/2, u(t0 + h/2)) .

Replacing the yet unknown value u(t0 + h/2) by the Euler approximation u0 + (h/2) f(t0, u0), we obtain the Runge- or improved Euler-method

u(t0 + h) ≈ u(t0) + h · f( t0 + h/2, u(t0) + (h/2) f(t0, u(t0)) ) .

4.1 Convergence Theory

For the sake of simplicity, we are just considering scalar ODEs in this section.
All the results also apply for systems of ODEs, replacing the scalar values of u
by appropriate vectors.
Definition 4.2 (Grid). Consider an interval [t0, T] ⊂ R and a subdivision, called grid, T_h = {t_i : i = 0, ..., n} ⊂ [t0, T] where t0 < t1 < ··· < t_n = T. We define the step size h_i = t_{i+1} − t_i for i = 0, ..., n − 1 and the step size vector h = (h_0, ..., h_{n−1}). The maximal step size is denoted by h_max = |h| = max_i h_i. If the step sizes are constant, i.e. h_i = h for all i, we call the grid equidistant.
A second grid S = {s_j : j = 0 ... m} is called a refinement of T_h, if T_h ⊂ S and h_max(S) < h_max(T_h).
Let u : [t0, T] → R be a function. We call the vector

u_{h,i} = u_i = u(t_i) ∈ R^{n+1}

the grid function associated to u. To compare a function y : [t0, T] → R to a grid function u_h ∈ R^{n+1} we use

‖y − u_h‖ := max_{i=0...n} |y(t_i) − u_{h,i}| .

For the sake of shorter notation, we often denote the grid function u_h also by u, if it is clear that we are speaking about a grid function and if confusion with the function itself can be excluded.

Definition 4.3 (Increment, numerical evolution). A numerical method to solve the IVP (4.1) is called a one-step method, if there exists an increment (function) ψ : [t0, T] × R × R+ → R such that

u_{j+1} = u_j + h_j ψ(t_j, u_j; h_j)

for j = 0, ..., n − 1 and u_0 = u(t0).
We call Ψ^{t_{j+1},t_j} u_j := u_j + h_j ψ(t_j, u_j; h_j) the numerical evolution of the method.

Example 4.3. The Euler method is a one-step method with increment

ψ(t, u, h) = f(t, u) .

The Runge method uses the increment

ψ(t, u, h) = f( t + h/2, u + (h/2) f(t, u) ) .

For a given IVP (4.1) with (exact) evolution Φ we will try to construct a numerical
evolution Ψ which is as close as possible to the exact evolution Φ. This leads us
to the following

Definition 4.4 (Consistency). Let u(t) denote the exact solution of the IVP. We call

δ_h(t, u) = (1/h) ( Φ^{t+h,t} u(t) − Ψ^{t+h,t} u(t) ) = (u(t + h) − u(t))/h − ψ(t, u(t), h)

the local discretization error or consistency error of the numerical evolution Ψ. We call the numerical evolution Ψ, or its defining method, consistent, if

max_{t∈[t0,T]} |δ_h(t, u)| −→ 0

for step size h → 0. We call the numerical evolution consistent of order p ∈ N, if

δ_h(t, u) = O(h^p) .

Remark 4.1. Some authors, cf. Deuflhard [DB02], define the local discretization error alternatively by δ_h(t, u) = Φ^{t+h,t} u(t) − Ψ^{t+h,t} u(t). However, then the condition for consistency order p reads as δ_h(t, u) = O(h^{p+1}).

Lemma 4.2. The following are equivalent:

(1) The numerical evolution Ψ is consistent.



(2) The increment satisfies ψ(t, u, 0) = f (t, u).

Proof. As an exercise.
Theorem 4.3. The explicit Euler–method is consistent of order p = 1.

Proof. We start from the definition of the local discretization error

δ_h = (u(t + h) − u(t))/h − ψ(t, u, h) .

Taylor expansion yields

δ_h = u′(t) + (h/2) u″(t) + O(h^2) − ψ(t, u, h) .

Plugging in the increment ψ(t, u, h) = f(t, u) of the Euler method and using the ODE u′ = f(t, u), we obtain

δ_h = (h/2) u″(t) + O(h^2) = O(h) .

Definition 4.5 (Convergence). Let v(t) = Φ^{t,t0} u0 denote the exact solution of the IVP (4.1) and let u_h be a numerical solution. We call

e_h := ‖v − u_h‖_∞ = max_{k=1...n} |v(t_k) − u_{h,k}|

the global error of the numerical method on the grid T_h.
We call the method convergent, if e_h −→ 0 for step size h → 0. We call the method convergent of order p ∈ N, if

e_h = O(h^p) .

The following convergence theorem requires a discrete version of the Gronwall


Lemma as a tool.
Lemma 4.4 (Gronwall, discrete version). Let (e_n)_{n∈N}, (p_n)_{n∈N} and (q_n)_{n∈N} be non-negative sequences. Assume that

e_{n+1} ≤ (1 + q_n) e_n + p_n

holds for all n. Then the elements of the sequence (e_n) can be estimated by

e_n ≤ ( e_0 + Σ_{j=0}^{n−1} p_j ) · exp( Σ_{j=0}^{n−1} q_j ) .

Proof. As an exercise.
Theorem 4.5 (Convergence of one–step methods). Consider the IVP (4.1) and
its solution u(t) = Φt,t0 u0 . Furthermore let Ψ be the numerical evolution of a
one–step method with increment ψ. Assume that

(1) The increment ψ is continuous on the set


G = {(t, u, h) : t0 ≤ t ≤ T, |u − u(t)| ≤ γ and 0 ≤ h ≤ h0 }
for some γ, h0 > 0. This rather technical condition ensures, that the incre-
ment is well defined at least in a neighborhood of the solution trajectory.
(2) There exists a constant M > 0, such that
|ψ(t, u, h) − ψ(t, v, h)| ≤ M |u − v|
for all (t, u, h), (t, v, h) ∈ G, i.e. the increment is Lipschitz–continuous
w.r.t. u. This condition is also called stability of the method.
(3) The method is consistent or even consistent of order p, i.e.
|δh (t, u)| −→ 0 or δh (t, u) = O(hp ) .

Then the method is convergent or even convergent of order p


eh −→ 0 or eh = O(hp ) .
In short: Consistency + Stability =⇒ Convergence.

Proof. We proceed in several steps.

(1) First (a rather technical step) we define an extension of the increment

ψ̃(t, u, h) = ψ(t, u, h) for (t, u, h) ∈ G ,
ψ̃(t, u, h) = ψ(t, u(t) ± γ, h) for |u − u(t)| > γ .

Then the extended increment function ψ̃ is continuous not only on G but even on the larger set G̃ = {(t, u, h) : t0 ≤ t ≤ T, u ∈ R and 0 ≤ h ≤ h0} and it holds that

|ψ̃(t, u, h) − ψ̃(t, v, h)| ≤ M |u − v| . (∗)

Using ψ̃(t, u(t), h) = ψ(t, u(t), h) we obtain from consistency

δ_h(t, u) = (u(t + h) − u(t))/h − ψ̃(t, u(t), h) −→ 0

or, if the method is even consistent of order p, then there exists K1 such that

|δ_h(t, u)| ≤ K1 |h|^p .

(2) Next, we consider the numerical solution ũ_h generated by the extended increment ψ̃ on a grid T_h with step size |h| < h0

ũ_0 := u(t0) and ũ_{i+1} = ũ_i + h_i ψ̃(t_i, ũ_i, h_i) .

The exact solution u satisfies on this grid the equation

u(t_{i+1}) = u(t_i) + h_i · (u(t_{i+1}) − u(t_i))/h_i .

Now the error ẽ_i = ũ_i − u(t_i) satisfies

ẽ_{i+1} = ẽ_i + h_i [ ψ̃(t_i, ũ_i, h_i) − (u(t_{i+1}) − u(t_i))/h_i ]
        = ẽ_i + h_i [ ψ̃(t_i, ũ_i, h_i) − ψ̃(t_i, u(t_i), h_i) ]
          + h_i [ ψ̃(t_i, u(t_i), h_i) − (u(t_{i+1}) − u(t_i))/h_i ] .

Thanks to the previous result (∗) it holds that

|ψ̃(t_i, ũ_i, h_i) − ψ̃(t_i, u(t_i), h_i)| ≤ M |ũ_i − u(t_i)| = M |ẽ_i|

and consistency yields

|ψ̃(t_i, u(t_i), h_i) − (u(t_{i+1}) − u(t_i))/h_i| = |δ_{h_i}(t_i, u(t_i))| .

Hence we obtain the following recursion for the error

|ẽ_{i+1}| ≤ (1 + h_i M) |ẽ_i| + h_i |δ_{h_i}(t_i, u(t_i))|

as well as ẽ_0 = 0. The discrete Gronwall lemma 4.4 shows that

|ẽ_i| ≤ Σ_{j=0}^{i−1} h_j |δ_{h_j}(t_j, u(t_j))| · exp( M Σ_{j=0}^{i−1} h_j )
     ≤ |t_i − t0| · max_{j=0...i−1} |δ_{h_j}(t_j, u(t_j))| · e^{M(t_i − t0)}
     ≤ |T − t0| · max_{j=0...i−1} |δ_{h_j}(t_j, u(t_j))| · e^{M(T − t0)} .

(3) Since we assume the method to be consistent, i.e. max δ_{h_j} −→ 0 for h → 0, there exists h1, 0 < h1 ≤ h0, such that |ẽ_i| ≤ γ for all i and all step sizes |h| < h1.
Now we can choose a grid T_h with step size |h| < h1 and consider the numerical solution on that grid. Then we can skip the extension of the increment, i.e. ψ = ψ̃, and we finally obtain

e_h ≤ |T − t0| · max_{t∈[t0,T]} |δ_h(t, u)| · e^{M|T − t0|} .

If the method is just consistent, i.e. δ_h → 0, or if we even have consistency of order p, i.e. δ_h = O(h^p), we get

e_h −→ 0 or e_h = O(h^p) .

Since we know that Euler's method is consistent of order p = 1, we can even prove that it is convergent of order p = 1. Is this the end of the story about numerical methods for ordinary differential equations? Or why should one strive for higher order methods?
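Before answering, the claimed first order convergence can be checked empirically (a sketch, not from the notes, reusing the function euler from Python Example 4.1): halving the step size for u′ = u, u(0) = 1 should roughly halve the global error.

from numpy import linspace, exp, log2, abs

f = lambda t, u: u
errors = []
for n in (10, 20, 40, 80):
    t = linspace(0, 1, n + 1)
    U = euler(f, t, [1.0])
    errors.append(max(abs(U[0, :] - exp(t))))

for k in range(1, len(errors)):
    # observed order ~ log2(e_h / e_{h/2}) ~ 1 for the explicit Euler method
    print(log2(errors[k-1] / errors[k]))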
Let us consider the influence of numerical round–off errors:
Consider an equidistant grid Th and a method with increment ψ. Let u denote
the numerical solution without round–off errors. Besides that, let U denote the
numerical solution including round-off errors. Then it holds that

U_0 = u(t0) + ρ_0 and U_{j+1} = U_j + h ψ(t_j, U_j, h) + ρ_j ,

where ρ_j denotes the round-off error in the j-th step. The global error E_j = U_j − u(t_j) satisfies the following recursive estimate, which can be obtained analogously to the above proof

|E_{j+1}| ≤ |E_j| + h (M |E_j| + δ_h) + ρ_j
         ≤ |E_j| + h (M |E_j| + max δ_h) + ρ ,

where ρ = max_j ρ_j. The discrete Gronwall lemma now shows that

|E_j| ≤ [ ρ_0 + Σ_{i=0}^{j−1} (h · max δ_h + ρ) ] e^{M(T−t0)}
      ≤ [ ρ + (T − t0) · max δ_h + ((T − t0)/h) · ρ ] e^{M(T−t0)}
      ≃ O( δ + ρ/h ) .
Hence there exists an optimal step size h_opt for which the influence of the local discretization error balances with the round-off errors. For smaller step sizes h < h_opt the local discretization error gets smaller, but the round-off errors start to accumulate, since we have to do many steps. For larger step sizes h > h_opt, the round-off error is less important, but the local discretization error gets dominant. If the method is consistent of order p, then the overall error will be of order h^p + ρ/h. Minimizing this overall error, we obtain the optimal step size to be of order

h_opt ≃ (ρ/p)^{1/(p+1)} .

For double precision arithmetic with ρ ≈ 10^{−16}, this gives h_opt ≈ 10^{−8} for a first order method, but already h_opt ≈ 10^{−4} for p = 3. The larger the order p of the method, the larger the optimal step size will be. Hence we aim to construct higher order methods.

4.2 Runge–Kutta Methods

We wish to construct a second order consistent method. We start from the integral formulation

u(t + h) = u(t) + ∫_t^{t+h} f(s, u(s)) ds

and try to construct the increment ψ such that

u(t + h) = u(t) + h ψ(t, u, h) + O(h^3) .

To approximate the integral up to second order, we use the midpoint rule

∫_t^{t+h} f(s, u(s)) ds = h f( t + h/2, u(t + h/2) ) + O(h^3) .

Replacing the unknown function value u(t + h/2) again by its integral form, we obtain

∫_t^{t+h} f(s, u(s)) ds = h f( t + h/2, u(t) + ∫_t^{t+h/2} f(s, u(s)) ds ) + O(h^3) .

Now we have to approximate the inner integral maintaining the overall approximation order. It is sufficient to approximate the inner integral up to first order, i.e. the rectangle rule is sufficient. This yields

∫_t^{t+h} f(s, u(s)) ds = h f( t + h/2, u(t) + (h/2) f(t, u(t)) + O(h^2) ) + O(h^3)
                        = h f( t + h/2, u(t) + (h/2) f(t, u(t)) ) + O(h^3) .


This is the Runge method (1895), which can be written as a two-stage method

k1(t, u, h) = f(t, u)                      (stage 1)
k2(t, u, h) = f(t + h/2, u + (h/2) k1)     (stage 2)
ψ(t, u, h) = k2 ,  update u_{j+1} = u_j + h (0 · k1 + 1 · k2) .

The Runge method is an example of an explicit two-stage Runge-Kutta method.
As a generalization, we consider an s-stage method

ψ = Σ_{i=1}^{s} b_i k_i

where the stage functions k_i, i = 1, ..., s are iteratively defined as

k_i = f( t + c_i h, u + h Σ_{j=1}^{i−1} a_{ij} k_j ), i = 1 ... s .

The coefficients c = (c1, ..., cs) ∈ R^s, b = (b1, ..., bs) ∈ R^s and the matrix A = (a_{ij}) ∈ R^{s×s} are collected in the Butcher array (b, c, A)

c | A
--+---
  | b

Due to the construction, the matrix A is a strictly lower triangular matrix, i.e. a_{ij} = 0 for j ≥ i.

Example 4.4 (Euler-method as a one-stage Runge-Kutta method).

0 | 0
--+---
  | 1

Number of stages s = 1. Order of consistency p = 1.

Example 4.5 (Runge-method; improved Euler).

0   | 0   0
1/2 | 1/2 0
----+--------
    | 0   1

Number of stages s = 2. Order of consistency p = 2.

Python Example 4.2 (2-stage Runge-method).

from numpy import *

def runge(f, t, u0):
    # 2-stage Runge method
    # f : rhs function: R x R^m -> R^m
    # t : vector of discrete time points
    # u0: R^m vector of initial values
    # returns U: U[:, j] solution at time point t[j]

    n = len(t)
    m = len(u0)
    U = zeros((m, n))
    U[:, 0] = u0

    for j in range(n - 1):
        h = t[j+1] - t[j]
        k1 = f(t[j], U[:, j])
        k2 = f(t[j] + h/2, U[:, j] + h/2 * k1)
        U[:, j+1] = U[:, j] + h * k2

    return U

Example 4.6. Two methods of consistency order p = 3:

Heun method with s = 3 stages

0   | 0
1/3 | 1/3 0
2/3 | 0   2/3 0
----+-------------
    | 1/4 0   3/4

Runge-method with s = 4 stages

0   | 0
1/2 | 1/2 0
1   | 0   1   0
1   | 0   0   1   0
----+----------------
    | 1/6 2/3 0   1/6

Example 4.7 (Classical Runge-Kutta method).

0   | 0
1/2 | 1/2 0
1/2 | 0   1/2 0
1   | 0   0   1   0
----+----------------
    | 1/6 2/6 2/6 1/6

Number of stages s = 4. Order of consistency p = 4.
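All explicit RK-methods above fit one generic driver. The following sketch (not from the notes; rk_step is our name) performs a single step for an arbitrary explicit Butcher array (b, c, A), shown with the classical RK4 coefficients:

from numpy import array, zeros

def rk_step(f, t, u, h, A, b, c):
    # one explicit RK step: k_i = f(t + c_i h, u + h * sum_j a_ij k_j)
    s = len(b)
    K = zeros((s, len(u)))
    for i in range(s):
        K[i] = f(t + c[i]*h, u + h * (A[i, :i] @ K[:i]))
    return u + h * (b @ K)

# Butcher array of the classical RK4 method (Example 4.7)
A = array([[0,0,0,0], [1/2,0,0,0], [0,1/2,0,0], [0,0,1,0]])
b = array([1/6, 2/6, 2/6, 1/6])
c = array([0, 1/2, 1/2, 1])

print(rk_step(lambda t, u: u, 0.0, array([1.0]), 0.1, A, b, c))
# approx exp(0.1) = 1.10517...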

Besides these methods, there exist many more, see e.g. Dormand-Prince.
How can these Runge-Kutta methods be constructed systematically? Here are some ideas and remarks.

Lemma 4.6 (Consistency). The Runge-Kutta method (b, c, A) is consistent, if and only if

Σ_{i=1}^{s} b_i = 1 .

Idea of the proof. Considering the numerical evolution u(t + h) = u(t) + hψ with increment function ψ = Σ b_i k_i, we obtain in the limit h → 0 that each stage function satisfies k_i → f. Hence ψ → (Σ b_i) f = f, if and only if Σ b_i = 1.

Lemma 4.7 (Stage number s vs. consistency order p). The consistency order p of an s-stage Runge-Kutta method satisfies

p ≤ s .

Idea of the proof. To see this, we consider the IVP u′ = u, u(0) = 1. The solution after one step of size h satisfies

Φ^{h,0} 1 = e^h = 1 + h + h^2/2 + ··· + h^p/p! + O(h^{p+1}) .

The i-th stage k_i of the RK-method is a polynomial of degree deg k_i ≤ i − 1 in h. Hence hψ is a polynomial of degree ≤ s in h. If the method is consistent of order p for the above IVP, then 1 + hψ approximates the exponential up to an error O(h^{p+1}); a polynomial of degree ≤ s can only match the Taylor polynomial up to degree s, hence p ≤ s.

The following table due to Butcher (1963, 1965, 1985) contains the minimal number of stages s necessary for achieving consistency order p

p     | 1 2 3 4 5 6 7  8  ≥ 9
s_min | 1 2 3 4 6 7 9 11  ≥ p + 3

Remark 4.2 (Autonomization). Given a non-autonomous IVP u′ = f(t, u), u(t0) = u0 for u ∈ R^d. Introducing the extended variable y = (t, u) ∈ R^{d+1}, we can re-write the problem as an autonomous one

y′ = g(y) := ( 1, f(t, u) ) , y(0) = y_0 := ( t_0, u_0 ) .

This trick is called autonomization.
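In code, autonomization is a thin wrapper (a sketch, not from the notes; autonomize is our name):

from numpy import concatenate, array

def autonomize(f):
    # y = (t, u); y' = (1, f(t, u))
    return lambda y: concatenate(([1.0], f(y[0], y[1:])))

g = autonomize(lambda t, u: -t * u)   # u' = -t*u becomes autonomous
print(g(array([0.5, 2.0])))           # -> [ 1. -1.]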



Does the same trick hold for a Runge-Kutta method?
Consider the stage functions of an RK-method for the original problem

k_i = f( t + c_i h, u + h Σ_{j=1}^{i−1} a_{ij} k_j )

and the numerical evolution after autonomization

(t + h, u(t + h)) = (t, u) + h ψ̂ , where ψ̂ = Σ_{i=1}^{s} b_i (θ_i, k̂_i) ,

introducing the extended stage functions

k̂_i = f( t + h Σ_{j=1}^{i−1} a_{ij} θ_j , u + h Σ_{j=1}^{i−1} a_{ij} k̂_j ) ,
θ_i = 1 .

Invariance under autonomization requires k_i = k̂_i, hence

Σ_{j=1}^{i−1} a_{ij} = c_i .

Lemma 4.8. Consider a Runge-Kutta method (b, c, A).
If Σ_i b_i = 1, then the method is consistent.
If Σ_j a_{ij} = c_i, then the method is invariant under autonomization.
Example 4.8 (Constructing a two-stage method of second order). We consider a general two-stage Runge-Kutta method with increment

ψ = b1 k1 + b2 k2

and the two stages

k1 = f(t, u) ,
k2 = f(t + c2 h, u + h a21 k1) = f(t + c2 h, u + h a21 f(t, u)) .

A Taylor expansion of k2 yields

k2 = f + h [ c2 ∂_t f + a21 f · ∂_u f ]
   + h^2 [ (c2^2/2) ∂_tt f + (a21^2/2) f^2 · ∂_uu f + c2 a21 f · ∂_tu f ] + O(h^3)

and hence for the increment

ψ = (b1 + b2) f + h [ b2 c2 ∂_t f + b2 a21 f · ∂_u f ]
  + (h^2/2) [ b2 c2^2 ∂_tt f + b2 a21^2 f^2 · ∂_uu f + 2 b2 c2 a21 f · ∂_tu f ] + O(h^3) . (†)

To compute the consistency error, we still need an expansion of

(u(t + h) − u(t))/h = f + (h/2) [ ∂_t f + f · ∂_u f ]
  + (h^2/6) [ ∂_tt f + 2 f · ∂_tu f + f^2 · ∂_uu f + (∂_u f)^2 · f + ∂_u f · ∂_t f ] + O(h^3) . (∗)

The consistency error

δ = (u(t + h) − u(t))/h − ψ

can be obtained by comparing the two expansions (†) and (∗). This yields the conditions

B To cancel h^0: b1 + b2 = 1.

B To cancel h^1: b2 c2 = 1/2 and b2 a21 = 1/2.

B Canceling h^2: not possible!

Invariance under autonomization yields the additional constraint c2 = a21. Hence all RK-methods with s = 2 stages and of consistency order p = 2 can be written in the form

(b1, b2, c2, a21) = ( (2λ − 1)/(2λ), 1/(2λ), λ, λ ) .

For λ = 1 we obtain the method

0 | 0   0
1 | 1   0
--+--------
  | 1/2 1/2

and for λ = 1/2 we recover the original Runge-method

0   | 0   0
1/2 | 1/2 0
----+--------
    | 0   1

Remark 4.3. The following table (also due to Butcher) contains the number of conditions that have to be satisfied when constructing an RK-method of order p.

p            | 1 2 3 4 5  6  7  8   9   10
# conditions | 1 2 4 8 17 37 85 200 486 1205

This illustrates that constructing high-order RK-methods is a non-trivial task.



4.3 Extrapolation Methods


Does there exist an alternative and even simpler approach to construct high–order
methods?
The idea of Richardson extrapolation provides a principle to construct methods
with both variable and arbitrary order.
Consider a function d : R+ → R. Assume we can compute values d(h_n) for a zero sequence h_n → 0. Based on these values, we extrapolate the limit lim_{h→0} d(h) =: d(0).

Example 4.9 (Richardson extrapolation). Consider a smooth function f; we want to compute its derivative f′(x0) at some point x0 with high accuracy.

(1) Choose a step size h and consider the centered difference approximation

d_f(h) := ( f(x0 + h) − f(x0 − h) ) / (2h) .

Using Taylor, we obtain the following expansion of the error

e_f(h) := |f′(x0) − d_f(h)| ≤ (h^2/3!) |f^{(3)}(x0)| + (h^4/5!) |f^{(5)}(x0)| + ...

The error admits a power series expansion in even powers of h and the original approximation d_f(h) is of second order. How to do better?

(2) Consider a step size sequence h0 > h1 > ··· > hk > 0.

(3) Construct the interpolating polynomial D for the data points (h_i, d_f(h_i)).

(4) Evaluate the interpolating polynomial at h = 0, i.e. D(0).
Remark 4.4. The last two steps can be done effectively using the Neville-Aitken scheme

d(h0)     =: D_{0,0}
d(h1)     =: D_{1,0} → D_{1,1}
  ...
d(h_{k−1}) =: D_{k−1,0} → ··· → D_{k−1,k−1}
d(h_k)    =: D_{k,0} → ··· → D_{k,k−1} → D_{k,k} ≈ D(0)

Here

D_{j,l} = D_{j,l−1} + (1/μ_{j,l}) ( D_{j,l−1} − D_{j−1,l−1} ) , μ_{j,l} = (h_{j−l}/h_j)^r − 1 ,

where r = 2 is due to the expansion of the error in even powers of h.
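The scheme is a few lines of python (a sketch, not from the notes; neville_zero is our name). Here it extrapolates the centered difference from Example 4.9 for f = exp at x0 = 0, with the raw values d(h_j) in column zero, so the divisor uses h_{j−l}:

from numpy import exp, zeros

def neville_zero(hs, ds, r=2):
    # extrapolate the values ds[j] = d(hs[j]) to h = 0 (Neville-Aitken)
    k = len(hs)
    D = zeros((k, k))
    D[:, 0] = ds
    for l in range(1, k):
        for j in range(l, k):
            mu = (hs[j-l] / hs[j])**r - 1.0
            D[j, l] = D[j, l-1] + (D[j, l-1] - D[j-1, l-1]) / mu
    return D[k-1, k-1]

f = exp
hs = [0.8, 0.4, 0.2, 0.1]
ds = [(f(h) - f(-h)) / (2*h) for h in hs]
print(neville_zero(hs, ds))   # close to f'(0) = 1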

Theorem 4.9 (Richardson extrapolation). Consider a function d : R+ → R and assume the error term e_f(h) = |d(h) − lim_{h→0} d(h)| admits an expansion in powers of h^r

e_f(h) = a1 h^r + a2 h^{2r} + ··· + a_m h^{mr} + O(h^{(m+1)r}) .

Then the coefficients of the Neville-Aitken scheme satisfy

e_{jl} := D_{jl} − lim_{h→0} d(h) = b_{jl} h^r_{j−l} ··· h^r_j = O(h^{(l+1)r}) .

Proof. See lecture on Numerics.

Now we transfer this idea to the numerical solution of ODEs. Our goal is to compute the solution u(t + h) = Φ^{t+h,t} u(t) after one base step of step size h up to a desired order q.
We choose a base method ψ of order p and a sequence F = (n0, n1, ...) with n_k → ∞, e.g. the Romberg sequence n_k = 2^k or the harmonic sequence n_k = k + 1. We define the step sizes h_k = h/n_k and compute u_{h_k}(t + h) using the base method with local step size h_k. Next, we apply the Neville-Aitken scheme to obtain an extrapolation of u(t + h) up to order h^{p+k}.
What is a suitable base scheme?
The following table compares the number of function evaluations necessary to reach order q of the extrapolation scheme using a base scheme of order p.

        q = 1   2   3   4   5   6   7   8
p = 1       1   2   4   7  11  16  22  29
p = 2           2   5  10  17  26  37
p = 3               3   8  16  27  41
p = 4                   4  11  22  37
p = 5                       6  17  34
p = 6                           7  20
p = 7                               9  26

This table shows that the explicit Euler method with p = 1 is sub-optimal, but certainly the simplest choice.
Algorithm 4.1 (Extrapolation method).
Input: u(t), a base step size h, a sequence F = (n0, n1, ...) and the number K of extrapolation steps.
Output: u(t + h) up to order 1 + K.

For j = 0 ... K do

    h_j := h/n_j ,
    u^{(j)}_0 := u(t) ,
    u^{(j)}_{l+1} := u^{(j)}_l + h_j f( t + l h_j, u^{(j)}_l ), l = 0 ... n_j − 1 ,
    U_{j,0} := u^{(j)}_{n_j} .

Next, apply Neville-Aitken to extrapolate. For l = 1 ... K do

    U_{j,l} := U_{j,l−1} + ( U_{j,l−1} − U_{j−1,l−1} ) / ( n_j/n_{j−l} − 1 ), for j = K, ..., l .

Set u(t + h) := U_{K,K}.

Remark 4.5. Note that in the Neville-Aitken scheme we have to use r = 1, since the Euler method is consistent of first order, i.e. the error has an expansion in powers of h^1.
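A direct transcription of Algorithm 4.1 might look as follows (a sketch, not from the notes; extrapolation_step is our name), using the harmonic sequence n_j = j + 1 and r = 1, cf. Remark 4.5:

from numpy import array, zeros

def extrapolation_step(f, t, u, h, K):
    # one base step of size h, extrapolated to order 1 + K (Euler base method)
    n = [j + 1 for j in range(K + 1)]             # harmonic sequence
    U = zeros((K + 1, K + 1, len(u)))
    for j in range(K + 1):
        hj = h / n[j]
        v = array(u, dtype=float)
        for l in range(n[j]):                     # n_j explicit Euler steps
            v = v + hj * f(t + l * hj, v)
        U[j, 0] = v
    for l in range(1, K + 1):                     # Neville-Aitken with r = 1
        for j in range(K, l - 1, -1):
            U[j, l] = U[j, l-1] + (U[j, l-1] - U[j-1, l-1]) / (n[j] / n[j-l] - 1)
    return U[K, K]

print(extrapolation_step(lambda t, u: u, 0.0, array([1.0]), 0.5, 3))
# close to exp(0.5) = 1.64872...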

How to choose the number K of extrapolation steps?
Assume we wish to achieve an overall accuracy δ of the numerical solution, i.e. the numerical solution U_{KK} of the above extrapolation method shall approximate the exact solution u(t + h) = Φ^{t+h,t} u(t) up to this error

|u(t + h) − U_{KK}| ≤ δ .

The following idea is due to Deuflhard and Hohmann [DH03, Chapter 9.5]: we use

ε_{k,k−1} := |U_{k,k−1} − U_{kk}|

as an estimator for the error

e_{k,k−1} := |U_{k,k−1} − u(t + h)| .

Why? It holds that e_{jl} ≪ e_{j,l−1}, since U_{jl} is of higher order. Due to

ε_{k,k−1} = |(U_{k,k−1} − u(t + h)) − (U_{kk} − u(t + h))| ≤ e_{k,k−1} + e_{kk}

and thanks to the inverse triangle inequality |x − y| ≥ ||x| − |y||

ε_{k,k−1} ≥ e_{k,k−1} − e_{kk} ,

we obtain

e_{k,k−1} − e_{kk} ≤ ε_{k,k−1} ≤ e_{k,k−1} + e_{kk} ,

( 1 − e_{kk}/e_{k,k−1} ) e_{k,k−1} ≤ ε_{k,k−1} ≤ ( 1 + e_{kk}/e_{k,k−1} ) e_{k,k−1}

with e_{kk}/e_{k,k−1} ≪ 1, and therefore

ε_{k,k−1} ≈ e_{k,k−1} .

If the estimator εk,k−1 is small enough, i.e. εk,k−1 ≤ ρ · δ for some safety factor
ρ ∼ 1/4, 1/5, then we accept the numerical solution Ukk as approximation of the
exact solution u(t + h). Otherwise, we add an additional row to the Neville–
Aitken–scheme, i.e. we use an additional extrapolation step to increase the order
and re–check the accuracy afterwards.

4.4 Adaptive Step Size Selection

In general we cannot choose the step size of a numerical method a priori but we
need an adaptive choice of the step size. The step size shall be small to allow
accurate computation, if the solution undergoes large and rapid changes. On the
other hand the step size can be large to allow fast computation, if the solution
does not change much. However, we do not know the solution in advance, hence
we cannot compute the needed step sizes in advance; we need adaptivity. To
construct an adaptive step size method, we need the following ingredients

(1) An error estimator indicating, where to change the step size.

(2) A rule how to change the step size.

(3) An efficient method to do this.

As an error estimator ε we use the difference between two numerical solutions u and û of different order. Let u be the approximation using a method ψ of order p, i.e.

δ = O(hp ) = c(tj ) · hp + O(hp+1 )

and û be the approximation using a second method ψ̂ of order p + 1

δ̂ = O(hp+1 ) .

Since

Φ^{t+h,t} u(t) = u(t) + hψ + hδ = u + hδ = û + hδ̂ ,

it holds that

ε = û − u = h(δ − δ̂) = c h^{p+1} + O(h^{p+2}) ≈ c h^{p+1} .

The constant c is unknown.

How to determine a suitable step size? Our goal is to determine an optimal step size h∗ such that the error ε equals a desired tolerance τ

τ = c (h∗)^{p+1} .

For a given step size h, the error estimator yields

ε = chp+1 .

Combining the above two equations, we can get rid of the unknown constant c and obtain the optimal step size

h∗ = h · (τ/ε)^{1/(p+1)} .

To be "on the safe side", we use instead

h∗ = min( h_max, q h, h (ρτ/ε)^{1/(p+1)} ) . (∗)

Here h_max > 0 denotes an upper bound for the step size, q > 1 is a factor limiting the maximal increase of the step size (q ∼ 5) and 0 < ρ < 1 denotes a safety factor (ρ ∼ 1/2, 3/4).

If, for a given step size h, the error estimator ε is smaller than the tolerance τ, then (∗) shows how much we can increase the step size without violating the accuracy constraint. If ε > τ, then (∗) shows how much we have to decrease the step size to fulfill the accuracy constraint.
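A minimal adaptive driver built on the rule (∗) could look as follows (a sketch, not from the notes; it pairs the explicit Euler method, p = 1, with the Runge method instead of an embedded pair, at the cost of one extra function evaluation per step):

from numpy import abs

def adaptive_euler(f, t0, T, u0, tol, h0=0.1, hmax=1.0, q=5.0, rho=0.5):
    t, u, h = t0, u0, h0
    ts, us = [t0], [u0]
    p = 1                                       # order of the low order method
    while t < T:
        h = min(h, T - t)
        k1 = f(t, u)
        low = u + h * k1                        # explicit Euler
        high = u + h * f(t + h/2, u + h/2 * k1) # Runge method, order 2
        eps = abs(high - low)                   # error estimator
        if eps <= tol:                          # accept the step
            t, u = t + h, high
            ts.append(t); us.append(u)
        # step size rule (*)
        h = min(hmax, q * h, h * (rho * tol / max(eps, 1e-16))**(1/(p+1)))
    return ts, us

ts, us = adaptive_euler(lambda t, u: -2*t*u, 0.0, 2.0, 1.0, 1e-4)
print(len(ts), us[-1])    # compare with the exact value exp(-4) = 0.01832...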

In order to get an efficient computation, we must be able to obtain the lower order
auxiliary solution u with almost no additional effort compared to computing the
high–order solution û. The embedded Runge–Kutta methods due to Fehlberg
and Dormand/Prince are designed for this purpose. They use two Runge–
Kutta methods with s and s + 1 stages that share the same intermediate stages,
but differ just in the weighting coefficients b for the increment.

Example 4.10 (Embedded RK-method of order 4 and 3: RK4(3)).

0   | 0
1/2 | 1/2 0
1/2 | 0   1/2 0                (classical RK4)
1   | 0   0   1   0
1   | 1/6 1/3 1/3 1/6 0
----+------------------------
    | 1/6 1/3 1/3 1/6 0        order p = 4
    | 1/6 1/3 1/3 0   1/6      order p = 3

The standard method for solving ODEs is an s = 6-stage RK4(5) method due to Dormand and Prince; its coefficients can be found in [DB02, Chapter 5.4]. This method is also pre-implemented as the default solver in the function solve_ivp in the python module scipy.integrate. If higher accuracy is required, there also exists a Runge-Kutta method of order 8 by Dormand and Prince. To use this one, we refer to the manual pages of scipy.integrate.solve_ivp.
Chapter 5

Stability

Differential equations are often used to model technological, physical, chemical or biological processes. Parameters appearing in those equations are often obtained by measurements and are subject to inaccuracies and errors. It is desirable that the mathematical model, i.e. the differential equations, reacts insensitively to parameter perturbations. To be more precise, we expect that a well-posed problem shares the following features

existence The problem shall have at least one solution.

uniqueness The problem shall have at most one solution.

continuous dependence The solution shall depend continuously on the data. A small change in the data results in a small change of the solution.

We have touched these issues already in chapter 2. The upper and lower functions introduced in chapter 2.5 serve as an important tool.
In general we distinguish between the local behavior (e.g. on a compact interval I) of the solution and the long-term behavior on unbounded intervals. First, we will analyze the situation on compact (i.e. bounded) intervals.

5.1 Continuous Dependence on Initial Data

We consider an IVP of the form

u′ = f(t, u), u(ξ) = η , (5.1)

on a compact interval I ⊂ R.


Theorem 5.1. Let f be defined on D ⊂ I × Rn and assume it to be Lipschitz-continuous

|f(t, u1) − f(t, u2)| ≤ L |u1 − u2| , (L)

for some L ≥ 0. Let u be the solution of (5.1) where graph u ⊂ D. Let z be an approximate solution of (5.1) such that

|z(ξ) − u(ξ)| ≤ γ, |z′ − f(t, z)| ≤ δ , (5.2)

for some constants γ, δ ≥ 0. Then the estimate

|u(t) − z(t)| ≤ γ e^{L|t−ξ|} + (δ/L) ( e^{L|t−ξ|} − 1 ) (5.3)

holds for all t ∈ I.

Proof; Sketch. We consider w := u − z, where |w(ξ)| = |u(ξ) − z(ξ)| ≤ γ and

w′ = u′ − z′ = f(t, u) − f(t, z) + f(t, z) − z′ ,
|w′| ≤ L |w| + δ ,
|w(t)| ≤ γ e^{L|t−ξ|} + (δ/L) ( e^{L|t−ξ|} − 1 ) .

The above theorem contains even more information, summarized by the following

Theorem 5.2 (Continuous dependence). Let u = u(t) be the solution of (5.1). For α > 0 let

S_α := {(t, y) : t ∈ I, |y − u(t)| ≤ α}

denote the α-neighborhood of the solution trajectory. Assume that there exists some α0 > 0 such that f satisfies the Lipschitz condition (L) on S_{α0}. In particular, f is assumed to be continuous on S_{α0}.
Then the solution u depends continuously on both the initial value η and the right hand side f. In mathematical terms: for all ε > 0 there exists some δ > 0 such that all solutions z of the perturbed IVP

z′ = g(t, z), z(ξ) = η̃ (5.4)

where the right hand side g is continuous on S_{α0} and

|g(t, y) − f(t, y)| ≤ δ in S_{α0}, |η̃ − η| ≤ δ (5.5)

exist on the entire interval I and the estimate

|z(t) − u(t)| ≤ ε (5.6)

holds for all t ∈ I.

Proof; Sketch. Let z satisfy (5.4) and (5.5). As long as z is contained in S_{α0}, the estimate (5.2) holds for γ = δ. Choosing γ = δ small enough, we can guarantee that the right hand side of (5.3) is bounded by α0/2. As long as z is inside S_{α0}, we therefore have |u − z| ≤ α0/2; hence the solution trajectory z cannot leave S_{α0}.
Now let ε > 0. Choosing γ = δ in (5.2) and (5.3) small enough, we can bound the right hand side of (5.3) by ε.

In applications one frequently encounters the situation that the right hand side f and/or the initial value u0 of an IVP

u′ = f(t, u; λ), u(t0) = u0(λ)

depend on a parameter λ ∈ R^p. The dependence of the solution u(t; λ) on this parameter is called the sensitivity of the solution w.r.t. λ.

Definition 5.1 (Sensitivity). Let I ⊂ R be a compact interval and t0 ∈ I. Let f : I × Rn × R^p → Rn and u0 : R^p → Rn be twice continuously differentiable. We consider the IVP

u′ = f(t, u; λ), u(t0) = u0(λ)

depending on the parameter λ = (λ1, ..., λp) ∈ R^p with the unique solution u(t; λ). We call

S_k(t) := (d/dλ_k) u(t; λ), k = 1, ..., p

the sensitivity of the solution u(t; λ) with respect to the parameter λ_k.

How to compute those sensitivities?


Theorem 5.3 (Sensitivity). The sensitivity S_k(t) satisfies the linear IVP

S_k′(t) = (∂f/∂u) · S_k + ∂f/∂λ_k , S_k(t0) = ∂u0/∂λ_k .

Proof; Sketch. Differentiating S_k(t) = (d/dλ_k) u(t; λ) with respect to t and applying Schwarz's theorem to interchange the order of differentiation, we obtain

S_k′(t) = (d/dt)(d/dλ_k) u(t; λ) = (d/dλ_k)(d/dt) u(t; λ) = (d/dλ_k) f(t, u(t; λ); λ)
        = (∂f/∂u) · (du/dλ_k) + ∂f/∂λ_k = (∂f/∂u) · S_k(t) + ∂f/∂λ_k

together with the initial condition

S_k(t0) = (∂/∂λ_k) u(t0) = (∂/∂λ_k) u0(λ) .

Example 5.1. We consider the initial value problem

u′ = αu =: f(t, u; α), u(0) = u0(β) = β

depending on the two-dimensional parameter λ = (α, β) ∈ R^2. The solution is given by

u(t; λ) = β e^{αt}

and the two sensitivities are easily computed

S_α(t) = (d/dα) u(t; λ) = βt e^{αt} ,
S_β(t) = (d/dβ) u(t; λ) = e^{αt} .

The sensitivities satisfy the IVPs

S_α′(t) = (∂f/∂u) · S_α + ∂f/∂α = α S_α + u = α S_α + β e^{αt} , S_α(0) = ∂u0/∂α = 0 ,
S_β′(t) = (∂f/∂u) · S_β + ∂f/∂β = α S_β , S_β(0) = ∂u0/∂β = 1 .

Solving those two IVPs we recover the analytically computed sensitivities.
Example 5.2. We consider the logistic model

u′ = u(α − u), u(0) = β (5.7a)

depending again on two parameters α, β. We expect that the sensitivity of u(t; α, β) for small t is large w.r.t. β but small w.r.t. α, and that for large times t it is the other way round. The sensitivities themselves satisfy the IVPs

S_α′ = (α − 2u) · S_α + u, S_α(0) = 0 (5.7b)
S_β′ = (α − 2u) · S_β , S_β(0) = 1 (5.7c)

Now we can solve the coupled three-dimensional system (5.7) numerically using any method we like, to obtain simultaneously the solution u as well as the two sensitivities S_α and S_β. Figure 5.1 shows the solution of the logistic equation together with the two sensitivities. The graph shows that the sensitivity S_β for the initial value β is large for small times t and then decreases. The parameter α, being the terminal value of the solution u, has a high sensitivity only for large times t.

Figure 5.1: Logistic model u′ = u(α − u), u(0) = β, and its sensitivities S_α, S_β.
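The coupled system (5.7) can be handed directly to solve_ivp (a sketch, not from the notes; the parameter values α = 1, β = 0.1 are illustrative, the notes do not state which values were used for Figure 5.1):

from numpy import linspace
from scipy.integrate import solve_ivp

def rhs(t, y, alpha):
    # y = (u, S_alpha, S_beta); logistic model plus sensitivity equations (5.7)
    u, Sa, Sb = y
    return [u * (alpha - u),
            (alpha - 2*u) * Sa + u,
            (alpha - 2*u) * Sb]

alpha, beta = 1.0, 0.1            # illustrative parameter values
sol = solve_ivp(rhs, [0, 6], [beta, 0.0, 1.0], args=(alpha,),
                t_eval=linspace(0, 6, 61))
u, Sa, Sb = sol.y
print(Sb[0], Sb[-1])              # S_beta decays from 1 towards 0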

In the next section we consider stability issues on unbounded intervals. Without loss of generality we consider I = [0, ∞).

5.2 Asymptotic Behavior of Solutions

Example 5.3. We consider the linear ODE

u′ = λu, u(0) = 1 ,

with the solution u = e^{λt} on all of R. Perturbing the initial value, i.e.

z(0) = 1 + ε ,

the difference of the corresponding solutions is given by

(z − u)(t) = (1 + ε) e^{λt} − e^{λt} = ε e^{λt} .

B For λ > 0 this difference grows exponentially without bound for arbitrarily small ε > 0, lim_{t→∞} (z − u)(t) = ∞.

B For λ = 0 the difference remains constant = ε.

B For λ < 0 the difference decreases exponentially to zero, lim_{t→∞} (z − u)(t) = 0.

Definition 5.2 (Lyapunov Stability). Let u be a solution of u′ = f(t, u) on 0 ≤ t < ∞. Let f be continuous on S_α := {(t, y) : 0 ≤ t < ∞, |y − u(t)| < α} for some α > 0. The solution u is called

stable (in the sense of Lyapunov), if for all ε > 0 there exists δ > 0, such that all solutions z with |z(0) − u(0)| < δ exist for all t ≥ 0 and satisfy

|z(t) − u(t)| < ε for all t > 0 .

asymptotically stable, if it is stable and there exists δ > 0 such that for all solutions z with |z(0) − u(0)| < δ it holds that

lim_{t→∞} |z(t) − u(t)| = 0 .

unstable, if it is not stable.



First, we consider linear systems

u′ = A(t)u + b(t) , (5.8)

where A(t) and b(t) are assumed to be continuous on I = [0, ∞). For any initial value u(0), the solution of (5.8) exists on all of I.

Proposition 5.4. If the zero solution of the homogeneous problem u′ = A(t)u is stable, asymptotically stable or unstable, then every solution of the inhomogeneous problem (5.8) has the same property.

A further characterization of stability of arbitrary linear systems is —up to my knowledge— not available, analogous to the question of constructing a fundamental system. Hence we restrict ourselves to systems with constant coefficients

u′ = Au . (5.9)

Then the following theorem holds.
Then the following theorem holds
Theorem 5.5 (Stability of linear systems). Let γ = max {Re λ : λ ∈ σ(A)} be
the largest real part in the spectrum of A. For the trivial solution u ≡ 0 of the
homogeneous ODE (5.9) it holds that

(1) If γ < 0, it is asymptotically stable,


(2) If γ > 0, it is unstable,
(3) If γ = 0, it is not asymptotically stable, but stable, if and only if all eigen-
values λ with Re λ = 0 have the same algebraic and geometric multiplicity.

Recall: The algebraic multiplicity of an eigenvalue equals its multiplicity as a root of the characteristic polynomial. The geometric multiplicity equals the number of linearly independent eigenvectors to the eigenvalue.

Proof; Sketch. The cases γ > 0 and γ < 0 are easy to see when recalling the fundamental system in its exponential form e^{At}.
Let γ = 0 and assume there exists an eigenvalue λ whose geometric multiplicity is strictly less than its algebraic multiplicity. Then the Jordan normal form contains a Jordan block of dimension m > 1. This Jordan block corresponds in the fundamental system to an unstable solution of the form

e^{λt} ( 1 + t + t^2/2 + ··· + t^{m−1}/(m−1)! ) .

If all eigenvalues with Re λ = 0, i.e. λ = iω, have equal algebraic and geometric multiplicities, the corresponding Jordan blocks are all of dimension 1 and the corresponding fundamental solutions are given by e^{λt} = e^{iωt} = cos ωt + i sin ωt. Those solutions are stable but not asymptotically stable.

Moreover, the following estimate holds

Theorem 5.6. Assume the eigenvalues λ of a matrix A ∈ R^{n×n} satisfy

β < Re λ < α .

Then there exists a norm ‖·‖ on C^n such that

e^{βt} ‖x‖ ≤ ‖e^{At} x‖ ≤ e^{αt} ‖x‖

for all t ≥ 0 and all x ∈ C^n. Hence

e^{βt} ≤ ‖e^{At}‖ ≤ e^{αt} .

Proof. See [Wal98].

Example 5.4 (Two-dimensional systems). Consider

u′ = (x′, y′) = A (x, y) = Au .

The matrix A has one of the following real normal forms

B R(λ, μ) = ( λ 0 ; 0 μ ) with two real eigenvalues.

B R_a(λ) = ( λ 1 ; 0 λ ) with one eigenvalue of simple geometric multiplicity.

B K(α, ω) = ( α ω ; −ω α ) with a pair of conjugate eigenvalues λ = α ± iω.

The following graphs depict the phase portraits of the asymptotically stable cases

[Phase portraits: R(λ, μ) for λ, μ < 0 with λ ≠ μ; R_a(λ) for λ < 0; K(α, ω) for α < 0.]

and of the unstable cases

[Phase portraits: R(λ, μ) for λ < 0 < μ; K(α, ω) for α > 0.]

In the nonlinear case we need an additional tool; the following

Theorem 5.7 (Gronwall lemma). Let I = [0, a[ ⊂ R and Φ : I → R be continuous. Assume that there exist α, β > 0 such that

Φ(t) ≤ α + β ∫_0^t Φ(τ) dτ (5.10)

in I. Then

Φ(t) ≤ α e^{βt}

holds for all t ∈ I.

Proof. We define ψ = α + β ∫_0^t Φ(τ) dτ. Then ψ′ = βΦ and due to Φ ≤ ψ we get ψ′ ≤ βψ and ψ(0) = α. Let w be the solution of w′ = βw, w(0) = α; then w is an upper function to ψ and hence Φ ≤ ψ ≤ w = α e^{βt}.

The next theorem renders information on the stability of ODEs with linear principal part

u′ = Au + g(t, u) . (5.11)

Here we assume that for small u the function g(t, u) is small compared to u.

Theorem 5.8 (Stability). Let α > 0. Let g : R × Rn → Rn, (t, z) ↦ g(t, z) be continuous for t ≥ 0, |z| ≤ α and assume that

lim_{|z|→0} |g(t, z)| / |z| = 0

holds uniformly in t; in particular g(t, 0) = 0. Assume that the matrix A ∈ R^{n×n} is constant and its spectrum is contained in the left half plane, i.e.

Re λ < 0 for all λ ∈ σ(A) .

Then u ≡ 0 is an asymptotically stable solution of (5.11).



Proof. There exist β > 0 and c ≥ 1 such that

‖e^{At}‖ ≤ c e^{−βt} for all t ≥ 0 . (†)

Moreover, there exists δ < α such that

|g(t, z)| ≤ (β/(2c)) |z| for all |z| < δ . (‡)

We will show: |u(0)| ≤ ε ≤ δ/c implies |u| ≤ cε e^{−βt/2}.
Side remark: The solution of the inhomogeneous equation u′ = Au + b(t) is given by u(t) = e^{At} u0 + ∫_0^t e^{A(t−s)} b(s) ds. Analogously, the solution of the equation u′ = Au + g(t, u) can be written implicitly as

u(t) = e^{At} u0 + ∫_0^t e^{A(t−s)} g(s, u(s)) ds .

Using (†) and (‡) we obtain

|u(t)| ≤ |u0| c e^{−βt} + ∫_0^t c e^{−β(t−s)} (β/(2c)) |u(s)| ds

for |u| ≤ δ. Let u be the solution of (5.11) for |u0| < ε and define Φ = |u| e^{βt}. Then |u| ≤ δ leads to

Φ(t) ≤ cε + (β/2) ∫_0^t Φ(s) ds .

Applying Gronwall yields Φ(t) ≤ cε e^{βt/2} and hence |u(t)| ≤ cε e^{−βt/2} ≤ δ.

The above theorem is typically applied when analyzing the stability of autonomous systems

u′ = f(u) .

Definition 5.3 (Stationary point). Let f : Rn → Rn be continuously diff'able and consider the autonomous system u′ = f(u). We call u∗ ∈ Rn a singular or stationary point or equilibrium of the system, if f(u∗) = 0.

Remark 5.1. Let f : Rn → Rn be continuously diff'able and let u∗ be a stationary point of u′ = f(u). Then u(t) ≡ u∗ is the unique solution of the IVP

u′ = f(u), u(0) = u∗ .

To analyze the stability of u∗, we expand f in its Taylor series around u∗ and obtain

f(u) = f(u∗) + Df(u∗)(u − u∗) + g(u − u∗) ,

where f(u∗) = 0 and g(u) = O(|u − u∗|^2). This linearization leads to the following ODE with linear principal part

(u − u∗)′ = Df(u∗)(u − u∗) + g(u − u∗) .
Applying Thm. 5.8 leads to

Theorem 5.9 (Nonlinear stability). Let f : Rn → Rn be continuously diff'able and let u∗ be a stationary point of u′ = f(u). Let A = Df(u∗) ∈ R^{n×n} denote the Jacobian of f at u∗. Then the stationary solution u∗ is

unstable, if there exists λ ∈ σ(A) such that Re λ > 0,

asymptotically stable, if for all λ ∈ σ(A) it holds that Re λ < 0.

Example 5.5 (Logistic growth). We consider the logistic equation

u′ = u(1 − u) = f(u)

with the two equilibria u0 = 0 and u1 = 1. The derivative of the right hand side is given by Df(u) = 1 − 2u. For the stability of these two we get

(1) The trivial equilibrium u0 = 0 is unstable, since Df(u0) = 1 > 0.

(2) The equilibrium u1 = 1 is asymptotically stable, since Df(u1) = −1 < 0.

Remark 5.2 (Remark to Thm. 5.8). If Re λ ≤ 0 for all eigenvalues λ ∈ σ(A) and Re λ_k = 0 for at least one eigenvalue λ_k ∈ σ(A), then the stability of the equilibrium of the linear problem u′ = Au does not imply the stability of the equilibrium of the nonlinear problem u′ = Au + g(t, u). This is illustrated by the next example.

Example 5.6. We analyze the stability of the stationary solution u ≡ 0 for the problem

u′ = βu^3 =: f(u) ,

where β ∈ R is an arbitrary parameter. Linearizing around u = 0 leads to the trivial ODE u′ = 0 with the constant and stable solution u(t) = u0. This may lead us to assume that u = 0 is also a stable equilibrium for the nonlinear problem. But: consider the exact solution of the ODE obtained by separation of variables

u(t) = u0 / √(1 − 2u0^2 βt) .

This shows that the analogy to the linearization is not valid. It holds that

B β > 0 : lim_{t→1/(2u0^2 β)} |u(t)| = +∞ , unstable,
B β = 0 : u(t) ≡ u0 , stable,
B β < 0 : lim_{t→∞} u(t) = 0 , asymptotically stable.

The stability (but not asymptotic stability) of the linearization does not imply the stability of the nonlinear problem.

Example 5.7 (Damped mathematical pendulum). We consider the second order ODE

u″ + ku′ + sin u = 0 ,

for k > 0. Re-writing this as a system of first order

(x, y)′ = f(x, y) = ( y, −ky − sin x )

we obtain two families of stationary points (0 + 2nπ, 0) and (π + 2nπ, 0) for n ∈ Z.

(1) At (0 + 2nπ, 0) the Jacobian is given by

Df(0 + 2nπ, 0) = ( 0 1 ; −1 −k ) ,

the eigenvalues equal λ_{1,2} = −k/2 ± √(k^2/4 − 1). Here Re λ_{1,2} < 0 and hence the stationary solution u ≡ 0 is asymptotically stable.

(2) At (π + 2nπ, 0) the Jacobian is given by

Df(π + 2nπ, 0) = ( 0 1 ; 1 −k ) ,

the eigenvalues equal λ_{1,2} = −k/2 ± √(k^2/4 + 1). Here Re λ_1 > 0 and Re λ_2 < 0. Hence the stationary solution u ≡ π is unstable.
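The eigenvalue analysis is easy to reproduce numerically (a sketch, not from the notes), e.g. for the damping k = 0.5:

from numpy import array
from numpy.linalg import eigvals

k = 0.5
# Jacobians of the pendulum system at the two stationary points
print(eigvals(array([[0, 1], [-1, -k]])))  # both real parts < 0: stable
print(eigvals(array([[0, 1], [ 1, -k]])))  # one positive eigenvalue: unstable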

Example 5.8 (Predator-prey model with Allee effect). We consider the modified predator-prey model

x′ = x(x − ξ)(1 − x) − αxy
y′ = y(x − η)

where 0 < ξ, η < 1 and α > 0. This model has four equilibria

(x, y) = (0, 0), (x, y) = (ξ, 0), (x, y) = (1, 0), (x, y) = (η, y∗)

where y∗ = (1/α)(η − ξ)(1 − η). To analyze their stability, we derive the Jacobian

Df(x, y) = ( (x − ξ)(1 − x) + x(1 − x) − x(x − ξ) − αy , −αx ; y , x − η )

and hence

Df(0, 0) = ( −ξ 0 ; 0 −η ) : asymptotically stable,

Df(ξ, 0) = ( ξ(1 − ξ) −αξ ; 0 ξ − η ) : unstable, since ξ(1 − ξ) > 0,

Df(1, 0) = ( ξ − 1 −α ; 0 1 − η ) : unstable, since 1 − η > 0,

Df(η, y∗) = ( η(1 + ξ − 2η) −αη ; y∗ 0 ) .

The analysis of the stability of the non-trivial (co-existence) equilibrium (η, y∗) is left as an exercise.

5.3 Lyapunov Functions


Example 5.9 (Damped nonlinear oscillator). We consider the equation

u″ + ku′ + p(u) = 0, where k ≥ 0 ,

describing a nonlinear oscillator. In case of k = 0 we neglect damping (friction). In the sequel we assume that p(u) > 0 for u > 0.
The energy of the system is given by

E(u, u′) = u′^2/2 + P(u), where P(u) = ∫_0^{u(t)} p(v) dv ≥ 0 .

This energy functional has the following properties

B E(u, u′) ≥ 0,
B E(u, u′) = 0 iff u ≡ 0.

These properties are somewhat similar to what we expect from a norm. But what happens to the energy when we follow it along a solution trajectory?

(d/dt) E(u, u′) = u′ (u″ + p(u)) = u′ (−ku′) = −k u′^2 ≤ 0 .

Hence, in case of

no damping, i.e. k = 0: we obtain (d/dt) E(u, u′) = 0 along trajectories. The energy is constant along solution trajectories.

damping, i.e. k > 0: then (d/dt) E(u, u′) < 0 along trajectories. Hence the energy is decreasing along solutions.

Can this energy help to decide whether a solution is stable or not? To answer this question, we consider a real autonomous system of the form

u′ = f(u) , (5.12)

where D ⊂ Rn is open, 0 ∈ D and f ∈ C(D) is assumed to be Lipschitz-continuous. Moreover let f(0) = 0. Then u ≡ 0 is an equilibrium of (5.12).
Definition 5.4. The stationary solution u ≡ 0 is called exponentially stable, if
there exist constants β, γ, c > 0 such that for all initial values |u(0)| < β the
unique solution of (5.12) satisfies the estimate |u(t)| ≤ ce−γt for all times t ≥ 0.
Remark 5.3. Exponential stability implies asymptotic stability; the converse is in general not true.
Definition 5.5 (Lyapunov function). A function V ∈ C^1(D) is called a Lyapunov function to the ODE (5.12), if

(1) V(x) ≥ 0 for all x ∈ D,

(2) V(x) = 0, if and only if x = 0,

(3) the derivative of V in direction of f is non-positive, i.e.

V′ := (grad V, f) = f_1 · ∂_1 V + ··· + f_n · ∂_n V ≤ 0 in D .
Remark 5.4. The energy E(u, u′) = u′^2/2 + P(u) is a Lyapunov function for the nonlinear oscillator u″ + ku′ + p(u) = 0.
Often, for ODEs describing physical systems, Lyapunov functions can be constructed by looking for energies of the system. However, there is no general rule or recipe —at least to my knowledge— on how to construct a Lyapunov function for a given ODE.

Why are Lyapunov functions important for stability analysis? Due to the following

Theorem 5.10 (Stability theorem). Let D ⊂ Rn be open and 0 ∈ D, let f be Lipschitz-continuous with f(0) = 0 and let V be a Lyapunov function for u′ = f(u). If

(1) V′ ≤ 0 in D, then u ≡ 0 is stable,

(2) V′ < 0 in D \ {0}, then u ≡ 0 is asymptotically stable,

(3) V′ ≤ −αV and V(u) ≥ b |u|^β in D for some α, β, b > 0, then u ≡ 0 is exponentially stable.

Proof. Ad (1). Let ε > 0 and B_ε(0) := {x : |x| ≤ ε} ⊂ D. We choose γ > 0 such that V(x) ≥ γ for all |x| = ε. Next, we choose δ, 0 < δ < ε, such that V(x) < γ for all |x| < δ. Now let u be the solution of (5.12) with an initial value |u(0)| < δ. We set Φ(t) := V(u(t)). Then Φ′(t) ≤ 0 by assumption, hence Φ(t) ≤ Φ(0) < γ. Since V(u) ≥ γ for |u| = ε, we can conclude that |u(t)| < ε for all times t.

Ad (2). Let u(t) be the solution of (5.12) as in part (1) of the proof. Then the limit lim_{t→∞} Φ(t) = β ≤ γ exists. We will show that β = 0.
Assume the contrary, β ≠ 0. Let M = { x ∈ B_ε(0) : β ≤ V(x) ≤ γ }. The set M is a compact (closed and bounded) subset of B_ε(0) \ {0} and hence the maximum of V′ exists on M, i.e. max_{x∈M} V′(x) = −α < 0. The solution trajectory u is contained in M and Φ′(t) = (d/dt) V(u(t)) ≤ −α. However, this is a contradiction to β ≤ V(x) in M.
Therefore lim_{t→∞} Φ(t) = 0. Now let 0 < ε′ < ε and δ′ := min_{ε′≤|x|≤ε} V(x); then |u(t)| < ε′ as soon as Φ(t) < δ′, hence lim_{t→∞} |u(t)| = 0 as well.

Ad (3). We have b |u(t)|^β ≤ V(u(t)) =: Φ(t) and Φ′(t) ≤ −αΦ(t). Hence we arrive at Φ(t) ≤ Φ(0) e^{−αt} and

|u(t)| ≤ ( Φ(0)/b )^{1/β} e^{−αt/β} ≤ c e^{−γt} .

Example 5.10. For the nonlinear oscillator, we obtain

(d/dt) E = −k u′^2(t) ≤ 0 .

Hence the zero solution is stable.
In case of k > 0 it is even asymptotically stable; this does not directly follow from the previous theorem.

Theorem 5.11 (Instability theorem). Let V ∈ C^1(D) with V(0) = 0 and V(x_k) > 0 for some zero sequence (x_k)_{k∈N} ⊂ D. The zero solution u ≡ 0 is unstable, if one of the following two conditions is satisfied

(1) V′ > 0 for x ≠ 0, or

(2) V′ ≥ λV in D for some λ > 0.

Proof. Let u(t) be the solution of (5.12) for u(0) = x_k, hence Φ(0) := V(u(0)) = α > 0.
Ad (1). Let ε > 0 such that V < α in B_ε(0). Since Φ′ ≥ 0, we obtain α = Φ(0) ≤ Φ(t) and therefore |u(t)| > ε. Let r > ε such that B_r(0) ⊂ D. For ε ≤ |x| < r we have V′(x) ≥ β > 0, hence Φ′ ≥ β and Φ ≥ α + βt as long as u(t) ∈ B_r(0). Since V is bounded in B_r, the solution trajectory u(t) has to leave the ball B_r in finite time.

Ad (2). Let Φ(t) := V(u(t)). Due to the assumption, Φ′ ≥ λΦ and therefore Φ(t) ≥ α e^{λt}. Hence we obtain |u(t)| > r for large times t, similar to the proof of part (1). Since x_k converges to zero, there exist solutions u for arbitrarily small initial values x_k such that |u(t)| > r in finite time t.

Corollary 5.12. In particular, the zero solution is unstable, if

V(x) > 0 for x ≠ 0 and V′ > 0 for x ≠ 0 .

This short introduction to Lyapunov functions is just the starting point for a deeper look into stability theory.
Chapter 6

Partial Differential Equations

Let us recall Definition 1.2: Given an (open) interval I ⊂ R and a function f : I × R × ··· × R → R, we call

f( t, u(t), du/dt, ..., d^k u/dt^k ) = 0 (∗)

a k-th order ordinary differential equation (ODE). If u ∈ C^k(I; R) and (∗) holds for all t ∈ I, we call u a classical solution of (∗).

6.1 Definition of PDEs


For the forthcoming, we introduce the following notation.
Definition 6.1 (Multi-index). Let α = (α_1, ..., α_n) ∈ N^n be a multi-index of length n. We define |α| := α_1 + ··· + α_n and

D^α u := ∂^{|α|} u / ( ∂x_1^{α_1} ··· ∂x_n^{α_n} ) = ∂_{x_1}^{α_1} ··· ∂_{x_n}^{α_n} u

for a function u ∈ C^{|α|}(R^n).

If the unknown function u is multivariate, i.e. u : Ω ⊂ Rn → R depends on


more than one variable, we can generalize the notion of an (ordinary) differential
equation to
Definition 6.2 (Partial Differential Equation, PDE). Given an open domain Ω ⊂ R^n and a function F : Ω × R × R^n × R^{n^2} × ··· × R^{n^k} → R, we call

F( y, u(y), Du(y), D^2 u(y), ..., D^k u(y) ) = 0 (†)

a k-th order PDE, where u : Ω → R is the unknown and D^k u = {D^α u : |α| = k} is the vector, matrix, ... of the k-th partial derivatives of u. If the function u ∈ C^k(Ω; R), i.e. k-times continuously diff'able, satisfies (†) for all y ∈ Ω identically, we call u a classical solution to (†).
Example 6.1 (Linear Advection Equation). Let a ∈ R and u : R+ × R → R satisfy the linear advection or transport equation

u_t + a u_x = 0 .

Here u_t = ∂_t u = ∂u/∂t and u_x = ∂_x u = ∂u/∂x denote the partial derivatives of u with respect to time t and space x. The linear advection equation is a first order PDE. It is easy to check that for a given initial profile

u(t = 0, x) = u0(x)

the solution of the linear advection equation is given by

u(t, x) = u0(x − at) .

The constant a is also called the advection velocity.


Definition 6.3 (Scalar conservation law). Let f : R → R be continuously differ-
entiable. We call the equation

ut + f (u)x = 0

a scalar conservation law with flux f . Its solution is a function u : R+ × R → R


also called density. To solve the scalar conservation law, we need to prescribe an
initial density
u(0, x) = u0 (x) .

For a scalar conservation law u_t + f(u)_x = 0 with density u and flux f, we call

M_{[a,b]}(t) := ∫_a^b u(t, x) dx

the mass inside the interval or domain [a, b]. The mass inside an interval changes in time according to the equation

(d/dt) M_{[a,b]} = (d/dt) ∫_a^b u(t, x) dx = ∫_a^b ∂u/∂t dx = − ∫_a^b ∂f(u)/∂x dx
                 = f(u(t, a)) − f(u(t, b)) .

Here f(u(t, a)) denotes the influx into the interval [a, b] at x = a and −f(u(t, b)) denotes the outflux out of [a, b] through the boundary at x = b.

Lemma 6.1. Let u ∈ C(R+ × R) be a solution of the scalar conservation law u_t + f(u)_x = 0 and assume that supp u(t, ·) = {(t, x) : u(t, x) ≠ 0} is compact for all t > 0. Then the mass of u

M(t) := ∫_R u(t, x) dx

is constant in t, i.e. M(t) = M(0).

Proof.

(d/dt) M(t) = ∫_R ∂u/∂t dx = − ∫_R ∂f(u)/∂x dx = lim_{R→∞} [ f(u(t, −R)) − f(u(t, R)) ] = 0 .

Example 6.2 (Traffic on a highway). Let u be the density of vehicles on a highway. Then the flux f is given by f = u · v, where v is the velocity of the vehicles.
If all vehicles travel, independent of the density, with the same velocity a, then f = au and we recover the linear advection equation

∂_t u + a ∂_x u = 0 .

More realistic is the following model: at low density, vehicles travel faster than at high density, i.e. v = v_max · (u_max − u)/u_max. Then the flux is given by

f = f(u) = β u · (u_max − u)

and we end up with a nonlinear, first order PDE

∂_t u + β · ∂_x ( u (u_max − u) ) = 0 .

If we scale the car density u with the maximal density u_max, i.e. u = u_max ũ, the space coordinate x = L x̃ and time t = T t̃, we obtain

∂_t̃ ũ + (v_max T / L) ∂_x̃ ( ũ(1 − ũ) ) = 0 .

Choosing the time scale T = L/v_max leads to the dimensionless equation (after dropping the tilde)

∂_t u + ∂_x ( u(1 − u) ) = 0 .

Definition 6.4 (Quasi-linear PDE). A first order PDE F(y, u, grad u) = 0 is called quasi-linear, if F is linear in grad u, but not necessarily in y or u, i.e.

F(y, u, grad u) = ⟨a(y, u), grad u⟩ − b(y, u)

and the corresponding PDE reads as

⟨a(y, u), grad u⟩ − b(y, u) = 0 .

Example 6.3. The advection equation u_t + a u_x = 0 is quasi-linear. We introduce y = (t, x) and a(y, u) = (1, a), b = 0. Then

⟨a(y, u), grad u⟩ − b = 1 · u_t + a · u_x = 0 .

The traffic flow equation ∂_t u + ∂_x (u(1 − u)) = 0 is quasi-linear. We introduce y = (t, x) and a(y, u) = (1, 1 − 2u), b = 0. Then

⟨a(y, u), grad u⟩ − b = 1 · u_t + (1 − 2u) · u_x = 0 .

The PDEs presented so far are of first order. In applications also second order
equations play an important role. Here are three classical examples.

Definition 6.5 (Elliptic PDE; Laplace/Poisson equation). Let Ω ⊂ R^n be a domain (open and connected set) and let f : Ω → R be continuous. We call the PDE

−Δu = −( ∂^2_{x_1} + ··· + ∂^2_{x_n} ) u = f

the Poisson equation/problem. The homogeneous equation

−Δu = 0

is called the Laplace equation/problem. To solve this problem, we need to prescribe boundary values for the unknown u at the boundary Γ = ∂Ω of Ω. The following three types of boundary conditions are frequently used

Dirichlet or boundary conditions of the first kind: we prescribe the values of u, i.e.

u(y) = g(y) for y ∈ Γ .

Neumann or boundary conditions of the second kind: we prescribe the normal derivative of u, i.e.

∂_n u(y) = g(y) for y ∈ Γ .

The normal derivative ∂_n u = n · ∇u = ⟨n, ∇u⟩ can be interpreted as the flux of the quantity u through the boundary.

Robin, mixed or boundary conditions of the third kind: a linear combination of Dirichlet and Neumann, i.e.

βu + ∂_n u = g .

Example 6.4 (Poisson equation on the unit circle). Let Ω = B_1(0) ⊂ R^2 denote the unit circle and consider the Poisson problem

Δu = 1 in Ω

supplemented by the Dirichlet boundary condition

u = 1 on ∂Ω .

Then the solution is given by

u(x, y) = (r^2 + 3)/4 = (x^2 + y^2 + 3)/4 .
Definition 6.6 (Parabolic PDE; Diffusion equation). Let Ω ⊂ Rn be a domain
with boundary Γ = ∂Ω. Let T > 0. Let α > 0 and f : [0, T ) × Ω → R be
continuous. We call the PDE

ut − α ∆ u = f

diffusion (heat) equation. Its solution u(t, x) depends on the time variable t ∈ [0, T] ⊂ R+ and the spatial variables x ∈ Ω ⊂ R^n. Note that the Laplacian is taken with respect to the spatial variables x. To solve the diffusion equation, we have to prescribe both

Initial Condition u(0, x) = u0 (x) for x ∈ Ω.

Boundary Condition for t ∈ [0, T ) and x ∈ Γ. Here we may use Dirichlet,


Neumann or Robin type conditions depending on the problem at hand.

The heat equation describes the evolution of a temperature profile u(t, x) over
time t ∈ [0, T ). At the starting time t = 0, we have given an initial temperature
distribution u0 (x) and at the boundary of the domain Ω we have either a fixed
temperature u = g (Dirichlet condition), a fixed heat flux −α∂n u = g (Neumann
condition) or a heat flux, that is proportional to the temperature, e.g. Newton’s
cooling law −α∂n u = k(u−u∞ ). The constant α > 0 denotes the heat conductivity
of the material and k > 0 is the heat transfer coefficient through the boundary.

Example 6.5 (Heat equation). Let Ω = [0, 2π] and consider the heat equation
ut − uxx = 0 in R+ × Ω

with the initial condition

u(0, x) = sin x

and the Dirichlet boundary conditions

u(t, 0) = 0 = u(t, 2π) .

Then, the solution is given by

u(t, x) = e−t sin x .


Definition 6.7 (Hyperbolic PDE; Wave Equation). Let T > 0 and f : [0, T) × R^n → R be continuous. We call the PDE

u_tt − c^2 Δu = f

wave equation. To solve the wave equation, we have to prescribe initial conditions u(0, x) = u0(x) and u_t(0, x) = v0(x) for x ∈ R^n. If we consider the PDE not on the full space x ∈ R^n but just on a domain Ω ⊂ R^n, we have to specify additional boundary conditions on parts of ∂Ω. For details we refer to the literature.

For the above examples, the solution of the PDEs can be found by educated guesses or trial and error. For first order PDEs there exists a systematic approach, discussed next.

6.2 Method of Characteristics


The scalar conservation law
ut + f (u)x = 0

or its variant called continuity equation with a source term b

ut + f (u)x = b(t, x, u)
both supplemented by an initial condition u(0, x) = u0(x), can be written in
quasi–linear form as

ut + a(t, x, u) · ux = b(t, x, u), u(0, x) = u0(x) ,

where a(t, x, u) = f′(u). The solution u = u(t, x) can be viewed as a surface over
the two–dimensional tx–plane.

Definition 6.8. The initial value problem (IVP) for a quasi–linear PDE is given
by:
Find a (classical) solution of
∂t u + a(t, x, u) · ux = b(t, x, u)
where the solution u(t, x) satisfies
u(0, x) = u0 (x) .

To solve a quasi–linear first order PDE

⟨A(t, x, u), grad u⟩ = (1, a(t, x, u)) · (ut, ux)ᵀ = b(t, x, u)
we use the following idea: The term ⟨A, grad u⟩ denotes the directional derivative
of u in the direction of the vector A = (1, a(t, x, u)). We introduce the parameter
s along the direction A and seek the solution u = u(y) along curves
y = y(s) = (t(s), x(s)) ∈ R2, the so–called characteristic ground curves. These
characteristic ground curves are determined by the ODE–system
dt/ds = 1,                      t(0) = 0 ,
dx/ds = a(t(s), x(s), u(s)),    x(0) = x0 ∈ R .
Here, the parameter s ∈ R+ runs along each of the characteristic ground curves
and x0 denotes the “starting point” of the characteristic. From the first ODE we
immediately get t(s) = s. The solution u = u(s, x0 ) along such a characteristic
ground curve is now given by
d/ds u(s, x0) = d/ds u(t(s), x(s, x0)) = (dt/ds) · ∂u/∂t + (dx/ds) · ∂u/∂x
              = ut + a(t, x, u) · ux = b(t, x, u) ,
i.e. we get a third IVP
du/ds = b(t(s), x(s), u(s)),    u(0) = u0(x0) .
Solving the characteristic system consisting of the three IVPs (the first one is
trivial in our case)
dt/ds = 1,                      t(0) = 0 ,
dx/ds = a(t(s), x(s), u(s)),    x(0) = x0 ∈ R ,
du/ds = b(t(s), x(s), u(s)),    u(0) = u0(x0)

we obtain the solution u = u(s, x0) depending on the two auxiliary parameters


(s, x0 ) along each of the characteristic curves (t, x) = (s, x(s, x0 )) starting at
point (0, x0 ). If we can invert the implicit equation for the curves, we obtain u
as a function of the original coordinates (t, x) instead of (s, x0).
Example 6.6 (Linear Advection). Let’s consider the linear advection equation
ut + cux = 0 with initial condition u(0, x) = u0 (x), where c ∈ R is some constant.
The characteristic system is given by
dt/ds = 1,    t(0) = 0 ,
dx/ds = c,    x(0) = x0 ,
du/ds = 0,    u(0) = u0(x0) .
Its solution is given by t = s, x(s, x0 ) = x0 + cs = x0 + ct and u(s) = u0 (x0 ).
Eliminating the curve parameter s and writing x as a function of t, we obtain
x = ct + x0 or x0 = x − ct .
Hence the characteristic ground curves are straight lines x = x(t) and the solution
u(t, x) = u0 (x0 ) = u0 (x − ct) is constant along those lines.
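The characteristic system can also be integrated numerically. The following sketch (an illustrative addition; the speed c and the profile u0 are assumed example data) follows a single characteristic with scipy and confirms that u equals u0(x − ct) along it:

import numpy as np
from scipy.integrate import solve_ivp

c = 1.5                               # assumed advection speed
u0 = lambda x: np.exp(-x**2)          # assumed smooth initial profile

def rhs(s, y):
    # y = (t, x, u) with dt/ds = 1, dx/ds = c, du/ds = 0
    return [1.0, c, 0.0]

x0 = 0.3                              # starting point of the characteristic
sol = solve_ivp(rhs, (0.0, 2.0), [0.0, x0, u0(x0)])

t, x, u = sol.y[:, -1]
print(u, u0(x - c * t))               # both values agree: u = u0(x - c t)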
Example 6.7 (Nonlinear traffic equation). We consider the nonlinear traffic flow
model ut + (u(1 − u))x = 0. Written in quasi–linear form, this equation reads as
ut + (1 − 2u) · ux = 0. Together with the initial condition u(0, x) = u0 (x) the
method of characteristics yields the system
dt/ds = 1,         t(0) = 0,          hence t = s ,
du/ds = 0,         u(0) = u0(x0),     hence u(t) = u0(x0) ,
dx/ds = 1 − 2u,    x(0) = x0,         hence x(t) = (1 − 2u0(x0)) t + x0 .

We have x = (1 − 2u0 (x0 )) · t + x0 ; a nonlinear equation for x0 which needs to


be solved.
Assume u0(x) = αx. Then x = (1 − 2αx0)t + x0 and therefore x0 = (x − t)/(1 − 2αt)
for t ≠ 1/(2α). For t < 1/(2α) we obtain

u(t, x) = u0(x0) = α (x − t)/(1 − 2αt) .

But what happens for t ≥ 1/(2α)? Does the solution cease to exist?
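Before answering this, one can at least confirm the formula for t < 1/(2α) symbolically; a sympy sketch (an illustrative addition):

import sympy as sp

t, x, alpha = sp.symbols("t x alpha")
u = alpha * (x - t) / (1 - 2 * alpha * t)

residual = sp.diff(u, t) + (1 - 2 * u) * sp.diff(u, x)
print(sp.simplify(residual))  # 0: u solves u_t + (1 - 2u) u_x = 0
print(u.subs(t, 0))           # alpha*x: the initial condition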

Let us consider the following nonlinear scalar conservation law with flux f(u) = u²/2:

ut + (u²/2)x = 0,    u(0, x) = u0(x) .

This equation is called Burgers' equation. Using the method of characteristics
we obtain dx/ds = u with u constant along each characteristic, i.e. dx/ds = u0(x0)
with initial condition x(0) = x0. The solution reads as x = u0(x0) t + x0, which
has to be solved for x0.
Example 6.8 (Shock). Considering the initial condition

1 :
 x<0
u0 (x) = 1 − x : 0 ≤ x ≤ 1

0: x>1

we obtain the characteristic ground curves



       ⎧ t + x0          : x0 < 0
x(t) = ⎨ (1 − x0) t + x0 : 0 ≤ x0 ≤ 1
       ⎩ x0              : x0 > 1

If t < 1, the characteristic equations are solvable and we obtain the classical
solution

          ⎧ 1               : x < t
u(t, x) = ⎨ (1 − x)/(1 − t) : t ≤ x ≤ 1
          ⎩ 0               : x > 1

But for t ≥ 1, the characteristic equations are not solvable anymore; the charac-
teristic ground curves intersect!
What happens? The solution steepens up and finally shows a jump discontinuity.
This is called a shock wave. Does there exist any concept of a solution for t > 1?
Example 6.9 (Rarefaction wave). Now, let's consider Burgers' equation with a
different, increasing initial condition. If u0 is smooth, the characteristics cover
the whole (t, x)–plane; the characteristic system is solvable for all t ≥ 0 and we
obtain a classical solution for all times t. But if the initial solution has a jump,
e.g.
u0(x) = ⎧ 0 : x < 0
        ⎩ 1 : x ≥ 0

we obtain the following ground curves


x(t) = ⎧ x0     : x0 < 0
       ⎩ x0 + t : x0 ≥ 0

and the solution reads as


u(t, x) = ⎧ 0 : x < 0
          ⎩ 1 : x ≥ t

But what happens in the domain 0 < x < t, where we have no characteristic
ground curves at all? This phenomenon is called a rarefaction wave.
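Both phenomena become visible when plotting the characteristic ground curves. The following sketch (an illustrative addition, assuming numpy and matplotlib) draws the curves of Examples 6.8 and 6.9: the first family intersects at t = 1, the second leaves the wedge 0 < x < t uncovered.

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0.0, 2.0, 50)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Example 6.8: characteristics intersect at t = 1 (shock)
for x0 in np.linspace(-1.0, 2.0, 31):
    speed = 1.0 if x0 < 0 else (1.0 - x0 if x0 <= 1 else 0.0)
    ax1.plot(x0 + speed * t, t, "b-", lw=0.5)
ax1.set(title="shock: intersecting characteristics", xlabel="x", ylabel="t")

# Example 6.9: the wedge 0 < x < t is not covered (rarefaction)
for x0 in np.linspace(-1.0, 2.0, 31):
    speed = 0.0 if x0 < 0 else 1.0
    ax2.plot(x0 + speed * t, t, "b-", lw=0.5)
ax2.set(title="rarefaction: uncovered wedge", xlabel="x", ylabel="t")
plt.show()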

But this is just the starting point for the course on Numerical Methods for Partial
Differential Equations.
Appendix A

Preliminaries

A.1 Eigenvalues and Eigenvectors

In this section we recall some results from Linear Algebra without proving them.

Definition A.1 (Eigenvalue and Eigenvector). Let A ∈ Rn×n or Cn×n. We
call 0 ≠ v ∈ Cn an eigenvector to the eigenvalue λ ∈ C, if

Av = λv .

The set Eigλ(A) := {v : Av = λv}, consisting of all eigenvectors to the eigenvalue λ
together with the zero vector, is called the eigenspace to the eigenvalue λ.
The eigenspace Eigλ(A) is a vector space.
Its dimension dim Eigλ (A) is called geometric multiplicity of the eigenvalue λ.
The set σ(A) = {λ ∈ C : λ is an eigenvalue of A} consisting of all eigenvalues of
A is called the spectrum of A. The largest eigenvalue (by absolute value) defines
the spectral radius ρ(A) := max {|λ| : λ ∈ σ(A)}.

Theorem A.1. Let λ be an eigenvalue of A to the eigenvector v. Then

(1) The eigenvector is not the zero vector.

(2) The matrix A − λE is singular

det(A − λE) = 0 ,

here E denotes the n × n–identity matrix

(3) The eigenvector v is a non–trivial solution of the homogeneous linear system

(A − λE)v = 0 .


(4) The matrix A − λE does not have full rank.

(5) The eigenvalue λ is a root of the characteristic polynomial

χA (x) = det(A − xE) .

The characteristic polynomial χA of a matrix A ∈ Rn×n or Cn×n is a polynomial
of degree n. Hence an n × n–matrix possesses n complex eigenvalues, counted with
multiplicity. If the eigenvalue λk is a root of χA of multiplicity νk, i.e.

χA(x) = ∏_k (λk − x)^{νk}    (the product runs over the distinct eigenvalues λk),

then νk is called the algebraic multiplicity of the eigenvalue λk. It holds that
Σ_k νk = n.
Let λ ≠ µ be two different eigenvalues of A with eigenvectors v and w. Then v
and w are linearly independent.

Definition A.2 (Diagonalization). A matrix A ∈ Cn×n is called diagonalizable,


if there exist n linearly independent eigenvectors v1, . . . , vn ∈ Cn.

Theorem A.2. Let A be diagonalizable. Then there exists a diagonal matrix


D ∈ Cn×n and a regular matrix Q ∈ Cn×n , such that

A = Q · D · Q−1 .

Proof. Since A is diagonalizable, there exist n linearly independent eigenvectors


v1 , . . . , vn to the eigenvalues λ1 , . . . , λn . Note that λj = λk is allowed. We write
the eigenvectors as the columns of the matrix Q, i.e.

Q = (v1 | · · · | vn ) ∈ Cn×n .

Due to the linear independence of the columns, the matrix Q is regular,
i.e. invertible. The diagonal matrix D is defined by the eigenvalues

D = diag (λ1 , . . . , λn ) .

Then the eigenvalue equations Avk = λk vk hold for each k = 1, . . . , n, or in matrix
form

AQ = QD .

Since Q is invertible we obtain A = QDQ−1 .
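Numerically, this decomposition can be reproduced with numpy; a minimal sketch (an illustrative addition, the symmetric matrix is an assumed example):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, Q = np.linalg.eig(A)   # eigenvalues and eigenvectors (columns of Q)
D = np.diag(lam)

print(lam)                                       # [3. 1.]
print(np.allclose(A, Q @ D @ np.linalg.inv(Q)))  # True: A = Q D Q^{-1}
print(max(abs(lam)))                             # spectral radius rho(A) = 3.0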



Remark A.1. Not every matrix is diagonalizable. E.g. the matrix

A = ( 0  1 )
    ( 0  0 )

has a double eigenvalue λ = 0 but only one eigenvector (up to scalar multiples),
v = (1, 0)ᵀ. Hence the matrix A is not diagonalizable. However, one can show
that the so–called Jordan normal form exists.
Theorem A.3 (Jordan Normal Form). For every matrix A ∈ Cn×n there exists
Q ∈ Cn×n with det Q ≠ 0, such that

Q−1 A Q = diag(J1, . . . , Jm) ,

where each Jordan block

     ( λk  1          )
Jk = (     λk  ⋱      )
     (         ⋱   1  )
     (            λk  )

carries the eigenvalue λk on the diagonal and ones on the superdiagonal. The
matrix Jk is called the Jordan block to the eigenvalue λk.
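For concrete matrices, the Jordan normal form can be computed symbolically; a sympy sketch (an illustrative addition) for the matrix from Remark A.1:

import sympy as sp

A = sp.Matrix([[0, 1],
               [0, 0]])

P, J = A.jordan_form()                    # A = P J P^{-1}
print(J)                                  # one 2x2 Jordan block to lambda = 0
print(sp.simplify(P * J * P.inv() - A))   # zero matrix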
Theorem A.4 (Further Results). It holds that

(1) If 0 ∈ σ(A), then A is singular, i.e. not invertible.


(2) The determinant equals the product of the eigenvalues

det(A) = ∏_{λk ∈ σ(A)} λk^{νk} .

(3) The trace equals the sum of the eigenvalues

tr(A) = Σ_{λk ∈ σ(A)} νk · λk .

(4) Let A = Aᵀ ∈ Rn×n. Then σ(A) ⊂ R: all eigenvalues of a real symmetric
matrix are real.

(5) Let A ∈ Rn×n or Cn×n be a lower or upper triangular matrix, i.e. aij = 0
for all j < i or for all j > i, respectively. Then the eigenvalues are the diagonal
entries, i.e. σ(A) = {aii : i = 1, . . . , n}.
Definition A.3 (Definiteness). Let A ∈ Rn×n be symmetric, i.e. Aᵀ = A. We
call A positive or negative definite, if λ ≥ 0 or λ ≤ 0, respectively, holds for all
eigenvalues λ of A.
We call the matrix strictly positive or strictly negative definite, if λ > 0 or λ < 0
holds for all eigenvalues λ. (Note that many texts call these notions positive
semi–definite and positive definite, respectively.)
The matrix is called indefinite, if there exist both positive and negative eigenvalues.
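In practice, the definiteness of a symmetric matrix is checked via its eigenvalues; a numpy sketch (an illustrative addition, the matrix is an assumed example):

import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

lam = np.linalg.eigvalsh(A)   # eigenvalues of a symmetric matrix, ascending
print(lam)                    # [1. 3.]
print(np.all(lam > 0))        # True: A is strictly positive definite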

A.2 Taylor’s Theorem


In this section we recall some well–known results from Calculus without proving
them.
Definition A.4 (Landau Symbols). Let f, g : D → R be two functions. We
call g an asymptotic upper bound of f for x → x0, if there exist K, ε > 0 such
that |f(x)/g(x)| < K for all x with |x − x0| < ε.
Analogously, we call g an asymptotic upper bound of f for x → ∞, if there exist
K, R > 0 such that |f(x)/g(x)| < K for all x with x > R.
If g is an asymptotic upper bound of f, we write f = O(g) (for x → x0 resp.
x → ∞) and say that "f is of order big–Oh of g".
A function f is called asymptotically negligible in relation to g for x → x0 or
x → ∞, if lim_{x→x0} f(x)/g(x) = 0 or lim_{x→∞} f(x)/g(x) = 0, respectively.
In this case we write f = o(g) (for x → x0 or x → ∞) and say that "f is of order
small–Oh of g".
Remark A.2. Writing f = O(g) is just an abbreviation. If f1 = O(g) and
f2 = O(g) hold, this does in general not imply f1 = f2. However, if f = o(g),
then also f = O(g).
Mostly, we use statements of the form f = o(x^k) or O(x^k) for x → 0 or x → ∞,
e.g. sin x = O(x) for x → 0 and ln(x) = o(1/x) for x → 0.
Theorem A.5 (Mean Value Theorem (MVT)). Let f : [a, b] → R be continuous
and differentiable on ]a, b[. Then there exists x0 ∈ ]a, b[ such that

f′(x0) = (f(b) − f(a)) / (b − a) .

In a nutshell: at x0 the tangent is parallel to the secant.
Definition A.5 (Convex, Concave). Let f : I → R be twice diff’able. We call
the function

convex , if f 00 (x) > 0 for all x ∈ I,


concave , if f 00 (x) < 0 for all x ∈ I.
Theorem A.6 (Taylor). Let f : Df → R be sufficiently often diff'able in
x0 ∈ Df and let n ∈ N. Then there exists a unique polynomial Tn of degree n
satisfying

Tn^(k)(x0) = f^(k)(x0)    for k = 0, 1, 2, . . . , n .

This polynomial is called the n–th order Taylor–polynomial to f at/around x0.
It holds that

Tn(x) = f(x0) + f′(x0)(x − x0) + (f″(x0)/2)(x − x0)² + · · · + (f^(n)(x0)/n!)(x − x0)ⁿ

hence

Tn(x) = Σ_{k=0}^{n} αk (x − x0)^k    where    αk = f^(k)(x0) / k! .
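Taylor polynomials are easily generated symbolically; a sympy sketch (an illustrative addition; function and expansion point are assumed examples):

import sympy as sp

x = sp.symbols("x")
f = sp.sin(x)

# 3rd order Taylor polynomial around x0 = 0; removeO() drops the O(x**4) term
T3 = sp.series(f, x, 0, 4).removeO()
print(T3)   # -x**3/6 + x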

The approximation quality of the Taylor–polynomial, i.e. the difference between


the function f and its Taylor–polynomial can be estimated using the following
Theorem A.7 (Lagrange–Remainder). Let f : Df → R be (n + 1)–times
continuously differentiable and let Tn be the n–th order Taylor–polynomial at
x0 ∈ Df . Then

Rn (x) := f (x) − Tn (x)

is called the remainder. For all x ∈ Df the remainder can be expressed in the
Lagrange form

Rn(x) = (f^(n+1)(x̃) / (n + 1)!) (x − x0)^(n+1)

for a suitable x̃ ∈ [x, x0] resp. x̃ ∈ [x0, x].
Remark A.3. Some remarks on Taylor's Theorem.

(1) In the special case of n = 0 we obtain the mean value theorem from the
Lagrange remainder. The result on the remainder itself can be derived from
Rolle’s theorem similarly to the mean value theorem .

(2) Using the Landau symbols, we may write

f(x) = Tn(x) + O((x − x0)^(n+1)) .
(3) Taylor–expansion is mostly used for n = 1 or n = 2.


The case n = 1 is also called the linearization of f at/around x0. It holds
that

f(x) ≈ T1(x) = f(x0) + f′(x0) · (x − x0) .

The linear Taylor–polynomial (degree n = 1) equals the tangent of f at x0.

The case n = 2 corresponds to the quadratic approximation of f at x0:

f(x) ≈ T2(x) = f(x0) + f′(x0) · (x − x0) + (f″(x0)/2) · (x − x0)² .
Definition A.6 (Gradient and Hessian–matrix). Let f : Rn → R be twice
continuously diff’able.

(1) The vector field

grad f (x) := (∂x1 f (x), . . . , ∂xn f (x))

is called the gradient of f . Usually, the gradient is written as a row vector.

(2) The Hessian–matrix Hf (x0 ) ∈ Rn×n of f at x0 is defined as

(Hf(x0))ij = ∂xixj f(x0) ,    i.e.

     ( ∂²x1 f     ∂x1x2 f   . . .   ∂x1xn f )
Hf = ( ∂x2x1 f    ∂²x2 f    . . .   ∂x2xn f )
     (   ⋮           ⋮       ⋱        ⋮     )
     ( ∂xnx1 f    ∂xnx2 f   . . .   ∂²xn f  ) .

Due to Schwarz's theorem, the Hessian is a symmetric matrix.

Theorem A.8 (Taylor–polynomials in Rn ). Let f : Rn → R be three–times


cont. diff ’able. The linear Taylor–polynomial of f at x(0) can be written as
T1(x) = f(x(0)) + Σ_{i=1}^{n} ∂xi f(x(0)) · (xi − xi(0)) .

For the second order Taylor–polynomial it holds that

T2(x) = T1(x) + (1/2) Σ_{i,j=1}^{n} ∂xixj f(x(0)) · (xi − xi(0)) · (xj − xj(0)) .

These Taylor–polynomials yield approximations to f(x) in the neighborhood of x(0):

f(x) = T1(x) + O(‖x − x(0)‖²) ,
f(x) = T2(x) + O(‖x − x(0)‖³) .

The first–order Taylor–polynomial T1 is called linearization of f at x(0) and the


second–order polynomial T2 is called quadratic approximation of f at x(0) .
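Both approximations can be assembled from the gradient and the Hessian; a sympy sketch (an illustrative addition; the function f and the expansion point are assumed examples):

import sympy as sp

x, y = sp.symbols("x y")
f = sp.exp(x) * sp.cos(y)
x0, y0 = 0, 0                  # expansion point

grad = sp.Matrix([f.diff(x), f.diff(y)]).subs({x: x0, y: y0})
H = sp.hessian(f, (x, y)).subs({x: x0, y: y0})
d = sp.Matrix([x - x0, y - y0])

T1 = f.subs({x: x0, y: y0}) + (grad.T * d)[0]
T2 = T1 + sp.Rational(1, 2) * (d.T * H * d)[0]
print(sp.expand(T1))   # x + 1
print(sp.expand(T2))   # x**2/2 - y**2/2 + x + 1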

Remark A.4. If the function f is sufficiently often differentiable, we can also


construct in Rn Taylor–polynomials of higher orders
Tk(x) = Tk−1(x) + (1/k!) Σ_{i1,...,ik=1}^{n} ∂xi1···xik f(x(0)) · (xi1 − xi1(0)) · · · (xik − xik(0)) ,

here the sum runs over all k–th partial derivatives of f .



Theorem A.9 (Remainder in Rn ). Let f : Rn ⊃ Ω → R be (k + 1)–times


continuously diff ’able and let Tk be the k–th order Taylor–polynomial at x0 ∈ Ω.
Then we call

Rk (x) := f (x) − Tk (x)

the remainder. For all x ∈ Ω the remainder can be written in Lagrange form

Rk(x) = Σ_{|α|=k+1} (∂α f(x̃) / α!) · (x − x0)^α = O(‖x − x0‖^(k+1))

for some x̃ ∈ [x, x0], i.e. x̃ = τ x0 + (1 − τ) x where 0 ≤ τ ≤ 1.

The sum runs over all partial derivatives ∂α = ∂x1^{α1} · · · ∂xn^{αn} of order
|α| = α1 + · · · + αn = k + 1. Furthermore, we use the multi–index notation
α! = α1! · · · αn! and x^α = x1^{α1} · · · xn^{αn}.

Remark A.5. In the special case of f : R2 → R the linear and quadratic Taylor–
polynomials around (x0 , y0 ) are given by

T1(x, y) = f(x0, y0) + ∂x f(x0, y0) · (x − x0) + ∂y f(x0, y0) · (y − y0)
         = f(x0) + ⟨grad f(x0), x − x0⟩

and

T2(x, y) = T1(x, y) + (1/2) [ ∂²x f(x0, y0) · (x − x0)²
           + 2 ∂xy f(x0, y0) · (x − x0)(y − y0) + ∂²y f(x0, y0) · (y − y0)² ]
         = f(x0) + ⟨grad f(x0), x − x0⟩ + (1/2) (x − x0)ᵀ Hf(x0) (x − x0) ,

where in the second lines x0 = (x0, y0) is understood as a vector.
The linear Taylor–polynomial T1 equals the tangent plane to the graph of
f at (x0, y0).
