Pontryagin's Maximum Principle in a Nutshell

Moritz Diehl
Overview

I Derivation 1: Hamilton-Jacobi-Bellman equation


I Derivation 2: Calculus of Variations
I Properties of Euler-Lagrange Equations
I Boundary Value Problem (BVP) Formulation
I Numerical Solution of BVP
I Discrete Time Pontryagin Principle
Continuous Time Optimal Control

Regard simplified optimal control problem:


[Figure: state trajectory x(t) from the initial value x0 at t = 0 to the terminal cost term E(x(T)) at t = T; below it, the control trajectory u(t).]

  minimize     ∫₀ᵀ L(x(t), u(t)) dt + E(x(T))
  x(·), u(·)
subject to
x(0) − x0 = 0, (fixed initial value)
ẋ(t) − f (x(t), u(t)) = 0, t ∈ [0, T ]. (ODE model)
Hamilton-Jacobi-Bellman equation

Recall:
I Hamilton-Jacobi-Bellman Equation:

  −∂J/∂t (x, t) = min_u H(x, ∇x J(x, t), u)

I with Hamiltonian function

  H(x, λ, u) := L(x, u) + λᵀ f(x, u)

I and terminal condition J(x, T) = E(x).
Pontryagin’s Maximum Principle

OBSERVATION: In HJB, the optimal controls

  u*(x, t) = arg min_u H(x, ∇x J(x, t), u)

depend only on the derivative ∇x J(x, t), not on J itself!


IDEA: Introduce adjoint variables

  λ(t) := ∂J/∂x (x(t), t)ᵀ ∈ R^nx

and get the controls from

Pontryagin's Maximum Principle (historical name)

  u*(x, λ) = arg min_u H(x, λ, u)


QUESTION: How to obtain λ(t)?


(Differentiation Lemma)

We want to differentiate an optimal value that depends on the
parameters y = (x, λ). How can we do that most easily?

LEMMA: If H*(y) = min_u H(y, u), then

  ∂H*/∂y (y) = ∂H/∂y (y, u*)   with u* = arg min_u H(y, u).

PROOF:

  ∂H*/∂y (y) = ∂H/∂y (y, u*) + ∂H/∂u (y, u*) ∂u*/∂y (y),

where the second term vanishes due to the first order optimality
condition ∂H/∂u (y, u*) = 0.

(The lemma can be extended to constrained problems, using partial
derivatives of the Lagrangian.)
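A quick numerical sanity check of this lemma (a minimal Python sketch with a hypothetical scalar function, not part of the slides): the total derivative of the optimal value H* equals the partial derivative of H with the minimizer held fixed.

```python
# Check the differentiation lemma for H(y, u) = (u - y)^2 + y^2, whose minimizer
# is u*(y) = y and whose optimal value is H*(y) = y^2 (illustrative example).
import numpy as np

def H(y, u):
    return (u - y)**2 + y**2

def u_star(y):            # arg min_u H(y, u), here known in closed form
    return y

def H_star(y):            # optimal value H*(y) = min_u H(y, u)
    return H(y, u_star(y))

y, eps = 1.3, 1e-6
dHstar_dy = (H_star(y + eps) - H_star(y - eps)) / (2 * eps)                  # total derivative
dH_dy_frozen = (H(y + eps, u_star(y)) - H(y - eps, u_star(y))) / (2 * eps)   # u held fixed
print(dHstar_dy, dH_dy_frozen)    # both approximately 2*y = 2.6
```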
Adjoint Differential Equation

I Differentiate the HJB equation

  −∂J/∂t (x, t) = min_u H(x, ∇x J(x, t), u) = H(x, ∇x J(x, t), u*)

with respect to x and obtain:

  −∂/∂t ∇x J(x, t) = ∇x H(x, λ, u*) + ∇²x J(x, t) ∂H/∂λ (x, λ, u*)ᵀ,   where λ = ∇x J(x, t) and ∂H/∂λᵀ = f(x, u*),

or equivalently, along a trajectory with ẋ = f(x, u*),

  ∂/∂t ∇x J(x, t) + ∇²x J(x, t) f(x, u*) = d/dt ∇x J(x(t), t) = λ̇(t) = −∇x H(x, λ, u*).
Terminal Condition

I Likewise, differentiate J(x, T) = E(x) with respect to x

and obtain the terminal condition

  λ(T) = ∇x E(x(T)).


Necessary Optimality Conditions

Summarize optimality conditions as boundary value problem:


  x(0) = x0,                                           (initial value)
  ẋ(t) = f(x(t), u*(t)),                 t ∈ [0, T],   (ODE model)
  λ̇(t) = −∂H/∂x (x(t), λ(t), u*(t))ᵀ,    t ∈ [0, T],   (adjoint equations)
  u*(t) = arg min_u H(x(t), λ(t), u),    t ∈ [0, T],   (minimum principle)
  λ(T) = ∇x E(x(T)).                                   (adjoint final value)
Overview

I Derivation 1: Hamilton-Jacobi-Bellman equation


I Derivation 2: Calculus of Variations
I Properties of Euler-Lagrange Equations
I Boundary Value Problem (BVP) Formulation
I Numerical Solution of BVP
Alternative Derivation

Regard the infinite-dimensional optimization problem:


  minimize     ∫₀ᵀ L(x(t), u(t)) dt + E(x(T))
  x(·), u(·)
subject to
x(0) − x0 = 0, (fixed initial value)
ẋ(t) − f (x(t), u(t)) = 0, t ∈ [0, T ]. (ODE model)

Introduce Lagrange multipliers λ and the "Lagrangian functional"


  L(x(·), u(·), λ(·)) = ∫₀ᵀ [ L(x, u) − λᵀ(ẋ − f(x, u)) ] dt + E(x(T)).
Infinitesimal Variations

Abbreviate using the Hamiltonian H(x, λ, u):


  L(x(·), u(·), λ(·)) = ∫₀ᵀ [ H(x, λ, u) − λᵀ ẋ ] dt + E(x(T))

Regard an infinitesimal variation of L(x(·), u(·), λ(·)) with
perturbation δx(t):

  δL = ∫₀ᵀ [ ∂H/∂x δx − λᵀ δẋ ] dt + ∂E/∂x δx(T)

which by partial integration yields:


  δL = ∫₀ᵀ [ (∇x H + λ̇)ᵀ δx − d/dt (λᵀ δx) ] dt + ∂E/∂x δx(T)
Infinitesimal Variations (contd.)

Using the fact that δx(0) = 0 and requiring δL = 0 yields


  0 = ∫₀ᵀ (∇x H + λ̇)ᵀ δx dt + (∇x E − λ(T))ᵀ δx(T)

which implies, for arbitrary variations,

  λ̇ = −∇x H(x(t), λ(t), u*(t))

and

  λ(T) = ∇x E(x(T))
Thus, calculus of variations leads to the same adjoint differential
equations as differentiation of HJB!
Overview

I Derivation 1: Hamilton-Jacobi-Bellman equation


I Derivation 2: Calculus of Variations
I Properties of Euler-Lagrange Equations
I Boundary Value Problem (BVP) Formulation
I Numerical Solution of BVP
How to obtain explicit expression for controls?

I In the simplest case,

  u*(t) = arg min_u H(x(t), λ(t), u)

is defined by

  ∂H/∂u (x(t), λ(t), u*(t)) = 0.
I In the presence of path constraints, the expression for u*(t) changes
whenever the set of active constraints changes. This leads to state
dependent switches.
I If the minimum of the Hamiltonian is locally not unique, "singular arcs"
occur. Their treatment needs higher order derivatives of H.
Nice Case: Example

Regard L(x, u) = ½ (xᵀQx + uᵀRu) with invertible R and
f(x, u) = Ax + Bu. Then

  H(x, λ, u) = ½ (xᵀQx + uᵀRu) + λᵀ(Ax + Bu)

and

  ∂H/∂u = uᵀR + λᵀB.

Thus, ∂H/∂u = 0 implies that

  u* = −R⁻¹Bᵀλ.
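A small numerical check of this formula (an illustrative Python sketch with arbitrary example matrices, not part of the slides): the closed-form u* indeed makes the gradient of the Hamiltonian with respect to u vanish.

```python
# Verify that u* = -R^{-1} B^T lambda satisfies dH/du = u^T R + lambda^T B = 0
# for the LQ Hamiltonian above (random illustrative data).
import numpy as np

rng = np.random.default_rng(0)
nx, nu = 3, 2
A = rng.standard_normal((nx, nx))
B = rng.standard_normal((nx, nu))
Q = np.eye(nx)
R = np.diag([1.0, 2.0])                      # invertible R
x, lam = rng.standard_normal(nx), rng.standard_normal(nx)

u_star = -np.linalg.solve(R, B.T @ lam)      # u* = -R^{-1} B^T lambda
grad_H_u = R @ u_star + B.T @ lam            # (dH/du)^T evaluated at u*
print(np.linalg.norm(grad_H_u))              # ~1e-16
```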
Singular Arcs

But what if the relation

  ∂H/∂u (x(t), λ(t), u*) = 0

is not invertible with respect to u*?
This occurs, e.g., if L(x, u) is independent of u and f(x, u) is linear
in u.
Singular arcs are due to the fact that only the integral of controls
influences the states, and “singular” perturbations (that go up and
down quickly) do not matter in the objective.
Remedy for Singular Arcs

“What is zero should also have zero derivative”.


Therefore, we differentiate totally with respect to time:

  d/dt ∂H/∂u (x(t), λ(t), u*) = 0,

i.e.

  ∂/∂x (∂H/∂u) ẋ + ∂/∂λ (∂H/∂u) λ̇ = 0,   with ẋ = f(x, u) and λ̇ = −∂H/∂xᵀ.

If this still does not allow us to find u* explicitly, differentiate even
further . . .
Singular Arc: Example

Regard L(x, u) = ½ xᵀQx and f(x, u) = Ax + Bu. Then

  H(x, λ, u) = ½ xᵀQx + λᵀ(Ax + Bu)

and

  ∂H/∂u = λᵀB.

This is not invertible with respect to u*!
Singular Arc: Example (contd.)

Differentiating with respect to time yields:


  d/dt ∂H/∂u = λ̇ᵀB = −∂H/∂x B = −(xᵀQ + λᵀA)B.

Differentiating once more yields:

  d²/dt² ∂H/∂u = −ẋᵀQB − λ̇ᵀAB = −(Ax + Bu)ᵀQB + (xᵀQ + λᵀA)AB.

Setting this to zero (provided BᵀQB is invertible) finally yields the feedback law

  u* = (BᵀQB)⁻¹ Bᵀ ( (AᵀQ − QA)x + AᵀAᵀλ ).

This is only applicable on singular arcs.
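A quick numerical check of this feedback law (an illustrative Python sketch with random example matrices, not part of the slides): substituting u* back in should make the second time derivative of ∂H/∂u vanish.

```python
# Verify that the singular-arc feedback makes
#   d^2/dt^2 (dH/du) = -(A x + B u)^T Q B + (x^T Q + lambda^T A) A B
# equal to zero (Q symmetric; random illustrative data).
import numpy as np

rng = np.random.default_rng(1)
nx, nu = 4, 2
A = rng.standard_normal((nx, nx))
B = rng.standard_normal((nx, nu))
Q = np.eye(nx)
x, lam = rng.standard_normal(nx), rng.standard_normal(nx)

rhs = B.T @ ((A.T @ Q - Q @ A) @ x + A.T @ A.T @ lam)
u_sing = np.linalg.solve(B.T @ Q @ B, rhs)               # singular-arc feedback law

residual = -(A @ x + B @ u_sing) @ Q @ B + (x @ Q + lam @ A) @ A @ B
print(np.linalg.norm(residual))                          # ~1e-14
```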


Euler-Lagrange Differential Equations

Note that

  ∂H/∂λ (x, λ, u)ᵀ = f(x, u).

Thus, the combined equations

  ẋ = ∂H/∂λ (x, λ, u*)ᵀ,
  λ̇ = −∂H/∂x (x, λ, u*)ᵀ

form a Hamiltonian system. Volume in (x, λ) space is preserved. But this
also means that if the dynamics of x is very stable, i.e. contracting,
then the dynamics of λ must be expanding.
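This stable/unstable pairing can be seen concretely in the LQ case (a minimal Python sketch with illustrative random matrices, not part of the slides): eliminating u* = −R⁻¹Bᵀλ gives a linear system whose matrix has zero trace and eigenvalues in ± pairs.

```python
# For the LQ "nice case", the Euler-Lagrange system is d/dt (x, lambda) = M (x, lambda) with
#   M = [[ A, -B R^{-1} B^T ],
#        [-Q, -A^T          ]].
# M has zero trace (the flow preserves volume) and its spectrum is symmetric:
# every contracting direction is matched by an expanding one.
import numpy as np

rng = np.random.default_rng(2)
nx, nu = 3, 1
A = rng.standard_normal((nx, nx))
B = rng.standard_normal((nx, nu))
Q, R = np.eye(nx), np.eye(nu)

M = np.block([[A, -B @ np.linalg.solve(R, B.T)],
              [-Q, -A.T]])

print(np.trace(M))                             # ~0: divergence-free flow in (x, lambda)
print(np.sort_complex(np.linalg.eigvals(M)))   # eigenvalues come in +/- pairs
```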
Overview

I Derivation 1: Hamilton-Jacobi-Bellman equation


I Derivation 2: Calculus of Variations
I Properties of Euler-Lagrange Equations
I Boundary Value Problem (BVP) Formulation
I Numerical Solution of BVP
Boundary Value Problem (BVP)

Can summarize the BVP


  x(0) = x0,                                           (initial value)
  ẋ(t) = f(x(t), u*(t)),                 t ∈ [0, T],   (ODE model)
  −λ̇(t) = ∂H/∂x (x(t), λ(t), u*(t))ᵀ,    t ∈ [0, T],   (adjoint equations)
  u*(t) = arg min_u H(x(t), λ(t), u),    t ∈ [0, T],   (minimum principle)
  λ(T) = ∂E/∂x (x(T))ᵀ.                                (adjoint final value)

by using y = (x, λ) and substituting u* explicitly as

  0 = r(y(0), y(T)),                   (boundary conditions)
  ẏ(t) = f̃(y(t)),       t ∈ [0, T].    (ODE model)
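For the LQ example above, the reduced system ẏ = f̃(y) and the boundary residual r can be written out explicitly (a minimal Python sketch; the double-integrator matrices and the terminal cost E(x) = ½ xᵀ P_T x are illustrative assumptions, not from the slides).

```python
# Assemble ydot = f_tilde(y) and r(y0, yT) for y = (x, lambda), with
# u* = -R^{-1} B^T lambda and terminal condition lambda(T) = P_T x(T).
import numpy as np

nx, nu = 2, 1
A = np.array([[0.0, 1.0], [0.0, 0.0]])      # double integrator (example choice)
B = np.array([[0.0], [1.0]])
Q, R = np.eye(nx), np.eye(nu)
P_T = 10.0 * np.eye(nx)                     # terminal cost weight (assumption)
x0 = np.array([1.0, 0.0])

def f_tilde(y):
    """Combined state/adjoint dynamics (f(x, u*), -grad_x H)."""
    x, lam = y[:nx], y[nx:]
    u = -np.linalg.solve(R, B.T @ lam)      # minimum principle, explicit
    return np.concatenate([A @ x + B @ u, -(Q @ x + A.T @ lam)])

def r(y0, yT):
    """Boundary residual: x(0) = x0 and lambda(T) = P_T x(T)."""
    return np.concatenate([y0[:nx] - x0, yT[nx:] - P_T @ yT[:nx]])
```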
BVP analysis

The BVP
  0 = r(y(0), y(T)),                   (boundary conditions)
  ẏ(t) = f̃(y(t)),       t ∈ [0, T],    (ODE model)

has 2nx differential equations ẏ = f̃ and 2nx boundary conditions r.
It is therefore (usually) well-defined.
But how to solve a BVP?
Overview

I Derivation 1: Hamilton-Jacobi-Bellman equation


I Derivation 2: Calculus of Variations
I Properties of Euler-Lagrange Equations
I Boundary Value Problem (BVP) Formulation
I Numerical Solution of BVP
I single shooting
I collocation
Single Shooting

Guess initial value for y0 . Use numerical integration to obtain


trajectory as function y (t; y0 ) of y0 .

[Figure: trajectory y(t; y0) starting at the guessed initial value y0 and ending at the terminal value y(T; y0).]

Obtain in particular terminal value y (T ; y0 ).


Single Shooting (contd.)

The only remaining equation is

  F(y0) := r(y0, y(T; y0)) = 0,

which might or might not be satisfied for the guess y0.


Fortunately, r has as many components as y0, so we can apply
Newton's method for root finding of F(y0) = 0, which iterates

  y0^(k+1) = y0^k − (∂F/∂y0 (y0^k))⁻¹ F(y0^k).

Attention: to evaluate ∂F/∂y0 (y0^k) we have to compute ODE
sensitivities.
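A minimal single-shooting sketch for the LQ boundary value problem above (Python, not from the slides; the matrices repeat the illustrative double-integrator example, and finite differences stand in for proper ODE sensitivities).

```python
# Single shooting with Newton's method: guess y0 = (x0, lambda0), integrate
# ydot = f_tilde(y), and drive F(y0) = r(y0, y(T; y0)) to zero.
import numpy as np
from scipy.integrate import solve_ivp

nx, T = 2, 1.0
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R, P_T = np.eye(nx), np.eye(1), 10.0 * np.eye(nx)
x0 = np.array([1.0, 0.0])

def f_tilde(t, y):
    x, lam = y[:nx], y[nx:]
    u = -np.linalg.solve(R, B.T @ lam)
    return np.concatenate([A @ x + B @ u, -(Q @ x + A.T @ lam)])

def F(y0):
    yT = solve_ivp(f_tilde, (0.0, T), y0, rtol=1e-10, atol=1e-10).y[:, -1]
    return np.concatenate([y0[:nx] - x0, yT[nx:] - P_T @ yT[:nx]])   # r(y0, y(T; y0))

y0 = np.concatenate([x0, np.zeros(nx)])          # initial guess: zero adjoints
for k in range(10):
    Fk = F(y0)
    if np.linalg.norm(Fk) < 1e-8:
        break
    eps = 1e-6                                   # finite-difference sensitivities dF/dy0
    J = np.column_stack([(F(y0 + eps * e) - Fk) / eps for e in np.eye(2 * nx)])
    y0 = y0 - np.linalg.solve(J, Fk)             # Newton step

print("lambda(0) =", y0[nx:], " ||F|| =", np.linalg.norm(F(y0)))
```

Because this example is linear-quadratic, Newton converges essentially in one step; for nonlinear dynamics the expanding adjoint dynamics can make single shooting ill-conditioned, which motivates collocation and multiple shooting.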
Collocation (Sketch)

I Discretize states on grid with node values si ≈ y (ti ).


I Replace the infinite-dimensional ODE constraint

  0 = ẏ(t) − f̃(y(t)),   t ∈ [0, T]

by finitely many equality constraints

  ci(si, si+1) = 0,   i = 0, . . . , N − 1,

e.g. the implicit midpoint rule

  ci(si, si+1) := (si+1 − si)/(ti+1 − ti) − f̃((si + si+1)/2).
Nonlinear Equation in Collocation

After discretization, obtain large scale, but sparse nonlinear


equation system:

r (s0 , sN ) = 0, (boundary conditions)


ci (si , si+1 ) = 0, i = 0, . . . , N − 1, (discretized ODE model)

Solve again with Newton’s method. Exploit sparsity in linear


system setup and solution.
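A minimal collocation sketch for the same LQ boundary value problem (Python, not from the slides; it uses the implicit midpoint residuals from the previous slide and a generic dense root finder, so the sparsity is not exploited as it would be in a serious implementation).

```python
# Midpoint collocation: unknowns are the node values s_0, ..., s_N of y = (x, lambda);
# equations are the boundary conditions and the collocation residuals c_i(s_i, s_{i+1}).
import numpy as np
from scipy.optimize import fsolve

nx, N, T = 2, 20, 1.0
t = np.linspace(0.0, T, N + 1)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R, P_T = np.eye(nx), np.eye(1), 10.0 * np.eye(nx)
x0 = np.array([1.0, 0.0])

def f_tilde(y):
    x, lam = y[:nx], y[nx:]
    u = -np.linalg.solve(R, B.T @ lam)
    return np.concatenate([A @ x + B @ u, -(Q @ x + A.T @ lam)])

def residual(w):
    s = w.reshape(N + 1, 2 * nx)                 # node values s_i ~ y(t_i)
    res = [s[0, :nx] - x0]                       # boundary condition x(0) = x0
    for i in range(N):                           # c_i(s_i, s_{i+1}) = 0, midpoint rule
        res.append((s[i + 1] - s[i]) / (t[i + 1] - t[i]) - f_tilde((s[i] + s[i + 1]) / 2))
    res.append(s[N, nx:] - P_T @ s[N, :nx])      # boundary condition lambda(T) = P_T x(T)
    return np.concatenate(res)

w = fsolve(residual, np.zeros((N + 1) * 2 * nx)) # square system: (N+1)*2*nx unknowns and equations
s = w.reshape(N + 1, 2 * nx)
print("lambda(0) from collocation:", s[0, nx:])
```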
Discrete Time Optimal Control Problem

  minimize     Σ_{i=0}^{N−1} li(si, qi) + E(sN)
    s, q

subject to
  s0 − x0 = 0,                                     (initial value)
  si+1 − fi(si, qi) = 0,   i = 0, . . . , N − 1,   (discrete system)
  hi(si, qi) ≥ 0,          i = 0, . . . , N,       (path constraints)
  r(sN) ≥ 0.                                       (terminal constraints)

This NLP can arise also from a direct multiple shooting parameterization of a
continuous-time optimal control problem. It can be solved by SQP or by the
constrained Gauss-Newton method.
Optimality of Discretized Optimal Control

For simplicity, drop all inequalities, regard only:


  minimize     Σ_{i=0}^{N−1} li(si, qi) + E(sN)
    s, q

subject to
  s0 − x0 = 0,                                     (initial value)
  si+1 − fi(si, qi) = 0,   i = 0, . . . , N − 1.   (discrete system)

What are KKT optimality conditions of this discretized optimal


control problem?
Discrete Optimality Conditions 1

Procedure:
I Introduce multiplier vectors λ0 , . . . , λN for all dynamic state
constraints.
I Formulate the Lagrangian

  L(s, q, λ) = E(sN) + (s0 − x0)ᵀ λ0
             + Σ_{i=0}^{N−1} [ li(si, qi) − (si+1 − fi(si, qi))ᵀ λi+1 ]

I Compute
∇si L and ∇qi L,
which must be zero for optimal solution.
Discrete Optimality Conditions 2

Obtain
1. ∇si L = −λi + ∇si li(si, qi) + ∇si fi(si, qi) λi+1 = 0   (i = 0, . . . , N − 1)
2. ∇sN L = −λN + ∇sN E(sN) = 0
3. ∇qi L = ∇qi li(si, qi) + ∇qi fi(si, qi) λi+1 = 0   (i = 0, . . . , N − 1)
4. ∇λ0 L = s0 − x0 = 0
5. ∇λi+1 L = si+1 − fi(si, qi) = 0   (i = 0, . . . , N − 1)
These conditions can be simplified by introducing the discrete time
Hamiltonian

  Hi(si, qi, λi+1) = li(si, qi) + fi(si, qi)ᵀ λi+1

as follows. . .
Discrete Pontryagin Principle

The KKT conditions are now equivalent to:


1. λi = ∇si Hi (si , qi , λi+1 ) (i = 0, . . . , N − 1) (adjoint equation)
2. λN = ∇sN E (sN ) (terminal condition on adjoints)
3. ∇qi Hi (si , qi , λi+1 ) = 0 (i = 0, . . . , N − 1) (minimum principle)
4. s0 = x0 (initial condition)
5. si+1 = fi (si , qi ) (i = 0, . . . , N − 1) (system dynamics)
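These conditions can be evaluated by a forward sweep for the states and a backward sweep for the adjoints. A minimal Python sketch (illustrative linear-quadratic data, not from the slides) uses this recursion to compute the gradient of the cost with respect to the controls; at an optimum, condition 3 says these gradients vanish.

```python
# Forward/backward sweeps of the discrete Pontryagin conditions for
# stage cost l_i = 1/2 (s^T Q s + q^T R q), dynamics s_{i+1} = Ad s_i + Bd q_i,
# and terminal cost E(s_N) = 1/2 s_N^T P_T s_N (illustrative data).
import numpy as np

N, nx, nu = 20, 2, 1
Ad = np.array([[1.0, 0.1], [0.0, 1.0]])
Bd = np.array([[0.005], [0.1]])
Q, R, P_T = np.eye(nx), np.eye(nu), 10.0 * np.eye(nx)
x0 = np.array([1.0, 0.0])
q = [np.zeros(nu) for _ in range(N)]          # some fixed (non-optimal) control guess

# forward sweep: system dynamics (conditions 4 and 5)
s = [x0]
for i in range(N):
    s.append(Ad @ s[i] + Bd @ q[i])

# backward sweep: adjoint recursion (conditions 1 and 2)
lam = [None] * (N + 1)
lam[N] = P_T @ s[N]                            # lambda_N = grad E(s_N)
for i in reversed(range(N)):
    lam[i] = Q @ s[i] + Ad.T @ lam[i + 1]      # lambda_i = grad_{s_i} H_i

# gradient of the cost w.r.t. each control (condition 3; zero only at the optimum)
grad_q = [R @ q[i] + Bd.T @ lam[i + 1] for i in range(N)]
print("norm of control gradient:", np.linalg.norm(np.concatenate(grad_q)))
```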
Summary

I Pontryagin’s Maximum Principle can be derived by in two


ways (HJB/Calc. of Variations)
I Controls must be explicitly derived. On singular arcs, need
higher order derivatives.
I Boundary Value Problem is well posed but double ODE is
usually unstable. Not easy to simulate forward.
I Solve BVP with single shooting, collocation, or multiple
shooting.
Literature

I A. E. Bryson and Y. C. Ho: Applied Optimal Control, Hemisphere/Wiley, 1975.
