
J. Math. Anal. Appl. 278 (2003) 293–307
www.elsevier.com/locate/jmaa

Mixed constrained control problems


Maria do Rosário de Pinho
Faculdade de Engenharia da Universidade do Porto, DEEC/ISR, Rua Dr. Roberto Frias s/n,
4200-465 Porto, Portugal
Received 19 November 2001
Submitted by H. Frankowska

Abstract
Necessary optimality conditions are derived in the form of a weak maximum principle for
optimal control problems with mixed state–control equality and inequality constraints. In contrast
to previous work, these conditions hold when the Jacobian of the active constraints, with respect
to the unconstrained control variable, has full rank. A feature of these conditions is that they are
stated in terms of a joint Clarke subdifferential. Furthermore, the use of the joint subdifferential
gives sufficiency for nonsmooth, normal, linear convex problems. The main point of interest is
not only the full rank condition assumption but also the nature of the analysis employed in this
paper. A key element is the removal of the constraints and the application of Ekeland's variational
principle.
© 2003 Elsevier Science (USA). All rights reserved.

Keywords: Optimal control; Mixed state and control constraints; Maximum principle

1. Introduction

In this paper we focus on the derivation of necessary conditions of optimality for certain
optimal control problems involving mixed state and control constraints in the form of both
inequalities and equalities under assumptions which are, in some sense, minimal.
The problem of interest is the following:

E-mail address: [email protected].

0022-247X/03/$ – see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S0022-247X(02)00351-7
(P)   Minimize  l(x(0), x(1)) + ∫₀¹ L(t, x(t), u(t), v(t)) dt
      subject to
        ẋ(t) = f(t, x(t), u(t), v(t))   a.e. t ∈ [0, 1],
        0 = b(t, x(t), u(t), v(t))      a.e. t ∈ [0, 1],
        0 ≥ g(t, x(t), u(t), v(t))      a.e. t ∈ [0, 1],
        v(t) ∈ V(t)                     a.e. t ∈ [0, 1],
        x(0) ∈ C0,
        x(1) ∈ C1,
where l : ℝⁿ × ℝⁿ → ℝ, L : [0, 1] × ℝⁿ × ℝ^{ku} × ℝ^{kv} → ℝ, f : [0, 1] × ℝⁿ × ℝ^{ku} × ℝ^{kv} → ℝⁿ,
b : [0, 1] × ℝⁿ × ℝ^{ku} × ℝ^{kv} → ℝ^{mb} and g : [0, 1] × ℝⁿ × ℝ^{ku} × ℝ^{kv} → ℝ^{mg} are given
functions, C0, C1 ⊂ ℝⁿ are given sets and V : [0, 1] → ℝ^{kv} is a given multifunction. We
set k = ku + kv and m = mb + mg. Throughout this paper we assume that ku ≥ m. The
control variable comprises two components, u and v. When no distinction between these
components is needed we refer to the control variable simply as w = (u, v) ∈ W(t),
where W : [0, 1] → ℝᵏ is a given multifunction.
For (P) a process is a triple (x, u, v) comprising measurable control functions u and
v and an absolutely continuous function x ∈ W^{1,1}([0, 1]; ℝⁿ) satisfying the constraints
of the problem. A process (x, u, v) of (P) is called a weak local minimizer if there
exists some ε > 0 such that it minimizes the cost over all processes of (P) which
satisfy (x(t), u(t), v(t)) ∈ Tε(t) for a.e. t ∈ [0, 1], where Tε(t) = (x̄(t), ū(t), v̄(t)) + εB,
B denotes the closed unit ball and (x̄, ū, v̄) is a reference process.
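To make the setting concrete, the following minimal sketch (in Python, with hypothetical dynamics and constraint functions that are not taken from the paper) checks on a time grid whether a discretized triple (x, u, v) is admissible for a problem of the form (P) and stays in the ε-tube Tε(t) around a reference process:

```python
import numpy as np

# Hypothetical instance of (P): f, b, g below are illustrative choices only.
def f(t, x, u, v): return u + v          # dynamics
def b(t, x, u, v): return u - v          # equality constraint, b = 0
def g(t, x, u, v): return x - 1.0        # inequality constraint, g <= 0

N, eps, tol = 200, 0.5, 1e-6
t = np.linspace(0.0, 1.0, N + 1)
u = v = 0.25 * np.ones(N + 1)            # b = u - v = 0 everywhere
x = 0.5 * t                              # solves x' = u + v = 0.5, x(0) = 0
xb, ub, vb = x, u, v                     # reference process (here: itself)

ok_dyn  = np.allclose(np.gradient(x, t), f(t, x, u, v), atol=1e-2)
ok_b    = np.all(np.abs(b(t, x, u, v)) <= tol)
ok_g    = np.all(g(t, x, u, v) <= tol)
ok_tube = np.all(np.abs(np.c_[x - xb, u - ub, v - vb]) <= eps)
print(ok_dyn, ok_b, ok_g, ok_tube)       # True True True True
```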
Assume that (x̄, ū, v̄) is an optimal process for (P) and let f̄(t), b̄(t), etc., denote the
corresponding functions evaluated at (t, x̄(t), ū(t), v̄(t)). Given two functions φ and ϕ,
[φ, ϕ](·) denotes the function (φ(·), ϕ(·)).
Necessary conditions in the form of weak maximum principles for (P) have previously
been derived assuming full rankness of a given matrix F(t), i.e., assuming that
det{F(t)F(t)ᵀ} ≥ L for a.e. t ∈ [0, 1], for some L > 0. The matrix F has been taken
to be Υ₁(t) = ∇u[b̄, ḡ](t) in [7] and [9],

Υ₂(t) = [ b̄u(t)        ]
        [ ḡu^{Iβ(t)}(t) ]

in [10], where Iβ(t) = {i ∈ {1, ..., mg}: ḡi(t) ≥ −β} and ḡu^{Iβ(t)}(t) denotes the matrix we
obtain after removing from ḡu(t) all the rows of index i ∉ Iβ(t), and

Υ₃(t) = [ b̄u(t)   0                            ]
        [ ḡu(t)   diag{−ḡi(t)}_{i∈{1,...,mg}}  ]

in [12]. The full rank conditions imposed on Υ₁, Υ₂ and Υ₃ are related to each other and
each is sufficient for the matrix

Υ₄(t) = [ b̄u(t)        ]                                                          (1)
        [ ḡu^{Ia(t)}(t) ]

to be of full rank (see [4]), where

Ia(t) = {i ∈ {1, ..., mg}: ḡi(t) = 0}                                              (2)

is the set of active constraints and, as before, ḡu^{Ia(t)}(t) denotes the matrix we obtain after
removing from ḡu(t) all the rows of index i ∉ Ia(t).
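The gap between these rank conditions can be seen numerically. The sketch below (hypothetical data, with mb = 0 so that no b̄u block appears) exhibits two parallel constraint gradients, only one of which is active: Υ₄ has full rank while the near-active matrix Υ₂ of [10] is singular.

```python
import numpy as np

beta = 0.1
g_u = np.array([[1.0, 0.0],    # gradient of g_1 w.r.t. u; g_1 is active
                [1.0, 0.0]])   # gradient of g_2; g_2 = -beta/2 is inactive
g_bar = np.array([0.0, -beta / 2.0])

I_a    = np.where(g_bar == 0.0)[0]      # active set I_a(t), see (2)
I_beta = np.where(g_bar >= -beta)[0]    # near-active set I_beta(t)

U4 = g_u[I_a]                           # Upsilon_4(t): active rows only
U2 = g_u[I_beta]                        # Upsilon_2(t): near-active rows

print(np.linalg.det(U4 @ U4.T))         # 1.0 -> full rank, (H6)-type bound
print(np.linalg.det(U2 @ U2.T))         # 0.0 -> the condition on U2 fails
```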
Matrix Υ₄(t) is of interest. In fact, necessary conditions in the form of weak maximum
principles are such that the derivative with respect to u of ḡi, for i ∉ Ia(t), plays no part
in the determination of the multipliers. It is then reasonable to conjecture
that necessary conditions of optimality for (P) hold when the full rankness condition
is imposed merely on the matrix Υ₄(t). In this paper we show that this is indeed the case.
A common approach to deriving necessary conditions for (P) under a full rankness condition
is to associate with (P) a problem with only equality mixed constraints. Such an approach
is no longer valid when full rank is imposed only on Υ₄(t). The key idea behind the proof of
our main result is a technique explored in [13] for optimal control problems with
pure state constraints. It consists in defining a sequence of standard optimal control
problems in which the constraints are incorporated both in the cost and in the dynamics.
As in [4], here we consider problems with nonsmooth data. The optimality conditions
obtained are stated in terms of a "joint" subdifferential co ∂_{x,p,u,v} Hλ(t, x, u, v, p, q, r).
This special feature is of interest for linear convex problems. Indeed, nonsmooth weak
maximum principles commonly stated in terms of co ∂x Hλ × co ∂p Hλ × co ∂u Hλ × co ∂v Hλ
can fail to provide sufficiency; an example is given in [6]. Here we show that our
necessary conditions, under a normality hypothesis, are sufficient for optimality in the
normal nonsmooth linear-convex case.

2. Preliminaries

Here and throughout, B represents the closed unit ball centered at the origin. The
notation r ≥ 0 means that each component ri of r is nonnegative. |·| = √⟨·, ·⟩
denotes the Euclidean norm, and |·| also denotes the induced matrix norm on ℝ^{m×k}. The
space W^{1,1}([0, 1]; ℝᵖ) is the space of absolutely continuous functions, and L¹([0, 1]; ℝᵖ) and
L∞([0, 1]; ℝᵖ) denote respectively the space of integrable functions and the space of
essentially bounded functions from [0, 1] to ℝᵖ. Since we assume only measurability of
the data with respect to t, a variant of a uniform implicit function theorem, derived in [3],
will be essential in our setup. This theorem asserts that if φ(t, x0(t), u0(t)) = 0 almost
everywhere, then an implicit function ϕ(t, u) exists and the same neighborhood of u0 can
be chosen for all t.
We make use of the following constructs from nonsmooth analysis: the limiting normal cone to a closed
set A at x, written NA(x), and the limiting subdifferential of a lower semicontinuous function f at x,
written ∂f(x). When the function f is Lipschitz continuous near x, the convex hull of
the limiting subdifferential, co ∂f(x), which may be defined directly, coincides with the
Clarke subdifferential. The full calculus for limiting subdifferentials and limiting normal
cones in finite dimensions can be found, for example, in [8,11,13]. Properties of Clarke's
subdifferentials (upper semicontinuity, sum rules, etc.) can be found in [1].

Consider the standard optimal control problem

(S)   Minimize  l(x(0), x(1)) + ∫₀¹ L(t, x(t), w(t)) dt
      subject to
        ẋ(t) = f(t, x(t), w(t))   a.e. t ∈ [0, 1],
        w(t) ∈ W(t)               a.e. t ∈ [0, 1],
        x(0) ∈ C0,
        x(1) ∈ C1,

where l, L, f, C0 and C1 are as defined before for (P) and W : [0, 1] → ℝᵏ is a given
multifunction. Assume that (x̄, w̄) is a reference process of (S) and ε > 0 a parameter. We
invoke the following hypotheses on the data of (S):

(H1) [L, f](·, x, w) is measurable for each (x, w) and [L, f](t, ·, ·) is Lipschitz continuous
with Lipschitz constant kf(t) on (x̄(t), w̄(t)) + εB for almost every t ∈ [0, 1],
where kf is an L¹-function.
(H2) The cost l is Lipschitz continuous on a neighborhood of (x̄(0), x̄(1)), and C0 and C1
are closed.
(H3) The multifunction W has Borel measurable graph and, for almost every t ∈ [0, 1],
the set Wε(t) := (w̄(t) + εB) ∩ W(t) is closed.

The following Euler–Lagrange inclusion for (S), proved in [2], will also be of
importance in our analysis.

Proposition 2.1. Let (x̄, w̄) denote a weak local minimizer for (S). If (H1)–(H3) are
satisfied and H(t, x, p, w) = p · f(t, x, w) − λL(t, x, w) defines the Hamiltonian, then
there exist λ ≥ 0, p ∈ W^{1,1}([0, 1]; ℝⁿ) and ζ ∈ L¹([0, 1]; ℝᵏ) such that, for almost every
t ∈ [0, 1],

λ + ‖p‖_{L∞} = 1,
(−ṗ(t), x̄̇(t), ζ(t)) ∈ co ∂H(t, x̄(t), p(t), w̄(t)),
ζ(t) ∈ co N_{W(t)}(w̄(t)),
(p(0), −p(1)) ∈ N_{C0×C1}(x̄(0), x̄(1)) + λ∂l(x̄(0), x̄(1)),

where ∂H denotes the limiting subdifferential in the (x, p, w) variables.
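As an illustration, the sketch below verifies the conclusions of Proposition 2.1 on a smooth toy instance of (S) (our own example, not from the paper): l(a, b) = b, L = w²/2, f = w, C0 = {0}, C1 = ℝ. The analytic extremal is w̄ ≡ −1 with constant adjoint; with λ = 1/2 and p ≡ −1/2 the normalization λ + ‖p‖_{L∞} = 1 also holds.

```python
import numpy as np

N = 200
t = np.linspace(0.0, 1.0, N + 1)
w = -np.ones(N + 1)            # candidate optimal control
x = -t                         # x' = w, x(0) = 0
lam = 0.5
p = -0.5 * np.ones(N + 1)      # adjoint: p' = -H_x = 0, p(1) = -lam*dl/dx1

# H(t,x,p,w) = p*w - lam*w**2/2; smooth case, so co dH reduces to gradients
H_x = np.zeros(N + 1)
H_p = w                        # must equal x'
H_w = p - lam * w              # must equal zeta = 0 since W(t) = R

assert np.allclose(np.gradient(p, t), -H_x)         # adjoint equation
assert np.allclose(np.gradient(x, t), H_p)          # primal dynamics
assert np.allclose(H_w, 0.0)                        # stationarity, zeta = 0
assert abs(lam + np.max(np.abs(p)) - 1.0) < 1e-12   # lam + ||p|| = 1
print("Proposition 2.1 conditions verified on the toy extremal")
```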

3. Main results

We now concentrate on (P). With reference to a process (x̄, ū, v̄) of (P) and a parameter
ε > 0, we invoke hypotheses (H1)–(H3) stated in the previous section (with w̄ = (ū, v̄) and
W(t) = ℝ^{ku} × V(t)), together with the following additional hypotheses:

(H4) b(·, x, u, v) and g(·, x, u, v) are measurable for each (x, u, v). There exists an L¹
function Lb,g such that, for almost every t ∈ [0, 1], [b, g](t, ·, ·, ·) is continuously
differentiable and Lipschitz continuous with Lipschitz constant Lb,g(t) on Tε(t).
(H5) There exists an increasing function θ̃ : ℝ⁺ → ℝ⁺, θ̃(s) ↓ 0 as s ↓ 0, such that, for all
(x′, u′, v′), (x, u, v) ∈ Tε(t) and for almost every t ∈ [0, 1],

|∇_{x,u,v}[b, g](t, x′, u′, v′) − ∇_{x,u,v}[b, g](t, x, u, v)| ≤ θ̃(|(x′, u′, v′) − (x, u, v)|).

There exists Kb,g > 0 such that, for almost every t ∈ [0, 1],

|∇x[b̄, ḡ](t)| + |∇_{u,v}[b̄, ḡ](t)| ≤ Kb,g.

(H6) There exists K > 0 such that, for almost every t ∈ [0, 1],

det{Υ₄(t)Υ₄(t)ᵀ} ≥ K,   where   Υ₄(t) = [ b̄u(t)        ].
                                        [ ḡu^{Ia(t)}(t) ]

Theorem 3.1. Let (x̄, ū, v̄) be a weak local minimizer for (P). If, for some ε > 0,
hypotheses (H1)–(H6) are satisfied and

Hλ(t, x, p, q, r, u, v) := p · f(t, x, u, v) + q · b(t, x, u, v) + r · g(t, x, u, v) − λL(t, x, u, v)

defines the Hamiltonian, then there exist p ∈ W^{1,1}([0, 1]; ℝⁿ), q ∈ L¹([0, 1]; ℝ^{mb}),
r ∈ L¹([0, 1]; ℝ^{mg}), ξ ∈ L¹([0, 1]; ℝ^{kv}) and λ ≥ 0 such that, for almost every t ∈ [0, 1],

(i) ‖p‖_{L∞} + λ ≠ 0,
(ii) (−ṗ(t), x̄̇(t), 0, ξ(t)) ∈ co ∂_{x,p,u,v} Hλ(t, x̄(t), p(t), q(t), r(t), ū(t), v̄(t)),
(iii) ξ(t) ∈ co N_{V(t)}(v̄(t)),
(iv) r(t) · g(t, x̄(t), ū(t), v̄(t)) = 0 and r(t) ≤ 0,
(v) (p(0), −p(1)) ∈ N_{C0}(x̄(0)) × N_{C1}(x̄(1)) + λ∂l(x̄(0), x̄(1)).

Furthermore, there exist an integrable function KQ and a constant CQ > 0 such that

|(q(t), r(t))| ≤ KQ(t)|p(t)| + CQ|p(1)|   for a.e. t ∈ [0, 1].

The novelty of the theorem is that, unlike in previously proved results (see [4,7,9,10,12]),
the conclusions hold when the full rank condition is imposed merely on the matrix Υ₄. Recall
that, as mentioned in the Introduction, full rankness of Υ₄ does not necessarily imply
full rankness of the other matrices used in the existing literature. Additionally, Theorem 3.1
provides necessary conditions of optimality for problem (P) with possibly nonsmooth data,
stated in terms of a "joint" subdifferential (see (ii)).
Necessary conditions of optimality in terms of a "joint" subdifferential like those of
Theorem 3.1, derived for standard optimal control problems and optimal control problems
with state constraints (see [2] and [5], respectively), are also sufficient conditions for linear-
convex problems in the normal form. This is also the case for linear-convex problems

with mixed constraints, as we show next. By linear-convex problems we mean problem (P) where

f(t, x(t), u(t), v(t)) = A(t)x(t) + B(t)u(t) + C(t)v(t),
b(t, x(t), u(t), v(t)) = D(t)x(t) + E(t)u(t) + F(t)v(t),
g(t, x(t), u(t), v(t)) = G(t)x(t) + J(t)u(t) + K(t)v(t),

and the following hypotheses hold:

(HC1) C0 and C1 are convex, and V(t) is convex for a.e. t ∈ [0, 1].
(HC2) The function t → L(t, x, u, v) is measurable, and l and (x, u, v) → L(t, x, u, v) are
convex.
(HC3) The functions A, D, G, E, F, J and K are integrable and B and C are measurable.

We denote such a problem by (LC).

Proposition 3.1. Let (x̄, ū, v̄) be a process for problem (LC). Assume that (HC1)–(HC3)
are in force and that (x̄, ū, v̄) is a normal extremal in the sense that there exist p ∈ W^{1,1}
and ζ, q, r ∈ L¹ such that the conditions

(−ṗ(t), x̄̇(t), 0, ζ(t)) ∈ co ∂Hλ(t, x̄(t), p(t), q(t), r(t), ū(t), v̄(t))   a.e.,     (3)
ζ(t) ∈ co N_{V(t)}(v̄(t))   a.e.,                                                   (4)
r(t) ≤ 0 and r(t) · (G(t)x̄(t) + J(t)ū(t) + K(t)v̄(t)) = 0   a.e.,                   (5)
(p(0), −p(1)) ∈ N_{C0×C1}(x̄(0), x̄(1)) + λ∂l(x̄(0), x̄(1))                           (6)

are satisfied with λ = 1. Then (x̄, ū, v̄) is a weak local minimizer.

In the proposition above, subdifferentials and normal cones are understood in the sense
of convex analysis.
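Before the proof, the following sketch (hypothetical matrices and multipliers, not from the paper) illustrates numerically the sign argument that drives sufficiency: with r ≤ 0 and complementary slackness as in (5), every admissible perturbation satisfies the inequality that appears as (7) in the proof below.

```python
import numpy as np

rng = np.random.default_rng(1)
G, J, K = rng.normal(size=(2, 2)), rng.normal(size=(2, 2)), rng.normal(size=(2, 1))

g_bar = np.array([0.0, -0.7])   # reference constraint values: g_1 active
r     = np.array([-1.0, 0.0])   # r <= 0 and r . g_bar = 0, as in (5)

for _ in range(1000):
    dx, du, dv = rng.normal(size=2), rng.normal(size=2), rng.normal(size=1)
    g_new = g_bar + G @ dx + J @ du + K @ dv   # constraint is affine in (x,u,v)
    if np.all(g_new <= 0.0):                   # admissible perturbation
        # r.(G dx + J du + K dv) = r.g_new - r.g_bar = r.g_new >= 0
        assert r @ (G @ dx + J @ du + K @ dv) >= -1e-12
print("sign condition (7) confirmed on sampled admissible perturbations")
```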

Proof. Here we follow the approach of the proof of Proposition 4.1 in [2]; details are
omitted.
Let (x, u, v) be an arbitrary admissible process and let (x̄, ū, v̄) be a normal
extremal for (LC). We compute the difference of the cost between (x, u, v) and (x̄, ū, v̄).
In doing so the following remarks will be of help. By definition of a process,
t → L(t, x(t), u(t), v(t)), t → B(t)u(t) and t → C(t)v(t) are integrable. By (5), and since
(x, u, v) is a process for (LC), we have

r · (G(x − x̄) + J(u − ū) + K(v − v̄)) ≥ 0   a.e.                                    (7)

From (6), there exists σ ∈ N_{C0×C1}(x̄(0), x̄(1)) such that

(p(0), −p(1)) − σ ∈ ∂l(x̄(0), x̄(1)).                                               (8)

Finally, recall also that for a generic convex function f we have

f(x) − f(x̄) − ζ · (x − x̄) ≥ 0,   ∀ζ ∈ ∂f(x̄).                                      (9)
Taking into account (3)–(7), the difference of the cost between (x, u, v) and (x̄, ū, v̄),

∆ = ∫₀¹ [L(t, x(t), u(t), v(t)) − L(t, x̄(t), ū(t), v̄(t))] dt + l(x(0), x(1)) − l(x̄(0), x̄(1)),

is computed as in the proof of Proposition 4.1 in [2], leading to

∆ ≥ l(x(0), x(1)) − l(x̄(0), x̄(1)) + ∫₀¹ (d/dt)[p · (x − x̄)] dt
  = l(x(0), x(1)) − l(x̄(0), x̄(1)) − (p(0), −p(1)) · (x(0) − x̄(0), x(1) − x̄(1))
    + σ · (x(0) − x̄(0), x(1) − x̄(1)) − σ · (x(0) − x̄(0), x(1) − x̄(1)).

Now (8) and (9) ensure that this last expression is nonnegative, proving the proposition. ✷

4. Proof of Theorem 3.1

Recall the definition of the set of active constraints Ia(t) (see (2)) and let Ic(t) =
{i ∈ {1, ..., mg}: ḡi(t) < 0} be its complement. Set qa(t) to be the cardinality of Ia(t)
and qc(t) the cardinality of Ic(t). Note that qa(t) + qc(t) = mg. For i ∈ {1, ..., mg}, define
δi : [0, 1] → ℝ by

δi(t) = { 1 if i ∈ Ia(t),
          0 if i ∈ Ic(t),                                                          (10)

and consider the following matrices:

∆(t) = diag{δi(t)}_{i=1}^{mg},   ∆′(t) = I − ∆(t),

Γx(t) = [ b̄x(t)     ],   Γu(t) = [ b̄u(t)     ],   Γv(t) = [ b̄v(t)     ],
        [ ∆(t)ḡx(t) ]            [ ∆(t)ḡu(t) ]            [ ∆(t)ḡv(t) ]

Γα(t) = [ 0     ],   Γβ(t) = [ 0    ],
        [ ∆′(t) ]            [ ∆(t) ]

and D(t) ∈ ℝ^{qc(t)×qc(t)} is the identity matrix.
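A direct way to read these definitions is to build the matrices from the active index set; the sketch below does so with hypothetical dimensions and random data standing in for the partial Jacobians.

```python
import numpy as np

mg, mb, n, ku, kv = 3, 1, 2, 4, 1
g_bar = np.array([0.0, -0.3, 0.0])        # components 1 and 3 active
delta = (g_bar == 0.0).astype(float)      # delta_i(t), see (10)
Delta  = np.diag(delta)                   # matrix Delta(t)
DeltaC = np.eye(mg) - Delta               # Delta'(t) = I - Delta(t)

rng = np.random.default_rng(0)
b_x, b_u, b_v = (rng.normal(size=(mb, s)) for s in (n, ku, kv))
g_x, g_u, g_v = (rng.normal(size=(mg, s)) for s in (n, ku, kv))

Gamma_x = np.vstack([b_x, Delta @ g_x])   # Gamma_x(t), m x n with m = mb + mg
Gamma_u = np.vstack([b_u, Delta @ g_u])   # Gamma_u(t), m x ku
Gamma_v = np.vstack([b_v, Delta @ g_v])   # Gamma_v(t), m x kv
Gamma_a = np.vstack([np.zeros((mb, mg)), DeltaC])   # Gamma_alpha(t)
Gamma_b = np.vstack([np.zeros((mb, mg)), Delta])    # Gamma_beta(t)
print(Gamma_u.shape)                      # (4, 4): m = 4 rows, ku = 4 columns
```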
Let β̄, ᾱ : [0, 1] → ℝ^{mg} be defined componentwise by β̄i(t) = −ḡi(t) and ᾱi(t) = 0.
The functions β̄ and ᾱ are measurable. Let α, β be measurable functions. Take (x̄, ū, v̄) to
be the process in Theorem 3.1 and ε > 0 the parameter. We prove the theorem in the case
L ≡ 0. This restriction is lifted by the use of well-known augmentation techniques and an
appeal to standard estimates on limiting subdifferentials; details are omitted.
The proof breaks into steps and consists of two major parts. First, a uniform implicit
function d is determined by applying the uniform implicit function theorem of [3] to the
active mixed constraints. This function

allows us to define a sequence of optimal control problems to which Ekeland's variational
principle applies.
Step 1. We apply a uniform implicit function theorem to a function µ : [0, 1] × ℝⁿ ×
ℝᵏ × ℝ^{mg} × ℝ^{mg} × ℝᵐ → ℝᵐ in order to obtain a "uniform" implicit function d.
Let µ be defined as

µ(t, ξ, (u, v), α, β, η) = ( b(t, x̄(t) + ξ, ū(t) + u + Γu(t)ᵀη, v̄(t) + v),
                             ∆(t)[g(t, x̄(t) + ξ, ū(t) + u + Γu(t)ᵀη, v̄(t) + v) + β̄(t) + β]
                             + Φ(t, α, η) ),                                       (11)

where Φ(t, α, η) = ∆′(t)(α + Γα(t)ᵀη). The function µ satisfies the conditions under which
Theorem 3.2 in [3] applies. Observe that, if the components of g are permuted in such a way
that the active constraints come first, we have

Γ(t) = (∂µ/∂η)(t, 0, 0, 0, 0, 0) = [ Υ₄(t)Υ₄(t)ᵀ   0          ],                   (12)
                                   [ 0             D(t)D(t)ᵀ  ]

where Υ₄(t) is as defined in (H6). It follows that there exists a constant C > 0 such that

|Γ(t)⁻¹| ≤ C                                                                       (13)

for almost every t ∈ [0, 1].
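The block structure (12) makes the uniform bound (13) transparent: under (H6) the upper-left block has determinant bounded away from zero and the lower-right block is the identity. A numerical sketch with a hypothetical Υ₄:

```python
import numpy as np

U4 = np.array([[1.0, 0.0, 0.5],
               [0.0, 2.0, 1.0]])            # full row rank, det(U4 U4^T) > 0
D = np.eye(1)                               # D(t): identity of size q_c(t) = 1

Gamma = np.block([[U4 @ U4.T,         np.zeros((2, 1))],
                  [np.zeros((1, 2)),  D @ D.T         ]])

K = np.linalg.det(U4 @ U4.T)                # the (H6) bound
C = np.linalg.norm(np.linalg.inv(Gamma), 2) # a valid constant in (13)
print(K, C)
```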
Theorem 3.2 in [3] ensures the existence of σ ∈ (0, ε), δ ∈ (0, ε) and a mapping
d : [0, 1] × (σB) × (σB) × (σB) × (σB) × (σB) → δB such that d(·, ξ, u, v, α, β) is
a measurable function for fixed (ξ, u, v, α, β), the functions {d(t, ·, ·, ·, ·, ·): t ∈ [0, 1]}
are Lipschitz continuous with a common Lipschitz constant, d(t, ·, ·, ·, ·, ·) is continuously
differentiable for fixed t, d(t, 0, 0, 0, 0, 0) = 0 for almost every t ∈ [0, 1] and, for all
(ξ, u, v, α, β) ∈ σB × σB × σB × σB × σB and for almost every t ∈ [0, 1],

µ(t, (ξ, u, v, α, β), d(t, ξ, u, v, α, β)) = 0,                                    (14)
∇_{ξ,u,v,α,β} d(t, 0, 0, 0, 0, 0) = −[Γ(t)]⁻¹ (Γx(t), Γu(t), Γv(t), Γα(t), Γβ(t)).  (15)
Choose σ₁, δ₁ > 0 such that

σ₁ ∈ (0, min{σ, ε/2}),   δ₁ ∈ (0, min{δ, ε/2}),   σ₁ + Kb,g δ₁ ∈ (0, ε/2),         (16)

where Kb,g is given by (H5).
Step 2. We now define an optimization problem to which Ekeland's theorem applies.
Define the functions

g⁺(t, x, u, v) = max{0, g₁(t, x, u, v), ..., g_{mg}(t, x, u, v)},
f₁(t, x, u, v, α, β) = f(t, x, u + Γu(t)ᵀ d̃(t), v),
f₂(t, x, u, v, α, β) = g⁺(t, x, u + Γu(t)ᵀ d̃(t), v),

where d̃(t) = d(t, x − x̄(t), u − ū(t), v − v̄(t), α − ᾱ(t), β − β̄(t)). Define also the sets
B = {ξ ∈ ℝ^{mg}: ξi ≥ 0, i ∈ {1, ..., mg}}, Bσ₁(t) = B ∩ (β̄(t) + σ₁B) and Vσ₁(t) = V(t) ∩
(v̄(t) + σ₁B).
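The role of g⁺ is that of an exact violation measure: it vanishes precisely where all inequality constraints hold, so z(1) = ∫₀¹ f₂ dt aggregates the constraint violation along a trajectory. A two-constraint sketch (hypothetical g, and with the implicit-function correction suppressed, i.e. d̃ ≡ 0):

```python
import numpy as np

def g(t, x, u, v):                     # hypothetical constraints, mg = 2
    return np.array([x + u - 1.0, -v])

def g_plus(t, x, u, v):                # g+(t,x,u,v) = max{0, g_1, ..., g_mg}
    return max(0.0, *g(t, x, u, v))

print(g_plus(0.0, 0.2, 0.3, 0.1))      # 0.0 -> both constraints satisfied
print(g_plus(0.0, 0.9, 0.4, 0.1))      # 0.3 -> g_1 violated by 0.3
```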
Consider a sequence of positive scalars {εk}_{k∈ℕ} such that lim_{k→∞} εk = 0 and set

Ψk(x, y, x′, y′, z) = max{ l(x, y) − l(x̄(0), x̄(1)) + εk², z + |x′ − y′| }.

Take W to be the set of all measurable functions (u, v, α, β) and all vectors (a, b, c, e, h) ∈
ℝⁿ × ℝⁿ × ℝⁿ × ℝⁿ × ℝ such that, for almost every t ∈ [0, 1], (u(t), α(t)) ∈ (ū(t) + σ₁B) ×
σ₁B, (v(t), β(t)) ∈ Vσ₁(t) × Bσ₁(t), (a, b) ∈ C0 × C1, and for which there exist absolutely
continuous functions x, y and z such that

ẋ(t) = f₁(t, x, u, v, α, β)   a.e.,
ẏ(t) = 0   a.e.,
ż(t) = f₂(t, x, u, v, α, β)   a.e.,
(x(t), y(t), z(t)) ∈ (x̄(t), x̄(1), 0) + σ₁B   a.e.,
(x(0), y(0), z(0)) = (a, b, 0),
(x(1), y(1), z(1)) = (c, e, h).
To simplify the notation set E = (a, b, c, e, h) ∈ ℝ^{4n+1}. Let

|E − E′| = |a − a′| + |b − b′| + |c − c′| + |e − e′| + |h − h′|

and

ν((u, v, α, β), (u′, v′, α′, β′)) = ∫₀¹ |u(t) − u′(t)| dt + ∫₀¹ |v(t) − v′(t)| dt
                                    + ∫₀¹ |α(t) − α′(t)| dt + ∫₀¹ |β(t) − β′(t)| dt.

Define dW : W × W → ℝ by

dW((u, v, α, β, E), (u′, v′, α′, β′, E′)) = ν((u, v, α, β), (u′, v′, α′, β′)) + |E − E′|.

Consider the sequence of optimization problems

(Rk)   Minimize Jk(u, v, α, β, E)
       subject to (u, v, α, β, E) ∈ W,

where Jk(u, v, α, β, E) = Ψk(x(0), y(0), x(1), y(1), z(1)). Observe that dW defines a
metric on W and, with respect to this metric, W is a complete metric space and
the function (u, v, α, β, E) → Jk(u, v, α, β, E) is continuous on (W, dW).
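A quick sanity check that dW is indeed a metric on a discretized version of W (hypothetical grid and random samples; the triangle inequality is inherited from the L¹ and ℓ¹ norms):

```python
import numpy as np

N = 101
dt = 1.0 / (N - 1)
rng = np.random.default_rng(2)

def nu(c1, c2):    # c = (u, v, alpha, beta), each sampled on the time grid
    return sum(np.sum(np.abs(a - b)) * dt for a, b in zip(c1, c2))

def d_W(P1, P2):   # P = (controls, E), E in R^{4n+1} (here n = 1)
    (c1, E1), (c2, E2) = P1, P2
    return nu(c1, c2) + np.sum(np.abs(E1 - E2))

def sample():
    return (tuple(rng.normal(size=N) for _ in range(4)), rng.normal(size=5))

for _ in range(100):
    P, Q, R = sample(), sample(), sample()
    assert d_W(P, R) <= d_W(P, Q) + d_W(Q, R) + 1e-12
print("triangle inequality holds on all sampled triples")
```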
Let Ē = (x̄(0), x̄(1), x̄(1), x̄(1), 0). For all k ∈ ℕ, Jk(u, v, α, β, E) ≥ 0 and Jk(ū, v̄, ᾱ,
β̄, Ē) = εk². It follows that (ū, v̄, ᾱ, β̄, Ē) is an "εk²-minimizer" for (Rk). According to
Ekeland's variational principle (see [13]), there exists a sequence (uk, vk, αk, βk, Ek) ∈ W
such that, for each k ∈ ℕ,

dW((uk, vk, αk, βk, Ek), (ū, v̄, ᾱ, β̄, Ē)) ≤ εk                                     (17)

and (uk, vk, αk, βk, Ek) minimizes the perturbed cost Jk(u, v, α, β, E) +
εk dW((uk, vk, αk, βk, Ek), (u, v, α, β, E)) over all (u, v, α, β, E) ∈ W.
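On a finite metric space Ekeland's principle can be made constructive, which may help in seeing how it is used here: starting from an εk²-minimizer, one repeatedly jumps to the minimizer of the ε-perturbed cost; each jump lowers J by at least ε times the distance moved, so the total displacement is at most ε, and the final point minimizes its own perturbed cost, exactly as (uk, vk, αk, βk, Ek) does above. A sketch with hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(3)
pts = rng.uniform(size=(50, 2))                 # a finite "space W"
J   = rng.uniform(size=50)                      # a cost with J >= 0
eps = 0.3
d = lambda i, j: float(np.linalg.norm(pts[i] - pts[j]))

i0 = int(np.where(J <= J.min() + eps**2)[0][-1])  # an eps^2-minimizer
k = i0
while True:
    vals = J + eps * np.array([d(j, k) for j in range(len(J))])
    nxt = int(np.argmin(vals))
    if vals[nxt] >= J[k] - 1e-15:               # k minimizes perturbed cost
        break
    k = nxt                                     # strict decrease of J

assert d(k, i0) <= eps + 1e-12 and J[k] <= J[i0]
print("Ekeland point", k, "at distance", round(d(k, i0), 3), "from w0")
```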
Step 3. Rewriting the conclusions of Ekeland's theorem in control-theoretic terms, we
obtain a sequence of standard optimal control problems.
Write (xk, yk, zk) for the trajectory corresponding to (uk, vk, αk, βk, Ek). For each k ∈ ℕ,
the process (xk, yk, zk, w1 ≡ 0, w2 ≡ 0, w3 ≡ 0, w4 ≡ 0, uk, vk, αk, βk) solves the control
problem (Ck):

Minimize  Ψk(x(0), y(0), x(1), y(1), z(1)) + εk|x(0) − xk(0)| + εk|x(1) − xk(1)|
          + εk|y(0) − yk(0)| + εk|y(1) − yk(1)| + εk|z(1) − zk(1)|
          + εk w1(1) + εk w2(1) + εk w3(1) + εk w4(1)
subject to
  ẋ(t) = f₁(t, x, u, v, α, β),
  ẏ(t) = 0,
  ż(t) = f₂(t, x, u, v, α, β),
  ẇ1(t) = |u(t) − uk(t)|,   ẇ2(t) = |v(t) − vk(t)|,
  ẇ3(t) = |α(t) − αk(t)|,   ẇ4(t) = |β(t) − βk(t)|,
  (x(t), y(t), z(t)) ∈ (x̄(t), x̄(1), 0) + σ₁B,
  (u(t), v(t), α(t), β(t)) ∈ (ū(t) + σ₁B) × Vσ₁(t) × σ₁B × Bσ₁(t),
  (x(0), y(0), z(0)) ∈ C0 × C1 × {0},
  (w1(0), w2(0), w3(0), w4(0)) = (0, 0, 0, 0),

where all the equalities and inclusions but the last two are to be understood in an almost
everywhere sense.
Since εk → 0, we have from (17) that (xk, yk, zk) → (x̄, x̄(1), 0) uniformly. By
discarding initial terms of the sequence, if necessary, we may assume that (xk(t), yk(t), zk(t)) ∈
(x̄(t), x̄(1), 0) + (σ₁/2)B for all k. Then (xk, yk, zk, w1 ≡ 0, w2 ≡ 0, w3 ≡ 0, w4 ≡
0, uk, vk, αk, βk) is a weak local minimizer of the variant of (Ck) obtained by dropping the
state constraint "(x(t), y(t), z(t)) ∈ (x̄(t), x̄(1), 0) + σ₁B."
Step 4. For a certain subsequence of the problems (Ck) we obtain necessary conditions.

Lemma 4.1. Let (xk, yk, zk, 0, 0, 0, 0, uk, vk, αk, βk) be a weak local minimizer for
problem (Ck). Set H(t, x, p, r, u, v, α, β) = p · f₁(t, x, u, v, α, β) + r · f₂(t, x, u, v, α, β).
Then there exist scalars λk, ηk and rk, vectors qk, ek ∈ ℝⁿ, integrable functions
ξk : [0, 1] → ℝ^{kv}, ζk : [0, 1] → ℝ^{mg} and an absolutely continuous function pk ∈ W^{1,1} such
that:

(a) λk + ‖pk‖∞ + |qk| + |rk| + 4λkεk = 1,
(b) λk ≥ 0, ηk ∈ [0, 1], |ek| = 1,
(c) (−pk(1) − λk(1 − ηk)ek, −qk + λk(1 − ηk)ek, −rk − λk(1 − ηk)) ∈ λkεk(B × B × B),
(d) (pk(0), qk) ∈ N_{C0}(xk(0)) × N_{C1}(yk) + λkηk ∂l(xk(0), yk(0)) + λkεk(B × B),
(e) ζk(t) ∈ co N_{Bσ₁(t)}(βk(t)), ξk(t) ∈ co N_{Vσ₁(t)}(vk(t)) a.e.,
(f) (−ṗk(t), ẋk(t), żk(t), 0, ξk(t), 0, ζk(t))
    ∈ co ∂H(t, xk(t), pk(t), rk, uk(t), vk(t), αk(t), βk(t))
      + λkεk({0} × {0} × {0} × B × B × B × B) a.e.

Proof. We only give a sketch of the proof; for details see [6]. It is a simple matter to verify
that the hypotheses under which Proposition 2.1 applies to (Ck) are satisfied. The set of
necessary conditions we obtain by applying Proposition 2.1 to (Ck) can be rewritten as in
the lemma by observing the following:

(i) The Hamiltonian for (Ck) is

h(t, x, y, z, w1, w2, w3, w4, p, q, r, π1, π2, π3, π4, u, v, α, β)
  = H(t, x, p, r, u, v, α, β) + q · 0 + π1|u − uk| + π2|v − vk|
    + π3|α − αk| + π4|β − βk|.

(ii) For k sufficiently large, it can easily be shown that

Ψk(xk(0), yk, xk(1), yk, zk(1)) > 0.

(iii) The max rule guarantees that there exists ηk ∈ [0, 1] such that

∂Ψk(xk(0), yk, xk(1), yk, zk(1))
  ⊂ ηk ∂l(xk(0), yk(0)) × {(0, 0, 0)}
    + (1 − ηk) {(0, 0)} × {(e, −e): e ∈ ℝⁿ, |e| = 1} × {1}. ✷

Step 5. We now let εk → 0 and take limits to obtain necessary conditions for (P).
Recall that (xk, yk, zk) → (x̄, x̄(1), 0) uniformly. Since (uk, vk, αk, βk) → (ū, v̄, 0, β̄)
strongly in L¹, we can arrange, by subsequence extraction, that (uk, vk, αk, βk) →
(ū, v̄, 0, β̄) almost everywhere. It also follows that d̃k(t) → 0, where

d̃k(t) = d(t, xk(t) − x̄(t), uk(t) − ū(t), vk(t) − v̄(t), αk(t) − ᾱ(t), βk(t) − β̄(t)),

and, consequently, ũk(t) = uk(t) + Γu(t)ᵀ d̃k(t) → ū(t) almost everywhere.
The sequences {ek} and {ηk} are uniformly bounded by (b) of Lemma 4.1. Since
εk → 0, we conclude from (a) of Lemma 4.1 that λk is also uniformly bounded. We
can therefore arrange, again by subsequence extraction if necessary, that ek → e, λk → λ
and ηk → η, where |e| = 1, λ ≥ 0 and η ∈ [0, 1].
Now the sequences {pk}, {t → ∫₀ᵗ ξk ds} and {t → ∫₀ᵗ ζk ds} are equicontinuous and
uniformly bounded, and {ṗk} is uniformly integrably bounded. Standard compactness
arguments and an appeal to the Dunford–Pettis criterion for L¹ compactness ensure that, by
further extraction of subsequences if necessary, pk → p uniformly, ∫₀ᵗ ζk ds → ∫₀ᵗ ζ ds
uniformly and ∫₀ᵗ ξk ds → ∫₀ᵗ ξ ds uniformly for some p ∈ W^{1,1} and ξ, ζ ∈ L¹, and ṗk → ṗ,
ξk → ξ and ζk → ζ weakly in L¹.
It now follows from the above and from (a), (c) and (d) of Lemma 4.1 that rk → r =
−λ(1 − η), qk → q = λ(1 − η)e, p(1) = −λ(1 − η)e,

(p(0), −p(1)) ∈ N_{C0}(x̄(0)) × N_{C1}(x̄(1)) + λη ∂l(x̄(0), x̄(1)),                   (18)

and λ + ‖p‖∞ + 2|p(1)| = 1 (observe that p(1) = −q). Since |p(1)| = λ(1 − η), we
get λη + 3|p(1)| + ‖p‖∞ = 1, a condition which ensures that λ̃ + ‖p‖∞ ≠ 0, where
λ̃ = λη ≥ 0.
The properties of the limiting normal cone and limiting subdifferential and an
application of Theorem 3.1.7 of [1] allow us to pass to the limit in relationships (e) and (f)
of Lemma 4.1. There results

(−ṗ(t), x̄̇(t), 0, 0, ξ(t), 0, ζ(t)) ∈ co ∂H(t, x̄(t), p(t), r, ū(t), v̄(t), 0, β̄(t))   (19)

and

ζ(t) ∈ co N_{Bσ₁(t)}(β̄(t)),   ξ(t) ∈ co N_{Vσ₁(t)}(v̄(t)).                           (20)

We conclude that there exist scalars λ̃ ≥ 0 and r ≤ 0 and functions p ∈ W^{1,1}, ζ, ξ ∈ L¹
such that

(A′) λ̃ + ‖p‖∞ ≠ 0,
(B′) (−ṗ(t), x̄̇(t), 0, 0, ξ(t), 0, ζ(t)) ∈ co ∂H(t, x̄(t), p(t), r, ū(t), v̄(t), 0, β̄(t)) a.e.,
(C′) ζ(t) ∈ co N_{Bσ₁(t)}(β̄(t)) a.e.,
(D′) ξ(t) ∈ co N_{Vσ₁(t)}(v̄(t)) a.e.,
(E′) (p(0), −p(1)) ∈ N_{C0}(x̄(0)) × N_{C1}(x̄(1)) + λ̃ ∂l(x̄(0), x̄(1)),

where H(t, x, p, r, u, v, α, β) = p · f₁(t, x, u, v, α, β) + r f₂(t, x, u, v, α, β).


Step 6. Finally we rewrite relationships (A′)–(E′) in the required form.
Observe that

N_{Bσ₁(t)}(β̄(t)) = {θ ∈ ℝ^{mg}: θi = 0 if i ∈ Ic(t), θi ≤ 0 if i ∈ Ia(t)}.

Then, from (C′) and the definition of β̄, we deduce that

ζi(t) = 0 if i ∈ Ic(t),   ζi(t) ≤ 0 if i ∈ Ia(t).                                  (21)
We deduce from the max rule (applied to f₂), the chain rule (see [1]) and the differentiability
properties of d the following estimate for co ∂H:
co ∂H(t, x, p, r, u, v, α, β)
  ⊂ { ( µ − ρΓu(t)ᵀΓ(t)⁻¹Γx(t) + r γ(t) · gx(t) − r(γ(t) · gu(t))Γu(t)ᵀΓ(t)⁻¹Γx(t),
        ν,
        0,
        ρ − ρΓu(t)ᵀΓ(t)⁻¹Γu(t) + r γ(t) · gu(t) − r(γ(t) · gu(t))Γu(t)ᵀΓ(t)⁻¹Γu(t),
        I − ρΓu(t)ᵀΓ(t)⁻¹Γv(t) + r γ(t) · gv(t) − r(γ(t) · gu(t))Γu(t)ᵀΓ(t)⁻¹Γv(t),
        −ρΓu(t)ᵀΓ(t)⁻¹Γα(t) − r(γ(t) · gu(t))Γu(t)ᵀΓ(t)⁻¹Γα(t),
        −ρΓu(t)ᵀΓ(t)⁻¹Γβ(t) − r(γ(t) · gu(t))Γu(t)ᵀΓ(t)⁻¹Γβ(t) ):
      (µ, ν, 0, ρ, I) ∈ co ∂(p · f), γ = (γ₁, ..., γ_{mg}) measurable,
      γi(t) ∈ [0, 1], γi(t) ≥ 0 if ḡi(t) = 0, γi(t) = 0 if ḡi(t) < 0 }

in which f, gi, gx, etc., are evaluated at

(t, x, u + Γu(t)ᵀ d(t, x − x̄(t), u − ū(t), v − v̄(t), α − ᾱ(t), β − β̄(t)), v).
Appealing to an appropriate selection theorem, we deduce the existence of measurable
functions

(µ(t), ν(t), 0, ρ(t), I(t)) ∈ co ∂_{x,p,r,u,v}(p · f)   a.e.,                       (22)

γ = (γ₁, ..., γ_{mg}): γi(t) ∈ [0, 1] and { γi(t) ≥ 0 if ḡi(t) = 0,
                                            γi(t) = 0 if ḡi(t) < 0 }   a.e.,       (23)

ζ(t) ∈ co N_{Bσ₁(t)}(β̄(t))   a.e.,                                                 (24)

ξ(t) ∈ co N_{Vσ₁(t)}(v̄(t))   a.e.,                                                 (25)
such that

(−ṗ(t), x̄̇(t), 0, 0, ξ(t), 0, ζ(t))
  = ( µ − ρΓu(t)ᵀΓ(t)⁻¹Γx(t) + r γ(t) · ḡx(t) − r(γ(t) · ḡu(t))Γu(t)ᵀΓ(t)⁻¹Γx(t),
      ν,
      0,
      ρ − ρΓu(t)ᵀΓ(t)⁻¹Γu(t) + r γ(t) · ḡu(t) − r(γ(t) · ḡu(t))Γu(t)ᵀΓ(t)⁻¹Γu(t),
      I − ρΓu(t)ᵀΓ(t)⁻¹Γv(t) + r γ(t) · ḡv(t) − r(γ(t) · ḡu(t))Γu(t)ᵀΓ(t)⁻¹Γv(t),
      −ρΓu(t)ᵀΓ(t)⁻¹Γα(t) − r(γ(t) · ḡu(t))Γu(t)ᵀΓ(t)⁻¹Γα(t),
      −ρΓu(t)ᵀΓ(t)⁻¹Γβ(t) − r(γ(t) · ḡu(t))Γu(t)ᵀΓ(t)⁻¹Γβ(t) )                      (26)
in which f, g, gx, etc., are evaluated at (t, x̄(t), ū(t), v̄(t)) (observe that d(t, 0, 0, 0, 0, 0) =
0). Under the hypotheses, µ, ν, ρ, I, γ and ζ are all integrable functions.
Define Q̃(t) = −Γu(t)ᵀΓ(t)⁻¹ ∈ ℝ^{ku×m} and r̃(t) = rγ(t) ∈ ℝ^{mg}. Then r̃ ≤ 0. Take Q
to be the L¹ function

Q(t) = (q̂₁(t), q̂₂(t)) = ρ(t)Q̃(t) + r̃(t)ḡu(t)Q̃(t).

In terms of Q and r̃, (26) becomes

(−ṗ(t), x̄̇(t), 0, 0, ξ(t), 0, ζ(t))
  = ( µ + Q(t)Γx(t) + r̃(t)ḡx(t), ν, 0, ρ + Q(t)Γu(t) + r̃(t)ḡu(t),
      I + Q(t)Γv(t) + r̃(t)ḡv(t), Q(t)Γα(t), Q(t)Γβ(t) ).                            (27)

Since r ≤ 0, by definition of γ we conclude that r̃i(t) = 0 if i ∈ Ic(t) and r̃i(t) ≤ 0 if
i ∈ Ia(t). Furthermore, by definition of Γα(t) and Γβ(t) and by (21) and (27), we deduce that
q̂2i(t) = 0 if i ∈ Ic(t) and q̂2i(t) ≤ 0 if i ∈ Ia(t). Set q(t) = q̂₁(t) and r(t) = r̃(t) + q̂₂(t).
Then

ri(t) = 0 if gi(t, x̄(t), ū(t), v̄(t)) < 0,
ri(t) ≤ 0 if gi(t, x̄(t), ū(t), v̄(t)) = 0.                                           (28)
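The bookkeeping above can be traced numerically. In the sketch below (hypothetical data, with the active constraint listed first so that Γ(t) = ΓuΓuᵀ + diag(0, ..., 0, 1, ..., 1) reproduces the block matrix (12)), we form Q̃, Q = (q̂₁, q̂₂) and the multipliers q and r, and confirm the structural identities QΓα = q̂₂∆′ and QΓβ = q̂₂∆ that underlie the sign deductions:

```python
import numpy as np

mb, mg, ku = 1, 2, 3
rng = np.random.default_rng(4)

delta = np.array([1.0, 0.0])               # g_1 active, g_2 inactive
Delta, DeltaC = np.diag(delta), np.eye(mg) - np.diag(delta)

b_u, g_u = rng.normal(size=(mb, ku)), rng.normal(size=(mg, ku))
Gamma_u = np.vstack([b_u, Delta @ g_u])    # last row is zero (inactive g_2)
Gamma = Gamma_u @ Gamma_u.T + np.diag(np.r_[np.zeros(mb), 1.0 - delta])

Q_tilde = -Gamma_u.T @ np.linalg.inv(Gamma)          # ku x m,  m = mb + mg
rho = rng.normal(size=ku)
r_tilde = np.array([-0.7, 0.0])            # r~ <= 0, zero off the active set

Q = rho @ Q_tilde + (r_tilde @ g_u) @ Q_tilde        # Q = rho Q~ + r~ g_u Q~
q_hat1, q_hat2 = Q[:mb], Q[mb:]
q, r = q_hat1, r_tilde + q_hat2            # multipliers q(t), r(t) of Thm 3.1

Gamma_a = np.vstack([np.zeros((mb, mg)), DeltaC])
Gamma_b = np.vstack([np.zeros((mb, mg)), Delta])
assert np.allclose(Q @ Gamma_a, q_hat2 @ DeltaC)     # alpha-slot identity
assert np.allclose(Q @ Gamma_b, q_hat2 @ Delta)      # beta-slot identity
print(q, r)
```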
Since b and g are smooth, we conclude from (27) that

(−ṗ(t), x̄̇(t), 0, ξ(t)) ∈ co ∂_{x,p,u,v}{p · f(t, x, u, v) + q · b(t, x, u, v) + r · g(t, x, u, v)}   (29)

in which the subdifferential is evaluated at (t, x̄(t), ū(t), v̄(t)). The multipliers λ̃, p, q and r
obey the stated relationships: (A′), (D′) and (E′) are respectively (i), (iii) and (v) of the
theorem with λ = λ̃, (28) yields (iv), and (29) coincides with (ii) when L ≡ 0. As for the
final assertion, the estimate

|(q(t), r(t))| ≤ kf(t)Kb,g C|p(t)| + (1 + Kb,g² C)|p(1)|

follows from the definitions of q and r, (H1), (H4), (13) and (22). The proof is complete.

Acknowledgment

The financial support from Projecto FCT PRAXIS C/EEI/14156/1998 is gratefully acknowledged.

References

[1] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[2] M.d.R. de Pinho, R.B. Vinter, An Euler–Lagrange inclusion for optimal control problems, IEEE Trans.
Automat. Control AC-40 (1995) 1191–1198.
[3] M.d.R. de Pinho, R.B. Vinter, Necessary conditions for optimal control problems involving nonlinear
differential algebraic equations, J. Math. Anal. Appl. 212 (1997) 493–516.

[4] M.d.R. de Pinho, A. Ilchmann, Weak maximum principle for optimal control problems with mixed
constraints, Nonlinear Anal. 48 (2002) 1179–1196.
[5] M.d.R. de Pinho, M.M.A. Ferreira, F.A.C.C. Fontes, An Euler–Lagrange inclusion for optimal control
problems with state constraints, J. Dynam. Control Systems 8 (2002) 23–45.
[6] M.d.R. de Pinho, A report on optimality conditions for optimal control problems with mixed constraints,
Internal Report, FEUP, ISR, Porto, Portugal, 2001.
[7] M.R. Hestenes, Calculus of Variations and Optimal Control Theory, Wiley, New York, 1966.
[8] B.S. Mordukhovich, Generalized differential calculus for nonsmooth and set-valued mappings, J. Math.
Anal. Appl. 183 (1994) 250–288.
[9] L.W. Neustadt, Optimization, A Theory of Necessary Conditions, Princeton University Press, Princeton, NJ,
1976.
[10] N.P. Osmolovskii, Second order conditions for a weak local minimum in an optimal control problem
(necessity and sufficiency), Soviet Math. Dokl. 16 (1975) 1480–1484.
[11] R.T. Rockafellar, B. Wets, Variational Analysis, Springer-Verlag, Berlin, 1998.
[12] Z. Pales, V. Zeidan, First- and second-order necessary conditions for control problems with constraints,
Trans. Amer. Math. Soc. 346 (1994) 421–453.
[13] R.B. Vinter, Optimal Control, Birkhäuser, Boston, 2002.
