J Glob Optim (2011) 51:11–26
DOI 10.1007/s10898-010-9616-7
Inexact Halpern-type proximal point algorithm
O. A. Boikanyo · G. Moroşanu
Received: 29 January 2010 / Accepted: 17 September 2010 / Published online: 30 September 2010
© Springer Science+Business Media, LLC. 2010
Abstract We present several strong convergence results for the modified, Halpern-type,
proximal point algorithm xn+1 = αn u + (1 − αn )Jβn xn + en (n = 0, 1, . . .; u, x0 ∈ H
given, and Jβn = (I + βn A)−1 , for a maximal monotone operator A) in a real Hilbert space,
under new sets of conditions on αn ∈ (0, 1) and βn ∈ (0, ∞). These conditions are weaker
than those known to us and our results extend and improve some recent results such as those
of H. K. Xu. We also show how to apply our results to approximate minimizers of convex
functionals. In addition, we give convergence rate estimates for a sequence approximating
the minimum value of such a functional.
Keywords Proximal point algorithm · prox-Tikhonov algorithm · Monotone operator ·
Control conditions · Strong convergence · Convex function · Minimizer · Minimum value
Mathematics Subject Classification (2000)
47J25 · 47H05 · 47H09
1 Introduction
Let H be a real Hilbert space with inner product ⟨·, ·⟩ and norm ‖·‖. A map T : H → H is
said to be nonexpansive if for every x, y ∈ H the inequality ‖T x − T y‖ ≤ ‖x − y‖ holds.
If ‖T x − T y‖ ≤ a‖x − y‖ holds for some a ∈ (0, 1), then T is said to be a
contraction with Lipschitz constant a. We recall that a mapping A : D(A) ⊂ H → 2^H is
said to be a monotone operator if
⟨x − x′, y − y′⟩ ≥ 0, ∀ (x, y), (x′, y′) ∈ A.
O. A. Boikanyo (B) · G. Moroşanu
Department of Mathematics and its Applications, Central European University,
Nador u. 9, 1051 Budapest, Hungary
In other words, the graph of A, G(A) = {(x, y) ∈ H × H : x ∈ D(A), y ∈ Ax} is a
monotone subset of H × H . The operator A is said to be maximal monotone if its graph is
not properly contained in the graph of any other monotone operator. It is well known that
if A is maximal monotone and β > 0, then the resolvent of A, the operator Jβ : H → H
defined by Jβ (x) = (I + β A)−1 (x), is single-valued and nonexpansive (see, e.g. [13]).
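For readers who want to see a resolvent concretely, here is a minimal sketch (an illustration of ours, not taken from the paper): in H = R, for A = ∂ϕ with ϕ(x) = |x|, the resolvent Jβ = (I + βA)⁻¹ is the soft-thresholding map, which is indeed single-valued and nonexpansive.

```python
import numpy as np

# Illustrative only: resolvent of A = subdifferential of |.| on H = R (soft thresholding).
def resolvent_abs(x, beta):
    """J_beta(x) = argmin_y |y| + (y - x)**2 / (2*beta)."""
    return float(np.sign(x)) * max(abs(x) - beta, 0.0)

# Spot-check nonexpansiveness: |J(x) - J(y)| <= |x - y|.
for x, y in [(3.0, -1.0), (0.5, 0.2), (-2.0, 4.0)]:
    assert abs(resolvent_abs(x, 1.5) - resolvent_abs(y, 1.5)) <= abs(x - y) + 1e-12
```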
For a fixed u ∈ H and t ∈ (0, 1), let z t denote the fixed point of the contraction Tt given
by the rule x → tu + (1 − t)T x, i.e.,
z t = tu + (1 − t)T z t .
(1)
The strong convergence of z t to a fixed point of T was proved in 1967 by Browder [3]. This
result of Browder has been widely used in the theory of fixed points and extended in different
directions by several authors. Motivated by Browder’s (implicit) convergence result, Halpern
[7] considered the (explicit) iteration
xn+1 = αn u + (1 − αn )T xn , for any u, x0 ∈ H with αn ∈ (0, 1) and all n ≥ 0,
(2)
in a Hilbert space and proved that under certain assumptions on αn , the sequence {xn } given
by the iterative process (2) is strongly convergent, and the limit is the point of F(T ) = {x ∈
H | T x = x} which is nearest to u. Later, Lions [12] proved the strong convergence of (2)
still in a Hilbert space under the control conditions
(C1) lim_{n→∞} αn = 0,   (C2) ∑_{n=0}^∞ αn = ∞   and   (C3) lim_{n→∞} (αn+1 − αn)/(αn+1)² = 0.
Unfortunately, Lions’ result excludes the natural choice αn = n⁻¹. This was overcome in
1992 by Wittmann [17] who showed strong convergence of {xn} under the control conditions
(C1), (C2), and
(C4) ∑_{n=0}^∞ |αn+1 − αn| < ∞.
In 2002, Xu [18] studied algorithm (2) extensively. First, he showed that in a Banach space
setting, {xn } still maintains its strong convergence on removing the square in the denominator
of (C3), thereby improving Lions’ result twofold. The conditions used were (C1), (C2), and
(C5) lim_{n→∞} (αn+1 − αn)/αn+1 = 0, or equivalently, lim_{n→∞} αn/αn+1 = 1.
He then showed that the conditions (C3) and (C4) are not comparable, and did the same
for (C4) and (C5). Xu then observed that Halpern actually showed that the conditions (C1)
and (C2) are necessary to have strong convergence to the metric projection of u on F(T ).
This provided a partial answer to Reich’s question: Concerning {αn }, what are the necessary and sufficient conditions for {xn } to converge strongly? To the best of our knowledge,
the other part of the question concerning sufficiency remains open. However, in a recent
paper of Suzuki [16], it is shown that if the nonexpansive mapping T in (2) is of the form
T := λS + (1 − λ)I (with λ ∈ (0, 1), S a nonexpansive mapping and I the identity operator),
then the conditions (C1) and (C2) are not only necessary for {x n } to converge strongly, but
they are also sufficient. In fact, Suzuki showed strong convergence of the iterative process
xn+1 = αn u + (1 − αn )(λSxn + (1 − λ)xn ), for any u, x0 ∈ H and all n ≥ 0,
(3)
in Banach spaces. The same result was obtained by Chidume and Chidume [4] independently.
Very recently, He et al. [8] showed, also in Banach spaces, that if the nonexpansive map S
above is replaced by the resolvent, Jβn , of an m-accretive operator, then strong convergence
is still guaranteed under (C1), (C2), and the condition
(C6) lim_{n→∞} (βn+1 − βn) = 0,
with βn bounded from below away from zero.
Notice that it is possible to prove strong convergence results if one replaces the nonexpansive map T in algorithm (2) by a sequence of nonexpansive mappings. For instance, one
may consider the iterative process, known as the (modified) proximal point algorithm of
Halpern-type, defined by
xn+1 = αn u + (1 − αn )Jβn xn , for any u, x0 ∈ H and all n ≥ 0,
(4)
where {βn } ⊂ (0, ∞). Under additional assumptions on βn , the strong convergence of {xn }
defined by (4) can be obtained. In 2000, Kamimura and Takahashi [10] showed that {xn } is
strongly convergent to the point of the set F(Jc ) = {x ∈ H : Jc x = x} = A−1 (0) (for all
c > 0) nearest to u if one assumes (C1), (C2) and βn → ∞. In fact, they considered the
following algorithm, which is the inexact form of algorithm (4):
yn ≈ Jβn xn,
xn+1 = αn u + (1 − αn)yn, for all n ≥ 0,    (5)
for any u, x0 ∈ H, where the criterion for the approximate computation of yn is given by ‖yn − Jβn xn‖ ≤ δn with ∑_{n=0}^∞ δn < ∞.
It is worth mentioning that Xu [18] also obtained the same result independently, and in [1]
(see also [2]), we extended this result to include non-summable errors, en . It is unclear if the
same conclusion can also be derived for bounded βn and the general condition that the error
sequence tends to zero in norm. We refer the interested reader to the paper of Rockafellar [14]
to see what happens in the case when αn = 0 for all n ≥ 0.
The so-called prox-Tikhonov regularization method has also been investigated by several researchers. In 2006, Xu [19] extended the result of Lehdili and Moudafi [11]
by considering the iterative process
xn+1 = Jβn (αn u + (1 − αn )xn + en ), for any u, x0 ∈ H and all n ≥ 0,
(6)
where {en} is a sequence of errors, and proved strong convergence of {xn} defined by (6) to
the metric projection of u onto the fixed point set A−1(0) under control conditions that
combine αn and βn. More precisely, his conditions were
(C7) ∑_{n=0}^∞ |1 − αn βn+1/(αn+1 βn)| < ∞   or   (C8) lim_{n→∞} (1/αn)|1 − αn βn+1/(αn+1 βn)| = 0.
Note that for βn → ∞, the natural choices αn = n⁻¹ and βn = n fail under both
conditions. In fact, for any choice of αn and βn, condition (C7) is impossible to achieve, as
will be shown in this paper (see Remark 4). In another result of Xu, Theorem 3.3 [19], it is
shown that for summable errors, strong convergence is still maintained under the conditions
(C1), (C2), (C4), and βn bounded (from above and from below away from zero) with (C9)
(as defined below) being satisfied. Song and Yang [15] established strong convergence of the
prox-Tikhonov algorithm (6) when the errors are summable, (C1), (C2), (C4) being satisfied,
and the following condition on βn imposed: βn is bounded from below away from zero with
either
(C9) ∑_{n=0}^∞ |βn+1 − βn| < ∞,   or   (C9)′ ∑_{n=0}^∞ |βn+1 − βn|/βn+1 < ∞.
They remarked that their result (Theorem 2) contains Theorem 3.3 [19] as a special case.
Although this seems to be the case at first glance, it turns out that the two theorems are
equivalent. In fact, the condition (C9)′ on βn is equivalent to (C9) and βn bounded from
below away from zero. Obviously, from this equivalence follows the equivalence of the two
theorems. The equivalence of the two conditions is not so obvious; it is discussed in Lemma 4 below.
The main purpose of this paper is to prove strong convergence of {x n } conforming to the
iterative process
xn+1 = αn u + (1 − αn )Jβn xn + en , for any u, x0 ∈ H and all n ≥ 0,
(7)
under new conditions on αn and βn . The conditions we are about to introduce will allow
choices such as αn = n −1 and βn = n, and they are weaker than those previously studied, so
our results can be viewed as significant improvements and refinements of previously known
results. Theorem 5 deals with the conditions
either (C10) ∑_{n=1}^∞ |αn−1/βn − αn/βn+1| < ∞   or   (C11) lim_{n→∞} (1/(αn βn²))(αn−1 βn+1 − αn βn) = 0,
and Theorem 6 is concerned with the conditions
(C6)∗ lim_{n→∞} (1/βn − 1/βn−1) = 0, and either
(C12) lim_{n→∞} (αn − αn−1)/(αn−1 βn) = 0   or   (C13) ∑_{n=1}^∞ |αn − αn−1|/βn < ∞.
In particular, our results provide an answer to the question we asked in [2]: Can one design
a proximal point algorithm by choosing appropriate regularization parameters αn such that
strong convergence of {xn } is preserved, for en → 0 and βn bounded? Of course, for
constant βn , (C10) reduces to (C4) and (C11) reduces to (C5).
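To fix ideas, here is a minimal numerical sketch (the operator, starting point and error sequence are our own illustrative choices, not taken from the paper) of algorithm (7) with the choices αn = 1/n and βn = n that our conditions allow, for the linear maximal monotone operator Ax = Mx on H = R² with M = diag(1, 0), so that F = A⁻¹(0) = {0} × R and PF u = (0, u₂).

```python
import numpy as np

M = np.diag([1.0, 0.0])                  # A x = M x, maximal monotone on R^2

def resolvent(x, beta):
    # J_beta(x) = (I + beta*A)^{-1} x, here a 2x2 linear solve
    return np.linalg.solve(np.eye(2) + beta * M, x)

u = np.array([2.0, -1.0])
x = np.array([5.0, 4.0])                 # x_0
for n in range(1, 20001):
    alpha, beta = 1.0 / n, float(n)      # alpha_n = 1/n, beta_n = n
    e = np.array([1.0, 1.0]) / n**2      # errors with ||e_n||/alpha_n -> 0
    x = alpha * u + (1 - alpha) * resolvent(x, beta) + e   # one step of (7)

print(x)                                 # approaches P_F u = (0, -1)
```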
If A is the subdifferential of a proper, convex, lower semicontinuous function ϕ : H →
(−∞, ∞], then our convergence results provide sequences which converge strongly to the
minimum point of ϕ nearest to u. In addition, we give convergence rate estimates for a
sequence converging to inf ϕ (see Theorem 7 of Sect. 4). The reader interested in theoretical
and practical aspects of convex and non-convex optimization theory is referred to the recent
excellent six-volume resource [6]. See also [5,9].
2 Preliminaries
In the sequel, H is a real Hilbert space, F denotes the set A−1(0) = {x ∈ H : Jc x = x} =
F(Jc) for all c > 0, and given any sequence {xn}, its weak ω-limit set will be denoted by
ωw({xn}), that is,
ωw({xn}) := {x ∈ H | xn_k ⇀ x for some subsequence {xn_k} of {xn}}.
Here “⇀” denotes weak convergence. Setting
vn := (xn − αn−1 u − en−1)/(1 − αn−1),    (8)
we see that (7) can be reformulated as
vn+1 = Jβn (αn−1 u + (1 − αn−1 )vn + en−1 ), for n ≥ 1.
(9)
It is worth pointing out that, for αn → 0 and en → 0, the algorithms (7) and (9) are equivalent, that is, {vn } converges if and only if {xn } does. We shall therefore always use either
form of the algorithm at our convenience. Obviously, (9) has the form (6), with αn−1 , βn ,
and en−1 instead of αn , βn , and en . If we consider (9) instead of (6), then conditions (C7)
and (C8) take the form
(C7)′ ∑_{n=1}^∞ |1 − αn−1 βn+1/(αn βn)| < ∞   or   (C8)′ lim_{n→∞} (1/αn−1)|1 − αn−1 βn+1/(αn βn)| = 0.
Theorem 4 and Remark 4 are concerned with these conditions. Let us now recall some
Lemmas which will be useful in proving our main results. The first Lemma can be proved
easily.
Lemma 1 For all x, y ∈ H , we have
‖x + y‖² ≤ ‖y‖² + 2⟨x, x + y⟩.
Lemma 2 (Resolvent Identity). For any β, γ > 0, and x ∈ H , the identity
Jβ x = Jγ((γ/β) x + (1 − γ/β) Jβ x)
holds true.
Proof The proof of this Lemma is well known, but we provide it for the sake of completeness.
Let β, γ > 0, and x ∈ H be arbitrary but fixed. Set y := Jβ x. Then using the definition of
the resolvent, we have
y = Jβ x ⇔ y + β Ay ∋ x ⇔ y + γ Ay ∋ (γ/β) x + (1 − γ/β) y ⇔ y = Jγ((γ/β) x + (1 − γ/β) y).
This completes the proof of the resolvent identity.
⊓⊔
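For the reader's convenience, a quick numerical sanity check of the identity (an illustration under data of our own choosing, not part of the proof) can be run for a linear monotone operator on R²:

```python
import numpy as np

rng = np.random.default_rng(0)
M = np.array([[2.0, 1.0], [1.0, 3.0]])          # symmetric positive definite, hence monotone

def J(x, beta):
    # resolvent J_beta(x) = (I + beta*M)^{-1} x
    return np.linalg.solve(np.eye(2) + beta * M, x)

for _ in range(5):
    x = rng.normal(size=2)
    beta, gamma = rng.uniform(0.1, 5.0, size=2)
    lhs = J(x, beta)
    rhs = J((gamma / beta) * x + (1 - gamma / beta) * J(x, beta), gamma)
    assert np.allclose(lhs, rhs)                # the resolvent identity holds numerically
```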
Lemma 3 [18]. Let {sn } be a sequence of non-negative real numbers satisfying
sn+1 ≤ (1 − an )sn + an bn + cn , n ≥ 0,
where {an}, {bn}, {cn} satisfy the conditions: (i) {an} ⊂ [0, 1], with ∑_{n=0}^∞ an = ∞, (ii)
lim sup_{n→∞} bn ≤ 0, and (iii) cn ≥ 0 for all n ≥ 0 with ∑_{n=0}^∞ cn < ∞. Then lim_{n→∞} sn = 0.
We next show that any sequence of positive real numbers satisfying the condition of (C9)′
is bounded (with the lower bound being strictly positive).
Lemma 4 For any sequence {bn } of positive real numbers, the following conditions are
equivalent: (i) ∑_{n=0}^∞ |bn+1 − bn| < ∞ and 0 < lim inf_{n→∞} bn (= lim_{n→∞} bn),
(ii) ∑_{n=0}^∞ |bn+1 − bn|/bn < ∞, and (iii) ∑_{n=0}^∞ |bn+1 − bn|/bn+1 < ∞.
Proof First, it is easily seen that (i) ⇒ (ii), and (i) ⇒ (iii). Now let us prove that (ii) ⇒ (i).
For this, it suffices to show that there exist constants m, M > 0 such that m ≤ bn ≤ M for
all n = 0, 1, . . .
From (ii), there exists a sequence {an} ⊂ R, such that ∑_{n=0}^∞ |an| < ∞, and
(bn+1 − bn)/bn = an ⇔ bn+1/bn = 1 + an, n = 0, 1, . . .
Note that in particular, limn→∞ an = 0. Therefore, we may assume without any loss of
generality that |an | < 1 for all n. Then by simple induction, we have
bn/b0 = ∏_{k=0}^{n−1} (1 + ak).    (10)
Since 1 + x ≤ exp(x) for all x ≥ 0, it follows from (10) that
bn/b0 = ∏_{k=0}^{n−1} (1 + ak) ≤ ∏_{k=0}^{n−1} (1 + |ak|) ≤ exp(∑_{k=0}^{n−1} |ak|) ≤ exp(∑_{k=0}^∞ |ak|) =: M0 < ∞.    (11)
On the other hand,
∑_{k=0}^∞ |ak| < ∞ ⇔ ∏_{k=0}^∞ (1 − |ak|) > 0,
and again from (10) we obtain
bn/b0 = ∏_{k=0}^{n−1} (1 + ak) ≥ ∏_{k=0}^{n−1} (1 − |ak|) ≥ ∏_{k=0}^∞ (1 − |ak|) =: m0 > 0.    (12)
The conclusion then follows from (11) and (12). Replacing bn by 1/bn in (ii), one readily gets
(iii), showing that (iii) ⇒ (i) as desired. ⊓⊔
Let the mapping h : H → H be defined by x → tu + (1 − t)Jc x + e(t) for c > 0, u ∈ H
and t ∈ (0, 1), where e = e(t) is a given function defined on (0, 1) with values in H . For any
fixed t (and c, u), one can easily check that the map h is a contraction with Lipschitz constant 1 − t. The Banach contraction principle asserts that h has a unique fixed point, say, z t .
That is,
z t = tu + (1 − t)Jc z t + e(t) for c > 0 and u ∈ H.
(13)
In fact z t depends on u and c as well.
Theorem 1 Take any c > 0 and u ∈ H , and assume
t⁻¹ e(t) → 0 as t → 0+.    (14)
If F ≠ ∅, then {z t} defined in (13) converges strongly as t → 0+ to the point of F nearest to
u, denoted by PF u. Moreover, this limit is attained uniformly with respect to c ≥ δ for every
δ > 0.
Proof For every p ∈ F, we have from Lemma 1
‖z t − p‖² ≤ (1 − t)²‖z t − p‖² + 2t⟨u − p + t⁻¹ e(t), z t − p⟩.
In other words,
(2 − t)‖z t − p‖² ≤ 2⟨u − p + t⁻¹ e(t), z t − p⟩.    (15)
This shows that {z t} is bounded as t → 0+.
Now setting
vt := (1 − t)−1 (z t − tu − e(t)) = Jc z t ,
we see that {vt } is also bounded as t → 0+ and the weak ω-limit sets of {z t } and {vt } (as
t → 0+ ) coincide, that is, ωw ({z t }) = ωw ({vt }). Since
Avt ∋ (1/c)(z t − vt) → 0 as t → 0+,
we have ωw ({z t }) ⊂ F. By (14) and (15) with p = PF u we get
lim sup_{t→0+} ‖z t − PF u‖² ≤ 0,
which shows that
lim_{t→0+} ‖z t − PF u‖ = 0.
Obviously, the above limit is attained uniformly with respect to c ≥ δ for every δ > 0.
⊓⊔
Remark 1 Theorem 1 is an extension of Theorem 3.1 in [19], since vt converges strongly to
PF u (as t → 0+ ) if and only if z t does. We note that Theorem 3.1 in [19] contains a mistake,
since the strong limit of vt (as t → 0+ ) is not attained uniformly for c > 0 (but for c ≥ δ for
every δ > 0).
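A small sketch (with an operator, resolvent parameter and error function of our own choosing, purely for illustration) shows Theorem 1 at work: the fixed points z t of (13) are computed by Banach iteration and approach PF u as t → 0+.

```python
import numpy as np

M = np.diag([1.0, 0.0])                       # A x = M x on R^2; F = {0} x R
c = 1.0
u = np.array([2.0, -1.0])

def J(x):
    # J_c(x) = (I + c*A)^{-1} x
    return np.linalg.solve(np.eye(2) + c * M, x)

for t in [0.5, 0.1, 0.01, 0.001]:
    e = t**2 * np.ones(2)                     # e(t) with t^{-1} ||e(t)|| -> 0, as in (14)
    z = np.zeros(2)
    for _ in range(20000):                    # Banach iteration for the contraction in (13)
        z = t * u + (1 - t) * J(z) + e
    print(t, z)                               # z_t -> P_F u = (0, -1) as t -> 0+
```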
3 Main results
We devote this section to demonstrating the strong convergence of algorithm (7) under different
sets of assumptions on the parameters αn and βn . We begin by proving a strong convergence
result satisfying similar conditions to those of Lions. One of the conditions
(C3)′ lim_{n→∞} |αn+1 − αn|/αn² = 0,
is weaker than Lions’ condition (C3) in the case when αn is decreasing.
Theorem 2 Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator and
F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let {xn} be the sequence generated by algorithm
(7) with the conditions: (i) αn ∈ (0, 1), (C1), (C2) and (C3)′, (ii) either ∑_{n=0}^∞ ‖en‖ < ∞
or ‖en‖/αn → 0, and (iii) βn ∈ (0, ∞) with (C6)′ lim_{n→∞} βn = β for some β > 0, being
satisfied. Then {xn} converges strongly to PF u, the projection of u on F.
Proof Note that it was shown in [19] that {xn} is bounded if ∑_{n=0}^∞ ‖en‖ < ∞. Also it was
shown in [2] that {xn} is bounded if {‖en‖/αn} is bounded. For each n, let z n be the unique fixed
point of the contraction x → αn u + (1 − αn)Jβ x. According to Theorem 1, z n → PF u as
n → ∞. Therefore it is enough to show that ‖xn − z n‖ → 0 as n → ∞. For this purpose,
we estimate ‖xn+1 − z n+1‖ as follows
‖xn+1 − z n+1‖ ≤ ‖xn+1 − z n‖ + ‖z n − z n+1‖.    (16)
Noting that z n = αn u + (1 − αn)Jβ z n and the fact that Jβ is nonexpansive for all β > 0, we
get
‖xn+1 − z n‖ ≤ (1 − αn)‖Jβn xn − Jβ z n‖ + ‖en‖
≤ (1 − αn)(‖Jβn xn − Jβn z n‖ + ‖Jβn z n − Jβ z n‖) + ‖en‖
≤ (1 − αn)‖xn − z n‖ + (|β − βn|/β)‖z n − Jβ z n‖ + ‖en‖
≤ (1 − αn)‖xn − z n‖ + αn (|β − βn|/β)‖u − Jβ z n‖ + ‖en‖,    (17)
where the third inequality follows from the application of the resolvent identity. On the other
hand, we compare z n and z n+1 as follows
z n − z n+1 = (αn − αn+1)(u − Jβ z n+1) + (1 − αn)(Jβ z n − Jβ z n+1), so that
‖z n − z n+1‖ ≤ |αn − αn+1| ‖u − Jβ z n+1‖ + (1 − αn)‖z n − z n+1‖,
which gives
‖z n − z n+1‖ ≤ (|αn − αn+1|/αn) K,    (18)
where K is a positive constant such that ‖u − Jβ z n‖ ≤ K for all n. Combining (16), (17)
and (18) we get
‖xn+1 − z n+1‖ ≤ (1 − αn)‖xn − z n‖ + αn bn + cn,
where
bn = K (|β − βn|/β + |αn − αn+1|/αn²) → 0 and cn = ‖en‖ with ∑_{n=0}^∞ ‖en‖ < ∞,
or
‖xn+1 − z n+1‖ ≤ (1 − αn)‖xn − z n‖ + αn b′n,
where
b′n = K (|β − βn|/β + |αn − αn+1|/αn² + ‖en‖/αn) → 0
for the case ‖en‖/αn → 0. In either case Lemma 3 gives the required conclusion.
⊓⊔
Remark 2 For β > 0 and βn = β + (−1)ⁿ/(n + 1), the condition (C6)′ is satisfied, whereas
(C9) is not, showing that our condition on βn is weaker than the one used in the following
theorem due to Xu [19]. On the other hand, the sequences αn = n^{−3/4} and αn = 1/ln n
satisfy condition (i) of Theorem 2. Since (C3) and (C3)′ are not comparable to (C4) (see
Remark 3.1 [18]), Theorem 2 is new.
Theorem 3 [19]. Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator
and F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let {xn} be the sequence generated by
algorithm (7) with the conditions: (i) αn ∈ (0, 1), (C1), (C2) and (C4), (ii) βn ∈ (0, ∞)
with ∑_{n=0}^∞ |βn+1 − βn| < ∞ and 0 < lim inf_{n→∞} βn (= lim_{n→∞} βn), being satisfied. If
∑_{n=0}^∞ ‖en‖ < ∞, then {xn} converges strongly to PF u, the projection of u on F.
Remark 3 Although it appears from Lemma 3 and inequality (18) that
∑_{n=0}^∞ |αn − αn+1|/αn < ∞
can be a possible assumption on αn , there is no sequence {αn } ⊂ (0, 1) satisfying (C1) and
this condition. Indeed, if this condition is satisfied, then Lemma 4 implies that αn is bounded
below away from zero, contradicting (C1).
We next give a result similar to Theorem 3.2 of Xu [19]. In the next result, if we consider algorithm (6) instead of algorithm (7), then we can prove the same result with (C8)′
replaced by (C8). In that case, the result extends Theorem 3.2 [19] to a larger class of errors,
which includes errors that are non-summable but still converge to zero in norm. Moreover,
we can show that Theorem 3.2 [19] fails to hold under the condition (C7).
Theorem 4 Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator and
F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let {xn} be the sequence generated by algorithm
(7), where (i) αn ∈ (0, 1), with (C1) and (C2), (ii) either ∑_{n=0}^∞ ‖en‖ < ∞ or ‖en‖/αn → 0,
and (iii) βn ∈ (0, ∞) with lim inf_{n→∞} βn > 0, βn+1 ≥ αn βn and (C8)′. Then {xn} converges
strongly to PF u, the projection of u on F.
Proof For each fixed n, let yn be the unique fixed point of the contraction x → αn−1 u +
(1 − αn−1 )Jβn x. Then according to Theorem 1, yn → PF u as n → ∞. Set
vn := (xn − αn−1 u − en−1)/(1 − αn−1)   and   wn := (yn − αn−1 u)/(1 − αn−1).    (19)
As a consequence of the boundedness of {xn } and {yn } (see [2] and [19]), the sequences {vn }
and {wn } are bounded. Also by virtue of (19), wn → PF u as n → ∞. It follows from (7)
and the definition of yn that
vn+1 = Jβn ((1 − αn−1 )vn + αn−1 u + en−1 ) and wn = Jβn ((1 − αn−1 )wn + αn−1 u).
As before, using the nonexpansivity of the resolvent, we estimate ‖vn+1 − wn+1‖ as follows
‖vn+1 − wn+1‖ ≤ ‖vn+1 − wn‖ + ‖wn+1 − wn‖
≤ (1 − αn−1)‖vn − wn‖ + ‖wn+1 − wn‖ + ‖en−1‖.    (20)
Now using the resolvent identity and the nonexpansivity of the resolvent, we can estimate
‖wn+1 − wn‖ as follows
‖wn+1 − wn‖ = ‖Jβn( (βn/βn+1)((1 − αn)wn+1 + αn u) + (1 − βn/βn+1)wn+1 ) − Jβn((1 − αn−1)wn + αn−1 u)‖
≤ (1 − αn βn/βn+1)‖wn+1 − wn‖ + |αn−1 − αn βn/βn+1| K,
which gives
‖wn+1 − wn‖ ≤ |1 − αn−1 βn+1/(αn βn)| K,    (21)
for some positive constant K . Combining (20) and (21) we get
‖vn+1 − wn+1‖ ≤ (1 − αn−1)‖vn − wn‖ + |1 − αn−1 βn+1/(αn βn)| K + ‖en−1‖.    (22)
Hence from Lemma 3, we see that ‖vn − wn‖ → 0, and the proof is complete. ⊓⊔
Remark 4 In view of Lemma 3 and (22), it is tempting to infer that the theorem is still valid
under the condition (C7)′ . However we show that this condition is impossible to attain for
any sequences {βn } and {αn } satisfying the conditions of the above theorem. To this end, we
assume that (C7)′ holds true. Denote bn := αn−1 /βn . Then
∑_{n=1}^∞ |1 − αn−1 βn+1/(αn βn)| < ∞ ⇔ ∑_{n=1}^∞ |bn+1 − bn|/bn+1 < ∞.
Therefore, it follows from Lemma 4 that
lim inf_{n→∞} αn−1/βn = lim inf_{n→∞} bn > 0,
which implies that βn → 0 (since αn → 0). This is a contradiction as βn is bounded below
away from zero.
However, if we allow βn → 0, then Theorem 1 is no longer applicable. Indeed, from
wn = Jβn ((1 − αn−1 )wn + αn−1 u), we have
(αn−1/βn)(u − wn) ∈ Awn.    (23)
From the above inclusion relation, we cannot derive ωw({wn}) ⊂ F := A−1(0), even if
wn is strongly convergent (since, by (21), ∑_{n=1}^∞ ‖wn+1 − wn‖ < ∞), because αn−1/βn may
not necessarily converge to zero. Therefore, in this case {xn} is still strongly convergent
(according to (22)), but we cannot derive that its limit is in F. In fact, its limit need not be in
F. We give an example to that effect.
Example 1 Let βn = 1/n and αn = 1/(n + 2) for n ≥ 1. Then we have
1 − αn−1 βn+1/(αn βn) = 1/(n + 1)² =: an, for all n ≥ 1, and βn+1/αn → 1 as n → ∞.
Clearly the condition βn+1 ≥ αn βn for all n ≥ 1 is fulfilled. Let H = R, and let the
sequence {en} ⊂ R satisfy either the condition ∑_{n=0}^∞ |en| < ∞ or |en|/αn → 0 (for example, |en| = (n + 2)⁻² or |en| = 1/(n ln n) for n ≥ 2 with ∑_{n=2}^∞ |en| = ∞, respectively),
and let A : D(A) = [0, ∞) ⊂ R → 2^R be defined by
Ax = ax if x > 0,   Ax = (−∞, 0] if x = 0,   Ax = ∅ if x < 0,
for some a > 0. Then if u > 0, we have for sufficiently large n, αn−1 u + en−1 > 0 and
0 < wn = Jβn((1 − αn−1)wn + αn−1 u + en−1) = (1/(1 + βn a))((1 − αn−1)wn + αn−1 u + en−1),
which implies that wn → w∞ := u/(1 + a) ∉ F = {0}. Hence xn → w∞ ∉ F. The same
conclusion is true if u < 0.
The argument given above shows that if βn is bounded away from zero in Theorem 3.2
of [19], then the condition (C7) is impossible to achieve. Also the above example shows that
the result may not hold if βn → 0.
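A short numerical illustration of Example 1 (a sketch with the particular values a = 1, u = 2 and the summable error choice) shows the iterates of (7) settling at u/(1 + a) rather than at PF u = 0:

```python
a, u = 1.0, 2.0

def resolvent(x, beta):
    # J_beta for the operator A of Example 1: Ax = a*x for x > 0, A0 = (-inf, 0]
    return x / (1.0 + beta * a) if x > 0 else 0.0

x = 1.0                                                     # x_0
for n in range(1, 20001):
    alpha, beta, e = 1.0 / (n + 2), 1.0 / n, 1.0 / (n + 2) ** 2
    x = alpha * u + (1 - alpha) * resolvent(x, beta) + e    # one step of (7)

print(x)    # close to u/(1+a) = 1.0, which is not in F = {0}; P_F u = 0
```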
We now give an example to show the applicability of Theorem 4.
Example 2 Choose βn = β0 > 0 for all n, αn = (n + 1)^{−1/2} and ‖en‖ = 1/(n + 1) for all
n ≥ 0.
Theorem 5 Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator and
F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let {xn} be the sequence generated by algorithm
(7) where conditions (i) and (ii) of Theorem 4 are fulfilled. If βn ∈ (0, ∞) is increasing and
either (C10) or (C11) is satisfied, then {xn} converges strongly to PF u, the projection of u
on F.
Proof We know that {xn } (and hence {vn }) is bounded, see Theorem 2.
Claim: lim sup_{n→∞} ⟨u − PF u, xn − PF u⟩ ≤ 0.
Let {xn_k} be a subsequence of {xn} converging weakly to some x∞, such that
lim sup_{n→∞} ⟨u − PF u, xn − PF u⟩ = lim_{k→∞} ⟨u − PF u, xn_k − PF u⟩ = ⟨u − PF u, x∞ − PF u⟩.
To prove the claim, we only need to show that x∞ ∈ F, or more generally ωw({xn}) ⊂ F. If
βn is unbounded, then the conclusion follows from the inclusion relation
(vn+1 − vn)/βn + A(vn+1) ∋ (αn−1/βn)(u − vn) + (1/βn) en−1.    (24)
Otherwise, from (9), the boundedness of {‖en‖/αn} and {vn}, the nonexpansivity of Jβn
and taking advantage of the resolvent identity, we can compare vn+2 and vn+1 as follows
‖vn+2 − vn+1‖ = ‖Jβn( (βn/βn+1)((1 − αn)vn+1 + αn u + en) + (1 − βn/βn+1)vn+2 ) − Jβn((1 − αn−1)vn + αn−1 u + en−1)‖
≤ ‖(1 − βn/βn+1)(vn+2 − vn+1) + (1 − αn βn/βn+1)(vn+1 − vn) + (αn−1 − αn βn/βn+1)(vn − u − en−1/αn−1) + (αn βn/βn+1)(en/αn − en−1/αn−1)‖
≤ (1 − βn/βn+1)‖vn+2 − vn+1‖ + (1 − αn βn/βn+1)‖vn+1 − vn‖ + |αn−1 − αn βn/βn+1| K + (αn βn/βn+1)‖en/αn − en−1/αn−1‖,
which implies that
‖vn+2 − vn+1‖/βn+1 ≤ (1 − αn βn/βn+1)(‖vn+1 − vn‖/βn) + |αn−1/βn − αn/βn+1| K + (αn/βn+1)‖en/αn − en−1/αn−1‖
≤ (1 − αn β0/βn+1)(‖vn+1 − vn‖/βn) + |αn−1/βn − αn/βn+1| K + (αn/βn+1)‖en/αn − en−1/αn−1‖.
Similarly, for the case ∑_{n=0}^∞ ‖en‖ < ∞, we have
‖vn+2 − vn+1‖ ≤ ‖(1 − βn/βn+1)(vn+2 − vn+1) + (1 − αn βn/βn+1)(vn+1 − vn) + (αn−1 − αn βn/βn+1)(vn − u) + (βn/βn+1)en − en−1‖
≤ (1 − βn/βn+1)‖vn+2 − vn+1‖ + (1 − αn βn/βn+1)‖vn+1 − vn‖ + |αn−1 − αn βn/βn+1| K′ + (βn/βn+1)‖en‖ + ‖en−1‖,
which implies that
‖vn+2 − vn+1‖/βn+1 ≤ (1 − αn β0/βn+1)(‖vn+1 − vn‖/βn) + |αn−1/βn − αn/βn+1| K′ + (1/β0)(‖en‖ + ‖en−1‖).
Denote an := αn β0/βn+1. Since {αn} satisfy αn ∈ (0, 1), αn → 0 and ∑_{n=0}^∞ αn = ∞, so
do {an}. Therefore, from Lemma 3, we have (in both cases)
‖vn+1 − vn‖/βn → 0 ⇔ ‖vn+1 − vn‖ → 0.
Moreover, (24) implies that ωw ({vn }) ⊂ F, and from (8), we derive ωw ({vn }) = ωw ({xn }),
hence the claim.
Finally we show that {xn } converges strongly to PF u. We have from Lemma 1
‖xn+1 − PF u‖² ≤ (1 − αn)‖xn − PF u‖² + 2αn ⟨u − PF u + en/αn, xn+1 − PF u⟩.    (25)
In the case when ‖en‖/αn → 0, inequality (25) implies by Lemma 3 that xn → PF u. If
∑_{n=0}^∞ ‖en‖ < ∞, then we derive from inequality (25)
‖xn+1 − PF u‖² ≤ (1 − αn)‖xn − PF u‖² + 2αn ⟨u − PF u, xn+1 − PF u⟩ + K‖en‖,
for some K > 0, and Lemma 3 again implies that xn → PF u as desired. ⊓⊔
Remark 5 The condition (C10) is weaker than the conditions (C4) and (C9) if βn ≥ δ for all
n and for some δ > 0. Indeed,
|αn−1/βn − αn/βn+1| ≤ (1/βn)|αn−1 − αn| + αn |1/βn − 1/βn+1| ≤ (1/δ)|αn−1 − αn| + |βn+1 − βn|/δ².
Note that if βn = n² for n ≥ 1, then (C10) holds true for any choice of αn ∈ (0, 1).
Remark 6 Observe that (C11) is satisfied for βn = n and αn = (n + 1)⁻¹, whereas the
condition (C8)′ of Theorem 4 fails. Moreover, (C11) works if βn is constant and αn taken as
before, but (C8)′ fails.
Although the condition (C10) is weaker than (C4) and (C9) if lim inf_{n→∞} βn > 0, our
result is restricted only to those βn's which are increasing. The next result is designed to
cater for those βn's that do not satisfy this restrictive condition. It is actually an extension
and improvement of Theorem 3 above. Our proof differs from those given in [15] and [19],
and it relies on the equivalence of the algorithms (6) and (7). Note that it was observed in [15]
that a gap exists in the proof of Theorem 3. We remark here that our method of transforming
(9) into (7) is an alternative way of filling this gap, as can be seen from the proof of
Theorem 6 below.
Theorem 6 Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator and
F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let the sequence {xn} be generated by algorithm (7) with the following conditions being satisfied: (i) αn ∈ (0, 1), (C1), (C2), (ii) either
∑_{n=0}^∞ ‖en‖ < ∞ or ‖en‖/αn → 0, (iii) lim inf_{n→∞} βn > 0, and (C6)*. If either (C12) or
(C13) holds, then {xn} (and hence {vn}) converges strongly to PF u, the projection of u on F.
Proof We know from [2] and [19] that {xn} is bounded. For ‖en‖/αn → 0, we have (by the
resolvent identity and the nonexpansivity of the resolvent)
‖xn+1 − xn‖ ≤ (1 − αn−1)‖Jβn xn − Jβn( (βn/βn−1) xn−1 + (1 − βn/βn−1) Jβn−1 xn−1 )‖ + |αn − αn−1| · ‖u − Jβn xn + en/αn‖ + αn−1 ‖en/αn − en−1/αn−1‖
≤ (1 − αn−1)‖(βn/βn−1)(xn − xn−1) + (1 − βn/βn−1)(xn − Jβn−1 xn−1)‖ + |αn − αn−1| · ‖u − Jβn xn + en/αn‖ + αn−1 ‖en/αn − en−1/αn−1‖,
so that
‖xn+1 − xn‖ ≤ (1 − αn−1)(βn/βn−1)‖xn − xn−1‖ + |αn − αn−1| · ‖u − Jβn xn + en/αn‖ + αn−1 |1 − βn/βn−1| ‖u − Jβn−1 xn−1 + en−1/αn−1‖ + αn−1 ‖en/αn − en−1/αn−1‖,
and hence
‖xn+1 − xn‖/βn ≤ (1 − αn−1)(‖xn − xn−1‖/βn−1) + αn−1 |1/βn−1 − 1/βn| K + K |αn − αn−1|/βn + (αn−1/βn)‖en/αn − en−1/αn−1‖,    (26)
for some positive constant K . From Lemma 3 and inequality (26), we have
‖xn+1 − xn‖/βn → 0,
which is equivalent to
‖vn+1 − vn‖/βn → 0.
Hence we can derive (see (24) above) that ωw({xn}) = ωw({vn}) ⊂ F. Consequently, we have
lim sup_{n→∞} ⟨u − PF u, xn − PF u⟩ ≤ 0.    (27)
Note that for some positive constant C, |1/βn+1 − 1/βn| ≤ C (since lim inf_{n→∞} βn > 0) and
xn+1 − xn = (αn − αn−1)(u − Jβn xn) + (en − en−1) + (1 − αn−1)(Jβn xn − Jβn−1 xn−1),
so that in the case when ∑_{n=1}^∞ ‖en‖ < ∞, we again get inequality (27) on applying similar
arguments as above.
As in the proof of Theorem 5, we derive strong convergence of {xn} to PF u. ⊓⊔
Remark 7 Clearly (C6)∗ is weaker than the conditions
(C14) ∑_{n=1}^∞ |1/βn − 1/βn−1| < ∞   and   (C15) lim_{n→∞} (1/αn)|1/βn − 1/βn−1| = 0,
both of which hold true if αn = n⁻¹ and βn = n, while (C6) fails for this choice of βn.
However, both (C6) and (C6)∗ hold if βn = ln n. We point out that the first inequality in
Remark 5 suggests that for lim inf_{n→∞} βn > 0, the condition (C10) is weaker than (C4) and
(C14). Also, the condition (C14) is weaker than (C9) whenever lim inf_{n→∞} βn > 0 holds.
But the condition that βn is increasing is stronger than the assumption lim inf_{n→∞} βn > 0,
so there are cases in which the following corollary is applicable and Theorem 5 is not. We
remark that both (C4) and (C5) are not satisfied by
αn = 1/n if n is odd,   αn = 1/(2n) if n is even.
This choice of αn however fulfills the assumptions of (C13) for the case when βn = n and
(C12) for any βn → ∞.
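A quick numerical check of these claims (illustrative only; partial sums and ratios computed directly from the definitions) is the following:

```python
def alpha(n):
    # the alternating choice above: 1/n for odd n, 1/(2n) for even n
    return 1.0 / n if n % 2 == 1 else 1.0 / (2 * n)

for N in (10**3, 10**4, 10**5):
    c4 = sum(abs(alpha(n + 1) - alpha(n)) for n in range(1, N))        # keeps growing: (C4) fails
    c13 = sum(abs(alpha(n) - alpha(n - 1)) / n for n in range(2, N))   # levels off: (C13) holds for beta_n = n
    print(N, c4, c13, alpha(N) / alpha(N + 1))                         # last ratio stays near 1/2: (C5) fails
```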
Corollary 1 Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator and
F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let the sequence {xn} be generated by algorithm (7), where αn ∈ (0, 1) and βn ∈ (0, ∞), with the conditions (i) and (ii) taken as in
Theorem 6, and lim inf_{n→∞} βn > 0 with either (C14) or (C15). If either (C12) or (C13)
holds, then {xn} (and hence {vn}) converges strongly to PF u.
The following corollary is an extension of Theorem 3.
Corollary 2 Assume that A : D(A) ⊂ H → 2^H is a maximal monotone operator and
F := A−1(0) ≠ ∅. For any fixed u, x0 ∈ H, let the sequence {xn} be generated by algorithm (7), where αn ∈ (0, 1) and βn ∈ (0, ∞), with the conditions (i) and (ii) taken as in
Theorem 6, and (iii) lim inf_{n→∞} βn > 0 and either (C9) or (C16) lim_{n→∞} (1/αn)|1 − βn/βn+1| = 0.
If either (C12) or (C13) holds, then {xn} (hence {vn}) converges strongly to PF u.
We give an example to show that the conditions of (iii) are different.
Example 3 Let αn = (n + 2)^{−1/4} and βn = 2(n + 1)(n + 2)⁻¹ for all n ≥ 0. Then αn and
βn satisfy both conditions of (iii), while βn = n + 1 and αn as above satisfy only (C16).
Remark 8 Let us observe that if ‖en‖/αn → 0 and ∑_{n=0}^∞ ‖en‖ = ∞, then automatically
∑_{n=0}^∞ αn = ∞. Also, the trend that has been followed by many authors in order to obtain
strong convergence of the PPA was to use a criterion which restricts the error sequence
to be summable. We have deviated from this tradition by allowing any sequence of errors
converging strongly to zero and still derived strong convergence of the PPA. Indeed, if
∑_{n=0}^∞ ‖en‖ = ∞ and en → 0, then we can construct (or choose) a sequence {αn} of
parameters depending on {en} such that the condition ‖en‖/αn → 0 holds (for example,
αn = √‖en‖ if en ≠ 0, for all n big enough). Otherwise (i.e., if ∑_{n=0}^∞ ‖en‖ < ∞), we
can choose freely (independently of en) αn ∈ (0, 1) such that the conditions αn → 0 and
∑_{n=0}^∞ αn = ∞ are satisfied.
4 The case when A is a subdifferential
Recall that the subdifferential of a proper and convex function ϕ : H → (−∞, +∞] is the
(possibly multivalued) operator ∂ϕ : H → 2^H defined by
∂ϕ(x) = {w ∈ H | ϕ(x) − ϕ(v) ≤ ⟨w, x − v⟩, ∀ v ∈ H}.
If in addition, ϕ is lower semicontinuous, then its subdifferential is a maximal monotone
operator and a point p ∈ H minimizes ϕ if and only if 0 ∈ ∂ϕ( p). In other words, A−1 (0)
for A = ∂ϕ is the set of minimum points of ϕ. Note that for A = ∂ϕ, where ϕ is a proper,
convex and lower semicontinuous function, algorithm (9) is equivalent to
vn+1 = arg min_{x∈H} ϕn(x),
where
ϕn(x) = ϕ(x) + (1/(2βn))‖x − αn−1 u − (1 − αn−1)vn − en−1‖².
Obviously, ϕn is a coercive function having a unique minimizer vn+1 due to the quadratic
term added to ϕ(x).
Under the assumptions of the previously proved results, {vn } (equivalently, {xn }) converges strongly to the minimizer of ϕ nearest to u.
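As a concrete illustration of this prox formulation (the function below is a toy example of ours, not from the paper), take ϕ(x) = max(|x| − 1, 0) on H = R, whose set of minimizers is F = [−1, 1]; the prox step has a simple closed form and the iterates of (9) approach PF u, the minimizer nearest to u.

```python
import numpy as np

def prox(y, beta):
    # closed-form argmin of phi(x) + (x - y)**2/(2*beta) for phi(x) = max(|x| - 1, 0)
    if abs(y) <= 1.0:
        return y
    if abs(y) <= 1.0 + beta:
        return float(np.sign(y))
    return y - beta * float(np.sign(y))

u, v = 3.0, -5.0                                        # u and v_1
for n in range(1, 20001):
    alpha, beta, e = 1.0 / n, float(n), 1.0 / n**2
    v = prox(alpha * u + (1 - alpha) * v + e, beta)     # one step of (9)

print(v)    # approaches P_F u = 1.0
```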
We now give two convergence rate estimates for the residual ϕ(wn) − ϕ(z), where ϕ is
a proper, convex and lower semicontinuous function and z is an arbitrary point of H, and
wn = σn⁻¹ ∑_{k=1}^n βk vk+1,  where σn = ∑_{k=1}^n βk.    (28)
In general, if a sequence {vn } converges strongly (resp. weakly) to a point, say p, then the
sequence of its weighted means with positive weights {βk } defined by (28) also converges
strongly (resp. weakly) to the same limit p, provided σn → ∞.
Theorem 7 Let A = ∂ϕ and A−1(0) ≠ ∅, where ϕ : H → (−∞, +∞] is a proper, convex and lower semicontinuous function. For any fixed u, v1 ∈ H, let {vn} be the sequence
generated by algorithm (9) and {wn} be as in (28).
• If ∑_{k=1}^∞ ‖ek−1‖ < ∞, then for some K > 0, the following estimate holds
ϕ(wn) − ϕ(z) ≤ (‖v1 − z‖² + K (∑_{k=1}^n αk−1 + ∑_{k=1}^n ‖ek−1‖))/(2σn), for all z ∈ H.    (29)
• If {‖en‖/αn} is bounded, then for some M > 0, we have
ϕ(wn) − ϕ(z) ≤ (‖v1 − z‖² + M ∑_{k=1}^n αk−1)/(2σn), for all z ∈ H.    (30)
If in addition, σn⁻¹ ∑_{k=1}^n αk−1 → 0 as n → ∞, then ϕ(wn) → inf_{y∈H} ϕ(y).
Proof Let us prove estimate (30). Note that for A = ∂ϕ, we have from (9),
αk−1(u − vk) + ek−1 + (vk − vk+1) ∈ βk ∂ϕ(vk+1),
and for all z ∈ H, we have from the boundedness of {‖ek‖/αk} and {vk} (see the proof of
Theorem 1 [2]),
2βk (ϕ(vk+1) − ϕ(z)) ≤ 2⟨vk − vk+1, vk+1 − z⟩ + 2αk−1 ⟨u − vk + ek−1/αk−1, vk+1 − z⟩
≤ ‖vk − z‖² − ‖vk+1 − vk‖² − ‖vk+1 − z‖² + M αk−1,    (31)
for some M > 0. Summing (31) from k = 1, . . . , n and rearranging terms, we get
2ϕ(z) + (‖v1 − z‖² + M ∑_{k=1}^n αk−1)/σn ≥ 2 ∑_{k=1}^n βk ϕ(vk+1)/σn ≥ 2ϕ( ∑_{k=1}^n βk vk+1/σn ).    (32)
Therefore (30) follows from (32). The proof of the other estimate is similar. The final assertion
of the theorem is obvious. ⊓⊔
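The ergodic estimate can also be observed numerically. In the toy example below (our own choice, not from the paper: ϕ(x) = |x| on H = R, u = 2, v1 = 4, αk−1 = 1/k, βk = 1 and ek = 0, so σn = n), the residual ϕ(wn) − inf ϕ decays at least like (log n)/n, in agreement with (30).

```python
import numpy as np

def prox_abs(y, beta):
    # resolvent of A = subdifferential of |.|: soft thresholding
    return float(np.sign(y)) * max(abs(y) - beta, 0.0)

u, v = 2.0, 4.0                         # u and v_1
vs = []
for k in range(1, 10**5 + 1):
    v = prox_abs((1.0 / k) * u + (1 - 1.0 / k) * v, 1.0)   # one step of (9), e_k = 0
    vs.append(v)
    if k in (10**3, 10**4, 10**5):
        w = sum(vs) / len(vs)           # w_n of (28) with beta_k = 1, sigma_n = n
        print(k, abs(w), np.log(k) / k) # residual phi(w_n) - inf phi vs. the (log n)/n scale
```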
Acknowledgments The authors thank Alexandru Kristály for his insightful comments on this paper.
References
1. Boikanyo, O.A., Moroşanu, G.: Modified Rockafellar’s algorithms. Math. Sci. Res. J. 13(5), 101–
122 (2009)
2. Boikanyo, O.A., Moroşanu, G.: A proximal point algorithm converging strongly for general errors. Optim.
Lett. doi:10.1007/s11590-010-0176-z.
3. Browder, F.E.: Convergence of approximants to fixed points of nonexpansive nonlinear mappings in
Banach spaces. Arch. Ration. Mech. Anal. 24, 82–90 (1967)
4. Chidume, C.E., Chidume, C.O.: Iterative approximation of fixed points of nonexpansive mappings.
J. Math. Anal. Appl. 318, 288–295 (2006)
5. Du, D.-Z., Pardalos, P.M., Wu, W.: Mathematical Theory of Optimization. Nonconvex Optimization and
its Applications, vol. 56. Kluwer, Dordrecht (2001)
6. Floudas, C.A., Pardalos, P.M. (eds.): Encyclopedia of Optimization, 2nd edn. Springer, Berlin (2009)
7. Halpern, B.: Fixed points of nonexpanding maps. Bull. Am. Math. Soc. 73, 957–961 (1967)
8. He, Z., Zhang, D., Gu, F.: Viscosity approximation method for m-accretive mapping and variational
inequality in Banach space. An. Şt. Univ. Ovidius Constanţa 17(1), 91–104 (2009)
9. Horst, R., Pardalos, P.M., Thoai, N.V.: Introduction to Global Optimization, 2nd edn. Nonconvex Optimization and its Applications, vol. 58. Kluwer, Dordrecht (2000)
10. Kamimura, S., Takahashi, W.: Approximating solutions of maximal monotone operators in Hilbert
spaces. J. Approx. Theory 106, 226–240 (2000)
11. Lehdili, N., Moudafi, A.: Combining the proximal algorithm and Tikhonov regularization. Optimization 37, 239–252 (1996)
12. Lions, P.L.: Approximation de points fixes de contractions. C. R. Acad. Sci. Paris. Ser. A-B 284, A1357–
A1359 (1977)
13. Moroşanu, G.: Nonlinear Evolution Equations and Applications. Reidel, Dordrecht (1988)
14. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–
898 (1976)
15. Song, Y., Yang, C.: A note on the paper “A regularization method for the proximal point algorithm”.
J. Glob. Optim. 43, 171–174 (2009)
16. Suzuki, T.: A sufficient and necessary condition for Halpern-type strong convergence to fixed points of
nonexpansive mappings. Proc. Am. Math. Soc. 135, 99–106 (2007)
17. Wittmann, R.: Approximation of fixed points of nonexpansive mappings. Arch. Math. (Basel) 58, 486–491 (1992)
18. Xu, H.K.: Iterative algorithms for nonlinear operators. J. Lond. Math. Soc. 66(2), 240–256 (2002)
19. Xu, H.K.: A regularization method for the proximal point algorithm. J. Glob. Optim. 36, 115–125 (2006)