On superadditive rates of convergence
tome 19, no 4
(1985), p. 671-685
by Florian A. POTRA (*)
Communicated by Françoise CHATELIN
Abstract. In the present paper we prove that a superadditive function is a rate of convergence,
in the sense of V. Pták, if and only if its iterates are pointwise convergent to zero. Moreover, if the
function is continuous from the right, then this is equivalent to the inequality $w(t) < t$. These results
are generalized to rates of convergence of several variables.
The method of nondiscrete induction was introduced in 1966 by V. Pták [16]
in connection with some quantitative refinements of the closed graph theorem.
Since then the method has been applied successfully to the study of various
iterative constructions in analysis and numerical analysis. The results were
published in a series of papers ([1, 2, 4-28, 30]) and were recently collected
in a book [15]. Many of the theorems obtained by this method are sharp
in the sense that neither the convergence conditions nor the error estimates
can be improved in the class of problems considered. This is explained by the
fact that in the application of the method of nondiscrete induction the classical
way of measuring convergence is replaced by a more refined one, which makes
it possible to obtain estimates that are sharp not only asymptotically but throughout
the whole process. A rate of convergence is defined as a function, not as a number.
(*) Received in December 1984.
(1) Department of Mathematics, University of Iowa, Iowa City, Iowa 52242, U.S.A.
DEFINITION 1.1 : Let $T$ denote either the positive semiaxis $]0, \infty[$ or a half
open interval of the form $]0, b]$. A function $w : T \to T$ is called a rate of convergence on $T$ if
$$\sum_{n=0}^{\infty} w^{(n)}(t) < \infty, \qquad t \in T, \tag{1}$$
where $w^{(n)}$ denotes the $n$th iterate of $w$ in the sense of the usual function composition
(i.e., $w^{(0)}(t) = t$, $w^{(n+1)}(t) = w(w^{(n)}(t))$, $n = 0, 1, 2, \ldots$).
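As a minimal numerical sketch of Definition 1.1, the following Python fragment computes the iterates $w^{(n)}(t)$ and partial sums of the series (1). The function `half` (that is, $w(t) = t/2$) is a hypothetical example chosen for illustration and does not come from the paper; its iterates are $t/2^n$, so the series sums to $2t$.

```python
from typing import Callable, List


def iterates(w: Callable[[float], float], t: float, n_terms: int) -> List[float]:
    """Return [w^(0)(t), ..., w^(n_terms-1)(t)], where w^(0)(t) = t."""
    values = []
    current = t
    for _ in range(n_terms):
        values.append(current)
        current = w(current)  # w^(n+1)(t) = w(w^(n)(t))
    return values


def partial_sum(w: Callable[[float], float], t: float, n_terms: int) -> float:
    """Partial sum of the series (1) with n_terms terms."""
    return sum(iterates(w, t, n_terms))


def half(t: float) -> float:
    """Hypothetical rate of convergence w(t) = t / 2 (illustration only)."""
    return t / 2.0


if __name__ == "__main__":
    print(iterates(half, 1.0, 4))
    print(partial_sum(half, 1.0, 60))
```

A bounded sequence of partial sums is only numerical evidence of convergence, not a proof.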
By defining the rate of convergence of an iterative procedure as a function
we can retain more information at each step and obtain a sharp final estimate.
This departure from tradition in measuring convergence is justified by V. Pták
in a paper suggestively entitled « What should be a rate of convergence ? » [25].
In the same paper he stresses the importance of the rates of convergence
which satisfy a functional inequality of the form
$$s \circ w \le w \circ s, \tag{2}$$
where $s(t)$ denotes the sum of the series (1). This inequality is not implied by
Definition 1.1, and V. Pták shows that it is satisfied at least by convex rates
of convergence. It is easy to prove that (2) holds for superadditive rates of
convergence continuous from the left (see Proposition 3.2). Every convex
rate of convergence is superadditive but the converse is not true. For example,
the rate of convergence of Newton's method
$$w(t) = \frac{t^2}{2\sqrt{t^2 + a^2}} \tag{3}$$
is superadditive on $\mathbb{R}_+$ but it is convex only on the interval $[0, a\sqrt{2}]$ (see the
definitions in Sections 2.1 and 2.7).
These facts demonstrate that the class of superadditive rates of convergence
deserves special attention. In the present paper we intend to show that for
superadditive functions $w : T \to T$ condition (1) is equivalent to the simpler condition
$$\lim_{n \to \infty} w^{(n)}(t) = 0, \qquad t \in T. \tag{4}$$
Moreover, if $w : T \to T$ is superadditive and continuous from the right
then (1) is equivalent to the inequality
$$w(t) < t, \qquad t \in T. \tag{5}$$
The above-mentioned results will be proved in Section 3 in the more general
context of $p$-dimensional rates of convergence. The notion of a $p$-dimensional
rate of convergence (or rate of convergence of type $(p, 1)$) was first introduced
in [7] in connection with the study of Regula Falsi, and it represents a natural
generalization of the notion given in Definition 1.1 (see also [9] and [15]).
We have mentioned in the introduction that convex rates of convergence
are superadditive. This is a consequence of the fact that a convex function
$f : \mathbb{R}_+ \to \mathbb{R}$ which vanishes at zero is superadditive. This fact holds for functions of several variables as well, with a proper generalization of the notion
of convexity. We will consider the notion of S-convexity, which was introduced
(in a more general context) by J. W. Schmidt and H. Leonhardt [29], and
will compare it with some more popular generalizations of the notion of a
convex function, in order to give the reader a better understanding of its significance.
In what follows we will consider the Euclidean space $\mathbb{R}^p$ endowed with the
natural (component-wise) partial ordering « $\le$ » (i.e., for any $u = (u_1, \ldots, u_p)$,
$v = (v_1, \ldots, v_p)$ from $\mathbb{R}^p$ the relation $u \le v$ is equivalent to $u_i \le v_i$ for $i = 1, 2, \ldots, p$).
The set $\mathbb{R}^p_+ = \{ u \in \mathbb{R}^p ;\ u \ge 0 \}$ is called the non-negative cone of $\mathbb{R}^p$. Two elements $x, y$ of $\mathbb{R}^p$
are called comparable if either $x \le y$ or $y \le x$ holds. We will identify the
space $(\mathbb{R}^p)^*$ of all linear functionals defined on $\mathbb{R}^p$ with the space $\mathbb{R}^p$ itself.
This also motivates the notation $uv$ for the scalar product of the vectors $u$ and $v$. We will denote by $e_1, e_2, \ldots, e_p$ the standard basis of $\mathbb{R}^p$ and by
$e$ the vector $(1, 1, \ldots, 1)$. Obviously $e = e_1 + e_2 + \cdots + e_p$.
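DEFINITION 2.1 : Let $D$ be a convex subset of $\mathbb{R}^p$. A function $f : D \to \mathbb{R}$ is
called :
a) convex, if
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) \tag{6}$$
for all $\lambda \in [0, 1]$ and all $x, y \in D$ ;
b) order convex, if (6) holds for all $\lambda \in [0, 1]$ and all comparable $x, y$ from $D$ ;
c) S-convex, if there is a mapping $\delta f(\cdot, \cdot) : \Delta \to \mathbb{R}^p$, defined on a set $\Delta$ of pairs $(x, y) \in D \times D$, such that
$$\delta f(x, y)(x - y) = f(x) - f(y) \quad \text{for all } (x, y) \in \Delta, \tag{7}$$
$$\delta f(x, y) \le \delta f(u, v) \quad \text{for all } (x, y), (u, v) \in \Delta \text{ with } x \le u,\ y \le v. \tag{8}$$
For $p = 1$ the notions defined above coincide. If $p > 1$ convexity implies
order-convexity, but the reverse is not true. A similar statement holds if we
replace convexity by S-convexity.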
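PROPOSITION 2.2 : Let $D$ be a convex subset of $\mathbb{R}^p$ and let $f : D \to \mathbb{R}$ be an
S-convex function. Then $f$ is also order-convex.
Proof : Consider two points $x, y \in D$ with $x \le y$ and let $\lambda$ be a number
between 0 and 1. Denote $v = \lambda x + (1 - \lambda) y$. We have obviously $x \le v \le y$,
$v - x = (1 - \lambda)(y - x)$, $y - v = \lambda (y - x)$, so that we can write :
$$\begin{aligned}
\lambda f(x) + (1 - \lambda) f(y) - f(v) &= \lambda (f(x) - f(v)) + (1 - \lambda)(f(y) - f(v)) \\
&= \lambda\, \delta f(x, v)(x - v) + (1 - \lambda)\, \delta f(y, v)(y - v) \\
&= \lambda (1 - \lambda) \big( \delta f(y, v) - \delta f(x, v) \big)(y - x) \ge 0,
\end{aligned}$$
where the last inequality follows from (8), since $x \le y$ and $y - x \ge 0$. □
We will see in what follows that the reverse of the above proposition is not
true. Before that let us give a useful characterization of S-convexity :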
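PROPOSITION 2.3 : Let $D$ be a convex subset of $\mathbb{R}^p$. A function $f : D \to \mathbb{R}$
is S-convex if and only if
$$\frac{1}{s} \big( f(x + s e_i) - f(x) \big) \le \frac{1}{t} \big( f(y + t e_i) - f(y) \big) \tag{9}$$
for all $i = 1, 2, \ldots, p$, $x, y \in D$, $s, t \in \mathbb{R} \setminus \{0\}$ satisfying $x \le y$ and
$x + s e_i \le y + t e_i$.
Proof : If $f$ is S-convex we can write
$$\frac{1}{s} \big( f(x + s e_i) - f(x) \big) = \frac{1}{s}\, \delta f(x + s e_i, x)\, s e_i = \delta f(x + s e_i, x)\, e_i \le \delta f(y + t e_i, y)\, e_i = \frac{1}{t} \big( f(y + t e_i) - f(y) \big),$$
which proves the necessity of the condition stated in the proposition. Conversely, suppose (9) is satisfied. Then,
taking $x = y$ in (9), it is easy to see that the following limits exist :
$$D_i^+ f(x) = \lim_{s \to 0^+} \frac{1}{s} \big( f(x + s e_i) - f(x) \big), \qquad D_i^- f(x) = \lim_{s \to 0^-} \frac{1}{s} \big( f(x + s e_i) - f(x) \big).$$
Denote $D_i f(x) = (D_i^+ f(x) + D_i^- f(x))/2$. For any two points $x = (x_1, x_2, \ldots, x_p)$,
$y = (y_1, y_2, \ldots, y_p)$ from $D$ define a vector $\delta f(x, y) \in \mathbb{R}^p$ whose $i$th component is given by
$$\frac{f(x_1, \ldots, x_{i-1}, x_i, y_{i+1}, \ldots, y_p) - f(x_1, \ldots, x_{i-1}, y_i, y_{i+1}, \ldots, y_p)}{x_i - y_i}$$
if $x_i \ne y_i$, and by
$$D_i f(x_1, \ldots, x_{i-1}, x_i, y_{i+1}, \ldots, y_p)$$
if $x_i = y_i$. It is easy to check that conditions (7) and (8) are satisfied for the
mapping $\delta f(\cdot, \cdot)$ defined as above. The proof is complete. □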
In order to clarify the relation between the notions introduced in Definition 2.1, let us suppose that the function $f$ is twice G-differentiable. Then $f$
is convex if and only if
$$f''(x) hh \ge 0 \quad \text{for all } h \in \mathbb{R}^p,\ x \in D. \tag{11}$$
Also, $f$ is order convex if and only if
$$f''(x) hh \ge 0 \quad \text{for all } h \in \mathbb{R}^p_+,\ x \in D. \tag{12}$$
For a proof of the above statements see, for example, [3]. Using Proposition 2.3,
we can easily prove that $f$ is S-convex if and only if
$$f''(x) hk \ge 0 \quad \text{for all } h, k \in \mathbb{R}^p_+,\ x \in D. \tag{13}$$
Condition (11) means that the matrix $f''(x)$ is positive semidefinite, while
condition (13) means that all the entries of the matrix $f''(x)$ are nonnegative.
This observation shows that there are many functions which are convex
but not S-convex, as well as functions which are S-convex but not convex.
As an example from the first category let us take
$$f : \mathbb{R}^2 \to \mathbb{R}, \qquad f(x_1, x_2) = x_1^2 - x_1 x_2 + x_2^2, \tag{14}$$
and as an example from the second category
$$f : \mathbb{R}^2 \to \mathbb{R}, \qquad f(x_1, x_2) = x_1^2 + 4 x_1 x_2 + x_2^2. \tag{15}$$
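The two examples can be checked against conditions (11) and (13) with a short Python sketch. It assumes that example (14) is $x_1^2 - x_1 x_2 + x_2^2$ and example (15) is $x_1^2 + 4 x_1 x_2 + x_2^2$, so that their (constant) Hessians are the matrices hard-coded below; the helper names are illustrative only.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]

# Constant Hessians of the two quadratic examples (assumed forms, see lead-in).
HESSIAN_14: Matrix = [[2.0, -1.0], [-1.0, 2.0]]
HESSIAN_15: Matrix = [[2.0, 4.0], [4.0, 2.0]]


def eigenvalues_sym_2x2(m: Matrix) -> Tuple[float, float]:
    """Eigenvalues (smallest first) of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return (mean - radius, mean + radius)


def is_positive_semidefinite(m: Matrix) -> bool:
    """Condition (11): f''(x) h h >= 0 for every h."""
    return eigenvalues_sym_2x2(m)[0] >= 0.0


def has_nonnegative_entries(m: Matrix) -> bool:
    """Condition (13): f''(x) h k >= 0 for all h, k in the non-negative cone."""
    return all(entry >= 0.0 for row in m for entry in row)


if __name__ == "__main__":
    for name, hessian in (("(14)", HESSIAN_14), ("(15)", HESSIAN_15)):
        print(name, is_positive_semidefinite(hessian), has_nonnegative_entries(hessian))
```

Under these assumptions the first Hessian is positive semidefinite with a negative off-diagonal entry, while the second has nonnegative entries and a negative eigenvalue.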
It is known that convex functions are continuous (for a proof, see [3]).
For order-convex functions (and even for S-convex functions) we have only
continuity from the left and from the right in the sense of the following definition :
DEFINITION 2.4 : A function $f : D \subset \mathbb{R}^p \to \mathbb{R}$ is called continuous from the
right (resp. left) at $x \in D$ if for any $\varepsilon > 0$ there is a $\delta > 0$ such that
$$|f(x) - f(y)| < \varepsilon$$
whenever $y \in D$ and $x \le y \le x + \delta e$ (resp. $x - \delta e \le y \le x$). □
More precisely we have the following.
THEOREM 2.5 : Let $D$ be an open convex subset of $\mathbb{R}^p$ and $f : D \to \mathbb{R}$
an order convex function. Then $f$ is continuous from the left and from the right
at each point of $D$.
The above theorem can be proved by adapting the proof of Theorem 3.4.2
of [3] and using the following lemma.
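LEMMA 2.6 : Let $U_p$ be the hypercube
$$U_p = \{ x = (x_1, \ldots, x_p) \in \mathbb{R}^p ;\ \max_{1 \le j \le p} |x_j| \le 1 \}$$
and let $u_1^{(p)}, u_2^{(p)}, \ldots, u_{2^p}^{(p)}$ be its vertices. If $f : U_p \to \mathbb{R}$ is order convex then for
any $x \in U_p$ we have
$$f(x) \le \max_{1 \le k \le 2^p} f(u_k^{(p)}).$$
Proof : Our lemma is trivially verified for $p = 1$. Suppose it holds for a
given $p \ge 1$. By reordering the vertices of $U_{p+1}$ we may suppose that the
last coordinate of
$$u_1^{(p+1)}, u_2^{(p+1)}, \ldots, u_{2^p}^{(p+1)} \tag{16}$$
is equal to $-1$ while the last coordinate of
$$u_{2^p+1}^{(p+1)}, u_{2^p+2}^{(p+1)}, \ldots, u_{2^{p+1}}^{(p+1)} \tag{17}$$
equals 1. Let us denote by $U_p'$ the convex envelope of the points (16) and by
$U_p''$ the convex envelope of the points (17). From the induction hypothesis
we have
$$f(x') \le \max_{1 \le k \le 2^p} f(u_k^{(p+1)}), \quad x' \in U_p' ; \qquad f(x'') \le \max_{2^p + 1 \le k \le 2^{p+1}} f(u_k^{(p+1)}), \quad x'' \in U_p''.$$
Now any $x = (x_1, x_2, \ldots, x_p, x_{p+1}) \in U_{p+1}$ can be written as
$$x = \lambda x' + (1 - \lambda) x'', \qquad \lambda \in [0, 1],$$
with $x' = (x_1, \ldots, x_p, -1) \in U_p'$, $x'' = (x_1, \ldots, x_p, 1) \in U_p''$. We have clearly
$x' \le x''$ so that using the order convexity of $f$ we deduce that
$$f(x) \le \lambda f(x') + (1 - \lambda) f(x'') \le \max_{1 \le k \le 2^{p+1}} f(u_k^{(p+1)}).$$
The proof is complete.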
We note that the above lemma does not hold if we replace $U_p$ by some
other convex polyhedron. Indeed, if we consider the function $f$ given by (15)
and the square with vertices $(1, 0)$, $(0, 1)$, $(-1, 0)$, $(0, -1)$ we have
$$f(1, 0) = f(0, 1) = f(-1, 0) = f(0, -1) = 1 \quad \text{and} \quad f(-1/2, -1/2) = 1 + 1/2.$$
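The lemma and this counterexample can be illustrated numerically. The Python sketch below assumes that the function (15) is $x_1^2 + 4 x_1 x_2 + x_2^2$; it samples the hypercube $[-1, 1]^2$ on a grid (an illustrative check, not a proof) and then evaluates the function on the rotated square.

```python
from itertools import product
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def f15(x1: float, x2: float) -> float:
    """Assumed form of example (15): x1^2 + 4*x1*x2 + x2^2."""
    return x1 * x1 + 4.0 * x1 * x2 + x2 * x2


def max_over(points: Iterable[Point]) -> float:
    """Largest value of f15 over a finite set of points."""
    return max(f15(x1, x2) for x1, x2 in points)


CUBE_VERTICES: List[Point] = list(product((-1.0, 1.0), repeat=2))
DIAMOND_VERTICES: List[Point] = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]


def cube_grid(n: int = 20) -> List[Point]:
    """Uniform grid of sample points in the hypercube [-1, 1]^2."""
    ticks = [-1.0 + 2.0 * k / n for k in range(n + 1)]
    return list(product(ticks, ticks))


if __name__ == "__main__":
    bound = max_over(CUBE_VERTICES)
    # On the hypercube the vertex maximum bounds every sampled value.
    print(all(f15(x1, x2) <= bound + 1e-12 for x1, x2 in cube_grid()))
    # On the rotated square a boundary point exceeds the vertex maximum.
    print(max_over(DIAMOND_VERTICES), f15(-0.5, -0.5))
```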
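Let us state now two very simple results to be used later :
DEFINITION 2.7 : A function $f : D \subset \mathbb{R}^p \to \mathbb{R}$ is called superadditive if
$$f(x + y) \ge f(x) + f(y) \quad \text{for all } x, y \in D \text{ with } x + y \in D.$$
DEFINITION 2.8 : A function $f : D \subset \mathbb{R}^p \to \mathbb{R}$ is called isotone if
$$f(x) \le f(y) \quad \text{for all } x, y \in D \text{ with } x \le y.$$
PROPOSITION 2.9 : Let $T$ denote either the whole positive axis $]0, \infty[$ or a
half open interval of the form $]0, b]$ and set
$$D_0 = T^p \cup \{0\},$$
where 0 is the origin of $\mathbb{R}^p$. If $f : D_0 \to \mathbb{R}$ is S-convex and $f(0) = 0$ then $f$ is superadditive.
Proof : By (7) and (8) we have
$$f(x + y) - f(x) - f(y) = f(x + y) - f(x) - (f(y) - f(0)) = \delta f(x + y, x)\, y - \delta f(y, 0)\, y \ge 0.$$ □
PROPOSITION 2.10 : If $f : D \subset \mathbb{R}^p \to \mathbb{R}$ is superadditive and nonnegative
(i.e., $f(D) \subset \mathbb{R}_+$) then $f$ is isotone.
Proof : If $x, y \in D$ and $x \le y$ then
$$f(y) = f(x + y - x) \ge f(x) + f(y - x) \ge f(x).$$ □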
It is interesting to remark that Proposition 2.9 does not hold if we replace
S-convexity by convexity. For example, the function $f$ given by (14) is convex,
but if we take $x = (1, 9)$ and $y = (9, 1)$ then we have $f(x + y) = f(10, 10) = 100$
and $f(x) + f(y) = 2(81 - 9 + 1) = 146$.
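The arithmetic of this counterexample is easy to reproduce. The following Python sketch assumes the forms $x_1^2 - x_1 x_2 + x_2^2$ for (14) and $x_1^2 + 4 x_1 x_2 + x_2^2$ for (15); `superadditivity_gap` is a hypothetical helper returning $f(x + y) - f(x) - f(y)$, which is negative exactly when superadditivity fails at the pair $(x, y)$.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def f14(x1: float, x2: float) -> float:
    """Assumed form of example (14): convex but not S-convex."""
    return x1 * x1 - x1 * x2 + x2 * x2


def f15(x1: float, x2: float) -> float:
    """Assumed form of example (15): S-convex but not convex."""
    return x1 * x1 + 4.0 * x1 * x2 + x2 * x2


def superadditivity_gap(f: Callable[[float, float], float], x: Point, y: Point) -> float:
    """f(x + y) - f(x) - f(y); superadditivity requires this to be >= 0."""
    return f(x[0] + y[0], x[1] + y[1]) - f(*x) - f(*y)


if __name__ == "__main__":
    x, y = (1.0, 9.0), (9.0, 1.0)
    print(superadditivity_gap(f14, x, y))  # negative: (14) is not superadditive
    print(superadditivity_gap(f15, x, y))  # nonnegative at this pair
```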
Let $T$ denote as before either the set of all positive real numbers or a half
open interval of the form $]0, b]$. Let $w$ be a mapping of the cartesian product
$T^p$ into $T$ and let us consider the « iterates » $w^{(n)}$ of $w$ given for each
$t = (t_1, t_2, \ldots, t_p) \in T^p$
by the following recurrent scheme :
$$w^{(0)}(t) = t_p, \qquad w^{(n+1)}(t) = w^{(n)}(t_2, \ldots, t_p, w(t)), \qquad n = 0, 1, 2, \ldots \tag{18}$$
DEFINITION 3.1 : A mapping $w : T^p \to T$ with the above iteration law is
called a $p$-dimensional rate of convergence (or rate of convergence of type $(p, 1)$)
on $T$ if
$$\sum_{n=0}^{\infty} w^{(n)}(t) < \infty, \qquad t \in T^p. \tag{19}$$
It is convenient to attach to the mapping $w : T^p \to T$ a mapping $\tilde{w} : T^p \to T^p$
defined for every $t = (t_1, t_2, \ldots, t_p) \in T^p$ by
$$\tilde{w}(t) = (t_2, \ldots, t_p, w(t)).$$
It is easily seen that if we denote by $\tilde{w}^{(n)}$ the iterates of $\tilde{w}$ in the sense of the
usual composition of functions (i.e., $\tilde{w}^{(0)}(t) = t$, $\tilde{w}^{(n+1)}(t) = \tilde{w}(\tilde{w}^{(n)}(t))$) then
from (18) it follows that
$$w^{(n+1)}(t) = w(\tilde{w}^{(n)}(t)) = w^{(n)}(\tilde{w}(t)), \qquad n = 0, 1, 2, \ldots, \tag{20}$$
$$\tilde{w}^{(n)}(t) = (w^{(n-p+1)}(t), \ldots, w^{(n)}(t)), \qquad n = p - 1, p, p + 1, \ldots \tag{20'}$$
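The iteration law and the companion map can be sketched in a few lines of Python. The code assumes the scheme $w^{(0)}(t) = t_p$, $w^{(n+1)}(t) = w^{(n)}(t_2, \ldots, t_p, w(t))$ and the companion map $\tilde{w}(t) = (t_2, \ldots, t_p, w(t))$; the function `example_w` (with $p = 2$) is a hypothetical illustration, not an example from the paper.

```python
from typing import Callable, Tuple

Vector = Tuple[float, ...]
Rate = Callable[[Vector], float]


def shift_map(w: Rate, t: Vector) -> Vector:
    """Companion map: drop t_1 and append w(t)."""
    return t[1:] + (w(t),)


def iterate_by_scheme(w: Rate, t: Vector, n: int) -> float:
    """w^(n)(t) computed literally from the recurrent scheme."""
    if n == 0:
        return t[-1]
    return iterate_by_scheme(w, shift_map(w, t), n - 1)


def companion_iterate(w: Rate, t: Vector, n: int) -> Vector:
    """n-fold composition of the companion map applied to t."""
    for _ in range(n):
        t = shift_map(w, t)
    return t


def example_w(t: Vector) -> float:
    """Hypothetical two-dimensional rate w(t1, t2) = (t1 + t2) / 4."""
    return (t[0] + t[1]) / 4.0


if __name__ == "__main__":
    t0 = (1.0, 1.0)
    print([iterate_by_scheme(example_w, t0, n) for n in range(4)])
    # For p = 2 and n >= 1 the companion iterate collects w^(n-1)(t), w^(n)(t).
    print(companion_iterate(example_w, t0, 2))
```

Working with the companion map avoids the recursion: the scalar iterate $w^{(n)}(t)$ is simply the last component of the $n$th companion iterate.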
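Let us denote by $s(t)$ the sum of the series (19) and let us consider the mappings
$s_i : T^p \to \mathbb{R}$, $i = 1, 2, \ldots, p$, given by
$$s_i(t) = s(t) + \sum_{j=i}^{p-1} t_j, \qquad t = (t_1, t_2, \ldots, t_p) \in T^p.$$
Finally let us define the mapping
$$\tilde{s} : T^p \to T^p, \qquad \tilde{s}(t) = (s_1(t), s_2(t), \ldots, s_p(t)).$$
With the above notation we have :
PROPOSITION 3.2 : If the $p$-dimensional rate of convergence $w : T^p \to T$ is
superadditive and continuous from the left then
$$s(\tilde{w}(t)) \le w(\tilde{s}(t)) \quad \text{for all } t \in T^p. \tag{21}$$
Proof : First, let us observe that
$$\sum_{n=0}^{\infty} \tilde{w}^{(n)}(t) = \tilde{s}(t).$$
Using the superadditivity of $w$ and (20) we can write
$$\sum_{k=0}^{n} w^{(k)}(\tilde{w}(t)) = \sum_{k=0}^{n} w(\tilde{w}^{(k)}(t)) \le w\Big( \sum_{k=0}^{n} \tilde{w}^{(k)}(t) \Big).$$
Finally, letting $n$ tend to infinity in the above inequality, we obtain (21). □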
The inequality (21) was first proved by V. Pták [25] for unidimensional
convex rates of convergence. The heuristic motivation for the importance of
such an inequality given in the above-mentioned paper can be generalized
to the $p$-dimensional case as follows : Suppose $\{x_n\}_{n \ge 0}$ is a sequence of
points belonging to a complete metric space $(X, d)$ such that
$$d(x_n, x_{n+1}) \le w(d(x_{n-p}, x_{n-p+1}), \ldots, d(x_{n-1}, x_n)), \qquad n = p, p + 1, \ldots,$$
where $w$ is an isotone $p$-dimensional rate of convergence. It is easy to see
that under this assumption the sequence $\{x_n\}_{n \ge 0}$ has to be convergent.
Moreover if we denote
$$x^* = \lim_{n \to \infty} x_n, \qquad r_0 = (d(x_0, x_1), \ldots, d(x_{p-1}, x_p))$$
then we have the estimates
$$d(x_n, x^*) \le e_n := s(\tilde{w}^{(n-p+1)}(r_0)), \qquad n = p - 1, p, p + 1, \ldots$$
If the sequence $\{x_n\}_{n \ge 0}$ is obtained via an iterative procedure then at a
certain stage the distances $d(x_n, x_{n+1})$ are known for, say, all $n \le N$ while
the distances $d(x_n, x^*)$ are generally not known. It is then important to note
that if (21) holds then the estimates $e_n$ will satisfy a relation similar to the
relation satisfied by the distances $d(x_n, x_{n+1})$. Indeed, for $n \ge 2p - 2$ we have :
$$e_{n+1} = s(\tilde{w}(\tilde{w}^{(n-p+1)}(r_0))) \le w(\tilde{s}(\tilde{w}^{(n-p+1)}(r_0))) = w(e_{n-p+1}, \ldots, e_n).$$
In what follows we will show that for superadditive functions $w : T^p \to T$,
condition (19), which is generally very difficult to verify, can be replaced by
much simpler ones.
THEOREM 3.3 : Let $w : T^p \to T$ be a superadditive function. Then $w$ is a
$p$-dimensional rate of convergence on $T$ if and only if
$$\lim_{n \to \infty} w^{(n)}(t) = 0 \quad \text{for all } t \in T^p. \tag{22}$$
Proof : The necessity of condition (22) is obvious. In order to prove the
sufficiency we note first that (22) implies
$$w(\eta, \eta, \ldots, \eta) < \eta, \qquad \eta \in T. \tag{23}$$
Indeed, if we suppose that there is an $\eta \in T$ such that $w(\eta, \eta, \ldots, \eta) \ge \eta$ then
because of the isotony of $w$ (see Proposition 2.10) we have $w^{(n)}(\eta, \eta, \ldots, \eta) \ge \eta$ for all $n$,
which contradicts (22).
Now, let us fix a point $t \in T^p$ and a positive number $\varepsilon > 0$. Let us denote
$a = \varepsilon - w(\varepsilon, \varepsilon, \ldots, \varepsilon)$; by (23) we have $a > 0$. From (22) it follows that there is an integer $N = N(t, \varepsilon)$
such that
$$w^{(n)}(t) \le a/p \quad \text{for any } n \ge N. \tag{24}$$
We will prove by induction with respect to $k$ that
$$w^{(n)}(t) + w^{(n+1)}(t) + \cdots + w^{(n+k)}(t) \le \varepsilon \quad \text{for any } n \ge N. \tag{25}$$
For $k = 0, 1, 2, \ldots, p - 1$ this follows immediately from (24). Suppose (25)
holds for a given $k \ge p - 1$. Using the superadditivity of $w$ and (20') we
have :
$$\begin{aligned}
w^{(n)}(t) + w^{(n+1)}(t) + \cdots + w^{(n+k+1)}(t)
&\le w^{(n)}(t) + \cdots + w^{(n+p-1)}(t) + w\big( \tilde{w}^{(n+p-1)}(t) + \cdots + \tilde{w}^{(n+k)}(t) \big) \\
&\le a + w(\varepsilon, \varepsilon, \ldots, \varepsilon) = \varepsilon.
\end{aligned}$$
Thus (25) holds for all $n \ge N$ and all $k = 0, 1, 2, \ldots$ Hence the series (19) is
convergent. The proof is complete.
In the proof of the above theorem we have seen that if $w : T^p \to T$ is an
isotone function then (22) implies (23). In what follows we show that for right
continuous functions the converse is also true.
PROPOSITION 3.4 : Let $w : T^p \to T$ be isotone and continuous from the
right. Then condition (22) is equivalent to
$$w(\eta, \eta, \ldots, \eta) < \eta, \qquad \eta \in T. \tag{23}$$
Proof : We only have to prove that (23) implies (22). Consider a vector
$t = (t_1, t_2, \ldots, t_p) \in T^p$ and set $\eta = \max \{ t_1, t_2, \ldots, t_p \}$. The sequence
$a_n = w^{(n)}(\eta, \eta, \ldots, \eta)$ is a nonincreasing sequence of positive numbers so that
it is convergent. Set $a = \lim a_n$ and suppose $a > 0$. Then from the right
continuity of $w$ we have
$$a = \lim_{n \to \infty} a_{n+1} = \lim_{n \to \infty} w(a_{n-p+1}, a_{n-p+2}, \ldots, a_n) = w(a, a, \ldots, a),$$
which contradicts (23). Hence $\lim_{n \to \infty} a_n = 0$. From the isotony of $w$ it follows
that $w^{(n)}(t) \le a_n$ so that (22) is satisfied. □
Now from Theorem 3.3 and Proposition 3.4 we can immediately deduce
the main result of our paper :
THEOREM 3.5 : If $w : T^p \to T$ is superadditive and continuous from the
right then conditions (19), (22) and (23) are equivalent. □
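The equivalence can be observed on a concrete function. The Python sketch below uses the hypothetical two-dimensional example $w(t_1, t_2) = t_1 t_2 / (t_1 + t_2)$, which is not taken from the paper; it is concave and positively homogeneous, hence superadditive, and continuous on $T^2$. The code checks the diagonal condition $w(\eta, \eta) < \eta$ on a few sample values, spot-checks superadditivity, and accumulates partial sums of the series. For $t = (1, 1)$ the iterates are reciprocals of Fibonacci numbers.

```python
from typing import Callable, Iterable, List, Tuple

Vector = Tuple[float, ...]
Rate = Callable[[Vector], float]


def harmonic_w(t: Vector) -> float:
    """Hypothetical superadditive example w(t1, t2) = t1*t2 / (t1 + t2)."""
    return t[0] * t[1] / (t[0] + t[1])


def scalar_iterates(w: Rate, t: Vector, n_terms: int) -> List[float]:
    """[w^(0)(t), ..., w^(n_terms-1)(t)] via the companion map (t_2, ..., t_p, w(t))."""
    values = []
    for _ in range(n_terms):
        values.append(t[-1])
        t = t[1:] + (w(t),)
    return values


def diagonal_condition(w: Rate, etas: Iterable[float], p: int) -> bool:
    """Check w(eta, ..., eta) < eta on the given sample values."""
    return all(w((eta,) * p) < eta for eta in etas)


def superadditive_on(w: Rate, samples: List[Vector], tol: float = 1e-12) -> bool:
    """Spot-check w(x + y) >= w(x) + w(y) on all pairs of sample points."""
    return all(
        w(tuple(a + b for a, b in zip(x, y))) >= w(x) + w(y) - tol
        for x in samples
        for y in samples
    )


if __name__ == "__main__":
    grid = [(a, b) for a in (0.5, 1.0, 2.0, 3.0) for b in (0.5, 1.0, 2.0, 3.0)]
    print(superadditive_on(harmonic_w, grid))
    print(diagonal_condition(harmonic_w, (0.1, 1.0, 10.0), p=2))
    terms = scalar_iterates(harmonic_w, (1.0, 1.0), 80)
    print(terms[:4], sum(terms))
```

Such a sample-based check can refute the conditions but cannot establish them; it only illustrates how the diagonal condition, the vanishing of the iterates and the boundedness of the partial sums appear together.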
In particular, for S-convex functions we have :
COROLLARY 3.6 : Let $w : T^p \to T$ be an S-convex function. Then $w$ is a
$p$-dimensional rate of convergence on $T$ if and only if one of the conditions (22)
or (23) is satisfied.
Proof : By virtue of 2.2, 2.5, 2.6 and 3.5 we only have to prove that if
$w : T^p \to T$ is S-convex and if there is a sequence $a_n \in T^p$ such that
$$a_{n+1} \le a_n, \quad n = 0, 1, 2, \ldots ; \qquad \lim_{n \to \infty} a_n = 0, \qquad \lim_{n \to \infty} w(a_n) = 0,$$
then by taking $w(0) = 0$, $w$ extends to an S-convex function on $D_0 = T^p \cup \{0\}$.
Let us fix a point $x \in T^p$ and consider the sequence $\{ \delta w(x, a_n) \}_{n \ge 0}$. According
to 2.1, c) we have
$$\delta w(x, a_{n+1}) \le \delta w(x, a_n), \qquad n = 0, 1, 2, \ldots, \tag{26}$$
$$\delta w(x, a_n)(x - a_n) = w(x) - w(a_n). \tag{27}$$
From (26) it follows that all the sequences $\{ \delta w(x, a_n)\, e_i \}_{n \ge 0}$ ($i = 1, 2, \ldots, p$)
are nonincreasing. Hence $d_i = \lim_{n \to \infty} \delta w(x, a_n)\, e_i$ exists, although it might be
equal to $-\infty$. From (27) it follows that $\sum_{i=1}^{p} d_i (x e_i) = w(x) > 0$, and taking
into account the fact that $x e_i > 0$ for $i = 1, 2, \ldots, p$ it follows that $d_i > -\infty$
for $i = 1, 2, \ldots, p$. Hence $\lim_{n \to \infty} \delta w(x, a_n)$ exists and we take by definition
$$\delta w(x, 0) = \lim_{n \to \infty} \delta w(x, a_n).$$
From (27) it follows that
$$\delta w(x, 0)(x - 0) = \delta w(x, 0)\, x = w(x) = w(x) - w(0).$$
We have to prove that
$$\delta w(x, 0) \le \delta w(u, v) \quad \text{for all } u, v \in D_0 \text{ with } x \le u.$$
If $v = 0$ we have
$$\delta w(x, 0) = \lim_{n \to \infty} \delta w(x, a_n) \le \lim_{n \to \infty} \delta w(u, a_n) = \delta w(u, 0).$$
If $v \ne 0$ then there is an integer $N$ such that $a_n \le v$ for $n \ge N$. Hence
$$\delta w(x, 0) = \lim_{n \to \infty} \delta w(x, a_n) \le \delta w(u, v).$$
In a similar way, by defining
$$\delta w(0, x) = \lim_{n \to \infty} \delta w(a_n, x)$$
we can prove that $\delta w(0, x)(0 - x) = w(0) - w(x)$ and $\delta w(0, x) \le \delta w(u, v)$ for
all $u, v \in D_0$ with $x \le u$. The proof is complete. □
From the proof of the above corollary it follows that any S-convex rate of
convergence $w : T^p \to T$ is superadditive. The converse is clearly not true.
It is interesting to note that the rate of convergence of Newton's method (3)
is convex only on the interval $[0, a\sqrt{2}]$ but it is superadditive on $\mathbb{R}_+$. The
proof of this fact is elementary but not entirely straightforward, and it is
given in what follows.
The rate of convergence of Newton's method (3) has been intensely studied
(see [12, 15, 20 and 21]). For this function the series (1) is convergent for any
$t \in \mathbb{R}_+$ and its sum has the simple explicit expression :
$$s(t) = t - a + \sqrt{t^2 + a^2}.$$
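The closed form can be compared with numerically accumulated partial sums. The Python sketch below assumes that the rate (3) has the form $w(t) = t^2 / (2\sqrt{t^2 + a^2})$ and that the sum is $s(t) = t - a + \sqrt{t^2 + a^2}$; both are consistent with the second derivative given later in this section. The function names are illustrative only.

```python
import math


def newton_rate(t: float, a: float) -> float:
    """Assumed form of the rate (3): t^2 / (2 * sqrt(t^2 + a^2))."""
    return t * t / (2.0 * math.sqrt(t * t + a * a))


def series_partial_sum(t: float, a: float, n_terms: int = 60) -> float:
    """Partial sum t + w(t) + w(w(t)) + ... with n_terms terms."""
    total = 0.0
    current = t
    for _ in range(n_terms):
        total += current
        current = newton_rate(current, a)
    return total


def closed_form_sum(t: float, a: float) -> float:
    """Assumed explicit sum of the series: t - a + sqrt(t^2 + a^2)."""
    return t - a + math.sqrt(t * t + a * a)


if __name__ == "__main__":
    for t, a in ((1.0, 1.0), (5.0, 0.5), (100.0, 1.0)):
        print(t, a, series_partial_sum(t, a), closed_form_sum(t, a))
```

For $a > 0$ the iterates eventually decrease quadratically, so a few dozen terms already agree with the closed form to machine precision in these examples.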
Explicit expressions for the finite sums
$$s_n(t) = t + w(t) + \cdots$$
have also been found (see [20] or [15]). These explicit expressions were used
to obtain sharp and elegant a priori and a posteriori estimates for Newton's
method for solving nonlinear operator equations in Banach spaces (see [23,
12 and 15]). The second derivative of $w$ is of the form
$$w''(t) = \frac{(2a^2 - t^2)\, a^2}{2\, (t^2 + a^2)^{5/2}},$$
which shows that $w$ is convex on $[0, a\sqrt{2}]$ and concave on $[a\sqrt{2}, +\infty[$.
In what follows we will show that $w$ is superadditive on $[0, +\infty[$. We have
to prove that
$$\frac{(x + y)^2}{[a^2 + (x + y)^2]^{1/2}} \ge \frac{x^2}{(a^2 + x^2)^{1/2}} + \frac{y^2}{(a^2 + y^2)^{1/2}}.$$
If $a = 0$ this is trivially satisfied. If $a > 0$ then by dividing both sides of the
inequality by $a$ and denoting $u = x/a$, $v = y/a$ this reduces to
$$\frac{(u + v)^2}{[1 + (u + v)^2]^{1/2}} - \frac{u^2}{(1 + u^2)^{1/2}} - \frac{v^2}{(1 + v^2)^{1/2}} \ge 0.$$
Let us fix $v > 0$ and consider the left hand side of the above inequality as a function
of $u$, say $f(u)$. We have obviously $f(0) = 0$. We will prove that $f(u) > 0$ for
$u > 0$ by showing that :
1) There is an $M > 0$ such that $f(u) > 0$ for all $u \ge M$.
2) The derivative of $f$ has a unique root $u^* > 0$ such that $f'(u) > 0$ for
$u < u^*$ and $f'(u) < 0$ for $u > u^*$.
Indeed, if 1) and 2) hold then for any $0 < u \le u^*$ we have $f(u) > f(0) = 0$, while
if $u > u^*$ then by taking $z > \max \{ u, M \}$ we get $f(u) > f(z) > 0$.
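In order to prove statement 1) we first perform the change of variable
$\varepsilon = 1/u$ and obtain a function of $\varepsilon$ which has the same sign as $f(1/\varepsilon)$,
$$g(\varepsilon) = (1 + v^2)^{1/2} \big\{ (1 + \varepsilon v)^2 (1 + \varepsilon^2)^{1/2} - [\varepsilon^2 + (1 + \varepsilon v)^2]^{1/2} \big\} - \varepsilon v^2 (1 + \varepsilon^2)^{1/2} [\varepsilon^2 + (1 + \varepsilon v)^2]^{1/2}.$$
We have
$$(1 + \varepsilon v)^2 (1 + \varepsilon^2)^{1/2} - [\varepsilon^2 + (1 + \varepsilon v)^2]^{1/2} = \varepsilon v (2 + \varepsilon v)\, \frac{(1 + \varepsilon v)^2 + \varepsilon^2 [(1 + \varepsilon v)^2 + 1]}{(1 + \varepsilon v)^2 (1 + \varepsilon^2)^{1/2} + [\varepsilon^2 + (1 + \varepsilon v)^2]^{1/2}},$$
so that we can write
$$g(\varepsilon) = \varepsilon v \big[ a(\varepsilon) (1 + v^2)^{1/2} - b(\varepsilon)\, v \big],$$
where $\lim_{\varepsilon \to 0} a(\varepsilon) = \lim_{\varepsilon \to 0} b(\varepsilon) = 1$. This shows that $g(\varepsilon) > 0$ for $\varepsilon$ sufficiently
small, which completes the proof of statement 1).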
If we denote
$$h(t) = \frac{t (2 + t^2)}{(1 + t^2)^{3/2}}$$
then the derivative of $f$ can be written as
$$f'(u) = h(u + v) - h(u).$$
It is easy to prove that for any given $v$ there is a unique $u > 0$ (more precisely
we have $[(\sqrt{5} - 1)/2]^{1/2} < u < \sqrt{2}$) such that $h(u + v) = h(u)$. (Intuitively
this can be easily realized by drawing the graph of $h$.) Hence the equation
$f'(u) = 0$ has a unique positive solution $u^*$. We have obviously $f'(0) = h(v) > 0$
and from the mean value theorem it follows that there is a $u > \sqrt{2}$ such that
$$f'(u) = h(u + v) - h(u) < 0,$$
which completes the proof of statement 2).
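The two statements can be explored numerically. The Python sketch below assumes the normalized forms $f(u) = q(u + v) - q(u) - q(v)$ with $q(t) = t^2 / (1 + t^2)^{1/2}$ and $h(t) = t(2 + t^2)/(1 + t^2)^{3/2}$; the helper names and the bisection routine are illustrative choices, not part of the paper. The root of $f'$ is bracketed in the interval between $[(\sqrt{5} - 1)/2]^{1/2}$ and $\sqrt{2}$.

```python
import math


def q(t: float) -> float:
    """Normalized rate (a = 1, factor 1/2 dropped): t^2 / sqrt(1 + t^2)."""
    return t * t / math.sqrt(1.0 + t * t)


def h(t: float) -> float:
    """Derivative of q: t * (2 + t^2) / (1 + t^2)^(3/2)."""
    return t * (2.0 + t * t) / (1.0 + t * t) ** 1.5


def f(u: float, v: float) -> float:
    """Superadditivity gap q(u + v) - q(u) - q(v)."""
    return q(u + v) - q(u) - q(v)


def f_prime(u: float, v: float) -> float:
    """Derivative of f with respect to u."""
    return h(u + v) - h(u)


LOWER = math.sqrt((math.sqrt(5.0) - 1.0) / 2.0)  # h(LOWER) = 1
UPPER = math.sqrt(2.0)                           # h attains its maximum here


def critical_point(v: float, steps: int = 100) -> float:
    """Bisection for the root of f'(., v) inside (LOWER, UPPER)."""
    lo, hi = LOWER, UPPER
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f_prime(mid, v) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for v in (0.1, 1.0, 10.0):
        u_star = critical_point(v)
        smallest_gap = min(f(0.01 * k, v) for k in range(1, 2001))
        print(v, u_star, smallest_gap)
```

The sampled gaps stay positive, in agreement with the superadditivity claim, although a grid check of this kind is only a sanity test.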
