On Angles, Projections and Iterations
² Rheinische Friedrich-Wilhelms-Universität Bonn
³ The Technion—Israel Institute of Technology
1 Introduction
The interest in the convergence of sequences of iterates of projections of various types goes
back at least to the mid-twentieth century. J. von Neumann’s article [17] from 1949 can
be considered one of the starting points of these investigations. In this article he shows
that given a Hilbert space H and two closed subspaces M, N ⊂ H, with corresponding
orthogonal projections PM and PN, respectively, the sequence defined by the alternating iterates
\[
x_n := (P_N P_M)^n x, \qquad n \in \mathbb{N},
\]
converges in norm to PM∩N x, the image of x under the orthogonal projection onto M ∩ N, for every x ∈ H; see also [14]. I. Halperin extended this result in [8] to iterates of finitely many orthogonal projections. S. Kayalar and H. L. Weinert showed in [9] that the speed of convergence is determined by the Friedrichs numbers between the subspaces involved. This can be considered a geometric condition
controlling the convergence behaviour.
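For readers who wish to experiment with this classical result, the following minimal numpy sketch (added here for illustration; the subspaces and dimensions are chosen arbitrarily) iterates the product of two orthogonal projections and compares the outcome with the orthogonal projection onto the intersection.

```python
import numpy as np

def orth_proj(A):
    """Orthogonal projection onto the column space of A."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(0)

# Two subspaces of R^6 sharing the direction u, so that generically M ∩ N = span(u).
u = rng.standard_normal((6, 1))
M = np.hstack([u, rng.standard_normal((6, 2))])   # dim M = 3
N = np.hstack([u, rng.standard_normal((6, 2))])   # dim N = 3

P_M, P_N, P_int = orth_proj(M), orth_proj(N), orth_proj(u)

x = rng.standard_normal(6)
y = x.copy()
for _ in range(1000):              # y = (P_M P_N)^n x
    y = P_M @ (P_N @ y)

# The distance to P_{M∩N} x tends to zero as the number of iterations grows.
print(np.linalg.norm(y - P_int @ x))
```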
Note that in all these cases the order in which the projections are iterated is of crucial
importance. In [1], I. Amemiya and T. Andô asked the question of whether convergence in
norm can always be achieved provided that each projection appears infinitely often. This
question was finally answered negatively by E. Kopecká and A. Paszkiewicz in [13], where
they give an example of three subspaces and an iteration order without convergence in
norm. More information concerning this phenomenon can be found in [11] and [12].
In Banach spaces, there are at least two natural generalisations of orthogonal pro-
jections: metric projections and linear projections. For the former, the image of a point is the point in the subspace which minimises the distance to it. It turns out that
for iterations of metric projections, one cannot expect convergence of the iterates to the
metric projection onto the intersection; see, for example, [20].
Recall that a linear mapping P on a Banach space X is called a linear projection if it
satisfies the condition that P 2 = P . In this case, there are many positive results under
the additional assumption that the projections are of norm one. For example, if the
Banach space X is uniformly convex, convergence of iterates of norm-one projections
was established by R. E. Bruck and S. Reich in [5]. This result was later generalised,
for example, to further classes of Banach spaces by C. Badea and Y. I. Lyubich in [2].
A dichotomy for the speed of convergence of iterations of projections in Banach spaces
which are uniformly convex of some power type has been exhibited by C. Badea and
D. Seifert in [3]. More results on the convergence of the alternating algorithm for norm-
one projections can be found in [7].
In the context of property (T) for certain groups, I. Oppenheim introduced in [19] an
angle between linear projections in Banach spaces. This concept was developed further
in [18], where a number of sufficient conditions for convergence of iterates of projections
are given.
Iterations of non-orthogonal projections in Hilbert spaces, which then necessarily have
a norm which is larger than one, are of interest in the context of discrete linear inclusions
and of Skorokhod problems; see, for example, [15, 21].
Since in most of the above results the convergence behaviour of the iterates of projec-
tions is determined, or at least influenced, by some kind of angle, one might hope that
for non-orthogonal projections in Hilbert spaces the situation could be similar. More
precisely, in the case of two linear projections, these projections are determined by two
subspaces each—the range and the kernel. Moreover, in the case of Euclidean spaces,
the concept of principal angles allows one to determine the relative position of two subspaces
up to an isometry. Therefore one could hope that these data might determine the con-
vergence of the iterates of these projections.
The aim of this article is twofold: in the first part, we show that even in Euclidean
spaces the convergence of the iterates is not determined by the principal angles between
the subspaces involved. In the second part, we investigate the properties of the Oppen-
heim angle between two linear projections and provide an example which shows that the
modification of the definition of this angle introduced in [18] is indeed necessary.
2 Preliminaries and Notation
2.1 Principal Angles
Principal angles are used to describe the geometric configuration of two subspaces of a
real Hilbert space H up to orthogonal mappings. Given two finite-dimensional subspaces
S1, S2 ⊆ H and denoting by q the minimum of the dimensions of S1 and S2, the principal angles θ1, . . . , θq ∈ [0, π/2] and the corresponding principal vectors u1, . . . , uq ∈ S1 and v1, . . . , vq ∈ S2 are defined recursively for k = 1, . . . , q by
\[
\cos \theta_k = \max\bigl\{\langle u, v\rangle : u \in S_1,\ v \in S_2,\ \|u\| = \|v\| = 1,\ \langle u, u_i\rangle = \langle v, v_i\rangle = 0 \text{ for } i < k \bigr\} = \langle u_k, v_k\rangle. \tag{1}
\]
The principal angles can also be represented in terms of the orthogonal
projections PS1 and PS2 onto S1 and S2 , respectively. More precisely,
\[
\theta_k = \arccos\bigl(\sqrt{\lambda_k}\,\bigr),
\]
where λ1 ≥ ... ≥ λq are the first q eigenvalues of the restriction of PS2 PS1 to S2 . This
formula allows for a direct computation of the principal angles, thus avoiding the op-
timisation problems in (1). Moreover, it has the advantage that it also makes sense for
infinite-dimensional subspaces; see, for example, [10].
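As a concrete illustration (added here; it is not part of the original text), the following numpy sketch computes the principal angles of two randomly chosen subspaces of R^5 twice: from the singular values of Q1ᵀQ2, where Q1 and Q2 are orthonormal bases of S1 and S2, and from the eigenvalues of PS2 PS1, in accordance with the formula above.

```python
import numpy as np

rng = np.random.default_rng(1)

Q1, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis of S1, dim S1 = 2
Q2, _ = np.linalg.qr(rng.standard_normal((5, 3)))   # orthonormal basis of S2, dim S2 = 3

# Principal angles from singular values: cos(theta_k) = sigma_k(Q1^T Q2).
cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
theta_svd = np.arccos(np.clip(cosines, -1.0, 1.0))

# Principal angles from the eigenvalues lambda_1 >= ... >= lambda_q of P_{S2} P_{S1}:
# theta_k = arccos(sqrt(lambda_k)), with q = min(dim S1, dim S2) = 2.
P1, P2 = Q1 @ Q1.T, Q2 @ Q2.T
lam = np.sort(np.linalg.eigvals(P2 @ P1).real)[::-1][:2]
theta_eig = np.arccos(np.sqrt(np.clip(lam, 0.0, 1.0)))

print(np.allclose(np.sort(theta_svd), np.sort(theta_eig)))   # True
```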
The principal angles between two subspaces define them completely up to a simultaneous rotation. This means that every rotation-invariant function of two subspaces can
be written as a function of the principal angles between them; for example, the Dixmier
angle and the Friedrichs angle are the smallest and the smallest non-zero principal angle,
respectively.
Since the function which maps subspaces to their orthogonal complements commutes
with rotations, knowing all the angles between two subspaces S1 and S2 is equivalent to
knowing all the angles between S1 and S2⊥ .
We denote by Θ(S1 , S2 ) the ordered tuple of the principal angles between two subspaces
S1 and S2 .
A more detailed exposition of these angles, in which the relations between various approaches to angles between subspaces (including principal angles and directed distances) are examined, can be found in [16, Chapter 5.15].
2.2 The Cross-Ratio of Projective Points
For four distinct points a1, a2, a3, a4 of the projective line P1(R), written in homogeneous coordinates as ai = [λi : µi], the cross-ratio of these points, denoted by [a1, a2, a3, a4] ∈ R ∪ {∞}, is defined by
\[
[a_1, a_2, a_3, a_4] = \frac{\det\begin{pmatrix}\lambda_3 & \lambda_1\\ \mu_3 & \mu_1\end{pmatrix}\det\begin{pmatrix}\lambda_4 & \lambda_2\\ \mu_4 & \mu_2\end{pmatrix}}{\det\begin{pmatrix}\lambda_3 & \lambda_2\\ \mu_3 & \mu_2\end{pmatrix}\det\begin{pmatrix}\lambda_4 & \lambda_1\\ \mu_4 & \mu_1\end{pmatrix}}.
\]
Lemma 1. Let a1, a2, a3, a4 ∈ R2 \ {0} be such that both sides of the following identity are well defined. Then
\[
\frac{\langle a_1, a_3\rangle\langle a_2, a_4\rangle}{\langle a_1, a_4\rangle\langle a_2, a_3\rangle} = [\operatorname{span}(a_1), \operatorname{span}(a_2), \operatorname{span}(a_3)^{\perp}, \operatorname{span}(a_4)^{\perp}].
\]
Proof. Using (x2 , −x1 ) as homogeneous coordinates for span(x)⊥ and the Leibniz formula
for the determinant, we can directly obtain this assertion from the definition.
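The determinant formula and Lemma 1 can be checked numerically. The sketch below (an added illustration) implements the cross-ratio in homogeneous coordinates and verifies the identity of Lemma 1 for randomly chosen vectors in R².

```python
import numpy as np

def cross_ratio(p1, p2, p3, p4):
    """Cross-ratio of four projective points given by homogeneous coordinates in R^2."""
    d = lambda a, b: a[0] * b[1] - a[1] * b[0]    # determinant of the matrix with columns a, b
    return (d(p3, p1) * d(p4, p2)) / (d(p3, p2) * d(p4, p1))

def perp(a):
    """Homogeneous coordinates of span(a)^perp, as in the proof of Lemma 1."""
    return np.array([a[1], -a[0]])

rng = np.random.default_rng(2)
a1, a2, a3, a4 = rng.standard_normal((4, 2))

lhs = (a1 @ a3) * (a2 @ a4) / ((a1 @ a4) * (a2 @ a3))
rhs = cross_ratio(a1, a2, perp(a3), perp(a4))
print(np.isclose(lhs, rhs))                        # True
```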
The behaviour of the cross-ratio function with respect to permutations is well known.
Since we will use the following lemma later, we state it at this point without proof (which
can be found, for example, in [4, pp. 123–126]).
Lemma 2. The cross-ratio satisfies
\[
[a_1, a_2, a_3, a_4] = [a_2, a_1, a_4, a_3] = [a_3, a_4, a_1, a_2]
\]
and
\[
[a_2, a_1, a_3, a_4] = [a_1, a_2, a_4, a_3] = \frac{1}{[a_1, a_2, a_3, a_4]}.
\]

2.3 The Oppenheim Angle

Let P1 and P2 be two linear projections on a Banach space X and let P12 be a projection onto im P1 ∩ im P2 satisfying P12 P1 = P12 P2 = P12. Following [18, 19], the cosine of the Oppenheim angle between P1 and P2 with respect to P12 is given by
\[
\cos_{P_{12}}(\angle(P_1, P_2)) = \max\{\|P_1(P_2 - P_{12})\|,\ \|P_2(P_1 - P_{12})\|\}.
\]
In the case of two orthogonal projections P1 , P2 in a Hilbert space, the above angle
coincides with the Friedrichs angle between the images of P1 and P2 . The subtraction of
the projection P12 in the definition above plays the role of the quotient in the definition
of the Friedrichs angle. There need not always be such a projection P12 . Moreover, the
intersection of the images of P1 and P2 need not even be complemented. Two projections
P1 and P2 are called consistent if such a projection does exist. We also call a projection
P12 with the above properties a consistency projection. The main interest in this angle lies
in the fact that a large Oppenheim angle, that is, a small cosine in the above definition,
implies that the iterations (P1 P2 )n converge uniformly to a (consistency) projection onto
im P1 ∩ im P2 . For a detailed discussion of these angles, we refer the reader to [18].
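For orthogonal projections in a Hilbert space, the coincidence with the Friedrichs angle is easy to verify numerically. The following sketch (added for illustration; the subspaces are chosen arbitrarily) compares max{‖P1(P2 − P12)‖, ‖P2(P1 − P12)‖}, where P12 is the orthogonal projection onto the intersection, with the cosine of the smallest non-zero principal angle.

```python
import numpy as np

def orth_proj(A):
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(3)

# Two planes in R^5 sharing the direction u, so that their intersection is span(u).
u = rng.standard_normal((5, 1))
A = np.hstack([u, rng.standard_normal((5, 1))])
B = np.hstack([u, rng.standard_normal((5, 1))])
P1, P2, P12 = orth_proj(A), orth_proj(B), orth_proj(u)

# Cosine of the Oppenheim angle with respect to the consistency projection P12.
cos_opp = max(np.linalg.norm(P1 @ (P2 - P12), 2), np.linalg.norm(P2 @ (P1 - P12), 2))

# Friedrichs angle: the smallest non-zero principal angle between the two planes.
Q1, _ = np.linalg.qr(A)
Q2, _ = np.linalg.qr(B)
cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)   # largest entry ~ 1 (shared direction u)
cos_friedrichs = cosines[cosines < 1 - 1e-10].max()

print(np.isclose(cos_opp, cos_friedrichs))             # True
```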
Proposition 1. Let P1, P2 : R2 → R2 be linear projections whose ranges R1, R2 and kernels N1, N2 are pairwise distinct one-dimensional subspaces. Then P1 P2 has exactly one non-zero eigenvalue λ and it satisfies
\[
\lambda = [R_1, R_2, N_2, N_1].
\]
In particular, we have
\[
\rho(P_1 P_2) = \frac{s(R_1, N_2)\, s(R_2, N_1)}{s(R_1, N_1)\, s(R_2, N_2)},
\]
where s(M, N) := \sqrt{1 - c(M, N)^2} for subspaces M, N ⊂ R2 and c(M, N) is the Friedrichs number of M and N. The iterates (P1 P2)^n converge to zero if and only if the geometric condition
\[
\frac{s(R_1, N_2)\, s(R_2, N_1)}{s(R_1, N_1)\, s(R_2, N_2)} < 1
\]
is satisfied.
Proof. For k ∈ {1, 2}, we write Pk = wk vk*, that is, Pk x = ⟨vk, x⟩wk, where wk spans Rk and vk spans Nk⊥. Note that, since Pk is a projection,
\[
w_k = P_k(w_k) = w_k \langle v_k, w_k\rangle,
\]
and thus ⟨vk, wk⟩ = 1.
Since R1 = span(w1 ), any non-zero eigenvector of P1 P2 must be a multiple of w1 .
Calculating P1 P2 w1, we get
\[
P_1 P_2 w_1 = w_1\, v_1^*(w_2)\, v_2^*(w_1) = \frac{\langle v_1, w_2\rangle \langle v_2, w_1\rangle}{\langle v_1, w_1\rangle \langle v_2, w_2\rangle}\, w_1
\]
and hence, using Lemma 1, we obtain
\[
\lambda = \frac{\langle v_1, w_2\rangle\langle v_2, w_1\rangle}{\langle v_1, w_1\rangle\langle v_2, w_2\rangle}
        = \frac{\langle w_1, v_2\rangle\langle w_2, v_1\rangle}{\langle w_1, v_1\rangle\langle w_2, v_2\rangle}
        = [\operatorname{span}(w_1), \operatorname{span}(w_2), \operatorname{span}(v_2)^{\perp}, \operatorname{span}(v_1)^{\perp}]
        = [R_1, R_2, N_2, N_1]
\]
and
\[
\rho(P_1 P_2) = \frac{|\langle w_1, v_2\rangle|\,|\langle w_2, v_1\rangle|}{|\langle w_1, v_1\rangle|\,|\langle w_2, v_2\rangle|}
= \frac{\bigl|\bigl\langle \tfrac{w_1}{\|w_1\|}, \tfrac{v_2}{\|v_2\|}\bigr\rangle\bigr|\,\bigl|\bigl\langle \tfrac{w_2}{\|w_2\|}, \tfrac{v_1}{\|v_1\|}\bigr\rangle\bigr|}{\bigl|\bigl\langle \tfrac{w_1}{\|w_1\|}, \tfrac{v_1}{\|v_1\|}\bigr\rangle\bigr|\,\bigl|\bigl\langle \tfrac{w_2}{\|w_2\|}, \tfrac{v_2}{\|v_2\|}\bigr\rangle\bigr|}
= \frac{s(R_1, N_2)\, s(R_2, N_1)}{s(R_1, N_1)\, s(R_2, N_2)}.
\]
In particular, we see that convergence occurs if and only if the modulus of the above
number is smaller than one.
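The formula of Proposition 1 is easy to test numerically. The sketch below (added for illustration) builds two random rank-one projections on R² in the form Pk x = ⟨vk, x⟩wk used in the proof and compares the spectral radius of P1 P2 with the quotient of sines; the helper sine() encodes s(M, N) for the one-dimensional subspaces involved.

```python
import numpy as np

rng = np.random.default_rng(4)

def rank_one_projection(w, v):
    """Projection with range span(w) and kernel span(v)^perp: P x = <v, x> w with <v, w> = 1."""
    v = v / (v @ w)                 # normalise so that <v, w> = 1, hence P^2 = P
    return np.outer(w, v), w, v

def sine(u, n):
    """s(span(u), span(n)^perp): the sine of the angle between span(u) and the line orthogonal to n."""
    return abs(u @ n) / (np.linalg.norm(u) * np.linalg.norm(n))

w1, v1, w2, v2 = rng.standard_normal((4, 2))
P1, w1, v1 = rank_one_projection(w1, v1)
P2, w2, v2 = rank_one_projection(w2, v2)

rho = max(abs(np.linalg.eigvals(P1 @ P2)))          # spectral radius of P1 P2
ratio = (sine(w1, v2) * sine(w2, v1)) / (sine(w1, v1) * sine(w2, v2))

print(np.isclose(rho, ratio))                        # True
# The iterates (P1 P2)^n converge to zero precisely when this common value is < 1.
```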
Remark 1. Note that for two projections P1 and P2 in R2 with distinct one-dimensional
images, the zero mapping is the uniquely determined projection onto the intersection of
these ranges. Using the above result, we see that
\[
\frac{s(R_1, N_2)\, s(R_2, N_1)}{s(R_1, N_1)\, s(R_2, N_2)} = \rho(P_1 P_2) \le \|P_1 P_2\| \le \cos \angle_0(P_1, P_2) = \cos \angle(P_1, P_2),
\]
that is, the iterates converge whenever the “cosine” of the Oppenheim angle between
the projections is smaller than one. Moreover, the above discussion shows that, in this
particular case, there is a relation between the Oppenheim angle and the principal angles.
We start our investigation of the convergence behaviour with the observation that the
problem is at its core still a two-dimensional problem. We call an eigenvector of the
composition P1 P2 non-trivial if the corresponding eigenvalue is neither zero nor one.
Lemma 3. For every non-trivial eigenvector v of P1 P2 , there exists a two-dimensional
subspace Ev such that the intersections of Ev with R1 , R2 , N1 , N2 are all one-dimensional
with trivial pairwise intersection and v ∈ Ev .
Proof. Let λ be the eigenvalue corresponding to v, that is, P1 P2 v = λv. Set w := P2(v) and Ev := span(w, v); note that λ ≠ 0 implies w ≠ 0. Then
\[
\lambda v \in R_1 \cap E_v, \qquad \lambda v - w = P_1(w) - w \in N_1 \cap E_v,
\]
\[
w \in R_2 \cap E_v \qquad\text{and}\qquad w - v = P_2(v) - v \in N_2 \cap E_v.
\]
Note that w ∈ R1 or v ∈ R2 would mean that w = v = λv, while w ∈ N1 or v ∈ N2 would
mean that P1 P2 v = 0, both contrary to the assumption that v is non-trivial. Therefore
v and w are linearly independent, all the intersections are one-dimensional and Ev is
two-dimensional. Using this, one also easily checks that the pairwise intersections are trivial.
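The membership relations used in this proof can be confirmed numerically. The following sketch (an added illustration; orthogonal projections are used only so that a real non-trivial eigenvalue is guaranteed, although Lemma 3 applies to arbitrary projections) picks a non-trivial eigenvector v of P1 P2 and checks that λv, λv − w, w and w − v lie in R1, N1, R2 and N2, respectively, where w = P2(v).

```python
import numpy as np

rng = np.random.default_rng(7)

def orth_proj(A):
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

# Two projections onto random two-dimensional subspaces of R^4.
P1 = orth_proj(rng.standard_normal((4, 2)))
P2 = orth_proj(rng.standard_normal((4, 2)))

# Pick an eigenvalue of P1 P2 which is neither 0 nor 1, i.e. a non-trivial one.
vals, vecs = np.linalg.eig(P1 @ P2)
k = next(i for i, z in enumerate(vals) if not (np.isclose(z, 0) or np.isclose(z, 1)))
lam, v = vals[k].real, vecs[:, k].real
w = P2 @ v                                        # E_v = span(w, v)

print([np.allclose(P1 @ (lam * v), lam * v),      # lam*v lies in R1 = im P1
       np.allclose(P1 @ (lam * v - w), 0),        # lam*v - w lies in N1 = ker P1
       np.allclose(P2 @ w, w),                    # w lies in R2 = im P2
       np.allclose(P2 @ (w - v), 0)])             # w - v lies in N2 = ker P2
```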
The above lemma implies, in particular, that we still can interpret the ranges and
kernels of the projections under consideration as projective points. Moreover, the cross-
ratio of these points still carries the vital information regarding the convergence.
Lemma 4. Let v be an eigenvector corresponding to the non-trivial eigenvalue λ of the
operator P1 P2 and let Ev be the associated subspace established in Lemma 3. Set Rk′ := Rk ∩ Ev and Nk′ := Nk ∩ Ev for k ∈ {1, 2}. Then
\[
\lambda = [R_1', R_2', N_2', N_1'],
\]
where the cross-ratio is taken on Ev.
Proof. First, we show that Pk(Ev) ⊆ Ev for k ∈ {1, 2}. To this end, observe that Lemma 3 implies that Ev = Rk′ ⊕ Nk′. Given x ∈ Ev, we choose r ∈ Rk′ and n ∈ Nk′ such
that x = r + n. Then
Pk (x) = Pk (r) + Pk (n) = r ∈ Ev .
Hence the mappings Pk′ : Ev → Ev , x 7→ Pk (x) are well-defined projections for k ∈ {1, 2}.
Since, by construction, λ is the non-trivial eigenvalue of P1′ P2′ , we can use Proposition 1
to complete the proof:
λ = [im P1′ , im P2′ , ker P2′ , ker P1′ ] = [R1′ , R2′ , N2′ , N1′ ].
Proof. As in the proof of Lemma 4 above, the well-definedness of the two projections
Pk′ : E → E, x 7→ Pk (x) follows from E = (Rk ∩ E) ⊕ (Nk ∩ E) for k ∈ {1, 2}. Since R1′ =
R(P1′ ) is one-dimensional, it is an eigenspace of P1′ P2′ corresponding to the eigenvalue
\[
N(P_{V^{\perp}}) = V \subseteq V \oplus W = N(P_{(V \oplus W)^{\perp}})
\]
Proposition 2. Let P1 , P2 : R3 → R3 be two projections with two-dimensional images.
Then P1 P2 has at most one non-trivial eigenvalue λ and, if it exists, it satisfies
\[
|\lambda| = \frac{\delta(N_2, R_1)\,\delta(N_1, R_2)}{\delta(N_1, R_1)\,\delta(N_2, R_2)}.
\]
In this case, the iterates (P1 P2)^n converge if and only if
\[
\frac{\delta(N_2, R_1)\,\delta(N_1, R_2)}{\delta(N_1, R_1)\,\delta(N_2, R_2)} < 1,
\]
that is, the convergence is determined by the angles between the ranges and kernels.
Proof. Assuming the existence of at least one non-trivial eigenvector v with corresponding
eigenvalue λ, we see that N1 ∩ N2 = {0}, the only possible choice of Ev is N1 ⊕ N2 , and
R1 ∩ R2 ∩ Ev = {0}. We set
as claimed.
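The following sketch (added for illustration; it does not use the quantity δ) generates two random projections on R³ with two-dimensional ranges, lists the eigenvalues of P1 P2 that are neither 0 nor 1, and monitors the differences of successive iterates, which tend to zero exactly when the modulus of the non-trivial eigenvalue is smaller than one.

```python
import numpy as np

rng = np.random.default_rng(5)

def projection(range_basis, kernel_basis):
    """Projection on R^3 with prescribed range and kernel (assumed to be complementary)."""
    B = np.hstack([range_basis, kernel_basis])     # columns: basis of the range, then of the kernel
    r = range_basis.shape[1]
    D = np.diag([1.0] * r + [0.0] * (3 - r))
    return B @ D @ np.linalg.inv(B)

# Two projections with two-dimensional ranges and one-dimensional kernels.
P1 = projection(rng.standard_normal((3, 2)), rng.standard_normal((3, 1)))
P2 = projection(rng.standard_normal((3, 2)), rng.standard_normal((3, 1)))

# At most one eigenvalue of P1 P2 is neither 0 nor 1 (Proposition 2).
nontrivial = [z for z in np.linalg.eigvals(P1 @ P2)
              if not (np.isclose(z, 0) or np.isclose(z, 1))]
print("non-trivial eigenvalues:", nontrivial)

A = P1 @ P2
M = A.copy()
diffs = []
for _ in range(60):                                # track ||(P1 P2)^{n+1} - (P1 P2)^n||
    M_next = A @ M
    diffs.append(np.linalg.norm(M_next - M))
    M = M_next
print("moduli:", [abs(z) for z in nontrivial], "last differences:", diffs[-3:])
```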
Remark 2. We conclude this section with a few observations concerning the validity of
the above characterisation of convergence of the iterates.
1. Using the behaviour of the cross-ratio with respect to permutations of its arguments stated in Lemma 2, we see that also in the case of two linear projections P1, P2 on R3 with one-dimensional ranges there is at most one non-trivial eigenvalue and that, if it exists, it satisfies
\[
|\lambda| = \frac{\delta(R_1, N_2)\,\delta(R_2, N_1)}{\delta(R_1, N_1)\,\delta(R_2, N_2)}.
\]
In this case, the iterates (P1 P2)^n converge if and only if
\[
\frac{\delta(R_1, N_2)\,\delta(R_2, N_1)}{\delta(R_1, N_1)\,\delta(R_2, N_2)} < 1.
\]
This shows that also in this case, convergence of the iterates is determined by the
angles between the ranges and the kernels.
2. By repeating the above arguments, it is possible to obtain the same characterisation for pairs of projections on a Hilbert space in which either both projections have one-dimensional ranges or both have one-dimensional kernels.
3. The “mixed case”, that is, the case where one projection has a one-dimensional
range and the other projection has a one-dimensional kernel, is more complicated.
Let P1 and P2 be two projections on R3 and assume that P1 has a one-dimensional
range and P2 has a two-dimensional one. Using the results of this section, we may
conclude that there is a unique non-trivial eigenvalue µ of P1 P2. If condition (3) holds and the cross-ratio [im P1, ker P2, im P2, ker P1] is positive, then we have convergence of the iterates (P1 P2)^n. In order to show that condition (3) is not enough, we consider the following example. Consider the vectors
\[
w_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix},\quad
w_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix},\quad
w_3 = \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix},\quad
w_4 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}
\quad\text{and}\quad
w_4' = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},
\]
for which the associated subspaces satisfy
\[
\tfrac{1}{2} = [V_1, V_2, V_3, V_4] = -[V_1, V_2, V_3, V_4'].
\]
Using the above arguments, we see that the sequence ((P(S1, S4)P(S3, S2))^n)_{n=1}^∞ converges, whereas ((P(S1, S4′)P(S3, S2))^n)_{n=1}^∞ does not. Since all the principal
angles are the same in both cases, they do not determine the convergence behaviour
on their own.
For the construction of the following example, we need a simple observation on operator matrices. For two operators T1 : H1 → H1 and T2 : H2 → H2 on Hilbert spaces H1 and H2, we denote the operator matrix
\[
\begin{pmatrix} T_1 & 0 \\ 0 & T_2 \end{pmatrix} : H_1 \oplus H_2 \to H_1 \oplus H_2, \qquad (h_1, h_2) \mapsto (T_1(h_1), T_2(h_2)),
\]
by T1 ⊕ T2.
We combine these projections by setting
\[
P_1 := P_1^1 \oplus P_1^2 \qquad\text{and}\qquad P_{2,s} := P_{2,s}^1 \oplus P_{2,s}^2
\]
for s = ±1. Since the principal angles between direct sums of subspaces are just the
combination of the principal angles between the individual spaces, the principal angles
between the ranges and kernels of P1 and P2,−1 are the same as the ones between the
ranges and kernels of P1 and P2,1. On the other hand, we have ρ(P1 P2,1) = 0 and ρ(P1 P2,−1) = 2; that is, although the principal angles agree, the convergence behaviour is vastly different.
In infinite dimensional Banach spaces, even the question of whether two projections
P1 and P2 are consistent, that is, if there is a projection P12 onto the intersection of
the ranges of P1 and P2 such that P12 P1 = P12 P2 = P12 , is of interest. Note that
there are complemented subspaces with the property that their intersection is no longer
complemented. In other words, it might happen not only that there is no projection satisfying the above condition, but that there is no bounded projection onto the intersection at all.
On the positive side, we can mention the following result of R. E. Bruck and S. Reich:
Proposition 3 (Theorem 2.1 in [5, p. 464]). Let X be a uniformly convex space and
let P1 , . . . , Pk be linear norm-one projections onto subspaces Y1 , . . . , Yk . Then the strong
limit limn→∞ (Pk Pk−1 · · · P1)n x exists for each x ∈ X and defines a norm-one projection
onto the intersection Y1 ∩ . . . ∩ Yk .
Using this proposition together with the uniqueness of norm-one projections in smooth
spaces, we obtain the following simple result:
Proposition 4. Let X be a uniformly convex and smooth Banach space and let P1 and
P2 be two norm-one projections in X. Then these projections are consistent, that is, there
is a projection P12 onto the intersection of the ranges of P1 and P2 with the property that
P12 P1 = P12 P2 = P12 .
Proof. By Proposition 3, the strong limits P12 x := limn→∞ (P1 P2)n x and P12′ x := limn→∞ (P2 P1)n x exist and define norm-one projections onto im P1 ∩ im P2. Moreover,
\[
P_{12} P_1 x = \lim_{n\to\infty} (P_1 P_2)^n P_1 x = \lim_{n\to\infty} P_1 (P_2 P_1)^n x = P_1 P_{12}' x = P_{12}' x
\]
and
\[
P_{12} P_2 x = \lim_{n\to\infty} (P_1 P_2)^n P_2 x = \lim_{n\to\infty} (P_1 P_2)^{n-1} P_1 P_2^2 x = \lim_{n\to\infty} (P_1 P_2)^n x = P_{12} x
\]
for all x ∈ X. Since in smooth Banach spaces norm-one projections onto a given subspace are unique (see, for example, Theorem 6 in [6, p. 356]), we necessarily have P12 = P12′, and this projection has the required properties.
Note that we cannot drop the assumption that X is smooth. This can be seen in the
following four-dimensional example.
Example 3. We consider the space R4 equipped with the norm
\[
\|(x, y, z, w)\| = \sqrt{x^2 + y^2 + z^2 + w^2} + |x| + |y| + |z| + |w|,
\]
which turns it into a uniformly convex but non-smooth space. Consider the projections
\[
P_1(x, y, z, w) = \Bigl(x,\; y - \frac{z}{4},\; 0,\; 0\Bigr) \qquad\text{and}\qquad P_2(x, y, z, w) = (0, y, 0, w),
\]
which, by
\[
\|P_1(x, y, z, w)\| = \sqrt{x^2 + \Bigl(y - \frac{z}{4}\Bigr)^2} + |x| + \Bigl|y - \frac{z}{4}\Bigr|
\le \sqrt{x^2 + y^2} + \frac{|z|}{4} + |x| + |y| + \frac{|z|}{4}
\le \sqrt{x^2 + y^2 + z^2 + w^2} + |x| + |y| + \frac{|z|}{2} + |w| \le \|(x, y, z, w)\|,
\]
are both norm-one projections. Moreover,
\[
(P_1 P_2)(x, y, z, w) = (0, y, 0, 0) \qquad\text{and}\qquad (P_2 P_1)(x, y, z, w) = \Bigl(0,\; y - \frac{z}{4},\; 0,\; 0\Bigr)
\]
and since both elements already lie in the intersection of the ranges, the sequence of iterates is constant in both cases. Hence we get limn→∞ (P1 P2)^n ≠ limn→∞ (P2 P1)^n.
Moreover, note that neither one of these projections has the properties required in the
definition of the Oppenheim angle.
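A quick numerical sanity check of Example 3 (added here; it merely samples the norm inequality and evaluates the two products):

```python
import numpy as np

def norm(v):
    """The uniformly convex but non-smooth norm of Example 3."""
    return np.linalg.norm(v, 2) + np.linalg.norm(v, 1)

def P1(v):
    x, y, z, w = v
    return np.array([x, y - z / 4, 0.0, 0.0])

def P2(v):
    x, y, z, w = v
    return np.array([0.0, y, 0.0, w])

rng = np.random.default_rng(6)
samples = rng.standard_normal((1000, 4))

# Both projections satisfy ||P_k v|| <= ||v|| on a random sample (they are norm-one projections).
print(all(norm(P1(v)) <= norm(v) + 1e-12 for v in samples))
print(all(norm(P2(v)) <= norm(v) + 1e-12 for v in samples))

# The two products are already their own (constant) limits, and they differ:
v = np.array([1.0, 1.0, 2.0, 3.0])
print(P1(P2(v)), P2(P1(v)))      # (0, 1, 0, 0) versus (0, 0.5, 0, 0)
```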
Acknowledgments. The third author was partially supported by the Israel Science
Foundation (Grants No. 389/12 and 820/17), the Fund for the Promotion of Research
at the Technion and by the Technion General Research Fund.
References
[1] I. Amemiya and T. Andô. Convergence of random products of contractions in Hilbert
space. Acta Sci. Math. (Szeged) 26 (1965), pp. 239–244.
[2] C. Badea and Y. I. Lyubich. Geometric, spectral and asymptotic properties of aver-
aged products of projections in Banach spaces. Studia Math. 201.1 (2010), pp. 21–35.
[3] C. Badea and D. Seifert. Quantified asymptotic behaviour of Banach space operators
and applications to iterative projection methods. Pure Appl. Funct. Anal. 2.4 (2017),
pp. 585–598.
[4] M. Berger. Geometry I. Universitext. Translated from the 1977 French original by
M. Cole and S. Levy, Fourth printing of the 1987 English translation [MR0882541].
Springer-Verlag, Berlin, 2009, pp. xiv+428.
[5] R. E. Bruck and S. Reich. Nonexpansive projections and resolvents of accretive op-
erators in Banach spaces. Houston J. Math. 3.4 (1977), pp. 459–470.
[6] H. B. Cohen and F. E. Sullivan. Projecting onto cycles in smooth, reflexive Banach
spaces. Pacific J. Math. 34 (1970), pp. 355–364.
[7] O. Darwin et al. Non-optimality of the greedy algorithm for subspace orderings in
the method of alternating projections. Results Math. 72.1-2 (2017), pp. 979–990.
[8] I. Halperin. The product of projection operators. Acta Sci. Math. (Szeged) 23 (1962),
pp. 96–99.
[9] S. Kayalar and H. L. Weinert. Error bounds for the method of alternating projections.
Math. Control Signals Systems 1.1 (1988), pp. 43–59.
[12] E. Kopecká and V. Müller. A product of three projections. Studia Math. 223 (2014),
pp. 175–186.
[13] E. Kopecká and A. Paszkiewicz. Strange products of projections. Israel J. Math. 219.1
(2017), pp. 271–286.
[14] E. Kopecká and S. Reich. A note on the von Neumann alternating projections al-
gorithm. J. Nonlinear Convex Anal. 5.3 (2004), pp. 379–386.
[16] C. Meyer. Matrix analysis and applied linear algebra. With 1 CD-ROM (Windows,
Macintosh and UNIX) and a solutions manual (iv+171 pp.) Society for Industrial
and Applied Mathematics (SIAM), Philadelphia, PA, 2000, pp. xii+718.
[17] J. von Neumann. On rings of operators. Reduction theory. Ann. of Math. (2) 50
(1949), pp. 401–485.
[18] I. Oppenheim. Angle criteria for uniform convergence of averaged projections and
cyclic or random products of projections. Israel J. Math. 223.1 (2018), pp. 343–362.
[20] W. Stiles. Closest-point maps and their products. Nieuw Arch. Wisk. (3) 13 (1965),
pp. 19–29.
[21] A. Vladimirov, L. Elsner and W.-J. Beyn. Stability and paracontractivity of discrete
linear inclusions. Linear Algebra Appl. 312.1-3 (2000), pp. 125–134.