Generalized Polar Decompositions and The Approximation of The Matrix Exponential

Download as pdf or txt
Download as pdf or txt
You are on page 1of 38

Generalized Polar Decompositions

and the approximation of the


Matrix Exponential

1/30

Yet another method for computing the matrix exponential!!!

Antonella Zanna
University of Bergen, Norway
email: [email protected]
http://www.ii.uib.no/anto

JJ
II
J
I
Back
Close

Overview
Motivations

2/30

Integration methods using exponentials in GI need fast algorithms that approximate the matrix
exponential
Exact computation is not an issue but it is crucial that the exponential approximation is in
the Lie group.

JJ
II
J
I
Back
Close

Overview
Motivations

2/30

Integration methods using exponentials in GI need fast algorithms that approximate the matrix
exponential
Exact computation is not an issue but it is crucial that the exponential approximation is in
the Lie group.

Generalized polar decompositions:


The classical polar decomposition
Generalized polar decompositions
Some examples of applications
Work in collaboration with R. Quispel and H. Munthe-Kaas

JJ
II
J
I
Back
Close

Overview
Motivations

2/30

Integration methods using exponentials in GI need fast algorithms that approximate the matrix
exponential
Exact computation is not an issue but it is crucial that the exponential approximation is in
the Lie group.

Generalized polar decompositions:


The classical polar decomposition
Generalized polar decompositions
Some examples of applications
Work in collaboration with R. Quispel and H. Munthe-Kaas

Application to the computation of the matrix exponential


The plain-vanilla approach

JJ
II
J
I
Back
Close

Fast methods
Work in collaboration with H. Munthe-Kaas and A. Iserles

3/30

JJ
II
J
I
Back
Close

The polar decompositions


In matrix theory, it is well known that every matrix A can be decomposed as

4/30

A = HU,
where H is symmetric positive (semi-) definite, and U is orthogonal (U U T = I).
Moreover, such decomposition if unique provided that det U = 1.

JJ
II
J
I
Back
Close

The polar decompositions


In matrix theory, it is well known that every matrix A can be decomposed as

4/30

A = HU,
where H is symmetric positive (semi-) definite, and U is orthogonal (U U T = I).
Moreover, such decomposition if unique provided that det U = 1.
Let us consider the map
(z) = z T
on the group G = GL(n) of invertible matrices.
(xy) = (x)(y) for all possible invertible matrices x, y ( is an endomorphism of G)
is invertible and 1 = , or, equivalently, 2 = id (involution).
these two properties together imply that is an involutive automorphims of G.
The factors H and U of A possess the property
(U ) = U
(H) = H 1
in other words, U and H are fixed and anti-fixed points of an involutive automorphism on the
group of invertible matrices.

JJ
II
J
I
Back
Close

Generalized polar decompositions


What we need:

5/30

A Lie group G
An involutive automorphism

JJ
II
J
I
Back
Close

Generalized polar decompositions


What we need:

5/30

A Lie group G
An involutive automorphism
What we can do with them:
We can factorize
z = xy

GPD

where
(x) = x1
and
(y) = y
where x, y are appropriate group elements which are determined by z and .

JJ
II
J
I
Back
Close

Generalized polar decompositions


What we need:

5/30

A Lie group G
An involutive automorphism
What we can do with them:
We can factorize
z = xy

GPD

where
(x) = x1
and
(y) = y
where x, y are appropriate group elements which are determined by z and .
In particular, if z = exp(tZ), it is true that
exp(tZ) = exp(X(t)) exp(Y (t))
and for t sufficiently small, the functions X(t) and Y (t) are uniquely determined.

JJ
II
J
I
Back
Close

Properties of the decomposition: at the group level


z = exp(tZ) = exp(X(t)) exp(Y (t)) = xy

GPD of z

with (x) = x1 , (y) = y.


6/30

Consider the sets


G = {z G : (z) = z}
G = {z G : (z) = z 1 }

fixed points of
anti-fixed points of

G has the structure of a group:


z1 , z2 G

z1 z2 G ,

z11 G

G has the structure of a symmetric space,


z1 , z2 G

z1 ? z2 = z1 z21 z1 G .

JJ
II
J
I
Back
Close

At the algebra level. . .


Assume z = exp(tZ), where Z g, the Lie-algebra of G. The group automorphism induces a
Lie-algebra map
d
d(Z) =
Zg
(exp(tZ)),
dt t=0
which is also an involutive automorphism since
d([A, B]) = [ d(A), d(B)],

A, B g,

7/30

d 2 = id,

We denote
k = {Z g : d(Z) = Z}
p = {Z g : d(Z) = Z}

subalgebra of g
Lie triple system.

The subspaces p and k obey important inclusion properties:


[k , k ] k ,
[k , p ] p ,
[p , p ] k ,

JJ
II
J
I
Back
Close

At the algebra level. . .


Assume z = exp(tZ), where Z g, the Lie-algebra of G. The group automorphism induces a
Lie-algebra map
d
d(Z) =
Zg
(exp(tZ)),
dt t=0
which is also an involutive automorphism since
d([A, B]) = [ d(A), d(B)],

A, B g,

7/30

d 2 = id,

We denote
k = {Z g : d(Z) = Z}
p = {Z g : d(Z) = Z}

subalgebra of g
Lie triple system.

The subspaces p and k obey important inclusion properties:


[k , k ] k ,
[k , p ] p ,
[p , p ] k ,

(+1) (+1) = (+1)


(+1) (1) = (1)
(1) (1) = (+1).

JJ
II
J
I
Back
Close

It is true that
g = p k,

in other words, every Z g can be uniquely written as


Z = P + K,

d(P ) = P,

d(K) = K,
8/30

where

1
P = (Z d(Z)),
2

1
K = (Z + d(Z)).
2

JJ
II
J
I
Back
Close

It is true that
g = p k,

in other words, every Z g can be uniquely written as


Z = P + K,

d(P ) = P,

d(K) = K,
8/30

where

1
P = (Z d(Z)),
2

1
K = (Z + d(Z)).
2

In summary:
Group level

Algebra level

G := {z G : (z) = z}, subgrp

k = {Z g : d(Z) = Z}, subalg.

G := {z G : (z) = z 1 }, symm. sp.

p = {Z g : d(Z) = Z}, LTS

G = G G

g=pk

z = xy

Z =P +K
P
i
X(t) =
Xi p
i=1 Xi t p,
P
i
Y (t) =
Yi k
i=1 Yi t k,

x = exp(X(t)) G
y = exp(Y (t)) G

JJ
II
J
I
Back
Close

The decomposition
Z =P +K
completely determines the functions X(t) and Y (t):
X = P t 12 [P, K]t2 16 [K, [P, K]]t3


1
1
+ 24
[P, [P, [P, K]]] 24
[K, [K, [P, K]]] t4

7
1
+ 360
[K, [P, [P, [P, K]]]] 120
[K, [K, [K, [P, K]]]]

9/30

1
[[P, K], [P, [P, K]]]
180

t5

+ O(t6 ),
Y

= Kt

1
[P, [P, K]]t3
12

1
[P, [P, [P, [P, K]]]]
120

1
[K, [K, [P, [P, K]]]]
720

1
[[P, K], [K, [P, K]]]
240

t5 + O(t7 ).

In general, all the terms in the expansion of X(t) and Y (t) are obtained by explicit recurrence
relations in terms of P and K.

JJ
II
J
I
Back
Close

Some examples and applications


Factorization of matrices with factors having specific properties

10/30

Factorization of flows of vector fields


Coordinates on Lie groups
Approximation of the exponential

JJ
II
J
I
Back
Close

Matrix factorizations
All three classical polar decompositions z = xy of complex matrices can be obtained by this
approach,
1 (z) = z T
x s.p.d, y complex orth.

2 (z) = z
x Hermitian p.d., y unitary
3 (z) = z
x coinvolutory, y real.

11/30

JJ
II
J
I
Back
Close

Matrix factorizations
All three classical polar decompositions z = xy of complex matrices can be obtained by this
approach,
1 (z) = z T
x s.p.d, y complex orth.

2 (z) = z
x Hermitian p.d., y unitary
3 (z) = z
x coinvolutory, y real.

11/30

For any involutory matrix S G (S 2 = I), the map


(z) = SzS
is an involutory automorphism, hence induces a GPD. Moreover
d(Z) = SZS.
As an example, assume

0
..
. 0
S=
0 1
1 0

that S is the flip matrix,

0 1
zn,n
zn,n1 zn,1

zn1,n zn1,n1
zn1,1
1 0

(z)
=
SZS
=

.
..
...
...
.
..

.
0 ..
z1,n
z1,n1 z1,1
0

by the GPD, z = xy where SxS = x1 (x centro-orthogonal), SyS = y (y centro-symmetric). At


the algebra level, SP S = P centro-skew, SKS = K, centro-symmetric.

JJ
II
J
I
Back
Close

Factorization of vector fields


Given a vector field and a symmetry S, factorize the flow into the composition of flows that have
S as a symmetry and S as a reversing symmetry. . .
6

-2

-2

-4

-4

-6
-6

-4

-2
0
2
4
F = (1-u)v u + (2u-3v) v

-6
-6

-2

-2

-4

-4

-6
-6

-4

-2
0
2
X P - 1/2 [P,K]

-6
-6

12/30

-4

-2

0
XP

-4
-2
0
2
4
X P- 1/2 [P,K] - 1/6 [K,[P,K]]

JJ
II
J
I
Back
Close

-2

-2

-4

-4

-6
-6

-4

-2

0
YK

-6
-6

13/30

-4

-2
0
2
Y K - 1/12 [P,[P,K]]

and so on . . .

JJ
II
J
I
Back
Close

Approximation of the matrix exponential


Recall that by GPD:

14/30

exp(tZ) = exp(X(t)) exp(Y (t)),


where
X = P t 12 [P, K]t2 16 [K, [P, K]]t3


1
1
+ 24
[P, [P, [P, K]]] 24
[K, [K, [P, K]]] t4

7
1
+ 360
[K, [P, [P, [P, K]]]] 120
[K, [K, [K, [P, K]]]]

1
[[P, K], [P, [P, K]]]
180

t5

+ O(t6 ),
Y

= Kt

1
[P, [P, K]]t3
12

1
[P, [P, [P, [P, K]]]]
120

1
[K, [K, [P, [P, K]]]]
720

and

1
Z = P + K,
P = (Z d(Z)),
2
Defines a splitting method that approximates exp(tZ):
Choose an appropriate .

1
[[P, K], [K, [P, K]]]
240

t5 + O(t7 ).

1
K = (Z + d(Z)).
2

JJ
II
J
I
Back
Close

Split Z.
Truncate the expansion to desired order.
Compute the exponential p-part (eventually k-part)
In general, we iterate the procedure on the reduced space until we get a space of low dimension.
At the end,
exp(tZ) exp(X [1] ) exp(X [2] ) exp(X [m] ) exp(Y [m] ).

15/30

JJ
II
J
I
Back
Close

What are good choices of ?


The splitted factors should be easy to compute.
Commutators should have a low complexity.
Exponential of splitted parts should be easy to compute (either approximately or preferably
exactly).

16/30

A good choice is splitting in matrices of low rank, for instance, borderded matrices: take

1 0 0
. . . ..

1
.
0
d(Z) = SZS,
S= . .
,
.
.. .. 0
..
0 0 1
then

z
2,1
P = ..
.
zn,1

z1,2 z1,n
0 0

.. . .
.. ,
.
.
.
0 0

K=

z1,1 0
0 z2,2
..
..
.
.
0 zn,2

0
z2,n
..
...
.
zn,n

Such automorphisms work for GL(n), SL(n), SO(n).


Note that the commutators appearing in the expansion can be computed in O(n2 ) computations
(n3 if the procedure is iterated for matrices of decreasing dimension)

JJ
II
J
I
Back
Close

An EulerRodrigues like formula for bordered matrices


The exponential of bordered matrices can be computed exactly by means of a formula similar to
the EulerRodrigues formula for computing the exponential of a 3 3 skew-symmetric matrices.
Assume that A p is of the form
17/30


A=
Then,

O
bT

a
0


,

a, b Rn .


2
sinh
1 sinh(/2)

A2 ,
I
+
A
+

2
/2

I + A + 12 A2 ,
exp(A) =


2

I + sin A + 1 sin(/2) A2 ,

2
/2

where
2

A =

abT
0T

0
2

if aT b > 0, =

aT b,

if aT b = 0,
if aT b < 0, =

aT b


.

Note that A2 is never computed explicitely but always applied to a vector/matrix, hence exp(A)
amounts to O(n2 ) computations. If only the last p elements of a and b are nonzero, then the
complexity is O(p2 ).

JJ
II
J
I
Back
Close

Complexity of the algorithms:


Order
2
splitting
assembly exp

sl(n), so(p, q)

vector
1 13 n3
3n2

matrix
1 13 n3
2n3

vector
2 3
n
3
3n2

matrix
2 3
n
3
2n3

total

1 13 n3

3 13 n3

2 3
n
3

2 23 n3

sl(n), so(p, q)

so(n)

Order
3(4)
splitting
assembly exp

vector
5(7)n3
3n2

matrix
5(7)n3
2n3

vector
2 12 (4)n3
3n2

matrix
2 12 (4)n3
2n3

total

5(7)n3

7(9)n3

2 12 (4)n3

4 12 (6)n3

18/30

so(n)

These algorithms have a complexity that is comparable with other classical algorithms, like for
instance diagonal Pade approximants.
Feasible algorithms are up to order 4, because for higher order the complexity becomes larger
(although still O(n3 )).

JJ
II
J
I
Back
Close

Faster algorithms
The main difference with the approach presented before is that the matrix Z is preprocessed and
reduced to a sparse form stable under commutation, which is

19/30

tridiagonal (for symmetric and skew-symmetric matrices),


upper Hessenberg (for matrices in sl(n)),
butterfly form (for symplectic matrices).

Again, we split rows and columns and start computing commutators.


In the following example, we consider a skew-symmetric tridiagonal matrix.
red for the p-part, blue for the k-part
updated elements are denoted with dots instead of crosses

JJ
II
J
I
Back
Close

Order 1:
0
2

20/30

4
6
8
10
12
14

JJ
II
J
I

16
18
20

Back
0

10
12
nz = 36

14

16

18

20

Close

Order 2:
0
2

21/30

4
6
8
10
12
14

JJ
II
J
I

16
18
20

Back
0

10
12
nz = 36

14

16

18

20

Close

Order 3:
0
2

22/30

4
6
8
10
12
14

JJ
II
J
I

16
18
20

Back
0

10
12
nz = 2

14

16

18

20

Close

Order 4:
0
2

23/30

4
6
8
10
12
14

JJ
II
J
I

16
18
20

Back
0

10
12
nz = 2

14

16

18

20

Close

Order 5:
0
2

24/30

4
6
8
10
12
14

JJ
II
J
I

16
18
20

Back
0

10
12
nz = 4

14

16

18

20

Close

Main observations:
Each extra order fills in two symmetric elements in the p-part.
The fill-in in the k-part starts only at order 5.
As long as the matrices P, K are tridiagonal, the commutators cost O(1).
25/30

The ugly and the bad fill-in


The fill-in in the p-part is ugly but not harmful: once once the p-term is computed up to
desired order, one needs only compute the exponential.
The fill-in in the k-part is much more dangerous: if not taken care of, it propagates and we
lose the whole benefits of our tridiagonalization/reduction to Hessenberg
0

10

10

10

12

12

12

14

14

14

16

16

16

18

18

18

20

20
0

10

12

14

16

18

20

20
0

10

12

14

16

18

20

10

12

14

16

18

20

Therefore the fill-in elements in the k part must be annihilated by, for instance, Givens rotations
(O(1) computations)

JJ
II
J
I
Back
Close

Matrices in GL(n), SL(n)


Order 5 terms for matrices in Hessenberg form:
X

10

10

0
1

26/30

2
3
4

10

X4

10

X5

5
6

10

10
0

10

10

10

11

10

11

JJ
II
J
I
Back
Close

Symplectic matrices
Order 5 terms for matrices in butterfly form:
0

10

10

12

12

14

14

16

16

18

18

20

20
0

10
12
nz = 56

14

16

18

20

27/30

10
12
nz = 4

14

16

18

20

JJ
II
J
I
Back
Close

Reduction to tridiagonal/Hessenberg/
butterfly form

28/30

For tridiagonalization/Hessenberg use Householder reflections:


H = I vvT ,

2
kvk2

Then,
T
HZH = Z vvT Z ZvvT + 2 vv
ZvvT}
| {z

0 if Z is skew

Cost:
12 n3 for skew-symmetric matrices
43 n3 for symmetric matrices

10 3
n
3

for arbitrary matrices

Reduction to butterfly form is done by means of symplectic transformations:


symplectic Givens/Householder (orthosymplectic)
symplectic Gauss transformations
as proposed by Fabender, Benner, Watkins (at the group level, for QR-like iterations).

JJ
II
J
I
Back
Close

The costs
For symmetric/skew-symmetric matrices the main cost is the tridiagonalization, amounting to
4 3 1 3
n , 2 n plus O(n2 ) operations.
3
For arbitrary matrices, we have 10
n3 operations coming from the reduction to Hessenberg, but
3
also the computation of commutators, exponentials costs some O(n3 ).
Order
2
Hessenberg
splitting
assembly exp
total
Order
4
Hessenberg
splitting
assembly exp
total

ZMK
vector matrix
1 3
4 3
13n
n
3
2
3n
2n3

vector
10 3
n
3
1 3
n
3
1 2
n
2

matrix
10 3
n
3
1 3
n
3
1 3
n
2

Order
3
Hessenberg
splitting
assembly exp

1 13 n3

3 13 n3

5 13 n3

total

7 3
n
3

ZMK
vector matrix
7n3
7n3
3n2
2n3
7n3

9n3

IZ

ZMK
vector matrix
3
5n
5n3
3n2
2n3
5n3

7n3

29/30

IZ
vector
10 3
n
3
2 3
n
3
1 2
n
2

matrix
10 3
n
3
2 3
n
3
1 3
n
2

4n3

4 12 n3

IZ
vector
10 3
n
3
n3
1 2
n
2

matrix
10 3
n
3
n3
1 3
n
2

4 13 n3

4 56 n3

JJ
II
J
I
Back
Close

Some concluding remarks


What we got so far . . .

30/30

Methods that stay on the correct Lie group


They are cheap
We can tackle most groups
High order is easily achievable
Easy scaling and squaring

Open issues
A divide and conquer approach is it suitable for large problems on parallel machines?
Stiff problems? everything O.K. on SO(n), but what about other problems?
Different choices of automorphisms that lead to other splittings

JJ
II
J
I
Back
Close

You might also like