
Linear Algebra and its Applications 320 (2000) 193–198
www.elsevier.com/locate/laa

Symmetric centrosymmetric matrix–vector multiplication

A. Melman¹
Department of Mathematics, University of San Francisco, San Francisco, CA 94117-1080, USA
E-mail address: [email protected] (A. Melman)
¹ On leave from Ben-Gurion University, Beer-Sheva, Israel.

Received 15 April 1999; accepted 15 July 2000
Submitted by R.E. Hartwig

Abstract

We present a method for the multiplication of an arbitrary vector by a symmetric centrosymmetric matrix, requiring $\frac{5}{4}n^2 + O(n)$ floating-point operations, rather than the $2n^2$ operations needed in the case of an arbitrary matrix. Combining this method with Trench's algorithm for Toeplitz matrix inversion yields a method for solving Toeplitz systems with the same complexity as Levinson's algorithm. © 2000 Elsevier Science Inc. All rights reserved.

AMS classification: 65F05

Keywords: Centrosymmetric matrix; Matrix–vector multiplication; Toeplitz system; Trench's algorithm; Levinson's algorithm

1. Introduction

The multiplication of an arbitrary vector by a matrix on a sequential machine requires, in general, $2n^2$ floating-point operations (following [5], we define a floating-point operation, or flop, as either an addition/subtraction or a multiplication/division). This remains true even for a symmetric matrix. However, if, in addition to being symmetric, the matrix is also persymmetric (symmetric with respect to the southwest–northeast diagonal), then the complexity can be reduced significantly. Matrices that are both symmetric and persymmetric are called symmetric centrosymmetric. In this paper, we present a method for multiplying an arbitrary vector by such a matrix with $\frac{5}{4}n^2 + O(n)$ flops. Additional multiplications by the same matrix cost only $n^2 + O(n)$ flops per multiplication. We also briefly consider an application of this method by combining it with Trench's algorithm [5,7] for Toeplitz matrix inversion to produce a method for solving positive-definite symmetric Toeplitz systems that has the same complexity as Levinson's algorithm [5,6].

The paper is organized as follows: Section 2 is devoted to a few basic definitions and results; in Section 3, we present our method and discuss some of its implications; in Section 4, we briefly show an application to Toeplitz systems.

2. Preliminaries

A matrix $M \in \mathbb{R}^{n \times n}$ is said to be persymmetric if it is symmetric about its southwest–northeast diagonal. This is the same as requiring that $JM^TJ = M$, where $J$ is the exchange matrix, which has ones on its southwest–northeast diagonal and zeros everywhere else. It is easy to see that the inverse of a persymmetric matrix is also persymmetric. A matrix $Q \in \mathbb{R}^{n \times n}$ is said to be centrosymmetric if it satisfies $JQJ = Q$; its inverse is then also centrosymmetric. A matrix which is both symmetric and persymmetric satisfies $JQJ = JQ^TJ = Q$, i.e., it is centrosymmetric, and is therefore called symmetric centrosymmetric, or doubly symmetric. An even vector $v$ is defined as a vector satisfying $Jv = v$, and an odd vector $w$ as one satisfying $Jw = -w$.

A symmetric matrix $T \in \mathbb{R}^{n \times n}$ is said to be Toeplitz if its elements $T_{ij}$ satisfy $T_{ij} = \rho_{|j-i|}$, where $\{\rho_j\}_{j=0}^{n-1}$ are the components of a vector $(\rho_0, r^T)^T \in \mathbb{R}^n$, with $r = (\rho_1, \dots, \rho_{n-1})^T \in \mathbb{R}^{n-1}$, so that

$$T = \begin{pmatrix}
\rho_0 & \rho_1 & \rho_2 & \cdots & \rho_{n-1} \\
\rho_1 & \rho_0 & \rho_1 & \cdots & \rho_{n-2} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
\rho_{n-2} & \rho_{n-3} & \rho_{n-4} & \cdots & \rho_1 \\
\rho_{n-1} & \rho_{n-2} & \rho_{n-3} & \cdots & \rho_0
\end{pmatrix}.$$

The matrix $T$ is symmetric centrosymmetric and so is its inverse. Such matrices appear in many applications and we refer to [2] for a good overview. We denote the exchange matrix by $J$ without specifically indicating its dimension, which is assumed to be clear from the context.
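These definitions are easy to exercise numerically. The sketch below (NumPy/SciPy; the helper names are ours, for illustration only) builds the exchange matrix $J$, checks the persymmetry and centrosymmetry identities, and constructs a symmetric Toeplitz matrix with scipy.linalg.toeplitz.

import numpy as np
from scipy.linalg import toeplitz

def exchange_matrix(n):
    # J has ones on the southwest-northeast (anti-)diagonal, zeros elsewhere.
    return np.fliplr(np.eye(n))

def is_persymmetric(M):
    J = exchange_matrix(M.shape[0])
    return np.allclose(J @ M.T @ J, M)

def is_centrosymmetric(M):
    J = exchange_matrix(M.shape[0])
    return np.allclose(J @ M @ J, M)

rho = np.array([4.0, 2.0, 1.0, 0.5])   # (rho_0, ..., rho_{n-1})
T = toeplitz(rho)                      # symmetric Toeplitz matrix built from rho
print(is_persymmetric(T), is_centrosymmetric(T))   # True True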
3. Symmetric centrosymmetric matrix–vector multiplication

Consider the multiplication $M_n b^{(n)}$, where $M_n$ is a real $n \times n$ symmetric centrosymmetric matrix and $b^{(n)}$ is a real $n$-dimensional vector. The matrix $M_n$ can be partitioned as follows:

$$M_n = \begin{pmatrix} \alpha & t^T & \beta \\ t & M_{n-2} & Jt \\ \beta & t^TJ & \alpha \end{pmatrix},$$

where $M_{n-2} \in \mathbb{R}^{(n-2) \times (n-2)}$, $t \in \mathbb{R}^{n-2}$ and $\alpha, \beta \in \mathbb{R}$. We note that $M_{n-2}$ is also symmetric centrosymmetric.

The method we are about to construct is based on this partition of $M_n$ and on the fact that $b^{(n)} = b_e^{(n)} + b_o^{(n)}$, where $b_e^{(n)} = \frac{1}{2}(b^{(n)} + Jb^{(n)})$ and $b_o^{(n)} = \frac{1}{2}(b^{(n)} - Jb^{(n)})$. The even part of $b^{(n)}$ satisfies $Jb_e^{(n)} = b_e^{(n)}$, whereas for the odd part we have $Jb_o^{(n)} = -b_o^{(n)}$. We compute $M_n b^{(n)}$ as $M_n b_e^{(n)} + M_n b_o^{(n)}$. Writing $b_e^{(n)} = (\sigma_e, (b_e^{(n-2)})^T, \sigma_e)^T$ and $b_o^{(n)} = (\sigma_o, (b_o^{(n-2)})^T, -\sigma_o)^T$, we have

$$M_n b_e^{(n)} = \begin{pmatrix} \alpha & t^T & \beta \\ t & M_{n-2} & Jt \\ \beta & t^TJ & \alpha \end{pmatrix} \begin{pmatrix} \sigma_e \\ b_e^{(n-2)} \\ \sigma_e \end{pmatrix} = \begin{pmatrix} (\alpha + \beta)\sigma_e + t^T b_e^{(n-2)} \\ \sigma_e (t + Jt) + M_{n-2} b_e^{(n-2)} \\ (\alpha + \beta)\sigma_e + t^T b_e^{(n-2)} \end{pmatrix},$$

which yields the first and last components of $M_n b_e^{(n)}$, and

$$M_n b_o^{(n)} = \begin{pmatrix} \alpha & t^T & \beta \\ t & M_{n-2} & Jt \\ \beta & t^TJ & \alpha \end{pmatrix} \begin{pmatrix} \sigma_o \\ b_o^{(n-2)} \\ -\sigma_o \end{pmatrix} = \begin{pmatrix} (\alpha - \beta)\sigma_o + t^T b_o^{(n-2)} \\ \sigma_o (t - Jt) + M_{n-2} b_o^{(n-2)} \\ -(\alpha - \beta)\sigma_o - t^T b_o^{(n-2)} \end{pmatrix},$$

which yields the first and last components of $M_n b_o^{(n)}$. We then repeat this procedure for $M_{n-2} b_e^{(n-2)}$ and $M_{n-2} b_o^{(n-2)}$, add the result to that of the previous step, and continue in this manner until the whole product is computed.
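Before stating the algorithm, here is a small numerical illustration of the decomposition that drives it (a NumPy sketch; the construction of the test matrix is ours). A centrosymmetric $M$ commutes with $J$, so it maps even vectors to even vectors and odd vectors to odd vectors; only the leading half of each partial product therefore ever needs to be formed.

import numpy as np

rng = np.random.default_rng(0)
n = 6
# A random symmetric centrosymmetric matrix: symmetrize, then average with
# the 180-degree rotation J A J, which enforces J M J = M.
A = rng.standard_normal((n, n))
A = (A + A.T) / 2
M = (A + np.rot90(A, 2)) / 2

b = rng.standard_normal(n)
b_e = (b + b[::-1]) / 2                 # even part:  J b_e =  b_e
b_o = (b - b[::-1]) / 2                 # odd part:   J b_o = -b_o
assert np.allclose(M @ b, M @ b_e + M @ b_o)
assert np.allclose(M @ b_e, (M @ b_e)[::-1])    # M b_e is even
assert np.allclose(M @ b_o, -(M @ b_o)[::-1])   # M b_o is odd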
This leads to the algorithm below, labeled "Algorithm 1", for computing $y = M_n b^{(n)}$; we write $q(i,j)$ for the $(i,j)$ entry of $M_n$. We use notation similar to that of [5], with the subscripts "e" and "o" referring to even and odd quantities, respectively. For positive numbers (as is the case here), $\lfloor\cdot\rfloor$ means "rounded down to the nearest integer". For the sake of clarity, some redundant quantities were introduced in the algorithm; these do not affect the complexity.

Algorithm 1.

m = ⌊(n+1)/2⌋
be(1:m) = (1/2)(b(1:m) + b(n:−1:n−m+1))
bo(1:m) = (1/2)(b(1:m) − b(n:−1:n−m+1))
ze(1:m) = 0
zo(1:m) = 0
if n is even
    for j = 1:m
        α = q(j,j)
        β = q(j,n−j+1)
        s = q(j,j+1:n−j)
        se(1:m−j) = s(1:m−j) + s(n−2j:−1:m−j+1)
        so(1:m−j) = s(1:m−j) − s(n−2j:−1:m−j+1)
        ze(j) = ze(j) + (α+β)be(j) + se(1:m−j)ᵀ be(j+1:m)
        ze(j+1:m) = ze(j+1:m) + be(j) se(1:m−j)
        zo(j) = zo(j) + (α−β)bo(j) + so(1:m−j)ᵀ bo(j+1:m)
        zo(j+1:m) = zo(j+1:m) + bo(j) so(1:m−j)
    end
else (n is odd)
    for j = 1:m−1
        α = q(j,j)
        β = q(j,n−j+1)
        s = q(j,j+1:n−j)
        se(1:m−j) = s(1:m−j) + s(n−2j:−1:m−j)
        so(1:m−j) = s(1:m−j) − s(n−2j:−1:m−j)
        ze(j) = ze(j) + (α+β)be(j) + se(1:m−j)ᵀ be(j+1:m) − s(m−j)be(m)
        ze(j+1:m) = ze(j+1:m) + be(j) se(1:m−j)
        zo(j) = zo(j) + (α−β)bo(j) + so(1:m−j)ᵀ bo(j+1:m)
        zo(j+1:m) = zo(j+1:m) + bo(j) so(1:m−j)
    end
    ze(m) = ze(m) + q(m,m)be(m)
end
y(1:m) = ze(1:m) + zo(1:m)
if n is even
    y(m+1:n) = ze(m:−1:1) − zo(m:−1:1)
else (n is odd)
    y(m+1:n) = ze(m−1:−1:1) − zo(m−1:−1:1)
end

In the odd case, the term $-s(m-j)b_e(m)$ compensates for the middle column, which is counted twice in the folded vector $s_e$, and the assignment after the loop accounts for the central element $q(m,m)$.

We stress that this is a conceptual algorithm only, as the final implementation depends on the architecture of the machine on which it is supposed to run. The algorithm could also be executed backwards, starting with $M_1 b^{(1)}$ for $n$ odd, or $M_2 b^{(2)}$ for $n$ even. Furthermore, the computations of the even and odd parts of $y$ are independent of each other and could be processed in parallel. At each step of Algorithm 1, two components of each of $M_n b_e^{(n)}$ and $M_n b_o^{(n)}$ are obtained, from the outer components down to the middle ones as the algorithm proceeds from step to step.

Let us now have a look at the number of flops executed by Algorithm 1. We first consider $M_n b_e^{(n)}$ when $n$ is even. At the stage where $M_j b_e^{(j)}$ ($j < n$) is carried out, we have

$$M_j b_e^{(j)} = \begin{pmatrix} \gamma & s^T & \delta \\ s & M_{j-2} & Js \\ \delta & s^TJ & \gamma \end{pmatrix} \begin{pmatrix} \rho_e \\ b_e^{(j-2)} \\ \rho_e \end{pmatrix} = \begin{pmatrix} (\gamma + \delta)\rho_e + s^T b_e^{(j-2)} \\ \rho_e (s + Js) + M_{j-2} b_e^{(j-2)} \\ (\gamma + \delta)\rho_e + s^T b_e^{(j-2)} \end{pmatrix},$$

where $\gamma, \delta \in \mathbb{R}$ and $s \in \mathbb{R}^{j-2}$. Taking into account the symmetry of $b_e^{(j)}$ and of $(s + Js)$, we need to execute, at this stage, $j - 1$ additions and $j - 1$ multiplications. Except for $j = n$, this must then be added to the solution vector found thus far, which requires an additional $j/2$ additions. Summing over all steps, this gives

$$\bigl(n + (n-2) + \cdots + 2\bigr) - \frac{n}{2} + \left(\left(\frac{n}{2} - 1\right) + \left(\frac{n}{2} - 2\right) + \cdots + 2 + 1\right)$$

additions and

$$\bigl(n + (n-2) + \cdots + 2\bigr) - \frac{n}{2}$$

multiplications. This yields $\frac{3}{8}n^2 - \frac{1}{4}n$ additions and $\frac{1}{4}n^2$ multiplications. Analogously, the same complexity is obtained for $M_n b_o^{(n)}$, so that if we apply the matrix to an arbitrary vector $b^{(n)}$ and also take into account the cost of computing the even and odd parts of $b^{(n)}$ and of adding the even and odd parts of the solution, we obtain for the computation of $M_n b^{(n)}$ a complexity of $\frac{3}{4}n^2 + n$ additions and $\frac{1}{2}n^2 + n$ multiplications. This means that the total complexity of the method is $\frac{5}{4}n^2 + O(n)$ flops. The same complexity is obtained when $n$ is odd.

If the vectors

$$\left\{v_j + Jv_j\right\}_{j=1}^{\lfloor (n+1)/2 \rfloor} \quad\text{and}\quad \left\{v_j - Jv_j\right\}_{j=1}^{\lfloor (n+1)/2 \rfloor},$$

where the $v_j$ are the columns of $M_n$, are stored (at a cost of roughly $n^2/4$ flops), then each subsequent multiplication by an arbitrary vector costs only $n^2 + O(n)$ additional flops. No extra storage is required if the matrix $M_n$, which requires the same amount of storage, is overwritten by these data.
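For concreteness, a direct NumPy transcription of Algorithm 1 might look as follows (a sketch only: 0-based indexing, no attention to blocking or parallelism, and the function name is ours). Only the row segments between and including the two corner entries of each successive submatrix are read.

import numpy as np

def symcentro_matvec(Q, b):
    # Sketch of Algorithm 1 for a symmetric centrosymmetric Q (0-based).
    n = b.size
    m = (n + 1) // 2
    odd = (n % 2 == 1)
    be = 0.5 * (b[:m] + b[n - m:][::-1])        # leading half of the even part
    bo = 0.5 * (b[:m] - b[n - m:][::-1])        # leading half of the odd part
    ze = np.zeros(m)
    zo = np.zeros(m)
    for j in range(m - 1 if odd else m):
        alpha, beta = Q[j, j], Q[j, n - 1 - j]
        s = Q[j, j + 1 : n - 1 - j]             # row j strictly between corners
        k = m - j - 1
        se = s[:k] + s[::-1][:k]                # folded even part of the row
        so = s[:k] - s[::-1][:k]                # folded odd part of the row
        ze[j] += (alpha + beta) * be[j] + se @ be[j + 1 : m]
        if odd:
            ze[j] -= s[k - 1] * be[m - 1]       # middle column counted twice
        ze[j + 1 : m] += be[j] * se
        zo[j] += (alpha - beta) * bo[j] + so @ bo[j + 1 : m]
        zo[j + 1 : m] += bo[j] * so
    if odd:
        ze[m - 1] += Q[m - 1, m - 1] * be[m - 1]   # the 1 x 1 core
    y = np.empty(n)
    y[:m] = ze + zo
    half = ze[: m - 1] - zo[: m - 1] if odd else ze - zo
    y[m:] = half[::-1]
    return y

A quick sanity check: with M as in the previous sketch (or T from Section 2) and a random b, np.allclose(symcentro_matvec(M, b), M @ b) should hold for both even and odd n.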
4. Positive-definite Toeplitz systems of equations

In this section we briefly illustrate an application of symmetric centrosymmetric matrix–vector multiplication in the context of Toeplitz systems by showing that it leads to yet another $O(n^2)$ algorithm for such problems.

The special system of equations $Tx = -r$, with $T$ and $r$ as in Section 2 and $T$ positive-definite, is called the Yule–Walker (YW) system of equations. It frequently appears in digital signal processing problems and can be solved by Durbin's method with $2n^2 + O(n)$ flops (see [4]). Its solution is used in an efficient algorithm for the inversion of a positive-definite Toeplitz matrix, due to Trench [7]. If Durbin's algorithm is used for solving the YW equations, then Trench's algorithm requires a total of $\frac{13}{4}n^2 + O(n)$ flops. However, in [3], an algorithm ("the split Levinson method") is derived, requiring only $\frac{3}{2}n^2 + O(n)$ flops to solve the YW equations. This reduces Trench's algorithm's complexity to $\frac{11}{4}n^2 + O(n)$ flops. We refer, once again, to [5] for an overview of algorithms and references related to Toeplitz matrices. There are other algorithms for these same problems with better complexity for large matrices; we refer to, e.g., [1] and the references therein.

A classical method for solving positive-definite real symmetric Toeplitz systems of linear equations with an arbitrary right-hand side (of the form $Ty = b$), and which does not involve matrix inversion, is Levinson's algorithm [5,6]. Its complexity is $4n^2 + O(n)$ flops. An alternative to Levinson's algorithm can be obtained by first inverting the coefficient matrix using Trench's method and then applying the resulting matrix to the right-hand side, i.e., $y = T^{-1}b$. However, this would not be advantageous in terms of the number of operations if regular matrix–vector multiplication were used. On the other hand, the inverse of a symmetric Toeplitz matrix is symmetric centrosymmetric, and we can therefore, given $T^{-1}$, apply our algorithm to compute $T^{-1}b$.

Let us have a look at the complexity of such a procedure. Trench's algorithm with the split Levinson algorithm requires $\frac{11}{4}n^2 + O(n)$ flops. Multiplying the inverse matrix and the right-hand side using Algorithm 1 then adds $\frac{5}{4}n^2 + O(n)$ flops, yielding a total of $4n^2 + O(n)$ flops, which is precisely the same as for Levinson's algorithm. In fact, it may be more desirable to use our method if the system needs to be solved for more than one right-hand side, especially since each subsequent solve requires only $n^2 + O(n)$ flops, once the necessary quantities have been stored. Such an algorithm is, in general, likely to be less accurate than Levinson's algorithm, but iterative refinement of the solution could always be used, if necessary. However, these are practical considerations (as is the size of the matrix, which may give an advantage to the so-called superfast solvers in [1]) and the choice of algorithm will depend on the circumstances.
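As a toy illustration of this route (not of Trench's algorithm itself: for brevity, the sketch forms $T^{-1}$ with a generic $O(n^3)$ inverse where the text would use Trench's $O(n^2)$ method), one can check that applying the multiplication routine sketched in Section 3 to $T^{-1}$ reproduces the Toeplitz solve:

import numpy as np
from scipy.linalg import toeplitz, solve_toeplitz

n = 8
rho = 0.5 ** np.arange(n)          # rho_k = 0.5^k gives a positive-definite T
T = toeplitz(rho)
Tinv = np.linalg.inv(T)            # stand-in for Trench's O(n^2) inversion
b = np.random.default_rng(1).standard_normal(n)

# T^{-1} is symmetric centrosymmetric, so Algorithm 1 applies to it.
y = symcentro_matvec(Tinv, b)      # symcentro_matvec from the Section 3 sketch
assert np.allclose(y, solve_toeplitz(rho, b))
assert np.allclose(y, np.linalg.solve(T, b))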
References

[1] G.S. Ammar, W.B. Gragg, The generalized Schur algorithm for the superfast solution of Toeplitz systems, in: J. Gilewicz, M. Pindor, W. Siemaszko (Eds.), Rational Approximation and its Applications in Mathematics and Physics, Lecture Notes in Mathematics, vol. 1237, Springer, Berlin, 1987, pp. 315–330.
[2] J.R. Bunch, Stability of methods for solving Toeplitz systems of equations, SIAM J. Sci. Stat. Comput. 6 (1985) 349–364.
[3] P. Delsarte, Y. Genin, The split Levinson algorithm, IEEE Trans. Acoust. Speech Signal Process. ASSP-34 (1986) 470–478.
[4] J. Durbin, The fitting of time-series models, Rev. Inst. Int. Stat. 28 (1960) 233–243.
[5] G. Golub, C. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, MD, 1989.
[6] N. Levinson, The Wiener RMS (root mean square) error criterion in filter design and prediction, J. Math. Phys. 25 (1947) 261–278.
[7] W.F. Trench, An algorithm for the inversion of finite Toeplitz matrices, J. SIAM 12 (1964) 515–522.