
Tensor Decomposition and its Application

Submitted by

Ranjan Sarkar
(Roll No. 21MA40022)

Under the Supervision of


Prof. Swanand R. Khare

Department of Mathematics
Indian Institute of Technology Kharagpur
West Bengal, India - 721302

Signature of the Student Signature of the Supervisor


Date: Date:
Acknowledgement
I am heartily thankful to my supervisor, Dr. Swanand R. Khare, whose encouragement,
guidance and support enabled me to develop an understanding of this topic.
Finally, I thank the Department of Mathematics, which provided all the facilities required
for this project.

Ranjan Sarkar
Abstract
In this project, I studied tensors: what a tensor is, the various types of tensors, their
operations, and some decomposition techniques for higher-order tensors. By decomposing a
higher-dimensional tensor efficiently, we can reduce the time required to compute tensor
operations. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have
applications in psychometrics, chemometrics, signal processing, numerical linear algebra,
computer vision, numerical analysis, data mining, neuroscience, graph analysis, and elsewhere.
Keywords: Tensor, Tensor Decomposition, multiway arrays, multilinear algebra,
parallel factors (PARAFAC), canonical decomposition (CANDECOMP), data processing,
image compression, approximation, optimization, higher-dimensional data handling
Contents

1 Introduction
    1.1 What is a Tensor?
    1.2 Examples
    1.3 Subarray
    1.4 Fibers and Slices
    1.5 Norm
    1.6 Types of Tensor
        1.6.1 Cubical Tensor
        1.6.2 Symmetric Tensor
        1.6.3 Partially Symmetric Tensor
        1.6.4 Diagonal Tensor
        1.6.5 Rank-One Tensor
    1.7 Matricization (Transforming a Tensor into a Matrix)
    1.8 Python Codes

2 Tensor Product / Multiplication
    2.1 n-mode Product
    2.2 Kronecker Product of 2 Matrices
    2.3 Khatri-Rao Product
    2.4 Hadamard Product
    2.5 Python Codes for Product Operations
    2.6 Tensor Rank and the CP Decomposition
        2.6.1 Uniqueness
    2.7 Low-Rank Approximations
        2.7.1 Iterative Method

Future Work

References
Chapter 1

Introduction

1.1 What is a Tensor?


A Tensor is a multi-dimensional array, a general form of a Matrix (which is a 2-dim array). More formally,
an N-way or Nth-order tensor is an element of the tensor product of N vector spaces, each of which has
its own coordinate system.
A first-order tensor is a vector, a second-order tensor is a matrix, and tensors of order three or higher
are called higher-order tensors.

1.2 Examples
The order of a tensor is the number of dimensions, also known as ways or modes. Vectors (tensors of
order one) are denoted by boldface lowercase letters, e.g., a. Matrices (tensors of order two) are denoted
by boldface capital letters, e.g., A. Higher-order tensors (order three or higher) are denoted by boldface
Euler script letters, e.g., X. Scalars are denoted by lowercase letters, e.g., a.


A 1-dimensional array is a vector, e.g.
\[ \mathbf{a} = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix} \quad \text{(vector)}. \]
A 2-dimensional array is a matrix, e.g.
\[ \mathbf{A} = \begin{pmatrix} 3 & 3 & 9 \\ 3 & 3 & 2 \end{pmatrix} \quad (2 \times 3 \text{ matrix}). \]
A 3-way array is a third-order tensor, written slice by slice, e.g.
\[ \mathcal{X} = \left[ \begin{pmatrix} 2 & 8 & 2 & 9 \\ 3 & 0 & 1 & 0 \end{pmatrix},\ \begin{pmatrix} 6 & 8 & 1 & 1 \\ 2 & 2 & 3 & 3 \end{pmatrix} \right] \quad (2 \times 2 \times 4 \text{ tensor}). \]

1.3 Subarray
Subarrays are formed when a subset of the indices is fixed. For matrices, these are the rows and columns.
A colon is used to indicate all elements of a mode. Thus, the jth column of A is denoted by a:j , and the
ith row of a matrix A is denoted by ai:.

The following are the subarrays of a matrix A with two rows and three columns.

Rows of A:
\[ \mathbf{a}_{1:} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \end{pmatrix}, \qquad \mathbf{a}_{2:} = \begin{pmatrix} a_{21} & a_{22} & a_{23} \end{pmatrix}. \]

Columns of A:
\[ \mathbf{a}_{:1} = \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}, \qquad \mathbf{a}_{:2} = \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}, \qquad \mathbf{a}_{:3} = \begin{pmatrix} a_{13} \\ a_{23} \end{pmatrix}. \]
1.4 Fibers and Slices
Fibers are the higher-order analogue of matrix rows and columns. A fiber is defined by fixing every index
but one. A matrix column is a mode-1 fiber and a matrix row is a mode-2 fiber. Third-order tensors
have column, row, and tube fibers, denoted by x:jk , xi:k , and xij: , respectively.

Slices are two-dimensional sections of a tensor, defined by fixing all but two indices. The horizontal,
lateral, and frontal slices of a third-order tensor X are denoted by Xi::, X:j:, and X::k, respectively.
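As a concrete illustration, fibers and slices can be extracted with ordinary NumPy indexing; this is a minimal sketch (the tensor is the same 2 × 3 × 4 example used in Section 1.8, and NumPy uses zero-based indices, so X[:, 0, 0] corresponds to the mode-1 fiber x:11 in the one-based notation above).

import numpy as np

X = np.arange(1, 25).reshape(2, 3, 4)   # a 2 x 3 x 4 tensor

# fibers: fix every index but one
column_fiber = X[:, 0, 0]    # mode-1 (column) fiber
row_fiber = X[0, :, 0]       # mode-2 (row) fiber
tube_fiber = X[0, 0, :]      # mode-3 (tube) fiber

# slices: fix all but two indices
horizontal_slice = X[0, :, :]   # fix the first index
lateral_slice = X[:, 0, :]      # fix the second index
frontal_slice = X[:, :, 0]      # fix the third index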

1.5 Norm
The norm of a tensor X ∈ R^{I1×I2×···×IN} is the square root of the sum of the squares of all its elements,
i.e.,

\[ \|\mathcal{X}\| = \sqrt{ \sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} x_{i_1 i_2 \cdots i_N}^{2} }. \]

This is analogous to the matrix Frobenius norm, which is denoted ∥A∥ for a matrix.

The inner product of two same-sized tensors X, Y ∈ R^{I1×I2×···×IN} is the sum of the products of their
entries, i.e.,

\[ \langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} x_{i_1 i_2 \cdots i_N}\, y_{i_1 i_2 \cdots i_N}. \]

It follows immediately that ⟨X, X⟩ = ∥X∥².
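A short NumPy sketch of these two definitions (illustrative only; the elementwise operations below apply to tensors of any order):

import numpy as np

X = np.random.rand(2, 3, 4)
Y = np.random.rand(2, 3, 4)

norm_X = np.sqrt((X ** 2).sum())   # ||X||: square root of the sum of squared entries
inner_XY = (X * Y).sum()           # <X, Y>: sum of products of corresponding entries

# consistency check: <X, X> = ||X||^2
assert np.isclose((X * X).sum(), norm_X ** 2)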

1.6 Types of Tensor


1.6.1 Cubical Tensor
A tensor is called cubical if every mode is the same size, i.e., X ∈ R^{I×I×I×···×I}. The matrix analogue is a
square matrix.

1.6.2 Symmetric Tensor


A cubical tensor is called supersymmetric (or simply symmetric) if its elements remain constant under any
permutation of the indices. For instance, a three-way tensor X ∈ R^{I×I×I} is supersymmetric if

xijk = xikj = xjik = xjki = xkij = xkji for all i, j, k = 1, . . . , I.

1.6.3 Partially Symmetric Tensor


Tensors can be (partially) symmetric in two or more modes as well. For example, a three-way tensor
X ∈ RI×I×K is symmetric in modes one and two if all its frontal slices are symmetric, i.e.,

\[ \mathbf{X}_k = \mathbf{X}_k^{\top} \quad \text{for all } k = 1, \ldots, K. \]

1.6.4 Diagonal Tensor


A tensor X ∈ RI1 ×I2 ×···×IN is diagonal if xi1 i2 ···iN ̸= 0 only if i1 = i2 = · · · = iN . Figure 1.1 illustrates
a cubical tensor with ones along the superdiagonal.

1.6.5 Rank-One Tensor
An N -way tensor X ∈ RI1 ×I2 ×···×IN is rank one if it can be written as the outer product of N vectors,
i.e.,
\[ \mathcal{X} = \mathbf{a}^{(1)} \circ \mathbf{a}^{(2)} \circ \cdots \circ \mathbf{a}^{(N)}. \]
The symbol "◦" represents the vector outer product. This means that each element of the tensor is the
product of the corresponding vector elements:
\[ x_{i_1 i_2 \cdots i_N} = a^{(1)}_{i_1} a^{(2)}_{i_2} \cdots a^{(N)}_{i_N} \quad \text{for all } 1 \le i_n \le I_n. \]

Figure 1.2 illustrates X = a ◦ b ◦ c, a third-order rank-one tensor.

Example of a 3rd-order rank-one tensor:

\[ \mathcal{X} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \circ \begin{pmatrix} -1 \\ 1 \end{pmatrix} \circ \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \left[ \begin{pmatrix} -3 & -2 \\ 3 & 2 \end{pmatrix},\ \begin{pmatrix} -6 & -4 \\ 6 & 4 \end{pmatrix} \right], \]

a 2 × 2 × 2 tensor written slice by slice.
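The example above can be reproduced with nested outer products in NumPy; a small sketch in which np.multiply.outer plays the role of the outer product ◦:

import numpy as np

a = np.array([1, 2])
b = np.array([-1, 1])
c = np.array([3, 2])

# X = a o b o c, i.e. x_ijk = a_i * b_j * c_k
X = np.multiply.outer(np.multiply.outer(a, b), c)

print(X[0])   # first slice:  [[-3 -2] [ 3  2]]
print(X[1])   # second slice: [[-6 -4] [ 6  4]]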

1.7 Matricization (Transforming a Tensor into a Matrix)


Matricization, also known as unfolding or flattening, is the process of reordering the elements of an
N -way array into a matrix. For instance, a 2 × 3 × 4 tensor can be arranged as a 6 × 4 matrix or a 3 × 8
matrix, and so on. The mode- n matricization of a tensor X ∈ RI1 ×I2 ×···×IN is denoted by X(n) and
arranges the mode-n fibers to be the columns of the resulting matrix. Though conceptually simple, the
formal notation is clunky. Tensor element (i1, i2, . . . , iN) maps to matrix element (in, j), where

\[ j = 1 + \sum_{\substack{k=1 \\ k \neq n}}^{N} (i_k - 1) J_k, \qquad J_k = \prod_{\substack{m=1 \\ m \neq n}}^{k-1} I_m. \]

(The ordering of the columns of X(n) is a matter of convention; the example below, and the tensorly
unfold function used in Section 1.8, arrange the mode-n fibers in a different order than this formula, but
any fixed ordering works as long as it is used consistently.)

The concept is easier to understand using an example. Let X ∈ R^{2×3×4} be given, slice by slice, as

\[ \mathbf{X}_{1::} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{pmatrix}, \qquad \mathbf{X}_{2::} = \begin{pmatrix} 13 & 14 & 15 & 16 \\ 17 & 18 & 19 & 20 \\ 21 & 22 & 23 & 24 \end{pmatrix}. \]

Then the three mode-n unfoldings are

\[ \mathbf{X}_{(1)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24 \end{pmatrix}, \]

\[ \mathbf{X}_{(2)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 13 & 14 & 15 & 16 \\ 5 & 6 & 7 & 8 & 17 & 18 & 19 & 20 \\ 9 & 10 & 11 & 12 & 21 & 22 & 23 & 24 \end{pmatrix}, \]

\[ \mathbf{X}_{(3)} = \begin{pmatrix} 1 & 5 & 9 & 13 & 17 & 21 \\ 2 & 6 & 10 & 14 & 18 & 22 \\ 3 & 7 & 11 & 15 & 19 & 23 \\ 4 & 8 & 12 & 16 & 20 & 24 \end{pmatrix}. \]

It is also possible to vectorize a tensor. In the example above, the vectorized version is

\[ \operatorname{vec}(\mathcal{X}) = \begin{pmatrix} 1 \\ 2 \\ \vdots \\ 24 \end{pmatrix}. \]

1.8 Python Codes


Importing modules

[1]: import numpy as np


import sympy as sp
import tensorly as tl

Defining Tensor in Tensorly

[2]: arr = np.arange(1,25).reshape(2,3,4)


t = tl.tensor(arr)
t

[2]: array([[[ 1, 2, 3, 4],


[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],

[[13, 14, 15, 16],


[17, 18, 19, 20],
[21, 22, 23, 24]]])

Vectorization of Tensor
Tensor > Vector

[3]: vec = tl.tensor_to_vec(t)


vec

[3]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24])

Vector > Tensor

[4]: tl.vec_to_tensor(vec, (3,2,4))

[4]: array([[[ 1, 2, 3, 4],


[ 5, 6, 7, 8]],

[[ 9, 10, 11, 12],


[13, 14, 15, 16]],

[[17, 18, 19, 20],


[21, 22, 23, 24]]])

Tensor Unfolding / Matricization


Unfolding the Tensor wrt Axis = 0

[5]: tl.unfold(t, 0)

[5]: array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],


[13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

Unfolding wrt Axis = 1

[6]: tl.unfold(t, 1)

[6]: array([[ 1, 2, 3, 4, 13, 14, 15, 16],


[ 5, 6, 7, 8, 17, 18, 19, 20],
[ 9, 10, 11, 12, 21, 22, 23, 24]])

[7]: tl.unfold(t, 1).flatten()

[7]: array([ 1, 2, 3, 4, 13, 14, 15, 16, 5, 6, 7, 8, 17, 18, 19, 20, 9,
10, 11, 12, 21, 22, 23, 24])

Unfolding wrt axis = 2

[8]: unf = tl.unfold(t, 2)


unf

[8]: array([[ 1, 5, 9, 13, 17, 21],


[ 2, 6, 10, 14, 18, 22],
[ 3, 7, 11, 15, 19, 23],
[ 4, 8, 12, 16, 20, 24]])

Chapter 2

Tensor Product / Multiplication

2.1 n-mode Product


Tensors can be multiplied together, though obviously the notation and symbols for this are much more
complex than for matrices. Here we consider only the tensor n-mode product, i.e., multiplying a tensor
by a matrix (or a vector) in mode n.

The n-mode (matrix) product of a tensor X ∈ RI1 ×I2 ×···×IN with a matrix U ∈ RJ×In is denoted
by X ×n U and is of size I1 × · · · × In−1 × J × In+1 × · · · × IN . Elementwise, we have

\[ (\mathcal{X} \times_n \mathbf{U})_{i_1 \cdots i_{n-1}\, j\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_N}\, u_{j i_n}. \]

Each mode- n fiber is multiplied by the matrix U. The idea can also be expressed in terms of unfolded
tensors:

\[ \mathcal{Y} = \mathcal{X} \times_n \mathbf{U} \quad \Leftrightarrow \quad \mathbf{Y}_{(n)} = \mathbf{U}\mathbf{X}_{(n)}. \]

The n-mode product of a tensor with a matrix is related to a change of basis in the case when a tensor
defines a multilinear operator.
Example of an n-mode matrix product. Let X ∈ R^{2×2×3} be given, slice by slice, by

\[ \mathbf{X}_{1::} = \begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & -2 \end{pmatrix}, \qquad \mathbf{X}_{2::} = \begin{pmatrix} 2 & 1 & 3 \\ 1 & 2 & -3 \end{pmatrix}, \qquad \mathbf{U} = \begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 1 & 1 \end{pmatrix} \in \mathbb{R}^{3 \times 2}. \]

The mode-1 unfolding of X is

\[ \mathbf{X}_{(1)} = \begin{pmatrix} 1 & 0 & 2 & 1 & -1 & -2 \\ 2 & 1 & 3 & 1 & 2 & -3 \end{pmatrix}, \]

so

\[ \mathbf{U}\mathbf{X}_{(1)} = \begin{pmatrix} -1 & -1 & -1 & 0 & -3 & 1 \\ 2 & 0 & 4 & 2 & -2 & -4 \\ 3 & 1 & 5 & 2 & 1 & -5 \end{pmatrix}, \]

and folding back gives the 3 × 2 × 3 tensor

\[ \mathcal{X} \times_1 \mathbf{U} = \left[ \begin{pmatrix} -1 & -1 & -1 \\ 0 & -3 & 1 \end{pmatrix},\ \begin{pmatrix} 2 & 0 & 4 \\ 2 & -2 & -4 \end{pmatrix},\ \begin{pmatrix} 3 & 1 & 5 \\ 2 & 1 & -5 \end{pmatrix} \right]. \]

A few facts regarding n-mode matrix products are in order. For distinct modes in a series of multiplica-
tions, the order of the multiplication is irrelevant, i.e.,

X ×m A ×n B = X ×n B ×m A (m ̸= n).

If the modes are the same, then

X ×n A ×n B = X ×n (BA).
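Both facts are easy to check numerically with tensorly's mode_dot; a sketch with arbitrarily chosen sizes:

import numpy as np
import tensorly as tl

X = tl.tensor(np.random.rand(2, 3, 4))
A = np.random.rand(5, 2)   # acts on mode 0
B = np.random.rand(6, 3)   # acts on mode 1

# distinct modes: the order of multiplication is irrelevant
left = tl.tenalg.mode_dot(tl.tenalg.mode_dot(X, A, 0), B, 1)
right = tl.tenalg.mode_dot(tl.tenalg.mode_dot(X, B, 1), A, 0)
assert np.allclose(left, right)

# same mode: X x_n A x_n B = X x_n (BA)
A2 = np.random.rand(5, 2)
B2 = np.random.rand(7, 5)
assert np.allclose(
    tl.tenalg.mode_dot(tl.tenalg.mode_dot(X, A2, 0), B2, 0),
    tl.tenalg.mode_dot(X, B2 @ A2, 0),
)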

The n-mode (vector) product of a tensor X ∈ RI1 ×I2 ×···×IN with a vector v ∈ RIn is de-
noted by X ×n v. The result is of order N − 1, i.e., the size is I1 × · · · × In−1 × In+1 × · · · × IN .
Elementwise,

\[ (\mathcal{X} \times_n \mathbf{v})_{i_1 \cdots i_{n-1}\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_N}\, v_{i_n}. \]

The idea is to compute the inner product of each mode-n fiber with the vector v.
For example, let X be as given above and define v = (1, 2)ᵀ. Then

\[ \mathcal{X} \times_1 \mathbf{v} = \begin{pmatrix} 5 & 2 & 8 \\ 3 & 3 & -8 \end{pmatrix}. \]

When it comes to mode-n vector multiplication, precedence matters because the order of the intermediate
results changes. In other words,

\[ \mathcal{X} \times_m \mathbf{a} \times_n \mathbf{b} = (\mathcal{X} \times_m \mathbf{a}) \times_{n-1} \mathbf{b} = (\mathcal{X} \times_n \mathbf{b}) \times_m \mathbf{a} \quad \text{for } m < n. \]

2.2 Kronecker Product of 2 Matrices


The Kronecker product of matrices A ∈ R^{I×J} and B ∈ R^{K×L} is denoted by A ⊗ B. The result is a
matrix of size (IK) × (JL) defined by

\[ \mathbf{A} \otimes \mathbf{B} = \begin{pmatrix} a_{11}\mathbf{B} & a_{12}\mathbf{B} & \cdots & a_{1J}\mathbf{B} \\ a_{21}\mathbf{B} & a_{22}\mathbf{B} & \cdots & a_{2J}\mathbf{B} \\ \vdots & \vdots & \ddots & \vdots \\ a_{I1}\mathbf{B} & a_{I2}\mathbf{B} & \cdots & a_{IJ}\mathbf{B} \end{pmatrix}. \]

Example: for A ∈ R^{2×2} and B ∈ R^{2×2}, A ⊗ B ∈ R^{4×4}:

\[ \mathbf{A} = \begin{pmatrix} 1 & 2 \\ -1 & 0 \end{pmatrix}, \quad \mathbf{B} = \begin{pmatrix} 1 & -1 \\ -2 & 3 \end{pmatrix}, \quad \text{then} \quad \mathbf{A} \otimes \mathbf{B} = \begin{pmatrix} 1 & -1 & 2 & -2 \\ -2 & 3 & -4 & 6 \\ -1 & 1 & 0 & 0 \\ 2 & -3 & 0 & 0 \end{pmatrix}. \]

2.3 Khatri-Rao Product


The Khatri-Rao product is the "matching columnwise" Kronecker product. Given matrices A ∈ R^{I×K}
and B ∈ R^{J×K}, their Khatri-Rao product is denoted by A ⊙ B. The result is a matrix of size (IJ) × K
defined by

\[ \mathbf{A} \odot \mathbf{B} = \begin{pmatrix} \mathbf{a}_1 \otimes \mathbf{b}_1 & \mathbf{a}_2 \otimes \mathbf{b}_2 & \cdots & \mathbf{a}_K \otimes \mathbf{b}_K \end{pmatrix}. \]

If a and b are vectors, then the Khatri-Rao and Kronecker products are identical, i.e., a ⊗ b = a ⊙ b.

Example: for A ∈ R^{2×2} and B ∈ R^{3×2}, A ⊙ B ∈ R^{6×2}:

\[ \mathbf{A} = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix}, \quad \mathbf{B} = \begin{pmatrix} 1 & -2 \\ 2 & 1 \\ 3 & 0 \end{pmatrix}, \quad \text{then} \quad \mathbf{A} \odot \mathbf{B} = \begin{pmatrix} 2 & -2 \\ 4 & 1 \\ 6 & 0 \\ -1 & 0 \\ -2 & 0 \\ -3 & 0 \end{pmatrix}. \]

2.4 Hadamard Product


The Hadamard product is the elementwise matrix product. Given matrices A and B, both of size I × J,
their Hadamard product is denoted by A ∗ B. The result is also of size I × J and defined by

\[ \mathbf{A} * \mathbf{B} = \begin{pmatrix} a_{11}b_{11} & a_{12}b_{12} & \cdots & a_{1J}b_{1J} \\ a_{21}b_{21} & a_{22}b_{22} & \cdots & a_{2J}b_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ a_{I1}b_{I1} & a_{I2}b_{I2} & \cdots & a_{IJ}b_{IJ} \end{pmatrix}. \]

These matrix products have properties that will prove useful in our discussions:

\[ (\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{A}\mathbf{C} \otimes \mathbf{B}\mathbf{D}, \]
\[ (\mathbf{A} \otimes \mathbf{B})^{\dagger} = \mathbf{A}^{\dagger} \otimes \mathbf{B}^{\dagger}, \]
\[ \mathbf{A} \odot \mathbf{B} \odot \mathbf{C} = (\mathbf{A} \odot \mathbf{B}) \odot \mathbf{C} = \mathbf{A} \odot (\mathbf{B} \odot \mathbf{C}), \]
\[ (\mathbf{A} \odot \mathbf{B})^{\top}(\mathbf{A} \odot \mathbf{B}) = \mathbf{A}^{\top}\mathbf{A} * \mathbf{B}^{\top}\mathbf{B}, \]
\[ (\mathbf{A} \odot \mathbf{B})^{\dagger} = \left( \mathbf{A}^{\top}\mathbf{A} * \mathbf{B}^{\top}\mathbf{B} \right)^{\dagger} (\mathbf{A} \odot \mathbf{B})^{\top}. \]

Here A† denotes the Moore-Penrose pseudoinverse of A.
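In NumPy the Hadamard product is simply the elementwise product A * B, and the last two identities above can be verified numerically; a sketch using the same scipy khatri_rao routine as in the next section (sizes are illustrative):

import numpy as np
from scipy import linalg

A = np.random.rand(4, 3)
B = np.random.rand(5, 3)

AkrB = linalg.khatri_rao(A, B)       # Khatri-Rao product, size (4*5) x 3
hadamard = (A.T @ A) * (B.T @ B)     # A^T A * B^T B (Hadamard product)

# (A kr B)^T (A kr B) = A^T A * B^T B
assert np.allclose(AkrB.T @ AkrB, hadamard)

# (A kr B)^+ = (A^T A * B^T B)^+ (A kr B)^T
assert np.allclose(np.linalg.pinv(AkrB), np.linalg.pinv(hadamard) @ AkrB.T)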

2.5 Python Codes for Product Operations


Kronecker Product
[9]: a = sp.Matrix([[1,2],[4,1]])
b = sp.Matrix([[4,0],[2,-1]])
display(a)
display(b)
 
[1  2]
[4  1]

[4   0]
[2  -1]

[10]: sp.kronecker_product(a,b) # using sympy


[10]: [ 4   0   8   0]
      [ 2  -1   4  -2]
      [16   0   4   0]
      [ 8  -4   2  -1]

[11]: np.kron(a,b) # using numpy

[11]: array([[ 4,  0,  8,  0],
             [ 2, -1,  4, -2],
             [16,  0,  4,  0],
             [ 8, -4,  2, -1]], dtype=object)

[12]: from scipy import linalg # using scipy

m1 = np.array([[1,2],[-1,0]])
m2 = np.array([[1,-1],[-2,3]])
linalg.kron(m1, m2)

[12]: array([[ 1, -1, 2, -2],


[-2, 3, -4, 6],
[-1, 1, 0, 0],
[ 2, -3, 0, 0]])

Khatri Rao Product


[13]: a = np.array([[2, 1],[-1, 0]])
b = np.array([[1, -2],[2, 1], [3, 0]])
linalg.khatri_rao(a, b)

[13]: array([[ 2, -2],


[ 4, 1],
[ 6, 0],
[-1, 0],
[-2, 0],
[-3, 0]])

Tensor n-mode Product


[14]: ten = tl.tensor([[[1, 0, 2],
[1, -1, -2]],
[[2, 1, 3],
[1, 2, -3]]])
U = tl.tensor([[1,-1],
[2, 0],
[1, 1]])
v = tl.tensor([1, 2])

Mode - 1 Tensor Product (Axis = 0)


[15]: result = tl.tenalg.mode_dot(ten, U, 0) # product with matrix U
result

[15]: array([[[-1, -1, -1],


[ 0, -3, 1]],

[[ 2, 0, 4],
[ 2, -2, -4]],

[[ 3, 1, 5],
[ 2, 1, -5]]])

[16]: tl.tenalg.mode_dot(ten, v, 0) # product with vector v

[16]: array([[ 5, 2, 8],


[ 3, 3, -8]])

2.6 Tensor Rank and the CP Decomposition

The CANDECOMP/PARAFAC (CP) decomposition factorizes a tensor into a sum of component rank-
one tensors. For example, given a third-order tensor X ∈ RI×J×K , we wish to write it as

\[ \mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r, \tag{2.1} \]

where R is a positive integer and a_r ∈ R^I, b_r ∈ R^J, and c_r ∈ R^K for r = 1, . . . , R. Elementwise, Eq. 2.1
is written as

\[ x_{ijk} \approx \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} \quad \text{for } i = 1, \ldots, I,\; j = 1, \ldots, J,\; k = 1, \ldots, K. \tag{2.2} \]

This is illustrated in Figure 2.1. When equality holds in Eq. 2.1, the decomposition is exact, and the
smallest R for which this is possible is the rank of the tensor X.

The rank of a tensor X, denoted rank(X), is defined as the smallest number of rank-one tensors (see
Section 1.6.5) that generate X as their sum. In other words, this is the smallest number of components
in an exact CP decomposition, where "exact" means that there is equality in Eq. 2.1. An exact CP
decomposition with R = rank(X) components is called the rank decomposition.

The factor matrices refer to the combination of the vectors from the rank-one components, i.e., A =
[a_1  a_2  · · ·  a_R] and likewise for B and C. Using these definitions, Eq. 2.1 may be written in matricized
form:

X(1) ≈ A(B ⊙ C)⊤ ,


X(2) ≈ B(A ⊙ C)⊤ ,
X(3) ≈ C(A ⊙ B)⊤ .

⊙ denotes the Khatri-Rao product. The three-way model is sometimes written in terms of the frontal
slices of X :

Xk ≈ AD(k) B⊤ , where D(k) ≡ diag (ck: ) for k = 1, . . . , K.

Analogous equations can be written for the horizontal and lateral slices. In general, though, slicewise
expressions do not easily extend beyond three dimensions. The CP model can be concisely expressed as

\[ \mathcal{X} \approx [\![\mathbf{A}, \mathbf{B}, \mathbf{C}]\!] \equiv \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r. \]

It is often useful to assume that the columns of A, B, and C are normalized to length one with the
weights absorbed into the vector λ ∈ RR so that

\[ \mathcal{X} \approx [\![\boldsymbol{\lambda};\, \mathbf{A}, \mathbf{B}, \mathbf{C}]\!] \equiv \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r. \]

We have focused on the three-way case because it is widely applicable and sufficient for many needs. For
a general Nth-order tensor, X ∈ RI1 ×I2 ×···×IN , the CP decomposition is

\[ \mathcal{X} \approx [\![\boldsymbol{\lambda};\, \mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)}]\!] \equiv \sum_{r=1}^{R} \lambda_r\, \mathbf{a}^{(1)}_r \circ \mathbf{a}^{(2)}_r \circ \cdots \circ \mathbf{a}^{(N)}_r, \]

where λ ∈ RR and A(n) ∈ RIn ×R for n = 1, . . . , N . In this case, the mode- n matricized version is given
by

\[ \mathbf{X}_{(n)} \approx \mathbf{A}^{(n)} \boldsymbol{\Lambda} \left( \mathbf{A}^{(1)} \odot \cdots \odot \mathbf{A}^{(n-1)} \odot \mathbf{A}^{(n+1)} \odot \cdots \odot \mathbf{A}^{(N)} \right)^{\top}, \]

where Λ = diag(λ).
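Tensorly provides the CP decomposition through parafac, which fits the factors by alternating least squares. The sketch below (assuming a recent tensorly version in which parafac returns a (weights, factors) pair and cp_to_tensor is available) fits a rank-3 model to a random tensor, reconstructs it, and checks the mode-1 matricized identity above for the fitted model; the rank and sizes are purely illustrative choices.

import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac
from tensorly.tenalg import khatri_rao

X = tl.tensor(np.random.rand(4, 5, 6))

# fit a rank-3 CP model (the rank is an illustrative choice, not the true rank of X)
cp = parafac(X, rank=3)
weights, factors = cp
A, B, C = factors

# reconstruct the tensor from lambda, A, B, C and report the relative fit error
X_hat = tl.cp_to_tensor(cp)
print("relative error:", float(tl.norm(X - X_hat) / tl.norm(X)))

# matricized form of the fitted model: X_hat_(1) = A diag(lambda) (B kr C)^T
assert np.allclose(tl.unfold(X_hat, 0), (A * weights) @ khatri_rao([B, C]).T)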

2.6.1 Uniqueness
An interesting property of higher-order tensors is that their rank decompositions are often unique,
whereas matrix decompositions are not.
Consider the fact that matrix decompositions are not unique. Let X ∈ RI×J be a matrix of rank R.
Then a rank decomposition of this matrix is

\[ \mathbf{X} = \mathbf{A}\mathbf{B}^{\top} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r. \]

If the SVD of X is UΣV⊤ , then we can choose A = UΣ and B = V. However, it is equally valid to
choose A = UΣW and B = VW, where W is some R × R orthogonal matrix. In other words, we can
easily construct two completely different sets of R rank-one matrices that sum to the original matrix.
The SVD of a matrix is unique (assuming all the singular values are distinct) only because of the addition
of orthogonality constraints (and the diagonal matrix of ordered singular values in the middle).
The CP decomposition, on the other hand, is unique under much weaker conditions. Let X ∈ RI×J×K
be a three-way tensor of rank R, i.e.,

\[ \mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r = [\![\mathbf{A}, \mathbf{B}, \mathbf{C}]\!]. \tag{2.3} \]

Uniqueness means that this is the only possible combination of rank-one tensors that sums to X , with the
exception of the elementary indeterminacies of scaling and permutation. The permutation indeterminacy
refers to the fact that the rank-one component tensors can be reordered arbitrarily, i.e.,

\[ \mathcal{X} = [\![\mathbf{A}, \mathbf{B}, \mathbf{C}]\!] = [\![\mathbf{A}\boldsymbol{\Pi}, \mathbf{B}\boldsymbol{\Pi}, \mathbf{C}\boldsymbol{\Pi}]\!] \quad \text{for any } R \times R \text{ permutation matrix } \boldsymbol{\Pi}. \]

The scaling indeterminacy refers to the fact that we can scale the individual vectors, i.e.,

\[ \mathcal{X} = \sum_{r=1}^{R} (\alpha_r \mathbf{a}_r) \circ (\beta_r \mathbf{b}_r) \circ (\gamma_r \mathbf{c}_r), \]

as long as αr βr γr = 1 for r = 1, . . . , R.

The k-rank of a matrix A, denoted kA , is defined as the maximum value k such that any k columns are
linearly independent. Kruskal’s result [3, 4] says that a sufficient condition for uniqueness for the CP
decomposition in 2.3 is

kA + kB + kC ≥ 2R + 2.
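There is no built-in k-rank routine in NumPy; the brute-force sketch below (illustrative only, and practical only for small matrices) checks every subset of k columns for linear independence:

import numpy as np
from itertools import combinations

def k_rank(A, tol=1e-10):
    # largest k such that EVERY set of k columns of A is linearly independent
    n_cols = A.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        if all(np.linalg.matrix_rank(A[:, list(cols)], tol=tol) == size
               for cols in combinations(range(n_cols), size)):
            k = size
        else:
            break
    return k

# a generic random matrix has k-rank equal to min(rows, columns)
A = np.random.rand(6, 4)
print(k_rank(A))   # 4 (with probability one)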

Let X be an N -way tensor with rank R and suppose that its CP decomposition is

\[ \mathcal{X} = \sum_{r=1}^{R} \mathbf{a}^{(1)}_r \circ \mathbf{a}^{(2)}_r \circ \cdots \circ \mathbf{a}^{(N)}_r = [\![\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)}]\!]. \]
r=1

Then a sufficient condition for uniqueness is [5]

\[ \sum_{n=1}^{N} k_{\mathbf{A}^{(n)}} \geq 2R + (N - 1). \]

The previous results provide only sufficient conditions. Ten Berge and Sidiropoulos showed that the
sufficient condition is also necessary for tensors of rank R = 2 and R = 3, but not for R > 3. Liu and
Sidiropoulos [6] considered more general necessary conditions. They showed that a necessary condition
for uniqueness of the CP decomposition is

min{rank(A ⊙ B), rank(A ⊙ C), rank(B ⊙ C)} = R.

More generally, they showed that for the N -way case, a necessary condition for uniqueness of the CP
decomposition is
 
\[ \min_{n=1,\ldots,N} \operatorname{rank}\left( \mathbf{A}^{(1)} \odot \cdots \odot \mathbf{A}^{(n-1)} \odot \mathbf{A}^{(n+1)} \odot \cdots \odot \mathbf{A}^{(N)} \right) = R. \]

They further observed that since rank(A ⊙ B) ≤ rank(A ⊗ B) ≤ rank(A) · rank(B), an even simpler
necessary condition is

\[ \min_{n=1,\ldots,N}\ \prod_{\substack{m=1 \\ m \neq n}}^{N} \operatorname{rank}\left( \mathbf{A}^{(m)} \right) \geq R. \]

2.7 Low-Rank Approximations


For matrices, Eckart and Young showed that a best rank-k approximation is given by the leading k
factors of the SVD. In other words, let R be the rank of a matrix A and assume its SVD is given by

\[ \mathbf{A} = \sum_{r=1}^{R} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r \quad \text{with } \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_R > 0. \]

Then a rank-k approximation that minimizes ∥A − B∥ is given by

\[ \mathbf{B} = \sum_{r=1}^{k} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r. \]
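For matrices this is straightforward to verify numerically. A sketch of the truncated-SVD (Eckart-Young) approximation in NumPy, where the Frobenius error equals the square root of the sum of the discarded squared singular values (sizes and k are illustrative):

import numpy as np

A = np.random.rand(8, 6)
k = 2

U, sigma, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(sigma) V^T

# best rank-k approximation: keep the k leading singular triplets
B = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]

print(np.linalg.norm(A - B))             # Frobenius error of the approximation
print(np.sqrt((sigma[k:] ** 2).sum()))   # equals the error above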

This type of result does not hold true for higher-order tensors. For instance, consider a third-order tensor
of rank R with the following CP decomposition:

\[ \mathcal{X} = \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r. \]

Ideally, summing k of the factors would yield a best rank-k approximation, but that is not the case. It
is possible that the best rank-k approximation may not even exist. The problem is referred to as one of
degeneracy. A tensor is degenerate if it may be approximated arbitrarily well by a factorization of lower
rank. Let X ∈ RI×J×K be a rank-three tensor defined by

X = a1 ◦ b1 ◦ c2 + a1 ◦ b2 ◦ c1 + a2 ◦ b1 ◦ c1 ,

where A ∈ RI×2 , B ∈ RJ×2 , and C ∈ RK×2 , and each has linearly independent columns. This tensor
can be approximated arbitrarily closely by a rank-two tensor of the following form:
     
\[ \mathcal{Y} = \alpha \left( \mathbf{a}_1 + \frac{1}{\alpha}\mathbf{a}_2 \right) \circ \left( \mathbf{b}_1 + \frac{1}{\alpha}\mathbf{b}_2 \right) \circ \left( \mathbf{c}_1 + \frac{1}{\alpha}\mathbf{c}_2 \right) - \alpha\, \mathbf{a}_1 \circ \mathbf{b}_1 \circ \mathbf{c}_1. \]

Specifically,

\[ \|\mathcal{X} - \mathcal{Y}\| = \left\| \frac{1}{\alpha}\left( \mathbf{a}_2 \circ \mathbf{b}_2 \circ \mathbf{c}_1 + \mathbf{a}_2 \circ \mathbf{b}_1 \circ \mathbf{c}_2 + \mathbf{a}_1 \circ \mathbf{b}_2 \circ \mathbf{c}_2 \right) + \frac{1}{\alpha^{2}}\, \mathbf{a}_2 \circ \mathbf{b}_2 \circ \mathbf{c}_2 \right\|, \]

which can be made arbitrarily small. As this example illustrates, it is typical of degeneracy that factors
become nearly proportional and that the magnitude of some terms in the decomposition go to infinity
but have opposite sign.
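The degeneracy can also be observed numerically. The sketch below (random vectors, with sizes chosen only for illustration) builds the rank-three tensor X above and evaluates the rank-two approximation Y for increasing values of α; the error decays roughly like 1/α:

import numpy as np

def outer3(a, b, c):
    # three-way outer product: (a o b o c)_ijk = a_i * b_j * c_k
    return np.einsum('i,j,k->ijk', a, b, c)

rng = np.random.default_rng(0)
a1, a2 = rng.standard_normal(4), rng.standard_normal(4)
b1, b2 = rng.standard_normal(5), rng.standard_normal(5)
c1, c2 = rng.standard_normal(6), rng.standard_normal(6)

X = outer3(a1, b1, c2) + outer3(a1, b2, c1) + outer3(a2, b1, c1)   # rank three

for alpha in [1, 10, 100, 1000]:
    Y = (alpha * outer3(a1 + a2 / alpha, b1 + b2 / alpha, c1 + c2 / alpha)
         - alpha * outer3(a1, b1, c1))
    print(alpha, np.linalg.norm(X - Y))   # approximation error shrinks as alpha grows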
Kruskal, Harshman, and Lundy also discuss degeneracy and illustrate the idea of a sequence of lower-rank
tensors converging to one of higher rank. Figure 2.2 shows the problem of estimating a rank-three tensor
Y by a rank-two tensor. Here, a sequence {Xk} of rank-two tensors provides increasingly better estimates
of Y. Necessarily, the best approximation lies on the boundary of the space of rank-two and rank-three
tensors. However, since the space of rank-two tensors is not closed, the sequence may converge to a tensor
X* of rank other than two. The previous example shows a sequence of rank-two tensors converging to
rank three.

2.7.1 Iterative Method


Recall the low-rank approximation of a matrix A: if

\[ \mathbf{A} = \sum_{r=1}^{R} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r, \qquad \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_R > 0, \]

then the rank-K approximation which minimizes ∥A − B∥ is given by

\[ \mathbf{B} = \sum_{r=1}^{K} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r. \]

We are given a data matrix

\[ \mathbf{X} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix} \in \mathbb{R}^{n \times m}, \]

which we approximate by a sum of R rank-one terms,

\[ \mathbf{A} = \sum_{i=1}^{R} \mathbf{a}_i \circ \mathbf{b}_i, \]

where

\[ \mathbf{a}_i = \begin{pmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{in} \end{pmatrix} \in \mathbb{R}^{n \times 1}, \qquad \mathbf{b}_i = \begin{pmatrix} b_{i1} & \cdots & b_{im} \end{pmatrix} \in \mathbb{R}^{1 \times m}. \]

Here,

\[ (\mathbf{X} - \mathbf{A})_{ij} = x_{ij} - \sum_{k=1}^{R} a_{ki}\, b_{kj}, \]

\[ D = \|\mathbf{X} - \mathbf{A}\|^{2} = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( x_{ij} - \sum_{k=1}^{R} a_{ki}\, b_{kj} \right)^{2}, \]

\[ \frac{\partial D}{\partial a_{11}} = 2 \sum_{j=1}^{m} \left( x_{1j} - \sum_{k=1}^{R} a_{k1} b_{kj} \right)(-b_{1j}), \]

\[ \frac{\partial D}{\partial a_{12}} = 2 \sum_{j=1}^{m} \left( x_{2j} - \sum_{k=1}^{R} a_{k2} b_{kj} \right)(-b_{1j}), \]

\[ \vdots \]

\[ \frac{\partial D}{\partial a_{21}} = 2 \sum_{j=1}^{m} \left( x_{1j} - \sum_{k=1}^{R} a_{k1} b_{kj} \right)(-b_{2j}). \]

Similarly,

\[ \frac{\partial D}{\partial b_{21}} = 2 \sum_{i=1}^{n} \left( x_{i1} - \sum_{k=1}^{R} a_{ki} b_{k1} \right)(-a_{2i}). \]

So we can find the elements of the factor matrices by an iterative method: starting from an initial guess,
we apply the gradient descent algorithm to obtain the elements of the next iteration,

\[ a_{ij}^{(t+1)} = a_{ij}^{(t)} - \eta\, \frac{\partial D}{\partial a_{ij}}, \]

where η > 0 is a step size, and similarly for the entries of the b_i.
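A minimal NumPy sketch of this scheme (entirely illustrative: the sizes, the rank R, the step size η, and the number of iterations are arbitrary choices, and the updates below are just the matrix form of the partial derivatives derived above):

import numpy as np

rng = np.random.default_rng(0)
n, m, R = 20, 15, 3
X = rng.standard_normal((n, R)) @ rng.standard_normal((R, m))   # data matrix of rank R

A = rng.standard_normal((n, R))   # columns of A play the role of the vectors a_i
B = rng.standard_normal((R, m))   # rows of B play the role of the vectors b_i
eta = 1e-3                        # step size

for t in range(5000):
    E = X - A @ B            # residuals x_ij - sum_k a_ki b_kj
    grad_A = -2 * E @ B.T    # partial derivatives of D with respect to the entries of A
    grad_B = -2 * A.T @ E    # partial derivatives of D with respect to the entries of B
    A -= eta * grad_A
    B -= eta * grad_B

print("final error:", np.linalg.norm(X - A @ B))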

Future Work

This is what I have covered during this semester. Next, I plan to apply these low-rank tensor
approximation methods, together with other methods such as ALS (Alternating Least Squares),
to optimize the computations wherever possible.

My next piece of work will be on applications of the CP decomposition (the generalization of the
matrix SVD, the Singular Value Decomposition). I will be working on how it can be applied to
optimize data processing and computing cost.

References

[1] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009),
pp. 455-500.
[2] S. Boyd and L. Vandenberghe, Introduction to Applied Linear Algebra: Vectors, Matrices, and Least
Squares, Cambridge University Press, 2018, pp. 225-239.
[3] J. B. Kruskal, Three-way arrays: Rank and uniqueness of trilinear decompositions, with application
to arithmetic complexity and statistics, Linear Algebra Appl., 18 (1977), pp. 95-138.
[4] J. B. Kruskal, Rank, decomposition, and uniqueness for 3-way and N-way arrays, in Multiway Data
Analysis, R. Coppi and S. Bolasco, eds., North-Holland, Amsterdam, 1989, pp. 7-18.
[5] N. D. Sidiropoulos and R. Bro, On the uniqueness of multilinear decomposition of N-way arrays, J.
Chemometrics, 14 (2000), pp. 229-239.
[6] X. Liu and N. D. Sidiropoulos, Cramér-Rao lower bounds for low-rank decomposition of multidimensional
arrays, IEEE Trans. Signal Process., 49 (2001), pp. 2074-2086.
