Ranjan 21MA40022 Report
Submitted by
Ranjan Sarkar
(Roll No. 21MA40022)
Department of Mathematics
Indian Institute of Technology Kharagpur
West Bengal, India - 721302
Abstract
In this project, I studied tensors: what a tensor is, the main types of tensors, their operations, and some decomposition techniques for higher-order tensors. By decomposing a higher-dimensional tensor in an efficient way, we can reduce the time needed to compute tensor operations. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, and elsewhere.
Keywords: tensor, tensor decomposition, multiway arrays, multilinear algebra, parallel factors (PARAFAC), canonical decomposition (CANDECOMP), data processing, image compression, approximation, optimization, higher-dimensional data handling
Contents
1 Introduction
  1.1 What is a Tensor?
  1.2 Examples
  1.3 Subarray
  1.4 Fibers and Slices
  1.5 Norm
  1.6 Types of Tensor
    1.6.1 Cubical Tensor
    1.6.2 Symmetric Tensor
    1.6.3 Partially Symmetric Tensor
    1.6.4 Diagonal Tensor
    1.6.5 Rank-One Tensor
  1.7 Matricization (Transforming a Tensor into a Matrix)
  1.8 Python Codes
Future Work
References
Chapter 1
Introduction
1.1 What is a Tensor?
The order of a tensor is the number of dimensions, also known as ways or modes. Vectors (tensors of order one) are denoted by boldface lowercase letters, e.g., $\mathbf{a}$. Matrices (tensors of order two) are denoted by boldface capital letters, e.g., $\mathbf{A}$. Higher-order tensors (order three or higher) are denoted by boldface Euler script letters, e.g., $\mathcal{X}$. Scalars are denoted by lowercase letters, e.g., $a$.
1.2 Examples
A 1-dimensional array is a vector (order-one tensor):
$$\mathbf{a} = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}.$$
A 2-dimensional array is a matrix (order-two tensor), here of size $2 \times 3$:
$$\mathbf{A} = \begin{pmatrix} 3 & 3 & 9 \\ 3 & 3 & 2 \end{pmatrix}.$$
A 3-way array is an order-three tensor, here $\mathcal{X} \in \mathbb{R}^{2 \times 2 \times 4}$, given by its four frontal slices:
$$\mathbf{X}_1 = \begin{pmatrix} 2 & 8 \\ 3 & 0 \end{pmatrix}, \quad \mathbf{X}_2 = \begin{pmatrix} 2 & 9 \\ 1 & 0 \end{pmatrix}, \quad \mathbf{X}_3 = \begin{pmatrix} 6 & 8 \\ 2 & 2 \end{pmatrix}, \quad \mathbf{X}_4 = \begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix}.$$
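These example arrays can be created directly in NumPy; a minimal sketch (the names a, A, and X are ours, and the 2 x 2 x 4 tensor is entered slice by slice):

import numpy as np

a = np.array([1, 2, 3])                       # order-1 tensor (vector)
A = np.array([[3, 3, 9],
              [3, 3, 2]])                     # order-2 tensor (2 x 3 matrix)
# order-3 tensor (2 x 2 x 4), stacked from its four frontal slices
X = np.dstack([np.array([[2, 8], [3, 0]]),
               np.array([[2, 9], [1, 0]]),
               np.array([[6, 8], [2, 2]]),
               np.array([[1, 1], [3, 3]])])
print(a.ndim, A.ndim, X.ndim)                 # 1 2 3 -- the orders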
1.3 Subarray
Subarrays are formed when a subset of the indices is fixed. For matrices, these are the rows and columns.
A colon is used to indicate all elements of a mode. Thus, the $j$th column of $\mathbf{A}$ is denoted by $\mathbf{a}_{:j}$, and the $i$th row of a matrix $\mathbf{A}$ is denoted by $\mathbf{a}_{i:}$.
1.4 Fibers and Slices
Fibers are the higher-order analogue of matrix rows and columns. A fiber is defined by fixing every index but one. A matrix column is a mode-1 fiber and a matrix row is a mode-2 fiber. Third-order tensors have column, row, and tube fibers, denoted by $\mathbf{x}_{:jk}$, $\mathbf{x}_{i:k}$, and $\mathbf{x}_{ij:}$, respectively.
Slices are two-dimensional sections of a tensor, defined by fixing all but two indices. The horizontal, lateral, and frontal slices of a third-order tensor $\mathcal{X}$ are denoted by $\mathbf{X}_{i::}$, $\mathbf{X}_{:j:}$, and $\mathbf{X}_{::k}$, respectively.
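Fibers and slices correspond to simple index expressions in NumPy; a small sketch (the 3 x 4 x 2 tensor here is an arbitrary example of our own):

import numpy as np

X = np.arange(1, 25).reshape(3, 4, 2)

# fibers: fix every index but one
column_fiber = X[:, 1, 0]    # x_{:jk} (mode-1 fiber)
row_fiber    = X[2, :, 1]    # x_{i:k} (mode-2 fiber)
tube_fiber   = X[0, 3, :]    # x_{ij:} (mode-3 fiber)

# slices: fix all but two indices
horizontal = X[0, :, :]      # X_{i::}
lateral    = X[:, 2, :]      # X_{:j:}
frontal    = X[:, :, 1]      # X_{::k}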
1.5 Norm
The norm of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the square root of the sum of the squares of all its elements, i.e.,
$$\|\mathcal{X}\| = \sqrt{\sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} x_{i_1 i_2 \cdots i_N}^2}.$$
This is analogous to the matrix Frobenius norm, which is denoted $\|\mathbf{A}\|$ for a matrix $\mathbf{A}$.
The inner product of two same-sized tensors $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the sum of the products of their entries, i.e.,
$$\langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \cdots \sum_{i_N=1}^{I_N} x_{i_1 i_2 \cdots i_N}\, y_{i_1 i_2 \cdots i_N}.$$
It follows immediately that $\langle \mathcal{X}, \mathcal{X} \rangle = \|\mathcal{X}\|^2$.
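Both quantities reduce to elementwise NumPy operations; a quick sketch verifying the identity above on an arbitrary tensor of our own:

import numpy as np

X = np.arange(1, 25).reshape(2, 3, 4).astype(float)
Y = np.ones((2, 3, 4))

norm_X = np.sqrt((X ** 2).sum())    # tensor norm
inner  = (X * Y).sum()              # inner product <X, Y>

assert np.isclose((X * X).sum(), norm_X ** 2)    # <X, X> = ||X||^2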
1.6 Types of Tensor
1.6.1 Cubical Tensor
A tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is called cubical if every mode has the same size, i.e., $I_1 = I_2 = \cdots = I_N$.
1.6.2 Symmetric Tensor
A cubical tensor is called symmetric if its elements remain constant under any permutation of the indices. For a three-way tensor $\mathcal{X} \in \mathbb{R}^{I \times I \times I}$, this means
$$x_{ijk} = x_{ikj} = x_{jik} = x_{jki} = x_{kij} = x_{kji} \quad \text{for all } i, j, k = 1, \ldots, I.$$
1.6.3 Partially Symmetric Tensor
Tensors can be symmetric in two or more modes. For example, a three-way tensor $\mathcal{X} \in \mathbb{R}^{I \times I \times K}$ is symmetric in modes one and two if all its frontal slices are symmetric, i.e.,
$$\mathbf{X}_k = \mathbf{X}_k^\top \quad \text{for all } k = 1, \ldots, K.$$
1.6.4 Diagonal Tensor
A tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is diagonal if $x_{i_1 i_2 \cdots i_N} \neq 0$ only when $i_1 = i_2 = \cdots = i_N$.
1.6.5 Rank-One Tensor
An N -way tensor X ∈ RI1 ×I2 ×···×IN is rank one if it can be written as the outer product of N vectors,
i.e.,
$$\mathcal{X} = \mathbf{a}^{(1)} \circ \mathbf{a}^{(2)} \circ \cdots \circ \mathbf{a}^{(N)}.$$
The symbol "$\circ$" represents the vector outer product. This means that each element of the tensor is the product of the corresponding vector elements:
$$x_{i_1 i_2 \cdots i_N} = a_{i_1}^{(1)} a_{i_2}^{(2)} \cdots a_{i_N}^{(N)} \quad \text{for all } 1 \leq i_n \leq I_n.$$
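A rank-one tensor can be built by chaining outer products; a minimal sketch with three vectors of our own choosing:

import numpy as np
from functools import reduce

a1 = np.array([1.0, 2.0])
a2 = np.array([1.0, -1.0, 0.5])
a3 = np.array([2.0, 3.0, 1.0, -1.0])

# X = a1 o a2 o a3, so that x_{ijk} = a1[i] * a2[j] * a3[k]
X = reduce(np.multiply.outer, [a1, a2, a3])   # shape (2, 3, 4)

assert np.isclose(X[1, 2, 3], a1[1] * a2[2] * a3[3])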
1.7 Matricization (Transforming a Tensor into a Matrix)
Matricization, also known as unfolding or flattening, is the process of reordering the elements of an $N$-way array into a matrix. The mode-$n$ matricization of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, denoted $\mathbf{X}_{(n)}$, arranges the mode-$n$ fibers to be the columns of the resulting matrix. Tensor element $(i_1, i_2, \ldots, i_N)$ maps to matrix element $(i_n, j)$, where
$$j = 1 + \sum_{\substack{k=1 \\ k \neq n}}^{N} (i_k - 1)\, J_k \quad \text{with} \quad J_k = \prod_{\substack{m=1 \\ m \neq n}}^{k-1} I_m.$$
For example, let $\mathcal{X} \in \mathbb{R}^{2 \times 3 \times 4}$ be the tensor whose two horizontal slices are
$$\mathbf{X}_{1::} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{pmatrix}, \qquad \mathbf{X}_{2::} = \begin{pmatrix} 13 & 14 & 15 & 16 \\ 17 & 18 & 19 & 20 \\ 21 & 22 & 23 & 24 \end{pmatrix}.$$
Then the three mode-$n$ unfoldings are
$$\mathbf{X}_{(1)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24 \end{pmatrix},$$
$$\mathbf{X}_{(2)} = \begin{pmatrix} 1 & 2 & 3 & 4 & 13 & 14 & 15 & 16 \\ 5 & 6 & 7 & 8 & 17 & 18 & 19 & 20 \\ 9 & 10 & 11 & 12 & 21 & 22 & 23 & 24 \end{pmatrix},$$
$$\mathbf{X}_{(3)} = \begin{pmatrix} 1 & 5 & 9 & 13 & 17 & 21 \\ 2 & 6 & 10 & 14 & 18 & 22 \\ 3 & 7 & 11 & 15 & 19 & 23 \\ 4 & 8 & 12 & 16 & 20 & 24 \end{pmatrix}.$$
(The specific ordering of the columns is not important so long as it is consistent; the ordering here is the row-major one used by NumPy and TensorLy, matching the Python examples in section 1.8.)
It is also possible to vectorize a tensor. In the example above, the vectorized version is
$$\operatorname{vec}(\mathcal{X}) = \begin{pmatrix} 1 \\ 2 \\ \vdots \\ 24 \end{pmatrix}.$$
1.8 Python Codes

Vectorization of Tensor

Tensor -> Vector (t below is the $2 \times 3 \times 4$ example tensor of section 1.7):

[1]: import numpy as np
     import tensorly as tl

[2]: t = tl.tensor(np.arange(1, 25).reshape(2, 3, 4))   # the 2 x 3 x 4 example tensor

[3]: tl.tensor_to_vec(t)
[3]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
            18, 19, 20, 21, 22, 23, 24])

Vector -> Tensor

[4]: tl.vec_to_tensor(tl.tensor_to_vec(t), (2, 3, 4))   # recovers t

Mode-n unfoldings (compare with section 1.7):

[5]: tl.unfold(t, 0)
[6]: tl.unfold(t, 1)
[6]: array([[ 1,  2,  3,  4, 13, 14, 15, 16],
            [ 5,  6,  7,  8, 17, 18, 19, 20],
            [ 9, 10, 11, 12, 21, 22, 23, 24]])
Chapter 2
The $n$-mode (matrix) product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ with a matrix $\mathbf{U} \in \mathbb{R}^{J \times I_n}$ is denoted by $\mathcal{X} \times_n \mathbf{U}$ and is of size $I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N$. Elementwise, we have
$$(\mathcal{X} \times_n \mathbf{U})_{i_1 \cdots i_{n-1}\, j\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_N}\, u_{j i_n}.$$
Each mode-$n$ fiber is multiplied by the matrix $\mathbf{U}$. The idea can also be expressed in terms of unfolded tensors:
$$\mathcal{Y} = \mathcal{X} \times_n \mathbf{U} \quad \Leftrightarrow \quad \mathbf{Y}_{(n)} = \mathbf{U}\mathbf{X}_{(n)}.$$
The $n$-mode product of a tensor with a matrix is related to a change of basis in the case when a tensor defines a multilinear operator.
Example of an $n$-mode matrix product. Let $\mathcal{X} \in \mathbb{R}^{2 \times 2 \times 3}$ be given by its horizontal slices
$$\mathbf{X}_{1::} = \begin{pmatrix} 1 & 0 & 2 \\ 1 & -1 & -2 \end{pmatrix}, \qquad \mathbf{X}_{2::} = \begin{pmatrix} 2 & 1 & 3 \\ 1 & 2 & -3 \end{pmatrix},$$
and let
$$\mathbf{U} = \begin{pmatrix} 1 & -1 \\ 2 & 0 \\ 1 & 1 \end{pmatrix} \in \mathbb{R}^{3 \times 2}.$$
Then
$$\mathbf{X}_{(1)} = \begin{pmatrix} 1 & 0 & 2 & 1 & -1 & -2 \\ 2 & 1 & 3 & 1 & 2 & -3 \end{pmatrix}, \qquad \mathbf{U}\mathbf{X}_{(1)} = \begin{pmatrix} -1 & -1 & -1 & 0 & -3 & 1 \\ 2 & 0 & 4 & 2 & -2 & -4 \\ 3 & 1 & 5 & 2 & 1 & -5 \end{pmatrix},$$
and refolding gives $\mathcal{Y} = \mathcal{X} \times_1 \mathbf{U} \in \mathbb{R}^{3 \times 2 \times 3}$ with horizontal slices
$$\mathbf{Y}_{1::} = \begin{pmatrix} -1 & -1 & -1 \\ 0 & -3 & 1 \end{pmatrix}, \quad \mathbf{Y}_{2::} = \begin{pmatrix} 2 & 0 & 4 \\ 2 & -2 & -4 \end{pmatrix}, \quad \mathbf{Y}_{3::} = \begin{pmatrix} 3 & 1 & 5 \\ 2 & 1 & -5 \end{pmatrix}.$$
A few facts regarding n-mode matrix products are in order. For distinct modes in a series of multiplica-
tions, the order of the multiplication is irrelevant, i.e.,
X ×m A ×n B = X ×n B ×m A (m ̸= n).
If the modes are the same, then
X ×n A ×n B = X ×n (BA).
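Both identities are easy to check numerically, assuming TensorLy's mode_dot as the $n$-mode product (the shapes below are arbitrary choices of ours):

import numpy as np
import tensorly as tl
from tensorly.tenalg import mode_dot

X = tl.tensor(np.random.rand(2, 3, 4))
A = np.random.rand(5, 3)                  # acts on mode 2 (index 1)
B = np.random.rand(6, 4)                  # acts on mode 3 (index 2)

# distinct modes: the order of multiplication is irrelevant
lhs = mode_dot(mode_dot(X, A, mode=1), B, mode=2)
rhs = mode_dot(mode_dot(X, B, mode=2), A, mode=1)
assert np.allclose(lhs, rhs)

# same mode: X x_n A x_n B = X x_n (BA)
A2, B2 = np.random.rand(5, 3), np.random.rand(6, 5)
lhs = mode_dot(mode_dot(X, A2, mode=1), B2, mode=1)
rhs = mode_dot(X, B2 @ A2, mode=1)
assert np.allclose(lhs, rhs)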
The $n$-mode (vector) product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ with a vector $\mathbf{v} \in \mathbb{R}^{I_n}$ is denoted by $\mathcal{X} \,\bar{\times}_n\, \mathbf{v}$. The result is of order $N - 1$, i.e., the size is $I_1 \times \cdots \times I_{n-1} \times I_{n+1} \times \cdots \times I_N$. Elementwise,
$$(\mathcal{X} \,\bar{\times}_n\, \mathbf{v})_{i_1 \cdots i_{n-1}\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_N}\, v_{i_n}.$$
The idea is to compute the inner product of each mode-$n$ fiber with the vector $\mathbf{v}$.
For example, let $\mathcal{X}$ be as given above and define $\mathbf{v} = \begin{pmatrix} 1 & 2 \end{pmatrix}^\top$. Then
$$\mathcal{X} \,\bar{\times}_1\, \mathbf{v} = \begin{pmatrix} 5 & 2 & 8 \\ 3 & 3 & -8 \end{pmatrix}.$$
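The same result follows from TensorLy's mode_dot, which contracts a mode when given a vector; a sketch reproducing the example above:

import numpy as np
import tensorly as tl
from tensorly.tenalg import mode_dot

# the 2 x 2 x 3 tensor from the example, stored so that X[i] is the slice X_{i::}
X = tl.tensor(np.array([[[1, 0, 2], [1, -1, -2]],
                        [[2, 1, 3], [1, 2, -3]]]))
v = np.array([1, 2])

print(mode_dot(X, v, mode=0))
# [[ 5  2  8]
#  [ 3  3 -8]]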
When it comes to mode-$n$ vector multiplication, precedence matters because the order of the intermediate results changes. In other words,
$$\mathcal{X} \,\bar{\times}_m\, \mathbf{a} \,\bar{\times}_n\, \mathbf{b} = (\mathcal{X} \,\bar{\times}_m\, \mathbf{a}) \,\bar{\times}_{n-1}\, \mathbf{b} = (\mathcal{X} \,\bar{\times}_n\, \mathbf{b}) \,\bar{\times}_m\, \mathbf{a} \quad \text{for } m < n.$$
The Kronecker product of $\mathbf{A} \in \mathbb{R}^{I \times J}$ and $\mathbf{B} \in \mathbb{R}^{K \times L}$, denoted $\mathbf{A} \otimes \mathbf{B}$, is the $(IK) \times (JL)$ block matrix whose $(i,j)$ block is $a_{ij}\mathbf{B}$. The Khatri-Rao product of $\mathbf{A} \in \mathbb{R}^{I \times K}$ and $\mathbf{B} \in \mathbb{R}^{J \times K}$, denoted $\mathbf{A} \odot \mathbf{B}$, is their columnwise Kronecker product, of size $(IJ) \times K$:
$$\mathbf{A} \odot \mathbf{B} = \begin{pmatrix} \mathbf{a}_1 \otimes \mathbf{b}_1 & \mathbf{a}_2 \otimes \mathbf{b}_2 & \cdots & \mathbf{a}_K \otimes \mathbf{b}_K \end{pmatrix}.$$
Example: for $\mathbf{A} \in \mathbb{R}^{2 \times 2}$ and $\mathbf{B} \in \mathbb{R}^{3 \times 2}$, we have $\mathbf{A} \odot \mathbf{B} \in \mathbb{R}^{6 \times 2}$. With
$$\mathbf{A} = \begin{pmatrix} 2 & 1 \\ -1 & 0 \end{pmatrix}, \quad \mathbf{B} = \begin{pmatrix} 1 & -2 \\ 2 & 1 \\ 3 & 0 \end{pmatrix}, \quad \text{we get} \quad \mathbf{A} \odot \mathbf{B} = \begin{pmatrix} 2 & -2 \\ 4 & 1 \\ 6 & 0 \\ -1 & 0 \\ -2 & 0 \\ -3 & 0 \end{pmatrix}.$$
These matrix products have properties that will prove useful in our discussions ($\ast$ denotes the elementwise (Hadamard) product and $\dagger$ the Moore-Penrose pseudoinverse):
$$(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{A}\mathbf{C} \otimes \mathbf{B}\mathbf{D},$$
$$(\mathbf{A} \otimes \mathbf{B})^\dagger = \mathbf{A}^\dagger \otimes \mathbf{B}^\dagger,$$
$$\mathbf{A} \odot \mathbf{B} \odot \mathbf{C} = (\mathbf{A} \odot \mathbf{B}) \odot \mathbf{C} = \mathbf{A} \odot (\mathbf{B} \odot \mathbf{C}),$$
$$(\mathbf{A} \odot \mathbf{B})^\top (\mathbf{A} \odot \mathbf{B}) = \mathbf{A}^\top\mathbf{A} \ast \mathbf{B}^\top\mathbf{B},$$
$$(\mathbf{A} \odot \mathbf{B})^\dagger = \left( \mathbf{A}^\top\mathbf{A} \ast \mathbf{B}^\top\mathbf{B} \right)^\dagger (\mathbf{A} \odot \mathbf{B})^\top.$$
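The fourth property is easy to verify numerically with the A and B from the example above; a sketch using TensorLy's khatri_rao:

import numpy as np
from tensorly.tenalg import khatri_rao

A = np.array([[2.0, 1.0], [-1.0, 0.0]])
B = np.array([[1.0, -2.0], [2.0, 1.0], [3.0, 0.0]])

KR = khatri_rao([A, B])                                 # shape (6, 2)
assert np.allclose(KR.T @ KR, (A.T @ A) * (B.T @ B))    # '*' is the Hadamard product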
The Kronecker product is available as linalg.kron in SciPy:

[8]: from scipy import linalg
     m1 = np.array([[1, 2], [-1, 0]])
     m2 = np.array([[1, -1], [-2, 3]])
     linalg.kron(m1, m2)
[8]: array([[ 1, -1,  2, -2],
            [-2,  3, -4,  6],
            [-1,  1,  0,  0],
            [ 2, -3,  0,  0]])

The mode-1 matrix product from the worked example above can be computed with TensorLy's mode_dot:

[9]: from tensorly.tenalg import mode_dot
     X = tl.tensor(np.array([[[1, 0, 2], [1, -1, -2]],
                             [[2, 1, 3], [1, 2, -3]]]))
     U = np.array([[1, -1], [2, 0], [1, 1]])
     mode_dot(X, U, mode=0)
[9]: array([[[-1, -1, -1], [ 0, -3,  1]],
            [[ 2,  0,  4], [ 2, -2, -4]],
            [[ 3,  1,  5], [ 2,  1, -5]]])
2.6 Tensor Rank and the CP Decomposition
The CANDECOMP/PARAFAC (CP) decomposition factorizes a tensor into a sum of component rank-
one tensors. For example, given a third-order tensor X ∈ RI×J×K , we wish to write it as
$$\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r, \tag{2.1}$$
or, elementwise,
$$x_{ijk} \approx \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} \quad \text{for } i = 1, \ldots, I,\; j = 1, \ldots, J,\; k = 1, \ldots, K. \tag{2.2}$$
This is illustrated in Figure 2.1. When (2.1) holds with equality, the smallest such $R$ is the rank of the tensor $\mathcal{X}$.
The rank of a tensor $\mathcal{X}$, denoted $\operatorname{rank}(\mathcal{X})$, is defined as the smallest number of rank-one tensors (see section 1.6.5) that generate $\mathcal{X}$ as their sum. In other words, this is the smallest number of components in an exact CP decomposition, where "exact" means that there is equality in (2.1). An exact CP decomposition with $R = \operatorname{rank}(\mathcal{X})$ components is called the rank decomposition.
The factor matrices are the combinations of the vectors from the rank-one components, i.e., $\mathbf{A} = \begin{pmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_R \end{pmatrix}$ and likewise for $\mathbf{B}$ and $\mathbf{C}$. Using these definitions, (2.1) may be written in matricized form:
$$\mathbf{X}_{(1)} \approx \mathbf{A}(\mathbf{B} \odot \mathbf{C})^\top, \quad \mathbf{X}_{(2)} \approx \mathbf{B}(\mathbf{A} \odot \mathbf{C})^\top, \quad \mathbf{X}_{(3)} \approx \mathbf{C}(\mathbf{A} \odot \mathbf{B})^\top,$$
where $\odot$ denotes the Khatri-Rao product. The three-way model is sometimes written in terms of the frontal slices of $\mathcal{X}$:
$$\mathbf{X}_k \approx \mathbf{A}\,\mathbf{D}^{(k)}\,\mathbf{B}^\top \quad \text{where} \quad \mathbf{D}^{(k)} = \operatorname{diag}(\mathbf{c}_{k:}) \quad \text{for } k = 1, \ldots, K.$$
Analogous equations can be written for the horizontal and lateral slices. In general, though, slicewise
expressions do not easily extend beyond three dimensions. The CP model can be concisely expressed as
$$\mathcal{X} \approx [\![ \mathbf{A}, \mathbf{B}, \mathbf{C} ]\!] \equiv \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r.$$
It is often useful to assume that the columns of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are normalized to length one with the weights absorbed into the vector $\boldsymbol{\lambda} \in \mathbb{R}^R$, so that
$$\mathcal{X} \approx [\![ \boldsymbol{\lambda}; \mathbf{A}, \mathbf{B}, \mathbf{C} ]\!] \equiv \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r.$$
We have focused on the three-way case because it is widely applicable and sufficient for many needs. For
a general Nth-order tensor, X ∈ RI1 ×I2 ×···×IN , the CP decomposition is
$$\mathcal{X} \approx [\![ \boldsymbol{\lambda}; \mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)} ]\!] \equiv \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r^{(1)} \circ \mathbf{a}_r^{(2)} \circ \cdots \circ \mathbf{a}_r^{(N)},$$
where $\boldsymbol{\lambda} \in \mathbb{R}^R$ and $\mathbf{A}^{(n)} \in \mathbb{R}^{I_n \times R}$ for $n = 1, \ldots, N$. In this case, the mode-$n$ matricized version is given by
$$\mathbf{X}_{(n)} \approx \mathbf{A}^{(n)} \boldsymbol{\Lambda} \left( \mathbf{A}^{(1)} \odot \cdots \odot \mathbf{A}^{(n-1)} \odot \mathbf{A}^{(n+1)} \odot \cdots \odot \mathbf{A}^{(N)} \right)^\top,$$
where $\boldsymbol{\Lambda} = \operatorname{diag}(\boldsymbol{\lambda})$.
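In practice the CP factors are computed numerically, e.g., by alternating least squares. A minimal sketch using TensorLy's parafac (function names as in recent TensorLy versions; the rank-2 test tensor is our own construction):

import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# build a tensor with known CP rank 2
A = np.random.rand(4, 2); B = np.random.rand(5, 2); C = np.random.rand(6, 2)
X = tl.cp_to_tensor((np.ones(2), [A, B, C]))

weights, factors = parafac(X, rank=2, normalize_factors=True)
X_hat = tl.cp_to_tensor((weights, factors))
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # close to 0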
2.6.1 Uniqueness
An interesting property of higher-order tensors is that their rank decompositions are often unique,
whereas matrix decompositions are not.
Consider the fact that matrix decompositions are not unique. Let $\mathbf{X} \in \mathbb{R}^{I \times J}$ be a matrix of rank $R$. Then a rank decomposition of this matrix is
$$\mathbf{X} = \mathbf{A}\mathbf{B}^\top = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r.$$
If the SVD of X is UΣV⊤ , then we can choose A = UΣ and B = V. However, it is equally valid to
choose A = UΣW and B = VW, where W is some R × R orthogonal matrix. In other words, we can
easily construct two completely different sets of R rank-one matrices that sum to the original matrix.
The SVD of a matrix is unique (assuming all the singular values are distinct) only because of the addition
of orthogonality constraints (and the diagonal matrix of ordered singular values in the middle).
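This non-uniqueness is easy to demonstrate: rotating the SVD factors by any orthogonal W leaves the product unchanged. A small NumPy sketch (the test matrix is arbitrary):

import numpy as np

X = np.random.rand(5, 4)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
R = np.linalg.matrix_rank(X)
A, B = U[:, :R] * s[:R], Vt[:R, :].T        # A = U Sigma, B = V, so X = A B^T

W, _ = np.linalg.qr(np.random.rand(R, R))   # a random orthogonal matrix
assert np.allclose(A @ B.T, (A @ W) @ (B @ W).T)   # same X, different factors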
The CP decomposition, on the other hand, is unique under much weaker conditions. Let $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ be a three-way tensor of rank $R$, i.e.,
$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r = [\![ \mathbf{A}, \mathbf{B}, \mathbf{C} ]\!]. \tag{2.3}$$
Uniqueness means that this is the only possible combination of rank-one tensors that sums to $\mathcal{X}$, with the exception of the elementary indeterminacies of scaling and permutation. The permutation indeterminacy refers to the fact that the rank-one component tensors can be reordered arbitrarily, i.e.,
$$\mathcal{X} = [\![ \mathbf{A}\boldsymbol{\Pi}, \mathbf{B}\boldsymbol{\Pi}, \mathbf{C}\boldsymbol{\Pi} ]\!] \quad \text{for any } R \times R \text{ permutation matrix } \boldsymbol{\Pi}.$$
The scaling indeterminacy refers to the fact that we can scale the individual vectors, i.e.,
$$\mathcal{X} = \sum_{r=1}^{R} (\alpha_r \mathbf{a}_r) \circ (\beta_r \mathbf{b}_r) \circ (\gamma_r \mathbf{c}_r),$$
as long as $\alpha_r \beta_r \gamma_r = 1$ for $r = 1, \ldots, R$.
The $k$-rank of a matrix $\mathbf{A}$, denoted $k_{\mathbf{A}}$, is defined as the maximum value $k$ such that any $k$ columns are linearly independent. Kruskal's result [3, 4] says that a sufficient condition for uniqueness of the CP decomposition in (2.3) is
$$k_{\mathbf{A}} + k_{\mathbf{B}} + k_{\mathbf{C}} \geq 2R + 2.$$
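For small matrices the k-rank can be computed by brute force; a sketch (the helper k_rank is our own, and the final check uses random factors, for which the condition typically holds):

import numpy as np
from itertools import combinations

def k_rank(A, tol=1e-10):
    # largest k such that EVERY set of k columns of A is linearly independent
    n_cols = A.shape[1]
    for k in range(n_cols, 0, -1):
        if all(np.linalg.matrix_rank(A[:, list(cols)], tol=tol) == k
               for cols in combinations(range(n_cols), k)):
            return k
    return 0

R = 3
A, B, C = np.random.rand(4, R), np.random.rand(5, R), np.random.rand(6, R)
print(k_rank(A) + k_rank(B) + k_rank(C) >= 2 * R + 2)   # True for generic factors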
Sidiropoulos and Bro [5] extended Kruskal's result to $N$-way tensors. Let $\mathcal{X}$ be an $N$-way tensor with rank $R$ and suppose that its CP decomposition is
$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r^{(1)} \circ \mathbf{a}_r^{(2)} \circ \cdots \circ \mathbf{a}_r^{(N)} = [\![ \mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)} ]\!].$$
Then a sufficient condition for uniqueness is
$$\sum_{n=1}^{N} k_{\mathbf{A}^{(n)}} \geq 2R + (N - 1).$$
The previous results provide only sufficient conditions. Ten Berge and Sidiropoulos showed that the sufficient condition is also necessary for tensors of rank $R = 2$ and $R = 3$, but not for $R > 3$. Liu and Sidiropoulos [6] considered more general necessary conditions. They showed that a necessary condition for uniqueness of the CP decomposition in (2.3) is
$$\min\left\{ \operatorname{rank}(\mathbf{A} \odot \mathbf{B}),\; \operatorname{rank}(\mathbf{A} \odot \mathbf{C}),\; \operatorname{rank}(\mathbf{B} \odot \mathbf{C}) \right\} = R.$$
More generally, they showed that for the $N$-way case, a necessary condition for uniqueness of the CP decomposition is
$$\min_{n=1,\ldots,N} \operatorname{rank}\left( \mathbf{A}^{(1)} \odot \cdots \odot \mathbf{A}^{(n-1)} \odot \mathbf{A}^{(n+1)} \odot \cdots \odot \mathbf{A}^{(N)} \right) = R.$$
They further observed that since $\operatorname{rank}(\mathbf{A} \odot \mathbf{B}) \leq \operatorname{rank}(\mathbf{A} \otimes \mathbf{B}) \leq \operatorname{rank}(\mathbf{A}) \cdot \operatorname{rank}(\mathbf{B})$, an even simpler necessary condition is
$$\min_{n=1,\ldots,N} \;\prod_{\substack{m=1 \\ m \neq n}}^{N} \operatorname{rank}\left(\mathbf{A}^{(m)}\right) \geq R.$$
For matrices, the best low-rank approximation is obtained by truncating the SVD. Let $\mathbf{A}$ be a matrix of rank $R$ with singular value decomposition
$$\mathbf{A} = \sum_{r=1}^{R} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r \quad \text{with} \quad \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_R > 0.$$
Then the best rank-$k$ approximation of $\mathbf{A}$ is
$$\mathbf{B} = \sum_{r=1}^{k} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r.$$
This type of result does not hold true for higher-order tensors. For instance, consider a third-order tensor
of rank R with the following CP decomposition:
$$\mathcal{X} = \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r.$$
Ideally, summing k of the factors would yield a best rank-k approximation, but that is not the case. It
is possible that the best rank-k approximation may not even exist. The problem is referred to as one of
degeneracy. A tensor is degenerate if it may be approximated arbitrarily well by a factorization of lower
rank. Let X ∈ RI×J×K be a rank-three tensor defined by
$$\mathcal{X} = \mathbf{a}_1 \circ \mathbf{b}_1 \circ \mathbf{c}_2 + \mathbf{a}_1 \circ \mathbf{b}_2 \circ \mathbf{c}_1 + \mathbf{a}_2 \circ \mathbf{b}_1 \circ \mathbf{c}_1,$$
where $\mathbf{A} \in \mathbb{R}^{I \times 2}$, $\mathbf{B} \in \mathbb{R}^{J \times 2}$, and $\mathbf{C} \in \mathbb{R}^{K \times 2}$, and each has linearly independent columns. This tensor can be approximated arbitrarily closely by a rank-two tensor of the following form:
$$\mathcal{Y} = \alpha \left( \mathbf{a}_1 + \frac{1}{\alpha}\mathbf{a}_2 \right) \circ \left( \mathbf{b}_1 + \frac{1}{\alpha}\mathbf{b}_2 \right) \circ \left( \mathbf{c}_1 + \frac{1}{\alpha}\mathbf{c}_2 \right) - \alpha\, \mathbf{a}_1 \circ \mathbf{b}_1 \circ \mathbf{c}_1.$$
Specifically,
$$\|\mathcal{X} - \mathcal{Y}\| = \left\| \frac{1}{\alpha} \left( \mathbf{a}_2 \circ \mathbf{b}_2 \circ \mathbf{c}_1 + \mathbf{a}_2 \circ \mathbf{b}_1 \circ \mathbf{c}_2 + \mathbf{a}_1 \circ \mathbf{b}_2 \circ \mathbf{c}_2 \right) + \frac{1}{\alpha^2}\, \mathbf{a}_2 \circ \mathbf{b}_2 \circ \mathbf{c}_2 \right\|,$$
which can be made arbitrarily small as $\alpha \to \infty$. As this example illustrates, it is typical of degeneracy that factors become nearly proportional and that the magnitudes of some terms in the decomposition go to infinity but with opposite signs.
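The shrinking error is easy to observe numerically; a sketch with random factors (the vectors and sizes are arbitrary choices of ours):

import numpy as np
from functools import reduce

rank1 = lambda a, b, c: reduce(np.multiply.outer, [a, b, c])

rng = np.random.default_rng(0)
(a1, a2), (b1, b2), (c1, c2) = (rng.standard_normal((2, n)) for n in (4, 5, 6))

X = rank1(a1, b1, c2) + rank1(a1, b2, c1) + rank1(a2, b1, c1)    # rank three

for alpha in (10.0, 100.0, 1000.0):
    Y = alpha * rank1(a1 + a2 / alpha, b1 + b2 / alpha, c1 + c2 / alpha) \
        - alpha * rank1(a1, b1, c1)                              # rank two
    print(alpha, np.linalg.norm(X - Y))   # error decays like 1/alpha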
Kruskal, Harshman, and Lundy also discuss degeneracy and illustrate the idea of a series of lower-rank tensors converging to one of higher rank. Figure 2.2 shows the problem of estimating a rank-three tensor $\mathcal{Y}$ by a rank-two tensor. Here, a sequence $\{\mathcal{X}_k\}$ of rank-two tensors provides increasingly better estimates of $\mathcal{Y}$. Necessarily, the best approximation is on the boundary of the space of rank-two and rank-three tensors. However, since the space of rank-two tensors is not closed, the sequence may converge to a tensor $\mathcal{X}^*$ of rank other than two. The previous example shows a sequence of rank-two tensors converging to rank three.
As noted above, for a matrix $\mathbf{A}$ with singular values $\sigma_1 \geq \sigma_2 \geq \cdots$, the rank-$K$ approximation which minimizes $\|\mathbf{A} - \mathbf{B}\|$ is given by
$$\mathbf{B} = \sum_{r=1}^{K} \sigma_r\, \mathbf{u}_r \circ \mathbf{v}_r.$$
We are given
$$\mathbf{X}_{n \times m} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix},$$
and we approximate it by
$$\mathbf{A} = \sum_{i=1}^{R} \mathbf{a}_i\, \mathbf{b}_i,$$
where
$$\mathbf{a}_i = \begin{pmatrix} a_{i1} \\ a_{i2} \\ \vdots \\ a_{in} \end{pmatrix}_{n \times 1} \quad \text{and} \quad \mathbf{b}_i = \begin{pmatrix} b_{i1} & \cdots & b_{im} \end{pmatrix}_{1 \times m}.$$
Here,
$$(\mathbf{X} - \mathbf{A})_{ij} = x_{ij} - \sum_{k=1}^{R} a_{ki}\, b_{kj},$$
$$D = \|\mathbf{X} - \mathbf{A}\|^2 = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( x_{ij} - \sum_{k=1}^{R} a_{ki}\, b_{kj} \right)^2.$$
$$\frac{\partial D}{\partial a_{11}} = 2 \sum_{j=1}^{m} \left( x_{1j} - \sum_{k=1}^{R} a_{k1} b_{kj} \right) (-b_{1j}),$$
$$\frac{\partial D}{\partial a_{12}} = 2 \sum_{j=1}^{m} \left( x_{2j} - \sum_{k=1}^{R} a_{k2} b_{kj} \right) (-b_{1j}),$$
$$\vdots$$
$$\frac{\partial D}{\partial a_{21}} = 2 \sum_{j=1}^{m} \left( x_{1j} - \sum_{k=1}^{R} a_{k1} b_{kj} \right) (-b_{2j}).$$
Similarly,
$$\frac{\partial D}{\partial b_{21}} = 2 \sum_{i=1}^{n} \left( x_{i1} - \sum_{k=1}^{R} a_{ki} b_{k1} \right) (-a_{2i}).$$
So we can find the elements of the factors iteratively: start from an initial guess and apply gradient descent, updating each element as
$$a_{ij}^{(t+1)} = a_{ij}^{(t)} - \eta\, \frac{\partial D}{\partial a_{ij}},$$
where $\eta > 0$ is the step size (and likewise for the $b_{ij}$).
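A minimal NumPy sketch of this gradient descent, written in matrix form: the update A <- A + 2*lr*(X - A B^T)B collects all the elementwise derivatives above into one step, and lr is the step size η (our addition, since gradient descent needs one):

import numpy as np

def low_rank_gd(X, R, lr=0.01, iters=5000, seed=0):
    # rank-R approximation X ~ A @ B.T by gradient descent on D = ||X - A B^T||^2
    rng = np.random.default_rng(seed)
    n, m = X.shape
    A = 0.1 * rng.standard_normal((n, R))   # columns play the role of the a_i
    B = 0.1 * rng.standard_normal((m, R))   # columns play the role of the b_i
    for _ in range(iters):
        E = X - A @ B.T                     # residual matrix
        # simultaneous update; both right-hand sides use the old A and B
        A, B = A + 2 * lr * E @ B, B + 2 * lr * E.T @ A
    return A, B

X = np.random.rand(6, 5)
A, B = low_rank_gd(X, R=2)
print(np.linalg.norm(X - A @ B.T))          # approximation error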
Future Work
References
[1] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Rev., 51 (2009), pp. 455–500.
[2] S. Boyd and L. Vandenberghe, Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares, Cambridge University Press, 2018, pp. 225–239.
[3] J. B. Kruskal, Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics, Linear Algebra Appl., 18 (1977), pp. 95–138.
[4] J. B. Kruskal, Rank, decomposition, and uniqueness for 3-way and N-way arrays, in Multiway Data Analysis, R. Coppi and S. Bolasco, eds., North-Holland, Amsterdam, 1989, pp. 7–18.
[5] N. D. Sidiropoulos and R. Bro, On the uniqueness of multilinear decomposition of N-way arrays, J. Chemometrics, 14 (2000), pp. 229–239.
[6] X. Liu and N. D. Sidiropoulos, Cramér-Rao lower bounds for low-rank decomposition of multidimensional arrays, IEEE Trans. Signal Process., 49 (2001), pp. 2074–2086.