Algorithms 2016, 9, 68
Abstract: This paper overviews the key applications enabled by matrix theory in two major fields of
interest in electrical engineering, namely wireless communications and signal processing. The paper
focuses on the fundamental role played by matrices in modeling and optimization of wireless
communication systems, and in detection, extraction and processing of the information embedded in
signals. Among the major applications in wireless communications, the role of matrix representations
and decompositions in characterizing multiple-input multiple-output (MIMO) and orthogonal
frequency division multiplexing (OFDM) communication systems is described. In addition, this
paper points out the important contribution made by matrices in solving signal estimation and
detection problems. Special attention is given to the use of matrices in sensor array signal
processing and the design of adaptive filters. Furthermore, the crucial role played by matrices in
representing and processing digital images is depicted by several illustrative applications. This paper
concludes with some applications of matrix theory in the area of compressive sensing of signals and
by outlining a few open research problems for future study.
Keywords: matrix; signal processing; wireless communications; image processing; compressive sensing
1. Introduction
Matrix theory is widely used in many disciplines of modern engineering, including wireless
communications, signal processing, control, and biomedical engineering. From a mathematical
perspective, matrices serve both as a representation and as an information processing tool
for solving practical engineering problems efficiently and reliably. Since matrices are prevalent in
engineering and have enabled remarkable solutions to problems of paramount importance, this paper
overviews some of the major applications of matrix theory in wireless communications and signal
processing. Our hope is that this article will help renew the interest of the engineering community in
linear algebra, and in particular in the tools offered by matrix theory and its applications in engineering.
Rather than describing details for all the statements and developments, our goal is instead to provide
the reader with the basic facts and key features of how matrix techniques have been applied in some major
engineering applications that played a fundamental role during the last two decades in the fields of
signal processing and wireless communications.
The outline of this paper is summarized as follows. Section 2 describes how matrix techniques are
exploited in modern communication systems. Specifically, we discuss the key role played by the
concept of singular value decomposition (SVD) in modeling and characterization of the multiple-input
multiple-output (MIMO) or multiple antennas based communication systems, and the important role
played by unitary matrix transformations such as Discrete Fourier Transform (DFT) in the orthogonal
frequency-division multiplexing (OFDM) signaling scheme, which currently has been adopted in many
wireless communications standards. Section 3 focuses on several illustrative uses of matrices in signal
estimation and detection applications. In particular, the usefulness of matrix decompositions and
matrix inversion lemma in enabling online (recursive) estimation of parameters and development of
adaptive filters is illustrated. Special attention is also given to the applications of matrices in sensor
array signal processing. A more modern estimation method based on large dimensional random
matrix theory is also briefly discussed to exemplify the recent trends in the utilization of matrices in
signal processing and wireless communications. Section 4 describes the close relationship between
matrices and images. Two image compression methods, which rely on matrix transformation and
separation operations, are described. In Section 5, a recently proposed data compression approach
termed as compressive (compressed) sensing is briefly examined from the perspective of the role
played by matrices. Finally, Section 6 concludes the paper by outlining the key role played by matrices
in solving some important problems from the fields of signal processing and wireless communications.
Several open research problems that require more advanced linear algebra concepts and results are
pointed out as well.
Throughout this paper, we will denote scalars by lowercase italic letters (a), vectors by bold
lowercase italic letters (a), and matrices by bold uppercase Roman letters (A).
$$
\begin{bmatrix} y_1 \\ \vdots \\ y_{N_r} \end{bmatrix}
=
\begin{bmatrix} h_{11} & \cdots & h_{1N_t} \\ \vdots & \ddots & \vdots \\ h_{N_r 1} & \cdots & h_{N_r N_t} \end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_{N_t} \end{bmatrix}
+
\begin{bmatrix} \nu_1 \\ \vdots \\ \nu_{N_r} \end{bmatrix}, \qquad (1)
$$

or, componentwise,

$$
y_i = \sum_{j=1}^{N_t} h_{ij} x_j + \nu_i, \qquad i = 1, 2, \cdots, N_r.
$$
Figure 1. MIMO channel model with $N_t$ transmit antennas and $N_r$ receive antennas; the path from transmit antenna $j$ to receive antenna $i$ has gain $h_{ij}$.
In other words, the output symbol $y_i$ is a linear combination of the input symbols, which makes it
difficult to recover the transmitted data symbols. This unfavorable phenomenon is referred to as
intersymbol interference (ISI) and results in signal aliasing. However, it is shown next that the MIMO
channel can be decomposed into independent signal paths with the help of the SVD. This feature enables
the design of efficient decoding schemes while achieving the spatial multiplexing gain to improve the
channel capacity. The SVD representation of the channel matrix H takes the form:
H = UΣV H , (2)
Figure 2. SVD-based transmit precoding and receive shaping: the transmitter sends $x = V\tilde{x}$, the channel produces $y = Hx + \nu$, and the receiver computes $\tilde{y} = U^H y$.
In the new coordinate system, the relationship between the input and output of channel takes
the form:
$$
\tilde{y} = U^H(Hx + \nu) = U^H(HV\tilde{x} + \nu) = U^H(U\Sigma V^H V\tilde{x} + \nu) = \Sigma\tilde{x} + \tilde{\nu}. \qquad (3)
$$
Assume Σ = diag(σ1 , σ2 , · · · , σr , 0, · · · , 0). Thus, the MIMO channel can be decomposed into
r independent SISO channels. Specifically, the ith SISO channel has a gain σi and a noise factor ν̃i ,
and assumes the representation
ỹi = σi x̃i + ν̃i .
This implies that all the efficient signaling and decoding strategies well investigated in the SISO
context can be transferred, mutatis mutandis, to the MIMO framework with minimal effort, thanks
to the SVD.
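The decomposition into parallel channels can be checked numerically. The following sketch (using NumPy, with an illustrative random channel) verifies that precoding with $V$ and shaping with $U^H$ reduce the MIMO channel to independent scalar gains:

```python
import numpy as np

rng = np.random.default_rng(0)
Nr, Nt = 4, 3

# Hypothetical Rayleigh-fading channel matrix (illustrative values).
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

# SVD: H = U @ Sigma @ Vh, with U and Vh unitary.
U, s, Vh = np.linalg.svd(H)

# Precode at the transmitter (x = V x_tilde) and shape at the receiver (y_tilde = U^H y).
x_tilde = rng.standard_normal(Nt) + 1j * rng.standard_normal(Nt)
x = Vh.conj().T @ x_tilde
y = H @ x                      # noiseless channel, for clarity
y_tilde = U.conj().T @ y

# Each output coordinate is just sigma_i * x_tilde_i: independent SISO channels.
Sigma = np.zeros((Nr, Nt))
Sigma[:Nt, :Nt] = np.diag(s)
assert np.allclose(y_tilde, Sigma @ x_tilde)
```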
Moreover, by assuming that the channel state information, or equivalently, the channel matrix H
is known at the receiver but not at the transmitter, Telatar [3] showed that the MIMO channel capacity
can be expressed as

$$
C = \sum_{i=1}^{r} \log_2\left(1 + \frac{\sigma_i^2 \rho}{N_t}\right), \qquad (4)
$$
where ρ is the transmitted signal power and the additive receiver noise is assumed to have unit
power. This result is in perfect accordance with (3) since the MIMO capacity in (4) can be interpreted
as the summation of independent SISO channel capacities. Under the assumption that the entries
of H follow a zero-mean spatially white (ZMSW) model, the law of large numbers implies that
$\lim_{N_t \to \infty} \frac{1}{N_t} HH^H = I_{N_r}$, and the capacity can be expressed asymptotically [4] in the presence of a large
number of transmit/receive antennas as

$$
C = N_r \log_2(1 + \rho).
$$
This implies that as N = min ( Nr , Nt ) grows large, the channel capacity increases linearly
with N [5]. This fact, as well as other results on MIMO capacity analysis [6,7], makes MIMO
signaling superior to SISO signaling and explains why MIMO techniques are currently so appealing
in a broad range of applications such as wireless communications, radar, sonar,
geophysical (seismic) signal processing, wireless sensor networks, and antenna array processing.
The proposed framework for determining the capacity of MIMO linear time-invariant channels could
be further extended to determine the capacity of memoryless nonlinear MIMO channels subject to
a power constraint [8]. Reference [8] re-formulates this problem within a functional framework and
employs variational calculus techniques to determine the capacity-achieving input distribution of
memoryless nonlinear MIMO channels. As a special case, the capacity-achieving input distribution for
the linear MIMO channel is re-established in [8].
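The capacity formula (4) can be cross-checked against the equivalent log-determinant form $C = \log_2 \det(I + \frac{\rho}{N_t}HH^H)$, a standard identity since the nonzero eigenvalues of $HH^H$ are the $\sigma_i^2$. A small numerical sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
Nr, Nt, rho = 4, 4, 10.0   # rho: transmit SNR (unit-power noise), illustrative

H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)
s = np.linalg.svd(H, compute_uv=False)

# Capacity as a sum of independent SISO capacities, Equation (4).
C_svd = np.sum(np.log2(1.0 + s**2 * rho / Nt))

# Equivalent log-det form: C = log2 det(I + (rho/Nt) H H^H).
C_det = np.log2(np.real(np.linalg.det(np.eye(Nr) + (rho / Nt) * H @ H.conj().T)))

assert np.isclose(C_svd, C_det)
```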
Figure 3. OFDM system block diagram: the frequency-domain symbols X[0], ..., X[N−1] pass through an IFFT and D/A conversion, the channel (modeled as a lowpass filter), and A/D conversion; the receiver applies an FFT to obtain Y[0], ..., Y[N−1]. All operations between the channel input vector x and output vector y are lumped into a transfer matrix H.
To simplify the exposition, we have represented all the operations between channel input
vector x and output vector y as a system with transfer matrix H, such that y = Hx + ν, where ν
stands for the additive noise vector. OFDM implements an inverse discrete Fourier transform
(IDFT) at the transmitter side and a discrete Fourier transform (DFT) at the receiver side. However,
in practice, the fast Fourier transform (FFT) algorithm is widely used to compute DFT due to its fast
implementation and computational efficiency. This is the reason why the processing blocks are marked
in Figure 3 with FFT and IFFT instead of DFT and IDFT.
Given $x[n]$, $n = 0, 1, \cdots, N-1$, the N-point DFT of $x[n]$ is defined as

$$
X[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x[n]\, e^{-j\frac{2\pi nk}{N}}, \qquad k = 0, 1, \cdots, N-1,
$$

and the inverse DFT (IDFT) as

$$
x[n] = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X[k]\, e^{j\frac{2\pi nk}{N}}, \qquad n = 0, 1, \cdots, N-1.
$$
It is easy to show that the DFT operation can be represented as a matrix multiplication
X = Wx,
where X = [ X [0], X [1], · · · , X [ N − 1]] T , x = [ x [0], x [1], · · · , x [ N − 1]] T , and the DFT matrix W is
a unitary matrix expressed as

$$
W = \frac{1}{\sqrt{N}}
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & w & w^2 & \cdots & w^{N-1} \\
1 & w^2 & w^4 & \cdots & w^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & w^{N-1} & w^{2(N-1)} & \cdots & w^{(N-1)^2}
\end{bmatrix} \qquad (5)
$$
with $w = e^{-j\frac{2\pi}{N}}$. Additionally, the IDFT operation can be expressed as $x = W^{-1}X = W^H X$.
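A quick numerical check of the DFT matrix in (5) is given below (a sketch; NumPy's `norm="ortho"` option matches the $1/\sqrt{N}$ normalization used here):

```python
import numpy as np

N = 8
n = np.arange(N)
w = np.exp(-2j * np.pi / N)

# Unitary DFT matrix of Equation (5): W[k, n] = w**(n*k) / sqrt(N).
W = w ** np.outer(n, n) / np.sqrt(N)

# W is unitary, so the IDFT is simply W^H.
assert np.allclose(W @ W.conj().T, np.eye(N), atol=1e-12)

# It agrees with NumPy's FFT under the orthonormal normalization.
x = np.random.default_rng(2).standard_normal(N)
assert np.allclose(W @ x, np.fft.fft(x, norm="ortho"))
```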
This yields the output y[n] = x̃ [n] ∗ h[n] + ν[n]. Introducing the simplified notations x̃ [n] as x̃n ,
h[n] as hn , etc., it turns out that the relation between the input and output of the channel can be
expressed as
$$
\begin{bmatrix} y_{N-1} \\ y_{N-2} \\ \vdots \\ y_0 \end{bmatrix}
=
\begin{bmatrix}
h_0 & h_1 & \cdots & h_M & 0 & \cdots & 0 \\
0 & h_0 & \cdots & h_{M-1} & h_M & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & h_0 & h_1 & \cdots & h_M
\end{bmatrix}
\begin{bmatrix} \tilde{x}_{N-1} \\ \vdots \\ \tilde{x}_0 \\ \tilde{x}_{-1} \\ \vdots \\ \tilde{x}_{-M} \end{bmatrix}
+
\begin{bmatrix} \nu_{N-1} \\ \nu_{N-2} \\ \vdots \\ \nu_0 \end{bmatrix}. \qquad (7)
$$
It turns out that H is a circulant matrix with the generator vector [h0 , 0, · · · , 0, h M , · · · , h1 ] T .
According to the properties of circulant matrices [9], H admits the eigenvalue decomposition
H = UΛU H ,
where Λ is a diagonal matrix whose diagonal elements stand for the eigenvalues of H and U is a unitary
matrix whose rows correspond to the columns of DFT matrix W in (5). In other words, U = W H .
Thus, one can infer the following relations:
$$
\begin{aligned}
Y &= Wy = W(Hx + \nu) = W(HW^H X + \nu) \\
  &= WU\Lambda U^H W^H X + W\nu = \Lambda X + \tilde{\nu}. \qquad (9)
\end{aligned}
$$
It turns out that by exploiting DFT and IDFT operations and inserting the cyclic prefix,
OFDM transforms a frequency-selective channel into N orthogonal flat-fading subchannels without
ISI, which helps to reduce the complexity of the receiver decoder. Due to its advantages in combining
the energy of all the propagation paths of the channel and enabling a reduced complexity equalization,
OFDM has been adopted in numerous communication standards (ADSL, WLAN, LTE, etc.) and
has also been used in conjunction with the MIMO technique in what is currently referred to as
MIMO-OFDM technology [10].
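The diagonalization of the circulant channel matrix by the DFT, which underlies (9), can be verified numerically. The sketch below builds a circulant matrix from illustrative channel taps (assuming the generator vector is its first column) and checks that $WHW^H$ is diagonal with the DFT of that column on its diagonal:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 8, 2
h = rng.standard_normal(M + 1)   # channel taps h_0, ..., h_M (illustrative)

# Circulant channel matrix with first column [h_0, 0, ..., 0, h_M, ..., h_1]^T,
# as produced by cyclic-prefix insertion.
c = np.zeros(N)
c[0] = h[0]
c[N - M:] = h[:0:-1]             # places h_M, ..., h_1 in the last M entries
H = np.column_stack([np.roll(c, j) for j in range(N)])

# Unitary DFT matrix; the DFT diagonalizes every circulant matrix.
k = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)
D = W @ H @ W.conj().T

# Off-diagonal entries vanish; the eigenvalues are the DFT of the first column.
assert np.allclose(D, np.diag(np.fft.fft(c)), atol=1e-10)
```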
The Cholesky decomposition factors a matrix as $\Sigma = DD^T$, where D is a lower triangular matrix with
nonnegative diagonal entries. Every Hermitian positive-definite matrix has a unique Cholesky decomposition.
Cholesky decomposition is mainly used in whitening the noise for signal estimation and detection
applications. Consider a simple vector parameter estimation problem expressed as [12]:
x = θ + ν, (10)
where x ∈ R N stands for the observed data, ν ∈ R N denotes the additive noise with Gaussian
distribution N (0, Σ) (Σ is not diagonal) and θ represents the unknown vector to be estimated.
Knowledge of the probability density function (PDF) p(x; θ) lies at the basis for several very
well-known and used estimators such as the minimum variance unbiased estimator (MVUE) and the
maximum likelihood estimator (MLE) [12]. In the present application, p(x; θ) stands for a multivariate
Gaussian PDF of the form:
$$
p(x; \theta) = (2\pi)^{-\frac{N}{2}} |\Sigma|^{-\frac{1}{2}} e^{-\frac{1}{2}(x-\theta)^T \Sigma^{-1}(x-\theta)}, \qquad (11)
$$
Transforming the data as $\tilde{x} = D^{-1}x$ (so that $\tilde{\theta} = D^{-1}\theta$ and $\tilde{\nu} = D^{-1}\nu$) yields noise with covariance

$$
E(\tilde{\nu}\tilde{\nu}^T) = E(D^{-1}\nu\nu^T D^{-T}) = D^{-1}E(\nu\nu^T)D^{-T} = D^{-1}\Sigma D^{-T} = D^{-1}DD^T D^{-T} = I.
$$
This transformation whitens the noise such that the new PDF takes the form:
$$
\begin{aligned}
p(\tilde{x}; \tilde{\theta}) &= (2\pi)^{-\frac{N}{2}} e^{-\frac{1}{2}(\tilde{x}-\tilde{\theta})^T(\tilde{x}-\tilde{\theta})} \\
&= (2\pi)^{-\frac{N}{2}} e^{-\frac{1}{2}\sum_{i=0}^{N-1}(\tilde{x}_i-\tilde{\theta}_i)^2} \qquad (13) \\
&= \prod_{i=0}^{N-1} p(\tilde{x}_i; \tilde{\theta}_i).
\end{aligned}
$$
Using representation (13), where the noise is white (no longer correlated), the estimation
task can be performed by processing each $\tilde{\theta}_i$ independently, which in many applications
leads to a closed-form solution. The estimate of $\theta$ can then be obtained as $\theta = D\tilde{\theta}$.
In contrast, in (11) we must process the entire vector x due to the dependence caused by the colored
noise. Thus, the whitening method significantly simplifies the estimation problem when the noise
is colored.
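The whitening step can be illustrated numerically. The following sketch (with a hypothetical covariance matrix $\Sigma$) checks empirically that $\tilde{\nu} = D^{-1}\nu$ has identity covariance:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 3

# A hypothetical non-diagonal noise covariance (symmetric positive definite).
Sigma = np.array([[2.0, 0.5, 0.2],
                  [0.5, 1.5, 0.3],
                  [0.2, 0.3, 1.0]])
D = np.linalg.cholesky(Sigma)        # Sigma = D @ D.T, D lower triangular

# Empirically check that nu_tilde = D^{-1} nu has identity covariance.
nu = rng.multivariate_normal(np.zeros(N), Sigma, size=200_000)
nu_tilde = np.linalg.solve(D, nu.T)  # applies D^{-1} to each noise sample
cov = nu_tilde @ nu_tilde.T / nu.shape[0]
assert np.allclose(cov, np.eye(N), atol=0.02)
```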
x = Aθ + ν,
where x ∈ R N is the observed data, ν ∈ R N stands for the additive noise, A ∈ R N × P is the given linear
interaction matrix, and θ ∈ RP is the unknown vector to be estimated.
The LS estimator is defined as

$$
\hat{\theta}_{LS} = \arg\min_{\theta} ||A\theta - x||_2^2, \qquad (14)
$$

where $||\cdot||_2$ stands for the Euclidean norm. If A has full column rank, i.e., rank(A) = P and N ≥ P,
the problem admits the closed-form solution

$$
\hat{\theta}_{LS} = (A^H A)^{-1} A^H x. \qquad (15)
$$
However, in the case of N < P, matrix A H A is rank deficient and this solution is invalid. In this
case, an infinite number of minimizers exist, and the set of these solutions is denoted by Θ. A unique
solution for the LS problem can be enforced by selecting the minimum l2 (Euclidean) norm solution
from the set Θ. Thus, one should add this constraint to (14):

$$
\hat{\theta} = \arg\min_{\theta \in \Theta} ||\theta||_2. \qquad (16)
$$
The new optimization problem defined by (14) and (16) can be solved by means of SVD.
In particular, due to the SVD of A, i.e., A = UΣV H , and the fact that multiplying a vector with
a unitary matrix does not change the Euclidean norm of the vector, it turns out that
$$
||A\theta - x||_2^2 = ||U^H(U\Sigma V^H\theta - x)||_2^2 = ||\Sigma V^H\theta - U^H x||_2^2 = ||\Sigma y - b||_2^2, \qquad (17)
$$

where $y = V^H\theta$, $b = U^H x$, and

$$
\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix}.
$$
Moreover, b and y can be partitioned conformably as $b = [b_1^T\ b_2^T]^T$ and $y = [y_1^T\ y_2^T]^T$, respectively.
Hence, (17) reduces to

$$
||A\theta - x||_2^2 = ||\Sigma_1 y_1 - b_1||_2^2 + ||b_2||_2^2. \qquad (18)
$$
Therefore, the minimum-norm LS-solution that minimizes ||Aθ − x||22 is given by y1 = Σ1−1 b1
and $y_2 = 0$. Alternatively,

$$
y = \begin{bmatrix} \Sigma_1^{-1} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \Sigma^{\dagger} b,
$$

where the block matrix is denoted by $\Sigma^{\dagger}$. Finally, the LS estimate of $\theta$ is obtained as $\hat{\theta} = Vy = V\Sigma^{\dagger}U^H x$.
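The minimum-norm LS solution via the SVD can be sketched as follows (illustrative dimensions; NumPy's `pinv` implements the same $\Sigma^{\dagger}$ construction):

```python
import numpy as np

rng = np.random.default_rng(5)
N, P = 3, 5                      # underdetermined: fewer equations than unknowns
A = rng.standard_normal((N, P))
x = rng.standard_normal(N)

# Minimum-norm LS solution via the SVD pseudo-inverse construction.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
theta = Vh.conj().T @ ((U.conj().T @ x) / s)

# It matches NumPy's pseudo-inverse, and here attains zero residual
# (a random Gaussian A has full row rank almost surely).
assert np.allclose(theta, np.linalg.pinv(A) @ x)
assert np.allclose(A @ theta, x)
```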
3.3. Matrix Inverse Lemma for Derivation of the Recursive Least-Squares Filter
In many signal processing applications, the data is generated by sampling a continuous-time
signal. In this way, more data becomes available as the time progresses. One can either wait for all the
available data or process the data iteratively in time. The latter approach requires a recursive (online)
approach to update the filter taps. Next, the key role played by the matrix inverse lemma (Woodbury
matrix identity) in deriving the recursive least-squares (RLS) filter will be presented.
The matrix inverse lemma states that [12]:
$$
(A + xy^H)^{-1} = A^{-1} - \frac{A^{-1}xy^H A^{-1}}{1 + y^H A^{-1}x}, \qquad (19)
$$
where an = [ a(n), a(n − 1), · · · , a(n − m + 1)] H is the vector containing the m most recent samples of
input signal a(n). Our goal is to estimate the taps of the filter h at each time slot n using a least-squares
estimation approach. Specifically, the problem reduces to:

$$
\hat{h}_n = \arg\min_{h} ||A_n h - x_n||_2^2, \qquad (21)
$$
where

$$
A_n = \begin{bmatrix}
a(1) & a(0) & \cdots & a(-m+2) \\
a(2) & a(1) & \cdots & a(-m+1) \\
\vdots & \vdots & \ddots & \vdots \\
a(n-1) & a(n-2) & \cdots & a(n-m) \\
a(n) & a(n-1) & \cdots & a(n-m+1)
\end{bmatrix}
= \begin{bmatrix} a_1^H \\ a_2^H \\ \vdots \\ a_{n-1}^H \\ a_n^H \end{bmatrix}
= \begin{bmatrix} A_{n-1} \\ a_n^H \end{bmatrix},
$$
and

$$
h_n = \begin{bmatrix} h_n(0) \\ h_n(1) \\ \vdots \\ h_n(m-2) \\ h_n(m-1) \end{bmatrix}
\quad \text{and} \quad
x_n = \begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(n-1) \\ x(n) \end{bmatrix}
= \begin{bmatrix} x_{n-1} \\ x(n) \end{bmatrix}.
$$
As time evolves, the number of rows of $A_n$ keeps increasing. To avoid recomputing the
least-squares estimate $\hat{h}_n$ from scratch at each time instant n, expressing $\hat{h}_n$ in
terms of $\hat{h}_{n-1}$ substantially reduces the computational burden.
Figure 4. Adaptive filter structure: the input a(n) drives an LTI filter h, whose output y(n) is subtracted from the desired signal x(n) to form the error e(n) that drives the update algorithm.
As described earlier in (14) and (15), problem (21) admits the solution

$$
\hat{h}_n = R_n^{-1} p_n, \qquad (22)
$$

where $R_n = A_n^H A_n$ and $p_n = A_n^H x_n$.
Moreover, since

$$
R_n = A_n^H A_n = \sum_{i=1}^{n} a_i a_i^H = \sum_{i=1}^{n-1} a_i a_i^H + a_n a_n^H = R_{n-1} + a_n a_n^H, \qquad (23)
$$
applying the matrix inverse lemma (19) to the inverse of the matrix contained in
Equation (23) yields:

$$
R_n^{-1} = (R_{n-1} + a_n a_n^H)^{-1} = R_{n-1}^{-1} - \frac{R_{n-1}^{-1} a_n a_n^H R_{n-1}^{-1}}{1 + a_n^H R_{n-1}^{-1} a_n}. \qquad (24)
$$

Furthermore, defining

$$
k_n = \frac{R_{n-1}^{-1} a_n}{1 + a_n^H R_{n-1}^{-1} a_n} \quad \text{and} \quad \Sigma_n = R_n^{-1},
$$

it follows that $\Sigma_n = \Sigma_{n-1} - k_n a_n^H \Sigma_{n-1}$ and $p_n = p_{n-1} + a_n x(n)$, and hence
$$
\begin{aligned}
\hat{h}_n &= \Sigma_n p_n = \Sigma_n [p_{n-1} + a_n x(n)] = \Sigma_n p_{n-1} + \Sigma_n a_n x(n) \\
&= (\Sigma_{n-1} - k_n a_n^H \Sigma_{n-1}) p_{n-1} + k_n x(n) \qquad (27) \\
&= \hat{h}_{n-1} - k_n a_n^H \hat{h}_{n-1} + k_n x(n) \\
&= \hat{h}_{n-1} + k_n [x(n) - a_n^H \hat{h}_{n-1}] = \hat{h}_{n-1} + e(n) k_n,
\end{aligned}
$$

where $e(n) = x(n) - a_n^H \hat{h}_{n-1}$, and we used the fact that

$$
k_n = \Sigma_n a_n = \frac{\Sigma_{n-1} a_n}{1 + a_n^H \Sigma_{n-1} a_n}.
$$
To summarize, at each time instant n the RLS algorithm computes:

$$
e(n) = x(n) - a_n^H \hat{h}_{n-1}, \qquad
\hat{h}_n = \hat{h}_{n-1} + e(n) k_n, \qquad
\Sigma_n = \Sigma_{n-1} - k_n a_n^H \Sigma_{n-1}.
$$
As the algorithm progresses, the error introduced by the initializations $\Sigma_0 = \alpha I$ and $\hat{h}_0 = 0$ becomes less
and less significant [12].
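The RLS recursion above can be sketched in a few lines. The example below identifies a hypothetical FIR system from noiseless data and checks that the recursive estimate converges to the true taps (with $\Sigma_0 = \alpha I$ and $\hat{h}_0 = 0$ as discussed):

```python
import numpy as np

rng = np.random.default_rng(6)
m, T, alpha = 4, 500, 1e6
h_true = rng.standard_normal(m)          # hypothetical FIR system to identify
a = rng.standard_normal(T)               # input signal (illustrative)

Sigma = alpha * np.eye(m)                # Sigma_0 = alpha * I
h_hat = np.zeros(m)                      # h_hat_0 = 0

for n in range(m, T):
    a_n = a[n : n - m : -1]              # m most recent input samples, newest first
    k = Sigma @ a_n / (1.0 + a_n @ Sigma @ a_n)   # gain vector k_n
    e = h_true @ a_n - a_n @ h_hat       # a priori error e(n), noiseless desired output
    h_hat = h_hat + e * k                # tap update h_n = h_{n-1} + e(n) k_n
    Sigma = Sigma - np.outer(k, a_n) @ Sigma      # Sigma_n update

# The recursive estimate converges to the true filter taps.
assert np.allclose(h_hat, h_true, atol=1e-4)
```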
In some cases, a weighted least squares estimation error is used as the minimizer by multiplying
the error terms with so-called “forgetting factors”, which are intended to reduce the influence of old
data [14]. The RLS algorithm has emerged as a powerful tool for adaptive filtering, prediction and
system identification applications. Despite the fundamental role played by the RLS algorithm in the
adaptive signal processing field, it encounters numerical problems when implemented in a finite
precision environment, especially on fixed-point arithmetic DSP processors [15]. Round-off
error propagation, i.e., the influence of a single perturbation occurring at some iteration on all
subsequent computations, represents one of the major drawbacks of the RLS algorithm in a finite
precision implementation and has been well studied in the literature [16–18]. A popular method to improve
the accuracy and stability of RLS algorithm is to implement a variable forgetting factor (VFF) [19–21].
Other modified RLS methods can be found in [22–25] and the references therein.
role of matrix theory in representing and addressing the detection and estimation problems associated
with the radiating sources.
First consider the one-source-one-sensor scenario shown in Figure 5. It is proved in [26] that
under the far-field and narrowband assumptions, the output of a sensor can be modeled as follows:
$$
x(t) = g(\theta) s(t) e^{-j r^T k},
$$
where g(θ ) is a function of the direction of arrival (DOA) θ, and it represents the frequency response
of the sensor. Variable s(t) represents the complex envelope of the source signal, and r and k
stand for the radius vector and the wave vector, respectively. The magnitude of k is defined as
|k| = α = 2π/λ, where λ is the wavelength. As shown in Figure 5, in a two-dimensional plane,
we have k = α(cos θ, sin θ ) T and r = ( x, y) T . Thus, the output of the sensor can be simplified to
$$
x(t) = g(\theta) e^{-j\alpha(x\cos\theta + y\sin\theta)} s(t) = a(\theta)s(t), \qquad (28)
$$
Figure 5. One-source-one-sensor geometry: a far-field source impinges on a sensor located at radius vector r along wave vector k, with direction of arrival θ measured from the x-axis.
For the one-source-L-sensor case, the array output vector can be expressed as
x( t ) = a( θ ) s ( t ), (29)
where x(t) = [ x1 (t), x2 (t), · · · , x L (t)]T denotes the output of each sensor and
a(θ ) = [ a1 (θ ), a2 (θ ), · · · , a L (θ )] T stands for the steering vector and captures the sensor array
geometry information. Two common array geometries, termed as the uniform linear array (ULA) and
uniform circular array (UCA), are depicted in Figure 6. For ULA, rl = [(l − 1)d, 0] T , and for UCA,
rl = R[cos(2π (l − 1)/L), sin(2π (l − 1)/L)] T . Additionally, if we assume that all the sensors have the
same frequency response, $g_1(\theta) = \cdots = g_L(\theta) = g(\theta)$, the ULA steering vector takes the form:

$$
a(\theta) = g(\theta)\left[1,\ e^{-j\alpha d\cos\theta},\ \cdots,\ e^{-j\alpha(L-1)d\cos\theta}\right]^T.
$$
Figure 6. Two common array geometries: a uniform linear array (ULA) with sensors at 0, d, 2d, ..., (L−1)d along the x-axis, and a uniform circular array (UCA) of radius R with angular spacing 2π/L.
$$
y(t) = \sum_{i=1}^{K} \sqrt{P_i}\, \tilde{U}_i x_i(t) + \sigma\nu(t), \qquad (35)
$$

where $P_i$ denotes the transmit power of source i, $x_i \in \mathbb{C}^{n_i}$ stands for the signal collected from source i
and is assumed to be complex Gaussian distributed with zero mean and identity covariance matrix,
and $\sigma\nu$ represents the additive noise. For simplicity, the channel matrix $\tilde{U}_i \in \mathbb{C}^{N \times n_i}$ is selected such
that $[\tilde{U}_1, \tilde{U}_2, \cdots, \tilde{U}_K] \in \mathbb{C}^{N \times (n_1+n_2+\cdots+n_K)}$ represents $n_1 + n_2 + \cdots + n_K$ columns of the unitary
matrix $U \in \mathbb{C}^{N \times N}$.
It follows that

$$
E[y(t)y(t)^H] = UPU^H + \sigma^2 I_N,
$$

where P is the diagonal matrix

$$
P = \mathrm{diag}\big(\underbrace{P_1, \cdots, P_1}_{n_1},\ \underbrace{P_2, \cdots, P_2}_{n_2},\ \cdots,\ \underbrace{P_K, \cdots, P_K}_{n_K},\ \underbrace{0, \cdots, 0}_{N - n_1 - \cdots - n_K}\big).
$$
Suppose that L independent observations of y(t) are available, i.e., $y(1), y(2), \cdots, y(L)$. Thus,
it follows that as $L \to \infty$:

$$
\left\| \frac{1}{L}\sum_{i=1}^{L} y(i)y(i)^H - \sigma^2 I_N - UPU^H \right\| \xrightarrow{a.s.} 0, \qquad (36)
$$
and a consistent estimator can be derived. However, when N is also large and comparable to L,
(36) no longer holds. Using the Stieltjes transform and its extensions,
the G-estimation method is generally adopted to improve the estimation performance. More technical
and mathematical details about the Stieltjes transform and its generalizations can be found in [31,35].
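The breakdown of the classical estimator when N and L are comparable can be observed directly on the eigenvalues of the sample covariance matrix; a small sketch with white noise (true covariance $I_N$, illustrative dimensions):

```python
import numpy as np

rng = np.random.default_rng(8)
N = 50  # number of sensors (illustrative)

def sample_cov_eigs(L):
    # Eigenvalues of the sample covariance of white noise (true covariance I_N).
    Y = rng.standard_normal((N, L))
    return np.linalg.eigvalsh(Y @ Y.T / L)

# Many more snapshots than sensors: the sample eigenvalues cluster near 1,
# consistent with the convergence in (36).
assert np.all(np.abs(sample_cov_eigs(50_000) - 1.0) < 0.15)

# Comparable N and L: the eigenvalues spread out (the Marchenko-Pastur effect),
# so the classical sample-covariance estimator is no longer consistent.
lam = sample_cov_eigs(100)
assert lam.max() > 1.5 and lam.min() < 0.5
```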
A grayscale image is simply a matrix of pixel intensities: for example, a 512 × 512 image corresponds
to a 512 × 512 matrix whose entries (gray levels such as 162, 138, ..., 108) represent the individual
pixels. Using block transform coding, one can develop a very simple storage mechanism, since most of
the block-transformed entries assume small values and can be neglected.
Specifically, an M × N image is divided into $MN/n^2$ subimages, each of size n × n. Considering
a subimage with coefficients g(x, y), the 2-D forward and inverse transforms can be expressed as follows:

$$
T(u, v) = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} g(x, y)\, r(x, y, u, v), \qquad (37)
$$

$$
g(x, y) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u, v)\, s(x, y, u, v), \qquad (38)
$$
where

$$
\alpha(x) = \begin{cases} \sqrt{\dfrac{1}{n}}, & x = 0, \\[4pt] \sqrt{\dfrac{2}{n}}, & x = 1, 2, \cdots, n-1. \end{cases}
$$
Equations (37) and (38) can be written in compact form using matrices. Taking (38) as
an example,

$$
G = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u, v)\, S_{uv}, \qquad (39)
$$

$$
T = \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} g(x, y)\, R_{xy}. \qquad (40)
$$
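As an illustration, the sketch below uses the orthonormal DCT-II kernel (one common choice consistent with the normalization $\alpha(x)$ above; the specific kernel is an assumption, since the text leaves r and s generic) to transform-code an 8 × 8 block, and shows that the reconstruction error equals the energy of the discarded coefficients:

```python
import numpy as np

n = 8
# Orthonormal DCT-II kernel matrix: C[u, x] = alpha(u) * cos((2x+1) u pi / (2n)).
u, x = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
C[0, :] = np.sqrt(1.0 / n)
assert np.allclose(C @ C.T, np.eye(n))         # the kernel is orthonormal

# Transform-code one n x n subimage: forward transform, discard small
# coefficients, inverse transform (illustrative pixel data).
rng = np.random.default_rng(9)
g = rng.integers(0, 256, size=(n, n)).astype(float)
T = C @ g @ C.T                                # separable 2-D transform, cf. (40)
T_kept = np.where(np.abs(T) > 20.0, T, 0.0)    # keep only the large coefficients
g_rec = C.T @ T_kept @ C                       # reconstruction, cf. (39)

# Orthonormality implies the reconstruction error equals the discarded energy.
assert np.isclose(np.linalg.norm(g - g_rec), np.linalg.norm(T - T_kept))
```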
shown in the upper left corner of Figure 8b. In this way, the image is compressed to a higher degree
given the premise that the original image can be recovered with a tolerated quality.
Figure 8. Haar wavelet transform: (a) first-scale decomposition into approximation, horizontal detail, vertical detail, and diagonal detail subbands; (b) second-scale decomposition applied to the approximation subband.
5. Compressive Sensing
The celebrated Shannon/Nyquist sampling theorem states that, in order to reconstruct a signal
without losing information, the sampling rate must be at least twice the signal bandwidth.
So far, most signal acquisition techniques rely on this principle. As mentioned in Section 4.1,
transform coding plays a key role in data acquisition systems. However, this compression framework
suffers from several drawbacks [38]. For instance, consider a discrete signal x ∈ R N with entries x [n],
n = 0, 1, · · · , N − 1. An image or a higher-dimensional data structure can be also vectorized into
a long one-dimensional vector. It is known that any signal in $\mathbb{R}^N$ can be represented in terms of a basis
of N vectors $\{\psi_i\}_{i=0}^{N-1}$. In this case,

$$
x = \sum_{i=0}^{N-1} s_i \psi_i = \Psi s,
$$
where s stands for the weighting coefficients and Ψ is the basis matrix with columns [ψ0 , · · · , ψN −1 ].
The signal x is called K-sparse if only K coefficients of s are non-zero and the remaining N − K
coefficients are roughly zero (or can be truncated to zero). Thus, in order to process the transform
coding, the locations of the K non-zero coefficients must be coded, which results in an overhead.
Moreover, the N − K zero coefficients still need to be computed even though they will be discarded.
Compressive sensing addresses these issues by directly acquiring the compressed signal, without
first acquiring the original N-sample signal, provided that the signal is K-sparse [39]. This novel
sampling method deliberately samples the signal at a greatly reduced rate. Next, we present the
mathematical model behind this technique.
Consider the following representation [38]:

$$
y = \Phi x, \qquad (41)
$$
where x ∈ R N stands for the original signal, y ∈ R M (M N) denotes the compressed signal,
and Φ ∈ R M× N represents the linear measurement process. Mathematically, the problem can be
formulated as follows. First, find a matrix Φ such that y = Φx preserves most of the information
embedded in x. Second, find an algorithm to recover x from only M samples of y.
$$
\min ||s||_0 \quad \text{s.t.} \quad \Theta s = y. \qquad (42)
$$

Since this $l_0$ minimization is combinatorial, one may instead consider the $l_2$ formulation:

$$
\min ||s||_2 \quad \text{s.t.} \quad \Theta s = y, \qquad (43)
$$
which leads to the closed-form solution $\hat{s} = \Theta^T(\Theta\Theta^T)^{-1}y$. However, this method almost never
yields a K-sparse solution. Even worse, the $l_2$ norm usually produces a fully dense solution
(no zero elements).
Surprisingly, optimization based on the $l_1$ norm is the most promising recovery technique
used so far for sparse signals. In this case, the recovery problem is formulated as follows:

$$
\min ||s||_1 \quad \text{s.t.} \quad \Theta s = y. \qquad (44)
$$
In some cases, the $l_1$ minimization yields the same result as the ideal $l_0$ minimization [46].
Moreover, the $l_1$ minimization can be efficiently solved via linear programming with
a computational complexity of $O(N^3)$. Over the last decade, a series of papers [39,43,47] explained
why $l_1$ minimization can serve as a proxy for $l_0$ minimization and why it can recover a sparse
signal. To date, most of the works dedicated to compressive sensing, see, e.g., the works of Candes,
Romberg, Tao and Donoho [39–43], assume the $l_1$ norm formulation.
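As a lightweight illustration of sparse recovery, the sketch below applies the iterative soft-thresholding algorithm (ISTA), a proximal-gradient method for the Lagrangian relaxation of (44), used here as a simple stand-in for the linear-programming solver mentioned above (all parameters illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)
N, M, K = 64, 24, 3

# K-sparse coefficient vector (Psi = I for simplicity, so Theta = Phi).
s_true = np.zeros(N)
s_true[rng.choice(N, size=K, replace=False)] = np.array([2.0, -1.5, 3.0])

# Random Gaussian measurement matrix, M << N measurements.
Theta = rng.standard_normal((M, N)) / np.sqrt(M)
y = Theta @ s_true

# ISTA: proximal-gradient iterations for min 0.5*||Theta s - y||^2 + lam*||s||_1.
lam = 0.05
step = 1.0 / np.linalg.norm(Theta, 2) ** 2        # inverse Lipschitz constant
s = np.zeros(N)
for _ in range(5000):
    z = s - step * (Theta.T @ (Theta @ s - y))    # gradient step
    s = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-thresholding

# The l1-regularized estimate locates the true support and is close to s_true
# (up to the small shrinkage bias introduced by lam).
assert set(np.argsort(np.abs(s))[-K:]) == set(np.flatnonzero(s_true))
assert np.allclose(s, s_true, atol=0.2)
```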
6. Conclusions
This paper overviewed how matrix theory can be applied in wireless communications and
signal processing applications. In particular, this paper illustrated how matrix transformations,
matrix decompositions and random matrix theory could be exploited in the implementation,
characterization and optimization of MIMO communication systems, OFDM signaling schemes and
signal estimation applications. This paper also reviewed the significant role played by matrices in
representing and processing digital images and signals acquired by an array of sensors. Furthermore,
a recent research activity termed compressive sensing was briefly discussed herein. Continued
efforts from researchers in wireless communications and signal processing will be required to
fully exploit the potential applications of matrix theory. Furthermore, the complex
phenomena in nature and engineering applications require higher-dimensional matrices for their
modeling, characterization and understanding. In this regard, tensors have been used in mathematics,
physics and engineering since they are a generalization of scalars, vectors and two-dimensional arrays
to represent more complex quantities and properties in higher dimensions. Therefore, a promising
research direction is to investigate new applications of tensors in wireless communications and signal
processing, and to extend the arsenal of mathematical tools available for storing, extracting and
processing information from tensors. Finally, we would like to mention that linear algebra plays a very
important role in developing powerful algorithms in bioinformatics and systems biology applications.
Numerous algorithms for the inference of gene and proteomic regulatory networks as well as for
the inference of transcription factor networks have been proposed in literature (see, e.g., [48,49]) by
exploiting linear algebra tools. However, reviewing these computational biology applications as well
as numerous other machine learning applications where linear algebra plays a key role is beyond the
scope of this paper.
References
1. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
2. Introduction to LTE-Advanced: Application Note. Agilent Technol. 2011. Available online: http://cp.
literature.agilent.com/litweb/pdf/5990-6706EN.pdf (accessed on 6 January 2016).
3. Telatar, E. Capacity of multi-antenna Gaussian channels. Eur. Trans. Telecommun. 1999, 10, 585–596.
4. Goldsmith, A. Wireless Communications; Cambridge University Press: Cambridge, UK, 2005.
5. Foschini, G.J. Layered space-time architecture for wireless communication in fading environments when
using multi-element antennas. Bell Labs Tech. J. 1996, 1, 41–59.
6. Foschini, G.J.; Gans, M.J. On limits of wireless communications in fading environment when using
multiple antennas. Wirel. Pers. Commun. 1998, 6, 311–335.
7. Goldsmith, A.; Jafar, S.A.; Jindal, N.; Vishwanath, S. Capacity limits of MIMO channels. IEEE J. Sel.
Areas Commun. 2003, 21, 684–702.
8. Wang, X.; Serpedin, E.; Qaraqe, K. A variational approach for assessing the capacity of a memoryless
nonlinear MIMO channel. IEEE Commun. Lett. 2014, 18, 1315–1318.
9. Gray, R.M. Toeplitz and Circulant Matrices: A Review; Now Publishers Inc.: Norwell, MA, USA, 2006.
10. Bolcskei, H.; Zurich, E. MIMO-OFDM wireless systems: Basics, perspectives, and challenges. IEEE Trans.
Wirel. Commun. 2006, 13, 31–37.
11. Golub, G.H.; Van Loan, C.F. Matrix Computations; The Johns Hopkins University Press: Baltimore, MD,
USA, 1996.
12. Kay, S. Fundamentals of Statistical Signal Processing: Estimation Theory; Prentice Hall: Upper Saddle River, NJ,
USA, 1993.
13. Reilly, J.P. Matrix Computations in Signal Processing: Lecture Notes; Electrical and Computer Engineering
Department, McMaster University: Hamilton, ON, Canada, 24 September 2006.
14. Haykin, S. Array Signal Processing; Prentice Hall: Upper Saddle River, NJ, USA, 1985.
15. Paleologu, C.; Ciochina, S.; Enescu, A. A family of recursive least-squares adaptive algorithms suitable for
fixed-point implementation. Int. J. Adv. Telecommun. 2009, 2, 88–97.
16. Liavas, A.; Regalia, P. On the numerical stability and accuracy of the conventional recursive least
squares algorithm. IEEE Trans. Signal Process. 1999, 47, 88–96.
17. Ljung, S.; Ljung, L. Error propagation properties of recursive least-squares adaptation algorithms.
Automatica 1985, 21, 157–167.
18. Verhaegen, M. Round-off error propagation in four generally-applicable, recursive, least-squares estimation
schemes. Automatica 1989, 25, 437–444.
19. Bhotto, Z.; Antoniou, A. New improved recursive least-squares adaptive-filtering algorithms. IEEE Trans.
Circuits Syst. I Reg. Pap. 2013, 60, 1548–1558.
20. Leung, S.; So, C. Gradient-based variable forgetting factor RLS algorithm in time-varying environments.
IEEE Trans. Signal Process. 2005, 53, 3141–3150.
21. Paleologu, C.; Benesty, J.; Ciochina, S. A robust variable forgetting factor recursive least-squares algorithm
for system identification. IEEE Signal Process. Lett. 2008, 15, 597–600.
22. Ardalan, S. Floating-point roundoff error analysis of the exponentially windowed RLS algorithm for
time-varying systems. IEEE Trans. Circuits Syst. 1986, 33, 1192–1208.
23. Bottomley, G.; Alexander, S. A novel approach for stabilizing recursive least squares filters. IEEE Signal
Process. Lett. 1991, 39, 1770–1779.
24. Chansakar, M.; Desai, U. A robust recursive least squares algorithm. IEEE Signal Process. Lett. 1997,
45, 1726–1735.
25. Haykin, S. Adaptive Filter Theory; Prentice Hall: Upper Saddle River, NJ, USA, 2002.
26. Krim, H.; Viberg, M. Two decades of array signal processing research. IEEE Signal Process. Mag. 1996,
13, 67–94.
27. Roy, R.; Kailath, T. ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans.
Acoust. Speech Signal Process. 1989, 37, 984–995.
28. Stoica, P.; Moses, R.L. Introduction to Spectral Analysis; Prentice Hall: Upper Saddle River, NJ, USA, 1997.
29. Wigner, E. Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 1955, 62,
548–564.
30. Marcenko, V.A.; Pastur, L.A. Distributions of eigenvalues for some sets of random matrices.
Math. USSR-Sbornik 1967, 1, 457–483.
31. Couillet, R.; Debbah, M. Signal processing in large systems: A new paradigm. IEEE Signal Process. Mag. 2013,
30, 24–39.
32. Couillet, R.; Debbah, M.; Silverstein, J.W. A deterministic equivalent for the analysis of correlated MIMO
multiple access channels. IEEE Trans. Inf. Theory 2011, 57, 3493–3514.
33. Dupuy, F.; Loubaton, P. On the capacity achieving covariance matrix for frequency selective MIMO channels
using the asymptotic approach. IEEE Trans. Inf. Theory 2011, 57, 5737–5753.
34. Couillet, R.; Debbah, M. Random Matrix Methods for Wireless Communications; Cambridge University Press:
Cambridge, UK, 2011.
35. Tulino, M.A.; Verdu, S. Random Matrix Theory and Wireless Communications; Now Publishers: Norwell, MA,
USA, 2004.
36. Girko, V.L. Theory of Random Determinants; Kluwer Academic Publisher: Dordrecht, The Netherlands, 1990.
37. Gonzalez, R.C.; Woods, R.E. Digital Image Processing; Prentice Hall: Upper Saddle River, NJ, USA, 2007.
38. Baraniuk, R.G. Compressive Sensing. IEEE Signal Process. Mag. 2007, 24, 119–121.
39. Candes, E.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly
incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509.
40. Candes, E.; Romberg, J. Quantitative robust uncertainty principles and optimally sparse decompositions.
Found. Comput. Math. 2006, 6, 227–254.
41. Candes, E.; Tao, T. Decoding by linear programming. IEEE Trans. Inf. Theory 2005, 51, 4203–4215.
42. Candes, E.; Tao, T. Near optimal signal recovery from random projections: Universal encoding strategies?
IEEE Trans. Inf. Theory 2006, 52, 5406–5425.
43. Donoho, D.L. For most large underdetermined systems of equations, the minimal l1 -norm near-solution
approximates the sparsest near-solution. Commun. Pure Appl. Math. 2004, 59, 797–829.
44. Baraniuk, R.G.; Davenport, M.A.; DeVore, R.A.; Wakin, M.B. A simple proof of the restricted isometry
property for random matrices. Constr. Approx. 2008, 28, 253–263.
45. Natarajan, B.K. Sparse approximate solutions to linear systems. SIAM J. Comput. 1995, 24, 227–234.
46. Holtz, H.V. Compressive sensing: A paradigm shift in signal processing. 2008, arXiv:0812.3137.
47. Donoho, D.L.; Huo, X. Uncertainty principles and ideal atomic decomposition. IEEE Trans. Inf. Theory 2001,
47, 2845–2862.
48. Wang, X.; Alshawaqfe, M.; Dang, X.; Wajid, B.; Noor, A.; Qaraqe, M.; Serpedin, E. An overview of NCA-based
algorithms for transcriptional regulatory network inference. Microarrays 2015, 4, 596–617.
49. Noor, A.; Serpedin, E.; Nounou, M.; Nounou, H. An overview of the statistical methods for inferring gene
regulatory networks and protein-protein interaction networks. Adv. Bioinform. 2013, 2013, 953814.
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC-BY) license (http://creativecommons.org/licenses/by/4.0/).