Dynamical Sparse Recovery with Finite-time
Convergence
Lei Yu, Gang Zheng, Jean-Pierre Barbot
To cite this version:
Lei Yu, Gang Zheng, Jean-Pierre Barbot. Dynamical Sparse Recovery with Finite-time Convergence.
IEEE Transactions on Signal Processing, 2017, 65 (23), pp. 6146-6157. DOI: 10.1109/TSP.2017.2745468. hal-01649419
HAL Id: hal-01649419
https://inria.hal.science/hal-01649419
Submitted on 27 Nov 2017
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Dynamical Sparse Recovery with Finite-time
Convergence
Lei Yu, Gang Zheng, Jean-Pierre Barbot
Abstract—Even though Sparse Recovery (SR) has been successfully applied in a wide range of research communities, a barrier to real applications remains because of the inefficiency of state-of-the-art algorithms. In this paper, we propose a dynamical approach to SR that is highly efficient and possesses the finite-time convergence property. First, instead of solving the ℓ1-regularized optimization program, which is computer-oriented and requires exhaustive iterations, the SR problem in this work is solved through the evolution of a continuous dynamical system that can be realized by analog circuits. Moreover, the proposed dynamical system is proved to have the finite-time convergence property, and is thus more efficient than LCA (a recently developed dynamical system for SR), which converges only exponentially. Consequently, our proposed dynamical system is more appropriate than LCA for time-varying situations. Simulations are carried out to demonstrate the superior properties of the proposed system.
Index Terms—Sparse Recovery, ℓ1-minimization, Dynamical System, Finite-time Convergence

I. INTRODUCTION
As a fundamental part of Compressive Sensing (CS) theory [1], Sparse Recovery (SR), or sparse representation, has been substantially investigated over the last two decades. As a powerful tool, it has been successfully applied in a wide range of research communities and has obtained compelling results, including signal processing [1]–[5], medical imaging [6], [7], machine learning [8], [9], and computer vision [10]. In particular, the objective of SR is to find a concise representation of a signal using a few atoms from some specified (over-complete) dictionary,
y = Φx + ε
with y ∈ R^M the observed measurements corrupted by some noise ε, x ∈ R^N the sparse representation with no more than s nonzero entries (s-sparsity), and Φ ∈ R^{M×N} the dictionary (normally M ≪ N). SR thus involves an underdetermined linear inverse problem. Provided that the Restricted Isometry Property (RIP) of the dictionary is fulfilled, the uniqueness of the solution is guaranteed [11].
The problem of SR is often cast as an optimization program that minimizes a cost function constructed by leveraging
This work is supported by NSFC Grant 61401315, and the Project sponsored by SRF for ROCS, SEM, under Grant 230303.
Lei Yu is with the School of Electronic and Information, Wuhan University, Wuhan, Hubei, China (email: [email protected]).
Gang Zheng is with Non-A, INRIA Lille, France (email: [email protected]).
Jean-Pierre Barbot is with Quartz EA 7393, ENSEA, Cergy-Pontoise, and Non-A, INRIA Lille, France (email: [email protected]).
the observation error term and the sparsity-inducing term [12]–[14], i.e.,

x∗ = arg min_{x∈R^N} (1/2)∥y − Φx∥₂² + λψ(x)    (1)
and typically, the sparsity-inducing term ψ(x) = ∥x∥₁ ≜ Σ_i |x_i|, with λ > 0 the balancing parameter. We call x∗ the critical point, i.e., the solution of (1). Typically, for sparse vectors x with s-sparsity, the solution is unique provided that the RIP condition for Φ with order 2s is verified [11]. On the other hand, exploiting hierarchical Bayesian models built on sparse signals [8], [15]–[18] results in compelling algorithms with inherently different sparsity-inducing terms [16]. Moreover, greedy algorithms are also favorable for SR due to their theoretical guarantees and high efficiency when the considered signal is highly sparse [9], [19]–[21].
Although greedy algorithms are efficient, the condition for the stable recovery of s-sparse x is generally very strong. In particular, it is shown in [22] that to guarantee stable recovery of any s-sparse x with the orthogonal matching pursuit algorithm [20] in s iterations, the dictionary Φ should satisfy the RIP with restricted isometry constant δ_s < 1/√(s + 1). Although it has been shown in [23] that stable recovery of any s-sparse x with the orthogonal matching pursuit algorithm [20] is also possible if Φ satisfies the RIP with restricted isometry constant δ_{31s} < 1/3, the required number of iterations is 30s, which is computationally expensive. Besides, the other aforementioned algorithms are all batch-based, normally require a large number of iterations to guarantee convergence (most of them with a sublinear convergence rate), and thus have high computational complexity. They are therefore impractical for real applications where the signals are usually time-varying, such as radar imaging [24], face recognition [10], DOA estimation [5], and so on. Regarding real applications, many "online" algorithms have been proposed recently, either by generalizing the ℓ1-regularized LS in the manner of LMS (Least Mean Square) [25], [26] and RLS (Recursive Least Square) [27], or by extending the Bayesian approaches in an adaptive framework [28], [29]. On the other hand, instead of online algorithms, the Locally Competitive Algorithm (LCA) [30] has been proposed to solve the SR problem by exploiting continuous dynamical systems. Recent advances in very-large-scale integration (VLSI) enable the realization of LCA with analog chips [31]. Consequently, instead of numerically calculating the matrix multiplications as in digital approaches, LCA can obtain the computation result from analog circuits, which is very efficient.
Mathematically, LCA is in fact a continuous version of the iterative soft-thresholding algorithm [13], [32]. Moreover, provided that Φ satisfies the RIP, LCA guarantees an exponential convergence rate [31]. Even though, armed with analog circuits, LCA is much more efficient than its discrete version [32], an exponential rate is not enough to ensure the convergence of SR during the evolution of the LCA dynamics, especially when signals vary rapidly. Consequently, the main objective of this paper is to redesign the dynamics of LCA to increase the convergence rate. The sparse recovery problem (1) is an optimization problem, and besides numerical methods, continuous methods can also be used to solve optimization problems, an approach that historically has a strong link to control theory [33], [34]. In fact, in [35], the proposed LCA method exactly used control theory to solve the optimization problem (1). In order to clarify the motivation, let us first recall some basic background from control theory.
A. Recall of System Stability
Researchers in the control community are interested in stabilizing different types of dynamical systems with proper control laws. Consider the following system:

u̇ = f(u)    (2)
with u ∈ R^N the system state with respect to time t, and denote by u(t) the value of the state at time instant t. For this system, we call a point u∗ ∈ R^N an equilibrium point if f(u∗) = 0. Note that a linear time-invariant system has only one isolated equilibrium point, but nonlinear and switched systems may have more than one isolated equilibrium point. Therefore, only local stability around each equilibrium point can be analyzed. Concerning the concept of stability, different definitions are given in the literature.
Definition 1. System (2) is said to be:
1) locally Lyapunov stable around u∗ , if for any ϵ > 0,
there exists δ > 0 such that, if ||u(0) − u∗ || < δ, then
||u(t) − u∗ || < ϵ, for all t > 0;
2) locally asymptotically stable around u∗ , if there exists δ > 0 such that, if ||u(0) − u∗ || < δ, then
limt→∞ ||u(t) − u∗ || = 0;
3) locally finite-time stable around u∗ , if there exist δ > 0
and T > 0 such that, if ||u(0) − u∗ || < δ, then ||u(t) −
u∗ || = 0 for all t > T .
Lyapunov stability only requires that a solution u(t) starting in a neighborhood of the equilibrium point u∗ stays inside that neighborhood. Asymptotic stability requires that the trajectory of the system converge to u∗ as t tends to ∞. The strongest notion is finite-time stability, which furthermore imposes that u(t) exactly equal u∗ after a finite time T. Moreover, the extension from local stability to global stability simply relaxes the neighborhood condition on u∗ (||u(0) − u∗|| < δ) by allowing any u(0) ∈ R^N. If the system globally converges to u∗, this also implies that u∗ is the unique equilibrium point.
Without solving the differential equation, the Lyapunov function method is widely used in control theory to determine the type of stability of the studied system. Suppose u∗ is the equilibrium point and denote e(t) = u(t) − u∗. The basic idea is to choose a Lyapunov function V(e) which is locally positive definite for all e ≠ 0 with V(0) = 0; then system (2) is:
1) locally Lyapunov stable around u∗, if V̇(e) ≤ 0, ∀e ≠ 0;
2) locally asymptotically stable with rate k around u∗, if V̇(e) ≤ −kV(e), ∀e ≠ 0, with k > 0;
3) locally finite-time stable around u∗, if V̇(e) ≤ −kV^α(e), ∀e ≠ 0, with k > 0 and α ∈ (0, 1).
Similarly, global stability can be proved by choosing a globally positive definite and radially unbounded Lyapunov function, i.e., V(e) → ∞ if ||e|| → ∞. Besides, if with the chosen V(e) one can only prove V̇(e) ≤ 0, the LaSalle theorem can still be used to prove asymptotic stability: if the set where V̇(e) = 0 contains only e = 0, then the system is asymptotically stable.
B. Motivations
In this paper, a new dynamical system will be proposed, whose equilibrium point is unique and yields the solution of the optimization problem (1). Therefore, the above basic results (Lyapunov and LaSalle theorems) from control theory will be used to analyze the convergence performance. In order to explain how to design a dynamical system with non-asymptotic (finite-time) convergence, let us consider the following two simple systems:

u̇ = −u    (3)

and

u̇ = −|u|^α sgn(u), with α ∈ (0, 1).    (4)
It is easy to see that u∗ = 0 is the only equilibrium point of both systems. For system (3), choosing the Lyapunov function V(u) = u², we have V̇ = −2u² = −2V; thus u of system (3) asymptotically converges to the equilibrium point 0. Concerning system (4), choosing as well V = u² gives V̇ = −2u|u|^α sgn(u) = −2V^{(1+α)/2}. Since α ∈ (0, 1), u of system (4) converges to the equilibrium point 0 after a finite time T. In particular, when α = 1, system (4) is exactly system (3), and the finite-time convergence property degrades to an asymptotic one. In other words, by introducing the sign function (the sliding mode technique in control theory), the convergence performance of the studied system can be improved.
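As a quick numerical illustration of the two toy systems above, the sketch below integrates (3) and (4) with a forward-Euler scheme (an illustrative discretization; the step size and horizon are arbitrary choices, not from the paper):

```python
import numpy as np

def simulate(rhs, u0, dt=1e-3, t_max=5.0):
    """Forward-Euler integration of the scalar ODE u' = rhs(u)."""
    u = u0
    for _ in range(int(t_max / dt)):
        u = u + dt * rhs(u)
    return u

# System (3): u' = -u decays exponentially, so u(t) = u0*exp(-t)
# never reaches 0 exactly in finite time.
u_asym = simulate(lambda u: -u, u0=1.0)

# System (4): u' = -|u|^alpha sgn(u) with alpha in (0,1) reaches 0 at the
# finite time T = u0^(1-alpha)/(1-alpha), i.e. T = 2 for u0 = 1, alpha = 0.5.
alpha = 0.5
u_ft = simulate(lambda u: -np.abs(u) ** alpha * np.sign(u), u0=1.0)

print(abs(u_asym))  # still visibly nonzero at t = 5
print(abs(u_ft))    # numerically settled at 0 well before t = 5
```

The gap between the two final states is exactly the asymptotic-versus-finite-time distinction exploited in the rest of the paper.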
Let us then turn to problem (1). Motivated by the above example, finite-time convergence can also be achieved by exploiting the sliding mode technique in LCA [35], which has an asymptotic (exponential) convergence property when solving the optimization problem (1). The rest of this paper is organized as follows. The new dynamical system is built in Section II and its finite-time convergence property is proved in Section III. Relationships between our proposed method and related works are discussed in Section IV. Simulations are implemented to verify the theorems and demonstrate the superiority of our proposed system over LCA in Section V, and extensions to recover time-varying sparse signals are empirically presented in Section VI. Conclusions are drawn in Section VII.
II. SPARSE RECOVERY VIA DYNAMICAL SYSTEM

A. Preliminary of LCA
Let us first take a look at the LCA method proposed in [35] to solve the optimization problem (1):

τ u̇(t) = −u(t) − (Φ^T Φ − I)a(t) + Φ^T y,
x̂(t) = a(t),    (5)

where u ∈ R^N is the state vector, x̂ represents the estimate of the sparse signal x of (1), and τ > 0 is a time constant determined by the physical properties of the implementing system. Since τ always appears as a companion of the derivative with respect to time t, it can simply be set to τ = 1 for the mathematical analysis and then restored in the final result if a time derivative appears. Here a(t) = T_λ(u(t)), where T_λ(·) is the continuous soft-thresholding function

T_λ(u) = max(|u| − λ, 0) · sgn(u)    (6)

with λ > 0.
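For concreteness, the soft-thresholding function (6) can be transcribed element-wise as follows (a direct transcription using NumPy conventions):

```python
import numpy as np

def soft_threshold(u, lam):
    """T_lambda(u) = max(|u| - lambda, 0) * sgn(u), applied element-wise."""
    return np.maximum(np.abs(u) - lam, 0.0) * np.sign(u)

u = np.array([-2.0, -0.3, 0.0, 0.5, 1.5])
# Entries with |u_i| <= lambda are zeroed; the rest are shrunk toward 0.
print(soft_threshold(u, 1.0))
```

This is the operation that switches nodes between the active and inactive states discussed next.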
Denoting by u_i the i-th element of the state u, we call u_i an active node if the amplitude |a_i(t)| is different from zero; otherwise we call this node inactive. Define Γ as the set of active nodes, i.e., a_Γ ≠ 0, and Γ^c as the set of inactive nodes, i.e., a_{Γ^c} = 0. In order to guarantee the existence of a unique solution of the optimization problem (1), an assumption on Φ should be made before going deeper into the analysis, namely the restricted isometry property (RIP) [11].
Assumption 1 (RIP [11]). The matrix Φ satisfies the RIP condition of order s with constant δ_s ∈ (0, 1).
The above assumption implies that for any s-sparse signal x, i.e., a vector with at most s nonzero elements, the following condition is verified:

(1 − δ_s)∥x∥₂² ≤ ∥Φx∥₂² ≤ (1 + δ_s)∥x∥₂².

Denoting by Γ the index set of nonzero elements of x, this implies that

1 − δ_s ≤ eig(Φ_Γ^T Φ_Γ) ≤ 1 + δ_s,

where Φ_Γ denotes the submatrix of Φ restricted to the active nodes. Explicitly, the RIP condition guarantees that the eigenvalues of the Gram matrix Φ_Γ^T Φ_Γ are bounded for any index set Γ.
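This eigenvalue bound can be observed numerically. For a random Gaussian dictionary with unit-norm columns (a standard construction known to satisfy the RIP with high probability; the sizes below are illustrative choices, not from the paper), the eigenvalues of a small Gram submatrix cluster around 1:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, s = 64, 256, 4

# Gaussian dictionary with unit-norm columns.
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)

# Eigenvalues of the Gram matrix of a random s-column submatrix Phi_Gamma.
Gamma = rng.choice(N, size=s, replace=False)
eigs = np.linalg.eigvalsh(Phi[:, Gamma].T @ Phi[:, Gamma])
print(eigs.min(), eigs.max())  # both close to 1 when s << M
```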
Suppose that the RIP of Φ is fulfilled with constant δ_s; then the LCA system (5) converges exponentially, as stated in the following theorem.

Theorem 1 (LCA Convergence Property [35]). If Assumption 1 holds, then the LCA system (5) converges to its equilibrium point u∗ exponentially fast with convergence speed (1 − δ_s)/τ, i.e., ∃K > 0 such that ∀t ≥ 0,

∥u(t) − u∗∥₂ ≤ Ke^{−(1−δ_s)t/τ}.
B. The Proposed Dynamical System

In this paper, a new dynamical system is proposed to solve the ℓ1-minimization problem (1). As stated in the last section, motivated by the sliding mode technique, the new dynamical system is constructed by introducing the parameter α ∈ (0, 1], i.e.,

τ u̇(t) = −⌈u(t) + (Φ^T Φ − I)a(t) − Φ^T y⌋^α,
x̂(t) = a(t),    (7)

with ⌈·⌋^α being the function defined as

⌈·⌋^α = | · |^α · sgn(·),

where | · | and sgn are element-wise operators, α ∈ R_+ denotes an exponential coefficient, and

sgn(ω) = 1 if ω > 0;  sgn(ω) ∈ [−1, 1] if ω = 0;  sgn(ω) = −1 if ω < 0.
In the following sections, we will demonstrate that the newly designed system (7) solves the optimization problem (1) and converges to the equilibrium point in finite time.
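As an illustrative numerical check of this claim, system (7) with τ = 1 can be integrated with a forward-Euler scheme (a digital surrogate for the analog dynamics, not the proposed implementation; problem sizes, λ, α, the step size, and the number of steps below are arbitrary demo choices):

```python
import numpy as np

def soft_threshold(u, lam):
    # The activation a = T_lambda(u) from (6).
    return np.maximum(np.abs(u) - lam, 0.0) * np.sign(u)

def signed_power(w, alpha):
    # The element-wise operator |w|^alpha * sgn(w) used in (7).
    return np.abs(w) ** alpha * np.sign(w)

def dynamical_sr(Phi, y, lam=0.05, alpha=0.5, dt=1e-2, n_steps=20000):
    """Euler integration of u' = -(|u + (Phi^T Phi - I)a - Phi^T y|^alpha, signed)."""
    N = Phi.shape[1]
    u = np.zeros(N)
    G = Phi.T @ Phi - np.eye(N)
    b = Phi.T @ y
    for _ in range(n_steps):
        a = soft_threshold(u, lam)
        u -= dt * signed_power(u + G @ a - b, alpha)
    return soft_threshold(u, lam)

rng = np.random.default_rng(1)
M, N, s = 40, 100, 3
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[rng.choice(N, size=s, replace=False)] = [1.0, -1.5, 2.0]
y = Phi @ x_true  # noiseless measurements

x_hat = dynamical_sr(Phi, y)
print(np.linalg.norm(x_hat - x_true))  # small compared to ||x_true||
```

The recovery is biased by λ, as expected for the critical point of (1), but the estimate tracks the true sparse vector closely in this easy regime.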
Theorem 2. Under Assumption 1, the state u(t) of (7) converges in finite time to its equilibrium point u∗, and x̂(t) of (7) converges in finite time to the solution x of (1).

Remark 1. Considering the dynamics (7), even though the right-hand side is not Lipschitz at u = 0 for α ∈ (0, 1), the system still has a unique solution (Cauchy problem). This is due to the fact that the dynamics (7) are at least locally asymptotically stable at u = 0, and then the only solution is clearly u(t) = 0, ∀t > 0, if u(0) = 0. Moreover, for α = 0 the solution must be considered in the Filippov sense [36].

Remark 2. When α = 1, the proposed dynamical system (7) becomes exactly the LCA proposed in [31]. For α = 0, the dynamics of a neuron cell become first-order sliding-mode dynamics, and the chattering phenomenon occurs at the equilibrium point. This is not desired in a neural network, and more particularly in our proposed optimization algorithm for problem (1).
III. CONVERGENCE IN FINITE TIME

In this section, we analyze the properties of the proposed system (7) in the following four steps. First, similarly to LCA, we prove that the output of the proposed system (7) converges to the critical point of (1). After that, we prove that the trajectory of (7) stays in a bounded set. Then, the attractivity of an invariant set is proved via the LaSalle theorem [37], [38] by introducing a new positive semi-definite function. Finally, the finite-time convergence of (7) is proved.

In the following, note that the variables u, x, a are always functions of time t, which is sometimes omitted for simplicity, and a dot above a variable always denotes the derivative with respect to time t. u_i represents the i-th element of the vector u, and I is the identity matrix. u∗ is a constant with respect to time t which represents the equilibrium point of the trajectory u(t).
A. Solution Equivalence
Considering the proposed dynamical system (7), the second claim of Theorem 2 can be easily proved by slightly modifying results from the many papers related to LCA, such as [31]. We restate the following lemma to make our proofs complete.

Lemma 1. Equilibrium points of (7) are critical points of (1).
Proof. The subgradient of (1) with respect to x in the set-valued framework [36], [39], [40] gives

∂((1/2)∥Φx − y∥₂² + λ∥x∥₁)/∂x = (Φ^T(Φx − y) + λ sgn(x))^T.    (8)

Define x = T_λ(u); then x and u have the same sign. By a simple calculation,

u − x = (|u| − max(|u| − λ, 0)) · sgn(u) = λ sgn(x).

Then, substituting for λ sgn(x) in (8),

∂((1/2)∥Φx − y∥₂² + λ∥x∥₁)/∂x = (Φ^T(Φx − y) + u − x)^T.

Consequently, u̇ = 0 in (7) only when ∂((1/2)∥Φx − y∥₂² + λ∥x∥₁)/∂x = 0, which completes the proof.
The above lemma connects the dynamical system (7) to the optimization problem (1), and guarantees the equivalence of the output of (7) and the critical point of (1). Since Assumption 1 implies the uniqueness of the critical point of (1), Lemma 1 means that the system (7) has only one equilibrium point.

Remark 3. The generalized activation function T_λ is not the main contribution of this paper, so only the soft-thresholding function is addressed. Alternative activation functions yield the same result as Lemma 1, and proofs with generalized activation functions can be found in the appendix of [35].

Due to the sgn function, the resulting system (7) is actually a hybrid (switched) system, and the Zeno phenomenon (infinitely many transitions within finite time [41]) might exist, which would make the analysis very complicated. Consequently, it is necessary to verify whether Zeno behavior exists, and the following lemma addresses this point.
Lemma 2. The system (7) with the continuous threshold (6) is everywhere integrable and has a unique solution; moreover, Zeno behavior cannot occur.

Proof. According to control theory, the existence and uniqueness of the solution of a dynamical system is not guaranteed only at state points where the system is not Lipschitz. For the proposed system (7), the solution exists except when u(t) + (Φ^T Φ − I)a − Φ^T y is equal to zero, i.e., at the equilibrium point. Nevertheless, at this equilibrium point, Lemma 1 shows that it is the unique equilibrium point of (7), which coincides with the unique critical point of (1). As we will prove in Theorem 2, this unique equilibrium point is globally stable; therefore, the system (7) with the continuous threshold (6) always has a unique solution.

Moreover, since T_λ(u) defined in (6) is a continuous threshold, i.e., T_λ(u) ∈ C⁰, according to the definition of the proposed dynamics in (7) the trajectory u belongs to C¹, which implies that Zeno behavior does not exist for the proposed system (7) with the continuous threshold (6).
In order to invoke the LaSalle theorem in the next subsection, we must first prove that the state trajectory stays in a bounded set.

Lemma 3. For any bounded initial state u, the trajectory of (7) stays in a bounded set.

Proof. In order to prove that the state trajectory stays in a bounded set, we invoke again (1), but with respect to u (letting x = T_λ(u)):

V(u) = (1/2)∥y − ΦT_λ(u)∥₂² + λ∥T_λ(u)∥₁,

and the derivative with respect to time t is

V̇(u) = (u + (Φ^T Φ − I)T_λ(u) − Φ^T y)^T F'_λ u̇,

with F'_λ = ∂T_λ(u)/∂u the Fréchet derivative with respect to u; it is a diagonal matrix with 1 on the diagonal where the neuron is active and 0 elsewhere.

Now, considering the dynamical system (7), this gives

V̇ = −(u + (Φ^T Φ − I)T_λ(u) − Φ^T y)^T F'_λ ⌈u + (Φ^T Φ − I)T_λ(u) − Φ^T y⌋^α ≤ 0.

As lim_{∥u∥→∞} V(u) = ∞ and V̇(u) ≤ 0, one can conclude that u stays in a bounded set, i.e., (7) is Lyapunov stable.
B. Global Convergence

Even though the LaSalle theorem requires that the state trajectory evolve in a bounded set, since this bounded set can be made as large as desired with respect to the initial state, we consider the convergence as global.

On the other hand, it has been proved that under Assumption 1 the uniqueness of the solution to (7) is guaranteed [11]. This implies that there exists an equilibrium u∗ of the dynamical system (7). In order to prove the convergence property of (7), the error terms¹ are introduced:

ũ(t) ≜ u(t) − u∗,
ã(t) ≜ a(t) − a∗.

Then define the Lyapunov function with respect to ũ:

E(ũ) = (1/2)∥ũ∥₂² + 1^T(Φ^T Φ − I)G(ũ)    (9)

with 1 ∈ R^N the vector with all elements equal to 1 and G(ũ) = [G₁(ũ₁), G₂(ũ₂), ..., G_N(ũ_N)]^T ∈ R^N, where

G_i(ũ_i) = ∫₀^{ũ_i} g_i(s) ds

with g_i(s) = T_λ(s + u_i∗) − T_λ(u_i∗). Then we have the following properties.

¹Variables ũ and ã are always functions of t, which is omitted in the following sections for simplicity.
Lemma 4. The function E defined in (9) satisfies the following properties:
1) For all ũ_i ≥ 0, 0 ≤ G_i(ũ_i) ≤ ũ_i²/2;
2) E is non-increasing, i.e., Ė ≤ 0;
3) For the dynamical system (7), E cannot be negative, i.e., E ≥ 0;
4) There exists a positive constant ν > 0 such that E(ũ) ≤ ν∥ũ∥₂².

Proof. 1) According to (6), the operator T_λ is non-decreasing, so g_i(s) ≥ 0, ∀s ≥ 0, and therefore

G_i(ũ_i) ≥ 0,

where equality holds only if u_i∗ ≥ λ or u_i∗ + ũ_i ≤ −λ. For the second inequality, we first have

T_λ(x) − T_λ(y) ≤ x − y, ∀x ≥ y,

which implies g_i(s) ≤ s, ∀s ≥ 0. Consequently,

G_i(ũ_i) = ∫₀^{ũ_i} g_i(s) ds ≤ ∫₀^{ũ_i} s ds = ũ_i²/2.

2) The time derivative of E gives

Ė(ũ) = (ũ + (Φ^T Φ − I)ã)^T ũ̇.    (10)

Due to the fact that u∗ is constant,

ũ̇ = u̇ = −⌈u + (Φ^T Φ − I)a − Φ^T y⌋^α.    (11)

By definition, u∗ and a∗ are the equilibrium points of the dynamical system (7), which gives

u∗ + (Φ^T Φ − I)a∗ − Φ^T y = 0.    (12)

Plugging (12) into (7), we get

ũ̇ = −⌈ũ + (Φ^T Φ − I)ã⌋^α.    (13)

Consequently, combining equations (10) and (13), we have

Ė = −(ũ + (Φ^T Φ − I)ã)^T ⌈ũ + (Φ^T Φ − I)ã⌋^α = −∥ũ + (Φ^T Φ − I)ã∥_{1+α}^{1+α} ≤ 0.    (14)

3) By the definition of E(ũ) in (9) and the first property,

E(ũ) = (1/2)∥ũ∥₂² + 1^T Φ^T Φ G(ũ) − 1^T G(ũ) ≥ 1^T Φ^T Φ G(ũ) ≥ 0,

where the first inequality uses 1^T G(ũ) ≤ (1/2)∥ũ∥₂². Alternatively, from Lemma 3 the proposed system (7) is Lyapunov stable for any initial condition, so E converges to 0 along the trajectory; since Ė ≤ 0 for all ũ, a negative value of E could never return to 0, i.e., the system would not converge, which is a contradiction. Thus, for the proposed dynamical system (7), E is non-negative, i.e., E ≥ 0.

4) Exploiting G(ũ) ≥ 0 from the first property, one has

E(ũ) ≤ (1/2)∥ũ∥₂² + 1^T (Φ^T Φ) G(ũ).

According to Lemma 6 in the appendix, the eigenvalues of Φ^T Φ are upper bounded, and thus

1^T (Φ^T Φ) G(ũ) ≤ (N(1 + δ_s)/(2s)) ∥ũ∥₂².

Defining by ρ = N/s the signal-to-sparsity rate,

E(ũ) ≤ ((ρ(1 + δ_s) + 1)/2) ∥ũ∥₂².    (15)

Then, by defining ν = (ρ(1 + δ_s) + 1)/2, one can conclude the second inequality.

According to (9) and the third property stated in Lemma 4, one can deduce that E is a positive semi-definite and radially unbounded Lyapunov function. Armed with the second property of the Lyapunov function E, we have the following theorem.

Theorem 3. Under Assumption 1, the dynamical system (7) globally converges to the critical point of (1).

Proof. According to the LaSalle Theorem [38], we can conclude that ũ converges to an invariant subset U_inv of U ≜ {ũ | ũ + (Φ^T Φ − I)ã = 0}. From (14) and (13), it is easy to conclude that Ė = 0 implies ũ̇ = 0, thus every state of U is invariant; consequently U_inv = U. Finally, ũ converges to U, and then a converges to the set of critical points of (1), i.e., a∗; according to Assumption 1, a∗ is unique, so U_inv = U reduces to the singleton {ũ = 0}, or equivalently {a = a∗}.

C. Finite-Time Convergence Property

In this subsection, the convergence property of (7) is considered. Hereafter, we will prove that, for ũ sufficiently close to 0, ∥ũ + (Φ^T Φ − I)ã∥₂² is not singular with respect to ũ.

Lemma 5. There exist a time t_e < ∞ and a positive value κ > 0 such that, when t > t_e, the following inequality is verified:

κ∥ũ∥₂² ≤ ∥ũ + (Φ^T Φ − I)ã∥₂².    (16)
Proof. In order to prove this result, we first establish the relation between ũ and ã. According to Lemma 1, and armed with the result from [35], one can conclude that no switching occurs after a finite time t₁ < ∞. This means that after time t₁, every node u_i(t) has the same sign as u_i∗. The following cases are then considered for the i-th element:
1) If |u_i∗| < λ, we have |ã_i| = |T_λ(ũ_i + u_i∗) − T_λ(u_i∗)| = 0 ≤ |ũ_i|.
2) If |u_i∗| > λ, we have ã_i = ũ_i + u_i∗ − λ · sgn(ũ_i + u_i∗) − u_i∗ + λ · sgn(u_i∗). According to Theorem 3, the proposed system is globally convergent. This implies that |ũ_i| can become arbitrarily small, i.e., for any small η > 0 there exists a time t(η) < ∞ such that |ũ_i(t)| < η, ∀t > t(η). Thus, defining t₂ = t(λ − ϵ) for a small ϵ > 0, we have |ũ_i(t)| < λ − ϵ, ∀i and t > t₂; then ũ_i + u_i∗ and u_i∗ have the same sign, and we have ã_i = ũ_i.
Above all, one can conclude that there exists a time t_e = max{t₁, t₂} < ∞ such that for all t > t_e,

ã_i = ũ_i if i ∈ Γ,    ã_i = 0 if i ∈ Γ^c.
Consequently, one has

∥ũ + (Φ^T Φ − I)ã∥₂² = ∥Φ_Γ^T Φ_Γ ũ_Γ∥₂² + ∥ũ_{Γ^c} − Φ_{Γ^c}^T Φ_Γ ũ_Γ∥₂² ≥ ∥Φ_Γ^T Φ_Γ ũ_Γ∥₂².

Exploiting Assumption 1, one can conclude that the Gram matrix Φ_Γ^T Φ_Γ is nonsingular, thus ∥ũ + (Φ^T Φ − I)ã∥₂² > 0 as long as ∥ũ∥₂² > 0. Furthermore, there exists a small value κ > 0 such that

κ∥ũ∥₂² ≤ ∥ũ + (Φ^T Φ − I)ã∥₂².
Now consider the dynamical system (7) with the soft-thresholding function (6); Theorem 2 can then be proved as follows.

Proof of Theorem 2. According to Lemmas 3, 4 and 5, the following result is straightforward: for t > t_e,

Ė(ũ) = −∥ũ + (Φ^T Φ − I)ã∥_{1+α}^{1+α}
     ≤ −∥ũ + (Φ^T Φ − I)ã∥₂^{1+α}
     ≤ −κ^{(1+α)/2} ∥ũ∥₂^{1+α}
     ≤ −(κ/ν)^{(1+α)/2} (E(ũ))^{(1+α)/2},    (17)

where the first inequality is due to the fact that ∥x∥_{1+α} ≥ ∥x∥₂ for α ∈ (0, 1]. Then, for t > t_e, E(ũ) converges to zero in a finite time denoted t_f > t_e. Finally, we have u = u∗ for all t > t_f, which ends the proof.
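The first inequality in (17) rests on the standard monotonicity of vector p-norms (∥x∥_p ≥ ∥x∥_q for p ≤ q), here with p = 1 + α ≤ 2. A quick numerical sanity check on a random vector (the vector and the α grid are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(50)

def p_norm(x, p):
    # Vector p-norm ||x||_p = (sum |x_i|^p)^(1/p).
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

# ||x||_{1+alpha} >= ||x||_2 for every alpha in (0, 1].
for alpha in (0.25, 0.5, 0.75, 1.0):
    print(alpha, p_norm(x, 1 + alpha) >= p_norm(x, 2) - 1e-12)
```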
D. Convergence Time

According to (17), one can conclude that the trajectory of the Lyapunov function is upper bounded by

E(ũ) ≤ (E₀^{(1−α)/2} − ((1−α)/2) θ^{(1+α)/2} t)^{2/(1−α)},  t ≤ t_f(E₀),    (18)

and when t > t_f(E₀), we have E(ũ) = 0.

It is then not difficult to analyze the convergence time. In particular, according to Theorem 4.2 in [42], the settling-time function t_f can be explicitly derived by exploiting (17) (separating variables and integrating with respect to E and t on both sides),

t_f(E₀) = 2 E₀^{(1−α)/2} / (θ^{(1+α)/2} (1 − α)),    (19)

with E₀ = E(ũ(0)) the initial value of the Lyapunov function and θ = κ/ν.
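The envelope (18) and the settling time (19) are easy to evaluate numerically; the sketch below (with arbitrary example values for E₀, θ, and α) checks that the bound starts at E₀ and vanishes at t_f:

```python
def settling_time(E0, theta, alpha):
    """t_f(E0) from (19): 2 E0^((1-alpha)/2) / (theta^((1+alpha)/2) (1-alpha))."""
    return 2.0 * E0 ** ((1 - alpha) / 2) / (theta ** ((1 + alpha) / 2) * (1 - alpha))

def E_bound(t, E0, theta, alpha):
    """Upper envelope of E from (18); identically zero for t >= t_f(E0)."""
    base = E0 ** ((1 - alpha) / 2) - 0.5 * (1 - alpha) * theta ** ((1 + alpha) / 2) * t
    return max(base, 0.0) ** (2.0 / (1 - alpha))

E0, theta, alpha = 4.0, 0.5, 0.5
tf = settling_time(E0, theta, alpha)
print(E_bound(0.0, E0, theta, alpha))  # equals E0 (up to floating point)
print(E_bound(tf, E0, theta, alpha))   # numerically 0 at the settling time
print(tf)
```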
Fig. 1. Schematic diagram of the convergence rates. The dashed line represents the convergence rate of LCA, i.e., |r_E|; the solid curve represents the convergence rate of the proposed system, i.e., |r_FT|; the dark shaded area represents the equilibrium region of the proposed system, where E = 0.
This means that when t ≥ t_f(E₀), the Lyapunov function E(ũ) exactly equals 0, i.e., (7) is stable, as shown in Fig. 1. Note that the settling time depends on the initial value E₀; moreover, when α → 1 the settling function t_f → +∞, which corresponds to the asymptotic convergence property. For parameter α ∈ (0, 1), we have to consider two different cases:
• when 0 < E₀ ≤ exp(2)/θ, the settling function t_f is monotonically increasing with respect to α;
• when E₀ > exp(2)/θ, the settling function t_f has a minimum value at α = 1 − 2/ln(θE₀).
Consequently, when the state is close to the equilibrium point, a smaller α leads to faster convergence.

On the other hand, regarding equation (15), the settling time also depends on the settings of the sparse recovery problem, i.e., (s, M, N), which determine the RIP constant δ_s and the signal-to-sparsity rate ρ. Apparently, the larger the number of measurements, the smaller the RIP constant δ_s, which leads to a smaller settling time t_f, while a larger signal-to-sparsity rate ρ results in a larger settling time t_f.
E. Convergence Rate

In this subsection, we compare the convergence rates of finite-time and exponential convergence. In order to analyze the convergence property as a counterpart of the exponential convergence rate, the logarithmic form of (18) is analyzed, i.e.,

E(ũ) ≤ e^{(2/(1−α)) log(E₀^{(1−α)/2} − ((1−α)/2) θ^{(1+α)/2} t)}.

The convergence speed can then be evaluated via the slope of the exponent with respect to time t, i.e.,

r_FT(t, α) = −1 / (c₀ c₁^α + (α − 1)t/2),

with c₀ = √(E₀/θ) and c₁ = 1/√(E₀θ). The speed of the corresponding exponential convergence can be obtained directly by setting α = 1, which gives

r_E = −θ.
Considering the convergence rate, apparently r_FT is time-varying, as shown by the red solid curve in Fig. 1. Moreover, when t ≥ (2/(1−α))(E₀^{(1−α)/2} θ^{−(1+α)/2} − θ^{−1}), we have r_FT ≤ r_E, i.e., |r_FT| ≥ |r_E|; namely, the proposed system (7) converges faster than the LCA system.

Moreover, as the evolution time t approaches the settling time t_f, the denominator of r_FT goes to zero, which leads to an infinite value of |r_FT|, i.e., system (7) converges super fast to the equilibrium point, as shown in Fig. 1. In this case, the proposed system (7) is more appropriate for dynamic sparse signals, where consecutive data are close enough that the initial E₀ is sufficiently small, which makes the settling time t_f small enough to guarantee real-time sparse recovery.
On the other hand, the convergence rate is also related to the parameter α, whose influence can be explicitly analyzed in the following cases. When c₁ > 1, increasing α decreases the convergence rate. When 0 < c₁ < 1, there are two cases:
• when t > 2c₀ ln(1/c₁)c₁^α, increasing α decreases the convergence rate;
• when t < 2c₀ ln(1/c₁)c₁^α, increasing α increases the convergence rate.
IV. DISCUSSIONS

The model proposed in this paper is an extension of the LCA proposed in [30], where the ODE of the LCA dynamical system has essentially the same form as the well-known continuous Hopfield neural network (HNN) [43], and Lyapunov functions [38] play a very important role in the convergence analysis. However, the difference between LCA and HNN is also essential. In particular, the activation function is continuous and smooth for HNN, while it is not necessarily smooth for LCA and our proposed system. On the other hand, previous research has rarely focused on the finite-time stability of such networks for autonomous systems (LCA has exponential stability). In this paper, we modified the ODE of the LCA system to introduce the sliding mode technique, and proposed a completely different Lyapunov function (9) to prove our results.
Similarly to the seminal work on LCA, the method proposed in this paper solves the sparse representation problem via a dynamical system composed of many neuron-like elements operating in parallel analog architectures [30]. It is worth remarking that, compared to computation-oriented algorithms, the computational complexity of the proposed method is not actually reduced. Rather, the complexity of the proposed method (as well as of LCA) is transferred to the implementation of analog architectures realized by analog chips. While the algorithm is very efficient once the analog architectures are implemented (e.g., a matrix multiplication result can be obtained in real time), computer-oriented algorithms require tens or hundreds of operations to get the same result. Consequently, LCA-like approaches are more appropriate for real-time applications. On the other hand, our proposed system has finite-time convergence whereas LCA has exponential convergence; consequently, our proposed system can cope with signals varying faster than LCA can handle, as illustrated by Example 1.
Compared to LCA, the complexity of implementing the analog architecture of our proposed dynamical system is
slightly increased due to the fractional exponent and the sign
function. In fact, those terms can be easily realized even
with simple operational amplifiers, for which the basic
functionalities such as multiplication, division,
log, exp and abs already exist. For example, the fractional
exponential operator (such as xα with 0 < α < 1) can be
realized by cascading a logarithm operator and an exponential
operator (xα = exp(α ln x)) [44].
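In software, the same log–exp cascade gives a convenient way to evaluate the signed fractional power ⌈x⌋α = sign(x)|x|α used by the proposed system. A minimal sketch (the function name is ours, for illustration only):

```python
import numpy as np

def frac_power(x, alpha):
    """Signed fractional power: sign(x) * |x|**alpha.

    Realized as the log-exp cascade sign(x) * exp(alpha * ln|x|),
    mirroring the analog realization; x == 0 maps to 0.
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    nz = x != 0
    out[nz] = np.sign(x[nz]) * np.exp(alpha * np.log(np.abs(x[nz])))
    return out

# The cascade agrees with the direct evaluation sign(x) * |x|**alpha.
print(frac_power([-4.0, 0.0, 0.25], 0.5))
```

Note that the zero case is handled separately, exactly as an analog realization would saturate rather than evaluate log(0).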
On the other hand, besides the soft-thresholding activation
function, other types of activation functions introduced in [35] can
also be exploited in the proposed system. The analysis for these
alternatives can be carried out by analogy, where one only has
to reformulate Lemma 4 according to the Appendix of [35];
the relationship between ũ and ã used in Lemma 5 can
then also be derived.
V. SIMULATIONS
In this section, we present several simulations to illustrate
the theoretical results of this paper. Simulations are
carried out in four aspects. First, the global convergence
property of the proposed system is illustrated. Afterwards, we
analyze the number of switches occurring before convergence
of the proposed system. Then, the finite-time
convergence property is addressed. Finally, the effect of α on the
convergence rate is analyzed.
In the following, we respectively exploit the proposed dynamical system and LCA to solve canonical
sparse representation problems. Unless otherwise stated,
the simulations are carried out with the following setting. The
original sparse signals x ∈ RN with N = 200 and sparsity
s = 10 are randomly generated, with nonzero entries
drawn from a normal Gaussian distribution. Afterwards,
measurements y ∈ RM with M = 100 are collected via
random projections, y = Φx + ε, where the measurement matrix
Φ ∈ RM ×N is drawn from a normal Gaussian distribution
(Φ is normalized so that every column has unit norm)
and ε is Gaussian random noise with standard deviation
σ = 0.016. The dynamical equations of LCA and of our proposed
system are simulated through a discrete approximation in
Matlab with a step size of 0.001, and the solver time constant is
chosen as τ = 0.1. The initial state is set to
u(0) = 0 and the threshold value λ = 0.05 for both systems.
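To make the setting concrete, the following Python sketch discretizes both systems with a forward-Euler (ode1-style) scheme under the parameters above. The dynamics used here are an assumption on our part: we take the standard LCA ODE τ u̇ = Φᵀ(y − Φa) + a − u with a = soft(u, λ) from [30], and for the proposed system we pass the same right-hand side elementwise through the signed fractional power ⌈·⌋α (α = 1 recovers LCA); the paper's system (7) may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft(u, lam):
    """Soft-thresholding activation a = T_lambda(u)."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def simulate(Phi, y, alpha=0.5, lam=0.05, tau=0.1, dt=1e-3, T=5.0):
    """Forward-Euler discretization of the assumed dynamics.

    tau * du/dt = F(u), F(u) = Phi^T (y - Phi a) + a - u, a = soft(u, lam);
    alpha < 1 applies the signed fractional power sign(F) * |F|**alpha.
    """
    u = np.zeros(Phi.shape[1])
    for _ in range(int(T / dt)):
        a = soft(u, lam)
        F = Phi.T @ (y - Phi @ a) + a - u
        if alpha < 1.0:
            F = np.sign(F) * np.abs(F) ** alpha
        u = u + (dt / tau) * F
    return soft(u, lam)

# Setting from the paper: N=200, M=100, s=10, unit-norm columns, sigma=0.016.
N, M, s = 200, 100, 10
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)
x = np.zeros(N)
x[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
y = Phi @ x + 0.016 * rng.standard_normal(M)

a_lca = simulate(Phi, y, alpha=1.0)
a_prop = simulate(Phi, y, alpha=0.5)
print(np.linalg.norm(a_lca - x), np.linalg.norm(a_prop - x))
```

Both runs should settle near the same sparse minimizer, with the α = 0.5 run reaching it in fewer equivalent time units, mirroring the comparisons below.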
A. Global Convergence
In this subsection, the global convergence property of our
proposed system is evaluated. Theorem 3 states that the proposed system should converge and recover the solution to the sparse
representation problem (1), which has a unique minimizer.
As shown in Fig. 2, we plot the output a∗ of our proposed
dynamical system (7) after convergence. The comparison is
made to LCA with the same initial condition, and it shows
that our proposed system reaches the same sparse solution
as LCA, with 10 nonzero entries, which correspond to the
nonzero entries of the original sparse signal x.

Fig. 2. Output a∗ of LCA and the proposed system after convergence with α = 0.5 and λ = 0.05.

Fig. 4. Trajectories u44 (t) vs. u10 (t) with 20 different initial conditions via the proposed system with α = 0.5.
On the other hand, we also plot the evolution of several
active nodes and nonactive nodes with respect to time for
LCA and our proposed dynamical system in Fig. 3. The initial
starting points of states u(t) for both systems are identical. It is
shown that every node of both LCA and our proposed system
converges to a fixed point, that the convergent point of each
node is identical for the two systems, and that the nodes of our
proposed system converge much faster than those of LCA.
Fig. 3. Evolution of several active nodes (solid lines) and nonactive nodes (dashed lines) with respect to time for LCA and our proposed dynamical system with α = 0.5.
At last, we evaluate the global convergence property of our
proposed system by plotting the trajectories of two randomly
selected nodes u10 and u44 starting from 20 randomly generated initial points. The result is plotted in Fig. 4, from
which one can clearly see that the solution is attractive for
all of those initial points.
B. Finite Switches
In this subsection, we empirically verify the result of
Lemma 2. A switch occurs when |ui (t)| > λ decreases to
|ui (t)| ≤ λ, or |ui (t)| ≤ λ increases to |ui (t)| > λ. In
our simulation, the ODE (7) is simulated through a discrete
approximation via ode4 with step size 0.001, and 5 seconds
of evolution are implemented to guarantee convergence; thus
the solution trajectory is discretized into 5000 points. Then,
1000 trials are carried out with randomly generated initial
conditions and noises, and for each trial the number of switch
occurrences is counted along the trajectories of all
the nodes over these 5000 discrete points. Finally, we plot
the histogram of the number of switch occurrences, as
shown in Fig. 5(b). This figure illustrates that the number of
switches required by our proposed system before convergence
is finite.
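The switch count described above can be computed directly from a discretized trajectory. A small sketch (the helper name is ours; u_traj is assumed to be a samples × nodes array of sampled states):

```python
import numpy as np

def count_switches(u_traj, lam):
    """Count activation switches along a discretized trajectory.

    A switch occurs whenever a node crosses the threshold band,
    i.e. |u_i| goes from > lam to <= lam or vice versa. u_traj has
    shape (num_samples, num_nodes).
    """
    active = np.abs(np.asarray(u_traj)) > lam  # boolean activation pattern
    # A switch is any change of the pattern between consecutive samples.
    return int(np.sum(active[1:] != active[:-1]))

# Toy trajectory of one node that enters and then leaves the active set.
traj = np.array([[0.0], [0.1], [0.2], [0.03], [0.01]])
print(count_switches(traj, lam=0.05))  # → 2
```

Summing such counts over all nodes of all 1000 trials yields the histograms of Fig. 5.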
Moreover, we also plot the histogram of the number of switches
for LCA as a comparison, as shown in Fig. 5(a). Similarly,
the number of switches required for LCA is also finite. Furthermore, the average number of switches required for LCA is less
than that required for our proposed system. Even so, as
shown in Fig. 6, where the evolutions of the number of active
nodes for LCA and our proposed system are plotted, it is
clear that the number of active nodes converges faster for our
proposed system than for LCA. This implies that, although more
switches occur for the proposed system, the interval between
two contiguous switches is much smaller than that for LCA.
C. Convergence in Finite Time
According to Theorem 2, after some time te > 0, the
proposed system will converge in finite time. As shown in
Fig. 7, the evolutions of state error ũ(t) and the number of the
active nodes with respect to time are put together, where initial
state point u(0) is generated randomly. Instead of the exponential
convergence rate of LCA (which has been proved in [35]), the
proposed system converges considerably faster than LCA, and the
evolution of the state error exhibits a finite-time convergence. On
the other hand, the proposed system can find the correct active
nodes faster than LCA.
In order to verify Theorem 2, simulations with different
settings are carried out, as shown in Fig. 8. We first fix
the sparsity level s = 10, the measurement number M = 100
and the threshold λ = 0.05, and then run the simulation
with various signal lengths N ∈ [200, 400, 600, 800]. The
evolutions of the state error ũ(t) = u(t) − u∗ for both LCA and
the proposed system are plotted in Fig. 8(a). Similarly, the
convergence performances compared to LCA with respect to
the sparsity level s, the measurement number M and the threshold λ
are respectively considered, as shown in Fig. 8(b) to
(d). It is obvious that the proposed system converges much
faster than LCA for different signal lengths, measurement
numbers, sparsity levels and thresholds, and exhibits the finite-time convergence property.

Fig. 5. Histogram of the number of switches required for LCA (a) and the proposed system (b) with α = 0.5 before convergence over 1000 trials.

Fig. 6. Number of active nodes for the proposed system with α = 0.5.

Fig. 7. Evolutions of state error ũ(t) and the number of active nodes with respect to time.

D. Influence of α
In this subsection, the performance with respect to α is
analyzed, where simulations are carried out by ranging α
from 0.2 to 1 (when α = 1 it is equivalent to LCA) with the
other parameters fixed. The results are shown in Fig. 9,
and one can see that the convergence rate decreases as α
increases, which verifies the result in the proof of Theorem
2.
On the other hand, it is worth mentioning that simulations of
the dynamical system might induce oscillations when the parameter
α gets smaller. For instance, in Fig. 9 (left and
middle subfigures), oscillations appear when the ODE is approximated
with low-order ODE solvers, such as ode1
with a fixed time step of 10−3 . This phenomenon is due to the fact
that the function ⌈·⌋α with α < 1 causes numerical
problems when the variables get close
to zero. In numerical simulations, it can be alleviated by either
reducing the time step of the ODE solver or switching to a
higher-order ODE solver. As shown in Fig. 9, the
oscillations disappear when reducing the time step
from 10−3 to 10−4 or replacing the ode1 solver with the ode4 solver,
i.e., the Runge-Kutta method.
Fig. 8. Evolutions of log10 ∥ũ(t)∥22 for LCA (dotted lines) and the proposed dynamical system (7) (solid lines) with α = 0.5 as problem settings are varied with respect to (a) the signal length N , (b) the sparsity level s, (c) the measurement number M and (d) the threshold λ.

Fig. 9. Convergence of log10 ∥u − u∗ ∥22 for the proposed dynamical system with different values of α ∈ [0.2, 0.5, 0.8, 1]. Different ODE solvers are used: (left) ode1 with time step 1e−3, (middle) ode1 with time step 1e−4 and (right) ode4 with fixed time step 1e−3.

VI. EXTENSION TO TIME-VARYING PROBLEMS
In the previous sections, it has been proved that the proposed
dynamical system (7) has the finite-time convergence property
and that, empirically, it converges much faster than LCA. This
property is especially valuable in real applications, where the sparse signals encountered are time-varying,
i.e.,
y(t) = Φx(t) + ϵ(t)
(20)
with y and x being both varying with respect to time.
In order to approximate the time-varying sparse signals x(t),
in [32], a maximum sampling rate and a large gradient step
size are required for convergence. In contrast, this is straightforward in our proposed system, where the only requirement is
to plug the time-varying measurements y(t) into the system
(7) without changing any parameters.
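The plug-in idea above can be sketched in code: the integration loop simply re-reads y(t) at every step. As before, the ODE form is our assumption (standard LCA right-hand side passed through the signed fractional power), and the helper names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

def soft(u, lam):
    """Soft-thresholding activation."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def track(Phi, y_of_t, alpha=0.5, lam=0.05, tau=0.1, dt=1e-3, T=5.0):
    """Run the (assumed) dynamics with time-varying measurements y(t).

    The only change from the static case is that y(t) is re-read at
    every integration step; no parameter is modified.
    """
    u = np.zeros(Phi.shape[1])
    outputs = []
    for k in range(int(T / dt)):
        y = y_of_t(k * dt)                   # plug in the current data
        a = soft(u, lam)
        F = Phi.T @ (y - Phi @ a) + a - u
        F = np.sign(F) * np.abs(F) ** alpha  # signed fractional power
        u = u + (dt / tau) * F
        outputs.append(a.copy())
    return np.array(outputs)

# Toy time-varying target: one slowly oscillating nonzero entry,
# shaped after Example 1 below (noise omitted for brevity).
N, M = 50, 30
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)
x = np.zeros(N)

def y_of_t(t):
    x[7] = np.cos(0.4 * np.pi * t) + 1.5
    return Phi @ x

A = track(Phi, y_of_t, T=5.0)
print(A.shape, A[-1, 7])
```

Since the settling time shrinks with the initial error, the state stays locked onto the slowly moving solution.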
To demonstrate the superiority of our proposed system, a
toy example is given here.
Example 1. A time-varying sparse signal x(t) of length N is
generated with sparsity s = 5, where 4 of the nonzero entries are
drawn randomly and stay constant with respect to time, and
the last nonzero entry varies according to the following
function
x44 (t) = cos(0.4πt) + 1.5
Measurements are then gathered according to (20) with
Gaussian noise of standard deviation σ = 0.016.
The estimations are obtained by evolving both LCA and
our proposed system with α = 0.5 and threshold λ = 0.05,
as shown in Fig. 10. Obviously, LCA cannot track the
signal, while our proposed system successfully tracks
the changes of the signal.

Fig. 10. Estimation of time-varying sparse signals via LCA and the proposed system.

VII. CONCLUSION
In this paper, we proposed a new dynamical system that
solves sparse representation problems and possesses the finite-time convergence property. Compared to LCA, the proposed
system converges to the same equilibrium point but with
much faster convergence, which is very valuable in real-time sparse representation applications.
Moreover, connections between continuous dynamical systems and discrete optimization algorithms for sparse regularized inversion problems have been investigated in [45]. Meanwhile, it is also claimed in [32] that the iterative soft-thresholding algorithm can be considered as a discretized
version of LCA. Thus, future work will focus on
investigating the discretized version of our proposed dynamical
system, which might yield a new sparse representation
algorithm with faster convergence.

APPENDIX
Lemma 6. If the matrix Φ ∈ RM ×N satisfies the s-order
RIP with constant δs , then the eigenvalues of ΦT Φ are upper
bounded by N (1 + δs )/s.
Proof. Denote by S the set of all subsets of {1, ..., N } of size s,
so that |S| = C(N, s) (the binomial coefficient). Let x ∈ RN be an
arbitrary vector with unit norm and denote by XS the set of s-sparse
vectors
XS = {y ∈ RN | yi = 0, ∀i ∈/ Γ and yi = xi , ∀i ∈ Γ, Γ ∈ S}.
Each element of x appears in k = C(N − 1, s − 1) of the s-sparse
vectors in XS , so that
kx = Σy∈XS y.
Then,
k∥Φx∥2 = ∥Σy∈XS Φy∥2 ≤ Σy∈XS ∥Φy∥2
≤ √|S| · √(Σy∈XS ∥Φy∥2^2)
≤ √|S| · √(Σy∈XS (1 + δs )∥y∥2^2) = √(k|S|(1 + δs )),
where the second inequality follows from the Cauchy–Schwarz
inequality and the third from the RIP. Squaring both sides and
dividing by k^2 gives
∥Φx∥2^2 ≤ |S|(1 + δs )/k = N (1 + δs )/s.
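The bound of Lemma 6 can be checked numerically on a small instance, where the RIP constant δs is computed exactly by enumerating all s-column submatrices of Φ (the dimensions are arbitrary illustration choices):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# Small instance so delta_s can be computed exactly by brute force.
M, N, s = 6, 8, 2
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)          # unit-norm columns

delta_s = 0.0
for cols in combinations(range(N), s):
    G = Phi[:, cols].T @ Phi[:, cols]       # s x s Gram matrix
    eig = np.linalg.eigvalsh(G)
    # RIP constant: worst deviation of submatrix eigenvalues from 1.
    delta_s = max(delta_s, abs(eig[0] - 1), abs(eig[-1] - 1))

lam_max = np.linalg.eigvalsh(Phi.T @ Phi)[-1]
bound = N * (1 + delta_s) / s               # Lemma 6 bound
print(lam_max, bound)
assert lam_max <= bound + 1e-12             # the bound holds
```

The brute-force enumeration is only feasible for tiny N, but it confirms the inequality without estimating δs.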
REFERENCES
[1] E. Candès and M. Wakin, “An Introduction To Compressive Sampling,”
IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21–30, mar 2008.
[2] S. Mallat, A wavelet tour of signal processing: the sparse way. Academic press, 2008.
[3] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An Algorithm for
Designing Overcomplete Dictionaries for Sparse Representation,” IEEE
Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, nov
2006.
[4] J. Mairal, M. Elad, and G. Sapiro, “Sparse Representation for Color
Image Restoration.” IEEE Transactions on Image Processing, vol. 17,
no. 1, pp. 53–69, 2008.
[5] X. Xu, X. Wei, and Z. Ye, “DOA Estimation based on Sparse Signal
Recovery Utilizing Weighted-norm Penalty,” IEEE Signal Processing
Letters, vol. 19, no. 3, pp. 155–158, 2012.
[6] M. Lustig, D. Donoho, and J. M. Pauly, “Sparse MRI: The application
of compressed sensing for rapid MR imaging,” Magnetic Resonance in
Medicine, vol. 58, no. 6, pp. 1182–1195, 2007.
[7] Y. Chen, L. Shi, Q. Feng, J. Yang, H. Shu, L. Luo, J.-L. Coatrieux,
and W. Chen, “Artifact Suppressed Dictionary Learning for Low-dose
CT Image Processing,” IEEE Transactions on Medical Imaging, vol. 33,
no. 12, pp. 2271–2292, 2014.
[8] M. E. Tipping, “Sparse Bayesian Learning and the Relevance Vector
Machine,” Journal of Machine Learning Research, vol. 1, pp. 211–244,
2001.
[9] M. Tan, I. W. Tsang, and L. Wang, “Matching pursuit lasso part ii:
Applications and sparse recovery over batch signals,” IEEE Transactions
on Signal Processing, vol. 63, no. 3, pp. 742–753, Feb 2015.
[10] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust Face
Recognition via Sparse Representation,” IEEE Transaction on Pattern
Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[11] E. Candès and T. Tao, “The dantzig selector: Statistical estimation when
p is much larger than n,” The Annals of Statistics, pp. 2313–2351, 2007.
[12] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic Decomposition
by Basis Pursuit,” SIAM Journal on Scientific Computing, vol. 20, no. 1,
pp. 33–61, 1998.
[13] I. Daubechies, M. Defrise, and C. De Mol, “An Iterative Thresholding
Algorithm for Linear Inverse Problems with a Sparsity Constraint,”
Communications on pure and applied mathematics, vol. 57, no. 11, pp.
1413–1457, 2004.
[14] E. J. Candès and T. Tao, “Decoding by Linear Programming,” IEEE
Transaction on Information Theory, vol. 51, no. 12, pp. 4203–4215,
2005.
[15] D. P. Wipf, B. D. Rao, D. P. Wipf, and D. D. Rao, “Sparse Bayesian
Learning for Basis Selection,” IEEE Transactions on Signal Processing,
vol. 52, no. 8, pp. 2153–2164, 2004.
[16] D. Wipf and S. Nagarajan, “Iterative Reweighted ℓ1 and ℓ2 Methods for
Finding Sparse Solutions,” IEEE Journal of Selected Topics in Signal
Processing, vol. 4, no. 2, pp. 317–329, 2010.
[17] L. Yu, H. Sun, J. P. Barbot, and G. Zheng, “Bayesian Compressive
Sensing for Cluster Structured Sparse Signals,” Signal Processing,
vol. 92, no. 1, pp. 259–269, 2012.
[18] L. Yu, H. Sun, G. Zheng, and J. Pierre Barbot, “Model based Bayesian
compressive sensing via Local Beta Process,” Signal Processing, vol.
108, no. 3, pp. 259–271, 2015.
[19] J. Tropp, “Just Relax: Convex Programming Methods for Identifying
Sparse Signals in Noise,” IEEE Transactions on Information Theory,
vol. 52, no. 3, pp. 1030–1051, 2006.
[20] J. A. Tropp and A. C. Gilbert, “Signal Recovery From Random
Measurements Via Orthogonal Matching Pursuit,” IEEE Transactions
on Information Theory, vol. 53, no. 12, pp. 4655–4666, 2007.
[21] D. Needell and J. a. Tropp, “CoSaMP: Iterative signal recovery from
incomplete and inaccurate samples,” Applied and Computational Harmonic Analysis, vol. 26, pp. 301–321, 2009.
[22] J. Wen, Z. Zhou, J. Wang, X. Tang, and Q. Mo, “A sharp condition
for exact support recovery with orthogonal matching pursuit,” IEEE
Transactions on Signal Processing, vol. 65, no. 6, pp. 1370–1382, March
2017.
[23] T. Zhang, “Sparse recovery with orthogonal matching pursuit under rip,”
IEEE Transactions on Information Theory, vol. 57, no. 9, pp. 6215–
6221, 2011.
[24] R. Baraniuk and P. Steeghs, “Compressive Radar Imaging,” in 2007
IEEE Radar Conference. IEEE, apr 2007, pp. 128–133.
[25] T. Hu and D. B. Chklovskii, “Sparse LMS via online linearized bregman
iteration,” in ICASSP 2014. IEEE, 2014, pp. 7213–7217.
[26] Y. Chen, Y. Gu, and A. O. Hero, “Sparse LMS for system identification,”
in ICASSP 2009. IEEE, 2009, pp. 3125–3128.
[27] D. Angelosante, J. Bazerque, and G. Giannakis, “Online adaptive
estimation of sparse signals: Where rls meets the ℓ1 -norm,” IEEE
Transactions on Signal Processing, vol. 58, no. 7, pp. 3436–3447, July
2010.
[28] K. Themelis, A. Rontogiannis, and K. Koutroumbas, “A variational
Bayes framework for sparse adaptive estimation,” IEEE Transactions
on Signal Processing, vol. 62, no. 18, pp. 4723–4736, Sept 2014.
[29] L. Yu, C. Wei, and G. Zheng, “Adaptive Bayesian Estimation with
Cluster Structured Sparsity,” IEEE Signal Processing Letters, vol. 22,
no. 12, pp. 2309–2313, 2015.
[30] C. J. Rozell, D. H. Johnson, R. G. Baraniuk, and B. Olshausen, “Sparse
coding via thresholding and local competition in neural circuits.” Neural
computation, vol. 20, pp. 2526–2563, 2008.
[31] A. Balavoine, J. Romberg, and C. J. Rozell, “Convergence and Rate
Analysis of Neural Networks for Sparse Approximation.” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 9, pp.
1377–1389, sep 2012.
[32] A. Balavoine, C. J. Rozell, and J. Romberg, “Discrete and continuous-time soft-thresholding for dynamic signal recovery,” IEEE Transactions
on Signal Processing, vol. 63, no. 12, pp. 3165–3176, 2015.
[33] L. S. Pontryagin, “The mathematical theory of optimal processes,”
Classics of Soviet Mathematics, ISBN-13: 978-2881240775, 1962.
[34] R. Bellman and R. Kalaba, Dynamic programming and modern control
theory, ser. Academic paperbacks. Academic Press, 1965.
[35] A. Balavoine, C. J. Rozell, and J. Romberg, “Convergence speed of a
dynamical system for sparse recovery,” IEEE Transactions on Signal
Processing, vol. 61, no. 17, pp. 4259–4269, 2013.
[36] A. F. Filippov, Differential equations with discontinuous right-hand
side. Kluwer Academic Publishers, Mathematics and its
Applications, 1988.
[37] J. LaSalle, “An invariance principle in the theory of stability,” in
Differential Equations and Dynamical Systems: Stability and Control,
Academic Press, pp. 277–286, 1967.
[38] H. Khalil, “Nonlinear systems,” Prentice Hall, Upper Saddle River, NJ
07458, 1996.
[39] F. H. Clarke, Y. S. Ledyaev, R. J. Stern, and P. R. Wolenski, “Nonsmooth
analysis and control theory,” Graduate Texts in Mathematics, Springer,
ISBN-13: 978-0387983363, no. 7, 1998.
[40] A. Levant, “Sliding order and sliding accuracy in sliding mode control,”
Int. J. of control, vol. 58, no. 6, pp. 1247–1253, 1993.
[41] L. Yu, J.-P. Barbot, D. Boutat, and D. Benmerzouk, “Observability
Forms for Switched Systems With Zeno Phenomenon or High Switching
Frequency,” IEEE T. Automat. Contr., vol. 56, no. 2, pp. 436–441, 2011.
[42] S. P. Bhat and D. S. Bernstein, “Finite-time stability of continuous
autonomous systems,” SIAM Journal on Control and Optimization,
vol. 38, no. 3, pp. 751–766, 2000.
[43] J. J. Hopfield, “Neurons with graded response have collective computational properties like those of two-state neurons,” Proceedings of the
national academy of sciences, vol. 81, no. 10, pp. 3088–3092, 1984.
[44] G. W. Roberts and V. W. Leung, Design and analysis of integrator-based
log-domain filter circuits. Springer Science & Business Media, 2000.
[45] W. Su, S. Boyd, and E. Candes, “A Differential Equation for Modeling Nesterov’s Accelerated Gradient Method: Theory and Insights,”
Advances in Neural Information Processing Systems, pp. 2510–2518,
2014.