Lagrange Programming Neural Networks
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 39, NO. 7, JULY 1992
Abstract: Most of the neural techniques for constrained optimization follow a penalty principle. However, it has long been well known that the penalty approach exhibits some vital deficiencies. This paper analyzes in detail a class of neural networks appropriate for general nonlinear programming, i.e., problems including both equality and inequality constraints. The methodology is based on the Lagrange multiplier theory in optimization and seeks to provide solutions satisfying the necessary conditions of optimality. The equilibrium point of the network satisfies the Kuhn-Tucker conditions for the problem. No explicit restriction is imposed on the form of the cost function apart from some general regularity and convexity conditions. The stability of the neural networks is analyzed in detail. The transient behavior of the network is simulated, and the validity of the approach is verified with a practical problem: maximum entropy image restoration.

I. INTRODUCTION

... function. Since then, various neural computational techniques have been reported [10], [12], [14], [17], [20], [26], [27]. Most of the techniques, however, follow a direct descent approach and so are suitable only for optimization problems without any constraints. If the problem involves equality and/or inequality constraints (general nonlinear programming), as often occurs in the real world, a penalty approach is usually employed, leading to a penalty function based on which a descent model is constructed. Such an approach, i.e., using a descent algorithm to solve a constrained problem via a penalty function, has been well studied in optimization theory, and many helpful conclusions can be found in a textbook picked up at random (e.g., [1]). Under the condition that good initial information regarding the penalty parameter and the starting point is available, the minimum point of the ...
... constrained programming. Instead of following a direct descent approach on the penalty function, the network looks, if possible, for a point satisfying the first-order necessary conditions of optimality in the state space. There are two classes of neurons in the network, variable neurons and Lagrangian neurons, according to their contribution in searching for an optimal solution. Variable neurons seek a minimum point of the cost function and provide the solution at an equilibrium point, while Lagrangian neurons lead the dynamic trajectory into the feasible region (the collection of all the points meeting the constraints). The Lagrangian neurons are treated on an equal basis with the variable neurons, and the dynamic process is carried out simultaneously on both; in other words, both kinds of neurons possess the same temporal behavior. In most neural models the cost function is embedded into the Lyapunov function of stability, which greatly restricts the applicable functional form. For example, in Hopfield's model a quadratic cost function is usually required. The model discussed in the paper successfully separates the Lyapunov function from the cost function, thus freeing us from specific functional forms and enabling us to concentrate on general principles. The model described here adds to the known repertoire of neural circuits.

The paper is organized as follows. In Section II we first illustrate the method by constructing a neural network for nonlinear programming with equality constraints, and discuss local and global stability of the network. A Lyapunov function is set up during the procedure. In Section III we discuss as special cases the rederivation of back-propagation and quadratic programming; much stronger conclusions are obtained for the latter. Experimental results, including a simple example and the maximum entropy image restoration problem, are presented in Section IV to illustrate the computational capacity of the network. Section V contains some concluding remarks. Appendix I completes the proof of the theorem in Section II, and Appendix II extends the algorithm to general nonlinear programming with both equality and inequality constraints.

The equality constrained nonlinear programming problem is

(ENP)  Minimize f(x)                                         (2.1)
       Subject to h(x) = 0                                   (2.2)

where x = (x_1, x_2, ..., x_n)^T ∈ R^n, f: R^n → R and h: R^n → R^m are given functions, and m ≤ n. The components of h are denoted h_1, ..., h_m. f and h are assumed to be twice continuously differentiable.

Before going further we introduce the regularity condition and some classical results, as a preparation for detailed theoretic analysis. Denote the gradient of a scalar function s(x) as

∇s(x) = (∂s(x)/∂x_1, ..., ∂s(x)/∂x_n)^T

where the superscript T indicates matrix transposition.

Definition 1: Let x* be a vector such that h(x*) = 0. We say that x* is a regular point if the gradients ∇h_1(x*), ..., ∇h_m(x*) are linearly independent.

The Lagrange multiplier approach is based on solving the system of equations which constitute the necessary conditions of optimality for the programming problem.

Definition 2: The Lagrange function L: R^{n+m} → R is defined by

L(x, λ) = f(x) + λ^T h(x)                                    (2.3)

where λ ∈ R^m is referred to as the Lagrange multiplier. Denote

∇²_{xx} L(x, λ) = [∂²L(x, λ)/∂x_i ∂x_j]

and

∇h = [∇h_1, ..., ∇h_m].

We have the following results from classical optimization theory [1, ch. 1].

Proposition 1: Let x* be a local minimum for (ENP) and assume that x* is a regular point. Then there exists a unique vector λ* ∈ R^m such that

∇_x L(x*, λ*) = 0.                                           (2.4)

Proposition 2: Assume that x* ∈ R^n and λ* ∈ R^m satisfy

∇_x L(x*, λ*) = 0,   ∇_λ L(x*, λ*) = 0                       (2.6)

and

z^T ∇²_{xx} L(x*, λ*) z > 0 for any z ≠ 0 with ∇h(x*)^T z = 0.   (2.7)

Then x* is a strict local minimum for (ENP).

The first-order necessary condition of optimality can thus be expressed as the requirement that (x*, λ*) be a stationary point of L(x, λ) over x and λ.
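As a quick numeric illustration of Definition 2 and the first-order condition (2.4) (a sketch on an assumed toy instance, not an example from the paper): take f(x) = x_1² + x_2² and h(x) = x_1 + x_2 - 2. The regular point x* = (1, 1), together with the multiplier λ* = -2, makes the gradient of the Lagrange function vanish.

```python
# Numerical check of the first-order conditions (2.2), (2.4) on a toy
# instance of (ENP): minimize f(x) = x1^2 + x2^2 s.t. h(x) = x1 + x2 - 2 = 0.
# (Hypothetical example chosen for illustration; not from the paper.)

def grad_L(x, lam):
    """Gradient of L(x, lam) = f(x) + lam * h(x) with respect to x."""
    return [2 * x[0] + lam, 2 * x[1] + lam]

def h(x):
    return x[0] + x[1] - 2.0

x_star, lam_star = [1.0, 1.0], -2.0  # analytic solution and multiplier

assert h(x_star) == 0.0                        # feasibility (2.2)
assert grad_L(x_star, lam_star) == [0.0, 0.0]  # stationarity (2.4)
```

The multiplier λ* is unique here because the single constraint gradient ∇h = (1, 1)^T is trivially linearly independent, i.e., x* is a regular point in the sense of Definition 1.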
That is,

∇_x L(x*, λ*) = ∇f(x*) + ∇h(x*) λ* = 0                      (2.8)
∇_λ L(x*, λ*) = h(x*) = 0                                    (2.9)

or, more precisely,

∂f(x*)/∂x_i + Σ_{j=1}^m λ_j* ∂h_j(x*)/∂x_i = 0,   i = 1, 2, ..., n   (2.10)

h_j(x*) = 0,   j = 1, 2, ..., m.                             (2.11)

These stationarity conditions are the basis of our network, whose state equations are

dx/dt = -∇_x L(x, λ)                                         (2.12)
dλ/dt = ∇_λ L(x, λ) = h(x).                                  (2.13)

The two classes of neurons, variable neurons x and Lagrangian neurons λ, are named according to their role in searching for the solution. In the dynamic process of the neural network, Lagrangian neurons lead the trajectory into the feasible region while variable neurons decrease the Lagrangian function L(x, λ). The decrease of the Lagrangian function by x can be verified from the fact that along the trajectory of the network

Σ_{i=1}^n (∂L/∂x_i)(dx_i/dt) = -Σ_{i=1}^n (∂L/∂x_i)² ≤ 0.    (2.20)

Definition 3: The function E(x, λ) is defined by ...

For the local stability analysis the network is linearized at (x*, λ*), giving the matrix

G = [ ∇²_{xx} L(x*, λ*)    ∇h(x*)
      -∇h(x*)^T            0      ].                         (2.22)

It is proved in the Appendix that the real part of each eigenvalue of G is strictly positive, which means that -G is strictly negative definite, and thus (x*, λ*) is asymptotically stable.

Proof: At the limit we have dE(x, λ)/dt = 0, which leads to ∇_x L(x, λ) = 0 due to (2.24) and the positivity of the Hessian. And so, from the neural dynamic equation (2.12), we have dx/dt = -∇_x L(x, λ) = 0; hence x tends to a limit x*.

Corollary 2: Assume further that ∇h_1(x*), ..., ∇h_m(x*) ...

... chaos, starting from an arbitrary point. Thus an optimal solution can always be obtained by setting the initial state ...
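The dynamic process described above can be sketched in a few lines. The problem instance is assumed for illustration (minimize x_1² + x_2² subject to x_1 + x_2 = 2); forward-Euler integration of the state equations dx/dt = -∇_x L and dλ/dt = h(x) drives the state to the Kuhn-Tucker point.

```python
# Forward-Euler sketch of the Lagrange programming network,
#   dx/dt = -grad_x L(x, lam),   dlam/dt = h(x),
# on an assumed toy problem: minimize x1^2 + x2^2 s.t. x1 + x2 - 2 = 0.
x, lam, dt = [0.5, -1.0], 0.3, 0.01

for _ in range(20000):
    gx = [2 * x[0] + lam, 2 * x[1] + lam]  # grad_x L (variable neurons)
    hx = x[0] + x[1] - 2.0                 # h(x)     (Lagrangian neuron)
    x = [x[0] - dt * gx[0], x[1] - dt * gx[1]]
    lam += dt * hx

# The equilibrium is the Kuhn-Tucker point x* = (1, 1), lam* = -2.
assert abs(x[0] - 1.0) < 1e-4 and abs(x[1] - 1.0) < 1e-4
assert abs(lam + 2.0) < 1e-4
```

Note that the trajectory spirals into the equilibrium rather than descending monotonically in f: the λ neuron trades off constraint satisfaction against cost decrease, exactly the behavior analyzed around (2.20) and (2.22).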
1) From Corollary 2 and (2.15) we can see that h(x*) = 0, i.e., the constraint is satisfied. If ∇h_1(x*), ∇h_2(x*), ..., ∇h_m(x*) are not linearly independent, x* is still a local minimum point of E(x, λ). But since E(x*, λ) is flat in the small region around λ*, the state of the network may randomly walk along λ in that region due to the inevitable noise in a physical system. Thus the constraint h(x*) = 0 will not necessarily be true.

2) The local convexity assumption ∇²_{xx} L(x*, λ*) > 0 in Theorem 1 can be weakened if the neural model is modified with the aid of the augmented Lagrangian function [1]. Consider the following problem, equivalent to (ENP):

minimize f(x) + (c/2)|h(x)|²                                 (2.25)
subject to h(x) = 0                                          (2.26)

where c is a scalar and |·| again denotes the Euclidean norm. Then an augmented Lagrangian function can be formed:

L_c(x, λ) = f(x) + λ^T h(x) + (c/2)|h(x)|².                  (2.27)

The modified neural network, which will be referred to afterwards as the augmented Lagrange programming neural network (ALPNN), is specified by the following relations.

State equations:

dx_i/dt = -∂L_c(x, λ)/∂x_i,   i = 1, 2, ..., n               (2.28)
dλ_j/dt = h_j(x),   j = 1, 2, ..., m.                        (2.29)

... except for some special cases (quadratic programming, for example). For most real-world problems, however, it is generally possible to justify from our prior knowledge that the problem is convex and the solution is unique. Thus we can employ the proposed neural network with confidence that it will finally reach the solution from arbitrary initial points.

III. APPLICATIONS

So far we have presented the model in an abstract manner. Now let us consider two applications of the model: rederivation of the back-propagation algorithm and quadratic programming.

3.1. Rederivation of Back-Propagation

The back-propagation algorithm can be rederived using the Lagrange multiplier. Assume that x(t) represents the state vector of the units at instant t and that D(t) is a desired target vector, so that a cost function E can be written in terms of x(t) and D(t). The network constraints are expressed as

x(t + 1) = g(w x(t))                                         (3.1)

where x(0) is the initial state, w is a weight matrix, and g is a differentiable function [28]. The Lagrangian L is

L = E + Σ_t λ^T(t) [x(t + 1) - g(w x(t))].                   (3.3)

It is easy to show that ∇_x L = 0 gives a recursive rule to compute λ(t) and that it is exactly the gradient of the back-propagation algorithm [28].
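This rederivation can be checked numerically. In the hedged sketch below (the two-unit network, weights w, target D, and horizon T are all assumed values, and g = tanh is an assumed choice of activation), setting ∂L/∂x(t) = 0 yields the backward recursion λ(T) = -(x(T) - D), λ(t) = w^T diag(g'(w x(t))) λ(t+1), and -λ(0) reproduces the finite-difference gradient of the cost E = (1/2)|x(T) - D|² with respect to the initial state.

```python
import math

# Lagrangian rederivation of back-propagation on an assumed 2-unit
# network with g = tanh (w, D, x0, T are illustrative values).
w = [[0.4, -0.3], [0.2, 0.5]]
D = [0.3, -0.1]          # desired target for the final state
x0 = [0.9, -0.4]
T = 3                    # number of forward time steps

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def step(x):             # network constraint (3.1): x(t+1) = g(w x(t))
    return [math.tanh(u) for u in matvec(w, x)]

def energy(x_init):      # E = 0.5 * |x(T) - D|^2
    x = x_init
    for _ in range(T):
        x = step(x)
    return 0.5 * sum((x[i] - D[i]) ** 2 for i in range(2))

# Forward pass, keeping the trajectory x(0), ..., x(T).
traj = [x0]
for _ in range(T):
    traj.append(step(traj[-1]))

# Backward recursion obtained from dL/dx(t) = 0:
#   lam(T) = -(x(T) - D),   lam(t) = w^T diag(g'(w x(t))) lam(t+1).
lam = [-(traj[T][i] - D[i]) for i in range(2)]
for t in range(T - 1, -1, -1):
    gp = [1.0 - math.tanh(u) ** 2 for u in matvec(w, traj[t])]
    v = [gp[i] * lam[i] for i in range(2)]
    lam = [sum(w[i][j] * v[i] for i in range(2)) for j in range(2)]

# -lam(0) should equal the gradient of E with respect to x(0).
eps = 1e-6
for j in range(2):
    xp = list(x0); xp[j] += eps
    xm = list(x0); xm[j] -= eps
    fd = (energy(xp) - energy(xm)) / (2 * eps)
    assert abs(fd - (-lam[j])) < 1e-6
```

The recursion visits each layer once, which is exactly the backward sweep of standard back-propagation; the multiplier neurons simply store the propagated error.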
3.2. Quadratic Programming

a) Equality Constrained Quadratic Programming: The problem has the form

minimize f(x) = (1/2) x^T Q x + p^T x
subject to h(x) = Ax - b = 0

with Q symmetric positive definite. Following the discussion in Section II, the neural network is described by the equations

dx/dt = -(Qx + p + A^T λ)                                    (3.4a)
dλ/dt = Ax - b                                               (3.4b)

or, in component form,

dx_i/dt = -( Σ_{j=1}^n q_ij x_j + p_i + Σ_{j=1}^m a_ji λ_j ),   i = 1, ..., n   (3.5a)
dλ_j/dt = Σ_{k=1}^n a_jk x_k - b_j,   j = 1, ..., m          (3.5b)

where Q = (q_ij), A = (a_ij), and p_i and b_j are the components of p and b, respectively. Since in this circumstance ∇²_{xx} L(x, λ) = Q > 0 independent of x and λ, the network is globally stable and exhibits a unique equilibrium point which provides the optimal solution to the problem.

b) Inequality Constrained Quadratic Programming: The neural networks for quadratic programming with both equality and inequality constraints can easily be obtained following the discussion in Appendix II, but we have no intention of exploring further in that direction. Instead, let us consider neural networks for problems with only inequality constraints, so as to provide a comparison with those discussed above. The problem has the form

minimize f(x) = (1/2) x^T Q x + p^T x
subject to g(x) = Ax - b ≤ 0.

All the parameters are defined similarly to those for equality constraints. Following the discussion in Appendix II, the neural network is described by the relations

dx_i/dt = -( Σ_{j=1}^n q_ij x_j + p_i + Σ_{k=1}^m a_ki μ_k ),   i = 1, ..., n   (3.6a)
dy_j/dt = -2 μ_j y_j,   j = 1, ..., m                        (3.6b)
dμ_j/dt = Σ_{k=1}^n a_jk x_k + y_j² - b_j,   j = 1, ..., m.  (3.6c)

The network is also globally stable and has a unique equilibrium point, following the fact that ∇²_{xx} L(x, λ) = Q > 0, provided that the strict complementarity condition of Theorem 4 holds in the dynamic domain.

Comparing the network (3.6) with that defined by (3.5), we can see that the two kinds of networks display an interchangeable structure. In hardware implementation, the network for inequality constrained problems can be realized by simply "switching on" special neurons representing the additional variables y_j in the network for the corresponding equality constrained problem. These neurons are constructed as follows: the input of neuron j is connected to μ_j, the output is y_j, and it contains a self-feedback by (3.6b).

IV. COMPUTER SIMULATIONS

In this section we will first present a simple example to illustrate the validity of the approach in solving optimization problems. While this is a simple example, it does serve to indicate the fundamental idea behind the approach taken, and it helps to obtain a deep insight into the transient behavior of the network described previously. The efficacy of the augmented Lagrangian function in accelerating convergence is also discussed. The capability of the network in dealing with realistic problems is then demonstrated with the application of maximum entropy image restoration.

4.1. A Simple Example

The example is to minimize a quadratic function with equality constraints.

Example 1:

minimize f(x)

subject to

h_1(x) = x_1 + x_2 + 2x_3 - 2 = 0
h_2(x) = x_1 - x_2 = 0.

The neural network for this problem can easily be derived from (3.5). We have studied the transient behavior of the circuit from various initial states by simulating the dynamic equations via the classical fourth-order Runge-Kutta method [18] on a digital computer. The network converged to the global minimum point x* = (0, 0, 1), which is the theoretical solution. Fig. 1(a) shows a typical transient process of x_1 corresponding to the initial condition x = (-1.0, 2.0, -1.5), λ = (0.6, -0.3).

4.2. The Effect of the Penalty Parameter

Let us now examine the effect of the penalty parameter c in ALPNN. In Section 2.3 it was shown that the optimization problem can be convexified by using an augmented Lagrangian function, leading to the ALPNN model. The condition for stability can be weakened in this way. In addition, ALPNN has the advantage of accelerating the convergence. Fig. 1(b) shows the transient behavior of ALPNN on Example 1, with c = 1.0 and the same initial condition as in Fig. 1(a). Much improved behavior can be observed. However, little difference is found when c is further increased. This phenomenon can be understood from an analysis of the effect of the penalty parameter c. The effect of c is to introduce a penalty ingredient into the objective function for any violation of the constraints. When the initial state of the network is set at a random point, it is very likely that the constraints are considerably broken. Thus a large value of c forces the state to approach the feasible region quickly. While after the state ...
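To make the slack-neuron mechanism of (3.6) concrete, here is a forward-Euler sketch on an assumed one-dimensional instance (not an example from the paper): minimize x² subject to 1 - x ≤ 0, i.e. Q = [2], p = 0, A = [-1], b = [-1], whose Kuhn-Tucker point is x* = 1, μ* = 2, y* = 0.

```python
# Euler sketch of the inequality-constrained network (3.6) on an assumed
# 1-D problem: minimize x^2 subject to g(x) = 1 - x <= 0
# (Q = [2], p = 0, A = [-1], b = [-1]); KKT point: x* = 1, mu* = 2, y* = 0.
x, y, mu, dt = 0.0, 0.5, 0.0, 0.005

for _ in range(40000):
    dx = -(2.0 * x + (-1.0) * mu)       # (3.6a): -(Qx + p + A^T mu)
    dy = -2.0 * mu * y                  # (3.6b): slack neuron self-feedback
    dmu = (-1.0) * x + y * y - (-1.0)   # (3.6c): Ax + y^2 - b
    x, y, mu = x + dt * dx, y + dt * dy, mu + dt * dmu

assert abs(x - 1.0) < 1e-3   # constraint active at the optimum
assert abs(mu - 2.0) < 1e-3  # multiplier satisfies 2x* - mu* = 0
assert abs(y) < 1e-2         # slack vanishes on the active constraint
```

Because the constraint is active at the optimum and μ* = 2 > 0, strict complementarity holds, so Theorem 4's hypotheses are met and the slack neuron y settles to zero exactly as the "switching on" description above suggests.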
[Fig. 1. Transient process of x_1 from the same initial condition: (a) the Lagrange programming network; (b) ALPNN with c = 1.0. Horizontal axis: time.]

4.3. Maximum Entropy Image Restoration

While the maximum entropy method has been demonstrated to be extremely powerful in optical image enhancement and restoration, its applications are quite restricted simply because of its intensive computational complexity. However, in what follows we can see that the neural network provides a promising computational alternative.

Image degradation in a linear system is generally characterized as [23]

y = [H]X + V                                                 (4.1)

where H is the point-spread function, X the original object distribution, V the noise, and y the recorded image.

Given the recorded image y and the system feature [H], the restoration problem is to find an acceptable solution X, with or without knowledge of the noise ...
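The restoration task maps directly onto the (ENP) form: maximum entropy restoration is conventionally posed as minimizing Σ_i X_i ln X_i subject to the degradation constraints. The following sketch uses an assumed miniature noise-free system (H = [1, 1], y = 1, two "pixels"; illustrative values, not from the paper's experiments), for which the maximum entropy solution is X* = (0.5, 0.5) with λ* = ln 2 - 1.

```python
import math

# LPNN sketch of maximum entropy restoration on an assumed tiny system:
# minimize sum_i x_i ln x_i  subject to  Hx = y, with H = [1, 1], y = 1.
# The maximum entropy solution is x* = (0.5, 0.5), lam* = ln 2 - 1.
x, lam, dt = [0.3, 0.9], 0.0, 0.005

for _ in range(40000):
    gx = [math.log(x[i]) + 1.0 + lam for i in range(2)]   # grad_x L
    residual = x[0] + x[1] - 1.0                          # Hx - y
    # Clamp keeps the state in the domain of ln (a numerical safeguard,
    # standing in for the positivity the analog circuit maintains).
    x = [max(x[i] - dt * gx[i], 1e-9) for i in range(2)]
    lam += dt * residual

assert abs(x[0] - 0.5) < 1e-3 and abs(x[1] - 0.5) < 1e-3
assert abs(lam - (math.log(2.0) - 1.0)) < 1e-3
```

Since the Hessian of the entropy objective is diag(1/x_i) > 0 on the positive orthant, the problem is convex and the global stability results of Section II apply; the same dynamics scale to image-sized X with H a convolution matrix, which is what makes the network attractive for this application.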
V. CONCLUSION

A novel kind of neural network for constrained optimization has been analyzed based on the well-known Lagrange multiplier theory. The equilibrium point of the circuit is shown to correspond to the Lagrange solution of the problem and is asymptotically stable under the regularity and convexity conditions. A Lyapunov function is established for the global stability analysis. The model is then successfully extended to general nonlinear programming with the aid of additional variables. There is no explicit restriction imposed on the functional form. The model exhibits remarkable capability in solving general nonlinear programming. Although it is primarily proposed for nonlinear programming, we believe that the model can be utilized for other applications such as associative memory; it even provides much more flexibility, since the position of the attraction points in the state space can now be adjusted by the Lagrangian neurons. Detailed analysis is still an open problem.

The division into variable neurons and Lagrangian neurons is somewhat artificial from a purely theoretical viewpoint. There is actually no difference in the temporal behavior of the neurons, and the dynamic process is ...

APPENDIX I

... positive definite. This claim can be found in [1, ch. 4], but for reference convenience we present the proof here.

We denote by ȳ the complex conjugate of a complex vector y and by Re(p) the real part of a complex number p. Let α be an eigenvalue of G, and let (z, w) ≠ (0, 0) be a corresponding eigenvector, where z and w are complex vectors of dimension n and m, respectively. We have

G (z, w)^T = α (z, w)^T                                      (a1.1)

and hence

Re{ (z̄^T, w̄^T) G (z, w)^T } = Re(α)(|z|² + |w|²)             (a1.2)

while at the same time, using the definition of G,

Re{ (z̄^T, w̄^T) G (z, w)^T } = Re{ z̄^T ∇²_{xx} L(x*, λ*) z + z̄^T ∇h(x*) w - w̄^T ∇h(x*)^T z }.   (a1.3)

Since for any real n × m matrix M we have

Re{ z̄^T M w } = Re{ w̄^T M^T z }                              (a1.4)
it follows from (a1.2) and (a1.3) that

Re{ z̄^T ∇²_{xx} L(x*, λ*) z } = Re{ (z̄^T, w̄^T) G (z, w)^T } = Re(α)(|z|² + |w|²).   (a1.5)

Since for any positive definite matrix A we have

Re{ z̄^T A z } > 0 for z ≠ 0                                  (a1.6)

it follows from (a1.5) and the positive definiteness assumption on ∇²_{xx} L(x*, λ*) that either Re(α) > 0 or else z = 0. But if z = 0, the equation

G (z, w)^T = α (z, w)^T                                      (a1.7)

yields

∇h(x*) w = 0.                                                (a1.8)

Since ∇h(x*) has rank m, it follows that w = 0. This contradicts our earlier assumption that (z, w) ≠ (0, 0). Consequently we must have Re(α) > 0. Thus the real part of each eigenvalue of G is strictly positive, and hence G is positive definite.

APPENDIX II

We extend the design principle above to general nonlinear programming, i.e., problems including both equality and inequality constraints. The problem is initially converted into an equivalent one that includes only equalities, so that the discussion in the previous section can be conveniently applied.

A universal form of the problem can be expressed as

(GNP)  Minimize f(x)
       Subject to h(x) = 0,  g(x) ≤ 0                        (a2.1)

where f: R^n → R, h: R^n → R^m, and g: R^n → R^r are given functions with m ≤ n. The components of h and g are denoted by h_1, ..., h_m and g_1, ..., g_r, respectively. We assume that f, h, and g are twice continuously differentiable and that g(x) is finite in the domain.

Initially, the definition of a regular point has to be generalized. For any x satisfying the constraint g(x) ≤ 0, we denote

A(x) = { j | g_j(x) = 0, j = 1, ..., r }                     (a2.2)

i.e., A(x) is the subset of indexes corresponding to the inequality constraints active at x.

Definition 4: Let x* be a vector such that h(x*) = 0 and g(x*) ≤ 0. We say that x* is a regular point if the gradients ∇h_1(x*), ..., ∇h_m(x*) and ∇g_j(x*), j ∈ A(x*), are linearly independent.

Proposition 3: Let x* be a local minimum for (GNP) and assume that x* is a regular point. Then there exist unique vectors λ* ∈ R^m, μ* ∈ R^r such that

∇_x L(x*, λ*, μ*) = 0                                        (a2.4)

μ_j* ≥ 0,  μ_j* g_j(x*) = 0,  j = 1, ..., r.                 (a2.5)

Corresponding to the terminology of optimization theory, we define the following.

Definition 6: A Kuhn-Tucker point of the optimization problem is defined as a point (x*, λ*, μ*) satisfying the conditions below:

∂f(x*)/∂x_i + Σ_{j=1}^m λ_j* ∂h_j(x*)/∂x_i + Σ_{k=1}^r μ_k* ∂g_k(x*)/∂x_i = 0,  i = 1, ..., n   (a2.6a)
h_j(x*) = 0,  j = 1, ..., m                                  (a2.6b)
g_k(x*) ≤ 0,  k = 1, ..., r                                  (a2.6c)
μ_k* ≥ 0,  k = 1, ..., r                                     (a2.6d)
μ_k* g_k(x*) = 0,  k = 1, ..., r.                            (a2.6e)

Our aim now is to design a neural network that will always converge to a Kuhn-Tucker point of the problem. We transfer the inequality constraints into equalities by introducing additional variables y_j, j = 1, ..., r, and consider the following nonlinear programming problem that involves exclusively equality constraints:

Minimize f(x)
Subject to h_1(x) = ··· = h_m(x) = 0                         (a2.7)
           g_1(x) + y_1² = ··· = g_r(x) + y_r² = 0.          (a2.8)

The term y_j² may be replaced by any differentiable positive function of y_j with a suitable dynamic range, but for simplicity y_j² is adopted in what follows.

It has been proved in nonlinear programming theory that x* is a local (global) minimum point of (GNP) if and only if (x*, y_1*, ..., y_r*), where y_j* = sqrt(-g_j(x*)), j = 1, 2, ..., r, is a local (global) minimum of the transformed problem [1, ch. 1].

The Lagrangian function L(x, y, λ, μ) given below is based on this conversion.

Definition 7:

L(x, y, λ, μ) = f(x) + Σ_{j=1}^m λ_j h_j(x) + Σ_{k=1}^r μ_k [g_k(x) + y_k²].

The state equations of the network are

dx/dt = -∇_x L(x, y, λ, μ)                                   (a2.10a)
dy/dt = -∇_y L(x, y, λ, μ)                                   (a2.10b)
dλ/dt = ∇_λ L(x, y, λ, μ)                                    (a2.10c)
dμ/dt = ∇_μ L(x, y, λ, μ)                                    (a2.10d)

or, in component form,

dx_i/dt = -∂f/∂x_i - Σ_{j=1}^m λ_j ∂h_j/∂x_i - Σ_{k=1}^r μ_k ∂g_k/∂x_i,   i = 1, 2, ..., n   (a2.11a)
dy_k/dt = -2 μ_k y_k,   k = 1, ..., r                        (a2.11b)
dλ_j/dt = h_j(x),   j = 1, 2, ..., m                         (a2.11c)
dμ_k/dt = g_k(x) + y_k²,   k = 1, ..., r                     (a2.11d)

where f(x, y), h_j(x, y), and g_k(x, y) are twice continuously differentiable, due to the fact that f, h_j, and g_k are twice continuously differentiable. From Proposition 1 we have ...
The transient behavior of the neural network seems quite intricate at first sight, so two questions naturally arise: 1) Does the steady state of the system provide the optimal solution, or more precisely, the Lagrange solution to the problem? 2) Under what conditions does the network converge locally or globally? Theorem 3 makes certain that the equilibrium point of the neural network is always a Kuhn-Tucker point of the problem, thus justifying the optimality of the solution.

Theorem 3: Let (x*, y*, λ*, μ*) be the equilibrium of the neural network and assume that x* is a regular point. Then (x*, y*, λ*, μ*) is a Kuhn-Tucker point of the problem (GNP).

Proof: Comparing Definition 6 with the neural dynamic equations (a2.11), what remains to be proved is that μ_j* g_j(x*) = 0 and μ_j* ≥ 0, j = 1, ..., r.

Consider the equations

-2 μ_k* y_k* = 0                                             (a2.12)
g_k(x*) + y_k*² = 0.                                         (a2.13)

If μ_k* ≠ 0, then from (a2.12) y_k* = 0, and so is g_k(x*) from (a2.13). Suppose that g_k(x*) ≠ 0; then y_k* = sqrt(-g_k(x*)) ≠ 0, and thus μ_k* = 0 from (a2.12). These discussions confirm that μ_k* g_k(x*) = 0 holds in all cases.

Furthermore, at the equilibrium the admissible variations satisfy

∇h(x*)^T w = 0,   ∇g_j(x*)^T w + 2 y_j* v_j = 0,   j = 1, ..., r   (a2.18)

where

D = diag(2μ_1*, ..., 2μ_r*)                                  (a2.19)

is the y-block of the Hessian of L. For every j with y_j* = 0, we may choose w_j = 0, v_j ≠ 0, and v_k = 0 for k ≠ j in (a2.17) to obtain μ_j* ≥ 0. For the j's with y_j* ≠ 0 we already showed that μ_j* = 0. This completes the proof.

Theorem 4 illustrates that the Lagrange solution of the problem is an asymptotically stable point of the network.

Theorem 4: Let (x*, y*, λ*, μ*) be a stationary point of L(x, y, λ, μ) such that ∇²_{xx} L(x*, y*, λ*, μ*) > 0. Assume that x* is a regular point and that the strict complementarity condition {μ_j* > 0 if j ∈ A(x*)} holds. Then (x*, y*, λ*, μ*) is an asymptotically stable point of the neural network.

Proof: We consider only those y_j with j ∈ A(x*). For those y_j with j not in A(x*), g_j(x*) ≠ 0, and thus μ_j* = 0 according to Proposition 3. From the neural network dynamic equations (a2.11) we can see that y_j, μ_j have no effect on the other circuit variables in an infinitesimal region around (y_j*, μ_j*); they are isolated from the other neurons and constitute a stable self-loop. Thus they can be omitted without affecting our conclusion.
The linearized system at the equilibrium point (x*, y*, λ*, μ*) is obtained as

dz/dt = -G z                                                 (a2.20)

with

G = [ ∇²_{xx} L(x*, y*, λ*, μ*)    ∇_x h(x*, y*)
      -∇_x h(x*, y*)^T             0              ].         (a2.21)

G is strictly positive definite from the strict complementarity condition, as well as the strict positivity of ∇²_{xx} L(x*, y*, λ*, μ*). Hence -G is strictly negative definite, and the point (x*, y*, λ*, μ*) is asymptotically stable.

Theorem 5: Suppose that the Hessian ∇²_{xx} L(x, λ, μ) is positive definite and that the strict complementarity condition holds in the dynamic domain. Then the network is globally Lyapunov stable.

Proof: It may well happen at the start that μ_k < 0 for some k. From (a2.11b) it is easy to verify that y_k² is then always increasing (at a rate proportional to -μ_k). Thus after some finite time dμ_k/dt = g_k(x) + y_k² ≥ 0, and μ_k will be increasing until μ_k ≥ 0. However, if μ_k → 0, then dy_k/dt → 0; hence (y_k, μ_k) have already settled down to an equilibrium (y_k*, 0) and are not involved in the transient behavior of the other neurons. Consequently, after a finite period we only need to consider the remaining circuit with μ_{k_j} > 0, k_j ∈ A(x*), j = 1, 2, ..., J.

For this remaining circuit we define a function E(x, y, λ, μ) following the terminology in the proof of Theorem 4. Notice that the Hessian now becomes

blockdiag( ∇²_{xx} L(x*, y*, λ*, μ*), 2μ_{k_1}, ..., 2μ_{k_J} )

where k_j ∈ A(x*), j = 1, ..., J.
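The contraction argument behind (a2.20) and (a2.21) can be illustrated numerically. The matrices below are assumed sample values, not from the paper: with a positive definite upper-left block H and a full-rank off-diagonal block B, every eigenvalue of G = [[H, B], [-B^T, 0]] has strictly positive real part, so the linearized flow dz/dt = -Gz decays to the origin.

```python
# Numerical illustration of the stability argument: for
#   G = [[H, B], [-B^T, 0]]
# with H positive definite and B of full column rank (assumed sample
# values below), the real part of each eigenvalue of G is positive,
# so Euler integration of dz/dt = -G z contracts toward the origin.
G = [[2.0, 0.5, 1.0],    # H = [[2.0, 0.5], [0.5, 1.0]]  (positive definite)
     [0.5, 1.0, 1.0],    # B = [[1.0], [1.0]]            (rank 1)
     [-1.0, -1.0, 0.0]]
z, dt = [1.0, -1.0, 0.5], 0.01

norm0 = sum(c * c for c in z) ** 0.5
for _ in range(5000):
    Gz = [sum(G[i][j] * z[j] for j in range(3)) for i in range(3)]
    z = [z[i] - dt * Gz[i] for i in range(3)]

# After t = 50 the state has decayed by many orders of magnitude.
assert sum(c * c for c in z) ** 0.5 < 1e-6 * norm0
```

Note that G itself is not symmetric, so positive definiteness here means Re{z̄^T G z} > 0 in the sense of the Appendix I proof; the skew coupling through B produces the damped spiral trajectories seen in the simulations of Section IV.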
REFERENCES

[1] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. New York: Academic, 1982.
[2] J. B. Dennis, Mathematical Programming and Electrical Networks. New York: Wiley, 1959.
[3] J. C. Platt and A. Barr, "Constrained differential optimization," presented at the 1987 Neural Information Processing Systems Conf.
[4] J. C. Platt and A. Barr, "Constraint methods for flexible models," Computer Graphics, vol. 22, pp. 279-288, 1988.
[5] J. C. Platt, "Constraint methods for neural networks and computer graphics," Ph.D. dissertation, California Institute of Technology, 1989.
[6] D. P. Atherton, Stability of Nonlinear Systems. New York: Research Studies Press, 1981.
[7] S. Vajda, Linear and Nonlinear Programming. London: Longman, 1974.
[8] W. Penfield and H. Jasper, Epilepsy and the Functional Anatomy of the Human Brain. London: J. and A. Churchill, 1954.
[9] M. P. Kennedy and L. O. Chua, "Neural networks for nonlinear programming," IEEE Trans. Circuits Syst., vol. 35, pp. 554-562, May 1988.
[10] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Natl. Acad. Sci., vol. 81, pp. 3088-3092, 1984.
[11] D. W. Tank and J. J. Hopfield, "Simple 'neural' optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit," IEEE Trans. Circuits Syst., vol. CAS-33, pp. 533-541, May 1986.
[12] A. Rodriguez-Vazquez, R. Dominguez-Castro, A. Rueda, J. L. Huertas, and E. Sanchez-Sinencio, "Nonlinear switched-capacitor 'neural' networks for optimization problems," IEEE Trans. Circuits Syst., vol. 37, pp. 384-398, Mar. 1990.
[13] L. O. Chua and G. N. Lin, "Nonlinear programming without computation," IEEE Trans. Circuits Syst., vol. CAS-31, pp. 182-188, Feb. 1984.
[14] L. O. Chua and L. Yang, "Cellular neural networks: Theory," IEEE Trans. Circuits Syst., vol. 35, pp. 1257-1272, Oct. 1988.
[15] O. Barkan, W. R. Smith, and G. Persky, "Design of coupling resistor networks for neural network hardware," IEEE Trans. Circuits Syst., vol. 37, pp. 756-765, June 1990.
[16] P. A. Cook, Nonlinear Dynamic Systems. London: Prentice-Hall International, 1986.
[17] C. R. K. Marrian and M. C. Peckerar, "Electronic 'neural' net algorithm for maximum entropy solutions of ill-posed problems," IEEE Trans. Circuits Syst., vol. 36, pp. 288-293, Feb. 1989.
[18] L. V. Atkinson, P. J. Harley, and J. D. Hudson, Numerical Methods with FORTRAN 77. Wokingham, England: Addison-Wesley, 1989.
[19] T. E. Stern, Theory of Nonlinear Networks and Systems. New York: Addison-Wesley, 1965.
[20] A. J. Agranat, C. F. Neugebauer, R. D. Nelson, and A. Yariv, "The CCD neural processor: A neural network integrated circuit with 65536 programmable analog synapses," IEEE Trans. Circuits Syst., vol. 37, pp. 1073-1075, Aug. 1990.
[21] H. Yanai and Y. Sawada, "Integrator neurons for analog neural networks," IEEE Trans. Circuits Syst., vol. 37, pp. 854-856, June 1990.
[22] J. A. Feldman, "Connectionist models and parallelism in high level vision," in Human and Machine Vision II. New York: Academic, 1985, pp. 86-108.
[23] T. S. Huang, Picture Processing and Digital Filtering. Berlin: Springer-Verlag, 1979.
[24] Y. Zhou, R. Chellappa, and B. K. Jenkins, "Image restoration using a neural network," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-36, pp. 1141-1150, July 1988.
[25] J. K. Paik and A. K. Katsaggelos, "Image restoration using the Hopfield network with nonzero autoconnection," in Proc. ICASSP 90, pp. 1909-1912, 1990.
[26] Y. Yao and Q. Yang, "Programming neural networks: A dynamic-static model," in Proc. IJCNN, pp. I-345-348, 1989.
[27] B. Kosko, "Bidirectional associative memories," IEEE Trans. Syst., Man, Cybern., vol. 18, pp. 49-60, Jan./Feb. 1988.
[28] Y. LeCun, "A theoretical framework for back-propagation," in D. Touretzky, G. Hinton, and T. Sejnowski, Eds., Proc. 1988 Connectionist Models Summer School, pp. 21-28, CMU, 1988.

Shengwei Zhang received the B.Sc. and M.Sc. degrees in electrical engineering from Tsinghua University, Beijing, People's Republic of China, in 1984 and 1987, respectively.

Between 1989 and 1990 he was an academic visitor with the Department of Electrical Engineering at Imperial College, London. From October 1990 to December 1991 he was a research fellow with the Department of Electrical Engineering at Brunel University, London. He is currently a research engineer with Expervision Inc. in California. He has published a number of papers on neural networks, image processing, and pattern recognition. His current interests include neural networks, optical character recognition, nonlinear optimization, and real-time learning.

A. G. Constantinides is a professor of signal processing at Imperial College, and he is also Head of the Signal Processing Section in the Department of Electrical Engineering. His current interests are in image and speech processing in neural networks, optimization techniques, communication theory and practice, and adaptive signal processing. He has published widely in the above areas. The Signal Processing Laboratory includes many other research activities under his direction.

Dr. Constantinides is a Fellow of the Institution of Electrical Engineers and an honorary member of several international professional organizations.