Vai 1999
188 Vai and Prasad
2. MODELING WITH NEURAL NETWORKS

2.1. Multilayer Feed-Forward Neural Networks

Due to the availability of a powerful training algorithm called backpropagation [3], multilayer feed-forward neural networks are the most popular for modeling applications. A multilayer neural network with four layers (one input layer, two hidden layers, and one output layer) used for modeling purposes is shown in Figure 1.

Referring to the notations in Figure 1, $X = (x_1 \cdots x_i \cdots x_m)$ is the input vector; $G = (g_1 \cdots g_j \cdots g_n)$, $H = (h_1 \cdots h_k \cdots h_p)$, and $Y = (y_1 \cdots y_l \cdots y_q)$ are the outputs of the first hidden layer, the second hidden layer, and the output layer, respectively; $u_{ij}$ is the weight between the $i$th input and the $j$th neuron in the first hidden layer; $v_{jk}$ is the weight between the $j$th neuron in the first hidden layer and the $k$th neuron in the second hidden layer; and $w_{kl}$ is the weight between the $k$th neuron in the second hidden layer and the $l$th neuron in the output layer. Bias terms acting like weights on connections from units whose output is always 1 can also be provided to the neurons (not shown in Figure 1). The output of the neural network can be computed as

\[ y_l = \frac{1}{1 + e^{-\gamma_l}}, \tag{1} \]

where

\[ \gamma_l = \sum_{k=1}^{p} h_k w_{kl}, \tag{2} \]

and $p$ is the number of neurons in the second hidden layer. Similarly, the output of the second hidden layer $H$ can be expressed as a function of the output of the first hidden layer $G$, which can in turn be expressed as a function of the input vector $X$.

The backpropagation training algorithm aims to adjust the weights of a feed-forward neural network in order to minimize the sum-squared error of the network, which is defined as

\[ E = \sum_{m=1}^{S} \frac{1}{2} \sum_{l=1}^{q} (d_{ml} - y_{ml})^2, \tag{3} \]

where $S$ is the number of training data, $q$ is the number of output variables, and $d_m = [d_{m1}\ d_{m2} \cdots d_{mq}]$ and $y_m = [y_{m1}\ y_{m2} \cdots y_{mq}]$ are the $m$th desired and calculated output vectors, respectively. This is done by continually changing the values of the weights in the direction of steepest descent with respect to the error function $E$.

Open problems related to the architecture of a multilayer feed-forward neural network in modeling are the number of hidden layers and the
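
The forward computation of Eqs. (1) and (2), the error of Eq. (3), and one steepest-descent update can be sketched as follows. This is only an illustrative sketch under stated assumptions: the hidden layers are assumed to use the same logistic activation as the output layer, bias terms are omitted, only the output-layer weights are updated, and the function names and the learning rate `eta` are hypothetical and do not come from the paper.

```python
import math


def sigmoid(gamma):
    """Logistic activation of Eq. (1): 1 / (1 + exp(-gamma))."""
    return 1.0 / (1.0 + math.exp(-gamma))


def layer(inputs, weights):
    """Weighted sum of Eq. (2) followed by Eq. (1).

    weights[a][b] is the weight from input a to neuron b.
    """
    n_out = len(weights[0])
    return [
        sigmoid(sum(inputs[a] * weights[a][b] for a in range(len(inputs))))
        for b in range(n_out)
    ]


def forward(x, u, v, w):
    """Four-layer network: X -> G -> H -> Y (assumed sigmoid in every layer)."""
    g = layer(x, u)      # first hidden layer
    h = layer(g, v)      # second hidden layer
    return layer(h, w)   # output layer


def sum_squared_error(desired, calculated):
    """Eq. (3): sum over patterns m of 1/2 * sum over outputs l of (d_ml - y_ml)^2."""
    return sum(
        0.5 * sum((d - y) ** 2 for d, y in zip(d_m, y_m))
        for d_m, y_m in zip(desired, calculated)
    )


def output_layer_step(h, w, d, eta=0.5):
    """One steepest-descent update of the output weights for a single pattern.

    Uses dE/dw_kl = -(d_l - y_l) * y_l * (1 - y_l) * h_k, which follows from
    Eqs. (1)-(3); eta is an assumed learning rate.
    """
    y = layer(h, w)
    return [
        [w[k][l] + eta * (d[l] - y[l]) * y[l] * (1.0 - y[l]) * h[k]
         for l in range(len(d))]
        for k in range(len(h))
    ]
```

A full backpropagation implementation would propagate the same error term back through $v_{jk}$ and $u_{ij}$ as well; the sketch stops at the output layer to keep the link to Eqs. (1) to (3) visible.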
Beyond Black-Box Models 189
lems that they are designed to solve, they also provide a framework for constructing special computing architectures to solve specific problems. The recurrent neural networks described here were proposed by Hopfield and are thus often referred to as Hopfield networks [4]. Consider a recurrent neural network of N neurons. If the activation of a neuron is updated according to the equation:

cation of neural computing in microwave engineering problems can be found in [5, 6], which include the use of a neural network system to represent a normalized impedance and admittance chart (Y–Z Smith chart) for design automation and the development of a recurrent neural network controller of the stub-tuning process for impedance matching.
the quantitative functional model of a system into a recurrent neural network, which has been implemented as an automatic mapping process. Unlike the traditional development of a neural network in which its parameters are acquired through training or learning, the parameters of the recurrent neural networks built by this approach are deterministically solved for a given problem. A typical application of such a neural network is to determine a reasonable change of a system after one or more of its variables are changed. This qualitative modeling approach is explained with the optimization of a heterojunction bipolar transistor (HBT) equivalent circuit model as an example [9, 10].

The setup for applying a recurrent neural network to guide a microwave circuit optimization process is shown in Figure 3. The constraints contained in an equivalent circuit model mostly come from the element and circuit properties. In addition, elements may be related to each other by the physical parameter from which they were derived.

A recurrent neural network is developed according to the technique provided in [8] to model the qualitative behavior of a generic circuit element. The relationship V = I × Z, where V and I are the voltage across and the current through an impedance Z, respectively, can be qualitatively represented by Table II. Each row of Table II describes the change of a variable caused by varying the other two variables. Coding the unchanged, increased, and decreased states by binary numbers 00, 10, and 01, respectively, this table is implemented as a 6-neuron recurrent neural network. This is done by formulating a linear programming problem with the 64 constraints (13 consistent states and 51 inconsistent states) according to the recurrent neural network energy equation (6). All consistent states listed in Table II have an energy of 0 while all inconsistent states have energies larger than zero. The resulting neural network is shown in Figure 4.

TABLE II. The Qualitative Behavior of a Generic Circuit Element

V          I          Z
Unchanged  Unchanged  Unchanged
Unchanged  Increased  Decreased
Unchanged  Decreased  Increased
Increased  Increased  Unchanged
Increased  Increased  Decreased
Increased  Increased  Increased
Increased  Unchanged  Increased
Increased  Decreased  Increased
Decreased  Decreased  Unchanged
Decreased  Decreased  Increased
Decreased  Unchanged  Decreased
Decreased  Increased  Decreased
Decreased  Decreased  Decreased

It can be easily shown that the recurrent neural network of Figure 4 can also be used to qualitatively model other relationships such as

Figure 4. The qualitative model of a generic circuit element.
Figure 3. A setup for applying a neural network to guide the optimization process.
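
The state counts quoted for Table II (13 consistent and 51 inconsistent states out of 64) can be reproduced with a short enumeration. The sketch below is illustrative only: it encodes each row of Table II with the 00/10/01 coding described in the text and counts the remaining 6-bit states. The helper `is_consistent` is a hypothetical cross-check that assumes I and Z are positive magnitudes; it does not implement the linear programming formulation or the energy equation (6), neither of which is reproduced in this section.

```python
from itertools import product

# Two-bit coding from the text: unchanged = 00, increased = 10, decreased = 01.
CODE = {"unchanged": (0, 0), "increased": (1, 0), "decreased": (0, 1)}
SIGN = {"unchanged": 0, "increased": 1, "decreased": -1}

# Rows of Table II as (V, I, Z).
TABLE_II = [
    ("unchanged", "unchanged", "unchanged"),
    ("unchanged", "increased", "decreased"),
    ("unchanged", "decreased", "increased"),
    ("increased", "increased", "unchanged"),
    ("increased", "increased", "decreased"),
    ("increased", "increased", "increased"),
    ("increased", "unchanged", "increased"),
    ("increased", "decreased", "increased"),
    ("decreased", "decreased", "unchanged"),
    ("decreased", "decreased", "increased"),
    ("decreased", "unchanged", "decreased"),
    ("decreased", "increased", "decreased"),
    ("decreased", "decreased", "decreased"),
]


def encode(row):
    """Map a (V, I, Z) row to the 6-bit state of the 6-neuron network."""
    return tuple(bit for state in row for bit in CODE[state])


def is_consistent(v, i, z):
    """Qualitative check of V = I * Z, assuming positive I and Z (an assumption).

    If I and Z move in opposite directions, V may go either way or stay put;
    otherwise V must follow whichever of I and Z changed.
    """
    di, dz = SIGN[i], SIGN[z]
    if di * dz < 0:
        return True
    return SIGN[v] == (di if di != 0 else dz)


consistent_states = {encode(row) for row in TABLE_II}
all_states = list(product((0, 1), repeat=6))
inconsistent_states = [s for s in all_states if s not in consistent_states]

# Rows derived from the sign rule, for comparison with Table II.
derived_rows = {
    row for row in product(CODE, repeat=3) if is_consistent(*row)
}
```

The inconsistent set includes both valid codings that violate V = I × Z and states that use the unused two-bit code 11, which is why 64 rather than 27 states enter the constraint count.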
chart in Figure 8 shows the convergence rates of both processes. The neural network guided (NN-guided) process converges to its final solution in less than half of the number of iterations required by the one using a random search.

3.2. Optimization by Learning

A device model allows the designer to predict circuit behaviors according to given circuit parameters. Ideally, a designer would like to use a device model in a reverse direction, which is to determine circuit parameters that produce the desired response. While the need for a reverse model is apparent, a microwave circuit under design and thus its neural network model generally do not have inverse functions. Any attempt to create a reverse model for a microwave circuit unavoidably captures only a portion of the system relations. This is because, in general, a design problem does not have a unique solution. While it presents no problem in the development of a

Figure 8. The convergence rates of the neural network (NN) guided and random search modeling process.
paradigm for the entire design process to be performed in a VLSI processor since the sequential solution-searching optimization routine is replaced by a modified neural network learning process. In other words, all three main components of a neural network-based design process, the training, the modeling, and the solution searching can be implemented in a hardware neural network coprocessor. Such an implementation, provided with an appropriate high-bandwidth I/O, should have a significant performance gain over von Neumann architectures.

Experimental and commercial neural network processors do exist; however, they are not necessarily suitable for the reverse modeling neural network approach since the neural network is required to be modified into a bidirectional model. The hardware implementation, either a custom design VLSI architecture or the adaptation of an available programmable architecture, of the neural network reverse modeling approach is a worthwhile exploration.

An interesting characteristic of a reverse modeling process is that multiple solutions generally exist for a desired circuit property. A technique called design centering is commonly used to improve the yield by selecting a solution to which the circuit property is least sensitive. The qualitative modeling approach apparently allows multiple legitimate moves to be explored in the solution space. The reverse modeling approach typically produces multiple solutions with different initial conditions. These solutions, in theory, are equally good since they are associated with similar summed square errors. However, in practice, it is reasonable to assume that component values cannot be manufactured to be exact. It is thus important to select a solution to which the circuit property has less sensitivity. A straightforward approach to performing design centering with our method is to start the reverse modeling process at a number of initial solutions, which can be generated randomly. The neural network model is then used in the forward direction as a simulation model to determine the highest yield solution. The incorporation of design centering into the reverse modeling process so that a balance can be reached between performance and yield is another worthwhile research direction.

The ideal training data set should have the fewest training patterns that statistically represent the problem on hand. Considerable work has been done for general modeling problems in this area. For example, a rule of thumb that describes a relationship among the number of training patterns, the number of weights to be trained, and the accuracy of classification expected was proposed by Baum and Haussler [15]. On the other hand, the training patterns must also be a good sample of the problem. The application of the number of variables in the model to any standard t statistic and F distribution tables to determine the sample size has been suggested [16].

While too few training patterns will not produce a valid neural network model, too many training patterns will make the learning process unnecessarily complicated. The knowledge of the circuit can be used to direct the collection of training data. One potential research topic is to refine the heuristic rules described in Section 3.3 for collecting training data to represent the problem statistically.

REFERENCES

1. A.H. Zaabab, Q.-J. Zhang, and M. Nakhla, A neural network modeling approach to circuit optimization and statistical design, IEEE Trans Microwave Theory Tech 43 (1995), 1349–1358.
2. C.D. Himmel and G.S. May, Advantages of plasma etch modeling using neural networks over statistical techniques, IEEE Trans Semiconduct Manufact 6 (1993), 103–111.
3. R.P. Lippmann, An introduction to computing with neural nets, IEEE ASSP Mag (1987), 4–22.
4. J.J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proc Natl Acad Sci USA 81 (1984), 3088–3092.
5. M. Vai and S. Prasad, Microwave circuit analysis and design by a massively distributed computing network, IEEE Trans Microwave Theory Tech 43 (1995), 1087–1094.
6. M. Vai and S. Prasad, Automatic impedance matching with a neural network, IEEE Microwave Guided Wave Lett 3 (1993), 353–354.
7. P.H. Ladbrooke, MMIC design: GaAs FETs and HEMTs, Artech House, Boston, MA, 1989.
8. M. Vai and Z. Xu, Representing knowledge by neural networks for qualitative analysis and reasoning, IEEE Trans Knowledge Data Eng 7 (1995), 683–690.
9. M. Vai and S. Prasad, Qualitative modeling heterojunction bipolar transistors for optimization: a neural network approach, Proc 1993 IEEE/Cornell Conf Advanced Concepts in High Speed Semiconductor Devices and Circuits, 1993, pp. 219–227.
10. M. Vai, Z. Xu, and S. Prasad, Semiconductor device modeling by a neural network guided optimization process, Proc 1993 Int Semiconductor Device Research Symp, 1993, pp. 507–510.
11. M. Vai, S. Wu, B. Li, and S. Prasad, Creating neural network based microwave circuit models for analysis and synthesis, Proc 1997 Asian Pacific Symp Microwave Conf, 1997, pp. 853–856.
12. M. Vai, S. Wu, and S. Prasad, Reverse modeling of microwave devices using a neural network approach, Proc 1996 Asian Pacific Symp Microwave Conf, 1996, pp. 951–954.
13. M. Vai, S. Wu, and S. Prasad, Reverse modeling of microwave circuits with bidirectional neural network models, IEEE Trans Microwave Theory Tech 47 (1998), 1492–1494.
14. S. Wu, Reverse modeling using a neural network approach, Doctoral dissertation, Northeastern University, Boston, MA, 1998.
15. E.B. Baum and D. Haussler, What size net gives valid generalization? Neural Comput 1 (1989), 151–160.
16. K. Yale, Preparing the right data diet for training neural networks, IEEE Spectrum 34 (1997), 64–66.
BIOGRAPHIES

Mankuan Vai received the B.S. degree from National Taiwan University, Taipei, Taiwan, in 1979, and the M.S. and Ph.D. degrees from Michigan State University, East Lansing, MI, in 1985 and 1987, respectively, all in electrical engineering. He is currently an associate professor of electrical and computer engineering at Northeastern University, Boston, MA. He has worked and published in microelectronics, computer engineering, and engineering education.

Sheila Prasad received the B.Sc. degree from the University of Mysore, India, and the S.M. and Ph.D. degrees in applied physics from Harvard University. She has been on the faculty at New Mexico State University, Las Cruces, the American University in Cairo, Egypt, and the Birla Institute of Technology and Science, Pilani, India. She is presently on the faculty of the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA. Her areas of research include electromagnetics, microwave semiconductor devices, and circuits. Dr. Prasad is coauthor, with Professor R. W. P. King, of the book Fundamental Electromagnetic Theory and Applications. She is a member of Sigma Xi.