Wavelet Networks For Nonlinear System Modeling
DOI 10.1007/s00521-006-0069-3
ORIGINAL ARTICLE
Received: 5 November 2005 / Accepted: 31 August 2006 / Published online: 27 September 2006
Springer-Verlag London Limited 2006
Abstract This study presents nonlinear system and function learning using wavelet networks. Wavelet networks resemble neural networks in their training and structure, but their training algorithms require a smaller number of iterations than those of neural networks. A Gaussian-based mother wavelet function is used as the activation function. Wavelet networks have three main parameter groups: dilations, translations, and connection parameters (weights). The initial values of these parameters are normally selected at random and are optimized during the training (learning) phase. Because wavelet functions vanish rapidly, a fully random selection of initial values may be unsuitable for process modeling; for this reason a heuristic initialization procedure has been used. In this study the serial-parallel identification model has been applied to system modeling. This structure does not utilize feedback: real system outputs are used for the prediction of future system outputs, so the stability and approximation capability of the network are guaranteed. Gradient methods with a momentum term have been applied for parameter updating, and a quadratic cost function is used for error minimization. Three example problems have been examined in the simulations: static nonlinear functions and a discrete dynamic nonlinear system.

Keywords Wavelet networks · Training · Dynamic system modeling · Wavelet

1 Introduction

This study presents nonlinear static function and dynamic system modeling using wavelet networks. Recently, the wavelet network has been used as an alternative to artificial neural networks, because the interpretation of a model built with neural networks is hard [20]. On the other hand, training algorithms for wavelet networks require fewer iterations than those for neural networks [3]. The wavelet network is an approach to system identification in which nonlinear functions are approximated as the superposition of dilated and translated versions of a single function [3, 19, 20]. Besides the neural network there is another approximation method: the wavelet decomposition. In the wavelet decomposition, only the weights are identified, while the dilations and translations follow a regular grid structure. In contrast, in the wavelet network, weights, dilations, and translations are jointly fitted from the data [20]. The wavelet network uses a wavelet as its activation function. The structure of a wavelet network is shown in Fig. 1 [18].

The architecture of a wavelet network is exactly specified by the number of wavelets required for a given classification or regression application.

S. Postalcioglu
Department of Electronic and Computer Education, Technical Education Faculty, Kocaeli University, Kocaeli, Turkey
e-mail: [email protected]

Y. Becerikli (&)
Department of Computer Engineering, Engineering Faculty, Kocaeli University, Kocaeli, Turkey
e-mail: [email protected]; [email protected]

Y. Becerikli
Department of Computer Engineering and Electronics and Telecommunication Engineering, Halic University, Istanbul, Turkey
434 Neural Comput & Applic (2007) 16:433–441
Fig. 1 Wavelet network: inputs $x_1, \ldots, x_n$ pass through wavelet units $\psi(a_i x - b_i)$ whose outputs, weighted by $w_i$, are summed to form $f(x)$
The wavelet network has good approximation and prediction capability [17]. Wavelets show local characteristics, their main property, in both space and spatial frequency [14]. Therefore, over the whole input scope of the network, the hidden nodes with wavelet functions influence the network output only in some local range. This can prevent interaction between the nodes and assists the training process and the generalization performance. The radial basis function is also local, but it does not have the spatial-spectral (time-frequency) zooming property of the wavelet function and therefore cannot represent the local spatial-spectral characteristics of a function. So, for approximation and forecasting, the wavelet network should give a better performance than the traditional neural network [5, 15]. Families of wavelet functions, especially wavelet frames, are universal approximators in the identification of nonlinear systems. Wavelet networks have been used both for static [12, 20] and dynamic modeling [10, 15]. System modeling is realized in three steps: first, the input variables are determined; second, the network structure and the initial weight values are decided; and finally the training procedure is carried out. The parameters of the wavelet network are the dilations (d), translations (m), bias $a_0$, and weights (a, c).

$f(t) = \sum_{m,n=1}^{\infty} \langle f(t), \psi_{m,n}(t) \rangle \, \psi_{m,n}(t) \qquad (2)$

For computational efficiency, $a_0 = 2$ and $b_0 = 1$ are commonly used, which leads to a binary dilation of $2^{-m}$ and a dyadic translation of $n2^m$. Therefore, a practical sampling lattice is $a = 2^m$ and $b = n2^m$, so that (3) is obtained:

$\psi_{m,n}(t) = 2^{-m/2} \, \psi(2^{-m} t - n) \qquad (3)$

Another scheme for decomposing $f(t) \in L^2(\mathbb{R})$ is through the father wavelet, or scaling function, as given in (4):

$f(t) = \sum_{n} \langle f(t), \Phi_{M,n}(t) \rangle \, \Phi_{M,n}(t) + \sum_{m>M,\,n} \langle f(t), \psi_{m,n}(t) \rangle \, \psi_{m,n}(t) \qquad (4)$

where $\Phi(t)$ is the scaling function, which has the following relation with the mother wavelet, shown in (5) and (6):

$\Phi(t) = \sqrt{2} \sum_{k} h_k \, \psi(2t - k) \qquad (5)$
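As an illustration of the dyadic lattice in (3), the sketch below (not from the paper; the function names and NumPy setup are assumptions) evaluates $\psi_{m,n}(t) = 2^{-m/2}\,\psi(2^{-m}t - n)$ using the Gaussian-derivative mother wavelet that the paper later adopts in (14):

```python
import numpy as np

def mother_wavelet(t):
    # Gaussian-derivative mother wavelet, psi(t) = t * exp(-t^2 / 2), cf. Eq. (14)
    return t * np.exp(-0.5 * t**2)

def dyadic_wavelet(t, m, n):
    # psi_{m,n}(t) = 2^(-m/2) psi(2^(-m) t - n), i.e. the lattice a = 2^m, b = n 2^m of Eq. (3)
    return 2.0 ** (-m / 2) * mother_wavelet(2.0 ** (-m) * t - n)

t = np.linspace(-8.0, 8.0, 1601)
fine = dyadic_wavelet(t, m=0, n=0)    # base scale
coarse = dyadic_wavelet(t, m=2, n=1)  # coarser scale: stretched by 4, attenuated by 2^(m/2)
print(np.max(np.abs(fine)), np.max(np.abs(coarse)))
```

Increasing the scale index m both widens the wavelet and shrinks its amplitude, which is exactly the trade-off the dyadic sampling lattice exploits.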
The result can be interpreted as follows: the fine components of the function, which belong to the wavelet space $W_m$, are neglected, and the coarse components, which belong to the scaling space $V_m$, are preserved to approximate the original function at M scales. Figure 2 shows the wavelet network structure. In the structure of typical multilayer feed-forward networks, if $\Phi$ is used as the nonlinear approximation function of the hidden units with connection weights ($c_n$), then the approximation in (8) can be implemented.

Wavelet functions can be classified into two categories: orthogonal wavelets and wavelet frames. Wavelet frames are used for function approximation and process modeling because orthogonal wavelets cannot be expressed in closed form [9]. Wavelet frames are constructed from a mother wavelet. A wavelet $\Phi_j(x)$ is derived from its mother wavelet $\phi(z)$:

$\Phi_j(x) = \prod_{k=1}^{N_i} \phi(z_{jk}) \qquad (9)$

$z_{jk} = \frac{x_k - m_{jk}}{d_{jk}} \qquad (10)$

$N_i$ is the number of inputs and $N_w$ is the number of wavelets in the wavelet layer. The network output y is computed as:

$y = \Psi(x) = \sum_{j=1}^{N_w} c_j \Phi_j(x) + a_0 + \sum_{k=1}^{N_i} a_k x_k \qquad (11)$

where $a_0$, a, and c are the adjustable parameters of the wavelet network.

Fig. 2 Wavelet network structure: input layer ($x_1, \ldots, x_k$ with direct weights $a_1, \ldots, a_k$ and bias $a_0$), wavelet layer (units $\psi_{jk}$ with translations $m_{jk}$ and dilations $d_{jk}$, combined by products $\Pi$ into $\Phi_1, \ldots, \Phi_j$), and output layer (sum $\Sigma$ with weights $c_1, \ldots, c_j$ producing y)

2.1 Mother wavelet function

The wavelets are a family of signals produced by the translations and the dilations of a mother wavelet satisfying the admissibility condition given in (12) [11]:

$c_\psi = \int_0^{\infty} \frac{|\hat{\psi}(x)|^2}{|x|} \, \mathrm{d}x \qquad (12)$

The term "mother wavelet" gets its name from two important properties of wavelet analysis. The term "wavelet" means a small wave. The term "mother" implies that the functions with different regions of support used in the transformation process are derived from one main function, the mother wavelet. In other words, the mother wavelet is a prototype for generating the other windowing functions [13]. The main idea of wavelet theory consists of representing an arbitrary signal f(x) by means of a family of functions that are scaled and translated versions of a single main function known as the mother wavelet [4]. The relationship between these functions is represented by (13):

$\psi_{m,d}(x) = \frac{1}{\sqrt{|d|}} \, \psi\!\left(\frac{x - m}{d}\right), \qquad m, d \in \mathbb{R}. \qquad (13)$

The mother wavelet function gives an efficient and useful description of the signal of interest [16], and it has some general properties [10]. In practice it is more convenient to use a redundant wavelet family than an orthonormal wavelet basis for constructing the wavelet network, because admitting redundancy allows us to construct wavelet functions with a simple analytical form and good spatial-spectral localization properties [5]. In this paper, a nonorthogonal wavelet, the first derivative of a Gaussian function, has been used as the mother wavelet, as shown in (14):

$\phi(x) = x \, e^{-x^2/2}. \qquad (14)$

3 Wavelet network structures for nonlinear system modeling

The system identification problem can be divided into three groups [8]. The first of them is the parallel identification model. This model does not guarantee convergence of the parameters, because of the delayed output feedback of the dynamic model. This structure is shown in Fig. 3, and the network is optimized using a corresponding optimization algorithm.
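The forward pass defined by (9)-(11), with the Gaussian-derivative mother wavelet of (14), can be sketched as follows; the array shapes and function name are illustrative assumptions, not the paper's code:

```python
import numpy as np

def forward(x, m, d, c, a0, a):
    """Wavelet network output for one input vector x (a sketch of Eqs. (9)-(11)).

    x: (Ni,) inputs; m, d: (Nw, Ni) translations and dilations;
    c: (Nw,) wavelet weights; a0: bias; a: (Ni,) direct linear weights.
    """
    z = (x - m) / d                # z_jk = (x_k - m_jk) / d_jk, Eq. (10)
    phi = z * np.exp(-0.5 * z**2)  # Gaussian-derivative mother wavelet, Eq. (14)
    Phi = phi.prod(axis=1)         # Phi_j(x) = prod_k phi(z_jk), Eq. (9)
    return c @ Phi + a0 + a @ x    # y = sum_j c_j Phi_j + a0 + sum_k a_k x_k, Eq. (11)

rng = np.random.default_rng(0)
Ni, Nw = 2, 5
x = rng.uniform(-1.0, 1.0, Ni)
y = forward(x, m=rng.uniform(-1.0, 1.0, (Nw, Ni)), d=np.ones((Nw, Ni)),
            c=rng.uniform(0.0, 1.0, Nw), a0=0.1, a=rng.uniform(0.0, 1.0, Ni))
print(y)
```

Note the direct linear terms $a_0 + \sum_k a_k x_k$: they let the network capture a linear trend so the wavelets only have to model the residual nonlinearity.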
Fig. 3 Parallel identification model: the input $u(k)$ drives the dynamic system with output $y_p(k)$; delayed ($z^{-1}$) model outputs are fed back into the model

The learning is based on the stochastic gradient algorithm, expressed by (15):

$\frac{\partial J}{\partial \theta} = \sum_{n=1}^{N} e_n \, \frac{\partial y_n}{\partial \theta}. \qquad (15)$

w.r.t. the weights:

$\frac{\partial y}{\partial c_j} = \Phi_j(x), \qquad j = 1, \ldots, N_w$

w.r.t. translations:
$\frac{\partial z_{jk}}{\partial m_{jk}} = -\frac{1}{d_{jk}}$

Fig. 7 Serial-parallel identification model: the input $u(k)$ and the real dynamic system output $y_p(k)$ feed the model

w.r.t. dilations:
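A hedged sketch of the gradient computations around (15): the weight derivative $\partial y/\partial c_j = \Phi_j(x)$, the chain rule through $z_{jk}$ with $\partial z_{jk}/\partial m_{jk} = -1/d_{jk}$ and $\partial z_{jk}/\partial d_{jk} = -z_{jk}/d_{jk}$, and a momentum-based update as the abstract describes. The helper names and the error convention $e = y - y_p$ are assumptions, not the paper's notation.

```python
import numpy as np

def gradients(x, e, m, d, c):
    """Per-sample gradients of the quadratic cost J = e^2/2 w.r.t. c, m, d."""
    z = (x - m) / d
    phi = z * np.exp(-0.5 * z**2)               # mother wavelet, Eq. (14)
    dphi = (1.0 - z**2) * np.exp(-0.5 * z**2)   # its derivative
    Phi = phi.prod(axis=1)
    g_c = e * Phi                               # dJ/dc_j = e * Phi_j(x)
    Nw, Ni = phi.shape
    others = np.empty_like(phi)                 # prod over l != k of phi(z_jl)
    for k in range(Ni):
        others[:, k] = np.prod(np.delete(phi, k, axis=1), axis=1)
    dPhi_dz = dphi * others                     # dPhi_j/dz_jk
    g_m = e * c[:, None] * dPhi_dz * (-1.0 / d)  # dz_jk/dm_jk = -1/d_jk
    g_d = e * c[:, None] * dPhi_dz * (-z / d)    # dz_jk/dd_jk = -z_jk/d_jk
    return g_c, g_m, g_d

def momentum_step(theta, grad, velocity, lr=0.01, beta=0.9):
    # Gradient descent with a momentum term, as used in the paper's training
    velocity[:] = beta * velocity - lr * grad
    theta += velocity
    return theta

g_c, g_m, g_d = gradients(np.array([0.3, -0.2]), 0.2,
                          np.zeros((2, 2)), np.ones((2, 2)), np.array([0.5, -0.4]))
print(g_c)
```

A finite-difference check against these formulas is a cheap way to catch sign errors before training.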
Equation (19) guarantees that the wavelets initially extend over the whole input domain. The choice of the weights (a, c) is less critical for the wavelet network; they are initialized in the interval [0, 1].

$f(x) = \begin{cases} -2.186x - 12.864, & x \in [-10, -2), \\ 4.246x, & x \in [-2, 0), \\ 10 \, e^{-0.05x - 0.5} \, \sin((0.03x + 0.7)x), & x \in [0, 10] \end{cases} \qquad (20)$
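Equations (17)-(19) themselves are not visible in this excerpt, so the following initialization sketch only follows their stated intent: translations spread across the input domain, dilations wide enough that the rapidly vanishing wavelets initially cover the whole domain, and weights drawn from [0, 1]. All names and the specific dilation rule are illustrative assumptions.

```python
import numpy as np

def init_wavelet_net(Nw, Ni, lo=-1.0, hi=1.0, seed=0):
    """Heuristic initialization sketch (the paper's exact Eq. (19) is not
    reproduced here): no wavelet should start out in its vanishing tails
    over the input domain [lo, hi]."""
    rng = np.random.default_rng(seed)
    m = rng.uniform(lo, hi, (Nw, Ni))        # translations inside the domain
    d = np.full((Nw, Ni), 0.5 * (hi - lo))   # dilations span the domain
    c = rng.uniform(0.0, 1.0, Nw)            # weights in [0, 1] (less critical)
    a = rng.uniform(0.0, 1.0, Ni)
    a0 = rng.uniform(0.0, 1.0)
    return m, d, c, a, a0
```

With translations outside the data range or dilations too small, the Gaussian-derivative units output nearly zero everywhere and their gradients vanish, which is precisely why a heuristic, rather than fully random, initialization matters here.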
4.2 Stopping conditions for training

The parameters of the wavelet network are trained during the learning phase to approximate the target function. Gradient methods have been applied to the adjustable parameters. When the variation of the gradient and of the parameters reaches a lower bound, or the number of iterations reaches a fixed maximum, training is stopped.

5 Dynamic system modeling using wavelet network

Dynamic system identification problems consist of three groups: parallel, serial-parallel, and inverse system identification models. In this study, the serial-parallel identification model has been used, as shown in Fig. 7. This structure does not use feedback: real system outputs are used to predict the future system outputs, so the stability and approximation capability of the network are guaranteed.

6 Simulations

6.1 Example 1

For static system modeling, (20) has been used [20]. The system shows different characteristics over different input domains. The domain of the x data is transformed into [-1, 1], and the learning procedure is applied on this domain. The training sequence consists of 200 samples uniformly distributed over the interval of interest. Figure 8 shows the static process and the model output.

These results were obtained using a wavelet network with ten neurons. The number of learning iterations is 3,000 and the momentum coefficient was selected as 0.9. The training mean square error (TMSE) is $5.6699 \times 10^{-4}$; TMSE is computed using (21):

$\mathrm{TMSE} = \frac{1}{N} \sum_{n=1}^{N} (y_p(n) - y_n)^2 \qquad (21)$

where N is the number of training samples.

6.2 Example 2

This example shows the approximation of a function of two variables [6], given in (22):

$f(x_1, x_2) = (x_1^2 - x_2^2) \, \sin(0.5 x_1), \qquad -10 \le x_1, x_2 \le 10. \qquad (22)$
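A sketch of the Example 1 setup: the piecewise target of (20) (minus signs were lost in extraction and are reconstructed here; the branches join continuously at x = -2 and x = 0, which supports the reconstruction), the 200 uniform samples, the domain scaling to [-1, 1], and the TMSE of (21). The fitted network itself is omitted.

```python
import numpy as np

def target(x):
    # Piecewise static test function of Eq. (20), after [20] (signs reconstructed)
    return np.where(x < -2.0, -2.186 * x - 12.864,
           np.where(x < 0.0, 4.246 * x,
                    10.0 * np.exp(-0.05 * x - 0.5) * np.sin((0.03 * x + 0.7) * x)))

def tmse(y_true, y_pred):
    # Training mean square error, Eq. (21)
    return np.mean((y_true - y_pred) ** 2)

x = np.linspace(-10.0, 10.0, 200)   # 200 samples, uniform over the interval
x_scaled = x / 10.0                 # input domain transformed into [-1, 1]
y = target(x)
print(tmse(y, np.zeros_like(y)))    # TMSE of a trivial zero predictor, for scale
```

Comparing a model's TMSE against such a trivial baseline makes a figure like $5.6699 \times 10^{-4}$ easier to interpret.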
The function output is normalized onto the domain [-1, 1]. The input sequence for training is constituted with random amplitudes in the range $[-10, 10] \times [-10, 10]$. Figure 9 shows the process for training. The process was learned with 200 training samples. The wavelet network has seven wavelets, with the test phase shown in Fig. 12. The number of learning iterations is 1,000, the momentum coefficient is selected as 0.07, and the TMSE is 0.0011. The model and process outputs are shown in Fig. 10.

6.3 Example 3

$y(k) = \Psi(y(k-1), y(k-2), u(k-1); \theta).$

The input and output sequences for training consist of pulses with random amplitude in the range [-5, 5] and
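The serial-parallel scheme of Sect. 5 can be sketched as below: the regressor always uses the real system outputs $y_p(k-1), y_p(k-2)$ and input $u(k-1)$, never the model's own past predictions, which is what keeps training stable. The model callable and the toy linear system are assumptions for illustration, not the paper's plant.

```python
import numpy as np

def serial_parallel_predict(model, u, y_real):
    """Serial-parallel identification: y_hat(k) = model([y_p(k-1), y_p(k-2), u(k-1)]).

    `model` is any callable mapping a regressor vector to a scalar
    (here a stand-in for the trained wavelet network)."""
    y_hat = np.zeros_like(y_real)
    for k in range(2, len(u)):
        regressor = np.array([y_real[k - 1], y_real[k - 2], u[k - 1]])
        y_hat[k] = model(regressor)
    return y_hat

def true_system(u, n):
    # Toy second-order linear plant used only to exercise the loop
    y = np.zeros(n)
    for k in range(2, n):
        y[k] = 0.5 * y[k - 1] - 0.1 * y[k - 2] + u[k - 1]
    return y

u = np.sign(np.sin(0.3 * np.arange(50)))  # pulse-like input sequence
y_p = true_system(u, 50)
# A model that matches the plant exactly reproduces its output one step ahead
y_hat = serial_parallel_predict(lambda r: 0.5 * r[0] - 0.1 * r[1] + r[2], u, y_p)
print(np.max(np.abs(y_hat[2:] - y_p[2:])))
```

In a parallel model the regressor would instead contain $\hat{y}(k-1), \hat{y}(k-2)$, so prediction errors feed back and can accumulate; the serial-parallel form avoids that loop during training.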
References
15. Postalcıoğlu S, Erkan K, Bolat DE (2005) Comparison of wavenet and neuralnet for system modeling. Lect Notes Artif Intell 3682:100-107
16. Reza AM (1999) Wavelet characteristics. White Paper, Spire Lab, UWM, 19 October 1999
17. Shi D, Chen F, Ng GS, Gao J (2006) The construction of wavelet network for speech signal processing. Neural Comput Appl 11(34):217-222
18. Thuillard M (2000) Review of wavelet networks, wavenets, fuzzy wavenets and their applications. ESIT 2000, Aachen, Germany, 14-15 September 2000
19. Zhang Q (1997) Using wavelet network in nonparametric estimation. IEEE Trans Neural Netw 8(2):227-236
20. Zhang Q, Benveniste A (1992) Wavelet networks. IEEE Trans Neural Netw 3(6):889-898