Int. J. Adapt. Control Signal Process. 2002; 16: 557–576 (DOI: 10.1002/acs.725)
SUMMARY
This paper investigates processing techniques for non-linear high power amplifiers (HPA) using neural networks (NNs). Several applications are presented: identification and predistortion of the HPA. Various neural network structures are proposed to identify and predistort the HPA.
For a few decades, NNs have shown excellent performance in solving complex problems (such as classification, recognition, etc.), but they usually suffer from slow convergence. Here, we propose to use the natural gradient instead of the classical ordinary gradient in order to enhance the convergence properties. Results are presented for identification and predistortion using the classical and the natural gradient. Practical implementation issues are given at the end of the paper.
KEY WORDS: high power amplifier; identification; predistortion; equalization; natural gradient algorithm;
neural networks
1. INTRODUCTION
The worldwide demand for wireless communications increases the need for wideband channels. The third generation system 'universal mobile telecommunication system' (UMTS) is a promising solution, providing global coverage and high transmission rates when satellite communication channels are used (the S-UMTS channel). On board the satellite, a high power amplifier (HPA: travelling wave tube (TWT) or solid state power amplifier (SSPA)) is used.
The HPA is generally designed to work near saturation in order to obtain the maximum power efficiency from the power sources on board the satellite. Working in this region, the HPA has a non-linear (NL) behaviour: a non-linearity in phase (AM/PM conversion) and a non-linearity in amplitude (AM/AM conversion). These non-linearities have various effects: NL inter-symbol
interference (ISI) and constellation degradation when non-constant envelope constellations (16-QAM for example) are used.
Because bandwidth efficiency is a very important topic in digital communications, the use of non-constant envelope constellations is very attractive for satellite communications. Until now, however, such constellations have not been used because of the NL problems related to the HPA. Multicarrier communications (OFDM for example) are also ruled out in satellite communications because of the peak-to-mean power ratio [1].
In order to make such non-constant envelope modulations usable, it is necessary to combat the non-linear effects of the HPA. For this purpose, two approaches are possible. The first one is equalization at the terminal side. Conventional equalizers can be used to combat the ISI introduced by the propagation channel [2], while NL equalizers are best suited for equalizing the non-linear effects of the HPA. Many researchers have investigated this topic. NL equalizers can be based on Volterra series or on neural network (NN) structures. Among Volterra series equalizers we can cite, for example, References [3,4]. A complete presentation of neural networks can be found in Reference [5]. Concerning NN structures, the literature contains equalizers based on the multilayer perceptron (MLP) [6–8],
radial basis functions (RBF) [9,10], and self-organizing maps (SOM) [11]. A description and comparison of MLP, RBF and SOM equalizers can be found in References [12,13]. NN structures can be updated using training sequences or blindly [14,15]. NN equalization at the terminal side has the advantage of equalizing the downlink propagation channel (possibly a time-varying fading channel) together with combating the NL distortions (amplitude, phase and ISI). The main drawback of this technique is the additional cost and computational load of the NN equalizer in each terminal.
A second approach is power amplifier linearization or predistortion. The advantage of this technique lies in the fact that only a single system is needed to combat the HPA non-linearity (compared to an NL equalizer in each terminal). Linearization techniques are generally non-adaptive systems [16]. In this paper we present predistortion techniques using NNs with adaptive capability. This kind of predistortion is well suited to satellites with regenerative payloads (in-phase I and quadrature Q baseband signals are available on board). It is interesting to note that analogue NNs exist [17] that work in baseband up to symbol rates of 50 Mbauds.
Finally, a third technique is often necessary when dealing with NL links: identification. In Reference [18] Ibnkahla has shown that an NN model of the NL satellite channel may be used for failure detection. Identification is also necessary in order to build adaptive predistortion schemes, because the NL mathematical model is generally unknown (the HPA varies with age, temperature, etc.). As for predistortion, Volterra series or NNs can be used for identification [19]. Like all adaptive systems, the three previous techniques need to minimize a cost criterion by means of an updating algorithm. In this paper we present two minimization algorithms: the classical gradient descent algorithm, called 'ordinary gradient descent' (OGD), and natural gradient descent (Nat-GD).
Section 2 of the paper presents NN training algorithms using the backpropagation (BP) algorithm based on the OGD and the Nat-GD. Section 3 is devoted to the identification of the NL HPA using NNs. After presenting the characteristics of the HPA, we propose two structures to model it: a mimic structure, with one NL sub-system to model the amplitude conversion and a second one to model the phase distortion, and a classical structure (a simple multilayer NN). Section 4 of the paper presents two NN predistortion methods using a mimic NN structure.
The identification and predistortion structures are trained with both the Nat-GD and the OGD algorithms. Owing to the extensive existing literature, NN equalization is not treated here. Section 5 of the paper is related to implementation issues; we present the implementation of NN equalizers and of the predistorter. Finally, Section 6 concludes the paper.
2. NEURAL NETWORKS
In recent years, NNs have been widely used as powerful adaptive signal processing tools. In particular, wireless and satellite digital communications take advantage of NN properties in several applications such as identification, equalization, etc. (see Reference [20] for a survey). Ibnkahla [21] has studied the problem of HPA predistortion in order to linearize the HPA characteristics using an NN. In References [22,23] Ibnkahla et al. have applied NNs to identifying and modelling the HPA, using the classical ordinary-gradient-based backpropagation algorithm. In the following we propose to use the natural gradient algorithm for training NNs applied to identification and predistortion.
where f is the activation function of neuron j. In our applications, the activation function of a hidden neuron is a hyperbolic tangent function, while the activation function of an output neuron is linear.
Amari [24] has demonstrated an interesting property of NNs. This property, known as the universal approximation theorem, states that an MLP with a sufficient number of neurons is able to model any NL function. In digital communication applications of NNs, experience shows that a few tens of neurons in the hidden layer are usually sufficient. In the two following subsections we present two algorithms for training NNs (OGD and Nat-GD). The algorithms are used for modelling a system with input vector x = [x_I, x_Q]^T and output vector y = [y_I, y_Q]^T, where the superscript T denotes transposition.
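To make the notation concrete, here is a minimal sketch (in Python/NumPy, with illustrative variable names and layer sizes that are not taken from the paper) of the forward pass of such a two-input/two-output MLP with a hyperbolic tangent hidden layer and a linear output layer:

```python
import numpy as np

def mlp_forward(x, W, b, U):
    """Forward pass of a 2-input/2-output MLP with one hidden layer.

    x : input vector [x_I, x_Q]
    W : hidden-layer weights, shape (n_hidden, 2)
    b : hidden-layer biases, shape (n_hidden,)
    U : output-layer weights, shape (2, n_hidden)
    Hidden neurons use tanh, output neurons are linear.
    """
    z = np.tanh(W @ x + b)   # hidden-layer outputs
    y_hat = U @ z            # linear output layer: [y_I_hat, y_Q_hat]
    return y_hat, z

# Example with 10 hidden neurons and random initial weights
rng = np.random.default_rng(0)
n_hidden = 10
W = rng.normal(scale=0.1, size=(n_hidden, 2))
b = np.zeros(n_hidden)
U = rng.normal(scale=0.1, size=(2, n_hidden))
y_hat, _ = mlp_forward(np.array([0.3, -0.5]), W, b, U)
```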
In the following we will use only this kind of NN, and the term 'feedforward' will be omitted henceforth. NNs are trained by the well-known backpropagation (BP) algorithm, which can be summarized as follows [5] (a code sketch of one training step is given after the list):
1. Initialization of the coefficients (weights w and bias b).
2. Presentation of an input vector x(n). The algorithm then comprises two phases:
3. Feedforward phase: the network input is propagated forward through the hidden layers towards the output layer. An error signal is calculated at the output of the NN according to a certain cost function L. Depending on the cost function L, the algorithm can be either a supervised learning algorithm (for each input vector x(n) there is a known desired output vector y(n), called the teacher output) or an unsupervised one (if no teacher output is available). In this paper we use supervised learning algorithms together with a cost function equal to the squared error:
L = \frac{1}{2} e^2 = \frac{1}{2}\left(e_I^2 + e_Q^2\right) = \frac{1}{2}\left[(y_I - \hat{y}_I)^2 + (y_Q - \hat{y}_Q)^2\right]   (2)
4. Feedback phase: the error signal is then backpropagated from the network output towards
the input. At the output of each neuron an error signal is computed, representing its
contribution to the whole network error.
\delta_j^{l}(n) = \sum_{k \in l+1} \delta_k^{l+1}(n)\, w_{jk}\, f_j'   (3)
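The two phases can be combined into a single on-line training step. The sketch below does this for the MLP sketched earlier, using the squared error of Equation (2) and the ordinary gradient descent updates; the learning rate value is illustrative only:

```python
import numpy as np

def ogd_step(x, y, W, b, U, mu=0.01):
    """One on-line backpropagation/OGD step for the MLP sketched above."""
    # Feedforward phase
    s = W @ x + b
    z = np.tanh(s)
    y_hat = U @ z
    e = y - y_hat                                   # [e_I, e_Q], Equation (2)

    # Feedback phase: output deltas, then hidden deltas weighted by f'
    delta_out = -e                                  # dL/dy_hat for L = 0.5*||e||^2
    delta_hid = (U.T @ delta_out) * (1.0 - z**2)    # tanh'(s) = 1 - tanh(s)^2

    # Ordinary gradient descent updates of weights and biases
    U -= mu * np.outer(delta_out, z)
    W -= mu * np.outer(delta_hid, x)
    b -= mu * delta_hid
    return W, b, U, 0.5 * float(e @ e)
```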
In Reference [26], the authors have proposed several algorithms, under certain assumptions, for training batch feedforward NNs; the learning rate is adjusted dynamically according to an estimation of the Lipschitz constant (see the footnote below). Another approach is used in Reference [27]: it exploits the eigenvalues of the Hessian matrix in order to choose a global learning rate for the whole NN, or an individual one for each coefficient of the NN. Other methods consist in adding a supplementary term to the ordinary gradient rule in order to accelerate convergence and enhance stability [28]. Orr et al. [29] have used the Hessian matrix in the momentum term; this algorithm uses information about the curvature of the NN manifold, related to the Hessian matrix.
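As an illustration of this last idea (adding a supplementary term to the ordinary gradient rule), a gradient step with a simple momentum term could look as follows; this is a generic sketch, not the specific algorithms of References [28,29], and the hyper-parameter values are placeholders:

```python
import numpy as np

def ogd_momentum_step(theta, grad, velocity, mu=0.01, alpha=0.9):
    """Ordinary gradient step with an added momentum term.

    theta    : flattened parameter vector of the NN
    grad     : ordinary gradient dL/dtheta at the current point
    velocity : accumulated update direction from previous steps
    """
    velocity = alpha * velocity - mu * grad   # supplementary (momentum) term
    theta = theta + velocity
    return theta, velocity
```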
In the 1980s, Amari developed his theory of 'information geometry', from which he derived a novel algorithm: the natural gradient descent (Nat-GD) algorithm. When this algorithm is applied to training NNs, the coefficients evolve in the direction of steepest descent. Amari et al. (see, for example, References [30–32]) have studied the properties of this algorithm and shown its ability to avoid the 'plateau phenomenon' commonly encountered with NNs [32]. Besides its fast convergence speed, the natural gradient algorithm reaches the Cramer-Rao lower bound when the cost function is the maximum likelihood criterion [30]. The Nat-GD can be explained as follows.
Over time, the output of the NN ŷ, a function of the input x and the parameter vector θ = (…, w_ij, …, b_i, …), moves in a manifold whose dimension equals that of θ. This manifold is not a Euclidean space, since its axes along θ are not orthonormal; such a manifold is called a Riemannian manifold. The vector field ∂L/∂θ is a covariant vector field, and to obtain the contravariant vector field (the steepest-descent direction in a Riemannian manifold) a contravariant metric tensor must be used.
Amari [30] has shown that the Fisher information matrix (FIM) is the only invariant metric matrix in a Riemannian manifold. The natural gradient vector in such a manifold is therefore

\tilde{\nabla} L = A^{-1} \nabla L   (6)

where ∇L is given by (5) and A^{-1} is the inverse of the Fisher information matrix

A = E\left[\frac{\partial L}{\partial \theta}\left(\frac{\partial L}{\partial \theta}\right)^{T}\right]   (7)
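A simplified sketch of one Nat-GD step is given below. It approximates the FIM of Equation (7) by an exponentially weighted average of outer products of the instantaneous gradient and adds a small regularization term before inversion; this particular recursion is an assumption made for illustration, not necessarily the estimator used in the paper:

```python
import numpy as np

def natural_gradient_step(theta, grad, A_hat, mu=0.01, lam=0.98, eps=1e-6):
    """One Nat-GD step on a flattened parameter vector theta.

    grad  : ordinary gradient dL/dtheta
    A_hat : running estimate of the Fisher information matrix (Equation (7)),
            updated with an exponential forgetting factor lam.
    The natural gradient is A^{-1} grad (Equation (6)); eps*I is a small
    regularization added only for numerical invertibility.
    """
    A_hat = lam * A_hat + (1.0 - lam) * np.outer(grad, grad)
    nat_grad = np.linalg.solve(A_hat + eps * np.eye(len(theta)), grad)
    theta = theta - mu * nat_grad
    return theta, A_hat
```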
Footnote: The optimal learning rate is equal to half the inverse of the Lipschitz constant, which reflects some topological properties of the error surface.
In the following we will use, for each application (NN identification or predistortion), the two
algorithms: OGD and Nat-GD.
where w_jk is the weight connecting the input x_j to neuron k (j = I, Q). The function f is a non-linear activation function (hyperbolic tangent). The output of the NN can be expressed as

\hat{y}_j = \sum_k u_{kj} z_k   (11)

where u_kj is the weight connecting neuron k of the hidden layer to neuron j of the output layer (j = I, Q). The cost function is the squared error defined in (2). The updating rules for the weights and biases of the network can be computed following the OGD-based backpropagation algorithm explained earlier.
Footnote: IBO (input back-off) is the ratio of the average power of the input signal to the saturation input power.
Figure 1. Top: the AM/AM and AM/PM characteristics of the TWT (Saleh's model). Bottom: the input 16-QAM constellation (left) and the TWT output constellation (right, TWT at 0 dB IBO).
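For reference, Saleh's memoryless TWT model [34] underlying Figure 1 can be sketched as follows. The parameter values below are the ones commonly quoted for Saleh's fit; they are assumptions here, since the extracted text does not list them:

```python
import numpy as np

# Saleh's memoryless TWT model [34]: AM/AM and AM/PM as functions of the
# input modulus r (parameter values assumed, commonly quoted for Saleh's fit).
ALPHA_A, BETA_A = 2.1587, 1.1517   # AM/AM parameters
ALPHA_P, BETA_P = 4.0033, 9.1040   # AM/PM parameters

def saleh_am_am(r):
    return ALPHA_A * r / (1.0 + BETA_A * r**2)

def saleh_am_pm(r):
    return ALPHA_P * r**2 / (1.0 + BETA_P * r**2)   # phase shift in radians

def twt(x):
    """Apply the TWT model to a complex baseband sample x = r*exp(j*phi)."""
    r, phi = np.abs(x), np.angle(x)
    return saleh_am_am(r) * np.exp(1j * (phi + saleh_am_pm(r)))
```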
Figure 3. (a) Functional diagram of the TWT. (b) NN structure containing two sub-NNs for modelling the TWT behaviour. (c) Sub-NN structure.
give the output of the NN as a function of the two sub-NN outputs. The cost function is given by (2). To derive the updating rules for this structure, let us introduce the following quantities:
\delta_g = \frac{\partial L}{\partial G_m(r)} = -r\left[e_I \cos(\phi_m(r) + \varphi) + e_Q \sin(\phi_m(r) + \varphi)\right]

\delta_\phi = \frac{\partial L}{\partial \phi_m(r)} = r\, G_m(r)\left[e_I \sin(\phi_m(r) + \varphi) - e_Q \cos(\phi_m(r) + \varphi)\right]   (13)
Since the two sub-NNs have the same structure, we shall derive the updating rules for one sub-NN only (the G_m(r) sub-NN, for example). The updating rules for the weights and biases of this network are

u_i(n+1) = u_i(n) - \mu\, \delta_g\, z_i
w_{ki}(n+1) = w_{ki}(n) - \mu\, r\, \delta_k
b_k(n+1) = b_k(n) - \mu\, \delta_k
with δ_k = δ_g u_k f'. The updating rules for the φ_m(r) sub-NN can be deduced from the previous equations by replacing δ_g by δ_φ.
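Putting the pieces together, one training step of the mimic model might be sketched as follows. It assumes the model output is r·G_m(r)·exp(j(φ_m(r)+φ)), which is consistent with the deltas in (13), and single-hidden-layer sub-NNs with scalar input r; the variable names and learning rate are illustrative:

```python
import numpy as np

def mimic_update_step(r, phi, y, Wg, bg, ug, Wp, bp, up, mu=0.01):
    """One training step of the mimic model (gain and phase sub-NNs).

    r, phi : modulus and phase of the current input sample
    y      : desired TWT output [y_I, y_Q]
    (Wg, bg, ug) / (Wp, bp, up) : weights and biases of the G_m(r) and
    phi_m(r) sub-NNs (scalar input r, tanh hidden layer, linear output).
    """
    # Sub-NN outputs
    zg = np.tanh(Wg * r + bg); Gm = float(ug @ zg)
    zp = np.tanh(Wp * r + bp); phim = float(up @ zp)

    # Model output and errors (Equation (2))
    y_hat = r * Gm * np.array([np.cos(phim + phi), np.sin(phim + phi)])
    eI, eQ = y - y_hat

    # Deltas of Equation (13)
    dg = -r * (eI * np.cos(phim + phi) + eQ * np.sin(phim + phi))
    dp = r * Gm * (eI * np.sin(phim + phi) - eQ * np.cos(phim + phi))

    # Updating rules for the gain sub-NN; the phase sub-NN uses dp instead of dg
    dk_g = dg * ug * (1.0 - zg**2)
    ug = ug - mu * dg * zg
    Wg = Wg - mu * r * dk_g
    bg = bg - mu * dk_g

    dk_p = dp * up * (1.0 - zp**2)
    up = up - mu * dp * zp
    Wp = Wp - mu * r * dk_p
    bp = bp - mu * dk_p
    return Wg, bg, ug, Wp, bp, up
```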
3.4. Simulations
Hereafter we present simulation results for the TWT identification with the structures presented
above.
Figure 5. AM/AM conversion of the Nat-GD (left) and OGD (right) (global model).
Figure 6. AM/PM distortion of the Nat-GD (left) and OGD (right) (global model).
This can be explained as follows: the AM/AM and AM/PM conversions of the TWT depend only on the input signal modulus. For a given modulus r_i, many different input patterns [x_I, x_Q]^T with modulus r_i are applied to the NN input. So, for the same input modulus there are different NN responses (as many as there are patterns with that modulus in the input signal space).
Figure 8. AM/PM distortion of the Nat-GD (left) and OGD (right) (mimic model).
Figure 9. AM/AM conversion of the Nat-GD (left) and OGD (right) (mimic model).
Table I. Comparison between the ordinary and the natural gradient algorithms for the two models (identification case).

               Global model           Mimic model
               OGD      Nat-GD        OGD      Nat-GD
SER (dB)       25       54            37       74
Comparing Figures 5–6 with Figures 8–9, we can see that the clouds of Figures 5–6 are replaced by thin curves in Figures 8–9. This is because, in the mimic model, the sub-NNs are trained with the modulus of the input patterns instead of the patterns themselves.
As a measure of the modelling quality we use the signal-to-modelling-error ratio (SER), given by the following expression:
SER = 10 \log_{10}\left(\frac{p_o}{\mathrm{MSE}}\right)   (14)

where p_o is the power at the TWT output.
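Equation (14) translates directly into a small helper function (illustrative only):

```python
import numpy as np

def ser_db(twt_output, model_output):
    """Signal-to-modelling-error ratio of Equation (14), in dB.

    twt_output, model_output : arrays of complex baseband samples.
    """
    po = np.mean(np.abs(twt_output) ** 2)                  # TWT output power
    mse = np.mean(np.abs(twt_output - model_output) ** 2)  # modelling error power
    return 10.0 * np.log10(po / mse)
```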
Table I compares the signal-to-modelling-error ratio (SER) for the two algorithms and the two structures. The SER values confirm the previous remarks, i.e. the mimic structure performs better than the global one.
4. TWT PREDISTORTION
The mimic model outperforms the global model for TWT modelling. The results of the previous section confirm that using a priori knowledge about the system to be identified enhances the modelling procedure. In this section we shall also use mimic structures in order to predistort the TWT. The goal of predistortion techniques is to linearize the AM/AM conversion and to cancel the AM/PM distortion of the TWT. Ibnkahla [21] has compared the performance of an NN-based predistorter trained by the OGD with a Volterra-series-based predistorter; the simulations showed the advantage of the NN. In this paper we propose a new NN implementation scheme which enhances the cancellation of the AM/PM distortion.
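To make this objective concrete, the sketch below computes an ideal memoryless predistorter for given AM/AM and AM/PM characteristics by numerically inverting the gain curve and pre-rotating the phase. This only illustrates the target behaviour that the NN predistorter is trained to approximate; it is not the NN scheme proposed in the paper:

```python
import numpy as np

def ideal_predistorter(r_in, am_am, am_pm, target_gain=1.0, r_max=0.9):
    """Illustrative ideal memoryless predistorter for an input modulus r_in.

    Inverts the AM/AM curve numerically so that the cascade
    predistorter + TWT has (approximately) linear gain, and pre-rotates the
    phase to cancel the AM/PM shift. The AM/AM curve must be monotone on
    [0, r_max] (i.e. below saturation) for the inversion to be valid.
    """
    grid = np.linspace(0.0, r_max, 2001)
    gain_curve = am_am(grid)
    # Desired TWT output modulus, clipped to the reachable range
    desired = np.clip(target_gain * r_in, gain_curve.min(), gain_curve.max())
    r_pd = np.interp(desired, gain_curve, grid)   # inverse AM/AM by interpolation
    phase_pd = -am_pm(r_pd)                       # cancel the phase rotation
    return r_pd, phase_pd

# Example with the Saleh model sketched earlier:
# r_pd, ph_pd = ideal_predistorter(0.4, saleh_am_am, saleh_am_pm)
```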
Figure 13. Equivalent scheme for updating the gain sub-NN of the predistorter.
Figure 14. Updating schemes of the gain predistortion and the phase canceling sub-NNs.
Table II. SER for the two gradient algorithms at the end of the training phase (predistortion based on TWT identification).

               OGD      Nat-GD
SER_I (dB)     31       61
SER_Q (dB)     31       61
Figure 16. Phase distortion results concerning the predistorter+TWT (predistortion based on TWT
identification).
Table III. SER (in dB) at the end of the training phase (predistortion: second method).

               AM/PM model     (AM/AM)^{-1} model
OGD            32              21
Nat-GD         83              51
To validate this method, a validation sequence was presented at the input. The validation results are depicted in Figures 17 and 18 and in Table IV, which compares the SER of the two algorithms for the I and Q signals.
Figures 17 and 18 illustrate simulation results concerning the validation of the predistortion scheme. The curves in Figure 17 show the TWT output modulus versus the predistorter input modulus (linearized HPA), the predistorter output modulus versus the predistorter input modulus, and the TWT output modulus versus its input modulus. Figure 18 presents the residual phase shift of the overall system. It is evident that the NN trained by the Nat-GD is more accurate than the one trained by the OGD; moreover, it converges faster.
Comparing Tables II and IV, we observe that the first method is more accurate than the second one. In fact, the second method is not really adequate to predistort the TWT, but rather to post-linearize it. It can be seen, from Figures 13 and 14(a), that the equivalence between the two methods holds only when Equation (18) is satisfied, where G(r), G_m(r) and G_p(r) are the gains of the TWT, the mimic identified model and the predistortion NN, respectively. Assuming that G(r) = G_m(r), i.e. neglecting the modelling error, Equation (18) holds only when G(r) = G_p(r) or when the two functions are linear, which is not the case for the TWT.
Figure 17. Output modulus versus Input modulus for OGD (left) and Nat-GD (right)
(predistortion: second method).
Figure 18. Overall system (Predistorter+HPA) phase distortion using OGD (left) and Nat-
GD (right) (predistortion: second method).
Table IV. SER for the two gradient algorithms during the validation phase (predistortion: second method).

               OGD      Nat-GD
SER_I (dB)     19       54
SER_Q (dB)     19       54
5. PRACTICAL ISSUE
Practical implementations of Neural Networks have been yet realized. The first implementation
has been achieved during the NEWTEST European project and was dealing with NN
equalization. Another one is being under construction concerning predistortion purposes.
6. CONCLUSION
This paper has treated the problems arising from non-linearities on board the satellite. Identification and predistortion of the TWT have been discussed extensively. We proposed to identify the TWT in two ways: by a general-purpose MLP and by a mimic NN structure composed of two sub-NNs. The mimic model takes into account a priori knowledge of the HPA behaviour. The advantage of the mimic model over the general-purpose MLP model has been shown by simulations.
For identification purposes, we have used two gradient algorithms: the classical ordinary gradient and the natural gradient. Nat-GD has shown better convergence speed together with a better MSE at the end of convergence. The performance of the mimic NN structure motivated its use for predistorting the TWT. We have proposed two methods to predistort the TWT. The first one is based on identification of the HPA followed by predistortion of the identified model. The second method uses the input and output patterns of the TWT in order to compute the predistortion device. Simulations have shown that the method based on identification is the better one. Furthermore, this method can be generalized to
identify the HPA and predistort it adaptively and simultaneously. It is thus possible to track the variations of the TWT (ageing, temperature drift, etc.).
Throughout the paper, we used two on-line stochastic algorithms to train the NNs: ordinary gradient descent and natural gradient descent. Simulations showed the advantage of natural gradient descent: it converges faster than ordinary gradient descent and reaches a smaller mean square error in all our applications.
Practical implementations of NN structures for equalization have already been demonstrated on devices such as digital signal processors and FPGAs. The implementation of NN predistortion for regenerative satellite payloads will be possible using a hybrid technology: analogue implementation of the NN predistorter core and digital implementation of the updating algorithm on a classical processor.
REFERENCES
1. Davis, JA., Jedwab, J. Peak-to-mean power control in OFDM, Golay complementary sequences, and Reed-Muller codes. IEEE Transactions on Information Theory 1999; 45(7).
2. Nordberg, J., Abbas, M., Sven, N., Ingvar, C. Fractionally spaced adaptive equalization of S-UMTS mobile
terminals. Wiley International Journal on Adaptive Control and Signal Processing, this issue.
3. Gutierrez, A., Ryan, W. E. Performance of adaptive Volterra equalizers on nonlinear satellite channels. Proceedings
of ICC’96, Seattle, USA.
4. Benedetto, S., Biglieri, E., Daffara, R. Modeling and performance evaluation of nonlinear satellite links: a Volterra series approach. IEEE Transactions on Aerospace and Electronic Systems 1979; AES-15(4).
5. Haykin, S. Neural Networks: A Comprehensive Foundation (2nd edn). Prentice-Hall: Upper Saddle River, NJ, 1999.
6. Balay, Palicot, J. Equalization of non-linear perturbations by a multilayer perceptron in satellite channel
transmission. Proceedings of IEEE Globecom’94.
7. Chang, P., Wang, B. Adaptive decision feedback equalization for digital satellite channels using multilayer neural
networks. IEEE Journal on selected areas in Communications 1995; 13(2).
8. Chen S., et al. Adaptive equalization of finite non-linear channels using multilayer perceptrons. Signal Processing
1990; 20:107–119.
9. Cha, I., Kassam, S.A., Channel equalization using adaptive complex radial basis function networks, IEEE Journal
On selected areas in communications 1995; 13(1):122–131.
10. Chen, S., McLaughlin, S., Mulgrew, B. Complex-valued radial basis function network, Part I: network architecture
and learning algorithms. Signal Processing 1994; 35:19–31.
11. Kohonen, T. Self-Organizing Maps. Springer: Berlin, 1995.
12. Bouchired, S., Roviras, D., Castanie, F. Equalization of satellite mobile channels with neural network techniques.
Space Communications, Vol. 15. IOS Press: Boston, New York, 1998/1999.
13. Bouchired, S., Ibnkahla, M., Roviras, D., Castanie, F. Equalization of satellite UMTS channels using neural
network devices. In Proceedings of IEEE, (ICASSP’1999), Phoenix, USA.
14. Benvenuto, N., Piazza, F. On the Complex Backpropagation Algorithm. IEEE Transactions on Signal Processing
1992; 40(4):967–969.
15. Murphy, C.D., Kassam, S.A. A Novel Linear/RBF Blind Equalizer for Nonlinear Channels. Proceedings of
CISS’99, Baltimore, March 1999.
16. Bernardini, A., Fina, S. D. Analysis of different optimization criteria for IF predistortion in digital radio links with
nonlinear amplifiers. IEEE Transactions on Communications 1997; 45(4).
17. Langlet, F., Ibnkahla, M., Castanie, F. Neural network hardware implementation: overview and applications to
satellite communications. Proceedings of DSP’98, ESA, Nordwick, Holland, September 1998.
18. Ibnkahla, M., Sombrin, J., Castanie, F. Channel identification and failure detection in digital satellite communications. Globecom'96, London, UK, November 1996.
19. Pearson, RK., Ogunnaike, BA., Doyle, FJ. Identification of structurally constrained second order Volterra models.
IEEE Transactions on SP 1996; 44(11).
20. Ibnkahla, M. Applications of neural networks to digital communications-survey. Signal processing 2000; 80(7).
21. Ibnkahla, M. Neural network predistortion technique for digital satellite communications. ICASSP’2000, Istanbul,
Turkey.
22. Ibnkahla, M., Bershad, NJ., Sombrin, J., Castanie, F. Neural networks for modeling nonlinear channels. IEEE
Transactions on SP 1997; 45(7).
23. Ibnkahla, M., Bershad, NJ., Sombrin, J., Castanie, F. Neural network modeling and identification of non-linear
channels with memory: algorithms, applications, and analytic models. IEEE Transactions on SP 1998; 46(5).
24. Amari, S. A universal theorem for learning curves. Neural Networks 1993; 6: 161–166.
25. Darken, C., Moody, J. Towards faster stochastic gradient search. In Moody, JE., Hanson, SJ., Lippmann, RP. (eds), Advances in Neural Information Processing Systems, Vol. 4. Morgan Kaufmann Publishers: San Mateo, CA, 1992.
26. Magoulas, GD., Vrahatis, MN., Androulakis, GS. Improving the convergence of the backpropagation algorithm
using learning rate adaptation methods. Neural Computation 1999; 11.
27. LeCun, Y., Simard, P., Pearlmutter, B. Automatic learning rate maximization by on-line estimation of the Hessian's eigenvectors. In Hanson, SJ., Cowan, JD., Giles, CL. (eds), Advances in Neural Information Processing Systems, Vol. 5. Morgan Kaufmann: San Mateo, CA, 1993; 156–163.
28. Haykin, S. Adaptive digital communication receivers. IEEE Communications Magazine December 2000.
29. Orr, GB., Leen, TK. Using curvature information for fast stochastic search. Advances in Neural Information Processing Systems, Vol. 9. MIT Press: Cambridge, MA.
30. Amari, SI. Natural gradient works efficiently in learning. Neural Computation 1998; 10:251–276.
31. Yang, H.H., Amari, S.I. The efficiency and the robustness of natural gradient descent learning rule. Advances in
Neural Information Processing Systems, vol. 10. MIT Press: Cambridge, MA.
32. Yang, H.H., Amari, S.I. Training multi-layer perceptrons by natural gradient descent. In Proceedings of ICONIP'97, New Zealand.
33. Golub, G.H., Van Loan, C.F. Matrix Computations (2nd edn). The Johns Hopkins University Press, 1989.
34. Saleh, A. Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers. IEEE Transactions on Communications 1981; COM-29.
35. Guntsh, A., Ibnkahla, M., Losquadro, G., Mazella, M., Roviras, D., Timm, A. EU’s R&D activities on the third
generation mobile satellite systems (S-UMTS). IEEE Communication Magazine 1998; 36(2):104–110.
36. Langlet, F., Roviras, D., Mallet, A., Castanie, F. Mixed analog/digital implementation of MLP NN for predistortion. Proceedings of the International Joint Conference on Neural Networks (IJCNN), Hawaii, USA, 2002.