AIAA Infotech@Aerospace Conference and AIAA Unmanned...Unlimited Conference
6 - 9 April 2009, Seattle, Washington
AIAA 2009-1974
Modeling Error Driven Robot Control
Abraham K. Ishihara and Khalid Al-Ali
Carnegie Mellon SV, NASA Ames Research Center, Moffett Field, CA.
Tony Adami
Ohio University, Athens, Ohio, US
Nilesh Kulkarni
QSS Group, Inc., NASA Ames Research Center, Moffett Field, CA
Nhan Nguyen
NASA Ames Research Center, Moffett Field, CA
In this paper, we present a modeling error driven adaptive controller for control of
a robot with unknown dynamics. In general, the modeling error is not available as a training
signal since the ideal parameters are not known. However, using a feedback linearization
approach, we show that the modeling error can be obtained from a measured quantity
representing the error dynamics under the ideal conditions, that is, the case for which the
robot parameters are known a priori. We show that using this approach, the learning
dynamics and plant dynamics are effectively decoupled and can then be analyzed separately.
We present simulation examples of the 2-link manipulator that illustrate the algorithm.
I. Introduction
Over the past three decades, neural networks have been successfully applied to the control of robotic
manipulators for trajectory tracking when the robot dynamics are not completely known. The success can
be attributed, in part, to the universal function approximation property of neural networks in which there
exists a set of weights that ideally approximates the unknown function. However, finding the ideal set of
weights typically involves a gradient descent like algorithm in which a training signal is provided to the
neural network that drives the weight adaptation. The training signal is often an extended tracking error.3–5
The extended tracking error typically takes the form r = ė + γe where e denotes a tracking error and γ > 0.
That is, the training signal reflects the ability of the robot to track a given reference trajectory, and not the
ability of the neural network to approximate an unknown function. It is for this reason that it is often
difficult to associate a cost function with the neural network.
On the other hand, in system identification of an unknown plant, the backpropagation algorithm, for
example, seeks to minimize a cost function given by J = ½ em², where em denotes the modeling error at the
output of the neural network. This scenario is depicted in Fig. 1.
Let’s assume d is scalar and qd ∈ Rn is a smooth input signal. Let F : Rn × RN → R and define
d = F (qd ; W ) and d∗ = F (qd ; W ∗ )
where W and W ∗ denote estimated and ideal weights, respectively. Let em = d∗ − d denote the modeling
error. If W (t) → W ∗ , then the modeling error tends to zero. One method to achieve zero modeling error is
by the minimization of a cost function such as J = ½ em², via gradient descent or LMS. This means that the
weights evolve according to

Ẇ = −γ ∂J/∂W = −γ em ∂em/∂W = γ em ∂F(qd; W)/∂W

where we have used the fact that d∗ is independent of W. If F is linear in W (for example, F = Wᵀφ(qd)
where φ ∈ Rᴺ) and qd satisfies a persistence of excitation condition,6 the modeling error tends to zero
Copyright © 2009 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
Figure 1. LMS: the output d = F(qd; W) is compared with d∗ = F(qd; W∗) to produce the modeling error em.

Figure 2. LMS applied to the input-output signals under training signal: et
exponentially. Next, consider Fig. 2. Here, there is no explicit signal d∗ = F(qd; W∗) present. In this case,
adjusting the weights according to

Ẇ = γ et ∂F/∂W

shall be referred to as LMS applied to the input-output signals under the training signal given by et. This
type of scenario is frequently encountered in adaptive control. In most adaptive control systems, the dotted
arrow depicted in Fig. 2 is a function of the performance error of the closed loop control system. Consider
the two adaptive control systems given in Fig. 3. The only difference is in the training signal. In the top
panel, the training signal is a function of the performance error (typically the extended tracking
error). In the bottom panel, the training signal is given by the modeling error.
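As a concrete illustration of the Fig. 1 scenario, the following sketch (our own, not from the paper; the basis functions, learning rate, and ideal weights are assumptions) shows LMS driving W toward W∗ when F is linear in the parameters:

```python
import numpy as np

# Sketch of the Fig. 1 scenario: F(qd; W) = W^T phi(qd) is linear in the
# weights, so gradient descent on J = (1/2) em^2 with em = d* - d reduces
# to the LMS update  W <- W + gamma * em * phi(qd).

rng = np.random.default_rng(0)
W_star = np.array([1.5, -0.7, 0.3])                  # "ideal" weights (unknown in practice)
phi = lambda q: np.array([np.sin(q), np.cos(q), q])  # assumed basis functions

W = np.zeros(3)
gamma = 0.1
for k in range(5000):
    qd = rng.uniform(-2.0, 2.0)       # persistently exciting input
    p = phi(qd)
    em = W_star @ p - W @ p           # modeling error d* - d
    W += gamma * em * p               # LMS step: dW = gamma * em * dF/dW

print(np.allclose(W, W_star, atol=1e-3))
```

Because the input is persistently exciting and the target is exactly realizable, the weight error contracts at every step and W converges to W∗, mirroring the exponential convergence claimed in the text.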
In this paper, we propose to use the modeling error to drive the weight adaptation of the neural net during
the closed loop control of a serial-link robot with unknown dynamics. The difficulty in using the modeling error
is that we typically do not have the desired weights, W∗. Indeed, if we knew W∗, then there would be no
need for adaptation. However, under a specific feedback linearization control structure, we show that the
modeling error can be obtained as a function of the performance of the plant. Using this training signal, the
control and learning are decoupled.
The use of modeling error driven neural network control was first proposed in Ref. 2 and has successfully been
applied to the control of aircraft. The approach followed the IFC framework7 and was analyzed in discrete
time. In this paper, we extend their method to the control of robots in continuous time.
The remainder of this paper is organized as follows: In Section II, we discuss the notation used throughout this paper.
In Section III, we motivate the approach by considering the control of a SISO system. In Section IV, we
present the relevant background on the dynamics of robot manipulators, the controller structure, and relevant
neural network details. In Section V, we present the main stability analysis of the proposed controller and
prove a theorem which guarantees that the origin of the closed loop system is stable. In Section VI, we present
a simulation example of the two-link robot manipulator. We compare the results to a standard adaptive
control scheme. We conclude in Section VII.
Figure 3. Two adaptive control schemes. In both, the plant input is the sum of a feedback term K acting on the tracking error and a feedforward term F(x; W). Top panel: the teacher provides a training signal et derived from the performance error. Bottom panel: the teacher provides the modeling error em = F(x; W∗) − F(x; W).
II. Notation
Let R denote the set of real numbers. When we define a quantity, say A by the expression B, we write
A = B. Let x ∈ Rn , where n is a positive integer. We denote the Euclidean norm of x by
‖x‖ = √(xᵀx)

Let ‖ · ‖F denote the Frobenius norm. Let Tr(·) denote the trace operator. The Frobenius norm of an
m × n matrix A is given by

‖A‖F = √( Tr(AᵀA) ) = √( Σⱼ₌₁ⁿ Σᵢ₌₁ᵐ aij² )    (1)
where aij denotes the element of A in the ith row and jth column. We denote by σmin(A) the minimum
singular value of A. We denote the space of functions f : X1 × X2 → Y that are n1 times continuously
differentiable with respect to X1 and n2 times continuously differentiable with respect to X2 by Cⁿ¹,ⁿ²(X1 ×
X2; Y). If X is a normed vector space, we denote the open ball about some element x ∈ X of radius ε as

B(x; ε) = {z ∈ X : ‖z − x‖ < ε}.
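As a quick numerical check of this notation (our own example matrix), the two expressions for the Frobenius norm in (1) agree, and σmin is the smallest singular value:

```python
import numpy as np

# The Frobenius norm equals both sqrt(Tr(A^T A)) and the
# root-sum-of-squares of the entries; sigma_min(A) is the
# smallest singular value.
A = np.array([[3.0, 0.0], [4.0, 5.0]])

fro_trace = np.sqrt(np.trace(A.T @ A))
fro_sum = np.sqrt((A**2).sum())
print(np.isclose(fro_trace, fro_sum))                   # True
print(np.isclose(fro_trace, np.linalg.norm(A, 'fro')))  # True

sigma_min = np.linalg.svd(A, compute_uv=False).min()
print(sigma_min)                                        # sqrt(5)
```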
Figure 4. Feedback Linearization: Motivation. The control combines the error feedback K~e, the desired derivative ẋdn, and −f̂(~x), scaled by g⁻¹(~x), so that the plant P is linearized.
III. Motivation
Consider the feedback linearization method shown in Fig. 4. Let the plant, P , be described by
ẋ1 = x2
ẋ2 = x3
⋮
ẋn−1 = xn
ẋn = f(x) + g(x)u    (2)

where x ∈ Rⁿ, f : Rⁿ → R, g : Rⁿ → R, u ∈ R, and xi denotes the ith component of x. Let the desired states
be defined by ~xd = [xd1 xd2 · · · xdn]ᵀ. Denote the error vector as ~e ∈ Rⁿ, where the component ei = xdi − xi
for i ∈ [1, n]. Assume that g is known and g⁻¹(~x) exists for all ~x ∈ Rⁿ.
The feedback linearizing control is given by

u = g⁻¹(~x)( K~e + ẋdn − f̂(~x) )    (3)
where K = [k1 k2 · · · kn ]. Substituting the control given in (3) into (2), we get
e1⁽ⁿ⁾ + kn e1⁽ⁿ⁻¹⁾ + · · · + k1 e1 = f̂ − f    (4)

where e1⁽ⁱ⁾ = dⁱe1/dtⁱ. In the ideal case where f̂ = f, the right hand side of the above equation is zero and the
errors tend exponentially to zero if ki > 0 for each i ∈ [1, n]. In the non-ideal case, f̂ ≠ f, and the difference
f̂ − f represents the modeling error. From the point of view of system identification, the modeling error is the correct training
signal to provide to the gradient descent learning algorithm. However, in general it is not known since f is
not known. In feedback linearization, the modeling error is given by the evolution of the errors in the ideal
case as shown in equation (4), which is a known and readily measured quantity. Thus, choosing the training
signal to be
training signal = e1⁽ⁿ⁾ + kn e1⁽ⁿ⁻¹⁾ + · · · + k1 e1
is the correct training signal to learn an estimate of f via gradient descent.
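The idea above can be sketched numerically. The following is our own minimal illustration for n = 2, with an assumed plant f, an assumed linear-in-parameters estimate f̂ = θᵀφ(x), and assumed gains; the training signal of equation (4) is computed from measured error dynamics and drives gradient descent while the control (3) runs in closed loop:

```python
import numpy as np

# n = 2 chain of integrators: xdot1 = x2, xdot2 = f(x) + g(x) u,
# with g known and f unknown. The measurable training signal
#   s = e1'' + k2 e1' + k1 e1 = fhat - f
# is used for gradient descent on J = (1/2) s^2.

f_true = lambda x: -np.sin(x[0]) - 0.5 * x[1]        # unknown to the controller
g = lambda x: 1.0                                     # assumed known
phi = lambda x: np.array([np.sin(x[0]), x[1], 1.0])  # assumed basis for fhat

theta = np.zeros(3)
k1, k2, gamma, dt = 4.0, 4.0, 5.0, 1e-3
x = np.array([0.5, 0.0])

for i in range(200000):                              # 200 s of simulated time
    t = i * dt
    xd = np.array([np.sin(t), np.cos(t)])            # desired x1, x2
    xdd2 = -np.sin(t)                                # desired xdot2
    e = xd - x
    fhat = theta @ phi(x)
    u = (k1 * e[0] + k2 * e[1] + xdd2 - fhat) / g(x)  # control (3)
    x2dot = f_true(x) + g(x) * u
    # closed-loop error dynamics give s = e1'' + k2 e1' + k1 e1 = fhat - f,
    # computable from measured signals alone:
    s = (xdd2 - x2dot) + k2 * e[1] + k1 * e[0]
    theta -= gamma * s * phi(x) * dt                 # gradient step on (1/2) s^2
    x = x + dt * np.array([x[1], x2dot])             # Euler integration

print(abs(xd[0] - x[0]) < 0.05)
```

Note that the learning update never references f directly: the plant's measured error dynamics supply the modeling error, which is the decoupling the section describes.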
IV. Robot Dynamics and Controller Structure
Robot Dynamics: The dynamics of the n-link serial manipulator are given as follows:
M(q)q̈ + V(q, q̇) + F(q̇) + G(q) + τd = τ,  t ≥ t0    (5)
where M(q) ∈ Rⁿˣⁿ is the inertia matrix, V(q, q̇) ∈ Rⁿˣ¹ is the Coriolis/centripetal vector, G(q) ∈ Rⁿˣ¹
is the gravity vector, F(q̇) ∈ Rⁿˣ¹ is friction, and τd ∈ Rⁿˣ¹ represents unknown disturbances; q ∈ Rⁿˣ¹
is the joint angle state vector, and τ ∈ Rⁿˣ¹ comprises the net torque applied at each joint. We have the
following standard assumptions.3
Assumption 1 M(q) is symmetric, positive definite and bounded in norm.
Assumption 2 The Coriolis/centripetal terms can be written as Vm(q, q̇)q̇ such that Ṁ − 2Vm(q, q̇) is skew-symmetric.
Assumption 3 The gravity and disturbance terms are bounded.
We will neglect friction terms and disturbance terms in the analysis.
Controller Structure: The desired trajectory is denoted by

~qd = [qdᵀ q̇dᵀ q̈dᵀ]ᵀ ∈ C([t0, ∞); R³ⁿ)
The performance error is denoted by
q̃ = qd − q ∈ Rⁿ    (6)
The control input to the plant is denoted by τ = τf b + τf f where τf b denotes the feedback control and τf f
denotes the feedforward control. The feedback control is given by
τfb = Kp q̃ + Kv q̃˙    (7)
where Kp ∈ Rn×n and Kv ∈ Rn×n denote positive definite symmetric feedback gain matrices. The feedforward control is given by
τff = Hᵀ(x)θ    (8)
where H ∈ RN ×n is a basis function matrix, θ ∈ RN denotes a vector of parameter estimates, and x is the
input.
Linear in Parameter (LIP) property: We assume that there exists a unique vector, θ∗ ∈ RN such that
M(q)q̈d + Vm(q, q̇)q̇ + G(q) = Hᵀ(x)θ∗  ∀x ∈ R³ⁿ    (9)
where x = [qᵀ q̇ᵀ q̈dᵀ]ᵀ. We define the parameter error as

θ̃ = θ∗ − θ
Modelling Error : The modeling error (see Fig. 3) is given by
em = H T (x)θ̃ = M (q)q̈d + Vm (q, q̇)q̇ + G(q) − τf f
Solving for τf f in (5) and substituting the result into the above we get
em = M(q)q̃¨ + τfb    (10)
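Equation (10) is the computational heart of the approach: em requires only the measured tracking-error acceleration, the applied feedback torque, and the assumed-known inertia matrix, never θ∗. A small numerical sketch (all values below are hypothetical, for illustration only):

```python
import numpy as np

# em = M(q) (q̈d - q̈) + τ_fb uses only measured signals and the known
# inertia matrix; no ideal parameters theta* are needed.
M = np.array([[2.0, 0.3], [0.3, 1.0]])  # M(q) at the current configuration
qdd_d = np.array([0.2, -0.1])           # desired joint accelerations
qdd = np.array([0.15, -0.05])           # measured joint accelerations
Kp = np.diag([25.0, 25.0])
Kv = np.diag([10.0, 10.0])
q_err = np.array([0.01, -0.02])         # q̃ = qd - q
qdot_err = np.array([0.05, 0.0])        # q̃˙

tau_fb = Kp @ q_err + Kv @ qdot_err     # feedback control, eq. (7)
em = M @ (qdd_d - qdd) + tau_fb         # modeling error, eq. (10)
print(em)                               # [0.835, -0.535]
```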
We also use the following notation for the lower and upper bounds on key terms:

κM = inf_{q∈Rⁿ} σmin(M(q)),  ωM = sup_{q∈Rⁿ} ‖M(q)‖F
κv = σmin(Kv),  ωv = ‖Kv‖F
κp = σmin(Kp),  ωp = ‖Kp‖F    (11)

We denote the ratio of the maximum and minimum singular values by βp = ωp/κp. By definition, we have βp ≥ 1.
V. Stability Analysis
In this section we present a stability analysis of the modeling error driven robot controller. We shall
assume that the inertia matrix, M(q), is known. With respect to feedback linearization, this corresponds to
assuming that the g matrix in (2) is known and f is unknown.
Theorem 1 Let the plant be described by (5). Let the feedback and feedforward control be given by (7) and
(8), respectively. Let Kv = κv I and Kp be any positive-definite symmetric matrix. Let the weights evolve
according to

θ̇ = γH(x)( M(q)q̃¨ + τfb )    (12)

where γ > 0 and x = [q̈dᵀ q̇ᵀ qᵀ]ᵀ. Suppose that κp > 0 and

κv > CV‖q̇d‖ + √( CV²‖q̇d‖² + 2ωMωp + 4βp²CV²‖q̇d‖² )

Then, the origin of the closed loop system is locally stable.
Proof: Consider the following Lyapunov function candidate:

V(q̃, q̃˙, θ) = q̃˙ᵀ Kv M(q) q̃˙ + q̃ᵀ Kv Kp q̃ + 2q̃ᵀ Kp M(q) q̃˙ + ½ γ⁻¹‖θ̃‖F² + ∫_{t0}^{t} q̃¨ᵀ Mᵀ(q)M(q) q̃¨ dτ    (13)

Note that the last term on the right hand side is clearly a function of q̃, q̃˙, and θ̃.
Positive Definite Conditions:

V ≥ κvκM‖q̃˙‖² + κvκp‖q̃‖² − 2ωpωM‖q̃‖‖q̃˙‖ + ½ γ⁻¹‖θ̃‖F²

The lower bound on the cross term follows since

−2ωpωM‖q̃‖‖q̃˙‖ ≤ −2|q̃ᵀKpMq̃˙| ≤ 2q̃ᵀKpMq̃˙

We also used the fact that since Mᵀ(q)M(q) is symmetric positive-definite, the integral on the right hand
side of (13) is non-negative for any t ≥ t0. Using the fact that
−2ωpωM‖q̃‖‖q̃˙‖ = ωpωM( ν‖q̃‖ − ν⁻¹‖q̃˙‖ )² − ωpωMν²‖q̃‖² − ωpωMν⁻²‖q̃˙‖² ≥ −ωpωMν²‖q̃‖² − ωpωMν⁻²‖q̃˙‖²

for any ν ≠ 0, we get

V ≥ c1‖q̃‖² + c2‖q̃˙‖² + ½ γ⁻¹‖θ̃‖F²    (14)

where

c1 = κvκp − ωpωMν²
c2 = κvκM − ωpωMν⁻²    (15)

V is positive definite if c1 and c2 are positive. This yields the condition

√( ωpωM/(κvκM) ) < ν < √( κvκp/(ωpωM) )
For ν to exist, we must require that

κv > ωpωM √( 1/(κpκM) )    (16)
Differentiating V given in (13) along the trajectories of the closed loop system yields

V̇ = q̃˙ᵀKvṀ(q)q̃˙ + 2q̃˙ᵀKvM(q)q̃¨ + 2q̃ᵀKpKvq̃˙ + 2( q̃˙ᵀKpMq̃˙ + q̃ᵀKpMq̃¨ + q̃ᵀKpṀq̃˙ ) + γ⁻¹θ̃ᵀθ̃˙ + q̃¨ᵀMᵀMq̃¨
  = ( Mq̃¨ + τfb )ᵀMq̃¨ + q̃˙ᵀKvMq̃¨ + q̃ᵀKpMq̃¨ + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙

Note that the error dynamics satisfy

M(q)q̃¨ = M(q)q̈d − M(q)q̈ = M(q)q̈d − Kpq̃ − Kvq̃˙ − τff + Vm(q, q̇)q̇ + G(q)

Hence we get

V̇ = ( Mq̃¨ + τfb )ᵀ( Mq̈d − τfb − τff + Vmq̇ + G ) + τfbᵀMq̃¨ + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙
  = ( Mq̃¨ + τfb )ᵀ( Hᵀθ̃ − τfb ) + τfbᵀMq̃¨ + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙
  = emᵀHᵀθ̃ − q̃˙ᵀKvKvq̃˙ − q̃ᵀKpKpq̃ − 2q̃ᵀKpKvq̃˙ + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙
  ≤ −( κv² − 2ωpωM − 2κvCV‖q̇d‖ )‖q̃˙‖² − κp²‖q̃‖² + 4ωpCV‖q̇d‖‖q̃‖‖q̃˙‖ + 2κvCV‖q̃˙‖³ + 4ωpCV‖q̃˙‖²‖q̃‖ + θ̃ᵀ( γ⁻¹θ̃˙ + Hem )
  ≤ −zᵀAz + 2CV( κv + ωp )‖z‖³
where we have used the fact that Ṁ = Vm + Vmᵀ, the learning algorithm given in (12), and the bounds:

2q̃˙ᵀKpMq̃˙ ≤ 2ωMωp‖q̃˙‖²
q̃˙ᵀKvṀq̃˙ ≤ 2κv‖Vm‖‖q̃˙‖² ≤ 2κvCV‖q̃˙‖³ + 2κvCV‖q̇d‖‖q̃˙‖²
2q̃ᵀKpṀq̃˙ ≤ 4ωp‖Vm‖‖q̃‖‖q̃˙‖ ≤ 4ωpCV( ‖q̃˙‖²‖q̃‖ + ‖q̇d‖‖q̃‖‖q̃˙‖ )

The vector z = [ ‖q̃‖, ‖q̃˙‖ ]ᵀ and the matrix A is given by

A = [ κp² , −2ωpCV‖q̇d‖ ; −2ωpCV‖q̇d‖ , κv² − 2ωMωp − 2κvCV‖q̇d‖ ]

We have also used the bound ‖Vm(q, q̇)‖ ≤ CV( ‖q̃˙‖ + ‖q̇d‖ ). Further details can be found in Ref. 1.
The matrix A is positive definite if and only if

κp² + κv² − 2ωMωp − 2κvCV‖q̇d‖ > 0    (17)

and

4ωp²CV²‖q̇d‖² < κp²( κv² − 2ωMωp − 2κvCV‖q̇d‖ )    (18)
Suppose that κv is chosen such that (18) is satisfied. Then

κp² + κv² − 2ωMωp − 2κvCV‖q̇d‖ > κp² + 4βp²CV²‖q̇d‖²

The above is positive if κp > 0, and hence (17) is satisfied. The condition for (18) is then given by

κv > CV‖q̇d‖ + √( CV²‖q̇d‖² + 2ωMωp + 4βp²CV²‖q̇d‖² )    (19)

With the feedback gains selected as shown above, it follows that

V̇ ≤ −λmin(A)‖z‖² + 2CV( κv + ωp )‖z‖³

Thus, it follows that V̇ ≤ 0 if

‖z‖ < λmin(A) / ( 2CV( κv + ωp ) )
Remark 1 In the above proof, we have assumed that there exists an exact parameterization of the unknown
dynamics. This assumption may not always hold, and therefore it is common to employ a neural network
(radial basis function or sigmoidal, for example) to approximate the unknown function. This results in a
non-zero approximation error that is typically bounded over some compact set. This non-zero approximation
error complicates the stability analysis and typically requires additional learning dynamics. We note that,
from the point of view of Lyapunov stability analysis, the approximation error is equivalent to having a bounded
torque disturbance, or τd ≠ 0 in (5). To deal with this, we apply e-modification, in which the learning
dynamics are modified as θ̇ = γ( H(x)( M(q)q̃¨ + τfb ) − h‖z‖θ ), where h > 0. Using a lemma derived in Ref. 1,
we can then show that the closed loop system is uniformly bounded.
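For concreteness, one Euler integration step of the e-modified law can be sketched as follows (the shapes, gains, and numerical values are our illustrative assumptions, not values from the paper):

```python
import numpy as np

# One Euler step of the e-modified learning law:
#   theta_dot = gamma * ( H(x) @ (M q̃¨ + tau_fb) - h * ||z|| * theta )
def emod_step(theta, H, em, z, gamma=2.0, h=0.1, dt=1e-3):
    """theta: (N,), H: (N, n), em = M q̃¨ + tau_fb: (n,), z = [||q̃||, ||q̃˙||]."""
    theta_dot = gamma * (H @ em - h * np.linalg.norm(z) * theta)
    return theta + dt * theta_dot

theta = np.ones(4)
H = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, -0.2]])
em = np.array([0.3, -0.1])
z = np.array([0.05, 0.02])
theta_new = emod_step(theta, H, em, z)
print(theta_new.shape)  # (4,)
```

The leakage term −h‖z‖θ pulls the estimates toward zero when the modeling error is small, which is what yields uniform boundedness rather than exact convergence.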
VI. Simulation Examples
We illustrate the results of Theorem 1 in Section V. We consider the plant to be the two-link manipulator
with the following dynamics.3
M (q)q̈ + Vm (q, q̇)q̇ + G(q) = τ
where
M(q) = [ (m1+m2)a1² + m2a2² + 2m2a1a2 cos q2 ,  m2a2² + m2a1a2 cos q2 ;
         m2a2² + m2a1a2 cos q2 ,  m2a2² ]

Vm(q, q̇) = [ −q̇2 m2a1a2 sin q2 ,  −(q̇1 + q̇2) m2a1a2 sin q2 ;
             q̇1 m2a1a2 sin q2 ,  0 ]

G(q) = [ (m1+m2) g a1 cos q1 + m2 g a2 cos(q1 + q2) ;
         m2 g a2 cos(q1 + q2) ]
and m1 = 0.8 kg, m2 = 2.3 kg, a1 = 1 m, a2 = 1 m, and g = 9.8 m/s2 . The desired trajectories are
given by qd1 = A cos(ωt) and qd2 = A sin(ωt) where A = 0.1 and ω = π. We compare the modeling error
based approach with a conventional adaptive controller.8 We plot the trajectory errors of each joint in Fig.
5. We observe that the modeling error based control scheme exhibits good performance compared to the
conventional adaptive controller.
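A minimal closed-loop sketch of this example follows (our own reconstruction: the linear-in-parameters regressor, feedback gains, learning rate, and integration scheme are assumptions not stated in the paper; only the link parameters and reference trajectory are taken from the text):

```python
import numpy as np

m1, m2, a1, a2, g = 0.8, 2.3, 1.0, 1.0, 9.8   # parameters from the paper
A, w = 0.1, np.pi                              # qd1 = A cos(wt), qd2 = A sin(wt)

def dynamics_mats(q, qdot):
    c2, s2 = np.cos(q[1]), np.sin(q[1])
    M = np.array([[(m1+m2)*a1**2 + m2*a2**2 + 2*m2*a1*a2*c2, m2*a2**2 + m2*a1*a2*c2],
                  [m2*a2**2 + m2*a1*a2*c2,                   m2*a2**2]])
    Vm = m2*a1*a2*s2 * np.array([[-qdot[1], -(qdot[0]+qdot[1])],
                                 [ qdot[0],  0.0]])
    G = np.array([(m1+m2)*g*a1*np.cos(q[0]) + m2*g*a2*np.cos(q[0]+q[1]),
                  m2*g*a2*np.cos(q[0]+q[1])])
    return M, Vm, G

def regressor(q, qdot, qdd_d):
    # Assumed parameterization Y with Y @ theta* = M q̈d + Vm q̇ + G, where
    # theta* = [(m1+m2)a1², m2 a2², m2 a1 a2, (m1+m2) g a1, m2 g a2]
    c1, c2 = np.cos(q[0]), np.cos(q[1])
    s2, c12 = np.sin(q[1]), np.cos(q[0]+q[1])
    return np.array([
        [qdd_d[0], qdd_d[0]+qdd_d[1],
         2*c2*qdd_d[0] + c2*qdd_d[1] - s2*(2*qdot[0]*qdot[1] + qdot[1]**2), c1, c12],
        [0.0, qdd_d[0]+qdd_d[1], c2*qdd_d[0] + s2*qdot[0]**2, 0.0, c12]])

Kp, Kv, gamma, dt = np.diag([100., 100.]), np.diag([40., 40.]), 5.0, 1e-3
q = np.array([A, 0.0]); qdot = np.zeros(2); theta = np.zeros(5)

for i in range(20000):                           # 20 s of simulated time
    t = i*dt
    qd    = np.array([ A*np.cos(w*t),      A*np.sin(w*t)])
    qd_d  = np.array([-A*w*np.sin(w*t),    A*w*np.cos(w*t)])
    qd_dd = np.array([-A*w**2*np.cos(w*t), -A*w**2*np.sin(w*t)])
    M, Vm, G = dynamics_mats(q, qdot)
    Y = regressor(q, qdot, qd_dd)
    tau_fb = Kp @ (qd - q) + Kv @ (qd_d - qdot)  # feedback control (7)
    tau = tau_fb + Y @ theta                     # plus feedforward (8)
    qdd = np.linalg.solve(M, tau - Vm @ qdot - G)
    em = M @ (qd_dd - qdd) + tau_fb              # modeling error, eq. (10)
    theta += dt * gamma * (Y.T @ em)             # learning law, eq. (12)
    qdot += dt*qdd; q += dt*qdot                 # semi-implicit Euler

print(np.linalg.norm(qd - q) < 0.02)
```

Under these assumed gains, the gravity-dominated modeling error drives the parameter estimates quickly and the tracking error becomes small, consistent with the behavior reported for Fig. 5.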
VII. Conclusions
In this paper, we have presented an adaptive control algorithm for the n-link robot manipulator that is
based on the minimization of the modeling error. Due to the choice of the control structure using feedback
linearization, the modeling error can be obtained without knowledge of the unknown parameters. A Lyapunov
analysis was used to prove that the closed loop system is stable. We presented a simulation in which we
compared the proposed control scheme to a conventional adaptive controller.
Figure 5. Comparison of two adaptive control schemes: tracking errors q̃1, q̃2 and error rates q̃˙1, q̃˙2 over 10 s under conventional adaptive control (left) and modeling error driven adaptive control (right).
References
1 A.K. Ishihara, S. Ben-Menahem, and N. Nguyen. Protection ellipsoids for stability analysis of feedforward neural-net
controllers. IEEE International Joint Conference on Neural Networks (IJCNN), 2009.
2 N. Kulkarni, J. Kaneshige, K. Krishnakumar, and J. Burken. Modeling-error-driven performance-seeking direct adaptive
control. AIAA Guidance, Navigation, and Control Conference, 2008.
3 F.L. Lewis, S. Jagannathan, and A. Yesildirek. Neural Network Control of Robot Manipulators and Nonlinear Systems.
Taylor and Francis Ltd., 1999.
4 F.L. Lewis, K. Liu, and A. Yesildirek. Neural net robot controller with guaranteed tracking performance. IEEE Transactions
on Neural Networks, 6(3):703–715, 1995.
5 F.L. Lewis, A. Yesildirek, and K. Liu. Multilayer neural-net robot controller with guaranteed tracking performance. IEEE
Transactions on Neural Networks, 7(2):1–11, 1996.
6 K.S. Narendra and A.M. Annaswamy. Stable Adaptive Systems. Prentice Hall, Inc., Englewood, N.J., 1989.
7 N. Nguyen, K. Krishnakumar, and J. Kaneshige. Dynamics and adaptive control for stability recovery of damaged asymmetric aircraft. AIAA Guidance, Navigation, and Control Conference, AIAA-2006-6049, 2006.
8 J.E. Slotine. Putting physics in control: the example of robotics. IEEE Control Systems Magazine, 8:12–18, 1988.