
Modeling Error Driven Robot Control

2009, AIAA Infotech@Aerospace Conference

In this paper, we present a modeling error driven adaptive controller for the control of a robot with unknown dynamics. In general, the modeling error is not used as a training signal since the ideal parameters are not known. However, using a feedback linearization approach, we show that the modeling error can be obtained from a measured quantity representing the error dynamics under ideal conditions, that is, the case in which the robot parameters are known a priori. We show that with this approach the learning dynamics and plant dynamics are effectively decoupled and can then be analyzed separately. We present simulation examples of the 2-link manipulator that illustrate the algorithm.

AIAA Infotech@Aerospace Conference and AIAA Unmanned...Unlimited Conference, 6-9 April 2009, Seattle, Washington. AIAA 2009-1974.

Modeling Error Driven Robot Control

Abraham K. Ishihara and Khalid Al-Ali, Carnegie Mellon SV, NASA Ames Research Center, Moffett Field, CA
Tony Adami, Ohio University, Athens, Ohio, US
Nilesh Kulkarni, QSS Group, Inc., NASA Ames Research Center, Moffett Field, CA
Nhan Nguyen, NASA Ames Research Center, Moffett Field, CA

I. Introduction

Over the past three decades, neural networks have been successfully applied to the control of robotic manipulators for trajectory tracking when the robot dynamics are not completely known. The success can be attributed, in part, to the universal function approximation property of neural networks, in which there exists a set of weights that ideally approximates the unknown function. However, finding the ideal set of weights typically involves a gradient-descent-like algorithm in which a training signal is provided to the neural network to drive the weight adaptation. The training signal is often an extended tracking error.3-5 The extended tracking error typically takes the form r = ė + γe, where e denotes a tracking error and γ > 0.
That is, the training signal reflects the ability of the robot to track a given reference trajectory, not the ability of the neural network to approximate an unknown function. It is for this reason that it is often difficult to associate a cost function with the neural network. On the other hand, in system identification of an unknown plant, the backpropagation algorithm, for example, seeks to minimize a cost function given by J = (1/2)em², where em denotes the modeling error at the output of the neural network. This scenario is depicted in Fig. 1. Assume that d is scalar and that qd ∈ Rⁿ is a smooth input signal. Let F : Rⁿ × R^N → R and define d = F(qd; W) and d* = F(qd; W*), where W and W* denote the estimated and ideal weights, respectively. Let em = d* − d denote the modeling error. If W(t) → W*, then the modeling error tends to zero. One method to achieve zero modeling error is minimization of a cost function such as J = (1/2)em² via gradient descent or LMS. This means that the weights evolve according to

  Ẇ = −γ ∂J/∂W = −γ em ∂em/∂W = γ em ∂F(qd; W)/∂W

where we have used the fact that d* is independent of W. If F is linear in W (for example, F = Wᵀφ(qd), where φ ∈ R^N) and qd satisfies a persistence of excitation condition,6 the modeling error tends to zero exponentially.

Copyright © 2009 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.

[Figure 1. LMS.]

[Figure 2. LMS applied to the input-output signals under training signal: et.]

Next, consider Fig. 2. Here, there is no explicit signal d* = F(qd; W*) present. In this case, adjusting the weights according to

  Ẇ = γ et ∂F/∂W

shall be referred to as LMS applied to the input-output signals under the training signal et. This type of scenario is frequently encountered in adaptive control.
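The LMS/gradient-descent update above can be exercised in a short numerical sketch. This is our illustration, not code from the paper; the basis φ, the ideal weights W*, the learning rate, and the input distribution are all hypothetical choices.

```python
import numpy as np

# Our illustration (not the paper's code): LMS / gradient descent on
# J = (1/2) e_m^2 for a linear-in-parameters model F(q_d; W) = W^T phi(q_d).
# phi, W_star, mu, and the input distribution are hypothetical choices.

rng = np.random.default_rng(0)
phi = lambda q: np.array([np.sin(q), np.cos(q), q])  # hypothetical basis

W_star = np.array([1.0, -0.5, 0.3])   # ideal (unknown) weights
W = np.zeros(3)                        # weight estimate
mu = 0.05                              # learning rate

for _ in range(20000):
    qd = rng.uniform(-2.0, 2.0)        # richly exciting input samples
    em = (W_star - W) @ phi(qd)        # modeling error e_m = d* - d
    W += mu * em * phi(qd)             # discrete analog of W_dot = gamma*em*dF/dW
```

Because the model is linear in W and the input samples are richly exciting, the weight error contracts at every step and the modeling error tends to zero, mirroring the persistence-of-excitation argument above.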
In most adaptive control systems, the dotted arrow depicted in Fig. 2 is a function of the performance error of the closed-loop control system. Consider the two adaptive control systems given in Fig. 3. The only difference between them is the training signal. In the top panel, the training signal is a function of the performance error (typically the extended tracking error). In the bottom panel, the training signal is given by the modeling error. In this paper, we propose to use the modeling error to drive the weight adaptation of the neural net during closed-loop control of a serial-link robot with unknown dynamics. The difficulty in using the modeling error is that typically we do not have the desired weights, W*. Indeed, if we knew W*, there would be no need for adaptation. However, under a specific feedback linearization control structure, we show that the modeling error can be obtained as a function of the performance of the plant. Using this training signal, the control and learning are decoupled. The use of modeling error driven neural network control was first proposed in Ref. 2 and has successfully been applied to the control of aircraft. That approach followed the IFC framework7 and was analyzed in discrete time. In this paper, we extend the method to the control of robots in continuous time. The remainder of the paper is organized as follows. In Section II, we discuss the notation used throughout the paper. In Section III, we motivate the approach by considering the control of a SISO system. In Section IV, we present the relevant background on the dynamics of robot manipulators, the controller structure, and the relevant neural network details. In Section V, we present the main stability analysis of the proposed controller and prove a theorem which guarantees that the origin of the closed-loop system is stable. In Section VI, we present a simulation example of the two-link robot manipulator and compare the results to a standard adaptive control scheme.
We conclude in Section VII.

[Figure 3. Two adaptive control schemes: in the top scheme the teacher trains F(x; W) with a performance-error signal; in the bottom scheme the training signal is the modeling error em.]

II. Notation

Let R denote the set of real numbers. When we define a quantity A by the expression B, we write A ≜ B. Let x ∈ Rⁿ, where n is a positive integer. We denote the Euclidean norm of x by

  ‖x‖ ≜ √(xᵀx)

Let ‖·‖F denote the Frobenius norm and Tr(·) the trace operator. The Frobenius norm of an m × n matrix A is given by

  ‖A‖F = √(Tr(AᵀA)) = ( Σᵢ₌₁..m Σⱼ₌₁..n aij² )^(1/2)    (1)

where aij denotes the element of A in the ith row and jth column. We denote by σmin(A) the minimum singular value of A. We denote the space of functions f : X1 × X2 → Y that are n1 times continuously differentiable with respect to X1 and n2 times continuously differentiable with respect to X2 by C^(n1,n2)(X1 × X2; Y). If X is a normed vector space, we denote the open ball of radius ε about an element x ∈ X by

  B(x; ε) ≜ {z ∈ X : ‖z − x‖ < ε}

[Figure 4. Feedback linearization: motivation.]

III. Motivation

Consider the feedback linearization method shown in Fig. 4. Let the plant, P, be described by

  ẋ1 = x2
  ẋ2 = x3
  ⋮
  ẋ(n−1) = xn
  ẋn = f(x) + g(x)u    (2)

where x ∈ Rⁿ, f : Rⁿ → R, g : Rⁿ → R, u ∈ R, and xi denotes the ith component of x. Let the desired states be defined by x⃗d ≜ [xd1 xd2 ··· xdn]ᵀ. Denote the error vector by e⃗ ∈ Rⁿ, with components ei = xdi − xi for i ∈ [1, n]. Assume that g is known and that g⁻¹(x⃗) exists for all x⃗ ∈ Rⁿ. The feedback linearizing control is given by

  u = g⁻¹(x⃗)( K e⃗ + ẋdn − f̂(x⃗) )    (3)

where K = [k1 k2 ··· kn].
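The control law (3) can be made concrete in a short numerical sketch for n = 2. This is our own toy example (f, f̂, g, the gains, and the operating point are hypothetical); it checks that the closed-loop combination ë1 + k2ė1 + k1e1 collapses to f̂ − f, which the substitution carried out next establishes in general.

```python
import numpy as np

# Our toy example (not from the paper): implement the control (3) for n = 2
# and check that e1_ddot + k2*e1_dot + k1*e1 equals f_hat - f.
# The functions f, f_hat, g and all numbers below are hypothetical.

f     = lambda x: -np.sin(x[0]) - 0.5 * x[1]   # true (unknown) dynamics
f_hat = lambda x: -0.8 * x[0]                  # current model estimate
g     = lambda x: 1.0 + 0.1 * x[0] ** 2        # known, invertible input gain
k1, k2 = 4.0, 4.0

x = np.array([0.3, -0.2])                      # plant state [x1, x2]
xd, xd_dot, xd_ddot = 0.1, 0.0, 0.0            # desired x1 and derivatives

e1, e1_dot = xd - x[0], xd_dot - x[1]
u = (k1 * e1 + k2 * e1_dot + xd_ddot - f_hat(x)) / g(x)   # control (3)
x2_dot = f(x) + g(x) * u                                   # plant (2)
e1_ddot = xd_ddot - x2_dot

training_signal = e1_ddot + k2 * e1_dot + k1 * e1          # measurable
modeling_error = f_hat(x) - f(x)                           # normally unknown
```

The measurable left-hand side and the unknown model mismatch agree to machine precision, which is exactly what makes the modeling error usable as a training signal.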
Substituting the control given in (3) into (2), we get

  e1⁽ⁿ⁾ + kn e1⁽ⁿ⁻¹⁾ + ··· + k1 e1 = f̂ − f    (4)

where e1⁽ⁱ⁾ ≜ dⁱe1/dtⁱ. In the ideal case, where f̂ = f, the right-hand side of the above equation is zero, and the errors tend exponentially to zero if ki > 0 for each i ∈ [1, n]. In the non-ideal case, f̂ ≠ f, and the right-hand side represents the modeling error. From the point of view of system identification, the modeling error is the correct training signal to provide to the gradient descent learning algorithm. However, in general it is not known, since f is not known. Under feedback linearization, the modeling error is given by the evolution of the errors in the ideal case, as shown in equation (4), and this is a known and readily measured quantity. Thus,

  training signal = e1⁽ⁿ⁾ + kn e1⁽ⁿ⁻¹⁾ + ··· + k1 e1

is the correct training signal for learning an estimate of f via gradient descent.

IV. Robot Dynamics and Controller Structure

Robot Dynamics: The dynamics of the n-link serial manipulator are given as follows:

  M(q)q̈ + V(q, q̇) + F(q̇) + G(q) + τd = τ,  t ≥ t0    (5)

where M(q) ∈ Rⁿˣⁿ is the inertia matrix, V(q, q̇) ∈ Rⁿ is the vector of Coriolis/centripetal terms, G(q) ∈ Rⁿ is the gravity vector, F(q̇) ∈ Rⁿ represents friction, and τd ∈ Rⁿ represents unknown disturbances; q ∈ Rⁿ is the joint angle state vector, and τ ∈ Rⁿ comprises the net torque applied at each joint. We have the following standard assumptions.3

Assumption 1: M(q) is symmetric, positive definite, and bounded in norm.

Assumption 2: The Coriolis/centripetal terms can be written as Vm(q, q̇)q̇ such that Ṁ − 2Vm(q, q̇) is skew-symmetric.

Assumption 3: The gravity and disturbance terms are bounded.

We neglect the friction and disturbance terms in the analysis.
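The same idea carries over to the manipulator dynamics (5): with the feedback-plus-feedforward controller defined in the next subsection, the modeling error becomes measurable, as derived in Eq. (10) below. As a preview, the following one-link sketch (ours; the masses, gains, and states are hypothetical) verifies that identity numerically, using the linear-in-parameters form Hᵀ(x)θ* = [q̈d, cos q]·[ml², mgl]ᵀ.

```python
import numpy as np

# Our one-link sketch (not the paper's code): verify that the measurable
# quantity M*qtilde_ddot + tau_fb equals H^T*(theta* - theta), i.e. Eq. (10).
# All masses, gains, and states below are hypothetical.

m, l, grav = 1.2, 0.8, 9.8
theta_star = np.array([m * l**2, m * grav * l])   # ideal parameters
theta      = np.array([0.5, 5.0])                 # current estimate
Kp, Kv     = 10.0, 4.0                            # feedback gains

q, qdot = 0.4, -0.1                               # plant state
qd, qd_dot, qd_ddot = 0.6, 0.2, 0.0               # desired trajectory

H_T    = np.array([qd_ddot, np.cos(q)])           # regressor row
tau_fb = Kp * (qd - q) + Kv * (qd_dot - qdot)     # feedback control
tau_ff = H_T @ theta                              # feedforward control
M      = m * l**2
qddot  = (tau_fb + tau_ff - m * grav * l * np.cos(q)) / M   # plant dynamics

em_def  = H_T @ (theta_star - theta)              # e_m = H^T theta_tilde
em_meas = M * (qd_ddot - qddot) + tau_fb          # measurable form, Eq. (10)
```

The two expressions for em agree to machine precision, so the training signal can be computed without knowing θ*.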
Controller Structure: The desired trajectory is denoted by

  q⃗d ≜ [qdᵀ q̇dᵀ q̈dᵀ]ᵀ ∈ C([t0, ∞); R³ⁿ)

The performance error is denoted by

  q̃ = qd − q ∈ Rⁿ    (6)

The control input to the plant is τ = τfb + τff, where τfb denotes the feedback control and τff the feedforward control. The feedback control is given by

  τfb = Kp q̃ + Kv q̃˙    (7)

where Kp ∈ Rⁿˣⁿ and Kv ∈ Rⁿˣⁿ denote positive definite symmetric feedback gain matrices. The feedforward control is given by

  τff = Hᵀ(x)θ    (8)

where H ∈ R^(N×n) is a basis function matrix, θ ∈ R^N denotes a vector of parameter estimates, and x is the input.

Linear-in-Parameter (LIP) property: We assume that there exists a unique vector θ* ∈ R^N such that

  M(q)q̈d + Vm(q, q̇)q̇ + G(q) = Hᵀ(x)θ*  for all x ∈ R³ⁿ    (9)

where x = [qᵀ q̇ᵀ q̈dᵀ]ᵀ. We define the parameter error as θ̃ ≜ θ* − θ.

Modeling Error: The modeling error (see Fig. 3) is given by

  em = Hᵀ(x)θ̃ = M(q)q̈d + Vm(q, q̇)q̇ + G(q) − τff

Solving for τff in (5) and substituting the result into the above, we get

  em = M(q)q̃¨ + τfb    (10)

We also use the following notation for lower and upper bounds on key terms:

  κM ≜ inf(q∈Rⁿ) σmin(M(q)),  κv ≜ σmin(Kv),  κp ≜ σmin(Kp)
  ωM ≜ sup(q∈Rⁿ) ‖M(q)‖F,  ωv ≜ ‖Kv‖F,  ωp ≜ ‖Kp‖F    (11)

We denote the ratio βp ≜ ωp/κp; by definition, βp ≥ 1.

V. Stability Analysis

In this section we present a stability analysis of the modeling error driven robot controller. We assume that the inertia matrix M(q) is known. With respect to feedback linearization, this corresponds to assuming that the g matrix in (2) is known and f is unknown.

Theorem 1: Let the plant be described by (5). Let the feedback and feedforward control be given by (7) and (8), respectively. Let Kv = κv I, and let Kp be any positive definite symmetric matrix.
Let the weights evolve according to

  θ̇ = γH(x)( M(q)q̃¨ + τfb )    (12)

where γ > 0 and x = [q̈dᵀ q̇ᵀ qᵀ]ᵀ. Suppose that κp > 0 and

  κv > CV‖q̇d‖ + √( CV²‖q̇d‖² + 2ωMωp + 4βp²CV²‖q̇d‖² )

Then the origin of the closed-loop system is locally stable.

Proof: Consider the following Lyapunov function candidate:

  V(q̃, q̃˙, θ) = q̃˙ᵀKvM(q)q̃˙ + q̃ᵀKvKpq̃ + 2q̃ᵀKpM(q)q̃˙ + ∫(t0..t) q̃¨ᵀMᵀ(q)M(q)q̃¨ dτ + (1/2)γ⁻¹‖θ̃‖²F    (13)

Note that V is a function of q̃, q̃˙, and θ̃.

Positive Definite Conditions:

  V ≥ κvκM‖q̃˙‖² + κvκp‖q̃‖² − 2ωpωM‖q̃‖‖q̃˙‖ + (1/2)γ⁻¹‖θ̃‖²F

The lower bound on the cross term follows since

  −2ωpωM‖q̃‖‖q̃˙‖ ≤ −2|q̃ᵀKpMq̃˙| ≤ 2q̃ᵀKpMq̃˙

We have also used the fact that, since Mᵀ(q)M(q) is symmetric positive definite, the integral on the right-hand side of (13) is non-negative for any t ≥ t0. Using the fact that, for any ν ≠ 0,

  −2ωpωM‖q̃‖‖q̃˙‖ = ωpωM( ν‖q̃‖ − ν⁻¹‖q̃˙‖ )² − ωpωMν²‖q̃‖² − ωpωMν⁻²‖q̃˙‖²
                 ≥ −ωpωMν²‖q̃‖² − ωpωMν⁻²‖q̃˙‖²

we get

  V ≥ c1‖q̃‖² + c2‖q̃˙‖² + (1/2)γ⁻¹‖θ̃‖²F

where

  c1 = κvκp − ωpωMν²
  c2 = κvκM − ωpωMν⁻²

V is positive definite if c1 and c2 are positive.
This yields the conditions

  ν < √( κvκp / (ωpωM) )    (14)
  ν > √( ωpωM / (κvκM) )    (15)

For such a ν to exist, we must require that

  κv > ωpωM √( 1 / (κpκM) )    (16)

Differentiating V given in (13) along the trajectories of the closed-loop system yields

  V̇ = q̃˙ᵀKvṀ(q)q̃˙ + 2q̃˙ᵀKvM(q)q̃¨ + 2q̃ᵀKpKvq̃˙
       + 2( q̃˙ᵀKpMq̃˙ + q̃ᵀKpMq̃¨ + q̃ᵀKpṀq̃˙ )
       + γ⁻¹θ̃ᵀθ̃˙ + q̃¨ᵀMᵀMq̃¨
     = ( Mq̃¨ + τfb )ᵀMq̃¨ + q̃˙ᵀKvMq̃¨ + q̃ᵀKpMq̃¨ + q̃˙ᵀKvṀq̃˙
       + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙

Note that the error dynamics satisfy

  M(q)q̃¨ = M(q)q̈d − M(q)q̈ = M(q)q̈d − Kpq̃ − Kvq̃˙ − τff + Vm(q, q̇)q̇ + G(q)

Hence we get

  V̇ = ( Mq̃¨ + τfb )ᵀ( Mq̈d − τfb − τff + Vmq̇ + G ) + τfbᵀMq̃¨
       + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙
     = ( Mq̃¨ + τfb )ᵀHᵀθ̃ − ( Mq̃¨ + τfb )ᵀτfb + τfbᵀMq̃¨
       + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpKvq̃˙ + 2q̃˙ᵀKpMq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙
     = emᵀHᵀθ̃ − q̃˙ᵀKvKvq̃˙ − q̃ᵀKpKpq̃ − 2q̃ᵀKpKvq̃˙ + 2q̃ᵀKpKvq̃˙
       + 2q̃˙ᵀKpMq̃˙ + q̃˙ᵀKvṀq̃˙ + 2q̃ᵀKpṀq̃˙ + γ⁻¹θ̃ᵀθ̃˙
     ≤ −( κv² − 2ωpωM − 2κvCV‖q̇d‖ )‖q̃˙‖² − κp²‖q̃‖² + 4ωpCV‖q̇d‖‖q̃‖‖q̃˙‖
       + 2κvCV‖q̃˙‖³ + 4ωpCV‖q̃˙‖²‖q̃‖ + θ̃ᵀ( γ⁻¹θ̃˙ + Hem )
     ≤ −zᵀAz + 2CV( κv + ωp )‖z‖³

where we have used the fact that Ṁ = Vm + Vmᵀ, the learning algorithm given in (12), and the bounds

  2q̃˙ᵀKpMq̃˙ ≤ 2ωMωp‖q̃˙‖²
  q̃˙ᵀKvṀq̃˙ ≤ 2κv‖Vm‖‖q̃˙‖² ≤ 2κvCV‖q̃˙‖³ + 2κvCV‖q̇d‖‖q̃˙‖²
  2q̃ᵀKpṀq̃˙ ≤ 4ωp‖Vm‖‖q̃‖‖q̃˙‖ ≤ 4ωpCV( ‖q̃˙‖²‖q̃‖ + ‖q̇d‖‖q̃‖‖q̃˙‖ )

The vector z ≜ [ ‖q̃‖, ‖q̃˙‖ ]ᵀ, and the matrix A is given by

  A = [ κp²                     −2ωpCV‖q̇d‖
        −2ωpCV‖q̇d‖    κv² − 2ωMωp − 2κvCV‖q̇d‖ ]
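The positive definiteness of A under the gain condition of Theorem 1 can be checked numerically. The following is our own sketch; every constant below is a hypothetical value, not taken from the paper.

```python
import numpy as np

# Our sketch (hypothetical constants): choosing kappa_v above the bound in
# Theorem 1 makes the matrix A from the proof positive definite.

C_V, qd_rate = 0.5, 0.4              # assumed C_V and ||qd_dot||
omega_M, omega_p = 2.0, 3.0          # assumed sup ||M||_F and ||Kp||_F
kappa_p = 2.5                        # assumed sigma_min(Kp)
beta_p = omega_p / kappa_p           # ratio omega_p / kappa_p >= 1

c = C_V * qd_rate
bound = c + np.sqrt(c**2 + 2.0 * omega_M * omega_p + 4.0 * beta_p**2 * c**2)
kappa_v = 1.1 * bound                # pick kappa_v strictly above the bound

A = np.array([
    [kappa_p**2, -2.0 * omega_p * c],
    [-2.0 * omega_p * c,
     kappa_v**2 - 2.0 * omega_M * omega_p - 2.0 * kappa_v * c]])

eigs = np.linalg.eigvalsh(A)         # both eigenvalues should be positive
```

Positivity of both eigenvalues here mirrors the algebraic conditions on A derived in the remainder of the proof.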
Further details can be found in Ref. 1. We have also used the bound ‖Vm(q, q̇)‖ ≤ CV( ‖q̃˙‖ + ‖q̇d‖ ). The matrix A is positive definite if and only if

  κp² + κv² − 2ωMωp − 2κvCV‖q̇d‖ > 0    (17)

and

  4ωp²CV²‖q̇d‖² < κp²( κv² − 2ωMωp − 2κvCV‖q̇d‖ )    (18)

Suppose that κv is chosen such that (18) is satisfied. Then

  κp² + κv² − 2ωMωp − 2κvCV‖q̇d‖ > κp² + 4βp²CV²‖q̇d‖²

The right-hand side is positive if κp > 0, and hence (17) is satisfied. The condition for (18) is then given by

  κv > CV‖q̇d‖ + √( CV²‖q̇d‖² + 2ωMωp + 4βp²CV²‖q̇d‖² )    (19)

With the feedback gains selected as shown above, it follows that

  V̇ ≤ −λmin(A)‖z‖² + 2CV( κv + ωp )‖z‖³

Thus, V̇ ≤ 0 if

  ‖z‖ < λmin(A) / ( 2CV( κv + ωp ) )   ∎

Remark 1: In the above proof, we have assumed that there exists an exact parameterization of the unknown dynamics. This assumption may not always hold, and therefore it is common to employ a neural network (radial basis function or sigmoidal, for example) to approximate the unknown function. This results in a non-zero approximation error that is typically bounded over some compact set. This non-zero approximation error complicates the stability analysis and typically requires additional learning dynamics. We note that, from the point of view of the Lyapunov stability analysis, the approximation error is equivalent to a bounded torque disturbance, i.e., τd ≠ 0 in (5). To deal with this, we apply e-modification, in which the learning dynamics are modified as

  θ̇ = γH(x)( M(q)q̃¨ + τfb ) − h‖z‖θ

where h > 0. Using a lemma derived in Ref. 1, we can then show that the closed-loop system is uniformly bounded.

VI. Simulation Examples

We illustrate the results of Theorem 1 of Section V.
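The two-link dynamics given below can be transcribed directly into a simulation. The following sketch (our transcription of the Section VI matrices, evaluated at an arbitrary state of our choosing) also checks Assumption 2, the skew-symmetry of Ṁ − 2Vm, and the positive definiteness of M(q) numerically.

```python
import numpy as np

# Our transcription of the 2-link matrices from Section VI; the evaluation
# state (q, qdot) is an arbitrary choice for the check.

m1, m2, a1, a2, grav = 0.8, 2.3, 1.0, 1.0, 9.8

def M(q):
    c2 = np.cos(q[1])
    return np.array([
        [(m1 + m2) * a1**2 + m2 * a2**2 + 2 * m2 * a1 * a2 * c2,
         m2 * a2**2 + m2 * a1 * a2 * c2],
        [m2 * a2**2 + m2 * a1 * a2 * c2,
         m2 * a2**2]])

def Vm(q, qdot):
    s2 = m2 * a1 * a2 * np.sin(q[1])
    return np.array([[-qdot[1] * s2, -(qdot[0] + qdot[1]) * s2],
                     [ qdot[0] * s2, 0.0]])

def G(q):
    return np.array([
        (m1 + m2) * grav * a1 * np.cos(q[0])
        + m2 * grav * a2 * np.cos(q[0] + q[1]),
        m2 * grav * a2 * np.cos(q[0] + q[1])])

q, qdot = np.array([0.3, -0.5]), np.array([0.7, 0.2])

# Mdot via central finite difference along the trajectory direction qdot
eps = 1e-6
Mdot = (M(q + eps * qdot) - M(q - eps * qdot)) / (2 * eps)
S = Mdot - 2 * Vm(q, qdot)   # should be skew-symmetric (Assumption 2)
```

The gravity vector G is included for completeness; it does not enter the skew-symmetry check but is needed to integrate the full dynamics.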
We consider the plant to be the two-link manipulator with the following dynamics:3

  M(q)q̈ + Vm(q, q̇)q̇ + G(q) = τ

where

  M(q) = [ (m1+m2)a1² + m2a2² + 2m2a1a2 cos q2      m2a2² + m2a1a2 cos q2
           m2a2² + m2a1a2 cos q2                     m2a2² ]

  Vm(q, q̇) = [ −q̇2 m2a1a2 sin q2     −(q̇1 + q̇2) m2a1a2 sin q2
               q̇1 m2a1a2 sin q2      0 ]

  G(q) = [ (m1+m2)g a1 cos q1 + m2 g a2 cos(q1+q2)
           m2 g a2 cos(q1+q2) ]

with m1 = 0.8 kg, m2 = 2.3 kg, a1 = 1 m, a2 = 1 m, and g = 9.8 m/s². The desired trajectories are given by qd1 = A cos(ωt) and qd2 = A sin(ωt), where A = 0.1 and ω = π. We compare the modeling error based approach with a conventional adaptive controller.8 We plot the trajectory errors of each joint in Fig. 5. We observe that the modeling error based control scheme exhibits good performance compared to the conventional adaptive controller.

VII. Conclusions

In this paper, we have presented an adaptive control algorithm for the n-link robot manipulator that is based on minimization of the modeling error. Due to the choice of a control structure based on feedback linearization, the modeling error can be obtained without knowledge of the unknown parameters. A Lyapunov analysis was used to prove that the closed-loop system is stable. We presented a simulation in which we compared the proposed control scheme to a conventional adaptive controller.

[Figure 5. Comparison of the two adaptive control schemes: trajectory errors q̃1, q̃2, q̃˙1, q̃˙2 over 10 s for conventional adaptive control (left) and modeling error driven adaptive control (right).]

References

1. A.K. Ishihara, S. Ben-Menahem, and N. Nguyen. Protection ellipsoids for stability analysis of feedforward neural-net controllers. IEEE International Joint Conference on Neural Networks (IJCNN), 2009.
2. N. Kulkarni, J. Kaneshige, K. Krishnakumar, and J. Burken.
Modeling-error-driven performance-seeking direct adaptive control. AIAA Guidance, Navigation, and Control Conference, 2008.
3. F.L. Lewis, S. Jagannathan, and A. Yesildirek. Neural Network Control of Robot Manipulators and Nonlinear Systems. Taylor and Francis Ltd., 1999.
4. F.L. Lewis, K. Liu, and A. Yesildirek. Neural net robot controller with guaranteed tracking performance. IEEE Transactions on Neural Networks, 6(3):703-715, 1995.
5. F.L. Lewis, A. Yesildirek, and K. Liu. Multilayer neural-net robot controller with guaranteed tracking performance. IEEE Transactions on Neural Networks, 7(2):1-11, 1996.
6. K.S. Narendra and A.M. Annaswamy. Stable Adaptive Systems. Prentice Hall, Inc., Englewood Cliffs, N.J., 1989.
7. N. Nguyen, K. Krishnakumar, and J. Kaneshige. Dynamics and adaptive control for stability recovery of damaged asymmetric aircraft. AIAA Guidance, Navigation, and Control Conference, AIAA-2006-6049, 2006.
8. J.E. Slotine. Putting physics in control: the example of robotics. IEEE Control Systems Magazine, 8:12-18, 1988.