Predictive Control for Dynamic Locomotion of Real Humanoid Robots

Stylianos Piperakis

Predictive Control for Dynamic Locomotion of Real Humanoid Robots

Stylianos Piperakis

Predictive control for dynamic locomotion of real humanoid robots

visibility

…

description

8 pages

link

1 file

— This article presents a complete formulation of the challenging task of stable humanoid robot omnidirectional walk based on the Cart and Table model for approximating the robot dynamics. For the control task, we propose two novel approaches: preview control augmented with the inverse system for negotiating strong disturbances and uneven terrain and linear model-predictive control approximated by an orthonormal basis for computational efficiency coupled with constraints for stability. For the generation of smooth feet trajectories, we present a new approach based on rigid body interpolation, enhanced by adaptive step correction. Finally, we present a sensor fusion approach for sensor-based state estimation and an effective solution to sensors' noise, delay, and bias issues, as well as to errors induced by the simplified dynamics and actuation imperfections. Our formulation is applied on a real NAO humanoid robot, where it achieves real-time onboard execution and yields smooth and stable gaits.

Predictive Control for Dynamic Locomotion of Real Humanoid Robots Stylianos Piperakis, Emmanouil Orfanoudakis, and Michail G. Lagoudakis Intelligent Systems Laboratory, School of Electronic and Computer Engineering, Technical University of Crete Chania, Crete 73100, Greece Email: {spiperakis, eorfanoudakis, lagoudakis}@intelligence.tuc.gr Abstract— This article presents a complete formulation of the challenging task of stable humanoid robot omnidirectional walk based on the Cart and Table model for approximating the robot dynamics. For the control task, we propose two novel approaches: preview control augmented with the inverse system for negotiating strong disturbances and uneven terrain and linear model-predictive control approximated by an orthonormal basis for computational efficiency coupled with constraints for stability. For the generation of smooth feet trajectories, we present a new approach based on rigid body interpolation, enhanced by adaptive step correction. Finally, we present a sensor fusion approach for sensor-based state estimation and an effective solution to sensors’ noise, delay, and bias issues, as well as to errors induced by the simplified dynamics and actuation imperfections. Our formulation is applied on a real NAO humanoid robot, where it achieves real-time onboard execution and yields smooth and stable gaits. I. I NTRODUCTION Robust stable omnidirectional locomotion for humanoid robots is a crucial problem and an active research area nowadays. In general, biped locomotion relies on distinct gait phases, during which it must be ensured that the sum of the forces acting on the robot do not result in a loss of balance. During the Single-Support (SS) phase, the gait relies on balancing on one leg (support leg), while the other one (swing leg) moves towards a planned foot location. When this is accomplished, it enters the Double-Support (DS) phase, where the weight of the robot is shifted from the support to the swing leg, the legs switch roles, and the process repeats. State-of-the-art work on this problem uses the ZeroMoment-Point (ZMP) [1] as a criterion to measure stability. Other approaches [2], [3] rely on the assumptions that minimal disturbing forces act on the robot and the Center of Mass (CoM) of the robot is constrained within the convex hull defined by the supporting legs, so that the humanoid robot is statically walking. However, such models lack in robustness due to partial use of available sensor information, such as the measured ZMP (when available), and cannot achieve high-speed gaits. The ZMP approach is a formal representation of the problem, which makes full use of sensor information commonly available on humanoid robots, and allows for rigorous solutions to be constructed. In general, dynamic walkers can achieve higher-speed locomotion. Distributed-mass models have been developed for the task of walking, taking into account link masses to compute the ZMP using complex dynamics, which result in computationally expensive solutions. Some of these approaches rely on model accuracy and on the assumption that precomputed trajectories of the state variables can be executed in a straightforward way [4], [5]. Unfortunately, these approaches are proven unsuccessful in environments where disturbances or uneven terrain are encountered. On the other hand, wellknown concentrated-mass models, such as the 3-Dimensional Linear Inverted Pendulum Model (3D-LIPM), simplify body dynamics by approximating the total mass of the robot by a point mass constrained to a horizontal plane at some arbitrary height. Linear predictive controllers are adopted to overcome the control task based on the simplified dynamics. In practice, these approaches are very efficient, without compromising stability and robustness significantly. Most works rely on a Preview Controller [6], [7], [8], [9], [10], while others make use of Linear Model Predictive Control (LMPC) optionally with constraints on the gait [11], [12], [13], [14]. The choice of constrained LMPC results in heavy computational solutions, which are inapplicable in practice due to typically limited computational resources for onboard execution. To this end, a sparse formulation [15] has been proposed, which results in larger optimization problems, but offers faster solutions. Unfortunately, most of these predictive control approaches have been tested only in simulation. While CoM pattern generation is a corner stone for solving the walk task, it is equally important to plan smooth feet trajectories that achieve the desired result, while abiding to the physical constraints of the robot. In addition, sensor readings on a humanoid robot are typically corrupted by noise, delay, and bias making the control task inviable. Proper actuation is also non-trivial, due to imperfect nature of the robot’s joints and also due to use of simplified (or approximate) dynamics. These issues must be addressed effectively for any practical implementation on a real humanoid robot. In this article, we present a complete formulation for dynamic humanoid robot walk based on the Cart and Table model for approximating the robot dynamics. For the control task, we propose two novel approaches; one based on the use of preview control augmented with the inverse system for encountering strong disturbances and the presence of uneven terrain and another based on using LMPC approximated by an orthonormal (Laguerre) basis, which offers faster computation and enables the use of constraints in locomotion for increased stability. For the task of generating smooth feet trajectories, we present a new approach based on rigid body interpolation, as an alternative to common Bezier and Spline interpolations, enhanced by adaptive step corrections to the generated trajectories. Finally, we present a sensor fusion III. L INEAR P REVIEW C ONTROL To generate walking patterns we need to take into account future information since the CoM must start moving before the ZMP starts tracking the reference signal [6]. Therefore, we will employ a Preview Controller [16], [17] for the Cart and Table model [8]. Given the ZMP reference zxref (k), the performance index to be minimized is given by: Inertial Frame Jp = ∞ X Qe e(i)2 + ∆x(i)⊤ Qx ∆x(i) + R∆u2x (i) (4) i=k Fig. 1. The Cart and Table model for approximating the robot dynamics approach for state estimation and we address effectively the sensors’ noise, delay, and bias, as well as the errors induced by simplified dynamics and actuation imperfections. II. S IMPLIFIED S YSTEM DYNAMICS The dynamics of the robot can be modeled as a point-mass cart on a massless table (Cart and Table model) as shown in Fig. 1. This gives a convenient representation for stability, as the ZMP must always lie inside the table’s base and the cart must accelerate properly to keep the table upright. We can generate walking patterns by treating the ZMP control as a servomechanism control problem. Let cx (t) be the position of the CoM (cart) at time t along axis x, expressed at an arbitrary inertial frame. At discrete intervals (assuming a sampling period Ts ), we can define the discrete-time state x(k), which includes the position, the velocity, and the acceleration of the cart at discrete time k: ⊤ x(k) = cx (kTs ) ċx (kTs ) c̈x (kTs ) (1) Assuming stability, the mass is constrained on the upright ... table at height hc . If ux (k) = c x (kTs ) is the control input (jerk of the cart) at discrete time k, then the state-space system dynamics are formed as: z 1 x(k + 1) = 0 0 A }| Ts 1 0 b z }| {  Ts2 /2 Ts3 /6 Ts  x(k) + Ts2 /2 ux (k) (2) 1 Ts { The ZMP of the system under gravitational acceleration g is: ux (k) = −Gi k X e(k)−G⊤ x x(k)− i=0 Np X Gp (j)zxref (k+j) (5) j=1 The state-feedback gain vector Gx , the prediction gains Gp (j), and the integral gain Gi are computed from the system matrices A, b, c⊤ defined in (2) and (3). To account for dynamic changes in the walking pattern, we make use of the auxiliary ZMP, originally introduced for the 3D-LIPM [7]. The auxiliary ZMP modifies the instantaneous behavior of the walking robot and as a result the long-term parameters, namely walking cycle and step width, to ensure compliance with the system dynamics. The output zx of the pattern generator, which is used to control the humanoid robot, gives a ZMP consistent with the robot dynamics (2) and is guaranteed to be arbitrarily close to the reference ZMP by appropriately selecting the weights of (4). In fact, the reference ZMP zxref is modified online by an auxiliary ZMP zxaux to counter disturbances and/or the presence of uneven terrain, noted by a tracking error ǫ: zx (k) = zxref (k) + zxaux (k) + ǫ (6) Such a modification can be achieved online with the use of an inverse system [18]. Let ux (k) = vx (k)−G⊤ x x(k), where vx is provided by the inverse system. Substituting into (2): Ā c⊤ }| { z zx (k) = 1 0 −hc /g x(k) where e(k) = zx (k) − zxref (k), ∆x(k) = x(k) − x(k − 1) is the incremental state vector, ∆ux (k) = ux (k) − ux (k − 1) is the incremental input, Qx is a 3 × 3 symmetric nonnegative definite matrix, Qe > 0, and R > 0. When the ZMP reference can be previewed for Np steps, the optimal control which minimizes the performance index (4) is given by: (3) This one-dimensional model can express the dynamics on the x and y axis independently (assuming decoupled motion). The Cart and Table model has many practical advantages over the well studied 3D-LIPM, most important being the fact that the jerk of the cart is usually optimized within some performance index, resulting in smoother actions [14]. In addition, the non-linear dynamics of the cart help in realizing a better ZMP motion than the 3D-LIPM does. Also, both models are suitable for the SS phase, but only the Cart and Table can model the DS phase [9]. Finally, a Cart and Table controller can generate smoother CoM trajectories compared to the 3D-LIPM and can achieve higher speeds. }| { z x(k + 1) = (A − bG⊤ x ) x(k) + bvx (k) (7) The state space for the inverse Cart and Table model is derived by considering (3) one time sample in the future: zx (k + 1) = c⊤ x(k + 1) = c⊤ Āx(k) + c⊤ bvx (k) Then, by solving for vx c⊤ inv dinv z }| { z }| { −1 vx (k) = −(c⊤ b) c⊤ Ā x(k) + (c⊤ b)−1 zx (k + 1) and substituting into (7) Ainv binv z z }| { }| { x(k+1) = (Ā − b(c⊤ b)−1 c⊤ Ā) x(k)+b(c⊤ b)−1 zx (k+1) 0.12 x(k) ZMP Reference ZMP + Auxialiary ZMP Target CoM 0.1 0.08 Preview Controller zxref (k) ux (k) + Cart and Table Model + zx (k) 0.06 0.04 0.02 vxaux (k) 0 Inverse System xinv (k) Pole Placement Controller −0.02 0 0.5 1 1.5 2 2.5 Time 30 z̃xaux (k) Control u 20 - K⊤ inv xinv (k) 10 + 0 zxaux (k) −10 −20 −30 Fig. 2. Preview Controller with inverse system control scheme aux vxaux (k) = c⊤ inv xinv (k) + dinv zx (k) (8) (9) The inverse Cart and Table model is unstable, therefore a discrete-time pole-placement controller was employed for the stabilization of the system, replacing zxaux in (8) and (9) by: z̃xaux (k) = zxaux (k) − K⊤ inv xinv (k) (10) where Kinv is the pole-placement gain vector. The auxiliary ZMP zxaux in (10) can be computed at each discrete time k using information obtained by the sensors of the humanoid robot. The proposed control scheme is shown in Fig. 2. Fig. 3 shows how the ZMP is correctly realized with the assistance of the auxiliary ZMP under disturbance (from 1.1 to 1.5sec) in the x-axis for three consecutive steps of straight walk. For the control task we did assume that the state x is known at every sampling time k, which in practice is not always true. Therefore, we chose to augment the control system with a Luenberger observer to accomplish the state feedback task. We present a sensor fusion approach (Fig. 4) for the Cart and Table model based on sensors commonly available on humanoid robots. We assume that the humanoid robot is equipped with encoders at all joints, which can be used to measure the real CoM cENC , and pressure-sensitive x FSR at each foot, which can be used to measure the ZMP zxFSR that coincides with the center of pressure during stable walk. Then, the measurement ym sent to the observer is: z 1 0 ym (k) = 1 0 c⊤ m }| { ENC 0 c (k) x(k) = xFSR −hc /g zx (k) 0.5 1 1.5 2 2.5 Time the dynamics of the inverse Cart and Table model become: xinv (k + 1) = Ainv xinv (k) + binv zxaux (k) 0 (11) and the observer dynamics over the estimated state x̂ are: x̂(k +1) = Ax̂(k)+bux (k)+Lobs ym (k)−c⊤ m x̂(k) (12) To find a suitable observer gain Lobs , the discrete-time ⊤ infinite-horizon LQR is adopted with matrices (A−bG⊤ x) , cm , Qobs , and Robs . The observer can compensate for errors in the model and for disturbances encountered while walking, but unfortunately cannot resolve effectively the sensor noise, delay, and bias, which we address later on. Fig. 3. x-axis response of the Preview Controller with the inverse system zxref (k) Predictive Controller ux (k) x̂(k) Observer Fig. 4. Humanoid Robot zxFSR (k) ỹm (k) cENC (k) x Predictive control loop with sensor fusion IV. L INEAR M ODEL P REDICTIVE C ONTROL Linear Model Predictive Control (LMPC) in discrete time using Laguerre basis functions has been proposed before [19], but assumes that the reference signal is constant for the duration of the prediction horizon. In our problem, this assumption does not hold. We will present our extension to this approach for the task of walking pattern generation, after providing a brief introduction to Laguerre functions. A. Laguerre Functions Laguerre functions are used in engineering as a way of generating orthonormal basis functions for various applications, such as system identification. The z-transforms of N discrete-time Laguerre functions leads to N Laguerre networks in the frequency domain: √ 1 − α2 Γ1 (z) = 1√− αz −1 1 − α2 z −1 − α Γ2 (z) = 1 − αz −1 1 − αz −1 .. . √ −1 N −1 1 − α2 z −α ΓN (z) = 1 − αz −1 1 − αz −1 where α is the pole of the networks (0 ≤ α < 1 for stability). Taking the inverse z-transform to return to the time domain leads to non-compact expressions. An alternative way to find the Laguerre functions in the time domain is based on a statespace realization of the networks: Γk (z) = z −1 − α Γk−1 (z) 1 − αz −1 (13) √ 2 1−α with initial state Γ1 (z) = 1−αz −1 . If li (k) is the inverse z-transform of Γi (z), the set of discrete-time Laguerre functions can be expressed in vector form as: ⊤ L(k) = l1 (k) l2 (k) · · · lN (k) (14) xe (k) zxref (k) Model Predictive Controller Due to (13), the set of discrete-time Laguerre functions satisfies the discrete-time state space: L(k + 1) = Al (α)L(k) with initial condition: p L(0) = 1 − α2 1 −α ··· (−1)N −1 αN −1 (15) ⊤ To derive the LMPC, a modified state space is used. By enhancing the state-space model with an embedded integrator [20], the modified state vector becomes: ⊤ xe (k) = ∆x(k)⊤ zx (k) (17) The Cart and Table model with the embedded integrator is: xe (k + 1) = }| A c⊤ A ⊤ 0 1 c⊤ e { be z }| { b xe (k) + ⊤ ∆ux (k) c b z }| { zx (k) = 0⊤ 1 xe (k) N X ηxj(k)lj (m) = L(m)⊤ η x (k) xe (k + m|k) = + i=0 Aem−i−1 be L(i)⊤ η x (k) and the predicted output at sampling time m is given by: i=0 C. Unconstrained LMPC In the conventional LMPC approach [20], the performance index to be minimized is given by: 2 2 minimize Ju = kZref x (k) − Zx (k)k2 + k∆Ux (k)kR (23) ∆U (k) x where R is a diagonal positive definite matrix and ref ⊤ ref ref Zref x (k) = zx (k + 1) zx (k + 2) · · · zx (k + Np ) ⊤ Zx (k) = zx (k + 1|k) zx (k + 2|k) · · · zx (k + Np |k) ⊤ ∆Ux (k) = ∆ux (k) ∆ux (k + 1) · · · ∆ux (k + Nc − 1) (19) 2 2 minimize Ju = kZref x (k) − Zx (k)k2 + kη x (k)kRL (24) η (k) (20) In this design framework, there is no explicit control horizon parameter Nc , as in the conventional LMPC approach. Instead, the number of Laguerre functions N along with parameter α determine the complexity of the incremental control trajectory. For α = 0 and N = Nc , the conventional LMPC approach is recovered. When using (20) in the modified state-space model (18), the prediction of the future state variable xe (k + m|k) at sampling time m based on xe (k) is given by: m−1 X zx (k) Replacing each term ∆ux (k + m) as in (20) and using the orthonormality of the Laguerre functions, the performance index (23) becomes equivalent to: j=1 Am e xe (k) Cart and Table Model (18) At discrete time k, the incremental control signal trajectory ∆ux (k), ∆ux (k + 1), . . . , ∆ux (k + m), . . . is regarded as the impulse response of a stable dynamic system. Thus, a set of Laguerre functions, l1 (m), l2 (m), . . . , lN (m), are used to capture this dynamic response with the corresponding Laguerre coefficients ηx1 , ηx2 , . . . , ηxN , which will be determined by the optimization procedure. For an arbitrary future time sample m, with k being the start of the moving horizon window, the incremental control input will be given by: ∆ux (k + m) = ux (k) LMPC control scheme (16) B. LMPC using Laguerre Functions Ae Fig. 5. z z−1 m−1 X ⊤ m−i−1 m zx (k +m|k) = c⊤ ce Ae be L(i)⊤η x (k) (22) e Ae xe (k)+ where Al is a specific lower-triangular N × N matrix. One important aspect of the Laguerre basis is the orthonormality property, which plays a vital role in the optimization procedure of the LMPC. z ∆ux (k) (21) x where RL is a positive definite matrix based on matrix R. denoting the convolution appearing therein as Using (22) Pand m−1 m−i−1 be L(i)⊤ , Zx (k) takes the form: φ(m) = i=0 c⊤ e Ae z  T }| c⊤ e Ae 2 c⊤ e Ae .. . { Φ z }| { φ(1)    φ(2)      Zx (k) =   xe (k) +  ..  η x (k)    .  Np φ(Np ) c⊤ e Ae (25) Substituting (25) in (24) gives the standard quadratic form: ⊤ J = η x (k)⊤(Φ⊤ Φ + RL )η x (k)+2(Txe (k)−Zref x (k)) Φη x (k)+const Then, the analytic solution to the unconstrained problem is: −1 ⊤ ref η x (k) = Φ⊤ Φ + RL Φ Zx (k) − Txe (k) (26) Upon obtaining the optimal coefficients η x (k), the receding horizon control law [19] is realized as: ∆ux (k) = L(0)⊤ η x (k) (27) The proposed control scheme is shown in Fig. 5. The same procedure can be followed for the y-axis. Fig. 6 shows the x-axis response for a few omnidirectional steps. ZMP Reference ZMP Target CoM 0.1 0.05 0 0 0.5 1 1.5 2 2.5 Time 5 0 −5 for m = 1, . . . , Np , where I is the 2 × 2 identity matrix, sx and sy are the length and width of the foot respectively, and Rz (θi ) ∈ SO(2) is a rotation matrix of angle θi about the vertical z-axis at the i-th moment of the prediction horizon. Using (22) we can express the constraints in terms of the Laguerre coefficients, forming a Quadratic Program (QP) with linear constraints (dropping the index k for clarity): −10 −20 Jc = η ⊤ H η + f ⊤ η minimize −15 η Control u 0 0.5 1 1.5 2 2.5 Fig. 6. Unconstrained LMPC response for omindirectional steps (x-axis) where  EΦ −FΦ Ac =  −EΦ FΦ  Fig. 7. DS convex hull approximated by a series of SS polygons D. Constrained LMPC A key feature of model predictive control is the ability to handle hard constraints in the design. In our case we will introduce constraints on the output of the system, namely the ZMP realized by the system dynamics will be constrained to lie in the support polygon defined separately in SS and DS. When walking omnidirectionally, the x and y axes cannot be decoupled in the optimization procedure, because of the coupling imposed by the constraints. Therefore, the complete problem must be solved once every sampling time. The performance index to be minimized takes the following form: Jc = η(k)⊤ H η(k) + f ⊤ (k)η(k) where, using the same Laguerre basis for both inputs ux , uy , ⊤ η (k) Φ Φ + RL 0 η(k) = x H= η y (k) 0 Φ⊤ Φ + R L " ⊤ # 2 Txe (k) − Zref Φ x (k) f (k) = ⊤ 2 Tye (k) − Zref (k) Φ y During the SS phase, the support polygon is defined by the supporting foot and remains unchanged, whereas, during the DS phase, it is the convex hull of the feet and cannot be easily defined. Therefore, at each DS sampling time k, it is approxi- mated as a moving support foot centered at zxref (k), zyref (k) with orientation changing from the previous support foot to the next one using some interpolation method, as shown in Fig. 7. Therefore, the constraints on the ZMP for both phases take the form:   sx ref 1 s  I ⊤ zx (k + m|k) − zx (k + m)  Rz (θi ) ≤  y zy (k + m|k) − zyref (k + m) s −I 2 x sy  FΦ EΦ   −FΦ −EΦ  y ref Sx − E(Txxe (k) − Zref x (k)) − F(Txe (k) − Zy (k)) y ref  Sy + F(Txxe (k) − Zref x (k)) − E(Txe (k) − Zy (k)) bc =  x y ref  Sx + E(Txe (k) − Zx (k)) + F(Txe (k) − Zref (k)) y y ref Sy − F(Txxe (k) − Zref (k)) + E(Tx (k) − Z (k)) e y x Sx = 12 sx . . .  cos(θ1 ) · · ·  .. .. E = . . 0 ··· ⊤ Sy = 21 sy . . . sy    sin(θ1 ) · · · 0 0   .  .. .. ..   F = .. . . . 0 · · · sin(θNp ) cos(θNp ) sx ⊤ We utilize a penalty method [21] for solving the QP, which we have found in practice to be very efficient and accurate. The performance index minimized by the penalty method is: 4Np J0 = Jc + X i=1 (28) (29) subject to Ac η ≤ bc Time t i a⊤ c i η − bc i (30) where ti is the penalty coefficient for constraint i, a⊤ ci is the i-th row of Ac and bci the i-th element of bc . Assuming the problem is feasible, Algorithm 1 finds a solution efficiently by iteratively identifying violated constraints and adjusting the corresponding penalty coefficients. The closed-form solution for ti (Step 4) is due to the quadratic form of J0 . Upon obtaining the optimal vector η for sampling time k, we can apply the receding horizon law as follows: L(0)⊤ 0⊤ ∆u(k) = η(k) (31) 0⊤ L(0)⊤ The constrained LMPC and the system response are shown in Fig. 8, when a disturbance is encountered in the y-axis. Algorithm 1 QP Solver using a Penalty Method 1: while (TRUE) do 2: compute η = −(2H)−1 f 3: find i such that a⊤ ci η − bci > 0 (violated constraint) if i exists then ti = 5: else return η 6: end if 7: end while 4: a⊤ ci η−bci ⊤ ac (2H)−1 aci i and f = f + ti aci 0.1 ZMP 0.08 Reference ZMP Target CoM 0.06 Constraints on ZMP −3 x 10 0.04 15 0.02 0.015 0 10 0.01 −0.02 5 −0.04 0.2 0 0.15 −0.06 −0.08 0.15 0.005 0.1 0.1 0 0.5 1 1.5 2 2.5 Time Disturbance −5 −0.1 0.05 −0.05 0 0 0.05 0 −0.1 0 −0.05 −0.05 0.1 0.05 0 0.05 0.15 0.2 X 0.3 −0.15 −0.05 0.1 −0.1 0.25 Y 0.15 0.2 0.25 −0.1 Y X 80 Control u 60 40 20 0 −20 −40 −60 0 0.5 1 1.5 2 2.5 Time Fig. 8. Constrained LMPC response for a constant disturbance (y-axis). Fig. 9. Feet trajectory using Bezier (top-left), B-spline (top-right), and rigid body interpolation (bottom-left: 3D view, bottom-right: lateral view). V. F EET T RAJECTORY G ENERATION Once the CoM patterns are determined, a proper feet trajectory must be generated to achieve a smooth gait. Using the cubic Bezier interpolation method with four control points, we can generate a smooth trajectory for the swinging leg in the SS phase. However, in practice, the foot does not make contact with the ground smoothly. To compensate for this phenomenon, the cubic B-spline interpolation method was used. Working with five control points, a smoother foot motion was achieved, both at the start and at the end of the SS phase. All such interpolation methods treat the foot as a single point object and solve each axis separately to compute a trajectory. We can treat the feet as rigid bodies and generate minimum jerk trajectories with homogeneous boundary conditions [22]. In this case, the initial and final velocity and acceleration of the foot is set to zero. Given the initial and final foot rotation matrices, Ri , Rf ∈ SO(3), and the initial and final foot position in three-dimensional space, di , df ∈ R3 , the reference trajectory at time k is given by: Rref (k) = Ri ep(k)ω̂0 d ref (k) = p(k)(df − di ) + df (32) (33) where p(k) = 6k 5 − 15k 4 + 10k 3 and ω̂0 ∈ so(3) can be determined from the initial and final rotation and position. The outcome is an affine transformation Tref (k) ∈ SE(3): ref R (k) dref (k) ref T (k) = (34) 0⊤ 1 which is subsequently expressed into the robot local coordinate frame and, after taking into account the target CoM position, is fed into inverse kinematics for computing joint angles. All interpolation methods are illustrated in Fig. 9. If a disturbance is sensed during a SS phase, the placement of the swinging foot can be adaptively corrected, so as to increase the area of the support polygon in the upcoming DS phase and, therefore, improve stability of the gait. At sampling instant k the future ZMP error in the x-axis (similarly in the y-axis) is estimated from the observer as: zxe (k + 1) = ẑx (k + 1) − zxref (k + 1) (35) Then, an appropriate, online modification to the foot trajectory of the swinging leg, dx (k + 1), can be achieved by: e dx (k + 1) = dref x (k + 1) + Kp zx (k + 1) (36) where Kp is a scalar gain determined by experimentation. VI. I MPLEMENTATION Both approaches have been fully implemented on the Aldebaran NAO humanoid robot [23] for the needs of our team Kouretes [www.kouretes.gr] competing in the Standard Platform League (SPL) of the RoboCup competition [24]. NAO (v. 3.3) is a 58cm, 5kg humanoid robot. It is equipped with an x86 AMD Geode processor at 500 MHz and 256 MB SDRAM. It has 21 degrees of freedom; 2 in the head, 4 in each arm, 5 in each leg and 1 in the pelvis. All joints are position-controlled, using closed-loop PID controllers and encoders. It also features an Inertial Measurement Unit (IMU) in the torso and FSR at each foot. ZMP Estimation using FSR: The FSR measurements come with significant delay (about 50ms) and are corrupted by noise. These issues compromise the stability of the proposed control schemes. To compensate for the delay and noise and acquire a noise-free reading for the ZMP in each axis, when at least one foot is in contact with the ground, a discrete-time Kalman filter with feedback from the Cart and Table model in the corresponding axis is employed. Assuming a delay of d sampling instants, the filter dynamics in the x-axis are: ẑxFSR k−d k−d− 1 = ẑxFSR k−d− 1 k−d− 1 + ∆ẑx (k−d) ỹ(k−d) = zxFSR (k−d) − ẑxFSR k−d k−d− 1 ẑxFSR k−d k−d = ẑxFSR k−d k−d− 1 + K(k−d)ỹ(k−d) where ẑxFSR is the filter’s estimate of the ZMP, zxFSR is the ZMP reading provided by the FSR, ỹ is the innovation and K is the optimal Kalman gain, which takes into account the process and measurement noise. The input of the filter is the incremental ZMP estimate ∆ẑx (k) = ẑx (k) − ẑx (k − 1) provided by the observer. This way, observations are integrated immediately into the filter’s estimate at the correct zxref (k) Predictive Controller 0.08 zxFSR (k − d) NAO Humanoid Robot ux (k) ZMP Reference ZMP Target CoM 0.06 0.04 0.02 ẑ(k) b e(k) x ( z−1 ) z ·z −d ∆ẑx (k − d) Kalman Filter 0 −0.02 −0.04 Augmented Observer ỹm (k) cENC (k) x ẑxFSR (k|k −0.06 0 5 10 − d) 80 Fig. 10. 15 Time Implemented control and estimation scheme on the NAO robot Control u 60 40 20 0 sampling instant. As a result, the filter’s estimate is also delayed by d sampling instants. Given the fact that the Kalman filter is an optimal predictor, we use it to obtain the estimate d sampling instants in the future. In such a way the estimate ẑxFSR (k|k − d) becomes synchronized and is fed to the observer. Elimination of ZMP Bias: Another important issue encountered is a systematic bias of the ZMP readings, due to sensor tolerances. Thus, NAO tends to hobble in the direction of the bias and eventually tips over. Our initial attempt to estimate the bias with the Kalman filter presented above had no practical impact, due to the lack of observability. To overcome this issue, the state space of the Cart and Table model was augmented with a ZMP bias state, so that the system observer can estimate it. The augmented state vector is given by: ⊤ e(k) = cx (kTs ) ċx (kTs ) c̈x (kTs ) zxbias (kTs ) x (37) and the augmented Cart and Table model takes the form: e A e b z }| { z}|{ A 0 b e(k + 1) = ⊤ e(k) + u (k) x x 0 x 0 1 (38) e c⊤ z }| { e(k) zx (k) = c⊤ 1 x (39) Therefore, the augmented observer dynamics are given by: z e c⊤ m }| { 1 0 0 0 cENC (k) x e e ym (k) = x(k) = FSR 1 0 −hc /g 1 ẑx (k|k − d) b b b e x (k) + L ex e obs y e(k + 1) = A e(k) + bu em (k) − e e(k) x c⊤ mx where ẑxFSR (k|k − d) is the filter’s estimate, which serves as a measurement of ẑx (k|k−d)+zxbias (k|k−d). This approach proved to be very effective and converged to the bias values in both axes. The final implemented control and estimation scheme is shown in Fig. 10. Actuation Error Correction: Despite the use of our analytical inverse kinematics for NAO [25], a given target Tref (k) for a foot is never reached without error on a real robot. Let Tm (k) be the actual position reached as measured by the joint encoders. The actuation error can be computed −1 ref as E(k) = Tm (k) T (k − 1). Assuming this error −20 −40 −60 −80 0 5 10 15 Time Fig. 11. y-axis response for straight gait using preview control on NAO is roughly constant for consecutive sampling instances, we can adjust the next target to account for it using a damping method. In particular, the Euler angles and the displacement are extracted from E(k), they are scaled by a damping factor 0 ≤ γ ≪ 1, a new transformation matrix Eγ (k) is reconstructed using the damped values, and the corrected next target is expressed as T(k) = Tref (k)Eγ (k). VII. E XPERIMENTAL R ESULTS The proposed walk approach is demonstrated for a simple 30-step straight gait with an average speed of 14cm/sec using both preview control (Fig. 11) and constrained LMPC (Fig. 12) computed on board on a real NAO robot. In the experimentation we used a sampling period of Ts = 10ms, a delay of d = 5, a prediction horizon of Np = 100 for both predictive controllers, a set of N = 15 Laguerre functions, and a pole of α = 0.8. In both cases, the reference ZMP is closely tracked by the actual ZMP with the constrained LMPC yielding more accurate tracking and faster response. Additionally, there are no signs of ZMP delays, noise, and bias in contrast to our initial experiments where these issues were not taken into account. Finally, the most significant difference between the two approaches can be observed in the magnitude of the control signal; the constrained LMPC results in a smoother CoM trajectory and, thus, smoother and much faster walk (measured up to 21cm/sec on NAO). The ZMP bias estimated by the augmented observer over time is shown in Fig. 13. The bias changes smoothly during the cycle (..., left SS, DS, right SS, DS, ...) of support phases. Note that for this particular robot there is no symmetry between left and right leg ZMP biases. The x-axis response of constrained LMPC when a significant constant disturbance is applied to a standing NAO robot is shown in Fig. 14. The attempt to counter the disturbance makes the actual ZMP “hit” the constraints, nevertheless without violating it; soon thereafter, both the actual ZMP and the CoM are driven to equilibrium. 0.06 ZMP Reference ZMP Target CoM 0.04 R EFERENCES 0.02 0 −0.02 −0.04 −0.06 0 5 10 15 Time 40 Control u 30 20 10 0 −10 −20 −30 −40 0 5 10 15 Time Fig. 12. y-axis response for straight gait using constrained LMPC on NAO −3 x 10 4 ZMP bias 3 2 1 0 −1 −2 −3 −4 −5 −6 0 5 10 15 Time Fig. 13. Estimated ZMP bias in the y-axis for straight gait 0.04 0.03 ZMP 0.02 Reference ZMP Target CoM 0.01 Constraints on ZMP 0 −0.01 −0.02 −0.03 −0.04 0 1 2 3 4 5 6 7 Time Constraints activated 80 Control u 60 40 20 0 −20 −40 −60 −80 0 1 2 3 4 5 6 7 Time Fig. 14.Constrained LMPC x-axis response when disturbing a standing NAO VIII. C ONCLUSION In this article we presented a complete formulation for humanoid robot walk using two novel control approaches; preview control augmented with the inverse system and (un-) constrained LMPC approximated by a Laguerre basis. Additionally, we proposed a new approach based on rigid body interpolation for generating smooth feet trajectories. Finally, we addressed all implementation issues and demonstrated the effectiveness of the proposed approaches on a real NAO robot. In the future, we plan to make use of the IMU measurements in the auxiliary ZMP computation and also extend the LMPC framework to make online modifications to the CoM height, aiming at even smoother gaits. [1] M. Vukobratovic and B. Borovac, “Zero-moment point - thirty five years of its life.” I. J. Humanoid Robotics, vol. 1, 2004. [2] C. Graf and T. Röfer, “A closed-loop 3D-LIPM gait for the RoboCup Standard Platform League humanoid,” in Proceedings of the 5th Workshop on Humanoid Soccer Robots, 2010. [3] J. Alcaraz-Jiménez, D. Herrero-Pérez, and H. Martı́nez-Barberá, “Robust feedback control of ZMP-based gait for the humanoid robot Nao,” Int. J. Rob. Res., vol. 32, no. 9-10, pp. 1074–1088, Aug. 2013. [4] Q. Huang, K. Yokoi, S. Kajita, K. Kaneko, H. Arai, N. Koyachi, and K. Tanie, “Planning walking patterns for a biped robot,” IEEE Transactions on Robotics and Automation, vol. 17, no. 3, pp. 280– 289, Jun 2001. [5] K. Hirai, M. Hirose, Y. Haikawa, and T. Takenaka, “The development of honda humanoid robot,” in Robotics and Automation, 1998. Proceedings. 1998 IEEE International Conference on, vol. 2, May 1998, pp. 1321–1326 vol.2. [6] S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Yokoi, and H. Hirukawa, “A realtime pattern generator for biped walking,” in Robotics and Automation, 2002. Proceedings. ICRA ’02. IEEE International Conference on, vol. 1, 2002, pp. 31–37 vol.1. [7] S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi, and H. Hirukawa, “Biped walking pattern generation by using preview control of zero-moment point,” in Robotics and Automation, 2003. Proceedings. ICRA ’03. IEEE International Conference on, vol. 2, 2003, pp. 1620–1626 vol.2. [8] J. Strom, G. Slavov, and E. Chown, “Omnidirectional walking using ZMP and preview control for the NAO humanoid robot,” in RoboCup 2009: Robot Soccer World Cup XIII, 2010, vol. 5949, pp. 378–389. [9] M. Arbulu, “Stable locomotion of humanoid robots based on mass concentrated model,” Ph.D. Thesis, Universidad Carlos III de Madrid, 2008. [10] O. Urbann and S. Tasse, “Observer based biped walking control, a sensor fusion approach,” Autonomous Robots, vol. 35, no. 1, pp. 37– 49, 2013. [11] D. Dimitrov, P. B. Wieber, O. Stasse, H. Ferreau, and H. Diedam, “An optimized linear model predictive control solver for online walking motion generation,” in Robotics and Automation, 2009. ICRA ’09. IEEE International Conference on, 2009, pp. 1171–1176. [12] A. Herdt, H. Diedam, P.-B. Wieber, D. Dimitrov, K. Mombaur, and M. Diehl, “Online Walking Motion Generation with Automatic Foot Step Placement,” Advanced Robotics, vol. 24, no. 5-6, 2010. [13] P. B. Wieber, “Trajectory free linear model predictive control for stable walking in the presence of strong perturbations,” in Humanoid Robots, 2006 6th IEEE-RAS International Conference on, 2006, pp. 137–142. [14] W. T. Chew, A. Astolfi, and X. Ming, “Robust control of bipedal humanoid (tpinokio),” Procedia Engineering, vol. 41, 2012. [15] D. Dimitrov, A. Sherikov, and P.-B. Wieber, “A sparse model predictive control formulation for walking motion generation,” in IROS 2011 - IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, Sept. 2011, pp. 2292–2299. [16] T. Sheridan, “Three models of preview control,” Human Factors in Electronics, IEEE Transactions on, vol. HFE-7, 1966. [17] T. Katayama, T. Ohki, T. Inoue, and T. Kato, “Design of an optimal controller for a discrete-time system subject to previewable demand,” International Journal of Control, vol. 41, no. 3, pp. 677–699, 1985. [18] M. Sain and J. Massey, “Invertibility of linear time-invariant dynamical systems,” Automatic Control, IEEE Transactions on, vol. 14, 1969. [19] L. Wang, Model Predictive Control System Design and Implementation Using MATLAB, 1st ed. Springer Publishing Company, 2009. [20] J. Maciejowski, Predictive Control: With Constraints. Prentice Hall, 2002. [21] D. Luenberger and Y. Ye, Linear and Nonlinear Programming. Springer, 2008. [22] M. Zefran and V. Kumar, “Planning of smooth motions on SE(3),” in Robotics and Automation, 1996. Proceedings., 1996 IEEE International Conference on, vol. 1, Apr 1996, pp. 121–126 vol.1. [23] Aldebaran Robotics, “Nao documentation,” 2012, only available online: www.aldebaran-robotics.com/documentation. [24] H. Kitano, M. Asada, Y. Kuniyoshi, I. Noda, E. Osawa, and H. Matsubara, “Robocup: A challenge problem for AI,” AI Magazine, vol. 18, no. 1, pp. 73–85, 1997. [25] N. Kofinas, E. Orfanoudakis, and M. G. Lagoudakis, “Complete analytical forward and inverse kinematics for the nao humanoid robot,” Journal of Intelligent and Robotic Systems, pp. 1–14, 2014.

Log In

Predictive Control for Dynamic Locomotion of Real Humanoid Robots

Related papers

Related papers

Related topics