Predictive Control for Dynamic Locomotion of Real Humanoid Robots
Stylianos Piperakis, Emmanouil Orfanoudakis, and Michail G. Lagoudakis
Intelligent Systems Laboratory, School of Electronic and Computer Engineering, Technical University of Crete
Chania, Crete 73100, Greece
Email: {spiperakis, eorfanoudakis, lagoudakis}@intelligence.tuc.gr
Abstract— This article presents a complete formulation of
the challenging task of stable humanoid robot omnidirectional
walk based on the Cart and Table model for approximating
the robot dynamics. For the control task, we propose two novel
approaches: preview control augmented with the inverse system
for negotiating strong disturbances and uneven terrain and linear model-predictive control approximated by an orthonormal
basis for computational efficiency coupled with constraints for
stability. For the generation of smooth feet trajectories, we
present a new approach based on rigid body interpolation,
enhanced by adaptive step correction. Finally, we present a
sensor fusion approach for sensor-based state estimation and
an effective solution to sensors’ noise, delay, and bias issues,
as well as to errors induced by the simplified dynamics and
actuation imperfections. Our formulation is applied on a real
NAO humanoid robot, where it achieves real-time onboard
execution and yields smooth and stable gaits.
I. I NTRODUCTION
Robust stable omnidirectional locomotion for humanoid
robots is a crucial problem and an active research area
nowadays. In general, biped locomotion relies on distinct
gait phases, during which it must be ensured that the sum
of the forces acting on the robot do not result in a loss of
balance. During the Single-Support (SS) phase, the gait relies
on balancing on one leg (support leg), while the other one
(swing leg) moves towards a planned foot location. When this
is accomplished, it enters the Double-Support (DS) phase,
where the weight of the robot is shifted from the support to
the swing leg, the legs switch roles, and the process repeats.
State-of-the-art work on this problem uses the ZeroMoment-Point (ZMP) [1] as a criterion to measure stability.
Other approaches [2], [3] rely on the assumptions that
minimal disturbing forces act on the robot and the Center of
Mass (CoM) of the robot is constrained within the convex
hull defined by the supporting legs, so that the humanoid
robot is statically walking. However, such models lack in
robustness due to partial use of available sensor information,
such as the measured ZMP (when available), and cannot
achieve high-speed gaits. The ZMP approach is a formal
representation of the problem, which makes full use of sensor
information commonly available on humanoid robots, and
allows for rigorous solutions to be constructed. In general,
dynamic walkers can achieve higher-speed locomotion.
Distributed-mass models have been developed for the task
of walking, taking into account link masses to compute the
ZMP using complex dynamics, which result in computationally expensive solutions. Some of these approaches rely on
model accuracy and on the assumption that precomputed
trajectories of the state variables can be executed in a
straightforward way [4], [5]. Unfortunately, these approaches
are proven unsuccessful in environments where disturbances
or uneven terrain are encountered. On the other hand, wellknown concentrated-mass models, such as the 3-Dimensional
Linear Inverted Pendulum Model (3D-LIPM), simplify body
dynamics by approximating the total mass of the robot by a
point mass constrained to a horizontal plane at some arbitrary
height. Linear predictive controllers are adopted to overcome
the control task based on the simplified dynamics. In practice,
these approaches are very efficient, without compromising
stability and robustness significantly. Most works rely on a
Preview Controller [6], [7], [8], [9], [10], while others make
use of Linear Model Predictive Control (LMPC) optionally
with constraints on the gait [11], [12], [13], [14]. The
choice of constrained LMPC results in heavy computational
solutions, which are inapplicable in practice due to typically
limited computational resources for onboard execution. To
this end, a sparse formulation [15] has been proposed, which
results in larger optimization problems, but offers faster
solutions. Unfortunately, most of these predictive control
approaches have been tested only in simulation.
While CoM pattern generation is a corner stone for solving
the walk task, it is equally important to plan smooth feet trajectories that achieve the desired result, while abiding to the
physical constraints of the robot. In addition, sensor readings
on a humanoid robot are typically corrupted by noise, delay,
and bias making the control task inviable. Proper actuation is
also non-trivial, due to imperfect nature of the robot’s joints
and also due to use of simplified (or approximate) dynamics.
These issues must be addressed effectively for any practical
implementation on a real humanoid robot.
In this article, we present a complete formulation for
dynamic humanoid robot walk based on the Cart and Table
model for approximating the robot dynamics. For the control
task, we propose two novel approaches; one based on the use
of preview control augmented with the inverse system for
encountering strong disturbances and the presence of uneven
terrain and another based on using LMPC approximated
by an orthonormal (Laguerre) basis, which offers faster
computation and enables the use of constraints in locomotion
for increased stability. For the task of generating smooth feet
trajectories, we present a new approach based on rigid body
interpolation, as an alternative to common Bezier and Spline
interpolations, enhanced by adaptive step corrections to the
generated trajectories. Finally, we present a sensor fusion
III. L INEAR P REVIEW C ONTROL
To generate walking patterns we need to take into account
future information since the CoM must start moving before
the ZMP starts tracking the reference signal [6]. Therefore,
we will employ a Preview Controller [16], [17] for the Cart
and Table model [8]. Given the ZMP reference zxref (k), the
performance index to be minimized is given by:
Inertial Frame
Jp =
∞
X
Qe e(i)2 + ∆x(i)⊤ Qx ∆x(i) + R∆u2x (i)
(4)
i=k
Fig. 1.
The Cart and Table model for approximating the robot dynamics
approach for state estimation and we address effectively the
sensors’ noise, delay, and bias, as well as the errors induced
by simplified dynamics and actuation imperfections.
II. S IMPLIFIED S YSTEM DYNAMICS
The dynamics of the robot can be modeled as a point-mass
cart on a massless table (Cart and Table model) as shown in
Fig. 1. This gives a convenient representation for stability,
as the ZMP must always lie inside the table’s base and the
cart must accelerate properly to keep the table upright. We
can generate walking patterns by treating the ZMP control
as a servomechanism control problem.
Let cx (t) be the position of the CoM (cart) at time t along
axis x, expressed at an arbitrary inertial frame. At discrete
intervals (assuming a sampling period Ts ), we can define
the discrete-time state x(k), which includes the position, the
velocity, and the acceleration of the cart at discrete time k:
⊤
x(k) = cx (kTs ) ċx (kTs ) c̈x (kTs )
(1)
Assuming stability, the mass is constrained on the upright
...
table at height hc . If ux (k) = c x (kTs ) is the control input
(jerk of the cart) at discrete time k, then the state-space
system dynamics are formed as:
z
1
x(k + 1) = 0
0
A
}|
Ts
1
0
b
z }| {
Ts2 /2
Ts3 /6
Ts x(k) + Ts2 /2 ux (k) (2)
1
Ts
{
The ZMP of the system under gravitational acceleration g is:
ux (k) = −Gi
k
X
e(k)−G⊤
x x(k)−
i=0
Np
X
Gp (j)zxref (k+j) (5)
j=1
The state-feedback gain vector Gx , the prediction gains
Gp (j), and the integral gain Gi are computed from the
system matrices A, b, c⊤ defined in (2) and (3).
To account for dynamic changes in the walking pattern, we
make use of the auxiliary ZMP, originally introduced for the
3D-LIPM [7]. The auxiliary ZMP modifies the instantaneous
behavior of the walking robot and as a result the long-term
parameters, namely walking cycle and step width, to ensure
compliance with the system dynamics.
The output zx of the pattern generator, which is used to
control the humanoid robot, gives a ZMP consistent with the
robot dynamics (2) and is guaranteed to be arbitrarily close
to the reference ZMP by appropriately selecting the weights
of (4). In fact, the reference ZMP zxref is modified online
by an auxiliary ZMP zxaux to counter disturbances and/or the
presence of uneven terrain, noted by a tracking error ǫ:
zx (k) = zxref (k) + zxaux (k) + ǫ
(6)
Such a modification can be achieved online with the use of
an inverse system [18]. Let ux (k) = vx (k)−G⊤
x x(k), where
vx is provided by the inverse system. Substituting into (2):
Ā
c⊤
}|
{
z
zx (k) = 1 0 −hc /g x(k)
where e(k) = zx (k) − zxref (k), ∆x(k) = x(k) − x(k − 1)
is the incremental state vector, ∆ux (k) = ux (k) − ux (k −
1) is the incremental input, Qx is a 3 × 3 symmetric nonnegative definite matrix, Qe > 0, and R > 0. When the ZMP
reference can be previewed for Np steps, the optimal control
which minimizes the performance index (4) is given by:
(3)
This one-dimensional model can express the dynamics on the
x and y axis independently (assuming decoupled motion).
The Cart and Table model has many practical advantages
over the well studied 3D-LIPM, most important being the
fact that the jerk of the cart is usually optimized within some
performance index, resulting in smoother actions [14]. In
addition, the non-linear dynamics of the cart help in realizing
a better ZMP motion than the 3D-LIPM does. Also, both
models are suitable for the SS phase, but only the Cart and
Table can model the DS phase [9]. Finally, a Cart and Table
controller can generate smoother CoM trajectories compared
to the 3D-LIPM and can achieve higher speeds.
}|
{
z
x(k + 1) = (A − bG⊤
x ) x(k) + bvx (k)
(7)
The state space for the inverse Cart and Table model is
derived by considering (3) one time sample in the future:
zx (k + 1) = c⊤ x(k + 1) = c⊤ Āx(k) + c⊤ bvx (k)
Then, by solving for vx
c⊤
inv
dinv
z
}|
{
z }| {
−1
vx (k) = −(c⊤ b) c⊤ Ā x(k) + (c⊤ b)−1 zx (k + 1)
and substituting into (7)
Ainv
binv
z
z }| {
}|
{
x(k+1) = (Ā − b(c⊤ b)−1 c⊤ Ā) x(k)+b(c⊤ b)−1 zx (k+1)
0.12
x(k)
ZMP
Reference ZMP + Auxialiary ZMP
Target CoM
0.1
0.08
Preview
Controller
zxref (k)
ux (k) +
Cart and
Table Model
+
zx (k)
0.06
0.04
0.02
vxaux (k)
0
Inverse
System
xinv (k)
Pole
Placement
Controller
−0.02
0
0.5
1
1.5
2
2.5
Time
30
z̃xaux (k)
Control u
20
-
K⊤
inv xinv (k)
10
+
0
zxaux (k)
−10
−20
−30
Fig. 2.
Preview Controller with inverse system control scheme
aux
vxaux (k) = c⊤
inv xinv (k) + dinv zx (k)
(8)
(9)
The inverse Cart and Table model is unstable, therefore a
discrete-time pole-placement controller was employed for the
stabilization of the system, replacing zxaux in (8) and (9) by:
z̃xaux (k) = zxaux (k) − K⊤
inv xinv (k)
(10)
where Kinv is the pole-placement gain vector. The auxiliary
ZMP zxaux in (10) can be computed at each discrete time k
using information obtained by the sensors of the humanoid
robot. The proposed control scheme is shown in Fig. 2. Fig. 3
shows how the ZMP is correctly realized with the assistance
of the auxiliary ZMP under disturbance (from 1.1 to 1.5sec)
in the x-axis for three consecutive steps of straight walk.
For the control task we did assume that the state x is
known at every sampling time k, which in practice is not
always true. Therefore, we chose to augment the control
system with a Luenberger observer to accomplish the state
feedback task. We present a sensor fusion approach (Fig. 4)
for the Cart and Table model based on sensors commonly
available on humanoid robots. We assume that the humanoid
robot is equipped with encoders at all joints, which can be
used to measure the real CoM cENC
, and pressure-sensitive
x
FSR at each foot, which can be used to measure the ZMP
zxFSR that coincides with the center of pressure during stable
walk. Then, the measurement ym sent to the observer is:
z
1 0
ym (k) =
1 0
c⊤
m
}|
{
ENC
0
c
(k)
x(k) = xFSR
−hc /g
zx (k)
0.5
1
1.5
2
2.5
Time
the dynamics of the inverse Cart and Table model become:
xinv (k + 1) = Ainv xinv (k) + binv zxaux (k)
0
(11)
and the observer dynamics over the estimated state x̂ are:
x̂(k +1) = Ax̂(k)+bux (k)+Lobs ym (k)−c⊤
m x̂(k) (12)
To find a suitable observer gain Lobs , the discrete-time
⊤
infinite-horizon LQR is adopted with matrices (A−bG⊤
x) ,
cm , Qobs , and Robs . The observer can compensate for errors
in the model and for disturbances encountered while walking,
but unfortunately cannot resolve effectively the sensor noise,
delay, and bias, which we address later on.
Fig. 3.
x-axis response of the Preview Controller with the inverse system
zxref (k)
Predictive
Controller
ux (k)
x̂(k)
Observer
Fig. 4.
Humanoid
Robot
zxFSR (k)
ỹm (k)
cENC
(k)
x
Predictive control loop with sensor fusion
IV. L INEAR M ODEL P REDICTIVE C ONTROL
Linear Model Predictive Control (LMPC) in discrete
time using Laguerre basis functions has been proposed
before [19], but assumes that the reference signal is constant
for the duration of the prediction horizon. In our problem,
this assumption does not hold. We will present our extension
to this approach for the task of walking pattern generation,
after providing a brief introduction to Laguerre functions.
A. Laguerre Functions
Laguerre functions are used in engineering as a way of
generating orthonormal basis functions for various applications, such as system identification. The z-transforms of
N discrete-time Laguerre functions leads to N Laguerre
networks in the frequency domain:
√
1 − α2
Γ1 (z) =
1√− αz −1
1 − α2 z −1 − α
Γ2 (z) =
1 − αz −1 1 − αz −1
..
.
√
−1
N −1
1 − α2
z −α
ΓN (z) =
1 − αz −1 1 − αz −1
where α is the pole of the networks (0 ≤ α < 1 for stability).
Taking the inverse z-transform to return to the time domain
leads to non-compact expressions. An alternative way to find
the Laguerre functions in the time domain is based on a statespace realization of the networks:
Γk (z) =
z −1 − α
Γk−1 (z)
1 − αz −1
(13)
√
2
1−α
with initial state Γ1 (z) = 1−αz
−1 . If li (k) is the inverse
z-transform of Γi (z), the set of discrete-time Laguerre functions can be expressed in vector form as:
⊤
L(k) = l1 (k) l2 (k) · · · lN (k)
(14)
xe (k)
zxref (k)
Model
Predictive
Controller
Due to (13), the set of discrete-time Laguerre functions
satisfies the discrete-time state space:
L(k + 1) = Al (α)L(k)
with initial condition:
p
L(0) = 1 − α2 1 −α
···
(−1)N −1 αN −1
(15)
⊤
To derive the LMPC, a modified state space is used.
By enhancing the state-space model with an embedded
integrator [20], the modified state vector becomes:
⊤
xe (k) = ∆x(k)⊤ zx (k)
(17)
The Cart and Table model with the embedded integrator is:
xe (k + 1) =
}|
A
c⊤ A
⊤
0
1
c⊤
e
{
be
z }| {
b
xe (k) + ⊤ ∆ux (k)
c b
z }| {
zx (k) = 0⊤ 1 xe (k)
N
X
ηxj(k)lj (m) = L(m)⊤ η x (k)
xe (k + m|k) =
+
i=0
Aem−i−1 be L(i)⊤ η x (k)
and the predicted output at sampling time m is given by:
i=0
C. Unconstrained LMPC
In the conventional LMPC approach [20], the performance
index to be minimized is given by:
2
2
minimize Ju = kZref
x (k) − Zx (k)k2 + k∆Ux (k)kR (23)
∆U (k)
x
where R is a diagonal positive definite matrix and
ref
⊤
ref
ref
Zref
x (k) = zx (k + 1) zx (k + 2) · · · zx (k + Np )
⊤
Zx (k) = zx (k + 1|k) zx (k + 2|k) · · · zx (k + Np |k)
⊤
∆Ux (k) = ∆ux (k) ∆ux (k + 1) · · · ∆ux (k + Nc − 1)
(19)
2
2
minimize Ju = kZref
x (k) − Zx (k)k2 + kη x (k)kRL (24)
η (k)
(20)
In this design framework, there is no explicit control horizon parameter Nc , as in the conventional LMPC approach.
Instead, the number of Laguerre functions N along with
parameter α determine the complexity of the incremental
control trajectory. For α = 0 and N = Nc , the conventional
LMPC approach is recovered.
When using (20) in the modified state-space model (18),
the prediction of the future state variable xe (k + m|k) at
sampling time m based on xe (k) is given by:
m−1
X
zx (k)
Replacing each term ∆ux (k + m) as in (20) and using the
orthonormality of the Laguerre functions, the performance
index (23) becomes equivalent to:
j=1
Am
e xe (k)
Cart and
Table
Model
(18)
At discrete time k, the incremental control signal trajectory
∆ux (k), ∆ux (k + 1), . . . , ∆ux (k + m), . . . is regarded as
the impulse response of a stable dynamic system. Thus,
a set of Laguerre functions, l1 (m), l2 (m), . . . , lN (m), are
used to capture this dynamic response with the corresponding Laguerre coefficients ηx1 , ηx2 , . . . , ηxN , which will be
determined by the optimization procedure.
For an arbitrary future time sample m, with k being the
start of the moving horizon window, the incremental control
input will be given by:
∆ux (k + m) =
ux (k)
LMPC control scheme
(16)
B. LMPC using Laguerre Functions
Ae
Fig. 5.
z
z−1
m−1
X ⊤ m−i−1
m
zx (k +m|k) = c⊤
ce Ae
be L(i)⊤η x (k) (22)
e Ae xe (k)+
where Al is a specific lower-triangular N × N matrix. One
important aspect of the Laguerre basis is the orthonormality
property, which plays a vital role in the optimization procedure of the LMPC.
z
∆ux (k)
(21)
x
where RL is a positive definite matrix based on matrix R.
denoting the convolution appearing therein as
Using (22)
Pand
m−1
m−i−1
be L(i)⊤ , Zx (k) takes the form:
φ(m) = i=0 c⊤
e Ae
z
T
}|
c⊤
e Ae
2
c⊤
e Ae
..
.
{
Φ
z }| {
φ(1)
φ(2)
Zx (k) =
xe (k) + .. η x (k)
.
Np
φ(Np )
c⊤
e Ae
(25)
Substituting (25) in (24) gives the standard quadratic form:
⊤
J = η x (k)⊤(Φ⊤ Φ + RL )η x (k)+2(Txe (k)−Zref
x (k)) Φη x (k)+const
Then, the analytic solution to the unconstrained problem is:
−1 ⊤ ref
η x (k) = Φ⊤ Φ + RL
Φ Zx (k) − Txe (k)
(26)
Upon obtaining the optimal coefficients η x (k), the receding
horizon control law [19] is realized as:
∆ux (k) = L(0)⊤ η x (k)
(27)
The proposed control scheme is shown in Fig. 5. The same
procedure can be followed for the y-axis. Fig. 6 shows the
x-axis response for a few omnidirectional steps.
ZMP
Reference ZMP
Target CoM
0.1
0.05
0
0
0.5
1
1.5
2
2.5
Time
5
0
−5
for m = 1, . . . , Np , where I is the 2 × 2 identity matrix, sx
and sy are the length and width of the foot respectively, and
Rz (θi ) ∈ SO(2) is a rotation matrix of angle θi about the
vertical z-axis at the i-th moment of the prediction horizon.
Using (22) we can express the constraints in terms of the
Laguerre coefficients, forming a Quadratic Program (QP)
with linear constraints (dropping the index k for clarity):
−10
−20
Jc = η ⊤ H η + f ⊤ η
minimize
−15
η
Control u
0
0.5
1
1.5
2
2.5
Fig. 6.
Unconstrained LMPC response for omindirectional steps (x-axis)
where
EΦ
−FΦ
Ac =
−EΦ
FΦ
Fig. 7.
DS convex hull approximated by a series of SS polygons
D. Constrained LMPC
A key feature of model predictive control is the ability
to handle hard constraints in the design. In our case we will
introduce constraints on the output of the system, namely the
ZMP realized by the system dynamics will be constrained to
lie in the support polygon defined separately in SS and DS.
When walking omnidirectionally, the x and y axes cannot
be decoupled in the optimization procedure, because of the
coupling imposed by the constraints. Therefore, the complete
problem must be solved once every sampling time. The
performance index to be minimized takes the following form:
Jc = η(k)⊤ H η(k) + f ⊤ (k)η(k)
where, using the same Laguerre basis for both inputs ux , uy ,
⊤
η (k)
Φ Φ + RL
0
η(k) = x
H=
η y (k)
0
Φ⊤ Φ + R L
"
⊤ #
2 Txe (k) − Zref
Φ
x (k)
f (k) =
⊤
2 Tye (k) − Zref
(k)
Φ
y
During the SS phase, the support polygon is defined by the
supporting foot and remains unchanged, whereas, during the
DS phase, it is the convex hull of the feet and cannot be easily
defined. Therefore, at each DS sampling time k, it is approxi-
mated as a moving support foot centered at zxref (k), zyref (k)
with orientation changing from the previous support foot to
the next one using some interpolation method, as shown in
Fig. 7. Therefore, the constraints on the ZMP for both phases
take the form:
sx
ref
1
s
I
⊤ zx (k + m|k) − zx (k + m)
Rz (θi )
≤ y
zy (k + m|k) − zyref (k + m)
s
−I
2 x
sy
FΦ
EΦ
−FΦ
−EΦ
y
ref
Sx − E(Txxe (k) − Zref
x (k)) − F(Txe (k) − Zy (k))
y
ref
Sy + F(Txxe (k) − Zref
x (k)) − E(Txe (k) − Zy (k))
bc =
x
y
ref
Sx + E(Txe (k) − Zx (k)) + F(Txe (k) − Zref
(k))
y
y
ref
Sy − F(Txxe (k) − Zref
(k))
+
E(Tx
(k)
−
Z
(k))
e
y
x
Sx = 12 sx . . .
cos(θ1 ) · · ·
..
..
E = .
.
0
···
⊤
Sy = 21 sy . . . sy
sin(θ1 ) · · ·
0
0
.
..
..
..
F = ..
.
.
.
0
· · · sin(θNp )
cos(θNp )
sx
⊤
We utilize a penalty method [21] for solving the QP, which
we have found in practice to be very efficient and accurate.
The performance index minimized by the penalty method is:
4Np
J0 = Jc +
X
i=1
(28)
(29)
subject to Ac η ≤ bc
Time
t i a⊤
c i η − bc i
(30)
where ti is the penalty coefficient for constraint i, a⊤
ci is the
i-th row of Ac and bci the i-th element of bc . Assuming the
problem is feasible, Algorithm 1 finds a solution efficiently
by iteratively identifying violated constraints and adjusting
the corresponding penalty coefficients. The closed-form solution for ti (Step 4) is due to the quadratic form of J0 .
Upon obtaining the optimal vector η for sampling time k,
we can apply the receding horizon law as follows:
L(0)⊤
0⊤
∆u(k) =
η(k)
(31)
0⊤
L(0)⊤
The constrained LMPC and the system response are shown
in Fig. 8, when a disturbance is encountered in the y-axis.
Algorithm 1 QP Solver using a Penalty Method
1: while (TRUE) do
2:
compute η = −(2H)−1 f
3:
find i such that a⊤
ci η − bci > 0 (violated constraint)
if i exists then ti =
5:
else return η
6:
end if
7: end while
4:
a⊤
ci η−bci
⊤
ac (2H)−1 aci
i
and f = f + ti aci
0.1
ZMP
0.08
Reference ZMP
Target CoM
0.06
Constraints on ZMP
−3
x 10
0.04
15
0.02
0.015
0
10
0.01
−0.02
5
−0.04
0.2
0
0.15
−0.06
−0.08
0.15
0.005
0.1
0.1
0
0.5
1
1.5
2
2.5
Time
Disturbance
−5
−0.1
0.05
−0.05
0
0
0.05
0
−0.1
0
−0.05
−0.05
0.1
0.05
0
0.05
0.15
0.2
X
0.3
−0.15
−0.05
0.1
−0.1
0.25
Y
0.15
0.2
0.25
−0.1
Y
X
80
Control u
60
40
20
0
−20
−40
−60
0
0.5
1
1.5
2
2.5
Time
Fig. 8.
Constrained LMPC response for a constant disturbance (y-axis).
Fig. 9. Feet trajectory using Bezier (top-left), B-spline (top-right), and
rigid body interpolation (bottom-left: 3D view, bottom-right: lateral view).
V. F EET T RAJECTORY G ENERATION
Once the CoM patterns are determined, a proper feet
trajectory must be generated to achieve a smooth gait. Using
the cubic Bezier interpolation method with four control
points, we can generate a smooth trajectory for the swinging
leg in the SS phase. However, in practice, the foot does not
make contact with the ground smoothly. To compensate for
this phenomenon, the cubic B-spline interpolation method
was used. Working with five control points, a smoother foot
motion was achieved, both at the start and at the end of
the SS phase. All such interpolation methods treat the foot
as a single point object and solve each axis separately to
compute a trajectory. We can treat the feet as rigid bodies
and generate minimum jerk trajectories with homogeneous
boundary conditions [22]. In this case, the initial and final
velocity and acceleration of the foot is set to zero. Given the
initial and final foot rotation matrices, Ri , Rf ∈ SO(3), and
the initial and final foot position in three-dimensional space,
di , df ∈ R3 , the reference trajectory at time k is given by:
Rref (k) = Ri ep(k)ω̂0
d
ref
(k) = p(k)(df − di ) + df
(32)
(33)
where p(k) = 6k 5 − 15k 4 + 10k 3 and ω̂0 ∈ so(3) can be
determined from the initial and final rotation and position.
The outcome is an affine transformation Tref (k) ∈ SE(3):
ref
R (k) dref (k)
ref
T (k) =
(34)
0⊤
1
which is subsequently expressed into the robot local coordinate frame and, after taking into account the target CoM
position, is fed into inverse kinematics for computing joint
angles. All interpolation methods are illustrated in Fig. 9.
If a disturbance is sensed during a SS phase, the placement
of the swinging foot can be adaptively corrected, so as to
increase the area of the support polygon in the upcoming
DS phase and, therefore, improve stability of the gait. At
sampling instant k the future ZMP error in the x-axis
(similarly in the y-axis) is estimated from the observer as:
zxe (k + 1) = ẑx (k + 1) − zxref (k + 1)
(35)
Then, an appropriate, online modification to the foot trajectory of the swinging leg, dx (k + 1), can be achieved by:
e
dx (k + 1) = dref
x (k + 1) + Kp zx (k + 1)
(36)
where Kp is a scalar gain determined by experimentation.
VI. I MPLEMENTATION
Both approaches have been fully implemented on the
Aldebaran NAO humanoid robot [23] for the needs of
our team Kouretes [www.kouretes.gr] competing in the
Standard Platform League (SPL) of the RoboCup competition [24]. NAO (v. 3.3) is a 58cm, 5kg humanoid robot.
It is equipped with an x86 AMD Geode processor at 500
MHz and 256 MB SDRAM. It has 21 degrees of freedom;
2 in the head, 4 in each arm, 5 in each leg and 1 in the
pelvis. All joints are position-controlled, using closed-loop
PID controllers and encoders. It also features an Inertial
Measurement Unit (IMU) in the torso and FSR at each foot.
ZMP Estimation using FSR: The FSR measurements
come with significant delay (about 50ms) and are corrupted
by noise. These issues compromise the stability of the
proposed control schemes. To compensate for the delay and
noise and acquire a noise-free reading for the ZMP in each
axis, when at least one foot is in contact with the ground,
a discrete-time Kalman filter with feedback from the Cart
and Table model in the corresponding axis is employed.
Assuming a delay of d sampling instants, the filter dynamics
in the x-axis are:
ẑxFSR k−d k−d− 1 = ẑxFSR k−d− 1 k−d− 1 + ∆ẑx (k−d)
ỹ(k−d) = zxFSR (k−d) − ẑxFSR k−d k−d− 1
ẑxFSR k−d k−d = ẑxFSR k−d k−d− 1 + K(k−d)ỹ(k−d)
where ẑxFSR is the filter’s estimate of the ZMP, zxFSR is the
ZMP reading provided by the FSR, ỹ is the innovation and
K is the optimal Kalman gain, which takes into account the
process and measurement noise. The input of the filter is
the incremental ZMP estimate ∆ẑx (k) = ẑx (k) − ẑx (k − 1)
provided by the observer. This way, observations are integrated immediately into the filter’s estimate at the correct
zxref (k)
Predictive
Controller
0.08
zxFSR (k − d)
NAO
Humanoid
Robot
ux (k)
ZMP
Reference ZMP
Target CoM
0.06
0.04
0.02
ẑ(k)
b
e(k)
x
( z−1
)
z
·z
−d
∆ẑx (k − d)
Kalman
Filter
0
−0.02
−0.04
Augmented
Observer
ỹm (k)
cENC
(k)
x
ẑxFSR (k|k
−0.06
0
5
10
− d)
80
Fig. 10.
15
Time
Implemented control and estimation scheme on the NAO robot
Control u
60
40
20
0
sampling instant. As a result, the filter’s estimate is also
delayed by d sampling instants. Given the fact that the
Kalman filter is an optimal predictor, we use it to obtain
the estimate d sampling instants in the future. In such a way
the estimate ẑxFSR (k|k − d) becomes synchronized and is fed
to the observer.
Elimination of ZMP Bias: Another important issue
encountered is a systematic bias of the ZMP readings, due to
sensor tolerances. Thus, NAO tends to hobble in the direction
of the bias and eventually tips over. Our initial attempt to
estimate the bias with the Kalman filter presented above
had no practical impact, due to the lack of observability.
To overcome this issue, the state space of the Cart and Table
model was augmented with a ZMP bias state, so that the
system observer can estimate it. The augmented state vector
is given by:
⊤
e(k) = cx (kTs ) ċx (kTs ) c̈x (kTs ) zxbias (kTs )
x
(37)
and the augmented Cart and Table model takes the form:
e
A
e
b
z }| {
z}|{
A 0
b
e(k + 1) = ⊤
e(k) +
u (k)
x
x
0 x
0
1
(38)
e
c⊤
z }| {
e(k)
zx (k) = c⊤ 1 x
(39)
Therefore, the augmented observer dynamics are given by:
z
e
c⊤
m
}|
{
1 0
0
0
cENC
(k)
x
e
e
ym (k) =
x(k) = FSR
1 0 −hc /g 1
ẑx (k|k − d)
b
b
b
e x (k) + L
ex
e obs y
e(k + 1) = A
e(k) + bu
em (k) − e
e(k)
x
c⊤
mx
where ẑxFSR (k|k − d) is the filter’s estimate, which serves as
a measurement of ẑx (k|k−d)+zxbias (k|k−d). This approach
proved to be very effective and converged to the bias values
in both axes. The final implemented control and estimation
scheme is shown in Fig. 10.
Actuation Error Correction: Despite the use of our
analytical inverse kinematics for NAO [25], a given target
Tref (k) for a foot is never reached without error on a real
robot. Let Tm (k) be the actual position reached as measured
by the joint encoders. The actuation error can be computed
−1 ref
as E(k) = Tm (k)
T (k − 1). Assuming this error
−20
−40
−60
−80
0
5
10
15
Time
Fig. 11.
y-axis response for straight gait using preview control on NAO
is roughly constant for consecutive sampling instances, we
can adjust the next target to account for it using a damping
method. In particular, the Euler angles and the displacement
are extracted from E(k), they are scaled by a damping
factor 0 ≤ γ ≪ 1, a new transformation matrix Eγ (k) is
reconstructed using the damped values, and the corrected
next target is expressed as T(k) = Tref (k)Eγ (k).
VII. E XPERIMENTAL R ESULTS
The proposed walk approach is demonstrated for a simple
30-step straight gait with an average speed of 14cm/sec
using both preview control (Fig. 11) and constrained LMPC
(Fig. 12) computed on board on a real NAO robot. In the
experimentation we used a sampling period of Ts = 10ms,
a delay of d = 5, a prediction horizon of Np = 100 for both
predictive controllers, a set of N = 15 Laguerre functions,
and a pole of α = 0.8. In both cases, the reference ZMP
is closely tracked by the actual ZMP with the constrained
LMPC yielding more accurate tracking and faster response.
Additionally, there are no signs of ZMP delays, noise, and
bias in contrast to our initial experiments where these issues
were not taken into account. Finally, the most significant
difference between the two approaches can be observed in
the magnitude of the control signal; the constrained LMPC
results in a smoother CoM trajectory and, thus, smoother and
much faster walk (measured up to 21cm/sec on NAO).
The ZMP bias estimated by the augmented observer over
time is shown in Fig. 13. The bias changes smoothly during
the cycle (..., left SS, DS, right SS, DS, ...) of support phases.
Note that for this particular robot there is no symmetry
between left and right leg ZMP biases.
The x-axis response of constrained LMPC when a significant constant disturbance is applied to a standing NAO robot
is shown in Fig. 14. The attempt to counter the disturbance
makes the actual ZMP “hit” the constraints, nevertheless
without violating it; soon thereafter, both the actual ZMP
and the CoM are driven to equilibrium.
0.06
ZMP
Reference ZMP
Target CoM
0.04
R EFERENCES
0.02
0
−0.02
−0.04
−0.06
0
5
10
15
Time
40
Control u
30
20
10
0
−10
−20
−30
−40
0
5
10
15
Time
Fig. 12. y-axis response for straight gait using constrained LMPC on NAO
−3
x 10
4
ZMP bias
3
2
1
0
−1
−2
−3
−4
−5
−6
0
5
10
15
Time
Fig. 13.
Estimated ZMP bias in the y-axis for straight gait
0.04
0.03
ZMP
0.02
Reference ZMP
Target CoM
0.01
Constraints on ZMP
0
−0.01
−0.02
−0.03
−0.04
0
1
2
3
4
5
6
7
Time
Constraints activated
80
Control u
60
40
20
0
−20
−40
−60
−80
0
1
2
3
4
5
6
7
Time
Fig. 14.Constrained LMPC x-axis response when disturbing a standing NAO
VIII. C ONCLUSION
In this article we presented a complete formulation for
humanoid robot walk using two novel control approaches;
preview control augmented with the inverse system and (un-)
constrained LMPC approximated by a Laguerre basis. Additionally, we proposed a new approach based on rigid body
interpolation for generating smooth feet trajectories. Finally,
we addressed all implementation issues and demonstrated
the effectiveness of the proposed approaches on a real NAO
robot. In the future, we plan to make use of the IMU
measurements in the auxiliary ZMP computation and also
extend the LMPC framework to make online modifications
to the CoM height, aiming at even smoother gaits.
[1] M. Vukobratovic and B. Borovac, “Zero-moment point - thirty five
years of its life.” I. J. Humanoid Robotics, vol. 1, 2004.
[2] C. Graf and T. Röfer, “A closed-loop 3D-LIPM gait for the RoboCup
Standard Platform League humanoid,” in Proceedings of the 5th
Workshop on Humanoid Soccer Robots, 2010.
[3] J. Alcaraz-Jiménez, D. Herrero-Pérez, and H. Martı́nez-Barberá, “Robust feedback control of ZMP-based gait for the humanoid robot Nao,”
Int. J. Rob. Res., vol. 32, no. 9-10, pp. 1074–1088, Aug. 2013.
[4] Q. Huang, K. Yokoi, S. Kajita, K. Kaneko, H. Arai, N. Koyachi,
and K. Tanie, “Planning walking patterns for a biped robot,” IEEE
Transactions on Robotics and Automation, vol. 17, no. 3, pp. 280–
289, Jun 2001.
[5] K. Hirai, M. Hirose, Y. Haikawa, and T. Takenaka, “The development
of honda humanoid robot,” in Robotics and Automation, 1998. Proceedings. 1998 IEEE International Conference on, vol. 2, May 1998,
pp. 1321–1326 vol.2.
[6] S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Yokoi, and
H. Hirukawa, “A realtime pattern generator for biped walking,” in
Robotics and Automation, 2002. Proceedings. ICRA ’02. IEEE International Conference on, vol. 1, 2002, pp. 31–37 vol.1.
[7] S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi,
and H. Hirukawa, “Biped walking pattern generation by using preview
control of zero-moment point,” in Robotics and Automation, 2003.
Proceedings. ICRA ’03. IEEE International Conference on, vol. 2,
2003, pp. 1620–1626 vol.2.
[8] J. Strom, G. Slavov, and E. Chown, “Omnidirectional walking using
ZMP and preview control for the NAO humanoid robot,” in RoboCup
2009: Robot Soccer World Cup XIII, 2010, vol. 5949, pp. 378–389.
[9] M. Arbulu, “Stable locomotion of humanoid robots based on mass
concentrated model,” Ph.D. Thesis, Universidad Carlos III de Madrid,
2008.
[10] O. Urbann and S. Tasse, “Observer based biped walking control, a
sensor fusion approach,” Autonomous Robots, vol. 35, no. 1, pp. 37–
49, 2013.
[11] D. Dimitrov, P. B. Wieber, O. Stasse, H. Ferreau, and H. Diedam, “An
optimized linear model predictive control solver for online walking
motion generation,” in Robotics and Automation, 2009. ICRA ’09.
IEEE International Conference on, 2009, pp. 1171–1176.
[12] A. Herdt, H. Diedam, P.-B. Wieber, D. Dimitrov, K. Mombaur, and
M. Diehl, “Online Walking Motion Generation with Automatic Foot
Step Placement,” Advanced Robotics, vol. 24, no. 5-6, 2010.
[13] P. B. Wieber, “Trajectory free linear model predictive control for stable
walking in the presence of strong perturbations,” in Humanoid Robots,
2006 6th IEEE-RAS International Conference on, 2006, pp. 137–142.
[14] W. T. Chew, A. Astolfi, and X. Ming, “Robust control of bipedal
humanoid (tpinokio),” Procedia Engineering, vol. 41, 2012.
[15] D. Dimitrov, A. Sherikov, and P.-B. Wieber, “A sparse model predictive control formulation for walking motion generation,” in IROS
2011 - IEEE/RSJ International Conference on Intelligent Robots and
Systems. IEEE, Sept. 2011, pp. 2292–2299.
[16] T. Sheridan, “Three models of preview control,” Human Factors in
Electronics, IEEE Transactions on, vol. HFE-7, 1966.
[17] T. Katayama, T. Ohki, T. Inoue, and T. Kato, “Design of an optimal
controller for a discrete-time system subject to previewable demand,”
International Journal of Control, vol. 41, no. 3, pp. 677–699, 1985.
[18] M. Sain and J. Massey, “Invertibility of linear time-invariant dynamical
systems,” Automatic Control, IEEE Transactions on, vol. 14, 1969.
[19] L. Wang, Model Predictive Control System Design and Implementation
Using MATLAB, 1st ed. Springer Publishing Company, 2009.
[20] J. Maciejowski, Predictive Control: With Constraints. Prentice Hall,
2002.
[21] D. Luenberger and Y. Ye, Linear and Nonlinear Programming.
Springer, 2008.
[22] M. Zefran and V. Kumar, “Planning of smooth motions on SE(3),”
in Robotics and Automation, 1996. Proceedings., 1996 IEEE International Conference on, vol. 1, Apr 1996, pp. 121–126 vol.1.
[23] Aldebaran Robotics, “Nao documentation,” 2012, only available online: www.aldebaran-robotics.com/documentation.
[24] H. Kitano, M. Asada, Y. Kuniyoshi, I. Noda, E. Osawa, and H. Matsubara, “Robocup: A challenge problem for AI,” AI Magazine, vol. 18,
no. 1, pp. 73–85, 1997.
[25] N. Kofinas, E. Orfanoudakis, and M. G. Lagoudakis, “Complete
analytical forward and inverse kinematics for the nao humanoid robot,”
Journal of Intelligent and Robotic Systems, pp. 1–14, 2014.