Outlier-Robust State Estimation for Humanoid Robots*
Stylianos Piperakis1, Dimitrios Kanoulas2, Nikos G. Tsagarakis2, and Panos Trahanias1
Abstract— Contemporary humanoids are equipped with visual and LiDAR sensors that are effectively utilized for Visual
Odometry (VO) and LiDAR Odometry (LO). Unfortunately,
such measurements commonly suffer from outliers in a dynamic
environment, since frequently it is assumed that only the robot
is in motion and the world is static. To this end, robust state
estimation schemes are mandatory in order for humanoids
to symbiotically co-exist with humans in their daily dynamic
environments. In this article, the robust Gaussian Error-State
Kalman Filter for humanoid robot locomotion is presented. The
introduced method automatically detects and rejects outliers
without relying on any prior knowledge of the measurement distributions or on finely tuned thresholds. Subsequently, the proposed
method is quantitatively and qualitatively assessed in realistic
conditions with the full-size humanoid robot WALK-MAN v2.0
and the mini-size humanoid robot NAO to demonstrate its
accuracy and robustness when outlier VO/LO measurements
are present. Finally, in order to reinforce further research
endeavours, our implementation is released as an open-source
ROS/C++ package.
Fig. 1. Illustration of frames used in base estimation on the 29DoF WALK-MAN v2.0 humanoid robot: w corresponds to the inertial world frame, b is the base frame, and blue ellipses indicate the orientation uncertainty.
I. INTRODUCTION
Humanoid robot walking constitutes a challenging task and remains an open research problem to date. Contemporary approaches in motion planning [1], [2], [3] and gait control [4], [5], [6] implicitly or explicitly assume that the humanoid's base information is available in advance. To this end, accurate and robust base state estimation plays a vital role in dynamic humanoid robot locomotion [7]. In this direction, Kuindersma et al. [8] presented a base estimator relying on the Newton-Euler dynamics of a floating mass, which effectively combined Inertial Measurement Unit (IMU), kinematic, and LiDAR measurements to estimate the base position, orientation, and velocity. The latter utilized the Error-State Kalman Filter (ESKF), a variant of the Extended Kalman Filter (EKF), carefully crafted to handle the overparametrization of the base orientation. Subsequently, this
framework was augmented in [9], [10] to directly account for
the visually obtained landscape. In such a way, an ATLAS
robot was able to continuously climb up and down staircases.
Furthermore, in [11], the same scheme was evaluated with
the WALK-MAN robot in a disaster response environment.
Newton-Euler floating mass estimators have proved effective also in the case of quadruped robots [12].
*This work has been partially supported by the following funded projects: the Greek national grant MIS 5030440 vipGPU and the EU project CogIMon (no. 644727).
1 Stylianos Piperakis and Panos Trahanias are with the Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH),
Heraklion, Greece. {spiperakis,trahania}@ics.forth.gr
2 Dimitrios Kanoulas and Nikos G. Tsagarakis are with the Humanoids and Human-Centered Mechatronics Department, Istituto Italiano di Tecnologia (IIT), Via Morego 30, 16163, Genova, Italy.
{Dimitrios.Kanoulas, Nikos.Tsagarakis}@iit.it.
Bloesch et al. [13], [14] presented a similar approach where the
IMU and the kinematically computed feet point contacts
were utilized for accurate base estimation with the starlETH
quadruped. Rotella et al. [15] appropriately modified the
previous scheme for humanoid walking while also considering the feet orientation in the process. Interestingly, it was
demonstrated that the base estimation accuracy increases.
Additionally, state-of-the-art 3D-Center of Mass (CoM)
estimators [16], [17], [18], derived from consistent dynamics,
readily assume that measurements are transformed to an inertial frame of reference before being fused. The latter is also
common practice in 2D-CoM estimation based on simplified
dynamical models [19], [20]. Nevertheless, this dictates that
accurate estimates of the base position and orientation are
available, otherwise base drift will inevitably propagate to
the CoM estimation [21].
A. Problem Statement
Modern humanoids are commonly equipped with cameras
and LiDAR sensors to reinforce their perception in unstructured environments. Based on consecutive camera frames
one can derive the camera’s egomotion with respect to the
environment and directly relate it to the robot’s motion. In
the literature, this is known as Visual Odometry (VO). Prominent
approaches rely on sparse [22] or semi-dense [23] schemes to
facilitate real-time execution. Similarly, based on sequential
LiDAR scans it is straightforward to match the beams
and compute the LiDAR Odometry (LO) [24], [25]. Both
approaches are advantageous in the sense that they are unaffected by slippage in uneven/rough terrain when contrasted
to the kinematically computed leg odometry. However, in
all aforementioned schemes, the world is assumed to be
mostly static and only the robot is in motion, i.e. the static-world assumption. Presumably, this is not the case in human
daily environments, due to humans moving along with the
robots and/or changing the scene. Hence, the static world
assumption is frequently violated. To this end, in order
for humanoids to co-exist with humans in a dynamically
changing environment it is mandatory to robustify their
odometry estimates. Interestingly, in [14] a base estimator
with outlier detection for quadruped locomotion was presented. The authors of [14] utilized a probabilistic threshold
to quantify whether a measurement is an outlier or not before
fusion. Nevertheless, this raises two important questions: a)
how can this threshold be determined in advance and b)
does this threshold depend on the conditions at hand? Other
works [26], not in the scope of base estimation, assumed that
the measurements follow a Student-t distribution. Again, the obvious question arises as to whether this is a valid assumption in
the case of VO/LO measurements.
B. Contribution
In this work, we propose a novel formulation of the
Error-State Kalman Filter (ESKF) which is robust to outlier
VO/LO measurements that can commonly occur in humanoid
walking in dynamic human environments. The contribution
to the state-of-the-art is as follows:
• The Robust Gaussian ESKF (RGESKF) is mathematically established based on [8], [27]. More specifically, we present an analytical solution for the general
nonlinear Gaussian formulation for outlier detection
of [27]. The latter results in a computationally efficient
implementation that accomplishes real-time execution.
• The above method relies neither on prior assumptions regarding the measurements' probability distributions [26] nor on thresholding [14] for the ensuing outlier detection.
• We quantitatively and qualitatively assess the proposed
method and demonstrate its accuracy and robustness
in real-world conditions with two robots, the full-size
humanoid WALK-MAN v2.0 [28], and a mini-size NAO
humanoid.
• Since this framework relies on sensing that is commonly
available on contemporary humanoids and furthermore,
is based on generic nonlinear dynamics, we release an
open-source ROS/C++ implementation [29] to reinforce
further research endeavours.
This article is structured as follows: in Sec. II the proposed
Robust Gaussian ESKF is introduced and mathematically
established. Subsequently, the proposed scheme is quantitatively and qualitatively assessed in Sec. III and Sec. IV.
Finally, Sec. V concludes the article and discusses possible
future directions.
II. BASE ESTIMATION
Kuindersma et al. [8] presented a base estimator with Newton-Euler dynamics of a floating mass that is effectively used in humanoid walking. At time $t$, the state to be estimated is:

$$x_t = \begin{bmatrix} {}^b v_b & {}^w R_b & {}^w p_b & b_\omega & b_\alpha \end{bmatrix}^\top$$

where ${}^w p_b$, ${}^w R_b$ denote the base position and rotation with respect to the world frame $w$, ${}^b v_b$ is the linear velocity, and $b_\omega$, $b_\alpha$ are the gyro and accelerometer biases, in the base frame $b$. However, ${}^w R_b$ is an overparametrization of the base's orientation. To this end, to track the orientation uncertainty, we consider perturbation rotations in the base frame. Thus, if the true base rotation matrix is ${}^w R_b$, then ${}^w R_b = {}^w \hat{R}_b\, e^{\chi_{[\times]}}$, where ${}^w \hat{R}_b$ is the estimated rotation matrix and $\chi$ denotes the perturbation exponential coordinates. For clarity, all aforementioned quantities are depicted in Fig. 1.
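To make the perturbation notation concrete, the following is a minimal numpy sketch of the $[\times]$ wedge operation and the exponential map used above; the helper names are our own, and the released ROS/C++ package [29] remains the reference implementation:

```python
import numpy as np

def skew(v):
    # [x] wedge operation: maps a 3-vector to its skew-symmetric matrix
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def exp_so3(chi):
    # Rodrigues' formula: exp(chi_[x]) in SO(3)
    theta = np.linalg.norm(chi)
    if theta < 1e-12:
        return np.eye(3) + skew(chi)
    K = skew(chi / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

# base-frame perturbation: R = R_hat * exp(chi_[x])
R_hat = np.eye(3)
chi = np.array([0.01, -0.02, 0.005])
R = R_hat @ exp_so3(chi)
```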
A. Process Model

In order to properly define the nonlinear dynamics $f(x_t, u_t, w_t)$, let ${}^b\bar{\omega}_b = {}^b\omega_b^{imu} - b_\omega$ and ${}^b\bar{\alpha}_b = {}^b\alpha_b^{imu} - b_\alpha$ be the IMU bias-compensated gyro rate and linear acceleration, respectively; then:

$$\dot{x}_t = f(x_t, u_t, w_t) = \begin{bmatrix} -({}^b\bar{\omega}_b - w_\omega)\times{}^b v_b + {}^w R_b^\top g + {}^b\bar{\alpha}_b - w_\alpha \\ {}^w R_b\,({}^b\bar{\omega}_b - w_\omega)_{[\times]} \\ {}^w R_b\,{}^b v_b \\ w_{b_\omega} \\ w_{b_\alpha} \end{bmatrix} \quad (1)$$

where $u_t = \begin{bmatrix} {}^b\omega_b^{imu} & {}^b\alpha_b^{imu} \end{bmatrix}^\top$ is the input vector, $g$ is the gravity vector, $w_t = \begin{bmatrix} w_\omega & w_\alpha & w_{b_\omega} & w_{b_\alpha} \end{bmatrix}^\top \sim \mathcal{N}(0, Q_t)$ is the input/process noise, which follows a zero-mean normal distribution with covariance $Q_t$, and $[\times]$ is the wedge operation.
Subsequently, denoting the error-state vector as:

$$\delta x_t = \begin{bmatrix} {}^b\delta v_b & \chi & {}^w\delta p_b & \delta b_\omega & \delta b_\alpha \end{bmatrix}^\top \quad (2)$$

the error-state dynamics assume the following linear form:

$$\delta\dot{x}_t = F_t\,\delta x_t + L_t\,w_t \quad (3)$$

with

$$F_t = \begin{bmatrix} -{}^b\bar{\omega}_{b[\times]} & ({}^w R_b^\top g)_{[\times]} & 0 & -{}^b v_{b[\times]} & -I \\ 0 & -{}^b\bar{\omega}_{b[\times]} & 0 & -I & 0 \\ {}^w R_b & -{}^w R_b\,{}^b v_{b[\times]} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad (4)$$

$$L_t = \begin{bmatrix} -{}^b v_{b[\times]} & -I & 0 & 0 \\ -I & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & I & 0 \\ 0 & 0 & 0 & I \end{bmatrix} \quad (5)$$
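For concreteness, a minimal numpy sketch assembling (4) and (5) as block matrices; the function name and the gravity convention $g = [0, 0, -9.81]^\top$ are our own assumptions:

```python
import numpy as np

def skew(v):
    # [x] operation, as in the previous sketch
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def error_state_jacobians(R_wb, v_b, omega_bar, g=np.array([0.0, 0.0, -9.81])):
    """Assemble F_t of (4) and L_t of (5) for the error state (2).

    Block order follows (2): [dv_b, chi, dp_b, db_omega, db_alpha].
    """
    Z, I = np.zeros((3, 3)), np.eye(3)
    F = np.block([
        [-skew(omega_bar), skew(R_wb.T @ g), Z, -skew(v_b), -I],
        [Z, -skew(omega_bar), Z, -I, Z],
        [R_wb, -R_wb @ skew(v_b), Z, Z, Z],
        [Z, Z, Z, Z, Z],
        [Z, Z, Z, Z, Z]])
    L = np.block([
        [-skew(v_b), -I, Z, Z],
        [-I, Z, Z, Z],
        [Z, Z, Z, Z],
        [Z, Z, I, Z],
        [Z, Z, Z, I]])
    return F, L
```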
To this end, the ESKF predict step is readily realized as:

$$\hat{x}_k^- = f_k^d(\hat{x}_{k-1}^+, u_k, 0) \quad (6)$$

$$P_k^- = F_k^d P_{k-1}^+ F_k^{d\top} + L_k^d Q_k^d L_k^{d\top} \quad (7)$$

where the superscript $d$ indicates the discretized variables at the discrete time $k$, which are obtained by means of the Euler method for simplicity. Moreover, $\hat{x}_k^-$, $P_k^-$ denote the ESKF mean estimate and error covariance, respectively, prior to the update, while $\hat{x}_{k-1}^+$, $P_{k-1}^+$ are the same quantities after the update at discrete time $k-1$.
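As a concrete illustration of (6), (7), a minimal numpy sketch of the Euler-discretized predict step follows; the vector-space mean propagation is a simplification (in practice the rotation block is composed via the exponential map), and all names are our own, not those of the released package [29]:

```python
import numpy as np

def eskf_predict(x_hat, P, u, f, F, L, Q, dt):
    """Euler-discretized ESKF predict step, cf. (6), (7).

    f    : continuous-time dynamics f(x, u, w) of (1)
    F, L : error-state Jacobians (4), (5) evaluated at (x_hat, u)
    Q    : continuous-time input/process noise covariance Q_t
    """
    # (6): propagate the mean with zero noise; vector-space shortcut --
    # the rotation part of the state is composed with exp_so3 in practice
    x_pred = x_hat + f(x_hat, u, 0) * dt
    # first-order discretization of the error-state dynamics (3)
    Fd = np.eye(P.shape[0]) + F * dt
    # (7): P_k^- = F^d P_{k-1}^+ F^d^T + L^d Q^d L^d^T, with L^d Q^d L^d^T ~ L (Q dt) L^T
    P_pred = Fd @ P @ Fd.T + L @ (Q * dt) @ L.T
    return x_pred, P_pred
```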
B. Measurement Model
The output model of [8] was formulated with the base
velocity using the robot’s kinematics and the base position,
orientation obtained with a Gaussian particle filter on LiDAR
measurements, all expressed in the world frame. In this
work, besides the kinematically computed base velocity, we
consider external measurements of the base position and
orientation from either LiDAR Odometry (LO) or Visual
Odometry (VO). Nevertheless, such measurements can potentially suffer from outliers in human daily environments, due to the static-world assumption, as presented in Sec. I-A. Thus, we distinguish the latter with the superscript $o$ for possible outliers as:

$$y_k^o = h^o(x_k) \boxplus n_k = \begin{bmatrix} {}^w p_b + n_{p_b} \\ {}^w R_b\, e^{n_{b[\times]}} \end{bmatrix} \quad (8)$$

where $n_k = \begin{bmatrix} n_{p_b} & n_b \end{bmatrix}^\top$ denotes the external position and orientation measurement noise, which follows a zero-mean normal distribution with covariance $R_k^o$. The operator $\boxplus$ denotes the proper summation, which applies to rotation matrices as composition of rotations.
On the contrary, we normally consider the kinematically computed base velocity as:

$$y_k^n = h^n(x_k) + n_{v_b} = {}^w R_b\,{}^b v_b + n_{v_b} \quad (9)$$

with $n_{v_b} \sim \mathcal{N}(0, R_k^n)$ being the zero-mean normal kinematic velocity noise with covariance $R_k^n$. The above measurements do not accumulate leg odometry drift during the gait and are commonly not contaminated with outliers when accurate contact states are estimated [30], [31]. Thus, we distinguish them with the superscript $n$ for nominal measurements that will not be examined for outliers.
To derive the linearization of Eq. (8), we consider the error exponential coordinates related to the external rotation [21]; then:

$$\delta y_k^o = H_k^o\,\delta x_k + n_k \quad (10)$$

with

$$H_k^o = \begin{bmatrix} 0 & 0 & I & 0 & 0 \\ 0 & I & 0 & 0 & 0 \end{bmatrix} \quad (11)$$

On the other hand, the linearization of (9) is straightforward to compute:

$$\delta y_k^n = H_k^n\,\delta x_k + n_{v_b} \quad (12)$$

where

$$H_k^n = \begin{bmatrix} {}^w R_b & -{}^w R_b\,{}^b v_{b[\times]} & 0 & 0 & 0 \end{bmatrix} \quad (13)$$
Subsequently, the ESKF update step is formulated as:

$$\delta x_k = K_k^*\,(y_k^* \boxminus h^*(\hat{x}_k^-)) \quad (14)$$

$$\hat{x}_k^+ = \hat{x}_k^- \boxplus \delta x_k \quad (15)$$

$$P_k^+ = P_k^- - K_k^*(H_k^* P_k^- H_k^{*\top} + R_k^*)K_k^{*\top} \quad (16)$$

$$K_k^* = P_k^- H_k^{*\top}(H_k^* P_k^- H_k^{*\top} + R_k^*)^{-1} \quad (17)$$

where the superscript $*$ can be either $o$ or $n$, depending on the set of measurements considered.
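Likewise, a minimal numpy sketch of the update (14)-(17), with the manifold operators $\boxplus$, $\boxminus$ passed in as callables; this interface is our own assumption:

```python
import numpy as np

def eskf_update(x_hat, P, y, h, H, R, boxplus, boxminus):
    """Generic ESKF update, cf. (14)-(17).

    boxplus/boxminus implement the manifold +/- of (14), (15); for the
    vector-valued velocity measurement (9) they reduce to ordinary +/-.
    """
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # (17): Kalman gain
    dx = K @ boxminus(y, h(x_hat))       # (14): error-state correction
    x_new = boxplus(x_hat, dx)           # (15): inject the correction
    P_new = P - K @ S @ K.T              # (16): covariance update
    return x_new, P_new
```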
C. Outlier Detection
In this section, the main result of this work is presented.
The outlier detection framework presented in [27] is integrated with the ESKF to introduce a base estimator robust
to outliers.
In order to detect outlier measurements, Wang et al. [27] utilized a beta-Bernoulli distribution to probabilistically quantify whether a measurement is an outlier or not. Beta-Bernoulli distributions have proved effective in various outlier-resilient algorithms in the past [32], [33]. To this end, in [27] a binary indicator variable $z_k$ was introduced. Accordingly, when $z_k$ is one, $y_k^o$ is a nominal measurement, while when $z_k$ is zero, $y_k^o$ is an outlier. The latter can be formulated as:
$$p(y_k^o\,|\,x_k, z_k) = \mathcal{N}(h^o(x_k), R_k^o)^{z_k} \quad (18)$$
Evidently, when zk = 0, Eq. (18) becomes a constant
and cannot contribute to the state estimation, since the
distribution is measurement independent.
Subsequently, in order to properly infer the indicator
variable, a beta-Bernoulli hierarchical prior [34] is enforced.
In such a way, zk is a Bernoulli variable influenced by πk :
$$p(z_k|\pi_k) = \pi_k^{z_k}(1 - \pi_k)^{(1 - z_k)} \quad (19)$$
where πk follows a beta distribution:
$$p(\pi_k) = \frac{\pi_k^{e_0 - 1}(1 - \pi_k)^{f_0 - 1}}{B(e_0, f_0)} \quad (20)$$
with B denoting the beta function, parametrized by e0 and
f0 .
Since $z_k$ is modeled as a beta-Bernoulli variable, when the mean $\langle z_k \rangle$ is close to zero, e.g. $10^{-5}$, we treat the measurement as an outlier and ignore it; thus:

$$\hat{x}_k^+ = \hat{x}_k^- \quad (21)$$

$$P_k^+ = P_k^- \quad (22)$$

Otherwise, we weight the measurement noise $R_k^o$ as:

$$R_k^o = R_k^o / \langle z_k \rangle \quad (23)$$

and perform the regular update as in (14)-(17).
The expectation of $z_k$ is computed in each iteration as follows:

$$\langle z_k \rangle = \frac{p(z_k = 1)}{p(z_k = 1) + p(z_k = 0)} \quad (24)$$

with

$$p(z_k = 1) = c\, e^{\Psi(e_k) - \Psi(e_k + f_k) - \frac{1}{2}\mathrm{tr}(B_k R_k^{o-1})} \quad (25)$$

$$p(z_k = 0) = c\, e^{\Psi(f_k) - \Psi(e_k + f_k)} \quad (26)$$

where $c$ is the normalization constant that guarantees that (25), (26) are proper probabilities, $\Psi$ denotes the digamma function [34], and $B_k$ is given by:

$$B_k = \int (y_k^o - h^o(x_k))(y_k^o - h^o(x_k))^\top p(x_k)\, dx_k \quad (27)$$

The integral in (27) is not straightforward to compute in the general nonlinear Gaussian case. In [27], the cubature rules [35] are used to obtain an approximate solution. In the context of the EKF, (27) can be derived analytically as:

$$B_k = y_k^o y_k^{o\top} - 2 y_k^o h^o(\hat{x}_k^+)^\top + h^o(\hat{x}_k^+) h^o(\hat{x}_k^+)^\top + H_k^o P_k^+ H_k^{o\top} \quad (28)$$

The proof is given for completeness in the Appendix. Finally, $e_k$, $f_k$ are updated in each iteration as:

$$e_k = e_0 + \langle z_k \rangle \quad (29)$$

$$f_k = f_0 + 1 - \langle z_k \rangle \quad (30)$$
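To illustrate how (21)-(30) interlock with the update (14)-(17), a minimal numpy/scipy sketch of one robust measurement update follows; the linear-measurement and vector-state simplifications, the use of the nominal $R_k^o$ in (25), and all names are our own, and the released ROS/C++ package [29] remains the reference implementation:

```python
import numpy as np
from scipy.special import digamma, expit

def robust_update(x_hat, P, y, H, R0, e0=0.9, f0=0.1,
                  z_tol=1e-5, x_tol=1e-3, max_iter=10):
    """One RGESKF position/orientation update with outlier detection.

    Sketch assumptions: a linearized measurement y = H x + n with nominal
    covariance R0, and a vector-space state (the rotation boxplus/boxminus
    of (14), (15) are omitted for brevity).
    """
    e, f, z = e0, f0, 1.0
    x_prev, x_new, P_new = x_hat, x_hat, P
    for _ in range(max_iter):
        if z < z_tol:
            return x_hat, P                 # (21), (22): outlier -- skip update
        R = R0 / z                          # (23): inflate the measurement noise
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)      # (17)
        dx = K @ (y - H @ x_hat)            # (14)
        x_new = x_hat + dx                  # (15)
        P_new = P - K @ S @ K.T             # (16)
        r = y - H @ x_new
        B = np.outer(r, r) + H @ P_new @ H.T          # (28): analytic B_k
        lp1 = digamma(e) - digamma(e + f) \
              - 0.5 * np.trace(B @ np.linalg.inv(R0)) # log of (25), sans c
        lp0 = digamma(f) - digamma(e + f)             # log of (26), sans c
        z = expit(lp1 - lp0)                          # (24): c cancels out
        e, f = e0 + z, f0 + 1.0 - z                   # (29), (30)
        if np.linalg.norm(x_new - x_prev) < x_tol:    # convergence, Alg. 1 line 15
            break
        x_prev = x_new
    return x_new, P_new
```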
The proposed robust Gaussian ESKF (RGESKF) is summarized in Algorithm 1. We note that no further knowledge of the measurement distribution [26], other than the covariance $R_k^o$, is needed, nor are empirically obtained thresholds as in [14] required to perform outlier detection. The only tunable parameters are the beta-Bernoulli prior parameters $e_0$ and $f_0$. Experimentally, $e_0$ and $f_0$ have been set to 0.9 and 0.1, respectively. These values have been used in all conducted experiments, including the real tests with the two robots (cf. Sec. III below), and in our open-source implementation [29]. As also stated in [27], we observed that the outlier detection process is insensitive to these parameters as long as $e_0/(e_0 + f_0)$ is close to 1. Presumably, this is the case since it is more probable to observe a nominal measurement rather than an outlier.
III. RESULTS

In this section, we outline representative results that demonstrate the accuracy and efficiency of the proposed scheme under real-world conditions. Two actual humanoids were employed in our experiments: the full-size 29DoF WALK-MAN v2.0 humanoid [28] and a mini-size NAO robot. The WALK-MAN v2.0 robot uses the walking module introduced in [6], [36], with a step time of 0.8 s, using the XBotCore [37] and OpenSoT [38] control infrastructure in a 500 Hz control loop. The walking module utilized with the NAO robot is based on [39], with a step time of 0.4 s, and achieves a control loop of 100 Hz. The IMU noise standard deviations used in our experiments are shown in Table I. For the WALK-MAN v2.0 VectorNav VN-100 IMU, we employed the noise densities given by the manufacturer at 200 Hz, while for the IMU utilized in the NAO experiments, an Allan variance analysis [40] was performed with 13 hours of stationary data at 100 Hz to properly derive the values.
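For reference, a minimal numpy sketch of the overlapping Allan deviation computation that underlies such an analysis [40]; the function and its interface are our own, and the read-off of the noise densities from the resulting log-log curve is not shown:

```python
import numpy as np

def allan_deviation(rate, fs, taus):
    """Overlapping Allan deviation of a stationary rate signal.

    rate : gyro (rad/s) or accelerometer (m/s^2) samples
    fs   : sampling frequency in Hz
    taus : averaging times (s) at which to evaluate; 2*tau*fs must fit
    """
    theta = np.cumsum(rate) / fs            # integrated signal
    out = []
    for tau in taus:
        m = int(tau * fs)                   # samples per cluster
        d = theta[2 * m:] - 2 * theta[m:-m] + theta[:-2 * m]
        out.append(np.sqrt(np.mean(d ** 2) / (2 * tau ** 2)))
    return np.array(out)
```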
Algorithm 1: Robust Gaussian ESKF
Data: $y_{1:T}^o$, $y_{1:T}^n$, $\hat{x}_0$, $P_0$, $Q_{1:T}$, $R_{1:T}^o$, $R_{1:T}^n$
Result: $\hat{x}_k^+$, $P_k^+$ for k = 1 : T
1  for k = 1, ..., T do
2      Compute $\hat{x}_k^-$ and $P_k^-$ via (6), (7);
3      Initialize i = 0, $e_0$ = 0.9, $f_0$ = 0.1, and $^{(0)}z_k = 1$;
4      repeat
5          Update $R_k^o$ with (23);
6          i = i + 1;
7          if $^{(i-1)}z_k \geq 10^{-5}$ then
8              Update $^{(i)}\hat{x}_k$ and $^{(i)}P_k^+$ via (14)-(17) with position and orientation;
9              Update $^{(i)}z_k$ via (24);
10             Update $^{(i)}e_k$ and $^{(i)}f_k$ via (29), (30);
11         else
12             Update $^{(i)}\hat{x}_k$ and $^{(i)}P_k^+$ via (21), (22);
13             break;
14         end
15     until $\|^{(i)}\hat{x}_k \boxminus {}^{(i-1)}\hat{x}_k\| < 10^{-3}$;
16     $\hat{x}_k^+ = {}^{(i)}\hat{x}_k$ and $P_k^+ = {}^{(i)}P_k^+$;
17     Update $\hat{x}_k^+$ and $P_k^+$ via (14)-(17) with velocity;
18 end
TABLE I
IMU NOISE STDS

            w_ω (rad/s)   w_α (m/s²)   w_bω (rad/s²)   w_bα (m/s³)
WALK-MAN    9.77e-4       2.21e-2      1.53e-5         2.43e-4
NAO         5.63e-3       1.58e-2      9.66e-4         4.33e-3
Additionally, joint angle measurements are available at
200Hz for the WALK-MAN v2.0 robot and at 100Hz for
NAO.
To facilitate a quantitative assessment, we compare the proposed RGESKF to the ESKF without outlier detection, and to an ESKF where the outlier detection method in [14] is employed. The latter is termed ESKF-TH, since in each experiment we had to finely tune in advance a probabilistic threshold TH to achieve accurate detection according to the Mahalanobis distance $d_M$:

$$d_M = (y_k^o - h^o(\hat{x}_k^-))^\top S_k^{-1} (y_k^o - h^o(\hat{x}_k^-)) \quad (31)$$

with

$$S_k = H_k^o P_k^+ H_k^{o\top} + R_k^o \quad (32)$$
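For comparison, a minimal numpy sketch of the Mahalanobis gating (31), (32) used by the ESKF-TH baseline; the names are our own and do not come from [14] or [29]:

```python
import numpy as np

def is_outlier_th(y, h_x, H, P, R, TH):
    """Mahalanobis gating, cf. (31), (32): reject if d_M exceeds TH."""
    r = y - h_x                      # innovation w.r.t. the predicted measurement
    S = H @ P @ H.T + R              # (32)
    d_M = r @ np.linalg.solve(S, r)  # (31)
    return d_M > TH
```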
In all our experiments, the aforementioned estimation
schemes achieved real-time execution at the corresponding
IMU rates.
A. VO Outliers

In the case of VO, we conducted two independent experiments: one with the WALK-MAN v2.0, where ground-truth was recorded with an OptiTrack motion capture system, and another with a NAO robot navigating to a desired position in space. In the former case, a PointGrey BlackFly BFLY-U3-23S6C camera at 40 fps and 1080p resolution was used, while in the latter, a Matrix-Vision mvBlueFOX-MLC-200w running at 30 fps and VGA resolution was utilized. Both cameras are monocular and global-shutter. To obtain the VO, in both cases we used SVO [23]. The measurement noise standard deviations used in our VO experiments are listed in Table II.

Fig. 2. Illustration of conducted experiments – Left: WALK-MAN v2.0 and VO outliers, Middle: NAO and VO outliers, Right: NAO and LO outliers.

TABLE II
VO/LO AND KINEMATIC MEASUREMENT NOISE STDS

                n_pb (m)   n_b (rad)   n_vb (m/s)
WALK-MAN - VO   0.035      0.05        0.015
NAO - VO        0.04       0.05        0.013
NAO - LO        0.04       0.05        0.013

Fig. 3. Top: 3D position error; Bottom: 3D orientation error for the WALK-MAN v2.0 - VO experiment. Blue lines indicate the VO, red lines the proposed Robust Gaussian ESKF, green lines the ESKF, and black lines the finely tuned ESKF-TH. The first vertical dotted line corresponds to when a human enters the FoV of the camera, while the second one to when he removes a strong feature source for VO, which then diverges.
In the WALK-MAN v2.0 experiment, shown in Fig. 2
(left), a human unexpectedly crosses the Field of View (FoV)
of the robot, causing sudden changes in the image intensity
levels, and furthermore removes an object that is a strong
feature source for VO. As a result, the scene changes rather
drastically and henceforth, VO diverges while generating
consecutive outlier measurements.
In Figure 3, the estimated 3D position and orientation errors w.r.t. the ground-truth for all three employed schemes are illustrated. Notice that at t = 17.6 s, when the human appears in the FoV, VO starts to misbehave, and at t = 19.9 s, when the static scene changes, VO eventually diverges. Consequently, the ESKF without outlier detection diverges as well, and thus large positional errors are recorded. Nevertheless, this is not the case for the RGESKF and the ESKF-TH, where low errors were observed for both the position and the orientation. To finely tune the ESKF-TH, we had to run the filter and log the Mahalanobis distances (31) in order to determine a proper threshold TH. A value of TH = 23 achieved the lowest estimation error. On the contrary, our approach, the RGESKF, which does not rely on a finely tuned threshold or prior knowledge of the measurements, achieved very accurate results, similar to the tuned-in-advance ESKF-TH. The Root-Mean-Square Error (RMSE) for this experiment is shown in Table III. All employed schemes realized errors in the yaw estimation, since the yaw is unobservable [13], [15].
TABLE III
RMSE FOR THE WALK-MAN V2.0 VO EXPERIMENT

          ^w p_b^x   ^w p_b^y   ^w p_b^z   roll     pitch    yaw
VO        0.217      0.162      0.043      0.075    0.022    0.160
ESKF      0.071      0.045      0.009      0.066    0.011    0.105
RGESKF    0.018      0.033      0.003      0.062    0.011    0.098
ESKF-TH   0.017      0.025      0.003      0.059    0.010    0.102

Fig. 4. Top: 3D position error; Bottom: 3D orientation error for the NAO - VO experiment. Blue lines indicate the VO, red lines the proposed Robust Gaussian ESKF, and green lines the ESKF. The vertical dotted line specifies when a human removes a strong feature source for VO, which then diverges.
Similarly, in the NAO's case, a human present in the scene removes a strong feature source for VO while the robot is walking. For clarity, this is illustrated in Fig. 2 (middle). Since we do not have ground-truth data available for the NAO experiments, and given that a finely tuned ESKF-TH can yield quite accurate estimation, as evident from our previous experiment, we assume it as the baseline to compare against. Subsequently, to properly derive the needed threshold, we computed (31) at t = 39.6 s, the exact time when the human changes the scenery. The latter was found to be TH = 16. Figure 4 demonstrates the 3D position and orientation error w.r.t. the ESKF-TH. As illustrated, the ESKF realizes large errors for both positional and rotational quantities when VO diverges at t = 39.6 s. However, once again, the RGESKF yielded practically identical results to the ESKF-TH, as is also evident in Table IV, where the RMSE is presented.

TABLE IV
RMSE W.R.T. ESKF-TH FOR THE NAO VO EXPERIMENT

          ^w p_b^x   ^w p_b^y   ^w p_b^z   roll     pitch    yaw
VO        0.246      0.653      0.227      0.066    0.246    0.152
ESKF      0.238      0.643      0.223      0.037    0.061    0.146
RGESKF    2.8e-6     2.6e-7     2.3e-6     6.8e-7   1.6e-6   9.2e-6
B. LO Outliers
Next, we examine how LO outlier measurements can degrade the estimation performance. To do so, we utilize an RP-LiDAR360 mounted on NAO's head to obtain planar scans at 5 Hz. Subsequently, we employed RF2O [24] to compute the 2D pose, i.e. the position x, y and the yaw, with scan matching. The measurement noise assumed in this experiment is shown in Table II.

In order to generate LO outliers, a human covers the spinning laser while NAO walks, as depicted in Fig. 2 (right). This corresponds to the scenario where a robot in motion is suddenly surrounded by people. To this end, the static world used to derive the LO is drastically changed, which in turn gives rise to outliers.
In the conducted experiment, NAO is commanded to walk straight, stop, and then walk straight again. While it is walking, a human covers the LiDAR twice to generate LO outliers. As previously, we compare the estimation results to the ESKF-TH, which we have accurately tuned in advance as before. A threshold of TH = 9 was experimentally found to be sufficient for this specific experiment. The 2D pose error is shown in Figure 5.

Fig. 5. 2D pose error for the NAO - LO experiment. Blue lines indicate the LO, red lines the proposed Robust Gaussian ESKF, and green lines the ESKF. The vertical dotted lines correspond to when a human suddenly covers the robot's LiDAR with a box, causing LO to diverge.

Time t = 21.5 s marks the instant where the human covers the LiDAR for the first time. At that time, a large jump in the LO position along the x axis is recorded, which in turn causes a large error in the ESKF estimation along the same axis. Subsequently, after 23.5 s, the human covers the LiDAR one more time. This time, larger errors are evident in the base's y position and yaw angle, causing the ESKF to misbehave again. Interestingly, the proposed scheme proved to be robust and automatically ignored the inaccurate LO measurements. The RMSE for this particular experiment is indicated in Table V. As demonstrated, the RGESKF yields similar estimation results for all quantities of interest when compared to a finely tuned ESKF-TH.
TABLE V
RMSE W.R.T. ESKF-TH FOR THE NAO LO EXPERIMENT

          ^w p_b^x   ^w p_b^y   yaw
LO        0.197      0.066      0.174
ESKF      0.180      0.054      0.173
RGESKF    0.004      0.002      0.006
All our experiments are illustrated in high quality at
https://youtu.be/ojogeY3xSsw
IV. DISCUSSION

As evident, the proposed RGESKF is characterized by high accuracy and strong outlier-rejection capabilities. This holds true even when consecutive VO/LO outlier measurements are observed. Additionally, no prior knowledge of the measurement distributions [26] or finely tuned thresholds [14] is required for the success of the proposed scheme.
On the contrary, notice that the ESKF-TH needed three
different thresholds, one for each experiment, to achieve
accurate performance. This is evident from (31), where an optimal threshold depends on the measurement noise $R_k^o$ and the error-state uncertainty $P_k^+$. In addition, in all our VO/LO experiments, the outlier detection part in Algorithm 1 took at most three iterations to complete. Thus, in the released open-source implementation [29], we loop three times instead of checking the condition in line 15 at every iteration. Moreover, it is noteworthy that in the VO experiments the
initial derived SVO orientation can be erroneous. We suspect
this is probably due to a) inaccurate scale initialization, b)
imperfect extrinsic and intrinsic calibrations. Nevertheless,
as also seen in the results, this does not degrade the estimation accuracy since the fused IMU and kinematically
computed base velocity measurements also carry information
that contributes to the orientation estimation. Furthermore, it is important to clarify that the RGESKF does not only detect and reject outliers, as the ESKF-TH does, but also automatically weights the measurement noise according to (23) in order to avoid information loss when non-ideal/non-outlier measurements arrive. Finally, the proposed method can be readily applied to other robotic platforms, such as Unmanned Aerial Vehicles (UAVs), which also utilize the ESKF [41] for state estimation.
V. CONCLUSION
Prominent examples of VO/LO approaches readily assume
that the world in which the robot acts is static. Nevertheless, to enable humanoids to co-exist with humans in dynamically
changing environments, their state-estimation schemes must
be robustified. In this work, we tackled the presence of
VO/LO outlier measurements in base estimation by proposing the RGESKF. After mathematically establishing the
proposed scheme, we provided a quantitative and qualitative
assessment with two robots, namely a full-size WALK-MAN
v2.0 humanoid and a mini-size NAO robot, demonstrating
the accuracy and efficiency of the proposed scheme in
real-world conditions. Finally, in order to reinforce further
research endeavours, we release our implementation as an
open-source ROS/C++ package [29].
Planned future work involves utilizing the proposed scheme for humanoid navigation in human-populated and gradually changing environments, where outliers are harder to detect [42]. In
addition, we will investigate the estimation performance in
locomotion-manipulation tasks, where the humanoid has to
walk to a desired location and grasp an object. Commonly
while grasping, multiple self-parts appear in the FoV and
possibly give rise to outliers. Moreover, it would be interesting to compare the proposed method to other state estimation
schemes that directly include leg contacts [15]. Finally, we
will address how to recover from VO/LO divergence in order
to continue integrating VO/LO measurements.
APPENDIX

$$B_k = \int (y_k^o - h^o(x_k))(y_k^o - h^o(x_k))^\top p(x_k)\, dx_k = y_k^o y_k^{o\top} - 2 y_k^o \int h^o(x_k)^\top p(x_k)\, dx_k + \int h^o(x_k) h^o(x_k)^\top p(x_k)\, dx_k \quad (33)$$

Using the first-order approximation of (8) after the update:

$$h^o(x_k) \approx h^o(\hat{x}_k^+) + H_k^o (x_k - \hat{x}_k^+)$$

the first integral of (33) can be computed as:

$$\int h^o(x_k)^\top p(x_k)\, dx_k = \int h^o(\hat{x}_k^+)^\top p(x_k)\, dx_k + \int (x_k - \hat{x}_k^+)^\top H_k^{o\top} p(x_k)\, dx_k = h^o(\hat{x}_k^+)^\top$$

since $\int (x_k - \hat{x}_k^+)\, p(x_k)\, dx_k = 0$, while the second integral of (33) is equivalent to:

$$\int h^o(x_k) h^o(x_k)^\top p(x_k)\, dx_k = \int (h^o(\hat{x}_k^+) + H_k^o(x_k - \hat{x}_k^+))(h^o(\hat{x}_k^+) + H_k^o(x_k - \hat{x}_k^+))^\top p(x_k)\, dx_k = h^o(\hat{x}_k^+) h^o(\hat{x}_k^+)^\top + H_k^o P_k^+ H_k^{o\top}$$

Thus, (33) becomes:

$$B_k = y_k^o y_k^{o\top} - 2 y_k^o h^o(\hat{x}_k^+)^\top + h^o(\hat{x}_k^+) h^o(\hat{x}_k^+)^\top + H_k^o P_k^+ H_k^{o\top} \quad (34)$$
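As a quick numerical sanity check of (34), the following numpy snippet compares the analytic expression against a Monte Carlo estimate of (33) in the linear case $h^o(x) = Hx$, with toy dimensions of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2                                   # state and measurement dims (toy sizes)
H = rng.normal(size=(m, n))                   # linearized measurement Jacobian H_k^o
P = np.eye(n) * 0.1                           # posterior covariance P_k^+
x_post = rng.normal(size=n)                   # posterior mean x_k^+
y = rng.normal(size=m)                        # an arbitrary measurement y_k^o

# Analytic B_k from (34): (y - h(x+))(y - h(x+))^T + H P+ H^T
r = y - H @ x_post
B_analytic = np.outer(r, r) + H @ P @ H.T

# Monte Carlo estimate of the integral in (33) with x ~ N(x+, P+)
L = np.linalg.cholesky(P)
xs = x_post + (L @ rng.normal(size=(n, 200000))).T
res = y - xs @ H.T
B_mc = res.T @ res / len(xs)

print(np.max(np.abs(B_analytic - B_mc)))      # small, up to sampling error
```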
ACKNOWLEDGMENT
The authors would like to thank Hongwei Wang for his
helpful discussion on the topic examined in this paper.
REFERENCES
[1] B. Aceituno-Cabezas, C. Mastalli, H. Dai, M. Focchi, A. Radulescu,
D. G. Caldwell, J. Cappelletto, J. C. Grieco, G. Fernández-López, and
C. Semini, “Simultaneous contact, gait, and motion planning for robust
multilegged locomotion via mixed-integer convex optimization,” IEEE
Robotics and Automation Letters, vol. 3, no. 3, pp. 2531–2538, July
2018.
[2] A. W. Winkler, C. D. Bellicoso, M. Hutter, and J. Buchli, “Gait and
trajectory optimization for legged systems through phase-based end-effector parameterization,” IEEE Robotics and Automation Letters,
vol. 3, no. 3, pp. 1560–1567, July 2018.
[3] A. Hereid, C. M. Hubicki, E. A. Cousineau, and A. D. Ames,
“Dynamic Humanoid Locomotion: A Scalable Formulation for HZD
Gait Optimization,” IEEE Transactions on Robotics, 2018.
[4] M. Neunert, M. Stäuble, M. Giftthaler, C. D. Bellicoso, J. Carius,
C. Gehring, M. Hutter, and J. Buchli, “Whole-body nonlinear model
predictive control through contacts for quadrupeds,” IEEE Robotics
and Automation Letters, vol. 3, no. 3, pp. 1458–1465, July 2018.
[5] A. Herzog, N. Rotella, S. Mason, F. Grimminger, S. Schaal, and
L. Righetti, “Momentum control with hierarchical inverse dynamics
on a torque-controlled humanoid,” Auton. Robots, vol. 40, no. 3, pp.
473–491, Mar. 2016.
[6] C. Zhou, X. Wang, Z. Li, and N. Tsagarakis, “Overview of Gait
Synthesis for the Humanoid COMAN,” vol. 14, no. 1, 2017, pp. 15–
25.
[7] T. Flayols, A. Del Prete, P. Wensing, A. Mifsud, M. Benallegue,
and O. Stasse, “Experimental evaluation of simple estimators for
humanoid robots,” in 2017 IEEE-RAS 17th International Conference
on Humanoid Robotics (Humanoids), Nov 2017, pp. 889–895.
[8] S. Kuindersma, R. Deits, M. Fallon, A. Valenzuela, H. Dai, F. Permenter, T. Koolen, P. Marion, and R. Tedrake, “Optimization-based
Locomotion Planning, Estimation, and Control Design for the Atlas
Humanoid Robot,” Autonomous Robots, vol. 40, pp. 429–455, 2016.
[9] M. F. Fallon, M. Antone, N. Roy, and S. Teller, “Drift-free Humanoid
State Estimation Fusing Kinematic, Inertial and LIDAR Sensing,” in
IEEE-RAS International Conference on Humanoid Robots, 2014, pp.
112–119.
[10] M. F. Fallon, P. Marion, R. Deits, T. Whelan, M. Antone, J. McDonald,
and R. Tedrake, “Continuous Humanoid Locomotion over Uneven
Terrain using Stereo Fusion,” in IEEE-RAS Intl. Conf. on Humanoid
Robots, 2015, pp. 881–888.
[11] V. Sushrutha Raghavan, D. Kanoulas, C. Zhou, D. G. Caldwell,
and N. G. Tsagarakis, “A Study on Low-Drift State Estimation for
Humanoid Locomotion, Using LiDAR and Kinematic-Inertial Data
Fusion,” in IEEE-RAS 18th International Conference on Humanoid
Robots (Humanoids), 2018, pp. 1–8.
[12] M. Camurri, M. Fallon, S. Bazeille, A. Radulescu, V. Barasuol, D. G.
Caldwell, and C. Semini, “Probabilistic contact estimation and impact
detection for state estimation of quadruped robots,” IEEE Robotics
and Automation Letters, vol. 2, no. 2, pp. 1023–1030, April 2017.
[13] M. Bloesch, M. Hutter, M. Hoepflinger, S. Leutenegger, C. Gehring,
C. D. Remy, and R. Siegwart, “State Estimation for Legged Robots–
Consistent Fusion of Leg Kinematics and IMU ,” Robotics Sci. and
Sys., 2012.
[14] M. Bloesch, C. Gehring, P. Fankhauser, M. Hutter, M. A. Hoepflinger,
and R. Siegwart, “State Estimation for Legged Robots on Unstable and
Slippery Terrain,” in IEEE/RSJ International Conference on Intelligent
Robots and Systems, 2013, pp. 6058–6064.
[15] N. Rotella, M. Bloesch, L. Righetti, and S. Schaal, “State Estimation
for a Humanoid Robot,” in IEEE/RSJ Intl. Conf. on Intelligent Robots
and Systems, 2014, pp. 952–958.
[16] N. Rotella, A. Herzog, S. Schaal, and L. Righetti, “Humanoid momentum estimation using sensed contact wrenches,” in IEEE-RAS Intl.
Conf. on Humanoid Robots, 2015, pp. 556–63.
[17] J. Carpentier, M. Benallegue, N. Mansard, and J. P. Laumond, “Center-of-Mass Estimation for a Polyarticulated System in Contact - A
Spectral Approach,” IEEE Transactions on Robotics, vol. 32, 2016.
[18] S. Piperakis and P. Trahanias, “Non-linear ZMP based State Estimation for Humanoid Robot Locomotion,” in IEEE-RAS Intl. Conf. on
Humanoid Robots, 2016, pp. 202–209.
[19] Xinjilefu and C. G. Atkeson, “State Estimation of a Walking Humanoid Robot,” in IEEE/RSJ Intl. Conf. on Intelligent Robots and
Systems, 2012, pp. 3693–3699.
[20] R. Wittmann, A. Hildebrandt, D. Wahrmann, D. Rixen, and
T. Buschmann, “State Estimation for Biped Robots using Multibody
Dynamics,” in IEEE/RSJ Conf. Intel. Robots and Systems, 2015.
[21] S. Piperakis, M. Koskinopoulou, and P. Trahanias, “Nonlinear State
Estimation for Humanoid Robot Walking,” IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3347–3354, Oct 2018.
[22] J. Engel, V. Koltun, and D. Cremers, “Direct Sparse Odometry,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 40,
pp. 611–625, 2018.
[23] C. Forster, M. Pizzoli, and D. Scaramuzza, “SVO: Fast Semi-direct
Monocular Visual Odometry,” in IEEE Intl. Conf. on Robotics and
Automation, 2014, pp. 15–22.
[24] M. Jaimez, J. G. Monroy, and J. Gonzalez-Jimenez, “Planar Odometry
from a Radial Laser Scanner. A Range Flow-based Approach,” in IEEE
Intl. Conf. on Robotics and Automation, 2016, pp. 4479–4485.
[25] J. Zhang and S. Singh, “LOAM: Lidar Odometry and Mapping in
Real-time,” in Robotics: Science and Systems, 2014.
[26] R. Piché, S. Särkkä, and J. Hartikainen, “Recursive Outlier-robust
Filtering and Smoothing for Nonlinear Systems using the Multivariate
Student-t Distribution,” in IEEE International Workshop on Machine
Learning for Signal Processing, 2012, pp. 1–6.
[27] H. Wang, H. Li, J. Fang, and H. Wang, “Robust Gaussian Kalman Filter With Outlier Detection,” IEEE Signal Processing Letters, vol. 25,
no. 8, pp. 1236–1240, 2018.
[28] N. G. Tsagarakis et al., “WALK-MAN: A High-Performance Humanoid Platform for Realistic Environments,” Journal of Field
Robotics, vol. 34, no. 7, pp. 1225–1259, 2017.
[29] S. Piperakis, “SEROW Humanoid Walking Estimator.” [Online].
Available: https://github.com/mrsp/serow
[30] S. Piperakis, S. Timotheatos, and P. Trahanias, “Unsupervised Gait
Phase Estimation for Humanoid Robot Walking,” in IEEE Intl. Conf.
on Robotics and Automation, 2019.
[31] N. Rotella, S. Schaal, and L. Righetti, “Unsupervised Contact Learning
for Humanoid Estimation and Control,” in Proceedings of the IEEE
Intl. Conf. on Robotics and Automation, 2018.
[32] X. Ding, L. He, and L. Carin, “Bayesian Robust Principal Component
Analysis,” in IEEE Trans. Image Process, vol. 20, no. 12, 2011, pp.
3419–3430.
[33] Q. Wan, H. Duan, J. Fang, H. Li, and Z. Xing, “Robust Bayesian
Compressed Sensing with Outliers,” in Signal Processing, vol. 140,
2017, pp. 104–109.
[34] J. Paisley and L. Carin, “Nonparametric Factor Analysis with Beta
Process Priors,” in 26th Annual International Conference on Machine
Learning. New York, NY, USA: ACM, 2009, pp. 777–784.
[35] I. Arasaratnam and S. Haykin, “Cubature Kalman Filters,” IEEE
Transactions on Automatic Control, vol. 54, no. 6, pp. 1254–1269,
2009.
[36] D. Kanoulas, C. Zhou, A. Nguyen, G. Kanoulas, D. G. Caldwell, and
N. G. Tsagarakis, “Vision-based Foothold Contact Reasoning using
Curved Surface Patches,” in IEEE-RAS 17th International Conference
on Humanoid Robotics (Humanoids), 2017, pp. 121–128.
[37] L. Muratore, A. Laurenzi, E. M. Hoffman, A. Rocchi, D. G. Caldwell,
and N. G. Tsagarakis, “Xbotcore: A Real-Time Cross-Robot Software
Platform,” in IEEE International Conference on Robotic Computing
(IRC), 2017, pp. 77–80.
[38] A. Rocchi, E. M. Hoffman, D. G. Caldwell, and N. G. Tsagarakis,
“OpenSoT: A Whole-body Control Library for the Compliant Humanoid Robot COMAN,” in IEEE International Conference on
Robotics and Automation (ICRA), 2015, pp. 6248–6253.
[39] S. Piperakis, E. Orfanoudakis, and M. Lagoudakis, “Predictive Control
for Dynamic Locomotion of Real Humanoid Robots,” in IEEE/RSJ
Intl. Conf. Intel. Robots and Systems, 2014, pp. 4036–4043.
[40] N. El-Sheimy, H. Hou, and X. Niu, “Analysis and Modeling of Inertial
Sensors Using Allan Variance,” IEEE Transactions on Instrumentation
and Measurement, pp. 140–149, 2008.
[41] A. Bry, A. Bachrach, and N. Roy, “State estimation for aggressive
flight in gps-denied environments using onboard sensing,” in IEEE
International Conference on Robotics and Automation, 2012, pp. 1–8.
[42] S. Piperakis, N. Tavoularis, E. Hourdakis, D. Kanoulas, and P. Trahanias, “Humanoid Robot Dense RGB-D SLAM for Embedded Devices,” in IEEE Intl. Conf. on Robotics and Automation (ICRA): 3rd Full-Day Workshop “Towards Real-World Deployment of Legged Robots”, 2019.