Efficient Sparse Pose Adjustment for 2D Mapping

Kurt Konolige
Willow Garage
Menlo Park, CA 94025
Email: [email protected]

Giorgio Grisetti, Rainer Kümmerle, Wolfram Burgard
University of Freiburg
Freiburg, Germany
Email: [email protected]

Benson Limketkai, Regis Vincent
SRI International
Menlo Park, CA 94025
Email: [email protected]
Abstract— Pose graphs have become a popular representation
for solving the simultaneous localization and mapping (SLAM)
problem. A pose graph is a set of robot poses connected by
nonlinear constraints obtained from observations of features
common to nearby poses. Optimizing large pose graphs has
been a bottleneck for mobile robots, since the computation
time of direct nonlinear optimization can grow cubically with
the size of the graph. In this paper, we propose an efficient
method for constructing and solving the linear subproblem,
which is the bottleneck of these direct methods. We compare our
method, called Sparse Pose Adjustment (SPA), with competing
indirect methods, and show that it outperforms them in terms of
convergence speed and accuracy. We demonstrate its effectiveness
on a large set of indoor real-world maps, and a very large
simulated dataset. Open-source implementations in C++, and the
datasets, are publicly available.
I. INTRODUCTION
The recent literature in robot mapping shows an increasing
interest in graph-based SLAM approaches. In the most general
form, the graph has nodes that represent both robot poses
and world features, with measurements connecting them as
constraints. The goal of all approaches is to jointly optimize the poses of the nodes so as to minimize the error
introduced by the constraints. One classical variant of this
problem comes from computer vision and is denoted as Bundle
Adjustment [25], which is typically solved with a specialized
variant of the Levenberg-Marquardt (LM) nonlinear optimizer.
In the SLAM literature, Lu-Milios [18], GraphSLAM [24], and
√SAM [4] are all variants of this technique.
Since features tend to outnumber robot poses, more compact
systems can be created by converting observations of features into direct constraints among the robot poses, either by
marginalization [1, 24, 4], or by direct matching – for example,
matching laser scans between two robot poses yields a relative
pose estimate for the two. Pose constraint systems, in typical
robotic mapping applications, exhibit a sparse structure of
connections, since the range of the sensor is typically limited
to the vicinity of the robot.
Solving pose graphs efficiently (i.e., finding the optimal
positions of the nodes) is a key problem for these methods
especially in the context of online mapping problems. A
typical 2D laser map for a 100m x 100m office space may
have several thousand nodes and many more constraints (see
Figure 1). Furthermore, adding a loop closure constraint to
this map can affect almost all of the poses in the system.
Fig. 1. Large MIT corridor map, unoptimized (left) and optimized (right).
A full nonlinear optimization of this map (3603 nodes and 4986 constraints),
starting from the odometry positions of the left-side figure, takes just 150 ms
with our method.
At the heart of the LM method lies the solution of a large
sparse linear problem. In this paper, we develop a method
to efficiently compute the sparse matrix from the constraint
graph, and use direct sparse linear methods to solve it. In
analogy to Sparse Bundle Adjustment in the vision literature,
we call this method Sparse Pose Adjustment (SPA), since it
deals with the restricted case of pose-pose constraints. The
combination of an SBA/GraphSLAM optimizer with efficient
methods for solving the linear subproblem has the following
advantages.
• It takes the covariance information in the constraints into
account, which leads to more accurate solutions.
• SPA is robust and tolerant to initialization, with very
low failure rates (getting stuck in local minima) for both
incremental and batch processing.
• Convergence is very fast as it requires only a few iterations of the LM method.
• Unlike EKF and information filters, SPA is fully nonlinear: at every iteration, it linearizes all constraints
around their current pose.
• SPA is efficient in both batch and incremental mode.
We document these and other features of the method in
the experimental results section where we also compare our
method to other LM and non-LM state-of-the-art optimizers.
One of the benefits of the efficiency of SPA is that a
mapping system can continuously optimize its graph, providing the best global estimate of all nodes, with very little
computational overhead. Solving the optimization problem for
the large map shown in Figure 1 requires only 150 ms from an
initial configuration provided by odometry. In the incremental
mode, where the graph is optimized after each node is added,
it requires less than 15 ms for any node addition.
Although SPA can be parameterized with 3D poses, for
this paper we have restricted it to 2D mapping, which is
a well-developed field with several competing optimization
techniques. Our intent is to show that a 2D pose-based
mapping system can operate on-line using SPA as its optimization engine, even in large-scale environments and with large
loop closures, without resorting to submaps or complicated
partitioning schemes.
II. RELATED WORK
Lu and Milios [18] presented the seminal work on graph-based SLAM, where they determine the pairwise measurements between scans via ICP scan-matching and then optimize
the graph by iterative linearization. At that time, efficient
optimization algorithms were not available to the SLAM
community and graph-based approaches were regarded as
too time-consuming. Despite this, the intuitive formulation of
graph-based SLAM attracted many researchers with valuable
contributions. Gutmann and Konolige [12] proposed an effective way for constructing such a network and for detecting loop
closures while running an incremental estimation algorithm.
Since the Lu and Milios paper, many approaches for graph
optimization have been proposed. Howard et al. [13] apply
relaxation to localize the robot and build a map. Duckett
et al. [6] propose the usage of Gauss-Seidel relaxation to
minimize the error in the network of constraints. To overcome
the inherently slow convergence of relaxation methods, Frese
et al. [9] propose a variant of Gauss-Seidel relaxation called
multi-level relaxation (MLR). It applies relaxation at different
resolutions. MLR is reported to provide very good results in
2D environments, especially if the error in the initial guess is
limited.
Olson et al. [21] proposed stochastic gradient descent to
optimize pose graphs. This approach has the advantage of
being easy to implement and exceptionally robust to wrong
initial guesses. Later, Grisetti et al. [10] extended this approach
by applying a tree-based parameterization that significantly
increases the convergence speed. The main problem of these
approaches is that they assume the error in the graph to be
more or less uniform, and thus they are difficult to apply to
graphs where some constraints are under-specified.
The most intuitive way to optimize a graph is probably
by nonlinear least-squares optimization, such as LM. Least-squares methods require repeatedly solving a large linear
system obtained by linearizing the original likelihood function
of the graph. This linear system is usually very large; accordingly, the first graph-based approaches were time-consuming,
because they did not exploit its natural sparsity. One promising
technique is Preconditioned Conjugate Gradient (PCG) [2],
which was later used by Konolige [15] and Montemerlo
and Thrun [20] as an efficient solver for large sparse pose
constraint systems; the preconditioner is typically incomplete
Cholesky decomposition. PCG is an iterative method, which
in general requires n iterations to converge, where n is the
number of variables in the graph. We have implemented a
sparse-matrix version of PCG from Sparselib++ and IML++
[5], and use this implementation for comparison experiments
in this paper.
More recently, Dellaert and colleagues use bundle adjustment, which they implement using sparse direct linear solvers
[3]; they call their system √SAM [4]. Our approach is
similar to √SAM; we differ from their approach mostly in
engineering, by efficient construction of the linear subproblem
using ordered data structures. We also use LM instead of
a standard nonlinear least-squares method, thereby increasing
robustness. Finally, we introduce a “continuable LM” method
for the incremental case, and an initialization method that is
much more robust for the batch problem.
Kaess et al. [14] introduced a variant of √SAM, called
iSAM, that performs incremental updates of the linear matrix
associated with the nonlinear least-squares problem. Relinearization and variable ordering are performed only occasionally, thereby increasing computational efficiency. In our
approach, relinearization and matrix construction are very
efficient, so such methods become less necessary. Currently
we do not have an implementation of either iSAM or √SAM
to test against for performance.
Relaxation or least-squares approaches proceed by iteratively refining an initial guess. Conversely, approaches based
on stochastic gradient descent are more robust to the initial
guess. In the SLAM literature the importance of this initial
guess has often been underestimated. The better the initial
guess is, the more likely it is for an algorithm to find the
correct solution. In this paper, we address this point and
evaluate three different strategies for computing the initial
guess.
In contrast to full nonlinear optimization, several researchers
have explored filtering techniques to solve the graphs incrementally, using an information matrix form. The first such
approach was proposed by Eustice et al. and denoted Delayed
Sparse Information Filter (DSIF) [7]. This technique can be
very efficient, because it adds only a small constant number
of elements to the system information matrix, even for loop
closures. However, recovering the global pose of all nodes
requires solving a large sparse linear system; there are faster
ways of getting approximate recent poses.
Frese proposed the TreeMap [8] algorithm that captures the
sparse structure of the system by a tree representation. Each
leaf in the tree is a local map and the consistency of the
estimate is achieved by sending updates to the local maps
through the tree. Under ideal conditions, this approach can
update the whole map in O(n log n) time, where n is the
number of elements in the map. However, if the map has
many local connections the size of the local maps can be
very large and their updates (which are regarded as elementary
operations) become computationally expensive as shown in the
remainder of this paper.
[Figure 2 diagram: sensor data → front-end (graph construction) → constraints and poses → back-end (optimization) → optimized configuration (updated poses).]
Fig. 2. Typical graph-based SLAM system. The front-end and the back-end
are executed in an interleaved way.
Efficient direct (non-iterative) algorithms for solving sparse
systems have become available [3], reviving a class of
approaches to graph optimization that had previously been
discarded as too expensive. In this paper, we propose an
efficient approach for optimizing 2D pose graphs that uses
direct sparse Cholesky decomposition to solve the linear
system. The linear system is computed in a memory-efficient
way that minimizes cache misses and thus significantly
improves performance. We compare our method, in both
accuracy and speed, to existing LM and non-LM approaches,
and show that SPA outperforms them. Open-source
implementations are available both in C++ and in MATLAB/Octave.
III. SYSTEM FORMULATION
Popular approaches to solve the SLAM problem are the so-called “graph-based” or “network-based” methods. The idea is
to represent the history of the robot measurements by a graph.
Every node of the graph represents a sensor measurement
or a local map and it is labeled with the location at which
the measurement was taken. An edge between two nodes
encodes the spatial information arising from the alignment of
the connected measurements and can be regarded as a spatial
constraint between the two nodes.
In the context of graph-based SLAM, one typically considers two different problems. The first one is to identify the
constraints based on sensor data. This so-called data association problem is typically hard due to potential ambiguities
or symmetries in the environment. A solution to this problem
is often referred to as the SLAM front-end and it directly
deals with the sensor data. The second problem is to correct
the poses of the robot to obtain a consistent map of the
environment given the constraints. This part of the approach is
often referred to as the optimizer or the SLAM back-end. Its
task is to seek a configuration of the nodes that maximizes
the likelihood of the measurements encoded in the constraints.
An alternative view of this problem is given by the spring-mass model in physics. In this view, the nodes are regarded as
masses and the constraints as springs connected to the masses.
The minimal energy configuration of the springs and masses
describes a solution to the mapping problem.
During its operation a graph-based SLAM system interleaves the execution of the front-end and of the back-end, as
shown in Figure 2. This is required because the front-end needs
to operate on a partially optimized map to restrict the search
for potential constraints. The more accurate the current
estimate is, the more robust the constraints generated by the
front-end will be and the faster its operation. Accordingly, the
performance of the optimization algorithm, measured in terms
of accuracy of the estimate and execution time, has a major
impact on the overall mapping system.
In this paper we describe in detail an efficient and compact optimization approach that operates on 2D graphs. Our
algorithm can be coupled with arbitrary front-ends that handle
different kinds of sensors. For clarity of presentation we
briefly describe a front-end for laser data. However, the
general concepts can be straightforwardly applied to different
sensors.
IV. SPARSE POSE ADJUSTMENT
To optimize a set of poses and constraints, we use the well-known Levenberg-Marquardt (LM) method as a framework,
with particular implementations that make it efficient for the
sparse systems encountered in 2D map building. In analogy
to the Sparse Bundle Adjustment of computer vision, which
is a similarly efficient implementation of LM for cameras and
features, we call our system Sparse Pose Adjustment (SPA).
A. Error Formulation
The variables of the system are the set of global poses
c of the robot, parameterized by a translation and angle:
ci = [ti, θi] = [xi, yi, θi]⊤. A constraint is a measurement
of one node cj from another’s (ci ) position. The measured
offset between ci and cj , in ci ’s frame, is z̄ij , with precision
matrix Λij (inverse of covariance). For any actual poses of ci
and cj, their offset can be calculated as

$$h(c_i, c_j) \equiv \begin{bmatrix} R_i^\top (t_j - t_i) \\ \theta_j - \theta_i \end{bmatrix} \qquad (1)$$

Here Ri is the 2×2 rotation matrix of θi. h(ci, cj) is called the
measurement equation.
The error function associated with a constraint, and the total
error, are

$$e_{ij} \equiv \bar{z}_{ij} - h(c_i, c_j), \qquad \chi^2(c) \equiv \sum_{ij} e_{ij}^\top \Lambda_{ij}\, e_{ij} \qquad (2)$$
Note that the angle parameter in h(ci , cj ) is not unique,
since adding or subtracting 2π yields the same result. Angle
differences are always normalized to the interval (−π, π] when
they occur.
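To make the error computation concrete, the following minimal C++ sketch (our own illustration; the names are not from the released implementation) evaluates the measurement equation (1) and the per-constraint error of (2), including the angle normalization just described.

#include <cmath>

struct Pose { double x, y, theta; };   // c_i = [x_i, y_i, theta_i]^T

// Normalize an angle to the interval (-pi, pi].
double normAngle(double a) {
  while (a > M_PI)   a -= 2.0 * M_PI;
  while (a <= -M_PI) a += 2.0 * M_PI;
  return a;
}

// Measurement equation (1): offset of c_j expressed in c_i's frame.
Pose hMeas(const Pose& ci, const Pose& cj) {
  const double c = std::cos(ci.theta), s = std::sin(ci.theta);
  const double dx = cj.x - ci.x, dy = cj.y - ci.y;
  // Top block: R_i^T (t_j - t_i); bottom: theta_j - theta_i.
  return { c * dx + s * dy, -s * dx + c * dy, normAngle(cj.theta - ci.theta) };
}

// Error of one constraint, from (2): e_ij = zbar_ij - h(c_i, c_j).
Pose errorEdge(const Pose& ci, const Pose& cj, const Pose& zbar) {
  const Pose h = hMeas(ci, cj);
  return { zbar.x - h.x, zbar.y - h.y, normAngle(zbar.theta - h.theta) };
}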
B. Linear System
The optimal placement of c is found by minimizing the
total error in Equation 2. A standard method for solving this
problem is Levenberg-Marquardt (LM), iterating a linearized
solution around the current values of c. The linear system is
formed by stacking the variables c into a vector x, and the
error functions into a vector e. Then we define:
$$\Lambda \equiv \begin{bmatrix} \Lambda_{ab} & & \\ & \ddots & \\ & & \Lambda_{mn} \end{bmatrix}, \qquad J \equiv \frac{\partial e}{\partial x}, \qquad H \equiv J^\top \Lambda J \qquad (3)$$

The LM system is:

$$(H + \lambda\,\operatorname{diag} H)\; \Delta x = J^\top \Lambda\, e \qquad (4)$$
Here λ is a small positive multiplier that transitions between
gradient descent and Gauss-Newton methods. Gradient descent
is more robust and less likely to get stuck in local minima, but
converges slowly; Gauss-Newton has the opposite behavior.
The matrix H is formed by adding four components for
each measurement h(ci , cj ):
$$\begin{bmatrix} \ddots & & & & \\ & J_i^\top \Lambda_{ij} J_i & \cdots & J_i^\top \Lambda_{ij} J_j & \\ & \vdots & \ddots & \vdots & \\ & J_j^\top \Lambda_{ij} J_i & \cdots & J_j^\top \Lambda_{ij} J_j & \\ & & & & \ddots \end{bmatrix} \qquad (5)$$
Here we have abused the notation for J slightly, with Ji
being the Jacobian of eij with respect to the variable ci .
The components are all 3×3 blocks. The right-hand side is
formed by adding the 3×1 blocks $J_i^\top \Lambda_{ij} e_{ij}$ and $J_j^\top \Lambda_{ij} e_{ij}$ for each
constraint.
Solving the linear equation yields an increment ∆x that can
be added back in to the current value of x as follows:

$$t_i \leftarrow t_i + \Delta t_i, \qquad \theta_i \leftarrow \theta_i + \Delta \theta_i \qquad (6)$$
C. Error Jacobians
Jacobians of the measurement function h appear in the
normal equations (4), and we list them here.
$$\frac{\partial e_{ij}}{\partial t_i} \equiv \begin{bmatrix} -R_i^\top \\ 0\;\;0 \end{bmatrix} \qquad \frac{\partial e_{ij}}{\partial \theta_i} \equiv \begin{bmatrix} -\dfrac{\partial R_i^\top}{\partial \theta_i}(t_j - t_i) \\ -1 \end{bmatrix} \qquad \frac{\partial e_{ij}}{\partial t_j} \equiv \begin{bmatrix} R_i^\top \\ 0\;\;0 \end{bmatrix} \qquad \frac{\partial e_{ij}}{\partial \theta_j} \equiv \begin{bmatrix} 0 \;\; 0 \;\; 1 \end{bmatrix}^\top \qquad (7)$$
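As an illustration of the block structure, the C++ sketch below (ours, with Eigen assumed as a dependency) evaluates these 3×3 blocks for one constraint. We write the Jacobian blocks of the measurement function h from (1); the blocks for e follow by the chain rule from (2).

#include <Eigen/Dense>
#include <cmath>

// 3x3 Jacobian blocks of the measurement function h(c_i, c_j) of (1)
// with respect to c_i and c_j; columns are (x, y, theta). The blocks
// for e_ij = zbar_ij - h(c_i, c_j) are their negatives.
void measurementJacobians(const Eigen::Vector3d& ci, const Eigen::Vector3d& cj,
                          Eigen::Matrix3d& Ji, Eigen::Matrix3d& Jj) {
  const double c = std::cos(ci(2)), s = std::sin(ci(2));
  const Eigen::Vector2d dt = cj.head<2>() - ci.head<2>();

  Eigen::Matrix2d Rt;   // R_i^T
  Rt <<  c,  s,
        -s,  c;
  Eigen::Matrix2d dRt;  // d(R_i^T)/d(theta_i)
  dRt << -s,  c,
         -c, -s;

  Ji.setZero();
  Ji.block<2,2>(0,0) = -Rt;        // dh/dt_i
  Ji.block<2,1>(0,2) = dRt * dt;   // dh/dtheta_i
  Ji(2,2) = -1.0;                  // d(theta_j - theta_i)/dtheta_i

  Jj.setZero();
  Jj.block<2,2>(0,0) = Rt;         // dh/dt_j
  Jj(2,2) = 1.0;                   // d(theta_j - theta_i)/dtheta_j
}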
D. Sparsity
We are interested in large systems, where the number of
poses ||c|| can be 10k or more (the largest real-world indoor
dataset we have been able to find is about 3k poses, but we
can generate synthetic datasets of any order). The number of
system variables is 3||c||, and the H matrix is ||c||² in size, or over
10^8 elements. Manipulating such large matrices is expensive.
Fortunately, for typical scenarios the number of constraints
grows only linearly with the number of poses, so that H is
very sparse. We can take advantage of the sparsity to solve
the linear problem more efficiently.
For solving (4) in sparse format, we use the CSparse
package [3]. This package has a highly-optimized Cholesky
decomposition solver for sparse linear systems. It employs
several strategies to decompose H efficiently, including a
logical ordering and an approximate minimal degree (AMD)
algorithm to reorder variables when H is large.
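As a minimal sketch, the solve step with CSparse comes down to a single call once the damped matrix of (4) has been assembled (error handling omitted). Note that SPA itself builds the compressed form directly, as described in Section IV-E, rather than compressing triplets as shown here.

#include "cs.h"  // CSparse, T. Davis [3]

// Solve (H + lambda*diagH) dx = b by sparse Cholesky with AMD ordering.
// Hts: damped system matrix in triplet form; b: right-hand side J^T*Lambda*e,
// overwritten with the solution dx. Returns 1 on success, 0 on failure.
int solveSPA(cs* Hts, double* b) {
  cs* H = cs_compress(Hts);       // triplet -> compressed column storage
  int ok = cs_cholsol(1, H, b);   // order 1: AMD ordering, then Cholesky
  cs_spfree(H);
  return ok;
}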
In general the complexity of decomposition will be O(n³)
in the number of variables. For sparse matrices, the complexity
will depend on the density of the Cholesky factor, which
in turn depends on the structure of H and the order of its
variables. Mahon et al. [19] have analyzed the behavior of the
Cholesky factorization as a function of the loop closures in
the SLAM system. If the number of loop closures is constant,
then the Cholesky factor density is O(n), and decomposition
is O(n). If the number of loop closures grows linearly with the
number of variables, then the Cholesky factor density grows
as O(n²) and decomposition is O(n³).
E. Compressed Column Storage
Each iteration of the LM algorithm has three steps: setting
up the linear system, decomposing H, and finding ∆x by
back-substitution. Setting up the system is linear in the number
of constraints (and hence in the number of variables for most
graph-based SLAM systems). In many situations it can be
the more costly part of the linear solver. Here we outline an
efficient method for setting up the sparse matrix form of H
from the constraints generated by Equation (5).
CSparse uses compressed column storage (CCS) format for
sparse matrices. The figure below shows the basic idea.
$$\begin{bmatrix} 1 & 0 & 4 & 0 \\ 0 & 5 & 0 & 2 \\ 0 & 0 & 0 & 1 \\ 6 & 8 & 0 & 0 \end{bmatrix} \;\Rightarrow\; \begin{aligned} \text{col\_ptr} &= [\,0\;\;2\;\;4\;\;5\;\;7\,] \\ \text{row\_ind} &= [\,0\;\;3\;\;1\;\;3\;\;0\;\;1\;\;2\,] \\ \text{val} &= [\,1\;\;6\;\;5\;\;8\;\;4\;\;2\;\;1\,] \end{aligned} \qquad (8)$$
Each nonzero entry in the array is placed in the val vector.
Entries are ordered by column first, and then by row within
the column. col_ptr has one entry for each column, plus a
last entry which is the total number of nonzeros (nnz). The
col_ptr entry for a column points to the start of the column in
the row_ind and val arrays. Finally, row_ind gives the row
index of each entry within a column.
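In code, the CCS arrays of the small example (8) look as follows (plain arrays, shown independently of CSparse's internal types):

// CCS arrays for the 4x4 example matrix in (8).
const int    col_ptr[5] = {0, 2, 4, 5, 7};        // column starts; last entry = nnz
const int    row_ind[7] = {0, 3, 1, 3, 0, 1, 2};  // row index of each nonzero
const double val[7]     = {1, 6, 5, 8, 4, 2, 1};  // nonzeros, column-major

// Visiting the nonzeros of column j:
//   for (int k = col_ptr[j]; k < col_ptr[j+1]; ++k)
//     visit(row_ind[k], val[k]);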
CCS format is storage-efficient, but is difficult to create
incrementally, since each new nonzero addition to a column
causes a shift in all subsequent entries. The most efficient
way would be to create the sparse matrix in column-wise
order, which would require cycling through the constraints
||c|| times. Instead, we go through the constraints just once,
and store each 3x3 block Ji⊤ Λij Ji in a special block-oriented
data structure that parallels the CCS format. The algorithm is
given in Table I. In this algorithm, we make a pass through the
constraints to store the 3x3 block matrices into C++ std::map
data structures, one for each column. Maps are efficient at
ordered insertion based on their keys, which is the row index.
Once this data structure is created (step (2)), we use the
ordered nature of the maps to create the sparse CCS format of
H by looping over each map in the order of its keys, first
to create the column and row indices, and then to put in
the values. The reason for separating the column/row creation
from value insertion is that the former only has to be done
once for any set of iterations of LM.
Note that only the upper triangular elements of H are stored,
since the Cholesky solver in CSparse only looks at this part,
and assumes the matrix is symmetric.
TABLE I
ALGORITHM FOR SETTING UP THE SPARSE H MATRIX
H = CreateSparse(e, cf )
Input: set of constraints eij , and a list of free nodes (variables)
Output: sparse upper triangular H matrix in CCS format
1) Initialize a vector of size ||cf || of C++ std::map’s; each map
is associated with the corresponding column of H. The key of
the map is the row index, and the data is an empty 3x3 matrix.
Let map[i, j] stand for the j’th entry of the i’th map.
2) For each constraint eij , assuming i < j:
a) In the steps below, create the map entries if they do not
exist.
b) If ci is free, map[i, i] += Ji⊤ Λij Ji .
c) If cj is free, map[j, j] += Jj⊤ Λij Jj .
d) If ci , cj are free, map[j, i] += Ji⊤ Λij Jj .
3) Set up the sparse upper triangular matrix H.
a) In the steps below, ignore elements of the 3x3 map[i, i]
entries that are below the diagonal.
b) Go through map[] in column then row order, and set
up col ptr and row ind by determining the number of
elements in each column, and their row positions.
c) Go through map[] again in column then row order, and
insert entries into val sequentially.
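A condensed C++ sketch of step 2) of Table I follows (our illustration with Eigen types; the released implementation may differ in detail). One std::map per column accumulates the upper-triangular 3×3 blocks, keyed and therefore ordered by row index.

#include <Eigen/Dense>
#include <map>
#include <vector>

struct Constraint {
  int i, j;                        // node indices, with i < j
  Eigen::Matrix3d Ji, Jj;          // Jacobian blocks for c_i and c_j
  Eigen::Matrix3d Lambda;          // precision matrix of the constraint
};

// One std::map per column of H; key = block-row index, value = 3x3 block.
using BlockCols = std::vector<std::map<int, Eigen::Matrix3d>>;

// Fetch a block, creating it as zero if absent (step 2a of Table I).
Eigen::Matrix3d& blockAt(BlockCols& cols, int col, int row) {
  return cols[col].emplace(row, Eigen::Matrix3d::Zero()).first->second;
}

// Step 2 of Table I: accumulate the upper-triangular blocks of H.
// cols must be sized to the number of free nodes beforehand.
void accumulateBlocks(const std::vector<Constraint>& cons,
                      const std::vector<bool>& isFree, BlockCols& cols) {
  for (const Constraint& c : cons) {
    if (isFree[c.i])
      blockAt(cols, c.i, c.i) += c.Ji.transpose() * c.Lambda * c.Ji;
    if (isFree[c.j])
      blockAt(cols, c.j, c.j) += c.Jj.transpose() * c.Lambda * c.Jj;
    if (isFree[c.i] && isFree[c.j])                      // off-diagonal block
      blockAt(cols, c.j, c.i) += c.Ji.transpose() * c.Lambda * c.Jj;
  }
  // Step 3 then walks each map in key (row) order: once to emit col_ptr
  // and row_ind, and once per LM iteration to refill val.
}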
F. Continuable LM System
The LM system algorithm is detailed in Table II. It does
one step in the LM algorithm, for a set of nodes c with
associated measurements. Running a single iteration allows
for incremental operation of LM, so that more nodes can be
added between iterations. The algorithm is continuable in that
λ is saved between iterations, so that successive iterations
can change λ based on their results. The idea is that adding
a few nodes and measurements doesn't change the system
that much, so the value of λ carries information about the state
of the gradient descent vs. Gauss-Newton trade-off. When a loop
closure occurs, the system can have trouble finding a good
minimum, and λ will tend to rise over the next few iterations to
start the system down a good path.
There are many different ways of adjusting λ; we choose a
simple one. The system starts with a small λ, 10⁻⁴. If
the updated system has a lower error than the original, λ is
halved. If the error is the same or larger, λ is doubled. This
works quite well in the case of incremental optimization. As
long as the error decreases when adding nodes, λ decreases
and the system stays in the Gauss-Newton region. When a
link is added that causes a large distortion that does not get
corrected, λ can rise and the system goes back to the more
robust gradient descent.

TABLE II
CONTINUABLE LM ALGORITHM
ContinuableLM(c, e, λ)
Input: nodes c and constraints e, and diagonal increment λ
Output: updated c
1) If λ = 0, set λ to the stored value from the previous run.
2) Set up the sparse H matrix using CreateSparse(e, c − c0), with c0 as the fixed pose.
3) Solve (H + λ diag H) ∆x = J⊤Λe, using sparse Cholesky with AMD.
4) Update the variables c − c0 using Equation (6).
5) If the error e has decreased, divide λ by two and save, and return the updated poses for c − c0.
6) If the error e has increased, multiply λ by two and save, and return the original poses for c − c0.
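In code, the λ schedule of Table II reduces to a few lines; the sketch below (our simplification, with the graph operations abstracted as hypothetical callbacks) keeps λ as persistent state between calls.

#include <functional>

// One continuable-LM iteration (Table II); lambda persists across calls.
struct ContinuableLM {
  double lambda = 1e-4;  // small initial damping

  // chi2(): current total error (2); trySolve(lambda): build H, solve (4),
  // apply the update (6), return the new error; revert(): undo the update.
  void step(const std::function<double()>& chi2,
            const std::function<double(double)>& trySolve,
            const std::function<void()>& revert) {
    const double before = chi2();
    const double after  = trySolve(lambda);
    if (after < before) {
      lambda *= 0.5;     // error decreased: drift toward Gauss-Newton
    } else {
      revert();          // error did not decrease: reject the step
      lambda *= 2.0;     // and drift toward gradient descent
    }
  }
};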
V. SCAN MATCHING
SPA requires precision (inverse covariance) estimates from
matching of laser scans (or other sensors). Several scan-match
algorithms can provide this, for example, Gutmann et al.
[11] use point matches to lines extracted in the reference
scan, and return a Gaussian estimate of error. More recently,
the correlation method of Konolige and Chou [17], extended
by Olson [22], provides an efficient method for finding the
globally best match within a given range, while returning
an accurate covariance. The method allows either a single
scan or set of aligned scans to be matched against another
single scan or set of aligned scans. This method is used in
SRI's mapping system Karto¹ for both local matching of
sequential scans, and loop-closure matching of sets of scans
as in [12]. To generate the real-world datasets for experiments,
we ran Karto on 63 stored robot logs of various sizes, using
its scan-matching and optimizer to build a map and generate
constraints, including loop closures. The graphs were saved
and used as input to all methods in the experiments.
¹ Information on Karto can be found at www.kartorobotics.com.
VI. EXPERIMENTS
In this section, we present experiments where we compare
SPA with state-of-the-art approaches on 63 real-world datasets
and on a large simulated dataset. We considered a broad variety
of approaches, including the best state-of-the-art methods.
• Information filter: DSIF [7]
• Stochastic gradient descent: TORO [10]
• Decomposed nonlinear system: Treemap [8]
• Sparse pose adjustment: SPA, with (a) sparse direct
Cholesky solver, and (b) iterative PCG [15]
We updated the PCG implementation to use the same “continuable LM” method as SPA; the only difference is in the
underlying linear solver. The preconditioner is an incomplete
Cholesky method, and the conjugate gradient is implemented
in sparse matrix format.
We also evaluated a dense Cholesky solver, but both the
computational and the memory requirements were several
orders of magnitude larger than the other approaches. As an
example, for a dataset with 1600 constraints and 800 nodes
one iteration using a dense Cholesky solver would take 2.1
seconds while the other approaches require an average of a
few milliseconds. All experiments are executed on an Intel
Core i7-920 running at 2.67 GHz.
In the following, we report the cumulative analysis of
the behavior of the approaches under different operating
conditions; results for all datasets are available online at
www.ros.org/research/2010/spa. We tested each
method both in batch mode and on-line. In batch mode, we
provided the algorithm with the full graph while in on-line
mode we carried out a certain number of iterations whenever
a new node was added to the graph. In the remainder of
this section we first discuss the off-line experiments, then we
present the on-line experiments. We conclude by analyzing all
methods on a large-scale simulated dataset.
Fig. 3. Effect of the χ² reduction. This figure shows two maps generated
from two graphs having different χ² errors. The error of the graph associated
with the top map is 10 times larger than that of the bottom one. Whereas the
overall structure appears consistent in both cases, the details in the map with
the lower χ² are significantly more consistent.

Fig. 4. Batch optimization on real-world datasets, using Odometry and
Spanning-Tree initialization. Left: the final χ² error per constraint for all
the approaches. Right: the time required to converge. Every point in the plots
represents one dataset. The datasets are sorted according to the number of
constraints in the graph, which effectively measures the complexity of the
optimization. The curves show the interpolated behavior of each method, for
better visualization. Note that PCG and SPA converge to the same error.
A. Accuracy Measure
For these indoor datasets, there is no ground truth. Instead,
the measure of goodness for a pose-constraint system is the
covariance-weighted squared error of the constraints, or χ²
error. If the scan matcher is accurate, lower χ² means that
scans align better. Figure 3 shows this effect on a real-world
dataset.
B. Real-World Experiments: Off-Line Optimization
To optimize a dataset off-line, we provide each optimizer
with a full description of the problem. We leave DSIF and
TreeMap out of the comparison, since they only operate
incrementally (DSIF is equivalent to one iteration of SPA in
batch mode). Since the success of the off-line optimization
strongly depends on the initial guess, we also investigated two
initialization strategies, described below.
• Odometry: the nodes in the graph are initialized by
chaining the incremental constraints. This is a standard approach taken
in almost all graph optimization algorithms.
• Spanning-Tree: A spanning tree is constructed on the
graph using a breadth-first visit. The root of the tree
is the first node in the graph. The positions of the
nodes are initialized according to a depth-first visit of the
spanning tree. The position of a child is set to the position
of the parent transformed according to the connecting
constraint. In our experiments, this approach gives the
best results (a code sketch is given below).
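The sketch below illustrates the Spanning-Tree strategy (our own illustration). A breadth-first pass from the first node builds the tree and sets each child to its parent's pose composed with the connecting constraint; since every parent is set before its children, the resulting poses are the same as with the depth-first visit described above.

#include <cmath>
#include <queue>
#include <vector>

struct Pose { double x, y, theta; };
struct Edge { int to; Pose z; };   // relative constraint to a neighbor

// Compose a parent pose with a relative measurement z (the inverse of (1)).
Pose compose(const Pose& p, const Pose& z) {
  const double c = std::cos(p.theta), s = std::sin(p.theta);
  return { p.x + c * z.x - s * z.y, p.y + s * z.x + c * z.y, p.theta + z.theta };
}

// Spanning-tree initialization: breadth-first tree from node 0,
// with each child initialized from its parent along the tree edge.
std::vector<Pose> spanningTreeInit(const std::vector<std::vector<Edge>>& adj) {
  std::vector<Pose> pose(adj.size(), {0, 0, 0});
  std::vector<bool> seen(adj.size(), false);
  std::queue<int> q;
  q.push(0); seen[0] = true;                 // root: first node of the graph
  while (!q.empty()) {
    const int u = q.front(); q.pop();
    for (const Edge& e : adj[u]) {
      if (seen[e.to]) continue;              // use tree edges only
      seen[e.to] = true;
      pose[e.to] = compose(pose[u], e.z);    // child = parent ∘ constraint
      q.push(e.to);
    }
  }
  return pose;
}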
For each dataset and each optimizer we computed the
initial guesses described above. Every optimizer was run for a
minimum number of iterations or until a termination criterion
was met. We measured the time required to converge and the
χ² error for each approach. The results are summarized in
Figure 4 for the Odometry and Spanning-Tree initializations.
For these datasets, there was no substantial difference in
performance between the two types of initialization.
In the error graph, PCG and SPA converged to almost
exactly the same solution, since the only difference is the
linear solver. They both dominate TORO, which has more than
10 times the error for the larger graphs. We attribute this to
the inability of TORO to handle non-spherical covariances,
and its very slow convergence properties. SPA requires almost
an order of magnitude less computational effort than PCG or
TORO for almost all graphs.
TORO was designed to be robust to bad initializations, and
to test this we also ran all methods with all nodes initialized
to (0,0,0). In this case, SPA and PCG converged to non-global
minima for all datasets, while TORO was able to reconstruct
the correct topology.
C. Real-World Experiments: On-Line Optimization
For the on-line comparison, we incrementally augment the
graph by adding one node and by connecting the newly added
node to the previously existing graph. We invoke the optimizer
after inserting each node, and in this way simulate its behavior
when executed in conjunction with a SLAM front-end. The
optimization is carried out for a maximum number of iterations,
or until the error does not decrease. The maximum number of
iterations for SPA/PCG is 1; for TreeMap, 3; and for TORO,
100. Since PCG iteratively solves the linear subproblem, we
limited it to 50 iterations per solve. These thresholds were selected
to obtain the best performances in terms of error reduction. In
Figure 5 we report the statistics on the execution time and on
the error per constraint every time a new constraint was added.
In terms of convergence, SPA/PCG dominate the other
methods. This is not surprising in the case of DSIF, which is
an information filter and therefore will be subject to linearization errors when closing large loops. TORO has the closest
performance to SPA, but suffers from very slow convergence
per iteration, characteristic of gradient methods; it also does
not handle non-circular covariances, which limits its ability to
achieve a minimal χ². Treemap is much harder to analyze,
since it has a complicated strategy for setting up a tree structure
for optimization. For these datasets, it appears to have a large
tree (large dataset loops) with small leaves. The tree structure
is optimized by fixing linearizations and deleting connections,
which leads to fast computation but poor convergence, with
χ² almost 3 orders of magnitude worse than SPA.
Fig. 5. On-line optimization on real-world datasets. Left: the average χ² error
per constraint after adding a node during incremental operation. Right: the
average time required to optimize the graph after inserting a node. Each data
point represents one dataset; the x-axis shows the total number of constraints
of that dataset. Note that the error for PCG and SPA is the same in the left
graph.
All the methods are approximately linear overall in the size
of the constraint graph, which implies that the number of large
loop closures grows slowly. Treemap has the best performance
over all datasets, followed by SPA and DSIF. Note that SPA
is extremely regular in its behavior: there is little deviation
from the linear progression on any dataset. Furthermore, the
average and max times are the same: see the graph in Figure 8.
Finally, TORO and PCG use more time per iteration, with PCG
about four times that of SPA. Given SPA’s fast convergence,
we could achieve even lower computational effort by applying
it only every n node additions. We emphasize that these graphs
were the largest indoor datasets we could find, and they are
not challenging for SPA.
D. Synthetic Dataset
To estimate the asymptotic behavior of the algorithms we
generated a large simulated dataset. The robot moves on a grid;
each cell of the grid has a side of 5 meters, and we create a
node every meter. The perception range of the robot is 1.5
meters. Both the motion of the robot and the measurements
are corrupted by zero-mean Gaussian noise with standard
deviation σu = diag(0.01 m, 0.01 m, 0.5 deg). Whenever the
robot is in the proximity of a position it has visited, we
generate a new constraint. The simulated area spans
500 × 500 meters, and the overall trajectory is 100 km long.
This results in frequent re-observations. The full graph is
shown in Figure 6. This is an extremely challenging dataset,
and much worse than any real-world dataset. In the following
we report the result of batch and on-line execution of all the
approaches we compared.
a) Off-Line Optimization: Each batch approach was executed with the three initializations described in the previous
section: odometry, spanning-tree, and zero. Results are shown
in Figure 7 as a function of time. The only approach which
is able to optimize the graph from a zero or odometry
initialization is TORO; SPA/PCG essentially does not converge
to a global minimum under odometry or zero starts. SPA/PCG
converges globally from the spanning-tree initialization after
10 seconds or so, with SPA being significantly faster at the
convergence point (see magnified view in Figure 7). TORO
has good initial convergence, but has a long tail because of
gradient descent.
Fig. 6. Large simulated dataset containing 100k nodes and 400k constraints
used in our experiments. Left: initial guess computed from the odometry.
Right: optimized graph. Our approach requires about 10 seconds to perform
a full optimization of the graph when using the spanning-tree as initial guess.

Fig. 7. Evolution of the χ² error during batch optimization of a large
simulated dataset consisting of 100,000 nodes and 400,000 constraints, under
odometry and spanning-tree initialization. The left plot shows the overall
execution, while the right plot shows a magnified view of the χ² error of
SPA and PCG close to the convergence point.
b) On-Line Optimization: We processed the dataset incrementally, as described in Section VI-C. In Figure 8 we
report the evolution of the χ² error and time per added node.
Both SPA and TreeMap converge to a minimum χ² (see
Figure 7 for the converged map). However, their computational
behavior is very different: TreeMap can use up to 100 seconds
per iteration, while SPA grows slowly with the size of the
graph. Because of re-visiting in the dataset, TreeMap has a
small tree with very large leaves, and performs LM optimization
at each leaf, leading to low error and high computation.
The other methods have computation equivalent to SPA,
but do not converge as well. Again DSIF performs poorly, and
does not converge. TORO converges but, as usual, has difficulty
cleaning up small errors. PCG spikes because it does not
fully solve the linear subproblem, eventually leading to higher
overall error.
VII. CONCLUSIONS
In this paper, we presented and experimentally validated a
nonlinear optimization system called Sparse Pose Adjustment
(SPA) for 2D pose graphs. SPA relies on efficient linear matrix
construction and sparse non-iterative Cholesky decomposition
to efficiently represent and solve large sparse pose graphs.
None of the real-world datasets we could find were challenging
– even in batch mode. The largest map takes sub-second time
to get fully optimized. On-line computation is in the range of
10 ms/node at worst; unlike EKF filters or other methods that
have poor computational performance, we do not have to split
the map into submaps [23] to get globally minimal error.
Fig. 8. On-line optimization of the large simulated dataset. In the top graph,
SPA and TreeMap have the same minimum χ² error over all constraints. DSIF
does not converge to a global minimum, while TORO converges slowly and
PCG spikes and has trouble with larger graphs. In the bottom graph, SPA is
clearly superior to TreeMap in computation time.
Compared to state-of-the-art methods, SPA is faster and
converges better. The only exception is in poorly-initialized
maps, where only the stochastic gradient technique of TORO
can converge; but by applying a spanning-tree initialization,
SPA can solve even the difficult synthetic example better than
TORO. When combined with a scan-matching front end, SPA
will enable on-line exploration and map construction. Because
it is a pose graph method, SPA allows incremental additions
and deletions to the map, facilitating lifelong mapping [16].
All the relevant code for running SPA and the other
methods we implemented is available online as open source,
along with the datasets and the simulation generator
(www.ros.org/research/2010/spa). An accompanying video shows SPA in both online and offline mode on a
large real-world dataset.
REFERENCES
[1] M. Agrawal and K. Konolige. FrameSLAM: From bundle adjustment
to real-time visual mapping. IEEE Transactions on Robotics, 24(5),
October 2008.
[2] F. F. Campos and J. S. Rollett. Analysis of preconditioners for conjugate
gradients through distribution of eigenvalues. Int. J. of Computer
Mathematics, 58(3):135–158, 1995.
[3] T. A. Davis. Direct Methods for Sparse Linear Systems (Fundamentals
of Algorithms 2). Society for Industrial and Applied Mathematics,
Philadelphia, PA, USA, 2006.
[4] F. Dellaert. Square Root SAM. In Proc. of Robotics: Science and
Systems (RSS), pages 177–184, Cambridge, MA, USA, 2005.
[5] J. Dongarra, A. Lumsdaine, R. Pozo, and K. Remington. A sparse matrix
library in C++ for high performance architectures. In Object Oriented
Numerics Conference, pages 214–218, 1994.
[6] T. Duckett, S. Marsland, and J. Shapiro. Fast, on-line learning of globally
consistent maps. Journal of Autonomous Robots, 12(3):287 – 300, 2002.
[7] R. M. Eustice, H. Singh, and J. J. Leonard. Exactly sparse delayed-state
filters for view-based SLAM. IEEE Trans. Robotics, 22(6), 2006.
[8] U. Frese. Treemap: An O(log n) algorithm for indoor simultaneous
localization and mapping. Journal of Autonomous Robots, 21(2):103–
122, 2006.
[9] U. Frese, P. Larsson, and T. Duckett. A multilevel relaxation algorithm
for simultaneous localisation and mapping. IEEE Transactions on
Robotics, 21(2):1–12, 2005.
[10] G. Grisetti, C. Stachniss, and W. Burgard. Non-linear constraint network
optimization for efficient map learning. IEEE Transactions on Intelligent
Transportation Systems, 10:428–439, 2009. ISSN: 1524-9050.
[11] J.-S. Gutmann, M. Fukuchi, and K. Sabe. Environment identification by
comparing maps of landmarks. In International Conference on Robotics
and Automation, 2003.
[12] J.-S. Gutmann and K. Konolige. Incremental mapping of large cyclic
environments. In Proc. of the IEEE Int. Symposium on Computational
Intelligence in Robotics and Automation (CIRA), pages 318–325, Monterey, CA, USA, 1999.
[13] A. Howard, M. Matarić, and G. Sukhatme. Relaxation on a mesh:
a formalism for generalized localization. In Proc. of the IEEE/RSJ
Int. Conf. on Intelligent Robots and Systems (IROS), pages 1055–1060,
2001.
[14] M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Fast incremental
smoothing and mapping with efficient data association. In International
Conference on Robotics and Automation, Rome, 2007.
[15] K. Konolige. Large-scale map-making. In Proceedings of the National
Conference on AI (AAAI), 2004.
[16] K. Konolige and J. Bowman. Towards lifelong visual maps. In
International Conference on Intelligent Robots and Systems, pages
1156–1163, 2009.
[17] K. Konolige and K. Chou. Markov localization using correlation. In
Proc. of the Int. Conf. on Artificial Intelligence (IJCAI), 1999.
[18] F. Lu and E. Milios. Globally consistent range scan alignment for
environment mapping. Journal of Autonomous Robots, 4:333–349, 1997.
[19] I. Mahon, S. Williams, O. Pizarro, and M. Johnson-Roberson. Efficient
view-based SLAM using visual loop closures. IEEE Transactions on
Robotics, 24(5):1002–1014, October 2008.
[20] M. Montemerlo and S. Thrun. Large-scale robotic 3-d mapping of urban
structures. In ISER, 2004.
[21] E. Olson, J. Leonard, and S. Teller. Fast iterative optimization of pose
graphs with poor initial estimates. In Proc. of the IEEE Int. Conf. on
Robotics & Automation (ICRA), pages 2262–2269, 2006.
[22] E. B. Olson. Real-time correlative scan matching. In International
Conference on Robotics and Automation, pages 4387–4393, 2009.
[23] L. Paz, J. Tardós, and J. Neira. Divide and conquer: EKF SLAM in
O(n). IEEE Transactions on Robotics, 24(5), October 2008.
[24] S. Thrun and M. Montemerlo. The graph SLAM algorithm with
applications to large-scale mapping of urban structures. Int. J. Rob.
Res., 25(5-6):403–429, 2006.
[25] B. Triggs, P. F. McLauchlan, R. I. Hartley, and A. W. Fitzgibbon. Bundle
adjustment - a modern synthesis. In Vision Algorithms: Theory and
Practice, LNCS, pages 298–375. Springer Verlag, 2000.