Jacopo Serafin
Currently, I’m a Ph.D. student in Engineering in Computer Science at the Department of Computer, Control, and Management Engineering “Antonio Ruberti” (DIAG) at Sapienza University of Rome.
Supervisor: Giorgio Grisetti
Address: Rome, Latium, Italy
Papers by Jacopo Serafin
Simultaneous Localization and Mapping (SLAM) refers to the ability of a robot to create a map of the surrounding environment, and to localize itself in that map, using its on-board sensors. SLAM is fundamental for mobile robots that must efficiently solve common tasks such as search and rescue, space exploration, and self-driving. Indeed, SLAM allows robots to perform more complex actions and reasoning, and for all these reasons it is considered an enabling technology toward the construction of truly autonomous mobile robots.
For more than three decades, the robotics research community has focused on the problem of localization and mapping. This has led to the development of many different SLAM approaches and solutions. However, while SLAM has reached stable results for 2D indoor applications, multiple issues arise when dealing with large-scale environments or when extending the problem to 3D data. Under such conditions, the complexity and the memory consumption increase significantly, in particular if the robot has to operate for long periods of time. Despite recent advancements in the state of the art, current 3D localization and mapping methods can robustly handle only scenarios of moderate size (e.g., buildings). Moreover, for human beings to be safe, novel solutions must be capable of detecting and recovering from failures, as well as of dealing with noise such as dynamic aspects of the environment and perceptual aliasing.
In recent years, the demand for robots capable of helping human beings in their daily environment, including agriculture and autonomous driving, has been growing constantly. As a result, researchers concentrate on the development of long-term SLAM systems that are able to cope with challenging indoor/outdoor environments such as urban scenarios or underground tunnels. This also means that these systems must be designed to face sensor limitations, such as data sparsity and perceptual aliasing, and common noise factors like seasonal changes and moving objects. In this thesis we present multiple novel solutions that improve the accuracy, efficiency, and robustness of the current state of the art in localization and mapping under such hard conditions.
Modern SLAM systems are built as the aggregation and connection of multiple independent modules, such as algorithms for feature/key-point detection (data enhancement), measurement registration methods, and optimization frameworks. A fundamental block is the one related to loop closures: by detecting areas the robot has already visited in the past, loop closing allows for a drastic reduction of the error affecting the global map estimate. Advancing any of these parts has the effect of improving the entire SLAM system. In this work we focus on developing innovative solutions for some of the most important modules composing modern localization and mapping approaches.
We present a novel point cloud registration method based on an augmented error function associated with the alignment problem. Through an extensive experimental evaluation, our approach demonstrates the ability to outperform other state-of-the-art algorithms. We detail a robust and effective loop closing search heuristic that, based on the trajectory followed by the robot, significantly reduces the overall computation time. Additionally, similarly to visual SLAM approaches, it allows for recovering from inconsistent global map estimates caused by misdetected closures. This algorithm is able to correctly map hard and harsh environments like catacombs. We also introduce a method that exploits 3D sensor data to build highly informative two-dimensional measurements, which are then used to achieve robust indoor/outdoor 2D SLAM. Our algorithm also makes efficient use of memory, making the approach particularly suitable for large-scale mapping. Finally, we present a very fast method for computing robust 3D geometric features from sparse sensor data. We demonstrate the effectiveness of this approach by performing 3D feature-based SLAM with an autonomous car at the MCity test facility.
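The general idea behind a trajectory-based loop closing search can be sketched as follows. This is an illustrative toy example, not the thesis algorithm: candidate poses are pruned to those that are spatially close to the current pose yet far back in time, which is what cuts the computation compared to matching against the whole map. The parameter names (`radius`, `min_gap`) are hypothetical.

```python
import numpy as np

def loop_candidates(trajectory, current_idx, radius, min_gap):
    """Return indices of past poses that are loop-closure candidates.

    A pose qualifies only if it lies within `radius` of the current
    pose (spatial proximity) and is at least `min_gap` steps in the
    past (so consecutive poses are not mistaken for loop closures).
    """
    cur = trajectory[current_idx]
    candidates = []
    # Only consider poses sufficiently far back along the trajectory.
    for i, pose in enumerate(trajectory[:max(0, current_idx - min_gap)]):
        if np.linalg.norm(pose - cur) < radius:
            candidates.append(i)
    return candidates

# A robot that loops back near its starting point: only pose 0 is
# both close in space and distant in time.
traj = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0], [5.0, 5.0], [0.5, 0.0]])
print(loop_candidates(traj, 4, radius=1.0, min_gap=2))  # → [0]
```

A real system would then verify each candidate by registering the corresponding measurements before accepting the closure.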
Semantic mapping is the incremental process of “mapping” relevant information of the world (i.e., spatial information, temporal events, agents, and actions) to a formal description supported by a reasoning engine. Current research focuses on learning the semantics of environments based on their spatial location, geometry, and appearance. Many methods to tackle this problem have been proposed, but the lack of a uniform representation, as well as of standard benchmarking suites, prevents their direct comparison. In this paper, we propose a standardization of the representation of semantic maps by defining an easily extensible formalism to be used on top of metric maps of the environments. Based on this, we describe the procedure to build a dataset (based on real sensor data) for benchmarking semantic mapping techniques, also hypothesizing some possible evaluation metrics. Moreover, by providing a tool for the construction of semantic map ground truth, we aim at involving the scientific community in acquiring data for populating the dataset.
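To give a flavor of what a semantic layer on top of a metric map can look like, here is a minimal sketch. The field names (`label`, `pose`, `properties`, `metric_map_id`) are illustrative assumptions, not the paper's actual formalism:

```python
from dataclasses import dataclass, field

@dataclass
class SemanticObject:
    """One semantically annotated entity anchored in the metric map."""
    label: str                  # semantic class, e.g. "chair"
    pose: tuple                 # (x, y, theta) in the metric map frame
    properties: dict = field(default_factory=dict)  # extensible attributes

@dataclass
class SemanticMap:
    """Semantic annotations layered on top of an existing metric map."""
    metric_map_id: str          # reference to the underlying metric map
    objects: list = field(default_factory=list)

    def add(self, obj: SemanticObject):
        self.objects.append(obj)

m = SemanticMap("lab_floor_2")
m.add(SemanticObject("chair", (1.0, 2.5, 0.0), {"color": "red"}))
```

The point of such a layered design is that the semantic annotations remain decoupled from, and extensible over, whatever metric representation a given system uses.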
We propose a novel on-line method to recursively align point clouds. By considering each point
together with the local features of the surface (normal and
curvature), our method takes advantage of the 3D structure
around the points for the determination of the data association
between two clouds. The algorithm relies on a least squares
formulation of the alignment problem, that minimizes an error
metric depending on these surface characteristics. We named
the approach Normal Iterative Closest Point (NICP in short).
Extensive experiments on publicly available benchmark data
show that NICP outperforms other state-of-the-art approaches.
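As a rough illustration of the kind of augmented error metric involved (a sketch, not the authors' implementation), the residual for one correspondence can stack the Euclidean point error with a normal-alignment error, so the least-squares solver favors transforms that match both position and surface orientation. The weight `w_normal` is a hypothetical knob:

```python
import numpy as np

def nicp_style_error(p_src, n_src, p_tgt, n_tgt, R, t, w_normal=1.0):
    """Illustrative NICP-style residual for one point correspondence.

    Stacks the 3D point residual with the 3D normal residual into a
    6-vector; a least-squares solver would minimize the sum of squared
    norms of these residuals over all correspondences.
    """
    e_point = R @ p_src + t - p_tgt   # positional misalignment
    e_normal = R @ n_src - n_tgt      # surface-orientation misalignment
    return np.concatenate([e_point, w_normal * e_normal])

# A perfectly matched pair under the identity transform yields zero error.
p = np.array([1.0, 2.0, 3.0])
n = np.array([0.0, 0.0, 1.0])
err = nicp_style_error(p, n, p, n, np.eye(3), np.zeros(3))
print(err)  # → [0. 0. 0. 0. 0. 0.]
```

Because the normal term penalizes orientation mismatch, two surfaces that happen to be close in space but differently oriented contribute a large error, which is precisely the 3D structure the method exploits.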
The proposed registration algorithm is able to match 3D point clouds extracted from depth images and can be seen as a variant of the generalized ICP algorithm based on a plane-to-plane metric. In essence, we exploit the structure of the point clouds; in particular, we take advantage of the information given by the surface normal of each point.
A closed-form solution is derived for the least-squares estimation problem used to register point clouds with normals.
We developed a very fast implementation of the algorithm that quickly computes the normals by means of data structures called integral images. At the cost of a small loss of accuracy, integral images allow each normal to be computed in constant O(1) time. Furthermore, we speed up the data association phase by searching for correspondences in the images onto which the point clouds are reprojected. The image structure (rows and columns) is exploited for data association, reducing the computation time. Despite this shallower search, the resulting set of correspondences is of good quality.
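The integral-image trick behind the constant-time normal computation can be sketched as follows. This is a generic summed-area-table example, not the actual implementation: once the table is built, the sum over any image window (the kind of windowed statistic needed to fit a local surface) costs only four lookups, regardless of window size.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: cumulative sums along rows, then columns."""
    return img.cumsum(axis=0).cumsum(axis=1)

def window_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) via four integral-image lookups."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.arange(16.0).reshape(4, 4)
ii = integral_image(img)
print(window_sum(ii, 1, 1, 3, 3))  # → 30.0, i.e. img[1:3, 1:3].sum()
```

Normal estimation uses such windowed sums (of point coordinates and their products) to obtain the local covariance around each pixel without iterating over the window, which is why the per-normal cost becomes constant.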
We have also implemented a merging method, based on the uncertainty model of the Microsoft Kinect sensor, which reduces the number of points and the noise accumulated when registering multiple point clouds.
Results are shown for experiments on several datasets representing different kinds of environments, such as domestic and office-like settings. All datasets were acquired using a Microsoft Kinect or an Asus Xtion sensor.