Mobile Mapping System Based On Action Cameras

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-1, 2018

ISPRS TC I Mid-term Symposium “Innovative Sensing – From Sensors to Methods and Applications”, 10–12 October 2018, Karlsruhe, Germany

MOBILE MAPPING SYSTEM BASED ON ACTION CAMERAS

J. A. Gonçalves 1*, A. Pinhal 1

1 Science Faculty, University of Porto, Rua Campo Alegre, 4169-004 Porto, Portugal - (jagoncal, apinhal)@fc.up.pt

Commission I, WG I/7

KEY WORDS: mobile mapping, action camera, GNSS, camera calibration, direct georeferencing

ABSTRACT:

Action cameras can operate in outdoor conditions, such as outside a car, and provide good quality imagery that can be exploited to
collect geospatial data by photogrammetric means. Recent models include GPS, which can deliver position and time of individual
images and video frames. That is the case of the very popular GoPro Hero 5 camera. This paper describes the implementation of a
mobile mapping system, based on a GoPro Hero 5 camera mounted on the side rearview mirror of a car. Although the system can be
dependent on the camera GPS positions only, it was developed to include a GNSS dual frequency receiver, carried inside the car, on
the dashboard. Under good observation conditions, without tall buildings, differential positioning (either RTK or PPK) provides the
trajectory with accuracy of a few centimetres. The precise time of individual frames is obtained from the camera GPS and positions
are interpolated from the GNSS receiver. Assuming the car moves in a horizontal plane and the camera has no significant tilts, the
system is treated in planimetric terms, with camera axis azimuth derived from the vehicle trajectory. Positions of observed objects,
such as traffic signs, are derived from consecutive frames. Tests carried out in a sparse urban environment have shown planimetric
accuracy better than 40 cm, appropriate for large scale mapping, such as 1:2000. The system can be improved in several forms,
through processing techniques, such as structure from motion, but without the incorporation of additional hardware.

1. INTRODUCTION

Topographic data need to be constantly updated in urban areas in order to keep accurate GIS databases for infrastructure management. Aerial photography is still the main source for most of the base map data collection. However, many objects around streets and roads, such as traffic signs and urban equipment, cannot be acquired from aerial images. Field data collection remains fundamental for this data acquisition. There are many mobile mapping system (MMS) solutions, based on different sensors, such as photographic cameras, video or laser scanners, that can be used in these tasks. Most of the commercial systems are relatively complex, since they integrate direct georeferencing equipment (Global Navigation Satellite Systems, GNSS, and Inertial Navigation Systems, INS) with other sensors in a synchronised data collection and processing system. Costs are relatively high and economic viability may require large volumes of work. Some papers have been published on “low-cost” MMS (Ellum and El-Sheimy, 2002, Artese, 2007, Madeira et al., 2010), but all involve relatively sophisticated GNSS, INS and camera hardware.

Typical MMS are normally classified as survey grade, reaching a positional accuracy of a few centimetres. In many cases that is more than what is needed for GIS databases. In Portugal large scale data are acquired in urban areas with planimetric accuracy of 20 to 40 cm, corresponding to traditional map scales of 1:1000 or 1:2000. Standard feature catalogue tables (DGT, 2018), established by the national mapping and cadastre authority, include many objects, such as electricity poles, street lamps, different types of walls, and many other object classes, that have to be acquired by field data collection. Companies frequently end up using services such as Google Earth Street View, mainly for verification, but with the risk of images being out of date by a few years. Simple MMS, based on mobile devices, would be of great interest for this data collection, especially if they can reduce costs, be simple to use and keep the required mapping accuracy grade. Recent works by Al-Hamad et al. (2014) and Masiero et al. (2016) use smartphones as the basis of MMS.

Action cameras are capable of working outdoors, in adverse conditions. They are frequently associated with strong geometric deformations and normally not considered for photogrammetric operations. However, evolution in this field resulted in fast processors that can apply geometric correction models, high frame rates that can reduce rolling shutter effects, GPS positioning, and other improvements that make them interesting for terrestrial photogrammetry operations. Not many studies have been published on mobile mapping based on action cameras.

This paper describes a MMS based on a GoPro Hero 5 camera, which incorporates the features described above. It is operated in a vehicle, which also carries a dual frequency GNSS receiver. The system is extremely simple to operate in the field, involving only sliding the action camera onto its adhesive mount, which is stuck to the side rear view mirror of a car, turning it on and starting to drive around. The GNSS receiver will normally work in RTK (Real Time Kinematic) mode, with corrections broadcast by a permanent station network.

The article describes the composition and data processing of the system and the results of some positional accuracy tests carried out. Improvements that can be made, such as the incorporation of structure from motion (SfM) processing for image orientation improvement and point cloud generation, are also described.

* Corresponding author

This contribution has been peer-reviewed.


https://doi.org/10.5194/isprs-archives-XLII-1-167-2018 | © Authors 2018. CC BY 4.0 License. 167

2. THE ACTION CAMERA

The camera used in this system is a GoPro Hero 5 Black. It is a small, light and very robust camera, which can be operated in rough conditions. It was used to collect videos from which frames are extracted for photogrammetric processing. It also incorporates GPS, which allows for tagging frames with position and GPS time. Figure 1 shows the camera in its case, fixed on the rear view mirror of a car. Once mounted, its inclination can be adjusted.

Figure 1. GoPro camera in its mounting box, fixed on the side rear view mirror of a vehicle.

2.1 Camera characteristics

The GoPro Hero 5 Black acquires still images at a maximum rate of 3 images every 2 seconds, in a resolution of 12 megapixels. Although this is a relatively high image rate, when moving on a road at a speed of 15 m/s the spacing between consecutive images (about 10 m) would be too large for photogrammetric use. So it was decided to use the video mode.

The camera acquires video in a large variety of resolutions, frame rates and processing modes. The highest resolution, identified as 4K, has a frame size of approximately 8 megapixels, at 30 frames per second (fps). Other resolutions include higher frame rates, up to 240 fps, and processing modes. An important processing mode is called “linear”, which consists in removing the radial distortion typical of the wide modes. Figure 2 shows video frames of the standard mode (“wide”) and corrected (“linear”). Table 1 lists the characteristics of some of the video modes: frame size, frame rates and availability of linear mode.

Figure 2. Video frame in wide mode (left) and linear mode (right)

Video mode   Frame size (pixels)   Frame rates (fps)     Linear mode
4K           3840 × 2160           30                    No
2.7K         2704 × 1520           60, 48, 24            Yes
1440p        1920 × 1440           80, 60, 48, 30, 24    No
1080p        1920 × 1080           120, 90, 80           No
1080p        1920 × 1080           60, 48, 30, 24        Yes
960p         1280 × 960            120, 60               No
720p         1280 × 720            240, 120, 60, 30      No

Table 1. GoPro Hero 5 Black video modes

The main considerations in choosing a video mode are higher frame rates, because of the reduction in the rolling shutter effect, and availability of the linear correction mode. The latter is only available for two of the resolutions, so the choice went for the highest of them, 2.7K at 60 fps, which corresponds to 4 megapixels.

Another important consideration on the camera operation is to turn off the stabilization mode. Since in this mode the frame is a subsection of a wider image, the principal point may change its position, which would strongly limit the photogrammetric use.

2.2 Frame extraction

Frames were extracted in JPEG format from the MP4 files using the program FFmpeg. The time of frame n, in seconds, can be calculated (equation 1) with respect to the time of frame 1, taking into account the frame rate. The actual average number of frames acquired per second for mode 60 fps is 59.94 (GoPro, 2018a).

$$t_n = t_1 + (n - 1)\,\frac{1}{59.94} \qquad (1)$$

Data blocks with GPS positions and data from other sensors can also be extracted by this program. At the time this work was done, information about the organisation of these data was scarce. Recently, GoPro released on GitHub the structure of these data blocks (GoPro, 2018b).

2.3 Calibration

Still images acquired by the camera include, in the EXIF header, approximate values of focal distance and pixel size. However, since images of “linear mode” are being used, these values will be changed. Also, for the video mode the actual area of the sensor in use may be different, so not much was known about the actual focal distance to be considered. Since the frames cover, in the horizontal direction (2704 pixels), an angle of approximately 85º, the approximate focal distance was calculated as (eq. 2):

$$f = \frac{2704}{2}\,\cot\!\left(\frac{85^\circ}{2}\right) \approx 1475 \text{ pixels} \qquad (2)$$

Action cameras can be calibrated in a similar manner as other cameras (Balletti et al., 2014). In our case it was done with images of targets marked on a wall, which were rigorously measured. With several images acquired from different angles, and doing an auto calibration with Agisoft PhotoScan, this approximate focal distance was improved to 1486 pixels. Additionally it could be verified that for images of the linear mode the principal point can be considered at the image centre and only a residual radial and tangential distortion remains, reaching less than 10 pixels in the image corners. Within the simplifications introduced in the system being implemented, an exact central projection was considered, with a focal distance of 1486 pixels.

2.4 GPS positioning

The camera incorporates a GPS unit, which provides position with standard navigation accuracy, i.e., errors of a few meters. In the case of still images these positions are provided in the EXIF header of JPEG images. In the case of video there are no
standard formats for the storage of positions of the camera trajectory. In the case of MP4 videos acquired by the GoPro, these data are stored in data blocks within the video file. Recently the GoPro company (GoPro, 2018b) has released information about the GoPro Metadata Format (GPMF). At the start of this work there was not much documentation about the organization of these data. A commercial program for use of action cameras in motorised sports, called Race Render, managed to extract the GPS time and position from the video.

The GPS data is acquired at a frequency of 18 Hz, recording time, latitude, longitude, height above the ellipsoid and speed. During the practical tests with the dual frequency receiver it was possible to assess the accuracy of these positions. Table 2 shows the statistics of the errors found. Planimetric accuracy (RMSE, root mean square error) was better than 2 meters, typical of navigation GPS receivers.

Statistic        Long. (m)   Lat. (m)   Height (m)
Minimum          -3.54       -4.52      -5.16
Maximum           3.45        3.28      19.31
Average          -0.30       -0.92       4.72
Std. Deviation    1.05        1.05       3.36
RMSE              1.09        1.40       5.80

Table 2. Statistics of the GoPro GPS positional errors

Relying on the GoPro camera alone, final accuracy would not be better than this. In order to reach a large scale mapping grade the precision GNSS unit was incorporated in the system. In this case the only information to be used from the GoPro GPS data file will be the initial time. Assuming it corresponds to the first video frame, it will be possible to calculate the time of all frames using equation (1).

3. COMPOSITION OF THE MMS

3.1 Precise GNSS positioning

A Trimble R6 GNSS unit was mounted on the vehicle’s dashboard. It was fixed with velcro tape, so that the antenna centre is approximately in the vehicle axis. The receiver works in real time kinematic mode (RTK), receiving corrections from ReNEP, a national network of GNSS permanent stations (DGT, 2018), through mobile communications. Later the data was post-processed (PPK) in order to have better accuracy and possibly positions for instants in which RTK was not successful. RTK has the advantage of giving audio feedback of ambiguity fixing. In case of loss, the driver may adjust vehicle speed in order to recover the fix and obtain a more complete dataset.

In dense urban environments the percentage of ambiguity fixes achieved can be low. For the remaining cases, where position fixes are not possible even in post-processing, a solution is proposed below, in the suggestion of improvements to this simple MMS. Proposed improvements are based on processing strategies and not on the inclusion of any additional instrumentation.

Since the MMS is intended to be only planimetric, and has no additional hardware to provide camera orientation, the azimuth of the trajectory will have to be calculated. For a set of points obtained by the GNSS receiver, we can calculate for point i the azimuth $\alpha_i$ of the trajectory using the previous and the next point, as (eq. 3):

$$\alpha_i = \tan^{-1}\!\left(\frac{E_{i+1} - E_{i-1}}{N_{i+1} - N_{i-1}}\right) \qquad (3)$$

where (E, N) are planar coordinates.

3.2 System calibration

The camera and the receiver are mounted in the vehicle. With a total station close to the vehicle, several points are measured, according to figure 3: two points in the vehicle axis (A and B), the GNSS antenna, the camera and a set of points in front of the camera. From these measurements it will be possible to calculate relative positions, in a vehicle reference frame, of the GNSS antenna and the camera (distance b and angle $\beta$). These are used, together with the vehicle azimuth, to transport precise coordinates from the GNSS antenna to the camera, for all instants in which a position fix was achieved.

Figure 3. Planar scheme of the vehicle with the GNSS antenna and the camera (CAM), camera axis and surveyed points

The points surveyed in front of the camera, also seen on a video collected by the camera, are used to estimate a point in the centre line of the camera, and so calculate angle $\gamma$, between the vehicle axis (taken as the trajectory direction) and the camera axis. In the present case the value obtained for $\gamma$ was 26.1º. It will be used to calculate the azimuth of the camera axis for any image.

3.3 Interpolation of positions for video frames

Points were collected by the Trimble receiver at 1 Hz, and were then transported to the camera. Figure 4 shows an example, with the GNSS antenna positions as black squares and the camera positions, also at 1 Hz, as green circles. At a speed of 10 to 15 m/s the point density is relatively small, and a linear interpolation is not appropriate to model the trajectory, especially in curves. For this reason cubic interpolation was used.

Figure 4. GNSS, camera positions transported from GNSS, and interpolated camera positions at 1 m intervals
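The chain described in sections 3.1 to 3.3 (trajectory azimuth from equation 3, transport of coordinates from the antenna to the camera, and cubic interpolation of 1 Hz positions to the frame times of equation 1) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the arctangent is replaced by `atan2` to resolve the quadrant, the lever-arm convention (offset angle measured clockwise from the vehicle axis) is an assumption, and a Catmull-Rom spline on uniformly spaced fixes is assumed as the cubic interpolant because the paper does not state which cubic method was used.

```python
import math

FPS = 59.94  # actual frame rate of the 60 fps mode (section 2.2)


def frame_time(t1, n):
    """Equation (1): time of frame n (1-based), given the time t1 of frame 1."""
    return t1 + (n - 1) / FPS


def trajectory_azimuth(points, i):
    """Equation (3): azimuth at point i from its two neighbours, in radians.

    atan2 resolves the quadrant and avoids dividing by zero when the
    vehicle heads due east or west.
    """
    e_prev, n_prev = points[i - 1]
    e_next, n_next = points[i + 1]
    return math.atan2(e_next - e_prev, n_next - n_prev) % (2 * math.pi)


def antenna_to_camera(antenna, azimuth, b, offset_angle):
    """Transport (E, N) from the antenna to the camera.

    Assumed convention: the camera lies at distance b from the antenna, in a
    direction rotated clockwise by offset_angle from the vehicle axis.
    """
    direction = azimuth + offset_angle
    return (antenna[0] + b * math.sin(direction),
            antenna[1] + b * math.cos(direction))


def catmull_rom(p0, p1, p2, p3, u):
    """Cubic (Catmull-Rom) interpolation between p1 and p2 for u in [0, 1]."""
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * u
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * u ** 2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * u ** 3
    )


def interpolate_position(times, points, t):
    """Position (E, N) at time t from fixes at uniformly spaced times."""
    # A segment needs one neighbouring fix on each side, so the first and
    # last intervals of the series cannot be interpolated.
    for k in range(1, len(times) - 2):
        if times[k] <= t <= times[k + 1]:
            u = (t - times[k]) / (times[k + 1] - times[k])
            return tuple(
                catmull_rom(points[k - 1][c], points[k][c],
                            points[k + 1][c], points[k + 2][c], u)
                for c in (0, 1)
            )
    raise ValueError("t is outside the interpolable range")
```

Under these assumptions, the camera axis azimuth of a frame would be `trajectory_azimuth(...) + math.radians(26.1)`, using the offset angle reported in section 3.2. Gaps in the 1 Hz series (epochs without a fix) break the uniform-spacing assumption and would need separate handling.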

All the 60 frames per second were interpolated (smooth trajectory in figure 4). Since this number is excessive, they were filtered so that the distance between consecutive positions is 1 meter (small circles in figure 4). This density provides enough overlap with an acceptable amount of final data. It also solves the problem of irregularity due to speed changes, or of the vehicle being stopped at traffic lights. For all these 1 m spaced images the azimuth of the camera axis was calculated as the azimuth of the trajectory plus angle $\gamma$.

3.4 Triangulation

As referred before, the system is intended to be applied for planar coordinate determination, assuming several conditions, such as the camera being always horizontal, with no tilts. In this condition, vertical objects will appear vertical on the images. For a given object, observed in image k, its x image coordinate, with respect to the image central line (figure 5a), can be transformed into an azimuth, according to equation (4):

$$\alpha_{object} = \alpha_k + \tan^{-1}\!\left(\frac{x_{object}}{f}\right) \qquad (4)$$

where $\alpha_k$ is the camera axis azimuth of image k. The point can be observed in some other images, leading to a multiple intersection. However, for the sake of simplicity for manual measurement in the tests carried out, only two were considered, with an interval of a few images. Figure 5b shows the same sign in a later frame (9 frame interval). It can be seen that here the sign is not exactly vertical, but the difference from top to bottom is small. The assumption of this MMS is that situations like this introduce negligible errors.

Figure 5. Vertical object observed in two frames, (a) and (b). Only x image coordinates are measured

Two lines (azimuths $\alpha_1$ and $\alpha_2$) are defined and the intersection point, P, is calculated (figure 6).

Figure 6. Planar intersection of lines defined for an object observed on two different frames.

4. POSITIONAL ACCURACY ASSESSMENT

A test was carried out in the city of Valongo, a suburban area of the city of Porto, in Portugal. A total of 4.8 km were travelled in an area of 1.5 km². Vector map data with accuracy standards of map scale 1:2000 was available, together with orthoimages. Figure 7 shows part of the area, with those GIS data and positions collected with the MMS.

Figure 7. Sample of the area showing orthoimages, vector data and locations acquired with the MMS (yellow dots).

The streets are relatively wide and buildings not very tall. GNSS data, collected at 1 Hz, reached ambiguity fixes in 77% of the time, in RTK mode in the field. However, post-processing (PPK mode) increased the proportion of position fixes to 85% of the time. Frames were extracted and filtered so that the distance between consecutive projection centres is 1 m.

Among the available digital map data there was a layer with points corresponding to electric facilities, such as electricity poles. A total of 20 of these points, mainly along the sidewalks, were chosen as check points to assess the positional accuracy of the MMS. Each point was identified in 2 frames and its coordinates were calculated by the process described. Coordinates were compared with the ones present in the cartography, and the statistics of errors were calculated (Table 3). Notice that the check points, coming from the digital map data, also incorporate some error and so the errors may be overestimated.

Statistic   Easting (m)   Northing (m)
Minimum     -0.36         -0.71
Maximum      0.34          0.45
Average     -0.01          0.04
RMSE         0.24          0.37

Table 3. Statistics of the errors found in object coordinates obtained by intersection

The small values of the average might suggest that systematic errors are not present. That is in principle not the case, because some of the points were along a road with constant direction and tended to have errors with similar trends. The fact that there are points along streets with different directions resulted in some cancellation effect.

In any case, these results reveal that this MMS can provide data with an appropriate positional accuracy for map scale 1:2000, or better. In this way, this simple methodology may be applied to complete large scale map data produced by aerial photogrammetry but that require many object classes that cannot be collected from aerial data.
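The computation behind each check point (an azimuth per frame from equation 4, a planar intersection of the two rays, and error statistics against map coordinates) can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the authors' software. Function names are hypothetical, x is assumed positive to the right of the image central line so that it adds to the camera azimuth, and the default focal distance is the calibrated value of 1486 pixels from section 2.3.

```python
import math


def object_azimuth(camera_azimuth, x_object, focal=1486.0):
    """Equation (4): object azimuth (radians) from its x image coordinate.

    x_object is in pixels from the image central line (assumed positive to
    the right); focal is the focal distance in pixels.
    """
    return camera_azimuth + math.atan(x_object / focal)


def intersect(p1, az1, p2, az2):
    """Intersect two planar rays, each given by an (E, N) origin and an azimuth."""
    d_e, d_n = p2[0] - p1[0], p2[1] - p1[1]
    det = math.sin(az2 - az1)  # zero when the two rays are parallel
    if abs(det) < 1e-9:
        raise ValueError("rays are parallel; no unique intersection")
    # Distance along ray 1 from p1 to the intersection point
    s = (math.sin(az2) * d_n - math.cos(az2) * d_e) / det
    return (p1[0] + s * math.sin(az1), p1[1] + s * math.cos(az1))


def rmse(errors):
    """Root mean square of a list of coordinate errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

The intersection becomes ill-conditioned as the two azimuths approach each other, which is one reason to use frames separated by several positions rather than consecutive ones. For scale, an azimuth error of 1º displaces a point 10 m away by 10·tan(1º), about 0.17 m.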

Potential reasons for systematic errors could be the inaccuracy of some of the parameters involved. Further investigations are needed in order to improve the quality of the MMS.

5. IMPROVEMENTS TO BE IMPLEMENTED IN THE SYSTEM

A better estimation of the parameters involved, such as the focal distance or angle $\gamma$, might have some influence on the quality of the results. An error of 1º in the azimuth angle, which depends on the trajectory azimuth and angle $\gamma$, introduces a planar error of 17 cm at 10 meter distance. Calibrations should be done with an accuracy better than 1º. A more careful camera calibration, and possibly the consideration of residual radial distortion, might also contribute.

The time synchronisation, based on the first frame time, was acceptable, but also here some error may exist: moving at a speed of 15 m/s, an error of 1 frame, at 60 fps, implies a positional error of 25 cm. Methods of time calibration are being studied for further system improvement.

Another possible improvement would be the exploitation of data provided by other GoPro sensors, such as the gyroscope and the accelerometer. Attitude angles would help in the azimuth estimation and also allow for the inclusion of tilts. In any case, accuracy of 1º or better in the attitude angles would be needed.

The main improvement will come from the inclusion of structure from motion (SfM) in the frame processing. If successful it will allow for the determination of position and attitude angles of all the frames. This would allow for the completion of the points for which position fixing was not possible, filling the gaps in the trajectory.

Additionally, SfM would allow for the generation of point clouds. Even with relatively small success, the extraction of point clouds for some of the objects of interest would be a great help in automating the data extraction. This step is possible and was already tested. Figure 8 shows a sample of a point cloud generated in this way, where some important objects can clearly be seen.

Figure 8. Point cloud generated from a sequence of video frames.

Additionally, if the MMS becomes of interest for implementation in a production environment, it would benefit from the existence of a software tool for data treatment.

6. CONCLUSIONS

The developed MMS is based on the action camera GoPro Hero 5 Black. The system is intended to be very simple and in the simplest form would require only the camera itself, relying only on the positions provided by the camera GPS. Normally the system will be used with a GNSS receiver. When comparing it with other systems the main difference is that here there are no cable connections. The synchronization of image frames and positions is made through the common GPS time frame.

The system allows for position determination with positional accuracy of around 30 cm, which is appropriate for many high accuracy applications. This MMS is under development and many of the suggested developments will be implemented in the near future.

ACKNOWLEDGEMENTS

Direção Geral do Território for the use of ReNEP, the Portuguese network of permanent GNSS stations.

Valongo City Council, for the provision of digital map data and orthoimages.

REFERENCES

Al-Hamad, A., Moussa, A. and El-Sheimy, N., 2014. Video based mobile mapping system using smartphones. Int. Arch. Photogram. Remote Sensing Spatial Info. Sci., Volume XL-1, pp. 13-18.

Artese, G., 2007. ORTHOROAD: A low cost Mobile Mapping System for road mapping. In C. V. Tao & J. Li (Eds.), Advances in Mobile Mapping Technology - ISPRS Book Series (pp. 31-41). London, UK: Taylor & Francis Group.

Balletti, C., Guerra, F., Tsioukas, V. and Vernier, P., 2014. Calibration of Action Cameras for Photogrammetric Purposes. Sensors, 14, pp. 17471-17490.

DGT, 2018. “Direção Geral do Território” Documentation about Portuguese national mapping standards and facilities for GNSS positioning. http://www.dgterritorio.pt

Ellum, C. and El-Sheimy, N., 2002. Portable Mobile Mapping Systems. National Technical Meeting ION NTM. San Diego, California: The US Institute of Navigation.

GoPro, 2018a. Hero5 Black User Manual. Available online: https://gopro.com/content/dam/help/hero5-black/manuals/HERO5Black_UM_ENG_REVC_Web.pdf

GoPro, 2018b. GPMF - GoPro Metadata Format. Available online: https://github.com/gopro/gpmf-parser

Madeira, S., Gonçalves, J.A. and Bastos, L., 2010. Photogrammetric mapping and measuring application using MATLAB. Computers & Geosciences, 36(6), pp. 699-706.

Masiero, A., Fissore, F., Pirotti, F., Guarnieri, A. and Vettore, A., 2016. Toward the use of smartphones for mobile mapping. Geo-spatial Information Science, 19(3), pp. 210-221.
