Advanced Engineering Informatics 42 (2019) 100936



Advanced Engineering Informatics



Full length article

Road pothole extraction and safety evaluation by integration of point cloud and images derived from mobile mapping sensors
Hangbin Wu a, Lianbi Yao a, Zeran Xu a, Yayun Li b, Xinran Ao a, Qichao Chen a, Zhengning Li a, Bin Meng a
a College of Surveying and Geoinformatics, Tongji University, PR China
b Shanghai Municipal Institute of Surveying and Mapping, PR China


Keywords: Candidate pothole; Patch; Image segmentation; Point cloud; Depth distribution

The automatic detection and extraction of road pothole distress is an important issue regarding healthy road structures, monitoring, and maintenance. In this paper, a new algorithm that integrates the mobile point cloud and images is proposed for the detection of road potholes. The algorithm includes three steps: 2D candidate pothole extraction from the images using a deep learning method, 3D candidate pothole extraction via a point cloud, and pothole determination by depth analysis. Because the texture features of the pothole and asphalt or concrete patches greatly differ from those of a normal road, pothole or patch distress images are used to establish a training set and train and test the deep learning system. Subsequently, the 2D candidate pothole is extracted from the images and labeled via the trained DeepLabv3+, a state-of-the-art pixel-wise classification (semantic segmentation) network. The edge of the candidate pothole in the image is then used to establish the relationship between the mobile point cloud and images. The original road point cloud around the edge of the candidate pothole is categorized into two groups, that is, interior and exterior points, according to the relationship between the point cloud and images. The exterior points are used to fit the road plane and calculate the accurate 3D shape of the candidate potholes. Finally, the interior points of a candidate pothole are used to analyze the depth distribution to determine if the candidate pothole is a pothole or patch. To verify the proposed method, two cases, including real and simulation cases, are selected. The real case is an expressway in Shanghai with a length of 26.4 km. Based on the proposed method, 77 candidate potholes are extracted by the DeepLabv3+ system; 49 potholes and 28 patches are finally filtered. The affected lanes and pothole locations are analyzed. The simulation case is selected to verify the geometric accuracy of the detected potholes. The results show that the mean accuracy of the detected potholes is ∼1.5–2.8 cm.

1. Introduction

1.1. Background
A pothole is an important type of road distress. The automatic detection of road potholes is an important issue regarding the monitoring and maintenance of healthy road structures. To help road engineers in obtaining detailed information about potholes, pavement management systems (PMS) have been widely used by most road agencies [46]. Based on the development of data collection sensors and data processing technologies, these PMS have been updated in the last decade. Generally, potholes significantly differ from the background surface. Therefore, researchers focus on the extraction of potholes from the existing infrastructure. The pothole extraction methods can be categorized into three main types: visual survey methods, image-based methods, and laser scanning-based methods.

1.2. Previous studies

1.2.1. Visual survey methods
Visual surveys are the most traditional method used for pavement distress detection and inspection. They are applied manually using data collection sensors. Inspection systems, such as ASTM D5340, can be used to record pavement distress such as cracks and potholes. Based on the development of mobile phones, several Android-based smart devices have been developed to record the locations and images of pavement distress [21,44,49,55,61]. Although smart device-based inspection systems greatly improve the accuracy of distress management,

H. Wu, et al. Advanced Engineering Informatics 42 (2019) 100936

they are slow and time-consuming and create safety, consistency, and efficiency concerns [58]. Visual surveys notably affect the operation of roads, tunnels, and other infrastructures. Therefore, visual surveys are always carried out when the infrastructure is out of service.

1.2.2. Ground-penetrating radar-based method
More recently, the ground-penetrating radar (GPR; [5,19]) has been frequently used as a non-destructive technique (NDT) for safety and damage detection in civil infrastructure systems [1,27,33,57,62]. For example, Benedetto et al. [6] assessed the reliability of an optimal signal processing algorithm for pavement inspection using GPR, which was used to preliminarily detect and classify pavement damages by time delay calculations and threshold analysis of the errors. Krysinski et al. [38] described the results of an investigation of the capabilities of the GPR technique within the field of pavement crack diagnostics. A list of GPR-based cracks with decimeter accuracy can be created through the correlation of visually observed cracks with corresponding echograms. Jacek et al. [56] discussed the measurement of the thickness of a new asphalt layer using GPR and determined the key factors affecting the accuracy, which is very important for the successive stages of acceptance testing. Takahiro et al. [60] focused on the use of GPR for infrastructural health monitoring and proposed a sensitive damage detection algorithm based on deconvolution utilizing a super high-frequency (SHF) band system for the monitoring of a reinforced concrete (RC) bridge slab. However, measurements of potholes in the pavement using radar images have not been discussed.

1.2.3. Computer vision-based method
Because visual surveys are time-consuming and create safety and operation concerns, automatic data collection and data analysis systems have been developed. Images are the most common data used in these systems. For example, Ryu et al. [63] developed a pothole detection system that uses an optical device and a pothole detection algorithm. After the images are collected, a three-step method, including segmentation, pothole candidate selection, and pothole identification, is applied to determine the locations and sizes of the potholes. Because the edge of the pothole is similar to a crack or patch, several traditional image processing methods, such as the weighted neighborhood [47] and multilevel minimum cross entropy [66] methods, are used to improve the extraction results. Dilation and thinning transforms are applied to avoid fragmentation in the images of cracks or potholes [65]. Hu et al. [28] proposed a method in which the pavement surface is regarded as a textured surface and distress is defined as inhomogeneities occurring in that surface; six texture features and two translation-invariant shape descriptors are used to extract pavement distress. Gavilán et al. [24] combined multiple directional non-minimum suppression (MDNMS) and a linear support vector machine (SVM)-based classifier to detect cracks using a vehicle-equipped detection system. The images are generally integrated with other sensing technology data, such as GPR, to identify potholes using the images [40].
Wavelet and neural networks are common methods used for feature extraction from images; therefore, researchers have proposed several related methods to extract potholes or other pavement distress from images. For example, Sun and Qian [58] and Zhou and Huang [67] used wavelet transforms to identify surface potholes and cracks in images. Ouma [51] proposed a wavelet morphology-based method to detect cracks in asphalt pavement imageries. Banharnsakun [4] proposed a hybrid artificial bee colony–artificial neural network (ABC–ANN) method to detect surface distress including potholes. Nejad and Zakeri [48,50] integrated a wavelet-radon and neural network to classify the pavement distress in images.
Although many studies focused on the pavement distress extraction from images, the results are notably limited because the extraction results cannot be used for measurements. Based on most of these methods, one image is acquired to detect the shape of the distress [22,45,54]; however, 2D/3D information of the distress cannot be obtained via a single image.
Computer vision technologies based on low-cost, high-quality, and easy-to-use digital cameras have rapidly developed during the last decade [36]. To address the measurement concerns regarding images, computer vision technologies were used to detect and measure infrastructure distress. The main principle of computer vision-based methods is 3D construction based on stereo vision algorithms, which provide 3D point clouds to determine the pothole as well as the background road surface [26,64]. Classification methods based on the 3D point cloud generated by the stereo vision matching algorithm were proposed to identify the potholes [32]. Amin et al. [2] used stereo vision to reconstruct the 3D texture of a pavement surface. The results showed that image-based techniques can be successfully applied to recover the 3D heights of pavement surface textures. Staniek [42] described the measurement of pavement conditions with a stereo vision system attached to a vehicle, which can be used for the evaluation of road network conditions. Another 3D construction approach uses the Kinect sensor, which, in addition to the true color camera, contains an IR (Infrared Radiation) sensor that is used to measure the depth [8,18,52–53]. Joubert et al. [17] proposed a low-cost sensor based on the Kinect sensor and a high-speed USB (Universal Serial Bus) camera for the detection and analysis of potholes. Moazzam et al. [29] collected pavement depth images of concrete and asphalt roads using a Kinect sensor and calculated the approximate volume of a pothole. Jahanshahi et al. [31] used a Kinect sensor as the depth sensor to detect potholes and other pavement distress types. In addition, video-based approaches were proposed to recognize a pothole in a sequence of frames [59]. Jog et al. [23] presented a new approach based on 2D recognition and 3D reconstruction for the detection and measurement of potholes, including the width, number, and depth of potholes, using a monocular camera mounted on the rear of a car. The method of 3D sparse/dense reconstruction and mesh modeling built upon previous work proposed by Golparvar-Fard et al. [43]. Lokeshwor et al. [41] presented a critical distress detection, measurement, and classification (CDDMC) algorithm for the automated detection and assessment of potholes, cracks, and patches in videos. Karuppuswamy et al. [34] used a vision and motion system to detect simulated potholes larger than 2 ft in the center of a lane. Koch et al. [35–37] developed a series of computer vision-based methods for the pothole detection in asphalt images and videos.
The main challenges of pothole extraction via computer vision-based methods are the irregular textures and similarities of pavement surfaces. Because the textures and/or colors of images of the road surface are quite similar, the stereo matching algorithm may lead to mismatching. To overcome this problem, laser scanning technology is usually applied.

1.2.4. Laser scanning-based methods
Based on the rapid development of LiDAR (Light Detection And Ranging) sensors, mobile laser scanning (MLS) has recently become an important method for data collection and mapping technology due to its high accuracy, high efficiency, and low survey cost. The point cloud captured by the MLS system differs from that generated by computer vision-based methods because of the intensity value and higher resolution. Intensity is usually used to separate road objects because of the high correlation with road materials [30] such as highly reflective road markings [25] and illuminated structures [39]. Distress detection based on laser scanning data is applied to buildings, roads, and other materials [9]. Several algorithms have been developed to extract road surface features. For example, the clustering method [11] was proposed to detect cracks from point clouds captured by laser scanning data. Roughness descriptors [20] are used to segment and classify asphalt and stone pavements. The Otsu threshold algorithm [68] is used to extract 3D crack skeletons from mobile LiDAR point clouds. Chen and Li [12] used a high-pass convolution algorithm to detect asphalt pavement


cracks. Choi et al. [16] used multiresolution segmentation to detect cracks in the road surface using point cloud images.
Although different methods have been developed for the distress extraction from laser scanning data, the resolution of the point cloud notably affects the extraction results. Whether the pit can be detected depends on the resolution of the point cloud. Because the point cloud represents discretely sampled data, the edge information of the pothole cannot be accurately obtained from point cloud data. In addition, the amount of point cloud data of roads is quite large and the calculation efficiency of pothole detection using the point cloud is low.

1.3. Present work
Based on previous studies, several types of pothole extraction methods have been developed using Android platform apps, images, and laser scanning point clouds. Image-based methods can be used to extract the location of potholes; however, they cannot be used to measure the length, width, or depth of the pothole. Laser scanning-based methods can be used to extract the size and depth of the pothole; however, these methods are affected by the resolution of the point cloud and cannot be accurately and automatically used to measure the edge of the pothole. In addition, pit information has not been used to assess the safety level of the road and analyze the impact on road maintenance.
To address this issue, we present a new method to extract potholes, which is based on the combination of laser scanning point clouds and images derived from mobile mapping sensors. First, 2D candidate potholes are labeled in the images using DeepLabv3+, a neural network equipped with an encoder–decoder structure for image segmentation. Second, 3D candidate potholes are extracted based on the relationship between the images and laser point cloud. Finally, potholes are identified based on the depth analysis of the laser points assigned to the candidate potholes. Compared with other methods, the proposed method is simple and suitable for the automatic extraction of potholes using a mobile mapping system (MMS).

2. Pothole extraction based on the integration of mobile point clouds and images

2.1. Flowchart of the proposed method
In this paper, an automatic pothole extraction method (Fig. 1) is proposed, which is based on the integration of mobile point clouds and images captured by an MMS. Images are used for automatic pothole extraction using the DeepLabv3+ method, which has been proposed in previous studies (e.g., [13]). Based on this method, the 2D shape of

Fig. 1. Flowchart of the proposed method.


Fig. 2. DeepLabv3+ architecture for the pothole extraction from imagery.

candidate potholes (including real potholes and patches) in the image plane coordinate system can be obtained. This step will be introduced in Section 2.2.
After the camera and laser scanner are calibrated, the registration parameters of the two sensors are determined to establish the relationship between the point cloud and images. By using the registration parameters and pothole edge data, the relevant point cloud is extracted from the image coordinate system and divided into two groups, that is, interior and exterior points. The 3D pothole edge is then computed with a collinear equation and plane function, which is fitted using the exterior points. This step will be described in Section 2.3.
Subsequently, the interior points are used to analyze the depth distribution of the candidate potholes. The average depth of the interior points is then used to filter the potholes from the patches. This step will be described in Section 2.4.
Finally, using the detected geometry and position information of the potholes, the impact of the potholes on the road safety and maintenance is analyzed with the aid of high-precision road maps. This step will be introduced in Section 2.5.

2.2. 2D candidate pothole labeling in imagery

2.2.1. Automatic recognition of candidate potholes in imagery using DeepLabv3+
The objective of pothole detection is to determine if there is a pothole in a given pavement image. For this purpose, a deep convolutional neural network (DeepLabv3+; [13]) is used. Unlike traditional pothole detection methods with human-designed features, such as HOG (Histogram of Oriented Gradient) [3], the learned features in the DeepLabv3+-based detection method are fully data-driven and more robust to changing conditions. The proposed DeepLabv3+ architecture is illustrated in Fig. 2.
A DeepLabv3+ network uses an image and a corresponding labeled image as input. It is an encoder–decoder network, which uses a convolutional neural network (CNN; Xception-65) with atrous convolution layers to obtain a coarse score map. Subsequently, a conditional random field is used to produce the final output. As a result, the DeepLabv3+ assigns a semantic label (e.g., pothole) to every pixel in an image (see Fig. 2). The network is pretrained with Cityscapes [14] and fine-tuned with the self-collected and labeled data.

2.2.2. Pothole edge extraction from labeled pothole portions
To accurately compute the shape of a pothole, an edge detection algorithm is applied to the identified pothole portions. In this study, we use the Canny algorithm [10] to detect the closed edge of a pothole because the pothole or patch portions differ from the road surface. Because the rough location of the pothole has been recognized by the DeepLabv3+ system, the edge of a pothole is easy to detect using closed path discrimination. The closed edge extracted from the edge features in the images via the Canny algorithm is regarded as the edge between the road surface and candidate pothole. When the detected edge is not clear enough or not closed, the edge of each pothole is manually checked to ensure detection accuracy. Finally, a group of (x, y) coordinates, which is used to represent the edge of a pothole from an image, is extracted (see Fig. 3).

2.3. Candidate 3D pothole extraction using the point cloud
Although the edge of the pothole is extracted from the image using the steps introduced in Section 2.2, the extracted edge features are mixed with the patch edges. Furthermore, the coordinates of the edge are based on the image plane coordinate system and cannot be used for the measurement. In this section, a 3D point cloud around the pothole will be used to accurately extract the 3D features of the pothole. The
Fig. 3. Edge of the candidate pothole in the image.


Fig. 4. Relationship between the road environment, point cloud, and image.

candidate 3D pothole is then extracted.

2.3.1. Selection of the point cloud around the edge feature
To obtain the 3D potholes, the relationship between the real road environment, point cloud, and image is established (Fig. 4). In this section, the laser points around the candidate pothole edge are extracted via three steps.
First, the pixels in the image that correspond to the point cloud are computed based on the collinear equation. Let p be a point of the point cloud with the world coordinates (Xp, Yp, Zp), and let (xp, yp) be the coordinates of the pixel corresponding to p in the image. The coordinates of the corresponding pixels can be computed using the collinear equation:

$$
\begin{aligned}
x_p - x_0 &= -f\,\frac{a_1 (X_p - X_s) + b_1 (Y_p - Y_s) + c_1 (Z_p - Z_s)}{a_3 (X_p - X_s) + b_3 (Y_p - Y_s) + c_3 (Z_p - Z_s)},\\
y_p - y_0 &= -f\,\frac{a_2 (X_p - X_s) + b_2 (Y_p - Y_s) + c_2 (Z_p - Z_s)}{a_3 (X_p - X_s) + b_3 (Y_p - Y_s) + c_3 (Z_p - Z_s)}
\end{aligned}
\tag{1}
$$

where (x0, y0, f) are the interior parameters of a camera, which can be determined after calibration, (Xs, Ys, Zs) are the exterior orientation parameters of the camera, and (a1, a2, a3, b1, b2, b3, c1, c2, c3) are the components of the transformation matrix, which is determined using the orientation status of the camera at a certain time. Based on Eq. (1), all points in the point cloud can be converted to an image plane coordinate system.
Second, after the point cloud is transferred from the world coordinate system to the image plane coordinate system, the relationship between the pothole edge and point cloud can be applied to separate the point cloud into two groups: interior points (brown dots in Fig. 5) and exterior points (black dots in Fig. 5).
After the points are separated, the edge points of the exterior group are selected to cover the edge of the pavement distress. In this paper, the concave hull algorithm is used to compute the interior edge of group P (see the polygon connected by P1, P2, … and P15 in Fig. 5). Here, we assume that the surrounding area around the pothole is a linear plane. The vertexes of the concave hull are then exported and used to fit the plane to describe the domain of the pothole. In most cases, the pavement distress is relatively small; the fitted plane can be used to describe the surroundings of the pavement distress. For this purpose, a group of parameters, that is, A, B, C, and D, is used to describe the plane of the pothole:

$$
AX + BY + CZ + D = 0 \tag{2}
$$

2.3.2. Accurate pothole edge calculation
Let (X, Y, Z) be the coordinates of an edge point of the pavement distress in the world coordinate system (star in Fig. 5), and let (x, y) be the coordinates of the edge point in the image plane coordinate system, which can be extracted as shown in Section 2.2.2. Based on Section 2.3.1, this point is in the plane described by P1, …, P15. It also satisfies the collinear equation. Therefore, we obtain:

$$
\begin{cases}
AX + BY + CZ + D = 0\\[4pt]
X - X_s = (Z - Z_s)\,\dfrac{a_1 (x - x_0) + a_2 (y - y_0) - a_3 f}{c_1 (x - x_0) + c_2 (y - y_0) - c_3 f}\\[8pt]
Y - Y_s = (Z - Z_s)\,\dfrac{b_1 (x - x_0) + b_2 (y - y_0) - b_3 f}{c_1 (x - x_0) + c_2 (y - y_0) - c_3 f}
\end{cases}
\tag{3}
$$

where A, B, C, and D are the plane parameters of the fitted polygon P, (x0, y0, f) are the interior parameters of the camera, (Xs, Ys, Zs) are the exterior orientation parameters of the camera, and (a1, a2, a3, b1, b2, b3, c1, c2, c3) are the components of the transformation matrix.
After the derivation, the accurate coordinates of the pothole edge can be calculated by:

$$
\begin{aligned}
X &= \frac{B (k_b X_s - k_a Y_s) + C (k_c X_s - k_a Z_s) - k_a D}{k_a A + k_b B + k_c C},\\
Y &= \frac{A (k_a Y_s - k_b X_s) + C (k_c Y_s - k_b Z_s) - k_b D}{k_a A + k_b B + k_c C},\\
Z &= \frac{A (k_a Z_s - k_c X_s) + B (k_b Z_s - k_c Y_s) - k_c D}{k_a A + k_b B + k_c C}
\end{aligned}
\tag{4}
$$

where

$$
\begin{aligned}
k_a &= a_1 (x - x_0) + a_2 (y - y_0) - a_3 f,\\
k_b &= b_1 (x - x_0) + b_2 (y - y_0) - b_3 f,\\
k_c &= c_1 (x - x_0) + c_2 (y - y_0) - c_3 f
\end{aligned}
\tag{5}
$$

By using Eqs. (4) and (5), the accurate edge of the candidate pothole


Fig. 5. Interior and exterior point classification and concave hull processing.
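The chain from Eq. (1) to Eq. (5) can be checked numerically. The following Python sketch is illustrative only and is not the implementation used in this study: the `Camera` container, the camera pose, the synthetic road plane, and the sample points are hypothetical. The ray-casting test and the least-squares fit of Z = pX + qY + r (which presumes a non-vertical road plane) are assumed choices, because the text does not name the point-in-polygon test or the fitting procedure, and all exterior points are used for the fit instead of concave hull vertexes.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]

@dataclass
class Camera:
    """Interior (x0, y0, f) and exterior (S, R) orientation; values are hypothetical."""
    x0: float
    y0: float
    f: float
    S: Vec3                     # projection centre (Xs, Ys, Zs)
    R: Tuple[Vec3, Vec3, Vec3]  # rows (a1, a2, a3), (b1, b2, b3), (c1, c2, c3)

def project(p: Vec3, cam: Camera) -> Vec2:
    """Eq. (1): world point -> image-plane coordinates."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = cam.R
    dX, dY, dZ = p[0] - cam.S[0], p[1] - cam.S[1], p[2] - cam.S[2]
    w = a3 * dX + b3 * dY + c3 * dZ
    x = cam.x0 - cam.f * (a1 * dX + b1 * dY + c1 * dZ) / w
    y = cam.y0 - cam.f * (a2 * dX + b2 * dY + c2 * dZ) / w
    return x, y

def inside(pt: Vec2, poly: Sequence[Vec2]) -> bool:
    """Ray-casting point-in-polygon test (assumed choice of test)."""
    x, y = pt
    hit = False
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            hit = not hit
    return hit

def det3(m) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(pts: Sequence[Vec3]) -> Tuple[float, float, float, float]:
    """Least-squares Z = pX + qY + r, returned as (A, B, C, D) of Eq. (2)."""
    sx, sy, sz = (sum(p[i] for p in pts) for i in range(3))
    sxx = sum(p[0] * p[0] for p in pts)
    sxy = sum(p[0] * p[1] for p in pts)
    syy = sum(p[1] * p[1] for p in pts)
    sxz = sum(p[0] * p[2] for p in pts)
    syz = sum(p[1] * p[2] for p in pts)
    M = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(pts))]]
    v = [sxz, syz, sz]
    d = det3(M)
    # Cramer's rule on the 3x3 normal equations
    p, q, r = (det3([[v[i] if j == k else M[i][j] for j in range(3)]
                     for i in range(3)]) / d for k in range(3))
    return p, q, -1.0, r

def back_project(px: Vec2, cam: Camera, plane) -> Vec3:
    """Eqs. (4)-(5): intersect the image ray through px with the fitted plane."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = cam.R
    A, B, C, D = plane
    Xs, Ys, Zs = cam.S
    dx, dy = px[0] - cam.x0, px[1] - cam.y0
    ka = a1 * dx + a2 * dy - a3 * cam.f
    kb = b1 * dx + b2 * dy - b3 * cam.f
    kc = c1 * dx + c2 * dy - c3 * cam.f
    den = ka * A + kb * B + kc * C
    X = (B * (kb * Xs - ka * Ys) + C * (kc * Xs - ka * Zs) - ka * D) / den
    Y = (A * (ka * Ys - kb * Xs) + C * (kc * Ys - kb * Zs) - kb * D) / den
    Z = (A * (ka * Zs - kc * Xs) + B * (kb * Zs - kc * Ys) - kc * D) / den
    return X, Y, Z

def road(X: float, Y: float) -> float:
    """Synthetic, slightly tilted road plane (hypothetical)."""
    return 0.01 * X - 0.02 * Y + 0.1

phi = math.radians(20.0)  # hypothetical camera tilt about the X axis
cam = Camera(0.0, 0.0, 1000.0, (0.0, 0.0, 2.5),
             ((1.0, 0.0, 0.0),
              (0.0, math.cos(phi), -math.sin(phi)),
              (0.0, math.sin(phi), math.cos(phi))))
corners = [(-0.3, 4.7), (0.3, 4.7), (0.3, 5.3), (-0.3, 5.3)]  # true 3D edge
edge_px = [project((X, Y, road(X, Y)), cam) for X, Y in corners]
cloud = [(X, Y, road(X, Y)) for X, Y in [(0.6, 5.0), (-0.6, 5.0), (0.0, 4.2), (0.0, 5.8)]]
cloud += [(X, Y, road(X, Y) - 0.05) for X, Y in [(0.0, 5.0), (0.1, 4.9), (-0.1, 5.1)]]

interior = [p for p in cloud if inside(project(p, cam), edge_px)]
exterior = [p for p in cloud if not inside(project(p, cam), edge_px)]
plane = fit_plane(exterior)
edge_3d = [back_project(px, cam, plane) for px in edge_px]
```

Because the synthetic exterior points lie exactly on the road plane, back-projecting the image edge through Eqs. (4) and (5) should return the original 3D corners up to rounding error, which gives a quick consistency check of the signs in Eqs. (1) and (3)–(5).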

can be calculated according to the location of the edge in the images and the road surface function described by the surrounding laser points.

2.4. Pothole determination based on the depth analysis of interior group points
After the points around the edge of the pothole and the patch are categorized as interior or exterior, we use the depth information (di) of each interior point to determine the pothole. Because the pavement has been filled in patches, the elevation distribution of a patch is uniform compared with the pothole. Fig. 6 shows the depth distribution of the interior points of a pothole.
Therefore, we use the following index to differentiate the potholes from the patches:

$$
\bar{d} = \frac{\sum_{i=0}^{n-1} d_i}{n} \tag{6}
$$

where n is the number of points in the interior group and d̄ is the average depth of the interior points assigned to the candidate pothole. Let d̄T be the threshold used to separate potholes from patches. A candidate pothole with d̄ greater than d̄T is regarded as a pothole. Otherwise, it will be treated as a patch.

Fig. 6. Point depth distribution of a pothole.

2.5. Safety evaluation using the detected potholes and high-resolution road map
The safety evaluation using the potholes detected in Section 2.4 is carried out in two steps. First, the potholes are classified into four types according to the geometric attributes of a pothole considering that the depth or area of a pothole may affect the safety while driving a vehicle. Second, the location and extension of a pothole are used to evaluate the affected lanes.
The average depth and area of the pothole are selected to classify the potholes into four types: shallow and small potholes, shallow and large potholes, deep and small potholes, and deep and large potholes. During the classification, the area of a pothole is categorized by the threshold SThreshold; the threshold for deep/shallow pothole classification is dThreshold.
The location and extension of the pothole are then used to evaluate the affected lanes. First, the center location of the pothole is used to compute the lane number that is affected by the pothole. The width of the pothole is then used to evaluate the affected lanes with respect to maintenance. The principle of maintenance evaluation is shown in Fig. 7. During the evaluation, a high-resolution road map [15] is used.

3. Experiments and analysis
To verify the proposed method, two case studies, including real and simulation experiments, are used. An MMS is used to collect the point cloud and relevant images of the experimental road. A real case, which will be introduced in Section 3.2, is then selected to fully verify the proposed method. Because the collection of actual geometric information on a road in operation is dangerous, we use a simulation experiment to verify the geometric accuracy of the pothole extraction. This simulation will be introduced in Section 3.3.

3.1. Mobile mapping system
An MMS (Fig. 8) that adopts laser scanners and a panoramic camera is used to collect the data. The main sensors installed in the MMS are a GPS (Global Positioning System) receiver, a laser scanner, a panoramic camera, and an IMU (Inertial Measurement Unit). The laser scanner is a SICK LMS 511 with a maximum scanning distance of ∼110.0 m, which is suitable for the data collection in road environments. The camera used for image acquisition is a Ladybug, a panoramic camera that contains six lenses.

3.2. Real experiment

3.2.1. Data collection
The expressway G15 in Shanghai, which extends from the Huangpu River to the grade separation with G1501, is selected for the data collection and extraction of real potholes. The total length of the case area is ∼26.4 km. The average velocity of the MMS during data collection is set to ∼40.0 km/h to maintain a higher point density. The density of the point cloud is ∼130–190 points/m² based on the distance from the laser scanner. The point cloud of the case area is shown in Fig. 9 and the general information about the data collection is listed in Table 1.

3.2.2. Candidate pothole detection using DeepLabv3+
To automatically recognize the candidate potholes in the corrected images, a training and testing dataset for DeepLabv3+ should be


Fig. 7. Principle of maintenance evaluation.
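The decision rules of Sections 2.4 and 2.5 reduce to a few comparisons. The Python sketch below is a minimal illustration under stated assumptions, not the implementation used in this study: the thresholds are the values reported later in Sections 3.2.3 and 3.2.4, while the function names, the treatment of values exactly equal to a threshold, and the representation of the road map as a list of lane-boundary offsets are hypothetical.

```python
from typing import List, Sequence

D_BAR_T = 0.02      # m, pothole/patch threshold reported in Section 3.2.3
S_THRESHOLD = 0.2   # m^2, small/large threshold reported in Section 3.2.4
D_THRESHOLD = 0.06  # m, shallow/deep threshold reported in Section 3.2.4


def mean_depth(depths: Sequence[float]) -> float:
    """Eq. (6): average depth of the interior points."""
    return sum(depths) / len(depths)


def is_pothole(depths: Sequence[float], d_bar_t: float = D_BAR_T) -> bool:
    """A candidate is a pothole when its mean depth exceeds the threshold."""
    return mean_depth(depths) > d_bar_t


def pothole_type(avg_depth: float, area: float) -> str:
    """Four-type classification; counting values equal to a threshold as
    deep/large is an assumption."""
    depth_class = "deep" if avg_depth >= D_THRESHOLD else "shallow"
    size_class = "large" if area >= S_THRESHOLD else "small"
    return f"{depth_class} and {size_class}"


def affected_lanes(center: float, width: float,
                   lane_edges: Sequence[float]) -> List[int]:
    """1-based lanes overlapped by [center - width/2, center + width/2].

    lane_edges holds ascending lateral offsets of the lane boundaries in
    metres; this encoding of the road map is hypothetical.
    """
    lo, hi = center - width / 2.0, center + width / 2.0
    return [i + 1 for i, (a, b) in enumerate(zip(lane_edges, lane_edges[1:]))
            if lo < b and hi > a]


edges = [0.0, 3.75, 7.5, 11.25]         # three hypothetical 3.75 m lanes
print(is_pothole([0.01, 0.03, 0.05]))   # True: mean depth is about 0.03 m
print(pothole_type(0.031, 0.298))       # shallow and large
print(affected_lanes(3.7, 0.6, edges))  # [1, 2]: pothole crosses a lane line
print(affected_lanes(5.0, 0.6, edges))  # [2]: pothole lies within one lane
```

A pothole whose extent overlaps two lanes returns two lane numbers, which corresponds to the case in which two adjacent lanes would have to be blocked during maintenance.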

Fig. 8. Panoramic camera and laser scanner of the MMS used for the experiment.
Fig. 9. Captured point cloud for the case area.

established. Pavement images with potholes or patches are collected from Google, Yahoo, and previously collected images and then manually labeled. To achieve an accurate candidate pothole detection performance, the training dataset is processed using the following steps:
First, proposed candidate potholes with sizes larger than 20 pixels are regarded as pothole classes; otherwise, they will be ignored. Typical positive and negative samples are shown in Fig. 10.
Subsequently, different brightness and transparency values are set for the collected positive and negative samples to simulate different lighting conditions in the road scene and ensure that the samples satisfy most of the lighting conditions in real environments.
Finally, if there is a candidate pothole in the image, the image will be processed in Augmentor using a data augmentation transformation function [7]. The original images are rotated, flipped, and shifted to produce more training samples. This procedure increases the number of


candidate pothole samples because it is only carried out in a small proportion of the collected images.
Among the samples generated using the steps introduced above, more than 6000 samples are used as the training set, 2000 samples are used as the validation set, and 2000 samples are used as the test samples. Several indices, including true positives (TP), true negatives (TN), false positives (FP), false negatives (FN), precision ($\text{precision} = \frac{TP}{TP + FP}$), and recall ($\text{recall} = \frac{TP}{TP + FN}$), are used to evaluate the established DeepLabv3+ system with an NVIDIA P100 GPU (Graphics Processing Unit). The number of training steps is set to 30,000. The total time for the training is ∼7 h. During the test step, the total time used for the 2000 test images is ∼508 s, that is, ∼0.25 s/image. The confusion matrix of the test set using DeepLabv3+ is shown in Table 2.
After applying DeepLabv3+-based detection to images captured by the panoramic camera, the candidate potholes are detected and labeled. As introduced in Section 2.2, the potholes and road patches are detected and labeled based on the similarity of the texture information. In the experiment, 77 candidate potholes are labeled. Further processing should be performed to filter the potholes from patches.

3.2.3. Results for the potholes and patches detected in the case area
To separate the potholes from the patches, the point cloud captured by the laser scanner is used to calculate the average depth and area of each candidate pothole. In this study, we use d̄T = 0.02 m as the threshold to separate the potholes from patches. In other words, any region with d̄ greater than 0.02 m is regarded as a pothole. After filtering, the case area includes 49 potholes and 28 patches. The geometric information about the potholes and patches is listed in Table 3. The locations of the potholes and patches and typical 3D models of potholes are shown in Fig. 11.

3.2.4. Road safety evaluation based on potholes
The extracted 49 potholes are used for further analysis to evaluate the effects on the road safety. Based on setting SThreshold to 0.2 m² and dThreshold to 0.06 m, the extracted potholes are classified into four types: shallow and small potholes, shallow and large potholes, deep and small potholes, and deep and large potholes. The numbers of potholes of each type are shown in Table 4. Based on the four pothole types, there are two deep and large potholes that may notably affect road safety. Most of the potholes are shallow potholes, as the average depth of 47 potholes is less than 0.06 m. There are notably more large potholes than small potholes.
Based on the use of the high-resolution and -accuracy roadmap for the case area, the evaluation location and affected lanes are shown in Table 5.
According to Table 5, the number of potholes in the first lane or emergency lane is 5, accounting for ∼10% of the potholes extracted in this case area. Forty-four potholes are in the second and third lanes. This shows that the second and third lanes are more seriously damaged than the first and emergency lanes. This is probably the case because the first lane can only be used by family cars, minibuses, and buses according to Chinese laws. Trucks and other heavily loaded vehicles are prohibited from using the first lane. The second and third lanes can be used by all vehicles, especially trucks and heavily loaded vehicles; therefore, the damage of the second and third lanes is more serious than that of the first and emergency lanes.
Based on the affected lanes listed in Table 5, 33 potholes affect only one lane, which means that both the center and boundary of the pothole are in a specific lane. The other 16 potholes affect two lanes, which means that the potholes cross two lanes. Therefore, two adjacent lanes should be blocked for these 16 potholes during road maintenance, which will notably affect the operation of the expressway.

3.3. Simulation experiment for geometric accuracy evaluation

3.3.1. Data collection
The pothole extraction was simulated on the campus of Tongji University to avoid dangerous situations with respect to the collection of the true geometric pothole information and impact on road traffic.

Table 1
General information about the case area and data.
Parameter                                    Value
Length of the case road                      26.4 km
Average velocity during data collection      40.0 km/h
Density of the point cloud                   130–190 points/m²
Time duration of data collection             41 min
Working frequency of the panoramic camera    1 Hz
Number of images collected                   14,760
GPS environment                              Good

Table 2
Confusion matrices for the test set using DeepLabv3+.
                     Potholes           Others              User accuracy
Potholes             78,608,234 (TP)    18,198,629 (FP)     0.812
Others               18,015,482 (FN)    409,465,655 (TN)    0.958
Producer accuracy    0.814              0.957               0.931

Table 3
Comparison of the geometric information about the potholes and patches.
Index                Pothole    Patch
Count                49         28
Average depth (m)    0.031      0.014
Maximum depth (m)    0.127      0.061
Average area (m²)    0.298      0.296
Minimum area (m²)    0.073      0.022
Maximum area (m²)    0.876      0.718

(a) Typical positive samples (b) Typical negative samples

Fig. 10. Samples used for the DeepLabv3+ training.


Table 6
General information about the data collection.
Parameter                                    Value
Length of the case road                      2.0 km
Average velocity during data collection      12.0 km/h
Density of the point cloud                   375 points/m²
Time duration of data collection             12 min
Working frequency of the panoramic camera    2 Hz
GPS environment                              Normal

The manhole covers distributed in different locations on the campus are used to simulate potholes or patches because the pitting phenomenon of manhole covers is quite similar to real potholes. The manhole cover data are also collected using the MMS (see Fig. 7); the main collection parameters are listed in Table 6.

3.3.2. Geometric accuracy evaluation

In this experiment, 68 manhole covers are selected. Because the manhole covers in the experimental area are either circular or square, the difference between the actual size of each manhole cover and the size obtained by the proposed method is measured and compared separately for each shape. The width and length of the manhole covers are measured with a measuring tape and treated as true values. However, because the depths of the manhole covers are difficult to measure, the true depths are not collected.
Using the edge extraction method applied to the image data, the straight-line and arc-shaped boundaries of the manhole covers are obtained. Subsequently, the accurate pothole extraction method introduced in Section 2.3 is used to calculate the size of each manhole cover, including the width and length. The differences between the calculated values and ground truth are shown in Table 7. According to Table 7, the mean geometric accuracy of the proposed method is ∼1.5–2.8 cm.
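The comparison summarized in Table 7 can be sketched as follows. This is a minimal illustration, assuming the accuracy is reported as the absolute difference between the tape-measured and calculated sizes; the function name `size_error_summary` and the sample values are hypothetical and are not data from this study.

```python
from statistics import mean


def size_error_summary(true_cm, calculated_cm):
    """Return (min, max, mean) of the absolute size differences in cm."""
    if len(true_cm) != len(calculated_cm) or not true_cm:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(t - c) for t, c in zip(true_cm, calculated_cm)]
    return min(errors), max(errors), mean(errors)


# Hypothetical diameters (cm) of three round covers; illustrative only.
tape = [70.0, 70.0, 80.0]       # measured with a measuring tape
computed = [71.0, 68.0, 80.5]   # obtained from the extracted boundary

lo, hi, avg = size_error_summary(tape, computed)
print(f"min={lo:.2f} cm, max={hi:.2f} cm, mean={avg:.2f} cm")
```

Running the same summary separately for round covers (diameter) and square covers (length, width) would yield one row per index, as in Table 7.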
The depths of the 68 simulated potholes or patches are then calculated from the points assigned to each simulated pit. As in the case study, d̄T = 0.02 m is used as the threshold. The results are shown in Table 8. Among the 68 simulated pits, only one has an average depth below 2.0 cm; the remaining 67 have average depths above 2.0 cm.
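The depth test can be illustrated with a short sketch. It assumes that the road plane is fitted to the exterior points by ordinary least squares in the form z = a·x + b·y + c, and that depth is the vertical offset of an interior point below that plane; the exact fitting and distance definitions used in this paper may differ. The function names and sample coordinates are hypothetical; only the 0.02 m threshold comes from the text.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points."""
    if len(points) < 3:
        raise ValueError("at least three exterior points are required")
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    # Normal equations A * [a, b, c]^T = rhs, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [sxz, syz, sz]
    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("exterior points are collinear; plane is undefined")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / det)
    return tuple(coeffs)


def mean_depth(interior, plane):
    """Mean vertical offset (m) of interior points below the fitted plane."""
    a, b, c = plane
    return sum(a * x + b * y + c - z for x, y, z in interior) / len(interior)


def classify(d_mean, threshold=0.02):
    """Pothole if the mean depth exceeds the threshold, otherwise patch."""
    return "pothole" if d_mean > threshold else "patch"


# Hypothetical points (m): exterior points on a gently sloping road surface,
# interior points lying 4-5 cm below it.
exterior = [(0.0, 0.0, 10.00), (1.0, 0.0, 10.01),
            (0.0, 1.0, 10.00), (1.0, 1.0, 10.01)]
interior = [(0.5, 0.5, 9.965), (0.4, 0.6, 9.954)]

plane = fit_plane(exterior)
d_mean = mean_depth(interior, plane)
print(f"mean depth = {d_mean:.3f} m -> {classify(d_mean)}")
```

Fitting the plane only to exterior points keeps the reference surface from being pulled down by the pit itself, which is why the interior and exterior groups are separated first.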

Fig. 11. Location of potholes and patches and 3D models of typical potholes.

Table 4
Statistics of the four types of potholes classified based on the geometric information.
Index                      Area > 0.2 m²    Area < 0.2 m²
Average depth > 0.06 m     2                0
Average depth < 0.06 m     34               13

Table 5
Statistics of the pothole location and lanes affected by potholes.
Index                      Category          Value
Location of the pothole    First lane        4
                           Second lane       31
                           Third lane        13
                           Emergency lane    1
Affected lanes             1 lane            33
                           2 lanes           16

4. Discussion of the factors affecting the proposed method

In this section, the main factors that may affect the pothole extraction results are analyzed.

4.1. DeepLabv3+ sample collection

The pothole samples used for DeepLabv3+ training in this paper are typical, round potholes that can be easily recognized in images. Linear potholes and other types are not considered during the sample collection in this paper. During the candidate pothole extraction, linear potholes and linear patches are determined. To make the DeepLabv3+ system robust enough for pothole extraction, linear pothole samples should also be collected.

Table 7
Difference between the true and calculated sizes of round and square manhole covers.
Type                    Index       Min (cm)    Max (cm)    Mean (cm)
Round manhole cover     Diameter    0.10        4.46        1.47
Square manhole cover    Length      0.09        6.63        2.77
                        Width       0.07        6.94        2.02


Table 8
Comparison of the geometric information about potholes and patches.
Index                 Simulated pothole    Simulated patch
Count                 67                   1
Average depth (m)     0.040                0.010
Maximum depth (m)     0.073                0.010

Moreover, the deep neural network DeepLabv3+ achieves an accuracy of 93.1% during pit extraction. Therefore, the results of the pit extraction must be further analyzed via the methods proposed in Sections 2.3 and 2.4 before they can be used for subsequent procedures.

4.2. Geometry of the pavement

An important precondition of the proposed method is that the road surface around the pothole is assumed to be a plane, which is satisfied for most roads. However, this precondition cannot be satisfied in the case of soft base roads or irregular sedimentation roads. In such cases, the root-mean-square error (RMSE) of the fitted plane is notably higher than that of the other candidate potholes. Therefore, more detailed processing steps should be developed for the pothole extraction in soft base or irregular sedimentation roads.

4.3. Blocking during data collection

Blocking phenomena also affect the data quality during data collection and thus the results of the proposed method. First, road potholes might be blocked by road objects, such as vehicles, such that the potholes cannot be properly captured by cameras and laser sensors. In addition, shadows cast by ground features might also affect the location of the pits identified by the DeepLabv3+ system. This is a universal problem of most data collection systems. Therefore, we suggest that users collect data when few vehicles are on the road, for example, early in the morning.

4.4. Weather

Finally, the weather during data collection with a digital camera should be good (without rain) because the images will be degraded under poor weather conditions. Moreover, the laser beams will be absorbed if there is water on the road or in the pothole. Therefore, few or no points will be present in the interior part of the pothole. In such cases, accurate 3D candidate potholes and the depth distribution cannot be properly calculated.

4.5. Computational complexity of the proposed method

the integration and data fusion of mobile mapping point clouds and images. This method first uses the DeepLabv3+ system and Canny algorithm to detect the location and 2D edge of a candidate pothole in the image. To differentiate the potholes and patches, the point cloud acquired by a laser sensor is then classified into interior and exterior groups based on the relationship with the candidate pothole edges. The interior points are used to calculate the mean depth of the candidate potholes and to separate the potholes from patches using a threshold. Finally, the detected potholes are used to evaluate the effects on the road environment.
A real case in Shanghai is selected to verify the validity of the proposed method. A total of 77 candidate potholes are extracted by DeepLabv3+, among which 49 are successfully identified as potholes using the proposed method. A safety evaluation is then performed to assess the location and affected lanes based on the relationship between the potholes and a highly accurate high-resolution roadmap. This method will be beneficial for road maintenance and emergency management.
To evaluate the geometric accuracy of the pothole extraction, a simulation experiment is conducted to avoid the dangers of ground truth collection in a real road environment. The results show that the mean size accuracy is ∼1.5–2.8 cm.
Because the geometric shapes of the patches and potholes are similar, traditional methods generally encounter misclassification problems. Compared with traditional methods, the innovation of the proposed method is the higher identification rate of potholes, which can be used for subsequent measurements. The proposed method uses point clouds to analyze the accurate edge and depth distribution of a pothole. This will ensure the success rate in discerning the potholes.

Declaration of Competing Interest

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work; there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Road pothole extraction and safety evaluation by integration of point cloud and images derived from mobile mapping sensors” (ADVEI_2019_216).

Acknowledgement

This study was supported by the National Science Foundation of China (Nos. 41671451, 41771482), the National Science and Technology Major Program (Nos. 2016YFB0502104, 2016YFB1200602-02) and the Fundamental Research Funds for the Central Universities of China.

