Journal Vehicle Speed Estimation
Fig. 8. Internal steps of the FIND-HILLS routine. (a) Rising and falling phases according to a given threshold ρ. (b) Ascending and descending slopes. (c) Three slope regions delimited by the rising edge of ascending slopes and the falling edge of descending slopes.
    |h1 − h2| < t1 h
    dx < t2 h
    dy < t3 h        (4)

where t1, t2 and t3 are fixed parameters (respectively, 0.7, 1.1, and 0.4 in our tests).

The above criteria are applied to each isolated region by using the union-find data structure, adapted from Cormen et al. [32] as shown in Fig. 10. Specifically, at the beginning each region b is a disjoint set created by the MAKE-SET algorithm, as shown in Fig. 11(a) and (b). The UNION routine then tries to group two compatible candidate regions b1, b2, as shown in Fig. 11(c). These regions are then filtered using simple geometric tests that remove regions with dimensions not compatible with license plates. In our tests we filtered regions with dimensions below 32 × 10 pixels. Fig. 9(d) shows the grouped and filtered regions for a sample image.

The last stage of our license plate detector is a region classification step, which discards regions that do not seem to contain any textual information. For this task we use the T-HOG text descriptor [28], a texture classifier specialized in capturing the gradient distribution characteristic of character strokes in occidental-like scripts. We first estimate a center line for each candidate image region by taking, at each column, the center point between the uppermost and bottommost pixels of the filtered and dilated edge image, as shown in Fig. 12(a) and (b). Fixed-size windows are then centered at regularly spaced points along the center line, as shown in Fig. 12(c). This sub-sampling is done to reduce the computational effort, as well as to avoid very similar classifications in neighboring positions. For each window, we compute the T-HOG descriptor and use it as input to an SVM classifier. For the SVM classifier we used a Gaussian χ2 kernel, whose standard deviation parameter was optimized by cross-validation on a training set of text and non-text samples.
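The grouping step above can be sketched with a small union-find implementation combined with the compatibility criteria of Eq. (4). The box format, the exact definitions of dx and dy, and all helper names are our own illustration (the paper does not spell them out here), so treat this as a sketch under those assumptions:

```python
class DisjointSet:
    """Union-find with path compression and union by rank (cf. Cormen et al. [32])."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                             # union by rank
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1

def compatible(r1, r2, t1=0.7, t2=1.1, t3=0.4):
    """Compatibility test in the spirit of Eq. (4); regions are (x, y, w, h) boxes.
    The reference height h and the distances dx, dy are hypothetical definitions."""
    h = max(r1[3], r2[3])
    dx = abs(r1[0] - r2[0])
    dy = abs(r1[1] - r2[1])
    return abs(r1[3] - r2[3]) < t1 * h and dx < t2 * h and dy < t3 * h
```

Each candidate region starts as its own singleton set (MAKE-SET); UNION is attempted only on pairs that pass the compatibility test, so each final group is a connected cluster of mutually compatible boxes.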
1398 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 18, NO. 6, JUNE 2017
Fig. 12. T-HOG/SVM classification. (a) Candidate region. (b) Region center
line in white color. (c) Sampled points to guide the classification in white
color (step of 4 pixels). (d) and (e) Regions classified as text. (f) Text region
bounding box.
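The center-line estimate of Fig. 12(a) and (b) reduces, per column, to the midpoint between the uppermost and bottommost edge pixels. A minimal numpy sketch (the function name is ours):

```python
import numpy as np

def center_line(edges):
    """For each column of a binary edge image, return the midpoint row between
    the uppermost and bottommost set pixels (NaN where the column is empty)."""
    h, w = edges.shape
    line = np.full(w, np.nan)
    for x in range(w):
        rows = np.flatnonzero(edges[:, x])
        if rows.size:
            line[x] = (rows[0] + rows[-1]) / 2.0
    return line
```

The classification windows would then be centered at every k-th valid point of this line (the paper's figure uses a step of 4 pixels).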
V. FEATURE SELECTION AND TRACKING

Once a license plate region is detected, our system selects a set of distinctive features and tracks it across multiple video frames. Our aim is to produce a list containing the trajectories of the tracked features. The structure of our tracking scheme is outlined in Fig. 13.

Feature selection is performed only once for each vehicle, immediately after its license plate is detected. Following the approach of Shi and Tomasi [3], a "good feature" is a region with high intensity variation in more than one direction, such as textured regions or corners. Let [Ix Iy] be the image derivatives in the x and y directions of image I, and let

    Z = Σ_Ω ⎡ Ix²     Ix Iy ⎤
            ⎣ Ix Iy   Iy²   ⎦        (5)

where d is the displacement vector that describes the motion of the feature between frames I and J. To obtain d, the KLT algorithm takes a current estimate e and iteratively solves for increments Δd, namely

    E = Σ_Ω [ I(u) − J(u + e + Δd) ]²        (7)

updating e at each iteration until it converges. As we are tracking features, the initial estimate for each frame may be the displacement d obtained for the previous frame.

The traditional Lucas-Kanade algorithm only works for small displacements (on the order of one pixel). To overcome this limitation, we consider the pyramidal version of KLT, described by Bouguet [33]. The algorithm builds, for each frame, a multi-scale image pyramid by using the original image at the pyramid base, and putting at each subsequent level a version of the image in the previous level with width and height reduced by half. The pyramidal KLT algorithm starts by finding

² We used the source code and training dataset available at www.dainf.ct.utfpr.edu.br/%7erminetto/projects/thog.html.
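Equation (5) and the Shi-Tomasi criterion (a window is a "good feature" when the smaller eigenvalue of Z is large), together with the half-resolution pyramid used by pyramidal KLT, can be sketched as follows. The window choice and the plain decimation are illustrative simplifications (Bouguet's implementation also low-pass filters before subsampling):

```python
import numpy as np

def shi_tomasi_score(patch):
    """Smaller eigenvalue of the structure matrix Z of Eq. (5), computed over
    the whole patch as the window Ω; high values indicate corner-like texture."""
    iy, ix = np.gradient(patch.astype(float))   # image derivatives Iy, Ix
    z = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    return np.linalg.eigvalsh(z)[0]             # eigenvalues in ascending order

def build_pyramid(img, levels=3):
    """Multi-scale pyramid for KLT: each level halves width and height
    (plain subsampling here; a real implementation smooths first)."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(pyr[-1][::2, ::2])
    return pyr
```

A flat patch has zero gradients in every direction and scores 0, while a corner (gradients in two independent directions) yields a strictly positive smaller eigenvalue, which is exactly why corners track reliably.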
LUVIZON et al.: VIDEO-BASED SYSTEM FOR VEHICLE SPEED MEASUREMENT IN URBAN ROADWAYS 1399
TABLE I
DATASET INFORMATION: TIME (MINUTES); NUMBER OF VIDEOS; NUMBER OF VEHICLES WITH PLATES AND SPEED INFORMATION. THE QUALITY OPTIONS ARE AS FOLLOWS: [H]—HIGH QUALITY; [N]—FRAMES AFFECTED BY NATURAL OR ARTIFICIAL NOISE; [L]—FRAMES AFFECTED BY SEVERE LIGHTING CONDITIONS; [B]—MOTION BLUR; AND [R]—RAIN

TABLE II
MOTION DETECTION PERFORMANCE: THE PRECISION p, RECALL r, AND AVERAGE TIME (IN MILLISECONDS, FOR EACH FRAME) FOR FIVE SUBSAMPLING CONFIGURATIONS AND TWO OVERLAPPING THRESHOLDS

TABLE III
LICENSE PLATE DETECTION PERFORMANCE EVALUATION, BASED ON PRECISION (p), RECALL (r), AND THE F-MEASURE. THE VALUES IN BOLDFACE ARE THE MAXIMA OBTAINED FOR EACH CASE
with the λ threshold indicating how much of the license plate region must be contained within the ROI. We performed tests using two different values for λ: 1.0 (i.e. the entire license plate is contained within the ROI) and 0.5.

Table II shows the precision and recall, as well as the average processing time, obtained by our motion detector. The different columns show the results obtained with different amounts of subsampling — more sparse grids will reduce the processing time, but can also lead to incorrect results.

Fig. 16. Examples of license plates detected by our system, for representative samples of each set.

Fig. 17. Examples of license plates not detected by our system.
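The λ criterion, i.e. the fraction of the license plate area that must fall inside the detected ROI, can be written as a simple box-overlap test. The (x0, y0, x1, y1) box convention and the function name are our own:

```python
def plate_in_roi(plate, roi, lam=1.0):
    """True if at least a fraction lam of the plate's area lies inside the ROI.
    Boxes are (x0, y0, x1, y1) with x0 < x1 and y0 < y1."""
    ix0, iy0 = max(plate[0], roi[0]), max(plate[1], roi[1])
    ix1, iy1 = min(plate[2], roi[2]), min(plate[3], roi[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)        # intersection area
    plate_area = (plate[2] - plate[0]) * (plate[3] - plate[1])
    return inter >= lam * plate_area
```

With lam=1.0 the plate must be entirely inside the ROI; with lam=0.5 half of it suffices, matching the two settings tested above.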
C. License Plate Detection Evaluation

To evaluate the performance of the license plate detector, we compare the detected license plates with those in the ground truth. The comparison is based on the same precision and recall metrics used for evaluating the motion detector (see Section VII-B), with some differences. First, set D refers to the set of detected license plates. Second, function m is defined as in the PASCAL Visual Object Detection Challenge [39]:

    m(a, b) = ⎧ 1   if area(a ∩ b)/area(a ∪ b) > 0.5
              ⎩ 0   otherwise        (14)

For ranking purposes, we also consider the F-measure, which is the harmonic mean of precision and recall: F = 2 · p · r/(p + r).

We compared our license plate detector with three text and license plate detectors described in the literature (see Section II-B): SnooperText [27], the Zheng et al. [21] algorithm, and the Stroke Width Transform (SWT) [22]. The parameters for these detectors were obtained by running the system on 25% of the videos from the dataset, and selecting the parameter combinations that produced the highest F-measures. The results obtained for the entire data set are shown in Table III, divided into 5 subsets according to weather and recording conditions. Our detector significantly outperformed the other approaches in these tests. The average time to process each region of interest was 58 ms for SnooperText; 918 ms for Zheng et al.; 402 ms for SWT; and 195 ms for our detector.

Examples of license plates detected by the proposed method are shown in Fig. 16. Our detector worked as expected even in some situations with severe image noise or motion blur. Detection errors occurred mainly in the hypothesis generation phase, with true license plate regions being eliminated by some filtering criteria when they became connected with a background region. Samples of license plates not detected by our system are shown in Fig. 17.

D. Vehicle Speed Measurement Evaluation

Speed measurement performance was evaluated by comparing the speeds measured by our system with the ground truth speeds obtained by the inductive loop detectors. According to the standards adopted in the USA, an acceptable measurement must be within the [−3 km/h, +2 km/h] error interval.
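The match function of Eq. (14) and the F-measure can be expressed directly; the helper names are ours:

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def m(a, b):
    """PASCAL VOC match criterion, Eq. (14): 1 if IoU exceeds 0.5, else 0."""
    return 1 if iou(a, b) > 0.5 else 0

def f_measure(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if p + r else 0.0
```

The 0.5 IoU threshold is the standard PASCAL setting; a detection counts as correct only if it overlaps more than half of the union of the two boxes.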
The first row in Table IV shows the results obtained by our system. Percentages are given regarding the valid vehicles—those with both a license plate and an associated speed in the ground truth—and are divided into 3 classes, depending on whether the measured speed was below, inside, or above the acceptable error interval. Fig. 18 shows the distribution of the measurement errors, with 96% of the measurements being inside the acceptable limits. The maximum nominal error values for the whole dataset were −4.68 km/h and +6.00 km/h, with an average of −0.5 km/h and a standard deviation of 1.36 km/h. We observed that the assumption that all the license plates have nearly the same distance from the ground is the main cause of speed measurement errors: when the license plates are very high above the ground (e.g. in buses or trucks) the measured speed can be higher than the actual speed, with the opposite occurring when the license plates are unusually low. A total of 99.2% of the vehicles were successfully tracked until they reached the speed measurement region. On average, our tracking module spent 49.8 milliseconds per frame. Examples of measured speeds are shown in Fig. 19.

In order to verify if distinctive features from a license plate region are a good choice for measuring a vehicle's speed, we performed tests using a version of our system which takes features from the whole vehicle region ("free feature selection"). The results are shown in Table IV. It can be seen that the percentage of vehicles whose measured speed is inside the

Fig. 18. Speed measurement error distribution.

VIII. CONCLUSION

This paper addressed the problem of measuring vehicle speeds based on videos captured in an urban setting. We proposed a system based on the selection and tracking of distinctive features located within each vehicle's license plate region. The system was tested on almost five hours of videos with full-HD quality, with more than 8,000 vehicles in three different road lanes, with associated ground truth speeds obtained by a high precision system based on inductive loop detectors, as well as manually labeled ground truth license plate regions. Our system uses a novel license plate detection method, based on a texture classifier specialized to capture the gradient distribution characteristics of the character strokes that make up the license plate letters. This module achieved a precision of 0.93 and a recall of 0.87, outperforming other well-known approaches. We have also shown that extracting distinctive features from the license plate region led to better results than taking features spread over the whole vehicle, as well as an approach which uses a particle filter for blob tracking. In our experiments, the measured speeds had an average error of −0.5 km/h, staying in over 96.0% of the cases inside the +2/−3 km/h error interval determined by the regulatory authorities in several countries.

As future work, we intend to verify if estimating the distance of the license plates from the ground can improve the results. We also aim to apply an OCR on the detected license plates in order to create a traffic speed control system with integrated surveillance tools, e.g. to compute the traffic flow, to identify stolen vehicles, etc. Another topic for future work is the implementation on a compact platform that allows local processing, including optimizations such as parallel processing on GPUs, thus reducing communication bandwidth requirements.
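The per-vehicle error statistics reported above (mean, standard deviation, and the fraction of measurements inside the [−3, +2] km/h interval) reduce to a few lines of Python; the function name is ours, and any numbers used with it would be illustrative, not the paper's data:

```python
import statistics

def speed_error_stats(measured, ground_truth, lo=-3.0, hi=2.0):
    """Signed errors (measured - ground truth) in km/h: return their mean,
    sample standard deviation, and the fraction inside the [lo, hi] interval."""
    errors = [m - g for m, g in zip(measured, ground_truth)]
    inside = sum(lo <= e <= hi for e in errors) / len(errors)
    return statistics.mean(errors), statistics.stdev(errors), inside
```

Note that the acceptance interval is asymmetric: regulations tolerate a larger error on the low side (−3 km/h) than on the high side (+2 km/h), so overestimating a speed is penalized sooner than underestimating it.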
Fig. 19. Examples of vehicle speeds measured by our system and by a high-precision meter based on inductive loops.
REFERENCES

[1] T. V. Mathew, "Intrusive and non-intrusive technologies," Indian Inst. Technol. Bombay, Mumbai, India, Tech. Rep., 2014.
[2] N. Buch, S. Velastin, and J. Orwell, "A review of computer vision techniques for the analysis of urban traffic," IEEE Trans. Intell. Transp. Syst., vol. 12, no. 3, pp. 920–939, Sep. 2011.
[3] J. Shi and C. Tomasi, "Good features to track," in Proc. IEEE Int. Conf. CVPR, 1994, pp. 593–600.
[4] B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. Joint Conf. Artif. Intell., 1981, pp. 674–679.
[5] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004.
[6] D. Luvizon, B. Nassu, and R. Minetto, "Vehicle speed estimation by license plate detection and tracking," in Proc. IEEE ICASSP, 2014, pp. 6563–6567.
[7] D. Dailey, F. Cathey, and S. Pumrin, "An algorithm to estimate mean traffic speed using uncalibrated cameras," IEEE Trans. Intell. Transp. Syst., vol. 1, no. 2, pp. 98–107, Feb. 2000.
[8] V. Madasu and M. Hanmandlu, "Estimation of vehicle speed by motion tracking on image sequences," in Proc. IEEE Intell. Veh. Symp., 2010, pp. 185–190.
[9] S. Dogan, M. S. Temiz, and S. Kulur, "Real time speed estimation of moving vehicles from side view images from an uncalibrated video camera," Sensors, vol. 10, no. 5, pp. 4805–4824, 2010.
[10] C. H. Xiao and N. H. C. Yung, "A novel algorithm for estimating vehicle speed from two consecutive images," in Proc. IEEE WACV, 2007, pp. 1–6.
[11] H. Zhiwei, L. Yuanyuan, and Y. Xueyi, "Models of vehicle speeds measurement with a single camera," in Proc. Int. Conf. Comput. Intell. Security Workshops, 2007, pp. 283–286.
[12] C. Maduro, K. Batista, P. Peixoto, and J. Batista, "Estimation of vehicle velocity and traffic intensity using rectified images," in Proc. IEEE ICIP, 2008, pp. 777–780.
[13] H. Palaio, C. Maduro, K. Batista, and J. Batista, "Ground plane velocity estimation embedding rectification on a particle filter multi-target tracking," in Proc. IEEE ICRA, 2009, pp. 825–830.
[14] L. Grammatikopoulos, G. Karras, and E. Petsa, "Automatic estimation of vehicle speed from uncalibrated video sequences," in Proc. Mod. Technol., Educ. Prof. Pract. Geodesy Related Fields, 2005, pp. 332–338.
[15] T. Schoepflin and D. Dailey, "Dynamic camera calibration of roadside traffic management cameras for vehicle speed estimation," IEEE Trans. Intell. Transp. Syst., vol. 4, no. 2, pp. 90–98, Jun. 2003.
[16] G. Garibotto, P. Castello, E. Del Ninno, P. Pedrazzi, and G. Zan, "Speed-vision: Speed measurement by license plate reading and tracking," in Proc. IEEE Intell. Transp. Syst. Conf., 2001, pp. 585–590.
[17] W. Czajewski and M. Iwanowski, "Vision-based vehicle speed measurement method," in Proc. Int. Conf. Comput. Vis. Graphics, 2010, pp. 308–315.
[18] M. Garg and S. Goel, "Real-time license plate recognition and speed estimation from video sequences," ITSI Trans. Electr. Electron. Eng., vol. 1, no. 5, pp. 1–4, 2013.
[19] C.-N. E. Anagnostopoulos, I. E. Anagnostopoulos, I. D. Psoroulas, V. Loumos, and E. Kayafas, "License plate recognition from still images and video sequences: A survey," IEEE Trans. Intell. Transp. Syst., vol. 9, no. 3, pp. 377–391, Mar. 2008.
[20] S. Du, M. Ibrahim, M. Shehata, and W. Badawy, "Automatic License Plate Recognition (ALPR): A state-of-the-art review," IEEE Trans. Circuits Syst. Video Technol., vol. 23, no. 2, pp. 311–325, Feb. 2013.
[21] D. Zheng, Y. Zhao, and J. Wang, "An efficient method of license plate location," Pattern Recognit. Lett., vol. 26, no. 15, pp. 2431–2438, 2005.
[22] B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke width transform," in Proc. IEEE Int. Conf. CVPR, 2010, pp. 886–893.
[23] B. Li, B. Tian, Y. Li, and D. Wen, "Component-based license plate detection using conditional random field model," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1690–1699, Dec. 2013.
[24] A. Ashtari, M. Nordin, and M. Fathy, "An Iranian license plate recognition system based on color features," IEEE Trans. Intell. Transp. Syst., vol. 15, no. 4, pp. 1690–1705, Aug. 2014.
[25] H. Li, D. Doermann, and O. Kia, "Automatic text detection and tracking in digital video," IEEE Trans. Image Process., vol. 9, no. 1, pp. 147–156, Jan. 2000.
[26] W. Zhou, H. Li, Y. Lu, and Q. Tian, "Principal visual word discovery for automatic license plate detection," IEEE Trans. Image Process., vol. 21, no. 9, pp. 4269–4279, Sep. 2012.
[27] R. Minetto, N. Thome, M. Cord, N. J. Leite, and J. Stolfi, "SnooperText: A text detection system for automatic indexing of urban scenes," Comput. Vis. Image Understand., vol. 122, pp. 92–104, 2014.
[28] R. Minetto, N. Thome, M. Cord, J. Stolfi, and N. J. Leite, "T-HOG: An effective gradient-based descriptor for single line text regions," Pattern Recognit., vol. 46, no. 3, pp. 1078–1090, 2013.
[29] A. Bobick and J. Davis, "The recognition of human movement using temporal templates," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 3, pp. 257–267, Mar. 2001.
[30] J. Ha, R. Haralick, and I. Phillips, "Document page decomposition by the bounding-box project," in Proc. ICDAR, 1995, vol. 2, pp. 1119–1122.
[31] T. Retornaz and B. Marcotegui, "Scene text localization based on the ultimate opening," in Proc. ISMM, 2007, vol. 1, pp. 177–188.
[32] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd ed. Cambridge, MA, USA: MIT Press, 2009.
[33] J.-Y. Bouguet, "Pyramidal implementation of the Lucas Kanade feature tracker," Intel Corp., Microprocessor Res. Lab., Mountain View, CA, USA, 2000.
[34] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 10, pp. 1615–1630, Oct. 2005.
[35] G. Wang, Z. Hu, F. Wu, and H.-T. Tsui, "Single view metrology from scene constraints," Image Vis. Comput., vol. 23, no. 9, pp. 831–840, 2005.
[36] H. Li, M. Feng, and X. Wang, "Inverse perspective mapping based urban road markings detection," in Proc. IEEE Int. Conf. CCIS, Oct. 2012, vol. 3, pp. 1178–1182.
[37] G. R. Bradski and A. Kaehler, Learning OpenCV, 1st ed. Sebastopol, CA, USA: O'Reilly Media, 2008.
[38] C. Wolf and J.-M. Jolion, "Object count/area graphs for the evaluation of object detection and segmentation algorithms," Int. J. Doc. Anal. Recognit., vol. 8, no. 4, pp. 280–296, 2006.
[39] M. Everingham, L. V. Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The PASCAL Visual Object Classes (VOC) challenge," Int. J. Comput. Vis., vol. 88, pp. 303–338, 2009.
[40] P. Perez, C. Hue, J. Vermaak, and M. Gangnet, "Color-based probabilistic tracking," in Proc. ECCV, 2002, pp. 661–675.

Diogo Carbonera Luvizon received the M.Sc. degree from the Federal University of Technology of Paraná (UTFPR), Curitiba, Brazil, in 2015. He is currently working toward the Ph.D. degree at Université de Cergy-Pontoise, Cergy-Pontoise, France. From 2010 to 2014, he was a Research Engineer at a company developing vehicle speed measurement systems. His main research interests include vehicle detection and tracking, speed estimation, and image descriptors.

Bogdan Tomoyuki Nassu received the Ph.D. degree in advanced interdisciplinary studies from The University of Tokyo, Tokyo, Japan, in 2008. From 2008 to 2011, he was a Researcher with the Railway Technical Research Institute, Tokyo, and a Postdoctoral Research Fellow with the Federal University of Parana, Curitiba, Brazil. Since 2012, he has been an Assistant Professor with the Federal University of Technology of Paraná (UTFPR), Curitiba. His main research interest is applying computer vision techniques to practical problems.

Rodrigo Minetto received the Ph.D. degree in computer science from Université Pierre et Marie Curie, Paris, France, and the University of Campinas, Campinas, Brazil, in 2012. Since 2012, he has been an Assistant Professor with the Federal University of Technology of Paraná (UTFPR), Curitiba, Brazil. His main research interests include text detection, object tracking, image descriptors, and additive manufacturing.