
1498 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 13, NO. 4, DECEMBER 2012

Real-Time Detection and Recognition of Road Traffic Signs

Jack Greenhalgh and Majid Mirmehdi, Senior Member, IEEE

Abstract—This paper proposes a novel system for the automatic detection and recognition of traffic signs. The proposed system detects candidate regions as maximally stable extremal regions (MSERs), which offers robustness to variations in lighting conditions. Recognition is based on a cascade of support vector machine (SVM) classifiers that were trained using histogram of oriented gradient (HOG) features. The training data are generated from synthetic template images that are freely available from an online database; thus, real footage road signs are not required as training data. The proposed system is accurate at high vehicle speeds, operates under a range of weather conditions, runs at an average speed of 20 frames per second, and recognizes all classes of ideogram-based (nontext) traffic symbols from an online road sign database. Comprehensive comparative results to illustrate the performance of the system are presented.

Index Terms—Histogram of oriented gradient (HOG) features, maximally stable extremal regions (MSERs), support vector machines (SVMs), synthetic data, traffic sign recognition.

I. INTRODUCTION

AUTOMATIC traffic sign detection and recognition is an important part of an advanced driver assistance system. Traffic symbols have several distinguishing features that may be used for their detection and identification. They are designed in specific colors and shapes, with the text or symbol in high contrast to the background. Because traffic signs are generally oriented upright and facing the camera, the amount of rotational and geometric distortion is limited.

Information about traffic symbols, such as shape and color, can be used to place traffic symbols into specific groups; however, there are several factors that can hinder effective detection and recognition of traffic signs. These factors include variations in perspective, variations in illumination (including variations that are caused by changing light levels, twilight, fog, and shadowing), occlusion of signs, motion blur, and weather-worn deterioration of signs. Road scenes are also generally very cluttered and contain many strong geometric shapes that could easily be misclassified as road signs. Accuracy is a key consideration, because even one misclassified or undetected sign could have an adverse impact on the driver.

The proposed method consists of the following two stages: 1) detection is performed using a novel application of maximally stable extremal regions (MSERs) [1], and 2) recognition is performed with histogram of oriented gradient (HOG) features, which are classified using a linear support vector machine (SVM).

Another novel aspect of this paper is the use of an online road sign database [2] that consists of synthetic graphical representations of signs. To the best of our knowledge, this is the first paper that uses the entire range of road signs in operation. Previous works, such as [3]–[5], all use hand-selected subsets. Large training sets are then generated by applying random distortions to our graphic templates, e.g., geometric distortion, blurring, and illumination variations, to capture examples of occurrences of real-scene distortions. It is essential for the classifiers to be trained on all possible signs to avoid misclassification of similar but excluded signs. Generating synthetic data in this way allows classification to be performed on all possible road signs and also avoids the tedious process of hand labeling large data sets. Examples of the proposed road sign detection system are shown in Fig. 1.

In Section II, we review previous work and state the improvements that we make. Then, in Section III, we outline the methodology used, which includes detection, recognition, and the generation of synthetic data. In Section IV, we describe comparative results to illustrate the performance of the system. Finally, conclusions are drawn in Section V.

Manuscript received January 13, 2012; revised May 2, 2012; accepted July 2, 2012. Date of publication August 27, 2012; date of current version November 27, 2012. This work was supported in part by the Engineering and Physical Sciences Research Council and Jaguar Cars Limited. The Associate Editor for this paper was J. Stallkamp.
The authors are with the Visual Information Laboratory, University of Bristol, BS8 1UB Bristol, U.K. (e-mail: [email protected]; majid@compsci.bristol.ac.uk).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TITS.2012.2208909
1524-9050/$31.00 © 2012 IEEE

II. RELATED WORK

A significant number of papers that deal with the recognition of ideogram-based road signs in real road scenes have been published [4]–[15].

The most common approach, quite sensibly, consists of two main stages: detection and recognition. The detection stage identifies the regions of interest and is mostly performed using color segmentation, followed by some form of shape recognition. Detected candidates are then either identified or rejected during the recognition stage using, for example, template matching [16] or some form of classifier such as SVMs [4], [5], [11] or neural networks [3], [17].

The majority of systems make use of color information as a method for segmenting the image [9], [11], [12], [18]–[20]. The performance of color-based road sign detection is often reduced in scenes with strong illumination, poor lighting, or adverse weather conditions such as fog. Color models, such as hue–saturation–value (HSV) [8], [12], [19], YUV [21], and CIECAM97 [10], have been used in an attempt to overcome

Fig. 1. Examples of the proposed road sign detection and recognition system.

these issues. For example, Shadeed et al. [21] performed segmentation by applying the U and V chrominance channels of the YUV space, with U being positive and V being negative for red colors. This information was used in combination with the hue channel of the HSV color space to segment red road signs. Gao et al. [10] applied a quad-tree histogram method to segment the image based on the hue and chroma values of the CIECAM97 color model. Malik et al. [12] thresholded the hue channel of the HSV color space to segment red road signs.

In contrast, there are several approaches [7], [22] that entirely ignore color information and instead use only shape information from grayscale images. For example, Loy and Zelinsky [22] proposed a system that used local radial symmetry to highlight points of interest in each image and detect octagonal, square, and triangular road signs.

Some recent methods such as [23] and [24] use HOG features for road sign feature extraction. Creusen et al. [23] extended the HOG algorithm to incorporate color information using the CIELAB and YCbCr color spaces. Overett et al. [24] presented two variant formulations of HOG features for the detection of speed signs in New Zealand. We also use HOG features to aid our classification process and will explain later why we find they are most suited to this application.

The vast majority of existing systems consist of classifiers that were trained using hand-labeled real images, for example [3]–[5], which is a repetitive, time-consuming, and error-prone process. Our method avoids collecting and manually labeling training data, because it requires only synthetic graphical representations of signs that were obtained from an online road sign database [2]. Furthermore, although many existing systems report high classification rates, the total number of traffic sign classes recognized is generally very limited, e.g., seven classes in [4], 42 classes in [15], or 20 classes in [16], and these systems are hence less likely to suffer mismatches against similar signs that were missing from their databases. Our proposed system uses all instances of ideogram-based traffic symbols used in the U.K. and hence performs its matching in this larger set. We expect our approach to be equally functional if applied to other countries' traffic sign databases obtained in a similar fashion.

Note that many proposed systems suffer from slow speed, making them inappropriate for application to real-time problems. Some methods report processing times of several seconds for a single frame [4], [11], [25], [26]. Our system runs at an average speed of 20 frames per second.

There are a few commercial traffic sign recognition systems on the market, including [27] and [28]. Such commercial systems also recognize a very limited set of traffic signs; for example, the system that was developed by Mobileye [28] detects only speed limit signs and no-overtaking signs. Comparison with these commercial systems is difficult, given that little information on their performance is available.

In Section IV, we first compare our proposed method with a similar road sign detection system that was proposed by Gómez-Moreno et al. [11]. We then evaluate the performance of our synthetically generated training data against real training data on the German Traffic Sign Recognition Benchmark (GTSRB) [29].

III. TRAFFIC SIGN DETECTION AND RECOGNITION SYSTEM

A. Overview of the System

The proposed system consists of the following two main stages: detection and recognition. The complete set of road signs used in our training data and recognized by the system is shown in Fig. 2. Candidates for traffic symbols are detected as MSERs, as described by Matas et al. [1]. MSERs are regions that maintain their shape when the image is thresholded at several levels. This method of detection was selected due to its robustness to variations in contrast and lighting conditions. Rather than detecting candidates for road signs by border color, the algorithm detects candidates based on the background color of the sign, because these backgrounds persist within the MSER process. Our proposed method, as described in detail in the following section, is broadly illustrated in Fig. 3.

B. Detection of Road Signs as MSERs

For the detection of traffic symbols with white background, MSERs are found for a grayscale image. Each frame is binarized at a number of different threshold levels, and the connected components at each level are found. The connected components that maintain their shape through several threshold levels are selected as MSERs. Fig. 4 shows different thresholds for an example image with the connected components colored. It is shown that the connected component that represents the circular road symbol maintains its shape through several threshold levels. This helps ensure robustness to both variations in lighting and contrast.

Several features of the detected connected component regions are used to further reduce the number of candidates. These features are width, height, aspect ratio, region perimeter and area, and bounding-box perimeter and area. Removing the connected components that do not match the requirements helps speed up the process and improve accuracy. The parameters

used as limits for these features were empirically determined and are shown in Table I.
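The filtering step above can be sketched as follows; the numeric limits here are illustrative placeholders, since the paper's empirically determined values live in Table I:

```python
# Sketch of the geometric filtering of candidate connected components:
# a region survives only if its width, height, aspect ratio, and fill
# ratio fall within chosen limits.  All limit values below are
# hypothetical, not the paper's Table I values.

def plausible_sign_region(width, height, region_area, bbox_area):
    """Return True if a connected component could plausibly be a sign."""
    if not (10 <= width <= 300 and 10 <= height <= 300):
        return False                      # too small or too large
    aspect = width / height
    if not (0.5 <= aspect <= 2.0):        # signs are roughly square/round
        return False
    fill = region_area / bbox_area        # how much of the box is filled
    return 0.3 <= fill <= 1.0

print(plausible_sign_region(40, 42, 1300, 1680))   # circular-ish blob
print(plausible_sign_region(200, 12, 2000, 2400))  # long thin blob
```

In a real pipeline these tests run before the (more expensive) classification stage, which is exactly why the paper reports that this filtering both speeds up the process and improves accuracy.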
We approach the detection of traffic symbols with red or
blue backgrounds in a slightly different manner. Rather than
detecting MSERs for a grayscale image, the frame is first
transformed from red–green–blue (RGB) into a “normalized
red/blue” image ΩRB such that, for each pixel of the original
image, values are found for the ratio of the blue channel to the
sum of all channels and the ratio of the red channel to the sum
of all channels. The greater of these two values is used as the
pixel value of the normalized red/blue image, i.e.,

    Ω_RB = max( R/(R+G+B), B/(R+G+B) )    (1)
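A minimal sketch of the transform in (1), using nested lists of (R, G, B) tuples in place of a real image array (an actual implementation would operate on OpenCV/NumPy matrices):

```python
# Normalized red/blue transform: for every pixel, Omega_RB is the larger
# of R/(R+G+B) and B/(R+G+B).  Plain nested lists stand in for an image.

def normalized_red_blue(image):
    """image: rows of (R, G, B) tuples -> rows of Omega_RB floats."""
    out = []
    for row in image:
        out_row = []
        for r, g, b in row:
            s = r + g + b
            # max(r, b) / s is equivalent to max(r/s, b/s)
            out_row.append(max(r, b) / s if s else 0.0)
        out.append(out_row)
    return out

frame = [[(200, 10, 10), (10, 10, 200), (90, 90, 90)]]
print(normalized_red_blue(frame))
# red and blue pixels score high (~0.91); the gray pixel scores ~0.33
```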

Pixel values in this image are higher for red and blue pixels and
lower for other colors. MSER regions are then found for this
new image. Fig. 5 shows an example image and the result of
the normalized red/blue transform. Fig. 6 shows the connected
components at several different thresholds of the transformed
image. Again, it is shown that the red and blue road signs
maintain their shape at several threshold levels, making them
candidates for classification.
Although MSER offers a robust form of detection for traffic
signs in complex scenes, it can be computationally expen-
sive. Therefore, to increase the speed, we threshold only at
an appropriate range of values rather than at every possible
value, which is the norm in the original MSER [1]. Fig. 7
shows the number of used thresholds plotted against the pro-
cessing time and accuracy of detection. The thresholds were
evenly spaced between the values 70 and 190, because the
MSERs that represent road signs generally appear within this
range. The number of thresholds selected was 24, which, in
this example, corresponds to 94.3% accuracy and 50.1-ms
processing time.
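The thresholding idea can be illustrated with a toy example: binarize a small synthetic image at 24 evenly spaced thresholds in [70, 190] and watch the bright region's area stay constant across consecutive levels. This is a deliberately simplified stability check, not the full MSER algorithm of [1], and the image and values are invented for illustration:

```python
# Toy stability check: a region whose area barely changes across
# consecutive thresholds is a maximally-stable-style candidate.

def thresholds(n=24, lo=70, hi=190):
    """n evenly spaced threshold levels between lo and hi."""
    step = (hi - lo) / (n - 1)
    return [round(lo + i * step) for i in range(n)]

def bright_area(image, t):
    """Number of pixels at or above threshold t."""
    return sum(1 for row in image for v in row if v >= t)

# A high-contrast disc (value 220) on a mid-gray background (value 120):
image = [[220 if (r - 2) ** 2 + (c - 2) ** 2 <= 2 else 120
          for c in range(5)] for r in range(5)]

areas = [bright_area(image, t) for t in thresholds()]
# Above 120 the area stays fixed at the 9 disc pixels: a stable region.
print(areas)
```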

C. Road Sign Classification


The recognition stage is used to confirm a candidate region
as a traffic sign and classify the exact type of sign. For the
classification of candidate regions, their HOG features are
extracted from the image [30], which represent the occurrence
of gradient orientations in the image. HOG feature vectors are
calculated for each candidate region. A Sobel filter is used to
find the horizontal and vertical derivatives and, hence, the mag-
nitude and orientation for each pixel. We find the application
of HOG to recognition of traffic symbols very suitable, given
that traffic symbols are composed of strong geometric shapes
and high-contrast edges that encompass a range of orientations.
Traffic signs are generally found to be approximately upright
and facing the camera, which limits rotational and geometric
distortion, removing the need for rotation invariance.
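The per-pixel gradient computation described above can be sketched directly; this is a generic Sobel implementation on a toy image, not the authors' code:

```python
# Horizontal and vertical Sobel responses give a magnitude and an
# unsigned orientation (0-180 degrees) per pixel, which HOG then bins.

import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def gradient(image, r, c):
    """Magnitude and unsigned orientation at an interior pixel (r, c)."""
    gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    return math.hypot(gx, gy), math.degrees(math.atan2(gy, gx)) % 180

# A vertical edge: dark left half, bright right half.
img = [[0, 0, 255, 255]] * 4
mag, angle = gradient(img, 1, 1)
print(mag, angle)  # strong response, orientation 0 degrees
```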
The HOG features are computed on a dense grid of cells
using local contrast normalization on overlapping blocks. A
nine-bin histogram of unsigned pixel orientations weighted
by magnitude is created for each cell. These histograms are
normalized over each overlapping block. The components of the feature vector are the values from the histogram of each

Fig. 2. Full set of graphical road signs used in training the proposed system [2].

Fig. 3. Overview schematic of the proposed approach.

Fig. 5. Image transformed into our normalized red/blue color space Ω_RB.

Fig. 4. (Top) Original image. (Middle and Bottom) Connected components at several threshold levels.

TABLE I
PROPERTIES USED TO SORT CONNECTED COMPONENTS

Fig. 6. Connected components at several thresholds of the normalized red–blue image.

normalized cell. Fig. 8 shows an example image divided into cells and an example HOG block that consists of four cells. Although this intensive normalization produces large feature vectors (1764 dimensions for a 64 × 64 image), it provides high accuracy. The size N of the HOG feature vector is computed using

    N = (R_width / M_width − 1) × (R_height / M_height − 1) × B × H    (2)

where R is the region, M is the cell size, B is the number of cells per block, and H is the number of histograms per cell. The values used were M = 8 × 8, B = 4, and H = 9.

Regions are then classified using a cascade of multiclass SVMs [31]. SVM is a supervised learning method that constructs a hyperplane to separate data into classes. The "support vectors" are data points that define the maximum margin of the hyperplane. Although SVM is primarily a binary classifier, multiclass classification can be achieved by training many one-against-one binary SVMs. SVM classification is fast, highly accurate, and less prone to overfitting compared to many other classification methods. It is also possible to very quickly train an SVM classifier, which significantly helps in our proposed method, given our large amount of training data and high
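As a sanity check, (2) can be evaluated in code; with the stated parameters it reproduces both sizes quoted in the text, 1764 features for a 64 × 64 region and the 144-dimensional vector used later for 24 × 24 regions:

```python
# Dimensionality of the HOG feature vector, per equation (2),
# with M = 8 x 8 cells, B = 4 cells per block, and H = 9 bins.

def hog_vector_size(r_w, r_h, m_w=8, m_h=8, b=4, h=9):
    """(R_w/M_w - 1) * (R_h/M_h - 1) * B * H, integer cell counts."""
    return (r_w // m_w - 1) * (r_h // m_h - 1) * b * h

print(hog_vector_size(64, 64))  # -> 1764
print(hog_vector_size(24, 24))  # -> 144
```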

Fig. 7. Chart that shows the number of thresholds used for MSER plotted against accuracy of detection and processing time.

Fig. 9. Cascaded SVM classifier.

number of classes. However, we plan to perform further comparison with other classification methods in future work.

Each region in our system is classified using a cascade of SVM classifiers, as illustrated in Fig. 9. First, the candidate region is resized to 24 × 24 pixels. A HOG feature vector with 144 dimensions is then calculated, and this feature vector is used to classify the shape of the region as a circle, triangle, upside-down triangle, rectangle, or background. Octagonal stop signs are considered to be circles. If the region is found to be background, it is rejected. If the region is found to be a shape, it is then passed on to a (symbol) subclassifier for that specific shape.

Different classifier trees are used for candidates with white background (MSERs for a grayscale image) and candidates with red or blue background (MSERs for a normalized red/blue image). Therefore, each subclassifier is specific to symbols with a certain background color and shape. Color background triangles and color background upside-down triangles are rejected as background, because no signs of these types exist in the U.K. road sign database [2].

To optimize the performance of the linear SVM classifier, an appropriate value for the cost of misclassification parameter C has to be selected. Choosing a value that is too large may result in overfitting, whereas a value that is too small may cause underfitting. Hence, a cross-validation of the training set is performed for log2 C = −5, −3, −2, . . . , 15, and the value of C that produces the highest cross-validation accuracy is used.

Fig. 8. Regions of HOG features.

Road sign classifications from several frames are merged together to form a decision. A probabilistic SVM model is used for classification. Rather than having each classification count as a single vote for a specific class, a vote was made for each class, weighted by its probability. The class with the highest score S was taken as the correct classification, i.e.,

    S = Σ_{n=1}^{N} P(A_n)    (3)

where N is the total number of classifications, and A_n is an SVM classification. The classification is made once S exceeds the decision threshold λ.

In Section IV-B, we compare the results with and without the inclusion of this frame-merging technique.

D. Generation of Synthetic Training Data

Training the classifiers on all possible road signs is essential to avoid misclassification of unknown signs. However, gathering a sufficient amount of real data on which to train the classifiers is difficult and time consuming, given the sheer number of different existing signs and the scarcity of particular signs. This is possibly one of the reasons that most other works (and commercial systems such as Mobileye's Traffic Sign Recognition [28]) focus only on a subset of more common signs regularly found in their footage; for example, see [3]–[5]. We further suggest that using only a subset of signs also avoids misclassification against other similar but excluded signs; therefore, in many cases, the quality of the reported results can be unreliable. Our proposed solution to this problem is to use easily available graphical data and synthetically generate
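The frame-merging rule of (3) amounts to accumulating probability-weighted votes until one class's score clears λ. A sketch, with hypothetical class names and an illustrative threshold value:

```python
# Per-frame probabilistic SVM outputs are accumulated as
# probability-weighted votes per class; a class is accepted once its
# score S exceeds the decision threshold lambda (eq. 3).

from collections import defaultdict

def merge_frames(frame_probs, decision_threshold=2.0):
    """frame_probs: per frame, a dict {class_name: probability}."""
    scores = defaultdict(float)
    for probs in frame_probs:
        for cls, p in probs.items():
            scores[cls] += p               # weighted vote, eq. (3)
        best, s = max(scores.items(), key=lambda kv: kv[1])
        if s > decision_threshold:
            return best, s                 # decide as soon as S > lambda
    return None, 0.0                       # not enough evidence yet

frames = [{"stop": 0.8, "give_way": 0.2},
          {"stop": 0.7, "give_way": 0.3},
          {"stop": 0.9, "give_way": 0.1}]
print(merge_frames(frames))  # "stop" clears the threshold on frame 3
```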

Fig. 10. Comparison of real and synthesized data.

variations and distortions of them to create training data for the classifiers. This approach allows us to use the entire range of road signs, avoid tedious manual hand labeling for training purposes, and report more reliable classification results that are a true reflection of a complete search.

The graphical base images that we use were obtained from a free online database provided by the U.K. Department for Transport [2]. Randomized geometric distortions were then applied to replicate the range of distortions likely to be seen in real data and the type of regions likely to be found during the detection stage. Each distorted example image is superimposed over a random section of background, taken from a database of typical background images. Randomized brightness, contrast, noise, blurring, and pixelation are also applied to each image. The complete set of 131 road sign images used for training is shown in Fig. 2. For each sign, 1200 synthetic distorted images were generated. As a means of comparison, Fig. 10 shows a number of real road sign images next to a number of our generated training images.

Fig. 11. Example images from the test set.
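The synthesis pipeline described above might be sketched as follows; the jitter ranges, the tiny template, and the omission of geometric warping and background compositing are all simplifying assumptions made for this illustration:

```python
# Sketch of training-data synthesis: each graphic template receives
# randomized brightness, contrast, and noise.  Pixels are plain 0-255
# ints in nested lists; a real pipeline would also warp the template
# and composite it over random background patches.

import random

def distort(template, seed=None):
    """One randomly jittered copy of a grayscale template."""
    rng = random.Random(seed)
    gain = rng.uniform(0.7, 1.3)           # contrast jitter
    bias = rng.uniform(-30, 30)            # brightness jitter
    out = []
    for row in template:
        out.append([max(0, min(255, round(v * gain + bias
                                          + rng.gauss(0, 5))))
                    for v in row])
    return out

template = [[255, 0, 255], [0, 255, 0]]    # toy 2 x 3 "sign" graphic
variants = [distort(template, seed=i) for i in range(1200)]
print(len(variants), variants[0])          # 1200 distorted copies
```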

IV. EXPERIMENTAL RESULTS

The proposed system can operate at a range of vehicle speeds and was tested under a variety of lighting and weather conditions. A considerable increase in speed was gained by implementing the algorithm in parallel as a pipeline to around 20 frames per second, running on a 3.33-GHz Intel Core i5 central processing unit under OpenCV, where the frame dimensions were 640 × 480. However, the system retained a latency of around 200 ms.

We compare our proposed method (later in Section IV-B) with a road sign detection system that was proposed by Gómez-Moreno et al. [11], which also deals with the entire problem of detection and recognition, detects a relatively large number of road signs (encompassing a variety of different shapes and colors) compared to other methods, and uses SVMs for classification. These factors make the system in [11] a particularly suitable method for comparison purposes.

The system that was proposed by Gómez-Moreno et al. [11] detects candidate regions using color information and performs recognition using SVM based on a training set of between 20 and 100 images per class on an unspecified number of classes. Each frame is segmented using the hue and saturation components of a hue–saturation–intensity (HSI) image. Histograms of hue and saturation are built for red, blue, and yellow sign colors, and created using images with a range of weather and lighting conditions. For the segmentation of white road signs, the image is binarized based on the achromaticity of each pixel, and then, each candidate blob is classified by shape. The distance from the side of the candidate blob to its bounding box is measured at each side (left, right, top, and bottom) at several points. Binary SVMs for each shape are then used to vote for each side of the blob (circle or triangle). If the blob receives four votes for a particular shape, that shape is chosen. An SVM with a Gaussian kernel is then used to classify each sign type based on shape and color. The SVM is trained using pixel values from the candidate region that falls into a template that represents the shape (circle or triangle).

Fig. 12. Confusion matrix for a cascaded classifier with white background (accuracy = 89.2%).

A. Performance of the Proposed Method

To assess the performance of our classifiers, a test data set was collected from frames of road video footage and road sign images obtained from the Internet. This test set included many challenging images affected by geometric distortion, blurring,

Fig. 15. Image that shows the detection of the "Give Way" sign (taken from video 1).

Fig. 13. Confusion matrix for a cascaded classifier with color background (accuracy = 92.1%).

Fig. 16. Image that shows the detection of the "Pedestrians in Road" sign (taken from video 2).

Fig. 14. Chart that shows precision against recall of the system as the decision threshold λ is varied.

Fig. 17. Image that shows the detection of the speed limit sign (taken from video 3).

deterioration, and partial occlusion (see Fig. 11). The accuracy for white road signs was 89.2% and 92.1% for color signs.

The confusion matrices in Figs. 12 and 13 represent classifier results for white background and color background signs, respectively. The values on the x-axis represent the individual road sign classes, and the values on the y-axis represent the predictions made by the classifier. Column 1 of both matrices represents classification as the background (negative) class, and this is shown to have been the most common misclassification at 7.0%, with some misclassification between nonbackground classes at 3.4%. This case is preferred, because in the overall system, decisions are formed over several frames, and regions that are classified as background are simply ignored.

Fig. 14 shows the precision of the system plotted against recall as the decision threshold λ is varied. It is shown in this graph that, as λ is reduced to increase the number of detections, the precision of the system falls as the number of false positives increases.

B. Comparative Analysis

For the Gómez-Moreno et al. method, between 20 and 100 real training images per class were used to train the SVM classifiers for recognition, as suggested in [11]. For test data, we used several videos, filmed under a range of weather conditions, at a variety of different vehicle speeds. Video 1 was filmed in clear weather conditions, at low speeds of around 20 mi/h. Video 2 was filmed in thick fog, at high vehicle speeds, e.g., above 50 mi/h. Video 3 was filmed in clear weather conditions, at a variety of vehicle speeds, ranging from 20 to 60 mi/h. An example frame from each video, with results overlaid from our system, is shown in Figs. 15–17.

TABLE II
COMPARATIVE RESULTS FOR GÓMEZ-MORENO ET AL.'S SYSTEM [11] AND THE PROPOSED METHOD. THE TOTAL NUMBER OF SIGNS WAS 14 IN VIDEO 1, 5 IN VIDEO 2, AND 38 IN VIDEO 3

The results in Table II show that our proposed method outperformed the method used in [11]. Although their detection method classified reasonably well in clear weather conditions, scenes that suffer from poor lighting conditions and strong illumination caused it to fail. Our MSER detection system provides robustness by searching for candidate regions at a range of thresholds rather than using a single fixed value. The recognition method that was proposed by Gómez-Moreno et al. [11] also produced a large number of false positives. Our approach of using the HOG feature descriptor with SVM performed better than directly using pixel values.

Our method was also tested without the use of frame merging, which was described in Section III-C. Removing this part of the system reduced the total precision to 67.7%, a considerable drop from the 86.8% achieved with the use of frame merging. Although the reported results are still too low for use in practice, the performance is high, given the large number of classes recognized.

C. Performance of the Synthetically Generated Test Set

To assess the relevance of the concept of synthetically generated training data, a comparison was made between a classifier that was trained on real data from the GTSRB [29] and a classifier that was trained on synthetic training data.

A training data set of German road signs was generated from graphics-based images, with a single example for each class, resulting in a total of 43 classes. A total of 1200 synthetic images were generated for each class, and HOG features were calculated for each image. A data set of HOG features was also created from the images contained in the GTSRB training data set. A linear SVM classifier was trained for both the real data set and the generated synthetic data set. Both classifiers were then tested using the GTSRB test data set.

The classifier that was trained on the synthetic data gave an accuracy of 85.7%, and the classifier that was trained on real data gave an accuracy of 85.9%. Based on these results, it is shown that the synthetic data set produced results comparable to a data set of hand-labeled real images. Although the results for real training data were slightly higher than for the synthetic data, the use of synthetic data allowed the tedious, time-consuming process of manually hand labeling a large data set of real images to be avoided.

To show that the features learned by the classifier relate only to the road signs and not to background information, the classifier was also tested using a data set that comprises synthetically generated images, but with different backgrounds from those in the training set. The accuracy achieved for this experiment was 97.6%, which verified the claim.

To more thoroughly validate the system, another classifier was trained, with a data set that contains real images and synthetically generated interpolations, created using randomly distorted versions of the real images. The total number of images in this data set was 43 509. This classifier had an overall accuracy of 89.2%, which was greater than either the fully synthetic or the fully real data set.

V. CONCLUSION

We have proposed a novel real-time system for the automatic detection and recognition of traffic symbols. Candidate regions are detected as MSERs. This detection method is significantly insensitive to variations in illumination and lighting conditions. Traffic symbols are recognized using HOG features and a cascade of linear SVM classifiers. A method for the synthetic generation of training data has been proposed, which allows large data sets to be generated from template images, removing the need for hand-labeled data sets. Our system can identify signs from the whole range of ideographic traffic symbols currently in use in the U.K. [2], which form the basis of our training data. The system retains a high accuracy at a variety of vehicle speeds.

ACKNOWLEDGMENT

The authors would like to thank Dr. A. Dunoyer and Dr. T. Popham of Jaguar Research for their support.

REFERENCES

[1] J. Matas, "Robust wide-baseline stereo from maximally stable extremal regions," Image Vis. Comput., vol. 22, no. 10, pp. 761–767, Sep. 2004.
[2] Dept. Transp., London, U.K., Traffic Signs Image Database, 2011.
[3] M. Hossain, M. Hasan, M. Ali, M. Kabir, and A. Ali, "Automatic detection and recognition of traffic signs," in Proc. RAM, 2010, pp. 286–291.
[4] K. Ohgushi and N. Hamada, "Traffic sign recognition by bags of features," in Proc. TENCON, 2009, pp. 1–6.
[5] H. Pazhoumand-Dar and M. Yaghobi, "DTBSVMs: A new approach for road sign recognition," in Proc. ICCICSN, Jul. 2010, pp. 314–319.
[6] D. Kang, N. Griswold, and N. Kehtarnavaz, "An invariant traffic sign recognition system based on sequential color processing and geometrical transformation," in Proc. IAI, 1994, pp. 88–93.
[7] P. Paclik and J. Novovicová, "Road sign classification without color information," in Proc. ASIC, Lommel, Belgium, 2000.

[8] S. Vitabile, G. Pollaccia, G. Pilato, and E. Sorbello, “Road signs recognition using a dynamic pixel aggregation technique in the HSV color space,” in Proc. IAP, 2001, pp. 572–577.
[9] C. Bahlmann, Y. Zhu, V. Ramesh, M. Pellkofer, and T. Koehler, “A system for traffic sign detection, tracking, and recognition using color, shape, and motion information,” in Proc. IVS, 2005, pp. 255–260.
[10] X. Gao, L. Podladchikova, D. Shaposhnikov, K. Hong, and N. Shevtsova, “Recognition of traffic signs based on their color and shape features extracted using human vision models,” J. Vis. Commun. Imag. Represent., vol. 17, no. 4, pp. 675–685, Aug. 2006.
[11] S. Maldonado-Bascón, S. Lafuente-Arroyo, P. Gil-Jimenez, H. Gomez-Moreno, and F. Lopez-Ferreras, “Road-sign detection and recognition based on support vector machines,” IEEE Trans. Intell. Transp. Syst., vol. 8, no. 2, pp. 264–278, Jun. 2007.
[12] R. Malik, J. Khurshid, and S. Ahmad, “Road sign detection and recognition using color segmentation, shape analysis and template matching,” in Proc. ICMLC, Aug. 2007, vol. 6, pp. 3556–3560.
[13] M. Rahman, F. Mousumi, E. Scavino, A. Hussain, and H. Basri, “Real-time road sign recognition system using artificial neural networks for Bengali textual information box,” in Proc. ITSim, 2008, vol. 2, pp. 1–8.
[14] Y. Wang, M. Shi, and T. Wu, “A method of fast and robust for traffic sign recognition,” in Proc. ICIG, 2009, pp. 891–895.
[15] A. Ruta, Y. Li, and X. Liu, “Real-time traffic sign recognition from video by class-specific discriminative features,” Pattern Recognit., vol. 43, no. 1, pp. 416–430, Jan. 2010.
[16] Y. Gu, T. Yendo, M. Tehrani, T. Fujii, and M. Tanimoto, “A new vision system for traffic sign recognition,” in Proc. IV, 2010, pp. 7–12.
[17] E. Cardarelli, P. Medici, P. P. Porta, and G. Ghisio, “Road sign shapes detection based on Sobel phase analysis,” in Proc. IEEE IVS, 2009, pp. 376–381.
[18] H. Huang, C. Chen, Y. Jia, and S. Tang, “Automatic detection and recognition of road sign,” in Proc. ICMESA, 2008, pp. 626–630.
[19] P. Wanitchai and S. Phiphobmongkol, “Traffic warning signs detection and recognition based on fuzzy logic and chain code analysis,” in Proc. IITA, 2008, pp. 508–512.
[20] C. G. Kiran, L. V. Prabhu, R. V. Abdu, and K. Rajeev, “Traffic sign detection and pattern recognition using support vector machine,” in Proc. ICAPR, 2009, pp. 87–90.
[21] W. Shadeed, D. Abu-Al-Nadi, and M. Mismar, “Road traffic sign detection in color images,” in Proc. ICECS, 2003, vol. 2, pp. 890–893.
[22] G. Loy, “Fast shape-based road sign detection for a driver assistance system,” in Proc. IROS, 2004, pp. 70–75.
[23] I. Creusen, R. Wijnhoven, and E. Herbschleb, “Color exploitation in HOG-based traffic sign detection,” in Proc. ICIP, 2010, pp. 2669–2672.
[24] G. Overett, “Large-scale sign detection using HOG feature variants,” in Proc. IV, 2011, pp. 326–331.
[25] A. Reiterer, “Automated traffic sign detection for modern driver assistance systems,” in Proc. FCBC, Sydney, Australia, 2010, pp. 11–16.
[26] J. F. Khan, S. M. A. Bhuiyan, and R. R. Adhami, “Image segmentation and shape analysis for road-sign detection,” IEEE Trans. Intell. Transp. Syst., vol. 12, no. 1, pp. 83–96, Mar. 2011.
[27] Volkswagen Media Services, (2012, Jan. 9), Phaeton debuts with new design and new technologies. [Online]. Available: https://www.volkswagen-media-services.com/medias_publish/ms/content/en/pressemitteilungen/2010/04/22/phaeton_debuts_with.standard.gid-oeffentlichkeit.html
[28] Mobileye, (Oct. 26, 2011). Traffic Sign Detection, [Online]. Available: http://mobileye.com/technology/applications/traffic-sign-detection/
[29] J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel, “German traffic sign recognition benchmark: A multiclass classification competition,” in Proc. IJCNN, 2011, pp. 1453–1460.
[30] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. CVPR, 2005, pp. 886–893.
[31] C. Cortes and V. Vapnik, “Support vector networks,” J. Mach. Learn., vol. 20, no. 3, pp. 273–297, Sep. 1995.

Jack Greenhalgh received the M.Eng. degree in computer systems engineering from the University of Sussex, Brighton, U.K., in 2010. He is currently working toward the Ph.D. degree, with a focus on driver assistance using automated sign and text recognition, in the Visual Information Laboratory, University of Bristol, Bristol, U.K.
His research interests include image processing, computer vision, machine learning, and intelligent transportation systems.

Majid Mirmehdi (SM’10) received the B.Sc. (Hons.) and Ph.D. degrees in computer science from the City University, London, U.K., in 1985 and 1991, respectively.
He is currently a Professor of computer vision with the Visual Information Laboratory, University of Bristol, Bristol, U.K., where he is also the Graduate Dean and the Head of the Graduate School of Engineering. He is the Editor-in-Chief of the IET Computer Vision Journal and an Associate Editor for Pattern Analysis and Applications. His research interests include natural-scene analysis and medical imaging, and he has published more than 150 refereed journal publications and conference proceedings in these and other areas.
Dr. Mirmehdi is a Fellow of the International Association for Pattern Recognition (IAPR) and a Member of the Institution of Engineering and Technology (IET). He serves on the Executive Committee of the British Machine Vision Association.