Licence Plate Detection
Automatic number-plate recognition (ANPR) is a technology that
uses optical character recognition on images to read vehicle registration plates to create vehicle
location data. It can use existing closed-circuit television, road-rule enforcement cameras, or
cameras specifically designed for the task. ANPR is used by police forces around the world for
law enforcement purposes, including to check if a vehicle is registered or licensed. It is also used
for electronic toll collection on pay-per-use roads and as a method of cataloguing the movements
of traffic, for example by highways agencies.
Automatic number plate recognition can be used to store the images captured by the cameras as
well as the text from the license plate, with some configurable to store a photograph of the
driver. Systems commonly use infrared lighting to allow the camera to take the picture at any
time of day or night.[1][2][3] ANPR technology must take into account plate variations from place
to place.
Concerns about these systems have centered on privacy fears of government tracking citizens’
movements, misidentification, high error rates, and increased government spending. Critics have
described it as a form of mass surveillance.
Algorithms:
There are seven primary algorithms that the software requires for identifying a license plate:
1. Plate localization – responsible for finding and isolating the plate on the picture.
2. Plate orientation and sizing – compensates for the skew of the plate and adjusts the
dimensions to the required size.
3. Normalization – adjusts the brightness and contrast of the image.
4. Character segmentation – finds the individual characters on the plates.
5. Optical character recognition.
6. Syntactical/Geometrical analysis – check characters and positions against country-
specific rules.
7. The averaging of the recognized value over multiple fields/images to produce a more
reliable or confident result, since any single image may contain a reflected light flare, be
partially obscured, or suffer from some other temporary effect. A minimal sketch tying these
steps together is given below.
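As a rough illustration of how these stages fit together, the sketch below wires a crude localization step to an off-the-shelf OCR engine. It is a minimal sketch, not a production ANPR system: the kernel sizes, the aspect-ratio test, the Tesseract `--psm 7` mode, and the file name are all illustrative assumptions.

```python
# Minimal sketch of the ANPR pipeline above, using OpenCV and Tesseract.
import cv2
import pytesseract  # assumes the Tesseract OCR engine is installed

def read_plate(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # 1. Plate localization (crude stand-in: strong vertical edges, closed
    #    into plate-shaped blobs).
    edges = cv2.Sobel(gray, cv2.CV_8U, 1, 0, ksize=3)
    _, thresh = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 3))
    closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h > 0 and w / h > 3:                       # plate-like aspect ratio
            plate = gray[y:y + h, x:x + w]
            # 2./3. Sizing and normalization.
            plate = cv2.resize(plate, (240, 60))
            plate = cv2.equalizeHist(plate)
            # 4./5. Segmentation + OCR delegated to Tesseract here.
            text = pytesseract.image_to_string(plate, config="--psm 7")
            # 6. A syntactic check would validate `text` against plate formats.
            return text.strip()
    return None
```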
Bilateral filtering is a technique to smooth images while preserving edges. It can be traced back
to 1995 with the work of Aurich and Weule [4] on nonlinear Gaussian filters. It was later
rediscovered by Smith and Brady [59] as part of their SUSAN framework, and Tomasi and
Manduchi [63], who gave it its current name. Since then, the use of bilateral filtering has grown
rapidly and is now ubiquitous in image processing applications (Figure 1.1). It has been used in
various contexts such as denoising [1, 10, 41], texture editing and relighting [48], tone
management [5, 10, 21, 22, 24, 53], demosaicking [56], stylization [72], and optical-flow
estimation [57, 74]. The bilateral filter has several qualities that explain its success:
• Its formulation is simple: each pixel is replaced by a weighted average of its neighbors. This
aspect is important because it makes it easy to acquire intuition about its behavior, to adapt it to
application-specific requirements, and to implement it.
• It depends only on two parameters that indicate the size and contrast of the features to preserve.
• It can be used in a non-iterative manner. This makes the parameters easy to set since their effect
is not cumulative over several iterations.
Fig: The bilateral filter converts any input image (a) to a smoothed version (b). It removes most
texture, noise, and fine details, but preserves large sharp edges without blurring.
It can be computed at interactive speed even on large images, thanks to efficient numerical
schemes [21, 23, 55, 54, 50, 71], and even in real time if graphics hardware is available [16]. In
parallel to applications, a wealth of theoretical studies [6, 7, 13, 21, 23, 46, 50, 60, 65, 66]
explain and characterize the bilateral filter’s behavior. The strengths and limitations of bilateral
filtering are now fairly well understood. As a consequence, several extensions have been
proposed [14, 19, 23]. This paper is organized as follows. Section 2 presents linear Gaussian
filtering and the nonlinear extension to the bilateral filter. Section 3 revisits several recent, novel
and challenging applications of bilateral filtering. Section 4 compares different ways to
implement the bilateral filter efficiently. Section 5 presents several links of bilateral filtering with
other frameworks and also different ways to interpret it. Section 6 exposes extensions and
variants of the bilateral filter.
Figure: Illustration of a 1-D bilateral filter. The filter is applied to a 1-D step signal corrupted
by random Gaussian noise. The top-right plot shows the noisy input signal; the top-left plot shows
the intensity (range) Gaussian, and the middle plot the spatial Gaussian. The bilateral response is
shown at the bottom.
Another parameter when running the bilateral filter is the window size, i.e., how many pixels are
processed at a time. The window size is related to the spatial Gaussian: from the properties of the
Gaussian distribution, the window radius should be around 2 to 3 times the standard deviation of
the Gaussian, since beyond 3 times sigma the Gaussian weight is almost zero. Some research has
shown that the bilateral filter is identical to the first iteration of the Jacobi algorithm
(diagonally normalized steepest descent) with a specific cost function. Elad et al. [7] related the
bilateral filter to anisotropic diffusion.
Wavelet thresholding is a denoising method that applies shrinkage to the high-frequency
components after a wavelet decomposition. There are two basic thresholding methods, hard
thresholding and soft thresholding, which differ in how the coefficients are modified once the
threshold value is computed. Generally speaking, soft thresholding is used in wavelet-thresholding
applications. The most common wavelet thresholding methods are BayesShrink [10], VisuShrink [9],
and SUREShrink [8].
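A minimal sketch of soft-threshold wavelet denoising using PyWavelets, with a VisuShrink-style universal threshold; the wavelet (`db4`), the decomposition level, and the MAD-based noise estimate are assumptions, not prescriptions from the text.

```python
# Soft-threshold wavelet denoising sketch (VisuShrink-style threshold).
import numpy as np
import pywt

def wavelet_denoise(img, wavelet="db4", level=2):
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    # Estimate the noise level from the finest diagonal detail band (MAD).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    # Universal (VisuShrink) threshold: sigma * sqrt(2 * log n).
    t = sigma * np.sqrt(2 * np.log(img.size))
    # Soft-threshold every detail band; leave the approximation untouched.
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(band, t, mode="soft") for band in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(denoised, wavelet)
```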
The bilateral filter is also defined as a weighted average of nearby pixels, in a manner very
similar to Gaussian convolution. The difference is that the bilateral filter takes into account the
difference in value with the neighbors to preserve edges while smoothing. The key idea of the
bilateral filter is that for a pixel to influence another pixel, it should not only occupy a nearby
location but also have a similar value. The formalization of this idea goes back in the literature to
Yaroslavsky [77], Aurich and Weule [4], Smith and Brady [59] and Tomasi and Manduchi [63].
The bilateral filter, denoted by BF[·], is defined by

BF[I]p = (1/Wp) Σq∈S Gσs(‖p − q‖) Gσr(|Ip − Iq|) Iq,   with   Wp = Σq∈S Gσs(‖p − q‖) Gσr(|Ip − Iq|)

where the normalization factor Wp ensures that the pixel weights sum to 1, Gσs is a spatial
Gaussian weighting that decreases the influence of distant pixels, and Gσr is a range Gaussian that
decreases the influence of pixels q when their intensity values differ from Ip. Figure 1.1 shows a
sample output of the bilateral filter and Figure 2.2 illustrates how the weights are computed for
one pixel near an edge.
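The definition translates almost directly into code. Below is a small 1-D NumPy transcription of the formula; the window radius follows the 2–3σ rule discussed earlier, and the default σ values are arbitrary.

```python
# 1-D bilateral filter, a direct transcription of the formula above.
import numpy as np

def bilateral_1d(signal, sigma_s=3.0, sigma_r=0.1):
    radius = int(np.ceil(3 * sigma_s))                # ~3-sigma window rule
    offsets = np.arange(-radius, radius + 1)
    spatial = np.exp(-offsets**2 / (2 * sigma_s**2))  # G_sigma_s
    out = np.empty_like(signal, dtype=float)
    padded = np.pad(signal.astype(float), radius, mode="edge")
    for p in range(len(signal)):
        window = padded[p:p + 2 * radius + 1]
        rng = np.exp(-(window - signal[p])**2 / (2 * sigma_r**2))  # G_sigma_r
        w = spatial * rng
        out[p] = np.sum(w * window) / np.sum(w)       # normalize by W_p
    return out
```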
Parameters:
The bilateral filter is controlled by two parameters: σs and σr. Figure 2.3 illustrates their effect.
• As the spatial parameter σs increases, larger features of the image get smoothed.
• As the range parameter σr increases, the bilateral filter gradually approximates Gaussian
convolution more closely, because the range Gaussian Gσr widens and flattens, i.e., becomes nearly
constant over the intensity interval of the image.
In practice, in the context of denoising, Liu et al. [41] show that adapting the range parameter σr
to estimates of the local noise level yields more satisfying results. The authors recommend a
linear dependence: σr = 1.95 σn, where σn is the local noise level estimate. An important
characteristic of bilateral filtering is that the two weights are multiplied: if either of them is
close to zero, no smoothing occurs. As an example, a large spatial Gaussian coupled with a narrow
range Gaussian achieves limited smoothing despite the large spatial extent, because the range weight
enforces a strict preservation of the contours.
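In OpenCV this corresponds to `cv2.bilateralFilter`, whose `sigmaColor` and `sigmaSpace` arguments play the roles of σr and σs. A usage sketch with Liu et al.'s σr = 1.95 σn rule, where the noise estimate σn is a placeholder value:

```python
# Applying the bilateral filter with OpenCV. sigmaColor is the range
# parameter (sigma_r), sigmaSpace the spatial one (sigma_s).
import cv2

img = cv2.imread("plate.jpg", cv2.IMREAD_GRAYSCALE)

sigma_n = 5.0                      # assumed local noise level estimate
sigma_r = 1.95 * sigma_n           # Liu et al.'s linear rule
sigma_s = 3.0                      # spatial extent in pixels

# d=-1 lets OpenCV derive the window diameter from sigmaSpace.
smoothed = cv2.bilateralFilter(img, d=-1, sigmaColor=sigma_r, sigmaSpace=sigma_s)
```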
The histogram equalization algorithm has been a conventional image enhancement algorithm for
its simplicity and efficiency. It adjusts the gray level of an image according to the probability
distribution function of the image and enlarges the dynamic range of the gray distribution to
improve visual effects of the image [1]. The histogram equalization algorithm may be divided
into two types: local histogram equalization and global histogram equalization. The local
histogram equalization may well enhance local details of the image and it may be divided into
three types: overlapping sub-block, no overlapping sub block, and partially overlapping sub-
block. The non-overlapping sub-block method is very rarely used for its obvious square effects;
the overlapping sub-block method is also not used in practice for its large amount of calculation
and low processing speed; the partially overlapping sub-block method can speed up the
calculation, but it is relatively complex [2]. Compared to the local histogram equalization
algorithm, the global algorithm has certain advantages in processing speed, but has
disadvantages in enhancing effects. To improve the enhancement effects, references [2] to [9]
introduce many improved algorithms. Based on the conventional histogram equalization algorithm and
the prophase study of [10] and [11], we analyze the relationship of the mapping gray levels and
introduce the definition of the gray-level difference, gray threshold setting, and identification
methods. Then, we use the information entropy as the target function to get the parameter β in the gray-level
mapping formula. According to the threshold, the improved algorithm may automatically
identify the gray level of the image and adaptively adjust the spacing of two adjacent gray levels
in the new histogram. Thus, it may effectively improve visual effects for any image under the
premise of the same information entropy.
Let ng be the number of pixels with gray level g in an image with n pixels and L gray levels; then
the probability density function (PDF) of the image is given by Equation (1), and the cumulative
density function (CDF) is defined in Equation (2):

p(g) = ng / n                               (1)

c(g) = Σk=0..g p(k)                         (2)
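A minimal NumPy sketch of global histogram equalization built directly on Equations (1) and (2); the final mapping g → round((L − 1)·c(g)) is the standard equalization step, assumed here for an 8-bit image.

```python
# Global histogram equalization: build the PDF, integrate to a CDF,
# then remap gray levels through the scaled CDF.
import numpy as np

def equalize(img, levels=256):
    hist = np.bincount(img.ravel(), minlength=levels)
    pdf = hist / img.size                     # equation (1)
    cdf = np.cumsum(pdf)                      # equation (2)
    mapping = np.round((levels - 1) * cdf).astype(img.dtype)
    return mapping[img]
```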
Though this method is simple, it fails in myocardial nuclear images since the gray values are
physically far apart from each other in the image. Due to this reason, histogram equalization
gives very poor results for myocardial images.
In contrast limited histogram equalization (CLHE), the histogram is cut at some threshold and
then equalization is applied. Contrast limited adaptive histogram equalization (CLAHE) is an
adaptive contrast histogram equalization method [7-10], where the contrast of an image is
enhanced by applying CLHE on small data regions called tiles rather than the entire image. The
resulting neighboring tiles are then stitched back seamlessly using bilinear interpolation. The
contrast in the homogeneous region can be limited so that noise amplification can be avoided.
Contrast Limited Adaptive Histogram Equalization for Myocardial Perfusion
Images in Color Space:
In this method, the image which is read in RGB space is converted into the color space with a
luminance (Y) and two chrominance components (Cb, Cr) by using the relation given in
Equation (3).
The two chrominance channels are separated and for each chrominance channel the number of
rectangular contextual tiles into which the image is divided is obtained. The optimal value for
this is decided experimentally. Uniform distribution is used as the basis for creating the contrast
transform function. Let ic_min and ic_max be the minimum and maximum permissible intensity
levels; the optimal value of the clip limit is likewise set experimentally. Let Fk(ic_in) be the
cumulative distribution function for input contextual tile ic_in. Then the expression of the modified
chrominance channel tile with uniform distribution is given in Equation (4). The flowchart for
the method is given in Figure 1.
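A rough OpenCV sketch of the procedure, assuming an 8×8 tile grid and a clip limit of 2.0 (values the text says should be chosen experimentally), and applying CLAHE to the two chrominance planes as described:

```python
# CLAHE applied in YCbCr space, as described above. Tile-grid size and
# clip limit are assumed values to be tuned experimentally.
import cv2

img = cv2.imread("perfusion.png")                     # BGR image
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)        # Y, Cr, Cb planes
y, cr, cb = cv2.split(ycrcb)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
cr_eq = clahe.apply(cr)                               # per-chrominance CLAHE
cb_eq = clahe.apply(cb)

enhanced = cv2.cvtColor(cv2.merge((y, cr_eq, cb_eq)), cv2.COLOR_YCrCb2BGR)
```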
This section elucidates the number plate extraction method for a single-line number plate. Input to
the system is a vehicle image (with a clear view of the number plate in it) captured by a digital
camera, and output is the actual characters present in that vehicle image. Each character present
on the input number plate image should have a minimum resolution of 24 × 42 pixels, and the
distance of the number plate from the camera should be such that it guarantees a clear view of the
numbers present on the plate. The proposed algorithm consists of the following steps:
• Image Preprocessing
• Number Plate Extraction
• Character Segmentation
• Image Matching
Image Preprocessing:
Preprocessing is used to enhance the contrast of the image and for resizing of the image. The RGB
image is converted to a grayscale image, which carries the intensity information. RGB values are
converted to grayscale values by forming a weighted sum of the R, G, and B components:

Gray = 0.2989 R + 0.5870 G + 0.1140 B
Contrast Enhancement:
Captured images may have unevenly distributed lighting or darkness. During edge detection, fine
edge details in dark regions of the image are eliminated, while feature edges in bright regions
need to be preserved. The top-hat transformation is used to preserve these fine edge details as
well as the prominent ones, since the main property of the top-hat operator is its use for contrast
enhancement. After applying top-hat filtering, the image is converted from grayscale to binary
using Otsu’s algorithm. Figure 1a–d shows the various stages of image preprocessing.
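A short OpenCV sketch of this preprocessing chain, with an assumed 21×21 structuring element for the top-hat step:

```python
# Preprocessing sketch: grayscale conversion, top-hat filtering, then
# Otsu binarization.
import cv2

img = cv2.imread("car.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)          # weighted RGB sum

# White top-hat: image minus its morphological opening; this keeps small
# bright details (characters) while suppressing uneven illumination.
se = cv2.getStructuringElement(cv2.MORPH_RECT, (21, 21))
tophat = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, se)

# Otsu's algorithm picks the global threshold automatically.
_, binary = cv2.threshold(tophat, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```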
System’s speed and accuracy are enhanced using precise number plate area extraction. At this
stage, number plate area is extracted from entire preprocessed image. This step reduces
processing burden on next stage of identification of numbers from license plate area.
Fig: a) Original image, b) Gray image, c) Top-hat filtered image, d) Binary image, e) Dilated image,
f) Selected object, g) Extracted plate
A morphological operator is employed to the preprocessed image for extracting the plate area.
Morphological operator that suits the rectangular shape of number plate is used for this purpose.
Binary image is dilated with a rectangular box as the structuring element. After the dilation
operation, horizontal objects with Aspect Ratio (R) > 3 are filtered out of the dilated image,
since number plates generally have a much larger length than height. Here Aspect Ratio =
width/height, and R defines the region of interest. Upon detection of the plate area, the top-left
and bottom-right coordinates are extracted, and these coordinates are then used for selecting the
plate area from the original image.
Figure 1e–g shows various stages involved in plate area extraction. Figure 1e shows dilated
image. Filtered region (selected object) from unwanted region is shown in Fig. 1f. Figure 1g
shows extracted plate area.
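A sketch of the plate-extraction step as described, with an assumed 9×3 rectangular structuring element; it reuses `img` and `binary` from the preprocessing sketch above:

```python
# Plate-area extraction sketch: dilate the binary image, then keep
# connected components whose aspect ratio exceeds 3.
import cv2

se = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 3))
dilated = cv2.dilate(binary, se)                      # `binary` from preprocessing

contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
plate = None
for c in contours:
    x, y, w, h = cv2.boundingRect(c)                  # top-left corner + size
    if h > 0 and w / h > 3:                           # R = width/height > 3
        plate = img[y:y + h, x:x + w]                 # crop from original image
        break
```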
Character Segmentation:
In this step, the number plate is segmented to obtain the characters present in it. Morphological
operations are used in the segmentation process. The dilation of an image I by the structuring
element H is given by the set operation

I ⊕ H = { p : Hp ∩ I ≠ ∅ }

where Hp denotes H translated so that its origin lies at p.
Algorithm for the segmentation process is as follows:
• Convert the extracted plate image into a grayscale image (Fig. 2a).
Fig. 2 a) Gray image, b) Inverted binary image, c) Dilated image, d) Eroded image, e) Subtracted image,
f) Convolved image, g) Erosion with horizontal structuring element, h) Subtracted after erosion
• Binarize the image using Otsu’s algorithm and invert the pixel values of the binarized image,
i.e., make 0 into 1 and 1 into 0 for further processing. The result of binarization is shown in Fig. 2b.
• Apply a morphological gradient for edge enhancement. Figure 2c shows the dilated image with a
disk as structuring element, Fig. 2d shows the result of erosion on the binary image of Fig. 2b,
and Fig. 2e shows the result of subtracting the eroded image from the dilated image.
• Convolve the subtracted image with the kernel [1 1; 1 1] to brighten the edges (Fig. 2f).
• Erode with horizontal line as structuring element to eliminate the possible horizontal lines
from the output image after subtraction operation (Fig. 2g).
• Subtract eroded image from convolved image (Fig. 2h).
• Calculate properties of the connected components, i.e., the objects present in the image, using
4- and 8-neighborhood labeling algorithms. The following properties of each object are calculated:
(i) [x y], which specifies the upper left corner of the object; (ii) [x_width y_width], which
specifies the width of the object along each dimension.
• Calculate histograms of all objects based on y-dimension and y-width. Find the intersection of
the number of objects in the histogram based on y-dimension and the histogram based on y-width, to
determine the exact number of characters present in the image. Use 4-, 8-neighborhood connectivity
algorithms to find the bounding box for individual objects in the image. Using the result of the
histogram analysis, only those objects which have been identified in the intersection of the
histograms are passed on for further processing. The result of object segmentation is shown in
Fig. 3c. Results for two more samples are shown in Figs. 4 and 5.
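A condensed OpenCV sketch of this segmentation chain (morphological gradient, horizontal-line removal, connected-component filtering); the structuring-element sizes and the character-height bounds are assumptions:

```python
# Character-segmentation sketch following the steps above.
import cv2

gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

disk = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
gradient = cv2.subtract(cv2.dilate(bw, disk), cv2.erode(bw, disk))

# Remove residual horizontal lines (plate borders) with a wide, flat element.
hline = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 1))
cleaned = cv2.subtract(gradient, cv2.erode(gradient, hline))

# Keep components of plausible character height; bounds are assumptions.
n, labels, stats, _ = cv2.connectedComponentsWithStats(cleaned, connectivity=8)
chars = [stats[i, :4] for i in range(1, n)
         if 10 < stats[i, cv2.CC_STAT_HEIGHT] < 60]      # [x, y, w, h] boxes
```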
Fig. 3 a) Holes filled, b) Thinning, c) Segmented characters, d) Extracted characters stored in text file
Fig. 4 a) Sample image Fig. 2, b) Extracted plate, c) Processed image, d) Extracted letters, e) Extracted
characters in text file
Fig. 5 a) Sample image Fig. 3, b) Extracted plate, c) Processed image, d) Extracted letters, e) Extracted
characters in text file
Image Matching:
Each segmented character image is then compared against database images (the database is a set of
sample character images in different fonts and with different styles of hole filling), and the
correlation between the test character image and each database image is found. The correlation
coefficient between image A and image B can be calculated using

r = Σm Σn (Amn − Ā)(Bmn − B̄) / √( [Σm Σn (Amn − Ā)²] [Σm Σn (Bmn − B̄)²] )

where Ā = mean(A) and B̄ = mean(B). The database image for which maximum correlation is
obtained is the identified character. This identified character is then stored in a text file as shown
in Fig. 3(d).
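A small sketch of this matching step; `templates` is an assumed dictionary mapping character labels to same-size database images:

```python
# Template matching with the correlation coefficient defined above.
import numpy as np

def corr2(a, b):
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a**2).sum() * (b**2).sum())
    return (a * b).sum() / denom if denom else 0.0

def identify(char_img, templates):
    # `templates` maps a character label to an image of the same size,
    # e.g. {"A": imgA, "B": imgB, ...}; resizing is assumed done beforehand.
    return max(templates, key=lambda k: corr2(char_img, templates[k]))
```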
Morphological Operations:
Binary images may contain numerous imperfections. In particular, the binary regions produced
by simple thresholding are distorted by noise and texture. Morphological image processing
pursues the goals of removing these imperfections by accounting for the form and structure of
the image. These techniques can be extended to grayscale images.
Morphological techniques probe an image with a small shape or template called a structuring
element. The structuring element is positioned at all possible locations in the image and it is
compared with the corresponding neighborhood of pixels. Some operations test whether the
element "fits" within the neighborhood, while others test whether it "hits" or intersects the
neighborhood.
A morphological operation on a binary image creates a new binary image in which the pixel has
a non-zero value only if the test is successful at that location in the input image.
The structuring element is a small binary image, i.e. a small matrix of pixels, each with a value
of zero or one.
A common practice is to have odd dimensions of the structuring matrix and the origin defined as
the centre of the matrix. Structuring elements play in morphological image processing the same
role as convolution kernels in linear image filtering.
When a structuring element is placed in a binary image, each of its pixels is associated with the
corresponding pixel of the neighborhood under the structuring element. The structuring element
is said to fit the image if, for each of its pixels set to 1, the corresponding image pixel is also 1.
Similarly, a structuring element is said to hit, or intersect, an image if, for at least one of its
pixels set to 1, the corresponding image pixel is also 1.
Zero-valued pixels of the structuring element are ignored, i.e. indicate points where the
corresponding image value is irrelevant.
Fundamental operations:
More formal descriptions and examples of how basic morphological operations work is given in
the Hypermedia Image Processing Reference (HIPR) developed by Dr. R. Fisher et al. at the
Department of Artificial Intelligence in the University of Edinburgh, Scotland, UK.
The erosion of a binary image f by a structuring element s (denoted f ⊖ s) produces a new binary
image g = f ⊖ s with ones in all locations (x, y) of the structuring element's origin at which the
structuring element s fits the input image f, i.e. g(x, y) = 1 if s fits f and 0 otherwise,
repeating for all pixel coordinates (x, y).
Erosion with small (e.g. 2×2 - 5×5) square structuring elements shrinks an image by stripping
away a layer of pixels from both the inner and outer boundaries of regions. The holes and gaps
between different regions become larger, and small details are eliminated.
Larger structuring elements have a more pronounced effect, the result of erosion with a large
structuring element being similar to the result obtained by iterated erosion using a smaller
structuring element of the same shape. If s1 and s2 are a pair of structuring elements identical in
shape, with s2 twice the size of s1, then

f ⊖ s2 ≈ (f ⊖ s1) ⊖ s1
Erosion removes small-scale details from a binary image but simultaneously reduces the size of
regions of interest, too. By subtracting the eroded image from the original image, boundaries of
each region can be found: b = f − (f ⊖ s), where f is an image of the regions, s is a 3×3
structuring element, and b is an image of the region boundaries.
Dilation has the opposite effect: the holes enclosed by a single region and gaps between different
regions become smaller, and small intrusions into boundaries of a region are filled in.
Results of dilation or erosion are influenced both by the size and shape of a structuring element.
Dilation and erosion are dual operations in that they have opposite effects. Let f c denote the
complement of an image f, i.e., the image produced by replacing 1 with 0 and vice versa.
Formally, the duality is written as

f ⊖ s = (f c ⊕ ŝ)c

where ŝ denotes the reflection of s, i.e. the structuring element mirrored about its origin.
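A short SciPy sketch of erosion, dilation, and the boundary-extraction identity b = f − (f ⊖ s):

```python
# Erosion, dilation, and boundary extraction with SciPy.
import numpy as np
from scipy import ndimage

f = np.zeros((9, 9), dtype=bool)
f[2:7, 2:7] = True                       # a single square region
s = np.ones((3, 3), dtype=bool)          # 3x3 structuring element

eroded  = ndimage.binary_erosion(f, structure=s)
dilated = ndimage.binary_dilation(f, structure=s)
boundary = f & ~eroded                   # b = f − (f ⊖ s) for binary images
```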
Thresholding:
Segmentation involves separating an image into regions (or their contours) corresponding to
objects. We usually try to segment regions by identifying common properties. Or, similarly, we
identify contours by identifying differences between regions (edges). The simplest property that
pixels in a region can share is intensity. So, a natural way to segment such regions is through
thresholding, the separation of light and dark regions. Thresholding creates binary images from
grey-level ones by turning all pixels below some threshold to zero and all pixels above that
threshold to one. (What you want to do with pixels at the threshold doesn’t matter, as long as
you’re consistent.) If g(x, y) is a thresholded version of f(x, y) at some global threshold T, then
g(x, y) = 1 if f(x, y) ≥ T and zero otherwise.
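A minimal sketch of global thresholding, with the threshold picked by Otsu's method (mentioned earlier in this document) as one common automatic choice; an 8-bit integer image is assumed:

```python
# Global thresholding as defined above: g = 1 where f >= T, else 0.
import numpy as np

def global_threshold(f, T):
    return (f >= T).astype(np.uint8)

def otsu_threshold(f, levels=256):
    # Pick T maximizing the between-class variance of the histogram.
    hist = np.bincount(f.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, levels) * p[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```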
The major problem with thresholding is that we consider only the intensity, not any relationships
between the pixels. There is no guarantee that the pixels identified by the thresholding process
are contiguous. We can easily include extraneous pixels that aren’t part of the desired region, and
we can just as easily miss isolated pixels within the region (especially near the boundaries of the
region). These effects get worse as the noise gets worse, simply because it’s more likely that a
pixel’s intensity doesn’t represent the normal intensity in the region. When we use thresholding,
we typically have to play with it, sometimes losing too much of the region and sometimes getting
too many extraneous background pixels. (Shadows of objects in the image are also a real pain—
not just where they fall across another object but where they mistakenly get included as part of a
dark object on a light background.)
Optimal Global Thresholding:
– Ground truth is known OR the histograms of the object and the background are known
Probability of Error:
Assume the object pixels are distributed according to the histogram pO(x) and the background
pixels according to pB(x). Taking the object to be brighter than the background and thresholding
at T, the error from misclassifying object pixels (object pixels falling below T) is

EO(T) = ∫(−∞, T) pO(x) dx

and, similarly, EB(T) = ∫(T, ∞) pB(x) dx is the error for background pixels. Let θ be the fraction
of pixels in the object. Then the total error is

E(T) = θ EO(T) + (1 − θ) EB(T)
Local Thresholding:
Another problem with global thresholding is that changes in illumination across the scene may
cause some parts to be brighter (in the light) and some parts darker (in shadow) in ways that have
nothing to do with the objects in the image. We can deal, at least in part, with such uneven
illumination by determining thresholds locally. That is, instead of having a single global
threshold, we allow the threshold itself to smoothly vary across the image.
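In OpenCV, a smoothly varying local threshold can be sketched with `cv2.adaptiveThreshold`; the block size and offset C below are assumed values:

```python
# Local (adaptive) thresholding sketch: the threshold varies across the
# image as a Gaussian-weighted local mean minus a small offset C.
import cv2

gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
local = cv2.adaptiveThreshold(
    gray, 255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,   # threshold = local mean - C
    cv2.THRESH_BINARY,
    blockSize=31,                     # neighborhood size (odd), assumed
    C=5)
```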
Hysteresis Thresholding:
Hysteresis thresholding takes the neighbors of a pixel into account, using two thresholds
Tlow < Thigh:
1. If the pixel value is above Thigh, the pixel is in the body; if it is below Tlow, the pixel is
in the background.
2. If the pixel value lies between Tlow and Thigh, the pixel is in the body only if a neighbor is
already in the body.
3. Iterate until no more pixels change.
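A compact sketch of hysteresis thresholding using connected components: keep every weak (above T_low) region that contains at least one strong (above T_high) seed.

```python
# Hysteresis thresholding: seed with pixels above t_high, then grow into
# connected pixels above t_low.
import numpy as np
from scipy import ndimage

def hysteresis(f, t_low, t_high):
    weak = f >= t_low
    strong = f >= t_high
    labels, n = ndimage.label(weak)                  # connected weak regions
    keep = np.unique(labels[strong])                 # regions containing a seed
    keep = keep[keep != 0]
    return np.isin(labels, keep)
```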
To set a global threshold or to adapt a local threshold to an area, we usually look at the histogram
to see if we can find two or more distinct modes, one for the foreground and one for the background.
Recall that a histogram is a probability distribution:

p(g) = ng / n

That is, the number of pixels ng having grayscale intensity g as a fraction of the total number of
pixels n.
Multispectral Thresholding:
We are interested in a technique for segmenting images with multiple components (color images,
Landsat images, or MRI images with T1, T2, and proton-density bands). It works by estimating
the optimal threshold in one channel and then segmenting the overall image based on that
threshold. We then subdivide each of these regions independently using properties of the second
channel. We repeat it again for the third channel, and so on, running through all channels
repeatedly until each region in the image exhibits a distribution indicative of a coherent region (a
single mode).
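A rough recursive sketch of this multispectral scheme for an 8-bit multi-channel image; the unimodality test and the per-region threshold (a plain mean, standing in for an optimal threshold) are crude assumptions:

```python
# Recursive multispectral thresholding sketch: threshold one channel,
# split the region, recurse on the next channel until a region's
# histogram looks unimodal.
import numpy as np

def is_unimodal(values, levels=256):
    hist = np.bincount(values, minlength=levels)
    smooth = np.convolve(hist, np.ones(5) / 5, mode="same")
    peaks = ((smooth[1:-1] > smooth[:-2]) & (smooth[1:-1] > smooth[2:])).sum()
    return peaks <= 1

def split(img, mask, channel, labels, next_label):
    # img: H x W x C uint8 array; mask: boolean region; labels: label map.
    values = img[..., channel][mask]
    if channel + 1 == img.shape[-1] or is_unimodal(values):
        labels[mask] = next_label[0]
        next_label[0] += 1
        return
    t = int(values.mean())                # stand-in for an optimal threshold
    split(img, mask & (img[..., channel] < t), channel + 1, labels, next_label)
    split(img, mask & (img[..., channel] >= t), channel + 1, labels, next_label)
```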
Example:
If we want our thresholding method to stay fairly true to the boundaries of the object, we
can first apply some boundary-finding method (such as edge detection techniques) and then
sample the pixels only where the boundary probability is high. Thus, our threshold method based
on pixels near boundaries will cause separations of the pixels in ways that tend to preserve the
boundaries. Other scattered distributions within the object or the background are of no relevance.
However, if the characteristics change along the boundary, we’re still in trouble. And, of course,
there’s still no guarantee that we’ll not have extraneous pixels or holes.
The Sobel operator performs a 2-D spatial gradient measurement on an image and so emphasizes
regions of high spatial frequency that correspond to edges. Typically it is used to find the
approximate absolute gradient magnitude at each point in an input grayscale image.
How It Works:
In theory at least, the operator consists of a pair of 3×3 convolution kernels as shown in Figure 1.
One kernel is simply the other rotated by 90°. This is very similar to the Roberts Cross operator.
Fig: Sobel convolution kernels

Gx:  -1  0  +1        Gy:  +1  +2  +1
     -2  0  +2              0   0   0
     -1  0  +1             -1  -2  -1
These kernels are designed to respond maximally to edges running vertically and horizontally
relative to the pixel grid, one kernel for each of the two perpendicular orientations. The kernels
can be applied separately to the input image, to produce separate measurements of the gradient
component in each orientation (call these Gx and Gy). These can then be combined together to
find the absolute magnitude of the gradient at each point and the orientation of that gradient. The
gradient magnitude is given by

|G| = √(Gx² + Gy²)

although it is often approximated with the faster form |G| = |Gx| + |Gy|. The angle of orientation
of the edge (relative to the pixel grid) giving rise to the spatial gradient is given by

θ = arctan(Gy / Gx)
In this case, orientation 0 is taken to mean that the direction of maximum contrast from black to
white runs from left to right on the image, and other angles are measured anti-clockwise from
this.
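A short OpenCV sketch of the Sobel computation; gradients are taken in a wide float type to sidestep the overflow issue discussed below:

```python
# Sobel gradient sketch: compute Gx, Gy, then magnitude and orientation.
import cv2
import numpy as np

gray = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)

gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)    # horizontal gradient
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)    # vertical gradient

magnitude = np.sqrt(gx**2 + gy**2)                 # |G| = sqrt(Gx^2 + Gy^2)
approx    = np.abs(gx) + np.abs(gy)                # faster |Gx| + |Gy| form
theta     = np.arctan2(gy, gx)                     # edge orientation
edges     = np.clip(magnitude, 0, 255).astype(np.uint8)
```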
Often, this absolute magnitude is the only output the user sees; the two components of the
gradient are conveniently computed and added in a single pass over the input image using the
pseudo-convolution operator shown in Figure 2.
Fig: Pseudo Convolution kernels used to quickly compute approximate gradient magnitude
The Sobel operator is slower to compute than the Roberts Cross operator, but its larger
convolution kernel smoothes the input image to a greater extent and so makes the operator less
sensitive to noise. The operator also generally produces considerably higher output values for
similar edges, compared with the Roberts Cross.
As with the Roberts Cross operator, output values from the operator can easily overflow the
maximum allowed pixel value for image types that only support smallish integer pixel values
(e.g. 8-bit integer images). When this happens the standard practice is to simply set overflowing
output pixels to the maximum allowed value. The problem can be avoided by using an image
type that supports pixel values with a larger range.
Natural edges in images often lead to lines in the output image that are several pixels wide due to
the smoothing effect of the Sobel operator. Some thinning may be desirable to counter this.
Failing that, some sort of hysteresis ridge tracking could be used, as in the Canny operator.
Consider a simpler scene containing just a single flat dark object against a lighter background.
Applying the Sobel operator to such a scene produces a clean edge image. Note that in this kind of
example the lighting has been carefully set up to ensure that the edges of the object are sharp
and free of shadows. The Sobel edge detector can also be applied to range images. Although the
Sobel operator is not as sensitive to noise as the Roberts Cross operator, it still amplifies high
frequencies: adding Gaussian noise with a standard deviation of 15 to the original image and then
applying the Sobel operator yields a visibly noisier edge image.
The object in the previous example contains sharp edges and its surface is rather smooth.
Therefore, we could (in the noise-free case) easily detect the boundary of the object without
getting any erroneous pixels. A more difficult task is to detect the boundary of an object whose
surface contains many fine depth variations (resulting in intensity changes in the image);
applying the Sobel operator straightforwardly to such an object yields many spurious responses.
Although the result can be improved significantly, we still cannot find a threshold so that a
closed line along the boundary remains and all the noise disappears. Compare this image with the
results obtained with the Canny edge detector.
Common Variants:
A related operator is the Prewitt gradient edge detector (not to be confused with
the Prewitt compass edge detector). This works in a very similar way to the Sobel operator but
uses slightly different kernels, as shown in Figure 3. This operator produces similar results to the
Sobel, but is not as isotropic in its response.
Once region boundaries have been detected, it is often useful to extract regions which are not
separated by a boundary.
Connected Neighbors:
• Let c(s) be the set of neighbors that are connected to the point s, where ∂s denotes the set of
neighbors of s. For all s and r, the set c(s) must have the properties that

c(s) ⊂ ∂s
r ∈ c(s) ⇔ s ∈ c(r)

Example:
c(s) = {r ∈ ∂s : Xr = Xs}
In general, computation of c(s) might be very difficult, but we won’t worry about that now.
Connected Sets:
Definition: A region R ⊂ S is said to be connected under c(s) if for all s, r ∈ R there exists a
sequence of M pixels, s1, · · · , sM, such that s1 = s, sM = r, and sm+1 ∈ c(sm) for
m = 1, · · · , M − 1. For example, in the binary image below, with c(s) = {r ∈ ∂s : Xr = Xs}
and 4-point neighbors,

1 1 1 0 0 0
1 1 1 0 0 0
0 0 0 1 1 1
0 0 0 1 1 1
0 0 0 1 1 1

the sets S1 = {s : Xs = 1} and S0 = {s : Xs = 0} each split into two connected components, since
the two blocks of ones touch only diagonally.
Region Growing:
The following pseudocode labels the connected set containing a seed pixel s0, and a scan over the
whole image then labels every connected component:

ConnectedSet(s0, Y, ClassLabel) {
  B ← {s0}
  while B ≠ ∅ {
    s ← any element of B
    Ys ← ClassLabel
    B ← (B \ {s}) ∪ {r ∈ c(s) : Yr = 0}
  }
  return(Y)
}

ClassLabel ← 1
Initialize Yr ← 0 for all r ∈ S
For each s ∈ S {
  if Ys = 0 {
    ConnectedSet(s, Y, ClassLabel)
    ClassLabel ← ClassLabel + 1
  }
}
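A small Python transcription of this region-growing procedure, assuming 4-point connectivity and c(s) = {r ∈ ∂s : Xr = Xs}:

```python
# Region growing / connected-component labeling, following the
# pseudocode above with a stack standing in for the set B.
import numpy as np

def connected_components(x):
    labels = np.zeros(x.shape, dtype=int)     # Y_r = 0 means unlabeled
    class_label = 1
    for seed in np.ndindex(x.shape):
        if labels[seed] == 0:
            stack = [seed]                    # B <- {s0}
            while stack:
                i, j = stack.pop()
                if labels[i, j]:
                    continue
                labels[i, j] = class_label    # Y_s <- ClassLabel
                for r, c in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
                    if (0 <= r < x.shape[0] and 0 <= c < x.shape[1]
                            and labels[r, c] == 0 and x[r, c] == x[i, j]):
                        stack.append((r, c))  # r in c(s) with Y_r = 0
            class_label += 1
    return labels
```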