Digital Image Processing
Prepared by
Dr. S. Vijayaraghavan
Assistant Professor-ECE
SCSVMV Deemed University, Kanchipuram
OBJECTIVES:
➢ To learn digital image fundamentals.
➢ To be exposed to simple image processing techniques.
➢ To be familiar with image compression and segmentation techniques.
➢ To represent an image in the form of features.
TEXT BOOK:
1. Rafael C. Gonzalez, Richard E. Woods, “Digital Image Processing”, Third Edition, Pearson Education, 2010.
REFERENCES:
UNIT-1
DIGITAL IMAGE FUNDAMENTALS
LEARNING OBJECTIVES:
This unit provides an overview of the image-processing system, which includes various elements such as image sampling, quantization, basic steps in image processing, image formation, storage, and display. After completing this unit, the reader is expected to be familiar with the following concepts:
1. Image sampling
2. Image sensors
3. Different steps in image processing
4. Image formation
Medical applications:
1. Processing of chest X-rays
2. Cineangiograms
3. Projection images of transaxial tomography, and
4. Medical images that occur in radiology and nuclear magnetic resonance (NMR)
5. Ultrasonic scanning
Image Display: Image displays in use today are mainly color TV monitors. These monitors are driven by the outputs of image and graphics display cards that are an integral part of the computer system.
Hardcopy Devices: The devices for recording images include laser printers, film cameras, heat-sensitive devices, inkjet units, and digital units such as optical and CD-ROM disks. Film provides the highest possible resolution, but paper is the obvious medium of choice for written applications.
In order to form a digital image, the gray-level values must also be converted (quantized) into discrete quantities. So, we divide the gray-level scale into eight discrete levels, ranging from black to white. The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample. The assignment is made depending on the vertical proximity of a sample to a vertical tick mark. Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.
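As a rough illustration of this quantization step, the sketch below (my own, using NumPy and a made-up scan line, not an example from the text) assigns each continuous sample to the nearest of eight equally spaced gray levels:

```python
import numpy as np

def quantize(samples, levels=8, max_value=255.0):
    """Map continuous samples onto the nearest of `levels` discrete gray values."""
    step = max_value / (levels - 1)                 # spacing between the tick marks
    indices = np.round(np.asarray(samples) / step)  # nearest tick, by vertical proximity
    return (indices * step).astype(np.uint8)

# One scan line of a continuous image, digitized to eight levels
scan_line = np.array([12.3, 40.7, 95.2, 130.0, 180.4, 200.9, 230.1, 251.6])
print(quantize(scan_line))   # eight evenly spaced output values between 0 and 255
```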
Hence, f(x,y) is a digital image if it assigns a gray level (that is, a real number from the set of real numbers R) to each distinct pair of coordinates (x,y). This functional assignment is the quantization process. If the gray levels are also integers, Z replaces R, and a digital image then becomes a 2-D function whose coordinates and amplitude values are integers. Due to processing, storage, and hardware considerations, the number of gray levels typically is an integer power of 2:
L = 2^k
Then, the number 'b' of bits required to store a digital image is
b = M * N * k
When M = N, this equation becomes
b = N^2 * k
When an image can have 2^k gray levels, it is referred to as a "k-bit image". An image with 256 possible gray levels is called an "8-bit image" (256 = 2^8).
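As a quick worked example of the storage formula (the image size here is hypothetical, chosen only for illustration):

```python
def bits_required(M, N, k):
    """b = M * N * k bits for an M x N image with 2**k gray levels."""
    return M * N * k

# A 512 x 512 image with 256 = 2**8 gray levels
b = bits_required(512, 512, 8)
print(b, "bits =", b // 8, "bytes")   # 2,097,152 bits = 262,144 bytes
```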
Consider a chart of vertical lines of width W, with the space between the lines also having width W, so a line pair consists of one such line and its adjacent space. Thus, the width of a line pair is 2W, and there are 1/2W line pairs per unit distance. Resolution is simply the smallest number of discernible line pairs per unit distance.
Regarding the components of a single sensor, perhaps the most familiar sensor of this type is the photodiode, which is constructed of silicon materials and whose output voltage waveform is proportional to light. The use of a filter in front of a sensor improves selectivity. For example, a green (pass) filter in front of a light sensor favors light in the green band of the color spectrum. As a consequence, the sensor output will be stronger for green light than for other components in the visible spectrum.
A complete image can be obtained by focusing the energy pattern onto the surface of the array. Motion obviously is not necessary, as is the case with the sensor arrangements discussed earlier. This figure shows the energy from an illumination source being reflected from a scene element, but, as mentioned at the beginning of this section, the energy also could be transmitted through the scene elements. The first function performed by the imaging system is to collect the incoming energy and focus it onto an image plane. If the illumination is light, the front end of the imaging system is a lens, which projects the viewed scene onto the lens focal plane. The sensor array, which is coincident with the focal plane, produces outputs proportional to the integral of the light received at each sensor. Digital and analog circuitry sweeps these outputs and converts them to a video signal, which is then digitized by another section of the imaging system.
Image sampling and Quantization:
To create a digital image, we need to convert the continuous sensed data
into digital form. This involves two processes.
1. Sampling and
2. Quantization
Consider a continuous image, f(x, y), that we want to convert to digital form. An image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to sample the function in both coordinates and in amplitude.
Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.
Fig: Sampling
Fig: Quantization
Neighbors of a Pixel:
A pixel p at coordinates (x,y) has four horizontal and vertical neighbors whose coordinates are given by:
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is one unit distance from (x,y), and some of the neighbors of p lie outside the digital image if (x,y) is on the border of the image. The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8(p).
As before, some of the points in ND(p) and N8(p) fall outside the image if
(x,y) is on the border of the image.
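A small illustration of these neighbor sets (my own sketch; the helper names n4, nd, n8 and inside are hypothetical), which also discards coordinates falling outside the image border:

```python
def n4(x, y):
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    return n4(x, y) + nd(x, y)

def inside(coords, rows, cols):
    """Drop neighbors that fall outside an image of size rows x cols."""
    return [(r, c) for (r, c) in coords if 0 <= r < rows and 0 <= c < cols]

# 8-neighbors of a border pixel of a 5 x 5 image
print(inside(n8(0, 2), rows=5, cols=5))
```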
m-adjacency (mixed adjacency): two pixels p and q with values from V are m-adjacent if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Mixed adjacency is a modification of 8-adjacency. It is introduced to eliminate the ambiguities that often arise when 8-adjacency is used.
For example:
Types of Adjacency:
In this example, we can note that to connect two pixels (finding a path between two pixels):
In the 8-adjacency way, you can find multiple paths between two pixels, while in m-adjacency you can find only one path between two pixels.
So, m-adjacency has eliminated the multiple-path connection that was generated by 8-adjacency.
Two subsets S1 and S2 are adjacent if some pixel in S1 is adjacent to some pixel in S2. Adjacent means either 4-, 8-, or m-adjacency.
Digital Path:
A digital path (or curve) from pixel p with coordinates (x,y) to pixel q with coordinates (s,t) is a sequence of distinct pixels with coordinates (x0,y0), (x1,y1), …, (xn,yn), where (x0,y0) = (x,y), (xn,yn) = (s,t), and pixels (xi,yi) and (xi-1,yi-1) are adjacent for 1 ≤ i ≤ n. Here, n is the length of the path.
If (x0,y0) = (xn,yn), the path is closed.
We can specify 4-, 8-, or m-paths depending on the type of adjacency specified.
Connectivity:
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S.
For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it has only one connected component, then the set S is called a connected set.
Distance Measures:
For pixels p, q, and z with coordinates (x,y), (s,t), and (v,w) respectively, D is a distance function or metric if
(a) D(p,q) ≥ 0 (D(p,q) = 0 iff p = q),
(b) D(p,q) = D(q,p), and
(c) D(p,z) ≤ D(p,q) + D(q,z).
The Euclidean distance between p and q is defined as
De(p,q) = [(x - s)^2 + (y - t)^2]^(1/2)
Pixels having a distance less than or equal to some value r from (x,y) are the points contained in a disk of radius r centered at (x,y).
The D4 (city-block) distance is D4(p,q) = |x - s| + |y - t|; pixels having a D4 distance from (x,y) less than or equal to some value r form a diamond centered at (x,y). Similarly, the D8 (chessboard) distance is D8(p,q) = max(|x - s|, |y - t|); pixels with D8 distance at most r form a square centered at (x,y).
Example:
The pixels with distance D4 ≤ 2 from (x,y) form the following contours of constant distance.
The pixels with D4 = 1 are the 4-neighbors of (x,y).
Example:
The pixels with D8 distance ≤ 2 from (x,y) form the following contours of constant distance.
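These distance measures follow directly from their definitions; the sketch below (my own, in Python) evaluates the Euclidean, D4 (city-block) and D8 (chessboard) distances between two pixels:

```python
import numpy as np

def euclidean(p, q):
    (x, y), (s, t) = p, q
    return np.sqrt((x - s) ** 2 + (y - t) ** 2)

def d4(p, q):          # city-block: contours of constant distance are diamonds
    (x, y), (s, t) = p, q
    return abs(x - s) + abs(y - t)

def d8(p, q):          # chessboard: contours of constant distance are squares
    (x, y), (s, t) = p, q
    return max(abs(x - s), abs(y - t))

p, q = (2, 3), (5, 7)
print(euclidean(p, q), d4(p, q), d8(p, q))   # 5.0, 7, 4
```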
Dm Distance:
It is defined as the shortest m-path between the points. In this case, the distance between two pixels will depend on the values of the pixels along the path, as well as the values of their neighbors.
Example:
Consider the following arrangement of pixels and assume that p, p2, and p4 have a value of 1 and that p1 and p3 can have a value of 0 or 1.
Suppose that we consider the adjacency of pixels with value 1 (i.e., V = {1}).
Now, to compute the Dm distance between points p and p4:
The same applies here, and the shortest m-path will be 3 (p, p2, p3, p4).
Case 4: If p1 = 1 and p3 = 1,
the length of the shortest m-path will be 4 (p, p1, p2, p3, p4).
UNIT -2
IMAGE ENHANCEMENT
Learning Objectives:
Image enhancement techniques are designed to improve the quality of an
image as perceived by a human being. Image enhancement can be
performed both in the spatial as well as in the frequency domain. After
reading this chapter, the reader should have a basic knowledge about the
following concepts:
1. Image enhancement in spatial and frequency domain
2. Point operations and mask operations in spatial domain
3. Different types of gray-level transformations
4. Histogram and histogram equalization
5. Frequency domain filtering
Introduction:
Image enhancement approaches fall into two broad categories: spatial
domain methods and frequency domain methods. The term spatial domain
refers to the image plane itself, and approaches in this category are based
on direct manipulation of pixels in an image.
Frequency domain processing techniques are based on modifying the Fourier transform of an image. Enhancing an image provides better contrast and a more detailed image as compared to the non-enhanced image. Image enhancement has many applications: it is used to enhance medical images, images captured in remote sensing, images from satellites, etc. As indicated previously, the term spatial domain refers to the aggregate of pixels composing an image. Spatial domain methods are procedures that operate directly on these pixels. Spatial domain processes will be denoted by the expression
g(x,y) = T[f(x,y)]
where f(x, y) is the input image, g(x, y) is the processed image, and T is an
operator on f, defined over some neighborhood
of (x, y).
The simplest form of T is when the neighborhood is of size 1*1 (that is, a
single pixel). In this case, g depends only on the value of f at (x, y), and T
becomes a gray-level (also called an intensity or mapping) transformation
function of the form
s=T(r)
where r denotes the pixel value of the input image and s denotes the pixel value of the output image. T is a transformation function that maps each value of r to a value of s.
For example, if T(r) has the form shown in Fig. 2.2(a), the effect of this
transformation would be to produce an image of higher contrast than the
original by darkening the levels below m and brightening the levels above
m in the original image. In this technique, known as contrast stretching,
the values of r below m are compressed by the transformation function into
a narrow range of s, toward black. The opposite effect takes place for
values of r above m.
In the limiting case shown in Fig. 2.2(b), T(r) produces a two-level (binary)
image. A mapping of this form is called a thresholding function.
One of the principal approaches in this formulation is based on the use of
so-called masks (also referred to as filters, kernels, templates, or windows).
Basically, a mask is a small (say, 3*3) 2-D array, such as the one shown in
Fig. 2.1, in which the values of the mask coefficients determine the nature
of the process, such as image sharpening. Enhancement techniques based
on this type of approach often are referred to as mask processing or
filtering.
LINEAR TRANSFORMATION:
First, we will look at the linear transformation. Linear transformation includes simple identity and negative transformation. Identity transformation has been discussed in our tutorial of image transformation, but a brief description of this transformation is given here.
The identity transition is shown by a straight line. In this transition, each value of the input image is directly mapped to the same value in the output image. That results in an output image identical to the input image, and hence it is called the identity transformation. It is shown below:
NEGATIVE TRANSFORMATION:
The second linear transformation is the negative transformation, which is the invert of the identity transformation.
IMAGE NEGATIVE:
The image negative with gray level value in the range of [0, L-1] is
obtained by negative transformation given by S = T(r) or
S = L -1 – r
where r is the gray-level value at pixel (x,y) and L is the largest gray level in the image.
It results in a photographic negative. It is useful for enhancing white or gray details embedded in dark regions of an image.
The overall graph of these transitions has been shown below.
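In addition to the graph, a minimal sketch of the negative transformation s = L - 1 - r, assuming an 8-bit image so that L = 256:

```python
import numpy as np

def negative(image, L=256):
    """s = (L - 1) - r, applied to every pixel."""
    return (L - 1 - image.astype(np.int32)).astype(np.uint8)

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(negative(img))     # [[255 191] [127   0]]
```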
LOGARITHMIC TRANSFORMATIONS:
Logarithmic transformation further contains two types of transformations: log transformation and inverse log transformation.
LOG TRANSFORMATIONS:
The log transformations can be defined by this formula
S = c log(r + 1).
Where S and r are the pixel values of the output and the input image and c
is a constant. The value 1 is added to each of the pixel value of the input
image because if there is a pixel intensity of 0 in the image, then log (0) is
equal to infinity. So, 1 is added, to make the minimum value at least 1.
During log transformation, the dark pixels in an image are expanded as compared to the higher pixel values. The higher pixel values are somewhat compressed in log transformation. This results in the following image enhancement.
The shape of the curve shows that this transformation maps the narrow range of low gray-level values in the input image into a wider range of output levels. The opposite is true for high-level values of the input image.
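A sketch of the log transformation s = c·log(1 + r); choosing c = 255 / log(256) so that the output spans the full 8-bit range is my assumption, not something stated in the text:

```python
import numpy as np

def log_transform(image):
    c = 255.0 / np.log(1.0 + 255.0)            # scale so that r = 255 maps to s = 255
    s = c * np.log1p(image.astype(np.float64))  # s = c * log(1 + r)
    return s.astype(np.uint8)

img = np.array([[0, 10], [100, 255]], dtype=np.uint8)
print(log_transform(img))   # dark values are expanded, bright ones compressed
```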
As the figure shows, curves generated with values of γ > 1 have exactly the opposite effect as those generated with values of γ < 1. Finally, we note that the power-law equation reduces to the identity transformation when c = γ = 1.
This type of transformation is used for enhancing images for different types of display devices. The gamma of different display devices is different. For example, the gamma of a CRT lies between about 1.8 and 2.5, which means the image displayed on a CRT appears darker than intended unless it is corrected.
CORRECTING GAMMA:
S = C·r^γ
S = C·r^(1/2.5)
The same image, but with different gamma values, is shown here.
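A sketch of gamma correction with the exponent 1/2.5 used above; normalizing pixel values to [0, 1] before applying the power law is my own implementation choice:

```python
import numpy as np

def gamma_correct(image, display_gamma=2.5):
    """Pre-distort the image with exponent 1/gamma so the display cancels it."""
    r = image.astype(np.float64) / 255.0          # normalize to [0, 1]
    s = np.power(r, 1.0 / display_gamma)          # s = c * r**(1/gamma), with c = 1
    return np.uint8(255 * s)

img = np.arange(0, 256, dtype=np.uint8).reshape(16, 16)
print(gamma_correct(img)[0, :5])                  # mid-tones are brightened
```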
Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the gray levels of the output image, thus affecting its contrast. In general, r1 ≤ r2 and s1 ≤ s2 is assumed so that the function is single-valued and monotonically increasing.
Finally, Fig. x(d) shows the result of using the thresholding function defined previously, with r1 = r2 = m, the mean gray level in the image. The
original image on which these results are based is a scanning electron
microscope image of pollen, magnified approximately 700 times.
the desired range of gray levels but preserves the background and gray-
level tonalities in the image. Figure y (c) shows a gray-scale image, and
Fig. y(d) shows the result of using the transformation in Fig. y(a).
Variations of the two transformations shown in Fig. are easy to formulate.
BIT-PLANE SLICING:
Instead of highlighting gray-level ranges, highlighting the contribution
made to total image appearance by specific bits might be desired. Suppose
that each pixel in an image is represented by 8 bits. Imagine that the
image is composed of eight 1-bit planes, ranging from bit-plane 0 for the
least significant bit to bit plane 7 for the most significant bit. In terms of 8-
bit bytes, plane 0 contains all the lowest order bits in the bytes comprising
the pixels in the image and plane 7 contains all the high-order bits.
Figure 3.12 illustrates these ideas, and Fig. 3.14 shows the various bit
planes for the image shown in Fig. 3.13. Note that the higher-order bits
(especially the top four) contain the majority of the visually significant data. The other bit planes contribute to more subtle details in the image.
Separating a digital image into its bit planes is useful for analyzing the
relative importance played by each bit of the image, a process that aids in
determining the adequacy of the number of bits used to quantize each
pixel.
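Bit-plane decomposition can be written with simple shift-and-mask operations; a sketch (my own, for an 8-bit image):

```python
import numpy as np

def bit_planes(image):
    """Return a list of eight binary images, plane 0 = LSB ... plane 7 = MSB."""
    return [(image >> k) & 1 for k in range(8)]

img = np.array([[0, 127], [128, 255]], dtype=np.uint8)
planes = bit_planes(img)
print(planes[7])   # most significant bit plane: [[0 0] [1 1]]
print(planes[0])   # least significant bit plane: [[0 1] [0 1]]
```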
HISTOGRAM PROCESSING:
The histogram of a digital image with gray levels in the range [0, L-1] is a
discrete function of the form
H(rk) = nk
where rk is the kth gray level and nk is the number of pixels in the image having gray level rk.
The components of the histogram in a high-contrast image cover a broad range of the gray scale. The net effect of this will be an image that shows a great deal of gray-level detail and has a high dynamic range.
HISTOGRAM EQUALIZATION:
Histogram equalization is a common technique for enhancing the appearance of images. Suppose we have an image which is predominantly dark. Then its histogram would be skewed towards the lower end of the grey scale, and all the image detail would be compressed into the dark end of the histogram. If we could stretch out the grey levels at the dark end to produce a more uniformly distributed histogram, then the image would become much clearer.
Let r represent the gray levels of the image to be enhanced, treated as a continuous variable. The range of r is [0, 1], with r = 0 representing black and r = 1 representing white. The transformation function is of the form
s = T(r), where 0 ≤ r ≤ 1
It produces a level s for every pixel value r in the original image.
Thus, the PDF of the transformed variable s is determined by the gray-level PDF of the input image and by the chosen transformation function.
A transformation function of particular importance in image processing is the cumulative distribution function of r,
s = T(r) = ∫ pr(w) dw, integrated from 0 to r,
on which histogram equalization is based.
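In the discrete case the equalization mapping becomes sk = (L-1) · Σ pr(rj), summed over j = 0 to k, i.e. the scaled cumulative histogram. The following sketch (my own NumPy illustration, not code from the text) computes the histogram and applies this mapping to an 8-bit image:

```python
import numpy as np

def equalize(image, L=256):
    hist = np.bincount(image.ravel(), minlength=L)   # h(r_k) = n_k
    pdf = hist / image.size                          # p_r(r_k) = n_k / n
    cdf = np.cumsum(pdf)                             # cumulative distribution
    mapping = np.round((L - 1) * cdf).astype(np.uint8)
    return mapping[image]                            # s = T(r) applied per pixel

# A predominantly dark image: values crowd the low end of the gray scale
img = np.clip(np.random.normal(40, 15, (64, 64)), 0, 255).astype(np.uint8)
print(img.min(), img.max(), equalize(img).min(), equalize(img).max())
```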
where P and Q are the padded sizes from the basic equations; wraparound error in circular convolution can be avoided by padding the functions with zeros.
Fig: Ideal Low Pass Filter 3-D view and 2-D view and line graph
Fig: (a) Test pattern of size 688x688 pixels (b) its Fourier spectrum
Fig: (a) original image, (b)-(f) Results of filtering using ILPFs with cutoff
frequencies set at radii values 10, 30, 60, 160 and 460, as shown in
fig.2.2.2(b). The power removed by these filters was 13, 6.9, 4.3, 2.2 and
0.8% of the total, respectively.
BUTTERWORTH LOW-PASS FILTER:
The Butterworth low-pass filter (BLPF) transfer function does not have a sharp discontinuity establishing a cutoff between passed and filtered frequencies. The cutoff frequency D0 defines the point at which H(u,v) = 0.5.
Unlike the ILPF, the BLPF transfer function does not have a sharp
discontinuity that gives a clear cutoff between passed and filtered
frequencies.
Fig. (a) Original image. (b)-(f) Results of filtering using BLPFs of order 2,
with cutoff frequencies at the radii
The figure shows the results of applying the BLPF to fig.(a), with n = 2 and D0 equal to the five radii in fig.(b) used for the ILPF. We note here a smooth transition in blurring as a function of increasing cutoff frequency.
Moreover, no ringing is visible in any of the images processed with this
particular BLPF, a fact attributed to the filter’s smooth transition between
low and high frequencies.
Where D0 is the cutoff frequency. When D(u,v) = D0, the GLPF is down to 0.607 of its maximum value. This means that a spatial Gaussian filter, obtained by computing the IDFT of the above equation, will have no ringing.
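The three low-pass filters differ only in how H(u,v) falls off with the distance D(u,v) from the center of the shifted spectrum. The sketch below (mine, NumPy only; zero padding is omitted for brevity, so some wraparound error is accepted) applies a GLPF, H(u,v) = exp(-D^2(u,v) / 2D0^2):

```python
import numpy as np

def gaussian_lowpass(image, d0=30.0):
    M, N = image.shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    V, U = np.meshgrid(v, u)                  # U varies down rows, V across columns
    D2 = U ** 2 + V ** 2                      # squared distance from the center
    H = np.exp(-D2 / (2.0 * d0 ** 2))         # GLPF transfer function
    F = np.fft.fftshift(np.fft.fft2(image))   # centered spectrum
    G = H * F                                 # apply the filter
    g = np.fft.ifft2(np.fft.ifftshift(G))
    return np.real(g)

img = np.random.rand(128, 128)
print(gaussian_lowpass(img, d0=30).shape)     # (128, 128), a blurred version
```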
Figure shows a perspective plot, image display and radial cross sections of
a GLPF function.
Fig. (a) Perspective plot of a GLPF transfer function. (b) Filter displayed as
an image. (c). Filter radial cross sections for various values of D0
Fig. (a) Original image. (b)-(f) Results of filtering using GLPFs with cutoff frequencies at the radii shown in fig.2.2.2. Compare with fig.2.2.3 and fig.2.2.6.
Fig: Top row: perspective plot, image representation, and cross section of a typical ideal high-pass filter. Middle and bottom rows: the same sequence for typical Butterworth and Gaussian high-pass filters.
IDEAL HIGH-PASS FILTER:
A 2-D ideal high-pass filter (IHPF) is defined as
H (u,v) = 0, if D(u,v) ≤ D0
H (u,v) = 1, if D(u,v) ˃ D0
Where D0 is the cutoff frequency and D(u,v) is given by eq. As intended,
the IHPF is the opposite of the ILPF in the sense that it sets to zero all frequencies inside a circle of radius D0 while passing, without attenuation, all frequencies outside the circle. As in the case of the ILPF, the IHPF is not physically realizable.
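Because a high-pass transfer function can be obtained from the corresponding low-pass one as Hhp(u,v) = 1 - Hlp(u,v), an IHPF mask is easy to build; a short sketch (my own):

```python
import numpy as np

def ideal_highpass_mask(shape, d0):
    M, N = shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    H_lp = (D <= d0).astype(float)   # ILPF: 1 inside the circle of radius D0
    return 1.0 - H_lp                # IHPF: 0 inside, 1 outside

H = ideal_highpass_mask((128, 128), d0=30)
print(H[64, 64], H[0, 0])            # 0.0 at the center, 1.0 far from it
```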
Fig: Spatial representation of typical (a) ideal, (b) Butterworth, and (c) Gaussian frequency domain high-pass filters, and corresponding intensity profiles through their centers.
We can expect IHPFs to have the same ringing properties as ILPFs. This is
demonstrated clearly in Fig. which consists of various IHPF results using
the original image in Fig.(a) with D0 set to 30, 60, and 160 pixels,
respectively. The ringing in Fig. (a) is so severe that it produced distorted,
thickened object boundaries (E.g., look at the large letter “a”). Edges of the
top three circles do not show well because they are not as strong as the
other edges in the image (the intensity of these three objects is much closer
to the background intensity, giving discontinuities of smaller magnitude).
FILTERED RESULTS OF IHPF:
Fig. Results of high-pass filtering the image in Fig.(a) using an IHPF with
D0 = 30, 60, and 160
Fig.2.2.2(b). These results are much smoother than those obtained with an
IHPF
Fig. Results of high-pass filtering the image in fig.(a) using a GHPF with
D0 = 30, 60 and 160, corresponding to the circles in Fig.(b)
UNIT-3
IMAGE RESTORATION AND
SEGMENTATION
Learning Objectives:
The goal of image restoration is to reconstruct the original scene from a
degraded observation. After reading this unit, the reader should be
familiar with the following concepts:
1. Different types of image degradation
2. Linear image-restoration techniques
3. Nonlinear image-restoration techniques
IMAGE RESTORATION:
Restoration improves image in some predefined sense. It is an objective
process. Restoration attempts to reconstruct an image that has been
degraded by using a priori knowledge of the degradation phenomenon.
These techniques are oriented toward modeling the degradation and then
applying the inverse process in order to recover the original image.
Restoration techniques are based on mathematical or probabilistic models
of image processing. Enhancement, on the other hand is based on human
subjective preferences regarding what constitutes a “good” enhancement
result. Image Restoration refers to a class of methods that aim to remove
or reduce the degradations that have occurred while the digital image was
being obtained. All natural images, when displayed, have gone through some sort of degradation:
1. During display mode
2. Acquisition mode, or
3. Processing mode
4. Sensor noise
5. Blur due to camera misfocus
6. Relative object-camera motion
7. Random atmospheric turbulence
8. Others
DEGRADATION MODEL:
The degradation process is modeled as a degradation function H that, together with an additive noise term η(x,y), operates on an input image f(x,y) to produce a degraded image g(x,y):
g(x,y) = H[f(x,y)] + η(x,y)
NOISE MODELS:
The principal source of noise in digital images arises during image
acquisition and /or transmission. The performance of imaging sensors is
affected by a variety of factors, such as environmental conditions during
image acquisition and by the quality of the sensing elements themselves.
Images are corrupted during transmission principally due to interference
in the channels used for transmission. Since the main sources of noise present in digital images result from atmospheric disturbance and image sensor circuitry, the following assumptions can be made: the noise model is spatially invariant (independent of spatial location), and the noise model is uncorrelated with the object function.
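Under these assumptions, noise can be simulated simply by adding a spatially invariant random field to an image; a sketch (my own) for Gaussian noise and for salt-and-pepper (impulse) noise:

```python
import numpy as np

def add_gaussian_noise(image, mean=0.0, sigma=15.0):
    noisy = image.astype(np.float64) + np.random.normal(mean, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper(image, amount=0.05):
    noisy = image.copy()
    mask = np.random.rand(*image.shape)
    noisy[mask < amount / 2] = 0          # pepper
    noisy[mask > 1 - amount / 2] = 255    # salt
    return noisy

img = np.full((64, 64), 128, dtype=np.uint8)
print(add_gaussian_noise(img).std(), (add_salt_pepper(img) == 0).mean())
```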
Gaussian Noise:
Rayleigh Noise:
Unlike the Gaussian distribution, the Rayleigh distribution is not symmetric. It is given by the following formula.
Gamma (Erlang) Noise:
The PDF of Erlang (Gamma) noise is given by
Exponential Noise:
Exponential distribution has an exponential shape. The PDF of
exponential noise is given as
Where a>0. The mean and variance of this density are given by
Uniform Noise:
The PDF of uniform noise is given by
Here, each restored pixel is given by the product of the pixels in the subimage window, raised to the power 1/(mn). A geometric mean filter achieves smoothing comparable to the arithmetic mean filter, but it tends to lose less image detail in the process.
The harmonic mean filter works well for salt noise but fails for pepper
noise. It does well with Gaussian noise also.
The response of the filter at any point is determined by the ranking result.
Median filter:
It is the best-known order-statistics filter; it replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel. The original value of the pixel is included in the computation of the median. Median filters are quite popular because, for certain types of random noise, they provide excellent noise-reduction capabilities with considerably less blurring than linear smoothing filters of similar size. They are effective for bipolar and unipolar impulse noise.
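A minimal median-filter sketch (my own, pure NumPy; replicating border pixels is an implementation choice, not something specified in the text):

```python
import numpy as np

def median_filter(image, size=3):
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')       # replicate border pixels
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + size, j:j + size]
            out[i, j] = np.median(window)          # replace pixel by window median
    return out

img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                                    # a single impulse ("salt") pixel
print(median_filter(img)[2, 2])                    # 100: the impulse is removed
```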
Min filter:
This filter is useful for finding the darkest point in an image. Also, it reduces salt noise as a result of the min operation.
Midpoint Filter:
The midpoint filter simply computes the midpoint between the maximum and minimum values in the area encompassed by the filter. It combines order statistics and averaging. This filter works best for randomly distributed noise, like Gaussian or uniform noise.
The band-reject filter transfer function is defined in terms of D(u,v), the distance from the origin of the centered frequency rectangle; W, the width of the band; and D0, the radial center of the band.
These filters are mostly used when the location of the noise component in the frequency domain is known. Sinusoidal noise can be easily removed by using these kinds of filters, because it appears as two impulses that are mirror images of each other about the origin of the frequency transform.
The function of a band-pass filter is opposite to that of a band-reject filter: it allows a specific frequency band of the image to be passed and blocks the rest of the frequencies. The transfer function Hbp(u,v) of a band-pass filter can be obtained from a corresponding band-reject filter with transfer function Hbr(u,v) by using the equation
Hbp(u,v) = 1 - Hbr(u,v)
Notch Filters:
A notch filter rejects (or passes) frequencies in predefined neighborhoods about a center frequency.
Inverse Filtering:
The simplest approach to restoration is direct inverse filtering, where we compute an estimate F^(u,v) of the transform of the original image simply by dividing the transform of the degraded image G(u,v) by the degradation function H(u,v):
F^(u,v) = G(u,v) / H(u,v)
We know that G(u,v) = H(u,v)F(u,v) + N(u,v). Therefore,
F^(u,v) = F(u,v) + N(u,v) / H(u,v)
From the above equation, we observe that we cannot recover the undegraded image exactly, because N(u,v) is a random function whose Fourier transform is not known. Moreover, if the degradation function H(u,v) has zero or very small values, the ratio N(u,v)/H(u,v) can easily dominate the estimate. One approach to get around this zero or small-value problem is to limit the filter frequencies to values near the origin. We know that H(0,0) is equal to the average value of h(x,y). By limiting the analysis to frequencies near the origin, we reduce the probability of encountering zero values.
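A sketch of direct inverse filtering with the cutoff just described: G(u,v) is divided by H(u,v) only at frequencies within a chosen radius of the origin (the Gaussian blur model and the radius below are assumptions made purely for illustration):

```python
import numpy as np

def inverse_filter(g, H, radius=40.0, eps=1e-6):
    """F_hat = G / H, restricted to frequencies within `radius` of the origin."""
    M, N = g.shape
    G = np.fft.fftshift(np.fft.fft2(g))
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    F_hat = np.where(D <= radius, G / (H + eps), G)   # limit the analysis near the origin
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_hat)))

# Example: a centered Gaussian blur transfer function, assumed to be known
M = N = 128
u = np.arange(M) - M / 2
v = np.arange(N) - N / 2
H = np.exp(-(u[:, None] ** 2 + v[None, :] ** 2) / (2 * 20.0 ** 2))
g = np.random.rand(M, N)                              # stand-in degraded image
print(inverse_filter(g, H).shape)
```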
IMAGE SEGMENTATION:
Edge Detection:
Edge detection is a fundamental tool in image processing and computer
vision, particularly in the areas of feature detection and feature extraction,
which aim at identifying points in a digital image at which the image
brightness changes sharply or more formally has discontinuities.
Motivation: Canny edge detection applied to a photograph
Edge Properties:
The edges extracted from a two-dimensional image of a three-dimensional
scene can be classified as either viewpoint dependent or viewpoint
independent. A viewpoint independent edge typically reflects inherent
properties of the three-dimensional objects, such as surface markings and
surface shape. A viewpoint dependent edge may change as the viewpoint
changes, and typically reflects the geometry of the scene, such as objects
occluding one another.
A typical edge might, for instance, be the border between a block of red color and a block of yellow. In contrast, a line (as can be extracted by a ridge detector) can be a small number of pixels of a different color on an otherwise unchanging background. For a line, there may therefore usually be one edge on each side of the line.
At the left side of the edge the intensity is one level, and at the right side of the edge it is another. The scale parameter σ is called the blur scale of the edge.
To illustrate why edge detection is not a trivial task, let us consider the
problem of detecting edges in the following one-dimensional signal. Here,
we may intuitively say that there should be an edge between the 4th and
5th pixels.
If the intensity difference were smaller between the 4th and the 5th pixels
and if the intensity differences between the adjacent neighboring pixels
were higher, it would not be as easy to say that there should be an edge in
the corresponding region. Moreover, one could argue that this case is one
in which there are several edges.
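For a one-dimensional signal like the one described, the simplest edge measure is the first difference between adjacent pixels; a sketch (the sample values are hypothetical, since the original figure is not reproduced here):

```python
import numpy as np

signal = np.array([5, 7, 6, 4, 152, 148, 149], dtype=float)
gradient = np.diff(signal)             # first difference between adjacent pixels
edge_pos = np.argmax(np.abs(gradient))
print(gradient)                        # the jump between the 4th and 5th samples dominates
print("edge between samples", edge_pos, "and", edge_pos + 1)   # 0-based indices
```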
types of smoothing filters that are applied and the way the measures of
edge strength are computed. As many edge detection methods rely on the
computation of image gradients, they also differ in the types of filters used
for computing gradient estimates in the x- and y-directions.
Although his work was done in the early days of computer vision, the
Canny edge detector (including its variations) is still a state-of-the-art edge
detector. Unless the preconditions are particularly suitable, it is hard to
find an edge detector that performs significantly better than the Canny
edge detector.
estimated gradient direction.
Edge Thinning:
Edge thinning is a technique used to remove the unwanted spurious points
on the edge of an image. This technique is employed after the image has
been filtered for noise (using median, Gaussian filter etc.), the edge
operator has been applied (like the ones described above) to detect the
edges and after the edges have been smoothed using an appropriate
threshold value. This removes all the unwanted points and if applied
carefully, results in one-pixel thick edge elements.
Advantages:
1) Sharp and thin edges lead to greater efficiency in object recognition.
2) If you are using Hough transforms to detect lines and ellipses then
thinning could give much better results.
3) If the edge happens to be the boundary of a region, then thinning could easily give image parameters like perimeter without much algebra.
There are many popular algorithms used to do this, one such is described
below:
1. Choose a type of connectivity, like 8, 6 or 4.
2. 8 connectivity is preferred, where all the immediate pixels
surrounding a particular pixel are considered.
3. Remove points from North, south, east and west.
4. Do this in multiple passes, i.e., after the north pass, use the same semi-processed image in the other passes, and so on.
Remove a point if:
1. The point has no neighbors in the North (if you are in the north pass,
and respective directions for other passes.)
2. The point is not the end of a line. The point is isolated.
3. Removing the points will not cause to disconnect its neighbors in any
way.
4. Else keep the point. The number of passes across direction should be
chosen according to the level of accuracy desired.
be negative, i.e.,
Written out as an explicit expression in terms of local partial derivatives
Lx, Ly ... Lyyy, this edge definition can be expressed as the zero-crossing
curves of the differential invariant
that satisfy a sign-condition on the following differential invariant
where Lx, Ly ... Lyyy denote partial derivatives computed from a scale-space representation L obtained by smoothing the original image with a Gaussian kernel. In this way, the edges will be automatically obtained as
continuous curves with subpixel accuracy. Hysteresis thresholding can also
be applied to these differential and subpixel edge segments.
Thresholding:
Thresholding is the simplest method of image segmentation. From
agrayscale image, thresholding can be used to create binary images.
During the thresholding process, individual pixels in an image are marked as "object" pixels if their value is greater than some threshold value (assuming an object to be brighter than the background) and as "background" pixels otherwise. This convention is known as threshold above. Variants include threshold below, which is the opposite of threshold above; threshold inside, where a pixel is labeled "object" if its value is between two thresholds; and threshold outside, which is the opposite of threshold inside (Shapiro, et al. 2001:83). Typically, an object pixel is given a value of "1" while a background pixel is given a value of "0". Finally, a binary image is created by coloring each pixel white or black, depending on the pixel's label.
Threshold Selection:
The key parameter in the thresholding process is the choice of the
threshold value (or values, as mentioned earlier). Several different
methods for choosing a threshold exist; users can manually choose a
threshold value, or a thresholding algorithm can compute a value
automatically, which is known as automatic thresholding (Shapiro, et al.
2001:83). A simple method would be to choose the mean or median value,
the rationale being that if the object pixels are brighter than the
background, they should also be brighter than the average.
In a noiseless image with uniform background and object values, the mean
or median will work well as the threshold, however, this will generally not
be the case. A more sophisticated approach might be to create a histogram
of the image pixel intensities and use the valley point as the threshold. The
histogram approach assumes that there is some average value for the
background and object pixels, but that the actual pixel values have some
variation around these average values. However, this may be
computationally expensive, and image histograms may not have clearly
defined valley points, often making the selection of an accurate threshold
difficult.
One method that is relatively simple, does not require much specific
knowledge of the image, and is robust against image noise, is the following
iterative method:
• An initial threshold (T) is chosen, this can be done randomly or
according to any other method desired.
• The image is segmented into object and background pixels as
described above, creating two sets:
G1 = {f(m,n) : f(m,n) > T} (object pixels)
G2 = {f(m,n) : f(m,n) ≤ T} (background pixels)
Note: f(m,n) is the value of the pixel located in the mth column, nth row
The average of each set is computed.
m1 = average value of G1
m2 = average value of G2
A new threshold is created that is the average of m1 and m2
T = (m1 + m2)/2
Go back to step two, now using the new threshold computed in step four; keep repeating until the new threshold matches the one before it (i.e., until convergence has been reached). This iterative algorithm is a special one-
dimensional case of the k-means clustering algorithm, which has been
proven to converge at a local minimum—meaning that a different initial
threshold may give a different final result.
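A direct sketch of the iterative procedure above (my own, NumPy; the tolerance and test data are arbitrary):

```python
import numpy as np

def iterative_threshold(image, tol=0.5):
    T = image.mean()                       # step 1: an initial threshold
    while True:
        g1 = image[image > T]              # object pixels
        g2 = image[image <= T]             # background pixels
        m1 = g1.mean() if g1.size else 0.0
        m2 = g2.mean() if g2.size else 0.0
        T_new = (m1 + m2) / 2.0            # new threshold = average of the two means
        if abs(T_new - T) < tol:           # repeat until convergence
            return T_new
        T = T_new

img = np.concatenate([np.random.normal(50, 10, 500),
                      np.random.normal(200, 10, 500)]).clip(0, 255)
print(iterative_threshold(img))            # roughly midway between the two modes
```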
Adaptive Thresholding:
Thresholding is called adaptive thresholding when a different threshold is used for different regions in the image. This may also be known as local or dynamic thresholding (Shapiro, et al. 2001:89). Thresholding methods can also be categorized by the information they use: clustering-based methods, where the gray-level samples are clustered in two parts as background and foreground (object), or alternately are modeled as a mixture of two Gaussians; entropy-based methods, which result in algorithms that use the entropy of the foreground and background regions, the cross-entropy between the original and binarized image, etc.; object attribute-based methods, which search for a measure of similarity between the gray-level and the binarized images, such as fuzzy shape similarity, edge coincidence, etc.; spatial methods, which use higher-order probability distributions and/or correlation between pixels; and local methods, which adapt the threshold value on each pixel to the local image characteristics.
Multiband Thresholding:
Colour images can also be thresholded. One approach is to designate a
separate threshold for each of the RGB components of the image and then
combine them with an AND operation. This reflects the way the camera
works and how the data is stored in the computer, but it does not
correspond to the way that people recognize color. Therefore, the HSL and
HSV color models are more often used. It is also possible to use the CMYK
color model (Pham et al., 2007).
Region Growing:
Region growing is a simple region-based image segmentation method. It is
also classified as a pixel-based image segmentation method since it
involves the selection of initial seed points. This approach to segmentation examines neighboring pixels of the initial "seed points" and determines whether the pixel neighbors should be added to the region. The process is iterated on, in the same manner as general data clustering algorithms.
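A compact region-growing sketch (my own): starting from seed points, 4-connected neighbors are added to the region while their gray level stays within a similarity threshold of the seed value:

```python
import numpy as np
from collections import deque

def region_grow(image, seeds, threshold=10):
    """Grow a binary region from `seeds`; a neighbor joins if its gray level
    differs from the mean seed value by at most `threshold`."""
    rows, cols = image.shape
    grown = np.zeros((rows, cols), dtype=bool)
    queue = deque(seeds)
    seed_value = float(np.mean([image[r, c] for r, c in seeds]))
    for r, c in seeds:
        grown[r, c] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-neighbors
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols and not grown[rr, cc]
                    and abs(float(image[rr, cc]) - seed_value) <= threshold):
                grown[rr, cc] = True
                queue.append((rr, cc))
    return grown

img = np.zeros((50, 50), dtype=np.uint8)
img[10:20, 10:20] = 255                            # a bright square on a dark background
print(region_grow(img, seeds=[(15, 15)]).sum())    # 100 pixels grown
```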
Region-based Segmentation:
The main goal of segmentation is to partition an image into regions. Some
segmentation methods such as "Thresholding" achieve this goal by looking for the boundaries between regions based on discontinuities in gray levels or color properties. Region-based segmentation is a technique for determining the regions directly. The basic formulation for region-based segmentation is:
(a) the union of all regions Ri equals the entire image R,
(b) Ri is a connected region, i = 1, 2, ..., n,
(c) Ri ∩ Rj = ∅ for all i ≠ j,
(d) P(Ri) = TRUE for i = 1, 2, ..., n, and
(e) P(Ri ∪ Rj) = FALSE for any adjacent regions Ri and Rj,
where P(Ri) is a logical predicate defined over the points in set Ri and ∅ is the null set. Condition (a) means that the segmentation must be complete; that is, every pixel must be in a region. Condition (b) requires that points in a region must be connected in some predefined sense. Condition (c) indicates that the regions must be disjoint. Condition (d) deals with the properties that must be satisfied by the pixels in a segmented region; for example, P(Ri) = TRUE if all pixels in Ri have the same gray level. Condition (e) indicates that adjacent regions Ri and Rj are different in the sense of predicate P.
may want to segment the lightning from the background. Then probably, we can examine the histogram and choose the seed points from the highest range of it. More information about the image is better; obviously, the connectivity or pixel-adjacency information is helpful for us to determine the threshold and seed points. With the value "minimum area threshold", no region in the region-growing result will be smaller than this threshold in the segmented image. With the value "similarity threshold value", if the difference of pixel value, or the difference of the average gray level of a set of pixels, is less than the "similarity threshold value", the regions will be considered as the same region.
Simulation Examples:
Here we show a simple example of region growing. (Figures 1 to 4; Fig. 4 uses the threshold range 190~255.) Figure 1 is the original image, which is a gray-scale lightning image. The gray-scale value of this image ranges from 0 to 255. The purpose of applying region growing to this image is that we want to mark the strongest lightning part of the image, and we also want the result to be connected without being split apart. Therefore, we
choose the points having the highest gray-scale value, which is 255, as the seed points, as shown in Figure 2.
We can observe the difference between the last two figures, which have different threshold values as shown above. Region growing provides the ability for us to separate the parts we want connected. As we can see from Figure 3 to Figure 5, the segmented results in this example are seed-oriented connected: the results grown from the same seed points form the same regions, and points that are not connected with the seed points in the beginning will not be grown. Therefore, we can note that there are still lots of points in the original image having a gray-scale value above 155 which are not marked in Figure 5. This characteristic ensures the reliability of the segmentation and provides the ability to resist noise. For this example, this characteristic prevents us from marking out the non-lightning parts of the image, because the lightning is always connected as one part. We briefly conclude with the advantages and disadvantages of region growing.
Advantages:
1. Region growing methods can correctly separate the regions that have
the same properties we define.
2. Region growing methods can provide the original images which have
clear edges the good segmentation results.
3. The concept is simple. We only need a small number of seed points to represent the property we want, and then grow the region.
4. We can determine the seed points and the criteria we want to make.
5. We can choose the multiple criteria at the same time.
6. It performs well with respect to noise.
Disadvantages
1. The computation is costly, in both time and power.
2. Noise or variation of intensity may result in holes or over
segmentation.
3. This method may not distinguish the shading of the real images.
We can overcome the noise problem easily by using a mask to filter out the holes or outliers, so in practice the noise problem can be handled. In conclusion, it is obvious that the most serious problem of region growing is that it is power- and time-consuming.
Unit IV
IMAGE COMPRESSION
LEARNING OBJECTIVES:
The rapid growth of digital imaging applications, including desktop publishing, multimedia, teleconferencing, and high-definition television, has increased the need for effective and standard image-compression techniques. After completing this unit, the reader is expected to be familiar with the following concepts:
1. Need for image compression
2. Lossless and lossy image compression
3. Spatial domain and frequency domain image compression
4. Wavelet based compression
5. Image compression
The term data compression refers to the process of reducing the amount
of data required to represent a given quantity of information. A clear
distinction must be made between data and information. They are not
synonymous. In fact, data are the means by which information is
conveyed. Various amounts of data may be used to represent the same
amount of information. Such might be the case, for example, if a long-
winded individual and someone who is short and to the point were to relate the same story. Here, the information of interest is the story; words
are the data used to relate the information. If the two individuals use a
different number of words to tell the same basic story, two different
versions of the story are created, and at least one includes nonessential
data. That is, it contains data (or words) that either provide no relevant
information or simply restate that which is already known. It is thus said
to contain data redundancy.
The compression ratio is CR = n1/n2 and the relative data redundancy is RD = 1 - 1/CR, where n1 and n2 denote the number of information-carrying units in the original and compressed data sets, respectively. For the case n2 = n1, CR = 1 and RD = 0, indicating that (relative to the second data set) the first representation of the information contains no redundant data. When n2 << n1, CR tends to ∞ and RD tends to 1, implying significant compression and highly redundant data. Finally, when n2 >> n1, CR tends to 0 and RD tends to -∞, indicating that the second data set contains much more data than the original representation. This, of course, is the normally undesirable case of data expansion. In general, CR and RD lie in the open intervals (0, ∞) and (-∞, 1), respectively. A practical compression ratio, such as 10 (or 10:1), means that the first data set has 10 information-carrying units (say, bits) for every 1 unit in the second or compressed data set. The corresponding redundancy of 0.9 implies that 90% of the data in the first data set is redundant.
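A tiny worked example of these two quantities, using hypothetical data sizes:

```python
def compression_figures(n1, n2):
    CR = n1 / n2          # compression ratio
    RD = 1 - 1 / CR       # relative data redundancy
    return CR, RD

print(compression_figures(1_000_000, 100_000))   # (10.0, 0.9) -> 10:1, 90% redundant
```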
CODING REDUNDANCY:
In this, we utilize a formulation to show how the gray-level histogram of an image can also provide a great deal of insight into the construction of codes to reduce the amount of data used to represent it. Let us assume, once again, that a discrete random variable rk in the interval [0, 1] represents the gray levels of an image and that each rk occurs with probability pr(rk), where
pr(rk) = nk / n, k = 0, 1, 2, ..., L-1
where L is the number of gray levels, nk is the number of times that the kth gray level appears in the image, and n is the total number of pixels in the image. If the number of bits used to represent each value of rk is l(rk), then the average number of bits required to represent each pixel is
Lavg = Σ l(rk) pr(rk), summed over k = 0 to L-1.
That is, the average length of the code words assigned to the various gray-level values is found by summing the product of the number of bits used to represent each gray level and the probability that the gray level occurs. Thus, the total number of bits required to code an M × N image is M·N·Lavg.
INTERPIXEL REDUNDANCY:
Consider the images shown in Figs. 1.1(a) and (b). As Figs. 1.1(c) and (d)
show, these images have virtually identical histograms. Note also that
both histograms are trimodal, indicating the presence of three dominant
ranges of gray-level values. Because the gray levels in these images are
not equally probable, variable-length coding can be used to reduce the
coding redundancy that would result from a straight or natural binary
encoding of their pixels. The coding process, however, would not alter the
level of correlation between the pixels within the images. In other words,
the codes used to represent the gray levels of each image have nothing to
do with the correlation between pixels. These correlations result from the
structural or geometric relationships between the objects in the image.
Fig. 5.1 Two images and their gray-level histograms and normalized
autocorrelation coefficients along one line.
Figures 5.1(e) and (f) show the respective autocorrelation coefficients
computed along one line of each image.
where
The scaling factor in Eq. above accounts for the varying number of sum
terms that arise for each integer value of n. Of course, n must be strictly
less than N, the number of pixels on a line. The variable x is the
coordinate of the line used in the computation. Note the dramatic
difference between the shape of the functions shown in Figs. 1.1(e) and (f).
Their shapes can be qualitatively related to the structure in the images in
Figs. 1.1(a) and (b). This relationship is particularly noticeable in Fig. 1.1
(f), where the high correlation between pixels separated by 45 and 90
samples can be directly related to the spacing between the vertically
oriented matches of Fig. 1.1(b). In addition, the adjacent pixels of both
images are highly correlated. When n is 1, γ is 0.9922 and 0.9928 for the
images of Figs. 1.1 (a) and (b), respectively. These values are typical of
most properly sampled television images.
PSYCHOVISUAL REDUNDANCY:
The brightness of a region, as perceived by the eye, depends on factors
other than simply the light reflected by the region. For example, intensity
variations (Mach bands) can be perceived in an area of constant intensity.
Such phenomena result from the fact that the eye does not respond with
equal sensitivity to all visual information. Certain information simply has
less relative importance than other information in normal visual
processing. This information is said to be psychovisually redundant. It can be eliminated without significantly impairing the quality of image perception.
original or input image and the compressed and subsequently
decompressed output image, it is said to be based on an objective fidelity
criterion. A good example is the root-mean-square (rms) error between an
input and output image. Let f(x, y) represent an input image and let f^(x, y) denote an estimate or approximation of f(x, y) that results from compressing and subsequently decompressing the input.
For any value of x and y, the error e(x, y) between f(x, y) and f^(x, y) can be defined as
e(x, y) = f^(x, y) - f(x, y)
The rms value of the signal-to-noise ratio, denoted SNRrms, is obtained by taking the square root of the mean-square signal-to-noise ratio given above.
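A sketch of these objective fidelity criteria, the rms error and the rms signal-to-noise ratio, for an image and its compressed-then-decompressed approximation (the test data below are synthetic):

```python
import numpy as np

def rms_error(f, f_hat):
    e = f_hat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(e ** 2))          # sqrt of the mean squared error

def snr_rms(f, f_hat):
    f_hat = f_hat.astype(np.float64)
    e = f_hat - f.astype(np.float64)
    return np.sqrt(np.sum(f_hat ** 2) / np.sum(e ** 2))

f = np.random.randint(0, 256, (64, 64))
f_hat = np.clip(f + np.random.randint(-3, 4, f.shape), 0, 255)
print(rms_error(f, f_hat), snr_rms(f, f_hat))
```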
and may or may not reduce directly the amount of data required to
represent the image.
Fig: (a) Source encoder and (b) source decoder model
In the third and final stage of the source encoding process, the symbol
coder creates a fixed- or variable-length code to represent the quantizer
output and maps the output in accordance with the code. The term
symbol coder distinguishes this coding operation from the overall source
encoding process. In most cases, a variable-length code is used to
represent the mapped and quantized data set. It assigns the shortest code
words to the most frequently occurring output values and thus reduces
coding redundancy. The operation, of course, is reversible. Upon
completion of the symbol coding step, the input image has been processed
to remove each of the three redundancies.
omitted when error-free compression is desired. In addition, some
compression techniques normally are modeled by merging blocks that are
physically separate in Fig. (a). In the predictive compression systems, for instance, the mapper and quantizer are often represented by a single block, which simultaneously performs both operations. The source decoder
shown in Fig(b) contains only two components: a symbol decoder and an
inverse mapper. These blocks perform, in reverse order, the inverse
operations of the source encoder's symbol encoder and mapper blocks.
Because quantization results in irreversible information loss, an inverse
quantizer block is not included in the general source decoder model shown
in Fig(b).
over the bit fields in which even parity was previously established. A
single-bit error is indicated by a nonzero parity word c4c2c1, where
VARIABLE-LENGTH CODING:
The simplest approach to error-free image compression is to reduce only
coding redundancy. Coding redundancy normally is present in any
natural binary encoding of the gray levels in an image. It can be
eliminated by coding the gray levels. To do so requires construction of a
variable- length code that assigns the shortest possible code words to the
most probable gray levels. Here, we examine several optimal and near
optimal techniques for constructing such a code. These techniques are
formulated in the language of information theory. In practice, the source
symbols may be either the gray levels of an image or the output of a gray-
level mapping operation (pixel differences, run lengths, and so on).
HUFFMAN CODING:
The most popular technique for removing coding redundancy is due to
Huffman (Huffman [1952]). When coding the symbols of an information
source individually, Huffman coding yields the smallest possible number
of code symbols per source symbol. In terms of the noiseless coding
theorem, the resulting code is optimal for a fixed value of n, subject to the
constraint that the source symbols be coded one at a time.
the reduced source are also ordered from the most to the least probable.
This process is then repeated until a reduced source with two symbols (at
the far right) is reached.
This operation is then repeated for each reduced source until the original
source is reached. The final code appears at the far left in the figure. The average length of this code is 2.2 bits/symbol,
and the entropy of the source is 2.14 bits/symbol. The resulting Huffman
code efficiency is 0.973.
Huffman's procedure creates the optimal code for a set of symbols and
probabilities subject to the constraint that the symbols be coded one at a
time. After the code has been created, coding and/or decoding is
accomplished in a simple lookup table manner. The code itself is an
instantaneous uniquely decodable block code. It is called a block code
because each source symbol is mapped into a fixed sequence of code
symbols. It is instantaneous, because each code word in a string of code
symbols can be decoded without referencing succeeding symbols. It is
uniquely decodable, because any string of code symbols can be decoded in
only one way. Thus, any string of Huffman encoded symbols can be
decoded by examining the individual symbols of the string in a left to
right manner. For the binary code of Fig. a left-to-right scan of the
encoded string
010100111100 reveals that the first valid code word is 01010, which is the
code for symbol a3. The next valid code is 011, which corresponds to
symbol a1. Continuing in this manner reveals the completely decoded
message to be a3a1a2a2a6.
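A compact Huffman-coding sketch (my own, using Python's heapq and hypothetical source probabilities, not the figure from the text); it builds a code and reports the resulting average code-word length:

```python
import heapq

def huffman_code(probabilities):
    """probabilities: dict mapping symbol -> probability. Returns symbol -> bit string."""
    heap = [[p, i, {sym: ""}] for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # the two least probable reduced symbols
        p2, _, codes2 = heapq.heappop(heap)
        for s in codes1:
            codes1[s] = "0" + codes1[s]       # prepend a bit while backing up the tree
        for s in codes2:
            codes2[s] = "1" + codes2[s]
        heapq.heappush(heap, [p1 + p2, counter, {**codes1, **codes2}])
        counter += 1
    return heap[0][2]

# Hypothetical source probabilities (illustrative only)
probs = {"a1": 0.4, "a2": 0.3, "a3": 0.1, "a4": 0.1, "a5": 0.06, "a6": 0.04}
codes = huffman_code(probs)
L_avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, "L_avg =", round(L_avg, 2), "bits/symbol")
```

The shortest code words end up assigned to the most probable symbols, which is exactly the property the text describes.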
ARITHMETIC CODING:
Unlike the variable-length codes described previously, arithmetic coding
generates non-block codes. In arithmetic coding, which can be traced to
the work of Elias, a one-to-one correspondence between source symbols
and code words does not exist. Instead, an entire sequence of source
symbols (or message) is assigned a single arithmetic code word. The code
word itself defines an interval of real numbers between 0 and 1. As the
number of symbols in the message increases, the interval used to
represent it becomes smaller and the number of information units (say,
bits) required to represent the interval becomes larger. Each symbol of
the message reduces the size of the interval in accordance with its
probability of occurrence. Because the technique does not require, as does
Huffman's approach, that each source symbol translate into an integral
number of code symbols (that is, that the symbols be coded one at a time),
it achieves (but only in theory) the bound established by the noiseless
coding theorem.
In the arithmetically coded message of Fig., three decimal digits are used
to represent the five-symbol message. This translates into 3/5 or 0.6
decimal digits per source symbol and compares favorably with the entropy
of the source, which is 0.58 decimal digits (10-ary units) per symbol. As the
length of the sequence being coded increases, the resulting arithmetic
code approaches the bound established by the noiseless coding theorem.
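The interval-narrowing idea can be sketched as follows; floating-point arithmetic is used for clarity (a practical coder uses integer renormalization), and the five-symbol message and symbol probabilities below are illustrative assumptions.

```python
def arithmetic_encode(message, probs):
    """Narrow [0, 1) symbol by symbol; any number in the final interval encodes the message."""
    ranges, cum = {}, 0.0
    for sym, p in probs.items():                 # cumulative sub-interval for each symbol
        ranges[sym] = (cum, cum + p)
        cum += p
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        lo_frac, hi_frac = ranges[sym]
        high = low + span * hi_frac              # each symbol shrinks the interval
        low = low + span * lo_frac               # in proportion to its probability
    return low, high

# Hypothetical four-symbol source and a five-symbol message
probs = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}
low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], probs)
print(low, high)   # any number in [low, high), e.g. 0.068, identifies the whole message
```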
LZW CODING:
The technique, called Lempel-Ziv-Welch (LZW) coding, assigns fixed-
length code words to variable length sequences of source symbols but
requires no a priori knowledge of the probability of occurrence of the
symbols to be encoded. LZW compression has been integrated into a
variety of mainstream imaging file formats, including the graphics
interchange format (GIF), the tagged image file format (TIFF), and the
portable document format (PDF).
A codebook, or dictionary, containing the source symbols to be coded is
first constructed. For an 8-bit monochrome image, the first 256 words of
the dictionary are assigned to the gray levels 0 through 255. The size of
the dictionary is an important system parameter: if it is too small, the
detection of matching gray-level sequences will be less likely; if it is too
large, the size of the code words will adversely affect compression performance.
Consider the following 4 x 4, 8-bit image of a vertical edge:
Locations 256 through 511 are initially unused. The image is encoded by
processing its pixels in a left-to-right, top-to-bottom manner. Each
successive gray-level value is concatenated with a variable—column 1 of
Table 6.1 —called the "currently recognized sequence." As can be seen,
this variable is initially null or empty. The dictionary is searched for each
concatenated sequence and, if found, as was the case in the first row of the
table, the currently recognized sequence is replaced by the newly concatenated
and recognized (i.e., located in the dictionary) sequence. This was done in column 1 of row 2.
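A minimal LZW encoder sketch is given below, assuming a dictionary of 512 locations whose first 256 entries hold the gray levels 0 through 255; the 4 x 4 vertical-edge image with two gray values is a hypothetical example.

```python
def lzw_encode(pixels, max_dict_size=512):
    """Encode a sequence of 8-bit gray levels with a basic LZW dictionary."""
    dictionary = {(g,): g for g in range(256)}   # locations 0-255 hold single gray levels
    next_code = 256                              # locations 256-511 are initially unused
    current = ()                                 # the "currently recognized sequence"
    output = []
    for pixel in pixels:
        candidate = current + (pixel,)
        if candidate in dictionary:              # keep growing the recognized sequence
            current = candidate
        else:
            output.append(dictionary[current])   # emit code for the recognized sequence
            if next_code < max_dict_size:        # add the new sequence to the dictionary
                dictionary[candidate] = next_code
                next_code += 1
            current = (pixel,)
    if current:
        output.append(dictionary[current])
    return output

# Hypothetical 4 x 4 vertical-edge image, flattened left-to-right, top-to-bottom
image = [39, 39, 126, 126] * 4
print(lzw_encode(image))   # repeated gray-level sequences are replaced by single codes
```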
BIT-PLANE CODING:
An effective technique for reducing an image's interpixel redundancies is
to process the image's bit planes individually. The technique, called bit-
plane coding, is based on the concept of decomposing a multilevel
(monochrome or color) image into a series of binary images and
compressing each binary image via one of several well-known binary
compression methods.
BIT-PLANE DECOMPOSITION:
The gray levels of an m-bit gray-scale image can be represented in the
form of the base 2 polynomial
a(m-1)*2^(m-1) + a(m-2)*2^(m-2) + ... + a1*2^1 + a0*2^0
A simple method of decomposing the image into a collection of binary
images is to separate the m coefficients of this polynomial into m 1-bit
bit planes. An alternative approach, which reduces the effect of small
gray-level variations, is to first represent the image by an m-bit Gray
code g(m-1)...g1g0, computed from
gi = ai ⊕ a(i+1), for 0 <= i <= m-2
g(m-1) = a(m-1)
Here, ⊕ denotes the exclusive OR operation. This code has the unique
property that successive code words differ in only one-bit position. Thus,
small changes in gray level are less likely to affect all m bit planes. For
instance, when gray levels 127 and 128 are adjacent, only the 7th bit
plane will contain a 0 to 1 transition, because the Gray codes of 127 and
128 (01000000 and 11000000) differ only in their most significant bit,
whereas the corresponding binary codes (01111111 and 10000000) differ
in every bit plane.
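A minimal NumPy sketch of bit-plane decomposition and binary-to-Gray conversion is shown below; the 2 x 2 test image is hypothetical and is chosen to contain the adjacent gray levels 127 and 128 discussed above.

```python
import numpy as np

def bit_planes(image, bits=8):
    """Return a list of binary images, one per bit plane (LSB first)."""
    return [(image >> i) & 1 for i in range(bits)]

def to_gray_code(image):
    """Convert straight binary gray levels to their Gray-code representation."""
    return image ^ (image >> 1)          # gi = ai XOR a(i+1), g(m-1) = a(m-1)

# Hypothetical 2 x 2 image containing the adjacent gray levels 127 and 128
img = np.array([[127, 128], [127, 128]], dtype=np.uint8)
binary_planes = bit_planes(img)
gray_planes = bit_planes(to_gray_code(img))

# Binary codes 01111111 and 10000000 differ in every plane,
# while Gray codes 01000000 and 11000000 differ only in plane 7.
differing_binary = [i for i, p in enumerate(binary_planes) if p[0, 0] != p[0, 1]]
differing_gray = [i for i, p in enumerate(gray_planes) if p[0, 0] != p[0, 1]]
print(differing_binary)   # [0, 1, 2, 3, 4, 5, 6, 7]
print(differing_gray)     # [7]
```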
LOSSLESS PREDICTIVE CODING:
In lossless predictive coding, the encoder predicts each pixel fn from
previously coded pixels, forms the prediction error en = fn - f^n, and codes
the error with a variable-length code. The decoder of Fig. 8.1(b)
reconstructs en from the received variable-length code words and performs
the inverse operation
fn = en + f^n
Various local, global, and adaptive methods can be used to generate f^n.
In most cases, however, the prediction is formed by a linear combination
of m previous pixels. That is,
f^n = round[ Σ (i=1 to m) αi * f(n-i) ]
where m is the order of the linear predictor, round denotes the nearest
integer operation, and the αi are prediction coefficients.
OPTIMAL PREDICTORS:
The optimal predictor used in most predictive coding applications
minimizes the encoder's mean-square prediction error
E{ en^2 } = E{ [fn - f^n]^2 }
subject to the constraint that the prediction f^n is a linear combination of
the m previous pixels.
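As an illustration, the sketch below implements the simplest case, a first-order previous-pixel predictor (m = 1, α1 = 1) applied to a single hypothetical scan line; it shows the encoder's prediction errors and the decoder's exact reconstruction.

```python
import numpy as np

def predictive_encode(row):
    """Prediction errors en = fn - f^n, with the previous pixel as the prediction f^n."""
    row = row.astype(np.int32)
    errors = row.copy()
    errors[1:] = row[1:] - row[:-1]      # the first pixel is transmitted as-is
    return errors

def predictive_decode(errors):
    """Inverse operation fn = en + f^n, i.e. a running sum of the received errors."""
    return np.cumsum(errors)

row = np.array([100, 102, 103, 103, 101, 99], dtype=np.uint8)   # hypothetical scan line
e = predictive_encode(row)
print(e)                        # small, highly peaked values around zero (except the first)
print(predictive_decode(e))     # reconstructs the original row exactly
```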
TRANSFORM CODING:
All the predictive coding techniques operate directly on the pixels of an
image and thus are spatial domain methods. In this coding, we consider
compression techniques that are based on modifying the transform of an
image. In transform coding, a reversible, linear transform (such as the
Fourier transform) is used to map the image into a set of transform
coefficients, which are then quantized and coded. For most natural
images, a significant number of the coefficients have small magnitudes
and can be coarsely quantized (or discarded entirely) with little image
distortion. A variety of transformations, including the discrete Fourier
transform (DFT), can be used to transform the image data.
Fig. A transform coding system: (a) encoder; (b) decoder
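A minimal sketch of block transform coding on one 8 x 8 block is given below, using a discrete cosine transform (DCT), another commonly used transform, implemented directly with NumPy; the uniform quantization step and the smooth test block are illustrative assumptions, and a practical system would add zig-zag ordering and entropy coding.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, so that coeffs = C @ block @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def code_block(block, step=16.0):
    """Transform, coarsely quantize, then reconstruct a single 8 x 8 block."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T                       # forward transform
    quantized = np.round(coeffs / step)            # most small coefficients become zero
    reconstructed = C.T @ (quantized * step) @ C   # dequantize and inverse transform
    return quantized, reconstructed

block = np.tile(np.linspace(100, 140, 8), (8, 1))  # hypothetical smooth 8 x 8 block
q, rec = code_block(block)
print(np.count_nonzero(q), "of 64 coefficients are nonzero")
print(np.abs(block - rec).max(), "maximum reconstruction error")
```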
WAVELET CODING:
The wavelet coding is based on the idea that the coefficients of a
transform that decorrelates the pixels of an image can be coded more
efficiently than the original pixels themselves. If the transform's basis
functions—in this case wavelets—pack most of the important visual
information into a small number of coefficients, the remaining coefficients
can be quantized coarsely or truncated to zero with little image distortion.
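The idea can be illustrated with a single-level 2D Haar decomposition written directly with NumPy (practical coders such as JPEG 2000 use longer biorthogonal filters and several decomposition levels); the smooth test image is hypothetical, and almost all of the coefficient energy ends up in the approximation band.

```python
import numpy as np

def haar2d(image):
    """One level of a 2D Haar transform: approximation (LL) plus three detail bands."""
    img = image.astype(float)
    lo = (img[:, 0::2] + img[:, 1::2]) / 2     # row-wise averages (low-pass)
    hi = (img[:, 0::2] - img[:, 1::2]) / 2     # row-wise differences (high-pass)
    LL = (lo[0::2, :] + lo[1::2, :]) / 2       # approximation
    LH = (lo[0::2, :] - lo[1::2, :]) / 2       # horizontal detail
    HL = (hi[0::2, :] + hi[1::2, :]) / 2       # vertical detail
    HH = (hi[0::2, :] - hi[1::2, :]) / 2       # diagonal detail
    return LL, LH, HL, HH

x = np.linspace(0, 1, 8)
image = 100 + 40 * np.outer(x, x)              # hypothetical smooth 8 x 8 image
LL, LH, HL, HH = haar2d(image)
energies = [np.sum(band ** 2) for band in (LL, LH, HL, HH)]
print(energies[0] / sum(energies))             # close to 1: LL holds almost all the energy
```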
Unit V
IMAGE REPRESENTATION AND
RECOGNITION
LEARNING OBJECTIVES:
Image representation is the process of generating descriptions from the
visual contents of an image. After reading this unit, the reader should be
familiar with the following concepts:
1. Boundary representation
2. Chain Code
3. Descriptors
4. Pattern Recognition
BOUNDARY REPRESENTATION:
Boundary representation models are a more explicit representation than
CSG (Constructive Solid Geometry). The object is represented by a data
structure giving information about each of the object's faces, edges and
vertices and how they are joined together.
This appears to be a more natural representation for vision, since surface
information is readily available. The description of the object can be split
into two parts:
Topology: It records the connectivity of the faces, edges and vertices by
means of pointers in the data structure.
Geometry: It describes the exact shape and position of each of the edges,
faces and vertices. The geometry of a vertex is just its position in space as
given by its (x,y,z) coordinates. Edges may be straight lines, circular arcs,
etc. A face is represented by some description of its surface (algebraic or
parametric forms used).
CHAIN CODE:
A chain code is a lossless compression technique for binary (monochrome)
images. The basic principle of chain codes is to separately encode each
connected component, or "blob", in the image. For each such region, a point
on the boundary is selected and its coordinates are transmitted. The encoder
then moves along the boundary of the region and, at each step, transmits
a symbol representing the direction of this movement. This continues
until the encoder returns to the starting position, at which point the blob
has been completely described, and encoding continues with the next blob
in the image.
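A minimal sketch of the direction-symbol step is shown below, assuming the ordered boundary points of a blob are already available; it uses the common 8-directional (Freeman) numbering with 0 pointing east and directions counted counter-clockwise.

```python
# 8-directional chain code: direction symbols for moves between consecutive
# boundary points, as (row change, column change) pairs.
DIRECTIONS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
              (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(boundary):
    """Convert an ordered, closed list of boundary pixels (row, col) to a chain code."""
    code = []
    for (r0, c0), (r1, c1) in zip(boundary, boundary[1:] + boundary[:1]):
        code.append(DIRECTIONS[(r1 - r0, c1 - c0)])
    return code

# Hypothetical 2 x 2 square traced clockwise starting at its top-left pixel
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(chain_code(square))   # [0, 6, 4, 2]: east, south, west, north
```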
Finding the skeleton directly from its definition is very costly, since it
involves calculating the distance from every point in the object to every
point on the perimeter.
Usually some iterative thinning method is used to remove perimeter pixels
until the skeleton alone remains (a minimal sketch is given after the list below).
In order to achieve this, the following rules must be adhered to:
1. Erosion of object must be kept to a minimum
2. Connected components must not be broken
3. End points must not be removed
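The following sketch applies one well-known iterative thinning method, the Zhang-Suen algorithm, which follows the spirit of the three rules above; the small rectangular test object is hypothetical.

```python
import numpy as np

def neighbours(img, r, c):
    """P2..P9: the 8 neighbours of (r, c), clockwise starting from the pixel above."""
    return [img[r-1, c], img[r-1, c+1], img[r, c+1], img[r+1, c+1],
            img[r+1, c], img[r+1, c-1], img[r, c-1], img[r-1, c-1]]

def zhang_suen_thinning(binary):
    """Iteratively peel boundary pixels of a 0/1 image until only a skeleton remains."""
    img = np.pad(binary.astype(np.uint8), 1)   # pad so every object pixel has 8 neighbours
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            rows, cols = np.nonzero(img)
            for r, c in zip(rows, cols):
                P = neighbours(img, r, c)
                B = sum(P)                                   # number of object neighbours
                A = sum((P[i] == 0) and (P[(i + 1) % 8] == 1) for i in range(8))
                if 2 <= B <= 6 and A == 1:                   # keep end points and connectivity
                    if step == 0 and P[0]*P[2]*P[4] == 0 and P[2]*P[4]*P[6] == 0:
                        to_delete.append((r, c))
                    if step == 1 and P[0]*P[2]*P[6] == 0 and P[0]*P[4]*P[6] == 0:
                        to_delete.append((r, c))
            for r, c in to_delete:
                img[r, c] = 0
            changed = changed or bool(to_delete)
    return img[1:-1, 1:-1]

# Hypothetical 5 x 7 image containing a solid 3 x 5 rectangle
rect = np.zeros((5, 7), dtype=np.uint8)
rect[1:4, 1:6] = 1
print(zhang_suen_thinning(rect))   # the object is reduced to a thin set of pixels
```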
The purpose of description is to quantify a representation of an object.
This means that instead of talking about regions in the image, we can
talk about their properties, such as length, curviness and so on.
The diameter of a boundary is defined as Diam = max[D(Pi, Pj)] over all
pairs of boundary points Pi and Pj, where D is a metric (distance measure).
The line that passes through the points Pi and Pj that define the
diameter is called the major (main) axis of the object.
The curviness of the perimeter can be obtained by calculating the angle
between two consecutive line segments of the polygonal approximation;
this angle approximates the curvature at the point Pj where the two
segments meet.
We can approximate the area of an object with the number of pixels
belonging to the object.
More accurate measures are, however, obtained by using a polygonal
approximation. The area of a polygon segment (a triangle with one corner
at the origin and the other two corners at consecutive vertices (xi, yi) and
(xi+1, yi+1)) is given by Ai = (xi*yi+1 - xi+1*yi)/2, and the area of the
entire polygon is the sum of these contributions.
A circle of radius r has the area A = π*r^2 and the length of the perimeter
is P = 2*π*r. So, by defining the quotient 4*π*A/P^2, which equals 1 for a
circle and is smaller for any other shape, we obtain a measure of how
compact (circular) the object is.
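A minimal sketch of these area and compactness measures is given below; the square polygon is a hypothetical region, and the quotient 4*π*A/P^2 equals 1 for a circle and π/4 for a square.

```python
import math

def polygon_area(vertices):
    """Shoelace formula: sum of signed triangle areas with one corner at the origin."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0

def perimeter(vertices):
    """Total length of the closed polygonal boundary."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:] + vertices[:1]))

def circularity(vertices):
    """4*pi*A / P^2: equals 1 for a circle, smaller for less compact shapes."""
    A, P = polygon_area(vertices), perimeter(vertices)
    return 4 * math.pi * A / (P * P)

square = [(0, 0), (4, 0), (4, 4), (0, 4)]           # hypothetical square region
print(polygon_area(square), circularity(square))    # 16.0 and pi/4 (about 0.785)
```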
The measure of the shape of the object can be obtained according to:
1. Calculate the chain code for the object
2. Calculate the difference code for the chain code
3. Rotate the code so that it is minimal
4. This number is called the shape number of the object
5. The length of the number is called the order of the shape
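A minimal sketch of steps 2 through 5 is shown below, assuming a 4-directional chain code has already been computed (step 1); the example chain code is a hypothetical clockwise trace of a small square.

```python
def difference_code(chain, directions=4):
    """Step 2: counter-clockwise steps between consecutive chain symbols (circular)."""
    return [(chain[i] - chain[i - 1]) % directions for i in range(len(chain))]

def shape_number(chain, directions=4):
    """Steps 3-4: the rotation of the difference code that forms the smallest number."""
    diff = difference_code(chain, directions)
    rotations = [diff[i:] + diff[:i] for i in range(len(diff))]
    return min(rotations)

chain = [0, 0, 3, 3, 2, 2, 1, 1]      # hypothetical 4-directional chain code of a square
sn = shape_number(chain)
print(sn, "order:", len(sn))          # step 5: the order is the length of the shape number
```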
TEXTURES:
"Texture" is an ambiguous word and in the context of texture synthesis
may have one of the following meanings. In common speech, "texture" is used
as a synonym for "surface structure". Texture has been described by five
different properties in the psychology of perception: coarseness, contrast,
directionality, line-likeness and roughness.
In 3D computer graphics, a texture is a digital image applied to the
surface of a three-dimensional model by texture mapping to give the
model a more realistic appearance. Often, the image is a photograph of a
"real" texture, such as wood grain.In image processing, every digital
image composed of repeated elements is called a "texture." Texture can be
arranged along a spectrum going from stochastic to regular:
For instance, imagine somebody searching for a scene of a happy person.
Happiness is a feeling, and it is not evident how to describe it in terms of
shape, color and texture in images. Describing audio-visual content is not
a trivial task, and it is essential for the effective use of this type of
archive. The standardization effort that deals with audio-visual
descriptors is MPEG-7 (Moving Picture Experts Group - 7).
COLOR: The most basic quality of visual content. Five tools are defined to
describe color. The first four tools describe the color distribution and
layout within a single image, and the last one describes the color relation
between sequences or groups of images:
1. Dominant Color Descriptor (DCD)
2. Scalable Color Descriptor (SCD)
3. Color Structure Descriptor (CSD)
4. Color Layout Descriptor (CLD)
5. Group of Frames/Group of Pictures Color Descriptor (GoF/GoP)
Descriptors Applications:
Among all applications, the most important ones are:
• Multimedia documents search engines and classifiers.
• Digital libraries: visual descriptors allow a very detailed and concrete
search of any video or image by means of different search
parameters, for instance, the search for films in which a known actor
appears, or the search for videos containing Mount Everest.
• Personalized electronic news service.
• Possibility of an automatic connection to a TV channel broadcasting a
soccer match, for example, whenever a player approaches the goal
area.
• Control and filtering of concrete audio-visual contents, like violent or
pornographic material. Also, authorization for some multimedia
contents.
Example: consider a face; the eyes, ears, nose, etc. are features of the
face.
A set of features taken together forms a feature vector.
Example: in the case of the face above, if all the features (eyes, ears, nose,
etc.) are taken together, the resulting sequence is a feature vector ([eyes, ears, nose]).
A feature vector is a sequence of features represented as a d-dimensional
column vector. In the case of speech, MFCCs (Mel-Frequency Cepstral
Coefficients) are spectral features of the speech signal, and the sequence of
the first 13 coefficients forms a feature vector.
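As a small illustrative sketch, region descriptors of the kind discussed in this unit can be stacked into a d-dimensional column vector with NumPy; the descriptor values below are hypothetical.

```python
import numpy as np

# Hypothetical descriptors measured for one region of an image
area = 16.0                                        # e.g., from the polygon area above
perim = 16.0
circ = 4 * np.pi * area / perim ** 2               # circularity quotient
mean_gray = 121.4                                  # mean gray level of the region

# The feature vector is a d-dimensional column vector (here d = 4)
x = np.array([[area], [perim], [circ], [mean_gray]])
print(x.shape)   # (4, 1)
```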
Training Set: training data is the portion of an overall dataset used to
train a model; the remainder is held out as a testing set. As a rule, the
better the training data, the better the algorithm or classifier performs.
3. What is pixel?
a) Pixel is the elements of a digital image
b) Pixel is the elements of an analog image
c) Pixel is the cluster of a digital image
d) Pixel is the cluster of an analog image
12. After the digitization process, a digital image has M rows and N columns
(both positive integers), and the maximum number of gray levels L is an
integer power of 2, L = 2^k, for each pixel. Then, the number b of bits
required to store a digitized image is:
a) b=M*N*k
b) b=M*N*L
c) b=M*L*k
d) b=L*N*k
13. In digital image of M rows and N columns and L discrete gray levels,
calculate the bits required to store a digitized image for M=N=32 and
L=16.
a) 16384
b) 4096
c) 8192
d) 512
15. The most familiar single sensor used for Image Acquisition is
a) Microdensitometer
b) Photodiode
c) CMOS
d) None of the Mentioned
16. The difference in intensity between the highest and the lowest
intensity levels in an image is ___________
a) Noise
b) Saturation
c) Contrast
d) Brightness
a. s=clog10(1/r)
b. s=clog10(1+r)
c. s=clog10(1*r)
d. s=clog10(1-r)
21. What is the full form for PDF, a fundamental descriptor of random
variables i.e. gray values in an image?
a) Pixel distribution function
b) Portable document format
c) Pel deriving function
d) Probability density function
24. Which of the following comes under the application of image blurring?
a) Object detection
b) Gross representation
c) Object motion
d) Image segmentation
26. What is the name of process used to correct the power-law response
phenomena?
a) Beta correction
b) Alpha correction
c) Gamma correction
d) Pie correction
30. The lowpass filtering process can be applied in which of the following
area(s)?
a) The field of machine perception, with application of character
recognition
b) In field of printing and publishing industry
c) In field of processing satellite and aerial images
d) All of the mentioned
37. If the inner region of the object is textured, then the approach we use is
a) Discontinuity
b) Similarity
c) Extraction
d) Recognition
43. The process in which images are input and attributes are output is called
a) Low Level Processes
b) High Level Processes
c) Mid-Level Processes
d) Edge Level Processes
c) Continuity
d) Zero Crossing
57. Which of the following measures are not used to describe a region?
a) Mean and median of grey values
REFERENCES