DIP Answer Key

Question Repository for aApr/May 2023 Examinations
Common To
Subject Code 19EC503 Subject Name Digital Image Processing
Electronics and Communication ECE, EIE, CSE

Faculty Name Dr. S. Dhandapani Department
Engineering
(PART A – 2 Marks)
UNIT - I
Q. No Questions
Brightness of an object is the perceived luminance of the surround. Two objects with different surroundings would have identical luminance but
QA101
different brightness.
QA102 A Color model is a specification of 3D-coordinates system and a subspace within that system where each color is represented by a single point
In pixel resolution, the term resolution refers to the total number of count of pixels in an digital image. For example. If an image has M rows and N
QA103
columns, then its resolution can be defined as M X N.
Image sharpening and restoration

Medical field
Remote sensing
Transmission and encoding
Machine/Robot vision
QA104
Color processing
Pattern recognition
Video processing
Microscopic Imaging
Photopic vision
The human being can resolve the fine details with these cones because each one is connected to its own nerve end. This is also known as bright light
vision.
QA105
Scotopic vision
Several rods are connected to one nerve end. So, it gives the overall picture of the image. This is also known as thin light vision.
UNIT - II
Q. No Questions
Zooming simply means enlarging a picture in a sense that the details in the image became more visible and clear. Zooming an image has many wide
QA201 applications ranging from zooming through a camera lens, to zoom an image on internet e.t.c.
Spatial resolution states that the clarity of an image cannot be determined by the pixel resolution. The number of pixels in an image does not matter.
QA202
Spatial resolution can be defined as the smallest discernible detail in an image.
QA203 1) To reduce band width 2) To reduce redundancy 3) To extract feature.
QA204 The individual periods of the convolution will overlap and referred to as wrap around error
DPI or dots per inch is a measure of spatial resolution of printers. In case of printers, dpi means that how many dots of ink are printed per inch when
QA205
an image get printed out from the printer.
UNIT - III
Q. No Questions
• Prewitt Operator
• Sobel Operator
• Robinson Compass Masks
QA301
• Krisch Compass Masks
• Laplacian Operator.
• Spatial domain refers to image plane itself & approaches in this category are based on direct manipulation of picture image. • Frequency
QA302
domain methods based on modifying the image by fourier transform.
Impulse (Salt &Pepper) Noise:
In this case, the noise is signal dependent, and is multiplied to the image. The PDF of bipolar (impulse) noise is given by
If b>a, gray level b will appear as a light dot in image. Level a will appear like a dark dot.
QA303
Kirsch Compass Mask is also a derivative mask which is used for finding edges. This is also like Robinson compass find edges in all the
QA304 eight directions of a compass. The only difference between Robinson and kirsch compass masks is that in Kirsch we have a standard mask
but in Kirsch we change the mask according to our own requirements.
An important application of image averaging is in the field of astronomy, where imaging with very low light levels is routine, causing sensor
QA305
noise frequently to render single images virtually useless for analysis.
UNIT - IV
Q. No Questions
When Threshold T depends only on f(x,y) then the threshold is called global . If T depends both on f(x,y) and p(x,y) is called
QA401 local. If T depends on the spatial coordinates x and y the threshold is called dynamic or adaptive where f(x,y) is the original
image
Lossless compression can recover the exact original data after compression. It is used mainly for compressing database records,
spreadsheets or word processing files, where exact replication of the original is essential. Lossy compression will result in a
QA402 certain loss of accuracy in exchange for a substantial increase in compression. Lossy compression is more effective when used to
compress graphic images and digitized voice where losses outside visual or aural perception can be tolerated.
Variable Length Coding is the simplest approach to error free compression. It reduces only the coding redundancy. It assigns the
QA403
shortest possible codeword to the most probable gray levels.
• The sign of the second derivative can be used to determine whether an edge pixel lies on the dark or light side of an edge. • It
QA404 produces two values for every edge in an image. • An imaginary straightline joining the extreme positive and negative values of
the second derivative would cross zero near the midpoint of the edge.
Source encoder is responsible for removing the coding and interpixel redundancy and psycho visual redundancy.
QA405
Channel encoder reduces reduces the impact of the channel noise by inserting redundant bits into the source encoded data.
UNIT - V
Q. No Questions
The patterns used to estimate the parameters are called training patterns,anda set of such patterns from each class is called a
QA501
training set.
Region growing is a procedure that groups pixels or subregions in to layer regions based on predefined criteria. The basic
QA502 approach is to start with a set of seed points and from there grow regions by appending to each seed these neighbouring pixels
that have properties similar to the seed.
An important approach to represent the structural shape of a plane region is to reduce it to a graph. This reduction may be accomplished by
QA503 obtaining the skeletonizing algorithm. It plays a central role in a broad range of problems in image processing, ranging from automated
inspection of printed circuit boards to counting of asbestos fibers in air filter.
Eccentricity of boundary is the ratio of the major axis to minor axis.
QA504
Curvature is the rate of change of slope.
An approach used to control over segmentation is based on markers. marker is a connected component belonging to an image. We have
QA505
internal markers, associated with objects of interest and external markers associated with background.
(PART B – 13 Marks - Either Or Type)

UNIT - I
Q. No Questions
There are two categories of the steps involved in the image processing –
QB101 (a) Methods whose outputs are input are images.
Methods whose outputs are attributes extracted from those images.
Fig: Fundamental Steps in Digital Image Processing
Image Acquisition: It could be as simple as being given an image that is already in digital form. Generally, the image acquisition stage
involves processing such scaling.
Image Enhancement: It is among the simplest and most appealing areas of digital image processing. The idea behind this is to bring out
details that are obscured or simply to highlight certain features of interest in image. Image enhancement is a very subjective area of
imageprocessing.
Image Restoration: It deals with improving the appearance of an image. It is an objective approach, in the sense that restoration
techniques tend to be based on mathematical or probabilistic models of image processing. Enhancement, on the other hand is based on
human subjective preferences regarding what constitutes a “good” enhancement result.
Color Image Processing: It is an area that is been gaining importance because of the use of digital images over the internet. Color image
processing deals with basically color models and their implementation in image processingapplications.
Wavelets and Multiresolution Processing: These are the foundation for representing image in various degrees of resolution.
Compression: It deals with techniques reducing the storage required to save an image, or the bandwidth required to transmit it over the
network. It has to major approaches
Lossless Compression
Lossy Compression
Morphological Processing: It deals with tools for extracting image components that are useful in the representation and description of
shape and boundary of objects. It is majorly used in automated inspection applications.
Representation and Description: It always follows the output of segmentation step that is, raw pixel data, constituting either the boundary
of an image or points in the region itself. In either case converting the data to a form suitable for computer processing is necessary.
Recognition: It is the process that assigns label to an object based on its descriptors. It is the last step of image processing which use
artificial intelligence of software.
Knowledge Base: Knowledge about a problem domain is coded into an image processing system in the form of a knowledge base. This
knowledge may be as simple as detailing regions of an image where the information of the interest in known to be located. Thus, limiting
search that has to be conducted in seeking the information. The knowledge base also can be quite complex such interrelated list of all
major possible defects in a materials inspection problem or an image database containing high resolution satellite images of a region in
connection with change detection application.
(Or)
Image Sensing and Acquisition
There are 3 principal sensor arrangements (produce an electrical output proportional to light intensity).
(i)Single imaging Sensor (ii)Line sensor (iii)Array sensor
QB101 (b)
(ii)
(iii)
Image Acquisition using a single sensor
The most common sensor of this type is the photodiode, which is constructed of silicon materials and whose output voltage waveform is
proportional to light. The use of a filter in front of a sensor improves selectivity. For example, a green (pass) filter in front of a light sensor
favours light in the green band of the color spectrum. As a consequence, the sensor output will be stronger for green light than for other
components in the visible spectrum.
Fig: Combining a single sensor with motion to generate a 2-D image

In order to generate a 2-D image using a single sensor, there have to be relative displacements in both the x- and y-directions between the
sensor and the area to be imaged. An arrangement used in high precision scanning, where a film negative is mounted onto a drum whose
mechanical rotation provides displacement in one dimension. The single sensor is mounted on a lead screw that provides motion in the
perpendicular direction. Since mechanical motion can be controlled with high precision, this method is an inexpensive (but slow) way to
obtain high-resolution images.
Image Acquisition using Sensor Strips
Fig: (a) Image acquisition using linear sensor strip (b) Image acquisition using circular sensor strip.
The strip provides imaging elements in one direction. Motion perpendicular to the strip provides imaging in the other direction. This is the
type of arrangement used in most flatbed scanners. Sensing devices with 4000 or more in-line sensors are possible. In-line sensors are
used routinely in airborne imaging applications, in which the imaging system is mounted on an aircraft that flies at a constant altitude and
speed over the geographical area to be imaged. One-dimensional imaging sensor strips that respond to various bands of the
electromagnetic spectrum are mounted perpendicular to the direction of flight. The imaging strip gives one line of an image at a time, and
the motion of the strip completes the other dimension of a two-dimensional image. Sensor strips mounted in a ring configuration are used
in medical and industrial imaging to obtain cross- sectional (“slice”) images of 3-D objects. A rotating X-ray source provides illumination
and the portion of the sensors opposite the source collect the X-ray energy that pass through the object
(the sensors obviously have to be sensitive to X-ray energy).This is the basis for medical and industrial computerized axial tomography
(CAT) imaging.
Image Acquisition using Sensor Arrays
Fig: An example of the digital image acquisition process (a) energy source (b) An element of a scene (d) Projection of the scene into the
image (e) digitized image

This type of arrangement is found in digital cameras. A typical sensor for these cameras is a CCD array, which can be manufactured with
a broad range of sensing properties and can be packaged in rugged arrays of 4000 * 4000 elements or more. CCD sensors are used widely
in digital cameras and other light sensing instruments. The response of each sensor is proportional to the integral of the light energy
projected onto the surface of the sensor, a property that is used in astronomical and other applications requiring low noise images.
The first function performed by the imaging system is to collect the incoming energy and focus it onto an image plane. If the illumination
is light, the front end of the imaging system is a lens, which projects the viewed scene onto the lens focal plane. The sensor array, which is
coincident with the focal plane, produces outputs proportional to the integral of the light received at each sensor.
# Cornea and Sclera

# Choroid – Iris diaphragm and Ciliary body
# Retina- Cones and Rods
The eye is early a sphere, with an average diameter of approximately 20 mm.
Three membrance encloses the eye,
1. Cornea
2. Sclera or Cornea:
3. Retina
QB102 (a) . The cornea is a tough, transparent tissue that covers the anterior surface of the eye.
Sclera:
Sclera is an opaque membrance e that encloses the remainder of the optical globe.
Choroid:
-Choroid directly below the sclera. This membrance contains a network of blood vessels that serve as the major source of nutrition to the
eye.
-Choroid coat is heavily pigmented and helps to reduce the amount of extraneous light entering the eye.
-The choroid is divided into the ciliary body and the iris diaphragm.
Lens:
The lens is made up of concentric lay ours of fibrous cells and is suspended by fibrous that attach to the ciliary body. It contains 60to 70%
of water about 60%fat and more protein than any other tissue in the eye.
Retina: The innermost membrance of the eye is retina, which lines the inside of the wall’s entire posterior portion. There are 2 classes of
receptors,
1. Cones
2. Rods
Cones:
The cones in each eye between 6and7 million. They are located primarily in the central portion of the retina called the fovea, and highly
sensitive to Colour.
Rods:
The number of rods is much larger; some 75 to 150 millions are distributed over the retinal surface.
Fovea as a square sensor array of size 1.5mm*1.5mm.
(Or)
RGB model,each color appears in its primary spectral components of red ,green and blue.This model is based on a Cartesian coordinate
system.This color subspace of interest is the cube.RGB values are at three corners cyan.magenta and yellow are at three other corner black
is at the origin and white is the at the corner farthest from the origin this model the gray scale extends from black to white along the line
joining these two points .The different colors in this model are points on or inside the cube and are defined by vectors extending from the
origin.
Images represented in the RGB color model consist of three component images,one for each primary colors.The no of bits used to
QB102 (b) represented each pixel in which each red,green and blue images is an 8 bit image.Each RGB color pixel of values is said
to be 24 bits .The total no of colors in a 24 bit RGB images is 92803=16777,216. The acquiring a color image is basically the process is
shown in fig,. A color image can be acquired by using three filters,sensitive to red,green and blue.When we view a color scene with a
monochrome camera equipped with one of these filters the result is a monochrome image whose intensity is proportional to the response
of that filter. Repeating this process with each filter produces three monochrome images that are the RGB component images of the color
scene.the subset of color is called the set of safe RGB colors or the set of all system safe colors. In inter net applications they are called
safe Web colors or safe browser colors.There are 256 colors are obtained from different combination but we are using only 216 colors .
The field of digital image processing refers to processing digital images by means of digital computer. Digital image is composed of a
finite number of elements, each of which has a particular location and value. These elements are called picture elements, image elements,
pels and pixels. Pixel is the term used most widely to denote the elements of digital image. An image is a two-dimensional function that
represents a measure of some characteristic such as brightness or color of a viewed scene. An image is a projection of a 3- D scene into a
2D projection plane.
An image may be defined as a two-dimensional function f(x,y), where x and y are spatial (plane) coordinates, and the amplitude tofat any
QB103 (a)
pair of coordinates (x,y) is called the intensity of the image at thatpoint. The term gray levelis used often to refer to the intensity of
monochrome images. Color images are formed by a combination of individual 2-D images. For example, the RGB color system, a color
image consists of three (red, green and blue) individual component images. For this reason, many of the techniques developed for
monochrome images can be extended to color images by processing the three component images individually. An image may be
continuous with respect to the x- and y- coordinates and also in amplitude. Converting such an image to digital form requires that the
coordinates, as well as the amplitude, be digitized. APPLICATIONS OF DIGITAL IMAGE PROCESSING: Since digital image
processing has very wide applications and almost all of the technical fields are impacted by DIP, we will just discuss some of the major
applications of DIP. Digital image processing has a broad spectrum of applications, such as 1. Remote sensing via satellites and other
space crafts 2. Image transmission and storage for business applications 3. Medical processing 4. RADAR (Radio Detection and Ranging)
5. SONAR (Sound Navigation and Ranging) 6. Acoustic Image Processing (The study of underwater sound is known as Underwater
Acousticsor HydroAcoustics) 7. Robotics and automated inspection of industrial parts Images acquired by satellites are useful in
trackingof 1. Earthresources 2. Geographical mapping 3. Prediction of agriculturalcrops 4. Urban growth and weathermonitoring 5. Flood
and fire control and many other environmentalapplications Space image applicationsinclude: 1. Recognition and analysis of objects
contained in images obtained from deep space-probemissions. 2. Image transmission and storage applications occur in broadcasttelevision
3. Teleconferencing 4. Transmission of facsimile images (Printed documents and graphics) for office automation 5. Communication over
computer networks 6. Closed-circuit television-based security monitoring systemsand 7. In militarycommunications Medicalapplications:
1. Processing of chest X-rays 2. Cineangiograms 3. Projection images of trans axial tomographyand 4. Medical images that occur in
radiology nuclear magneticresonance (NMR) 5. Ultrasonicscanning
(Or)
Neighbors of a Pixel
In a 2-D coordinate system each pixel p in an image can be identified by a pair of spatial coordinates (x, y).
Referring to the Fig below, a pixel p has two horizontal neighbors (x−1, y), (x+1, y) and two vertical neighbors (x, y−1), (x, y+1). These 4
QB103 (b) pixels together constitute the 4-neighbors of pixel p, denoted as N4(p).
The pixel p also has 4 diagonal neighbors which are: (x+1, y+1), (x+1, y−1), (x−1, y+ 1), (x−1, y−1). The set of 4 diagonal neighbors
forms the diagonal neighborhood denoted as ND(p). The set of 8 pixels surrounding the pixel p forms the 8-neighborhood denoted as
N8(p). We have N8(p) = N4(p) 𝖴 ND(p).
Figure : Pixel Neighborhood: The center pixel p is shown in a dashed pattern and the pixels inthe defined neighborhood are shown in
filled color.
Adjacency
The concept of adjacency has a slightly different meaning from neighborhood. Adjacency takes into account not just spatial neighborhood
but also intensity groups. Suppose we define a set S={0,L-1} of intensities which are considered to belong to the same group. Two pixels
p and q will be termed adjacent if both of them have intensities from set S and both also conform to some definition of neighborhood.
4 Adjacency: Two pixels p and q are termed as 4-adjacent if they have intensities from set S and q belongs to N4(p).
8 Adjacency Two pixels p and q are termed as 4-adjacent if they have intensities from set S and
q belongs to N8(p).
Mixed adjacency or m-adjacency : (there shouldn‟t be closed path)
Mixed adjacency is a modification of 8-adjacency and is used to eliminate the multiple path connections that often arise when 8-adjacency
is used.
Fig: (a) Arrangement of pixels; (b) pixels that are 8-adjacent to the center pixel (c) m-
adjacency
Path
A (digital) path (or curve) from pixel p with coordinates (x0,y0) to pixel q with coordinates (xn, yn) is a sequence of distinct pixels with
coordinates
(x0, y0), (x1, y1), …, (xn, yn)
Where (xi yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.
Here n is the length of the path.
If (x0, y0) = (xn, yn), the path is closed path.
We can define 4-, 8-, and m-paths based on the type of adjacency used.
Connected components
Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them
consisting entirely of pixels in S. For any pixel p in S, the set of pixels connected to it in S is called a connected component of S. If it has
only one connected component, then set S is called connected set.
A Region R is a subset of pixels in an image such that all pixels in R form a connected component.
A Boundary of a region R is the set of pixels in the region that have one or more neighbors that are not in R. If R happens to be an entire
image, then its boundary is defined as the set of pixels in the first and last rows and columns of the image.
Distance Measure
For pixels p,q and z, with coordinates (x,y),(s,t) and (v,w), repectively, D is a distance function if
D( p, q) 0 (D(p,q)=0 iff p=q)
D(p,q)= D(q,p)
D( p, z) D( p, q) D(q, z)
The distance between two pixels p and q with coordinates (x1, y1), (x2, y2) respectively can be formulated in several ways:
Euclidean Distance
Pixels having a Euclidean distance r from a given pixel will lie on a circle of radius r centered at it and r = distance
D4 distance(also called City-block Distance)

In this case, the pixels having a D4 distance from (x,y) less than or equal to some value r from a diamond centered at (x,y). For example,
the pixels with D4 distance ≤2 from (x,y) (the center point) form the following contours of constant distance:
Pixels having a city-block distance 1 from a given pixel are its 4-neighbors.
Green line is Euclidean distance. Red, blue and yellow are City Block Distance
It's called city-block distance, because it is calculated as if on each pixel between your two coordinates stood a block (house) which you
have to go around. That means, you can only go along the vertical or horizontal lines between the pixels but not diagonal. It's the same
like the movement of the rook on a chess field.
Chess-board Distance=
In this case, the pixels having a D8 distance from (x,y) less than or equal to some value r from a square centered at (x,y). For example, the
pixels with D8 distance ≤2 from (x,y) (the center point) form the following contours of constant distance
To measure D8 distance, you can only go along the vertical or horizontal or diagonal lines between the pixels.
Pixels having a chess-board distance 1 from a given pixel are its 8-neighbors
Dm distance between two points is defined as the shortest m path between the points.
When V={1} p,p2 and p4 have value 1
Dm shortest distance between p and p4 is 2

When V={1} p,p1,p2 and p4 have value 1 or p,p2,p3 and p4 have value 1
Dm shortest distance between p and p4 is 3
Dm distance between p and p4 is 4
Linear and Nonlinear Operations

H(af+bh)=aH(f)+bH(g)
Where H is an operator whose input and output are images.
f and g are two images
a and b are constants.
H is said to be linear operation if it satisfies the above equation or else H is a nonlinear operator.
UNIT - II
Q. No Questions
Logtransformations
LOGARITHMIC TRANSFORMATIONS:
Logarithmic transformation further contains two type of transformation. Log transformation and inverse log transformation.
LOG TRANSFORMATIONS:
The log transformations can be defined by this formula
S = c log(r + 1).
Where S and r are the pixel values of the output and the input image and c is a constant. The value 1 is added to each of the pixel value of the
input image because if there is a pixel intensity of 0 in the image, then log (0) is equal to infinity. So, 1 is added, to make the minimum value
at least 1.
During log transformation, the dark pixels in an image are expanded as compare to the higher pixel values. The higher pixel values are kind
of compressed in log transformation. This result in following image enhancement.
ANOTHER WAY TO REPRESENT LOG TRANSFORMATIONS: Enhance
details in the darker regions of an image at the expense of detail in brighter regions.
T(f) = C * log (1+r)
HereC is constantand r≥ 0
The shape of the curve shows that this transformation maps the narrow range of low gray level valuesintheinputimageintoa
widerrangeofoutputimage. Theoppositeis trueforhighlevelvaluesofinputimage.
QB201 (a)
Fig. Log Transformation Curve input vs output

POWER – LAW TRANSFORMATIONS:
There are further two transformation is power law transformations, that include nth power and nth root transformation. These transformations
can be given by the expression:
S=Crγ
This symbol γ is called gamma, due to which this transformation is also known as gamma transformation.
Variation in the value of γ varies the enhancement of the images. Different display devices / monitors have their own gamma correction,
that’s why they display their image at different intensity, where c and g are positive constants. Sometimes Eq. (6) is written as
S = C (r +ε) γ
to account for an offset (that is, a measurable output when the input is zero). Plots of s versus r for various values of γ are shown in Fig. 2.10.
As in the case of the log transformation, power-law curves with fractional values of γ map a narrow range of dark input values into a wider
range of output values, with the opposite being true for higher values of input levels. Unlike the log function, however, we notice here a
family of possible transformation curves obtained simply by varying γ.
In Fig that curves generated with values of γ>1 have exactly the opposite effect as those generated with values of γ<1. Finally, we Note that
Eq. (6) reduces to the identity transformation when c=γ=1.
Fig. Plot of the equation S = Crγ for various values of γ

(C =1 in all cases)
This type of transformation is used for enhancing images for different type of display devices. The gamma of different display devices is
different. For example, Gamma of CRT lies in between of 1.8 to 2.5, that means the image displayed on CRT is dark.
Varying gamma (γ) obtains family of possible transformation curves
S = C* r γ
Here C and γ are positive constants. Plot of S versus r for various values of γ is γ > 1 compresses dark values and expands bright valuesγ < 1
(similar to Log transformation) but expands dark values Compresses bright values When C = γ = 1, it reduces to identity transformation.
CORRECTING GAMMA:
S=Cr γ
S=Cr (1/2.5)
The same image but with different gamma values has been shown here.
(Or)
# Separability
# Translation
# Periodicity and Conjugate Symmetry
QB201 (b) # Rotation
# Distribution and Scaling
# Average Value
# Laplacian
# Convolution and correlation
# Sampling
1. Separability
F(u, v)=1/N x=0∑ N-1 y=0∑ N-1f(x,y)exp[-j2π(ux +vy)/N] for u, v=0,1,2,…….N-1
f(x, y)=x=0∑ N-1 y=0∑ N-1F(u,v)exp[j2π(ux+vy)/N] for x, y=0,1,2,…….N-1
F(u,v)=1/N x=0∑ N-1F(x,v)exp[-j2πux/N]
where F(x,v)=N[1/Ny=0∑ N-1f(x,y)exp[-j2πvy/N
2. Translation
The translation properties of the Fourier Transorm pair are
f(x,y)exp[-j2π(u0x +v0y)/N] F(u-u0,v-v0) are Fourier Transform pair.
And f(x-x0,y-y0) F(u,v)exp[-j2π(ux0 +vy0)/N]
Where the double arrow indicates the correspondence between a function and its
Fourier transform.
3. Periodicity and Conjugate Symmetry
• Periodicity:
The Discrete Fourier Transform and its inverse are periodic with period N; that is,
F(u,v)=F(u+N,v)=F(u,v+N)=F(u+N,v+N)
• Conjugate symmetry:
If f(x,y) is real, the Fourier transform also exhibits conjugate symmetry,
F(u,v)=F*(-u,-v) or │F(u,v) │=│F(-u,-v) │ where F*(u,v) is the complex
conjugate of F(u,v)
4. Rotation
Polar Coordinates x=rcosθ, y=rsinθ, u=wsinΦ, v=wsinΦ then f(x,y) and F(u,v)
become f(r,θ) and F(w,Φ) respectively. Rotating f(x,y) by an angle θ0 rotates F(u,v) by
the same angle. Similarly rotating F(u,v) rotates f(x,y) by the same angle.
i.e, f(r,θ+ θ0) F(w,Φ+ θ0)
5. Distributivity and scaling
• Distributivity:
The Discrete Fourier Transform and its inverse are distributive over addition but
not over multiplication.
F[f1(x,y)+f2(x,y)]=F[f1(x,y)]+F[f2(x,y)]
F[f1(x,y).f2(x,y)]≠F[f1(x,y)].F[f2(x,y)]
• Scaling
For the two scalars a and b,
Af(x,y) aF(u,v) and f(ax,by) 1/│ab│F(u/a,v/b)
6. Laplacian
The Laplacian of a two variable function f(x,y) is defined as ▼2 f(x,y)=∂2 f/∂x2+∂2 f/∂y2
7. Convolution and Correlation
• Convolution
The convolution of two functions f(x) and g(x) denoted by f(x)*g(x) and is defined by the integral, f(x)*g(x)=-∞∫∞ f(α)g(x-α)dα where α is a
dummy variable.
Convolution of two functions F(u) and G(u) in the frequency domain =multiplication of their inverse f(x) and g(x) respectively.
Ie, f(x)*g(x) F(u)G(u)
• Correlation
The correlation of two functions f(x) and g(x) denoted by f(x)оg(x) and is defined by the integral, f(x)оg(x)=-∞∫∞ f*(α)g(x+α)dα where α is a
dummy variable.
For the discrete case fe(x)оge(x)= 1/M M=0∑ M-1f*(m)g(x+m)
fe(x)= {f(x), 0≤x≤A-1,
{0 , A≤x≤M-1
ge(x)= {g(x), 0≤x≤B-1,
{0 , B≤x≤N-1
Any of the above six explained - 13 marks
Smoothing Filter Concept - 4 marks

QB202 (a) Mask Explanation - 4 marks
Weighted average filter - 5 marks
(Or)
# HPF-output gets sharpen and background becomes darker - 4mark

QB202 (b)
# High boost- output gets sharpen but background remains unchanged - 4 marks
# Derivative- First and Second order derivatives - 5 marks
The illumination – reflectance model can be used to develop a frequency domain procedure for improving the appearance of an image by
simultaneous gray – level compression and contrast enhancement.
• An image can be expressed as the product of illumination and reflectance components.
f(x,y) = i(x,y) r(x,y)
F(f(x,y)) = F(i(x,y) r(x,y))
Where F i(u,v)) and F(r(u,v)) are the Fourier transformation of i(x,y)and r(x,y) respectively.
The inverse (exponential) operation yields the desird enhanced image, denoted by g(x,y); that is,
Ln[f(x,y)] = ln[i(x,y) r(x,y)]
F[ln(f(x,y))] = F[ln(i(x,y)]+F[ln( r(x,y))]
QB203 (a) • This method is based on a special case of a class of systems know as homomorphism systems.
• In this particular application, The key to the approach is the separation of the illumination and reflectance components achieved in the
from.
The homomorphism filter function can then operate on these on these component separately.
The illumination components of an image generally is characterized by slow spatial variations.
While the reflectance component tends to vary abruptly,particularly at the junction, while the reflectance component tends
to vary abruptly, particularly at the junctions of dissimilar objects. A good deal of control can be gained over the illumination and reflectance
components with a homomorphic filter.
This control requires specification of a filter function H(u.v) that affects the low - and high – frequency components of the Fourier
transform in different ways.
(Or)
HISTOGRAM EQUALIZATION:
Histogram equalization is a common technique for enhancing the appearance of images. Suppose we have an image which is predominantly
dark. Then its histogram would beskewed towards the lower end of the grey scale and all the image detail are compressed into the dark end of
the histogram. If we could stretch out the grey levels at the dark end to produce a more uniformly distributed histogram then the image would
become much clearer.
Let there be a continuous function with r being gray levels of the image to be enhanced. The range of r is [0, 1] with r=0 repressing black and
r=1 representing white. The transformation function is of the form
S=T(r) where 0<r<1
It produces a level s for every pixel value r in the original image.
QB203 (b)
The transformation function is assumed to fulfill two condition T(r) is
single valued and monotonically increasing in the internal 0<T(r)<1 for 0<r<1.The transformation function should be single valued so that
the inverse transformations should exist. Monotonically increasing condition preserves the increasing order from black to white in the output
image. The second conditions guarantee that the output gray levels will be in the same range as the input levels. The gray levels of the image
may be viewed as random variables in the interval [0.1]. The most fundamental descriptor of a random variable is its probability density
function (PDF) Pr(r) and Ps(s) denote the probability density functions of random variables r and s respectively. Basic results from an
elementary probability theory states that if Pr(r) and Tr are known and T-1(s) satisfies conditions (a), then the probability density function
Ps(s) of the transformed variable is given by the formula
Thus, the PDF of the transformed variable s is the determined by the gray levels PDF of the input image and by the chosen transformations
function. A transformation function of a particular importance in image processing
This is the cumulative distribution function of r.
L is the total number of possible gray levels in the image.
UNIT - III
Q. No Questions
Introduction:
-Median Filter is one of the part of the smoothing filter.
-No mask is used in the median filters.
-We choose 3x3 sub-image arranged in ascending order and leave first
four values.
3
5
7
2
10
20
30
9
QB301 (a) 4
2,3,4,5,7,9,10,20,30
-Take the median value.
-This median filter is the non-linear spatial filtering.
1)median filtering smoothing
2)Max filter
3)Min filter
Max filter:
R=Max
-Max filter gives the brightest points
Min filter:
R=Min
-It helps to get the largest point in the image.
(Or)
The objective of segmentation is to partition an image into regions. We approached this problem by finding boundaries between based on
discontinuities in gray levels, segmentation was accomplished via thresholds based on the distribution of pixels properties, such as gray level
values or color.
Basic Formulation:
Let Represent the region of image. We may view segmentation as a process that partition R into n subregions,R1,R2,………………,such that
n
(a) ΣRi=R
i=1
(b) Ri is a connected region, i=1,2,…………..n.
(c) Ri∩Rj=Фfor all i and j,i≠j.
(d) P(Ri)=TRUE for i=1,2,……………………n.
(e) P(RiURj)=FALSE for i≠j.
Here, P(Ri) is a logical predicate defined over the points in set Ri and Ф is the null set.
Condition (a) indicates that the segmentation must be complete that is every pixel must be in a region.
Condition (b) requires that points in a region must be connected in some predefined sense.
Condition(c) indicates that the regions must be disjoint.
Condition(d) deals with the properties that must be satisfied by the pixels in a segmented region.
Region Growing:
As its name implies region growing is a procedure that groups pixel or subregions into larger regions based on predefined criteria. The basic
approach is to start with a set of “seed” points and from these grow regions.
If the result of these computation shows clusters of values, the pixels whose properties place them near the centroid of these clusters can
QB301 (b) be used as seeds.
Descriptors alone can yield misleading results if connectivity or adjacency information is not used in the region growing process.
Region Splitting and Merging:
The procedure just discussed grows regions from a set of seed points. An alternative into subdivided an image initially into a set of arbitrary,
disjointed regions and then merge and/or split the regions in an attempt to satisfy the conditions.
Merge any adjacent regions Rj and Rk for which P(RjURk)=TRUE.

3. Stop when no further merging or splitting is possible. Mean and standard deviation of pixels in a region to quantify the texture of region.
Role of thresholding: We introduced a simple model in which an image f(x,y) is formed as the product of a reflectance component r(x,y) and
an illumination components i(x,y). consider the computer generated reflectance function. The histogram of this function is clearly bimodal
and could be portioned easily by placing a single global threshold, T, in the histogram valley. Multiplying the reflectance function by the
illumination function. Original valley was virtually eliminated, making segmentation by a single threshold an impossible task.
Although we seldom have the reflectance function by itself to work with, this simple illustration shows that the reflective nature of objects
and background can be such that they are separable.
ƒ(x,y)=i(x,y)r(x,y)
Taking the natural logarithm of this equation yields a sum:
z(x,y)=ln ƒ(x,y)
=ln i(x,y)+ln r(x,y)
=i (x,y)+r (x,y)
If i (x,y) and r (x,y) are independent random variable, the histogram of z(x,y) is given by the convolution of the histogram of i (x,y)
and r (x,y). But if i (x,y) had a border histogram the convolution process would smear the histogram of r (x,y), yielding a histogram
for z(x,y) whose shape could be quite different from that of the histogram of r (x,y). The degree of distortion depends on the broadness
of the histogram of i (x,y), which in turn depends on the nonuniformity of the illumination function. We have dealt with the logarithm of
ƒ(x,y), instead of dealing with the image function directly.
When access to the illumination source is available, a solution frequently used in practice to compensate for nonuniformity is to project the
illumination pattern onto a constant, white reflective surface. This yields an image g(x,y)=ki(x,y), where k is a constant that depends on
the surface and i(x,y) is the illumination pattern. For any image ƒ(x,y)=i(x,y)r(x,y) obtained from the same illumination function,
simply dividing ƒ(x,y) by g(x,y) yields a normalized function h(x,y)= ƒ(x,y)/g(x,y)= r(x,y)/k.
Thus, if r(x,y) can be segmented by using a single threshold T, then h(x,y) can be segmented by using single threshold of value T/k.
The inverse filtering approach makes no explicit provision for handling noise.
• An approach that incorporate both the degradation function and statistical characteristics of noise into the restoration process.
• The method is founded on considering images and noise as random processes,and the objective is to find an estimate f of the uncorrupted
image f such that the mean square error between them is minimized.
• This error measure is given by 2 e =E{(f-f^)2}
where E{.}is the expected value of the argument.
• It is assumed that the noise and the image are uncorrelated,that one or other has zero mean:and that the gray levels in the estimate are a
linear function of the levels in the degradated image.
• Based on these conditions,the minimum of the error function in Eq is given in the frequency domain by the expression.
This result is known as the wiener filter after N.Wiener,who proposed the concepts in the year shown.the filter which consists
of the term inside the brackets also is commonly referred to as the minimum mean square error filter or the least square error filter.
QB302 (a) • We include references at the end of sources containing detailed derivations of the wiener filter. The restored image in the spatial domain is
given by the inverse Fourier transform of the frequency domain estimate F(u,v).
• If the noise is zero,then the noise power spectrum vanishes and the wiener filter reduces to the inverse filter.
However the power spectrum of the undegraded image seldom is known. Where k is a specified constant.
Example illustrates the power spectrum of wiener filtering over direct inverse filtering.the value of K was chosen interactively to
yield the best visual results. o It illustrates the full inverse filtered result similarly is the radially limited
inverse filter .
• These images are duplicated here for convenience in making comparisons. As expected ,the inverse filter produced an unusable
image.The noise in the inring filter.
• The wiener filter result is by no means perfect,but it does give us a hint as to image content.
• The noise is still quite visible, but the text can be seen through a “curtain” of noise.
(Or)
Noise are unwanted signal which corrupts the original signal. • Origin of noise source is during image acquisition and/or transmission and
digitization. • During capturing ,performance of imaging sensors are affected by the environmental conditions due to the quality of sensors.
• Image acquisition are the principle source of noise. • Due to the interference in the transmission it will affect the transmission of the
image.
Types:
QB302 (b)
Rayleigh noise:
The PDF is
P(Z)= 2(z-a)e-(z—a)2/b/b for Z>=a
0 for Z<a
mean μ=a+√πb/4
standard deviation σ2=b(4-π)/4
Gamma noise:
The PDF is
P(Z)=ab zb-1 ae-az/(b-1) for Z>=0
0 for Z<0
mean μ=b/a
standard deviation σ2=b/a2
Exponential noise
The PDF is
P(Z)= ae-az Z>=0
0 Z<0
mean μ=1/a
standard deviation σ2=1/a2
Uniform noise:
The PDF is
P(Z)=1/(b-a) if a<=Z<=b
0 otherwise
mean μ=a+b/2
standard deviation σ2=(b-a)2 /12
Impulse noise:
The PDF is
P(Z) =Pa for z=a
Pb for z=b
0 Otherwise
Morphological image processing is a collection of non-linear operations related to the shape or morphology of features in an image.
Morphological operations rely only on the relative ordering of pixel values, not on their numerical values, and therefore are especially suited
to the processing of binary images. Morphological operations can also be applied to greyscale images such that their light transfer functions
are unknown and therefore their absolute pixel values are of no or minor interest.
Morphological techniques probe an image with a small shape or template called a structuring element. The structuring element is positioned
at all possible locations in the image and it is compared with the corresponding neighbourhood of pixels. Some operations test whether the
element "fits" within the neighbourhood, while others test whether it "hits" or intersects the neighbourhood:
A common practice is to have odd dimensions of the structuring matrix and the origin defined as the centre of the matrix. Stucturing elements
QB303 (a)
play in moprphological image processing the same role as convolution kernels in linear image filtering.
When a structuring element is placed in a binary image, each of its pixels is associated with the corresponding pixel of the neighbourhood
under the structuring element. The structuring element is said to fit the image if, for each of its pixels set to 1, the corresponding image pixel
is also 1. Similarly, a structuring element is said to hit, or intersect, an image if, at least for one of its pixels set to 1 the corresponding image
pixel is also 1.
Explanation about
Erosion
Dilation
Duality
(Or)
(i) Arithmetic mean filter ƒ^(x,y)=1/mn Σ g(s,t) - 3 marks
(s,t)ЄSxy
(ii) Geometric mean filter – 4 marks

An image restored using a geometric mean filter is given by the expression
f^(x,y) = [ п g(s,t) ] (s,t)ЄSxy here ,each restored pixel is given by the product of the pixels in the subimage window , raised to the power
1/mn
QB303 (b) (iii) Harmonic filters – 6 marks

The harmonic mean filtering operation is given by the expression
ƒ ^(x,y) = mn/∑(1/g(s,t))
• Contra harmonic mean filter
Contra harmonic mean filtering operation yields a restored image based on the expression
Q+1 Q
f^(x,y)=∑g(x,t) /∑g(s,t) where Q is called the order of the filter.This filter is well suited for
reducing or virtually eliminating the effect of salt and pepper noise.
UNIT - IV
Q. No Questions
Compression: It is the process of reducing the size of the given data or an image. It will help us to reduce the storage space required to store
an image or File.
Data Redundancy:
The data or words that either provide no relevant information or simplyrestate that which is already known .It is said to be data redundancy.
Consider N1 and N2 number of information carrying units in two data sets that represent the same information
Data Redundancy Rd = 1-1/Cr Where Cr is called the Compression Ratio.
`
QB401 (a) Cr=N1/N2.
Types of Redundancy
There are three basic Redundancy and they are classified as
1) Coding Redundancy
2) Interpixel Redundancy
3) Psychovisual Redundancy.
1. Coding Redundancy :
We developed this technique for image enhancement by histogram processing on the assumption that the grey levels of an image are random
quantities. Here the grey level histogram of the image also can provide a great deal of insight in the construction of codes to reduce the
amount of data used to represent it.
2. Interpixel Redundancy :
Inorder to reduce the interpixel redundancy in an image, the 2-D pixel array normally used for human viewing and interpretation must be
transformed in to more efficient form.
3. Psychovisual Redundancy:
Certain information simply has less relative importance than other information in the normal visual processing. This information is called
Psycovisual Redundant.
(Or)
In this approach the lable for the DC and AC coefficient are coded differently using
Huffman codes. The DC coefficient values partitioned into categories. The categories are
then Huffman coded. The AC coefficient is generated in slightly different manner. There
are two special codes: End-of-block(EOF) and ZRL
Table: Coding of the differences of the DC labels
QB401 (b)
To encode the AC coefficient First using Zigzag scan. We obtain

-9 3 0 0 0 0 0 ……… 0
The first value belong to category 1. transmit the code corresponding to 0/1 follow by a
single bit 1 to indicate that the value being transmitted is 1 and not -1 .Simillarly other
AC coefficient code are transmited.
To obtain the reconstruction of the original block Dequantization is performed and taking
inverse transform of the coefficient we get the reconstructed block
In vector quantization we group the source output into blocks or vectors. This vector of source outputs forms the input to the vector
quantizer. At both the encoder and decoder of the vector quantizer, we have a set of L-dimensional vectors called the codebook of the vector
quantizer. The vectors in this codebook are known as code-vectors. Each code vector is assigned a binary index.
At the encoder, the input vector is compared to each code-vector in order to find the code vector closest to the input vector
QB402 (a) In order to inform the decoder about which code vector was found to be the closest to the input vector, we transmit or store the binary index
of the code-vector. Because the decoder has exactly the same codebook, it can retrieve the code vector Although the encoder have to perform
considerable amount of computations in order to find the closest reproduction vector to the vector of source outputs, the decoding consists of
a table lookup. This makes vector quantization a very attractive encoding scheme for applications in which the resources available for
decoding are considerably less than the resources available for encoding
Advantages of vector quantization over scalar quantization
For a given rate (bits per sample), use of vector quantization results in lower distortion than when scalar quantization is used at the same rate
Vectors of source output values tend to fall in clusters. By selecting the quantizer output points to lie in these clusters, we have more
accurate representation of the source Output
Use:
One application for which vector quantizer has been extremely popular is image compression.
Disadvantage of vector quantization:
Vector quantization applications operate at low rates. For applications such as high-quality video coding, which requires higher rates this is
definitely a problem. To solve these problems, there are several approaches which entails some structure in the quantization process
Tree structures vector quantizers:
This structure organizes codebook in such a way that it is easy to pick which part contains the desired output vector
Structured vector quantizers: Tree-structured vector quantizer solves the complexity problem, but acerbates the
storage problem We now take entirely different tacks and develop vector quantize that do not have these storage problems; however we pay
for this relief in other ways
(Or)
Huffman coding diagram – 5 Marks
(a) Average Length -2 Marks
QB402 (b) (b) Entrophy- 2 Marks
(c) Efficiency - 2 Marks
(d) Redundancy - 2 Marks
THE CHANNEL ENCODER AND DECODER:
The channel encoder and decoder play an important role in the overall encoding-decoding process when the channel of Fig. 3.1 is
QB403 (a) noisy or proneto error. They are designed to reduce the impact of channel noise by inserting a controlled form of redundancy into
the source encoded data.As the output of the source encoder contains little redundancy, it wouldbe highly sensitive to
transmission noise without the addition of this "Controlled Redundancy."
One of the most useful channels encoding techniques was devised by R.
W. Hamming (Hamming [1950]). It is based on appending enough bits to the data being encoded to ensure that some minimum
number of bits must change between valid code words. Hamming showed, for example, that if 3 bits of redundancy are added to
a 4-bit word, so that the distancebetween any two valid code words is 3, all single-bit errors can be detected and corrected. (By
appending additional bits of redundancy,multiple-bit errors can be detected and corrected.) The 7-bit Hamming (7, 4) code
word h1, h2, h3…., h6, h7 associated with a 4-bit
binary number b3,b2,b1,b0 is
where denotes the exclusive OR operation. Note that bits h 1, h2, and h4 are even- parity bits forthe bit fields b3 b2 b0, b3b1b0,
and b2b1b0, respectively. (Recall that a string of binary bits has even parity if the number of bits with a value of 1 is even.) To
decode a Hamming encoded result, the channel decoder must check the encoded value for odd parity over the bit fields in
which even parity was previously established. A single-bit error is indicated by a nonzero parity word c4c2c1, where
If a nonzero value is found, the decoder simply complements the code word bit position indicated by the parity word. The
decoded binary valueis then extracted from the corrected code word ash3h5h6h7.
Or
The technique, called Lempel-Ziv-Welch (LZW) coding, assigns fixedlength code words to variable length sequences of source symbols but
requires no a priori knowledge of the probability of occurrence of the symbols to be encoded. LZW compression has been integrated into a
variety of mainstream imaging file formats, including the graphic interchange format (GIF), tagged image file format (TIFF), and the
portable document format (PDF). LZW coding is conceptually very simple (Welch [1984]). At the onset of the coding process, a codebook
or "dictionary" containing the source symbols to be coded is constructed. For 8-bit monochrome images, the first 256 words of the dictionary
are assigned to the gray values 0, 1, 2..., and 255. As the encoder sequentially examines the image's pixels, graylevel sequences that are not
QB403 (b) in the dictionary are placed in algorithmically determined (e.g., the next unused) locations. If the first two pixels of the image are white, for
instance, sequence “255- 255” might be assigned to location 256, the address following the locations reserved for gray levels 0 through 255.
The next time that two consecutive white pixels are encountered, code word 256, the address of the location containing sequence 255-255, is
used to represent them. If a 9-bit, 512- word dictionary is employed in the coding process, the original(8 + 8) bits that were used to represent
the two pixels are replaced by a single 9-bit code word. Cleary, the size of thedictionary is an important system parameter. If it is too small,
the detection of matching gray-level sequences will be less likely; if it is too large, the size of the code words will adversely affect
compression performance. Consider the following 4 x 4, 8-bit image of a vertical edge:
Table:Details the steps involved in coding its 16 pixels.
A 512-word dictionary with the following starting content is assumed: Locations 256 through 511 are initially unused. The image is encoded
by processing its pixels in a left-to-right, top-to-bottom manner. Each successive gray-level value is concatenated with a variable—column 1
of Table 6.1 —called the "currently recognized sequence." As can be seen, this variable is initially null or empty. The dictionary is searched
for each concatenated sequence and if found, as was the case in the first row of the table, is replaced by the newly concatenated and
recognized (i.e., located in the dictionary) sequence. This was done in column 1 of row 2.
No output codes are generated, nor is the dictionary altered. If the concatenated sequence is not found, however, the address of the currently
recognized sequence is output as the next encoded value, the concatenated but unrecognized sequence is added to the dictionary, and the
currently recognized sequence is initialized to the current pixel value. This occurred in row 2 of the table. The last two columns detail the
graylevel sequences that are added to the dictionary when scanning the entire 4 x 4 image. Nine additional code words are defined. At the
conclusion of coding, the dictionary contains 265 code words and the LZW algorithm has successfully identified several repeating gray-level
sequences— leveraging them to reduce the original 128-bit image lo 90 bits (i.e., 10 9- bit codes). The encoded output is obtained by reading
the third column from top to bottom. The resulting compression ratio is 1.42:1.A unique feature of the LZW coding just demonstrated is that
the coding dictionary or code book is created while the data are being encoded. Remarkably, an LZW decoder builds an identical
decompression dictionary as it decodes simultaneously the encoded data stream. Although not needed in this example, most practical
applications require a strategy for handling dictionary overflow. A simple solution is to flush or reinitialize the dictionary when it becomes
full and continue coding with a new initialized dictionary. A more complex option isto monitor compression performance and flush the
dictionary when it becomes poor or unacceptable. Alternately, the least used dictionary entries can be tracked and replaced when necessary.
UNIT - V
Q. No Questions
Image representation is the process of generating descriptions from the visual contents of an image.
CHAIN CODE: A chain code is a lossless compression algorithm for monochrome images. The basic principle of chain codes is to
separately encode each connected component, or "blot", in the image. For each such region, a point on the boundary is selected and its
coordinates are transmitted. The encoder then moves along the boundary of the image and, at each step, transmits a symbol representing the
direction of this movement. This continues until the encoder returns to the starting position, at which point the blot has been completely
described, and encoding continues with the next blot in the image. This encoding method is particularly effective for images consisting of a
reasonable number of large connected components.Some popular chain codes include the Freeman Chain Code of Eight Directions (FCCE),
Vertex Chain Code (VCC), Three Orthogonal symbol chain code (3OT) and Directional Freeman Chain Code of Eight Directions
(DFCCE). A related blot encoding method is crack code Algorithms exist to convert between chain code, crack code, and run-length
encoding. 1. Let i=3 2. Mark the points in the object that have furthest distance from each other with p and p2 3. Connect the points in the
order they are numbered with lines 4. For each segment in the polygon, find the point on the perimeter between the point that have furthest
distance to the polygonal linesegment. If this distance Is larger than a threshold, mark the point with a label pi 5. Renumber the points so
that they are consecutive 6. Increase i 7. If no points have been added break, otherwise go to 3 The convex hull H of a set S is defined as the
smallest convex set that contains S. We define the set D= H \ S The points that have neighbors from the sets D, H and CH is called p. These
points are used for representation of the object. To limit the number of points found it is possible to smooth the edge of the objectWe define
an object’s skeleton or medial axel as the points in the object that have several nearest neighbors on the edge of the object
QB501 (a)
To find the skeleton by using the definition is very costly, since it involves calculating the distance between all points in the object to all
points on the perimeter. Usually some iterative method is used to remove perimeter pixels until the skeleton alone remains. In order to
achieve this the following rules must be adhered to: 1. Erosion of object must be kept to a minimum 2. Connected components must not be
broken 3. End points must not be removed The purpose with Description is to Quantify a representation of an object. This implies that we
instead of talking about the areas in the image we can talk about their properties such as length, curviness and so on. One of the simplest
descriptions is the length P of the perimeter of an object. The obvious measure of perimeter – length is the number of edge pixels. That is,
pixel that do belong to the object, but have a neighbor that belong to the object, but have a neighbor that belong to the background. A more
precise measure is to assume that each pixel center is a corner in a polygon. The length of the perimeter is given by P= aNe+ bNo
Intuitively we would like to set a=1 and b= It is however possible to show that the length of the perimeter will be slightly overestimated
with these values. The optimal weights for a and b (in least square sense) will depend on the curviness of the perimeter. If the perimeter is
straight line (!?) a=0.948 and b=1.343 will be optimal. If it is assumed that the direction of two consecutive line-segments is uncorrelated
a=0.948 and b=1.343 will be optimal. The diameter of an object is defined as Where D is a metric The line that passes through the points Pi
and Pjthat defines the diameter is called the main axis of the object. The curviness of perimeter can be obtained by calculating the angle
between two consecutive line-segments of the polygonal approximation The curvature at point Pjof the curve We can approximate the area
of an object with the number of pixels belonging to the object. More accurate measures are, however obtained by using a polygonal
approximation. The area of a polygon segment (with one corner in the origin) is given by the area of the entire polygon is then
A circle of radius r has the area A=πr2 and the length of the perimeter is P=2 πr. So, by defining the
quotient We have a measurement that is 1, if the object is a circle. The larger measurement the less circle-like is the object.
The measure of the shape of the object can be obtained according to: 1. Calculate the chain code for the object 2. Calculate the difference
code for the chain code 3. Rotate the code so that it is minimal 4. This number is called the shape number of the object 5. The length of the
number is called the order of the shape
(Or)
Discussed the concept of Recognition Based on Minimum distance Classifier - 4 marks
Specified the Calculation of Distance formula - 5 marks
QB501 (b) Drawn the Recognition diagram - 4 marks
Explained Statistical Approach - 7 marks

QB502 (a) Explained Structural Approach - 6 marks
(Or)
A pattern is a quantitative or structural description of an objective or some other entity of interest in an image, A pattern class is a family of
patterns that share some common properties .Pattern classes are denoted w1,w2,----wm, where M is the number of classes Three principle
pattern arrangements used in practice are vectors(for quantitative descriptors ) and strings and trees (for structural descriptions) .
QB502 (b)
Patternvectors are represented by bold lower case letters such as x,y, and z,where Each component x represent the ith descriptors.Pattern
vectors are represented in coloumns (i.e. n x 10 marices) or in the equilant form x=9x1,x2,------xn)T,T-transpose.
The nature of the pattern vector depends on the measurement technique used to describe the physical pattern itself. Ex. If we want to
describe the three types of iris floers(iris setosa,virginica,and versicolor)by measuring the width and length of the petals.It is represented in
the vector form x=[x1,x2]T;x1,x2 correspond to width length respectively.Three pattern classes are w1,w2,w3 corresponding to the three
verities. Because the petals of all flowers vary in width and length to some degree the pattern vectors describing three flowers also will vary,
not only between different classes ,but also with in a class.
The result of this classic feature selection problem shows that the degree of class seperability depends strongly on the choice of pattern
measurements selected for an application.
Discussed on Simple descriptors - 5 marks

Discussed on Shape Numbers - 4 marks
QB503 (a)
Discussed on Fourier descriptors - 4 Marks
(Or)
Pattern recognition is the process of recognizing patterns by using machine learning algorithm. Pattern recognition can be defined as the
classification of data based on knowledge already gained or on statistical information extracted from patterns and/or their representation.
One of the important aspects of the pattern recognition is its application potential. Examples: Speech recognition, speaker identification,
multimedia document recognition (MDR), automatic medical diagnosis. In a typical pattern recognition application, the raw data is
processed and converted into a form that is amenable for a machine to use. Pattern recognition involves classification and cluster of
patterns.In classification, an appropriate class label is assigned to a pattern based on an abstraction that is generated using a set of training
patterns or domain knowledge. Classification is used in supervised learning.Clustering generated a partition of the data which helps decision
making, the specific decisionmaking activity of interest to us. Clustering is used in an unsupervised learning.Features may be represented as
continuous, discrete or discrete binary variables. A feature is a function of one or more measurements, computed so that it quantifies some
significant characteristics of the object. Example: consider our face then eyes, ears, nose etc are features of the face. A set of features that
are taken together, forms the features vector. Example: In the above example of face, if all the features (eyes, ears, nose etc)taken together
then the sequence is feature vector ([eyes, ears, nose]). Feature vector is the sequence of a features represented as a ddimensional column
vector. In case of speech, MFCC (Mel frequency Cepstral Coefficient) is the spectral features of the speech. Sequence of first 13 features
forms a feature vector. Pattern recognition possesses the following features: • Pattern recognition system should recognize familiar pattern
quickly and accurate • Recognize and classify unfamiliar objects • Accurately recognize shapes and objects from different angles • Identify
QB503 (b)
patterns and objects even when partly hidden • Recognize patterns quickly with ease, and with automaticity
PatternRecognition System:Pattern is everything around in this digital world. A pattern can either be seen physically or it can be observed
mathematically by applying algorithms. • In Pattern Recognition, pattern is comprising of the following two fundamental things: •
Collection of observations • The concept behind the observation FeatureVector:The collection of observations is also known as a feature
vector. A feature is a distinctive characteristic of a good or service that sets it apart from similar items. Feature vector is the combination of
n features in n-dimensional column vector.The different classes may have different features values but the same class always has the same
features value Classifier and Decision Boundaries: In a statistical-classification problem, a decision boundary is a hypersurface that
partitions the underlying vector space into two sets. A decision boundary is the region of a problem space in which the output label of a
classifier is ambiguous.Classifier is a hypothesis or discretevalued function that is used to assign (categorical) class labels to particular data
points.Classifier is used to partition the feature space into class-labeled decision regions. While Decision Boundaries are the borders
between decisionregions. Components in Pattern Recognition System: Pattern recognition systems can be partitioned into components.
There are five typical components for various pattern recognition systems. These are as following: Sensor: A sensor is a device used to
measure a property, such as pressure, position, temperature, or acceleration, and respond with feedback. A Preprocessing Mechanism:
Segmentation is used and it is the process of partitioning a data into multiple segments. It can also be defined as the technique of dividing or
partitioning a data into parts called segments. A Feature Extraction Mechanism: Feature extraction starts from an initial set of measured data
and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization
steps, and in some cases leading to better human interpretations. It can be manual or automated. A Description Algorithm: Pattern
recognition algorithms generally aim to provide a reasonable answer for all possible inputs and to perform “most likely” matching of the
inputs, taking into account their statistical variation A Training Set: Training data is a certain percentage of an overall datasetalong with
testing set. As a rule, the better the training data, the better the algorithm or classifier performs.
(PART C – 15 Marks - Either Or Type)

UNIT - I
Q. No Questions
The HSI Color Model
The RGB,CMY and other color models are not well suited for describing colors in terms that are practical for human interpretation.For
eg,one does not refer to the color of an automobile by giving the percentage of each of the primaries composing its color. When humans
view a color object we describe it by its hue, saturation and brightness.
• Hue is a color attribute that describes a pure color.
• Saturation gives a measure of the degree to which a pure color is diluted by white light.
• Brightness is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one
of the key factors in describing color sensation• Intensity is a most useful descriptor of monochromatic images.
Converting colors from RGB to HSI
Given an image in RGB color format ,• the H component of each RGB pixel is obtained using the equation
H = {theta if B<=G 360-theta if B>G with theta = cos-1{1/2[R-G) +(R-B)/[(R-G)2 + (R-B)(G-B)]1/2}
• The saturation component is given by S =1-3/(R+G+B)[min(R,G,B)]
• the intensity component is given by I=1/3(R+G+B)
Converting colors from HSI to RGB
QC101 (a) Given values of HSI in the interval [0,1],we now want to find the corresponding RGB values in the same range .We begin by multiplying
H by 360o ,which returns the hue to its original range of [0o ,360o] RG sector(0o<=120o).when h is in this sector ,the RGB components
are given by the equations
B = I (1 - S)
R = I [1 + S cos H/cos(60o - H)]
G = 1 - (R + B)
GB Sector(120o <= H < 240o ).If the given value of H is in this ,we first subtract 120o from it
H = H -120o Then the RGB components are
B = I (1 – S)
G = I [1 + S cos H/cos(60o - H)]
B = 1 - (R + G)
BR Sector(240o <=H<=360o ).Finally if H is in this range we subtract 240o from it H = H - 240o
Then the RGB components are G = I (1 - S)
B = I [1 + S cos H/cos(60o - H)]
R = 1 - (G + B)
(Or)
# Brightness adaptation - 4 marks
# Subjective brightness - 4 marks
# Weber ratio - 2 marks
QC101 (b)
#Mach band effect -2 marks
#simultaneous contrast - 3marks
UNIT - II
Q. No Questions
Transferfunction of a Butterworth lowpass filter (BLPF) of order n, and with cutoff frequency at a distance D0 from the origin, is defined
as –
Transfer function does not have sharp discontinuity establishing cutoff between passed and
filtered frequencies. Cut off frequency D0 defines point at which H(u,v) = 0.5 Digital Image Processing Page | 37 Fig.(a) perspective plot
of a Butterworth lowpass-filter transfer function. (b) Filter displayed as an image.(c)Filter radial cross sections of order 1 through 4. Unlike
the ILPF, the BLPF transfer function does not have a sharp discontinuity that gives a clear cutoff between passed and filtered frequencies.
QC201 (a)
BUTTERWORTH LPF OF DIFFERENTFREQUENCIES:

Fig. shows the results of applying the BLPF of eq. to fig.(a), with n=2 and D0 equal to the five radii in fig.(b) for the ILPF, we note here a
smooth transition in blurring as a function of increasing cutoff frequency. Moreover, no ringing is visible in any of the images processed
with this particular BLPF, a fact attributed to the filter’s smooth transition between low and high frequencies. A BLPF of order 1 has no
ringing in the spatial domain. Ringing generally is imperceptible in filters of order 2, but can become significant in filters of higher
order.Figure shows a comparison between the spatial representation of BLPFs of various orders (using a cutoff frequency of 5 in all cases).
Shown also is the intensity profile along a horizontal scan line through the center of each filter. The filter of order 2 does show mild
ringing and small negative values, but they certainly are less pronounced than in the ILPF. A butter worth filters of order 20 exhibits
characteristics similar to those of the ILPF (in the limit, both filters are identical).
(Or)
QC201 (b)
UNIT - III
Q. No Questions
One of the simplest approaches f or linking edge points is to analyze the

characteristics of the pixels in a small neighborhood about every point in an
image that has undergone edge detection.
 Two properties used for establishing similarity of edge pixels in the analysis are
 The strength of the response of the gradient operator used to
produce the edge pixel,
 The direction of the gradient.
The first property is given by the value of ▼f.
Thus an edge pixel with coordinates (x’,y’) and in the predefined neighborhood of (x,y) is
similar in magnitude to the pixel at (x,y) if |▼f(x,y) - ▼(x’,y’)|<=T where T is a
QC301 (a) nonnegative threshold.
The direction of the gradient vector is given by
α(x,y)=tanˉ¹ gy/gx
Then an edge pixel at (x’,y’) in the predefined neighborhood of (x,y) has an angle similar
to the pixel at (x,y) if | α(x,y)= α(x’,y’)|<A where A is an angle threshold. Note that the
direction of the edge at (x,y) in reality is perpendicular to the direction of the gradient
vector at that point.
A point in the predefined neighborhood of (x,y) is linked to the pixel at (x’,y’) if both
magnitude and direction criteria are satisfied. This process is repeated for every location
in the image. A record must be kept of linked points as the center of the neighborhood is
moved from pixel to pixel. A simple bookkeeping procedure is to assign a different gray
level to each set of linked edge pixels.
(Or)
Geometric transformations are used for image restoration, modify the spatial relationship between the pixels in an image.
• Geometric transformations are often called rubber sheet transformations, because they are may be viewed as the process of printing an
image on a sheet of rubber.
• The geometric transformations consists of two basic operations:
(1) Spatial transformation
(2) Gray level interpolation
1
QC301 (b) 1
1
1
1
1
1
1
1
-1
-1
-1
-1
8
-1
-1
-1
-11.Spatial transformations:-
An image f of pixel coordinates(x,y) undergoes geometric distortion to produce an image g with coordinates(x’,y’).this transformation may
be expressed as
x’=r(x,y) y’=s(x,y)
• where r(x,y) and s(x’,y’) are the spatial transformations that produced the geometrically distorted image g(x’,y’).
• If r(x,y) and s(x,y) were known analytically recovering f(x,y) from the distorted image g(x’,y’) by applying the transformations in reverse
might possible theoretically.
• The method used most frequently to formulate the spatial relocation of pixels by the use of tiepoints,which are a subset of pixels whose
location in the input and output image is known precisely.
• The vertices of the quadrilaterals are corresponding tiepoints.
r(x,y)=c1x+c2y+c3xy+c4
S(x,y)=c5x+c6y+c7xy+c8
x’=c1x+c2y+c3xy+c4
y’=c5x+c6y+c7xy+c8
• Since there are a total of eight known tiepoints these equations can be solved for eight coefficients ci,i=1,2,…8.
• The coefficient constitute the geometric distortion model used to transform all pixels within the quadrilateral region defined by the
tiepoints used to obtain the coefficients.
• Tiepoints are established by a number of different techniques depending on the application.
2. Gray level Interpolation:-
• Depending on the values of coefficients ci equations can yield noninteger values for x’ and y’.
• Because the distorted image g is digital ,its pixel values are defined only at integer co ordinates .
• Thus using non integer values for x’, y’ causes a mapping into locations of g for which no gray levels are defined.
• The technique is used to accomplish this is called gray level interpolation.
UNIT - IV
Q. No Questions
Compression: It is the process of reducing the size of the given data or an image. It will help us to reduce the storage space required to
store an image or File.
Image Compression Model:
QC401 (a)
There are two Structural model and they are broadly Classified as follows
1. An Encoder
2. A Decoder.
An Input image f(x,y) is fed in to encoder and create a set of symbols and after transmission over the channel ,the encoded representation is
fed in to the decoder. A General Compression system model:
The General system model consist of the following components,They are broadly classified as
1. Source Encoder
2. Channel Encoder
3. Channel
4. Channel Decoder
5. Souce Decoder
The Source Encoder Will removes the input redundancies. The channel encoder will increase the noise immunity of the source encoder’s
output. If the channel between encoder and decoder is noise free then the channel encoder and decoder can be
omitted.
MAPPER:
It transforms the input data in to a format designed to reduce the interpixel redundancy in the input image.
QUANTIZER:
It reduce the accuracy of the mapper’s output.
SYMBOL ENCODER:
It creates a fixed or variable length code to represent the quantizer’s output and maps the output in accordance with the code.
SYMBOL DECODER:
The inverse operation of the source encoder’s symbol will be performed and maps the blocks.
(Or)
LOSSY:
In this type of coding, we add a quantizer to the lossless predictive model and examine the resulting trade-off between reconstruction
accuracy and compression performance. As Fig. shows, the quantizer, which absorbs the nearest integer function of the error-free encoder,
is inserted between the symbol encoder
QC401 (b)
and the point at which the prediction error is formed.
It mapsthe prediction error into a limited range of outputs, denoted e^n which establish the amount of compression and distortion
associated with lossy predictive coding. Fig. A lossy predictive coding model: (a) encoder and (b) decoder. In order to accommodate the
insertion of the quantization step, the errorfree encoder of figure must be altered so that the predictions generated by the encoder and
decoder are equivalent. As Fig.9 (a) shows, this is accomplished by placing the lossy encoder's predictor within feedback loop, where its
input, denoted f˙n, is generated as a function of past predictions and the corresponding quantized errors. That is,
prevents error buildup at the decoder's output. Note from Fig. (b) that the output of the decoder also is given by the above Eqn. OPTIMAL
PREDICTORS: The optimal predictor used in most predictive coding applications minimizes the encoder's mean- square prediction error
That is, the optimization criterion is chosen to minimize the mean-square prediction error, thequantization error is assumed to be negligible
(e˙n ≈ en), and the prediction is constrained to a linear combination of m previous pixels.1 These restrictions are not essential, but they
simplify the analysis considerably and, at the same time, decrease the computational complexity of the predictor. The resulting predictive
coding approach is referred to as differential pulse code modulation (DPCM)
LOSSLESS:
The error-free compression approach does not require decomposition of an image into a collection of bit planes. The approach, commonly
referred to as lossless predictive coding, is based on eliminating the interpixel redundancies of closely spaced pixels by extracting and
coding only the new information in each pixel. The new information of a pixel is defined as the difference between the actual and predicted
value of that pixel. Figure shows the basic components of a lossless predictive coding system. The system consists of an encoder and a
decoder, each containing an identical predictor. As eachsuccessive pixel of the input image, denoted fn, is introduced to the encoder, the
predictor generates the anticipated value of that pixel based on some number of past inputs. The output ofthe predictor is then rounded to
the nearest integer, denoted f^n and used to form the difference or prediction error which is coded using a variablelength code (by the
symbol encoder) to generate the next element of the compressed data stream.
This closed loop configuration
The decoder of Fig. 8.1 (b) reconstructs en from the received variablelength code words and performs the inverse operation
Various local, global, and adaptive methods can be used to generate f^n. In most cases, however, the prediction is formed by a linear
combination of m previous pixels. That is,
where m is the order of the linear predictor, round is a function used to denote the rounding ornearest integer operation, and the αi, for i =
1,2,..., m are prediction coefficients. In raster scan applications, the subscript n indexes the predictor outputs in accordance with their time
of occurrence. That is, fn, f^n and en in Eqns. above could be replaced with the more explicit notation f (t), f^(t), and e (t), where t
represents time. In other cases, n is used as an index on the spatial coordinates and/or frame number (in a time sequence of images) of an
image. In 1-D linear predictive coding, for example, Eq. above can be written as
where each subscripted variable is now expressed explicitly as a function of spatial coordinates x and y. The Eq. indicates that the 1-D
linear prediction f(x, y) is a function of the previous pixels on the current line alone. In 2-D predictive coding, the prediction is a function
of the previous pixels in a left-to-right, top-to-bottom scan of an image. In the 3-D case, it is based on these pixels and the previous pixels
of preceding frames. Equation above cannot be evaluated for the first m pixels of each line, so these pixels must be coded by using other
means (such as a Huffman code) and considered as an overhead of the predictive coding process. A similar comment applies to the higher-
dimensional cases.
UNIT - V
Q. No Questions
BOUNDARY REPRESENTATION:
Models are a more explicit representation than CSG.The object is represented by a complicated data structure giving information about each of the
object's faces, edges and vertices and how they are joined together. Appears to be a more natural representation for Vision since surface information is
readily available.The description of the object can be into two parts:
Topology: Itrecords the connectivity of the faces, edges and vertices by means of pointers in the data structure.
Geometry: Itdescribes the exact shape and position of each of the edges, faces and vertices.The geometry of a vertex is just its position in space as
given by its (x,y,z) coordinates.Edges may be straight lines, circular arcs, etc.A face is represented by some description of its surface (algebraic or
QC501 (a)
parametric forms used).

_ Chain codes
A chain code is a lossless compression algorithm for monochrome images. The basic principle of chain codes is to separately encode each connected
component, or "blot", in the image. For each such region, a point on the boundary is selected and its coordinates are transmitted. The encoder then moves
along the boundary of the image and, at each step, transmits a symbol representing the direction of this movement. This continues until the encoder
returns to the starting position, at which point the blot has been completely described, and encoding continues with the next blot in the image. This
encoding method is particularly effective for images consisting of a reasonable number of large connected components. Some popular chain codes
include the Freeman Chain Code of Eight Directions (FCCE), Vertex Chain Code (VCC), Three OrThogonal symbol chain code (3OT) and Directional
Freeman Chain Code of Eight Directions (DFCCE). A related blot encoding method is crack code Algorithms exist to convert between chain code, crack
code, and run-length encoding
A related blot encoding method is crack code Algorithms exist to convert between chain code, crack code, and run-length encoding.
1. Let i=3
2. Mark the points in the object that have furthest distance from each other with p and p2
3. Connect the points in the order they are numbered with lines
4. For each segment in the polygon, find the point on the perimeter between the point that have furthest distance to the polygonal line
segment. If this distance Is larger than a threshold, mark the point with a label pi
5. Renumber the points so that they are consecutive
6. Increase i
7. If no points have been added break, otherwise go to 3
The convex hull H of a set S is defined as the smallest convex set that contains S. We define the set
D= H \ S
The points that have neighbors from the sets D, H and CH is called p. These points are used for representation of the object. To limit the
number of points found it is possible to smooth the edge of the objectWe define an object’s skeleton or medial axel as the points in the object
that have several nearest neighbors on the edge of the object
To find the skeleton by using the definition is very costly, since it involves calculating the distance between all points in the object to all points on
the perimeter.
Usually some iterative method is used to remove perimeter pixels until the skeleton alone remains.
In order to achieve this the following rules must be adhered to:
1. Erosion of object must be kept to a minimum
2. Connected components must not be broken
3. End points must not be removed
The purpose with Description is to Quantify a representation of an object. This implies that we instead of talking about the areas in the image we
can talk about their properties such as length, curviness and so on. One of the simplest descriptions is the length P of the perimeter of an
object. The obvious measure of perimeter – length is the number of edge pixels. That is, pixel that do belong to the object, but have a neighbor that
belong to the object, but have a neighbor that belong to the background. A more precise measure is to assume that each pixel center is a corner in
a polygon.
The length of the perimeter is given by
P= aNe+ bNo
(Or)
Discussed the concept of Recognition Based on Matching by Correlation - 5 marks

Specified the Calculation of Correlation Coefficient - 5 marks
QC501 (b)
Drawn the Recognition based on Matching by correlation diagram - 5 marks

DIP Answer Key

Uploaded by

Copyright:

Available Formats

DIP Answer Key

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

DIP Answer Key

Uploaded by

Copyright:

Available Formats

Question Repository for aApr/May 2023 Examinations

Electronics and Communication ECE, EIE, CSE

Image sharpening and restoration

QA203 1) To reduce band width 2) To reduce redundancy 3) To extract feature.

(PART B – 13 Marks - Either Or Type)

Fig: Combining a single sensor with motion to generate a 2-D image

image (e) digitized image

# Cornea and Sclera

D4 distance(also called City-block Distance)

Dm shortest distance between p and p4 is 2

Dm shortest distance between p and p4 is 3

Dm distance between p and p4 is 4

Linear and Nonlinear Operations

Fig. Log Transformation Curve input vs output

Fig. Plot of the equation S = Crγ for various values of γ

Smoothing Filter Concept - 4 marks

# HPF-output gets sharpen and background becomes darker - 4mark

The transformation function is assumed to fulfill two condition T(r) is

Merge any adjacent regions Rj and Rk for which P(RjURk)=TRUE.

(ii) Geometric mean filter – 4 marks

QB303 (b) (iii) Harmonic filters – 6 marks

To encode the AC coefficient First using Zigzag scan. We obtain

Explained Statistical Approach - 7 marks

Discussed on Simple descriptors - 5 marks

(PART C – 15 Marks - Either Or Type)

BUTTERWORTH LPF OF DIFFERENTFREQUENCIES:

One of the simplest approaches f or linking edge points is to analyze the

parametric forms used).

Discussed the concept of Recognition Based on Matching by Correlation - 5 marks

You might also like