DIP ppt1


Digital Image Processing

20MCAT233
Course Outcomes
At the end of the course students will be able to:

CO1 - Discuss the fundamental concepts of Digital Image Processing, formation and
representation of images.
CO2 - Summarise image enhancement methods in the spatial domain.
CO3 - Explain image transforms and image smoothing and sharpening using various
kinds of filters in frequency domain.
CO4 - Describe various methods in image restoration and compression.
CO5 - Discuss morphological basics and image segmentation methods.
Module-I
Overview of Digital Image Processing

Digital Image Processing - Basic concepts - Difference between image processing and
computer vision - Components of an image processing system - Image processing
Applications

Mathematical preliminaries - Basic Vector and Matrix operations - Toeplitz, Circulant, Unitary and Orthogonal matrices

Elements of Visual Perception - Structure of the human eye and image formation -
Brightness adaptation and discrimination

Types of Images - Binary, Gray scale and Color Images

Image Sampling and Quantization - Digital image as a 2D array - Spatial and Intensity
resolution - 2D-sampling theorem - RGB and HSI color models
Image and Digital Image Processing
• An image may be defined as a two-dimensional function, f (x, y), where x and y are
spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is
called the intensity or gray level of the image at that point.
• When x, y, and the intensity values of f are all finite, discrete quantities, we call the
image a digital image.

• The field of digital image processing refers to processing digital images by means of
a digital computer.
• A digital image is composed of a finite number of elements, each of which has a
particular location and value. These elements are called picture elements, image
elements, pels, and pixels.
• Pixel is the term used most widely to denote the elements of a digital image.
Computer vision
• The goal of computer vision is to use computers to emulate human vision, including learning
and being able to make inferences and take actions based on visual inputs.

• Digital image processing encompasses processes:
• whose inputs and outputs are images,
• that extract attributes from images, and
• that lead to the recognition of individual objects.
Fundamental steps in digital image processing
• Image acquisition is the first process in Image Processing. Acquisition could be as simple as
being given an image that is already in digital form.

• Image enhancement is the process of manipulating an image so the result is more suitable
than the original for a specific application. Enhancement is based on human subjective
preferences regarding what constitutes a “good” enhancement result.

• Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, the restoration techniques tend to be based on mathematical
or probabilistic models of image degradation.

• In color image processing, color is also used as the basis for extracting features of interest in
an image.

• Wavelets are the foundation for representing images in various degrees of resolution.
Fundamental steps in digital image processing
• Compression deals with techniques for reducing the storage required to save an image.
• Morphological processing deals with tools for extracting image components that are useful
in the representation and description of shape.
• Segmentation partitions an image into its constituent parts or objects.
• Feature extraction almost always follows the output of a segmentation stage, which usually
is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels
separating one image region from another) or all the points in the region itself. Feature
extraction consists of feature detection and feature description. Feature detection refers
to finding the features in an image, region, or boundary. Feature description assigns
quantitative attributes to the detected features.
• Image pattern classification is the process that assigns a label (e.g., “vehicle”) to an object
based on its feature descriptors.
• Knowledge about a problem domain is coded into an image processing system in the form
of a knowledge database.
Components of an image processing system
Subsystems are required to acquire digital images
1. physical sensor that responds to the energy radiated by the object we wish to
image. A digitizer is a device for converting the output of the physical sensing device
into digital form. The sensors produce an electrical output proportional to light
intensity. The digitizer converts these outputs to digital data.

2. The computer in an image processing system is a general-purpose computer and can range
from a PC to a supercomputer.

3. Software for image processing consists of specialized modules that perform specific
tasks.

4. Mass storage provides the digital storage space required by image processing applications,
which typically involve large volumes of image data.


Components of an image processing system
5. Image displays are mainly color, flat screen monitors. Monitors are driven by the
outputs of image and graphics display cards that are an integral part of the computer
system.

6. Hardcopy devices for recording images include laser printers, film cameras, heat
sensitive devices, ink-jet units, and digital units, such as optical and CD-ROM disks.

7. Networking and cloud communication are almost default functions in any computer system.
Because of the large amount of data inherent in image processing applications, the key
consideration in image transmission is bandwidth.

8. Image data compression continues to play a major role in the transmission of large
amounts of image data.
Image processing applications
• Images can be categorized according to their source.
• The principal energy sources for images are the electromagnetic (EM) energy spectrum, acoustic,
ultrasonic and electronic sources.
• The following slides outline how images are generated in these categories, and the areas in which
they are applied.
• Images based on radiation from the EM spectrum are the most familiar.
GAMMA-RAY IMAGING
• nuclear medicine and astronomical observations.
• In nuclear medicine, the approach is to inject a patient with a radioactive isotope
that emits gamma rays as it decays. Images are produced from the emissions
collected by gamma-ray detectors.
X-RAY IMAGING
• in medical diagnostics – chest, angiography
• In industrial imaging, X-ray of electronic circuit
• astronomy
IMAGING IN THE ULTRAVIOLET BAND
• Industrial inspection
• Microscopy - fluorescence microscopy
• Lasers
• biological imaging
• astronomical observations
IMAGING IN THE VISIBLE AND INFRARED BANDS
• remote sensing - usually includes several bands in the visual and
infrared regions of the spectrum.
• Weather observation and prediction also are major applications of
multispectral imaging from satellites.
IMAGING IN THE MICROWAVE BAND
• radar
IMAGING IN THE RADIO BAND
• medicine and astronomy.
• radio waves are used in magnetic resonance imaging (MRI).
Imaging in other modalities
• acoustic imaging (based on the strength of the returning sound), electron microscopy, and
synthetic (computer-generated) imaging
• Imaging using “sound” finds application in geological exploration (mineral and oil
exploration), industry, and medicine.
• marine image acquisition
Elements of Visual Perception
• how images are formed and perceived by humans
• Structure of the human eye and image formation
• Brightness adaptation and discrimination
Structure of the human eye
• The eye is nearly a sphere (with a diameter of about 20 mm) enclosed by three
membranes: the cornea and sclera outer cover; the choroid; and the retina.
• The cornea is a tough, transparent tissue that covers the anterior surface of the eye.
• Continuous with the cornea, the sclera is an opaque membrane that encloses the
remainder of the optic globe.
• The choroid lies directly below the sclera. This membrane contains a network of
blood vessels that serve as the major source of nutrition to the eye.
• The choroid coat is heavily pigmented, which helps reduce the amount of
extraneous light entering the eye and the backscatter within the optic globe.
• The central opening of the iris (the pupil) varies in diameter from approximately 2
to 8 mm. The front of the iris contains the visible pigment of the eye, whereas the
back contains a black pigment.
Structure of the human eye
• The lens consists of concentric layers of fibrous cells and is suspended by fibers that attach
to the ciliary body. It is composed of 60% to 70% water, about 6% fat, and more protein
than any other tissue in the eye. The lens is colored by a slightly yellow pigmentation that
increases with age.
• The innermost membrane of the eye is the retina, which lines the inside of the wall’s entire
posterior portion. When the eye is focused, light from an object is imaged on the retina.
• There are two types of receptors: cones and rods. There are between 6 and 7 million cones
in each eye. They are located primarily in the central portion of the retina, called the fovea,
and are highly sensitive to color.
• Rods capture an overall image of the field of view. 75 to 150 million rods are distributed
over the retina. They are not involved in color vision, and are sensitive to low levels of
illumination.
Structure of the human eye
IMAGE FORMATION IN THE EYE
• In the human eye, the distance between the center of the lens and the imaging
sensor (the retina) is fixed, and the focal length needed to achieve proper focus is
obtained by varying the shape of the lens.
• The fibers in the ciliary body accomplish this by flattening or thickening the lens for
distant or near objects, respectively.
• The distance between the center of the lens and the retina along the visual axis is
approximately 17 mm.
• The range of focal lengths is approximately 14 mm to 17 mm, the latter taking place
when the eye is relaxed and focused at distances greater than about 3 m.
• Simple geometry can be used to obtain the dimensions of an image formed on the retina.
• the retinal image is focused primarily on the region of the fovea. Perception then
takes place by the relative excitation of light receptors, which transform radiant
energy into electrical impulses that ultimately are decoded by the brain.
IMAGE FORMATION IN THE EYE
BRIGHTNESS ADAPTATION AND DISCRIMINATION
• Because digital images are displayed as sets of discrete intensities, the eye’s ability to
discriminate between different intensity levels is an important consideration.
• Subjective brightness (intensity as perceived by the human visual system) is a
logarithmic function of the light intensity incident on the eye.
• The range of light intensities to which the eye can adapt is enormous, but the visual system
cannot operate over such a range simultaneously. It accomplishes this large variation by
changing its overall sensitivity, a phenomenon known as brightness adaptation.
• The total range of distinct intensity levels the eye can discriminate simultaneously is
rather small when compared with the total adaptation range.
• For a given set of conditions, the current sensitivity level of the visual system is
called the brightness adaptation level
• The ability of the eye to discriminate between changes in light intensity at any
specific adaptation level is called brightness discrimination.
Brightness adaptation
discriminate between changes in light intensity

• In the classical experiment, an increment of illumination, ΔI, is added to a uniform background of intensity I.
• If ΔI is not bright enough, the subject says “no,” indicating no perceivable change.
• As ΔI gets stronger, the subject may give a positive response of “yes,” indicating a perceived change.
• When ΔI is strong enough, the subject will give a response of “yes” all the time.
human perception phenomena - simultaneous contrast
human perception phenomena- optical illusions
Types of Images

1. Binary Images
2. Gray-scale images
3. Colour images
Binary Images

• It is the simplest type of image.


• It takes only two values, i.e., black and white (0 and 1).
• A binary image is a 1-bit image: only one binary digit is needed to represent each pixel.
• Binary images are mostly used to represent a general shape or outline.
• They are created from grayscale images via a threshold operation, where every pixel above
the threshold value is turned white and every pixel below it is turned black.
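A minimal NumPy sketch of this threshold operation (the 4 × 4 random image and the threshold value 128 are illustrative assumptions, not part of the slide):

    import numpy as np

    gray = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)  # stand-in grayscale image
    threshold = 128                                                # assumed threshold value
    binary = (gray > threshold).astype(np.uint8)                   # 1 where above threshold, 0 elsewhere
    print(binary)                                                  # multiply by 255 to display as black/white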
Grayscale images
• Grayscale images are monochrome images; they have only one channel.
• Grayscale images do not contain any information about color.
• Each pixel stores one of the available grey levels.
• A typical grayscale image contains 8 bits/pixel of data, which gives 256 different grey
levels.
• In medical images and astronomy, 12 or 16 bits/pixel images are used.
Colour images
• The images are represented by red, green and blue components (RGB images).
• Each color pixel typically uses 24 bits: 8 bits for each of the three color bands (R, G, B).
IMAGE SAMPLING AND QUANTIZATION
• The output of most sensors is a continuous voltage waveform whose amplitude and
spatial behavior are related to the physical phenomenon being sensed.
• To create a digital image, convert the continuous sensed data into a digital format.
This requires two processes: sampling and quantization.
• An image may be continuous with respect to the x- and y-coordinates, and also in
amplitude.
• To digitize it, we have to sample the function in both coordinates and also in
amplitude.
• Digitizing the coordinate values is called sampling.
• Digitizing the amplitude values is called quantization.
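The two steps can be illustrated with a short one-dimensional sketch (the sinusoidal "continuous" signal, the sample spacing, and the choice of 8 quantization levels are assumptions for illustration):

    import numpy as np

    # "Continuous" signal along a scan line, approximated on a fine grid
    t = np.linspace(0, 1, 1000)
    f = 0.5 + 0.5 * np.sin(2 * np.pi * 3 * t)      # amplitude in [0, 1]

    # Sampling: keep equally spaced samples (digitizing the coordinates)
    samples = f[::100]                             # 10 samples along the line

    # Quantization: map each sample to one of 8 discrete intensity levels
    levels = 8
    quantized = np.round(samples * (levels - 1)).astype(int)
    print(quantized)                               # integers in 0..7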
Sampling and Quantization
• The one-dimensional function in Fig. 2.16(b) is a plot of amplitude (intensity level) values of
the continuous image along the line segment AB in Fig. 2.16(a).
• To sample this function, take equally spaced samples along line AB, as shown in Fig. 2.16(c).
• The samples are shown as small dark squares superimposed on the function, and their
(discrete) spatial locations are indicated by corresponding tick marks in the bottom of the
figure.
• In order to form a digital function, the intensity values also must be converted (quantized)
into discrete quantities. The vertical gray bar in Fig. 2.16(c) depicts the intensity scale
divided into eight discrete intervals, ranging from black to white.
• The vertical tick marks indicate the specific value assigned to each of the eight intensity
intervals.
• The continuous intensity levels are quantized by assigning one of the eight values to each
sample, depending on the vertical proximity of a sample to a vertical tick mark.
• The digital samples resulting from both sampling and quantization are shown as white
squares in Fig. 2.16(d)
Digital image as a 2D array
• Let f (s, t) represent a continuous image function of two continuous variables, s and t.
• Sample the continuous image into a digital image, f (x, y), containing M rows and N
columns, where (x, y) are discrete coordinates.
• Use integer values for these discrete coordinates: x = 0, 1, 2,…,M − 1 and y = 0, 1, 2,…, N
− 1.
• The value of the digital image at the origin is f (0,0), and its value at the next coordinates
along the first row is f (0,1). Here, the notation (0, 1) is used to denote the second
sample along the first row.
• The value of a digital image at any coordinates (x, y) is denoted f (x, y), where x and y are
integers.
• The section of the real plane spanned by the coordinates of an image is called the spatial
domain, with x and y being referred to as spatial variables or spatial coordinates.
Digital image as a 2D array
• The representation is an array (matrix) composed of the numerical values of f (x, y).
• This is the representation used for computer processing.
• In equation form, we write the representation of an M × N numerical array as
f(x, y) = [ f(0, 0)       f(0, 1)       …   f(0, N−1)
            f(1, 0)       f(1, 1)       …   f(1, N−1)
            …             …                 …
            f(M−1, 0)     f(M−1, 1)     …   f(M−1, N−1) ]
• The right side of this equation is a digital image represented as an array of real numbers.
• Each element of this array is called an image element, picture element, pixel, or pel.
Represent a digital image in a traditional matrix form
Image displayed as visual intensity array and 2-D numerical array
Digital image as a 2D array
• Define the origin of an image at the top left corner.

• The positive x-axis extends downward and the positive y-axis extends to the right.

• The sampling process may be viewed as partitioning the xy-plane into a grid.

• f (x, y) is a digital image if (x, y) are integers and f is a function that assigns an intensity
value (a real number from the set of real numbers, R) to each distinct pair of coordinates
(x, y). This functional assignment is the quantization process.
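A brief NumPy sketch of this convention (the 3 × 4 array values are arbitrary): rows index x (extending downward from the top-left origin) and columns index y (extending to the right).

    import numpy as np

    M, N = 3, 4                              # M rows, N columns
    f = np.arange(M * N).reshape(M, N)       # digital image f(x, y) as a 2-D array
    print(f[0, 0])                           # value at the origin, f(0, 0)
    print(f[0, 1])                           # second sample along the first row, f(0, 1)
    print(f.shape)                           # (M, N)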
SPATIAL AND INTENSITY RESOLUTION
• Spatial resolution is a measure of the smallest discernible(visible) detail in an image.

• Quantitatively, spatial resolution can be stated in several ways, with line pairs per unit
distance, and dots (pixels) per unit distance being common measures.

• Image resolution is the largest number of discernible line pairs per unit distance.

• Dots per unit distance is a measure of image resolution used in the printing and
publishing industry. In the U.S., this measure usually is expressed as dots per inch (dpi).

• Intensity resolution similarly refers to the smallest discernible change in intensity level
COLOR MODELS (COLOR SYSTEM, COLOR SPACE)
• A color model is a specification of (1) a coordinate system, and (2) a subspace within
that system, such that each color in the model is represented by a single point
contained in that subspace.
• Most color models in use today are oriented either toward hardware (such as for
color monitors and printers) or toward applications, where color manipulation is a
goal (the creation of color graphics for animation).
• The hardware-oriented models most commonly used in practice are
• the RGB (red, green, blue) model for color monitors and a broad class of color
video cameras;
• the CMY (cyan, magenta, yellow) and
• CMYK (cyan, magenta, yellow, black) models for color printing; and
• the HSI (hue, saturation, intensity) model, which corresponds closely with the
way humans describe and interpret color.
• The HSI model also has the advantage that it separates the color and gray-scale
information in an image.
THE RGB COLOR MODEL
• In the RGB model, each color appears in its primary spectral components of red, green, and
blue. This model is based on a Cartesian coordinate system.
• The color subspace is the cube in which RGB primary values are at three corners; the
secondary colors cyan, magenta, and yellow are at three other corners; black is at the
origin; and white is at the corner farthest from the origin.
• The grayscale (points of equal RGB values) extends from black to white along the line joining
these two points. The different colors in this model are points on or inside the cube, and are
defined by vectors extending from the origin.
• All values of R, G, and B in this representation are assumed to be in the range [0, 1].
• Images represented in the RGB color model consist of three component images, one for
each primary color. When fed into an RGB monitor, these three images combine on the
screen to produce a composite color image.
• The number of bits used to represent each pixel in RGB space is called the pixel depth.
THE RGB COLOR MODEL
• Consider an RGB image in which each of the red, green, and blue images is an 8-bit
image.
• Under these conditions, each RGB color pixel [that is, a triplet of values (R, G, B)]
has a depth of 24 bits (3 image planes times the number of bits per plane).
• The term full-color image is used often to denote a 24-bit RGB color image.
• The total number of possible colors in a 24-bit RGB image is (2^8)^3 = 16,777,216.
• An RGB color image is composed of three grayscale intensity images (representing red,
green, and blue).
• RGB is ideal for image color generation (as in image capture by a color camera or
image display on a monitor screen), but its use for color description is limited.
RGB color cube
24-bit RGB color cube

• The range of values in the cube is scaled to the numbers representable by the number of
bits in the component images.
• If the primary images are 8-bit images, the limits of the cube along each axis become
[0, 255].
• Then, for example, white would be at point [255, 255, 255] in the cube.

• https://www.youtube.com/watch?v=sq3gUlCT8fc
THE HSI COLOR MODEL
• When humans view a color object, we describe it by its hue, saturation, and
brightness(Intensity).
• Hue is a color attribute that describes a pure color (pure yellow, orange, or red), whereas
saturation gives a measure of the degree to which a pure color is diluted by white light.
• Brightness is a subjective descriptor that is practically impossible to measure.
• It embodies the achromatic notion of intensity and is one of the key factors in describing
color sensation.
• Intensity (gray level) is a most useful descriptor of achromatic images. This quantity is
measurable and easily interpretable.
• HSI (hue, saturation, intensity) color model separates the intensity component from the
color-carrying information (hue and saturation) in a color image.
• As a result, the HSI model is a useful tool for developing image processing algorithms
based on color descriptions that are natural and intuitive to humans, who are the
developers and users of these algorithms.
RGB and HSI
THE HSI COLOR MODEL
Intensity
• The line (intensity axis) joining the black and white vertices of the RGB cube is vertical.
• The intensity (gray) scale lies along this line.
• Thus, to determine the intensity component of any color point, define a plane that contains
the color point and, at the same time, is perpendicular to the intensity axis. The intersection
of the plane with the intensity axis gives a point with an intensity value in the range [0, 1].
Saturation
• The saturation (purity) of a color increases as a function of distance from the intensity axis.
• The saturation of points on the intensity axis is zero, as all points along this axis are gray.
Hue
• All points contained in the plane segment defined by the intensity axis and the boundaries of
the cube have the same hue (cyan in this case).
• All colors generated by three colors lie in the triangle defined by those colors.
THE HSI COLOR MODEL
• If two of those points are black and white, and the third is a color point, all points
on the triangle would have the same hue, because the black and white components
cannot change the hue.

• By rotating the shaded plane about the vertical intensity axis, we would obtain
different hues.

• The hue, saturation, and intensity values required to form the HSI space can be
obtained from the RGB color cube.

• HSI space is represented by a vertical intensity axis, and the locus of color points
that lie on planes perpendicular to that axis.
• an angle of 0° from the red axis designates 0 hue, and the hue increases
counterclockwise from there.
• The saturation (distance from the vertical axis) is the length of the vector from the
origin to the point.
• The origin is defined by the intersection of the color plane with the vertical intensity
axis.
• The important components of the HSI color space are
1. the vertical intensity axis,
2. the length of the vector to a color point, and
3. the angle this vector makes with the red axis.
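The slides do not give the RGB-to-HSI conversion equations; as a hedged illustration of how the three HSI components above can be obtained from normalized RGB values, the sketch below uses the standard geometric conversion formulas (the sample pixel and the small eps guard against division by zero are assumptions):

    import numpy as np

    def rgb_to_hsi(r, g, b, eps=1e-8):
        """Convert normalized RGB values in [0, 1] to (H in degrees, S, I)."""
        i = (r + g + b) / 3.0                                   # intensity
        s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + eps)        # saturation
        num = 0.5 * ((r - g) + (r - b))
        den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
        theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
        h = theta if b <= g else 360.0 - theta                  # hue measured from the red axis
        return h, s, i

    print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red: hue 0 degrees, saturation 1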
ELEMENTWISE VERSUS MATRIX OPERATIONS
• An elementwise operation involving one or more images is carried out on a pixel-by pixel
basis.
• images can be viewed equivalently as matrices.
• operations between images are carried out using matrix theory.
• Consider two 2 × 2 images (matrices).
• The elementwise product of the two images is obtained by multiplying pairs of corresponding pixels.


ELEMENTWISE VERSUS MATRIX OPERATIONS
• The matrix product of the same images is formed using the rules of matrix multiplication.
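A brief NumPy illustration of the distinction (the two 2 × 2 matrices are arbitrary stand-ins for the images on the slide):

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[5, 6],
                  [7, 8]])

    print(A * B)    # elementwise product: corresponding pixels multiplied
    print(A @ B)    # matrix product: rules of matrix multiplication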
ARITHMETIC OPERATIONS
• Arithmetic operations between two images f (x, y) and g(x, y) are denoted as
s(x, y) = f(x, y) + g(x, y)
d(x, y) = f(x, y) − g(x, y)
p(x, y) = f(x, y) × g(x, y)
v(x, y) = f(x, y) ÷ g(x, y)
• These are elementwise operations which means that they are performed between
corresponding pixel pairs in f and g for x = 0, 1, 2,…,M − 1 and y = 0, 1, 2,…, N − 1.
• M and N are the row and column sizes of the images
• s, d, p, and v are images of size M × N
Toeplitz matrix
• Toeplitz matrix or diagonal-constant matrix, is a matrix in which each descending
diagonal from left to right is constant.
• For example, the following is a Toeplitz matrix:
a  b  c  d
e  a  b  c
f  e  a  b
g  f  e  a
• Elements with constant value along the main diagonal and sub-diagonals.
• Each row (column) is generated by a shift of the previous row (column).
• The last element disappears.
• A new element appears.
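A small sketch using SciPy to build a Toeplitz matrix from its first column and first row (the numeric values are arbitrary):

    import numpy as np
    from scipy.linalg import toeplitz

    c = np.array([1, 2, 3, 4])        # first column
    r = np.array([1, 5, 6, 7])        # first row (its first element should agree with c[0])
    T = toeplitz(c, r)
    print(T)                          # each descending diagonal is constant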
Circulant matrix
• A circulant matrix is a square matrix in which all row vectors are composed of the
same elements and each row vector is rotated one element to the right relative to
the preceding row vector.
• It is a particular kind of Toeplitz matrix.
• Each row (column) is generated by a circular shift (modulo N) of the previous row
(column).
• For an N×N circulant matrix, the elements are determined by an N-length sequence c_n, 0 ≤ n ≤ N − 1.
Image enhancement
• Image enhancement is the process of manipulating an image so the result is more
suitable than the original for a specific application.
• The term spatial domain refers to the image plane itself, and image processing
methods are based on direct manipulation of pixels in an image.
• Two principal categories of spatial processing are intensity transformations and
spatial filtering.

• Intensity transformations operate on single pixels of an image.
E.g.: contrast manipulation and image thresholding.
• Spatial filtering performs operations on the neighborhood of every pixel in an image.
E.g.: image smoothing and sharpening.
spatial domain processes
• The spatial domain processes are based on the expression
g(x, y) = T[ f (x, y)]
where
• f (x, y) is an input image,
• g(x, y) is the output image, and
• T is an operator on f defined over a neighborhood of point (x, y).

• The operator T can be applied


• to the pixels of a single image or
• to the pixels of a set of images
Implementation of g(x, y) = T[ f (x, y)]
“compute the average intensity of the pixels in the neighbourhood”
Implementation of g(x, y) = T[ f (x, y)] – Process with example
• T is defined as “compute the average intensity of the pixels in the neighborhood.” The
neighborhood is a square of size 3 × 3. The point (x0, y0) is an arbitrary location in the
image, and the small region shown is a neighborhood of (x0, y0).
• The neighborhood is rectangular, centered on (x0, y0), and much smaller in size than the
image.
• The process consists of moving the center of the neighborhood from pixel to pixel and
applying the operator T to the pixels in the neighborhood to yield an output value at that
location. Thus, for any specific location (x0, y0), the value of the output image g at those
coordinates is equal to the result of applying T to the neighborhood with origin at (x0, y0) in
f.
• Consider an arbitrary location in an image, say (100,150). The result at that location in the
output image, g(100,150), is the sum of f (100,150) and its 8-neighbors, divided by 9.
• The center of the neighborhood is then moved to the next adjacent location and the
procedure is repeated to generate the next value of the output image g.
• The process starts at the top left of the input image and proceeds pixel by pixel in a horizontal
(vertical) scan, one row (column) at a time.
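A direct (loop-based) sketch of this averaging operator T over 3 × 3 neighborhoods, skipping the one-pixel border for simplicity (the random input array is a placeholder):

    import numpy as np

    def average_3x3(f):
        """g(x, y) = mean of f over the 3x3 neighborhood centered at (x, y)."""
        M, N = f.shape
        g = f.astype(float).copy()
        for x in range(1, M - 1):           # border pixels are left unchanged here
            for y in range(1, N - 1):
                g[x, y] = f[x - 1:x + 2, y - 1:y + 2].mean()
        return g

    f = np.random.randint(0, 256, size=(6, 6))
    print(average_3x3(f))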
Intensity transformation function and thresholding function
• The smallest possible neighborhood is of size 1 × 1.
• In this case, g depends only on the value of f at a single point (x, y) and T becomes an intensity
(also called a gray-level, or mapping) transformation function of the form
s = T(r)
• s and r are the intensities of g and f, respectively, at any point (x, y).

• Contrast stretching
• The result of applying the transformation to every pixel in f to generate the corresponding
pixels in g would be to produce an image of higher contrast than the original,
• by darkening the intensity levels below k and
• brightening the levels above k.

• In the limiting case T(r) produces a two level (binary) image. A mapping of this form is called a
thresholding function.
Intensity transformation function and thresholding function
BASIC INTENSITY TRANSFORMATION FUNCTIONS
• Denote the values of pixels, before and after processing, by r and s, respectively.
• These values are related by a transformation T, that maps a pixel value r into a pixel
value s.

• Basic types of functions used frequently in image processing:


1. linear (negative and identity transformations),
2. logarithmic (log and inverse-log transformations), and
3. power-law (nth power and nth root transformations).

• The identity function is the case in which the input and output intensities are
identical.
Basic intensity transformation functions
IMAGE NEGATIVES
• The negative of an image with intensity levels in the range [0,L − 1] is obtained by using
the negative transformation function, which has the form:
s=L−1−r
• Reversing the intensity levels of a digital image in this manner produces the equivalent
of a photographic negative.
• This type of processing is used in enhancing white or gray detail embedded in dark
regions of an image, especially when the black areas are dominant in size.
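A one-line sketch of the negative transformation for an 8-bit image (L = 256; the random input is a placeholder):

    import numpy as np

    L = 256
    r = np.random.randint(0, L, size=(4, 4), dtype=np.uint8)   # input intensities
    s = (L - 1 - r.astype(int)).astype(np.uint8)               # s = L - 1 - r
    print(s)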
LOG TRANSFORMATIONS
• The general form of the log transformation in is
s = c log(1 + r)
where c is a constant and it is assumed that r ≥ 0.
• This transformation maps a narrow range of low intensity values in the input into a
wider range of output levels.
• Input levels in the range [0, L/4] map to output levels in the range [0, 3L/4]. Conversely,
higher values of input levels are mapped to a narrower range in the output.
• Can expand the values of dark pixels in an image, while compressing the higher-level
values.
• The opposite is true of the inverse log (exponential) transformation.
• The log function has the important characteristic that it compresses the dynamic range
of pixel values.
Result of applying the log transformation with c = 1
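A sketch of the log transformation scaled so the output occupies the full 8-bit range (choosing c = 255 / log(256) is an assumption made here so the maximum input maps to 255):

    import numpy as np

    r = np.random.randint(0, 256, size=(4, 4)).astype(float)    # placeholder input image
    c = 255.0 / np.log(1.0 + 255.0)                             # maps the maximum input level to 255
    s = c * np.log(1.0 + r)                                     # s = c log(1 + r)
    print(s.astype(np.uint8))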
POWER-LAW (GAMMA) TRANSFORMATIONS
• Power-law transformations have the form
s = c r^γ
where c and γ (gamma) are positive constants.
• power-law curves with fractional values of gamma map a narrow range of dark input
values into a wider range of output values, with the opposite being true for higher
values of input levels.
• The response of many devices used for image capture, printing, and display obey a
power law.
• The process used to correct these power-law response phenomena is called gamma
correction or gamma encoding.
power-law curves
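A sketch of gamma correction on intensities normalized to [0, 1] (gamma = 0.5 is an illustrative value; a fractional gamma brightens dark regions, as described above):

    import numpy as np

    r = np.random.randint(0, 256, size=(4, 4)) / 255.0   # normalize input to [0, 1]
    c, gamma = 1.0, 0.5
    s = c * r ** gamma                                    # s = c * r^gamma
    print((s * 255).astype(np.uint8))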
PIECEWISE LINEAR TRANSFORMATION FUNCTIONS
• the form of piecewise functions can be arbitrarily complex.
• their specification requires considerable user input

• Contrast Stretching
• Intensity-Level Slicing
• Bit-Plane Slicing
Contrast Stretching
• Low-contrast images can result from poor illumination, lack of dynamic range in the
imaging sensor, or even the wrong setting of a lens aperture during image
acquisition.

• Contrast stretching expands the range of intensity levels in an image so that it spans
the ideal full intensity range of the recording medium or display device.

• a typical transformation used for contrast stretching.

• Contrast stretching is obtained by setting (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L − 1),
where rmin and rmax denote the minimum and maximum intensity levels in the input image.
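A sketch of this choice of (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L − 1), assuming an 8-bit NumPy image whose intensities are not all equal:

    import numpy as np

    L = 256
    f = np.random.randint(60, 120, size=(4, 4)).astype(float)   # low-contrast placeholder input
    r_min, r_max = f.min(), f.max()                              # assumes r_max > r_min
    g = (f - r_min) / (r_max - r_min) * (L - 1)                  # stretch [r_min, r_max] to [0, L-1]
    print(g.astype(np.uint8))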
Intensity-Level Slicing
• Intensity-level slicing is used to highlight a specific range of intensities in an image.
• Applications include enhancing features in satellite imagery, such as masses of water, and
enhancing flaws in X-ray images.
• Intensity-level slicing can be implemented in two basic ways.

• One approach is to display in one value (say, white) all the values in the range of
interest and in another (say, black) all other intensities. This Transformation
produces a binary image.
• The second approach brightens (or darkens) the desired range of intensities, but
leaves all other intensity levels in the image unchanged.
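A sketch of both approaches (the range of interest [100, 150] and the highlight value 255 are assumptions):

    import numpy as np

    f = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
    lo, hi = 100, 150                                   # assumed range of intensities to highlight

    # Approach 1: binary output - white inside the range, black elsewhere
    g1 = np.where((f >= lo) & (f <= hi), 255, 0).astype(np.uint8)

    # Approach 2: brighten the range, leave everything else unchanged
    g2 = np.where((f >= lo) & (f <= hi), 255, f).astype(np.uint8)
    print(g1, g2, sep="\n")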
Bit-Plane Slicing
• Pixel values are integers composed of bits. For example, values in a 256-level
grayscale image are composed of 8 bits (one byte).

• Instead of highlighting intensity-level ranges, we can highlight the contribution made to
total image appearance by specific bits.

• an 8-bit image may be considered as being composed of eight one-bit planes, with
plane 1 containing the lowest-order bit of all pixels in the image, and plane 8 all the
highest-order bits.

• Decomposing an image into its bit planes is useful for analyzing the relative importance of
each bit in the image.
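A short sketch of extracting the eight bit planes of an 8-bit image (plane 1 holds the lowest-order bit, plane 8 the highest; the random input is a placeholder):

    import numpy as np

    f = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
    planes = [(f >> k) & 1 for k in range(8)]            # planes[0] = plane 1 (LSB), planes[7] = plane 8 (MSB)

    # Reconstruct the original image from its bit planes
    recon = sum(p.astype(np.uint8) << k for k, p in enumerate(planes))
    assert np.array_equal(recon, f)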
Bit-Plane Slicing
• The four higher-order bit planes, especially the first two, contain a significant amount of the
visually-significant data.

• The lower-order planes contribute to more subtle intensity details in the image.

• The original image has a gray border whose intensity is 194. Notice that the corresponding
borders of some of the bit planes are black (0), while others are white (1).

• To see why, consider a pixel in, say, the middle of the lower border of the image. The corresponding
pixels in the bit planes, starting with the highest-order plane, have values 1 1 0 0 0 0 1 0,
which is the binary representation of decimal 194.

• The value of any pixel in the original image can be similarly reconstructed from its
corresponding binary-valued pixels in the bit planes by converting an 8-bit binary sequence to
decimal.
HISTOGRAM PROCESSING
HISTOGRAM - explanation
• The most populated histogram bins of a dark image are concentrated on the lower (dark) end of the
intensity scale.

• The most populated bins of the light image are biased toward the higher end of the
scale.

• An image with low contrast has a narrow histogram located typically toward the
middle of the intensity scale.

• An image whose pixels tend to occupy the entire range of possible intensity levels
and, in addition, tend to be distributed uniformly, will have an appearance of high
contrast and will exhibit a large variety of gray tones. The image shows a great deal
of gray-level detail and has a high dynamic range.
HISTOGRAM EQUALIZATION
• Let the variable r denote the intensities of an image to be processed.
• Assume that r is in the range [0, L − 1], with r = 0 representing black and r = L − 1
representing white.
• For r satisfying these conditions, we consider transformations (intensity mappings) of the form
s = T(r),   0 ≤ r ≤ L − 1
• that produce an output intensity value, s, for a given intensity value r in the input
image.
histogram equalization
• The probability of occurrence of intensity level rk in a digital image is approximated by
pr(rk) = nk / MN,   k = 0, 1, 2, …, L − 1
• where MN is the total number of pixels in the image, and nk denotes the number of
pixels that have intensity rk.
• The discrete form of the transformation is
sk = T(rk) = (L − 1) Σ (j = 0 to k) pr(rj),   k = 0, 1, 2, …, L − 1
• where, L is the number of possible intensity levels in the image (e.g., 256 for an 8-
bit image). Thus, a processed (output) image is obtained by mapping each pixel in
the input image with intensity rk into a corresponding pixel with level sk in the
output image, This is called a histogram equalization or histogram linearization
transformation.
• We assume that
(a) T(r) is a monotonically increasing function in the interval 0 ≤ r ≤ L − 1, and
(b) 0 ≤ T(r) ≤ L − 1 for 0 ≤ r ≤ L − 1.
• Some formulations use the inverse transformation
rk = T⁻¹(sk),   k = 0, 1, 2, …, L − 1

FUNDAMENTALS OF SPATIAL
FILTERING
Spatial Domain
• An image can be represented in the form of a 2-D matrix where each element of the
matrix represents a pixel intensity. This 2-D matrix representation of the intensity
distribution of an image is called the spatial domain.
Frequency Domain
• Frequency in an image refers to the rate of change of pixel values.
• Frequency-domain methods are based on the Fourier transform of an image.
THE MECHANICS OF LINEAR SPATIAL FILTERING
• “filtering” refers to passing, modifying, or rejecting specified frequency components of an image.

• Spatial filtering modifies an image by replacing the value of each pixel by a function of the values
of the pixel and its neighbors.

• If the operation performed on the image pixels is linear, then the filter is called a linear spatial
filter. Otherwise, the filter is a nonlinear spatial filter.

• A linear spatial filter performs a sum-of-products operation between an image f and a filter
kernel, w.

• The kernel is an array whose size defines the neighborhood of operation, and whose coefficients
determine the nature of the filter.
• At any point (x, y) in the image, the response, g(x, y), of the filter is the sum of
products of the kernel coefficients and the image pixels encompassed by the kernel

• As coordinates x and y are varied, the center of the kernel moves from pixel to pixel,
generating the filtered image, g, in the process.
• The center coefficient of the kernel, w(0, 0), aligns with the pixel at location (x, y).
• Linear spatial filtering of an image of size M × N with a kernel of size m × n is given
by the expression
g(x, y) = Σ (s = −a to a) Σ (t = −b to b) w(s, t) f(x + s, y + t)
• where x and y are varied so that the center (origin) of the kernel visits every pixel in
f once.
• The center coefficient of the kernel, w(0, 0), aligns with the pixel at location (x, y).
• For a kernel of size m × n, we assume that
m = 2a + 1 and
n = 2b + 1,
where a and b are nonnegative integers.

This means that our focus is on kernels of odd size in both coordinate
directions.
SPATIAL CORRELATION AND CONVOLUTION

• Spatial correlation consists of moving the center of a kernel over an image, and
computing the sum of products at each location.
• Correlation is a function of displacement of the filter kernel relative to the image.

• The mechanics of spatial convolution are the same, except that the correlation
kernel is rotated by 180°.

• Thus, when the values of a kernel are symmetric about its center, correlation and
convolution yield the same result.
1-D illustration
• 1-D function, f, and a kernel, w.
• Kernel of size m × n, we assume that m = 2a + 1 and n = 2b + 1, where a and b are
nonnegative integers
• The kernel is of size 1 × 5, a=0, b=2
• The linear spatial filtering expression for an image of size M × N reduces, for this 1-D case, to
g(x) = Σ (s = −a to a) w(s) f(x + s) = Σ (s = −2 to 2) w(s) f(x + s)
• w is positioned so that its center coefficient is coincident with the origin of f.


Result
• For correlation of the kernel with a discrete unit impulse, the result is a copy of the kernel
rotated by 180°.
• Convolving a function with an impulse yields a copy of the function at the location of
the impulse.
2 D filtering
• For a kernel of size m × n, pad the image with a minimum of (m − 1) /2 rows of 0’s at the
top and bottom and (n − 1) /2 columns of 0’s on the left and right.
• In this case, m and n are equal to 3, so we pad f with one row of 0’s above and below and
one column of 0’s to the left and right
• the center of w visits every pixel in f, computing a sum of products at each location.
• For convolution, we pre-rotate the kernel as before and repeat the sliding sum of
products.
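A sketch of 2-D correlation with zero padding; convolution is obtained by rotating the kernel 180° first. The unit-impulse image and 3 × 3 kernel are chosen to reproduce the observation above that correlating with an impulse yields a rotated copy of the kernel:

    import numpy as np

    def correlate2d(f, w):
        """Sum of products of kernel w and image f at every pixel, with zero padding."""
        m, n = w.shape
        a, b = (m - 1) // 2, (n - 1) // 2
        fp = np.pad(f.astype(float), ((a, a), (b, b)))           # zero padding
        g = np.empty_like(f, dtype=float)
        for x in range(f.shape[0]):
            for y in range(f.shape[1]):
                g[x, y] = np.sum(w * fp[x:x + m, y:y + n])       # sum of products at (x, y)
        return g

    def convolve2d(f, w):
        return correlate2d(f, np.rot90(w, 2))                    # pre-rotate the kernel by 180 degrees

    f = np.zeros((5, 5)); f[2, 2] = 1                            # discrete unit impulse
    w = np.arange(1, 10).reshape(3, 3)
    print(correlate2d(f, w))                                     # copy of w rotated by 180 degrees
    print(convolve2d(f, w))                                      # copy of w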
linear spatial filtering
• The correlation of a kernel w of size m × n with an image f(x, y), denoted (w ☆ f)(x, y), is
(w ☆ f)(x, y) = Σ (s = −a to a) Σ (t = −b to b) w(s, t) f(x + s, y + t)
• The convolution of a kernel w of size m × n with an image f(x, y), denoted (w ★ f)(x, y), is
(w ★ f)(x, y) = Σ (s = −a to a) Σ (t = −b to b) w(s, t) f(x − s, y − t)
Some fundamental properties of convolution and correlation
(★ denotes convolution, ☆ denotes correlation; a dash means that the property does not hold)

Property        Convolution                          Correlation
Commutative     f ★ g = g ★ f                        —
Associative     f ★ (g ★ h) = (f ★ g) ★ h            —
Distributive    f ★ (g + h) = (f ★ g) + (f ★ h)      f ☆ (g + h) = (f ☆ g) + (f ☆ h)


Questions
• What are the Components of an image processing system?
• Write notes on Brightness adaptation and discrimination.
• Define Image Sampling and Quantization.
• What are the different types of Images?
• Differentiate between RGB and HSI color models.
SMOOTHING (LOWPASS) SPATIAL FILTERS
• Smoothing (also called averaging) spatial filters are used to reduce sharp transitions in
intensity.
• An application of smoothing is noise reduction.
• Smoothing prior to image resampling to reduce aliasing.
• Smoothing is used to reduce irrelevant detail in an image, where “irrelevant” refers to
pixel regions that are small with respect to the size of the filter kernel.
• Another application is for smoothing the false contours that result from using an
insufficient number of intensity levels in an image.
• Smoothing filters are used in combination with other techniques for image
enhancement, such as the histogram processing techniques, unsharp masking…
Types
1. Linear
2. Nonlinear
Linear spatial filtering
• Linear spatial filtering consists of convolving an image with a filter kernel.

• Convolving a smoothing kernel with an image blurs the image, with the degree of
blurring being determined by
1. the size of the kernel and
2. the values of its coefficients.

• Lowpass filters are based on:
1. Box kernels and
2. Gaussian kernels
BOX FILTER KERNELS
• The simplest, separable low pass filter kernel is the box kernel, whose coefficients
have the same value (typically 1).
• The name “box kernel” comes from a constant kernel resembling a box when
viewed in 3-D.
• An m × n box filter is an m × n array of 1’s, with a normalizing constant in front,
whose value is 1 divided by the sum of the values of the coefficients.
BOX FILTER KERNELS
Purpose of normalization
1. The average value of an area of constant intensity would equal that intensity in the
filtered image.
2. Normalizing the kernel in this way prevents introducing a bias during filtering;
that is, the sum of the pixels in the original and filtered images will be the same.

• In a box kernel all rows and columns are identical; the rank of such a kernel is 1, which
means it is separable.
• Box filters are suitable for quick experimentation and they often yield smoothing
results that are visually acceptable.
• They are useful also when it is desired to reduce the effect of smoothing on edges.
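A minimal sketch of building a normalized m × n box kernel and applying it with SciPy's correlation routine (the 3 × 3 size, the random input, and zero padding are illustrative choices):

    import numpy as np
    from scipy.ndimage import correlate

    m, n = 3, 3
    box = np.ones((m, n)) / (m * n)                      # normalizing constant: 1 / (sum of coefficients)
    f = np.random.randint(0, 256, size=(6, 6)).astype(float)
    g = correlate(f, box, mode="constant", cval=0.0)     # zero padding at the borders
    print(g)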
Box filtering
LOWPASS GAUSSIAN FILTER KERNELS
• The values of Gaussian kernel coefficients (and hence their effect) decrease as a
function of distance from the kernel center.

• Gaussian filters yield significantly smoother results around edge transitions.

• Use this type of filter when generally uniform smoothing is desired.


LOWPASS GAUSSIAN FILTER KERNELS
• Gaussian kernels have to be larger than box filters to achieve the same degree of
blurring.
• This is because, whereas a box kernel assigns the same weight to all pixels, the
values of Gaussian kernel coefficients (and hence their effect) decrease as a
function of distance from the kernel center.
Test pattern of size 1024 × 1024 and results of lowpass filtering with
Gaussian kernels of size 21 × 21 and 43 × 43
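A sketch of constructing a normalized Gaussian kernel from w(s, t) = exp(−(s² + t²) / (2σ²)) (the size 5 and σ = 1 are illustrative values):

    import numpy as np

    def gaussian_kernel(size=5, sigma=1.0):
        a = (size - 1) // 2
        s, t = np.mgrid[-a:a + 1, -a:a + 1]
        w = np.exp(-(s**2 + t**2) / (2 * sigma**2))      # coefficients fall off with distance from center
        return w / w.sum()                               # normalize so the coefficients sum to 1

    print(gaussian_kernel(5, 1.0).round(3))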
padding
• Zero padding an image introduces dark borders in the filtered result, with the
thickness of the borders depending on the size and type of the filter kernel used.
• Mirror (also called symmetric) padding, in which values outside the boundary of
the image are obtained by mirror-reflecting the image across its border; and
replicate padding, in which values outside the boundary are set equal to the
nearest image border value.
• The replicate padding is useful when the areas near the border of the image are
constant.
• Conversely, mirror padding is more applicable when the areas near the border
contain image details.
• In other words, these two types of padding attempt to “extend” the characteristics
of an image past its borders.
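The three padding types can be compared directly with numpy.pad (the 1-pixel pad width and the 3 × 3 array are illustrative):

    import numpy as np

    f = np.arange(1, 10).reshape(3, 3)
    print(np.pad(f, 1, mode="constant"))    # zero padding
    print(np.pad(f, 1, mode="symmetric"))   # mirror (symmetric) padding
    print(np.pad(f, 1, mode="edge"))        # replicate padding (nearest border value)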
Smoothing performance as a function of kernel and image size
• the relationship between kernel size and the size of objects in an image can lead to
ineffective performance of spatial filtering algorithms.
Using lowpass filtering and thresholding for region extraction.
• Low pass filtering combined with intensity thresholding for eliminating irrelevant
detail in this image.
• “irrelevant” refers to pixel regions that are small compared to kernel size.
Shading correction using low pass filtering
• One of the principal causes of image shading is nonuniform illumination.

• Shading correction (also called flat-field correction) is important because shading is
a common cause of erroneous measurements, degraded performance of
automated image analysis algorithms, and difficulty of image interpretation by
humans.

• Low pass filtering is a rugged, simple method for estimating shading patterns.
Shading correction
ORDER-STATISTIC (NONLINEAR) FILTERS
• Order-statistic filters are nonlinear spatial filters whose response is based on
ordering (ranking) the pixels contained in the region encompassed by the filter.
• Smoothing is achieved by replacing the value of the center pixel with the value
determined by the ranking result.
• The best-known filter in this category is the median filter, which, as its name
implies, replaces the value of the center pixel by the median of the intensity values
in the neighborhood of that pixel (the value of the center pixel is included in
computing the median).
• Median filters provide excellent noise reduction capabilities for certain types of
random noise, with considerably less blurring than linear smoothing filters of similar
size.
• Median filters are particularly effective in the presence of impulse noise (sometimes
called salt-and-pepper noise, when it manifests itself as white and black dots
superimposed on an image).
ORDER-STATISTIC (NONLINEAR) FILTERS
• The median, ζ, of a set of values is such that half the values in the set are less than or
equal to ζ and half are greater than or equal to ζ.
• In order to perform median filtering at a point in an image, first sort the values of the
pixels in the neighborhood, determine their median, and assign that value to the pixel
in the filtered image corresponding to the center of the neighborhood.
• When several values in a neighborhood are the same, all equal values are grouped.
• the principal function of median filters is to force points to be more like their
neighbors.
• The median filter is the most useful order-statistic filter in image processing.
• The median represents the 50th percentile of a ranked set of numbers.
• Using the 100th percentile results in the max filter, which is useful for finding the
brightest points in an image or for eroding dark areas adjacent to light regions.
• The 0th percentile filter is the min filter.
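A sketch of a 3 × 3 median filter using SciPy, with salt-and-pepper noise added for illustration (the constant background value and noise fractions are assumptions):

    import numpy as np
    from scipy.ndimage import median_filter

    f = np.full((8, 8), 128, dtype=np.uint8)   # constant background
    rng = np.random.default_rng(0)
    noise = rng.random(f.shape)
    f[noise < 0.05] = 0                        # pepper (black dots)
    f[noise > 0.95] = 255                      # salt (white dots)

    g = median_filter(f, size=3)               # replace each pixel by the median of its 3x3 neighborhood
    print(g)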
Median filters
SHARPENING (HIGHPASS) SPATIAL FILTERS
• Sharpening highlights transitions in intensity.
• Uses of image sharpening range from electronic printing and medical imaging to
industrial inspection and autonomous guidance in military systems.
• Sharpening is often referred to as high pass filtering.
• High frequencies (which are responsible for fine details) are passed, while low
frequencies are attenuated or rejected.
• We implement 2-D, second-order derivatives and use them for image sharpening.
• The approach consists of defining a discrete formulation of the second-order
derivative and then constructing a filter kernel based on that formulation.
sharpening
• Sharpening can be accomplished by spatial differentiation.

• The strength of the response of a derivative operator is proportional to the magnitude of
the intensity discontinuity at the point at which the operator is applied. Thus, image
differentiation enhances edges and other discontinuities (such as noise) and
de-emphasizes areas with slowly varying intensities.

• sharpening is often referred to as high pass filtering.

• high frequencies (which are responsible for fine details) are passed, while low
frequencies are attenuated or rejected.
Defining and implementing operators for sharpening by digital differentiation
Fundamental properties of first order and second order derivatives
in a digital context

• Derivatives of a digital function are defined in terms of differences.


First derivative
1. Must be zero in areas of constant intensity.
2. Must be nonzero at the onset of an intensity step or ramp.
3. Must be nonzero along intensity ramps.

definition of a Second derivative


1. Must be zero in areas of constant intensity.
2. Must be nonzero at the onset and end of an intensity step or ramp.
3. Must be zero along intensity ramps.
Definitions
• A basic definition of the first-order derivative of a one-dimensional function f(x) is
the difference
∂f/∂x = f(x + 1) − f(x)
• The second-order derivative of f(x) is defined as the difference
∂²f/∂x² = f(x + 1) + f(x − 1) − 2 f(x)
• The first-order derivative computation is a “look-ahead” operation.
Second order derivative - choice
1. Edges in digital images often are ramp-like transitions in intensity, in which case
the first derivative of the image would result in thick edges because the derivative
is nonzero along a ramp.
On the other hand, the second derivative would produce a double edge one pixel
thick, separated by zeros.
From this, we conclude that the second derivative enhances fine detail much
better than the first derivative, a property ideally suited for sharpening images.

2. Also, second derivatives require fewer operations to implement than first derivatives.
implementation of 2-D, second-order derivatives and
their use for image sharpening -Laplacian operator
• Defining a discrete formulation of the second-order derivative and then constructing a filter
kernel based on that formulation.
• The simplest isotropic derivative operator (kernel) is the Laplacian, which, for a function
(image) f(x, y) of two variables, is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²
• Laplacian is a linear operator.


• In the x-direction: ∂²f/∂x² = f(x + 1, y) + f(x − 1, y) − 2 f(x, y)
• In the y-direction: ∂²f/∂y² = f(x, y + 1) + f(x, y − 1) − 2 f(x, y)
• The discrete Laplacian of two variables is
∇²f(x, y) = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4 f(x, y)
Filtering
• The discrete Laplacian operator is
∇²f(x, y) = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4 f(x, y)
• This equation can be implemented using convolution with the kernel
0   1   0
1  −4   1
0   1   0

Laplacian
• Laplacian is a derivative operator, it highlights sharp intensity transitions in an
image and de-emphasizes regions of slowly varying intensities.
• If the definition used has a negative center coefficient, then we subtract the
Laplacian image from the original to obtain a sharpened result.
• Laplacian for sharpening:
g(x, y) = f(x, y) + c ∇²f(x, y)
where c = −1 if the Laplacian kernel has a negative center coefficient, and c = 1 otherwise.
UNSHARP MASKING AND HIGHBOOST FILTERING
1. Blur the original image.
2. Subtract the blurred image from the original (the resulting difference is called the mask.)
3. Add the mask to the original.

• Let f̄(x, y) denote the blurred image. The mask is gmask(x, y) = f(x, y) − f̄(x, y), and the
sharpened result is g(x, y) = f(x, y) + k gmask(x, y), where k is a weight, k ≥ 0.
• When k = 1, the process is unsharp masking; when k > 1, it is referred to as highboost
filtering.
unsharp masking
1) A horizontal intensity profile across a vertical ramp edge that transitions from dark to
light.
2) The blurred scan line superimposed on the original signal (shown dashed).
3) The mask, obtained by subtracting the blurred signal from the original.
4) The final sharpened result, obtained by adding the mask to the original signal.
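A sketch of the three steps using a Gaussian blur from SciPy (σ = 2 and the value of k are illustrative; k = 1 gives unsharp masking, k > 1 gives highboost filtering):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    f = np.random.randint(0, 256, size=(8, 8)).astype(float)

    blurred = gaussian_filter(f, sigma=2)     # 1. blur the original image
    mask = f - blurred                        # 2. subtract the blurred image from the original (the mask)
    k = 1.0                                   # k = 1: unsharp masking; k > 1: highboost filtering
    g = np.clip(f + k * mask, 0, 255)         # 3. add the weighted mask to the original
    print(g.astype(np.uint8))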
USING FIRST-ORDER DERIVATIVES FOR IMAGE SHARPENING—
THE GRADIENT
• First derivatives in image processing are implemented using the magnitude of the
gradient.
• The gradient of an image f at coordinates (x, y) is defined as the two-dimensional
column vector
∇f = grad(f) = [gx, gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
• The magnitude of this vector,
M(x, y) = ||∇f|| = sqrt(gx² + gy²) ≈ |gx| + |gy|
is called the gradient image (or simply the gradient).

USING FIRST-ORDER DERIVATIVES FOR IMAGE SHARPENING—
THE GRADIENT

• We define discrete approximations to the preceding equations and, from these, formulate the
appropriate kernels.
• The partial derivatives at all points in an image are obtained by convolving the image with
these kernels.
The intensities of pixels in a 3 × 3 region

• the value of the center point, z5, denotes the value of f (x, y) at an arbitrary location, (x, y)
• z1 denotes the value of f (x − 1, y − 1)
• The simplest approximations to a first-order derivative are
gx = z8 − z5   and   gy = z6 − z5
• Two other definitions, proposed by Roberts, use cross differences:
gx = z9 − z5   and   gy = z8 − z6
Kernels - Roberts cross-gradient operators
Kernels - Sobel operators
Kernels
• The coefficients in all the kernels sum to zero, so they would give a response of zero
in areas of constant intensity, as expected of a derivative operator.

• When an image is convolved with a kernel whose coefficients sum to zero, the
elements of the resulting filtered image sum to zero also, so images convolved with
the kernels will have negative values in general.

• The computations of gx and gy are linear operations and are implemented using
convolution.
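A sketch of the gradient magnitude using the Sobel kernels, computed by convolution, with |gx| + |gy| as the approximation to the magnitude (the random input and replicate border handling are assumptions):

    import numpy as np
    from scipy.ndimage import convolve

    sobel_x = np.array([[-1, -2, -1],
                        [ 0,  0,  0],
                        [ 1,  2,  1]], dtype=float)      # approximates the derivative along x (rows)
    sobel_y = sobel_x.T                                  # approximates the derivative along y (columns)

    f = np.random.randint(0, 256, size=(8, 8)).astype(float)
    gx = convolve(f, sobel_x, mode="nearest")
    gy = convolve(f, sobel_y, mode="nearest")
    M = np.abs(gx) + np.abs(gy)                          # approximate gradient magnitude M(x, y)
    print(M)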
Vectors
Matrix
Toeplitz matrix
• A Toeplitz matrix T is a matrix that has constant elements along the main diagonal
and the subdiagonals.
Circulant Matrix
Orthogonal matrix
• A real square matrix Q is orthogonal if QᵀQ = QQᵀ = I, i.e., its rows (and columns) form an
orthonormal set.
Unitary matrix
• A complex square matrix U is unitary if U*U = UU* = I, where U* denotes the conjugate
transpose of U. A real unitary matrix is an orthogonal matrix.
