Digital Image Processing (DIP)
1. Image Acquisition:
In image processing, it is defined as the action of retrieving an image from some source, usually a hardware-based source, for processing.
It is the first step in the workflow sequence because, without an image, no processing is possible. The image that is acquired is completely unprocessed. Image acquisition may also include pre-processing such as scaling.
2. Image Enhancement:
It is the process of adjusting digital images so that the results are more suitable for display or further image analysis. It usually includes sharpening of images, brightness and contrast adjustment, removal of noise, etc. In image enhancement, we generally try to modify the image so as to make it more pleasing to the eye.
3. Image Restoration:
It is the process of recovering an image that has been degraded, using some knowledge of the degradation function H and the additive noise term. Unlike image enhancement, image restoration is completely objective in nature.
4. Color Image Processing:
This part handles the image processing of colored images either as indexed images or RGB images.
5. Wavelets and multiresolution processing:
Wavelets are small waves of limited duration that are used to compute the wavelet transform, which provides time-frequency information.
Wavelets lead to multiresolution processing, in which images are represented in various degrees of resolution.
6. Compression:
Compression deals with the techniques for reducing the storage space required to save an image or the
bandwidth required to transmit it.
This is particularly useful for displaying images on the internet: if the size of the image is large, it uses more bandwidth (data) to transfer the image from the server and also increases the loading time of the website.
7. Morphological Processing:
It deals with extracting image components that are useful in representation and description of shape.
It includes basic morphological operations like erosion and dilation. As seen in the block diagram above, the outputs of morphological processing are generally image attributes.
8. Segmentation:
It is the process of partitioning a digital image into multiple segments. It is generally used to locate objects and their boundaries in images.
9. Representation and Description:
Representation deals with converting the data into a suitable form for computer processing.
Boundary representation: used when the focus is on external shape characteristics, e.g. corners.
Regional representation: used when the focus is on internal properties, e.g. texture.
Description deals with extracting attributes that result in some quantitative information of interest or that can be used for differentiating one class of objects from another.
10. Recognition:
It is the process that assigns a label (e.g. car) to an object based on its description.
Knowledge Base:
Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database.
Digital Image Processing Basics:
Digital Image Processing means processing digital images by means of a digital computer. We can also say that it is the use of computer algorithms in order to get an enhanced image or to extract some useful information.
Digital image processing is the use of algorithms and mathematical models to process and analyze digital
images. The goal of digital image processing is to enhance the quality of images, extract meaningful
information from images, and automate image-based tasks.
The basic steps involved in digital image processing are:
Image acquisition: This involves capturing an image using a digital camera or scanner, or importing an
existing image into a computer.
Image enhancement: This involves improving the visual quality of an image, such as increasing contrast,
reducing noise, and removing artifacts.
Image restoration: This involves removing degradation from an image, such as blurring, noise, and distortion.
Image segmentation: This involves dividing an image into regions or segments, each of which corresponds
to a specific object or feature in the image.
Image representation and description: This involves representing an image in a way that can be analyzed and
manipulated by a computer, and describing the features of an image in a compact and meaningful way.
Image analysis: This involves using algorithms and mathematical models to extract information from an
image, such as recognizing objects, detecting patterns, and quantifying features.
Image synthesis and compression: This involves generating new images or compressing existing images to
reduce storage and transmission requirements.
Digital image processing is widely used in a variety of applications, including medical imaging, remote
sensing, computer vision, and multimedia.
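A minimal sketch of a few of the steps listed above, using Python with OpenCV; the file names and the parameter values (kernel size, JPEG quality) are illustrative assumptions rather than fixed choices:

```python
import cv2

# Image acquisition: read an image from disk (the path is a placeholder).
img = cv2.imread("sample.jpg")                      # BGR image as a NumPy array
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # work on a single channel

# Image enhancement: stretch contrast, then reduce noise by mild blurring.
enhanced = cv2.equalizeHist(gray)
denoised = cv2.GaussianBlur(enhanced, (5, 5), 0)

# Image segmentation: separate foreground from background with Otsu's threshold.
_, segmented = cv2.threshold(denoised, 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Compression: write the result as a JPEG with a chosen quality factor.
cv2.imwrite("result.jpg", segmented, [cv2.IMWRITE_JPEG_QUALITY, 90])
```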
What is an image?
An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that point. When x, y, and the amplitude values of F are all finite, we call it a digital image.
In other words, an image can be defined by a two-dimensional array specifically arranged in rows and
columns.
A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, and pixels. Pixel is the term most widely used to denote the elements of a digital image.
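For instance, a tiny digital image can be written directly as a two-dimensional NumPy array of rows and columns, where each element is a pixel (the intensity values below are made up for illustration):

```python
import numpy as np

# A 4x4 digital image: rows and columns of intensity values (pixels).
image = np.array([
    [  0,  50, 100, 150],
    [ 25,  75, 125, 175],
    [ 50, 100, 150, 200],
    [ 75, 125, 175, 255],
], dtype=np.uint8)

print(image.shape)      # (4, 4): height x width
print(image[2, 1])      # intensity of the pixel at row 2, column 1 -> 100
```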
Types of an image
BINARY IMAGE– The binary image, as its name suggests, contains only two pixel values, i.e. 0 and 1, where 0 refers to black and 1 refers to white. This image is also known as a monochrome image.
BLACK AND WHITE IMAGE– The image which consists of only black and white color is called a BLACK AND WHITE IMAGE.
8 bit COLOR FORMAT– It is the most famous image format. It has 256 different shades of gray and is commonly known as a grayscale image. In this format, 0 stands for black, 255 stands for white, and 127 stands for gray.
16 bit COLOR FORMAT– It is a color image format. It has 65,536 different colors in it and is also known as the High Color format. In this format, the distribution of color is not the same as in a grayscale image. A 16-bit format is actually divided into three further channels, namely Red, Green, and Blue: the famous RGB format.
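A small sketch relating these image types in code; the pixel values and the threshold of 127 are arbitrary illustrative choices, and the RGB example uses the common 8-bits-per-channel layout rather than the 16-bit one described above:

```python
import numpy as np

# An 8-bit grayscale image: intensities from 0 (black) to 255 (white).
gray = np.array([[ 12, 200,  90],
                 [255,  30, 140],
                 [ 60, 180,   0]], dtype=np.uint8)

# Binary (monochrome) image: only two pixel values, 0 (black) and 1 (white).
binary = (gray > 127).astype(np.uint8)

# RGB image: one Red, Green and Blue value per pixel (here a gray-looking image).
rgb = np.stack([gray, gray, gray], axis=-1)

print(binary)
print(rgb.shape)    # (3, 3, 3): rows x columns x color channels
```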
Differences between RGB and CMYK color schemes
Both RGB and CMYK are color schemes used for mixing color in graphic design. CMYK and RGB colors are rendered differently depending on the medium they are used for, mainly electronic-based or print-based.
1. RGB Color Scheme :
RGB stands for Red Green Blue. It is the color scheme for digital images. RGB color mode is used if the
project is to be displayed on any screen. RGB color scheme is used in electronic displays such as LCD, CRT,
cameras, scanners, etc.
This color scheme is an additive mode that combines the colors red, green, and blue in various degrees to create a variety of different colors. When all three colors are combined and displayed to the fullest degree, the combination gives white; for example, the combination for white is RGB (255, 255, 255). When all three colors are combined at their lowest degree or value, the result is black; for example, the combination for black is RGB (0, 0, 0).
The RGB color scheme offers the widest range of colors and hence is preferred in much computer software.
Uses of RGB Color Scheme –
Used when project involves digital screens like computers, mobile, TV etc.
Used in web and application design.
Used in online branding.
Used in social media.
2. CMYK Color Scheme :
CMYK stands for Cyan Magenta Yellow Key (Black). It is the color scheme used for projects involving printed materials. This color mode uses the colors cyan, magenta, yellow, and black as primary colors, which are combined in different extents to get different colors.
This color scheme is a subtractive mode that combines the colors cyan, magenta, yellow, and black in various degrees to create a variety of different colors. A printing machine creates images by combining these colors with physical ink. When all colors are at 0%, white is produced, e.g. CMYK (0%, 0%, 0%, 0%) is white; when all colors are mixed at their fullest degree, we get black.
Uses of CMYK Color scheme –
Used when project involves physically printed designs etc.
Used in physical branding like business cards etc.
Used in advertising like posters, billboards, flyers etc.
Used in cloth branding like t-shirts etc.
When to use which color scheme?
If the project involves printing something, such as a business card, poster, or newsletter, use the CMYK scheme.
If the project involves something that will only be seen digitally, use RGB Scheme.
Table: RGB Color Scheme vs. CMYK Color Scheme
Use: RGB is used for digital work; CMYK is used for print work.
Primary colors: RGB uses Red, Green, Blue; CMYK uses Cyan, Magenta, Yellow, Black (Key).
Mixing: RGB is additive type mixing; CMYK is subtractive type mixing.
Vibrancy: colors of RGB images are more vibrant; CMYK colors are less vibrant.
Color range: RGB has a wider range of colors than CMYK.
File formats: RGB - JPEG, PNG, GIF, etc.; CMYK - PDF, EPS, etc.
Typical uses: RGB is used for online logos, online ads, digital graphics, photographs for websites, social media, or apps; CMYK is used for business cards, stationery, stickers, posters, brochures, etc.
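As a rough illustration of the relationship between the two schemes, the following sketch converts an RGB triple to CMYK using the simple idealized formula; real print workflows use device color profiles, so this is only an approximation:

```python
def rgb_to_cmyk(r, g, b):
    """Convert 8-bit RGB values to CMYK fractions using the naive formula."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0          # pure black
    r_, g_, b_ = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r_, g_, b_)              # Key (black) component
    c = (1.0 - r_ - k) / (1.0 - k)
    m = (1.0 - g_ - k) / (1.0 - k)
    y = (1.0 - b_ - k) / (1.0 - k)
    return c, m, y, k

print(rgb_to_cmyk(255, 255, 255))   # white -> (0.0, 0.0, 0.0, 0.0)
print(rgb_to_cmyk(0, 0, 0))         # black -> (0.0, 0.0, 0.0, 1.0)
print(rgb_to_cmyk(255, 0, 0))       # red   -> (0.0, 1.0, 1.0, 0.0)
```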
Histogram
A digital image is a two-dimensional matrix indexed by two spatial coordinates, with each cell specifying the intensity level of the image at that point. So we have an N x N matrix with integer values ranging from a minimum intensity level of 0 to a maximum level of L-1, where L denotes the number of intensity levels. Hence, the intensity level r of a pixel can take on the values 0, 1, 2, 3, ..., (L-1). Generally, L = 2^m, where m is the number of bits required to represent the intensity levels. An intensity of 0 denotes complete black or dark, whereas level L-1 indicates complete white.
Intensity Transformation:
Intensity transformation is a basic digital image processing technique in which the pixel intensity levels of an image are transformed to new values using a mathematical transformation function, so as to get a new output image. In essence, an intensity transformation simply implements the function
s = T(r)
where s is the new pixel intensity level and r is the original pixel intensity value of the given image, with r ≥ 0.
With different forms of the transformation function T(r), we get different output images.
1. Image negation: This reverses the grayscales of an image, making dark pixels whiter and white pixels darker. This is
completely analogous to the photographic negative, hence the name.
s = (L - 1) - r
2. Log Transform: Here c is some constant. It is used for expanding the dark pixel values in an image.
s = c log(1+r)
3. Power-law Transform: Here c and γ are some arbitrary constants. This transform can be used for a variety of purposes by varying the value of γ.
s = c·r^γ
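A short NumPy sketch of the three transformations above, applied to all intensity levels of an 8-bit image with L = 256; the constants c and γ below are arbitrary illustrative choices:

```python
import numpy as np

L = 256
r = np.arange(L, dtype=np.float64)          # all possible input intensities

# 1. Image negation: s = (L - 1) - r
negative = (L - 1) - r

# 2. Log transform: s = c * log(1 + r), with c scaled so the output fills [0, L-1]
c_log = (L - 1) / np.log(L)
log_tf = c_log * np.log(1 + r)

# 3. Power-law (gamma) transform: s = c * r^gamma
gamma = 0.5                                  # gamma < 1 brightens dark regions
c_pow = (L - 1) / ((L - 1) ** gamma)
power_tf = c_pow * (r ** gamma)

# Round to the nearest integer intensity level for display.
print(negative[:3], np.round(log_tf[:3]), np.round(power_tf[:3]))
```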
Histogram Equalization:
The histogram of a digital image, with intensity levels between 0 and (L-1), is a function h(rk) = nk, where rk is the kth intensity level and nk is the number of pixels in the image having that intensity level. We can also normalize the histogram by dividing it by the total number of pixels in the image. For an N x N image, we have the following definition of a normalized histogram function:
p(rk) = nk / N^2
This p(rk) function is the probability of the occurrence of a pixel with intensity level rk. Clearly,
Σk p(rk) = 1
The histogram of an image, as shown in the figure, consists of the x-axis representing the intensity levels rk and the y-axis
denoting the h(rk) or the p(rk) functions.
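For example, the histogram h(rk) and the normalized histogram p(rk) of a small 8-bit image can be computed with NumPy as follows (the pixel values are made up for illustration):

```python
import numpy as np

# A small example image (values chosen arbitrarily).
img = np.array([[0, 0, 1, 2],
                [1, 1, 3, 3],
                [2, 2, 2, 3],
                [3, 3, 3, 3]], dtype=np.uint8)

L = 256                                        # number of intensity levels
h = np.bincount(img.ravel(), minlength=L)      # h(rk) = nk
p = h / img.size                               # p(rk) = nk / N^2

print(h[:4])          # counts of levels 0..3 -> [2 3 4 7]
print(p.sum())        # normalized histogram sums to 1.0
```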
The histogram of an image gives important information about the grayscale and contrast of the image. If the entire
histogram of an image is centered towards the left end of the x-axis, then it implies a dark image. If the histogram is more
inclined towards the right end, it signifies a white or bright image. A narrow-width histogram plot at the center of the
intensity axis shows a low-contrast image, as it has a few levels of grayscale. On the other hand, an evenly distributed
histogram over the entire x-axis gives a high-contrast effect to the image.
In image processing, there frequently arises the need to improve the contrast of the image. In such cases, we use an intensity
transformation technique known as histogram equalization. Histogram equalization is the process of uniformly distributing
the image histogram over the entire intensity axis by choosing a proper intensity transformation function. Hence, histogram
equalization is an intensity transformation process.
The choice of the ideal transformation function for uniform distribution of the image histogram is mathematically explained
below.
Let us consider that the intensity level r of the image is continuous, unlike the discrete case in digital images. We limit the values that r can take between 0 and L-1, that is, 0 ≤ r ≤ L-1. Here r = 0 represents black and r = L-1 represents white. Let us consider an arbitrary transformation function:
s = T( r )
where s denotes the intensity levels of the resultant image. We have certain constraints on T(r):
(i) T(r) is single-valued and strictly monotonically increasing in the interval 0 ≤ r ≤ L-1;
(ii) 0 ≤ T(r) ≤ L-1 for 0 ≤ r ≤ L-1.
The above two conditions make T(r) a bijective function. We know that such functions are invertible, so we can get back the r values from s. That is, we can have a function such that r = T⁻¹(s).
Let us now say that the probability density function (pdf) of r is pr(x) and the cumulative distribution function (CDF) of r is Fr(x). The CDF of s will then be:
Fs(x) = P(s ≤ x) = P(T(r) ≤ x) = P(r ≤ T⁻¹(x)) = Fr(T⁻¹(x))
We put the first condition on T(r) precisely to make the above step hold true. The second condition is needed because s is the intensity value of the output image and so must lie between 0 and (L-1).
So, the pdf of s can be obtained by differentiating Fs(x) with respect to x. We get the following relation:
ps(s) = pr(r) · dr/ds
The above step used Leibnitz's integral rule. If we now choose the transformation function to be
s = T(r) = (L - 1) ∫₀ʳ pr(w) dw
then ds/dr = (L - 1) pr(r), and using the above derivative we get ps(s) = 1/(L-1), i.e. a uniform distribution over the whole intensity range.
Now, we extend the above continuous case to the discrete case. The natural replacement of the integral sign is the summation. Hence, we are left with the following histogram equalization transformation function:
sk = T(rk) = (L - 1) Σ p(rj) = ((L - 1)/N²) Σ nj,  summed over j = 0, 1, ..., k
Since s must have integer values, any non-integer value obtained from the above function is rounded off to the nearest
integer.
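A minimal NumPy sketch of the discrete histogram equalization transformation derived above; the synthetic low-contrast image is generated only for illustration:

```python
import numpy as np

def equalize_histogram(img, L=256):
    """Histogram equalization for an 8-bit grayscale image (NumPy array)."""
    hist = np.bincount(img.ravel(), minlength=L)     # n_k for each level r_k
    p = hist / img.size                              # normalized histogram p(r_k)
    cdf = np.cumsum(p)                               # running sum of p(r_j), j = 0..k
    s = np.round((L - 1) * cdf).astype(np.uint8)     # s_k = (L-1) * sum of p(r_j)
    return s[img]                                    # map every pixel through s_k

# Low-contrast example: values squeezed into a narrow range.
img = np.random.randint(100, 130, size=(64, 64), dtype=np.uint8)
out = equalize_histogram(img)
print(img.min(), img.max())    # narrow input range, e.g. 100 129
print(out.min(), out.max())    # spread over (almost) the full 0..255 range
```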
JPEG Compression Steps:
Step 1: The input image is divided into small blocks of 8x8 dimensions, each containing 64 units. Each unit of the image is called a pixel.
Step 2: JPEG uses the [Y, Cb, Cr] model instead of the [R, G, B] model. So in the 2nd step, RGB is converted to YCbCr.
Step 3: After the color conversion, the data is forwarded to the DCT. The DCT uses a cosine function and does not use complex numbers. It converts the information in a block of pixels from the spatial domain to the frequency domain.
DCT formula (for an 8x8 block):
F(u, v) = (1/4) C(u) C(v) Σx Σy f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16],
where the sums run over x, y = 0, ..., 7, and C(u) = 1/√2 for u = 0 and C(u) = 1 otherwise.
Step 4: Humans are unable to perceive those aspects of the image that lie at high frequencies, so the matrix after the DCT conversion only needs to preserve the values at the lower frequencies accurately, up to a certain point. Quantization is used to reduce the number of bits per sample.
1. Uniform Quantization
2. Non-Uniform Quantization
Step 5: A zigzag scan is used to map the 8x8 matrix to a 1x64 vector. Zigzag scanning groups the low-frequency coefficients at the top of the vector and the high-frequency coefficients at the bottom. It also groups together the large number of zeros in the quantized matrix so they can be removed efficiently.
Step 6: The next step is vectoring: differential pulse code modulation (DPCM) is applied to the DC component. DC components are large and varying, but they are usually close to the DC value of the previous block. DPCM encodes the difference between the DC component of the current block and that of the previous block.
Step 7: In this step, Run Length Encoding (RLE) is applied to the AC components. This is done because the AC components contain a lot of zeros. RLE encodes the coefficients as (skip, value) pairs, in which skip is the number of zeros preceding a non-zero component and value is the actual value of that non-zero component.
Step 8: In this step, the DC and AC components are Huffman coded.
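The following sketch walks one 8x8 block through steps 3-5 above using SciPy's DCT routines. The quantization matrix is a simplified placeholder (real JPEG encoders use the standard luminance/chrominance tables scaled by a quality factor), so this is an illustration of the idea rather than a conforming JPEG implementation:

```python
import numpy as np
from scipy.fft import dctn, idctn

# One 8x8 block of an 8-bit grayscale image (values chosen arbitrarily).
block = np.random.randint(0, 256, size=(8, 8)).astype(np.float64)

# Step 3: 2-D DCT of the level-shifted block (JPEG shifts by -128 first).
coeffs = dctn(block - 128, norm="ortho")

# Step 4: quantization with a simplified placeholder matrix.
Q = np.full((8, 8), 16.0)
quantized = np.round(coeffs / Q).astype(np.int32)

# Step 5: zigzag scan maps the 8x8 matrix to a 1x64 vector so that
# low-frequency coefficients come first and the trailing zeros group together.
order = sorted(((i, j) for i in range(8) for j in range(8)),
               key=lambda ij: (ij[0] + ij[1], ij[0] if (ij[0] + ij[1]) % 2 else ij[1]))
zigzag = np.array([quantized[i, j] for i, j in order])

# Decoding side: dequantize and apply the inverse DCT to reconstruct the block.
reconstructed = idctn(quantized * Q, norm="ortho") + 128
print(np.abs(block - reconstructed).max())   # error introduced by quantization
```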
Discrete Fourier Transform
The discrete Fourier transform (DFT) is "the Fourier transform for finite-length sequences" because, unlike the (discrete-space) Fourier transform, the DFT has a discrete argument and can be stored in a finite number of finite word-length locations. Yet, it turns out that the DFT can be used to exactly implement convolution for finite-size arrays. Our approach to the DFT will be through the discrete Fourier series (DFS), which is made possible by the isomorphism between rectangular periodic and finite-length, rectangular-support sequences.
X(k1, k2) = Σ(n1=0 to N1-1) Σ(n2=0 to N2-1) x(n1, n2) W_N1^(n1 k1) W_N2^(n2 k2),  0 ≤ k1 ≤ N1-1, 0 ≤ k2 ≤ N2-1,  where W_N = exp(-j2π/N).   (4.2.1)
[Figure 4.1: real parts of the 8 × 8 DFT basis functions. Figure 4.2: imaginary parts of the 8 × 8 DFT basis functions.]
We note that the DFT maps finite-support rectangular sequences into themselves. Pictures of the DFT basis functions of size 8 × 8 are shown with real parts in Figure (4.1) and imaginary parts in Figure (4.2). The real parts of the basis functions represent the components of x that are symmetric with respect to the 8 × 8 square, while the imaginary parts of these basis functions represent the nonsymmetric parts. In these figures, the color white is maximum positive (+1), mid-gray is 0, and black is minimum negative (-1). Each basis function occupies a small square, all of which are then arranged into an 8 × 8 mosaic. Note that the highest frequencies are in the middle at (k1, k2) = (4, 4) and correspond to the Fourier transform at (ω1, ω2) = (π, π). So the DFT is seen as a projection of the finite-support input sequence x(n1, n2) onto these basis functions. The DFT coefficients then are the representation coefficients for this basis. The inverse DFT (IDFT) exists and is given by
x(n1, n2) = (1/(N1 N2)) Σ(k1=0 to N1-1) Σ(k2=0 to N2-1) X(k1, k2) W_N1^(-n1 k1) W_N2^(-n2 k2),  0 ≤ n1 ≤ N1-1, 0 ≤ n2 ≤ N2-1.   (4.2.2)
Since we can see that the 2-D IDFT is just the inverse of each of these 1-D DFTs in the
given order, say row first and then by column, we have the desired result based on the
known validity of the 1-D DFT/IDFT transform pair, applied twice.
A second method of proof is to rely on the DFS. The key concept is that rectangular periodic and rectangular finite-support sequences are isomorphic to one another, i.e., given a periodic sequence x̃ we can define a finite-support x as x(n1, n2) = x̃(n1, n2) on the fundamental period (and zero elsewhere), and given a finite-support x, we can find the corresponding x̃ as x̃(n1, n2) = x(⟨n1⟩N1, ⟨n2⟩N2), where we use the notation ⟨n⟩N, meaning "n mod N". Still a third method is simply to insert (4.2.1) into (4.2.2) and perform the 2-D proof directly.
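As a quick numerical illustration, NumPy's FFT routines compute exactly the DFT/IDFT pair above (just via a fast algorithm), and the DFT can be used to implement circular convolution for finite-size arrays:

```python
import numpy as np

# A small finite-support 8x8 sequence (values chosen arbitrarily).
x = np.random.rand(8, 8)

# 2-D DFT: evaluated separably, row transforms followed by column transforms.
X = np.fft.fft2(x)

# The IDFT recovers the original sequence (up to floating-point error).
x_rec = np.fft.ifft2(X)
print(np.allclose(x, x_rec))          # True

# The DFT implements circular (N1 x N2 periodic) convolution exactly.
h = np.random.rand(8, 8)
circ_conv = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(h)))
print(circ_conv.shape)                # (8, 8)
```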
Spatial Filtering technique is used directly on the pixels of an image. The mask is usually taken to be odd in size so that it has a specific center pixel. This mask is moved over the image such that the center of the mask traverses all image pixels.
Smoothing Spatial Filter: A smoothing filter is used for blurring and noise reduction in an image. Blurring is a pre-processing step for the removal of small details, and noise reduction is accomplished by blurring.
1. Mean Filter:
A mean filter is a linear spatial filter whose response is simply the average of the pixels contained in the neighborhood of the filter mask. The idea is to replace the value of every pixel in an image by the average of the grey levels in the neighborhood defined by the filter mask.
Types of Mean filter:
(i) Averaging filter: It is used to reduce the detail in an image. All coefficients are equal.
(ii) Weighted averaging filter: In this, pixels are multiplied by different coefficients, and the center pixel is multiplied by a higher value than in the averaging filter.
2. Order Statistics Filter:
It is based on the ordering of the pixels contained in the image area covered by the filter. Types of order statistics filter:
(i) Minimum filter: The 0th percentile filter is the minimum filter. The value of the center is replaced by the smallest value in the window.
(ii) Maximum filter: The 100th percentile filter is the maximum filter. The value of the center is replaced by the largest value in the window.
(iii) Median filter: Each pixel in the image is considered in turn. First, the neighboring pixels are sorted, and then the original value of the pixel is replaced by the median of the sorted list.
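A short sketch of these smoothing filters using SciPy's ndimage module; the 3x3 window size and the random test image are arbitrary illustrative choices:

```python
import numpy as np
from scipy import ndimage

# A noisy grayscale test image (values chosen arbitrarily).
img = np.random.randint(0, 256, size=(64, 64)).astype(np.float64)

# Averaging (box) filter: every pixel becomes the mean of its 3x3 neighborhood.
mean_filtered = ndimage.uniform_filter(img, size=3)

# Weighted averaging: the center pixel gets a larger coefficient than its neighbors.
kernel = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]], dtype=np.float64) / 16.0
weighted = ndimage.convolve(img, kernel)

# Order-statistic filters: minimum, maximum and median of the 3x3 window.
min_filtered = ndimage.minimum_filter(img, size=3)
max_filtered = ndimage.maximum_filter(img, size=3)
median_filtered = ndimage.median_filter(img, size=3)
```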
Sharpening Spatial Filter: It is also known as a derivative filter. The purpose of the sharpening spatial filter is just the opposite of that of the smoothing spatial filter. Its main focus is on the removal of blurring and the highlighting of edges. It is based on first- and second-order derivatives.
First order derivative: