AD8703 BCV Unit II 2023
This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only to the respective group /
learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.
AD8703
BASICS OF COMPUTER VISION
UNIT II
Department: AI&DS
Date : 26.07.2023
Table of Contents

S.NO  CONTENTS
1     Contents
2     Course Objectives
3     Pre Requisites (Course Names with Code)
5     Course Outcomes
7     Lecture Plan
9     Lecture Notes
      Lecture Slides
      Lecture Videos
10    Assignments
11    Part A (Q & A)
12    Part B Qs
16    Assessment Schedule
COURSE OBJECTIVES
- To review image processing techniques for computer vision.
- To understand various features and recognition techniques.
- To learn about histogram and binary vision.
- To apply three-dimensional image analysis techniques.
- To study real-world applications of computer vision algorithms.
Prerequisite
PREREQUISITE
NIL
Syllabus
AD8703 - BASICS OF COMPUTER VISION
SYLLABUS (L T P C: 3 0 0 3)
UNIT II FEATURE EXTRACTION AND FEATURE SEGMENTATION
Feature Extraction -Edges - Canny, LOG, DOG; Line detectors (Hough Transform),
Corners -Harris and Hessian Affine, Orientation Histogram, SIFT, SURF, HOG, GLOH,
Scale-Space Analysis- Image Pyramids and Gaussian derivative filters, Gabor Filters
and DWT. Image Segmentation -Region Growing, Edge Based approaches to
segmentation, Graph-Cut, Mean-Shift, MRFs, Texture Segmentation.
UNIT V APPLICATIONS
COURSE OUTCOMES
CO1: Recognise and describe how mathematical and scientific concepts are
applied in computer vision.
CO/PO-PSO Mapping

COs  PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
CO1   3    2    2    2    2    -    -    -    2    -     -     2     2     -     -
CO2   3    3    2    2    2    -    -    -    2    -     -     2     2     -     -
CO3   2    2    1    1    1    -    -    -    1    -     -     1     2     -     -
CO4   3    3    1    1    1    -    -    -    1    -     -     1     2     -     -
CO5   3    3    1    1    1    -    -    -    1    -     -     1     3     1     -
LECTURE PLAN

S.No  Topic                                              No. of   Proposed    Pertaining  Taxonomy  Mode of
                                                         periods  date        CO          level     delivery
1     FEATURE EXTRACTION AND FEATURE SEGMENTATION        1        24.08.2023  CO2         K1        Lecture
3     Line detectors (Hough Transform)                   1        30.08.2023  CO2         K2        Lecture
4     Corners - Harris and Hessian Affine                1        30.08.2023  CO2         K2        Lecture
5     Orientation Histogram, SIFT, SURF, HOG, GLOH       1        31.08.2023  CO2         K2        Lecture
6     Scale-Space Analysis & Mean-Shift, MRFs,           1        04.09.2023  CO2         K2        Lecture
      Texture Segmentation
7     Image Pyramids and Gaussian derivative filters,    1        04.09.2023  CO2         K2        Lecture
      Gabor Filters and DWT
Lecture Notes
2. FEATURE EXTRACTION
FEATURE EXTRACTION using edge detection techniques such as Canny,
Laplacian of Gaussian (LOG), and Difference of Gaussians (DOG) can be
useful in various computer vision and image processing applications. Let's
discuss each of these methods:
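As a concrete illustration of one of these methods, a Difference of Gaussians (DoG) response can be sketched in plain NumPy. This is a toy, not a full Canny pipeline; helper names such as `blur` and `dog` are illustrative, not a standard API:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """1-D Gaussian kernel, normalized to sum to 1."""
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur: filter rows, then columns."""
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def dog(img, sigma1=1.0, sigma2=2.0):
    """Difference of Gaussians: a band-pass response that is strong near edges."""
    return blur(img, sigma1) - blur(img, sigma2)

# A vertical step edge: the DoG response changes sign across the edge
# and is essentially zero far away from it.
img = np.zeros((32, 32))
img[:, 16:] = 1.0
response = dog(img)
```

The sign change of the response across the step (negative on the dark side, positive on the bright side) is exactly the zero-crossing behaviour that LoG/DoG edge detectors exploit.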
2.2.3. VOTING:
For each edge pixel in the edge image, calculate the corresponding lines
in the Hough space by varying the parameters rho and theta. Increment
the accumulator array cells that correspond to these lines. This process is
known as voting.
2.2.4. THRESHOLDING:
After the voting process, the Hough space contains peaks that represent lines in
the input image. Apply a threshold to identify the most significant peaks, which
correspond to the most likely lines.
2.2.5. LINE EXTRACTION:
Convert the significant peaks back to the parameter space of the input image, and
extract the corresponding lines. Each peak's (rho, theta) pair defines one detected
line; the peak itself is where the sinusoids voted by collinear edge pixels
intersect in the Hough space.
2.2.6. POST-PROCESSING:
Finally, you can perform additional post-processing steps such as line merging, line
filtering, or line fitting to improve the accuracy and quality of the detected lines.
The Hough Transform is widely used for applications such as lane
detection in autonomous driving, shape detection, and feature
extraction in image analysis. It is a powerful technique for line
detection, although it may not be the most efficient method for
real-time applications due to its computational complexity.
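The voting and thresholding steps above can be sketched as a NumPy toy (illustrative names; not the optimized routine found in libraries such as OpenCV):

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Vote each edge pixel into a (rho, theta) accumulator array."""
    h, w = edges.shape
    thetas = np.deg2rad(np.arange(n_theta))      # theta = 0..179 degrees
    diag = int(np.ceil(np.hypot(h, w)))
    rhos = np.arange(-diag, diag + 1)            # rho index is offset by diag
    acc = np.zeros((len(rhos), n_theta), dtype=int)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # Each edge pixel votes for all (rho, theta) lines passing through it.
        r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[r + diag, np.arange(n_theta)] += 1
    return acc, rhos, thetas

# Synthetic edge image containing one horizontal line y = 10.
edges = np.zeros((40, 40), dtype=bool)
edges[10, 5:35] = True
acc, rhos, thetas = hough_lines(edges)

# Thresholding reduces here to taking the global accumulator peak.
r_idx, t_idx = np.unravel_index(acc.argmax(), acc.shape)
best_rho, best_theta = rhos[r_idx], np.rad2deg(thetas[t_idx])
```

For the horizontal line y = 10, the peak lands at rho = 10, theta = 90 degrees, matching the normal-form line equation rho = x cos(theta) + y sin(theta).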
2.3 CORNERS
a. Preprocessing:
Convert the image to grayscale and optionally smooth it to suppress noise.
b. Gradient computation:
Compute the image gradients Ix and Iy, typically with Sobel or Gaussian
derivative filters.
c. Structure tensor:
Construct the structure tensor for each pixel in the image. It measures
the local intensity variations in different directions.
d. Corner response:
Compute the Harris response R = det(M) - k * trace(M)^2 from the structure
tensor M; R is large only where both eigenvalues of M are large.
e. Non-maximum suppression:
Keep only pixels whose response is a local maximum within a small
neighbourhood.
f. Thresholding:
Discard local maxima whose response falls below a chosen threshold.
The Harris Corner Detector is known for its simplicity and effectiveness in
detecting corners. However, it may struggle with scale and rotation
changes in images.
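A compact NumPy sketch of the Harris response computation (k = 0.05 is a typical choice in the usual 0.04-0.06 range; function names are illustrative):

```python
import numpy as np

def harris_response(img, k=0.05, sigma=1.5):
    """Harris corner response R = det(M) - k * trace(M)^2 per pixel."""
    Iy, Ix = np.gradient(img.astype(float))

    def smooth(a):
        # Gaussian window over the structure tensor entries.
        r = int(3 * sigma)
        x = np.arange(-r, r + 1)
        g = np.exp(-x**2 / (2 * sigma**2))
        g /= g.sum()
        a = np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 0, a)
        return np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 1, a)

    Sxx, Syy, Sxy = smooth(Ix * Ix), smooth(Iy * Iy), smooth(Ix * Iy)
    det = Sxx * Syy - Sxy**2
    trace = Sxx + Syy
    return det - k * trace**2

# A white square on black: the response peaks at the square's four corners,
# not along its edges (where only one eigenvalue is large).
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
R = harris_response(img)
peak_y, peak_x = np.unravel_index(R.argmax(), R.shape)
```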
2.4 HESSIAN AFFINE DETECTOR
c. Affine adaptation:
Iteratively estimate an elliptical (affine) region for each interest point
from its second-moment matrix, so the detector adapts to local anisotropic
structure.
e. Orientation assignment:
Assign a dominant gradient orientation to each normalized region so that
the resulting descriptor is rotation invariant.
Both the Harris Corner Detector and the Hessian Affine Detector are widely used
corner detection algorithms, each with its strengths and limitations. The choice of
algorithm depends on the specific requirements of the application at hand.
2.5. ORIENTATION HISTOGRAM:
An orientation histogram is a representation of the distribution of gradient
orientations in an image or a local image patch. It is commonly used in feature
extraction algorithms to capture the dominant orientations of image features. The
orientation histogram is constructed by dividing the range of gradient orientations
into multiple bins and accumulating the gradient magnitudes into these bins based
on their respective orientations. This histogram provides information about the
spatial distribution and strength of different orientations within an image or a
feature region.
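The construction described above, binning gradient orientations weighted by gradient magnitude, can be sketched directly with NumPy (the function name and the 8-bin choice are illustrative):

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Histogram of gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                     # orientation in [-pi, pi)
    hist, _ = np.histogram(ang, bins=n_bins,
                           range=(-np.pi, np.pi), weights=mag)
    return hist

# A vertical step edge: every nonzero gradient points along +x (angle 0),
# so all the magnitude mass falls into a single bin.
patch = np.zeros((16, 16))
patch[:, 8:] = 1.0
hist = orientation_histogram(patch)
dominant = hist.argmax()
```

With 8 bins over [-pi, pi), angle 0 falls in bin index 4, which here collects the entire gradient magnitude, exactly the "dominant orientation" signal that SIFT and HOG build on.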
2.5.1 SIFT (Scale-Invariant Feature Transform):
It is widely used for tasks such as object recognition, image stitching, and image
retrieval. SIFT features are distinctive and invariant to changes in scale, rotation,
and affine transformations. The main steps of the SIFT algorithm include:
1. Scale-space extrema detection:
2. Keypoint localization:
3. Orientation assignment:
2.7.2. Block division: Divide the image or the local region into regular
blocks.
2.7.4. Normalization:
Normalize the concatenated cell histograms within each block (commonly
using the L2 norm) to reduce sensitivity to illumination and contrast
changes.
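The cell/block pipeline can be sketched as follows, using the classic HOG-style settings (8x8-pixel cells, 9 orientation bins over 0-180 degrees, 2x2-cell blocks with L2 normalization); the function name is illustrative:

```python
import numpy as np

def hog_blocks(img, cell=8, block=2, n_bins=9):
    """Cell-wise orientation histograms, then L2-normalized 2x2 blocks."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180       # unsigned orientation
    h, w = img.shape
    cy, cx = h // cell, w // cell
    cells = np.zeros((cy, cx, n_bins))
    for i in range(cy):                               # cell histograms
        for j in range(cx):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            a = ang[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            cells[i, j], _ = np.histogram(a, bins=n_bins,
                                          range=(0, 180), weights=m)
    feats = []
    for i in range(cy - block + 1):                   # overlapping blocks
        for j in range(cx - block + 1):
            v = cells[i:i+block, j:j+block].ravel()
            feats.append(v / (np.linalg.norm(v) + 1e-6))  # L2 normalization
    return np.concatenate(feats)

# 32x32 image -> 4x4 cells -> 3x3 blocks of 2x2x9 = 36 values = 324 features.
img = np.random.default_rng(0).random((32, 32))
f = hog_blocks(img)
```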
Image pyramids are useful for various tasks, such as image blending,
image resizing, feature extraction, and object detection. By working with
images at different scales, it becomes possible to handle objects of
different sizes and capture both fine details and global context.
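A Gaussian pyramid of the kind described above can be sketched by repeatedly blurring and dropping every other row and column (function names are illustrative):

```python
import numpy as np

def downsample(img, sigma=1.0):
    """One pyramid level: Gaussian blur, then drop every other row/column."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    img = np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 0, img)
    img = np.apply_along_axis(lambda v: np.convolve(v, g, mode="same"), 1, img)
    return img[::2, ::2]          # subsample by a factor of 2 in each axis

def gaussian_pyramid(img, levels=4):
    """List of progressively smaller, progressively smoother images."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

pyr = gaussian_pyramid(np.ones((64, 64)), levels=4)
shapes = [p.shape for p in pyr]   # 64x64, 32x32, 16x16, 8x8
```

The pre-blur matters: subsampling without it would alias high-frequency content into the coarser levels.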
The Gaussian derivative filters are obtained by taking the derivative of the Gaussian
function with respect to the image coordinates. The filters are commonly used to
compute the first-order derivatives (gradients) of an image in the horizontal and
vertical directions. By convolving an image with these filters, you can estimate the
image gradients, which provide information about the intensity changes and edges in
the image.
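In one dimension, the derivative-of-Gaussian kernel and its edge-locating behaviour can be sketched as (illustrative names; sigma = 1 assumed):

```python
import numpy as np

def gaussian_derivative_kernel(sigma=1.0):
    """First derivative of a Gaussian: g'(x) = -x / sigma^2 * g(x)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    return -x / sigma**2 * g

# Convolving a step edge with g' is equivalent to differentiating the
# smoothed signal, so the gradient peaks at the edge location.
signal = np.zeros(32)
signal[16:] = 1.0
k = gaussian_derivative_kernel()
grad = np.convolve(signal, k, mode="same")
peak = np.abs(grad).argmax()
```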
The Gaussian derivative filters are often used in various computer vision tasks, such as
edge detection, feature extraction, and image enhancement. The magnitude and
orientation of the gradients obtained from these filters can be used to detect edges,
corners, and other image features.
In summary, image pyramids and Gaussian derivative filters are important tools in
computer vision and image processing. Image pyramids provide a multi-scale
representation of an image, while Gaussian derivative filters are used for edge
detection and gradient computation. Both techniques have applications in various
image analysis and computer vision tasks.
2.8.3 Gabor filters and the discrete wavelet transform (DWT)
Gabor filters and the discrete wavelet transform (DWT) are two
important techniques used in signal and image processing. Let's discuss
each of them in more detail:
Gabor filters are linear filters used for analyzing and processing signals,
particularly in the field of image analysis. They are named after Dennis
Gabor, who introduced them in the 1940s. Gabor filters are widely used in
various computer vision tasks, including texture analysis, feature
extraction, and object recognition.
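A real-valued Gabor kernel is just a Gaussian envelope multiplied by a cosine carrier. The sketch below builds one directly; the parameter values (sigma, wavelength, aspect ratio, phase) are illustrative defaults, not canonical:

```python
import numpy as np

def gabor_kernel(sigma=3.0, theta=0.0, lambd=8.0, gamma=0.5, psi=0.0, r=9):
    """Real Gabor filter: Gaussian envelope times a cosine carrier.

    theta  - orientation of the carrier
    lambd  - wavelength of the sinusoid
    gamma  - spatial aspect ratio of the envelope
    psi    - phase offset
    """
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return env * np.cos(2 * np.pi * xr / lambd + psi)

k = gabor_kernel()   # 19x19 kernel, responds to horizontal frequency lambd
```

Banks of such kernels at several orientations and wavelengths are what texture-analysis pipelines convolve the image with.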
The DWT operates by passing the signal or image through a series of low-
pass and high-pass filters, followed by down sampling. The low-pass filters
capture the coarse-scale information, while the high-pass filters capture
the fine-scale details. This process is recursively applied to the resulting
approximation coefficients, creating a hierarchical structure of wavelet
coefficients at different scales.
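The analysis/synthesis steps above can be sketched for one level of the simplest wavelet, the Haar wavelet, where the low-pass and high-pass filters reduce to pairwise averages and differences (function names are illustrative):

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar DWT: pairwise averages (low-pass) and
    differences (high-pass), each implicitly downsampled by 2."""
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2)    # coarse-scale information
    detail = (s[0::2] - s[1::2]) / np.sqrt(2)    # fine-scale details
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse transform: perfect reconstruction from both bands."""
    s = np.empty(2 * len(approx))
    s[0::2] = (approx + detail) / np.sqrt(2)
    s[1::2] = (approx - detail) / np.sqrt(2)
    return s

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0])
a, d = haar_dwt(x)        # 8 samples -> 4 approximation + 4 detail coeffs
```

Recursing on the approximation band yields the multi-level hierarchy described above; setting small detail coefficients to zero before the inverse transform is the basis of wavelet compression and denoising.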
The DWT has several advantages, including its ability to represent both
localized features and global properties of the data. It allows for efficient
compression by achieving high data compression ratios while preserving
important features. The DWT has found applications in various fields, such
as image and video compression, signal denoising, feature extraction, and
data analysis.
In summary, Gabor filters and the discrete wavelet transform (DWT) are
powerful techniques used in signal and image processing. Gabor filters are
effective for texture analysis and feature extraction, while the DWT
provides a multi-resolution analysis of signals and images, enabling tasks
like compression, denoising, and feature extraction.
2.9 Region growing:
- Initialize the seed points and create an empty region for each seed point.
- For each region, examine the neighbouring pixels of its current members and
add those that satisfy a similarity criterion (for example, an intensity
difference below a threshold).
- Repeat the process until no more pixels can be added to any region.
Region growing can produce accurate segmentations when the regions
have distinct and homogeneous characteristics. However, it can be
sensitive to the initial seed points and may result in over-segmentation or
under-segmentation if the similarity criteria are not properly defined.
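The steps above amount to a flood fill guarded by a similarity test. A minimal NumPy sketch, using 4-connectivity and an intensity-difference criterion (names and the tolerance value are illustrative):

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol=0.1):
    """Grow a region from `seed`, adding 4-connected neighbours whose
    intensity lies within `tol` of the seed intensity."""
    h, w = img.shape
    seen = np.zeros((h, w), dtype=bool)
    region = np.zeros((h, w), dtype=bool)
    q = deque([seed])
    seen[seed] = True
    seed_val = img[seed]
    while q:
        y, x = q.popleft()
        if abs(img[y, x] - seed_val) > tol:
            continue                      # similarity criterion failed
        region[y, x] = True
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not seen[ny, nx]:
                seen[ny, nx] = True
                q.append((ny, nx))
    return region

# A bright 10x10 square on a dark background: growing from inside the
# square recovers exactly its 100 pixels.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
mask = region_grow(img, seed=(10, 10))
```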
2.10 Edge-based segmentation:
- Detect edges in the image using edge detection techniques such as the
Canny edge detector or the Sobel operator.
- Enhance and refine the detected edges using techniques like edge
thinning, edge linking, or edge-preserving smoothing.
2.11 Mean-shift:
Mean-shift is a non-parametric clustering technique: each pixel's feature
vector (for example, colour and position) is iteratively shifted toward the
mean of its neighbours within a bandwidth window, so that it converges to a
local density mode. Pixels that converge to the same mode form one segment.
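The core mode-seeking step of mean-shift can be sketched in one dimension with a flat (uniform) kernel; the function name, bandwidth, and test data are illustrative:

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=1.0, n_iter=50):
    """Iteratively move `start` to the mean of the points within
    `bandwidth`, converging to a local density mode."""
    x = float(start)
    for _ in range(n_iter):
        near = points[np.abs(points - x) <= bandwidth]
        x_new = near.mean()
        if abs(x_new - x) < 1e-9:
            break                      # converged to a mode
        x = x_new
    return x

# Two well-separated 1-D clusters around 2.0 and 8.0: a start near either
# cluster converges to that cluster's mode, never crossing the gap.
pts = np.concatenate([np.full(5, 2.0) + [-0.2, -0.1, 0.0, 0.1, 0.2],
                      np.full(5, 8.0) + [-0.2, -0.1, 0.0, 0.1, 0.2]])
mode = mean_shift_mode(pts, start=2.5)
```

In image segmentation the same update runs in a joint colour-position feature space, with one start point per pixel.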
2.12 Markov Random Fields (MRFs):
Markov Random Fields are probabilistic models used for image segmentation
and other computer vision tasks. MRFs model the relationship between pixels
as a graph, where each pixel is a node, and edges represent the dependencies
between neighboring pixels. MRFs define an energy function that quantifies
the compatibility between neighboring pixels and assigns higher energy to less
likely configurations. The segmentation problem is then formulated as finding
the configuration of labels (segments) that minimizes the energy function.
MRFs provide a framework for incorporating both local and global information
into the segmentation process.
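A toy illustration of minimizing such an energy is Iterated Conditional Modes (ICM), a simple greedy optimizer (real systems typically use graph cuts or belief propagation). Here the energy is a squared data term plus a Potts smoothness term over 4-neighbours; names and parameter values are illustrative:

```python
import numpy as np

def icm_denoise(noisy, beta=1.0, n_iter=5):
    """ICM on a binary MRF: each pixel greedily picks the label that
    minimizes data cost + Potts smoothness cost over its 4-neighbours."""
    labels = noisy.copy()
    h, w = noisy.shape
    for _ in range(n_iter):
        for y in range(h):
            for x in range(w):
                best, best_e = labels[y, x], np.inf
                for lab in (0, 1):
                    e = (lab - noisy[y, x]) ** 2              # data term
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w:
                            e += beta * (lab != labels[ny, nx])  # smoothness
                    if e < best_e:
                        best, best_e = lab, e
                labels[y, x] = best
    return labels

# A binary square with two salt-and-pepper flips: the smoothness term
# makes both isolated flips revert, recovering the clean labelling.
clean = np.zeros((16, 16), dtype=int)
clean[4:12, 4:12] = 1
noisy = clean.copy()
noisy[0, 0] = 1     # isolated "salt" pixel
noisy[8, 8] = 0     # isolated "pepper" pixel
denoised = icm_denoise(noisy)
```

The same energy structure, with more labels and a learned data term, underlies MRF-based segmentation.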
Lecture Slides
Lecture Videos
Assignments
1. Gaussian filtering, gradient magnitude, Canny edge detection.
2. Detecting interest points; simple matching of features.
3. Stereo correspondence analysis.
4. Photometric stereo.

Each assignment contains:
1. A paper discussing the theory, task, methods, and results.
2. An src folder containing:
   - the code
   - a README file with instructions on how to run the code
Part A Q & A
PART-A
10. What does HOG stand for and how is it used in computer vision?
HOG stands for Histogram of Oriented Gradients. It is used in computer
vision for object detection by describing the distribution of gradient
orientations in an image.
Part B Qs
PART-B
1. List the types of noise. Consider an image corrupted by Gaussian
noise; suggest a suitable method to minimize the Gaussian noise and
explain that method. (K3) (CO2)
2. List the types of edge detection and explain how the Sobel operator
detects edges. (K3) (CO2)
S.No  Course title                                        Course provider  Link
1     Computer vision applies machine learning.           Udemy            https://www.udemy.com/topic/computer-vision/
                                                                           https://www.udacity.co
3     Advanced Computer Vision with TensorFlow            Coursera         https://www.coursera.org/learn/advanced-computer-vision-with-tensorflow
4     Computer Vision and Image Processing Fundamentals   edX              https://www.edx.org/learn/computer-programming/ibm-computer-vision-and-image-processing-fundamentals?webview=false&campaign=Computer+Vision+and+Image+Processing+Fundamentals&source=edx&product_category=course&placement_url=https%3A%2F%2Fwww.edx.org%2Flearn%2Fcomputer-vision
Real life Applications in day to day life and to Industry
REAL LIFE APPLICATIONS IN DAY TO DAY LIFE AND TO INDUSTRY
1. Explain the role of computer vision applications in the most prominent
industries, including agriculture, healthcare, transportation,
manufacturing, and retail. (K4, CO2)
Content beyond Syllabus
https://www.youtube.com/watch?v=2w8XIskzdFw
Assessment Schedule
ASSESSMENT SCHEDULE
FIAT
Proposed date :04.09.2023
Prescribed Text books &
Reference books
PRESCRIBED TEXT BOOKS AND REFERENCE BOOKS
TEXT BOOKS
1. D. A. Forsyth and J. Ponce, “Computer Vision: A Modern Approach”,
Pearson Education, 2003.
2. Richard Szeliski, “Computer Vision: Algorithms and Applications”,
Springer-Verlag London Limited, 2011.
REFERENCE BOOKS
1. B. K. P. Horn, “Robot Vision”, McGraw-Hill.
2. Simon J. D. Prince, “Computer Vision: Models, Learning, and Inference”,
Cambridge University Press, 2012.
3. Mark Nixon and Alberto S. Aguado, “Feature Extraction & Image Processing
for Computer Vision”, Third Edition, Academic Press, 2012.
4. E. R. Davies, “Computer & Machine Vision”, Fourth Edition, Academic
Press, 2012.
5. Reinhard Klette, “Concise Computer Vision: An Introduction into Theory
and Algorithms”, Springer, 2014.
Mini Project Suggestions
MINI PROJECT SUGGESTIONS