Matching Features Matching With Invariant Features: Computational Photography, 6.882 Prof. Bill Freeman

Download as pdf or txt
Download as pdf or txt
You are on page 1of 19

Matching features

Computational Photography, 6.882


Matching with Invariant Features
Prof. Bill Freeman
April 11, 2006

Image and shape descriptors: Harris corner detectors and


SIFT features. Darya Frolova, Denis Simakov
The Weizmann Institute of Science
Suggested readings: Mikolajczyk and Schmid, David Lowe
IJCV. March 2004
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/InvariantFeatures.ppt

Building a Panorama How do we build panorama?


• We need to match (align) images

M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 2003

Matching with Features Matching with Features


•Detect feature points in both images •Detect feature points in both images
•Find corresponding pairs

1
Matching with Features
•Detect feature points in both images
Matching with Features
•Find corresponding pairs • Problem 1:
•Use these pairs to align images – Detect the same point independently in both
images

no chance to match!

We need a repeatable detector

Matching with Features More motivation…


• Problem 2: • Feature points are used also for:
– For each point correctly recognize the – Image alignment (homography, fundamental matrix)
corresponding one – 3D reconstruction
– Motion tracking
? – Object recognition
– Indexing and database retrieval
– Robot navigation
– … other
We need a reliable and distinctive descriptor

Contents
Selecting Good Features
• Harris Corner Detector
• What’s a “good feature”? – Description
– Satisfies brightness constancy – Analysis
– Has sufficient texture variation
• Detectors
– Does not have too much texture variation
– Rotation invariant
– Corresponds to a “real” surface patch
– Scale invariant
– Does not deform too much over time
– Affine invariant
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant

2
The Basic Idea
An introductory example: • We should easily recognize the point by looking
through a small window
Harris corner detector • Shifting a window in any direction should give
a large change in intensity

C.Harris, M.Stephens. “A Combined Corner and Edge Detector”. 1988

Contents
Harris Detector: Basic Idea
• Harris Corner Detector
– Description
– Analysis
• Detectors
– Rotation invariant
– Scale invariant
– Affine invariant
“flat” region: “edge”: “corner”: • Descriptors
no change in no change along significant change – Rotation invariant
all directions the edge direction in all directions – Scale invariant
– Affine invariant

Go through 2nd order Taylor series


Harris Detector: Mathematics
expansion on board
Window-averaged change of intensity for the shift [u,v]:

E (u, v) = ∑ w( x, y ) [ I ( x + u , y + v) − I ( x, y ) ]
2

x, y

Window Shifted Intensity


function intensity

Window function w(x,y) = or

1 in window, 0 outside Gaussian

3
Harris Detector: Mathematics Harris Detector: Mathematics
Expanding E(u,v) in a 2nd order Taylor series expansion, we
Intensity change in shifting window: eigenvalue analysis
have,for small shifts [u,v], a bilinear approximation:
⎡u ⎤
⎡u ⎤ E (u, v) ≅ [u , v ] M ⎢ ⎥ λ1, λ2 – eigenvalues of M
E (u, v) ≅ [u , v ] M ⎢ ⎥ ⎣v ⎦
⎣v ⎦
direction of the
fastest change
where M is a 2×2 matrix computed from image derivatives: Ellipse E(u,v) = const direction of the
slowest change
⎡ I 2
IxI y ⎤
M = ∑ w( x, y ) ⎢ x

(λmax)-1/2
x, y ⎣⎢ I x I y I y2 ⎦⎥ (λmin)-1/2

Selecting Good Features Selecting Good Features

λ1 and λ2 are large large λ1, small λ2

Selecting Good Features


Harris Detector: Mathematics
Classification of λ2 “Edge”
image points using λ2 >> λ1 “Corner”
eigenvalues of M: λ1 and λ2 are large,
λ1 ~ λ2;
E increases in all
directions

λ1 and λ2 are small;


E is almost constant “Flat” “Edge”
in all directions region λ1 >> λ2
small λ1, small λ2 λ1

4
Harris Detector: Mathematics Harris Detector: Mathematics
Measure of corner response: λ2 “Edge” “Corner”
• R depends only on R<0
R = det M − k ( trace M )
2
eigenvalues of M
• R is large for a corner R>0

det M = λ1λ2 • R is negative with large


magnitude for an edge
trace M = λ1 + λ2 • |R| is small for a flat
region “Flat” “Edge”
(k – empirical constant, k = 0.04-0.06) |R| small R<0
λ1

Harris Detector: Workflow


Harris Detector
• The Algorithm:
– Find points with large corner response function
R (R > threshold)
– Take the points of local maxima of R

Harris Detector: Workflow Harris Detector: Workflow


Compute corner response R Find points with large corner response: R>threshold

5
Harris Detector: Workflow Harris Detector: Workflow
Take only the points of local maxima of R

Contents
Harris Detector: Summary • Harris Corner Detector
– Description
• Average intensity change in direction [u,v] can be – Analysis
expressed as a bilinear form: • Detectors
⎡u ⎤
E (u , v) ≅ [u , v ] M ⎢ ⎥
– Rotation invariant
⎣v ⎦ – Scale invariant
– Affine invariant
• Describe a point in terms of eigenvalues of M:
measure of corner response • Descriptors
– Rotation invariant
R = λ1λ2 − k ( λ1 + λ2 )
2

– Scale invariant
• A good (corner) point should have a large intensity change – Affine invariant
in all directions, i.e. R should be large positive

Harris Detector: Some Properties Harris Detector: Some Properties


• Rotation invariance? • Rotation invariance

Ellipse rotates but its shape (i.e. eigenvalues)


remains the same

Corner response R is invariant to image rotation

6
Harris Detector: Some Properties Harris Detector: Some Properties
• Partial invariance to additive and multiplicative
• Invariance to image intensity change? intensity changes
9 Only derivatives are used => invariance
to intensity shift I → I + b

9 Intensity scale: I → a I

R R
threshold

x (image coordinate) x (image coordinate)

Harris Detector: Some Properties Harris Detector: Some Properties


• Invariant to image scale? • Not invariant to image scale!

All points will be Corner !


classified as edges

Evaluation plots are from this paper


Harris Detector: Some Properties
• Quality of Harris detector for different scale
changes

Repeatability rate:
# correspondences
# possible correspondences

C.Schmid et.al. “Evaluation of Interest Point Detectors”. IJCV 2000

7
Contents
• Harris Corner Detector
– Description
– Analysis
• Detectors
We want to:
– Rotation invariant
detect the same interest points
– Scale invariant
– Affine invariant
regardless of image changes
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant

Contents
Models of Image Change • Harris Corner Detector
– Description
• Geometry
– Analysis
– Rotation • Detectors
– Similarity (rotation + uniform scale) – Rotation invariant
– Scale invariant
– Affine (scale dependent on direction) – Affine invariant
valid for: orthographic camera, locally planar • Descriptors
object
– Rotation invariant
• Photometry – Scale invariant
– Affine intensity change (I → a I + b) – Affine invariant

Contents
Rotation Invariant Detection • Harris Corner Detector
– Description
• Harris Corner Detector – Analysis
• Detectors
– Rotation invariant
– Scale invariant
– Affine invariant
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant

C.Schmid et.al. “Evaluation of Interest Point Detectors”. IJCV 2000

8
Scale Invariant Detection Scale Invariant Detection
• Consider regions (e.g. circles) of different sizes • The problem: how do we choose corresponding
around a point circles independently in each image?
• Regions of corresponding sizes will look the same
in both images

Scale Invariant Detection Scale Invariant Detection


• Solution: • Common approach:
– Design a function on the region (circle), which is “scale
Take a local maximum of this function
invariant” (the same for corresponding regions, even if
they are at different scales) Observation: region size, for which the maximum is
Example: average intensity. For corresponding regions achieved, should be invariant to image scale.
(even of different sizes) it will be the same.
– For a point in one image, we can consider it as a Important: this scale invariant region size is
function of region size (circle radius) found in each image independently!

f Image 1 f Image 2 f Image 1 f Image 2


scale = 1/2 scale = 1/2

region size region size s1 region size s2 region size

Scale Invariant Detection Scale Invariant Detection


• A “good” function for scale detection: • Functions for determining scale f = Kernel ∗ Image
has one stable sharp peak Kernels:

f f f Good ! L = σ 2 ( Gxx ( x, y, σ ) + G yy ( x, y, σ ) )
bad bad (Laplacian)
region size region size region size
DoG = G ( x, y, kσ ) − G ( x, y, σ )
(Difference of Gaussians)
• For usual images: a good function would be a one
which responds to contrast (sharp local intensity where Gaussian
change) x2 + y2
− Note: both kernels are invariant to
G ( x, y , σ ) = 1
2πσ
e 2σ 2
scale and rotation

9
Scale Invariant Detection Scale Invariant Detectors

← Laplacian →
scale
• Compare to human vision: eye’s response • Harris-Laplacian1
Find local maximum of:
– Harris corner detector in
y
space (image coordinates)
– Laplacian in scale
← Harris → x

• SIFT (Lowe)2 scale

← DoG →
Find local maximum of:
– Difference of Gaussians in
space and scale y

← DoG → x

1 K.Mikolajczyk, C.Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001
2 D.Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. Accepted to IJCV 2004
Shimon Ullman, Introduction to Computer and Human Vision Course, Fall 2003

Scale Invariant Detectors Scale Invariant Detection:


• Experimental evaluation of detectors Summary
w.r.t. scale change • Given: two images of the same scene with a large
scale difference between them
Repeatability rate:
• Goal: find the same interest points independently
# correspondences
in each image
# possible correspondences • Solution: search for maxima of suitable functions
in scale and in space (over the image)

Methods:
1. Harris-Laplacian [Mikolajczyk, Schmid]: maximize Laplacian over
scale, Harris’ measure of corner response over the image
2. SIFT [Lowe]: maximize Difference of Gaussians over scale and space
K.Mikolajczyk, C.Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001

Contents Affine Invariant Detection


• Harris Corner Detector
– Description • Above we considered:
– Analysis Similarity transform (rotation + uniform scale)
• Detectors
– Rotation invariant
– Scale invariant
• Now we go on to:
– Affine invariant
Affine transform (rotation + non-uniform scale)
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant

10
Affine Invariant Detection Affine Invariant Detection
• Take a local intensity extremum as initial point • The regions found may not exactly correspond, so we
• Go along every ray starting from this point and stop when approximate them with ellipses
extremum of function f is reached • Geometric Moments:
f I (t ) − I 0
f (t ) =
m pq = ∫x Fact: moments mpq uniquely
p
t
y q f ( x, y )dxdy
points along the ray
1
t ∫ I (t ) − I
o
0 dt
2 determine the function f
• We will obtain approximately
Taking f to be the characteristic function of a region
corresponding regions (1 inside, 0 outside), moments of orders up to 2 allow
to approximate the region by an ellipse
Remark: we search for scale
in every direction
This ellipse will have the same moments of
orders up to 2 as the original region
T.Tuytelaars, L.V.Gool. “Wide Baseline Stereo Matching Based on Local,
Affinely Invariant Regions”. BMVC 2000.

Affine Invariant Detection Affine Invariant Detection


• Covariance matrix of region points defines an ellipse:
• Algorithm summary (detection of affine invariant region):
q = Ap – Start from a local intensity extremum point
– Go in every direction until the point of extremum of some
function f
– Curve connecting the points is the region boundary
pT Σ1−1 p = 1 qT Σ 2−1q = 1
– Compute geometric moments of orders up to 2 for this
Σ1 = ppT Σ 2 = qqT region
region 1 region 2
– Replace the region with ellipse
( p = [x, y]T is relative
to the center of mass) Σ 2 = AΣ1 AT

Ellipses, computed for corresponding


regions, also correspond! T.Tuytelaars, L.V.Gool. “Wide Baseline Stereo Matching Based on Local,
Affinely Invariant Regions”. BMVC 2000.

Affine Invariant Detection Affine Invariant Detection :


• Maximally Stable Extremal Regions Summary
– Threshold image intensities: I > I0 • Under affine transformation, we do not know in advance
– Extract connected components shapes of the corresponding regions
(“Extremal Regions”) • Ellipse given by geometric covariance matrix of a region
– Find a threshold when an extremal robustly approximates this region
region is “Maximally Stable”, • For corresponding regions ellipses also correspond
i.e. local minimum of the relative
growth of its square
– Approximate a region with Methods:
an ellipse
1. Search for extremum along rays [Tuytelaars, Van Gool]:
2. Maximally Stable Extremal Regions [Matas et.al.]

J.Matas et.al. “Distinguished Regions for Wide-baseline Stereo”. Research Report of CMP, 2001.

11
Contents Point Descriptors
• Harris Corner Detector • We know how to detect points
– Description • Next question:
– Analysis How to match them?
• Detectors
– Rotation invariant
– Scale invariant
– Affine invariant
?
• Descriptors
– Rotation invariant
– Scale invariant
Point descriptor should be:
– Affine invariant
1. Invariant
2. Distinctive

Contents Descriptors Invariant to Rotation


• Harris Corner Detector • Harris corner response measure:
– Description depends only on the eigenvalues of the matrix M
– Analysis
⎡ I2 IxI y ⎤
M = ∑ w( x, y ) ⎢ x
• Detectors
– Rotation invariant ⎥
x, y ⎢⎣ I x I y I y2 ⎥⎦
– Scale invariant
– Affine invariant
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant

C.Harris, M.Stephens. “A Combined Corner and Edge Detector”. 1988

Descriptors Invariant to Rotation Descriptors Invariant to Rotation


• Image moments in polar coordinates • Find local orientation
mkl = ∫∫ r e k − iθ l
I (r ,θ )drdθ Dominant direction of gradient

Rotation in polar coordinates is translation of the angle:


θ→θ+θ0
This transformation changes only the phase of the moments, but
not its magnitude • Compute image derivatives relative to this
orientation
Rotation invariant descriptor consists
mkl
of magnitudes of moments:

Matching is done by comparing vectors [|mkl|]k,l


1 K.Mikolajczyk, C.Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001
J.Matas et.al. “Rotational Invariants for Wide-baseline Stereo”. Research Report of CMP, 2003 2 D.Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. Accepted to IJCV 2004

12
Contents Descriptors Invariant to Scale
• Harris Corner Detector • Use the scale determined by detector to
– Description compute descriptor in a normalized frame
– Analysis
• Detectors For example:
– Rotation invariant • moments integrated over an adapted window
– Scale invariant • derivatives adapted to scale: sIx
– Affine invariant
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant

Contents Affine Invariant Descriptors


• Harris Corner Detector • Affine invariant color moments
– Description
– Analysis pq =
m abc ∫
region
x p y q R a ( x, y )G b ( x, y ) B c ( x, y )dxdy
• Detectors
– Rotation invariant
– Scale invariant Different combinations of these moments
– Affine invariant are fully affine invariant
• Descriptors
– Rotation invariant Also invariant to affine transformation of
– Scale invariant intensity I → a I + b
– Affine invariant

F.Mindru et.al. “Recognizing Color Patterns Irrespective of Viewpoint and Illumination”. CVPR99

Affine Invariant Descriptors SIFT – Scale Invariant Feature Transform1


• Find affine normalized frame • Empirically found2 to show very good performance,
invariant to image rotation, scale, intensity change, and to
A
moderate affine transformations
Σ1 = ppT Σ 2 = qqT

Scale = 2.5
Σ1−1 = A1T A1 A1 A2 Σ 2−1 = A2T A2 Rotation = 450

rotation

• Compute rotational invariant descriptor in this


normalized frame 1 D.Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. Accepted to IJCV 2004
J.Matas et.al. “Rotational Invariants for Wide-baseline Stereo”. Research Report of CMP, 2003 2 K.Mikolajczyk, C.Schmid. “A Performance Evaluation of Local Descriptors”. CVPR 2003

13
CVPR 2003 Tutorial Invariant Local Features
• Image content is transformed into local feature
Recognition and Matching coordinates that are invariant to translation, rotation,
scale, and other imaging parameters
Based on Local Invariant
Features
David Lowe
Computer Science Department
University of British Columbia

SIFT Features

Advantages of invariant local features


Scale invariance
• Locality: features are local, so robust to Requires a method to repeatably select points in location
occlusion and clutter (no prior segmentation) and scale:
• The only reasonable scale-space kernel is a Gaussian
• Distinctiveness: individual features can be (Koenderink, 1984; Lindeberg, 1994)
matched to a large database of objects
l
p
esam
R

r
lu
B

trac
b
u
S

• An efficient choice is to detect peaks in the difference of


• Quantity: many features can be generated for Gaussian pyramid (Burt & Adelson, 1983; Crowley &
Parker, 1984 – but examining more scales)
even small objects
• Difference-of-Gaussian with constant ratio of scales is a
• Efficiency: close to real-time performance close approximation to Lindeberg’s scale-normalized
Laplacian (can be shown from the heat diffusion
• Extensibility: can easily be extended to wide equation)
range of differing feature types, with each
adding robustness

Scale space processed one octave at a time Key point localization


• Detect maxima and minima of
difference-of-Gaussian in scale
space
• Fit a quadratic to surrounding
values for sub-pixel and sub-scale
interpolation (Brown & Lowe,
2002)
l
p
esam
R

r
lu
B

trac
b
u
S

• Taylor expansion around point:

• Offset of extremum (use finite


differences for derivatives):

14
Select canonical orientation Example of keypoint detection
Threshold on value at DOG peak and on ratio of principle
curvatures (Harris approach)

• Create histogram of local (a) 233x189 image


gradient directions computed (b) 832 DOG extrema
(c) 729 left after peak
at selected scale value threshold
(d) 536 left after testing
• Assign canonical orientation ratio of principle
at peak of smoothed curvatures
histogram
• Each key specifies stable 2D
coordinates (x, y, scale,
orientation)
0 2π

SIFT vector formation Feature stability to noise


• Thresholded image gradients are sampled over 16x16 • Match features after random change in image scale &
array of locations in scale space orientation, with differing levels of image noise
• Create array of orientation histograms • Find nearest neighbor in database of 30,000 features
• 8 orientations x 4x4 histogram array = 128 dimensions

Feature stability to affine change Distinctiveness of features


• Match features after random change in image scale & • Vary size of database of features, with 30 degree affine
orientation, with 2% image noise, and affine distortion change, 2% image noise
• Find nearest neighbor in database of 30,000 features • Measure % correct for single nearest neighbor match

15
Talk Resume
A good SIFT features tutorial • Stable (repeatable) feature points can be detected
regardless of image changes
– Scale: search for correct scale as maximum of
http://www.cs.toronto.edu/~jepson/csc2503/tutSIFT04.pdf appropriate function
By Estrada, Jepson, and Fleet. – Affine: approximate regions with ellipses (this
operation is affine invariant)
• Invariant and distinctive descriptors can be
computed
– Invariant moments
– Normalizing with respect to scale and affine
transformation

Evaluation of interest points and


Introduction
descriptors • Quantitative evaluation of interest point detectors
– points / regions at the same relative location

=> repeatability rate


Cordelia Schmid
• Quantitative evaluation of descriptors
CVPR’03 Tutorial – distinctiveness

=> detection rate with respect to false positives

16
Quantitative evaluation of detectors Comparison of different detectors
repeatability - image rotation
• Repeatability rate : percentage of corresponding points

homography

• Two points are corresponding if


1. The location error is less than 1.5 pixel
2. The intersection error is less than 20%
[Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98]

Comparison of different detectors Harris detector + scale changes


repeatability – perspective transformation

[Comparing and Evaluating Interest Points, Schmid, Mohr & Bauckhage, ICCV 98]

Harris detector – adaptation to scale Evaluation of scale invariant detectors


repeatability – scale changes

17
Evaluation of affine invariant detectors Quantitative evaluation of descriptors
repeatability – perspective transformation • Evaluation of different local features
0 – SIFT, steerable filters, differential invariants, moment invariants,
cross-correlation

• Measure : distinctiveness
40 – receiver operating characteristics of
detection rate with respect to false positives

– detection rate = correct matches / possible matches


60
– false positives = false matches / (database points * query points)

[A performance evaluation of local descriptors, Mikolajczyk & Schmid,


70 CVPR’03]

Experimental
Scale change (factor 2.5)
evaluation

Harris-Laplace DoG

Viewpoint change (60 degrees) Descriptors - conclusion

• SIFT + steerable perform best

• Performance of the descriptor independent


of the detector

• Errors due to imprecision in region


estimation, localization
Harris-Affine (Harris-Laplace)

18
SIFT – Scale Invariant Feature Transform
end • Descriptor overview:
– Determine scale (by maximizing DoG in scale and in space),
local orientation as the dominant gradient direction.
Use this scale and orientation to make all further computations
invariant to scale and rotation.
– Compute gradient orientation histograms of several small windows
(128 values for each point)
– Normalize the descriptor to make it invariant to intensity change

D.Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. Accepted to IJCV 2004

Affine Invariant Texture Descriptor


Invariance to Intensity Change
• Segment the image into regions of different textures (by a non-
invariant method)
• Compute matrix M (the same as in
• Detectors
Harris detector) over these regions – mostly invariant to affine (linear) change in
⎡ I2 IxI y ⎤ image intensity, because we are searching for
M = ∑ w( x, y ) ⎢ x ⎥ maxima
x, y ⎣⎢ I x I y I y2 ⎦⎥
• Descriptors
• This matrix defines the ellipse
– Some are based on derivatives => invariant to
⎡ x⎤
[ x, y ] M ⎢ ⎥ = 1 • Regions described by these ellipses are intensity shift
⎣ y⎦ invariant under affine transformations
– Some are normalized to tolerate intensity scale
• Find affine normalized frame
• Compute rotation invariant descriptor – Generic method: pre-normalize intensity of a
region (eliminate shift and scale)
F.Schaffalitzky, A.Zisserman. “Viewpoint Invariant Texture Matching and Wide Baseline
Stereo”. ICCV 2003

19

You might also like