L9 Segmentation
L9 Segmentation
L9 Segmentation
Image Segmentation
Inspiration from psychology
• The Gestalt school: Grouping is key to
visual perception
• “The whole is greater than the sum of its parts”
familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology
Gestalt grouping factors
Emergence
http://en.wikipedia.org/wiki/Gestalt_psychology
Segmentation
• Segmentation is the process of dividing an
image into “coherent regions”.
Many slides are curtesy of S. Lazebnik , B. Girod and S. Seitz. Also thanks to
6
Eric Weizmann, Zvi Solomon, Seri khoury , Jad Silbak, Avner Gidron, David Cohen
The goals of segmentation
• The goal of segmentation is to simplify and/or change the
representation of an image into components that are
more meaningful and easier to analyze.
• Image segmentation is the process of assigning a label
to every pixel such that same-labeled pixels
share certain characteristics.
Object segments
“superpixels”
Regions as primitives for recognition
Vold
T 255
Threshold value
Segmentation using Thresholding
Original Histogram
1500
1000
500
0
0 100 200
50 75
Threshold = 50 Threshold = 75
Segmentation using Thresholding
• How to choose the optimal threshold?
• Can a global threshold suffice?
Slide by B. Girod
Error Probability for Thresholding
• Assuming there are two components in the image:
– Foreground
– Background
• What is the probability of error given a threshold T?
Slide by B. Girod
Error Probability for Thresholding
• The optimal threshold T with respect to MAP estimation
(Maximum A Posteriori) is T where 𝑃 𝐵 𝑇 = 𝑃(𝐹|𝑇)
• Goal: Find the optimal T given an image.
Slide by B. Girod
Optimal Thresholding using Otsu’s Method.
Slide by B. Girod
Locally Adaptive Thresholding
• Slide a window over the image
• For each window decide whether to apply thresholding:
– Thresholding should not be performed in uniform windows.
– Can use variance or other criterion for “uniformity”.
Slide by B. Girod
Locally Adaptive Thresholding
Slide by B. Girod
Edge Based Segmentation
Edge Based Segmentation
We are looking for boundaries, not edges.
C = { ν( s ) | s [ 0 ,1 ]}
ν( s ) = ( xs , y s )
E = Ein + Eex
For every image pixel v=(x,y) define its External Energy as:
Eex ( v ) = − | I ( v ) |2 = − | I ( x, y ) |2
1
Eex = Eex ( ( s )) ds continuous case
0 C = { ν( s ) | s [ 0 ,1 ]}
discrete case
C = { νi | 0 i n }
External Energy
discrete case
C = { νi | 0 i n }
d d
2
2
Ein ( ( s )) = ( s ) + (s)
2
ds d s
Elasticity/stretching Stiffness/bending
v 10 v7
v9 v8 bending energy
(stiffness)
d 2
2
( i + 1 − i ) − ( i − i −1 ) = i + 1 − 2 i + i −1
ds
Internal Energy
n −1
Ein = i =0
| i +1 − i | + | i +1 − 2 i + i −1 |
2 2
Elasticity Stiffness
large
Source: K. Grauman
K-means
• How to choose the representative colors?
– This is a clustering problem!
G G
R R
Objective
• Each point should be as close as possible to a cluster center
• Minimize sum squared distance of each point to closest center
K=2
נכתב ע"י אריק ויצמן
K-Means - Example
Repeat
נכתב ע"י אריק ויצמן
K-Means - Example
Source: K. Grauman
Segmentation as clustering
• Pros
– Very simple method
– Converges to a local minimum of the error function
• Cons
– Memory-intensive
– Need to pick K
– Sensitive to initialization
– Sensitive to outliers
– Only finds “spherical”
clusters
Feature space
image (L*u*v* color values)
Non-parametric density mode estimation
Non-parametric
Density Estimation
Discrete PDF Representation
Data
Non-parametric
MODE Estimation
(Mean Shift)
PDF Analysis
• Non-parametric – no assumption about PDF form (e.g.
normal distribution)
• Density Gradient is estimated instead of Density itself.
55
Mean shift
Search
window
Center of
mass
Mean Shift
vector
Center of
mass
Mean Shift
vector
Center of
mass
Mean Shift
vector
Center of
mass
Mean Shift
vector
Center of
mass
Mean Shift
vector
Center of
mass
Mean Shift
vector
Center of
mass
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean shift segmentation results
Mean shift segmentation results
Mean shift
• Pros:
– Does not assume shape on clusters
– One parameter choice (window size)
– Generic technique
– Find multiple modes
• Cons:
– Selection of window size
– Does not scale well with dimension of feature space
Kristen Grauman
Segmentation as clustering
• Clustering based on (r,g,b,x,y) values
enforces more spatial coherence
Segmentation as graph partitioning
j
wij
i
small σ large σ
Segmentation as graph partitioning
j
wij
i
A B C
B
A
Source: S. Seitz
Minimum cut
Filter bank
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for
Image Segmentation". IJCV 43(1),7-27,2001.
Segmenting textured images
• Convolve image with a bank of filters
• Find textons by clustering vectors of filter bank outputs
• Represent pixels by texton histograms computed over
neighborhoods at some “local scale”
• Define affinities as similarities between local texton
histograms
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for
Image Segmentation". IJCV 43(1),7-27,2001.
Results: Berkeley Segmentation
Engine
http://www.cs.berkeley.edu/~fowlkes/BSE/
Berkeley Segmentation Engine
http://www.cs.berkeley.edu/~fowlkes/BSE/
Segmentation as labeling
Credit: N. Snavely
What does a U-Net do?
Learns Segmentation
Input Output
Image Segmentation Map
U-Net Architecture
“Contraction”
- Increases field of view
Phase
- Lose Spatial
Information
“Expansion”
- Create High Resolution
Phase Mapping
• Contraction Phase
– Reduce spatial dimension, but increases the
“what.”
• Expansion Phase
– Recovers object details and the dimensions,
which is the “where.”
• Concatenating feature maps from the
Contraction phase helps the Expansion
phase with recovering the “where”
information.
Author Results