L9 Segmentation

Perceptual Organization:
Image Segmentation
Inspiration from psychology
• The Gestalt school: Grouping is key to
visual perception
• “The whole is greater than the sum of its parts”
subjective contours occlusion
familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology
Gestalt grouping factors
Emergence
http://en.wikipedia.org/wiki/Gestalt_psychology
Segmentation
• Segmentation is the process of dividing an
image into “coherent regions”.
Many slides are curtesy of S. Lazebnik , B. Girod and S. Seitz. Also thanks to
6
Eric Weizmann, Zvi Solomon, Seri khoury , Jad Silbak, Avner Gidron, David Cohen
The goals of segmentation
• The goal of segmentation is to simplify and/or change the
representation of an image into components that are
more meaningful and easier to analyze.
• Image segmentation is the process of assigning a label
to every pixel such that same-labeled pixels
share certain characteristics.
• Segmentation can be treated at

different levels of details.
Object segments
“superpixels”
Regions as primitives for recognition
J. Tighe and S. Lazebnik, ECCV 2010

• Interactive segmentation for graphics
Approaches to segmentation
• Gray-level Histogram thresholding

• Edge Based Segmentation
• Segmentation as clustering
• Segmentation as graph partitioning
Global Thresholding –
Segmentation into two components
Vnew
F(V)
255
Vold
T 255
Threshold value
Segmentation using Thresholding
Original Histogram
1500
1000
500
0
0 100 200
50 75
Threshold = 50 Threshold = 75
Segmentation using Thresholding
• How to choose the optimal threshold?
• Can a global threshold suffice?
Slide by B. Girod
Error Probability for Thresholding
• Assuming there are two components in the image:
– Foreground
– Background
• What is the probability of error given a threshold T?
Slide by B. Girod
Error Probability for Thresholding
• The optimal threshold T with respect to MAP estimation
(Maximum A Posteriori) is T where 𝑃 𝐵 𝑇 = 𝑃(𝐹|𝑇)
• Goal: Find the optimal T given an image.
Slide by B. Girod
Optimal Thresholding using Otsu’s Method.
Slide by B. Girod
Locally Adaptive Thresholding
• Slide a window over the image
• For each window decide whether to apply thresholding:
– Thresholding should not be performed in uniform windows.
– Can use variance or other criterion for “uniformity”.
• In non-uniform windows: apply

Otsu’s thresholding based on
local histogram.
• In uniform windows: classify the
entire window as Foreground or
Background based on mean value.
Slide by B. Girod
Locally Adaptive Thresholding
Slide by B. Girod
Edge Based Segmentation
Edge Based Segmentation
We are looking for boundaries, not edges.
Problem with edges in presence of noise and

sampling artifacts (e.g. medical images).
Snakes – Active Contours
1. The snake is placed near the image contour of interest.
2. During an iterative process, the snake is attracted towards

the target contour by various forces that control the shape and
location of the snake within the image.
Snakes: Active Contour Models (1987)

Michael Kass, Andrew Witkin, and
Demetri Terzopoulos
Snakes
Works with noisy and discontinuous boundary edges
Snakes
• Representation of the contours (x0,y0, x1,y1,…., xn,yn)
C = { ν( s ) | s  [ 0 ,1 ]}
ν( s ) = ( xs , y s )
• Defining the energy functions

– Internal
– External
• Minimizing the energy function

Snakes
Usually, the total energy(cost function) of snake is
a combination of internal and external energies
E = Ein + Eex
Internal energy encourages External energy encourages

smoothness of shape alignment of curve with
image structures (edges).
Minimize E over all possible locations of (x0,y0, x1,y1,…., xn,yn)

External Energy
• Measure how well the curve matches the image data.
• “Attract” the curve toward different image features

(edges, lines, etc)
• Think of external energy from image as gravitational pull

towards areas of high contrast.
External Energy
For every image pixel v=(x,y) define its External Energy as:
Eex ( v ) = − | I ( v ) |2 = − | I ( x, y ) |2
External energy term for the snake whose points are s :
1
Eex =  Eex ( ( s )) ds continuous case
0 C = { ν( s ) | s  [ 0 ,1 ]}
discrete case
C = { νi | 0  i  n }
External Energy
discrete case
C = { νi | 0  i  n }
Snake moves to decrease external energy:

Internal Energy
The Internal Energy (= Smoothess) at contour point v(s) is
defined as:
2
d d
2
2
Ein ( ( s )) =  ( s ) +  (s)
2
ds d s
Elasticity/stretching Stiffness/bending
The Internal Energy (smoothness) of the whole snake is :

1
Ein =  Ein ( ( s )) ds C = { ν( s ) | s  [ 0 ,1 ]}
0
Internal Energy
v4
elastic energy v3
(elasticity) v5
v2
d
 vi +1 − i
ds v1 v6
v 10 v7
v9 v8 bending energy
(stiffness)
d 2
2
 (  i + 1 −  i ) − (  i −  i −1 ) =  i + 1 − 2  i +  i −1
ds
Internal Energy
n −1
Ein = i =0
 | i +1 − i | +  | i +1 − 2 i + i −1 |
2 2
Elasticity Stiffness
large 
small  large  small 

Snakes
Minimize E = Ein + Eex
over all possible locations of snake (v0, v1,…., vn)
vi = (xi,yi)
Gradient descent, iterative local minimization,

Veterbi (dynamic programming)
Color Segmentation:
Segmentation as clustering
Source: K. Grauman
K-means
• How to choose the representative colors?
– This is a clustering problem!
G G
R R
Objective
• Each point should be as close as possible to a cluster center
• Minimize sum squared distance of each point to closest center
Slide by Steve Seitz

Break it down into sub problems
• Suppose I tell you the cluster centers ci
– Q: how to determine which points to associate with each ci?
− A: for each point p, choose closest ci
Suppose I tell you the points in each cluster

• Q: how to determine the cluster centers?
• A: choose ci to be the mean of all points in the cluster
Slide by Steve Seitz

K-means clustering: algorithm
1. Randomly initialize the cluster centers, c1, ..., cK

2. Given cluster centers, determine points in each
cluster
• For each point p, find the closest ci. Put p into
cluster i
3. Given points in each cluster, solve for ci
• Set ci to be the mean of points in cluster i
4. If ci have changed, repeat Step 2
Text by Steve Seitz

K-Means - Example
Choose K seed positions randomly

‫נכתב ע"י אריק ויצמן‬
‫‪K-Means - Example‬‬
‫‪K=2‬‬
K-Means - Example
Associate elements to nearest seed.

K-Means - Example
Update seed location to COM of all associated elements.

K-Means - Example
Repeat



K-Means - Example
Repeat until no change in element assignments to seeds.

K-Means – Image Segmentation
Input Image
Background CSF Gray White

Matter Matter

K-Means – Image Segmentation
Iteration: 76
21
5
3
4
Seed Locations: 0.7324

126.6209
1.0536
1.4523
2.5485
13.2753
109.3117
164.4521
122.6038
116.3612
131.0522
146.8334
168.9954
190.8013
209.5154
195.0548
192.7618
197.6932
204.1537
204.1537
229.4411
247.5998
247.5998
247.5998

• K-means clustering based on intensity or color is
essentially vector quantization of the image attributes
Image Intensity-based clusters Color-based clusters
Slide by Svetlana Lazebnik

• K-means clustering based on texture
Clusters don’t have to be spatially coherent
Source: K. Grauman
• Clustering based on (r,g,b,x,y) values enforces more

spatial coherence

Over segmentation – Super pixels
using K-means
• SLIC (Simple Linear Iterative Clustering) super pixel
based on k-means clustering (r,g,b,x,y)
• Initial centers – regular kxk windows.
• Spatial domain and
color domain can be
weighted differently.
• Iterate k-means until
convergence.
• Taking the role of
pixels in many
applications.
Achanta, et. Al (2010).

Slic superpixels (EPFL-REPORT-149300).
K-Means for segmentation
• Pros
– Very simple method
– Converges to a local minimum of the error function
• Cons
– Memory-intensive
– Need to pick K
– Sensitive to initialization
– Sensitive to outliers
– Only finds “spherical”
clusters

Mean shift segmentation
• The mean shift algorithm seeks modes or local
maxima of density in the feature space
Feature space
image (L*u*v* color values)
Non-parametric density mode estimation
Non-parametric
Density Estimation
Discrete PDF Representation
Data
Non-parametric
MODE Estimation
(Mean Shift)
PDF Analysis
• Non-parametric – no assumption about PDF form (e.g.
normal distribution)
• Density Gradient is estimated instead of Density itself.
55
Mean shift
Search
window
Center of
mass
Mean Shift
vector
Slide by Y. Ukrainitz & B. Sarel

Mean shift
Search
window
Center of
mass
Mean Shift
vector

Mean shift
Search
window
Center of
mass
Mean Shift
vector

Mean shift
Search
window
Center of
mass
Mean Shift
vector

Mean shift
Search
window
Center of
mass
Mean Shift
vector

Mean shift
Search
window
Center of
mass
Mean Shift
vector

Mean shift
Search
window
Center of
mass

Mean shift clustering
• Cluster: all data points in the attraction basin
of a mode
• Attraction basin: the region for which all
trajectories lead to the same mode

Mean shift clustering/segmentation
• Find features (color, gradients, texture, etc)
• Initialize windows at individual feature points
• Perform mean shift for each window until convergence
• Merge windows that end up near the same “peak” or mode
Mean shift segmentation results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean shift
• Pros:
– Does not assume shape on clusters
– One parameter choice (window size)
– Generic technique
– Find multiple modes
• Cons:
– Selection of window size
– Does not scale well with dimension of feature space
Kristen Grauman
• Clustering based on (r,g,b,x,y) values
enforces more spatial coherence
Segmentation as graph partitioning
j
wij
i
• Node for every pixel

• Edge between every pair of pixels (or
every pair of “sufficiently close” pixels)
• Each edge is weighted by the affinity or
similarity of the two nodes Source: S. Seitz
Measuring affinity
• Represent each pixel by a feature vector

x and define an appropriate  1 distance 2 
affinity(x i , x j ) = exp − 2 dist (x i , x j ) 
function  2 
Role of σ
small σ large σ
Segmentation as graph partitioning
j
wij
i
A B C
• Break Graph into Segments

– Delete links that cross between segments
– Easiest to break links that have low affinity
• similar pixels should be in the same segments
• dissimilar pixels should be in different segments
Source: S. Seitz
Graph cut
B
A
• Set of edges whose removal makes a

graph disconnected
• Cost of a cut: sum of weights of cut edges
• A graph cut gives us a segmentation
– What is a “good” graph cut and how do we find
one?
Source: S. Seitz
Minimum cut
• We can do segmentation by finding the

minimum cut in a graph
– EfficientMinimum cut example
algorithms exist for doing this
Minimum cut
• We can do segmentation by finding the

minimum cut in a graph
– EfficientMinimum cut example
algorithms exist for doing this
Example result
Challenge
• How to define affinities for segmenting

highly textured images?
Segmenting textured images
• Convolve image with a bank of filters
• Find textons by clustering vectors of filter bank outputs
Image Texton map
Filter bank
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for
Image Segmentation". IJCV 43(1),7-27,2001.
Segmenting textured images
• Convolve image with a bank of filters
• Find textons by clustering vectors of filter bank outputs
• Represent pixels by texton histograms computed over
neighborhoods at some “local scale”
• Define affinities as similarities between local texton
histograms
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for
Image Segmentation". IJCV 43(1),7-27,2001.
Results: Berkeley Segmentation
Engine
http://www.cs.berkeley.edu/~fowlkes/BSE/
Berkeley Segmentation Engine
http://www.cs.berkeley.edu/~fowlkes/BSE/
Segmentation as labeling
• Suppose we want to segment an image

into foreground and background
• Binary labeling problem
Credit: N. Snavely
What does a U-Net do?
Learns Segmentation
Input Output
Image Segmentation Map
U-Net Architecture
Ronneberger et al. (2015) U-net Architecture

U-Net Architecture
“Contraction”
- Increases field of view
Phase
- Lose Spatial
Information

U-Net Architecture
“Expansion”
- Create High Resolution
Phase Mapping

U-Net Architecture
Concatenate with high-resolution

feature maps from the Contraction
Phase

U-Net Summary
• Contraction Phase
– Reduce spatial dimension, but increases the
“what.”
• Expansion Phase
– Recovers object details and the dimensions,
which is the “where.”
• Concatenating feature maps from the
Contraction phase helps the Expansion
phase with recovering the “where”
information.
Author Results
Ronneberger et al. (2015) ISBI cell tracking

challenge

L9 Segmentation

Uploaded by

Copyright:

Available Formats

L9 Segmentation

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

L9 Segmentation

Uploaded by

Copyright:

Available Formats

Perceptual Organization:

subjective contours occlusion

• Segmentation can be treated at

J. Tighe and S. Lazebnik, ECCV 2010

• Gray-level Histogram thresholding

• In non-uniform windows: apply

Problem with edges in presence of noise and

2. During an iterative process, the snake is attracted towards

Snakes: Active Contour Models (1987)

• Defining the energy functions

• Minimizing the energy function

Internal energy encourages External energy encourages

Minimize E over all possible locations of (x0,y0, x1,y1,…., xn,yn)

• “Attract” the curve toward different image features

• Think of external energy from image as gravitational pull

External energy term for the snake whose points are s :

Snake moves to decrease external energy:

The Internal Energy (smoothness) of the whole snake is :

small  large  small 

Gradient descent, iterative local minimization,

Slide by Steve Seitz

Suppose I tell you the points in each cluster

Slide by Steve Seitz

1. Randomly initialize the cluster centers, c1, ..., cK

Text by Steve Seitz

Choose K seed positions randomly

Associate elements to nearest seed.

Update seed location to COM of all associated elements.

‫נכתב ע"י אריק ויצמן‬

‫נכתב ע"י אריק ויצמן‬

‫נכתב ע"י אריק ויצמן‬

Repeat until no change in element assignments to seeds.

Background CSF Gray White

‫נכתב ע"י אריק ויצמן‬

Seed Locations: 0.7324

‫נכתב ע"י אריק ויצמן‬

Image Intensity-based clusters Color-based clusters

Slide by Svetlana Lazebnik

• Clustering based on (r,g,b,x,y) values enforces more

Slide by Svetlana Lazebnik

Achanta, et. Al (2010).

Slide by Svetlana Lazebnik

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

Slide by Y. Ukrainitz & B. Sarel

• Node for every pixel

• Represent each pixel by a feature vector

• Break Graph into Segments

• Set of edges whose removal makes a

• We can do segmentation by finding the

• We can do segmentation by finding the

• How to define affinities for segmenting

Image Texton map

• Suppose we want to segment an image