02/25/10
Graph-based Segmentation
Computer Vision
CS 543 / ECE 549
University of Illinois
Derek Hoiem
Last class
Gestalt cues and principles of
organization
Mean-shift segmentation
Good general-purpose segmentation
method
Generally useful clustering, tracking
technique
Watershed segmentation
Good for hierarchical segmentation
Use in combination with boundary prediction
Today's class
Treating the image as a graph
Normalized cuts segmentation
MRFs and graph cuts segmentation
Recap
Go over HW2 instructions
Images as graphs
[Figure: pixels i and j joined by a link with weight w_ij]
Fully-connected graph
node for every pixel
link between every pair of pixels, p,q
similarity wij for each link
Source: Seitz
Similarity matrix
Increasing sigma
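A minimal sketch of how such a similarity matrix could be built. The slide does not state the affinity formula, so the Gaussian form $w_{ij} = \exp(-\|f_i - f_j\|^2 / (2\sigma^2))$ is an assumption here, and `similarity_matrix` is a hypothetical helper name. It shows that increasing sigma makes dissimilar pixels look more similar.

```python
import numpy as np

def similarity_matrix(features, sigma):
    """Pairwise Gaussian affinity (assumed form, not given on the slide)."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]          # one scalar feature per pixel
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared feature distance
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Four pixels: two dark, two bright (made-up intensities).
intensities = [0.0, 0.1, 0.9, 1.0]
W_small = similarity_matrix(intensities, sigma=0.1)
W_large = similarity_matrix(intensities, sigma=1.0)
```

With the small sigma, the dark and bright pixels are nearly disconnected; with the large sigma, every pair has a substantial weight.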
Segmentation by Graph Cuts
Break Graph into Segments
Delete links that cross between segments
Easiest to break links that have low cost (low
similarity)
similar pixels should be in the same segments
dissimilar pixels should be in different segments
Source: Seitz
Cuts in a graph
Link Cut
set of links whose removal makes a graph disconnected
cost of a cut separating segments A and B: $cut(A, B) = \sum_{p \in A,\, q \in B} w_{pq}$
One idea: Find minimum cut
gives you a segmentation
fast algorithms exist for doing this
Source: Seitz
But min cut is not always the best
cut...
Cuts in a graph
Normalized Cut
a cut penalizes large segments
fix by normalizing for size of segments: $Ncut(A, B) = \frac{cut(A, B)}{volume(A)} + \frac{cut(A, B)}{volume(B)}$
volume(A) = sum of costs of all edges that touch A
Source: Seitz
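A small sketch comparing the plain cut cost with the normalized cut on a toy graph. Assumptions: volume(A) is taken as the sum of the degrees of the nodes in A (the assoc(A, V) definition from Shi and Malik), and the graph weights and function names are made up for illustration.

```python
import numpy as np

def cut_cost(W, in_A):
    """Sum of weights of links between A and its complement."""
    in_A = np.asarray(in_A, dtype=bool)
    return float(W[np.ix_(in_A, ~in_A)].sum())

def volume(W, in_A):
    """Sum of degrees of the nodes in A (assumed reading of 'volume')."""
    in_A = np.asarray(in_A, dtype=bool)
    return float(W[in_A, :].sum())

def normalized_cut(W, in_A):
    in_A = np.asarray(in_A, dtype=bool)
    c = cut_cost(W, in_A)
    return c / volume(W, in_A) + c / volume(W, ~in_A)

# Toy graph: tight group {0,1,2}, group {3,4}, and a weakly attached node 5.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (2, 3, 0.3), (3, 4, 1.0), (4, 5, 0.2)]:
    W[i, j] = W[j, i] = w

isolate_outlier = [False, False, False, False, False, True]
balanced = [True, True, True, False, False, False]
```

On this graph the cheapest plain cut isolates node 5, while the normalized cut is lower for the balanced split between the two groups.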
Recursive normalized cuts
1. Given an image or image sequence, set up
a weighted graph: G=(V, E)
Vertex for each pixel
Edge weight for nearby pairs of pixels
2. Solve for eigenvectors with the smallest
eigenvalues: $(D - W)y = \lambda D y$
(W: matrix of edge weights; D: diagonal matrix of the row sums of W)
3. Use the eigenvector with the second smallest
eigenvalue to bipartition the graph
Note: this is an approximation
4. Recursively repartition the segmented
parts if necessary
Details:
http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
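A minimal sketch of one bipartition step, not the authors' implementation. Assumptions: the graph is small and dense, every node has positive degree, the generalized problem is solved through the equivalent symmetric form $D^{-1/2}(D - W)D^{-1/2}$, and the eigenvector is split at zero (Shi and Malik discuss other splitting points). `ncut_bipartition` is a hypothetical name.

```python
import numpy as np

def ncut_bipartition(W):
    """Split a graph using the second smallest generalized eigenvector."""
    d = W.sum(axis=1)                       # node degrees (diagonal of D)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form of (D - W) y = lambda D y, with z = D^{1/2} y.
    L_sym = (np.diag(d) - W) * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]            # map back to y
    return y > 0, eigvals[1]                  # threshold at zero (assumption)

# Two fully connected groups of three pixels joined by one weak link.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.05

labels, lam2 = ncut_bipartition(W)
```

The sign of an eigenvector is arbitrary, so which group receives `True` can differ between runs or libraries; only the grouping is meaningful. Recursion would apply the same function to the sub-matrix of each part.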
Normalized cuts results
Normalized cuts: Pro and con
Pros
Generic framework, can be used with
many different features and affinity
formulations
Provides regular segments
Cons
Need to choose number of segments
High storage requirement and time
complexity
Bias towards partitioning into equal
segments
Usage
Use for oversegmentation when you want
regular segments
Graph cuts segmentation
Markov Random Fields
Node yi: pixel label
Edge: constrained
pairs
Cost to assign a label to
each pixel
Cost to assign a pair of labels to
connected pixels
$Energy(\mathbf{y}; \theta, data) = \sum_i \psi_1(y_i; \theta, data) + \sum_{i,j \in edges} \psi_2(y_i, y_j; \theta, data)$
Markov Random Fields
Example: label smoothing grid
Unary potential
0: $-\log P(y_i = 0 ; data)$
1: $-\log P(y_i = 1 ; data)$
Pairwise potential
           y_j = 0   y_j = 1
y_i = 0       0         K
y_i = 1       K         0
$Energy(\mathbf{y}; \theta, data) = \sum_i \psi_1(y_i; \theta, data) + \sum_{i,j \in edges} \psi_2(y_i, y_j; \theta, data)$
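A short sketch of the energy of a label smoothing grid, assuming a 4-connected grid, unary cost $-\log P(y_i ; data)$, and a pairwise cost of K whenever two neighbouring labels differ. The probabilities and the name `grid_energy` are made up for illustration.

```python
import numpy as np

def grid_energy(labels, prob_one, K):
    """Unary -log P terms plus K for every neighbouring pair that disagrees."""
    labels = np.asarray(labels, dtype=int)
    prob_one = np.asarray(prob_one, dtype=float)   # P(y_i = 1 ; data)
    p_label = np.where(labels == 1, prob_one, 1.0 - prob_one)
    unary = -np.log(p_label).sum()
    horiz = (labels[:, 1:] != labels[:, :-1]).sum()  # left-right neighbours
    vert = (labels[1:, :] != labels[:-1, :]).sum()   # up-down neighbours
    return float(unary + K * (horiz + vert))

prob_one = np.array([[0.9, 0.9],
                     [0.9, 0.4]])          # one weakly "0" pixel
thresholded = (prob_one > 0.5).astype(int)  # per-pixel best label
smooth = np.ones((2, 2), dtype=int)         # all pixels labelled 1

e_thresholded = grid_energy(thresholded, prob_one, K=0.5)
e_smooth = grid_energy(smooth, prob_one, K=0.5)
```

With K = 0.5 the smooth labelling has lower energy than the per-pixel thresholded one; with K = 0 the thresholded labelling wins, since no smoothing is enforced.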
Solving MRFs with graph cuts
Source (Label 0)
Cost to assign to 0
Cost to split nodes
Cost to assign to 1
Sink (Label 1)
$Energy(\mathbf{y}; \theta, data) = \sum_i \psi_1(y_i; \theta, data) + \sum_{i,j \in edges} \psi_2(y_i, y_j; \theta, data)$
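A minimal sketch of the construction for a binary MRF, using a plain Edmonds-Karp max-flow rather than a tuned graph-cut library. Assumptions: nodes left connected to the source receive label 0, so the source-to-node link carries the cost of label 1 and the node-to-sink link carries the cost of label 0; the split cost is a constant K; the costs and names are made up for illustration.

```python
from collections import deque
from itertools import product

def energy(labels, unary, edges, K):
    e = sum(unary[i][y] for i, y in enumerate(labels))
    return e + sum(K for i, j in edges if labels[i] != labels[j])

def min_cut_labels(unary, edges, K):
    """unary[i] = (cost of label 0, cost of label 1); edges = [(i, j), ...]."""
    n = len(unary)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]    # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)             # reverse residual arc

    for i, (cost0, cost1) in enumerate(unary):
        add_edge(s, i, cost1)   # cut when i ends on the sink side (label 1)
        add_edge(i, t, cost0)   # cut when i stays on the source side (label 0)
    for i, j in edges:
        add_edge(i, j, K)       # cost to split neighbouring nodes
        add_edge(j, i, K)

    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break                # no path left: parent holds the source side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow
    return [0 if i in parent else 1 for i in range(n)]

# Chain of four pixels with made-up integer costs.
unary = [(1, 6), (2, 5), (4, 3), (7, 1)]
edges = [(0, 1), (1, 2), (2, 3)]
labels = min_cut_labels(unary, edges, K=2)
brute_force = min(product([0, 1], repeat=4),
                  key=lambda y: energy(y, unary, edges, 2))
```

On this example the labelling from the cut matches the brute-force minimum of the energy, which is the property that makes the construction useful.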
Grab cuts and graph cuts
Magic Wand (198?): regions
Intelligent Scissors, Mortensen and Barrett (1995): boundary
GrabCut: regions & boundary
[Figure rows for each method: user input, result]
Source: Rother
Colour Model
[Figure: colour distributions before iterated graph cut (foreground & background, background) and after (foreground, background)]
Gaussian Mixture Model (typically 5-8 components)
Source: Rother
Graph cuts
Boykov and Jolly (2001)
[Figure: image pixels linked to a foreground (source) terminal and a background (sink) terminal, separated by the min cut]
Cut: separating source and sink; Energy: collection of edges
Min Cut: global minimum energy in polynomial time
Source: Rother
Graph cuts segmentation
1. Define graph
usually 4-connected or 8-connected
2. Define unary potentials
Color histogram or mixture of Gaussians
for background and foreground
$unary\_potential(x) = -\log \frac{P(c(x); \theta_{foreground})}{P(c(x); \theta_{background})}$
3. Define pairwise potentials
$edge\_potential(x, y) = k_1 + k_2 \exp\left( \frac{-\|c(x) - c(y)\|^2}{2\sigma^2} \right)$
4. Apply graph cuts
5. Return to 2, using current labels to
compute foreground, background
models
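A sketch of the two potentials from steps 2 and 3, assuming the colour likelihoods under the foreground and background models are already available as numbers (from a histogram or mixture of Gaussians). The small `eps` guard against log(0) and the parameter values are assumptions added for illustration.

```python
import numpy as np

def unary_potential(p_fg, p_bg, eps=1e-12):
    """-log of the foreground/background likelihood ratio for pixel colours."""
    p_fg = np.asarray(p_fg, dtype=float)
    p_bg = np.asarray(p_bg, dtype=float)
    return -np.log((p_fg + eps) / (p_bg + eps))  # eps is an added safeguard

def edge_potential(c_x, c_y, k1, k2, sigma):
    """k1 + k2 * exp(-||c(x) - c(y)||^2 / (2 sigma^2)) for neighbouring pixels."""
    diff = np.asarray(c_x, dtype=float) - np.asarray(c_y, dtype=float)
    sq_dist = np.sum(diff ** 2, axis=-1)
    return k1 + k2 * np.exp(-sq_dist / (2.0 * sigma ** 2))

# Made-up values: one pixel the foreground model explains better.
u_fg_like = unary_potential(0.8, 0.2)
u_neutral = unary_potential(0.5, 0.5)

# Neighbours with the same colour versus very different colours.
e_same = edge_potential([0.2, 0.2, 0.2], [0.2, 0.2, 0.2], k1=1.0, k2=10.0, sigma=0.1)
e_diff = edge_potential([0.2, 0.2, 0.2], [0.9, 0.9, 0.9], k1=1.0, k2=10.0, sigma=0.1)
```

Under this sign convention the unary value is negative where the foreground model explains the colour better. The edge potential is largest between similar neighbours, so cutting is cheapest across strong colour changes.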
Moderately straightforward
examples
GrabCut completes automatically
GrabCut Interactive Foreground Extraction
Difficult Examples
Camouflage &
Low Contrast
Fine structure
Harder Case
[Figure rows: initial rectangle, initial result]
Using graph cuts for recognition
Unary potentials
Alpha-expansion graph cuts
TextonBoost (Shotton et al. 2009 IJCV)
Limits of graph cuts
Associative: edge potentials penalize
different labels
Must satisfy, for each pairwise term E over binary labels: $E(0, 0) + E(1, 1) \le E(0, 1) + E(1, 0)$
If not associative, can sometimes clip
potentials
Approximate for multilabel
Alpha-expansion or alpha-beta swaps
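A tiny sketch of checking whether a binary pairwise term can be handled by graph cuts, assuming the regularity condition from Kolmogorov and Zabih, $E(0,0) + E(1,1) \le E(0,1) + E(1,0)$. The function name and the example tables are hypothetical.

```python
def is_graph_representable(pairwise):
    """pairwise[a][b] = cost of assigning labels (a, b) to two linked pixels."""
    same = pairwise[0][0] + pairwise[1][1]
    different = pairwise[0][1] + pairwise[1][0]
    return same <= different

smoothing = [[0, 2], [2, 0]]   # penalizes different labels (associative)
repulsive = [[2, 0], [0, 2]]   # penalizes equal labels

ok_smoothing = is_graph_representable(smoothing)
ok_repulsive = is_graph_representable(repulsive)
```

The smoothing table passes the check and the repulsive one does not, which matches the "associative only" limitation.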
Graph cuts: Pros and Cons
Pros
Very fast inference
Can incorporate recognition or high-level
priors
Applies to a wide range of problems
(stereo, image labeling, recognition)
Cons
Not always applicable (associative only)
Need unary terms (not used for generic
segmentation)
Use whenever applicable
Further reading and resources
Normalized cuts and image segmentation (Shi and
Malik)
http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
N-cut implementation
http://www.seas.upenn.edu/~timothee/software/ncut/ncut.html
Graph cuts
http://www.cs.cornell.edu/~rdz/graphcuts.html
Classic paper: What Energy Functions can be Minimized
via Graph Cuts? (Kolmogorov and Zabih, ECCV '02/PAMI
'04)
Recap of Grouping and Fitting
Line detection and Hough
transform
Canny edge detector =
smooth → derivative → thin →
threshold → link
Generalized Hough transform
= points vote for shape
parameters
Straight line detector =
canny + gradient orientations →
orientation binning → linking →
check for straightness
Robust fitting and registration
Key algorithm
RANSAC
Clustering
Key algorithm
K-means
EM and Mixture of Gaussians
Tutorials:
http://www.cs.duke.edu/courses/spring04/cps196.1/.../EM/tomasiEM.pdf
http://wwwclmc.usc.edu/~adsouza/notes/mix_gauss.pdf
Segmentation
Mean-shift segmentation
Flexible clustering method, good
segmentation
Watershed segmentation
Hierarchical segmentation from soft
boundaries
Normalized cuts
Produces regular regions
Slow but good for oversegmentation
MRFs with Graph Cut
Incorporates foreground/background/object
model and prefers to cut at image boundaries
Good for interactive segmentation or
recognition
Next section: Recognition
How to recognize
Specific object instances
Faces
Scenes
Object categories
Materials