Deep Segmentation

The document discusses deep segmentation networks, focusing on DeepLab and U-Net architectures for semantic image segmentation. It explains the principles of DeepLab, including atrous convolution and fully connected conditional random fields (CRFs) to improve segmentation accuracy. U-Net is also introduced, highlighting its contraction and expansion phases for effective segmentation learning and detail recovery.

Deep Segmentation Networks

1. DeepLab v1, v2, v3

2. U-Nets
Introduction to DeepLab
What is semantic image segmentation?
 Partitioning an image into regions of meaningful objects.
 Assigning an object category label to each region.
Introduction to DeepLab
DCNNs and image segmentation
 A DCNN outputs class prediction scores for each pixel; the class with the maximal score is selected.
 What happens in each standard DCNN layer?
 Striding
 Pooling
Introduction to DeepLab
DCNNs and image segmentation
Pooling advantages:
 Invariance to small translations of the input.
 Helps avoid overfitting.
 Computational efficiency.

Striding advantages:
 Fewer applications of the filter.
 Smaller output size.
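As an illustration of the down-sampling these layers perform, here is a minimal plain-Python sketch (the helper name is ours, not from the slides) of 2×2 max pooling with stride 2:

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2 over a 2-D list of values."""
    h, w = len(x), len(x[0])
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]

feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4×4 map becomes 2×2: each pooled value keeps only the maximum of its window, which is the loss of spatial information the following slides discuss.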
Introduction to DeepLab
DCNNs and image segmentation
What are the disadvantages for semantic segmentation?
x Down-sampling causes loss of information.
x Input invariance harms pixel-perfect accuracy.

DeepLab addresses those issues with:
 Atrous convolution (the ‘holes’ algorithm).
 CRFs (Conditional Random Fields).
Up-Sampling
Addressing the reduced-resolution problem
Possible solution:
 ‘Deconvolutional’ layers (backwards convolution).
x Additional memory and computation time.
x Additional parameters to learn.

Suggested solution:
 Atrous (‘holes’) convolution.
DeepLab v2
Atrous (‘Holes’) Algorithm
 Remove the down-sampling from the last pooling layers.
 Up-sample the original filter by a factor of the stride: introduce zeros between filter values.
 Atrous convolution for a 1-D input x with a filter w of length K and rate r: y[i] = Σ_k x[i + r·k] w[k].
 Note: standard convolution is the special case of rate r = 1.

Chen, Liang-Chieh, et al. "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs." arXiv preprint arXiv:1606.00915 (2016).
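A minimal plain-Python sketch of the 1-D case (the function name and example values are ours, not the paper's implementation):

```python
def atrous_conv1d(x, w, rate):
    """1-D atrous convolution: sample the input with stride `rate`
    between filter taps, i.e. y[i] = sum_k x[i + rate*k] * w[k]."""
    k = len(w)
    span = rate * (k - 1) + 1          # effective filter extent
    return [sum(x[i + rate * j] * w[j] for j in range(k))
            for i in range(len(x) - span + 1)]

x = [1, 2, 3, 4, 5, 6]
w = [1, 0, -1]                          # simple edge-like filter

print(atrous_conv1d(x, w, rate=1))      # [-2, -2, -2, -2]: standard convolution (r = 1)
print(atrous_conv1d(x, w, rate=2))      # [-4, -4]: as if w were [1, 0, 0, 0, -1]
```

At rate 2 the filter touches inputs 4 apart instead of 2, enlarging its field of view without adding parameters.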
Atrous (‘Holes’) Algorithm
Figure: standard convolution vs. atrous convolution.
Atrous (‘Holes’) Algorithm
Filter field-of-view
 Small field-of-view → accurate localization.
 Large field-of-view → context assimilation.
 ‘Holes’: introduce zeros between filter values.
 The effective filter size increases (enlarging the field-of-view): k_eff = k + (k − 1)(r − 1).
 However, only the non-zero filter values are taken into account:
 The number of filter parameters is the same.
 The number of operations per position is the same.
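These properties can be checked with a small sketch (plain Python, hypothetical helper name) that up-samples a filter by inserting zeros:

```python
def upsample_filter(w, rate):
    """Insert rate-1 zeros between filter values (the 'holes')."""
    out = []
    for i, v in enumerate(w):
        out.append(v)
        if i < len(w) - 1:
            out.extend([0] * (rate - 1))
    return out

w = [1, 2, 3]
padded = upsample_filter(w, rate=3)
print(padded)                       # [1, 0, 0, 2, 0, 0, 3]
# The effective size grows: k_eff = k + (k-1)(r-1)
assert len(padded) == len(w) + (len(w) - 1) * (3 - 1)
# ...but the number of non-zero parameters is unchanged:
assert sum(1 for v in padded if v != 0) == len(w)
```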
Atrous (‘Holes’) Algorithm
Figure: the original filter (standard convolution) vs. the zero-padded filter (atrous convolution).
Boundary recovery
 DCNN trade-off: classification accuracy ↔ localization accuracy.
 DCNN score maps successfully predict classification and rough position.
x Less effective for exact outlines.
Boundary recovery
 Possible solution: super-pixel representation.
 Suggested solution: fully connected CRFs.

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Semantic image segmentation with deep convolutional nets and fully connected CRFs,” in ICLR, 2015.
https://www.researchgate.net/figure/225069465_fig1_Fig-1-Images-segmented-using-SLIC-into-superpixels-of-size-64-256-and-1024-pixels
Conditional Random Fields
Problem statement
 X - random field of input observations (images) of size N.
 L = {l1, ..., lM} - set of labels.
 Y - random field of pixel labels.
 Xj - color vector of pixel j.
 Yj - label assigned to pixel j.

 CRFs are usually used to model connections between different images.
 Here we use them to model connections between image pixels!

P. Krähenbühl and V. Koltun, “Efficient inference in fully connected CRFs with Gaussian edge potentials,” in NIPS, 2011.
Probabilistic Graphical Models
 Graphical model factorization: a distribution over many variables represented as a product of local functions, each depending on a smaller subset of variables:

p(x, y) = (1/Z) ∏_{a ∈ F} Ψ_a(x_a, y_a)

C. Sutton and A. McCallum, “An Introduction to Conditional Random Fields”, Foundations and Trends in Machine Learning, vol. 4, no. 4 (2011), 267–373.
Probabilistic Graphical Models
 Undirected vs. directed: G = (V, F, E)

Undirected:
p(y1, y2, y3) ∝ Ψ_1(y1, y2) Ψ_2(y2, y3) Ψ_3(y1, y3)

Directed:
p(y, x) = p(y) ∏_{k=1}^{K} p(x_k | y)
Conditional Random Fields
Fully connected CRFs
Definition:

P(Y | X) = (1/Z(X)) ∏_{a=1}^{A} Ψ_a(Y_a | X)

 Z(X) - an input-dependent normalization factor.

Factorization (energy function):

E(y | X) = Σ_{i=1}^{N} ψ_i(y_i | X) + Σ_{i<j} ψ_{i,j}(y_i, y_j | X)

 y - the label assignment for the pixels.
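To make the energy concrete, a toy sketch in plain Python (names and numbers are illustrative, not from the references): the unary table stands in for DCNN scores, and the pairwise term is a simple Potts-style penalty:

```python
def crf_energy(y, unary, pairwise):
    """Energy of a labeling y: sum of unary potentials plus
    pairwise potentials over all pixel pairs i < j."""
    n = len(y)
    e = sum(unary[i][y[i]] for i in range(n))
    e += sum(pairwise(i, j, y[i], y[j])
             for i in range(n) for j in range(i + 1, n))
    return e

# Toy example: 3 "pixels", 2 labels; unary[i][l] stands in for -log P(l | X).
unary = [[0.1, 2.0], [0.2, 1.5], [1.8, 0.3]]
# Hypothetical Potts-style pairwise term: constant cost for differing labels.
pairwise = lambda i, j, yi, yj: 0.5 if yi != yj else 0.0

print(crf_energy([0, 0, 1], unary, pairwise))  # ≈ 1.6
```

Inference seeks the labeling that minimizes this energy; here the DCNN's preferred labels and the pairwise smoothing trade off against each other.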
Conditional Random Fields
Potential functions in our case

Unary potential:
ψ_i(y_i | X) = -log P(y_i | X)
 P(y_i | X) - the label assignment probability for pixel i, computed by the DCNN.

Pairwise potential:
ψ_{i,j}(y_i, y_j | X) = 1[y_i ≠ y_j] [ w_1 exp(-‖s_i - s_j‖²/(2σ_α²) - ‖x_i - x_j‖²/(2σ_β²)) + w_2 exp(-‖s_i - s_j‖²/(2σ_γ²)) ]
where the first term is the ‘bilateral’ kernel and the second is the smoothness kernel.

 s_i - position of pixel i.
 x_i - intensity (color) vector of pixel i.
 w_1, w_2 - learned parameters (weights).
 σ_α, σ_β, σ_γ - hyperparameters (what is considered “near” / “similar”).
Conditional Random Fields
Potential functions in our case

ψ_{i,j}(y_i, y_j | X) = 1[y_i ≠ y_j] [ w_1 exp(-‖s_i - s_j‖²/(2σ_α²) - ‖x_i - x_j‖²/(2σ_β²)) + w_2 exp(-‖s_i - s_j‖²/(2σ_γ²)) ]

 Bilateral kernel: combines pixel “nearness” (‖s_i - s_j‖²) and color similarity (‖x_i - x_j‖²); nearby pixels with similar color are likely to be in the same class.
 σ_α, σ_β - what is considered “near” / “similar”.
Conditional Random Fields
Potential functions in our case

ψ_{i,j}(y_i, y_j | X) = 1[y_i ≠ y_j] [ w_1 exp(-‖s_i - s_j‖²/(2σ_α²) - ‖x_i - x_j‖²/(2σ_β²)) + w_2 exp(-‖s_i - s_j‖²/(2σ_γ²)) ]

 Smoothness kernel: a uniform penalty for nearby pixels with different labels.
x Insensitive to compatibility between labels!
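A plain-Python sketch of this pairwise term (the function name, default weights, and sigmas are illustrative placeholders, not the learned values):

```python
import math

def pairwise_potential(yi, yj, si, sj, xi, xj,
                       w1=1.0, w2=1.0,
                       sigma_a=10.0, sigma_b=10.0, sigma_g=3.0):
    """Pairwise potential with bilateral and smoothness kernels.
    si, sj: pixel positions; xi, xj: color vectors."""
    if yi == yj:
        return 0.0                       # 1[yi != yj] gates the penalty
    ds = sum((a - b) ** 2 for a, b in zip(si, sj))   # ||s_i - s_j||^2
    dx = sum((a - b) ** 2 for a, b in zip(xi, xj))   # ||x_i - x_j||^2
    bilateral = w1 * math.exp(-ds / (2 * sigma_a**2) - dx / (2 * sigma_b**2))
    smoothness = w2 * math.exp(-ds / (2 * sigma_g**2))
    return bilateral + smoothness

# Nearby pixels with similar color but different labels pay a high penalty...
near_similar = pairwise_potential(0, 1, (0, 0), (1, 0), (10, 10, 10), (11, 10, 10))
# ...while distant, differently colored pixels pay almost none.
far_different = pairwise_potential(0, 1, (0, 0), (50, 50), (10, 10, 10), (200, 0, 0))
print(near_similar > far_different)  # True
```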
Boundary recovery
Figure: DCNN score map vs. belief map after CRF inference.
DeepLab
 Group:
 CCVL (Center for Cognition, Vision, and Learning).

 Basis networks (pre-trained on ImageNet):
 VGG-16 (Oxford Visual Geometry Group, ILSVRC 2014 1st place).
 ResNet-101 (Microsoft Research Asia, ILSVRC 2015 1st place).

 Code: https://bitbucket.org/deeplab/deeplab-public/
U-Net
What does a U-Net do?
 Learns segmentation: input image → output segmentation map.
U-Net Architecture
“Contraction” phase:
- Increases the field of view.
- Loses spatial information.

Ronneberger et al. (2015), U-Net architecture.
U-Net Architecture
“Expansion” phase:
- Creates a high-resolution mapping.
U-Net Architecture
- Concatenate with the high-resolution feature maps from the contraction phase.
U-Net Summary
• Contraction Phase
– Reduces spatial dimensions but increases the “what.”
• Expansion Phase
– Recovers object details and the spatial dimensions, i.e. the “where.”
• Concatenating feature maps from the contraction phase helps the expansion phase recover the “where” information.
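The shape bookkeeping above can be sketched with NumPy (a toy stand-in: real U-Net stages are learned convolutions, and these helper names are ours):

```python
import numpy as np

def down(x):
    """Contraction step: 2x2 max pooling (halves H and W)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def up(x):
    """Expansion step: nearest-neighbour upsampling (doubles H and W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

x = np.random.rand(8, 64, 64)           # 8 feature channels, 64x64
skip = x                                # kept for the skip connection
bottleneck = down(x)                    # (8, 32, 32): more "what", less "where"
expanded = up(bottleneck)               # (8, 64, 64): resolution restored
merged = np.concatenate([expanded, skip], axis=0)   # (16, 64, 64)
print(merged.shape)                     # (16, 64, 64)
```

The concatenation doubles the channel count, giving the expansion phase direct access to the high-resolution “where” information that pooling discarded.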
Author Results
Ronneberger et al. (2015), ISBI cell tracking challenge.
