Lecture 6
Linear Processing
ch. 5 of Machine Vision by Wesley E. Snyder & Hairong Qi
Spring 2023
16-725 (CMU RI) : BioE 2630 (Pitt)
Dr. John Galeotti
The content of these slides by John Galeotti, © 2012 - 2023 Carnegie Mellon University (CMU), was made possible in part by NIH NLM contract#
HHSN276201000580P, and is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License. To view a copy of this
license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco,
California, 94105, USA. Permissions beyond the scope of this license may be available either from CMU or by emailing
[email protected].
The most recent version of these slides may be accessed online via http://itk.galeotti.net/
Linear Operators
“If and only if”
§D is a linear operator iff:
D( αf1 + βf2 ) = αD( f1 ) + βD( f2 )
Where f1 and f2 are images,
and α and β are scalar multipliers
§Not a linear operator (why?):
g = D( f ) = af + b
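The failure is easy to confirm numerically: the additive offset b makes this operator affine, not linear. A minimal numpy sketch (the image sizes and the values of a, b, α, β are arbitrary):

```python
import numpy as np

def D(f, a=2.0, b=5.0):
    # The operator from the slide: g = D(f) = a*f + b
    return a * f + b

rng = np.random.default_rng(0)
f1 = rng.random((4, 4))
f2 = rng.random((4, 4))
alpha, beta = 3.0, -1.0

lhs = D(alpha * f1 + beta * f2)       # D(alpha*f1 + beta*f2)
rhs = alpha * D(f1) + beta * D(f2)    # alpha*D(f1) + beta*D(f2)

# The offset b breaks linearity: lhs and rhs differ by (1 - alpha - beta)*b
print(np.allclose(lhs, rhs))                                  # False
print(np.allclose(lhs - rhs, (1 - alpha - beta) * 5.0))       # True
```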
Kernel Operators
§Kernel (h) = “small image”
 § Often 3x3 or 5x5

       h-1,-1  h0,-1  h1,-1
  h =  h-1,0   h0,0   h1,0
       h-1,1   h0,1   h1,1

§Correlated with a “normal” image ( f )

       f0,0  f1,0  f2,0  f3,0  f4,0
       f0,1  f1,1  f2,1  f3,1  f4,1
  f =  f0,2  f1,2  f2,2  f3,2  f4,2
       f0,3  f1,3  f2,3  f3,3  f4,3
       f0,4  f1,4  f2,4  f3,4  f4,4
§Implied correlation (sum of products) makes a kernel an
operator. A linear operator.
§Note: This use of correlation is often mislabeled as
convolution in the literature.
§Any linear operator applied to an image can be
approximated with correlation.
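As a concrete sketch, the implied correlation slides the kernel across the image and takes a sum of products at every position. A minimal "valid-region" implementation (the 5x5 image values are made up; a 3x3 box-blur kernel stands in for h):

```python
import numpy as np

def correlate(f, h):
    """Correlation (sum of products) of image f with kernel h.
    Output is 'valid' size: the kernel stays fully inside the image."""
    kr, kc = h.shape
    out = np.empty((f.shape[0] - kr + 1, f.shape[1] - kc + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(f[y:y + kr, x:x + kc] * h)
    return out

f = np.arange(25, dtype=float).reshape(5, 5)   # made-up 5x5 image
h = np.ones((3, 3)) / 9.0                      # 3x3 box-blur kernel

g = correlate(f, h)
print(g.shape)   # (3, 3)
print(g[0, 0])   # mean of f[0:3, 0:3] = 6.0
```

Because correlation is a sum of products, it satisfies the linearity test from the previous slide: correlating a scaled image gives a scaled result.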
Kernels for Derivatives
§Task: estimate partial spatial derivatives
§Solution: numerical approximation
§ [ f (x + 1) - f (x) ]/1
§ Really Bad choice: not even symmetric
§ [ f (x + 1) - f (x - 1) ]/2
§ Still a bad choice: very sensitive to noise
§ We need to blur away the noise (only blur orthogonal to
the direction of each partial):
          −1 0 1                      −1 0 1
 ∂f/∂x = ⅙ −1 0 1  ⊗ f    or  ∂f/∂x = ⅛ −2 0 2  ⊗ f
          −1 0 1                      −1 0 1

 The Sobel kernel (on the right) is center-weighted.
 ⊗ denotes correlation (sum of products).
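Both derivative kernels can be sanity-checked on an image whose true gradient is known: on the plane f(x, y) = 3x, either kernel should return ∂f/∂x = 3 everywhere (a sketch; `correlate_valid` is the same sum-of-products loop as above, and the image size is arbitrary):

```python
import numpy as np

# The two df/dx kernels from the slide: uniform-weight (1/6) and Sobel (1/8)
k_uniform = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]]) / 6.0
k_sobel = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]]) / 8.0

def correlate_valid(f, h):
    # Correlation (sum of products) over the valid region
    kr, kc = h.shape
    out = np.empty((f.shape[0] - kr + 1, f.shape[1] - kc + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(f[y:y + kr, x:x + kc] * h)
    return out

f = np.fromfunction(lambda y, x: 3.0 * x, (6, 6))  # plane with df/dx = 3
gx_uniform = correlate_valid(f, k_uniform)
gx_sobel = correlate_valid(f, k_sobel)
print(np.allclose(gx_uniform, 3.0), np.allclose(gx_sobel, 3.0))  # True True
```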
Derivative Estimation #2:
Use Function Fitting
§ Think of the image as a surface
§ The gradient then fully specifies the orientation of the tangent
planes at every point, and vice-versa.
§ So, fit a plane to the neighborhood around a point
§ Then the plane gives you the gradient
§ The concept of fitting occurs frequently in machine vision.
Ex:
§ Gray values
§ Surfaces
§ Lines
§ Curves
§ Etc.
Derivative Estimation: Derive a
3x3 Kernel by Fitting a Plane
§ If you fit by minimizing squared error, and you use symbolic
notation to generalize, you get:
§ A headache
§ The kernel that we intuitively guessed earlier:

        −1 0 1
  (1/6) −1 0 1
        −1 0 1
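The headache can be delegated to a least-squares solver: fit the plane z = ax + by + c over the 3x3 neighborhood (offsets −1, 0, 1) and read off the weights that compute a = ∂f/∂x. They reproduce the 1/6 kernel exactly. A sketch:

```python
import numpy as np

# Design matrix: one row [x, y, 1] per pixel offset in the 3x3 neighborhood
offsets = [(x, y) for y in (-1, 0, 1) for x in (-1, 0, 1)]
A = np.array([[x, y, 1.0] for (x, y) in offsets])

# Least-squares plane fit z ~ a*x + b*y + c  =>  [a b c]^T = pinv(A) @ z.
# Row 0 of pinv(A) holds the weights that produce a = df/dx; reshaped into
# 3x3 form, it is exactly the 1/6 kernel guessed earlier.
P = np.linalg.pinv(A)
ddx_kernel = P[0].reshape(3, 3)
print(np.round(ddx_kernel * 6))  # [[-1 0 1], [-1 0 1], [-1 0 1]]
```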
Vector Representations of
Images
§ Also called lexicographic representations
§ Linearize the image
§ Pixels have a single index (that starts at 0)

The 4x4 image:              Change of coordinates:

  7 4 6 1    f0,0 f1,0 f2,0 f3,0    F0  F1  F2  F3
  3 5 9 0    f0,1 f1,1 f2,1 f3,1    F4  F5  F6  F7
  8 1 4 5    f0,2 f1,2 f2,2 f3,2    F8  F9  F10 F11
  2 0 7 2    f0,3 f1,3 f2,3 f3,3    F12 F13 F14 F15

Vector listing of pixel values (0 is the lexicographic index of F0 = 7):

  F = [ 7 4 6 1 3 5 9 0 8 1 4 5 2 0 7 2 ]T
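In row-major (lexicographic) order this is just a flatten; pixel (x, y) in a width-W image gets index yW + x. Using the 4x4 image above:

```python
import numpy as np

# The 4x4 image from the slide
img = np.array([[7, 4, 6, 1],
                [3, 5, 9, 0],
                [8, 1, 4, 5],
                [2, 0, 7, 2]])

# Lexicographic (row-major) linearization: pixel (x, y) -> index y*W + x
F = img.flatten()
print(F)      # [7 4 6 1 3 5 9 0 8 1 4 5 2 0 7 2]
print(F[0])   # F0 = 7
print(F[9])   # pixel (1, 2) -> index 2*4 + 1 = 9 -> value 1
```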
Vector Representations of Kernels
§ Can also linearize a kernel
 § Linearization is unique for each pixel coordinate and for each image size.
§ For pixel coordinate (1,2) (i.e. pixel F9) in our image:

       −3 1 2      F0  F1  F2  F3
  h =  −5 4 6      F4  F5  F6  F7
       −7 9 8      F8  F9  F10 F11
                   F12 F13 F14 F15

  H9  = [ 0 0 0 0 −3 1 2 0 −5 4 6 0 −7 9 8 0 ]T
  H10 = [ 0 0 0 0 0 −3 1 2 0 −5 4 6 0 −7 9 8 ]T

§ Can combine the kernel vectors for each of the pixels into a single lexicographic kernel matrix (H)
 § This is HUGE (N²)
§ H is circulant (columns are rotations of one another). Why?

Columns H5, H9, H10 of H:

   H5  H9  H10
   −3   0   0
    1   0   0
    2   0   0
    0   0   0
   −5  −3   0
    4   1  −3
    6   2   1
    0   0   2
   −7  −5   0
    9   4  −5
    8   6   4
    0   0   6
    0  −7   0
    0   9  −7
    0   8   9
    0   0   8
Convolution in Lexicographic
Representations
§Convolution becomes matrix multiplication!
§Great conceptual tool for proving theorems
§H is almost never computed or written out
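Still, H is worth constructing once at toy scale to see the claim hold: each kernel vector is the linearized kernel shifted to one pixel (zero-padded at the image boundary), and multiplying by the image vector reproduces correlation. A sketch using the slide's kernel and 4x4 image:

```python
import numpy as np

h = np.array([[-3, 1, 2],
              [-5, 4, 6],
              [-7, 9, 8]])
img = np.array([[7, 4, 6, 1],
                [3, 5, 9, 0],
                [8, 1, 4, 5],
                [2, 0, 7, 2]])
N = 16  # 4x4 image -> 16x16 lexicographic kernel matrix H

# One linearized, shifted copy of the kernel per pixel (zero-padded boundary)
H = np.zeros((N, N))
for y in range(4):
    for x in range(4):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < 4 and 0 <= xx < 4:
                    H[y * 4 + x, yy * 4 + xx] = h[dy + 1, dx + 1]

# The kernel vector for pixel (1,2) = index 9 matches the slide
print(H[9].astype(int))  # [ 0  0  0  0 -3  1  2  0 -5  4  6  0 -7  9  8  0]

# Correlation is now a matrix multiplication: g = H F
g = (H @ img.flatten()).reshape(4, 4)
print(int(g[2, 1]))  # 44: the kernel's response at pixel (1,2)
```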
Basis Vectors for
(Sub)Images
§ Carefully choose a set of basis vectors (image patches) on which to project a sub-image (window) of size (x,y)
 § Is this lexicographic?
§ The basis vectors with the largest coefficients are the most like this sub-image.
§ If we choose meaningful basis vectors, this tells us something about the sub-image

Cartesian Basis Vectors:

  u1 = [ 1 0 0 0 0 0 0 0 0 ]T
  u2 = [ 0 1 0 0 0 0 0 0 0 ]T
  ⋮
  u9 = [ 0 0 0 0 0 0 0 0 1 ]T

Frei-Chen Basis Vectors (each a 3x3 patch; rows separated by “;”):

  u1 = [  1  √2   1 ;   0  0   0 ;  −1 −√2  −1 ]
  u2 = [  1   0  −1 ;  √2  0 −√2 ;   1   0  −1 ]
  u3 = [  0  −1  √2 ;   1  0  −1 ; −√2   1   0 ]
  u4 = [ √2  −1   0 ;  −1  0   1 ;   0   1 −√2 ]
  u5 = [  0   1   0 ;  −1  0  −1 ;   0   1   0 ]
  u6 = [ −1   0   1 ;   0  0   0 ;   1   0  −1 ]
  u7 = [  1  −2   1 ;  −2  4  −2 ;   1  −2   1 ]
  u8 = [ −2   1  −2 ;   1  4   1 ;  −2   1  −2 ]
  u9 = [  1   1   1 ;   1  1   1 ;   1   1   1 ]
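A sketch of projecting a window onto this basis. The patch values follow the standard published Frei-Chen set (u1–u4 span an "edge" subspace, u5–u8 a "line" subspace, u9 is the flat average); the 3x3 test window is made up:

```python
import numpy as np

s = np.sqrt(2.0)
# The nine Frei-Chen 3x3 patches (standard published values)
U = [np.array(m, dtype=float) for m in [
    [[1, s, 1], [0, 0, 0], [-1, -s, -1]],    # u1: horizontal edge
    [[1, 0, -1], [s, 0, -s], [1, 0, -1]],    # u2: vertical edge
    [[0, -1, s], [1, 0, -1], [-s, 1, 0]],    # u3: diagonal edge
    [[s, -1, 0], [-1, 0, 1], [0, 1, -s]],    # u4: diagonal edge
    [[0, 1, 0], [-1, 0, -1], [0, 1, 0]],     # u5: line
    [[-1, 0, 1], [0, 0, 0], [1, 0, -1]],     # u6: line
    [[1, -2, 1], [-2, 4, -2], [1, -2, 1]],   # u7: line
    [[-2, 1, -2], [1, 4, 1], [-2, 1, -2]],   # u8: line
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],       # u9: average
]]

# The patches are mutually orthogonal, so they form a basis for 3x3 windows
ok = all(abs(np.sum(U[i] * U[j])) < 1e-12
         for i in range(9) for j in range(i + 1, 9))
print(ok)  # True

# Project a window onto the (normalized) basis; the largest |coefficient|
# names the patch the window most resembles
window = np.array([[0, 0, 0], [0, 0, 0], [9, 9, 9]], dtype=float)
coeffs = [np.sum(window * u) / np.linalg.norm(u) for u in U]
print(int(np.argmax(np.abs(coeffs))))  # 0 -> u1, the horizontal step edge
```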
Edge Detection (VERY IMPORTANT)
§ Image areas where:
 § Brightness changes suddenly
 § Some derivative has a large magnitude
§ Often occur at object boundaries!
§ Find by:
 § Estimating partial derivatives with kernels
 § Calculating magnitude and direction from partials

Edge types, from easy to find to harder to find:
 § Positive step edge
 § Negative step edge
 § Positive roof edge
 § Negative roof edge
 § Positive ramp edges
 § Negative ramp edges
 § Noisy positive edge
 § Noisy negative edge
Edge Detection
Diatom image (left) and its gradient magnitude (right).
(http://bigwww.epfl.ch/thevenaz/differentials/)

  ∇f = [ ∂f/∂x  ∂f/∂y ]T ≡ [ Gx  Gy ]T

  |∇f| = √( Gx² + Gy² ) = Edge Strength

  ∠∇f = atan( Gx / Gy )

Then threshold the gradient magnitude image.

Detected edges are:
• Too thick in places
• Missing in places
• Extraneous in places
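A sketch of the whole chain on an image whose gradient is known everywhere (the plane f = 2x + y, so Gx = 2, Gy = 1), using the Sobel pair for the partials; the threshold value is arbitrary:

```python
import numpy as np

def correlate_valid(f, h):
    # Correlation (sum of products) over the valid region
    kr, kc = h.shape
    out = np.empty((f.shape[0] - kr + 1, f.shape[1] - kc + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(f[y:y + kr, x:x + kc] * h)
    return out

sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]) / 8.0
sobel_y = sobel_x.T

f = np.fromfunction(lambda y, x: 2.0 * x + 1.0 * y, (6, 6))  # Gx=2, Gy=1
Gx = correlate_valid(f, sobel_x)
Gy = correlate_valid(f, sobel_y)

strength = np.sqrt(Gx**2 + Gy**2)   # edge strength |grad f|
edges = strength > 1.0              # threshold the magnitude image
print(float(strength[0, 0]))        # sqrt(5) ~ 2.236
```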
Convolving w/ Fourier
§ Sometimes, the fastest way to convolve is to multiply in the frequency domain.
§ Multiplication is fast. Fourier transforms are not.
 § The Fast Fourier Transform (FFT) helps
§ Pratt (Snyder ref. 5.33) figured out the details
§ Complex tradeoff depending on both the size of the kernel and the size of the image:
 § For kernels ≤ 7x7, normal (spatial domain) convolution is fastest*.
 § For kernels ≥ 13x13, the Fourier method is fastest*.
 *For almost all image sizes
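The frequency-domain route can be checked against a direct spatial computation. Multiplying the FFTs of the image and the zero-padded kernel gives circular convolution, which matches the direct wraparound sum (all sizes and values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((16, 16))   # image
h = rng.random((5, 5))     # kernel

# Zero-pad the kernel to image size, then multiply in the frequency domain.
# This computes *circular* convolution (spatial convolution with wraparound).
h_pad = np.zeros_like(f)
h_pad[:5, :5] = h
G = np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h_pad)).real

# Direct circular convolution for comparison: g(n) = sum_k h(k) f(n - k)
direct = np.zeros_like(f)
for dy in range(5):
    for dx in range(5):
        direct += h[dy, dx] * np.roll(np.roll(f, dy, axis=0), dx, axis=1)

print(np.allclose(G, direct))  # True
```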
Image Pyramids
§ A series of representations of
the same image
§ Each is a 2:1 subsampling of the image at the next “lower level.”
§ Subsampling = averaging = down sampling
§ The subsampling happens across all dimensions!
 § For a 2D image, 4 pixels in one layer correspond to 1 pixel in the next layer.

(Figure: stack of pyramid levels, with scale increasing toward the top.)
§ To make a Gaussian pyramid:
1. Blur with Gaussian
2. Down sample by 2:1 in each
dimension
3. Go to step 1
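The three steps translate directly into code. A sketch using a separable 5-tap binomial kernel as the Gaussian approximation (a common choice, not specified on the slide); the image values are random:

```python
import numpy as np

def blur(f):
    # Separable 5-tap binomial approximation to a Gaussian: [1 4 6 4 1]/16
    k = np.array([1, 4, 6, 4, 1]) / 16.0
    g = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, f)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, g)

def gaussian_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels - 1):
        img = blur(img)[::2, ::2]   # 1. blur with Gaussian, 2. 2:1 downsample
        pyr.append(img)             # 3. repeat on the new level
    return pyr

img = np.random.default_rng(2).random((64, 64))
pyr = gaussian_pyramid(img, 4)
print([p.shape for p in pyr])  # [(64, 64), (32, 32), (16, 16), (8, 8)]
```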
Scale Space
§ Multiple levels like a pyramid
§ Blur like a pyramid
§ But don’t subsample
§ All layers have the same size
§ Instead:
§ Convolve each layer with a Gaussian of variance σ.
§ σ is the “scale parameter”
§ Only large features are visible at high scale (large σ).
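A scale-space stack can be sketched like the pyramid, but every layer stays full size and only σ varies (the 3σ truncation radius for the kernel is a common choice, not from the slides):

```python
import numpy as np

def gauss_kernel_1d(sigma):
    # Sampled, normalized 1D Gaussian, truncated at about 3 sigma
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def scale_space(img, sigmas):
    # One same-size layer per scale: blur (separably), never subsample
    layers = []
    for s in sigmas:
        k = gauss_kernel_1d(s)
        g = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
        g = np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, g)
        layers.append(g)
    return layers

img = np.random.default_rng(3).random((32, 32))
layers = scale_space(img, sigmas=[1.0, 2.0, 4.0])
print({l.shape for l in layers})  # {(32, 32)}: all layers the same size
```

Heavier blurring flattens fine (noise) structure, which is why only large features survive at large σ.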
Quad/Oc Trees

(Figure: quadtree decomposition of an image into labeled homogeneous blocks such as 0, 10, 12, 2, 31, 32, 33, and the corresponding tree with children 0 1 2 3 and 10 11 12 13.)

§ Represent an image
§ Homogeneous blocks
§ Inefficient for storage
 § Too much overhead
 § Not stable across small changes
§ But: Useful for representing scale space.
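A minimal quadtree sketch: split any block that is not homogeneous (uniform) into four quadrants, labeling the children 0–3 as in the figure. The 4x4 test image is made up:

```python
import numpy as np

def quadtree(img, y=0, x=0, size=None, label=''):
    """Recursively split a square image into homogeneous blocks.
    Returns {label: (y, x, size)} for each uniform block."""
    if size is None:
        size = img.shape[0]
    block = img[y:y + size, x:x + size]
    if size == 1 or np.all(block == block.flat[0]):
        return {label or 'root': (y, x, size)}
    half = size // 2
    out = {}
    for i, (dy, dx) in enumerate([(0, 0), (0, half), (half, 0), (half, half)]):
        out.update(quadtree(img, y + dy, x + dx, half, label + str(i)))
    return out

# Made-up 4x4 image: left half uniform, upper-right quadrant mixed
img = np.array([[1, 1, 2, 2],
                [1, 1, 2, 3],
                [1, 1, 4, 4],
                [1, 1, 4, 4]])
blocks = quadtree(img)
print(sorted(blocks))  # ['0', '10', '11', '12', '13', '2', '3']
```

The mixed quadrant splits all the way to single pixels, illustrating the storage overhead and the instability under small changes.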
Gaussian Scale Space
§ Large scale = only large objects are visible
§ Increasing σ → coarser representations
§ Scale space causality
§ Increasing σ → # extrema should not increase
§ Allows you to find “important” edges first at high scale.
§ How features vary with scale tells us something about the
image
§ Non-integral steps in scale can be used
§ Useful for representing:
§ Brightness
§ Texture
§ PDF (scale space implements clustering)
How do People Do It?
§ Receptive fields
§ Representable by Gabor
functions
§ 2D Gaussian +
§ A plane wave
§ The plane wave tends to
propagate along the short axis of
the Gaussian
§ But also representable by
Difference of offset Gaussians
§ Only 3 extrema
Canny Edge Detector
1. Use kernels to find at every point:
§ Gradient magnitude
§ Gradient direction
2. Perform Nonmaximum suppression (NMS) on
the magnitude image
§ This thins edges that are too thick
§ Only preserve gradient magnitudes that are
maximum compared to their 2 neighbors in the
direction of the gradient
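Step 2 can be sketched as follows: for each interior pixel, quantize the gradient direction to one of four neighbor axes and keep the magnitude only if it beats both neighbors along that axis. The thick vertical "ridge" test image is made up:

```python
import numpy as np

def nms(mag, Gx, Gy):
    # Keep a gradient magnitude only if it is a maximum compared to its
    # 2 neighbors in the (quantized) direction of the gradient
    out = np.zeros_like(mag)
    for y in range(1, mag.shape[0] - 1):
        for x in range(1, mag.shape[1] - 1):
            angle = np.arctan2(Gy[y, x], Gx[y, x])
            d = int(np.round(angle / (np.pi / 4))) % 4  # 4 neighbor axes
            dy, dx = [(0, 1), (1, 1), (1, 0), (1, -1)][d]
            if mag[y, x] >= mag[y + dy, x + dx] and mag[y, x] >= mag[y - dy, x - dx]:
                out[y, x] = mag[y, x]
    return out

# A 3-pixel-thick vertical edge response; the gradient points in +x
mag = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
Gx = np.ones_like(mag)
Gy = np.zeros_like(mag)
thin = nms(mag, Gx, Gy)
print(thin[2])  # [0. 0. 3. 0. 0.]: only the ridge crest survives
```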
Canny Edge Detector, contd.
§ Edges are now properly located and 1 pixel wide
§ But noise leads to false edges, and noise+blur lead to
missing edges.
§ Help this with 2 thresholds
§ A high threshold does not get many false edges, and a low threshold
does not miss many edges.
§ Do a “flood fill” on the low threshold result, seeded by the high-
threshold result
§ Only flood fill along isophotes
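The two-threshold linking can be sketched as a breadth-first flood fill. This is a simplified version: it floods through any 8-connected neighbor above the low threshold, without the isophote restriction, and the 1-row magnitude image and threshold values are made up:

```python
import numpy as np
from collections import deque

def hysteresis(mag, lo, hi):
    """Two-threshold edge linking: seed from pixels >= hi, then flood
    fill through 8-connected neighbors that are >= lo."""
    strong = mag >= hi
    weak = mag >= lo
    edges = strong.copy()
    q = deque(zip(*np.nonzero(strong)))
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                yy, xx = y + dy, x + dx
                if (0 <= yy < mag.shape[0] and 0 <= xx < mag.shape[1]
                        and weak[yy, xx] and not edges[yy, xx]):
                    edges[yy, xx] = True
                    q.append((yy, xx))
    return edges

# A weak chain reachable from one strong pixel, plus an isolated weak
# pixel (index 5) that should be discarded as a likely false edge
mag = np.array([[0, 5, 10, 5, 0, 5, 0]], dtype=float)
print(hysteresis(mag, lo=4, hi=8).astype(int))  # [[0 1 1 1 0 0 0]]
```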