Convolution
Foundations of Data Analysis
April 19, 2022
Spatial Filters
Definition
A spatial filter is an image operation where each pixel
value I(u, v) is changed by a function of the intensities
of pixels in a neighborhood of (u, v).
What Spatial Filters Can Do
- Blurring/Smoothing (before → after images)
- Sharpening (before → after images)
- Weird Stuff (before → after images)
Example: The Mean of a Neighborhood
Consider taking the mean in a 3 × 3 neighborhood, i.e., the pixels (u + i, v + j) for i, j ∈ {−1, 0, 1}:

$$I'(u, v) = \frac{1}{9} \sum_{i=-1}^{1} \sum_{j=-1}^{1} I(u + i, v + j)$$
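As a concrete illustration (not from the slides), here is a minimal numpy sketch of this 3 × 3 mean; it loops over interior pixels only and leaves the one-pixel boundary untouched, a question we return to below:

```python
import numpy as np

def mean_filter_3x3(I):
    """3x3 neighborhood mean; boundary pixels are left unchanged."""
    out = I.astype(float).copy()
    for u in range(1, I.shape[0] - 1):
        for v in range(1, I.shape[1] - 1):
            # Average the 3x3 window centered at (u, v)
            out[u, v] = I[u - 1:u + 2, v - 1:v + 2].mean()
    return out
```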
How a Linear Spatial Filter Works
H is the filter “kernel” or “matrix”
For the neighborhood mean:

$$H = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
General Filter Equation
Notice that the kernel H is just a small image!
Let H : R_H → [0, K − 1], where R_H is the set of kernel offsets (i, j). Then

$$I'(u, v) = \sum_{(i,j) \in R_H} I(u + i, v + j) \cdot H(i, j)$$

This is known as the correlation of I and H.
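A direct numpy translation of this equation might look as follows; a sketch, assuming a square, odd-sized kernel stored so that H[i + R, j + R] holds H(i, j), with zero values outside the image:

```python
import numpy as np

def correlate(I, H):
    """Cross-correlation: I'(u, v) = sum over (i, j) of I(u+i, v+j) * H(i, j)."""
    R = H.shape[0] // 2                         # kernel radius
    Ipad = np.pad(I.astype(float), R)           # zero outside the image
    out = np.zeros(I.shape, dtype=float)
    for i in range(-R, R + 1):
        for j in range(-R, R + 1):
            # Add the shifted image weighted by H(i, j)
            out += H[i + R, j + R] * Ipad[R + i:R + i + I.shape[0],
                                          R + j:R + j + I.shape[1]]
    return out
```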
What Does This Filter Do?
$$H = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
Identity function (leaves image alone)
What Does This Filter Do?
$$H = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
Mean (averages neighborhood)
What Does This Filter Do?
$$H = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
Shift left by one pixel
What Does This Filter Do?
$$H = \frac{1}{9} \begin{bmatrix} -1 & -1 & -1 \\ -1 & 17 & -1 \\ -1 & -1 & -1 \end{bmatrix}$$
Sharpen (twice the identity minus the mean filter)
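These kernels are easy to try out. A sketch using scipy.ndimage.correlate (a library choice assumed here, not the slides' code), which implements exactly the correlation above, with zero padding when mode='constant':

```python
import numpy as np
from scipy import ndimage

identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], float)
mean     = np.ones((3, 3)) / 9
shift    = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], float)
sharpen  = np.array([[-1, -1, -1], [-1, 17, -1], [-1, -1, -1]], float) / 9

# The sharpen kernel really is 2*identity - mean, so it still sums to one
assert np.allclose(sharpen, 2 * identity - mean)

I = np.random.rand(64, 64)
for H in (identity, mean, shift, sharpen):
    out = ndimage.correlate(I, H, mode='constant')  # zero padding at the boundary
```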
Filter Normalization
- Notice that all of our filter examples sum to one
- Multiplying all entries in H by a constant c will cause the image to be multiplied by that constant:

$$I'(u, v) = \sum_{i,j} I(u + i, v + j) \cdot \big(c\,H(i, j)\big) = c \sum_{i,j} I(u + i, v + j) \cdot H(i, j)$$

- So to keep the overall brightness constant, we need H to sum to one
Effect of Filter Size
Mean Filters:
(Image panels: original, 7 × 7, 15 × 15, 41 × 41; larger kernels blur more.)
What To Do At The Boundary?
- Crop
- Pad
- Extend
- Wrap
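In numpy terms, the last three strategies correspond to modes of np.pad (crop means not padding at all and accepting a smaller output); a quick sketch:

```python
import numpy as np

I = np.arange(9).reshape(3, 3)

print(np.pad(I, 1, mode='constant'))  # Pad: surround with zeros
print(np.pad(I, 1, mode='edge'))      # Extend: repeat the edge values
print(np.pad(I, 1, mode='wrap'))      # Wrap: copy from the opposite side
```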
Convolution
Definition
Convolution of an image I by a kernel H is given by

$$I'(u, v) = \sum_{(i,j) \in R_H} I(u - i, v - j) \cdot H(i, j)$$

This is denoted: I′ = I ∗ H

- Notice this is the same as correlation with H, but with negative signs on the I indices
- Equivalent to vertical and horizontal flipping of H:

$$I'(u, v) = \sum_{(i,j) \in R_H} I(u + i, v + j) \cdot H(-i, -j)$$
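The flip relationship is easy to confirm numerically; a sketch using scipy.ndimage (assumed here), where np.flip reverses both axes of H:

```python
import numpy as np
from scipy import ndimage

I = np.random.rand(32, 32)
H = np.random.rand(3, 3)

conv = ndimage.convolve(I, H, mode='constant')
corr = ndimage.correlate(I, np.flip(H), mode='constant')  # H flipped in both axes

assert np.allclose(conv, corr)  # convolution == correlation with flipped kernel
```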
Linear Operators
Definition
A linear operator F on an image is a mapping from one image to another, I′ = F(I), that satisfies:
1. F(cI) = cF(I),
2. F(I1 + I2) = F(I1) + F(I2),
where I, I1, I2 are images, and c is a constant.
Both correlation and convolution are linear operators
Infinite Image Domains
Let’s define our image and kernel domains to be infinite:
Ω = ℤ × ℤ
Remember ℤ = {…, −2, −1, 0, 1, 2, …}
Now convolution is an infinite sum:

$$I'(u, v) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} I(u - i, v - j) \cdot H(i, j)$$
This is denoted I 0 = I ∗ H .
Infinite Image Domains
The infinite image domain Ω = Z × Z is just a trick to
make the theory of convolution work out.
We can still imagine that the image is defined on a
bounded (finite) domain, [0, w] × [0, h], and is set to
zero outside of this.
Properties of Convolution
Commutativity:
I ∗ H = H ∗ I
This means that we can think of the image as the kernel
and the kernel as the image and get the same result.
In other words, we can leave the image fixed and slide
the kernel or leave the kernel fixed and slide the image.
Properties of Convolution
Associativity:
(I ∗ H1) ∗ H2 = I ∗ (H1 ∗ H2)
This means that we can apply H1 to I followed by H2, or we can first convolve the kernels, H1 ∗ H2, and then apply the resulting kernel to I.
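Both commutativity and associativity can be checked numerically; a sketch with scipy.signal.convolve2d, using 'full' outputs so no boundary terms are discarded:

```python
import numpy as np
from scipy.signal import convolve2d

I  = np.random.rand(16, 16)
H1 = np.random.rand(3, 3)
H2 = np.random.rand(5, 5)

# Commutativity: I * H1 == H1 * I
assert np.allclose(convolve2d(I, H1, 'full'), convolve2d(H1, I, 'full'))

# Associativity: filter twice, or pre-convolve the two kernels once
two_pass = convolve2d(convolve2d(I, H1, 'full'), H2, 'full')
one_pass = convolve2d(I, convolve2d(H1, H2, 'full'), 'full')
assert np.allclose(two_pass, one_pass)
```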
Properties of Convolution
Linearity:
(a · I) ∗ H = a · (I ∗ H)
(I1 + I2 ) ∗ H = (I1 ∗ H) + (I2 ∗ H)
This means that we can multiply an image by a constant
before or after convolution, and we can add two images
before or after convolution and get the same results.
Properties of Convolution
Shift-Invariance:
Let S be the operator that shifts an image I :
S(I)(u, v) = I(u + a, v + b)
Then
S(I ∗ H) = S(I) ∗ H
This means that we can convolve I and H and then shift
the result, or we can shift I and then convolve it with H .
Properties of Convolution
Theorem: The only shift-invariant, linear operators on
images are convolutions.
Computational Complexity of Convolution
If my image I has size M × N and my kernel H has size
(2R + 1) × (2R + 1), then what is the complexity of
convolution?
$$I'(u, v) = \sum_{i=-R}^{R} \sum_{j=-R}^{R} I(u - i, v - j) \cdot H(i, j)$$

Answer: O(MN(2R + 1)(2R + 1)) = O(MNR²).
Or, if we consider the image size fixed, O(R²).
Which is More Expensive?
The following both shift the image 10 pixels to the left:
1. Convolve with a 21 × 21 shift operator (all zeros
with a 1 on the right edge)
2. Repeatedly convolve with a 3 × 3 shift operator 10
times
The first method requires 21² · wh = 441 · wh multiplications.
The second method requires (9 · wh) · 10 = 90 · wh multiplications.
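Both approaches produce the same shifted image, which a short sketch can confirm (the kernel layouts below are an assumption; with correlation, a 1 at horizontal offset +1 shifts the image left by one pixel):

```python
import numpy as np
from scipy import ndimage

I = np.random.rand(128, 128)

big = np.zeros((21, 21)); big[10, 20] = 1      # one shift of 10 pixels
small = np.zeros((3, 3)); small[1, 2] = 1      # one shift of 1 pixel

out_big = ndimage.correlate(I, big, mode='constant')

out_small = I
for _ in range(10):                            # ten shifts of 1 pixel each
    out_small = ndimage.correlate(out_small, small, mode='constant')

assert np.allclose(out_big, out_small)
```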
Some More Filters
(Kernel images: box, Gaussian, Laplace.)
Edge Detection
What is an Edge?
(Plot: image value vs. x-position.)
An abrupt transition in intensity between two regions
What is an Edge?
(Plot: image x-derivative vs. x-position.)
Image derivatives are high (or low) at edges
Review: Derivative of a Function
Given a function f : ℝ → ℝ, its derivative is defined as

$$\frac{df}{dx}(x) = \lim_{\varepsilon \to 0} \frac{f(x + \varepsilon) - f(x)}{\varepsilon}$$

The derivative of f is the slope of the tangent to the graph of f.
Derivatives of Discrete Functions
(Plots: a discrete function f(x) defined on integer values of x.)
- The slopes (derivatives) don't match on the left and right of a sample point.
- Instead, take the average of the two (the slope of the secant).
Finite Differences
Forward Difference
∆+f(x) = f(x + 1) − f(x)   (right slope)

Backward Difference
∆−f(x) = f(x) − f(x − 1)   (left slope)

Central Difference
∆f(x) = ½(f(x + 1) − f(x − 1))   (average slope)
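A quick numpy sketch of the three differences, using f(x) = x² sampled at the integers (the central difference is exact for quadratics):

```python
import numpy as np

f = np.array([0., 1., 4., 9., 16., 25.])   # f(x) = x^2 at x = 0..5

forward  = f[1:] - f[:-1]                  # right slopes, at x = 0..4
backward = f[1:] - f[:-1]                  # same numbers, attached to x = 1..5
central  = 0.5 * (f[2:] - f[:-2])          # average slopes, at x = 1..4

print(central)                             # [2. 4. 6. 8.] == f'(x) = 2x
```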
Finite Differences as Convolutions
Forward Difference
∆+f(x) = f(x + 1) − f(x)
Take a convolution kernel: H = [1 −1 0]
∆+f = f ∗ H
(Remember that the kernel H is flipped in convolution)
Finite Differences as Convolutions
Central Difference
∆f(x) = ½(f(x + 1) − f(x − 1))

The convolution kernel here is: H = [½ 0 −½]

∆f = f ∗ H

Notice: derivative kernels sum to zero!
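np.convolve performs true convolution (it flips the kernel internally), so the central difference falls out directly; a small sketch:

```python
import numpy as np

f = np.array([0., 1., 4., 9., 16., 25.])   # f(x) = x^2 again
H = np.array([0.5, 0.0, -0.5])             # central-difference kernel

df = np.convolve(f, H, mode='same')        # true convolution: H is flipped
print(df[1:-1])                            # interior values: [2. 4. 6. 8.]

print(H.sum())                             # 0.0 -- derivative kernels sum to zero
```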
Derivatives of Images
- Images have two parameters: I(x, y)
- We can take derivatives with respect to x or y
- Central differences: ∆x I = I ∗ Hx and ∆y I = I ∗ Hy, where Hx = [0.5 0 −0.5] and Hy = [0.5 0 −0.5]ᵀ (the same kernel as a column)
Derivatives of Images
x-derivative using central difference:
(Figure: I ∗ [½ 0 −½] = x-derivative image.)
Derivatives of Images
y-derivative using central difference:
(Figure: I ∗ [½ 0 −½]ᵀ = y-derivative image.)
Combining x and y Derivatives
The discrete gradient of I(x, y) is the 2D vector:

$$\nabla I(x, y) = \begin{bmatrix} \Delta_x I(x, y) \\ \Delta_y I(x, y) \end{bmatrix}$$

The gradient magnitude is

$$\|\nabla I(x, y)\| = \sqrt{(\Delta_x I(x, y))^2 + (\Delta_y I(x, y))^2}$$
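Putting the pieces together, a sketch of the discrete gradient and its magnitude with scipy.ndimage (an assumed library choice; sign and axis conventions vary between implementations):

```python
import numpy as np
from scipy import ndimage

I = np.random.rand(64, 64)

Hx = np.array([[0.5, 0.0, -0.5]])          # 1x3 row kernel
Hy = Hx.T                                  # 3x1 column kernel

dx = ndimage.convolve(I, Hx, mode='nearest')   # x-derivative image
dy = ndimage.convolve(I, Hy, mode='nearest')   # y-derivative image

grad_mag = np.sqrt(dx**2 + dy**2)          # ||gradient|| at every pixel
```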
Image Gradient
- Gradient points in the direction of maximal increasing intensity
- Length (magnitude) of the gradient equals the amount of change in that direction
- Gradient is perpendicular (90 degrees) to the edge contour
Convolutional Neural Networks
(CNNs)
Learning a Filter
$$H = \begin{bmatrix} w_1 & w_2 & w_3 \\ w_4 & w_5 & w_6 \\ w_7 & w_8 & w_9 \end{bmatrix}$$

(Figure: an image convolved with this kernel of unknown weights produces "?".)
Filter consists of weights that need to be learned.
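The slides give no code for this, but as an illustration, here is a minimal PyTorch sketch (the framework is an assumption) that recovers a known 3 × 3 mean filter from input/output pairs by gradient descent:

```python
import torch
import torch.nn.functional as F

target_H = torch.full((1, 1, 3, 3), 1.0 / 9)      # the filter we hope to recover
H = torch.randn(1, 1, 3, 3, requires_grad=True)   # w1..w9, randomly initialized
opt = torch.optim.SGD([H], lr=0.1)

for step in range(500):
    I = torch.randn(1, 1, 16, 16)                 # random training image
    target = F.conv2d(I, target_H, padding=1)     # "ground truth" filtered image
    pred = F.conv2d(I, H, padding=1)              # output with current weights
    loss = F.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(H.detach().squeeze())                       # approaches the 1/9 mean kernel
```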
Convolutional Neural Networks
(Figure source: http://towardsdatascience.com/)