Lecture 2
Image Basics and Filtering
1
Content
• Color
• Types of images
• Image formation
• Image sampling and quantization
• Image interpolation
• Histogram
• Domain transformations
• Affine image transformations
• Range (intensity) transformations
• Noise reduction through spatial filtering
• Filtering as cross-correlation
• Convolution
• Nonlinear (median) filtering
2
What is color
• The result of interaction between physical light in the
environment and our visual system.
• A psychological property of our visual experiences
when we look at objects and lights, not a physical
property of those objects or lights.
3
Linear color spaces
• Defined by a choice of three primaries
• The coordinates of a color are given by the weights of
the primaries used to match it
4
RGB space
• Primaries are monochromatic lights (for monitors, they
correspond to the three types of phosphors)
5
Nonlinear color spaces: HSV
• Perceptually meaningful dimensions: Hue, Saturation,
Value (Intensity)
• RGB cube on its vertex
6
Uses of color: examples
Skin detection, segmentation, and retrieval
7
Types of images
8
Color Image - one channel
9
Color image representation
10
What is an image?
Think of an image as a function, f, from R² to R:
• f( x, y ) gives the intensity at position ( x, y )
• Realistically, images are defined over a rectangle:
Color image = three functions pasted together
11
An image as a function
Bright regions are high, dark regions are low
12
Digital images
In computer vision we usually operate on
digital (discrete) images:
• Sample the 2D space on a regular grid
• Quantize each sample (round to nearest integer)
• Each sample is a “pixel” (picture element)
• If 1 byte for each pixel, values range from 0 to 255
62 79 23 119 120 105 4 0
10 10 9 62 12 78 34 0
10 58 197 46 46 0 0 48
176 135 5 188 191 68 0 49
2 1 1 29 26 37 0 77
0 89 144 147 187 102 62 208
255 252 0 166 123 62 0 31
166 63 127 17 1 0 99 30
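The sample-then-quantize pipeline above can be sketched in a few lines; the continuous function f and the 8x8 grid size here are made up for illustration:

```python
import numpy as np

# Hypothetical continuous image: a smooth function f(x, y) with values in [0, 1].
def f(x, y):
    return 0.5 * (np.sin(x) * np.cos(y) + 1.0)

# 1. Sample the 2D space on a regular 8x8 grid.
xs = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
samples = f(xs[None, :], xs[:, None])

# 2. Quantize each sample: scale to [0, 255] and round to the nearest
#    integer, so each pixel fits in one byte.
pixels = np.round(samples * 255).astype(np.uint8)

print(pixels.shape, pixels.dtype)  # (8, 8) uint8
```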
13
Images as coordinates
14
Grayscale image representation
15
Binary image representation
16
Image formation
17
Camera obscura
• Add a barrier to block off most of the rays
• This reduces blurring
• The opening is known as the aperture
18
Effects of the Aperture Size
• A large aperture makes the image blurry because a
cone of light is let through from each world point
19
Effects of the Aperture Size
• Shrinking the aperture makes the image sharper
• The ideal aperture is a pinhole that only lets through
one ray of light from each world point
20
Why not make the aperture as small as possible?
• With small apertures, less light gets through → must increase the exposure time
• If the aperture gets too small, diffraction effects start to appear
21
Image formation using a converging lens
• A thin converging lens focuses light onto the film
satisfying two properties:
22
Image formation using a converging lens
• A thin converging lens focuses light onto the film
satisfying two properties:
1. Rays passing through the Optical Center are not
deviated
23
Image formation using a converging lens
• A thin converging lens focuses light onto the film
satisfying two properties:
1. Rays passing through the Optical Center are not
deviated
2. All rays parallel to the Optical Axis converge at the
Focal Point
24
Thin lens equation
• Find a relationship between 𝑓, Z and 𝑒
25
Thin lens equation
• Find a relationship between 𝑓, Z and 𝑒
26
Thin lens equation
• Find a relationship between 𝑓, Z and 𝑒
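The derivation itself lives in the figures; for reference, the standard thin lens result relating the three quantities (obtained from the similar-triangle constructions given by properties 1 and 2) is:

```latex
% Thin lens equation: f = focal length, Z = object distance,
% e = distance from the lens to the image plane.
\[
  \frac{1}{f} \;=\; \frac{1}{Z} \;+\; \frac{1}{e}
\]
% Limiting case: as Z -> infinity, 1/Z -> 0 and e -> f,
% i.e. distant objects come into focus at the focal plane.
```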
27
“In focus”
• For a given point on the object, there is a specific
distance between the lens and the film, at which the
object appears in focus in the image
• Other points project to a blur circle in the image
28
Blur Circle
29
The Pin-hole approximation
30
The Pin-hole approximation
31
Perspective effects
• Far away objects appear smaller, with size inversely
proportional to distance
32
Perspective Projection: what is preserved or
lost?
• Straight lines are still straight
• Lengths and angles are not preserved
33
Vanishing points and lines
• Parallel lines intersect at a “vanishing point” in the image
• Parallel planes intersect at a “vanishing line” in the image
• Note that vanishing points can fall either inside or outside the image
34
Focus and Depth of Field
• The Depth of Field is the distance between the nearest
and farthest objects in a scene that appear acceptably
sharp in an image
35
Focus and Depth of Field
• A smaller aperture increases the depth of field but
reduces the amount of light into the camera: recall the
definition of blur circle (it reduces with aperture)
36
Field of View (FOV)
• The FOV is the angular portion of the 3D scene seen by the camera
37
Field of View (FOV)
38
Perspective Camera
39
Perspective Camera
40
Perspective Camera
• For convenience, the image plane is usually
represented in front of the lens, C, such that the image
preserves the same orientation (i.e. not flipped)
41
From World to Pixel coordinates
42
Perspective Projection – 1/4
43
Perspective Projection – 2/4
44
Perspective Projection – 3/4
45
Perspective Projection – 4/4
46
Radial Distortion
47
Radial & Tangential Distortion
48
Summary: Perspective projection equations
https://docs.opencv.org/2.4.13.3/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
49
Exercise 1
50
Exercise 1 - Solution
51
Digital camera
52
Image Formation: Basics
[Figure: illumination i(x,y) strikes the scene, reflectance r(x,y) returns it, forming the image f(x,y)]
(from Gonzalez & Woods, 2008)
53
Image Formation: Basics
Image f(x,y) is characterized by 2 components
1. Illumination i(x,y) = Amount of source
illumination incident on scene
2. Reflectance r(x,y) = Amount of illumination
reflected by objects in the scene
f (x, y) = i(x, y)r(x, y)
where
0 < i(x, y) < ∞ and 0 < r(x, y) < 1
r(x,y) depends on object properties
(r = 0 would mean total absorption, r = 1 total reflection)
54
Image Formation: Basics
f (x, y) = i(x, y)r(x, y)
where
0 < i(x, y) < ∞ and 0 < r(x, y) < 1
Typical values of i(x,y):
• Sun on a clear day: 90,000 lm/m2
• Cloudy day: 10,000 lm/m2
• Inside an office: 1000 lm/m2
Typical values of r(x,y)
• Black velvet: 0.01, Stainless steel: 0.65, Snow: 0.93
Typical limits of f(x,y) in an office environment
• 10 < f(x,y) < 1000
• Shifted to gray scale [0, L-1]; 0 = black, L-1 = 255 = white
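A quick numeric check of f(x, y) = i(x, y) r(x, y) using the typical values quoted above; the linear mapping of the office range onto [0, L-1] is one simple choice, not the only one:

```python
# f = i * r with the typical values from the slide.
i_office = 1000.0  # lm/m2, inside an office

r_values = {"black velvet": 0.01, "stainless steel": 0.65, "snow": 0.93}
f_values = {name: i_office * r for name, r in r_values.items()}

# Linearly shift/scale the office range 10 < f < 1000 to gray levels [0, L-1]:
L = 256
f_min, f_max = 10.0, 1000.0
gray = {name: round((f - f_min) / (f_max - f_min) * (L - 1))
        for name, f in f_values.items()}
print(gray)  # {'black velvet': 0, 'stainless steel': 165, 'snow': 237}
```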
55
Sampling and Quantization Process
(from Gonzalez & Woods, 2008)
56
Example of a Quantized 2D Image
57
Errors due to Sampling
58
Resolution
• A sampling parameter, defined in dots per inch (DPI) or equivalent measures of spatial pixel density; 72 dpi is a traditional reference value for screens
59
Images are Sampled and Quantized
• “grayscale”
• (or “intensity”): [0,255]
• “color”
• RGB: [R, G, B]
• Lab: [L, a, b]
• HSV: [H, S, V]
“grayscale” “color”
60
Suppose we want to zoom an image
Original image → zoomed image: need to fill in values for the new pixels
61
Interpolation
Original → zoomed: need to fill in the missing values
Nearest Neighbor Interpolation: for each new pixel, copy the value of the nearest original pixel
62
Nearest Neighbor Interpolation
Can we do better?
Original image
Zoomed image
63
Other image interpolation techniques
Bilinear interpolation:
Compute pixel value v(x,y) as:
v(x, y) = ax + by + cxy + d
a, b, c, d determined from four nearest
neighbors of (x,y)
Bicubic interpolation:
(Used in most commercial image editing
programs, e.g., Photoshop)
v(x, y) = Σ_{i=0..3} Σ_{j=0..3} a_ij x^i y^j
a_ij determined from 16 nearest neighbors of (x,y)
(from http://www.cambridgeincolour.com/tutorials/image-interpolation.htm)
See also http://en.wikipedia.org/wiki/Bilinear_interpolation
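A sketch of bilinear interpolation matching the v(x, y) = ax + by + cxy + d formula above; the function name and tiny test image are made up, and border handling is ignored (interior points only):

```python
import numpy as np

def bilinear(img, x, y):
    """Bilinear interpolation from the four nearest neighbors of (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    dx, dy = x - x0, y - y0
    # Weighted average of the four surrounding pixels -- algebraically
    # the same as solving v(x, y) = ax + by + cxy + d for a, b, c, d.
    return ((1 - dx) * (1 - dy) * img[y0, x0] +
            dx * (1 - dy) * img[y0, x1] +
            (1 - dx) * dy * img[y1, x0] +
            dx * dy * img[y1, x1])

img = np.array([[0., 100.], [100., 200.]])
print(bilinear(img, 0.5, 0.5))  # 100.0 (center of the 2x2 block)
```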
64
Comparison of Interpolation Techniques
Nearest Neighbor Bilinear Bicubic
65
Histogram
• The histogram of an image gives the frequency of each brightness (intensity) value in the image.
• Histogram captures the distribution of gray levels in the
image.
• How frequently each gray level occurs in the image
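Counting how frequently each gray level occurs is a one-liner over the pixels; the tiny 8-bit image below is made up:

```python
from collections import Counter

# Histogram of a tiny 8-bit image: how often each gray level occurs.
img = [[0, 0, 255],
       [128, 255, 255]]

counts = Counter(v for row in img for v in row)
hist = [counts.get(level, 0) for level in range(256)]
print(hist[0], hist[128], hist[255])  # 2 1 3

# Normalizing gives the gray-level distribution (relative frequencies):
n = sum(hist)
p = [c / n for c in hist]
```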
66
Histogram
67
Histogram - use case
68
Histogram - use case
69
Image processing
An image processing operation converts an
existing image f to a new image g
Can transform either the domain or range of f
70
Image processing
Range transformation:
(What is an example?)
Noise filtering
71
Image Processing
Domain transformation:
(What is an example?)
Translation
Rotation
72
Recall
Domain transformation:
(What is an example?)
Translation
Rotation
How are these done?
73
Geometric spatial transformations of images
Two steps:
1. Spatial transformation of coordinates (x,y)
2. Interpolation of intensity value at new
coordinates
We already know how to do (2), so focus on (1)
Example: What does the transformation
(x,y) = T((v,w)) = (v/2,w/2) do?
[Shrinks original image in half in both directions]
74
Affine Spatial Transformations
• Most commonly used set of transformations
• General form:
• [x y 1] are called homogeneous coordinates
• Can translate, rotate, scale, or shear depending on the values t_ij
• Multiple transformations can be concatenated
by multiplying them to form new matrix T’
75
Example: Translation
What does T look like for translation?
x = v + tx
y = w + ty
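With the row-vector homogeneous-coordinate convention of the general form above, translation by (tx, ty) can be sketched as follows (the numbers are made up):

```python
import numpy as np

tx, ty = 5.0, -2.0
# Row-vector homogeneous coordinates: [x, y, 1] = [v, w, 1] @ T,
# which reproduces x = v + tx, y = w + ty.
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [tx,  ty,  1.0]])

v, w = 3.0, 4.0
x, y, _ = np.array([v, w, 1.0]) @ T
print(x, y)  # 8.0 2.0

# Concatenation: two translations multiply into a single matrix T'.
T2 = T @ T
```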
76
Affine Transformations
Transformation Affine Matrix T Coordinate Equations Example
77
Affine Transformations (cont.)
Transformation Affine Matrix T Coordinate Equations Example
78
Example of Affine Transformation
Image rotated 21 degrees
Nearest Neighbor, Bilinear, Bicubic
(from Gonzalez & Woods, 2008)
79
Recall
Range transformation:
(What is an example?)
Noise filtering
80
Image processing for noise reduction
Common types of noise:
• Salt and pepper noise: contains random occurrences of black and white pixels
• Impulse noise: contains random occurrences of white pixels
• Gaussian noise: variations in intensity drawn from a Gaussian (normal) distribution
[Figure: Original; Salt and pepper noise; Impulse noise; Gaussian noise]
81
How do we reduce the effects of noise?
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
82
How do we reduce the effects of noise?
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0 80
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
83
How do we reduce the effects of noise?
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0 80
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0 10
0 0 0 0 0 0 0 0 0 0
Idea: Compute mean value for each pixel from neighbors
84
Mean filtering
Sliding window k = 1
85
Mean filtering
86
Mean filtering
87
Mean filtering
88
Mean filtering
89
Mean filtering
90
Mean filtering
91
Mean filtering
• In summary:
• This filter replaces each pixel with the average of its neighborhood
• Achieves a smoothing effect (removes sharp features)
92
Filtering as cross-correlation
If the averaging window is (2k+1)x(2k+1):
In the example on the previous slide, k = 1 for a 3x3 averaging window
93
Filtering as cross-correlation
Can generalize this by allowing different
weights for different neighboring pixels:
This is called cross-correlation, denoted by:
H is called the “filter,” “kernel,” or “mask.”
Note: During implementation, we avoid the negative filter indices by
using H[u+k,v+k] instead of H[u,v]
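A direct (unoptimized) sketch of the cross-correlation formula, using the H[u+k, v+k] index shift from the implementation note; the function name is made up and borders are simply left at zero:

```python
import numpy as np

def cross_correlate(F, H, k):
    """G[i, j] = sum over u, v in [-k, k] of H[u, v] * F[i+u, j+v].
    Negative kernel indices are avoided by indexing H[u+k, v+k]."""
    G = np.zeros_like(F, dtype=float)
    rows, cols = F.shape
    for i in range(k, rows - k):
        for j in range(k, cols - k):
            acc = 0.0
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    acc += H[u + k, v + k] * F[i + u, j + v]
            G[i, j] = acc
    return G

# 3x3 mean filter: every weight is 1/9 (k = 1).
H = np.full((3, 3), 1.0 / 9.0)
F = np.zeros((5, 5))
F[2, 2] = 90.0
G = cross_correlate(F, H, k=1)
print(round(float(G[2, 2]), 6))  # 10.0 (mean of one 90 and eight 0s)
```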
94
Kernel for mean filtering
What is the kernel for a 3x3 mean filter?
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
95
Kernel for mean filtering
What is the kernel for a 3x3 mean filter?
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Kernel: 1/9 ×
1 1 1
1 1 1
1 1 1
96
Example of mean filtering
Input image (salt and pepper noise) and filtered images with kernel sizes 3x3, 5x5, 7x7
97
Gaussian Filtering
A Gaussian kernel gives less weight to pixels further
from the center of the window
Kernel (weights, normalized by their sum, 16):
1 2 1
2 4 2
1 2 1

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

The kernel approximates the Gaussian function:
What happens if you increase σ ?
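Such a kernel can be sketched for an arbitrary size and σ (the function name is made up); the 3x3 weights above correspond roughly to σ ≈ 1:

```python
import numpy as np

def gaussian_kernel(k, sigma):
    """(2k+1)x(2k+1) kernel sampling exp(-(x^2 + y^2) / (2 sigma^2)),
    normalized so the weights sum to 1."""
    ax = np.arange(-k, k + 1, dtype=float)
    xx, yy = np.meshgrid(ax, ax)
    K = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return K / K.sum()

K = gaussian_kernel(1, sigma=1.0)
# Center weight is largest, corners smallest; everything sums to 1.
print(K[1, 1] > K[0, 1] > K[0, 0])  # True

# Increasing sigma flattens the kernel toward a mean filter (more blur):
# the nine weights approach 1/9 each.
flat = gaussian_kernel(1, sigma=100.0)
```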
98
Mean versus Gaussian filtering
Input image; mean-filtered and Gaussian-filtered results
99
Filtering an impulse
Impulse signal:
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Kernel:
a b c
d e f
g h i
Output = ?
100
Filtering an impulse
Impulse signal:
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Filter kernel:
a b c
d e f
g h i

Output:
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 i h g 0 0
0 0 f e d 0 0
0 0 c b a 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

The output equals the filter kernel flipped horizontally & vertically
101
What if we want to get an output that
looks exactly like the filter kernel?
102
Flipping kernels
Impulse signal:
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Filter kernel:
a b c
d e f
g h i

Flipped kernel:
i h g
f e d
c b a

Output:
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 a b c 0 0
0 0 d e f 0 0
0 0 g h i 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Output is equal to the filter kernel!
103
Convolution
A convolution is a cross-correlation where the filter is
flipped both horizontally and vertically before being
applied to the image:
Written as:
Compare with cross-correlation:
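The definition can be sketched directly: flip the kernel in both directions, then slide it exactly as in cross-correlation. Applying it to an impulse reproduces the kernel, as the previous slides illustrate (the function name and the numeric stand-in for [a..i] are made up):

```python
import numpy as np

def convolve(F, H, k):
    """Convolution: cross-correlation with the kernel flipped
    horizontally and vertically. Borders are left at zero for brevity."""
    Hf = H[::-1, ::-1]  # the double flip
    G = np.zeros_like(F, dtype=float)
    rows, cols = F.shape
    for i in range(k, rows - k):
        for j in range(k, cols - k):
            patch = F[i - k:i + k + 1, j - k:j + k + 1]
            G[i, j] = np.sum(Hf * patch)
    return G

# Convolving an impulse reproduces the kernel itself:
H = np.arange(1.0, 10.0).reshape(3, 3)  # numeric stand-in for [a..i]
F = np.zeros((7, 7))
F[3, 3] = 1.0
G = convolve(F, H, k=1)
print(np.array_equal(G[2:5, 2:5], H))  # True
```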
104
Convolution
H = kernel, flipped in X and in Y, then applied (*) to the image F
105
Convolution - Example
* =
Input Kernel Output
106
Convolution - Example
Output
107
Convolution - Example
Output
108
Convolution - Example
Output
109
Convolution - Example
Output
110
Convolution - Example
Output
111
Convolution - Example
Output
112
Convolution - Example
113
Convolution - Example
114
Convolution - Example
115
Convolution - Example
116
Convolution - Example
117
Convolution - Example
118
Why convolution?
• Convolution is associative (cross-correlation is not):
F * (G * I) = (F * G) * I
• Important for efficiency: to apply two filters F and G sequentially to incoming images I, pre-compute (F * G) and perform only one convolution (with the pre-computed filter)
• Convolution also allows the effects of filtering to be analyzed using Fourier analysis (we will touch on this later)
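The associativity claim is easy to verify numerically in 1D with NumPy's full convolution (the filters and signal here are arbitrary made-up numbers):

```python
import numpy as np

# 1D check that convolution is associative: F * (G * I) = (F * G) * I.
F = np.array([1.0, 2.0, 1.0])            # two small filters...
G = np.array([1.0, -1.0])
I = np.array([3.0, 0.0, 5.0, 1.0, 2.0])  # ...and a 1D "image"

lhs = np.convolve(F, np.convolve(G, I))
rhs = np.convolve(np.convolve(F, G), I)
print(np.allclose(lhs, rhs))  # True

# Cross-correlation (convolution with a flipped filter) fails this
# in general, which is why the pre-composition trick needs convolution.
```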
119
Cross-correlation and template matching
Cross-correlation is useful for template matching
(locating a given pattern in an image)

Template (pattern):
a b c
d e f
g h i

The highest cross-correlation value yields the location of the pattern in the image
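A brute-force sketch of template matching by cross-correlation (names and the toy pattern are made up); note that plain cross-correlation can be fooled by bright image regions, which is why normalized cross-correlation is often preferred in practice:

```python
import numpy as np

def match_template(img, tpl):
    """Slide the template over the image, record the cross-correlation
    score at each offset, and return the offset with the highest score."""
    th, tw = tpl.shape
    ih, iw = img.shape
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            scores[i, j] = np.sum(img[i:i + th, j:j + tw] * tpl)
    return np.unravel_index(np.argmax(scores), scores.shape)

img = np.zeros((6, 6))
tpl = np.array([[1.0, 2.0], [3.0, 4.0]])
img[3:5, 1:3] = tpl  # plant the pattern at row 3, column 1
print(match_template(img, tpl))  # (3, 1)
```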
120
Nonlinear filters: Median filter
• A Median Filter replaces the value of a pixel
by the median of intensity values of neighbors
• Recall: m is the median of a set of values if half
the values in the set are <= m and half are >= m.
• Median filtering of image I: For each location (x,y),
sort intensity values in its neighborhood,
determine median intensity value, and assign that
value to I(x,y)
• Is a median filter better than a mean filter?
• Is median filtering a convolution?
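A sketch of median filtering (borders left unchanged for brevity). The single outlier below disappears entirely, which suggests the answer to the first question for salt-and-pepper noise; and because sorting is nonlinear, median filtering cannot be written as a convolution:

```python
import numpy as np

def median_filter(img, k=1):
    """Replace each pixel with the median of its (2k+1)x(2k+1)
    neighborhood. Border pixels are left unchanged here."""
    out = img.astype(float).copy()
    rows, cols = img.shape
    for i in range(k, rows - k):
        for j in range(k, cols - k):
            out[i, j] = np.median(img[i - k:i + k + 1, j - k:j + k + 1])
    return out

# A single "pepper" outlier is removed completely (it is never the
# median), whereas a mean filter would smear it into its neighbors.
img = np.full((5, 5), 90.0)
img[2, 2] = 0.0
print(median_filter(img)[2, 2])  # 90.0
```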
121
Comparison of filters (salt-and-pepper noise)
122
Comparison of filters (Gaussian noise)
123
Filter example: Image segmentation
124
Convolution in CNNs
125
Recall
• Color
• Types of images
• Image formation
• Image sampling and quantization
• Image interpolation
• Histogram
• Domain transformations
• Affine image transformations
• Range (intensity) transformations
• Noise reduction through spatial filtering
• Filtering as cross-correlation
• Convolution
• Nonlinear (median) filtering
126