Computer Vision
Image Transforms and
Projective Geometry
Course Plan
Course Contents:
No.  Topics                                                              No. of Hours
1    Introduction and Digital Image Fundamentals                         3
2    Image Enhancement and Restoration                                   3
3    Image Enhancement in the Frequency Domain                           3
4    Morphological Image Processing                                      3
5    Image Segmentation, Representation and Description                  3
6    Harris Detector, SIFT, Point Matching, RANSAC, Local Binary Pattern 3
7    Projective Geometry for Computer Vision                             3
8    Multiple View Geometry                                              3
9    Simultaneous Localization and Mapping                               3
10   Optical Flow                                                        3
11   3D Reconstruction with a Calibrated Camera                          3
12   Object Tracking                                                     3
13   Object Recognition                                                  3
Modelling from 3D to 2D world
Perspective Matters!!
Taj Mahal
Perspective Transforms
• Parallel lines in the world intersect in the
projected image at a “vanishing point”.
• Parallel lines on the same plane in the world
converge to vanishing points on a “vanishing
line”.
[Figure: two vanishing points lying on a common vanishing line]
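The vanishing-point behaviour can be checked numerically. The following is a minimal sketch, assuming a simple pinhole model with U = -fX/Z and V = -fY/Z; the function names, the focal length, and the two line start points are hypothetical values chosen for illustration.

```python
def project(point, f=1.0):
    """Pinhole projection of a 3D point (X, Y, Z): U = -f X / Z, V = -f Y / Z."""
    X, Y, Z = point
    return (-f * X / Z, -f * Y / Z)

def point_on_line(origin, direction, t):
    """Return origin + t * direction, a point on a 3D line."""
    return tuple(o + t * d for o, d in zip(origin, direction))

direction = (1.0, 0.0, 2.0)   # shared direction, so the two lines are parallel
line_a = (0.0, -1.0, 4.0)     # hypothetical start point of line A
line_b = (3.0, -1.0, 4.0)     # hypothetical start point of line B

# The vanishing point depends only on the direction: (-f dx/dz, -f dy/dz), here with f = 1.
vanishing_point = (-direction[0] / direction[2], -direction[1] / direction[2])

# Walking further along both lines, the two projections approach the same image point.
for t in (1.0, 100.0, 1e6):
    print(t, project(point_on_line(line_a, direction, t)),
          project(point_on_line(line_b, direction, t)))
```

Because the limit depends only on the direction vector, every line parallel to these two shares the same vanishing point under this model.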
Paris town hall
“Anamorphosis”
Lake Sørvágsvatn in Faroe Islands
100 metres above sea level
CAMERAS, MULTIPLE VIEWS,
AND MOTION
Dimensionality Reduction Machine (3D to 2D)
3D world 2D image
Point of observation
Figures © Stephen E. Palmer, 2002
Parametric (global) transformations
p = (x,y) p’ = (x’,y’)
Transformation T is a coordinate-changing machine:
p’ = T(p)
What does it mean that T is global?
– T is the same for any point p
T can be described by just a few numbers (parameters)
For linear transformations, we can represent T as a matrix
p’ = Tp
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = T \begin{bmatrix} x \\ y \end{bmatrix} \]
Common transformations
[Figure: an original image and its transformed versions: translation, rotation, scaling, affine, perspective]
Slide credit (next few slides): A. Efros and/or S. Seitz
Scaling
• Scaling a coordinate means multiplying each of its
components by a scalar
• Uniform scaling means this scalar is the same for all
components, e.g. × 2 for both X and Y
Scaling
• Non-uniform scaling: different scalars per component,
e.g. X × 2, Y × 0.5
Scaling
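• Scaling operation:
\[ x' = ax, \qquad y' = by \]
• Or, in matrix form:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \underbrace{\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}}_{\text{scaling matrix } S} \begin{bmatrix} x \\ y \end{bmatrix} \]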
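As a small illustration of the matrix form, the sketch below applies S to the corners of a unit square. The function name `scale` and the sample corner list are assumptions made for this example only.

```python
def scale(point, a, b):
    """Apply the scaling matrix S = [[a, 0], [0, b]] to a 2D point (x, y)."""
    S = [[a, 0.0],
         [0.0, b]]
    x, y = point
    return (S[0][0] * x + S[0][1] * y,
            S[1][0] * x + S[1][1] * y)

corners = [(0, 0), (1, 0), (1, 1), (0, 1)]             # unit square
uniform = [scale(p, 2, 2) for p in corners]            # same scalar for x and y
non_uniform = [scale(p, 2, 0.5) for p in corners]      # x stretched, y squashed

print(uniform)
print(non_uniform)
```

The same matrix is applied to every point, which is what makes the transformation global.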
2-D Rotation
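The point (x, y) lies at radius r and angle φ; rotating it by θ gives (x’, y’).
Polar coordinates…
\[ x = r\cos\phi, \qquad y = r\sin\phi \]
\[ x' = r\cos(\phi + \theta), \qquad y' = r\sin(\phi + \theta) \]
Trig Identity…
\[ x' = r\cos\phi\cos\theta - r\sin\phi\sin\theta \]
\[ y' = r\sin\phi\cos\theta + r\cos\phi\sin\theta \]
Substitute…
\[ x' = x\cos\theta - y\sin\theta \]
\[ y' = x\sin\theta + y\cos\theta \]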
2-D Rotation
This is easy to capture in matrix form:
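\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \underbrace{\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}}_{R} \begin{bmatrix} x \\ y \end{bmatrix} \]
Even though sin(θ) and cos(θ) are nonlinear functions of θ,
– x’ is a linear combination of x and y
– y’ is a linear combination of x and y
What is the inverse transformation?
– Rotation by –θ
– For rotation matrices, R^{-1} = R^T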
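The inverse property can be verified numerically. This is a minimal sketch in plain Python; the helper names (`rotation`, `transpose`, `matmul`, `apply`) and the 30 degree test angle are assumptions for illustration.

```python
import math

def rotation(theta):
    """2x2 rotation matrix R for a counter-clockwise rotation by theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, p):
    """Multiply matrix M by the column vector p."""
    return tuple(sum(m * v for m, v in zip(row, p)) for row in M)

theta = math.radians(30)
R = rotation(theta)
p = (1.0, 0.0)

p_rot = apply(R, p)                    # p rotated by theta
p_back = apply(transpose(R), p_rot)    # R^T undoes the rotation
identity = matmul(transpose(R), R)     # should be the 2x2 identity up to rounding

print(p_rot, p_back, identity)
```

Rotating by -θ gives the same matrix as the transpose, because cos is even and sin is odd.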
Basic 2D transformations
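Scale:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
Shear:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & \alpha_x \\ \alpha_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
Rotate:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]
Translate:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
Affine:
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
Affine is any combination of translation, scale, rotation, and shear.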
Affine Transformations
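Affine transformations are combinations of
• Linear transformations, and
• Translations
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
or
\[ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
Properties of affine transformations:
• Lines map to lines
• Parallel lines remain parallel
• Ratios of lengths along a line are preserved
• Closed under composition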
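Two of these properties can be illustrated with 3×3 homogeneous matrices. The sketch below composes a scale, a shear, and a translation, then checks that the result is still affine and that two parallel segments stay parallel. All matrix values and helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_affine(M, p):
    """Apply a 3x3 affine matrix to the 2D point p using homogeneous coordinates."""
    v = (p[0], p[1], 1.0)
    out = [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]
    return (out[0], out[1])    # the third coordinate stays 1 for affine maps

scale_m   = [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]
shear_m   = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
translate = [[1.0, 0.0, 3.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]]

# Composition: scale first, then shear, then translate. The product is again affine.
M = matmul(translate, matmul(shear_m, scale_m))

# Two parallel segments (same direction vector (1, 1)) before the transformation.
a0, a1 = apply_affine(M, (0.0, 0.0)), apply_affine(M, (1.0, 1.0))
b0, b1 = apply_affine(M, (0.0, 2.0)), apply_affine(M, (1.0, 3.0))
dir_a = (a1[0] - a0[0], a1[1] - a0[1])
dir_b = (b1[0] - b0[0], b1[1] - b0[1])
cross = dir_a[0] * dir_b[1] - dir_a[1] * dir_b[0]   # zero when directions are parallel

print(M, dir_a, dir_b, cross)
```

The last row of the composed matrix remains (0, 0, 1), which is what "closed under composition" means in matrix terms.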
2D image transformations (reference table)
‘Homography’
Szeliski 2.1
3D Rigid Body Transform
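Rotation: ZRotate(θ)
• About z axis (p is rotated by θ to p'):
\[ \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]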
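Rotation
• About x axis:
\[ \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]
• About y axis:
\[ \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \]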
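The three axis rotations can be tried out with a short sketch. It builds the 4×4 homogeneous matrices for rotations about the z, x, and y axes and shows that the order of two rotations matters. The function names and the 90 degree test angle are assumptions for illustration.

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def apply(M, p):
    """Apply a 4x4 homogeneous matrix to the 3D point p and return a 3D point."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(M[i][k] * v[k] for k in range(4)) for i in range(3))

q = math.pi / 2  # 90 degrees

x_to_y = apply(rot_z(q), (1.0, 0.0, 0.0))   # x axis rotates onto the y axis
y_to_z = apply(rot_x(q), (0.0, 1.0, 0.0))   # y axis rotates onto the z axis
z_to_x = apply(rot_y(q), (0.0, 0.0, 1.0))   # z axis rotates onto the x axis

# Order matters: rotating about z then x differs from rotating about x then z.
z_then_x = apply(rot_x(q), apply(rot_z(q), (1.0, 0.0, 0.0)))
x_then_z = apply(rot_z(q), apply(rot_x(q), (1.0, 0.0, 0.0)))

print(x_to_y, y_to_z, z_to_x, z_then_x, x_then_z)
```

Since rotations about different axes do not commute, any description of orientation by three successive angles has to fix the order in which they are applied.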
Euler Angles
Projective Transformations
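Projective transformations are combos of
• Affine transformations, and
• Projective warps
\[ \begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix} \]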
Properties of projective transformations:
• Lines map to lines
• Parallel lines do not necessarily remain parallel
• Ratios are not preserved
• Models change of basis
• Projective matrix is defined up to a scale (8 DOF)
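The "defined up to a scale" property and the loss of parallelism can both be seen in a few lines. This is a sketch with a hypothetical matrix H; the function name `apply_homography` is also an assumption. The key step is the division by w' after the matrix multiplication.

```python
def apply_homography(H, p):
    """Map the 2D point p through the 3x3 matrix H, then divide by w'."""
    x, y = p
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    wp = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xp / wp, yp / wp)

# Hypothetical projective matrix; the non-zero g and h entries make it non-affine.
H = [[1.0, 0.2, 5.0],
     [0.1, 1.0, -3.0],
     [0.001, 0.002, 1.0]]
H_scaled = [[10.0 * h for h in row] for row in H]   # same transformation, scaled by 10

p = (40.0, 25.0)
mapped = apply_homography(H, p)
mapped_scaled = apply_homography(H_scaled, p)       # identical up to rounding

# Two parallel horizontal segments are no longer parallel after the mapping.
a0, a1 = apply_homography(H, (0.0, 0.0)), apply_homography(H, (1.0, 0.0))
b0, b1 = apply_homography(H, (0.0, 1.0)), apply_homography(H, (1.0, 1.0))
cross = (a1[0] - a0[0]) * (b1[1] - b0[1]) - (a1[1] - a0[1]) * (b1[0] - b0[0])

print(mapped, mapped_scaled, cross)
```

Because multiplying all nine entries by the same non-zero factor cancels in the division, only eight of the nine numbers are independent.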
Can we use homographies to create a 360°
panorama?
• In order to figure this out, we need to learn
what a camera is
The Geometry of Image Formation
Szeliski 2.1, parts of 2.2
Mapping between image and world coordinates
– Pinhole camera model
– Projective geometry
• Vanishing points and lines
– Projection matrix
Slides from James Hays, Derek Hoiem, Alexei Efros, Steve Seitz, and David Forsyth
Image Formation: Orthographic Projection
• A means of representing three-dimensional objects
in two dimensions.
• It is a form of parallel projection, in which all
the projection lines are orthogonal to
the projection plane
Orthographic Projections
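• A simple orthographic projection onto the plane z = 0 can be
defined by the following matrix:
\[ P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
• For each point v = (vx, vy, vz), the transformed point Pv would
be
\[ Pv = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ v_z \end{bmatrix} = \begin{bmatrix} v_x \\ v_y \\ 0 \end{bmatrix} \]
• Often, it is more useful to use homogeneous coordinates. The
transformation in homogeneous coordinates is
\[ P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
• For each homogeneous vector v = (vx, vy, vz, 1),
the transformed vector Pv would be:
\[ Pv = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} v_x \\ v_y \\ v_z \\ 1 \end{bmatrix} = \begin{bmatrix} v_x \\ v_y \\ 0 \\ 1 \end{bmatrix} \]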
https://en.wikipedia.org/wiki/Orthographic_projection
Orthographic Projections
• In computer graphics, one of the most common
matrices used for orthographic projection can be
defined by a 6-tuple, (left, right, bottom, top,
near, far), which defines the clipping planes.
• These planes form a box with the minimum
corner at (left, bottom, -near) and the maximum
corner at (right, top, -far).
• The box is translated so that its center is at the
origin, then it is scaled to the unit cube which is
defined by having a minimum corner at
(−1,−1,−1) and a maximum corner at (1,1,1).
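The translate-then-scale construction described above can be sketched as a single 4×4 matrix. The form below is the one commonly used in computer graphics for this 6-tuple; the function names and the sample clipping values are assumptions for illustration.

```python
def ortho_matrix(left, right, bottom, top, near, far):
    """4x4 matrix mapping the clipping box to the cube from (-1,-1,-1) to (1,1,1)."""
    return [
        [2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
        [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
        [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(M, p):
    """Apply a 4x4 homogeneous matrix to the 3D point p and return a 3D point."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(M[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical clipping planes.
left, right, bottom, top, near, far = -2.0, 4.0, -1.0, 3.0, 1.0, 11.0
P = ortho_matrix(left, right, bottom, top, near, far)

min_corner = apply(P, (left, bottom, -near))   # expected near (-1, -1, -1)
max_corner = apply(P, (right, top, -far))      # expected near (1, 1, 1)

print(min_corner, max_corner)
```

The last column carries the translation that centres the box at the origin, and the diagonal carries the scaling to the cube.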
Pinhole camera model
[Figure: pinhole camera model, with a real object imaged through the optical center c at focal length f]
f = Focal length
c = Optical center of the camera
Figure from Forsyth
Perspective Projection
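Projection: world coordinates → image coordinates
[Figure: a world point P is projected through the camera center (0, 0, 0) onto the image plane at focal distance f; the image center is (u0, v0)]
\[ P = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \qquad p = \begin{bmatrix} U \\ V \end{bmatrix} \]
\[ U = -X \frac{f}{Z}, \qquad V = -Y \frac{f}{Z} \]
p = distance from image center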
What is the effect if f and Z are equal?
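The projection equations are easy to experiment with. The sketch below is a direct transcription of U = -X f/Z and V = -Y f/Z; the function name and the sample values are hypothetical, and changing f and Z is one way to explore the question above.

```python
def project(P, f):
    """Perspective projection of P = (X, Y, Z): U = -X f / Z, V = -Y f / Z."""
    X, Y, Z = P
    return (-X * f / Z, -Y * f / Z)

f = 2.0                                  # hypothetical focal length
near_point = project((1.0, 2.0, 4.0), f)
far_point = project((1.0, 2.0, 8.0), f)  # same X and Y, twice the depth

print(near_point, far_point)
```

Doubling Z halves both image coordinates, which is why distant objects appear smaller; the negative sign reflects the inverted image of a pinhole camera.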
Perspective Projection:
• https://www.youtube.com/watch?v=17kqhGRDHc8
Important Definitions
• Frame of reference: measurements are made with respect to
a particular coordinate system, called the frame of reference.
• World Frame: a fixed coordinate system for representing
objects (points, lines, surfaces, etc.) in the world.
• Camera Frame: coordinate system that uses the camera center
as its origin (and the optic axis as the Z-axis)
• Image or retinal plane: plane on which the image is formed.
Note that the image plane is measured in camera frame
coordinates (mm)
• Image Frame: coordinate system that measures pixel locations
in the image plane.
• Intrinsic Parameters: Camera parameters that are internal and
fixed to a particular camera/digitization setup
• Extrinsic Parameters: Camera parameters that are external to
the camera and may change with respect to the world frame.
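The chain of frames defined above (world frame, camera frame, image frame) can be sketched as two small functions. This is an illustrative sketch under stated assumptions, not the calibration method of the course: R and t stand for a rotation and translation from world to camera frame, fx and fy are focal lengths expressed in pixels, and (u0, v0) is the image center in pixels. It also assumes the common convention of a virtual image plane in front of the camera, so the signs are positive rather than inverted. All names and values are hypothetical.

```python
def world_to_camera(P_world, R, t):
    """Rotate and translate a world-frame point into the camera frame."""
    return tuple(sum(R[i][k] * P_world[k] for k in range(3)) + t[i]
                 for i in range(3))

def camera_to_pixel(P_cam, fx, fy, u0, v0):
    """Project a camera-frame point to pixel coordinates in the image frame."""
    X, Y, Z = P_cam
    return (fx * X / Z + u0, fy * Y / Z + v0)

# Hypothetical pose: camera axes aligned with the world, world origin 5 units ahead.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 5.0)

# Hypothetical internal values for a 640x480 image.
fx, fy, u0, v0 = 800.0, 800.0, 320.0, 240.0

P_cam = world_to_camera((1.0, 0.5, 0.0), R, t)
pixel = camera_to_pixel(P_cam, fx, fy, u0, v0)

print(P_cam, pixel)
```

In this sketch the first function depends only on where the camera is in the world, while the second depends only on the camera itself.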
HW Questions
• Read about and define a few camera parameters
that are:
1. Intrinsic
2. Extrinsic
Camera Model & Calibration
Because the mathematical computations are extensive,
please study the following links:
Video Lecture Links
1. Web link: Projective geometry, camera
models and calibration, IIT Delhi:
http://www.cse.iitd.ernet.in/~suban/vision/geometry/index.html
Next Class
• Please read the content of the lectures to
understand the mathematics of forming the
camera matrix.
• Send me the points that you are not able to
understand, so that I can send you solutions
to those points.