Non Rigid Structure From Motion Slides in CV

The lecture discusses Non-Rigid Structure from Motion (NRSfM), focusing on recovering 3D shapes and camera motion from monocular videos of non-rigid objects. It highlights the challenges of non-rigid motion compared to rigid motion, including the use of various priors and optimization techniques to solve the problem. Applications span multiple industries, such as film, sports, and robotics, emphasizing the importance of accurately modeling non-rigid shapes.

Uploaded by

mirensamaniego.ikas

Lecture:

Non-Rigid Structure from Motion

-------------------------------------------------------------------------------------------------------------------------------

3D Vision
Universitat Pompeu Fabra
Discussion

Non-Rigid Shapes?

§ Can we obtain non-rigid 3D information from images?

Structure from Motion
(Rigid) Structure from Motion

Given: a monocular video (or a collection of pictures)

We want: to simultaneously recover the 3D shape and the camera motion
Epipolar geometry can be used

The assumption of rigidity is enough to make the problem well-posed

What about non-rigid motion?
Our world is Non Rigid!

No external markers!
One or Many
Why is this important?

The world is non-rigid! There are many everyday applications in many different domains:

§ Movie industry, augmented reality
§ Experimental industry
§ Sport industry: sailing
§ Endoscopy

The movie industry
Even more details
[Figure: capture at 1200 frames per second]
Animal Reconstruction
[Figure: capture at 1200 frames per second]
…produces better robots?
Can Epipolar geometry be used?

Considering only one image, we obtain the same 3D constraint

After acquiring a new image, we obtain a similar constraint, but now triangulation is not available, since the shape is non-rigid: the 3D point may have moved between the two images
Non-Rigid
Structure from Motion
Non-Rigid Structure from Motion
Given: a monocular video (or a collection of pictures)
We want: to simultaneously recover the 3D shape of a time-varying object (4D estimation) and the camera motion
Some Results
Solving the problem

The problem can be solved by:

• Factorization: a closed-form solution can be obtained by SVD factorization, enforcing a specific rank (which changes with the camera model and the type of scene). However, it is hard to enforce additional constraints accurately
• Non-linear optimization: the solution is obtained iteratively. The computational cost can be higher, but additional priors can be enforced accurately

In terms of processing, the problem can be solved:

• Offline: all the frames are processed at once, after video capture
• Online: the frames are processed as the data arrive, frame by frame. This is closer to real applications, but can be less accurate
Non-Rigid
Structure from Motion

Problem Statement
A Reminder of Camera Models

§ Perspective camera: All rays converge to the optical center

§ Orthographic camera: All rays are parallel. Z-coordinate is
irrelevant in the projection

[Figure: perspective camera (left) and orthographic camera (right)]

3D-to-2D: Perspective Model

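The p-th 3D point $\mathbf{X}_p = [X_p, Y_p, Z_p]^T$, written in homogeneous coordinates as $[X_p, Y_p, Z_p, 1]^T$, can be related to its 2D projection $\mathbf{x}_p^i = [x_p^i, y_p^i]^T$ in the i-th image by means of a matrix $P_i$, such that:

$$\lambda_p^i \begin{bmatrix} x_p^i \\ y_p^i \\ 1 \end{bmatrix} = P_i \begin{bmatrix} X_p \\ Y_p \\ Z_p \\ 1 \end{bmatrix}$$

where $\lambda_p^i$ is the projective depth and $P_i$ is a 3x4 matrix made of a 3x3 block and a 3x1 block:

$$P_i = \begin{bmatrix} A_i & \mathbf{a}_i \end{bmatrix}, \quad A_i \in \mathbb{R}^{3 \times 3}, \; \mathbf{a}_i \in \mathbb{R}^{3 \times 1}$$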

3D-to-2D: Orthographic Model

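The p-th 3D point $\mathbf{X}_p = [X_p, Y_p, Z_p]^T$ can be related to its 2D projection $\mathbf{x}_p^i = [x_p^i, y_p^i]^T$ in the i-th image by means of a matrix $R_i$, such that:

$$\mathbf{x}_p^i = R_i \mathbf{X}_p + \mathbf{t}_i$$

where $R_i$ is a 2x3 matrix (the first two rows of a rotation matrix) and $\mathbf{t}_i$ is a 2x1 translation vector.

In practice, we remove the translations by centering the observations (i.e., $\mathbf{t}_i$ is equivalent to the mean value of $\mathbf{x}_p^i$ over the points, provided the 3D points are centered at the origin). For later computations, we will replace $\mathbf{x}_p^i$ by $\mathbf{x}_p^i - \mathbf{t}_i$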
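The two camera models can be compared with a minimal NumPy sketch. The function names, the identity camera, and the three sample points are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def project_perspective(P_i, X):
    """Project 3xN points with a 3x4 camera matrix; returns 2xN image points."""
    X_h = np.vstack([X, np.ones((1, X.shape[1]))])  # homogeneous coordinates, 4xN
    x_h = P_i @ X_h                                  # 3xN, defined up to scale
    return x_h[:2] / x_h[2]                          # divide by the projective depth

def project_orthographic(R_i, t_i, X):
    """Project 3xN points with a 2x3 truncated rotation and a 2-vector translation."""
    return R_i @ X + t_i.reshape(2, 1)               # the Z-coordinate is dropped

# Hypothetical data: three 3D points, one per column.
X = np.array([[0.0, 1.0, -1.0],
              [0.0, 2.0, 0.5],
              [4.0, 5.0, 8.0]])
P_i = np.hstack([np.eye(3), np.zeros((3, 1))])       # identity perspective camera
R_i = np.eye(3)[:2]                                  # first two rows of a rotation
t_i = np.array([0.5, -0.5])

x_persp = project_perspective(P_i, X)
x_ortho = project_orthographic(R_i, t_i, X)
# Centering the observations removes the translation (up to the shape centroid).
x_centered = x_ortho - x_ortho.mean(axis=1, keepdims=True)
```

With the identity camera, the perspective projection divides X and Y by Z, while the orthographic one simply discards Z.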
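Problem Statement
Orthographic camera

For the i-th image, collecting the P points:

$$\underbrace{W_i}_{2 \times P} = \underbrace{R_i}_{2 \times 3} \underbrace{X_i}_{3 \times P} + \underbrace{T_i}_{2 \times P}, \quad T_i = \mathbf{t}_i \mathbf{1}^T$$

In the rigid case, $X_i$ is the same for every image. In the non-rigid case, there is a different 3D configuration per image.

Full Linear Relation

Stacking the I images (after removing the translations):

$$\underbrace{W}_{2I \times P} = \underbrace{R}_{2I \times 3I} \underbrace{X}_{3I \times P}, \quad R = \mathrm{diag}(R_1, \dots, R_I), \quad X = [X_1^T \; \cdots \; X_I^T]^T$$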
Measurement Matrix

Considering P non-rigid 3D points observed in I RGB images, we can collect all observations to obtain a linear system such as:

Perspective camera: $\underbrace{W}_{3I \times P} = \underbrace{P}_{3I \times 4I} \underbrace{X}_{4I \times P}$

Orthographic camera: $\underbrace{W}_{2I \times P} = \underbrace{R}_{2I \times 3I} \underbrace{X}_{3I \times P}$

where W is a 3IxP matrix, P is 3Ix4I (the stacked projection matrix, not to be confused with the number of points P), and X is 4IxP for the perspective case; and W is a 2IxP matrix, R is 2Ix3I, and X is 3IxP for the orthographic case
What about the rank of W?

Perspective camera: rank(W) ≤ min(3I, P)
Orthographic camera: rank(W) ≤ min(2I, P)

Without further assumptions, these bounds are just the dimensions of W, so the rank alone does not constrain the solution

A severely ill-posed problem
Orthographic camera

$$\underbrace{W}_{2I \times P} = \underbrace{R}_{2I \times 3I} \underbrace{X}_{3I \times P}$$

This is an explosion of variables:

2IP entries < 6I variables + 3IP variables
A Toy Comparison
Let us assume a 1-minute video with just 100 tracked points, and consider only the estimation of the 3D shape

Rigid Case (well-posed problem)
Input data: 100 points x 60 sec x 30 Hz x 2 = 360,000 measurements
Unknowns: 100 points x 3 = 300 unknowns

Non-Rigid Case (ill-posed problem)
Input data: 100 points x 60 sec x 30 Hz x 2 = 360,000 measurements
Unknowns: 100 points x 60 sec x 30 Hz x 3 = 540,000 unknowns
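The counts in this toy comparison can be reproduced with a few lines of arithmetic. The variable names are illustrative only.

```python
# Toy comparison: a 1-minute video at 30 Hz with 100 tracked points.
points, seconds, fps = 100, 60, 30
frames = seconds * fps                      # number of images

measurements = points * frames * 2          # one (x, y) pair per point and frame
rigid_unknowns = points * 3                 # one 3D position per point
nonrigid_unknowns = points * frames * 3     # one 3D position per point and frame

print(measurements, rigid_unknowns, nonrigid_unknowns)
```

In the rigid case the measurements outnumber the shape unknowns by a factor of 1200, whereas in the non-rigid case there are only two measurements for every three unknowns.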

How can I solve the problem?

The art of priors

Including deformation priors is substantially more difficult than using simple rigidity
Many possibilities were presented

A wide variety of priors in literature:

§ Physical priors. Particle dynamics, elasticity, finite elements,
and many others
§ Probabilistic priors. Low-rank models on shape, trajectory,
shape-trajectory or force domains. Union of subspaces,
Gaussian priors
§ Geometric priors: isometric, as rigid as possible, bone lengths,
quadratic models
§ Temporal priors: temporal-coherent deformations
§ Piecewise priors
§ Many others
Shape Linear Subspace
(a probabilistic prior)
A Low-Rank Shape Model

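Basically, the non-rigid 3D shape can be obtained as a linear combination of fixed shape vectors. For every combination of weight coefficients, a different solution can be achieved:

$$\underbrace{\hat{W}_i}_{\text{your estimation}} = \underbrace{R_i}_{\text{rotation}} \underbrace{\Big( \sum_{k=1}^{K} l_{ik} B_k \Big)}_{\text{linear combination of some shapes}} + \underbrace{T_i}_{\text{translation}}$$

where $B_k$ is the k-th 3xP shape vector and $l_{ik}$ is its weight in the i-th image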
Including the low-rank shape model

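We approximate the 3D shape by a linear combination of K shape vectors $B_k$ (normally, K << P and K << I). For every k-th component, a weight coefficient $l_{ik}$ is needed. As the shape is non-rigid, by modifying the coefficients for every i-th image, we will change the 3D shape as:

$$\underbrace{X}_{3I \times P} = \underbrace{L}_{3I \times 3K} \underbrace{B}_{3K \times P}, \quad L = \begin{bmatrix} l_{11} \mathrm{Id}_3 & \cdots & l_{1K} \mathrm{Id}_3 \\ \vdots & & \vdots \\ l_{I1} \mathrm{Id}_3 & \cdots & l_{IK} \mathrm{Id}_3 \end{bmatrix}, \quad B = \begin{bmatrix} B_1 \\ \vdots \\ B_K \end{bmatrix}$$

where $\mathrm{Id}_3$ is the 3x3 identity matrix. Another type of expression for the i-th image:

$$X_i = \sum_{k=1}^{K} l_{ik} B_k$$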
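As a minimal sketch of the low-rank shape model, the stacked 3IxP shape matrix can be built from a stacked basis and per-image weights. The sizes, the random values, and the use of a Kronecker product to build the 3Ix3K coefficient matrix are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
I, P, K = 5, 7, 2                        # images, points, basis size (toy values)

B = rng.standard_normal((3 * K, P))      # stacked basis: B_k = B[3k:3k+3]
l = rng.standard_normal((I, K))          # l[i, k]: weight of B_k in image i

# Coefficient matrix: block (i, k) is l[i, k] times the 3x3 identity.
L = np.kron(l, np.eye(3))                # shape (3I, 3K)
X = L @ B                                # shape (3I, P), one 3xP shape per image

# Per-image expression for the third image: a weighted sum of the shape vectors.
X_2 = sum(l[2, k] * B[3 * k:3 * k + 3] for k in range(K))
```

Both expressions give the same shape for a given image, and the stacked matrix has rank at most 3K regardless of the number of images.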

Shape Basis Estimation
In non-rigid structure from motion, we have some alternatives to
estimate the shape basis:
§ The most natural is to learn it on the fly, using only the input data
§ The input data can also be used to estimate a shape basis from
a shape at rest (like a mean shape) by applying:
- Modal analysis based on physical models
- Spectral analysis based on a distance matrix
§ If training data are available, the basis can be learned from them (PCA, deep learning-based methods, etc.). This approach is supervised
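For the supervised alternative, a PCA shape basis can be sketched as follows, assuming a set of aligned 3D training shapes is available. The function name `pca_shape_basis` and the synthetic training set are hypothetical; note that PCA also produces a mean shape, which the low-rank model can absorb as an extra basis vector.

```python
import numpy as np

def pca_shape_basis(shapes, K):
    """shapes: (N, 3, P) aligned training shapes.
    Returns the mean shape (3, P) and K orthonormal basis shapes (K, 3, P)."""
    N, _, P = shapes.shape
    flat = shapes.reshape(N, 3 * P)              # one vectorized shape per row
    mean = flat.mean(axis=0)
    _, _, Vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean.reshape(3, P), Vt[:K].reshape(K, 3, P)

# Synthetic training set: a mean shape plus two deformation modes.
rng = np.random.default_rng(0)
P, N = 10, 20
mean_true = rng.standard_normal((3, P))
modes = rng.standard_normal((2, 3, P))
coeffs = rng.standard_normal((N, 2))
shapes = mean_true + np.einsum('nk,kdp->ndp', coeffs, modes)

mean, basis = pca_shape_basis(shapes, K=2)

# Project the training shapes onto the basis and reconstruct them.
flat = (shapes - mean).reshape(N, -1)
B_flat = basis.reshape(2, -1)
recon = mean + (flat @ B_flat.T @ B_flat).reshape(N, 3, P)
```

Because the synthetic shapes were generated with two modes, two principal components are enough to reconstruct them.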
Non-Rigid
Structure from motion
by factorization
Including the low-rank shape model

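Thanks to the relation between the 3D shape and the shape basis (orthographic camera):

$$\underbrace{X}_{3I \times P} = \underbrace{L}_{3I \times 3K} \underbrace{B}_{3K \times P}$$

we obtain the projection equation by using the low-rank shape model as:

$$\underbrace{W}_{2I \times P} = R X = \underbrace{R L}_{2I \times 3K} \underbrace{B}_{3K \times P}$$

For the i-th image, the corresponding 2x3K block of $RL$ is $[\, l_{i1} R_i \; \cdots \; l_{iK} R_i \,]$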

What about the perspective case?

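A similar analysis can be followed, but now, considering homogeneous coordinates. We can obtain:

$$\underbrace{W}_{3I \times P} = \underbrace{M}_{3I \times (3K+1)} \underbrace{B}_{(3K+1) \times P}$$

where, writing the camera matrix of the i-th image as a 3x3 block $A_i$ and a 3x1 block $\mathbf{a}_i$, the i-th block row of $M$ is $[\, l_{i1} A_i \; \cdots \; l_{iK} A_i \;\; \mathbf{a}_i \,]$ and the last row of $B$ is a row of ones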
Factorization

In both cases, the goal is to infer the motion factor (P or R) and the 3D coordinates X of the observed non-rigid object from the 2D point tracks W in a monocular video:

Orthographic camera: $\underbrace{W}_{2I \times P} = \underbrace{M}_{2I \times 3K} \underbrace{B}_{3K \times P}$

Perspective camera: $\underbrace{W}_{3I \times P} = \underbrace{M}_{3I \times (3K+1)} \underbrace{B}_{(3K+1) \times P}$

The full linear system

W=MB
Two factors: the motion factor M (camera rotations and weight coefficients) and the shape basis B. The 3D shape is then obtained as the product of the weight coefficients and B
More on factorization (orthographic camera)

Because M is a 2Ix3K matrix and B is a 3KxP matrix, the rank of W is at most 3K. If we apply SVD to W, we will have at most 3K non-zero singular values

However, measurements are normally noisy, and in practice the rank will not be 3K. We have to impose it, which means we need to tune the rank K a priori

Applying SVD factorization and keeping the 3K largest singular values, we have:

$$W = U \Sigma V^T = [U \Sigma^{1/2}][\Sigma^{1/2} V^T] = [U \Sigma^{1/2} Q][Q^{-1} \Sigma^{1/2} V^T]$$

i.e., $M = U \Sigma^{1/2} Q$ and $B = Q^{-1} \Sigma^{1/2} V^T$ (the two factors we look for)

The factorization is not unique: any invertible 3Kx3K matrix Q gives a valid solution
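The rank-3K factorization and its ambiguity can be sketched with NumPy's SVD. This is a minimal illustration under stated assumptions: the function name `factorize` is hypothetical, and the synthetic motion factor is a random matrix standing in for the structured product of rotations and weights.

```python
import numpy as np

def factorize(W, K):
    """Rank-3K factorization W ~ M_hat @ B_hat via truncated SVD."""
    r = 3 * K
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_s = np.sqrt(s[:r])                  # split singular values between factors
    M_hat = U[:, :r] * sqrt_s                # 2I x 3K
    B_hat = sqrt_s[:, None] * Vt[:r]         # 3K x P
    return M_hat, B_hat

rng = np.random.default_rng(0)
I, P, K = 10, 30, 2
M_true = rng.standard_normal((2 * I, 3 * K))   # stand-in motion factor
B_true = rng.standard_normal((3 * K, P))       # stand-in shape basis
W = M_true @ B_true + 1e-3 * rng.standard_normal((2 * I, P))  # noisy tracks

M_hat, B_hat = factorize(W, K)
rel_residual = np.linalg.norm(W - M_hat @ B_hat) / np.linalg.norm(W)

# Ambiguity: any invertible Q yields another pair with the same product.
Q = rng.standard_normal((3 * K, 3 * K))
M_amb, B_amb = M_hat @ Q, np.linalg.inv(Q) @ B_hat
```

The truncation imposes the rank on noisy data, and the pair (`M_amb`, `B_amb`) shows why an additional step is needed to pick a meaningful Q.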
Metric Upgrade

How is Q computed?

Enforcing orthogonality constraints on the camera rotation. A rotation matrix always has some properties (it is not a random matrix), since it lies in the SO(3) manifold

Be careful. Now, matrix M also includes the weight coefficients in addition to the camera rotations!
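One common way to write these constraints is sketched below; it is not necessarily the lecture's exact formulation. Let $\hat{M}_i$ be the 2x3K block of the SVD motion factor for the i-th image and $Q_k$ the k-th 3Kx3 column block of Q (both names are notation introduced for this sketch). The corrected block must equal $l_{ik} R_i$, and $R_i$ has orthonormal rows, so:

```latex
\hat{M}_i Q_k = l_{ik} R_i
\quad \Longrightarrow \quad
\hat{M}_i \, Q_k Q_k^T \, \hat{M}_i^T
  = l_{ik}^2 \, R_i R_i^T
  = l_{ik}^2 \, \mathrm{Id}_2
```

These equations are linear in the symmetric matrix $Q_k Q_k^T$: the two diagonal entries of the 2x2 result must be equal and the off-diagonal entry must be zero. The unknown scale $l_{ik}^2$ is what makes the non-rigid upgrade harder than the rigid one.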
But in many cases, we cannot observe all the points in all the images, i.e., there are missing tracks
A toy example with missing tracks
Orthographic camera

[Figure: the linear system W = R X for I = 4 images and P = 4 points, with W of size 8x4, R of size 8x12, and X of size 12x4. The input data is not full: some entries of W are missing]
Handling missing tracks
Two alternatives are possible:
§ Applying a matrix completion algorithm to infer the missing
entries, and then run factorization over the full measurement
matrix
§ Do not consider the missing entries in the formulation, and apply non-linear optimization over the observed ones. Once the 3D model and camera pose are computed, the missing 2D tracks can be inferred too
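The first alternative can be illustrated with a very simple completion scheme: alternate between a truncated SVD and re-imposing the observed entries. This is only one of many matrix completion algorithms, chosen here for brevity; the function name `complete_low_rank`, the iteration count, and the synthetic data are assumptions.

```python
import numpy as np

def complete_low_rank(W, mask, rank, n_iters=500):
    """Fill the entries where mask is False by iterating rank-constrained SVD fits."""
    # Start the missing entries at the mean of the observed ones.
    filled = np.where(mask, W, W[mask].mean())
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, W, low_rank)     # observed entries stay fixed
    return filled

rng = np.random.default_rng(0)
W_true = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 30))  # rank 3
mask = rng.random((20, 30)) > 0.1                # about 10% of the tracks missing
W_obs = np.where(mask, W_true, np.nan)           # missing entries marked as NaN

W_filled = complete_low_rank(W_obs, mask, rank=3)
err_before = np.linalg.norm(W_true[~mask] - W_true[mask].mean())
err_after = np.linalg.norm(W_filled[~mask] - W_true[~mask])
```

Once the matrix is complete, the factorization can be run on it as if all tracks had been observed.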

Non-Rigid
Structure from Motion
by Non-Linear Optimization
Problem Statement
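For an orthographic camera, we have:

$$\mathbf{x}_p^i = R_i \Big( \sum_{k=1}^{K} l_{ik} \mathbf{b}_k^p \Big) + \mathbf{t}_i$$

where $\mathbf{b}_k^p$ is the p-th column of the k-th shape vector $B_k$. The problem (compacting over the points) can be formulated as:

$$\min_{R_i, l_{ik}, B_k, \mathbf{t}_i} \; \sum_{i=1}^{I} \Big\| W_i - R_i \sum_{k=1}^{K} l_{ik} B_k - \mathbf{t}_i \mathbf{1}^T \Big\|_F^2$$

and we perform non-linear optimization by minimizing a geometric error cost function. Translation $\mathbf{t}_i$ is optional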
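A minimal sketch of such a geometric (reprojection) error for the orthographic low-rank model is given below. The array layouts and the function name `reprojection_cost` are assumptions made for this example; the rotations are generated from a QR decomposition only to obtain 2x3 matrices with orthonormal rows.

```python
import numpy as np

def reprojection_cost(W, R, l, B, t=None):
    """Sum of squared 2D errors.
    W: (I, 2, P) tracks; R: (I, 2, 3); l: (I, K); B: (K, 3, P); t: (I, 2) or None."""
    X = np.einsum('ik,kdp->idp', l, B)           # (I, 3, P): one shape per image
    pred = np.einsum('iad,idp->iap', R, X)       # (I, 2, P): orthographic projection
    if t is not None:                            # the translation is optional
        pred = pred + t[:, :, None]
    return np.sum((W - pred) ** 2)

rng = np.random.default_rng(0)
I, P, K = 4, 6, 2
# 2x3 matrices with orthonormal rows (first two rows of an orthogonal matrix).
R = np.stack([np.linalg.qr(rng.standard_normal((3, 3)))[0][:2] for _ in range(I)])
l = rng.standard_normal((I, K))
B = rng.standard_normal((K, 3, P))
t = rng.standard_normal((I, 2))

# Synthetic noise-free observations generated from the same model.
W = np.einsum('iad,idp->iap', R, np.einsum('ik,kdp->idp', l, B)) + t[:, :, None]

cost_at_truth = reprojection_cost(W, R, l, B, t)
cost_perturbed = reprojection_cost(W, R, l + 0.1, B, t)
```

The cost vanishes at the generating parameters and grows as soon as the weights are perturbed, which is the quantity a non-linear solver would drive down.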
Bundle Adjustment
Normally, the Levenberg-Marquardt method is used to minimize the problem. We need a Jacobian matrix J, i.e., the derivative of the residuals of the cost function with respect to the unknowns (R, B and the set of weights $l_k$)

Again, there are many variants on how to proceed to reduce the computational complexity of the problem:
§ Alternating minimization of motion and shape parameters
§ Sparse methods. The computation of J is complex, but it can be approximated by considering a binary pattern

Initialization: The optimization can be initialized assuming a rigid shape, i.e., using rigid factorization or non-linear optimization for a rigid shape
Bundle Adjustment
The bundle adjustment method:
§ Minimize the cost function with Levenberg-Marquardt
§ Exploit the sparseness of the Jacobian matrix to decrease computation and memory requirements

The Levenberg-Marquardt algorithm:

§ Is a mixture of Gauss-Newton and gradient descent
§ Behaves like Gauss-Newton when close to the minimum (quadratic region)
§ Behaves like gradient descent when the prediction is poor
§ Depends on a parameter θ that controls the mixture of Gauss-Newton and gradient descent (small θ: Gauss-Newton, large θ: gradient descent) as:

$$(J^T J + \theta I)\, \delta p = -g$$

where $g = J^T r$ is the gradient of the cost, r is the residual vector, I is the identity matrix, and δp is the update of the parameters p we want to estimate
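The damped update can be illustrated on a tiny curve-fitting problem rather than on full bundle adjustment. This is a minimal dense sketch: the exponential model, the factor-of-10 damping schedule, and all names are assumptions, and a real implementation would exploit the sparsity of J.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, p0, theta=1e-2, n_iters=50):
    """Minimize 0.5 * ||residual(p)||^2 with a basic damping schedule."""
    p = np.asarray(p0, dtype=float)
    cost = 0.5 * np.sum(residual(p) ** 2)
    for _ in range(n_iters):
        r, J = residual(p), jacobian(p)
        g = J.T @ r                                        # gradient of the cost
        delta = np.linalg.solve(J.T @ J + theta * np.eye(p.size), -g)
        new_cost = 0.5 * np.sum(residual(p + delta) ** 2)
        if new_cost < cost:                                # good step: trust Gauss-Newton more
            p, cost, theta = p + delta, new_cost, theta / 10
        else:                                              # poor step: move toward gradient descent
            theta *= 10
    return p

# Toy problem: fit y = a * exp(b * x) with true parameters a = 2, b = -1.5.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)

def residual(p):
    return p[0] * np.exp(p[1] * x) - y

def jacobian(p):
    e = np.exp(p[1] * x)
    return np.column_stack([e, p[0] * x * e])              # d r / d a, d r / d b

p_est = levenberg_marquardt(residual, jacobian, p0=[1.0, 0.0])
```

Decreasing θ after a successful step and increasing it after a rejected one is a common heuristic, not the only possible schedule.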
Exercise
Let us assume a monocular video of 3 images, where 6 points are
observed. Considering the map is non-rigid and the visibility is full,
define the corresponding Jacobian matrix. A low-rank shape model
of rank 2 can be considered

Number of unknowns
Number of equations

J = ? (sparse pattern)
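One possible answer to the exercise can be sketched in code, under assumptions the slide does not state: each rotation is parametrized with 3 values (e.g., axis-angle), translations are ignored, and the unknowns are ordered as rotations, then weights, then basis entries grouped by point.

```python
import numpy as np

I, P, K = 3, 6, 2                       # images, points, rank of the shape model

n_rot = 3 * I                           # 3 rotation parameters per image
n_w = K * I                             # K weight coefficients per image
n_b = 3 * K * P                         # 3 coordinates per basis shape and point
n_unknowns = n_rot + n_w + n_b
n_equations = 2 * I * P                 # one (x, y) residual pair per observation

# Binary pattern: True where a residual depends on an unknown.
J = np.zeros((n_equations, n_unknowns), dtype=bool)
for i in range(I):
    for p in range(P):
        rows = slice(2 * (i * P + p), 2 * (i * P + p) + 2)
        J[rows, 3 * i:3 * i + 3] = True                      # rotation of image i
        J[rows, n_rot + K * i:n_rot + K * (i + 1)] = True    # weights of image i
        b0 = n_rot + n_w + 3 * K * p
        J[rows, b0:b0 + 3 * K] = True                        # basis entries of point p

print(n_unknowns, n_equations, int(J.sum()))
```

Under these assumptions there are 51 unknowns against 36 equations, and each row of J has only 11 non-zero entries, which is the sparsity that bundle adjustment exploits.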
Including priors
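As in the rigid case, we can apply temporal smoothness priors, but now, in both camera motion and shape deformation (be careful when the input data are a collection of pictures, since there is no temporal order). To this end, we may consider an expression of the form:

$$\lambda_R \sum_{i=2}^{I} \| R_i - R_{i-1} \|_F^2 + \lambda_L \sum_{i=2}^{I} \| L_i - L_{i-1} \|_2^2$$

where $L_i$ includes all K weight coefficients in the i-th image, and $\lambda_R$, $\lambda_L$ weight each term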
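A first-order temporal smoothness penalty of this kind can be sketched as the sum of squared differences between consecutive images. The function name and the two weights `w_rot` and `w_shape` are hypothetical, and the exact form used in the lecture may differ.

```python
import numpy as np

def temporal_smoothness(R, l, w_rot=1.0, w_shape=1.0):
    """R: (I, 2, 3) camera rotations; l: (I, K) weight coefficients per image.
    Penalizes changes between consecutive images in motion and deformation."""
    rot_term = np.sum(np.diff(R, axis=0) ** 2)     # sum_i ||R_i - R_{i-1}||_F^2
    shape_term = np.sum(np.diff(l, axis=0) ** 2)   # sum_i ||L_i - L_{i-1}||^2
    return w_rot * rot_term + w_shape * shape_term

# A static camera with constant weights gives a zero penalty.
R_const = np.tile(np.eye(3)[:2], (5, 1, 1))
l_const = np.ones((5, 2))
penalty_static = temporal_smoothness(R_const, l_const)

# A sudden deformation in a single image is penalized.
l_jump = l_const.copy()
l_jump[3] += 1.0
penalty_jump = temporal_smoothness(R_const, l_jump)
```

Such a term only makes sense when the images are temporally ordered, which is why it should not be applied blindly to an unordered collection of pictures.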

How can we obtain a sequential solution?

We solve the optimization in a sequential manner, considering the information as the data arrive. Future frames are not available. Two options:
§ Pure sequential (frame by frame)
§ Sliding window (from 3 to 5 consecutive frames)

Initialization is performed by rigid estimation (using just the initial frames). The problem is actually challenging
An Extension
Semantic 3D Reconstruction
3D Reconstruction of Categories

Unsupervised 3D Reconstruction and Grouping of Rigid and Non-Rigid Categories. Antonio Agudo. IEEE Transactions on
Pattern Analysis and Machine Intelligence (TPAMI), 44(1): 519-532, 2022.
Input Data as Training Data
Shape Basis as a MLP

Neural Dense Non-Rigid Structure from Motion with Latent Space Constraints. Vikramjit Sidhu, Edgar Tretschk, Vladislav
Golyanik, Antonio Agudo, and Christian Theobalt. European Conference on Computer Vision, 2020.
Priors and models can be considered as a loss function in training. For example, the energy used in the work cited above includes both a data term and priors
Neural Radiance Fields in
the non-rigid context
Dynamic Neural Radiance Fields

4DPV: 4D Pet from Videos by Coarse-to-fine Non-Rigid Radiance Fields. Sergio M. de Paco and Antonio Agudo. Asian
Conference on Computer Vision, 2024.
Coarse-to-fine Shapes from Videos
Demo

Things to remember

3D and 4D information can be obtained from a sequence of images

For rigid objects, the problem is well-posed. For non-rigid ones, it is inherently ill-posed (additional priors are necessary)

Model-based approaches can handle a wide variety of deformations. They are normally universal and generic. No supervision is needed

Data-based approaches require a lot of data to constrain the solution space. Obtaining *good* data can be hard. They only work for a particular object or deformation (depending on the training data)

The future must be unsupervised (or self-supervised), probably combining both model- and data-based approaches, and able to handle multiple scenarios with a hand-held camera
Acknowledgments

Thanks to Kris Kitani, Yaser Sheikh, Alessio del Bue, Lourdes Agapito, Sergio M. de Paco
