
Introduction to Inverse Problems

Guillaume Bal

January 29, 2012

Columbia University, New York NY, 10027; [email protected]
Contents

1 What constitutes an Inverse Problem
  1.1 Elements of an Inverse Problem (IP)
    1.1.1 Injectivity and stability of the Measurement Operator
    1.1.2 Noise, Modeling, and Prior Information
    1.1.3 Numerical simulations
  1.2 Examples of Measurement Operator
  1.3 IP and Modeling. Application to MRI
  1.4 Inverse Problems and Smoothing: Hilbert scale
    1.4.1 Fourier transforms and well-posedness
    1.4.2 Hilbert scale and degrees of ill-posedness

2 Integral Geometry. Radon transforms
  2.1 Transmission Tomography
  2.2 Two dimensional Radon transform
  2.3 Three dimensional Radon transform
  2.4 Attenuated Radon Transform
    2.4.1 Single Photon Emission Computed Tomography
    2.4.2 Riemann Hilbert problem
    2.4.3 Inversion of the Attenuated Radon Transform
    2.4.4 Step (i): The ∂̄ problem, an elliptic equation
    2.4.5 Step (ii): jump conditions
    2.4.6 Step (iii): reconstruction formulas

3 Integral Geometry. Generalized Ray Transform
  3.1 Generalized Ray Transform: Setting in two dimensions
    3.1.1 Family of curves
    3.1.2 Generalized Ray Transform
    3.1.3 Adjoint operator and rescaled Normal operator
  3.2 Oscillatory integrals and Fourier Integral Operators
    3.2.1 Symbols, phases, and oscillatory integrals
    3.2.2 Parameterized oscillatory integrals
    3.2.3 Definition of Fourier Integral Operators
  3.3 Pseudo-differential operators and GRT
    3.3.1 Absence of singularities away from the diagonal x = y
    3.3.2 Change of variables and phase (x − y) · ξ
    3.3.3 Choice of a parametrix
    3.3.4 Proof of smoothing by one derivative
    3.3.5 Boundedness of ΨDOs of order 0 in the L² sense
    3.3.6 Injectivity and implicit inversion formula
  3.4 Kinematic Inverse Source Problem
    3.4.1 Transport equation
    3.4.2 Variational form and energy estimates
    3.4.3 Injectivity result
    3.4.4 Summary on GRT
  3.5 Propagation of singularities for the GRT
    3.5.1 Wave Front Set and Distributions
    3.5.2 Propagation of singularities in FIOs

4 Inverse wave problems
  4.1 One dimensional inverse scattering problem
  4.2 Linearized Inverse Scattering problem
    4.2.1 Setting and linearization
    4.2.2 Far field data and reconstruction
    4.2.3 Comparison to X-ray tomography
  4.3 Inverse source problem in PAT
    4.3.1 An explicit reconstruction formula for the unit sphere
    4.3.2 An explicit reconstruction for detectors on a plane
  4.4 One dimensional inverse coefficient problem

5 Inverse Kinematic and Inverse Transport Problems
  5.1 Inverse Kinematic Problem
    5.1.1 Spherical symmetry
    5.1.2 Abel integral and Abel transform
    5.1.3 Kinematic velocity Inverse Problem
  5.2 Forward transport problem
  5.3 Inverse transport problem
    5.3.1 Decomposition of the albedo operator and uniqueness result
    5.3.2 Stability in inverse transport

6 Inverse diffusion
  6.1 Cauchy Problem and Electrocardiac potential
  6.2 Half Space Problem
    6.2.1 The well posed problem
    6.2.2 The electrocardiac application
    6.2.3 Prior bounds and stability estimates
    6.2.4 Analytic continuation
  6.3 General two dimensional case
    6.3.1 Laplace equation on an annulus
    6.3.2 Riemann mapping theorem
  6.4 Backward Heat Equation

7 Calderon problem
  7.1 Introduction
  7.2 Uniqueness and Stability
    7.2.1 Reduction to a Schrödinger equation
    7.2.2 Proof of the injectivity result
    7.2.3 Proof of the stability result
  7.3 Complex Geometric Optics Solutions
  7.4 The Optical Tomography setting

8 Coupled-physics IP I: PAT and TE
  8.1 Introduction to PAT and TE
    8.1.1 Modeling of photoacoustic tomography
    8.1.2 First step: Inverse wave source problem
    8.1.3 Second step: Inverse problems with internal functionals
    8.1.4 Reconstruction of one coefficient
    8.1.5 Introduction to Transient Elastography
  8.2 Theory of quantitative PAT and TE
    8.2.1 Uniqueness and stability results in QPAT
    8.2.2 Application to Quantitative Transient Elastography
  8.3 Well-chosen illuminations in PAT and TE
    8.3.1 The two dimensional case
    8.3.2 The n dimensional case

9 Coupled-physics IP II: UMT
  9.1 Ultrasound Modulation Tomography
  9.2 Inverse problems in ultrasound modulation
  9.3 Eliminations and redundant systems of ODEs
    9.3.1 Elimination of F
    9.3.2 System of ODEs for S_j
    9.3.3 ODE solution and stability estimates
  9.4 Well-chosen illuminations
    9.4.1 The case n = 2
    9.4.2 The case n ≥ 3
  9.5 Remarks on hybrid inverse problems

10 Priors and Regularization
  10.1 Smoothness Regularization
    10.1.1 Ill-posed problems and compact operators
    10.1.2 Regularity assumptions and error bound
    10.1.3 Regularization methods
  10.2 Sparsity and other Regularization Priors
    10.2.1 Smoothness Prior and Minimizations
    10.2.2 Sparsity Prior and Minimizations
  10.3 Bayesian framework and regularization
    10.3.1 Penalization methods and Bayesian framework
    10.3.2 Computational and psychological costs of the Bayesian framework

11 Geometric Priors and Parameterizations
  11.1 Reconstructing the domain of inclusions
    11.1.1 Forward Problem
    11.1.2 Factorization method
    11.1.3 Reconstruction of Σ
  11.2 Reconstructing small inclusions
    11.2.1 First-order effects
    11.2.2 Stability of the reconstruction

12 Inverse Problems and Modeling
  12.1 Imaging in Highly Heterogeneous Media
    12.1.1 Wave model
    12.1.2 Kinetic Model
    12.1.3 Statistical Stability
    12.1.4 Inverse transport problem
    12.1.5 Random media and Correlations
    12.1.6 Imaging with waves in random media
  12.2 Random fluctuations and Noise models

Bibliography
Chapter 1

What constitutes an Inverse Problem
1.1 Elements of an Inverse Problem (IP)
The definition of an inverse problem starts with that of a mapping between objects of interest, which we call parameters, and acquired information about these objects, which we call data or measurements. The mapping, or forward problem, is called the measurement operator (MO). We denote it by M.

The MO maps parameters in a functional space X, typically a Banach or Hilbert space, to the space of data Y, typically another Banach or Hilbert space. We write

    y = M(x) for x ∈ X and y ∈ Y,    (1.1)

the correspondence between the parameter x and the data y. Solving the inverse problem amounts to finding point(s) x ∈ X from knowledge of the data y ∈ Y such that (1.1) or an approximation of (1.1) holds.

The MO describes our best effort to construct a model for the available data y, which we assume here depend only on the sought parameters x. The choice of X describes our best effort to characterize the space where we believe the parameters belong.
1.1.1 Injectivity and stability of the Measurement Operator

The first question to ask about the MO is whether we have acquired enough data to uniquely reconstruct the parameters, in other words, whether the MO is injective. Injectivity means that

    M(x₁) = M(x₂) ⟹ x₁ = x₂ for all x₁, x₂ ∈ X.    (1.2)

Then the data y, if given in the range of M, uniquely characterize the parameter x.

Measurement operators used in practice are typically discretized and available data typically contain noise, as we shall see below. Such measurement operators are often not (quite) injective. Yet, most practical measurement operators can be seen as approximations to measurement operators that are indeed injective. The study of the MO provides very significant information about the structure of the inverse problem considered.
When M is injective, we can construct an inversion operator M⁻¹ mapping the range of M to a uniquely defined element in X. This inverse operation is what is solved numerically in practice. The main features of the inverse operator are captured by what are called stability estimates. Such estimates quantify how errors in the available measurements translate into errors in the reconstructions.
Stability estimates typically take the following form:

    ‖x₁ − x₂‖_X ≤ ω(‖M(x₁) − M(x₂)‖_Y),    (1.3)

where ω : ℝ₊ → ℝ₊ is an increasing function such that ω(0) = 0, quantifying the modulus of continuity of the inversion operator M⁻¹. This function gives an estimate of the reconstruction error ‖x₁ − x₂‖_X based on what we believe is the error in the data acquisition ‖M(x₁) − M(x₂)‖_Y.

When noise is not amplified too drastically, so that the error on the reconstructed parameters is acceptable, for instance typically when ω(x) = Cx for some constant C, then we say that the inverse problem is well-posed. When noise is strongly amplified and the reconstruction is contaminated by too large a noisy component, for instance typically when ω(x) = |log₁₀ x|⁻¹ so that measurement errors of 10⁻¹⁰ translate into reconstruction errors of 1/10, then we say that the inverse problem is ill-posed. The notion of the ill-posedness of an inverse problem is therefore subjective.

Let us make two observations about the advantages and limitations of stability estimates. First, the modulus of continuity depends on the choice of X and Y. Let us assume that the metric space Y is endowed with the metric d(y₁, y₂) = ω(‖y₁ − y₂‖_Y). Then the distance between the two reconstructions x₁ and x₂ is bounded by the distance between the two measurements y₁ and y₂, which seems to correspond to a nice, well-posed inverse problem. The stability estimates and the corresponding moduli of continuity are thus also subjective.

Second, we were careful to look at the difference between y₁ = M(x₁) and y₂ = M(x₂), that is, to consider errors of measurements in the range of the measurement operator. This is legitimate since all that we have constructed so far is measurements in Y that are in the range of the measurement operator M. In practice, however, noise in the data acquisition may cause the measurements to leave the natural set M(X) where measurements are currently defined. Data then need to be projected onto the range of the MO first, or an entirely new set of stability estimates needs to be developed.

In spite of the aforementioned caveats, the notions of injectivity and stability of a measurement operator are very fruitful concepts that provide practically useful information about the structure of the inverse problem of interest. Most of this book is devoted to the analysis of the injectivity of different MO and the derivation of (typically several) corresponding stability estimates. Let us reiterate that most practical inverse problems are approximations to injective measurement operators with well-characterized stability properties.
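The difference between the two moduli of continuity mentioned above is easy to quantify numerically. The sketch below is an illustration added here (the base-10 logarithm is chosen to match the 10⁻¹⁰ → 1/10 example):

```python
import math

def omega_linear(x, C=1.0):
    # Well-posed case: reconstruction error proportional to the data error.
    return C * x

def omega_log(x):
    # Ill-posed case: logarithmic modulus of continuity, base-10 logarithm,
    # so that a data error of 1e-10 gives a reconstruction error of 1/10.
    return 1.0 / abs(math.log10(x))

for err in (1e-2, 1e-5, 1e-10):
    print(f"data error {err:.0e}: well-posed {omega_linear(err):.1e}, "
          f"ill-posed {omega_log(err):.3f}")
```

Shrinking the data error by eight orders of magnitude improves the logarithmic bound by less than a factor of five, which is why acquiring more accurate data is of limited help for severely ill-posed problems.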
1.1.2 Noise, Modeling, and Prior Information

The MO is typically not sufficient to describe an IP satisfactorily (again, a subjective notion). Often, noisy contributions need to be added to realistically model the available data, for instance contributions of parameters not included in x because we have no chance of reconstructing them. They need to be incorporated because they may have undesirable effects on the reconstructions. This undesirable effect will in turn require action from the user by means of the imposition of prior assumptions on the parameters. But first, let us be more specific about what we mean by noise.
Modeling and Measurement Errors. Rather than modeling the IP as y = M(x), it is customary to define an IP as y = M(x) + n, where n is noise and its definition is precisely

    n := y − M(x),    (1.4)

the discrepancy between the model M(x) and the available data y. The noise n typically has two contributions. One standard contribution is detector noise, since measurements are performed by instruments that are imperfect. Often, a more important contribution is what we shall call a modeling error, reflecting for instance that M is an imperfect model of the underlying physical process mapping parameters x to data y.

We have seen that the analysis of the MO gives some intuition on the effect of noise on the reconstructions. Let us assume that we know how to project the available data y onto the range of M so as to decompose it as y = M(x₂) + (y − M(x₂)), and let us discard the remainder y − M(x₂) for the moment. Then n = M(x₂) − M(x), and the stability estimates provide a measure of the error x − x₂ in the reconstruction. Estimating the above error term is therefore an important part of solving an IP in practice. Several guiding principles to do so will be considered in Chapter 12. For the moment, let us simply mention that it often makes sense to model n as a random process. Indeed, if a simple (deterministic) model for n were available, that is, a model involving well-characterized parameters, then we would modify the MO to account for said model. The randomness comes from the aforementioned sources: detector and modeling errors. The statistics of n, in particular the ensemble average m(x) = E{n(x)} and the two-point correlation function c(x, y) = E{n(x)n(y)}, are practically important information whose knowledge can very significantly improve the solution to a given IP.
Prior assumptions. We now have two items characterizing an IP: a MO and a noise model. The very reason noise is modeled is that it has an effect on the reconstruction, which is estimated by the stability estimates associated to the MO. Sometimes the effect is undesirable. One is then faced with essentially (a combination of) three strategies: (i) acquire more accurate data to lower the size of ‖n‖_Y and hence of ‖x₁ − x₂‖_X; when the stability estimates imply a strong amplification of the noise level in the reconstruction, this strategy has limited value; (ii) change the MO and acquire different data; this ideal scenario may not always be feasible; and (iii) restrict the class in which the unknown parameters are sought. We shall call the latter strategy incorporating prior information. Several major methodologies to do so can be classified as follows:

1. Penalization theory is a deterministic methodology that restricts the domain of definition of the parameters. It includes two subcategories: regularization theory, which assumes that the parameters of interest are sufficiently smooth, and sparsity theory, which assumes that the parameters are sparsely represented in a given, specific basis.
2. The Bayesian framework is an alternative, very versatile methodology to incorporate prior assumptions in a statistical fashion. A prior probability density describes the potential values that the parameters can take. A posterior probability density describes how these potential values are affected by the measurements.

3. Several geometric constraints give rise to simplified reconstructions (of small inclusions, for instance) or qualitative methods of reconstruction (of the support of inclusions, for instance).

The structure (MO, noise model, prior information) completes the description of an inverse problem. The main focus of this book is the analysis of typical measurement operators that appear in practice and of the mathematical tools that are useful in such analyses. Yet, explicit solutions of inverse problems often involve numerical simulations. We now briefly mention several types of simulations that appear frequently in the solution of inverse problems. Depending on the type of aforementioned structure, the computational cost may vary greatly. For additional information, we refer the reader to, e.g., [22, 38, 63, 66].
1.1.3 Numerical simulations

Let us consider the case of a linear operator M and of an inverse problem of the form y = Mx + n. Any nonlinear problem y = M(x) + n is typically linearized before being solved, for instance by iteratively solving linear problems of the form

    y − M(x₀) = A(x − x₀) + n,

where A is some differential of M at x₀. Then the computational cost of solving the inverse problem modeled by y = Mx + n typically falls into one of the following categories.
(a) Penalization theories such as regularization theories typically replace y = Mx + n by

    M*y = (M*M + δB)x_δ,    (1.5)

where δ is a regularization parameter and B is a positive definite operator such that (M*M + δB) is an invertible operator with bounded inverse. Here M* is the adjoint operator to M. The inverse problem then involves solving a linear system of equations.
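A minimal numerical sketch of (1.5), with B = I and a hypothetical discretized forward operator (cumulative integration, a smoothing map much like Example 1 below); none of these specific choices come from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 50
# Hypothetical forward operator M: discrete cumulative integration on [0, 1].
M = np.tril(np.ones((n, n))) / n

x_true = np.sin(np.linspace(0.0, np.pi, n))
y = M @ x_true + 1e-3 * rng.standard_normal(n)      # noisy data y = Mx + n

delta = 1e-4                                        # regularization parameter
B = np.eye(n)                                       # positive definite penalization
# Solve the regularized normal equations M* y = (M* M + delta B) x_delta.
x_delta = np.linalg.solve(M.T @ M + delta * B, M.T @ y)

print("relative reconstruction error:",
      np.linalg.norm(x_delta - x_true) / np.linalg.norm(x_true))
```

With δ = 0 the normal equations are numerically singular here; the role of δB is precisely to make the system invertible with bounded inverse, at the price of a small bias in the reconstruction.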
(b) Penalization theories such as those based on sparsity constraints typically replace y = Mx + n by

    x_δ = argmin ‖y − Mx‖_{Y₁} + δ‖x‖_{X₂},    (1.6)

where Y₁ is typically an L² norm and X₂ typically an L¹ norm to promote sparsity. Again, δ is a small, regularizing parameter. Solving such a problem therefore requires solving an optimization (minimization) problem, which is algorithmically more challenging than the linear problem in (1.5).
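Problems of the form (1.6), with the data-fidelity term squared as is standard, are often solved by iterative soft-thresholding (ISTA). The following is a generic sketch on a small random system; the operator, dimensions, and parameter values are illustrative assumptions:

```python
import numpy as np

def ista(M, y, delta, n_iter=500):
    """Minimize 0.5*||y - M x||_2^2 + delta*||x||_1 by soft-thresholding."""
    L = np.linalg.norm(M, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(M.shape[1])
    for _ in range(n_iter):
        z = x - M.T @ (M @ x - y) / L          # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - delta / L, 0.0)   # shrinkage
    return x

rng = np.random.default_rng(1)
M = rng.standard_normal((30, 60))              # underdetermined system
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]         # sparse ground truth
y = M @ x_true

x_rec = ista(M, y, delta=0.1)
obj = lambda x: 0.5 * np.sum((y - M @ x) ** 2) + 0.1 * np.sum(np.abs(x))
print("objective at 0:", obj(np.zeros(60)), " after ISTA:", obj(x_rec))
```

The shrinkage step sets small entries exactly to zero, which is how the L¹ penalty promotes sparse reconstructions even when the linear system is underdetermined.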
(c) Bayesian frameworks are typically much more challenging computationally. The premise of such a framework is that we know a prior distribution π(x), which assigns a probability (density) to all potential candidates x before any data are acquired. We also assume that we know the likelihood function, which is the conditional distribution π(y|x) of the data y conditioned on knowing the parameter x. This is equivalent to knowing the distribution of the noise n. Bayes' theorem then states that

    π(x|y) = C π(y|x) π(x).    (1.7)
Here C is a normalizing constant so that π(x|y) is a probability density, i.e., so that ∫_X π(x|y) dμ(x) = 1, with dμ(x) the measure of integration on X. In other words, what we know a priori, namely π(y|x) and π(x) before acquiring any data, plus the additional knowledge obtained from measuring y, allows us to calculate π(x|y), the probability distribution of the unknown parameter x knowing the data y.

There are many advantages and caveats associated to this formalism, which will be presented in more detail in Chapter 10. However, from a computational viewpoint, the Bayesian framework poses extraordinary challenges. If we know the distribution of the noise n, then estimating π(y|x) requires solving a forward problem x ↦ M(x). This calculation has to be performed for a large number of values of x in order to sample π(x|y). Moreover, sampling a distribution π(x|y) that does not necessarily admit a closed-form expression is in itself computationally very intensive.

One of the main advantages of the Bayesian framework is that it insists on an important aspect of inverse problems in general: data are often not sufficiently informative to provide exact reconstructions of x. The Bayesian framework recognizes this fact by providing as an output a distribution of possible parameters x with probability π(x|y). That said, this probability density needs to be processed and presented in a way that we can analyze and understand.
Two main methodologies are used to do so. The first one corresponds to estimating the maximizer of the posterior distribution,

    x̂ = argmax π(x|y) = argmin (− log π(x|y)).    (1.8)

This MAP (maximum a posteriori) estimation is a minimization problem that bears very strong resemblance to (1.6), including in its computational cost. Most models of the form (1.6) can be recast as MAP estimators of a Bayesian posterior.

A second method consists of estimating the first few statistical moments of π(x|y), for instance

    x̄ = ∫_X x π(x|y) dμ(x),   c = ∫_X (x − x̄) ⊗ (x − x̄) π(x|y) dμ(x),    (1.9)

with x̄ the ensemble average of x and c the correlation matrix of x. For x = (x_i) a vector, think of x ⊗ x as the matrix with elements x_i x_j. Here, dμ(x) is a measure of integration on X. Other important moments, for instance for uncertainty quantification, involve the estimation of quantiles of x:

    P_τ = ∫_X H(Φ(x) − τ) π(x|y) dμ(x).    (1.10)

Here Φ is a functional from X to ℝ; it could be, for instance, a norm of x, the value of one component of a vector-valued x, or the evaluation of the parameter x at a spatial point and/or a specific time. Also, H(t) is the Heaviside function, equal to H(t) = 1 for t ≥ 0 and H(t) = 0 for t < 0. Then P_τ above is the probability that Φ(x) be larger than a given parameter τ > 0.

All of these moments require that we sample π(x|y). This is an extremely difficult and computationally expensive task. Standard methodologies to do so involve the Markov chain Monte Carlo method that will be briefly presented in Chapter 10.
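A minimal sketch of such sampling, using a random-walk Metropolis algorithm on a toy one-dimensional posterior; the scalar model, Gaussian noise, and Gaussian prior below are illustrative assumptions, not taken from the text:

```python
import math
import random

random.seed(0)

def log_posterior(x, y=1.2, sigma_noise=0.5, sigma_prior=2.0):
    # Toy posterior pi(x|y) proportional to pi(y|x) pi(x) for the scalar
    # model y = x + n, with Gaussian noise n and a Gaussian prior on x.
    return -0.5 * ((y - x) / sigma_noise) ** 2 - 0.5 * (x / sigma_prior) ** 2

def metropolis(n_samples=20000, step=1.0):
    samples, x = [], 0.0
    lp = log_posterior(x)
    for _ in range(n_samples):
        x_prop = x + step * random.gauss(0.0, 1.0)    # random-walk proposal
        lp_prop = log_posterior(x_prop)
        if math.log(random.random()) < lp_prop - lp:  # accept/reject
            x, lp = x_prop, lp_prop
        samples.append(x)
    return samples

samples = metropolis()
# Sample estimate of the posterior mean, as in (1.9).
x_bar = sum(samples) / len(samples)
print("estimated posterior mean:", x_bar)
```

For this conjugate Gaussian example the posterior mean is available in closed form, (y/σ_n²)/(1/σ_n² + 1/σ_p²) ≈ 1.13, which makes it a convenient sanity check for the sampler; no such closed form exists for realistic measurement operators, which is the source of the computational cost described above.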
(d) Geometric constraints and qualitative methods, like the Bayesian framework, recognize that the available data are not nearly sufficient to uniquely determine x, even approximately. They severely restrict the set of potential parameters, or they try to recover not all of x but rather partial information about x.

Many geometric constraints and qualitative methods have been developed in the mathematical literature. Their main advantage is that they are computationally much more tractable than Bayesian reconstructions; see Chapter 11.
1.2 Examples of Measurement Operator

We now present several examples of measurement operators. The first three MO are there for the reader to gain some familiarity with the notion if necessary, while the last two examples of MO find important applications in, e.g., medical imaging.
Example 1. Let X = C([0, 1]), the space of continuous functions, let Y = X, and define

    M(f)(x) = ∫₀ˣ f(y) dy.

Here, a point in X is the function denoted by f, traditionally written f(x). The operator M is certainly injective, since the equality of data M(f) = M(g) implies that

    f = d/dx M(f) = d/dx M(g) = g,

i.e., the equality of parameters.

However, the operator M is smoothing, since M(f) is one derivative more regular than f. So inverting the operator, as indicated above, involves differentiating the data. If noise is added to the data and the noise is high frequency, then the derivative of the noise will be large and may overwhelm the reconstruction of f(x). The objective of penalization theory is precisely to make sure this undesirable fact does not happen.
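The amplification of high-frequency noise by differentiation is easy to observe directly. The following sketch uses an illustrative grid and noise level:

```python
import numpy as np

n = 1000
x = np.linspace(0.0, 1.0, n)
f = np.sin(2 * np.pi * x)                      # parameter to recover

# Forward map M(f)(x) = integral of f from 0 to x (trapezoid rule).
h = x[1] - x[0]
Mf = np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) / 2.0) * h))

rng = np.random.default_rng(0)
y = Mf + 1e-3 * rng.standard_normal(n)         # data with small additive noise

f_rec = np.gradient(y, x)                      # inversion by differentiation

data_err = np.max(np.abs(y - Mf))
rec_err = np.max(np.abs(f_rec - f))
print("data error:", data_err, " reconstruction error:", rec_err)
```

A data perturbation of size 10⁻³ produces a reconstruction error of order one on this grid: dividing point-to-point noise by the small grid spacing is exactly the instability described above, and it worsens as the grid is refined.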
Example 2. On X = (
1
0
([0, 1]), the space of continuously dierentiable functions
with value 0 at x = 0 and Y = (([0, 1]), we can dene
M(f)(x) = f
t
(x), the derivative of f.
Note that the derivative of a continuous function is not continuous (and may not exist as
a function although it can be dened as a distribution). So here, we found it convenient
to dene the domain of denition of M as a subset of the space of continuous functions.
Why do we also insist on f(0) = 0 in X? It is because otherwise, M would not be
injective. Indeed, antiderivatives are dened up to a constant. By enforcing f(0) = 0,
the constant is xed. In X, we have from the fundamental theory of calculus that
f(x) = f(0) +
_
x
0
f
t
(y)dy =
_
x
0
f
t
(y)dy =
_
x
0
M(f)(y)dy.
Now obviously, M(f) = M(g) implies f = g. From the point of view of inverse problems,
this is a very favorable situation. If noise is added to M(f), it will be integrated during
the reconstruction, which is a much more stable process than dierentiating.
Example 3. With the same setting as in Example 1, consider

    M(f)(x) = ∫₀¹ f(y) dy.

This operator is well defined, but it is clearly not injective: all we learn about f is its mean on (0, 1). This operator corresponds to data that are not very informative.
Example 4. Let X = C_c(ℝ²) be the space of continuous functions with compact support (functions supported in a bounded domain of ℝ², i.e., vanishing outside of that domain), and let Y = C(ℝ × (0, 2π)). For s ∈ ℝ and θ ∈ (0, 2π), we define l(s, θ) as the line with direction perpendicular to θ = (cos θ, sin θ) and at a distance |s| from the origin (0, 0). More precisely, let θ⊥ = (−sin θ, cos θ) be the rotation of θ by π/2. Then

    l(s, θ) = {x ∈ ℝ² such that x = sθ + tθ⊥ for t ∈ ℝ}.    (1.11)

We define

    M(f)(s, θ) = ∫_{l(s,θ)} f(x) dl = ∫_ℝ f(sθ + tθ⊥) dt,

where dl is the line measure along l(s, θ). In other words, M maps a function to the values of its integrals along any line. The operator M is the two-dimensional Radon transform. We shall see that M is injective and admits an explicit inversion.

The Radon transform and its inversion form the mathematical backbone of Computerized Tomography (CT), one of the most successful medical imaging techniques available to date.
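A convenient sanity check: for f the indicator function of the disk of radius R centered at the origin, the integral along l(s, θ) is the chord length 2√(R² − s²) for |s| < R, for every angle θ. The quadrature below is an illustrative sketch of the transform just defined:

```python
import numpy as np

def radon_line_integral(f, s, theta, t_max=2.0, n_t=4001):
    # Integrate f along l(s, theta) = { s*theta_vec + t*theta_perp : t in R },
    # truncated to |t| <= t_max (enough for compactly supported f).
    t = np.linspace(-t_max, t_max, n_t)
    theta_vec = np.array([np.cos(theta), np.sin(theta)])
    theta_perp = np.array([-np.sin(theta), np.cos(theta)])
    pts = s * theta_vec[None, :] + t[:, None] * theta_perp[None, :]
    vals = f(pts[:, 0], pts[:, 1])
    return float(np.sum((vals[:-1] + vals[1:]) / 2.0) * (t[1] - t[0]))  # trapezoid

R = 1.0
disk = lambda px, py: (px ** 2 + py ** 2 <= R ** 2).astype(float)

for s in (0.0, 0.5, 0.9):
    numeric = radon_line_integral(disk, s, theta=0.3)
    exact = 2.0 * np.sqrt(R ** 2 - s ** 2)
    print(f"s = {s}: numeric = {numeric:.4f}, chord length = {exact:.4f}")
```

The independence of the result from θ reflects the rotational symmetry of the disk; for a general f, M(f)(s, θ) genuinely depends on both variables.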
Example 5. Let us conclude with a more involved, though practically very relevant, example: the Calderon problem. We first introduce the following elliptic partial differential equation

   ∇·γ(x)∇u(x) = 0,   x ∈ X
   u(x) = f(x),        x ∈ ∂X,      (1.12)

where X is a smooth domain in ℝⁿ, ∂X its boundary, γ(x) a smooth coefficient in X bounded above and below by positive constants, and f(x) a prescribed Dirichlet data for the elliptic problem. This equation is a standard forward problem and it is known that it admits a unique solution. Moreover, the outgoing current

   j(x) = γ(x) ∂u/∂ν (x),   with ν(x) the outward unit normal to ∂X at x ∈ ∂X,

is a well defined function. This allows us to define the Dirichlet-to-Neumann (a.k.a. Poincare-Steklov) operator

   Λ_γ : H^{1/2}(∂X) → H^{−1/2}(∂X)
         f(x) ↦ Λ_γ[f](x) = j(x) = γ(x) ∂u/∂ν (x).      (1.13)

Here, H^s(∂X) are standard Hilbert spaces of distributions defined at the domain's boundary.

Let now X = C²(X̄) and Y = L(H^{1/2}(∂X), H^{−1/2}(∂X)), where the space L(X₁, X₂) means the space of linear bounded (continuous) operators from X₁ to X₂. We define the measurement operator

   M : X ∋ γ ↦ M(γ) = Λ_γ ∈ Y.      (1.14)

8 CHAPTER 1. WHAT CONSTITUTES AN INVERSE PROBLEM

So the measurement operator maps the unknown conductivity γ to the Dirichlet-to-Neumann operator Λ_γ, which is by construction a functional of γ since the solution u of (1.12) is a functional of γ. The measurement operator therefore lives in a huge (admittedly, this is subjective) space. Acquiring data means acquiring Λ_γ, which means for each and every function g ∈ H^{1/2}(∂X) at the domain's boundary, performing an experiment that measures j(x).

As we shall see, the Calderon problem is a typical example of what is usually regarded as an ill-posed inverse problem. It finds important applications in Electrical Impedance Tomography and Optical Tomography, two medical imaging modalities. A milestone of inverse problems theory is the proof that M is an injective operator [60].
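A minimal sketch of the Dirichlet-to-Neumann operator in the simplest possible setting: on the unit disk with constant conductivity γ ≡ 1, the harmonic extension of the boundary data e^{inθ} is r^{|n|}e^{inθ}, so Λ₁ acts diagonally in the Fourier basis on the boundary circle with eigenvalue |n|. The code below (grid size and helper names are our own; this is the classical spectral description, not a general-γ solver) applies Λ₁ spectrally.

```python
import numpy as np

# Dirichlet-to-Neumann map on the unit disk for constant conductivity gamma = 1.
# In the Fourier basis e^{i n theta} on the boundary, Lambda_1 acts diagonally
# with eigenvalue |n| (harmonic extension r^{|n|} e^{i n theta}).
N = 256
theta = 2 * np.pi * np.arange(N) / N

def dtn_disk(f_vals):
    """Apply Lambda_1 to boundary values sampled on a uniform angular grid."""
    fhat = np.fft.fft(f_vals)
    n = np.fft.fftfreq(N, d=1.0 / N)     # integer Fourier modes
    return np.real(np.fft.ifft(np.abs(n) * fhat))

f = np.cos(3 * theta)      # Dirichlet data
j = dtn_disk(f)            # outgoing current Lambda_1 f = 3 cos(3 theta)
print(np.max(np.abs(j - 3 * np.cos(3 * theta))))
```

Note that constants (the n = 0 mode) are mapped to zero: a constant boundary voltage drives no current, which is the spectral signature of the fact that Λ_γ only determines f up to this compatibility.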
1.3 Inverse Problems and Modeling: application to Magnetic Resonance Imaging

This section presents an extremely simplified version of the extremely successful medical imaging modality called Magnetic Resonance Imaging (MRI). While doing so, we observe that MRI reconstructions may be modeled by (at least) three different measurement operators, and hence three different inverse problems. This serves as an example of the importance of modeling when dealing with practical inverse problems.

MRI exploits the precession of the spin of protons in a magnetic field H(x), which is a vector in ℝ³ for each position x = (x, y, z) ∈ ℝ³. The axis of the precession is that of the magnetic field and the frequency of the precession is ω(x) = γ|H|(x), where γ = e/(2m) is called the gyromagnetic ratio, e is the electric charge of the proton and m its mass.

In a nutshell, MRI works as follows. Assume first that we impose a strong static magnetic field H₀ = H₀e_z along the z axis. All protons end up with their spin parallel to H₀, and slightly more so in the direction H₀ than in −H₀. This difference is responsible for a macroscopic magnetization M pointing in the same direction as H₀.

In a second step, we generate radio frequency magnetic waves at the Larmor frequency ω₀ = γ|H₀|. In clinical MRI, the frequency is typically between 15 and 80 MHz (for hydrogen imaging), which corresponds to wavelengths between 20 and 120 m (since ω = ck = 2πc/λ and c ≈ 3·10⁸ m/s). So the wavelength is not what governs spatial resolution in MRI, which is, as for most successful medical imaging modalities, sub-millimetric. For instance the pulse (assumed to be independent of x to start with) may be of the form H₁(t) = 2H₁ cos(ω₀t)e_x and turned on for a duration t_p. Because the field oscillates at the Larmor frequency, the spins of the protons are affected. The resulting effect on the macroscopic magnetization is that it precesses around the axis e_z at frequency ω₀. The spins make an angle with respect to the direction e_z given at time t_p by

   θ = γH₁t_p.

Generally, t_p is chosen such that θ = π/2 or θ = π. The corresponding pulses are called 90° and 180° pulses, respectively. Thus, after a 90° pulse, the magnetization oscillates in the xy plane, and after a 180° pulse, the magnetization is pointing in the direction −H₀.
1.3. IP AND MODELING. APPLICATION TO MRI 9
Once the radio frequency is turned off (but not the static field H₀), protons tend to realign with the static field H₀. By doing so, they emit radio frequency waves at the Larmor frequency ω₀ that can be measured. This wave is called the free induction decay (FID) signal. The FID signal after a 90° pulse will have the form

   S(t) = ρ e^{iω₀t} e^{−t/T₂}.      (1.15)

Here ρ is the density of the magnetic moments and T₂ is the spin-spin relaxation time. (There is also a spin-lattice relaxation time T₁ ≫ T₂, which cannot be imaged with 90° pulses and which we ignore.) The main reason for doing all this is that the density ρ and the relaxation time T₂ depend on the tissue sample. We restrict ourselves to the reconstruction of ρ here, knowing that similar experiments can be devised to image T₂ (and T₁) as well. To simplify, we assume that measurements are performed over a period of time that is small compared to T₂ so that the exponential term e^{−t/T₂} can be neglected.
Now human tissues are not spatially homogeneous, which makes imaging a lot more useful. The density of magnetic moments ρ = ρ(x) depends on the type of tissue at x ∈ ℝ³. This is the parameter we wish to reconstruct.

The physical mechanism that allows for the good spatial resolution of MRI (sub-millimeter resolution for brain imaging) is that only tissue samples under a static magnetic field H such that γ|H| = ω₀ will be affected by the radio frequency pulse H₁(t) = 2H₁ cos(ω₀t)e_x. We thus need to make sure that the static field has the correct amplitude in as small a spatial area as possible. To do so, we impose the static field

   H(z) = H₀ + G_z z e_z.

Only those protons in the slice where z is close to 0 will be affected by the pulse H₁ since we have assumed that γ|H₀| = ω₀. As a consequence, the measured signal takes the form

   S(t) = e^{iω₀t} ∫_{ℝ²} ρ(x, y, 0) dxdy   so that   e^{−iω₀t} S(t) = ∫_{ℝ²} ρ(x, y, 0) dxdy.      (1.16)
The above right-hand side thus describes the average density in the plane z = 0. MRI is thus a tomographic technique (tomos meaning section or slice in Greek).

By changing H₀ or ω₀, we can obtain the average density in the plane z = z₀ for all values of z₀ ∈ ℝ. Moreover, by rotating the generated magnetic field H(z), we are ideally able to obtain the average density in any plane in ℝ³. Planes may be parameterized by their normal vector θ ∈ S², with S² the unit sphere in ℝ³, and their distance s to the origin (0, 0, 0) ∈ ℝ³. Let P(s, θ) be the corresponding plane. Then what we have obtained is that MRI experiments allow us to obtain the plane integrals of the density

   Rρ(s, θ) = ∫_{P(s,θ)} ρ(x) dσ(x),      (1.17)

where dσ(x) is the surface (Euclidean) measure on the plane P(s, θ). Here, Rρ(s, θ) is the three-dimensional Radon transform of ρ(x). This is the first inverse problem we encounter. The measurement operator maps functions defined on ℝ³ (for instance compactly supported continuous functions) to their Radon transform, which is a function (for instance a compactly supported continuous function) of (s, θ) ∈ ℝ × S². We thus have the Inverse Problem:
3D Radon Transform: Reconstruct the density ρ(x) from knowledge of Rρ(s, θ).
Equivalently: Reconstruct a function from its plane integrals.
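As a sanity check of (1.17), the plane integrals of the Gaussian ρ(x) = e^{−|x|²} can be computed in closed form: Rρ(s, θ) = π e^{−s²} for every unit normal θ. A small numerical sketch (the quadrature parameters and helper names are ours):

```python
import numpy as np

# 3D Radon transform (plane integral) of rho(x) = exp(-|x|^2): equals
# pi * exp(-s^2) for the plane at distance s with any unit normal theta.
def plane_integral(rho, s, theta, half=5.0, n=201):
    theta = np.asarray(theta, float)
    theta /= np.linalg.norm(theta)
    # Build an orthonormal basis (e1, e2) of the plane orthogonal to theta.
    a = np.array([1.0, 0.0, 0.0])
    if abs(theta[0]) > 0.9:
        a = np.array([0.0, 1.0, 0.0])
    e1 = np.cross(theta, a); e1 /= np.linalg.norm(e1)
    e2 = np.cross(theta, e1)
    u = np.linspace(-half, half, n)
    U, V = np.meshgrid(u, u)
    pts = s * theta + U[..., None] * e1 + V[..., None] * e2
    du = u[1] - u[0]
    return rho(pts).sum() * du * du     # 2D Riemann sum over the plane

rho = lambda p: np.exp(-np.sum(p**2, axis=-1))
print(plane_integral(rho, 0.7, [1.0, 2.0, 2.0]), np.pi * np.exp(-0.49))
```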
There are several issues with the above inverse problem. First of all, the Radon transform integrates over planes, which is a smoothing operation. This smoothing has to be undone in the reconstruction procedure. With more modeling, we will obtain another MO that does not involve any smoothing. More fundamentally from a practical point of view, rotating the whole magnetic field from the direction e_z to an arbitrary direction θ is very challenging technologically. One could also rotate the object of interest rather than the heavy magnet. However, for the imaging of human patients, this is not feasible either, for rather obvious reasons. Additional modeling is therefore necessary.
So far, we have a vertical discrimination of the proton density. The transversal discrimination is obtained by imposing a static field linearly varying in the x and y directions. Remember that after the 90° pulse, the magnetization M(x, y, 0) rotates with frequency ω₀ in the xy plane (i.e., is orthogonal to e_z), and is actually independent of x and y. Let us now impose a static field H(y) = H₀ + G_y y e_z for a duration T. Since the frequency of precession is related to the magnetic field, the magnetization at position y will rotate with frequency ω(y) = ω₀ + γG_y y. Therefore, compared to the magnetization at y = 0, the magnetization at y will accumulate, during the time T the field G_y y e_z is turned on, a phase given by T(ω(y) − ω₀) = TγG_y y. Once the field G_y y e_z is turned off, the magnetization will again rotate everywhere with frequency ω₀. However, the phase depends on position y. This part of the process is called phase encoding. A measurement of the FID would then give us a radio frequency signal of the form

   S(t; T) = e^{iω₀t} ∫_{ℝ²} e^{iγG_yTy} ρ(x, y, 0) dxdy.      (1.18)
By varying the time T or the gradient G_y, we see that we can obtain the frequency content in y of the density ρ(x, y, 0). More precisely,

   ∫_ℝ ρ(x, y, 0) dx = (1/2π) ∫_ℝ e^{−ik_y y} e^{−iω₀t} S(t; k_y/(γG_y)) dk_y.      (1.19)
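Formula (1.19) can be tested on synthetic data. The sketch below (with γG_y normalized to 1 and the carrier e^{iω₀t} divided out, so t plays no role; all names and grids are ours) simulates the phase-encoded signal for a Gaussian density and recovers the x-integrated density by the Fourier inversion (1.19).

```python
import numpy as np

# Discrete check of (1.19): the demodulated phase-encoded signal, sampled at
# T = k_y / (gamma * G_y), is the Fourier transform in y of int rho dx.
rho = lambda x, y: np.exp(-x**2 - y**2)

x = np.linspace(-5, 5, 201); y = np.linspace(-5, 5, 201)
dx = x[1] - x[0]; dy = y[1] - y[0]
X, Y = np.meshgrid(x, y, indexing="ij")

def S_demodulated(ky):
    """e^{-i omega_0 t} S(t; ky) = int e^{i ky y} rho(x, y) dx dy."""
    return np.sum(np.exp(1j * ky * Y) * rho(X, Y)) * dx * dy

k = np.linspace(-20, 20, 401)
dk = k[1] - k[0]
Shat = np.array([S_demodulated(ky) for ky in k])

# Recover g(y) = int rho(x, y) dx at y = 0.5 via the inversion (1.19);
# exact value: sqrt(pi) * exp(-y^2).
y0 = 0.5
g_rec = np.real(np.sum(np.exp(-1j * k * y0) * Shat) * dk / (2 * np.pi))
print(g_rec, np.sqrt(np.pi) * np.exp(-y0**2))
```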
This provides us with the line integrals of ρ(x, y, 0) at z = 0 for all the lines that are parallel to the x-axis. Note that the phase encoding was performed by using a field that was linear in the y variable. We can use a field that is linear in the variable x cos θ + y sin θ instead. Denoting by ϑ = (cos θ, sin θ) ∈ S¹ a unit vector in ℝ², by ϑ^⊥ = (−sin θ, cos θ) its rotation by π/2, and by l(s, θ) the line with normal ϑ at a distance s from the origin (0, 0) ∈ ℝ² defined in (1.11), we are thus able to measure all line integrals of the function ρ(x, y, 0):

   R(s, θ) = ∫_{l(s,θ)} ρ(x, y, 0) dl(x, y) = ∫_ℝ ρ(sϑ + tϑ^⊥, 0) dt,      (1.20)

where dl(x, y) is the line (Euclidean) measure on l(s, θ). This is the second inverse problem we encounter: we wish to reconstruct ρ(x, y) from knowledge of its line integrals R(s, θ). This is the two-dimensional Radon transform of ρ(x, y, 0); see Example 4 above. The measurement operator maps functions defined on ℝ² (for instance compactly supported continuous functions) to the Radon transform, which is a function (for
instance a compactly supported continuous function) of (s, θ) ∈ ℝ × S¹. We thus have the Inverse Problem:

2D Radon Transform: Reconstruct the density ρ(x, y) from knowledge of R(s, θ).
Equivalently: Reconstruct a function from its line integrals.
The 2D Radon transform still involves smoothing (integration along lines), as does the three-dimensional Radon transform. Note, however, that there is no need to rotate the magnet or the patient to acquire the data. Although the inverse Radon transform is useful and used in practical MRI, the missing information in the x variable in measurements of the form (1.18) can in fact be obtained by additional modeling. Indeed, nothing prevents us from adding an x-dependent static field during the FID measurements. Let us assume that after time T (where we reset time to be t = 0), we impose a static field of the form H(x) = H₀ + G_x x e_z. The magnetization will now precess around the z axis with x-dependent frequency ω(x) = ω₀ + γG_x x. This implies that the measured signal will be of the form

   S(t; T) = ∫_{ℝ²} e^{iγG_yTy} e^{i(ω₀+γG_xx)t} ρ(x, y, 0) dxdy.      (1.21)
We thus have access to the measured data

   d(k_x, k_y) = e^{iω₀k_x/(γG_x)} S(−k_x/(γG_x); −k_y/(γG_y)) = ∫_{ℝ²} e^{−ik_xx} e^{−ik_yy} ρ(x, y, 0) dxdy.      (1.22)

By varying T (or G_y) and t and G_x, we can obtain the above information for essentially all values of k_x and k_y. This is our third Inverse Problem:

2D Fourier Transform: Reconstruct the density ρ(x, y, 0) from knowledge of d(k_x, k_y).
Equivalently: Reconstruct a function from its plane wave decomposition.
This is a well-known problem whose solution involves applying the Inverse Fourier Transform

   ρ(x, y, 0) = 1/(2π)² ∫_{ℝ²} e^{i(k_xx+k_yy)} d(k_x, k_y) dk_x dk_y.      (1.23)

Several approximations have been made to obtain this reconstruction formula. Within this framework, we see however that density reconstructions are relatively simple: all we have to do is to invert a Fourier transform. The above procedure can be repeated for all values of z, providing the density ρ(x, y, z) everywhere in the domain of interest.
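A sketch of the reconstruction (1.23) on synthetic k-space data (our own toy setup): for the Gaussian density ρ(x, y, 0) = e^{−x²−y²} the data are d(k_x, k_y) = π e^{−(k_x²+k_y²)/4}, and the inverse Fourier integral recovers ρ pointwise.

```python
import numpy as np

# k-space reconstruction (1.23): given samples of d(kx, ky) = Fourier
# transform of rho, recover rho by the inverse Fourier integral. Here d is
# known analytically for rho(x, y) = exp(-x^2 - y^2).
k = np.linspace(-20, 20, 201)
dk = k[1] - k[0]
KX, KY = np.meshgrid(k, k, indexing="ij")
d = np.pi * np.exp(-(KX**2 + KY**2) / 4.0)   # measured k-space data

x0, y0 = 0.3, -0.2
phase = np.exp(1j * (KX * x0 + KY * y0))
rho_rec = np.real(np.sum(phase * d) * dk * dk / (2 * np.pi) ** 2)
print(rho_rec, np.exp(-(x0**2 + y0**2)))
```

In practice the inverse Fourier integral is computed on a grid with the FFT rather than point by point, but the quadrature above makes the correspondence with (1.23) transparent.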
We do not consider the difficulties of MRI further. The above derivation shows that MRI can be modeled by at least three different inverse problems. A first inverse problem, based on the three-dimensional Radon transform, is not very practical. The second and third inverse problems, based on the inverse Radon transform (RT) and the inverse Fourier transform (FT), are used in practical reconstructions in MRI. Note that the IP based on the FT requires the acquisition of more data than the IP based on the RT. The reason for acquiring more data is that the IP based on the FT is better posed than that based on the RT. The next section introduces a simple Hilbert scale that allows one to quantify the notion of well- or ill-posedness in some simple cases.
1.4 Inverse Problems and Smoothing: Hilbert scale
The inverse problems derived above all involve reconstructions in ℝⁿ for dimension n = 2 or n = 3 in geometries that are invariant by translation. The Fourier transform is therefore an ideal candidate to devise reconstruction algorithms and we briefly recall its definition below.

We have also mentioned on several occasions already that inverse problems could be coined as well-posed or ill-posed. Such notions are, we say it again, subjective. The main reason a problem is ill-posed is because noise is amplified too much during the reconstruction, which is a subjective statement. Nonetheless, there is one important reason why noise is amplified in the reconstruction: it is because high frequencies are damped during the acquisition of the measurements, or in other words, the MO is a smoothing operator. The more smoothing (regularizing) the MO, the more amplified (deregularized) the noise is going to be during the reconstruction. Below we introduce a Hilbert scale that allows us to quantify the aforementioned regularization and to be more specific about the notions of well-posedness and ill-posedness.
1.4.1 Fourier transforms and well-posedness
Let f(x) be a complex-valued function in L²(ℝⁿ) for some n ∈ ℕ*, which means a (measurable) function on ℝⁿ that is square integrable in the sense that

   ‖f‖² = ∫_{ℝⁿ} |f(x)|² dx < ∞.      (1.24)
Here ‖f‖ is the L²(ℝⁿ)-norm of f and dx the Lebesgue (volume) measure on ℝⁿ. We define the Fourier transform of f as

   f̂(k) = [F_{x→k}f](k) = ∫_{ℝⁿ} e^{−ik·x} f(x) dx.      (1.25)

It is a well-known result about Fourier transforms that f̂(k) ∈ L²(ℝⁿ) and the Fourier transform admits an inverse on L²(ℝⁿ) given by

   f(x) = [F⁻¹_{k→x}f̂](x) = 1/(2π)ⁿ ∫_{ℝⁿ} e^{ik·x} f̂(k) dk.      (1.26)

More precisely we have the Parseval relation

   (f̂, ĝ) = (2π)ⁿ (f, g)   and   ‖f̂‖ = (2π)^{n/2} ‖f‖,      (1.27)

where the Hermitian product is given by

   (f, g) = ∫_{ℝⁿ} f(x) ḡ(x) dx.      (1.28)

Here ḡ is the complex conjugate of g. So up to the factor (2π)^{n/2}, the Fourier transform and its inverse are isometries.
Important properties of the Fourier transform for us here are how it interacts with differentiation and convolutions. Let α = (α₁, ..., αₙ) be a multi-index of non-negative components αⱼ ≥ 0, 1 ≤ j ≤ n, and let |α| = Σⱼ₌₁ⁿ αⱼ be the length of the multi-index. We then define the differentiation D^α of degree |α| as

   D^α = Πᵢ₌₁ⁿ ∂^{αᵢ}/∂xᵢ^{αᵢ}.      (1.29)

We then deduce from the definition (1.25) that

   F_{x→k}[D^α f](k) = ( Πⱼ₌₁ⁿ (ikⱼ)^{αⱼ} ) [F_{x→k}f](k).      (1.30)
Let us now define the convolution as

   f ⋆ g(x) = ∫_{ℝⁿ} f(x − y) g(y) dy.      (1.31)

We then verify that

   F_{x→k}(f ⋆ g) = F_{x→k}f F_{x→k}g,   i.e.   (f ⋆ g)^ = f̂ ĝ,
   F⁻¹_{k→x}(f̂ ⋆ ĝ) = (2π)ⁿ f g,   i.e.   (fg)^ = (2π)^{−n} f̂ ⋆ ĝ.      (1.32)

So the Fourier transform diagonalizes differential operators (replaces them by multiplication in the Fourier domain). However, Fourier transforms replace products by non-local convolutions.
The Fourier transform is a well-posed operator from L²(ℝⁿ) to L²(ℝⁿ) since the inverse Fourier transform is also defined from L²(ℝⁿ) to L²(ℝⁿ) and is bounded as shown in (1.27). Let us assume that we measure

   d̂(k) = f̂(k) + N̂(k),

where we believe that δ = ‖N̂‖ is relatively small. Then the error in the reconstruction will also be of order δ in the L²(ℝⁿ) norm. Indeed let d(x) be the reconstructed function from the data d̂(k) and f(x) be the real function we are after. Then we have

   ‖d − f‖ = ‖F⁻¹_{k→x}d̂ − F⁻¹_{k→x}f̂‖ = ‖F⁻¹_{k→x}(d̂ − f̂)‖ = (2π)^{−n/2} δ.      (1.33)

In other words, measurement errors can still be seen in the reconstruction. The resulting image is not perfect. However, the error due to the noise has not been amplified too drastically.
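A discrete analogue of (1.33), using the DFT in place of the continuous Fourier transform (with NumPy's normalization, the factor (2π)^{−n/2} becomes N^{−1/2}):

```python
import numpy as np

# Inverting the (discrete) Fourier transform does not amplify noise:
# for NumPy's DFT conventions, ||ifft(nhat)||_2 = N^{-1/2} ||nhat||_2,
# the discrete counterpart of the (2 pi)^{-n/2} factor in (1.33).
rng = np.random.default_rng(0)
N = 1024
nhat = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # Fourier-domain noise

err = np.linalg.norm(np.fft.ifft(nhat))        # reconstruction error
print(err, np.linalg.norm(nhat) / np.sqrt(N))  # the two agree
```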
1.4.2 Hilbert scale and degrees of ill-posedness

Well-posed Problems and Lipschitz Stability. Let us revisit and make more explicit the notion of well-posedness and ill-posedness mentioned in section 1.1. We assume that M is a linear operator A from X to Y for X and Y Banach spaces associated with their natural norms. For a given data y ∈ Y, we would like to solve the linear problem

   Find x such that Ax = y.      (1.34)

As we mentioned earlier, a well-posed problem is a problem where data noise is not amplified too drastically during the reconstruction. Mathematically, this subjective notion may be described by the property that the (bounded) operator A is invertible (A⁻¹y is defined for all y ∈ Y) and (hence by the open mapping theorem) of bounded inverse, i.e., ‖A⁻¹y‖_X ≤ C‖y‖_Y for a constant C that depends on A but not on y ∈ Y. The error between two solutions x₁ and x₂ corresponding to two data y₁ and y₂ then satisfies

   ‖x₁ − x₂‖_X ≤ C‖y₁ − y₂‖_Y.      (1.35)

This stability estimate is consistent with the statement that noise y₁ − y₂ is not amplified by more than a multiplicative factor C during the reconstruction. Moreover, when noise levels are reduced by a factor two, which may be achieved by adding detectors or obtaining higher quality measurements, then (1.35) states that in the worst possible case (since (1.35) holds for arbitrary x₁ − x₂, and hence C reflects the amplification of noise in the worst scenario), the error in the reconstruction will also be reduced by a factor two.

Note that the choice of the spaces X and Y and their norms ‖·‖_X and ‖·‖_Y matters. The definition and the boundedness of the operator A⁻¹ obviously depend upon these choices and hence so does the validity of (1.35). An estimate of the form (1.35) is a stability estimate with Lipschitz constant C, and we then say that the inverse problem (1.34) is Lipschitz-stable.
Ill-posed problems and unbounded operators. The inverse Fourier transform is an example of a well-posed inverse problem from L²(ℝⁿ) to L²(ℝⁿ). We will see in the following chapter that the two-dimensional and three-dimensional Radon transforms are also well-posed with appropriate choices of X and Y. Many inverse problems are, however, considered to be ill-posed, for instance because the application of A⁻¹ to the noisy data y produces a result that is deemed inadequate or because it is not possible to define an operator A⁻¹. Being ill-posed does not mean that a problem cannot be solved. However, it means that additional information needs to be incorporated into the inversion.

Let us attempt to mathematize these subjective notions. Typically, we can distinguish two notions of ill-posedness. The first one corresponds to operators A that are not injective. In this case, the data do not uniquely determine the parameters. This situation is typically remedied by acquiring additional information. We will not consider such settings much in these notes except to say that the Bayesian framework considered in Chapter 10 is adapted to such problems. Many practical inverse problems may be seen as discretizations of injective operators.

The second notion of ill-posedness involves operators A that are injective but not surjective on the whole space Y (i.e., the range of A defined by Range(A) = A(X) is a proper subset of Y; that is to say, is not equal to Y). Because A is injective, A⁻¹ can be defined from Range(A) to X. However, it is not a bounded operator for the norms of Y and X, in the sense that a Lipschitz inequality such as (1.35) does not hold. From a practical point of view, applying A⁻¹ to the available (noisy) data y provides a result that the user feels is too different from the expected x.

Mathematically, the unbounded character of A⁻¹ very much depends on the choice of functional spaces. The operator A⁻¹ could very well be defined and bounded from another space Y′ to another space X′, in which case the same inverse problem based on the MO A could be ill-posed from X to Y but well-posed from X′ to Y′. In other words, a user comfortable with the modeling of A from X′ to Y′ deals with a well-posed problem, whereas the user insisting on a modeling of A from X to Y needs to add information to the inversion to obtain a more satisfactory answer.

In spite of the subjectivity of the notion of ill-posedness, one of the main reasons why an inverse problem is deemed ill-posed in practice is because the MO A is smoothing. Smoothing means that Ax is more regular than x, in the sense that details (small scale structures) are attenuated by the MO. Again, this does not mean that the details cannot be reconstructed. When the MO is injective, they can. This means, however, that the reconstruction has to undo this smoothing. As soon as the data are noisy (i.e., always in practice) and the noise contribution has small scale structures (i.e., often in practice), the deregularization process has the effect of amplifying the noise in a way that can potentially be very harmful (which is, as we already said, subjective).

The answer to ill-posedness is to impose prior assumptions on the parameters we wish to reconstruct. As we mentioned earlier in this chapter, the simplest example of such an assumption is to assume that the function is sufficiently smooth. In order to define what we mean by ill-posedness and quantify the degree of ill-posedness, we need a scale of spaces in which the smoothing of A can be measured. We will use what is probably the simplest scale of function spaces, namely the scale of Hilbert spaces.
The scale of Hilbert spaces. Let s ≥ 0 be a non-negative real number. We define the scale of Hilbert spaces H^s(ℝⁿ) as the space of measurable functions f(x) such that

   ‖f‖²_{H^s(ℝⁿ)} = ∫_{ℝⁿ} (1 + |k|²)^s |F_{x→k}f|²(k) dk < ∞.      (1.36)

We verify that H⁰(ℝⁿ) = L²(ℝⁿ) since the Fourier transform is an isometry. We also verify that

   f ∈ H¹(ℝⁿ)  ⟺  ( f ∈ L²(ℝⁿ) and ∂f/∂xᵢ ∈ L²(ℝⁿ), 1 ≤ i ≤ n ).      (1.37)

This results from (1.30). More generally, the space H^m(ℝⁿ) for m ∈ ℕ is the space of functions such that all partial derivatives of f of order up to m are square integrable. The advantage of the definition (1.36) is that it holds for real values of s as well. So H^{1/2}(ℝⁿ) is the space of functions such that "half-derivatives" of f are square integrable. Notice also that s characterizes the degree of smoothness of a function f(x). The larger s, the smoother the function f ∈ H^s(ℝⁿ), and the faster the decay of its Fourier transform f̂(k), as can be seen from the definition (1.36).
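The definition (1.36) can be probed numerically. For f(x) = e^{−|x|} in one dimension, f̂(k) = 2/(1 + k²), so ‖f‖²_{H^s} = 4∫(1 + k²)^{s−2} dk: finite precisely when s < 3/2 (the kink at x = 0 is what limits the smoothness on the scale). A sketch (truncation parameters are ours):

```python
import numpy as np

# H^s norms (1.36) in 1D for f(x) = exp(-|x|), with fhat(k) = 2/(1+k^2):
# ||f||_{H^s}^2 = int (1+k^2)^s * 4/(1+k^2)^2 dk, finite iff s < 3/2.
def hs_norm_sq(s, K):
    """Truncated integral of (1+k^2)^s |fhat(k)|^2 over [-K, K]."""
    k = np.linspace(-K, K, 200001)
    integrand = (1 + k**2) ** s * (2.0 / (1 + k**2)) ** 2
    return np.sum(integrand) * (k[1] - k[0])

# s = 1: converges to the closed-form value 4*pi as K grows.
print(hs_norm_sq(1.0, 1e3), 4 * np.pi)
# s = 2: the truncated integral grows linearly with K, so f is not in H^2.
print(hs_norm_sq(2.0, 1e3) / hs_norm_sq(2.0, 1e2))
```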


It is also useful to define the Hilbert scale for functions supported on subdomains of ℝⁿ. Let X be a sufficiently smooth subdomain of ℝⁿ. We define two scales. The first scale is H^s₀(X), defined as the closure of C^∞₀(X), the space of functions of class C^∞ with support in X (so these functions and all their derivatives vanish at the boundary of X), for the norm in H^s(ℝⁿ). Thus, f ∈ H^s₀(X) can be described as the limit of functions fₙ ∈ C^∞₀(X) uniformly bounded in H^s(ℝⁿ). We also define H^s(X) as the space of functions f on X that can be extended to functions f̃ in H^s(ℝⁿ) (i.e., f = f̃χ_X, where χ_X is the characteristic function of X), and ‖f‖_{H^s(X)} is the lower bound of ‖f̃‖_{H^s(ℝⁿ)} over all possible extensions. There are several (sometimes not exactly equivalent) ways to define the scale of Hilbert spaces H^s(X).
Finally, it is also convenient to define H^s for negative values of s. We define H^{−s}(ℝⁿ) for s ≥ 0 as the subspace of S′(ℝⁿ), the space of tempered distributions, such that (1.36) holds. For bounded domains we define H^{−s}(X) as the dual to H^s₀(X) equipped with the norm

   ‖f‖_{H^{−s}(X)} = sup_{g ∈ H^s₀(X)} ( ∫_X f g dx ) / ‖g‖_{H^s₀(X)}.      (1.38)

We can verify that the two definitions agree when X = ℝⁿ, in which case H^s₀(ℝⁿ) = H^s(ℝⁿ).
A well-posed, ill-posed inverse problem. Let us illustrate on a simple example how the Hilbert scale can be used to understand the ill-posedness of inverse problems. Let f(x) be a function in L²(ℝ) and let u be the solution in L²(ℝ) of the following ODE

   −u″ + u = f,   x ∈ ℝ.      (1.39)

There is a unique solution in L²(ℝ) to the above equation given by

   u(x) = (1/2) ∫_ℝ e^{−|y−x|} f(y) dy = (g ⋆ f)(x),   g(x) = (1/2) e^{−|x|},

as can be verified by inspection. In the Fourier domain, this is

   (1 + k²) û(k) = f̂(k).

This implies that u is not only in L²(ℝ) but also in H²(ℝ), as is easily verified.
The problem is ill-posed... Let us define the operator A as follows:

   A : L²(ℝ) → L²(ℝ),   f ↦ Af = u,      (1.40)

where u is the solution to (1.39). As such, the operator A is not invertible on L²(ℝ). Indeed the inverse of the operator A is formally defined by A⁻¹u = −u″ + u. However, for u ∈ L²(ℝ), u″ is not a function in L²(ℝ) but a distribution in H^{−2}(ℝ). The inverse problem consisting of reconstructing f(x) ∈ L²(ℝ) from u(x) ∈ L²(ℝ) is therefore ill-posed. The reason is that the operator A is regularizing.
... or is it? However, let us define the same operator

   Ã : L²(ℝ) → H²(ℝ),   f ↦ Ãf = u.      (1.41)

Now Ã is invertible from H²(ℝ) to L²(ℝ) and its inverse is indeed given by Ã⁻¹u = −u″ + u. So Ã is well-posed from L²(ℝ) to H²(ℝ), as can easily be verified. Yet a third instantiation of the same operator is

   Â : H^{−2}(ℝ) → L²(ℝ),   f ↦ Âf = u.      (1.42)

Now Â is invertible from L²(ℝ) to H^{−2}(ℝ) and its inverse is indeed given by Â⁻¹u = −u″ + u, and thus Â is well-posed from H^{−2}(ℝ) to L²(ℝ).
If we assume that our noise (the error between measurement u₁ and measurement u₂) is small in the H²-norm, so that ‖u₁ − u₂‖_{H²(ℝ)} ≤ δ, and we are happy with a small error in the parameter in the L²(ℝ) sense, then there is no problem. The reconstruction will be accurate in the sense that ‖f₁ − f₂‖_{L²(ℝ)} ≤ Cδ, where fⱼ = Ã⁻¹uⱼ, j = 1, 2. The same occurs if we assume that noise is small in the L²-norm and we are happy with a small error in the parameter in the H^{−2}(ℝ) sense. This would typically mean that we are satisfied if spatial moments of f (i.e., quantities of the form ∫_ℝ f(x)φ(x)dx for φ ∈ H²(ℝ)) are accurate, rather than insisting that f be accurate in the L² sense.

However, in many instances, noise is not small in the strong norm H²(ℝ), but rather in the weaker norm L²(ℝ). At least this is our perception. Moreover, we do not want small errors in the H^{−2}(ℝ) sense, but rather insist that the reconstructions look good in some L²(ℝ) sense. Then the problem is certainly ill-posed, as A is not boundedly invertible between the spaces X = Y = L²(ℝ).
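The dichotomy just described is easy to visualize numerically. A discrete sketch (periodic grid and parameters of our own choosing): A damps the frequency-k content of f by (1 + k²)^{−1}, and applying A⁻¹ = −d²/dx² + 1 to data polluted by noise that is small in the (discrete) L² sense multiplies that noise by (1 + k²), so the reconstruction error dwarfs the data error.

```python
import numpy as np

# Discrete version of (1.39)-(1.40) on a periodic grid, diagonalized by FFT:
# A f = u with (1 + k^2) uhat = fhat; A^{-1} multiplies data by (1 + k^2).
N = 2048
L = 2 * np.pi
k = np.fft.fftfreq(N, d=L / N) * 2 * np.pi    # integer frequencies here

rng = np.random.default_rng(1)
f = np.zeros(N); f[N // 2 - 50:N // 2 + 50] = 1.0       # a rough L^2 parameter
u = np.real(np.fft.ifft(np.fft.fft(f) / (1 + k**2)))    # smooth data u = A f

noise = 1e-3 * rng.standard_normal(N)                   # small L^2 noise on u
f_rec = np.real(np.fft.ifft((1 + k**2) * np.fft.fft(u + noise)))

err_data = np.linalg.norm(noise) / np.sqrt(N)           # noise level in the data
err_rec = np.linalg.norm(f_rec - f) / np.sqrt(N)        # error after inversion
print(err_data, err_rec)    # the reconstruction error is vastly larger
```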
A possible definition of mildly and severely ill-posed inverse problems. We now introduce a useful practical distinction among ill-posed problems. As heuristic and subjective as the notion may be, an ill-posed inverse problem that requires relatively mild action from the user (in terms of the introduction of prior information) will be called mildly ill-posed. Some inverse problems may require much more stringent action from the user. They will be called severely ill-posed problems.

Using the Hilbert scale introduced earlier, a possible mathematical definition of mildly and severely ill-posed problems is as follows. We assume here that A is injective. It is not that the inverse of A does not exist that causes problems. It is because A is smoothing that action is necessary. When the smoothing is mild, then that action may be limited to an appropriate penalization such as those that will be described in Chapter 10. When the smoothing is severe, then typically much more drastic prior information needs to be introduced, for instance by using the Bayesian framework introduced in Chapter 10. We somewhat arbitrarily separate mild smoothing by a finite number of derivatives from severe smoothing by an "infinite number of derivatives".

The problem (1.34) with X = Y = L²(ℝⁿ) is said to be mildly ill-posed provided that there exist a positive constant C and α > 0 such that

   ‖Af‖_{H^α(ℝⁿ)} ≥ C‖f‖_{L²(ℝⁿ)}.      (1.43)

We define ‖Af‖_{H^α(ℝⁿ)} = +∞ if Af does not belong to H^α(ℝⁿ). We say that A is mildly ill-posed of order α if α is the smallest real number such that (1.43) holds for some C = C(α). Notice that we can choose any α ≥ 2 for Ã, so the operator that maps f to the solution u of (1.39) is a mildly ill-posed problem of order 2. The operator Ã in (1.41) essentially integrates the function f twice. Any injective operator that corresponds to a finite number of integrations is therefore mildly ill-posed.
We call the inversion a severely ill-posed problem when no such α exists. Unfortunately, there are many practical instances of such inverse problems. A typical example is the following operator:

   Bf(x) = F⁻¹_{k→x}[e^{−k²} F_{x→k}f](x).      (1.44)

Physically this corresponds to solving the heat equation forward in time: a very smoothing operation. We easily verify that the operator B maps L²(ℝ) to L²(ℝ). However, it damps high frequencies exponentially strongly, and more strongly than any finite number of integrations (m integrations essentially multiply high frequencies by |k|^{−m}), so no α > 0 in (1.43) exists for B. Note that this does not mean that B is never invertible. Indeed for sufficiently smooth functions g(x) in the range of B (for instance for functions g such that ĝ(k) has compact support), we can easily define the inverse operator

   B⁻¹g(x) = F⁻¹_{k→x}[e^{k²} F_{x→k}g](x).

Physically, this corresponds to solving the heat equation backwards in time. It is clear that on a space of sufficiently smooth functions, we have BB⁻¹ = B⁻¹B = Id. Yet, if noise is present in the data, it will be amplified by e^{k²} in the Fourier domain. Unless noise is low frequency, this has devastating effects on the reconstruction. It is instructive to numerically add the inverse Fourier transform of εn̂(k)e^{k²} to an image f(x) for n̂(k) = (1 + k²)^{−1}, say, so that the noise is square integrable. Even when ε is machine precision, the image is quickly drowned by noise even if a relatively small number of Fourier modes is used to perform the Fourier transform.
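The experiment suggested above takes only a few lines (the mode range is our own choice). Machine-precision noise with the square-integrable profile n̂(k) = (1 + k²)^{−1}, amplified by e^{k²} over even a modest range of modes, is already astronomically large:

```python
import numpy as np

# Amplify eps * nhat(k), with nhat(k) = (1+k^2)^{-1}, by e^{k^2}: the effect
# of applying the backward heat operator B^{-1} to machine-precision noise.
eps = np.finfo(float).eps                 # about 2.2e-16
k = np.arange(-25, 26).astype(float)      # a small number of Fourier modes
nhat = 1.0 / (1 + k**2)

amplified = eps * nhat * np.exp(k**2)     # Fourier content of the noise after B^{-1}
print(np.max(amplified))                  # astronomically large at |k| = 25
```

Any image perturbed by the inverse Fourier transform of these coefficients is unrecognizable, exactly as claimed: no finite-order Hilbert-scale estimate can absorb exponential amplification.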
Chapter 2

Integral Geometry. Radon transforms

This chapter and the next are devoted to Integral Geometry, which involves the reconstruction of an object, for instance described by a function, from knowledge of integral quantities of the object, such as its integrals along lines or planes.

In this chapter we consider the simplest example of integral geometry: the integration of a two-dimensional function along all possible lines in the plane, which is called the Radon transform, and the inversion of such a transform. This forms the mathematical backbone for one of the most successful medical imaging techniques: computed (or computerized) tomography (CT).

Later in the chapter, we consider the three-dimensional Radon transform, which concerns the integral of functions over planes in three dimensions of space, as well as a specific example of a weighted two-dimensional Radon transform, the attenuated Radon transform, which finds an application in the important medical imaging modality called Single Photon Emission Computerized Tomography (SPECT).
2.1 Transmission Tomography

In transmission tomography, objects to be imaged are probed with non-radiating sources such as X-rays. X-rays are composed of high energy photons (on the order of 60 keV, which corresponds to a wavelength of about 0.02 nm) that propagate through the object along straight lines unless they interact with the underlying medium and get absorbed.

Let x = (x₁, x₂) denote two-dimensional spatial position and θ ∈ S¹ orientation. We denote by u(x, θ) the density of X-rays with position x and orientation θ, and by a(x) a linear attenuation coefficient. The velocity of X-rays is normalized to 1 so that locally the density u(x, θ) satisfies the following transport equation:

   θ·∇ₓu(x, θ) + a(x)u(x, θ) = 0,   x ∈ X,  θ ∈ S¹.      (2.1)

Here X is the physical domain (assumed to be convex) of the object and S¹ is the unit circle. We identify any point θ ∈ S¹ with the angle θ ∈ [0, 2π) such that θ = (cos θ, sin θ). The advection operator is given by

   θ·∇ₓ = cos θ ∂/∂x₁ + sin θ ∂/∂x₂
and models free transport of X-rays while (x)u(x, ) models the number of absorbed
photons per unit distance at x.
The probing source is emitted at the boundary of the domain and takes the form
$$ u(x,\theta) = \delta(x-x_0)\,\delta(\theta-\theta_0), \tag{2.2} $$
where $x_0\in\mathbb{R}^2$, say $x_0\in\partial X$, and $\theta_0$ is entering the domain $X$, i.e., $\theta_0\cdot n(x_0)<0$, where $n$ is the outward normal to $\partial X$ at $x_0\in\partial X$. Above, the delta functions mean that a unit amount of photons is created at $(x_0,\theta_0)$, in the sense that for any domain $(x_0,\theta_0)\in Y\subset\mathbb{R}^2\times S^1$, we have
$$ \int_Y u(x,\theta)\,dx\,d\theta = 1. $$
The solution to (2.1)-(2.2), which is a first-order ODE in the appropriate variables, is given by
$$ u(x+t\theta,\theta) = u(x,\theta)\exp\Big(-\int_0^t a(x+s\theta)\,ds\Big), \qquad x\in\mathbb{R}^2,\ \theta\in S^1. \tag{2.3} $$
Indeed, write $v(t)=u(x+t\theta,\theta)$ and $b(t)=a(x+t\theta)$, so that $\dot v + bv = 0$, and integrate to obtain the above result. For our specific choice at $(x_0,\theta_0)$, we thus obtain that
$$ u(x,\theta) = \delta(x-t\theta-x_0)\,\delta(\theta-\theta_0)\exp\Big(-\int_0^t a(x-s\theta)\,ds\Big). $$
In other words, on the half line $x_0+t\theta_0$ for $t\geq0$, there is a positive density of photons. Away from that line, the density of photons vanishes.
For $x_1 = x_0+\tau\theta_0\in\partial X$ different from $x_0$, if a detector collects the amount of photons reaching $x_1$ (without being absorbed), it will measure
$$ \exp\Big(-\int_0^{\tau} a(x_1-s\theta_0)\,ds\Big) = \exp\Big(-\int_0^{\tau} a(x_0+s\theta_0)\,ds\Big). $$
The travel time (for a particle with rescaled speed equal to $1$, so it is also a distance) from one side of $X$ to the other side depends on $(x_0,\theta_0)$ and is denoted by $\tau(x_0,\theta_0)>0$. By taking the logarithm of the measurements, we thus have access to
$$ \int_0^{\tau(x_0,\theta_0)} a(x_0+t\theta_0)\,dt. $$
This is the integral of $a$ over the line segment $(x_0,x_1)$. By varying the point $x_0$ and the direction of incidence $\theta_0$, we can gain access to integrals of $a(x)$ over all possible segments (and, since $a$ can be extended by $0$ outside $X$ without changing the measurements, in fact over all possible lines) crossing the domain $X$.

The main question in transmission tomography is thus how one can recover a function $a(x)$ from its integration over all possible lines in the plane $\mathbb{R}^2$. This will be the object of subsequent sections. In practice, we need to consider integrations over a finite number of lines. How these lines are chosen is crucial to obtain a rapid and practical inversion algorithm. We do not consider the problems of discretization here and refer the reader to [47, 48].
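The measurement model above is easy to check numerically. The sketch below is an illustration only: the Gaussian attenuation map and all grid parameters are assumptions chosen for the example. It integrates the attenuation along a ray, forms the detector reading of (2.3), and verifies that minus its logarithm returns the line integral of $a$, here $\sqrt{\pi}$ for a line through the origin.

```python
import numpy as np

def a(x):
    # Hypothetical smooth attenuation map: a Gaussian bump at the origin.
    return np.exp(-np.sum(x**2, axis=-1))

def detector_reading(x0, theta, T=6.0, n=4001):
    # Photon fraction reaching the detector, exp(-int_0^T a(x0 + t*theta) dt),
    # following the transport solution (2.3); trapezoid quadrature in t.
    t = np.linspace(0.0, T, n)
    dt = t[1] - t[0]
    vals = a(x0[None, :] + t[:, None] * theta[None, :])
    integral = dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return np.exp(-integral)

theta0 = np.array([1.0, 0.0])
x0 = np.array([-3.0, 0.0])              # source placed before the bump
measured = detector_reading(x0, theta0)
line_integral = -np.log(measured)       # log of the measurement = line integral of a
```

As the ray length and quadrature are refined, `line_integral` approaches $\int_{\mathbb{R}} e^{-t^2}\,dt = \sqrt{\pi}$.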
2.2 Two dimensional Radon transform
We have seen that the problem of transmission tomography consists of reconstructing a function from its integration along lines. We have considered the two-dimensional problem so far. Since X-rays do not scatter, the three-dimensional problem can be treated by using the two-dimensional theory: it suffices to image the object slice by slice using the two-dimensional reconstruction, as we did in MRI (Transmission Tomography is indeed a tomographic method since "tomos" means "slice" in Greek, as we already know).
We need to represent (parameterize) lines in the plane in a more convenient way than by describing them as the line joining $x_0$ and $x_1$. This is done by defining an origin $0=(0,0)$, a direction $\theta=(\cos\theta,\sin\theta)\in S^1$, and a scalar $s$ indicating the (signed) distance of the line to $0$. The line is defined by the set of points $x$ such that $x\cdot\theta^\perp = s$, where $\theta^\perp$ is the rotation of $\theta$ by $\frac{\pi}{2}$, i.e., the vector given by $\theta^\perp=(-\sin\theta,\cos\theta)$. More precisely, for a smooth function $f(x)$ on $\mathbb{R}^2$, we define the Radon transform $Rf(s,\theta)$ for $(s,\theta)\in Z = \mathbb{R}\times(0,2\pi)$ as
$$ Rf(s,\theta) = \int_{\mathbb{R}} f(s\theta^\perp + t\theta)\,dt = \int_{\mathbb{R}^2} f(x)\,\delta(x\cdot\theta^\perp - s)\,dx. \tag{2.4} $$
Notice that the cylinder $Z$ is a double covering of the space of lines in the real plane $\mathbb{R}^2$.

Figure 2.1: Geometry of the Radon transform.

Indeed, one easily verifies that
$$ Rf(s,\theta) = Rf(-s,\theta+\pi), \qquad \text{as}\quad \{x\cdot\theta^\perp = s\} = \{x\cdot(\theta+\pi)^\perp = -s\}. $$
Thus there is a redundancy of order two in the parameterization of lines in the Radon transform.
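A direct discretization of (2.4) illustrates both the definition and the double-covering redundancy. The snippet below is a sketch: the Gaussian test function is an assumption chosen so that the exact transform, $Rf(s,\theta)=\sqrt{\pi}\,e^{-s^2}$ for every $\theta$, is available in closed form.

```python
import numpy as np

def f(x1, x2):
    # Radially symmetric test function with known Radon transform:
    # Rf(s, theta) = sqrt(pi) * exp(-s^2) for every direction theta.
    return np.exp(-(x1**2 + x2**2))

def radon(f, s, theta, T=6.0, n=4001):
    # Quadrature version of (2.4): integrate f along the line
    # s*theta_perp + t*theta, with theta_perp = (-sin, cos).
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    x1 = -s * np.sin(theta) + t * np.cos(theta)
    x2 = s * np.cos(theta) + t * np.sin(theta)
    vals = f(x1, x2)
    return dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

val = radon(f, 0.7, 0.3)
sym = radon(f, -0.7, 0.3 + np.pi)   # double covering: Rf(s,theta) = Rf(-s,theta+pi)
```

Refining the quadrature, `val` converges to $\sqrt{\pi}\,e^{-0.49}$, and `sym` agrees with `val` to rounding error.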
Radon transform versus X-ray transform. We pause here to make a remark regarding notation. The notation introduced here for the Radon transform is not the same as the notation introduced for the Radon transform in Chapter 1. Let us explain the reason for this change of notation. A line in a two-dimensional space is a hyperplane, as is a plane in a three-dimensional space, a notion that is used in the definition of the three-dimensional Radon transform that we consider below. So it is natural to parameterize hyperplanes by their uniquely defined (up to a sign) normal vector.

Now, we have just seen that the reason why line integrals appear in Computerized Tomography is that particles propagate along straight lines. This is true independent of dimension. And it is natural to parameterize lines by their main direction $\theta$. This is the point of view of the X-ray transform (since X-rays roughly follow straight lines in medical imaging applications).

In three dimensions, the three-dimensional Radon transform (along hyperplanes) and the three-dimensional X-ray transform (along rays) are not the same object (if only because there are many more lines than there are hyperplanes). In two dimensions, however, the X-ray and Radon transforms correspond to the same object. In Chapter 1, we emphasized the parameterization of hyperplanes and thus modeled lines by their normal vector. In this chapter, we emphasize the X-ray transform and parameterize lines by their main direction $\theta$. One can go from one notation to the other by replacing $\theta$ by its 90 degree rotation $\theta^\perp$. The rest of this chapter uses the line parameterization of the X-ray transform. However, in two dimensions, we still refer to the integration of a function along lines, independent of the chosen parameterization, as the Radon transform.
Some properties of the Radon transform. Let us derive some important properties of the Radon transform. We first define the operator
$$ R_\theta f(s) = Rf(s,\theta). \tag{2.5} $$
This notation will often be useful in the sequel. The first important result on the Radon transform is the Fourier slice theorem:

Theorem 2.2.1 Let $f(x)$ be a smooth function. Then for all $\theta\in S^1$, we have
$$ [\mathcal{F}_{s\to\sigma}R_\theta f](\sigma) = \widehat{R_\theta f}(\sigma) = \hat f(\sigma\theta^\perp), \qquad \sigma\in\mathbb{R}. \tag{2.6} $$
Proof. We have that
$$ \widehat{R_\theta f}(\sigma) = \int_{\mathbb{R}} e^{-is\sigma}\int_{\mathbb{R}^2} f(x)\,\delta(x\cdot\theta^\perp - s)\,dx\,ds = \int_{\mathbb{R}^2} e^{-i\sigma\, x\cdot\theta^\perp} f(x)\,dx. $$
This concludes the proof.
This result should not be surprising. For a given value of $\theta$, the Radon transform gives the integration of $f$ over all lines parallel to $\theta$. So obviously the oscillations in the direction $\theta$ are lost, but not the oscillations in the orthogonal direction $\theta^\perp$. The oscillations of $f$ in the direction $\theta^\perp$ are precisely of the form $\hat f(\sigma\theta^\perp)$ for $\sigma\in\mathbb{R}$. It is therefore not surprising that the latter can be retrieved from the Radon transform $R_\theta f$. Notice that this result also gives us a reconstruction procedure. Indeed, all we have to do is to take the Fourier transform of $R_\theta f$ in the variable $s$ to get the Fourier transform $\hat f(\sigma\theta^\perp)$. It remains then to obtain the latter Fourier transform for all directions $\theta^\perp$ to end up with the full $\hat f(k)$ for all $k\in\mathbb{R}^2$. Then the object is simply reconstructed by using the fact that $f(x) = (\mathcal{F}^{-1}_{k\to x}\hat f)(x)$. We will consider other (equivalent) reconstruction methods and explicit formulas later on.
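The Fourier slice theorem can be verified numerically. In the sketch below (illustrative; the Gaussian $f(x)=e^{-|x|^2/2}$ is an assumption for which $\hat f(k)=2\pi e^{-|k|^2/2}$), the projection $R_\theta f$ is computed by quadrature, its one-dimensional Fourier transform is evaluated at a frequency $\sigma$, and the result is compared with $\hat f(\sigma\theta^\perp)=2\pi e^{-\sigma^2/2}$.

```python
import numpy as np

def trapz(y, dx):
    # Simple trapezoid rule (works for complex integrands).
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

# Gaussian test function f(x) = exp(-|x|^2/2), under the convention
# fhat(k) = int exp(-i k.x) f(x) dx, so that fhat(k) = 2*pi*exp(-|k|^2/2).
s = np.linspace(-8, 8, 2001)
ds = s[1] - s[0]
t = s.copy()

# Projection R_theta f(s): along the line s*theta_perp + t*theta, |x|^2 = s^2 + t^2.
proj = np.array([trapz(np.exp(-(si**2 + t**2) / 2), ds) for si in s])

sigma = 1.3
# 1D Fourier transform of the projection at frequency sigma ...
proj_hat = trapz(proj * np.exp(-1j * sigma * s), ds)
# ... which by the slice theorem (2.6) equals fhat(sigma*theta_perp).
fhat_slice = 2 * np.pi * np.exp(-sigma**2 / 2)
```

Since $f$ is radial, the check is independent of the direction $\theta$.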
Before doing so, we derive additional properties satisfied by $Rf$. From the Fourier slice theorem, we deduce that
$$ R_\theta\Big[\frac{\partial f}{\partial x_i}\Big](s) = \theta_i^\perp\,\frac{d}{ds}(R_\theta f)(s). \tag{2.7} $$

Exercise 2.2.1 Verify (2.7).

This is the equivalent for Radon transforms of the property (1.30) for the Fourier transform.
Smoothing properties of the Radon and X-ray transforms. Let us now look at the regularizing properties of the Radon transform. To do so, we introduce a function $\chi(x)$ of class $C_0^\infty(\mathbb{R}^2)$, i.e., $\chi$ is infinitely many times differentiable and has compact support (there is a radius $R$ such that $\chi(x)=0$ for $|x|>R$). When we are interested in an object defined on a bounded domain $X$, we can choose $\chi(x)=1$ for $x\in X$.

As we did for $\mathbb{R}^n$ in the previous chapter, let us now define the Hilbert scale for the cylinder $Z$ as follows. We say that $g(s,\theta)$ belongs to $H^s(Z)$ provided that
$$ \|g\|^2_{H^s(Z)} = \int_0^{2\pi}\int_{\mathbb{R}} (1+\sigma^2)^s\,|\mathcal{F}_{s\to\sigma}g(\sigma,\theta)|^2\,d\sigma\,d\theta < \infty. \tag{2.8} $$
This is a constraint stipulating that the Fourier transform in the $s$ variable decays sufficiently fast at infinity. No constraint is imposed on the directional variable other than being a square-integrable function. We have then the following result:
Theorem 2.2.2 Let $f(x)$ be a distribution in $H^s(\mathbb{R}^2)$ for some $s\in\mathbb{R}$. Then we have the following bounds:
$$ \sqrt{2}\,\|f\|_{H^s(\mathbb{R}^2)} \leq \|Rf\|_{H^{s+\frac12}(Z)}, \qquad \|R(\chi f)\|_{H^{s+\frac12}(Z)} \leq C_\chi\,\|f\|_{H^s(\mathbb{R}^2)}. \tag{2.9} $$
The constant $C_\chi$ depends on the function $\chi(x)$, and in particular on the size of its support.
Proof. From the Fourier slice theorem $\widehat{R_\theta w}(\sigma) = \hat w(\sigma\theta^\perp)$, we deduce that
$$ \int_Z |\widehat{Rw}(\sigma,\theta)|^2 (1+\sigma^2)^{s+\frac12}\,d\sigma\,d\theta = \int_Z |\hat w(\sigma\theta^\perp)|^2 (1+\sigma^2)^{s+\frac12}\,d\sigma\,d\theta = 2\int_{\mathbb{R}^2} |\hat w(k)|^2\,\frac{(1+|k|^2)^{s+\frac12}}{|k|}\,dk, $$
using the change of variables from polar to Cartesian coordinates, so that $dk = \sigma\,d\sigma\,d\theta$, and recalling that $\hat f(\sigma\theta^\perp) = \hat f((-\sigma)(-\theta^\perp))$. The first inequality in (2.9) then follows from the fact that $|k|^{-1}\geq(1+|k|^2)^{-\frac12}$, using $w(x)=f(x)$. The second inequality is slightly more difficult because of the presence of $|k|^{-1}$. We now choose $w(x)=f(x)\chi(x)$. Let us split the integral into $I_1+I_2$, where $I_1$ accounts for the integration over $|k|>1$ and $I_2$ for the integration over $|k|<1$. Since $(1+|k|^2)^{\frac12}\leq\sqrt2\,|k|$ for $|k|>1$, we have that
$$ I_1 \leq 2\sqrt2\int_{\mathbb{R}^2} |\widehat{\chi f}(k)|^2 (1+|k|^2)^s\,dk \leq 2\sqrt2\,\|\chi f\|^2_{H^s}. $$
It remains to deal with $I_2$. We deduce from (1.31) that
$$ I_2 \leq C\,\|\widehat{\chi f}\|^2_{L^\infty(\mathbb{R}^2)}. $$
Let $\psi\in C_0^\infty(\mathbb{R}^2)$ be such that $\psi=1$ on the support of $\chi$, so that $\chi f = \psi\chi f$. We define $\psi_k(x) = e^{ix\cdot k}\psi(x)$. Upon using the definition (1.25), the Parseval relation (1.27) and the Cauchy-Schwarz inequality $(f,g)\leq\|f\|\|g\|$, we deduce that
$$ |\widehat{\chi f}|(k) = |\widehat{\psi\chi f}|(k) = \Big|\int_{\mathbb{R}^2} \overline{\psi_k}(x)\,(\chi f)(x)\,dx\Big| = (2\pi)^{-2}\Big|\int_{\mathbb{R}^2} \overline{\hat\psi_k}(\xi)\,\widehat{\chi f}(\xi)\,d\xi\Big| $$
$$ = (2\pi)^{-2}\Big|\int_{\mathbb{R}^2} \overline{\hat\psi_k}(\xi)\,(1+|\xi|^2)^{-\frac s2}\,(1+|\xi|^2)^{\frac s2}\,\widehat{\chi f}(\xi)\,d\xi\Big| \leq C\,\|\psi_k\|_{H^{-s}(\mathbb{R}^2)}\,\|\chi f\|_{H^s(\mathbb{R}^2)}. $$
Since $\psi(x)$ is smooth, then so is $\psi_k$ uniformly in $|k|<1$, so that $\psi_k$ belongs to $H^{-s}(\mathbb{R}^2)$ for all $s\in\mathbb{R}$, uniformly in $|k|<1$. Since multiplication by the smooth, compactly supported function $\chi$ is bounded on $H^s(\mathbb{R}^2)$, this implies that
$$ I_2 \leq C\,\|\widehat{\chi f}\|^2_{L^\infty(\mathbb{R}^2)} \leq C_\chi\,\|f\|^2_{H^s(\mathbb{R}^2)}, $$
where the constant $C_\chi$ depends on $\psi$, which depends on the support of $\chi$. This concludes the proof.
The theorem should be interpreted as follows. Assume that the function (or more generally a distribution) $f(x)$ has compact support. Then we can find a function $\chi$ which is of class $C^\infty$, with compact support, and which is equal to $1$ on the support of $f$. In that case, we can use the above theorem with $\chi f = f$. The constant $C_\chi$ then depends implicitly on the size of the support of $f(x)$.

The above inequalities show that $R$ is a smoothing operator. This is not really surprising, as an integration over lines is involved in the definition of the Radon transform. However, the result tells us that the Radon transform is smoothing by exactly one half of a derivative. The second inequality in (2.9) tells us that the factor $\frac12$ is optimal, in the sense that the Radon transform does not regularize by more than one half of a derivative. Moreover, this corresponds to (1.43) with $\alpha=\frac12$, which shows that the inversion of the Radon transform in two dimensions is a mildly ill-posed problem of order $\alpha=\frac12$: when we reconstruct $f$ from $Rf$, a differentiation of order one half is involved.

At the same time, seen as an operator from $H^s(\mathbb{R}^2)$ to $H^{s+\frac12}(Z)$, the Radon transform is a well-posed problem. For $s=-\frac12$, small noise in the $L^2(Z)$ sense will generate small reconstruction errors in the $H^{-\frac12}(\mathbb{R}^2)$ sense.
Filtered-backprojection inversions. Let us now consider such explicit reconstruction formulas. In order to do so, we need to introduce two new operators, the adjoint operator $R^*$ and the Hilbert transform $H$. The adjoint operator $R^*$ to $R$ (with respect to the usual inner products $(\cdot,\cdot)_{\mathbb{R}^2}$ and $(\cdot,\cdot)_Z$ on $L^2(\mathbb{R}^2)$ and $L^2(Z)$, respectively) is given for every smooth function $g(s,\theta)$ on $Z$ by
$$ (R^*g)(x) = \int_0^{2\pi} g(x\cdot\theta^\perp,\theta)\,d\theta, \qquad x\in\mathbb{R}^2. \tag{2.10} $$
That $R^*$ is indeed the adjoint operator to $R$ is verified as follows:
$$ \begin{aligned} (R^*g,f)_{\mathbb{R}^2} &= \int_{\mathbb{R}^2} f(x)\int_0^{2\pi} g(x\cdot\theta^\perp,\theta)\,d\theta\,dx \\ &= \int_{\mathbb{R}^2} f(x)\int_0^{2\pi}\int_{\mathbb{R}} \delta(s-x\cdot\theta^\perp)\,g(s,\theta)\,ds\,d\theta\,dx \\ &= \int_0^{2\pi}\int_{\mathbb{R}} g(s,\theta)\int_{\mathbb{R}^2} f(x)\,\delta(s-x\cdot\theta^\perp)\,dx\,ds\,d\theta \\ &= \int_0^{2\pi}\int_{\mathbb{R}} g(s,\theta)\,(Rf)(s,\theta)\,ds\,d\theta = (g,Rf)_Z. \end{aligned} $$
We also introduce the Hilbert transform, defined for smooth functions $f(t)$ on $\mathbb{R}$ by
$$ Hf(t) = \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{\mathbb{R}} \frac{f(s)}{t-s}\,ds. \tag{2.11} $$
Here p.v. means that the integral is understood in the principal value sense, which in this context is equivalent to
$$ Hf(t) = \lim_{\varepsilon\to0}\frac{1}{\pi}\int_{\mathbb{R}\setminus(t-\varepsilon,t+\varepsilon)} \frac{f(s)}{t-s}\,ds. $$
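Numerically, the Hilbert transform is conveniently applied through the Fourier domain, where (as the next lemma shows) it acts as multiplication by $-i\,\mathrm{sign}(\sigma)$. The sketch below is an illustration; the known transform pair $H[1/(1+t^2)] = t/(1+t^2)$, and the grid window, are assumptions of the example.

```python
import numpy as np

# FFT implementation of the Hilbert transform (2.11): multiply the Fourier
# transform by -i*sign(sigma) and transform back.
N, L = 2**18, 1000.0
t = np.linspace(-L, L, N, endpoint=False)
dt = t[1] - t[0]

f = 1.0 / (1.0 + t**2)                 # known pair: H f(t) = t/(1+t^2)

sigma = 2 * np.pi * np.fft.fftfreq(N, d=dt)
Hf = np.fft.ifft(-1j * np.sign(sigma) * np.fft.fft(f)).real

exact = t / (1.0 + t**2)
err = np.max(np.abs(Hf - exact)[np.abs(t) < 10])
```

The residual `err` is dominated by the slow $1/t^2$ decay of the test function; it shrinks as the window $L$ grows.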
Both operators turn out to be local in the Fourier domain, in the sense that they are multiplications in the Fourier domain. More precisely, we have the following lemma.

Lemma 2.2.3 We have in the Fourier domain the following relations:
$$ (\mathcal{F}_{x\to\xi}R^*g)(\xi) = \frac{2\pi}{|\xi|}\Big[(\mathcal{F}_{s\to\sigma}g)(|\xi|,-\hat\xi^\perp) + (\mathcal{F}_{s\to\sigma}g)(-|\xi|,\hat\xi^\perp)\Big], \qquad (\mathcal{F}_{t\to\sigma}Hf)(\sigma) = -i\,\mathrm{sign}(\sigma)\,\mathcal{F}_{t\to\sigma}f(\sigma). \tag{2.12} $$
We have used the notation $\hat\xi = \xi/|\xi|$. For $\theta=(\cos\theta,\sin\theta)\in S^1$ with $\theta\in(0,2\pi)$, we also identify functions of the point $\theta\in S^1$ with functions of the angle $\theta$. Assuming that $g(s,\theta) = g(-s,\theta+\pi)$, which is the case in the image of the Radon transform (i.e., when there exists $f$ such that $g=Rf$), and which implies that $\hat g(\sigma,\theta) = \hat g(-\sigma,\theta+\pi)$, we have, using shorter notation, the equivalent statement:
$$ \widehat{R^*g}(\xi) = \frac{4\pi}{|\xi|}\,\hat g(|\xi|,-\hat\xi^\perp), \qquad \widehat{Hf}(\sigma) = -i\,\mathrm{sign}(\sigma)\,\hat f(\sigma). \tag{2.13} $$
Proof. Let us begin with $R^*g$. We compute
$$ \widehat{R^*g}(\xi) = \int_{\mathbb{R}^2} e^{-ix\cdot\xi}\int_0^{2\pi} g(x\cdot\theta^\perp,\theta)\,d\theta\,dx = \int_0^{2\pi}\!\!\int_{\mathbb{R}^2} e^{-i(s\theta^\perp+t\theta)\cdot\xi}\,g(s,\theta)\,ds\,dt\,d\theta $$
$$ = \int_0^{2\pi} 2\pi\,\delta(\theta\cdot\xi)\,\hat g(\theta^\perp\cdot\xi,\theta)\,d\theta = \int_0^{2\pi} \frac{2\pi}{|\xi|}\,\delta(\theta\cdot\hat\xi)\,\hat g(\theta^\perp\cdot\xi,\theta)\,d\theta = \frac{2\pi}{|\xi|}\Big[\hat g(|\xi|,-\hat\xi^\perp) + \hat g(-|\xi|,\hat\xi^\perp)\Big]. $$
In the proof we have used that $\delta(\alpha x) = \alpha^{-1}\delta(x)$ for $\alpha>0$ and the fact that there are two directions, namely $\hat\xi^\perp$ and $-\hat\xi^\perp$, on the unit circle which are orthogonal to $\hat\xi$. When $g$ is in the form $g=Rf$, we have $\hat g(-|\xi|,\hat\xi^\perp) = \hat g(|\xi|,-\hat\xi^\perp)$, which explains the shorter notation (2.13).

The computation of the second operator goes as follows. We verify that
$$ Hf(t) = \frac{1}{\pi}\Big(\frac{1}{x}\star f\Big)(t). $$
So in the Fourier domain we have
$$ \widehat{Hf}(\sigma) = \frac{1}{\pi}\,\widehat{\Big(\frac1x\Big)}(\sigma)\,\hat f(\sigma) = -i\,\mathrm{sign}(\sigma)\,\hat f(\sigma). $$
The latter is a result of the following calculation:
$$ \frac12\,\mathrm{sign}(x) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{ix\sigma}\,\frac{1}{i\sigma}\,d\sigma. $$
This concludes the proof of the lemma.
The above calculations also show that $H^2 = H\circ H = -\mathrm{Id}$, where $\mathrm{Id}$ is the identity operator, as can easily be seen from its expression in the Fourier domain. This property is referred to by saying that the Hilbert transform is an anti-involution. We are now ready to introduce some reconstruction formulas.

Theorem 2.2.4 Let $f(x)$ be a smooth function and let $g(s,\theta) = Rf(s,\theta)$ be its Radon transform. Then $f$ can explicitly be reconstructed from its Radon transform as follows:
$$ f(x) = \frac{1}{4\pi}\,R^*\Big(\frac{\partial}{\partial s}Hg(s,\theta)\Big)(x). \tag{2.14} $$
In the above formula, the Hilbert transform $H$ acts on the $s$ variable.
Proof. The simplest way to verify the inversion formula is to do it in the Fourier domain. Let us denote by
$$ w(s,\theta) = \frac{\partial}{\partial s}Hg(s,\theta). $$
Since $g(-s,\theta+\pi) = g(s,\theta)$, we verify that the same property holds for $w$, in the sense that $w(-s,\theta+\pi) = w(s,\theta)$. Therefore (2.14) is equivalent to the statement
$$ \hat f(\xi) = \frac{1}{|\xi|}\,\hat w(|\xi|,-\hat\xi^\perp), \tag{2.15} $$
according to (2.13). Notice that $\hat w$ is the Fourier transform of $w$ in the first variable only.

Since in the Fourier domain the derivation with respect to $s$ is given by multiplication by $i\sigma$ and the Hilbert transform $H$ is given by multiplication by $-i\,\mathrm{sign}(\sigma)$, we obtain that
$$ \mathcal{F}^{-1}_{\sigma\to s}\,\frac{\partial}{\partial s}H\,\mathcal{F}_{s\to\sigma} = |\sigma|. $$
In other words, we have
$$ \hat w(\sigma,\theta) = |\sigma|\,\hat g(\sigma,\theta). $$
Thus (2.15) is equivalent to
$$ \hat f(\xi) = \hat g(|\xi|,-\hat\xi^\perp). $$
This, however, is nothing but the Fourier slice theorem stated in Theorem 2.2.1, since $(-\hat\xi^\perp)^\perp = \hat\xi$ and $\xi = |\xi|\hat\xi$. This concludes the proof of the reconstruction.
The theorem can equivalently be stated as
$$ \mathrm{Id} = \frac{1}{4\pi}\,R^*\frac{\partial}{\partial s}HR = \frac{1}{4\pi}\,R^*H\frac{\partial}{\partial s}R. \tag{2.16} $$
The latter equality comes from the fact that $H$ and $\frac{\partial}{\partial s}$ commute, as can easily be observed in the Fourier domain (where they are both multiplications). Here, $\mathrm{Id}$ is the identity operator, which maps a function $f(x)$ to itself: $\mathrm{Id}(f) = f$.
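The identity (2.16) can be exercised end to end in a few lines. The sketch below is illustrative: the Gaussian $f(x)=e^{-|x|^2}$, whose Radon transform is $Rf(s,\theta)=\sqrt{\pi}\,e^{-s^2}$ for every angle, and all grid parameters are assumptions of the example. The filter $\partial_s H$ is applied as multiplication by $|\sigma|$ in the Fourier domain, and the backprojection $R^*$ is an average over angles.

```python
import numpy as np

# Filtered backprojection per (2.16): f = (1/4pi) R^* (d/ds) H (Rf).
N, L = 4096, 20.0
s = np.linspace(-L, L, N, endpoint=False)
g = np.sqrt(np.pi) * np.exp(-s**2)      # closed-form projections of f

# (d/ds) H acts in the Fourier domain as multiplication by |sigma|.
sigma = 2 * np.pi * np.fft.fftfreq(N, d=s[1] - s[0])
w = np.fft.ifft(np.abs(sigma) * np.fft.fft(g)).real

def reconstruct(x, n_angles=360):
    # Backprojection (2.10): average w(x . theta_perp) over all angles.
    th = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    sp = -x[0] * np.sin(th) + x[1] * np.cos(th)    # x . theta_perp
    return np.interp(sp, s, w).sum() * (2 * np.pi / n_angles) / (4 * np.pi)

x0 = np.array([0.3, -0.4])
f_rec = reconstruct(x0)
f_true = np.exp(-np.sum(x0**2))
```

Because the projections here do not depend on the angle, the same filtered profile `w` is backprojected for every direction; for a general object one filters each projection separately.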
Here is some additional useful notation in the manipulation of the Radon transform. Recall that $R_\theta f(s)$ is defined as in (2.5) by
$$ R_\theta f(s) = Rf(s,\theta). $$
The adjoint $R_\theta^*$ (with respect to the inner products in $L^2_s(\mathbb{R})$ and $L^2_x(\mathbb{R}^2)$) is given by
$$ (R_\theta^*g)(x) = g(x\cdot\theta^\perp). \tag{2.17} $$
Indeed (since $\theta$ is frozen here) we have
$$ \int_{\mathbb{R}} (R_\theta f)(s)\,g(s)\,ds = \int_{\mathbb{R}^2}\int_{\mathbb{R}} f(x)\,\delta(s-x\cdot\theta^\perp)\,g(s)\,ds\,dx = \int_{\mathbb{R}^2} g(x\cdot\theta^\perp)\,f(x)\,dx, $$
showing that $(R_\theta f,g)_{L^2(\mathbb{R})} = (f,R_\theta^*g)_{L^2(\mathbb{R}^2)}$. We can then recast the inversion formula as
$$ \mathrm{Id} = \frac{1}{4\pi}\int_0^{2\pi} R_\theta^*\,H\,\frac{\partial}{\partial s}\,R_\theta\,d\theta. \tag{2.18} $$
The only new item to prove here compared to previous formulas is that $R_\theta^*$ and the derivation commute, i.e., for any function $g(s)$ for $s\in\mathbb{R}$, we have
$$ \theta^\perp\cdot\nabla(R_\theta^*g)(x) = \Big(R_\theta^*\frac{dg}{ds}\Big)(x). $$
This results from the fact that both terms are equal to $g'(x\cdot\theta^\perp)$.
One remark on the smoothing properties of the Radon transform. We have seen that the Radon transform is a smoothing operator, in the sense that the Radon transform is half of a derivative smoother than the original function. The adjoint operator $R^*$ enjoys exactly the same property: it regularizes by half of a derivative. It is not surprising that these two half derivatives are exactly canceled by the appearance of a full derivation in the reconstruction formula. Notice that the Hilbert transform (which corresponds to multiplication by the smooth function $-i\,\mathrm{sign}(\sigma)$ in the Fourier domain) is a bounded operator with bounded inverse in $L^2(\mathbb{R})$ (since $H^{-1}=-H$).
Exercise 2.2.2 Show that
$$ f(x) = \frac{1}{4\pi}\int_0^{2\pi} (Hg')(x\cdot\theta^\perp,\theta)\,d\theta. $$
Here $g'$ means the first derivative of $g$ with respect to the $s$ variable only.
Exercise 2.2.3 Show that
$$ f(x) = \frac{1}{4\pi^2}\int_0^{2\pi}\int_{\mathbb{R}} \frac{1}{x\cdot\theta^\perp - s}\,\frac{dg}{ds}(s,\theta)\,ds\,d\theta. $$
This is Radon's original inversion formula [52].
Exercise 2.2.4 Starting from the definition
$$ f(x) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{ik\cdot x}\,\hat f(k)\,dk, $$
and writing it in polar coordinates (with change of measure $dk = |k|\,d|k|\,d\hat k$), deduce the above reconstruction formulas by using the Fourier slice theorem.
2.3 Three dimensional Radon transform
Let us briefly mention the case of the Radon transform in three dimensions (generalizations to higher dimensions being also possible). The Radon transform in three dimensions consists of integrating a function $f(x)$ over all possible planes. As we mentioned earlier, the Radon transform is therefore a distinct object from the X-ray transform, which integrates a function along all possible lines.
A plane $P(s,\theta)$ in $\mathbb{R}^3$ is characterized by its direction $\theta\in S^2$, where $S^2$ is the unit sphere, and by its signed distance to the origin $s$. Notice again the double covering, in the sense that $P(s,\theta) = P(-s,-\theta)$. The Radon transform is then defined as
$$ Rf(s,\theta) = \int_{\mathbb{R}^3} f(x)\,\delta(x\cdot\theta - s)\,dx = \int_{P(s,\theta)} f\,d\sigma. \tag{2.19} $$
Notice the change of notation compared to the two-dimensional case. The Fourier slice theorem still holds:
$$ \widehat{Rf}(\sigma,\theta) = \hat f(\sigma\theta), \tag{2.20} $$
as can be easily verified. We check that $Rf(-s,-\theta) = Rf(s,\theta)$. The reconstruction formula is then given by
$$ f(x) = \frac{-1}{8\pi^2}\int_{S^2} g''(x\cdot\theta,\theta)\,d\theta. \tag{2.21} $$
Here $d\theta$ is the usual (Lebesgue) surface measure on the unit sphere, and $g''$ is the second derivative of $g$ with respect to the $s$ variable.

The result can be obtained as follows. We denote by $S^2/2$ half of the unit sphere (for instance the vectors $\theta$ such that $\theta\cdot e_z>0$):
$$ \begin{aligned} f(x) &= \frac{1}{(2\pi)^3}\int_{S^2/2}\int_{\mathbb{R}} \hat f(r\theta)\,e^{ir\theta\cdot x}\,r^2\,dr\,d\theta = \frac{1}{(2\pi)^3}\int_{S^2/2}\int_{\mathbb{R}} \widehat{Rf}(r,\theta)\,e^{ir\theta\cdot x}\,r^2\,dr\,d\theta \\ &= \frac{1}{(2\pi)^2}\int_{S^2/2} (-g'')(\theta\cdot x,\theta)\,d\theta = \frac{-1}{2}\,\frac{1}{(2\pi)^2}\int_{S^2} g''(\theta\cdot x,\theta)\,d\theta. \end{aligned} $$
Here we have used the fact that the inverse Fourier transform of $r^2\hat f$ is $-f''$.
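The formula (2.21) can be validated at a point with a quadrature over $S^2$. The sketch below is illustrative: the Gaussian $f(x)=e^{-|x|^2/2}$ is an assumption for which $Rf(s,\theta)=2\pi e^{-s^2/2}$, hence $g''(s,\theta)=2\pi(s^2-1)e^{-s^2/2}$; the sphere is integrated with a Gauss-Legendre rule in the polar variable and the trapezoid rule in the azimuth.

```python
import numpy as np

def g2(s):
    # Second s-derivative of the 3D Radon transform of exp(-|x|^2/2).
    return 2 * np.pi * (s**2 - 1) * np.exp(-s**2 / 2)

# Tensor-product quadrature on S^2: Gauss-Legendre in u = cos(polar angle),
# trapezoid (spectrally accurate) in the azimuth psi.
nodes, weights = np.polynomial.legendre.leggauss(64)
psi = np.linspace(0, 2 * np.pi, 128, endpoint=False)
dpsi = 2 * np.pi / len(psi)

x = np.array([0.3, -0.5, 0.8])
total = 0.0
for u, wu in zip(nodes, weights):
    r = np.sqrt(1 - u**2)
    theta = np.stack([r * np.cos(psi), r * np.sin(psi), u * np.ones_like(psi)])
    total += wu * dpsi * g2(x @ theta).sum()

f_rec = -total / (8 * np.pi**2)            # formula (2.21)
f_true = np.exp(-np.sum(x**2) / 2)
```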
Exercise 2.3.1 Generalize Theorem 2.2.2 and prove the following result:

Theorem 2.3.1 There exists a constant $C_\chi$ independent of $f(x)$ such that
$$ \sqrt{2}\,\|f\|_{H^s(\mathbb{R}^3)} \leq \|Rf\|_{H^{s+1}(Z)}, \qquad \|R(\chi f)\|_{H^{s+1}(Z)} \leq C_\chi\,\|f\|_{H^s(\mathbb{R}^3)}, \tag{2.22} $$
where $Z = \mathbb{R}\times S^2$ and $H^s(Z)$ is defined in the spirit of (2.8).
The above result shows that the Radon transform is more smoothing in three dimensions than it is in two dimensions. In three dimensions, the Radon transform smoothes by a full derivative rather than a half derivative.

Notice however that the inversion of the three-dimensional Radon transform (2.21) is local, whereas this is not the case in two dimensions. What is meant by local is the following: the reconstruction of $f(x)$ depends on $g(s,\theta)$ only for the planes $P(s,\theta)$ that pass through $x$ (and an infinitely small neighborhood, so that the second derivative can be calculated). Indeed, we verify that $x\in P(x\cdot\theta,\theta)$ and that all the planes passing through $x$ are of the form $P(x\cdot\theta,\theta)$. The two-dimensional transform involves the Hilbert transform, which, unlike differentiations, is a non-local operation. Thus the reconstruction of $f$ at a point $x$ requires knowledge of all line integrals $g(s,\theta)$, and not only of those for lines passing through $x$.
Exercise 2.3.2 Calculate $R^*$, the adjoint operator to $R$ (with respect to the usual $L^2$ inner products). Generalize the formula (2.16) to the three-dimensional case.
2.4 Attenuated Radon Transform
In the previous sections, the integration over lines for the Radon transform was not weighted. We could more generally ask whether integrals of the form
$$ R_\alpha f(s,\theta) := \int_{\mathbb{R}} f(s\theta^\perp + t\theta)\,\alpha(s\theta^\perp + t\theta,\theta)\,dt, $$
over all possible lines parameterized by $s\in\mathbb{R}$ and $\theta\in S^1$, and assuming that the (positive) weight $\alpha(x,\theta)$ is known, uniquely determine $f(x)$. This is a much more delicate question, for which only partial answers are known.
The techniques developed in the next chapter allow one to prove that $R_\alpha$ is invertible up to a finite dimensional kernel by an application of the Fredholm theory of compact operators. Moreover, Jan Boman [23] constructed an example of a function $f\in C_0^\infty(\mathbb{R}^2)$ with compact support and a uniformly positive weight $\alpha(x,\theta)$, also of class $C^\infty$, such that $R_\alpha f(s,\theta) = 0$. The operator $R_\alpha$ is therefore not always injective.

Proving that $R_\alpha$ is injective, when it is indeed injective, is a hard problem. In the next chapter, we will see a methodology to prove injectivity based on energy estimates. The alternative to energy estimates is to use what we will refer to as analytic/unique continuation techniques. For the rest of this chapter, we focus on the inversion of one example of such weighted Radon transforms, namely the Attenuated Radon Transform (AtRT). The invertibility of the AtRT has been obtained recently by two different methods in [6] and [50]. Both rely on the continuation properties of analytic functions. We focus here on the second method, which recasts the inverse Radon transform and the inverse Attenuated Radon transform as a Riemann-Hilbert problem [1]. The rest of this section is significantly more technical than the inversion of the Radon transform. It is presented here as an example of the complications that often appear in the proofs of injectivity of many transforms of integral geometry, when they are indeed injective.
2.4.1 Single Photon Emission Computed Tomography
An important application of the attenuated Radon transform is SPECT, single photon emission computed tomography. The principle is the following: radioactive particles are injected in a domain. These particles then emit some radiation. The radiation propagates through the tissues and gets partially absorbed. The amount of radiation reaching the boundary of the domain can be measured. The imaging technique then consists of reconstructing the location of the radioactive particles from the boundary measurements.

We model the density of radiated photons by $u(x,\theta)$ and the source of radiated photons by $f(x)$. The absorption of photons (by the human tissues in the medical imaging application) is modeled by $a(x)$. We assume that $a(x)$ is known here. The absorption can be obtained, for instance, by transmission tomography, as we saw in earlier sections. The density $u(x,\theta)$ then satisfies the following transport equation:
$$ \theta\cdot\nabla_x u(x,\theta) + a(x)u(x,\theta) = f(x), \qquad x\in\mathbb{R}^2,\ \theta\in S^1. \tag{2.23} $$
We assume that $f(x)$ is compactly supported and impose that no radiation comes from infinity:
$$ \lim_{s\to\infty} u(x-s\theta,\theta) = 0. \tag{2.24} $$
The transport equation (2.23) with conditions (2.24) admits a unique solution that can be obtained by the method of characteristics. Let us define the following symmetrized beam transform:
$$ D_\theta a(x) = \frac12\int_0^\infty \big[a(x-t\theta) - a(x+t\theta)\big]\,dt = \frac12\int_{\mathbb{R}} \mathrm{sign}(t)\,a(x-t\theta)\,dt. \tag{2.25} $$
We verify that $\theta\cdot\nabla_x D_\theta a(x) = a(x)$, so that $e^{D_\theta a(x)}$ is an integrating factor for (2.23), in the sense that
$$ \theta\cdot\nabla_x\big(e^{D_\theta a(x)}u(x,\theta)\big) = \big(e^{D_\theta a}f\big)(x,\theta). $$
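The key identity $\theta\cdot\nabla_x D_\theta a = a$ can be checked numerically. The sketch below is an illustration, not part of the text: the Gaussian attenuation map and all parameters are assumptions; the symmetrized beam transform (2.25) is evaluated by quadrature and its directional derivative by a central difference.

```python
import numpy as np

theta = np.array([np.cos(0.7), np.sin(0.7)])

def a(x):
    # Hypothetical smooth attenuation map.
    return np.exp(-np.sum(x**2, axis=-1))

def D_theta(x, T=8.0, n=8001):
    # Symmetrized beam transform (2.25): (1/2) int sign(t) a(x - t*theta) dt.
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    vals = np.sign(t) * a(x[None, :] - t[:, None] * theta[None, :])
    return 0.5 * dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

x = np.array([0.4, -0.2])
h = 1e-4
# Directional derivative theta.grad of D_theta a, by a central difference:
deriv = (D_theta(x + h * theta) - D_theta(x - h * theta)) / (2 * h)
```

The computed `deriv` matches $a(x)$, confirming that $e^{D_\theta a}$ is an integrating factor.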
Therefore the solution $u(x,\theta)$ is given by
$$ e^{D_\theta a(x)}u(x,\theta) = \int_0^\infty \big(e^{D_\theta a}f\big)(x-t\theta)\,dt. \tag{2.26} $$
We recall that $\theta=(\cos\theta,\sin\theta)$ and that $\theta^\perp=(-\sin\theta,\cos\theta)$, and decompose $x = s\theta^\perp + t\theta$. We deduce from (2.26) that
$$ \lim_{t\to+\infty} e^{D_\theta a(s\theta^\perp+t\theta)}\,u(s\theta^\perp+t\theta,\theta) = \int_{\mathbb{R}} \big(e^{D_\theta a}f\big)(s\theta^\perp + t\theta)\,dt. $$
In the above expression, the left-hand side is known from the measurements. Indeed, $u(s\theta^\perp+t\theta,\theta)$ is the radiation outside of the domain to image, and is thus measured, while $e^{D_\theta a(s\theta^\perp+t\theta)}$ involves the attenuation coefficient $a(x)$, which we have assumed is known. The objective is thus to reconstruct $f(x)$ from the right-hand side of the above relation, which we recast as
$$ (R_a f)(s,\theta) = (R_{a,\theta}f)(s) = \big(R_\theta(e^{D_\theta a}f)\big)(s), \tag{2.27} $$
where $R_\theta$ is the Radon transform defined for a function $f(x,\theta)$ as
$$ R_\theta f(s) = \int_{\mathbb{R}} f(s\theta^\perp+t\theta,\theta)\,dt = \int_{\mathbb{R}^2} f(x,\theta)\,\delta(x\cdot\theta^\perp - s)\,dx. $$
When $a\equiv0$, we recover that the measurements involve the Radon transform of $f(x)$ as defined in (2.4). Thus in the absence of absorption, SPECT can be handled by inverting the Radon transform, as we saw in earlier sections. When the absorption is constant, an inversion formula has been known for quite some time [64]. The inversion formula for non-constant absorption is more recent and was obtained independently by two different techniques [6, 50]. We do not consider here the method of $A$-analytic functions developed in [6]. We will present the method developed in [50], based on the extension of the transport equation to the complex domain and on the solution of a Riemann-Hilbert problem.
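A direct quadrature implementation of (2.27) may help fix ideas. In the sketch below (illustrative; the Gaussian source and the Gaussian attenuation map are hypothetical choices, not from the text), the attenuated transform reduces to the ordinary Radon transform when $a\equiv0$, while a positive attenuation re-weights the measurement.

```python
import numpy as np

def trapz(y, dx):
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))

def attenuated_radon(f, a, s, th, T=6.0, n=1001):
    # Quadrature version of (2.27): R_{a,theta} f(s) = R_theta(exp(D_theta a) f)(s).
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    theta = np.array([np.cos(th), np.sin(th)])
    perp = np.array([-np.sin(th), np.cos(th)])
    pts = s * perp[None, :] + t[:, None] * theta[None, :]
    u = np.linspace(-T, T, n)
    du = u[1] - u[0]
    # Symmetrized beam transform (2.25) at each quadrature point on the line.
    Da = np.array([0.5 * trapz(np.sign(u) * a(p[None, :] - u[:, None] * theta[None, :]), du)
                   for p in pts])
    return trapz(np.exp(Da) * f(pts), dt)

f = lambda x: np.exp(-np.sum(x**2, axis=-1))
a_zero = lambda x: np.zeros(np.asarray(x).shape[:-1])
a_gauss = lambda x: 0.3 * np.exp(-np.sum(x**2, axis=-1))

R0 = attenuated_radon(f, a_zero, 0.5, 0.7)   # no attenuation: plain Radon transform
R1 = attenuated_radon(f, a_gauss, 0.5, 0.7)  # attenuation re-weights the integrand
```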
2.4.2 Riemann Hilbert problem
Riemann-Hilbert problems find many applications in complex analysis. We consider here the simplest of Riemann-Hilbert problems and refer the reader to [1] for more general cases and applications.

Let $T$ be a smooth closed curve in the complex plane, which in our application will be the unit circle, i.e., the set of complex numbers $\lambda$ such that $|\lambda|=1$. The reason why we choose the notation $\lambda$ to represent complex numbers will appear more clearly in the next section. We denote by $D^+$ the open bounded domain inside the curve $T$, i.e., in our application the unit disk $\{\lambda\in\mathbb{C},\ |\lambda|<1\}$, and by $D^-$ the open unbounded domain outside of the curve $T$, i.e., in our application $\{\lambda\in\mathbb{C},\ |\lambda|>1\}$. The orientation of the curve $T$ is chosen so that $D^+$ is on the left of the curve $T$.

For a smooth function $\phi(\lambda)$ defined on $D^+\cup D^-$, we denote by $\phi^+(t)$ and $\phi^-(t)$ the traces of $\phi$ on $T$ from $D^+$ and $D^-$, respectively. So in the case where $T$ is the unit circle, we have
$$ \phi^+(t) = \lim_{0<\varepsilon\to0}\phi((1-\varepsilon)t), \qquad \phi^-(t) = \lim_{0<\varepsilon\to0}\phi((1+\varepsilon)t). $$
We define $\varphi(t)$ on $T$ as the jump of $\phi$, i.e.,
$$ \varphi(t) = \phi^+(t) - \phi^-(t). \tag{2.28} $$
Let $\varphi(t)$ be a smooth function defined on $T$. The Riemann-Hilbert problem is stated as follows. Find a function $\phi(\lambda)$ on $D^+\cup D^-$ such that

1. $\phi(\lambda)$ is analytic on $D^+$ and analytic on $D^-$;

2. $\phi(\lambda)$ is bounded as $|\lambda|\to\infty$ on $D^-$;

3. the jump of $\phi$ is given by $\varphi(t) = \phi^+(t) - \phi^-(t)$.

The solution to the above Riemann-Hilbert problem is unique and is given by the Cauchy formula
$$ \phi(\lambda) = \frac{1}{2\pi i}\int_T \frac{\varphi(t)}{t-\lambda}\,dt, \qquad \lambda\in\mathbb{C}\setminus T = D^+\cup D^-. \tag{2.29} $$
This is the form of the Riemann-Hilbert problem we will use in the sequel. We refer the reader to [1] for the theory.
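The Cauchy formula (2.29) is easy to test numerically on the unit circle. In the toy example below (an illustration; the jump $\varphi(t)=t$ is an assumption chosen because the solution is then known in closed form, $\phi(\lambda)=\lambda$ in $D^+$ and $\phi(\lambda)=0$ in $D^-$), the contour integral is discretized by the trapezoid rule, which is spectrally accurate on the circle.

```python
import numpy as np

M = 512
ang = np.linspace(0, 2 * np.pi, M, endpoint=False)
t = np.exp(1j * ang)
dt = 1j * t * (2 * np.pi / M)        # dt along the unit circle

def phi(lam, jump):
    # Cauchy integral (2.29) discretized by the trapezoid rule on the circle.
    return np.sum(jump(t) / (t - lam) * dt) / (2j * np.pi)

inside = phi(0.3 + 0.1j, lambda t: t)    # expected: lambda itself
outside = phi(2.0 - 1.0j, lambda t: t)   # expected: 0
```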
2.4.3 Inversion of the Attenuated Radon Transform
We now want to apply the theory of Riemann-Hilbert problems to invert the attenuated Radon transform (AtRT). The first step is to extend the transport equation to the complex domain as follows. We parameterize the unit circle in the complex plane as
$$ \lambda = e^{i\theta}, \qquad \theta\in(0,2\pi). \tag{2.30} $$
The new parameter takes values on the unit circle $T$ for $\theta\in(0,2\pi)$. It can also be seen more generally as an arbitrary complex number $\lambda\in\mathbb{C}$. With the notation $x=(x_1,x_2)$, the transport equation (2.23) can be recast as
$$ \Big(\frac{\lambda+\lambda^{-1}}{2}\,\frac{\partial}{\partial x_1} + \frac{\lambda-\lambda^{-1}}{2i}\,\frac{\partial}{\partial x_2} + a(x)\Big)u(x,\lambda) = f(x), \qquad x\in\mathbb{R}^2,\ \lambda\in T. \tag{2.31} $$
We can simplify the above equation by identifying $x=(x_1,x_2)$ with $z = x_1 + ix_2$ and by defining
$$ \frac{\partial}{\partial z} = \frac12\Big(\frac{\partial}{\partial x_1} - i\frac{\partial}{\partial x_2}\Big), \qquad \frac{\partial}{\partial\bar z} = \frac12\Big(\frac{\partial}{\partial x_1} + i\frac{\partial}{\partial x_2}\Big). \tag{2.32} $$
The transport equation (2.31) is then equivalent to
$$ \Big(\lambda\frac{\partial}{\partial z} + \lambda^{-1}\frac{\partial}{\partial\bar z} + a(z)\Big)u(z,\lambda) = f(z), \qquad z\in\mathbb{C},\ \lambda\in T. \tag{2.33} $$
The same boundary conditions (2.24), that no information comes from infinity, need to be added in the new variables as well.

The above equation can also be generalized to $\lambda\in\mathbb{C}$ instead of $\lambda\in T$. It is in this framework that the Riemann-Hilbert problem theory is used to invert the attenuated Radon transform. This will be done in three steps:
(i) We show that $u(z,\lambda)$ is analytic in $D^+\cup D^- = \mathbb{C}\setminus T$ and that $u(z,\lambda)$ is bounded as $\lambda\to\infty$.

(ii) We verify that $\varphi(x,\theta) = u^+(x,\theta) - u^-(x,\theta)$, the jump of $u$ at $\lambda = e^{i\theta}$, can be written as a function of the measured data $R_a f(s,\theta)$.

(iii) We solve the Riemann-Hilbert problem using (2.29) and evaluate (2.33) at $\lambda=0$ to obtain a reconstruction formula for $f(z) = f(x)$.
2.4.4 Step (i): The $\bar\partial$ problem, an elliptic equation
Let us now analyze (2.33). In the absence of absorption, the fundamental solution of (2.33) solves the following equation:
$$ \Big(\lambda\frac{\partial}{\partial z} + \lambda^{-1}\frac{\partial}{\partial\bar z}\Big)G(z,\lambda) = \delta(z), \qquad |G(z,\lambda)|\to0 \text{ as } |z|\to\infty, \tag{2.34} $$
for $\lambda\in\mathbb{C}\setminus(T\cup\{0\})$.

Lemma 2.4.1 The unique solution to (2.34) is given by
$$ G(z,\lambda) = \frac{\mathrm{sign}(|\lambda|-1)}{\pi(\lambda\bar z - \lambda^{-1}z)}, \qquad \lambda\in\mathbb{C}\setminus(T\cup\{0\}). \tag{2.35} $$
Proof. The formula can be verified by inspection. A more constructive derivation is the following. Let us define the change of variables
$$ \zeta = \lambda^{-1}z - \lambda\bar z, \qquad \bar\zeta = \bar\lambda^{-1}\bar z - \bar\lambda z. \tag{2.36} $$
Let us assume that $|\lambda|>1$. The Jacobian of the transformation is $|\lambda|^2 - |\lambda|^{-2}$. We verify that
$$ \lambda\frac{\partial}{\partial z} + \lambda^{-1}\frac{\partial}{\partial\bar z} = \big(|\lambda|^{-2} - |\lambda|^2\big)\frac{\partial}{\partial\bar\zeta}. $$
The change of variables (2.36) has been precisely tailored so that the above holds. Denoting $\tilde G(\zeta) = G(z)$, we thus obtain
$$ \frac{\partial}{\partial\bar\zeta}\tilde G(\zeta) = \frac{-1}{|\lambda|^2 - |\lambda|^{-2}}\,\delta(z(\zeta)) = -\delta(\zeta). $$
So $\tilde G(\zeta)$ is, up to a sign, the fundamental solution of the operator $\frac{\partial}{\partial\bar\zeta}$. We verify that
$$ \frac{\partial}{\partial\bar\zeta}\,\frac{1}{\pi\zeta} = \delta(\zeta). \tag{2.37} $$
= (). (2.37)
Indeed let (z) be a smooth test function in C

0
(R
2
) and let d() be the Lebesgue
measure dxdy in C R
2
. Then
_
C
()

d() =
_
C

d() = lim
0
_
C\[[<

d()
= lim
0
_
C\[[<

d() =
1
2i
_
[[=

d(),
34 CHAPTER 2. INTEGRAL GEOMETRY. RADON TRANSFORMS
by the Green formula with complex variables:
_
X
udz =
_
X
(udx + iudy) =
_
X
(i
u
x

u
y
)dxdy = 2i
_
X
u
z
d(z).
Sending to 0, we nd in the limit
_
C
()

d() =
1
2i
2i(0) =
_
R
2
()()d().
This implies that $\tilde G(\zeta) = -(\pi\zeta)^{-1}$, hence
$$ G(z) = \frac{-1}{\pi(\lambda^{-1}z - \lambda\bar z)} = \frac{1}{\pi(\lambda\bar z - \lambda^{-1}z)}. $$
This is (2.35) for $|\lambda|>1$. For $|\lambda|<1$, we verify that the Jacobian of the transformation $z\mapsto\zeta(z)$ becomes $|\lambda|^{-2} - |\lambda|^2$, so that
$$ \frac{\partial}{\partial\bar\zeta}\tilde G(\zeta) = \frac{1}{|\lambda|^{-2} - |\lambda|^2}\,\delta(z(\zeta)) = \delta(\zeta). $$
This yields (2.35) for $|\lambda|<1$.
The above proof shows that $(\pi z)^{-1}$ is the fundamental solution of the $\bar\partial = \frac{\partial}{\partial\bar z}$ operator. This implies that the solution to the following $\bar\partial$ problem,
$$ \frac{\partial}{\partial\bar z}f(z) = g(z), \qquad z\in\mathbb{C}, \tag{2.38} $$
such that $f(z)$ vanishes at infinity, is given by convolution with the fundamental solution, i.e.,
$$ f(z) = \frac{1}{\pi}\int_{\mathbb{C}} \frac{g(\zeta)}{z-\zeta}\,d\mu(\zeta) = \frac{1}{2i\pi}\int_{\mathbb{C}} \frac{g(\zeta)}{z-\zeta}\,d\bar\zeta\wedge d\zeta. \tag{2.39} $$
Here we have used that $dz\wedge d\bar z = (dx+idy)\wedge(dx-idy) = -2i\,dx\wedge dy = -2i\,d\mu(z)$, whence the change of $2$-form in the above integrations.
Notice that the Green function $G(z,\lambda)$ tends to $0$ as $z\to\infty$ for $\lambda\not\in T$. This is clearly not true when $\lambda\in T$, where $G(z,\theta) = \delta(l_\theta(z))$, with $l_\theta(z)$ the segment $\{t\theta,\ t>0\}$. The reason is that for $\lambda\in\mathbb{C}\setminus(T\cup\{0\})$,
$$ \lambda\frac{\partial}{\partial z} + \lambda^{-1}\frac{\partial}{\partial\bar z} \qquad\text{and}\qquad \frac{\partial}{\partial\bar z} $$
are elliptic operators, in the sense that in the Fourier domain their symbols, given by $\lambda k_z + \lambda^{-1}\bar k_z$ and $\bar k_z$, respectively, do not vanish provided that $k_z$ is not $0$. Indeed, we verify that
$$ \lambda k_z + \lambda^{-1}\bar k_z = 0 \quad\text{implies}\quad |\lambda|^2 = 1, $$
when $k_z\neq0$, since $|k_z| = |\bar k_z|\neq0$.
Let us now define $h(z,\lambda)$ as the solution to
$$ \Big(\lambda\frac{\partial}{\partial z} + \lambda^{-1}\frac{\partial}{\partial\bar z}\Big)h(z,\lambda) = a(z), \qquad |h(z,\lambda)|\to0 \text{ as } |z|\to\infty, \tag{2.40} $$
for $\lambda\in\mathbb{C}\setminus(T\cup\{0\})$. The solution is given by
$$ h(z,\lambda) = \int_{\mathbb{R}^2} G(z-\zeta,\lambda)\,a(\zeta)\,d\mu(\zeta). \tag{2.41} $$
We now verify that
$$ \Big(\lambda\frac{\partial}{\partial z} + \lambda^{-1}\frac{\partial}{\partial\bar z}\Big)\big(e^{h(z,\lambda)}u(z,\lambda)\big) = e^{h(z,\lambda)}f(z), $$
so that for $\lambda\in\mathbb{C}\setminus(T\cup\{0\})$, the solution of (2.33) is given by
$$ u(z,\lambda) = e^{-h(z,\lambda)}\int_{\mathbb{R}^2} G(z-\zeta,\lambda)\,e^{h(\zeta,\lambda)}f(\zeta)\,d\mu(\zeta). \tag{2.42} $$
We verify that $G(z,\lambda)$, $h(z,\lambda)$ and $u(z,\lambda)$ are defined by continuity at $\lambda=0$ and $\lambda=\infty$, since $G(z,\lambda)\to0$ there. We now verify that $G(z,\lambda)$ is analytic in $D^+$ (including at $\lambda=0$) and in $D^-$. Assuming that $a(z)$ and $f(z)$ are smooth functions, this is therefore also the case for $h(z,\lambda)$ and $u(z,\lambda)$. Moreover, we easily deduce from (2.42) that $u(z,\lambda)$ is bounded on $D^-$. The solution $u(z,\lambda)$ of the transport equation extended to the complex plane is therefore a good candidate for the application of the Riemann-Hilbert theorem.
2.4.5 Step (ii): jump conditions

We now want to find the limit of u(z,\lambda) as \lambda approaches T from above (in D_-) and below (in D_+). Let us write \lambda = re^{i\theta} and let us send r - 1 to \mp 0 on D_\pm. The Green function behaves according to the following result.

Lemma 2.4.2 As r - 1 \to \mp 0, the Green function G(x,\lambda) tends to

G_\pm(x,\theta) = \frac{\pm 1}{2\pi i\,\big(\theta^\perp\cdot x \mp i0\,\mathrm{sign}(\theta\cdot x)\big)}.   (2.43)
Proof. Let us assume that |\lambda| > 1, i.e., r = 1 + \varepsilon with \varepsilon > 0. We then find

G(z, re^{i\theta}) = \frac{1}{\pi}\,\frac{1}{re^{i\theta}\bar z - \frac{e^{-i\theta}}{r}\, z}
= \frac{1}{\pi}\,\frac{1}{(1+\varepsilon)e^{i\theta}\bar z - (1-\varepsilon)e^{-i\theta} z + o(\varepsilon)}
= \frac{1}{2\pi}\,\frac{1}{i\,\Im(e^{i\theta}\bar z) + \varepsilon\,\Re(e^{i\theta}\bar z) + o(\varepsilon)}
= \frac{-1}{2\pi i}\,\frac{1}{\theta^\perp\cdot x + i\varepsilon\,(\theta\cdot x) + o(\varepsilon)}.

Passing to the limit \varepsilon \to 0, we obtain

G_-(x,\theta) = \frac{-1}{2\pi i}\,\frac{1}{\theta^\perp\cdot x + i0\,\mathrm{sign}(\theta\cdot x)}.

Here by convention \pm 0 is the limit of \pm\varepsilon as 0 < \varepsilon \to 0. The limit on D_+ is treated similarly.
We have chosen to define G_\pm(x,\theta) as functions of \theta \in (0,2\pi) instead of functions of \lambda = e^{i\theta}. We have also identified x = (x,y) with z = x + iy. The above lemma gives us a convergence in the sense of distributions. We can equivalently say that for all smooth functions \varphi(x), we have

\int_{\mathbb{R}^2} G_\pm(x-y,\theta)\,\varphi(y)\, dy = \pm\frac{1}{2i}\,(HR_\theta\varphi)(x\cdot\theta^\perp) + (D_\theta\varphi)(x).   (2.44)

We recall that the Hilbert transform H is defined in (2.11) and the Radon transform in (2.4)-(2.5).
36 CHAPTER 2. INTEGRAL GEOMETRY. RADON TRANSFORMS
Proof. The derivation is based on the following result. For any f(x) \in C_0^\infty(\mathbb{R}), we have

\lim_{\varepsilon\to 0}\int_{\mathbb{R}} \frac{f(x)}{ix+\varepsilon}\, dx = -i\,\mathrm{p.v.}\!\int_{\mathbb{R}} \frac{f(x)}{x}\, dx + \pi\,\mathrm{sign}(\varepsilon)\, f(0).   (2.45)

Exercise 2.4.1 Prove the above limit, called Plemelj's formula.
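Plemelj's formula (2.45) also lends itself to a quick numerical sanity check, sketched below (an illustration with ad hoc grid parameters, not part of the exercise). With f a Gaussian, the principal-value term vanishes by symmetry, so the integral should approach \pi\,\mathrm{sign}(\varepsilon) f(0) = \pi\,\mathrm{sign}(\varepsilon).

```python
import numpy as np

# Check of Plemelj's formula (2.45) for f(x) = exp(-x^2): the p.v. term is
# zero by oddness, so the integral should tend to pi*sign(eps)*f(0).
def plemelj_integral(eps, n=2_000_001, L=30.0):
    x = np.linspace(-L, L, n)
    return np.sum(np.exp(-x ** 2) / (1j * x + eps)) * (x[1] - x[0])

for eps in (0.1, 0.01, -0.01):
    print(eps, plemelj_integral(eps).real, np.pi * np.sign(eps))
```

The dense grid is needed because the integrand develops a peak of width |eps| near the origin.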
Let us denote x = \sigma\theta^\perp + \tau\theta and y = s\theta^\perp + t\theta. We have

\int_{\mathbb{R}^2} G_+(y)\,\varphi(x-y)\, dy = \frac{1}{2\pi}\int_{\mathbb{R}^2} \frac{\varphi\big((\sigma-s)\theta^\perp + (\tau-t)\theta\big)}{is + 0\,\mathrm{sign}(t)}\, ds\, dt

= \frac{1}{2\pi}\int_{\mathbb{R}} \mathrm{p.v.}\!\int_{\mathbb{R}} \frac{\varphi\big((\sigma-s)\theta^\perp + (\tau-t)\theta\big)}{is}\, ds\, dt + \frac{1}{2}\int_{\mathbb{R}} \mathrm{sign}(t)\,\varphi\big(\sigma\theta^\perp + (\tau-t)\theta\big)\, dt

= \frac{1}{2\pi}\,\mathrm{p.v.}\!\int_{\mathbb{R}}\int_{\mathbb{R}} \frac{\varphi\big((\sigma-s)\theta^\perp + (\tau-t)\theta\big)}{is}\, dt\, ds + \frac{1}{2}\int_{\mathbb{R}} \mathrm{sign}(t)\,\varphi(x - t\theta)\, dt

= \frac{1}{2i}\,(HR_\theta\varphi)(x\cdot\theta^\perp) + (D_\theta\varphi)(x).

A similar derivation yields the limit on D_-.
We deduce that the function h(z,\lambda) defined in (2.40) admits the limits

h_\pm(x,\theta) = \pm\frac{1}{2i}\,(HR_\theta a)(x\cdot\theta^\perp) + (D_\theta a)(x).   (2.46)

Notice that R_\theta and D_\theta involve integrations in the direction \theta only, so that

R_\theta\big[u(x)\, v(x\cdot\theta^\perp)\big](s) = v(s)\, R_\theta[u](s), \qquad D_\theta\big[u(x)\, v(x\cdot\theta^\perp)\big](x) = v(x\cdot\theta^\perp)\, D_\theta[u](x).
Using this result and (2.44), we deduce that the limits of the solution u(z,\lambda) to (2.31) are given by

u_\pm(x,\theta) = e^{-D_\theta a}\, e^{\mp\frac{1}{2i}(HR_\theta a)(x\cdot\theta^\perp)}\, \Big(\pm\frac{1}{2i}\Big) H\Big[e^{\pm\frac{1}{2i}(HR_\theta a)(s)}\, R_\theta\big(e^{D_\theta a} f\big)\Big](x\cdot\theta^\perp) + e^{-D_\theta a}\, D_\theta\big(e^{D_\theta a} f\big)(x).   (2.47)

We recall that R_\theta(e^{D_\theta a} f) = R_{a,\theta}f(s) are our measurements. So whereas u_+ and u_- do not depend only on the measurements (they depend on D_\theta(e^{D_\theta a}f)(x), which is not measured), the difference u_+ - u_- depends only on the measurements. This is the property that allows us to invert the attenuated Radon transform. More precisely, let us define

\varphi(x,\theta) = (u_+ - u_-)(x,\theta).   (2.48)
Using (2.47), we deduce that

i\varphi(x,\theta) = R^*_{-a,\theta}\, H_a\, R_{a,\theta} f(x),   (2.49)

where we have defined the following operators

R^*_{a,\theta}\, g(x) = e^{D_\theta a(x)}\, g(x\cdot\theta^\perp), \qquad H_a = C_c H C_c + C_s H C_s,

C_c\, g(s,\theta) = g(s,\theta)\,\cos\Big(\frac{HR_\theta a(s)}{2}\Big), \qquad C_s\, g(s,\theta) = g(s,\theta)\,\sin\Big(\frac{HR_\theta a(s)}{2}\Big).   (2.50)

Here R^*_{a,\theta} is the formal adjoint operator to R_{a,\theta}.

The above derivation shows that i\varphi(x,\theta) is real-valued and of the form e^{-D_\theta a(x)}\, M(x\cdot\theta^\perp, \theta) for some function M. We deduce therefore that

\varphi(x,\theta) + \overline{\varphi(x,\theta)} = 0.   (2.51)
2.4.6 Step (iii): reconstruction formulas

We have seen that u(z,\lambda) is analytic in \lambda on D_+ \cup D_- and is of order O(\lambda^{-1}) at infinity. Moreover the jump of u(z,\lambda) across T is given by \varphi(x,\theta) for 0 \le \theta < 2\pi. We thus deduce from the Cauchy formula (2.29) that

u(x,\lambda) = \frac{1}{2\pi i}\int_T \frac{\varphi(x,t)}{t - \lambda}\, dt, \qquad \lambda \in D_+ \cup D_-,   (2.52)

where we identify \varphi(x,t) with \varphi(x,\theta) for t = e^{i\theta}. We now deduce from (2.33) in the vicinity of \lambda = 0 that

f(x) = \lim_{\lambda\to 0} \frac{1}{\lambda}\,\frac{\partial}{\partial \bar z}\, u(x,\lambda).   (2.53)

Indeed we verify that u(z,\lambda) = O(\lambda) on D_+ so that a(x)u(x,\lambda) \to 0 as \lambda \to 0. Since u(x,\lambda) is known thanks to (2.52) in terms of the boundary measurements R_{a,\theta}f(s), this is our reconstruction formula. Let us be more specific. We verify that

u(x,\lambda) = \frac{1}{2\pi i}\int_T \frac{\varphi(x,t)}{t}\, dt + \frac{\lambda}{2\pi i}\int_T \frac{\varphi(x,t)}{t^2}\, dt + O(\lambda^2).   (2.54)
We thus deduce from (2.53) and the fact that u(x,\lambda) = O(\lambda) on D_+ that

0 = \frac{1}{2\pi i}\int_T \frac{\varphi(x,t)}{t}\, dt \qquad\text{and}\qquad f(x) = \frac{1}{2\pi i}\int_T \frac{\partial}{\partial \bar z}\,\varphi(x,t)\,\frac{dt}{t^2}.   (2.55)

The second equality is the reconstruction formula we were looking for, since \varphi is defined in (2.49) in terms of the measurements R_{a,\theta}f(s). The first equality is a compatibility condition that i\varphi must satisfy in order for the data to be the attenuated Radon transform of a function f(x). This compatibility condition is similar to the condition g(s,\theta) = g(-s,\theta+\pi) satisfied by the Radon transform in the absence of absorption. These compatibility conditions are much more difficult to visualize when absorption does not vanish because the integrals along the line \{s\theta^\perp + t\theta;\ t \in \mathbb{R}\} differ depending on the direction of integration.

Let us recast the reconstruction formula so that it only involves real-valued quantities.
Exercise 2.4.2 Using t = e^{i\theta} and dt = ie^{i\theta}\, d\theta, deduce that

\frac{1}{2\pi i}\int_T \frac{\varphi(x,t)}{t}\, dt = \frac{1}{2\pi}\int_0^{2\pi} \varphi(x,\theta)\, d\theta,

\frac{1}{2\pi i}\int_T \frac{\partial}{\partial \bar z}\,\varphi(x,t)\,\frac{dt}{t^2} = \frac{1}{4\pi}\int_0^{2\pi} \theta^\perp\cdot\nabla\,(i\varphi)(x,\theta)\, d\theta + \frac{1}{4\pi}\int_0^{2\pi} \theta\cdot\nabla\,\varphi(x,\theta)\, d\theta.

Use (2.51) and (2.55) to show that

f(x) = \frac{1}{4\pi}\int_0^{2\pi} \theta^\perp\cdot\nabla\,(i\varphi)(x,\theta)\, d\theta.   (2.56)
Let us denote by g(s,\theta) = R_{a,\theta}f(s) the SPECT measurements. From the above results we recast (2.56) as

f(x) = [Ag](x) := \frac{1}{4\pi}\int_0^{2\pi} \theta^\perp\cdot\nabla\,\big(R^*_{-a,\theta}\, H_a\, g\big)(x,\theta)\, d\theta.   (2.57)
Exercise 2.4.3 Show that (2.57) simplifies to (2.14) when a \equiv 0.

Exercise 2.4.4 (Reconstruction with constant absorption.) We assume that f(x) = 0 for |x| \ge 1 and that a(x) = \mu for |x| < 1. This corresponds thus to the case of a constant absorption coefficient on the support of the source term f(x).
(i) Show that

e^{D_\theta a(x)} = e^{\mu\,\theta\cdot x}, \qquad |x| < 1.

Deduce that

\theta^\perp\cdot\nabla\big(e^{-D_\theta a(x)}\, g(x\cdot\theta^\perp, \theta)\big) = e^{-\mu\,\theta\cdot x}\, \partial_s g(x\cdot\theta^\perp, \theta).

(ii) Verify that the operator H_\mu defined by H_a = H_\mu for a constant is diagonal in the Fourier domain and that

\widehat{H_\mu u}(\sigma) = -i\,\mathrm{sign}_\mu(\sigma)\,\hat u(\sigma), \qquad \mathrm{sign}_\mu(\sigma) = \begin{cases} \mathrm{sign}(\sigma) & |\sigma| \ge \mu, \\ 0 & |\sigma| < \mu. \end{cases}

(iii) Show that

g_\mu(s,\theta) = R_\theta\big(e^{\mu\,\theta\cdot x} f\big)(s), \qquad f(x) = \frac{1}{4\pi}\int_0^{2\pi} e^{-\mu\,\theta\cdot x}\,\big(H_\mu\,\partial_s\, g_\mu\big)(x\cdot\theta^\perp, \theta)\, d\theta.   (2.58)

Verify that in the limit \mu \to 0, we recover the inversion for the Radon transform.
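The multiplier \mathrm{sign}_\mu of part (ii) is easy to realize with the FFT. The sketch below (a hypothetical discrete implementation on a periodic grid, added for illustration) applies H_\mu to a two-frequency signal and shows that the low frequency |\sigma| < \mu is annihilated while the other one is Hilbert-transformed.

```python
import numpy as np

# FFT implementation of H_mu: the Fourier multiplier -i*sign_mu(sigma),
# which zeroes out all frequencies with |sigma| < mu (Exercise 2.4.4(ii)).
def H_mu(u, mu, dt):
    sigma = 2 * np.pi * np.fft.fftfreq(len(u), d=dt)
    mult = -1j * np.sign(sigma) * (np.abs(sigma) >= mu)
    return np.fft.ifft(mult * np.fft.fft(u)).real

t = np.linspace(0.0, 2 * np.pi, 256, endpoint=False)
u = np.cos(3 * t) + np.cos(t)
# With mu = 2, the frequency-1 mode is annihilated and cos(3t) -> sin(3t):
print(np.max(np.abs(H_mu(u, 2.0, t[1] - t[0]) - np.sin(3 * t))))
```

For \mu = 0 the same routine is the standard Hilbert transform, consistent with the limit considered in the exercise.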
Chapter 3
Integral Geometry. Generalized Ray
Transform
The preceding chapter was devoted to the Radon and Attenuated Radon transforms, for which explicit inversion formulas are available. This fairly technical chapter presents some mathematical techniques that have been developed to analyze transforms in integral geometry for which no explicit inversion formula is known. Although similar techniques may be applied to more general inverse problems, we focus here on the integration of functions along a family of two-dimensional curves. This integral geometry problem is described in section 3.1.
Proving the injectivity of such integral transforms is a difficult problem. We saw in Chapter 2 how the method of unique/analytic continuation was used to obtain the injectivity of the Attenuated Radon transform. In that chapter, we mentioned Jan Boman's example of a weighted Radon transform that was not injective. In section 3.4 of this chapter, we present another method, based on energy estimates and due to Mukhometov [45] (see also [58]), that shows that the integral transform is injective for a wide class of families of curves.
The second main objective of this chapter is to present explicit inversion procedures for such integral transforms. These procedures are based on constructing parametrices for the transform that allow us to recast the inverse transform as the inversion of operators of the form I - T, with T compact and I the identity operator. Most of the material introduced in the chapter is devoted to recasting the inversion in this framework. It requires the introduction of micro-local analytic tools such as oscillatory integrals and Fourier integral operators. These tools are presented without any required prior knowledge of micro-local analysis in sections 3.2 and 3.3.

More generally, the representation of integral geometry transforms as Fourier integrals displays one of the fundamental properties of inverse problems, namely that of propagation of singularities. Heuristically, an inverse problem is well-posed when singularities (such as discontinuities) in the parameters give rise to singularities in the available measurements. The inversion procedure therefore first and foremost must back-propagate the singularities in the data space to those in the parameter space. A well-suited language of propagation of singularities is briefly introduced in section 3.5.
3.1 Generalized Ray Transform: Setting in two dimensions.

3.1.1 Family of curves.

We consider the following family of curves

\mathbb{R} \ni t \mapsto x = \gamma(t, s, \theta) \in \mathbb{R}^2, \qquad (s,\theta) \in \mathbb{R}\times S^1.   (3.1)
We assume that curves are traveled along with unit speed so that |\frac{d\gamma}{dt}| = 1. We will make several assumptions on the curves as we proceed.

Define y = (t,s) so that \gamma = \gamma(y,\theta). We assume that the map y \mapsto \gamma(y,\theta) is globally invertible for all values of \theta, with inverse \gamma^{-1}, so that

\gamma(\gamma^{-1}(x,\theta),\theta) = x, \qquad \gamma^{-1}(\gamma(y,\theta),\theta) = y.

We denote the inverse function

\gamma^{-1}(x,\theta) = \big(t(x,\theta),\ s(x,\theta)\big).   (3.2)

The function s(x,\theta) provides the unique curve to which x belongs for a fixed \theta, i.e., x \in \gamma(\mathbb{R}, s(x,\theta), \theta).

For the Radon transform corresponding to integration along straight lines, we have

\gamma(t,s,\theta) = s\theta^\perp + t\theta, \qquad s(x,\theta) = x\cdot\theta^\perp.

This corresponds to a parameterization of the set of lines where \theta is the vector tangent to the line and

\theta^\perp = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\theta

is the vector \theta rotated by \frac{\pi}{2}. This notation generalizes to the multi-dimensional cases, where \theta still parameterizes the direction of the lines.
3.1.2 Generalized Ray Transform.

The generalized Ray transform (GRT) is then the integral of functions over the curves (s,\theta) \mapsto \gamma(t,s,\theta):

Rf(s,\theta) = \int_{\mathbb{R}} f(\gamma(t,s,\theta))\, dt = \int_{\mathbb{R}^2} f(\gamma(t,s_0,\theta))\,\delta(s - s_0)\, ds_0\, dt
= \int_{\mathbb{R}^2} f(x)\,\delta(s - s(x,\theta))\, J(x,\theta)\, dx,   (3.3)

where J(x,\theta) is the (uniformly positive) Jacobian of the transformation x \mapsto \gamma^{-1}(x,\theta) at fixed \theta:

J(x,\theta) := \Big|\frac{d\gamma^{-1}}{dx}\Big|(x,\theta) = \Big|\frac{ds_0\, dt}{dx}\Big|(x,\theta).   (3.4)
Exercise 3.1.1 Check the change of variables in detail.
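For the straight-line family \gamma(t,s,\theta) = s\theta^\perp + t\theta, the transform (3.3) can also be checked numerically on the indicator function of the unit disk, whose ray transform is the chord length 2\sqrt{1-s^2} for every \theta. The following sketch (illustrative; the quadrature parameters are arbitrary) evaluates the line integral directly.

```python
import numpy as np

# Ray transform along gamma(t, s, theta) = s*theta_perp + t*theta applied to
# the indicator of the unit disk; the exact value is 2*sqrt(1 - s^2).
def ray_transform(f, s, theta, T=2.0, n=4001):
    t = np.linspace(-T, T, n)
    perp = np.array([-np.sin(theta), np.cos(theta)])   # theta_perp
    d = np.array([np.cos(theta), np.sin(theta)])       # theta
    pts = s * perp[:, None] + t[None, :] * d[:, None]  # points gamma(t, s, theta)
    return np.sum(f(pts[0], pts[1])) * (t[1] - t[0])

disk = lambda x, y: (x ** 2 + y ** 2 < 1).astype(float)
for theta in (0.0, 1.0, 2.5):                          # independent of theta
    print(theta, ray_transform(disk, 0.5, theta), 2 * np.sqrt(1 - 0.25))
```

The independence of the result with respect to \theta reflects the rotational symmetry of the disk.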
More generally, we consider weighted GRTs of the form

R_w f(s,\theta) = \int_{\mathbb{R}} f(\gamma(t,s,\theta))\, w(t,s,\theta)\, dt,   (3.5)

where w(y,\theta) is a given, positive, weight. Such integrals are of the same form as before:

R_w f(s,\theta) = \int_{\mathbb{R}} f(\gamma(t,s,\theta))\, w(t,s,\theta)\, dt = \int_{\mathbb{R}^2} f(\gamma(t,s_0,\theta))\,\delta(s - s_0)\, w(t,s_0,\theta)\, ds_0\, dt
= \int_{\mathbb{R}^2} f(x)\,\delta(s - s(x,\theta))\, J_w(x,\theta)\, dx,

with a different expression for J(x,\theta):

J(x,\theta) \to J_w(x,\theta) := \Big|\frac{d\gamma^{-1}}{dx}\Big|(x,\theta)\, w\big(\gamma^{-1}(x,\theta), \theta\big).   (3.6)

To simplify, we shall use the notation J(x,\theta) rather than J_w(x,\theta) and R_J rather than R_w. Thus generally, we consider an operator of the form

R_J f(s,\theta) = \int_{\mathbb{R}^2} f(x)\,\delta(s - s(x,\theta))\, J(x,\theta)\, dx,   (3.7)

where J(x,\theta) is a smooth, uniformly positive, and bounded weight.
The objective of this section is to obtain a parametrix for the weighted GRT R_J. A (left) parametrix is an operator P_J such that P_J R_J = I - T_J, where T_J is a compact operator. Provided that +1 is not an eigenvalue of T_J, the Fredholm alternative guarantees that (I - T_J) is invertible. This provides the inversion procedure

f = (I - T_J)^{-1}\, P_J\, (R_J f).
A complete characterization of when +1 is not an eigenvalue of T_J is not known in general, and Boman's counter-example presented in Chapter 2 shows that R_J, and hence I - T_J, may not be injective in some situations. Injectivity of R_J is analyzed in section 3.4. The rest of this section and of sections 3.2 and 3.3 is devoted to the construction of a parametrix for R_J.
3.1.3 Adjoint operator and rescaled Normal operator

When J \equiv 1 and s(x,\theta) = x\cdot\theta^\perp, then R_J is the standard Radon transform in two dimensions. We have obtained in Chapter 2 the inversion formula:

I = \frac{1}{4\pi}\, R^*_J\, \Lambda\, R_J, \qquad \Lambda = H\,\partial_s.

We thus see the need to introduce the Riesz operator \Lambda and the adjoint operator R^*_J. In curved geometries, however, no explicit formula such as the one given above can be obtained in general. We no longer have access to the Fourier slice theorem, which uses the invariance by translation of the geometry in the standard Radon transform.
The adjoint operator (AGRT) for the L^2 inner product on \mathbb{R}\times S^1 is defined as

R^*_K g(x) = \int_{S^1} g(s(x,\theta),\theta)\, K(x,\theta)\, d\theta = \int_{\mathbb{R}\times S^1} g(s,\theta)\, K(x,\theta)\,\delta(s - s(x,\theta))\, d\theta\, ds.   (3.8)

The normal operator is thus given by

R^*_K R_J f(x) = \int_{\mathbb{R}^2\times S^1} f(y)\, K(x,\theta)\, J(y,\theta)\,\delta\big(s(x,\theta) - s(y,\theta)\big)\, dy\, d\theta.   (3.9)
Exercise 3.1.2 Check this.
We need to introduce \Lambda = H\partial_s to make the operator invertible from L^2 to L^2 as in the case of the standard Radon transform. A simple way to do so is to recast the operators as Fourier integral operators (FIOs) as follows. We formally recast the GRT and AGRT as the following oscillatory integrals

R_J f(s,\theta) = \int_{\mathbb{R}^2\times\mathbb{R}} f(x)\, e^{i\sigma(s - s(x,\theta))}\, J(x,\theta)\,\frac{dx\, d\sigma}{2\pi},

R^*_K g(x) = \int_{\mathbb{R}\times S^1\times\mathbb{R}} g(s,\theta)\, e^{i\sigma(s(x,\theta) - s)}\, K(x,\theta)\,\frac{ds\, d\theta\, d\sigma}{2\pi}.   (3.10)

We then introduce the Riesz operator \Lambda = H\partial_s given by the Fourier multiplier |\sigma| in the Fourier domain:

\Lambda f(s) = H\partial_s f(s) = \mathcal{F}^{-1}_{\sigma\to s}\, |\sigma|\, \mathcal{F}_{s\to\sigma} f(s).

We thus have

\Lambda R_J f(s,\theta) = \int_{\mathbb{R}^2\times\mathbb{R}} f(x)\,|\sigma|\, e^{i\sigma(s - s(x,\theta))}\, J(x,\theta)\,\frac{dx\, d\sigma}{2\pi}.
Exercise 3.1.3 Check this.
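The Riesz operator \Lambda = H\partial_s is simplest to realize discretely through its Fourier multiplier |\sigma|. A sketch on a periodic grid (added for illustration only): since \Lambda\cos(ks) = |k|\cos(ks), we can test the implementation on a single harmonic.

```python
import numpy as np

# The Riesz operator Lambda = H d/ds as the Fourier multiplier |sigma|;
# on a periodic grid, Lambda cos(k s) = |k| cos(k s).
def Lam(u, ds):
    sigma = 2 * np.pi * np.fft.fftfreq(len(u), d=ds)
    return np.fft.ifft(np.abs(sigma) * np.fft.fft(u)).real

s = np.linspace(0.0, 2 * np.pi, 128, endpoint=False)
u = np.cos(3 * s)
print(np.max(np.abs(Lam(u, s[1] - s[0]) - 3 * np.cos(3 * s))))   # ~ machine eps
```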
The normal operator for the weights J and K is then defined as

Ff(x) := R^*_K \Lambda R_J f(x) = \int_{\mathbb{R}^2\times S^1\times\mathbb{R}} f(y)\,|\sigma|\, e^{i\sigma(s(y,\theta) - s(x,\theta))}\, K(x,\theta)\, J(y,\theta)\,\frac{dy\, d\theta\, d\sigma}{2\pi}.   (3.11)

It remains to analyze such an operator. We shall have two main objectives: (i) find the appropriate value for K(x,\theta) that makes F an approximation of the identity; and (ii) see how R_J can be inverted in some specific cases.

We first observe that Ff(x) is real-valued if we choose J and K to be real-valued. The contribution from \sigma > 0 is the same as that from \sigma < 0 by complex-conjugating (3.11). Thus,

Ff(x) = \int_{\mathbb{R}^2\times S^1\times\mathbb{R}_+} f(y)\, e^{i\sigma(s(y,\theta) - s(x,\theta))}\, K(x,\theta)\, J(y,\theta)\,\frac{\sigma\, dy\, d\theta\, d\sigma}{\pi}.   (3.12)
The variables (\sigma,\theta) may be recast as \xi = \sigma\theta in \mathbb{R}^2, so that \hat\xi = \theta and |\xi| = \sigma. We then recast the above operator as

Ff(x) = \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i(s(y,\hat\xi) - s(x,\hat\xi))|\xi|}\,\frac{K(x,\hat\xi)\, J(y,\hat\xi)}{\pi}\, dy\, d\xi.   (3.13)

The phase and the amplitude are then defined as

\phi(x,y,\xi) = \big(s(y,\hat\xi) - s(x,\hat\xi)\big)\,|\xi|, \qquad a(x,y,\xi) = \frac{1}{\pi}\, K(x,\hat\xi)\, J(y,\hat\xi).   (3.14)

The phase is homogeneous of degree 1 in \xi since \phi(x,y,t\xi) = t\phi(x,y,\xi) for t > 0. The amplitude is homogeneous of degree 0. We may thus recast the above operator as

Ff(x) = \int_{\mathbb{R}^2\times\mathbb{R}^2} e^{i\phi(x,y,\xi)}\, a(x,y,\xi)\, f(y)\, dy\, d\xi.   (3.15)
Our first objective is to prove that for an appropriate choice of K(x,\theta), F may be decomposed as F = I - T where T is a compact, smoothing, operator. This provides an approximate inversion of the generalized Radon transform and an iterative exact reconstruction procedure in some cases. In general, however, it is difficult to show that T does not admit 1 as an eigenvalue. This property is expected to hold generically. But it may not hold, or it may not be known how to prove it, for specific transforms of interest.

A second objective is therefore to look at the operator N = R^*_J \Lambda R_J, which is a self-adjoint (normal) operator. We shall show that N is injective in some cases of interest (using the Mukhometov technique) and that QN = I - T for some operator Q and compact operator T. This will allow us to prove that N is an invertible operator in L^2. The generalized transform can then be inverted by means of, e.g., a conjugate gradient algorithm. This provides an explicit reconstruction procedure that is guaranteed to provide the correct inverse.
3.2 Oscillatory integrals and Fourier Integral Operators

In this section, we collect generic properties of oscillatory integrals and operators of the form (3.15). References are Fourier Integral Operators I by Lars Hörmander and Fourier integrals in classical analysis by Christopher Sogge.

3.2.1 Symbols, phases, and oscillatory integrals.

The first main player is the class of symbols:

Definition 3.2.1 We denote by S^m(X\times\mathbb{R}^N) the set of a \in C^\infty(X\times\mathbb{R}^N) such that for all compact K \subset X and all multi-orders \alpha and \beta, we have

|D^\beta_x D^\alpha_\xi\, a(x,\xi)| \le C_{\alpha,\beta,K}\,(1 + |\xi|)^{m - |\alpha|}, \qquad (x,\xi) \in K\times\mathbb{R}^N.   (3.16)
Throughout the notes, X itself will be bounded and a can be chosen of class C^\infty(\bar X\times\mathbb{R}^N), with then K = \bar X above.

The second main player is the phase \phi(x,\xi) that appears in the following oscillatory integrals:

I_\phi(au) = \int_{X\times\mathbb{R}^N} e^{i\phi(x,\xi)}\, a(x,\xi)\, u(x)\, dx\, d\xi, \qquad u \in C^\infty_0(X).   (3.17)
The phase \phi is assumed to be positively homogeneous of degree 1 with respect to \xi, i.e., \phi(x,t\xi) = t\phi(x,\xi) for t > 0, and such that \phi \in C^\infty for \xi \ne 0. Typically, the phase is defined for |\xi| = 1. It is then extended to all values of \xi by homogeneity, except at \xi = 0 where it is not smooth.

These oscillatory integrals need to be defined carefully since the integrand is not Lebesgue-integrable. When \phi vanishes on open sets, the above integral may simply not be defined at all. However, a reasonable definition can be given when \phi has no critical point in the joint variables (x,\xi) for \xi \ne 0, i.e., d\phi \ne 0 for \xi \ne 0. Here,

d\phi = \phi'_x\, dx + \phi'_\xi\, d\xi.

No critical point means |\phi'_x|^2 + |\phi'_\xi|^2 > 0 for \xi \ne 0. By definition, a phase is a smooth function that (i) is positively homogeneous of degree one and (ii) has no critical points in the variables (x,\xi) when \xi \ne 0.

By hypothesis, the sum

\Phi := |\xi|^2\, |\phi'_\xi|^2 + |\phi'_x|^2 > 0

for \xi \ne 0 is homogeneous of degree 2, since \phi'_\xi is homogeneous of degree 0 and \phi'_x is homogeneous of degree 1 (all in the \xi variable).
Exercise 3.2.1 Check this.
Let \chi(\xi) \in C^\infty_0(\mathbb{R}^N) be such that \chi = 1 near \xi = 0, and define the differential operator

M = \alpha\cdot\partial_\xi + \beta\cdot\partial_x + \chi, \qquad \alpha = \frac{-i(1-\chi)}{\Phi}\,|\xi|^2\,\phi'_\xi, \qquad \beta = \frac{-i(1-\chi)}{\Phi}\,\phi'_x.

We verify that M is a differential operator with smooth coefficients and that

M e^{i\phi} = e^{i\phi}.

Exercise 3.2.2 Verify this in detail.

Let us then define the transpose differential operator

L = M^t = -\alpha\cdot\partial_\xi - \beta\cdot\partial_x + \big(\chi - \partial_\xi\cdot\alpha - \partial_x\cdot\beta\big).

Assuming that a(x,\xi) vanishes for large \xi, we can integrate (3.17) by parts and obtain that

I_\phi(au) = \int_{X\times\mathbb{R}^N} e^{i\phi(x,\xi)}\, L^k\big(a(x,\xi)\, u(x)\big)\, dx\, d\xi, \qquad u \in C^\infty_0(X),\ k = 0,1,2,\dots   (3.18)

The advantage of these integrations by parts is that L^k maps S^m into S^{m-k}.
Exercise 3.2.3 Check this latter result as well as the integrations by parts in (3.18).

Once m - k < -N, the integral is well defined as a classical Lebesgue integral.

Exercise 3.2.4 Check this.

We can then pass by continuity to the definition of the integral when a(x,\xi) does not vanish for large \xi. This is our definition of the oscillatory integral (which we can check does not depend on the choice of \chi or k).

Note that A : u \mapsto I_\phi(au) is then defined as a distribution. This is in fact a distribution of order k if a \in S^m and m - k < -N; see also (3.20) below.
We also verify using the same integrations by parts that

I_\phi(au) = \lim_{\varepsilon\to 0}\int e^{i\phi(x,\xi)}\, a(x,\xi)\,\psi(\varepsilon\xi)\, u(x)\, dx\, d\xi, \qquad u \in C^\infty_0(X),   (3.19)

for \psi \in \mathcal{S}(\mathbb{R}^N) and \psi(0) = 1. If a and \phi depend on a parameter t in a continuous manner, then, using the above characterization as a limit when \varepsilon \to 0, the integral is also continuous in that parameter:

t \mapsto I_\phi(au, t) = \int_{X\times\mathbb{R}^N} e^{i\phi(x,\xi,t)}\, a(x,\xi,t)\, u(x)\, dx\, d\xi

is continuous. We can thus differentiate with respect to these parameters under the integral sign.
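The mechanism behind (3.18)-(3.19) — integrations by parts convert oscillation into decay whenever the phase has no critical point on the support of the amplitude — can be observed numerically. The sketch below (an illustration, not part of the text) uses the amplitude a(x) = e^{-x^2} and phase x, for which the oscillatory integral is known in closed form, I(\lambda) = \sqrt{\pi}\, e^{-\lambda^2/4}, and indeed decays faster than any power of \lambda.

```python
import numpy as np

# Non-stationary phase: for amplitude exp(-x^2) and phase x (no critical
# point on the support of the amplitude), I(lam) = sqrt(pi)*exp(-lam^2/4),
# i.e. the oscillatory integral decays faster than any power of lam.
def I(lam, n=200_001, L=10.0):
    x = np.linspace(-L, L, n)
    return np.sum(np.exp(1j * lam * x - x ** 2)) * (x[1] - x[0])

for lam in (2.0, 4.0, 8.0):
    print(lam, abs(I(lam)), np.sqrt(np.pi) * np.exp(-lam ** 2 / 4))
```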
3.2.2 Parameterized oscillatory integrals

This construction allows us to introduce the following parameterized oscillatory integrals. We denote by X_\phi the open subset of X such that \phi(x,\xi) has no critical point \xi \ne 0. This means that

|\phi'_\xi|^2(x,\xi) > 0, \qquad x \in X_\phi,\ \xi \ne 0.

This is more constraining than the previous constraint in the variables (x,\xi), since we have fewer partial derivatives available to be non-vanishing. Seeing x as a parameter, we can then define the oscillatory integral as

I_\phi(a,u) = \int_X A(x)\, u(x)\, dx, \quad u \in C^\infty_0(X_\phi), \qquad A(x) = \int_{\mathbb{R}^N} e^{i\phi(x,\xi)}\, a(x,\xi)\, d\xi, \quad x \in X_\phi.   (3.20)

Then A(x) is continuous and even a function in C^\infty(X_\phi).

Let us see some consequences for the singularities of the distribution A. The same integrations by parts as above show that for A considered as a distribution, we have

\mathrm{sing\,supp}\, A \subset \{x \in X;\ \phi'_\xi(x,\xi) = 0\ \text{for some}\ \xi \ne 0\}.

Exercise 3.2.5 Check this.
More precisely, we have that for a symbol a vanishing in some conic neighborhood of the set

C = \{(x,\xi),\ x \in X,\ \xi \in \mathbb{R}^N\setminus\{0\},\ \phi'_\xi(x,\xi) = 0\},

the distribution A(x) defined by u \mapsto I_\phi(au) is a C^\infty function.

By a conic neighborhood of \xi_0, we mean a set of vectors \xi such that \xi belongs to the neighborhood when |\hat\xi - \hat\xi_0| < \varepsilon for some \varepsilon > 0. Here and below, we denote by \hat\xi = \frac{\xi}{|\xi|} the direction of \xi.
Exercise 3.2.6 Check this by integrations by parts.
This result on parameterized oscillatory integrals is useful, but it cannot be applied directly to operators. For this, we need to split the variables x into the set of variables (x,y), where x denotes the parameters in which the operator is defined and y denotes the variables of integration of the function on which the operator acts. This yields the notion of Fourier Integral Operators we now consider.
3.2.3 Definition of Fourier Integral Operators.

Consider X\times Y \subset \mathbb{R}^{n_X}\times\mathbb{R}^{n_Y} and \xi \in \mathbb{R}^N. The phase \phi(x,y,\xi) is positively homogeneous of degree 1 in \xi. For a symbol a \in S^m, we consider the operator

Au(x) = \int_{Y\times\mathbb{R}^N} e^{i\phi(x,y,\xi)}\, a(x,y,\xi)\, u(y)\, dy\, d\xi, \qquad u \in C^\infty_0(Y),\ x \in X.   (3.21)

If \phi does not have any critical point as a function of all variables (x,y,\xi), then we have seen that the integral is well defined as a distribution. This can be verified by multiplying the above equation by v(x) \in C^\infty_0(X) and integrating over X.

If for each fixed x, \phi(x,y,\xi) does not have critical points as a function of (y,\xi), then (3.21) is well defined. Moreover, A is a continuous map from C^k_0(Y) to C^j(X) if

m + N + j < k,

using the terminology of the previous sections.
Exercise 3.2.7 Check this.
The same occurs for the adjoint operator A^* by exchanging the roles of x and y.

Let R_\phi be the open set of points (x,y) such that \phi'_\xi \ne 0 for \xi \ne 0. Then

K_A(x,y) = \int_{\mathbb{R}^N} e^{i\phi(x,y,\xi)}\, a(x,y,\xi)\, d\xi, \qquad (x,y) \in R_\phi,

is a Schwartz kernel for A and defines a function in C^\infty(R_\phi). If R_\phi = X\times Y, then K_A defines a map A continuous from \mathcal{E}'(Y) to C^\infty(X).
Example 1. Let us consider the phase function \phi(x,y,\xi) = (x-y)\cdot\xi. We obtain that R_\phi is the complement of the diagonal x = y.

Example 2. For the GRT with \phi(s,\theta,x,\sigma) = \sigma(s - s(x,\theta)), then R_\phi is the complement of the set s = s(x,\theta). In other words, a singularity at a point x can a priori generate singularities along all the points of the curve in the (s,\theta) plane given by s = s(x,\theta). (Note that, here, (s,\theta) plays the role of x, x plays the role of y, and \sigma plays the role of \xi.) For the RT, a singularity at (0,x_2) for instance generates the curve s = x_2\cos\theta. Similarly, for the adjoint operator, a singularity at (s,\theta) becomes a potential singularity along the line s = x\cdot\theta^\perp (a curve for the GRT). It looks as if a singularity at one point x propagates to singularities everywhere after application of R^*R. That this is not the case shows that the singular support is not sufficient to fully characterize the propagation of singularities: the propagation of singularities for the Radon transform requires the notion of singularities in the phase space, in other words, requires the introduction of the Wave Front Set, which will be done in a later section.
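The geometry of Example 2 is elementary to verify: with \theta^\perp = (-\sin\theta, \cos\theta), the curve traced in the (s,\theta) plane by a singularity at (0,x_2) is exactly s = x_2\cos\theta. A two-line check (added for illustration):

```python
import numpy as np

# s(x, theta) = x . theta_perp with theta_perp = (-sin(theta), cos(theta)):
# for x = (0, x2) this is exactly the sinusoid s = x2*cos(theta).
x1, x2 = 0.0, 0.7
theta = np.linspace(0.0, 2 * np.pi, 100)
s_curve = -np.sin(theta) * x1 + np.cos(theta) * x2
print(np.max(np.abs(s_curve - x2 * np.cos(theta))))   # 0 up to rounding
```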
Nonetheless, we have the following result. We call \phi an operator phase function if for each x (or y) it has no critical point in (y,\xi) (or (x,\xi)). Let C_\phi be the complement of R_\phi, i.e., the projection on X\times Y of

C = \{(x,y,\xi) \in X\times Y\times\mathbb{R}^N\setminus\{0\},\ \phi'_\xi(x,y,\xi) = 0\}.

In other words, (x,y) \in C_\phi if (and only if) there exists \xi \ne 0 such that (x,y,\xi) \in C. Then we have:

\mathrm{sing\,supp}\, Au \subset C_\phi \circ \mathrm{supp}\, u, \qquad u \in \mathcal{E}'(Y).

Here, we have defined

C_\phi \circ K = \{x,\ (x,y) \in C_\phi\ \text{for some}\ y \in K\}.

We recall that the support supp u of a function u is the closure of the set of points where the function does not vanish. This notion generalizes to distributions. A point is in the open complement of the support of a distribution if the distribution acting on any function supported in a sufficiently small vicinity of the point vanishes. The singular support sing supp is the subset of the support of a distribution or function where the latter is not a function of class C^\infty in the vicinity of a point in that subset.

In fact we can write u = v + w with v supported in the vicinity of the singular support of u and w smooth. This allows us to obtain the more refined result:

\mathrm{sing\,supp}\, Au \subset C_\phi \circ \mathrm{sing\,supp}\, u, \qquad u \in \mathcal{E}'(Y).   (3.22)
The latter result is sometimes satisfactory, sometimes not. For the phase function \phi = (x-y)\cdot\xi, we obtain that the singularities of Au are included in the set of singularities of u. In some sense, this is satisfactory as it states that singularities cannot propagate in the x variable.

For the phase \phi(s,\theta,x,\sigma) = \sigma(s - s(x,\theta)), however, the result is not satisfactory, as it implies that singularities at a point x can propagate to singularities at any point (s,\theta), and it therefore looks like singularities are spread very wildly by the Radon transform. The notion of wave front sets will refine this statement and show that there is in fact a one-to-one (in fact one-to-two) correspondence between properly defined singularities before and after application of the Radon transform.

3.3 Pseudo-differential operators and GRT

We now come back to the operator Ff(x) appearing in the analysis of the GRT. Although the phase appears to be complicated, it is in fact essentially of the form \phi(x,y,\xi) = (x-y)\cdot\xi after an appropriate change of variables, as we now show.
3.3.1 Absence of singularities away from the diagonal x = y

We start with a first step showing that singularities of f(x) at a point x_0 do not propagate to singularities of Ff(x) at any other point than x_0 under reasonable assumptions on the phase.

Let us assume that

\phi'_\xi(x,y,\xi) = 0 \quad\text{implies that}\quad x = y.   (3.23)

Exercise 3.3.1 Show that the above constraint is satisfied for the Radon transform. Show that this constraint is still satisfied for curves sufficiently close to the straight lines and for weights w sufficiently close to 1.

Let then \chi_0(x,y) = 1 when x = y, supported in the vicinity of x = y in the sense that \chi_0(x,y) = 0 when |x-y| > \varepsilon, for \varepsilon to be chosen arbitrarily small. Also we assume that \chi_0 \in C^\infty(X\times Y). Let then

Ff(x) = F_0f(x) + F_1f(x), \qquad F_0f(x) = \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i\phi(x,y,\xi)}\, a(x,y,\xi)\,\chi_0(x,y)\, dy\, d\xi.   (3.24)
is a regularizing operator. This is a repeat of what
we saw in section 3.2.3 but we shall do the derivation in detail nonetheless. We recall
that () is a smooth function with compact support in R
2
. Since
t

(x, y, ) ,= 0 is
homogeneous of degree 0, it is bounded from below by a positive constant uniformly in
and uniformly in x and y for compact domains X and Y on the support of 1
0
(x, y).
We then dene
b
j
(x, y, ) =
i(1 )

[
t

[
2
,
which is homogeneous of degree 0 and bounded. We verify that
L
t
e
i
:=
_
b
j

j
+
_
e
i
= e
i
where L
t
is a rst-order dierential operator adjoint to the rst-order dierential oper-
ator L (as we have done before) so that
F
1
f(x) =
_
R
2
R
2
f(y)e
i(x,y,)
L
k
(a(1
0
))(x, y, )dyd, k N.
Since a is bounded and smooth in \xi (in fact homogeneous of degree 0), \chi is compactly supported, and the b_j are homogeneous of degree 0, we verify that L^k a is bounded by |\xi|^{-k}. Thus for k \ge 3, the above integral is absolutely convergent in dimension n = 2, since |\xi|^{-3}\,|\xi|\, d|\xi| is integrable on (1,\infty). Now, we can differentiate the above expression with respect to x. Each differentiation brings a contribution \phi'_x homogeneous of degree 1 in \xi. If we differentiate j times and choose k = j + 3, then the integral is absolutely convergent in \xi again. With j = m, we observe that F_1 maps L^2(Y) to H^m(X) for all m \in \mathbb{N} and hence is certainly compact by Sobolev embedding. These are very similar calculations to those of Exercise 3.2.7.
This calculation shows that the singular support of Ff(x) is included in the support of f, and more precisely in the singular support of f, by writing f = f_1 + f_2 with f_2 smooth and f_1 supported in an arbitrarily small neighborhood of the singular support of f. Indeed, by sending j \to \infty and \varepsilon \to 0, we observe that Ff(x) is of class C^\infty away from the support of f. Thus, as in (3.22),

\mathrm{sing\,supp}\, Ff \subset \mathrm{sing\,supp}\, f.   (3.25)

This is a satisfactory result. It states that the singularities of Ff are at the same place as those of f. As we shall see in a later section, the application of R_J to f maps the singularities of f to singularities in the variables (s,\theta). The above result states that applying R^*_K to R_Jf maps the singularities of R_Jf back to the location of the singularities of f. Note that the above result describes an inclusion, not an equality. When applying F to f, some singularities may be lost. It thus remains to show that all the singularities of f are captured by F and that F is invertible in some sense.
3.3.2 Change of variables and phase (x-y)\cdot\xi

Let us now return to the analysis of F_0. We want to show that the latter operator is a pseudo-differential operator, namely that after an appropriate change of variables, the phase \phi(x,y,\xi) can in fact be recast as (x-y)\cdot\xi. Let us consider the identity

s(x,\hat\xi) - s(y,\hat\xi) = (x-y)\cdot\nabla_x s(z(x,y,\hat\xi),\hat\xi) = \sum_j \frac{\partial s}{\partial x_j}(z(x,y,\hat\xi),\hat\xi)\,(x_j - y_j),

for some smooth function z(x,y,\hat\xi). Then we define the change of variables \xi \mapsto \zeta(\xi) at fixed y and x:

\zeta = \nabla_x s(z(x,y,\hat\xi),\hat\xi)\,|\xi| := h(\hat\xi; x,y)\,|\xi| := \zeta(\xi).   (3.26)
We need this change of variables to be well defined in the vicinity of x = y, since the operator F has been replaced by F_0, whose kernel is concentrated in the vicinity of x = y. We verify that

L(x,y,\xi) := \Big|\frac{d\zeta}{d\xi}\Big|(x,y,\xi) = \det(h, h')(x,y,\hat\xi).   (3.27)

Here, h' is the derivative of the vector h in the variable \alpha, where \hat\xi = (\cos\alpha, \sin\alpha).
Exercise 3.3.2 Check these calculations in detail and show that the determinant is indeed homogeneous of degree 0 in \xi.

We assume that |L(x,y,\xi)| \ge c_0 > 0 is uniformly positive for (x,y) \in X\times Y compact domains and |x-y| < \varepsilon. For \varepsilon sufficiently small, it is thus sufficient to assume that

\Big|\frac{d\zeta}{d\xi}\Big|(x,x,\xi) = \det(h, h')(x,x,\hat\xi) \ge 2c_0 > 0.   (3.28)
This is a local property of invertibility. However, we need to assume that \zeta = \zeta(\xi) is a global change of variables. In other words, we assume that \zeta is a global diffeomorphism from \mathbb{R}^2 to \mathbb{R}^2 with inverse \zeta^{-1} for fixed values of x and y. Note that since \zeta is homogeneous of degree 1, then \zeta^{-1} is also homogeneous of degree 1. As a consequence, \xi = |\zeta|\,\zeta^{-1}(\hat\zeta) and

\hat\xi = \frac{\zeta^{-1}(\hat\zeta)}{|\zeta^{-1}(\hat\zeta)|} := \hat\xi(x,y,\hat\zeta).

Exercise 3.3.3 Check that \zeta(\xi) = \xi^\perp and that L(x,y,\xi) = 1 when \gamma(t,s,\theta) = s\theta^\perp + t\theta.
With this we recast the operator F_0 as

F_0f(x) = \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i(x-y)\cdot\zeta}\, M(x,y,\zeta)\, dy\, d\zeta,   (3.29)

with

M(x,y,\zeta) = \frac{1}{\pi}\, K\big(x,\hat\xi(x,y,\hat\zeta)\big)\, J\big(y,\hat\xi(x,y,\hat\zeta)\big)\, L^{-1}\big(x,y,\xi(\zeta)\big)\,\chi_0(x,y).

Note that M(x,y,\zeta) is a symbol of class S^0(X\times Y\times\mathbb{R}^N) for N = n = 2.
Operators of the form

Pf(x) = \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, a(x,y,\xi)\, dy\, d\xi,   (3.30)

with a \in S^0(X\times Y\times\mathbb{R}^n) are called pseudo-differential operators (\psiDOs) of order zero. When a = a(\xi) is a polynomial, then these are differential operators. We have therefore obtained that F can be written as the sum of a \psiDO and a smoothing (compact) operator.
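A special case of the L^2 boundedness of order-zero \psiDOs (proved in section 3.3.5 below in general) is transparent numerically: a symbol a(\xi) of modulus at most one acts, for x-independent amplitudes, as a Fourier multiplier with operator norm at most one. The sketch below uses a(\xi) = \mathrm{sign}(\xi) on a one-dimensional periodic grid (a hypothetical choice made purely for illustration):

```python
import numpy as np

# An order-zero multiplier a(xi) = sign(xi) has modulus <= 1, so by Parseval
# the corresponding operator has L2 norm <= 1.
rng = np.random.default_rng(1)
n = 256
xi = np.fft.fftfreq(n) * n                 # integer frequency grid
a = np.where(xi == 0, 0.0, np.sign(xi))    # bounded, degree-0 homogeneous

for _ in range(5):
    f = rng.standard_normal(n)
    Pf = np.fft.ifft(a * np.fft.fft(f))
    print(np.linalg.norm(Pf) / np.linalg.norm(f))   # always <= 1
```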
3.3.3 Choice of a parametrix.

In the above construction, J is given by the problem of interest while K = K(x,\theta) is a kernel that we can choose. One way to address the inversion of R_J is to choose K(x,\theta) so that the above operator is as close to the identity operator as possible. We have already shown that all singularities of the kernel of the above oscillatory integral are located on the diagonal x = y. We thus define

K(x,\hat\xi) = \frac{\pi}{(2\pi)^2}\,\frac{L(x,x,\hat\xi)}{J(x,\hat\xi)}.   (3.31)

With this, we find that M(x,x,\hat\zeta) = (2\pi)^{-2}. In other words, we have

F_0f(x) = f(x) - T_0f(x),
T_0f(x) := \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i(x-y)\cdot\zeta}\, \big(M(x,x,\zeta) - M(x,y,\zeta)\big)\, dy\, d\zeta.   (3.32)
We will prove in detail that T_0 is a compact operator. This will show that R^*_K\Lambda is an inverse of R_J up to a remainder that is a compact operator, in other words that R^*_K\Lambda R_J = I - T where T is a compact operator, since we already know that F_1 is a compact operator. In other words, R^*_K\Lambda is a left-parametrix for the generalized Radon transform R_J.
3.3.4 Proof of smoothing by one derivative

As an operator from functions on a bounded domain Y to functions on a bounded domain X, T_0 is a compact operator in L^2, and in fact an operator mapping L^2(Y) to H^1(X) of these respective domains. This will be proved by showing that T_0 and \partial_{x_j}T_0 for j = 1,2 are bounded operators from L^2(Y) to L^2(X).

We consider the operators \partial_{x_j}T_0 first and calculate:
\partial_{x_j}T_0f(x) = \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i(x-y)\cdot\zeta}\, i\zeta_j\,(x-y)\cdot\nabla M\big(x, \rho(x,y,\hat\zeta), \hat\zeta\big)\, dy\, d\zeta + T_{00}f(x),

for some operator T_{00}f(x) with a symbol in S^0(X\times Y\times\mathbb{R}^2) (which will then be bounded from L^2 to L^2 as we shall see in the next section; check the details as an exercise) and for some smooth function \rho(x,y,\hat\zeta) such that the components of \nabla M(x, \rho(x,y,\hat\zeta), \hat\zeta) belong to S^0(X\times Y\times\mathbb{R}^N).
Let $\chi(\xi)$ still be the compactly supported, smooth function in $\mathbb{R}^2$ such that $\chi(0)=1$.
The above integrand is multiplied by $\chi + (1-\chi)$. The $\chi$ contribution clearly generates
a smooth, bounded, contribution. The $(1-\chi)$ contribution may be recast as a sum of
terms of the form
$$ \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i(x-y)\cdot\xi}\, \xi_j (x-y)_k\, M_k(x,y,\xi)\, dy\, d\xi
 = \int_{\mathbb{R}^2\times\mathbb{R}^2} f(y)\, e^{i(x-y)\cdot\xi}\,
 i\,\partial_{\xi_k}\big(\xi_j M_k(x,y,\xi)\big)\, dy\, d\xi, $$
for some smooth and bounded functions $M_k(x,y,\xi)$, which are in fact homogeneous of
degree $0$ for $\xi$ outside the support of $\chi(\xi)$. Now we observe that $\partial_{\xi_k}\big(\xi_j M_k(x,y,\xi)\big) =
a_{jk}(x,y,\xi)$ is a symbol in $S^0(X\times Y\times\mathbb{R}^2)$ and is in fact homogeneous of degree $0$ for $\xi$
outside the support of $\chi(\xi)$. Indeed, we have in polar coordinates in two dimensions
the following explicit expression:
$$ \nabla_\xi M(\hat\xi) = \frac{\partial M}{\partial|\xi|}(\hat\xi)\,\hat e_{|\xi|}
 + \frac{1}{|\xi|}\frac{\partial M}{\partial\vartheta}(\hat\xi)\,\hat e_{\vartheta}
 = \frac{1}{|\xi|}\frac{\partial M}{\partial\vartheta}(\hat\xi)\,\hat e_{\vartheta}. $$
So we are faced with showing that an oscillatory operator of the form (3.30) with a
bounded amplitude $a(x,y,\xi)$, homogeneous of degree $0$ for $\xi$ large, is bounded from
$L^2(Y)$ to $L^2(X)$.
3.3.5 Boundedness of ΨDOs of order 0 in the $L^2$ sense
Consider the operator
$$ Pf(x) = \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, a(x,y,\xi)\, dy\, d\xi, \qquad (3.33) $$
with $a(x,y,\xi)\in S^0(X\times Y\times\mathbb{R}^n)$. Here, we are in the simplified setting where $a$ is a
symbol of order $0$ that also turns out to be homogeneous of degree $0$. Since the integral
over any ball in $\xi$ generates a clearly bounded contribution, it is in fact sufficient that
$a(x,y,\xi)$ be homogeneous of degree $0$ for $|\xi|$ sufficiently large.
Lemma 3.3.1 Let $X$ and $Y$ be compact domains in $\mathbb{R}^n$. Then the operator $P$ in (3.33)
is a bounded operator from $L^2(Y)$ to $L^2(X)$, with a constant $C$ independent of $f\in L^2(Y)$
such that
$$ \|Pf\|_{L^2(X)} \le C\, \|f\|_{L^2(Y)}. \qquad (3.34) $$
Proof. We first need to separate low frequencies from high frequencies to handle the
lack of regularity of the amplitude at $\xi=0$. Let $\chi(\xi)$ be smooth, compactly supported
and so that $\chi(0)=1$, and define
$$ P = P_0 + P_1, \qquad
P_0 f(x) = \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, (1-\chi(\xi))\, a(x,y,\xi)\, dy\, d\xi, $$
with $P_1$ clearly bounded from $L^2(Y)$ to $L^2(X)$. Define $a_\chi(x,y,\xi) = (1-\chi(\xi))\, a(x,y,\xi)$,
noting that for $|\xi|$ sufficiently large, $a_\chi$ is a function of $\hat\xi$ since $1-\chi = 1$. For $\xi$
sufficiently small, $a_\chi$ vanishes.
By Taylor expansion, we write
$$ a_\chi(x,y,\xi) = \sum_{|\alpha|<k} (y-x)^\alpha\, a_\alpha(x,x,\xi)
 + \sum_{|\alpha|=k} (y-x)^\alpha\, a_\alpha(x,y,\xi). $$
Note that again all functions $a_\alpha$ are functions of $\hat\xi$ for $|\xi|$ sufficiently large and vanish
for $\xi$ sufficiently small. Now from
$$ \partial_\xi^\alpha\, e^{i(x-y)\cdot\xi} = i^{|\alpha|}\, (x-y)^\alpha\, e^{i(x-y)\cdot\xi}, $$
we deduce that
$$ \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, (y-x)^\alpha\, a_\alpha(x,y,\xi)\, dy\, d\xi
 = \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, (-i)^{|\alpha|}\,
 \partial_\xi^\alpha a_\alpha(x,y,\xi)\, dy\, d\xi. $$
For $|\alpha|<k$, the above amplitude is a function of $(x,x)$. With these expressions, we find
that
$$ P_0 f(x) = \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, P_k(x,\xi)\, d\xi\, dy
 + \int_{Y\times\mathbb{R}^n} f(y)\, e^{i(x-y)\cdot\xi}\, R_k(x,y,\xi)\, d\xi\, dy, $$
where $P_k(x,\xi)$ is a smooth bounded function in both variables, and where $R_k(x,y,\xi)$ is
a smooth function bounded by $|\xi|^{-k}$ and vanishing for $\xi$ small.
Exercise 3.3.4 Check this. Hint: differentiate $k$ times functions that depend only on $\hat\xi$.
Upon choosing $k=n+1$, we find that the integral in $\xi$ involving $R_k$ is absolutely
convergent. That expression therefore clearly has $L^2$ norm bounded by that of $f$ by
an application of the Cauchy-Schwarz inequality. It remains to address the middle
term involving $P_k(x,\xi)$. That this operator is bounded in $\mathcal{L}(L^2)$ is not completely
straightforward.
After extending $f$ by $0$ on $\mathbb{R}^n\setminus Y$, we recast that term as
$$ P_k f(x) = \int_{\mathbb{R}^n} \hat f(\xi)\, e^{ix\cdot\xi}\, P_k(x,\xi)\, d\xi. $$
We write for the two-dimensional case $n=2$:
$$ P_k(x,\xi) = P_k(0,\xi)
 + \int_0^{x_1} \partial_1 P_k(t_1,0,\xi)\, dt_1
 + \int_0^{x_2} \partial_2 P_k(0,t_2,\xi)\, dt_2
 + \int_0^{x_1}\!\!\int_0^{x_2} \partial^2_{12} P_k(t_1,t_2,\xi)\, dt_1\, dt_2. $$
Exercise 3.3.5 Write the corresponding expansion for arbitrary dimension $n\ge 2$.
Let us define
$$ P_{k1} f(x) = \int_{\mathbb{R}^n} \hat f(\xi)\, e^{ix\cdot\xi}
 \Big( \int_0^{x_1} \partial_1 P_k(t_1,0,\xi)\, dt_1 \Big)\, d\xi. $$
We observe by Fubini, which holds for oscillatory integrals, that
$$ P_{k1} f(x) = \int_0^{x_1} \Big( \int_{\mathbb{R}^n} \hat f(\xi)\, e^{ix\cdot\xi}\,
 \partial_1 P_k(t_1,0,\xi)\, d\xi \Big)\, dt_1. $$
But then by Minkowski's integral inequality, we have
$$ \|P_{k1} f\|_{L^2(X)} \le \int_0^{x_1}
 \Big\| \int_{\mathbb{R}^n} \hat f(\xi)\, e^{ix\cdot\xi}\, \partial_1 P_k(t_1,0,\xi)\, d\xi
 \Big\|_{L^2(X)}\, dt_1. $$
By using the Plancherel identity and dealing with the other contributions in a similar
manner, we obtain that
$$ \|P_k f\|_{L^2(X)} \le C \Big\| \int_{\mathbb{R}^n} e^{ix\cdot\xi}\, \hat f(\xi)\, Q(\xi)\, d\xi \Big\|_{L^2(X)}
 \le C\, \| \hat f(\xi)\, Q(\xi) \|_{L^2(\mathbb{R}^n)}, $$
where $Q(\xi)$ is the maximum over $t$ of the derivatives $\partial^\alpha P_k(t,\xi)$ for $|\alpha|\le n$ and $t\in X$.
This is a bounded quantity. Moreover, $Q(\xi)$ is bounded since the symbol $P_k(x,\xi)$ is
bounded in $\xi$. This shows the result.
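The Minkowski integral inequality used in the proof, $\big\|\int g(t,\cdot)\,dt\big\|_{L^2} \le \int \|g(t,\cdot)\|_{L^2}\,dt$, can be sanity-checked numerically on a discretized integral (my own discretization, not from the text):

```python
import numpy as np

# Discretized Minkowski integral inequality: the L^2 norm of the integral
# in t is bounded by the integral of the L^2 norms (triangle inequality
# applied to the Riemann sum).
rng = np.random.default_rng(1)
g = rng.standard_normal((50, 200))   # axis 0: t samples, axis 1: x samples
dt = 1.0 / g.shape[0]
lhs = np.linalg.norm(g.sum(axis=0) * dt)           # || int g(t, .) dt ||
rhs = (np.linalg.norm(g, axis=1) * dt).sum()       # int || g(t, .) || dt
print(lhs <= rhs + 1e-12)  # True
```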
Coming back to the generalized Radon transform, we have shown that $T_0$ is
bounded from $L^2(Y)$ to $L^2(X)$ and that each partial derivative in $x$ composed with
$T_0$ forms an operator that is bounded from $L^2(Y)$ to $L^2(X)$. This shows that $T_0$ is
a bounded operator from $L^2(Y)$ to $H^1(X)$. This also shows that $F$ is bounded from
$L^2(Y)$ to $L^2(X)$. We have thus obtained that
$$ \|T_0 f\|_{H^1(X)} + \|F f\|_{L^2(X)} \le C\, \|f\|_{L^2(Y)}. \qquad (3.35) $$
3.3.6 Injectivity and implicit inversion formula.
The above result states that $F = I - T$ with $T$ compact from $L^2(X)$ to $L^2(X)$, since the
injection $i: H^1(X)\to L^2(X)$ is compact. The Fredholm alternative thus says that $F$ is
invertible if it is injective. It is not known how to prove injectivity of $F$ in general. All we
know is that $F$ is injective if $1$ is not an eigenvalue of $T$. Also, even if $1$ is an eigenvalue
of $T$, the operator $F$ can be inverted on the complement of a finite dimensional linear
space. But in general, we do not know how to prove that $F$ is invertible. Of course,
we know that $T=0$ for the Radon transform. Therefore, when the curve $z(t,s,\theta)$ is close to
the straight line $s\theta^\perp + t\theta$, then by continuity, $T$ is of norm less than $1$ (we then do not
need $T$ to be compact) and then $F^{-1} = \sum_{k=0}^{\infty} T^k$.
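The Neumann series inversion can be sketched numerically (with an assumed finite-dimensional stand-in for $T$, not the book's operator): when $\|T\|<1$, the iteration $f_{m+1} = g + T f_m$ sums the series $\sum_k T^k g$ and converges to the solution of $(I-T)f = g$.

```python
import numpy as np

# Neumann series: if ||T|| < 1, then (I - T)^{-1} = sum_k T^k, summed here
# via the fixed-point iteration f_{m+1} = g + T f_m.
rng = np.random.default_rng(2)
n = 40
T = rng.standard_normal((n, n))
T *= 0.5 / np.linalg.norm(T, 2)      # rescale so the spectral norm is 0.5
g = rng.standard_normal(n)

f = np.zeros(n)
for _ in range(200):                 # partial sums sum_{k<m} T^k g
    f = g + T @ f

f_direct = np.linalg.solve(np.eye(n) - T, g)
print(np.allclose(f, f_direct))      # True: the series converges to F^{-1} g
```

The same iteration applies verbatim when $T$ is only available as a matrix-free routine.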
Something else can be done, however, when we can prove (typically by another
method) that $R_J$ is injective. As the counter-example of Jan Boman shows, $R_J$ is not
always injective. As we have mentioned already in the last chapter, it is typically very
difficult to prove that a given transform is injective. We were able to do so for the
attenuated Radon transform in Chapter 2. In the next section, we prove an injectivity
result for $R_J$ by using a technique of energy estimates developed by Mukhometov.
Let us therefore assume that $R_J$ is injective and define the normal operator
$$ Nf(x) = R_J^*\, \Lambda\, R_J f(x), \qquad
 \Lambda = \Big(-\frac{\partial^2}{\partial s^2}\Big)^{\frac12}, \qquad (3.36) $$
as an operator from $L^2(X)$ to $L^2(X)$. Since $\Lambda^{\frac12}$ is invertible and self-adjoint, then
$N = A^*A$ with $A = \Lambda^{\frac12} R_J$ is also injective and self-adjoint for the usual inner product
on $L^2(X)$. Indeed, if $A$ is injective, then $A^*Au = 0$ implies that
$$ (A^*Au, u) = (Au, Au) = 0, $$
so that $u=0$.
We want to show that $N$ is not only injective but in fact invertible in the $L^2$ sense.
Note that $N$ is no longer of the form $I-T$ with $T$ compact and that we therefore again
need to work a bit to get such an invertibility statement. As a first step, using the
micro-local techniques of the preceding section, we show that
$$ QN = I - T, $$
for $T$ compact and $Q$ a parametrix of order $0$. The construction of $Q$ goes as follows.
We write
$$ N = N_0 + T_1, $$
where $T_1$ is compact and $N_0$ is given by
$$ N_0 f(x) = \int_{\mathbb{R}^{2n}} e^{i(x-y)\cdot\xi}\, a(y,\xi)\, f(y)\, \frac{1}{(2\pi)^n}\, dy\, d\xi. $$
Exercise 3.3.6 Check that $T_1$ is indeed a compact operator. Hint: use the same proof
showing that $T_0$ in (3.32) is compact.
Here $a(y,\xi)$ is a symbol of order $0$ uniformly bounded from below by a positive constant
by construction of $N$.
Exercise 3.3.7 Give the explicit expression satisfied by $a(y,\xi)$ and show that $a(y,\xi)$ is
indeed uniformly bounded from below by a positive constant.
Define $Q$ as the operator
$$ Qf(x) = \int_{\mathbb{R}^{2n}} e^{i(x-y)\cdot\xi}\, \frac{1}{a(x,\xi)}\, f(y)\, \frac{1}{(2\pi)^n}\, dy\, d\xi. \qquad (3.37) $$
Then we find after some cancellations that
$$ QN_0 f(x) = \int_{\mathbb{R}^{2n}} e^{i(x-y)\cdot\xi}\, \frac{a(y,\xi)}{a(x,\xi)}\, f(y)\,
 \frac{1}{(2\pi)^n}\, dy\, d\xi. $$
Exercise 3.3.8 Check the cancellations leading to the preceding calculation. Note that
the amplitude depends on $y$ in the definition of $N_0$ and depends on $x$ in the definition
of $Q$. It is the only combination for which the product $QN_0$ admits a nice compact
expression as given above.
How would you define $Q$ if you wanted an approximate right inverse instead, i.e., an
operator $Q$ such that $N_0 Q = I - T$ for $T$ compact?
This shows that
$$ T_2 f(x) = (QN_0 - I) f(x)
 = \int_{\mathbb{R}^{2n}} e^{i(x-y)\cdot\xi}\, \frac{a(y,\xi) - a(x,\xi)}{a(x,\xi)}\,
 f(y)\, \frac{1}{(2\pi)^n}\, dy\, d\xi $$
is a compact operator from $L^2(Y)$ to $H^1(X)$ for the same reasons that $T_0$ in (3.32) is
compact. This shows the existence of a parametrix $Q$ such that $QN = I - T$ with $T$
compact.
We have seen that such operators $Q$ are bounded from $L^2(X)$ to $L^2(X)$, and we therefore
deduce that
$$ \|f\| \le \|QNf\| + \|Tf\| \le C\,\|Nf\| + \|Tf\|. \qquad (3.38) $$
Since $N$ is injective, we can in fact show that there exists $C_0 > 0$ such that
$$ \|f\| \le C_0\, \|Nf\|. \qquad (3.39) $$
Proof. Indeed, assume that there is no such constant, so that we can construct $f_n$ such
that
$$ 1 = \|f_n\| = n\, \|N f_n\|. $$
Then (a subsequence of) $f_n$ converges weakly to some $f$ in $L^2$, with then $\|Nf_n\|\to 0$. But
then $(Nf_n, f) = (f_n, Nf) \to (f, Nf) = 0$, which implies that $f = 0$ since $N$ is self-adjoint
and injective. Since $T$ is compact, we obtain that $Tf_n$ converges to $Tf = 0$. Now (3.38)
implies that
$$ 1 = \|f_n\| \le C\,\|Nf_n\| + \|Tf_n\| \le \frac{C}{n}\,\|f_n\| + o(1) = o(1), $$
which is a contradiction. Here, $o(1)$ means a sequence of real numbers that converges
to $0$ as $n\to\infty$. This shows the existence of $C_0 > 0$ such that (3.39) holds.
This proves that $N$ is invertible with inverse $N^{-1}$ bounded by $C_0$ (indeed, the above
estimate proves that $N$ has closed range and trivial kernel, so that its range, which is
closed, is all of $L^2(X)$ by self-adjointness).
The inversion of the generalized ray transform may therefore be done as follows. We
first apply $R_J^*\Lambda$ to the data $R_J f(s,\theta)$ and then apply $N^{-1}$, the inverse of the normal
operator. The latter step can be done iteratively, for instance by a conjugate gradient
method.
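The last step can be sketched as follows (a generic conjugate gradient routine applied to a random symmetric positive definite matrix standing in for a discretization of $N$; the setup is my own, not from the text):

```python
import numpy as np

# Conjugate gradient for N x = b, using only applications of N. This is
# legitimate here because N is self-adjoint, injective and invertible.
def conjugate_gradient(apply_N, b, tol=1e-10, maxiter=500):
    x = np.zeros_like(b)
    r = b - apply_N(x)               # residual
    p = r.copy()                     # search direction
    rs = r @ r
    for _ in range(maxiter):
        Np = apply_N(p)
        alpha = rs / (p @ Np)
        x += alpha * p
        r -= alpha * Np
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 30))
N = A @ A.T + 30 * np.eye(30)        # SPD stand-in for the normal operator
b = rng.standard_normal(30)
x = conjugate_gradient(lambda v: N @ v, b)
print(np.allclose(N @ x, b))         # True
```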
3.4 Kinematic Inverse Source Problem
We now present an entirely different technique of energy estimates developed by
Mukhometov, showing that $R = R_J$ defined above is injective in the case where the
weight $w\equiv 1$ and the curves are parameterized so that $|\dot z| = 1$, i.e., curves are traveled
along with speed equal to $1$.
3.4.1 Transport equation
We consider a bounded domain $X\subset\mathbb{R}^2$ with smooth boundary $\partial X$ parameterized by
$0\le\tau\le T$ and points $x = S(\tau)$ with $S(0) = S(T)$ and $|\dot S(\tau)| = 1$.
For a point $x$ in $\bar X$ and $0\le\tau\le T$, we denote by $z(x,\tau)$ the unique curve joining $x$
and $S(\tau)$. For a function $f$ supported in $X$, we define the curve integrals
$$ g(\tau_1,\tau_2) = \int_{z(S(\tau_1),\tau_2)} f\, dt, \qquad (3.40) $$
where $dt = \sqrt{dx^2 + dy^2}$ is the Lebesgue distance measure along the curve. We thus
travel along the curve with speed equal to $1$.
We assume $g(\tau_1,\tau_2)$ known for all $0\le\tau_1,\tau_2\le T$, which corresponds to the curve
integrals of $f$ for all possible curves in the family passing through $X$. We then have the
following result:
Theorem 3.4.1 Under the above hypotheses for the family of curves, a function
$f\in C^2(X)$ is uniquely determined by its integrals $g(\tau_1,\tau_2)$ given by (3.40) along the
curves of the family. Moreover we have the stability estimate
$$ \|f\|_{L^2(X)} \le C\, \Big\| \frac{\partial g(\tau_1,\tau_2)}{\partial\tau_1} \Big\|_{L^2((0,T)\times(0,T))}. \qquad (3.41) $$
The rest of the section is devoted to the proof of this theorem. The proof of injectivity of
the reconstruction of $f$ from knowledge of $g$ is based on analyzing the following transport
equation. We introduce the function
$$ u(x,\tau) = \int_{z(x,\tau)} f\, dt \qquad (3.42) $$
for $x\in\bar X$. We denote by $\theta(x,\tau)$ the unit tangent vector to the curve $z(x,\tau)$ at $x$,
orientated such that
$$ \theta(x,\tau)\cdot\nabla u(x,\tau) = f(x), \qquad
 \theta(x,\tau) = \begin{pmatrix} \cos\varphi(x,\tau) \\ \sin\varphi(x,\tau) \end{pmatrix}. \qquad (3.43) $$
The latter relation is obtained by differentiating (3.42) with respect to arc length.
Exercise 3.4.1 Check this.
3.4.2 Variational form and energy estimates
We now differentiate the above with respect to $\tau$ and obtain
$$ \frac{\partial}{\partial\tau}\big( \theta\cdot\nabla u \big) = 0. \qquad (3.44) $$
We find that
$$ \frac{\partial\theta}{\partial\tau} = \varphi_\tau\, J\theta, \qquad
 J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
 \varphi_\tau := \frac{\partial\varphi}{\partial\tau}. $$
We calculate
$$ J\theta\cdot\nabla u\; \frac{\partial}{\partial\tau}\big(\theta\cdot\nabla u\big)
 = \varphi_\tau\, (J\theta\cdot\nabla u)^2 + J\theta\cdot\nabla u\;\theta\cdot\nabla u_\tau,
 \qquad u_\tau := \frac{\partial u}{\partial\tau}. $$
Similarly, we have
$$ \theta\cdot\nabla u\; \frac{\partial}{\partial\tau}\big(J\theta\cdot\nabla u\big)
 = -\varphi_\tau\, (\theta\cdot\nabla u)^2 + \theta\cdot\nabla u\; J\theta\cdot\nabla u_\tau. $$
Upon adding these two identities and using (3.44), we obtain
$$ \frac{\partial}{\partial\tau}\big( J\theta\cdot\nabla u\;\theta\cdot\nabla u \big)
 = -\varphi_\tau |\nabla u|^2
 + \theta\cdot\nabla u\; J\theta\cdot\nabla u_\tau - J\theta\cdot\nabla u\;\theta\cdot\nabla u_\tau
 = -\varphi_\tau |\nabla u|^2 + \nabla\cdot\big( u_\tau\, J\nabla u \big). $$
Indeed, denoting by $R = (\theta\,|\,J\theta)$ the rotation matrix and $T = (\nabla u\,|\,\nabla u_\tau) := (a\,|\,b)$, we
find that
$$ \theta\cdot a\; J\theta\cdot b - J\theta\cdot a\; \theta\cdot b
 = \det(R^t T) = \det(R)\det T = \det T = Ja\cdot b, $$
independent of $\theta$. This little miracle occurs in dimension $n=2$. For $a=\nabla u$ and
$b=\nabla u_\tau$, this gives $J\nabla u\cdot\nabla u_\tau = \nabla\cdot(u_\tau\, J\nabla u)$ since $\nabla\cdot(J\nabla u)=0$. It remains to integrate
over $X\times(0,T)$ and use the fact that $S(0)=S(T)$ on the surface of $X$ to obtain that
$$ \int_0^T\!\!\int_X \varphi_\tau\, |\nabla u|^2\, dx\, d\tau
 = -\int_0^T\!\!\int_{\partial X} \nabla u\cdot Jn(x)\; u_\tau(x,\tau)\, d\sigma(x)\, d\tau, \qquad (3.45) $$
where $n$ is the outward unit normal to $\partial X$ at $x\in\partial X$ and $d\sigma(x)$ the surface (length)
measure on $\partial X$. Now, $S(\tau')$ at the surface has tangent vector $\dot S(\tau')\, d\tau' = Jn(\tau')\, d\sigma(x)$,
assuming the parameterization $S(\tau)$ counter-clockwise. Since $u(S(\tau'),\tau) = g(\tau',\tau)$, we
find that $\nabla u\cdot\dot S(\tau') = \frac{\partial g}{\partial\tau'}(\tau',\tau)$ and $u_\tau(S(\tau'),\tau) = \frac{\partial g}{\partial\tau}(\tau',\tau)$, so that eventually,
$$ \int_0^T\!\!\int_X \varphi_\tau(x,\tau)\, |\nabla u|^2(x,\tau)\, dx\, d\tau
 = -\int_0^T\!\!\int_0^T \frac{\partial g}{\partial\tau_1}(\tau_1,\tau_2)\,
 \frac{\partial g}{\partial\tau_2}(\tau_1,\tau_2)\, d\tau_1\, d\tau_2. \qquad (3.46) $$
From the definition of $\theta$ and $\theta_\tau$, we observe that
$$ \det(\theta\,|\,\theta_\tau) = \varphi_\tau\, \det(\theta\,|\,J\theta) = \varphi_\tau. $$
The assumption we make on the family of curves is such that the vector $\theta_\tau$ cannot
vanish and cannot be parallel to $\theta$. With the choice of orientation of $S$, we find that
$$ \varphi_\tau > 0. \qquad (3.47) $$
Note that this is a non-local assumption on the curves. It states that the curves passing
by a point $x$ separate sufficiently rapidly, in the sense that $\varphi$ increases sufficiently rapidly
as the boundary parameter $\tau$ increases.
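The determinant identity behind the "little miracle" above is easy to verify numerically (an illustration, not from the text): for any unit vector $\theta$ and any vectors $a,b$ in $\mathbb{R}^2$, $\theta\cdot a\; J\theta\cdot b - J\theta\cdot a\;\theta\cdot b = \det(a\,|\,b) = Ja\cdot b$, independently of $\theta$.

```python
import numpy as np

# Check: theta.a (J theta).b - (J theta).a theta.b = det(a|b) = Ja.b
# for every angle phi, with J the rotation by +90 degrees.
J = np.array([[0.0, -1.0], [1.0, 0.0]])
rng = np.random.default_rng(4)
a, b = rng.standard_normal(2), rng.standard_normal(2)
det_ab = a[0] * b[1] - a[1] * b[0]
for phi in np.linspace(0.0, 2 * np.pi, 7):
    theta = np.array([np.cos(phi), np.sin(phi)])
    lhs = (theta @ a) * (J @ theta @ b) - (J @ theta @ a) * (theta @ b)
    assert np.isclose(lhs, det_ab)   # independent of theta
print(np.isclose(J @ a @ b, det_ab)) # True: Ja.b is exactly det(a|b)
```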
3.4.3 Injectivity result
Since $|f(x)| = |\theta\cdot\nabla u(x,\tau)| \le |\nabla u(x,\tau)|$ from the definition of the transport equation,
and since $\varphi_\tau$ integrates to $2\pi$ in $\tau$, we find that
$$ 2\pi \int_X |f(x)|^2\, dx = \int_0^T\!\!\int_X \varphi_\tau\, |f(x)|^2\, dx\, d\tau
 \le -\int_0^T\!\!\int_0^T \frac{\partial g}{\partial\tau_1}(\tau_1,\tau_2)\,
 \frac{\partial g}{\partial\tau_2}(\tau_1,\tau_2)\, d\tau_1\, d\tau_2. \qquad (3.48) $$
Since $g(\tau,\tau') = g(\tau',\tau)$, this shows that
$$ \|f\|_{L^2(X)} \le \frac{1}{\sqrt{2\pi}}
 \Big\| \frac{\partial g}{\partial\tau_1} \Big\|_{L^2((0,T)\times(0,T))}. \qquad (3.49) $$
This concludes the proof of Theorem 3.4.1.
When two measurements $g$ are equal, so that their difference and hence the difference
of their differentials vanishes, then the difference of sources $f$ also vanishes. This provides
the injectivity of the transform $Rf(s,\theta)$ for $f$ supported on a compact domain. Indeed,
if $g = 0$, then $\partial g/\partial\tau_1 = 0$ and hence $f = 0$. This gives the injectivity. However, the proof
of injectivity is not a proof of invertibility, as we have for the normal operator $N$, and is
not as constructive as the result obtained for $F = I - T$.
3.4.4 Summary on GRT.
What have we done so far? We have defined a generalized ray transform $R_J f(s,\theta)$. We
have then quickly brought our functions back into the space of positions by applying a
rescaled adjoint operator $R_K^*$. This led to the definition of the operators $F = R_K^* R_J$
and $N = R_J^*\Lambda R_J$. We have seen that by an appropriate choice of $K$, then $F = I - T$,
where $T$ is a compact operator mapping $L^2(X)$ to $H^1(X)$. However, we do not know
that $F$ is invertible in general, although by continuity we know that it is so when the
curves are close to the straight lines and the weight $w$ is close to $1$. Since $T$ is compact,
we know that the space of functions $\psi$ such that $T\psi = \psi$ is finite dimensional. But we do
not know whether it is trivial.
We have then changed gears slightly and have looked at the normal operator $N =
R_J^*\Lambda R_J$. Such an operator, like $F$, is a pseudo-differential operator. Moreover, it is
invertible up to compact perturbations in the sense that $QN = I - T$ for $T$ compact
and $Q$ another pseudo-differential operator of order $0$. Here again, we do not know that
$1$ is not an eigenvalue of $T$, nor that $Q$ is invertible. However, we have seen that in
some situations $R_J$, and hence $N$, is injective, by using a transport equation. This
allowed us to show that $N$ is in fact an invertible operator in $L^2(X)$. Once properly
discretized, the resulting equations may then be solved by the method of, e.g., conjugate
gradient.
This provides a reasonable theory for the reconstruction of functions from full data,
i.e., from knowledge of $R_J f(s,\theta)$ for all $(s,\theta)\in\mathbb{R}\times S^1$. In many practical problems,
such data are not available. It then remains a very difficult problem to prove injectivity
of the transform. In the case of the Radon transform, the Fourier slice theorem shows
that the Radon transform is injective as soon as an open set of values of $\theta$ is available
(and all values of $s$). This is because the Fourier transform of a compactly supported
function is an analytic function in the Fourier variable and that an analytic function
known on an arbitrarily small open set is known everywhere. For the generalized ray
transform, no such results are available. However, it is interesting to understand which
singularities of the function f(x) may be reconstructed from available measurements.
This requires that we understand how singularities propagate when we apply the Radon
transform and the adjoint of the Radon transform.
3.5 Propagation of singularities for the GRT.
3.5.1 Wave Front Set and Distributions.
As we have seen, the notion of singular support is not sufficient to describe the propa-
gation of singularities for operators that are not ΨDOs, such as for instance the Radon
transform or the generalized ray transform. The singular supports have to be extended
and refined to a phase space notion (the cotangent bundle). We then need a map from
cotangent bundle to cotangent bundle describing how singularities propagate.
For $u\in\mathcal{D}'(X)$ and $X\subset\mathbb{R}^n$, we define the Wave Front Set of $u$, denoted by $\mathrm{WF}(u)$,
as follows. We say that $(x_0,\xi_0)\notin\mathrm{WF}(u)$ if and only if there exists a function
$\varphi\in C_0^\infty(X)$ with $\varphi(x_0)\ne 0$ such that the Fourier transform $\widehat{\varphi u}(\xi)$ is rapidly
decreasing in a conic neighborhood of the half ray with direction $\xi_0$, i.e., for $\xi$ such that
$|\hat\xi - \hat\xi_0|$ is sufficiently small.
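The definition can be illustrated numerically (my own example, not from the text): for the half-plane indicator $f = 1_{x_1>0}$, which is singular across the edge $x_1=0$ with conormal direction $e_1$, the windowed Fourier transform decays rapidly in directions away from $\pm e_1$ and only slowly along them.

```python
import numpy as np

# Localize f near a point of the edge with a smooth cutoff phi, then
# compare the decay of |FT(phi f)| along the edge normal e1 and along e2.
n = 256
x = np.linspace(-1.0, 1.0, n)
x1, x2 = np.meshgrid(x, x, indexing="ij")
f = (x1 > 0).astype(float)                   # singular across x1 = 0
phi = np.exp(-(x1**2 + x2**2) / 0.05)        # cutoff near x0 = (0, 0)
F = np.abs(np.fft.fftshift(np.fft.fft2(phi * f)))
c = n // 2
k = 100                                      # a high frequency
along_edge_normal = F[c + k, c]              # xi ~ (k, 0): slow decay
transverse = F[c, c + k]                     # xi ~ (0, k): rapid decay
print(along_edge_normal > 100 * transverse)  # True
```

Only the direction conormal to the edge carries slow decay, which is exactly what the wave front set records.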
The main result on Wave Front Sets is then as follows:
Theorem 3.5.1 Let $X\subset\mathbb{R}^n$, $\Gamma$ an open cone in $X\times(\mathbb{R}^N\setminus 0)$ and $\phi$ a phase function
on $\Gamma$. If $a\in S^m(X\times\mathbb{R}^N)$ vanishes near the zero section and $\mathrm{cone\,supp}\,a\subset\Gamma$, then for
$A$ seen as a distribution and defined in (3.20), we have
$$ \mathrm{WF}(A) \subset \{(x,\phi'_x);\ (x,\xi)\in\mathrm{cone\,supp}\,a,\ \phi'_\xi(x,\xi)=0\}. \qquad (3.50) $$
Proof. The proof of this result goes as follows. Let $\psi(x)$ be a function concentrating
near a point $x_0$. Then
$$ \widehat{\psi A}(\zeta) = \int_{\mathbb{R}^N\times\mathbb{R}^n} e^{i\phi(x,\xi)-ix\cdot\zeta}\,
 a(x,\xi)\,\psi(x)\, d\xi\, dx. $$
Let $\Phi(x,\xi) = \phi(x,\xi) - x\cdot\zeta$. Then
$$ d\Phi = \phi'_\xi\, d\xi + (\phi'_x - \zeta)\, dx. $$
This means that for $\zeta$ in a cone away from $\zeta_0 = \phi'_x$, we can define a smooth differential
operator of the form $L = a\cdot\nabla + c$ so that $L^t e^{i(\phi(x,\xi)-x\cdot\zeta)} = e^{i(\phi(x,\xi)-x\cdot\zeta)}$. The reason is
that, since $\zeta$ is away from $\zeta_0 = \phi'_x$, which is homogeneous of degree one in $\xi$, then
$$ |\phi'_x(x,\xi) - \zeta| \ge C(|\xi| + |\zeta|). $$
The usual integrations by parts then give us that $|\widehat{\psi A}(\zeta)| \le C_k\, |\zeta|^{-k}$ for all $k\in\mathbb{N}$. This
proves that $(x_0,\zeta_0)\notin\mathrm{WF}(A)$ and concludes the proof of the result.
In particular, if $A$ is a ΨDO, then $\phi(x,y,\xi) = (x-y)\cdot\xi$ and we find that
$$ \mathrm{WF}(K_A) \subset N^* := \{(x,x,\xi,-\xi),\ x\in X,\ \xi\in\mathbb{R}^n\setminus 0\}. $$
For more general operators, rules of propagation of singularities can also be defined. We
present them without derivations, which are fairly technical.
To do so, it is instructive to look at the product of distributions and understand
when such products are defined. Let $\Gamma_j$, $j=1,2$, be two closed cones in $X\times(\mathbb{R}^N\setminus 0)$.
(What we mean is that they are cones in the $\xi$ variable, not the $x$ variable.) We assume
that
$$ \Gamma_1 + \Gamma_2 = \{(x,\xi_1+\xi_2),\ (x,\xi_j)\in\Gamma_j\} \subset X\times(\mathbb{R}^N\setminus 0). \qquad (3.51) $$
The $\setminus 0$ above is the important information. We assume that $\xi_1$ and $\xi_2$ cannot add up
to zero for an $x$ serving as the base point in both cones. Then $(\Gamma_1+\Gamma_2)\cup\Gamma_1\cup\Gamma_2$
is also a closed cone in $X\times(\mathbb{R}^N\setminus 0)$. We have the following result:
Theorem 3.5.2 Let $\Gamma_j$ be closed cones as above. Then the product of distributions $u_j$
such that $\mathrm{WF}(u_j)\subset\Gamma_j$ can be defined in one and only one way so that it is sequentially
continuous with values in $\mathcal{D}'$. Moreover, we have
$$ \mathrm{WF}(u_1 u_2) \subset (\Gamma_1+\Gamma_2)\cup\Gamma_1\cup\Gamma_2. $$
A prototypical example is the product of $\delta(x_1)$ and $\delta(x_2)$ in $\mathbb{R}^n$ for $n\ge 2$. Indeed,
for $n=2$, $\Gamma_1 = \{(0,x_2,\xi_1,0)\}\setminus 0$ and $\Gamma_2 = \{(x_1,0,0,\xi_2)\}\setminus 0$. Then
$\Gamma_1+\Gamma_2 = \{(0,0,\xi_1,\xi_2)\}\setminus 0$, and the above inclusion for the Wave Front Set
is clearly satisfied for $\delta_x := \delta(x_1)\,\delta(x_2)$.
Note that in fact, $\mathrm{WF}(\delta_x) = \Gamma_1+\Gamma_2 = \{(0,\xi)\}\setminus 0$. The inclusion may thus not
be an equality. The reason is that singularities of $\delta(x_1)$ at $x_2\ne 0$ are no longer present
in the product since $u_2 = \delta(x_2)$ vanishes there. Inclusions may therefore be strict as such
cancellations do occur.
Let us look at the propagation of singularities under a linear transformation. Let
$K\in\mathcal{D}'(X\times Y)$ be a distribution, for $X\subset\mathbb{R}^n$ and $Y\subset\mathbb{R}^m$. Then $K$ defines a continuous
map from $C_0^\infty(Y)$ to $\mathcal{D}'(X)$ via
$$ \langle K\varphi,\psi\rangle := K(\psi\otimes\varphi); \qquad \varphi\in C_0^\infty(Y),\ \psi\in C_0^\infty(X). $$
We use $K$ both for the operator (on the left) and the distribution (on the right).
Let $u\in C_0^\infty(Y)$. Then
$$ \mathrm{WF}(Ku) \subset \mathrm{WF}_X(K) := \{(x,\xi);\ (x,\xi,y,0)\in\mathrm{WF}(K) \text{ for some } y\}. $$
In other words, if $K$ has no singularities that are purely in $(x,\xi)$, then $Ku\in C^\infty(X)$.
The definition of $Ku$ for $u$ a distribution is a dual question. Let $u_1 = K$, the
distribution on $X\times Y$. Let $u_2 = 1\otimes u$, another distribution. The product is well defined
if the cones $\Gamma_1 = \mathrm{WF}(K)$ and $\Gamma_2 = \mathrm{WF}(1\otimes u)$ are such that the sum
$\Gamma_1+\Gamma_2$ satisfies (3.51), i.e., does not meet $\{\xi_1+\xi_2 = 0\}$. This means that $\mathrm{WF}(u)$ does
not meet
$$ \mathrm{WF}'_Y(K) = \{(y,\eta);\ (x,0,y,-\eta)\in\mathrm{WF}(K) \text{ for some } x\}. $$
When $\mathrm{WF}'_Y(K) = \emptyset$, then $K$ defines a continuous map from $\mathcal{E}'(Y)$ to $\mathcal{D}'(X)$.
3.5.2 Propagation of singularities in FIOs
With this, we arrive at the main result:
Theorem 3.5.3 Let $X\subset\mathbb{R}^n$ and $Y\subset\mathbb{R}^m$ and $K\in\mathcal{D}'(X\times Y)$. If $\mathrm{WF}_X(K) = \emptyset$ and,
for $u\in\mathcal{E}'(Y)$, $\mathrm{WF}(u)$ does not meet $\mathrm{WF}'_Y(K)$, then
$$ \mathrm{WF}(Ku) \subset \mathrm{WF}'(K)\circ\mathrm{WF}(u), \qquad (3.52) $$
where we have defined
$$ \mathrm{WF}'(K) = \{(x,\xi,y,\eta)\in X\times\mathbb{R}^n\times Y\times\mathbb{R}^m;\
 (x,\xi,y,-\eta)\in\mathrm{WF}(K)\}, \qquad (3.53) $$
where $\mathrm{WF}'(K)$ is regarded as a relation mapping sets in $Y\times\mathbb{R}^m\setminus 0$ to sets in $X\times\mathbb{R}^n\setminus 0$.
In other words, $(x,\xi)\in\mathrm{WF}'(K)\circ\mathrm{WF}(u)$ when there exists $(y,\eta)\in\mathrm{WF}(u)$ such that
$(x,\xi,y,-\eta)\in\mathrm{WF}(K)$, the latter being equivalent to $(x,\xi,y,\eta)\in\mathrm{WF}'(K)$.
For $K = \delta(x-y) = c_n \int e^{i(x-y)\cdot\xi}\, d\xi$, which corresponds to the phase $\phi(x,y,\xi) =
(x-y)\cdot\xi$, we have seen that $\mathrm{WF}'(K) = \{(x,x,\xi,\xi),\ (x,\xi)\in X\times\mathbb{R}^n\setminus 0\}$. Thus,
$\mathrm{WF}(Ku)\subset\mathrm{WF}(u)$. More generally, for the distribution kernel of any ΨDO, we have
$$ K(x,y) = \int e^{i(x-y)\cdot\xi}\, a(x,y,\xi)\, d\xi, $$
with also $\mathrm{WF}'(K) = \{(x,x,\xi,\xi),\ (x,\xi)\in X\times\mathbb{R}^n\setminus 0\}$. This shows that ΨDOs do not
propagate singularities. This is a refined version of the statement we had regarding the
singular support of a distribution.
Let us now consider the phase $\phi(s,\theta,x,\sigma) = \sigma\big(s - s(x,\theta)\big)$ and the distribution
kernel
$$ K(s,\theta,x) = \int_{\mathbb{R}} e^{i\sigma(s-s(x,\theta))}\, \frac{J(x,\theta)}{2\pi}\, d\sigma. $$
The phase is stationary at $s = s(x,\theta)$, the set where $\phi'_\sigma = 0$. Away from that set, the
distribution kernel is smooth. So all the action takes place on the set
$$ \mathrm{WF}(K) \subset \big\{\big(s,\theta,x;\ \sigma,\ -\sigma\,\partial_\theta s(x,\theta),\
 -\sigma\nabla_x s(x,\theta)\big);\ s = s(x,\theta)\big\}. $$
This implies that
$$ \mathrm{WF}'(K) = \big\{\big(s(x,\theta),\theta;\ \sigma,\ -\sigma\,\partial_\theta s(x,\theta);\
 x,\ \sigma\nabla_x s(x,\theta)\big)\big\}. $$
How are singularities propagated by the generalized ray transform then? Let us assume
we have a singularity of $f(x)$ at $(x,\xi)\in X\times\mathbb{R}^2\setminus 0$. Then in order for that singularity
to propagate to the Radon domain, it has to be of the form
$$ \xi = \sigma\nabla_x s(x,\theta). $$
Our hypotheses on $s$ show that $(\sigma,\theta)\mapsto\sigma\nabla_x s(x,\theta)$ is in fact a diffeomorphism of
$\mathbb{R}_+\times S^1$ onto $\mathbb{R}^2\setminus 0$ at each $x$. Therefore, $\sigma$ and $\theta$ are uniquely determined by $\xi$. This means that
the singularity of $f$ at $(x,\xi)$ propagates to a singularity of $R_J f$ at
$$ \big(s(x,\theta),\theta;\ \sigma,\ -\sigma\,\partial_\theta s(x,\theta)\big) \in
 (\mathbb{R}\times S^1)\times(\mathbb{R}^2\setminus 0). $$
Note that the singularity in the latter space is uniquely defined. This should be con-
trasted to the very different result we obtained for the singular supports, stating that a
singularity at $x$ may propagate into singularities for all $(s(x,\theta),\theta)$ with $\theta\in S^1$.
We have to worry about a slight additional complication here. The above singularity
in the phase space of the Radon domain is uniquely defined for $\sigma > 0$. It is also uniquely
defined for $\sigma < 0$. So it turns out that a singularity at $(x,\xi)$ actually propagates into
two singularities in $(s,\theta;\xi_s,\xi_\theta)$. This is related to the fact that the parameterization
$(s,\theta)\in\mathbb{R}\times S^1$ is a double covering of the space of lines: a line parameterized by $(s,\theta)$
is the same line as the one parameterized by $(-s,-\theta)$. The two corresponding curves
are different in general. But we also see, in the setting of integrals along curves, that one
singularity propagates to two places.
At any rate, we observe that the notion of Wave Front Set is crucial to our under-
standing of the propagation of singularities in generalized ray transforms.
What happens to the adjoint operator $R_J^*$? Its distribution kernel is
$$ K^t(x,s,\theta) = \int_{\mathbb{R}} e^{-i\sigma(s-s(x,\theta))}\, \frac{J(x,\theta)}{2\pi}\, d\sigma. $$
Therefore,
$$ \mathrm{WF}'(K^t) = \big\{\big(x,\ \sigma\nabla_x s(x,\theta);\
 s(x,\theta),\theta,\ \sigma,\ -\sigma\,\partial_\theta s(x,\theta)\big)\big\}. $$
Let us now assume that we have a singularity at a point $(s,\theta;\xi_s,\xi_\theta)$. We have seen
that $(s(x,\theta),\partial_\theta s(x,\theta))$ uniquely characterizes $x$. This is a property of the curves.
This implies that $x$ is uniquely determined by knowledge of $(s,\theta;\xi_s,\xi_\theta)$. And so is
$\xi = \sigma\nabla_x s(x,\theta)$. We thus obtain again that singularities in the Radon domain corre-
spond to uniquely determined singularities in the spatial domain. Moreover, the two
singularities generated by a single $(x,\xi)$ both back-propagate to $(x,\xi)$ by composition
with $\mathrm{WF}'(K^t)$.
We can therefore apply the operator $R_J^* R_J$ and observe, by composition of the pre-
vious two propagations, that a singularity at $(x,\xi)$ is mapped into the same singularity
at $(x,\xi)$. This is consistent with the fact that $R_J^* R_J$ is a pseudo-differential operator
with a phase that can be recast as $(x-y)\cdot\xi$.
Strength of the propagating singularities. In the above discussion, the phase
was seen as the main player responsible for the propagation of singularities in $X\times\mathbb{R}^n\setminus 0$
to singularities in $Y\times\mathbb{R}^m\setminus 0$. However, does such a transfer always occur? This depends
on the amplitude $a(x,y,\xi)$. Let us recall the form of the kernel
$$ K(x,y) = \int e^{i\phi(x,y,\xi)}\, a(x,y,\xi)\, d\xi. $$
$\phi$ is a function on an $(n+m+N)$-dimensional manifold. Then $(x,y,\phi'_x,\phi'_y)$ is defined on
an $(n+m+N)$-dimensional manifold. However, we restrict ourselves to the submanifold
where $\phi'_\xi = 0$, which imposes $N$ constraints. Therefore, $\{(x,y,\phi'_x,\phi'_y),\ \phi'_\xi = 0\}$ a priori
is defined on an $(n+m)$-dimensional manifold. This is half the dimension of the product
space $X\times\mathbb{R}^n\times Y\times\mathbb{R}^m$. This forms what is called a Lagrangian manifold. And if we
want a one-to-one correspondence between singularities (or a one-to-two correspondence,
as we have seen is the case for the ray transform), then $m = n$. For our purpose, this
means that once a singularity is known at $(y_0,\eta_0 = \phi'_y(y_0))$, then this also defines $x_0$ and
$\xi_0$ on that Lagrangian manifold.
We now want to know whether $a(x_0,y_0,\xi_0)$ vanishes or not. If it vanishes, then
the singularity does not propagate to the $x$ variables. If $a(x_0,y_0,\xi_0)\ne 0$, then the
singularity propagates to the $x$ variables. Moreover, the strength of $a(x_0,y_0,\xi_0)$ tells
us how the singularities are attenuated (or amplified, though this is rarely the case in
practical inverse problems) by the transform.
There is no simple explanation for the amount of smoothing obtained by FIOs. We
describe some generic rules in a very formal and introductory manner. As a general
procedure, let $n$ be the dimension of $X$ and $Y$ and let $N$ be the dimension of the fiber
variable $\xi$. For the GRT, this is $n=2$ and $N=1$. Then we define a symbol $a(x,y,\xi)$
as an element in $S^{m+\frac14(2n-2N)}(X\times Y\times\mathbb{R}^N)$. This strange normalization means that a
symbol that looks non-smoothing, with $m+\frac14(2n-2N) = 0$ as we have in the GRT, in
fact is an operator smoothing by $-m$ derivatives with $m = \frac12(N-n)$, which is $m = -\frac12$
for the Radon transform. This is the reason why we have to multiply the symbol $J(x,\theta)$
by $\sqrt{|\sigma|}$ to obtain an operator of order $0$ (i.e., bounded from $L^2$ to $L^2$).
Summary. So, to summarize this discussion, we have obtained that the phase $\phi(x,y,\xi)$
dictates how singularities propagate from $(y,\eta)$ to $(x,\xi)$ via the canonical relation
$(x,\phi'_x,y,-\phi'_y)$ defined on the manifold $\phi'_\xi = 0$. On that manifold, for $(y,\eta)$ known,
then $\xi$ is also known, so that $(x,y,\xi)$ is known and thus so is $a(x,y,\xi)$. When the latter
does not vanish, then the singularity does propagate. When $a(x,y,\xi)$ is strictly positive
at all such points, then we say that the FIO is elliptic. We are then guaranteed
that all singularities will propagate. These singularities are then typically attenuated
in the sense that the amplitude of the singularity is multiplied by a constant times $|\sigma|^m$
(asymptotically for large $\sigma$). What the value of $m$ is is dictated by $a(x,y,\xi)$, which is
a symbol in $S^{m+\frac14(2n-2N)}$. Again, for the GRT, this means that $m = -\frac12$. Singularities are
attenuated by $|\sigma|^{-\frac12}$ by the GRT and then attenuated by $|\sigma|^{-\frac12}$ by the adjoint transform.
Therefore, $R_J^* R_J$ has to be multiplied by $(-\Delta)^{\frac12}$ or replaced by $R_J^*\Lambda R_J$ in order to
define an operator that is bounded from $L^2$ to $L^2$ and in which singularities are neither
amplified nor attenuated.
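A minimal sketch (an assumed uniform discretization in $s$, not from the text) of the filter $\Lambda = (-\partial^2/\partial s^2)^{1/2}$ appearing above, realized as the Fourier multiplier $|\sigma|$ in the $s$ variable:

```python
import numpy as np

# Lambda = (-d^2/ds^2)^{1/2} acts as multiplication by |sigma| in Fourier
# space. It restores the half derivative lost by the GRT plus the half
# derivative lost by its adjoint in the normal operator.
def Lambda(g, ds=1.0):
    n = g.shape[-1]
    sigma = 2 * np.pi * np.fft.fftfreq(n, d=ds)  # angular frequencies
    return np.fft.ifft(np.abs(sigma) * np.fft.fft(g)).real

# On a pure frequency, Lambda multiplies by |sigma|, as expected:
s = np.linspace(0, 2 * np.pi, 128, endpoint=False)
g = np.cos(3 * s)                                # frequency sigma = 3
print(np.allclose(Lambda(g, ds=s[1] - s[0]), 3 * g))  # True
```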
Chapter 4
Inverse wave problems
In the preceding two chapters, the probing mechanisms were based on particles: prop-
agation of X-rays in CT and propagation of particles in media with varying indices of
refraction in the generalized Radon transform. All these inverse problems were seen to
display good stability properties. We have seen that reconstructions based on M.R.I.
also enjoyed good stability properties. The third major way of probing unknown do-
mains with good stability reconstructions is to use waves. In some sense, particles may
be seen as high frequency wave packets with a frequency that is much larger than any
other scale in the problem. There is also considerable interest in considering waves with
lower, but still large, frequencies. Unsurprisingly, the propagation of such waves is mod-
eled by wave equations. Three typical partial differential equations are then used. A
scalar wave equation models the propagation of acoustic waves, the system of Maxwell's
equations models the propagation of electromagnetic waves, and the system of elastic-
ity models the propagation of elastic waves. In this chapter, we restrict ourselves to
simplified scalar models for the three types of waves.
As waves propagate, they interact with the medium of interest because of variations
in a spatially varying parameter called the sound speed (or light speed, or speed of elastic
waves; but since we restrict ourselves to a scalar model, we call such a speed sound
speed). The objective of inverse wave problems is therefore typically the reconstruction
of such a sound speed from available measurements at the boundary of the domain. This
reconstruction can be performed from time-dependent measurements or from frequency
measurements. The theories of reconstructions of sound speeds, called inverse scattering
theories, can be fairly involved mathematically. In this chapter, we consider several
relatively simple, but still representative, inverse wave problems.
The first problem is a linearization of the inverse scattering problem in one dimension
of space with time-dependent measurements. The second inverse problem is a lineariza-
tion of the inverse scattering problem with measurements at the domain's boundary for
one (sufficiently high) frequency. The third problem is an inverse source problem con-
sisting of reconstructing the initial condition in a wave equation from spatio-temporal
measurements of wave fields at the boundary of a domain enclosing the support of the
initial condition. This inverse source problem finds applications in the medical imaging
modality Photo-acoustic Tomography, which will be further analyzed in a later chapter.
The fourth and final problem is a nonlinear inverse coefficient problem in one dimension
of space.
This selection of inverse problems aims to demonstrate the following statement: In-
verse problems based on (sufficiently high-frequency) waves that propagate in not-too-
scattering environments involve measurement operators with good stability properties.
Reconstructions for such inverse problems are therefore typically high resolution. Al-
though such features remain true in higher spatial dimensions and for more general
non-linear inverse problems, the geometric descriptions and mathematical analyses can
become extremely complex. For general references on inverse scattering and related
inverse problems, we refer the reader to, e.g., [28, 30, 39].
4.1 One dimensional inverse scattering problem
In this section, we consider an example of a well-posed inverse problem, i.e., an inverse problem with a Lipschitz stability estimate in the $L^2$ sense. Inverse problems related to the wave equation are often of this nature. Indeed, the wave equation propagates singularities without dampening them, as does the Fourier transform for instance. Here, we consider a simple one dimensional wave equation and a linearization of the inverse scattering problem.
Let us consider the one dimensional wave equation
$$\frac{1}{c_s^2(x)}\frac{\partial^2 p}{\partial t^2} - \frac{\partial^2 p}{\partial x^2} = \delta(t)\delta(x-x_s), \qquad t\in\mathbb{R},\ x\in\mathbb{R}, \tag{4.1}$$
with delta source term at time $t=0$ and position $x=x_s$. Here, $c_s$ is the unknown sound speed, which takes the constant value $c_s=c$ for $|x|\ge R>0$. We assume causality, so that $p(x,t;x_s)=0$ for $t<0$, and assume that $p$ is bounded. We measure $p(x_s,t;x_s)$ at the domain's boundary as a function of time and want to reconstruct the unknown profile $c_s(x)$.
It is convenient to analyze the problem in the frequency domain. Let us define $u(x,\omega;x_s)$, the causal Fourier transform of $p(x,t;x_s)$ in the time variable,
$$u(x,\omega;x_s) = \int_0^\infty p(x,t;x_s)\,e^{i\omega t}\,dt. \tag{4.2}$$
This transform can be inverted as follows:
$$p(x,t;x_s) = \frac{1}{2\pi}\int_{\mathbb{R}} u(x,\omega;x_s)\,e^{-i\omega t}\,d\omega. \tag{4.3}$$
The equation for $u(x,\omega;x_s)$ is the well-known Helmholtz equation
$$\frac{d^2u}{dx^2} + \frac{\omega^2}{c_s^2(x)}\,u = \delta(x-x_s), \qquad \omega\in\mathbb{R},\ x\in\mathbb{R}, \tag{4.4}$$
augmented with the following radiation conditions
$$\frac{du}{dx} \mp i\frac{\omega}{c}\,u \to 0, \qquad \text{as } x\to\pm\infty. \tag{4.5}$$
Since $p(x_s,t;x_s)$ is measured, $u(x_s,\omega;x_s)$ is known by taking the Fourier transform of the available data.
Let us make a few assumptions. We assume that $c_s(x)$ is known on $(-\infty,x_s)$ (in Earth profile reconstructions, one is interested in positive depths only) and that $c_s(x)$ is close to the background sound speed $c$ on $(x_s,\infty)$ in the sense that
$$\frac{1}{c_s^2(x)} = \frac{1}{c^2}\bigl(1-\alpha(x)\bigr), \tag{4.6}$$
where $\alpha(x)$ is small compared to $1$. In effect, we linearize the problem of the reconstruction of $c_s(x)$ from the scattering measurements $u(x_s,\omega;x_s)$.
The advantage is that the resulting problem is linear, relatively straightforward to invert, and admits an explicit solution for small $\alpha(x)$. Let us define by $u_i$ ($i$ for incident) the solution of the unperturbed problem
$$\frac{d^2u_i}{dx^2} + \frac{\omega^2}{c^2}\,u_i = \delta(x-x_s), \qquad \frac{du_i}{dx}\mp i\frac{\omega}{c}\,u_i\to 0, \quad \text{as } x\to\pm\infty. \tag{4.7}$$
The solution to the above problem is nothing but the Green's function of the Helmholtz equation with constant coefficients. It is given explicitly by
$$u_i(x,\omega;x_s) = g(x-x_s,\omega) = \frac{c\,e^{i\frac{\omega}{c}|x-x_s|}}{2i\omega}. \tag{4.8}$$
Exercise 4.1.1 Verify the above formula for the Green's function (and verify that the radiation conditions are satisfied).
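A quick numerical sanity check of this exercise can be useful (the values $c=1$, $\omega=5$, $x_s=0.3$ below are arbitrary choices, not from the text): away from $x_s$, the Green's function should solve the homogeneous Helmholtz equation, and the distributional meaning of the $\delta$ source is a unit jump of $g'$ across $x_s$.

```python
import numpy as np

# Sanity check for the Green's function g(x) = c e^{i(omega/c)|x - x_s|}/(2i omega):
# (i) zero Helmholtz residual away from x_s, (ii) unit jump of g' across x_s.
c, omega, x_s = 1.0, 5.0, 0.3        # arbitrary illustrative values
k = omega / c

def g(x):
    return c * np.exp(1j * k * np.abs(x - x_s)) / (2j * omega)

h = 1e-4
x = np.array([-1.0, 0.0, 1.0, 2.0])  # points away from x_s
res = (g(x + h) - 2 * g(x) + g(x - h)) / h**2 + k**2 * g(x)
print(np.max(np.abs(res)))           # ~0 (finite-difference error only)

# jump of g' across the source: g'(x_s^+) - g'(x_s^-) = 1
jump = (g(x_s + h) - g(x_s)) / h - (g(x_s) - g(x_s - h)) / h
print(jump)                          # ~1
```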
Let us now decompose the Helmholtz solution as the superposition of the incident field and the scattered field:
$$u(x,\omega;x_s) = u_i(x,\omega;x_s) + u_s(x,\omega;x_s).$$
From the equations for $u$ and $u_i$, we verify that $u_s$ satisfies the following equation
$$\frac{d^2u_s}{dx^2} + \frac{\omega^2}{c^2}\,u_s = \frac{\omega^2}{c^2}\,\alpha(x)(u_i+u_s), \tag{4.9}$$
with appropriate radiation conditions. By the principle of superposition, this implies that
$$u_s(x,\omega;x_s) = \frac{\omega^2}{c^2}\int_{x_s}^\infty \alpha(y)\,(u_s+u_i)(y,\omega;x_s)\,g(x-y,\omega)\,dy. \tag{4.10}$$
So far, we have not used the assumption that $\alpha(x)$ is small. We now invoke it: this approximation, called the Born approximation, allows us to deduce from the above equation that $u_s$ is also of order $\alpha$. This implies that $\alpha u_s$ is of order $\alpha^2$, hence much smaller than the other contributions in (4.10). So neglecting $\alpha u_s$ on the right hand side of (4.10) and replacing $u_i$ and $g$ by their expressions in (4.8), we deduce that a good approximation of $u_s$ is
$$u_s\Bigl(x_s,\frac{ck}{2};x_s\Bigr) = -\int_{\mathbb{R}} \frac{\alpha(x)}{4}\,e^{ik(x-x_s)}\,dx, \qquad k\in\mathbb{R}. \tag{4.11}$$
This implies that the scattering data $u_s(x_s,\omega;x_s)$ uniquely determine the fluctuation $\alpha$ and that the reconstruction is stable: all we have to do is to take the inverse Fourier transform of $u_s$ to obtain $\alpha(x)$. Namely, we have
$$\alpha(x) = -\frac{2}{\pi}\int_{\mathbb{R}} e^{-ik(x-x_s)}\,u_s\Bigl(x_s,\frac{ck}{2};x_s\Bigr)\,dk. \tag{4.12}$$
Several assumptions have been made to arrive at this result. However, as was the case with the MRI problem, we obtain in the end a very simple reconstruction procedure: all we have to do is to compute an inverse Fourier transform.
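This round trip can be tested numerically. The sketch below discretizes the forward map (4.11) and the inversion (4.12) by Riemann sums, with $x_s=0$ and a hypothetical Gaussian fluctuation $\alpha$ (none of these values come from the text), and verifies that the inverse Fourier transform returns $\alpha$:

```python
import numpy as np

# Discrete check of the inversion pair (4.11)-(4.12), with x_s = 0 and a
# hypothetical smooth fluctuation alpha supported in (0, 4).
x = np.linspace(0.0, 4.0, 401)
dx = x[1] - x[0]
alpha = np.exp(-((x - 2.0) ** 2) / 0.1)      # smooth, essentially compactly supported

k = np.linspace(-150.0, 150.0, 1501)
dk = k[1] - k[0]

# forward data (4.11): u_s(x_s, ck/2; x_s) = -(1/4) int alpha(x) e^{ikx} dx
u_s = -0.25 * np.exp(1j * np.outer(k, x)) @ alpha * dx

# inversion (4.12): alpha(x) = -(2/pi) int e^{-ikx} u_s dk
alpha_rec = np.real(-(2.0 / np.pi) * np.exp(-1j * np.outer(x, k)) @ u_s * dk)

print(np.max(np.abs(alpha_rec - alpha)))     # small quadrature error
```

The Gaussian is essentially band-limited well below the truncation $|k|\le 150$, so the discrete round trip recovers $\alpha$ up to a tiny quadrature error.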
We can summarize the above result as follows.

Theorem 4.1.1 Let $u_i$ be given by (4.8) and $u_s$ be the solution of
$$\frac{d^2u_s}{dx^2} + \frac{\omega^2}{c^2}\,u_s = \frac{\omega^2}{c^2}\,\alpha(x)u_i, \tag{4.13}$$
for $\alpha$ supported in $(x_s,R)$ and such that the radiation conditions (4.5) hold. Consider the measurement operator
$$M(\alpha) = u_s(x_s,\omega;x_s), \tag{4.14}$$
mapping $\alpha(x)\in L^2(x_s,R)$ to $u_s(x_s,\omega;x_s)\in L^2(\mathbb{R})$. Then the measurement operator uniquely determines $\alpha$ by means of the explicit inversion formula (4.12). The inversion is Lipschitz stable in the sense that
$$\|\alpha-\tilde\alpha\|_{L^2(x_s,R)} \le C\,\|M(\alpha)-M(\tilde\alpha)\|_{L^2(\mathbb{R})}, \tag{4.15}$$
for a constant $C>0$.
Exercise 4.1.2 Write the proof of the above theorem in detail. State a similar theorem
for a linearization of the wave equation in the time domain (4.1).
4.2 Linearized Inverse Scattering problem
In the preceding section, we analyzed the linearization of an inverse scattering problem in one spatial dimension. In this section, we consider extensions to two and three dimensions of space. The model is again a wave equation written in the frequency domain (a Helmholtz equation). Unlike the one dimensional setting, in dimension two or more, plane waves with a given frequency are sufficient to uniquely determine (the low frequency component of) a sound speed. This is the setting that we present here.
4.2.1 Setting and linearization
Let us consider the wave equation in dimension $n=2,3$ given by
$$\frac{1}{c_s^2(x)}\frac{\partial^2 p}{\partial t^2} - \Delta p = 0, \qquad x\in\mathbb{R}^n,\ t>0, \tag{4.16}$$
with appropriate initial conditions. The velocity $c_s(x)$ is the unknown parameter. Let us assume that $c_s(x)=c$ for $|x|>R$ for a given radius $R>0$. We assume that $p=0$ for $t<0$ and, as in the preceding section, pass to the frequency domain by introducing
$$u(x,\omega) = \int_0^\infty e^{i\omega t}p(x,t)\,dt, \qquad p(x,t) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-i\omega t}u(x,\omega)\,d\omega. \tag{4.17}$$
The equation for $u$ is then the following Helmholtz equation
$$\Bigl(\Delta + \frac{\omega^2}{c_s^2(x)}\Bigr)u(x,\omega) = 0, \qquad x\in\mathbb{R}^n,\ \omega\in\mathbb{R},$$
$$\hat x\cdot\nabla u(x,\omega) - i\frac{\omega}{c}\,u(x,\omega) = o\bigl(|x|^{-(n-1)/2}\bigr). \tag{4.18}$$
As usual, $\hat x = \frac{x}{|x|}$, and the second equation is the radiation condition, which ensures that no energy comes from infinity (only waves radiating out are allowed at infinity). The notation $o(x)$ means a quantity such that $o(x)/x\to 0$ as $0<x\to0$. So the decay at infinity should be faster than $|x|^{-1/2}$ in two dimensions and faster than $|x|^{-1}$ in three dimensions.
Let us now introduce the following linearization for the velocity and the frequency:
$$\frac{1}{c_s^2(x)} = \frac{1}{c^2}\bigl(1-\alpha(x)\bigr), \qquad k = \frac{\omega}{c}. \tag{4.19}$$
We recast the Helmholtz equation as
$$(\Delta+k^2)u(x,\omega) = \alpha(x)k^2 u(x,\omega),$$
$$\hat x\cdot\nabla u(x,\omega) - iku(x,\omega) = o\bigl(|x|^{-(n-1)/2}\bigr). \tag{4.20}$$
Let $\theta\in S^{n-1}$ be a unit vector. We verify that
$$(\Delta+k^2)u_i(x,\omega;\theta) = 0, \qquad \text{where } u_i(x,\omega;\theta) = e^{ik\theta\cdot x}. \tag{4.21}$$
Thus plane waves with the right wavenumber $k$ are solutions of the homogeneous Helmholtz equation. Notice however that they do not satisfy the radiation conditions (they do radiate out in the direction $\theta$ but certainly not in the direction $-\theta$ since they come from infinity in that direction).

The forward problem we are interested in is the following: we have a probing plane wave coming from infinity and want to find a solution $u_s(x,\omega)$ modeling the response of the system to the probing. Therefore we impose that $u_s$ does not radiate at infinity (i.e., satisfies the radiation condition) and that the whole field $u=u_i+u_s$ satisfies the Helmholtz equation. We thus end up with the following scattering problem
$$(\Delta+k^2)u_s(x,\omega) = \alpha(x)k^2(u_s+u_i)(x,\omega),$$
$$\hat x\cdot\nabla u_s(x,\omega) - iku_s(x,\omega) = o\bigl(|x|^{-(n-1)/2}\bigr). \tag{4.22}$$
In the above equation we have used (4.21). Under general assumptions on (x), the
above equation admits a unique solution [30]. The inverse scattering problem consists
then of reconstructing $\alpha(x)$ from measurements of $u_s$ at infinity in all possible directions $\hat x$ for all possible incoming plane waves $\theta\in S^{n-1}$. We will not be concerned with this general problem and refer the reader to [30] for more details.
Instead, we concentrate on the linearization of the inverse scattering problem about the constant velocity profile $c_s(x)=c$. Let us assume that $\alpha$ is small (in some appropriate sense that we do not describe further). As a consequence, $u_s$ is also small, as can be seen from (4.22). We therefore neglect the term $\alpha u_s$, which is second-order in $\alpha$. This approximation, as in the preceding section, is called the Born approximation. It also has the advantage of linearizing the problem of reconstructing $\alpha(x)$ from the scattering measurements, which will be described in detail below. We are thus now concerned with the linearized problem
$$(\Delta+k^2)u_s(x,\omega) = \alpha(x)k^2u_i(x,\omega),$$
$$\hat x\cdot\nabla u_s(x,\omega) - iku_s(x,\omega) = o\bigl(|x|^{-(n-1)/2}\bigr). \tag{4.23}$$
This equation can be solved explicitly as
$$u_s(x,\omega) = -k^2\int_{\mathbb{R}^n}\alpha(y)\,u_i(y,\omega)\,g_n(x-y)\,dy, \tag{4.24}$$
where $g_n$ is the Green's function, solution of the following equation
$$(\Delta+k^2)g_n(x) = -\delta(x),$$
$$\hat x\cdot\nabla g_n(x) - ikg_n(x) = o\bigl(|x|^{-(n-1)/2}\bigr), \tag{4.25}$$
and given for $n=2,3$ by
$$g_2(x) = \frac{i}{4}H_0(k|x|), \qquad g_3(x) = \frac{e^{ik|x|}}{4\pi|x|}. \tag{4.26}$$
Here, $H_0$ is the 0th order Hankel function of the first kind, given by
$$H_0(k|x|) = \frac{1}{\pi}\int_{\mathbb{R}} \frac{1}{\sqrt{k^2-p^2}}\,e^{i(px+\sqrt{k^2-p^2}\,y)}\,dp, \tag{4.27}$$
where we have decomposed $x=(x,y)$ in Cartesian coordinates.
4.2.2 Far field data and reconstruction
The measurements we consider in this section are the far field scattering data. They correspond to the scattered waves propagating outwards at infinity. This simplification amounts to saying that the other component of the radiated field, composed of the evanescent waves, is not measured. Mathematically, we consider the asymptotic limit of $u_s$ as $|x|\to\infty$. Let us consider the three dimensional case first. Since $|x|$ goes to infinity, $|x-y|$ is equal to $|x|$ plus a lower-order correction. So we have
$$u_s(x,\omega) = -\frac{k^2}{4\pi|x|}\int_{\mathbb{R}^3}\alpha(y)\,e^{ik\theta\cdot y}\,e^{ik|x-y|}\,dy + \text{l.o.t.}$$
Upon using the following approximation
$$|x-y| = |x|\Bigl|\hat x-\frac{y}{|x|}\Bigr| = |x|\Bigl(1+\frac{|y|^2}{|x|^2}-2\frac{\hat x\cdot y}{|x|}\Bigr)^{\frac12} = |x|-\hat x\cdot y + \text{l.o.t.},$$
we obtain
$$u_s(x,\omega) = -\frac{k^2e^{ik|x|}}{4\pi|x|}\int_{\mathbb{R}^3}\alpha(y)\,e^{ik(\theta-\hat x)\cdot y}\,dy + \text{l.o.t.}$$
We thus observe that
$$u_s(x,\omega) = -\frac{k^2e^{ik|x|}}{4\pi|x|}A(\hat x) + o\Bigl(\frac{1}{|x|}\Bigr), \qquad A(\hat x) = \hat\alpha\bigl(k(\hat x-\theta)\bigr) = \int_{\mathbb{R}^3}\alpha(y)\,e^{ik(\theta-\hat x)\cdot y}\,dy. \tag{4.28}$$
Recall that $\omega=ck$. So for a plane wave at a given frequency $\omega$, i.e., at a given wavenumber $k$, and direction $\theta$, the far field measurement is $A(\hat x)=A(\hat x;k,\theta)$ in the direction $\hat x$ (obtained by multiplying the measured signal by $-4\pi|x|e^{-ik|x|}k^{-2}$).
In two space dimensions ($n=2$), the final result is similar in the sense that $u_s$ is proportional to $|x|^{-\frac12}$ at infinity, with a coefficient of proportionality $A(\hat x)$ taking the same expression as given in (4.28).

The measurement operator we consider in this section thus takes the following form
$$L^\infty(B(0,R)) \ni \alpha \mapsto M[\alpha] = A(\,\cdot\,;k,\,\cdot\,) \in L^\infty(S^{n-1}\times S^{n-1}), \tag{4.29}$$
which to a (say bounded) sound speed fluctuation $\alpha(x)$ associates the far field measurement $(\hat x,\theta)\mapsto A(\hat x;k,\theta)$, which is also bounded as the Fourier transform of a bounded, compactly supported, function. Note that $\alpha$ lives in the $n$-dimensional space $B(0,R)$ whereas $A$ for a fixed frequency $k$ lives in the $2(n-1)$-dimensional space $S^{n-1}\times S^{n-1}$. The latter dimensions agree when $n=2$, whereas in dimensions $n\ge3$, the far field data are richer than the unknown object since $2(n-1)>n$ then. Note that within the (Born) approximation of linearization, the measurement operator again provides information about $\alpha$ in the Fourier domain since
$$M[\alpha](\hat x,\theta) = \hat\alpha\bigl(k(\hat x-\theta)\bigr). \tag{4.30}$$
Each measurement $(\hat x,\theta)$ provides different information about the Fourier transform of the velocity fluctuation $\alpha(x)$. We distinguish two types of measurements. The first ones correspond to directions of measurements $\hat x$ such that $\hat x\cdot\theta>0$. These measurements are called transmission measurements since they correspond to the radiated waves that have passed through the object we wish to image. The second ones correspond to the directions such that $\hat x\cdot\theta<0$. They are called reflection measurements.
Transmission inverse scattering. Let us consider transmission measurements first, with $\hat\alpha(k(\hat x-\theta))$ known for $\hat x\cdot\theta>0$. In particular, we obtain for $\hat x=\theta$ the value $\hat\alpha(0)$, which is the average of the fluctuation $\alpha(x)$ over the whole domain. More generally, as $\hat x$ varies in $S^1$ with $\hat x\cdot\theta>0$, we obtain $\hat\alpha(\xi)$ over a half-circle passing through $0$, of radius $k$ and symmetric about the axis $\theta$. As $\theta$ varies on the unit circle, we observe that $k(\hat x-\theta)$ fills the disk of radius $\sqrt2\,k$. At a fixed value of $k$, this is therefore all we can get: $\hat\alpha(\xi)$ for $\xi$ such that $|\xi|\le\sqrt2\,k$.

The picture in three dimensions is very similar: for a given $\theta\in S^2$, we have access to $\hat\alpha(\xi)$ for $\xi$ on a half-sphere of radius $k$ passing through $0$ and invariant by rotation about $\theta$. As $\theta$ varies over the sphere $S^2$, we thus get $\hat\alpha(\xi)$ for all $\xi$ such that $|\xi|\le\sqrt2\,k$, as in the two-dimensional case.
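The geometric claim — that transmission measurements $k(\hat x-\theta)$ fill the disk of radius $\sqrt2\,k$ — can be checked by brute force on a grid of directions (with $k=1$; an illustration, not part of the text):

```python
import numpy as np

# For xhat.theta > 0 on the unit circle, the vectors xhat - theta sample
# the disk of radius sqrt(2) (k = 1 here).
t = np.linspace(0.0, 2.0 * np.pi, 721)
theta = np.stack([np.cos(t), np.sin(t)], axis=1)
xhat = theta.copy()

dots = xhat @ theta.T                        # all pairwise xhat.theta
i, j = np.nonzero(dots > 0.0)                # keep transmission pairs only
xi = xhat[i] - theta[j]                      # xi/k = xhat - theta
norms = np.linalg.norm(xi, axis=1)

print(norms.max())                           # approaches sqrt(2) ~ 1.41421
target = np.array([0.7, -0.4])               # arbitrary point with |xi| < sqrt(2)
print(np.min(np.linalg.norm(xi - target, axis=1)))   # ~0: the point is covered
```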
The diffraction inverse problem is therefore not injective. All we can reconstruct from the measured data is a low-pass filter of the object $\alpha(x)$. The high frequencies are not measured. The high frequencies of $\alpha$ are encoded in the radiated field $u_s$. However, they are the evanescent part of the waves. They decay therefore much more rapidly than $|x|^{-1}$ (when $n=3$), actually exponentially, and thus cannot be measured accurately in practice.

Let us now consider reconstruction formulas. Since frequencies above $\sqrt2\,k$ cannot be reconstructed, we assume that
$$\alpha(x) = \bigl(\mathcal{F}^{-1}_{\xi\to x}\,\chi_{\sqrt2 k}(\xi)\,\mathcal{F}_{x\to\xi}\,\alpha\bigr)(x), \tag{4.31}$$
where $\chi_{\sqrt2 k}(\xi)=1$ when $|\xi|<\sqrt2\,k$ and $0$ otherwise, i.e., $\alpha$ does not have high wavenumbers. Note that this assumption is inconsistent with our earlier assumption that $\alpha$ was supported in $B(0,R)$. We do not deal with this minor technical difficulty here. Then
the reconstruction is obviously unique according to what we just saw. Let us consider the two-dimensional case. We want to reconstruct $\alpha(x)$ from $\hat\alpha(k(\hat x-\theta))$, where $\hat x$ and $\theta$ run over the unit circle $S^1$. The inverse Fourier transform tells us that
$$\alpha(x) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{ix\cdot\xi}\,\hat\alpha(\xi)\,d\xi.$$
Observe that as $\theta$ covers the unit circle and $\hat x$ varies over the half-circle $\hat x\cdot\theta>0$, all points $\xi$ of the disk $|\xi|<\sqrt2\,k$ are covered twice: once for a pair $(\theta,\hat x)$ such that $\theta^\perp\cdot\xi>0$ and once for a pair such that $\theta^\perp\cdot\xi<0$. Therefore, the information corresponding to $\theta^\perp\cdot\xi>0$ is sufficient. This information is parameterized as follows: for a given $\theta$, we write $\hat x$ as
$$\hat x(\theta,\phi) = \cos\phi\,\theta + \sin\phi\,\theta^\perp, \qquad 0\le\phi\le\frac{\pi}{2}. \tag{4.32}$$
We thus obtain that
$$\hat\alpha\bigl(k(\hat x-\theta)\bigr) = \hat\alpha\Bigl(k\rho(\phi)\Bigl[\frac{\cos\phi-1}{\rho(\phi)}\,\theta+\frac{\sin\phi}{\rho(\phi)}\,\theta^\perp\Bigr]\Bigr) = \hat\alpha\bigl(k\rho(\phi)\,\hat\xi(\theta,\phi)\bigr),$$
with $\rho(\phi)=\sqrt{2(1-\cos\phi)}$ and $\hat\xi(\theta,\phi)$ the unit vector in the above brackets, an explicitly defined rotation of $\theta$ depending on $\phi$. In the variables $\xi=k\rho(\phi)\hat\xi(\theta,\phi)$, a direct computation of the Jacobian shows that
$$d\xi = k^2\,\frac12\frac{d\rho^2(\phi)}{d\phi}\,d\phi\,d\theta = k^2\sin\phi\,d\phi\,d\theta,$$
so that finally
$$\alpha(x) = \frac{k^2}{(2\pi)^2}\int_0^{2\pi}\!\!\int_0^{\pi/2} e^{ik\rho(\phi)\,x\cdot\hat\xi(\theta,\phi)}\,\hat\alpha\bigl(k(\hat x(\theta,\phi)-\theta)\bigr)\,\sin\phi\,d\phi\,d\theta. \tag{4.33}$$
This is the reconstruction formula we were after.
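The change of variables behind this formula can be verified numerically: $|\hat x(\theta,\phi)-\theta|$ should equal $\rho(\phi)=\sqrt{2(1-\cos\phi)}$, and the area element $k^2\sin\phi\,d\phi\,d\theta$ should integrate to the area of the disk of radius $\sqrt2\,k$. The sketch below assumes the parameterization $\hat x(\theta,\phi)=\cos\phi\,\theta+\sin\phi\,\theta^\perp$ and takes $k=1$:

```python
import numpy as np

# (i) |xhat(theta, phi) - theta| equals rho(phi) = sqrt(2(1 - cos phi));
# (ii) the Jacobian k^2 sin(phi) integrates to the area of the disk of
#      radius sqrt(2) k over (theta, phi) in (0, 2*pi) x (0, pi/2).
k = 1.0
beta = 1.1                                   # angle of theta, arbitrary
theta = np.array([np.cos(beta), np.sin(beta)])
theta_perp = np.array([-np.sin(beta), np.cos(beta)])

phi = np.linspace(0.0, np.pi / 2, 500)
xhat = np.outer(np.cos(phi), theta) + np.outer(np.sin(phi), theta_perp)
rho = np.sqrt(2.0 * (1.0 - np.cos(phi)))
mismatch = np.max(np.abs(np.linalg.norm(xhat - theta, axis=1) - rho))
print(mismatch)                              # ~0

dphi = phi[1] - phi[0]
area = 2.0 * np.pi * np.sum(k**2 * np.sin(phi)) * dphi
print(area, np.pi * (np.sqrt(2.0) * k) ** 2) # both ~ 2*pi
```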
Reflection inverse scattering. Let us conclude the section with a few words about reflection tomography. In that case, we only measure data in directions $\hat x$ such that $\hat x\cdot\theta<0$. Following the same techniques as above, we see that we can reconstruct the wavenumbers of $\alpha(x)$ in the corona of wavenumbers $\xi$ such that $\sqrt2\,k<|\xi|<2k$. The reconstruction from reflection data is therefore by no means unique. We cannot reconstruct low-frequency components of $\alpha$ and cannot reconstruct very-high frequencies either. Assuming that the wavenumber content of $\alpha(x)$ is in the above corona, then the reconstruction is unique. A reconstruction formula similar to what we just obtained can also be derived. Notice that when both the transmission and reflection data can be measured, we can reconstruct all wavenumbers of $\alpha(x)$ such that $|\xi|<2k$.

All these results are in sharp contrast to the one-dimensional example we saw in the preceding section. There, a given wavenumber $k$ allows us to reconstruct one wavenumber of $\alpha(x)$. All wavenumbers are thus required (i.e., measurements for all frequencies $\omega$) to reconstruct $\alpha(x)$. Here $\alpha(x)$ is also uniquely determined by measurements obtained for all values of $k$ (since each value of $k$ allows us to reconstruct all wavenumbers $|\xi|<2k$). However, because of the multidimensional nature of the measurements (the variable $\hat x$ is discrete in one dimension instead of living on the unit sphere $S^{n-1}$), measurements for all values of $k$ are quite redundant: once we have obtained measurements at a given value of $k_0$, all measurements obtained for wavenumbers $k<k_0$ are redundant.
4.2.3 Comparison to X-ray tomography
Let us consider the case of transmission data in two space dimensions. We have seen that wavenumbers of $\alpha(x)$ up to $\sqrt2\,k$ could be reconstructed. However, as $k$ tends to $\infty$, this essentially means that all wavenumbers of $\alpha(x)$ can be reconstructed. Indeed, in that limit we observe that the half-circle of radius $k$ becomes the full line orthogonal to $\theta$. That is, as $k\to\infty$, the measurements tend to
$$\hat\alpha(\sigma\theta^\perp) = \widehat{R\alpha}\Bigl(\sigma,\theta+\frac{\pi}{2}\Bigr).$$

Exercise 4.2.1 Show that the reconstruction formula (4.33) indeed converges to the inverse Radon transform as $k\to\infty$.

In the limit of infinite frequency, we therefore obtain that the transmission measurements tend to the Radon transform of $\alpha$. We have seen in Chapter 2 that the knowledge of $\widehat{R\alpha}(\sigma,\theta+\pi/2)$ for all values of $\sigma$ and $\theta$ was sufficient to uniquely reconstruct the fluctuation $\alpha(x)$.
So how should we consider the inverse diffraction problem? How ill-posed is it? As we already mentioned, the first problem with diffraction tomography is that for a fixed frequency $\omega$, the function $\alpha(x)$ cannot be uniquely reconstructed. Only the wavenumbers below $\sqrt2\,k$ (below $2k$) in the case of transmission (transmission and reflection) measurements can be reconstructed. However, in the class of functions $\alpha(x)\in L^2(\mathbb{R}^2)$ such that (4.31) holds, we have uniqueness of the reconstruction. In this class we can perform a similar analysis to what we obtained in Theorem 2.2.2.

Let us consider the measurements $d(\phi,\theta)=\hat\alpha(k(\hat x-\theta))$ for $\theta\in S^1$ and $0\le\phi\le\pi/2$ using (4.32). We verify that $\rho'(\phi)$ is bounded above and below by positive constants for $0\le\phi\le\pi/2$.
Let us assume that the error we make is of the same order for every angle $\phi$ and every angle $\theta$. An estimate of the total error will thus involve
$$\int_{S^1}\int_0^{\pi/2}|d(\phi,\theta)|^2\,d\phi\,d\theta = \int_{S^1}\int_0^{\pi/2}\bigl|\hat\alpha\bigl(k\rho(\phi)\hat\xi(\theta,\phi)\bigr)\bigr|^2\,d\phi\,d\theta = \int_{S^1}\int_0^{\sqrt2}\bigl|\hat\alpha(k\rho\,\hat\xi)\bigr|^2\,(\rho')^{-1}\,d\rho\,d\theta$$
$$\sim \int_{S^1}\int_0^{\sqrt2}\bigl|\hat\alpha(k\rho\,\hat\xi)\bigr|^2\,d\rho\,d\theta \sim \frac{1}{k}\int_{S^1}\int_0^{\sqrt2 k}\bigl|\hat\alpha(u\hat\xi)\bigr|^2\,du\,d\theta = \frac{1}{k}\int_{S^1}\int_0^{\sqrt2 k}\bigl|\hat\alpha(u\hat\xi)\bigr|^2\,u\,du\,d\theta\,\frac{1}{u} \sim \frac{1}{k}\,\|\alpha\|^2_{H^{-\frac12}(\mathbb{R}^2)},$$
where $\hat\xi(\theta,\phi)$ denotes the unit vector in the direction of $k(\hat x(\theta,\phi)-\theta)$.
In some sense, the above formula also shows that the data $d(\phi,\theta)$ are more regular than the function $\alpha(x)$ by half of a derivative. This is consistent with the Radon transform in the limit $k\to\infty$. To be more consistent with the Radon transform, notice in the limit $k\to\infty$ that $\sigma=k\rho(\phi)$, so that $k\rho'(\phi)\,d\phi = d\sigma$ as the half-circle converges to the real line. Since $\rho'(\phi)$ is of order $1$ for most of the wavenumbers as $k\to\infty$, this implies that $k\,d\phi\sim d\sigma$. Therefore, a total error in the angular measurements in diffraction tomography consistent with the measurement errors for the Radon transform is given by
$$\int_{S^1}\int_0^{\pi/2}|d(\phi,\theta)|^2\,k\,d\phi\,d\theta \sim \|\alpha\|^2_{H^{-1/2}(\mathbb{R}^2)}.$$
We recover in this limit that the measurements in diffraction tomography regularize the function $\alpha$ by half of a derivative.
Note that we see here again that the ill-posedness of a problem very much depends
on the norm in which the error on the data is measured.
4.3 Inverse source problem in PAT
Consider the following wave equation
$$\frac{\partial^2 p}{\partial t^2} - \Delta p = 0, \qquad t>0,\ x\in\mathbb{R}^n,$$
$$p(0,x) = f(x), \quad \partial_t p(0,x) = 0, \qquad x\in\mathbb{R}^n. \tag{4.34}$$
The inverse wave source problem consists of reconstructing the initial condition $f$, supported in an open, convex, bounded domain $X$, from knowledge of $p(t,x)$ for $t>0$ and $x\in\partial X$. This inverse problem finds applications in Photo-acoustic Tomography (PAT). We will come back to the modeling of this imaging modality in Chapter 8.
4.3.1 An explicit reconstruction formula for the unit sphere
Explicit inversion formulas are known when $\Sigma=\partial X$ is a simple geometry. Explicit reconstruction procedures based on time reversal are also known for general surfaces enclosing the support of the source $f(x)$, which we assume is compactly supported. All explicit inversion formulas are based one way or another on the Fourier transform.

Let us assume that $\Sigma$ is the unit sphere $\{|x|=1\}$. Then an explicit reconstruction formula by L. Kunyansky shows that in dimension $n=3$, we have
$$f(x) = \frac{1}{8\pi^2}\,\nabla\cdot\int_{|y|=1} n(y)\Bigl[\frac{1}{t}\frac{d}{dt}\frac{p(y,t)}{t}\Bigr]_{t=|y-x|}\,dS_y, \tag{4.35}$$
where $n(y)$ is the outward unit normal to the sphere at $y$. The above formula generalizes to arbitrary dimension. We shall not delve into the details of this inversion formula. Like the formula based on the Radon transform, it uses the symmetries of the Fourier transform. However, it is based on Fourier transforms on the sphere (spherical harmonics) that we do not present here. The above reconstruction shows that the source term $f(x)$ can be uniquely and stably reconstructed from knowledge of $p$ on the surface $\Sigma$.
4.3.2 An explicit reconstruction for detectors on a plane
In this section, we present an inversion procedure for point-wise detectors based on the use of the Fourier transform. We assume that $f$ is compactly supported in the unit ball $B(0,1)$ and assume that the measurements are performed at each point of the hyperplane $x_n=1$. The reconstruction procedure is essentially independent of dimension $n\ge2$ and is therefore presented for arbitrary dimensions. We denote by $x'=(x_1,\ldots,x_{n-1})$ and by $z=x_n$. We want to reconstruct $f(x)$ from knowledge of $p(t,x',z=1)$ for all $t\ge0$ and $x'\in\mathbb{R}^{n-1}$, knowing that
$$\Bigl(\frac{\partial^2}{\partial t^2}-\Delta\Bigr)p = \partial_t\delta_0(t)\,f(x), \qquad (t,x)\in\mathbb{R}\times\mathbb{R}^n.$$
We denote by $u(\omega,\xi',z)$ the partial Fourier transform of $p$ given by
$$u(\omega,\xi',z) = \int_{\mathbb{R}^n} e^{i(\omega t+\xi'\cdot x')}\,p(t,x',z)\,dt\,dx',$$
and by
$$\hat f(\xi',z) = \int_{\mathbb{R}^{n-1}} e^{i\xi'\cdot x'}f(x',z)\,dx'.$$
Since differentiation in $t$ corresponds to multiplication by $-i\omega$ in the Fourier domain, with a similar expression in the $x'$ variable, we find that
$$\Bigl(\omega^2-|\xi'|^2+\frac{\partial^2}{\partial z^2}\Bigr)u = i\omega\hat f := s(\omega,\xi',z).$$
For the reconstructions, it turns out that we need only the subset $\omega>|\xi'|$, and we define $\lambda=\sqrt{\omega^2-|\xi'|^2}$. The above equation is a second-order non-homogeneous equation similar to the forced harmonic oscillator. Its solution is of the form
$$u(\omega,\xi',z) = A(\omega,\xi')e^{i\lambda z} + B(\omega,\xi')e^{-i\lambda z} + \int_0^z s(z')\,\frac{\sin\lambda(z-z')}{\lambda}\,dz',$$
which we recast as
$$u(\omega,\xi',z) = \Bigl(A+\int_0^z s(z')\frac{e^{-i\lambda z'}}{2i\lambda}\,dz'\Bigr)e^{i\lambda z} + \Bigl(B-\int_0^z s(z')\frac{e^{i\lambda z'}}{2i\lambda}\,dz'\Bigr)e^{-i\lambda z}.$$
The support of $f$, and hence of $s$, is a subset of $(-1,1)$ in the $z$ variable. As $z\to\infty$, we want only outgoing waves (no incoming waves), which imposes that
$$B(\omega,\xi') = \int_0^1 s(z')\frac{e^{i\lambda z'}}{2i\lambda}\,dz'.$$
Similarly, as $z\to-\infty$, the radiation condition imposes that
$$A(\omega,\xi') = \int_{-1}^0 s(z')\frac{e^{-i\lambda z'}}{2i\lambda}\,dz'.$$
At $z=1$, where the information is available, this means
$$u(\omega,\xi',1) = e^{i\lambda}\int_{-1}^1 s(\omega,\xi',z)\frac{e^{-i\lambda z}}{2i\lambda}\,dz = \frac{\omega}{2\lambda}\,e^{i\lambda}\int_{\mathbb{R}} e^{-i\lambda z}\hat f(\xi',z)\,dz.$$
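This mode-by-mode construction can be sanity-checked numerically for a single $(\omega,\xi')$: build $u$ from the $A$, $B$ fixed by the radiation conditions, then verify that $u$ is purely outgoing outside the support of $f$ and matches the value at $z=1$. (The values $\omega=3$, $|\xi'|=1$ and the profile $\hat f$ are illustrative choices, not from the text.)

```python
import numpy as np

# One-mode check: u built from A, B derived above is purely outgoing outside
# the support of f, and satisfies the boundary identity at z = 1.
omega, xi = 3.0, 1.0
lam = np.sqrt(omega**2 - xi**2)              # lambda = sqrt(omega^2 - |xi'|^2)

z = np.linspace(-3.0, 3.0, 6001)
dz = z[1] - z[0]
fhat = np.where(np.abs(z) < 1.0, np.cos(np.pi * z / 2.0) ** 2, 0.0)
s = 1j * omega * fhat                        # source s = i omega fhat

i0, im1, ip1 = 3000, 2000, 4000              # indices of z = 0, -1, +1
Im = np.cumsum(s * np.exp(-1j * lam * z)) * dz / (2j * lam)
Ip = np.cumsum(s * np.exp(+1j * lam * z)) * dz / (2j * lam)
Im -= Im[i0]                                 # Im(z) = int_0^z s e^{-i lam z'}/(2i lam)
Ip -= Ip[i0]

A = -Im[im1]                                 # = int_{-1}^0 s e^{-i lam z'}/(2i lam)
B = Ip[ip1]                                  # = int_0^{1} s e^{+i lam z'}/(2i lam)
u = (A + Im) * np.exp(1j * lam * z) + (B - Ip) * np.exp(-1j * lam * z)

# outgoing only: e^{-i lam z} coefficient vanishes for z > 1, e^{+i lam z} for z < -1
out_r = np.max(np.abs(B - Ip[z > 1.0]))
out_l = np.max(np.abs(A + Im[z < -1.0]))
print(out_r, out_l)                          # both ~0

# boundary identity at z = 1
rhs = omega / (2.0 * lam) * np.exp(1j * lam) * np.sum(fhat * np.exp(-1j * lam * z)) * dz
print(abs(u[ip1] - rhs))                     # ~0
```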
Since $f$ is real valued, we have $\hat f^*(\xi',z) = \hat f(-\xi',z)$. Let us therefore define
$$v(\lambda,\xi') = \begin{cases} \dfrac{2\lambda}{\sqrt{\lambda^2+|\xi'|^2}}\;u\bigl(\sqrt{\lambda^2+|\xi'|^2},\xi',1\bigr)\,e^{-i\lambda}, & \lambda>0,\\[2mm] -\dfrac{2\lambda}{\sqrt{\lambda^2+|\xi'|^2}}\;u\bigl(-\sqrt{\lambda^2+|\xi'|^2},\xi',1\bigr)\,e^{-i\lambda}, & \lambda<0.\end{cases}$$
Then we find that
$$v(\lambda,\xi') = \int_{\mathbb{R}} e^{-i\lambda z}\hat f(\xi',z)\,dz := \hat f(\xi)$$
is the Fourier transform of $f$ in all variables.
Note that only the frequencies $|\omega|>|\xi'|$ are used in the reconstruction. These correspond to the propagating modes. The frequencies $|\omega|<|\xi'|$ are still measured but are not necessary for the reconstruction. These evanescent modes decay exponentially in $z$, and their use might in fact yield unstable reconstructions.
Using the Parseval relation and the fact that $\lambda\,d\lambda = \omega\,d\omega$, we have
$$\|f\|_{L^2(B(0,1))} = c\,\|\hat f\|_{L^2(\mathbb{R}^n)} = c\,\|v\|_{L^2(\mathbb{R}\times\mathbb{R}^{n-1})} \le C\,\Bigl\|\frac{\lambda}{\omega}\,u_{|z=1}\Bigr\|_{L^2(\mathbb{R}\times\mathbb{R}^{n-1})} \le C\,\|u_{|z=1}\|_{L^2(\mathbb{R}\times\mathbb{R}^{n-1})},$$
since $|\frac{\lambda}{\omega}|\le1$. Consequently, we find that
$$\|f\|_{L^2(B(0,1))} \le C\,\|p(t,x',1)\|_{L^2(\mathbb{R}\times\mathbb{R}^{n-1})}.$$
The inverse wave source problem is therefore injective and well posed in the $L^2$ sense. This means that there exists a constant $C>0$ such that
$$\|f\|_{L^2(B(0,1))} \le C\,\|Mf\|_{L^2(\mathbb{R}^n)}.$$
The above inverse wave problem is therefore stable. Measurement errors are not amplified too drastically during the reconstruction, and the inverse wave source problem is consequently a high resolution inverse problem.
4.4 One dimensional inverse coefficient problem
The three inverse problems considered so far in this chapter were linear inverse problems. The first two inverse scattering problems were defined as linearizations of nonlinear inverse scattering problems. The last inverse source problem is linear without any approximation. In this section, we consider a one-dimensional inverse coefficient problem, which is nonlinear and treated as such. This nonlinear inverse problem enjoys the same properties as its linear counterparts: wave equations propagate singularities, and thus many measurement operators based on solutions of wave equations are Lipschitz stable.
Consider the equation
$$\begin{aligned}
&\frac{\partial^2 p}{\partial t^2} - \frac{\partial^2 p}{\partial x^2} + q(x)p = 0, && t>0,\ x>0,\\
&p(t,0) = f(t), && t>0,\\
&p(0,x) = \partial_t p(0,x) = 0, && x>0.
\end{aligned} \tag{4.36}$$
This problem can equivalently be posed for $t\in\mathbb{R}$ by assuming that the source term $f(t)$ is supported in $t\in\mathbb{R}_+$. The solution $p$ then vanishes for $t\le0$ by finite speed of propagation. For the same reason, the solution is supported in the domain $t\ge x\ge0$. Here, $q(x)$ is an unknown bounded potential. We then assume that $g(t)=\partial_x p(t,0)+f'(t)$ is measured at the domain's boundary. This defines the measurement operator
$$L^\infty(0,L)\ni q \mapsto M_f(q) = g \in L^\infty(0,2L). \tag{4.37}$$
The objective is to show that $g(t)$ for $t\in(0,2L)$ uniquely determines $q(x)$ on $(0,L)$ under some conditions on $f(t)$. Moreover, depending on the structure of $f(t)$ at $t=0$, we obtain different stability estimates showing that the reconstruction of $q(x)$ is a well-posed problem in appropriate topologies.
Cauchy problems at $x=0$ and $t=0$. Let us define $p_1(t,x) = p(t,x) - f(t-x)$, where $f(t-x)$ is seen to be the solution of the above problem when $q\equiv0$. Then we find that
$$\begin{aligned}
&\frac{\partial^2 p_1}{\partial t^2} - \frac{\partial^2 p_1}{\partial x^2} = -q(x)p(t,x) := s(t,x), && t\in\mathbb{R},\ x>0,\\
&p_1(t,0) = 0, \quad \frac{\partial p_1}{\partial x}(t,0) = g(t), && t\in\mathbb{R},\\
&p_1(0,x) = \frac{\partial p_1}{\partial t}(0,x) = 0, && x>0.
\end{aligned} \tag{4.38}$$
The above problem may be seen in two ways. It can either be seen as a Cauchy problem for $x>0$ and $t\in\mathbb{R}$ with Cauchy data at $x=0$. The solution is then given by
$$p_1(t,x) = \frac12\int_{t-x}^{t+x} g(\tau)\,d\tau + \frac12\iint_{\Delta(t,x)} q(y)p(\tau,y)\,d\tau\,dy, \tag{4.39}$$
where we have defined the triangular-shaped area
$$\Delta(t,x) = \bigl\{(\tau,y):\ 0<y<x,\ t-(x-y)<\tau<t+(x-y)\bigr\}. \tag{4.40}$$
Alternatively, we can write the above problem as a Cauchy problem with data at $t=0$. We need to ensure that the constraint $p_1(t,0)=0$ is satisfied. In order to do so, let us define $f(t,x) = f(t-x)$ for $t>0$ and $x>0$ and $f(t,x) = -f(t+x)$ for $t>0$ and $x<0$, and let us extend the potential so that $q(-x):=q(x)$ for $x<0$. We then recast the wave equation for $p_1$ as
$$\begin{aligned}
&\frac{\partial^2 p_1}{\partial t^2} - \frac{\partial^2 p_1}{\partial x^2} = -q(x)\bigl(f(t,x)+p_1(t,x)\bigr), && t>0,\ x\in\mathbb{R},\\
&p_1(0,x) = \frac{\partial p_1}{\partial t}(0,x) = 0, && x\in\mathbb{R}.
\end{aligned} \tag{4.41}$$
Since the source term is odd in $x$, the solution to that equation satisfies the constraint $p_1(t,-x) = -p_1(t,x)$, so that $p_1(t,0)=0$, and with $s(t,x) := -q(x)(f(t,x)+p_1(t,x))$ it is given explicitly by
$$p_1(t,x) = \frac12\iint_{\tilde\Delta(t,x)} s(\tau,y)\,dy\,d\tau, \tag{4.42}$$
where $\tilde\Delta(t,x) = \{(\tau,y):\ 0<\tau<t,\ x-(t-\tau)<y<x+(t-\tau)\}$.
Exercise 4.4.1 Check the above two formulas for $p_1(t,x)$.
We thus obtain again that
$$p_1(t,x) = -\frac12\iint_{\tilde\Delta(t,x)} q(y)\bigl(f(\tau,y)+p_1(\tau,y)\bigr)\,d\tau\,dy. \tag{4.43}$$
For $f(t)=\delta(t)$, we observe that $p_1(t,x)$ is a bounded function. This implies that $g(t)$ is also a bounded function. We define here $f(t)=\delta(t)$ as the limit of $f_\varepsilon(t)$ with $f_\varepsilon(t)=\frac1\varepsilon$ for $t\in(0,\varepsilon)$ and $f_\varepsilon(t)=0$ otherwise. We also write $p$ and $p_1$ for the corresponding limits as $\varepsilon\to0$.
Exercise 4.4.2 Show that $p_1$ is indeed a bounded function. Find that the right-hand side in (4.43) with $p_1$ set to $0$ is indeed bounded. Then use an argument based on the Gronwall lemma as indicated below to conclude.
Nonlinear problem for $q(x)$. Let us come back to (4.39) and evaluate it at $t=0$ for $x>0$. Since $p_1(0,x)=0$, we find
$$0 = p_1(0,x) = \frac12\int_{-x}^{x} g(\tau)\,d\tau + \frac12\int_0^x\int_{-(x-y)}^{x-y} q(y)p(\tau,y)\,d\tau\,dy.$$
Differentiating with respect to $x$ yields
$$g(x) + \int_0^x q(y)p(x-y,y)\,dy = 0. \tag{4.44}$$
This may be seen as a nonlinear equation for q since p depends on q.
The same equation may also be obtained from (4.43). Indeed, differentiating in $x$ yields
$$\partial_x p_1(t,x) = \frac12\int_0^t \bigl(s(\tau,x+(t-\tau)) - s(\tau,x-(t-\tau))\bigr)\,d\tau.$$
Evaluated at $x=0$, using the oddness of $s$ in the $x$ variable, this yields
$$\partial_x p_1(t,0) = g(t) = \int_0^t s(\tau,t-\tau)\,d\tau = -\int_0^t q(\tau)p(t-\tau,\tau)\,d\tau, \tag{4.45}$$
which is equivalent to (4.44). The latter may be recast as
$$g(x) + \int_0^x q(y)f(x-2y)\,dy + \int_0^x q(y)p_1(x-y,y)\,dy = 0.$$
Let us now assume that $f(t)=\delta(t)$, as the limit indicated above. Then we find that
$$g(x) + \frac12\,q\Bigl(\frac{x}{2}\Bigr) + \int_0^x q(y)p_1(x-y,y)\,dy = 0.$$
However, by finite speed of propagation, $p(t,x)$ and $p_1(t,x)$ are supported on $t\ge x$. This means that $p_1(x-y,y)$ is supported on $y\le\frac{x}{2}$. As a consequence, changing $x$ into $2x$, we find that for all $x>0$,
$$q(x) = -2g(2x) - 2\int_0^x q(y)p_1(2x-y,y)\,dy. \tag{4.46}$$
This is a nonlinear equation for $q(x)$ of Fredholm type. We define $\tilde g(x) = -2g(2x)$.
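The $\delta$-function step used above, $\int_0^x q(y)f(x-2y)\,dy \to \frac12 q(x/2)$, can be checked with the mollifier $f_\varepsilon$ from the definition of $f(t)=\delta(t)$; the potential $q$ below is a hypothetical smooth example, not from the text:

```python
import numpy as np

# Mollified check of  int_0^x q(y) f_eps(x - 2y) dy -> q(x/2)/2  as eps -> 0,
# with f_eps = (1/eps) 1_{(0, eps)} as in the text.
def q(y):
    return 1.0 + 0.5 * np.sin(3.0 * y)       # hypothetical smooth potential

x = 1.2
y = np.linspace(0.0, x, 200001)
dy = y[1] - y[0]

vals = []
for eps in [0.1, 0.01, 0.001]:
    f_eps = np.where((x - 2.0 * y > 0.0) & (x - 2.0 * y < eps), 1.0 / eps, 0.0)
    vals.append(np.sum(q(y) * f_eps) * dy)   # Riemann sum of the integral
print(vals, 0.5 * q(x / 2.0))                # values approach q(x/2)/2
```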
Error estimates and Gronwall lemma. Let us consider $\tilde g = M_f(\tilde q)$ and $g = M_f(q)$. We define $\delta q = \tilde q - q$ and $\delta p = \tilde p - p$, as well as $\delta g$ and $\delta p_1$ with obvious notation. Then we find that
$$\begin{aligned}
\delta q(x) &= \delta\tilde g(x) - 2\int_0^x \delta q(y)\,\tilde p_1(2x-y,y)\,dy - 2\int_0^x q(y)\,\delta p_1(2x-y,y)\,dy,\\
\delta p_1(t,x) &= \frac12\int_{t-x}^{t+x}\delta g(\tau)\,d\tau + \frac12\iint_{\Delta(t,x)}\bigl(\delta q(y)\,\tilde p(\tau,y) + q(y)\,\delta p(\tau,y)\bigr)\,d\tau\,dy.
\end{aligned} \tag{4.47}$$
Here, we used (4.39) for the expression of $\delta p_1$.
Let us define
$$\Sigma(T) = \bigl\{(x,t)\in\mathbb{R}^2,\ 0\le t\le T,\ x=t\bigr\}\cup\bigl\{(x,t)\in\mathbb{R}^2,\ 0\le t\le T,\ x=2T-t\bigr\}, \tag{4.48}$$
the part of the boundary of $\Delta(T,T)$ that does not include the segment $\{x=0,\ 0<t<T\}$. We next define
$$P(T) = \sup_{\Sigma(T)}|\delta p_1(t,x)|, \qquad Q(T) = \sup_{t\in[0,T]}|\delta q|(t). \tag{4.49}$$
Then we deduce from (4.47) that
$$\begin{aligned}
Q(T) &\le \|\delta g\|_{L^\infty(0,2T)} + C\int_0^T Q(\tau)\,d\tau + C\int_0^T P(\tau)\,d\tau,\\
P(T) &\le T\,\|\delta g\|_{L^\infty(0,2T)} + C\int_0^T Q(\tau)\,d\tau + C\int_0^T P(\tau)\,d\tau.
\end{aligned}$$
Exercise 4.4.3 Verify the above expressions. The main difficulty involves the analysis of the term
$$\iint_{\Delta(t,x)} \delta q(y)\,\delta(\tau-y)\,d\tau\,dy,$$
which has to be seen as the limit with source term $f_\varepsilon$ as $\varepsilon\to0$. It is bounded by $\int_0^T|\delta q|(\tau)\,d\tau$ and hence by $\int_0^T Q(\tau)\,d\tau$.
We deduce that
$$(P+Q)(T) \le (1+T)\,\|\delta g\|_{L^\infty(0,2T)} + C\int_0^T (P+Q)(\tau)\,d\tau,$$
with a constant $C$ uniform in $T$. As an application of the Gronwall lemma, we deduce that
$$(P+Q)(T) \le Ce^{CT}\,\|\delta g\|_{L^\infty(0,2T)},$$
for some positive constant $C$ also independent of $T$.
Exercise 4.4.4 Verify that the Gronwall lemma applies to the above problem and yields
the latter inequality.
This proves the following result:

Theorem 4.4.1 Let $f(t)=\delta(t)$ and let $q$ and $\tilde q$ be bounded potentials. Then $M_\delta(q)=M_\delta(\tilde q)$ implies that $q=\tilde q$, and we have the following error estimate:
$$\|q-\tilde q\|_{L^\infty(0,T)} \le Ce^{CT}\,\|M_\delta(q)-M_\delta(\tilde q)\|_{L^\infty(0,2T)}. \tag{4.50}$$
The constant $C$ is uniform in $T>0$.
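The Gronwall step behind this estimate can be illustrated numerically: for the worst case — equality in $h(T)\le a + C\int_0^T h(\tau)\,d\tau$ — the solution saturates the bound $a\,e^{CT}$. (The values of $a$, $C$, $T$ below are arbitrary; this is a toy illustration, not part of the proof.)

```python
import numpy as np

# Gronwall lemma, equality case: h(t) = a + C int_0^t h(tau) dtau, integrated
# with a left-endpoint rule; h should track (and never exceed) a e^{Ct}.
a, C, T = 0.3, 2.0, 3.0
t = np.linspace(0.0, T, 30001)
dt = t[1] - t[0]
h = np.empty_like(t)
h[0] = a
acc = 0.0
for i in range(1, len(t)):
    acc += h[i - 1] * dt                     # running integral of h
    h[i] = a + C * acc                       # equality in the Gronwall hypothesis
print(h[-1], a * np.exp(C * T))              # the bound a e^{CT} is saturated
```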
Let us now assume that the initial condition is not $f(t)=\delta(t)$ but rather $f_n(t)=\frac{t^n}{n!}$ for $n\ge0$. Then obviously $f_n^{(n+1)}(t)=\delta(t)$. Moreover, since $q$ is independent of $t$, $\partial_t^{n+1}p$ solves the same equation as $p$ with $f_n$ replaced by $\delta(t)$. As a consequence, the measurements $g_n(t)$ with $f=f_n$ are such that $g_n^{(n+1)}(t)=g(t)$, the measurements corresponding to $f=\delta(t)$. This yields:
Corollary 4.4.2 Let $f_n(t)=\frac{t^n}{n!}$ and let $q$ and $\tilde q$ be bounded potentials. Then $M_{f_n}(q)=M_{f_n}(\tilde q)$ implies that $q=\tilde q$, and we have the following error estimate:
$$\|q-\tilde q\|_{L^\infty(0,T)} \le Ce^{CT}\,\|M_{f_n}(q)-M_{f_n}(\tilde q)\|_{W^{n+1,\infty}(0,2T)}. \tag{4.51}$$
The constant $C$ is uniform in $T>0$.
This straightforward corollary indicates that the stability estimate strongly depends on the probing mechanism: the smoother the probing mechanism, the worse the stability estimate. Intuitively, the above result is clear. A signal emitted at $t=0$ propagates to position $x$ during time $t=x$ and then back to $x=0$ at time $t=2x$. Only that signal provides information about $q$ at position $x$. When multiple signals are emitted, superpositions of signals create some blurring. The above corollary quantifies such blurring.
Chapter 5

Inverse Kinematic and Inverse Transport Problems
Several problems of Integral Geometry model the propagation of particles interacting with the underlying medium. This is the case for the Radon transform seen as a transmission tomography problem in (2.1) and for the Attenuated Radon transform in (2.23). Another transport equation was briefly encountered in (3.43) in the analysis of the Generalized Ray transform. These source problems belong to a larger class of inverse kinematic and inverse transport problems that we analyze in this chapter.
The propagation of particles essentially follows two dynamics: a Hamiltonian (of classical mechanics) describes the free propagation of particles in the absence of interactions with the underlying structure, while a scattering operator models how particles are absorbed and scattered when they interact with the underlying medium. The inverse kinematic and inverse transport problems aim to reconstruct the structures of the Hamiltonian and the scattering operator from measurements of particle densities.
The first part of this chapter considers the reconstruction of a simple Hamiltonian of the form H(x, k) = c(x)|k| from travel time boundary measurements. The second part of the chapter assumes the Hamiltonian known with c a constant and considers the reconstruction of the absorption and scattering coefficients from boundary measurements of particle densities.
Scattering: a transition to ill-posedness. Before addressing these inverse problems in more detail, let us comment on the influence of scattering on the structure of an inverse problem and the stability properties of a measurement operator.
The inverse problems we have encountered in the preceding three chapters, to which we can add the simplified M.R.I. description of the first chapter, were well-posed inverse problems in the sense that they involved a stable inversion operator from a Hilbert space to another Hilbert space. Wave propagation (as in diffraction tomography) and particle propagation along straight lines (as in computerized tomography) all generate well-posed inverse problems.
What causes then an inverse problem to be ill-posed? As we mentioned in the introduction, an inverse problem is ill-posed when the measurement operator is a highly smoothing/regularizing operator. We have seen that solving the heat equation forward was a highly regularizing process. The main mechanism explaining its regularizing
property is scattering: using a kinematic description of the diffusion equation, particles represented by Brownian motions scatter infinitely often before they exit the domain of interest.
Scattering is one of the main mechanisms that regularizes solutions to PDEs and hence renders inverse problems ill-posed. The linear transport equation, also known as the linear Boltzmann equation or the radiative transfer equation, offers an ideal example of a transition from non-scattering to scattering environments.
5.1 Inverse Kinematic Problem
In the absence of scattering or absorption, which we assume in this section, the propagation of particles is modeled by the following Hamiltonian dynamics:

    dx/dt = ẋ = ∇_k H(x(t), k(t)),   x(0) = x₀,
    dk/dt = k̇ = −∇_x H(x(t), k(t)),   k(0) = k₀.   (5.1)
As we did in the preceding chapter, we consider the simplest example of a Hamiltonian:

    H(x, k) = c(x)|k|.   (5.2)

This Hamiltonian models the propagation of high frequency acoustic waves in a medium with a spatially varying sound speed c(x). It also models the propagation of light in media with varying indices of refraction when polarization effects are neglected. The index of refraction is defined as n(x) = c/c(x), with c the speed of light in a vacuum. The same Hamiltonian may be used to model the propagation of seismic (shear) waves in the Earth when pressure waves and polarization properties of shear waves are neglected.
The equation (5.1) models the propagation of a single particle knowing its initial conditions (x₀, k₀) at time 0. The density u(t, x, k) of an ensemble of such particles then solves the corresponding Liouville equation

    ∂u/∂t + {H, u} = 0,  t > 0,   u(0, x, k) = u₀(x, k),   (5.3)

where u₀(x, k) is the density of particles at time t = 0 and where

    {H, u} := ∇_k H · ∇_x u − ∇_x H · ∇_k u = c(x) k̂ · ∇_x u − |k| ∇c(x) · ∇_k u,   (5.4)

with k̂ = k/|k|, is the Poisson bracket of the Hamiltonian H and the density u. The inverse kinematic problem thus concerns the reconstruction of the Hamiltonian H from measurements of the density u at the boundary of a domain of interest.
In the following section, we rst consider a one-dimensional version of the inverse
kinematic problem in a geometry with spherical symmetry. This inverse kinematic
problem was solved by Herghlotz in 1905 as a means to understand the inner structure
of the Earth from travel time measurements. We then revisit the Mukhometov method
to solve the inverse kinematic problem in two dimensions of space.
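The dynamics (5.1) is straightforward to integrate numerically, and its invariants provide a useful check. A minimal sketch (the quadratic speed profile, the step size, and the initial data are illustrative choices): H = c(x)|k| is conserved along rays and, for a radial speed c = c(|x|), so is the angular momentum x₁k₂ − x₂k₁, as derived in the next section.

```python
import numpy as np

def trace_ray(x0, k0, c, grad_c, dt=1e-3, steps=2000):
    # RK4 integration of (5.1) with H = c(x)|k|:
    #   dx/dt = c(x) k/|k|,   dk/dt = -|k| grad c(x)
    def rhs(y):
        x, k = y[:2], y[2:]
        nk = np.linalg.norm(k)
        return np.concatenate([c(x) * k / nk, -nk * grad_c(x)])
    y = np.concatenate([x0, k0])
    for _ in range(steps):
        k1 = rhs(y); k2 = rhs(y + 0.5*dt*k1)
        k3 = rhs(y + 0.5*dt*k2); k4 = rhs(y + dt*k3)
        y = y + (dt/6.0) * (k1 + 2*k2 + 2*k3 + k4)
    return y[:2], y[2:]

# radial speed profile (illustrative): c(r) = 2 - r^2/2, so grad c(x) = -x
c = lambda x: 2.0 - 0.5 * (x @ x)
grad_c = lambda x: -x

x0, k0 = np.array([0.9, 0.0]), np.array([0.0, 1.0])
x1, k1 = trace_ray(x0, k0, c, grad_c)

# invariants: Hamiltonian and (for radial c) angular momentum
H0, H1 = c(x0) * np.linalg.norm(k0), c(x1) * np.linalg.norm(k1)
L0, L1 = x0[0]*k0[1] - x0[1]*k0[0], x1[0]*k1[1] - x1[1]*k1[0]
```

Both invariants are preserved to the accuracy of the RK4 scheme, which gives a practical check on any ray tracer used to compute travel times.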
5.1.1 Spherical symmetry
We wish to reconstruct the velocity field c(r) inside the Earth assuming spherical symmetry of the Hamiltonian:

    H(x, k) = c(r)|k|  with  r = |x|.

Our data is the time it takes for a particle following (5.1), emitted at the surface of a sphere with a given direction, to come back to the surface of that sphere. To simplify, we assume that the Earth radius is normalized to 1. We also assume that c(1) is known. We want to reconstruct c(r) from the time it takes to travel along all possible geodesics (the trajectories of (5.1)). Because the geodesics depend on c(r), the travel time is a nonlinear functional of the velocity profile c(r).
Let us denote x = r x̂ and k = |k| k̂. The Hamiltonian dynamics take the form

    ẋ = c(r) k̂,   k̇ = −c′(r) |k| x̂.   (5.5)
We are interested in calculating the travel times between points at the boundary of the domain r = |x| < 1. This implies integrating dt along particle trajectories. Since we want to reconstruct c(r), we perform a change of variables from dt to dr. This will allow us to obtain integrals of the velocity c(r) along curves. The objective will then be to obtain a reconstruction formula for c(r).
In order to perform the change of variables from dt to dr, we need to know where the particles are. Indeed, the change of variables should only involve the position r and no longer the time t. This requires solving the problem t → r(t). As usual, it is useful to find invariants of the dynamical system. The first invariant is, as always, the Hamiltonian itself:

    dH(x(t), k(t))/dt = 0,
as can be deduced from (5.1). The second invariant is the angular momentum and is obtained as follows. Let us first introduce the basis (x̂, x̂⊥) for two-dimensional vectors (this is the usual basis (e_r, e_θ) in polar coordinates). We decompose k = k_r x̂ + k_θ x̂⊥ and k̂ = k̂_r x̂ + k̂_θ x̂⊥. We verify that

    ṙ = c(r) k̂_r   since   ẋ = ṙ x̂ + r θ̇ x̂⊥ = c(r) k̂.   (5.6)

We also verify that

    d(r k_θ)/dt = d(x⊥ · k)/dt = ẋ⊥ · k + x⊥ · k̇ = c(r) k̂⊥ · k − c′(r) |k| x⊥ · x̂ = 0.   (5.7)

This is the conservation of angular momentum, which implies that

    r(t) k_θ(t) = k_θ(0),

since r(0) = 1.
By symmetry, we observe that the travel time is decomposed into two identical components: the time it takes to go down into the Earth until k̂_r = 0, and the time it takes to go back up. On the way up to the surface, k̂_r is non-negative. Let us denote p = k̂_θ(0) with 0 < p < 1. The lowest point is reached when k̂_θ = 1. This occurs at a point r_p such that

    r_p / c(r_p) = p / c(1).
To make sure that such a point is uniquely defined, we impose that the function r c^{−1}(r) be increasing on (0, 1). This is equivalent to the constraint

    c′(r) < c(r)/r,   0 < r < 1.   (5.8)

This assumption ensures the uniqueness of the point r_p such that p c(r_p) = c(1) r_p.
Since the Hamiltonian c(r)|k| is conserved, we deduce that

    ṙ = c(r) √(1 − k̂_θ²) = c(r) √( 1 − ( k̂_θ(0) c(r) / (r c(1)) )² ),

so that

    dt = dr / ( c(r) √( 1 − ( k̂_θ(0) c(r) / (r c(1)) )² ) ).   (5.9)
Notice that the right-hand side depends only on r and no longer on functions such as k̂_r that depend on time. The travel time as a function of p = k̂_θ(0) is now given by twice the time it takes to go back up to the surface:

    T(p) = 2 ∫_{r_p}^{1} dr / ( c(r) √( 1 − ( p c(r) / (r c(1)) )² ) ).   (5.10)
Our measurements are T(p) for 0 < p < 1 and our objective is to reconstruct c(r) on (0, 1). We need a theory to invert this integral transform. Let us define

    u = c²(1) r² / c²(r)   so that   du = (2r c²(1) / c²(r)) ( 1 − r c′(r)/c(r) ) dr.
Upon using this change of variables, we deduce that

    T(p) = 2 ∫_{p²}^{1} ( (dr/du)(u/r) )(u) du / √(u − p²).   (5.11)
It turns out that the function in parentheses in the above expression can be reconstructed from T(p). This is an Abel integral. Before inverting the integral, we need to ensure that the change of variables r → u(r) is a diffeomorphism (a differentiable function with differentiable inverse). This requires du/dr to be positive, which in turn is equivalent to (5.8). The constraint (5.8) is therefore useful both to obtain the existence of a minimal point r_p and to ensure that the above change of variables is admissible. The constraint essentially ensures that no rays are trapped in the dynamics, so that energy entering the system will eventually exit it. We can certainly consider velocity profiles such that
the energy is attracted toward the origin. In such a situation, the velocity profile cannot be reconstructed.
Let us denote f = (dr/du)(u/r). We will show in the following section that f(u) can be reconstructed from T(p) and is given by

    f(u) = −(1/(2π)) d/du ∫_u^1 T(√p) / √(p − u) dp.   (5.12)
Now we reconstruct r(u) from the relations

    (f(u)/u) du = dr/r,   u(1) = 1,   so that   r(u) = exp( −∫_u^1 f(v)/v dv ).
Upon inverting this diffeomorphism, we obtain u(r) and g(r) = f(u(r)). Since

    g(r) = (1/2) · 1 / ( 1 − r c′(r)/c(r) ),

we now know r c′/c, hence (log c)′. It then suffices to integrate (log c)′ starting from r = 1, where c(1) is known, to obtain c(r) everywhere. This concludes the proof of the reconstruction.
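The forward map c ↦ T of (5.10) can be evaluated numerically. A minimal sketch, assuming c(1) = 1: the turning radius r_p is found by bisection, which relies on assumption (5.8) (r/c(r) increasing), and the substitution r = √(r_p² + s²) removes the inverse square-root singularity of the integrand at r = r_p. For c ≡ 1 the rays are straight chords and T(p) = 2√(1 − p²), which serves as a sanity check.

```python
import numpy as np

def turning_radius(p, c):
    # solve r / c(r) = p with c(1) = 1, using that r/c(r) is
    # increasing on (0, 1) by assumption (5.8)
    lo, hi = 1e-12, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid / c(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def travel_time(p, c, n=4000):
    # T(p) = 2 int_{r_p}^1 dr / (c(r) sqrt(1 - (p c(r)/r)^2)), c(1) = 1;
    # the substitution r = sqrt(r_p^2 + s^2) makes the integrand bounded at r_p
    rp = turning_radius(p, c)
    smax = np.sqrt(1.0 - rp**2)
    s = (np.arange(n) + 0.5) * smax / n          # midpoint rule
    r = np.sqrt(rp**2 + s**2)
    integrand = (s / r) / (c(r) * np.sqrt(1.0 - (p * c(r) / r)**2))
    return 2.0 * np.sum(integrand) * smax / n

# sanity check: for c = 1 the rays are straight lines and T(p) = 2 sqrt(1 - p^2)
T = travel_time(0.5, lambda r: 1.0 + 0.0 * r)
```

For a non-constant profile satisfying (5.8), the same routine provides the data T(p) to which the Abel inversion of the next section can then be applied.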
5.1.2 Abel integral and Abel transform
For a smooth function f(x) (continuous will do) defined on the interval (0, 1), we define the Abel transform as

    g(x) = ∫_x^1 f(y) / (y − x)^{1/2} dy.   (5.13)
This transform can be inverted as follows:
Lemma 5.1.1 The Abel transform (5.13) admits the following inversion:

    f(y) = −(1/π) d/dy ∫_y^1 g(x) / (x − y)^{1/2} dx.   (5.14)
Proof. Let us calculate

    ∫_z^1 g(x)/(x − z)^{1/2} dx = ∫_z^1 ∫_x^1 f(y) / ( (x − z)^{1/2} (y − x)^{1/2} ) dy dx = ∫_z^1 f(y) k(z, y) dy.

The kernel k(z, y) is given by

    k(z, y) = ∫_z^y dx / ( (x − z)^{1/2} (y − x)^{1/2} ) = ∫_0^1 dx / √(x(1 − x)) = ∫_{−1}^1 dx / √(1 − x²) = π.

The latter equality comes from differentiating arccos. Thus we have

    ∫_z^1 g(x)/(x − z)^{1/2} dx = π ∫_z^1 f(y) dy.

Upon differentiating both sides with respect to z, we obtain the desired result.
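The computations in the proof are easy to check numerically. A minimal sketch with the illustrative pair f ≡ 1 and g(x) = 2√(1 − x), its exact Abel transform: the substitution x = z + s² removes the integrable singularity, the identity ∫_z^1 g(x)(x − z)^{−1/2} dx = π ∫_z^1 f(y) dy from the proof is verified by quadrature, and (5.14) is applied by finite differences.

```python
import numpy as np

def abel(f, x, n=4000):
    # g(x) = int_x^1 f(y)(y - x)^{-1/2} dy, substituting y = x + s^2  (5.13)
    smax = np.sqrt(1.0 - x)
    s = (np.arange(n) + 0.5) * smax / n
    return 2.0 * np.sum(f(x + s * s)) * smax / n

def abel_of_abel(g, z, n=4000):
    # int_z^1 g(x)(x - z)^{-1/2} dx, substituting x = z + s^2
    smax = np.sqrt(1.0 - z)
    s = (np.arange(n) + 0.5) * smax / n
    return 2.0 * np.sum(g(z + s * s)) * smax / n

def abel_inverse(g, y, n=4000, h=1e-3):
    # f(y) = -(1/pi) d/dy int_y^1 g(x)(x - y)^{-1/2} dx   (5.14)
    return -(abel_of_abel(g, y + h, n) - abel_of_abel(g, y - h, n)) / (2 * h * np.pi)

f = lambda y: 1.0 + 0.0 * y             # test function
g = lambda x: 2.0 * np.sqrt(1.0 - x)    # its exact Abel transform
```

Applying the Abel transform twice to f ≡ 1 indeed yields π(1 − z), and differentiating it recovers f, in agreement with the proof above.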
We can also ask ourselves how well-posed the inversion of the Abel transform is. Since the transforms are defined on bounded intervals, using the Hilbert scale introduced in Chapter 1 would require a few modifications. Instead, we count the number of differentiations. The reconstruction formula shows that the Abel transform is applied once more to g before the result is differentiated. We can thus conclude that the Abel transform regularizes as one half of an integration (since it takes one differentiation to compensate for two Abel transforms). We therefore deduce that the Abel transform is a smoothing operator of order α = 1/2 in the terminology introduced in Chapter 1.
5.1.3 Kinematic Velocity Inverse Problem
Let us generalize the construction of the Earth velocity from boundary measurements to the setting of a two-dimensional Earth.
As in Section 3.4.1, we consider a bounded domain X ⊂ R² with smooth boundary ∂X parameterized by 0 ≤ τ ≤ T and points x = S(τ), with S(0) = S(T) and |Ṡ(τ)| = 1. Local travel time and distance are related by

    dσ² = (1/c²(x)) (dx² + dy²) = n²(x) (dx² + dy²),

which defines a Riemannian metric with tensor proportional to the 2×2 identity matrix.
which denes a Riemannian metric with tensor proportional to the 22 identity matrix.
The geodesics of that metric generate a family of curves (t, s, ) as in (3.1) in
Chapter 3. We assume that the family of geodesics satises the hypotheses stated
there, namely the existence of an inverse in (3.2). For a point x in

X and 0 T,
we recall that

(x, ) is the unique curve joining x and S().
We are interested in reconstructing n(x) from knowledge of

    G(τ₁, τ₂) = ∫_{z(S(τ₁), τ₂)} dσ = ∫_{z(S(τ₁), τ₂)} n(x) √(dx² + dy²),   (5.15)

for all possible boundary points τ₁ and τ₂, where z(τ₁, τ₂) is an extremal for the above functional, i.e., a geodesic of the Riemannian metric dσ². Notice that since the extremals (the geodesics) of the Riemannian metric depend on n(x), the above problem is non-linear, as in the one-dimensional case.
Let G_k for k = 1, 2 correspond to measurements for two slownesses n_k, k = 1, 2. We then have the following result.

Theorem 5.1.2 Let n_k be smooth positive functions on X such that the associated families of extremals are sufficiently regular. Then n_k can be uniquely reconstructed from G_k(τ₁, τ₂) and we have the stability estimate

    ‖n₁ − n₂‖_{L²(X)} ≤ C ‖ ∂(G₁ − G₂)/∂τ₁ ‖_{L²((0,T)×(0,T))}.   (5.16)
Proof. Even though the inverse kinematic problem is non-linear, the proof is similar
to that of Theorem 3.4.1 for the corresponding linear inverse problem. The reason is
that the same energy estimate can be used in both cases. The regular family of curves
is defined as the geodesics of the Riemannian metric dσ². Let us define the travel time

    t(x, τ) = ∫_{z(x,τ)} n ds,   (5.17)

so that as before t(S(τ₁), τ₂) = G(τ₁, τ₂). We deduce as in (3.43) that

    ∇_x t = n(x) θ̂.   (5.18)

We recall that θ̂(x, τ) = (cos θ(x, τ), sin θ(x, τ)) is the unit tangent vector to the curve z(x, τ) at x, orientated such that (5.18) holds.
Because t is an integration along an extremal curve of the Riemannian metric (this is where we use that the curves are geodesics), its first variation with respect to the curve vanishes, so that

    ∇_x t = n(x) θ̂   and   |∇_x t|²(x, τ) = n²(x).

Upon differentiating the latter equality with respect to τ, we obtain

    ∂/∂τ |∇_x t|² = 0.
Let us now define u = t₁ − t₂, the difference of the travel times for the two possible sound speeds c₁ and c₂, so that ∇u = n₁θ̂₁ − n₂θ̂₂. We deduce from the above expression that

    ∂/∂τ ( ∇u · ( θ̂₁ + (n₂/n₁) θ̂₂ ) ) = 0.
We multiply the above expression by 2 θ̂₁⊥ · ∇u and express the product in divergence form. We obtain as in the preceding section that

    2 θ̂₁⊥·∇u ∂/∂τ(θ̂₁·∇u) − ∂/∂τ( θ̂₁⊥·∇u θ̂₁·∇u ) = (∂θ₁/∂τ) |∇u|² + ∂₁( ∂₂u ∂_τ u ) − ∂₂( ∂₁u ∂_τ u ).
We now show that the second contribution can also be put in divergence form. More precisely, since n₁ and n₂ are independent of τ, we obtain

    2 θ̂₁⊥·∇u ∂/∂τ( (n₂/n₁) θ̂₂·∇u ) = 2 θ̂₁⊥·(n₁θ̂₁ − n₂θ̂₂) ∂/∂τ( n₂ θ̂₁·θ̂₂ − n₂²/n₁ )
        = −2 n₂² θ̂₁⊥·θ̂₂ ∂/∂τ( θ̂₁·θ̂₂ ) = 2 n₂² sin²(θ₂ − θ₁) ∂(θ₂ − θ₁)/∂τ
        = ∂/∂τ [ n₂² ( (θ₂ − θ₁) − sin(2(θ₂ − θ₁))/2 ) ].
The integration of the above term over X × (0, T) thus yields a vanishing contribution. Following the same derivation as in Chapter 3, we deduce that

    ∫₀ᵀ ∫_X (∂θ₁/∂τ) |∇u|² dx dτ = ∫₀ᵀ ∫₀ᵀ (∂G/∂τ₁)(τ₁, τ₂) (∂G/∂τ₂)(τ₁, τ₂) dτ₁ dτ₂,   (5.19)

where we have defined G = G₁ − G₂. To conclude the proof, notice that
    ∇u·∇u = |n₁θ̂₁ − n₂θ̂₂|² = n₁² + n₂² − 2n₁n₂ θ̂₁·θ̂₂ ≥ n₁² + n₂² − 2n₁n₂ = (n₁ − n₂)²,

since both n₁ and n₂ are non-negative. With (5.19), this implies that n₁ = n₂ when G₁ = G₂, and using again the Cauchy-Schwarz inequality yields the stability estimate (5.16).
5.2 Forward transport problem
So far, the kinetic model accounted for spatial changes in the speed of propagation but not for scattering or absorption mechanisms. We recall that, physically, the transport equation models high frequency waves or particles propagating in scattering environments. With the specific form of the Hamiltonian considered in (5.2), and in the time-independent setting to simplify, the transport equation takes the form

    c(x) (v/|v|) · ∇_x u − |v| ∇c(x) · ∇_v u + σ(x) u = ∫_{R^d} k(x, v′, v) u(x, v′) δ( c(x)|v| − c(x)|v′| ) dv′,   (5.20)
where boundary conditions are imposed at the boundary of a domain of interest. Here, u(x, v) still denotes the density of particles at position x with velocity v, and c(x) is the speed of propagation. Since we are primarily interested in scattering effects in this chapter, we assume that c(x) = 1 is constant and normalized. The equation thus becomes

    (v/|v|) · ∇_x u + σ(x) u = ∫_{R^d} k(x, v′, v) u(x, v′) δ(ω − |v′|) dv′.   (5.21)

Here, ω = c|v| is the frequency (e.g., the color for light) of the propagating waves. Scattering is assumed to be elastic, so that ω = c|v| is preserved by scattering and hence wave packets with different frequencies satisfy uncoupled equations. We also normalize ω = 1, so that |v| = ω = 1. In other words, v is now the direction of propagation of the wave packets (photons). We thus have an equation of the form
    v · ∇_x u + σ(x) u = ∫_{S^{n−1}} k(x, v′, v) u(x, v′) dv′,   (5.22)

where x ∈ R^n and v ∈ S^{n−1}, the unit sphere in R^n. It remains to describe the parameters participating in scattering: σ(x), the attenuation (a.k.a. total absorption) coefficient, and k(x, v′, v), the scattering coefficient. σ(x) models the amount of particles that are either absorbed or scattered per unit distance of propagation. This is the same coefficient already encountered in CT and SPECT. Unlike in high-energy CT and SPECT, at lower energies such as for visible light, many absorbed photons are re-emitted into another direction, i.e., scattered. Then k(x, v′, v) gives the density of particles scattered into direction v from a direction v′. The right-hand side in (5.22) corresponds to the re-emission of scattered particles into the direction v.
In an inverse transport problem, σ(x) and k(x, v′, v) are the unknown coefficients. They have to be reconstructed from knowledge of u(x, v) measured, say, at the boundary of the domain of interest. When k ≡ 0, this is Computerized Tomography, where the appropriate logarithm of u provides line integrals of σ.
To be more specific, we need to introduce ways to probe the domain of interest X and to model measurements. In the most general setting, photons can enter into X at any point x ∈ ∂X and with any incoming direction, and photon densities can then be measured at any point x ∈ ∂X and for any outgoing direction. The sets of incoming conditions Γ₋ and outgoing conditions Γ₊ are defined by

    Γ_± = { (x, v) ∈ ∂X × V  s.t.  ±v · ν(x) > 0 },   (5.23)
where ν(x) is the outward normal vector to ∂X at x ∈ ∂X and V = S^{n−1}. Denoting by g(x, v) the incoming boundary conditions, we then obtain the following transport equation:

    v · ∇_x u + σ(x) u = ∫_{S^{n−1}} k(x, v′, v) u(x, v′) dv′,   (x, v) ∈ X × V,
    u_{|Γ₋}(x, v) = g(x, v),   (x, v) ∈ Γ₋.   (5.24)
From the functional analysis point of view, it is natural to consider the L¹ norm of photon densities, which essentially counts numbers of particles (the L¹ norm of the density on a domain is the number of particles inside that domain). Let us introduce the necessary notation.
We say that the optical parameters (σ, k) are admissible when

    0 ≤ σ ∈ L^∞(X),
    0 ≤ k(x, v′, ·) ∈ L¹(V)  a.e. in  X × V,
    σ_s(x, v′) := ∫_V k(x, v′, v) dv ∈ L^∞(X × V).   (5.25)

Here, σ_s is also referred to as the scattering coefficient. In most applications, σ_s = σ_s(x) is independent of v′.
We define the times of escape of free-moving particles from X as

    τ_±(x, v) = inf{ s > 0 | x ± s v ∉ X }   (5.26)

and τ(x, v) = τ₊(x, v) + τ₋(x, v). On the boundary sets Γ_±, we introduce the measure dξ(x, v) = |v · ν(x)| dμ(x) dv, where dμ(x) is the surface measure on ∂X.
We define the following Banach space

    W := { u ∈ L¹(X × V) | v · ∇_x u ∈ L¹(X × V),  τ^{−1} u ∈ L¹(X × V) },   (5.27)

with its natural norm. We recall that τ is defined below (5.26). We have the following trace formula [29]:

    ‖f_{|Γ_±}‖_{L¹(Γ_±, dξ)} ≤ ‖f‖_W,   f ∈ W.   (5.28)
This allows us to introduce the following lifting operator:

    Jg(x, v) = exp( −∫_0^{τ₋(x,v)} σ(x − sv, v) ds ) g( x − τ₋(x, v) v, v ).   (5.29)

It is proved in [29] that J is a bounded operator from L¹(Γ₋, dξ) to W. Note that Jg is the solution u₀ of

    v · ∇_x u₀ + σ(x) u₀ = 0,  (x, v) ∈ X × V;   u₀ = g,  (x, v) ∈ Γ₋.
Exercise 5.2.1 Prove this. This is the same calculation as for the X-ray transform.
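A numerical counterpart of this exercise (a sketch; the attenuation profile and the ray data are illustrative choices): the quadrature evaluation of (5.29) agrees with an RK4 integration of v · ∇u + σu = 0 reduced to the ODE du/dt = −σu along the characteristic from the entry point to x.

```python
import numpy as np

sigma = lambda p: 1.0 + 0.5 * p[0]**2          # illustrative attenuation
x = np.array([0.3, 0.2]); v = np.array([1.0, 0.0])
tau, g = 0.8, 1.0                              # tau_-(x, v) and boundary datum g

# J g from (5.29): exponential of minus the line integral of sigma (midpoint rule)
n = 2000
s = (np.arange(n) + 0.5) * tau / n
Jg = np.exp(-sum(sigma(x - si * v) for si in s) * tau / n) * g

# the same value from an RK4 solve of du/dt = -sigma(x - (tau - t) v) u, u(0) = g,
# which integrates the attenuation ODE along the characteristic
def f(t, u_):
    return -sigma(x - (tau - t) * v) * u_

m = 2000; h = tau / m; u = g
for i in range(m):
    t = i * h
    k1 = f(t, u); k2 = f(t + h/2, u + h/2 * k1)
    k3 = f(t + h/2, u + h/2 * k2); k4 = f(t + h, u + h * k3)
    u = u + h/6 * (k1 + 2*k2 + 2*k3 + k4)
```

Both evaluations approximate the same attenuated value, which is the computation behind the X-ray transform as well.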
Let us next define the bounded operators

    Ku(x, v) = ∫_0^{τ₋(x,v)} exp( −∫_0^t σ(x − sv, v) ds ) ∫_V k(x − tv, v′, v) u(x − tv, v′) dv′ dt,
    LS(x, v) = ∫_0^{τ₋(x,v)} exp( −∫_0^t σ(x − sv, v) ds ) S(x − tv, v) dt,   (5.30)

for (x, v) ∈ X × V. Note that LS is the solution u_S of

    v · ∇_x u_S + σ(x) u_S = S,  (x, v) ∈ X × V;   u_S = 0,  (x, v) ∈ Γ₋.
Exercise 5.2.2 Prove this.
Note that

    Ku(x, v) = L( ∫_V k(x, v′, v) u(x, v′) dv′ )(x, v),

which allows us to handle the right-hand side in (5.24). Looking for solutions in W, the integro-differential equation (5.24) is thus recast as

    (I − K) u = Jg.   (5.31)
Exercise 5.2.3 Prove this.
Then we have the following result [13, 29].

Theorem 5.2.1 Assume that

    (I − K) admits a bounded inverse in L¹(X × V, τ^{−1} dx dv).   (5.32)

Then the integral equation (5.31) admits a unique solution u ∈ W for g ∈ L¹(Γ₋, dξ). Furthermore, the albedo operator

    A : L¹(Γ₋, dξ) → L¹(Γ₊, dξ),   g ↦ Ag = u_{|Γ₊},   (5.33)

is a bounded operator.

The invertibility condition (5.32) holds under either of the following assumptions:

    σ_a := σ − σ_s ≥ 0,   (5.34)
    ‖τ σ_s‖_∞ < 1.   (5.35)
We shall not prove this theorem here. The salient features are that the transport equation is well-posed provided that (5.32) is satisfied, which is not necessarily true for arbitrary admissible coefficients (σ, k). The conditions (5.34) and (5.35) are sufficient conditions for (5.32) to be satisfied. The first condition is the most natural for us and states that the particles created by scattering into v by k(x, v′, v) are particles that are lost for direction v′. In other words, the scattering mechanism does not create particles. This is quite natural for photon propagation. In nuclear reactor physics, however, several neutrons may be created by fission for each incoming scattering neutron. There are applications in which (5.34) is therefore not valid. In most medical and geophysical imaging applications, however, (5.34) holds and the transport solution exists. Note that σ_a = σ − σ_s is the absorption coefficient and corresponds to a measure of the particles that are lost for direction v′ and do not reappear in any direction v (i.e., particles that are absorbed).
Using these operators, we may recast the transport solution as

    u = Jg + KJg + (I − K)^{−1} K² Jg,   (5.36)

where u₀ := Jg is the ballistic component, u₁ := KJg the single scattering component, and u₂ := u − u₀ − u₁ = (I − K)^{−1} K² Jg the multiple scattering component.
Note that when the problem is subcritical, its solution may be expressed in terms of the following Neumann expansion in L¹(X × V):

    u = Σ_{m=0}^∞ K^m Jg.   (5.37)
The contribution m = 0 is the ballistic part of u, the contribution m = 1 the single scattering part of u, and so on. It is essentially this decomposition of the transport solution into orders of scattering that allows us to stably reconstruct the optical parameters in the following sections. The above Neumann series expansion has an additional benefit: since the optical parameters are non-negative, each term in the series is non-negative provided that g and S are non-negative, so that the transport solution itself is non-negative. A little more work allows us to prove the maximum principle, which states that u in X × V is bounded a.e. by the (essential) supremum of g on Γ₋ when S ≡ 0.
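The algebra behind (5.31) and the expansions (5.36)-(5.37) can be illustrated on a finite-dimensional analogue in which K is a nonnegative matrix and subcriticality reads ‖K‖ < 1 (all sizes and values below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
K = np.abs(rng.standard_normal((n, n)))
K *= 0.9 / np.linalg.norm(K, 2)            # enforce subcriticality: ||K|| < 1
Jg = np.abs(rng.standard_normal(n))        # nonnegative ballistic term J g

# direct solve of (I - K) u = J g, as in (5.31)
u_direct = np.linalg.solve(np.eye(n) - K, Jg)

# Neumann expansion u = sum_m K^m J g, as in (5.37)
u_series = np.zeros(n)
term = Jg.copy()
for _ in range(300):
    u_series += term
    term = K @ term

# decomposition (5.36) into ballistic + single + multiple scattering
u0, u1 = Jg, K @ Jg
u2 = u_direct - u0 - u1
```

The decomposition u = u₀ + u₁ + u₂ follows from the identity (I − K)^{−1} = I + K + (I − K)^{−1}K², and the nonnegativity of each Neumann term mirrors the positivity of the transport solution discussed above.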
Finally, the albedo operator A, which maps incoming conditions to outgoing densities, models our measurements. We control the fluxes of particles on Γ₋ and obtain information about X by measuring the density of particles on Γ₊. This allows us to define the measurement operator of inverse transport. Let

    X = { (σ, k) such that 0 ≤ σ ∈ L^∞(X),  0 ≤ k ∈ L^∞(X × V × V),  σ_s ≤ σ },   (5.38)

and let Y = L(L¹(Γ₋, dξ), L¹(Γ₊, dξ)). Then we define the measurement operator M:

    M : X ∋ (σ, k) ↦ M(σ, k) = A[(σ, k)] ∈ Y,   (5.39)

where A[(σ, k)] is the albedo operator constructed in (5.33) with coefficients (σ, k) in (5.24). Note that the measurement operator, as for the Calderón problem in (1.14), maps a set of coefficients to an operator (the albedo operator) that depends on those coefficients.
The main question of inverse transport consists of knowing what can be reconstructed
in (, k) from knowledge of the full operator M or knowledge of only parts of the operator
M. In these notes, we shall mostly be concerned with the full measurement operator.
5.3 Inverse transport problem
One of the main results for inverse transport is the decomposition (5.36). The first term u₀ := Jg is the ballistic component and corresponds to the setting of vanishing scattering. It is therefore the term used in CT and the X-ray transform. It turns out that this term is more singular, in a sense that will be made precise below, than the other contributions. It can therefore be extracted from the measurement operator and provides the X-ray transform of the attenuation coefficient σ.
The second term in (5.36), u₁ := KJg, is the single scattering component of the transport solution. Finally, u₂ := u − u₀ − u₁ = (I − K)^{−1} K² Jg is the multiple scattering component, which corresponds to particles that have interacted at least twice with the underlying medium. A fundamental property of the transport equation is that single scattering is also more singular than multiple scattering in dimension three (and higher dimensions), but not in dimension two. We shall describe in more detail below in which sense single scattering is more singular. The main conclusion, however, is that the single scattering contribution can also be extracted from the full measurement operator in dimension n ≥ 3. As we shall see, single scattering provides a linear operator that allows us to invert for the scattering coefficient k once σ is known.
Multiple scattering is then less singular than ballistic and single scattering. Intuitively, this means that multiple scattering contributions are smoother functions. In some sense, after multiple scattering, we do not expect the density to depend too much on the exact location of the scattering events. Multiply scattered particles visit a large domain and hence are less specific about the scenery they have visited.
5.3.1 Decomposition of the albedo operator and uniqueness result
Following (5.36), we decompose the albedo operator as

    Ag = Jg_{|Γ₊} + KJg_{|Γ₊} + K²(I − K)^{−1}Jg_{|Γ₊} := A₀g + A₁g + A₂g.   (5.40)

We denote by α the Schwartz kernel of the albedo operator A:

    Ag(x, v) = ∫_{Γ₋} α(x, v, y, w) g(y, w) dμ(y) dw.

Any linear operator, such as the albedo operator, admits such a kernel representation. Knowledge of the operator A is equivalent to knowledge of its kernel α. The decomposition for A then translates into the decomposition for α:

    α = α₀ + α₁ + α₂.

Here, α₀ corresponds to the ballistic part of the transport solution, α₁ corresponds to the single scattering part of the transport solution, and α₂ corresponds to the rest of the transport solution.
After some algebra, we have the following expressions for the first two contributions in the time independent case:

    α₀(x, v, y, w) = exp( −∫_0^{τ₋(x,v)} σ(x − sv, v) ds ) δ_v(w) δ_{x − τ₋(x,v)v}(y),   (5.41)

    α₁(x, v, y, w) = ∫_0^{τ₋(x,v)} exp( −∫_0^t σ(x − sv, v) ds − ∫_0^{τ₋(x−tv,w)} σ(x − tv − sw, w) ds ) k(x − tv, w, v) δ_{x − tv − τ₋(x−tv,w)w}(y) dt.   (5.42)
Here, δ_v(w) denotes the delta function at v on the unit sphere, i.e., the distribution such that

    δ_v(f) = ∫_{S^{n−1}} f(w) δ_v(w) dw = f(v)

for f a continuous function on S^{n−1}. Similarly, δ_x(y) denotes the delta function at x on the boundary ∂X, i.e., the distribution such that

    δ_x(f) = ∫_{∂X} f(y) δ_x(y) dμ(y) = f(x)

for f a continuous function in the vicinity of x ∈ ∂X.
Exercise 5.3.1 Prove the above two formulas (5.41) and (5.42). Note that the first formula is nothing but the expression for the X-ray transform.
Note that α₀ in any dimension, and α₁ in dimension n ≥ 3, are distributions: they involve delta functions. In dimension n = 2, we verify that α₁ is in fact a function. A tedious and lengthy calculation shows that the kernel corresponding to multiple scattering, α₂, is also a function and, when k is bounded, satisfies

    |ν(y) · w|^{−1} α₂(x, v, y, w) ∈ L^∞(Γ₋, L^p(Γ₊, dξ)),   1 ≤ p < (d + 1)/d.   (5.43)
We refer the reader to [13] (see also [29]) for the derivation of such a result. The exact regularity of the function α₂ is not very important for us here. For the mathematical derivation of stability estimates, the fact that we can take p > 1 above is important. For a picture of the geometry of the singularities, see Fig. 5.1: particles emitted with direction v₀′ at x₀′ − Rv₀′ scatter along the segment x₀′ − (R − t)v₀′ for t ≥ 0. Singly scattered photons reach the plane orthogonal to a given direction v on the right-hand side of Fig. 5.1 along a segment of that plane. We can verify that photons that are at least doubly scattered may reach any point in that plane, and in any other plane. This is consistent with the fact that α₂ is a function.
The strategy to recover σ and k in dimension n ≥ 3 thus goes as follows. We send beams of particles into the medium that concentrate in the vicinity of a point (y₀, v₀) ∈ Γ₋. More precisely, let g_ε be a sequence of normalized L¹ functions on Γ₋ converging as ε → 0 to δ_{v₀}(v) δ_{y₀}(y).
Figure 5.1: Geometry of single scattering for n = 3.
Since the ballistic term is more singular than the rest, if we place detectors on the support of the ballistic term, then such detectors will overwhelmingly measure ballistic particles and very few scattered particles. Since single scattering is more singular than multiple scattering in dimension n ≥ 3, we can use the same strategy and place detectors at the location of the support of single scattering. Such detectors will overwhelmingly measure singly scattered particles and very few multiply scattered particles. We now briefly explain the rather tedious mathematical construction.
Recovery of the attenuation coefficient σ(x)

Let (y₀, v₀) ∈ Γ₋ be defined as above and let (x₀, v₀) ∈ Γ₊ be such that y₀ = x₀ − τ₋(x₀, v₀)v₀. Let φ_ε be a sequence of bounded functions on Γ₊ equal to 1 in the vicinity of (x₀, v₀) and with vanishing support as ε → 0. Then we verify that

    ∫ α_m(x, v, y, w) φ_ε(x, v) g_ε(y, w) dμ(x) dv dμ(y) dw → 0  as ε → 0,   m = 1, 2,

so that

    ⟨φ_ε, Ag_ε⟩ := ∫ α(x, v, y, w) φ_ε(x, v) g_ε(y, w) dμ(x) dv dμ(y) dw
        → exp( −∫_0^{τ₋(x₀,v₀)} σ(x₀ − s v₀, v₀) ds )  as ε → 0.   (5.44)
Exercise 5.3.2 Verify (5.44); see [13] for the details.
In other words, the measurements corresponding to the above choices of functions g_ε and φ_ε converge to E(x₀, y₀), where

    E(x, y) := exp( −∫_0^{|x−y|} σ( x − s (x − y)/|x − y| ) ds ).   (5.45)
This proves that knowledge of the albedo operator A, which allows one to construct ⟨φ_ε, Ag_ε⟩, provides knowledge of E(x, y), the exponential of minus the X-ray transform of σ along the segment (x, y). This can be obtained for any segment (x, y) with x and y on ∂X. As a consequence, knowledge of A provides knowledge of the X-ray transform of σ. We know that σ is then uniquely and stably reconstructed from such knowledge, since the X-ray transform is a smoothing operator by (only) one-half of a derivative.
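The singular decomposition exploited above can be observed in a small Monte Carlo simulation of (5.24). A minimal sketch in the unit disk with constant σ and isotropic scattering k = σ/(2π), so that σ_s = σ and σ_a = 0 (all values are illustrative): the fraction of particles exiting without ever scattering is exactly the ballistic factor E of (5.45) for the entry and exit points of the straight ray.

```python
import numpy as np

def mc_disk(sigma, R=1.0, N=20000, seed=1):
    # Monte Carlo particles entering the disk |x| < R at (-R, 0) with direction
    # (1, 0); free paths are Exp(sigma), scattering is isotropic (k = sigma/2pi),
    # so sigma_s = sigma, sigma_a = 0, and no particle is ever absorbed.
    rng = np.random.default_rng(seed)
    n_ballistic = n_exit = 0
    for _ in range(N):
        x = np.array([-R, 0.0]); v = np.array([1.0, 0.0]); scattered = False
        while True:
            step = rng.exponential(1.0 / sigma)
            b = x @ v                       # distance to the boundary along v
            t_exit = -b + np.sqrt(max(b*b - (x @ x - R*R), 0.0))
            if step >= t_exit:              # exits before the next collision
                n_exit += 1
                if not scattered:
                    n_ballistic += 1
                break
            x = x + step * v; scattered = True
            phi = rng.uniform(0.0, 2.0 * np.pi)
            v = np.array([np.cos(phi), np.sin(phi)])
    return n_ballistic / N, n_exit / N

ballistic, total = mc_disk(sigma=1.0)       # ballistic fraction ~ exp(-2 sigma R)
```

With σ = R = 1, the ballistic fraction estimates e^{−2σR} = e^{−2} ≈ 0.135, while the total exiting fraction equals 1 since σ_a = 0 (no absorption, as in the subcritical case of Section 5.2); the multiply scattered particles exit spread over all of ∂X, which is the smoothing that the singular-support argument exploits.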
Recovery of the scattering coefficient k(x, v, w)

We assume that σ = σ(x) is now recovered. Let z₀ ∈ X, v₀ ∈ V, and w₀ ∈ V with v₀ ≠ w₀. Define x₀ = z₀ + τ₊(z₀, v₀)v₀, so that (x₀, v₀) ∈ Γ₊, and y₀ = z₀ − τ₋(z₀, w₀)w₀, so that (y₀, w₀) ∈ Γ₋. We formally show how the scattering coefficient may be uniquely reconstructed from full knowledge of A.
Let us define g_{ε₁} as before and φ_ε as a sequence of bounded functions on Γ₊ equal to a constant in the vicinity of (x₀, v₀) and with vanishing support as ε → 0. Since v₀ ≠ w₀, we find that

    ∫ α₀(x, v, y, w) φ_ε(x, v) g_{ε₁}(y, w) dμ(x) dv dμ(y) dw = 0,   0 < ε, ε₁ < ε₀(x₀, v₀, y₀, w₀),
i.e., the ballistic contribution vanishes for such measurements. Let us define g_{ε₁} such that |ν(y₀) · w₀|^{−1} g_{ε₁}(y, w) converges to a delta function. The factor |ν(y₀) · w₀|^{−1} is here to ensure that the number of emitted particles is independent of y₀ and w₀. The ballistic part of the transport solution is then approximately concentrated on the line passing through y₀ with direction w₀. Scattering occurs along this line, and particles scattered in direction v₀ are approximately supported on the plane with directions v₀ and w₀ passing through x₀. The intersection of that plane with the boundary ∂X is a one-dimensional curve γ = γ(x₀, v₀, w₀) ⊂ ∂X. In two space dimensions, the curve γ has the same dimension as ∂X. As a consequence, α₁ is a function, and therefore it is not more singular than α₂ in the time independent setting when n = 2.
Let

(x, v) be a bounded test function supported in the vicinity of . Because


is of measure 0 in X when d 3, we nd using (5.43) that
_

2
(x, v, y, w)

(x, v)g

1
(y, w)d(x)dvd(y)dw
,
1
0
0,
i.e., the multiple scattering contribution is asymptotically negligible with such measure-
ments. Now, choosing

(x, v) properly normalized and supported in the


2
vicinity of
(x
0
, v
0
) (for
2
1), we nd that

, /g

,
1
,
2
0
E(y
0
, z
0
)E(z
0
, x
0
)k(z
0
, w
0
, v
0
),
at each point of continuity of $k(z_0,w_0,v_0)$, where $E(x,y)$ is defined in (5.45). Since $\sigma(x)$ and hence $E(x,y)$ are known from knowledge of $\mathcal{A}$, then so is $k(z_0,w_0,v_0)$ at each point of continuity in $X \times V \times V$ thanks to the above formula.
Exercise 5.3.3 Verify the above formula; see [13] for the details.
The above reconstruction of the attenuation coefficient is valid in any dimension $n \geq 2$. The reconstruction of the scattering coefficient, however, is valid only in dimension $n \geq 3$. The reason again is that in dimension $n = 2$, the single scattering contribution is also a function, not a distribution, and thus cannot be separated from the multiple scattering component. What we have obtained so far may be summarized as:

Theorem 5.3.1 ([29]) Let $(\sigma,k)$ and $(\tilde\sigma,\tilde k)$ be two admissible pairs of optical parameters associated with the same albedo operator $\mathcal{A}$ and such that $\sigma$ and $\tilde\sigma$ are independent of the velocity variable. Then $\sigma = \tilde\sigma$ in dimension $n \geq 2$. Moreover, $k = \tilde k$ in dimension $n \geq 3$.
5.3.2 Stability in inverse transport

Let us assume the existence of two types of measurements $\mathcal{A}$ and $\tilde{\mathcal{A}}$, say, corresponding to the optical parameters $(\sigma,k)$ and $(\tilde\sigma,\tilde k)$, respectively. The question of the stability of the reconstruction is to bound the errors $\sigma - \tilde\sigma$ and $k - \tilde k$ as a function of $\mathcal{A} - \tilde{\mathcal{A}}$.

We obtain stability estimates in dimension $n \geq 3$. In dimension $n = 2$, only the estimate on $\sigma(x)$ is valid. The construction of the incoming source $g_\varepsilon(x,v)$ is such that $g_\varepsilon \in C^1(\Gamma_-)$ is supported in the $\varepsilon_1$-vicinity of $(x_0,v_0)$ and normalized so that $\int_{\Gamma_-} g_\varepsilon\, d\xi = 1$. Let $\phi$ be a compactly supported continuous function on $\Gamma_+$, which models the array of detectors, such that $\|\phi\|_\infty \leq 1$. Then

$$\Big| \int_{\Gamma_+} \phi(x,v) \big[ (\mathcal{A} - \tilde{\mathcal{A}}) g_\varepsilon \big](x,v)\, d\xi(x,v) \Big| \leq \|\mathcal{A} - \tilde{\mathcal{A}}\|_{\mathcal{L}(L^1)}, \qquad (5.46)$$

where now $\|\cdot\|_{\mathcal{L}(L^1)} = \|\cdot\|_{\mathcal{L}(L^1(\Gamma_-,d\xi),\, L^1(\Gamma_+,d\xi))}$
. We still introduce

$$I_m(\varepsilon,\phi) = \int_{\Gamma_+} \phi(x,v) \big[ (\mathcal{A}_m - \tilde{\mathcal{A}}_m) g_\varepsilon \big](x,v)\, d\xi(x,v), \qquad m = 0,1,2,$$

and obtain that

$$\lim_{\varepsilon \to 0^+} I_0(\varepsilon,\phi) = \phi(y_0,v_0) \big( E(x_0,y_0) - \tilde E(x_0,y_0) \big)$$
$$\lim_{\varepsilon \to 0^+} I_1(\varepsilon,\phi) = \int_V \int_0^{\tau_+(x_0,v_0)} \phi\big( x(s) + \tau_+(x(s),v)\, v,\, v \big) \big( E_+ k - \tilde E_+ \tilde k \big)(x(s),v_0,v)\, ds\, dv \qquad (5.47)$$

where we have introduced $x(s) = x_0 + s v_0$.
The estimate (5.43) allows us to show that

$$\big| I_2(\varepsilon,\phi) \big| \leq C \int_V \Big( \int_X |\psi(x,v)|^{p'}\, dx \Big)^{\frac{1}{p'}} dv, \qquad p' > d. \qquad (5.48)$$

Multiple scattering is therefore still negligible when the support of $\psi := \phi_\varepsilon$ tends to $0$ when $\varepsilon \to 0$.
The first sequence of functions $\phi_\varepsilon$ is chosen to have a small support concentrated in the vicinity of $(y_0,v_0) \in \Gamma_+$. Then the single scattering contribution $I_1(\phi_\varepsilon) \to 0$ as $\varepsilon \to 0$. For $w_0$ fixed in $V$, we choose the sequence of functions $\phi_\varepsilon$ such that they are concentrated in the vicinity of the curve $\gamma(s)$ on $\Gamma_+$ and that they approximately take the value $\mathrm{sign}\big( E_+ k - \tilde E_+ \tilde k \big)(x_0 + s v_0, v_0, w_0)$ along that curve. Since $v_0 \neq w_0$, we verify that $I_0(\phi_\varepsilon) \to 0$ for such sequences of functions; see [13]. Now, the function $\phi_\varepsilon$ has a small support only in dimension $n \geq 3$. Indeed, in dimension $n = 2$, the curve $\gamma$ has the same dimensionality as the boundary $\partial X$. When $n = 2$, multiple scattering may no longer be separated from single scattering by using the singular structure of the albedo operator $\mathcal{A}$. This allows us to state the result:
Theorem 5.3.2 ([13]) Assume that $\sigma(x)$ and $k(x,v',v)$ are continuous on $\bar X$ and $\bar X \times V \times V$, respectively, and that $(\tilde\sigma,\tilde k)$ satisfy the same hypotheses. Let $(x_0,v_0) \in \Gamma_-$ and $y_0 = x_0 + \tau_+(x_0,v_0)\, v_0$. Then we have for $n \geq 2$ that

$$\big| E(x_0,y_0) - \tilde E(x_0,y_0) \big| \leq \|\mathcal{A} - \tilde{\mathcal{A}}\|_{\mathcal{L}(L^1)}, \qquad (5.49)$$

while in dimension $n \geq 3$, we have

$$\int_V \int_0^{\tau_+(x_0,v_0)} \big| E_+ k - \tilde E_+ \tilde k \big| (x_0 + s v_0, v_0, v)\, ds\, dv \leq \|\mathcal{A} - \tilde{\mathcal{A}}\|_{\mathcal{L}(L^1)}. \qquad (5.50)$$
The stability obtained above for the X-ray transform of the absorption coefficient is not sufficient to obtain any stability of $\sigma$ itself without a priori regularity assumptions on $\sigma$. This results from the well known fact that the X-ray transform is a smoothing (compact) operator, so that the inverse X-ray transform is an unbounded operator. Let us assume that $\sigma$ belongs to some space $H^s(\mathbb{R}^n)$ for $s$ sufficiently large and that $\sigma_p$ defined in (5.25) is bounded. More precisely, define

$$\mathcal{M} = \Big\{ (\sigma,k) \in C^0(\bar X) \times C^0(\bar X \times V \times V) \;\big|\; \sigma \in H^{\frac{n}{2}+r}(X),\ \|\sigma\|_{H^{\frac{n}{2}+r}(X)} + \|\sigma_p\|_\infty \leq M \Big\}, \qquad (5.51)$$

for some $r > 0$ and $M > 0$. Then, we have the following result.

Theorem 5.3.3 ([13, 14]) Let $n \geq 2$ and assume that $(\sigma,k) \in \mathcal{M}$ and that $(\tilde\sigma,\tilde k) \in \mathcal{M}$. Then the following is valid:

$$\|\sigma - \tilde\sigma\|_{H^s(X)} \leq C \|\mathcal{A} - \tilde{\mathcal{A}}\|_{\mathcal{L}(L^1)}^{\kappa}, \qquad (5.52)$$

where $-\frac12 \leq s < \frac{n}{2} + r$ and $\kappa = \frac{n + 2(r-s)}{n+1+2r}$.

When $n \geq 3$, we have

$$\|k - \tilde k\|_{L^1(X \times V \times V)} \leq C \|\mathcal{A} - \tilde{\mathcal{A}}\|_{\mathcal{L}(L^1)}^{\kappa'} \Big( 1 + \|\mathcal{A} - \tilde{\mathcal{A}}\|_{\mathcal{L}(L^1)}^{1-\kappa'} \Big), \qquad (5.53)$$

where $\kappa' = \frac{2(r-r')}{n+1+2r}$ and $0 < r' < r$.

Such estimates show that under additional regularization assumptions on $\sigma$, we have explicit stability expressions of Hölder type on $\sigma$ and $k$. The first stability result (5.52) was established in [67].

The proof of these theorems is fairly technical and will not be presented here in detail; see [13, 14].
Chapter 6

Inverse diffusion and severe ill-posedness

This chapter introduces a classical example of a severely ill-posed problem for which the measurement operator is injective. In the absence of any noise, reconstructions are feasible. However, (in)stability is such that even tiny amounts of noise at moderately high frequencies will be so amplified during the reconstruction that the resulting noise in the reconstructed parameters may overwhelm the real parameters. We have seen in earlier chapters that solving the heat equation backward was an example of a severely ill-posed problem. In the preceding chapter, we saw that scattering was responsible for smoothing (in the sense that the multiple scattering contribution of the albedo operator in (5.40) was smoother than the single scattering contribution and that the latter was smoother than the ballistic contribution). In this chapter and in the next chapter, we consider two new examples: the Cauchy problem, which is an inverse source problem, and the Inverse Diffusion problem, also known as the Calderón problem, which is a nonlinear inverse problem. In both problems, the forward modeling, based on a diffusion equation, is highly smoothing because of scattering effects.

This chapter is devoted to the analysis of the Cauchy problem and related inverse problems.
6.1 Cauchy Problem and Electrocardiac potential

Let us consider the imaging of the electrical activity of the heart. A probe is sent inside the heart, where the electrical potential is measured. The inverse problem consists of reconstructing the potential on the endocardial surface (the inside of the cardiac wall) from the measurements on the probe.

Mathematically, this corresponds to solving a Cauchy problem for an elliptic equation. The problem is modeled as follows. Let $\Gamma_0$ be a closed smooth surface in $\mathbb{R}^3$ representing the endocardial surface and let $\Gamma_1$ be the closed smooth surface inside the volume enclosed by $\Gamma_0$ where the measurements are performed. We denote by $X$ the domain with boundary $\partial X = \Gamma_0 \cup \Gamma_1$; see Fig. 6.1.

Figure 6.1: Geometry of endocardial measurements

The electric potential solves the following Laplace equation:

$$\Delta u = 0 \ \text{ in } X, \qquad u = u_1 \ \text{ on } \Gamma_1, \qquad \frac{\partial u}{\partial n} = 0 \ \text{ on } \Gamma_1. \qquad (6.1)$$

The function $u_1$ models the measurements at the surface of the probe, which is assumed to be insulated so that $n \cdot \nabla u = 0$ on $\Gamma_1$. The objective is then to find $u = u_0$ on $\Gamma_0$.
6.2 Half Space Problem

Let us consider the simplified geometry where $X$ is the slab $\mathbb{R}^{n-1} \times (0,L)$. We assume that boundary measurements are given at the boundary $x_n = 0$ and wish to obtain the harmonic solution in $X$. We are thus interested in solving the equation

$$\begin{aligned} \Delta u &= 0, & x &= (x',x_n) \in X \\ u(x',0) &= f(x'), & x' &\in \mathbb{R}^{n-1} \\ \frac{\partial u}{\partial x_n}(x',0) &= g(x'), & x' &\in \mathbb{R}^{n-1}. \end{aligned} \qquad (6.2)$$

Let us denote by

$$\hat u(k',x_n) = (\mathcal{F}_{x' \to k'} u)(k',x_n). \qquad (6.3)$$

Upon Fourier transforming (6.2) in the $x'$ variable, we obtain

$$\begin{aligned} -|k'|^2 \hat u + \frac{\partial^2 \hat u}{\partial x_n^2} &= 0, & k' &\in \mathbb{R}^{n-1}, \ 0 < x_n < L \\ \hat u(k',0) &= \hat f(k'), & k' &\in \mathbb{R}^{n-1}, \\ \frac{\partial \hat u}{\partial x_n}(k',0) &= \hat g(k'), & k' &\in \mathbb{R}^{n-1}. \end{aligned} \qquad (6.4)$$

The solution of the above ODE is given by

$$\hat u(k',x_n) = \hat f(k') \cosh(|k'| x_n) + \frac{\hat g(k')}{|k'|} \sinh(|k'| x_n). \qquad (6.5)$$

We thus observe that the solution at $x_n > 0$ is composed of an exponentially growing component $e^{|k'| x_n}$ and an exponentially decreasing component $e^{-|k'| x_n}$.
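The growth of the cosh component in (6.5) is easy to quantify on a single Fourier mode. The sketch below is a minimal illustration; the perturbation size, the depth, and the sampled frequencies are arbitrary choices, not taken from the text.

```python
import math

# Minimal sketch: a data perturbation of size eta at frequency |k'| in f
# is amplified by the cosh component of (6.5) to eta*cosh(|k'|*x_n) at
# depth x_n. eta, x_n and the sampled frequencies are arbitrary choices.
eta, x_n = 1e-8, 1.0
for k in (1.0, 10.0, 50.0):
    print(f"|k'| = {k:5.1f}: amplified error {eta * math.cosh(k * x_n):.3e}")
```

Already at $|k'| = 50$ an initially negligible perturbation dominates any bounded solution, which is the instability analyzed in the rest of this section.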
6.2.1 The well posed problem

Let us assume we are in the favorable case where

$$|k'| \hat f(k') + \hat g(k') = 0, \qquad k' \in \mathbb{R}^{n-1}. \qquad (6.6)$$

In the physical domain, this corresponds to satisfying the non-local problem

$$(-\Delta)^{\frac12} f(x') + g(x') = 0, \qquad x' \in \mathbb{R}^{n-1}, \qquad (6.7)$$

with $(-\Delta)^{\frac12}$ defined as the square root of the Laplacian. It takes the form $(-\Delta)^{\frac12} = H \frac{d}{dx}$ in two dimensions $n = 2$, where $H$ is the Hilbert transform.

In the Fourier domain, the solution is thus given by

$$\hat u(k',x_n) = \hat f(k')\, e^{-|k'| x_n}. \qquad (6.8)$$

Exercise 6.2.1 (i) In two dimensions $n = 2$, prove that

$$u(x_1,x_2) = \Big( f * \frac{1}{\pi} \frac{x_2}{x_1^2 + x_2^2} \Big)(x) = \frac{1}{\pi} \int_{\mathbb{R}} f(x_1 - z)\, \frac{x_2}{z^2 + x_2^2}\, dz. \qquad (6.9)$$

Hint: Use (1.32) and show that

$$\frac{1}{2\pi} \int_{\mathbb{R}} e^{-x_2 |k_1|} e^{i x_1 k_1}\, dk_1 = \frac{1}{\pi} \frac{x_2}{x_1^2 + x_2^2}.$$

(ii) Show that $\frac{1}{\pi} \frac{x_2}{x_1^2 + x_2^2}$ is the fundamental solution of (6.2) when $n = 2$ with $f(x_1) = \delta(x_1)$. Calculate the corresponding value of $g(x_1)$.

Provided the above compatibility condition is met, the problem (6.2) admits a unique solution and is well-posed, for instance in the sense that

$$\int_{\mathbb{R}^{n-1}} u^2(x',x_n)\, dx' \leq \int_{\mathbb{R}^{n-1}} f^2(x')\, dx', \qquad \text{for all } x_n > 0.$$

This is an immediate consequence of (6.8) and the Parseval relation. In this situation, the construction of $u$ for $x_n > 0$ becomes a well-posed problem.
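The convolution formula (6.9) can be checked against a boundary datum whose harmonic extension is known in closed form: for $f(x_1) = \cos x_1$, the decaying harmonic extension is $e^{-x_2} \cos x_1$. The sketch below is a minimal midpoint-rule check; the truncation radius $R$ and the number of quadrature points $N$ are arbitrary choices.

```python
import math

# Midpoint-rule check of the Poisson convolution formula (6.9): for
# boundary data f(x1) = cos(x1) the decaying harmonic extension is
# exp(-x2)*cos(x1). R and N are arbitrary quadrature choices.
def poisson(f, x1, x2, R=400.0, N=200000):
    h = 2.0 * R / N
    total = 0.0
    for i in range(N):
        z = -R + (i + 0.5) * h
        total += f(x1 - z) * x2 / (z * z + x2 * x2)
    return total * h / math.pi

approx = poisson(math.cos, x1=0.5, x2=1.0)
exact = math.exp(-1.0) * math.cos(0.5)
print(approx, exact)   # the two values agree closely
```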
6.2.2 The electrocardiac application

In general, however, (6.6) is not satisfied. For the electro-cardiac application, $g \equiv 0$ and the harmonic solution is therefore given (for all $x_n > 0$) by

$$\hat u(k',x_n) = \hat f(k') \cosh(|k'| x_n). \qquad (6.10)$$

This implies that high frequencies are exponentially amplified. Therefore, the mapping from $f(x')$ to $A_{x_n} f(x') := u(x',x_n)$ for a fixed $x_n > 0$ cannot be bounded from any $H^s(\mathbb{R}^{n-1})$ (even for $s$ very large) to any $H^{-t}(\mathbb{R}^{n-1})$ (even for $t$ very large). This does not mean that $A_{x_n}$ cannot be inverted. Let us define the space

$$X_{x_n}(\mathbb{R}^{n-1}) = \big\{ u \in L^2(\mathbb{R}^{n-1}),\ \cosh(|k'| x_n)\, \hat u(k') \in L^2(\mathbb{R}^{n-1}) \big\}. \qquad (6.11)$$

Then, $A_{x_n}$ is indeed continuous in $\mathcal{L}(L^2(\mathbb{R}^{n-1}), X_{x_n}(\mathbb{R}^{n-1}))$ with an inverse $A_{x_n}^{-1}$ continuous in $\mathcal{L}(X_{x_n}(\mathbb{R}^{n-1}), L^2(\mathbb{R}^{n-1}))$ and given by

$$A_{x_n}^{-1} u = \mathcal{F}^{-1}_{k' \to x'} \frac{1}{\cosh(|k'| x_n)} \mathcal{F}_{x' \to k'} u. \qquad (6.12)$$

This construction is useful when noise is small in $X_{x_n}(\mathbb{R}^{n-1})$, which means that it is essentially low-frequency. Such a restrictive assumption on noise is often not verified in practice. When noise is small only in some $L^2$ sense, for instance, then $A_{x_n}$ above cannot be inverted without assuming prior information about the solution we seek to reconstruct. Chapters 10 and 11 will be devoted to the analysis of such prior information. Here, we look at how stability estimates may still be obtained when simple bounds are available on the harmonic solutions.
6.2.3 Prior bounds and stability estimates

Let us consider the solution (6.10) in the setting where $f$ is small in some sense (modeling the level of noise in the available data since our problem is linear) and where the harmonic solution $u(x)$ is bounded a priori in some sense. We can then obtain a stability result of the form:

Theorem 6.2.1 Let $u(x)$ be the solution to (6.2) with $g \equiv 0$. Let us assume that

$$\|f\|_{L^2(\mathbb{R}^{n-1})} \leq \eta, \qquad \|u(\cdot,L)\|_{L^2(\mathbb{R}^{n-1})} \leq E. \qquad (6.13)$$

Then we find that

$$\|u(\cdot,x_n)\|_{L^2(\mathbb{R}^{n-1})} \leq C\, \eta^{1 - \frac{x_n}{L}} E^{\frac{x_n}{L}}, \qquad (6.14)$$

for some universal constant $C$.

Proof. The proof is based on the following observation: $\hat u(k',x_n)$ is close to $\hat f(k')$ for small values of $|k'|$, where we wish to use the constraint involving $\eta$. For large $|k'|$, the bound $\eta$ on $\hat f(k')$ is no longer sufficient and we need to use the bound at $L$, which constrains the growth at high frequencies. We thus calculate

$$\int_{\mathbb{R}^{n-1}} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk' = \int_{|k'|<k_0} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk' + \int_{|k'|>k_0} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk'$$
$$\leq \cosh^2(k_0 x_n)\, \eta^2 + \frac{\cosh^2(k_0 x_n)}{\cosh^2(k_0 L)}\, E^2.$$

We now equate both terms. Since $k_0$ is large, we verify that $\cosh x \approx \frac12 e^x$ and choose

$$k_0 = \frac{1}{L} \ln \frac{2E}{\eta}.$$

With this choice, we obtain that

$$\int_{\mathbb{R}^{n-1}} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk' \leq C \eta^2 \Big( \frac{2E}{\eta} \Big)^{\frac{2x_n}{L}} \leq C \big( \eta^{1-\frac{x_n}{L}} E^{\frac{x_n}{L}} \big)^2.$$

This proves the result.
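The frequency split in the proof can be sanity-checked numerically: with $k_0 = \frac{1}{L} \ln \frac{2E}{\eta}$, the low- and high-frequency bounds coincide and are of the size of the squared Hölder bound in (6.14). The sketch below is a minimal check; the values of $\eta$, $E$, $L$, $x_n$ are arbitrary choices.

```python
import math

# Minimal numerical check of the proof of Theorem 6.2.1: with
# k0 = (1/L)*ln(2E/eta), the low-frequency term cosh^2(k0*x_n)*eta^2 and
# the high-frequency term (cosh(k0*x_n)/cosh(k0*L))^2 * E^2 balance, and
# both are comparable to (eta^{1-x_n/L} * E^{x_n/L})^2.
# The values of eta, E, L, x_n are arbitrary choices.
eta, E, L, x_n = 1e-6, 1.0, 1.0, 0.5
k0 = math.log(2 * E / eta) / L

low = math.cosh(k0 * x_n) ** 2 * eta ** 2
high = (math.cosh(k0 * x_n) / math.cosh(k0 * L)) ** 2 * E ** 2
holder = (eta ** (1 - x_n / L) * E ** (x_n / L)) ** 2
print(low, high, holder)
```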
The above result states that even though solving the Cauchy problem is severely ill-posed, if we possess information about the solution at a distance $L$ from where the measurements are collected, then we can reconstruct the elliptic solution for all $0 < x_n < L$ with Hölder-type stability, which is a mildly ill-posed problem. However, at $x_n = L$, which is where we would like to obtain some stable reconstructions in the electro-cardiac problem, a mere bound at $x_n = L$ in the $L^2$ norm is not sufficient to obtain any meaningful stability. However, a stronger bound would be sufficient, as the following result indicates.

Theorem 6.2.2 Let $u(x)$ be the solution to (6.2) with $g \equiv 0$. Let us assume that

$$\|f\|_{L^2(\mathbb{R}^{n-1})} \leq \eta, \qquad \|u(\cdot,L)\|_{H^s(\mathbb{R}^{n-1})} \leq E, \quad s > 0. \qquad (6.15)$$

Then we find that

$$\|u(\cdot,x_n)\|_{L^2(\mathbb{R}^{n-1})} \leq C \Big( \ln \frac{E}{\eta} \Big)^{-\frac{s x_n}{L}} \eta^{1-\frac{x_n}{L}} E^{\frac{x_n}{L}}, \qquad (6.16)$$

for some universal constant $C$.

Proof. The proof is similar to the preceding one. We calculate

$$\int_{\mathbb{R}^{n-1}} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk' = \int_{|k'|<k_0} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk' + \int_{|k'|>k_0} |\hat f|^2(k') \cosh^2(|k'| x_n)\, dk'$$
$$\leq \cosh^2(k_0 x_n)\, \eta^2 + \frac{\cosh^2(k_0 x_n)}{\cosh^2(k_0 L)} \langle k_0 \rangle^{-2s} E^2.$$

We recall that $\langle k_0 \rangle^2 = 1 + k_0^2$. We then choose

$$k_0 = \frac{1}{L} \ln \frac{2E}{\eta} - \frac{s}{L} \ln \Big( \frac{1}{L} \ln \frac{2E}{\eta} \Big).$$

This is a good approximation to the equation $\cosh^2(k_0 L)\, \eta^2 = \langle k_0 \rangle^{-2s} E^2$. We then find

$$e^{k_0 x_n} = \Big( \frac{1}{L} \ln \frac{2E}{\eta} \Big)^{-\frac{s x_n}{L}} \eta^{-\frac{x_n}{L}} (2E)^{\frac{x_n}{L}},$$

from which the result follows.
For $x_n < L$, the latter result is not much more precise than the preceding one. However, when $x_n = L$, we obtain that the error in the $L^2(\mathbb{R}^{n-1})$ norm of $u(x',x_n)$ is of order $|\ln \eta|^{-s}$. This is a logarithmic stability result that cannot be improved and indicates the severe ill-posedness of the electro-cardiac problem. Note also that the stability estimate depends on the strength of the exponent $s$ in the prior regularity assumption.

Exercise 6.2.2 Let us assume that in Theorem 6.2.2, the prior estimate is replaced by an estimate of the form

$$\|u\|_{H^m(\mathbb{R}^{n-1} \times (0,L))} \leq E, \qquad m \geq 0. \qquad (6.17)$$

(i) Obtain a result of the form (6.16) in this setting.
(ii) Show that $\|u(\cdot,L)\|_{L^2(\mathbb{R}^{n-1})}$ satisfies a logarithmic stability estimate when $m$ is sufficiently large (how large?). Relate this result to that of Theorem 6.2.2.
6.2.4 Analytic continuation

The above results in dimension $n = 2$ are intimately connected to the analytic continuation of an analytic function given on the real line.

Let $f(z) = g(z) + ih(z)$ be an analytic function with $g(z)$ and $h(z)$ real-valued functions. Let us assume that $g(z)$ and $h(z)$ are known on the real line $\Im(z) = 0$. The objective is to find them for arbitrary values of $z$. We identify $z = x + iy$ and assume that $g(x,0)$ and $h(x,0)$ are distributions in $\mathcal{S}'(\mathbb{R})$ so that their Fourier transform is defined. Since $f(z)$ is analytic, we have

$$\frac{\partial f}{\partial \bar z} = 0,$$

or equivalently that

$$\frac{\partial g}{\partial x} - \frac{\partial h}{\partial y} = 0, \qquad \frac{\partial g}{\partial y} + \frac{\partial h}{\partial x} = 0. \qquad (6.18)$$

These Cauchy-Riemann relations imply that $g$ and $h$ are harmonic, i.e., $\Delta g = \Delta h = 0$. They thus solve the following problems

$$\begin{aligned} &\Delta g = 0, \quad y > 0, && \Delta h = 0, \quad y > 0, \\ &\frac{\partial g}{\partial y}(x,0) = -\frac{\partial h}{\partial x}(x,0), && \frac{\partial h}{\partial y}(x,0) = \frac{\partial g}{\partial x}(x,0), \\ &g(x,0) \ \text{known}, && h(x,0) \ \text{known}. \end{aligned} \qquad (6.19)$$

Both problems are of the form (6.2). The solutions in the Fourier domain are given by

$$\begin{aligned} \hat g(k_x,y) &= \hat g(k_x,0) \cosh(|k_x| y) - i\, \mathrm{sign}(k_x)\, \hat h(k_x,0) \sinh(|k_x| y) \\ \hat h(k_x,y) &= \hat h(k_x,0) \cosh(|k_x| y) + i\, \mathrm{sign}(k_x)\, \hat g(k_x,0) \sinh(|k_x| y). \end{aligned} \qquad (6.20)$$

We verify that the problem is well-posed provided that (6.7) is verified, which in this context means

$$\frac{\partial h}{\partial x} = H \frac{\partial g}{\partial x}, \qquad \frac{\partial g}{\partial x} = -H \frac{\partial h}{\partial x}. \qquad (6.21)$$

Notice that $H^2 = -I$ so that both equalities above are equivalent. When the above conditions are met, the analytic continuation is a stable process. When they are not met, we have seen that high frequencies increase exponentially as $y$ increases, which renders the analytic continuation process a severely ill-posed problem.
6.3 General two dimensional case
We now consider a general two-dimensional geometry as described in section 6.1. We use
the Riemann mapping theorem to map such a geometry conformally to an annulus (the
region lying between two concentric circles) in the plane. We then solve the problem
on the annulus. The Riemann mapping gives us a stable way to transform the original
domain to an annulus and back. We will see that solving the problem on the annulus is
severely ill-posed similarly to what we saw in the preceding section.
6.3.1 Laplace equation on an annulus

Let us begin with a presentation of the problem on the annulus. We assume that the inner circle has radius $1$ and the outer circle radius $\rho > 1$. By simple dilation, this is equivalent to the more general case of two circles of arbitrary radii $a$ and $b$. In polar coordinates the Laplacian takes the form

$$\Delta u = \frac{1}{r} \frac{\partial}{\partial r} \Big( r \frac{\partial u}{\partial r} \Big) + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}. \qquad (6.22)$$

The general solution to the above equation periodic in $\theta$ is decomposed in Fourier modes as

$$u(r,\theta) = a_0 + b_0 \ln r + \sum_{n \in \mathbb{Z}^*} \Big( \frac{a_n}{2} r^n + \frac{b_n}{2} r^{-n} \Big) e^{in\theta}. \qquad (6.23)$$

Since $\partial u / \partial n = 0$ at $r = 1$, we deduce that $b_0$ and $b_n - a_n$ vanish. We then find that the solution to (6.1) on the annulus is given by

$$u(r,\theta) = \sum_{n \in \mathbb{Z}} \Big( \frac{1}{2\pi} \int_0^{2\pi} e^{-in\phi} u_1(\phi)\, d\phi \Big) \frac{r^n + r^{-n}}{2}\, e^{in\theta}. \qquad (6.24)$$

The above solution holds for all $r > 1$.

Exercise 6.3.1 Find the relation that $u_1(\theta) = u(1,\theta)$ and $g_1(\theta) = \partial_r u(1,\theta)$ must satisfy so that the problem

$$\Delta u = 0, \quad r > 1, \qquad u(1,\theta) = u_1(\theta), \quad \partial_r u(1,\theta) = g_1(\theta), \quad 0 \leq \theta < 2\pi,$$

is well-posed (in the sense that the energy of $u(\rho,\cdot)$ is bounded by the energy of $u_1(\theta)$).

We verify that an error of size $\eta$ in the measurement of the coefficient $a_n$ is amplified into an error of order

$$A_n(\rho) = \frac{\eta}{2}\, e^{n \ln \rho}$$

at $r = \rho$. We verify that $A_n$ cannot be bounded by any $C n^\alpha$ for all $\alpha > 0$, which would correspond to differentiating the noise level $\alpha$ times. This implies that the reconstruction of $u(\rho,\theta)$ from $u_1(\theta)$ using (6.24) is a severely ill-posed problem.
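The super-algebraic growth of $A_n(\rho) = \frac{\eta}{2} \rho^n$ is immediate to observe numerically. The sketch below is a minimal illustration; the values of $\eta$, $\rho$ and the exponent $8$ are arbitrary choices.

```python
import math

# Minimal sketch: the amplification A_n(rho) = (eta/2)*rho**n of the n-th
# Fourier coefficient outgrows any fixed power of n; here it is divided by
# n**8 and still grows. eta, rho and the exponent 8 are arbitrary choices.
eta, rho = 1e-3, 1.5
ratios = [(eta / 2) * rho ** n / n ** 8 for n in (10, 50, 100)]
print(ratios)   # increasing despite the division by n^8
```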
6.3.2 Riemann mapping theorem

Let $X$ be an open smooth two dimensional domain with smooth boundary having two smooth connected components. We denote by $\Gamma_0$ the outer component of $\partial X$ and $\Gamma_1$ the inner component; see Fig. 6.1.

For $z \in X \subset \mathbb{C}$ we construct a holomorphic function $\Psi(z)$ (i.e., a function such that $\partial \Psi / \partial \bar z = 0$) mapping $X$ to an annulus of the form $1 < r < \rho$. The function is constructed as follows. Let first $v$ be the unique solution to the following Dirichlet problem

$$\Delta v = 0 \ \text{ on } X, \qquad v|_{\Gamma_0} = 1, \qquad v|_{\Gamma_1} = 0. \qquad (6.25)$$

For some $c$ to be fixed later, let $G = cv$. We verify that

$$I = \int_{\Gamma_1} -\frac{\partial G}{\partial y}\, dx + \frac{\partial G}{\partial x}\, dy = -c \int_{\Gamma_1} \frac{\partial v}{\partial \nu}\, ds > 0$$

by the maximum principle ($v$ is positive inside $X$ and its normal derivative must be negative on $\Gamma_1$). We fix $c$ such that $I = 2\pi$. In that case we can define a function $H(z)$, a conjugate harmonic of $G(z)$ on $X$, by

$$H(z) = \int_p^z -\frac{\partial G}{\partial y}\, dx + \frac{\partial G}{\partial x}\, dy, \qquad (6.26)$$

where $p$ is an arbitrary point in $X$. Since $G$ is harmonic, we verify that the definition of $H$ is independent of the path chosen between $p$ and $z$. Moreover we verify that

$$\frac{\partial H}{\partial x} = -\frac{\partial G}{\partial y}, \qquad \frac{\partial H}{\partial y} = \frac{\partial G}{\partial x},$$

so that $G + iH$ is a holomorphic function on $X$. Then so is

$$\Psi(z) = e^{G(z) + iH(z)}. \qquad (6.27)$$

We verify that $\Psi(z)$ maps $\Gamma_0$ to the circle $|z| = e^c$ and $\Gamma_1$ to the circle $|z| = 1$. Moreover $\Psi$ is a diffeomorphism between $X$ and the annulus $U_c = \{ z \in \mathbb{C},\ 1 \leq |z| \leq e^c \}$. Finally we verify that $\Delta \Psi(z) = 0$ on $X$ since $\Psi$ is holomorphic. Therefore we have replaced the problem (6.1) on $X$ by solving the Laplace equation on $U_c$ with boundary conditions $u_1(\Psi^{-1}(z))$ on the circle $r = 1$ and vanishing Neumann boundary conditions (we verify that Neumann boundary conditions are preserved by the map $\Psi$).
Exercise 6.3.2 Verify the above statements.
6.4 Backward Heat Equation

Let us now consider the backward heat equation. The forward heat equation is given by

$$\begin{aligned} \frac{\partial u}{\partial t} - \Delta u &= 0 \qquad t > 0,\ x \in \mathbb{R}^n \\ u(0,x) &= f(x) \qquad x \in \mathbb{R}^n, \end{aligned} \qquad (6.28)$$

with $f(x)$ an initial condition. Let us assume that $u(T,x) = Mf(x)$ is measured and that we wish to reconstruct $f(x)$. In this setting, we pass to the Fourier domain and obtain as above that

$$\hat u(t,k) = e^{-t|k|^2} \hat f(k). \qquad (6.29)$$

Therefore, $\hat f(k) = e^{T|k|^2} \hat u(T,k)$ is the formal inverse to the measurement operator $M$. Again, this is a severely ill-posed problem. But with prior information about the solution $u(t,x)$, we can again obtain Hölder and logarithmic type stability results. For instance, we have

Theorem 6.4.1 Let us assume that

$$\|u(T,\cdot)\|_{L^2(\mathbb{R}^n)} = \eta, \qquad \|f\|_{H^s(\mathbb{R}^n)} \leq E. \qquad (6.30)$$

Then we find that

$$\|u(t,\cdot)\|_{L^2(\mathbb{R}^n)} \leq C \Big( \ln \frac{E}{\eta} \Big)^{-\frac{s(T-t)}{2T}} \eta^{\frac{t}{T}} E^{1-\frac{t}{T}}. \qquad (6.31)$$

Proof. The proof is the same as above. We use the bound $\eta$ for $|k| < k_0$ and the other bound for $|k| > k_0$. We then choose $k_0$ so that

$$k_0^2 = \frac{1}{T} \ln \frac{E}{\eta} - \frac{s}{2T} \ln k_0^2 \approx \frac{1}{T} \ln \frac{E}{\eta} - \frac{s}{2T} \ln \Big( \frac{1}{T} \ln \frac{E}{\eta} \Big).$$

The rest of the proof proceeds as above. Note that the main difference with respect to the Cauchy problem is that $s$ is replaced by $\frac{s}{2}$ in the estimate because frequencies are exponentially damped with a strength proportional to $|k|^2$ rather than $|k|$.

Note that for all $t > 0$, the error on the reconstruction of $u(t,x)$ is Hölder in $\eta$, which measures the errors in the measurements. At $t = 0$, which is the problem of interest in the backward heat equation, we obtain a logarithmic stability: the errors in the reconstruction are proportional to $|\ln \eta|^{-\frac{s}{2}}$, which does converge to $0$ as $\eta \to 0$, but does so extremely slowly.
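The size of the amplification factor $e^{T|k|^2}$ behind this severe ill-posedness can be observed on a single Fourier mode. The sketch below is a minimal illustration; the values of $T$, of the noise size, and of the sampled frequencies are arbitrary choices, not taken from the text.

```python
import math

# Minimal sketch of the formal inverse f_hat(k) = exp(T*|k|^2)*u_hat(T,k):
# measurement noise of size eta at frequency k produces an error of size
# eta*exp(T*k^2) on f_hat(k). T, eta and the frequencies are arbitrary.
T, eta = 1.0, 1e-10
errors = {k: eta * math.exp(T * k * k) for k in (2.0, 5.0, 8.0)}
for k, e in errors.items():
    print(f"k = {k}: reconstruction error on f_hat is {e:.3e}")
```

Already at moderate frequencies the noise amplification is astronomical, which is why a frequency cutoff of the type used in the proof above is needed.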
The above proof generalizes to a large class of elliptic operators. For instance, consider the problem

$$\begin{aligned} \frac{\partial u}{\partial t} + Lu &= 0, \qquad t > 0,\ x \in X \\ u(t,x) &= 0 \qquad x \in \partial X \\ u(0,x) &= f(x), \qquad x \in X \end{aligned} \qquad (6.32)$$

with $L$ a symmetric operator, such as for instance $-\nabla \cdot a \nabla + c$, admitting a spectral decomposition

$$L \phi_n = \lambda_n \phi_n \ \text{ in } X, \qquad \phi_n = 0 \ \text{ on } \partial X, \qquad (6.33)$$

with an orthonormal basis of normalized eigenvectors $\phi_n$ and positive eigenvalues $0 < \lambda_1 < \lambda_2 \leq \ldots$ with $\lambda_n \to \infty$ as $n \to \infty$. Then we can decompose

$$f(x) = \sum_{n \geq 1} f_n \phi_n(x), \qquad u(t,x) = \sum_{n \geq 1} u_n(t) \phi_n(x).$$

Let us assume that $u_n(T)$ is measured and that we want to reconstruct $u_n(t)$ for $t \geq 0$ and hence also $f_n$ at $t = 0$. Then we find that

$$u_n(t) = e^{-\lambda_n t} f_n = e^{\lambda_n (T-t)} u_n(T).$$

Let us assume that $f(x)$ admits an a priori bound in $H^s(X)$ of the form

$$\|f\|_{H^s(X)} := \Big( \sum_{n \geq 1} n^{2s} f_n^2 \Big)^{\frac12} = E. \qquad (6.34)$$

Let us assume that we have an error of the form

$$\|u(T,\cdot)\|_{L^2(X)} := \Big( \sum_{n \geq 1} u_n^2(T) \Big)^{\frac12} = \eta \leq E. \qquad (6.35)$$

Let us assume that $\lambda_n \sim n^\alpha$ for some $\alpha > 0$ asymptotically. Then we calculate

$$\sum_{n \geq 1} u_n^2(t) = \sum_{n \leq n_0 - 1} \big( e^{\lambda_n (T-t)} u_n(T) \big)^2 + \sum_{n \geq n_0} \big( e^{-\lambda_n t} f_n \big)^2 \leq e^{2 \lambda_{n_0} (T-t)} \eta^2 + e^{-2 \lambda_{n_0} t} n_0^{-2s} E^2.$$

Again, we use the estimate $\eta$ on the noise for the low frequencies of $u(t)$ and the regularity estimate $E$ for the high frequencies of $u(t)$. It remains to equate these two terms to find that

$$\lambda_{n_0} \approx \frac{1}{T} \ln \frac{E}{\eta} - \frac{s}{T} \ln \Big( \frac{1}{T} \ln \frac{E}{\eta} \Big).$$

This yields the error estimate

$$\|u(t,\cdot)\|_{L^2(X)} \leq C \Big( \ln \frac{E}{\eta} \Big)^{-\frac{(T-t)s}{T}} \eta^{\frac{t}{T}} E^{1-\frac{t}{T}}. \qquad (6.36)$$

We leave it to the reader to fill in the gaps and state theorems. As usual, we obtain a Hölder estimate for all $0 < t \leq T$. At $t = 0$, the stability estimate becomes again logarithmic and crucially relies on the prior information that $f$ is bounded in some space $H^s(X)$ with $s > 0$.
Chapter 7

Calderón problem

This chapter focuses on the Calderón problem, which is another inverse diffusion problem with a behavior similar to that of the Cauchy problems considered in the preceding chapter. Because of its central role in inverse problem theory and of the importance of the mathematical tools that are used to solve it, we have devoted a full chapter to its study.
7.1 Introduction

The Calderón problem finds applications in two medical imaging methods, electrical impedance tomography (EIT) and optical tomography (OT). We begin with the setting of EIT, a non-invasive modality that consists of reconstructing the electrical properties (the conductivity) of tissues from current and voltage boundary measurements. The mathematical framework for the Calderón problem and the definition of the measurement operator were presented as Example 5 in section 1.2. For completeness, we repeat this definition here. The Calderón problem is modeled by the following elliptic problem with Dirichlet conditions

$$L_\gamma u(x) := \nabla \cdot \gamma(x) \nabla u(x) = 0, \quad x \in X; \qquad u(x) = f(x), \quad x \in \partial X, \qquad (7.1)$$

where $X \subset \mathbb{R}^n$ is a bounded domain with smooth boundary $\partial X$. In what follows, we assume $n \geq 3$. Here, $\gamma(x)$ is a conductivity coefficient, which we assume is a smooth function, and $f(x)$ is a prescribed Dirichlet data for the elliptic problem.

The Dirichlet-to-Neumann or voltage-to-current map is given by

$$\Lambda_\gamma : H^{\frac12}(\partial X) \to H^{-\frac12}(\partial X), \qquad f(x) \mapsto \Lambda_\gamma[f](x) = \gamma(x) \frac{\partial u}{\partial \nu}(x). \qquad (7.2)$$

With $\mathcal{X} = C^2(\bar X)$ and $\mathcal{Y} = \mathcal{L}(H^{\frac12}(\partial X), H^{-\frac12}(\partial X))$, we define the measurement operator

$$M : \mathcal{X} \ni \gamma \mapsto M(\gamma) = \Lambda_\gamma \in \mathcal{Y}. \qquad (7.3)$$

The Calderón problem consists of reconstructing $\gamma$ from knowledge of the Calderón measurement operator $M$. To slightly simplify the derivation of uniqueness, we also make the (unnecessary) assumption that $\gamma$ and $\frac{\partial \gamma}{\partial \nu}$ are known on $\partial X$. The main result of this chapter is the following.

Theorem 7.1.1 Define the measurement operator $M$ as in (7.3). Then $M$ is injective in the sense that $M(\gamma) = M(\tilde\gamma)$ implies that $\gamma = \tilde\gamma$.

Moreover, we have the following logarithmic stability estimate:

$$\|\gamma(x) - \tilde\gamma(x)\|_{L^\infty(X)} \leq C \big| \log \|M(\gamma) - M(\tilde\gamma)\|_{\mathcal{Y}} \big|^{-\delta}, \qquad (7.4)$$

for some $\delta > 0$ provided that $\gamma$ and $\tilde\gamma$ are uniformly bounded in $H^s(X)$ for some $s > \frac{n}{2}$.

The proof of the injectivity result was first obtained in [60]. It is based on two main ingredients. The first ingredient consists of recasting the injectivity result as a statement of whether products of functionals of solutions to elliptic equations such as (7.1) are dense in the space of, say, continuous functions. The second step is to construct specific sequences of solutions to (7.1) that positively answer the density question. These specific solutions are Complex Geometric Optics (CGO) solutions. Their construction is detailed in section 7.3. We first start with a section on the uniqueness and stability results.
7.2 Uniqueness and Stability

7.2.1 Reduction to a Schrödinger equation

We start with the following lemma.

Lemma 7.2.1 Let $\Lambda_{\gamma_j}$ for $j = 1,2$ be the two operators associated to $\gamma_j$ and let $f_j \in H^{\frac12}(\partial X)$ for $j = 1,2$ be two Dirichlet conditions. Then we find that

$$\int_{\partial X} (\Lambda_{\gamma_1} - \Lambda_{\gamma_2}) f_1\, f_2\, d\mu = \int_X (\gamma_1 - \gamma_2)\, \nabla u_1 \cdot \nabla u_2\, dx, \qquad (7.5)$$

where $d\mu$ is the surface measure on $\partial X$ and where $u_j$ is the solution to (7.1) with $\gamma$ replaced by $\gamma_j$ and $f$ replaced by $f_j$.

Here we use the notation on the left-hand side to mean:

$$\int_{\partial X} (\Lambda_{\gamma_1} - \Lambda_{\gamma_2}) f_1(x)\, f_2(x)\, d\mu(x) := \big\langle (\Lambda_{\gamma_1} - \Lambda_{\gamma_2}) f_1,\, f_2 \big\rangle_{H^{-\frac12}(\partial X),\, H^{\frac12}(\partial X)}.$$

Proof. The proof is a simple integration by parts. Let us consider the equation for $u_1$, multiply it by $u_2$ and integrate the product over $X$. Then we find by application of the divergence theorem that

$$0 = \int_X u_2\, L_{\gamma_1} u_1\, dx = -\int_X \gamma_1 \nabla u_1 \cdot \nabla u_2\, dx + \int_{\partial X} u_2\, \gamma_1 \frac{\partial u_1}{\partial \nu}\, d\mu(x) = -\int_X \gamma_1 \nabla u_1 \cdot \nabla u_2\, dx + \int_{\partial X} f_2\, \Lambda_{\gamma_1} f_1\, d\mu(x).$$

Exchanging the roles of the indices $j = 1$ and $j = 2$ and subtracting the result from the above equality yields (7.5).
The above lemma shows that when $\Lambda_{\gamma_1} = \Lambda_{\gamma_2}$, the right-hand side in (7.5) also vanishes for any solutions $u_1$ and $u_2$ of (7.1) with $\gamma$ given by $\gamma_1$ and $\gamma_2$, respectively. We are now thus faced with the question of whether products of the form $\nabla u_1 \cdot \nabla u_2$ are dense in the space of, say, continuous functions. Unfortunately, answering this question affirmatively seems to be a difficult problem. The main difficulty in the analysis of (7.1) is that the unknown coefficient $\gamma$ appears in the leading order of the differential operator $L_\gamma$. The following Liouville change of variables allows us to treat the unknown coefficient as a perturbation of a known operator (with constant coefficients):

$$\gamma^{-\frac12} L_\gamma \gamma^{-\frac12} = \Delta - q, \qquad q = \frac{\Delta \gamma^{\frac12}}{\gamma^{\frac12}}. \qquad (7.6)$$

Here $\Delta$ is the usual Laplacian operator.
Exercise 7.2.1 Prove (7.6).
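As a numerical sanity check of (7.6) in one dimension (a minimal sketch, not part of the text: the choices $\gamma = e^{2x}$ and $v = \sin 3x$ are arbitrary, and for this $\gamma$ one has $\gamma^{1/2} = e^x$ and $q = (\gamma^{1/2})'' / \gamma^{1/2} = 1$):

```python
import math

# 1D finite-difference check of the Liouville identity (7.6):
# gamma^{-1/2} d/dx( gamma d/dx( gamma^{-1/2} v ) ) = v'' - q*v,
# with gamma = exp(2x), so gamma^{1/2} = exp(x) and q = 1.
# gamma, v and the evaluation point x0 are arbitrary illustrative choices.
h = 1e-4
gamma = lambda x: math.exp(2.0 * x)
v = lambda x: math.sin(3.0 * x)

def d(f, x):                               # centered first derivative
    return (f(x + h) - f(x - h)) / (2.0 * h)

w = lambda x: gamma(x) ** -0.5 * v(x)      # w = gamma^{-1/2} v
flux = lambda x: gamma(x) * d(w, x)        # gamma * w'
x0 = 0.3
lhs = gamma(x0) ** -0.5 * d(flux, x0)
rhs = -9.0 * v(x0) - 1.0 * v(x0)           # v'' - q*v, with v'' = -9 sin(3x)
print(lhs, rhs)   # agree up to finite-difference error
```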
Therefore if $u$ is a solution of $L_\gamma u = 0$, then $v = \gamma^{\frac12} u$ is a solution to the Schrödinger equation $(\Delta - q)v = 0$. We thus wish to replace the problem of the reconstruction of $\gamma$ by that of $q$.

Consider the Schrödinger equation (still calling the solution $u$ rather than $v$)

$$(\Delta - q)u = 0 \ \text{ in } X, \qquad u = f \ \text{ on } \partial X, \qquad (7.7)$$

with $q$ given by (7.6). For $f \in H^{\frac12}(\partial X)$, we find a unique solution $u \in H^1(X)$ such that $\frac{\partial u}{\partial \nu} \in H^{-\frac12}(\partial X)$. Indeed, the above equation admits a solution since it is equivalent to (7.1) by the change of variables (7.6). We then define the Dirichlet-to-Neumann operator

$$\Lambda_q : H^{\frac12}(\partial X) \to H^{-\frac12}(\partial X), \qquad f(x) \mapsto \Lambda_q[f](x) = \frac{\partial u}{\partial \nu}(x), \qquad (7.8)$$

where $u$ is the solution to (7.7). We then verify that

$$\Lambda_q f = \gamma^{-\frac12} \Lambda_\gamma \big( \gamma^{-\frac12} f \big) + \frac12\, \gamma^{-1} \frac{\partial \gamma}{\partial \nu}\Big|_{\partial X} f, \qquad f \in H^{\frac12}(\partial X). \qquad (7.9)$$
Exercise 7.2.2 Prove the above result.

We thus observe that knowledge of $\Lambda_\gamma$, $\gamma|_{\partial X}$ and $\frac{\partial \gamma}{\partial \nu}|_{\partial X}$ implies knowledge of $\Lambda_q$. It turns out that knowledge of $\Lambda_\gamma$ implies that of $\gamma|_{\partial X}$ and $\frac{\partial \gamma}{\partial \nu}|_{\partial X}$:

Theorem 7.2.2 Let us assume that $0 < \gamma_i \in C^m(\bar X)$ and that $\Lambda_{\gamma_1} = \Lambda_{\gamma_2}$. Then we can show that for all $|\alpha| < m$, we have

$$\partial^\alpha \gamma_1 \big|_{\partial X} = \partial^\alpha \gamma_2 \big|_{\partial X}. \qquad (7.10)$$

See [42] for a proof of this result. To simplify the analysis of stability, we have assumed here that $\gamma|_{\partial X}$ and $\frac{\partial \gamma}{\partial \nu}|_{\partial X}$ were known. Thus, $\Lambda_\gamma$ uniquely determines $\Lambda_q$. Our next step is therefore to reconstruct $q$ from knowledge of $\Lambda_q$. We start with the following lemma:

Lemma 7.2.3 Let $\Lambda_{q_j}$ for $j = 1,2$ be the two operators associated to $q_j$ and let $f_j \in H^{\frac12}(\partial X)$ for $j = 1,2$ be two Dirichlet conditions. Then we find that

$$\int_{\partial X} (\Lambda_{q_1} - \Lambda_{q_2}) f_1\, f_2\, d\mu = \int_X (q_1 - q_2)\, u_1 u_2\, dx, \qquad (7.11)$$

where $d\mu$ is the surface measure on $\partial X$ and where $u_j$ is the solution to (7.7) with $q$ replaced by $q_j$ and $f$ replaced by $f_j$.
Exercise 7.2.3 Prove this lemma following the derivation in Lemma 7.2.1.
The above lemma shows that when $\Lambda_{q_1} = \Lambda_{q_2}$, then the right-hand side in (7.11) also vanishes for any solutions $u_1$ and $u_2$ of (7.7) with $q$ replaced by $q_1$ and $q_2$, respectively. We are now thus faced with the question of whether products of the form $u_1 u_2$ are dense in the space of, say, continuous functions. This is a question that admits an affirmative answer. The main tool in the proof of this density argument is the construction of complex geometric optics solutions. Such solutions are constructed in section 7.3. The main property that we need at the moment is summarized in the following lemma.

Lemma 7.2.4 Let $\rho \in \mathbb{C}^n$ be a complex valued vector such that $\rho \cdot \rho = 0$. Let $\|q\|_\infty < \infty$ and $|\rho|$ be sufficiently large. Then there is a solution $u$ of $(\Delta - q)u = 0$ in $X$ of the form

$$u(x) = e^{\rho \cdot x} (1 + \psi(x)), \qquad (7.12)$$

such that

$$|\rho| \|\psi\|_{L^2(X)} + \|\psi\|_{H^1(X)} \leq C. \qquad (7.13)$$

The proof of this and more general lemmas can be found in section 7.3. The principle of such solutions is this. When $q \equiv 0$, then $e^{\rho \cdot x}$ is a (complex-valued) harmonic function, i.e., a solution of $\Delta u = 0$. The above result shows that $q$ may be treated as a perturbation of $\Delta$. Solutions of $(\Delta - q)u = 0$ are fundamentally not that different from solutions of $\Delta u = 0$.

Now, coming back to the issue of density of products of elliptic solutions: for $u_1$ and $u_2$ solutions of the form (7.12), we find that

$$u_1 u_2 = e^{(\rho_1 + \rho_2) \cdot x} (1 + \psi_1 + \psi_2 + \psi_1 \psi_2). \qquad (7.14)$$

If we can choose $\rho_1 + \rho_2 = ik$ for a fixed $k$ with $|\rho_1|$ and $|\rho_2|$ growing to infinity so that $\psi_1 + \psi_2 + \psi_1 \psi_2$ becomes negligible in the $L^2$ sense thanks to (7.13), then we observe that in the limit $u_1 u_2$ equals $e^{ik \cdot x}$. The functions $e^{ik \cdot x}$ for arbitrary $k \in \mathbb{R}^n$ certainly form a dense family of, say, continuous functions.
7.2.2 Proof of injectivity result

Let us make a remark on the nature of the CGO solutions and the measurement operator $\mathcal M$. The CGO solutions are complex valued. Since the equations (7.1) and (7.7) are linear, we can assume that the boundary conditions $f=f_r+if_i$ are complex valued as a superposition of two real-valued boundary conditions $f_r$ and $f_i$. Moreover, the results (7.5) and (7.11) hold for complex-valued solutions. Our objective is therefore to show that the product of complex-valued solutions to elliptic equations of the form (7.7) is indeed dense. The construction in dimension $n\ge 3$ goes as follows.
Let $k\in\mathbb{R}^n$ be fixed for $n\ge 3$. We choose $\varrho_{1,2}$ as
\[ \varrho_1 = \frac{m}{2} + i\,\frac{k+l}{2}, \qquad \varrho_2 = -\frac{m}{2} + i\,\frac{k-l}{2}, \qquad (7.15) \]
where the real-valued vectors $l$ and $m$ are chosen in $\mathbb{R}^n$ such that
\[ m\cdot k = m\cdot l = k\cdot l = 0, \qquad |m|^2 = |k|^2 + |l|^2. \qquad (7.16) \]
We verify that $\varrho_i\cdot\varrho_i=0$ and that $|\varrho_i|^2 = \frac12\big(|k|^2+|l|^2\big)$. In dimension $n\ge 3$, such vectors can always be found. For instance, changing the system of coordinates so that $k=|k|e_1$, we can choose $l=|l|e_2$ with $|l|>0$ arbitrary and then $m=\sqrt{|k|^2+|l|^2}\,e_3$, where $(e_1,e_2,e_3)$ forms a family of orthonormal vectors in $\mathbb{R}^n$. Note that this construction is possible only when $n\ge 3$. It is important to notice that while $k$ is fixed, $|l|$ can be chosen arbitrarily large so that the norm of $\varrho_i$ can be arbitrarily large while $\varrho_1+\varrho_2=ik$ is fixed.
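The algebra behind (7.15)-(7.16) is easy to check numerically. The sketch below (an illustration only, with numpy assumed and arbitrary numerical values) builds the pair $\varrho_1,\varrho_2$ in the adapted coordinates and verifies that $\varrho_j\cdot\varrho_j=0$, that the sum $\varrho_1+\varrho_2=ik$ stays fixed, and that $|\varrho_j|^2=\frac12(|k|^2+|l|^2)$ grows with $|l|$:

```python
import numpy as np

def cgo_pair(k_norm, l_norm):
    """Vectors of (7.15) in adapted coordinates: k = |k| e1, l = |l| e2,
    m = sqrt(|k|^2 + |l|^2) e3, which satisfy the constraints (7.16)."""
    e1, e2, e3 = np.eye(3)
    k = k_norm * e1
    l = l_norm * e2
    m = np.sqrt(k_norm**2 + l_norm**2) * e3
    rho1 = m / 2 + 1j * (k + l) / 2
    rho2 = -m / 2 + 1j * (k - l) / 2
    return k, rho1, rho2

k, rho1, rho2 = cgo_pair(2.0, 100.0)
for rho in (rho1, rho2):
    # bilinear (unconjugated) dot product: rho . rho = 0
    assert abs(np.dot(rho, rho)) < 1e-9
# the sum i*k stays fixed while each |rho_j|^2 = (|k|^2 + |l|^2)/2 is large
assert np.allclose(rho1 + rho2, 1j * k)
assert np.isclose(np.vdot(rho1, rho1).real, 0.5 * (2.0**2 + 100.0**2))
```

Taking `l_norm` larger and larger leaves $\varrho_1+\varrho_2$ unchanged, which is exactly the mechanism exploited in the uniqueness proof below.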
Upon combining (7.11) and (7.14), we obtain for the choice (7.15) that $\Lambda_{q_1}=\Lambda_{q_2}$ implies that
\[ \Big|\int_X e^{ik\cdot x}(q_1-q_2)\,dx\Big| = \Big|\int_X e^{ik\cdot x}(q_1-q_2)(\varphi_1+\varphi_2+\varphi_1\varphi_2)\,dx\Big| \le \frac{C}{|l|}, \]
thanks to (7.13), since $|l|(\varphi_1+\varphi_2+\varphi_1\varphi_2)$ is bounded in $L^1(X)$ by an application of the Cauchy-Schwarz inequality and $e^{ik\cdot x}(q_1-q_2)$ is bounded in $L^\infty(X)$. Since the above inequality holds independent of $l$, we deduce that the Fourier transform of $q_1-q_2$ (extended by $0$ outside of $X$) vanishes, and hence that $q_1=q_2$. So far we have thus proved that
\[ \Lambda_{\gamma_1}=\Lambda_{\gamma_2} \;\Longrightarrow\; \Lambda_{q_1}=\Lambda_{q_2} \;\Longrightarrow\; q_1=q_2, \]
where $q_j$ and $\gamma_j$ are related by (7.6). From (7.6) still, we deduce that
\[ 0 = \gamma_1^{\frac12}\Delta\gamma_2^{\frac12} - \gamma_2^{\frac12}\Delta\gamma_1^{\frac12} = \nabla\cdot\big(\gamma_1^{\frac12}\nabla\gamma_2^{\frac12} - \gamma_2^{\frac12}\nabla\gamma_1^{\frac12}\big) = \nabla\cdot\Big(\gamma_1\nabla\Big(\frac{\gamma_2}{\gamma_1}\Big)^{\frac12}\Big). \qquad (7.17) \]
Since $\gamma_1=\gamma_2$ on $\partial X$, this is an elliptic equation for $\big(\frac{\gamma_2}{\gamma_1}\big)^{\frac12}$ whose only solution is identically $1$. This shows that $\gamma_1=\gamma_2$. This concludes the proof of the uniqueness result
\[ \Lambda_{\gamma_1}=\Lambda_{\gamma_2} \;\Longrightarrow\; \gamma_1=\gamma_2. \qquad (7.18) \]
7.2.3 Proof of the stability result

Let us return to (7.11) and assume that $\Lambda_{q_1}-\Lambda_{q_2}$ no longer vanishes but is (arbitrarily) small. We first want to assess how errors in $\Lambda_q$ translate into errors in $q$. For $u_j$ solutions as stated in Lemma 7.2.3 and of the form (7.12), we find that
\[
\begin{aligned}
\Big|\int_X e^{ik\cdot x}(q_1-q_2)\,dx\Big|
&\le \Big|\int_X e^{ik\cdot x}(q_1-q_2)(\varphi_1+\varphi_2+\varphi_1\varphi_2)\,dx\Big| + \Big|\int_{\partial X}\big((\Lambda_{q_1}-\Lambda_{q_2})f_1\big)\,f_2\,d\sigma\Big| \\
&\le \frac{C}{|l|} + \|\Lambda_{q_1}-\Lambda_{q_2}\|_Y\,\|f_1\|_{H^{\frac12}(\partial X)}\,\|f_2\|_{H^{\frac12}(\partial X)} \\
&\le \frac{C}{|l|} + C\,\|\Lambda_{q_1}-\Lambda_{q_2}\|_Y\,\|u_1\|_{H^1(X)}\,\|u_2\|_{H^1(X)} \\
&\le \frac{C}{|l|} + C\,|l|\,\|\Lambda_{q_1}-\Lambda_{q_2}\|_Y\,e^{C|l|}.
\end{aligned}
\]
Indeed, $f_j = u_j|_{\partial X}$ and $\|u\|_{H^{\frac12}(\partial X)} \le C\|u\|_{H^1(X)}$ is a standard estimate. This step is where the ill-posedness of the Calderón problem is best displayed.
Exercise 7.2.4 Verify that for some constant $C$ independent of $|l|$ and for $u$ given by (7.12)-(7.13), we have:
\[ \|u\|_{H^1(X)} \le C|l|\,e^{C|l|}. \]
Define $\delta q = q_1-q_2$. So far, we have obtained a control of $\widehat{\delta q}(k)$ uniform in $k\in\mathbb{R}^n$. Upon choosing
\[ |l| = \frac{1}{2C}\,\ln\frac1\varepsilon, \qquad 0<\varepsilon<1, \]
so that $e^{C|l|} = \varepsilon^{-\frac12}$, we find that for $\varepsilon := \min\big(1, \|\Lambda_{q_1}-\Lambda_{q_2}\|_Y\big)$,
\[ |\widehat{\delta q}(k)| \le C\Big(\ln\frac1\varepsilon\Big)^{-1}. \qquad (7.19) \]
Since $\delta q$ is assumed to be bounded and compactly supported, it is square integrable in $\mathbb{R}^n$ so that $\|\delta q\|_{L^2(\mathbb{R}^n)} := E < \infty$. This and the control in (7.19) allow one to obtain a control of $\delta q$ in $H^{-s}(\mathbb{R}^n)$ for $s>0$. Indeed, denoting by $\eta = C(\ln\frac1\varepsilon)^{-1}$ the bound in (7.19),
\[ \|\delta q\|^2_{H^{-s}(\mathbb{R}^n)} = \int \langle k\rangle^{-2s}\,|\widehat{\delta q}|^2\,dk \;\le\; C\big(k_0^n\,\eta^2 + k_0^{-2s}E^2\big), \]
by splitting the integration in $k$ into $|k|<k_0$ and $|k|>k_0$ and choosing $k_0\ge 1$. We then choose
\[ k_0 = \Big(\frac{E}{\eta}\Big)^{\frac{2}{n+2s}}. \]
This implies
\[ \|q_1-q_2\|_{H^{-s}(\mathbb{R}^n)} \le C\,E^{\frac{n}{n+2s}}\Big(\ln\frac1\varepsilon\Big)^{-\frac{2s}{n+2s}}. \qquad (7.20) \]
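The role of the frequency cutoff $k_0$ can be seen numerically: with $\eta$ the uniform Fourier bound of (7.19) and $E$ the $L^2$ bound, the choice $k_0=(E/\eta)^{2/(n+2s)}$ equalizes the two contributions to the $H^{-s}$ norm and is optimal up to a constant factor. A small sketch (numpy assumed; the numerical values are arbitrary):

```python
import numpy as np

def hs_bound(k0, eta, E, n, s):
    # the two contributions after splitting at |k| = k0: low frequencies use
    # the uniform bound eta on the Fourier transform, high frequencies use E
    return k0**n * eta**2 + k0 ** (-2 * s) * E**2

n, s, E, eta = 3, 1.0, 5.0, 1e-3
k0_star = (E / eta) ** (2.0 / (n + 2 * s))
# both terms are exactly balanced at k0_star ...
assert np.isclose(k0_star**n * eta**2, k0_star ** (-2 * s) * E**2)
# ... and the resulting bound is optimal up to a constant factor
grid = np.linspace(0.5 * k0_star, 2.0 * k0_star, 401)
assert hs_bound(k0_star, eta, E, n, s) <= 1.05 * hs_bound(grid, eta, E, n, s).min()
```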
Exercise 7.2.5 Assume that $\|q\|_{H^\varsigma(\mathbb{R}^n)} := E < \infty$ for some $\varsigma>0$. Show that the estimate (7.20) holds with $s$ replaced by $s+\varsigma$ on the right-hand side.
It remains to convert the estimate on $q_1-q_2$ into an estimate for $\gamma_1-\gamma_2$. We find that (7.17) is replaced by
\[ (\gamma_1\gamma_2)^{\frac12}(q_2-q_1) = \nabla\cdot\Big(\gamma_1\nabla\Big(\frac{\gamma_2}{\gamma_1}\Big)^{\frac12}\Big) \ \text{ in } X, \qquad \Big(\frac{\gamma_2}{\gamma_1}\Big)^{\frac12} - 1 = 0 \ \text{ on } \partial X. \qquad (7.21) \]
Standard elliptic regularity results and the fact that $\gamma_1$ is of class $C^2$ therefore show that
\[ \|\gamma_1-\gamma_2\|_{H^1(X)} \le C\,\|q_1-q_2\|_{H^{-1}(X)} \le C\Big(\ln\frac1\varepsilon\Big)^{-\theta}, \qquad (7.22) \]
with $\theta = \frac{2}{2+n}$ if $q$ is bounded in the $L^2$ sense, and $\theta = \frac{2(1+\varsigma)}{n+2(1+\varsigma)}$ if $q$ is bounded as in
Exercise 7.2.5.

The final result in (7.4) then follows from interpolating the a priori bound in $H^s$ of $\gamma_1-\gamma_2$ with the above smallness bound in $H^1$ to obtain a small bound in $H^\tau$ for some $\frac n2<\tau<s$. Then by the Sobolev imbedding of $H^\tau(X)$ into $L^\infty(X)$, we conclude the proof of Theorem 7.1.1.
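The interpolation step used above is the standard inequality $\|f\|_{H^r}\le\|f\|_{H^1}^\theta\,\|f\|_{H^s}^{1-\theta}$ for $r=\theta+(1-\theta)s$, which follows from Hölder's inequality on the Fourier side. A quick numerical illustration on a Fourier multiplier model (a sketch; numpy assumed and the coefficients arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
k = np.arange(1, 2000)
fk = rng.standard_normal(k.size) / k**2    # arbitrary decaying Fourier coefficients

def h_norm(r):
    # Sobolev norm via the Fourier multiplier (1 + |k|^2)^{r/2}
    return np.sqrt(np.sum((1.0 + k.astype(float) ** 2) ** r * fk**2))

s, theta = 3.0, 0.4
r = theta * 1.0 + (1.0 - theta) * s
# interpolation inequality ||f||_{H^r} <= ||f||_{H^1}^theta ||f||_{H^s}^(1-theta)
assert h_norm(r) <= h_norm(1.0) ** theta * h_norm(s) ** (1.0 - theta) * (1 + 1e-12)
```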
7.3 Complex Geometric Optics Solutions

The major technical ingredient in the proof of Theorem 7.1.1 is the existence of complex geometrical optics (CGO) solutions of the form (7.12). The proof of Lemma 7.2.4 and of more general results that will be useful in subsequent chapters is undertaken in this section.

Let us consider the equation $\Delta u = qu$ in $X$. When $q\equiv 0$, a rich family of harmonic solutions is formed by the complex exponentials $e^{\varrho\cdot x}$ for $\varrho\in\mathbb{C}^n$ a complex valued vector such that $\varrho\cdot\varrho=0$. Indeed, we verify that
\[ \Delta e^{\varrho\cdot x} = \varrho\cdot\varrho\, e^{\varrho\cdot x} = 0. \qquad (7.23) \]
A vector $\varrho=\varrho_r+i\varrho_i$ is such that $\varrho\cdot\varrho=0$ if and only if $\varrho_r$ and $\varrho_i$ are orthogonal vectors of the same (Euclidean) length.
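This characterization is immediate: writing $\varrho=\varrho_r+i\varrho_i$, one has $\varrho\cdot\varrho=|\varrho_r|^2-|\varrho_i|^2+2i\,\varrho_r\cdot\varrho_i$, which vanishes exactly when the two vectors are orthogonal and of equal length. A finite-difference sanity check that $e^{\varrho\cdot x}$ is then harmonic (illustrative only; numpy assumed):

```python
import numpy as np

rho = np.array([1.5, 0.0]) + 1j * np.array([0.0, 1.5])  # rho_r orthogonal to rho_i, equal lengths
assert abs(np.dot(rho, rho)) < 1e-12

def u(x):
    return np.exp(np.dot(rho, x))

x0, h = np.array([0.3, -0.7]), 1e-4
# centered second differences approximate Delta e^{rho.x} = (rho.rho) e^{rho.x} = 0
lap = sum((u(x0 + h * e) - 2 * u(x0) + u(x0 - h * e)) / h**2 for e in np.eye(2))
assert abs(lap) / abs(u(x0)) < 1e-5
```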
When $q\ne 0$, it is tempting to try and write solutions of $\Delta u=qu$ as perturbations of the harmonic solutions $e^{\varrho\cdot x}$, for instance in the form
\[ u(x) = e^{\varrho\cdot x}\big(1+\varphi(x)\big). \]
This provides an equation for $\varphi$ of the form
\[ (\Delta + 2\varrho\cdot\nabla)\varphi = q(1+\varphi). \qquad (7.24) \]

Exercise 7.3.1 Check this formula.

Treating the right-hand side as a source $f$, the first part of the construction consists of solving the problem
\[ (\Delta + 2\varrho\cdot\nabla)\varphi = f, \qquad (7.25) \]
for $f$ a source in $X$ and $\varphi$ defined on $X$ as well. Surprisingly, the analysis of (7.25) is the most challenging technical step in the construction of solutions to (7.24). The construction with $f\in L^2(X)$ is sufficient for the proof of Theorem 7.1.1. In later chapters, we will require more regularity for the solution to (7.25) and thus prove the following result.
Lemma 7.3.1 Let $f\in H^s(X)$ for $s\ge 0$ and let $|\varrho|\ge c>0$. Then there exists a solution $\varphi$ to (7.25) in $H^{s+1}(X)$ and such that
\[ |\varrho|\,\|\varphi\|_{H^s(X)} + \|\varphi\|_{H^{s+1}(X)} \le C\,\|f\|_{H^s(X)}. \qquad (7.26) \]
Proof. We first extend $f$ defined on $X$ to a function still called $f$ defined and compactly supported in $\mathbb{R}^n$ and such that
\[ \|f\|_{H^s(\mathbb{R}^n)} \le C(X)\,\|f\|_{H^s(X)}. \]
That such an extension exists is proved for instance in [59, Chapter 6, Theorem 5]. We thus wish to solve the problem
\[ (\Delta + 2\varrho\cdot\nabla)\varphi = f \quad\text{in } \mathbb{R}^n. \qquad (7.27) \]
The main difficulty is that the operator $(\Delta+2\varrho\cdot\nabla)$ has for symbol
\[ \mathcal F_{x\to\xi}\,(\Delta+2\varrho\cdot\nabla)\,\mathcal F^{-1}_{\xi\to x} = -|\xi|^2 + 2i\,\varrho\cdot\xi. \]
Such a symbol vanishes for $\varrho_r\cdot\xi=0$ and $2\varrho_i\cdot\xi+|\xi|^2=0$. The original proof of the injectivity result of Theorem 7.1.1 in [60] shows the existence and uniqueness of a solution to (7.27) in appropriate functional spaces. Since uniqueness is of no concern to us here, we instead follow a path undertaken in [36, 55] and construct a solution that can be seen as the product of a plane wave with a periodic solution with a different period. Let us define
\[ \varphi = e^{i\zeta\cdot x}\,p, \qquad f = e^{i\zeta\cdot x}\,\tilde f, \]
for some vector $\zeta\in\mathbb{R}^n$ to be determined. Then we find
\[ \big(\Delta + 2(\varrho+i\zeta)\cdot\nabla + (2i\,\varrho\cdot\zeta - |\zeta|^2)\big)p = (\nabla + i\zeta + 2\varrho)\cdot(\nabla+i\zeta)\,p = \tilde f. \qquad (7.28) \]
Exercise 7.3.2 Verify this.

Let us now assume that $\tilde f$ is supported in a box $Q$ of size $(-L,L)^n$ for $L$ sufficiently large. Then we decompose as Fourier series:
\[ p = \sum_{k\in\mathbb{Z}^n} p_k\, e^{i\frac{\pi}{L}k\cdot x}, \qquad \tilde f = \sum_{k\in\mathbb{Z}^n} f_k\, e^{i\frac{\pi}{L}k\cdot x}. \qquad (7.29) \]
We then find that (7.28) is equivalent in the Fourier domain to
\[ p_k = \frac{1}{-\big|\frac{\pi}{L}k+\zeta\big|^2 + 2i\,\varrho\cdot\big(\zeta+\frac{\pi}{L}k\big)}\; f_k. \qquad (7.30) \]
The imaginary part of the denominator is given by $2\varrho_r\cdot\big(\frac{\pi}{L}k+\zeta\big)$. It remains to choose
\[ \zeta = \frac{\pi}{2L}\,\frac{\varrho_r}{|\varrho_r|}, \]
to obtain that the above denominator never vanishes since $k\in\mathbb{Z}^n$. Moreover, for such a choice, we deduce that
\[ \Big| -\Big|\frac{\pi}{L}k+\zeta\Big|^2 + 2i\,\varrho\cdot\Big(\zeta+\frac{\pi}{L}k\Big)\Big| \ge C|\varrho|, \]
for some constant $C$ independent of $\varrho$. This shows that
\[ |p_k| \le C|\varrho|^{-1}\,|f_k|. \]
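The lower bound on the denominator of (7.30) can be probed numerically. In the sketch below (numpy assumed; all values arbitrary), $\varrho_r$ is aligned with a lattice direction, so that $\varrho_r\cdot(\frac{\pi}{L}k+\zeta) = \frac{\pi}{L}|\varrho_r|(k_1+\frac12)$ and the modulus of the denominator stays bounded below by $\frac{\pi}{L}|\varrho_r|$ over the whole lattice:

```python
import numpy as np

L, t = 2.0, 50.0
rho = t * (np.array([1.0, 0.0, 0.0]) + 1j * np.array([0.0, 1.0, 0.0]))
rho_r = rho.real
zeta = (np.pi / (2 * L)) * rho_r / np.linalg.norm(rho_r)   # shift vector of the text

ks = np.array(np.meshgrid(*[np.arange(-30, 31)] * 3)).reshape(3, -1).T
xi = (np.pi / L) * ks + zeta                 # shifted dual lattice points
denom = -np.einsum("ij,ij->i", xi, xi) + 2j * (xi @ rho)
# |denominator| >= (pi/L)|rho_r| over the lattice, hence |p_k| <= C |rho|^{-1} |f_k|
assert np.min(np.abs(denom)) > 0.99 * (np.pi / L) * np.linalg.norm(rho_r)
```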
Since $\tilde f\in H^s(Q)$, we deduce that $\|\tilde f\|^2_{H^s(Q)} = \sum_{k\in\mathbb{Z}^n}\langle k\rangle^{2s}|f_k|^2 < \infty$, from which we deduce that
\[ \|p\|_{H^s(Q)} \le C|\varrho|^{-1}\,\|\tilde f\|_{H^s(Q)}. \]
It remains to restrict the constructed solution to $X$ (and realize that $e^{i\zeta\cdot x}$ is smooth) to obtain that $|\varrho|\,\|\varphi\|_{H^s(X)} \le C\|f\|_{H^s(X)}$, the first step in (7.26).
The result on $\|\varphi\|_{H^{s+1}(X)}$ requires that we obtain bounds for $|k|\,p_k$. For $|k|$ small, say $|k|\le \frac{8L}{\pi}|\varrho|$, we use the same result as above to obtain
\[ |k|\,|p_k| \le C|f_k|, \qquad |k|\le \frac{8L}{\pi}|\varrho|. \]
For the other values of $|k|$, we realize that the denominator in (7.30) causes no problem and that
\[ |k|\,|p_k| \le C|k|^{-1}|f_k|, \qquad |k| > \frac{8L}{\pi}|\varrho|. \]
This shows that $|k|\,|p_k|\le C|f_k|$ for some constant $C$ independent of $k$ and $\varrho$. The proof that $\|\varphi\|_{H^{s+1}(X)}\le C\|f\|_{H^s(X)}$ then proceeds as above. This concludes the proof of the fundamental lemma of CGO solutions to Schrödinger equations.
We now come back to the perturbed problem (7.24). We assume that $q$ is a complex-valued potential in $H^s(X)$ for some $s\ge 0$. We say that $q\in L^\infty(X)$ has regularity $s$ provided that for all $\varphi\in H^s(X)$, we have
\[ \|q\varphi\|_{H^s(X)} \le q_s\,\|\varphi\|_{H^s(X)}, \qquad (7.31) \]
for some constant $q_s$. For instance, when $s=0$, then $q_s = \|q\|_{L^\infty(X)}$. Then we have the following result.
Theorem 7.3.2 Let us assume that $q\in H^s(X)$ is sufficiently smooth so that $q_s<\infty$. Then for $|\varrho|$ sufficiently large, there exists a solution $\varphi$ to (7.24) that satisfies
\[ |\varrho|\,\|\varphi\|_{H^s(X)} + \|\varphi\|_{H^{s+1}(X)} \le C\,\|q\|_{H^s(X)}. \qquad (7.32) \]
Moreover, we have that
\[ u(x) = e^{\varrho\cdot x}\big(1+\varphi(x)\big) \qquad (7.33) \]
is a Complex Geometrical Optics solution in $H^{s+1}(X)$ to the equation
\[ \Delta u = qu \quad\text{in } X. \]
Proof. Let $T$ be the operator which to $f\in H^s(X)$ associates $\varphi\in H^s(X)$, the solution of (7.27) constructed in the proof of Lemma 7.3.1. Then (7.24) may be recast as
\[ (I - Tq)\varphi = Tq. \]
We know that $\|T\|_{\mathcal L(H^s(X))} \le C_s|\varrho|^{-1}$. Choosing $|\varrho|$ sufficiently large so that $|\varrho| > C_s q_s$, we deduce that $(I-Tq)^{-1} = \sum_{m=0}^\infty (Tq)^m$ exists and is a bounded operator in $\mathcal L(H^s(X))$. We have therefore constructed a solution such that $q(1+\varphi)\in H^s(X)$. The estimate (7.26) yields (7.32) and concludes the proof of the theorem.

Note that the above theorem with $s=0$ yields Lemma 7.2.4.
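The Neumann-series argument in the proof is the standard one: if $\|A\|<1$ on a Banach space, then $(I-A)^{-1}=\sum_m A^m$ converges. A finite-dimensional stand-in for $A=Tq$ (a sketch only; numpy assumed and the matrix arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A *= 0.5 / np.linalg.norm(A, 2)   # rescale so ||A||_2 = 0.5 < 1, like Tq for large |rho|
b = rng.standard_normal(6)        # stand-in for the source term Tq

phi, term = np.zeros(6), b.copy()
for _ in range(200):              # phi = sum_m A^m b, the Neumann series
    phi += term
    term = A @ term
assert np.allclose(phi, np.linalg.solve(np.eye(6) - A, b))
```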
Let us now consider the elliptic equation (7.1). The change of variables in (7.6) shows that $u = \gamma^{-\frac12}v$, with $v$ a solution of $\Delta v = qv$, is a solution of (7.1). We therefore have the

Corollary 7.3.3 Let $\gamma$ be sufficiently smooth so that $q = \frac{\Delta\gamma^{\frac12}}{\gamma^{\frac12}}$ verifies the hypotheses of Theorem 7.3.2. Then for $|\varrho|$ sufficiently large, we can find a solution $u$ of $\nabla\cdot\gamma\nabla u = 0$ on $X$ such that
\[ u(x) = \frac{1}{\gamma^{\frac12}(x)}\, e^{\varrho\cdot x}\big(1+\varphi(x)\big), \qquad (7.34) \]
and such that (7.32) holds.

For instance, for $s\in\mathbb{N}$, we verify that (7.31) holds provided that $\gamma$ is of class $C^{s+2}(\bar X)$. The case $s=0$ with $\gamma$ of class $C^2(\bar X)$ is the setting of Theorem 7.1.1.
7.4 The Optical Tomography setting

We have seen that optical tomography measurements could be modeled by radiative transfer equations. In the diffusive regime, where photons travel over distances that are much larger than the mean free path, the photon density is accurately modeled by a diffusion equation. Let us assume that the source of photons used to probe the domain of interest is time harmonic with a frequency modulation $\omega$. The density of such time harmonic photons then solves the following elliptic model
\[ -\nabla\cdot\gamma\nabla u + (\sigma+i\omega)u = 0 \ \text{ in } X, \qquad u = f \ \text{ on } \partial X. \qquad (7.35) \]
We assume here that $f$ is the prescribed distribution of photons at the domain's boundary. This is an approximation, as only the density of incoming photons can be prescribed. More accurate models would not fundamentally change the main conclusions of this section and we therefore assume Dirichlet conditions to simplify.

The coefficients $(\gamma,\sigma)$ are the diffusion and absorption coefficients, respectively. We assume that light speed is normalized to $c=1$ to simplify (otherwise $\omega$ should be replaced by $c^{-1}\omega$).

The Dirichlet-to-Neumann or density-to-current map is given in this setting by
\[ \Lambda_{\gamma,\sigma} : H^{\frac12}(\partial X) \to H^{-\frac12}(\partial X), \qquad f(x) \mapsto \Lambda_{\gamma,\sigma}[f](x) = \gamma(x)\,\frac{\partial u}{\partial\nu}(x). \qquad (7.36) \]
With $\mathcal X = C^2(\bar X)\times C^0(\bar X)$ and $\mathcal Y = \mathcal L\big(H^{\frac12}(\partial X), H^{-\frac12}(\partial X)\big)$, we define the measurement operator
\[ \mathcal M_\omega : \mathcal X \ni (\gamma,\sigma) \mapsto \mathcal M_\omega(\gamma,\sigma) = \Lambda_{\gamma,\sigma} \in \mathcal Y. \qquad (7.37) \]
The measurement operator is parameterized by the (known) modulation frequency $\omega$, and there are now two unknown coefficients $\gamma$ and $\sigma$. Then we have the following result.
Theorem 7.4.1 Define the measurement operator $\mathcal M_\omega$ as in (7.37). Then for $\omega\ne 0$, we have that $\mathcal M_\omega$ is injective in the sense that $\mathcal M_\omega(\gamma,\sigma) = \mathcal M_\omega(\tilde\gamma,\tilde\sigma)$ implies that $(\gamma,\sigma)=(\tilde\gamma,\tilde\sigma)$.

Moreover, we have the following logarithmic stability estimate:
\[ \|\gamma-\tilde\gamma\|_{L^\infty(X)} + \|\sigma-\tilde\sigma\|_{L^\infty(X)} \le C\,\big|\log\|\mathcal M_\omega(\gamma,\sigma)-\mathcal M_\omega(\tilde\gamma,\tilde\sigma)\|_{\mathcal Y}\big|^{-\delta}, \qquad (7.38) \]
for some $\delta>0$, provided that $(\gamma,\sigma)$ and $(\tilde\gamma,\tilde\sigma)$ are uniformly bounded in $(H^s(X))^2$ for some $s>\frac n2$. Here, $C$ depends on $\omega$.

Let now $\omega=0$. Then $\mathcal M(\gamma,\sigma) = \mathcal M(\tilde\gamma,\tilde\sigma)$ implies that
\[ q := \frac{\Delta\gamma^{\frac12}}{\gamma^{\frac12}} + \frac{\sigma}{\gamma} \;=\; \tilde q := \frac{\Delta\tilde\gamma^{\frac12}}{\tilde\gamma^{\frac12}} + \frac{\tilde\sigma}{\tilde\gamma}, \qquad (7.39) \]
and $(\gamma,\sigma)$ are defined up to an arbitrary change of variables that leaves $q$ above invariant. Moreover, $q$ above can be reconstructed in $H^{-1}(X)$ as shown in the preceding sections, and hence in $L^\infty(X)$ by interpolation.
Proof. The proof is very similar to that of Theorem 7.1.1. We mainly highlight the differences. Let $u$ be a solution of the above equation and $v = \gamma^{\frac12}u$. We verify that
\[ \Delta v = q_\omega v, \qquad q_\omega = \frac{\Delta\gamma^{\frac12}}{\gamma^{\frac12}} + \frac{\sigma+i\omega}{\gamma}. \]
Let us assume that $\omega\ne 0$. Then knowledge of $\Lambda_{\gamma,\sigma}$ yields knowledge of $\Lambda_{q_\omega}$ as before. This uniquely determines $q_\omega$ and allows us to obtain an error bound for $q_\omega$ in $H^{-1}(X)$. The imaginary part of $q_\omega$, which equals $\omega/\gamma$, thus provides a reconstruction of $\gamma$ in the same space, and hence in $L^\infty$ by interpolation between $H^{-1}(X)$ and $H^s(X)$ for $s>\frac n2$. Since $n\ge 3$, we thus obtain an error in $\Delta\gamma^{\frac12}$ also in $H^{-1}$, which then yields an error estimate for $\sigma$, also in $H^{-1}(X)$. Again, by interpolation, we find a value of $\delta$ so that (7.38) holds. Note that the value of $\delta$ is a priori worse than the one found for the Calderón problem, since we do not take advantage of solving an elliptic equation for $\gamma^{\frac12}$.
This concludes the proof when $\omega\ne 0$. When $\omega=0$, then clearly we can reconstruct $q$ in (7.39) with the corresponding stability estimate. Now let us consider two couples $(\gamma,\sigma)$ and $(\tilde\gamma,\tilde\sigma)$ such that (7.39) holds. From (7.9), we deduce that
\[ \Lambda_{\gamma,\sigma}(f) = \gamma^{\frac12}\big|_{\partial X}\,\Lambda_q\big(\gamma^{\frac12}\big|_{\partial X}\, f\big) - \big(\gamma^{\frac12}\partial_\nu\gamma^{\frac12}\big)\big|_{\partial X}\, f, \qquad f\in H^{\frac12}(\partial X). \qquad (7.40) \]
Since we assume $\gamma$ and $\partial_\nu\gamma$ known on $\partial X$, then we obtain that $\Lambda_{\gamma,\sigma} = \Lambda_{\tilde\gamma,\tilde\sigma}$. This shows that the reconstruction of $(\gamma,\sigma)$ is obtained up to any change of variables that leaves $q$ in (7.39) invariant.
The above theorem shows that measurements in Optical Tomography with continuous wave (CW) sources, corresponding to $\omega=0$, do not guarantee uniqueness of the reconstruction of $(\gamma,\sigma)$. The injectivity of the measurement operator is restored when $\omega\ne 0$. However, as for the Calderón problem, the stability estimates are logarithmic (and cannot fundamentally be improved; only the value of $\delta$ may not be optimal in the above derivations).

For additional information about the Calderón problem, we refer the reader to [32, 55, 65] and to the historical references [7, 46, 49, 60].
Chapter 8

Coupled-physics IP I: Photo-acoustic Tomography and Transient Elastography

Many inverse diffusion problems, including the Calderón problem, are modeled by measurement operators that are injective but not very stable, in the sense that the modulus of continuity $\omega(\varepsilon)$ in (1.3) is logarithmic. The main reason for such a behavior is that solutions to elliptic equations are very smooth away from where the singularities are generated. As a consequence, singularities in the parameters do not propagate into strong, easily recognizable signals in the boundary measurements. High frequencies of the parameters are still present in the measurements when the measurement operator is injective. However, they have been exponentially attenuated, and the inverse problem is then best described as severely ill-posed; it typically displays poor resolution capabilities. Imaging modalities based on inverse diffusion problems are still useful because of the high contrast they often display, for instance between healthy and non-healthy tissues in medical imaging. Modalities such as Electrical Impedance Tomography and Optical Tomography may be described as high contrast, low resolution modalities.

In applications where resolution is not paramount, the severe ill-posedness of inverse diffusion problems might not be an issue. In many instances, however, high resolution in the reconstruction is an important objective of the imaging modality. The remedy to such a low resolution is to find methodologies that combine the high contrast modality with another well-posed, high-resolution inversion, such as, for instance, those involving inverse wave problems or inverse problems of integral geometry (such as the Radon transform). In order for the combination to exist, we need to be sufficiently fortunate that a measurable physical phenomenon couples the high resolution wave-like mechanism with the high contrast elliptic-like phenomenon. Such examples exist in nature and give rise to hybrid imaging techniques that couple the high contrast (but low resolution) of the diffusion-like equation with the high resolution (but often low contrast) of the wave-like equation.

In this chapter and the next, we consider several such physical couplings. In this chapter, we consider the coupling of optical waves with ultrasound in the so-called photo-acoustic effect. This gives rise to a modality called Photo-acoustic Tomography (PAT). We also consider the coupling of elastic waves with ultrasound in the modality called Transient Elastography (TE). TE and PAT are relatively similar mathematically and both involve inverse problems with internal functionals.

In the next chapter, we consider another coupling of optical waves with ultrasound called ultrasound modulation. Mathematically, this problem also gives rise to an inverse problem with internal functionals, albeit one whose analysis is more complicated than for TE and PAT.
8.1 Introduction to PAT and TE

8.1.1 Modeling of photoacoustic tomography

Photoacoustic tomography (PAT) is a hybrid medical imaging modality that combines the high resolution of acoustic waves with the high contrast of optical waves. When a body is exposed to short pulse radiation, typically emitted in the near infra-red spectrum in PAT, it absorbs energy and expands thermo-elastically by a very small amount; this is the photoacoustic effect. Such an expansion is sufficient to emit acoustic pulses, which travel back to the boundary of the domain of interest, where they are measured by arrays of transducers.

Radiation propagation. The propagation of radiation in highly scattering media is modeled by the following diffusion equation
\[ \frac1c\,\partial_t u - \nabla\cdot\gamma(x)\nabla u + \sigma(x)u = 0, \quad x\in X\subset\mathbb{R}^n,\ t>0; \qquad u = f, \quad x\in\partial X,\ t>0, \qquad (8.1) \]
where $X$ is an open, bounded, connected domain in $\mathbb{R}^n$ with $C^1$ boundary $\partial X$ (embedded in $\mathbb{R}^n$), where $n$ is the spatial dimension; $c$ is light speed in tissues; $\gamma(x)$ is a (scalar) diffusion coefficient; and $\sigma(x)$ is an absorption coefficient. Throughout this chapter, we assume that $\gamma(x)$ and $\sigma(x)$ are bounded from above and below by (strictly) positive constants. The source of incoming radiation is prescribed by $f(t,x)$ on the boundary $\partial X$ and is assumed to be a very short pulse supported on an interval of time $(0,\eta)$ with $c\eta$ of order $O(1)$.
Photoacoustic effect. As radiation propagates, a small fraction is absorbed. This absorbed radiation generates a slight temperature increase, which results in a minute mechanical expansion. The latter expansion is responsible for the emission of ultrasound, which is measured at the domain's boundary. The coupling between the optical and ultrasonic waves is called the photo-acoustic effect. The amount of energy deposited and transformed into acoustic energy is given by:
\[ H(t,x) = \Gamma(x)\sigma(x)u(t,x), \]
where $\Gamma(x)$ is the Grüneisen coefficient quantifying the photo-acoustic effect, while $\sigma(x)u(t,x)$ is the density of absorbed radiation.

A thermal expansion proportional to $H$ results, and acoustic waves are emitted. Such waves are modeled by
\[ \frac{1}{c_s^2(x)}\,\frac{\partial^2 p}{\partial t^2} - \Delta p = \frac{\partial H}{\partial t}(t,x), \qquad (t,x)\in\mathbb{R}\times\mathbb{R}^n, \qquad (8.2) \]
with $c_s$ the sound speed. We assume here to simplify that the wave equation is posed in the whole space $\mathbb{R}^n$. This assumption is justified by the fact that waves propagating away from the domain of interest are assumed not to interfere with the measurements, and by the fact that the measurement devices themselves do not modify the wave field. In a typical measurement setting, the acoustic pressure $p(t,x)$ is then measured on $\partial X$ as a function of time.

Since light travels much faster than sound, with $c_s\ll c$, we may assume for short light pulses that radiation propagation occurs over a very short time at the scale of the acoustic waves. This justifies the simplification:
\[ H(t,x) \approx H(x)\,\delta_0(t), \qquad H(x) = \Gamma(x)\sigma(x)\int_{\mathbb{R}_+} u(t,x)\,dt. \]
The acoustic signals are therefore modeled by the following wave equation
\[ \frac{1}{c_s^2(x)}\,\frac{\partial^2 p}{\partial t^2} - \Delta p = \frac{d\delta_0}{dt}(t)\,H(x). \qquad (8.3) \]
We now set the sound speed to $c_s=1$ to simplify. The acoustic pressure $p(t,x)$ is then measured on $\partial X$ as a function of time. Assuming a system at rest so that $p(t,x)=0$ for $t<0$, the wave equation may then be recast as the following equation:
\[ \frac{\partial^2 p}{\partial t^2} - \Delta p = 0, \ \ t>0,\ x\in\mathbb{R}^n; \qquad p(0,x) = H(x), \ \ x\in\mathbb{R}^n; \qquad \partial_t p(0,x) = 0, \ \ x\in\mathbb{R}^n. \qquad (8.4) \]
Exercise 8.1.1 Show that under the above simplifying assumptions, (8.3) formally leads
to the initial value problem (8.4).
We retrieve the wave equation with unknown source term $H(x)$ given by
\[ H(x) = \Gamma(x)\sigma(x)u(x), \qquad u(x) = \int_{\mathbb{R}_+} u(t,x)\,dt, \qquad (8.5) \]
where $u(x)$ is the total intensity of radiation reaching a point $x$, integrated over the time span of the radiation pulse, and thus satisfies the equation
\[ -\nabla\cdot\gamma(x)\nabla u + \sigma(x)u = 0, \quad x\in X\subset\mathbb{R}^n; \qquad u = f, \quad x\in\partial X, \qquad (8.6) \]
where $f(x) = \int_{\mathbb{R}_+} f(t,x)\,dt$ is the spatial density of photons emitted into the domain.
8.1.2 First step: Inverse wave source problem

The first step in PAT consists of reconstructing $H(x)$ from available ultrasound measurements. Two different settings of ultrasound measurements are considered now.

Inverse wave problem with boundary measurements. The reconstruction of $H(x)$ in (8.4) from a measurement operator given by $\mathcal M H(t,x) = p(t,x)|_{\partial X}$ was considered in Chapter 4. Several explicit reconstruction formulas were presented there, including (4.35) and the reconstruction procedure in the Fourier domain presented in section 4.3.2.

Inverse wave problem with planar measurements. Other measurement settings than point measurements have been considered in practice. One major issue with point measurements such as those considered above is that the acoustic signal is typically rather weak and thus difficult to measure accurately. An alternative to such point measurements is to consider planar detectors, which integrate the pressure field over a larger domain and thus become less sensitive to noise.

Consider the setting of (8.2) with $f(x) = H(x)$ an unknown source term. Let $P(s,\theta)$ for $s\in\mathbb{R}$ and $\theta\in S^2$ be the plane of points $x\in\mathbb{R}^3$ such that $x\cdot\theta = s$. Then we recall that the three-dimensional Radon transform of a function is defined as
\[ Rf(s,\theta) = \int_{\mathbb{R}^3} f(x)\,\delta(s-x\cdot\theta)\,dx = \int_{P(s,\theta)} f(x)\,d\mu(x), \]
where $d\mu(x)$ is the surface measure on the plane $P(s,\theta)$. We thus obtain that the measurements are given as a function of time $t$ and angle $\theta$ by
\[ f \mapsto \mathcal M_3 f(t,\theta) = Rp(1,\theta,t), \qquad t>0,\ \theta\in S^2. \]
Again, we have in the Fourier domain the Fourier slice theorem
\[ \widehat{Rf}(\varsigma,\theta) = \hat f(\varsigma\theta), \]
from which, combined with the following representation of the Laplace operator
\[ \Delta = \mathcal F^{-1}_{\xi\to x}\,\big(-|\xi|^2\big)\,\mathcal F_{x\to\xi}, \]
we deduce the intertwining property of the Radon transform and the Laplacian:
\[ R\Delta f(s,\theta) = \frac{\partial^2}{\partial s^2}\,Rf(s,\theta). \]
This property holds for smooth functions that decay sufficiently fast at infinity, and by extension for a large class of distributions.

Thus (8.2) can be recast for $Rp(s,\theta,t)$ as
\[ \frac{\partial^2 Rp}{\partial t^2} - \frac{\partial^2 Rp}{\partial s^2} = 0, \qquad t>0,\ s\in\mathbb{R},\ \theta\in S^2, \]
with conditions $\partial_t Rp(s,\theta,0) = 0$ and $Rp(s,\theta,0) = Rf(s,\theta)$.
This is a one-dimensional wave equation, whose unique solution is given by
\[ Rp(s,\theta,t) = \frac12\big(Rp(s+t,\theta,0) + Rp(s-t,\theta,0)\big) = \frac12\big(Rf(s+t,\theta) + Rf(s-t,\theta)\big). \]
For the planes tangent to the unit sphere, $s=1$ while $t>0$. Then $Rf(t+s,\theta)=0$, since $f$ is supported inside the ball of radius $1$, so that $Rf(s,\theta)=0$ for $|s|\ge 1$. Thus for $0<t<2$, we have
\[ Rf(1-t,\theta) = 2Rp(1,\theta,t) = 2\mathcal M_3 f(t,\theta). \]
Up to a slight smooth change of variables, the data are therefore the three-dimensional Radon transform of $f$, i.e., the integral of $f$ along all possible hyperplanes (passing through the support of $f$).
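The tangent-plane identity is a direct consequence of d'Alembert's formula and of the support condition. A numerical sanity check with an arbitrary smooth profile supported in $|s|<1$ (an illustration; numpy assumed):

```python
import numpy as np

def Rf(s):
    # arbitrary smooth profile supported in |s| < 1 (stand-in for Radon data of f)
    return np.where(np.abs(s) < 1, np.cos(np.pi * s / 2) ** 4, 0.0)

def Rp(s, t):
    # d'Alembert solution of the 1D wave equation with data Rf and zero velocity
    return 0.5 * (Rf(s + t) + Rf(s - t))

t = np.linspace(1e-3, 2 - 1e-3, 500)
# on the tangent plane s = 1, the term Rf(1 + t) vanishes for t > 0, so the
# boundary trace retrieves the Radon transform: Rf(1 - t) = 2 Rp(1, t)
assert np.allclose(Rf(1 - t), 2 * Rp(1, t))
```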
We recall the explicit inversion of the Radon transform:
\[ f(x) = -\frac{1}{8\pi^2}\,R^*\frac{\partial^2}{\partial s^2}\,Rf(x) = -\frac{1}{8\pi^2}\int_{S^2}\Big(\frac{\partial^2}{\partial s^2}Rf\Big)(x\cdot\theta,\theta)\,d\theta. \]
The Radon transform is injective and stable, as we saw in Chapter 2. We have the following stability result:
\[ \sqrt2\,\|f\|_{H^s(\mathbb{R}^3)} \le \|\mathcal M_3 f\|_{H^{s+1}(Z)}, \qquad (8.7) \]
where $Z = \mathbb{R}\times S^2$ and $H^s(Z)$ is defined in Chapter 2.

Remarks on the first step of PAT. The inverse wave source problems considered here are therefore well posed problems. When ultrasound is measured with sufficient precision, then the reconstruction of the initial condition $H(x)=f(x)$ is stable.
8.1.3 Second step: Inverse problems with internal functionals

Once the first step of PAT is satisfactorily performed, a second step consists of reconstructing the unknown coefficients $(\gamma(x),\sigma(x),\Gamma(x))$ from knowledge of internal functionals of the form
\[ H_j(x) = \Gamma(x)\sigma(x)u_j(x), \qquad 1\le j\le J, \qquad (8.8) \]
for $J\in\mathbb{N}^*$ illumination maps $f_j(x)$, where $u_j$ is the solution to the steady-state equation
\[ -\nabla\cdot\gamma(x)\nabla u_j + \sigma(x)u_j = 0, \quad x\in X\subset\mathbb{R}^n; \qquad u_j = f_j, \quad x\in\partial X. \qquad (8.9) \]
This second step of PAT is sometimes called quantitative photoacoustic tomography (QPAT). Indeed, the reconstruction of $H_j(x)$ in (8.8) offers important information about the unknown coefficients, but it depends on the illumination $f_j$ used to probe the domain of interest and cannot be used to quantify intrinsic properties of the parameters. The second step of PAT aims to provide quantitative statements on the unknown parameters $(\gamma(x),\sigma(x),\Gamma(x))$.
Throughout this chapter, we assume that the coefficients in (8.9) are known at the domain's boundary $\partial X$. Our objective is to reconstruct them in $X$. Formally, the measurement operator in QPAT is therefore
\[ \big(\gamma(x),\sigma(x),\Gamma(x)\big) \mapsto \mathcal M\big(\gamma(x),\sigma(x),\Gamma(x)\big) = \big(H_j(x)\big)_{1\le j\le J}. \]
One of the main theoretical results about QPAT is that, unfortunately, no matter how large $J$ and however the illuminations $f_j$ may be chosen, we cannot reconstruct all of $(\gamma(x),\sigma(x),\Gamma(x))$ uniquely from QPAT measurements of the form (8.8). In other words, $\mathcal M$ is not injective. However, as we shall see, as soon as one of the coefficients is assumed to be known, then the restriction of $\mathcal M$ in this setting is injective and enjoys good (Lipschitz or Hölder) stability properties.
8.1.4 Reconstruction of one coefficient

We conclude this introductory section with a simpler problem: the reconstruction of one coefficient in $(\gamma(x),\sigma(x),\Gamma(x))$ when the other two coefficients are known. In practice, it is not uncommon, as a first approximation, to assume that $\gamma$ and $\Gamma$ are known, at least approximately. Then the important absorption coefficient $\sigma$ (used, for instance, to reconstruct the oxygenation properties of human tissues) is uniquely and stably determined.

(i) When only $\Gamma$ is unknown, then we solve (8.9) for $u$ and then construct $\Gamma = \frac{H}{\sigma u}$.

(ii) When only $\sigma$ is unknown, then we solve the following elliptic equation for $u$
\[ -\nabla\cdot\gamma\nabla u(x) + \frac{H}{\Gamma} = 0 \ \text{ in } X, \qquad u(x) = f(x) \ \text{ on } \partial X, \qquad (8.10) \]
and then evaluate $\sigma = \frac{H}{\Gamma u}$.
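The two-line procedure in (ii) can be tested on a one-dimensional finite-difference model. All profiles below are hypothetical stand-ins (numpy assumed): we synthesize the data $H=\Gamma\sigma u$ from a known $\sigma$, then recover $\sigma=H/(\Gamma u)$ by solving (8.10), which no longer involves $\sigma$:

```python
import numpy as np

N = 400
x = np.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]
gamma = 1.0 + 0.5 * np.sin(2 * np.pi * x)   # known diffusion coefficient
Gam = 1.2 + 0.2 * x                         # known Grueneisen coefficient
sigma_true = 0.5 + 0.3 * np.cos(np.pi * x)  # coefficient to be reconstructed
f0, f1 = 1.0, 2.0                           # Dirichlet illumination

def solve(zero_order, rhs):
    """Solve -(gamma u')' + zero_order*u = rhs with u(0) = f0, u(1) = f1."""
    gm = 0.5 * (gamma[:-1] + gamma[1:])     # gamma at cell midpoints
    A = np.zeros((N + 1, N + 1))
    b = rhs.copy()
    A[0, 0] = A[N, N] = 1.0
    b[0], b[N] = f0, f1
    for i in range(1, N):
        A[i, i - 1] = -gm[i - 1] / h**2
        A[i, i] = (gm[i - 1] + gm[i]) / h**2 + zero_order[i]
        A[i, i + 1] = -gm[i] / h**2
    return np.linalg.solve(A, b)

u_true = solve(sigma_true, np.zeros(N + 1))   # forward model (8.9)
H = Gam * sigma_true * u_true                 # internal data (8.8)
u_rec = solve(np.zeros(N + 1), -H / Gam)      # step (8.10): -(gamma u')' = -H/Gamma
sigma_rec = H / (Gam * u_rec)                 # sigma = H / (Gamma u)
assert np.max(np.abs(sigma_rec - sigma_true)) < 1e-8
```

At the discrete level the recovery is exact up to solver roundoff, since the equation solved in (8.10) is satisfied by the true solution itself.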
Exercise 8.1.2 Assume that $f$ is uniformly bounded above and below by positive constants. Show that the reconstruction of $\sigma$ is Lipschitz-stable in $L^\infty(X)$, i.e., that
\[ \|\sigma-\tilde\sigma\|_{L^\infty(X)} \le C\,\|H-\tilde H\|_{L^\infty(X)}, \qquad (8.11) \]
where $\tilde H$ is acquired as in (8.8)-(8.9) with $\sigma$ replaced by $\tilde\sigma$.
(iii) When only $\gamma$ is unknown, we obtain $u = \frac{H}{\Gamma\sigma}$, and then the above elliptic equation in (8.10), with $\gamma_{|\partial X}$ known, is a transport equation for $\gamma$. As soon as $\beta := \nabla u$ is a sufficiently smooth, non-vanishing vector field, then $\gamma$ is uniquely determined by the linear equation
\[ \nabla\cdot(\gamma\beta) = \beta\cdot\nabla\gamma + (\nabla\cdot\beta)\,\gamma = \frac{H}{\Gamma} \ \text{ in } X, \qquad \gamma(x) = \gamma_{|\partial X}(x) \ \text{ on } \partial X. \qquad (8.12) \]
This transport equation will be analyzed in more detail later in the chapter.
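In one space dimension, (8.12) reduces to $(\gamma\beta)' = H/\Gamma$, which can be integrated directly from the boundary value $\gamma(0)$. A sketch with hypothetical profiles (numpy assumed):

```python
import numpy as np

N = 2000
x = np.linspace(0.0, 1.0, N + 1)
gamma_true = 1.0 + 0.4 * np.sin(np.pi * x)
beta = 1.0 + 0.6 * x                          # beta = u', chosen non-vanishing
source = np.gradient(gamma_true * beta, x)    # stands for H/Gamma = (gamma*beta)'

# integrate (gamma*beta)' = source from the known boundary value gamma(0)
flux = gamma_true[0] * beta[0] + np.concatenate(
    [[0.0], np.cumsum(0.5 * (source[1:] + source[:-1]) * np.diff(x))]
)
gamma_rec = flux / beta
assert np.max(np.abs(gamma_rec - gamma_true)) < 1e-3
```

The non-vanishing of $\beta$ is what makes the division in the last step legitimate; this is the one-dimensional analogue of condition (8.18) below.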
What the above results say is that the reconstruction of the optical coefficients is relatively straightforward when two out of the three are already known. The interest of the QPAT theory resides in the fact that the acquisition of more measurements in fact allows us to (uniquely and stably) reconstruct two coefficients, but not three.
8.1.5 Introduction to Transient Elastography

Transient elastography images the (slow) propagation of shear waves using ultrasound. For more details, see, e.g., [44] and its extended list of references. As shear waves propagate, the resulting displacements can be imaged by ultra-fast ultrasound. Consider a scalar approximation of the equations of elasticity
\[ \nabla\cdot\gamma(x)\nabla u(x,t) = \rho(x)\,\partial_{tt}u(x,t), \quad t\in\mathbb{R},\ x\in X; \qquad u(x,t) = f(x,t), \quad t\in\mathbb{R},\ x\in\partial X, \qquad (8.13) \]
where $u(x,t)$ is the (say, downward) displacement, $\gamma(x)$ is one of the Lamé parameters, and $\rho(x)$ is the density. Using ultra-fast ultrasound measurements, the displacement $u(x,t)$ can be imaged. This results in a very simplified model of transient elastography where we aim to reconstruct $(\gamma,\rho)$ from knowledge of $u(x,t)$; see [44] for more complex models. We may slightly generalize the model as follows. Upon taking Fourier transforms in the time domain and accounting for possible dispersive effects of the tissues, we obtain
\[ \nabla\cdot\gamma(x;\omega)\nabla u(x;\omega) + \omega^2\rho(x;\omega)u(x;\omega) = 0, \quad \omega\in\mathbb{R},\ x\in X; \qquad u(x;\omega) = f(x;\omega), \quad \omega\in\mathbb{R},\ x\in\partial X. \qquad (8.14) \]
The inverse transient elastography problem with dispersion effect would then be the reconstruction of $(\gamma(x;\omega),\rho(x;\omega))$ from knowledge of $u(x;\omega)$ corresponding to one or several boundary conditions $f(x;\omega)$ applied at the boundary $\partial X$.

This corresponds to the setting of PAT with $\Gamma\sigma = 1$, so that $H = u$. As in PAT, $H$ is an internal functional of the unknown coefficients $(\gamma,\rho)$. The role of quantitative transient elastography (QTE) is therefore to reconstruct $(\gamma,\rho)$ from knowledge of one or several internal functionals of the form $H = u$.
8.2 Theory of quantitative PAT and TE

Let us come back to the general theory of photo-acoustic tomography and transient elastic tomography. The two mathematical problems are quite similar. For concreteness, we first focus on the PAT setting and next state how the results should be modified to solve the TE problem.

8.2.1 Uniqueness and stability results in QPAT

We have seen above that one coefficient in $(\gamma,\sigma,\Gamma)$ could be uniquely reconstructed from one internal functional $H$, provided that the other two coefficients were known. It turns out that two coefficients in $(\gamma,\sigma,\Gamma)$ can also be uniquely (and stably) reconstructed from two (well-chosen) internal functionals $H_j$, $j=1,2$, provided that the third coefficient is known. However, these two internal functionals $(H_1,H_2)$ uniquely determine any internal functional $H_3$ obtained by using an arbitrary illumination $f_3$ on $\partial X$. Moreover, the two internal functionals $(H_1,H_2)$ uniquely characterize two explicit functionals of $(\gamma,\sigma,\Gamma)$ that do not allow us to reconstruct all parameters in $(\gamma,\sigma,\Gamma)$ uniquely.
Measurement operator in QPAT. We have mentioned above the notion of well-chosen internal functionals $H_j$, $j=1,2$, or equivalently well-chosen illuminations $f_j$, $j=1,2$, on $\partial X$, since the different functions $H_j$ are characterized by the probing $f_j$ on $\partial X$. To make precise statements, we introduce some notation. For $f\in H^{\frac12}(\partial X)$, we obtain a solution $u\in H^1(X)$ of (8.9) and we can define the internal functional operator
\[ H_{(\gamma,\sigma,\Gamma)} : H^{\frac12}(\partial X) \to H^1(X), \qquad f \mapsto H_{(\gamma,\sigma,\Gamma)}f = \Gamma(x)\sigma(x)u(x). \qquad (8.15) \]
Let $I\in\mathbb{N}^*$ and $f_i\in H^{\frac12}(\partial X)$ for $1\le i\le I$ be a given set of $I$ boundary conditions. Define $\mathbf f = (f_1,\dots,f_I)$. The measurement operator $\mathcal M_{\mathbf f}$ is then defined as the following collection of internal functionals:
\[ \mathcal M_{\mathbf f} : \mathcal X \to \mathcal Y^I, \qquad (\gamma,\sigma,\Gamma) \mapsto \mathcal M_{\mathbf f}(\gamma,\sigma,\Gamma) = \big(H_{(\gamma,\sigma,\Gamma)}f_1, \dots, H_{(\gamma,\sigma,\Gamma)}f_I\big). \qquad (8.16) \]
Here, $\mathcal X$ is a subset of a Banach space in which the unknown coefficients are defined; see (H1) below. Also, $\mathcal Y$ is a subset of $H^1(X)$ where the solutions to (8.9) are defined. We also define $H_j = H_{(\gamma,\sigma,\Gamma)}f_j$ for $1\le j\le I$.
Assumptions on the coefficients and the illuminations. Here are now some mathematical assumptions on the coefficients and a definition of the illuminations that we call well-chosen. Here and below, we denote by $W^{m,p}(X)$ the space of functions with derivatives of order less than or equal to $m$ belonging to $L^p(X)$.

(H1). We denote by $\mathcal X$ the set of coefficients $(\gamma,\sigma,\Gamma)$ that are of class $W^{1,\infty}(X)$, are bounded above and below by fixed positive constants, and such that the traces $(\gamma,\sigma,\Gamma)_{|\partial X}$ on the boundary $\partial X$ are fixed (known) functions.

(H2). The illuminations $f_j$ are positive functions on $\partial X$ that are the restrictions on $\partial X$ of functions of class $C^3(\bar X)$.
(H3). We say that $f^2 = (f_1, f_2)$ is a pair of well-chosen illuminations with corresponding functionals $(H_1, H_2) = (H_{(\gamma,\sigma,\Gamma)} f_1, H_{(\gamma,\sigma,\Gamma)} f_2)$ provided that (H2) is satisfied and the vector field
\[ \beta := H_1 \nabla H_2 - H_2 \nabla H_1 = H_1^2\, \nabla\frac{H_2}{H_1} = H_1^2\, \nabla\frac{u_2}{u_1} = -H_2^2\, \nabla\frac{H_1}{H_2} \tag{8.17} \]
is a vector field in $W^{1,\infty}(X)$ such that
\[ |\beta(x)| \ge \alpha_0 > 0, \qquad \text{a.e. } x \in X. \tag{8.18} \]
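Because $\beta$ is built entirely from the measured functionals $(H_1, H_2)$, condition (8.18) can be checked directly on data. A minimal numerical sketch (illustrative data and grid, not from the text): form $\beta = H_1\nabla H_2 - H_2\nabla H_1$ by centered differences and compute the minimum of $|\beta|$ over interior nodes.

```python
# Hypothetical illustration: check (8.18) on gridded data H1, H2.
def beta_min_norm(H1, H2, n, h):
    """Min of |beta| = |H1*grad(H2) - H2*grad(H1)| over interior nodes."""
    def grad(H, i, j):
        return ((H[i + 1][j] - H[i - 1][j]) / (2 * h),
                (H[i][j + 1] - H[i][j - 1]) / (2 * h))
    best = float("inf")
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            g1, g2 = grad(H1, i, j), grad(H2, i, j)
            bx = H1[i][j] * g2[0] - H2[i][j] * g1[0]
            by = H1[i][j] * g2[1] - H2[i][j] * g1[1]
            best = min(best, (bx ** 2 + by ** 2) ** 0.5)
    return best

n, h = 21, 1.0 / 20
# Toy data H1 = 2 + x, H2 = 2 + y, so beta = (-(2+y), 2+x) and |beta| > 0.
H1 = [[2.0 + i * h for j in range(n)] for i in range(n)]
H2 = [[2.0 + j * h for j in range(n)] for i in range(n)]
alpha0 = beta_min_norm(H1, H2, n, h)
```

For this linear data the centered differences are exact, so `alpha0` equals the analytic minimum $2.05\sqrt2$ attained at the corner interior node.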
(H3'). We say that $f^2 = (f_1, f_2)$ is a pair of weakly well-chosen illuminations with corresponding functionals $(H_1, H_2) = (H_{(\gamma,\sigma,\Gamma)} f_1, H_{(\gamma,\sigma,\Gamma)} f_2)$ provided that (H2) is satisfied and the vector field $\beta$ defined in (8.17) is in $W^{1,\infty}(X)$ and $\beta \neq 0$ a.e. in $X$.
8.2. THEORY OF QUANTITATIVE PAT AND TE 129
Remark 8.2.1 Note that (H3') is satisfied as soon as $\frac{f_1}{f_2}$ is not a constant. Indeed, if $\beta = 0$ on a set of positive measure, then $\nabla\frac{u_2}{u_1} = 0$ on that same set. Yet, $\frac{u_2}{u_1}$ solves the elliptic equation (8.20) below. It is known that under sufficient smoothness conditions on the coefficients, the critical points of solutions to such elliptic equations are of measure zero unless these solutions are constant [37, 53].
Uniqueness/Non-uniqueness result. Hypothesis (H3) will be useful in the analysis of the stability of the reconstructions. For the uniqueness result, the weaker hypothesis (H3') is sufficient. Note that almost all illumination pairs $f^2$ satisfy (H3'), which is a mere regularity statement. Beyond the regularity assumptions on $(\gamma,\sigma,\Gamma)$, the domain $X$, and the boundary conditions $f_j$, the only real assumption we impose is thus (8.18). In general, there is no guarantee that the gradient of $\frac{u_2}{u_1}$ does not vanish. Not all pairs of illuminations $f^2 = (f_1, f_2)$ are well-chosen, although most are weakly well-chosen. That the vector field $\beta$ does not vanish is a sufficient condition for the stability estimates presented below to be satisfied; it is not necessary. As we shall see, guaranteeing (8.18) is relatively straightforward in dimension $n = 2$. It is much more complicated in dimension $n \ge 3$. The only available methodology to ensure that (8.18) holds for a large class of conductivities is based on the same method of complex geometric optics (CGO) solutions already used to solve the Calderón problem in Chapter 7.
Under these hypotheses, we obtain the following result:

Theorem 8.2.2 Let $\mathcal X$ be defined as in (H1) and let $f^2$ be well-chosen illuminations as indicated in (H2) and (H3'). Let $I \in \mathbb N^*$ and $f = (f_1, \dots, f_I)$ be a set of (arbitrary) illuminations satisfying (H2). Then we have the following:

(i) The measurement operator $M_{f^2}$ uniquely determines $M_f$ (meant in the sense that $M_{f^2}(\gamma,\sigma,\Gamma) = M_{f^2}(\tilde\gamma,\tilde\sigma,\tilde\Gamma)$ implies that $M_f(\gamma,\sigma,\Gamma) = M_f(\tilde\gamma,\tilde\sigma,\tilde\Gamma)$).

(ii) The measurement operator $M_{f^2}$ uniquely determines the two following functionals of $(\gamma,\sigma,\Gamma)$ (meant in the same sense as above):
\[ \chi(x) := \frac{\sqrt\gamma}{\Gamma\sigma}(x), \qquad q(x) := \Big( \frac{\Delta\sqrt\gamma}{\sqrt\gamma} + \frac{\sigma}{\gamma} \Big)(x). \tag{8.19} \]

(iii) Knowledge of the two functionals $\chi$ and $q$ uniquely determines (in the same sense as above) $M_{f^2} = (H_1, H_2)$. In other words, the reconstruction of $(\gamma,\sigma,\Gamma)$ is unique up to any transformation that leaves $(\chi, q)$ invariant.
Proof. Let us start with (i). Upon multiplying the equation for $u_1$ by $u_2$, the equation for $u_2$ by $u_1$, and subtracting both relations, we obtain
\[ \nabla\cdot\Big( (\gamma u_1^2)\, \nabla\frac{H_2}{H_1} \Big) = 0 \ \text{ in } X, \qquad \frac{u_2}{u_1} = \frac{f_2}{f_1} \ \text{ on } \partial X. \tag{8.20} \]
This is a transport equation in conservative form for $\gamma u_1^2$. More precisely, this is a transport equation $\nabla\cdot(\rho\,\tilde\beta) = 0$ for $\rho$ with $\rho_{|\partial X} = 1$ and
\[ \tilde\beta = \chi^2\beta = (\gamma u_1^2)\, \nabla\frac{H_2}{H_1}. \]
Since $\tilde\beta \in W^{1,\infty}(X)$ and is divergence free, the above equation for $\rho$ admits the unique solution $\rho \equiv 1$ since (8.18) holds. Indeed, we find that $\nabla\cdot\big( (\rho-1)^2\, \tilde\beta \big) = 0$ by application of the chain rule, with $\rho_{|\partial X} - 1 = 0$ on $\partial X$. Upon multiplying the equation by $\frac{H_2}{H_1}$ and integrating by parts, we find
\[ \int_X (\rho - 1)^2\, \chi^2 H_1^2\, \Big| \nabla\frac{H_2}{H_1} \Big|^2\, dx = 0. \]
Using (H3') and the result of Remark 8.2.1, we deduce that $\rho = 1$ on $X$ by continuity. This proves that $\gamma u_1^2$ is uniquely determined. Dividing by $H_1^2 = (\Gamma\sigma)^2 u_1^2$, this implies that $\chi > 0$ defined in (8.19) is uniquely determined. Note that we do not need the full $W^{1,\infty}(X)$ regularity of $\tilde\beta$ in order to obtain the above result. However, we still need to be able to apply the chain rule to obtain an equation for $(\rho-1)^2$ and conclude that the solution to the transport equation is unique.

Let now $f$ be an arbitrary boundary condition, let $u$ be the solution to (8.9), and let $H = H_{(\gamma,\sigma,\Gamma)} f$ be defined by (8.8). Replacing $H_2$ above by $H$ yields
\[ \nabla\cdot\Big( \chi^2 H_1^2\, \nabla\frac{H}{H_1} \Big) = 0 \ \text{ in } X, \qquad H = (\Gamma\sigma)_{|\partial X}\, f \ \text{ on } \partial X. \tag{8.21} \]
This is a well-defined elliptic equation with a unique solution $H \in H^1(X)$ for $f \in H^{\frac12}(\partial X)$. This proves that $H = H_{(\gamma,\sigma,\Gamma)} f$ is uniquely determined by $(H_1, H_2)$ and concludes the proof of (i).
Let us next prove (ii). We have already seen that $\chi$ was determined by $M_{f^2} = (H_1, H_2)$. Define now $v = \sqrt\gamma\, u_1$, which is also uniquely determined based on the results in (i). Define
\[ q = \frac{\Delta v}{v} = \frac{\Delta(\sqrt\gamma\, u_1)}{\sqrt\gamma\, u_1}, \]
which is the Liouville change of variables used in Chapter 7 to solve the Calderón problem. Since $u_1$ is bounded from below, $\gamma$ is sufficiently smooth, and $u_1$ solves (8.9), the following calculations show that $q$ is given by (8.19). Indeed, we find that
\[ \nabla\cdot\gamma\nabla u_1 = \nabla\cdot(\sqrt\gamma\, \nabla v) - \nabla\cdot\big( (\nabla\sqrt\gamma)\, v \big) = \sqrt\gamma\, \Delta v - (\Delta\sqrt\gamma)\, v = \sigma u_1 = \frac{\sigma}{\sqrt\gamma}\, v. \tag{8.22} \]
Finally, we prove (iii). Since $q$ is now known, we can solve
\[ (\Delta - q)\, v_j = 0 \ \text{ in } X, \qquad v_j = \sqrt\gamma_{|\partial X}\, f_j \ \text{ on } \partial X, \quad j = 1, 2. \]
Because $q$ is of the specific form (8.19) as a prescribed functional of $(\gamma,\sigma,\Gamma)$, it is known that $(\Delta - q)$ does not admit $0$ as a (Dirichlet) eigenvalue, for otherwise, $0$ would also be a (Dirichlet) eigenvalue of the elliptic operator
\[ \big( -\nabla\cdot\gamma\nabla + \sigma \big)\,\cdot\; = -\sqrt\gamma\, (\Delta - q) \big( \sqrt\gamma\, \cdot \big). \tag{8.23} \]
The latter calculation follows from (8.22). Thus $v_j$ is uniquely determined for $j = 1, 2$. Now,
\[ H_j = \Gamma\sigma\, u_j = \frac{\Gamma\sigma}{\sqrt\gamma}\, v_j = \frac{v_j}{\chi}, \qquad j = 1, 2, \]
and is therefore uniquely determined by $(\chi, q)$. This concludes the proof that $(\chi, q)$ uniquely determines $M_{f^2}$.
Reconstruction of two coefficients. The above result shows that the unique reconstruction of $(\gamma,\sigma,\Gamma)$ is not possible even from knowledge of a measurement operator $M_f$ corresponding to an arbitrary (in fact possibly infinite) number of internal functionals $I$. We therefore face this peculiar situation that two well-chosen illuminations uniquely determine the functionals $(\chi, q)$, but that acquiring additional measurements does not provide any new information. However, we can prove the following positive result: if one of the coefficients in $(\gamma,\sigma,\Gamma)$ is known, then the other two coefficients are uniquely determined.

Corollary 8.2.3 Under the hypotheses of the previous theorem, let $(\chi, q)$ in (8.19) be known. Then
(a) If $\Gamma$ is known, then $(\gamma,\sigma)$ are uniquely determined.
(b) If $\gamma$ is known, then $(\sigma,\Gamma)$ are uniquely determined.
(c) If $\sigma$ is known, then $(\gamma,\Gamma)$ are uniquely determined.
Proof. (a) is probably the most practical case, as $\Gamma$ is often assumed to be constant or known. Since $\Gamma$ is known, then so is $\Gamma\chi = \sqrt\gamma/\sigma$, so that we have the elliptic equation for $\sqrt\gamma$:
\[ (\Delta - q)\sqrt\gamma + \frac{1}{\Gamma\chi} = 0 \ \text{ in } X, \qquad \sqrt\gamma_{|\partial X} \ \text{known on } \partial X. \tag{8.24} \]
Again, because of the specific form of $q$, $(\Delta - q)$ is invertible and the above equation admits a unique solution. Once $\sqrt\gamma$, hence $\gamma$, is known, then so is $\sigma = \frac{\sqrt\gamma}{\Gamma\chi}$.

If $\gamma$ is known in (b), then $\sigma$ is known from $q$ and $\Gamma$ is known from $\chi$.

Finally, in (c), we obtain from the expression for $q$ that
\[ (\Delta - q)\sqrt\gamma + \frac{\sigma}{\sqrt\gamma} = 0 \ \text{ in } X, \qquad \sqrt\gamma_{|\partial X} \ \text{known on } \partial X. \tag{8.25} \]
We need to prove a uniqueness result for the above nonlinear equation for $\sqrt\gamma$. Let us assume that $\sqrt\gamma$ and another solution $\sqrt{\tilde\gamma}$ with $0 < \tilde\gamma(x)$ satisfy the above equation for $\sigma$ fixed. Setting $\rho = \sqrt{\tilde\gamma/\gamma}$, we have
\[ (\Delta - q)\big( \rho\sqrt\gamma \big) + \frac{\sigma}{\rho\sqrt\gamma} = 0 \ \text{ in } X. \]
Thanks to (8.23), this implies the following equation for $\rho$:
\[ -\nabla\cdot\gamma\nabla\rho + \sigma\Big( \rho - \frac1\rho \Big) = 0 \ \text{ in } X, \qquad \rho = 1 \ \text{ on } \partial X. \]
Upon multiplying by $\rho - 1$ and integrating by parts, we find that
\[ \int_X \gamma\, |\nabla(\rho - 1)|^2\, dx + \int_X \sigma\, |\rho - 1|^2\, \frac{\rho + 1}{\rho}\, dx = 0. \]
Since $\sigma > 0$, we deduce from the above that $\rho \equiv 1$ and that $\gamma$ is uniquely determined by $q$. We then retrieve $\Gamma$ from knowledge of $\chi$.
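The linear step (8.24) in case (a) is easy to discretize. A minimal one-dimensional sketch (a manufactured example, not from the text): with $\Gamma = 1$, $\chi = (1+x)^2/2$ and $q = 2/(1+x)^2$ (consistent with $\gamma = 1$, $\sigma = 2/(1+x)^2$), a standard tridiagonal (Thomas) solve of the discrete (8.24) returns $\sqrt\gamma \equiv 1$, after which $\sigma = \sqrt\gamma/(\Gamma\chi)$.

```python
# Illustrative 1-D solve of (8.24): v'' - q v + 1/(Gamma*chi) = 0, v(0)=v(1)=1.
n = 101
h = 1.0 / (n - 1)
xs = [i * h for i in range(n)]
q = [2.0 / (1.0 + xi) ** 2 for xi in xs]
rhs = [-2.0 / (1.0 + xi) ** 2 for xi in xs]   # -1/(Gamma*chi) = -2/(1+x)^2

m = n - 2                                      # interior unknowns
a = [1.0 / h ** 2] * m                         # sub-diagonal
b = [-2.0 / h ** 2 - q[i + 1] for i in range(m)]
c = [1.0 / h ** 2] * m                         # super-diagonal
d = [rhs[i + 1] for i in range(m)]
d[0] -= a[0] * 1.0                             # fold in boundary value v(0) = 1
d[-1] -= c[-1] * 1.0                           # fold in boundary value v(1) = 1

for i in range(1, m):                          # Thomas algorithm: forward sweep
    w = a[i] / b[i - 1]
    b[i] -= w * c[i - 1]
    d[i] -= w * d[i - 1]
v = [0.0] * m                                  # back substitution
v[-1] = d[-1] / b[-1]
for i in range(m - 2, -1, -1):
    v[i] = (d[i] - c[i] * v[i + 1]) / b[i]

sqrt_gamma = [1.0] + v + [1.0]
sigma = [sqrt_gamma[i] * 2.0 / (1.0 + xs[i]) ** 2 for i in range(n)]  # v/(Gamma*chi)
err = max(abs(s - 1.0) for s in sqrt_gamma)
```

Since $\sqrt\gamma \equiv 1$ satisfies the discrete equations exactly here, the solver recovers it to machine precision.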
Reconstruction formulas. Note that the above uniqueness results provide a constructive reconstruction procedure. In all cases, we first need to solve a transport equation for the functional $\chi$:
\[ \nabla\cdot( \chi^2 \beta ) = 0 \ \text{ in } X, \qquad \chi_{|\partial X} \ \text{known on } \partial X, \tag{8.26} \]
with the vector field $\beta$ defined in (8.17). This uniquely defines $\chi > 0$. Then we find that
\[ q(x) = \frac{\Delta( \chi H_1 )}{\chi H_1}(x) = \frac{\Delta( \chi H_2 )}{\chi H_2}(x). \tag{8.27} \]
This provides explicit reconstructions for $(\chi, q)$. In case (b), no further equation needs to be solved. In cases (a) and (c), we need to solve an elliptic equation for $\sqrt\gamma$, which is the linear equation (8.24) in (a) and the nonlinear equation (8.25) in (c). These steps have been implemented numerically with very satisfactory results in [18].
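In one space dimension the two steps (8.26)-(8.27) can be carried out in a few lines, since $(\chi^2\beta)' = 0$ integrates exactly. A minimal sketch on a manufactured example (all choices below are illustrative, not from the text): $\gamma = 1$, $\Gamma = 1$, $\sigma = 2/(1+x)^2$, with solutions $u_1 = (1+x)^2$ and $u_2 = (1+x)^{-1}$ of $-(\gamma u')' + \sigma u = 0$, so that $\chi = (1+x)^2/2$ and $q = 2/(1+x)^2$.

```python
# Illustrative 1-D reconstruction: chi from (8.26), then q from (8.27).
n = 201
h = 1.0 / (n - 1)
xs = [i * h for i in range(n)]
H1 = [2.0] * n                               # Gamma*sigma*u1 = 2
H2 = [2.0 / (1.0 + xi) ** 3 for xi in xs]    # Gamma*sigma*u2

def dcent(v, i):                             # centered first difference
    return (v[i + 1] - v[i - 1]) / (2.0 * h)

beta = [0.0] * n
for i in range(1, n - 1):
    beta[i] = H1[i] * dcent(H2, i) - H2[i] * dcent(H1, i)

# In 1-D, chi^2 * beta is constant; anchor chi near the boundary, where the
# traces of the coefficients (hence of chi) are known by (H1).
chi = [0.0] * n
chi_anchor = (1.0 + xs[1]) ** 2 / 2.0
for i in range(1, n - 1):
    chi[i] = chi_anchor * (beta[1] / beta[i]) ** 0.5
err_chi = max(abs(chi[i] - (1.0 + xs[i]) ** 2 / 2.0) for i in range(1, n - 1))

# q from (8.27): q = (chi*H1)'' / (chi*H1); exact value is 2/(1+x)^2.
v = [chi[i] * H1[i] for i in range(n)]
err_q = 0.0
for i in range(2, n - 2):
    qi = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / (h * h) / v[i]
    err_q = max(err_q, abs(qi - 2.0 / (1.0 + xs[i]) ** 2))
```

Both reconstructed quantities agree with the exact $\chi$ and $q$ up to the discretization error of the centered differences.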
Stability of the solution of the transport equation for $\chi(x)$. We now derive a stability result for the reconstruction of $\chi$ obtained from analyzing the transport equation (8.20). Similar stability results can be obtained for $q$ and then for $(\gamma,\sigma,\Gamma)$ depending on the reconstruction considered.
Theorem 8.2.4 Let $M_{f^2}(\gamma,\sigma,\Gamma) = (H_1, H_2)$ be the measurements corresponding to the coefficients $(\gamma,\sigma,\Gamma)$ such that (H1), (H2), (H3) hold. Let $M_{f^2}(\tilde\gamma,\tilde\sigma,\tilde\Gamma) = (\tilde H_1, \tilde H_2)$ be the measurements corresponding to the same illuminations $f^2 = (f_1, f_2)$ with another set of coefficients $(\tilde\gamma,\tilde\sigma,\tilde\Gamma)$ such that (H1), (H2) hold. Define $\delta M_{f^2} = M_{f^2}(\tilde\gamma,\tilde\sigma,\tilde\Gamma) - M_{f^2}(\gamma,\sigma,\Gamma)$. Then we find that
\[ \| \chi - \tilde\chi \|_{L^p(X)} \le C\, \| \delta M_{f^2} \|^{\frac12}_{(W^{1,\frac p2}(X))^2}, \qquad \text{for all } 2 \le p < \infty. \tag{8.28} \]
Let us assume, moreover, that $\Gamma(x)$ is of class $C^3(\bar X)$. Then we have the estimate
\[ \| \chi - \tilde\chi \|_{L^p(X)} \le C\, \| \delta M_{f^2} \|^{\frac13}_{(L^{\frac p2}(X))^2}, \qquad \text{for all } 2 \le p < \infty. \tag{8.29} \]
By interpolation, the latter result implies that
\[ \| \chi - \tilde\chi \|_{L^\infty(X)} \le C\, \| \delta M_{f^2} \|^{\frac{p}{3(d+p)}}_{(L^{\frac p2}(X))^2}, \qquad \text{for all } 2 \le p < \infty. \tag{8.30} \]
We may for instance choose $p = 4$ above to measure the noise level in the measurement $\delta M_{f^2}$ in the square integrable norm when noise is described by its power spectrum in the Fourier domain.
Proof. Define $\delta\beta = \tilde\beta - \beta$ and $\delta\chi^2 = \tilde\chi^2 - \chi^2$, with $\chi$ defined in (8.19) and $\beta$ and $\tilde\beta$ as in (8.17). Then we find that
\[ \nabla\cdot\big( \delta\chi^2\, \beta \big) + \nabla\cdot\big( \tilde\chi^2\, \delta\beta \big) = 0. \]
Note that $\chi^2\beta = \chi^2 H_1^2\, \nabla\frac{H_2}{H_1}$ is a divergence-free field. Let $\varphi$ be a twice differentiable, non-negative function from $\mathbb R$ to $\mathbb R$ with $\varphi(0) = \varphi'(0) = 0$. Then we find that
\[ \nabla\cdot\Big( \varphi\Big( \frac{\delta\chi^2}{\chi^2} \Big)\, \chi^2\beta \Big) + \varphi'\Big( \frac{\delta\chi^2}{\chi^2} \Big)\, \nabla\cdot\big( \tilde\chi^2\, \delta\beta \big) = 0. \]
Let us multiply this equation by a test function $\psi \in H^1(X)$ and integrate by parts. Since $\varphi = \varphi' = 0$ on $\partial X$, we find
\[ \int_X \varphi\ \chi^2\beta\cdot\nabla\psi\, dx + \int_X \big( \tilde\chi^2\,\delta\beta \big)\cdot\nabla\big( \varphi'\,\psi \big)\, dx = 0. \]
Upon choosing $\psi = \frac{H_2}{H_1}$, we find
\[ \int_X \chi^2 H_1^2\, \Big| \nabla\frac{H_2}{H_1} \Big|^2\, \varphi\, dx + \int_X \big( \tilde\chi^2\,\delta\beta \big)\cdot\nabla\frac{H_2}{H_1}\ \varphi'\, dx + \int_X \big( \tilde\chi^2\,\delta\beta \big)\cdot\nabla\Big( \frac{\delta\chi^2}{\chi^2} \Big)\, \frac{H_2}{H_1}\, \varphi''\, dx = 0. \]
Above, $\varphi$ stands for $\varphi\big( \frac{\delta\chi^2}{\chi^2} \big)$ in all integrals. By assumption on the coefficients, $\nabla\frac{\delta\chi^2}{\chi^2}$ is bounded a.e. This is one of our main motivations for assuming that the optical coefficients are Lipschitz. The middle term is seen to be smaller than the third term, and so we focus on the latter one. Upon taking $\varphi(x) = |x|^p$ for $p \ge 2$ and using assumption (H3), we find that
\[ \| \delta\chi^2 \|^p_{L^p(X)} \le C \int_X |\delta\beta|\ |\delta\chi^2|^{p-2}\, dx. \]
By an application of the Hölder inequality, we deduce that
\[ \| \delta\chi^2 \|_{L^p(X)} \le C\, \| \delta\beta \|^{\frac12}_{L^{\frac p2}(X)}. \]
Since $\chi + \tilde\chi$ is bounded above and below by positive constants, the same bound holds for $\| \chi - \tilde\chi \|_{L^p(X)}$.
We next write $\delta\beta = (H_1 - \tilde H_1)\nabla H_2 + \tilde H_1\nabla(H_2 - \tilde H_2) - (H_2 - \tilde H_2)\nabla H_1 - \tilde H_2\nabla(H_1 - \tilde H_1)$ and use the fact that the solutions to (8.9) and the coefficients are in $W^{1,\infty}(X)$ to conclude that (8.28) holds.

The other results are obtained by regularity theory and interpolation. Indeed, from standard regularity results with coefficients in $W^{1,\infty}(X)$, we find that the solutions to (8.9) are of class $W^{3,q}(X)$ for all $1 \le q < \infty$. Since the coefficient $\Gamma$ is of class $C^3(\bar X)$, the measurements $H_j$ are of class $W^{3,q}(X)$ for all $1 \le q < \infty$. Standard Sobolev estimates show that
\[ \| H_j - \tilde H_j \|_{W^{1,q}(X)} \le C\, \| H_j - \tilde H_j \|^{\frac23}_{L^q(X)}\ \| H_j - \tilde H_j \|^{\frac13}_{W^{3,q}(X)}. \]
The last term is bounded by a constant, which gives (8.29) for $q = \frac p2$. Another interpolation result states that
\[ \| \varphi \|_{L^\infty(X)} \le C\, \| \varphi \|^{\theta}_{W^{1,\infty}(X)}\ \| \varphi \|^{1-\theta}_{L^p(X)}, \qquad \theta = \frac{d}{d+p}. \]
This provides the stability result in the uniform norm (8.30).
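The exponent $\theta = d/(d+p)$ in the last interpolation inequality is exactly the scale-invariant one. A small illustration (a hand-built example, not from the text) in $d = 1$: for the hat function $\varphi_h(x) = \max(0, 1 - |x|/h)$, the ratio of the two sides of the inequality is independent of $h$, so no smaller $\theta$ could work uniformly.

```python
# Scaling check for theta = d/(d+p) with the hat function phi_h, d = 1.
p, d = 4.0, 1.0
theta = d / (d + p)

def rhs(h):
    sup_grad = 1.0 / h                        # ||phi_h'||_inf
    lp = (2.0 * h / (p + 1.0)) ** (1.0 / p)   # exact ||phi_h||_{L^p(R)}
    return sup_grad ** theta * lp ** (1.0 - theta)

# ||phi_h||_inf = 1 for every h, so these ratios should all coincide.
ratios = [1.0 / rhs(h) for h in (0.1, 0.01, 0.001)]
spread = max(ratios) - min(ratios)
```

The common value of the ratio is $(2/(p+1))^{-\theta'}$-type constant; what matters is that it does not depend on $h$.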
Exercise 8.2.1 Find similar stability estimates for $q$ and $(\gamma,\sigma,\Gamma)$ in the different settings considered in Corollary 8.2.3 and in Section 8.1.4.
8.2.2 Application to Quantitative Transient Elastography

We can apply the above results to the time-harmonic reconstruction in a simplified model of transient elastography. Let us assume that $\gamma$ and $\rho$ are unknown functions of $x \in X$ and $\omega \in \mathbb R$. Recall that the displacement solves (8.14). Assuming that $u(x;\omega)$ is known after step 1 of the reconstruction using the ultrasound measurements, then we are in the setting of Theorem 8.2.2 with $\Gamma\sigma = 1$. Let us then assume that the two illuminations $f_1(x;\omega)$ and $f_2(x;\omega)$ are chosen such that, for $u_1$ and $u_2$ the corresponding solutions of (8.14), we have that (H3) holds. Then, (8.19) shows that the reconstructed function $\chi$ uniquely determines the Lamé parameter $\gamma(x;\omega)$ and that the reconstructed function $q$ then uniquely determines $\omega^2\rho$ and hence the density parameter $\rho(x;\omega)$. The reconstructions are performed for each frequency $\omega$ independently. We may summarize this as follows:

Corollary 8.2.5 Under the hypotheses of Theorem 8.2.2 and the hypotheses described above, let $(\chi, q)$ in (8.19) be known. Then $(\gamma(x;\omega), \rho(x;\omega))$ are uniquely determined by two well-chosen measurements. Moreover, if (H3) holds, the stability results in Theorem 8.2.4 hold.

Alternatively, we may assume that in a given range of frequencies, $\gamma(x)$ and $\rho(x)$ are independent of $\omega$. In such a setting, we expect that one measurement $u(x;\omega)$ for two different frequencies will provide sufficient information to reconstruct $(\gamma(x), \rho(x))$. Assume that $u(x;\omega)$ is known for $\omega = \omega_j$, $j = 1, 2$, and define $0 < \alpha = \frac{\omega_2^2}{\omega_1^2} \neq 1$. Then straightforward calculations show that
\[ \tilde\beta\cdot\nabla\gamma + \gamma\,\big( u_1\Delta u_2 - \alpha\, u_2\Delta u_1 \big) = 0, \qquad \tilde\beta := u_1\nabla u_2 - \alpha\, u_2\nabla u_1. \tag{8.31} \]
This provides a transport equation for $\gamma$ that can be solved stably provided that $|\tilde\beta| \ge c_0 > 0$, i.e., a hypothesis of the form (H3) applies and $\tilde\beta$ does not vanish on $X$. Then, Theorem 8.2.2 and Theorem 8.2.4 apply in this setting.
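The algebra behind the two-frequency elimination can be checked on a closed-form example (illustrative, with $\gamma = \rho = 1$, not from the text): $u_j = \sin(\omega_j x)$ solves $u'' + \omega_j^2 u = 0$, and the combination $u_1\Delta u_2 - \alpha\, u_2\Delta u_1$ with $\alpha = \omega_2^2/\omega_1^2$ vanishes identically, which is why only $\nabla\gamma$ survives in the general relation.

```python
# Consistency check of the elimination of rho at two frequencies (gamma = 1).
import math

w1, w2 = 2.0, 3.0
alpha = (w2 / w1) ** 2
residual = 0.0
for k in range(50):
    x = 0.02 + 0.02 * k
    u1, u2 = math.sin(w1 * x), math.sin(w2 * x)
    ddu1, ddu2 = -w1 ** 2 * u1, -w2 ** 2 * u2   # exact second derivatives
    residual = max(residual, abs(u1 * ddu2 - alpha * u2 * ddu1))
```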
8.3 Well-chosen illuminations in PAT and TE

The stability results presented for QPAT and QTE involve well-chosen illuminations so that (H3) and (8.18) hold. Such a constraint is clearly not satisfied for all possible pairs $(f_1, f_2)$. In two dimensions of space, i.e., when $n = 2$, a very large class of illuminations can be proved to be well-chosen. In dimensions $n \ge 3$, the proofs are much more complicated and involve the CGO solutions constructed in Chapter 7.

8.3.1 The two-dimensional case

In dimension $n = 2$, we have:

Lemma 8.3.1 Assume that $h = \frac{f_2}{f_1}$ on $\partial X$ is an almost two-to-one function, i.e., a function that is a two-to-one map except possibly at its minimum and at its maximum. Then (8.18) is satisfied.
Proof. Upon multiplying the equation for $u_1$ by $u_2$, the equation for $u_2$ by $u_1$, and subtracting both relations, we obtain
\[ \nabla\cdot\Big( (\gamma u_1^2)\, \nabla\frac{u_2}{u_1} \Big) = 0 \ \text{ in } X, \qquad \frac{u_2}{u_1} = \frac{f_2}{f_1} \ \text{ on } \partial X. \tag{8.32} \]
This implies that $v := \frac{u_2}{u_1}$ satisfies an elliptic equation with a diffusion coefficient $\gamma u_1^2$ bounded from above and below by positive constants. Note that $\beta = H_1^2\, \nabla v$. Results in, e.g., [2, Theorem 1.2] show that $\nabla v$ cannot vanish inside $X$. In other words, $v$ does not admit any critical point. By the maximum principle and the assumption on $h$, no critical point of $v$ can occur on $\partial X$ either. This implies that $|\nabla v| > 0$ and that we can find a constant $\alpha_0$ such that (8.18) holds, since $H_1^2$ is bounded from below by a positive constant and, by continuity, $|\nabla v|$ attains its (strictly positive) minimum in $\bar X$.
8.3.2 The n-dimensional case

In dimension $n \ge 3$, the above result on the (absence of) critical points of elliptic solutions no longer holds. However, by continuity, we may verify that (8.18) is satisfied for a large class of illuminations when $\gamma$ is close to a constant and $\sigma$ is sufficiently small. For arbitrary coefficients $(\gamma,\sigma)$ in dimension $n \ge 3$, the only available proof that (8.18) is satisfied for an open set of illuminations is the one obtained by means of complex geometrical optics solutions; see [21]. The main result is:

Theorem 8.3.2 Let (H1) hold. Then there is an open set in $C^3(\partial X)$ of illuminations $f^2 = (f_1, f_2)$ such that (H3) holds.

The open set, unlike the result obtained in two dimensions in Lemma 8.3.1, is not explicitly constructed.
Proof. Let us consider the elliptic equation (8.6). In Chapter 7, we have constructed solutions of the form
\[ u_\varrho(x) = \frac{1}{\sqrt\gamma}\, e^{\varrho\cdot x}\, \big( 1 + \psi_\varrho(x) \big), \tag{8.33} \]
with $|\varrho|\,\psi_\varrho(x)$ bounded uniformly in $H^s(X)$ for arbitrary $s \ge 0$, provided that $\gamma$ and $\sigma$ are sufficiently smooth coefficients. Using the construction of Chapter 7, we can prove the following lemma:

Lemma 8.3.3 Let $u_{\varrho_j}$ for $j = 1, 2$ be CGO solutions with $(\gamma,\sigma)$ sufficiently smooth for both $\varrho_j$ and $k \ge 1$, and with $c_0^{-1}|\varrho_1| \le |\varrho_2| \le c_0|\varrho_1|$ for some $c_0 > 0$. Then we have
\[ \beta_\varrho := \frac{1}{2|\varrho_1|}\, e^{-(\varrho_1 + \varrho_2)\cdot x}\, \big( u_{\varrho_1}\nabla u_{\varrho_2} - u_{\varrho_2}\nabla u_{\varrho_1} \big) = \frac{\varrho_2 - \varrho_1}{2|\varrho_1|}\, \frac1\gamma + \tilde\beta_h, \tag{8.34} \]
where the vector field $\tilde\beta_h$ satisfies the constraint
\[ \| \tilde\beta_h \|_{C^k(\bar X)} \le \frac{C_0}{|\varrho_1|}, \tag{8.35} \]
for some constant $C_0$ independent of $\varrho_j$, $j = 1, 2$.
Exercise 8.3.1 Prove the above lemma using the results obtained in Chapter 7.

With $\varrho_2 = \overline{\varrho_1}$, so that $u_{\varrho_2} = \overline{u_{\varrho_1}}$, the imaginary part of (8.34) is a vector field that does not vanish on $X$ for $|\varrho_1|$ sufficiently large. Moreover, let $u_{\varrho_1} = v + iw$ and $u_{\varrho_2} = v - iw$ for $v$ and $w$ real-valued functions. Then the imaginary and real parts of (8.34) are given by
\[ \Im\beta_\varrho = \frac{1}{|\varrho_1|}\, e^{-2\Re\varrho_1\cdot x}\, \big( w\nabla v - v\nabla w \big) = -\frac{\Im\varrho_1}{|\varrho_1|}\, \frac1\gamma + \Im\tilde\beta_h, \qquad \Re\beta_\varrho = 0. \]
Let $u_1$ and $u_2$ be solutions of the elliptic problem (8.6) on $X$ such that $u_1 + iu_2$ on $\partial X$ is close (in the $C^3(\partial X)$ (strong) topology) to the trace of $u_{\varrho_1}$. The above result shows that
\[ \big| u_1\nabla u_2 - u_2\nabla u_1 \big| \ge c_0 > 0 \ \text{ in } X. \]
This yields (8.18) and the proof of the theorem.
The set of illuminations $f^2 = (f_1, f_2)$ for which (8.18) is guaranteed is not known explicitly. All we know is that if $f^2$ is chosen sufficiently close to the traces of the CGO solutions constructed above, then indeed the vector field $\beta$ will satisfy (8.18). One major drawback with such a result is that the CGO solutions depend on the unknown coefficients $(\gamma,\sigma)$. That said, there does exist an open set of illuminations $f^2$ such that (8.18) holds.

This result should be contrasted with the case in two dimensions, where we were able to prove that (8.18) held for a very large class of (non-oscillatory) illuminations $f^2$.
Chapter 9

Coupled-physics IP II: Ultrasound Modulation Tomography

9.1 Ultrasound Modulation Tomography
The preceding chapter analyzed inverse problems related to the photo-acoustic effect. In this chapter, we consider another physical mechanism that couples optical waves with ultrasound, namely the ultrasound modulation effect. To simplify the presentation, and because most results are known in this simplified setting, we assume that the elliptic equations involve an unknown diffusion coefficient but that the absorption coefficient is assumed to vanish. We refer the reader to [20] for generalizations to the practically physical setting of an elliptic equation with unknown diffusion and absorption coefficients. Consider thus the following elliptic equation
\[ -\nabla\cdot\gamma(x)\nabla u = 0 \ \text{ in } X, \qquad u = f \ \text{ on } \partial X. \tag{9.1} \]
Here, $\gamma$ is the unknown diffusion coefficient, which we assume for the moment is a real-valued, scalar function defined on a domain $X \subset \mathbb R^n$ for $n = 2$ or $n = 3$. We assume that $\gamma$ is bounded above and below by positive constants so that the above equation admits a unique solution. We also assume that $\gamma$ is sufficiently smooth so that the solution to the above equation is continuously differentiable on $\bar X$, the closure of $X$ [33]. As before, we denote by $f(x)$ the imposed (sufficiently smooth) Dirichlet boundary conditions.

As we have seen already, the coefficient $\gamma(x)$ may model the electrical conductivity in the setting of electrical impedance tomography (EIT) or a diffusion coefficient of particles (photons) in the setting of optical tomography (OT). Both EIT and OT are modalities with high contrast, in the sense that $\gamma(x)$ takes different values in different tissues and allows one to discriminate between healthy and non-healthy tissues. In OT, high contrasts are mostly observed in the absorption coefficient, which we recall is not modeled here; see [20].
A methodology to couple high contrast with high resolution consists of perturbing the diffusion coefficient acoustically. Let an acoustic signal propagate through the domain. We assume here that the sound speed is constant and that the acoustic signal is a plane wave of the form $p\cos(k\cdot x + \varphi)$, where $p$ is the amplitude of the acoustic signal, $k$ its wave-number, and $\varphi$ an additional phase. The acoustic signal modifies the properties of the diffusion equation. We assume that such an effect is small and that the coefficient in (9.1) is modified as
\[ \gamma_\varepsilon(x) = \gamma(x)\,\big( 1 + \varepsilon\, c(x) \big), \tag{9.2} \]
where we have defined $c = c(x) = \cos(k\cdot x + \varphi)$ and where $\varepsilon = p\,\Gamma$ is the product of the acoustic amplitude $p \in \mathbb R$ and a measure $\Gamma > 0$ of the coupling between the acoustic signal and the modulations of the constitutive parameter in (9.1). We assume that $\varepsilon \ll 1$, so that the influence of the acoustic signal on $\gamma_\varepsilon$ admits an asymptotic expansion that we truncated at the second order as displayed in (9.2). The sizes of the terms in the expansion are physically characterized by $\varepsilon$ and depend on the specific application.
Let $u$ and $v$ be solutions of (9.1) with fixed boundary conditions $g$ and $h$, respectively. When the acoustic field is turned on, the coefficients are modified as described in (9.2) and we denote by $u_\varepsilon$ and $v_\varepsilon$ the corresponding solutions. Note that $u_{-\varepsilon}$ is the solution obtained by changing the sign of $p$ or, equivalently, by replacing $\varphi$ by $\varphi + \pi$.

By the standard continuity property of the solution to (9.1) with respect to changes in the coefficients and regular perturbation arguments, we find that $u_\varepsilon = u_0 + \varepsilon u_1 + O(\varepsilon^2)$. Let us multiply the equation for $u_\varepsilon$ by $v_{-\varepsilon}$ and the equation for $v_{-\varepsilon}$ by $u_\varepsilon$, subtract the resulting equalities, and use standard integrations by parts. We obtain that
\[ \int_X \big( \gamma_\varepsilon - \gamma_{-\varepsilon} \big)\, \nabla u_\varepsilon\cdot\nabla v_{-\varepsilon}\, dx = \int_{\partial X} \Big( \gamma_{-\varepsilon}\, \partial_\nu v_{-\varepsilon}\, u_\varepsilon - \gamma_\varepsilon\, \partial_\nu u_\varepsilon\, v_{-\varepsilon} \Big)\, d\sigma(x). \tag{9.3} \]

Exercise 9.1.1 Verify the above result.
Here, $d\sigma(x)$ is the standard surface measure on $\partial X$. We assume that $\gamma_\varepsilon\,\partial_\nu u_\varepsilon$ and $\gamma_{-\varepsilon}\,\partial_\nu v_{-\varepsilon}$ are measured on $\partial X$, at least on the support of $v_{-\varepsilon} = h$ and $u_\varepsilon = g$, respectively, for all values $\varepsilon$ of interest. Note that the above equation holds if the Dirichlet boundary conditions are replaced by Neumann boundary conditions. Let us define
\[ J_\varepsilon := \frac12 \int_{\partial X} \Big( \gamma_{-\varepsilon}\, \partial_\nu v_{-\varepsilon}\, u_\varepsilon - \gamma_\varepsilon\, \partial_\nu u_\varepsilon\, v_{-\varepsilon} \Big)\, d\sigma(x) = \varepsilon J_1 + \varepsilon^2 J_2 + O(\varepsilon^3). \tag{9.4} \]
We assume that the real-valued functions $J_m = J_m(k,\varphi)$ are known (measured functions). Notice that such knowledge is based on the physical boundary measurement of the Cauchy data of the form $(u_\varepsilon, \gamma_\varepsilon\partial_\nu u_\varepsilon)$ and $(v_{-\varepsilon}, \gamma_{-\varepsilon}\partial_\nu v_{-\varepsilon})$ on $\partial X$.

Equating like powers of $\varepsilon$, we find at the leading order that
\[ \int_X \gamma(x)\, \nabla u_0\cdot\nabla v_0(x)\, \cos(k\cdot x + \varphi)\, dx = J_1(k,\varphi). \tag{9.5} \]
This may be acquired for all $k \in \mathbb R^n$ and $\varphi = 0, \frac\pi2$, and hence provides the Fourier transform of
\[ H[u_0, v_0](x) = \gamma(x)\, \nabla u_0\cdot\nabla v_0(x). \tag{9.6} \]
Note that when $v_{-\varepsilon} = u_{-\varepsilon}$, then we find from the expression in (9.3) that $J_2 = 0$ in (9.4), so that the expression for $J_1$ may be obtained from available measurements in (9.4) with an accuracy of order $O(\varepsilon^2)$. Note also that
\[ H[u_0, v_0](x) = \frac14 \Big( H[u_0 + v_0,\, u_0 + v_0] - H[u_0 - v_0,\, u_0 - v_0] \Big)(x) \]
by polarization. In other words, the limiting measurements (for small $\varepsilon$) in (9.6) may also be obtained by considering expressions of the form (9.3) with $u_\varepsilon = v_\varepsilon$.
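Both remarks are elementary algebra and can be illustrated with hypothetical numbers (none of the values below come from the text): the antisymmetric combination $(J_\varepsilon - J_{-\varepsilon})/(2\varepsilon)$ cancels the even-order terms of the expansion (9.4), and the polarization identity recovers the bilinear quantity $H[u_0, v_0]$ from two quadratic ones.

```python
# Hypothetical expansion coefficients for J_eps = eps*J1 + eps^2*J2 + eps^3*J3.
J1, J2, J3 = 1.7, -0.9, 0.4
def J(eps):
    return eps * J1 + eps ** 2 * J2 + eps ** 3 * J3

eps = 1e-2
J1_est = (J(eps) - J(-eps)) / (2 * eps)    # = J1 + eps^2 * J3 (even terms cancel)

# Polarization for the bilinear form H[a, b] = gamma * a . b (2-D vectors).
gamma = 2.5
a, b = (1.0, -2.0), (0.5, 3.0)
def quad(w):
    return gamma * (w[0] ** 2 + w[1] ** 2)
cross = 0.25 * (quad((a[0] + b[0], a[1] + b[1])) - quad((a[0] - b[0], a[1] - b[1])))
```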
In the setting of optical tomography, the coefficient $\gamma_\varepsilon$ in (9.2) takes the form
\[ \gamma_\varepsilon(x) = c_\varepsilon^{\,n-1}(x)\, \tilde\gamma_\varepsilon(x), \]
where $\tilde\gamma_\varepsilon$ is the diffusion coefficient, $c_\varepsilon$ is the light speed, and $n$ is the spatial dimension. When the pressure field is turned on, the location of the scatterers is modified by compression and dilation. Since the diffusion coefficient is inversely proportional to the scattering coefficient, we find that
\[ \frac{1}{\tilde\gamma_\varepsilon(x)} = \frac{1}{\tilde\gamma(x)}\, \big( 1 + \varepsilon\, c(x) \big). \]
Moreover, the pressure field changes the index of refraction (the speed) of light as follows:
\[ c_\varepsilon(x) = c(x)\, \big( 1 + \eta\, \varepsilon\, c(x) \big), \]
where $\eta$ is a constant (roughly equal to $\frac13$ for water). Combining the two effects shows that
\[ \Gamma = (n-1)\,\eta - 1. \tag{9.7} \]
In the setting of electrical impedance tomography, we simply assume that $\Gamma$ models the coupling between the acoustic signal and the change in the electrical conductivity of the underlying material. The value of $\Gamma$ thus depends on the application.
9.2 Inverse problems in ultrasound modulation

Assuming the validity of the above derivation, the objective of ultrasound modulated tomography is to reconstruct the coefficient $\gamma(x)$ from knowledge of the interior functionals
\[ H_{ij}(x) = \gamma(x)\, \nabla u_i(x)\cdot\nabla u_j(x), \qquad 1 \le i, j \le m, \tag{9.8} \]
where $u_j$ is the solution to the equation
\[ \nabla\cdot( \gamma\, \nabla u_j ) = 0 \ \text{ in } X, \qquad u_j = f_j \ \text{ on } \partial X, \quad 1 \le j \le m, \tag{9.9} \]
for appropriate choices of the boundary conditions $f_j$ on $\partial X$. In practice, the ultrasound modulation effect is extremely weak because the coupling coefficient $\Gamma$ is small. However, its mathematical analysis shows that UMOT and UMEIT are well-posed problems, unlike the (non-ultrasound-modulated) OT and EIT problems.

We need some notation to introduce the main result of this chapter. We first define the UMT measurement operator. We again define $f = (f_1, \dots, f_m)$, the vector of illuminations at the domain's boundary. The corresponding solutions to (9.9) are denoted by $u_j$. The measurement operator $M_f$ is then defined as the following collection of internal functionals:
\[ M_f :\ \mathcal X \to \mathcal M_2^{sym}(\mathbb R^m; \mathcal Y), \qquad \gamma \mapsto M_f\gamma = \big( \gamma\, \nabla u_i\cdot\nabla u_j \big)_{1 \le i, j \le m}. \tag{9.10} \]
Here, $\mathcal X$ is a subset of a Banach space in which the unknown coefficients are defined, $\mathcal Y$ is a subset of $H^1(X)$ where the solutions to (9.9) are defined, and $\mathcal M_2^{sym}(\mathbb R^m; \mathcal Y)$ is the space of symmetric second-order tensors of order $m$ with values in $\mathcal Y$. Again, we observe that $M_f$ is parameterized by the illuminations $f$.

Remark 9.2.1 Note that for $m = 1$, the internal functional is of the form $H = \gamma|\nabla u|^2$. The unknown coefficient $\gamma$ can then be eliminated and, formally, $u$ solves the following non-linear equation
\[ \nabla\cdot\Big( \frac{H}{|\nabla u|^2}\, \nabla u \Big) = 0 \ \text{ in } X, \qquad u = f \ \text{ on } \partial X. \tag{9.11} \]
This nonlinear problem is in fact hyperbolic in nature rather than elliptic, and its solution is therefore far from guaranteed. See [?].

In some sense, the UM inverse problem has two unknowns $(\gamma, u)$. In the one-functional setting, the elimination of $\gamma$ to get an equation for $u$ is rather trivial. However, the resulting equation is difficult to analyze and may not have unique solutions. The multi-functional setting when $m \ge 2$ aims to simplify the solution of the resulting (redundant system of) equation(s). However, the elimination of unknowns becomes significantly more challenging.
Following [?, 27], we first perform the change of unknown functions $S_i = \sqrt\gamma\, \nabla u_i$ for every $i$ and define
\[ F(x) := \nabla\log\gamma(x). \tag{9.12} \]
Let $(e_1, \dots, e_n)$ be the canonical basis in $\mathbb R^n$. For a given vector field $V = V_i e_i$ defined on $X$, we define the corresponding one-form $V^\flat := V_i\, dx^i$, where $dx^i$ is the dual basis (of 1-forms) to $e_i$ in the sense that $dx^i(e_j) = \delta_{ij}$, i.e., $0$ if $i \neq j$ and $1$ if $i = j$. With this notation, we obtain that the vector fields $S_j$ satisfy the system of equations
\[ \nabla\cdot S_j = -\frac12\, F\cdot S_j, \tag{9.13} \]
\[ dS_j^\flat = \frac12\, F^\flat\wedge S_j^\flat, \qquad 1 \le j \le m, \tag{9.14} \]
where $\wedge$ and $d$ denote the usual exterior product and exterior derivative, respectively. The first equation stems directly from (9.9), whereas the second one states that the one-form $\gamma^{-\frac12} S_j^\flat = du_j$ is exact, therefore closed, and hence $d( \gamma^{-\frac12} S_j^\flat ) = 0$. In dimension $n = 3$, this means that $\nabla u = \gamma^{-\frac12} S$ is a gradient, so that its curl vanishes. The above equations are generalizations to arbitrary dimensions.
When $n = 2, 3$, equation (9.14) is recast as:
\[ n = 2 :\ [\nabla, S_j] - \tfrac12\, [F, S_j] = 0, \qquad n = 3 :\ \operatorname{curl} S_j - \tfrac12\, F\times S_j = 0, \]
where for $n = 2$, we define $[A, B] := A_x B_y - A_y B_x$ and $[\nabla, A] := \partial_x A_y - \partial_y A_x$, and for $n = 3$, $\times$ denotes the standard cross-product.

Exercise 9.2.1 Check the above formulas.
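The $n = 3$ identity can be probed on a closed-form example (a hand-built sketch, not from the text): it holds for $S = \sqrt\gamma\,\nabla u$ with any smooth $u$, since it only encodes the exactness of $\gamma^{-1/2} S^\flat$. Take $\gamma = 1 + x$ and $u = xy$, for which both sides reduce to $(0, 0, (\sqrt\gamma)'\,x)$.

```python
# Pointwise check of curl(S) = (1/2) F x S for S = sqrt(gamma) grad(u).
import math

def check(x, y, z):
    g = 1.0 + x                    # gamma(x) = 1 + x (depends on x only)
    sg = math.sqrt(g)
    dsg = 0.5 / sg                 # (sqrt(gamma))'
    S = (sg * y, sg * x, 0.0)      # sqrt(gamma) * grad(x*y)
    curl = (0.0, 0.0, (dsg * x + sg) - sg)   # (d_y S_z - d_z S_y, ..., d_x S_y - d_y S_x)
    F = (1.0 / g, 0.0, 0.0)        # grad log gamma
    half_FxS = (0.5 * (F[1] * S[2] - F[2] * S[1]),
                0.5 * (F[2] * S[0] - F[0] * S[2]),
                0.5 * (F[0] * S[1] - F[1] * S[0]))
    return max(abs(curl[k] - half_FxS[k]) for k in range(3))

err = max(check(0.2, -0.5, 1.0), check(1.0, 2.0, -3.0))
```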
The strategy now is the following: we wish to eliminate $F$ from the above equations. This will result in an overdetermined system of equations for the vector fields $S_j$ that we will prove admits a unique solution. Once all vector fields $S_j$ are known, we obtain an explicit expression for $F$, from which the reconstruction of $\gamma$ is relatively straightforward. The elimination of $F$ requires that a certain system be invertible. As for conditions (H3) and (H3') in Chapter 8, we therefore need to find well-chosen illuminations $f$ for which such an invertibility condition holds. We will again see that a large class of illuminations are well-chosen in two dimensions $n = 2$. In dimensions $n \ge 3$, the construction of well-chosen illuminations will again be based on CGO solutions.

The invertibility condition is that the $m$ gradients $\nabla u_j$ have maximal rank $n$ in $\mathbb R^n$ at every point $x \in X$. This hypothesis can be formalized by the somewhat stronger statement: there exists a finite open covering $\mathcal O = \{\Omega_k\}_{1 \le k \le N}$ of $X$ (i.e., $X \subset \cup_{k=1}^N \Omega_k$), an indexing function $\tau : [1, N] \ni i \mapsto \tau(i) = (\tau(i)_1, \dots, \tau(i)_n) \in [1, m]^n$, and a positive constant $c_0$ such that
\[ \min_{1 \le i \le N}\ \inf_{x \in \Omega_i}\ \det\big( S_{\tau(i)_1}(x), \dots, S_{\tau(i)_n}(x) \big) \ge c_0 > 0. \tag{9.15} \]
This assumption is equivalent to imposing the following condition on the data
\[ \min_{1 \le i \le N}\ \inf_{x \in \Omega_i}\ \det H_{\tau(i)}(x) \ge c_0^2 > 0, \tag{9.16} \]
where $H_{\tau(i)}$ stands for the $n\times n$ matrix of elements $(H_{\tau(i)})_{kl} = S_{\tau(i)_k}\cdot S_{\tau(i)_l}$. We say that $f$ is a well-chosen set of illuminations when (9.15) or, equivalently, (9.16) holds.

This complicated expression simply states that at each point $x \in X$, we can find $n$ linearly independent vectors $\nabla u_{j(x)}$ with determinant bounded from below uniformly in $x \in X$. The elimination of $F$ is then guaranteed and can be done in a stable fashion, as the following lemma indicates.
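The equivalence of (9.15) and (9.16) rests on the identity $\det H = (\det S)^2$ for $H = S^T S$, which is easy to sanity-check numerically (illustrative matrix below, not from the text):

```python
# det(S^T S) = (det S)^2 for a 3x3 example.
S = [[2.0, 1.0, 0.0],
     [0.0, 1.5, 0.5],
     [1.0, 0.0, 1.0]]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# H_{ij} = S_i . S_j where S_j is the j-th column of S.
H = [[sum(S[k][i] * S[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
lhs, rhs = det3(H), det3(S) ** 2
```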
Lemma 9.2.2 Let $\Omega \subset X$ be open where, up to renumbering of the solutions, we have
\[ \inf_{x \in \Omega} \det(S(x)) \ge c_0 > 0, \qquad S(x) := \big( S_1(x) \,|\, \dots \,|\, S_n(x) \big). \]
Let us define $H(x) := \{ S_i(x)\cdot S_j(x) \}_{1 \le i, j \le n}$ and $D(x) = \sqrt{\det H(x)}$. Then we have:
\[ F = \frac{c_F}{D} \sum_{i,j=1}^n \big( \nabla(D H^{ij})\cdot S_i \big)\, S_j = c_F \Big( \nabla\log D + \sum_{i,j=1}^n \big( \nabla H^{ij}\cdot S_i \big)\, S_j \Big), \qquad c_F := \Big( \frac12(n-2) + 1 \Big)^{-1}. \tag{9.17} \]
Here, $H^{ij}$ denotes the element $(i, j)$ of the matrix $H^{-1}$.
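Formula (9.17) can be exercised on a separable example (a hand-built sketch, not from the text): in $n = 2$ (so $c_F = 1$), take $\gamma = \gamma(x) = 1 + x$, $u_1$ with $u_1' = 1/\gamma$ and $u_2 = y$. Then $S_1 = (\gamma^{-1/2}, 0)$, $S_2 = (0, \gamma^{1/2})$, $H = \operatorname{diag}(1/\gamma, \gamma)$, $D = 1$, and (9.17) should return $F = \nabla\log\gamma = (1/(1+x), 0)$.

```python
# Evaluate the right-hand side of (9.17) in closed form for the example above.
def F_from_formula(x):
    g = 1.0 + x                      # gamma(x)
    dg = 1.0                         # gamma'(x)
    S1 = (g ** -0.5, 0.0)
    S2 = (0.0, g ** 0.5)
    dH11 = (dg, 0.0)                 # grad of H^{11} = gamma
    dH22 = (-dg / g ** 2, 0.0)       # grad of H^{22} = 1/gamma
    # grad log D = 0 and H^{12} = H^{21} = 0, so only diagonal terms remain.
    Fx = (dH11[0] * S1[0]) * S1[0] + (dH22[0] * S2[0]) * S2[0]
    Fy = (dH11[0] * S1[0]) * S1[1] + (dH22[0] * S2[0]) * S2[1]
    return Fx, Fy

errs = []
for x in (0.0, 0.3, 0.7):
    Fx, Fy = F_from_formula(x)
    errs.append(abs(Fx - 1.0 / (1.0 + x)) + abs(Fy))
```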
Once $F$ is eliminated, we can write a system of equations for the vectors $S_j$ that admits a unique solution provided that $\gamma$ and all the $S_j$'s are known at one point $x_0 \in \bar X$, for instance at the domain's boundary $\partial X$. This leads to well-posed reconstructions, as stated in the following:
Theorem 9.2.3 Let $X \subset \mathbb R^n$, $n \ge 2$, be an open convex bounded domain, and let two sets of $m \ge n$ solutions of (9.9) generate measurements $(H = M_f(\gamma),\ \tilde H = M_f(\tilde\gamma))$ whose components belong to $W^{1,\infty}(X)$, and which jointly satisfy condition (9.16) with the same triple $(\mathcal O, \tau, c_0)$. In other words, we assume that $f$ is well-chosen for both $\gamma$ and $\tilde\gamma$. Let $x_0 \in \Omega_{i_0} \subset X$ and let $\gamma(x_0)$, $\tilde\gamma(x_0)$ and $\{ S_{\tau(i_0)_i}(x_0),\ \tilde S_{\tau(i_0)_i}(x_0) \}_{1 \le i \le n}$ be given. Then we have the stability estimate:
\[ \| \log\gamma - \log\tilde\gamma \|_{W^{1,\infty}(X)} \le C \big( \varepsilon_0 + \| H - \tilde H \|_{W^{1,\infty}(X)} \big), \tag{9.18} \]
where $\varepsilon_0$ is the error at $x_0$:
\[ \varepsilon_0 := \big| \log\gamma(x_0) - \log\tilde\gamma(x_0) \big| + \sum_{i=1}^n \big\| S_{\tau(i_0)_i}(x_0) - \tilde S_{\tau(i_0)_i}(x_0) \big\|. \]
9.3 Eliminations and redundant systems of ordinary differential equations

The proof of Theorem 9.2.3, which we complete in this section, involves first proving Lemma 9.2.2 and next solving a redundant system of equations for the vectors $S_j$.
9.3.1 Elimination of F

We now prove Lemma 9.2.2 and first recall some notation from tensor calculus. For $0 \le k \le n$, $\Lambda^k$ denotes the space of $k$-forms. We recall the definition of the Hodge star operator $\star : \Lambda^k \to \Lambda^{n-k}$ for $0 \le k \le n$, such that for any elementary $k$-form $dx^I = dx^{i_1}\wedge\dots\wedge dx^{i_k}$, we have
\[ \star\, dx^I = \epsilon\, dx^J, \qquad \text{where } \epsilon = \operatorname{sign}\big( (1 \dots n) \to (I, J) \big). \tag{9.19} \]
Here, $J$ is implicitly defined by the fact that $(1 \dots n) \to (I, J)$ is a permutation. Note that $\epsilon\, dx^J$ is independent of the ordering of the $n - k$ indices in $J$. We recall the following useful identities:
\[ \star\star = (-1)^{k(n-k)} \ \text{ on } \Lambda^k, \qquad \star\big( u^\flat \wedge \star v^\flat \big) = u\cdot v, \qquad \star\, d \star u^\flat = \nabla\cdot u, \qquad u^\flat, v^\flat \in \Lambda^1. \]
Because $(S_1(x), \dots, S_n(x))$ forms a basis of $\mathbb R^n$, a vector $V$ can be represented in this basis by the following representation
\[ V = H^{ij}\, (V\cdot S_i)\, S_j. \tag{9.20} \]
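Representation (9.20) (with summation over repeated indices) is a standard dual-basis formula; a quick numerical check with illustrative vectors (not from the text):

```python
# V = sum_{i,j} H^{ij} (V . S_i) S_j with H_{ij} = S_i . S_j, in R^2.
S = [(2.0, 0.0), (1.0, 1.0)]           # S_1, S_2: a basis of R^2
V = (3.0, -1.0)

H = [[sum(S[i][k] * S[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
detH = H[0][0] * H[1][1] - H[0][1] * H[1][0]
Hinv = [[H[1][1] / detH, -H[0][1] / detH],
        [-H[1][0] / detH, H[0][0] / detH]]

rec = [0.0, 0.0]
for i in range(2):
    for j in range(2):
        coef = Hinv[i][j] * sum(V[k] * S[i][k] for k in range(2))
        rec[0] += coef * S[j][0]
        rec[1] += coef * S[j][1]
```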
For $j = 1, \dots, n$, let us introduce the following 1-forms:
\[ X_j^\flat := (-1)^{n-1}\, \epsilon_j\, \star\big( S_{i_1}^\flat \wedge \dots \wedge S_{i_{n-1}}^\flat \big), \qquad (i_1, \dots, i_{n-1}) = (1, \dots, \hat j, \dots, n), \tag{9.21} \]
where the hat indicates an omission and $\epsilon_j = (-1)^{j-1}$. We now show that the vector fields $X_j$ satisfy a simple divergence equation. We compute
\[ \nabla\cdot X_j = \star\, d \star X_j^\flat = \epsilon_j\, \star d\big( S_{i_1}^\flat \wedge \dots \wedge S_{i_{n-1}}^\flat \big) = \epsilon_j\, \star \sum_{k=1}^{n-1} (-1)^{k-1}\, S_{i_1}^\flat \wedge \dots \wedge dS_{i_k}^\flat \wedge \dots \wedge S_{i_{n-1}}^\flat \]
\[ = \epsilon_j\, \star \sum_{k=1}^{n-1} (-1)^{k-1}\, S_{i_1}^\flat \wedge \dots \wedge \tfrac12\big( F^\flat \wedge S_{i_k}^\flat \big) \wedge \dots \wedge S_{i_{n-1}}^\flat = \tfrac12(n-1)\, \star\big( F^\flat \wedge \star X_j^\flat \big). \]
Using the identity $\star( u^\flat \wedge \star v^\flat ) = u\cdot v$, we deduce that
\[ \nabla\cdot X_j = \frac12(n-1)\, F\cdot X_j, \qquad j = 1, \dots, n. \tag{9.22} \]
We now decompose $X_j$ in the basis $(S_1, \dots, S_n)$. For $k \neq j$, there is an $l$ such that $i_l = k$ and we have
\[ X_j\cdot S_k = \det\big( S_1, \dots, S_{j-1}, S_k, S_{j+1}, \dots, S_n \big) = 0, \]
by repetition of the term $S_k$ in the determinant. Now for $k = j$, we have
\[ X_j\cdot S_j = \det\big( S_1, \dots, S_n \big) = \det S = D. \]
Using formula (9.20), we deduce that $X_j$ admits the expression
\[ X_j = D\, H^{ij}\, S_i. \]
Plugging this expression into equation (9.22), and using ∇ · (φV) = ∇φ · V + φ∇ · V, we obtain

  ∇(DH^{ij}) · S_i + DH^{ij} ∇ · S_i = (1/2)(n−1) F · (DH^{ij} S_i)
  ∇(DH^{ij}) · S_i − DH^{ij} (1/2) F · S_i = (1/2)(n−1) DH^{ij} F · S_i
  ∇(DH^{ij}) · S_i = c_F^{−1} DH^{ij} F · S_i,

where the second line uses ∇ · S_i = −(1/2) F · S_i. Finally, using the representation (9.20) for F itself yields

  F = (H^{ij} F · S_i) S_j = (c_F/D) (∇(DH^{ij}) · S_i) S_j.   (9.23)

We can also recast the previous expression as

  F = c_F (∇ log D + (∇H^{ij} · S_i) S_j),   (9.24)

and the proof of the lemma is complete.
144 CHAPTER 9. COUPLED-PHYSICS IP II: UMT
9.3.2 System of ODEs for S_j
In this section, we obtain a redundant system of ordinary differential equations for the matrix S. Let S be the matrix formed with the column vectors S_j. Then H = S^T S is known from the measurement operator. Moreover, we have just found an equation of the form dS_j^♭ = F^♭(S) ∧ S_j^♭, with F = F(S) given above. We thus possess information about the symmetric part of ∇S and the skew-symmetric part of ∇S. It remains to know whether this is enough to write an equation of the form ∇S_j = T_j(S). The answer is affirmative as we now show.
We first need to introduce other standard geometric notation, without which the derivations become quickly intractable. Let us denote the Euclidean orthonormal frame e_i = ∂/∂x^i and e^i = dx^i. We work on a convex set of R^n with the Euclidean metric g(X, Y) ≡ X · Y = δ_{ij} X^i Y^j on R^n. Following [43], we denote by ∇ the Euclidean connection, which here has the expression we can take as a definition:

  ∇_X f = X · ∇f = X^i ∂_i f,  and  ∇_X Y = (X · ∇Y^j) e_j = X^i (∂_i Y^j) e_j,

for given vector fields X = X^i e_i and Y = Y^i e_i. An important identity for the sequel is the following characterization of the exterior derivative of a one-form ω:

  dω(X, Y) = ∇_X(ω(Y)) − ∇_Y(ω(X)) − ω([X, Y]),   (9.25)

or equivalently in the Euclidean metric, writing ω = S_i^♭, X = S_j and Y = S_k,
  S_i · [S_j, S_k] = ∇_{S_j}(S_i · S_k) − ∇_{S_k}(S_i · S_j) − dS_i^♭(S_j, S_k),   (9.26)

where the Lie bracket (commutator) of X and Y may be defined here as [X, Y] = ∇_X Y − ∇_Y X = X · ∇Y − Y · ∇X, seen as a vector field.
Exercise 9.3.1 Verify (9.26) directly.
Note that the right-hand side in (9.26) involves no derivatives in the unknown S since S_i · S_k = H_{ik} is known and dS_i^♭ is a known functional of S by (9.14) and (9.17).
The following relation between inner products and Lie brackets of a given frame (see e.g. [43, Eq. 5.1 p. 69]) is very useful:

  2(∇_X Y) · Z = X · ∇(Y · Z) + Y · ∇(Z · X) − Z · ∇(X · Y) − Y · [X, Z] − Z · [Y, X] + X · [Z, Y].   (9.27)

Exercise 9.3.2 Check the above expression directly.
We thus find using (9.14) and (9.26) that

  2(∇_{S_i} S_j) · S_k = S_i · ∇H_{jk} − S_j · ∇H_{ik} + S_k · ∇H_{ij} − (F · S_k) H_{ij} + (F · S_j) H_{ik}.   (9.28)

Note that the right-hand side no longer involves any derivative of S_j. Moreover, S_j forms a frame. We can therefore extract the full gradient of S_j from the terms 2(∇_{S_i} S_j) · S_k.
Geometrically, gradients generalize to tensors via the total covariant derivative, which maps a vector field X to a tensor of type (1, 1) defined by

  ∇X(ω, Y) = ω(∇_Y X).   (9.29)
S is a frame provided that the determinant condition inf_x det S ≥ c_0 > 0 holds. In the frame S, we may express ∇S_i in the basis {S_j ⊗ S_k^♭}_{j,k=1}^n of such tensors by writing ∇S_i = a_{ijk} S_j ⊗ S_k^♭ and identifying the coefficients a_{ijk} by writing

  ∇S_i(S_p^♭, S_q) = S_p^♭(∇_{S_q} S_i) = (∇_{S_q} S_i) · S_p,

and also

  ∇S_i(S_p^♭, S_q) = a_{ijk} S_j ⊗ S_k^♭(S_p^♭, S_q) = a_{ijk} H_{jp} H_{kq}.

Equating the two, we obtain the representation

  ∇S_i = H^{qk} H^{jp} [(∇_{S_q} S_i) · S_p] S_j ⊗ S_k^♭,   (9.30)

where H^{ij} denote the coefficients of the matrix H^{−1}.
Exercise 9.3.3 Seeing ∇S_j as the matrix with components (i, k) given by ∂_{x_i}(S_j)_k and S_i ⊗ S_m as the matrix with components (j, k) given by (S_i)_j (S_m)_k, show that

  ∇S_j = Σ_{i,k,l,m} H^{ik} ((∇_{S_k} S_j) · S_l) H^{lm} S_i ⊗ S_m = T_j(S).
Now plugging (9.28) into (9.30), and using H^{ij} H_{jk} = δ_{ik}, we obtain

  2∇S_i = 2H^{qk} H^{jp} ((∇_{S_q} S_i) · S_p) S_j ⊗ S_k^♭
    = H^{qk} H^{jp} (∇H_{iq} · S_p + ∇H_{ip} · S_q − ∇H_{pq} · S_i + H_{pq}(F · S_i) − H_{qi}(F · S_p)) S_j ⊗ S_k^♭
    = (H^{jp}(U^{ki} · S_p) + H^{qk}(U^{ji} · S_q) + (∇H^{jk} · S_i) + H^{jk}(F · S_i) − H^{jp} δ_{ik}(F · S_p)) S_j ⊗ S_k^♭,

where we have used ∇H^{jk} = −H^{jp}(∇H_{pq})H^{qk} and have defined

  U^{jk} := −(∇H^{jp}) H_{pk} = H^{jp} ∇H_{pk},  1 ≤ j, k ≤ n.   (9.31)

Using the formulas H^{jk} S_j ⊗ S_k^♭ = e_i ⊗ e^i and H^{kl}(V · S_k) S_l = V for any smooth vector field V, we obtain for 1 ≤ i ≤ n

  ∇S_i = (1/2) (U^{ki} ⊗ S_k^♭ + S_k ⊗ (U^{ki})^♭ + (∇H^{jk} · S_i) S_j ⊗ S_k^♭) + (1/2)(F · S_i) I_n − (1/2) F ⊗ S_i^♭.   (9.32)
Using (9.23), we observe that ∇S_i is equal to a polynomial of degree at most three in the frame S with coefficients involving the known inner products H_{ij}. For each 1 ≤ i, k ≤ n, ∂_k S_i is nothing but ∇_{e_k} S_i = ∇S_i(·, e_k), which can be obtained from (9.32). Denoting S := (S_1^T, ..., S_n^T)^T, we are then able to construct the system of equations

  ∂_k S = Σ_{|β|≤3} Q_k^β S^β,  S^β = Π_{i=1}^{n²} S_i^{β_i},  1 ≤ k ≤ n,   (9.33)

where Q_k^β depends only on the data and β is an n²-index. This redundant system can then be integrated along any curve (where it becomes a system of ordinary differential equations with Lipschitz right-hand sides ensuring uniqueness of the solution) in order to solve for the matrix-valued function S.
9.3.3 ODE solution and stability estimates
Once (9.33) has been obtained, the derivation of Theorem 9.2.3 follows rapidly. We leave this step as an exercise; see [?] for the details.
Exercise 9.3.4 Prove the stability estimates in Theorem 9.2.3 from (9.33), (9.24), and the definition (9.12).
9.4 Well-chosen illuminations
It remains to find boundary conditions such that (9.15) holds. As in the preceding chapter, we need to distinguish dimension n = 2 from dimensions n ≥ 3.
9.4.1 The case n = 2
In dimension n = 2, the critical points of u (points x where ∇u(x) = 0) are necessarily isolated as is shown in, e.g., [2]. From this and techniques of quasiconformal mappings that are also restricted to two dimensions of space, we can show the following results.
Lemma 9.4.1 ([3]) Let u_1 and u_2 be the solutions of (9.1) on X simply connected with boundary conditions f_1 = x_1 and f_2 = x_2 on ∂X, respectively, where x = (x_1, x_2) are Cartesian coordinates on X. Assume that σ is sufficiently smooth. Then (x_1, x_2) ↦ (u_1, u_2) from X to its image is a diffeomorphism. In other words, det(∇u_1, ∇u_2) > 0 uniformly on X̄.
In other words, in two dimensions of space, there are explicit boundary conditions, such as those above with f_j the trace of x_j on ∂X for j = 1, 2, that guarantee that (9.15) holds uniformly on the whole domain X. It is shown in [25] that the appropriate extension of this result is false in dimension n ≥ 3.
9.4.2 The case n ≥ 3
In dimension n ≥ 3, we have the following result:
Lemma 9.4.2 Let n ≥ 3 and σ ∈ H^{n/2+3+ε}(X) for some ε > 0 be bounded from below by a positive constant. Then for n even, there exists an open set G of illuminations (f_1, ..., f_n) such that for any f ∈ G, the condition (9.15) holds with O = X for some constant c_0 > 0.
For n odd, there exists an open set G of illuminations (f_1, ..., f_{n+1}) such that for any f ∈ G there exists an open cover of X of the form {Ω_{2i−1}, Ω_{2i}}_{1≤i≤N} and a constant c_0 > 0 such that

  inf_{x∈Ω_{2i−1}} det(S_1, ..., S_{n−1}, ε_i S_n) ≥ c_0  and  inf_{x∈Ω_{2i}} det(S_1, ..., S_{n−1}, ε̃_i S_{n+1}) ≥ c_0,   (9.34)

for 1 ≤ i ≤ N and with ε_i, ε̃_i ∈ {−1, 1}.
In other words, this lemma indicates that for appropriate boundary conditions f_j, we can always find n corresponding solutions whose gradients form a basis of R^n.
Proof. Consider the problem ∇ · σ(x)∇u = 0 on R^n with σ(x) extended in a continuous manner outside of X and such that σ equals 1 outside of a large ball. Let q(x) = Δ√σ/√σ on R^n. Then q ∈ H^{n/2+1+ε}(R^n) since σ − 1 ∈ H^{n/2+3+ε}(R^n) for some ε > 0. By Sobolev imbedding, σ is of class C^3(X̄) while q is of class C^1(X̄). With the above hypotheses, we can apply Corollary 7.3.3.
Let v = √σ u so that Δv + qv = 0 on R^n. Let ϱ ∈ C^n be of the form ϱ = ρ(θ + iθ^⊥) with θ, θ^⊥ ∈ S^{n−1}, θ · θ^⊥ = 0, and ρ = |ϱ|/√2 > 0. Now, as we showed in Corollary 7.3.3, we have

  v_ϱ = e^{ϱ·x}(1 + ψ_ϱ),  ψ_ϱ|_X = O(ρ^{−1}) in C^1(X̄),

with (Δ + q)v_ϱ = 0 and hence ∇ · σ∇u_ϱ = 0 in R^n. We have used again the Sobolev imbedding stating that functions in H^{n/2+k+ε}(Y) are of class C^k(Ȳ) for a bounded domain Y. Taking gradients of the previous equation and rearranging terms, we obtain that

  ∇v_ϱ = ρ e^{ϱ·x} (θ + iθ^⊥ + ζ_ϱ),  with  ζ_ϱ := ρ^{−1} ∇ψ_ϱ + (θ + iθ^⊥) ψ_ϱ.

Because ∇ψ_ϱ is bounded and ψ_ϱ|_X = O(ρ^{−1}) in C^1(X̄), the C^n-valued function ζ_ϱ satisfies sup_X |ζ_ϱ| ≤ Cρ^{−1} with C independent of ρ. Moreover, the constant C is in fact independent of σ provided that the norm of the latter is bounded by a uniform constant in H^{n/2+3+ε}(X) according to Corollary 7.3.3.
Both the real and imaginary parts of u_ϱ, denoted u_ϱ^ℜ and u_ϱ^ℑ, are solutions of the free-space conductivity equation. Thus, √σ∇u_ϱ^ℜ and √σ∇u_ϱ^ℑ can serve as vectors S_i. More precisely, we have

  √σ∇u_ϱ^ℜ = ρ e^{ρθ·x} ((θ + ζ_ϱ^1) cos(ρθ^⊥ · x) − (θ^⊥ + ζ_ϱ^2) sin(ρθ^⊥ · x)),
  √σ∇u_ϱ^ℑ = ρ e^{ρθ·x} ((θ^⊥ + ζ_ϱ^2) cos(ρθ^⊥ · x) + (θ + ζ_ϱ^1) sin(ρθ^⊥ · x)).   (9.35)
Case n even: Set n = 2p, define ϱ_l = ρ(e_{2l} + i e_{2l−1}) for 1 ≤ l ≤ p, and construct

  S_{2l−1} = √σ∇u_{ϱ_l}^ℜ  and  S_{2l} = √σ∇u_{ϱ_l}^ℑ,  1 ≤ l ≤ p.

Using (9.35), we obtain that

  det(S_1, ..., S_n) = ρ^n e^{2ρ Σ_{l=1}^p x_{2l}} (1 + f(x)),

where lim_{ρ→∞} sup_X |f| = 0. Letting ρ so large that sup_X |f| ≤ 1/2 and denoting c_0 := min_{x∈X̄} (ρ^n e^{2ρ Σ_{l=1}^p x_{2l}}) > 0, we have inf_{x∈X} det(S_1, ..., S_n) ≥ c_0/2 > 0.
Case n odd: Set n = 2p − 1, define ϱ_l = ρ(e_{2l} + i e_{2l−1}) for 1 ≤ l ≤ p − 1 and ϱ_p = ρ(e_n + i e_1), and construct

  S_{2l−1} = √σ∇u_{ϱ_l}^ℜ  and  S_{2l} = √σ∇u_{ϱ_l}^ℑ,  1 ≤ l ≤ p.
Using (9.35), we obtain that

  det(S_1, ..., S_{n−1}, S_n) = ρ^n e^{ρ(x_n + 2Σ_{l=1}^{p−1} x_{2l})} (cos(ρx_1) + f_1(x)),
  det(S_1, ..., S_{n−1}, S_{n+1}) = ρ^n e^{ρ(x_n + 2Σ_{l=1}^{p−1} x_{2l})} (sin(ρx_1) + f_2(x)),

where lim_{ρ→∞} sup_X |f_1| = lim_{ρ→∞} sup_X |f_2| = 0. Letting ρ so large that sup_X (|f_1|, |f_2|) ≤ 1/4 and denoting c_1 := min_{x∈X̄} (ρ^n e^{ρ(x_n + 2Σ_{l=1}^{p−1} x_{2l})}) > 0, we have that |det(S_1, ..., S_{n−1}, S_n)| ≥ c_1/4 on sets of the form X ∩ {ρx_1 ∈ (−π/3, π/3) + mπ} and |det(S_1, ..., S_{n−1}, S_{n+1})| ≥ c_1/4 on sets of the form X ∩ {ρx_1 ∈ (π/6, 5π/6) + mπ}, where m is a signed integer. Since the previous sets are open and a finite number of them covers X (because X is bounded and ρ is finite), we therefore have fulfilled the desired requirements of the construction. Upon changing the sign of S_n or S_{n+1} on each of these sets if necessary, we can assume that the determinants are all positive.
Note that as in the preceding paragraph, we have obtained the existence of an open set of illuminations f such that appropriate determinants remain strictly positive throughout the domain X. However, these illuminations f are not characterized explicitly.
9.5 Remarks on hybrid inverse problems
In the past two chapters, we have briefly presented two hybrid inverse problems based on the photo-acoustic effect (with a similar theory for the imaging modality Transient Elastography) and the ultrasound modulation effect. What characterizes these hybrid inverse problems is that after a preliminary step (involving an inverse wave problem in photoacoustics and an inverse Fourier transform in ultrasound modulation), we obtain an inverse problem with internal functionals of the unknown parameters.
These internal functionals have an immediate advantage: singularities of the unknown coefficients no longer need to be propagated to the boundary of the domain by an elliptic operator that severely damps high frequencies. The main reason for the ill-posedness of inverse problems, namely the smoothing property of the corresponding operators, is therefore no longer an issue. However, injectivity of the measurement operator is not guaranteed. We have seen that only two out of three coefficients could be reconstructed in QPAT. QPAT data acquired at one frequency are thus not sufficient to reconstruct three coefficients. With appropriate prior information about the dependency of the coefficients with respect to a frequency parameter, injectivity of the measurement operator can be restored [?]. But again, this requires prior information that one may not wish to impose. The alternative is then to come up with other prior models that restore injectivity or to combine QPAT measurements with additional measurements.
Hybrid inverse problems face the same shortcomings as any other inverse problem. Once injectivity is guaranteed, stability of the reconstructions is guaranteed in principle by the fact that singularities no longer need to propagate. We have seen a few such stability results. Note that these results typically require a certain degree of smoothness of the unknown coefficients. This is a shortcoming of the theories presented above. The reason why we have recourse to hybrid inverse problems is to obtain high resolution. The reason we typically need high resolution is that coefficients may vary rapidly and we wish to quantify such variations. It would therefore be useful to understand how stability estimates degrade when the coefficients are not smooth.
That said, numerical experiments conducted in e.g. [?, 18, 19] show that reconstructions based on algorithms similar to those presented above do allow us to obtain very accurate reconstructions even for highly discontinuous coefficients, and this even in the presence of relatively significant noise.
For additional general references to hybrid inverse problems, we refer the reader to [4, 56].
Chapter 10
Priors and Regularization
As we have mentioned several times in these notes, the influence of noise is largely subjective: we either find it acceptable or unacceptable. Mathematically, this influence is typically acceptable in a given norm while it may be unacceptable in another (more constraining) norm. In which norms that influence is controlled for a given problem is characterized by stability estimates, which we have presented for the problems considered in these notes.
Once we have decided that noise had too large an effect in a setting of interest, something must be done. That something inescapably requires that we add prior information. Several techniques have been developed to do so. The simplest and most developed is the regularization methodology. Typically, such a regularization assumes that the object we are interested in reconstructing has a prior smoothness. We may for instance assume that the object belongs to H^s(X) for some s > 0. This assumption indicates that the Fourier transform of the object of interest decreases rapidly as frequency increases. High frequencies, which are not present, thus do not need to be reconstructed with high accuracy. This allows us to mitigate the effect of high frequency noise in the reconstructions.
The main drawback of regularization theory is that objects of interest may not necessarily be smooth. Smooth means that the first coefficients in a Fourier series expansion are big while the other coefficients are small. In several settings of interest, the objects may be represented by a few big coefficients and a large number of small coefficients, but not in the basis of Fourier coefficients. In other words, the object may be sparse in a different, known basis. The objective of sparse regularization is to devise methods to find these coefficients.
In some settings, the problems are so ill-posed that looking even for the first coefficients in a given basis may not provide sufficient accuracy. Other sorts of prior information may be necessary, for instance assuming that the objects of interest are small inclusions with specific structures, or that next to a blue pixel in an image, red pixels are extremely unlikely. In such situations, reconstructions are typically not very accurate and it is often important to characterize this inaccuracy. A very versatile setting to do so is the statistical Bayesian framework. In such a setting, objects of interest are modeled by a set of possible outcomes with prior probabilities of happening. This is the prior probability distribution. Then data are acquired with a given noise model. The probability of such data happening conditioned on given parameters is called the likelihood probability distribution. Using the Bayes rule, a posterior probability distribution gives the probability density of the parameters based on availability of the data.
We now briefly consider these three settings, the smoothness regularization methodology, the sparsity regularization methodology, and the Bayesian framework, and show their relations.
10.1 Smoothness Regularization
We have seen that many of the inverse problems we have considered were either mildly ill-posed (with α > 0 in the sense of Chapter 1) or severely ill-posed (as for instance the Calderón problem or the Cauchy problems for elliptic equations). We present here some techniques to regularize the inversion. Such techniques typically work for mildly ill-posed problems but are often not sufficient for severely ill-posed problems. But since the influence of noise is subjective anyway, these regularization techniques should be applied first as they are the simplest both theoretically and computationally. We refer the reader to [31, 40, 63] for additional information on these regularization techniques.
10.1.1 Ill-posed problems and compact operators
Let A be an injective and compact operator defined on an infinite dimensional (separable) Hilbert space H with range Range(A) in H:

  A : H → Range(A) ⊂ H.   (10.1)

We recall that compact operators map the unit ball in H to a subset of H whose closure (with respect to the usual norm in H) is compact, i.e., verifies that every bounded (with respect to the usual norm in H) family of points admits a converging (with respect to the usual norm in H) subsequence in the compact set.
Since A is injective (i.e., Ax = 0 implies x = 0), we can define the inverse operator A^{−1} with domain of definition Range(A) and range H:

  A^{−1} : D(A^{−1}) = Range(A) → H.   (10.2)

The problem is that A^{−1} is never a continuous operator from Range(A) to H when both spaces are equipped with the usual norm in H:
Lemma 10.1.1 For A as above, there exists a sequence x_n such that

  ‖x_n‖_H = 1,  ‖Ax_n‖_H → 0.   (10.3)

The same holds true with ‖x_n‖_H → ∞.
Proof. The proof holds in more complicated settings than Hilbert spaces. The Hilbert structure gives us a very simple proof and is based on the existence of an orthonormal basis in H, i.e., vectors x_n such that ‖x_n‖_H = 1 and (x_n, x_m)_H = 0 for n ≠ m. Since these vectors belong to the unit ball, we deduce that y_n = Ax_n is a converging sequence (up to taking subsequences), say to y ∈ H. Take now x̃_n = 2^{−1/2}(x_n − x_{n+1}). We verify that x̃_n satisfies (10.3). Now define z_n = x̃_n/‖Ax̃_n‖_H^{1/2} when the latter denominator does not vanish and z_n = n x̃_n otherwise. Then Az_n still converges to 0 while ‖z_n‖_H converges to ∞.
This simple lemma shows that inverting a compact operator can never be a well-posed problem in the sense that A^{−1} is not continuous from D(A^{−1}) to H with the H norm. Indeed take the sequence y_n = Ax_n/‖Ax_n‖ in D(A^{−1}), where x_n is the sequence in (10.3). Then ‖y_n‖_H = 1 while ‖A^{−1}y_n‖_H tends to ∞.
The implication for inverse problems is the following. If y_n is the measurement noise for n large, then A^{−1}y_n models the error in the reconstruction, which may thus be arbitrarily larger than the norm of the true object we aim to reconstruct. More precisely, if Ax = b is the real problem and Ax̃ = b̃ is the exact reconstruction from noisy data, then arbitrarily small errors ‖b − b̃‖ in the measurements are still compatible with arbitrarily large errors ‖x − x̃‖ in the space of parameters. This shows that the problem needs to be regularized before any inversion is carried out.
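The instability described above can be seen on a simple finite-dimensional surrogate. The diagonal operator and all numerical values below are our own illustration, not taken from the text: an operator with rapidly decaying singular values maps unit vectors to vectors of vanishing norm, and its inverse amplifies small data errors accordingly.

```python
import numpy as np

# A diagonal operator A e_j = e_j / j: a finite-dimensional surrogate (our own
# illustrative choice) for a compact injective operator with singular values 1/j.
N = 200
j = np.arange(1, N + 1)
A = np.diag(1.0 / j)

# Unit basis vectors are mapped to vectors of norm 1/n, mimicking (10.3).
e_last = np.eye(N)[-1]
print(np.linalg.norm(A @ e_last))        # 1/200 = 0.005

# A tiny data error along a high-index direction is amplified by a factor N
# after inversion: ||A^{-1} noise|| = N * ||noise||.
x_true = 1.0 / j**2
b = A @ x_true
noise = 1e-3 * e_last
x_rec = np.linalg.solve(A, b + noise)
print(np.linalg.norm(x_rec - x_true))    # 1e-3 * 200 = 0.2
```

The data error has norm 10^{−3} while the reconstruction error has norm 0.2; refining the discretization (increasing N) makes the amplification factor arbitrarily large, which is the finite-dimensional shadow of Lemma 10.1.1.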
10.1.2 Regularity assumptions and error bound
The calculations we have carried out in the preceding section show that an ill-posed inverse problem cannot satisfactorily be solved if no other assumptions on the problem are added. A sometimes reasonable and practically useful assumption is to impose, before we start the reconstruction process, that the object we want to reconstruct is sufficiently smooth. This allows us to filter out high frequencies that may appear in the reconstruction because we know they are part of the noise and not of the object we want to reconstruct. We present two mathematical frameworks for such a regularization.
In the first framework, we introduce the adjoint operator A* to A, defined from H to Range(A*) by the relation

  (Ax, y)_H = (x, A*y)_H,  for all x, y ∈ H.

Since A is compact and injective, then so is A*. We can also define the inverse operator A^{−*} = (A*)^{−1} from Range(A*) to H.
We may now assume that x, the object we aim at reconstructing, is sufficiently smooth that it belongs to the range of A*, i.e., there exists y such that x = A*y. Since A and A* are compact operators, hence smoothing operators, the above hypothesis means that we assume a priori that x is smoother than being merely an element in H. We then define the stronger norm

  ‖x‖_1 = ‖A^{−*}x‖_H.   (10.4)

We may also assume that the object x is even smoother than being in the range of A*. For instance let us assume that x belongs to the range of A*A, i.e., there exists y such that x = A*Ay. Note that since both A and A* are smoothing operators (because they are compact), the assumption on x is stronger than simply being in the range of A*. We define the even stronger norm

  ‖x‖_2 = ‖(A*A)^{−1}x‖_H.   (10.5)
We want to use these definitions to show that if the solution x is a priori bounded for the ‖·‖_1 or the ‖·‖_2 norm and noise is small, then the error in the reconstruction is small. For instance, assume that y_j = Ax_j for j = 1, 2 so that δy = Aδx for δy = y_1 − y_2 and δx = x_1 − x_2. If both x_j, j = 1, 2, are bounded and δy is small, then how small is δx? For such questions, we have the following result:
Theorem 10.1.2 Let x ∈ H be such that ‖x‖_1 ≤ E and ‖Ax‖_H ≤ δ. Then we have:

  ‖x‖_H ≤ √(δE).   (10.6)

If we now assume that ‖x‖_2 ≤ E instead, we obtain the better bound

  ‖x‖_H ≤ E^{1/3} δ^{2/3}.   (10.7)
Proof. Let y = A^{−*}x so that ‖y‖_H ≤ E. We have then

  ‖x‖²_H = (x, A*y) = (Ax, y) ≤ ‖Ax‖_H ‖y‖_H ≤ δE.

This proves (10.6). For the second bound let z = (A*A)^{−1}x so that ‖z‖_H ≤ E and compute:

  ‖x‖²_H = (x, A*Az) = (Ax, Az) ≤ δ‖Az‖ = δ(Az, Az)^{1/2} = δ(z, x)^{1/2} ≤ δ E^{1/2} ‖x‖_H^{1/2}.

This proves the second bound (10.7).
The theorem should be interpreted as follows. Consider that δ is the noise level in the measured data and that ‖x‖_1 < E or ‖x‖_2 < E is a priori smoothness information we have on the object we want to reconstruct. Then the worst error we can make on the reconstruction (provided we find an appropriate inversion method; see below) is given by the bounds (10.6) and (10.7). Note that the latter bound is better (since δ^{2/3} ≪ δ^{1/2}). This results from a more stringent assumption on the image x.
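Both bounds can be verified numerically on a finite-dimensional surrogate; the diagonal operator, sizes, and random sampling below are our own illustrative choices (A is symmetric here, so A* = A).

```python
import numpy as np

# Check the bounds of Theorem 10.1.2 on a diagonal compact-like operator.
rng = np.random.default_rng(0)
N = 100
A = np.diag(1.0 / np.arange(1, N + 1))    # singular values 1/j; A* = A

for _ in range(100):
    # x = A*y: E := ||y|| and delta := ||Ax|| give ||x|| <= sqrt(delta E), cf. (10.6).
    y = rng.standard_normal(N)
    x1 = A @ y
    E1, d1 = np.linalg.norm(y), np.linalg.norm(A @ x1)
    assert np.linalg.norm(x1) <= np.sqrt(d1 * E1) * (1 + 1e-12)

    # x = A*Az: E := ||z|| and delta := ||Ax|| give ||x|| <= E^(1/3) delta^(2/3), cf. (10.7).
    z = rng.standard_normal(N)
    x2 = A @ (A @ z)
    E2, d2 = np.linalg.norm(z), np.linalg.norm(A @ x2)
    assert np.linalg.norm(x2) <= E2**(1/3) * d2**(2/3) * (1 + 1e-12)

print("bounds (10.6) and (10.7) hold on all samples")
```

Since the proof uses nothing beyond the Cauchy-Schwarz inequality, the bounds hold exactly for every sample, not just on average.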
Let us now consider smoothing operators in the (second) framework of the Hilbert scale H^s(R) we have introduced in Chapter 1. Then we have the following result:
Theorem 10.1.3 Let us assume that the operator A is mildly ill-posed of order α > 0, so that

  ‖Af‖_{L²(R)} ≥ m‖f‖_{H^{−α}(R)}.   (10.8)

Suppose now that the measurement error is small and that the function we want to reconstruct is regular in the sense that

  ‖Af‖_{L²(R)} ≤ δm,  and  ‖f‖_{H^β(R)} ≤ E,   (10.9)

for some δ > 0, β > 0, and E > 0. Then we have

  ‖f‖_{L²(R)} ≤ δ^{β/(α+β)} E^{α/(α+β)}.   (10.10)
Proof. The proof is a simple but interesting exercise in interpolation theory. Notice that the hypotheses are

  ‖f‖_{H^β(R)} ≤ E,  and  ‖f‖_{H^{−α}(R)} ≤ δ,

and our objective is to find a bound for ‖f‖_{L²(R)}. Let us denote ⟨ξ⟩ = (1 + |ξ|²)^{1/2}. We have

  (2π)^n ‖f‖²_{L²(R)} = ∫_{R^n} |f̂(ξ)|^{2θ} ⟨ξ⟩^{−2αθ} |f̂(ξ)|^{2(1−θ)} ⟨ξ⟩^{2αθ} dξ
    ≤ (∫_{R^n} |f̂(ξ)|² ⟨ξ⟩^{−2α} dξ)^θ (∫_{R^n} |f̂(ξ)|² ⟨ξ⟩^{2αθ/(1−θ)} dξ)^{1−θ},

thanks to Hölder's inequality

  |∫_R f(x)g(x) dx| ≤ ‖f‖_{L^p(R)} ‖g‖_{L^q(R)},

which holds for all p ≥ 1 and q ≥ 1 such that p^{−1} + q^{−1} = 1, where we have defined for all p ≥ 1,

  ‖f‖_{L^p(R)} = (∫_R |f(x)|^p dx)^{1/p}.   (10.11)

Choosing θ = β/(α + β), so that 2αθ/(1 − θ) = 2β, gives (10.10).
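The interpolation inequality at the heart of the proof, ‖f‖_{L²} ≤ ‖f‖_{H^{−α}}^{β/(α+β)} ‖f‖_{H^β}^{α/(α+β)}, can be tested on a discrete Fourier representation. The grid, exponents, and random coefficients below are our own illustrative choices.

```python
import numpy as np

# Discrete check of the interpolation step: with weights <xi>^2 = 1 + xi^2,
# ||f||_{L2} <= ||f||_{H^-alpha}^theta * ||f||_{H^beta}^(1-theta), theta = beta/(alpha+beta).
rng = np.random.default_rng(1)
xi = np.fft.fftfreq(256, d=1.0 / 256)        # integer frequencies
w2 = 1.0 + xi**2                              # <xi>^2
alpha, beta = 1.0, 2.0
theta = beta / (alpha + beta)

for _ in range(100):
    fhat2 = np.abs(rng.standard_normal(256) + 1j * rng.standard_normal(256))**2
    L2 = np.sqrt(fhat2.sum())
    Hma = np.sqrt((fhat2 * w2**(-alpha)).sum())   # ||f||_{H^-alpha}
    Hb = np.sqrt((fhat2 * w2**beta).sum())        # ||f||_{H^beta}
    assert L2 <= Hma**theta * Hb**(1 - theta) * (1 + 1e-12)
print("interpolation inequality holds on all samples")
```

The discrete sums play the role of the integrals in the proof, and the inequality is again a direct instance of Hölder's inequality with exponents 1/θ and 1/(1−θ).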
Let us briefly recall the proof of Hölder's inequality [62]. We first verify that

  x^{1/p} ≤ x/p + 1/q,  x > 0,

for p^{−1} + q^{−1} = 1 and p ≥ 1, since x^{1/p} − x/p attains its maximum at x = 1, where it is equal to q^{−1}. For y > 0 we use the above inequality for x/y and multiply by y to obtain

  x^{1/p} y^{1/q} ≤ x/p + y/q,  x > 0, y > 0.   (10.12)

Choosing x = |tf(x)|^p and y = |t^{−1}g(x)|^q, we deduce that

  ∫_{R^n} |f(x)g(x)| dx ≤ (1/p)‖tf‖^p_{L^p(R)} + (1/q)‖t^{−1}g‖^q_{L^q(R)} = (t^p/p)‖f‖^p_{L^p(R)} + (t^{−q}/q)‖g‖^q_{L^q(R)},

for all t > 0. Minimizing over t gives the Hölder inequality.
The last theorem applies to a less general class of operators than compact operators (although it applies to operators that are not necessarily compact), but it gives us an accurate result. We should still consider δ as the noise level and E as an a priori bound we have on the object we want to reconstruct. Then depending on the a priori smoothness of the object, we obtain different possible accuracies in the reconstructions. What is important is the relative regularity of the object compared to the smoothing effect of the operator A. When β = α, this corresponds to assuming the same regularity as ‖x‖_1 ≤ E in Theorem 10.1.2. We thus obtain an accuracy of order δ^{1/2} in the reconstruction. When β = 2α, this corresponds to ‖x‖_2 ≤ E, since f = A*Ag for some g ∈ L²(R) means that f is twice as regular as A is smoothing. We thus recover the accuracy of order δ^{2/3} as in Theorem 10.1.2. Theorem 10.1.3 allows us to deal with arbitrary values of β. Notice that as β → ∞, we recover that the problem is almost well-posed since the error in the reconstruction is asymptotically of the same order as the noise level.
10.1.3 Regularization methods
Now that we know how noise can optimally be controlled in the reconstruction based on the regularity of the object we want to reconstruct, we need to devise algorithms that indeed control noise amplification in the reconstruction.
Since A^{−1} is an unbounded operator with domain of definition Range(A), a proper subset of H, we first need to introduce approximations of the inverse operator. We denote by R_γ, defined from H to H for γ > 0, a sequence of regularizations of A^{−1} such that

  lim_{γ→0} R_γ Ax = x  for all x ∈ H.   (10.13)

Under the hypotheses of Lemma 10.1.1, we can show that the sequence of operators R_γ is not uniformly bounded. A uniform bound would indeed imply that A^{−1} is bounded. Thus, R_γA converges to identity strongly (since (10.13) is the definition of strong convergence of operators) but not uniformly in the sense that ‖R_γA − I‖ does not converge to 0.
Exercise 10.1.1 Prove this.
One of the main objectives of the regularization technique is to handle noise in an optimal fashion. Let us denote by y_δ our measurements and assume that ‖y_δ − Ax‖_H ≤ δ. We then define

  x_{γ,δ} = R_γ y_δ.   (10.14)

We want to find sequences R_γ that deal with noise in an optimal fashion. For instance, assuming that ‖x‖_1 ≤ E and that ‖y_δ − Ax‖_H ≤ δ, we want to be able to show that

  ‖x − x_{γ,δ}‖ ≤ C√(δE),

at least for some values of γ. We know from Theorem 10.1.3 that such a bound is optimal. We will consider three regularization techniques: singular value decomposition, Tikhonov regularization, and Landweber iterations.
The choice of a parameter γ is then obviously of crucial importance as the above bound will not hold independently of γ. More precisely, the reconstruction error can be decomposed as

  ‖x_{γ,δ} − x‖_H ≤ δ‖R_γ‖_H + ‖R_γAx − x‖_H.   (10.15)

Exercise 10.1.2 Prove this. The operator norm ‖R_γ‖_H is defined as the supremum of ‖R_γx‖_H under the constraint ‖x‖_H ≤ 1.
We thus observe that two competing effects enter (10.15). The first effect comes from the ill-posedness: as γ → 0, the norm ‖R_γ‖_H tends to ∞ so γ should not be chosen too small. The second effect comes from the regularization: as γ increases, R_γA becomes a less accurate approximation of identity so γ should not be chosen too large. Only intermediate values of γ will provide an optimal reconstruction.
Singular Value Decomposition
For a compact and injective operator A defined on an infinite dimensional Hilbert space H, let us assume that we know its singular value decomposition defined as follows. Let A* be the adjoint operator to A and λ_j > 0, j ∈ N, the eigenvalues of the symmetric operator A*A. Then the sequence μ_j = √λ_j for j ∈ N are called the singular values of A. Since μ_j ≤ ‖A‖_H, we order the singular values such that

  μ_1 ≥ μ_2 ≥ ... ≥ μ_n ≥ ... > 0.

Multiple eigenvalues are repeated as many times as their multiplicity (which is necessarily finite since the associated eigenspace for A*A needs to be compact).
Then there exist two orthonormal systems (x_j)_{j∈N} and (y_j)_{j∈N} in H such that

  Ax_j = μ_j y_j  and  A*y_j = μ_j x_j,  for all j ∈ N.   (10.16)

We call (μ_j, x_j, y_j) the singular system for A. Notice that

  Ax = Σ_{j=1}^∞ μ_j (x, x_j) y_j,  A*y = Σ_{j=1}^∞ μ_j (y, y_j) x_j.

Here (x, x_j) is the inner product in H, (x, x_j)_H. We then have the very useful characterization of the Range of the compact and injective operator A:
Lemma 10.1.4 (Picard) The equation Ax = y is solvable in H if and only if

  Σ_{j∈N} (1/μ_j²) |(y, y_j)|² < ∞,   (10.17)

in which case the solution is given by

  x = A^{−1}y = Σ_{j∈N} (1/μ_j) (y, y_j) x_j.   (10.18)
The ill-posedness of the inverse problem appears very clearly in the singular value decomposition. As j → ∞, the singular values μ_j tend to 0. And they do so all the faster that the inverse problem is ill-posed. We can extend the definition of ill-posed problems in the sense that a compact operator generates a mildly ill-posed inverse problem of order α > 0 when the singular values decay like j^{−α} and generates a severely ill-posed problem when the singular values decay faster than any j^{−m} for m ∈ N.
So in order to regularize the problem, all we have to do is to replace too small singular values by larger values. Let us define q(γ, μ) for γ > 0 and μ ∈ [0, ‖A‖] such that

  |q(γ, μ)| ≤ 1,  |q(γ, μ)| ≤ c(γ)μ,  and  q(γ, μ) − 1 → 0 as γ → 0,   (10.19)

(not uniformly in μ obviously). Then we define the regularizing sequence

  R_γ y = Σ_{j∈N} (q(γ, μ_j)/μ_j) (y, y_j) x_j.   (10.20)
Compare to (10.18). As γ → 0, R_γ converges to A^{−1} pointwise. We are interested in estimating (10.15) and showing that the error is optimal based on the assumed regularity of x. The total error is estimated by using

  ‖R_γ‖_H ≤ c(γ),  ‖R_γAx − x‖²_H = Σ_{j=1}^∞ (q(γ, μ_j) − 1)² |(x, x_j)|².   (10.21)

Exercise 10.1.3 Prove these relations.
We can now prove the following results:
Theorem 10.1.5 (i) Let us assume that x = A*z with ‖z‖_H ≤ E and that ‖y_δ − Ax‖ ≤ δ, where y_δ denotes the measurements. Choose q(γ, μ) and γ such that

  |q(γ, μ) − 1| ≤ C_1 √γ/μ,  c(γ) ≤ C_2/√γ,  γ = C_3 δ/E.   (10.22)

Then we have that

  ‖x_{γ,δ} − x‖_H ≤ (C_2/√C_3 + C_1 √C_3) √(δE).   (10.23)

(ii) Let us assume that x = A*Az with ‖z‖_H ≤ E and that ‖y_δ − Ax‖ ≤ δ, where y_δ denotes the measurements. Choose q(γ, μ) and γ such that

  |q(γ, μ) − 1| ≤ C_4 γ/μ²,  c(γ) ≤ C_5/√γ,  γ = C_6 (δ/E)^{2/3}.   (10.24)

Then we have that

  ‖x_{γ,δ} − x‖_H ≤ (C_5/√C_6 + C_4 C_6) δ^{2/3} E^{1/3}.   (10.25)
Proof. Since x = A*z, we verify that (x, x_j) = μ_j (z, y_j), so that

  ‖R_γAx − x‖²_H = Σ_{j=1}^∞ (q(γ, μ_j) − 1)² μ_j² |(z, y_j)|² ≤ C_1² γ ‖z‖²_H.

This implies that

  δ‖R_γ‖_H + ‖R_γAx − x‖_H ≤ δ C_2/√γ + C_1 √γ E.

Using (10.15) and the expression for γ yields (10.23).
Exercise 10.1.4 Using similar arguments, prove (10.25).
This concludes the proof.
We have thus defined an optimal regularization scheme for the inversion of Ax = y. Indeed from the theory in Theorem 10.1.2 we know that up to some multiplicative constants, the above estimates are optimal.
It remains to find filters q(γ, μ) satisfying the above hypotheses. We propose two:

  q(γ, μ) = μ²/(γ + μ²),   (10.26)

  q(γ, μ) = 1 if μ² ≥ γ,  q(γ, μ) = 0 if μ² < γ.   (10.27)
Exercise 10.1.5 Show that the above choices verify the hypotheses of Theorem 10.1.5.
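The role of the cutoff parameter can be illustrated numerically with the spectral filter (10.27). The matrix, noise level, and values of γ below are our own choices for the sketch: a too small γ lets the noise blow up, a too large γ discards too much of the object, and an intermediate γ balances the two error terms in (10.15).

```python
import numpy as np

# Spectral cutoff regularization: R_gamma keeps only singular values with
# mu_j^2 >= gamma, implementing the filter (10.27) inside (10.20).
rng = np.random.default_rng(2)
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
mu = 1.0 / np.arange(1, n + 1)**2           # decaying singular values
A = U @ np.diag(mu) @ V.T

x_true = V @ (1.0 / np.arange(1, n + 1))    # unknown object
delta = 1e-3
y_delta = A @ x_true + delta * rng.standard_normal(n)

def cutoff_inverse(A, y, gamma):
    """R_gamma y = sum over mu_j^2 >= gamma of (y, y_j) x_j / mu_j."""
    Us, s, Vst = np.linalg.svd(A)
    keep = s**2 >= gamma
    return Vst[keep].T @ ((Us.T @ y)[keep] / s[keep])

for gamma in (1e-16, 1e-4, 1.0):
    err = np.linalg.norm(cutoff_inverse(A, y_delta, gamma) - x_true)
    print(f"gamma = {gamma:7.0e}   reconstruction error = {err:.2e}")
```

The intermediate value of γ gives the smallest reconstruction error, exhibiting the trade-off between the noise amplification term δ‖R_γ‖ and the approximation term ‖R_γAx − x‖.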
Tikhonov Regularization
One of the main drawbacks of the theory presented in the preceding section is that in
most cases, the singular value decomposition of the operator is not analytically available
(although it is for the Radon transform; see [47, 48]), and is quite expensive to compute
numerically once the continuous problem has been discretized. It is therefore useful to
consider regularization techniques that do not depend on the SVD. One of the most
popular regularization techniques is the Tikhonov-Phillips regularization technique.
Solving Ax = y corresponds to minimizing \|Ax - y\|_H. Instead one may want to minimize the regularized Tikhonov functional
J_\gamma(x) = \|Ax - y\|_H^2 + \gamma \|x\|_H^2, \qquad x \in H.    (10.28)
For \gamma > 0 and A a linear bounded operator on H, we can show that the above functional admits a unique minimizer x_\gamma solving the following normal equations
A^* A x_\gamma + \gamma x_\gamma = A^* y.    (10.29)
Exercise 10.1.6 Prove the above statement.
We can thus define the regularizing sequence
R_\gamma = (\gamma + A^* A)^{-1} A^*.    (10.30)
The operator is bounded in H by \|R_\gamma\|_H \le C \gamma^{-1/2} for all \gamma > 0. Notice that for a compact operator A with singular system (\mu_j, x_j, y_j), we verify that the singular value decomposition of R_\gamma is
R_\gamma y = \sum_{j=1}^\infty \frac{\mu_j}{\gamma + \mu_j^2} \, (y, y_j) \, x_j.    (10.31)
This means that the Tikhonov regularization corresponds to the SVD regularization with filter given by (10.26) and implies that the Tikhonov regularization is optimal for inverse problems with a priori regularity \|x\|_1 \le E or \|x\|_2 \le E. It is interesting to observe that the Tikhonov regularization is no longer optimal when the a priori regularity of x is better than \|x\|_2 \le E (see [40]).
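The equivalence between the normal equations (10.29) and the filtered SVD expression (10.31) is easy to check numerically; the following sketch uses an arbitrary random matrix as the operator A:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 20)) / 10.0
y = rng.standard_normal(30)
gamma = 1e-2

# normal equations (10.29): (gamma I + A^T A) x_gamma = A^T y
x_ne = np.linalg.solve(gamma * np.eye(20) + A.T @ A, A.T @ y)

# filtered SVD (10.31): R_gamma y = sum_j mu_j / (gamma + mu_j^2) (y, y_j) x_j
U, mu, Vt = np.linalg.svd(A, full_matrices=False)  # columns of U: y_j; rows of Vt: x_j
x_svd = Vt.T @ (mu / (gamma + mu**2) * (U.T @ y))

print(np.max(np.abs(x_ne - x_svd)))  # the two reconstructions coincide
```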
Let us make this observation more explicit. Let us consider the operator A given in the Fourier domain by multiplication by \langle\xi\rangle^{-\alpha} for some \alpha > 0. We verify that A^* = A and that R_\gamma is given in the Fourier domain by
R_\gamma = \mathcal F^{-1}_{\xi\to x} \, \frac{\langle\xi\rangle^{-\alpha}}{\langle\xi\rangle^{-2\alpha} + \gamma} \, \mathcal F_{x\to\xi}, \qquad \text{so that} \qquad \|R_\gamma\| \le \frac{1}{2\sqrt{\gamma}}.
Indeed, we check that x/(x^2 + \gamma) \le 1/(2\sqrt{\gamma}) and attains its maximum at x = \sqrt{\gamma}. We now verify that
I - R_\gamma A = \mathcal F^{-1}_{\xi\to x} \, \frac{\gamma}{\langle\xi\rangle^{-2\alpha} + \gamma} \, \mathcal F_{x\to\xi},
so that for a function f \in H^\beta(\mathbb R^n), we have
\|f - R_\gamma A f\| \le \sup_{\langle\xi\rangle \ge 1} \frac{\gamma \, \langle\xi\rangle^{-\beta}}{\langle\xi\rangle^{-2\alpha} + \gamma} \, \|f\|_{H^\beta(\mathbb R^n)}.
Moreover, the inequality is sharp in the sense that there exist functions f for which the reverse inequality holds (up to a multiplicative constant independent of \gamma; check this). For \beta > 2\alpha, the best estimate we can have for the above multiplier is that it is of order O(\gamma) (choose for instance \theta = 1 in Exercise 10.1.7).
Exercise 10.1.7 Using (10.12) show that
\frac{\gamma \, x^{2\theta}}{x^2 + \gamma} \le \gamma^\theta, \qquad 0 \le \theta \le 1.
Show that the above inequality is sharp.
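A quick numerical check of this inequality (a sketch only; the grid and the value of \gamma are arbitrary):

```python
import numpy as np

gamma = 1e-3
x = np.linspace(0.0, 10.0, 100001)
for theta in [0.0, 0.25, 0.5, 0.75, 1.0]:
    lhs = gamma * x**(2 * theta) / (x**2 + gamma)
    # the bound gamma^theta holds for all 0 <= theta <= 1
    print(theta, lhs.max(), gamma**theta)
    assert lhs.max() <= gamma**theta * (1 + 1e-12)
```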
Let Af = g be the problem we want to solve and g^\delta the measurements, so that \|Af - g^\delta\|_{L^2(\mathbb R^n)} \le \delta. Let us assume that f belongs to H^\beta(\mathbb R^n). We verify using (10.15) that the error of the regularized problem is given by
\|f - R_\gamma g^\delta\| \le \frac{\delta}{2\sqrt{\gamma}} + \gamma^{\frac{\beta}{2\alpha} \wedge 1} \, \|f\|_{H^\beta(\mathbb R^n)}.    (10.32)
Here, a \wedge b = \min(a, b). This implies that, for the optimal choice of \gamma,
\|f - R_\gamma g^\delta\| \le C \, \delta^{\frac{\beta}{\beta+\alpha} \wedge \frac{2}{3}} \, \|f\|_{H^\beta(\mathbb R^n)}^{\frac{\alpha}{\beta+\alpha} \vee \frac{1}{3}},    (10.33)
for a universal constant C. We therefore obtain that the Tikhonov regularization is optimal according to Theorem 10.1.3 when 0 < \beta \le 2\alpha. However, for all \beta > 2\alpha, the error between the Tikhonov regularized solution and the exact solution will be of order \delta^{2/3} instead of \delta^{\frac{\beta}{\beta+\alpha}}.
Exercise 10.1.8 More generally, consider an operator A with symbol a(\xi), i.e.,
A = \mathcal F^{-1}_{\xi\to x} \, a(\xi) \, \mathcal F_{x\to\xi},
such that 0 < a(\xi) \in \mathcal C^\infty(\mathbb R^n) and, for some \alpha > 0 and a_\infty \ne 0,
a(\xi) \sim a_\infty \langle\xi\rangle^{-\alpha} \quad \text{as } |\xi| \to \infty.    (10.34)
(i) Show that A^*, the adjoint of A for the L^2(\mathbb R^n) inner product, satisfies the same hypothesis (10.34).
(ii) Show that R_\gamma and S_\gamma = R_\gamma A - I are bounded operators with symbols given by
r_\gamma(\xi) = \big( |a(\xi)|^2 + \gamma \big)^{-1} \bar a(\xi), \qquad s_\gamma(\xi) = -\gamma \big( |a(\xi)|^2 + \gamma \big)^{-1},
respectively.
(iii) Assuming that f \in H^\beta(\mathbb R^n), show that (10.33) holds.

These results show that for the Radon transform, an a priori regularity of the function f(x) in H^1(\mathbb R^2) is sufficient to obtain an error of order \delta^{2/3}. When the function is smoother, a different technique from Tikhonov regularization is necessary to get a more accurate reconstruction.
Landweber iterations
The drawback of the Tikhonov regularization is that it requires inverting the regularized normal operator \gamma + A^* A. This inversion may be computationally expensive in practice. The Landweber iteration method is an iterative technique in which no inversion is necessary. It is defined to solve the equation Ax = y as follows:
x_0 = 0, \qquad x_{n+1} = (I - r A^* A) x_n + r A^* y, \quad n \ge 0,    (10.35)
for some r > 0. By induction, we verify that x_n = R_n y, where
R_n = r \sum_{k=0}^{n-1} (I - r A^* A)^k A^*, \qquad n \ge 1.    (10.36)
Consider a compact operator A with singular system (\mu_j, x_j, y_j). We thus verify that
R_n y = \sum_{j=1}^\infty \frac{1}{\mu_j} \Big( 1 - (1 - r \mu_j^2)^n \Big) (y, y_j) \, x_j.    (10.37)
Exercise 10.1.9 Check (10.37).
This implies that R_n is of the form R_\gamma in (10.20) with \gamma = n^{-1} and
q(\gamma,\mu) = 1 - (1 - r \mu^2)^{1/\gamma}.
Exercise 10.1.10 Show that the above filter verifies the hypotheses (10.19) and those of Theorem 10.1.5.
This implies that the Landweber iteration method is an optimal inversion method by Theorem 10.1.5.
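The identity between the iterates (10.35) and the spectral expression (10.37) can also be checked numerically; this sketch uses an arbitrary random matrix and step size:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((25, 15)) / 10.0
y = rng.standard_normal(25)
r = 0.5 / np.linalg.norm(A, 2) ** 2      # ensures 0 < r mu_j^2 < 1 for all j
n_it = 50

# Landweber iterations (10.35): x_{n+1} = (I - r A^T A) x_n + r A^T y
x = np.zeros(15)
for _ in range(n_it):
    x = x - r * (A.T @ (A @ x - y))

# spectral form (10.37)
U, mu, Vt = np.linalg.svd(A, full_matrices=False)
x_spec = Vt.T @ ((1.0 - (1.0 - r * mu**2) ** n_it) / mu * (U.T @ y))

print(np.max(np.abs(x - x_spec)))        # both expressions agree
```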
Exercise 10.1.11 Show that the hypotheses of Theorem 10.1.5 are met provided that the number of iterations n is chosen as
n = c \frac{E}{\delta}, \qquad n = c \Big( \frac{E}{\delta} \Big)^{2/3},
when \|x\|_1 \le E and \|x\|_2 \le E, respectively.
The above result shows that the number of iterations should be chosen carefully: when n is too small, then R_n A is a poor approximation of I, and when n is too large, then \|R_n\|_H is too large. Unlike the Tikhonov regularization, we can show that the Landweber iteration method is also optimal for stronger regularity assumptions on x than those given in Theorem 10.1.5 (see [40] for instance).
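This trade-off is easy to observe numerically ("semiconvergence"). In the following sketch (with an arbitrary diagonal operator and noise level), the reconstruction error first decreases with n and eventually increases again once the iterates start fitting the amplified noise:

```python
import numpy as np

rng = np.random.default_rng(3)
J = 100
mu = 1.0 / np.arange(1, J + 1)               # diagonal operator A = diag(mu)
x_true = mu * rng.standard_normal(J)         # target satisfying a source condition
delta = 1e-2
y_delta = mu * x_true + delta * rng.standard_normal(J) / np.sqrt(J)

r = 0.9 / mu.max() ** 2
x = np.zeros(J)
errs = []
for n in range(1, 20001):
    x = x - r * mu * (mu * x - y_delta)      # diagonal Landweber step
    if n % 100 == 0:
        errs.append(np.linalg.norm(x - x_true))

n_best = 100 * (np.argmin(errs) + 1)
print(n_best, min(errs), errs[0], errs[-1])  # error decreases, then grows again
```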
Let us come back to the operator A with symbol a(\xi) = \langle\xi\rangle^{-\alpha}. We verify that R_n and S_n = R_n A - I have respective symbols
r_n(\xi) = \frac{1 - (1 - r \langle\xi\rangle^{-2\alpha})^n}{\langle\xi\rangle^{-\alpha}}, \qquad s_n(\xi) = -(1 - r \langle\xi\rangle^{-2\alpha})^n.
Exercise 10.1.12 (i) Show that s_n(\xi) \langle\xi\rangle^{-\beta} is bounded by C n^{-\beta/(2\alpha)} for \langle\xi\rangle of order n^{1/(2\alpha)}. Deduce that for f \in H^\beta(\mathbb R^n), we have
\|S_n f\| \le C \, n^{-\frac{\beta}{2\alpha}} \, \|f\|_{H^\beta(\mathbb R^n)}.
(ii) Show that provided that n is chosen as
n = C \, \delta^{-\frac{2\alpha}{\beta+\alpha}} \, \|f\|_{H^\beta(\mathbb R^n)}^{\frac{2\alpha}{\beta+\alpha}},
we have the estimate
\|R_n\| \, \delta + \|S_n f\| \le C \, \delta^{\frac{\beta}{\beta+\alpha}} \, \|f\|_{H^\beta(\mathbb R^n)}^{\frac{\alpha}{\beta+\alpha}}.    (10.38)
(iii) Deduce that the Landweber iteration method is an optimal regularization technique for all \beta > 0.
(iv) Generalize the above results to the operators described in Exercise 10.1.8.
We have thus obtained the interesting result that unlike the Tikhonov regularization method described in (10.33), the Landweber iteration regularization can be made optimal (by choosing the appropriate number of iterations n) for all choices of the regularity in H^\beta(\mathbb R^n) of the object f.
Let us conclude this section with the following summarizing remark. The reason why regularization was necessary was that the user decided that noise was too amplified during the unregularized inversions. Smoothness priors were then considered to restore well-posedness. This corresponds to a choice of the regularity (the factor \beta above) and a bound E. In addition, we need to choose a regularization parameter, \gamma in the Tikhonov regularization algorithm and the stopping iteration n in the Landweber iteration. How these choices are made depends on the bound E but also on the estimated error \delta. Various techniques have been developed to choose \gamma or n a posteriori (Morozov principle, L-curve). All these techniques require that \delta be known. There is no free lunch. Regularization does require prior information about the solution to mitigate the perceived lack of information in the available data.
10.2 Sparsity and other Regularization Priors
The regularization methods considered in the preceding section have a huge advantage: they replace ill-posed linear systems of equations by well-posed, or at least better-posed (better-conditioned), linear systems. For instance, the inversion of A has been replaced by that of A^* A + \gamma I in the simplest version of Tikhonov regularization. Their main disadvantage is that they render the regularized solution typically smoother than the exact solution. Such a smoothing is unavoidable with such regularizations.
The reason why smooth objects are well reconstructed by the smoothing regularization method is that such objects can be represented by a small number of large coefficients (e.g., the first Fourier modes in a Fourier series expansion). It turns out that some objects are better represented in other bases. For instance, an image tends to have sharp discontinuities, for instance between a bright area and a dark area. Some bases, such as for instance those based on wavelets, will be much better than Fourier bases to represent this type of information.
A general framework to account for such prior information is to recast the inverse problem as seeking the minimizer of an appropriate functional. Solving the inverse problem then amounts to solving an optimization problem. Let us assume that we have a problem of the form
\mathcal M(u) = v,
and assume that v_d are given data. Let us assume that a functional u \mapsto \mathcal R(u) incorporates the prior information about u in the sense that \mathcal R(u) is small when u satisfies the constraints. Let us assume also that \rho(u, v) is a function that we wish to use to quantify the error between the available data v_d and the forward model \mathcal M(u). Then we want to minimize \rho(v, v_d) and at the same time minimize \mathcal R(u). Both constraints can be achieved by introducing a regularization parameter \alpha and minimizing the sum
\mathcal F_\alpha(u) = \rho\big( \mathcal M(u), v_d \big) + \alpha \, \mathcal R(u).    (10.39)
Solving the inverse problem consists of minimizing the above functional to get
u_\alpha = \operatorname{argmin} \, \mathcal F_\alpha(u).    (10.40)
These minimization problems often go by the name of Tikhonov regularization and may be seen as generalizations of (10.28).
The main objective of regularization (or sparsity) theory is then to devise functions \rho, \mathcal R and a regularization parameter \alpha that best fit our prior information about the problem of interest. Once such a problem has been formulated, it remains to devise a method to solve such an optimization problem numerically. There is a vast literature on the subject; for recent references see the two books [56, 57].
10.2.1 Smoothness Prior and Minimizations
The simplest class of regularizations consists of choosing \rho(u, v) = \frac12 \|D(u - v)\|_H^2 in the H = L^2 sense, for an operator D that may be the identity or an operator of differentiation if small errors on derivatives of the solution matter in practice, and choosing \mathcal R(u) also as a quadratic functional, for instance \mathcal R(u) = \frac12 \|Ru\|_H^2, again for an operator R that may be the identity or a differential operator. Then, associated to the linear problem Au = v, we have the minimization problem:
\mathcal F_\alpha(u) = \frac12 \|D(Au - v_d)\|_H^2 + \frac{\alpha}{2} \|Ru\|_H^2.    (10.41)
The main advantage of the above quadratic expression is that the Euler-Lagrange equations associated to the above minimization problem form the following linear system of equations
\big( (DA)^* (DA) + \alpha R^* R \big) u = (DA)^* D v_d.    (10.42)
When D = R = I, this is nothing but (10.29). Provided that R is an invertible matrix, the above problem can be solved for all \alpha > 0. We have seen in the preceding section how the method converges (at least when D = R = I) as the noise in the data and the regularization parameter tend to 0.
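A small numerical sketch of (10.42), with D the identity and R a first-order finite-difference matrix (the sizes, noise level, and \alpha below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
m, p = 40, 30
A = rng.standard_normal((m, p)) / 10.0
u_true = np.cumsum(rng.standard_normal(p)) / 10.0   # a smooth-ish unknown
v_d = A @ u_true + 1e-3 * rng.standard_normal(m)

D = np.eye(m)                        # identity weight on the data misfit
R = np.diff(np.eye(p), axis=0)       # first-order differences penalize rough u
alpha = 1e-3

# Euler-Lagrange system (10.42): ((DA)^T (DA) + alpha R^T R) u = (DA)^T D v_d
DA = D @ A
u = np.linalg.solve(DA.T @ DA + alpha * (R.T @ R), DA.T @ (D @ v_d))
print(np.linalg.norm(u - u_true) / np.linalg.norm(u_true))   # small relative error
```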
10.2.2 Sparsity Prior and Minimizations
Choosing \rho(u, v) = \frac12 \|D(u - v)\|_H^2 for the misfit to the data may appear relatively natural, as it corresponds to measuring noise in the H = L^2 sense. The quadratic functional \mathcal R(u) = \frac12 \|Ru\|_H^2 is, however, much less justified in many settings.
Sometimes, prior knowledge about the object we wish to reconstruct shows that the latter is sparse in a given representation (a given basis, say). Sparse means here that the object is represented by a small number of large coefficients. For instance, an audio signal may be represented by a finite number of frequencies. Images typically display sharp edges that can be represented in a more economical fashion than pixel by pixel values.
Let us assume that u is discrete and A a matrix to simplify the presentation. Let us also assume that Bu is sparse, where B is a known matrix. Sparsity will be encoded by the fact that the l^1 norm of Bu is small. Penalizing the residual and the l^1 norm yields the minimization of
\mathcal F_\alpha(u) = \|Bu\|_{l^1} + \frac{\alpha}{2} \|Au - v_d\|_{l^2}^2.    (10.43)
This and similar minimization problems have been applied very successfully for a large
class of imaging problems.
However, minimizing \mathcal F_\alpha above is computationally more intensive than solving (10.42). Several algorithms have been developed to solve such minimization problems. We present one strategy, called the split Bregman iteration, that is both efficient and relatively easy to explain. The main idea is that when B and A are the identity operators, then the above minimization can be performed for each component of u separately. In the general case, we introduce
d = Bu,
and replace the above minimization by
\min_{u,d} \ \|d\|_{l^1} + \frac{\alpha}{2} \|Au - v_d\|_{l^2}^2 + \frac{\lambda}{2} \|d - Bu\|_{l^2}^2.    (10.44)
Choosing \lambda sufficiently large provides a good approximation of the problem we wish to solve. Alternatively, we can solve a series of problems of the above form and show that we minimize (10.43) in the limit; we do not present the details here and refer the reader to [35].
Now the minimization of (10.44) can be performed iteratively by successively minimizing for u and for d. The minimization for u becomes a linear problem while the minimization for d can be performed for each coordinate independently (this is called soft shrinkage). The iterative algorithm then converges [35]. More precisely, the solution of
\min_u \ \frac{\alpha}{2} \|Au - v_d\|_{l^2}^2 + \frac{\lambda}{2} \|d - Bu\|_{l^2}^2,
is given by
(\alpha A^* A + \lambda B^* B) \, u = \alpha A^* v_d + \lambda B^* d.    (10.45)
This is a linear problem that admits a unique solution. Now the solution of
\min_d \ \|d\|_{l^1} + \frac{\lambda}{2} \|d - Bu\|_{l^2}^2 = \min_d \sum_{j=1}^J \Big( |d_j| + \frac{\lambda}{2} |d_j - (Bu)_j|^2 \Big) = \sum_{j=1}^J \min_{d_j} \Big( |d_j| + \frac{\lambda}{2} |d_j - (Bu)_j|^2 \Big).
Each element in the sum can then be minimized separately. We find that the solution of
\min_d \ |d| + \frac{\lambda}{2} |d - a|^2,    (10.46)
is given by the soft thresholding
d = \operatorname{sgn}(a) \, \max\Big( |a| - \frac{1}{\lambda}, \ 0 \Big).
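One can verify by brute force on a grid that the soft-thresholding formula indeed solves (10.46); in this sketch the values of \lambda and a are arbitrary:

```python
import numpy as np

lam = 2.0

def soft(a):
    # minimizer of |d| + lam/2 |d - a|^2: d = sgn(a) max(|a| - 1/lam, 0)
    return np.sign(a) * max(abs(a) - 1.0 / lam, 0.0)

d_grid = np.linspace(-5.0, 5.0, 200001)
for a in [-1.3, -0.2, 0.0, 0.4, 2.5]:
    obj = np.abs(d_grid) + lam / 2.0 * (d_grid - a) ** 2
    d_star = d_grid[np.argmin(obj)]    # grid minimizer of (10.46)
    print(a, soft(a), d_star)
    assert abs(soft(a) - d_star) < 1e-3
```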
We have presented this algorithm to show that replacing a smoothness regularization as in (10.41) by a sparsity regularization as in (10.43) increases the computational complexity of the reconstruction algorithm: instead of solving one linear system, we have to iteratively solve linear systems of the form (10.45) and soft thresholdings given by (10.46). When the sparsity assumptions are valid, however, these methods have been shown to be greatly beneficial in many practical settings of medical imaging; see [56, 57].
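A sketch of the full alternating scheme (10.45)-(10.46) on a synthetic problem. Everything here is an illustrative assumption: B = I (sparsity directly on u), a made-up sparse signal, and arbitrary parameter values; this is the quadratic-penalty form (10.44) solved by alternating minimization, not the complete split Bregman update of [35]:

```python
import numpy as np

rng = np.random.default_rng(5)
m, p = 200, 50
A = rng.standard_normal((m, p)) / np.sqrt(m)
u_true = np.zeros(p)
u_true[[5, 27, 43]] = [2.0, -1.5, 1.0]            # sparse synthetic signal
v_d = A @ u_true + 0.01 * rng.standard_normal(m)

B = np.eye(p)                                     # sparsity imposed directly on u
alpha, lam = 50.0, 5.0
Mmat = alpha * (A.T @ A) + lam * (B.T @ B)        # system matrix of (10.45)
d = np.zeros(p)
for _ in range(200):
    # u-step (10.45): (alpha A^T A + lam B^T B) u = alpha A^T v_d + lam B^T d
    u = np.linalg.solve(Mmat, alpha * A.T @ v_d + lam * B.T @ d)
    # d-step (10.46): componentwise soft thresholding at level 1/lam
    a = B @ u
    d = np.sign(a) * np.maximum(np.abs(a) - 1.0 / lam, 0.0)

print(np.round(u[[5, 27, 43]], 2))                # close to the true spike values
print(np.count_nonzero(d))                        # d is sparse
```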
10.3 Bayesian framework and regularization
The penalty regularization framework seen in the preceding two sections is very efficient when the data are sufficiently informative. When the data are very informative and noise relatively small, then no real regularization is necessary. When data are less informative but still quite informative, prior information becomes necessary, and smoothness and sparsity type priors allow us to still obtain very accurate reconstructions. When data are even less informative, for instance because noise is very large, or because the problem is severely ill-posed, then sparsity priors are typically no longer sufficient. What one typically obtains as a result is a function that resembles the minimum of the penalization term. In some cases, that may not be desirable. Also, additional prior information may occasionally be known, for instance that next to a black pixel, there is never a blue pixel. Such information is difficult to include in a penalization method.
A fairly versatile methodology to include various types of prior information is the Bayesian framework. To simplify the presentation slightly, let us assume that the problem of interest is
y = \mathcal M(x) + n,    (10.47)
where \mathcal M is the measurement operator, x the unknown set of coefficients, y the measurements, and n models additive noise in the data.
The main assumption of Bayesian inversions is that x belongs to a class of possible models X and that each x \in X is given an a priori probability of being the true coefficient. The associated probability (density) \pi(x) is called the prior distribution. A second ingredient in the Bayesian framework is the model for the noise n. We denote by \pi_n(n) the probability distribution of n.
Let us now define
\pi(y|x) = \frac{\pi(x, y)}{\pi(x)},
the conditional probability density of y knowing x, with \pi(x, y) the joint probability density of x and y. Note that \pi(x) = \int \pi(x, y) \, dy as a marginal density, so that the above conditional probability density is indeed a probability density (integrating to 1 in y). Note that knowledge of \pi(y|x) is equivalent to knowledge of \pi_n since
\pi(y|x) = \pi_n\big( y - \mathcal M(x) \big) \quad \text{for each fixed } x.
Bayes' rule then essentially states that
\pi(x|y) \, \pi(y) = \pi(y|x) \, \pi(x) = \pi(x, y).    (10.48)
In our inverse problem, where y is the measured data and x the unknown coefficients, this means
\pi(x|y) = \frac{1}{\pi(y)} \, \pi(y|x) \, \pi(x) \propto \pi(y|x) \, \pi(x),    (10.49)
where \propto means "proportional to", i.e., up to a normalizing constant (here 1/\pi(y)). In other words, if we know the prior density \pi(x) and the likelihood function \pi(y|x), then by Bayes' rule, we know \pi(x|y), which is the posterior probability (density).
Let us recapitulate the main ingredients of the Bayesian formalism. We assume the prior distribution \pi(x) known as an indication of our prior beliefs about the coefficients before data are acquired. We assume knowledge of the likelihood function \pi(y|x), which as we have seen is a statement about the noise model in the experiment. From these two prior assumptions, we use Bayes' rule to infer the posterior distribution \pi(x|y) for x knowing the data.
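For a discrete toy problem, these ingredients fit in a few lines; the prior, noise model, and datum below are made up for illustration:

```python
import numpy as np

prior = np.array([0.5, 0.3, 0.2])         # pi(x) for x in {0, 1, 2}
pi_n = {-1: 0.25, 0: 0.5, 1: 0.25}        # noise density pi_n, model y = x + n

y = 1                                     # observed datum
# likelihood pi(y|x) = pi_n(y - M(x)) with M(x) = x
likelihood = np.array([pi_n.get(y - x, 0.0) for x in (0, 1, 2)])
posterior = likelihood * prior            # Bayes' rule, up to normalization
posterior /= posterior.sum()              # the constant is 1/pi(y)
print(posterior)                          # -> [0.3846..., 0.4615..., 0.1538...]
```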
10.3.1 Penalization methods and Bayesian framework
Before going into the advantages and drawbacks of the method, we first show that penalization methods can be seen as an application of the Bayesian framework. Let \mathcal R(x) be a given functional and assume that the prior is given by the Gibbs distribution:
\pi(x) \propto e^{-\mathcal R(x)}.
Now assume that the likelihood function is of the form
\pi(y|x) \propto e^{-\rho(y - \mathcal M(x))},
where \rho is a distance function. Then by Bayes' rule, we find that
\pi(x|y) \propto e^{-\big( \rho(y - \mathcal M(x)) + \mathcal R(x) \big)}.
The Maximum A Posteriori (MAP) estimator x_{\rm MAP} is the parameter that maximizes the posterior distribution, or equivalently the minimizer of the functional
\mathcal F(x) = \rho\big( y - \mathcal M(x) \big) + \mathcal R(x).
Therefore, for appropriate choices of the prior and likelihood function, we retrieve the penalization methods seen in the preceding section.
Note that the minimization problem is solved by linear algebra when both \rho and \mathcal R are quadratic functionals. For instance, if \pi_n(n) \sim \mathcal N(0, \Sigma), a multivariate Gaussian with correlation matrix \Sigma, then we have \pi_n(n) \propto e^{-\frac12 n^t \Sigma^{-1} n}. Similarly, for \pi(x) \sim \mathcal N(0, \Gamma), we have \pi(x) \propto e^{-\frac12 x^t \Gamma^{-1} x}, so that we need to minimize
\mathcal F(x) = \frac12 (y - \mathcal M x)^t \, \Sigma^{-1} (y - \mathcal M x) + \frac12 x^t \, \Gamma^{-1} x.    (10.50)
If \mathcal M is a linear operator, then the solution to the minimization problem is, as we already saw, the solution of
\big( \mathcal M^* \Sigma^{-1} \mathcal M + \Gamma^{-1} \big) x = \mathcal M^* \Sigma^{-1} y.    (10.51)
The Bayesian framework can then be used to recover the Tikhonov regularization of linear equations. Moreover, it gives an explicit characterization of the correlation matrices \Sigma and \Gamma as the covariance matrices of the measurement noise and of the prior assumptions on the coefficients, respectively.
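The following sketch (with arbitrary dimensions and covariances) solves (10.51) and verifies that the result indeed minimizes the functional (10.50):

```python
import numpy as np

rng = np.random.default_rng(6)
m, p = 12, 8
M = rng.standard_normal((m, p))
Sigma = 0.1 * np.eye(m)                  # noise covariance
Gamma = np.eye(p)                        # prior covariance
x_true = rng.standard_normal(p)
y = M @ x_true + rng.multivariate_normal(np.zeros(m), Sigma)

Si, Gi = np.linalg.inv(Sigma), np.linalg.inv(Gamma)
# MAP estimate from (10.51)
x_map = np.linalg.solve(M.T @ Si @ M + Gi, M.T @ Si @ y)

def F(x):
    # the functional (10.50)
    r = y - M @ x
    return 0.5 * r @ Si @ r + 0.5 * x @ Gi @ x

# x_map is the global minimizer of the strictly convex quadratic F
assert all(F(x_map) <= F(x_map + 0.1 * rng.standard_normal(p)) for _ in range(100))
print(F(x_map))
```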
Note that the l^1 minimization corresponds to a choice \mathcal R(x) = \alpha \sum_i |x_i|. This corresponds to assuming that the pixel values are independent and identically distributed random variables with a Laplace distribution. We thus also recover the sparsity regularizations using the Bayesian framework. If we expect nearby pixels to be correlated, then more complex prior models or functionals \mathcal R(x) need to be constructed. This is a very active area of research. Although the derivation of the "best" functional is often more an art than grounded in first principles, the Bayesian framework sometimes allows for very pleasing reconstructions (we recall that the "ill-posedness" of an inverse problem is a subjective notion).
10.3.2 Computational and psychological costs of the Bayesian
framework
We have seen that the Bayesian framework reduces to an optimization problem when the Maximum A Posteriori (MAP) estimator x_{\rm MAP} is what we are looking for. The Bayesian framework allows one to obtain much more information, at least in theory, since the output of the procedure is the full posterior distribution \pi(x|y) and not only its argmax.
In practice, however, we are faced with daunting tasks: first of all, how do we sample what is often a very high dimensional distribution \pi(x|y)? And second of all, even if sampling is possible, how does one represent such a huge object practically? These two questions severely limit the applicability of Bayesian frameworks in practice.
A third, and in some sense more fundamental and structural, question pertains to the choice of the prior \pi(x). Where should that prior information come from? There is no good answer to this fundamental yet "ill-formulated" question. In some sense, we have already partially answered it: since the user decided that adding no prior information was not working, in the sense that noise had too large an effect on the reconstruction, then the user has to come up with another model. There is no such thing as a non-informative prior, since the user decided that a prior was necessary. (If data alone were sufficient to obtain good reconstructions, then the Bayesian framework would not be necessary.) If data are not sufficient, then the Bayesian framework provides a very versatile framework for the user to provide information about the problem that helps to compensate for what is not present in the data. Some researchers will not be satisfied with this way of addressing the inverse problem and the notion of "compensating" for the lack of data. This is a perfectly reasonable position. However, the Bayesian framework at least has this very appealing feature: it provides a logical mapping from the prior information, namely the prior distribution and the likelihood function, to the outcome of the procedure, namely the posterior distribution. If nothing else, it can therefore serve as a very valuable tool to guide intuition and to search for what types of prior information are necessary for a given set of constraints on the posterior distribution. Moreover, nothing prevents us from estimating how the posterior distribution is sensitive to variations in the prior information. This strategy, sometimes referred to as Robust Bayesian analysis, allows one to understand which features of the reconstructed parameters strongly or weakly depend on the choices of the prior density and likelihood functions.
Now that the psychological cost of the Bayesian framework has been taken into account, let us come back to its computational cost, which still poses enormous challenges. Let us first address the representation of the posterior distribution. Typically, moments of the posterior distribution are what we are interested in. For instance, one may be interested in the first moment (a vector) and the variance (a matrix)
x_m = \int x \, \pi(x|y) \, dx, \qquad \Gamma_x = \int x \otimes x \, \pi(x|y) \, dx - x_m \otimes x_m.    (10.52)
Of interest are also various quantiles of the posterior distribution, for instance the probability that x_j be larger than a number \lambda: \int \pi(x|y) \, \chi(x_j > \lambda) \, dx, where \chi(X) is the indicator function of the set X, equal to 1 on X and to 0 otherwise.
For each of these moments of the posterior distribution, we need to be able to sample \pi(x|y) \, dx. In a few cases, the sampling of \pi(x|y) \, dx may be straightforward, for instance when \pi(y|x) has a Gaussian structure. In most cases, however, sampling is a difficult exercise. The most versatile method to perform such a sampling is arguably the Markov Chain Monte Carlo (MCMC) method. The objective of MCMC samplers is to generate a Markov chain X_i for i \in \mathbb N whose invariant distribution (the distribution at convergence, when the algorithm converges) is the posterior distribution. There are two main MCMC samplers, the Gibbs sampler and the Metropolis-Hastings sampler. The latter is defined as follows. We assume that we want to sample a distribution \pi(x|y).
Let q(x, x') be a given, positive, transition density from the vector x to the vector x' (it thus integrates to 1 over all possible vectors x' for each x). Let us then define the acceptance probability
\alpha(x, x') := \min\Big\{ \frac{q(x', x) \, \pi(x'|y)}{q(x, x') \, \pi(x|y)}, \ 1 \Big\}.    (10.53)
Note that the above quantity, which is all we need about \pi(x|y) in the Metropolis-Hastings sampler, depends only on the ratio \pi(x'|y)/\pi(x|y) and thus is independent of the normalizing constant of \pi(x|y), which is typically not known in the Bayesian framework, and whose estimation is typically expensive computationally.
Let X_i be the current state of the Markov chain. Let \tilde X_{i+1} be drawn from the transition kernel q(X_i, x'). Then with probability \alpha(X_i, \tilde X_{i+1}), we accept the transition and set X_{i+1} = \tilde X_{i+1}, while with probability 1 - \alpha(X_i, \tilde X_{i+1}), we reject the transition and set X_{i+1} = X_i.
The transition probability of the chain from x to x' is thus p(x, x') = \alpha(x, x') \, q(x, x'), while the probability to stay put at x is 1 - \int p(x, x') \, dx'. The construction is such that \pi(x|y) \, p(x, x') = p(x', x) \, \pi(x'|y), which means that \pi(x|y) \, dx is indeed the invariant distribution of the Markov chain. In practice, we want independent samples of \pi(x|y) so that the following Monte Carlo integration follows from an application of the law of large numbers (ergodicity), for instance:
\int f(x) \, \pi(x|y) \, dx \approx \frac{1}{|I|} \sum_{i \in I} f(X_i),    (10.54)
for any reasonable (continuous) functional f. Such a rule is accurate if the X_i are sampled according to \pi(x|y) \, dx and are sufficiently independent. This is for instance achieved by choosing I = \{ 1 \le i \le i_{\max}, \ i = Nj, \ j \in \mathbb N \}. For instance, we can take i_{\max} = 10^7 and N = 1000 so that I is composed of |I| = 10^4 points chosen every 1000 points in the Metropolis-Hastings Markov chain. For an accuracy equal to 1/\sqrt{|I|} = 0.01 (as an application of the central limit theorem to estimate the error in (10.54)), we thus need 10^7 evaluations of \pi(x|y). Using Bayes' rule, this is proportional to \pi(x) \, \pi(y|x), where the latter likelihood function requires that we solve a forward problem (for a given x drawn from the prior \pi(x)) to estimate the law of the data y. In other words, the construction of the above statistical moment with an accuracy of order 10^{-2} requires that we solve 10^7 forward problems. In many practical situations, this is an insurmountable computational cost. Moreover, this assumes that the transition q(x, x') has been chosen in such a way that every 1000 samples X_i are indeed sufficiently independent. This is very difficult to achieve in practice and is typically obtained by experienced users rather than from sound, physics- or mathematics-based principles.
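A minimal Metropolis-Hastings sampler in one dimension (a sketch: the target is a hand-picked unnormalized Gaussian density, the random-walk proposal is symmetric so that q cancels in (10.53), and the subsampling parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)

def log_target(x):
    # unnormalized log-posterior: N(2, 0.5^2); only ratios enter (10.53)
    return -0.5 * ((x - 2.0) / 0.5) ** 2

x = 0.0
kept = []
for i in range(200_000):
    x_prop = x + 0.5 * rng.standard_normal()    # symmetric proposal q(x, x')
    # accept with probability min{pi(x'|y)/pi(x|y), 1}
    if np.log(rng.random()) < log_target(x_prop) - log_target(x):
        x = x_prop
    if i % 20 == 0:
        kept.append(x)                          # keep every 20th state

samples = np.array(kept[500:])                  # discard burn-in
print(samples.mean(), samples.std())            # approximately 2.0 and 0.5
```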
Note that in practice, it has been observed that I in (10.54) should be the set of all runs 1 \le i \le i_{\max}. In other words, there is no gain in throwing away points X_i in the evaluation of the integrals. However, the above heuristics are correct: the error in the approximation (10.54) is indeed inversely proportional to the square root of the number of independent components among the X_i, and not to \sqrt{i_{\max}}.
Many methodologies have been developed to improve the efficiency of MCMC algorithms. It is however fair to say that even with today's computational capabilities, many problems of interest are totally out of reach using the standard Bayesian framework. That said, it is still a very versatile methodology that goes a long way toward addressing the main problem of this chapter, which we recall is: the user decided that adding no prior information was not working and thus something had to be done.
General references on the use of the Bayesian framework in inverse problems include [38, 61].
Chapter 11
Geometric Priors and
Parameterizations
In this chapter, we consider a few other methods to include prior information one might possess about the unknown objects of interest. The Bayesian framework introduced in the preceding chapter is quite versatile, but it is computationally extremely expensive because deterministic reconstructions are replaced by a probability measure on possible reconstructions, which is often daunting to sample and visualize from a numerical point of view.
Other physically motivated methods have been developed to incorporate prior information. In one such method, we aim to reconstruct the support of an inclusion rather than its full description and assume that such an inclusion is embedded in a known medium. Several techniques have been developed to do this and we focus here on a method called the factorization method. The factorization method is a functional analytical tool that allows us to separate the influence of the unknown inclusion from that of the known background. It is considered in a simple setting in section 11.1.
In another method, we give up the hope to perform global reconstructions and replace the unknown object by a low-dimensional parameterization. Any finite dimensional problem that is injective is necessarily well-posed, essentially as an application of the Fredholm alternative. One such quite useful parameterization consists of assuming that the inclusion has small volume. We then perform asymptotic expansions in its volume to understand the leading influence of such an inclusion on available measurements. The reconstruction then focuses on the first coefficients appearing in the asymptotic expansion, assuming the surrounding background to be known. The method is presented in a very simple setting in section 11.2.
11.1 Reconstructing the domain of inclusions
The reconstruction of physical parameters in an elliptic equation from boundary measurements, such as the Neumann-to-Dirichlet map, is a severely ill-posed problem. One should therefore not expect to stably reconstruct more than a few coefficients modeling the physical parameters, such as for instance the first Fourier modes in a Fourier series expansion, as we saw in the smoothness penalization considered in Chapter 10.
In certain applications, knowing the first few coefficients in a Fourier series expansion is not what one is interested in. In this chapter, we assume that the physical parameters are given by a background, which is known, and an inclusion, from which we only know that it differs from the background. Moreover, we are not so much interested in the detailed structure of the inclusion as in its location. We thus wish to reconstruct an interface separating the background from the inclusion.
To reconstruct this interface, we use the method of factorization. The method provides a constructive way to obtain the support of the inclusion from the Neumann-to-Dirichlet (NtD) boundary measurements. Notice that the NtD measurements allow us a priori to reconstruct much more than the support of the inclusion. However, because we restrict ourselves to this specific reconstruction, we can expect to obtain more accurate results on the location of the inclusion than by directly reconstructing the physical parameters on the whole domain.
11.1.1 Forward Problem
We consider here the problem of impedance tomography. The theory generalizes to a certain extent to problems in optical tomography.
Let \sigma(x) be a conductivity tensor in an open bounded domain X \subset \mathbb R^n with Lipschitz boundary \partial X. We define \Sigma, a smooth surface in X, as the boundary of the inclusion. We denote by D the simply connected bounded open domain such that \Sigma = \partial D. This means that D is the domain "inside" the surface \Sigma. We also define D^c = X \backslash \bar D, with boundary \partial D^c = \partial X \cup \Sigma. We assume that \sigma(x) is equal to a smooth known background, \sigma(x) = \sigma_0(x), on D^c, and that \sigma and \sigma_0 are smooth but different on D. For \sigma_0 a smooth known tensor on the full domain X, this means that \sigma jumps across \Sigma, so that \Sigma is the surface of discontinuity of \sigma. More precisely, we assume that the n \times n symmetric tensor \sigma_0(x) is of class C^2(\bar X) and positive definite, such that \xi_i \xi_j \sigma_{0ij}(x) \ge \alpha_0 > 0 uniformly in x \in X and in \xi = \{\xi_i\}_{i=1}^n \in S^{n-1}, the unit sphere in \mathbb R^n. Similarly, the n \times n symmetric tensor \sigma(x) is of class C^2(\bar D) \cup C^2(\bar D^c) (in the sense that \sigma(x)|_D can be extended as a function of class C^2(\bar D), and similarly for \sigma(x)|_{D^c}) and positive definite, such that \xi_i \xi_j \sigma_{ij}(x) \ge \alpha_0 > 0 uniformly in x \in X and in \xi \in S^{n-1}.
The equation for the electric potential u(x) is given by
\nabla\cdot\sigma\nabla u = 0 \ \text{ in } X, \qquad \nu\cdot\sigma\nabla u = g \ \text{ on } \partial X, \qquad \int_{\partial X} u \, d\sigma = 0.    (11.1)
Here, \nu(x) is the outward unit normal to \partial X at x \in \partial X. We also denote by \nu(x) the outward unit normal to D at x \in \Sigma. Finally, g(x) is a mean-zero current, i.e., \int_{\partial X} g \, d\sigma = 0, imposed at the boundary of the domain.
The above problem admits a unique solution H
1
0
(X), the space of functions in u
H
1
(X) such that
_
X
ud = 0. This results from the variational formulation of the
above equation
b(u, )
_
X
u dx =
_
X
gd(x) l(), (11.2)
holding for any test function $\phi\in H^1_0(X)$. Indeed, from a Poincaré-like inequality, we deduce that $b(u,v)$ is a coercive and bounded bilinear form on $H^1_0(X)$ and the existence result follows from the Lax-Milgram theorem. Classical trace estimates show that $u|_{\partial X}\in H^{1/2}_0(\partial X)$, the space of functions $v\in H^{1/2}(\partial X)$ such that $\int_{\partial X} v\,d\mu=0$.
We define the Neumann-to-Dirichlet operator $\Lambda_\Sigma$, depending on the location of the discontinuity $\Sigma$, as
$$\Lambda_\Sigma: H^{-1/2}_0(\partial X)\to H^{1/2}_0(\partial X), \qquad g\mapsto u|_{\partial X}, \tag{11.3}$$
where $u(x)$ is the solution to (11.1) with boundary normal current $g(x)$. Similarly, we introduce the background Neumann-to-Dirichlet operator $\Lambda_0$ defined as above with $\sigma$ replaced by the known background $\sigma_0$. To model that the inclusion has a different conductivity from the background, we assume that $\sigma$ satisfies either one of the following hypotheses:
$$\sigma(x)-\sigma_0(x)\geq\alpha_1>0 \ \text{ on } D, \qquad \sigma_0(x)=\sigma(x) \ \text{ on } D^c, \tag{11.4}$$
$$\sigma_0(x)-\sigma(x)\geq\alpha_1>0 \ \text{ on } D, \qquad \sigma_0(x)=\sigma(x) \ \text{ on } D^c, \tag{11.5}$$
for some constant positive definite tensor $\alpha_1$. The tensor inequality $\sigma_1\geq\sigma_2$ is meant in the sense that $\xi_i\xi_j(\sigma_{1,ij}-\sigma_{2,ij})\geq 0$ for all $\xi\in\mathbb{R}^n$.
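In a discretized problem, the tensor inequalities (11.4)-(11.5) can be checked pointwise by verifying that the symmetric difference of the two conductivity tensors is positive (semi)definite. A minimal sketch (the sample $2\times 2$ tensors are hypothetical):

```python
import numpy as np

def tensor_geq(s1, s2, tol=0.0):
    """Check the tensor inequality s1 >= s2, i.e. xi.(s1 - s2).xi >= 0 for
    all directions xi, by testing that the smallest eigenvalue of the
    symmetrized difference is nonnegative (up to a tolerance)."""
    d = np.asarray(s1, float) - np.asarray(s2, float)
    d = 0.5 * (d + d.T)               # symmetrize before taking the spectrum
    return np.linalg.eigvalsh(d).min() >= -tol

# hypothetical background and inclusion conductivity tensors
s0 = np.array([[1.0, 0.2], [0.2, 1.0]])
s  = np.array([[2.0, 0.2], [0.2, 3.0]])
print(tensor_geq(s, s0))   # prints True: the inclusion dominates the background
```

The same routine applied with the arguments swapped decides between hypotheses (11.4) and (11.5).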
11.1.2 Factorization method
The purpose of the factorization method is to show that
$$\Lambda_0-\Lambda_\Sigma = L^* F L, \tag{11.6}$$
where $L$ and $L^*$ are operators in duality that depend only on $\sigma|_{D^c}=\sigma_0|_{D^c}$, and $F$ or $-F$ is an operator that generates a coercive form on $H^{1/2}_0(\Sigma)$ when (11.4) or (11.5) is satisfied, respectively. The operators are constructed as follows. Let $v$ and $w$ be the solutions of
$$\begin{array}{ll} \nabla\cdot\sigma\nabla v = 0 \ \text{ in } D^c, & \nabla\cdot\sigma\nabla w = 0 \ \text{ in } D^c,\\ \nu\cdot\sigma\nabla v = \phi \ \text{ on } \partial X, & \nu\cdot\sigma\nabla w = 0 \ \text{ on } \partial X,\\ \nu\cdot\sigma\nabla v = 0 \ \text{ on } \Sigma, & \nu\cdot\sigma\nabla w = -\phi \ \text{ on } \Sigma,\\ \int_\Sigma v\,d\mu = 0, & \int_{\partial X} w\,d\mu = 0. \end{array} \tag{11.7}$$
These equations are well-posed in the sense that they admit solutions in $H^1(D^c)$ with traces in $H^{1/2}(\Sigma)$ and in $H^{1/2}(\partial X)$ at the boundary of $D^c$. We then define the operator $L$, which maps $\phi\in H^{-1/2}_0(\partial X)$ to $v|_\Sigma\in H^{1/2}_0(\Sigma)$, where $v$ is the unique solution to the left equation in (11.7), and the operator $L^*$, which maps $\phi\in H^{-1/2}_0(\Sigma)$ to $w|_{\partial X}\in H^{1/2}_0(\partial X)$, where $w$ is the unique solution to the right equation in (11.7). We verify that both operators are in duality in the sense that
$$(L\phi,\psi)_\Sigma \equiv \int_\Sigma \psi\, L\phi\, d\mu = \int_{\partial X} \phi\, L^*\psi\, d\mu \equiv (\phi, L^*\psi)_{\partial X}.$$
Let us now define two operators $G_\Sigma$ and $G^*_\Sigma$ as follows. For any quantity $f$ defined on $D\cup D^c$, we denote by $f^+(x)$ for $x\in\Sigma$ the limit of $f(y)$ as $y\to x$ and $y\in D^c$, and by $f^-(x)$ the limit of $f(y)$ as $y\to x$ and $y\in D$. Let $v$ and $w$ be the unique solutions to the following problems:
$$\begin{array}{ll} \nabla\cdot\sigma\nabla v = 0 \ \text{ in } X\backslash\Sigma, & \nabla\cdot\sigma\nabla w = 0 \ \text{ in } X\backslash\Sigma,\\ {[v]} = 0 \ \text{ on } \Sigma, & {[w]} = \phi \ \text{ on } \Sigma,\\ {[\nu\cdot\sigma\nabla v]} = 0 \ \text{ on } \Sigma, & {[\nu\cdot\sigma\nabla w]} = 0 \ \text{ on } \Sigma,\\ \nu\cdot\sigma\nabla v = g \ \text{ on } \partial X, & \nu\cdot\sigma\nabla w = 0 \ \text{ on } \partial X,\\ \int_\Sigma v\,d\mu = 0, & \int_{\partial X} w\,d\mu = 0. \end{array} \tag{11.8}$$
We define $G_\Sigma$ as the operator mapping $g\in H^{-1/2}_0(\partial X)$ to $G_\Sigma g = \nu\cdot\sigma\nabla v^+|_\Sigma \in H^{-1/2}_0(\Sigma)$ and the operator $G^*_\Sigma$ as the operator mapping $\phi\in H^{1/2}_0(\Sigma)$ to $G^*_\Sigma\phi = w|_{\partial X}\in H^{1/2}_0(\partial X)$, where $v$ and $w$ are the unique solutions to the above equations (11.8).
Except for the normalization $\int_\Sigma v\,d\mu = 0$, the equation for $v$ is the same as (11.1) and thus admits a unique solution in $H^1(X)$, say. Moreover, integrations by parts on $D^c$ imply that
$$\int_\Sigma \nu\cdot\sigma\nabla v^+\,d\mu = \int_{\partial X} g\,d\mu = 0.$$
This justifies the well-posedness of the operator $G_\Sigma$ as it is described above. The operator $G^*_\Sigma$ is more delicate. We first obtain that for any smooth test function $\phi$,
$$\int_{D^c}\sigma\nabla w\cdot\nabla\phi\,dx + \int_\Sigma \nu\cdot\sigma\nabla w\,\phi^+\,d\mu = 0, \qquad \int_{D}\sigma\nabla w\cdot\nabla\phi\,dx - \int_\Sigma \nu\cdot\sigma\nabla w\,\phi^-\,d\mu = 0,$$
so that
$$\int_X \sigma\nabla w\cdot\nabla\phi\,dx = -\int_\Sigma (\nu\cdot\sigma\nabla w)\,[\phi]\,d\mu. \tag{11.9}$$
It turns out that $\|\nu\cdot\sigma\nabla w\|_{H^{-1/2}_0(\Sigma)}$ is bounded by the norm of $w$ in $H(\mathrm{div},X)$ (see [34]). This and a Poincaré-type inequality show that the above right-hand side with $\phi=w$ is bounded by $C\|\phi\|^2_{H^{1/2}_0(\Sigma)}$. Existence and uniqueness of the solution $w\in H^1(D)\oplus H^1(D^c)$ to (11.8) is then ensured by an application of the Lax-Milgram theorem. This also shows that the operator $G^*_\Sigma$ as defined above is well-posed.
Integrations by parts in the equation for $v$ in (11.8) against a test function $\phi$ yield
$$\int_{D}\sigma\nabla v\cdot\nabla\phi\,dx - \int_\Sigma \nu\cdot\sigma\nabla v\,\phi^-\,d\mu = 0, \qquad \int_{D^c}\sigma\nabla v\cdot\nabla\phi\,dx + \int_\Sigma \nu\cdot\sigma\nabla v\,\phi^+\,d\mu = \int_{\partial X} g\phi\,d\mu, \tag{11.10}$$
from which we deduce that
$$\int_X \sigma\nabla v\cdot\nabla\phi\,dx = \int_{\partial X} g\phi\,d\mu - \int_\Sigma (G_\Sigma g)\,[\phi]\,d\mu. \tag{11.11}$$
That $G_\Sigma$ and $G^*_\Sigma$ are in duality in the sense that
$$\int_\Sigma (G_\Sigma g)\,\phi\,d\mu = \int_{\partial X} g\, G^*_\Sigma\phi\,d\mu \tag{11.12}$$
follows from (11.11) with $\phi=w$ and (11.9) with $\phi=v$ since $[v]=0$.
We finally define $F_\Sigma$ as the operator mapping $\phi\in H^{1/2}_0(\Sigma)$ to $F_\Sigma\phi = -\nu\cdot\sigma\nabla w|_\Sigma \in H^{-1/2}_0(\Sigma)$, where $w$ is the solution to (11.8). Based on the above results, this is a well-posed operator. Moreover, we deduce from (11.9) that
$$(F_\Sigma[w],[\phi])_\Sigma = \int_X \sigma\nabla w\cdot\nabla\phi\,dx = ([w], F_\Sigma[\phi])_\Sigma, \tag{11.13}$$
so that $F_\Sigma = F^*_\Sigma$. Upon choosing $[\phi]=[w]$, we find that $F_\Sigma$ is coercive on $H^{1/2}_0(\Sigma)$. This implies among other things that $F_\Sigma$ is injective.
We now notice that
$$G^*_\Sigma = L^* F_\Sigma.$$
This follows from the uniqueness of the solution to the elliptic problem on $D^c$ with conditions defined on $\partial D^c = \partial X\cup\Sigma$. By duality, this also implies that $G_\Sigma = F^*_\Sigma L$. The operators $G_0$ and $F_0$ are defined similarly except that $\sigma$ is replaced by $\sigma_0$ in (11.8). Let us finally define the operator $M$, which maps $g\in H^{-1/2}_0(\partial X)$ to $u|_{\partial X}\in H^{1/2}_0(\partial X)$, where $u$ is the solution to
$$\nabla\cdot\sigma\nabla u = 0 \ \text{ in } D^c, \qquad \nu\cdot\sigma\nabla u = 0 \ \text{ on } \Sigma, \qquad \nu\cdot\sigma\nabla u = g \ \text{ on } \partial X, \qquad \int_{\partial X} u\,d\mu = 0. \tag{11.14}$$
Except for the normalization, the problem defining $M$ is the same as the problem defining $L$ in (11.7) (the two solutions differ by a constant), so that $M$ is also well-posed. We now verify from the linearity of the elliptic problems that
$$\Lambda_\Sigma = M - L^* G_\Sigma = M - L^* F_\Sigma L, \qquad \Lambda_0 = M - L^* G_0 = M - L^* F_0 L. \tag{11.15}$$
We thus deduce the main factorization result of this section, namely that
$$\Lambda_0 - \Lambda_\Sigma = L^* F L, \qquad F = F_\Sigma - F_0. \tag{11.16}$$
The above result would not be very useful if $F$ did not have specific properties. We now show that $F$ or $-F$ generates a coercive form on $H^{1/2}_0(\Sigma)$ and may be written as $B^*B$ for some surjective operator $B^*$. Note that $F^* = F$ since both $F_\Sigma$ and $F_0$ are self-adjoint.
We denote by $w_\Sigma$ the solution $w$ to (11.8) and by $w_0$ the solution to the same equation with $\sigma$ replaced by $\sigma_0$. Upon multiplying the equation for $w_\Sigma$ by $w_0$ and subtracting the equation for $w_0$ multiplied by $w_\Sigma$, we obtain, since $\sigma=\sigma_0$ on $D^c$, that
$$\int_D (\sigma-\sigma_0)\nabla w_0\cdot\nabla w_\Sigma\,dx = \int_\Sigma \big(\nu\cdot\sigma\nabla w_\Sigma\, w_0^- - \nu\cdot\sigma_0\nabla w_0\, w_\Sigma^-\big)\,d\mu,$$
$$0 = \int_\Sigma \big(\nu\cdot\sigma\nabla w_\Sigma\, w_0^+ - \nu\cdot\sigma_0\nabla w_0\, w_\Sigma^+\big)\,d\mu.$$
Notice that both $w_0$ and $w_\Sigma$ jump across $\Sigma$ but that the fluxes $\nu\cdot\sigma\nabla w_\Sigma$ and $\nu\cdot\sigma_0\nabla w_0$ do not. This yields that
$$\int_D (\sigma-\sigma_0)\nabla w_0\cdot\nabla w_\Sigma\,dx = \int_\Sigma (F_\Sigma-F_0)\phi\,\phi\,d\mu = \int_\Sigma F\phi\,\phi\,d\mu. \tag{11.17}$$
Let us now introduce $\tilde w = w_0 - w_\Sigma$. Upon multiplying $\nabla\cdot\sigma_0\nabla\tilde w + \nabla\cdot(\sigma_0-\sigma)\nabla w_\Sigma = 0$ by $\tilde w$ and integrating by parts on $D^c$ and $D$, we deduce that
$$\int_X \sigma_0\nabla\tilde w\cdot\nabla\tilde w\,dx + \int_D (\sigma-\sigma_0)\nabla w_\Sigma\cdot\nabla w_\Sigma\,dx = \int_D (\sigma-\sigma_0)\nabla w_0\cdot\nabla w_\Sigma\,dx.$$
By exchanging the roles of the indices $\Sigma$ and $0$ we also obtain
$$\int_X \sigma\nabla\tilde w\cdot\nabla\tilde w\,dx + \int_D (\sigma_0-\sigma)\nabla w_0\cdot\nabla w_0\,dx = \int_D (\sigma_0-\sigma)\nabla w_0\cdot\nabla w_\Sigma\,dx.$$
Combining these results with (11.17), we deduce that
$$\int_\Sigma F\phi\,\phi\,d\mu = \int_X \sigma_0\nabla\tilde w\cdot\nabla\tilde w\,dx + \int_D (\sigma-\sigma_0)\nabla w_\Sigma\cdot\nabla w_\Sigma\,dx,$$
$$-\int_\Sigma F\phi\,\phi\,d\mu = \int_X \sigma\nabla\tilde w\cdot\nabla\tilde w\,dx + \int_D (\sigma_0-\sigma)\nabla w_0\cdot\nabla w_0\,dx. \tag{11.18}$$
Let us assume that (11.4) holds. Then $F$ generates a coercive form on $H^{1/2}_0(\Sigma)$. Indeed, let us assume that the right-hand side of the first equality above is bounded. Then by a Poincaré-type inequality, we have $\tilde w\in H^1(X)$ and $w_\Sigma|_D\in H^1(D)$ thanks to (11.4). This implies that $(\nu\cdot\sigma\nabla w_\Sigma)|_\Sigma\in H^{-1/2}(\Sigma)$ and thus, based on (11.8), that $w_\Sigma|_{D^c}\in H^1(D^c)$. This in turn implies that both $w_\Sigma^+$ and $w_\Sigma^-$ belong to $H^{1/2}(\Sigma)$, so that their difference $\phi\in H^{1/2}_0(\Sigma)$. Thus, we have shown the existence of a positive constant $C$ such that
$$\|\phi\|_{H^{1/2}_0(\Sigma)} \leq C\,(F\phi,\phi)^{1/2}_\Sigma. \tag{11.19}$$
Exchanging the indices $\Sigma$ and $0$ also yields the existence of a constant $C$ under hypothesis (11.5) such that
$$\|\phi\|_{H^{1/2}_0(\Sigma)} \leq C\,(-F\phi,\phi)^{1/2}_\Sigma. \tag{11.20}$$
In what follows, we assume that (11.4) and (11.19) hold to fix notation. The final results are not modified when (11.5) and (11.20) hold instead.
The operator $F$ is defined from $H^{1/2}_0(\Sigma)$ to $H^{-1/2}_0(\Sigma)$, two spaces that have not been identified yet. So writing $F=B^*B$ requires a little bit of work. Let $\mathcal{I}$ be the canonical isomorphism between $H^{-1/2}_0(\Sigma)$ and $H^{1/2}_0(\Sigma)$. Since it is positive definite, we can decompose it as
$$\mathcal{I} = \mathcal{I}^{\frac12*}\mathcal{I}^{\frac12}, \qquad \mathcal{I}^{\frac12}: H^{-1/2}_0(\Sigma)\to L^2_0(\Sigma), \qquad \mathcal{I}^{\frac12*}: L^2_0(\Sigma)\to H^{1/2}_0(\Sigma).$$
Both $\mathcal{I}^{\frac12}$ and $\mathcal{I}^{\frac12*}$ are isometries as defined above. We can thus recast the coercivity of $F$ as
$$(F\phi,\phi) = (F\mathcal{I}^{\frac12*}u,\mathcal{I}^{\frac12*}u) = (\mathcal{I}^{\frac12}F\mathcal{I}^{\frac12*}u,u) \geq \alpha\|\phi\|^2_{H^{1/2}_0(\Sigma)} = \alpha\|u\|^2_{L^2_0(\Sigma)}.$$
So $\mathcal{I}^{\frac12}F\mathcal{I}^{\frac12*}$, as a self-adjoint positive definite operator on $L^2_0(\Sigma)$, can be written as $C^*C$, where $C$ and $C^*$ are bounded operators from $L^2_0(\Sigma)$ to $L^2_0(\Sigma)$. Since
$$\|Cu\|^2_{L^2_0(\Sigma)} \geq \alpha\|u\|^2_{L^2_0(\Sigma)},$$
we deduce that $C^*$ is surjective. We thus obtain that $F=B^*B$, where $B=C(\mathcal{I}^{\frac12*})^{-1}$ maps $H^{1/2}_0(\Sigma)$ to $L^2_0(\Sigma)$ and its adjoint operator $B^*=(\mathcal{I}^{\frac12})^{-1}C^*$ maps $L^2_0(\Sigma)$ to $H^{-1/2}_0(\Sigma)$. Since $\mathcal{I}^{\frac12}$ is an isomorphism, we deduce that $B^*$ is surjective.
From the above calculations we obtain that
$$\Lambda_0 - \Lambda_\Sigma = L^*FL = (BL)^*(BL) = A^*A, \qquad A = BL.$$
Since the range of $(A^*A)^{\frac12}$ for $A$ acting on Hilbert spaces is equal to the range of $A^*$, we deduce that
$$\mathcal{R}\big((\Lambda_0-\Lambda_\Sigma)^{\frac12}\big) = \mathcal{R}(L^*B^*) = \mathcal{R}(L^*) \tag{11.21}$$
since $B^*$ is surjective. The latter identity is shown as follows. We always have that $\mathcal{R}(L^*B^*)\subset\mathcal{R}(L^*)$. Now for $y\in\mathcal{R}(L^*)$ there is $x$ such that $y=L^*x$, and since $B^*$ is surjective there is $u$ such that $y=L^*B^*u$, so that $y\in\mathcal{R}(L^*B^*)$; whence $\mathcal{R}(L^*)\subset\mathcal{R}(L^*B^*)$.
When (11.5) and (11.20) hold instead of (11.4) and (11.19), we deduce that
$$\mathcal{R}\big((\Lambda_\Sigma-\Lambda_0)^{\frac12}\big) = \mathcal{R}(L^*), \tag{11.22}$$
instead of (11.21). In both cases, we have shown the following result.
Theorem 11.1.1 Provided that (11.4) or (11.5) holds, the range of the operator $L^*$ defined in (11.7) is determined by the Neumann-to-Dirichlet operator $\Lambda_\Sigma$, and (11.21) or (11.22) holds.
11.1.3 Reconstruction of $\Sigma$
The above theorem gives us a method to reconstruct $\Sigma$: we need to find a family of functions that belong to the range of $L^*$ when some probe covers $D$ and do not belong to it when the probe covers $D^c$. Notice that the operator $L^*$ does not depend on the domain $D$ and thus depends only on the tensor $\sigma_0$ and the surface $\Sigma$. Consequently, the reconstruction of $\Sigma$ is independent of $\sigma$ on $D$, except for the existence of a positive definite tensor $\alpha_1$ such that (11.4) or (11.5) holds.
Let us now introduce the family of functions $N(x;y)$, indexed by the parameter $y\in X$, solution of
$$\nabla\cdot\sigma_0\nabla N(\cdot;y) = \delta(\cdot-y) \ \text{ in } X, \qquad \nu\cdot\sigma_0\nabla N(\cdot;y) = 0 \ \text{ on } \partial X, \qquad \int_{\partial X} N(\cdot;y)\,d\mu = 0. \tag{11.23}$$
We define the family of functions $g_y(x) = N(x;y)|_{\partial X}$ on $\partial X$. Then we have the following result:
Theorem 11.1.2 The function $g_y(x)$ belongs to $\mathcal{R}(L^*)$ when $y\in D$ and does not belong to $\mathcal{R}(L^*)$ when $y\in D^c$.
This theorem provides us with a constructive method to image $\Sigma=\partial D$. For each $y\in X$, all we have to do is to solve (11.23) and verify whether the trace on $\partial X$ belongs to the range of $(\Lambda_0-\Lambda_\Sigma)^{\frac12}$, which can be evaluated from the known boundary measurements. Only when the verification is positive do we deduce that $y\in D$.
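In a discrete setting, where the NtD operators have been projected onto finitely many boundary currents, the range test above reduces to a Picard criterion on the spectral decomposition of the matrix difference. A minimal sketch on synthetic matrices (the rank-3 example, the tolerance, and the stand-in probe traces are all hypothetical):

```python
import numpy as np

def in_range_sqrt(L0, Ls, gy, tol=1e-8):
    """Picard test: decide whether gy lies in the range of (L0 - Ls)^{1/2}.
    The difference of the NtD matrices is symmetrized and diagonalized; gy is
    declared in the range if (numerically) all of its energy sits on the modes
    with nonzero eigenvalues.  Also returns the Picard sum as a diagnostic."""
    A = 0.5 * ((L0 - Ls) + (L0 - Ls).T)
    lam, V = np.linalg.eigh(A)
    keep = lam > tol * lam.max()          # discard numerically zero modes
    c = V.T @ gy                          # coefficients on the eigenbasis
    picard = np.sum(c[keep] ** 2 / lam[keep])
    resid = np.sum(c[~keep] ** 2)         # energy outside the numerical range
    return resid <= tol * np.sum(c ** 2), picard

# synthetic demonstration: a rank-3 positive semidefinite difference
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag([1.0, 0.5, 0.25, 0.0, 0.0, 0.0]) @ Q.T
L0, Ls = A, np.zeros_like(A)              # so that L0 - Ls = A
gy_in = Q[:, :3] @ np.ones(3)             # lies in the range of A^{1/2}
gy_out = Q[:, 3]                          # orthogonal to the range
ok_in, _ = in_range_sqrt(L0, Ls, gy_in)
ok_out, _ = in_range_sqrt(L0, Ls, gy_out)
print(ok_in, ok_out)                      # prints: True False
```

In an actual reconstruction, `gy` would be the trace of the probe function $N(\cdot;y)$ on the boundary, and the test would be repeated over a grid of points $y$.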
Proof. The proof of the theorem is as follows. When $y\in D$, we have that $\nu\cdot\sigma_0\nabla N(x;y)|_\Sigma\in H^{-1/2}_0(\Sigma)$ and $\nabla\cdot\sigma_0\nabla N(\cdot;y)=0$ on $D^c$, so that $g_y\in\mathcal{R}(L^*)$. Let us now assume that $y\in D^c$ and $g_y(x)\in\mathcal{R}(L^*)$. Then there exists $\phi\in H^{-1/2}_0(\Sigma)$ such that $g_y = L^*\phi = w|_{\partial X}$, where $w$ is the solution to (11.7). Let $B(y;\varepsilon)=B_\varepsilon$ be the ball of radius $\varepsilon$ centered at $y$ for $\varepsilon$ sufficiently small. On $D^c\backslash B_\varepsilon$, both $w$ and $N(\cdot;y)$ satisfy the same equation. By uniqueness of the solution to the Cauchy problem imposing Dirichlet data and vanishing Neumann data on $\partial X$, we deduce that $w=N(\cdot;y)$ on $D^c\backslash B_\varepsilon$. On $\Omega_\varepsilon = B_{\varepsilon_0}\backslash B_\varepsilon$ for some fixed $\varepsilon_0>0$, we verify that the $H^1(\Omega_\varepsilon)$ norm of $w$ remains bounded independently of $\varepsilon$, which is not the case for the fundamental solution $N(\cdot;y)$; whence the contradiction implying that $g_y$ is not in the range of $L^*$ when $y\in D^c$.
The factorization method is one example in the class of so-called qualitative reconstruction methodologies. The interested reader is referred to [26, 41, 51] for additional information about such methods.
11.2 Reconstructing small inclusions
This second section concerns the reconstruction of small inclusions. We have seen that the reconstruction of diffusion or absorption coefficients in an elliptic equation resulted in a severely ill-posed problem. The previous section dealt with the issue by reconstructing the support of an inclusion instead of its detailed structure. Because the support of the inclusion remains an infinite dimensional object, the stability of the reconstruction is still a severely ill-posed problem. Here we further simplify the problem by assuming that the inclusions have small support. This introduces a small parameter allowing us to perform asymptotic expansions. We can then characterize the influence of the inclusion on the boundary measurements by successive terms in the expansion. The interest of such a procedure is the following. Since high-order terms in the expansion become asymptotically negligible, the procedure tells us which parameters can be reconstructed from a given noise level in the measurements and which parameters cannot possibly be estimated. Moreover, each term in the asymptotic expansion is characterized by a finite number of parameters. This implies that, by truncating the expansion, we are parameterizing the reconstruction with a finite number of parameters. Unlike the previous reconstructions, this becomes a well-posed problem since the ill-posedness comes from the infinite dimensionality of the parameters we want to reconstruct, at least as long as the mapping from the object to be reconstructed to the noise-free measurements is one-to-one (injective).
We consider in this chapter a mathematically very simple problem, namely the reconstruction of inclusions characterized by a variation in the absorption coefficient. We also restrict ourselves to the reconstruction from the leading term in the aforementioned asymptotic expansion. The interesting problem of variations of the diffusion coefficient is mathematically more difficult, although the main conclusions are in the end very similar. The presentation follows that in [8].
11.2.1 First-order effects
Let us consider the problem of optical tomography modeled by a diffusion equation on a domain $X$ with current density $g(x)$ prescribed at the boundary $\partial X$. We assume here that the diffusion coefficient is known and, to simplify, is set to $1$. Our main hypothesis on the absorption coefficient is that it is the superposition of a background absorption, to simplify the constant $\sigma_0$, and a finite number of fluctuations of arbitrary size $\delta\sigma_m = \sigma_m-\sigma_0$, with $\sigma_m$ constant to simplify, but of small volume. Smallness of the volume of the inclusions compared to the volume of the whole domain is characterized by the small parameter $\varepsilon\ll 1$. The diffusion equation with small absorption inclusions then takes the form
$$-\Delta u_\varepsilon(x) + \sigma_\varepsilon(x)u_\varepsilon(x) = 0 \ \text{ in } X, \qquad \frac{\partial u_\varepsilon}{\partial\nu} = g \ \text{ on } \partial X, \tag{11.24}$$
where the absorption coefficient is given by
$$\sigma_\varepsilon(x) = \sigma_0 + \sum_{m=1}^M \delta\sigma_m\,\chi_{z_m+\varepsilon B_m}(x). \tag{11.25}$$
We have introduced here $B_m$ as the shape of the $m$th inclusion centered at $z_m$, and $\chi_{z_m+\varepsilon B_m}(x)=1$ if $x-z_m\in\varepsilon B_m$ and $0$ otherwise. The inclusions are centered at zero in the sense that
$$\int_{B_m} x\,dx = 0 \quad\text{for all } m, \tag{11.26}$$
and are assumed to be at a distance greater than $d>0$, independent of $\varepsilon$, of each other and of the boundary $\partial X$. The parameter $\varepsilon$ is a measure of the diameter of the inclusions. In the three-dimensional setting, which we assume from now on, this implies that the volume of the inclusions is of order $\varepsilon^3$.
We want to derive an asymptotic expansion for $u_\varepsilon$ in powers of $\varepsilon$ and observe which information about the inclusions we can deduce from the first terms in the expansion. Let us first define the Green function of the corresponding homogeneous problem,
$$-\Delta G(x;z) + \sigma_0 G(x;z) = \delta(x-z) \ \text{ in } X, \qquad \frac{\partial G}{\partial\nu}(x;z) = 0 \ \text{ on } \partial X, \tag{11.27}$$
and the homogeneous-domain solution $U(x)$ of
$$-\Delta U(x) + \sigma_0 U(x) = 0 \ \text{ in } X, \qquad \frac{\partial U}{\partial\nu}(x) = g(x) \ \text{ on } \partial X. \tag{11.28}$$
As $\varepsilon\to 0$, the volume of the inclusions tends to $0$ and $u_\varepsilon$ converges to $U$. To show this, we multiply (11.27) by $u_\varepsilon$ and integrate by parts to obtain
$$u_\varepsilon(z) = \int_{\partial X} g(x)G(x;z)\,d\mu(x) - \sum_{m=1}^M \int_{z_m+\varepsilon B_m} \delta\sigma_m\, G(x;z)u_\varepsilon(x)\,dx.$$
Using the same procedure for $U(x)$, we obtain
$$u_\varepsilon(z) = U(z) - \sum_{m=1}^M \int_{z_m+\varepsilon B_m} \delta\sigma_m\, G(x;z)u_\varepsilon(x)\,dx. \tag{11.29}$$
In three space dimensions, the Green function is given by
$$G(x;z) = \frac{e^{-\sqrt{\sigma_0}|z-x|}}{4\pi|z-x|} + w(x;z), \tag{11.30}$$
where $w(x;z)$ is a smooth function (because it solves a problem of the form (11.28) with smooth boundary conditions) provided that $\partial X$ is smooth. For $z$ at a distance greater than $d>0$ away from the inclusions $z_m+\varepsilon B_m$, we then deduce from the $L^\infty$ bound on $u_\varepsilon$ (because $g$ and $\partial X$ are assumed to be sufficiently regular) that
$$u_\varepsilon(z) = U(z) + O(\varepsilon^3).$$
In the vicinity of the inclusions, we deduce from the relation
$$\int_{z_m+\varepsilon B_m} G(x;z)\,dx = O(\varepsilon^2), \qquad z-z_m\in\varepsilon B_m,$$
that $u_\varepsilon(z)-U(z)$ is of order $\varepsilon^2$ when $z$ is sufficiently close to an inclusion. This also shows that the operator
$$K_\varepsilon u(z) = -\sum_{m=1}^M \int_{z_m+\varepsilon B_m} \delta\sigma_m\, G(x;z)u(x)\,dx \tag{11.31}$$
is a bounded linear operator in $\mathcal{L}(L^\infty(X))$ with a norm of order $\varepsilon^2$. This implies that, for sufficiently small values of $\varepsilon$, we can write
$$u_\varepsilon(z) = \sum_{k=0}^\infty K_\varepsilon^k\, U(z). \tag{11.32}$$
The above series converges fast when $\varepsilon$ is small. Notice however that the series does not converge as fast as $\varepsilon^3$, the volume of the inclusions, because of the singular behavior of the Green function $G(x;z)$ when $x$ is close to $z$.
Let us now use that
$$u_\varepsilon(z) = U(z) - \sum_{m=1}^M \int_{z_m+\varepsilon B_m} \delta\sigma_m\, G(x;z)U(x)\,dx + \sum_{m=1}^M\sum_{n=1}^M \int_{z_m+\varepsilon B_m}\int_{z_n+\varepsilon B_n} \delta\sigma_m\,\delta\sigma_n\, G(x;z)G(y;x)u_\varepsilon(y)\,dy\,dx. \tag{11.33}$$
For the same reasons as above, the last term is of order $\varepsilon^5$, and expanding the smooth solutions $U(x)$ and $G(x;z)$ inside inclusions of diameter $\varepsilon$, we obtain that
$$u_\varepsilon(z) = U(z) - \sum_{m=1}^M C_m\, G(z_m;z)U(z_m) + O(\varepsilon^5), \tag{11.34}$$
where $C_m$ is given by
$$C_m = \varepsilon^3\,|B_m|\,\delta\sigma_m. \tag{11.35}$$
The reason why we obtain a correction term of order $\varepsilon^5$ in (11.34) comes from the fact that (11.26) holds, so that the terms of order $\varepsilon^4$, proportional to $\nabla_x U$ or $\nabla_x G$, vanish.
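Once the background solution $U$ and the Green function are available, the leading-order model (11.34)-(11.35) is inexpensive to evaluate. A sketch using only the free-space part of (11.30) in place of the full Green function, with hypothetical inclusion parameters and a constant background solution:

```python
import numpy as np

def green_free(x, z, sigma0):
    """Free-space part of the Green function (11.30) in three dimensions."""
    r = np.linalg.norm(np.asarray(x, float) - np.asarray(z, float))
    return np.exp(-np.sqrt(sigma0) * r) / (4.0 * np.pi * r)

def u_leading(z, U, zm, Cm, sigma0):
    """Leading-order expansion (11.34):
    u_eps(z) ~ U(z) - sum_m C_m G(z_m; z) U(z_m),
    with C_m = eps^3 |B_m| dsigma_m as in (11.35)."""
    return U(z) - sum(C * green_free(zc, z, sigma0) * U(zc)
                      for zc, C in zip(zm, Cm))

# hypothetical setting: constant background solution U = 1, two inclusions
U = lambda z: 1.0
zm = [np.array([0.2, 0.0, 0.0]), np.array([-0.3, 0.1, 0.0])]
eps, dsigma, vol = 0.05, [2.0, -1.0], [1.0, 1.0]
Cm = [eps**3 * v * ds for v, ds in zip(vol, dsigma)]   # (11.35)
val = u_leading(np.array([1.0, 0.0, 0.0]), U, zm, Cm, sigma0=1.0)
# val is slightly below U = 1: the dominant (positive) fluctuation absorbs
```

This cheap forward model is what gets fitted to data in the stability discussion of the next subsection.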
11.2.2 Stability of the reconstruction
The above analysis tells us the following. Provided that our measurement errors are of order $O(\varepsilon^5)$, the only information that can possibly be retrieved on each inclusion is its location $z_m$ and the product $C_m=\varepsilon^3\delta\sigma_m|B_m|$ of the absorption fluctuation with the volume of the inclusion. More refined information requires data with less noise. Assuming that the inclusions are sufficiently small so that the above asymptotic expansion makes sense, no other information can be obtained in a stable fashion from the data.
Note that the problem we now wish to solve is finite-dimensional. Indeed, each inclusion is represented by four real numbers, namely the three components of the position $z_m$ and the product $C_m$. Assuming that only $M$ inclusions are present, this leaves us with $4M$ parameters to reconstruct. The main advantage of reconstructing a finite number of parameters is that it is natural to expect stability of the reconstruction. We can even show stability of the reconstruction from boundary measurements corresponding to one current density $g(x)$, provided that the homogeneous solution $U(x)$ is uniformly positive inside the domain. Here is how it can be proved.
Let us assume that the boundary measurements have an accuracy of order $O(\varepsilon^5)$, which is consistent with
$$u_\varepsilon(z) = U(z) - \sum_{m=1}^M C_m\, G(z_m;z)U(z_m) + O(\varepsilon^5). \tag{11.36}$$
We denote by $u_\varepsilon$ and $u'_\varepsilon$ the solutions of two problems with absorption coefficients $\sigma_\varepsilon$ and $\sigma'_\varepsilon$ of the form (11.25). Using (11.36), we obtain that
$$u_\varepsilon(z) - u'_\varepsilon(z) = F(z) + O(\varepsilon^5),$$
with
$$F(z) = -\sum_{m=1}^M \Big( C_m\, G(z_m;z)U(z_m) - C'_m\, G(z'_m;z)U(z'_m) \Big). \tag{11.37}$$
Here we use $M=\max(M,M')$ with a small abuse of notation; we will see shortly that $M=M'$. The function $F(z)$ satisfies the homogeneous equation $-\Delta F+\sigma_0 F=0$ on $X$ except at the points $z_m$ and $z'_m$. Moreover, we have that $\frac{\partial F}{\partial\nu}=0$ at $\partial X$. If $F=0$ on $\partial X$, we deduce from the uniqueness of the Cauchy problem for the operator $-\Delta+\sigma_0$ that $F\equiv 0$ in $X$. As $\varepsilon\to 0$ and $u_\varepsilon-u'_\varepsilon\to 0$, we deduce that $F(z)$ becomes small not only at $\partial X$ but also inside $X$ (the continuation of $F$ from $\partial X$ to $X\backslash\{z_m\}\cup\{z'_m\}$ is independent of $\varepsilon$). However, the functions $G(z_m;z)U(z_m)$ form an independent family. Each term must therefore be compensated by a term from the sum over the primed coefficients. We thus obtain that $M=M'$ and that
$$\Big| C_m\, G(z_m;z)U(z_m) - C'_m\, G(z'_m;z)U(z'_m) \Big| \leq C\|u_\varepsilon-u'_\varepsilon\|_{L^\infty(\partial X)} + O(\varepsilon^5).$$
The left-hand side can be recast as
$$\Big| (C_m-C'_m)\,G(z_m;z)U(z_m) + C'_m\,(z_m-z'_m)\cdot\nabla_{z_m}\big(G(\tilde z_m;z)U(\tilde z_m)\big) \Big|,$$
where $\tilde z_m = \theta z_m + (1-\theta)z'_m$ for some $\theta\in(0,1)$. Again these two functions are linearly independent, and so we deduce that
$$|C_m-C'_m| + |C'_m|\,|z_m-z'_m| \leq C\|u_\varepsilon-u'_\varepsilon\|_{L^\infty(\partial X)} + O(\varepsilon^5).$$
Using (11.34) and (11.35), we then obtain, assuming that $\|u_\varepsilon-u'_\varepsilon\|_{L^\infty(\partial X)}\lesssim\varepsilon^5$, that
$$\Big|\, |B_m|\,\delta\sigma_m - |B'_m|\,\delta\sigma'_m \,\Big| + |z_m-z'_m| \leq C\,\varepsilon^{-3}\|u_\varepsilon-u'_\varepsilon\|_{L^\infty(\partial X)} \lesssim \varepsilon^2. \tag{11.38}$$
Assuming that the accuracy of the measured data is compatible with the expansion (11.34), i.e., that $u_\varepsilon$ is known on $\partial X$ up to an error term of order $\varepsilon^5$, we can then reconstruct the location $z_m$ of the heterogeneities up to an error of order $\varepsilon^2$. The product of the volume of the inclusion and the absorption fluctuation is also known with the same accuracy.
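Since the unknowns reduce to the $4M$ numbers $(z_m, C_m)$, the reconstruction can in practice be phrased as a nonlinear least-squares fit of the model (11.36) to boundary data. A sketch for a single inclusion with synthetic noise-free data, again substituting the free-space part of (11.30) for the full Green function and taking $U\equiv 1$; the detector geometry and true parameters are hypothetical:

```python
import numpy as np
from scipy.optimize import least_squares

def G(x, z, s0=1.0):
    """Free-space part of the Green function (11.30) in three dimensions."""
    r = np.linalg.norm(x - z, axis=-1)
    return np.exp(-np.sqrt(s0) * r) / (4 * np.pi * r)

# hypothetical detectors on a circle of radius 2 in the plane of the inclusion
th = np.linspace(0, 2 * np.pi, 20, endpoint=False)
det = np.stack([2 * np.cos(th), 2 * np.sin(th), np.zeros_like(th)], axis=1)

def model(p):
    """Leading-order boundary data (11.36) with U = 1; p = (z_1, z_2, z_3, C)."""
    return 1.0 - p[3] * G(det, p[:3])

z_true, C_true = np.array([0.3, -0.2, 0.0]), 0.5
data = model(np.append(z_true, C_true))       # synthetic noise-free data

fit = least_squares(lambda p: model(p) - data,
                    x0=np.array([0.0, 0.0, 0.0, 0.25]),
                    xtol=1e-14, ftol=1e-14, gtol=1e-14)
z_rec, C_rec = fit.x[:3], fit.x[3]
```

With noise-free data the fit recovers $(z_m, C_m)$ essentially exactly; the estimate (11.38) quantifies how the recovery degrades once the data error reaches the $O(\varepsilon^5)$ threshold.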
Small volume expansions have been analyzed in much more general settings than the one presented above. We refer the reader to [5] for additional details. In the context of the reconstruction of absorbing inclusions as was considered above, the expansion to order $O(\varepsilon^5)$ and the additional information it provides about the inclusions may be found in [8].
Chapter 12
Inverse Problems and Modeling
We conclude these notes with a chapter devoted to the interplay between modeling and inverse problems. We have described an inverse problem as the triplet (Measurement Operator, Noise Model, Prior Information). In such a setting, the MO is given as our best guess for the operator that maps parameters to measurements. The noise structure is also given in the model, and, depending on the potential effect of noise on reconstructions, priors were considered to mitigate its influence.
In some situations, modeling can help to come up with a triplet (Measurement Operator, Noise Model, Prior Information) that greatly simplifies the analysis and numerical simulation of an inverse problem. We consider two such situations: one in which modeling helps to derive a MO adapted to a particular situation, and one in which asymptotic expansions may help to come up with simple yet efficient noise models. Modeling may also assist in the choice of priors. For instance, if the parameter of interest is an image, then one might envision situations in which prior probabilities on images are assigned based on how likely a given image is the transformation of a known reference image for a transformation satisfying prescribed, physically motivated, constraints. We do not explore this aspect here.
As an example of modeling of the MO, consider the reconstruction of objects buried in a highly heterogeneous medium, that is, a medium with clutter present at very small scales compared to the overall size of the physical domain of interest. We assume that the medium is probed by electromagnetic waves, say, which as usual in this text are modeled by a scalar wave equation. The space of parameters is therefore the buried object and the heterogeneous medium. The latter being highly heterogeneous requires many more parameters than we can afford to consider based on available data. Besides, we are not interested in the heterogeneous medium. We want to use modeling to simplify the inverse problem so that the set of (unknown) parameters is restricted to the buried object plus a number of macroscopic parameters describing the heterogeneous medium, which we wish to be as limited as possible. How this might be done is considered in section 12.1.
As a second example of the interplay between modeling and inverse problems, let us consider an inverse problem in which the influence of the high frequencies of the parameters on the measurements becomes negligible compared to the amount of noise. Such high frequencies then cannot be reconstructed. Yet, when the inverse problem is nonlinear, they may very well influence the low frequency component of the available data and hence influence the reconstruction of the low frequency component of the parameters. Since said high frequencies cannot be reconstructed, we might as well model them as random. How such randomness propagates to the available measurements is a modeling exercise that is sometimes tractable analytically. The benefit of such a calculation is a physics-based model for the Noise Model associated to a Measurement Operator focusing on the low frequency component of the parameters. We consider such a setting in section 12.2.
12.1 Imaging in Highly Heterogeneous Media
12.1.1 Wave model
Consider the propagation of acoustic waves in a heterogeneous medium modeled by the following wave equation:
$$\frac{\partial^2 p_\varepsilon}{\partial t^2} - c_\varepsilon^2(x)\Delta p_\varepsilon = 0, \quad t>0,\ x\in\mathbb{R}^n, \qquad p_\varepsilon(0,x) = p_0(x), \qquad \frac{\partial p_\varepsilon}{\partial t}(0,x) = j_0(x),\ x\in\mathbb{R}^n, \tag{12.1}$$
where $p_\varepsilon(t,x)$ is the pressure potential and $c_\varepsilon(x)$ is the sound speed.
The parameter $\varepsilon$ is defined as the ratio of the typical wavelength of the probing wave field with the overall distance $L$ of the domain over which propagation is observed. This is modeled by assuming that, e.g., $p_0(x)=\psi_p(x)\phi_p(\frac{x}{\varepsilon})$ and $j_0(x)=\psi_j(x)\phi_j(\frac{x}{\varepsilon})$ for some smooth functions $\psi_{p,j}(x)$ and $\phi_{p,j}(x)$.
We are interested in the regime where $\varepsilon\ll 1$. The sound speed $c_\varepsilon(x)$ models both the buried inclusion (for instance $c$ jumps across an interface, or $c$ becomes constant inside the inclusion) and the heterogeneous surrounding medium.
Let us assume that the sound speed $c_\varepsilon(x)$ takes the form
$$c_\varepsilon^2(x) = c_0^2(x) - \sqrt{\varepsilon}\, V\Big(x,\frac{x}{\varepsilon}\Big), \tag{12.2}$$
where $c_0(x)$ is the deterministic background speed and $V(x,\frac{x}{\varepsilon})$ denotes the rapid oscillations, also at the scale $\varepsilon$. This is the so-called weak coupling regime of wave propagation.
The reconstruction of the highly oscillatory $c_\varepsilon(x)$ from boundary measurements is a formidable task. The objective of the modeling effort is to replace this difficult inverse problem by a simpler one in which the parameters would be the deterministic background fluctuations $c_0(x)$, which could model the buried inclusions as well, and a few deterministic parameters modeling the statistics of $V(x,\frac{x}{\varepsilon})$, which would be modeled as a random field since we no longer aim to reconstruct it.
We thus assume that $V(x,y)$ is a random field $V(x,y;\omega)$, where $\omega\in\Omega$, the state space of possible realizations of the heterogeneous medium constructed on a probability space $(\Omega,\mathcal{F},P)$. We assume that for each fixed $x$, $V(x,y;\omega)$ is a stationary, mean-zero random field, with a correlation function defined by
$$R(x,y) = \mathbb{E}\{V(x,z)V(x,z+y)\}, \tag{12.3}$$
where $\mathbb{E}$ is mathematical expectation with respect to $P$ and where $R$ is independent of $z$ by stationarity of $V$. The correlation function $R(x,y)$ is the only macroscopic, fluctuation-dependent parameter that will appear in the macroscopic models we consider below.
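Random media of this type are commonly synthesized spectrally: one colors white noise by the square root of the power spectrum $\hat R$ so that the output has the prescribed correlation function. A minimal one-dimensional sketch with a Gaussian correlation $R(y)=e^{-y^2/(2\ell^2)}$ (grid size, correlation length and seed are hypothetical choices):

```python
import numpy as np

def gaussian_field(n, dx, ell, rng):
    """Draw a stationary mean-zero Gaussian field on a periodic grid whose
    correlation function is R(y) = exp(-y^2 / (2 ell^2)): multiply white
    noise by sqrt(R_hat) in Fourier space and transform back."""
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)
    # power spectrum of the Gaussian correlation (continuum formula)
    R_hat = np.sqrt(2 * np.pi) * ell * np.exp(-0.5 * (ell * k) ** 2)
    noise = rng.standard_normal(n)
    return np.fft.ifft(np.sqrt(R_hat / dx) * np.fft.fft(noise)).real

rng = np.random.default_rng(1)
V = gaussian_field(n=4096, dx=0.1, ell=0.5, rng=rng)
# empirical variance of V is close to R(0) = 1, and the empirical
# correlation at lag y decays like exp(-y^2 / (2 ell^2))
```

The normalization `sqrt(R_hat / dx)` is chosen so that the discrete field reproduces the continuum variance $R(0)$ on the grid.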
12.1.2 Kinetic Model
The mathematical modeling we wish to perform consists of assuming that $\varepsilon\to 0$ and observing whether a macroscopic, hopefully $\varepsilon$-independent, model emerges in that limit. It turns out that $p_\varepsilon$ itself is not the right quantity to consider. As $\varepsilon\to 0$, wave fields oscillate rapidly and the (weak) limit of $p_\varepsilon$ does not capture how the wave energy propagates.
The local energy density of the waves is given by
$$\mathcal{E}_\varepsilon(t,x) = \frac12\Big( \frac{1}{c_\varepsilon^2(x)}\Big(\frac{\partial p_\varepsilon}{\partial t}\Big)^2(t,x) + |u_\varepsilon|^2(t,x) \Big), \tag{12.4}$$
where $\frac{\partial p_\varepsilon}{\partial t}$ is the pressure field and $u_\varepsilon = \nabla_x p_\varepsilon$ is the acoustic velocity field. Up to multiplication by a constant density $\rho_0$, the first term above is the potential energy and the second term the kinetic energy. The total energy is an invariant of the dynamics so that
$$\mathcal{E}(t) = \int_{\mathbb{R}^n} \mathcal{E}_\varepsilon(t,x)\,dx = \mathcal{E}(0) = \int_{\mathbb{R}^n} \frac12\Big( \frac{1}{c_\varepsilon^2(x)}|j_0|^2(x) + |\nabla_x p_0|^2(x) \Big)\,dx. \tag{12.5}$$
All quantities are rescaled so that $\mathcal{E}(t)$ is of order $O(1)$ independent of $\varepsilon$.
The local energy density $\mathcal{E}_\varepsilon(t,x)$ does not satisfy a closed-form equation. As in the classical mechanical description of particles, position needs to be augmented with a momentum variable to obtain closed-form equations such as Newton's laws for instance. The same behavior occurs here and the energy density admits a representation in the phase space of positions and momenta:
$$\lim_{\varepsilon\to 0} \mathcal{E}_\varepsilon(t,x) = \int_{\mathbb{R}^n} a(t,x,k)\,dk, \tag{12.6}$$
where $a(t,x,k)$ is a phase-space energy density for each $t\geq 0$.
In the absence of random fluctuations, i.e., when $V\equiv 0$, $a$ satisfies the same Liouville equation as in Chapter 5:
$$\frac{\partial a}{\partial t} + \nabla_k H\cdot\nabla_x a - \nabla_x H\cdot\nabla_k a = 0, \tag{12.7}$$
where $H(x,k)=c_0(x)|k|$ is the Hamiltonian of the dynamics.
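Energy is transported along the characteristics of (12.7), the Hamiltonian rays $\dot x = \nabla_k H$, $\dot k = -\nabla_x H$. A sketch integrating these ray equations with a standard fourth-order Runge-Kutta step for a user-supplied sound speed (the homogeneous example at the end is just a sanity check: rays are straight lines traversed at speed $c_0$):

```python
import numpy as np

def rays(x0, k0, c0, grad_c0, dt, nsteps):
    """Integrate dx/dt = grad_k H, dk/dt = -grad_x H for H(x,k) = c0(x)|k|
    in two dimensions with a fourth-order Runge-Kutta scheme."""
    def rhs(y):
        x, k = y[:2], y[2:]
        nk = np.linalg.norm(k)
        # grad_k H = c0(x) k/|k|,  grad_x H = |k| grad c0(x)
        return np.concatenate([c0(x) * k / nk, -nk * grad_c0(x)])
    y = np.concatenate([x0, k0])
    traj = [y.copy()]
    for _ in range(nsteps):
        k1 = rhs(y); k2 = rhs(y + 0.5 * dt * k1)
        k3 = rhs(y + 0.5 * dt * k2); k4 = rhs(y + dt * k3)
        y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(y.copy())
    return np.array(traj)

# homogeneous medium c0 = 2: after unit time the ray reaches x = (2, 0)
tr = rays(np.zeros(2), np.array([1.0, 0.0]), lambda x: 2.0,
          lambda x: np.zeros(2), dt=0.01, nsteps=100)
```

In a smoothly varying medium the same routine bends the rays; $H(x,k)$ is conserved along each trajectory and provides a useful accuracy check.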
In the presence of random fluctuations, $V\neq 0$, the wave packets interact with the underlying structure and scatter. The Liouville equation (12.7) no longer holds and propagation in a homogeneous medium is no longer an accurate model. Rather, we obtain the following radiative transfer model:
$$\frac{\partial a}{\partial t} + \nabla_k H\cdot\nabla_x a - \nabla_x H\cdot\nabla_k a = \int_{\mathbb{R}^n} \sigma(x,k,q)\big(a(t,x,q)-a(t,x,k)\big)\,\delta\big(H(x,q)-H(x,k)\big)\,dq, \tag{12.8}$$
where the scattering coefficient is given by:
$$\sigma(x,k,q) = \frac{\pi H^2(x,k)}{2(2\pi)^n}\,\hat R(x,k-q). \tag{12.9}$$
Here, $\hat R(x,k)$ is the Fourier transform of the correlation function $R(x,z)$ with respect to the second variable only.
The derivation of kinetic models such as (12.8) is a lengthy process that is understood only in specific situations. We refer the reader to the review article [15] for an extensive list of works related to such derivations.
12.1.3 Statistical Stability
We observe that the above kinetic model is deterministic even though the description of the scattering medium through $V(x,y)$ was random. It remains, however, to understand in which sense we have convergence in (12.6). Does the whole random variable $\mathcal{E}_\varepsilon(t,x)$ converge to a deterministic limit, or does only the ensemble average $\mathbb{E}\{\mathcal{E}_\varepsilon(t,x)\}$ converge to that limit? The latter would not be very useful in the reconstruction of the profile $c_0(x)$, which may for instance model a buried inclusion. Indeed, measurements of $p_\varepsilon$, and hence of $\mathcal{E}_\varepsilon(t,x)$, are available for only one realization of the random medium $V(x,y;\omega)$. Taking ensemble averages is often not possible practically.
It turns out that the whole random field $\mathcal{E}_\varepsilon(t,x)$ converges in an appropriate sense to its deterministic limit $\int_{\mathbb{R}^n} a(t,x,k)\,dk$. To obtain a more precise statement, we need to introduce some notation. We first introduce the Wigner transform of two possibly complex-valued functions $f$ and $g$ at the scale $\varepsilon$:
$$W_\varepsilon[f,g](x,k) = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} e^{ik\cdot y}\, f\Big(x-\frac{\varepsilon y}{2}\Big)\, g^*\Big(x+\frac{\varepsilon y}{2}\Big)\,dy. \tag{12.10}$$
Note that the Wigner transform is nothing but the Fourier transform of the two-point function $f(x-\frac{\varepsilon y}{2})g^*(x+\frac{\varepsilon y}{2})$. As a consequence, it may be seen as a decomposition over wave numbers of the correlation $f(x)g^*(x)$, since by integrating both sides, we obtain that
$$\int_{\mathbb{R}^n} W_\varepsilon[f,g](x,k)\,dk = f(x)g^*(x). \tag{12.11}$$
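For discretely sampled signals, the Wigner transform (12.10) can be evaluated with an FFT in the offset variable $y$. A minimal one-dimensional sketch (with $\varepsilon=1$ and a periodic grid, so this is the usual discrete Wigner distribution rather than the scaled transform of the text); summing each row reproduces the discrete analogue of the marginal property (12.11):

```python
import numpy as np

def wigner(f):
    """Discrete Wigner transform of a 1D signal on a periodic grid.
    Row j describes the distribution of |f|^2 over wavenumbers at x_j;
    summing each row recovers |f(x_j)|^2, the analogue of (12.11)."""
    n = len(f)
    j = np.arange(n)
    m = np.arange(n) - n // 2
    # correlation f(x - y/2) conj(f(x + y/2)) on the periodic grid
    corr = f[(j[:, None] - m) % n] * np.conj(f[(j[:, None] + m) % n])
    # put offset m = 0 first, then Fourier transform in the offset variable
    W = np.fft.fft(np.fft.ifftshift(corr, axes=1), axis=1)
    return W.real / n

f = np.exp(2j * np.pi * 5 * np.arange(64) / 64)   # pure plane wave, mode 5
W = wigner(f)
# each row sums to |f|^2 = 1, and the energy concentrates at a single
# wavenumber, as expected for a monochromatic plane wave
```

Because the offsets come in half-integer pairs, the frequency axis of the discrete Wigner distribution is sampled at twice the density of the signal's FFT, a standard feature of such implementations.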
Let us define
$$W_\varepsilon(t,x,k) = \frac12\Big( \frac{1}{c_\varepsilon^2(x)}\, W_\varepsilon\Big[\frac{\partial p_\varepsilon}{\partial t},\frac{\partial p_\varepsilon}{\partial t}\Big](x,k) + W_\varepsilon\big[\nabla_x p_\varepsilon,\nabla_x p_\varepsilon\big](x,k) \Big). \tag{12.12}$$
Then some calculations show that the above quantity, which we also call a Wigner transform, is a decomposition over wave numbers of the field-field correlation (12.4):
$$\mathcal{E}_\varepsilon(t,x) = \int_{\mathbb{R}^n} W_\varepsilon(t,x,k)\,dk.$$
Exercise 12.1.1 Check this derivation; see [15] for details.
In other words, we are interested in which sense the limit of $W_\varepsilon(t,x,k)$ is equal to $a(t,x,k)$, the solution of (12.8).
The main theoretical advantage of the Wigner transform $W_\varepsilon(t,x,k)$ is that it satisfies a closed-form equation [9, 54]. Formal asymptotic expansions developed in [54] allow us to pass to the limit in that equation and obtain the transport equation (12.8). One way to show that the full random process $W_\varepsilon(t,x,k)$ converges to $a(t,x,k)$ is to look at the following scintillation function (covariance function):
$$J_\varepsilon(t,x,k,y,p) = \mathbb{E}\{W_\varepsilon(t,x,k)W_\varepsilon(t,y,p)\} - \mathbb{E}\{W_\varepsilon(t,x,k)\}\,\mathbb{E}\{W_\varepsilon(t,y,p)\}. \tag{12.13}$$
It has been shown in several regimes of wave propagation that $J_\varepsilon(t,x,k,y,p)$ converges weakly to $0$ as $\varepsilon\to 0$; see [15]. Such results allow us to deduce that, for $\varphi$ sufficiently smooth, we have
$$\int_{\mathbb{R}^{2n}} W_\varepsilon(t,x,k)\varphi(x,k)\,dx\,dk \ \xrightarrow{\ \text{probability}\ }\ \int_{\mathbb{R}^{2n}} a(t,x,k)\varphi(x,k)\,dx\,dk. \tag{12.14}$$
Exercise 12.1.2 Using the Chebyshev inequality, show that for any function $\varphi(x,k)$ and $\eta>0$, we have the inequality
$$P\Big( \big|\langle W_\varepsilon(t,\cdot,\cdot) - a(t,\cdot,\cdot),\varphi\rangle\big| \geq \eta \Big) \leq \frac{\langle J_\varepsilon(t),\varphi\otimes\varphi\rangle}{\eta^2}. \tag{12.15}$$
The convergence result (12.14) has one positive aspect and one negative one. Let us start with the former. In our search for a model for the buried inclusion described by c_0(x), we have replaced an inverse wave problem for c_ε(x) with measurements of the form p_ε(t, x) by an inverse kinetic problem for (c_0(x), R̂(x, k)) with measurements of the form W_ε(t, x, k), say (let us pretend for the moment that such quantities are accessible by measurements). If the measurements W_ε(t, x, k) obtained for one realization of V(x, y) were not approximately deterministic, then the reconstruction of (c_0(x), R̂(x, k)) would inevitably strongly depend on the unknown realization of V(x, y). The reconstruction would then be statistically unstable.
So the positive aspect of (12.14) is that the random quantity W_ε(t, x, k) converges to a(t, x, k) in probability. Let m_ε(t) be the left-hand side in (12.14) and m(t) its right-hand side. Then we obtain that for all η > 0, the probability that |m_ε(t) − m(t)| ≥ η (uniformly in t ∈ (0, T) in fact) goes to 0 as ε → 0. Here, (12.14) does not describe how fast the convergence occurs.

Some estimates (sometimes optimal, sometimes not) of the rate of convergence exist in the literature [15]. What is much more damaging in practice, and forms the real negative aspect of (12.14), is that convergence holds only after phase-space averaging. Averaging over wavenumbers k may not be an issue since typically only ℰ_ε(t, x) is measured. Averaging over space is more constraining: the wider the spatial extent of φ, the lesser the spatial resolution capabilities of measurements of the form m_ε(t).
In summary, we were successful in replacing an inverse wave problem for c_ε(x) with statistically unstable measurements of the form p_ε(t, x) by an inverse kinetic problem for (c_0(x), R̂(x, k)) with statistically stable measurements of the form W_ε(t, x, k). However, statistical stability requires that the kinetic measurements be averaged over a sufficiently large area, with negative consequences for resolution.
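The trade-off between statistical stability and resolution can be illustrated on a toy example: averaging a mean-zero random field over a window of width w reduces the variance of the resulting "measurement" like 1/w, at the price of resolving only features coarser than w. The field and window sizes below are illustrative choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_real, n_pts = 5000, 1024
z = rng.standard_normal((n_real, n_pts))   # one row = one realization of the medium

# Variance across realizations of the window-averaged measurement, for
# increasing window widths w; for i.i.d. N(0,1) entries it equals 1/w.
widths = [4, 32, 256]
variances = [z[:, :w].mean(axis=1).var() for w in widths]

# Statistical stability: fluctuations decrease like 1/w as the window widens,
# while spatial resolution degrades to the scale w.
```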
188 CHAPTER 12. INVERSE PROBLEMS AND MODELING
12.1.4 Inverse transport problem
Let us further simplify the inverse transport problem obtained by the modeling presented above. We assume that the propagating waves are monochromatic waves with frequency ω = c_0|k|. We also assume that the background sound speed c_0 is constant except at the location of the buried inclusion(s). Furthermore, we assume that R̂(x, k) is constant in x and on the sphere of radius |k| = ω/c_0, and equal to a constant R̂_0. This implies that scattering is isotropic for the frequency ω. With all these simplifications, and defining
u(x, θ) = a(x, |k|θ), we find that u solves the following transport equation

  θ·∇_x u + σ_0 u = σ_0 ∫_{S^{n−1}} u(x, θ′) dν(θ′),   (12.16)

where dν is normalized so that ∫_{S^{n−1}} dν(θ′) = 1, at least at those points x where no inclusion is present.
Exercise 12.1.3 Derive (12.16) from (12.8) and find the value of σ_0.
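Equation (12.16) can be solved numerically by standard source iteration. The sketch below treats a one-dimensional slab analogue with Gauss-Legendre directions, an approximate point source, and vacuum inflow; all discretization choices (grid, S8 quadrature, source strength, iteration count) are ours, not prescribed by the text.

```python
import numpy as np

# Source iteration for a 1-D slab analogue of (12.16):
#   mu du/dx + sigma0 u = sigma0 <u> + q,  vacuum inflow on both sides,
# with an approximate delta source at x0 = L/2.
L, N, sigma0 = 5.0, 100, 1.0
dx = L / N
mu, wq = np.polynomial.legendre.leggauss(8)   # directions and weights on (-1, 1)
wq = wq / wq.sum()                            # normalize the angular measure to 1
pos, neg = mu > 0, mu < 0
q = np.zeros(N)
q[N // 2] = 1.0 / dx                          # approximate delta source

u = np.zeros((8, N))
for _ in range(600):                          # source iteration
    S = sigma0 * (wq @ u) + q                 # scattering source + external source
    prev = np.zeros(pos.sum())                # sweep left to right, mu > 0
    for i in range(N):
        prev = (mu[pos] / dx * prev + S[i]) / (mu[pos] / dx + sigma0)
        u[pos, i] = prev
    prev = np.zeros(neg.sum())                # sweep right to left, mu < 0
    for i in range(N - 1, -1, -1):
        prev = (-mu[neg] / dx * prev + S[i]) / (-mu[neg] / dx + sigma0)
        u[neg, i] = prev

# Pure scattering: at convergence the net outflow balances the unit source.
outflow = wq[pos] @ (mu[pos] * u[pos, -1]) - wq[neg] @ (mu[neg] * u[neg, 0])
```

The upwind discretization is conservative, so the discrete balance (total outflow equals the injected unit source) holds exactly once the iteration has converged, which gives a convenient self-check.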
We now assume that we are able to measure the energy density ℰ_ε(x) of time-harmonic waves with frequency ω. Then obviously

  U(x) := ∫_{S^{n−1}} u(x, θ′) dν(θ′) = γ ℰ_ε(x) + n_ε(x)   (12.17)

is measured for x ∈ X, for instance, with X a bounded domain in R^n, for a scaling constant γ.
Exercise 12.1.4 Find the scaling constant γ.
Let us assume that the inclusions are small and that their influence can be suitably linearized. Further assuming that only one inclusion of strength α is present at x = x_0 to simplify, we thus have the model

  θ·∇_x u + σ_0 u = σ_0 ∫_{S^{n−1}} u(x, θ′) dν(θ′) + α δ(x − x_0),   (x, θ) ∈ X × S^{n−1},
  u(x, θ) = u_0(x, θ),   (x, θ) ∈ Γ_−(X),   (12.18)
and we have for instance a measurement operator for a prescribed incoming boundary condition u_0:

  M : (σ_0, x_0, α) ↦ M(σ_0, x_0, α)(x) = U(x) ∈ L¹(X).   (12.19)
The above simplified model of a measurement operator was used to image inclusions buried in highly heterogeneous media from experimental data in [12]. The kinetic model performed remarkably well. Even inclusions x_0 located behind a known blocker placed between x_0 and the array of detectors were reconstructed with very reasonable accuracy. Arguably, such reconstructions would be very difficult to achieve with an inverse problem based on the wave model for p_ε.
The accuracy of the reconstructions of (σ_0, x_0, α) in such a model depends on the structure of the noise n_ε(x) in (12.17). The derivation of asymptotic models for n_ε is a difficult task and very few theoretical or even formal results exist in this direction; see [15]. Note that the above derivation accurately describes n_ε as the difference between the solution of a deterministic transport equation and the energy density of waves propagating in a random medium. The statistics of n_ε can therefore a priori be computed, at least numerically, from those of V(x, y), at least when the latter are known. This provides an accurate physics-based model for the likelihood function in the Bayesian framework considered in Chapter 10.
12.1.5 Random media and Correlations
Let us conclude this section by a useful generalization of the kinetic model considered in (12.8). The salient feature of the above derivation is that the phase-space energy density W_ε(t, x, k) is statistically stable. The energy density may be seen as the correlation of a wave field with itself. More generally, we can consider correlations of two wave fields, which may or may not propagate in the same medium, or may or may not be generated by the same initial conditions. It turns out that such field-field correlations are also statistically stable.

Consider for instance two wave fields p_{ε,k} for k = 1, 2 propagating in two different media modeled by

  c²_{ε,k}(x) = c_0²(x) − √ε V_k(x, x/ε).   (12.20)
Define then the following field-field correlation function:

  𝒞_ε(t, x) = (1/2) [ (1/(c_{ε,1}(x) c_{ε,2}(x))) (∂p_{ε,1}/∂t)(∂p_{ε,2}/∂t)(t, x) + ∇_x p_{ε,1} · ∇_x p_{ε,2}(t, x) ].   (12.21)

When the two wave fields are identical, the correlation 𝒞_ε = ℰ_ε satisfies the decomposition (12.6), where a solves (12.8).
We could consider p_{ε,2} as having the same initial condition as p_{ε,1}, shifted in a given direction. Alternatively, we may consider that the fields p_{ε,k} have the same initial condition but that p_{ε,2} propagates in a medium that is shifted by that amount compared to the medium of p_{ε,1}. Since ε ≪ 1, this could be approximately (exactly when c_0 is constant) modeled by (12.20).
Let us define the following cross-correlation functions

  R_{jk}(x, y) = E{V_j(x, z) V_k(x, z + y)},   (12.22)

and let R̂_{jk}(x, k) be the Fourier transform of R_{jk}(x, y) with respect to the second variable only. Following [9], we obtain that (12.8) generalizes to
  lim_{ε→0} 𝒞_ε(t, x) = ∫_{R^n} a_{12}(t, x, k) dk,   (12.23)
where a_{12}(t, x, k) solves the following radiative transfer equation

  ∂a_{12}/∂t + ∇_k H · ∇_x a_{12} − ∇_x H · ∇_k a_{12} + (Σ(x, k) + iΠ(x, k)) a_{12}
      = ∫_{R^n} σ(x, k, q) a_{12}(t, x, q) δ(H(x, q) − H(x, k)) dq,   (12.24)
where we have defined the following kinetic parameters:

  Σ(x, k) = (π H²(x, k) / (2(2π)^n)) ∫_{R^n} ((R̂_{11} + R̂_{22})/2)(x, k − q) δ(H(x, k) − H(x, q)) dq,

  iΠ(x, k) = (i / (4(2π)^n)) p.v. ∫_{R^n} (R̂_{11} − R̂_{22})(x, k − q) (H(x, k) H(x, q) / (H(x, k) − H(x, q))) dq,

  σ(x, k, q) = (π H²(x, k) / (2(2π)^n)) R̂_{12}(x, k − q).   (12.25)
In other words, the field-field correlations also satisfy deterministic kinetic models asymptotically, which may then be used to infer properties about the underlying medium and the buried inclusions; see [11] for some applications of field-field correlations in imaging.
12.1.6 Imaging with waves in random media
The main heuristic conclusion we can draw from the derivation of this section is that the modeling of an inverse problem should aim to (i) find a space of parameters that is of reasonable size when compared to the quality of the available data; and (ii) find functionals of these parameters that are as immune as possible to all other parameters that were neglected in the derivation.

This is more easily said than done, of course, and often cannot be achieved satisfactorily. In some sense, the factorization methodology presented in Chapter 11 also aimed to find smaller parameter sets (such as the location of inclusions) that could uniquely be determined independently of neglected parameters (such as the quantitative values inside such inclusions).
In wave imaging in highly heterogeneous media, the proper model is to describe clutter as random and then to look for functionals of the wave fields that are independent of the realization of such clutter. Field-field correlations are such functionals. They have been successfully exploited to stabilize imaging functionals based on field measurements in a technique called Coherent Interferometry [24], as well as to image inclusions in very strong clutter [16].

The main limitation of the simplified model is encoded in the noise structure; see for instance n_ε in (12.17). In the next section, we consider the asymptotic analysis of such a noise in a simpler physical context.
12.2 Random fluctuations and Noise models
In Chapter 4, we encountered the nonlinear inverse coefficient problem for q(x) in (4.36), which was recast as the integral equation (4.46). In this chapter, we consider a simplified version of such an inverse problem of the form

  y = Ax + B x⊗x,   (12.26)

where y represents the available measurements and x the unknown coefficient. We assume that B is a bilinear map, for instance of the form

  (B x⊗x)(x) = ∫_0^x ∫_0^y b(x, y, z) x(y) x(z) dz dy,   (12.27)
to mimic the inverse problem in (4.46). As a first approximation to the same inverse problem, we could assume that A is the identity operator.

Let us now assume that measurements are not perfect, as is the case in practice. More precisely, let us assume that measurements are of the form

  y_ε = C_ε y,   C_ε y(x) = ∫_0^∞ φ_ε(x − y) y(y) dy.   (12.28)

In other words, we assume that C_ε is a convolution operator that blurs measurements while preserving some causality. We assume that φ_ε ≥ 0 is smooth, integrates to 1, and has a support of size ε, for instance.
As a consequence, the measurements y_ε do not allow us to stably reconstruct frequencies of order ε^{-1} or larger. Note that unique reconstructions are still possible when φ_ε is known since clearly ŷ_ε(ω) = φ̂_ε(ω) ŷ(ω), so that

  ŷ(ω) = ŷ_ε(ω) / φ̂_ε(ω).

However, for ω of order ε^{-1}, φ̂_ε(ω) is small and inevitable noise in y is therefore extremely amplified (the deconvolution problem is a severely ill-posed problem). As a consequence, typically only the low-frequency part of x can be reconstructed. Let us take C_ε x as a proxy for such low frequencies. Let us also make the simplifying assumption that C_ε and A commute, in other words, that A is itself a convolution, which is certainly the case for A = Id. Then we find that
  y_ε = A x_ε + C_ε B x⊗x,   x_ε = C_ε x.   (12.29)
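The amplification of frequencies of order ε^{-1} described above can be made concrete with a small experiment: naively dividing noisy blurred data by φ̂_ε in the Fourier domain destroys the reconstruction, while accepting the smoothed x_ε = C_ε x keeps the error at the level of the noise. The Gaussian kernel used here (not causal, unlike φ_ε in the text), the noise level, and the test signal are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, eps, delta = 1024, 0.01, 1e-6
t = np.arange(N) / N
omega = 2 * np.pi * np.fft.fftfreq(N, d=1.0 / N)   # integer modes times 2*pi
phat = np.exp(-(eps * omega) ** 2 / 2)             # Fourier symbol of C_eps

x = np.exp(-50 * (t - 0.5) ** 2) + 0.1 * np.sin(2 * np.pi * 40 * t)
x_eps = np.fft.ifft(phat * np.fft.fft(x)).real     # x_eps = C_eps x
y_obs = x_eps + delta * rng.standard_normal(N)     # blurred plus noisy data

# Naive deconvolution: divide by phat; noise at frequencies ~ 1/eps explodes.
x_naive = np.fft.ifft(np.fft.fft(y_obs) / phat).real

err_naive = np.max(np.abs(x_naive - x))            # enormous
err_proxy = np.max(np.abs(y_obs - x_eps))          # at the noise level delta
```

The high-frequency wiggle in x is already strongly damped in x_eps, illustrating the loss of resolution that accompanies the stable part of the reconstruction.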
This can be written as an equation for x_ε as follows:

  y_ε = A x_ε + C_ε B x_ε⊗x_ε + n_ε,   n_ε := C_ε B x⊗x − C_ε B x_ε⊗x_ε.   (12.30)
Noise and Modeling. We have thus slightly changed problems. After realizing that x could not be reconstructed explicitly, we switched to a problem for x_ε = C_ε x, which we hope can be reconstructed stably. However, by doing so, we do get a nonlinear equation for x_ε that involves a noise term n_ε. Here is where the modeling step becomes useful in the theoretical understanding of the above inverse problem.
We first realize that

  n_ε = C_ε B (x − x_ε)⊗x + C_ε B x⊗(x − x_ε) − C_ε B (x − x_ε)⊗(x − x_ε).   (12.31)
Let us assume that some theoretical arguments allow us to show that C_ε B (x − x_ε)⊗(x − x_ε) is negligible compared to the other two terms for ε ≪ 1, the regime of interest for the rest of this section.
Since z_ε := x − x_ε cannot be reconstructed, the only choice left is to model such a term as the realization of a random process. And because z_ε is defined at the scale ε, we define it as z_ε(x) = z(x/ε). Moreover, in the absence of information about potential bias, we assume that E{z(x)} = 0, where E denotes ensemble average over the randomness, and that z(x) is a stationary process (which means that the statistics of z(x) are not modified by shifts in x). Since it slightly simplifies the presentation, let us assume that b(x, y, z) = b(x − y) b(y − z) in (12.27), so that B appears as a reiterated convolution:

  B x⊗x = b ∗ (x (b ∗ x)).

Then we find that

  C_ε B (x − x_ε)⊗x = b_ε ∗ (z_ε (b ∗ x)),   b_ε = C_ε b.
In other words,

  C_ε B (x − x_ε)⊗x (x) = ∫_0^x b_ε(x − y) z(y/ε) (b ∗ x)(y) dy.   (12.32)
The above contribution to the error term has been recast as a stochastic integral with a highly oscillatory integrand. The limits of such integrals as ε → 0 are well understood. We find for instance that

  ε^{-1/2} ∫_0^x b_ε(x − y) z(y/ε) (b ∗ x)(y) dy  →  N(x) := σ ∫_0^x b(x − y) (b ∗ x)(y) dW(y)   in distribution as ε → 0,   (12.33)
where dW(y) is a one-dimensional Wiener measure (i.e., white noise).

Exercise 12.2.1 Prove the above formula. The analysis of such oscillatory integrals is for instance available in [10].
Here, σ is the strength of the random fluctuations, given explicitly by

  σ² = ∫_R E{z(0) z(x)} dx,   (12.34)
which we assume is positive and bounded.
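The scaling in (12.33)-(12.34) is easy to test by Monte Carlo. Below, z is taken piecewise constant on unit cells with i.i.d. standard normal values, so that σ² = ∫_R E{z(0)z(x)} dx = 1 (a random cell shift would make z strictly stationary, which does not affect the limit), and g is a smooth stand-in for the deterministic factor b(x − y)(b ∗ x)(y); all concrete choices are ours.

```python
import numpy as np

# Monte Carlo check of the sqrt(eps) scaling and the limiting variance in (12.33):
# Var( eps^{-1/2} * int_0^1 g(y) z(y/eps) dy ) -> sigma^2 * int_0^1 g(y)^2 dy.
rng = np.random.default_rng(2)
eps = 2e-3
M = 20000                                  # number of realizations
n_cells = int(1.0 / eps)                   # unit cells of z covering [0, 1/eps]
y_mid = (np.arange(n_cells) + 0.5) * eps   # midpoint quadrature for the y-integral
g = 1.0 + np.sin(3.0 * y_mid)              # stand-in for b(x-y)(b*x)(y)

xi = rng.standard_normal((M, n_cells))     # one row = one realization of z
# I = eps^{-1/2} * sum_j (eps * g_j * xi_j) = sqrt(eps) * sum_j g_j xi_j:
I = np.sqrt(eps) * (xi @ g)

var_limit = eps * np.sum(g ** 2)           # ~ sigma^2 * int_0^1 g^2(y) dy
```

Each realization of I is a centered Gaussian by construction, and the sample variance matches the limiting value predicted by (12.33)-(12.35) up to Monte Carlo error.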
We first observe that b_ε ∗ (z_ε (b ∗ x)) is of size √ε. This means that for fluctuations z(x) of order O(1), the contribution b_ε ∗ (z_ε (b ∗ x)) is asymptotically not of order O(1) but rather of order O(√ε).
Second, we note that the right-hand side N(x) of (12.33) is a centered Gaussian process. This means that, jointly, (N(x_1), ..., N(x_n)) for n ≥ 1 is a multi-dimensional Gaussian random variable. For instance, we obtain the following expression for the covariances:

  E{N(x_1) N(x_2)} = σ² ∫_0^{x_1 ∧ x_2} b(x_1 − y) b(x_2 − y) (b ∗ x)²(y) dy,   (12.35)
where x ∧ y = min(x, y). In other words, we have an estimate of the correlation of this noise contribution at different values of x.
Noise Model and Correlations. The noise in (12.31) has two leading contributions:

  n_ε = b_ε ∗ (z_ε (b ∗ x)) + b_ε ∗ (x (b ∗ z_ε)) + r_ε.

The second contribution may be written as

  b_ε ∗ (x (b ∗ z_ε))(x) = ∫_0^x ( ∫_y^x b_ε(x − z) x(z) b(z − y) dz ) z(y/ε) dy.
Assuming that the contribution r_ε is negligible as ε → 0, we obtain that

  ε^{-1/2} ( b_ε ∗ (z_ε (b ∗ x)) + b_ε ∗ (x (b ∗ z_ε)) )(x)  →  𝒩[x](x)   in distribution as ε → 0,

  𝒩[x](x) = σ ∫_0^x n[x](x, y) dW(y),

  n[x](x, y) = b(x − y) (b ∗ x)(y) + ∫_y^x b(x − z) x(z) b(z − y) dz.   (12.36)
The above convergence result is the main result of this section. We obtain that n_ε(x) is of order √ε. Moreover, we obtain a limiting description of the statistics of n_ε(x). What is interesting is that the limiting distribution 𝒩[x] depends on the unknown coefficient x, which is close to x_ε.
Closed-form equation with noise model. To summarize the above derivation, the model for x in the presence of noise in the data is replaced by the following approximate model for x_ε = C_ε x:

  y_ε = A x_ε + C_ε B x_ε⊗x_ε + n_ε,   n_ε ≈ √ε 𝒩[x_ε].   (12.37)
By an application of the Gronwall lemma, it is not difficult to show that the above nonlinear equation is uniquely solvable.

Exercise 12.2.2 Verify this in detail.
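The Gronwall-type solvability can also be seen in practice: thanks to the Volterra structure of B, the fixed-point iteration x ← y − B x⊗x converges on a bounded interval. The discrete sketch below takes A = Id, drops C_ε for simplicity, and uses an illustrative kernel b, grid, and true coefficient.

```python
import numpy as np

# Picard iteration for y = x + B x⊗x with B x⊗x = b * (x (b * x)) (Volterra).
t = np.linspace(0.0, 1.0, 201)
dt = t[1] - t[0]

def causal_conv(u):
    # (b * u)(t_i) = int_0^{t_i} b(t_i - s) u(s) ds, left-rectangle quadrature.
    b = np.exp(-5.0 * t)                      # illustrative causal kernel
    return dt * np.array([np.dot(b[:i + 1][::-1], u[:i + 1])
                          for i in range(t.size)])

def B(u):
    return causal_conv(u * causal_conv(u))

x_true = 1.0 + 0.5 * np.sin(2 * np.pi * t)
y = x_true + B(x_true)                        # synthesize exact data

x = y.copy()                                  # Picard iteration x <- y - B x⊗x
for _ in range(40):
    x = y - B(x)
```

Because the same quadrature is used to build and invert the data, the iteration converges to x_true up to machine precision; the contraction factor reflects the smallness of the Volterra kernel on a bounded interval, the discrete counterpart of the Gronwall argument.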
What is particularly interesting in the above model is that the noise n_ε is a physics-based noise model whose statistics depend on the unknown parameter x_ε. In the above derivation, 𝒩[x_ε] is linear in the Wiener measure because we assumed that the data were quadratic in the unknown parameters. However, more general forms for 𝒩[x_ε] would appear if higher-order terms were kept in the definition of y = M(x), as would be required in the treatment of (4.46) for instance.
Here are the main conclusions we can draw from such an analysis. First, the correlation function of the measurements is very far from being diagonal:

  E{n_ε(x_1) n_ε(x_2)} ≈ ε σ² ∫_0^{x_1 ∧ x_2} n[x](x_1, y) n[x](x_2, y) dy.   (12.38)
Assuming that the measurements at different points (or times) x_j are uncorrelated, as is often done in practice when no a priori knowledge is available, generates larger errors than when a reasonable model for correlations is available. The above modeling provides a means to estimate such correlations in a physically justified manner. For instance, in the Bayesian setting with noise and prior modeled as correlated Gaussian fields, the maximum of the posterior distribution (which then coincides with its mean) is given by the solution of (10.51). The solution depends on the correlation matrix Γ, which is precisely what is provided by (12.38).
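Such a covariance matrix can be assembled concretely from (12.38) on a grid. The kernel b, coefficient x, and values of ε and σ below are illustrative; by construction the matrix is a Gram matrix, hence symmetric and positive semi-definite, and it strongly correlates nearby measurement points.

```python
import numpy as np

# Assemble Gamma_{jk} ~ E{n_eps(x_j) n_eps(x_k)} from (12.38) on a grid.
eps, sigma2 = 0.01, 1.0
t = np.linspace(0.0, 1.0, 81)
dt = t[1] - t[0]

def b(s):                        # illustrative causal kernel
    return np.where(s >= 0.0, np.exp(-5.0 * s), 0.0)

xv = 1.0 + 0.5 * t               # stand-in for the unknown coefficient x(.)
# (b * x)(y_j) by left-rectangle quadrature:
bx = dt * np.array([np.sum(b(y - t[:j + 1]) * xv[:j + 1]) for j, y in enumerate(t)])

# n[x](x_i, y_j) from (12.36), zero for y_j > x_i:
nmat = np.zeros((t.size, t.size))
for i, xi in enumerate(t):
    for j in range(i + 1):
        yj = t[j]
        inner = dt * np.sum(b(xi - t[j:i + 1]) * xv[j:i + 1] * b(t[j:i + 1] - yj))
        nmat[i, j] = b(xi - yj) * bx[j] + inner

Gamma = eps * sigma2 * dt * nmat @ nmat.T   # Gram matrix: symmetric, PSD
```

A matrix of this type can then serve as the noise covariance in the Bayesian or least-squares formulations of Chapter 10, in place of the diagonal covariances commonly assumed by default.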
Second, the statistics of the noise depend on the unknown parameter x. In most practical settings, the noise statistics are assumed to be independent of the unknown coefficients. We find that physics-based noise models provide examples where the statistics of the noise do depend on the unknown parameter x. This has to be accounted for in practical implementations of the minimization procedures or Bayesian formalisms presented in Chapter 10, for instance by assuming that Γ in (10.51) depends on x.

In the setting of a one-dimensional Sturm-Liouville problem, the gains in accuracy that one might expect from such an asymptotic model as presented above were quantified numerically in [17].
Bibliography
[1] M. J. Ablowitz and A. S. Fokas, Complex Variables, Cambridge University Press, 2000. 30, 31, 32
[2] G. Alessandrini, An identification problem for an elliptic equation in two variables, Ann. Mat. Pura Appl., 145 (1986), pp. 265–296. 135, 146
[3] G. Alessandrini and V. Nesi, Univalent σ-harmonic mappings, Arch. Rat. Mech. Anal., 158 (2001), pp. 155–171. 146
[4] H. Ammari, An Introduction to Mathematics of Emerging Biomedical Imaging, vol. 62 of Mathematics and Applications, Springer, New York, 2008. 149
[5] H. Ammari and H. Kang, Reconstruction of Small Inhomogeneities from Boundary Measurements, Lecture Notes in Mathematics, Springer, Berlin, 2004. 182
[6] E. V. Arbuzov, A. L. Bukhgeim, and S. G. Kazantsev, Two-dimensional tomography problems and the theory of A-analytic functions, Sib. Adv. Math., 8 (1998), pp. 1–20. 30, 31
[7] K. Astala and L. Päivärinta, Calderón's inverse conductivity problem in the plane, Annals Math., 163 (2006), pp. 265–279. 120
[8] G. Bal, Optical tomography for small volume absorbing inclusions, Inverse Problems, 19 (2003), pp. 371–386. 178, 182
[9] G. Bal, Kinetics of scalar wave fields in random media, Wave Motion, 43(2) (2005), pp. 132–157. 187, 189
[10] G. Bal, Central limits and homogenization in random media, Multiscale Model. Simul., 7(2) (2008), pp. 677–702. 192
[11] G. Bal, Inverse problems in random media: a kinetic approach, vol. 124 of J. Phys. Conf. Series, 2008, p. 012001. 190
[12] G. Bal, L. Carin, D. Liu, and K. Ren, Experimental validation of a transport-based imaging method in highly scattering environments, Inverse Problems, 23(6) (2007), pp. 2527–2539. 188
[13] G. Bal and A. Jollivet, Stability estimates in stationary inverse transport, Inverse Probl. Imaging, 2(4) (2008), pp. 427–454. 90, 93, 95, 96, 97
[14] G. Bal and A. Jollivet, Stability for time-dependent inverse transport, SIAM J. Math. Anal., 42(2) (2010), pp. 679–700. 97
[15] G. Bal, T. Komorowski, and L. Ryzhik, Kinetic limits for waves in random media, Kinetic Related Models, 3(4) (2010), pp. 529–644. 186, 187, 189
[16] G. Bal and O. Pinaud, Imaging using transport models for wave-wave correlations, Math. Models Methods Appl. Sci., 21(3) (2011), pp. 1071–1093. 190
[17] G. Bal and K. Ren, Physics-based models for measurement correlations. Application to an inverse Sturm-Liouville problem, Inverse Problems, 25 (2009), p. 055006. 194
[18] G. Bal and K. Ren, Multi-source quantitative PAT in diffusive regime, Inverse Problems, 27(7) (2011), p. 075003. 132, 149
[19] G. Bal, K. Ren, G. Uhlmann, and T. Zhou, Quantitative thermo-acoustics and related problems, Inverse Problems, 27(5) (2011), p. 055007. 149
[20] G. Bal and J. C. Schotland, Inverse Scattering and Acousto-Optics Imaging, Phys. Rev. Letters, 104 (2010), p. 043902. 137
[21] G. Bal and G. Uhlmann, Inverse diffusion theory for photoacoustics, Inverse Problems, 26(8) (2010), p. 085010. 135
[22] M. Bertero and P. Boccacci, Introduction to Inverse Problems in Imaging, IOP Publishing, Bristol, 1998. 4
[23] J. Boman, An example of nonuniqueness for a generalized Radon transform, J. Anal. Math., 61 (1993), pp. 395–401. 30
[24] L. Borcea, G. C. Papanicolaou, and C. Tsogka, Interferometric array imaging in clutter, Inverse Problems, 21 (2005), pp. 1419–1460. 190
[25] M. Briane, G. W. Milton, and V. Nesi, Change of sign of the corrector's determinant for homogenization in three-dimensional conductivity, Arch. Ration. Mech. Anal., 173(1) (2004), pp. 133–150. 146
[26] F. Cakoni and D. L. Colton, Qualitative methods in inverse scattering theory: an introduction, Springer Verlag, New York, 2006. 178
[27] Y. Capdeboscq, J. Fehrenbach, F. de Gournay, and O. Kavian, Imaging by modification: numerical reconstruction of local conductivities from corresponding power density measurements, SIAM J. Imaging Sciences, 2 (2009), pp. 1003–1030. 140
[28] K. Chadan, D. Colton, L. Päivärinta, and W. Rundell, An Introduction to Inverse Scattering and Inverse Spectral Problems, SIAM, Philadelphia, 1997. 66
[29] M. Choulli and P. Stefanov, An inverse boundary value problem for the stationary transport equation, Osaka J. Math., 36 (1999), pp. 87–104. 89, 90, 93, 96
[30] D. L. Colton and R. Kress, Inverse acoustic and electromagnetic scattering theory, Springer Verlag, Berlin, 1998. 66, 69, 70
[31] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic Publishers, Dordrecht, 1996. 152
[32] J. Feldman and G. Uhlmann, Inverse Problems, Available at http://www.math.ubc.ca/feldman/ibook/, 2004. 120
[33] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, Berlin, 1977. 137
[34] V. Girault and P.-A. Raviart, Finite Element Methods for Navier-Stokes Equations, Springer, Berlin, 1986. 174
[35] T. Goldstein and S. Osher, The split Bregman method for L1 regularized problems, SIAM Journal on Imaging Sciences, 2 (2009), pp. 323–343. 165
[36] P. Hähner, A periodic Faddeev-type solution operator, J. Diff. Eq., 128 (1996), pp. 300–308. 116
[37] R. Hardt, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, and N. Nadirashvili, Critical sets of solutions to elliptic equations, J. Differential Geom., 51 (1999), pp. 359–373. 129
[38] J. P. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems, Springer Verlag, New York, 2004. 4, 170
[39] A. Katchalov, Y. Kurylev, and M. Lassas, Inverse boundary spectral problems, Monographs and Surveys in Pure and Applied Mathematics, 123, Chapman & Hall/CRC, Boca Raton, FL, 2001. 66
[40] A. Kirsch, An Introduction to the Mathematical Theory of Inverse Problems, Springer-Verlag, New York, 1996. 152, 159, 162
[41] A. Kirsch, The factorization method for inverse problems, Oxford University Press, Oxford, 2008. 178
[42] R. Kohn and M. Vogelius, Determining conductivity by boundary measurements, Comm. Pure Appl. Math., 37(3) (1984), pp. 289–298. 111
[43] J. M. Lee, Riemannian Manifolds: An Introduction to Curvature, Springer, Berlin, 1997. 144
[44] J. R. McLaughlin, N. Zhang, and A. Manduca, Calculating tissue shear modulus and pressure by 2D log-elastographic methods, Inverse Problems, 26 (2010), pp. 085007, 25. 127
[45] R. G. Mukhometov, The reconstruction problem of a two-dimensional Riemannian metric, and integral geometry (Russian), Dokl. Akad. Nauk SSSR, 232 (1977), pp. 32–35. 39
[46] A. Nachman, Reconstruction from boundary measurements, Ann. Math., 128 (1988), pp. 531–577. 120
[47] F. Natterer, The mathematics of computerized tomography, Wiley, New York, 1986. 20, 159
[48] F. Natterer and F. Wübbeling, Mathematical Methods in Image Reconstruction, SIAM Monographs on Mathematical Modeling and Computation, Philadelphia, 2001. 20, 159
[49] R. G. Novikov, Multidimensional inverse spectral problem for the equation −Δψ + (v(x) − Eu(x))ψ = 0, Functional Anal. Appl., 22 (1988), pp. 263–272. 120
[50] R. G. Novikov, An inversion formula for the attenuated X-ray transformation, Ark. Math., 40 (2002), pp. 145–167 (Rapport de Recherche 00/053, Université de Nantes, Laboratoire de Mathématiques). 30, 31
[51] R. Potthast, Point sources and multipoles in inverse scattering theory, CRC Press, Boca Raton, FL, 2001. 178
[52] J. Radon, Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten, Berichte Sächsische Akademie der Wissenschaften, Leipzig, Math.-Phys. Kl., 69 (1917), pp. 262–267. 28
[53] L. Robbiano and J. Salazar, Dimension de Hausdorff et capacité des points singuliers d'une solution d'un opérateur elliptique (French. English summary) [Hausdorff dimension and capacity of the singular points of a solution of an elliptic operator], Bull. Sci. Math., 3 (1990), pp. 329–336. 129
[54] L. Ryzhik, G. C. Papanicolaou, and J. B. Keller, Transport equations for elastic and other waves in random media, Wave Motion, 24(4) (1996), pp. 327–370. 187
[55] M. Salo, Calderón problem, Lecture Notes, Spring 2008, Department of Mathematics and Statistics, University of Helsinki. 116, 120
[56] O. Scherzer, Handbook of Mathematical Methods in Imaging, Springer Verlag, New York, 2011. 149, 163, 165
[57] O. Scherzer, M. Grasmair, H. Grossauer, M. Haltmeier, and F. Lenzen, Variational Methods in Imaging, Springer Verlag, New York, 2009. 163, 165
[58] V. A. Sharafutdinov, Integral geometry of tensor fields, VSP, Utrecht, the Netherlands, 1994. 39
[59] E. Stein, Singular Integrals and Differentiability Properties of Functions, vol. 30 of Princeton Mathematical Series, Princeton University Press, Princeton, 1970. 116
[60] J. Sylvester and G. Uhlmann, A global uniqueness theorem for an inverse boundary value problem, Ann. of Math., 125(1) (1987), pp. 153–169. 8, 110, 116, 120
[61] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation, SIAM, Philadelphia, 2004. 170
[62] M. E. Taylor, Partial Differential Equations I, Springer Verlag, New York, 1997. 155
[63] A. V. Tikhonov and V. Y. Arsenin, Solutions of ill-posed problems, Wiley, New York, 1977. 4, 152
[64] O. Tretiak and C. Metz, The exponential Radon transform, SIAM J. Appl. Math., 39 (1980), pp. 341–354. 31
[65] G. Uhlmann, Calderón's problem and electrical impedance tomography, Inverse Problems, 25 (2009), p. 123011. 120
[66] C. R. Vogel, Computational Methods for Inverse Problems, Frontiers Appl. Math., SIAM, Philadelphia, 2002. 4
[67] J.-N. Wang, Stability estimates of an inverse problem for the stationary transport equation, Ann. Inst. Henri Poincaré, 70 (1999), pp. 473–495. 97