
Lecture Notes

Waves in Random Media

Guillaume Bal

January 9, 2006

Department of Applied Physics and Applied Mathematics, Columbia University, New York NY 10027; [email protected]
Contents

1 Wave equations and First-order hyperbolic systems
  1.1 Introduction
  1.2 Wave equations
    1.2.1 Acoustic waves
    1.2.2 Elastic and Electromagnetic waves
    1.2.3 Schrödinger equation
  1.3 First-order symmetric hyperbolic systems
    1.3.1 Case of constant coefficients
    1.3.2 Plane Wave solutions
    1.3.3 Case of spatially varying coefficients
    1.3.4 Finite speed of propagation
    1.3.5 Characteristic equation and dispersion relation

2 Homogenization Theory for the wave equation
  2.1 Effective medium theory in periodic media
    2.1.1 Multiple scale expansion
    2.1.2 Homogenized equations
    2.1.3 Energy estimates
  2.2 Multidimensional case and estimates of the effective propagation speed
    2.2.1 Effective density tensor in the case of small volume inclusions
    2.2.2 Effective density tensor in the case of small contrast
  2.3 Case of random media
  2.4 Variational formulation and effective parameter estimates
    2.4.1 Classical variational formulation
    2.4.2 Hashin-Shtrikman bounds
  2.5 Homogenization with boundaries and interfaces
    2.5.1 The time dependent case
    2.5.2 Plane wave reflection and transmission

3 Geometric Optics
  3.1 Introduction
  3.2 Second-order scalar equation
    3.2.1 High Frequency Regime
    3.2.2 Geometric Optics Expansion
  3.3 The Eikonal equation
  3.4 First-order hyperbolic systems

4 Random perturbations
  4.1 Statistical description of continuous random fields
  4.2 Regular Perturbation method
  4.3 Random Geometric Optics

5 Wigner Transforms
  5.1 Definition of the Wigner transform
  5.2 Convergence properties
  5.3 Equations for the Wigner transform
  5.4 Examples of Wigner transforms
  5.5 Semiclassical limit for Schrödinger equations

6 Radiative transfer equations
  6.1 Non-symmetric two-by-two system
  6.2 Structure of the random fluctuations
  6.3 Equation for the Wigner transform
  6.4 Multiple scale expansion
  6.5 Leading-order equation and dispersion relation
  6.6 First-order corrector
  6.7 Transport equation

7 Parabolic regime
  7.1 Derivation of the parabolic wave equation
  7.2 Wigner Transform and mixture of states
  7.3 Hypotheses on the randomness
  7.4 The Main result
  7.5 Proof of Theorem 7.4.1
    7.5.1 Convergence in expectation
    7.5.2 Convergence in probability
    7.5.3 Tightness of Pε
    7.5.4 Remarks

A Notes on Diffusion Markov Processes
  A.1 Markov Process and Infinitesimal Generator
    A.1.1 Definitions and Kolmogorov equations
    A.1.2 Homogeneous Markov Processes
    A.1.3 Ergodicity for homogeneous Markov processes
  A.2 Perturbation expansion and diffusion limit
  A.3 Remarks on stochastic integrals
  A.4 Diffusion Markov Process Limit

Introduction

This set of notes covers the material introduced in the course on Waves in Random Media
taught in the fall of 2005 at Columbia University. The first chapter covers fundamental as-
pects of wave equations and includes a (partial) theory for first-order hyperbolic systems of
equations. The second chapter concerns the effective medium theory of wave equations.
This corresponds to low frequency waves propagating in high frequency media. Chapter three
analyzes high frequency waves in low frequency media by the method of geometric optics.
Random media are introduced in Chapter four, which is devoted to perturbation analyses
of low frequency waves in low frequency random media. The rest of the notes concerns the
analysis of high frequency waves in high frequency media, which means that both the typical
wavelength and the typical correlation length of the underlying media are small compared to
the size of the system.
Chapter five introduces the main tool of (micro-local) analysis of high frequency waves
used in these notes, namely the Wigner transform. The semiclassical limit of quantum
waves, an example of high frequency waves in low frequency media, is analyzed within the
framework of Wigner transforms.
The Wigner transform, which offers a description of the wave energy density, at least
asymptotically, in the phase space, is used in Chapter six to derive radiative transfer equa-
tions from a two-by-two system of acoustic wave equations. The radiative transfer equations
provide a phase space description of the propagation of the acoustic energy density in random media characterized by a regime called the weak-coupling regime. Though it has solid foundations, the derivation of radiative transfer is formal (i.e., it is not justified rigorously mathematically).
Chapter seven addresses the derivation of radiative transfer equations for an approximation
of acoustic wave propagation: the paraxial (or parabolic) equation. In this approximation,
waves primarily propagate in a privileged direction. By modeling the randomness as a Markov
process in that direction of propagation, the equations of radiative transfer can be justified
rigorously in this simplified setting.

Chapter 1

Wave equations and First-order hyperbolic systems

1.1 Introduction
This chapter presents classical wave propagation models, including first-order hyperbolic sys-
tems of equations, the theory of which is briefly presented. References useful to this chapter
include [15, 21, 26].

1.2 Wave equations


1.2.1 Acoustic waves
First-order symmetric hyperbolic system. The linear system of acoustic wave equations for the pressure p(t, x) and the velocity field v(t, x) takes the form of the following first-order hyperbolic system:

ρ(x) ∂v/∂t + ∇p = 0,   t > 0, x ∈ R^d,
κ(x) ∂p/∂t + ∇·v = 0,   t > 0, x ∈ R^d,   (1.1)
p(0, x) = p0(x),   v(0, x) = v0(x),   x ∈ R^d,

where ρ(x) is the density and κ(x) the compressibility. Both quantities are assumed to be uniformly bounded from below by a positive constant. Our notation is that ∇ = (∂/∂x1, …, ∂/∂xd)^t and that ∇· is the negative of the adjoint operator for the usual L² inner product, i.e., ∇·u = Σ_{j=1}^d ∂uj/∂xj.

Acoustic energy conservation is characterized by the fact that

EB(t) = (1/2) ∫_{R^d} ( ρ(x)|v|²(t, x) + κ(x)p²(t, x) ) dx = EB(0).   (1.2)

Exercise 1.2.1 Derive this for sufficiently smooth solutions.

We now know that total energy is conserved. The role of a kinetic model is to describe its
spatial distribution (at least asymptotically). This course’s main objective is precisely the
derivation of such kinetic models.

Scalar wave equation. The pressure p(t, x) also solves the following closed-form scalar equation:

∂²p/∂t² = (1/κ(x)) ∇·( (1/ρ(x)) ∇p ),   t > 0, x ∈ R^d,
p(0, x) = g(x),   x ∈ R^d,   (1.3)
∂p/∂t(0, x) = h(x),   x ∈ R^d.

We verify that energy conservation takes the form

EH(t) = (1/2) ∫_{R^d} ( κ(x)(∂p/∂t)²(t, x) + |∇p|²(t, x)/ρ(x) ) dx = EH(0).   (1.4)
Both conservation laws (1.2) and (1.4) are equivalent.
Exercise 1.2.2 Prove this statement. Hint: define the pressure potential φ(t, x) as a solution to (1.3) and set (v, p) = (−ρ^{−1}∇φ, ∂t φ); show that (v, p) solves (1.1) and that EH[φ](t) = EB[v, p](t).

A non-symmetric two-by-two system. In this paragraph we assume to simplify that ρ(x) = ρ0 is constant and define the sound speed

c(x) = 1/√(ρ0 κ(x)).   (1.5)

Let us consider

q(t, x) = c^{−2}(x) ∂p/∂t(t, x).   (1.6)

Then u = (p, q)^t solves the following 2 × 2 system:

∂u/∂t + Au = 0,   t > 0, x ∈ R^d,
u(0, x) = (g(x), c^{−2}(x)h(x))^t,   x ∈ R^d,   (1.7)

where (rows separated by semicolons)

A = −( 0, c²(x) ; Δ, 0 ) = J Λ(x),   J = ( 0, −1 ; 1, 0 ),   Λ(x) = ( −Δ, 0 ; 0, c²(x) ).   (1.8)

Note that J is a skew-symmetric matrix (J^t = −J) and that Λ is a symmetric matrix-valued operator for the usual L² scalar product:

(u, v) = ∫_{R^d} v*(x)u(x) dx,   (1.9)

where v* is the complex-conjugate (for complex-valued vectors) transpose of v. For real-valued vectors, we find that v*(x)u(x) = u(x)·v(x).
Exercise 1.2.3 Show that Λ is a symmetric operator, i.e., show (by integration by parts, assuming sufficient regularity for u and v) that

(Λu, v) = (u, Λv).

Exercise 1.2.4 Write a two-by-two system for (p, pt). Here and below we use pt and ∂p/∂t interchangeably to denote the partial derivative with respect to the t variable.

Exercise 1.2.5 Show that the following quantity is conserved:

E(t) = (1/(2ρ0)) ∫_{R^d} u·Λu dx = E(0).

Relate this conserved quantity to those in (1.2) and (1.4).

1.2.2 Elastic and Electromagnetic waves
See [28].

1.2.3 Schrödinger equation


The Schrödinger equation takes the form

i ∂ψ/∂t + (1/2)Δψ − V(x)ψ = 0,   t > 0, x ∈ R^d,
ψ(0, x) = ψ0(x).   (1.10)

Here V(x) is a real-valued potential, which we assume for simplicity to have, e.g., compact support. We have the conservation of the number of particles:

N(t) = ∫_{R^d} |ψ(t, x)|² dx = ∫_{R^d} |ψ0(x)|² dx.   (1.11)
Exercise 1.2.6 Verify this for sufficiently smooth solutions of (1.10).
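The conservation law (1.11) is also easy to observe numerically. The sketch below integrates (1.10) in d = 1 with a split-step Fourier method; the grid, potential, initial data, and time step are arbitrary illustrative choices, not taken from these notes. Each factor in the splitting is a multiplication by a phase of modulus one, so the discrete mass is preserved up to round-off.

```python
import numpy as np

# Split-step Fourier integration of i psi_t + (1/2) psi_xx - V psi = 0, d = 1.
n, L = 256, 20.0
x = np.linspace(-L/2, L/2, n, endpoint=False)
k = 2*np.pi*np.fft.fftfreq(n, d=L/n)      # spectral wavenumbers
V = np.exp(-x**2)                          # smooth potential (illustrative)
psi = np.exp(-(x + 3.0)**2 + 2j*x)         # Gaussian wave packet
dt, nsteps = 0.01, 200

def mass(psi):
    return np.sum(np.abs(psi)**2) * (L/n)  # discrete version of N(t) in (1.11)

N0 = mass(psi)
for _ in range(nsteps):
    psi = psi * np.exp(-1j*V*dt/2)         # half-step of the potential
    psi = np.fft.ifft(np.exp(-1j*k**2*dt/2) * np.fft.fft(psi))  # kinetic step
    psi = psi * np.exp(-1j*V*dt/2)         # second half-step of the potential
print(abs(mass(psi) - N0) / N0)            # close to machine precision
```

The relative drift of N(t) stays at round-off level, which is a discrete analogue of the statement of Exercise 1.2.6.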

1.3 First-order symmetric hyperbolic systems


The equations of acoustics, electromagnetism, and elasticity can all be put in the framework of first-order symmetric hyperbolic systems.
Let u be an n-vector-valued function, in the sense that u(t, x) is a vector with n components for each t ≥ 0 and x ∈ R^d. We consider the equation

A0(x) ∂u/∂t + Aj(x) ∂u/∂xj = 0,   t > 0, x ∈ R^d,
u(0, x) = u0(x),   x ∈ R^d,   (1.12)

where we use the convention of summation over repeated indices (so that the left-hand side above involves a summation over 1 ≤ j ≤ d). We assume that A0(x) is a smooth (in x) positive-definite symmetric matrix for all x ∈ R^d and that the Aj(x), 1 ≤ j ≤ d, are smooth symmetric matrices for all x ∈ R^d.
Exercise 1.3.1 (i) Show that the equations of acoustics, electromagnetism, and elasticity can be put in the framework (1.12) with matrices Aj, 1 ≤ j ≤ d, that are moreover constant.
(ii) Assuming that the Aj, 1 ≤ j ≤ d, are constant, show that

d/dt (u(t, ·), u(t, ·))_{A0} = 0,   where (u, u)_{A0} = (A0 u, u).   (1.13)

Show that there is at most one solution to (1.12). Relate this to the energy conservation (1.2).
Since A0(x) is symmetric positive definite, it admits a square root A0^{1/2}(x), and we define the quantity v = A0^{1/2} u. We verify that v satisfies the system

∂v/∂t + Bj(x) ∂v/∂xj + B0(x)v = 0,   t > 0, x ∈ R^d,
v(0, x) = v0(x),   x ∈ R^d,   (1.14)

where we have defined the matrices

Bj = A0^{−1/2} Aj A0^{−1/2},   1 ≤ j ≤ d,   B0 = A0^{−1/2} Σ_{j=1}^d Aj ∂A0^{−1/2}/∂xj.   (1.15)

The matrices Bj for 1 ≤ j ≤ d are now symmetric matrices. Note that B0(x)v is a bounded operator (for instance in the L² sense). We can thus recast (1.12) into a system of the form (1.14), which is more commonly analyzed in the mathematical literature.

1.3.1 Case of constant coefficients


In the case of constant matrices Aj 0 ≤ j ≤ d, the above equation takes the form
∂u ∂u
+ Bj = 0, t > 0, x ∈ Rd
∂t ∂xj (1.16)
u(0, x) = u0 (x), x∈ Rd .

The result in Exercise 1.3.1 shows that the energy (u, u) is a conserved quantity in time. Let
us define the Fourier transform as
Z
Fx→k u(k) ≡ û(k) = e−ik·x u(x)dx, (1.17)
Rd

with inverse transformation the inverse Fourier transform


Z
−1 1
Fk→x û(x) ≡ u(x) = eik·x û(k)dk. (1.18)
(2π)d Rd
The Fourier transform is defined on vectors component by component. Using the classical
result
∂u
F[ ](k) = ikj û(k), (1.19)
∂xj
we deduce that (1.16) may formally be recast in the Fourier domain as
∂ û
+ iA(k)û = 0, û(0, k) = û0 (k), (1.20)
∂t
where we have defined the dispersion matrix
d
X
A(k) = kj Bj . (1.21)
j=1

As usual, the Fourier transform recasts constant-coefficient partial differential equations into ordinary differential equations, which can be solved to yield

û(t, k) = exp(−iA(k)t) û0(k),   k ∈ R^d.   (1.22)

We have thus constructed a solution to (1.16) of the form

u(t, x) = (2π)^{−d} ∫_{R^{2d}} e^{i(x−y)·k} e^{−iA(k)t} u0(y) dk dy.   (1.23)

It remains to show that the above integral makes sense, for instance in the L² setting, which we leave as an exercise:

Exercise 1.3.2 Show that u(t, x) is uniformly bounded in time in (L²(R^d))^n for an initial condition u0 ∈ (L²(R^d))^n. Hint: Since A(k) is symmetric, we can decompose it as

A(k) = Σ_{m=1}^n λm(k) bm(k) b*m(k),   (1.24)

for some real-valued eigenvalues λm(k) and for some eigenvectors bm(k). Show then the decomposition

e^{iA(k)t} = Σ_{m=1}^n e^{iλm(k)t} bm(k) b*m(k),   (1.25)

and deduce that

|û(t, k)| = |û0(k)|   (1.26)

for all k ∈ R^d (use the fact that the bm form an orthonormal basis of R^n). The latter equality shows that e^{iA(k)t} is a unitary operator, i.e., is norm-preserving. Conclude by using the Parseval relation

(2π)^d ∫_{R^d} u(x)v*(x) dx = ∫_{R^d} û(k)v̂*(k) dk.   (1.27)

A similar result holds when A(k) has real-valued eigenvalues but is not necessarily symmetric. We refer to [15, §7.3.3] for the more involved derivation and for additional results on the regularity of the constructed solution u(t, x). Note that since the solution is unique within the set of functions uniformly bounded in the L² sense, (1.23) provides the unique solution to (1.16).
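The unitarity argument of Exercise 1.3.2 can be checked directly for a single Fourier mode. The sketch below builds a dispersion matrix A(k) as in (1.21) from arbitrary symmetric matrices Bj (illustrative choices with d = 2 and n = 3), forms exp(−iA(k)t) through the spectral decomposition (1.24)–(1.25), and verifies (1.26).

```python
import numpy as np

# One Fourier mode of a constant-coefficient symmetric hyperbolic system.
rng = np.random.default_rng(0)
B = [0.5*(M + M.T) for M in rng.standard_normal((2, 3, 3))]  # symmetric B_j
k = np.array([0.7, -1.3])
A = k[0]*B[0] + k[1]*B[1]                  # dispersion matrix A(k), eq. (1.21)

lam, b = np.linalg.eigh(A)                 # real eigenvalues, orthonormal b_m
t = 2.5
U = b @ np.diag(np.exp(-1j*lam*t)) @ b.T   # exp(-iA(k)t) via (1.25)
u0 = rng.standard_normal(3) + 1j*rng.standard_normal(3)
ut = U @ u0                                # u_hat(t,k) = exp(-iA(k)t) u_hat_0(k)
print(np.linalg.norm(ut) - np.linalg.norm(u0))   # ~ 0: propagation is unitary
```

The Euclidean norm of each Fourier mode is preserved exactly, which is the pointwise statement behind the L² bound of Exercise 1.3.2.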

1.3.2 Plane Wave solutions


Following the notation in [26], we denote by y = (y0, …, yd) = (t, x) the variable in R^{d+1}. Consider the slightly more general first-order hyperbolic system

L(D)u(y) = 0,   t > 0, x ∈ R^d,
L(D) = Σ_{m=0}^d Am ∂/∂ym,   (1.28)

and look for solutions of the form

u(y) = f(y·η),   f : R → C^n.   (1.29)

We compute that

L(D)f(y·η) = L(η)f′(y·η),   where L(η) = Σ_{m=0}^d ηm Am.

Here, L(η) is the symbol of the constant-coefficient operator L(D). Note the following equality in terms of operators:

L(D) = F^{−1}_{η→y} L(iη) F_{y→η}.

For a function f whose derivative never vanishes (such as, e.g., f(µ) = e^{iµ}a for some vector a ∈ C^n), we see that L(D)f(y·η) = 0 implies that L(η) is not invertible. This is equivalent to saying that

det L(η) = 0.   (1.30)

The above equation is called the characteristic equation of the operator L(D). More specifically, for η = (τ, k1, …, kd), we find that the characteristic equation is given by

det( A0 τ + Σ_{j=1}^d kj Aj ) = 0,   (1.31)

which is equivalent to saying that

−τ ∈ σ( A0^{−1} Σ_{j=1}^d kj Aj ) = σ( A0^{−1/2} Σ_{j=1}^d kj Aj A0^{−1/2} ).   (1.32)

Here σ(A) stands for the spectrum of the matrix A. For plane wave solutions of the form e^{iη·y}a, we observe that a is then in the kernel of the operator L(η), or equivalently that

A0 τ a + Σ_{j=1}^d kj Aj a = 0.   (1.33)

For each τ defined in (1.32) we thus obtain an eigenvector A0^{1/2}a of the matrix A0^{−1/2} kj Aj A0^{−1/2}. As we shall see later, the largest of these eigenvalues τ determines the fastest speed of propagation in the system. The associated plane waves are those propagating with the fastest speed.
More specifically, let us consider plane waves of the form e^{i(τt+k·x)}a with k real-valued (which corresponds to bounded, “physical” plane waves). Then the characteristic equation takes the form

det L(iτ, ik) = 0.   (1.34)

For fixed k, the n solutions τj(k) define the dispersion relations of the system of equations. The associated eigenvectors bj are the propagating plane waves. For conservative equations (i.e., when some energy is conserved), the roots τj(k) are real-valued, so that the initial condition e^{ik·x}a is simply translated:

u(t, x) = e^{i(τt+k·x)}a = u0(x + tv),   where e.g. v = (τ/|k|) k̂,   k̂ = k/|k|.   (1.35)

The phase velocity v is defined up to the addition of a vector orthogonal to k, as is easily verified. When a root τj(k) is a multiple eigenvalue, the different eigenvectors are referred to as the polarization modes associated to that specific eigenvalue.

Exercise 1.3.3 Relate the above construction to the derivation in Section 1.3.1, in particular (1.20) and (1.24).

Exercise 1.3.4 (i) Find the characteristic equation, the dispersion relations, and the corresponding eigenvectors for the system of acoustic equations (1.1).
(ii) Do the same exercise for Maxwell’s equations and show that there are two modes of polarization associated to each propagating mode (such that τj ≠ 0) in space dimension d = 3. (You may verify that there are d(d − 1)/2 modes of polarization in each dimension d ≥ 2; see e.g. [3, §8].)
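For part (i) of Exercise 1.3.4, the dispersion relations can also be obtained numerically. The sketch below assembles Σ_j kj Aj for the acoustic system (1.1) with constant coefficients (the values of ρ, κ, and k are arbitrary) and computes the eigenvalues −τ of A0^{−1} Σ_j kj Aj as in (1.32): two propagating modes τ = ±c|k| and d − 1 modes with τ = 0.

```python
import numpy as np

# Dispersion relation of the acoustic system (1.1) in dimension d = 3.
# State vector (v_1, v_2, v_3, p); the matrix A_j couples v_j and p.
d, rho, kappa = 3, 2.0, 0.5
A0 = np.diag([rho]*d + [kappa])
k = np.array([1.0, -2.0, 0.5])
Ak = np.zeros((d+1, d+1))
for j in range(d):
    Ak[j, d] += k[j]          # grad p term in the v_j equation
    Ak[d, j] += k[j]          # div v term in the p equation
taus = np.sort(-np.linalg.eigvals(np.linalg.inv(A0) @ Ak).real)
c = 1/np.sqrt(rho*kappa)      # sound speed, cf. (1.5)
print(taus)                   # two modes tau = -c|k|, c|k| and d-1 zero modes
print(c*np.linalg.norm(k))
```

The zero eigenvalue of multiplicity d − 1 corresponds to the non-propagating (vortical) modes of acoustics; the two nonzero roots give the sound cone τ = ±c|k|.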

1.3.3 Case of spatially varying coefficients


Still denoting by y = (y0, …, yd) = (t, x) the variable in R^{d+1}, we define the first-order operator

L(y, D) = Σ_{m=0}^d Am(y) ∂/∂ym + B(y) = A0(y) ∂/∂t + G(y, D).   (1.36)

The above operator is called symmetric hyperbolic if the matrices Am(y) are smooth symmetric matrices (which are allowed to depend on the temporal variable as well), B(y) is a smooth matrix, and A0(y) is in addition uniformly positive definite: A0(y) ≥ α0 I for some α0 > 0 independent of y. The first-order system of equations associated to L is thus

L(y, D)u = 0,   t > 0, x ∈ R^d,
u(0, x) = u0(x),   x ∈ R^d.   (1.37)

When Am is constant for m ≥ 1 and B = 0, we have seen that (A0 u, u) is conserved when
u is a solution of Lu = 0.

Exercise 1.3.5 Show that the same energy is conserved when B is such that B + B ∗ = 0.

Energy conservation no longer necessarily holds in more general situations. However, a very useful energy estimate still holds.
Let G(t) be an operator acting on (L²(R^d))^n such that G(t) + G*(t) is uniformly bounded in time:

‖G(t) + G*(t)‖ ≤ 2C.   (1.38)

We recall that G*(t) is defined as the operator such that for all u and v in (L²(R^d))^n we have

(Gu, v) = (u, G*v).   (1.39)

Let us consider the problem

A0(y) ∂u/∂t + G(t)u = 0,   t > 0,   (1.40)

with some initial conditions u(0, x) = u0(x). Then we observe that

‖u(t)‖_{A0} ≤ e^{ct/α0} ‖u0‖_{A0}   (1.41)

for some constant c, where the above norm is defined such that ‖u‖²_{A0} = (A0 u, u) = (u, u)_{A0}.

Proof. Let us first recast (1.40) as

∂/∂t (A0(y)u) + G̃(t)u = 0,   t > 0,   (1.42)

where G̃ = G − ∂t A0(y) clearly satisfies (1.38) with the constant C replaced by another constant c.
Upon multiplying (1.42) by u(t, x), integrating over R^d, and performing a few integrations by parts, we get

0 = d/dt (A0 u, u) + 2(G̃(t)u, u) = d/dt (A0 u, u) + ((G̃(t) + G̃*(t))u, u),

so that

d/dt (u, u)_{A0} ≤ 2c(u, u) ≤ (2c/α0) (u, u)_{A0}.

Now we deduce from u′ ≤ hu with u(t) ≥ 0 that u(t) ≤ e^{ht}u(0). This concludes the proof.
This result may be generalized to more complicated equations [26]. The a priori bound (1.41) shows that the solution to (1.37) is unique since L is a linear operator. Indeed, we verify that G + G* is bounded for G defined in (1.36).

Exercise 1.3.6 Calculate G* using (1.39) and verify that G + G* is indeed bounded for G defined in (1.36). Note that both G and G* are first-order differential operators and thus are not bounded on (L²(R^d))^n.

The same type of proof, using Gronwall’s inequality, provides the following a priori estimate:

Theorem 1.3.1 For every s ∈ R, there is a constant C such that for every sufficiently smooth function u(t, x) with compact support in space, uniformly on 0 ≤ t ≤ T for all T > 0, we have the a priori estimate

‖u(t)‖_{H^s(R^d)} ≤ Ce^{Ct} ‖u(0)‖_{H^s(R^d)} + ∫_0^t Ce^{C(t−σ)} ‖Lu(σ)‖_{H^s(R^d)} dσ.   (1.43)

Exercise 1.3.7 Prove the above theorem for s = 0 using a modification of the above result and Gronwall’s inequality (see below). Next, prove the result for all integer s. The result then holds for all s ∈ R by interpolation.
We recall that Gronwall’s inequality states the following:

Lemma 1.3.2 (Gronwall’s inequality) (i) Let u(t) be a non-negative, continuous function on [0, T] and ϕ(t) and φ(t) non-negative integrable functions on [0, T] such that the following inequality holds:

u̇(t) ≤ ϕ(t)u(t) + φ(t).   (1.44)

Then

u(t) ≤ e^{∫_0^t ϕ(s)ds} ( u(0) + ∫_0^t φ(s)ds ),   for all 0 ≤ t ≤ T.   (1.45)

(ii) In integral form, let ξ(t) be a non-negative, integrable function on [0, T] such that for constants C1 and C2,

ξ(t) ≤ C1 ∫_0^t ξ(s)ds + C2.   (1.46)

Then for 0 ≤ t ≤ T, we have

ξ(t) ≤ C2 (1 + C1 t e^{C1 t}).   (1.47)
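A quick numerical sanity check of part (i), with constant coefficients ϕ(t) = a and φ(t) = b (arbitrary positive values): the exact solution of u̇ = au + b indeed stays below the bound (1.45).

```python
import numpy as np

# Gronwall's inequality (1.44)-(1.45) for constant coefficients a, b.
a, b, u0, T = 0.8, 0.3, 1.0, 2.0
t = np.linspace(0.0, T, 2001)
u = u0*np.exp(a*t) + (b/a)*(np.exp(a*t) - 1.0)   # exact solution of u' = a u + b
bound = np.exp(a*t) * (u0 + b*t)                 # right-hand side of (1.45)
print(bool(np.all(u <= bound + 1e-12)))          # True
```

The inequality e^{at} − 1 ≤ at e^{at} for at ≥ 0 is exactly what makes the exact solution sit below the Gronwall bound here, with equality at t = 0.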
The same type of bound is also useful in showing the existence of a solution to (1.37). Since the explicit construction in the Fourier domain is no longer available, the existence result (due to Friedrichs for general first-order hyperbolic systems) is technically more involved. The main idea is to construct approximate solutions and then use a priori estimates of the form (1.41) to pass to the limit and obtain a solution to (1.37). We refer to [15, §7.3.2] for a construction based on the vanishing viscosity method and the existence of solutions for the heat equation; to [26, §2.2] for a construction based on finite discretizations of the hyperbolic system; and to [21] for a construction based on the density of functions of the form Lu = 0 (which uses in a crucial way the property of finite speed of propagation that we now consider).

1.3.4 Finite speed of propagation


A very important property of first-order hyperbolic systems is that information propagates at finite speed. Let us recast the first-order hyperbolic system as

0 = Σ_{m=0}^d Am ∂u/∂ym + Bu,
0 = Σ_{m=0}^d ∂u*/∂ym Am + u*B*,
using the symmetry of Am, m ≥ 0. Upon multiplying the first equation on the left by u* and the second equation on the right by u and summing the results, we obtain that

0 = Σ_{m=0}^d ∂/∂ym (u* Am u) + u* Zu,   Z = B + B* − Σ_{m=0}^d ∂Am/∂ym.   (1.48)

Note that Z is a bounded operator and that the first term is written in divergence form.
Let y ∈ R^{d+1} and k ∈ R^d such that |k| = 1. We define

r(y, k) = ρ( A0^{−1}(y) Aj(y) kj ) = ρ( A0^{−1/2}(y) Aj(y) kj A0^{−1/2}(y) ),   (1.49)

where ρ(A) denotes the spectral radius (here the largest eigenvalue in modulus). Verify that both spectral radii indeed agree. Now the maximal speed of propagation c is defined as

c = sup_{y ∈ R^{d+1}, |k|=1} r(y, k).   (1.50)

The above definition implies that kj Aj ≤ cA0 as symmetric matrices for |k| = 1, so that for arbitrary vectors η ∈ R^d we have |η|cA0 + ηj Aj ≥ 0. This implies that c is the smallest number with the property that, writing η = (η0, β) ∈ R × R^d,

η0 ≥ c|β|  ⇒  Σ_{m=0}^d ηm Am ≥ 0.

You may want to check this assertion carefully. This shows that if “enough” of the positive operator A0 is added to the operator Σ_{j=1}^d kj Aj, then we obtain a positive operator as well. This has the following interesting consequence.
Let a ∈ R^d and R > 0. We define the (truncated) cone Ω(a, R) as

Ω(a, R) = {(t, x) s.t. 0 ≤ t ≤ R/c and |x − a| < R − ct}.   (1.51)

Here c is the maximal speed defined in (1.50). We define the sections Ω(a, R, t) as

Ω(t) ≡ Ω(a, R, t) = {x s.t. (t, x) ∈ Ω(a, R)}.   (1.52)

Then we have the following result, first proved by Friedrichs:

Theorem 1.3.3 Suppose that u is a smooth solution of the equation

Lu(y) = 0,   (1.53)

such that u(0, x) = 0 in the ball {|x − a| < R}. Then u = 0 in Ω(a, R).
Proof. For 0 ≤ t ≤ R/c, define

ϕ(t) = ∫_{Ω(t)} u*(y) A0(y) u(y) dx.

In particular, ϕ(0) = 0 thanks to the initial condition. Recall that

0 = Σ_{m=0}^d ∂/∂ym (u* Am u) + u* Zu.

Let us integrate the above equation over Ωt = Ω(a, R) ∩ {0 < s < t}; we thus integrate over a truncated cone. Because A0 is uniformly positive and Z is a bounded operator, we find that

ϕ(t) − ϕ(0) + ∫_0^t ∫_{|x−a|=R−cs} u* ( Σ_{m=0}^d ηm Am ) u dσ(y) ≤ C ∫_0^t ϕ(s) ds.

Here, η = (η0, …, ηd) ∈ R^{d+1} is the outward normal to the truncated cone at times 0 < s < t.

Exercise 1.3.8 Verify the above formula in detail. In particular, show that the boundary of Ωt is composed of the lateral surface {|x − a| = R − cs, 0 < s < t} and of Ω(t) and Ω(0); then apply the divergence theorem on Ωt to find the result.

The speed c is defined exactly so that Σ_{m=0}^d ηm Am is a positive operator. We thus obtain that

ϕ(t) ≤ C ∫_0^t ϕ(s) ds.

We conclude by using Gronwall’s lemma that ϕ(t) ≡ 0, which implies that u = 0 on Ω(a, R).

The same type of proof provides the following important local estimate:

Theorem 1.3.4 For all a ∈ R^d, R > 0, and T > 0 such that 0 ≤ T ≤ R/c, there is a constant C = C(L) such that for all sufficiently smooth functions u(t, x), we have

‖u(t)‖_{L²(Ω(t))} ≤ C ( ‖u(0)‖_{L²(Ω(0))} + ∫_0^t ‖Lu(s)‖_{L²(Ω(s))} ds ).   (1.54)

Exercise 1.3.9 Prove the above theorem and relate it to Theorem 1.3.1.
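Finite speed of propagation is easy to observe numerically. A minimal sketch for ptt = c²pxx in d = 1 (the grid parameters are arbitrary choices): the leapfrog scheme with cΔt/Δx = 1 propagates information exactly one grid cell per step, so data supported in {|x| < 0.5} remains identically zero outside the cone |x| < 0.5 + ct of Theorem 1.3.3.

```python
import numpy as np

# Leapfrog scheme for p_tt = c^2 p_xx with c*dt/dx = 1 (exact in d = 1).
c, L, n = 1.0, 10.0, 1000
dx = L/n
dt = dx/c
x = np.linspace(-L/2, L/2, n, endpoint=False)
p = np.where(np.abs(x) < 0.5, np.cos(np.pi*x)**2, 0.0)  # data in {|x| < 0.5}
pprev = p.copy()                                        # zero initial velocity
nsteps = 300
for _ in range(nsteps):
    # p^{n+1}_j = p^n_{j+1} + p^n_{j-1} - p^{n-1}_j at unit CFL number
    p, pprev = np.roll(p, 1) + np.roll(p, -1) - pprev, p
t = nsteps*dt
outside = np.abs(x) > 0.5 + c*t + 2*dx     # strictly outside the cone (1.51)
print(np.max(np.abs(p[outside])))          # 0.0: the solution vanishes there
```

Every grid value outside the discrete domain of dependence is computed from zeros only, so the vanishing outside the cone is exact, not merely small.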

1.3.5 Characteristic equation and dispersion relation


The characteristic equation of first-order operators with constant coefficients was introduced in (1.30). We now generalize this notion to spatially varying operators. The main idea is the following. Consider operators that vary smoothly and “not too fast” in space, and let them act on sufficiently highly oscillatory functions. Then, at the scale of the fast oscillations, one may as a first approximation assume that the spatial coefficients of the differential operator are frozen (i.e., constant). It is the whole purpose of geometric optics to make this statement more precise. In any event, this justifies the introduction of the following notion.
Let L be a first-order (to simplify) operator of the form

L(y, D) = Σ_{m=0}^d Am(y) ∂/∂ym + B(y) = L1(y, D) + B(y).   (1.55)

Here L1 is thus the “leading-order” operator in L, which accounts for all highest-order derivatives (of order one here).

Definition 1.3.5 The characteristic variety of L, denoted by Char L, is the set of pairs (y, η) ∈ R^{d+1} × (R^{d+1}\{0}) such that

det L1(y, η) = 0.   (1.56)

Here again, L1(y, η) is the symbol of L1(y, D), in which each derivative in ym, 0 ≤ m ≤ d, is replaced by iηm. For sufficiently high frequencies, L1(y, η) has a much greater effect than B(y, η) (the symbol of B), since the former is linear in iηm whereas the latter is bounded independently of iηm. It is therefore useful to consider the dispersion relation associated to L1(y, η).
For a “macroscopic” y frozen, we look for plane wave solutions of the form e^{i(τt+k·x)}a with k real-valued. Neglecting variations in y of the coefficients of L and zeroth-order terms, such solutions approximately satisfy the equation

det L(y, iτ, ik) = 0.   (1.57)

The roots τj(y, k) define the dispersion relations of the equation Lu = 0. They are associated with eigenmodes bj(y, k), which generalize the construction obtained in Section 1.3.1 for constant coefficients.
A useful normalization, adopted in [28], for the eigenvectors is as follows. The dispersion relation may be recast as

( τ + A0^{−1}(y) kj Aj(y) ) b(y, k) = 0,   (1.58)

so that −τ is an eigenvalue of the dispersion matrix

M(y, k) = A0^{−1}(y) kj Aj(y).   (1.59)

Since M is not symmetric, we denote by bj(y, k) its right eigenvector associated to the eigenvalue −τj(y, k) and by cj(y, k) the corresponding left eigenvectors, so that c*j M = −τj c*j. We normalize the eigenvectors as follows:

cj(y, k) = A0 bj(y, k),   c*i(y, k) bj(y, k) = δij.   (1.60)

This allows us to recast the dispersion matrix as

M(y, k) = − Σ_{j=1}^n τj(y, k) bj(y, k) c*j(y, k).   (1.61)

Note that bj(y, k) c*j(y, k) is an n × n matrix whereas c*i(y, k) bj(y, k) is a real number.

Exercise 1.3.10 (i) Work out the dispersion relation for the system of acoustic equations (1.1) and the dispersion matrix, and calculate the corresponding eigenvalues and eigenvectors.
(ii) Same problem for Maxwell’s equations. The solution can be found in [28].
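The normalization (1.60) and the representation (1.61) can be verified numerically. The sketch below does so for the acoustic dispersion matrix (1.59) in d = 1 with constant, arbitrarily chosen ρ and κ, for which the two eigenvalues ∓ck are distinct and the eigenvectors are A0-orthogonal.

```python
import numpy as np

# Acoustic dispersion matrix M = A0^{-1} k A1 in d = 1, cf. (1.59).
rho, kappa, kk = 2.0, 0.5, 1.3
A0 = np.diag([rho, kappa])
M = np.linalg.inv(A0) @ (kk * np.array([[0.0, 1.0], [1.0, 0.0]]))

mu, b = np.linalg.eig(M)                   # columns of b: right eigenvectors
tau = -mu                                  # the eigenvalues of M are -tau_j
norms = np.einsum('ij,jk,ki->i', b.T, A0, b)
b = b / np.sqrt(norms)                     # enforce b_j^* A0 b_j = 1
cvecs = A0 @ b                             # left eigenvectors c_j = A0 b_j, (1.60)
print(np.allclose(cvecs.T @ b, np.eye(2)))            # True: c_i^* b_j = delta_ij
Mrec = -sum(tau[j]*np.outer(b[:, j], cvecs[:, j]) for j in range(2))
print(np.allclose(M, Mrec))                           # True: eq. (1.61)
```

The same check goes through in higher dimensions, with the caveat that for the repeated eigenvalue τ = 0 a generic eigenvalue solver does not automatically return an A0-orthogonal basis of the eigenspace.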

Chapter 2

Homogenization Theory for the wave equation

This chapter comes from earlier notes and contains more material than will be covered in the course.

2.1 Effective medium theory in periodic media


We are interested in approximate solutions of the wave equation when the physical coeffi-
cients vary on a fast spatial scale. A natural question is whether we can replace the rapidly
varying coefficients by homogeneous coefficients. One case where this can be done is when
the wavelength is large compared to the spatial oscillations of the physical coefficients. The
approximation is then given by the effective medium, or homogenization, theory.
This theory is valid for distances of propagation on the order of the wavelength, and is
therefore very useful in the analysis of standing wave problems in confined regions but cannot
account for the radiative transport that will be taken on in subsequent chapters.
To present the homogenization theory of waves, we start with the simple problem of acous-
tic waves in layered media. We introduce a small adimensionalized parameter ε > 0, which is
the ratio between the characteristic length scale of the physical coefficient variations and the
wavelength. The acoustic equations then take the form

    ρ(x/ε) ∂uε(t,x)/∂t + ∇pε(t,x) = F(t,x),    (2.1)
    κ(x/ε) ∂pε(t,x)/∂t + ∇·uε(t,x) = 0,    (2.2)
with vanishing initial conditions and with a volume source term F that we assume is smooth
and independent of ε. We have defined the coordinates x = (x1 , x2 , z) and the differential
operator ∇ = (∂/∂x1, ∂/∂x2, ∂/∂z)^t. In layered media, the coefficients ρ and κ only depend on the third variable: ρ(x/ε) = ρ(z/ε) and κ(x/ε) = κ(z/ε).
We want to derive the asymptotic behavior of uε and pε as ε → 0. It will be simpler to analyze the second-order wave equation for the pressure pε,

    Lε pε(t,x) ≡ κ(x/ε) ∂²pε(t,x)/∂t² − ∇·( ∇pε(t,x) / ρ(x/ε) ) = −∇·( F(t,x) / ρ(x/ε) ),    (2.3)
with vanishing initial conditions, obtained by differentiating (2.2) in time and (2.1) in space,
and eliminating uε .

2.1.1 Multiple scale expansion
The asymptotic behavior of pε is now obtained by using the theory of multiple scale expansions.
Let us assume to simplify that ρ and κ are periodic functions of period 1. The solution of the
wave equation will then sample these periodic oscillations and itself have periodic variations
at the characteristic length ε. At the same time, the source term F, which is a non-periodic
smooth function in x, will generate variations of the velocity and pressure at the larger scale
of order O(1).
The first basic assumption justifying the multiple scale expansion is that these two scales
separate in the following sense. We suppose that pε has approximately the form

    pε(t,x) ≈ pε(t, x, x/ε),    (2.4)

where pε(t,x,y) is periodic (of period 1) with respect to the last variable y ∈ Y = (0,1)³ and a smooth function with respect to all variables. We define the fast variable y = x/ε and denote by y its third component. In this multiple scale form, the spatial gradient acts on both the slow and fast variables: if pε(x) = p(x, x/ε), then by the chain rule

    ∇pε(x) = ( ∇x p(x,y) + (1/ε) ∇y p(x,y) )|_{y = x/ε}.    (2.5)

Assuming that the fast scale y = x/ε and the slow scale x separate, we recast (2.3) in the two-scale framework as

    κ(y) ∂²pε/∂t² − (1/ε²) ∇y·( ∇y pε / ρ(y) ) − (1/ε) [ ∇x·( ∇y pε / ρ(y) ) + ∇y·( ∇x pε / ρ(y) ) ] − ∇x·( ∇x pε / ρ(y) )
        = −(1/ε) ∇y·( F(t,x) / ρ(y) ) − ∇x·( F(t,x) / ρ(y) ).    (2.6)
Multiplying through by ε², we recast the above equation as

    Lε pε ≡ (L0 + εL1 + ε²L2) pε = εS0 + ε²S1,    (2.7)

where we have defined the operators

    L0 = −∇y·( (1/ρ(y)) ∇y ),    (2.8)
    L1 = −∇y·( (1/ρ(y)) ∇x ) − ∇x·( (1/ρ(y)) ∇y ),    (2.9)
    L2 = κ(y) ∂²/∂t² − (1/ρ(y)) Δx.    (2.10)

The source terms are S0(t,x,y) = −∇y·( ρ⁻¹(y) F(t,x) ) and S1(t,x,y) = −∇x·( ρ⁻¹(y) F(t,x) ).
The first assumption concerned the two-scale separation of the wave operator Lε and the
expansion of the two-scale operator Lε . The second assumption is that the two-scale quantity
pε(t,x,y) can also be expanded in power series of ε, so that

    pε(t,x) = p0(t, x, x/ε) + ε p1(t, x, x/ε) + ε² p2(t, x, x/ε) + ...,    (2.11)

where the pi are 1-periodic with respect to the third variable. The equations for the successive terms p0, p1, and p2 are obtained by plugging (2.11) into the wave equation (2.3) and equating like powers of ε. The term of order ε⁻² yields

    L0 p0(t,x,y) ≡ −∇y·( (1/ρ(y)) ∇y p0 ) = 0.    (2.12)

Since p0 is periodic in y, the above equation implies that p0 is independent of y: p0 = p0 (t, x).
The next-order equation is

    (L0 p1 + L1 p0 − S0)(t,x,y) ≡ −∇y·( (1/ρ(y)) ( ∇y p1 + ∇x p0 − F ) ) = 0.    (2.13)

By linearity, the solution p1 satisfies

    p1(t,x,y) = θ(y) · (∇x p0 − F)(t,x),    (2.14)

where the vector function θ(y) = (θ1(y), θ2(y), θ3(y)) solves the cell problems

    −∇y·( (1/ρ(y)) ( ∇y θi + ei ) ) = 0,  i = 1, 2, 3,
    y ↦ θi(y) is 1-periodic.    (2.15)

It is a classical result in elliptic partial differential equations that the above equations admit
unique solutions up to the addition of constants.
In the case of layered media, these equations can be solved exactly. Indeed, ρ(y) depends only on the third component y of y, hence ∇y·( ρ⁻¹(y) ei ) = (ρ⁻¹)′(y) δi3, where the Kronecker symbol δij = 1 if i = j and 0 otherwise. We then readily deduce that θ1 and θ2 are constant on the periodicity cell (0,1)³. Moreover, θ3 depends only on y by symmetry and solves

    d/dy [ (1/ρ(y)) ( dθ3(y)/dy + 1 ) ] = 0.

Upon integrating this equation, we get dθ3/dy + 1 = Cρ(y), where C is a constant. Since θ3 is periodic, we deduce that C = ⟨ρ⟩⁻¹, where ⟨·⟩ denotes averaging over the cell Y. Hence we have

    (1/ρ(y)) ( dθ3/dy + 1 ) = 1/⟨ρ⟩,    ∇y θ1(y) = ∇y θ2(y) = 0.    (2.16)

These equations determine θ explicitly up to an additive constant vector, which we may choose to vanish.
The equation of order 0 is given by

    (L0 p2 + L1 p1 + L2 p0 − S1)(t,x,y) = 0,    (2.17)

which may be recast more explicitly as

    κ(y) ∂²p0/∂t² − ∇y·( (1/ρ) ∇y p2 ) − ∇x·( (1/ρ) ∇y p1 ) − ∇y·( (1/ρ) ∇x p1 ) − ∇x·( (1/ρ) ∇x p0 ) = −∇x·( F/ρ ).    (2.18)

Let us integrate this equation in y over Y = (0,1)³. All terms in y-divergence form vanish by periodicity and we obtain

    ⟨κ⟩ ∂²p0/∂t² − ∇x·( ⟨ (1/ρ)( ∇y θ + I3 ) ⟩ ( ∇x p0 − F ) ) = 0,    (2.19)
where I3 is the 3 × 3 identity matrix. This is the compatibility condition ensuring the
existence of a unique solution p2 (t, x, y) to (2.18) defined up to a constant function p20 (t, x)
in the y variable (we choose p20 = 0 for instance).

2.1.2 Homogenized equations
We recast the above equation for p0 as

    κ* ∂²p0/∂t² − ∇x·( (ρ*)⁻¹ ( ∇x p0 − F ) ) = 0,    (2.20)

where the homogeneous density tensor ρ* and the homogeneous compressibility coefficient κ* are given by

    ρ* = ⟨ (1/ρ)( ∇y θ + I3 ) ⟩⁻¹,    (2.21)
    κ* = ⟨κ⟩.    (2.22)

We augment the above equation with vanishing initial conditions, which is compatible with the expansion (2.11).
By virtue of (2.16), the density tensor ρ* simplifies in layered geometry and is given by

    ρ* = Diag( ⟨1/ρ⟩⁻¹, ⟨1/ρ⟩⁻¹, ⟨ρ⟩ ).    (2.23)

To physically interpret the homogenized equation (2.20), we recast it as the following first-order system:

    ρ* ∂v/∂t + ∇x p0 = F,    (2.24)
    κ* ∂p0/∂t + ∇x·v = 0,    (2.25)

where the homogeneous velocity v is defined by

    v(t,x) = (ρ*)⁻¹ ∫₀ᵗ ( F − ∇x p0 )(τ, x) dτ.    (2.26)

Owing to the form of the density tensor (2.23), we deduce that

    ⟨1/ρ⟩⁻¹ ∂v1/∂t + ∂p0/∂x1 = F1(t,x),
    ⟨1/ρ⟩⁻¹ ∂v2/∂t + ∂p0/∂x2 = F2(t,x),
    ⟨ρ⟩ ∂v3/∂t + ∂p0/∂z = F3(t,x),    (2.27)
    ⟨κ⟩ ∂p0/∂t + ∂v1/∂x1 + ∂v2/∂x2 + ∂v3/∂z = 0.

These are anisotropic versions of the original acoustic equations (2.1). The horizontal and vertical sound speeds are different and given by

    c_v² = 1/( ⟨κ⟩⟨ρ⟩ )  and  c_h² = 1/( ⟨κ⟩⟨1/ρ⟩⁻¹ ).    (2.28)
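For a concrete two-phase layered medium, these cell averages are trivial to evaluate. The short sketch below (an illustration with made-up material values, not taken from the notes) computes the two speeds of (2.28) and confirms that c_v ≤ c_h, which follows from the harmonic mean ⟨1/ρ⟩⁻¹ being below the arithmetic mean ⟨ρ⟩.

```python
import numpy as np

# Two-phase layered medium: volume fraction f of phase 2 inside phase 1.
f = 0.3
rho1, rho2 = 1.0, 3.0        # layer densities (illustrative values)
kap1, kap2 = 1.0, 2.0        # layer compressibilities

rho_mean = (1 - f)*rho1 + f*rho2          # <rho>
rho_harm = 1.0/((1 - f)/rho1 + f/rho2)    # <1/rho>^{-1}
kap_mean = (1 - f)*kap1 + f*kap2          # <kappa>

c_v = 1.0/np.sqrt(kap_mean*rho_mean)      # vertical speed, eq. (2.28)
c_h = 1.0/np.sqrt(kap_mean*rho_harm)      # horizontal speed, eq. (2.28)
print(c_v, c_h)   # harmonic mean <= arithmetic mean, hence c_v <= c_h
```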

We have seen that the solution pε of (2.3), or equivalently of (2.1)-(2.2), converges to the homogeneous function p0(t,x) that solves (2.20). Now, what can we say about the convergence of the velocity field uε, and how does it relate to the homogeneous velocity field v? Let us write uε = u0 + O(ε). From (2.1), we then deduce that

    ∂t u0 = ( F − (∇p)0 ) / ρ(y) = ( F − ∇x p0 − ∇y p1 ) / ρ(y) = ( (I3 + ∇y θ) / ρ(y) ) ρ* ∂t v.

Therefore, since both u0 and v vanish at t = 0,

    u0(t,x,y) = ( (I3 + ∇y θ(y)) / ρ(y) ) ρ* v.    (2.29)

In layered media, by virtue of (2.16) and (2.23), the relation (2.29) simplifies to

    u0(t,x,y) = Diag( (1/ρ(y)) ⟨1/ρ⟩⁻¹, (1/ρ(y)) ⟨1/ρ⟩⁻¹, 1 ) v(t,x).    (2.30)

It is interesting to observe that the third component of the asymptotic velocity u0 does not depend on the fast scale. However, the two other components, describing the propagation of the waves perpendicular to the direction of the layering, feel the heterogeneities and are inversely proportional to ρ(y).

2.1.3 Energy estimates


We have replaced the heterogeneous equation (2.3) by the homogenized equation (2.20). It
remains to estimate the error between the two solutions. The formal expansion (2.11) provides
the starting point. Let us define
    pε(t,x) = p0(t, x, x/ε) + ε p1(t, x, x/ε) + ε² p2(t, x, x/ε) + ζε(t,x),    (2.31)

where the terms pj, 0 ≤ j ≤ 2, are defined as in the preceding section and pε is the solution to (2.3). This uniquely defines ζε. It remains to show that the latter term is small in some sense. In order to do this, we use the energy estimate of the form (1.4), with κ and ρ replaced by κ(x/ε) and ρ(x/ε), both assumed to be smooth and uniformly bounded from below by a positive constant.
The first objective is to write an equation for ζε. The wave equation itself is the natural choice, and we write:

    Lε ζε = Lε ( pε − p0 − εp1 − ε²p2 ).    (2.32)

We now use the multiple-scale rule

    Lε p(t, x, x/ε) = ( Lε p )(t, x, y)|_{y = x/ε},

and the wave equation (2.6) to deduce that

    ε² Lε ζε = [ εS0(t,x,y) + ε²S1(t,x,y) − (L0 + εL1 + ε²L2)(p0 + εp1 + ε²p2) ]|_{y = x/ε}.

The asymptotic expansions developed in the preceding section are precisely tailored so that all the terms of order ε⁰, ε, and ε² in the above expression cancel. Using the equations (2.13) for p1 and (2.17) for p2, we deduce that

    Lε ζε = −ε [ L2(p1 + εp2) + L1 p2 ]|_{y = x/ε} ≡ εSε(t,x).    (2.33)

The source term Sε (t, x) involves applying the differential operators L1 and L2 to the correctors
p1 (t, x, y) and p2 (t, x, y). The latter terms thus need to be sufficiently regular. We do not
dwell on these details. However classical results in the theory of elliptic equations and the wave
equation show that p1 (t, x, y) and p2 (t, x, y) are indeed sufficiently smooth provided that the
coefficients ρ and κ and the source term F(t, x) are sufficiently smooth.
This allows us to conclude that Sε(t,x) is bounded in C(0,T; L²(Rd)) uniformly in ε. It remains to find an estimate for ζε. We have the following result.

Theorem 2.1.1 Let pε(t,x) be the solution of the heterogeneous wave equation (2.3) and p0(t,x) the solution of the homogenized wave equation (2.20). Assuming that the coefficients κ and ρ and the source term F(t,x) are sufficiently smooth, there exists a constant C independent of ε such that

    ‖pε(t) − p0(t)‖_{L²(Rd)} ≤ Cε,    (2.34)

uniformly on compact intervals 0 < t < T.

Proof. It remains to show a comparable estimate for ζε, since εp1 + ε²p2 is indeed of order ε in the above sense. Consider the equation Lε ζε = εSε and the energy

    E(t) = (1/2) ∫_{Rd} [ κ(x/ε) |∂ζε/∂t|²(t,x) + (1/ρ(x/ε)) |∇ζε|²(t,x) ] dx.

We find that

    Ė(t) = ∫_{Rd} [ κ(x/ε) (∂ζε/∂t)(∂²ζε/∂t²) + (1/ρ(x/ε)) ∇ζε · ∇(∂ζε/∂t) ] dx = ∫_{Rd} (∂ζε/∂t) εSε dx,

after the usual integration by parts. Upon integrating the above equality over (0,t) and using that κ is bounded from below by a positive constant, we find from the definition of E(t) that

    ‖(∂ζε/∂t)(t)‖²_{L²(Rd)} ≤ C ∫₀ᵗ ∫_{Rd} (∂ζε/∂t)(s) εSε(s) dx ds ≤ (Cε²/2) ∫₀ᵗ ‖Sε(s)‖²_{L²(Rd)} ds + (C/2) ∫₀ᵗ ‖(∂ζε/∂t)(s)‖²_{L²(Rd)} ds,

by the Cauchy-Schwarz inequality (on Rd) and the fact that 2ab ≤ a² + b². The integral form of the Gronwall lemma allows us to conclude that

    ‖(∂ζε/∂t)(t)‖_{L²(Rd)} ≤ ε C_T ‖Sε‖_{C(0,T;L²(Rd))}

on 0 ≤ t ≤ T for some constant C_T independent of ε and Sε. Since ζε vanishes initially, this yields that ‖ζε(t)‖_{L²(Rd)} ≤ Cε uniformly on 0 < t < T, and the result follows.
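The O(ε) rate of Theorem 2.1.1 can be observed numerically. As a lightweight illustration (my own, not part of the notes), the sketch below treats the simpler 1D elliptic analogue −(D(x/ε) uε′)′ = 1 on (0,1) with homogeneous Dirichlet conditions, whose homogenized coefficient is the harmonic mean ⟨1/D⟩⁻¹, consistent with the layered formula (2.23); halving ε roughly halves the L² error against the homogenized solution.

```python
import numpy as np

def trap(y, x):
    """Trapezoid rule on a grid (kept explicit for portability)."""
    return float(np.sum(0.5*(y[1:] + y[:-1])*np.diff(x)))

def solve_1d(eps, N=200001):
    """Quadrature solution of -(D(x/eps) u')' = 1 on (0,1), u(0) = u(1) = 0,
    with D(y) = 1/(2 + cos(2 pi y)), so that D* = <1/D>^{-1} = 1/2."""
    x = np.linspace(0.0, 1.0, N)
    invD = 2.0 + np.cos(2.0*np.pi*x/eps)          # 1/D(x/eps)
    # Integrate once: D u' = C - x, and pick the constant C so that u(1) = 0.
    C = trap(x*invD, x)/trap(invD, x)
    g = (C - x)*invD                               # u'(x)
    u = np.concatenate(([0.0], np.cumsum(0.5*(g[1:] + g[:-1])*np.diff(x))))
    return x, u

Dstar = 0.5
errs = []
for eps in (0.1, 0.05):
    x, u = solve_1d(eps)
    u0 = x*(1.0 - x)/(2.0*Dstar)                   # homogenized solution
    errs.append(np.sqrt(trap((u - u0)**2, x)))
print(errs)   # the error is roughly halved when eps is halved
```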

Exercise 2.1.1 Consider the homogenization of the pressure field in (2.3) obtained from (2.1)-(2.2) with the right-hand side of (2.2) replaced by the smooth term g(t,x).

Exercise 2.1.2 The above theorem shows an approximation of order O(ε) even though the
expansion in (2.11) was pushed to second-order. The reason is that p1 (t, x, y) was defined up
to the addition of a function p10 that depends only on (t, x). Push the asymptotic expansion
(2.11) to higher order and generalize the above theorem to obtain an asymptotic expansion of
order O(εN ) for N ∈ N.

Exercise 2.1.3 Consider the Schrödinger equation

    −Δuε + V(x/ε) uε = f(x),  x ∈ Rd,    (2.35)

for V(x/ε) ≥ V0 > 0 sufficiently smooth (the absorbing case) and periodic of period 1, and f(x) a square integrable and sufficiently smooth source term. Show that uε converges strongly to u as ε → 0 in the L²(Rd) topology (i.e. show that ‖uε − u‖_{L²(Rd)} goes to 0; in fact you can show that it is of order ε), where u is the solution of the same equation with V(x/ε) replaced by its average over the cell Y.

Exercise 2.1.4 For uε defined as in the preceding exercise, find the limit of its gradient ∇uε
as ε → 0.

Exercise 2.1.5 Find an error estimate for the homogenization of the first-order hyperbolic
system (2.1)-(2.2).

Exercise 2.1.3 shows that oscillations in the potential V have no influence on the solution uε
at the leading order. However its derivatives feel the influence of the fluctuations. Compare
this to the effects of ρ(x) and κ(x) in theorem 2.1.1.

2.2 Multidimensional case and estimates of the effective propagation speed
Most of the analysis carried out in the previous section holds in the multidimensional case,
where the coefficients ρ and κ are allowed to depend also on the variables x1 and x2 . The
homogenization of the wave equation (2.3) is obtained by the multiple scale expansion (2.11),
which yields the three equations (2.12), (2.13), and (2.18). We deduce from (2.12) that the
pressure field p0 (t, x) is independent of the fast variable y and that p1 satisfies (2.14). The
homogenized equation (2.20) still holds and the homogenized coefficients are given by (2.21)
and (2.22).
However, no analytic solution to (2.15) can be obtained in general. There is no equivalent in the multidimensional case to the layered media formulas (2.16), and the homogeneous density tensor is not given by (2.23) in general.

2.2.1 Effective density tensor in the case of small volume inclusions


Since (2.23) is not available to us, we must have recourse to approximate methods to estimate (2.21). Apart from numerical solutions of (2.15), several asymptotic methods have been devised to approximately calculate (2.21). An elegant method for doing so in the case of small volume inclusions is based on results of potential theory [20]. Let Y = (−1/2, 1/2)³ be the unit periodicity cell and Bδ the ball centered at the origin with radius δ, of volume β = 4πδ³/3. We assume that the density ρ = ρ1 is constant inside the unit cell except within the ball Bδ, where it takes the value ρ = ρ2.
Let x = (x1, x2, x3) in the unit cell Y. By symmetry of ρ on the cell, we easily verify that the effective tensor ρ* in (2.21) is given by

    ρ* = ⟨ (1/ρ)( ∂θ1/∂x1 + 1 ) ⟩⁻¹ I3 ≡ ρ* I3.    (2.36)

It is convenient to introduce the coefficients

    D(x) = (ρ(x))⁻¹, with Di = ρi⁻¹, i = 1, 2,  and  D* = (ρ*)⁻¹.    (2.37)

These coefficients are analogous to the diffusion coefficient and effective diffusion coefficient
arising in heat conduction, although the physical units are different here. The equation (2.15)
for θ1 is equivalent to

    Δθ1(x) = 0 except on x ∈ ∂Bδ = {|x| = δ},
    θ1 is continuous across ∂Bδ,
    D(x) ν(x)·( ∇x θ1 + e1 ) is continuous across ∂Bδ,    (2.38)
    x ↦ θ1(x) is 1-periodic.

Here ν(x) is the outward unit normal to Bδ at x ∈ ∂Bδ. Let us introduce the periodic unit Green's function that solves

    −Δx G(x,y) = δ(x − y) − 1,
    x ↦ G(x,y) is 1-periodic for all y ∈ R³.    (2.39)

This function is given by

    G(x,y) = (1/4π²) Σ′_{m∈Z³} e^{2πi m·(x−y)} / |m|²,    (2.40)

where the conditionally convergent sum runs over all integer values except m = 0. This expression for G is called a lattice sum, familiar in the context of solid state physics [32]. The function G has zero mean over the unit cell for every y and satisfies

    G(x,y) ∼ 1/( 4π|x − y| )  as |x − y| → 0.    (2.41)
In other words, the periodic Green’s function has the same behavior as the whole space Green’s
function when |x − y| → 0, which is to be expected since the periodic boundary conditions
are negligible when the source and observation points are very near each other.
We now use the above Green’s function and some results of potential theory [20] to derive
an asymptotic approximation of θ1 as the volume of the inclusion β → 0. We write θ1 in the form

    θ1(x) = ∫_{∂Bδ} G(x,y) σ(y) dS(y),    (2.42)

where dS is the surface measure on ∂Bδ and σ is a surface distribution to be determined. The periodicity of θ1 follows from that of x ↦ G(x,y). We deduce from the first line in (2.38) the constraint

    ∫_{∂Bδ} σ(y) dS(y) = 0.    (2.43)

Continuity of θ1 is a consequence of the limit (2.41) and of the continuity of the single layer potential

    U(x) = ∫_{∂Bδ} σ(y) / ( 4π|x − y| ) dS(y)    (2.44)

across ∂Bδ. It is a classical result in potential theory [20, Theorem VI, Chapter VI] that

    (∂U/∂ν)₊ = −σ/2 + ∫_{∂Bδ} (∂/∂ν)( 1/(4π|x − y|) ) σ(y) dS(y),
    (∂U/∂ν)₋ = +σ/2 + ∫_{∂Bδ} (∂/∂ν)( 1/(4π|x − y|) ) σ(y) dS(y),    (2.45)

where the subscripts +/− stand for the normal derivative outside and inside Bδ , respectively.
Using (2.45) and again the limit (2.41), we obtain from the third line in (2.38) the following
integral equation for σ
    D1 ( −σ/2 + ∫_{∂Bδ} (∂G/∂ν) σ dS ) − D2 ( σ/2 + ∫_{∂Bδ} (∂G/∂ν) σ dS ) = (D2 − D1) ν·e1.    (2.46)

This relation is equivalent to

    ∫_{∂Bδ} (∂G(x,y)/∂ν) σ(y) dS(y) + ( (D2 + D1)/(D2 − D1) ) σ(x)/2 = −ν1(x)  on ∂Bδ,    (2.47)

with ν1(x) = ν(x)·e1. Introduce now the rescaled variables x → x/δ and y → y/δ and functions

    σ^δ(x) = σ(δx)  and  G^δ(x,y) = δ G(δx, δy).    (2.48)

We can then recast (2.47) as

    ∫_{∂B1} (∂G^δ(x,y)/∂ν) σ^δ(y) dS(y) + ( (D2 + D1)/(D2 − D1) ) σ^δ(x)/2 = −ν1(x)  on ∂B1.    (2.49)

Because of (2.41), we have

    G^δ(x,y) → 1/( 4π|x − y| ) ≡ G⁰(x,y)  as δ → 0.    (2.50)

Defining then σ⁰(x) = lim_{δ→0} σ^δ(x), we pass to the limit in (2.49) and obtain

    ∫_{∂B1} (∂G⁰(x,y)/∂ν) σ⁰(y) dS(y) + ( (D2 + D1)/(D2 − D1) ) σ⁰(x)/2 = −ν1(x)  on ∂B1.    (2.51)

Notice that the specific geometry of ∂B1 did not play any role so far. The above equation
for σ 0 is therefore valid for more general geometries of discontinuity than spheres. However,
(2.51) can be solved exactly when ∂B1 is the unit sphere.
Let us calculate the potential U(x) in (2.44) for σ(x) = ν1(x) on the unit sphere ∂B1. Defining e_x = x/|x| and decomposing ν(y) = (ν(y)·e_x) e_x + ν⊥(y), with ν⊥(y) uniquely defined by ν⊥(y)·e_x = 0, we deduce from the symmetries of the unit sphere and of G⁰ that

    U(x) = ν1(x) ∫_{∂B1} ( ν(y)·e_x ) / ( 4π|x − y| ) dS(y) = ( ν1(x)/2 ) ∫₀^π cosθ sinθ dθ / √( sin²θ + (|x| − cosθ)² ).

A simple calculation shows that

    U(x) = ( ν1(x)/3 ) |x|⁻²  for |x| ≥ 1,    U(x) = ( ν1(x)/3 ) |x|  for |x| ≤ 1.    (2.52)

As expected from (2.45), U has a discontinuous normal derivative at ∂B1. The second equality in (2.45) now yields

    ∫_{∂B1} (∂G⁰(x,y)/∂ν) ν1(y) dS(y) = (∂U/∂ν)₋ − ν1(x)/2 = −ν1(x)/6  for x ∈ ∂B1.    (2.53)
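The exterior formula in (2.52) is easy to check independently. The sketch below (my own verification, not part of the notes) evaluates the single layer potential with density ν1 on the unit sphere by a midpoint rule in spherical coordinates and compares it with ν1(x)|x|⁻²/3 at an exterior point.

```python
import numpy as np

def single_layer_nu1(x, n=400):
    """Midpoint-rule evaluation of U(x) = int_{S^2} y1/(4*pi*|x - y|) dS(y)."""
    th = (np.arange(n) + 0.5)*np.pi/n            # polar angle in (0, pi)
    ph = (np.arange(2*n) + 0.5)*np.pi/n          # azimuth in (0, 2*pi), step pi/n
    TH, PH = np.meshgrid(th, ph, indexing="ij")
    y = np.stack([np.sin(TH)*np.cos(PH),
                  np.sin(TH)*np.sin(PH),
                  np.cos(TH)], axis=-1)          # points on the unit sphere
    w = np.sin(TH)*(np.pi/n)*(np.pi/n)           # surface element weights
    r = np.linalg.norm(x - y, axis=-1)
    return float(np.sum(y[..., 0]/(4.0*np.pi*r)*w))

x = np.array([2.0, 0.0, 0.0])                    # exterior point, |x| = 2
U_num = single_layer_nu1(x)
U_exact = (x[0]/np.linalg.norm(x))/(3.0*np.linalg.norm(x)**2)   # nu1(x)|x|^{-2}/3
print(U_num, U_exact)                            # both close to 1/12
```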

This shows that ν1 (x), up to a proportionality constant, solves (2.51). This constant is given
by
    σ⁰(x) = 3 ( (D1 − D2)/(2D1 + D2) ) ν1(x).    (2.54)

Notice that σ⁰ in (2.54) also satisfies the constraint (2.43).
Let us now return to the calculation of θ1 and D*. From (2.42) we obtain that

    θ1(δx) = δ ∫_{∂B1} G^δ(x,y) σ^δ(y) dS(y) ∼ 3δ ( (D1 − D2)/(2D1 + D2) ) ν1(x)/|x|²,

as δ → 0 for |x| ≥ 1. This shows that

    θ1(x) ∼ 3 ( (D1 − D2)/(2D1 + D2) ) δ³ x1/|x|³ = −3 ( (D1 − D2)/(2D1 + D2) ) 4πδ³ ∂G⁰(x,0)/∂x1  for |x| ≥ δ.    (2.55)

Since θ1 is also of order O(δ³) for |x| < δ, it is not necessary to carry out its expression as it will be negligible in the computation of D* given by (2.36)-(2.37). Keeping this in mind, we obtain that

    D* ∼ D1 ( 1 + 9β ( (D1 − D2)/(2D1 + D2) ) ∫_Y ∂²G⁰(x,0)/∂x1² dx ),

where we recall that β = 4πδ³/3. Since −ΔG⁰(x,0) = δ(x) and G⁰(x,0) only depends on |x|, we deduce that ∫_Y ∂²G⁰(x,0)/∂x1² dx = −1/3, hence

    D* = D1 ( 1 − 3 ( (D1 − D2)/(2D1 + D2) ) β ) + o(β),    (2.56)

or equivalently,

    ρ* = ρ1 ( 1 + 3 ( (ρ2 − ρ1)/(2ρ2 + ρ1) ) β ) + o(β).    (2.57)
This equation is an approximation of the effective coefficient formulas obtained by Rayleigh
in 1892 [27]. An elegant method to derive higher order terms, based on a more accurate
expansion of the lattice sum (2.40), was first used by Hasimoto [18].
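The dilute-limit formula (2.57) is straightforward to evaluate. The sketch below (with illustrative material values; the function name is mine) implements it and checks that, for a small volume fraction, the approximation lies between the elementary harmonic and arithmetic means of the mixture, anticipating the bounds (2.76) derived later.

```python
def rho_star_dilute(rho1, rho2, beta):
    """First-order small-volume-fraction approximation (2.57)."""
    return rho1*(1.0 + 3.0*beta*(rho2 - rho1)/(2.0*rho2 + rho1))

rho1, rho2, beta = 1.0, 4.0, 0.02    # light matrix, heavy inclusions, 2% volume
rs = rho_star_dilute(rho1, rho2, beta)
harm = 1.0/((1 - beta)/rho1 + beta/rho2)   # harmonic mean of rho
arit = (1 - beta)*rho1 + beta*rho2         # arithmetic mean of rho
print(harm, rs, arit)                       # harm <= rho* <= arit for small beta
```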

2.2.2 Effective density tensor in the case of small contrast


The asymptotic calculation of the previous section was based on the assumption that the
density was constant except in a small volume where it was allowed to have an O(1) fluctuation.
We now assume that the density has only small fluctuations about a constant mean, but not
necessarily in a small volume region. Let us assume that

    1/ρ(x) = 1/ρ0 + δ ( 1/ρ1(x) ),    (2.58)

or, using the more convenient notation in (2.37), that D(x) = D0 + δD1(x), where δ is a small parameter. We assume that ⟨D1⟩ = 0. We now want to obtain an approximation of the effective tensor (2.21) as δ → 0. To do so, we expand the field θ as

θ = θ 0 + δθ 1 + δ 2 θ 2 , (2.59)

where all terms θi are 1−periodic. Here, θ 0 and θ 1 are independent of δ and θ 2 is bounded
uniformly in δ. Plugging this expression into (2.15) and equating like powers of δ yields several
equations that we now analyze. The first one simply is

∆θ 0 = 0, (2.60)

which yields that θ 0 is constant. Since only ∇y θ 0 matters, we choose θ 0 = 0. The second
equation is then
    −D0 Δθ1 − ∇y D1 = 0.    (2.61)

Using the Fourier decomposition of D1 and θ1 for k ∈ Z³, k ≠ 0, we obtain that

    θ̂1(k) = −( D̂1(k)/D0 ) k/|k|².    (2.62)

Because ⟨∇y θi⟩ = 0 by periodicity, the effective diffusion tensor D* is given by

    D* = ⟨D( ∇y θ + I3 )⟩ = ⟨D⟩ I3 + δ² ⟨D1 ∇y θ1⟩ + O(δ³).

Using the Parseval relation, this yields

    D* = ⟨D⟩ I3 − (δ²/D0) Σ_{k∈Z³, k≠0} |D̂1(k)|² (k⊗k)/|k|².    (2.63)

To first order in δ, the effective tensor can be equated with the average of D over the periodicity cell. The anisotropy of D* only appears as a contribution of order δ². This property holds in a much more general context in homogenization [23, Chapter 14].
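Formula (2.63) lends itself to a direct FFT evaluation on a periodic grid. The sketch below (my own discretization; the grid size and the particular mean-zero D1 are arbitrary choices) computes the O(δ²) correction tensor and checks the Parseval identity trace(⟨D⟩I3 − D*) = (δ²/D0)⟨D1²⟩, which follows from (2.63) since trace(k⊗k/|k|²) = 1.

```python
import numpy as np

n, delta, D0 = 16, 0.1, 1.0
y = np.arange(n)/n
Y1, Y2, Y3 = np.meshgrid(y, y, y, indexing="ij")
D1 = np.cos(2*np.pi*Y1) + 0.5*np.sin(2*np.pi*(Y2 + 2*Y3))   # mean-zero perturbation

D1_hat = np.fft.fftn(D1)/n**3                 # Fourier series coefficients
k = np.fft.fftfreq(n, d=1.0/n)                # integer wave numbers
K1, K2, K3 = np.meshgrid(k, k, k, indexing="ij")
k2 = K1**2 + K2**2 + K3**2
mask = k2 > 0                                 # drop k = 0

corr = np.zeros((3, 3))                       # correction <D> I3 - D*, per (2.63)
Ks = [K1, K2, K3]
for i in range(3):
    for j in range(3):
        corr[i, j] = (delta**2/D0)*np.sum(
            np.abs(D1_hat[mask])**2 * Ks[i][mask]*Ks[j][mask]/k2[mask])

# Parseval check: trace of the correction equals delta^2 <D1^2> / D0
print(np.trace(corr), delta**2*np.mean(D1**2)/D0)
```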

2.3 Case of random media


In this section, we extend the homogenization carried out in the periodic case in the preceding
sections to the random case. We will illustrate the theory by looking at the propagation of
acoustic waves in a bubbly liquid. The location, size, and number of bubbles in the liquid are
unknown to the observer. This justifies the use of a probabilistic theory to estimate the role
of these bubbles in the sound propagation.
Let ω ∈ Ω be a realization of the randomness in the configuration space and P (ω) its
probability density. The physical coefficients κ and ρ in the acoustic equation (2.3) now depend
on the realization: κ = κ(x, ω) and ρ = ρ(x, ω). Assuming that F (t, x) is deterministic, the
solution p of (2.3) will also depend on the realization: p = p(t, x, ω). The type of question
homogenization aims at answering is: can one obtain some information about the average over realizations ⟨p⟩(t,x) = ∫_Ω p(t,x,ω) P(dω), knowing that the number of bubbles increases to infinity while keeping a constant volume fraction?
We have already seen how to answer this question in the periodic case. We will see that
the basic results, the existence of effective coefficients and the convergence in a suitable sense
of the heterogeneous solution to the homogeneous solution, still hold in the random case.
However, formal asymptotic expansions of the form (2.11) can no longer be justified and the
analysis of the cell problem (2.15) that leads to the definition of the homogenized coefficients
(2.21) is more involved.
Yet the periodic case shows the way to homogenization. One crucial point in periodic
homogenization is a certain invariance by translation. The analog of periodicity in the random
case is statistical homogeneity or stationarity. Let us again denote by D = ρ−1 . Stationarity
means that for any set of points x1 , . . . , xm in Rd and any vector h ∈ Rd , the joint distributions
of
D(x1 , ω), . . . , D(xm , ω) and D(x1 + h, ω), . . . , D(xm + h, ω)
are the same, and similarly for κ.

The periodic case can actually be seen as a special random case. Let the configuration
space Ω be the unit torus in Rd and P the uniform measure (dP = dx). Then ω is a point on
the unit torus and we set
D(x, ω) = D̃(ω − x),
where D̃ is a function on the unit torus, i.e. a periodic function. The random acoustics equation
is now (2.3) with the origin of coordinates relative to the period cell chosen randomly. Since
the analysis is valid for every ω and independent of ω, nothing is gained by this randomization.
A more interesting example is the Poisson point process case, which models the location
of the centers of the air bubbles. Here, Ω is the set of infinite sequences of points in Rd ,
ω = (ξ 1 , . . .) in which the order is irrelevant, which can be denoted as Ω = (Rd )∞ −symmetric.
For each ω, let us define the point measure
    ν = Σ_j δ_{ξj}(dx),    (2.64)

where δy (dx) is the Dirac delta measure at y. In other words, if A is a subset of Rd , then
ν(A) counts the number of points of ω that fall inside A. Let λ > 0 be a parameter called the
intensity of the Poisson distribution. The probability density P can be defined via its Laplace
functional. Introducing for every positive test function φ of compact support
    (φ, ν) = ∫_{Rd} φ(x) ν(dx) = Σ_j φ(ξj),

the Laplace functional of P is defined by

    E_P[ e^{−(φ,ν)} ] = ⟨ e^{−(φ,ν)} ⟩ = exp( λ ∫_{Rd} ( e^{−φ(ξ)} − 1 ) dξ ).    (2.65)

An important property of the Poisson process is that

    P( ν(A) = n ) = ( (λ|A|)^n / n! ) e^{−λ|A|},    (2.66)
where |A| is the volume of A. So λ gives the average number of points per unit volume. Given
Ω and P, we can now model random distributions of bubbles of radius δ by

    D(x,ω) = D2 if |x − ξj| ≤ δ for some j,    D(x,ω) = D1 if |x − ξj| > δ for all j,    (2.67)

and a similar expression for κ. We can show that the Poisson point process is stationary:
ν(dx) and ν(h + dx) have the same probability P . Therefore, since D is a functional of the
Poisson process, it is itself stationary. Notice that the bubbles are allowed to overlap with this
model, which is not realistic physically.
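Property (2.66) also suggests the standard way to sample the process on a bounded box: draw the number of points as a Poisson random variable of mean λ|A| and then place them uniformly. The sketch below (a textbook construction, not from the notes; names and parameter values are mine) does this and checks the count statistics empirically, including the Poisson property that the variance of the count equals its mean.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, L, d = 5.0, 2.0, 3          # intensity, box side, dimension
vol = L**d                       # |A| = 8, so the mean count is lam*|A| = 40

def sample_ppp():
    """One realization of a Poisson point process on the box (0, L)^3."""
    n = rng.poisson(lam*vol)     # (2.66): N ~ Poisson(lambda |A|)
    return rng.uniform(0.0, L, size=(n, d))

counts = np.array([len(sample_ppp()) for _ in range(4000)])
print(counts.mean(), lam*vol)    # empirical mean vs lambda |A|
print(counts.var(), lam*vol)     # for a Poisson count, variance = mean
```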
Another interesting property is that the random functional Dε (x, ω) defined as in (2.67)
with λ and δ replaced by ε−d λ and εδ is statistically equivalent to the functional D(x/ε, ω),
where D is defined by (2.67). Therefore, the scaling x → x/ε and the resulting acoustic
equation (2.3) indeed correspond to sending the number of bubbles per unit volume to infinity
while keeping the volume fraction of air bubbles constant.
Let us now analyze the asymptotic behavior of the acoustic equation

    κ(x/ε, ω) ∂²p(t,x,ω)/∂t² − ∇·( D(x/ε, ω) ∇p(t,x,ω) ) = −∇·( D(x/ε, ω) F(t,x) ),    (2.68)

with vanishing initial conditions. We now formally extend the analysis of section 2.1 to the
random case. Using the two-scale expansion (2.11), we obtain the same sequence of equations
for the terms pi .
The first equation (2.12) shows that p0 does not depend on the fast variable y, i.e. in
the random case is independent of the realization ω. We deduce from (2.14) that p1 has a
non-trivial dependence in the fast variable y, i.e. will depend on the realization ω. The vector
function θ satisfies now
−∇ · D(y, ω)(∇θ(y, ω) + I3 ) = 0. (2.69)
It turns out that there is no stationary solution θ(y, ω) to this equation. It is however possible
to find a stationary gradient ∇θ solving the above equation, which suggests slightly modifying equation (2.69). Let e be a unit vector in R³. We consider the following infinite medium
problem: Find two vector fields H(x, ω) and G(x, ω) such that

    H(x,ω) = D(x,ω) G(x,ω),
    ∇ × G = 0,
    ∇ · H = 0,    (2.70)
    ⟨G⟩ = e.

When e = ei , we find that G = ∇θi + ei , where θi is the ith component of θ solving (2.69).
With this definition, we easily get the third equation in (2.70) from (2.69) and the first equation
in (2.70). Also, since G can be written as a gradient, we clearly have the second equation in (2.70). Since G and H are sought among stationary vector fields only, we get that ⟨G⟩ is
a constant vector. The new infinite medium problem (2.70) admits a unique solution among
stationary fields. The effective tensor is then given by

    D*(e, l) = (ρ*)⁻¹(e, l) = ⟨He · l⟩ = ⟨D Ge · l⟩ = ⟨D Ge · Gl⟩.    (2.71)

Here, we have denoted by Hl and Gl the solutions of (2.70) with e = l. The last relation
follows from the ergodicity of the random process. Let us indeed write Gl = l + G̃l with
hG̃l i = 0. Then ∇ × G̃l = 0, hence G̃l is a gradient ∇χ. The last relation in (2.71) is
equivalent to hHe · G̃l i = 0. However, by ergodicity, which means that spatial averaging for
every realization ω corresponds to ensemble averaging (which is independent of position), we
get that
    ⟨He · G̃l⟩ = ∫_{Rd} He(x,ω) · ∇χ(x,ω) dx = −∫_{Rd} (∇ · He)(x,ω) χ(x,ω) dx = 0.

This relation shows that the effective tensors D∗ and ρ∗ are positive definite since D1 and
D2 are positive constants. The effective compressibility κ∗ is still given by (2.22) and the
homogeneous acoustics equation by (2.20).
To summarize, we see that the homogenized equations obtained in the periodic case (2.20),
(2.21), and (2.22) carry over to the random case, with the slight exception that ∇θi + ei is
replaced by Gei . However, the asymptotic expansion (2.11) does not hold. All that can be
said is that

    ∫_{Rd} ( p(t,x,ω) − p0(t,x) )² dx → 0    (2.72)
as ε → 0. We will not present a proof of this result here. Moreover, the cell problem (2.15), which was numerically tractable because it is posed on the unit cell Y only, is now replaced by (2.70), which is posed in all of R³ and needs to be solved for each realization of the random process

to have access to (2.71). Estimating the effective tensor D∗ numerically remains therefore a
formidable task.
The asymptotic approximations obtained in the previous section are consequently all the
more important. In the small volume fraction case, we can assume as a first approximation
that the bubbles are far away from each other. The analysis taken up in section 2.2.1 then
applies and (2.56) holds.

2.4 Variational formulation and effective parameter estimates


The homogenization procedures of the two previous sections were fraught with the same difficulty. Once the homogenized equation is found, how can one estimate the effective density tensor ρ*? Since brute-force calculation is often out of reach, approximations are in order.
We have seen asymptotic expansion techniques in section 2.2.1 already, which are interesting
when some physical quantity is small. Because the cell and infinite medium equations satisfy
a natural variational interpretation, variational methods can be used to estimate the effective
coefficients. The goal of this section is to present such methods.

2.4.1 Classical variational formulation


Let us return to the definition of (2.71). We have seen that an alternative expression is
D∗ (e, l) = hDGe ·Gl i, which shows that D∗ is self-adjoint and positive definite. The knowledge
of the quadratic form
    W(e) = (1/2) e · D* e,    (2.73)

for all values of e then uniquely defines D*. We now claim that

    e · D* e = min { ⟨D G · G⟩ : ∇ × G = 0, ⟨G⟩ = e }.    (2.74)

We therefore obtain that the effective tensor can be constructed as a minimization procedure
over curl-free fields with prescribed ensemble average. This result offers a very powerful tool to generate upper bounds on the effective tensor, which is not easily computed directly.
Let us derive this result. We denote by Ge the solution of (2.70). Let G be a field satisfying ∇ × G = 0 and ⟨G⟩ = e. Because D is positive, we have that

    ⟨D (G − Ge) · (G − Ge)⟩ ≥ 0,

which is equivalent to

    ⟨D G · G⟩ ≥ 2 ⟨D G · Ge⟩ − ⟨D Ge · Ge⟩.

Since G − Ge is curl-free with vanishing mean, it is a gradient: G − Ge = ∇χ. Hence, by ergodicity,

    ⟨D (G − Ge) · Ge⟩ = ⟨∇χ · He⟩ = −⟨χ ∇·He⟩ = 0,

so that ⟨D G · Ge⟩ = ⟨D Ge · Ge⟩ and therefore ⟨D G · G⟩ ≥ ⟨D Ge · Ge⟩ = e · D* e. Since the minimum is attained for G = Ge, which is admissible, (2.74) follows.


We now have a means of constructing upper bounds for D*. What about lower bounds? A dual variational principle actually shows that

    e · (D*)⁻¹ e = min { ⟨D⁻¹ H · H⟩ : ∇ · H = 0, ⟨H⟩ = e }.    (2.75)

The derivation is similar to that of (2.74), by swapping the roles of G and H.

The simplest consequence of these relations is the arithmetic and harmonic mean bounds first obtained by Hill [19]:

    ⟨D⁻¹⟩⁻¹ ≤ D* ≤ ⟨D⟩,    (2.76)

in the sense that all eigenvalues of the tensor D* satisfy these two inequalities. Notice that these two bounds are precisely the values taken by the density tensor in layered media (2.23).
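Since the layered tensor is known exactly, it provides a quick consistency check of (2.76): written in the D variables, (2.23) gives D* = Diag(⟨D⟩, ⟨D⟩, ⟨D⁻¹⟩⁻¹), so the eigenvalues attain both bounds. The sketch below (my own check, with a randomly sampled layered profile) verifies this numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.uniform(0.5, 3.0, size=1000)      # samples of D = 1/rho over one period

arith = D.mean()                          # <D>
harm = 1.0/np.mean(1.0/D)                 # <D^{-1}>^{-1}

# Layered effective tensor in the D variables, obtained by inverting (2.23):
Dstar = np.diag([arith, arith, harm])
eigs = np.linalg.eigvalsh(Dstar)          # ascending order

# Hill bounds (2.76): every eigenvalue lies in [harm, arith];
# in layered media the two bounds are attained.
print(harm, eigs, arith)
```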

2.4.2 Hashin-Shtrikman bounds


The main difficulty in the choice of more accurate test functions G in (2.74) (H in (2.75)) is
that the curl-free (divergence-free) constraints must be satisfied. We now derive a variational
formulation for the effective tensor D∗ that does not have this inconvenience. The first step consists of decomposing any vector-function P(x), defined on R3 in the random case and on Y in the periodic case, as the sum of a constant P0 , a mean-zero curl-free function P1 (x), and a mean-zero
divergence-free function P2 (x). This is the Helmholtz decomposition. That this is possible is
more easily seen in the Fourier domain, as

∇ × G(x) = 0 is equivalent to k × Ĝ(k) = 0


∇ · H(x) = 0 is equivalent to k · Ĥ(k) = 0.

Let us define the projection operators

P0 = Γ0 P = hPi = P̂(0),
P1 (x) = Γ1 P = ∫ eik·x Γ1 (k)(P̂(k) − P̂(0)) dk,        (2.77)
P2 (x) = Γ2 P = ∫ eik·x Γ2 (k)(P̂(k) − P̂(0)) dk,

where
Γ1 (k) = k ⊗ k/|k|2 ,        Γ2 (k) = I3 − k ⊗ k/|k|2 .        (2.78)
The integrations in (2.77) are replaced by discrete summations in the periodic case. It is easy to check that Γi (k)2 = Γi (k) for i = 1, 2 and that k × Γ1 (k) = 0 and k · Γ2 (k) = 0. Therefore,
the operators Γ0 , Γ1 , and Γ2 are projections on the set of constant, mean-zero curl-free, and
mean-zero divergence-free functions, respectively. It is then easily checked from its Fourier
symbol that the operator Γ1 is given by

Γ1 G = −∇(−∆)−1 ∇ · (G − hGi), (2.79)

and Γ2 G = (I3 − Γ1 )(G − hGi). Also, these operators are orthogonal to each other, and we
have
Γi Γj = δij Γi , Γ0 + Γ1 + Γ2 = I. (2.80)
Let us come back to the derivation of a variational formulation for D∗ . Let D0 be a constant
reference inverse density coefficient and Ge the solution of (2.70). We define the polarization
vector Pe and the operator Γ as
Pe = (D − D0 )Ge = He − D0 Ge        and        Γ = Γ1 /D0 .        (2.81)
We obtain that

((D − D0 )−1 I3 + Γ)Pe = Ge + ΓPe = Ge + (hGe i − Ge ) = e,        (2.82)

since ΓHe = 0 because ∇ · He = 0. Now, Γ is a self-adjoint and non-negative operator and, assuming that D(x) > D0 uniformly, ((D − D0 )−1 I3 + Γ) is a positive definite self-adjoint operator. This implies that

h(Pe − P) · ((D − D0 )−1 I3 + Γ)(Pe − P)i ≥ 0,
for every test function P. Notice that no curl-free or divergence-free constraint is imposed on P. Because Γ is self-adjoint and h·i is equivalent to integration over R3 by ergodicity, this inequality can be recast using (2.82) as

2he · Pi − e · (D∗ − D0 I3 )e ≤ hP · ((D − D0 )−1 I3 + Γ)Pi,

because from (2.81) and (2.71), we have that hPe i = (D∗ − D0 I3 )e. Choosing now e = (D∗ − D0 I3 )−1 hPi, we obtain that

hPi · (D∗ − D0 I3 )−1 hPi ≤ hP · ((D − D0 )−1 I3 + Γ)Pi,

for all test functions P. Since the minimum is attained for P = Pe , we have derived the first
Hashin-Shtrikman variational principle

hPi · (D∗ − D0 I3 )−1 hPi = min over all P of hP · ((D − D0 )−1 I3 + Γ)Pi,        (2.83)

provided D(x) ≥ D0 .
The same calculations can be performed for the dual variables, that is to say when the
roles of G and H are swapped and Γ1 is replaced by Γ2 . The dual variational principle we
obtain is

hPi · ((D∗ )−1 − D0−1 I3 )−1 hPi = min over all P of hP · ((D−1 − D0−1 )−1 I3 + Γ̃)Pi,        (2.84)

provided D−1 (x) > D0−1 , where Γ̃ = D0 Γ2 .


The primal principle (2.83) provides a lower bound for the effective tensor D∗ and the dual principle (2.84) an upper bound. We have now traded the difficulty of finding curl-free fields for the calculation of the right-hand side of (2.83), for instance, since the operator Γ is not simple. An interesting situation is when the random inverse density D is statistically isotropic, that is to say when

hD(x)D(x + y)i = R(|y|).
Let indeed Q be a mean-zero isotropic random field, i.e. such that hQi i = 0 and hQi (x)Qj (x + y)i =
Rij (|y|). Then we have
hΓ1 Q · Qi = (1/3) hQ · Qi,        (2.85)
as can be seen in the Fourier domain. Indeed, since Q̂i (0) = 0,

hΓ1 Q · Qi(x) = ∫∫ ei(k+q)·x (ki kj /|k|2 ) hQ̂i (k)Q̂j (q)i dkdq
             = ∫∫ ei(k+q)·x (ki kj /|k|2 ) Rij (|k|) δ(k + q) dkdq
             = ∫0^∞ ∫S2 ki kj Rij (|k|) d|k|dΩ(k) = (1/3) ∫0^∞ Tr(R(|k|)) d|k|
             = (1/3) hQ · Qi(x),
which is actually independent of x by stationarity. We can therefore compute the right-hand
side of (2.83) when P is isotropic. Since D is isotropic, we have by symmetry D∗ = D∗ I3 ,

and it is sufficient to consider P in (2.83) of the form P = P e1 , say. Since ΓhPi = 0, we then recast (2.83) as

1/(D∗ − D0 ) ≤ min over isotropic P of [ hP 2 /(D − D0 )i + (1/(3D0 )) h(P − hP i)2 i ] / hP i2 .        (2.86)

Defining δ = (3D0 )−1 (D − D0 ), we obtain that

1/(D∗ − D0 ) ≤ (1/(3D0 )) min over isotropic P of h ((1 + δ)/δ)(P/hP i − δ/(1 + δ))2 + 1/(1 + δ) i.        (2.87)

We now use the following optimization result:

min { haq 2 i : hqi = β } = ha−1 i−1 β 2 ,        (2.88)

for a positive isotropic field a; the minimum is realized for q = β ha−1 i−1 a−1 . Since D is isotropic, so is δ. The minimum in (2.87) with the isotropy constraint relaxed is still attained by an isotropic field, and we have

1/(D∗ − D0 ) ≤ (1/(3D0 )) h1/(1 + δ)i hδ/(1 + δ)i−1 ,

which we recast as

D∗ ≥ D0 (1 + 3 hδ/(1 + δ)i h1/(1 + δ)i−1 ),        δ = (D − D0 )/(3D0 ) > 0.        (2.89)
A similar calculation shows that the dual variational principle (2.84) simplifies in the isotropic case to the same expression with δ < 0:

D∗ ≤ D0 (1 + 3 hδ/(1 + δ)i h1/(1 + δ)i−1 ),        δ = (D − D0 )/(3D0 ) < 0.        (2.90)

2.5 Homogenization with boundaries and interfaces


2.5.1 The time dependent case
This section is concerned with the homogenization of rough boundaries and interfaces. We con-
sider a two dimensional setting where two homogeneous media are separated by a ε−periodic
interface z = h(x/ε). We assume that h is a smooth 1−periodic function with range [−H, 0].
The wave equation (2.3) for the pressure field pε takes the form

κε ∂ 2 pε /∂t2 (t, x) − Dε (x)∆pε (t, x) = S(x),        (2.91)
with vanishing initial conditions and for t ∈ R+ and x = (x, z) ∈ R2 such that z ≠ h(x/ε),
where
κε (x, z) = κ+ if z > h(x/ε), κ− if z < h(x/ε);        Dε (x, z) = D+ if z > h(x/ε), D− if z < h(x/ε).        (2.92)
Furthermore, the pressure and the normal component of the velocity field are continuous across
the interface, so that their jumps vanish
[p] = 0,        [D ∂p/∂n] = 0,        z = h(x/ε).        (2.93)

Let us now assume that
pε (t, x, z) = p0 (t, x, x/ε, z) + εp1 (t, x, x/ε, z) + ε2 p2 (t, x, x/ε, z) + O(ε3 ),        (2.94)
where the functions pi are 1−periodic with respect to the third variable. Notice that the
second jump condition for p = p(t, x, y, z) now reads
[D( (h′ /ε2 ) ∂p/∂y + (h′ /ε) ∂p/∂x − ∂p/∂z )] = 0,        z = h(y).
ε ∂y ε ∂x ∂z

Upon plugging (2.94) into (2.91) and (2.93) and equating like powers of ε, we obtain first
that
D ∂ 2 p0 /∂y 2 = 0,        z ≠ h(y),
[p0 ] = 0,        [D ∂p0 /∂y] = 0,        z = h(y).        (2.95)
Because p0 is periodic in y, we deduce that p0 = p0 (x, z) independent of y. The second
equation is
D ∂ 2 p1 /∂y 2 = 0,        z ≠ h(y),
[p1 ] = 0,        [D(∂p1 /∂y + ∂p0 /∂x)] = 0,        z = h(y),        (2.96)
because p0 is independent of y. We anticipate here that ∂p0 /∂x is a smooth function. Since it does not depend on y, it cannot be discontinuous at z = h(y). We see therefore that

p1 (x, y, z) = θ1 (y, z) ∂p0 /∂x (x, z),        (2.97)
where θ1 is the mean-zero 1−periodic solution of

D ∂ 2 θ1 /∂y 2 = 0,        z ≠ h(y),
[θ1 ] = 0,        [D(∂θ1 /∂y + 1)] = 0,        z = h(y).        (2.98)
We shall come back later to the calculation of θ1 . The next order equation yields

κ ∂ 2 p0 /∂t2 + ∂/∂y [D(∂p2 /∂y + ∂p1 /∂x)] + ∂/∂x [D(∂p1 /∂y + ∂p0 /∂x)] + D ∂ 2 p0 /∂z 2 = S(x),   z ≠ h(y),
[p2 ] = 0,        [D(h′ ∂p2 /∂y + h′ ∂p1 /∂x − ∂p0 /∂z)] = 0,        z = h(y).        (2.99)
This equation admits a solution provided that some compatibility condition is satisfied. This condition is obtained by averaging the above equation over (0, 1) in y. We deduce from (2.98) that the derivative in y of D(∂θ1 /∂y + 1) vanishes for z ≠ h(y) and that its jumps at z = h(y) vanish as well. Therefore,

Deff (z) = D(y, z)(∂θ1 /∂y (y, z) + 1)        (2.100)

is independent of y. Furthermore, we have
∫0^1 ∂/∂x [D(∂p1 /∂y + ∂p0 /∂x)] dy = Deff (z) ∂ 2 p0 /∂x2 .

For every z ∈ R, h−1 (z) is a finite set of points yi (z), 1 ≤ i ≤ I(z), which is empty for z < −H and z > 0. The function D(∂p2 /∂y + ∂p1 /∂x) is 1−periodic in y by assumption. Therefore, the integral of its derivative (in the sense of distributions) vanishes and we have, thanks to the jump conditions in (2.99), that

∫0^1 ∂/∂y [D(∂p2 /∂y + ∂p1 /∂x)] dy + Σ_{i=1}^{I(z)} ([D](yi )/h′ (yi )) ∂p0 /∂z = 0.

Introducing
hDi(z) = ∫0^1 D(y, z)dy        and        hκi(z) = ∫0^1 κ(y, z)dy,        (2.101)
we verify that
Σ_{i=1}^{I(z)} [D](yi )/h′ (yi ) = ∂hDi/∂z.

Upon integrating (2.99) in y over (0, 1), we obtain

hκi ∂ 2 p0 /∂t2 − Deff ∂ 2 p0 /∂x2 − ∂/∂z (hDi ∂p0 /∂z) = S(x).        (2.102)
This equation holds in the whole physical domain (x, z) ∈ R2 . For z > 0, we observe that ∂θ1 /∂y = 0, hence Deff (z) = D+ . Moreover, hDi = D+ and hκi = κ+ . Similarly, Deff (z) = D− , hDi = D− , and hκi = κ− for z < −H. Therefore (2.102) is the same equation as (2.91) outside the layer −H ≤ z ≤ 0. Inside this layer, the heterogeneities of the fast oscillating boundary have been homogenized. Notice that the homogenized density tensor ρ = D−1 is again anisotropic, since in general Deff ≠ hDi.
In order to obtain an explicit expression for Deff , we now assume that h(0) = 0 and that h
is strictly decreasing from y = 0 to y = y1 (−H) and then strictly increasing from y = y1 (−H)
to y = 1 where h(1) = h(0) by periodicity. For −H < z < 0, we denote by y1 (z) and y2 (z) the
functions such that h(y1 (z)) = z, h(y2 (z)) = z and y1 (z) < y2 (z). Therefore, for −H < z < 0,
D(y, z) = D+ for y ∈ (y1 (z), y2 (z)) and D(y, z) = D− for y ∈ (0, y1 (z)) ∪ (y2 (z), 1).
Let us now calculate θ1 in (2.98). Clearly, for z fixed, θ1 is linear on (0, y1 ), (y1 , y2 )
and (y2 , 1) of slope (a, b, c), respectively. By periodicity, we obtain that a = c and b =
−(y2 − y1 )−1 (1 − y2 + y1 )a. Now the jump conditions both yield that D− (a + 1) = D+ (b + 1),
from which we deduce that
a = (D+ − D− )(y2 − y1 ) / [(1 − (y2 − y1 ))D+ + (y2 − y1 )D− ],        b = (D+ − D− )(y2 − y1 − 1) / [(1 − (y2 − y1 ))D+ + (y2 − y1 )D− ].

Therefore,

Deff = D− (a + 1) = D+ (b + 1) = D− D+ / [(1 − (y2 − y1 ))D+ + (y2 − y1 )D− ].        (2.103)
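The algebra leading to (2.103) can be checked numerically. The sketch below (the values of D± and of the crossing points y1, y2 are hypothetical) verifies that the slopes a and b satisfy the periodicity and flux-continuity conditions and reproduce Deff:

```python
# Hypothetical values for the interface problem: y1, y2 are the crossing
# points of the interface at height z, and beta = y2 - y1 is the fraction
# of the period occupied by the upper medium D+.
Dm, Dp = 1.0, 5.0          # D-, D+
y1, y2 = 0.2, 0.6
beta = y2 - y1

den = (1.0 - beta) * Dp + beta * Dm
a = (Dp - Dm) * beta / den           # slope of theta1 on (0, y1) and (y2, 1)
b = (Dp - Dm) * (beta - 1.0) / den   # slope of theta1 on (y1, y2)

# Periodicity: the total increment of theta1 over one period vanishes.
assert abs(a * (1.0 - beta) + b * beta) < 1e-12
# Flux continuity (2.98): D-(a+1) = D+(b+1) across the interface.
assert abs(Dm * (a + 1.0) - Dp * (b + 1.0)) < 1e-12

Deff = Dm * Dp / den                 # (2.103)
assert abs(Deff - Dm * (a + 1.0)) < 1e-12
print(Deff)
```

Note that Deff is a weighted harmonic-type average of D− and D+, as one expects for transport across the corrugated layer.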

The same method can be used to homogenize rough boundaries. Consider first the propa-
gation of the pressure waves with Neumann boundary conditions

κ+ ∂ 2 pε /∂t2 (t, x) − D+ ∆pε (t, x) = S(x),        z ≥ h(x/ε),
∂pε /∂n = 0,        z = h(x/ε).        (2.104)

This equation is equivalent to (2.91) with D− = 0, as can easily be checked, with the difference that the source term S(x) only contributes when z > h(x/ε). We then define hSi(x, z) = ∫0^1 S(x, y, z)dy, where S(x, y, z) = S(x, z) when z > h(y) and S(x, y, z) = 0 otherwise. Now, because D− = 0, we deduce from (2.100) that Deff (z) = 0, and (2.102) becomes

hκi ∂ 2 p0 /∂t2 − ∂/∂z (hDi ∂p0 /∂z) = hSi(x),        (x, z) ∈ R × (−H, ∞).        (2.105)
Boundary conditions at z = −H are not necessary as hDi(−H) = hκi(−H) = hSi(−H) = 0.
When h is strictly decreasing and then strictly increasing on one period, (2.105) simplifies to

κ+ ∂ 2 p0 /∂t2 − (D+ /(y2 − y1 )) ∂/∂z ((y2 − y1 ) ∂p0 /∂z) = S(x),        (x, z) ∈ R × (−H, ∞).        (2.106)

Dirichlet conditions (pε = 0 on z = h(x/ε)) can also be considered. We then easily deduce
from (2.95) that p0 = 0 on z = h(y). Since p0 is independent of y, this simply implies that
p0 = 0 on z = 0. It also solves the already homogeneous equation (2.91) for z > 0. The
roughness of the boundary is then not seen in the case of Dirichlet boundary conditions.

2.5.2 Plane wave reflection and transmission


Instead of the evolution problem (2.91), we can also consider the reflection and transmission
of plane waves from a rough interface:

κε ω 2 pε (x) + Dε (x)∆pε (x) = 0,        z ≠ h(x/ε),
pε = eik·x + pεR ,        z > h(x/ε),
pε = pεT ,        z < h(x/ε),        (2.107)
[pε ] = 0,        [D ∂pε /∂n] = 0,        z = h(x/ε),
where κε and Dε are defined in (2.92) and

κ+ ω 2 = D+ k 2 ,        k = |k|,        k = (kx , kz ).

Let us introduce pε = eikx x p̃ε = eikx x (eikz z + p̃εR ) for z > h(x/ε) and pε = eikx x p̃εT for z < h(x/ε). Here, kz < 0. Using the ansatz (2.94), we obtain that p̃εR and p̃εT only depend on z. Since they solve a homogeneous equation for z > 0 and z < −H, respectively, we have that p̃εR (z) = Re−ikz z for z > 0 and p̃εT (z) = T eik̃z z for z < −H, where k̃z < 0 is defined by

κ− ω 2 = D− (kx2 + k̃z2 ).

For −H < z < 0, we define pε = eikx x wε . The leading order term w0 in wε then satisfies the following equation, thanks to (2.102):

(ω 2 hκi − kx2 Deff )w0 + ∂/∂z (hDi ∂w0 /∂z) = 0,        −H < z < 0.        (2.108)
It remains to find boundary conditions for w0 . The continuity of the pressure and of its z−derivative at z = 0 and z = −H yields

1 + R = w0 (0),        ikz (1 − R) = w0′ (0),
T = w0 (−H),        ik̃z T = w0′ (−H).        (2.109)

Upon eliminating R and T , we find that

w0′ (0) + ikz w0 (0) = 2ikz        and        w0′ (−H) − ik̃z w0 (−H) = 0.        (2.110)

Once (2.108) and (2.110) are solved for w0 , the reflection and transmission coefficients are obtained from (2.109).
In the case of a rough boundary with Neumann boundary conditions, where D− = 0, the above analysis holds with T = 0. Introducing the impedance

ζ = ∂z p0 /p0 (z = 0) = w0′ (0)/w0 (0),        (2.111)

which is a real coefficient, we obtain that

R = (ikz − ζ)/(ikz + ζ).        (2.112)

Clearly |R| = 1, as expected from conservation of energy. In the limit ε → 0 in the case of Neumann boundary conditions, the pressure field pε converges to a function p0 that solves

k 2 p0 + ∆p0 = 0,        z > 0,
p0 = eik·x + p0R ,        (2.113)
∂p0 /∂n + ζp0 = 0,        z = 0        (∂/∂n = −∂/∂z on this surface).
Consider for example the case of a comb-like surface, where the periodic surface in the (y, z)
plane is composed of the line R × {−H} and the segments {n} × (−H, 0) for n ∈ Z. This
surface cannot be represented as a graph of a function, but can be approximated by piecewise
linear functions hη (y) where hη (0) = hη (1) = 0 and hη (y) = −H on (η, 1 − η), where η is a
small parameter sent to 0. In this limit, the equation for w0 is

k 2 w0 + ∂ 2 w0 /∂z 2 = 0,
w0′ (0) + ikz w0 (0) = 2ikz ,        (2.114)
w0′ (−H) = 0.

Upon solving for w0 , we obtain that

ζ = −k tan kH. (2.115)

Notice that this impedance can be positive or negative depending on the size H of the combs.
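The closed-form impedance (2.115) and the reflection coefficient (2.112) are easy to verify numerically. In the sketch below (all parameter values are hypothetical), (2.114) is solved in closed form by w0(z) = A cos(k(z + H)), which satisfies the Neumann condition at z = −H:

```python
import numpy as np

# Hypothetical parameters for the comb-like Neumann surface.
k = 2.0      # wavenumber above the surface
H = 0.4      # depth of the combs
kz = -1.5    # vertical wavenumber of the incident wave (kz < 0 here)

# Solving (2.114): w0''(z) + k^2 w0(z) = 0 with w0'(-H) = 0 gives
# w0(z) = A cos(k (z + H)); the impedance (2.111) follows directly.
zeta = -k * np.tan(k * H)                    # (2.115)

# Reflection coefficient (2.112); |R| = 1 since zeta is real.
R = (1j * kz - zeta) / (1j * kz + zeta)
assert abs(abs(R) - 1.0) < 1e-12

# Cross-check zeta against the closed-form solution of (2.114).
w = lambda z: np.cos(k * (z + H))
dz = 1e-6
w_prime0 = (w(dz) - w(-dz)) / (2 * dz)       # w0'(0) by centered difference
assert abs(w_prime0 / w(0.0) - zeta) < 1e-4
print(zeta, R)
```

For kH close to π/2 the impedance blows up, reflecting the resonance of the quarter-wavelength grooves.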

Chapter 3

Geometric Optics

3.1 Introduction
Recall that in the variable y = (t, x) a differential operator L(y, Dy ) is of order m ≥ 0 if its
highest-order differentiation is of order m. Then Lm (y, Dy ) is the homogeneous differential
operator of order m such that L − Lm is at most of order m − 1. The characteristic variety of
L, denoted by CharL, is then the subset of points (y, η) ∈ Rd+1 × Rd+1 \{0} such that

det Lm (y, η) = 0. (3.1)

Geometric Optics is concerned with highly-oscillatory solutions of differential equations.


High oscillations mean high frequencies. Now differential operators of order m multiply frequencies by a coefficient of order |η|m , which is much larger than |η|m−1 when |η| ≫ 1. This implies that the leading term Lm (y, Dy ) plays a crucial role in geometric optics.
Consider an operator such that L = Lm and look for a plane wave solution:

Lm (y, Dy )[eiy·η a] = im Lm (y, η)eiy·η a = 0.        (3.2)

This happens locally if and only if (y, η) ∈ CharL. If Lm (Dy ) has constant coefficients, then eiy·η a is a global solution of Lm u = 0 if and only if det Lm (η) = 0 and a is in the kernel of Lm (η).
Let us insist on the high frequency regime by introducing η̂ = η/|η| and ε = |η|−1 ≪ 1. Then we verify that

Lm (Dy ) eiη̂·y/ε a = (im /εm ) Lm (η̂) eiη̂·y/ε a = 0,

provided that (y, η̂) is in the characteristic variety of Lm and a is an associated eigenvector in the kernel of Lm (η̂).
In the case of non-constant coefficients in the linear differential operator L, we no longer
expect plane waves of the form eiη·y to be solutions. These need to be generalized to the
following form:

u0 (y) = eiφ(y)/ε a0 (y).        (3.3)
Note that the above plane waves were such that φ(y) = η · y and ∇y φ(y) = η. Plugging this
ansatz into the differential operator, we verify that
L(y, Dy )u0 (y) = (1/εm ) Lm (y, i∇y φ(y))u0 (y) + O(ε1−m ),        (3.4)
assuming that φ and a0 are smooth functions.

Exercise 3.1.1 Verify the above statement.

We thus see that different behaviors emerge depending on whether (y, ∇φ(y)) belongs to the
characteristic variety of L or not. When the characteristic variety is empty, i.e., det Lm (y, η) ≡
|η|m Lm (y, η̂) ≠ 0, then we say that the operator L is elliptic. Elliptic operators do not admit
oscillatory solutions unless there is an oscillatory source. Indeed we deduce from (3.4) that a0
has to vanish. We refer to [26, Chap.4] for additional details on the geometric optics theory for
elliptic equations. However, for hyperbolic equations, the characteristic variety is not empty,
and (3.4) tells us that locally, highly oscillatory solutions of Lu = 0 require us to construct
phases φ(y) such that (y, ∇φ(y)) ∈ Char L. This latter statement is in fact equivalent to φ(y)
being a solution to the famous eikonal equation.

3.2 Second-order scalar equation


Consider the second-order scalar equation

∂ 2 p/∂t2 − c2 (x)∆p = 0,        p(0, x) = p0 (x),        ∂p/∂t (0, x) = 0,        (3.5)
which corresponds to L(y, D) = L2 (y, D) with symbol L(y, η) = −τ 2 + c2 (x)|ξ|2 . We recall
that y = (t, x) and η = (τ, ξ). We thus find that

Char L = {(y, η), such that − τ 2 + c2 (x)|ξ|2 = 0}. (3.6)

The characteristic variety may be decomposed into two leaves parameterized by τ = ±c(x)|ξ|.
This shows that information propagates with speed ±c(x) since τ is real-valued when ξ is. We
thus observe that (y, ∇φ(y)) ∈ Char L is equivalent to the classical eikonal equation

L(y, i∇φ(y)) = (∂φ/∂t)2 − c2 (x)|∇φ|2 = 0,        (3.7)

which again has the two possible solutions

∂φ/∂t = ±c(x)|∇φ|.        (3.8)
Before we further analyze the eikonal equation, we recast the high-frequency regime as a regime
with highly oscillatory functions.

3.2.1 High Frequency Regime


Consider the framework where the typical distance of propagation L of the waves is much
larger than the typical wavelength λ in the system. We introduce the small adimensionalized
parameter
ε = λ/L ≪ 1.        (3.9)
We thus rescale space x → ε−1 x and, since l = c × t, rescale time accordingly t → ε−1 t, to obtain the equation

ε2 ∂ 2 pε /∂t2 = c2 (x) ε2 ∆pε ,        pε (0, x) = p0ε (ε−1 x),        ∂pε /∂t (0, x) = h0ε (ε−1 x).        (3.10)

Note that the terms ε2 cancel in the above partial differential equation so that the high fre-
quency regime is really encoded in the highly oscillatory initial conditions. Energy conservation
is then recast as
E(t) = (1/2ρ0 ) ∫Rd ( |ε∇pε |2 (t, x) + c−2 (x) |ε ∂pε /∂t|2 (t, x) ) dx = E(0).        (3.11)

One of our objectives in such a regime is to characterize the spatial distribution of the
energy density, at least in an approximate sense as ε → 0. The high frequency regime of
the wave field is apparent in its initial condition p0 , which depends on ε. We thus want a
highly oscillatory initial field, and want moreover that it be of finite energy. In a homogeneous
medium, the simplest highly oscillatory fields are the plane waves exp(iy · η/ε), which may be
evaluated at t = 0 to yield exp(ix · ξ/ε). Since such a function is not of bounded energy, we multiply it by a smooth and compactly supported function a0 (x), say. An initial condition of the form eix·ξ/ε a0 (x) both has high frequency oscillations (of order ε−1 ) and has bounded
energy.
In media with non-constant coefficients, plane wave solutions rarely exist. Even in ho-
mogeneous media, one may be interested in highly oscillatory initial conditions that may be
constant along surfaces that are not necessarily hyperplanes (the plane waves eix·ξ/ε are con-
stant on hyperplanes x · ξ constant). This encourages us to consider initial conditions of the
form

p0 (x) = exp(iφ(x)/ε) a0 (x).        (3.12)
When φ(x) = x · ξ, we retrieve the preceding plane waves. Once we have initial conditions
of the above form, the main question that arises is whether the solution of the wave equation
(3.5) admits a useful asymptotic expansion. The answer consists of looking for solutions of
the same form as that of the initial conditions, namely, a highly oscillatory exponential phase
multiplied by a slowly varying amplitude. The theory of geometric optics provides a general
framework to solve for the phase and the amplitude.

3.2.2 Geometric Optics Expansion


Following the preceding discussion, we define the following ansatz

pε (y) = eiφ(y)/ε aε (y), aε (y) = a0 (y) + εa1 (y) + . . . . (3.13)

and wish to compare the above ansatz to the high frequency solution pε (t, x). Recall that c(x)
is independent of ε in this chapter. Some algebra shows that:

L(eiφ(y)/ε a(y)) = eiφ(y)/ε × ( −(1/ε2 ) L(y, i∇y φ)a + (2i/ε) Vφ a + (i/ε)(L(y, Dy )φ)a + L(y, Dy )a ),        (3.14)
where we have defined the vector field
Vφ (y) = ∂φ/∂t (y) ∂/∂t − c2 (x) ∂φ/∂xj (y) ∂/∂xj ,        (3.15)

with summation over the repeated index j.

Note that, as is usual in differential geometry, vector fields are identified with first-order differential operators, i.e., the basis elements ei of Rd are identified with ∂/∂xi , 1 ≤ i ≤ d.

Exercise 3.2.1 Verify the above derivation.

In order for eiφ(y)/ε aε (y) to solve the wave equation, at least approximately, we see that the
leading term in (3.14) must vanish, which implies that the phase φ must satisfy the eikonal
equation
L(y, i∇φ(y)) = (∂φ/∂t)2 − c2 (x)|∇φ|2 = 0.        (3.16)
In order to justify the above asymptotic expansion, we need to ensure that the phase function
φ(y) is uniquely determined. This is so for instance when |∇φ(0, x)| never vanishes on the
support of a0 (x), in which case either choice in (3.8) can be made. We will come back to
the solution of the eikonal equation in the next section. For the moment we assume that the
eikonal equation is uniquely solvable on an interval (0, T ) once a choice of sign has been made
in (3.8).
The vector field in (3.15) is now uniquely defined and it remains to find an equation for
aε (y), at least approximately. We plug the expansion for aε into (3.14), with a replaced by aε
and equate like powers of ε. The term of order ε−1 provides that

2Vφ a0 + (Lφ)a0 = 0. (3.17)

This is a transport equation for a0 . We verify that Vφ φ = 0, which implies that φ is constant
along the integral curves of the vector field Vφ , which are called rays. We see that a0 is also
transported along the rays, except for the presence of the “absorption” coefficient Lφ (which does not need to have a constant sign). Because we assume that |∇φ| ≠ 0, so that ∂t φ ≠ 0, the integral curves of Vφ are indeed transverse to the hyperplane t = 0, so that the equation for a0 (y) with a0 (0, x) known admits a unique smooth solution.
The higher-order terms in the expansion yield in turn that

2Vφ an + (Lφ)an + Lan−1 = 0, n ≥ 1. (3.18)

For the same reasons as before, the above equation for an admits a unique solution.
Formally, if the above construction is carried out for all n ≥ 1, we find that

Lpε = L(eiφ(y)/ε aε (y)) = O(ε∞ ) ∼ 0.




We refer to [26] for additional details on the notation, which we will not use in the sequel.
Rather we now want to understand the accuracy of the above procedure provided that
the above expansion is truncated at order n ≥ 0, say. This is done by an energy estimate.
Consider
pε (y) = eiφ(y)/ε Σ_{j=0}^{n} aj (y)εj .

Then we verify that

ε2 Lpε = ε2+n Lan eiφ(y)/ε = ε2 fε (y), with kfε (t, ·)kH s (Rd ) = O(εn−s ), 0 ≤ s ≤ n, 0 ≤ t ≤ T.

Exercise 3.2.2 Verify the last statement.

The error δε = pε − pε thus satisfies the equation


ε2 Lδε = ε2 fε ,        δε (0, x) = ∂δε /∂t (0, x) = 0.
This shows the existence of a constant such that, for instance,

kpε (t, ·) − pε (t, ·)kL2 (Rd ) ≤ Cεn , uniformly on 0 ≤ t ≤ T. (3.19)

Proof. This is based on the energy method. Consider the equation

∂ 2 u/∂t2 − c2 (x)∆u = f,

with vanishing initial conditions at t = 0, and the total energy

E(t) = (1/2ρ0 ) ∫Rd ( |ε∇u|2 (t, x) + c−2 (x) |ε ∂u/∂t|2 (t, x) ) dx.
We find that

Ė(t) = (1/ρ0 ) ∫Rd ( ε∇u · ε∇ ∂u/∂t + c−2 (x) ε ∂u/∂t · ε ∂ 2 u/∂t2 ) dx = (1/ρ0 ) ∫Rd c−2 (x) εf (t) ε ∂u/∂t dx,

after integrations by parts. As in the proof of Theorem 2.1.1, we find that

kε ∂u/∂t k(t) ≤ CT εkf kC(0,T ;L2 (Rd )) .

Here k · k is the L2 (Rd ) norm.

Exercise 3.2.3 Verify the last statement in detail.

After integration in time, this shows that

ku(t)k ≤ CT kf kC(0,T ;L2 (Rd )) , 0 ≤ t ≤ T.

This concludes the proof of the bound on δε . Of course, the proof holds independent of ε and
we could have chosen ε = 1 since all derivatives in the wave equation are second-order, and
thus the powers ε2 cancel out. We have kept the ε-dependence to insist on the fact that this
is the high-frequency energy estimate E(t) that is needed here.

3.3 The Eikonal equation


We refer to the Appendix in Chapter 5 of [26] and to Chapter 3 in [15] for the details of the
theory.
The eikonal equation takes the form

L(y, ∇φ) = 0. (3.20)

A better notation for ∇φ(y) would be the one-form dφ(y) since the theory is more natural
geometrically when (y, dφ(y)) is seen as a covariant vector rather than a contravariant vector.
In Euclidean geometry, both types of vectors have the same coordinate expressions and we
will use the notation in (3.20).
The objective is to solve the above equation by the method of characteristics. Let us
assume that φ(y) is known on a hypersurface M of Rd+1 , for instance on the hyperplane
t = 0, so that φ|M = g. Then the derivatives of φ in the directions tangent to M are also
known, since
e · ∇φ(z) = e · ∇g(z), e ∈ Tz M.
Here Tz M is the tangent space to M at z. There remains one unknown directional derivative
of φ(z), namely that in the direction orthogonal to M . We use the equation (3.20) to determine
such a derivative. We need however to make sure that (3.20) indeed determines that derivative.
This is the case provided that ∇η L(y, ∇φ) is not tangent to M at y ∈ M . Otherwise,

L(y, ∇φ) = 0 only gives information about the derivatives of φ that are tangent to M , and we
already know those.
For instance for the wave equation with M the hyperplane t = 0, the eikonal equation for
L(y, η) = τ 2 − c2 (x)|ξ|2 implies that

∂/∂τ (τ 2 − c2 (x)|ξ|2 ) = 2τ ≠ 0,

and that
∂φ/∂t = ±c(x)|∇φ| = ±c(x)|∇g|.
The constraint that ∇η L(y, ∇φ) not be tangent to M may still allow several solutions. We
pick one of them. We have thus defined ∇φ(z) in all directions for all z ∈ M . It remains to
find φ(z) away from M . This can be achieved by the method of characteristics.
Let us first differentiate the equation (3.20):

∂L/∂yj (y, ∇φ(y)) + ∇η L(y, ∇φ(y)) · ∇(∂φ/∂yj )(y) = 0,

by the chain rule. Let us now construct curves in the phase space (y(s), η(s)) such that
∇φ(y(s)) = η(s). Differentiating the latter equality yields
η̇i (s) = Σj ∂ 2 φ/∂yi ∂yj (y(s)) ẏj (s).

Upon defining

ẏj (s) = ∂L/∂ηj (y(s), ∇φ(y(s))),
we thus obtain that
∂L/∂yj (y(s), ∇φ(y(s))) + η̇j (s) = 0.
This is nothing but the system of Hamilton’s equations

ẏ(s) = ∇η L(y(s), η(s)), η̇(s) = −∇y L(y(s), η(s)). (3.21)

Such a system admits a unique solution for given initial conditions y(0), η(0). The curves
(y(s), η(s)) are then called the bicharacteristic curves and their spatial projections y(s) are
called rays. Note that we already defined rays as the integral curves of Vφ in the preceding
section.
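Hamilton's equations (3.21) can be integrated numerically. The sketch below (a minimal illustration; the smooth speed c(x) = 1 + 0.2 sin x and all numerical parameters are hypothetical) traces one bicharacteristic of L(y, η) = τ 2 − c2 (x)ξ 2 in one space dimension with a fourth-order Runge-Kutta scheme, and checks that the symbol L is conserved along the flow, i.e., that the curve stays on the characteristic variety:

```python
import numpy as np

# Bicharacteristics of L(y, eta) = tau^2 - c(x)^2 xi^2 in one space
# dimension, with a hypothetical smooth sound speed c(x) = 1 + 0.2 sin x.
c = lambda x: 1.0 + 0.2 * np.sin(x)
c2_prime = lambda x: 2.0 * c(x) * 0.2 * np.cos(x)   # d/dx of c(x)^2

def rhs(u):
    """Hamilton's equations (3.21) for u = (t, x, tau, xi)."""
    t, x, tau, xi = u
    return np.array([2.0 * tau,                 # dt/ds   =  dL/dtau
                     -2.0 * c(x)**2 * xi,       # dx/ds   =  dL/dxi
                     0.0,                       # dtau/ds = -dL/dt = 0
                     c2_prime(x) * xi**2])      # dxi/ds  = -dL/dx

def rk4_step(u, h):
    k1 = rhs(u); k2 = rhs(u + 0.5*h*k1)
    k3 = rhs(u + 0.5*h*k2); k4 = rhs(u + h*k3)
    return u + h/6.0 * (k1 + 2*k2 + 2*k3 + k4)

# Start on the characteristic variety: tau = c(x0) |xi0|.
x0, xi0 = 0.3, 1.0
u = np.array([0.0, x0, c(x0) * abs(xi0), xi0])

L = lambda u: u[2]**2 - c(u[1])**2 * u[3]**2
for _ in range(2000):
    u = rk4_step(u, 1e-3)

# L(y(s), eta(s)) is conserved along bicharacteristics, so the curve
# stays (numerically) on the characteristic variety.
assert abs(L(u)) < 1e-8
print(L(u), u)
```

The spatial projections y(s) of such curves are the rays; since ∂L/∂t = 0 here, τ is conserved exactly along the flow as well.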

Exercise 3.3.1 (not so easy.) Verify that, for the example given in the preceding section, the
two definitions of rays coincide by showing that the spatial projection of the bicharacteristics
are indeed integral curves for Vφ . Hint: use the fact that k = ∇x φ in the construction of the
bicharacteristics.

Let us recapitulate. We know φ(y) on M and have been able to deduce ∇φ(y) on M using
(3.20) and the fact that M is not characteristic for L(y, η), i.e., that ∇η L is not tangent to
M . This provides us with “initial” conditions y(0) = y ∈ M and η(0) = ∇φ(y). We thus
solve for the bicharacteristics (y(s), η(s)). This gives us ∇φ(y(s)) = η(s).
Application of the inverse function theorem shows that for sufficiently small times, the rays
y(s; y, ∇φ(y)) starting at y ∈ M with direction ∇φ(y) so constructed cover a neighborhood
of M . Thus for any point z in the vicinity of M , there is a unique y ∈ M and a unique s0 such

that z = y(s0 ) with y(0) = y and η(0) = ∇φ(y). Let γ(s) be such a curve. On the curve, we
have ∇φ(y(s)) = η(s), which implies that
d/ds φ(y(s)) = ∇φ(y(s)) · ẏ(s) = η(s) · ẏ(s).

Hence

φ(z) = φ(y) + ∫0^{s0} η(s) · ẏ(s) ds = φ(y) + ∫0^{s0} η(s) · ∇η L(y(s), η(s)) ds.
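For the wave symbol with constant c, this phase formula can be checked explicitly: L(y, η) = τ 2 − c2 |ξ|2 is homogeneous of degree two in η, so η · ∇η L = 2L = 0 on the characteristic variety and the phase is constant along each bicharacteristic. A minimal sketch (all numerical values are hypothetical):

```python
import numpy as np

# For L(y, eta) = tau^2 - c^2 |xi|^2 with constant c, the symbol is
# homogeneous of degree 2, so eta . grad_eta L = 2 L = 0 on the
# characteristic variety: the phase is constant along bicharacteristics.
c = 1.3
xi0 = np.array([0.6, 0.8])
x0 = np.array([1.0, -2.0])
tau0 = -c * np.linalg.norm(xi0)        # sheet dphi/dt = -c |grad phi|

phi0 = x0 @ xi0                        # phi(0, x) = x . xi0
phi_exact = lambda t, x: x @ xi0 - c * np.linalg.norm(xi0) * t

for s in np.linspace(-1.0, 1.0, 11):
    t = 2.0 * tau0 * s                 # dt/ds = dL/dtau = 2 tau
    x = x0 - 2.0 * c**2 * xi0 * s      # dx/ds = dL/dxi = -2 c^2 xi
    # phi(y(s)) = phi(y(0)) + integral of eta . grad_eta L ds = phi(y(0))
    assert abs(phi_exact(t, x) - phi0) < 1e-12
print("phase constant along rays:", phi0)
```

For variable c(x) the integral of η · ∇η L still vanishes on the characteristic variety for the same homogeneity reason, but the rays must then be computed numerically as in the previous sketch.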
Note that the above construction for φ(y) only works in the vicinity of M , i.e., for suffi-
ciently small times 0 < s < T . Indeed, for large times, it may be that y(s; y, ∇φ(y)) and
y(s0 ; y0 , ∇φ(y0 )) are equal. In such a case, we find that φ(z) becomes multi-valued and thus
ceases to be a smooth solution of the eikonal equation (3.20).
Exercise 3.3.2 Let M be R2 , a surface in R2 × R corresponding to t = 0. Consider the wave
equation with L(y, η) = τ 2 − c2 |ξ|2 with c constant. Let φ(0, x) = g(r) be a function in R2
equal to r for 1 < r < ∞ and equal to 0 otherwise. Choose the sheet of the eikonal equation
given by ∂t φ = −c|∇φ|.
Solve the eikonal equation by the method of characteristics. Until what time can one solve
it? Show that at the final time, the solution of the eikonal equation cannot be smooth.
The above construction for the solution of the eikonal equation is only local in time, but
there is a more fundamental problem. When the solution of the eikonal equation stops being
defined, it means that solutions of the form eiφ/ε aε are no longer sufficiently rich to represent
wave propagation.

3.4 First-order hyperbolic systems


The construction in section 3.2.2 can be generalized to arbitrary first-order hyperbolic systems
of equations. Consider the equation

Lu(y) = 0, u(0, x) = u0 (x), (3.22)

where
L(y, D) = Σ_{m=0}^{d} Am (y) ∂/∂ym + B(y) = L1 (y, D) + B(y).        (3.23)
We can then look for solutions of the form

u(y) = eiφ(y)/ε aε (y), aε (y) = a0 (y) + εa1 (y) + . . . .

We then verify that


 
L(y, D)(eiφ(y)/ε aε (y)) ∼ eiφ(y)/ε × ( (1/ε) L1 (y, i∇φ(y))a0 + Σ_{j≥0} εj ( L1 (y, i∇φ(y))aj+1 (y) + L(y, D)aj (y) ) ).        (3.24)

The leading term shows that L1 (y, i∇φ(y))a0 should vanish to verify (3.22) approximately.
This implies that
det L1 (y, i∇φ(y)) = 0. (3.25)
This is the eikonal equation for φ(y). The graph of ∇φ, i.e., (y, ∇φ(y)), belongs to the
characteristic variety of L(y, D). The solutions of the above equation are obtained as in

section 3.3: we first obtain a possible solution of ∂t φ on M based on (3.25) and then once the
corresponding sheet of the characteristic variety has been chosen, we solve the eikonal equation
by the method of characteristics.
The construction of aj is somewhat more complicated and the reader is referred to [26, §5.3]
for the details, where it is shown that the construction can be done, at least for sufficiently
small times so that the eikonal equation can be solved.
A theorem by P. Lax (1957) then shows that eiφ(y)/ε aε (y) so constructed is indeed a solution of the first-order hyperbolic system up to a term of order O(ε∞ ).

Chapter 4

Random perturbations

This chapter addresses the variations in the wave fields caused by small fluctuations in the
underlying medium. By small we mean sufficiently small so that simple asymptotic expansions
provide a good approximation for how the field is modified.

4.1 Statistical description of continuous random fields.


This section introduces some notation on the random perturbations that will be useful in the
sequel. Let f (t) be a random process depending on a one-dimensional variable t and taking
values in Rd . By a random process, we mean that the process is a function of the realization
ω ∈ Ω so that at each t in an interval I ⊂ R, ω 7→ f (t; ω) is a random variable on the
probability space (Ω, F, P ).
In the probability space, (Ω, F) is a measurable space, i.e., a set (state space) Ω that comes
with a σ−algebra F, which should be thought of as the collection of “all the reasonable” subsets of Ω. Then
P is a function F → [0, 1] (a measure), which to each A ∈ F associates a probability P (A).
The measure P is a probability measure when P (Ω) = 1. Thus P (A) roughly indicates the
probability that the realization ω be in A. When all this is true, we call (Ω, F, P ) a probability
space.
Let (Rd , B) be a measurable space with B the Borel σ−algebra on Rd (think of all products
of intervals on Rd and then all “reasonable” unions and intersections of such products and you
get an idea of B [12, 13, 25]). A random variable f on (Ω, F) is an F-measurable function
from Ω to Rd . It induces a probability measure on Rd , µf , defined by

µf (B) = P (f −1 (B)). (4.1)


Then, if ∫_Ω |f(ω)| dP(ω) < ∞, we define the expectation of f as

E{f} = ∫_Ω f(ω) dP(ω) = ∫_{R^d} x dµf(x). (4.2)

A stochastic process is thus a parameterized family of random variables {ft }t∈I defined on
(Ω, F, P ) and taking values in Rd .
We can also view t 7→ f (t; ω) as a path of f in Ω. We can thus identify ω with that path,
and may thus regard Ω as a subset of Ω̃ = (Rn )I of all functions from I to Rn (an enormous
space!). We refer to [12, 13, 25] for more details. The notation that follows comes almost
verbatim from the book by Tatarski “Wave propagation in turbulent media”.

Correlation function and power spectrum. Let f(t) be a random process. The correlation function of f(t) is defined by

Bf(t1, t2) = ⟨[f(t1) − ⟨f(t1)⟩][f(t2) − ⟨f(t2)⟩]⟩. (4.3)

Here ⟨·⟩ refers to the ensemble average over all possible realizations (with respect to the measure
P introduced above). Thus the correlation function is a second-order moment of the process
f(t). We say that f(t) is stationary if ⟨f(t)⟩ is independent of t and Bf(t1, t2) = Bf(t1 − t2).
Often we assume ⟨f(t)⟩ = 0. Then we have the stochastic Fourier-Stieltjes integral with
random complex amplitude

f(t) = ∫_R e^{iωt} dϕ(ω). (4.4)

Since Bf depends on t1 − t2 we verify that

⟨dϕ(ω1) dϕ*(ω2)⟩ = δ(ω1 − ω2) W(ω1) dω1 dω2, (4.5)

where W(ω) ≥ 0 is the spectral density of f(t). We check that

Bf(t) = ∫_R e^{iωt} W(ω) dω. (4.6)

We see that stationary processes f(t) generate non-negative spectral densities. The Wiener-Khinchin theorem states the converse: if W(ω) is non-negative, then there is a stationary
random process f(t) with spectral density W(ω). Note that

⟨|f(t)|²⟩ = ∫_R W(ω) dω,

whence the name spectral density of the power, or power spectrum.
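As a numerical aside (not in the original notes), the spectral representation (4.4)-(4.5) suggests a direct way to synthesize a stationary process with a prescribed power spectrum and to check (4.6) empirically. The sketch below uses the arbitrary choice W(ω) = (1/π)/(1 + ω²), for which (4.6) gives Bf(t) = e^{−|t|}.

```python
import numpy as np

rng = np.random.default_rng(1)

# prescribed power spectrum (arbitrary choice): W(w) = (1/pi)/(1+w^2),
# whose correlation function by (4.6) is B_f(t) = exp(-|t|)
w = np.arange(0.005, 40.0, 0.01)      # positive frequencies (midpoint rule)
dw = 0.01
W = (1.0/np.pi)/(1.0 + w**2)

t = np.linspace(0.0, 5.0, 51)
nreal = 4000                          # number of realizations

# real-valued spectral synthesis of a mean-zero stationary Gaussian process:
# f(t) = sum_n sqrt(2 W(w_n) dw) * (a_n cos(w_n t) + b_n sin(w_n t))
a = rng.standard_normal((nreal, w.size))
b = rng.standard_normal((nreal, w.size))
amp = np.sqrt(2.0*W*dw)
f = (a*amp) @ np.cos(np.outer(w, t)) + (b*amp) @ np.sin(np.outer(w, t))

# empirical correlation <f(0) f(t)> against exp(-|t|)
B_emp = (f[:, :1]*f).mean(axis=0)
err = np.max(np.abs(B_emp - np.exp(-t)))
print(err)
```

The agreement is limited by the frequency cutoff and the Monte Carlo noise; both shrink as the grid and the number of realizations grow.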

Stationary increments. More general random functions than stationary ones may be described
by stationary increments. They are then described by their structure function

Df(t1, t2) = ⟨[f(t1) − f(t2)]²⟩. (4.7)

Stationary increments mean that Df depends only on τ = t2 − t1. Stationary functions are special
cases of functions with stationary increments, for which we have

Df(τ) = 2[Bf(0) − Bf(τ)].

When Bf(∞) = 0 we have

2Bf(t) = Df(∞) − Df(t),

so Df characterizes the stationary process.
For all processes with stationary increments, we have that

Df(t) = 2 ∫_R (1 − cos ωt) W(ω) dω,

and a random function with stationary increments can be represented as

f(t) = f(0) + ∫_R (1 − e^{iωt}) dϕ(ω), (4.8)

for some random amplitudes such that

⟨dϕ(ω1) dϕ*(ω2)⟩ = δ(ω1 − ω2) W(ω1) dω1 dω2. (4.9)

For this we need to make sure that W(ω) is integrable.

Homogeneous random fields. In higher dimensions random processes are replaced by
random fields (it is just terminology). We define the correlation function Bf(x1, x2) as

Bf(x1, x2) = ⟨[f(x1) − ⟨f(x1)⟩][f(x2) − ⟨f(x2)⟩]⟩. (4.10)

Stationary random functions are replaced by homogeneous random fields, i.e., fields for which
the correlation function Bf(x1, x2) depends only on x1 − x2. When Bf depends only on
|x1 − x2| the field is isotropic.
We have the stochastic Fourier-Stieltjes integral

f(x) = ∫_{R^d} e^{ik·x} dϕ(k), (4.11)

where for some Φ ≥ 0,

⟨dϕ(k1) dϕ*(k2)⟩ = δ(k1 − k2) Φ(k1) dk1 dk2. (4.12)

We find that

Φ(k) = (1/(2π)^d) ∫_{R^d} cos(k · x) Bf(x) dx.

Locally homogeneous random fields. We define the structure function

Df(x1, x2) = ⟨[f(x1) − f(x2)]²⟩, (4.13)

which depends only on x1 − x2 for locally homogeneous fields. A locally homogeneous field may
then be represented as

f(x) = f(0) + ∫_{R^d} (1 − e^{ik·x}) dϕ(k), (4.14)

where f(0) is a random variable. We find that

Df(x) = 2 ∫_{R^d} (1 − cos k · x) Φ(k) dk,

where Φ(k) is defined as for homogeneous fields.

4.2 Regular Perturbation method


Let us consider the wave equation

(1/c²(x)) ∂²p/∂t² − ∆p = 0, t > 0, x ∈ R^d,
p(0, x) = 0, ∂p/∂t(0, x) = h(x). (4.15)

The above equation may be recast as

∂²p/∂t² − c²(x)∆p = h(x)δ(t), x ∈ R^d, (4.16)

with the condition that p(t, x) ≡ 0 for t < 0. This may be verified by recalling that the
derivative of the Heaviside function is the delta function in the sense of distributions. Upon

taking the Fourier transform in time F_{t→τ} and using the dispersion relation τ = c0 k, for some
constant background speed c0, we find that

∆p̂ + k² n²(x) p̂ = −h(x), (4.17)

where the index of refraction is given by

n(x) = c0/c(x), n²(x) = c0²/c²(x) = 1 + εµ(x), (4.18)

where ε ≪ 1 is a small parameter measuring the amplitude of the fluctuations in the underlying
medium and µ(x) is a random field on R^d. We thus assume that the sound speed is equal to a
constant background plus a small perturbation. Note that we are in the low frequency regime
here, where the wavelength λ ∼ L and the correlation length l ∼ L are of the same order as the size of the medium.
The only small parameter is thus the size of the random fluctuations.
Let us consider the case of an incoming plane wave e^{ik·x} onto a compact domain. This is
modeled by an index of refraction given by (4.18). Note that for k = |k|, the plane wave is a
solution of the homogeneous equation ∆p + k²p = 0. We thus look for a solution of

∆p + k² n²(x) p = 0, (4.19)

that solves the following Lippmann-Schwinger equation

p(x) = e^{ik·x} + ∫_{R^d} G(x, y) k² εµ(y) p(y) dy, (4.20)

where G(x, y) is the Green's function associated to the homogeneous Helmholtz equation

∆G(x, y) + k² G(x, y) + δ(x − y) = 0, (4.21)

with appropriate radiation conditions at infinity (i.e., so that there is no radiation coming
from infinity). In three dimensions we have the usual form

G(x, y) = G(x − y) = e^{ik|x−y|}/(4π|x − y|). (4.22)
Let us recast symbolically the Lippmann-Schwinger equation as

p = p0 + εGV p.

Then we observe that, for µ a mean-zero stationary random process, we have

⟨p⟩ = (I + ε²⟨GV GV⟩) p0 + O(ε⁴),

which, neglecting O(ε⁴) terms, may be recast as

(∆ + k²)⟨p⟩ + ε²⟨V GV⟩⟨p⟩ = 0. (4.23)

We have thus replaced a heterogeneous equation by a homogenized equation for the ensemble
average of p. Note that for µ(x) a stationary random process with correlation R(x), we obtain
that

(∆ + k²)⟨p⟩ + (ε²k⁴/4π) ∫_{R³} (e^{ik|x−y|}/|x − y|) R(x − y) ⟨p⟩(y) dy = 0. (4.24)

The above equation was obtained by J.B. Keller in 1964.

This is a convolution equation, which may thus admit plane waves as exact solutions.
Looking for ⟨p⟩ = e^{iξ·x}, we find the dispersion relation

ξ² = k² + (ε²k⁴/4π) ∫_{R³} (e^{ik|y|}/|y|) R(y) e^{−iξ·y} dy. (4.25)

Here ξ² = −(iξ) · (iξ). However we observe that ξ is not real-valued. Let us consider ξ = ρk
for some ρ > 0. To first order we can then replace e^{−iξ·y} by e^{−ik·y} in the above expression to
get

ξ² = k² + (ε²k⁴/4π) ∫_{R³} (e^{ik|y|}/|y|) R(y) e^{−ik·y} dy.

For R(y) = R(|y|) (isotropic random medium) we find that

(1/2π) ∫_{R³} (e^{i(k|y|−k·y)}/|y|) R(y) dy = (1/k) ∫_0^∞ R(r) ( sin 2kr + i(1 − cos 2kr) ) dr.

Exercise 4.2.1 Check the above formula.

So for instance for R(|y|) = e^{−α|y|}, we verify that both the real and imaginary parts in the
above expression are non-negative. This allows us to conclude that ℜξ > k and that ℑξ > 0, at
least for sufficiently small ε. This means that the mean wave propagates more slowly than in a
homogeneous medium, since τ = c0 k together with ℜξ > k yields an effective phase speed τ/ℜξ < c0, and that it is attenuated by the inhomogeneities.
The attenuation is not intrinsic attenuation, since energy is conserved by the wave equation,
and the Helmholtz equation is the same phenomenon seen in the Fourier domain. What
happens is that the coherent part of the signal decays exponentially. As the waves propagate
through the random medium, energy is scattered into other directions. Because this scattering
is incoherent, it is “lost” when we consider ⟨p⟩. Higher moments of p need to be considered as
well. This will be done by means of the Wigner transform of the wave fields in later chapters.

4.3 Random Geometric Optics


For the same equation

∆p + k² n²(x) p = 0, (4.26)

let us consider geometric optics solutions

p(x) = e^{ikS(x)} a(x), (4.27)

in the high frequency regime, i.e., when kL ≫ 1, where L is the typical distance of propagation
we are interested in.
Note that this time, we are interested in frequencies of order k so that the “low-order
term”, at least in terms of derivatives, is as important as the “high-order term” ∆p. As a
“characteristic variety” for the Helmholtz equation, we thus really want to consider

L(x, ξ) = −|ξ|² + k² n²(x) = 0. (4.28)

Let us plug the ansatz (4.27) into (4.26) and equate like powers of k. The eikonal equation,
which is the leading term in O(k²), now becomes

L(x, k∇S(x)) = k²(−|∇S|²(x) + n²(x)) = 0. (4.29)

Note that the formalism is the same as in the preceding chapters: the eikonal equation captures
the largest homogeneous terms in the equation. Here “largest” means in terms of k, and not
ξ, the dual variable to x.
The above variety has two regular branches and we consider

L̃(x, ξ) = |ξ| − n(x) = 0, (4.30)

where we have replaced ξ by the reduced ξ/k. The eikonal equation thus becomes

L̃(x, ∇S(x)) = |∇S|(x) − n(x) = 0. (4.31)

The above eikonal equation can then be solved by the method of characteristics. As before we
find rays (x(s), ξ(s)) such that ∇S(x(s)) = ξ(s). This is achieved by the Hamiltonian system

Ẋ(s) = Ξ̂(s), X(0) = x,
Ξ̇(s) = ∇n(X(s)), Ξ(0) = ξ, (4.32)

where Ξ̂ = Ξ/|Ξ| denotes the unit vector along Ξ. We verify that

(d/ds) S(X(s)) = n(X(s)) = |Ξ(s)|, (4.33)

so that

S(X(s)) = S(x) + ∫_0^s n(X(u)) du. (4.34)
This fully characterizes the phase S(x).
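As an illustration (not part of the original notes), the Hamiltonian system (4.32) can be integrated numerically for a smooth index of refraction, here the arbitrary choice n(x) = 1 + 0.2 e^{−|x|²/2} in two dimensions. Along the flow, d|Ξ|/ds = Ξ̂ · ∇n = d n(X)/ds, so |Ξ(s)| − n(X(s)) is conserved, and the phase is accumulated along the ray via (4.34).

```python
import numpy as np

# illustrative smooth index of refraction (an arbitrary choice for the sketch)
def n(x):      return 1.0 + 0.2*np.exp(-0.5*(x @ x))
def gradn(x):  return -0.2*np.exp(-0.5*(x @ x))*x

def rhs(y):                          # y = (X, Xi), system (4.32)
    x, xi = y[:2], y[2:]
    return np.concatenate([xi/np.linalg.norm(xi), gradn(x)])

x0 = np.array([-4.0, 0.3])
xi0 = np.array([1.0, 0.0])*n(x0)     # |Xi(0)| = n(X(0)): start on the variety (4.30)
y = np.concatenate([x0, xi0])
h, S = 0.01, 0.0
for _ in range(800):                 # RK4 steps up to s = 8
    k1 = rhs(y); k2 = rhs(y + 0.5*h*k1); k3 = rhs(y + 0.5*h*k2); k4 = rhs(y + h*k3)
    ynew = y + (h/6.0)*(k1 + 2*k2 + 2*k3 + k4)
    S += 0.5*h*(n(y[:2]) + n(ynew[:2]))   # phase integral of (4.34), trapezoidal
    y = ynew

print(np.linalg.norm(y[2:]) - n(y[:2]))   # |Xi(s)| - n(X(s)) stays ~ 0
print(S)                                   # phase accumulated along the ray
```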
Let us now consider a simplified version where x = (x1, x2) in two space dimensions and
where S(0, x2) = 0. We also assume that n(x) = 1 + σµ(x) with σ ≪ 1. We thus have the
phase at x1 = 0 and are interested in the phase at x1 > 0.
Characteristics start at (0, x2) and with direction e1 + O(σ) ≈ e1. We want to find the
right scaling in time t so that fluctuations of order O(1) can be observed in the phase with
respect to propagation in a homogeneous medium.
For σ ≪ 1, we observe that Ξ̂(s) = (1, 0)^t + O(σs) for s ≪ σ⁻¹. In that regime we thus
have X(s) = X(0) + sΞ̂(0) + O(σs²), so that for σs² ≪ 1 we have approximately that

d(S − s)/ds = σµ(X(0) + sΞ̂(0)).
For large times s = t/ε, we can approximate the phase as

S(t/ε) = t/ε + ε^α Sε(t),

where

dSε/dt = (σ/ε^{1+α}) µ(t/ε, x2). (4.35)

Let us now choose ε such that σ = ε^{1/2+α}, so that the above equation is recast as

dSε(t; x2)/dt = (1/√ε) µ(t/ε, x2), Sε(0; x2) = 0. (4.36)

Since we want σt²/ε² ≪ 1, this implies that α > 3/2. The value of 3/2 is by no means optimal
because the estimate σt²/ε² ≪ 1 is very conservative.

In the limit ε → 0, the above phase function converges to a stochastic process S(t; x2)
given by

S(t; x2) = σ0(t; x2) Wt, (4.37)

where Wt is Brownian motion and where the variance σ0² is given by

σ0²(t; x2) = 2 ∫_0^∞ E{µ(0, x2)µ(τ, x2)} dτ. (4.38)

If µ is homogeneous in both directions x1 and x2, then the above variance is independent of
x2. That this is so for a large class of processes µ comes from a much more general result that
can be found in Appendix A.
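A quick Monte Carlo check of (4.36)-(4.38) (not in the original notes): take for µ, in its first variable, a stationary Ornstein-Uhlenbeck process with E{µ(0)µ(τ)} = e^{−|τ|}, so that σ0² = 2 by (4.38); the variance of Sε(t) should then be close to σ0² t for small ε.

```python
import numpy as np

rng = np.random.default_rng(2)

eps, t_final = 1e-2, 1.0
h = 0.05                               # step in the fast variable s
nsteps = int(t_final/(eps*h))          # S_eps(1) needs s in [0, t_final/eps]
npaths = 4000

# stationary OU process: d mu = -mu ds + sqrt(2) dW_s, E{mu(0)mu(tau)} = e^{-|tau|}
mu = rng.standard_normal(npaths)       # start in the invariant distribution
decay = np.exp(-h)
kick = np.sqrt(1.0 - decay**2)

# S_eps(t) = (1/sqrt(eps)) int_0^t mu(s'/eps) ds' = sqrt(eps) int_0^{t/eps} mu(s) ds
S = np.zeros(npaths)
for _ in range(nsteps):
    S += np.sqrt(eps)*mu*h
    mu = decay*mu + kick*rng.standard_normal(npaths)

sigma0_sq = 2.0                        # 2 * int_0^infty e^{-tau} d tau
print(S.var(), sigma0_sq*t_final)
```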
The above result is only valid when the method of characteristics provides a unique solution
to S(x) for x1 > 0. It is shown in [31] that caustics form with probability one when ε ∼ σ^{2/3},
which corresponds to α = 1. In this scaling, it can be shown that X(s/ε) deviates by an
amount of order O(1) from its value in a homogeneous medium. This is therefore the regime
where characteristics cross (with probability one somewhere), thereby invalidating the geometric optics ansatz.
Let us consider the regime ε = σ^{2/3} more carefully. We first recast the Hamiltonian system
(4.32) as

Ẋ(s) = Ξ̂(s), X(0) = x,
(d/ds)Ξ̂(s) = (I − Ξ̂(s) ⊗ Ξ̂(s)) ∇n(X(s))/n(X(s)), Ξ̂(0) = ξ. (4.39)

Here we have used the fact that |Ξ(s)| − n(X(s)) = 0 to find an equation for (X, Ξ̂).
Here we have used the fact that |Ξ(s)| − n(X(s)) = 0 to find an equation for (X, Ξ̂).

Exercise 4.3.1 Derive (4.39) from (4.32).

We now restrict ourselves to the two-dimensional setting and define ξ⊥ as the rotation by π/2
of ξ ∈ S¹. The above system may then be recast as

Ẋ(s) = Ξ̂(s), X(0) = x,
(d/ds)Ξ̂(s) = ( (∇n(X(s))/n(X(s))) · Ξ̂⊥(s) ) Ξ̂⊥(s), Ξ̂(0) = ξ. (4.40)

Let us now assume that n = 1 + σµ with µ(x) a mean-zero stationary random field and σ = ε^{3/2}.
For times s = t/ε, which are thus such that sσ = t√ε ≪ 1, we have Ξ̂(s) = Ξ̂(0) + O(√ε).
Upon neglecting terms that will not contribute to order O(1) at times of order s = t/ε, we
deduce that

Ẋ(s) = Ξ̂(s), X(0) = x,
(d/ds)Ξ̂(s) = ε^{3/2} (∂µ/∂x2)(X(s)) ξ⊥, Ξ̂(0) = ξ.
This implies that X(s) = sξ + Xε(t), where Xε(t) is a priori not as large as s = t/ε. We thus
recast the above system (now in the t-variable) as

Ẋε(t) = Ξε(t), Xε(0) = x,
Ξ̇ε(t) = (1/√ε) (∂µ/∂x2)( (t/ε)ξ + Xε(t) ) ξ⊥, Ξε(0) = 0. (4.41)

Exercise 4.3.2 Verify the above equation. Note that we have defined implicitly Ξε(t) =
ε⁻¹(Ξ̂(s) − Ξ̂(0)).

Note that only the second component of Ξε(t) evolves in time, which implies the same result
for Xε(t). We thus further simplify (with no additional term neglected) to

Ẋε(t) = Ξε(t), Xε(0) = x2,
Ξ̇ε(t) = (1/√ε) (∂µ/∂x2)( t/ε, Xε(t) ), Ξε(0) = 0, (4.42)

where Xε and Ξε now denote the second components of the corresponding vectors.

This is the correct scaling to show order O(1) modifications caused by the randomness.
The results in the appendix show that Ξε(t) becomes an O(1) process, and so does Xε(t).
Since X(s) = sξ + Xε(t), we observe that trajectories starting at (0, x2) have deviated by
an order O(1) amount at the large time s = t/ε (large compared to the correlation length of the
medium, which we have scaled as O(1) here) from where they would be if the medium were
homogeneous, i.e., (s, x2). It is then shown in [31] that in this regime, rays cross with
probability one, thus generating caustics at those points where an infinite number of rays
come tangentially (thereby concentrating signals and creating very bright areas, as at the
bottom of a pool illuminated from the top).
We are “almost” in the regime considered in the appendix. The reason is that here the
random field depends continuously on the fast scale t/ε but also on the unknown Xε.
Let us assume that

(∂µ/∂x2)(t, x) = Σ_{j∈J} e^{−i k_j x} αj(t),

where αj(t) is a mean-zero random process and the {αj(t)}_{j∈J} are jointly Markov with
infinitesimal generator Q. We can now apply the theory with x = (x, ξ) and y = {αj}. We find
that

F1(x, y) = 0, G1(x, y) = ξ,
F2(x, y) = Σ_{j∈J} e^{−i k_j x} αj, G2(x, y) = 0. (4.43)

The only non-vanishing diffusion coefficient is, independent of the number of Fourier coefficients αj, given by

a22(x, ξ) = 2 ∫_0^∞ E∞{ (∂µ/∂x2)(0, x) (∂µ/∂x2)(s, x) } ds.

The only non-vanishing drift term is

b1(x, ξ) = ξ.

Writing a22(x) = σ²(x), we obtain that (Xε(t), Ξε(t)) converges (weakly) to the process solving

Ẋ(t) = Ξ(t), X(0) = x2,
dΞ(t) = σ(X(t)) dWt, Ξ(0) = 0. (4.44)

Here Wt is the usual one-dimensional centered Brownian motion with variance E{Wt²} = t.
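For constant σ (a simplification of (4.44)), the limit is an integrated Brownian motion: Ξ(t) = σWt and X(t) − x2 = σ ∫_0^t Ws ds, with E{Ξ²(t)} = σ²t and E{(X(t) − x2)²} = σ²t³/3. A short Euler-Maruyama sketch (all parameter choices arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

sigma, t_final, nsteps, npaths = 0.8, 2.0, 2000, 20000
h = t_final/nsteps

X = np.zeros(npaths)                   # take x2 = 0 for the sketch
Xi = np.zeros(npaths)
for _ in range(nsteps):
    X += Xi*h                                            # dX = Xi dt
    Xi += sigma*np.sqrt(h)*rng.standard_normal(npaths)   # dXi = sigma dW_t

print(Xi.var(), sigma**2*t_final)            # ~ sigma^2 t
print(X.var(), sigma**2*t_final**3/3.0)      # ~ sigma^2 t^3 / 3
```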

Chapter 5

Wigner Transforms

5.1 Definition of the Wigner transform


Let u(x) and v(x) be two m-dimensional vector fields on R^d, and let ε > 0 be a real number
tailored to represent the spatial scale at which phenomena occur, typically ε ≪ 1. We define
the Wigner transform of the two fields u and v as the m × m matrix-valued field on R^d × R^d:

Wε[u, v](x, k) = (1/(2π)^d) ∫_{R^d} e^{ik·y} u(x − εy/2) v*(x + εy/2) dy. (5.1)

Here v* represents the conjugate transpose (adjoint) of the vector v, so that u v* is an m × m matrix.


With our definition of the Fourier transform we have

Wε[u, v](x, k) = F⁻¹_{y→k} [ u(x − εy/2) v*(x + εy/2) ] (x, k), (5.2)

which is obviously equivalent to the fact that

W̃ε[u, v](x, y) = F_{k→y}[Wε[u, v]](x, y) = u(x − εy/2) v*(x + εy/2). (5.3)

The Wigner transform can thus be seen as the Fourier transform in the fast spatial variations of the two-point correlation of the two fields. It is thus an object defined in phase
space, which tries to account for the rapid oscillations at reduced wavenumber k (physical
wavenumber k/ε) in the vicinity of a macroscopic scale point x.

Scaling. We verify that

ε^d Wε[u, v](x, εk) = W1[u, v](x, k). (5.4)

This is consistent with the fact that the reduced wavenumber k corresponds to oscillations
with physical wavenumber k/ε. We also have the natural relationship

Wε[u(α·), v(α·)](x, k) = α^{−d} Wε[u, v](αx, k/α). (5.5)

Noteworthy relationships. We verify directly from (5.1) and from the interpretation (5.2)
that

Wε[u, v](x, k) = Wε*[v, u](x, k),
∫_{R^d} Wε[u, v](x, k) dk = (u v*)(x),
∫_{R^d} k Wε[u, v](x, k) dk = (iε/2)(u ∇v* − ∇u v*)(x), (5.6)
∫_{R^{2d}} |k|² Wε[u, v](x, k) dk dx = ε² ∫_{R^d} ∇u · ∇v* dx.

Exercise 5.1.1 Check the above properties.

Wigner transform in Fourier domain. Let us define the Fourier transform of the Wigner
transform

Ŵε[u, v](p, k) = F_{x→p}[Wε[u, v]](p, k). (5.7)

Then we find that

Ŵε[u, v](p, k) = (1/(2πε)^d) û(p/2 + k/ε) \widehat{v*}(p/2 − k/ε), (5.8)

where \widehat{v*} denotes the Fourier transform of v*.

Exercise 5.1.2 Check the above properties. Recall that \overline{\hat f}(ξ) = \hat{\bar f}(−ξ).
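The formulas above can be explored numerically. The sketch below (all parameter choices arbitrary, d = 1) evaluates Wε by direct quadrature of (5.1), after the substitution z = εy/2, for a Gaussian packet u(x) = (πσ²)^{−1/4} e^{−x²/(2σ²)} e^{ik0 x/ε}; the k-marginal recovers |u(x)|², in line with the second identity in (5.6), and Wε concentrates near the reduced wavenumber k0.

```python
import numpy as np

eps, k0, sig = 0.1, 1.0, 1.0

z = np.linspace(-8.0, 8.0, 1601); dz = z[1] - z[0]   # lag z = eps*y/2
k = np.linspace(-3.0, 5.0, 801);  dk = k[1] - k[0]

def u(x):
    return (np.pi*sig**2)**(-0.25)*np.exp(-x**2/(2*sig**2))*np.exp(1j*k0*x/eps)

def wigner(x):
    # W_eps(x, k) = (1/(pi*eps)) * int e^{2ikz/eps} u(x-z) conj(u(x+z)) dz
    c = u(x - z)*np.conj(u(x + z))
    phase = np.exp(2j*np.outer(k, z)/eps)
    return np.real(phase @ c)*dz/(np.pi*eps)

W0 = wigner(0.0)
print(np.sum(W0)*dk, np.abs(u(0.0))**2)   # k-marginal at x = 0 vs |u(0)|^2
print(k[np.argmax(W0)])                   # peak near the reduced wavenumber k0
```

For this packet the transform is known in closed form, Wε(x, k) = (1/(πε)) e^{−x²/σ²} e^{−(k−k0)²σ²/ε²}, which the quadrature reproduces.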

5.2 Convergence properties


Let φε(x) be a complex-valued (scalar to simplify) sequence of functions uniformly (in ε)
bounded in L²(R^d). We consider the Wigner transform of the sequence

Wε(x, k) ≡ Wε[φε](x, k) = (1/(2π)^d) ∫_{R^d} e^{ik·y} φε(x − εy/2) φε*(x + εy/2) dy. (5.9)

The sequence of Wigner transforms defined above satisfies the following uniform bound. We
introduce the space A of functions λ(x, k) of x and k such that

‖λ‖_A = ∫_{R^d} dy sup_x |λ̃(x, y)| < ∞, (5.10)

where

λ̃(x, y) = ∫_{R^d} dk e^{−ik·y} λ(x, k) (5.11)

is the Fourier transform of λ in k. Then, as the following lemma shows, the distributions
Wε(x, k) are uniformly bounded in A′, the dual space to A, when the sequence of fields φε(x)
is uniformly bounded in L²(R^d).

Lemma 5.2.1 Let φε(x) be uniformly bounded in L²(R^d) by Φ. The family Wε(x, k) is uniformly bounded in A′, and more precisely,

‖Wε‖_{A′} ≤ ‖φε‖²_{L²(R^d)} ≤ Φ² (5.12)

for all ε > 0.

Proof. Let λ(x, k) ∈ A. Then,

⟨Wε, λ⟩ ≡ ∫_{R^{2d}} Wε(x, k) λ(x, k) dx dk = ∫_{R^{3d}} e^{ik·y} φε(x − εy/2) φε*(x + εy/2) λ(x, k) (dx dk dy)/(2π)^d
= ∫_{R^{2d}} φε(x − εy/2) φε*(x + εy/2) λ̃(x, y) dx dy.

Therefore, using the Cauchy-Schwarz inequality in x, we have

|⟨Wε, λ⟩| ≤ ∫_{R^d} sup_x |λ̃(x, y)| ( ∫_{R^d} |φε(x − εy/2) φε*(x + εy/2)| dx ) dy
≤ ∫_{R^d} sup_x |λ̃(x, y)| ( ∫_{R^d} |φε(x − εy/2)|² dx )^{1/2} ( ∫_{R^d} |φε(x + εy/2)|² dx )^{1/2} dy
≤ ( ∫_{R^d} sup_x |λ̃(x, y)| dy ) ( ∫_{R^d} |φε(x)|² dx ) ≤ ‖φε‖²_{L²(R^d)} ‖λ‖_A.

This gives (5.12) since by definition, we have

‖Wε‖_{A′} = sup_{λ∈A} |⟨Wε, λ⟩| / ‖λ‖_A. (5.13)

This result shows that the sequence Wε converges weakly * in A′. The Banach-Alaoglu theorem
(stating that the unit ball in a dual space E′ is compact for the weak * topology σ(E′, E)) then
implies that for each sequence εn → 0, we can extract a subsequence εn′ such that Wεn′
converges in the weak * topology to a limit W ∈ A′. What this means is that

lim_{εn′→0} ⟨Wεn′, λ⟩ = ⟨W, λ⟩, for all λ ∈ A.

The space A′ is a space of distributions on R^{2d}, i.e., a subspace of D′(R^{2d}). However this
is a big subspace that includes the bounded measures M(R^{2d}). Therefore, Wigner transforms,
which may be smooth at fixed ε (for instance when φε is smooth), are no longer necessarily
smooth in the limit ε → 0.
The above results extend to matrix-valued Wigner transforms. Let uε(x) and vε(x) be
uniformly bounded in (L²(R^d))^m by Φ. Then the Wigner transform

Wε(x, k) = Wε[uε, vε](x, k), (5.14)

is uniformly bounded in (A^{m×m})′ and consequently admits converging subsequences in the
same space for the weak * topology.
Exercise 5.2.1 Verify this claim.
Now let us restrict ourselves to the case where vε = uε. It turns out that the limit is more
regular than an element of (A^{m×m})′: it is in the space of bounded measures M^{m×m}(R^{2d}). Moreover it is
a nonnegative Hermitian matrix-valued measure. That it is Hermitian comes from the first
property in (5.6). We refer to [17] for the proof that it is a nonnegative measure, i.e., that for
all e ∈ C^m, the limit W0 satisfies

Σ_{i,j=1}^m (W0)_{ij} e_i ē_j ≥ 0. (5.15)

Let us consider the matrix uε uε*. Since uε is bounded in L², each component of uε uε*
is bounded in L¹(R^d), hence in M(R^d), the space of bounded measures on R^d, which is the
dual of the space of compactly supported continuous functions on R^d equipped with the sup
norm ‖ϕ‖_∞ = sup_{x∈R^d} |ϕ(x)|. The same Banach-Alaoglu theorem used above then implies
that uε uε* admits subsequences that converge weakly * in the space of bounded measures to a
matrix ν:

uε uε* → ν for the weak * topology in M(R^d). (5.16)
In many cases, ν is precisely the object we are interested in: when uε is a field, ν has the units
of an energy. We have the following important properties of the Wigner transform.

Definition 5.2.2 A bounded family uε(x) in L² is said to be ε-oscillatory as ε → 0 if for
every continuous compactly supported function ϕ on R^d, we have

lim_{ε→0} ∫_{|k|≥R/ε} |\widehat{ϕuε}(k)|² dk → 0 as R → ∞. (5.17)

A bounded family uε(x) in L² is said to be compact at infinity as ε → 0 if

lim_{ε→0} ∫_{|x|≥R} |uε|²(x) dx → 0 as R → ∞. (5.18)

Compactness at infinity means that the mass of |uε|² remains localized in a bounded region
uniformly in ε. ε-oscillatory means that the typical frequency of oscillation of the functions is precisely
of order ε⁻¹. We can verify that a sufficient condition for such a behavior is for instance:

‖(ε∇)uε‖_{L²} ≤ C, independent of ε. (5.19)

Here (ε∇) can be replaced by an arbitrary number (including real-valued) of derivatives, e.g.
of the form (ε²∆)^s. Then we have the following properties [17]:

Proposition 5.2.3 Let uε be a bounded family in (L²(R^d))^m with Wigner transform Wε
converging (up to extraction of a subsequence) to a limiting measure W0. Let us denote by
w0 = Tr W0. Then w0 is a bounded measure on R^{2d}. Moreover we have

W0(A × R^d) ≤ ν(A), A any Borel subset of R^d, (5.20)

with equality if and only if uε is ε-oscillatory. We also have that

w0(R^{2d}) ≤ lim_{ε→0} ∫_{R^d} |uε(x)|² dx, (5.21)

with equality if and only if uε is ε-oscillatory and compact at infinity.

The proof of this and similar results may be found in [17, 22].
The above results are important in the following sense. They state that for uε ε-oscillatory
and compact at infinity, we have

lim_{ε→0} ∫_{R^d} |uε(x)|² dx = Tr ν(R^d) = w0(R^{2d}). (5.22)

This means that all the energy there is in the system is captured in the limit ε → 0 by the
Wigner transform. The equality W0(A × R^d) = ν(A) states that the average of the Wigner
transform over wavenumbers k does give the local energy density in the limit ε → 0. This
however only happens when the function uε oscillates at the right scale. If it oscillates at
the scale ε², then the limiting measure ν will still capture those oscillations, but not the
Wigner transform Wε, and (5.20) would become a strict inequality.

When uε converges weakly to 0, then ν(A) can be considered as a local defect measure,
which measures by how much uε does not converge strongly to 0 on A. When ν(A) = 0,
then uε converges strongly to 0 on A. The limit W0 is a microlocal defect measure, which
measures the defect of compactness not only locally in space, but microlocally in the phase
space. Whereas ν(A) tells us that uε does not converge strongly to 0 on A, W0(x, k) also says in
which directions k it oscillates and what the “strength” of such oscillations is.
This justifies the terminology that W0 is a phase-space energy density. It states how much
energy oscillates at position x with reduced wavenumber k. Note that Wε(x, k) cannot quite
be given this interpretation of phase-space energy density because nothing prevents it from
being locally negative. It nonetheless helps to intuitively consider Wε(x, k) as a phase-space
energy density-like object; all the more so since ∫_{R^d} Wε(x, k) dk = |uε|²(x) from the second line
in (5.6).
Let us conclude this section with a remark on Wε[uε, vε](x, k). Since it is uniformly bounded
in (A^{m×m})′, it converges to a limit W0 as well. However the limiting distribution W0 need
not be non-negative. Yet it is a measure on R^{2d}. Indeed we have from the definition of the
Wigner transform and from (5.6) that

(Wε + Wε*)[uε, vε] = Wε[uε + vε, uε + vε] − Wε[uε, uε] − Wε[vε, vε],
(Wε − Wε*)[uε, vε] = i( Wε[uε + ivε, uε + ivε] − Wε[uε, uε] − Wε[ivε, ivε] ). (5.23)

All terms on the right-hand sides converge to (nonnegative) measures, so that both combinations on the left converge to signed measures, and hence Wε[uε, vε](x, k) also
converges to a (complex, matrix-valued) measure on R^{2d}. The above formulas may also be used to translate
the results stated in Proposition 5.2.3 to limiting correlations.
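The first identity in (5.23) can be checked numerically by bilinearity; for scalar fields Wε*[u, v] = Wε[v, u] by the first line of (5.6). The sketch below (arbitrary Gaussian packets, direct quadrature of (5.1) at a single phase-space point) is only an illustration of the algebra.

```python
import numpy as np

eps = 0.1
z = np.linspace(-8.0, 8.0, 1601); dz = z[1] - z[0]

def packet(kc):
    # Gaussian packet at reduced wavenumber kc (arbitrary test fields)
    return lambda x: np.pi**(-0.25)*np.exp(-x**2/2.0)*np.exp(1j*kc*x/eps)

def wigner(f, g, x, k):
    # scalar W_eps[f, g](x, k) by quadrature of (5.1), with z = eps*y/2
    c = f(x - z)*np.conj(g(x + z))
    return (np.exp(2j*k*z/eps) @ c)*dz/(np.pi*eps)

u, v = packet(1.0), packet(-0.5)
s = lambda x: u(x) + v(x)
x, k = 0.3, 0.7

lhs = wigner(u, v, x, k) + wigner(v, u, x, k)            # (W + W*)[u, v]
rhs = wigner(s, s, x, k) - wigner(u, u, x, k) - wigner(v, v, x, k)
print(abs(lhs - rhs))
```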

5.3 Equations for the Wigner transform


The Wigner transform introduced in the preceding section will be a useful tool in the analysis
of the propagation of high frequency waves in random media. Let us now assume that u^φ_ε(t, x)
for φ = 1, 2 are two wave fields solving wave equations of the form

ε ∂u^φ_ε/∂t + A^φ_ε u^φ_ε = 0, φ = 1, 2, (5.24)

with appropriate initial conditions. We thus explicitly assume that u^φ_ε solves a first-order
equation in time. Typically, A^φ_ε is a differential operator in the spatial variables, although
more general operators and operators with coefficients that depend on time as well may be
considered. Because it simplifies life a bit and is true for classical wave equations, we assume
that the u^φ_ε are real-valued and that the A^φ_ε are real-valued operators.
When we expect the fields u^φ_ε to oscillate at the frequency ε⁻¹, the Wigner transform of
the two fields provides a tool to analyze their correlation function, or the energy density of
a wave field when u¹_ε = u²_ε. One of the major advantages of the Wigner transform is that it
satisfies a closed-form equation. This should not come too much as a surprise. Since we have
an equation for u^φ_ε, it is not difficult to find an equation for the correlation u¹_ε(x)u²_ε(y) for a
large class of equations of the form (5.24). We have seen that the Wigner transform is then
not much more than the Fourier transform of a two-point correlation function of fields.
More specifically, an equation for the Wigner transform defined as

Wε(t, x, k) = Wε[u¹_ε(t, ·), u²_ε(t, ·)](x, k), (5.25)

is obtained as follows. We verify from (5.1) and (5.24) that

ε ∂Wε/∂t + W[A¹_ε u¹_ε, u²_ε] + W[u¹_ε, A²_ε u²_ε] = 0. (5.26)

It thus remains to find operators A^φ_ε such that W[A¹_ε u¹_ε, u²_ε] = A¹_ε[Wε] and W[u¹_ε, A²_ε u²_ε] =
A²*_ε[Wε] to obtain a closed-form evolution equation for Wε(t, x, k). The initial condition
Wε(0, x, k) is then the Wigner transform of the initial conditions u¹_ε(0, x) and u²_ε(0, x) for
the wave fields, which are supposed to be known. The derivation of such operators A^φ_ε is
based on the following pseudo-differential calculus.

Pseudo-differential calculus. Let P(x, εD) be a matrix-valued pseudo-differential operator, defined by

P(x, εD)u(x) = ∫_{R^d} e^{ix·k} P(x, iεk) û(k) dk/(2π)^d. (5.27)

We assume that P(x, iεk) is a smooth function and use the same mathematical symbol for
the operator P(x, εD) and its symbol P(x, iεk). We also define D = ∇ as the gradient in the
spatial variables, and not −i times the gradient as is often the convention. Thus for us, D has
symbol ik. We verify that when P(x, iεk) is polynomial in its second variable, then P(x, εD)
is a differential operator.
Let Ŵ[u, v](p, k) be the Fourier transform F_{x→p} of the Wigner transform. We recall that

Ŵ[u, v](p, k) = (1/(2πε)^d) û(p/2 + k/ε) \widehat{v*}(p/2 − k/ε) = (1/(2πε)^d) û(p/2 + k/ε) v̂*(−p/2 + k/ε) (5.28)

when v is real-valued. This implies that for a homogeneous operator P(εD), we have

Ŵ[P(εD)u, v](p, k) = P(ik + εip/2) Ŵ[u, v](p, k), (5.29)

whence

W[P(εD)u, v](x, k) = P(ik + εD/2) W[u, v](x, k). (5.30)

The same calculation shows that when v and P(εD)v are real-valued, we have

W[u, P(εD)v](x, k) = [ W[u, v](x, k) P*(ik − εD/2) ]. (5.31)

In the latter right-hand side, we use the convention that the differential operator D acts on
W[u, v](x, k), and thus the right-hand side should be interpreted as the inverse Fourier transform of the matrix
Ŵ[u, v](p, k) P*(ik − εip/2). Another way of stating this is that

[ W[u, v](x, k) P*(ik − εD/2) ]_{jk} = Σ_{p=1}^m P*_{pk}(ik − εD/2) W_{jp}[u, v](x, k). (5.32)

We use the notation [·] to represent such a convention.
We use the notation [·] to represent such a convention.


We now generalize the above calculation to

W[P(x, εD)u, v](x, k) = L_P W[u, v](x, k). (5.33)

We verify that

F[P(x, εD)u](k) = ∫_{R^d} P̂(k − ξ, iεξ) û(ξ) dξ/(2π)^d,

where P̂(q, iεξ) denotes the Fourier transform of the symbol in its first variable. Using (5.28), we thus obtain that

Ŵ[P(x, εD)u, v](p, k) = ∫_{R^d} P̂(ξ, ik + iε(p/2 − ξ)) Ŵ[u, v](p − ξ, k − εξ/2) dξ/(2π)^d.

After Fourier transforms, we finally obtain that

L_P W(x, k) = ∫_{R^{2d}} ( ∫_{R^d} e^{ip·(x−y)} P̂(ξ, ik + iε(p/2 − ξ)) dp/(2π)^d ) e^{iξ·y} W(y, k − εξ/2) dξ dy/(2π)^d. (5.34)

We verify that (5.34) generalizes (5.30). A very similar expression generalizes (5.31).

Exercise 5.3.1 Work out that generalization.
Exercise 5.3.1 Work out that generalization.

Asymptotic expansions. The operator L_P defined in (5.34) is amenable to asymptotic
expansions. Throughout the text, we shall use the convention

P′(x, ik) = ∇_{ik} P(x, ik) = −i ∇_k P(x, ik). (5.35)

For functions W(x, k) that are sufficiently smooth in the k variable, we have the Taylor
expansion

W(x, k − εξ/2) = W(x, k) − (εξ/2) · ∇_k W(x, k) + O(ε²).

Similar asymptotic expansions in (5.34) yield that for smooth functions W(x, k) we have

L_P W(x, k) = M_ε W(x, k) + ε N_ε W(x, k) + O(ε²)
M_ε W(x, k) = P(x, ik + εD/2) W(x, k) + (iε/2) ∇_x P(x, ik + εD/2) · ∇_k W(x, k) (5.36)
N_ε W(x, k) = i ∇_x · ∇_k P(x, ik + εD_x/2) W(x, k).

The above calculations allow us to deduce that for functions W(x, k) that are sufficiently
smooth in both variables x and k, we have

L_P W(x, k) = L_ε W(x, k) + ε N_ε W(x, k) + O(ε²)
L_ε W(x, k) = P(x, ik) W(x, k) + (ε/2i) {P, W}(x, k), (5.37)

where we have defined the Poisson bracket

{P, W}(x, k) = (∇_k P · ∇_x W − ∇_x P · ∇_k W)(x, k). (5.38)

Similarly, we define

W[u, P(x, εD)v](x, k) = L* W(x, k). (5.39)

We verify that when v and P(x, εD)v are real-valued,

L* W(x, k) = (M*_ε + ε N*_ε) W(x, k) + O(ε²) = (L*_ε + ε N*_ε) W(x, k) + O(ε²)
M*_ε W(x, k) = [ W(x, k) P*(x, ik − εD/2) ] − (iε/2) [ ∇_k W(x, k) · ∇_x P*(x, ik − εD/2) ]
L*_ε W(x, k) = [ W(x, k) P*(x, ik) ] + (ε/2i) {W, P*}(x, k), (5.40)
N*_ε W(x, k) = −i [ W(x, k) ∇_x · ∇_k P*(x, ik + εD_x/2) ].

When W = W[u, v] with u and v bounded in L², and P(x, ik) is a smooth function, then
we verify that the above O(ε²) terms are of the form ε²Rε with Rε uniformly bounded in
(S′)^{m×m}, the space of matrix-valued Schwartz distributions.

Exercise 5.3.2 Verify that Rε is uniformly bounded in (S′)^{m×m}.

With the above hypotheses, we thus find the following useful result:

Wε[P(x, εD)u, v] = P Wε[u, v] + (ε/2i) {P, Wε[u, v]} + εi ∇_x · ∇_k P(x, ik) Wε[u, v] + ε²Rε,
Wε[u, P(x, εD)v] = [ Wε[u, v] P* ] + (ε/2i) {Wε[u, v], P*} − εi Wε[u, v] ∇_x · ∇_k P*(x, ik) + ε²Sε, (5.41)

where Rε and Sε are bounded in (S′)^{m×m}.

Highly oscillatory coefficients. The above pseudo-differential calculus was obtained for
smooth pseudo-differential operators P. When these operators involve highly oscillatory coefficients, different asymptotic expansions are necessary. We consider here the case where the
operator involves multiplication by a highly oscillatory matrix-valued coefficient. Let V(x) be
a real matrix-valued function. Then we find that

W[V(x/ε)u, v](x, k) = ∫_{R^d} e^{ix·p/ε} V̂(p) W[u, v](x, k − p/2) dp/(2π)^d,
W[u, V(x/ε)v](x, k) = ∫_{R^d} e^{ix·p/ε} W[u, v](x, k + p/2) V̂^t(p) dp/(2π)^d. (5.42)

Here V̂(p) is the Fourier transform of V(x) component by component. This may be generalized
as follows. Let V(x, y) be a real matrix-valued function, with Fourier transform V̌(q, p) and
Fourier transform with respect to the second variable V̂(x, p). We then find that

W[V(x, x/ε)u, v](x, k) = ∫_{R^{2d}} e^{ix·p/ε} e^{ix·q} V̌(q, p) W[u, v](x, k − p/2 − εq/2) dp dq/(2π)^{2d}
= ∫_{R^d} e^{ix·p/ε} V̂(x, p) W[u, v](x, k − p/2) dp/(2π)^d + O(ε), (5.43)

provided that W[u, v](x, k) is sufficiently smooth in the k variable.

Exercise 5.3.3 Work out the formula for W [V (x)u, v](x, k). Show that it is asymptotically
equivalent to (5.36).

Multiple scale expansion. The error terms in (5.36) and (5.37), although both deduced
from Taylor expansions, have different expressions. While the former involves second-order
derivatives in k of W (x, k), the latter involves second-order derivatives in both the k and x
variables. When W (x, k) has bounded second-order derivatives in x and k, then (LP − Lε )W =
O(ε2 ) and (LP − Mε )W = O(ε2 ). In the sequel however, we will need to apply the operator Mε
to functions that oscillate in the x variable and are smooth in the k variable. Such functions
will have the form W (x, x/ε, k). The differential operator D acting on such functions then takes
the form
D = Dx + (1/ε)Dy .
ε
We then verify that
Mε [W (x, x/ε, k)](x, k) = P (x, ik + Dy /2)W (x, y, k)|y=x/ε + O(ε). (5.44)
We will not need higher-order terms. Note that on such functions, (LP − Lε )W = O(1), which
implies that Lε cannot be used as an approximation of LP .

5.4 Examples of Wigner transforms
Compact sequence. Consider a sequence of functions uε (x) converging to u strongly in
L2 (Rd ) as ε → 0. Then one verifies that

uε (x − εy/2)u∗ε (x + εy/2) → |u(x)|2 ,

weakly in S 0 (R2d ) (to simplify) as ε → 0. This implies that the limiting Wigner transform is

W0 [uε , uε ](x, k) = |u(x)|2 δ(k). (5.45)

Exercise 5.4.1 Verify the above statements.

Oscillatory sequence. Consider a sequence of the form uε (x) = φξ (x)eix·ξ/ε for ξ ∈ Rd
fixed and φξ a smooth function. Then the limiting Wigner transform is given by

W0 [uε , uε ](x, k) = |φξ (x)|2 δ(k − ξ). (5.46)

We thus see that the Wigner transform of a plane wave is a delta function. The above result
extends to a superposition of plane waves. Let
uε (x) = Σ_{m=1}^{M} φm (x)eix·ξm /ε , (5.47)

where the momenta ξ m are mutually distinct. Then the limiting Wigner transform is given by

W0 [uε , uε ](x, k) = Σ_{m=1}^{M} |φm (x)|2 δ(k − ξ m ). (5.48)

This result is fundamental in applications: it states that the Wigner transform of two plane
waves with different wavenumbers tends to 0 in the limit ε → 0:

W [eix·ξ1 /ε , eix·ξ2 /ε ](x, k) = 0, (5.49)

provided that ξ 1 6= ξ 2 . Only those plane waves propagating in the same direction with the
same wavenumber interact coherently in the limit ε → 0. If ξ p = ξ q in (5.47), then the
coefficient |φp + φq |2 would appear in (5.48) rather than |φp |2 + |φq |2 . This example also shows
that the Wigner transform converges only weakly to its limit. Indeed the product of two plane
waves with different directions certainly does not converge strongly to 0 as ε → 0.
Warning: the above result holds for a finite number of plane waves. When the number
of plane waves becomes infinite, the above result may not hold. The reason is that if two
plane waves with wavenumbers differing by O(ε) are present, we cannot conclude that they
are uncorrelated in the limit. Only plane waves with sufficiently different directions (i.e., much
larger than O(ε)) are uncorrelated in the limit ε → 0. A typical example where the above limit
does not hold is that of the Wigner transform of a Bessel function; where the Bessel function is
defined for d = 2 as the superposition of all plane waves with wavenumber of modulus |k| = 1.
I leave this as a (quite difficult) exercise.
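The plane-wave computations above can be checked numerically with a discrete version of the scaled Wigner transform on a periodic grid. The sketch below (the grid, normalization, and FFT sign conventions are implementation choices, not taken from the notes) verifies that a single plane wave puts all its mass in one k bin, while the cross Wigner transform of two distinct plane waves averages to zero in x.

```python
import numpy as np

def wigner(u, v, eps, dx):
    """Discrete scaled Wigner transform on a periodic grid.

    The substitution s = eps*y/2 in the definition gives
    W[u, v](x, k) = (1/(pi*eps)) * int exp(2i*k*s/eps) u(x - s) v*(x + s) ds,
    evaluated here with an FFT in the shift variable s; the k bins are
    spaced by dk = pi*eps/(N*dx).
    """
    N = len(u)
    n = np.arange(N)
    s = np.where(n <= N // 2, n, n - N)  # symmetric integer shifts
    W = np.empty((N, N), dtype=complex)
    for j in range(N):
        corr = u[(j - s) % N] * np.conj(v[(j + s) % N])
        W[j] = np.fft.fft(corr) * dx / (np.pi * eps)
    return W

# Plane waves u_eps(x) = exp(i*x*xi/eps) on [0, 2*pi).
N, L, eps = 64, 2 * np.pi, 0.1
dx = L / N
x = dx * np.arange(N)
u1 = np.exp(1j * 3 * x)              # xi_1/eps = 3
u2 = np.exp(1j * 7 * x)              # xi_2/eps = 7
W_self = wigner(u1, u1, eps, dx)
W_cross = wigner(u1, u2, eps, dx)
dk = np.pi * eps / (N * dx)          # spacing of the k bins
print(np.sum(W_self[0]).real * dk)   # ≈ |u1(x)|^2 = 1
print(abs(W_cross.mean()))           # ≈ 0: distinct plane waves decorrelate
```

The first print checks the marginal property ∫ W dk = |u|2; the second checks that the cross Wigner transform oscillates in x and hence averages to zero, the discrete analogue of (5.49).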

Point concentration sequence. Consider now the sequence

uε (x) = ε−d/2 φ(x/ε). (5.50)

Then the limiting Wigner measure is given by

W [uε , uε ](x, k) = (1/(2π)d ) δ(x)|φ̂(k)|2 . (5.51)

The energy concentrates at one point in space and is radiated in each wavenumber k according
to the Fourier transform of the waveform φ(x).

Coherent state. The last example may be generalized as follows:

uε (x) = ε−d/2 φ((x − x0 )/ε)eix·ξ/ε . (5.52)

Then we find that the limiting Wigner measure is given by

W [uε , uε ](x, k) = (1/(2π)d ) δ(x − x0 )|φ̂(k − ξ)|2 . (5.53)

For different scalings, we obtain the following results. Assume that

uε (x) = ε−αd/2 φ((x − x0 )/εα )eix·ξ/ε . (5.54)

When α = 1, we have the result (5.53). When α > 1, we verify that W ≡ 0. This is because all
oscillations occur at a frequency ε−α ≫ ε−1 that the scaled Wigner transform cannot capture.
When 0 < α < 1, we verify that

W [uε , uε ](x, k) = ‖φ‖2L2 (Rd ) δ(x − x0 )δ(k − ξ). (5.55)

When α = 0, we recall that this is an oscillatory sequence, treated in (5.46).

WKB states. Let us now suppose that

uε (x) = φ(x)eiS(x)/εα , (5.56)

where φ and S are sufficiently smooth. When α < 1, we verify that the high oscillations do
not play any role in the limit and

W [uε , uε ](x, k) = |φ(x)|2 δ(k). (5.57)

When α = 1, we have the limiting Wigner transform

W [uε , uε ](x, k) = |φ(x)|2 δ(k − ∇S(x)). (5.58)

Proof. We calculate that

(Wε , a) = ∫R3d eik·y e(i/ε)S(x−εy/2) φ(x − εy/2)φ∗ (x + εy/2)e−(i/ε)S(x+εy/2) a(x, k) dxdydk/(2π)d
= ∫R2d e−iy·∇S(x) eiεrε (x,y) φ(x − εy/2)φ∗ (x + εy/2)â(x, y) dxdy,

where â(x, y) is the inverse Fourier transform F−1k→y of a(x, k), and rε is real-valued and
uniformly bounded for sufficiently smooth functions S(x). This implies that the above term is
uniformly (in ε) integrable in (x, y) for φ ∈ L2 and a ∈ A (for then â(x, y) is uniformly
bounded in x and integrable in y). By the Lebesgue dominated convergence theorem, we
obtain that the above term converges to
(W0 , a) = ∫R2d e−iy·∇S(x) |φ(x)|2 â(x, y) dxdy,

which is nothing but

(W0 , a) = ∫R2d δ(k − ∇S(x))|φ(x)|2 a(x, k) dxdk,

whence the result.


The case α > 1 is more delicate. When S(x) has no critical points, i.e., ∇S 6= 0 for all
x ∈ Rd , then the limiting Wigner measure is W ≡ 0 as in the case of plane waves. In the
presence of critical points, a more refined analysis is necessary.

Limiting Liouville equation. Let us assume that the phase S(t, x) and the amplitude
φ(t, x) solve the following eikonal and transport equations:

∂S/∂t + ω(x, ∇S) = 0, ∂|φ|2 /∂t + ∇ · (|φ|2 (∇k ω)(x, ∇S)) = 0, (5.59)
where ω(x, k) is a Hamiltonian. Then the Wigner transform defined by (5.58), namely

W (t, x, k) = |φ(t, x)|2 δ(k − ∇S(t, x)), (5.60)

as it turns out, solves the following Liouville equation,


∂W/∂t + {ω, W } = 0, (5.61)
where the Poisson bracket is defined in (5.38).
The proof is an exercise in distribution theory. We find that
∂W/∂t = (∂|φ(t)|2 /∂t) δ(k − ∇S(t, x)) + |φ|2 (∂/∂t) δ(k − ∇S(t, x))
= (∂|φ(t)|2 /∂t) δ(k − ∇S(t, x)) − |φ|2 (∇k δ)(k − ∇S(t, x)) · ∇x (∂S/∂t)(t, x)
= −∇ · (|φ|2 (∇k ω)(x, ∇S)) δ(k − ∇S(t, x)) + |φ|2 (∇k δ)(k − ∇S(t, x)) · ∇x (ω(x, ∇S))
= −∇x |φ|2 · ∇k ω(x, ∇S) δ(k − ∇S) − |φ|2 ∇x · ∇k ω(x, ∇S) δ(k − ∇S)
− |φ|2 ∇2k ω(x, ∇S) · ∇2x S δ(k − ∇S) + |φ|2 ∇k δ(k − ∇S) · ∇x ω(x, ∇S)
+ |φ|2 ∇k δ(k − ∇S)∇k ω · ∇2x S.

We verify that

∇k W · ∇x ω(x, k) = |φ|2 ∇k δ(k − ∇S) · ∇x ω(x, k)


= |φ|2 ∇k δ(k − ∇S) · ∇x ω(x, ∇S(t, x)) − |φ|2 δ(k − ∇S)∇k · ∇x ω(x, ∇S(t, x)).

Indeed, let a(k) be a test function. Then

(∇δ(k−k0 )·F, a) = −(δ(k−k0 ), ∇k ·(Fa)) = −∇·F(k0 )(δ(k−k0 ), a)+F(k0 )·(∇δ(k−k0 ), a).

Similarly, we find that

∇x W · ∇k ω(x, k) = (∇x |φ|2 )δ(k − ∇S) · ∇k ω(x, k)


+|φ|2 ∇k δ(k − ∇S(t, x))∇2x S · ∇k ω(x, k)
= (∇x |φ|2 )δ(k − ∇S) · ∇k ω(x, ∇S)
+|φ|2 ∇k δ(k − ∇S)∇2x S · ∇k ω(x, ∇S) + |φ|2 δ(k − ∇S)∇2x S∇2k ω(x, ∇S).

Upon careful inspection, one verifies that the above equalities yield
∂W/∂t = ∇k W · ∇x ω(x, k) − ∇x W · ∇k ω(x, k).
This is the Liouville equation (5.61). This important equation states that in “low frequency”
media, i.e., heterogeneous media with slow spatial variations so that (5.59) make sense, the
energy density satisfies a linear partial differential equation. It is moreover straightforward to
solve the linear equation by the method of characteristics. Indeed let us define

W (t, x, k) = W0 (X(−t), K(−t)), (5.62)

with (X(t), K(t)) solution of the Hamiltonian system

Ẋ(t) = ∇k ω(X(t), K(t)), K̇(t) = −∇x ω(X(t), K(t)), X(0) = x, K(0) = k. (5.63)

Then one verifies that W (t, x, k) in (5.62) solves (5.61) with initial conditions W (0, x, k) =
W0 (x, k).
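The characteristic solution (5.62)-(5.63) is straightforward to implement. Below is a minimal sketch for the Hamiltonian ω(x, k) = |k|2 /2 + V (x) in one dimension; the leapfrog integrator and the harmonic test potential are our illustrative choices, not taken from the notes. Since W is constant along each bicharacteristic, evaluating W0 at the backward flow solves the Liouville equation.

```python
import numpy as np

def hamilton_flow(x0, k0, grad_V, t, n_steps=1000):
    """Integrate Hamilton's equations (5.63) for omega(x,k) = |k|^2/2 + V(x):
    X' = K, K' = -grad V(X), with a symplectic kick-drift-kick (leapfrog) scheme."""
    dt = t / n_steps
    x, k = float(x0), float(k0)
    for _ in range(n_steps):
        k -= 0.5 * dt * grad_V(x)   # half kick
        x += dt * k                  # drift
        k -= 0.5 * dt * grad_V(x)   # half kick
    return x, k

# Harmonic potential V(x) = x^2/2: the exact flow is a rotation of phase space,
# X(t) = x0*cos(t) + k0*sin(t), K(t) = -x0*sin(t) + k0*cos(t).
x_t, k_t = hamilton_flow(1.0, 0.0, lambda x: x, np.pi / 2)
print(x_t, k_t)   # close to (0, -1)
```

To evaluate W (t, x, k) one integrates backward in time from (x, k) and reads off W0 at the resulting point, exactly as in (5.62).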
That the limit Wigner transform satisfies a Liouville equation is no surprise. In the limit of
vanishing wavelength, the wave energy density follows the trajectories of classical mechanics.
We have obtained this result in the framework of the WKB, or geometric optics, approxima-
tion. We’ll see that this result holds in more general situations where the geometric optics
approximation is not valid.

5.5 Semiclassical limit for Schrödinger equations


The high frequency Schrödinger equation (after the usual change of variables x → x/ε and
t → t/ε) is given by

iε ∂φε /∂t + (ε2 /2)∆φε − V (x)φε = 0, (5.64)
with ε-oscillatory initial conditions φ0ε (x). The potential V (x) is slowly varying. This is thus
the problem of high frequency waves in low frequency media that was handled by geometric
optics in an earlier chapter.
If we look for geometric optics solutions of the form

φε (t, x) ≈ A(t, x)eiS(t,x)/ε , (5.65)

then the evolution equations for the phase S and the amplitude A take the form of the following
eikonal and transport equations:
∂S/∂t + H(x, ∇S(t, x)) = 0, ∂|A|2 /∂t + ∇ · (|A|2 ∇k H(x, ∇S(x))) = 0, (5.66)

where the Hamiltonian has the familiar expression
H(x, k) = (1/2)|k|2 + V (x). (5.67)
Exercise 5.5.1 Derive the equations in (5.66).
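As a sanity check on Exercise 5.5.1, the eikonal and transport equations can be recovered symbolically in one space dimension by inserting the ansatz (5.65) into (5.64) and collecting powers of ε. A sketch using sympy follows (the symbol names are ours):

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
eps = sp.symbols('epsilon', positive=True)
A = sp.Function('A')(x, t)
S = sp.Function('S')(x, t)
V = sp.Function('V')(x)
phi = A * sp.exp(sp.I * S / eps)

# residual of the semiclassical Schrodinger equation (5.64) in 1D,
# with the common phase factor exp(i S / epsilon) divided out
R = sp.I * eps * sp.diff(phi, t) + eps**2 / 2 * sp.diff(phi, x, 2) - V * phi
R = sp.expand(R / sp.exp(sp.I * S / eps))

# O(1) coefficient: the eikonal equation S_t + S_x^2/2 + V = 0
eikonal = -(sp.diff(S, t) + sp.diff(S, x)**2 / 2 + V) * A
assert sp.simplify(R.coeff(eps, 0) - eikonal) == 0

# O(eps) coefficient: equivalent, for real A, to the transport equation
# (|A|^2)_t + (|A|^2 S_x)_x = 0 after multiplication by 2A
transport = sp.I * (sp.diff(A, t) + sp.diff(S, x) * sp.diff(A, x)
                    + sp.diff(S, x, 2) * A / 2)
assert sp.simplify(R.coeff(eps, 1) - transport) == 0
```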

The results in the preceding section show that the Wigner transform of (5.65) converges to
the solution of a Liouville equation as ε → 0. We will now re-derive this result without using
the geometric optics approximation.
Let Wε (t, x, k) be the Wigner transform of the solution to (5.64). Note that
∫Rd Wε (t, x, k) dk = |φε (t, x)|2 ,

so that the Wigner transform allows us to reconstruct the probability density of the quantum
waves.
Following the steps recalled earlier in the chapter, namely (5.26) and (5.34), we find the
following closed form equation for Wε :
∂Wε /∂t + k · ∇x Wε + Lε Wε = 0, (5.68)

where

Lε W (x, k) = (i/ε) ∫Rd eip·x V̂ (p) (W (x, k − εp/2) − W (x, k + εp/2)) dp/(2π)d . (5.69)
Let us assume that W (x, k) is sufficiently smooth in its k variable. Then one relatively
easily finds that

Lε W (x, k) → −∇x V (x) · ∇k W (x, k), as ε → 0. (5.70)

Exercise 5.5.2 Verify the latter statement. Recall that the Fourier transform of ∇x V is
ikV̂ (k).
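Exercise 5.5.2 can also be checked numerically for a single-mode potential: for V (x) = cos x the integral in (5.69) reduces to two finite differences in k, and the limit (5.70) becomes sin(x) ∂k W. A sketch follows (the smooth test profile W below is an arbitrary choice):

```python
import numpy as np

def L_eps(W, x, k, eps):
    """Apply (5.69) for V(x) = cos(x), whose Fourier transform puts weight
    1/2 on the modes p = +1 and p = -1, to a smooth function W(x, k)."""
    fd_p = W(x, k - eps / 2) - W(x, k + eps / 2)   # p = +1 contribution
    fd_m = W(x, k + eps / 2) - W(x, k - eps / 2)   # p = -1 contribution
    return (1j / eps) * 0.5 * (np.exp(1j * x) * fd_p + np.exp(-1j * x) * fd_m)

W = lambda x, k: np.exp(-k**2) * np.cos(x)        # arbitrary smooth profile
dW_dk = lambda x, k: -2 * k * np.exp(-k**2) * np.cos(x)
x, k = 0.7, 0.3
limit = np.sin(x) * dW_dk(x, k)                   # -grad V . grad_k W for V = cos
print(L_eps(W, x, k, 1e-4), limit)
```

The central differences converge at rate O(ε2 ), so the value at ε = 10−4 already agrees with the limit to high accuracy.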

This shows formally that the limiting equation for W is the Liouville equation
∂W/∂t + k · ∇x W − ∇x V · ∇k W = 0. (5.71)
This is nothing but (5.61) for the specific choice of a Hamiltonian in (5.67).
The above formal derivation is nice, but does not treat the geometric optics case: in the
latter case, W (t, x, k) is not smooth in k as one verifies from (5.60).
Since φε (t, x) is bounded in L2 (Rd ), we know that Wε (t, x, k) belongs to A0 for all times
and thus converges weakly in that space to W (t, x, k). Let now a(x, k) be a smooth test
function in Cc∞ (R2d ), which we may take as real-valued because we know that W (t, x, k) is
real-valued. We verify that
(Lε Wε , a) = ∫R3d eip·x iV̂ (p)Wε (t, x, k) (1/ε) (a(x, k + εp/2) − a(x, k − εp/2)) dpdkdx/(2π)d .

We verify that
(1/ε) (a(x, k + εp/2) − a(x, k − εp/2)) = p · ∇k a(x, k) + εrε (x, k, p), (5.72)

where (x, k) → rε (x, k, p) is bounded in A uniformly in p ∈ Rd and so is (x, k) → p·∇k a(x, k).
This shows that

∫R3d eip·x iV̂ (p)Wε (t, x, k)εrε (x, k, p) dpdkdx/(2π)d → 0
as ε → 0. Since eip·x ipV̂ (p) integrates to ∇V (x) (assuming that V is sufficiently regular), we
deduce from the weak convergence of Wε to W in A0 that

(Lε Wε , a) → (W, ∇V · ∇k a) = −(∇V · ∇k W, a) in D0 (R2d ). (5.73)

This shows that the left-hand side in (5.68) converges to the left-hand side in (5.71) in
D0 (R2d ) uniformly in time as ε → 0. Since the right-hand side in (5.68) converges to that in
(5.71), the limiting equation (5.71) for W (t, x, k) is established.
We have thus proved the following

Proposition 5.5.1 Let φ0ε (x) be an ε-oscillatory sequence of functions with Wigner transform
converging weakly in A0 to W0 (x, k). Let φε (t, x) be the solution of the Schrödinger
equation (5.64) and Wε (t, x, k) its Wigner transform. Then Wε (t, x, k) converges weakly in
A0 uniformly in time to its unique accumulation point W (t, x, k), which is a weak
solution of the Liouville equation (5.71) with initial conditions W0 (x, k).

Note that the whole sequence Wε (t, x, k) converges to W (t, x, k): we do not need to extract
subsequences. The reason is that there is a unique solution in the space of bounded measures
to the Liouville equation (5.71) with initial conditions W0 (x, k), which by assumption is the
only accumulation point of Wε0 (x, k). This uniquely defines the possible accumulation points
of Wε (t, x, k) as ε → 0.
The result stated in the proposition is much more general than the one obtained by ge-
ometric optics. Much more general initial conditions than those of WKB type (5.65) can be
considered. All we really need is initial conditions bounded in the L2 sense that are ε-oscillatory
so that no energy is lost when passing to the limit ε → 0. Note also that the Liouville equa-
tion is defined for all times, unlike the eikonal equation. Bicharacteristics, unlike their spatial
projections the rays, never cross in the phase space, so that the Liouville equation (a linear
equation) is not limited to sufficiently small times so that no caustics appear. The Wigner
transform and its limiting Liouville equation allow us to avoid the problem of caustics inherent
to the geometric optics formulation. The price to pay is that the Wigner transform is defined
in the phase space, which is much bigger than the physical domain (since the Hamiltonian
is an invariant of Hamilton’s equations, the system in the phase space is 2d − 1 dimensional
rather than d dimensional in the physical domain).

Chapter 6

Radiative transfer equations

High frequency wave propagation in highly heterogeneous media has long been modeled by
radiative transfer equations in many fields: quantum waves in semiconductors, electromagnetic
waves in turbulent atmospheres and plasmas, underwater acoustic waves, elastic waves in the
Earth’s crust. These kinetic models account for the wave energy transport in the phase space,
i.e., in the space of positions and momenta.
Such kinetic models account for the multiple interactions of wave fields with the fluctuations
of the underlying medium. We saw in Chapter 3 the interaction of high frequency waves with
low frequency media and in Chapter 2 the interaction of low frequency waves with high frequency
media. Radiative transfer equations model the interaction of high frequency waves in high
frequency media. The latter description encompasses many regimes of wave propagation.
We consider here the so-called weak coupling regime, where the correlation length of the
underlying medium is comparable to the typical wavelength in the system. In order for energy
to propagate, this forces the fluctuations to be of small amplitude, whence the term “weak”.
A systematic method to derive kinetic equations from symmetric first-order hyperbolic
systems, including systems of acoustics and elastic equations, in the weak-coupling limit has
been presented in [28] and extended in various forms in [4, 5, 6, 8]. In these papers, the energy
density of waves is captured by the spatial Wigner transform, which was introduced in Chapter
5. The method is based on formal multiple-scale asymptotic expansions in the Wigner trans-
form and extends to fairly general equations the kinetic models rigorously derived in [14, 29]
for the Schrödinger equation. Mathematically rigorous methods of derivation of macroscopic
models for wave propagation in heterogeneous media are postponed to later chapters.
We focus here on a non-symmetric two-by-two first-order system and on the scalar wave
equation to model acoustic wave propagation.

6.1 Non-symmetric two-by-two system


We recall the system introduced in section 1.2.1 for pressure p(t, x) and the rescaled time
derivative of pressure q(t, x) = c−2 (x)pt (t, x), so that u = (p, q)t solves the following 2 × 2
system
∂u/∂t + Au = 0, t > 0, x ∈ Rd ,
u(0, x) = (g(x), c−2 (x)h(x))t , x ∈ Rd , (6.1)

where

A = − [ 0, c2 (x); ∆, 0 ] = JΛ(x), J = [ 0, 1; −1, 0 ], Λ(x) = [ −∆, 0; 0, c2 (x) ], (6.2)

with 2 × 2 matrices written row by row, rows separated by semicolons.

Note that J is a skew-symmetric matrix (J t = −J) and that Λ is a symmetric matrix-valued


operator for the usual L2 scalar product. We also recall that energy conservation takes the
form

E(t) = (1/(2ρ0 )) ∫Rd uΛu dx = E(0). (6.3)

High frequency limit. Kinetic models arise in the high frequency limit of wave propagation.
We thus rescale t → ε−1 t and x → ε−1 x and obtain the following equation for uε :
 
ε ∂uε /∂t + Aε uε = 0, Aε = − [ 0, c2ε (x); ε2 ∆, 0 ], (6.4)

with initial conditions of the form uε (0, x) = u0ε (ε−1 x). We verify that acoustic energy
conservation implies that
E(t) = (1/(2ρ0 )) ∫Rd (|ε∇pε |2 (t, x) + c2ε (x)qε2 (t, x)) dx = E(0), (6.5)
is independent of time.
The above energy conservation is governed by quantities of the form |ε∇pε |2 and qε2 .
Whereas such quantities do not solve closed-form equations in the high frequency limit ε → 0,
they can be decomposed in the phase space into a quantity that solves a transport equation.
The role of kinetic models is to derive such a transport equation from the wave equations.
The Wigner transform is perfectly adapted to such a derivation.

Two by two hyperbolic systems. In the weak coupling regime, the medium is character-
ized by the sound speed:
c2ε (x) = c20 − √ε V (x/ε), (6.6)
where c0 is the background speed assumed to be constant to simplify and V (x) accounts for
the random fluctuations. The correlation length of the random heterogeneities, of order ε, is
chosen to ensure maximum interaction between the waves and the underlying media. The scaling
√ε is the unique scaling that allows the energy to be significantly modified by the fluctuations
while still solving a transport equation. Larger fluctuations lead to other regimes, such as
localization, which cannot be accounted for by kinetic models. Since the localization length
is always smaller than the diffusive (kinetic) length in spatial dimension d = 1, we restrict
ourselves to the case d ≥ 2.
Let us consider the correlation of two fields u1ε and u2ε propagating in random media with
the same background velocity c0 but possibly different heterogeneities modeled by V ϕ , ϕ = 1, 2.
We also replace the Laplacian in (6.4) by the more general smooth, real-valued, positive Fourier
multiplier operator p(εD), which may account for (spatial) dispersive effects. We assume
moreover that p(−ik) = p(ik). We retrieve p(εD) = ∆ for p(iξ) = (iξ) · (iξ) = −|ξ|2 .
We thus consider the equation
ε ∂uϕε /∂t + Aϕε uϕε = 0, ϕ = 1, 2, (6.7)

and assume the following structure for Aϕε :

Aϕε = − [ 0, c20 ; p(εD), 0 ] + √ε V ϕ (x/ε)K, K = [ 0, 1; 0, 0 ]. (6.8)

The correlation of two signals propagating in two different media may be of interest in probing
the temporal variations in the statistics of random media and has also found recent applications
in the analysis of time reversed waves.

6.2 Structure of the random fluctuations


The random inhomogeneities of the underlying media are modeled by the functions V ϕ (x).
We assume that V ϕ (x) for ϕ = 1, 2 is a statistically homogeneous mean-zero random field.
Because higher-order statistical moments of the heterogeneous fluctuations do not appear in
kinetic models, all we need to know about the statistics of the random media in the high
frequency limit are the two-point correlation functions, or equivalently their Fourier transform
the power spectra, defined by

c40 Rϕψ (x) = hV ϕ (y)V ψ (y + x)i, 1 ≤ ϕ, ψ ≤ 2, (6.9)


(2π)d c40 R̂ϕψ (p)δ(p + q) = hV̂ ϕ (p)V̂ ψ (q)i. (6.10)

Here h·i means ensemble average (mathematical expectation). We verify that R̂ϕψ (−p) =
R̂ϕψ (p).
We can also consider more general random fluctuations of the form V ϕ (x, xε ), where for
each x ∈ Rd , V ϕ (x, y) is a statistically homogeneous mean-zero random field.
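For numerical experiments one also needs to sample such random media. A standard spectral sketch in one dimension (the normalization below is our convention, not taken from the notes) filters white noise by the Fourier multiplier √R̂(p), which produces a real, statistically homogeneous, mean-zero Gaussian field with power spectrum R̂:

```python
import numpy as np

def gaussian_field(n, dx, R_hat, rng):
    """Sample a real, homogeneous, mean-zero Gaussian field with power
    spectrum R_hat by filtering discrete white noise in Fourier space."""
    w = rng.normal(size=n) / np.sqrt(dx)             # white noise, <w w> ~ delta
    p = 2 * np.pi * np.fft.fftfreq(n, d=dx)          # discrete wavenumbers
    V = np.fft.ifft(np.sqrt(R_hat(np.abs(p))) * np.fft.fft(w))
    return V.real                                    # imaginary part is round-off

rng = np.random.default_rng(0)
R_hat = lambda p: np.exp(-p**2 / 2)                  # example spectrum
V = gaussian_field(4096, 0.05, R_hat, rng)
# the variance should approximate (1/2pi) * int R_hat(p) dp ~ 0.399
print(V.mean(), V.var())
```

Since the multiplier √R̂(|p|) is even and the noise is real, the sampled spectrum is Hermitian and the field is real by construction.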

6.3 Equation for the Wigner transform


We define the Wigner transform of the two fields as

Wε (t, x, k) = W [u1ε (t, ·), u2ε (t, ·)](x, k), (6.11)

and deduce from (5.24) and (6.11) that

ε ∂Wε /∂t + W [A1ε u1ε , u2ε ] + W [u1ε , A2ε u2ε ] = 0. (6.12)
The pseudo-differential calculus recalled in Chapter 5 allows us to obtain the following equation
for the Wigner transform:
∂Wε εD εD √
)Wε + Wε P ∗ (ik − ) + ε Kε1 KWε + Kε2∗ Wε K ∗ = 0,

ε + P (ik + (6.13)
∂t 2 2
 
εD 0 c20 ϕ
Z
x·p p dp
P (ik + )=− 
εD
 , Kε W = ei ε V̂ ϕ (p)W (k − ) . (6.14)
2 p(ik + ) 0 Rd 2 (2π)d
2
Note that (6.13) is an exact evolution equation for the two-by-two Wigner transform Wε (t, x, k).
Its initial conditions are obtained by evaluating (6.11) at t = 0, and thus depend on the initial
conditions for u1ε and u2ε .

6.4 Multiple scale expansion
We are now interested in the high-frequency limit as ε → 0 of Wε . Because of the presence of a
highly oscillatory phase exp(iε−1 x · p) in the operator Kεϕ , direct asymptotic expansions on Wε
and (6.13) cannot provide the correct limit. Rather, as is classical in the homogenization of
equations in highly oscillatory media, as we have seen in Chapter 2, we introduce the following
two-scale version of Wε :
Wε (t, x, k) = Wε (t, x, x/ε, k), (6.15)
and still use the symbol Wε for the function on R3d+1 in the new variables (t, x, y, k). We
then find that the differential operator D acting on the spatial variables should be replaced
by Dx + (1/ε)Dy . The equation for Wε thus becomes

ε ∂Wε /∂t + P (ik + Dy /2 + εDx /2)Wε + Wε P ∗ (ik − Dy /2 − εDx /2) + √ε (K1 KWε + K2∗ Wε K ∗ ) = 0, (6.16)

where we have defined

Kϕ W = ∫Rd eiy·p V̂ ϕ (p)W (k − p/2) dp/(2π)d . (6.17)

Asymptotic expansions in the new set of variables can now account for the fast oscillations of
the heterogeneous medium. Using the asymptotic expansion P = P0 + εP1 + O(ε2 ) in (6.14)
and

Wε (t, x, y, k) = W0 (t, x, k) + √ε W1 (t, x, y, k) + εW2 (t, x, y, k), (6.18)
we equate like powers of ε in (6.16) to obtain a sequence of three equations.

6.5 Leading-order equation and dispersion relation


The leading equation in the above expansion yields

L0 W0 ≡ P0 (ik)W0 + W0 P0∗ (ik) = 0; P0 = −JΛ0 , Λ0 = [ −p(ik), 0; 0, c20 ], J = [ 0, 1; −1, 0 ]. (6.19)

Let us define q0 (ik) = √(−p(ik)). The diagonalization of the dispersion matrix P0 yields

λ± (k) = ±ic0 q0 (ik), b± (k) = (1/√2)[ ±iq0−1 (ik); c0−1 ], c± (k) = Λ0 b± (k) = (1/√2)[ ±iq0 (ik); c0 ]. (6.20)
The vectors are normalized such that b∗± Λ0 b± = b∗± c± = 1 and we verify the spectral decomposition P0 = λ+ b+ c∗+ + λ− b− c∗− . Since (b+ (k), b− (k)) forms a basis of C2 for any k ∈ Rd∗ , any
matrix W may thus be decomposed as W = Σi,j=± αij bi b∗j , where αij = c∗i W cj = tr(W ci c∗j ),
and a straightforward calculation shows that c∗k (P0 W + W P0∗ )cm = αkm (λk + λ∗m ). Using
the above decomposition for the matrix W0 = Σi,j=± aij bi b∗j , equation (6.19) implies that
a+− = a−+ = 0 so that

W0 = a+ b+ b∗+ + a− b− b∗− ; a± = c∗± W0 c± . (6.21)

Because all the components of uϕε are real-valued, we verify that W̄ (−k) = W (k). Here the bar
means complex conjugation component by component. From the above expression for a± and
the fact that c(−k) = c(k), we deduce that

ā± (−k) = a∓ (k). (6.22)

It is thus sufficient to find an equation for a+ (k). We verify that ∫Rd a+ dk = (1/2) ∫Rd tr(Λ0 W0 ) dk,
so that in the case where (5.24) is (6.4) and Wε is the Wigner transform of uε , we have

(1/2) ∫R2d tr(Λ0 W0 ) dkdx = E(t) = ∫R2d a+ (t, x, k) dkdx, (6.23)

at least in the limit ε → 0, where E is defined in (6.5). Thus a+ can be given the interpretation
of an energy density in the phase-space.

6.6 First-order corrector


The next equation in the asymptotic expansion in powers of ε is
P0 (ik + Dy /2)W1 + W1 P0∗ (ik − Dy /2) + θW1 + K1 KW0 + K2∗ W0 K ∗ = 0. (6.24)
The parameter 0 < θ ≪ 1 is a regularization (limiting absorption) parameter that will be sent
to 0 at the end. It is required to ensure the causality of wave propagation [28]. We denote by
Ŵ1 (t, x, p, k) the Fourier transform y → p of W1 ,

Ŵ1 (t, x, p, k) = Fy→p [W1 (t, x, y, k)](t, x, p, k). (6.25)

Since W0 is independent of y, its Fourier transform y → p is Ŵ0 = (2π)d δ(p)W0 , and we
verify that Ŵ1 satisfies the equation

P0 (ik + ip/2)Ŵ1 + Ŵ1 P0∗ (ik − ip/2) + θŴ1 + V̂ 1 (p)KW0 (k − p/2) + V̂ 2 (p)W0 (k + p/2)K ∗ = 0. (6.26)

Since the vectors bi (k) form a complete basis of C2 for all k, we can decompose Ŵ1 as

Ŵ1 (p, k) = Σi,j=± αij (p, k)bi (k + p/2)b∗j (k − p/2). (6.27)

Multiplying (6.26) by c∗m (k + p/2) on the left and by cn (k − p/2) on the right, recalling λ∗n = −λn ,
and calculating

b∗n (p)K ∗ cm (q) = (1/(2c20 ))λm (q), c∗m (p)Kbn (q) = −(1/(2c20 ))λm (p), (6.28)
we get

αmn (p, k) = (1/(2c20 )) [ V̂ 1 (p)λm (k + p/2)an (k − p/2) − V̂ 2 (p)λn (k − p/2)am (k + p/2) ] / [ λm (k + p/2) − λn (k − p/2) + θ ]. (6.29)
Note that W1 is linear in the random fields V ϕ . As in earlier multiple scale expansions, we
obtain W1 as a function of W0 . However this still does not provide us with any equation for
the leading-order term W0 .

6.7 Transport equation
Finally the third equation in the expansion in powers of ε yields

P0 (ik + Dy /2)W2 + W2 P0∗ (ik − Dy /2) + K1 KW1 + K2∗ W1 K ∗
+ ∂W0 /∂t + P1 (ik)W0 + W0 P1∗ (ik) = 0. (6.30)
We consider ensemble averages in the above equation and thus look for an equation for
ha+ i, which we still denote by a+ . We may assume that W2 is orthogonal to W0 in order to
justify the expansion in ε, so that hc∗+ L0 W2 c+ i = 0. This part cannot be justified rigorously
and may be seen as a reasonable closure argument. Such a closure is known to provide the
correct limit as ε → 0 in cases that can be analyzed rigorously [14].
We multiply the above equation on the left by c∗+ (k) and on the right by c+ (k). Recalling
the convention in (5.35), we obtain that since p(ik) = −q02 (ik), we have p0 (ik) = −i∇k p(ik) =
2iq0 (ik)∇k q0 (ik). This implies that
c∗+ P1 W0 c+ = c∗+ W0 P1∗ c+ = (c0 /2)∇k q0 (ik) · ∇x a+ (x, k).
Let us define
ω+ (k) = c0 q0 (ik) = −iλ+ (k). (6.31)

Our conventions for the frequencies in the acoustic case are then ω± ∓ c0 |k| = 0. We thus find
that
c∗+ L2 W0 c+ = {ω+ , a+ }(x, k),
where the Poisson bracket is defined in (5.38). When p(iξ) = (iξ)2 , we obtain that c0 ∇k q0 (ik) =
c0 k̂. Upon taking ensemble averages and still denoting by a+ the ensemble average ha+ i, we
get the equation
∂a+ /∂t + {ω+ , a+ }(x, k) + hc∗+ L1 W1 c+ i = 0.
Here we have defined
L1 W = K1 KW + K2∗ W K ∗ . (6.32)
In the absence of scattering (L1 ≡ 0), we thus observe that the phase-space energy density
a(t, x, k) solves the Liouville equation, which was introduced in (5.61).
Let us define Ŵ1 (p, k) = V̂ 1 (p)W11 (p, k) + V̂ 2 (p)W12 (p, k) with obvious notation. Using
the symmetry R̂ij (−p) = R̂ij (p), we deduce that
hL1 W1 (p, k)i = δ(p)c40 ∫Rd ( R̂11 (k − q)KW11 (q − k, (k + q)/2) + R̂12 (k − q)KW12 (q − k, (k + q)/2)
+ R̂21 (k − q)W11 (k − q, (k + q)/2)K ∗ + R̂22 (k − q)W12 (k − q, (k + q)/2)K ∗ ) dq.
Using the convention of summation over repeated indices, we obtain after some algebra that

hc∗+ (k)L1 W1 (k)c+ (k)i = (λ+ (k)/(4(2π)d )) ∫Rd [ (−R̂11 (k − q)λi (q)a+ (k) + R̂12 (k − q)λ+ (k)ai (q)) / (λi (q) − λ+ (k) + θ)
+ (R̂12 (k − q)λ+ (k)aj (q) − R̂22 (k − q)λj (q)a+ (k)) / (λ+ (k) − λj (q) + θ) ] dq. (6.33)
Since λj (k) is purely imaginary, we deduce from the relation

1/(ix + ε) → 1/(ix) + π sign(ε)δ(x), as ε → 0,

which holds in the sense of distributions, that

lim0<θ→0 [ 1/(λj (q) − λ+ (k) + θ) + 1/(λ+ (k) − λj (q) + θ) ] = 2πδ(iλj (q) − iλ+ (k)).

This implies that j = + in order for the delta function not to be restricted to the point k = 0
(we assume λ+ (k) = 0 implies k = 0). So using (6.31), we obtain that

hc∗+ (k)L1 W1 (k)c+ (k)i = (Σ(k) + iΠ(k))a+ (k) − ∫Rd σ(k, q)a+ (q)δ(ω+ (q) − ω+ (k)) dq,
where we have defined the scattering coefficients:

Σ(k) = (πω+2 (k)/(2(2π)d )) ∫Rd ((R̂11 + R̂22 )/2)(k − q) δ(ω+ (q) − ω+ (k)) dq,

iΠ(k) = (1/(4(2π)d )) Σi=± p.v. ∫Rd (R̂11 − R̂22 )(k − q) (λ+ (k)λi (q)/(λ+ (k) − λi (q))) dq, (6.34)

σ(k, q) = (πω+2 (k)/(2(2π)d )) R̂12 (k − q).

The radiative transfer equation for a+ is thus

∂a+ /∂t + {ω+ , a+ }(x, k) + (Σ(k) + iΠ(k))a+ = ∫Rd σ(k, q)a+ (q)δ(ω+ (q) − ω+ (k)) dq. (6.35)
In the case where the two media are identical and p(ik) = −|k|2 so that q0 (ik) = |k| and
ω+ (k) = c0 |k|, we retrieve the classical radiative transfer equation for acoustic wave propaga-
tion [28], whereas (6.35) generalizes the kinetic model obtained in [9].
The radiative transfer equation for the energy density of acoustic waves thus takes the
form
∂a+ /∂t + c0 k̂ · ∇x a+ (x, k) = ∫Rd σ(k, q)(a+ (q) − a+ (k))δ(c0 |q| − c0 |k|) dq, (6.36)

where R̂(p) = R̂11 (p) = R̂12 (p) = R̂22 (p) and


σ(k, q) = (πω+2 (k)/(2(2π)d )) R̂(k − q). (6.37)

The latter form shows one of the main properties of the radiative transfer equation, namely
that the scattering operator is conservative (its integral over wavenumbers vanishes) and that
it is elastic, i.e., the wavenumber |k| of waves after scattering equals the wavenumber |p| before
scattering. Moreover, defining the total scattering cross section as

Σ(|k|) = ∫Rd σ(k, q)δ(ω+ (q) − ω+ (k)) dq,

we can interpret σ(k, p)/Σ(|k|) as the probability of scattering from wavenumber p into
wavenumber k such that ω = c0 |k| = c0 |p| is preserved.
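This probabilistic interpretation suggests a Monte Carlo solution of (6.36): particles travel ballistically at speed c0 and scatter at rate Σ, drawing a new direction while preserving |k|. A minimal two-dimensional sketch with a constant (isotropic) differential cross section follows (all parameters are illustrative choices, not taken from the notes):

```python
import numpy as np

def transport_mc(n_part, t_final, c0, sigma_tot, rng):
    """Monte Carlo particles for the elastic transport equation (6.36):
    free flights of exponential duration 1/sigma_tot at speed c0, with
    isotropic redistribution of the direction k-hat at each scattering
    event; |k| is preserved, so scattering is elastic."""
    x = np.zeros((n_part, 2))
    theta = rng.uniform(0.0, 2.0 * np.pi, n_part)
    t = np.zeros(n_part)
    active = np.ones(n_part, dtype=bool)
    while active.any():
        tau = rng.exponential(1.0 / sigma_tot, n_part)
        tau = np.minimum(tau, t_final - t)         # stop exactly at t_final
        x += c0 * tau[:, None] * np.column_stack([np.cos(theta), np.sin(theta)])
        t += tau
        active = t < t_final - 1e-12
        theta[active] = rng.uniform(0.0, 2.0 * np.pi, active.sum())
    return x

rng = np.random.default_rng(1)
x = transport_mc(2000, 5.0, c0=1.0, sigma_tot=2.0, rng=rng)
# particles cannot outrun the ballistic cone |x| <= c0 * t_final
print(np.max(np.linalg.norm(x, axis=1)))
```

Binning the positions (and directions) then approximates a+ in the phase space; at long times the histogram approaches the diffusive limit of the transport equation.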

Chapter 7

Parabolic regime

7.1 Derivation of the parabolic wave equation


Let us consider the scalar wave equation for the pressure field p(z, x, t):

(1/c2 (z, x)) ∂ 2 p/∂t2 − ∆p = 0. (7.1)

Here c(z, x) is the local wave speed, which we will assume to be random, and the Laplacian
operator includes both the direction of propagation, z, and the transverse variable x ∈ Rd . In
the physical setting, we have d = 2. We consider dimensions d ≥ 1 to stress that the analysis
of the problem is independent of the number of transverse dimensions. If we assume that at
time t = 0, the wave field has a “beam-like” structure in the z direction, and if back-scattering
may be neglected, we can replace the wave equation by its parabolic (also known as paraxial)
approximation [30]. More precisely, the pressure p may be approximated as
p(z, x, t) ≈ ∫R ei(−c0 κt+κz) ψ(z, x, κ)c0 dκ, (7.2)

where ψ satisfies the Schrödinger equation


2iκ ∂ψ/∂z (z, x, κ) + ∆x ψ(z, x, κ) + κ2 (n2 (z, x) − 1)ψ(z, x, κ) = 0,
ψ(z = 0, x, κ) = ψ0 (x, κ), (7.3)

with ∆x the transverse Laplacian in the variable x. The index of refraction n(z, x) = c0 /c(z, x),
and c0 in (7.2) is a reference speed.
A formal justification for the above approximation goes as follows. We start with the
reduced wave equation
∆p̂ + κ2 n2 (z, x)p̂ = 0, (7.4)
and look for solutions of (7.4) in the form p̂(z, x) = eiκz ψ(z, x). We obtain that

∂ 2 ψ/∂z 2 + 2iκ ∂ψ/∂z + ∆x ψ + κ2 (n2 − 1)ψ = 0. (7.5)
The index of refraction n(z, x) is fluctuating in both the axial z and transversal x variables
and thus has the form

n2 (z, x) = 1 − 2σV (z/lz , x/lx ),

where V is a mean-zero random field, and where lx and lz are the correlation lengths of V in
the transverse and longitudinal directions, respectively. The small parameter σ measures the
strength of the fluctuations.
We now introduce two macroscopic distances of wave propagation: Lx in the x-plane and
Lz in the z-direction. We also introduce a carrier wave number κ0 and replace κ → κ0 κ, κ now
being a non-dimensional wavenumber. The physical parameters determined by the medium
are the length scales lx , lz and the non-dimensional parameter σ  1.
We present the relationships between the various scalings introduced above that need to be
satisfied so that wave propagation occurs in a regime close to that of radiative transfer. Equation
(7.5) in the non-dimensional variables z → z/Lz , x → x/Lx becomes

(1/L2z ) ∂ 2 ψ/∂z 2 + (2iκκ0 /Lz ) ∂ψ/∂z + (1/L2x )∆x ψ − 2κ2 κ20 σV (zLz /lz , xLx /lx )ψ = 0. (7.6)
Let us introduce the following parameters
δx = lx /Lx , δz = lz /Lz , γx = 1/(κ0 lx ), γz = 1/(κ0 lz ), (7.7)
and recast (7.6) as

γz δz ∂ 2 ψ/∂z 2 + 2iκ ∂ψ/∂z + (δx2 γx2 /(δz γz ))∆x ψ − (2κ2 σ/(γz δz ))V (z/δz , x/δx )ψ = 0. (7.8)
Let us now assume the following relationships among the various parameters
δx = δz ≪ 1, γz = γx2 ≪ 1, σ = γz √δx , ε = δx . (7.9)

Then (7.8), after multiplication by ε/2, becomes

(γz ε2 /2) ∂ 2 ψ/∂z 2 + iκε ∂ψ/∂z + (ε2 /2)∆x ψ − κ2 √ε V (z/ε, x/ε)ψ = 0. (7.10)
We now observe that, when κ = O(1) and γz ≪ 1, the first term in (7.10) is small and
may be neglected to leading order since |ε²ψzz| = O(1). Then (7.10) becomes

\[
i\kappa\varepsilon\frac{\partial\psi}{\partial z} + \frac{\varepsilon^2}{2}\Delta_x\psi - \kappa^2\sqrt{\varepsilon}\, V\Big(\frac{z}{\varepsilon}, \frac{x}{\varepsilon}\Big)\psi = 0, \tag{7.11}
\]
which is the parabolic wave equation (7.3) in the radiative transfer scaling. The rigorous
passage to the parabolic approximation in a three-dimensional layered random medium in a
similar scaling is discussed in [1].
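The parabolic equation (7.11) is first order in z and can therefore be marched numerically in z, for instance with a split-step Fourier scheme that alternates the transverse diffraction (diagonal in Fourier space) and the multiplicative random potential (diagonal in physical space). The sketch below is illustrative only: the Gaussian input beam, the smooth synthetic potential standing in for the random field V, and all parameter values are assumptions, not taken from the text. Since both sub-steps are unitary, the scheme conserves the L² norm exactly, in agreement with the conservation property of the paraxial equation.

```python
import numpy as np

def split_step_paraxial(psi0, V, z_max, nz, L_dom, eps=0.25, kappa=1.0):
    """March the dimensionless parabolic equation (7.11),
        i*kappa*eps psi_z + (eps^2/2) psi_xx - kappa^2*sqrt(eps) V(z/eps, x/eps) psi = 0,
    with a Strang split-step Fourier scheme: the transverse Laplacian is applied
    in Fourier space, the random potential in physical space.  Periodic in x;
    V is a callable V(z, x).  Illustrative sketch, parameters are assumptions."""
    nx = psi0.size
    dx = L_dom/nx
    x = -L_dom/2 + dx*np.arange(nx)
    k = 2*np.pi*np.fft.fftfreq(nx, d=dx)
    dz = z_max/nz
    half_step = np.exp(-1j*eps*k**2*(dz/2)/(2*kappa))    # free propagation, dz/2
    psi = psi0.astype(complex)
    z = 0.0
    for _ in range(nz):
        psi = np.fft.ifft(half_step*np.fft.fft(psi))
        psi = psi*np.exp(-1j*kappa*V((z + dz/2)/eps, x/eps)*dz/np.sqrt(eps))
        psi = np.fft.ifft(half_step*np.fft.fft(psi))
        z += dz
    return x, psi

if __name__ == "__main__":
    # a smooth synthetic potential standing in for the random field V (assumption)
    def V(z, x):
        return np.cos(0.3*x + 0.5*z) + 0.5*np.sin(0.7*x - 0.2*z)
    nx, L_dom = 256, 40.0
    x0 = -L_dom/2 + (L_dom/nx)*np.arange(nx)
    psi0 = np.exp(-x0**2)                                # Gaussian input beam
    x, psi = split_step_paraxial(psi0, V, z_max=1.0, nz=200, L_dom=L_dom)
    # both sub-steps are unitary, so the discrete L^2 norm is conserved
    print(np.isclose(np.sum(np.abs(psi)**2), np.sum(np.abs(psi0)**2)))
```

The split-step approach exploits the beam-like geometry of the parabolic regime: marching in z is possible precisely because the backscattering term ψzz has been dropped.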

Exercise 7.1.1 (i) Show that the above choices imply that

l_x ≪ l_z.

Therefore the correlation length in the longitudinal direction z should be much longer than in
the transverse plane x.
(ii) Show that
\[
L_x = l_x\,\frac{l_x^4}{\sigma^2 l_z^4}, \qquad L_z = l_z\,\frac{l_x^4}{\sigma^2 l_z^4}, \qquad L_x \ll L_z.
\]
σ lz σ lz
The latter is the usual constraint for the validity of the parabolic approximation (beam-like
structure of the wave).
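The algebra behind Exercise 7.1.1 can be checked by elementary arithmetic: γz = γx² in (7.9) forces κ₀ = lz/lx², hence γz = (lx/lz)², and the choice δx = δz = ε then determines Lx, Lz and σ. The snippet below verifies the claimed closed forms numerically; the numerical values of lx, lz, ε are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

# illustrative values (assumptions, not from the text): strongly anisotropic medium
l_x, l_z, eps = 1e-3, 1e-1, 1e-4

# gamma_z = gamma_x^2 in (7.9) forces kappa_0 = l_z/l_x^2, hence gamma_z = (l_x/l_z)^2
kappa0 = l_z/l_x**2
gamma_x = 1.0/(kappa0*l_x)
gamma_z = 1.0/(kappa0*l_z)
assert np.isclose(gamma_z, gamma_x**2)
assert l_x < l_z and gamma_z < 1          # part (i): gamma_z << 1 iff l_x << l_z

# delta_x = delta_z = eps fixes the macroscopic distances and fluctuation strength
L_x, L_z = l_x/eps, l_z/eps
sigma = gamma_z*np.sqrt(eps)              # sigma = gamma_z * sqrt(delta_x) from (7.9)

# part (ii): the claimed closed-form expressions for L_x and L_z
assert np.isclose(L_x, l_x*l_x**4/(sigma**2*l_z**4))
assert np.isclose(L_z, l_z*l_x**4/(sigma**2*l_z**4))
assert L_x < L_z and np.isclose(L_x/L_z, l_x/l_z)   # beam geometry, cf. (7.12)
print(kappa0, gamma_z, sigma, L_x, L_z)
```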

In the above scalings, there remains one free parameter, namely γz = lx²/lz², as one can
verify, or equivalently
\[
\frac{L_x}{L_z} = \frac{l_x}{l_z} \equiv \varepsilon^\eta, \qquad \eta > 0, \tag{7.12}
\]
where η > 0 is necessary since L_x ≪ L_z. Note that as η → 0, we recover an isotropic random
medium (with lz ≡ lx ) and the usual regime of radiative transfer derived in the preceding
chapter. The parabolic (or paraxial) regime thus shares some of the features of the radiative
transfer regime, and because the fluctuations depend on the variable z, which plays a similar
role to the time variable in the radiative transfer theory, the mathematical analysis is much
simplified.

7.2 Wigner Transform and mixture of states


We want to analyze the energy density of the solution to the paraxial wave equation in the
limit ε → 0. As in the preceding chapter, the Wigner transform is a useful tool. Let us recast
the above paraxial wave equation as the following Cauchy problem
\[
i\varepsilon\kappa\frac{\partial\psi_\varepsilon}{\partial z} + \frac{\varepsilon^2}{2}\Delta\psi_\varepsilon - \kappa^2\sqrt{\varepsilon}\, V\Big(\frac{z}{\varepsilon}, \frac{x}{\varepsilon}\Big)\psi_\varepsilon = 0,
\qquad
\psi_\varepsilon(0, x) = \psi_{\varepsilon 0}(x; \zeta). \tag{7.13}
\]

Here, the initial data depend on an additional random variable ζ defined over a state space S
with a probability measure dϖ(ζ). Its use will become clear very soon.
Let us define the Wigner transform as the usual Wigner transform of the field ψε averaged
over the space (S, d$(ζ)):
\[
W_\varepsilon(z, x, k) = \int_{\mathbb{R}^d\times S} e^{ik\cdot y}\, \psi_\varepsilon\Big(z, x - \frac{\varepsilon y}{2}; \zeta\Big)\, \bar\psi_\varepsilon\Big(z, x + \frac{\varepsilon y}{2}; \zeta\Big)\, \frac{dy}{(2\pi)^d}\, d\varpi(\zeta). \tag{7.14}
\]

We assume that the initial data Wε(0, x, k) converges strongly in L²(R^d × R^d) to a limit
W₀(x, k). This is possible thanks to the introduction of a mixture of states, i.e., an integration
against the measure dϖ(ζ). This is the main reason why the space (S, dϖ(ζ)) is introduced.
Note that the Wigner transform of a pure state (e.g., when ϖ concentrates at one point
in S) is not bounded in L²(R^{2d}) uniformly in ε.
Exercise 7.2.1 Show that with the definition (5.1) and u and v scalar functions, we have:
\[
\int_{\mathbb{R}^{2d}} |W_\varepsilon[u, v]|^2(x, k)\, dx\, dk = \frac{1}{(2\pi\varepsilon)^d}\, \|u\|^2_{L^2(\mathbb{R}^d)}\, \|v\|^2_{L^2(\mathbb{R}^d)}.
\]
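The blow-up rate (2πε)^{−d} in Exercise 7.2.1 can be checked numerically on a Gaussian pure state: in dimension d = 1, the Wigner transform of u(x) = π^{−1/4}e^{−x²/2} is (πε)^{−1}e^{−x²−k²/ε²}, whose squared L² norm is 1/(2πε). The quadrature sketch below (grid extents and resolutions are ad hoc assumptions) computes W_ε[u, u] by direct integration in y and verifies both the closed form and the norm identity.

```python
import numpy as np

def wigner_1d(u, xs, ks, eps):
    """W_eps[u,u](x,k) = int e^{iky} u(x - eps*y/2) conj(u)(x + eps*y/2) dy/(2*pi),
    evaluated by direct quadrature in y (d = 1; ad hoc y-grid assumption)."""
    y = np.linspace(-25.0, 25.0, 1001)
    dy = y[1] - y[0]
    X, Y = np.meshgrid(xs, y, indexing="ij")           # (nx, ny)
    F = u(X - eps*Y/2)*np.conj(u(X + eps*Y/2))         # (nx, ny)
    E = np.exp(1j*np.outer(y, ks))                     # (ny, nk)
    return (F @ E)*dy/(2*np.pi)                        # (nx, nk)

if __name__ == "__main__":
    eps = 0.5
    u = lambda x: np.pi**(-0.25)*np.exp(-x**2/2)       # ||u||_{L^2} = 1
    xs = np.linspace(-5, 5, 201); dx = xs[1] - xs[0]
    ks = np.linspace(-4, 4, 161); dk = ks[1] - ks[0]
    W = wigner_1d(u, xs, ks, eps)
    # closed form for this Gaussian: W = (pi*eps)^{-1} exp(-x^2 - k^2/eps^2)
    Wexact = np.exp(-xs[:, None]**2 - ks[None, :]**2/eps**2)/(np.pi*eps)
    print(np.max(np.abs(W - Wexact)))                  # small quadrature error
    norm2 = np.sum(np.abs(W)**2)*dx*dk
    print(norm2, 1/(2*np.pi*eps))                      # Exercise 7.2.1 with d = 1
```

As ε → 0 the squared L² norm diverges like ε^{−d}, which is exactly why a mixture of states is needed to obtain an ε-uniform L² bound.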
We thus need to regularize the Wigner transform if we want a uniform bound in a smaller
space than A0 . We assume the existence of (S, d$(ζ)) such that Wε (0, x, k) above converges
strongly in L2 (Rd × Rd ) to a limit W0 (x, k). We will come back to the effect of not regularizing
the Wigner transform at the end of the chapter.
Using the calculus recalled in chapter 5, we verify that the Wigner transform satisfies the
following evolution equation

\[
\frac{\partial W_\varepsilon}{\partial z} + \frac{1}{\kappa}\, k\cdot\nabla_x W_\varepsilon = \frac{\kappa}{i\sqrt{\varepsilon}} \int_{\mathbb{R}^d} e^{ip\cdot x/\varepsilon} \Big[ W_\varepsilon\Big(k - \frac{p}{2}\Big) - W_\varepsilon\Big(k + \frac{p}{2}\Big) \Big] \frac{d\tilde V\big(\frac{z}{\varepsilon}, p\big)}{(2\pi)^d}. \tag{7.15}
\]

Here, Ṽ(z, p) is the partial Fourier transform of V(z, x) in the variable x. The above evolution
equation preserves the L²(R^d × R^d) norm of Wε(z, ·, ·):

Lemma 7.2.1 Let Wε(z, x, k) be the solution of (7.15) with initial condition Wε(0, x, k).
Then we have
\[
\|W_\varepsilon(z, \cdot, \cdot)\|_{L^2(\mathbb{R}^d\times\mathbb{R}^d)} = \|W_\varepsilon(0, \cdot, \cdot)\|_{L^2(\mathbb{R}^d\times\mathbb{R}^d)}, \qquad \text{for all } z > 0. \tag{7.16}
\]

Proof. This can be obtained by integrations by parts in (7.15), in a way that is similar to
showing that (7.13) preserves the L2 norm. This can also be obtained from the definition of
the Wigner transform and from Exercise 7.2.1.

7.3 Hypotheses on the randomness


We describe here the construction of the random potential V (z, x). Our main hypothesis is
to assume that V (z, x) is a Markov process in the z variable. This gives us access to a whole
machinery relatively similar to the one used in the diffusion Markov approximation in Chapter
4 and Appendix A. The Markovian hypothesis is crucial to simplify the mathematical analysis
because it allows us to treat the process z 7→ (V (z/ε, x/ε), Wε (z, x, k)) as jointly Markov.
In addition to being Markovian, V (z, x) is assumed to be stationary in x and z, mean zero,
and is constructed in the Fourier space as follows. Let V be the set of measures of bounded
total variation with support inside a ball BL = {|p| ≤ L}
\[
\mathcal{V} = \Big\{ \hat V :\ \int_{\mathbb{R}^d} |d\hat V| \le C,\ \ \mathrm{supp}\,\hat V \subset B_L,\ \ \hat V^*(p) = \hat V(-p) \Big\}, \tag{7.17}
\]

and let Ṽ (z) be a mean-zero Markov process on V with infinitesimal generator Q. The random
potential V (z, x) is given by
\[
V(z, x) = \int_{\mathbb{R}^d} e^{ip\cdot x}\, \frac{d\tilde V(z, p)}{(2\pi)^d}. \tag{7.18}
\]

It is real-valued and uniformly bounded; |V (z, x)| ≤ C. The correlation function R(z, x) of
V (z, x) is

R(z, x) = E {V (s, y)V (z + s, x + y)} for all x, y ∈ Rd , and z, s ∈ R. (7.19)

In the Fourier domain, this is equivalent to the following statement:


\[
\mathbb{E}\Big\{ \langle \tilde V(s), \hat\phi\rangle\, \langle \tilde V(z + s), \hat\psi\rangle \Big\} = (2\pi)^d \int_{\mathbb{R}^d} dp\, \tilde R(z, p)\, \hat\phi(p)\, \hat\psi(-p), \tag{7.20}
\]

where h·, ·i is the usual duality product on Rd × Rd , and the power spectrum R̃ is the Fourier
transform of R(z, x) in x:
\[
\tilde R(z, p) = \int_{\mathbb{R}^d} dx\, e^{-ip\cdot x}\, R(z, x). \tag{7.21}
\]

We assume that R̃(z, p) ∈ S(R×Rd ), the space of Schwartz functions, for simplicity and define
R̂(ω, p) as
\[
\hat R(\omega, p) = \int_{\mathbb{R}} dz\, e^{-i\omega z}\, \tilde R(z, p), \tag{7.22}
\]
which is the space-time Fourier transform of R.
We now make additional assumptions on the infinitesimal generator so that the Fredholm
alternative holds for the Poisson equation. Namely, we assume that the generator Q is a
bounded operator on L∞ (V) with a unique invariant measure π(V̂ ), i.e. a unique normalized

measure such that Q∗ π = 0, and assume the existence of a constant α > 0 such that if
hg, πi = 0, then
\[
\| e^{rQ} g \|_{L^\infty_{\mathcal V}} \le C\, \|g\|_{L^\infty_{\mathcal V}}\, e^{-\alpha r}. \tag{7.23}
\]
The simplest example of a generator with gap in the spectrum and invariant measure π is a
jump process on V where
\[
Qg(\hat V) = \int_{\mathcal V} g(\hat V_1)\, d\pi(\hat V_1) - g(\hat V), \qquad \int_{\mathcal V} d\pi(\hat V) = 1.
\]

Given the above hypotheses, the Fredholm alternative holds for the Poisson equation

Qf = g, (7.24)

provided that g satisfies ⟨π, g⟩ = 0. It then has a unique solution f with ⟨π, f⟩ = 0 and
‖f‖_{L^∞_V} ≤ C‖g‖_{L^∞_V}. The solution f is given explicitly by
\[
f(\hat V) = -\int_0^\infty dr\, e^{rQ} g(\hat V), \tag{7.25}
\]

and the integral converges absolutely thanks to (7.23).
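On a finite state space everything above can be made explicit for the jump process: Q = 𝟙πᵀ − I, e^{rQ}g = e^{−r}g for mean-zero g (so the gap estimate (7.23) holds with α = 1), and the Fredholm solution (7.25) reduces to f = −∫₀^∞ e^{−r}g dr = −g. The snippet below (the finite state space and the crude Euler time step are illustrative assumptions) integrates dφ/dr = Qφ numerically and checks these facts.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
pi = rng.random(n); pi /= pi.sum()         # invariant probability vector
Q = np.outer(np.ones(n), pi) - np.eye(n)   # jump generator: (Qg)_i = <pi, g> - g_i

g = rng.normal(size=n)
g = g - (pi @ g)*np.ones(n)                # centering: <pi, g> = 0

# For mean-zero g the semigroup is explicit, e^{rQ}g = e^{-r}g, so (7.23) holds
# with alpha = 1.  We nevertheless integrate dphi/dr = Q phi numerically and
# accumulate the Fredholm solution (7.25): f = -int_0^infty e^{rQ} g dr.
dr, r_max = 1e-3, 40.0
phi = g.copy()
f = np.zeros(n)
for _ in range(int(r_max/dr)):
    f -= phi*dr                            # left Riemann sum of -e^{rQ} g
    phi = phi + dr*(Q @ phi)               # explicit Euler step for the semigroup
print(np.allclose(f, -g, atol=1e-6))       # here e^{rQ}g = e^{-r}g, so f = -g
print(np.allclose(Q @ f, g, atol=1e-6))    # f solves the Poisson equation (7.24)
print(abs(pi @ f) < 1e-8)                  # and <pi, f> = 0
```

The spectral gap is what makes the integral in (7.25) converge; without the e^{−αr} decay the accumulated f would grow without bound.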

7.4 The Main result


Let us summarize the hypotheses. We define Wε (z, x, k) in (7.14) as a mixture of states of
solutions to the paraxial wave equation (7.13). The mixture of states is such that (x, k) →
Wε(0, x, k), whence (x, k) → Wε(z, x, k) for all z > 0, is uniformly bounded in L²(R^{2d}). We
assume that Wε(0, x, k) converges strongly in L²(R^{2d}) to its limit W₀(x, k). We further
assume that the random field V (z, x) satisfies the hypotheses described in section 7.3.
Then we have the following convergence result.
Theorem 7.4.1 Under the above assumptions, the Wigner distribution Wε converges in prob-
ability and weakly in L2 (R2d ) to the solution W of the following transport equation

\[
\kappa \frac{\partial W}{\partial z} + k\cdot\nabla_x W = \kappa^2 L W, \tag{7.26}
\]
where the scattering kernel has the form
\[
L W(x, k) = \int_{\mathbb{R}^d} \hat R\Big( \frac{|p|^2 - |k|^2}{2}, p - k \Big) \big( W(x, p) - W(x, k) \big)\, \frac{dp}{(2\pi)^d}. \tag{7.27}
\]

More precisely, for any test function λ ∈ L2 (R2d ) the process hWε (z), λi converges to hW (z), λi
in probability as ε → 0, uniformly on finite intervals 0 ≤ z ≤ Z.
Note that the whole process Wε , and not only its average E{Wε } converges to the (deter-
ministic) limit W . This means that the process Wε is statistically stable in the limit ε → 0. The
process Wε (z, x, k) does not converge pointwise to the deterministic limit: averaging against
a test function λ(x, k) is necessary. However, the deterministic limit is in sharp contrast with
the results obtained in the Markov diffusion limit in Chapter 4 and Appendix A.
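The structure of the limiting kernel (7.27) is worth probing with a crude discretization: since R̂(ω, p) = R̂(−ω, −p) for a real stationary field, the kernel connecting k and p is symmetric, so the discrete operator conserves total energy ∫W dk; and because R̂ is evaluated at ω = (|p|² − |k|²)/2, a power spectrum concentrated near ω = 0 scatters nearly elastically (|p| ≈ |k|). The sketch below, in d = 1 with an assumed Gaussian R̂ (an illustration, not from the text), checks conservation and positivity under the z-evolution of (7.26).

```python
import numpy as np

def scattering_matrix(ks, dk, R_hat):
    """Discretization of the kernel in (7.27) on a 1d wavenumber grid:
    A[i, j] ~ R_hat((k_j^2 - k_i^2)/2, k_j - k_i) * dk/(2*pi).
    A is symmetric whenever R_hat(w, p) = R_hat(-w, -p) (real stationary field)."""
    K1, K2 = np.meshgrid(ks, ks, indexing="ij")
    return R_hat((K2**2 - K1**2)/2, K2 - K1)*dk/(2*np.pi)

# assumed Gaussian power spectrum: jointly even in (omega, p), peaked at omega = 0
R_hat = lambda w, p: np.exp(-w**2 - p**2)

ks = np.linspace(-3, 3, 121)
dk = ks[1] - ks[0]
A = scattering_matrix(ks, dk, R_hat)

def L_op(W):                                   # (LW)_i = sum_j A_ij (W_j - W_i)
    return A @ W - A.sum(axis=1)*W

# evolve dW/dz = L W (kappa = 1, x-homogeneous data) with explicit Euler
W = np.exp(-4.0*(np.abs(ks) - 1.5)**2)         # energy concentrated near |k| = 1.5
total0 = W.sum()*dk
dz = 0.01
for _ in range(2000):
    W = W + dz*L_op(W)

print(np.allclose(A, A.T))                     # symmetry of the discrete kernel
print(abs(W.sum()*dk - total0) < 1e-8*total0)  # sum_i (LW)_i = 0: energy conserved
print(W.min() >= 0.0)                          # positivity preserved for small dz
```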
The next section is devoted to a proof of the theorem. The main ingredients of the proof
are now summarized as follows. Recall that the main mathematical assumption is that V (z, x)
is Markov in the z variable. Let us set L > 0 and consider z ∈ [0, L]. This allows us to show
that (V(z/ε, x/ε), Wε(z, x, k)) is jointly Markov in the space V × X, where X = C([0, L]; B_W),
with B_W = {‖W‖₂ ≤ C} an appropriate ball in L²(R^d × R^d).

Evolution equation and random process. Since κ plays no role in the derivation, we set
κ = 1 to simplify. Recall that Wε satisfies the Cauchy problem
\[
\frac{\partial W_\varepsilon}{\partial z} + k\cdot\nabla_x W_\varepsilon = L_\varepsilon W_\varepsilon,
\]
with Wε(0, x, k) = W_{ε0}(x, k), where
\[
L_\varepsilon W_\varepsilon = \frac{1}{i\sqrt{\varepsilon}} \int_{\mathbb{R}^d} e^{ip\cdot x/\varepsilon} \Big[ W_\varepsilon\Big(x, k - \frac{p}{2}\Big) - W_\varepsilon\Big(x, k + \frac{p}{2}\Big) \Big] \frac{d\tilde V\big(\frac{z}{\varepsilon}, p\big)}{(2\pi)^d}. \tag{7.28}
\]

The solution to the above Cauchy problem is understood in the sense that for every smooth
test function λ(z, x, k), we have
\[
\langle W_\varepsilon(z), \lambda(z)\rangle - \langle W_{\varepsilon 0}, \lambda(0)\rangle = \int_0^z \Big\langle W_\varepsilon(s), \Big( \frac{\partial}{\partial s} + k\cdot\nabla_x + L_\varepsilon \Big) \lambda(s) \Big\rangle\, ds.
\]

Here, we have used that Lε is a self-adjoint operator for h·, ·i. Therefore, for a smooth function
λ0 (x, k), we obtain hWε (z), λ0 i = hWε0 , λε (0)i, where λε (s) is the solution of the backward
problem
\[
\frac{\partial \lambda_\varepsilon}{\partial s} + k\cdot\nabla_x \lambda_\varepsilon + L_\varepsilon \lambda_\varepsilon(s) = 0, \qquad 0 \le s \le z,
\]
with the terminal condition λε (z, x, k) = λ0 (x, k).

Tightness of the family of ε−measures. The above construction defines the process
Wε (z) in L2 (R2d ) and generates a corresponding measure Pε on the space C([0, L]; L2 (R2d ))
of continuous functions in time with values in L2 . The measure Pε is actually supported on
paths inside X defined above, which is the state space for the random process Wε (z). With its
natural topology and the Borel σ−algebra F, (X , F, Pε ) defines a probability space on which
Wε (z) is a random variable. Then Fs is defined as the filtration of the process Wε (z), i.e., the
filtration generated by {Wε (τ ), τ < s}. We recall that intuitively, the filtration renders the
past τ < s measurable, i.e., “known”, and the future τ > s non-measurable, i.e., not known
yet.
The family Pε parameterized by ε₀ > ε > 0 will be shown to be tight. This in turn
implies that Pε converges weakly to P. More precisely, we can extract a subsequence of Pε,
still denoted by Pε , such that for all continuous function f defined on X , we have
\[
\mathbb{E}^{P_\varepsilon}\{f\} \equiv \int_{\mathcal X} f(\omega)\, dP_\varepsilon(\omega) \to \int_{\mathcal X} f(\omega)\, dP(\omega) \equiv \mathbb{E}^{P}\{f\}, \qquad \text{as } \varepsilon \to 0. \tag{7.29}
\]

Construction of a first approximate martingale. Once tightness is ensured, the proof of


convergence of Wε to its deterministic limit is obtained in two steps. Let us fix a deterministic
test function λ(z, x, k). We use the Markovian property of the random field V (z, x) in z to
construct a first functional Gλ : X → C[0, L] by
\[
G_\lambda[W](z) = \langle W, \lambda\rangle(z) - \int_0^z \Big\langle W, \frac{\partial\lambda}{\partial z} + k\cdot\nabla_x\lambda + L\lambda \Big\rangle(\zeta)\, d\zeta. \tag{7.30}
\]

Here, L is the limiting scattering kernel defined in (7.27). We will show that Gλ is an approx-
imate Pε -martingale (with respect to the filtration Fs ), and more precisely that

\[
\Big| \mathbb{E}^{P_\varepsilon}\{ G_\lambda[W](z)\,|\, \mathcal{F}_s \} - G_\lambda[W](s) \Big| \le C_{\lambda, L}\, \sqrt{\varepsilon} \tag{7.31}
\]

uniformly for all W ∈ X and 0 ≤ s < z ≤ L. Choosing s = 0 above, the two convergences
(7.29) and (7.31) (weak against strong) show that

\[
\mathbb{E}^{P}\{ G_\lambda[W](z) \} - G_\lambda[W](0) = 0. \tag{7.32}
\]

We thus obtain the transport equation (7.26) for W̄ = E^P{W(z)} in its weak formulation.

Exercise 7.4.1 Verify this statement.

Construction of a second approximate martingale and convergence of the full


family of ε−measures. So far, we have characterized the convergence of the first moment
of Pε . We now consider the convergence of the second moment and show that the variance of
the limiting process vanishes, whence the convergence to a deterministic process.
We will show that for every test function λ(z, x, k), the new functional
\[
G_{2,\lambda}[W](z) = \langle W, \lambda\rangle^2(z) - 2\int_0^z \langle W, \lambda\rangle(\zeta)\, \Big\langle W, \frac{\partial\lambda}{\partial z} + k\cdot\nabla_x\lambda + L\lambda \Big\rangle(\zeta)\, d\zeta \tag{7.33}
\]

is also an approximate Pε -martingale. We then obtain that

\[
\mathbb{E}^{P_\varepsilon}\big\{ \langle W, \lambda\rangle^2 \big\} \to \langle \overline W, \lambda\rangle^2. \tag{7.34}
\]

This crucial convergence implies convergence in probability. It follows that the limit measure
P is unique and deterministic, and that the whole sequence Pε converges.
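The step from (7.34) to convergence in probability is a direct application of Chebyshev's inequality; writing out the variance makes it explicit:

```latex
\mathbb{P}^{P_\varepsilon}\Big( \big|\langle W_\varepsilon,\lambda\rangle - \langle \overline W,\lambda\rangle\big| > \delta \Big)
\;\le\; \frac{1}{\delta^2}\Big(
\mathbb{E}^{P_\varepsilon}\big\{\langle W_\varepsilon,\lambda\rangle^2\big\}
- 2\,\langle \overline W,\lambda\rangle\,\mathbb{E}^{P_\varepsilon}\big\{\langle W_\varepsilon,\lambda\rangle\big\}
+ \langle \overline W,\lambda\rangle^2 \Big)
\;\longrightarrow\; 0,
```

since, for each fixed z and test function λ, the first term in the parenthesis converges to ⟨W̄, λ⟩² by (7.34) while the second converges to −2⟨W̄, λ⟩² by the convergence of the first moment, so that the variance of ⟨Wε, λ⟩ vanishes in the limit ε → 0.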

7.5 Proof of Theorem 7.4.1


The proof of tightness of the family of measures Pε is postponed to the end of the section
as it requires estimates that are developed in the proofs of convergence of the approximate
martingales. We thus start with the latter proofs.

7.5.1 Convergence in expectation


To obtain the approximate martingale property (7.31), one has to consider the conditional ex-
pectation of functionals F (W, V̂ ) with respect to the probability measure P̃ε on D([0, L]; BW ×
V), the space of right-continuous paths with left-side limits [10] generated by the process
(W, V ). Note that W is a continuous function of z thanks to the evolution equation it solves.
The process V however need not be continuous, whence the above space, sometimes referred
to in the probabilistic literature as the space of càd-làg functions (which stands for the French
continu à droite, limite à gauche). The only functions we need consider are in fact of the form
F (W, V̂ ) = hW, λ(V̂ )i with λ ∈ L∞ (V; C 1 ([0, L]; S(R2d ))). Given a function F (W, V̂ ) let us
define the conditional expectation
\[
\mathbb{E}^{\tilde P_\varepsilon}_{W, \hat V, z}\big\{ F(W, \hat V) \big\}(\tau) = \mathbb{E}^{\tilde P_\varepsilon}\big\{ F(W(\tau), \tilde V(\tau))\ \big|\ W(z) = W,\ \tilde V(z) = \hat V \big\}, \qquad \tau \ge z.
\]

The weak form of the infinitesimal generator of the Markov process generated by P̃ε is then
given by
   
\[
\frac{d}{dh}\, \mathbb{E}^{\tilde P_\varepsilon}_{W, \hat V, z}\big\{ \langle W, \lambda(\hat V)\rangle \big\}(z + h)\Big|_{h=0} = \frac{1}{\varepsilon}\langle W, Q\lambda\rangle + \Big\langle W, \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x + \frac{1}{\sqrt{\varepsilon}}\, K\Big[\hat V, \frac{x}{\varepsilon}\Big] \Big)\lambda \Big\rangle. \tag{7.35}
\]

Exercise 7.5.1 Derive the above formula in detail using the definition of the Markov process
V with infinitesimal generator Q and the evolution equation for Wε written in the form
\[
\frac{\partial W_\varepsilon}{\partial z} + k\cdot\nabla_x W_\varepsilon = \frac{1}{\sqrt{\varepsilon}}\, K[\tilde V(z/\varepsilon), x/\varepsilon]\, W_\varepsilon, \tag{7.36}
\]

where the operator K is defined by


\[
K[\hat V, \eta]\psi(x, \eta, k, \hat V) = \frac{1}{i} \int_{\mathbb{R}^d} \frac{d\hat V(p)}{(2\pi)^d}\, e^{ip\cdot\eta} \Big[ \psi\Big(x, \eta, k - \frac{p}{2}\Big) - \psi\Big(x, \eta, k + \frac{p}{2}\Big) \Big]. \tag{7.37}
\]

The above equality implies that


\[
G^\varepsilon_\lambda = \langle W, \lambda(\hat V)\rangle(z) - \int_0^z \Big\langle W, \Big( \frac{1}{\varepsilon} Q + \frac{\partial}{\partial z} + k\cdot\nabla_x + \frac{1}{\sqrt{\varepsilon}}\, K\Big[\hat V, \frac{x}{\varepsilon}\Big] \Big)\lambda \Big\rangle(s)\, ds \tag{7.38}
\]

is a P̃ε -martingale since the drift term has been subtracted.


Given a test function λ(z, x, k) ∈ C 1 ([0, L]; S) we construct a function

\[
\lambda_\varepsilon(z, x, k, \hat V) = \lambda(z, x, k) + \sqrt{\varepsilon}\,\lambda^\varepsilon_1(z, x, k, \hat V) + \varepsilon\,\lambda^\varepsilon_2(z, x, k, \hat V), \tag{7.39}
\]

with λ^ε_{1,2}(z) bounded in L^∞(V; L²(R^{2d})) uniformly in z ∈ [0, L]. This is the method of
perturbed test functions. Rather than performing asymptotic expansions on the Wigner
transform itself, which is not sufficiently smooth to justify Taylor expansions, we perform the
expansion on smooth test functions.
The functions λε1,2 will be chosen to remove all high-order terms in the definition of the
martingale (7.38), i.e., so that

\[
\| G^\varepsilon_{\lambda_\varepsilon}(z) - G_\lambda(z) \|_{L^\infty(\mathcal V)} \le C_\lambda\, \sqrt{\varepsilon} \tag{7.40}
\]

for all z ∈ [0, L]. Here Gελε is defined by (7.38) with λ replaced by λε , and Gλ is defined by
(7.30). The approximate martingale property (7.31) follows from this.
The functions λε1 and λε2 are as follows. Let λ1 (z, x, η, k, V̂ ) be the mean-zero solution of
the Poisson equation
k · ∇η λ1 + Qλ1 = −Kλ. (7.41)
It is given explicitly by
\[
\lambda_1(z, x, \eta, k, \hat V) = \frac{1}{i} \int_0^\infty dr\, e^{rQ} \int_{\mathbb{R}^d} \frac{d\hat V(p)}{(2\pi)^d}\, e^{ir(k\cdot p) + i(\eta\cdot p)} \Big[ \lambda\Big(z, x, k - \frac{p}{2}\Big) - \lambda\Big(z, x, k + \frac{p}{2}\Big) \Big]. \tag{7.42}
\]

Exercise 7.5.2 Prove the above formula. Hint: Go to the Fourier domain η → p.

Then we let λε1 (z, x, k, V̂ ) = λ1 (z, x, x/ε, k, V̂ ). Furthermore, the second order corrector is
given by λε2 (z, x, k, V̂ ) = λ2 (z, x, x/ε, k, V̂ ) where λ2 (z, x, η, k, V̂ ) is the mean-zero solution of

k · ∇η λ2 + Qλ2 = Lλ − Kλ1 , (7.43)

which exists because


E {Kλ1 } = Lλ. (7.44)

Exercise 7.5.3 Verify the above equality using the definition of the power spectrum of the
potential V .

The explicit expression for λ2 is given by
\[
\lambda_2(z, x, \eta, k, \hat V) = -\int_0^\infty dr\, e^{rQ} \Big[ L\lambda(z, x, k) - [K\lambda_1](z, x, \eta + rk, k, \hat V) \Big].
\]

Exercise 7.5.4 Verify this.

Using (7.41) and (7.43) we have


\[
\begin{aligned}
\frac{d}{dh}\, \mathbb{E}^{\tilde P_\varepsilon}_{W, \hat V, z}\{\langle W, \lambda_\varepsilon\rangle\}(z+h)\Big|_{h=0}
&= \Big\langle W, \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x + \frac{1}{\sqrt{\varepsilon}}\, K\Big[\hat V, \frac{x}{\varepsilon}\Big] + \frac{1}{\varepsilon} Q \Big) \big( \lambda + \sqrt{\varepsilon}\,\lambda^\varepsilon_1 + \varepsilon\,\lambda^\varepsilon_2 \big) \Big\rangle \\
&= \Big\langle W, \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x \Big)\lambda + L\lambda \Big\rangle + \Big\langle W, \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x \Big)\big( \sqrt{\varepsilon}\,\lambda^\varepsilon_1 + \varepsilon\,\lambda^\varepsilon_2 \big) + \sqrt{\varepsilon}\, K\Big[\hat V, \frac{x}{\varepsilon}\Big]\lambda^\varepsilon_2 \Big\rangle \\
&= \Big\langle W, \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x \Big)\lambda + L\lambda \Big\rangle + \sqrt{\varepsilon}\, \langle W, \zeta^\lambda_\varepsilon\rangle,
\end{aligned}
\]

with
\[
\zeta^\lambda_\varepsilon = \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x \Big)\lambda^\varepsilon_1 + \sqrt{\varepsilon}\,\Big( \frac{\partial}{\partial z} + k\cdot\nabla_x \Big)\lambda^\varepsilon_2 + K\Big[\hat V, \frac{x}{\varepsilon}\Big]\lambda^\varepsilon_2.
\]
The terms k · ∇x λε1,2 above are understood as differentiation with respect to the slow variable
x only, and not with respect to η = x/ε. It follows that Gελε is given by
\[
G^\varepsilon_{\lambda_\varepsilon}(z) = \langle W(z), \lambda_\varepsilon\rangle - \int_0^z ds\, \Big\langle W, \Big( \frac{\partial}{\partial z} + k\cdot\nabla_x + L \Big)\lambda \Big\rangle(s) - \sqrt{\varepsilon} \int_0^z ds\, \langle W, \zeta^\lambda_\varepsilon\rangle(s) \tag{7.45}
\]

and is a martingale with respect to the measure P̃ε defined on D([0, L]; BW × V). The estimate
(7.31) follows from the following two lemmas.

Lemma 7.5.1 Let λ ∈ C 1 ([0, L]; S(R2d )). Then there exists a constant Cλ > 0 independent
of z ∈ [0, L] so that the correctors λε1 (z) and λε2 (z) satisfy the uniform bounds

kλε1 (z)kL∞ (V;L2 ) + kλε2 (z)kL∞ (V;L2 ) ≤ Cλ (7.46)

and
\[
\Big\| \frac{\partial\lambda^\varepsilon_1(z)}{\partial z} + k\cdot\nabla_x\lambda^\varepsilon_1(z) \Big\|_{L^\infty(\mathcal V; L^2)} + \Big\| \frac{\partial\lambda^\varepsilon_2(z)}{\partial z} + k\cdot\nabla_x\lambda^\varepsilon_2(z) \Big\|_{L^\infty(\mathcal V; L^2)} \le C_\lambda. \tag{7.47}
\]
Lemma 7.5.2 There exists a constant C such that
\[
\| K[\hat V, x/\varepsilon] \|_{L^2\to L^2} \le C
\]
for any V̂ ∈ V and all ε ∈ (0, 1].



Indeed, (7.46) implies that |⟨W, λ⟩ − ⟨W, λε⟩| ≤ C√ε for all W ∈ X and V̂ ∈ V, while
(7.47) and Lemma 7.5.2 imply that for all z ∈ [0, L]

kζελ (z)kL2 ≤ C, (7.48)

for all V̂ ∈ V so that (7.31) follows.


Proof of Lemma 7.5.2. Lemma 7.5.2 follows immediately from the definition of K, the
bound (7.17) and the Cauchy-Schwarz inequality.
We now prove Lemma 7.5.1. We will omit the z-dependence of the test function λ to
simplify the notation.

Proof of Lemma 7.5.1. We only prove (7.46). Since λ ∈ S(R2d ), there exists a constant
Cλ so that

\[
|\lambda(x, k)| \le \frac{C_\lambda}{(1 + |x|^{5d})(1 + |k|^{5d})}.
\]
The value of the exponents 5d is by no means optimal, and is sufficient in what follows. Then
we obtain using (7.17) and (7.23)
\[
\begin{aligned}
|\lambda^\varepsilon_1(z, x, k, \hat V)| &= C \Big| \int_0^\infty dr\, e^{rQ} \int_{\mathbb{R}^d} d\hat V(p)\, e^{ir(k\cdot p) + i(x\cdot p)/\varepsilon} \Big[ \lambda\Big(z, x, k - \frac{p}{2}\Big) - \lambda\Big(z, x, k + \frac{p}{2}\Big) \Big] \Big| \\
&\le C \int_0^\infty dr\, e^{-\alpha r}\, \sup_{\hat V} \int_{\mathbb{R}^d} |d\hat V(p)|\, \Big[ \Big|\lambda\Big(z, x, k - \frac{p}{2}\Big)\Big| + \Big|\lambda\Big(z, x, k + \frac{p}{2}\Big)\Big| \Big] \\
&\le \frac{C}{(1 + |x|^{5d})\big(1 + (|k| - L)^{5d}\chi_{|k|\ge 5L}(k)\big)},
\end{aligned}
\]

and the L2 -bound on λ1 follows.


We show next that λε2 is uniformly bounded. We have
\[
\begin{aligned}
\lambda^\varepsilon_2(x, k, \hat V) = -\int_0^\infty dr\, e^{rQ} \Big[ L\lambda(x, k) - \frac{1}{i} \int_{\mathbb{R}^d} \frac{d\hat V(p)}{(2\pi)^d}\, e^{ip\cdot(x/\varepsilon + rk)} \qquad\qquad & \\
\times \Big[ \lambda_1\Big(x, \frac{x}{\varepsilon} + rk, k - \frac{p}{2}, \hat V\Big) - \lambda_1\Big(x, \frac{x}{\varepsilon} + rk, k + \frac{p}{2}, \hat V\Big) \Big] \Big] & .
\end{aligned}
\]
The second term above may be written as
\[
\begin{aligned}
&\frac{1}{i} \int_{\mathbb{R}^d} \frac{d\hat V(p)}{(2\pi)^d}\, e^{ip\cdot(x/\varepsilon + rk)} \Big[ \lambda_1\Big(x, \frac{x}{\varepsilon} + rk, k - \frac{p}{2}, \hat V\Big) - \lambda_1\Big(x, \frac{x}{\varepsilon} + rk, k + \frac{p}{2}, \hat V\Big) \Big] \\
&= -\int_{\mathbb{R}^d} \frac{d\hat V(p)}{(2\pi)^d}\, e^{ip\cdot(x/\varepsilon + rk)} \int_0^\infty ds\, e^{sQ} \int_{\mathbb{R}^d} \frac{d\hat V(q)}{(2\pi)^d}\, e^{is(k - p/2)\cdot q + i(x/\varepsilon + rk)\cdot q} \\
&\qquad\qquad \times \Big[ \lambda\Big(x, k - \frac{p}{2} - \frac{q}{2}\Big) - \lambda\Big(x, k - \frac{p}{2} + \frac{q}{2}\Big) \Big] \\
&\quad + \int_{\mathbb{R}^d} \frac{d\hat V(p)}{(2\pi)^d}\, e^{ip\cdot(x/\varepsilon + rk)} \int_0^\infty ds\, e^{sQ} \int_{\mathbb{R}^d} \frac{d\hat V(q)}{(2\pi)^d}\, e^{is(k + p/2)\cdot q + i(x/\varepsilon + rk)\cdot q} \\
&\qquad\qquad \times \Big[ \lambda\Big(x, k + \frac{p}{2} - \frac{q}{2}\Big) - \lambda\Big(x, k + \frac{p}{2} + \frac{q}{2}\Big) \Big].
\end{aligned}
\]
Therefore we obtain
\[
\begin{aligned}
|\lambda^\varepsilon_2(x, k, \hat V)| &\le C \int_0^\infty dr\, e^{-\alpha r} \Big[ |L\lambda(x, k)| + \sup_{\hat V} \int_{\mathbb{R}^d} |d\hat V(p)| \int_0^\infty ds\, e^{-\alpha s}\, \sup_{\hat V_1} \int_{\mathbb{R}^d} |d\hat V_1(q)| \\
&\qquad \times \Big( \Big|\lambda\Big(x, k - \frac{p}{2} - \frac{q}{2}\Big)\Big| + \Big|\lambda\Big(x, k - \frac{p}{2} + \frac{q}{2}\Big)\Big| + \Big|\lambda\Big(x, k + \frac{p}{2} - \frac{q}{2}\Big)\Big| + \Big|\lambda\Big(x, k + \frac{p}{2} + \frac{q}{2}\Big)\Big| \Big) \Big] \\
&\le C \Big( |L\lambda(x, k)| + \frac{1}{(1 + |x|^{5d})\big(1 + (|k| - L)^{5d}\chi_{|k|\ge 5L}(k)\big)} \Big),
\end{aligned}
\]

and the L2 -bound on λε2 in (7.46) follows because the operator L : L2 → L2 is bounded. The
proof of (7.47) is very similar and is left as a painful exercise.
Lemma 7.5.1 and Lemma 7.5.2 together with (7.45) imply the bound (7.40). The tight-
ness of measures Pε given by Lemma 7.5.4 implies then that the expectation E {Wε (z, x, k)}
converges weakly in L2 (R2d ) to the solution W (z, x, k) of the transport equation for each
z ∈ [0, L].

7.5.2 Convergence in probability
We now prove that for any test function λ the second moment E{⟨Wε, λ⟩²} converges to
⟨W̄, λ⟩². This will imply the convergence in probability claimed in Theorem 7.4.1. The proof
is similar to that for E {hWε , λi} and is based on constructing an appropriate approximate
martingale for the functional hW ⊗ W, µi, where µ(z, x1 , k1 , x2 , k2 ) is a test function, and
W ⊗ W (z, x1 , k1 , x2 , k2 ) = W (z, x1 , k1 )W (z, x2 , k2 ). We need to consider the action of the
infinitesimal generator on functions of W and V̂ of the form

F (W, V̂ ) = hW (x1 , k1 )W (x2 , k2 ), µ(z, x1 , k1 , x2 , k2 , V̂ )i = hW ⊗ W, µ(V̂ )i

where µ is a given function. The infinitesimal generator acts on such functions as


\[
\frac{d}{dh}\, \mathbb{E}^{\tilde P_\varepsilon}_{W, \hat V, z}\big\{ \langle W\otimes W, \mu(\hat V)\rangle \big\}(z + h)\Big|_{h=0} = \frac{1}{\varepsilon}\langle W\otimes W, Q\mu\rangle + \langle W\otimes W, H^\varepsilon_2\mu\rangle, \tag{7.49}
\]

where
\[
H^\varepsilon_2\mu = \sum_{j=1}^2 \frac{1}{\sqrt{\varepsilon}}\, K_j\Big[\hat V, \frac{x_j}{\varepsilon}\Big]\mu + k_j\cdot\nabla_{x_j}\mu, \tag{7.50}
\]

with
\[
K_1[\hat V, \eta_1]\mu = \frac{1}{i} \int_{\mathbb{R}^d} d\hat V(p)\, e^{i(p\cdot\eta_1)} \Big[ \mu\Big(k_1 - \frac{p}{2}, k_2\Big) - \mu\Big(k_1 + \frac{p}{2}, k_2\Big) \Big]
\]
and
\[
K_2[\hat V, \eta_2]\mu = \frac{1}{i} \int_{\mathbb{R}^d} d\hat V(p)\, e^{i(p\cdot\eta_2)} \Big[ \mu\Big(k_1, k_2 - \frac{p}{2}\Big) - \mu\Big(k_1, k_2 + \frac{p}{2}\Big) \Big].
\]
Therefore the functional

\[
G^{2,\varepsilon}_\mu = \langle W\otimes W, \mu(\hat V)\rangle(z) - \int_0^z \Big\langle W\otimes W, \Big( \frac{1}{\varepsilon} Q + \frac{\partial}{\partial z} + k_1\cdot\nabla_{x_1} + k_2\cdot\nabla_{x_2} + \frac{1}{\sqrt{\varepsilon}}\Big( K_1\Big[\hat V, \frac{x_1}{\varepsilon}\Big] + K_2\Big[\hat V, \frac{x_2}{\varepsilon}\Big] \Big) \Big)\mu \Big\rangle(s)\, ds \tag{7.51}
\]

is a P̃ ε martingale. We let µ(z, x, K) ∈ S(R2d × R2d ) be a test function independent of V̂ ,


where x = (x1 , x2 ), and K = (k1 , k2 ). We define an approximation

\[
\mu_\varepsilon(z, x, K) = \mu(z, x, K) + \sqrt{\varepsilon}\,\mu_1(z, x, x/\varepsilon, K) + \varepsilon\,\mu_2(z, x, x/\varepsilon, K).
\]

We will use the notation µε1 (z, x, K) = µ1 (z, x, x/ε, K) and µε2 (z, x, K) = µ2 (z, x, x/ε, K).
The functions µ1 and µ2 are to be determined. We now use (7.49) to get
  +
2
*
d 1 X
Dε := E (hW ⊗ W, µε (V̂ )i)(z + h) = W ⊗ W, Q + kj · ∇ηj  µ
dh h=0 W,V̂ ,z ε
j=1
 
2 2
* +
1 X X h i
+√ W ⊗ W, Q + kj · ∇ηj  µ1 + Kj V̂ , η j µ (7.52)
ε
j=1 j=1
 
2 2 2
* +
X X h i ∂µ X
+ W ⊗ W, Q + kj · ∇ηj  µ2 + Kj V̂ , η j µ1 + + kj · ∇xj µ
∂z
j=1 j=1 j=1
 
2 2
* +
√ X h
j
i ∂ X
j √
+ ε W ⊗ W, Kj V̂ , η µ2 +  + k · ∇xj  (µ1 + εµ2 ) .
∂z
j=1 j=1

The above expression is evaluated at η j = xj /ε. The term of order ε−1 in Dε vanishes since
µ is independent of V and the fast variable η. We cancel the term of order ε−1/2 in the same
way as before by defining µ1 as the unique mean-zero (in the variables V̂ and η = (η 1 , η 2 ))
solution of
\[
\Big( Q + \sum_{j=1}^2 k_j\cdot\nabla_{\eta_j} \Big)\mu_1 + \sum_{j=1}^2 K_j[\hat V, \eta_j]\mu = 0. \tag{7.53}
\]

It is given explicitly by

\[
\begin{aligned}
\mu_1(x, \eta, K, \hat V) &= \frac{1}{i} \int_0^\infty dr\, e^{rQ} \int_{\mathbb{R}^d} d\hat V(p)\, e^{ir(k_1\cdot p) + i(\eta_1\cdot p)} \Big[ \mu\Big(k_1 - \frac{p}{2}, k_2\Big) - \mu\Big(k_1 + \frac{p}{2}, k_2\Big) \Big] \\
&\quad + \frac{1}{i} \int_0^\infty dr\, e^{rQ} \int_{\mathbb{R}^d} d\hat V(p)\, e^{ir(k_2\cdot p) + i(\eta_2\cdot p)} \Big[ \mu\Big(k_1, k_2 - \frac{p}{2}\Big) - \mu\Big(k_1, k_2 + \frac{p}{2}\Big) \Big].
\end{aligned}
\]

When µ has the form µ = λ ⊗ λ, then µ1 has the form µ1 = λ1 ⊗ λ + λ ⊗ λ1 with the corrector
λ₁ given by (7.42). Let us also define µ₂ as the mean-zero (with respect to π_V) solution of

\[
\Big( Q + \sum_{j=1}^2 k_j\cdot\nabla_{\eta_j} \Big)\mu_2 + \sum_{j=1}^2 K_j[\hat V, \eta_j]\mu_1 = \sum_{j=1}^2 \overline{K_j[\hat V, \eta_j]\mu_1}, \tag{7.54}
\]
where \(\bar f = \int d\pi_{\mathcal V}\, f\). The function µ₂ is given by
\[
\begin{aligned}
\mu_2(x, \eta, K, \hat V) = -\int_0^\infty dr\, e^{rQ} \Big[\, \overline{K_1[\hat V, \eta_1 + rk_1]\mu_1}(x, \eta + rK, K) - \big[K_1[\hat V, \eta_1 + rk_1]\mu_1\big](x, \eta + rK, K, \hat V) \Big] & \\
- \int_0^\infty dr\, e^{rQ} \Big[\, \overline{K_2[\hat V, \eta_2 + rk_2]\mu_1}(x, \eta + rK, K) - \big[K_2[\hat V, \eta_2 + rk_2]\mu_1\big](x, \eta + rK, K, \hat V) \Big] & . \tag{7.55}
\end{aligned}
\]

Unlike the first corrector µ1 , the second corrector µ2 may not be written as an explicit sum of
tensor products even if µ has the form µ = λ ⊗ λ because µ1 depends on V̂ .
The P̃_ε-martingale G^{2,ε}_{µ_ε} is given by
\[
G^{2,\varepsilon}_{\mu_\varepsilon} = \langle W\otimes W, \mu_\varepsilon(\hat V)\rangle(z) - \int_0^z \Big\langle W\otimes W, \Big( \frac{\partial}{\partial z} + k_1\cdot\nabla_{x_1} + k_2\cdot\nabla_{x_2} + L^\varepsilon_2 \Big)\mu \Big\rangle(s)\, ds - \sqrt{\varepsilon} \int_0^z \langle W\otimes W, \zeta^\varepsilon_\mu\rangle(s)\, ds, \tag{7.56}
\]
0

where ζ^ε_µ is given by
\[
\zeta^\varepsilon_\mu = \sum_{j=1}^2 K_j\Big[\hat V, \frac{x_j}{\varepsilon}\Big]\mu^\varepsilon_2 + \Big( \frac{\partial}{\partial z} + \sum_{j=1}^2 k_j\cdot\nabla_{x_j} \Big)(\mu^\varepsilon_1 + \sqrt{\varepsilon}\,\mu^\varepsilon_2)
\]

and the operator L^ε_2 is defined by
\[
\begin{aligned}
L^\varepsilon_2\mu = -\frac{1}{(2\pi)^d} \int_0^\infty dr \int_{\mathbb{R}^d} dp\, \tilde R(r, p) \Big[\,
& e^{ir(k_1 + \frac{p}{2})\cdot p}\big( \mu(z, x_1, k_1, x_2, k_2) - \mu(z, x_1, k_1 + p, x_2, k_2) \big) \\
- \; & e^{ir(k_1 - \frac{p}{2})\cdot p}\big( \mu(z, x_1, k_1 - p, x_2, k_2) - \mu(z, x_1, k_1, x_2, k_2) \big) \\
+ \; & e^{ip\cdot\frac{x_2 - x_1}{\varepsilon} + irk_2\cdot p}\big( \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) - \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) \big) \\
- \; & e^{ip\cdot\frac{x_2 - x_1}{\varepsilon} + irk_2\cdot p}\big( \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) - \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) \big) \\
+ \; & e^{irk_1\cdot p + i\frac{x_1 - x_2}{\varepsilon}\cdot p}\big( \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) - \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) \big) \\
- \; & e^{irk_1\cdot p + i\frac{x_1 - x_2}{\varepsilon}\cdot p}\big( \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) - \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) \big) \\
+ \; & e^{ir(k_2 + \frac{p}{2})\cdot p}\big( \mu(z, x_1, k_1, x_2, k_2) - \mu(z, x_1, k_1, x_2, k_2 + p) \big) \\
- \; & e^{ir(k_2 - \frac{p}{2})\cdot p}\big( \mu(z, x_1, k_1, x_2, k_2 - p) - \mu(z, x_1, k_1, x_2, k_2) \big) \Big]. \tag{7.57}
\end{aligned}
\]
We have used in the calculation of L^ε_2 that for a sufficiently regular function f, we have
\[
\mathbb{E}\Big\{ \int_{\mathbb{R}^d} \frac{d\hat V(q)}{(2\pi)^d} \int_0^\infty dr\, e^{rQ} \int_{\mathbb{R}^d} d\hat V(p)\, f(r, p, q) \Big\} = \int_0^\infty dr \int_{\mathbb{R}^d} \tilde R(r, p)\, f(r, p, -p)\, dp.
\]

The bound on ζεµ is similar to that on ζελ obtained previously as the correctors µεj satisfy the
same kind of estimates as the correctors λj :
Lemma 7.5.3 There exists a constant Cµ > 0 so that the functions µε1,2 obey the uniform
bounds
kµε1 (z)kL2 (R2d ) + kµε2 kL2 (R2d ) ≤ Cµ , (7.58)
and
\[
\Big\| \frac{\partial\mu^\varepsilon_1(z)}{\partial z} + \sum_{j=1}^2 k_j\cdot\nabla_{x_j}\mu^\varepsilon_1(z) \Big\|_{L^2(\mathbb{R}^{2d})} + \Big\| \frac{\partial\mu^\varepsilon_2(z)}{\partial z} + \sum_{j=1}^2 k_j\cdot\nabla_{x_j}\mu^\varepsilon_2(z) \Big\|_{L^2(\mathbb{R}^{2d})} \le C_\mu, \tag{7.59}
\]

for all z ∈ [0, L] and V ∈ V.


The proof of this lemma is very similar to that of Lemma 7.5.1 and is therefore omitted.
Unlike the first moment case, the averaged operator L^ε_2 still depends on ε. We therefore do
not have strong convergence of the P̃_ε-martingale G^{2,ε}_{µ_ε} to its limit yet. However, the a priori
bound on Wε in L² allows us to characterize the limit of G^{2,ε}_{µ_ε} and show strong convergence.
This is shown as follows. The first and last terms in (7.57) that are independent of ε give the
contribution:
\[
\begin{aligned}
L_2\mu &= \int_0^\infty dr \int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d} \Big[\, \tilde R(r, p - k_1)\, e^{ir\frac{p^2 - k_1^2}{2}} \big( \mu(z, x_1, p, x_2, k_2) - \mu(z, x_1, k_1, x_2, k_2) \big) \\
&\qquad + \tilde R(r, k_1 - p)\, e^{ir\frac{k_1^2 - p^2}{2}} \big( \mu(z, x_1, p, x_2, k_2) - \mu(z, x_1, k_1, x_2, k_2) \big) \\
&\qquad + \tilde R(r, p - k_2)\, e^{ir\frac{p^2 - k_2^2}{2}} \big( \mu(z, x_1, k_1, x_2, p) - \mu(z, x_1, k_1, x_2, k_2) \big) \\
&\qquad + \tilde R(r, k_2 - p)\, e^{ir\frac{k_2^2 - p^2}{2}} \big( \mu(z, x_1, k_1, x_2, p) - \mu(z, x_1, k_1, x_2, k_2) \big) \Big] \\
&= \int_{\mathbb{R}^d} \frac{dp}{(2\pi)^d} \Big[\, \hat R\Big( \frac{p^2 - k_1^2}{2}, p - k_1 \Big) \big( \mu(z, x_1, p, x_2, k_2) - \mu(z, x_1, k_1, x_2, k_2) \big) \\
&\qquad + \hat R\Big( \frac{p^2 - k_2^2}{2}, p - k_2 \Big) \big( \mu(z, x_1, k_1, x_2, p) - \mu(z, x_1, k_1, x_2, k_2) \big) \Big].
\end{aligned}
\]

The two remaining terms give a contribution that tends to 0 as ε → 0 for sufficiently smooth
test functions. They are given by
\[
\begin{aligned}
(L^\varepsilon_2 - L_2)\mu = \frac{1}{(2\pi)^d} \int_0^\infty dr \int_{\mathbb{R}^d} dp\, \tilde R(r, p)\, \Big[ \big( e^{ip\cdot\frac{x_2 - x_1}{\varepsilon} + irk_2\cdot p} + e^{irk_1\cdot p + i\frac{x_1 - x_2}{\varepsilon}\cdot p} \big) \big( \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) - \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) \big) & \\
+ \big( e^{ip\cdot\frac{x_2 - x_1}{\varepsilon} + irk_2\cdot p} + e^{irk_1\cdot p + i\frac{x_1 - x_2}{\varepsilon}\cdot p} \big) \big( \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) - \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) \big) & \Big].
\end{aligned}
\]

We have
R̃(z, p) = R̃(−z, −p) ≥ 0
by Bochner’s theorem. Since (Lε2 − L2 ) and λ are real quantities, we can take the real part of
the above term and, after the change of variables r → −r and p → −p, obtain
\[
\begin{aligned}
(L^\varepsilon_2 - L_2)\mu &= \frac{1}{(2\pi)^d} \int_{-\infty}^\infty dr \int_{\mathbb{R}^d} dp\, \tilde R(r, p)\, \cos\Big( p\cdot\frac{x_2 - x_1}{\varepsilon} \Big)\big( e^{irk_2\cdot p} + e^{irk_1\cdot p} \big) \\
&\qquad \times \big( \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) + \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) \\
&\qquad\qquad - \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 - \tfrac{p}{2}) - \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) \big) \\
&= \frac{2}{(2\pi)^d} \int_{\mathbb{R}^d} dp\, \big( \hat R(-k_1\cdot p, p) + \hat R(-k_2\cdot p, p) \big) \cos\Big( p\cdot\frac{x_2 - x_1}{\varepsilon} \Big) \\
&\qquad \times \big( \mu(z, x_1, k_1 + \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) - \mu(z, x_1, k_1 - \tfrac{p}{2}, x_2, k_2 + \tfrac{p}{2}) \big) \\
&= g_1 + g_2 + g_3 + g_4 + \mathrm{c.c.}
\end{aligned}
\]

We have (since µ is real-valued)


\[
\begin{aligned}
I &= \int_{\mathbb{R}^{4d}} dx_1 dk_1 dx_2 dk_2\, |g_1(z, x_1, k_1, x_2, k_2)|^2 \\
&= C \int_{\mathbb{R}^{6d}} dx_1 dk_1 dx_2 dk_2 dp dq\, \hat R(-k_1\cdot p, p)\, \hat R(-k_1\cdot q, q)\, e^{i(p - q)\cdot\frac{x_2 - x_1}{\varepsilon}} \\
&\qquad \times \mu\Big(z, x_1, k_1 - \frac{p}{2}, x_2, k_2 + \frac{p}{2}\Big)\, \mu\Big(z, x_1, k_1 - \frac{q}{2}, x_2, k_2 + \frac{q}{2}\Big).
\end{aligned}
\]
Using density arguments we may assume that µ has the form

µ(x1 , k1 , x2 , k2 ) = µ1 (x1 − x2 )µ2 (x1 + x2 )µ3 (k1 )µ4 (k2 ).

Then we have
\[
\begin{aligned}
I &= C \int_{\mathbb{R}^{6d}} dx_1 dk_1 dx_2 dk_2 dp dq\, \hat R(-k_1\cdot p, p)\, \hat R(-k_1\cdot q, q)\, e^{-i(p - q)\cdot\frac{x_1}{\varepsilon}} \\
&\qquad \times \mu_1^2(x_1)\, \mu_2^2(x_2)\, \mu_3\Big(k_1 - \frac{p}{2}\Big)\, \mu_4\Big(k_2 + \frac{p}{2}\Big)\, \mu_3\Big(k_1 - \frac{q}{2}\Big)\, \mu_4\Big(k_2 + \frac{q}{2}\Big) \\
&= C \|\mu_2\|^2_{L^2} \int_{\mathbb{R}^{4d}} dk_1 dk_2 dp dq\, \hat R(-k_1\cdot p, p)\, \hat R(-k_1\cdot q, q)\, \hat\nu\Big( \frac{p - q}{\varepsilon} \Big) \\
&\qquad \times \mu_3\Big(k_1 - \frac{p}{2}\Big)\, \mu_4\Big(k_2 + \frac{p}{2}\Big)\, \mu_3\Big(k_1 - \frac{q}{2}\Big)\, \mu_4\Big(k_2 + \frac{q}{2}\Big),
\end{aligned}
\]

where ν(x) = µ21 (x). We introduce G(p) = supω R̂(ω, p) and use the Cauchy-Schwarz inequal-
ity in k1 and k2 :

\[
|I| \le C\, \|\mu_2\|^2_{L^2} \|\mu_3\|^2_{L^2} \|\mu_4\|^2_{L^2} \int_{\mathbb{R}^{2d}} dp dq\, G(p)\, G(q)\, \Big| \hat\nu\Big( \frac{p - q}{\varepsilon} \Big) \Big|.
\]

We use again the Cauchy-Schwarz inequality, now in p, to get
\[
\begin{aligned}
|I| &\le C\, \|\mu_2\|^2_{L^2} \|\mu_3\|^2_{L^2} \|\mu_4\|^2_{L^2}\, \|G\|_{L^2} \int_{\mathbb{R}^d} dq\, G(q)\, \Big( \int_{\mathbb{R}^d} dp\, \Big| \hat\nu\Big( \frac{p}{\varepsilon} \Big) \Big|^2 \Big)^{1/2} \\
&\le C\, \varepsilon^{d/2}\, \|\mu_2\|^2_{L^2} \|\mu_3\|^2_{L^2} \|\mu_4\|^2_{L^2}\, \|G\|_{L^2} \|G\|_{L^1} \|\nu\|_{L^2}.
\end{aligned}
\]

This proves that k(Lε2 − L2 )µkL2 → 0 as ε → 0. Note that oscillatory integrals of the form
\[
\int_{\mathbb{R}^d} e^{i\frac{p\cdot x}{\varepsilon}}\, \mu(p)\, dp, \tag{7.60}
\]

are not small in the bigger space A0 , which is natural in the context of Wigner transforms. In
this bigger space, we cannot control (Lε2 − L2 )µ and actually suspect that the limit measure
P may no longer be deterministic.
We therefore deduce that
\[
G^2_\mu = \langle W\otimes W, \mu(\hat V)\rangle(z) - \int_0^z \Big\langle W\otimes W, \Big( \frac{\partial}{\partial z} + k_1\cdot\nabla_{x_1} + k_2\cdot\nabla_{x_2} + L_2 \Big)\mu \Big\rangle(s)\, ds
\]

is an approximate P̃ε martingale. The limit of the second moment

W2 (z, x1 , k1 , x2 , k2 ) = EP {W (z, x1 , k1 )W (z, x2 , k2 )}

thus satisfies (weakly) the transport equation


\[
\frac{\partial W_2}{\partial z} + (k_1\cdot\nabla_{x_1} + k_2\cdot\nabla_{x_2}) W_2 = L_2 W_2,
\]
with initial data W2 (0, x1 , k1 , x2 , k2 ) = W0 (x1 , k1 )W0 (x2 , k2 ). Moreover, the operator L2
acting on a tensor product λ ⊗ λ has the form

L2 [λ ⊗ λ] = Lλ ⊗ λ + λ ⊗ Lλ.

This implies that

EP {W (z, x1 , k1 )W (z, x2 , k2 )} = EP {W (z, x1 , k1 )} EP {W (z, x2 , k2 )}

by uniqueness of the solution to the above transport equation with initial conditions given by
W0 (x1 , k1 )W0 (x2 , k2 ). This proves that the limiting measure P is deterministic and unique
(because characterized by the transport equation) and that the sequence Wε (z, x, k) converges
in probability to W (z, x, k).

7.5.3 Tightness of Pε
We now show tightness of the measures Pε in X . We have the lemma
Lemma 7.5.4 The family of measures Pε is weakly compact.
The proof is as follows; see [11]. A theorem of Mitoma and Fouque [24, 16] implies that in order
to verify tightness of the family Pε it is enough to check that for each λ ∈ C 1 ([0, L], S(Rd ×
Rd )) the family of measures Pε on C([0, L]; R) generated by the random processes Wλε (z) =
hWε (z), λi is tight. Tightness of Pε follows from the following two conditions. First, a Kol-
mogorov moment condition [10] in the form

\[
\mathbb{E}^{P_\varepsilon}\big\{ |\langle W, \lambda\rangle(z) - \langle W, \lambda\rangle(z_1)|^\gamma\, |\langle W, \lambda\rangle(z_1) - \langle W, \lambda\rangle(s)|^\gamma \big\} \le C_\lambda (z - s)^{1+\beta}, \qquad 0 \le s \le z_1 \le z \le L \tag{7.61}
\]

should hold with γ > 0, β > 0 and Cλ independent of ε. Second, we should have
\[
\lim_{R\to\infty}\, \limsup_{\varepsilon\to 0}\, \mathrm{Prob}^{P_\varepsilon}\Big\{ \sup_{0\le z\le L} |\langle W, \lambda\rangle(z)| > R \Big\} = 0.
\]

The second condition holds automatically in our case since the process Wλε (z) is uniformly
bounded for all z > 0 and ε > 0. In order to verify (7.61), note that we have
\[
\langle W(z), \lambda\rangle = G^\varepsilon_{\lambda_\varepsilon}(z) - \sqrt{\varepsilon}\,\langle W, \lambda^\varepsilon_1\rangle - \varepsilon\,\langle W, \lambda^\varepsilon_2\rangle + \int_0^z ds\, \Big\langle W, \frac{\partial\lambda}{\partial z} + k\cdot\nabla_x\lambda + L\lambda \Big\rangle(s) + \sqrt{\varepsilon} \int_0^z ds\, \langle W, \zeta^\lambda_\varepsilon\rangle(s).
\]

The uniform bound (7.48) on ζελ and the bounds on kλε1,2 (z)kL2 (R2d ) in Lemma 7.5.1 imply
that it suffices to check (7.61) for
\[
x_\varepsilon(z) = G^\varepsilon_{\lambda_\varepsilon}(z) + \int_0^z ds\, \Big\langle W, \frac{\partial\lambda}{\partial z} + k\cdot\nabla_x\lambda + L\lambda \Big\rangle(s).
\]

We have
\[
\begin{aligned}
\mathbb{E}\big\{ |x_\varepsilon(z) - x_\varepsilon(s)|^2\, \big|\, \mathcal{F}_s \big\} &\le 2\,\mathbb{E}\Big\{ \Big| \int_s^z d\tau\, \Big\langle W, \frac{\partial\lambda}{\partial z} + k\cdot\nabla_x\lambda + L\lambda \Big\rangle(\tau) \Big|^2\, \Big|\, \mathcal{F}_s \Big\} + 2\,\mathbb{E}\big\{ \big| G^\varepsilon_{\lambda_\varepsilon}(z) - G^\varepsilon_{\lambda_\varepsilon}(s) \big|^2\, \big|\, \mathcal{F}_s \big\} \\
&\le C(z - s)^2 + 2\,\mathbb{E}\big\{ \langle G^\varepsilon_{\lambda_\varepsilon}\rangle(z) - \langle G^\varepsilon_{\lambda_\varepsilon}\rangle(s)\, \big|\, \mathcal{F}_s \big\}.
\end{aligned}
\]


Here hGελε i is the increasing process associated with Gελε . We will now compute it explicitly.
First we obtain that
\[
\frac{d}{dh}\, \mathbb{E}^{P_\varepsilon}_{W, \hat V, z}\big\{ \langle W, \lambda_\varepsilon\rangle^2 \big\}(z + h)\Big|_{h=0} = 2\langle W, \lambda_\varepsilon\rangle \Big\langle W, \frac{\partial\lambda_\varepsilon}{\partial z} + k\cdot\nabla_x\lambda_\varepsilon + \frac{1}{\sqrt{\varepsilon}}\, K\Big[\hat V, \frac{x}{\varepsilon}\Big]\lambda_\varepsilon \Big\rangle + \frac{1}{\varepsilon}\, Q\big[ \langle W, \lambda_\varepsilon\rangle^2 \big]
\]

so that
\[
\langle W, \lambda_\varepsilon\rangle^2(z) - \int_0^z \Big( 2\langle W, \lambda_\varepsilon\rangle(s) \Big\langle W, \frac{\partial\lambda_\varepsilon}{\partial z} + k\cdot\nabla_x\lambda_\varepsilon + \frac{1}{\sqrt{\varepsilon}}\, K\Big[\hat V, \frac{x}{\varepsilon}\Big]\lambda_\varepsilon \Big\rangle(s) + \frac{1}{\varepsilon}\, Q\big[ \langle W, \lambda_\varepsilon\rangle^2 \big](s) \Big)\, ds
\]

is a martingale. Therefore we have


\[
\begin{aligned}
\langle G^\varepsilon_{\lambda_\varepsilon}\rangle(z) &= \int_0^z ds\, \Big( \frac{1}{\varepsilon}\, Q\big[ \langle W, \lambda_\varepsilon\rangle^2 \big] - \frac{2}{\varepsilon}\, \langle W, \lambda_\varepsilon\rangle\langle W, Q\lambda_\varepsilon\rangle \Big)(s) \\
&= \int_0^z ds\, \Big( Q\big[ \langle W, \lambda^\varepsilon_1\rangle^2 \big] - 2\langle W, \lambda^\varepsilon_1\rangle\langle W, Q\lambda^\varepsilon_1\rangle \Big)(s) + \sqrt{\varepsilon} \int_0^z ds\, H_\varepsilon(s)
\end{aligned}
\]

with
\[
H_\varepsilon = 2\big( Q[\langle W, \lambda^\varepsilon_1\rangle\langle W, \lambda^\varepsilon_2\rangle] - \langle W, \lambda^\varepsilon_1\rangle\langle W, Q\lambda^\varepsilon_2\rangle - \langle W, \lambda^\varepsilon_2\rangle\langle W, Q\lambda^\varepsilon_1\rangle \big) + \sqrt{\varepsilon}\big( Q[\langle W, \lambda^\varepsilon_2\rangle^2] - 2\langle W, \lambda^\varepsilon_2\rangle\langle W, Q\lambda^\varepsilon_2\rangle \big).
\]


The boundedness of λε2 and that of Q on L∞ (V) imply that |Hε (s)| ≤ C for all V ∈ V. This
yields
E hGελε i(z) − hGελε i(s) Fs ≤ C(z − s)


whence n o
E |xε (z) − xε (s)|2 Fs ≤ C(z − s).

In order to obtain (7.61) we note that

\[
\begin{aligned}
\mathbb{E}^{P_\varepsilon}\{ |x_\varepsilon(z) - x_\varepsilon(z_1)|^\gamma\, |x_\varepsilon(z_1) - x_\varepsilon(s)|^\gamma \}
&= \mathbb{E}^{P_\varepsilon}\big\{ \mathbb{E}^{P_\varepsilon}\{ |x_\varepsilon(z) - x_\varepsilon(z_1)|^\gamma\, |\, \mathcal{F}_{z_1} \}\, |x_\varepsilon(z_1) - x_\varepsilon(s)|^\gamma \big\} \\
&\le \mathbb{E}^{P_\varepsilon}\Big\{ \big[ \mathbb{E}^{P_\varepsilon}\{ |x_\varepsilon(z) - x_\varepsilon(z_1)|^2\, |\, \mathcal{F}_{z_1} \} \big]^{\gamma/2}\, |x_\varepsilon(z_1) - x_\varepsilon(s)|^\gamma \Big\} \\
&\le C(z - z_1)^{\gamma/2}\, \mathbb{E}^{P_\varepsilon}\{ |x_\varepsilon(z_1) - x_\varepsilon(s)|^\gamma \} \le C(z - z_1)^{\gamma/2}\, \mathbb{E}^{P_\varepsilon}\big\{ \mathbb{E}^{P_\varepsilon}\{ |x_\varepsilon(z_1) - x_\varepsilon(s)|^\gamma\, |\, \mathcal{F}_s \} \big\} \\
&\le C(z - z_1)^{\gamma/2}\, \mathbb{E}^{P_\varepsilon}\Big\{ \big[ \mathbb{E}^{P_\varepsilon}\{ |x_\varepsilon(z_1) - x_\varepsilon(s)|^2\, |\, \mathcal{F}_s \} \big]^{\gamma/2} \Big\} \le C(z - z_1)^{\gamma/2}\, (z_1 - s)^{\gamma/2} \\
&\le C(z - s)^\gamma.
\end{aligned}
\]

Choosing now γ ∈ (1, 2] (so that Jensen's inequality applies in the steps above) we get (7.61), which finishes the proof of Lemma 7.5.4.

7.5.4 Remarks
Statistical stability and a priori bounds. As we have already mentioned, the uniform
L2 bound for the Wigner transform is crucial in the derivation of Thm. 7.4.1. In the absence
of an a priori L2 bound, we are not able to characterize the limiting measure P . However we
can characterize its first moment. The derivation is done in [7]. Let us assume that Wε is
bounded in A0 , as is the case for the Wigner transform of a pure state ψε uniformly bounded
in L2 (Rd ). Then we can show that EPε {Wε } converges weakly to W , solution of (7.26), with
appropriate initial conditions (the Wigner transform of the limit ψε (0, x)). The proof is very
similar to that obtained above, except that in the proof of convergence, as well as in the proof
of tightness of the sequence of measures Pε (now defined on a ball in C([0, L]; A0 )), we need to
show that the test functions λ1,2 are bounded in A0 rather than L2 (R2d ).
However the proof of convergence of the second martingale in section 7.5.2 does not extend
to the case of a uniform bound in A0 . Technically, the obstacle resides in the fact that the
oscillatory integrals (7.60) are small in L2 (R2d ) but not in A0 . Since A0 includes bounded
measures, any measure µ(dp) concentrating on the hyperplane orthogonal to x will render the
integral (7.60) an order O(1) quantity.
The above discussion does not prove that Pε fails to converge to a deterministic limit. It strongly suggests, however, that if Wε is allowed to become very singular in A0, then the dynamics along such paths may not be sufficiently self-averaging for Pε to converge to a deterministic limit.
Actually, in the simplified regime of the Itô-Schrödinger equation (a further simplification
compared to the paraxial wave equation), it is shown in [2] that the measure Pε does not
converge to a limiting deterministic measure when the initial Wigner measure is very singular
(converges to a delta function in both x and k). Instead, scintillation effects, which measure
the distance between the second moment of Wε and the square of its first moment, are shown to
persist for all finite times (for an appropriate scaling). This does not characterize the limiting
measure P either (this remains an open problem even in the Itô-Schrödinger framework), but
at least shows that P is not deterministic.

Paraxial and radiative transfer regimes. Note that in the limit where the potential V(z, x) oscillates very slowly in the z variable, R(z, x) converges to a function that does not depend on z (because V(z, x) becomes highly correlated in z), so that R̂(ω, p) converges to a function of the form δ(ω)R̂(p). We then obtain the limiting average transport equation

\[
\kappa \frac{\partial W}{\partial z} + k\cdot\nabla_x W = \kappa^2 \int_{\mathbb{R}^d} \hat R(p-k)\, \delta\Big( \frac{|k|^2}{2} - \frac{|p|^2}{2} \Big) \big( W(x,p) - W(x,k) \big)\, \frac{dp}{(2\pi)^d}. \qquad (7.62)
\]

This is the radiative transfer equation for the Schrödinger equation (7.13) when the potential V(x) is independent of the variable z. We do not recover the full radiative transfer equation of Chapter 6 since we started with the paraxial approximation. However, we recover the radiative transfer equation for the Schrödinger equation, which can be derived formally using the same tools as those developed in Chapter 6.

Exercise 7.5.5 [long project] Derive (7.62) from the Wigner transform of the Schrödinger
equation. Hint: see [28].

Note that the dispersion relation for wave equations ω = c0 |k| is now replaced by its “paraxial”
approximation ω = |k|2 /2, where k now is the transverse component of the wavevector only.
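Equation (7.62) is, formally, the forward Kolmogorov equation of a jump Markov process on phase space: between jumps the position x is advected with velocity k/κ, and at Poisson jump times the wavevector jumps from k to some p with |p| = |k| (the delta function enforces elastic scattering). The following Monte Carlo sketch samples one such path in d = 2 with a constant kernel R̂, so that scattering directions are uniform on the circle; the rate and all parameter names are illustrative, not taken from the text.

```python
import numpy as np

def sample_path(k0, rate, z_max, rng, kappa=1.0):
    """Sample one path of the jump process formally underlying (7.62), d = 2.

    Between jumps: dx/dz = k / kappa (free transport).
    At Poisson jump times (intensity `rate`): k -> p with |p| = |k|,
    direction uniform on the circle (constant scattering kernel R-hat).
    Returns the position x and wavevector k at depth z_max.
    """
    x = np.zeros(2)
    k = np.asarray(k0, dtype=float)
    z = 0.0
    while True:
        dz = rng.exponential(1.0 / rate)
        if z + dz >= z_max:
            x += (z_max - z) * k / kappa
            return x, k
        x += dz * k / kappa
        z += dz
        theta = rng.uniform(0.0, 2.0 * np.pi)
        # elastic scattering: only the direction of k changes, |k| is preserved
        k = np.linalg.norm(k) * np.array([np.cos(theta), np.sin(theta)])
```

The invariance of |k| along the path is the sample-path expression of the constraint δ(|k|²/2 − |p|²/2) in (7.62).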

Appendix A

Notes on Diffusion Markov Processes

A.1 Markov Process and Infinitesimal Generator


A.1.1 Definitions and Kolmogorov equations
A process {X(t)}t≥0 is called Markov with state space S if X(t) ∈ S for all t ≥ 0 and if, for every measurable set A ⊂ S, t ≥ 0, and τ ≥ 0, we have

P [X(t + τ ) ∈ A|X(s), s ≤ t] = P [X(t + τ ) ∈ A|X(t)], (A.1)

almost surely [12]. What this means is that the future of X(u) for u > t depends on its past X(s) for s ≤ t only through its present X(t). In other words, knowing X(t) is enough to determine the law of X(t + τ) for τ ≥ 0. Markov processes forget the past instantaneously.
Let us define the Transition Probability Density p(t, y|s, x) as

p(t, y|s, x)dy = P [X(t) ∈ (y, y + dy)|X(s) = x] for s ≤ t. (A.2)

The Chapman-Kolmogorov (CK) relation states that for s < τ < t,
\[
p(t,y|s,x) = \int p(\tau,\xi|s,x)\, p(t,y|\tau,\xi)\, d\xi, \qquad (A.3)
\]
which is an integration over all intermediate states ξ taken at time τ. We can also see this as
\[
(s,x) \longrightarrow (\tau,\xi) \longrightarrow (t,y).
\]

For s ≤ t, we define the 2-parameter family of solution operators $T_s^t$ as
\[
(T_s^t f)(x) = E\{ f(X(t)) \,|\, X(s) = x \} = \int p(t,y|s,x)\, f(y)\, dy. \qquad (A.4)
\]
The properties of $T_s^t$, obtained from the CK relation, are
\[
(1)\ T_t^t = \mathrm{Id}\ \ \text{(identity)}, \qquad
(2)\ T_s^t = T_s^\tau\, T_\tau^t\ \ \text{for } s<\tau<t, \qquad
(3)\ \|T_s^t\|_{L^\infty} \le 1\ \ \text{(contraction)}. \qquad (A.5)
\]
We define the Infinitesimal Generator $Q_t$ of the Markov process X(t) as the operator
\[
Q_t = \lim_{h\to 0^+} \frac{T_t^{t+h} - \mathrm{Id}}{h}. \qquad (A.6)
\]
The domain of definition of $Q_t$ may be a dense subset of that of $T_s^t$.
Let us define the quantity whose computation we are interested in,
\[
u(t,s,x) = T_s^t f(x) = E\{ f(X(t)) \,|\, X(s) = x \}. \qquad (A.7)
\]
Here, s is the starting time, x the starting position, and t the final time; we are interested in the average of f(X(t)). Then u satisfies the following Kolmogorov Backward Equation (KBE)
\[
\frac{\partial u}{\partial s} + Q_s u = 0 \quad \text{for } s < t, \qquad u(t,t,x) = f(x). \qquad (A.8)
\]
Note that the differentiation is with respect to the backward time s: we solve the PDE backwards in time from t to s, and $Q_s$ operates on the backward starting position x. The transition probability density p(t,y|s,x) satisfies the KBE with f(x) = δ(x − y):
\[
\frac{\partial p}{\partial s}(t,y|s,x) + \big( Q_s\, p(t,y|s,\cdot) \big)(x) = 0 \quad \text{for } s < t, \qquad p(t,y|t,x) = \delta(x-y). \qquad (A.9)
\]

The transition probability density p(t,y|s,x) also satisfies the following Kolmogorov Forward Equation (KFE)
\[
\frac{\partial p}{\partial t}(t,y|s,x) = \big( Q_t^*\, p(t,\cdot|s,x) \big)(y) \quad \text{for } t > s, \qquad p(s,y|s,x) = \delta(x-y). \qquad (A.10)
\]
Here $Q_t^*$ is the formal adjoint of $Q_t$, acting on the forward variable y. Note that this equation is solved forwards in time, from s to t > s.

A.1.2 Homogeneous Markov Processes


By definition, the statistics of a homogeneous Markov process do not change in time:
\[
p(t,y|s,x) = p(t-s,y|x); \qquad (A.11)
\]
the transition probability depends on the time difference only. The Chapman-Kolmogorov relation becomes
\[
p(t+\tau,y|x) = \int p(t,\xi|x)\, p(\tau,y|\xi)\, d\xi.
\]

We now also define the 1-parameter family of operators $T^t$,
\[
(T^t f)(x) = E\{ f(X(t)) \,|\, X(0) = x \} = \int p(t,y|x)\, f(y)\, dy, \qquad (A.12)
\]
with the properties
\[
(1)\ T^0 = \mathrm{Id}, \qquad
(2)\ T^{t+\tau} = T^t\, T^\tau, \qquad
(3)\ \|T^t\|_{L^\infty} \le 1. \qquad (A.13)
\]
The family $T^t$ forms a continuous contraction semigroup of operators. We also define the infinitesimal generator
\[
Q = \lim_{t\to 0^+} \frac{T^t - \mathrm{Id}}{t}, \qquad (A.14)
\]
with no time dependence. We then verify that the solution operator $T^t$ is given by
\[
T^t = e^{tQ}. \qquad (A.15)
\]
Indeed, if we define
\[
u(t,x) = E\{ f(X(t)) \,|\, X(0) = x \} = (T^t f)(x), \qquad (A.16)
\]
then the KBE, rewritten in terms of the elapsed time t − s, becomes
\[
\frac{\partial u}{\partial t} = Qu \quad \text{for } t \ge 0, \qquad u(0,x) = f(x). \qquad (A.17)
\]
This implies (A.15). We also have the KBE and KFE for the transition probability density,
\[
\frac{\partial p}{\partial t}(t,y|x) = \big( Q\, p(t,y|\cdot) \big)(x) \quad \text{for } t \ge 0, \qquad p(0,y|x) = \delta(x-y), \qquad (A.18)
\]
\[
\frac{\partial p}{\partial t}(t,y|x) = \big( Q^*\, p(t,\cdot|x) \big)(y) \quad \text{for } t \ge 0, \qquad p(0,y|x) = \delta(x-y). \qquad (A.19)
\]
In the KBE (A.18), y is a parameter; in the KFE (A.19), x is a parameter.
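For a two-state chain everything above is an explicit 2×2 computation, so (A.15) and the Kolmogorov equations can be checked directly. A sketch with illustrative jump rates a and b; the matrix exponential is computed by eigendecomposition:

```python
import numpy as np

a, b = 2.0, 1.0
# Generator acting on functions: (Qf)(x) = sum_y Q[x, y] f(y).
# Rows sum to zero, so Q applied to a constant vanishes (Q1 = 0).
Q = np.array([[-a, a],
              [b, -b]])

def T(t):
    """Solution operator T^t = exp(tQ), via eigendecomposition of tQ."""
    w, V = np.linalg.eig(t * Q)
    return (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)).real

# p(t, y|x) = T(t)[x, y]: each row of T(t) is a probability distribution,
# and d/dt T^t = Q T^t = T^t Q are the backward and forward equations.
```

Here `T(t).sum(axis=1) == 1` is the matrix form of T^t 1 = 1.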

A.1.3 Ergodicity for homogeneous Markov processes


By definition, a homogeneous Markov process is called ergodic if there exists a unique normalized invariant measure $\bar p(y)$ such that
\[
Q^* \bar p = 0, \qquad \int \bar p(y)\, dy = 1. \qquad (A.20)
\]
Notice that constant functions always belong to the null space of Q, since $T^t 1 = 1$ implies Q1 = 0; ergodicity implies that the null space of Q consists exactly of the constant functions. From (A.20) and (A.19), we deduce that
\[
\bar p(x) = \int p(t_0, x|\xi)\, \bar p(\xi)\, d\xi, \qquad (A.21)
\]
for all times $t_0 \ge 0$. This justifies the notion of invariant measure. As t → ∞, the density p(t, ·|x) converges to $\bar p$ exponentially; the spectrum of $Q^*$ gives the rate of mixing, i.e., of convergence.

We can now construct the inverse of an ergodic generator Q. We want to solve the problem
\[
Qu = f, \qquad (A.22)
\]
where f is a known source term. The Fredholm alternative states that this problem admits solutions if and only if
\[
E_\infty\{f(X)\} = \int f(y)\, \bar p(y)\, dy = 0, \quad \text{i.e., } f \perp \bar p. \qquad (A.23)
\]
Here, $E_\infty$ denotes expectation with respect to the invariant measure. The solution u is defined up to a constant function. The solution orthogonal to $\bar p$, i.e., such that $\int \bar p(y) u(y)\, dy = 0$, is given by
\[
u(y) = -\int_0^\infty e^{sQ} f\, ds = -\int_0^\infty T^s f\, ds. \qquad (A.24)
\]
We can summarize by saying that
\[
-\int_0^\infty e^{sQ}\, ds : \mathcal{D}\to\mathcal{D}
\]
is the inverse of Q restricted to
\[
\mathcal{D} = \{\bar p\}^\perp = \big( \mathrm{Null}\{Q^*\} \big)^\perp.
\]
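The representation (A.24) of the inverse can be checked numerically on a finite state space: take a generator matrix Q (rows summing to zero; the entries below are illustrative), compute the invariant measure p̄ as the normalized left null vector of Q, center a source f so that the Fredholm condition (A.23) holds, and evaluate u = −∫₀^∞ e^{sQ} f ds by quadrature. The integrand decays at the spectral-gap rate, so truncating the integral is harmless.

```python
import numpy as np

Q = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.6, 0.2, -0.8]])   # a generator: every row sums to zero

def expm_small(M, terms=25):
    """Taylor-series matrix exponential, adequate for small ||M||."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for j in range(1, terms):
        term = term @ M / j
        out = out + term
    return out

# Invariant measure: normalized left null vector of Q (Q* pbar = 0).
w, V = np.linalg.eig(Q.T)
pbar = np.abs(V[:, np.argmin(np.abs(w))].real)
pbar /= pbar.sum()

# Center the source so that (f, pbar) = 0: the Fredholm condition (A.23).
f = np.array([1.0, -2.0, 0.5])
f = f - pbar @ f

# u = -int_0^infty e^{sQ} f ds, left Riemann sum truncated at s = 40.
ds = 0.002
E = expm_small(ds * Q)          # one quadrature step e^{ds Q}
Tsf, u = f.copy(), np.zeros(3)
for _ in range(int(40 / ds)):
    u -= Tsf * ds
    Tsf = E @ Tsf               # advance T^s f by one step
```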

A.2 Perturbation expansion and diffusion limit


Let us consider the ordinary differential equation
\[
\frac{dX_\varepsilon}{dt} = \frac{1}{\sqrt\varepsilon}\, F\Big( X_\varepsilon(t), Y\Big(\frac{t}{\varepsilon}\Big) \Big) + G\Big( X_\varepsilon(t), Y\Big(\frac{t}{\varepsilon}\Big) \Big), \qquad X_\varepsilon(0) = X_0 \in \mathbb{R}^d, \qquad (A.25)
\]
for some smooth functions F(x,y) and G(x,y). We set $Y_\varepsilon(t) = Y(\varepsilon^{-1} t)$. We assume that Y(t) is a homogeneous Markov process with infinitesimal generator Q bounded on $L^\infty(\mathcal{Y})$ and with a unique invariant measure π(y) solution of
\[
Q^* \pi(dy) = 0. \qquad (A.26)
\]
Moreover we assume the existence of a spectral gap or, equivalently, that the semigroup $T^r = e^{rQ}$ is strictly contracting on $\pi^\perp$, the set of functions f such that $E_\infty\{f\} = (f,\pi) = 0$, where $(\cdot,\cdot)$ is the usual inner product on $\mathcal{Y}$. The spectral gap α > 0 is such that if $(g,\pi) = 0$, then
\[
\| e^{rQ} g \|_{L^\infty(\mathcal{Y})} \le C\, \| g \|_{L^\infty(\mathcal{Y})}\, e^{-\alpha r}.
\]
This allows us to solve the following Poisson equation
\[
Qf = g, \qquad (A.27)
\]
with the following Fredholm alternative. If $(\pi, g) \neq 0$, then the above equation admits no solution in $L^\infty(\mathcal{Y})$. If $(\pi,g) = 0$, then there exists a unique solution f such that $(\pi, f) = 0$, given by
\[
f(y) = -\int_0^\infty T^r g(y)\, dr, \qquad (A.28)
\]
which moreover satisfies $\|f\|_{L^\infty(\mathcal{Y})} \le C \|g\|_{L^\infty(\mathcal{Y})}$.
Since Y(t) is a Markov process, so are Yε(t) and the joint process (Xε(t), Yε(t)). Let us consider moments of the form
uε (t, x, y) = E(x,y) {f (Xε (t), Yε (t))}, (A.29)
where the processes (Xε (t), Yε (t)) start at (x, y) at t = 0, and where the function f (x, y)
is smooth. We want to understand the limit of uε (t, x, y) as ε → 0. This characterizes the
limiting law of the joint process (Xε (t), Yε (t)), thus of Xε (t).

The analysis is carried out using a perturbation expansion. The exposition closely follows that of [11]. The equation for $u_\varepsilon$ is
\[
\frac{\partial u_\varepsilon}{\partial t}(t,x,y) = L_\varepsilon u_\varepsilon(t,x,y), \qquad u_\varepsilon(0,x,y) = f(x,y), \qquad (A.30)
\]
where the infinitesimal generator of $(X_\varepsilon(t), Y_\varepsilon(t))$ is
\[
L_\varepsilon = \frac{1}{\varepsilon} L_0 + \frac{1}{\sqrt\varepsilon} L_1 + L_2, \qquad
L_0 = Q, \quad L_1 = F(x,y)\cdot\nabla_x, \quad L_2 = G(x,y)\cdot\nabla_x. \qquad (A.31)
\]

We can then expand $u_\varepsilon = u_0 + \sqrt\varepsilon\, u_1 + \varepsilon\, u_2 + \zeta_\varepsilon$, plug this into (A.30), and equate like powers of ε. This yields
\[
Q u_0 = 0, \qquad
Q u_1 + L_1 u_0 = 0, \qquad
Q u_2 + L_1 u_1 + L_2 u_0 = \frac{\partial u_0}{\partial t}. \qquad (A.32)
\]
The first equation Qu0 = 0 implies that u0 = u0(t, x) is independent of y, since the null space of Q consists of the constant functions. The second equation admits a solution only if
\[
(L_1 u_0, \pi(dy)) = 0,
\]
which, since u0(t, x) is independent of y, necessitates that
\[
(F(x,\cdot), \pi) = \int_{\mathcal{Y}} F(x,y)\, \pi(dy) = E_\infty\{F(x,\cdot)\} = 0. \qquad (A.33)
\]
This is a constraint on F(x, y) that we need to assume in order to obtain a limit for uε as ε → 0. If the above average does not vanish, there is a drift of order ε^{-1/2} that prevents one from having a non-trivial limit for uε as ε → 0. When the constraint (A.33) is satisfied, the second equation in (A.32) admits a unique centered solution given by
\[
u_1(t,x,y) = \int_0^\infty e^{rQ} F(x,y)\cdot\nabla_x u_0(t,x)\, dr. \qquad (A.34)
\]

The third equation in (A.32) admits solutions only if the following compatibility condition holds:
\[
\Big( \pi,\ L_1 u_1 + L_2 u_0 - \frac{\partial u_0}{\partial t} \Big) = 0. \qquad (A.35)
\]
More explicitly, we have
\[
\frac{\partial u_0}{\partial t} = \int_{\mathcal{Y}} F(x,y)\cdot\nabla_x \int_0^\infty e^{rQ} F(x,y)\cdot\nabla_x u_0(t,x)\, dr\, \pi(dy) + \int_{\mathcal{Y}} G(x,y)\cdot\nabla_x u_0(t,x)\, \pi(dy).
\]

Using the fact that $e^{rQ}F(x,\cdot)(y) = T^r F(x,\cdot)(y) = E\{F(x, Y(t+r)) \,|\, Y(t) = y\}$, independently of t, this can be recast as
\[
\frac{\partial u_0}{\partial t} = \frac{1}{2}\, a_{ij}(x)\, \frac{\partial^2 u_0}{\partial x_i \partial x_j} + b_k(x)\, \frac{\partial u_0}{\partial x_k}, \qquad u_0(0,x) = f(x), \qquad (A.36)
\]
with summation over repeated indices, where we have
\[
a_{ij}(x) = \frac{1}{2}\int_0^\infty E_\infty\big\{ F_i(x,y(0))\, F_j(x,y(s)) \big\}\, ds, \qquad
b_k(x) = \sum_{j=1}^d \int_0^\infty E_\infty\Big\{ F_j(x,y(0))\, \frac{\partial F_k}{\partial x_j}(x,y(s)) \Big\}\, ds + E_\infty\{ G_k(x,y) \}. \qquad (A.37)
\]

The limiting equation for u0(t, x) is thus a diffusion equation. The matrix a, which as we can verify is symmetric and positive definite, can be written as a = σ² (this factorization is the reason for the factors 1/2 in (A.36) and (A.37)). The right-hand side in (A.36) is then the infinitesimal generator of a diffusion Markov process satisfying the following stochastic ordinary differential equation (in the Itô sense):
\[
dX = b(X)\, dt + \sigma(X)\, dW_t, \qquad (A.38)
\]
where $W_t$ is d-dimensional Brownian motion.

Theorem A.2.1 For f(x) smooth, let uε(t, x, y) be the solution of (A.30) with initial condition uε(0, x, y) = f(x), and let u(t, x) be the solution of the limiting equation (A.36) with the same initial condition. Then, provided that the functions F and G are sufficiently smooth, we have
\[
|u_\varepsilon(t,x,y) - u(t,x)| = O(\sqrt\varepsilon), \qquad 0 \le t \le T < \infty, \qquad (A.39)
\]
uniformly in x ∈ ℝ^d and y ∈ 𝒴.

Exercise A.2.1 Prove the above theorem assuming that the solutions to (A.36) and to (A.27)
are sufficiently smooth and that the operator ∂t − Lε satisfies a maximum principle. Follow
the steps of the proof of Thm. 2.1.1.

The above theorem shows that
\[
E_{x,y}\{ f(X_\varepsilon(t)) \} \to E_x\{ f(X(t)) \}, \qquad \varepsilon \to 0, \qquad (A.40)
\]
for all 0 ≤ t ≤ T < ∞, where X(t) solves (A.38). This is not quite sufficient to show the weak convergence of Xε to X as measures on the space of paths C([0, T]; ℝ^d), which requires a better control of the regularity in time of the convergence. The proof of weak convergence in C([0, T]; ℝ^d) goes beyond the scope of these notes and we refer to [11] for additional details.
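The theorem can be illustrated numerically in the simplest setting: take Y(t) a symmetric two-state (telegraph) process with values ±1 and jump rate r, and F(x, y) = y, G = 0 in (A.25), so that Xε(t) = √ε ∫₀^{t/ε} Y(u) du. Since E∞{Y(0)Y(s)} = e^{−2rs}, a direct computation of the variance of this integral gives Var Xε(t) → t/r as ε → 0, and the limit X(t) is a Brownian motion with that variance. A Monte Carlo sketch (all parameters illustrative):

```python
import numpy as np

def telegraph_integral(T, r, rng):
    """Integral over [0, T] of a +/-1 telegraph process with jump rate r,
    simulated exactly from its exponential holding times."""
    t, sign, integral = 0.0, rng.choice([-1.0, 1.0]), 0.0
    while t < T:
        dt = rng.exponential(1.0 / r)
        integral += sign * min(dt, T - t)
        t += dt
        sign = -sign
    return integral

def sample_X_eps(t, eps, r, rng):
    """X_eps(t) = sqrt(eps) * int_0^{t/eps} Y(u) du, cf. (A.25) with F = y, G = 0."""
    return np.sqrt(eps) * telegraph_integral(t / eps, r, rng)

rng = np.random.default_rng(0)
X = np.array([sample_X_eps(1.0, 0.01, 1.0, rng) for _ in range(4000)])
# Empirical variance should be close to t / r = 1 for small eps.
```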

A.3 Remarks on stochastic integrals


In the preceding section we saw that the limiting process solved the stochastic ordinary differential equation
\[
dX = b(x)\, dt + \sigma(x)\, dW_t, \qquad (A.41)
\]
where $W_t$ is d-dimensional Brownian motion. Brownian motion $W_t$ is defined as the homogeneous Markov process whose transition probability density, for any starting point $W_0 = y \in \mathbb{R}^d$, is given by
\[
p(t,x,y) = \frac{1}{(2\pi t)^{d/2}} \exp\Big( -\frac{|x-y|^2}{2t} \Big), \qquad x \in \mathbb{R}^d,\ t > 0. \qquad (A.42)
\]

This defines a stochastic process [12], and the process can be chosen to be continuous, i.e., in C([0, ∞); ℝ^d), in the sense that there is a version of Wt that is continuous. (Xt is a version of Yt if P({ω; Xt(ω) = Yt(ω)}) = 1 for every t.)

Important properties of Brownian motion are:
(i) Wt is Gaussian, i.e., for all (t1, ..., tk), (Wt1, ..., Wtk) has a multivariate normal distribution. Among other things, this shows that
\[
E_x\{W_t\} = x, \qquad t > 0, \qquad (A.43)
\]
where $E_x$ denotes mathematical expectation for a Brownian motion starting at $W_0 = x$. Also we have
\[
E_x\{|W_t - x|^2\} = d\,t, \qquad E_x\{ (W_t - x)\cdot(W_s - x) \} = d\, \min(s,t). \qquad (A.44)
\]
The above relations imply that
\[
E_x\{ |W_t - W_s|^2 \} = d\,(t-s), \qquad \text{if } t \ge s. \qquad (A.45)
\]
(ii) Wt has independent increments, i.e.,
\[
W_{t_1},\ W_{t_2}-W_{t_1},\ \cdots,\ W_{t_k}-W_{t_{k-1}}, \qquad (A.46)
\]
are independent for all 0 ≤ t1 < t2 < ... < tk. This is a consequence of the Gaussian character of Wt together with the covariance structure (A.44).
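Properties (A.43)-(A.45) can be checked by direct sampling, building W from independent Gaussian increments (here d = 1, with illustrative times s and t):

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, t = 200_000, 0.4, 1.0
Ws = np.sqrt(s) * rng.standard_normal(n)             # W_s ~ N(0, s), started at x = 0
Wt = Ws + np.sqrt(t - s) * rng.standard_normal(n)    # independent increment W_t - W_s

cov_st = np.mean(Ws * Wt)            # should be min(s, t) = s,      cf. (A.44)
var_incr = np.mean((Wt - Ws) ** 2)   # should be d (t - s) = t - s,  cf. (A.45)
corr_incr = np.mean(Ws * (Wt - Ws))  # independent increments: should vanish
```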
Note that Wt can be chosen continuous. However, it can be shown that the continuous version is nowhere differentiable (almost surely). The notation dWt therefore needs to be interpreted carefully. A logical interpretation of the stochastic equation (A.41) is that its solution satisfies the integral equation
\[
X_t = X_0 + \int_0^t b(s,X_s)\, ds + \int_0^t \sigma(s,X_s)\, dW_s. \qquad (A.47)
\]
The first integral $\int_0^t b(s,X_s)\, ds$ can be given the usual sense for sufficiently smooth functions $b(s,X_s)$. The last integral $\int_0^t \sigma(s,X_s)\, dW_s$, however, requires interpretation; it can actually be given several non-equivalent meanings. Let us generalize the integral to
\[
\int_0^t f(s,\omega)\, dW_s, \qquad (A.48)
\]

for f(s, ω) a (matrix-valued) stochastic process. Let (tk, tk+1) be an interval. Then we have the reasonable definition
\[
\int_{t_k}^{t_{k+1}} dW_s = W_{t_{k+1}} - W_{t_k}. \qquad (A.49)
\]
Let $0 = t_1 < t_2 < \cdots < t_{k+1} = t$ be a partition of (0, t). Then, as Riemann sums are used to define Riemann integrals, we can approximate (A.48) by
\[
\sum_{m=1}^{k} f(t_m^*)\, \big( W_{t_{m+1}} - W_{t_m} \big), \qquad (A.50)
\]
where $t_m \le t_m^* \le t_{m+1}$. Here, however, there is a big surprise: the choice of $t_m^*$ matters. The main reason is that $W_{t_{m+1}} - W_{t_m}$ has fluctuations that on average are of order $\sqrt{t_{m+1}-t_m}$, which is much larger than $t_{m+1}-t_m$, since the latter is equal to the expectation of the square of $W_{t_{m+1}} - W_{t_m}$. Two choices are famous (with names):
\[
t_m^* = t_m \quad \text{(Itô sense)}, \qquad\qquad
t_m^* = \frac{t_m + t_{m+1}}{2} \quad \text{(Stratonovich sense)}. \qquad (A.51)
\]
As the partition is refined and all $t_{m+1}-t_m$ tend to 0 uniformly, the approximations (A.50) admit limits, which may differ depending on the choice of $t_m^*$. The accepted notation for the limiting integrals is
\[
\int_0^t f(s,\omega)\, dW_s \quad \text{(Itô integral)}, \qquad\qquad
\int_0^t f(s,\omega) \circ dW_s \quad \text{(Stratonovich integral)}. \qquad (A.52)
\]
See [25] for examples of processes f(t, ω) for which the two definitions of the integral provide very different results. In the mathematical literature, the Itô integral appears much more frequently. The reason is that the resulting integral is a martingale, for which a great many estimates are available. The Stratonovich integral may make more sense physically in cases where there is no reason to privilege one direction of time over the other (forward or backward). The Itô choice sums terms of the form $f(t_m)(W_{t_{m+1}} - W_{t_m})$, where the two factors in the product are independent (because Brownian motion has independent increments). If such an independence is deemed correct (as may be the case in the stock market), then the Itô choice makes sense. For more symmetric processes, the Stratonovich choice makes sense. Note that although these integrals provide different answers to the meaning of (A.48), we can go from one to the other by relatively simple calculus [25].
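The dependence on t*_m is easy to see numerically for f = W itself: the Itô sums converge to (W_t² − t)/2, while the midpoint (Stratonovich) sums converge to W_t²/2, which is what ordinary calculus would predict; the two limits differ by t/2. A sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 200_000, 1.0
# Sample a Brownian path on a grid of 2n steps so that midpoints are available.
dW = np.sqrt(t / (2 * n)) * rng.standard_normal(2 * n)
W = np.concatenate([[0.0], np.cumsum(dW)])

W_left = W[0:-1:2]            # W at t_m: the Ito evaluation point
W_mid = W[1::2]               # W at (t_m + t_{m+1})/2: the Stratonovich point
incr = W[2::2] - W[0:-2:2]    # increments W_{t_{m+1}} - W_{t_m}

ito = np.sum(W_left * incr)   # approximates (W_t**2 - t) / 2
strat = np.sum(W_mid * incr)  # approximates  W_t**2 / 2
```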

A.4 Diffusion Markov Process Limit


The limit theorem obtained in section A.2 may be generalized as follows. Let q(t) be a homogeneous ergodic Markov process on a state space S with infinitesimal generator Q and invariant measure p̄(q). Let 0 < ε ≪ 1 be a small parameter. Let F(t, ξ, q, x) and G(t, ξ, q, x), from ℝ × ℝ × S × ℝ^d to ℝ^d, be smooth functions in x such that
\[
E_\infty[ F(t,\xi,q,x) ] = \int_S F(t,\xi,q,x)\, d\bar p(q) = 0.
\]

Let τ(t) be a smooth function from [0, ∞) to [0, ∞) such that τ′(t) > 0. Let X^ε(t) satisfy the ODE
\[
\frac{dX^\varepsilon(t)}{dt} = \frac{1}{\varepsilon}\, F\Big( t, \frac{\tau(t)}{\varepsilon}, q\Big(\frac{t}{\varepsilon^2}\Big), X^\varepsilon(t) \Big) + G\Big( t, \frac{\tau(t)}{\varepsilon}, q\Big(\frac{t}{\varepsilon^2}\Big), X^\varepsilon(t) \Big), \qquad X^\varepsilon(0) = X_0.
\]
Then X^ε(·) converges weakly to a diffusion Markov process with infinitesimal generator $\bar Q$ given by
\[
\bar Q = \Big\langle E_\infty\Big[ F\cdot\nabla_x \int_0^\infty ds\, e^{sQ} F\cdot\nabla_x \Big] \Big\rangle_\xi + \big\langle E_\infty[G] \big\rangle_\xi \cdot\nabla_x.
\]
This expression can be simplified by remarking that
\[
e^{sQ} F = T^s F = E[ F(t+s) \,|\, F(t) = F ],
\]
independently of t, so that
\[
E_\infty\Big[ F\cdot\nabla_x \int_0^\infty ds\, e^{sQ} F\cdot\nabla_x \Big]
= \int_0^\infty ds\, E_\infty\Big[ F_j(t,\xi,q(t_0),x)\, \frac{\partial}{\partial x_j}\Big( F_k(t,\xi,q(t_0+s),x)\, \frac{\partial}{\partial x_k} \Big) \Big].
\]
This implies that
\[
\bar Q = \frac{1}{2}\, a_{jk}(t,x)\, \frac{\partial^2}{\partial x_j \partial x_k} + b_k(t,x)\, \frac{\partial}{\partial x_k},
\]
where we use summation over repeated indices and
\[
a_{jk}(t,x) = \frac{1}{2} \Big\langle \int_0^\infty E_\infty\big[ F_j(t,\xi,q(t_0),x)\, F_k(t,\xi,q(t_0+s),x) \big]\, ds \Big\rangle_\xi,
\]
\[
b_k(t,x) = \big\langle E_\infty\big[ G_k(t,\xi,q(t_0),x) \big] \big\rangle_\xi + \Big\langle \int_0^\infty E_\infty\Big[ F_j(t,\xi,q(t_0),x)\, \frac{\partial F_k}{\partial x_j}(t,\xi,q(t_0+s),x) \Big]\, ds \Big\rangle_\xi.
\]

It is therefore the generator of a diffusion Markov process X satisfying the Itô equation

dX = b(t, x)dt + σ(t, x)dβt ,

where σ is such that ajk = σjl σkl .
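Such Itô equations are typically sampled with the Euler-Maruyama scheme: over each step dt, add b dt plus σ times an N(0, dt) increment. A sketch on an illustrative linear example, b(t, x) = −x and σ = √2, for which the law of X(t) relaxes to the standard Gaussian (mean x₀e^{−t}, variance tending to 1):

```python
import numpy as np

def euler_maruyama(b, sigma, x0, t_max, dt, rng, n_paths=1):
    """Sample n_paths of dX = b(t, X) dt + sigma(t, X) dbeta_t (Ito sense)."""
    x = np.full(n_paths, float(x0))
    t = 0.0
    for _ in range(int(round(t_max / dt))):
        dB = np.sqrt(dt) * rng.standard_normal(n_paths)   # N(0, dt) increments
        x = x + b(t, x) * dt + sigma(t, x) * dB
        t += dt
    return x

rng = np.random.default_rng(0)
X = euler_maruyama(lambda t, x: -x,
                   lambda t, x: np.sqrt(2.0),
                   x0=3.0, t_max=8.0, dt=0.01, rng=rng, n_paths=20_000)
# After a long time the samples are approximately N(0, 1).
```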

Bibliography

[1] F. Bailly, J. F. Clouet, and J.-P. Fouque, Parabolic and gaussian white noise
approximation for wave propagation in random media, SIAM J. Appl. Math, 56(5) (1996),
pp. 1445–1470.

[2] G. Bal, On the self-averaging of wave energy in random media, Multiscale Model. Simul.,
2(3) (2004), pp. 398–420.

[3] , Kinetic models for scalar wave fields in random media, to appear in Wave Motion,
(2005).

[4] G. Bal and T. Chou, Water wave transport over spatially random drift, Wave Motion,
35 (2002), pp. 107–124.

[5] G. Bal, A. Fannjiang, G. Papanicolaou, and L. Ryzhik, Radiative Transport in a Periodic Structure, Journal of Statistical Physics, 95 (1999), pp. 479–494.

[6] G. Bal, V. Freilikher, G. Papanicolaou, and L. Ryzhik, Wave Transport along Surfaces with Random Impedance, Phys. Rev. B, 62(10) (2000), pp. 6228–6240.

[7] G. Bal, G. Papanicolaou, and L. Ryzhik, Radiative transport limit for the random
Schrödinger equation, Nonlinearity, 15 (2002), pp. 513–529.

[8] G. Bal and L. Ryzhik, Wave transport for a scalar model of the Love waves, Wave
Motion, 36(1) (2002), pp. 49–66.

[9] G. Bal and R. Verástegui, Time Reversal in Changing Environment, Multiscale Model. Simul., 2(4) (2004), pp. 639–661.

[10] P. Billingsley, Convergence of Probability Measures, John Wiley and Sons, New York,
1999.

[11] P. Blankenship and G. Papanicolaou, Stability and control of stochastic systems with wide-band noise disturbances, SIAM J. Appl. Math., 34 (1978), pp. 437–476.

[12] L. Breiman, Probability, Classics in Applied Mathematics, SIAM, Philadelphia, 1992.

[13] J. L. Doob, Stochastic Processes, Wiley, New York, 1953.

[14] L. Erdös and H. T. Yau, Linear Boltzmann equation as the weak coupling limit of a
random Schrödinger Equation, Comm. Pure Appl. Math., 53(6) (2000), pp. 667–735.

[15] L. Evans, Partial Differential Equations, Graduate Studies in Mathematics Vol.19, AMS,
1998.

[16] J. P. Fouque, La convergence en loi pour les processus à valeur dans un espace nucléaire,
Ann. Inst. H. Poincaré Prob. Stat, 20 (1984), pp. 225–245.

[17] P. Gérard, P. A. Markowich, N. J. Mauser, and F. Poupaud, Homogenization limits and Wigner transforms, Comm. Pure Appl. Math., 50 (1997), pp. 323–380.

[18] H. Hasimoto, On the periodic fundamental solutions of the Stokes equations and their
application to viscous flow past a cubic array of spheres, J. Fluid Dynamics, 5 (1959),
pp. 317–328.

[19] R. Hill, The elastic behavior of a crystalline aggregate, Proceedings of the Physical
Society, London, A, 64 (1952), pp. 349–354.

[20] O. D. Kellogg, Foundations of potential theory, Dover, 1953.

[21] P. Lax, Lecture Notes on Hyperbolic Partial Differential Equations, Stanford University,
1963.

[22] P.-L. Lions and T. Paul, Sur les mesures de Wigner, Rev. Mat. Iberoamericana, 9
(1993), pp. 553–618.

[23] G. W. Milton, The theory of composites, Cambridge Monographs on Applied and Computational Mathematics, 6, Cambridge University Press, Cambridge, 2002.

[24] I. Mitoma, On the sample continuity of S 0 processes, J. Math. Soc. Japan, 35 (1983),
pp. 629–636.

[25] B. Øksendal, Stochastic Differential Equations, Springer-Verlag, Berlin, 2000.

[26] J. Rauch, Lectures on Geometric Optics, Department of Mathematics, University of Michigan.

[27] J. W. S. Rayleigh, On the influence of obstacles arranged in rectangular order upon the
properties of the medium, Phil. Mag., 34 (1892), pp. 481–502.

[28] L. Ryzhik, G. Papanicolaou, and J. B. Keller, Transport equations for elastic and
other waves in random media, Wave Motion, 24 (1996), pp. 327–370.

[29] H. Spohn, Derivation of the transport equation for electrons moving through random
impurities, J. Stat. Phys., 17 (1977), pp. 385–412.

[30] F. Tappert, The parabolic approximation method, Lecture notes in physics, vol. 70, Wave
Propagation and Underwater Acoustics, Ed. J.B. Keller and J.S. Papadakis, Springer-
Verlag, pp. 224-287, 1977.

[31] B. S. White, The stochastic caustic, SIAM J. Appl. Math., 44 (1984), pp. 127–149.

[32] J. M. Ziman, Principles of the theory of solids, Cambridge, 1972.
