
Continuous Time Finance

Lisbon 2013

Tomas Björk
Stockholm School of Economics

Tomas Björk, 2013


Contents

• Stochastic Calculus (Ch 4-5).


• Black-Scholes (Ch 6-7).
• Completeness and hedging (Ch 8-9).
• The martingale approach (Ch 10-12).
• Incomplete markets (Ch 15).
• Dividends (Ch 16).
• Currency derivatives (Ch 17).
• Stochastic Control Theory (Ch 19).
• Martingale Methods for Optimal Investment (Ch 20).

Textbook:
Björk, T: “Arbitrage Theory in Continuous Time”
Oxford University Press, 2009. (3rd ed.)

Tomas Björk, 2013 1


Notation

Xt = any random process,
dt = small time step,
dXt = Xt+dt − Xt

• We often write X(t) instead of Xt.

• dXt is called the increment of X over the interval [t, t + dt].

• For any fixed interval [t, t + dt], the increment dXt is a stochastic variable.

• If the increments dXs and dXt over the disjoint intervals [s, s + ds] and [t, t + dt] are independent, then we say that X has independent increments.

• If every increment has a normal distribution we say that X is a normal, or Gaussian, process.

Tomas Björk, 2013 6


The Wiener Process

A stochastic process W is called a Wiener process if it has the following properties:

• The increments are normally distributed: for s < t:

Wt − Ws ∼ N[0, t − s]

E[Wt − Ws] = 0,   Var[Wt − Ws] = t − s

• W has independent increments.

• W0 = 0

• W has continuous trajectories.

Intuition: a continuous random walk.

Note: In Hull, a Wiener process is typically denoted by Z instead of W.

Tomas Björk, 2013 7


A Wiener Trajectory

[Figure: a simulated Wiener trajectory for t ∈ [0, 2]; values range roughly from −0.4 to 0.8.]
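A trajectory like this is straightforward to simulate directly from the definition: draw independent N[0, dt] increments and sum them. A minimal Python sketch (not part of the original slides; the step size and horizon are arbitrary choices):

import numpy as np

T, n = 2.0, 1000                                 # horizon and number of steps
dt = T / n
dW = np.random.normal(0.0, np.sqrt(dt), size=n)  # increments ~ N[0, dt]
W = np.concatenate([[0.0], np.cumsum(dW)])       # W_0 = 0, then cumulative sums
t = np.linspace(0.0, T, n + 1)                   # (t, W) is the discretized trajectory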

Tomas Björk, 2013 8


Important Fact

Theorem:
A Wiener trajectory is, with probability one, a
continuous curve which is nowhere differentiable.

Proof. Hard.

Tomas Björk, 2013 9


Wiener Process with Drift

A stochastic process X is called a Wiener process with drift µ and diffusion coefficient σ if it has the following dynamics

dXt = µdt + σdWt,

where µ and σ are constants.

Summing all increments over the interval [0, t] gives us

Xt − X0 = µ · t + σ · (Wt − W0 ),

Xt = X0 + µt + σWt

Thus
Xt ∼ N[X0 + µt, σ²t]

Tomas Björk, 2013 10


Itô processes

We say, loosely speaking, that the process X is an Itô process if it has dynamics of the form

dXt = µtdt + σtdWt,

where µt and σt are random processes.

Informally you can think of dWt as a random variable of the form

dWt ∼ N[0, dt]

To handle expressions like the one above, we need some mathematical theory.

First, however, we present an important example, which we will discuss informally.

Tomas Björk, 2013 11


Example: The Black-Scholes model

Price dynamics: (Geometric Brownian Motion)

dSt = µStdt + σStdWt,

Simple analysis:
Assume that σ = 0. Then

dSt = µStdt

Divide by dt!
dSt
= µSt
dt
This is a simple ordinary differential equation with solution

St = S0 e^{µt}

Conjecture: The solution of the SDE above is a randomly disturbed exponential function.

Tomas Björk, 2013 12


Intuitive Economic Interpretation

dSt/St = µdt + σdWt
Over a small time interval [t, t + dt] this means:

Return = (mean return) + σ × (Gaussian random disturbance)

• The asset return is a random walk (with drift).

• µ = mean rate of return per unit time

• σ = volatility

Large σ = large random fluctuations

Small σ = small random fluctuations

• The returns are normal.

• The stock price is lognormal.

Tomas Björk, 2013 13


A GBM Trajectory

[Figure: a simulated GBM trajectory for t ∈ [0, 2].]
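One way to produce such a trajectory is an Euler-Maruyama discretization of dSt = µStdt + σStdWt. A small Python sketch (the parameter values are illustrative, not taken from the slides):

import numpy as np

mu, sigma, S0, T, n = 0.1, 0.4, 1.0, 2.0, 1000        # illustrative parameters
dt = T / n
S = np.empty(n + 1)
S[0] = S0
for k in range(n):
    dW = np.random.normal(0.0, np.sqrt(dt))           # dW ~ N[0, dt]
    S[k + 1] = S[k] + mu * S[k] * dt + sigma * S[k] * dW  # Euler-Maruyama step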

Tomas Björk, 2013 14


Stochastic Differentials and Integrals

Consider an expression of the form

dXt = µtdt + σtdWt,
X0 = x0

Question: What exactly do we mean by this?

Answer: Write the equation in integrated form as

Xt = x0 + ∫_0^t µs ds + ∫_0^t σs dWs

How is this interpreted?

Tomas Björk, 2013 15


Recall:
Xt = x0 + ∫_0^t µs ds + ∫_0^t σs dWs

Two terms:

• ∫_0^t µs ds

This is a standard Riemann integral for each µ-trajectory.

• ∫_0^t σs dWs

Stochastic integral. This cannot be interpreted as a Stieltjes integral for each trajectory. We need a new theory for this Itô integral.

Tomas Björk, 2013 16


Information

Consider a Wiener process W .


Def:
FtW = “The information generated by W over the interval [0, t]”

Def: Let Z be a stochastic variable. If the value of Z is completely determined by FtW, we write

Z ∈ FtW

Ex: For the stochastic variable Z, defined by

Z = ∫_0^5 Ws ds,

we have Z ∈ F5W.
We do not have Z ∈ F4W .

Tomas Björk, 2013 17


Adapted Processes

Let W be a Wiener process.

Definition:
A process X is adapted to the filtration {FtW : t ≥ 0} if

Xt ∈ FtW, ∀t ≥ 0

“An adapted process does not look into the future”

Adapted processes are nice integrands for stochastic integrals.

Tomas Björk, 2013 18


• The process

Xt = ∫_0^t Ws ds

is adapted.

• The process
Xt = sup Ws
s≤t
is adapted.

• The process
Xt = sup Ws
s≤t+1
is not adapted.

Tomas Björk, 2013 19


The Itô Integral

We will define the Itô integral

∫_a^b gs dWs

for processes g satisfying:

• The process g is adapted.

• The process g satisfies

E[∫_a^b gs² ds] < ∞

This will be done in two steps.

Tomas Björk, 2013 20


Simple Integrands

Definition:
The process g is simple, if

• g is adapted.

• There exist deterministic points t0, . . . , tn with a = t0 < t1 < . . . < tn = b such that g is piecewise constant, i.e.

g(s) = g(tk), s ∈ [tk, tk+1)

For simple g we define

∫_a^b gs dWs = Σ_{k=0}^{n−1} g(tk) [W(tk+1) − W(tk)]

FORWARD INCREMENTS!
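The forward-increment sum is directly computable. A Python sketch of the Itô sum, using g(s) = Ws sampled on the grid as an illustrative simple, adapted integrand (not an example from the slides):

import numpy as np

def ito_sum(g_vals, W):
    # Itô sum: evaluate g at the LEFT endpoint of each interval (forward increments)
    return np.sum(g_vals[:-1] * np.diff(W))

n, T = 1000, 1.0
dt = T / n
W = np.concatenate([[0.0], np.cumsum(np.random.normal(0.0, np.sqrt(dt), n))])
# Compare with the Itô formula: ∫_0^T Ws dWs = (W_T² − T)/2
print(ito_sum(W, W), 0.5 * (W[-1]**2 - T))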

Tomas Björk, 2013 21


Properties of the Integral

Theorem: For simple g the following relations hold:

• The expected value is given by

E[∫_a^b gs dWs] = 0

• The second moment is given by

E[(∫_a^b gs dWs)²] = E[∫_a^b gs² ds]

• We have

∫_a^b gs dWs ∈ FbW

Tomas Björk, 2013 22


General Case

For a general g we do as follows.

1. Approximate g with a sequence of simple gn such that

E[∫_a^b {gn(s) − g(s)}² ds] → 0.

2. For each n the integral

∫_a^b gn(s) dW(s)

is a well defined stochastic variable Zn.

3. One can show that the sequence Zn converges to a limiting stochastic variable.

4. We define ∫_a^b g dW by

∫_a^b g(s) dW(s) = lim_{n→∞} ∫_a^b gn(s) dW(s).

Tomas Björk, 2013 23


Properties of the Integral

Theorem: For general g the following relations hold:

• The expected value is given by

E[∫_a^b gs dWs] = 0

• We do in fact have

E[∫_a^b gs dWs | Fa] = 0

• The second moment is given by

E[(∫_a^b gs dWs)²] = E[∫_a^b gs² ds]
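Both the zero-expectation property and the second-moment identity (the Itô isometry) are easy to check by Monte Carlo. An informal Python sketch with gs = Ws as an illustrative integrand (my choice, not the slides'):

import numpy as np

rng = np.random.default_rng(0)
n_paths, n, T = 20000, 200, 1.0
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
W = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)])
I = np.sum(W[:, :-1] * dW, axis=1)                      # Itô sums, one per path
print(I.mean())                                         # expected value: ≈ 0
print(I.var(), (W[:, :-1]**2 * dt).sum(axis=1).mean())  # isometry: both ≈ T²/2 = 0.5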

Tomas Björk, 2013 24


• We have

∫_a^b gs dWs ∈ FbW

Tomas Björk, 2013 25


Martingales

Definition: An adapted process X is a martingale if

E[Xt | Fs] = Xs, ∀s ≤ t

“A martingale is a process without drift”

Proposition: For any (sufficiently integrable) g, the process

Xt = ∫_0^t gs dWs

is a martingale.

Proposition: If X has dynamics

dXt = µtdt + σt dWt

then X is a martingale iff µ = 0.

Tomas Björk, 2013 26


Continuous Time Finance

Stochastic Calculus

(Ch 4-5)

Tomas Björk

Tomas Björk, 2013 27


Stochastic Calculus

General Model:

dXt = µtdt + σt dWt

Let the function f(t, x) be given, and define the stochastic process Zt by

Zt = f(t, Xt)

Problem: What does df (t, Xt ) look like?

The answer is given by the Itô formula.

We provide an intuitive argument. The formal proof is very hard.

Tomas Björk, 2013 28


A close-up of the Wiener process

Consider an “infinitesimal” Wiener increment

dWt = Wt+dt − Wt

We know:

dWt ∼ N[0, dt]

E[dWt] = 0,   Var[dWt] = dt

From this one can show

E[(dWt)²] = dt,   Var[(dWt)²] = 2(dt)²

Tomas Björk, 2013 29


Recall

E[(dWt)²] = dt,   Var[(dWt)²] = 2(dt)²

Important observation:

1. Both E[(dWt)²] and Var[(dWt)²] are very small when dt is small.

2. Var[(dWt)²] is negligible compared to E[(dWt)²].

3. Thus (dWt)² is deterministic.

We thus conclude, at least intuitively, that

(dWt)² = dt

This was only an intuitive argument, but it can be proved rigorously.

Tomas Björk, 2013 30


Multiplication table.

Theorem: We have the following multiplication table

(dt)² = 0

dWt · dt = 0

(dWt)² = dt
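The rule (dWt)² = dt can be seen numerically: over [0, t] the sum of squared Wiener increments converges to t as the grid is refined, while Σ(Δt)² and Σ ΔW·Δt vanish. A quick Python sketch:

import numpy as np

rng = np.random.default_rng(1)
t = 1.0
for n in [100, 10_000, 1_000_000]:
    dt = t / n
    dW = rng.normal(0.0, np.sqrt(dt), n)
    print(n, np.sum(dW**2))        # → t = 1, with shrinking fluctuations
    # for comparison: np.sum(dW) * dt and n * dt**2 both tend to 0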

Tomas Björk, 2013 31


Deriving the Itô formula

dXt = µtdt + σt dWt

Zt = f (t, Xt)

We want to compute df (t, Xt )

Make a Taylor expansion of f(t, Xt) including second order terms:

df = ∂f/∂t dt + ∂f/∂x dXt + (1/2) ∂²f/∂t² (dt)²
   + (1/2) ∂²f/∂x² (dXt)² + ∂²f/∂t∂x dt·dXt

Plug in the expression for dX, expand, and use the multiplication table!

Tomas Björk, 2013 32


df = ∂f/∂t dt + ∂f/∂x [µdt + σdW] + (1/2) ∂²f/∂t² (dt)²
   + (1/2) ∂²f/∂x² [µdt + σdW]² + ∂²f/∂t∂x dt·[µdt + σdW]

   = ∂f/∂t dt + µ ∂f/∂x dt + σ ∂f/∂x dW + (1/2) ∂²f/∂t² (dt)²
   + (1/2) ∂²f/∂x² [µ²(dt)² + σ²(dW)² + 2µσ dt·dW]
   + µ ∂²f/∂t∂x (dt)² + σ ∂²f/∂t∂x dt·dW

Using the multiplication table this reduces to:

df = {∂f/∂t + µ ∂f/∂x + (1/2)σ² ∂²f/∂x²} dt + σ ∂f/∂x dW

Tomas Björk, 2013 33


The Itô Formula

Theorem: With X dynamics given by

dXt = µtdt + σt dWt

we have

df(t, Xt) = {∂f/∂t + µ ∂f/∂x + (1/2)σ² ∂²f/∂x²} dt + σ ∂f/∂x dWt

Alternatively

df(t, Xt) = ∂f/∂t dt + ∂f/∂x dXt + (1/2) ∂²f/∂x² (dXt)²,

where we use the multiplication table.
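The drift and diffusion terms in the Itô formula can also be generated symbolically. A sketch using sympy, with f(t, x) = ln x and GBM coefficients as illustrative choices (this anticipates the example on the next slide):

import sympy as sp

t, x, mu, sigma = sp.symbols('t x mu sigma')
f = sp.log(x)                       # illustrative choice of f(t, x)
mu_t, sigma_t = mu * x, sigma * x   # illustrative coefficients (GBM)

drift = sp.simplify(sp.diff(f, t) + mu_t * sp.diff(f, x)
                    + sp.Rational(1, 2) * sigma_t**2 * sp.diff(f, x, 2))
diffusion = sp.simplify(sigma_t * sp.diff(f, x))
print(drift, diffusion)             # -> mu - sigma**2/2 and sigma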

Tomas Björk, 2013 34


Example: GBM

dSt = µStdt + σSt dWt

We smell something exponential! Natural Ansatz:

St = e^{Zt},   Zt = ln St

Itô on f(t, s) = ln(s) gives us

∂f/∂s = 1/s,   ∂f/∂t = 0,   ∂²f/∂s² = −1/s²

dZt = (1/St) dSt − (1/2)(1/St²)(dSt)²

    = (µ − (1/2)σ²) dt + σ dWt

Tomas Björk, 2013 35


Recall

dZt = (µ − (1/2)σ²) dt + σ dWt

Integrate!

Zt − Z0 = ∫_0^t (µ − (1/2)σ²) ds + σ ∫_0^t dWs

        = (µ − (1/2)σ²) t + σWt

Using St = e^{Zt} gives us

St = S0 e^{(µ − σ²/2)t + σWt}

Since Wt is N[0, t], we see that St has a lognormal distribution.
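The closed-form solution also allows exact simulation: draw Wt ∼ N[0, t] directly, with no time stepping. A Monte Carlo sketch checking that ln St has the predicted mean and variance (parameter values are illustrative):

import numpy as np

mu, sigma, S0, t = 0.1, 0.4, 1.0, 2.0             # illustrative parameters
W_t = np.random.normal(0.0, np.sqrt(t), 100_000)  # W_t ~ N[0, t]
S_t = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W_t)
logS = np.log(S_t)
print(logS.mean(), np.log(S0) + (mu - 0.5 * sigma**2) * t)  # should agree
print(logS.var(), sigma**2 * t)                             # should agree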

Tomas Björk, 2013 36


Changing Measures

Consider a probability measure P on (Ω, F), and assume that L ∈ F is a random variable with the properties that

L ≥ 0

and

E^P[L] = 1.

For every event A ∈ F we now define the real number Q(A) by the prescription

Q(A) = E^P[L · IA]

where the random variable IA is the indicator for A, i.e.

IA = 1 if A occurs,   IA = 0 if A^c occurs

Tomas Björk, 2013 139


Recall that
Q(A) = E^P[L · IA]

We now see that Q(A) ≥ 0 for all A, and that

Q(Ω) = E^P[L · IΩ] = E^P[L · 1] = 1

We also see that if A ∩ B = ∅ then

Q(A ∪ B) = E^P[L · I_{A∪B}] = E^P[L · (IA + IB)]
         = E^P[L · IA] + E^P[L · IB]
         = Q(A) + Q(B)

Furthermore we see that

P (A) = 0 ⇒ Q(A) = 0

We have thus more or less proved the following

Tomas Björk, 2013 140


Proposition 2: If L ∈ F is a nonnegative random variable with E^P[L] = 1 and Q is defined by

Q(A) = E^P[L · IA]

then Q will be a probability measure on F with the property that

P(A) = 0 ⇒ Q(A) = 0.

It turns out that the property above is a very important one, so we give it a name.

Tomas Björk, 2013 141


Absolute Continuity

Definition: Given two probability measures P and Q on F we say that Q is absolutely continuous w.r.t. P on F if, for all A ∈ F, we have

P(A) = 0 ⇒ Q(A) = 0

We write this as

Q << P.

If Q << P and P << Q then we say that P and Q are equivalent and write

Q ∼ P

Tomas Björk, 2013 142


Equivalent measures

It is easy to see that P and Q are equivalent if and only if

P(A) = 0 ⇔ Q(A) = 0

or, equivalently,

P(A) = 1 ⇔ Q(A) = 1

Two equivalent measures thus agree on all certain events and on all impossible events, but can disagree on all other events.

Simple examples:

• All non-degenerate Gaussian distributions on R are equivalent.

• If P is Gaussian on R and Q is exponential then Q << P but not the other way around.

Tomas Björk, 2013 143


Absolute Continuity, cont'd

We have seen that if we are given P and define Q by

Q(A) = E^P[L · IA]

for L ≥ 0 with E^P[L] = 1, then Q is a probability measure and Q << P.

A natural question is now whether all measures Q << P are obtained in this way. The answer is yes, and the precise (quite deep) result is as follows. The proof is difficult and therefore omitted.

Tomas Björk, 2013 144


The Radon-Nikodym Theorem

Consider two probability measures P and Q on (Ω, F), and assume that Q << P on F. Then there exists a unique random variable L with the following properties:

1. Q(A) = E^P[L · IA], ∀A ∈ F

2. L ≥ 0, P-a.s.

3. E^P[L] = 1

4. L ∈ F

The random variable L is denoted as

L = dQ/dP, on F

and it is called the Radon-Nikodym derivative of Q w.r.t. P on F, or the likelihood ratio between Q and P on F.

Tomas Björk, 2013 145


A simple example

The Radon-Nikodym derivative L is intuitively the local scale factor between P and Q. If the sample space Ω is finite, so Ω = {ω1, . . . , ωn}, then P is determined by the probabilities p1, . . . , pn where

pi = P(ωi), i = 1, . . . , n

Now consider a measure Q with probabilities

qi = Q(ωi), i = 1, . . . , n

If Q << P this simply says that

pi = 0 ⇒ qi = 0

and it is easy to see that the Radon-Nikodym derivative L = dQ/dP is given by

L(ωi) = qi/pi, i = 1, . . . , n
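On a finite sample space the whole construction takes a few lines of code. A Python sketch (the two distributions are made-up examples):

import numpy as np

p = np.array([0.5, 0.3, 0.2])        # P(ω_i), illustrative and all positive
q = np.array([0.25, 0.25, 0.5])      # Q(ω_i), illustrative
L = q / p                            # Radon-Nikodym derivative L(ω_i) = q_i/p_i

A = np.array([True, False, True])    # an event A ⊆ Ω
print(np.sum(p * L * A), np.sum(q * A))  # Q(A) = E^P[L·I_A]; both give 0.75
print(np.sum(p * L))                     # E^P[L] = 1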

Tomas Björk, 2013 146


If pi = 0 then we also have qi = 0 and we can define
the ratio qi /pi arbitrarily.

If p1, . . . , pn as well as q1, . . . , qn are all positive, then we see that Q ∼ P and in fact

dP/dQ = 1/L = (dQ/dP)^{−1}

as could be expected.

Tomas Björk, 2013 147


The likelihood process on a filtered space
We now consider the case when we have a probability
measure P on some space Ω and that instead of just
one σ-algebra F we have a filtration, i.e. an increasing
family of σ-algebras {Ft}t≥0 .
The interpretation is as usual that Ft is the information
available to us at time t, and that we have Fs ⊆ Ft
for s ≤ t.
Now assume that we also have another measure Q,
and that for some fixed T , we have Q << P on FT .
We define the random variable LT by
LT = dQ/dP on FT
Since Q << P on FT we also have Q << P on Ft
for all t ≤ T and we define
Lt = dQ/dP on Ft, 0 ≤ t ≤ T
For every t we have Lt ∈ Ft, so L is an adapted
process, known as the likelihood process.

Tomas Björk, 2013 154


The L process is a P martingale

We recall that

Lt = dQ/dP on Ft, 0 ≤ t ≤ T

Since Fs ⊆ Ft for s ≤ t we can use Proposition 5 and deduce that

Ls = E^P[Lt | Fs],   s ≤ t ≤ T

and we have thus proved the following result.

Proposition: Given the assumptions above, the likelihood process L is a P-martingale.

Tomas Björk, 2013 155


Where are we heading?

We are now going to perform measure transformations on Wiener spaces, where P will correspond to the objective measure and Q will be the risk neutral measure.

For this we need to define the proper likelihood process L and, since L is a P-martingale, we have the following natural questions.

• What does a martingale look like in a Wiener driven framework?

• Suppose that we have a P-Wiener process W and then change measure from P to Q. What are the properties of W under the new measure Q?

These questions are handled by the Martingale Representation Theorem and the Girsanov Theorem, respectively.

Tomas Björk, 2013 156


4.

The Martingale Representation Theorem

Tomas Björk, 2013 157


Intuition

Suppose that we have a Wiener process W under the measure P. We recall that if h is adapted (and integrable enough) and if the process X is defined by

Xt = x0 + ∫_0^t hs dWs

then X is a martingale. We now have the following natural question:

Question: Assume that X is an arbitrary martingale. Does it then follow that X has the form

Xt = x0 + ∫_0^t hs dWs

for some adapted process h?

In other words: Are all martingales stochastic integrals w.r.t. W?

Tomas Björk, 2013 158


Answer

It is immediately clear that not all martingales can be written as stochastic integrals w.r.t. W. Consider for example the process X defined by

Xt = 0 for 0 ≤ t < 1,
Xt = Z for t ≥ 1,

where Z is a random variable, independent of W, with E[Z] = 0.

X is then a martingale (why?) but it is clear (how?) that it cannot be written as

Xt = x0 + ∫_0^t hs dWs

for any process h.

Tomas Björk, 2013 159


Intuition

The intuitive reason why we cannot write

Xt = x0 + ∫_0^t hs dWs

in the example above is of course that the random variable Z “has nothing to do with” the Wiener process W. In order to exclude examples like this, we thus need an assumption which guarantees that our probability space only contains the Wiener process W and nothing else.

This idea is formalized by assuming that the filtration {Ft}t≥0 is the one generated by the Wiener process W.

Tomas Björk, 2013 160


The Martingale Representation Theorem

Theorem. Let W be a P-Wiener process and assume that the filtration is the internal one, i.e.

Ft = FtW = σ{Ws; 0 ≤ s ≤ t}

Then, for every (P, Ft)-martingale X, there exists a real number x and an adapted process h such that

Xt = x + ∫_0^t hs dWs,

i.e.

dXt = ht dWt.

Proof: Hard. This is a very deep result.

Tomas Björk, 2013 161


Note

For a given martingale X, the Representation Theorem above guarantees the existence of a process h such that

Xt = x + ∫_0^t hs dWs

The Theorem does not, however, tell us how to find or construct the process h.

Tomas Björk, 2013 162


5.

The Girsanov Theorem

Tomas Björk, 2013 163


Setup

Let W be a P-Wiener process and fix a time horizon T. Suppose that we want to change measure from P to Q on FT. For this we need a P-martingale L with L0 = 1 to use as a likelihood process, and a natural way of constructing this is to choose a process g and then define L by

dLt = gt dWt,
L0 = 1

This definition does not guarantee that L ≥ 0, so we make a small adjustment. We choose a process ϕ and define L by

dLt = Lt ϕt dWt,
L0 = 1

The process L will again be a martingale and we easily obtain

Lt = exp{∫_0^t ϕs dWs − (1/2) ∫_0^t ϕs² ds}

Tomas Björk, 2013 164


Thus we are guaranteed that L ≥ 0. We now change measure from P to Q by setting

dQ = Lt dP, on Ft, 0 ≤ t ≤ T

The main problem is to find out what the properties of W are under the new measure Q. This problem is resolved by the Girsanov Theorem.

Tomas Björk, 2013 165


The Girsanov Theorem
Let W be a P -Wiener process. Fix a time horizon T .
Theorem: Choose an adapted process ϕ, and define the process L by

dLt = Lt ϕt dWt,
L0 = 1

Assume that E^P[LT] = 1, and define a new measure Q on FT by

dQ = Lt dP, on Ft, 0 ≤ t ≤ T

Then Q << P and the process W^Q, defined by

Wt^Q = Wt − ∫_0^t ϕs ds,

is Q-Wiener. We can also write this as

dWt = ϕt dt + dWt^Q
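For constant ϕ the theorem is easy to test by reweighting: Q-expectations are P-expectations weighted by LT, so WT should have mean ϕT and variance T under Q. A Monte Carlo sketch (the values of ϕ and T are illustrative):

import numpy as np

phi, T, n_paths = 0.5, 1.0, 200_000
rng = np.random.default_rng(2)
W_T = rng.normal(0.0, np.sqrt(T), n_paths)    # P-Wiener process at time T
L_T = np.exp(phi * W_T - 0.5 * phi**2 * T)    # likelihood L_T
print(L_T.mean())                             # E^P[L_T] = 1
print(np.mean(L_T * W_T))                     # E^Q[W_T] = phi·T = 0.5
print(np.mean(L_T * (W_T - phi * T)**2))      # E^Q[(W_T^Q)²] = T = 1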

Tomas Björk, 2013 166


Changing the drift in an SDE
The single most common use of the Girsanov Theorem
is as follows.
Suppose that we have a process X with P dynamics

dXt = µtdt + σt dWt

where µ and σ are adapted and W is P-Wiener. We now do a Girsanov transformation as above, and the question is what the Q-dynamics look like.

From the Girsanov Theorem we have

dWt = ϕt dt + dWt^Q

and substituting this into the P-dynamics we obtain the Q-dynamics as

dXt = {µt + σt ϕt} dt + σt dWt^Q

Moral: The drift changes but the diffusion is unaffected.

Tomas Björk, 2013 167


1. Dynamic Programming

• The basic idea.

• Deriving the HJB equation.

• The verification theorem.

• The linear quadratic regulator.

Tomas Björk, 2013 323


Problem Formulation

max_u E[∫_0^T F(t, Xt, ut) dt + Φ(XT)]

subject to

dXt = µ(t, Xt, ut) dt + σ(t, Xt, ut) dWt,
X0 = x0,
ut ∈ U(t, Xt), ∀t.

We will only consider feedback control laws, i.e. controls of the form

ut = u(t, Xt)

Terminology:
X = state variable
u = control variable
U = control constraint

Note: No state space constraints.

Tomas Björk, 2013 324


Main idea

• Embed the problem above in a family of problems indexed by starting point in time and space.

• Tie all these problems together by a PDE: the Hamilton-Jacobi-Bellman equation.

• The control problem is reduced to the problem of solving the deterministic HJB equation.

Tomas Björk, 2013 325


Some notation

• For any fixed vector u ∈ R^k, the functions µ^u, σ^u and C^u are defined by

µ^u(t, x) = µ(t, x, u),
σ^u(t, x) = σ(t, x, u),
C^u(t, x) = σ(t, x, u)σ(t, x, u)′.

• For any control law u, the functions µ^u, σ^u, C^u and F^u are defined by

µ^u(t, x) = µ(t, x, u(t, x)),
σ^u(t, x) = σ(t, x, u(t, x)),
C^u(t, x) = σ(t, x, u(t, x))σ(t, x, u(t, x))′,
F^u(t, x) = F(t, x, u(t, x)).

Tomas Björk, 2013 326


More notation

• For any fixed vector u ∈ R^k, the partial differential operator A^u is defined by

A^u = Σ_{i=1}^n µ_i^u(t, x) ∂/∂x_i + (1/2) Σ_{i,j=1}^n C_{ij}^u(t, x) ∂²/∂x_i∂x_j.

• For any control law u, the partial differential operator A^u is defined by

A^u = Σ_{i=1}^n µ_i^u(t, x) ∂/∂x_i + (1/2) Σ_{i,j=1}^n C_{ij}^u(t, x) ∂²/∂x_i∂x_j.

• For any control law u, the process X^u is the solution of the SDE

dXt^u = µ(t, Xt^u, ut) dt + σ(t, Xt^u, ut) dWt,

where

ut = u(t, Xt^u)

Tomas Björk, 2013 327


Embedding the problem

For every fixed (t, x) the control problem P_{t,x} is defined as the problem to maximize

E_{t,x}[∫_t^T F(s, Xs^u, us) ds + Φ(XT^u)],

given the dynamics

dXs^u = µ(s, Xs^u, us) ds + σ(s, Xs^u, us) dWs,
Xt = x,

and the constraints

u(s, y) ∈ U, ∀(s, y) ∈ [t, T] × R^n.

The original problem was P_{0,x0}.

Tomas Björk, 2013 328


The optimal value function

• The value function

J: R+ × R^n × U → R

is defined by

J(t, x, u) = E[∫_t^T F(s, Xs^u, us) ds + Φ(XT^u)]

given the dynamics above.

• The optimal value function

V: R+ × R^n → R

is defined by

V(t, x) = sup_{u∈U} J(t, x, u).

• We want to derive a PDE for V .

Tomas Björk, 2013 329


Assumptions

We assume:

• There exists an optimal control law û.

• The optimal value function V is regular in the sense that V ∈ C^{1,2}.

• A number of limiting procedures in the following arguments can be justified.

Tomas Björk, 2013 330


Bellman Optimality Principle

Theorem: If a control law û is optimal for the time interval [t, T] then it is also optimal for all smaller intervals [s, T] where s ≥ t.

Proof: Exercise.

Tomas Björk, 2013 331


Basic strategy

To derive the PDE do as follows:

• Fix (t, x) ∈ (0, T) × R^n.

• Choose a real number h (interpreted as a “small” time increment).

• Choose an arbitrary control law u on the time interval [t, t + h].

Now define the control law u* by

u*(s, y) = u(s, y) for (s, y) ∈ [t, t + h] × R^n,
u*(s, y) = û(s, y) for (s, y) ∈ (t + h, T] × R^n.

In other words, if we use u* then we use the arbitrary control u during the time interval [t, t + h], and then we switch to the optimal control law during the rest of the time period.

Tomas Björk, 2013 332


Basic idea

The whole idea of DynP boils down to the following procedure.

• Given the point (t, x) above, we consider the following two strategies over the time interval [t, T]:

I: Use the optimal law û.

II: Use the control law u* defined above.

• Compute the expected utilities obtained by the respective strategies.

• Using the obvious fact that û is at least as good as u*, and letting h tend to zero, we obtain our fundamental PDE.

Tomas Björk, 2013 333


Strategy values
Expected utility for û:

J(t, x, û) = V(t, x)

Expected utility for u*:

• The expected utility for [t, t + h) is given by

E_{t,x}[∫_t^{t+h} F(s, Xs^u, us) ds].

• Conditional expected utility over [t + h, T], given (t, x):

E_{t,x}[V(t + h, X_{t+h}^u)].

• Total expected utility for Strategy II is

E_{t,x}[∫_t^{t+h} F(s, Xs^u, us) ds + V(t + h, X_{t+h}^u)].

Tomas Björk, 2013 334


Comparing strategies

We have trivially

V(t, x) ≥ E_{t,x}[∫_t^{t+h} F(s, Xs^u, us) ds + V(t + h, X_{t+h}^u)].

Remark
We have equality above if and only if the control law u is the optimal law û.

Now use Itô to obtain

V(t + h, X_{t+h}^u) = V(t, x)
  + ∫_t^{t+h} {∂V/∂t (s, Xs^u) + A^u V(s, Xs^u)} ds
  + ∫_t^{t+h} ∇_x V(s, Xs^u) σ^u dWs,

and plug into the formula above.

Tomas Björk, 2013 335


We obtain

E_{t,x}[∫_t^{t+h} {F(s, Xs^u, us) + ∂V/∂t (s, Xs^u) + A^u V(s, Xs^u)} ds] ≤ 0.

Going to the limit: divide by h, move h within the expectation and let h tend to zero. We get

F(t, x, u) + ∂V/∂t (t, x) + A^u V(t, x) ≤ 0.

Tomas Björk, 2013 336


Recall

F(t, x, u) + ∂V/∂t (t, x) + A^u V(t, x) ≤ 0

This holds for all u = u(t, x), with equality if and only if u = û.

We thus obtain the HJB equation

∂V/∂t (t, x) + sup_{u∈U} {F(t, x, u) + A^u V(t, x)} = 0.

Tomas Björk, 2013 337


The HJB equation

Theorem:
Under suitable regularity assumptions the following hold:

I: V satisfies the Hamilton-Jacobi-Bellman equation

∂V/∂t (t, x) + sup_{u∈U} {F(t, x, u) + A^u V(t, x)} = 0,
V(T, x) = Φ(x).

II: For each (t, x) ∈ [0, T] × R^n the supremum in the HJB equation above is attained by u = û(t, x), i.e. by the optimal control.

Tomas Björk, 2013 338


Logic and problem

Note: We have shown that if V is the optimal value function, and if V is regular enough, then V satisfies the HJB equation. The HJB eqn is thus derived as a necessary condition, and requires strong ad hoc regularity assumptions, alternatively the use of viscosity solutions techniques.

Problem: Suppose we have solved the HJB equation. Have we then found the optimal value function and the optimal control law? In other words, is HJB a sufficient condition for optimality?

Answer: Yes! This follows from the Verification Theorem.

Tomas Björk, 2013 339


The Verification Theorem
Suppose that we have two functions H(t, x) and g(t, x), such that

• H is sufficiently integrable, and solves the HJB equation

∂H/∂t (t, x) + sup_{u∈U} {F(t, x, u) + A^u H(t, x)} = 0,
H(T, x) = Φ(x).

• For each fixed (t, x), the supremum in the expression

sup_{u∈U} {F(t, x, u) + A^u H(t, x)}

is attained by the choice u = g(t, x).

Then the following hold.

1. The optimal value function V to the control problem is given by

V(t, x) = H(t, x).

2. There exists an optimal control law û, and in fact

û(t, x) = g(t, x)

Tomas Björk, 2013 340


Handling the HJB equation
1. Consider the HJB equation for V .
2. Fix (t, x) ∈ [0, T] × R^n and solve the static optimization problem

max_{u∈U} [F(t, x, u) + A^u V(t, x)].

Here u is the only variable, whereas t and x are fixed parameters. The functions F, µ, σ and V are considered as given.

3. The optimal û will depend on t and x, and on the function V and its partial derivatives. We thus write û as

û = û(t, x; V). (4)

4. The function û(t, x; V) is our candidate for the optimal control law, but since we do not know V this description is incomplete. Therefore we substitute the expression for û into the PDE, giving us the highly nonlinear (why?) PDE

∂V/∂t (t, x) + F^û(t, x) + A^û V(t, x) = 0,
V(T, x) = Φ(x).

5. Now we solve the PDE above! Then we put the solution V into expression (4). Using the verification theorem we can identify V as the optimal value function, and û as the optimal control law.

Tomas Björk, 2013 341


Making an Ansatz

• The hard work of dynamic programming consists in solving the highly nonlinear HJB equation.

• There are no general analytic methods available for this, so the number of known optimal control problems with an analytic solution is very small indeed.

• In an actual case one usually tries to guess a solution, i.e. we typically make a parameterized Ansatz for V and then use the PDE in order to identify the parameters.

• Hint: V often inherits some structural properties from the boundary function Φ as well as from the instantaneous utility function F.

• Most of the known solved control problems have, to some extent, been “rigged” in order to be analytically solvable.

Tomas Björk, 2013 342


The Linear Quadratic Regulator

min_{u∈R^k} E[∫_0^T {Xt′QXt + ut′Rut} dt + XT′HXT],

with dynamics

dXt = {AXt + But} dt + CdWt.

We want to control a vehicle in such a way that it stays close to the origin (the terms x′Qx and x′Hx) while at the same time keeping the “energy” u′Ru small.

Here Xt ∈ R^n and ut ∈ R^k, and we impose no control constraints on u.

The matrices Q, R, H, A, B and C are assumed to be known. We may WLOG assume that Q, R and H are symmetric, and we assume that R is positive definite (and thus invertible).

Tomas Björk, 2013 343


Handling the Problem

The HJB equation becomes

∂V/∂t (t, x) + inf_{u∈R^k} {x′Qx + u′Ru + [∇_x V](t, x)[Ax + Bu]}
   + (1/2) Σ_{i,j} ∂²V/∂x_i∂x_j (t, x) [CC′]_{ij} = 0,

V(T, x) = x′Hx.

For each fixed choice of (t, x) we now have to solve the static unconstrained optimization problem to minimize

x′Qx + u′Ru + [∇_x V](t, x)[Ax + Bu].

Tomas Björk, 2013 344


The problem was:

min_u x′Qx + u′Ru + [∇_x V](t, x)[Ax + Bu].

Since R > 0 we set the gradient to zero and obtain

2u′R = −(∇_x V)B,

which gives us the optimal u as

û = −(1/2) R^{−1} B′ (∇_x V)′.

Note: This is our candidate for the optimal control law, but it depends on the unknown function V.

We now make an educated guess about the structure of V.

Tomas Björk, 2013 345


From the boundary function x′Hx and the term x′Qx in the cost function we make the Ansatz

V(t, x) = x′P(t)x + q(t),

where P(t) is a symmetric matrix function, and q(t) is a scalar function.

With this trial solution we have

∂V/∂t (t, x) = x′Ṗx + q̇,
∇_x V(t, x) = 2x′P,
∇_{xx} V(t, x) = 2P,
û = −R^{−1}B′Px.

Inserting these expressions into the HJB equation we get

x′{Ṗ + Q − PBR^{−1}B′P + A′P + PA}x + q̇ + tr[C′PC] = 0.

Tomas Björk, 2013 346


We thus get the following matrix ODE for P:

Ṗ = PBR^{−1}B′P − A′P − PA − Q,
P(T) = H,

and we can integrate directly for q:

q̇ = −tr[C′PC],
q(T) = 0.

The matrix equation is a Riccati equation. The equation for q can then be integrated directly.

Final Result for LQ:

V(t, x) = x′P(t)x + ∫_t^T tr[C′P(s)C] ds,

û(t, x) = −R^{−1}B′P(t)x.
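The Riccati equation is solved backwards in time from P(T) = H. A sketch with scipy for a one-dimensional instance of the problem (all coefficient values are illustrative):

import numpy as np
from scipy.integrate import solve_ivp

# Scalar LQ data (illustrative): dX = (aX + bu)dt + c dW, cost ∫(q x² + r u²)dt + h x(T)²
a, b, c, q, r, h, T = -0.5, 1.0, 0.3, 1.0, 0.5, 1.0, 2.0

def riccati(t, P):
    # scalar form of: Pdot = P B R⁻¹ B′P − A′P − PA − Q
    return P * b * (1.0 / r) * b * P - 2.0 * a * P - q

sol = solve_ivp(riccati, [T, 0.0], [h])   # integrate backwards from P(T) = h
P0 = sol.y[0, -1]
print(P0, -b / r * P0)                    # P(0) and the feedback gain in û = −(b/r)P(t)x
# q(t) then follows by integrating qdot = −c²P(t) with q(T) = 0.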

Tomas Björk, 2013 347
