ERROR ANALYSIS OF DEEP PDE SOLVERS FOR OPTION PRICING
JASPER ROU
ABSTRACT. Option pricing often requires solving partial differential equations (PDEs). Although deep learning-based PDE solvers have recently emerged as quick solutions to this problem, their empirical and quantitative accuracy remains not well understood, hindering their real-world applicability. In this research, our aim is to offer actionable insights into the utility of deep PDE solvers for practical option pricing implementation. Through comparative experiments in both the Black–Scholes and the Heston model, we assess the empirical performance of two neural network algorithms to solve PDEs: the Deep Galerkin Method and the Time Deep Gradient Flow method
(TDGF). We determine their empirical convergence rates and training time as functions of (i) the number of sam-
pling stages, (ii) the number of samples, (iii) the number of layers, and (iv) the number of nodes per layer. For the
TDGF, we also consider the order of the discretization scheme and the number of time steps.
1. INTRODUCTION
Option pricing is a fundamental problem in finance. Since the seminal work of Black and Scholes [5], nu-
merous mathematical models and computational approaches have been developed to determine option prices.
One common approach formulates the price of an option as the solution to a partial differential equation (PDE),
which can be solved numerically using methods such as finite differences or finite elements. However, tradi-
tional grid-based methods suffer from the curse of dimensionality: the number of grid points grows exponen-
tially with the dimension of the problem. This challenge is particularly acute in high-dimensional settings, such
as basket options or Markovian approximations of rough volatility models [1, 18].
Neural networks provide a promising alternative by efficiently approximating solutions to PDEs. Once
trained, they can generate option prices rapidly, bypassing the limitations of conventional numerical meth-
ods. Deep learning-based approaches have been successfully applied in other financial contexts, such as risk
management [6], portfolio optimization [21], and optimal stopping [4], as well as data-driven methods to price
options [17]. Given their potential, various deep learning-based PDE solvers have been proposed [11], including
Backward Stochastic Differential Equation (BSDE) methods [12], Deep Galerkin Methods (DGMs) [19] and
Time Deep Gradient Flow (TDGF) methods [9, 18]. However, their empirical accuracy remains not sufficiently
well understood, which limits their practical adoption in financial applications.
This study focuses on two neural network methods for solving PDEs: the DGM and the TDGF. Our primary
objective is to assess their empirical accuracy. For theoretical convergence analyses, see the work of Jiang,
Sirignano, and Cohen [14] for DGM and Liu, Papapantoleon, and Rou [16] for TDGF. For empirical studies of
other deep PDE solvers, such as BSDE-based methods, see the work of Assabumrungrat, Minami, and Hirano
[3].
To provide actionable insights into the applicability of deep PDE solvers for option pricing, we systematically
analyze the impact of key parameters on accuracy and training time. First, we investigate the effect of training
by varying the number of sampling stages and the number of samples. Second, we investigate the effect of the
size of the neural network by varying the number of layers and the number of nodes per layer. For a comparison
of different architectures, see the work of Van Mieghem, Papapantoleon, and Papazoglou-Hennig [20]. Finally,
for TDGF, we also examine the discretization order and the number of time steps.
Our main findings are: the L2 -error decreases almost linearly with the number of sampling stages; the num-
ber of layers tends to decrease the error, but not with a clear rate; and increasing the number of time steps
decreases the L2 -error with the second-order method decreasing quicker than the first-order method. These
three parameters increase the training time linearly. The number of samples and the number of nodes per layer did not show a clear relationship with either the L2-error or the training time. The first three parameters concern computations that are performed sequentially, while the other two concern computations that can be performed in parallel.
2020 Mathematics Subject Classification. 91G20, 91G60, 68T07.
Key words and phrases. Option pricing, PDE, artificial neural network, deep PDE solvers.
The remainder of this paper is structured as follows. Section 2 introduces the two neural network-based PDE
solvers. Section 3 outlines the option pricing models under consideration: Black–Scholes and Heston. Section 4
details the implementation aspects. Section 5 presents the numerical results for each of the five parameters.
Finally, Section 6 summarizes our findings and conclusions.
2. NEURAL NETWORK METHODS
This section explains the two neural network methods used in this paper. Subsection 2.1 elaborates on the
TDGF and Subsection 2.2 on the DGM.
2.1. Time Deep Gradient Flow Method. The TDGF is a neural network method to efficiently solve high-
dimensional PDEs [9, 10, 18]. Consider the general PDE
∂u/∂t (t, x) + Au(t, x) + ru(t, x) = 0,   (t, x) ∈ [0, T] × Ω,
u(0, x) = Ψ(x),   x ∈ Ω,
with A a second-order differential operator of the form
Au = −∑_{i,j=1}^{d} a_{ij} ∂²u/(∂x_i ∂x_j) + ∑_{i=1}^{d} β_i ∂u/∂x_i.   (2.1)
Using the splitting method from Papapantoleon and Rou [18], A can be rewritten in the form
Au = −∇ · (A∇u) + b · ∇u, (2.2)
with a symmetric and positive semi-definite matrix A = (a_{ij})_{i,j=1}^{d} and vector b = (b_1, . . . , b_d)^T ∈ R^d.   (2.3)
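For the reader's convenience (this identity is used implicitly when passing from (2.1) to (2.2)), expanding the divergence and using the symmetry of A gives

−∇·(A∇u) = −∑_{i,j=1}^{d} a_{ij} ∂²u/(∂x_i ∂x_j) − ∑_{i=1}^{d} ( ∑_{j=1}^{d} ∂a_{ij}/∂x_j ) ∂u/∂x_i,   so that   b_i = β_i + ∑_{j=1}^{d} ∂a_{ij}/∂x_j.

This is what produces, for example, the first-order coefficient b = (σ² − r)S in the Black–Scholes model of Section 3.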
Let us divide the time interval (0, T] into K equally spaced intervals (t_{k−1}, t_k], with h = t_k − t_{k−1} = T/K for k = 1, . . . , K. Let U^k denote the approximation to the solution u(t_k, x) of the PDE at time step t_k, using either a first- or second-order discretization scheme [2]

(U^k − U^{k−1}) / h − ∇·(A∇U^k) + F(U^{k−1}) + rU^k = 0,

( (3/2) U^k − 2 U^{k−1} + (1/2) U^{k−2} ) / h − ∇·(A∇U^k) + 2F(U^{k−1}) − F(U^{k−2}) + rU^k = 0,

with F(u) = b · ∇u, U^0 = Ψ, and, in the second-order scheme, U^1 taken from the first-order scheme. Then
we can rewrite the discretized PDE as an energy functional [10, 18]
U^k = arg min_{u ∈ H_0^1} I_n^k(u),

with H_0^1 the Sobolev space in which the derivatives up to order 1 have finite L2-norm, and the energy functionals

I_1^k(u) = (1/2) ‖u − U^{k−1}‖²_{L²(Ω)} + h ∫_Ω [ (1/2)(∇u)^T A∇u + (r/2)u² + F(U^{k−1})u ] dx,

I_2^k(u) = (1/2) ‖u − (4/3)U^{k−1} + (1/3)U^{k−2}‖²_{L²(Ω)} + (2h/3) ∫_Ω [ (1/2)(∇u)^T A∇u + (r/2)u² + (2F(U^{k−1}) − F(U^{k−2}))u ] dx,   (2.4)

for the first- and second-order discretization, respectively.
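As a sanity check (a standard minimizing-movement computation, made explicit here for convenience), setting the first variation of I_1^k to zero at the minimizer u = U^k recovers the first-order scheme in weak form: for every test function φ ∈ H_0^1,

d/dε I_1^k(u + εφ) |_{ε=0} = ∫_Ω [ (u − U^{k−1})φ + h( (∇φ)^T A∇u + ruφ + F(U^{k−1})φ ) ] dx = 0,

which, after integration by parts, is the weak formulation of (U^k − U^{k−1})/h − ∇·(A∇U^k) + F(U^{k−1}) + rU^k = 0.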
Let f^k(x; θ) denote a neural network approximation of U^k with trainable parameters θ. Applying a Monte Carlo approximation to the integrals, the discretized cost functional takes the form

L_n^k(θ; x) = (|Ω| / 2M) ∑_{m=1}^{M} ( f^k(x_m; θ) + ∑_{j=1}^{n} α_n^j f^{k−j}(x_m) )² + β_n h N_n^k(θ; x),   (2.5)

with

N_n^k(θ; x) = (|Ω| / M) ∑_{m=1}^{M} [ (1/2) ∇f^k(x_m; θ)^T A ∇f^k(x_m; θ) + (r/2) f^k(x_m; θ)² + ( b · ∑_{j=1}^{n} γ_n^j ∇f^{k−j}(x_m) ) f^k(x_m; θ) ].

Here, M denotes the number of samples x_m, n ∈ {1, 2} the order of the discretization, and α_n^j, β_n and γ_n^j are the corresponding coefficients.
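Comparing (2.5) with (2.4) gives, for the first-order scheme, α_1^1 = −1, β_1 = 1 and γ_1^1 = 1 (and α_2^1 = −4/3, α_2^2 = 1/3, β_2 = 2/3, γ_2^1 = 2, γ_2^2 = −1 for the second-order scheme). For illustration, a minimal PyTorch sketch of how L_1^k can be evaluated with automatic differentiation follows; the function names, signatures and the A_fn/b_fn helpers are placeholders, not the implementation used for the experiments.

import torch

def tdgf_cost_first_order(f_k, f_prev, x, A_fn, b_fn, r, h, vol_omega):
    # Monte Carlo evaluation of L_1^k in (2.5) with alpha_1^1 = -1, beta_1 = gamma_1^1 = 1.
    # f_k: current network, (M, d) -> (M,); f_prev: previous-step network (or payoff),
    # differentiable in x; A_fn(x): (M, d, d) diffusion matrix; b_fn(x): (M, d) drift vector.
    x = x.clone().requires_grad_(True)
    u = f_k(x)
    grad_u = torch.autograd.grad(u.sum(), x, create_graph=True)[0]        # grad_x f^k
    u_prev = f_prev(x)
    grad_prev = torch.autograd.grad(u_prev.sum(), x)[0].detach()          # grad_x f^{k-1}
    u_prev = u_prev.detach()
    A, b = A_fn(x), b_fn(x)
    quad = 0.5 * torch.einsum("mi,mij,mj->m", grad_u, A, grad_u)          # 1/2 (grad f^k)^T A grad f^k
    react = 0.5 * r * u ** 2                                              # r/2 (f^k)^2
    conv = (b * grad_prev).sum(dim=1) * u                                 # (b . grad f^{k-1}) f^k
    return vol_omega * (0.5 * ((u - u_prev) ** 2).mean() + h * (quad + react + conv).mean())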
In order to minimize this cost function, we use a stochastic gradient descent-type algorithm, i.e. an iterative
scheme of the form:
θ_{n+1} = θ_n − α ∇_θ L^k(θ_n; x).   (2.6)
The hyperparameter α is the step size of our update, called the learning rate. An overview of the TDGF method
appears in Algorithm 1.
Algorithm 1 Time Deep Gradient Flow method
1: Initialize θ_0^0.
2: Set f^0(x; θ) = Ψ(x).
3: for each time step k = 1, . . . , K do
4:   Initialize θ_0^k = θ^{k−1}.
5:   for each sampling stage n = 1, . . . , N do
6:     Generate M random points x_m for training.
7:     Calculate the cost functional L^k(θ_n^k; x) for the selected points.
8:     Take a descent step θ_{n+1}^k = θ_n^k − α ∇_θ L^k(θ_n^k; x).
9:   end for
10: end for
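A compact sketch of Algorithm 1 in PyTorch, reusing the tdgf_cost_first_order sketch above and assuming helper functions make_network() and sample_points(M) (both illustrative); the warm start θ_0^k = θ^{k−1} is implemented by copying the weights of the previous network.

import torch

def train_tdgf(make_network, sample_points, payoff, A_fn, b_fn,
               r=0.05, T=1.0, K=100, N=500, M=600, lr=3e-4, vol_omega=1.0):
    # Sketch of Algorithm 1; make_network, sample_points and payoff are assumed helpers.
    h = T / K
    f_prev = payoff                                    # f^0(x; theta) = Psi(x)
    networks = []
    for k in range(1, K + 1):                          # time stepping
        f_k = make_network()
        if networks:                                   # warm start: theta_0^k = theta^{k-1}
            f_k.load_state_dict(networks[-1].state_dict())
        opt = torch.optim.Adam(f_k.parameters(), lr=lr, betas=(0.9, 0.999), weight_decay=0.0)
        for n in range(N):                             # sampling stages
            x = sample_points(M)                       # M random points in Omega
            loss = tdgf_cost_first_order(f_k, f_prev, x, A_fn, b_fn, r, h, vol_omega)
            opt.zero_grad()
            loss.backward()
            opt.step()                                 # descent step (2.6)
        networks.append(f_k)
        f_prev = f_k                                   # previous-step network, no longer trained
    return networks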
2.2. Deep Galerkin Method. We compare the TDGF method with a popular deep learning method for solving
PDEs: the DGM of Sirignano and Spiliopoulos [19]. In the DGM approach, we minimize the square L2 -error
of the PDE:
‖ ∂u/∂t − ∇·(A∇u) + b·∇u + ru ‖²_{L²([0,T]×Ω)} + ‖ u(0, x) − Ψ(x) ‖²_{L²(Ω)}.

Then the cost functional for the neural network approximation f(t, x; θ) of u takes the form

L(θ; t, x) = (T|Ω| / M_1) ∑_{m=1}^{M_1} [ ∂f/∂t (t_m, x_m; θ) − ∇·(A∇f(t_m, x_m; θ)) + b·∇f(t_m, x_m; θ) + rf(t_m, x_m; θ) ]² + (|Ω| / M_2) ∑_{m=1}^{M_2} [ f(0, x_m; θ) − Ψ(x_m) ]².
The solution of the PDE is approximated by a neural network using stochastic gradient descent as in equation
(2.6). Contrary to the TDGF, there is no time stepping. Instead of training a neural network for each time step,
there is one neural network with time as input parameter. An overview of the DGM appears in Algorithm 2.
Algorithm 2 Deep Galerkin Method
1: Initialize θ0 .
2: for each sampling stage n = 1, ..., N do
3: Generate M random points (tm , xm ) for training.
4: Calculate the cost functional L(θn ; t, x) for the selected points.
5: Take a descent step θn+1 = θn − α∇θ L(θn ; t, x).
6: end for
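For comparison, a minimal PyTorch sketch of the DGM cost and training loop (Algorithm 2) for the one-dimensional Black–Scholes case; the residual is formed with automatic differentiation, the network f is assumed to take (t, S) as inputs, and the initial-condition samples are, for simplicity, reused from the interior samples. All names are illustrative.

import torch

def dgm_loss_black_scholes(f, t, S, strike, r, sigma, T=1.0, vol_omega=1.0):
    # Monte Carlo DGM cost: squared PDE residual plus squared initial-condition error.
    t = t.clone().requires_grad_(True)
    S = S.clone().requires_grad_(True)
    u = f(t, S)
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_S = torch.autograd.grad(u.sum(), S, create_graph=True)[0]
    u_SS = torch.autograd.grad(u_S.sum(), S, create_graph=True)[0]
    residual = u_t - 0.5 * sigma**2 * S**2 * u_SS - r * S * u_S + r * u
    init_err = f(torch.zeros_like(S), S) - torch.clamp(S - strike, min=0.0)   # Psi(S) = (S - K)^+
    return T * vol_omega * (residual**2).mean() + vol_omega * (init_err**2).mean()

def train_dgm(f, sample_t, sample_S, strike, r=0.05, sigma=0.25,
              stages=100_000, M=600, lr=3e-4):
    # Sketch of Algorithm 2; sample_t and sample_S are assumed samplers over [0, T] and Omega.
    opt = torch.optim.Adam(f.parameters(), lr=lr, betas=(0.9, 0.999), weight_decay=0.0)
    for n in range(stages):                     # sampling stages
        loss = dgm_loss_black_scholes(f, sample_t(M), sample_S(M), strike, r, sigma)
        opt.zero_grad()
        loss.backward()
        opt.step()                              # descent step
    return f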
3. OPTION PRICING MODELS
This section explains the two option pricing models in which we solve the pricing PDE. Subsection 3.1
elaborates on the Black–Scholes model and Subsection 3.2 on the Heston model.
3.1. Black–Scholes. In the Black and Scholes [5] model, the dynamics of the stock price S is a geometric
Brownian motion:
dSt = rSt dt + σSt dWt , S0 > 0,
with r, σ ∈ R+ the risk free rate and the volatility respectively.
Consider a European call option on S with payoff Ψ(ST ) = (ST − K)+ at maturity time T > 0. Using the
fundamental theorem of asset pricing and the Feynman–Kac formula, the price of this derivative can be written
as the solution to a PDE in this model. Indeed, let u : [0, T ] × Ω → R denote the price of this derivative, with
Ω ⊆ R and t the time to maturity. Then, u solves the Black–Scholes PDE:
∂u/∂t − (1/2) σ² S² ∂²u/∂S² − rS ∂u/∂S + ru = 0,   (t, S) ∈ [0, T] × Ω,
u(0, S) = Ψ(S),   S ∈ Ω.
This PDE has an exact solution:
u(t, S) = S Φ(d_1) − K e^{−rt} Φ(d_2),
with Φ the standard normal cumulative distribution function,
d_1 = ( log(S/K) + (r + σ²/2) t ) / (σ √t)   and   d_2 = d_1 − σ √t.
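Since this exact solution serves as the reference price in the experiments, a short Python sketch of it follows (using SciPy's normal CDF; the at-the-money strike in the example is illustrative, as the strike is not fixed here).

import numpy as np
from scipy.stats import norm

def black_scholes_call(S, strike, r, sigma, t):
    # Exact Black-Scholes price of a European call; t is the time to maturity.
    S, t = np.asarray(S, dtype=float), np.asarray(t, dtype=float)
    d1 = (np.log(S / strike) + (r + 0.5 * sigma**2) * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    return S * norm.cdf(d1) - strike * np.exp(-r * t) * norm.cdf(d2)

# With the Section 4 parameters r = 0.05, sigma = 0.25, T = 1 and an illustrative
# at-the-money strike: black_scholes_call(1.0, 1.0, 0.05, 0.25, 1.0) ~ 0.123.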
The operator A takes the form (2.2) with the coefficients in (2.3) provided by
a = (1/2) σ² S²,
b = (σ² − r) S.
3.2. Heston. The Heston [13] model is a popular stochastic volatility model with dynamics
dS_t = r S_t dt + √(V_t) S_t dW_t,   S_0 > 0,
dV_t = λ(κ − V_t) dt + η √(V_t) dB_t,   V_0 > 0.
Here V is the variance process, W, B are correlated (standard) Brownian motions, with correlation coefficient
ρ, and λ, κ, η ∈ R+ . The generator corresponding to these dynamics, in the form (2.1), equals
Au = −rS ∂u/∂S − λ(κ − V) ∂u/∂V − (1/2) S² V ∂²u/∂S² − (1/2) η² V ∂²u/∂V² − ρη S V ∂²u/(∂S∂V).
This PDE does not have an exact solution. The characteristic function of the Heston model does have an
analytical representation [13], from which a reference price can be determined using the COS method [8].
The operator A takes the form (2.2) with the coefficients in (2.3) provided by
a_{11} = (1/2) S² V,
a_{12} = a_{21} = (1/2) ρη S V,
a_{22} = (1/2) η² V,
b_1 = ( −r + V + (1/2) ρη ) S,
b_2 = λ(V − κ) + (1/2) η² + (1/2) ρη V.
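For illustration, the split coefficients above translate directly into code. A small PyTorch sketch that returns A and b at a batch of points with columns (S, V); the parameter values of Section 4 are used as assumed defaults.

import torch

def heston_A(x, eta=0.1, rho=0.0):
    # Diffusion matrix A(S, V) of the split operator (2.2); x has columns (S, V).
    S, V = x[:, 0], x[:, 1]
    A = torch.zeros(x.shape[0], 2, 2)
    A[:, 0, 0] = 0.5 * S ** 2 * V
    A[:, 0, 1] = 0.5 * rho * eta * S * V
    A[:, 1, 0] = A[:, 0, 1]
    A[:, 1, 1] = 0.5 * eta ** 2 * V
    return A

def heston_b(x, r=0.05, eta=0.1, rho=0.0, lam=2.0, kappa=0.01):
    # Drift vector b(S, V) of the split operator (2.2), matching the formulas above.
    S, V = x[:, 0], x[:, 1]
    b = torch.zeros(x.shape[0], 2)
    b[:, 0] = (-r + V + 0.5 * rho * eta) * S
    b[:, 1] = lam * (V - kappa) + 0.5 * eta ** 2 + 0.5 * rho * eta * V
    return b

# e.g. A_fn = lambda x: heston_A(x) and b_fn = lambda x: heston_b(x)
# can be plugged into the TDGF cost sketch of Section 2.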
4. IMPLEMENTATION DETAILS
We use the network architecture from Papapantoleon and Rou [18], including the use of information about the option price in order to facilitate the training of the neural network:
X^1 = σ_1(W^1 x + b^1),
Z^l = σ_1(U^{z,l} x + W^{z,l} X^l + b^{z,l}),   l = 1, . . . , L,
G^l = σ_1(U^{g,l} x + W^{g,l} X^l + b^{g,l}),   l = 1, . . . , L,
R^l = σ_1(U^{r,l} x + W^{r,l} X^l + b^{r,l}),   l = 1, . . . , L,
H^l = σ_1(U^{h,l} x + W^{h,l} (X^l ⊙ R^l) + b^{h,l}),   l = 1, . . . , L,
X^{l+1} = (1 − G^l) ⊙ H^l + Z^l ⊙ X^l,   l = 1, . . . , L,
f(x; θ) = (S − K e^{−rt})^+ + σ_2(W X^{L+1} + b),
with activation functions the hyperbolic tangent function, σ_1(x) = tanh(x), and the softplus function, σ_2(x) = log(e^x + 1), which guarantees that the option price remains above the no-arbitrage bound. The parameters of the network have dimensions W^1, U^{m,l} ∈ R^{D×d}; b^1, b^{m,l} ∈ R^D; W^{m,l} ∈ R^{D×D}; W ∈ R^{1×D} and b ∈ R for m = z, g, r, h and l = 1, . . . , L, with x ∈ R^d.
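A sketch of this gated architecture in PyTorch for the TDGF case, where the time to maturity t of the current time step is fixed and the payoff term (S − Ke^{−rt})^+ is added to a softplus output; the default sizes, the assumption that the first input column is the stock price, and all names are illustrative. For the DGM, time would instead be appended to the input x.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedPriceNet(nn.Module):
    # Highway-gated network with payoff-informed output, as described above.
    def __init__(self, d, D=50, L=3, strike=1.0, r=0.05, t=1.0):
        super().__init__()
        self.first = nn.Linear(d, D)                          # X^1 = sigma_1(W^1 x + b^1)
        gates = ("z", "g", "r", "h")
        self.U = nn.ModuleDict({m: nn.ModuleList(nn.Linear(d, D) for _ in range(L)) for m in gates})
        self.W = nn.ModuleDict({m: nn.ModuleList(nn.Linear(D, D, bias=False) for _ in range(L)) for m in gates})
        self.out = nn.Linear(D, 1)                            # W X^{L+1} + b
        self.L, self.discounted_strike = L, strike * math.exp(-r * t)

    def forward(self, x):                                     # x: (M, d), first column assumed to be S
        S = x[:, 0:1]
        X = torch.tanh(self.first(x))
        for l in range(self.L):
            Z = torch.tanh(self.U["z"][l](x) + self.W["z"][l](X))
            G = torch.tanh(self.U["g"][l](x) + self.W["g"][l](X))
            R = torch.tanh(self.U["r"][l](x) + self.W["r"][l](X))
            H = torch.tanh(self.U["h"][l](x) + self.W["h"][l](X * R))
            X = (1 - G) * H + Z * X                           # X^{l+1}
        bound = torch.clamp(S - self.discounted_strike, min=0.0)   # (S - K e^{-rt})^+
        return (bound + F.softplus(self.out(X))).squeeze(-1)       # f(x; theta)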
We consider the effect of five parameters on the error: the number of sampling stages, N in Algorithm 1;
the number of samples, M in equation (2.5); the number of layers L; the number of nodes per layer D; and
for the TDGF also the number of time steps, K in Algorithm 1. In the last case we consider both the first- and
second-order discretization scheme in equation (2.4). As the default parameter set we take 600 samples per
dimension in each sampling stage. To obtain the total number of samples from this number, multiply by 2 for
DGM in the Black–Scholes (time and stock price) and TDGF in the Heston model (stock price and volatility)
and multiply by 3 for the DGM in the Heston model (time, stock price and volatility). The default network size
is 3 layers and 50 nodes per layer. For the TDGF we take 100 time steps and 500 sampling stages in each time
step and for the DGM we take 100,000 sampling stages. After this many sampling stages the error does not
decrease further.
For both DGM and TDGF we use the Adam optimizer [15] with a learning rate α = 3 × 10−4 , (β1 , β2 ) =
(0.9, 0.999) and zero weight decay. The training is performed on the DelftBlue supercomputer [7], using one
seventh instance of an NVIDIA Tesla A100 GPU. We run each problem for five different random seeds and compare
the average error of the five runs.
As the modeling problem we take the price of a European call option with interest rate r = 0.05 and maturity
T = 1.0 year. We consider the Black–Scholes model with volatility σ = 0.25 and the Heston model with
η = 0.1, ρ = 0.0, κ = 0.01 and λ = 2.0. For the domain Ω we consider S ∈ [0.01, 3.0] and V ∈ [0.001, 0.1].
The solution of the Black–Scholes PDE with these parameters together with the solution produced by the TDGF
with the default training parameters is in Fig. 1.
FIGURE 1. Exact price of a European call option with r = 0.05, σ = 0.25 and T = 1.0 compared to the price computed by the TDGF.
5. RESULTS
In the next subsections we vary one of the parameters while keeping the others constant at the default value.
We compute the L2 -error on an equidistant grid of 47 points in each dimension on the domain.
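As an illustration, a small sketch of how such a grid error can be computed in the one-dimensional Black–Scholes case, reusing the black_scholes_call sketch from Subsection 3.1; the grid bounds follow Section 4, while the root-mean-square convention for the discrete L2-error is an assumption.

import numpy as np
import torch

def grid_l2_error(model, n=47, S_lo=0.01, S_hi=3.0, strike=1.0, r=0.05, sigma=0.25, t=1.0):
    # Discrete L2-error of a trained network against the exact Black-Scholes price
    # on an equidistant grid of n points.
    S = np.linspace(S_lo, S_hi, n)
    exact = black_scholes_call(S, strike, r, sigma, t)
    with torch.no_grad():
        approx = model(torch.tensor(S, dtype=torch.float32).unsqueeze(1)).numpy()
    return float(np.sqrt(np.mean((approx - exact) ** 2)))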
5.1. Sampling stages. First, we consider the number of sampling stages. For the TDGF we vary the number of
sampling stages per time step from 16 to 500. For the DGM we vary the number of sampling stages from 2048
to 100,000. After this many sampling stages, the error does not decrease anymore in the Black–Scholes model.
In the Heston model, the error seems to stop decreasing sooner for both methods. The fitted convergence rates
for both methods and both models are in Table 1. All convergence rates are slightly larger than -1. The plots
of the L2 -error on linear and log scale are in Figures 2-5 together with the training time. The training time
increases linearly with the number of sampling stages.
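A convergence rate of this kind is typically obtained as the slope of a least-squares fit of log-error against log-parameter; a minimal sketch follows (the exact fitting procedure used for Tables 1-5 is not specified, and the example numbers are purely illustrative).

import numpy as np

def fitted_convergence_rate(param_values, errors):
    # Slope of log(error) against log(parameter), e.g. the number of sampling stages.
    slope, _ = np.polyfit(np.log(param_values), np.log(errors), deg=1)
    return slope

# Illustrative numbers only: a slope near -1 means the error roughly halves
# whenever the parameter doubles.
# fitted_convergence_rate([16, 64, 250, 500], [2.0e-2, 6.0e-3, 1.7e-3, 9.0e-4])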
Method Black–Scholes Heston
TDGF -0.91 -0.63
DGM -0.73 -0.75
TABLE 1. Convergence rates for the number of sampling stages.
Method Black–Scholes Heston
TDGF -0.27 -0.11
DGM -0.12 0.2
TABLE 2. Convergence rates for the number of samples.
Method Black–Scholes Heston
TDGF -0.63 -0.47
DGM -0.33 -0.70
TABLE 3. Convergence rates for the number of layers.
Method Black–Scholes Heston
TDGF -1.11 -0.39
DGM 0.15 0.07
TABLE 4. Convergence rates for the number of nodes per layer.
5.2. Samples. Second, we consider the number of samples per dimension in each sampling stage. We vary
the number of samples per dimension from 16 to 600. The fitted convergence rates for both methods and both
models are in Table 2. In general, it is hard to draw conclusions. For the TDGF the rates are slightly negative,
but far from -0.5, which would be the expected rate of convergence for Monte Carlo sampling. For the DGM
the rates are larger and the error does not decrease as uniformly with the number of samples as for the TDGF.
The plots of the L2 -error on linear and log scale are in Figures 6-9 together with the training time. The number
of samples does not have a big impact on the training time.
5.3. Layers. Third, we consider the number of layers of the neural network. We vary the number of layers
from 1 to 4. The fitted convergence rates for both methods and both models are in Table 3. The rates vary but
are all negative, so, in general, more layers improve the result. The plots of the L2-error on linear and log scale
are in Figures 10-13 together with the training time. The training time increases linearly with the number of
layers.
5.4. Nodes per layer. Fourth, we consider the number of nodes per layer of the neural network. We vary
the number of nodes per layer from 10 to 50. The fitted convergence rates for both methods and both models are in
Table 4. The rates vary across different methods and models and are even positive for the DGM. The plots of
the L2 -error on linear and log scale are in Figures 14-17 together with the training time. The number of nodes
per layer does not have a big impact on the training time.
5.5. Time steps. Fifth and finally, we consider the number of time steps. We vary the number of time steps from
2 to 25. The fitted convergence rates for both models and for both first and second order time-stepping are in
Table 5. After 25 time steps, the second-order scheme does not improve any further, but the first-order scheme
does. The rates for the Black–Scholes model are lower than for the Heston model. In both cases O(2) outperforms O(1). The
plots of the L2 -error on linear and log scale are in Figures 18-19 together with the training time. The training
time grows linearly with the number of time steps, with the second-order method growing faster than the first-order method.
Method Black–Scholes Heston
O(1) -0.29 -0.15
O(2) -0.56 -0.25
TABLE 5. Convergence rates for the number of time steps.
6. CONCLUSION
This research analyzed the error of two neural network methods to solve option pricing PDEs: TDGF and
DGM. We determined the empirical convergence rates of the L2-error with respect to five parameters in both the Black–
Scholes and the Heston model. We also considered the effect of these parameters on the training time. Based
on the experiments we can give some recommendations that can assist anyone who wants to use the methods in
a practical setting.
• For both the TDGF and the DGM, the L2 -error decreases almost linearly with the number of sampling
stages, up to some point where it stops converging. Since the training time grows linearly with the
number of sampling stages, it would be optimal to stop at this point. Unfortunately, there is no method
to locate this point beforehand and we recommend choosing the number of sampling stages based on
whether speed or accuracy is more important in the practical setting.
• For the TDGF the L2 -error decreases slightly with the number of samples. Since the number of samples
does not influence the training time, we recommend using a large number of samples like six hundred
per dimension or even more.
• For the DGM the L2 -error does not decrease with the number of samples. Therefore, it is hard to give
any recommendation.
• For both TDGF and DGM, the number of layers tends to decrease the error, but not with a clear rate.
One layer is clearly not enough, but four layers does not improve the results compared to two or three
layers. Since the number of layers has a big influence on the training time, we recommend using two
or three layers.
• For the TDGF, the L2 -error decreases with the number of nodes per layer. Since the number of nodes
per layer does not influence the training time, we recommend choosing a large number of nodes per
layer, like forty or fifty.
• For the DGM, the L2 -error does not decrease with the number of nodes per layer. We recommend
choosing a smaller number of nodes per layer like thirty.
• For the TDGF, increasing the number of time steps decreases the L2-error. Using a second-order time-stepping
method, the error decreases quicker than using a first-order method. We recommend using the second-
order time stepping method. Since the training time increases linearly with the number of time steps,
we again recommend choosing the number of time steps based on whether speed or accuracy is more
important in the practical setting.
REFERENCES
[1] E. Abi Jaber and O. El Euch. Multifactor approximation of rough volatility models. SIAM Journal on
Financial Mathematics, 10(2):309–349, 2019.
[2] G. Akrivis and Y.-S. Smyrlis. Implicit–explicit BDF methods for the Kuramoto–Sivashinsky equation.
Applied numerical mathematics, 51(2-3):151–169, 2004.
[3] R. Assabumrungrat, K. Minami, and M. Hirano. Error analysis of option pricing via deep PDE solvers:
Empirical study. In 2024 16th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI),
pages 329–336. IEEE, 2024.
[4] S. Becker, P. Cheridito, and A. Jentzen. Deep optimal stopping. Journal of Machine Learning Research,
20(74):1–25, 2019.
[5] F. Black and M. Scholes. The pricing of options and corporate liabilities. Journal of Political Economy,
81(3):637–654, 1973.
[6] H. Buehler, L. Gonon, J. Teichmann, and B. Wood. Deep hedging. Quantitative Finance, 19(8):1271–
1291, 2019.
FIGURE 2. L2-error of the TDGF for a call option in the Black–Scholes model against number of sampling stages varying from 16 to 500. (A) Linear scale; (B) logarithmic scale.
[7] Delft High Performance Computing Centre (DHPC). DelftBlue Supercomputer (Phase 1). https://www.tudelft.nl/dhpc/ark:/44463/DelftBluePhase1, 2022.
[8] F. Fang and C. W. Oosterlee. A novel pricing method for European options based on Fourier-cosine series
expansions. SIAM Journal on Scientific Computing, 31:826–848, 2009.
[9] E. H. Georgoulis, M. Loulakis, and A. Tsiourvas. Discrete gradient flow approximations of high dimen-
sional evolution partial differential equations via deep neural networks. Communications in Nonlinear
Science and Numerical Simulation, 117:106893, 2023.
[10] E. H. Georgoulis, A. Papapantoleon, and C. Smaragdakis. A deep implicit-explicit minimizing movement
method for option pricing in jump-diffusion models. arXiv preprint arXiv:2401.06740, 2024.
[11] L. Gonon, A. Jentzen, B. Kuckuck, S. Liang, A. Riekert, and P. von Wurstemberger. An overview on
machine learning methods for partial differential equations: from physics informed neural networks to
deep operator learning. arXiv preprint arXiv:2408.13222, 2024.
[12] J. Han, A. Jentzen, and W. E. Solving high-dimensional partial differential equations using deep learning.
Proceedings of the National Academy of Sciences, 115(34):8505–8510, 2018.
[13] S. L. Heston. A closed-form solution for options with stochastic volatility with applications to bond and
currency options. The Review of Financial Studies, 6(2):327–343, 1993.
[14] D. Jiang, J. Sirignano, and S. Cohen. Global convergence of deep Galerkin and PINNs methods for solving
partial differential equations. arXiv preprint arXiv:2305.06000, 2023.
[15] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980,
2014.
[16] C. Liu, A. Papapantoleon, and J. Rou. Convergence of time-stepping deep gradient flow methods. To
appear, 2025.
[17] S. Liu, C. W. Oosterlee, and S. M. Bohte. Pricing options and computing implied volatilities using neural
networks. Risks, 7(1):16, 2019.
[18] A. Papapantoleon and J. Rou. A time-stepping deep gradient flow method for option pricing in (rough)
diffusion models. arXiv preprint arXiv:2403.00746, 2024.
[19] J. Sirignano and K. Spiliopoulos. DGM: A deep learning algorithm for solving partial differential equa-
tions. Journal of Computational Physics, 375:1339–1364, 2018.
[20] L. Van Mieghem, A. Papapantoleon, and J. Papazoglou-Hennig. Machine learning for option pricing: an
empirical investigation of network architectures. arXiv preprint arXiv:2307.07657, 2023.
[21] Z. Zhang, S. Zohren, and S. Roberts. Deep learning for portfolio optimization. arXiv preprint
arXiv:2005.13665, 2020.
DELFT INSTITUTE OF APPLIED MATHEMATICS, EEMCS, TU DELFT, 2628CD DELFT, THE NETHERLANDS
Email address: [email protected]
FIGURE 3. L2-error of the TDGF for a call option in the Heston model against number of sampling stages varying from 16 to 500. (A) Linear scale; (B) logarithmic scale.
FIGURE 4. L2-error of the DGM for a call option in the Black–Scholes model against number of sampling stages varying from 2048 to 100,000. (A) Linear scale; (B) logarithmic scale.
FIGURE 5. L2-error of the DGM for a call option in the Heston model against number of sampling stages varying from 2048 to 100,000. (A) Linear scale; (B) logarithmic scale.
FIGURE 6. L2-error of the TDGF for a call option in the Black–Scholes model against number of samples varying from 16 to 600. (A) Linear scale; (B) logarithmic scale.
FIGURE 7. L2-error of the TDGF for a call option in the Heston model against number of samples varying from 16 to 600. (A) Linear scale; (B) logarithmic scale.
FIGURE 8. L2-error of the DGM for a call option in the Black–Scholes model against number of samples varying from 16 to 600. (A) Linear scale; (B) logarithmic scale.
FIGURE 9. L2-error of the DGM for a call option in the Heston model against number of samples varying from 16 to 600. (A) Linear scale; (B) logarithmic scale.
FIGURE 10. L2-error of the TDGF for a call option in the Black–Scholes model against number of layers varying from 1 to 4. (A) Linear scale; (B) logarithmic scale.
FIGURE 11. L2-error of the TDGF for a call option in the Heston model against number of layers varying from 1 to 4. (A) Linear scale; (B) logarithmic scale.
FIGURE 12. L2-error of the DGM for a call option in the Black–Scholes model against number of layers varying from 1 to 4. (A) Linear scale; (B) logarithmic scale.
FIGURE 13. L2-error of the DGM for a call option in the Heston model against number of layers varying from 1 to 4. (A) Linear scale; (B) logarithmic scale.
FIGURE 14. L2-error of the TDGF for a call option in the Black–Scholes model against number of nodes per layer varying from 10 to 50. (A) Linear scale; (B) logarithmic scale.
FIGURE 15. L2-error of the TDGF for a call option in the Heston model against number of nodes per layer varying from 10 to 50. (A) Linear scale; (B) logarithmic scale.
FIGURE 16. L2-error of the DGM for a call option in the Black–Scholes model against number of nodes per layer varying from 10 to 50. (A) Linear scale; (B) logarithmic scale.
FIGURE 17. L2-error of the DGM for a call option in the Heston model against number of nodes per layer varying from 10 to 50. (A) Linear scale; (B) logarithmic scale.
FIGURE 18. L2-error of the TDGF for a call option in the Black–Scholes model against number of time steps varying from 2 to 25. (A) Linear scale; (B) logarithmic scale.
FIGURE 19. L2-error of the TDGF for a call option in the Heston model against number of time steps varying from 2 to 25. (A) Linear scale; (B) logarithmic scale.