
Implied Stoch Model

This paper introduces 'implied stochastic volatility models' that aim to accurately fit option-implied volatility data by linking the characteristics of the implied volatility surface to the stochastic volatility model specification. The authors propose both parametric and nonparametric methods to construct these models, utilizing observable shape characteristics of the implied volatility surface to inform model estimation. The approach is validated through Monte Carlo simulations and real data applications, demonstrating its effectiveness in capturing empirical features of the implied volatility surfaces.


Implied Stochastic Volatility Models∗

Yacine Aït-Sahalia† (Department of Economics, Princeton University and NBER)
Chenxu Li‡ (Guanghua School of Management, Peking University)
Chen Xu Li§ (Bendheim Center for Finance, Princeton University)

This Version: February 18, 2019

Abstract

This paper proposes to build “implied stochastic volatility models” designed to fit option-implied volatility data, and implements a method to construct such models. The method is based on explicitly linking shape characteristics of the implied volatility surface to the specification of the stochastic volatility model. We propose and implement parametric and nonparametric versions of implied stochastic volatility models.

Keywords: implied volatility surface, stochastic volatility, jumps, (generalized) method of moments, kernel estimation, closed-form expansion.

JEL classification: G12; C51; C52.

1 Introduction

No-arbitrage pricing arguments for options most often start with an assumed dynamic model that serves as the data generating process for the option’s underlying asset price. Most often again, that model is of the stochastic volatility type, see, e.g., Hull and White (1987), Heston (1993), Bates (1996), Duffie et al. (2000), and Pan (2002). Unfortunately, the relationship between the market data, namely option prices or equivalently, implied volatilities, and the model is not fully explicit. Implied volatilities can only be computed numerically or approximated, even under the affine stochastic volatility models, see, e.g., Duffie et al. (2000) and the references therein, for which option prices admit analytical Fourier transforms.

∗We benefited from the comments of participants at the 2017 Stanford-Tsinghua-PKU Conference in Quantitative Finance, the 2017 Fifth Asian Quantitative Finance Conference, the 2017 BCF-QUT-SJTU-SMU Conference on Financial Econometrics, the Second PKU-NUS Annual International Conference on Quantitative Finance and Economics, the 2017 Asian Meeting of the Econometric Society, the Third Annual Volatility Institute Conference at NYU Shanghai, the 2018 Review of Economic Studies 30th Anniversary Conference and the 2018 FERM Conference. The research of Chenxu Li was supported by the Guanghua School of Management, the Center for Statistical Science, and the Key Laboratory of Mathematical Economics and Quantitative Finance (Ministry of Education) at Peking University, as well as the National Natural Science Foundation of China (Grant 71671003). Chen Xu Li is grateful for a graduate scholarship and funding support from the Graduate School of Peking University as well as support from the Bendheim Center for Finance at Princeton University.

†Address: JRR Building, Princeton, NJ 08544, USA. E-mail address: [email protected].

‡Address: Guanghua School of Management, Peking University, Beijing, 100871, P. R. China. E-mail address: [email protected].

§Address: JRR Building, Princeton, NJ 08544, USA. E-mail address: [email protected].

As the variety of affine or non-affine specifications suggests, there is no accepted consensus on the model specifications in the literature. There is however agreement that a stochastic volatility model should produce option prices (or equivalently, implied volatilities) with the features that are observed in the empirical data. A prevalent approach relies on fitting pre-specified models with particular dynamics to data by estimation or calibration, with goodness-of-fit determined by likelihood or mean-squared pricing errors. Alternatively, models can be calibrated to fit a set (a continuum is often required) of options or other derivative prices exactly. Prominent examples of the latter approach are the local volatility model of Dupire (1994) and the results of Andersen and Andreasen (2000), Carr et al. (2004), Carr and Cousot (2011), and Carr and Cousot (2012) including local Lévy jumps.

In the same spirit, we ask in this paper whether it is possible to use the information contained in implied volatility data to conduct inference about an underlying stochastic volatility (rather than a local volatility) model. At each point in time, implied volatility data take the form of a surface representing the implied volatility of the option as a function of its moneyness and time to maturity. We will show that it is possible to use a small number of observable and practically useful “shape characteristics” of the implied volatility surface, including but not limited to the slope of the implied volatility smile, to fully characterize the underlying stochastic volatility model.

For this purpose, we will rely on an expansion of the implied volatility surface in terms of time-to-maturity and log-moneyness. Various types of expansions for implied volatilities or option prices, obtained using different methods, are available in the literature. They include: small volatility-of-volatility expansions, near a non-stochastic volatility, also known as small ε or small noise expansions, see Kunitomo and Takahashi (2001) and Takahashi and Yamada (2012); expansion based on slow-varying volatility, see Sircar and Papanicolaou (1999) and Lee (2001); expansion based on fast-varying and slow-varying analysis, see Fouque et al. (2016); short maturity expansions, see Medvedev and Scaillet (2007) (for an expansion with respect to the square of time-to-maturity with expansion term sorted in terms of moneyness scaled by volatility), Durrleman (2010) (with a correction due to Pagliarani and Pascucci (2017)), and Lorig et al. (2017); expansion using PDE methods, see Berestycki et al. (2004); singular perturbation expansion, see Hagan and Woodward (1999); expansion around an auxiliary model, see Kristensen and Mele (2011); expansion using transition density expansion, see Gatheral et al. (2012) and Xiu (2014); expansion of the characteristic function, see Jacquier and Lorig (2015). Some of these methods apply generally, while others apply only to specific models, such as the Heston model, as in Forde et al. (2012) (short maturity), Forde and Jacquier (2011) (long maturity), or exponential Lévy models as in Andersen and Lipton (2013). The asymptotic behavior of implied volatilities as time-to-maturity approaches zero is important: for the continuous case, see Ledoit et al. (2002) and Berestycki et al. (2002), and with jumps, see Carr and Wu (2003) and Durrleman (2008). Finally, a number of asymptotic results concerning long-dated, short-dated, far out of the money strike, and jointly-varying strike-expiration regimes are available, see Lee (2004), Gao and Lee (2014), and Tehranchi (2009).

The expansion we employ for our purposes is different from existing ones; it takes the form of a bivariate series in time-to-maturity and log-moneyness, applies to general stochastic volatility models and produces closed-form expressions for arbitrary stochastic volatility models with or without jumps. Given the extensive literature on expansions, however, the novelty in this paper is not its expansion (although it is new) but rather the use of such an expansion as a means to conduct inference on the underlying stochastic volatility model. The existing literature on implied volatility expansions has been primarily concerned with the derivation of the expansion and its properties, but rarely with using the expansion for the purposes of estimating or testing the model that underlies the expansion. Said differently, the main use of expansions in the literature has been in the following direction: assuming a given stochastic volatility model, what can be said about the implied volatilities that this model generates? In this paper, we take the reverse direction: taking the observed implied volatility surface as given market data, what can be said about the stochastic volatility model that generated the data? We answer this question by constructing “implied stochastic volatility models”, which are stochastic volatility models whose characteristics have been completely estimated to reproduce salient characteristics of the implied volatility data. Our approach consists in casting a small number of observable shape characteristics of the implied volatility surface (its level, slope and convexity along the moneyness dimension, as well as its slope along the term-structure dimension) as a set of restrictions on the specification of the stochastic volatility model. If one is interested in estimating a parametric stochastic volatility model, we show how to set up these restrictions as moment conditions in GMM. If one is not willing to parametrize the model, we show how the functions characterizing the stochastic volatility model can be recovered nonparametrically from the shape characteristics of the implied volatility surface.

Applying the proposed method to S&P 500 index options, we construct an implied stochastic volatility model with the following empirical features: a strong leverage effect between the innovations in returns and volatility, mean reversion in volatility, monotonicity and state dependency in volatility of volatility, while matching the features of the implied volatility surfaces: level, smile and convexity in the log-moneyness direction, and slope in the term structure direction.

The paper is organized as follows. Section 2 sets up the problem we are studying, the notation, and describes the set of relationships between the stochastic volatility model and the implied volatility surface. We then use these relationships to propose two methods to construct an implied stochastic volatility model, first parametric in Section 3 and then nonparametric in Section 4. We implement these methods in Monte Carlo simulations in Section 5, showing that both the parametric and nonparametric estimation methods are accurate, and then on real data in Section 6. Section 7 extends the analysis to allow for jumps in the returns dynamics of the stochastic volatility model and discusses the empirical challenges that this poses. Section 8 concludes, while mathematical details are contained in the Appendix.

2 Stochastic volatility models and implied volatility surfaces

Consider a generic continuous bivariate stochastic volatility (SV hereafter) model. Under an assumed risk-neutral measure, the price of the underlying asset $S_t$ and its volatility $v_t$ jointly follow a diffusion process

\[
\frac{dS_t}{S_t} = (r - d)\,dt + v_t\,dW_{1t}, \tag{1a}
\]
\[
dv_t = \mu(v_t)\,dt + \gamma(v_t)\,dW_{1t} + \eta(v_t)\,dW_{2t}. \tag{1b}
\]

We will add jumps in returns to the model in Section 7 below. Here, $r$ and $d$ are the risk-free rate and the dividend yield of the underlying asset, both assumed constant for simplicity, and observable; $W_{1t}$ and $W_{2t}$ are two independent standard Brownian motions; $\mu$, $\gamma$, and $\eta$ are scalar functions. The generic specification (1a)–(1b) nests all existing continuous bivariate SV models. For models conventionally expressed in terms of instantaneous variance rather than volatility (e.g., the model of Heston (1993)), it is straightforward to obtain the equivalent form of (1a)–(1b) by Itô’s lemma. Our objective is to fully identify the model, that is, $v_t$ at each discrete instant at which data sampling occurs, and the unknown functions $\mu(\cdot)$, $\gamma(\cdot)$ and $\eta(\cdot)$. This is a natural extension to stochastic volatility models of the question answered in Dupire (1994) for local volatility models, which relied on a method which cannot be used in the stochastic volatility context.¹
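As a worked instance of this change of variables, consider a Heston-type model written for the variance $V_t = v_t^2$. The symbols $\kappa$ (mean-reversion speed), $\alpha$ (long-run variance), $\xi$ (volatility of variance), and $\rho$ (correlation) are standard Heston notation assumed here for illustration; they are not defined in the text above.

\[
dV_t = \kappa(\alpha - V_t)\,dt + \xi\sqrt{V_t}\,\big(\rho\,dW_{1t} + \sqrt{1-\rho^2}\,dW_{2t}\big).
\]

Applying Itô’s lemma to $v_t = \sqrt{V_t}$, with $d\langle V\rangle_t = \xi^2 V_t\,dt$,

\[
dv_t = \frac{1}{2\sqrt{V_t}}\,dV_t - \frac{1}{8V_t^{3/2}}\,d\langle V\rangle_t
= \left[\frac{\kappa(\alpha - v_t^2)}{2v_t} - \frac{\xi^2}{8v_t}\right]dt + \frac{\rho\,\xi}{2}\,dW_{1t} + \frac{\xi\sqrt{1-\rho^2}}{2}\,dW_{2t},
\]

so that, in the notation of (1b),

\[
\mu(v) = \frac{\kappa(\alpha - v^2)}{2v} - \frac{\xi^2}{8v}, \qquad \gamma(v) = \frac{\rho\,\xi}{2}, \qquad \eta(v) = \frac{\xi\sqrt{1-\rho^2}}{2}.
\]

Under these assumptions $\gamma$ and $\eta$ are constants, while the drift $\mu$ is nonlinear in $v$.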

We are also interested in the leverage effect coefficient function, the correlation function between asset returns and innovations in spot volatility, defined as in Aït-Sahalia et al. (2013) by

\[
\rho(v_t) = \frac{\gamma(v_t)}{\sqrt{\gamma(v_t)^2 + \eta(v_t)^2}}. \tag{2}
\]

This coefficient function is identified once the other components of the model are. In general, $\rho(v_t)$ is empirically found to be negative, and is in general stochastic since the dependence in $v_t$ need not cancel out between the numerator and denominator in (2): see, e.g., the models of Jones (2003) and Chernov et al. (2003), among others. For $\rho(v_t)$ to be independent of $v_t$, i.e., $\rho(v) \equiv \rho$ for some constant $\rho$, it must be that $\gamma(v) = \rho\,\eta(v)/\sqrt{1-\rho^2}$, i.e., the two functions $\eta(v)$ and $\gamma(v)$ are uniformly proportional to each other. This is the case in the model of Heston (1993), for instance.

¹ Local volatility models are of the form $dS_t/S_t = (r - d)\,dt + \sigma(S_t)\,dW_t$. The approach of Dupire (1994), based on inverting the pricing equation for the function $\sigma(\cdot)$, cannot be extended from the local to the stochastic volatility situation: when employed in a stochastic volatility setting, it can only characterize $E[v_T \mid S_T, S_0]$ rather than the full dynamics (1b).

The arbitrage-free price of a European-style put option with maturity $T$, i.e., time-to-maturity $\tau = T - t$, and exercise strike $K$ is (in terms of log-moneyness $k = \log(K/S_t)$)

\[
P(\tau, k, S_t, v_t) = e^{-r\tau} E_t[\max(S_t e^{k} - S_T, 0)],
\]

where $E_t$ denotes the risk-neutral conditional expectation given the information up to time $t$. In practice, the market price of an option is typically quoted through its Black-Scholes implied volatility (IV hereafter) $\Sigma$, i.e., the value of the volatility parameter which, when plugged into the Black-Scholes formula $P_{BS}(\tau, k, S_t, \sigma)$, leads to a theoretical value equal to the observed market price of the option:²

\[
P_{BS}(\tau, k, S_t, \Sigma) = P(\tau, k, S_t, v_t).
\]

Viewed simply as mapping actual option prices into a different unit, using implied volatilities does not require that the assumptions of the Black-Scholes model be satisfied, and has a few advantages: implied volatilities are independent of the scale of the underlying asset value or strike price, deviations from a flat IV surface denote deviations from the Black-Scholes model (or equivalently deviations from the Normality of log-returns), and such deviations can be monotonically interpreted (the higher the IV above the flat level, the more expensive the option, and similarly below), so differences in IV allow for relative value comparisons between options.
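The definition of the IV as the volatility that equates the Black-Scholes price to a given price can be sketched numerically. The following is a minimal illustration using only the Python standard library; the function names `bs_put` and `implied_vol`, the bisection bracket, and the tolerance are choices made here for illustration and are not taken from the paper.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(tau, k, s, sigma, r=0.0, d=0.0):
    """Black-Scholes put price with strike K = s * exp(k) (k is log-moneyness)."""
    strike = s * math.exp(k)
    sd = sigma * math.sqrt(tau)
    d1 = (-k + (r - d + 0.5 * sigma ** 2) * tau) / sd
    d2 = d1 - sd
    return (strike * math.exp(-r * tau) * norm_cdf(-d2)
            - s * math.exp(-d * tau) * norm_cdf(-d1))


def implied_vol(price, tau, k, s, r=0.0, d=0.0, lo=1e-6, hi=5.0, tol=1e-12):
    """Invert the Black-Scholes put price in sigma by bisection.

    Relies on the put price being increasing in sigma; the bracket
    [lo, hi] is an assumption and must contain the solution.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_put(tau, k, s, mid, r, d) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price a put at sigma = 0.2, then recover the volatility.
price = bs_put(0.25, -0.05, 100.0, 0.2, r=0.01, d=0.02)
sigma_hat = implied_vol(price, 0.25, -0.05, 100.0, r=0.01, d=0.02)
```

Because the put price is homogeneous of degree one in the underlying price for fixed log-moneyness, dividing both the price and the underlying by the same constant leaves the recovered volatility unchanged, which is the scale independence noted above.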

2.1 From stochastic volatility to implied volatility

The IV depends on $S_t$ only through $k$, that is, $\Sigma = \Sigma(\tau, k, v_t)$. This is because the option price can be written in a form proportional to the time-$t$ price $S_t$,

\[
P(\tau, k, S_t, v_t) = S_t \bar{P}(\tau, k, v_t) \quad \text{with} \quad \bar{P}(\tau, k, v_t) = e^{-r\tau} E_t\left[\max\left(e^{k} - \frac{S_T}{S_t}, 0\right)\right]. \tag{3}
\]

For a given log-moneyness $k$, the function $\bar{P}(\tau, k, v_t)$ is independent of the initial underlying asset price $S_t$ since the dynamics (1a) of the underlying asset price imply that the ratio $S_T/S_t$ is independent of $S_t$, so is the expectation function (3) for defining $\bar{P}(\tau, k, v_t)$. Writing $P_{BS}(\tau, k, S_t, \sigma) = S_t \bar{P}_{BS}(\tau, k, \sigma)$, the IV $\Sigma$ is determined by

\[
\bar{P}_{BS}(\tau, k, \Sigma) = \bar{P}(\tau, k, v_t), \tag{4}
\]

and therefore

\[
\Sigma(\tau, k, v_t) = \bar{P}_{BS}^{-1}(\tau, k, \bar{P}(\tau, k, v_t)). \tag{5}
\]

² Model implied volatilities calculated from put and call options are identical by put-call parity.
The mapping $(\tau, k) \mapsto \Sigma(\tau, k, v_t)$ at a given $t$ is the (model) IV surface at that time. We will consider several shape characteristics of the IV surface such as its slope and convexity along the log-moneyness and the term-structure dimensions, defined by the partial derivative $\partial^{i+j}\Sigma/\partial\tau^{i}\partial k^{j}$ for integers $i, j \geq 0$. In particular, we will focus on the at-the-money ($k = 0$) and short maturity ($\tau \to 0$) shape characteristics

\[
\Sigma_{i,j}(v_t) = \lim_{\tau \to 0} \frac{\partial^{i+j}}{\partial\tau^{i}\partial k^{j}} \Sigma(\tau, 0, v_t). \tag{6}
\]

To illustrate, we show in Figure 1 the S&P 500 IV surface on a given day along with the two slopes $\Sigma_{0,1}(v_t)$ (log-moneyness slope, or IV smile) and $\Sigma_{1,0}(v_t)$ (term-structure slope) as red and blue dashed lines, respectively.

The idea in this paper is to treat the shape characteristics of the IV surface (6) as observable from market data, and to use them to determine the SV model (1a)–(1b) that is compatible with them. The tool we call upon for that purpose is that of IV asymptotic expansions, which express the shape characteristics $\Sigma_{i,j}(\cdot)$ in terms of $v_t$ and the functions $\mu(\cdot)$, $\gamma(\cdot)$ and $\eta(\cdot)$. The function $\Sigma$ admits an expansion of the form

\[
\Sigma^{(J,L(J))}(\tau, k, v_t) = \sum_{j=0}^{J} \sum_{i=0}^{L_j} \sigma^{(i,j)}(v_t)\,\tau^{i} k^{j}, \tag{7}
\]

up to some integer expansion orders $J$ and $L(J) = (L_0, L_1, \cdots, L_J)$ with $L_j \geq 0$, and therefore

\[
\sigma^{(i,j)}(v_t) = \frac{1}{i!\,j!}\,\Sigma_{i,j}(v_t). \tag{8}
\]

For a given SV model, the coefficients $\sigma^{(i,j)}(\cdot)$ can be derived in closed form to arbitrary order one after the other. We provide the precise mathematical details in the Appendix. In a nutshell, we first note that the $(0,0)$th order term must be given by the instantaneous volatility

\[
\sigma^{(0,0)}(v) = v, \tag{9}
\]

which is a well-known fact (see, e.g., Ledoit et al. (2002) and Durrleman (2008)). The purpose of an IV expansion is to compute the higher order coefficients in (7) for an arbitrary SV model. We describe in Appendix A our method for achieving this goal; the advantage in our view compared to many of the existing alternative approaches described in the Introduction is that this method yields fully explicit coefficients, and does so for arbitrary SV models.

We illustrate the approach by focusing on the at-the-money level $\sigma^{(0,0)}$, slope $\sigma^{(0,1)}$, and convexity $\sigma^{(0,2)}$ (up to a constant factor of 2) along the log-moneyness dimension, as well as the slope $\sigma^{(1,0)}$ along the term-structure dimension, all for short time-to-maturity. These four basic shape characteristics form a skeleton of the IV surface, and thus conversely they can be extracted from an IV surface. Set $(J, L(J)) = (2, (1, 0, 0))$ in (7), that is,

\[
\Sigma^{(2,(1,0,0))}(\tau, k, v_t) = \sigma^{(0,0)}(v_t) + \sigma^{(1,0)}(v_t)\,\tau + \sigma^{(0,1)}(v_t)\,k + \sigma^{(0,2)}(v_t)\,k^{2}. \tag{10}
\]
We show in Appendix A that

\[
\sigma^{(0,1)}(v_t) = \frac{1}{2v_t}\,\gamma(v_t), \qquad \sigma^{(0,2)}(v_t) = \frac{1}{12v_t^{3}}\left[2v_t\gamma(v_t)\gamma'(v_t) + 2\eta(v_t)^{2} - 3\gamma(v_t)^{2}\right], \tag{11}
\]
\[
\sigma^{(1,0)}(v_t) = \frac{1}{24v_t}\left[2\gamma(v_t)\left(6(d - r) - 2v_t\gamma'(v_t) + 3v_t^{2}\right) + 12v_t\mu(v_t) + 3\gamma(v_t)^{2} + 2\eta(v_t)^{2}\right]. \tag{12}
\]

These expressions provide the expansion of the IV surface that corresponds to a given specification of the SV model. The main idea in this paper is to use conversely the IV surface expansion to estimate the unknown coefficient functions of the SV model. In other words, treating the coefficients $\sigma^{(0,0)}(\cdot)$, $\sigma^{(0,1)}(\cdot)$, $\sigma^{(0,2)}(\cdot)$, and $\sigma^{(1,0)}(\cdot)$ (and higher order coefficients if necessary) as observable from options data, how can we use the data and these formulae to estimate the unknown functions $\mu(\cdot)$, $\gamma(\cdot)$, and $\eta(\cdot)$?
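The forward direction, from a specified SV model to the four shape characteristics, can be sketched directly from (9), (11), and (12). The snippet below is a minimal illustration, assuming a Heston-type model rewritten in volatility form by Itô's lemma; the function names and the parameter values (`kappa`, `alpha`, `xi`, `rho`) are hypothetical and chosen only for the example.

```python
import math


def iv_coefficients(v, mu, gamma, gamma_prime, eta, r=0.0, d=0.0):
    """Return (sigma00, sigma01, sigma02, sigma10) from equations (9), (11), (12).

    mu, gamma, gamma_prime, eta are callables of the volatility level v.
    """
    g, gp, e, m = gamma(v), gamma_prime(v), eta(v), mu(v)
    s00 = v
    s01 = g / (2.0 * v)
    s02 = (2.0 * v * g * gp + 2.0 * e ** 2 - 3.0 * g ** 2) / (12.0 * v ** 3)
    s10 = (2.0 * g * (6.0 * (d - r) - 2.0 * v * gp + 3.0 * v ** 2)
           + 12.0 * v * m + 3.0 * g ** 2 + 2.0 * e ** 2) / (24.0 * v)
    return s00, s01, s02, s10


def iv_expansion(tau, k, coeffs):
    """Evaluate the truncated expansion (10) at time-to-maturity tau and log-moneyness k."""
    s00, s01, s02, s10 = coeffs
    return s00 + s10 * tau + s01 * k + s02 * k ** 2


# Hypothetical Heston-type parameters (illustrative values, not estimates).
kappa, alpha, xi, rho = 3.0, 0.04, 0.4, -0.5


def heston_mu(v):
    return kappa * (alpha - v * v) / (2.0 * v) - xi * xi / (8.0 * v)


def heston_gamma(v):
    return rho * xi / 2.0


def heston_gamma_prime(v):
    return 0.0


def heston_eta(v):
    return xi * math.sqrt(1.0 - rho * rho) / 2.0


coeffs = iv_coefficients(0.2, heston_mu, heston_gamma, heston_gamma_prime, heston_eta)
iv_one_month = iv_expansion(1.0 / 12.0, -0.1, coeffs)
```

With a negative `rho`, the slope coefficient in `coeffs` is negative, so the truncated surface is downward sloping in log-moneyness near the money.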

2.2 From implied volatility to stochastic volatility

It is possible in fact to fully characterize the SV model from observations on the level, log-moneyness slope and convexity, and term-structure slope of the IV surface. In other words, we view (11)–(12) as a system of equations to be solved for $\gamma(\cdot)$, $\eta(\cdot)$, and $\mu(\cdot)$, given the IV surface characteristics. These equations lead to a useful estimation method because it turns out that they can be inverted in closed form, so no further approximation, numerical solution of a differential equation or other numerical inversion is required. First, observe that (11)–(12), together with $\sigma^{(0,0)}(v_t) = v_t$ from (9), imply

\[
\gamma(v_t) = 2\sigma^{(0,0)}(v_t)\,\sigma^{(0,1)}(v_t), \tag{13a}
\]
\[
\eta(v_t) = \left[\frac{1}{2}\left(12\sigma^{(0,0)}(v_t)^{3}\sigma^{(0,2)}(v_t) - 2\sigma^{(0,0)}(v_t)\gamma(v_t)\gamma'(v_t) + 3\gamma(v_t)^{2}\right)\right]^{1/2}, \tag{13b}
\]

and

\[
\mu(v_t) = 2\sigma^{(1,0)}(v_t) + \frac{\gamma(v_t)}{6}\left(2\gamma'(v_t) - 3\sigma^{(0,0)}(v_t)\right) - \frac{\gamma(v_t)}{\sigma^{(0,0)}(v_t)}\left(d - r + \frac{1}{4}\gamma(v_t)\right) - \frac{\eta(v_t)^{2}}{6\sigma^{(0,0)}(v_t)}. \tag{13c}
\]

Second, plug (13a) into (13b), and then plug both expressions into (13c) to obtain:

Theorem 1. The coefficient functions $\gamma(\cdot)$, $\eta(\cdot)$, and $\mu(\cdot)$ of the SV model (1a)–(1b) can be recovered in closed form as functions of the coefficients of the IV expansion (10) as follows:

\[
\gamma(v_t) = 2\sigma^{(0,0)}(v_t)\,\sigma^{(0,1)}(v_t), \tag{14a}
\]
\[
\eta(v_t) = \sigma^{(0,0)}(v_t)\left[6\sigma^{(0,0)}(v_t)\sigma^{(0,2)}(v_t) + 2\sigma^{(0,1)}(v_t)^{2} - 4\sigma^{(0,0)}(v_t)\sigma^{(0,1)}(v_t)\sigma^{(0,1)\prime}(v_t)\right]^{1/2}, \tag{14b}
\]

and

\[
\mu(v_t) = \sigma^{(0,0)}(v_t)^{2}\left[\sigma^{(0,1)}(v_t)\left(2\sigma^{(0,1)\prime}(v_t) - 1\right) - \sigma^{(0,2)}(v_t)\right] - 2(d - r)\,\sigma^{(0,1)}(v_t) + 2\sigma^{(1,0)}(v_t), \tag{14c}
\]

where $\sigma^{(0,1)\prime}(v_t)$ represents the first order derivative of $\sigma^{(0,1)}(v_t)$ with respect to $v_t$.
We note a few interesting implications of this result. First, (14a) shows that for a given IV level $\sigma^{(0,0)}(v_t)$, the slope $\sigma^{(0,1)}(v_t)$ plays an important role in determining the volatility function $\gamma(v_t)$ attached to the common Brownian shocks $W_{1t}$ of the asset price $S_t$ and its volatility $v_t$. For a fixed level $\sigma^{(0,0)}(v_t)$, a steeper slope $\sigma^{(0,1)}(v_t)$ results in a higher absolute value of the volatility function $\gamma(v_t)$. Second, from (14b), a steeper slope $\sigma^{(0,1)}(v_t)$ has an effect on the volatility function $\eta(v_t)$ attached to the idiosyncratic Brownian shock $W_{2t}$ in the volatility dynamics which can be of either sign. Besides the level $\sigma^{(0,0)}(v_t)$ and slope $\sigma^{(0,1)}(v_t)$, the convexity $\sigma^{(0,2)}(v_t)$ also matters for the volatility function $\eta(v_t)$. The total spot volatility of volatility is $\sqrt{\gamma(v_t)^{2} + \eta(v_t)^{2}}$, so for a fixed level $\sigma^{(0,0)}(v_t)$ and slope $\sigma^{(0,1)}(v_t)$, a greater convexity $\sigma^{(0,2)}(v_t)$ results in a larger volatility of volatility. Third, from (2) and (14a), we see that the sign of the leverage effect coefficient $\rho(v_t)$ is determined by the sign of the slope $\sigma^{(0,1)}(v_t)$: as is typically the case in the data, a downward-sloping IV smile, $\sigma^{(0,1)}(v_t) < 0$, translates directly into $\rho(v_t) < 0$. Further, the magnitude of $\rho(v_t)$ is monotonically decreasing in $\eta(v_t)$, so it follows from (14b) and (2) that, for a fixed level $\sigma^{(0,0)}(v_t)$ and slope $\sigma^{(0,1)}(v_t)$, a greater convexity $\sigma^{(0,2)}(v_t)$ leads to a larger volatility of volatility, and consequently, a weaker leverage effect $\rho(v_t)$. Finally, (14c) shows that for fixed levels of $\sigma^{(0,0)}(v_t)$, $\sigma^{(0,1)}(v_t)$, and $\sigma^{(0,2)}(v_t)$, an increase of the term-structure slope $\sigma^{(1,0)}(v_t)$ on the IV surface results in an increase in the drift $\mu(v_t)$, i.e., a faster expected change of the instantaneous volatility $v_t$.
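The inversion from shape characteristics back to the model can be checked numerically at a single volatility level. The sketch below solves (11)–(12) directly for $\gamma$, $\eta^{2}$, and $\mu$, using the product rule $\gamma'(v) = 2\sigma^{(0,1)}(v) + 2v\,\sigma^{(0,1)\prime}(v)$ implied by $\gamma(v) = 2v\,\sigma^{(0,1)}(v)$. The function names `forward` and `invert` and all numerical inputs are hypothetical, chosen only to illustrate a round trip.

```python
import math


def forward(v, mu, gamma, gamma_p, eta, r=0.0, d=0.0):
    """Equations (9), (11), (12): expansion coefficients from model values at v."""
    s01 = gamma / (2.0 * v)
    s02 = (2.0 * v * gamma * gamma_p + 2.0 * eta ** 2 - 3.0 * gamma ** 2) / (12.0 * v ** 3)
    s10 = (2.0 * gamma * (6.0 * (d - r) - 2.0 * v * gamma_p + 3.0 * v ** 2)
           + 12.0 * v * mu + 3.0 * gamma ** 2 + 2.0 * eta ** 2) / (24.0 * v)
    return v, s01, s02, s10


def invert(s00, s01, s02, s10, s01_p, r=0.0, d=0.0):
    """Solve (11)-(12) for (gamma, eta, mu) at the volatility level v = s00.

    s01_p is the derivative of the slope coefficient sigma01 with respect to v.
    """
    v = s00
    gamma = 2.0 * v * s01
    gamma_p = 2.0 * s01 + 2.0 * v * s01_p  # product rule on gamma(v) = 2 v sigma01(v)
    eta_sq = 0.5 * (12.0 * v ** 3 * s02 - 2.0 * v * gamma * gamma_p + 3.0 * gamma ** 2)
    if eta_sq < 0.0:
        raise ValueError("inputs imply a negative eta^2")
    mu = (24.0 * v * s10
          - 2.0 * gamma * (6.0 * (d - r) - 2.0 * v * gamma_p + 3.0 * v ** 2)
          - 3.0 * gamma ** 2 - 2.0 * eta_sq) / (12.0 * v)
    return gamma, math.sqrt(eta_sq), mu


# Hypothetical model values at v = 0.2 (constant gamma, so gamma' = 0).
v0, mu0, gamma0, gamma_p0, eta0 = 0.2, -0.1, -0.1, 0.0, math.sqrt(0.03)
s00, s01, s02, s10 = forward(v0, mu0, gamma0, gamma_p0, eta0, r=0.01, d=0.02)
s01_p = (gamma_p0 - 2.0 * s01) / (2.0 * v0)  # derivative of gamma(v) / (2 v)
gamma_hat, eta_hat, mu_hat = invert(s00, s01, s02, s10, s01_p, r=0.01, d=0.02)
rho_hat = gamma_hat / math.hypot(gamma_hat, eta_hat)  # leverage coefficient (2)
```

In this example the recovered leverage coefficient is negative because the slope coefficient is negative, in line with the third implication discussed above.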

3 Constructing a parametric implied stochastic volatility model

We now turn to using the above connection between the specification of the SV model and the resulting IV expansion in order to estimate the coefficient functions of a parametric SV model, doing so in such a way that the estimated model generates option prices that match the observed features of the IV surface.

We assume for now that the SV model (1a)–(1b) is a parametric one, so that $\mu(\cdot) = \mu(\cdot; \theta)$, $\gamma(\cdot) = \gamma(\cdot; \theta)$, and $\eta(\cdot) = \eta(\cdot; \theta)$, where $\theta$ denotes the vector of unknown parameters to be estimated in a compact space $\Theta \subset \mathbb{R}^{K}$, and $\theta_0$ denotes their true values. We further assume that the parametric functions are known, and twice continuously differentiable in $\theta$.

To estimate $\theta$, we propose to use the closed-form IV expansion coefficients to form moment conditions. Assume that a total of $n$ IV surfaces are observed with equidistant time interval $\Delta$, without loss of generality. On day $l$, we observe $n_l$ implied volatilities $\Sigma^{data}(\tau_l^{(m)}, k_l^{(m)})$ along with time-to-maturity $\tau_l^{(m)}$ and log-moneyness $k_l^{(m)}$ for $m = 1, 2, \ldots, n_l$. We assume that the data are stationary and strong mixing with rate greater than two.

The moment functions we propose to use are

\[
g^{(i,j)}(v_{l\Delta}; \theta) = [\sigma^{(i,j)}]_l^{data} - [\sigma^{(i,j)}(v_{l\Delta}; \theta)]^{model}, \tag{15}
\]

where $[\sigma^{(i,j)}]_l^{data}$ (resp. $[\sigma^{(i,j)}(v_{l\Delta}; \theta)]^{model}$) denote the data coefficients (resp. the closed-form formulae given in (11)–(12) and additional higher orders if necessary) of the expansion terms $\sigma^{(i,j)}(v_{l\Delta})$ of (7).
We gather the different moment conditions $g^{(i,j)}$ into a vector

\[
g(v_{l\Delta}; \theta) = \left(g^{(i,j)}(v_{l\Delta}; \theta)\right)_{(i,j) \in I}
\]

of moment conditions, for some integer index set $I$ consisting of nonnegative integer pairs $(i, j)$ such that $i + j \geq 1$: for example, $I = \{(1, 0), (0, 1), (0, 2)\}$. The choice of moment conditions is flexible, depending on the shape characteristics one decides to fit, and the number of parameters to be estimated, and may include higher order terms. We assume that

\[
E[g(v_{l\Delta}; \theta_0)] = 0
\]

and that $E[g(v_{l\Delta}; \theta)] \neq 0$ for $\theta \neq \theta_0$. We also assume that $\theta_0$ is in the interior of $\Theta$. As the moments are given by coefficients of an expansion, a bias term of small order is left, an effect similar to that in Aït-Sahalia and Mykland (2003). We treat this term as negligible on the basis of fitting each IV surface near its at-the-money and short maturity point.

To extract $[\sigma^{(i,j)}]_l^{data}$ from the observed options data, recall the form of the expansion (7), which can be interpreted as a polynomial regression of IV on time-to-maturity $\tau$ and log-moneyness $k$. So, on any day $l$, we regress

\[
\Sigma^{data}(\tau_l^{(m)}, k_l^{(m)}) = \sum_{j=0}^{J} \sum_{i=0}^{L_j} \beta_l^{(i,j)} (\tau_l^{(m)})^{i} (k_l^{(m)})^{j} + \epsilon_l^{(m)}, \quad \text{for } m = 1, 2, \ldots, n_l, \tag{16}
\]

where $\epsilon_l^{(m)}$ represent i.i.d. exogenous observation errors with zero means.³ The coefficient $\sigma^{(i,j)}(v_{l\Delta})$ of the expansion (7) is then estimated by the regression coefficient $\beta_l^{(i,j)}$ in (16):

\[
[\sigma^{(i,j)}]_l^{data} = \hat{\beta}_l^{(i,j)}, \quad \text{for } i, j \geq 0; \tag{17}
\]

in particular, $v_{l\Delta} = [\sigma^{(0,0)}]_l^{data} = \hat{\beta}_l^{(0,0)}$. While the objects of interest (6) are derivatives of the IV surface $\Sigma$ evaluated at $(\tau, k) = (0, 0)$, the regression (16) includes observations with $(\tau, k)$ away from $(0, 0)$ in order to estimate these derivatives.
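The daily regression (16) with $(J, L(J)) = (2, (1, 0, 0))$ amounts to ordinary least squares of IV on the regressors $1$, $\tau$, $k$, and $k^{2}$. The following is a minimal sketch using only the Python standard library; the helper `solve_linear`, the function name `fit_shape_characteristics`, and the synthetic surface used to exercise it are all hypothetical. In practice one would likely rely on a numerical linear algebra library instead of the hand-written solver.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_shape_characteristics(taus, ks, ivs):
    """OLS fit of IV on (1, tau, k, k^2), i.e. regression (16) truncated as in (10).

    Returns the estimated level, term-structure slope, log-moneyness slope,
    and log-moneyness convexity coefficient for one day's surface.
    """
    rows = [[1.0, t, k, k * k] for t, k in zip(taus, ks)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(4)] for a in range(4)]
    xty = [sum(r[a] * y for r, y in zip(rows, ivs)) for a in range(4)]
    b00, b10, b01, b02 = solve_linear(xtx, xty)
    return {"s00": b00, "s10": b10, "s01": b01, "s02": b02}


# Synthetic one-day surface generated exactly from a polynomial of the form (10).
grid_tau = [7 / 365, 14 / 365, 30 / 365, 60 / 365]
grid_k = [-0.10, -0.05, 0.0, 0.05, 0.10]
taus = [t for t in grid_tau for _ in grid_k]
ks = [k for _ in grid_tau for k in grid_k]
ivs = [0.2 + 0.05 * t - 0.25 * k + 0.3 * k * k for t, k in zip(taus, ks)]
fit = fit_shape_characteristics(taus, ks, ivs)
```

On noise-free polynomial data the fit reproduces the generating coefficients up to rounding; the entry `fit["s00"]` plays the role of the day's volatility level.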

To estimate the parameters $\theta$ by GMM (see Hansen (1982)) we construct the sample analog of $E[g(v_{l\Delta}; \theta)]$ as follows:

\[
g_n(\theta) \equiv \frac{1}{n} \sum_{l=1}^{n} g(v_{l\Delta}; \theta).
\]

The estimator $\hat{\theta}$ is defined as the solution of the quadratic minimization problem

\[
\hat{\theta} = \operatorname*{argmin}_{\theta}\; g_n(\theta)^{\top} W_n\, g_n(\theta), \tag{18}
\]

where $W_n$ is a positive definite weight matrix. If the number of moment conditions is equal to that of parameters to estimate, i.e., the model is exactly identified, the estimator $\hat{\theta}$ is the solution of the (system of) equations

\[
g_n(\hat{\theta}) = 0,
\]

and the choice of $W_n$ does not matter. Otherwise, i.e., if the number of moment conditions is greater than that of parameters to estimate, the model is over-identified and the optimal choice of the weight matrix $W_n$ follows from a standard two-step estimation.

³ This is a generalization of the linear regression in Dumas et al. (1998) of implied volatilities on $\tau$ and $K = S_t e^{k}$.

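The asymptotic behavior of $\hat{\theta}$ is given by

\[
\hat{\theta} \xrightarrow{P} \theta_0 \quad \text{and} \quad \sqrt{n}\,(\hat{\theta} - \theta_0) \xrightarrow{d} N\left(0, V^{-1}(\theta_0)\right), \quad \text{as } n \to \infty, \tag{19}
\]

where

\[
V(\theta) = G(\theta)^{\top} \Omega^{-1}(\theta)\, G(\theta), \quad \text{with } G(\theta) = E\left[\frac{\partial g(v_{l\Delta}; \theta)}{\partial \theta}\right], \quad \Omega(\theta) = \Omega_0(\theta) + \sum_{j=1}^{n-1} \left(\Omega_j(\theta) + \Omega_j(\theta)^{\top}\right),
\]

and

\[
\Omega_j(\theta) = E\left[g(v_{l\Delta}; \theta)\, g(v_{(l+j)\Delta}; \theta)^{\top}\right], \quad \text{for } j = 0, 1, 2, \ldots, n - 1.
\]

A consistent estimator of the matrix $V(\theta)$ is given by

\[
\hat{V}(\theta) = \hat{G}(\theta)^{\top} \hat{\Omega}^{-1}(\theta)\, \hat{G}(\theta), \quad \text{with } \hat{G}(\theta) = \frac{1}{n} \sum_{l=1}^{n} \frac{\partial g(v_{l\Delta}; \theta)}{\partial \theta}. \tag{20}
\]

In the exactly identified case, the matrix $\hat{\Omega}(\theta)$ is the Newey-West estimator with $\ell$ lags:

\[
\hat{\Omega}(\theta) = \hat{\Omega}_0(\theta) + \sum_{j=1}^{\ell} \left(\frac{\ell + 1 - j}{\ell + 1}\right) \left(\hat{\Omega}_j(\theta) + \hat{\Omega}_j(\theta)^{\top}\right), \tag{21}
\]

where

\[
\hat{\Omega}_0(\theta) = \frac{1}{n} \sum_{l=1}^{n} g(v_{l\Delta}; \theta)\, g(v_{l\Delta}; \theta)^{\top} \quad \text{and} \quad \hat{\Omega}_j(\theta) = \frac{1}{n} \sum_{l=j+1}^{n} g(v_{l\Delta}; \theta)\, g(v_{(l-j)\Delta}; \theta)^{\top}, \quad \text{for } j = 1, 2, \ldots, \ell.
\]

In principle, the number of lags $\ell$ grows with $n$ at the rate $\ell = O(n^{1/3})$. In the over-identified case, the optimal choice of $W_n$ ought to be a consistent estimator of $\Omega^{-1}(\theta_0)$. For this, the estimator $\hat{\theta}$ is obtained by the following two steps: First, set the initial weight matrix $W_n$ in (18) as the identity matrix and arrive at a consistent estimator $\tilde{\theta}$. Second, compute $\hat{\Omega}(\tilde{\theta})$ according to (21), so that its inverse $\hat{\Omega}^{-1}(\tilde{\theta})$ is a consistent estimator of $\Omega^{-1}(\theta_0)$. Then set the weight matrix $W_n$ in (18) as $\hat{\Omega}^{-1}(\tilde{\theta})$ and update the estimator to $\hat{\theta}$.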
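The Newey-West estimator (21) can be sketched as follows for a sample of moment vectors already evaluated at a given parameter value. This is a minimal pure-Python illustration; the function name `newey_west` and the list-of-lists representation of matrices are choices made here, and the moments are not demeaned, following the definitions of $\hat{\Omega}_0$ and $\hat{\Omega}_j$ given with (21).

```python
def newey_west(g, lags):
    """Newey-West long-run covariance estimate as in equation (21).

    g    : list of n moment vectors (each a list of length q), in time order
    lags : number of lags (the ell in (21))
    Returns a q x q matrix as a list of lists.
    """
    n, q = len(g), len(g[0])

    def omega_j(j):
        # (1/n) * sum over l = j+1..n of g_l g_{l-j}^T (zero-based indices below)
        return [[sum(g[l][a] * g[l - j][b] for l in range(j, n)) / n
                 for b in range(q)] for a in range(q)]

    omega = omega_j(0)
    for j in range(1, lags + 1):
        weight = (lags + 1 - j) / (lags + 1)  # Bartlett weight
        oj = omega_j(j)
        for a in range(q):
            for b in range(q):
                omega[a][b] += weight * (oj[a][b] + oj[b][a])
    return omega


# Small scalar example with strongly negatively autocorrelated moments.
g_scalar = [[1.0], [-1.0], [1.0], [-1.0]]
omega_one_lag = newey_west(g_scalar, 1)
```

In a two-step procedure, the inverse of the matrix returned at the first-step estimate would serve as the weight matrix in (18); inverting it is left out of this sketch.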

We provide below in Section 5.1 an example showing how to construct a Heston implied stochastic volatility model, and the results of Monte Carlo simulations where the model is either exactly identified or over-identified. We find that for each parameter, the bias of the estimator is less than the corresponding finite-sample standard deviation and that the estimator $\sqrt{\hat{V}^{-1}(\hat{\theta})/n}$ of the asymptotic standard deviations, calculated according to (20), provides a reliable way of approximating standard errors for the parameters.

4 Constructing a nonparametric implied stochastic volatility model

We now turn to the case where no parametric form is assumed for the coefficient functions $\mu(\cdot)$, $\gamma(\cdot)$, and $\eta(\cdot)$ of the SV model, and show how the coefficients of the IV expansion (10) can be employed to recover them.

Theorem 1 can now be employed to construct the following explicit nonparametric estimation method for SV models. As in the parametric case of Section 3, the data for the four expansion terms $\sigma^{(0,0)}$, $\sigma^{(0,1)}$, $\sigma^{(0,2)}$, and $\sigma^{(1,0)}$ are regarded as input and obtained by a polynomial regression (16) of IV on time-to-maturity and log-moneyness. As in (15), we denote by $v_{l\Delta} = [\sigma^{(0,0)}]_l^{data}$, $[\sigma^{(0,1)}]_l^{data}$, $[\sigma^{(0,2)}]_l^{data}$, and $[\sigma^{(1,0)}]_l^{data}$ these data at time $l\Delta$.

To estimate $\gamma(\cdot)$ nonparametrically, we rely on (14a). Let

\[
[\gamma]_l^{data} = 2[\sigma^{(0,0)}]_l^{data}\,[\sigma^{(0,1)}]_l^{data} \tag{22}
\]

and consider the nonparametric regression

\[
[\gamma]_l^{data} = \gamma(v_{l\Delta}) + \epsilon_l, \tag{23}
\]

where $v_{l\Delta}$ is the explanatory variable, and $\epsilon_l$ represents the exogenous observation error. The function $\gamma(\cdot)$ can be estimated based on (23) using a local polynomial kernel regression (see, e.g., Fan and Gijbels (1996)).

To estimate the coefficient functions η(·) and µ(·), we implement the closed-form relations (13b)–(13c).4 Note that these equations require estimating both the function $\gamma$ and its derivative $\gamma'$. One advantage of local polynomial kernel regression is that it provides in one pass not only an estimator of the regression function but also of its derivative(s). Consider specifically locally linear kernel regression. For two arbitrary points $v$ and $w$, suppose that $\gamma(w)$ can be approximated by its first order Taylor expansion around $w = v$, i.e., $\gamma(w) \approx \gamma(v) + \gamma'(v)(w - v)$. Then, for any arbitrary value $v$ of the independent variable, $[\gamma]^{\mathrm{data}}_l$ is regarded as being approximately generated from the local linear regression as follows:

$$[\gamma]^{\mathrm{data}}_l \approx \alpha_0 + \alpha_1(v_{l\Delta} - v) + \epsilon_l,$$

where the localization argument makes the intercept $\alpha_0$ and slope $\alpha_1$ coincide with $\gamma$ and its first order derivative $\gamma'$ evaluated at $v$, respectively, i.e.,

$$\hat\gamma(v) = \hat\alpha_0 \quad\text{and}\quad \hat\gamma'(v) = \hat\alpha_1.^{5}$$

4 It is mathematically equivalent to implement the closed-form formulae (14b)–(14c) in Theorem 1.

The estimators $\hat\alpha_0$ and $\hat\alpha_1$ are obtained from the following weighted least squares minimization problem

$$(\hat\alpha_0, \hat\alpha_1) = \underset{\alpha_0,\alpha_1}{\operatorname{argmin}} \sum_{l=1}^{n}\left([\gamma]^{\mathrm{data}}_l - \alpha_0 - \alpha_1(v_{l\Delta} - v)\right)^2 K\!\left(\frac{v_{l\Delta} - v}{h}\right), \tag{24}$$

where $K$ denotes a kernel function and $h$ the bandwidth. In practice, we use the Epanechnikov kernel

$$K(z) = \frac{3}{4}(1 - z^2)\mathbf{1}_{\{|z|<1\}},$$

and a bandwidth $h$ selected either by the standard rule of thumb or by standard cross-validation, which minimizes the sum of leave-one-out squared errors. The sum of leave-one-out squared errors, e.g., for the volatility function $\gamma$, is given by $\sum_{l=1}^{n}([\gamma]^{\mathrm{data}}_l - \hat\alpha_{0,-l})^2$, where $\hat\alpha_{0,-l}$ is the local linear estimator $\hat\alpha_0$, at $v = v_{l\Delta}$, obtained from the weighted least squares problem (24) but without using the $l$th observation $(v_{l\Delta}, [\gamma]^{\mathrm{data}}_l)$.6
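As an illustration of the locally linear kernel regression in (24) with the Epanechnikov kernel and leave-one-out cross-validation, a minimal Python sketch follows. The function names (`epanechnikov`, `local_linear`, `loo_cv_score`, `select_bandwidth`) and the idea of choosing the bandwidth from a finite candidate grid are illustrative assumptions, not part of the original text.

```python
import numpy as np

def epanechnikov(z):
    """Epanechnikov kernel K(z) = (3/4)(1 - z^2) for |z| < 1, and 0 otherwise."""
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) < 1.0, 0.75 * (1.0 - z**2), 0.0)

def local_linear(v, x, y, h):
    """Solve the kernel-weighted least squares problem at the point v.

    Returns (alpha0_hat, alpha1_hat): the estimates of the regression
    function and of its first derivative at v.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = epanechnikov((x - v) / h)                  # kernel weights
    X = np.column_stack([np.ones_like(x), x - v])  # regressors (1, x - v)
    XtW = X.T * w                                  # weight each observation
    alpha = np.linalg.solve(XtW @ X, XtW @ y)
    return alpha[0], alpha[1]

def loo_cv_score(x, y, h):
    """Sum of leave-one-out squared errors for bandwidth h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(x))
    total = 0.0
    for l in idx:
        keep = idx != l                            # drop the l-th observation
        a0, _ = local_linear(x[l], x[keep], y[keep], h)
        total += (y[l] - a0) ** 2
    return total

def select_bandwidth(x, y, candidates):
    """Pick the candidate bandwidth with the smallest leave-one-out score."""
    scores = [loo_cv_score(x, y, h) for h in candidates]
    return candidates[int(np.argmin(scores))]
```

Each candidate bandwidth must be wide enough that every evaluation point has at least two distinct neighbours with positive weight; otherwise the 2×2 system solved in `local_linear` is singular.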
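Next, in light of (13b), we define

$$[\eta]^{\mathrm{data}}_l = \left[6\left([\sigma^{(0,0)}]^{\mathrm{data}}_l\right)^3[\sigma^{(0,2)}]^{\mathrm{data}}_l + \frac{1}{2}\left(-2[\sigma^{(0,0)}]^{\mathrm{data}}_l\,\hat\gamma(v_{l\Delta})\hat\gamma'(v_{l\Delta}) + 3\hat\gamma(v_{l\Delta})^2\right)\right]^{1/2}$$

given $[\sigma^{(0,0)}]^{\mathrm{data}}_l$ and $[\sigma^{(0,2)}]^{\mathrm{data}}_l$, i.e., the data for the expansion terms $\sigma^{(0,0)}$ and $\sigma^{(0,2)}$, as well as the estimators of $\gamma$ and $\gamma'$ obtained previously. In practice, on the right hand side of the above equation, the quantity inside the bracket $[\cdot]^{1/2}$ may take a negative value, owing to sampling noise in the data $[\sigma^{(0,0)}]^{\mathrm{data}}_l$ and $[\sigma^{(0,2)}]^{\mathrm{data}}_l$. To solve this problem, we work instead with $[\eta^2]^{\mathrm{data}}_l$ defined as

$$[\eta^2]^{\mathrm{data}}_l = 6\left([\sigma^{(0,0)}]^{\mathrm{data}}_l\right)^3[\sigma^{(0,2)}]^{\mathrm{data}}_l + \frac{1}{2}\left(-2[\sigma^{(0,0)}]^{\mathrm{data}}_l\,\hat\gamma(v_{l\Delta})\hat\gamma'(v_{l\Delta}) + 3\hat\gamma(v_{l\Delta})^2\right). \tag{25}$$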

We then estimate the coefficient function $\eta^2(\cdot)$ at each value $v$ by a kernel regression that localizes the data $[\eta^2]^{\mathrm{data}}_l$ at each point $v = v_{l\Delta}$, as we did in (24) for $\gamma(\cdot)$. In our experience, the estimator $\hat\eta^2(\cdot)$ is always nonnegative thanks to the kernel smoothing (even though a small number of data points $[\eta^2]^{\mathrm{data}}_l$ may be negative). We then define $\hat\eta(\cdot) \equiv [\hat\eta^2(\cdot)]^{1/2}$.


5 Note that $\hat\gamma'(v)$ is an estimator of $\gamma'(v)$ but is not the derivative of $\hat\gamma(v)$.

6 For a choice of kernel function $K$ with bandwidth $h$, the solution of the weighted least squares problem (24) is explicitly given by

$$\hat\alpha_0 = \left(\sum_{i,j=1}^{n} s_{ij}(v)(v_{i\Delta} - v)\right)^{-1}\left(\sum_{i,j=1}^{n} s_{ij}(v)(v_{i\Delta} - v)y_{j\Delta}\right) \quad\text{and}\quad \hat\alpha_1 = -\left(\sum_{i,j=1}^{n} s_{ij}(v)(v_{i\Delta} - v)\right)^{-1}\left(\sum_{i,j=1}^{n} s_{ij}(v)y_{j\Delta}\right),$$

where $y_{j\Delta}$ denotes the $j$th observation of the dependent variable and

$$s_{ij}(v) = K\!\left(\frac{v_{i\Delta} - v}{h}\right)K\!\left(\frac{v_{j\Delta} - v}{h}\right)(v_{i\Delta} - v_{j\Delta}).$$
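Finally, in light of (13c), we define

$$[\mu]^{\mathrm{data}}_l = 2[\sigma^{(1,0)}]^{\mathrm{data}}_l - \frac{\hat\eta(v_{l\Delta})^2}{6[\sigma^{(0,0)}]^{\mathrm{data}}_l} + \frac{\hat\gamma(v_{l\Delta})}{6}\left(2\hat\gamma'(v_{l\Delta}) - 3[\sigma^{(0,0)}]^{\mathrm{data}}_l\right) - \frac{\hat\gamma(v_{l\Delta})}{[\sigma^{(0,0)}]^{\mathrm{data}}_l}\left(d - r + \frac{1}{4}\hat\gamma(v_{l\Delta})\right) \tag{26}$$

given the estimators of $\gamma$, $\gamma'$, and $\eta^2$ obtained previously, and estimate the coefficient function $\mu(\cdot)$ at each value $v$ by applying to the data (26) the same kernel localization procedure (24) as employed for $\hat\gamma(\cdot)$ and $\hat\eta^2(\cdot)$.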
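The three data-construction steps (22), (25), and (26) can be sketched in Python as below. This is a hedged illustration: the function names are hypothetical, and the formulas are written in the form that is consistent with the Heston closed forms (28)–(29) of Section 5.1, which the helper `heston_terms` reproduces for checking purposes only.

```python
import numpy as np

def gamma_data(s00, s01):
    """[gamma]^data from the level and slope terms, as in (22)."""
    return 2.0 * s00 * s01

def eta2_data(s00, s02, gamma_hat, dgamma_hat):
    """[eta^2]^data from the level and convexity terms, as in (25)."""
    return (6.0 * s00**3 * s02
            + 0.5 * (-2.0 * s00 * gamma_hat * dgamma_hat + 3.0 * gamma_hat**2))

def mu_data(s00, s10, gamma_hat, dgamma_hat, eta2_hat, r, d):
    """[mu]^data from the term-structure slope term, as in (26)."""
    return (2.0 * s10
            - eta2_hat / (6.0 * s00)
            + gamma_hat / 6.0 * (2.0 * dgamma_hat - 3.0 * s00)
            - gamma_hat / s00 * (d - r + 0.25 * gamma_hat))

def heston_terms(v, kappa, alpha, xi, rho, r, d):
    """Heston expansion terms sigma^(0,0), (0,1), (0,2), (1,0) from (29)."""
    s01 = rho * xi / (4.0 * v)
    s02 = -(5.0 * rho**2 - 2.0) * xi**2 / (48.0 * v**3)
    s10 = (xi * (24.0 * rho * (d - r) + xi * (rho**2 - 4.0))
           + v**2 * (12.0 * xi * rho - 24.0 * kappa)
           + 24.0 * kappa * alpha) / (96.0 * v)
    return v, s01, s02, s10

# Consistency check under Heston, where gamma is constant (gamma' = 0).
kappa, alpha, xi, rho, r, d = 3.0, 0.04, 0.2, -0.7, 0.03, 0.0
v = 0.25
s00, s01, s02, s10 = heston_terms(v, kappa, alpha, xi, rho, r, d)
g = gamma_data(s00, s01)                      # should equal xi * rho / 2
e2 = eta2_data(s00, s02, g, 0.0)              # should equal xi^2 (1 - rho^2) / 4
m = mu_data(s00, s10, g, 0.0, e2, r, d)       # should equal the drift in (28)
```

With noise-free inputs the three outputs reproduce the Heston coefficient functions in (28) exactly; with real data, the kernel-smoothed estimators replace `g` and `e2` as described in the text.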

5 Monte Carlo simulation results

In this Section, we conduct Monte Carlo simulations to determine whether the


coefficient functions

of the SV model can be accurately recovered, either parametrically or


nonparametrically, using the
methods we proposed in Sections 3 and 4.

5.1 An implied Heston model

Consider first the parametric case, which we illustrate with the SV model of Heston (1993). Under the assumed risk-neutral measure, the underlying asset price $S_t$ and its spot variance $V_t = v_t^2$ follow

$$\frac{dS_t}{S_t} = (r - d)dt + \sqrt{V_t}\,dW_{1t}, \tag{27a}$$

$$dV_t = \kappa(\alpha - V_t)dt + \xi\sqrt{V_t}\left[\rho\,dW_{1t} + \sqrt{1 - \rho^2}\,dW_{2t}\right], \tag{27b}$$

where $W_{1t}$ and $W_{2t}$ are independent standard Brownian motions. Here, the parameter vector is $\theta = (\kappa, \alpha, \xi, \rho)$ and we assume that Feller's condition holds: $2\kappa\alpha > \xi^2$. The leverage effect parameter is $\rho \in [-1, 1]$.

To estimate the four parameters in $\theta = (\kappa, \alpha, \xi, \rho)$, we employ, in turn, the four moment conditions in $g = (g^{(1,0)}, g^{(0,1)}, g^{(0,2)}, g^{(1,1)})^{\top}$ to exactly identify the parameters, and the five moment conditions in $g = (g^{(1,0)}, g^{(0,1)}, g^{(0,2)}, g^{(1,1)}, g^{(2,0)})^{\top}$ to over-identify them. We impose $\alpha > 0$, $\kappa > 0$, $\xi > 0$ and Feller's condition as constraints during the GMM minimization (18).

Itô's lemma applied to $v_t = \sqrt{V_t}$ yields

$$\mu(v) = \frac{\kappa(\alpha - v^2)}{2v} - \frac{\xi^2}{8v}, \quad \gamma(v) = \frac{\xi\rho}{2}, \quad \eta(v) = \frac{\xi\sqrt{1 - \rho^2}}{2}. \tag{28}$$

Then, applying the results of Section 2.1 and the general method for deriving higher orders in Appendix A, we can calculate the expansion terms $\sigma^{(0,1)}(v)$, $\sigma^{(0,2)}(v)$, $\sigma^{(1,0)}(v)$, $\sigma^{(1,1)}(v)$, and $\sigma^{(2,0)}(v)$:

$$\sigma^{(0,0)}(v) = v, \quad \sigma^{(0,1)}(v) = \frac{\rho\xi}{4v}, \quad \sigma^{(0,2)}(v) = -\frac{1}{48v^3}\left(5\rho^2 - 2\right)\xi^2, \tag{29}$$

and

$$\sigma^{(1,0)}(v) = \frac{1}{96v}\left(\xi\left(24\rho(d - r) + \xi\left(\rho^2 - 4\right)\right) + v^2(12\xi\rho - 24\kappa) + 24\kappa\alpha\right),$$

$$\sigma^{(1,1)}(v) = -\frac{\xi}{384v^3}\left(16\left(2 - 5\rho^2\right)(r - d)\xi + \rho\left(40\kappa\alpha + 3\left(3\rho^2 - 4\right)\xi^2 + v^2(4\rho\xi - 8\kappa)\right)\right),$$

$$\begin{aligned}
\sigma^{(2,0)}(v) = \frac{1}{30720v^3}\Big[&\xi^2\Big(-640(r^2 + d^2)\left(5\rho^2 - 2\right) + 80d\left(3\rho\left(4 - 3\rho^2\right)\xi + 16\left(5\rho^2 - 2\right)r\right) \\
&\quad + \left(59\rho^4 - 88\rho^2 - 16\right)\xi^2 + 240\rho\left(3\rho^2 - 4\right)r\xi\Big) + 320v^4\left(5\kappa^2 - 5\kappa\rho\xi + \left(2\rho^2 - 1\right)\xi^2\right) \\
&- 80\kappa\alpha\xi\left(40d\rho + \left(5\rho^2 - 8\right)\xi - 40\rho r\right) - 40v^2(2\kappa - \rho\xi)\left(\xi\left(-8d\rho + 3\rho^2\xi + 8\rho r - 4\xi\right) + 8\kappa\alpha\right) - 960\kappa^2\alpha^2\Big].
\end{aligned}$$

We now generate a time series of $(S_t, V_t)$ with $n = 1{,}000$ consecutive samples at the daily frequency, i.e., with time increment $\Delta = 1/252$, by subsampling higher frequency data simulated using the Euler scheme. The parameter values are $r = 0.03$, $d = 0$, $\kappa = 3$, $\alpha = 0.04$, $\xi = 0.2$, and $\rho = -0.7$. Each day, we calculate option prices with time-to-maturity $\tau$ equal to 5, 10, 15, 20, 25, and 30 days and, for each time-to-maturity $\tau$, include 20 log-moneyness values $k$ within $\pm v_t\sqrt{\tau}$, where $\tau$ is annualized and $v_t$ is the spot volatility. The principles for judiciously choosing such a region of $(\tau, k)$ for simulation are discussed in detail in the next paragraph. Due to the affine nature of the model of Heston (1993), we calculate these option prices by Fourier transform inversion and then compute the corresponding IV values. To mimic a realistic market scenario, we add observation errors to these implied volatilities, sampled from a Normal distribution with mean zero and constant standard deviation equal to 15 bps, and further assumed to be uncorrelated across time-to-maturity and log-moneyness, as well as over time. Then, for each IV surface, we follow the regression procedure described around (16) to extract the estimated coefficients $\hat\beta^{(i,j)}_l$ of the bivariate regression (16).
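The simulation step can be sketched as follows: an Euler discretization of (27a)–(27b) on a fine grid, subsampled to the daily frequency. The number of intraday steps, the initial values, the log-price form of the Euler step, and the full-truncation treatment of negative variance are all assumptions made for this illustration; the text does not specify them.

```python
import numpy as np

def simulate_heston_euler(n_days=1000, steps_per_day=10, s0=100.0, var0=0.04,
                          r=0.03, d=0.0, kappa=3.0, alpha=0.04, xi=0.2,
                          rho=-0.7, seed=0):
    """Simulate (S_t, V_t) by an Euler scheme and subsample to daily data.

    Assumptions (not from the text): steps_per_day, s0, var0, the Euler step
    being applied to log S_t, and full truncation max(V, 0) inside the
    drift and diffusion coefficients.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / 252.0 / steps_per_day
    n_steps = n_days * steps_per_day
    z = rng.standard_normal((n_steps, 2))
    log_s = np.empty(n_steps + 1)
    var = np.empty(n_steps + 1)
    log_s[0], var[0] = np.log(s0), var0
    for i in range(n_steps):
        vp = max(var[i], 0.0)                      # truncated variance
        dw1 = np.sqrt(dt) * z[i, 0]
        dw2 = np.sqrt(dt) * z[i, 1]
        log_s[i + 1] = log_s[i] + (r - d - 0.5 * vp) * dt + np.sqrt(vp) * dw1
        var[i + 1] = (var[i] + kappa * (alpha - vp) * dt
                      + xi * np.sqrt(vp) * (rho * dw1
                                            + np.sqrt(1.0 - rho**2) * dw2))
    daily = np.arange(1, n_days + 1) * steps_per_day   # end-of-day indices
    return np.exp(log_s[daily]), np.maximum(var[daily], 0.0)

# Daily spot volatility v_t = sqrt(V_t) for a short illustrative path.
S, V = simulate_heston_euler(n_days=50, steps_per_day=5, seed=1)
spot_vol = np.sqrt(V)
```

The option pricing by Fourier inversion and the addition of 15 bps observation noise would follow this step; they are omitted here.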

In practice, one needs to choose the orders J, L0, L1, . . . , LJ in the bivariate
polynomial regression
in (16) and the region in (τ, k) of the IV surface data to compute the regression.
On the one hand, we

need at a minimum to include enough orders in the regression to estimate the


coefficients of interest
for the estimation method; recall that we need the terms σ(0,0), σ(0,1), σ(0,2),
σ(1,0), and σ(1,1) for
constructing an exactly identified Heston model, and need to include an additional
term σ(2,0) for

constructing an over-identified one. But we can consistently estimate all these


lower order coefficients

from a higher order regression, discarding the estimates of the higher order
coefficients. On the other hand, the orders cannot be chosen too high, and the region in $(\tau, k)$ cannot be chosen too narrow, in order to avoid over-fitting the regression. Specifically, we set the order to be $(J, L(J)) = (2, (2, 2, 1))$, so:

$$\Sigma^{\mathrm{data}}(\tau^{(m)}_l, k^{(m)}_l) = \beta^{(0,0)}_l + \beta^{(1,0)}_l\tau^{(m)}_l + \beta^{(2,0)}_l(\tau^{(m)}_l)^2 + \beta^{(0,1)}_l k^{(m)}_l + \beta^{(1,1)}_l\tau^{(m)}_l k^{(m)}_l + \beta^{(2,1)}_l(\tau^{(m)}_l)^2 k^{(m)}_l + \beta^{(0,2)}_l(k^{(m)}_l)^2 + \beta^{(1,2)}_l\tau^{(m)}_l(k^{(m)}_l)^2 + \epsilon^{(m)}_l \tag{30}$$

for $m = 1, 2, \ldots, n_l$. The estimated coefficients from this regression estimate the IV surface characteristics that we need (recall (17)).
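A bivariate polynomial regression of the form (30) is an ordinary least squares fit on monomials in $(\tau, k)$. The sketch below is illustrative only: the names `TERMS_30` and `fit_iv_surface` are hypothetical, and the synthetic surface uses arbitrary coefficient values and a 252-day annualization chosen for the example.

```python
import numpy as np

# (i, j) index pairs of the monomials tau^i k^j appearing in (30).
TERMS_30 = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2)]

def fit_iv_surface(tau, k, iv, terms=TERMS_30):
    """Least squares fit of IV on the monomials tau^i k^j.

    Returns a dict mapping each (i, j) to its estimated coefficient beta^(i,j).
    """
    tau = np.asarray(tau, dtype=float)
    k = np.asarray(k, dtype=float)
    iv = np.asarray(iv, dtype=float)
    X = np.column_stack([tau**i * k**j for (i, j) in terms])
    beta, *_ = np.linalg.lstsq(X, iv, rcond=None)
    return dict(zip(terms, beta))

# Synthetic one-day surface: 6 maturities, 20 log-moneyness values each,
# with k restricted to +/- v * sqrt(tau) as in the simulation design.
v = 0.2
maturities = np.array([5, 10, 15, 20, 25, 30]) / 252.0
tau = np.repeat(maturities, 20)
k = np.concatenate([np.linspace(-v * np.sqrt(t), v * np.sqrt(t), 20)
                    for t in maturities])
true_beta = {(0, 0): 0.2, (1, 0): 0.1, (2, 0): -0.3, (0, 1): -0.35,
             (1, 1): 0.5, (2, 1): -1.0, (0, 2): 0.8, (1, 2): -2.0}
iv = sum(b * tau**i * k**j for (i, j), b in true_beta.items())
beta_hat = fit_iv_surface(tau, k, iv)
```

Dropping the pair `(1, 2)` from the list of terms gives a lower order specification with the same code.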

We then implement the method proposed in Section 3 to estimate the model parametrically. We consider two cases. The first one is exactly identified using $g = (g^{(1,0)}, g^{(0,1)}, g^{(0,2)}, g^{(1,1)})^{\top}$, while the second adds one more moment condition, $g^{(2,0)}$, to over-identify the parameters. Table 1 summarizes the results. We find that, for each parameter, the absolute bias is relatively small and is less than the corresponding finite-sample standard deviation. In the exactly identified (resp. over-identified) case, we compare for each parameter the finite-sample standard deviation exhibited in the fourth (resp. sixth) column of Table 1 with the consistent estimator of its asymptotic counterpart, based on $\hat V(\hat\theta)$ given in (20). Figure 2 (resp. 3) compares the finite-sample standard deviation for each parameter with the distribution of sample-based asymptotic counterparts in the exactly identified (resp. over-identified) case. Consider the upper left panel of Figure 2 as an example. The histogram characterizes the distribution of the sample-based asymptotic standard deviation $\sqrt{\hat V^{-1}_{11}(\hat\theta)/n}$ for parameter $\kappa$, where $\hat V^{-1}_{11}$ represents the $(1, 1)$th entry of the matrix $\hat V^{-1}$. The red star marks the corresponding finite-sample standard deviation shown in the fourth cell of the first row of Table 1. As shown in Figures 2 and 3, for each parameter, the finite-sample standard deviation falls within the range of its sample-based asymptotic counterparts in both cases. As the sample size further increases, the finite-sample standard deviation and its sample-based asymptotic counterpart tend to converge to each other, which shows that the sample-based approximation $\sqrt{\hat V^{-1}(\hat\theta)/n}$ of the asymptotic standard deviations is a reasonable estimator of the standard errors.

5.2 Nonparametric implied stochastic volatility model

Next, we apply the nonparametric method of Section 4 to the simulated data that was
generated
under the Heston model. In Figure 4, the upper left, upper right, middle left, and
middle right panels
exhibit the results for nonparametrically estimating the functions µ, −γ, η2, and η
of model (1a)–

(1b), respectively. Consider the upper left panel for the function µ. We perform
local polynomial

regression at equidistantly distributed values of v in the interval [0.1, 0.3]. For


each v ∈ [0.1, 0.3], we

mark the true value of µ(v) by a blue dot, according to its equation given in (28),
and plot the mean

of estimators of µ(v) on a black solid curve. Each point on the upper (resp. lower) dashed curve is obtained by shifting the corresponding point on the mean curve vertically upward (resp. downward) by a distance equal to twice the corresponding finite-sample standard deviation. As seen from the figure, the shape of the estimated nonparametric function resembles that of the true one on average. Moreover, the two dashed curves sandwich the true curve. This indicates that, at each point of interest, the nonparametric estimator is sufficiently close to the true value that the estimation bias is less than twice the corresponding standard deviation.

We then combine the estimators $\hat\gamma(\cdot)$ and $\hat\eta^2(\cdot)$ to estimate the leverage effect $\rho(v_t)$ under the nonparametric implied stochastic volatility model (1a)–(1b) by

$$\hat\rho(v_t) = \frac{\hat\gamma(v_t)}{\sqrt{\hat\gamma(v_t)^2 + \hat\eta(v_t)^2}}. \tag{31}$$

As in the other four panels of Figure 4, we exhibit the estimation results for ρ(v)
in the lowest

panel. We find that the shape of the estimated function ρ(v) is approximately
constant at the level

of parameter ρ, as it ought to be under the model of Heston (1993).

We propose in what follows a bootstrap estimator of standard error.

It is based on multiple

bootstrap replications out of one simulation trial, for mimicking an empirical


estimation scenario.

In each bootstrap replication, we reproduce an IV surface for each day. The


reproduced surface

contains the same number of implied volatilities as that of the original surface,
and the implied

volatilities on the reproduced surface are sampled as i.i.d.

replications of the volatilities on the

original surface. Based on the bootstrap “data”, we apply the same estimation
method proposed
in Section 4 to obtain the bootstrap estimators of functions µ(·), −γ(·), η2(·),
η(·), and ρ(·). The

bootstrap standard error of each function is accordingly calculated as the standard


deviations of its

multiple bootstrap estimators.
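The resampling scheme just described can be sketched generically as follows. The function name `bootstrap_standard_error`, the representation of each day's surface as a tuple of arrays `(tau, k, iv)`, and the `estimator` callback are assumptions made for illustration; in the actual application the callback would run the regression (16) followed by the nonparametric steps of Section 4.

```python
import numpy as np

def bootstrap_standard_error(surfaces, estimator, n_boot=500, seed=0):
    """Bootstrap standard errors of an estimator built from daily IV surfaces.

    surfaces  : list of (tau, k, iv) tuples of equal-length 1-D arrays,
                one tuple per day.
    estimator : callable mapping such a list to a 1-D array of estimates.
    Each replication redraws, for every day, as many observations as the
    original surface contains, sampling i.i.d. with replacement from it.
    """
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_boot):
        resampled = []
        for tau, k, iv in surfaces:
            idx = rng.integers(0, len(iv), size=len(iv))
            resampled.append((tau[idx], k[idx], iv[idx]))
        draws.append(np.asarray(estimator(resampled), dtype=float))
    # Standard deviation across replications, component by component.
    return np.std(np.stack(draws), axis=0, ddof=1)

# Toy usage: one surface, estimator = average IV on that surface.
rng = np.random.default_rng(42)
toy_iv = 0.2 + 0.01 * rng.standard_normal(400)
toy_surface = [(np.zeros(400), np.zeros(400), toy_iv)]
se = bootstrap_standard_error(
    toy_surface, lambda s: [np.mean(s[0][2])], n_boot=300, seed=7)
```

In the toy example the bootstrap standard error should be close to the usual standard error of a sample mean, which offers a quick sanity check of the resampling logic.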

To validate this method, which we will employ below in real data, we randomly
select one

simulation trial and calculate the bootstrap standard error of each coefficient
function out of 500

corresponding bootstrap estimators. Figure 5 summarizes the estimation result of


this trial.

In each panel of Figure 5, the blue dotted (resp. black solid) curve represents the true function (resp. nonparametric estimator). Each point on the upper (resp. lower) dashed curve is obtained by shifting the corresponding point on the black solid curve vertically upward (resp. downward) by a distance equal to twice the corresponding bootstrap standard error. Figure 5


suggests that our

nonparametric estimators are all accurate, as they are close to the corresponding
true functions.

More importantly, the bootstrap method appears to be valid from a comparison of


each panel in

Figure 5 with the corresponding one in Figure 4. Compare the upper right panels of
Figures 5 and

4 as an example. For any v, the lengths of intervals bounded by the two dashed
curves in these two

panels are close to each other. Thus, the bootstrap standard errors seem to provide
a reliable way

for calculating standard deviations in the coming empirical analysis.

6 Empirical results

We now employ S&P 500 options data covering the period from January 2, 2013 to
December 29,

2017, obtained from OptionMetrics. Guided by the simulation evidence discussed


above, we select

options with time-to-maturity between 15 and 60 calendar days, thereby excluding


both extremely

short-maturity ones which are subject to significant trading effects and biases, and
long-maturity ones

for which the IV expansion becomes less accurate. Table 2 reports the basic
descriptive statistics of

the sample of 269,622 observations. Table 2 divides the data into three (calendar)
days-to-expiration

categories and six log-moneyness categories. For each category, we report the total
number, mean,

and standard deviation of implied volatilities therein.

As in the simulations, we implement each day the regression (16) of implied volatilities with time-to-maturities between 15 and 60 days, and log-moneyness within $\pm v_t\sqrt{\tau}$. Here, $\tau$ is the annualized time-to-maturity and $v_t$ is the instantaneous volatility, which is estimated by the observed IV with both the time-to-maturity $\tau$ and the log-moneyness $k$ closest to 0 on that day. We run the regression only if at least four different time-to-maturities between 15 and 60 days are available; otherwise, we do not include that day in the sample. We end up with $n = 1{,}002$ IV surfaces at the daily frequency $\Delta = 1/252$. Moreover, for choosing the order of polynomial regression (16), a reasonable compromise is to set $(J, L(J)) = (2, (2, 2, 0))$, i.e.,

$$\Sigma^{\mathrm{data}}(\tau^{(m)}_l, k^{(m)}_l) = \beta^{(0,0)}_l + \beta^{(1,0)}_l\tau^{(m)}_l + \beta^{(2,0)}_l(\tau^{(m)}_l)^2 + \beta^{(0,1)}_l k^{(m)}_l + \beta^{(1,1)}_l\tau^{(m)}_l k^{(m)}_l + \beta^{(2,1)}_l(\tau^{(m)}_l)^2 k^{(m)}_l + \beta^{(0,2)}_l(k^{(m)}_l)^2 + \epsilon^{(m)}_l. \tag{32}$$

Compared with the bivariate regression (30) employed in the Monte Carlo experiments, we extend the time-to-maturity $\tau$ of the employed IV data to 60 days, owing to the scarcity of data with $\tau$ less than 30 days in practice, and remove the high order regression coefficient $\beta^{(1,2)}_l$ to reduce the standard errors of the estimators of the other low order coefficients without loss of accuracy.

Figure 6 plots a histogram of the R2 values achieved by the parametric regressions


(32) across
the full sample of IV surfaces. We find that for over 95% of the sample the $R^2$ values are
greater than

0.96, and essentially none are lower than 0.90, suggesting that (32) is quite
successful at fitting the

data. Incidentally, practitioners often use polynomial regression to fit the short-


maturity near at-
the-money region of the IV surfaces in their own internal models7, so it is not
surprising that the

market data we collect end up reflecting this feature. As an example, Figure 7 plots
the IV data

and the corresponding fitted surface produced by regression (32) on a randomly


selected day in our

sample (January 3, 2017).

6.1 Parametric implied stochastic volatility model

We now implement the method of Section 3 to estimate a parametric implied


stochastic volatility
model of the Heston (1993) type. Table 3 reports the GMM results for both the exactly identified and over-identified cases. First, the estimators of $\rho$ are around $-0.6$ in both cases, consistent with what can be heuristically inferred directly from the data $[\sigma^{(0,0)}]^{\mathrm{data}}$ and $[\sigma^{(0,2)}]^{\mathrm{data}}$, depicted by the corresponding histograms in Figure 8. The mean and standard deviation of the multiplicative data $([\sigma^{(0,0)}]^{\mathrm{data}})^3[\sigma^{(0,2)}]^{\mathrm{data}}$ are $1.55 \times 10^{-4}$ and $7.19 \times 10^{-3}$, respectively. Thus, there is no evidence for the mean of $([\sigma^{(0,0)}]^{\mathrm{data}})^3[\sigma^{(0,2)}]^{\mathrm{data}}$ to be significantly different from zero. On the other hand, it follows from the closed-form formulae for $\sigma^{(0,0)}(v)$ and $\sigma^{(0,2)}(v)$ given in (29) that

$$\sigma^{(0,0)}(v)^3\sigma^{(0,2)}(v) = -\frac{1}{48}\left(5\rho^2 - 2\right)\xi^2.$$

Heuristically, moment matching by equating this expression to the estimated zero mean requires $-\left(5\rho^2 - 2\right)\xi^2/48 = 0$. This would approximately estimate $\rho$ as $-0.63$, independently of the values of $v$ and $\xi$.

7 See, e.g., Gatheral (2006).

Second, the estimator of ξ is around 1 (resp. 0.8) in the exactly identified (resp.
over-identified)

case. We find that, for both of these two cases, the estimators of ξ are somewhat
greater than those

in the literature, which are usually less than 0.55 (see, e.g., Eraker et al.
(2003), A¨ıt-Sahalia and

Kimmel (2007), and Christoffersen et al. (2010) among others.) As pointed out in,
e.g., Eraker et al.

(2003), the Heston model tends to underestimate the slope of the IV smile with
small estimators of

ξ. However, our implied stochastic volatility model forces the model to fit this
slope by construction.
Recall that the closed-form formula for the slope $\sigma^{(0,1)}$, given in (29), is $\sigma^{(0,1)}(v) = \rho\xi/(4v)$. Thus, for fitting the usually steep slope, the corresponding moment condition requires (given $\rho$) $\xi$ to be larger than what other methods produce, and this is what our GMM estimation procedure delivers. Furthermore, based on the data $[\sigma^{(0,0)}]^{\mathrm{data}}$ and $[\sigma^{(0,1)}]^{\mathrm{data}}$ shown in Figure 8, the mean of the multiplicative data $[\sigma^{(0,0)}]^{\mathrm{data}}[\sigma^{(0,1)}]^{\mathrm{data}}$ is $-0.17$. On the other hand, it follows from (29) that

$$\sigma^{(0,0)}(v)\sigma^{(0,1)}(v) = \frac{\rho\xi}{4}.$$

Similar to the aforementioned determination of $\rho$ via the heuristic moment matching, we plug the estimated mean $-0.17$ of the multiplicative data and the estimator $-0.619$ of $\rho$ as shown in Table 3 into the above formula to solve for the parameter $\xi$ as 1.1, which basically agrees with our GMM estimator.
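The two heuristic moment-matching calculations of this subsection amount to a few lines of arithmetic, reproduced below as a sketch. The variable names are illustrative; the inputs (a zero mean for the convexity product, a mean of −0.17 for the slope product, and the estimate −0.619 of ρ) are the values quoted in the text, and the negative root for ρ is selected because the slope product is negative.

```python
import math

# Step 1: solve -(5 rho^2 - 2) xi^2 / 48 = 0 for rho, taking the negative
# root because the mean of sigma^(0,0) * sigma^(0,1) = rho * xi / 4 is negative.
rho_heuristic = -math.sqrt(2.0 / 5.0)

# Step 2: solve rho * xi / 4 = mean_slope_product for xi, given rho.
mean_slope_product = -0.17      # quoted sample mean
rho_gmm = -0.619                # quoted GMM estimate of rho
xi_heuristic = 4.0 * mean_slope_product / rho_gmm

print(round(rho_heuristic, 2), round(xi_heuristic, 1))
```

The rounded values agree with the figures −0.63 and 1.1 reported above.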

Third, in both the exactly identified and over-identified cases, the estimators for κ
are larger than

10, which are larger than those estimated in the literature. This is necessary given the large values of the volatility of variance ξ, to keep the volatility process vt mean-reverting sufficiently fast and consequently diminish the likelihood of having extreme volatilities. Fourth, again in both cases,

the estimators of the long term variance value α are around 0.02, which is
consistent with the low

values recorded by the S&P 500 volatility during the sample period.

6.2 Nonparametric implied stochastic volatility model

Using the same data, and the same expansion terms σ(0,0), σ(0,1), σ(0,2), and
σ(1,0) estimated from

(32), we now follow the method proposed in Section 4 to construct a nonparametric


implied stochastic

volatility model.

The results are summarized in Figure 9. The upper left, upper right, and middle
left panels
show the estimators ˆµ(·), −ˆγ(·), and ˆη2(·), respectively. The different elements
in each panel are as
follows. Consider the upper left panel. The dots represent realizations of [µ]data,
which we recall are

calculated according to (26) while the nonparametric estimator of the function µ is
shown as the solid

curve. The confidence intervals on the curve are pointwise and represent two
standard errors. The

standard errors are calculated by a standard bootstrap procedure based on 500


bootstrap replications
as proposed and validated in Section 5.2. Given the nonparametric estimator of η2
obtained based

on the real sample (resp. bootstrap sample), we calculate the corresponding


estimator of η by taking

a square root. The results of this calculation are presented in the middle right panel. Finally, given the estimators of $\gamma$ and $\eta^2$ based on the real sample (resp. bootstrap sample), we calculate the corresponding estimator of the leverage effect function $\rho$, i.e., $\hat\rho(v_t) = \hat\gamma(v_t)/\sqrt{\hat\gamma(v_t)^2 + \hat\eta(v_t)^2}$. The results are shown in the lowest panel. Likewise, the standard error of the estimator $\hat\eta$ (resp. $\hat\rho$) is calculated as the standard deviation of 500 bootstrap estimators of $\eta$ (resp. $\rho$).

We find that ˆµ(·) is positive (resp. negative) when its argument is relatively
small (resp. large),

consistent with mean reversion in vt. The upper right panel indicates that the
coefficient function
ˆγ(·) is always negative (the upper right panel shows −ˆγ(·)) and approximately
linear. As shown in

the middle right panel, ˆη(·) is always positive and concave, as opposed to being
approximately linear

as γ is. The leverage effect estimator ˆρ(·) is consistently negative, non constant,
and more negative

when vt increases. The negativeness of ρ(·) is a direct consequence of that of γ


given (2). This non
constant shape of ˆρ(·) versus vt implies that the leverage effect ρ(vt) is indeed
stochastic, unlike the
assumption in the Heston model.

Finally, we verify the goodness-of-fit of the expansion terms $\sigma^{(i,j)}$ involved in (32). In each panel of Figure 10, we plot the data of expansion term $\sigma^{(i,j)}$ as well as its fitted values $\hat\sigma^{(i,j)}$. Here, the data are inferred from the bivariate regression (32), while the fitted values $\hat\sigma^{(i,j)}$ are obtained by plugging in $\hat\mu$, $\hat\gamma$ and $\hat\eta$, as well as $\hat\gamma'$ (which we recall is estimated at the same time as $\hat\gamma$ by locally linear kernel regression) in the corresponding formula for $\sigma^{(i,j)}$ given in (11)–(12). The fitted expansion terms $\hat\sigma^{(0,1)}$, $\hat\sigma^{(0,2)}$, and $\hat\sigma^{(1,0)}$ match the data well, which is expected since they are inputs in the construction. Surprisingly, however, we find that the fitted expansion terms $\hat\sigma^{(1,1)}$ and $\hat\sigma^{(2,0)}$ also match the data well, as shown in the middle right and lowest panels of Figure 10, even though the expansion terms $\sigma^{(1,1)}$ and $\sigma^{(2,0)}$ (corresponding to the mixed slope $\Sigma_{1,1}$ and term-structure convexity $\Sigma_{2,0}$ of the IV surface up to some constants according to (8)) are not employed in the nonparametric construction of the implied model, and require higher order derivatives of the

coefficient functions. This indicates that the nonparametric implied stochastic


volatility model is

flexible enough to reproduce all the second order shape characteristics of IV


surface, or equivalently

that all the shape characteristics of the IV surface up to the second order are
consistent with the

nonparametric implied stochastic volatility model. All the aforementioned six shape
characteristics,

that our implied model fits, are more than enough for characterizing an IV surface
in the short-

maturity and near at-the-money region.

7 Extension: Adding jumps to the model

We now generalize our approach to include jumps in returns to the model (1a)–(1b):

$$\frac{dS_t}{S_{t-}} = (r - d - \lambda(v_t)\bar\mu)dt + v_t dW_{1t} + (\exp(J_t) - 1)dN_t, \tag{33a}$$

$$dv_t = \mu(v_t)dt + \gamma(v_t)dW_{1t} + \eta(v_t)dW_{2t}. \tag{33b}$$

$N_t$ is a doubly stochastic Poisson process (or Cox process) with stochastic intensity $\lambda(v_t)$. $J_t$ represents the size of the log-price jump, which is assumed to be independent of the asset price $S_t$. When a jump occurs at time $t$, the log-price $\log S_t$ changes according to $\log S_t - \log S_{t-} = J_t$, i.e., $S_t - S_{t-} = (\exp(J_t) - 1)S_{t-}$, where $S_{t-}$ denotes the pre-jump price of the asset. The constant $\bar\mu$ is

$$\bar\mu = E[\exp(J_t)] - 1,$$

where $E$ denotes risk-neutral expectation. Based on this choice of $\bar\mu$, the drift term $-\lambda(v_t)\bar\mu\,dt$ compensates the jump component $(\exp(J_t) - 1)dN_t$ in the sense that the process $\int_0^t(\exp(J_s) - 1)dN_s - \int_0^t\lambda(v_s)\bar\mu\,ds$ becomes a martingale under the risk-neutral measure.

A typical example (as in Merton (1976)) is one where the jump size $J_t$ is normally distributed with mean $\mu_J$ and variance $\sigma_J^2$, in which case

$$\bar\mu = \exp\left(\mu_J + \frac{\sigma_J^2}{2}\right) - 1. \tag{34}$$

For future reference, we also define

$$\mu_+ = \frac{\mu_J}{\sigma_J} + \sigma_J \quad\text{and}\quad \mu_- = \frac{\mu_J}{\sigma_J} \tag{35}$$

and let $N$ denote the standard Normal cumulative distribution function.
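The quantities in (34) and (35) are simple to compute, and a short Monte Carlo check makes their meaning concrete. The sketch below is illustrative: the function names and the parameter values are assumptions, and the identity $E[(e^{J} - 1)^+] = (\bar\mu + 1)N(\mu_+) - N(\mu_-)$ used in the second check is the standard lognormal partial expectation, included here only to indicate where combinations of $N(\mu_+)$ and $N(\mu_-)$ come from; it is not quoted from the text.

```python
import math
import numpy as np

def normal_cdf(x):
    """Standard Normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def merton_jump_quantities(mu_j, sigma_j):
    """Return (mu_bar, mu_plus, mu_minus) as defined in (34) and (35)."""
    mu_bar = math.exp(mu_j + 0.5 * sigma_j**2) - 1.0
    mu_plus = mu_j / sigma_j + sigma_j
    mu_minus = mu_j / sigma_j
    return mu_bar, mu_plus, mu_minus

# Illustrative jump-size parameters (assumed values).
mu_j, sigma_j = -0.05, 0.1
mu_bar, mu_plus, mu_minus = merton_jump_quantities(mu_j, sigma_j)

# Monte Carlo check of mu_bar = E[exp(J)] - 1.
rng = np.random.default_rng(0)
jumps = mu_j + sigma_j * rng.standard_normal(200_000)
mu_bar_mc = np.mean(np.exp(jumps)) - 1.0

# Monte Carlo check of the lognormal partial expectation E[(exp(J) - 1)^+].
positive_part_mc = np.mean(np.maximum(np.exp(jumps) - 1.0, 0.0))
positive_part_cf = (mu_bar + 1.0) * normal_cdf(mu_plus) - normal_cdf(mu_minus)
```

Both simulated averages should agree with their closed-form counterparts up to Monte Carlo error.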

Adding jumps to the volatility dynamics, or infinite activity jumps to either


returns or volatility

dynamics, has the potential to improve the fit and realism of the model even further
but would

substantially alter the approach we employ to derive the IV expansion. So for now
we consider only

the case of jumps in returns and leave these further extensions to future work.

7.1 The effect of jumps on the implied volatility expansion

Following the same analysis as in Section 2, it is straightforward to see that the IV $\Sigma$ remains, as in the continuous model, a trivariate function of $\tau$, $k$, and $v_t$ in the form given by (5). A generalization of the expansion (7) of the IV surface $\Sigma(\tau, k, v_t)$ to the case of jumps will now incorporate the square root of time-to-maturity $\sqrt{\tau}$, as well as possibly its negative powers:

$$\Sigma^{(J,L(J))}(\tau, k, v_t) = \sum_{j=0}^{J}\sum_{i=\min(0,1-j)}^{L_j}\phi^{(i,j)}(v_t)\tau^{\frac{i}{2}}k^j, \tag{36}$$

where $J$ and $L(J) = (L_0, L_1, \cdots, L_J)$ with $L_j \geq \min(0, 1 - j)$ are integers. Expansion (36) includes negative powers of $\sqrt{\tau}$ if the lowest power $\min(0, 1 - J)$ of $\sqrt{\tau}$ in the double summation is less than or equal to $-1$, i.e., if $J \geq 2$. With the presence of jumps in returns, the away-from-the-money IV will possibly explode to infinity as the time-to-maturity shrinks to zero: this limiting behavior was noted by Carr and Wu (2003), who used this divergence to construct a test for the presence of jumps in the data, and by Andersen and Lipton (2013).

We show in Appendix B that with Normally distributed jumps, the $(3, (2, 1, 0, -2))$th order expansion of (36) is given by

$$\Sigma^{(3,(2,1,0,-2))}(\tau, k, v_t) = \phi^{(0,0)}(v_t) + \phi^{(1,0)}(v_t)\tau^{\frac{1}{2}} + \phi^{(2,0)}(v_t)\tau + \phi^{(0,1)}(v_t)k + \phi^{(1,1)}(v_t)\tau^{\frac{1}{2}}k + \phi^{(-1,2)}(v_t)\tau^{-\frac{1}{2}}k^2 + \phi^{(0,2)}(v_t)k^2 + \phi^{(-2,3)}(v_t)\tau^{-1}k^3,$$
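where the $(0, 0)$th order term $\phi^{(0,0)}(v_t)$ coincides with the spot volatility $v_t$, and the closed-form formulae of all other terms are given by

$$\phi^{(-1,2)}(v_t) = \frac{\sqrt{\pi/2}\,\lambda(v_t)}{2v_t^2}\left(-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)\right), \quad \phi^{(-2,3)}(v_t) = \frac{\lambda(v_t)\bar\mu}{3v_t^2}, \tag{37a}$$

$$\phi^{(0,1)}(v_t) = \frac{1}{2v_t}\left[2\lambda(v_t)\bar\mu + \gamma(v_t)\right], \quad \phi^{(1,0)}(v_t) = 2v_t^2\phi^{(-1,2)}(v_t), \tag{37b}$$

and

$$\phi^{(1,1)}(v_t) = \frac{\sqrt{\pi/2}\,\lambda(v_t)}{2v_t^2}\left[2(r - d)\bar\mu + 2(\bar\mu + 1)N(\mu_+)\left(2(d - r) - v_t^2\right) + \bar\mu v_t^2 + 2v_t^2 - 2N(\mu_-)\left(2(d - r) + v_t^2\right)\right], \tag{37c}$$

$$\phi^{(0,2)}(v_t) = \frac{1}{12v_t^3}\left[-3\gamma(v_t)^2 + 2v_t\gamma(v_t)\gamma'(v_t) - 3\pi\lambda(v_t)^2\left(\bar\mu - 2(\bar\mu + 1)N(\mu_+) + 2N(\mu_-)\right)^2 + 2\eta(v_t)^2 + 6\lambda(v_t)\left(\bar\mu\left(2(d - r) - \gamma(v_t)\right) - (\bar\mu + 2)v_t^2\right)\right], \tag{37d}$$

as well as

$$\phi^{(2,0)}(v_t) = \frac{1}{24v_t}\left[6v_t^2\gamma(v_t) + 2\eta(v_t)^2 + 12\lambda(v_t)\left(\bar\mu\left(2(d - r) + \gamma(v_t)\right) - (\bar\mu + 2)v_t^2\right) + 3\gamma(v_t)^2 + 12(d - r)\gamma(v_t) + 2v_t\left(6\mu(v_t) - 2\gamma(v_t)\left(3\bar\mu\lambda'(v_t) + \gamma'(v_t)\right)\right) + 12\bar\mu^2\lambda(v_t)^2\right]. \tag{37e}$$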

Note that if we set the jump intensity function λ(v) to zero, the expansion (36)
reduces to

the expansion (7) under the continuous SV model: under the model (1a)–(1b), the
expansion term
ϕ(i,j)(v) is identically zero for any negative or odd integer i and the expansion
term ϕ(i,j)(v) coincides
with σ(i/2,j)(v) for any nonnegative even integer i.

7.2 Example: The Merton jump-diffusion model

Consider the special case of the jump-diffusion model of Merton (1976):

$$\frac{dS_t}{S_{t-}} = (r - d - \lambda\bar\mu)dt + v_0 dW_t + (\exp(J_t) - 1)dN_t,$$

where $\lambda$ represents a constant jump intensity and $v_0$ a constant volatility. We obtain expansion (36) under this model simply by setting the SV components to zero and letting the jump intensity function be the constant $\lambda$, i.e.,

$$v_t = v_0 \quad\text{and}\quad \lambda(v_t) = \lambda. \tag{38}$$

The expansion terms $\phi^{(0,1)}(v_t)$, $\phi^{(-1,2)}(v_t)$, and $\phi^{(1,1)}(v_t)$ reduce to

$$\phi^{(0,1)}(v_0) = \frac{\lambda\bar\mu}{v_0}, \quad \phi^{(-1,2)}(v_0) = \frac{\sqrt{\pi/2}\,\lambda}{2v_0^2}\left(-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)\right),$$

and

$$\phi^{(1,1)}(v_0) = \frac{\sqrt{\pi/2}\,\lambda}{2v_0^2}\left[2(r - d)\bar\mu + 2(\bar\mu + 1)N(\mu_+)\left(2(d - r) - v_0^2\right) + \bar\mu v_0^2 + 2v_0^2 - 2N(\mu_-)\left(2(d - r) + v_0^2\right)\right],$$

from the general formulae provided in (37b), (37a), and (37c), respectively.

7.3 From implied volatility to stochastic volatility and jumps

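Returning to the general model, the terms $\phi^{(i,j)}$ correspond to at-the-money IV shape characteristics or combinations thereof, as the time-to-maturity shrinks to zero, up to time scalings. For instance, the expansion terms $\phi^{(0,0)}(v_t)$, $\phi^{(-2,3)}(v_t)$, $\phi^{(0,1)}(v_t)$, $\phi^{(1,1)}(v_t)$, and $\phi^{(-1,2)}(v_t)$ satisfy

$$\phi^{(0,0)}(v_t) = \lim_{\tau\to 0}\Sigma(\tau, 0, v_t), \quad \phi^{(-2,3)}(v_t) = \lim_{\tau\to 0}\frac{\tau}{6}\frac{\partial^3\Sigma}{\partial k^3}(\tau, 0, v_t), \quad \phi^{(0,1)}(v_t) = \lim_{\tau\to 0}\frac{\partial\Sigma}{\partial k}(\tau, 0, v_t), \tag{39a}$$

$$\phi^{(1,1)}(v_t) = \lim_{\tau\to 0}2\sqrt{\tau}\,\frac{\partial^2\Sigma}{\partial\tau\partial k}(\tau, 0, v_t), \quad \phi^{(-1,2)}(v_t) = \lim_{\tau\to 0}\frac{\sqrt{\tau}}{2}\frac{\partial^2\Sigma}{\partial k^2}(\tau, 0, v_t), \tag{39b}$$

while the expansion terms $\phi^{(0,2)}(v_t)$ and $\phi^{(2,0)}(v_t)$ satisfy

$$\phi^{(0,2)}(v_t) = \lim_{\tau\to 0}\left(\frac{1}{2}\frac{\partial^2\Sigma}{\partial k^2}(\tau, 0, v_t) + \tau\frac{\partial^3\Sigma}{\partial k^2\partial\tau}(\tau, 0, v_t)\right) \tag{39c}$$

and

$$\phi^{(2,0)}(v_t) = \lim_{\tau\to 0}\left(\frac{\partial\Sigma}{\partial\tau}(\tau, 0, v_t) + 2\tau\frac{\partial^2\Sigma}{\partial\tau^2}(\tau, 0, v_t)\right). \tag{39d}$$

Formulae (39a)–(39d) hinge on the univariate expansions of at-the-money shape characteristics $\partial^{i+j}\Sigma(\tau, 0, v_t)/\partial\tau^i\partial k^j$ with respect to $\sqrt{\tau}$, while these expansions can be obtained simply by differentiating both sides of the bivariate expansion (36) $i$ times with respect to $\tau$ and $j$ times with respect to $k$, and then setting $k$ to zero. We provide the details for these calculations at the end of Appendix B.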

The following result then generalizes Theorem 1 to the case where jumps are
present. It establishes

that the coefficient functions µ(·), γ(·), η(·), and λ(·), as well as the jump size
parameters µJ and σJ ,
can be recovered explicitly in terms of the shape characteristics (6) of the IV
surface, or equivalently
in terms of the relevant coefficients ϕ(i,j):

Theorem 2. The jump size parameters µJ and σJ of the model (33a)–(33b) can be
recovered by the
following coupled algebraic equations

¯µ + 2 − 2(¯µ + 1)N (µ+) − 2N (µ−)


−¯µ + 2(¯µ + 1)N (µ+) − 2N (µ−)

1
ϕ(0,0)(vt)2

(cid:34)

ϕ(1,1)(vt)
ϕ(−1,2)(vt)

(cid:35)

+ 2(r − d)

(40a)

and

2¯µ
π[−¯µ + 2(¯µ + 1)N (µ+) − 2N (µ−)]

ϕ(−2,3)(vt)
ϕ(−1,2)(vt)

The coefficient functions λ(·), γ(·), η(·), and µ(·) can be recovered in closed form
as

λ(vt) =

2ϕ(0,0)(vt)2ϕ(−1,2)(vt)
π[−¯µ + 2(¯µ + 1)N (µ+) − 2N (µ−)]

γ(vt) = 2ϕ(0,0)(vt)ϕ(0,1)(vt) − 2λ(vt)¯µ,

and

(40b)
(40c)

(40d)

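$$\eta(v_t) = \left[6\phi^{(0,0)}(v_t)^3\phi^{(0,2)}(v_t) - \phi^{(0,0)}(v_t)\gamma(v_t)\gamma'(v_t) + \frac{3}{2}\pi\lambda(v_t)^2\left(\bar\mu - 2(\bar\mu + 1)N(\mu_+) + 2N(\mu_-)\right)^2 + \frac{3}{2}\gamma(v_t)^2 + 3\lambda(v_t)\left(2\bar\mu(r - d) + (\bar\mu + 2)\phi^{(0,0)}(v_t)^2 + \bar\mu\gamma(v_t)\right)\right]^{\frac{1}{2}} \tag{40e}$$

as well as

$$\mu(v_t) = \frac{1}{12\phi^{(0,0)}(v_t)}\left[24\phi^{(0,0)}(v_t)\phi^{(2,0)}(v_t) - 6\phi^{(0,0)}(v_t)^2\gamma(v_t) - 2\eta(v_t)^2 - 12\lambda(v_t)\left(\bar\mu\left(2(d - r) + \gamma(v_t)\right) - (\bar\mu + 2)\phi^{(0,0)}(v_t)^2\right) - 12(d - r)\gamma(v_t) - 3\gamma(v_t)^2 - 12\bar\mu^2\lambda(v_t)^2 + 4\phi^{(0,0)}(v_t)\gamma(v_t)\left(3\bar\mu\lambda'(v_t) + \gamma'(v_t)\right)\right]. \tag{40f}$$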

Equations (40a)–(40f) constitute a complete mapping from the expansion terms ϕ(i,j)
(vt) of the

IV surface to the specification of the SV model (33a)–(33b).

Here is how the jump size parameters µJ and σJ are determined from the IV surface. By assembling the algebraic equation system (40a)–(40b) and the geometric interpretations of the involved expansion terms ϕ(0,0), ϕ(−2,3), ϕ(1,1), and ϕ(−1,2) provided in (39a)–(39b), we obtain the following mapping from the shape characteristics (on the right hand side) to the jump parameters µJ and σJ (on the left hand side):

$$\frac{\bar\mu + 2 - 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)}{-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)} = \frac{1}{\lim_{\tau\to 0}\Sigma(\tau, 0, v_t)^2}\left[\frac{2\lim_{\tau\to 0}\sqrt{\tau}\,\frac{\partial^2\Sigma}{\partial\tau\partial k}(\tau, 0, v_t)}{\lim_{\tau\to 0}\frac{\sqrt{\tau}}{2}\frac{\partial^2\Sigma}{\partial k^2}(\tau, 0, v_t)} - 2(d - r)\right] \tag{41a}$$

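and

$$\frac{2\bar\mu}{\sqrt{\pi/2}\,[-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)]} = 3\lim_{\tau\to 0}\Sigma(\tau, 0, v_t)\cdot\frac{\lim_{\tau\to 0}\frac{\tau}{6}\frac{\partial^3\Sigma}{\partial k^3}(\tau, 0, v_t)}{\lim_{\tau\to 0}\frac{\sqrt{\tau}}{2}\frac{\partial^2\Sigma}{\partial k^2}(\tau, 0, v_t)}, \tag{41b}$$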
where we recall that ¯µ, µ+, and µ− are deterministic functions of µJ and σJ defined in (34) and (35). According to these equations, one needs various at-the-money IV shape characteristics in both log-moneyness and time-to-maturity dimensions to pin down µJ and σJ, without requiring any prior identification of any of the coefficient functions λ(·), µ(·), γ(·), or η(·). In particular, it deserves noting that the third order derivative ∂3Σ/∂k3 plays a crucial role. This is somewhat unfortunate from an empirical perspective as it implies that the jump size parameters µJ and σJ depend on a higher order structure of the IV surface that will be difficult to estimate precisely in the absence of large amounts of high quality options data.
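The left-hand sides of these equations can be evaluated numerically with a minimal Python sketch. Definitions (34) and (35) are not reproduced in this section, so the sketch assumes the usual lognormal-jump forms ¯µ = exp(µJ + σJ²/2) − 1, µ− = µJ/σJ, and µ+ = µJ/σJ + σJ; all function names are hypothetical. Under these assumed definitions, the recurring bracket −¯µ + 2(¯µ + 1)N(µ+) − 2N(µ−) coincides with E|e^J − 1|, which the last function approximates by quadrature as a cross-check.

```python
import math

def norm_cdf(x):
    """Standard normal CDF N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def jump_terms(mu_j, sigma_j):
    """(mu_bar, mu_plus, mu_minus) under the ASSUMED lognormal-jump definitions."""
    mu_bar = math.exp(mu_j + 0.5 * sigma_j ** 2) - 1.0
    mu_minus = mu_j / sigma_j
    mu_plus = mu_minus + sigma_j
    return mu_bar, mu_plus, mu_minus

def denominator_bracket(mu_j, sigma_j):
    """-mu_bar + 2(mu_bar + 1)N(mu_plus) - 2N(mu_minus)."""
    mu_bar, mu_plus, mu_minus = jump_terms(mu_j, sigma_j)
    return -mu_bar + 2.0 * (mu_bar + 1.0) * norm_cdf(mu_plus) - 2.0 * norm_cdf(mu_minus)

def numerator_bracket(mu_j, sigma_j):
    """mu_bar + 2 - 2(mu_bar + 1)N(mu_plus) - 2N(mu_minus)."""
    mu_bar, mu_plus, mu_minus = jump_terms(mu_j, sigma_j)
    return mu_bar + 2.0 - 2.0 * (mu_bar + 1.0) * norm_cdf(mu_plus) - 2.0 * norm_cdf(mu_minus)

def lhs_40a(mu_j, sigma_j):
    """Ratio of the two brackets, i.e. the left-hand side of (40a)."""
    return numerator_bracket(mu_j, sigma_j) / denominator_bracket(mu_j, sigma_j)

def expected_abs_jump(mu_j, sigma_j, n=20000, width=10.0):
    """Trapezoidal approximation of E|exp(J) - 1| for J ~ N(mu_j, sigma_j^2)."""
    lo = mu_j - width * sigma_j
    h = 2.0 * width * sigma_j / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        dens = math.exp(-0.5 * ((x - mu_j) / sigma_j) ** 2) / (sigma_j * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * abs(math.exp(x) - 1.0) * dens * h
    return total
```

Solving the coupled system for (µJ, σJ) would then amount to matching these functions of (µJ, σJ) to the estimated right-hand sides with a two-dimensional root finder; that step is not shown here.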

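The stochastic intensity function λ(vt) is characterized by:

$$\lambda(v_t) = \frac{2\lim_{\tau\to 0}\Sigma(\tau, 0, v_t)^2\cdot\lim_{\tau\to 0}\frac{\sqrt{\tau}}{2}\frac{\partial^2\Sigma}{\partial k^2}(\tau, 0, v_t)}{\sqrt{\pi/2}\,[-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)]}. \tag{42}$$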

So the short-maturity at-the-money IV convexity ∂2Σ(τ, 0, vt)/∂k2 is involved in determining the stochastic intensity function λ(vt) but not any third order characteristics, except of course that those were already needed to identify ¯µ, µ+, and µ−, which enter (42). In the continuous case, the at-the-money convexity is finite as the time-to-maturity shrinks to zero, since equations (8) and (6) imply that limτ→0 ∂2Σ(τ, 0, vt)/∂k2 = 2σ(0,2)(vt). By contrast, under the discontinuous model, the convexity ∂2Σ(τ, 0, vt)/∂k2 explodes to infinity as the time-to-maturity shrinks to zero, since the last equation in (39b) directly implies that ∂2Σ(τ, 0, vt)/∂k2 ∼ 2ϕ(−1,2)(vt)/√τ as τ → 0 with ϕ(−1,2)(vt) finite. The formula (42) remains valid in the limiting case where the intensity λ(vt) tends to zero, i.e., the jumps degenerate. This is because the convexity behaves in that case according to limτ→0 √τ ∂2Σ(τ, 0, vt)/∂k2 = 0, which obviously results in the right hand side of (42) tending to zero.

The volatility functions γ(vt) and η(vt) and the drift function µ(vt) are all affected by the presence of jumps. Compared to the continuous case, the third order mixed partial derivative ∂3Σ/∂k2∂τ participates in determining the volatility function η(vt), while the term-structure slope ∂Σ/∂τ and the term-structure convexity ∂2Σ/∂τ2 participate in determining the drift function µ(vt). By contrast, in the continuous case, the term-structure slope ∂Σ/∂τ is the only IV characteristic along the term-structure dimension that matters.

Equations (41a), (41b), and (42) (equivalently, (40a)–(40c) in Theorem 2) apply to the jump-diffusion model of Merton (1976), by plugging in the specification assumptions (38). One is able to identify all the model components, i.e., the constant volatility v0, intensity λ, as well as jump size parameters µJ and σJ. Combining the following equations

$$\frac{\lambda\bar\mu}{v_0} = \lim_{\tau\to 0}\frac{\partial\Sigma}{\partial k}(\tau, 0, v_t),$$

$$\sqrt{\frac{\pi}{2}}\,\frac{\lambda}{2v_0^2}\,\big(-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)\big) = \lim_{\tau\to 0}\frac{\sqrt{\tau}}{2}\frac{\partial^2\Sigma}{\partial k^2}(\tau, 0, v_t),$$

$$\sqrt{\frac{\pi}{2}}\,\frac{\lambda}{2v_0^2}\,\big[2(r - d)\bar\mu + (\bar\mu + 2)v_0^2 + 2(\bar\mu + 1)N(\mu_+)\big(2(d - r) - v_0^2\big) - 2N(\mu_-)\big(2(d - r) + v_0^2\big)\big] = \lim_{\tau\to 0} 2\sqrt{\tau}\,\frac{\partial^2\Sigma}{\partial\tau\partial k}(\tau, 0, v_t)$$

with the first equation in (39a), i.e., v0 = limτ→0 Σ(τ, 0, v0), we can identify the parameters of the Merton model v0, λ, µJ, and σJ, given observations on the following four observable short-maturity IV shape characteristics: the at-the-money level Σ, slope ∂Σ/∂k, convexity ∂2Σ/∂k2, and the mixed slope ∂2Σ/∂τ∂k, all evaluated at (τ, 0, v0). If employing equation (41b) instead, the much less easily observable third order shape characteristic ∂3Σ/∂k3 would become necessary.

7.4 Implied stochastic volatility models with jumps

The analysis in Section 7.3 provides a theoretical foundation for constructing parametric and nonparametric implied stochastic volatility models with jumps. In practice, to construct a parametric model, based on the closed-form formulae for the expansion terms ϕ(i,j), we can use the moment conditions

$$E\big[g^{(i,j)}(v_{l\Delta}; \theta_0)\big] = 0, \quad\text{with}\quad g^{(i,j)}(v_{l\Delta}; \theta) = [\varphi^{(i,j)}]^{\text{data}}_{l} - [\varphi^{(i,j)}(v_{l\Delta}; \theta)]^{\text{model}},$$

where $[\varphi^{(i,j)}]^{\text{data}}_{l}$ denotes the data of ϕ(i,j)(vl∆). We then apply the GMM estimation approach proposed in Section 3 to estimate the parameters θ.

To construct a nonparametric model, we can in principle estimate the jump size parameters µJ and σJ before estimating the coefficient functions λ(·), µ(·), γ(·), and η(·) as discussed. Indeed, the estimators of µJ and σJ can be obtained by the two (exactly identified) conditions as the sample analogs of algebraic equations (40a) and (40b)

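$$\frac{\bar\mu + 2 - 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)}{-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)} = \frac{1}{n}\sum_{l=1}^{n}\frac{1}{\big([\varphi^{(0,0)}]^{\text{data}}_{l}\big)^2}\left(\frac{[\varphi^{(1,1)}]^{\text{data}}_{l}}{[\varphi^{(-1,2)}]^{\text{data}}_{l}} + 2(r - d)\right)$$

and

$$\frac{2\bar\mu}{\sqrt{\pi/2}\,[-\bar\mu + 2(\bar\mu + 1)N(\mu_+) - 2N(\mu_-)]} = \frac{3}{n}\sum_{l=1}^{n}\frac{[\varphi^{(0,0)}]^{\text{data}}_{l}\,[\varphi^{(-2,3)}]^{\text{data}}_{l}}{[\varphi^{(-1,2)}]^{\text{data}}_{l}}.$$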

Then, regarding the estimators of jump size parameters as inputs, equations (40c)–(40f) allow us to estimate coefficient functions λ(·), γ(·), η(·), and µ(·) one after another iteratively, by following an approach similar to that proposed in Section 4 for constructing a nonparametric SV model without jumps.

7.5 Empirical challenges when jumps are present in the model

So, we have shown that it is possible in theory to imply a SV model with jumps from the shape characteristics of the IV surface. However, given the current liquidity of options markets and resulting availability of options data, one would encounter significant practical challenges when implementing the above strategy. As we did in the continuous case, it is natural to interpret the expansion (36) as the following regression

$$\Sigma^{\text{data}}\big(\tau^{(m)}_{l}, k^{(m)}_{l}\big) = \sum_{j=0}^{J}\sum_{i=\min(0,1-j)}^{L_j}\beta^{(i,j)}_{l}\big(\tau^{(m)}_{l}\big)^{i/2}\big(k^{(m)}_{l}\big)^{j} + \epsilon^{(m)}_{l}, \quad\text{for } m = 1, 2, \ldots, n_l, \tag{44}$$

and the estimator of the coefficient $\beta^{(i,j)}_{l}$ serves as the data $[\varphi^{(i,j)}]^{\text{data}}_{l}$.

Similar to the case for regression (16), the choice of the orders J, L0, L1, . . . , LJ and the regions of IV surface data employed in regression (44) should strike a balance between, on the one hand, the accuracy of the expansion Σ(J,L(J)) and, on the other hand, over-fitting the regression to the IV data. Most importantly, the presence of jumps necessitates the estimation of third order characteristics of the IV surface, which in our experience is effectively impossible to do accurately given the limitations of the data currently available. A substantially denser set of observations on option prices or implied volatilities would be necessary to accurately estimate third order derivatives without the error introduced by the strike and maturity surface interpolation implicit in (44). Furthermore, the divergence of the IV surface due to the presence of negative powers of τ also requires very short maturity options to be accurately observed (as in Carr and Wu (2003)'s test for the presence of jumps in options data); such data can be affected by trading patterns specific to options with, e.g., time-to-maturity τ less than one week, and log-moneyness k within ±0.1v√τ, where v represents the instantaneous volatility. This makes inferring the desired data $[\varphi^{(i,j)}]^{\text{data}}_{l}$ from the IV surface and the subsequent procedures for constructing implied stochastic volatility models substantially more difficult, since we need not only to identify the divergence as in Carr and Wu (2003) but also to estimate higher order coefficients.
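To make the structure of regression (44) concrete, the following Python sketch builds the regressors τ^{i/2} k^j over the index set j = 0, . . . , J and i = min(0, 1 − j), . . . , Lj, and fits the coefficients by ordinary least squares on one day's cross-section. It is an illustration only: the function names are hypothetical, the least-squares step uses plain normal equations, and no weighting, outlier handling, or choice of the data region is attempted.

```python
def basis_terms(J, L):
    """Index pairs (i, j) of regression (44): j = 0..J, i = min(0, 1 - j)..L[j]."""
    return [(i, j) for j in range(J + 1) for i in range(min(0, 1 - j), L[j] + 1)]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def fit_iv_surface(taus, ks, ivs, J, L):
    """Least-squares estimates of beta^(i,j), returned as {(i, j): value}."""
    terms = basis_terms(J, L)
    X = [[tau ** (i / 2.0) * k ** j for (i, j) in terms] for tau, k in zip(taus, ks)]
    p = len(terms)
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * y for row, y in zip(X, ivs)) for a in range(p)]
    return dict(zip(terms, solve_linear(xtx, xty)))
```

For example, `basis_terms(2, [1, 0, -1])` gives the terms (0,0), (1,0), (0,1), and (−1,2). The regressor for (−1,2) is k²/√τ, which is the divergent term discussed above and is what makes very short maturity observations so influential in the fit.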

8 Conclusions and future directions

We proposed to construct implied stochastic volatility models to be consistent with observed shape characteristics of the implied volatility market data. In the construction of a parametric model, all parameters are estimated in one pass, regardless of how they get involved in expansion terms. In the construction of a nonparametric model, the coefficient functions are estimated one after another iteratively, based on the closed-form relationships we derived. At least in principle, implied stochastic volatility models in higher dimensions can be constructed using the same principle, although a bivariate nonparametric implied stochastic volatility model, as we considered and empirically illustrated, is flexible enough in terms of fitting the observable and practically useful shape characteristics of the implied volatility surface (the level, slope and convexity along the moneyness dimension, as well as the slope along the term-structure dimension).

When jumps are introduced to the model, we showed that the same ideas continue to work in principle and that a full characterization of the stochastic volatility model can still be obtained in closed form, at least for models with jumps only in the returns dynamics. However, higher order shape characteristics become necessary, whose estimation requires substantially denser options observations in both time and moneyness than are currently available, even though options with shorter maturities, such as weeklies, have recently become more liquid. Adding jumps to the volatility dynamics, or infinite activity jumps to either returns or volatility dynamics, would substantially alter the approach we employ to derive the implied volatility expansion as a tool, and require in practice more accurate and delicate shape characteristics for fully recovering the model components. We intend to pursue this line of inquiry in future research.

References

Aït-Sahalia, Y., 2002. Maximum-likelihood estimation of discretely-sampled diffusions: A closed-form approximation approach. Econometrica 70, 223–262.

Aït-Sahalia, Y., Fan, J., Li, Y., 2013. The leverage effect puzzle: Disentangling sources of bias at high frequency. Journal of Financial Economics 109, 224–249.

Aït-Sahalia, Y., Kimmel, R., 2007. Maximum likelihood estimation of stochastic volatility models. Journal of Financial Economics 83, 413–452.

Aït-Sahalia, Y., Mykland, P. A., 2003. The effects of random and discrete sampling when estimating continuous-time diffusions. Econometrica 71, 483–549.

Andersen, L., Andreasen, J., 2000. Jump-diffusion processes: Volatility smile fitting
and numerical

methods for option pricing. Review of Derivatives Research 4 (3), 231–262.

Andersen, L., Lipton, A., 2013. Asymptotics for exponential Lévy processes and their volatility smile: Survey and new results. International Journal of Theoretical and Applied Finance 16, 1–98.

Bates, D. S., 1996. Jumps and stochastic volatility: Exchange rate processes
implicit in Deutsche

Mark options. Review of Financial Studies 9, 69–107.

Berestycki, H., Busca, J., Florent, I., 2002. Asymptotics and calibration of local
volatility models.

Quantitative Finance 2, 61–69.

Berestycki, H., Busca, J., Florent, I., 2004. Computing the implied volatility in
stochastic volatility

models. Communications on Pure and Applied Mathematics 57, 1352–1373.

Brémaud, P., 1981. Point Processes and Queues: Martingale Dynamics. Springer-Verlag.

Carr, P., Cousot, L., 2011. A PDE approach to jump-diffusions. Quantitative Finance
11 (1), 33–52.

Carr, P., Cousot, L., 2012. Explicit constructions of martingales calibrated to


given implied volatility

smiles. SIAM Journal on Financial Mathematics 3 (1), 182–214.


Carr, P., Geman, H., Madan, D. B., Yor, M., 2004. From local volatility to local Lévy models. Quantitative Finance 4 (5), 581–588.

Carr, P., Wu, L., 2003. What type of process underlies options? A simple robust
test. The Journal

of Finance 58, 2581–2610.

Chernov, M., Gallant, A. R., Ghysels, E., Tauchen, G. T., 2003. Alternative models
for stock price

dynamics. Journal of Econometrics 116, 225–257.

Christoffersen, P., Jacobs, K., Mimouni, K., 2010. Volatility dynamics for the
S&P500: Evidence from

realized volatility, daily returns, and option prices. Review of Financial Studies
23 (8), 3141–3189.

Duffie, D., Pan, J., Singleton, K. J., 2000. Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68, 1343–1376.

Dumas, B., Fleming, J., Whaley, R. E., 1998. Implied volatility functions:
Empirical tests. The

Journal of Finance 53, 2059–2106.

Dupire, B., 1994. Pricing with a smile. RISK 7, 18–20.

Durrleman, V., 2008. Convergence of at-the-money implied volatilities to the spot


volatility. Journal

of Applied Probability 45, 542–550.

Durrleman, V., 2010. From implied to spot volatilities. Finance and Stochastics 14
(2), 157–177.

Eraker, B., Johannes, M. S., Polson, N., 2003. The impact of jumps in equity index
volatility and

returns. The Journal of Finance 58, 1269–1300.

Fan, J., Gijbels, I., 1996. Local Polynomial Modelling and Its Applications.
Chapman & Hall, London,

U.K.

Forde, M., Jacquier, A., 2011. The large-maturity smile for the Heston model. Finance and Stochastics 17, 755–780.

Forde, M., Jacquier, A., Lee, R., 2012. The small-time smile and term structure of
implied volatility

under the Heston model. SIAM Journal on Financial Mathematics 3 (1), 690–708.

Fouque, J.-P., Lorig, M., Sircar, R., 2016. Second order multiscale stochastic
volatility asymptotics:

Stochastic terminal layer analysis and calibration. Finance and Stochastics 20,
543–588.

Gao, K., Lee, R., 2014. Asymptotics of implied volatility to arbitrary order.
Finance and Stochastics

18 (2), 349–392.

Gatheral, J., 2006. The Volatility Surface: A Practitioner's Guide. John Wiley and Sons, Hoboken, NJ.

Gatheral, J., Hsu, E. P., Laurence, P., Ouyang, C., Wang, T.-H., 2012. Asymptotics
of implied

volatility in local volatility models. Mathematical Finance 22 (4), 591–620.

Hagan, P. S., Woodward, D. E., 1999. Equivalent Black volatilities. Applied


Mathematical Finance

6, 147–157.

Hansen, L. P., 1982. Large sample properties of generalized method of moments estimators. Econometrica 50, 1029–1054.

Heston, S., 1993. A closed-form solution for options with stochastic volatility
with applications to

bonds and currency options. Review of Financial Studies 6, 327–343.

Hull, J., White, A., 1987. The pricing of options on assets with stochastic
volatilities. The Journal

of Finance 42, 281–300.

Jacquier, A., Lorig, M., 2015. From characteristic functions to implied volatility
expansions. Advances

in Applied Probability 47, 837–857.

Jones, C. S., 2003. The dynamics of stochastic volatility: Evidence from underlying
and options

markets. Journal of Econometrics 116, 181–224.

Karlin, S., Taylor, H. M., 1975. A First Course in Stochastic Processes, 2nd
Edition. Academic Press.

Kristensen, D., Mele, A., 2011. Adding and subtracting Black-Scholes: A new approach to approximating derivative prices in continuous-time models. Journal of Financial Economics 102, 390–415.

Kunitomo, N., Takahashi, A., 2001. The asymptotic expansion approach to the
valuation of interest

rate contingent claims. Mathematical Finance 11 (1), 117–151.

Ledoit, O., Santa-Clara, P., Yan, S., 2002. Relative pricing of options with
stochastic volatility. Tech.

rep., University of California at Los Angeles.

Lee, R., 2001. Implied and local volatilities under stochastic volatility. International Journal of Theoretical and Applied Finance 4, 45–89.

Lee, R., 2004. The moment formula for implied volatility at extreme strikes.
Mathematical Finance

14 (3), 469–480.

Li, C., 2014. Closed-form expansion, conditional expectation, and option valuation.
Mathematics of

Operations Research 39, 487–516.


Lorig, M., Pagliarani, S., Pascucci, A., 2017. Explicit implied volatilities for
multifactor local-

stochastic volatility models. Mathematical Finance 27, 927–960.

Medvedev, A., Scaillet, O., 2007. Approximation and calibration of short-term implied volatilities under jump-diffusion stochastic volatility. Review of Financial Studies 20 (2), 427–459.

Merton, R. C., 1976. Option pricing when underlying stock returns are
discontinuous. Journal of

Financial Economics 3, 125–144.

Pagliarani, S., Pascucci, A., 2017. The exact Taylor formula of the implied
volatility. Finance and

Stochastics 21, 661–718.

Pan, J., 2002. The jump-risk premia implicit in options: Evidence from an
integrated time-series

study. Journal of Financial Economics 63, 3–50.

Sircar, K. R., Papanicolaou, G. C., 1999. Stochastic volatility, smile & asymptotics. Applied Mathematical Finance 6, 107–145.

Takahashi, A., Yamada, T., 2012. An asymptotic expansion with push-down of


Malliavin weights.

SIAM Journal on Financial Mathematics 3 (1), 95–136.

Tehranchi, M. R., 2009. Asymptotics of implied volatility far from maturity.


Journal of Applied

Probability 46, 629–650.

Xiu, D., 2014. Hermite polynomial based expansion of European option prices. Journal of Econometrics 179, 158–177.


Appendix

Appendix A Implied volatility expansion for continuous models

In this appendix, we sketch how to derive the IV expansion terms $\sigma^{(i,j)}$ in (7) in closed form for the continuous SV model (1a)–(1b). To simplify notation, we assume $S_t = s$ and $v_t = v$ at time $t$. The main idea hinges on expanding both sides of identity (4) with respect to the square root of time-to-maturity $\epsilon = \sqrt{\tau}$ and log-moneyness $k$, and then matching expansion terms of the same orders. Thus, as an indispensable preparation, we propose the following $(J, L(J))$th order expansion of $\bar P(\tau, k, v_t)$ introduced in (3) and appearing on the right hand side of (4):

$$\bar P^{(J,L(J))}(\epsilon^2, k, v) = \sum_{j=0}^{J}\sum_{i=1-j}^{L_j} p^{(i,j)}(v)\,\epsilon^{i}k^{j}, \quad\text{with } \epsilon = \sqrt{\tau}, \tag{A.1}$$

for any orders $J \ge 0$ and $L_j \ge 1 - j$, $j = 0, 1, \ldots, J$. The coefficients $p^{(i,j)}$ can be calculated explicitly by following Li (2014), in which the option price $P(\epsilon^2, k, s, v)$ admits a pseudo univariate expansion with respect to $\epsilon$ with closed-form expansion terms depending on both $\epsilon$ and $k$. The bivariate expansion (A.1) follows from taking $s = 1$ in this univariate expansion and further expanding the coefficients with respect to $k$ and $\epsilon$.

Now, based on the bivariate expansion (A.1) of $\bar P(\tau, k, v_t)$, which appears on the right hand side of (4), in what follows, we accordingly expand the composite function $\bar P_{BS}(\tau, k, \Sigma(\tau, k, v))$ on the left hand side. By matching the expansion terms on both sides, we establish a set of iterations and solve the expansion terms $\sigma^{(i,j)}$ recursively.

We start from the following expansion of at-the-money IV $\Sigma(\epsilon^2, 0, v)$ with respect to $\epsilon$:

$$\Sigma^{(L_0)}(\epsilon^2, 0, v) = \sum_{i=0}^{L_0}\sigma^{(i,0)}(v)\,\epsilon^{2i}, \tag{A.2}$$

which is obtained by setting $k = 0$ in the bivariate expansion (7). According to Durrleman (2008), $\Sigma(\epsilon^2, 0, v)$ converges to the instantaneous volatility $v_t$ of the asset price as the time-to-maturity $\tau = \epsilon^2$ approaches zero. Thus, $\sigma^{(0,0)}(v) = v$. By taking $\sigma^{(0,0)}(v)$ as the initial input, all other expansion terms can be solved recursively.
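The starting point of the recursion can be illustrated numerically. The Python sketch below assumes that $\bar P_{BS}(\tau, k, \sigma)$ is the standard Black-Scholes put on the normalized price $S_T/s$ with strike $e^k$ (formula (3) is not reproduced in this appendix, so this form is an assumption), and uses hypothetical function names. It shows that the at-the-money price behaves like $\sigma\epsilon/\sqrt{2\pi}$ for small $\epsilon$, which is why matching the lowest-order coefficients pins down $\sigma^{(0,0)}$, and it inverts the price for volatility by bisection.

```python
import math

def norm_cdf(x):
    """Standard normal CDF N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_put_normalized(tau, k, sigma, r=0.0, d=0.0):
    """Black-Scholes put on S_T/s with strike exp(k); spot normalized to 1 (assumed form)."""
    vol = sigma * math.sqrt(tau)
    d1 = (-k + (r - d) * tau + 0.5 * vol * vol) / vol
    d2 = d1 - vol
    return math.exp(k - r * tau) * norm_cdf(-d2) - math.exp(-d * tau) * norm_cdf(-d1)

def implied_vol(price, tau, k, r=0.0, d=0.0, lo=1e-6, hi=5.0, iters=200):
    """Invert the put price for sigma by bisection; the price is increasing in sigma."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bs_put_normalized(tau, k, mid, r, d) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def atm_leading_ratio(eps, sigma):
    """At-the-money price (r = d = 0) divided by its leading term sigma * eps / sqrt(2 pi)."""
    return bs_put_normalized(eps ** 2, 0.0, sigma) / (sigma * eps / math.sqrt(2.0 * math.pi))
```

As $\epsilon$ shrinks, `atm_leading_ratio(eps, sigma)` approaches one, so the coefficient of $\epsilon^1$ in the at-the-money price expansion carries the same information as the spot volatility.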

To compute the expansion terms $\sigma^{(i,0)}$, we apply the at-the-money condition $k = 0$ on both sides of (4) to obtain

$$\bar P(\epsilon^2, 0, v) = \bar P_{BS}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v)). \tag{A.3}$$

Expanding both sides of (A.3) with respect to $\epsilon$ and matching the coefficients, we can obtain a system of equations. The closed-form formulae of the expansion terms $\sigma^{(i,0)}$ follow by solving the equations recursively. Indeed, for the left hand side of (A.3), the expansion of $\bar P(\epsilon^2, 0, v)$ with respect to $\epsilon$ can be obtained from (A.1) by setting $k = 0$, i.e.,

$$\bar P^{(L_0)}(\epsilon^2, 0, v) = \sum_{l=0}^{L_0} p^{(l,0)}(v)\,\epsilon^{l}. \tag{A.4}$$

For the right hand side of (A.3), the expansion of $\bar P_{BS}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v))$ with respect to $\epsilon$ follows by combining the expansion of the function $\bar P_{BS}(\epsilon^2, 0, \sigma)$, which is obtained by expanding the explicit formula of $\bar P_{BS}(\epsilon^2, 0, \sigma)$, and the expansion of at-the-money IV $\Sigma(\epsilon^2, 0, v)$, which is proposed in (A.2) with the undetermined expansion terms $\sigma^{(i,0)}$. Then, the closed-form expansion of $\bar P_{BS}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v))$ is in the following form

$$\bar P^{(J)}_{BS}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v)) = \sum_{l=1}^{J}\tilde p^{(l,0)}(v)\,\epsilon^{l}, \tag{A.5}$$

for any integer $J \ge 1$. In particular, for any odd integer $l \ge 3$, the expansion term $\tilde p^{(l,0)}$ by computation consists of IV expansion terms $\sigma^{(i,0)}$ for all $i \le (l - 1)/2$. Matching the coefficients of expansions (A.5) and (A.4) yields the following system of equations

$$p^{(l,0)}(v) = \tilde p^{(l,0)}(v), \quad\text{for any odd integer } l \ge 1.$$

For any integer $i \ge 1$, the closed-form formula of the expansion term $\sigma^{(i,0)}(v)$ follows from solving the above equation with $l = 2i + 1$.

Finally, to compute the expansion terms $\sigma^{(i,j)}$ for $j \ge 1$, we resort to the following identity

$$\frac{\partial^{j}\bar P}{\partial k^{j}}(\epsilon^2, 0, v) = f_j(\epsilon, v), \tag{A.6}$$

which is obtained from differentiating the identity (4) $j$ times with respect to $k$ and then applying the at-the-money condition $k = 0$. Here, the function $f_j$ is defined by

$$f_j(\epsilon, v) = \sum_{0 \le m_1 \le m_2 \le j}\binom{j}{m_2}\frac{\partial^{\,j-m_2+m_1}\bar P_{BS}}{\partial k^{\,j-m_2}\partial\sigma^{m_1}}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v))\,G^{(m_1,m_2)}(\epsilon, v), \tag{A.7}$$

where the nonnegative integers $m_1$ and $m_2$ satisfy that $m_1 = 0$ if and only if $m_2 = 0$. The function $G^{(m_1,m_2)}$ is defined by $G^{(0,0)}(\epsilon, v) = 1$ and

$$G^{(m_1,m_2)}(\epsilon, v) = \sum_{\mathbf{l} \in S_{m_1,m_2}}\frac{m_2!}{R(\mathbf{l})}\prod_{\ell=1}^{m_1}\frac{1}{i_\ell!}\frac{\partial^{i_\ell}\Sigma}{\partial k^{i_\ell}}(\epsilon^2, 0, v), \tag{A.8}$$

for $1 \le m_1 \le m_2$. Here, the integer index set $S_{m_1,m_2}$ is given by

$$S_{m_1,m_2} = \Big\{(i_1, i_2, \cdots, i_{m_1}) : 1 \le i_1 \le i_2 \le \cdots \le i_{m_1},\ \textstyle\sum_{\ell=1}^{m_1} i_\ell = m_2\Big\}, \tag{A.9}$$

and the function $R(\mathbf{l})$ is a constant defined by the product of the factorials of the numbers of repetitions of the distinct nonzero entries appearing more than once in the index $\mathbf{l}$. For example, in the index $\mathbf{l} = (1, 1, 2, 2, 2)$, the distinct entries 1 and 2 appear twice and thrice, respectively. Then, the constant $R(\mathbf{l})$ is calculated as $2! \times 3! = 12$. Similar to the previous case of $j = 0$, by expanding both sides of (A.6) with respect to $\epsilon$ and matching the coefficients, we can obtain a system of equations for solving the expansion terms $\sigma^{(i,j)}(v)$ recursively.
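The combinatorial ingredients in (A.8) and (A.9) are easy to enumerate. The following Python sketch (function names are hypothetical) lists the index set $S_{m_1,m_2}$, computes the constant $R(\mathbf{l})$, and evaluates the weight $m_2!/(R(\mathbf{l})\prod_\ell i_\ell!)$ attached to each index. As a sanity check on this reading of (A.8), setting every derivative of $\Sigma$ equal to one makes $G^{(m_1,m_2)}$ the number of ways to split $m_2$ objects into $m_1$ non-empty groups (a Stirling number of the second kind), as expected for a formula of Faà di Bruno type.

```python
import math
from collections import Counter

def index_set(m1, m2, smallest=1):
    """All nondecreasing m1-tuples of positive integers summing to m2, as in (A.9)."""
    if m1 == 0:
        return [()] if m2 == 0 else []
    out = []
    for first in range(smallest, m2 // m1 + 1):
        for rest in index_set(m1 - 1, m2 - first, first):
            out.append((first,) + rest)
    return out

def repetition_constant(index):
    """R(l): product of the factorials of the multiplicities of the entries of l."""
    result = 1
    for count in Counter(index).values():
        result *= math.factorial(count)
    return result

def weight(index):
    """Coefficient m2! / (R(l) * prod_l i_l!) multiplying the product of derivatives."""
    m2 = sum(index)
    denom = repetition_constant(index)
    for i in index:
        denom *= math.factorial(i)
    return math.factorial(m2) // denom

def g_with_unit_derivatives(m1, m2):
    """G^(m1,m2) when every derivative of Sigma is set to 1 (sanity check only)."""
    return sum(weight(index) for index in index_set(m1, m2))
```

For instance, `index_set(2, 4)` returns `[(1, 3), (2, 2)]` with weights 4 and 3, and `repetition_constant((1, 1, 2, 2, 2))` reproduces the value 12 from the example above.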

Indeed, the expansion of $\partial^{j}\bar P(\epsilon^2, 0, v)/\partial k^{j}$ on the left hand side of (A.6) is

$$\frac{\partial^{j}\bar P^{(L_j)}}{\partial k^{j}}(\epsilon^2, 0, v) = \sum_{i=1-j}^{L_j} j!\,p^{(i,j)}(v)\,\epsilon^{i}, \tag{A.10}$$

which is obtained from differentiating expansion (A.1) $j$ times with respect to $k$ and then setting $k = 0$. According to the definition (A.7), the expansion of the function $f_j$ on the right hand side of (A.6) hinges on the expansions of two types of ingredients

$$\frac{\partial^{\,j-m_2+m_1}\bar P_{BS}}{\partial k^{\,j-m_2}\partial\sigma^{m_1}}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v)) \quad\text{and}\quad G^{(m_1,m_2)}(\epsilon, v). \tag{A.11}$$

As to the first ingredient, its expansion can be obtained by combining the expansions of the Black-Scholes sensitivities $\partial^{\,j-m_2+m_1}\bar P_{BS}(\epsilon^2, 0, \sigma)/\partial k^{\,j-m_2}\partial\sigma^{m_1}$, which are obtained based on the explicit formula of $\bar P_{BS}$, and the expansion of at-the-money IV $\Sigma(\epsilon^2, 0, v)$, which is explicitly computed from the preceding iteration for $j = 0$. By combining these two types of expansions, we obtain the $J$th order expansion of the first ingredient in (A.11) as

$$\frac{\partial^{\,j-m_2+m_1}\bar P^{(J)}_{BS}}{\partial k^{\,j-m_2}\partial\sigma^{m_1}}(\epsilon^2, 0, \Sigma(\epsilon^2, 0, v)) = \sum_{l=1-j+m_2}^{J} H^{(j-m_2)}_{l,m_1}\,\epsilon^{l}, \tag{A.12}$$

for any integer order $J \ge 1 - j + m_2$, where the expansion terms $H^{(j-m_2)}_{l,m_1}$ consist of various Black-Scholes sensitivities and at-the-money IV expansion terms $\sigma^{(i,0)}$.

To obtain the expansion of the second ingredient $G^{(m_1,m_2)}(\epsilon, v)$ in (A.11), according to its definition (A.8), it suffices to combine the expansions of various at-the-money IV shape characteristics $\partial^{i_\ell}\Sigma(\epsilon^2, 0, v)/\partial k^{i_\ell}$, while the expansion of $\partial^{i_\ell}\Sigma(\epsilon^2, 0, v)/\partial k^{i_\ell}$ follows

$$\frac{\partial^{i_\ell}\Sigma^{(L_{i_\ell})}}{\partial k^{i_\ell}}(\epsilon^2, 0, v) = \sum_{l=0}^{L_{i_\ell}} i_\ell!\,\sigma^{(l,i_\ell)}(v)\,\epsilon^{2l},$$

by differentiating (7) $i_\ell$ times with respect to $k$ and then setting $k = 0$. Then, the function $G^{(m_1,m_2)}$ admits a $J$th order expansion in the form

$$G^{(m_1,m_2)}(\epsilon, v) = \sum_{l=0}^{J} G^{(m_1,m_2)}_{l}\,\epsilon^{2l}, \tag{A.13}$$

for any integer order $J \ge 0$. Here, the expansion term $G^{(m_1,m_2)}_{l}$ is defined according to

$$G^{(m_1,m_2)}_{l} = \sum_{\mathbf{l} \in S_{m_1,m_2},\ \mathbf{v} \in T_{\mathbf{l},l}}\frac{m_2!}{R(\mathbf{l})}\prod_{\ell=1}^{m_1}\sigma^{(v_\ell, i_\ell)}(v),$$

for any integers $m_2 \ge m_1 \ge 1$ and $l \ge 0$, with the integer index set $S_{m_1,m_2}$ given in (A.9) and the function $R(\mathbf{l})$ provided right after (A.9). Moreover, for any index $\mathbf{l} \in S_{m_1,m_2}$, the integer index set $T_{\mathbf{l},l}$ is defined by

$$T_{\mathbf{l},l} = \{\mathbf{v} = (v_1, v_2, \cdots, v_{m_1}) : v_1 + \cdots + v_{m_1} = l \text{ and } v_\ell \ge 0, \text{ for } \ell = 1, 2, \ldots, m_1\}.$$

Based on the expansions (A.12) and (A.13), it follows from the definition (A.7) that the function $f_j(\epsilon, v)$ admits the following $J$th order expansion

$$f^{(J)}_{j}(\epsilon, v) = \sum_{l=1-j}^{J}\tilde p^{(l,j)}(v)\,\epsilon^{l}, \tag{A.14}$$

for any integer $J \ge 1 - j$, where the expansion term $\tilde p^{(l,j)}$ satisfies

$$\tilde p^{(l,j)}(v) = \sum_{0 \le m_1 \le m_2 \le j}\binom{j}{m_2}\sum_{l_1 + 2l_2 = l,\ l_1 \ge 1-j+m_2,\ l_2 \ge 0} H^{(j-m_2)}_{l_1,m_1}\,G^{(m_1,m_2)}_{l_2}$$

for any integer $l \ge 1 - j$. In particular, for any odd integer $l \ge 1$, the expansion term $\tilde p^{(l,j)}(v)$ consists of IV expansion terms $\sigma^{(i,m)}$ for all $0 \le m \le j$ and $0 \le i \le (l - 1)/2 + \lfloor j - m\rfloor/2$, where the notation $\lfloor a\rfloor$ represents the integer part of any arbitrary real number $a$. By matching the coefficients of expansions (A.14) and (A.10), we obtain the following system of equations

$$j!\,p^{(l,j)}(v) = \tilde p^{(l,j)}(v), \quad\text{for any integer } l \ge 1 - j.$$

For any integer $i \ge 1$, the closed-form formula of the expansion term $\sigma^{(i,j)}(v)$ follows from solving the above equation with $l = 2i + 1$.

Appendix B Implied volatility expansion for models with jumps

Similar to the derivation for the continuous case, the expansion terms $\varphi^{(i,j)}$ can be solved by iterations. These iterations can be obtained by expanding both sides of identity (4) with respect to the square root of time-to-maturity $\epsilon = \sqrt{\tau}$ and log-moneyness $k$, and then matching expansion terms of the same orders. Solving these matched equations leads to the desired iterations. Thus, omitting the similar arguments, it suffices to provide the following indispensable ingredient for completing the derivation: under the general SV model with jumps (33a)–(33b), we propose the following closed-form bivariate expansion of $\bar P(\tau, k, v_t)$ introduced in (3) and appearing on the right hand side of (4):

$$\bar P^{(J,L(J))}(\epsilon^2, k, v) = \sum_{j=0}^{J}\sum_{i=1-j}^{L_j}\bar p^{(i,j)}(v)\,\epsilon^{i}k^{j}, \quad\text{with } \epsilon = \sqrt{\tau},$$

for any orders $J \ge 0$ and $L_j \ge 1 - j$, $j = 0, 1, 2, \ldots, J$. This expansion generalizes that for the continuous model (1a)–(1b) provided in (A.1) and can be developed in the following three steps. Without loss of generality, by the time-homogeneity property of the model (33a)–(33b), the time span from $t$ to $T$ can be translated to that from 0 to $\tau = T - t$ for simplicity. We assume $S_0 = s$ and $v_0 = v$.

Step 1 – Representing $\bar P(\tau, k, v)$ under an auxiliary measure: We will rewrite the expectation representation of $\bar P(\tau, k, v)$ in (3) under an auxiliary probability measure, under which the expectation becomes easier to handle. We denote by $Q$ the assumed risk-neutral measure and denote by $\mathcal{F}_t$ the filtration generated by the process $(S_t, v_t)^{\top}$. The new probability measure $\tilde Q$ is induced by a Radon-Nikodým derivative $\Lambda_t$ according to

$$\left.\frac{dQ}{d\tilde Q}\right|_{\mathcal{F}_t} = \Lambda_t, \quad\text{with } \Lambda_t \text{ defined as } \Lambda_t = \left(\prod_{i=1}^{N_t}\lambda(v_{\tau_i})\right)\exp\left\{t - \int_0^t\lambda(v_s)\,ds\right\}, \tag{B.1}$$

where $\tau_i$ denotes the arrival time of the $i$th jump, i.e., $\tau_i = \inf\{t \ge 0 : N_t = i\}$; in particular, $\Lambda_0 = 1$.
According to Theorem T3 in Chapter VI of Brémaud (1981), $N_t$ is a Poisson process with constant jump intensity 1 under the measure $\tilde Q$. Changing the measure from $Q$ to $\tilde Q$ yields the following equivalent expectation representation of $\bar P(\tau, k, v)$:

$$\bar P(\tau, k, v) = e^{-r\tau}\tilde E\left[\Lambda_\tau\max\left(e^{k} - \frac{S_\tau}{s}, 0\right)\right],$$

where $\tilde E$ represents the expectation under the measure $\tilde Q$. Then, by conditioning on the number of jumps between 0 and $\tau$, we reformulate $\bar P(\tau, k, v)$ as the following summation form

$$\bar P(\tau, k, v) = \sum_{\ell=0}^{\infty}\tilde Q(N_\tau = \ell)\,\bar P_\ell(\tau, k, v), \quad\text{with } \bar P_\ell(\tau, k, v) = \tilde E^{(\ell)}\left[\Lambda_\tau\max\left(e^{k} - \frac{S_\tau}{s}, 0\right)\right], \tag{B.2}$$

where the multiplier $\tilde Q(N_\tau = \ell)$, as the probability of $N_\tau = \ell$ under the measure $\tilde Q$, can be explicitly calculated as $\tau^{\ell}e^{-\tau}/\ell!$, and the notation $\tilde E^{(\ell)}[\cdot]$ serves as the abbreviation of the conditional expectation $\tilde E[\cdot\,|\,N_\tau = \ell]$.

According to the relation (B.2), to expand $\bar P$, it suffices to multiply the expansion of $\tilde Q(N_\tau = \ell) = \tau^{\ell}e^{-\tau}/\ell!$ with respect to $\tau$, which is trivial, and the expansion of the conditional expectation $\bar P_\ell$ with respect to $\epsilon = \sqrt{\tau}$ and $k$ for any $\ell \ge 0$. To expand $\bar P_\ell$ for $\ell = 0$, in the beginning of Step 2, we propose a decomposition of $\Lambda_\tau$ and $S_\tau$. Then, based on this decomposition, we apply the method proposed in Li (2014) and develop a pseudo expansion of $\bar P_0$ with respect to $\epsilon$ with coefficients depending on both $\epsilon$ and $k$. The desired bivariate expansion of $\bar P_0$ follows from further expanding those coefficients with respect to $k$ and $\epsilon$. To expand $\bar P_\ell$ for $\ell \ge 1$, based on the decomposition of $\Lambda_\tau$ and $S_\tau$ introduced in Step 2, we apply the operator-based expansion discussed in Aït-Sahalia (2002) to obtain the desired result in Step 3.
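The "trivial" expansion of the Poisson weights can be written out explicitly. The short Python sketch below (function names are hypothetical) computes $\tilde Q(N_\tau = \ell) = \tau^{\ell}e^{-\tau}/\ell!$ under the unit-intensity measure and its Taylor polynomial in $\tau$. Because the weight is of order $\tau^{\ell}$, only the terms with $\ell$ up to the truncation order contribute to a finite-order expansion of $\bar P$.

```python
import math

def poisson_weight(ell, tau):
    """Exact probability of ell jumps on [0, tau] under unit intensity."""
    return tau ** ell * math.exp(-tau) / math.factorial(ell)

def poisson_weight_series(ell, tau, order):
    """Taylor polynomial of the weight in tau, keeping powers up to tau**order."""
    total = 0.0
    for n in range(0, order - ell + 1):
        total += (-1) ** n * tau ** (ell + n) / (math.factorial(ell) * math.factorial(n))
    return total
```

With `tau = 0.05`, `poisson_weight_series(1, 0.05, 4)` agrees with `poisson_weight(1, 0.05)` to roughly eight decimal places, while `poisson_weight_series(3, 0.05, 2)` is exactly zero because three jumps first matter at order $\tau^3$.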

Step 2 – Expanding the conditional expectation $\bar P_\ell$ in (B.2) for $\ell = 0$: It follows from the dynamics (33a) that the underlying asset price $S_u$ admits the following decomposition form

$$S_u = s\,S^{c}_{u}S^{J}_{u}, \tag{B.3}$$

where $S^{c}_{u}$ and $S^{J}_{u}$ are the continuous and jump components of $S_u/s$ given by

$$S^{c}_{u} = \exp\left\{\int_0^u\left(r - d - \lambda(v_t)\bar\mu - \frac{1}{2}v_t^2\right)dt + \int_0^u v_t\,dW_{1t}\right\} \quad\text{and}\quad S^{J}_{u} = \exp\left\{\sum_{i=1}^{N_u}J_{\tau_i}\right\}, \tag{B.4}$$

respectively. Likewise, the Radon-Nikodým derivative $\Lambda_u$ by definition (B.1) is decomposed as

$$\Lambda_u = \Lambda^{c}_{u}\Lambda^{J}_{u}, \tag{B.5}$$

with the continuous component $\Lambda^{c}_{u}$ and jump component $\Lambda^{J}_{u}$ given by

$$\Lambda^{c}_{u} = \exp\left\{u - \int_0^u\lambda(v_t)\,dt\right\} \quad\text{and}\quad \Lambda^{J}_{u} = \prod_{i=1}^{N_u}\lambda(v_{\tau_i}), \tag{B.6}$$

respectively. Apparently, the continuous components $S^{c}_{u}$ and $\Lambda^{c}_{u}$ satisfy

$$\frac{dS^{c}_{u}}{S^{c}_{u}} = (r - d - \lambda(v_u)\bar\mu)\,du + v_u\,dW_{1u}, \quad S^{c}_{0} = 1, \tag{B.7}$$

and

$$d\Lambda^{c}_{u} = (1 - \lambda(v_u))\Lambda^{c}_{u}\,du, \quad \Lambda^{c}_{0} = \Lambda_0 = 1, \tag{B.8}$$

respectively, with the volatility $v_u$ governed by (33b).

In the case of $\ell = 0$, the jump components in the decompositions (B.3) and (B.5) are disabled, so that the conditional expectation $\bar P_0$ in (B.2) simplifies to

$$\bar P_0(\tau, k, v) = e^{-r\tau}\tilde E\left[\Lambda^{c}_{\tau}\max(e^{k} - S^{c}_{\tau}, 0)\right],$$

since $S_\tau = sS^{c}_{\tau}$ and $\Lambda_\tau = \Lambda^{c}_{\tau}$. By regarding $\Lambda^{c}_{\tau}\max\left(e^{k} - S^{c}_{\tau}, 0\right)$ as the payoff function of a derivative security with the underlying asset $(S^{c}_{\tau}, \Lambda^{c}_{\tau})$ evolving according to dynamics (B.7), (B.8), and (33b), we apply the method proposed in Li (2014) and arrive at the following $J$th order univariate expansion of $\bar P_0(\tau, k, v)$:

$$\bar P^{(J)}_{0}(\epsilon^2, k, v) = e^{-r\epsilon^2}\,\epsilon v\sum_{l=0}^{J}\Phi^{(l)}_{0}\left(\frac{e^{k} - 1}{v\epsilon}\right)\epsilon^{l},$$

where the coefficients $\Phi^{(l)}_{0}$ can be calculated in closed form. The desired bivariate expansion of $\bar P_0$ follows by further expanding the coefficients $\Phi^{(l)}_{0}$ with respect to $k$ and $\epsilon$.

Step 3 – Expanding the conditional expectation $\bar{P}_\ell$ in (B.2) for $\ell \geq 1$: Plugging the decompositions (B.3) and (B.5) into (B.2) yields

\[ \bar{P}_\ell(\tau, k, v) = \tilde{E}^{(\ell)}\left[ \Lambda^c_\tau \Lambda^J_\tau \max(e^k - S^c_\tau S^J_\tau, 0) \right]. \]

Conditioning on $\Lambda^c_\tau$, $\Lambda^J_\tau$, and $S^c_\tau$, we reformulate the above expectation as

\[ \bar{P}_\ell(\tau, k, v) = \tilde{E}^{(\ell)}\left[ \Lambda^c_\tau \Lambda^J_\tau \, \tilde{E}^{(\ell)}\left[ \max(e^k - S^c_\tau S^J_\tau, 0) \,\middle|\, \Lambda^c_\tau, \Lambda^J_\tau, S^c_\tau \right] \right]. \tag{B.9} \]
We note that the component $S^J_\tau$ inside the inner expectation is independent of all the conditioning arguments $S^c_\tau$, $\Lambda^c_\tau$, and $\Lambda^J_\tau$ defined in (B.4), (B.6), and (B.6), respectively, simply because the jump sizes $J_{\tau_i}$ are assumed to be independent of the asset price $S_u$, the volatility $v_u$, and the Poisson process $N_u$ for any $u \in [0, \tau]$ under the measure $\tilde{Q}$. Consequently, the inner expectation in (B.9) can be expressed as $\phi_\ell(k, S^c_\tau)$ for some function $\phi_\ell$ determined by the following integral form

\[ \phi_\ell(k, S^c_\tau) = \tilde{E}^{(\ell)}\left[ \max(e^k - S^c_\tau S^J_\tau, 0) \,\middle|\, \Lambda^c_\tau, \Lambda^J_\tau, S^c_\tau \right] \tag{B.10} \]

\[ \phantom{\phi_\ell(k, S^c_\tau)} = \int_{\mathcal{J}^\ell} \max\left(e^k - S^c_\tau e^{u_1 + u_2 + \cdots + u_\ell}, 0\right) f(u_1) f(u_2) \cdots f(u_\ell) \, du_1 du_2 \cdots du_\ell, \tag{B.11} \]

where $\mathcal{J}$ and $f$ represent the state space and the probability density function of the jump size $J_t$, respectively. The integral (B.11) can be explicitly calculated if, for example, the jump size $J_t$ follows a normal distribution with mean $\mu_J$ and variance $\sigma_J^2$, as has been common since the jump-diffusion model of Merton (1976). In this case, the sum of $\ell$ independent jump sizes is normal with mean $\ell\mu_J$ and variance $\ell\sigma_J^2$, and the closed-form formula of the integral (B.11) is given by

\[ \phi_\ell(k, S^c_\tau) = e^k N\left( \frac{k - \log S^c_\tau - \ell\mu_J}{\sqrt{\ell}\sigma_J} \right) - S^c_\tau \exp\left( \ell\mu_J + \frac{\ell\sigma_J^2}{2} \right) N\left( \frac{k - \log S^c_\tau - \ell\mu_J}{\sqrt{\ell}\sigma_J} - \sqrt{\ell}\sigma_J \right). \]
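As a sanity check on the closed form above, the following Python sketch compares it with a one-dimensional quadrature of (B.11), using the fact that a sum of $\ell$ independent $N(\mu_J, \sigma_J^2)$ jump sizes is $N(\ell\mu_J, \ell\sigma_J^2)$, so that its standard deviation is $\sqrt{\ell}\sigma_J$. The function names, the trapezoidal grid, and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def norm_cdf(x):
    # Standard normal distribution function N(x) via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_closed_form(k, s_c, ell, mu_j, sigma_j):
    # Closed form of (B.11) under normally distributed jump sizes.
    mean = ell * mu_j
    sd = math.sqrt(ell) * sigma_j
    d = (k - math.log(s_c) - mean) / sd
    return (math.exp(k) * norm_cdf(d)
            - s_c * math.exp(mean + 0.5 * sd * sd) * norm_cdf(d - sd))

def phi_quadrature(k, s_c, ell, mu_j, sigma_j, n_grid=20000, width=10.0):
    # Trapezoidal rule over x = u_1 + ... + u_ell, which is N(ell*mu_j, ell*sigma_j^2).
    mean = ell * mu_j
    sd = math.sqrt(ell) * sigma_j
    lower = mean - width * sd
    step = 2.0 * width * sd / n_grid
    total = 0.0
    for i in range(n_grid + 1):
        x = lower + i * step
        weight = 0.5 if i in (0, n_grid) else 1.0
        density = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += weight * max(math.exp(k) - s_c * math.exp(x), 0.0) * density
    return total * step

# Illustrative inputs: log-moneyness k, continuous component S^c_tau, two jumps.
closed = phi_closed_form(0.02, 1.01, 2, -0.05, 0.1)
numeric = phi_quadrature(0.02, 1.01, 2, -0.05, 0.1)
```

Reducing the $\ell$-dimensional integral to one dimension is a design choice that works only because the jump sizes are independent and normal; for other jump distributions the integral (B.11) would generally need a different treatment.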

It follows from (B.9) and (B.10) that

\[ \bar{P}_\ell(\tau, k, v) = \tilde{E}^{(\ell)}\left[ \Lambda^c_\tau \Lambda^J_\tau \phi_\ell(k, S^c_\tau) \right]. \]

By conditioning on the components $S^c_\tau$ and $\Lambda^c_\tau$, as well as the whole path of the volatility $v_u$ for all $u \in [0, \tau]$, denoted by $V$ for simplicity, the law of iterated expectations implies

\[ \bar{P}_\ell(\tau, k, v) = \tilde{E}\left[ \Lambda^c_\tau \phi_\ell(k, S^c_\tau) \, \tilde{E}^{(\ell)}\left[ \Lambda^J_\tau \,\middle|\, S^c_\tau, \Lambda^c_\tau, V \right] \right]. \tag{B.12} \]

Plugging in the explicit expression of the jump component $\Lambda^J_\tau$ given in (B.6), we write the inner expectation as

\[ \tilde{E}^{(\ell)}\left[ \Lambda^J_\tau \,\middle|\, S^c_\tau, \Lambda^c_\tau, V \right] \equiv \tilde{E}\left[ \prod_{i=1}^{\ell} \lambda(v_{\tau_i}) \,\middle|\, S^c_\tau, \Lambda^c_\tau, V, N_\tau = \ell \right]. \tag{B.13} \]

Given the conditions spelled out in (B.13), the randomness of $\prod_{i=1}^{\ell} \lambda(v_{\tau_i})$ stems solely from that of the jump arrival times $\tau_i$. Since $N_t$ follows a Poisson process with constant intensity 1 that is independent of $S^c_\tau$, $\Lambda^c_\tau$, and $V$ under the measure $\tilde{Q}$, the conditional joint distribution of $(\tau_1, \tau_2, \cdots, \tau_\ell)$ given $S^c_\tau$, $\Lambda^c_\tau$, $V$, and $N_\tau = \ell$ is equivalent to that of $(\tau_1, \tau_2, \cdots, \tau_\ell)$ given $N_\tau = \ell$, which is distributed as the order statistics of $\ell$ independent observations sampled from the uniform distribution on $[0, \tau]$ (see, e.g., Theorem 2.3 in Chapter 4.2 of Karlin and Taylor (1975)). Then, direct computation leads to

\[ \tilde{E}^{(\ell)}\left[ \Lambda^J_\tau \,\middle|\, S^c_\tau, \Lambda^c_\tau, V \right] = \left( \frac{1}{\tau} \int_0^\tau \lambda(v_u) \, du \right)^{\ell} = \left( 1 - \frac{1}{\tau} \log \Lambda^c_\tau \right)^{\ell}, \tag{B.14} \]
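The first equality in (B.14) can be illustrated by simulation: conditional on $N_\tau = \ell$ and on the volatility path, the product of intensities at the jump times has mean equal to the $\ell$-th power of the time-averaged intensity. The Python sketch below is a minimal check under stated assumptions; the intensity function `jump_intensity`, the deterministic path `volatility_at`, and all numerical settings are hypothetical and serve only as an example.

```python
import math
import random

def jump_intensity(v):
    # Hypothetical intensity function lambda(v); the paper leaves it general.
    return 0.5 + 2.0 * v

def volatility_at(u):
    # Hypothetical deterministic stand-in for the conditioning path V.
    return 0.2 + 0.1 * math.sin(5.0 * u)

def mc_product_expectation(ell, tau, n_trials, seed=0):
    # Jump times given N_tau = ell: order statistics of ell uniforms on [0, tau].
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        jump_times = sorted(rng.uniform(0.0, tau) for _ in range(ell))
        product = 1.0
        for t in jump_times:
            product *= jump_intensity(volatility_at(t))
        total += product
    return total / n_trials

def averaged_intensity_power(ell, tau, n_grid=10000):
    # ((1/tau) * int_0^tau lambda(v_u) du)^ell via the midpoint rule.
    step = tau / n_grid
    integral = sum(jump_intensity(volatility_at((i + 0.5) * step)) for i in range(n_grid)) * step
    return (integral / tau) ** ell

simulated = mc_product_expectation(ell=3, tau=0.5, n_trials=100000)
exact = averaged_intensity_power(ell=3, tau=0.5)
```

Sorting the uniforms mirrors the order-statistics statement, although the product is symmetric in the jump times, so the ordering does not affect its value; this symmetry is why the expectation factorizes into a power.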

where the second equality follows from the representation of $\Lambda^c_\tau$ in (B.6). Hence, by plugging (B.14) into (B.12), we simplify $\bar{P}_\ell(\tau, k, v)$ in (B.9) to

\[ \bar{P}_\ell(\tau, k, v) = \tilde{E}\left[ \Lambda^c_\tau \phi_\ell(k, S^c_\tau) \left( 1 - \frac{1}{\tau} \log \Lambda^c_\tau \right)^{\ell} \right]. \]

Finally, based on the dynamics of $S^c_u$, $v_u$, and $\Lambda^c_u$ given in (B.7), (33b), and (B.8), respectively, an application of the operator-based expansion discussed in Aït-Sahalia (2002) to the conditional expectation $\bar{P}_\ell(\tau, k, v)$ yields the Taylor expansion with respect to $\tau = \epsilon^2$ in the form:

\[ \bar{P}^{(J)}_\ell(\tau, k, v) = \sum_{l=0}^{J} \Phi^{(l)}_\ell(k) \tau^l, \]

for any integer order $J \geq 0$, where the expansion terms $\Phi^{(l)}_\ell$ are in closed form. The desired bivariate expansion of $\bar{P}_\ell$ follows from further expanding the coefficients $\Phi^{(l)}_\ell(k)$ with respect to $k$.

The last part of this Appendix shows the calculations to link the coefficients $\varphi^{(i,j)}$ to the IV surface shape characteristics in Section 7.3. It follows by setting $k = 0$ in the bivariate expansion (36) that

\[ \Sigma^{(L_0)}(\tau, 0, v_t) = \sum_{i=0}^{L_0} \varphi^{(i,0)}(v_t) \tau^{\frac{i}{2}}. \tag{B.15} \]

Differentiating both sides of (36) with respect to $k$ once, twice, or thrice, and then taking $k$ to be zero, we obtain

\[ \frac{\partial}{\partial k} \Sigma^{(L_1)}(\tau, 0, v_t) = \sum_{i=0}^{L_1} \varphi^{(i,1)}(v_t) \tau^{\frac{i}{2}}, \tag{B.16} \]

and

\[ \frac{\partial^2}{\partial k^2} \Sigma^{(L_2)}(\tau, 0, v_t) = \sum_{i=-1}^{L_2} 2\varphi^{(i,2)}(v_t) \tau^{\frac{i}{2}}, \qquad \frac{\partial^3}{\partial k^3} \Sigma^{(L_3)}(\tau, 0, v_t) = \sum_{i=-2}^{L_3} 6\varphi^{(i,3)}(v_t) \tau^{\frac{i}{2}}. \tag{B.17} \]

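Equation (B.15) (resp. (B.16)) implies the first (resp. third) equation in (39a) as $\tau$ approaches zero, i.e., $\varphi^{(0,0)}(v_t) = \lim_{\tau \to 0} \Sigma(\tau, 0, v_t)$ (resp. $\varphi^{(0,1)}(v_t) = \lim_{\tau \to 0} \partial\Sigma/\partial k(\tau, 0, v_t)$). The rest of the equations listed in (39a)–(39d) hinge on finding the univariate Taylor expansions with respect to $\tau$ of the time-scaled shape characteristics or their combinations appearing on the right-hand sides of these equations. Consider (39c). It follows from (B.17) that

\[ \frac{1}{2} \frac{\partial^2}{\partial k^2} \Sigma^{(L_2)}(\tau, 0, v_t) = \sum_{i=-1}^{L_2} \varphi^{(i,2)}(v_t) \tau^{\frac{i}{2}} \quad \text{and} \quad \tau \frac{\partial^3}{\partial k^2 \partial\tau} \Sigma^{(L_2)}(\tau, 0, v_t) = \sum_{i=-1}^{L_2} i \varphi^{(i,2)}(v_t) \tau^{\frac{i}{2}}. \]

Adding the above two equations, in which the singular $i = -1$ terms cancel, yields

\[ \frac{1}{2} \frac{\partial^2}{\partial k^2} \Sigma^{(L_2)}(\tau, 0, v_t) + \tau \frac{\partial^3}{\partial k^2 \partial\tau} \Sigma^{(L_2)}(\tau, 0, v_t) = \sum_{i=0}^{L_2} (i + 1) \varphi^{(i,2)}(v_t) \tau^{\frac{i}{2}}, \]

which is a Taylor expansion with leading term $\varphi^{(0,2)}(v_t)$, and (39c) follows by letting $\tau$ approach zero.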
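The cancellation of the singular $\tau^{-1/2}$ term can be seen numerically. The Python sketch below uses a stylized curvature function with only three terms, $\varphi^{(-1,2)}$, $\varphi^{(0,2)}$, and $\varphi^{(1,2)}$, whose values are made up for illustration; the finite-difference step is likewise an assumption. It shows that half the curvature alone diverges as $\tau \to 0$, whereas the combination above approaches $\varphi^{(0,2)}$.

```python
def curvature(tau, phi_m1, phi_0, phi_1):
    # Stylized second derivative in k at k = 0, following the form of (B.17):
    # 2 * (phi^(-1,2) * tau^(-1/2) + phi^(0,2) + phi^(1,2) * tau^(1/2)).
    return 2.0 * (phi_m1 * tau ** -0.5 + phi_0 + phi_1 * tau ** 0.5)

def time_scaled_combination(tau, phi_m1, phi_0, phi_1, rel_step=1e-4):
    # 0.5 * curvature + tau * d(curvature)/d(tau), with a central difference in tau.
    up = curvature(tau * (1.0 + rel_step), phi_m1, phi_0, phi_1)
    down = curvature(tau * (1.0 - rel_step), phi_m1, phi_0, phi_1)
    d_curvature_d_tau = (up - down) / (2.0 * tau * rel_step)
    return 0.5 * curvature(tau, phi_m1, phi_0, phi_1) + tau * d_curvature_d_tau

# Hypothetical coefficient values, chosen only for illustration.
phi_m1, phi_0, phi_1 = 0.5, 0.8, 0.3
small_tau = 1e-6
half_curvature = 0.5 * curvature(small_tau, phi_m1, phi_0, phi_1)
combination = time_scaled_combination(small_tau, phi_m1, phi_0, phi_1)
```

For these inputs the combination equals $\varphi^{(0,2)} + 2\varphi^{(1,2)}\tau^{1/2}$ up to finite-difference error, so it stays close to $0.8$ while half the curvature is of order $\tau^{-1/2}$.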

                        Exact identification         Over identification
Parameter   True        Bias            Std. dev.    Bias            Std. dev.
κ           3.00        −0.031          0.554        −0.259          0.488
α           0.04        3.21 × 10^−4    0.0022       0.0012          0.0029
ξ           0.20        0.0021          0.0109       3.53 × 10^−4    0.0106
ρ           −0.70       0.0058          0.0374       0.0017          0.0374

Table 1: Monte Carlo results for parametric implied stochastic volatility model of type (27a)–(27b)

Note: In the fourth and sixth columns, the standard deviation of each parameter is calculated by the finite-sample standard deviation of estimators based on the 500 simulation trials.
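The bias and finite-sample standard deviation reported for each parameter can be computed from the estimates across simulation trials along the following lines. This Python sketch is an assumption-laden illustration: the helper name `bias_and_std` is hypothetical, the input list is made up, and the use of the $n - 1$ denominator in the standard deviation is an assumption, since the note does not state which convention is used.

```python
import statistics

def bias_and_std(estimates, true_value):
    # Bias: mean of the estimates across trials minus the true parameter value.
    bias = statistics.mean(estimates) - true_value
    # Finite-sample standard deviation across trials (n - 1 denominator, assumed).
    std_dev = statistics.stdev(estimates)
    return bias, std_dev

# Made-up estimates from four hypothetical trials with true value 3.0.
example_bias, example_std = bias_and_std([2.9, 3.1, 3.0, 3.2], 3.0)
```

In the Monte Carlo design described in the note, `estimates` would hold the 500 trial-level estimators of one parameter.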

                        Number                           Mean                             Standard deviation
Days-to-expiration      [15, 30]   (30, 45]   (45, 60]   [15, 30]   (30, 45]   (45, 60]   [15, 30]   (30, 45]   (45, 60]
Log-moneyness k
k < −5%                 8,481      22,275     27,261     21.92      19.68      19.24      4.68       4.43       4.19
−5% ≤ k < −2.5%         32,319     24,598     15,643     15.38      15.05      15.18      3.35       3.13       3.08
−2.5% ≤ k < 0           40,983     24,151     15,635     12.19      12.69      13.05      3.39       3.36       3.22
0 ≤ k < 2.5%            23,556     16,025     10,392     10.51      10.89      11.27      3.52       3.56       3.52
2.5% ≤ k < 5%           2,417      3,015      2,205      14.10      12.58      11.92      3.50       3.64       3.83
k ≥ 5%                  106        269        291        18.87      17.01      15.80      2.21       2.42       2.51
Total                   107,862    90,333     71,427     13.59      14.75      15.60      4.66       4.82       4.79

Table 2: Descriptive statistics for the S&P 500 index implied volatility data, 2013–2017

Note: The sample consists of daily implied volatilities of European options written on the S&P 500 index covering the period of January 2, 2013 – December 29, 2017. The columns “Mean” and “Standard deviation” are reported as percentages. The log-moneyness $k$ is defined by $k = \log(K/S_t)$, where $K$ is the exercise strike of the option and $S_t$ the spot price of the S&P 500 index.
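The classification underlying the table can be sketched in a few lines of Python. The sketch assumes half-open log-moneyness intervals that are closed on the left, and the days-to-expiration intervals $[15, 30]$, $(30, 45]$, and $(45, 60]$ from the table header; the function names and labels are hypothetical, and the paper's actual data-processing code is not shown in the text.

```python
import bisect
import math

MONEYNESS_EDGES = [-0.05, -0.025, 0.0, 0.025, 0.05]
MONEYNESS_LABELS = [
    "k < -5%", "-5% <= k < -2.5%", "-2.5% <= k < 0",
    "0 <= k < 2.5%", "2.5% <= k < 5%", "k >= 5%",
]

def log_moneyness(strike, spot):
    # k = log(K / S_t), as defined in the table note.
    return math.log(strike / spot)

def moneyness_bucket(k):
    # bisect_right makes every interval closed on the left and open on the right.
    return MONEYNESS_LABELS[bisect.bisect_right(MONEYNESS_EDGES, k)]

def maturity_bucket(days):
    # Days-to-expiration intervals [15, 30], (30, 45], (45, 60]; None if outside.
    if 15 <= days <= 30:
        return "[15, 30]"
    if 30 < days <= 45:
        return "(30, 45]"
    if 45 < days <= 60:
        return "(45, 60]"
    return None

# Example: a strike of 95 on a spot of 100 with 31 days to expiration.
example = (moneyness_bucket(log_moneyness(95.0, 100.0)), maturity_bucket(31))
```

Counting, averaging, and taking standard deviations of implied volatilities within each pair of buckets would then reproduce the layout of the table.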

              Exact identification            Over identification
Parameter     Estimator    Standard error     Estimator    Standard error
κ             15.2         1.95               13.5         1.64
α             0.023        0.0032             0.022        0.0030
ξ             0.98         0.065              0.77         0.052
ρ             −0.619       0.0021             −0.609       0.0038

Table 3: Parametric implied stochastic volatility model of type (27a)–(27b)

Note: In the third and fifth columns, the standard error of each parameter is calculated by the Newey-West (sample-based) estimator according to (19) and (20). For instance, the standard error of the parameter κ is given by $\sqrt{\hat{V}^{-1}_{11}(\hat{\theta})/n}$, where $\hat{V}^{-1}_{11}$ represents the $(1, 1)$th entry of the matrix $\hat{V}^{-1}$.
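The standard-error formula in the note generalizes to all parameters by taking the diagonal of $\hat{V}^{-1}$. The Python sketch below illustrates only this last step: it assumes that the matrix $\hat{V}$ has already been computed from (19) and (20), which are not reproduced here, and the function names and the small example matrix are hypothetical.

```python
import math

def invert(matrix):
    # Gauss-Jordan inversion with partial pivoting for a small square matrix.
    n = len(matrix)
    aug = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def standard_errors(v_hat, n_obs):
    # Standard error of parameter i: sqrt of the (i, i) entry of V^{-1} divided by n.
    v_inv = invert(v_hat)
    return [math.sqrt(v_inv[i][i] / n_obs) for i in range(len(v_hat))]

# Made-up 2 x 2 example with n = 6 observations.
example_se = standard_errors([[2.0, 1.0], [1.0, 2.0]], 6)
```

The first entry of the returned list corresponds to the expression given in the note for κ when κ is the first component of θ.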

Figure 1: The implied volatility surface of S&P 500 index options on January 3, 2017

Note: This plot represents the IV surface $(\tau, k) \mapsto \Sigma(\tau, k, v_t)$ on January 3, 2017 for S&P 500 index options. The two slopes $\Sigma_{0,1}(v_t)$ (log-moneyness slope, or implied volatility smile) and $\Sigma_{1,0}(v_t)$ (term-structure slope) are approximated and represented as red and blue dashed lines, respectively, with each partial derivative $\partial^{i+j}\Sigma(\tau, 0, v_t)/\partial\tau^i \partial k^j$ evaluated at $\tau = 1$ month.

[Figure 2 panels: Parameter κ, Parameter α, Parameter ξ, Parameter ρ]

Figure 2: Histograms of the Newey-West estimators of asymptotic standard deviations for the exactly identified case

Note: In each panel, the histogram characterizes the distribution of 500 Newey-West (sample-based) estimators of asymptotic standard deviations. For each simulation trial, the sample-based asymptotic standard deviation is calculated according to (19) and (20). The red star marks the finite-sample standard deviation of the corresponding parameter as shown in the fourth column of Table 1.

[Figure 3 panels: Parameter κ, Parameter α, Parameter ξ, Parameter ρ]

Figure 3: Histograms of the Newey-West estimators of asymptotic standard deviations for the over-identified case

Note: Except for switching to the over-identified case, all the other settings for these four panels remain the same as those for producing Figure 2.

Figure 4: Monte Carlo results for nonparametric implied stochastic volatility model (1a)–(1b)

Note: In each panel, the true function is determined or calculated according to (28). The black solid curve represents the mean of nonparametric estimators corresponding to the 500 simulation trials. Each point on the upper (resp. lower) red dashed curve is plotted by shifting the corresponding point on the black mean curve vertically upward (resp. downward) by a distance equal to twice the corresponding finite-sample standard deviation.

Figure 5: Nonparametric implied stochastic volatility model (1a)–(1b) built from one-trial simulation

Note: In each panel, the true function is determined or calculated according to (28). The black solid curve represents the one-trial nonparametric estimator. Each point on the upper (resp. lower) red dashed curve is plotted by shifting the corresponding point on the black curve vertically upward (resp. downward) by a distance equal to twice the corresponding standard error. Here, the standard error is calculated by the bootstrap strategy introduced in Section 5.2.

Figure 6: Histogram of $R^2$ for parametric regressions (32) for individual days across the whole sample covering the period of January 2, 2013 to December 29, 2017.

Figure 7: Implied volatility data on January 3, 2017 and the corresponding parametric fitted surface with regression $R^2 = 0.9868$

Note: The parametric fitted surface is calculated according to bivariate regression (32).

[Figure 7 axes: Time-to-maturity (days), Log-moneyness, Implied volatility (%); legend: Data, Fitted surface]
Figure 8: Histograms for the data of $[\sigma^{(0,0)}]_{\text{data}}$, $[\sigma^{(0,1)}]_{\text{data}}$, and $[\sigma^{(0,2)}]_{\text{data}}$

Note: $[\sigma^{(0,0)}]_{\text{data}}$, $[\sigma^{(0,1)}]_{\text{data}}$, and $[\sigma^{(0,2)}]_{\text{data}}$ are the data of expansion terms $\sigma^{(0,0)}$, $\sigma^{(0,1)}$, and $\sigma^{(0,2)}$, respectively. They are prepared from the bivariate regression (32) across the whole sample. In each panel, we plot a red dashed vertical bar to represent the mean of the corresponding histogram.

Figure 9: Nonparametric implied stochastic volatility model (1a)–(1b)

Note: In the upper left and middle left panels, the data $[\mu]_{\text{data}}$ and $[\eta^2]_{\text{data}}$ are calculated according to (26) and (25), respectively. In the upper right panel, the data $[-\gamma]_{\text{data}}$ are simply the negatives of the data $[\gamma]_{\text{data}}$, which are calculated according to (22). In all three of these panels, the nonparametric estimators are obtained by local linear regressions according to the method proposed in Section 4. In the middle right panel, the nonparametric estimator of $\eta$ follows by taking the square root of the estimator of $\eta^2$. In the lowest panel, the nonparametric estimator of $\rho$ follows from (31). In all the panels, the standard errors of estimators are calculated by the bootstrap strategy introduced in Section 5.2.

Figure 10: Back-check of the fitting performances on expansion terms

Note: In each panel, the data $[\sigma^{(i,j)}]_{\text{data}}$ are obtained from bivariate regression (32), while the fitted expansion terms $\hat{\sigma}^{(i,j)}$ are obtained from replacing the functions $\mu$, $\gamma$, and $\eta$, as well as their derivatives, by their nonparametric estimators in the formula of $\sigma^{(i,j)}$.
