
MCMC for DSGE Models – Metropolis-Hastings Algorithm

Frank Schorfheide
University of Pennsylvania, CEPR, NBER

December 2012
Posterior Inference

• We discussed how to solve a DSGE model;

• and how to compute the likelihood function p(Y|θ) for a DSGE model.

• Bayesian inference requires us to specify a prior p(θ) (more on that later), and

• according to Bayes Theorem

$$ p(\theta \mid Y) = \frac{p(Y \mid \theta)\, p(\theta)}{\int p(Y \mid \theta)\, p(\theta)\, d\theta} $$

• We want to generate draws from the posterior...

Random-Walk Metropolis (RWM) Algorithm for DSGE Model

1. Use a numerical optimization routine to maximize the log posterior, which up to a constant is given by ln p(Y|θ) + ln p(θ). Denote the posterior mode by θ̃.

2. Let Σ̃ be the inverse of the (negative) Hessian computed at the posterior mode θ̃, which can be computed numerically.

3. Draw θ(0) from N(θ̃, c0²Σ̃) or directly specify a starting value.

4. For s = 1, ..., n_sim (a code sketch follows below):
   • Draw ϑ from the proposal distribution N(θ(s−1), c²Σ̃).
   • Let
$$ r(\theta^{(s-1)}, \vartheta \mid Y) = \frac{p(Y \mid \vartheta)\, p(\vartheta)}{p(Y \mid \theta^{(s-1)})\, p(\theta^{(s-1)})}. $$
   • Let
$$ \theta^{(s)} = \begin{cases} \vartheta & \text{with probability } \min\{1, r(\theta^{(s-1)}, \vartheta \mid Y)\} \\ \theta^{(s-1)} & \text{otherwise.} \end{cases} $$
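A minimal Python sketch of steps 3–4, assuming a user-supplied function log_posterior(theta) (hypothetical) that returns ln p(Y|θ) + ln p(θ) up to a constant:

```python
import numpy as np

def rwm(log_posterior, theta0, Sigma, c=0.5, nsim=100_000, rng=None):
    """Random-Walk Metropolis sampler (sketch).

    log_posterior : callable returning ln p(Y|theta) + ln p(theta), up to a constant
    theta0        : starting value, e.g. a draw from N(theta_tilde, c0^2 Sigma_tilde)
    Sigma         : proposal covariance, e.g. inverse negative Hessian at the mode
    """
    rng = np.random.default_rng() if rng is None else rng
    chol = np.linalg.cholesky(c**2 * Sigma)   # scale the proposal covariance by c^2
    theta = np.asarray(theta0, dtype=float)
    logp = log_posterior(theta)
    draws = np.empty((nsim, theta.size))
    for s in range(nsim):
        # Draw vartheta from the proposal N(theta^(s-1), c^2 Sigma)
        vartheta = theta + chol @ rng.standard_normal(theta.size)
        logp_prop = log_posterior(vartheta)
        # Accept with probability min{1, r}; compare in logs for numerical stability
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = vartheta, logp_prop
        draws[s] = theta
    return draws
```

In practice the scaling constant c is tuned so that the acceptance rate lands roughly between 20% and 40%.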

Prior-Posterior Draws

• Generated 100,000 draws, discarded first 10,000 draws (burn-in)


• For the posterior, every 500th draw is plotted.

500 Posterior Draws

• First 500 draws after initial 10,000 burn-in draws.


• Posterior mean (Blue) and 90% credible set (Green) are based on
the 90,000 posterior draws.
50,000 Posterior Draws

• First 50,000 draws after initial 10,000 burn-in draws.


• Posterior mean (Blue) and 90% credible set (Green) are based on
the 90,000 posterior draws.
Recursive Means Based on 500 Draws

• Based on first 500 draws after initial 10,000 burn-in draws.


• Posterior mean (Blue) and 90% credible set (Green) are based on
the 90,000 posterior draws.
Recursive Means Based on 50,000 Draws

• Based on first 50,000 draws after initial 10,000 burn-in draws.


• Posterior mean (Blue) and 90% credible set (Green) are based on
the 90,000 posterior draws.
Why Does This Algorithm Work (In Principle...)?

• Suppose parameter vector θ is scalar and takes only two values: Θ = {0, 1}.

• The posterior distribution p(θ|Y) can be represented by a set of probabilities collected in the vector π, say π = [1/4, 3/4].

• Goal: we want to generate a sequence of draws $\{\theta^{(s)}\}_{s=1}^{n_{sim}}$ from a discrete Markov process such that P{θ(s) = 0} → 1/4.

Background
• Consider a 2-state Markov process with transition probabilities
$$ P = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} $$
where p_ij is the probability of moving from state i to state j.

• Let π^s = [π_1^s, π_2^s] be a 1 × 2 vector of probabilities of being in state i in iteration s.

• The corresponding probabilities for period s + 1 are π^(s+1) = π^s P.

• Definition: π is an equilibrium distribution if π = πP.
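As a quick numerical illustration (not from the slides), the equilibrium distribution can be computed as the left eigenvector of P associated with the unit eigenvalue; the transition matrix below is an arbitrary example:

```python
import numpy as np

def equilibrium_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to one."""
    eigvals, eigvecs = np.linalg.eig(P.T)     # left eigenvectors of P
    pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return pi / pi.sum()

P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = equilibrium_distribution(P)
print(pi)                       # [0.75 0.25]
print(np.allclose(pi @ P, pi))  # True: pi = pi P
```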

Discrete MH Algorithm: Idea

• In our RWM Algorithm we generated draws from a Normal distribution and they magically turned into draws from some complicated posterior distribution...

• Idea of the Metropolis algorithm: provide a general way of constructing a transition matrix P to generate a chain with (pre-specified) equilibrium distribution π, which is the posterior of interest.

• In our discrete example, let's use the following proposal distribution. We use a Markov chain with transition matrix Q:
$$ Q = \begin{pmatrix} \lambda & (1-\lambda) \\ (1-\lambda) & \lambda \end{pmatrix}. $$

Excursion: Equilibrium distribution of Q
• The equilibrium distribution has to satisfy
$$ \begin{pmatrix} \omega & (1-\omega) \end{pmatrix} = \begin{pmatrix} \omega & (1-\omega) \end{pmatrix} \begin{pmatrix} \lambda & (1-\lambda) \\ (1-\lambda) & \lambda \end{pmatrix} $$

• Thus,
$$ \omega = \omega\lambda + (1-\omega)(1-\lambda) $$

• This leads to
$$ \omega = \frac{1-\lambda}{2(1-\lambda)} = \frac{1}{2}. $$

• Thus, the equilibrium distribution associated with Q does NOT equal the targeted posterior distribution!

Back to the Algorithm
• Iteration s: suppose that θ(s−1) = θ_i. Based on the transition matrix
$$ Q = \begin{pmatrix} \lambda & (1-\lambda) \\ (1-\lambda) & \lambda \end{pmatrix}, $$
determine a proposed state ϑ (which is either 0 or 1 in our example).

• With probability α_ij the proposed state is accepted. Set θ(s) = ϑ.

• With probability 1 − α_ij stay in the old state and set θ(s) = θ(s−1).

• Choose α_ij = min[1, π_j/π_i] (a simulation sketch follows below).
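A short simulation (a sketch; π = [1/4, 3/4] as in the example, λ = 1/2 chosen arbitrarily) confirms that the fraction of draws in state 0 converges to 1/4:

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.25, 0.75])  # targeted posterior probabilities
lam = 0.5                    # proposal Q: stay with prob lam, switch with prob 1 - lam
nsim = 200_000

theta = 0
draws = np.empty(nsim, dtype=int)
for s in range(nsim):
    # Propose a state from Q
    vartheta = theta if rng.uniform() < lam else 1 - theta
    # Accept with probability alpha_ij = min(1, pi_j / pi_i)
    if rng.uniform() < min(1.0, pi[vartheta] / pi[theta]):
        theta = vartheta
    draws[s] = theta

print((draws == 0).mean())   # approximately 0.25
```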

Discrete MH Algorithm: Implementation
• Why choose α_ij = min[1, π_j/π_i], where π = [1/4, 3/4] corresponds to the targeted posterior distribution?

• The resulting chain is reversible (the third equality uses the symmetry q_ij = q_ji of Q):
$$ \pi_i p_{ij} = \pi_i \min[1, \pi_j/\pi_i]\, q_{ij} = \min[\pi_i, \pi_j]\, q_{ij} = \min[\pi_i, \pi_j]\, q_{ji} = \pi_j p_{ji} $$

• In turn, π = [1/4, 3/4] is an equilibrium distribution. Suppose at iteration s we have π^s = π. Then:
$$ \pi_j^{s+1} = (\pi^s P)_j = \sum_{i=1}^{m} \pi_i^s p_{ij} = \sum_{i=1}^{m} \pi_j^s p_{ji} = \pi_j^s \sum_{i=1}^{m} p_{ji} = \pi_j^s $$

• For λ ∈ [0, 1) the chain is also irreducible and the equilibrium distribution is unique.

Discrete MH Algorithm
• The chain's transition matrix is:
$$ P = \begin{pmatrix} \lambda & (1-\lambda) \\ \frac{1}{3}(1-\lambda) & \lambda + \frac{2}{3}(1-\lambda) \end{pmatrix}. $$
You can verify that π = [1/4, 3/4] is indeed an equilibrium distribution.

• The persistence of the chain is determined by the second-largest eigenvalue of P:
$$ ev(\lambda) = \frac{4}{3}\lambda - \frac{1}{3}. $$

• For λ = 1/4 the second eigenvalue is zero and the chain delivers iid draws.

• A discrete Markov chain is irreducible if all states communicate with each other, that is, the expected time of arriving from state i at state j is finite.

• For λ = 1 the second eigenvalue is one, the chain is NOT irreducible, and the equilibrium distribution is not unique.

• These ideas can be generalized to the continuous case... (we'll do so tomorrow)
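These claims are easy to check numerically (a sketch):

```python
import numpy as np

pi = np.array([0.25, 0.75])
for lam in [0.25, 0.5, 0.9]:
    P = np.array([[lam,            1 - lam],
                  [(1 - lam) / 3,  lam + 2 * (1 - lam) / 3]])
    assert np.allclose(pi @ P, pi)          # pi = [1/4, 3/4] is an equilibrium distribution
    second_ev = np.sort(np.linalg.eigvals(P))[0]
    assert np.isclose(second_ev, 4 * lam / 3 - 1 / 3)
    print(lam, second_ev)                   # lam = 1/4 gives 0: iid draws
```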
MCMC: What works and what doesn’t
• State-space representation:
$$ y_t = \begin{pmatrix} 1 & 1 \end{pmatrix} s_t, \qquad s_t = \begin{pmatrix} \phi_1 & 0 \\ \phi_3 & \phi_2 \end{pmatrix} s_{t-1} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \epsilon_t. $$

• The state-space model can be re-written as an ARMA(2,1) process
$$ (1 - \phi_1 L)(1 - \phi_2 L)\, y_t = (1 - (\phi_2 - \phi_3) L)\, \epsilon_t. $$

• Relationship between state-space parameters φ and structural parameters θ:
$$ \phi_1 = \theta_1^2, \qquad \phi_2 = (1 - \theta_1^2), \qquad \phi_3 - \phi_2 = -\theta_1 \theta_2. $$

Stylized Example
Model
Reduced form: (1 − φ1 L)(1 − φ2 L)y_t = (1 − (φ2 − φ3)L)ε_t.

Relationship of φ and θ: φ1 = θ1², φ2 = (1 − θ1²), φ3 − φ2 = −θ1 θ2.

• Local identification problem arises as θ1 → 0 (θ2 then no longer affects the reduced form, since it enters only through the product θ1 θ2).

• Global identification problem p(Y|θ) = p(Y|θ̃): the two AR roots θ1² and 1 − θ1² swap while the MA coefficient θ1 θ2 is preserved,
$$ \theta_1^2 = \rho \quad \text{versus} \quad \tilde{\theta}_1^2 = 1 - \rho, \qquad \text{with } \tilde{\theta}_1 \tilde{\theta}_2 = \theta_1 \theta_2. $$
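A small check (a sketch; θ1 = 0.8, θ2 = 0.3 are arbitrary illustrative values) that the two parameterizations imply the same reduced form and hence the same likelihood:

```python
import numpy as np

def reduced_form(theta1, theta2):
    """Map structural parameters to (set of AR roots, MA coefficient)."""
    phi1 = theta1**2
    phi2 = 1 - theta1**2
    ma = theta1 * theta2             # MA coefficient: phi2 - phi3 = theta1 * theta2
    return sorted([phi1, phi2]), ma

theta1, theta2 = 0.8, 0.3
rho = theta1**2
theta1_t = np.sqrt(1 - rho)            # tilde-theta_1^2 = 1 - rho: AR roots swap
theta2_t = theta1 * theta2 / theta1_t  # keeps tilde-theta_1 * tilde-theta_2 = theta_1 * theta_2

print(reduced_form(theta1, theta2))      # ([0.36, 0.64], 0.24)
print(reduced_form(theta1_t, theta2_t))  # identical reduced form
```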

Bayesian Inference

• Global identification problem: difficult to draw from multi-modal posteriors.

• Local identification problem: less problematic, because it is fairly straightforward to generate draws from the prior.

Bayesian Inference

• Bayesian inference with proper priors does not require identifiability as a regularity condition.

• If θ = [θ1′, θ2′]′ and p(Y|θ) = p(Y|θ1), then
$$ p(\theta_1, \theta_2 \mid Y) = p(\theta_1 \mid Y)\, p(\theta_2 \mid \theta_1). $$

• If you don't like priors in identified models, you won't like them in partially/weakly identified models...

Blocking
• In high-dimensional parameter spaces the RWM algorithm generates highly persistent Markov chains.

• What's bad about persistence? The autocovariance terms inflate the sampling variance of Monte Carlo averages:
$$ \sqrt{n}\,(\bar{X} - E[\bar{X}]) \Longrightarrow N\!\left(0,\; \frac{1}{n}\sum_{i=1}^{n} V[X_i] + \frac{1}{n}\sum_{i=1}^{n}\sum_{j \neq i} COV(X_i, X_j)\right). $$

• Potential remedy (a sketch of one sweep follows below):
  • Partition θ = [θ1, ..., θK].
  • Iterate over the conditional posteriors p(θk | Y, θ−k).

• To reduce the persistence of the chain, try to find partitions such that parameters are strongly correlated within blocks and weakly correlated across blocks.
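A schematic of one sweep of such a Block MH sampler (a sketch, not from the slides; log_posterior and the simple random-walk block proposal are hypothetical placeholders):

```python
import numpy as np

def block_mh_sweep(theta, blocks, log_posterior, scale, rng):
    """One sweep over parameter blocks, each updated by its own MH step.

    blocks : list of index arrays partitioning {0, ..., len(theta) - 1}
    scale  : per-parameter proposal standard deviations
    """
    theta = theta.copy()
    logp = log_posterior(theta)
    for idx in blocks:
        proposal = theta.copy()
        # Perturb only the current block: a move targeting p(theta_k | Y, theta_{-k})
        proposal[idx] += scale[idx] * rng.standard_normal(len(idx))
        logp_prop = log_posterior(proposal)
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
    return theta

def random_blocks(npara, nblocks, rng):
    """Randomized partitions in the spirit of Chib and Ramamurthy (2010)."""
    return np.array_split(rng.permutation(npara), nblocks)
```

Grouping strongly correlated parameters in the same block keeps the conditional moves effective while avoiding the poor mixing of a single high-dimensional random-walk step.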

Blocking
• Chib and Ramamurthy (2010, JoE):
  • Use randomized partitions.
  • Use simulated annealing to find the mode of p(θk | Y, θ−k), then construct the Hessian to obtain a covariance matrix for the proposal density.

• Herbst (2011, Penn Dissertation):
  • Utilize analytical derivatives.
  • Use information in the Hessian (evaluated at an earlier parameter draw) to construct parameter blocks. For non-elliptical distributions, partitions change as the sampler moves through the parameter space.
  • Use a Gauss-Newton step to construct proposal densities.

• Performance measure: CPU time to generate, say, 1,000 "independent" draws.
