Full text of Working Papers (Federal Reserve Bank of Richmond) : Gaussian Mixture Approximations of Impulse Responses and the Nonlinear Effects of Monetary Shocks, Working Paper 16-08

Working Paper Series

Gaussian Mixture Approximations of
Impulse Responses and the Nonlinear
Effects of Monetary Shocks

WP 16-08

Regis Barnichon
CREI, Universitat Pompeu Fabra,
CEPR
Christian Matthes
Federal Reserve Bank of Richmond

This paper can be downloaded without charge from:
http://www.richmondfed.org/publications/

Working Paper No. 16-08∗
June 2016
(first draft: March 2014)

Abstract
This paper proposes a new method to estimate the (possibly nonlinear) dynamic effects
of structural shocks by using Gaussian basis functions to parametrize impulse response
functions. We apply our approach to the study of monetary policy and obtain two main
results. First, regardless of whether we identify monetary shocks from (i) a timing restriction, (ii) sign restrictions, or (iii) a narrative approach, the effects of monetary policy are
highly asymmetric: A contractionary shock has a strong adverse effect on unemployment,
but an expansionary shock has little effect. Second, an expansionary shock may have some
expansionary effect, but only when the labor market has some slack. In a tight labor
market, an expansionary shock generates a burst of inflation and no significant change in
unemployment.

JEL classifications: C14, C32, C51, E32, E52

∗ We would like to thank Luca Benati, Francesco Bianchi, Christian Brownlees, Fabio Canova, Tim Cogley,
Davide Debortoli, Jordi Gali, Yuriy Gorodnichenko, Eleonora Granziera, Oscar Jorda, Thomas Lubik, Jim Nason, Kris Nimark, Mikkel Plagborg-Moller, Giorgio Primiceri, Ricardo Reis, Barbara Rossi, Mark Watson, Yanos
Zylberberg and seminar participants at the Barcelona GSE Summer Forum 2014, the 2014 NBER/Chicago Fed
DSGE Workshop, William and Mary college, EUI Workshop on Time-Varying Coefficient Models, Oxford, Bank
of England, NYU Alumni Conference, Society for Economic Dynamics Annual Meeting (Warsaw), Universitaet
Bern, Econometric Society World Congress (Montreal), the Federal Reserve Board, the 2015 SciencesPo conference on Empirical Monetary Economics, and the San Francisco Fed for helpful comments. The views expressed
here do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System.
Any errors are our own.

1 Introduction

There now exists a relatively broad consensus on the average effect of monetary policy on
economic activity, and it is generally accepted that a monetary contraction (expansion) leads
to a decline (increase) in output.
However, there is still little agreement about possible asymmetric or nonlinear effects of
monetary policy, and two questions at the core of monetary policy making are largely unsettled.1 First, does monetary policy have asymmetric effects on economic activity? As captured
by the string metaphor, does contractionary monetary policy (akin to pulling on a string)
have a much stronger effect than expansionary monetary policy (akin to pushing on a
string)? Second, does the effect of monetary policy vary with the state of the business cycle?
For instance, does the central bank have more room to stimulate economic activity (without
raising inflation) during recessions?
Providing answers to these questions has been difficult in part for one important technical
reason: the standard approach to identify the dynamic effect of shocks relies on structural
Vector-Autoregressions (VARs),2 which are linear models. While VARs can accommodate
certain types of nonlinearities, some questions, such as the asymmetric effect of a monetary
shock, cannot be answered within a VAR framework.
This paper proposes a new method to estimate the (possibly nonlinear) dynamic effects of
structural shocks. Instead of assuming the existence of a VAR representation, our approach
consists in working directly with the structural moving-average representation of the economy.
Then, to make the estimation of the moving-average representation feasible, we parametrize
the impulse response functions with Gaussian basis functions.
Our approach builds on two premises: (i) any mean-reverting impulse response function can
be approximated by a mixture of Gaussian basis functions, and (ii) a small number (one or two)
of Gaussian functions can already capture a large variety of impulse response functions, and
notably the typical impulse responses found in empirical or theoretical studies. For instance,
the impulse response functions to monetary shocks are often found (or theoretically predicted)
to be monotonic or hump-shaped (e.g., Christiano, Eichenbaum and Evans 1999, Walsh 2010).
In such cases, a single Gaussian function can already provide an excellent approximation of
the impulse response function.

1 For instance, while Cover (1992) finds evidence of asymmetric effects, Ravn and Sola (1996, 2004) and
Weise (1999) instead find nearly symmetric effects. And while Lo and Piger (2005) and Santoro et al. (2014)
conclude that monetary policy has stronger effects during recessions, Tenreyro and Thwaites (2015) conclude
the opposite.
2 See e.g., Christiano, Eichenbaum, and Evans (1999) and Uhlig (2005).

Thanks to the small number of free parameters allowed by a Gaussian Mixture Approximation (GMA), it is possible to directly estimate the structural moving average model from
the data, i.e., directly estimate the impulse response functions.3 In turn, the parsimony of the
approach allows us to estimate more general nonlinear models.
We conduct a number of Monte-Carlo simulations to illustrate the performance of our approach in finite sample, first for linear models, then for nonlinear models. In a linear model,
we show that a GMA model can generate more accurate impulse response estimates (in a
mean-squared error sense) than a well-specified VAR model. In a simulation with asymmetry and state-dependence, we find that a GMA model can accurately detect the presence of
nonlinearities and deliver good estimates of the magnitudes of the nonlinearities.
We use our GMA approach to estimate the nonlinear effects of monetary shocks. Our
benchmark identification scheme is a recursive identification scheme, whereby monetary policy
shocks can only affect macro variables with a one period lag (Christiano, Eichenbaum and
Evans, 1999). However, to emphasize that GMAs can easily accommodate other structural
identification schemes, we also consider two alternative identification schemes: (i) a set identification scheme based on sign restrictions,4 and (ii) a narrative identification scheme where a
series of monetary shocks has been previously identified from narrative accounts (Romer and
Romer, 2002).
3 Another advantage of using Gaussian basis functions is that prior elicitation can be much easier than with
Bayesian estimation of standard VARs, because the coefficients to be estimated are directly interpretable as
features of impulse responses.
4 See e.g., Faust (1998), Canova and De Nicolo (2002), Uhlig (2005), Amir Ahmadi and Uhlig (2015).

Consistent with the string metaphor, our findings point towards the existence of strong
asymmetries in the effects of monetary shocks, and Bayesian model comparison strongly favors
a GMA model with asymmetry over a linear VAR model. Regardless of whether we identify
monetary shocks from a recursive ordering, from sign restrictions or from a narrative approach,
we find that a contractionary shock has a strong adverse effect on unemployment, larger than
implied by linear estimates, while an expansionary shock has little effect on unemployment.5
Although our evidence for inflation is more uncertain, the behavior of inflation suggests that
the asymmetric response of unemployment could be due to the presence of downward price/wage
rigidities, because inflation displays a more marked price puzzle following a contractionary
shock than following an expansionary shock.6
We also find that the effect of a monetary shock depends on the state of the business cycle
at the time of the intervention: an expansionary shock can have some expansionary effect, but
only when the labor market has some slack. In a tight labor market, an expansionary shock
generates no significant drop in unemployment but leads to a burst of inflation, consistent with
a standard Keynesian narrative.
Although our use of Gaussian basis functions to model and estimate impulse response
functions is new in the economics literature, our approach can be cast in the broader context
of the machine (supervised) learning literature in that we project the function to be estimated
on the space spanned by a dictionary of basis functions (see Hastie, Tibshirani and Friedman,
2009). In basis functions methods, the number of basis functions is often too large for empirical
purposes, and the complexity of the model is typically controlled through a combination of
restriction, selection and/or regularization methods. Our approach, which consists in using a
limited number of basis functions, uses both selection and restriction to control the complexity
of the model.7
5 This finding is interesting in the context of the current debate on the appropriate timing of the lift-off
of the policy rate from its (close to) zero level in most developed economies. Our estimates suggest that an
inappropriate (i.e., too strong or too early) increase in the policy rate could be a lot more costly (in terms of
economic activity) than conventional (linear) estimates suggest.
6 See e.g., Morgan (1993) for a discussion of the effect of downward price rigidity on asymmetric effects of
monetary policy.
7 It uses selection in the sense that our algorithm scans the dictionary of possible basis functions to find
the Gaussian basis functions that best fit the data (in a maximum likelihood sense), and it uses restriction in
the sense that we restrict ourselves to the class of impulse response functions that can be generated by a few
Gaussian basis functions.

In economics, our parametrization of impulse responses relates to an older literature on distributed lag models and in particular the Almon (1965) lag specification, in which the successive
weights, i.e., the impulse response function in our context, are given by a polynomial function.8
Our use of Gaussian basis functions relates to a large applied mathematics literature that relies
on radial basis functions (of which Gaussian functions are one example) to approximate arbitrary multivariate functions (e.g., Buhmann, 2003) or to approximate arbitrary distributions
using a mixture of Gaussian distributions (Alspach and Sorenson 1971, 1972, McLachlan and
Peel, 2000). Although Gaussian basis functions provide a more natural and more parsimonious way than polynomials to approximate mean-reverting impulse response functions, our
approach is general and other basis functions are possible. For instance, the inverse quadratic
function, which is also a popular radial basis function, could be used to parametrize impulse
response functions.9 Finally, our approach shares with the non-parametric econometrics literature (e.g., Racine, 2008) the insight that mixtures of Gaussian kernels can approximate very
general shapes, although we use that insight in a very different manner.
The economic literature has so far tackled the estimation of nonlinear effects of shocks in
two main ways.10

8 Recently, Plagborg-Moller (2016) proposes a Bayesian method to directly estimate the structural moving-average representation of the data by using prior information about the shape and the smoothness of the impulse
response.
9 In fact, in a different context, Jorgenson (1966) suggested that ratios of polynomials, of which the inverse
quadratic function is one example, could be used to parametrize distributed lag functions.
10 A third nonlinear approach was recently proposed by Angrist et al. (2013) who develop a semi-parametric
estimator to evaluate the (possibly asymmetric) effects of monetary policy interventions. They find asymmetric
effects of monetary shocks consistent with our findings.

A first approach estimates nonlinear effects by regressing a variable of interest on contemporaneous and lagged values of some independently identified shocks while allowing for possible
nonlinear effects. In the context of monetary policy, Cover (1992), DeLong and Summers
(1988) and Morgan (1993) identify monetary shocks from unanticipated money innovations
(obtained from a money supply process regression, following Barro, 1977) and test whether
the impulse response function depends on the sign of these innovations. While that approach
was later abandoned because money supply regressions were suspected to poorly identify monetary shocks, the use of independently identified shocks has been recently revived thanks to
the use of narratively identified shocks (Romer and Romer, 2002) and thanks to the Local Projection method pioneered by Jorda (2005).11 The narrative approach was precisely developed
in order to identify exogenous monetary innovations, and Jorda’s method can easily accommodate nonlinearities in the response function.12 However, the Local Projection method is
limited by efficiency considerations. Indeed, while the Local Projection approach is intentionally model-free (it does not impose any underlying dynamic system), this can come at an efficiency
cost (Ramey, 2012), which makes inferences on a rich set of nonlinearities (e.g., sign- and
state-dependence) difficult. In contrast, by positing that the response function can be approximated by one (or a few) Gaussian functions, our approach imposes strong dynamic restrictions
between the parameters of the impulse response function, which in turn allow us to estimate
a rich set of nonlinearities.13 Another advantage of our approach is that it can be used for
model selection and model evaluation through marginal data density comparisons.
A second strand in the literature has relied on regime-switching VAR models –notably
threshold VARs (e.g., Hubrich and Terasvirta, 2013) and Markov-switching VARs (Hamilton,
1989)– to capture certain types of nonlinearities.14,15 However, while regime-switching VARs
can capture state dependence (whereby the value of some state variable affects the impulse
response functions), they cannot capture asymmetric effects of shocks (whereby the impulse
response to a structural shock depends on the sign of that shock). Indeed, with regime-switching
VAR models, it is assumed that the economy can be in a finite number of regimes,
and that each regime corresponds to a different set of VAR coefficients. However, if the true
data generating process features asymmetric impulse responses, a new set of VAR coefficients
would be necessary each period, because the (nonlinear) behavior of the economy at any point
in time depends on all structural shocks up to that point. As a result, such an asymmetric data
generating process cannot generally be approximated by a small number of state variables such
as in threshold VARs or Markov-switching models. In contrast, by working directly with the
structural moving-average representation, GMA models can easily capture asymmetric impulse
response functions (as well as state dependence).

11 The combination of Jorda’s method with narratively identified shocks was first introduced in the context
of fiscal policy by Auerbach and Gorodnichenko (2013) in order to test for the existence of state dependence in
the effects of fiscal policy.
12 Santoro et al. (2014) and Tenreyro and Thwaites (2013) use the Jorda method to estimate the extent of
state dependence in the effect of monetary policy.
13 Naturally, this statement also implies that our results are valid under the assumption that response functions can be well approximated by a few Gaussian functions. In this respect, our approach is best seen as
complementing the model-free approach of Jorda (2005).
14 For examples in the monetary policy literature, see Beaudry and Koop (1993), Thoma (1994), Potter (1995),
Kandil (1995), Koop, Pesaran and Potter (1996), Koop and Potter (1998), Ravn and Sola (1996, 2004), Weise
(1999), Lo and Piger (2005).
15 Another prominent class of nonlinear VARs includes models with time-varying coefficients and/or time-varying volatilities (e.g., Primiceri, 2005).

Section 2 describes how we approximate impulse responses using mixtures of Gaussians;
Section 3 discusses the key steps of the estimation methodology; Section 4 generalizes our
approach to nonlinear models; Section 5 presents Monte Carlo simulations to evaluate the
performance of our approach in finite sample, first for linear models, then for nonlinear models;
Section 6 applies GMA to the study of the nonlinear effects of monetary shocks using US data;
Section 7 concludes.

2 Gaussian Mixture Approximations

This section presents a new method to estimate impulse responses using Gaussian Mixture
Approximations (GMA) of the structural moving-average representation of the economy. Although the use of GMAs was motivated in the introduction by the need to model and estimate
certain types of nonlinearities, the intuition and benefits of GMA models can be understood
in a linear context, and this section introduces GMAs in a linear context. We postpone the
modeling and estimation of nonlinearities to Section 4.

2.1 A structural moving average representation

Our starting point is a structural moving-average model of the economy, in which the behavior
of a system of macroeconomic variables is dictated by its response to past and present structural
shocks. Specifically, denoting yt an L × 1 vector of stationary macroeconomic variables, the
economy is described by

$$y_t = \sum_{k=0}^{K} \Psi_k \varepsilon_{t-k} \qquad (1)$$

where boldface letters indicate vectors or matrices, εt is the vector of structural innovations
with E εt = 0 and E εt εt′ = I, and K is the number of lags, which can be finite or infinite.
Throughout the text, we omit the intercepts for ease of exposition, but all estimated models
include intercepts. The matrices $\{\Psi_k\}_{k=0}^{K}$ capture the impulse responses to shocks, and as
a normalization, we posit that Ψ0 has positive entries on the diagonal, i.e., $\Psi_{0,\ell\ell} \geq 0$, ∀ℓ ∈
{1, .., L}. For now, the model is linear, and the Ψk matrices are fixed.
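To make the data generating process in (1) concrete, the following minimal sketch simulates a finite-order structural moving average. The function name `simulate_ma`, the two-variable coefficient matrices and the zero pre-sample shocks are illustrative assumptions, not part of the paper's estimation code.

```python
import numpy as np

def simulate_ma(psi, T, seed=0):
    """Simulate y_t = sum_{k=0}^{K} Psi_k eps_{t-k} with eps_t ~ N(0, I).

    psi has shape (K + 1, L, L); shocks dated before the sample are set to zero.
    """
    K, L = psi.shape[0] - 1, psi.shape[1]
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((T, L))
    y = np.zeros((T, L))
    for t in range(T):
        # Only lags that fall inside the sample contribute.
        for k in range(min(t, K) + 1):
            y[t] += psi[k] @ eps[t - k]
    return y, eps

# Hypothetical 2-variable example with K = 3 and geometrically decaying responses.
psi_demo = np.stack([0.5 ** k * np.array([[1.0, 0.0], [0.3, 1.0]]) for k in range(4)])
y_demo, eps_demo = simulate_ma(psi_demo, T=200)
```

Here `psi_demo[0]` is lower triangular with a positive diagonal, which matches the normalization on Ψ0 stated above.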
If (1) is invertible and admits a VAR representation, the model can be estimated from a
VAR on yt (provided some structural identifying assumption, such as the recursive ordering
of Ψ0 ). However, assuming the existence of a VAR representation can be restrictive. In
particular, in a nonlinear world where Ψk depends on the value of εt−k (for instance, when
the impulse response function varies with the sign of the shock), the existence of a VAR is
compromised. Thus, in this paper, we propose an alternative method that side-steps the need
to invert (1), i.e., we propose a method that side-steps the need for a VAR representation.
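As a stylized illustration of the sign dependence mentioned above, the sketch below simulates a scalar moving average in which positive and negative shocks propagate through different response functions. The specification and the name `simulate_asymmetric_ma` are assumptions made for illustration only; the paper's own nonlinear models are introduced in Section 4.

```python
import numpy as np

def simulate_asymmetric_ma(psi_pos, psi_neg, eps):
    """y_t = sum_k [psi_pos(k) * max(eps_{t-k}, 0) + psi_neg(k) * min(eps_{t-k}, 0)]."""
    psi_pos, psi_neg, eps = map(np.asarray, (psi_pos, psi_neg, eps))
    pos, neg = np.maximum(eps, 0.0), np.minimum(eps, 0.0)
    T = len(eps)
    # np.convolve returns the full convolution; keep the first T terms.
    return np.convolve(pos, psi_pos)[:T] + np.convolve(neg, psi_neg)[:T]
```

Because the response coefficient applied to each past shock depends on that shock's sign, the process is not a linear filter of the shocks, which is why a single set of VAR coefficients cannot reproduce it in general.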

2.2 Gaussian Mixture Approximations of impulse response functions

Rather than looking for a VAR representation of the dynamic system (1), our aim is to directly
estimate (1), the moving-average representation of the economy. Because the number of free
parameters $\{\Psi_k\}_{k=0}^{K}$ in (1) is very large or possibly infinite, our strategy consists in parameterizing the impulse response functions, and more precisely in using mixtures of Gaussian
functions to approximate each impulse response function.

2.2.1 Theoretical background


Our parametrization of the impulse response functions builds on the following theorem, which
states that any bounded, continuous, square-integrable function can be approximated with a sum of Gaussian functions.

Theorem 1 Let $f$ be a bounded continuous function on $\mathbb{R}$ that satisfies $\int_{-\infty}^{\infty} f(x)^2 \, dx < \infty$.
There exists a function $f_N$ defined by

$$f_N(x) = \sum_{n=1}^{N} a_n e^{-\left(\frac{x - b_n}{c_n}\right)^2}$$

with $a_n, b_n, c_n \in \mathbb{R}$ for $n \in \mathbb{N}$, such that the sequence $\{f_N\}$ converges pointwise to $f$ on
every interval of $\mathbb{R}$.
Proof. See Appendix.
Denote ψ(k) a representative element of matrix Ψk, so that ψ(k) is the value of the impulse
response function ψ at horizon k.
Motivated by Theorem 1, our approach will consist in approximating the impulse response
function ψ with a sum of Gaussian functions, that is

$$\psi(k) \simeq \sum_{n=1}^{N} a_n e^{-\left(\frac{k - b_n}{c_n}\right)^2}, \quad \forall k \in (0, K] \qquad (2)$$

with $a_n, b_n, c_n \in \mathbb{R}$.16
Since our strategy consists in approximating impulse response functions with mixtures of
Gaussians, we refer to this class of models as Gaussian Mixture Approximations (GMA), with
a GMA(N) denoting a GMA with N Gaussian basis functions.
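The parametrization in (2) can be sketched in a few lines. The function name `gma_irf` and the coefficient values are hypothetical and chosen only to produce a hump-shaped response; they are not estimates from the paper.

```python
import numpy as np

def gma_irf(k, a, b, c):
    """Evaluate psi(k) = sum_n a_n * exp(-((k - b_n) / c_n) ** 2) at horizons k."""
    k = np.asarray(k, dtype=float)[..., None]   # trailing axis indexes the N Gaussians
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    return np.sum(a * np.exp(-(((k - b) / c) ** 2)), axis=-1)

horizons = np.arange(1, 25)
# GMA(1): a single hump peaking at horizon b = 8.
irf_one = gma_irf(horizons, a=[0.2], b=[8.0], c=[5.0])
# GMA(2): the same hump plus a smaller, later, negative Gaussian.
irf_two = gma_irf(horizons, a=[0.2, -0.05], b=[8.0, 16.0], c=[5.0, 4.0])
```

Each additional Gaussian adds three coefficients, so moving from `irf_one` to `irf_two` doubles the number of parameters describing this single response.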

2.2.2 Intuition and Motivation

Before describing the estimation of GMA models, it is instructive to first intuitively discuss
the benefits of our approach over traditional VARs.
The advantage of our approach, and its use for studying the (possibly nonlinear) effects
of policy, will rest on the fact that, in practice, only a very small number of Gaussian basis
functions are needed to approximate a typical impulse response function, allowing for efficiency
gains and opening the door to estimating nonlinearities.

16 The GMA parametrization of ψ may or may not include the contemporaneous impact coefficient, that is
one may choose to use the approximation (2) for k > 0 or for k ≥ 0. In this paper, we treat ψ(0) as a free
parameter for additional flexibility.

Intuitively, impulse response functions of stationary variables are often found (or theoretically predicted) to be monotonic or hump-shaped (e.g., Christiano, Eichenbaum, and Evans,
1999).17 In such cases, a single Gaussian function can already provide a good approximate
description of the impulse response. To illustrate this observation, Figure 1 plots the impulse
response functions of unemployment, the price level and the fed funds rate to a monetary shock
estimated from a standard VAR specification,18 along with the corresponding GMA(1), the
Gaussian approximation with only one Gaussian function, i.e., using the approximation

$$\psi(k) \simeq a e^{-\frac{(k-b)^2}{c^2}}. \qquad (3)$$

We can see that a GMA(1) already does a good job at capturing the impulse responses implied
by the VAR.19 With a GMA(2), the impulse responses are virtually on top of those of the
VAR (Figure 1). For illustration, Figure 2 plots the Gaussian basis functions used for each
impulse response in the GMA(2) case.
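The fitting exercise behind Figure 1 (choosing a, b and c to minimize the sum of squared residuals against a given impulse response) can be sketched as follows. This is one possible implementation under stated assumptions: it searches a grid over (b, c) and solves for a in closed form, the target is a hypothetical hump-shaped function rather than the VAR responses of Figure 1, and `fit_gma1` is an illustrative name.

```python
import numpy as np

def fit_gma1(irf, horizons, b_grid, c_grid):
    """Least-squares fit of a * exp(-((k - b) / c) ** 2) to a target IRF.

    For each (b, c) on the grid, the SSR-minimizing a has a closed form
    (a regression of the target on one Gaussian basis function).
    """
    best = None
    for b in b_grid:
        for c in c_grid:
            g = np.exp(-(((horizons - b) / c) ** 2))
            a = (g @ irf) / (g @ g)
            ssr = float(np.sum((irf - a * g) ** 2))
            if best is None or ssr < best[3]:
                best = (a, b, c, ssr)
    return best

# Hypothetical hump-shaped target (not the VAR responses of Figure 1).
k = np.arange(1, 41, dtype=float)
target = 0.05 * k * np.exp(-k / 6.0)
a_hat, b_hat, c_hat, ssr = fit_gma1(target, k, np.linspace(0, 20, 81), np.linspace(1, 20, 77))
```

A grid search is used here only to keep the sketch dependency-free; a numerical optimizer over (a, b, c) would serve the same purpose.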
In both cases, the number of free parameters is manageable. For instance, in this three-variable
example, a GMA(1) only has 27 parameters (9 impulse responses times 3 parameters per
impulse response, ignoring intercepts) to capture the whole set of impulse responses $\{\Psi_k\}_{k=1}^{K}$,
while a GMA(2) has 54 free parameters (9 ∗ 3 ∗ 2 = 54).20
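The counting rule used in this paragraph (number of impulse responses, times three coefficients per Gaussian, times the number of Gaussians) and the VAR comparison in the accompanying footnote can be written as two small helper functions. Both names are hypothetical, and both counts ignore intercepts, the impact matrix and the covariance matrix.

```python
def gma_param_count(n_vars, n_gaussians):
    """Free GMA parameters for {Psi_k}, k >= 1: three (a, b, c) per Gaussian per response."""
    return n_vars ** 2 * 3 * n_gaussians

def var_param_count(n_vars, n_lags):
    """Autoregressive coefficients of a VAR, ignoring intercepts and the covariance matrix."""
    return n_lags * n_vars ** 2
```

The GMA count does not depend on the horizon K or on the data frequency, whereas the VAR count grows linearly with the number of lags.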
This relatively small number of free parameters in turn allows us to directly estimate the
impulse response functions from the vector moving-average representation (1). This point is at
the core of our GMA approach, because being able to directly work with the moving-average
representation will allow us to estimate models in which shocks can have nonlinear effects.

17 In New-Keynesian models, the impulse response functions are generally monotonic or hump-shaped (see
e.g., Walsh, 2010).
18 See Section 6 for the exact specification of the SVAR behind Figure 1. The VAR is specified with unemployment, PCE inflation and the fed funds rate. The impulse response for the price level is calculated from the
response of inflation.
19 In Figure 1, the parameters of the GMA (the a, b and c coefficients) were set to minimize the discrepancy
(sum of squared residuals) between the two sets of impulse responses.
20 For comparison, a corresponding quarterly VAR with 3 variables and 4 lags has $4 \times 3^2 = 36$ free parameters,
and a monthly VAR with 12 lags has $12 \times 3^2 = 108$ free parameters.

To conclude this intuition section, we comment on a particularly interesting case: the
GMA(1) model, which has two additional advantages: (i) ease of interpretation, and (ii) ease
of prior elicitation.
In a GMA(1) model like (3), the a, b and c coefficients can be easily interpreted, because
the impulse response function is summarized by three parameters (the peak effect, the time
to peak effect, and the persistence of the impulse response), which are generally considered
the most relevant characteristics of an impulse response function.21 As illustrated in Figure 3,
parameter a is the height of the impulse response, which corresponds to the maximum effect
of a unit shock, parameter b is the timing of this maximum effect, and parameter c captures
the persistence of the effect of the shock, as the amount of time τ required for the effect of a
shock to be 50% of its maximum value is given by $\tau = c\sqrt{\ln 2}$.
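The half-life expression follows from setting the GMA(1) response in (3) equal to half its peak: a exp(−(τ/c)²) = a/2 implies (τ/c)² = ln 2, so τ = c√(ln 2), measured from the peak at horizon b. The short check below uses hypothetical helper names and hypothetical parameter values.

```python
import math

def half_life(c):
    """Distance tau from the peak at which a GMA(1) response equals half its peak value."""
    return c * math.sqrt(math.log(2.0))

def gma1(k, a, b, c):
    """GMA(1) response a * exp(-((k - b) / c) ** 2) at horizon k."""
    return a * math.exp(-((k - b) / c) ** 2)

# Hypothetical parameter values: peak 0.4 at horizon 6, persistence parameter 5.
a, b, c = 0.4, 6.0, 5.0
value_at_half_life = gma1(b + half_life(c), a, b, c)   # should equal a / 2 up to rounding
```

Because the Gaussian is symmetric around b, the response is also at half its peak τ periods before the peak.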
In turn, the ease of interpretation of the a, b and c parameters makes prior elicitation
easier than in standard VARs, in which the VAR coefficients have a less direct economic
interpretation.

3 Bayesian estimation

To estimate our model, we use a Bayesian approach, which is particularly well suited for models
that only approximate the true DGP (Fernandez-Villaverde and Rubio-Ramirez, 2004). In
particular, Bayes factors will allow us to evaluate GMA models against VAR models, even
though the two classes of models are non-nested.22 Bayesian model comparison will also offer
us a natural way to select the order of the GMA model, i.e., the number of Gaussian basis
functions used in the approximation.
In this section, we describe the implementation and estimation of GMA models. We first
describe how we construct the likelihood function by exploiting the prediction-error decomposition and discuss structural identification. We then present the estimation routine based on a multiple-block Metropolis-Hastings algorithm and discuss prior elicitation, the determination of the order
of the GMA, and identification issues related to fundamentalness. We conclude by discussing
how to deal with non-stationary data.

21 For instance, when comparing the effects of monetary shocks across different specifications, Coibion (2012)
focuses on the peak effect of the monetary shock, which in a GMA(1) model is simply parameter a.
22 Bayes factors are functions of the marginal data densities for the two models that are being compared.
Since marginal data densities can be rewritten as products of one-step ahead forecast densities, Bayes factors
also offer insights about the relative forecasting abilities of the two models that are being compared.

3.1 Constructing the likelihood function

We now describe how to construct the likelihood function $p(y^T|\theta)$ of a sample of size T for the
moving-average model (1) with parameter vector θ and where a variable with a superscript
denotes the sample of that variable up to the date in the superscript.
To start, we use the prediction error decomposition to break up the density $p(y^T|\theta)$ as
follows:23

$$p(y^T|\theta) = \prod_{t=1}^{T} p(y_t|\theta, y^{t-1}). \qquad (4)$$

To calculate the one-step-ahead conditional likelihood function needed for the prediction
error decomposition, we assume that all innovations {εt} are Gaussian with mean zero and
variance one,24 and we note that the density $p(y_t|\theta, y^{t-1})$ can be re-written as $p(y_t|\theta, y^{t-1}) = p(\Psi_0 \varepsilon_t|\theta, y^{t-1})$ since

$$y_t = \Psi_0 \varepsilon_t + \sum_{k=1}^{K} \Psi_k \varepsilon_{t-k}. \qquad (5)$$

Since the contemporaneous impact matrix is a constant, $p(\Psi_0 \varepsilon_t|\theta, y^{t-1})$ is a straightforward
function of the density of εt.
To recursively construct εt as a function of θ and $y^t$, we need to uniquely pin down the
values of the components of εt from equation (5), that is we need that Ψ0 is invertible. We
impose this restriction by assigning a minus infinity value to the likelihood whenever Ψ0 is not
invertible. It is also at this stage that we impose the identifying restriction that we describe
next. Finally, to initialize the recursion, we set the first K innovations $\{\varepsilon_j\}_{j=-K}^{0}$ to zero.25,26

23 To derive the conditional densities in decomposition (4), our parameter vector θ thus implicitly also includes
the K initial values of the shocks: $\{\varepsilon_{-K}, \ldots, \varepsilon_0\}$. We will keep those fixed throughout the estimation and discuss
alternative initializations below.
24 The estimation could easily be generalized to allow for non-normal innovations such as t-distributed errors.
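The recursion described in this subsection can be sketched as follows for a linear model with given matrices Ψ0, ..., ΨK. This is a minimal illustration under stated assumptions, not the authors' code: the function name `gma_loglik` is hypothetical, intercepts are omitted, the value returned is the log-likelihood (so the penalty for a singular Ψ0 is minus infinity on the log scale), and the pre-sample shocks are stored as K rows of zeros.

```python
import numpy as np

def gma_loglik(y, psi):
    """Log-likelihood of y (T x L) under y_t = sum_{k=0}^{K} Psi_k eps_{t-k}, eps_t ~ N(0, I).

    psi has shape (K + 1, L, L). Returns -inf when Psi_0 is not invertible.
    """
    T, L = y.shape
    K = psi.shape[0] - 1
    sign, logdet = np.linalg.slogdet(psi[0])
    if sign == 0:
        return -np.inf
    psi0_inv = np.linalg.inv(psi[0])
    eps = np.zeros((T + K, L))            # first K rows: pre-sample shocks fixed at zero
    loglik = 0.0
    for t in range(T):
        # Contribution of past shocks, sum_{k=1}^{K} Psi_k eps_{t-k}.
        past = sum(psi[k] @ eps[K + t - k] for k in range(1, K + 1))
        e = psi0_inv @ (y[t] - past)      # eps_t recovered from equation (5)
        eps[K + t] = e
        # Density of y_t given the past: N(e; 0, I) times the Jacobian |det Psi_0|^{-1}.
        loglik += -0.5 * L * np.log(2 * np.pi) - 0.5 * e @ e - logdet
    return loglik
```

In a GMA model the matrices `psi[1:]` would be generated from the (a, b, c) coefficients of each impulse response rather than supplied directly.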

3.2 Structural identifying assumptions

Model (1) is under-identified without additional restrictions. In our application of GMAs to
the study of monetary policy, we will use as our benchmark a recursive identification scheme
(Christiano, Eichenbaum and Evans, 1999). However, to emphasize that GMAs can easily
accommodate other structural identification schemes, we will also consider two popular schemes
to identify monetary shocks: (i) the narrative identification scheme where a series of monetary
shocks has been previously identified from narrative accounts (Romer and Romer, 2002), and
(ii) a set identification scheme based on sign restrictions (Uhlig, 2005).27 We describe the
implementation of these identification schemes next.
Short-run restrictions
Short-run restrictions consist in restrictions on Ψ0, which are straightforward to implement in a GMA model.
Short-run restrictions in a fully identified model consist in imposing $\frac{L(L-1)}{2}$ restrictions on
Ψ0 (of dimension L × L), and a common approach is to impose that Ψ0 is lower triangular,
so that the different shocks are identified from a timing restriction. This identifying scheme
is popular in the case of monetary policy, where monetary shocks are assumed to only affect
macro variables with a one period lag (Christiano, Eichenbaum and Evans, 1999).
In a partially identified model, one can impose a timing restriction for one shock only.
In the case of the monetary model considered in section 6, this will amount to ordering the
monetary policy variable last and imposing that Ψ0 has its last column filled with 0 except for
the diagonal coefficient. The submatrix Ψ̃0 made of the first (L − 1) rows and (L − 1) columns
of Ψ0 is then left unrestricted, apart from invertibility to ensure that equation (5) defines a
unique shock vector εt (as described in section 3.1).
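One simple way to impose these zero restrictions is to build Ψ0 from its free elements, so that the restricted entries are zero by construction. The sketch below assumes this approach; the function names are hypothetical, and in the partial case the first L − 1 entries of the last row are treated as free, which the text implies but does not state explicitly.

```python
import numpy as np

def recursive_psi0(free_params, L):
    """Fully identified case: lower-triangular Psi_0 with L(L+1)/2 free entries."""
    psi0 = np.zeros((L, L))
    psi0[np.tril_indices(L)] = free_params   # filled row by row
    return psi0

def partial_psi0(free_block, last_row, L):
    """Partially identified case: last column zero except for its diagonal entry."""
    psi0 = np.zeros((L, L))
    psi0[:-1, :-1] = free_block   # unrestricted (L-1) x (L-1) submatrix
    psi0[-1, :] = last_row        # assumption: the last variable reacts to all shocks on impact
    return psi0
```

With L = 3, `recursive_psi0` leaves 6 of the 9 entries free, which corresponds to the L(L − 1)/2 = 3 zero restrictions of the fully identified case.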
25 Alternatively, we could use the first K values of the shocks recovered from a structural VAR.
26 When K, the lag length of the moving average (1), is infinite, we truncate the model at some horizon
K, large enough to ensure that the lag matrix coefficients ΨK are “close” to zero. Such a K exists since the
variables are stationary.
27 In Barnichon and Matthes (2016), we discuss how to impose other identification schemes.


Narrative identification
In a narrative identification scheme, a series of shocks has been
previously identified from narrative accounts. For that case, we can proceed as with the
recursive identification, because the use of narratively identified shocks can be cast as a partial
identification scheme. If one orders the narratively identified shock series first in yt, we can
assume that Ψ0 has its first row filled with 0 except for the diagonal coefficient, which implies
that the narratively identified shock does not react contemporaneously to other shocks (as
should be the case if the narrative shocks were correctly identified).

Sign restrictions

Set identification through sign restrictions consists in imposing sign-

restrictions on the sign of the Ψk matrices, i.e., the impulse response coefficients at different
horizons. Again, because a GMA model works directly with the moving average representation and the Ψk matrices, imposing sign-restrictions is straightforward to implement in a GMA
model. One can impose sign-restrictions on only the impact coefficients (captured by Ψ0 , which
could be left as a free parameter in this case) and/or sign restrictions on the impulse response
over a specific horizon (captured by the {an , bn , cn } GMA coefficients that model Ψk ). To
implement parameter restrictions on Ψ0 and/or {an , bn , cn }, we assign a minus infinity value
to the likelihood whenever the restrictions are not met.
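The rejection rule just described can be sketched as follows for a single Gaussian basis function. This is a minimal illustration, not the authors' code: the function names, the toy likelihood and the parameter values are all assumptions made for the example.

```python
import math

def gaussian_irf(k, a, b, c):
    """One Gaussian basis function: a * exp(-((k - b) / c)**2)."""
    return a * math.exp(-((k - b) / c) ** 2)

def restricted_log_likelihood(params, log_likelihood, horizons, sign):
    """Return -inf when the implied response violates the sign restriction.

    params         -- (a, b, c) of a single Gaussian basis function
    log_likelihood -- callable giving the unrestricted log-likelihood (hypothetical)
    horizons       -- horizons k at which the restriction is imposed
    sign           -- +1 (response must be >= 0) or -1 (response must be <= 0)
    """
    a, b, c = params
    for k in horizons:
        if sign * gaussian_irf(k, a, b, c) < 0:
            return -math.inf  # such a draw can never be accepted by a Metropolis step
    return log_likelihood(params)

# Toy stand-in for the model likelihood, used only for illustration.
def toy_loglik(p):
    return -0.5 * sum(x * x for x in p)

ok = restricted_log_likelihood((0.5, 4.0, 3.0), toy_loglik, range(0, 5), +1)
bad = restricted_log_likelihood((-0.5, 4.0, 3.0), toy_loglik, range(0, 5), +1)
```

Here `ok` is the finite toy log-likelihood, while `bad` is minus infinity because a negative amplitude violates the positivity restriction at every horizon.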
More generally, in line with the insights from Baumeister and Hamilton (2015), the implementation of sign-restrictions can take the form of priors on the coefficients of Ψ0 and on the $\{a_n, b_n, c_n\}_{n=1}^{N}$ coefficients.28

28 More generally, because GMAs work directly with the structural moving-average representation, the parameters to be estimated can be interpreted as “features” of the impulse responses, and one could envision set identification schemes through shape restrictions (see e.g., Lippi and Reichlin, 1994 for an early application of this idea). For instance, one could posit priors on the location of the peak effect, posit priors on the persistence of the effect of the shock, among other possibilities. See Plagborg-Moller (2016) for a related idea.

3.3 Estimation routine

To estimate our model, we use a Metropolis-within-Gibbs algorithm (Robert and Casella, 2004; Haario et al., 2001) with the blocks given by the different groups of parameters in our model (there is respectively one block for the a parameters, one block for the b parameters, one block
for the c parameters and one block for the constant and other parameters).
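A block-wise sampler of this kind can be sketched as below. This is only an illustration under simplifying assumptions: it uses a plain random-walk proposal with fixed step sizes (no adaptation of the proposal), a toy log posterior, and hypothetical names throughout.

```python
import math
import random

def metropolis_within_gibbs(log_post, init, blocks, step, n_draws, seed=0):
    """Update one parameter block at a time with a random-walk Metropolis step.

    log_post -- callable mapping a parameter dict to the log posterior
    init     -- dict of starting values, e.g. {"a": [...], "b": [...], "c": [...]}
    blocks   -- order in which the blocks are visited
    step     -- dict of proposal standard deviations, one per block (fixed here)
    """
    rng = random.Random(seed)
    current = {name: list(vals) for name, vals in init.items()}
    current_lp = log_post(current)
    draws = []
    for _ in range(n_draws):
        for name in blocks:
            proposal = dict(current)
            proposal[name] = [v + rng.gauss(0.0, step[name]) for v in current[name]]
            proposal_lp = log_post(proposal)
            # Accept with probability min(1, posterior ratio); 1 - U lies in (0, 1].
            if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
                current, current_lp = proposal, proposal_lp
        draws.append({name: list(vals) for name, vals in current.items()})
    return draws

# Toy target (independent standard Normals) standing in for the GMA posterior.
def toy_log_post(p):
    return -0.5 * sum(v * v for vals in p.values() for v in vals)

draws = metropolis_within_gibbs(
    toy_log_post,
    init={"a": [0.0], "b": [0.0], "c": [0.0]},
    blocks=["a", "b", "c"],
    step={"a": 1.0, "b": 1.0, "c": 1.0},
    n_draws=2000,
)
mean_a = sum(d["a"][0] for d in draws) / len(draws)
```

Updating the a, b and c blocks separately lets each block have its own proposal scale, which matters because the three groups of parameters live on very different scales.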
To initialize the Metropolis-Hastings algorithm in an area of the parameter space that has
substantial posterior probability, we follow a two-step procedure: first, we estimate a standard
VAR using OLS on our data set, calculate the moving-average representation, and we use
the impulse response functions implied by the VAR as our starting point. More specifically,
we calculate the parameters of our GMA model to best fit the VAR-based impulse response
functions.29 Second, we use these parameters as a starting point for a simplex maximization
routine that then gives us a starting value for the Metropolis-Hastings algorithm.
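The first step, fitting the GMA parameters to a VAR-based impulse response by least squares, can be illustrated for a single Gaussian basis function. The sketch below is an assumption-laden stand-in: it replaces the numerical minimization with a coarse grid search over b and c (the amplitude a then has a closed form), and the "VAR-based" response is simulated from known values so that the fit can be checked.

```python
import math

def fit_one_gaussian(target, b_grid, c_grid):
    """Least-squares fit of a * exp(-((k - b) / c)**2) to target[k], k = 0..K.

    For each (b, c) on the grid the best amplitude has a closed form,
    a = sum(g * y) / sum(g * g), so only b and c are searched.
    """
    best = None
    for b in b_grid:
        for c in c_grid:
            g = [math.exp(-((k - b) / c) ** 2) for k in range(len(target))]
            gg = sum(x * x for x in g)
            a = sum(x * y for x, y in zip(g, target)) / gg
            ssr = sum((y - a * x) ** 2 for x, y in zip(g, target))
            if best is None or ssr < best[0]:
                best = (ssr, a, b, c)
    return best  # (sum of squared residuals, a, b, c)

# Hypothetical "VAR-based" response generated from a known Gaussian.
true_a, true_b, true_c = 0.2, 8.0, 5.0
var_irf = [true_a * math.exp(-((k - true_b) / true_c) ** 2) for k in range(26)]

grid_b = [0.5 * i for i in range(0, 41)]  # 0, 0.5, ..., 20
grid_c = [0.5 * i for i in range(1, 41)]  # 0.5, 1, ..., 20
ssr, a0, b0, c0 = fit_one_gaussian(var_irf, grid_b, grid_c)
```

The values `(a0, b0, c0)` would then serve as the starting point for the second-step maximization routine.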

3.4 Prior elicitation

We use (loose) Normal priors centered around the impulse response functions obtained from
the benchmark (linear) VAR. Specifically, we put priors on the a, b and c coefficients that are
centered on the values for a, b and c obtained by matching the impulse responses obtained
from the VAR, as described in the previous paragraph.
Specifically, denote a0ij,n , b0ij,n and c0ij,n , n ∈ {1, ..., N }, the values implied by fitting the
GMA(N) to the VAR-based impulse response of variable i to shock j. The priors for aij,n ,
bij,n and cij,n are centered on a0ij,n , b0ij,n and c0ij,n , and the corresponding standard-deviations
are set as follows: σij,a = 10, σij,b = K and σij,c = K (recall that K is the length of the
moving-average).30 While there is clearly some arbitrariness in choosing the tightness of our
priors, it is important to note that they are sufficiently loose to let us explore a large class
of alternative specifications.31 More generally, the use of informative priors is not critical for our approach, and we could have used improper uniform priors, but the use of proper priors allows us to compute posterior odds ratios, which are important to select the order of the moving-average and to compare different GMA models.

29 Specifically, we set the parameters of our model (the a, b and c coefficients) to minimize the discrepancy (sum of squared residuals) between the two sets of impulse responses.
30 Going back to our intuitive interpretation of the three parameters of a Gaussian basis function in Section 2, note that these priors are very loose. This is easy to see for a and b. For c, recall that $c\sqrt{\ln 2}$ is the half-life of the effect of a shock. If c = K, this already corresponds to very persistent impulse response functions, since $K\sqrt{\ln 2} = 38$ quarters.
31 For our monetary policy application, we verified that the prior did not influence our conclusions by using uninformative priors: We estimated both the asymmetric GMA model and the asymmetric and state dependent GMA model with improper flat priors, and we obtained very similar results.
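The prior described in this subsection can be written down in a few lines. The sketch below assumes independent Normal priors across parameters (the text gives only the centers and standard deviations), and the center values are hypothetical placeholders for the VAR-implied fit.

```python
import math

def normal_logpdf(x, mean, sd):
    """Log density of a Normal(mean, sd**2) distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2

def gma_log_prior(params, centers, K):
    """Sum of independent Normal log densities for the (a, b, c) triples.

    params, centers -- lists of (a, b, c) tuples, one per Gaussian basis function;
                       centers would come from the fit to the VAR-based responses
    K               -- moving-average length; prior sd is 10 for a and K for b and c
    """
    total = 0.0
    for (a, b, c), (a0, b0, c0) in zip(params, centers):
        total += normal_logpdf(a, a0, 10.0)
        total += normal_logpdf(b, b0, float(K))
        total += normal_logpdf(c, c0, float(K))
    return total

def half_life(c):
    """Distance from the peak at which a Gaussian basis function halves: c * sqrt(ln 2)."""
    return c * math.sqrt(math.log(2.0))

K = 45                          # moving-average length used in the monetary application
centers = [(0.2, 8.0, 5.0)]     # hypothetical VAR-implied values
lp_at_center = gma_log_prior(centers, centers, K)
lp_far = gma_log_prior([(5.0, 20.0, 15.0)], centers, K)
```

Even a draw far from the center, such as `(5.0, 20.0, 15.0)`, loses little prior density relative to the center, which is one way to see how loose these priors are.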

3.5 Choosing N, the number of Gaussian basis functions

To choose N , the order of the GMA model, we use posterior odds ratios (assigning equal
probability to any two models) to compare models with increasing number of mixtures. We
select the model with the highest posterior odds ratio.32
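With equal prior model probabilities, this selection rule reduces to comparing marginal data densities. A minimal sketch, using made-up log marginal data densities (not values from the paper) and a hypothetical helper name:

```python
import math

def posterior_model_probabilities(log_mdd):
    """Posterior model probabilities under equal prior model probabilities.

    log_mdd -- dict mapping a model label (here N) to its log marginal data density
    """
    top = max(log_mdd.values())
    # Subtract the maximum before exponentiating, for numerical stability.
    weights = {m: math.exp(v - top) for m, v in log_mdd.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

# Made-up log marginal data densities for GMA(1), GMA(2) and GMA(3).
log_mdd = {1: -512.0, 2: -505.0, 3: -507.5}
probs = posterior_model_probabilities(log_mdd)
best_N = max(probs, key=probs.get)
```

In this toy example `best_N` is 2, the model with the largest log marginal data density.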

3.6 Fundamentalness

In a linear moving average model, different representations (i.e., different sets of coefficients and innovation variances) can exhibit the same first two moments, so that with Gaussian-distributed innovations, the likelihood can display multiple peaks, and the moving average model is inherently underidentified. Since a GMA model works directly with the moving-average representation, it cannot distinguish between invertible (also called “fundamental”) and non-invertible representations. By using the VAR-based impulse responses as starting values, we implicitly focus on the invertible part of the parameter space.33,34

32 This approach can be seen as analogous to the choice of the lag length in VAR models. While the Wold theorem shows that any covariance-stationary series can be written as a VAR(∞), one must select a finite lag order p that reasonably approximates the VAR(∞) (e.g., Canova, 2007). The usual approach is to use information criteria such as AIC and BIC, which is similar to our present approach. Just as in the case of lag length choice in a VAR (where this is rarely, if ever, done), we could alternatively treat N as a discrete parameter. We choose to use one value for N at a time to highlight how different choices for N affect estimated impulse responses.
33 Since a VAR is obtained by inverting the fundamental moving-average representation, it automatically selects the fundamental representation (e.g., Lippi and Reichlin, 1994).
34 An alternative estimation procedure to handle both invertible and non-invertible representations would be to use the Kalman filter with priors on the K initial values of the shocks {ε−K ...ε0 }, as recently proposed by Plagborg-Moller (2016). However, unlike our proposed approach, this procedure would be difficult to implement in nonlinear models. Note also that the non-uniqueness of the moving average representation was proven for linear models (under Gaussian shocks). When we consider nonlinearities, the non-uniqueness of the moving-average representation is not guaranteed anymore, and identification may be easier. In practice (and in Monte-Carlo simulations), the likelihood did not display multiple peaks when we allowed for asymmetry or state-dependence.

3.7 Dealing with non-stationary data

As can be seen from Theorem 1, GMA models can only capture impulse response functions
that are bounded and integrable, which restricts our approach to stationary series. If the
data are non-stationary, we can (i) allow for a deterministic trend in equation (1) and/or (ii)
first-difference the data, and then proceed exactly as described above.
If a deterministic trend is suspected, we allow for a polynomial trend in each series, and
we jointly estimate the parameters of the impulse responses (the Ψk coefficients) and the
polynomial parameters.
If a stochastic trend is suspected, we can transform the data into stationary series by
differencing the data. Importantly, the presence of co-integration does not imply that a GMA
model in first-difference is misspecified.35 After estimation, one can even test for co-integration by testing whether the matrix sum of moving-average coefficients ($\sum_{k=1}^{K} \sum_{l=0}^{k} \Psi_l$) is of reduced rank (Engle and Yoo, 1987).

4 Gaussian Mixture Approximations of nonlinear models

We now generalize the moving average model (1) by allowing for asymmetry and state-dependence, and we show how GMA models can easily accommodate such nonlinearities.

4.1 A nonlinear moving-average model

In this section, we generalize model (1) by allowing the economy to respond nonlinearly to shocks, and we consider the model

$$y_t = \sum_{k=0}^{K} \Psi_k(\varepsilon_{t-k}, z_{t-k})\, \varepsilon_{t-k} \qquad (6)$$

where εt is again the vector of structural innovations with Eεt = 0 and Eεt ε′t = I, and zt is a vector of stationary macroeconomic variables that can be a function of past variables of yt or a function of variables exogenous to yt . As a normalization, we posit that Ψ0 has positive entries on the diagonal, i.e., Ψ0,ℓℓ (εt , zt ) ≥ 0, ∀ℓ ∈ {1, .., L}, ∀t ∈ {1, .., T }.

35 The reason is that a GMA model directly works with the moving-average representation and does not require inversion of the moving-average, unlike VAR models.
Model (6) is a nonlinear vector moving average representation of the economy, because in
contrast to (1), the matrix of lag coefficients Ψk (εt−k , zt−k ) is no longer constant. Instead,
the coefficients of matrix Ψk can depend on the values of the structural innovations εt−k and
on the values of the macroeconomic variables in zt−k .
With Ψk a function of εt−k , the impulse response functions to a given structural shock
depend on the value of the shock at the time of shock. For instance, a positive shock may
trigger a different impulse response than a negative shock.
With Ψk a function of zt−k , the impulse response functions to a structural shock depend
on the value of the macroeconomic variables in z at the time of that shock. For instance, the
response function may be different depending on the state of the business cycle (recession or
expansion) at the time of the shock.
Because of its nonlinear nature, (6) does not admit a VAR representation, and the model cannot be recovered from a VAR.36 Instead, our GMA approach directly works with the moving-average representation and can easily accommodate nonlinearities. Moreover, the parametrization offered by Gaussian mixture approximations can ensure that the dimensionality of the problem remains reasonable. We now discuss in more detail two cases of nonlinear behavior that a GMA model can easily handle: (i) asymmetry and (ii) state-dependence.

36 Regime-switching VAR models can capture certain types of nonlinearities such as state dependence (whereby the value of some state variable affects the impulse response functions), but they cannot capture asymmetric effects of shocks (whereby the impulse response to a structural shock depends on the sign of that shock). With regime-switching VAR models, it is assumed that the economy can be in a finite number of regimes, and that each regime corresponds to a different set of VAR coefficients. However, if the true data generating process features asymmetric impulse responses, a new set of VAR coefficients would be necessary each period, because the (nonlinear) behavior of the economy at any point in time depends on all structural shocks up to that point. As a result, such an asymmetric data generating process cannot generally be approximated by a small number of state variables such as in threshold VARs or Markov-switching models.

4.1.1 Asymmetric effects of shocks

To allow for asymmetries, we let Ψk depend on the sign of the structural shock, i.e., we let Ψk take two possible values: $\Psi_k^+$ or $\Psi_k^-$. Specifically, a model that allows for asymmetric effects of shocks would be

$$y_t = \sum_{k=0}^{K} \left[ \Psi_k^+ \left(\varepsilon_{t-k} \odot 1_{\varepsilon_{t-k}>0}\right) + \Psi_k^- \left(\varepsilon_{t-k} \odot 1_{\varepsilon_{t-k}<0}\right) \right] \qquad (7)$$

with $\Psi_k^+$ and $\Psi_k^-$ the lag matrices of coefficients for, respectively, positive and negative shocks and ⊙ denoting element-wise multiplication.
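To make the asymmetric moving average concrete, the sketch below simulates a univariate version of model (7). It is an illustration only: the scalar setting, the function name, the coefficient values and the treatment of pre-sample shocks (set to zero) are all assumptions.

```python
import random

def simulate_asymmetric_ma(psi_plus, psi_minus, T, seed=0):
    """Simulate y_t = sum_k [psi_plus[k] * max(e_{t-k}, 0) + psi_minus[k] * min(e_{t-k}, 0)].

    psi_plus, psi_minus -- lists of length K + 1 (responses to positive / negative shocks)
    Shocks dated before the start of the sample are set to zero (a simplification).
    """
    rng = random.Random(seed)
    K = len(psi_plus) - 1
    eps = [rng.gauss(0.0, 1.0) for _ in range(T)]
    y = []
    for t in range(T):
        total = 0.0
        for k in range(min(K, t) + 1):
            e = eps[t - k]
            # In the scalar case, e * 1{e > 0} = max(e, 0) and e * 1{e < 0} = min(e, 0).
            total += psi_plus[k] * max(e, 0.0) + psi_minus[k] * min(e, 0.0)
        y.append(total)
    return y, eps

# Illustrative coefficients: positive shocks have larger lagged effects than negative ones.
psi_plus = [1.0, 0.8, 0.4]
psi_minus = [1.0, 0.2, 0.1]
y, eps = simulate_asymmetric_ma(psi_plus, psi_minus, T=200)
```

When `psi_plus` and `psi_minus` coincide, the simulation collapses to the linear moving average (1), which is a convenient sanity check.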
Denoting $\psi_{ij}^+(k)$ the row-i, column-j coefficient of $\Psi_k^+$ (that is, the impulse response of variable i to a positive shock j), a GMA(N) model would then be

$$\psi_{ij}^+(k) = \sum_{n=1}^{N} a_{ij,n}^+ \, e^{-\left(\frac{k - b_{ij,n}^+}{c_{ij,n}^+}\right)^2}, \quad \forall k \in (0, K] \qquad (8)$$

with $a_{ij,n}^+$, $b_{ij,n}^+$, $c_{ij,n}^+$ some constants to be estimated. A similar expression would hold for $\psi_{ij}^-(k)$.
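Equation (8) is straightforward to evaluate numerically. The following sketch computes a two-Gaussian impulse response over horizons 1 to 25; the parameter values are illustrative, not estimates.

```python
import math

def gma_irf(k, components):
    """Equation (8): sum over n of a_n * exp(-((k - b_n) / c_n)**2)."""
    return sum(a * math.exp(-((k - b) / c) ** 2) for a, b, c in components)

# Illustrative (not estimated) parameters: a hump peaking at k = 8 plus a small early dip.
components = [(0.20, 8.0, 5.0), (-0.05, 2.0, 1.5)]
horizons = range(1, 26)
irf = [gma_irf(k, components) for k in horizons]
peak_horizon = max(horizons, key=lambda k: gma_irf(k, components))
```

Each triple (a, b, c) maps to a readable feature of the response: a is the height of a hump, b its location and c its width.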

4.1.2 Asymmetric and state-dependent effects of shocks

With asymmetry and state dependence, $\Psi_k^+$ becomes $\Psi_k^+(z_{t-k})$, i.e., the impulse response to a positive shock depends on the indicator vector zt (and similarly for $\Psi_k^-$).
For simplicity, let us consider the case where the vector of indicator variables z is a scalar z. Using a GMA(N) model, the impulse response function following a positive innovation ($\psi_{ij}^+$) can be parametrized as

$$\psi_{ij}^+(k) = \left(1 + \gamma_{ij}^+ z_{t-k}\right) \sum_{n=1}^{N} a_{ij,n}^+ \, e^{-\left(\frac{k - b_{ij,n}^+}{c_{ij,n}^+}\right)^2}, \quad \forall k \in (0, K] \qquad (9)$$

with $\gamma_{ij}^+$, $a_{ij,n}^+$, $b_{ij,n}^+$ and $c_{ij,n}^+$ parameters to be estimated. An identical functional form holds for $\psi_{ij}^-$.

In this model, the amplitude of the impulse response depends on the state of the business
cycle at the time of the shock. In (9), the amplitude of the impulse response is a function
of the indicator variable zt . Such a specification allows us to test whether, for instance, an
expansionary policy has a stronger effect on output in a recession than in an expansion.
Note that in specification (9), the state of the cycle is allowed to stretch/contract the
impulse response, but the shape of the impulse response is fixed (because a, b and c are all
independent of zt ). While one could allow for a more general model in which all variables a, b
and c depend on the indicator variable, specification (9) has two advantages. First, with limited
sample size, it will typically be necessary to impose some structure on the data, and imposing a
constant shape for the impulse response is a natural starting point.37 Second, specification (9)
generalizes trivially to GMAs of any order. The order of the GMA only determines the shape
of the impulse response with higher order allowing for increasingly complex shapes. Then, for
a given shape, the γ coefficient can stretch or contract the impulse response depending on the
state of the cycle.38
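The property that the state variable rescales the response without changing its shape can be checked directly from (9). In the sketch below, the value of γ, the two values of z and the basis-function parameters are hypothetical.

```python
import math

def state_dependent_irf(k, z, gamma, components):
    """Equation (9): (1 + gamma * z) times the Gaussian mixture.

    z is the value of the indicator variable at the time the shock hits.
    """
    shape = sum(a * math.exp(-((k - b) / c) ** 2) for a, b, c in components)
    return (1.0 + gamma * z) * shape

components = [(0.20, 8.0, 5.0)]   # illustrative values, not estimates
gamma = 0.5                        # hypothetical state-dependence coefficient
horizons = range(1, 26)

high_z = [state_dependent_irf(k, 1.0, gamma, components) for k in horizons]
low_z = [state_dependent_irf(k, -1.0, gamma, components) for k in horizons]

# The ratio is the same at every horizon: z rescales the response, the shape is unchanged.
ratios = [h / l for h, l in zip(high_z, low_z)]
```

With these numbers every element of `ratios` equals (1 + 0.5) / (1 - 0.5) = 3, so the peak date and persistence are identical across the two states.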

4.2 Bayesian estimation of nonlinear GMA models

The Bayesian estimation of nonlinear GMA models proceeds similarly to linear GMA models,
but the construction of the likelihood involves one additional complication that we briefly
mention here and describe in detail in the Appendix.
37 Importantly, this assumption is easy to relax or to evaluate by model comparison using posterior odds ratios.
38 Note the parallel and difference between (9) and a varying coefficient model. A varying coefficient model (e.g., Hastie and Tibshirani, 1993) is a (locally) linear model, whose coefficients are allowed to vary smoothly with some third variable zt . In (9), the use of a finite sum of Gaussian basis functions (independent of zt ) plays a similar role to smoothness in varying coefficient models by restricting the shape of the impulse response and disciplining the estimates. Then, the effect of the third variable zt is captured by letting the scale of the impulse response be a linear function of zt .

The additional complication comes from the fact that one must make sure that the system Ψ0 (εt , zt )εt = ut has a unique solution vector εt given a set of model parameters and given some vector ut . With the contemporaneous impact matrix Ψ0 a function of εt , a unique solution is a priori not guaranteed. However, we show in the Appendix that there is a unique solution when we allow the identified shocks to have asymmetric and/or state dependent effects in (i) the (full or partial) recursive identification scheme, (ii) the narrative identification scheme, and (iii) the sign-restriction identification scheme under the restriction that $\mathrm{sgn}(\det \Psi_0^+) = \mathrm{sgn}(\det \Psi_0^-)$.
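The uniqueness issue is easiest to see in a scalar example, sketched below. This is only an illustration of the idea, not the argument given in the Appendix: with a single shock and impact coefficients that are both positive, the sign of the shock must equal the sign of u, so exactly one branch is consistent. Function names and numbers are assumptions.

```python
def recover_shock(u, psi0_plus, psi0_minus):
    """Solve psi0(eps) * eps = u for eps when psi0 depends on the sign of eps.

    Scalar illustration only. With psi0_plus > 0 and psi0_minus > 0, the sign of
    eps must equal the sign of u, so exactly one branch is consistent.
    """
    if psi0_plus <= 0 or psi0_minus <= 0:
        raise ValueError("both impact coefficients must be positive")
    return u / psi0_plus if u >= 0 else u / psi0_minus

def impact(eps, psi0_plus, psi0_minus):
    """Forward map: the contemporaneous effect of a shock eps."""
    return (psi0_plus if eps >= 0 else psi0_minus) * eps

eps_pos = recover_shock(0.6, 2.0, 0.5)    # uses the positive branch
eps_neg = recover_shock(-0.6, 2.0, 0.5)   # uses the negative branch
```

If the two impact coefficients had opposite signs, both branches could map to the same u and the shock could not be recovered uniquely, which is the scalar analogue of the determinant condition in case (iii).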

Compared to the linear case, the nonlinear models require some initial values and prior
distribution for the parameters controlling the nonlinearities. As initial guesses, we set the
parameters capturing asymmetry and state dependence to zero (i.e., no nonlinearity).39 This
approach is consistent with the starting point of this paper: structural shocks have linear
effects on the economy, and we are testing this hypothesis against the alternative that shocks
have some nonlinear effects. We then center the priors for these parameters at zero with flat
(but proper) priors.

5 Monte Carlo simulations

In this section, we conduct a number of Monte-Carlo simulations to illustrate the working of GMA models as well as to evaluate their performance in finite samples. We first evaluate the performance of GMA models in the linear case, and we then evaluate the ability of GMA models to detect (i) asymmetry alone and (ii) asymmetry and state-dependence.
Importantly, in all our Monte Carlo exercises, the estimated GMA models will be misspecified and only approximate the true Data Generating Process (DGP). We follow this strategy for two reasons. First, we want to be conservative and stack the odds against our proposed method. Second, this strategy is consistent with the idea that a GMA is meant to approximate the true DGP. By focusing on the approximate shape of the impulse response and thereby economizing on degrees of freedom, a GMA may (i) provide better estimates of the impulse responses in short samples (a classical example of the bias-variance trade-off), and (ii) be able to detect nonlinearities. One goal of these simulation exercises is to evaluate whether this can indeed be the case.

39 An alternative would be to obtain initial estimates about possible nonlinear effects. One option could be to combine Jorda’s (2005) local projection method (which can accommodate nonlinearities) with the structural shocks recovered from the VAR in order to get first estimates of the nonlinear impulse responses.
To simulate data, we proceed as follows. We first estimate a structural VAR on US data (using a recursive identification scheme), invert it to obtain a set of impulse responses $\{\hat{\Psi}_k\}_{k=0}^{\infty}$, and we modify these baseline impulse responses to introduce nonlinearities, in particular asymmetry or state dependence. From these impulse responses, we generate simulated data from

$$y_t = \sum_{k=0}^{\infty} \hat{\Psi}_k(\varepsilon_{t-k}, z_{t-k})\, \varepsilon_{t-k} \qquad (10)$$

with εt Normally distributed, Eεt = 0 and Eεt ε′t = I.
In each scenario, we use 50 Monte-Carlo replications with a sample size T = 200, which
roughly corresponds to the sample size available for the US.

5.1 Linear model

Our first simulation is meant to illustrate the workings of Gaussian mixture approximations
in the linear case. Our goal is not to claim that GMAs are superior to VARs but instead to
convey that GMAs can provide a useful alternative approach, especially in short samples.
The DGP is obtained from estimating the quarterly VAR(4) considered previously with
the unemployment rate, the PCE inflation rate and the federal funds rate over 1959-2007. The
impulse response functions to a monetary shock can be seen in Figure 1.
For each simulated dataset, we estimate (i) a GMA(2), and (ii) a VAR(4), and we evaluate
the Mean-Square Error (MSE) of the estimated impulse response function over the horizons
k = 1...25.40 Importantly, we stack the odds in favor of the VAR and against the GMA model,
because the estimated VAR is a correctly specified model.
The first row of Table 1 presents the average MSEs over the simulations. For unemployment and inflation, the GMA(2) is respectively 25 percent and 50 percent more accurate on average than the VAR. For the fed funds rate, the MSE is small in both cases, but again with a slight advantage for the GMA.41 Table 1 also presents the average length and coverage rate of the confidence bands capturing the 95 percent posterior probability and compares them with the confidence bands implied by a Bayesian VAR with loose, but proper, Normal-Wishart priors. We report the average length and coverage rate at the time of the peak effect of the shock on the variable of interest. We can see that the average lengths are smaller for the GMA than for the VAR, while the coverage rate of the GMA remains good.

40 Specifically, we report $MSE = \sum_{k=1}^{25} (\hat{\psi}(k) - \psi(k))^2$, where $\hat{\psi}$ is the estimated impulse response function and ψ is the true function.
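The accuracy criterion of footnote 40 can be computed as in the sketch below, which follows the footnote literally (a sum of squared deviations over horizons 1 to 25) and then averages over replications. The impulse responses used here are made-up numbers for illustration.

```python
def irf_mse(estimated, true):
    """Footnote 40's criterion: sum over k = 1..25 of squared deviations.

    estimated, true -- sequences indexed by horizon, with element 0 holding k = 0
    """
    return sum((estimated[k] - true[k]) ** 2 for k in range(1, 26))

def average_mse(estimates, true):
    """Average the criterion over Monte-Carlo replications."""
    return sum(irf_mse(e, true) for e in estimates) / len(estimates)

# Toy check with made-up responses: a constant error of 0.1 at each of 25 horizons.
true_irf = [0.0] * 26
replications = [[0.1] * 26, [-0.1] * 26]
avg = average_mse(replications, true_irf)
```

In this toy case each replication contributes 25 × 0.01, so `avg` is 0.25.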

5.2 Nonlinear models

We now evaluate the performances of GMA models in detecting nonlinearities. For the DGP,
we start from a VAR with (log) GDP, inflation and the fed funds rate, where we detrend
GDP with a quadratic trend. Although we could have used the same VAR as previously, we
preferred this one, because the price puzzle is more substantial in this specification (Figure
4), so that the Monte-Carlo exercise will be a more stringent test on a GMA(1) model that
cannot capture the oscillating pattern in inflation. Again, the goal of the exercise is to assess
whether a GMA model that only approximates the main feature of the impulse responses can
still recover nonlinearities.

Asymmetry
We first consider a DGP where the impulse response functions to monetary shocks depend on the sign of the shock. To introduce asymmetry, we modify the impulse responses $\{\hat{\Psi}_k\}_{k=0}^{\infty}$ to make them depend on the sign of the monetary shock, and Figure 4 plots the asymmetric
impulse response functions. For realism, the level of asymmetry that we simulate is chosen to
roughly match the magnitude of the asymmetry we later find in US data. Note that we do
not impose asymmetry for the response of the fed funds rate. This is done to test whether our
procedure incorrectly reports the existence of asymmetry when there is none.
41 Intuitively, the reason for the superior performance of the GMA is that the VAR often shows counterfactual oscillation patterns. In contrast, the GMA(2) is disciplined by its stricter parametrization.

We estimate a GMA(1) with asymmetry on each set of simulated data, and Table 2 presents
summary statistics for a+ − a− , which captures the amount of peak asymmetry for each one
of the three variables in the model.
A number of results emerge. First, as shown by the frequency of rejection of zero coefficient
for a+ − a− , the algorithm can detect asymmetry when it exists (case of output and inflation,
first row of Table 2), even when the impulse response is not generated by one Gaussian, and
even when, as with inflation, there is a strong oscillating pattern that cannot be captured
by a one-Gaussian approximation.42 This is encouraging, because it supports our motivating
idea that by approximating the most important feature of an impulse response, one can detect
important nonlinearities. Moreover, the algorithm does not detect asymmetry when there is
none (case of the fed funds rate). Second, looking at the mean and standard-deviation of
the estimates across Monte-Carlo replications (second row of Table 2), we can see that the
algorithm under-estimates the amount of asymmetry (both for output and inflation). This
indicates that in our empirical application on US data, our algorithm may under-estimate
the magnitude of asymmetry present in the data. Third, the dispersion (third row) in the
estimates across the Monte-Carlo replications is reasonably small, while the coverage rate of
the posterior distribution (the frequency with which the true value lies within 90 percent of
the posterior distribution) is also good (fourth row).
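The two summary statistics discussed above, the frequency with which the posterior interval excludes zero and the coverage rate, can be computed from posterior draws as sketched below. The draws are simulated stand-ins rather than output of the estimation, and the equal-tailed 90 percent interval is an assumption about how the interval is formed.

```python
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def summarize(replications, true_value, level=90.0):
    """Share of replications whose interval excludes zero, and share that cover the truth."""
    tail = (100.0 - level) / 2.0
    reject = 0
    cover = 0
    for draws in replications:
        s = sorted(draws)
        low, high = percentile(s, tail), percentile(s, 100.0 - tail)
        if not (low <= 0.0 <= high):
            reject += 1
        if low <= true_value <= high:
            cover += 1
    n = len(replications)
    return reject / n, cover / n

# Simulated stand-in for posterior draws of a+ - a- in 50 replications.
rng = random.Random(1)
true_gap = 0.1
fake_posteriors = []
for _ in range(50):
    center = rng.gauss(true_gap, 0.03)
    fake_posteriors.append([rng.gauss(center, 0.03) for _ in range(500)])
rejection_rate, coverage = summarize(fake_posteriors, true_gap)
```

A high rejection rate when the true gap is nonzero corresponds to the first row of the table, and the coverage share to the fourth row.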

Asymmetry and state dependence
We now consider a DGP where the impulse response functions to monetary shocks depend on the sign of the shock as well as the state of the business cycle. We introduce asymmetry exactly as in the previous exercise, but in addition, we posit that there is state dependence for output in response to a positive shock, i.e., $\gamma^+_{gdp} \neq 0$ in (9), where the indicator variable zt is the US unemployment rate.43 Again, the value of $\gamma^+_{gdp}$ is chosen to be of the same order of magnitude as our later empirical findings with US data, and we set $\gamma^+_{gdp} = 1$.

42 Specifically, the 90 percent posterior probability of a+ − a− excludes zero for output and inflation respectively 94 and 90 percent of the time.
43 We could have used any indicator, but we wanted an indicator that has the same time series properties as the one we use on US data. We thus chose to use the US unemployment rate, which is the indicator we used in the application section.

We estimate a GMA(1) with asymmetry and state dependence on each set of simulated data, and Table 3 summarizes the results. A number of results emerge. First, the algorithm is very successful at detecting state dependence in output and the fact that $\gamma^+_{gdp} \neq \gamma^-_{gdp}$ (first set of columns in Table 3). In the 50 Monte-Carlo replications, we detect $\gamma^+_{gdp} \neq \gamma^-_{gdp}$ in all samples but one (first row). The algorithm also estimates the values of $\gamma^+_{gdp} - \gamma^-_{gdp}$ without bias (second row), with reasonable dispersion (third row) and with good coverage (fourth row). Importantly, the algorithm detects no state dependence when there is none (case of inflation), as can be seen from the close to zero frequency of rejection of zero coefficient. Second, the algorithm can still pick up the existence of asymmetry for output and inflation (a+ − a− ≠ 0, second set of columns). With a larger number of free parameters, estimation is more uncertain, but we can still detect the existence of asymmetry in more than 80 percent of cases. Finally, looking at the estimates for $\gamma^+_{gdp}$ and $\gamma^-_{gdp}$ separately, the algorithm estimates the value of $\gamma^+_{gdp}$ (the magnitude of the nonlinearity) with a downward bias, which seems to translate into an upward bias for $\gamma^-_{gdp}$, although that bias is not significant over the 50 Monte-Carlo replications (last four columns of Table 3).

6 The nonlinear effects of monetary shocks

In this section, we apply our proposed GMA approach and study the nonlinear effects of monetary shocks. We consider a model of the US economy in the spirit of Primiceri (2005), where yt includes the unemployment rate, the PCE inflation rate and the federal funds rate. As in Primiceri (2005), monetary policy affects the economy with a lag, and the matrix Ψ0 has its last column filled with 0 except for the diagonal coefficient. The data cover 1959Q1 to 2007Q4, and we exclude the latest recession where the fed funds rate was constrained at zero and no longer captured variations in the stance of monetary policy.44 When constructing the likelihood, we consider a moving-average model with K = 45, chosen to be large enough such that the lag matrix coefficients Ψk are close enough to zero for k > K.45 For GMA models, we leave the non-zero coefficients of the contemporaneous impact matrix Ψ0 as free parameters.

44 While we use quarterly data as in Primiceri (2005), we also conducted our estimation using monthly data. Results were very similar.
As a preliminary test, we start by checking that a linear GMA model performs well against a
standard VAR model. Then, we present the nonlinear impulse response functions obtained from
a nonlinear GMA with asymmetry alone first, and then with asymmetry and state dependence.

6.1 The linear case: VAR versus GMA

First, we evaluate our GMA approach by doing a simple model comparison between a linear
GMA(1) and a regular VAR with 4 lags.
Table 4 reports the (log) marginal data densities for the GMA and the VAR, so that a
model comparison can be readily obtained by computing the Bayes factor (obtained by taking
the exponential of the difference in (log) marginal data densities) after positing equal priors
for the two competing models. Encouragingly for our approach, Bayesian model comparison
favors the more parsimonious GMA(1) with a Bayes factor of about 400.

6.2 The asymmetric effects of monetary shocks

We now estimate an asymmetric GMA model in which the impulse responses to monetary
shocks depend on the sign of the shock. As detailed in the methodology section, to choose
the appropriate order of the GMA model, we consider models with an increasing number
of Gaussian basis functions. As shown in columns (3) to (5) of Table 4, Bayesian model
comparison favors a GMA(2), and from now on we will report and discuss the results obtained
using a GMA(2).
45 As a robustness check, we consider a higher moving-average lag-length with K = 55. Results were identical.

We can see that Bayesian model comparison strongly favors a model with asymmetry in the impulse responses to monetary shocks: the (log) marginal data density of an asymmetric GMA(2) is respectively 20 log-points larger than the linear (symmetric) GMA model and 25 log-points larger than the VAR model, which imply Bayes factors of respectively about $10^{8}$ and $10^{11}$.
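As a check on the orders of magnitude, recall that with equal prior model probabilities the Bayes factor is the exponential of the difference in log marginal data densities. The notation below ($BF_{12}$, $M_1$, $M_2$, $Y$) is introduced here for illustration, and the numerical values are rounded:

```latex
\[
BF_{12} = \exp\bigl(\log p(Y \mid M_1) - \log p(Y \mid M_2)\bigr), \qquad
e^{20} \approx 4.9 \times 10^{8}, \qquad
e^{25} \approx 7.2 \times 10^{10}.
\]
```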
Figure 5 plots the impulse responses (in percentage points) of unemployment, the price
level and the federal funds rate to a one standard-deviation monetary shock. The thick lines
denote the impulse response functions implied by the posterior mode, and the error bands are
the 5th and 95th posterior percentiles.46 When comparing impulse responses to positive and
negative shocks, it is important to keep in mind that the impulse responses to expansionary
monetary shocks (a decrease in the fed funds rate) were multiplied by -1 in order to ease
comparison across impulse responses. With this convention, when there is no asymmetry, the
impulse responses are identical in the upper panels (responses to a contractionary monetary
shock) and in the bottom panels (responses to an expansionary monetary shock).
The evidence for asymmetry is striking: following a contractionary monetary shock, which
represents a 70 basis points increase in the fed funds rate, unemployment increases by about
0.15 percentage points (ppt), whereas a (linear) VAR implies only a 0.10 ppt increase. In
contrast, following an expansionary monetary shock (a 70 basis points decrease in the fed
funds rate), the response of unemployment is small (a decline of 0.04 percentage points) and
non-significantly different from zero. Figure 6 plots the posterior distribution of the difference
in impulse responses between positive and negative shocks. This figure can be seen as a pointwise test of difference in impulse responses at different horizons. The 90 percent posterior
interval of the difference in impulse responses of unemployment is substantially above zero for
horizons 3 to 10, in line with the conclusion from the Bayes factors that the data support a
model with asymmetric impulse responses to monetary shocks.47
Although the error bands are too large to be conclusive, the response of the price level also displays an interesting asymmetric pattern: the price level appears more sticky following a contractionary shock (displaying a larger price puzzle) than following an expansionary shock for which the price level drops on impact and displays no price puzzle. This is exactly the pattern one would expect if downward price (or wage) rigidity was responsible for the asymmetric response of unemployment.48

46 To be specific, this figure and subsequent figures show paths of the moving average coefficients ψk .
47 In the case of the GMA(1) model, an alternative test for asymmetry is a Wald-type test on a+ − a− . This test (not shown) gives a similar conclusion: for unemployment, the 90 percent posterior interval of a+ − a− excludes zero.
We also find asymmetry in the response of the fed funds rate to a monetary shock, but
it is relatively mild. A contractionary monetary shock generates a slightly more persistent increase in the
fed funds rate than its expansionary counterpart. This can be seen in the bottom right panel
of Figure 5 where the response of the fed funds rate is slightly more short-lived following an
expansionary shock.49

Robustness to identification assumptions
To show the robustness of our findings as well as to highlight how GMAs can accommodate
other identification schemes, we now present asymmetric impulse response functions obtained
with two alternative identification schemes: (i) a narrative approach, and (ii) sign restrictions.

Narrative approach We first evaluate the presence of asymmetry using monetary shocks
identified through the narrative approach by Romer and Romer (2004) and extended until 2007
by Coibion et al. (2012). As pointed out by Coibion (2012), the advantage of the narrative
procedure is that one should be able to more precisely identify the effects of monetary shocks
than with a relatively small model like the one considered above, since the Romer and Romer
measure controls for much of the endogenous fluctuations in the interest rate as well as the
Fed’s information set.
We estimate an asymmetric GMA(2) model with 4 variables included in the following order:
48 The existence of downward wage rigidity is supported empirically by the scarcity of nominal wage cuts relative to nominal wage increases (e.g., Card and Hyslop, 1997).
49 One way to gauge how much of the asymmetric response of unemployment can be explained by the asymmetric response of the fed funds rate is to proceed as in the government spending multiplier literature (e.g., Ramey and Zubairy, 2014) and to compute the total change in unemployment relative to the total change in the fed funds rate, that is, to compute the multiplier $m = \sum_{k=0}^{K} \psi_k^{U} / \sum_{k=0}^{K} \psi_k^{ffr}$ for positive and negative shocks, respectively. After “controlling” for the total change in the fed funds rate, the asymmetry is still present, with $m^+ = .24 > m^- = .12$, where $m^+$ is the multiplier associated with a contractionary shock (an increase in the fed funds rate) and $m^-$ the multiplier associated with an expansionary shock.

the Romer and Romer shocks, unemployment, inflation and the fed funds rate, and we posit
that the contemporaneous matrix Ψ0 has its first row filled with 0 except for the diagonal coefficient, which implies that the narratively identified shock does not react contemporaneously
to other shocks. This restriction is innocuous if the narrative shocks were correctly identified.
Figure 7 plots the asymmetric impulse responses to an innovation to the Romer and Romer
shocks. Confirming our previous results, unemployment displays a very asymmetric response:
there is no significant movement in unemployment following an expansionary shock, but there
is a large increase following a contractionary shock.
Sign restrictions We also evaluate the presence of asymmetry using monetary shocks identified through sign restrictions. We posit that monetary shocks are the only shocks that raise the
fed funds rate and lower inflation. We use a GMA(1) specification, so that the sign restrictions
for inflation and the fed funds rate are imposed over the whole horizon.50 As an initial guess in our
optimization routine, we use the structural impulse responses implied by a Cholesky ordering,
and we use flat priors with a ∈ [−10, 10] (as well as for the intercepts and the coefficients of
Ψ0 ), b ∈ [0, K] and c ∈ [0, K].51
Figure 8 plots the asymmetric impulse responses to a monetary shock. Again, the evidence
for asymmetry is very strong: while a contractionary shock raises unemployment significantly,
an expansionary shock generates a much smaller (and non-significant) change in unemployment. Interestingly, the response of the price level is also strongly asymmetric with a strong
price response following an expansionary shock, but only a weak response following a contractionary shock.52 In other words, following a contractionary shock, quantities react, while
following an expansionary shock, prices react. This asymmetry is consistent with downward
price (or wage) rigidity playing a role in the asymmetric response of unemployment.
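
To make the role of the parameters a, b and c concrete, the following sketch evaluates a single Gaussian basis function and the half-life formula referred to in footnote 51. It assumes the basis function takes the form a·exp(−((k − b)/c)²), which is the form consistent with a half-life of c√(ln 2); the helper names and the parameter values are hypothetical.

```python
import numpy as np

def gaussian_basis_irf(k, a, b, c):
    """One Gaussian basis function, assumed here to be a * exp(-((k - b) / c)**2)."""
    k = np.asarray(k, dtype=float)
    return a * np.exp(-((k - b) / c) ** 2)

def half_life(c):
    """Distance from the peak (at k = b) at which the response is half its peak value."""
    return c * np.sqrt(np.log(2.0))

K = 45                                  # horizon used in the monetary application
horizons = np.arange(K + 1)

# Hypothetical parameter values, drawn from inside the flat prior ranges.
psi = gaussian_basis_irf(horizons, a=0.2, b=8.0, c=6.0)

# With one Gaussian, the response has the sign of a at every horizon, which is
# why a GMA(1) imposes a sign restriction over the whole horizon.
same_sign_everywhere = bool(np.all(psi > 0))

# Upper bound on the half-life implied by c <= K.
max_half_life = half_life(K)
print(same_sign_everywhere, max_half_life)
```

Under this assumed functional form, the upper bound c = K = 45 gives a half-life of roughly 37.5 quarters, of the same order as the figure quoted in footnote 51.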
50 Other identification schemes are possible, and a GMA(2) would allow us to impose the sign restriction over a specific horizon. We also experimented with imposing the additional restriction that the unemployment increases following a contractionary monetary shock. The estimated impulse responses were similar.
51 The latter prior variance imposes that the effect of a shock can have a half-life as large as $K\sqrt{\ln 2} = 38$ quarters (recall K = 45 in our monetary application), which represents an extremely persistent impulse response.
52 A similar pattern could be seen with the two previous identification schemes, but the asymmetry in the price response is most striking (and highly significant) with sign restrictions.

6.3 The asymmetric and state-dependent effects of monetary shocks

In this section, we enrich our model by allowing the effects of monetary policy to depend
on both the sign of the shock and the state of the business cycle. Intuitively, we would like
to test whether monetary policy is more powerful at stimulating the economy in a period of
economic slack, and whether an expansionary shock is more likely to generate inflation in a
tight labor market. We thus estimate model (9) with a GMA(2), and we use last period’s
unemployment rate as cyclical indicator (zt ).53 To put results into perspective, Figure 9 plots
the unemployment rate (i.e., the indicator variable zt ) along with the identified monetary
shocks.
Table 4 shows that Bayes model comparison strongly favors the model with asymmetry
and state dependence over all the other models.
To visualize the effects of the state of the cycle on the impulse responses, Figure 10 shows
how the peak effect of a monetary shock on unemployment or inflation depends on the state
of the business cycle at the time of the shock.54 The first two rows plot the peak responses of
unemployment and inflation to contractionary and expansionary shocks. The left quadrants
depict how the peak effect of a contractionary shock varies as we move from a tight labor
market (unemployment at 4 percent) to a slack labor market (unemployment at 8 percent),
and the right quadrants plot the same thing for an expansionary shock. The blue lines depict
estimates from our nonlinear GMA model, and the thick dashed line represents the VAR
estimate. Since the VAR is linear, that latter estimate is a horizontal line as the peak effect
of monetary policy is independent of the state of the business cycle. Finally, the last row
of Figure 10 plots histograms of the distributions of respectively contractionary shocks and
expansionary shocks over the business cycle. This information is meant to give a sense of the
53 As an alternative, we also experimented with the unemployment rate detrended with an HP-filter ($\lambda = 10^5$). The latter specification was used to make sure that our results were not driven by slow moving trends (e.g., due to demographics) in the unemployment rate, which could make the unemployment rate a poor indicator of the amount of economic slack (see e.g. Barnichon and Mesters, 2015). We obtained similar results.
54 To be specific, denote ψ(k, z) the value of an impulse response function to a shock ε at horizon k when the indicator variable takes the value z at the time of the shock. Figure 10 plots the function f defined by $f(z) = \mathrm{sgn}(\varepsilon) \max_{k \in [0,K]} |\psi(k, z)|$.

range of unemployment over which we identify the coefficients capturing state dependence.
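
The peak-effect function of footnote 54, f(z) = sgn(ε) max over k of |ψ(k, z)|, can be sketched as follows. The state-dependent impulse response used here is hypothetical: a single Gaussian basis function whose amplitude is assumed to rise linearly with the unemployment rate, with numbers chosen only to echo the magnitudes quoted in the text.

```python
import numpy as np

def peak_effect(psi, shock_sign):
    """f(z) = sgn(eps) * max_k |psi(k, z)| for every state z.

    psi has shape (K + 1, number of states): rows are horizons, columns are
    values of the indicator variable z at the time of the shock.
    """
    psi = np.asarray(psi, dtype=float)
    return shock_sign * np.max(np.abs(psi), axis=0)

horizons = np.arange(46)[:, None]              # k = 0, ..., K with K = 45
z_grid = np.linspace(4.0, 8.0, 5)[None, :]     # unemployment from 4 to 8 percent

# Hypothetical state-dependent amplitude a(z); not the paper's estimates.
amplitude = 0.13 + 0.0125 * (z_grid - 4.0)
psi = amplitude * np.exp(-((horizons - 10.0) / 6.0) ** 2)

f_contractionary = peak_effect(psi, shock_sign=+1)
print(f_contractionary)
```

Plotting `f_contractionary` against the grid of z values would give a curve of the kind shown in the left quadrants of Figure 10, whereas a linear VAR would give a horizontal line.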
We first discuss the response of unemployment. The real effect of a contractionary shock
(top left quadrant) increases with the unemployment rate: in a tight labor market, a (one
standard-deviation) contractionary shock increases unemployment by about 0.13 percentage
point (at the peak effect), but in a slack labor market, the same contractionary shock increases
unemployment by about 0.18 percentage point (at the peak effect). Regarding the real effect of an expansionary shock (top right quadrant), the evidence is not very strong, but our
estimates suggest some mild state dependence going in the same direction: the higher the
unemployment rate, the larger the real effect of an expansionary policy. For instance, the 90 percent
posterior probability bands start to include the VAR point estimate when the unemployment
rate rises above 7 percent. The asymmetry in the real effects of expansionary and contractionary shocks remains, however, and an expansionary shock is always considerably less potent
than its contractionary counterpart.
We now turn to the response of inflation, depicted in the second row of Figure 10. While
there is no evidence of state dependence for contractionary shocks, we find strong evidence
that expansionary shocks generate a substantial rise in inflation when the unemployment rate
is low: with an unemployment rate at 4 percent, an expansionary shock generates a peak
increase in inflation of about 4 basis points (roughly twice as large as implied by the VAR
point estimates). In contrast, with an unemployment rate at 8 percent, an expansionary shock
has no effect on inflation. Interestingly, this finding is consistent with a standard Keynesian
narrative, according to which a monetary authority trying to expand an economy already above
potential would only achieve higher inflation through increased price/wage pressures.

7 Conclusion

This paper proposes a new method to estimate the (possibly nonlinear) dynamic effects of
structural shocks by using Gaussian basis functions to approximate impulse response functions.
We apply our approach to the study of monetary policy and find that the effect of a monetary
intervention depends strongly on the sign of the intervention. A contractionary shock has a
strong adverse effect on output, larger than implied by linear estimates, but an expansionary
shock has, on average, no significant effect on output. Interestingly, and while the evidence for
inflation is more uncertain, the behavior of inflation is consistent with asymmetry emerging (at
least in part) out of downward price/wage rigidities, because inflation displays a more marked
price puzzle following a contractionary shock than following an expansionary shock. Finally,
the effect of a monetary shock also depends on the state of the business cycle at the time of
the intervention: An expansionary shock during a time of low unemployment generates no
significant drop in unemployment but leads to a burst of inflation, consistent with a standard
Keynesian narrative.
Although this paper studies nonlinearities in the effect of monetary policy, Gaussian Mixture Approximations of the impulse responses may be useful in many other contexts, and we
showed how our approach can be used with other identification schemes. Looking forward, our
method could be used to estimate the nonlinear effects of other important shocks where the
existence of asymmetry or state-dependence remains an important and unresolved question,
notably fiscal policy shocks (Auerbach and Gorodnichenko, 2012, Ramey and Zubairy, 2014)
or credit supply shocks (Gilchrist and Zakrajsek, 2012). Moreover, the parametrization offered
by GMA models and the associated efficiency gains may be useful even for linear models when
the sample size is small and/or the data are particularly noisy.


References
[1] Almon, S. ”The distributed lag between capital appropriations and expenditures,” Econometrica, 33, January, 178-196, 1965
[2] Amir Ahmadi P. and H. Uhlig. ”Sign Restrictions in Bayesian FaVARs with an Application
to Monetary Policy Shocks,” NBER Working Paper, 2015
[3] Angrist, J., Jorda O and G. Kuersteiner. ”Semiparametric estimates of monetary policy
effects: string theory revisited,” NBER Working Paper, 2013
[4] Alspach D. and H. Sorenson. “Recursive Bayesian Estimation Using Gaussian Sums,”
Automatica, Vol 7, pp 465-479, 1971.
[5] Alspach D. and H. Sorenson. “Nonlinear Bayesian Estimation Using Gaussian Sum
Approximations,” IEEE Transactions on Automatic Control, Vol 17-4, August 1972.
[6] Auerbach A and Y Gorodnichenko. ”Measuring the Output Responses to Fiscal Policy,”
American Economic Journal: Economic Policy, vol. 4(2), pages 1-27, 2012
[7] Auerbach A and Y Gorodnichenko. “Fiscal Multipliers in Recession and Expansion.” In
Fiscal Policy After the Financial Crisis, edited by Alberto Alesina and Francesco Giavazzi,
pp. 63–98. University of Chicago Press, 2013.
[8] Barnichon R, and G. Mesters, ”On the Demographic Adjustment of Unemployment,”
Working Paper, 2015.
[9] Barnichon R. and C. Matthes. ”Imposing structural identifying restrictions in Gaussian
Mixture Approximation (GMA) models,” Working Paper, 2016
[10] Barro, Robert J., ”Unanticipated Money Growth and Unemployment in the United
States,” American Economic Review, LXVII, 101-15, 1977.


[11] Baumeister, C. and J. Hamilton, ”Sign Restrictions, Structural Vector Autoregressions,
and Useful Prior Information,” Econometrica, 83(5), 1963-1999, 2015.
[12] Beaudry, Paul, and Gary Koop. ”Do Recessions Permanently Change Output?” Journal
of Monetary Economics 31 (1993), 149 - 63.
[13] Blanchard, O. and D. Quah. ”The Dynamic Effects of Aggregate Demand and Supply
Disturbances,” American Economic Review, 79(4), pages 655-73, September 1989.
[14] Buhmann, Martin D, Radial Basis Functions: Theory and Implementations, Cambridge
University Press, 2003.
[15] Canova, F. ”Methods for Applied Macroeconomic Research,” Princeton University Press,
2007.
[16] Canova, F. and G. De Nicolo, ”Monetary Disturbances Matter for Business Fluctuations
in the G-7”, Journal of Monetary Economics, 49, 1131-1159, 2002
[17] Card, D. and D. Hyslop, “Does Inflation Grease the Wheels of the Labor Market?” in
Reducing Inflation: Motivation and Strategy, C. Romer and D. Romer, eds., University
of Chicago Press, 1997.
[18] Casella, G. and R. L. Berger, Statistical Inference, Duxbury, 2002.
[19] Cover, J. ”Asymmetric Effects of Positive and Negative Money-Supply Shocks,” The Quarterly Journal of Economics, Vol. 107, No. 4, pp. 1261-1282, 1992.
[20] Coibion, O. ”Are the Effects of Monetary Policy Shocks Big or Small?,” American Economic Journal: Macroeconomics, vol. 4(2), pages 1-32, April, 2012.
[21] Coibion, O., Y. Gorodnichenko, L. Kueng, and J. Silvia, ”Innocent Bystanders? Monetary
Policy and Inequality in the US,” NBER Working Papers 18170, National Bureau of
Economic Research, Inc, 2012.


[22] Christiano, L., M. Eichenbaum, and C. Evans. ”Monetary policy shocks: What have we
learned and to what end?,” Handbook of Macroeconomics, volume 1, chapter 2, pages
65-148, 1999.
[23] DeLong B and L. Summers, ”How Does Macroeconomic Policy Affect Output?,” Brookings Papers on Economic Activity, Economic Studies Program, The Brookings Institution,
vol. 19(2), pages 433-494, 1988.
[24] Engle, R. and B. Yoo, “Forecasting and testing in co-integrated systems,” Journal of
Econometrics, vol. 35(1), pages 143-159, May 1987.
[25] Faust, J., “The Robustness of Identified VAR Conclusions about Money”, Carnegie-Rochester Series on Public Policy, 49, 207-244, 1998.
[26] Fernandez-Villaverde, J. and J. Rubio-Ramirez, “Comparing dynamic equilibrium models
to data: a Bayesian approach,” Journal of Econometrics, vol. 123(1), pages 153-187,
November, 2004.
[27] Gertler M and P Karadi, ”Monetary Policy Surprises, Credit Costs, and Economic Activity,” American Economic Journal: Macroeconomics, vol. 7(1), pages 44-76, January
2015.
[28] Gilchrist, S and E Zakrajsek, ”Credit Spreads and Business Cycle Fluctuations,” American
Economic Review, American Economic Association, vol. 102(4), pages 1692-1720, June
2012.
[29] Haario, H., E. Saksman, and J. Tamminen, ”An adaptive Metropolis algorithm,” Bernoulli
7, no. 2, 223–242, 2001.
[30] Hamilton, J. “A New Approach to the Economic Analysis of Nonstationary Time Series
and the Business Cycle,” Econometrica 57, 357-384, 1989.


[31] Hastie, T. and Tibshirani, R. “Varying-coefficient models,” J. Roy. Statist. Soc. Ser. B
55, 757–796, 1993.
[32] Hastie, T., R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Springer
2009.
[33] Hubrich, K. and T. Terasvirta. ”Thresholds and smooth transitions in vector autoregressive models,” Advances in Econometrics “VAR Models in Macroeconomics, Financial
Econometrics, and Forecasting”, Vol. 31, 2013.
[34] Jorda O., ”Estimation and Inference of Impulse Responses by Local Projections,” American Economic Review, pages 161-182, March 2005.
[35] Koop G, M. Pesaran and S. Potter “Impulse response analysis in nonlinear multivariate
models,” Journal of Econometrics, 74 119-147, 1996.
[36] Koop G and S. Potter ”Dynamic asymmetries in US unemployment,” Journal of Business
& Economic Statistics Volume 17, Issue 3, 1999.
[37] Korevaar J. Mathematical Methods, Vol. 1, pp 330-333. Academic Press, New York, 1968.
[38] Lippi, M. & Reichlin, L. ”VAR analysis, nonfundamental representations, Blaschke matrices,” Journal of Econometrics 63(1), 307–325, 1994.
[39] Lippi, M., Reichlin, L., ”Diffusion of technical change and the decomposition of output
into trend and cycle,” Review of Economic Studies 61 (1) (206), 19–30, 1994b.
[40] Lo, M. C., and J. Piger ”Is the Response of Output to Monetary Policy Asymmetric?
Evidence from a Regime-Switching Coefficients Model,” Journal of Money, Credit and
Banking, 37(5), 865–86, 2005.
[41] McLachlan, G. and D. Peel. Finite Mixture Models. Wiley Series in Probability and Statistics, 2000.


[42] Morgan, D. ”Asymmetric Effects of Monetary Policy.” Federal Reserve Bank of Kansas
City Economic Review 78, 21-33, 1993.
[43] Plagborg-Moller, P. ”Bayesian Inference on Structural Impulse Response Functions,”
Working Paper, 2016.
[44] Potter, S. “A nonlinear approach to US GNP,” Journal of Applied Econometrics, Vol 10
109-125, 1995
[45] Primiceri, G. ”Time Varying Structural Vector Autoregressions and Monetary Policy,”
Review of Economic Studies, vol. 72(3), pages 821-852, 2005.
[46] Racine, J. ”Nonparametric Econometrics: A Primer,” Foundations and Trends in Econometrics, now publishers, vol. 3(1), pages 1-88, March 2008.
[47] Ravn M and M Sola. ”A Reconsideration of the Empirical Evidence on the Asymmetric Effects of Money-supply shocks: Positive vs. Negative or Big vs. Small,” Archive Discussion
Papers 9606, Birkbeck, 1996.
[48] Ravn M. and M. Sola. ”Asymmetric effects of monetary policy in the United States,”
Review, Federal Reserve Bank of St. Louis, issue Sep, pages 41-60, 2004.
[49] Ramey V. ”Comment on ”Roads to Prosperity or Bridges to Nowhere? Theory and Evidence on the Impact of Public Infrastructure Investment”,” NBER Chapters, in: NBER
Macroeconomics Annual 2012, Volume 27, pages 147-153.
[50] Ramey V. and S. Zubairy. ”Government Spending Multipliers in Good Times and in Bad:
Evidence from U.S. Historical Data ,” Working Paper, 2014.
[51] Robert, Christian P. and George Casella “Monte Carlo Statistical Methods” Springer, 2004.
[52] Romer, C., and D. Romer. “A New Measure of Monetary Shocks: Derivation and Implications,” American Economic Review 94 (4): 1055–84, 2004


[53] Santoro, E, I. Petrella, D. Pfajfar and E. Gaffeo ”Loss Aversion and the Asymmetric
Transmission of Monetary Policy,” Journal of Monetary Economics, 68 19-35, 2014.
[54] Swanson, E. and J. Williams. “Measuring the Effect of the Zero Lower Bound on Medium–
and Longer–Term Interest Rates.” American Economic Review 104 (10): 3154–85, 2014.
[55] Tenreyro, S., and G. Thwaites (2015): “Pushing on a string: US monetary policy is less
powerful in recessions,” Working Paper.
[56] Thoma, M. ”Subsample Instability and Asymmetries in Money-Income Causality.” Journal of Econometrics 64, 279-306, 1994
[57] Uhlig, H. ”What are the effects of monetary policy on output? Results from an agnostic identification procedure,” Journal of Monetary Economics, vol. 52(2), pages 381-419,
March 2005.
[58] Walsh C. Monetary Theory and Policy, 3rd ed., The MIT Press, 2010.
[59] Weise, C. ”The Asymmetric Effects of Monetary Policy: A Nonlinear Vector Autoregression Approach,” Journal of Money, Credit and Banking, vol. 31(1), pages 85-108, February
1999.


Appendix A1: Proof of Theorem 1
Following Alspach and Sorenson (1971, 1972) in the context of approximating distributions, the
problem of approximating a function f can be considered within the context of delta families
of positive types.
Delta families are families of functions which converge to a delta function as a parameter
characterizing the family converges to a limit value.
Let {δλ } be a family of functions on the interval ] − ∞, +∞[ which are integrable over every
interval. {δλ } forms a delta family of positive type if the following conditions are satisfied:
1. For every constant γ > 0, δλ tends to zero uniformly for γ ≤ |x| ≤ ∞ as λ → λ0
2. There exists s in R such that $\int_{-s}^{s} \delta_\lambda(x)\,dx \to 1$ as λ tends to some limit value λ0
3. δλ(x) ≥ 0 for all x and λ
Defining
$$\delta_\lambda(x) \equiv G_\lambda(x) = \frac{1}{\sqrt{2\pi\lambda^2}}\, e^{-\frac{x^2}{\lambda^2}}, \tag{11}$$
it is easy to see that the Gaussian functions {Gλ } form a delta family of positive type as λ → 0
(i.e., λ0 = 0). That is, the Gaussian function tends to the delta function as the variance tends
to zero.55
We can then make use of the following theorem.
Theorem: The sequence {fλ} which is formed by the convolution of δλ and f
$$f_\lambda(x) = \int_{-\infty}^{+\infty} \delta_\lambda(x-u)\, f(u)\,du \tag{12}$$
converges uniformly to f as λ → λ0 for x on every interval [x0, x1] of R.
Proof. See Korevaar (1968).
55 Note that this proof can be easily applied to other functions (such as the inverse quadratic function $x \mapsto \frac{1}{1+(x/\lambda)^2}$) that form a delta family of a positive type, so that our approach is not restricted to Gaussian functions.

Using (11) in (12), the function fλ given by
$$f_\lambda(x) = \int_{-\infty}^{+\infty} G_\lambda(x-u)\, f(u)\,du \tag{13}$$
converges uniformly to f as λ → 0 for x in some arbitrary interval [x0, x1] of R.
Next, we want to approximate (13) with a Riemann sum. To do so, first rewrite fλ as
$$f_\lambda(x) = \underbrace{\int_{-\infty}^{-s} G_\lambda(x-u) f(u)\,du}_{=A(\lambda,x)} + \int_{-s}^{+s} G_\lambda(x-u) f(u)\,du + \underbrace{\int_{s}^{+\infty} G_\lambda(x-u) f(u)\,du}_{=B(\lambda,x)} \tag{14}$$
for s > 1.
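Note that for any s > 1, we have
$$0 \le \int_{s}^{+\infty} G_\lambda(u)\,du \le \frac{1}{\sqrt{2\pi\lambda^2}} \int_{s}^{+\infty} e^{-\frac{u}{\lambda^2}}\,du \quad \text{since } u^2 > u \text{ for any } u \in [s, +\infty[,\ s > 1$$
$$\le \frac{1}{\sqrt{2\pi\lambda^2}} \left[-\lambda^2 e^{-\frac{u}{\lambda^2}}\right]_{s}^{+\infty} = \frac{|\lambda|}{\sqrt{2\pi}}\, e^{-\frac{s}{\lambda^2}} \xrightarrow[\lambda \to 0]{} 0$$
which shows that ∀s > 1, $\lim_{\lambda \to 0} \int_{s}^{+\infty} G_\lambda(u)\,du = 0$. Symmetrically, we can show $\lim_{\lambda \to 0} \int_{-\infty}^{-s} G_\lambda(u)\,du = 0$.
Going back to (14), we have
$$0 \le |B(\lambda, x)| \le M \int_{-\infty}^{x-s} G_\lambda(t)\,dt$$
where $M = \sup_{x \in R} |f(x)|$. Since x ∈ [x0, x1], we can choose an s > 1 such that x − s < −1, so that we can apply the previous result and get
$$\lim_{\lambda \to 0} |B(\lambda, x)| = 0. \tag{15}$$
Proceeding symmetrically, we have $\lim_{\lambda \to 0} |A(\lambda, x)| = 0$.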

Finally, since the function $u \mapsto G_\lambda(x-u) f(u)$ is continuous over [−s, s], we can approximate $\int_{-s}^{+s} G_\lambda(x-u) f(u)\,du$ with a Riemann sum. Denoting
$$f_{\lambda,N}(x) = \sum_{n=1}^{N} G_\lambda(x-\xi_n)\, f(\xi_n)\, (\xi_n - \xi_{n-1})$$
where $\xi_n = -s + n\frac{2s}{N}$, we get that
$$\lim_{N \to \infty} f_{\lambda,N}(x) = \int_{-s}^{+s} G_\lambda(x-u) f(u)\,du. \tag{16}$$

Denoting $a_n = f(\xi_n)(\xi_n - \xi_{n-1})$, $b_n = \xi_n$ and $c_n = \lambda$, using (16), (15) in (14) and combining with (13), we get that
$$\lim_{\lambda \to 0} \left( \lim_{N \to \infty} f_{\lambda,N}(x) \right) = f(x)$$

which completes the proof.
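
The construction used in the proof (a Riemann sum of shifted Gaussians with weights a_n, centers b_n and common width λ) can be checked numerically with a short sketch. Two choices here are assumptions made for illustration only: the kernel is the standard normal density with standard deviation λ, so that it integrates to one, and the target function is f = cos, a bounded continuous function. The helper names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, lam):
    """Normal density with standard deviation lam (assumed normalization)."""
    return np.exp(-x**2 / (2.0 * lam**2)) / np.sqrt(2.0 * np.pi * lam**2)

def gaussian_sum_approx(f, x, lam, s=5.0, n_terms=4000):
    """Riemann-sum approximation of the convolution of f with the kernel on [-s, s]."""
    n = np.arange(1, n_terms + 1)
    step = 2.0 * s / n_terms
    xi = -s + n * step            # centers b_n = xi_n
    a = f(xi) * step              # weights a_n = f(xi_n) * (xi_n - xi_{n-1})
    # Each row evaluates sum_n a_n * G_lam(x - xi_n) at one point x.
    return gaussian_kernel(x[:, None] - xi[None, :], lam) @ a

x = np.linspace(-1.0, 1.0, 201)   # interval [x0, x1] on which we check the fit
err_wide = np.max(np.abs(gaussian_sum_approx(np.cos, x, lam=0.5) - np.cos(x)))
err_narrow = np.max(np.abs(gaussian_sum_approx(np.cos, x, lam=0.05) - np.cos(x)))
print(err_wide, err_narrow)
```

The maximum error over the interval shrinks as λ is reduced, provided the grid spacing 2s/N stays small relative to λ, which mirrors the order of limits in the proof (N to infinity first, then λ to zero).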

Appendix A2: Identifying restrictions in nonlinear Moving-Average models
We now detail how to impose the different identifying restrictions used in the paper. We only discuss the nonlinear model $y_t = \sum_{k=0}^{\infty} \Psi_k(\varepsilon_{t-k}, z_{t-k})\,\varepsilon_{t-k}$, since it includes the simpler linear model $y_t = \sum_{k=0}^{\infty} \Psi_k \varepsilon_{t-k}$.

As described in the main text, we impose the identifying restriction when we construct
the likelihood, so that constructing the likelihood and imposing identifying restrictions are
intimately linked, and we thus describe them jointly. To recursively construct the likelihood at
time t, one must ensure that the shock vector εt is uniquely determined given a set of model
parameters and the history of variables up to time t. As described in the main text, in order to construct the likelihood recursively, the system of equations
$$\Psi_0(\varepsilon_t, z_t)\,\varepsilon_t = u_t \tag{17}$$
needs to have a unique solution vector εt given $u_t = y_t - \sum_{k=0}^{K} \Psi_k(\varepsilon_{t-k}, z_{t-k})\,\varepsilon_{t-1-k}$. That is, we must ensure that there is a one-to-one mapping from εt to Ψ0(εt, zt)εt. In the linear case,
this means that we must ensure Ψ0 is invertible. In the nonlinear case, ensuring that the shock
vector εt is uniquely determined becomes more complicated, when we allow Ψ0 to depend on
the sign of the shock or on some state variable.56
Consider first the consequences of allowing for state dependence, i.e., when Ψk depends
on the value of the indicator vector zt , so that the likelihood also depends on the value
of the indicator vector zt . Technically, constructing the likelihood of this specification is a
straightforward extension of the linear case, when zt is a function of lagged values of yt .
To see that, note that we use the prediction-error decomposition to construct the likelihood
function. We build a sequence of densities for yt that conditions on past values of yt . Thus,
conditional on past values of yt, zt is known, and as long as Ψ0(zt) is invertible, there is a
(one-to-one) mapping from εt to Ψ0 εt, and the likelihood can be recursively constructed.57
Consider now the consequences of allowing for asymmetry, i.e., when Ψk depends on the
sign of εt . A complication arises when one allows Ψ0 to depend on the sign of the shock
while also imposing identifying restrictions on Ψ0 . The complication arises, because with
asymmetry, the system of equations Ψ0 (εt )εt = ut need not have a unique solution vector εt ,
because Ψ0 (εt ), the impact matrix, depends on the sign of the shocks, i.e., on the vector εt .
In this appendix, we show how to address the issue when we allow the identified shocks
56 Note that if the impact matrix Ψ0 is a constant and does not depend on εt or zt (so that Ψk depends on εt or zt only for k > 0), then one can construct the likelihood just as in the linear case, because as long as Ψ0 is invertible, there is a (one-to-one) mapping from εt to Ψ0 εt, and εt is uniquely defined from ut.
57 If we wanted to use an indicator function that was not a function of the history of endogenous variables $y^{t-1}$, this would also be possible by using a quasi-likelihood approach. That is, we would build a likelihood function that not only conditions on the parameters, but also on the sequence of indicators zt. This would in general not be efficient because the joint density of zt and yt could carry more information about the parameters in our model than the conditional density we advocate using. As long as zt is highly correlated with elements of (functions of) yt, this loss in efficiency will likely be small.

to have asymmetric and state dependent effects on the impulse response functions. We successively consider each identification scheme used in the paper: (i) recursive ordering, (ii)
narrative identification, and (iii) sign restrictions.

1. Recursive identification scheme
It will be convenient to adopt the following conventions for notation:
• Denote $y_{\ell,t}$ the ℓth variable of vector $y_t$ and denote $y_t^{<\ell} = (y_{1,t}, ..., y_{\ell-1,t})'$ the vector of variables ordered before variable $y_{\ell,t}$ in $y_t$. Similarly, we can define $y_t^{\le\ell}$ or $y_t^{>\ell}$.
• For a matrix Γ of size L × L and $(i, j) \in \{1, ..., L\}^2$, denote $\Gamma^{<i,<j}$ the (i − 1) × (j − 1) submatrix of Γ made of the first (i − 1) rows and (j − 1) columns. Similarly, we denote $\Gamma^{>i,>j}$ the (L − i) × (L − j) submatrix of Γ made of the last (L − i) rows and (L − j) columns. In the same spirit, we denote $\Gamma^{i,<j}$ the submatrix of Γ made of the ith row and the first (j − 1) columns. $\Gamma^{i,<j}$ is in fact a row vector. A combination of these notations allows us to denote any submatrix of Γ. Finally, denote $\Gamma_{ij}$ the ith row jth column element of Γ.
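With these notations, we can now state the recursive identifying assumption.
Assumption 1 (Partial recursive identification) The contemporaneous impact matrix Ψ0 of dimension L × L is of the form
$$\Psi_0 = \begin{pmatrix}
\underset{(\ell-1)\times(\ell-1)}{\Psi_0^{<\ell,<\ell}} & \underset{(\ell-1)\times 1}{0^{<\ell,\ell}} & \underset{(\ell-1)\times(L-\ell)}{0^{<\ell,>\ell}} \\
\underset{1\times(\ell-1)}{\Psi_0^{\ell,<\ell}} & \underset{1\times 1}{\Psi_{0,\ell\ell}} & \underset{1\times(L-\ell)}{0^{\ell,>\ell}} \\
\underset{(L-\ell)\times(\ell-1)}{\Psi_0^{>\ell,<\ell}} & \underset{(L-\ell)\times 1}{\Psi_0^{>\ell,\ell}} & \underset{(L-\ell)\times(L-\ell)}{\Psi_0^{>\ell,>\ell}}
\end{pmatrix}$$
with ℓ ∈ {1, .., L}, $\Psi_0^{<\ell,<\ell}$ and $\Psi_0^{>\ell,>\ell}$ matrices of full rank and 0 denoting a zero matrix of the indicated dimensions.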

Assumption 1 states that the shock of interest εℓ,t , ordered in ℓth position in εt , affects
the variables ordered from 1 to ℓ − 1 with a one period lag, and that the first ℓ variables in yt
do not react contemporaneously to shocks ordered after εℓ,t in εt . For instance, in Primiceri
(2005)’s monetary model used in section 6, the policy rate is ordered last, and the recursive
identification scheme states that shocks to the policy rate do not affect unemployment and
inflation contemporaneously, i.e., that the last column of Ψ0 is filled with zeros except for the
diagonal element.
We first consider a model with only asymmetry and then a model with asymmetry and
state dependence.

1.1 Asymmetric impulse response functions
Proposition 1 Consider the nonlinear moving average model defined in (6) with
$$\Psi_k(\varepsilon_{t-k}) = \left[\Psi_k^{+}\, 1_{\varepsilon_{\ell,t-k}>0} + \Psi_k^{-}\, 1_{\varepsilon_{\ell,t-k}<0}\right], \quad \forall k \in \{0, .., K\},\ \forall t \in \{1, .., T\} \tag{18}$$
with ℓ ∈ {1, .., L}, εℓ,t the ℓth structural shock in εt, and with Ψ0 satisfying Assumption 1. Then, given $\{y_t\}_{t=1}^{T}$, given the model parameters and given K initial values of the shocks $\{\varepsilon_{-K} ... \varepsilon_0\}$, the series of shocks $\{\varepsilon_t\}_{t=1}^{T}$ is uniquely determined.
Proof. The key to Proposition 1 is to show that the sign of the monetary shock εℓ,t is
uniquely pinned down by (17).
We first establish the following lemma:
Lemma 1 Consider a matrix Γ that can be written as
$$\Gamma = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$$
where A, B, C and D are matrix sub-blocks of arbitrary size, with A a non-singular square matrix and $D - CA^{-1}B$ nonsingular. Then, the inverse of Γ satisfies
$$\Gamma^{-1} = \begin{pmatrix} A^{-1} + A^{-1}BF^{-1}CA^{-1} & -A^{-1}BF^{-1} \\ -F^{-1}CA^{-1} & F^{-1} \end{pmatrix}$$
with $F = D - CA^{-1}B$.
Proof. Verify that ΓΓ−1 = I.
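
The verification suggested in the proof of Lemma 1 can also be carried out numerically. The sketch below builds the block inverse from the formula in the lemma and checks that it is indeed the inverse of Γ; the function name `block_inverse` and the example matrices are hypothetical.

```python
import numpy as np

def block_inverse(A, B, C, D):
    """Inverse of [[A, B], [C, D]] built from the Schur complement F = D - C A^{-1} B."""
    A_inv = np.linalg.inv(A)
    F_inv = np.linalg.inv(D - C @ A_inv @ B)
    top_left = A_inv + A_inv @ B @ F_inv @ C @ A_inv
    top_right = -A_inv @ B @ F_inv
    bottom_left = -F_inv @ C @ A_inv
    return np.block([[top_left, top_right], [bottom_left, F_inv]])

# Small made-up example: A is 2 x 2 and nonsingular, D is a scalar block.
A = np.array([[2.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0], [2.0]])
C = np.array([[0.0, 1.0]])
D = np.array([[3.0]])

Gamma = np.block([[A, B], [C, D]])
Gamma_inv = block_inverse(A, B, C, D)
max_gap = np.max(np.abs(Gamma @ Gamma_inv - np.eye(3)))

# Special case B = 0, which is the one used in the proof of Proposition 1:
# the top-right block of the inverse is then zero as well.
Gamma_inv_b0 = block_inverse(A, np.zeros((2, 1)), C, D)
print(max_gap, Gamma_inv_b0)
```

With B = 0 the Schur complement reduces to F = D, so the last row of the inverse is (−CA⁻¹/D, 1/D), which is the row that pins down the shock of interest.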
We prove Proposition 1 by induction, so that given past shocks $\{\varepsilon_{t-1-K}, ..., \varepsilon_{t-1}\}$ (and given model parameters $\{\Psi_k\}_{k=0}^{K}$), we will prove that the system
$$u_t = \Psi_0(\varepsilon_{\ell,t})\,\varepsilon_t \tag{19}$$
with $u_t = y_t - \sum_{k=0}^{K} \Psi_k(\varepsilon_{\ell,t})\,\varepsilon_{t-1-k}$, has a unique solution vector εt.

Notice that (19) implies the sub-system with ℓ equations
$$u_t^{\le\ell} = \begin{pmatrix} \Psi_0^{<\ell,<\ell} & 0^{<\ell,1} \\ \Psi_0^{\ell,<\ell} & \Psi_{0,\ell\ell}(\varepsilon_{\ell,t}) \end{pmatrix} \varepsilon_t^{\le\ell} \tag{20}$$
and notice that the matrix in (20) depends on εℓ,t only through the scalar $\Psi_{0,\ell\ell}(\varepsilon_{\ell,t})$. Denoting $A \equiv \Psi_0^{<\ell,<\ell}$ a (ℓ − 1) × (ℓ − 1) invertible matrix (from Assumption 1), $C \equiv \Psi_0^{\ell,<\ell}$ a 1 × (ℓ − 1) matrix, B ≡ 0 of dimension (ℓ − 1) × 1, and $D(\varepsilon_{\ell,t}) \equiv \Psi_{0,\ell\ell}(\varepsilon_{\ell,t})$ the (ℓ, ℓ) coefficient of Ψ0 (a scalar), we can use Lemma 1 to invert the system (20) and obtain
$$\varepsilon_t^{\le\ell} = \begin{pmatrix} A^{-1} & 0^{<\ell,1} \\ -\frac{1}{D(\varepsilon_{\ell,t})} CA^{-1} & \frac{1}{D(\varepsilon_{\ell,t})} \end{pmatrix} u_t^{\le\ell}. \tag{21}$$
The last row of (21) provides the equation $\varepsilon_{\ell,t} = \frac{1}{D(\varepsilon_{\ell,t})} \begin{pmatrix} -CA^{-1} & 1 \end{pmatrix} u_t^{\le\ell}$, which defines εℓ,t. Since the right hand side of that equation only depends on εℓ,t through $D(\varepsilon_{\ell,t})$, the sign of the right hand side depends on εℓ,t only through the sign of $D(\varepsilon_{\ell,t}) = \Psi_{0,\ell\ell}(\varepsilon_{\ell,t})$. But since $\Psi_{0,\ell\ell}(\varepsilon_{\ell,t})$, the contemporaneous effect of the shock εℓ,t on variable $y_{\ell,t}$, is posited to be positive as a normalization, the sign (and the value) of εℓ,t is uniquely determined from the last row of (21). Then, with $\Psi_0^{<\ell,<\ell}$ and $\Psi_0^{>\ell,>\ell}$ invertible, (19) has a unique solution vector εt.
Proposition 1 ensures that the system (17) has a unique solution vector, even when the
shock εℓ,t , identified from a recursive ordering, triggers asymmetric impulse response functions.
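
The logic of the proof can be turned into a small numerical sketch: the sign of the shock of interest is read off the last row of (21), the matching impact matrix is selected, and the full system is then solved. The sketch assumes a 3-variable system with the shock of interest ordered last, that Ψ0+ and Ψ0− differ only in the column of that shock, and that both diagonal coefficients are positive; the function name and all numbers are hypothetical.

```python
import numpy as np

def solve_shocks(u, psi0_pos, psi0_neg, ell):
    """Recover eps_t from u_t = Psi0(eps_{ell,t}) eps_t (ell is a zero-based index).

    Assumes psi0_pos and psi0_neg satisfy Assumption 1, coincide outside
    column ell, and have a positive (ell, ell) coefficient.
    """
    u = np.asarray(u, dtype=float)
    if ell > 0:
        A = psi0_pos[:ell, :ell]          # block of variables ordered before ell
        C = psi0_pos[ell, :ell]           # row ell, columns before ell
        numer = u[ell] - C @ np.linalg.solve(A, u[:ell])
    else:
        numer = u[ell]
    # With D > 0 in both regimes, the sign of eps_ell equals the sign of numer.
    psi0 = psi0_pos if numer > 0 else psi0_neg
    return np.linalg.solve(psi0, u)

# Made-up impact matrices: only the last diagonal coefficient is asymmetric.
psi0_pos = np.array([[1.0, 0.0, 0.0], [0.3, 1.0, 0.0], [0.5, -0.2, 0.8]])
psi0_neg = np.array([[1.0, 0.0, 0.0], [0.3, 1.0, 0.0], [0.5, -0.2, 0.4]])

eps_true_neg = np.array([0.5, -1.0, -0.7])     # expansionary policy shock
eps_true_pos = np.array([0.5, -1.0, 0.7])      # contractionary policy shock
eps_hat_neg = solve_shocks(psi0_neg @ eps_true_neg, psi0_pos, psi0_neg, ell=2)
eps_hat_pos = solve_shocks(psi0_pos @ eps_true_pos, psi0_pos, psi0_neg, ell=2)
print(eps_hat_neg, eps_hat_pos)
```

In both cases the recovered vector equals the vector used to generate u_t, illustrating that the positivity of the diagonal coefficient is what makes the sign, and hence the regime, identifiable.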
With Proposition 1, we can then construct the likelihood recursively. To write down the one-step ahead forecast density $p(y_t|\theta, y^{t-1})$ as a function of past observations and model parameters, we use the standard result (see e.g., Casella and Berger, 2002) that for Ψ0 a function of εt, we have
$$p(\Psi_0(\varepsilon_{\ell,t})\,\varepsilon_t \,|\, \theta, y^{t-1}) = J_t\, p(\varepsilon_t)$$
where Jt is the Jacobian of the (one-to-one) mapping from εt to Ψ0 (εt )εt and where p(εt ) is
the density of εt .58
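
As an illustration of this change of variables, the sketch below computes one log-density contribution under two assumptions that are made here for concreteness: the shocks are standard normal, and the impact matrix passed in is the one already selected for the sign regime of the shock, so that the mapping is linear within the regime and the Jacobian factor is 1/|det Ψ0|. The function name and the numbers are hypothetical.

```python
import numpy as np

def log_density_step(u, psi0):
    """log p(u_t) = log p(eps_t) - log|det Psi0| with eps_t = Psi0^{-1} u_t.

    Assumes eps_t ~ N(0, I) and that psi0 is the impact matrix of the regime
    implied by the sign of the shock of interest.
    """
    u = np.asarray(u, dtype=float)
    eps = np.linalg.solve(psi0, u)
    _, log_abs_det = np.linalg.slogdet(psi0)
    log_p_eps = -0.5 * (eps @ eps + eps.size * np.log(2.0 * np.pi))
    return log_p_eps - log_abs_det

psi0 = np.array([[1.0, 0.0, 0.0], [0.3, 1.0, 0.0], [0.5, -0.2, 0.8]])
u = np.array([0.5, -0.85, 1.01])
log_p = log_density_step(u, psi0)

# Cross-check against the N(0, Psi0 Psi0') log-density written out directly.
sigma = psi0 @ psi0.T
_, log_det_sigma = np.linalg.slogdet(sigma)
log_p_direct = -0.5 * (u @ np.linalg.solve(sigma, u) + log_det_sigma
                       + u.size * np.log(2.0 * np.pi))
print(log_p, log_p_direct)
```

Summing such terms over t, with the regime chosen period by period from the recovered sign of the shock, gives a log-likelihood of the kind the recursive construction relies on.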
Finally, note that while we considered the case of a partially identified model, we can
proceed similarly for a fully identified model with Ψ0 lower triangular and show that the shock
vector εt is uniquely determined by (17) even when all shocks have asymmetric effects.
1.2 Asymmetric and state-dependent impulse response functions
We now consider a model with asymmetry and state dependence. For clarity of exposition, we consider the simpler case of a univariate state variable $z_t \in [\underline{z}, \bar{z}]$ with $\underline{z} = \min_{t \in [1,T]} (z_t)$ and $\bar{z} = \max_{t \in [1,T]} (z_t)$. The following proposition establishes the condition under which system (17) has a unique solution even when the identified shock εℓ,t has asymmetric and state dependent effects.
58 In our case with asymmetry, this Jacobian is simple to calculate, but the mapping is not differentiable at εℓ,t = 0. Since we will never exactly observe εℓ,t = 0 in a finite sample, we can implicitly assume that in a small neighborhood around 0, we replace the original mapping with a smooth function.

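Proposition 2 Consider the nonlinear moving average model defined in (6) with
$$\Psi_k(\varepsilon_{t-k}, z_{t-k}) = \left[\Psi_k^+(z_{t-k})\, 1_{\varepsilon_{\ell,t-k}>0} + \Psi_k^-(z_{t-k})\, 1_{\varepsilon_{\ell,t-k}<0}\right], \quad \forall k \in \{0,..,K\},\ \forall t \in \{1,..,T\} \qquad (22)$$
with $z_t \in [\underline{z}, \bar{z}]$, $\ell \in \{1,..,L\}$, $\varepsilon_{\ell,t}$ the $\ell$th structural shock in $\varepsilon_t$, and with $\Psi_0$ satisfying
Assumption 1. Then, given $\{y_t\}_{t=1}^{T}$, given the model parameters and given $K$ initial values
of the shocks $\{\varepsilon_{-K} ... \varepsilon_0\}$, the series of shocks $\{\varepsilon_t\}_{t=1}^{T}$ is uniquely determined provided that
$$\operatorname{sgn}\left(\Psi^+_{0,\ell\ell}(z_t)\right) = \operatorname{sgn}\left(\Psi^-_{0,\ell\ell}(z_t)\right) > 0, \quad \forall z_t \in [\underline{z}, \bar{z}].$$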
Proof. The proof proceeds exactly as with Proposition 1 and consists in showing that the
system ut = Ψ0 (εℓ,t , zt )εt determines a unique solution vector εt . As with Proposition 1, this
is the case as long as Ψ0,ℓℓ (εℓ,t , zt ) > 0 regardless of the value of zt .
Taking as an example the case of the monetary model from section 6, the restriction in
Proposition 2 implies $\operatorname{sgn}\left(\Psi^+_{0,\ell\ell}(\underline{z})\right) = \operatorname{sgn}\left(\Psi^+_{0,\ell\ell}(\bar{z})\right)$ and similarly for $\Psi^-_{0,\ell\ell}$, so that the
coefficient of the impact response of the fed funds rate to a monetary shock is always positive,
regardless of the state of the cycle. Note that this restriction is very mild, in that it is in fact
an existence condition for the moving average model, since the diagonal coefficients of $\Psi_k$ are
posited to be positive as a normalization.
With Proposition 2 in hand, we can then construct the likelihood recursively as described
in the previous section.

2. Narrative identification scheme
For a narrative identification scheme, we can use the previous results on recursive identification,
since the use of narratively identified shocks can be cast as a partial identification scheme.
Indeed, if one orders the narratively identified shock series first in yt , we can assume that
Ψ0 has its first row filled with zeros except for the diagonal coefficient, which implies that the
narratively identified shock does not react contemporaneously to other shocks (as should be
the case if the narrative shocks were correctly identified). With Assumption 1 satisfied with
ℓ = 1, Propositions 1 and 2 then imply that (17) has a unique solution vector εt even when the
narratively identified shock has asymmetric and state dependent effects.

3. Identification from sign restrictions
We now consider the case of a set identification scheme based on sign restrictions. Denote by $\varepsilon^r_t$ the
structural shock of interest identified from sign restrictions. We now establish the conditions
under which system (17) has a unique solution vector, first in a model with asymmetry, and
second in a model with asymmetry and state dependence.

3.1 Asymmetric impulse response functions
Proposition 3 Consider the nonlinear moving average model defined in (6) with
$$\Psi_k(\varepsilon^r_{t-k}) = \left[\Psi_k^+\, 1_{\varepsilon^r_{t-k}>0} + \Psi_k^-\, 1_{\varepsilon^r_{t-k}<0}\right], \quad \forall k \in \{0,..,K\},\ \forall t \in \{1,..,T\} \qquad (23)$$
with $\varepsilon^r_t$ the structural shock identified from sign restrictions. Then, given $\{y_t\}_{t=1}^{T}$, given the
model parameters and given $K$ initial values of the shocks $\{\varepsilon_{-K} ... \varepsilon_0\}$, the series of shocks
$\{\varepsilon_t\}_{t=1}^{T}$ is uniquely determined provided that $\operatorname{sgn}(\det \Psi_0^+) = \operatorname{sgn}(\det \Psi_0^-)$.

Proof. Without loss of generality, let us order the variables such that $\varepsilon^r_t$, the shock with
asymmetric effects, is ordered last. We can then write $\Psi_0(\varepsilon^r_t)$ (of dimension $L \times L$) as
$$\Psi_0(\varepsilon^r_t) = \begin{pmatrix} A & B(\varepsilon^r_t) \\ C & D(\varepsilon^r_t) \end{pmatrix}$$
with $A$ an $(L-1) \times (L-1)$ invertible matrix, $C$ a $1 \times (L-1)$ matrix, $B(\varepsilon^r_t)$ a matrix of
dimension $(L-1) \times 1$ that depends on $\varepsilon^r_t$, and $D(\varepsilon^r_t) \equiv \Psi_{0,LL}(\varepsilon^r_t)$ a scalar. Notice that only the
last column of $\Psi_0$ depends on $\varepsilon^r_t$.
We will make use of the following lemma:

Lemma 2 Consider the same matrix $\Gamma$ as in Lemma 1. We have
$$\det \Gamma = \det(A)\, \det(D - CA^{-1}B).$$

Proof. Rewrite $\Gamma$ as
$$\Gamma = \begin{pmatrix} A & 0 \\ C & I \end{pmatrix} \begin{pmatrix} I & A^{-1}B \\ 0 & D - CA^{-1}B \end{pmatrix}$$
and the lemma follows.
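The determinant identity in Lemma 2 can be checked numerically. The sketch below is only an illustration: it assumes that Γ is the block matrix with blocks A, B, C and D used in the factorization above, and the numerical values and the helper name `schur_determinant` are hypothetical.

```python
import numpy as np

def schur_determinant(A, B, C, D):
    """Return det(A) * det(D - C A^{-1} B), the right-hand side of Lemma 2."""
    schur = D - C @ np.linalg.solve(A, B)  # solve(A, B) computes A^{-1} B
    return np.linalg.det(A) * np.linalg.det(np.atleast_2d(schur))

# Hypothetical blocks with L = 4: A is 3x3 and invertible (its determinant is 13).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])
B = np.array([[1.0], [-1.0], [2.0]])
C = np.array([[0.5, 1.0, -1.0]])
D = np.array([[4.0]])
gamma = np.block([[A, B], [C, D]])

# The two numbers printed below should agree up to rounding error.
print(np.linalg.det(gamma), schur_determinant(A, B, C, D))
```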
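Using Lemma 1 and noting that $D(\varepsilon^r_t)$ is a scalar, we have that the inverse of $\Psi_0$ satisfies
$$\Psi_0^{-1} = \frac{1}{D(\varepsilon^r_t) - CA^{-1}B(\varepsilon^r_t)} \begin{pmatrix} \left(D(\varepsilon^r_t) - CA^{-1}B(\varepsilon^r_t)\right)A^{-1} + A^{-1}BCA^{-1} & -A^{-1}B(\varepsilon^r_t) \\ -CA^{-1} & 1 \end{pmatrix}.$$
The last row of the system $\varepsilon_t = \Psi_0^{-1}u_t$ provides the equation
$$\varepsilon^r_t = \frac{1}{D(\varepsilon^r_t) - CA^{-1}B(\varepsilon^r_t)} \begin{pmatrix} -CA^{-1} & 1 \end{pmatrix} u_t,$$
which defines $\varepsilon^r_t$. Since the right hand side of that equation only depends on $\varepsilon^r_t$ through
$D(\varepsilon^r_t) - CA^{-1}B(\varepsilon^r_t)$, the sign of the right hand side depends on $\varepsilon^r_t$ only through the sign
of $D(\varepsilon^r_t) - CA^{-1}B(\varepsilon^r_t)$.$^{59}$ Using Lemma 2, this means that the sign of the right hand side
depends on $\varepsilon^r_t$ only through the sign of $\det \Psi_0$. Thus, with $\operatorname{sgn}(\det \Psi_0^+) = \operatorname{sgn}(\det \Psi_0^-)$, the
sign (and value) of $\varepsilon^r_t$ is uniquely pinned down, so that with $A$ invertible, the system (17) has
a unique solution vector.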
Proposition 3 states that the system $u_t = \Psi_0(\varepsilon^r_t)\varepsilon_t$ determines a unique solution vector $\varepsilon_t$
as long as $\operatorname{sgn}(\det \Psi_0^+) = \operatorname{sgn}(\det \Psi_0^-)$, i.e., as long as the asymmetry is not too strong.
In practice, we impose this restriction by assigning a minus infinity value to the likelihood
whenever $\operatorname{sgn}(\det \Psi_0^+) \neq \operatorname{sgn}(\det \Psi_0^-)$.
59 In fact, we have $D(\varepsilon^r_t) - CA^{-1}B(\varepsilon^r_t) = \Psi_{0,LL}(\varepsilon^r_t) - \sum_{\ell=1}^{L-1} \left(CA^{-1}\right)_{\ell} \Psi_{0,\ell L}(\varepsilon^r_t)$.

Then, to construct the likelihood, we proceed as described in the recursive identification
section by using the fact that there is a one-to-one mapping from εt to Ψ0 (εt )εt .
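To make the uniqueness argument of Proposition 3 concrete, here is a minimal sketch that recovers the shock vector from a reduced-form innovation by trying both regimes and keeping the solution whose sign agrees with the regime used to compute it. It assumes, as in the proof, that the asymmetric shock is ordered last and that the two impact matrices differ only in their last column. The function name `recover_shocks` and the numerical values are hypothetical, and this is not the authors' estimation code.

```python
import numpy as np

def recover_shocks(u, psi0_plus, psi0_minus):
    """Solve u = Psi0(eps_r) eps, with the asymmetric shock eps_r ordered last.

    Returns the candidate solutions whose last element has the sign assumed
    by the regime used to compute it. Returns [] when the two determinants
    do not share a nonzero sign, mirroring the minus-infinity likelihood rule.
    """
    det_plus = np.linalg.det(psi0_plus)
    det_minus = np.linalg.det(psi0_minus)
    if np.sign(det_plus) != np.sign(det_minus) or det_plus == 0.0:
        return []
    candidates = []
    eps_plus = np.linalg.solve(psi0_plus, u)    # regime eps_r > 0
    if eps_plus[-1] > 0:
        candidates.append(eps_plus)
    eps_minus = np.linalg.solve(psi0_minus, u)  # regime eps_r < 0
    if eps_minus[-1] < 0:
        candidates.append(eps_minus)
    return candidates

# Hypothetical impact matrices that differ only in their last column.
psi0_plus = np.array([[1.0, 0.2], [0.5, 1.0]])
psi0_minus = np.array([[1.0, 0.1], [0.5, 0.4]])
eps_true = np.array([0.7, -1.5])   # the asymmetric shock is negative
u = psi0_minus @ eps_true          # innovation implied by eps_true
solutions = recover_shocks(u, psi0_plus, psi0_minus)
print(solutions)
```

With determinants of the same sign, only the negative-regime candidate survives in this example, and it reproduces the shock vector used to generate the innovation.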
3.2 Asymmetric and state-dependent impulse response functions
For clarity of exposition, we consider the simpler case of a univariate state variable $z_t \in [\underline{z}, \bar{z}]$
with $\bar{z} = \max_{t\in[1,T]}(z_t)$ and $\underline{z} = \min_{t\in[1,T]}(z_t)$. With asymmetric and state dependent effects of $\varepsilon^r_t$, we
can establish the following proposition.
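Proposition 4 Consider the nonlinear moving average model defined in (6) with
$$\Psi_k(\varepsilon^r_{t-k}, z_{t-k}) = \left[\Psi_k^+(z_{t-k})\, 1_{\varepsilon^r_{t-k}>0} + \Psi_k^-(z_{t-k})\, 1_{\varepsilon^r_{t-k}<0}\right], \quad \forall k \in \{0,..,K\},\ \forall t \in \{1,..,T\} \qquad (24)$$
with $\varepsilon^r_t$ the structural shock identified from sign restrictions. Then, given $\{y_t\}_{t=1}^{T}$, given the
model parameters and given $K$ initial values of the shocks $\{\varepsilon_{-K} ... \varepsilon_0\}$, the series of shocks
$\{\varepsilon_t\}_{t=1}^{T}$ is uniquely determined provided that $\operatorname{sgn}(\det \Psi_0^+(z_t)) = \operatorname{sgn}(\det \Psi_0^-(z_t))$, $\forall z_t \in [\underline{z}, \bar{z}]$.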

Proof. Proceed as in the proof of Proposition 3.
Proposition 4 states that the system $u_t = \Psi_0(\varepsilon^r_t, z_t)\varepsilon_t$ determines a unique solution vector
$\varepsilon_t$ as long as $\operatorname{sgn}(\det \Psi_0^+(z_t)) = \operatorname{sgn}(\det \Psi_0^-(z_t))$ is independent of the value of $z_t$, i.e., as long
as state dependence is not too strong. In practice, we can impose this restriction by assigning
a minus infinity value to the likelihood whenever $\operatorname{sgn}(\det \Psi_0^{\pm}(\underline{z})) \neq \operatorname{sgn}(\det \Psi_0^{\pm}(\bar{z}))$.

Constructing the likelihood then proceeds as described in the previous section on recursive
identification.
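The restriction of Proposition 4 can be imposed as a guard around the likelihood evaluation. The sketch below is an illustration under stated assumptions: the state-dependent impact matrices are passed as hypothetical callables `psi0_plus(z)` and `psi0_minus(z)`, their functional form is invented for the example and is not the paper's parametrization, and the condition is checked on a grid of state values that includes both endpoints rather than at the endpoints only.

```python
import numpy as np

def sign_condition_holds(psi0_plus, psi0_minus, z_grid):
    """True if det Psi0+(z) and det Psi0-(z) share one nonzero sign on z_grid."""
    signs = set()
    for z in z_grid:
        signs.add(float(np.sign(np.linalg.det(psi0_plus(z)))))
        signs.add(float(np.sign(np.linalg.det(psi0_minus(z)))))
    return len(signs) == 1 and 0.0 not in signs

def guarded_log_likelihood(log_lik, psi0_plus, psi0_minus, z_grid):
    """Return log_lik, or minus infinity when the sign condition fails."""
    if sign_condition_holds(psi0_plus, psi0_minus, z_grid):
        return log_lik
    return -np.inf

# Hypothetical state-dependent impact matrices, for illustration only.
def psi0_plus(z):
    return np.array([[1.0, 0.2 * (1.0 + 0.5 * z)], [0.5, 1.0]])

def psi0_minus(z):
    return np.array([[1.0, 0.1], [0.5, 0.4 + 0.3 * z]])

# The condition holds for z in [-1, 2] but fails once z reaches -2.
print(guarded_log_likelihood(-12.3, psi0_plus, psi0_minus, np.linspace(-1.0, 2.0, 31)))
print(guarded_log_likelihood(-12.3, psi0_plus, psi0_minus, np.linspace(-2.0, 2.0, 41)))
```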

[Figure 1 here. Two columns of panels: "Approximation with a GMA(1)" (left) and "Approximation with a GMA(2)" (right). Rows: Unemployment, Price level, Interest rate. Horizontal axis: Quarters. Legend: VAR, GMA.]

Figure 1: Impulse response functions of the unemployment rate (in ppt), the (log) price level
(in percent) and the federal funds rate (in ppt) to a one standard-deviation monetary shock.
Impulse responses estimated with a VAR (dashed line) or approximated using one Gaussian
basis function (GMA(1), left panel, thick line) or two Gaussian basis functions (GMA(2), right
panel, thick line). Estimation using data covering 1959-2007.

[Figure 2 here. Panel title: "Gaussian basis functions". Rows: Unemployment, Inflation, Interest rate. Horizontal axis: Quarters. Legend: GB1, GB2.]

Figure 2: Gaussian basis functions (dashed lines) used by a GMA(2) to approximate the
responses of unemployment, inflation and the fed funds rate to a monetary shock. The basis
functions are appropriately weighted so that their sum gives the GMA(2) parametrization of
the impulse response functions (solid lines) reported in the right panels of Figure 1.

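[Figure 3 here. Plot of the Gaussian basis function $\psi(t) = a e^{-\left(\frac{t-b}{c}\right)^2}$ against $t$. The peak value $a$ is reached at $t = b$, and the half-peak value $a/2$ is reached at a distance $c\sqrt{\ln 2}$ from $b$.]

Figure 3: Interpreting an impulse response function with a GMA(1) model.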
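The interpretation of the three parameters in Figure 3 can be verified with a few lines of code. The sketch below evaluates the Gaussian basis function and checks that the response falls to half its peak value at a distance c√(ln 2) from the peak. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

def gaussian_basis(t, a, b, c):
    """Gaussian basis function psi(t) = a * exp(-((t - b) / c) ** 2)."""
    t = np.asarray(t, dtype=float)
    return a * np.exp(-(((t - b) / c) ** 2))

def half_life(c):
    """Distance from the peak at which the response equals half its peak."""
    return c * np.sqrt(np.log(2.0))

# Hypothetical parameter values, chosen only for illustration.
a, b, c = 0.2, 8.0, 5.0
tau = half_life(c)
print(gaussian_basis(b, a, b, c))        # value at the peak, equal to a
print(gaussian_basis(b + tau, a, b, c))  # value at distance tau, half the peak
```

Read this way, a measures the peak effect of the shock, b the time at which the peak occurs, and c the persistence of the response.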

[Figure 4 here. Panels: Output, Inflation, Interest rate. Legend: Linear, Contractionary shock, Expansionary shock.]

Figure 4: Monte Carlo simulation with asymmetric impulse responses to monetary shocks.
The thick blue lines report the simulated impulse responses to a contractionary shock, and
the thick red lines report the simulated impulse responses to an expansionary shock (with the
responses to an expansionary shock multiplied by -1 for clarity of exposition). The dashed
lines are the impulse responses estimated from a VAR over 1959-2007.

[Figure 5 here. Top row (Contractionary shock): Unemployment, Price level, Interest rate. Bottom row (Expansionary shock): (−) Unemployment, (−) Price level, (−) Interest rate. Legend: VAR.]

Figure 5: Impulse response functions of the unemployment rate (in ppt), the (log) price level
(in percent) and the federal funds rate (in ppt) to a one standard-deviation monetary shock
identified from a recursive ordering. Estimation from a VAR (dashed line) or from a GMA(2)
with asymmetry (plain line). Shaded bands denote the 5th and 95th posterior percentiles.
For ease of comparison, responses to the expansionary shock are multiplied by -1. Estimation
using data covering 1959-2007.

[Figure 6 here. Panels: Unemployment, Inflation, Fed funds rate. Vertical axis: Difference in IRFs.]

Figure 6: Differences in impulse response functions of the unemployment rate (in ppt), the
(log) price level (in percent) and the federal funds rate (in ppt) to a one standard-deviation
monetary shock. Shaded bands denote the 5th and 95th posterior percentiles. Estimation
using data covering 1959-2007.

[Figure 7 here. Top row (Contractionary shock): Unemployment, Price level, Interest rate. Bottom row (Expansionary shock): (−) Unemployment, (−) Price level, (−) Interest rate.]

Figure 7: Impulse response functions of the unemployment rate (in ppt), the (log) price level
(in percent) and the federal funds rate (in ppt) to a one standard-deviation Romer and Romer
monetary shock. Estimation from a VAR (dashed line) or from a GMA(2) with asymmetry
(plain line). Shaded bands denote the 5th and 95th posterior percentiles. For ease of
comparison, responses to the expansionary shock are multiplied by -1. Estimation using data
covering 1966-2007.

[Figure 8 here. Top row (Contractionary shock): Unemployment, Price level, Interest rate. Bottom row (Expansionary shock): (−) Unemployment, (−) Price level, (−) Interest rate.]

Figure 8: Impulse response functions of the unemployment rate (in ppt), the (log) price level
(in percent) and the federal funds rate (in ppt) to a one standard-deviation monetary shock
identified with sign restrictions. Estimation from a GMA(1) with asymmetry (plain line).
Shaded bands denote the 5th and 95th posterior percentiles. For ease of comparison, responses
to the expansionary shock are multiplied by -1. Estimation using data covering 1959-2007.

[Figure 9 here. Vertical axis: Unemployment. Horizontal axis: years, 1960 to 2000. Legend: UR, Monetary shocks.]

Figure 9: Unemployment rate, the business cycle indicator (solid line, left scale), and estimated
monetary shocks (circles, right scale), with larger circles indicating larger shocks.

[Figure 10 here. Columns: Contractionary shock (left), Expansionary shock (right). Rows: Peak effect on U, Peak effect on Π, Shocks distribution. Horizontal axis: Unemployment rate. Legend: VAR, GMA.]

Figure 10: Peak effect of monetary policy on unemployment and inflation (in ppt) as a function
of the state of the business cycle (measured with the unemployment rate) for one standard
deviation contractionary monetary shocks (left panel) and expansionary monetary shocks (right
panel). The dashed lines represent the 5th and 95th posterior percentiles. The thick-dashed
line is the linear VAR estimate. The bottom panel plots the distribution of (respectively)
contractionary shocks and expansionary shocks over the business cycle. Estimation using data
covering 1959-2007.

Table 1: Summary statistics for Monte Carlo simulation with a linear model

                                        U               π              ffr
                                   VAR     GMA     VAR     GMA     VAR     GMA
MSE                               0.057   0.043   0.077   0.041   0.003   0.002
Avg length (at peak effect)       0.16    0.13    0.27    0.11    0.05    0.03
Coverage rate (at peak effect)    0.94    0.83    1       0.78    0.94    0.93

Note: Summary statistics over 50 Monte-Carlo replications. MSE is the mean-squared error of the estimated impulse response function over horizons 1 to 25. Avg length is the
average distance between the lower (2.5%) and upper (97.5%) confidence bands at the time of peak effect of the monetary shock. The coverage rate is the frequency with which
the true value lies within 95 percent of the posterior distribution. The VAR estimates and confidence bands are obtained from a Bayesian VAR with Normal-Wishart priors. U,
π and ffr denote respectively unemployment, inflation and the fed funds rate.
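The three summary statistics defined in the note to Table 1 can be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: it takes the peak horizon to be the horizon at which the true response is largest in absolute value, it averages the squared error over both horizons and replications, and the input arrays are small hypothetical examples rather than simulation output from the paper.

```python
import numpy as np

def summarize(true_irf, estimates, lower, upper):
    """Monte Carlo summary statistics for one impulse response.

    true_irf:     shape (H,), true response over horizons 1..H
    estimates:    shape (R, H), point estimates for R replications
    lower, upper: shape (R, H), 2.5% and 97.5% posterior bands
    """
    peak = int(np.argmax(np.abs(true_irf)))  # horizon of the peak effect (assumed definition)
    mse = float(np.mean((estimates - true_irf) ** 2))
    avg_length = float(np.mean(upper[:, peak] - lower[:, peak]))
    inside = (lower[:, peak] <= true_irf[peak]) & (true_irf[peak] <= upper[:, peak])
    coverage = float(np.mean(inside))
    return mse, avg_length, coverage

# Hypothetical inputs: two replications and three horizons.
true_irf = np.array([0.0, 1.0, 0.5])
estimates = np.array([[0.1, 0.9, 0.5],
                      [-0.1, 1.2, 0.4]])
lower = estimates - 0.15
upper = estimates + 0.15
mse, avg_length, coverage = summarize(true_irf, estimates, lower, upper)
print(mse, avg_length, coverage)
```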

Table 2: Summary statistics for Monte Carlo simulation with asymmetry

                                                        a+ − a−
                                                 gdp       π        ffr
Frequency of rejection of zero coefficient       0.94      0.90     0.08
Mean                                            -0.82     -0.50     0.03
(true value)                                   (-1.00)   (-0.60)   (0.00)
Std-dev                                          0.28      0.17     0.12
Coverage rate                                    0.82      0.86     0.88

Note: Summary statistics over 50 Monte-Carlo replications. For each coefficient of interest, "Frequency
of rejection of zero coefficient" is the frequency that 0 lies outside 90 percent of the posterior
distribution, and "Coverage rate" is the frequency with which the true value lies within 90 percent of the
posterior distribution. gdp, π and ffr denote respectively output, inflation and the fed funds rate.

Table 3: Summary statistics for Monte Carlo simulation with asymmetry and state dependence

                              γ+ − γ−             α+ − α−               γ+                  γ−
                            gdp      π         gdp       π         gdp      π         gdp       π
Frequency of rejection
of zero coefficient         0.96     0.03      0.82      0.80      0.87     0.06      0.20      0.05
Mean                        0.96     0.02     -0.78     -0.48      0.71     0.00     -0.21     -0.00
(true value)               (1.00)   (0.00)   (-1.00)   (-0.60)    (1.00)   (0.00)    (0.00)    (0.00)
Std-dev                     0.26     0.17      0.37      0.23      0.31     0.19      0.23      0.19
Coverage rate               0.84     0.92      0.71      0.70      0.68     0.92      0.65      0.90

Note: Summary statistics over 50 Monte-Carlo replications. For each coefficient of interest, "Frequency of rejection of zero coefficient" is the frequency that 0 lies
outside 90 percent of the posterior distribution, and "Coverage rate" is the frequency with which the true value lies within 90 percent of the posterior distribution.
gdp and π denote respectively output and inflation.

Table 4: Marginal data densities

                                (1)     (2)       (3)          (4)          (5)          (6)
                                VAR     GMA(1)    GMA(1)       GMA(2)       GMA(3)       GMA(2)
                                                  Asymmetry    Asymmetry    Asymmetry    Asymmetry,
                                                                                         State dep.
(log) marginal data density     112     118       127          138          107          158

Note: Trivariate models with unemployment, PCE inflation and the fed funds rate estimated over 1959-2007. The VAR estimates and confidence bands are obtained from a Bayesian
VAR with Normal-Wishart priors.