
Economic Quarterly—Volume 96, Number 4—Fourth Quarter 2010—Pages 317–337

Monetary Policy and Global
Equilibria in a Production
Economy
Tim Hursey and Alexander L. Wolman

Macroeconomic models that are applied to the study of monetary
policy often exhibit multiple equilibria.1 Prior to the mid-1990s,
applied monetary theory typically modeled monetary policy in
terms of a rule for the money supply, and it was well understood that multiple equilibria often arose under constant money supply policies. Starting in
the mid-1990s, applied work shifted to modeling monetary policy in terms
of interest rate rules. This was mainly because of the accumulating observations that central banks in fact operated with interest rate targets rather
than money supply targets. A particular class of interest rate rules—so called
“active Taylor rules,” featuring a strong response of the policy interest rate
to inflation—attracted special attention. In linearized models these policy
rules were shown to guarantee a locally unique nonexplosive equilibrium.
Benhabib, Schmitt-Grohé, and Uribe looked beyond the local dynamics in a
series of articles (e.g., 2001a, 2001b, 2002), and showed that active Taylor
rules could in fact lead to multiple equilibria. Whereas local analysis ignored
the zero bound on nominal interest rates, global analysis showed that the zero
bound implied the existence of a second steady-state equilibrium, with low
inflation and a low nominal interest rate. This second steady state proved to be
the “destination” for paths that had appeared explosive in the local analysis.
Benhabib, Schmitt-Grohé, and Uribe’s results attracted much attention in the
academic literature because the prevailing wisdom had held that active Taylor
The views in this paper are those of the authors and do not represent the views of the Federal
Reserve Bank of Richmond, the Federal Reserve Board of Governors, or the Federal Reserve
System. For helpful comments, the authors thank Huberto Ennis, Brian Gaines, Andreas Hornstein, and Thomas Lubik. E-mails: tim.hursey@rich.frb.org; alexander.wolman@rich.frb.org.
1 Michener and Ravikumar (1998) provide a taxonomy of multiple equilibria in monetary
models that predates the recent sticky-price literature.

rules generated a unique equilibrium. More recently, the persistence of low
inflation and low nominal interest rates has brought attention to Benhabib,
Schmitt-Grohé, and Uribe’s work in policy circles. Most notably, Bullard
(2010) argued that monetary policy in the United States could unintentionally
be leading the economy to a steady state in which inflation is below its target.
This article provides an introduction to Benhabib, Schmitt-Grohé, and
Uribe’s work on multiple equilibria under active Taylor rules, using two simple
models. While the type of results presented here is not new, the specific
modeling framework—Rotemberg price setting in discrete time—is new, and
it fits neatly into the frameworks typically used for applied monetary policy
analysis. Furthermore, we provide computer programs in the open source
software R to replicate all the results in the article. The programs are available
at www.richmondfed.org/research/economists/bios/wolman_bio.cfm.
Section 1 places the topic of this article in historical perspective. Section 2
shows the existence of multiple equilibria in a reduced-form model consisting
only of an active Taylor rule and a Fisher equation, assuming that the real
interest rate is exogenous and fixed. Section 3 describes the discrete-time
Rotemberg pricing model to be used in the remainder of the article. Steady-state equilibria and local dynamics are described in Section 4, and global
dynamics are described in Section 5. Section 6 concludes.

1. HISTORICAL CONTEXT

Multiple equilibria is a common theme in monetary economics, and has been
at least since the work of Brock (1975). On the theory side, there has been a
steady stream of work on multiple equilibria since the 1970s. In contrast, emphasis on multiple equilibria in applied monetary policy research has fluctuated
as new theoretical results have appeared, the tools of analysis have evolved,
and economic circumstances have changed. The immediate explanation for
why the theoretical results described in this article have attracted attention in
policy circles—10 years after those results first appeared—involves economic
circumstances, namely the existence of low inflation and near-zero nominal
interest rates in the United States. There is a longer history, however, that
also involves the ascent of interest rate feedback rules and linearized New
Keynesian models, and the accompanying focus on active Taylor rules as a
descriptive and prescriptive guide to central bank behavior.
Beginning with Bernanke and Blinder (1992), quantitative research on
monetary policy in the United States rapidly shifted from modeling monetary policy as controlling the money supply to modeling monetary policy as
controlling interest rates.2 At around the same time, Henderson and McKibbin
2 Bernanke and Blinder were not the first to suggest modeling monetary policy in terms of
interest rates. See for example McCallum (1983).

(1993) and Taylor (1993) influentially proposed particular rules for the conduct of monetary policy. These rules involved the policy rate (federal funds
rate in the United States) being set as a linear function of a small number
of endogenous variables, typically including inflation and some measure of
real activity. Henderson and McKibbin focused on the normative aspects of
interest rate rules, whereas Taylor also argued that what would become known
as the “Taylor rule” actually provided a reasonable description of short-term
interest rates in the United States from 1986–1992.
Just as Taylor rules were attracting more attention, another shift was occurring in the nature of quantitative research on monetary policy. Bernanke and
Blinder’s 1992 article had used vector autoregressions (VARs) for its empirical analysis and, in their policy analysis, Henderson and McKibbin employed
linear rational expectations models with some rule-of-thumb behavior. These
two approaches—VARs and linear rational expectations models—had become
standard in applied monetary economics for empirical analysis and policy
analysis, respectively. Beginning with Yun (1996), King and Wolman (1996),
and Woodford (1997), however, the tide shifted toward what Goodfriend and
King (1997) called New Neoclassical Synthesis (NNS) models. NNS models
represented a melding of real business cycle (RBC) methodology—dynamic
general equilibrium—with nominal rigidities and other market imperfections.
Nominal rigidities made the NNS models appealing frameworks for studying
monetary policy, and the RBC methodology meant that it was straightforward
to model the behavior of monetary policy as following a Taylor-style rule.
While NNS models, like RBC models, were fundamentally nonlinear, they
were typically studied using linear approximation. In linearized NNS models (as with their predecessors, the linear rational expectations models), the
question of existence and uniqueness of equilibrium generally was presumed
to be identical to the question of whether the model possessed unique stable
local dynamics in the neighborhood of the steady state around which one linearized.3 In turn, the nature of the local dynamics depended on the properties
of the interest rate rule. Although specific conditions can vary across models,
the results in Leeper (1991) and Kerr and King (1996) were the basis for a
useful rule of thumb in many monetary models: Taylor-style interest rate rules
were consistent with unique stable local dynamics only if the coefficient on
inflation was greater than one; a coefficient less than one would be consistent
with a multiplicity of stable local dynamics. Taylor rules with a coefficient
greater than one became known as active Taylor rules, and the rule of thumb
3 For example, see Blanchard and Kahn (1980) or King and Watson (1998). In many economic models, explosive paths for some variables are inconsistent with equilibrium. For example,
explosive paths for the capital stock can be inconsistent with a transversality condition (in nontechnical terms, consumers would be leaving money on the table), and explosive paths for real
money balances can violate the requirement of a nonnegative price level. See Obstfeld and Rogoff
(1983) for a discussion of these issues.

that active Taylor rules guaranteed a unique equilibrium became known as the
Taylor principle.4 Passive Taylor rules, in contrast, are Taylor rules with a
coefficient on inflation less than one.
Some intuition for the Taylor principle comes from the much earlier work
of Sargent and Wallace (1975) and McCallum (1981). Sargent and Wallace
showed that if the nominal interest rate is held fixed by the central bank,
then in many models expectations of future inflation will be pinned down,
but the current price level is left indeterminate. McCallum followed up by
showing that if the nominal interest rate responds to some nominal variable it
is also possible to pin down the price level. The Taylor principle states that
multiplicity can occur if the nominal interest rate does not respond strongly
enough to inflation, consistent with the message of Sargent and Wallace and
McCallum.
With widespread understanding of the Taylor principle came empirical
applications by Clarida, Gali, and Gertler (2000) and Lubik and Schorfheide
(2004). These authors argued that (i) violation of the Taylor principle could
help explain the macroeconomic instability of the 1970s, and (ii) a shift in
policy so that the Taylor principle did hold could help explain the subsequent
stability after 1982. Although this work brought multiple equilibria into the
mainstream of applied research on monetary policy, it proceeded under the
assumption that the local linear dynamics gave an accurate picture of the nature
of equilibrium. These articles also helped to cement the idea that the Taylor
principle characterized “good” monetary policy, because the Taylor principle
would guarantee that inflation stayed on target.
Beginning with their 2001a article, Benhabib, Schmitt-Grohé, and Uribe
(BSU) showed that when there is a lower bound on nominal interest rates,
the local dynamics can be misleading about the uniqueness of equilibrium
when monetary policy is described by an active Taylor rule. The details of
BSU’s argument will become clear below. The rough intuition is as follows.
Arguments for (local) uniqueness of equilibrium with active Taylor rules posit
that without shocks, the model has a unique equilibrium at the inflation rate
targeted by the interest rate rule. Any other candidate solutions to the model
equations would have the inflation rate exploding to plus or minus infinity,
or oscillating explosively. But many of these explosive paths would violate
the lower bound on the nominal interest rate. When that bound is imposed
and the model is studied nonlinearly, it becomes clear that (i) there is a second steady-state equilibrium at a lower inflation rate, and (ii) there are many
4 Note that Leeper (1991) emphasizes that an active rule guarantees uniqueness only in conjunction with an assumption about fiscal policy, specifically that fiscal policy takes care of balancing
the government budget. We maintain that assumption here. Benhabib, Schmitt-Grohé, and Uribe
(2002) discuss the implication of alternative assumptions about fiscal policy for multiple equilibria
induced by the zero bound on nominal interest rates.

non-steady-state equilibria in which the inflation rate converges to the low-inflation steady state in the long run.
Initially, while the articles by BSU were widely cited, they did not attract much attention in policy circles. This is somewhat surprising because
the articles were showing that a policy advocated in large part because it was
believed to deliver a unique equilibrium actually delivered multiple equilibria
in some models! Furthermore, a rule that violated the Taylor principle—a
passive rule—would actually be consistent with keeping inflation close to its
targeted value, even though there could be multiple equilibria with this property. Recently however, the results in BSU have attracted substantial attention
in policy circles. The simultaneous occurrence of low inflation and low nominal interest rates in the United States is suggestive of some of the equilibria
identified by BSU, so it is natural to wonder whether we are experiencing
outcomes associated with those global equilibria. Policymakers care about
this because the global equilibria involve average inflation below its intended
level.

2. A SIMPLE FRAMEWORK WITH ONLY NOMINAL VARIABLES
As a simple framework for communicating some of the key ideas in BSU, this
section works through a two-equation model of the nominal interest rate and
inflation. That minimal structure is sufficient to illustrate the potential for the
local and global dynamics to diverge when monetary policy is given by an
active Taylor rule.
Assume that the real interest rate is exogenous and fixed, rt = r, whereas
the nominal interest rate (Rt ) and the inflation rate (π t ) are endogenous.5
Expectations are rational. The model consists of a Fisher equation relating
the short-term nominal interest rate to the short-term real interest rate and
expected inflation,
$$R_t = r E_t \pi_{t+1}, \tag{1}$$

and a rule specifying how the central bank sets the nominal interest rate—in
this case as a function only of the current inflation rate, with an inflation target
of π ∗ :


$$R_t = 1 + \left(R^* - 1\right)\left(\pi_t/\pi^*\right)^{\gamma}, \tag{2}$$
where
$$R^* = r\pi^*; \tag{3}$$

5 Throughout the article, interest rates and inflation rates are measured in gross terms—that
is, a 4 percent nominal interest rate would be written as Rt = 1.04.

that is, the targeted nominal interest rate is the one that is implied by the
steady-state Fisher equation when inflation is equal to its target.
The interest rate rule in (2) may look unfamiliar relative to standard linear
Taylor rules. We use the nonlinear rule because it will simplify the analysis in
the second part of the article.6 Furthermore, the linear approximation to the
rule in (2) around $\{R^*, \pi^*\}$ is
$$R_t - R^* = \gamma\left(\frac{R^* - 1}{\pi^*}\right)\left(\pi_t - \pi^*\right), \tag{4}$$
a simple inflation-only Taylor rule in which the coefficient on inflation is $\gamma(R^*-1)/\pi^*$, and we assume that $\gamma(R^*-1)/\pi^* > r > 1$. The standard local-linear approach around the point $\{R^*, \pi^*\}$ involves combining the linearized Taylor rule (4) with the linearized Fisher equation ($R_t - R^* = (R^*/\pi^*)E_t(\pi_{t+1} - \pi^*)$), which yields an expectational difference equation in inflation:
$$E_t\pi_{t+1} - \pi^* = \gamma\left(\frac{R^* - 1}{R^*}\right)\left(\pi_t - \pi^*\right).$$
For simplicity, assume perfect foresight—that is, the future is known with certainty, so that $E_t(\pi_{t+1} - \pi^*)$ can be replaced with $\pi_{t+1} - \pi^*$. Perfect
foresight is clearly an unrealistic assumption, but it is a convenient one for
illustrating the difference between local and global dynamics. With perfect
foresight, we have

$$\pi_{t+1} - \pi^* = \gamma\left(\frac{R^* - 1}{R^*}\right)\left(\pi_t - \pi^*\right). \tag{5}$$
By assumption the coefficient on $\pi_t - \pi^*$ is greater than one—the rule obeys the Taylor principle. Consequently, we can show that there is a unique nonexplosive equilibrium. Constant inflation at the targeted steady-state level ($\pi_t = \pi^*$) is clearly an equilibrium because it represents a solution to the difference equation (5). If inflation in period t were equal to any number other than $\pi^*$, inflation would have to follow an explosive path going forward because the coefficient on current inflation is greater than one. Any such explosive path would be ruled out as an equilibrium by assumption in the standard local-linear approach.7
6 Imposing the zero bound on an otherwise linear rule creates a nondifferentiability, making
computation more difficult.
7 Since the model here is itself ad-hoc, we cannot complain about ruling out explosive paths
as equilibria by assumption. Depending on the particular model, explosive paths up or down may
or may not be equilibria—see footnote 3. What is important here is that the ad-hoc model we
wrote down is nonlinear, and the nonlinear analysis yields different conclusions about equilibrium
than the linear analysis.

Figure 1 Steady-State Equilibria

[Figure: the nominal interest rate from the Taylor rule and the nominal interest rate from the Fisher equation, each plotted against the inflation rate (roughly 0.99 to 1.03); the two intersections are the steady-state equilibria.]

Steady-State Equilibria
It is obvious that $\{R^*, \pi^*\}$ represents a steady-state solution to the Fisher and Taylor equations ([1] and [2]). Less obviously, there is also a second steady-state solution with a lower inflation rate and a lower nominal interest rate. To
see this, combine the steady-state Fisher and Taylor equations into a single
equation in π:
$$\pi = r^{-1}\left[1 + \left(R^* - 1\right)\left(\pi/\pi^*\right)^{\gamma}\right]. \tag{6}$$
Figure 1 displays a plot of the right-hand side of (6) (essentially the Taylor
rule) against the 45-degree line—which is also the left-hand side, or the Fisher
equation. The two intersections of the right-hand side and left-hand side represent the two steady-state equilibria. The targeted inflation rate is 2 percent,
and the other steady state involves slight deflation.
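
The two intersections in Figure 1 are easy to reproduce numerically. The sketch below (ours, not the authors' posted replication programs) solves the fixed-point equation (6) with base R's uniroot; the real rate and γ are assumed values chosen so that the target is 2 percent and the second steady state involves slight deflation, as in the figure.

```r
## Equation (6): pi = (1/r) * [1 + (R* - 1) * (pi/pi*)^gamma], with R* = r * pi*.
## Assumed parameters for illustration.
r       <- 1.01    # gross real interest rate
pi.star <- 1.02    # gross targeted inflation rate
gamma   <- 90

rhs <- function(p) (1 / r) * (1 + (r * pi.star - 1) * (p / pi.star)^gamma)
gap <- function(p) rhs(p) - p            # steady states are zeros of the gap

low.ss  <- uniroot(gap, c(0.95, 1.00))$root  # second steady state: slight deflation
high.ss <- uniroot(gap, c(1.00, 1.03))$root  # targeted steady state, equal to pi.star
c(low.ss, high.ss)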
The specific Taylor rule we chose for this example never allows the nominal interest rate to hit the zero bound. Alternatively, if we had chosen a typical linear Taylor rule ($R_t = \max\{R^* + f(\pi_t - \pi^*), 0\}$), there would be a kink in the steady-state Taylor curve at $\pi = 1/r$, and the second steady state would be at $\pi = \pi^* - (1/f)R^*$. BSU (2001a) and Bullard (2010) contain pictures of the analogues to Figure 1 implied by several different interest rate rules.

Figure 2 Example of a Non-Steady-State Equilibrium

[Figure: inflation in t + 1 plotted against inflation in period t, together with the 45-degree line; the steady state with the targeted inflation rate and the second steady state are marked at the intersections, with arrows tracing an equilibrium path between them.]

All of these rules satisfy the Taylor principle at the targeted steady state, and all imply the existence of a second steady state with lower inflation.

Example of a Non-Steady-State Equilibrium
The fact that there are two steady-state equilibria suggests that there may also
be equilibria in which inflation and nominal interest rates fluctuate. Returning
now to the nonlinear model, by combining the Fisher equation (1) and the
interest rate rule (2) and imposing perfect foresight, we have a first-order
difference equation for the inflation rate:



$$\pi_{t+1} = r^{-1}\left[1 + \left(R^* - 1\right)\left(\pi_t/\pi^*\right)^{\gamma}\right]. \tag{7}$$
This is the nonlinear analogue of (5). In contrast to the linearized model, we
can show that there is a continuum of nonexplosive equilibria.8 In Figure 2
we plot the right-hand side of (7): It is an identical curve to the solid line in
8 Note the sensitivity of this result to whether current or (expected) future inflation is the argument in the policy rule. If the policy rule responds to $\pi_{t+1}$ instead of $\pi_t$, then the same two steady-state equilibria exist; but the system is entirely static and, under perfect foresight, the two steady-state equilibria are also the only two equilibrium values for inflation in any period. The “economy” can bounce arbitrarily between those two values in a deterministic way. There may also be rational expectations equilibria with stochastic fluctuations.

Figure 1. The dotted line is the 45-degree line, which is also the left-hand
side of (7). The intersections between the two lines are the steady states and,
starting with any initial inflation rate below the targeted steady state, we can
trace an equilibrium path using the solid line and the 45-degree line. For
example, from an initial inflation rate of 1.014, the vertical solid lines with
arrows pointing down indicate the successive values of inflation going forward.
Generalizing from this example, the figure shows that all perfect foresight
equilibria except for the targeted steady state converge to the nontargeted
steady state. In contrast, the conventional local linear approach applied to the
targeted steady state would conclude that the targeted steady state was the only
equilibrium—other solutions are locally explosive and would be ruled out by
assumption. Figure 2 conveys the essence of the literature that began with
BSU (2001a): Local analysis suggests a unique equilibrium, whereas global
analysis reveals that many solutions ruled out as explosive instead lead to a
second steady-state equilibrium.
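
The path in Figure 2 can be traced by simply iterating (7) forward. This sketch reuses r, pi.star, gamma, and low.ss from the previous block; the initial inflation rate of 1.014 is the one used in the figure.

```r
## Iterate equation (7) forward under perfect foresight.
trace.path <- function(pi0, n = 60) {
  path <- numeric(n)
  path[1] <- pi0
  for (t in 2:n)
    path[t] <- (1 / r) * (1 + (r * pi.star - 1) * (path[t - 1] / pi.star)^gamma)
  path
}
path <- trace.path(1.014)
plot(path, type = "o", xlab = "Period", ylab = "Gross inflation rate")
abline(h = c(low.ss, pi.star), lty = 2)  # the path settles at the low steady state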
Because the qualitative results involving a second steady state and multiple
equilibria will carry over into the model with an endogenous real interest
rate and endogenous output, it is interesting to discuss the economics behind
these results. In a neighborhood of the targeted steady state, the interest rate
rule responds to an upward (downward) deviation of inflation from target
by moving the interest rate upward (downward) more than proportionally.
This sets off a locally explosive chain: The Fisher equation (1) dictates that
an increase in the current nominal interest rate must correspond to a higher
future inflation rate, which then is met with a further increase in next period’s
interest rate, etc. One notable aspect of this process is that there is no sense
in which a higher nominal interest rate represents “tighter” monetary policy.
The model has only nominal variables, and a higher nominal interest rate must
correspond to higher expected inflation. In contrast, the Taylor principle is
often thought of as ensuring that an increase in inflation is met with a monetary
tightening, as represented by a higher nominal interest rate. In models with
real effects of monetary policy—such as the one discussed below—an increase
in the nominal interest rate does not have to correspond to higher expected
inflation. However, we have learned from the two-equation model that this
association of higher interest rates with tight monetary policy is not an inherent
ingredient in the local uniqueness and global multiplicity associated with the
Taylor principle.9
9 See Cochrane (2011) for a similar argument.

3. A MODEL WITH REAL VARIABLES AND MONETARY NONNEUTRALITY
The model above taught us that the Fisher equation together with a Taylor rule
that responds strongly to inflation can lead to multiple steady states and other
equilibria because of the lower bound on nominal interest rates. However,
the only endogenous variables in that model are nominal variables. One of
the simplest ways to endogenize real variables and introduce real effects of
monetary policy is with a version of the Rotemberg (1982) model, which has
quadratic costs of nominal price adjustment. In this model, there is a representative household that takes all prices and aggregate quantities as given, and
chooses how much to consume and how much to work. There is a continuum of monopolistically competitive firms that face convex costs of adjusting
their nominal prices, and there is a monetary authority that sets the short-term
nominal interest rate according to a time-invariant feedback rule.
The representative household has preferences over consumption (ct ) and
(disutility of) labor (ht ) given by
$$\sum_{t=0}^{\infty} \beta^t\left(\ln(c_t) - \chi h_t\right). \tag{8}$$

There is a competitive labor market in which the real wage is wt per unit of
time. The consumption good is a composite of a continuum of differentiated
products ($c_t(z)$), each of which is produced under monopolistic competition:
$$c_t = \left(\int_0^1 c_t(z)^{\frac{\varepsilon - 1}{\varepsilon}}\, dz\right)^{\frac{\varepsilon}{\varepsilon - 1}}. \tag{9}$$

Households own the firms. An individual household’s budget constraint is
$$c_t + R_t^{-1} B_t/P_t = w_t h_t + B_{t-1}/P_t + D_t/P_t, \tag{10}$$
where $D_t$ represents nominal dividends from firms, $P_t$ is the price of the composite good, and $B_t$ is the quantity of one-period nominal discount bonds. As
above, Rt is the gross nominal interest rate. The household’s intratemporal
first-order conditions representing optimal choice of labor input and consumption are given by
$$\lambda_t w_t = \chi, \tag{11}$$
and
$$\lambda_t = 1/c_t, \tag{12}$$
and the intertemporal first-order condition representing optimal choice of bondholdings is given by
$$\frac{\lambda_t}{P_t}\, R_t^{-1} = \beta\, \frac{\lambda_{t+1}}{P_{t+1}}. \tag{13}$$

In these equations, the variable λt is the Lagrange multiplier on the budget
constraint for period t—it can also be thought of as the marginal utility of
an additional unit of consumption at time t. Note that the intertemporal first-order condition (13) corresponds to the Fisher equation from the first model, with the real interest rate now endogenous and given by
$$r_t = \beta^{-1}\,\frac{c_{t+1}}{c_t}.$$
 
Firms face a cost $\xi_t(z)$ in terms of final goods of changing the nominal price of the good they produce:
$$\xi_t(z) = \frac{\theta}{2}\left(\frac{P_t(z)}{P_{t-1}(z)} - 1\right)^2. \tag{14}$$
Because goods are produced both for consumption and for accomplishing
price adjustment, the market-clearing condition is
$$y_t = c_t + \frac{\theta}{2}\left(\pi_t - 1\right)^2, \tag{15}$$

where $y_t$ denotes total output of the composite good, $\pi_t$ denotes the gross inflation rate ($P_t/P_{t-1}$), and we have imposed symmetry across firms, meaning that all firms choose the same price.
An individual firm chooses its price each period to maximize the expected
present value of profits, where profits in any single period are given by revenue
minus costs of production minus costs of price adjustment. The demand
curve facing each firm is $y_t(z) = (P_t(z)/P_t)^{-\varepsilon} y_t$, so the profit maximization problem for firm z is
$$\max_{\{P_{t+j}(z)\}} \sum_{j=0}^{\infty} \beta^j\,\frac{\lambda_{t+j}}{\lambda_t}\left[\frac{P_{t+j}(z)}{P_{t+j}}\left(\frac{P_{t+j}(z)}{P_{t+j}}\right)^{-\varepsilon} y_{t+j} - w_{t+j}\left(\frac{P_{t+j}(z)}{P_{t+j}}\right)^{-\varepsilon} y_{t+j} - \frac{\theta}{2}\left(\frac{P_{t+j}(z)}{P_{t+j-1}(z)} - 1\right)^2\right].$$
The first term in the square brackets is the real revenue a firm earns charging a price $P_{t+j}(z)$ in period $t+j$; it sells $(P_{t+j}(z)/P_{t+j})^{-\varepsilon} y_{t+j}$ units of goods for relative price $P_{t+j}(z)/P_{t+j}$. The second term in the square brackets is the real cost the firm incurs in period $t+j$: the number of goods sold multiplied by average cost, which is equal to marginal cost and to the real wage because labor productivity is constant and equal to one. Finally, the third term in the square brackets is the real cost of adjusting the nominal price from $P_{t+j-1}(z)$ to $P_{t+j}(z)$. Note that the price chosen in any period shows up in only two periods of the infinite sum. Thus, the part of the objective function relevant for the choice of a price in period $t$

328
is

Federal Reserve Bank of Richmond Economic Quarterly





Pt (z) −ε
Pt (z) Pt (z) −ε
yt − w t
yt
Pt
Pt
Pt

2
2

 
θ
λt+1 θ Pt+1 (z)
Pt (z)
−
−1 −β
−1 .
2 Pt−1 (z)
λt
2
Pt (z)

The first-order condition is
$$(1-\varepsilon)\frac{1}{P_t}\left(\frac{P_t(z)}{P_t}\right)^{-\varepsilon} y_t + \varepsilon w_t\frac{1}{P_t}\left(\frac{P_t(z)}{P_t}\right)^{-\varepsilon-1} y_t - \theta\frac{1}{P_{t-1}(z)}\left(\frac{P_t(z)}{P_{t-1}(z)} - 1\right) + \beta\,\frac{\lambda_{t+1}}{\lambda_t}\,\theta\,\frac{P_{t+1}(z)}{P_t(z)^2}\left(\frac{P_{t+1}(z)}{P_t(z)} - 1\right) = 0.$$
If we multiply both sides by $P_t$ and impose symmetry—that is, assume that all firms choose the same price in any given period—the expression simplifies to
$$(1 - \varepsilon)y_t + \varepsilon w_t y_t - \theta\pi_t(\pi_t - 1) + \beta\,\frac{\lambda_{t+1}}{\lambda_t}\,\theta\pi_{t+1}(\pi_{t+1} - 1) = 0.$$
Using the goods market clearing condition (15) and the household’s optimality conditions, the previous equation simplifies to a form that we will refer
to as the New Keynesian Phillips Curve:10


$$(\pi_t - 1)\pi_t = \left(1 - \varepsilon + \chi\varepsilon c_t\right)\left[\frac{c_t}{\theta} + \frac{(\pi_t - 1)^2}{2}\right] + \beta E_t\left[\frac{c_t}{c_{t+1}}\,(\pi_{t+1} - 1)\pi_{t+1}\right], \tag{16}$$
where $\pi_t$ is the gross inflation rate.
Finally, monetary policy is given by a nominal interest rate rule similar
to what was used in the two-equation model, with the one difference that the
interest rate responds to expected future inflation instead of to current inflation:


$$R_t = 1 + \left(\pi^*/\beta - 1\right)\left(\pi_{t+1}/\pi^*\right)^{\gamma}. \tag{17}$$
Recall that in the two-equation model, using a policy rule identical to (17)
would render the model entirely static, whereas the rule that responds to current
inflation introduces dynamics. In the current model, optimal pricing already
introduces dynamics, so we choose to use the future-inflation version of the
policy rule.11 Combining the policy rule with the household’s intertemporal
10 We should note that the term “New Keynesian Phillips Curve” typically refers to the linearized version of (16).
11 Note that with current inflation in the policy rule, the steady states do not change and it would be possible to study dynamic equilibria in the same way we do here—tentative results suggest that qualitatively similar results apply with current inflation in the policy rule. Our approach in this article is positive rather than normative. For a policymaker choosing a rule, whether multiple equilibria arise would be one important consideration in that choice.

first-order condition (13), using the definition of the inflation rate to eliminate
the price level, and using the household’s intratemporal first-order condition
(12) to eliminate λ, we have

$$\frac{\pi_{t+1} c_{t+1}}{c_t} = \beta\left[1 + \left(\pi^*/\beta - 1\right)\left(\pi_{t+1}/\pi^*\right)^{\gamma}\right]. \tag{18}$$
The model has now been reduced to two nonlinear difference equations, (16) and (18), in the variables $c_t$, $\pi_t$, $c_{t+1}$, and $\pi_{t+1}$.

4. LOCAL DYNAMICS AROUND STEADY-STATE EQUILIBRIA

As with the ad-hoc model in Section 2, there are two steady-state equilibria.
That there are two steady-state equilibrium inflation rates is immediately apparent from (18)—in a steady state it is identical to (6). One of the steady
states has inflation equal to the targeted inflation rate π ∗ , and the other steady
state has a lower inflation rate.12 The steady-state levels of consumption are
determined by (16).
To study dynamic equilibria, we follow the same steps as in the two-equation model, beginning with the linearized model and then moving on to
the exact nonlinear model. The two dynamic equations (16) and (18) can be
represented as
$$\begin{bmatrix} F(c_t, c_{t+1}, \pi_t, \pi_{t+1}) \\ G(c_t, c_{t+1}, \pi_{t+1}) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$
where
$$F(c_t, c_{t+1}, \pi_t, \pi_{t+1}) = (\pi_t - 1)\pi_t - \left(1 - \varepsilon + \chi\varepsilon c_t\right)\left[\frac{c_t}{\theta} + \frac{(\pi_t - 1)^2}{2}\right] - \beta\,\frac{c_t}{c_{t+1}}\,(\pi_{t+1} - 1)\pi_{t+1}$$
and
$$G(c_t, c_{t+1}, \pi_{t+1}) = \pi_{t+1} c_{t+1} - \beta c_t\left[1 + \left(\pi^*/\beta - 1\right)\left(\pi_{t+1}/\pi^*\right)^{\gamma}\right].$$
12 This statement relies again on γ being sufficiently large. In contrast, for low enough γ such that $R'(\pi^*) < 1$, the second steady state will involve inflation higher than π∗.

Table 1 Parameter Values

β     0.99
ε     6
θ     17.5
χ     5
γ     90
π∗    1.005

Linearizing around the steady state with the targeted inflation rate (denoted $[c^*, \pi^*]$) yields
$$\begin{bmatrix} F_2(c^*, c^*, \pi^*, \pi^*) & F_4(c^*, c^*, \pi^*, \pi^*) \\ G_2(c^*, c^*, \pi^*) & G_3(c^*, c^*, \pi^*) \end{bmatrix}\begin{bmatrix} c_{t+1} - c^* \\ \pi_{t+1} - \pi^* \end{bmatrix} \approx -\begin{bmatrix} F_1(c^*, c^*, \pi^*, \pi^*) & F_3(c^*, c^*, \pi^*, \pi^*) \\ G_1(c^*, c^*, \pi^*) & 0 \end{bmatrix}\begin{bmatrix} c_t - c^* \\ \pi_t - \pi^* \end{bmatrix}, \tag{19}$$
where $H_j(s)$ denotes the jth partial derivative of the generic function $H(\cdot)$, evaluated at $s$.
The existence and uniqueness of a nonexplosive equilibrium in the linearized model depends on the eigenvalues of the Jacobian matrix $J$, given by
$$J = -\begin{bmatrix} F_2(\cdot) & F_4(\cdot) \\ G_2(\cdot) & G_3(\cdot) \end{bmatrix}^{-1}\begin{bmatrix} F_1(\cdot) & F_3(\cdot) \\ G_1(\cdot) & 0 \end{bmatrix}.$$

Neither $c_t$ nor $\pi_t$ is a predetermined variable, so the condition for a unique nonexplosive equilibrium is that both eigenvalues of $J$ be less than one in absolute value. Because we are not able to provide a general proof of the parameter conditions under which equilibrium exists and is unique, we turn to a numerical example, which we will stay with for the rest of the article.13 Table 1 contains the parameters for that example; they are chosen to be consistent with a 2 percent annual inflation target (the model is a quarterly model), a 4 percent real interest rate, a markup of 20 percent, and a coefficient in the Taylor rule of 1.33 when the Taylor rule is linearized around the targeted steady state. In addition, our choice of θ implies that price adjustment costs are less than 2/10 of a percent of output.
At the targeted steady state, the local (nonexplosive) dynamics are unique,
in a trivial sense. The Jacobian’s eigenvalues are 0.99771321 ± 0.12791602i,
which means that both eigenvalues have absolute value 1.0059. Local to the
13 If the targeted inflation rate were zero (π ∗ = 1) then it would be straightforward to
characterize uniqueness conditions analytically—this is the standard New Keynesian Phillips Curve.
With a nonzero inflation target there are price-adjustment costs incurred in steady state, and the
analysis is less straightforward.

targeted steady state, the fact that both eigenvalues are complex and have absolute value greater than one means that any solution to the difference equation system (19) other than the steady state itself oscillates explosively. In the linearized model the local dynamics are the global dynamics, so the only nonexplosive solution is the targeted steady state itself.
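
These eigenvalue calculations can be reproduced with a short R sketch (ours, not the authors' posted programs): code (16) and (18) as residual functions under the Table 1 parameters, solve for the steady states, and differentiate numerically. The bracketing intervals handed to uniroot are assumptions that happen to work for this calibration.

```r
## Table 1 parameters.
beta <- 0.99; eps <- 6; theta <- 17.5; chi <- 5; gamma <- 90; pi.star <- 1.005

F.eq <- function(x) {   # equation (16); x = c(c_t, c_{t+1}, pi_t, pi_{t+1})
  (x[3] - 1) * x[3] -
    (1 - eps + chi * eps * x[1]) * (x[1] / theta + (x[3] - 1)^2 / 2) -
    beta * (x[1] / x[2]) * (x[4] - 1) * x[4]
}
G.eq <- function(x) {   # equation (18); does not involve pi_t
  x[4] * x[2] - beta * x[1] * (1 + (pi.star / beta - 1) * (x[4] / pi.star)^gamma)
}

## Steady-state inflation solves pi = beta * [1 + (pi*/beta - 1)(pi/pi*)^gamma];
## given pi, steady-state consumption solves F = 0.
pi.low <- uniroot(function(p)
  beta * (1 + (pi.star / beta - 1) * (p / pi.star)^gamma) - p, c(0.95, 1.00))$root
ss.c <- function(p) uniroot(function(cc) F.eq(c(cc, cc, p, p)), c(0.15, 0.18))$root

num.grad <- function(f, x, h = 1e-7) {  # central-difference gradient
  sapply(seq_along(x), function(i) {
    up <- x; dn <- x
    up[i] <- up[i] + h; dn[i] <- dn[i] - h
    (f(up) - f(dn)) / (2 * h)
  })
}
jacobian.at <- function(p) {
  x  <- c(ss.c(p), ss.c(p), p, p)
  gF <- num.grad(F.eq, x); gG <- num.grad(G.eq, x)
  A  <- rbind(gF[c(2, 4)], gG[c(2, 4)])   # derivatives wrt (c_{t+1}, pi_{t+1})
  B  <- rbind(gF[c(1, 3)], gG[c(1, 3)])   # derivatives wrt (c_t, pi_t)
  -solve(A) %*% B
}
eigen(jacobian.at(pi.star))$values   # complex pair with modulus near 1.006
eigen(jacobian.at(pi.low))$values    # roughly 1.13 and 0.90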
Suppose instead that we linearize around the low-inflation steady state.
There the Jacobian’s eigenvalues are 1.1291231 and 0.89509305. This eigenvalue configuration, with one explosive root and one stable root (less than
one), means that there is a saddlepath: Given an initial value for c (or an
initial value for π ), there is a unique initial value for π (or for c) such that the
economy will converge from that point to the steady state with low inflation.
If either inflation or consumption were predetermined variables, then this saddlepath would describe the unique equilibrium at any point in time. Because
neither variable is predetermined, the saddlepath represents one dimension of
equilibrium indeterminacy at any point in time. That is, any value of c (or π)
is consistent with equilibrium in period t, but as was stated above, once that
value of c (or π ) has been selected, the associated value of π (or c) is pinned
down, as is the entire subsequent equilibrium path.14
The conventional linearization approach to studying NNS models, as followed, for example, by King and Wolman (1996), involves implicitly ignoring
the steady state with low inflation. In that approach it is presumed that the
only relevant steady state is the targeted one. From the same kind of reasoning
used in the discussion following (5), the explosiveness of paths local to the
targeted steady state means there is a unique nonexplosive equilibrium, the
steady state itself. One can then proceed to study the properties of the model
when subjected to shocks, for example to productivity or monetary policy.
However, the fact that there are two steady states suggests that it may be revealing to investigate the global dynamics. Furthermore, if one extrapolates
the local dynamics around the two steady states, it leads to the conjecture that
paths that explode locally from the targeted steady state may in fact end up as
stable paths converging to the low-inflation steady state. This is indeed what
we will find in studying the global dynamics.

5. GLOBAL DYNAMICS

Studying the model’s global dynamics means analyzing the nonlinear equations ([18] and [16]). We will combine the nonlinear equations with information about the local dynamics to trace out the global stable manifold of the
low-inflation steady state. The global stable manifold is the set of inflation and
14 Because we are dealing here with perfect foresight paths, the discussion of period t really should apply only to an initial period, prior to which the perfect foresight assumption does not apply. After that initial period the equilibrium outcomes are unique.

consumption combinations such that if inflation and consumption begin in that
set, there is an equilibrium path that leads in the long run to the low-inflation
steady state. While this approach may not yield a comprehensive description
of the perfect foresight equilibria, it will provide a coherent picture of how
the two steady states relate to the dynamic behavior of consumption and inflation.15 We will find that the local saddlepath can be understood as part of a path
(the global stable manifold) that begins arbitrarily close to the targeted steady
state and cycles around that steady state with greater and greater amplitude
before converging monotonically to the low-inflation steady state.

From Local to Global
Before plunging into the global dynamics, it may be helpful to take stock of
our knowledge. There are two steady-state equilibria, one with the targeted
inflation rate (π ∗ ) and one with a lower inflation rate (π l ). The levels of
consumption in the two steady states are c∗ and cl . Local to the targeted steady
state, all dynamic paths oscillate explosively. Local to the low inflation steady
state many paths explode and one path converges to that steady state. To go
further, we will combine the forward dynamics local to the low inflation steady
state with the nonlinear backward dynamics. This approach will allow us to
compute the global stable manifold of the low-inflation steady state. Since all
paths diverge around the targeted steady state, no analogous approach can be
applied there.
As described above, the local dynamics around {cl , π l } involve a unique
path in {c, π } space that converges to the steady state. If we begin with a point
on that path, very close to the low-inflation steady state, and then iterate the
nonlinear system backward, we can trace out the global dynamics associated
with the saddlepath—the global stable manifold. We now describe this process
algorithmically; a short code sketch follows the steps.
1. To find a point on the local saddlepath of the low-inflation steady state, follow the approach described in Blanchard and Kahn (1980). First, decompose the Jacobian matrix J into its Jordan form: $J = P\Lambda P^{-1}$, where $\Lambda$ is a diagonal 2 × 2 matrix whose diagonal elements are the eigenvalues of J, and where P is a 2 × 2 matrix whose columns are the eigenvectors of J. Next, rewrite the system in terms of canonical variables $x_{1,t}$ and $x_{2,t}$, which are linear combinations of $c_t$ and $\pi_t$: $[x_{1,t}\;\; x_{2,t}]' = P^{-1}[c_t - c_l\;\; \pi_t - \pi_l]'$. The system is
$$\begin{bmatrix} x_{1,t+1} \\ x_{2,t+1} \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}. \tag{20}$$
15 While we have not proved that the global stable manifold contains all perfect foresight
equilibria, we conjecture this to be the case.

Note that at the steady state $\{c_l, \pi_l\}$, we have $x_{1,l} = x_{2,l} = 0$. Recall that one of the roots $(\lambda_1, \lambda_2)$ is greater than one. Without loss of generality, assume that $\lambda_1 > 1$. Any point on the local saddlepath must have $x_{1,t} = 0$, because $x_{1,t+j} = \lambda_1 x_{1,t+j-1}$, and if $x_{1,t} \neq 0$ then $x_{1,t+j}$ could not approach 0 as $j \to \infty$. Select one such point within an ε ball of the low-inflation steady state and call that point $\{c_T, \pi_T\}$. Set $t = T$.
2. From (18) we have
$$c_{t-1} = \frac{c_t}{\beta}\left[\frac{\pi_t}{1 + \left(\pi^*/\beta - 1\right)\left(\pi_t/\pi^*\right)^{\gamma}}\right].$$

3. Compute $\pi_{t-1}$ by solving (16):
$$\left[1 - \frac{1}{2}\left(1 - \varepsilon(1 - \chi c_{t-1})\right)\right]\pi_{t-1}^2 - \varepsilon(1 - \chi c_{t-1})\,\pi_{t-1} - \left[\left(1 - \varepsilon(1 - \chi c_{t-1})\right)\left(\frac{1}{2} + \frac{c_{t-1}}{\theta}\right) + \beta\,\frac{c_{t-1}}{c_t}\,(\pi_t - 1)\pi_t\right] = 0. \tag{21}$$
With $c_{t-1}$, $c_t$, and $\pi_t$ all known, (21) is a quadratic equation in $\pi_{t-1}$. The presence of two solutions is rooted in the properties of the firm's profit-maximization problem—while there is a unique profit-maximizing price, there are multiple solutions to the first-order condition. Only the positive root of the quadratic is consistent with the firm maximizing profits—the negative root typically implies a negative gross inflation rate, which would imply a negative price level.
rate, which would imply a negative price level.
4. Set t = t − 1, return to step 2.
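
A sketch of the algorithm in R, reusing the objects from the previous sketch. The size and sign of the initial step inside the ε ball are illustrative choices; flipping the sign of `ball` selects the other branch of the saddlepath.

```r
## Trace the global stable manifold by iterating (18) and (21) backward
## from a point on the local saddlepath of the low-inflation steady state.
backward.path <- function(n = 450, ball = 1e-4) {
  c.low <- ss.c(pi.low)
  ev    <- eigen(jacobian.at(pi.low))
  v     <- Re(ev$vectors[, which.min(abs(ev$values))])  # stable eigenvector
  state <- c(c.low, pi.low) + ball * v / sqrt(sum(v^2)) # step 1: epsilon-ball point
  path  <- matrix(NA_real_, n, 2, dimnames = list(NULL, c("c", "pi")))
  path[n, ] <- state
  for (s in (n - 1):1) {
    ct <- state[1]; pit <- state[2]
    ## Step 2: consumption one period earlier, from (18).
    c.lag <- (ct / beta) * pit / (1 + (pi.star / beta - 1) * (pit / pi.star)^gamma)
    ## Step 3: inflation one period earlier, the larger root of the quadratic (21).
    A  <- 1 - eps * (1 - chi * c.lag)
    qa <- 1 - A / 2
    qb <- A - 1
    qc <- -(A * (0.5 + c.lag / theta) + beta * (c.lag / ct) * (pit - 1) * pit)
    pi.lag <- (-qb + sqrt(qb^2 - 4 * qa * qc)) / (2 * qa)
    state  <- c(c.lag, pi.lag)                          # step 4: move back one period
    path[s, ] <- state
  }
  path
}
manifold <- backward.path()
plot(manifold[, "c"], manifold[, "pi"], type = "l",
     xlab = "Consumption", ylab = "Inflation")          # compare with Figure 3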
Figure 3 describes the results of iterating backward for 450 periods in
steps 2 through 4. The figure is in c, π space. It plots the two steady states
and the global stable manifold of the low-inflation steady state, constructed as
just described. The arrows represent forward movement in time, as opposed
to the backward movement that characterizes the algorithm. The algorithm
starts at a point close to the low-inflation steady state and goes backward in
time. The figure shows that the only path that converges to a steady-state
equilibrium initially involves spirals around the targeted steady state and ends
with monotonic convergence to the low-inflation steady state. The figure
provides us with a unified understanding of the local results around the two
steady states. From the local dynamics we learn that all paths local to the
targeted steady state oscillate explosively. From Figure 3, we see that one of
those paths is not globally explosive, instead converging to the low-inflation
steady state. This path is what we refer to as the global stable manifold.

Figure 3 Global Stable Manifold of Low-Inflation Steady State

[Figure: the global stable manifold plotted in (consumption, inflation) space, along with the Δπ_t = 0 and Δc_t = 0 loci; the low-inflation steady state and the targeted-inflation steady state are marked. Consumption runs roughly from 0.1655 to 0.1685 and inflation from 0.995 to 1.010.]

6. CONCLUSION

Since late 2008, both inflation and nominal interest rates have been extremely
low in the United States. These facts have focused attention on ideas motivated
by the theory in BSU (2001a, 2001b, 2002): An active Taylor rule, together
with a moderate inflation target, could have the unintended consequence of
leading the economy to undesirably low inflation with a near-zero nominal
interest rate. The article by St. Louis Federal Reserve Bank President James
Bullard (2010) represents the leading example of this attention.
The aim of this article was to provide an accessible introduction to the ideas
in BSU (2001a). Much of the literature in this area uses models that are either
set in continuous time or that assume prices are flexible. In contrast, the model
in this article is set in discrete time and has sticky prices. Discrete time reduces
mathematical tractability, but makes it easy to compute specific solutions; in
addition, the quantitative literature on monetary policy overwhelmingly uses
discrete time models. Sticky prices are also a central element in the applied
monetary policy literature. In adapting BSU’s analysis to a discrete-time
framework with sticky prices, we have seen that the general conclusions of
their work also apply to the specific example we have analyzed. First, with
an active Taylor rule, the presence of a lower bound on the nominal interest

rate leads to the presence of two steady states, one at the targeted inflation
rate and one at a lower inflation rate. Second, the targeted steady state, which
is a unique equilibrium according to the conventional local analysis, instead
is the source for a global stable manifold of the low-inflation steady-state
equilibrium.
In closing we will offer some caveats regarding using the kind of analysis in
this article to interpret current economic outcomes. It is tempting to conclude
from Figure 3 that the low-inflation steady state is “more likely” because it does
possess a stable manifold while the targeted steady state does not. However,
the model only tells us what equilibria exist, not how likely they are to occur.
It is also tempting to conclude from this work that policy may be unwittingly
leading the economy to the unintended steady state. However, the theoretical
analysis is based on perfect information about the model and the equilibrium
by all agents. It is interesting to think about situations where policymakers
and private decisionmakers do not understand the structure of the economy,
but that is not the situation analyzed here. Finally, we should stress that before
using this kind of framework for quantitative analysis, it would be desirable
to enrich the model to incorporate capital accumulation. The behavior of the
capital stock plays a key role in interest rate determination, and at this point
it is an open question whether the kind of dynamics described here carry over
to models with capital accumulation.

REFERENCES

Benhabib, Jess, Stephanie Schmitt-Grohé, and Martín Uribe. 2001a. “The
Perils of Taylor Rules.” Journal of Economic Theory 96 (January):
40–69.
Benhabib, Jess, Stephanie Schmitt-Grohé, and Martín Uribe. 2001b.
“Monetary Policy and Multiple Equilibria.” American Economic Review
91 (March): 167–86.
Benhabib, Jess, Stephanie Schmitt-Grohé, and Martín Uribe. 2002.
“Avoiding Liquidity Traps.” Journal of Political Economy 110 (June):
535–63.
Bernanke, Ben S., and Alan S. Blinder. 1992. “The Federal Funds Rate and
the Channels of Monetary Transmission.” American Economic Review
82 (September): 901–21.

Blanchard, Olivier, and Charles M. Kahn. 1980. “The Solution of Linear
Difference Models Under Rational Expectations.” Econometrica 48
(July): 1,305–11.
Brock, William A. 1975. “A Simple Perfect Foresight Monetary Model.”
Journal of Monetary Economics 1 (April): 133–50.
Bullard, James. 2010. “Seven Faces of the Peril.” Federal Reserve Bank of
St. Louis Review 92 (September): 339–52.
Clarida, Richard, Jordi Gali, and Mark Gertler. 2000. “Monetary Policy
Rules and Macroeconomic Stability: Evidence and Some Theory.”
Quarterly Journal of Economics 115 (February): 147–80.
Cochrane, John H. 2011. “Determinacy and Identification with Taylor
Rules.” http://faculty.chicagobooth.edu/john.cochrane/research/Papers/taylor_rule_jpe_revision.pdf
Goodfriend, Marvin S., and Robert G. King. 1997. “The New Neoclassical
Synthesis and the Role of Monetary Policy.” In NBER Macroeconomics
Annual 1997, Vol. 12, edited by Ben Bernanke and Julio Rotemberg.
Cambridge, Mass.: MIT Press, 231–96.
Henderson, Dale, and Warwick J. McKibbin. 1993. “A Comparison of Some
Basic Monetary Policy Regimes for Open Economies: Implications of
Different Degrees of Instrument Adjustment and Wage Persistence.”
Carnegie-Rochester Conference Series on Public Policy 39 (December):
221–317.
Kerr, William, and Robert G. King. 1996. “Limits on Interest Rate Rules in
the IS Model.” Federal Reserve Bank of Richmond Economic Quarterly
82 (Spring): 47–75.
King, Robert G., and Alexander L. Wolman. 1996. “Inflation Targeting in a
St. Louis Model of the 21st Century.” Federal Reserve Bank of St. Louis
Review 78 (May): 83–107.
King, Robert G., and Mark W. Watson. 1998. “The Solution of Singular
Linear Difference Systems Under Rational Expectations.” International
Economic Review 34 (November): 1,015–26.
Leeper, Eric. 1991. “Equilibria Under ‘Active’ and ‘Passive’ Monetary and
Fiscal Policies.” Journal of Monetary Economics 27 (February): 129–47.
Lubik, Thomas A., and Frank Schorfheide. 2004. “Testing for
Indeterminacy: An Application to U.S. Monetary Policy.” American
Economic Review 94 (March): 190–217.
McCallum, Bennett T. 1981. “Price Level Determinacy with an Interest Rate
Policy Rule and Rational Expectations.” Journal of Monetary
Economics 8: 319–29.

McCallum, Bennett T. 1983. “A Reconsideration of Sims’ Evidence
Concerning Monetarism.” Economics Letters 13: 167–71.
Michener, Ronald, and B. Ravikumar. 1998. “Chaotic Dynamics in a
Cash-in-Advance Economy.” Journal of Economic Dynamics and
Control 22 (May): 1,117–37.
Obstfeld, Maurice, and Kenneth Rogoff. 1983. “Speculative Hyperinflations
in Maximizing Models: Can We Rule Them Out?” Journal of Political
Economy 91 (August): 675–87.
Rotemberg, Julio J. 1982. “Sticky Prices in the United States.” Journal of
Political Economy 90 (December): 1,187–211.
Sargent, Thomas J., and Neil Wallace. 1975. “‘Rational’ Expectations, the
Optimal Monetary Instrument, and the Optimal Money Supply Rule.”
Journal of Political Economy 83 (April): 241–54.
Taylor, John B. 1993. “Discretion versus Policy Rules in Practice.”
Carnegie-Rochester Conference Series on Public Policy 39 (December):
195–214.
Woodford, Michael. 1997. “Control of the Public Debt: A Requirement for
Price Stability?” In The Debt Burden and Monetary Policy, edited by G.
Calvo and M. King. London: Macmillan (published version is excerpted
from NBER Working Paper No. 5684 [July]).
Yun, Tack. 1996. “Nominal Price Rigidity, Money Supply Endogeneity and
Business Cycles.” Journal of Monetary Economics 37 (April): 345–70.

Economic Quarterly—Volume 96, Number 4—Fourth Quarter 2010—Pages 339–372

Hidden Effort, Learning by
Doing, and Wage Dynamics
Arantxa Jarque

Many occupations are subject to learning by doing: Effort at the
workplace early in the career of a worker results in higher productivity later on.1 In such occupations, if effort at work is unobservable, a moral hazard problem arises as well. The combination of these
two characteristics of effort implies that employers need to provide incentives
for the employee to work hard, possibly in the form of pay-for-performance,2
while taking into account at the same time the optimal path of human capital
accumulation over the duration of the contract.
The recent crisis had a big impact on the labor market, with high job-destruction rates. If firm-specific human capital accumulation is important,
the effect of these separations on welfare may come from several channels. A
direct channel is through the loss of human capital prompted by the exogenous
separation, as well as the loss in welfare from the decrease in wealth because
of unemployment spells of workers. A less direct channel, but potentially
an important one, is the change in the cost of providing incentives when the
(exogenous to the incentive provision) separation rate increases. However,
we are far from being able to understand and measure the importance of this
I would like to thank Huberto Ennis, Juan Carlos Hatchondo, Tim Hursey, and Pierre Sarte
for helpful comments, as well as Nadezhda Malysheva for great research assistance. Andreas
Hornstein provided many editorial suggestions that helped shape the final version of this article.
All remaining errors are mine. The views presented in this article do not necessarily represent
those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mail:
arantxa.jarque@rich.frb.org.
1 See Arrow (1962), Lucas (1988), and Heckman, Lochner, and Taber (1998) for a complete
discussion of this issue, as well as alternative specifications of learning by doing.
2 Lemieux, MacLeod, and Parent (2009) report that, for a Panel Study of Income Dynamics sample of male household heads aged 18–65 working in private sector wage and salary jobs,
the incidence of pay-for-performance jobs was about 38 percent in the late 1970s and increased
to about 45 percent in the 1990s. They define pay-for-performance jobs as employment relationships in which part of the worker’s total compensation includes a variable pay component (bonus,
commission, piece rate). Any worker who reports overtime pay is considered to be in a non-payfor-performance job. See also MacLeod and Parent (1999).

cost, since little is known so far about the structure of incentive provision in
the presence of learning by doing.3 This article constitutes a modest first step
in this direction: Abstracting from separations and in a partial equilibrium
setting, this article studies the time allocation of incentives and human capital
accumulation in the optimal contract. This simplified analysis should be a
helpful benchmark in future studies of the fully fledged model with separations
and general equilibrium.
We modify the standard repeated moral hazard (RMH) framework from
Rogerson (1985a) to include learning by doing. In the standard framework, a
risk-neutral employer, the principal, designs a contract to provide incentives
for a risk-averse employee, the agent, to exert effort in running the technology
of the firm. Both the principal and the agent commit to a long-term contract.
The agent’s effort is private information and it affects the results of the firm
stochastically: The probability distribution over the results of the firm (the
agent’s “productivity”) in a given period is determined by the effort choice of
the agent in that same period only. We introduce the following modification to
this standard framework: We specify learning by doing by assuming that the
probability distribution over the results of the firm in each period is determined
by the sum of past undepreciated efforts of the agent, as opposed to his current
effort only. In other words, the agent’s productivity is determined by his
“accumulated human capital.” More human capital implies higher expected
output, although all possible output levels may realize under any level of
human capital. In this specification, the agent determines his human capital
deterministically by choosing effort each period. Lower depreciation of past
effort is interpreted as “more persistence” of effort.
We present a model of two periods. The first period represents the junior
years, when the worker has just been hired and has little experience. The
second period represents the mature worker years, when human capital has
been potentially accumulated and there are no more years ahead in which to
exploit the productivity of the worker. A contract contingent on the observed
performance of the agent is designed by the principal to implement the path
of human capital accumulation that maximizes the principal’s expected profit
(expected output minus expected payments to the agent).
In our analysis, we find the following two main implications of the presence of learning by doing. First, the principal does not find it optimal to require
a high level of human capital in the last period of the contract, since there is
not much time left to exploit the productivity of the worker. Hence, the more
experienced workers are not the most productive ones, since they optimally
are asked to let their human capital depreciate. This implies that workers exert
3 The only articles dealing with effort persistence in a repeated moral hazard problem are,
to our knowledge, Fernandes and Phelan (2000), Mukoyama and Şahin (2005), Kwon (2006), and
Jarque (2010).

the most effort in their junior years, and the least in their pre-retirement years.
In a comparison with the standard RMH problem, we find that the front-loading of effort, as well as the low effort requirement at the end of the worker's career, differs markedly from the optimal path of effort in a context without learning
by doing. Second, and in spite of this difference in effort requirements over
the contract length, we find that learning by doing does not imply a change
in the properties of consumption paths; hence, the properties of consumption
paths found by previous studies, such as Phelan (1994), remain true in this
context (see also Ales and Maziero [2009]).
It is worth noting that in our analysis we assume perfect commitment to
the contract both from the employer and the employee, and we do not allow
for separations to be part of the contract. This means we need to abstract
from the usual career concerns that have been explored in the literature (see
Gibbons and Murphy [1992]). The implications of the hidden human capital
accumulation that we model here should be viewed as complementary to the
implications of career concerns.
As pointed out above, the problem studied here differs from the standard
RMH in that the contingent contract needs to take into account the persistent
effects of effort on productivity. On the technical side, this highly complicates
solving for the optimal contract. The fact that both past and current effort
choices are not observable means that, at the start of every period, the principal
does not know the preferences of the agent over continuation contracts (that is,
the principal does not know the true productivity of the agent for a given choice
of effort today). Jarque (2010) deals with this difficulty and presents a class
of problems with persistence for which a simple solution can be found. The
article studies a general framework in which past effort choices affect current
output, as opposed to other forms of persistence that one may consider, such
as through output autocorrelation (see, for example, Kapička [2008]). The
learning-by-doing problem that we are interested in, hence, constitutes a fitting
application of the results in Jarque (2010). We adapt the assumptions in Jarque
(2010) to a finite horizon and we show how this specification of learning by
doing greatly simplifies the analysis of the optimal contract.
In Section 1 we introduce the common assumptions throughout the article. Section 2 presents, as a benchmark, the case in which the principal can
directly observe the level of effort chosen by the agent every period, and hence
can control his human capital at all times. For reference, we also discuss the
case in which the effort of the agent does not have a persistent effect in time.
The analytical properties of the problem are discussed in both cases. Then we
analyze the main case of interest of this article, in which effort is unobservable
and contracts that specify payments contingent on the observable performance
of the agent are needed to implement the desired sequence of human capital
accumulation. In Section 3, we discuss the case without persistence—a standard two-period repeated moral hazard problem. In Section 4 we discuss the
technical difficulties of allowing for effort persistence in problems of repeated
moral hazard, and the solutions provided in the literature. Section 5 presents
the framework of hidden human capital accumulation, a particular case of
effort persistence. As the main result, we provide conditions under which
the problem with hidden human capital can be analyzed by studying a related
auxiliary problem that is formally a standard repeated moral hazard problem.
Hence, the discussion of the properties of the standard case in Section 3 becomes useful when deriving the properties of the case with persistence. The
numerical solution to an example is presented in Section 6, together with a
comparison to the standard RMH without learning by doing, and a discussion
of the main lessons about the effects of hidden human capital accumulation
on wage dynamics. Section 7 concludes.

1. DESCRIPTION OF THE ENVIRONMENT

The results in this article apply to contracts of finite length T ; however, in order
to keep the exposition and the notation as simple as possible, we discuss here
the case of a two-period contract, T = 2. We assume that both parties commit
to staying in the contract for the two periods. For tractability, we assume that
the principal has perfect control over the savings of the agent. They both
discount the future at a rate β. We assume that the principal is risk neutral and
the agent is risk averse, with additively separable utility that is linear in effort.
Assumption 1 The agent’s utility is given by U (ct , et ) = u (ct ) − vet , where
u is twice continuously differentiable and strictly concave and ct and
et denote consumption and effort at time t, respectively.
There is a finite set of possible outcomes in each period, Y = {yL , yH }.
Histories of outcomes are assumed to be observable to both the principal
and the agent. We assume both consumption and effort lie in a compact set:
ct ∈ [0, yt] and et ∈ E = [e, ē] for all t.
We model the hidden accumulation of human capital by assuming that
the effect of effort is “persistent” over time, in a learning-by-doing fashion.
That is, we depart from the standard RMH framework, which assumes that
the probability distribution over possible outcomes realizations at t depends
only on et . In our human capital accumulation framework, the probability
distribution at t depends on all past efforts up to time t. Assumption 2 states
this formally for the two-period problem.
Assumption 2 The agent affects the probability distribution over outcomes according to the following function:

Pr (yt = yH | st) ≡ π (st),

where

s1 = e1,   (1)
s2 = ρs1 + e2,   (2)

and π (s) is continuous, differentiable, concave, and ρ ∈ (0, 1).
In the human capital accumulation language, we could equivalently write the law of motion for human capital as

s1 = e1,
s2 = (1 − δ) s1 + e2,

where δ = 1 − ρ would represent the depreciation rate. Then,

f (st) = yH with probability π (st), and yL with probability 1 − π (st),

could be interpreted as the production function or technology of the firm. In the rest of the article, we loosely refer to Assumption 2 as effort being "persistent," we refer to st as the accumulated human capital at time t, and we refer to ρ as the persistence rate.
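To make the timing concrete, the following minimal sketch (in Python) simulates this technology. It assumes the square-root probability function and the parameter values that will be used in the numerical examples below; all values here are purely illustrative.

```python
import numpy as np

# A minimal sketch of the technology in Assumption 2. The square-root
# probability pi(s) = sqrt(s) and the values of rho, yH, yL are assumptions
# borrowed from the numerical examples later in the article.
rho, yH, yL = 0.2, 30.0, 20.0
rng = np.random.default_rng(0)

def simulate(e1, e2):
    s1 = e1                     # equation (1): first-period human capital
    s2 = rho * s1 + e2          # equation (2): undepreciated effort carries over
    outputs = []
    for s in (s1, s2):
        p = np.sqrt(s)          # Pr(y = yH | s) = pi(s)
        outputs.append(yH if rng.random() < p else yL)
    return outputs

print(simulate(0.22, 0.12))     # one random history, e.g. [30.0, 20.0]
```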
The strategy of the principal consists of a sequence of consumption transfers to the agent contingent on the history of outcome realizations, c = {ci, cij}i,j=L,H, to which the principal commits when offering the contract at time 0. The agent's strategy is a sequence of period best-response effort choices that maximize his expected utility from t on, given the past history of output: e = {e1, e2i}i=L,H. At the beginning of each period, the agent chooses the level of current effort, et. Then output yt is realized according to the distribution determined by all effort choices up to time t. Finally, the corresponding amount of consumption is given to the agent.
A contract is a pair of contingent sequences c and e. For the analysis in the rest of the article, it will be useful to follow Grossman and Hart (1983) in using utility levels ui = u (ci) and uij = u (cij) as choice variables.4 To denote the domain for this new choice variable, we need to introduce the following set notation:

Ui = {u | u = u (ci) for some ci ∈ [0, yi]}, i = L, H,
Uij = {u | u = u (cij) for some cij ∈ [0, yj]}, i, j = L, H.

4 If the reader is knowledgeable about contract theory, he or she may notice that this is not a simple change of notation. In fact, when computing the solution to numerical examples (see Section 6), we will follow the two-step procedure proposed in Grossman and Hart (1983). This procedure consists of splitting the expected profit-maximization problem of the principal into two steps: (1) cost minimization of implementing a given effort level (on a grid of efforts), and (2) choosing the effort on the grid that implies the highest expected profit for the principal. Using utility as the choice variable, it is easy to show that under the assumptions of this article there will exist a unique minimum in the cost minimization problem.


The contingent sequence of utility is then denoted u = {ui, uij}i,j=L,H, and we assume that ui ∈ Ui and uij ∈ Uij.
In order to keep the expressions in the article as simple as possible, and abusing notation slightly, we also introduce some notation shortcuts. We denote ci = u−1 (ui) for all i. We also write Pr (yt = yH | st) as πH (st) and Pr (yt = yL | st) as πL (st).
The expected profit of the principal, denoted by V (u, e), depends on the contract as follows:

V (u, e) ≡ Σi=L,H πi(s1) [yi − ci + β Σj=L,H πj(s2i) (yj − cij)],

where st changes with et as detailed in (1). In the same way, we can write the agent's expected utility of accepting to participate in the contract as

W0 (u, e) = Σi=L,H πi(s1) {ui + β [Σj=L,H πj(s2i) uij − ve2i]} − ve1.   (3)

Within this environment we are now ready to set up the problem of finding
the optimal contract that will provide the right incentives for human capital
accumulation at the least expected cost. Before analyzing the hidden human
capital accumulation case, however, we go through a series of related and
simpler cases that will serve in clarifying the main case of interest.
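Because the analysis that follows works directly with V (u, e) and W0 (u, e), it may help to see them in computable form. The sketch below evaluates both objects, assuming for concreteness the square-root specifications u (c) = 2√c and π (s) = √s that will be adopted in the numerical example of Section 2; the history labels ('L', 'H', 'LL', ...) are our own encoding convention.

```python
import numpy as np

# A sketch of the payoff formulas, assuming u(c) = 2*sqrt(c) and pi(s) = sqrt(s)
# (the specifications of the numerical example in Section 2).
v, beta, rho = 5.0, 0.65, 0.2
y = {"L": 20.0, "H": 30.0}
u = lambda c: 2 * np.sqrt(c)
pi = lambda s: np.sqrt(s)

def expected_payoffs(c, e):
    """c: consumption keyed by history ('L','H','LL','LH','HL','HH');
    e: efforts keyed by '1', '2L', '2H'. Returns (V, W0)."""
    s1 = e["1"]
    V, W0 = 0.0, -v * e["1"]
    for i in ("L", "H"):
        p1 = pi(s1) if i == "H" else 1 - pi(s1)
        s2 = rho * s1 + e["2" + i]
        V1 = W1 = 0.0
        for j in ("L", "H"):
            p2 = pi(s2) if j == "H" else 1 - pi(s2)
            V1 += p2 * (y[j] - c[i + j])        # second-period profit
            W1 += p2 * u(c[i + j])              # second-period utility
        V += p1 * (y[i] - c[i] + beta * V1)
        W0 += p1 * (u(c[i]) + beta * (W1 - v * e["2" + i]))
    return V, W0

plan = {k: 5.0 for k in ("L", "H", "LL", "LH", "HL", "HH")}
print(expected_payoffs(plan, {"1": 0.22, "2L": 0.12, "2H": 0.12}))
```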

2. OBSERVABLE EFFORT

The case of observable effort is often referred to in the literature as first-best
(FB) since it represents the maximum joint utility achievable in the contractual
relationship between the principal and the agent. This is because, if effort is
observable, the principal can directly control the choice of effort of the agent
and, hence, there is no need for incentives. This implies that there is no
need to impose risk on the agent, which results in lower expected transfers
from the principal to the agent. Although we are interested in the case of
unobservable effort, it is useful to also analyze this simpler benchmark to
learn about the differences between the problem with effort persistence (human
capital accumulation) and the standard RMH problem (in which human capital
fully depreciates every period).
We will refer to the problem of the principal when effort is observable as problem FB:

max(u,e) V (u, e)

s.to

e ∈ [e, ē]^3,   (ED)
ui ∈ Ui ∀i, uij ∈ Uij ∀i, j,   (CD)
w0 ≤ W0 (u, e).   (PC)

The solution to problem FB is a contract that consists of a pair of contingent sequences of utility and effort that maximize the expected profit of the principal subject to the participation constraint (PC)—which assures that the agent expects at least as much utility from accepting the contract as from staying out—and the domain constraints for consumption (CD) and effort (ED). Characterizing the solution to this problem when considering all the possible combinations of binding (ED) and (CD) constraints is very lengthy and tedious. In the interest of space, we choose to discuss here only the case in which none of the constraints in (CD) or (ED) bind.
What are the properties of consumption and effort in the optimal contract?
We learn them from looking at the first-order conditions of the problem. Let
λ ≥ 0 be the multiplier of the (PC).5 We have:
(ui): 1/u′(ci) = λ, for i = L, H,
(uij): 1/u′(cij) = λ, for i, j = L, H,
(e1): [π′(s1) + βρπ′(s2i)] (yH − yL) = vλ,
(e2i): π′(s2i) (yH − yL) = vλ, for i = L, H.   (4)

We analyze in turn the case with and without persistence.

5 Standard arguments for λ > 0 hold in this setup with persistence. The basic intuition is that V∗ (c, e; w0) is strictly decreasing in w0.

Full Depreciation
First we analyze the observable effort version of a standard two-period RMH
problem (see, for example, Rogerson [1985a]). This case is nested in the
common framework presented above, for a value of the persistence parameter
ρ = 0. In this case, effort does not have a persistent effect on the output
distribution, that is, there is no learning by doing. Hence, we can say that the
human capital of the agent fully depreciates every period.
Here and throughout the rest of the paper, we use stars to denote the
solutions to the problems. When necessary, we index the solutions by two
arguments: the first one takes a value P if ρ > 0 (persistence) and a value
N P if ρ = 0 (no persistence). The second one takes a value FB if we
are in the case of observable effort and a value SB if we are in the case of
unobservable effort. Hence, here we denote the solution to problem FB when
ρ = 0 as u∗ (N P , F B) and e∗ (N P , FB). Note that, whenever it does not
lead to confusion, we do not include these arguments to keep the notation
light.
Since the right-hand sides of all the first-order conditions for utility are equal to λ, we conclude that the level of utility, and hence consumption, should be the same independent of the output realizations and the period: ui∗ = uij∗ = u∗ for all i, j. The first-order conditions for effort, in turn, imply that effort requirements are independent of output realizations and the period: e1∗ = e2i∗ = e∗ for all i. It is easy to see that, given these properties of
consumption and effort, the (PC) in problem FB simplifies to


w0 = (1 + β) (u∗ − ve∗).

Hence, we can solve for the level of utility in the solution to the FB problem:

u∗ ≡ [w0 + v (1 + β) e∗] / (1 + β).   (5)

Let c∗ ≡ u−1 (u∗). Let π′j(e2) denote the derivative of πj(e2). Noting that π′H(e) = −π′L(e), we can combine the first-order conditions for consumption and effort to get

u′(c∗) π′H(e∗) (yH − yL) = v ∀t.   (6)
That is, the optimal effort level is such that the marginal benefit from increased
effort (the marginal increase in expected output times the marginal utility of
output) equals the marginal utility cost of effort.
The following properties summarize our conclusions about the FB problem with nonpersistent effort:
1A. We have that c1∗ = c2∗ = c∗ .
2A. We have e1∗ = e2∗ = e∗ .
The main property of the optimal consumption sequence of the FB contract in
the standard RMH problem is that the contract insures the agent completely
against consumption fluctuations whenever feasible. The intuition for this
result is straightforward: Since the agent has concave utility in consumption,
this is the cheapest way of providing the agent with his outside utility. The
main property of the optimal effort sequence of the FB contract in the standard
RMH problem is a constant effort requirement over time. The tradeoff between
increasing the disutility suffered by the agent and increasing the expected
output is exactly the same in each period, and hence the solution is the same
each time.
It is worth noting that the solution in the observable-effort case coincides
with that of a repeated static problem (“spot” contract) in which neither the
agent nor the principal commits to the two-period contract, and the outside utility of the agent is w0/2 each period. Hence, commitment has no value in the case of observable effort and no persistence.

Table 1 Parameters of the Numerical Example

v    Marginal effort disutility        5.00
β    Discount factor                   0.65
yH   Output realization, high state   30.00
yL   Output realization, low state    20.00
w0   Outside utility                   6.55
An example

Throughout this article, we illustrate the properties of each particular case of the environment presented by solving a particular numerical example. This makes it easy to compare across the different cases presented. The common parameters of the example are listed in Table 1.
We also assume u (c) = 2√c and a probability function

π (s) = √s,   (7)

as well as e = 0.01 and ē = 0.99.
We now solve for c∗ and e∗. Since we are in the case of full depreciation of human capital, we use ρ = 0 and the formulas derived above. For our example, we have that (6) becomes

(1/(2√e∗)) (30 − 20) = 5√c∗,
1/(2√e∗) = (1/2)√c∗,
c∗ = 1/e∗.

Together with (5) this gives us the solutions listed in Table 2.
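These values are easy to verify numerically: substituting c∗ = 1/e∗ and u (c) = 2√c into (5) leaves a single equation in e∗. A minimal sketch, assuming SciPy is available:

```python
from scipy.optimize import brentq

v, beta, w0 = 5.0, 0.65, 6.55

# u(c*) = 2/sqrt(e*) once c* = 1/e* is substituted; equate it to (5).
f = lambda e: 2.0 / e ** 0.5 - (w0 + v * (1 + beta) * e) / (1 + beta)
e_star = brentq(f, 0.01, 0.99)
print(round(e_star, 2), round(1 / e_star, 2))  # about 0.17 and 5.83 (Table 2: 0.17, 5.82)
```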

Observable Human Capital Accumulation
We now turn to analyzing the case in which the effects of effort are persistent
in time, with ρ > 0. That is, we analyze the optimal contract in the presence
of human capital accumulation, or learning by doing.
We established above that the main property of the optimal consumption
sequence of the FB contract in the standard RMH problem is that the contract
insures the agent completely against consumption fluctuations. Here we will
learn that this property remains true in the case with effort persistence. The
main property of the optimal effort sequence of the FB contract in the standard


RMH problem is also a constant effort requirement over time. We will learn
that when effort is persistent this property no longer holds: Effort requirements
will vary over time even in the observable effort benchmark.
We now proceed to derive these results by formally analyzing the problem
of the principal FB for the case of ρ > 0. She chooses an optimal contract:
a pair of contingent sequences u∗ (P , F B) and e∗ (P , F B) that solve problem
FB, i.e., they maximize the expected profit of the principal subject to (PC)
and the domain constraints (CD) and (ED). We initially discuss the case in
which neither the (CD) nor the (ED) constraint bind. However, the lower (ED)
constraint (the non-negativity constraint on effort) may bind, with persistence,
in not-so-trivial cases. Because of its relevance, the case of this constraint
binding will be discussed in turn.
We can derive the properties of the solution by analyzing the first-order
conditions in (4) for the case of ρ > 0. The first thing to note is that, as in
the case without persistence, neither consumption nor effort are contingent
on output realizations. However, effort recommendations will depend on the
time period. We can use the (PC) here as well to derive the optimal level of
utility:

u∗ ≡ [w0 + v (e1∗ + βe2∗)] / (1 + β).
The optimal level of consumption will be c∗ ≡ u−1 (u∗ ). We can substitute
the first-order condition for effort e2 into that for e1 , as well as the expression
of λ from the consumption first-order conditions, to get an expression for the
tradeoff determining the choice of e1:

u′(c∗) π′H(s1∗) (yH − yL) = v (1 − βρ).   (8)
Comparing this to the tradeoff determining the choice of e2,

u′(c∗) π′H(s2i∗) (yH − yL) = v,   (9)

we learn that the marginal cost of increasing effort in the first period is different
(smaller) than that in the second period. The optimal choice takes into account
that any effort e1 exerted in the first period persists into the second one, i.e.,
it “saves” the agent the equivalent of the discounted disutility of effort of
exerting ρe1 in the second period. This difference in the effective cost of
effort that appears because of persistence implies that the principal sets the
effort requirements in a way that implies a higher probability of observing yH
in the first period than in the second. We can see exactly how this difference
is determined by using the first-order conditions of effort to get the following
relationship:

π′H(s1∗) / (1 − βρ) = π′H(s2∗).   (10)
This implies s1 > s2 since 1 − βρ is always between 0 and 1. From the
accumulation of human capital in (1) we have that
e1∗ = s1∗,
e2∗ = s2∗ − ρs1∗,   (11)
which implies a higher effort in the first period than in the second, e1∗ > e2∗ .
The following properties summarize our conclusions about the case with
persistence and observable effort:
1B. We have that c1∗ = c2∗ = c∗ .
2B. We have that e1∗ > e2∗ .
That is, whenever c∗ is feasible in both states, the principal provides complete
consumption smoothing, both across states and across time. As for effort
requirements, the principal decreases the requirement from the first to the
second period. We repeat the intuition for this result: In the first period,
the effort disutility incurred by the agent is a sort of “investment,” since it
improves the conditional distribution not only in the current period but also
in the following one. At t = 2, however, there is no period to follow, so the
marginal benefits of effort are not as high, while the marginal cost is the same
as in the first period.6
An example

We now solve for the optimal contract with persistence and observable effort. For this case with accumulation of human capital, we use ρ = 0.2 and the formulas derived above. We list the solution in Table 2. Note that the level of s2∗(P, FB) in this case is 0.16, smaller than that of the second-period effort in the no-persistence case of the previous section, which was e2∗(NP, FB) = 0.17. Comparing the equations that determine each ([6] for e2∗(NP, FB) and [9] for s2∗(P, FB)), we can see that c∗(P, FB) > c∗(NP, FB) implies 1/u′(c∗(P, FB)) > 1/u′(c∗(NP, FB)), and hence π′H(s2∗(P, FB)) > π′H(e2∗(NP, FB)). Given the concavity of π(·), it follows that s2∗(P, FB) < e2∗(NP, FB).
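The solution with persistence can be verified the same way: (10) ties s2∗ to s1∗, (9) gives c∗ = 1/s2∗, and the (PC) then leaves one equation in s1∗. A sketch under the example's assumed functional forms:

```python
from scipy.optimize import brentq

v, beta, w0, rho = 5.0, 0.65, 6.55, 0.2
k = (1 - beta * rho) ** 2       # (10) with pi(s) = sqrt(s) gives s2* = k * s1*

# (PC): 2*sqrt(c*) = [w0 + v(e1* + beta*e2*)]/(1 + beta), with c* = 1/s2*,
# e1* = s1* and e2* = (k - rho) * s1*.
f = lambda s1: 2.0 / (k * s1) ** 0.5 - (w0 + v * s1 * (1 + beta * (k - rho))) / (1 + beta)
s1 = brentq(f, 0.01, 0.99)
s2 = k * s1
print([round(x, 2) for x in (1 / s2, s1, s2 - rho * s1, s1, s2)])
# about [5.96, 0.22, 0.12, 0.22, 0.17]; Table 2 reports s2* truncated to 0.16
```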

Table 2 Solutions for the Numerical Example, FB Problem

FB Solutions    c1∗     c2∗     e1∗     e2∗     s1∗     s2∗
NP              5.82    5.82    0.17    0.17    0.17    0.17
P               5.96    5.96    0.22    0.12    0.22    0.16

6 In a T > 2 framework with s0 = 0, we would have that e1 ≥ et for t < T, that et = e2 for t = 2, ..., T − 1, and eT ≤ e2. Again, the intuition is that in all t < T, effort improves the conditional distribution not only in the current period, but also in the periods that follow. At t = 1, since s0 = 0, effort is higher than in any other period.

The Nonnegativity Constraint on Effort

In light of this solution we can discuss the case of the lower constraint in (ED) binding. As an introduction to why this case is of particular relevance to the problem with persistence, it is useful to consider the effect of changes in the persistence parameter, ρ, on the effort solution just presented. For a value of persistence ρ = 0, effort equals accumulated effort trivially, and its level is constant across periods. On the other hand, if we instead substitute a value of persistence ρ = 1, (1 − βρ) takes its minimum value in (10) and the solution implies the maximum difference between the levels of s1 and s2, with s1 much higher than s2. However, carefully inspecting (11), we can already see that such a high level of persistence cannot be compatible with an interior solution for effort in period 2: The principal would choose e2∗ = 0. Since s1∗ > s2∗ for all values of ρ > 0, effort e2∗ may not be interior for other high enough values of ρ. In other words, persistence implies that, in many interesting cases, the lower domain constraint on effort (ED) cannot be safely ignored.
Constraint (ED) is represented by the following set of inequalities:

s2i ≤ ρs1 + ē,   (12)

and

s2i ≥ ρs1 + e.   (13)
Constraint (12) may be binding for some parametrizations. However, we
choose not to discuss this case explicitly here because it is easy to impose ex
ante conditions on the parameters that preclude it from binding; for example,
for the specification of the probability in (7), it is easy to see that s ≥ e is
never chosen in the optimal contract. The lower bound on s represented in
(13), however, is endogenous, and equation (13) cannot be checked without
having the solution s1∗ in hand. Fortunately, in the case of observable effort that
we are analyzing here, we are able to include constraint (13) explicitly in the
maximization problem FB. This allows us to study how the solution properties
differ from those in 1B and 2B discussed above when this constraint binds.
Let γ i ≥ 0 be the multiplier associated with constraint (13) in the version
of problem FB for the case ρ > 0. We have that the first-order condition for
e2i is modified as follows:
(e2i): π′(s2i) (yH − yL) = vλ − γi, for i = L, H.   (14)

Note that, again, the choice for effort in the second period is not contingent on
the first-period outcome, so we have γL = γH = γ. Then we can substitute
(14) into the unmodified first-order condition for first-period effort, (e1 ), to get
a general version of equation (8) that allows for the lower domain constraint
of effort to be binding:
π′(s1) (yH − yL) = vλ (1 − βρ) + βργ.   (15)

From the Kuhn-Tucker conditions, we know that whenever γ > 0 we have
e2∗ = 0 and, hence, s2∗ = ρs1∗ .
An example

In some special cases, we can check ex ante whether γ = 0 is a feasible
solution to the FB problem, and hence we can restrict ourselves to the simpler
analysis without domain constraints. In particular, with the specification for
the probability function in (7) that we are using for our example, equation (10)
becomes

1 / ((1 − βρ) 2√s1∗) = 1 / (2√s2∗),

or, rewriting,

s2∗ = (1 − βρ)² s1∗.   (16)

This is the relationship that should hold between the level of s1∗ and s2∗ whenever
γ = 0. Hence, the domain condition e2 ≥ 0 is satisfied whenever s2∗ ≥ ρs1∗ ,
or, substituting s2∗ from (16), whenever
(1 − βρ)² ≥ ρ.   (17)

A closer inspection of condition (17) shows that, for low values of β and ρ, it is easily satisfied. For higher β values, however, the condition is satisfied only for low enough ρ values, i.e., when effort is not "too persistent." In our example, for β = 0.65, we need to check whether (17) is satisfied: The left-hand side is equal to 0.76, which is clearly greater than the right-hand side, 0.2.
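This check is simple enough to automate; a small helper (hypothetical, for illustration) evaluates (17) directly:

```python
def condition_17_holds(beta, rho):
    """True when (1 - beta*rho)**2 >= rho, so that e2* >= 0 in the FB solution."""
    return (1 - beta * rho) ** 2 >= rho

print(condition_17_holds(0.65, 0.2))   # True: 0.7569 >= 0.2, as in the text
```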
To summarize the findings of our analysis, we have shown that for the numerical example presented here, we can provide ex ante conditions on the parameters of the problem (a functional form for the probability as in equation [7], together with condition [17]) that assure us that the domain constraints in (ED) do not bind.
Under such restrictions, the characteristics of the solution to the first-best
problem 1B and 2B presented earlier in this section are valid.
In relation to those characteristics, it is worth pointing out that the properties of effort requirements depend strongly on our assumption that the utility
of the agent is linear in effort. Linearity implies that there is no tradeoff between the efficient accumulation path of human capital and smoothing effort
disutility over time. In other words, the smoothing of effort requirements over
the duration of the contract does not increase the overall utility of the agent,
as is the case with consumption smoothing; hence, the principal only takes


into account the effects that different accumulation paths have on the utility
of the agent and his own profit through the changes in expected output over
time. In the numerical example in Section 6, we will revisit the solution to the
observable-effort case discussed here, and we will see the direct consequence
of this: It is optimal to ask the agent to exert effort earlier rather than later in the contract, since effort exerted early improves the distribution over future output, holding constant the level of future effort.

3. UNOBSERVABLE EFFORT WITH FULL DEPRECIATION

When effort is not directly observable, the principal must rely on observed
output realizations, which are imperfect signals about the effort level of the
agent, in order to implement the desired sequence of human capital. Contrary
to the case of observable effort, here consumption in a given period will need
to vary with the output realization in order to provide incentives for the worker
to choose the recommended level of effort.
Formally, the problem of the principal, which we will refer to as the
second-best (SB), is:
max(u,e) V (u, e)

s.to

e ∈ [e, ē]^3,   (ED)
ui ∈ Ui ∀i, uij ∈ Uij ∀i, j,   (CD)
w0 ≤ W0 (u, e),   (PC)
W0 (u, e) ≥ W0 (u, ẽ) ∀ẽ ≠ e.   (IC)

The incentive constraint (IC) ensures that the expected utility that the agent
gets from following the principal’s recommendation is at least as large as that
of any other effort sequence.
In order to illustrate clearly the differences that derive from the presence
of effort persistence in this two-period problem, we analyze first the version
without persistence (ρ = 0), that is, with full depreciation of human capital
every period, or no learning by doing. Moreover, because the main result that
we will derive when we study the case with ρ > 0 is that, in some cases, the
properties for consumption in the optimal contract will be the same as those of
the optimal contract in a framework without persistence, it is useful to analyze
in detail the properties of the solution without persistence.
Without persistence, the structure of the incentive constraints simplifies
considerably. This influences the solution, but also the ways in which the
problem can be studied. In particular, the standard RMH problem has a simple
recursive formulation that is not available with persistence. In this section we


provide an illustration of this difference. Then, we discuss the difficulties
of introducing persistence, along with some potential solutions, in Section
4. In Section 5 we discuss our example with human capital accumulation, a
particularly simple case with effort persistence for which a solution can easily
be found.

A Simplified Incentive Compatibility Constraint
In the case without persistence the structure of the incentive constraints simplifies considerably. In particular, the expected utility of the agent in the second
period is independent of the first-period effort choice. Define

W1i (u, e) = Σj=L,H πj(s2i) uij − ve2i, for i = L, H,   (18)

as the expected utilities for the second period, contingent on the first-period realization. This expression for the continuation utility simplifies, when ρ = 0, to

W′1i (u, e2i) ≡ Σj=L,H πj(e2i) uij − ve2i, for i = L, H.   (19)

(Note that, to distinguish the notation for continuation utilities here from those
of the general case that allows for persistence in (18), we denote them here
with a prime and we make explicit the independence of e1 .)
What is the simplification of the incentive constraints that follows from this
independence? As it turns out, all the sequences that have the same choice of
effort in the second period, regardless of the first-period effort choice, provide
the agent with the same expected utility in the second period, conditional
on the first-period output realization being the same. In other words, the
deviations of the agent in the second period can be evaluated independently of
the first-period effort choice, and also independently at each node following
the first-period output realization. As a consequence, the number of relevant
incentive constraints for the agent is drastically decreased.
To see this formally, denote by w1i ≡ W′1i (u, e2i) the continuation utilities evaluated at the effort requirement of the principal. Then all the incentive constraints that involve deviations only in the second period, or that have the same effort choice for the first period, simplify to

w1i ≥ W′1i (u, ẽ2i) ∀ẽ2i ≠ e2i, for i = L, H.   (20)

We refer to equation (20) as the “second-period incentive constraints.”7
7 For a more concrete illustration, consider the case with discrete effort and E = {eL, eH}. Then the initial number of IC constraints would be seven, and they would simplify to three: one first-period constraint and two second-period constraints.


Now note that the independence of W′1i (u, ẽ2i) on e1 also implies the following: Imposing the second-period incentive constraints in (20) serves to assure that all potential deviations (ẽ1, ẽ2L, ẽ2H) that consider effort choices in the second period that are not e2H and e2L are dominated by a strategy (ẽ1, e2L, e2H) that considers the same deviation in period 1 and none in the second period. Formally, what we are saying is that

Σi=L,H πi(ẽ1) [ui + βw1i] − vẽ1 ≥ Σi=L,H πi(ẽ1) [ui + βW′1i(u, ẽ2i)] − vẽ1

trivially simplifies to the second-period incentive constraint in (20). This is useful because it means that when we are evaluating deviations in the first period we can forget about potential deviations in the second period as well, and simply substitute w1i into the second-period utility:

Σi=L,H πi(e1) [ui + βw1i] − ve1 ≥ Σi=L,H πi(ẽ1) [ui + βw1i] − vẽ1.   (21)
We refer to these constraints as the “first-period incentive constraints.”
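The counting behind this simplification (cf. footnote 7) can be made concrete by enumerating pure strategies for a two-point effort set; the effort labels below are illustrative assumptions:

```python
from itertools import product

E = ("eL", "eH")                      # an assumed two-point effort set
recommended = ("eH", "eH", "eH")      # an assumed recommendation (e1, e2L, e2H)

# In general there are 2**3 - 1 = 7 deviating pure strategies to rule out.
deviations = [s for s in product(E, repeat=3) if s != recommended]
print(len(deviations))                # 7

# Without persistence, these collapse to three checks: flip e1 only, or flip
# the second-period effort at a single first-period outcome node.
collapsed = [("eL", "eH", "eH"), ("eH", "eL", "eH"), ("eH", "eH", "eL")]
print(len(collapsed))                 # 3
```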
The independence of second-period expected utility on first-period effort
choice not only decreases the number of IC constraints that we need to consider, but also allows the problem of the principal to be analyzed period by
period. This is precisely because all future period payoffs can be summarized
through the promised utility w1i without specifying the particular consumption
transfers or effort recommendations that will deliver w1i in the future. From
a practical point of view, it is important to note that the range of values that
w1i can take is independent of the agent’s action in the first period, and hence
can be calculated by simply using the domain restrictions for consumption
and second-period effort, together with the second-period IC in (20). This is
a very useful feature when we want to compute the solution for a particular
numerical example, as we will do in Section 6.
To summarize, the simplifications we just discussed are the reason why
the recursive formulation first introduced by Spear and Srivastava (1987) is
possible. In a finite two-period problem like the one presented here, this also
means that we can solve the problem backward and characterize the properties
of the solution. We proceed to do that now.

A Backward Induction Solution to the Optimal Contract

As a first step, we use the fact that incentives in the second period are independent of choices and utilities in the first period. This allows us to split the problem of the agent in the IC into two problems: a first-period problem and a second-period problem. The second-period problem, PIC2, is

max e2i∈[e,ē] Σj=L,H πj(e2i) uij − ve2i,


and the first-period problem, PIC1, is

max e1∈[e,ē] Σi=L,H πi(e1) (ui + βw1i) − ve1,

where w1i is the expected utility for the second period in equilibrium.
If we want to characterize the optimal contract, first we need to transform these maximization problems into an equality constraint that we can include in the problem of the principal. Following the spirit of the first-order approach (see Rogerson [1985b]), we establish concavity of the maximization problems in PIC1 and PIC2. Then we can substitute them by their first-order conditions, which are necessary and sufficient for a maximum. In our two-outcome example, this concavity is fairly straightforward to guarantee. It is easy to see that, for any positive first-period effort recommendation to satisfy the original first-period IC in (21), we need uH + βw1H > uL + βw1L. Also, for any positive second-period effort recommendation to satisfy the second-period IC in (20), we need uiH > uiL. Since we have assumed that πH (·) is a concave function of effort, concavity of the expected utility of the agent in effort follows.8 Hence, we can replace PIC1 with its first-order condition,

(e1): Σi=L,H π′i(e1) (ui + βw1i) − v = 0,   (22)

and we can replace PIC2 with its corresponding first-order condition,

(e2i): Σj=L,H π′j(e2i) uij − v = 0.   (23)

Using these in place of the original IC allows us to derive some properties for
the optimal contract.
As a second step in characterizing the optimal contract, we appeal to the
same logic that we spelled out to show the independence of second-period
utility of the agent on his first-period actions, to argue that the same independence holds for the expected profit of the principal. The objective function in
problem SB can be written as

V (u, e) = Σi=L,H πi(e1) [yi − ci + βV1i(w1i)],

where

V1i(w1i) = Σj=L,H πj(e2i) (yj − cij).

Hence, to solve problem SB subject to (PC) and (22) and (23)—assuming the
domain constraints are not binding—we can simply split the problem across
the two periods and solve it backward using subgame perfection. First, we solve the second-period problem, P2i, for an unspecified value of w1i:

max uiL,uiH,e2i V1i(w1i)

s.t. (23) and

w1i = Σj=L,H πj(e2i) uij − ve2i.

8 For a higher number of output levels, the conditions on the probability function that would assure concavity have not been determined (see Rogerson [1985b] and Jewitt [1988] for a discussion of these conditions in the context of a static contract).

Let μi and λi be the multipliers of the first and second constraints, respectively.
For each i = L, H, the first-order conditions with respect to utility are

(uij): 1/u′(cij) = λi + μi [π′j(e2i) / πj(e2i)], j = L, H.   (24)

This condition will be familiar to the reader acquainted with basic contract
theory: Since the second-period problem is, in fact, a static moral hazard, we
find that this first-order condition links consumption to likelihood ratios in
the same way as in a static contract (see Prescott [1999] for a review of this
textbook case). The likelihood ratios capture the informational value of each
possible output realization. The same static intuition prevails in the case for
effort. The first-order conditions are

(e2i): Σj=L,H π′j(e2i) (yj − cij) + μi Σj=L,H π″j(e2i) uij = 0.   (25)

It is easier to see the intuition when we substitute π′L(e) = −π′H(e) in the expression above and get

(e2i): π′H(e2i) [yH − yL − (ciH − ciL)] + μi π″H(e2i) (uiH − uiL) = 0.
We see that the principal equates the marginal increase in the expected net profit
that comes from a higher probability of yH with the change in the marginal
increase in expected compensation associated with it, given that uiH > uiL .
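These second-period conditions map directly into the two-step procedure of Grossman and Hart (1983) mentioned in footnote 4. The sketch below is one minimal implementation of that step under the example's assumed functional forms: for each effort on a grid, promise keeping and (23) pin down the two contingent utilities in closed form, and the principal then keeps the profit-maximizing grid point. Domain checks on the utilities are omitted for brevity.

```python
import numpy as np

v, yH, yL = 5.0, 30.0, 20.0
u_inv = lambda u: (u / 2.0) ** 2      # inverse of u(c) = 2*sqrt(c)
pi = lambda e: np.sqrt(e)             # probability function (7)
dpi = lambda e: 0.5 / np.sqrt(e)      # pi'(e)

def solve_P2i(w, e_grid=np.linspace(0.01, 0.99, 99)):
    """Second-period problem for promised utility w: with two outcomes, the
    promise-keeping constraint and (23) pin down uiL and uiH for each effort
    on the grid; keep the effort with the highest expected profit V1i."""
    best = None
    for e in e_grid:
        spread = v / dpi(e)                   # uiH - uiL implied by (23)
        uL = w + v * e - pi(e) * spread       # promise keeping
        uH = uL + spread
        profit = pi(e) * (yH - u_inv(uH)) + (1 - pi(e)) * (yL - u_inv(uL))
        if best is None or profit > best[0]:
            best = (profit, e, uL, uH)
    return best

print(solve_P2i(2.4))   # (V1i, e2i, uiL, uiH) at the grid optimum
```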
Note, however, that the solution for the second period is contingent on
the value of w1i (which plays the role of the period outside utility in a static
problem). With the solution to the second-period problem in hand, we can
calculate the value to the principal of promising a level of utility of w1i to the
agent for the second period. Hence, we know the value of V1i (w1i ) and we
can substitute it in the first-period problem, P1:

max uL,uH,e1,w1L,w1H Σi=L,H πi(e1) [yi − ci + βV1i(w1i)]

s.t. (22) and

w0 ≤ Σi=L,H πi(e1) (ui + βw1i) − ve1.


Let μ and λ be the multipliers of the first and second constraints, respectively.
The first-order conditions for consumption are

(ui): 1/u′(ci) = λ + μ [π′i(e1) / πi(e1)], i = L, H.   (26)

These mirror the conditions in (24) for the second period: The ranking of consumption is again determined by the likelihood ratios, although the dispersion
is potentially different and depends on the multiplier of the first-period incentive constraint, μ. The values of μ and μi , as well as λ and λi , are difficult to
get for generic utility functions. (To see this, note that the first-order conditions give us information about u′(c), while the constraints of the problems P1
and P2i are written in terms of u (c); this makes for a highly nonlinear system
of equations that seldom has an explicit solution.) This is why computing numerically the solution to particular problems is a popular strategy in dynamic
contract theory.9
Recall that in this first period the principal has an extra choice variable
with respect to problem P2i : the contingent levels of expected utility of the
agent in the second period, w1i . The importance of the value of w1i relative
to that of ui in the optimal contract is at the heart of dynamic incentives. We
can explore the optimal tradeoff between the two variables by looking at the
first-order condition for the continuation utility:
(w1i): V′1i(w1i) + λ + μ [π′i(e1) / πi(e1)] = 0, i = L, H.   (27)

To interpret this condition we need to figure out the derivative of the value function of the principal, V′1i(w1i). We do this by using the envelope theorem and the second-period problem P2i that determines V1i(w1i):

V′1i(w1i) = −λi.

Substituting this derivative into (27) we get

λi = λ + μ [π′i(e1) / πi(e1)], i = L, H.

Note that this, combined with (26), implies λi = 1/u′(ci). What does the λi multiplier represent in the second period? It is the shadow value of relaxing the "promise keeping" constraint of the principal in the second period. The principal has committed to deliver a level of expected utility of w1i. How costly this is for him depends on the necessary spread of utilities in order to satisfy incentives in the second period. This can be seen formally by multiplying the first-order conditions for uij in (24) for each j times πj(e2i), and then summing the resulting equations for j = L and j = H; doing this we get that

π(e2i) / u′(ciH) + (1 − π(e2i)) / u′(ciL) = λi.

9 For details on these computations see, for example, Phelan and Townsend (1991) or Wang (1997).
The shadow value depends on the expected tradeoff between the marginal value to the principal of increasing consumption, −1, and the marginal increase in utility of spending this extra unit of consumption, u′(c). Now we take this condition further: Since we had established that λi = 1/u′(ci), we get the following relationship for the inverse of the marginal utility of consumption:

π(e2i) / u′(ciH) + (1 − π(e2i)) / u′(ciL) = 1/u′(ci).   (28)
This is the so-called "Rogerson condition," first derived in Rogerson (1985a).
It summarizes how the optimal dynamic contract with commitment allocates
incentives over time and histories. We now discuss its implications for the
choices of effort and consumption.
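Condition (28) also provides a convenient diagnostic for any computed contract. A small helper (our own construction, assuming the example's u and π with ρ = 0) returns the gap between the two sides of (28), which should be zero at the optimum:

```python
import math

def rogerson_gap(ci, ciH, ciL, e2i):
    """Difference between the two sides of (28) when u(c) = 2*sqrt(c), so that
    1/u'(c) = sqrt(c), and pi(e) = sqrt(e) with rho = 0 (full depreciation)."""
    p = math.sqrt(e2i)
    return p * math.sqrt(ciH) + (1 - p) * math.sqrt(ciL) - math.sqrt(ci)

# A history-independent (spot-like) plan generically violates (28):
print(abs(rogerson_gap(5.0, 7.0, 4.0, 0.25)) > 1e-9)   # True
```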

Effort and Consumption Choices Over Time
To illustrate the implications of the Rogerson condition, consider, for the sake of comparison, a slightly different model from the one presented here: Everything else equal, assume no commitment to long-term contracts for both the principal and the agent. This is often referred to as "spot contracting." For the purpose of our comparison, set the per period outside utility for the agent to w0/2 in both periods. It is easy to see that the solution to this problem without commitment is the repetition of the one-period optimal contract. This implies that the second-period consumptions would be independent of the first-period realizations, and hence identical to those in the first: cH = cLH = cHH, as well as cL = cHL = cLL. It is immediate that this solution to the spot contract violates (28).
How is the contract with commitment different than the repetition of the
static contract? The main difference is that with commitment the contract
exhibits memory, i.e., the level of consumption in the second period, contingent on a second-period realization, is different depending on the first-period
realization. Why is it optimal for the contract with commitment to be different
than the repetition of the static contract? Because it allows incentives to be
provided in a more efficient way. The reason becomes clear if we consider how
the principal can improve on the repetition of the static contract once he has
commitment to a two-period contract. If the agent gets a yH realization in the
first period, his overall expected utility increases if he trades off some of the
consumption that the static contract assigns him in the first period with some
expected consumption in the second. Because cH was high to start with, the
decrease in his first-period utility from postponing some consumption translates into a bigger increase in expected utility in the second period, where he


has positive probability of facing low consumption whenever yL realizes. This
means the principal can, with this deviation from the spot contract solution,
keep some of the consumption for himself while leaving constant the expected
utility following the high realization node in the first period, i.e., uH + βwH .
In the same way, if the agent gets a yL realization in the first period, he is
better off by trading some expected utility in the second period for some consumption in the first, and this again saves resources for the principal. Hence,
in the optimal contract, we have that w1H > w1L . It is worth noting that these
optimal tradeoffs result in a violation of the Euler equation of the agent, which
is incompatible with (28).10
The last first-order condition of problem P1 left to analyze is that of effort in the first period:

(e1): Σi=L,H π′i(e1) (yi − ci + βV1i(w1i)) + μ Σi=L,H π″i(e1) (ui + βw1i) = 0.

This condition captures the same tradeoff discussed after deriving the second-period effort first-order conditions in (25). Of course, the values of the variables and multipliers will typically be different than in the second period, implying a different solution across periods. To gain some important insight into the properties of effort requirements over time, it is again useful to compare the effort solution here to that of the spot contract without commitment. It is easy to see that the repetition of the static contract would imply
e1 = e2H = e2L .11 Here, instead, this is not the case. If we recall that the
optimal contract implies w1H > w1L , a simple inspection of the second-period
problem P2 tells us that, for the principal, effort incentives are more expensive
following a yH realization than a yL realization. The continuation utility w1i
plays the role of the outside utility in a static contract. It is immediate from the
risk aversion of the agent that, for the same spread of utility that would satisfy
the IC in (23), a higher level of outside utility translates into more consumption. Hence, the principal will optimally choose e2H < e2L . Moreover, in the
second period the principal cannot provide incentives for effort as efficiently
as in the first period, since the intertemporal tradeoff of consumption that we
described above is not available (there are no future periods after t = 2). This
will typically imply a lower effort requirement in the second period than in
the first. We conclude that, in contrast with the first-best property summarized
in 2A, effort requirements will fluctuate over time and across histories in the
unobservable effort case in order to provide incentives more efficiently.
The solution to this version of our numerical example is presented in Table 3 and Figures 1 and 2. We defer the discussion of this solution example until Section 6, where we compare the solution to the unobserved effort case both with full depreciation and without.

10 This follows from Jensen's inequality and the convexity of 1/u′(c). For details, see Rogerson (1985a).
11 Simply set w1H = w1L and note that π′H(e1) = −π′L(e1).

4. DEALING WITH PERSISTENCE

The simplifications outlined in the previous section, when effort is not persistent, do not hold for the general case of ρ > 0. Before we go on to analyze a
particular case of human capital accumulation in Section 5 and illustrate the
differences, we discuss here the main particularities that persistence of effort
introduces in the analysis of the optimal contract.
Two main differences with respect to the standard framework appear when
effort is persistent. First, it is no longer the case that a given choice for
effort in the second period provides the agent with the same expected utility
w1i regardless of his first-period effort choice e1 . It follows that the number
of relevant incentive constraints is much higher in the problem with persistence. Second, the problem of the principal cannot, in general, be written
in the usual recursive form in which the promised utility w1i summarizes all
relevant information about past periods. The relevant summary variable is
the original W1i (u, e), which depends on both the first- and the second-period
effort choices. The dependence of W1i (u, e) on e1 complicates the calculation
of its possible values. In particular, this state variable is not a number (like
w1i was) but a function: The principal needs to take into account all possible
choices for e1 , including those off the equilibrium path. Finally, the conditions
for concavity of the agent’s problem in the IC are difficult to establish, even
in the two-outcome case presented here.
These issues have so far been addressed in the literature with two main
strategies. The first strategy limits the effort choices to a two-point set, and
includes explicitly in the problem of the principal the complete list of relevant
incentive constraints for all possible combinations of effort choices. The
second strategy allows for a continuum of effort choices, but puts restrictions
on the functional form of π (e1 , e2 ) in order to simplify the set of constraints.
These approaches are now discussed in some detail.

A Hands-On Analysis of the Joint Deviations Problem
Within the first approach, the main contribution is Fernandes and Phelan
(2000). They provide a tractable setup in which an augmented recursive formulation of the problem of the principal is possible. Intuitively, this formulation has an increased number of state variables with respect to the recursive
formulation of the moral hazard problem without persistence first presented
in Spear and Srivastava (1987). The simplified framework that allows for the
recursive formulation limits the effort choices and the output realizations to
two. Also, the contract lasts for an infinite number of periods but persistence


lasts only for one period; that is, effort at time t affects only the probability
distribution over outcomes at time t and t + 1. The recursive formulation of
the problem of the principal has three state variables, one of which is the standard promised utility in Spear and Srivastava's formulation. The two extra
states allow the principal to keep track of the marginal disutility of effort for
the agent across periods, as well as the set of utilities achievable by the agent
off the equilibrium path.
Still within the first approach, Mukoyama and Şahin (2005) limit the
effort choices and the output values to two and analyze a two-period problem. They assume that high effort is optimal every period. They are able
to provide analytical conditions on the conditional probability function under
which the implications of persistence are drastically different than those of no
persistence: When the first-period effort affects the second-period probability in a sufficiently stronger way than the second-period effort, the optimal
contract exhibits perfect insurance in the initial period. Using a recursive formulation in the spirit of Fernandes and Phelan (2000), Mukoyama and Şahin
also analyze a three-period problem numerically.
Kwon (2006) uses a very similar framework with discrete effort choices
(0 or 1), also assuming that high effort is implemented every period. He
imposes concavity of π (·) on the sum of past effort choices, so past effort is
more effective than current effort. These assumptions allow him to analyze a
T > 2 period problem that shares the same perfect insurance characteristic as
in Mukoyama and Şahin (2005).

A Particularly Simple Case of Persistence
The second approach, presented in Jarque (2010), allows for a continuum of
effort choices but assumes that the conditional probability depends on past effort choices only through the sum of undepreciated effort in the same manner
as stated in Assumption 2. Note that, even for a concave probability function
π (s), Assumption 2 implies that past effort is less effective than current effort
in contrast to what was assumed in Mukoyama and Şahin (2005) or Kwon
(2006). The article shows that, for a subset of problems with this particular
form of persistence, the computation of the optimal contract simplifies considerably. For these problems, an auxiliary standard repeated moral hazard
problem without persistence can be used to recover the solution to the optimal contract. The linearity in effort of both variable s (which determines the
probability distribution) and the utility of the agent dramatically simplifies the
structure of the joint deviations across periods; in practice, we can think of s
as the choice variable, and the structure of the resulting transformed problem
is (under some conditions) equivalent to that of a standard repeated moral
hazard.


In the next section, a finite version of the model in Jarque (2010) is
presented and this result is explained in detail. The finite version allows for the
numerical computation of the optimal contract in an example in
which the stochastic structure is interpreted as unobservable human capital
accumulation.

5. HIDDEN HUMAN CAPITAL ACCUMULATION

The problem of the principal is again as in problem SB, but now we consider
the case ρ > 0. We argued in Section 4 that this case is more complicated
because of the dependence of second-period utility and optimal actions of the
agent on first-period choices. In order to get around some of these difficulties,
here we adapt to our two-period finite example the strategy presented in Jarque
(2010) for solving problems with persistence. Following this work we will
show that, under our assumptions, the structure of the problem simplifies to that
of the standard repeated moral hazard presented above, provided the domain
constraints in (ED) do not bind. This is an important qualification since, as we
learned when analyzing the case of observable human capital accumulation in
Section 2, in the presence of persistence the effort domain constraints in (ED)
will sometimes bind, especially for high values of the persistence parameter ρ.
To deal with this issue, we follow the approach in Jarque (2010): First, we find
a candidate solution assuming that the constraint in (ED) does not bind. Then
we need to check numerically that this constraint is indeed satisfied to be sure
that we have found a true solution. Unfortunately, a general analysis of the
optimization problem of the principal including the inequality constraints for
effort (again, as in Section 2) is more difficult with unobserved effort. Hence,
finding the properties of the general case when constraint (ED) binds remains
a question for future research.

Rewriting the Problem
Jarque (2010) shows that, whenever the effort domain constraint (ED) is not
binding, we can find the solution to the problem with persistence using a
related RMH problem without persistence as an auxiliary problem. The key
observation for that result is that we can write the expected utility of the agent,
W0 (u, e), as a function of the s variable only. This is convenient because
s is the variable that effectively determines the probability distribution over
outcomes each period; different combinations of effort choices that give rise
to the same s are equivalent both for the principal and for the agent. Hence,
once we rewrite the problem with s as the choice variable, there is no need to
consider joint deviations across periods, the recursive structure is recovered,
and we can solve for the optimal contract as we do with a standard repeated
moral hazard.


Let W̃0 (u, s) = W0 (u, e) for all the pairs of s and e sequences such that s results from effort choices in e according to the law of accumulation of human capital in (1). Writing the effort in the second period as

e2i = s2i − ρs1,

we have

W̃0 (u, s) = Σi=L,H πi(s1) ui − vs1
    + β Σi=L,H πi(s1) [Σj=L,H πj(s2i) uij − v (s2i − ρs1)].

Note that we have explicitly written the utility accrued in the first period in
the first row of this expression, and that of the second period in the second
row. With utility spelled out this way it is easy to see that, although s1 is all
accumulated in the first period, it appears both in the first- and second-period
utility. Also, since s1 is not contingent on any realization, it appears in the
second period both after observing a first-period yH and a first-period yL .
Hence, we can group the s1 terms of the second period together with those of
the first, to get an expression of the form

W̃0 (u, s) = Σi=L,H πi(s1) ui − v (1 − βρ) s1
    + β Σi=L,H πi(s1) [Σj=L,H πj(s2i) uij − vs2i].   (29)

This allows us to interpret s as the variable being chosen by the agent. In the
first period, we can interpret v (1 − βρ) as the “marginal disutility of exerting
s1 .” In the second period, the “marginal disutility of exerting s2 ” is instead v.
This rearrangement of terms and thinking about s as the choice variable
is a useful trick. Note that in the second row the expression inside the square
brackets is independent of s1 . Interpreting s2i as the choice variable, we can
see that we can do here as we did in the case of no persistence and write
the continuation utility of the agent independently of the first period’s choice
for s1:

Σj=L,H πj(s2i) uij − vs2i = W̃1i (u, s2i).

Hence, we obtain expressions that parallel those of the standard RMH formulation in (19). The expression in (29) can then simply be rewritten as

W̃0 (u, s) = Σi=L,H πi(s1) [ui + βW̃1i (u, s2i)] − v (1 − βρ) s1.

Note also that the structure of the incentive constraints simplifies as it did in the case of the RMH; in the second period, the first-period choice of s drops out:

Σj=L,H πj(s2i) uij − v (s2i − ρs̃1) ≥ Σj=L,H πj(s̃2i) uij − v (s̃2i − ρs̃1), ∀s̃1, s̃2i.

Again, all these changes of notation are simply aimed at pointing to the following fact: The problem in which effort is persistent has a similar structure
to that of a standard RMH problem in which s is interpreted as effort that is
not persistent, but has marginal disutility of v (1 − βρ) at t = 1 and of v at
t = 2. To make this explicit, using the intertemporal regrouping of s1 , the
problem of the principal in SB can be written as problem SB’:
max(u,s) V (u, s)

s.to

w0 ≤ W̃0 (u, s),
W̃0 (u, s) ≥ W̃0 (u, s̃) ∀s̃ ≠ s,
ui ∈ Ui ∀i,
s1 ∈ S1,
s2i ∈ S2, i = L, H,

with S1 = [e, ē] and S2 = [ρs1 + e, ρs1 + ē]. This rewriting leads to the
following observation: If problem SB’ were in fact formally equivalent to a
standard RMH problem (with the modified structure of the marginal disutility),
this would help us enormously to find and characterize the solution to SB, since
we would know how to solve it (or at least compute it numerically). However,
a close inspection of SB’ points to a small but potentially important difference
with a standard RMH problem: In problem SB’, the domain S2 depends on
the choice of s1 , while in a standard RMH problem this domain would be
exogenously given.

Using a Related RMH Problem without Persistence
as an Auxiliary Problem
Following Jarque (2010), we now show that, in some instances, we can work around the difficulty that an endogenous domain S2 poses by using a related auxiliary problem for our purposes instead of SB'. Consider a problem SBaux that is equal to SB' except for the domain S2, which is substituted by an auxiliary domain S̃2 = [e̲, ē]. Note that S̃2 is exogenous so, interpreting s as effort, problem SBaux is a standard RMH. We will now argue that, under some conditions, the solution to SB' coincides with the solution to SBaux, and hence we can easily obtain a solution to our problem with persistence.

The solution to problems SB' and SBaux coincides when two conditions are satisfied: (i) W0(·) is concave in s, and (ii) the resulting optimal choices for effort are interior. This is a set of sufficient conditions because if the


expected utility of the agent is concave in his choice of s, then the relevant effort deviations are those close to the optimal (interior) s, and not those at the limits of the domain. This implies that using an auxiliary domain that does not exactly overlap with the true domain does not change the solution to the problem, as long as the true solution is contained in the auxiliary domain. Is each of these conditions satisfied in our framework?
(i) Concavity of W0 (·) in s. In our particular example, it is easy to argue
that the problem of the agent is concave in st for all t. In fact, the argument
is the same that we used earlier to argue that problems PIC1 and PIC2 were
concave: There are only two outcomes, the probability of observing yH is
concave in st , and current and future utility assigned to yH is always higher
than current and future utility assigned to yL .
(ii) Effort is interior. This is not satisfied trivially. Constraint (ED) implies that two restrictions need to be checked to establish that the true solution is contained in the proposed auxiliary domain:

$$s_{2i} < \rho s_1 + \overline{e}, \quad i = L, H, \qquad (30)$$
$$s_{2i} > \rho s_1 + \underline{e}, \quad i = L, H. \qquad (31)$$

Under the probability specification in (7), equation (30) is always satisfied. Other specifications are easy to find for which the upper bound of effort in (30) is not binding. The lower bound, however, is endogenous, and equation (31) cannot be checked without having the solution for s in hand. We conclude that interiority cannot easily be guaranteed ex ante. The strategy proposed in Jarque (2010) to get around this problem is the following: Solve the problem assuming that the domain constraint can be substituted (and hence the equivalence to the RMH can be used) and then, with a candidate solution for s in hand, check the constraint ex post. We follow this route in the numerical computation of an example presented next. As it turns out, it is easy to find parametrizations for which the ex post check on the nonnegativity of effort is satisfied.
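In code, the ex post step is a one-line inequality check on the candidate solution. The sketch below uses ρ = 0.2 and the effort bounds of the example in Section 6; the candidate values for s are hypothetical placeholders, not the computed solution:

```python
# Ex post interiority check (30)-(31) on a candidate solution from the
# auxiliary problem. Candidate s values are hypothetical placeholders.
rho, e_lo, e_hi = 0.2, 0.01, 0.99           # bounds as in the Section 6 example

s1 = 0.14                                    # candidate s1 (hypothetical)
s2 = {'L': 0.08, 'H': 0.12}                  # candidate s2i (hypothetical)

def interior(s1, s2):
    """True if rho*s1 + e_lo < s2i < rho*s1 + e_hi for i = L, H."""
    return all(rho * s1 + e_lo < s2[i] < rho * s1 + e_hi for i in 'LH')

# If the check passes, the SBaux solution also solves SB; otherwise the
# equivalence to the standard RMH problem breaks down.
print(interior(s1, s2))                      # True for these placeholder values
```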

The Optimal Contract for Hidden Human
Capital Accumulation
What do we conclude about the properties of the optimal contract in the presence of hidden human capital accumulation? Denote as c̃* and ẽ* the solution to problem SBaux. Whenever the sufficient conditions discussed above are satisfied, we have that, in the optimal contract:

1. The optimal consumption sequence in problem SB, c*(P, SB), is equal to c̃*.

2. The optimal human capital sequence in SB, s*(P, SB), is equal to ẽ*.

3. The optimal effort sequence in SB, e*(P, SB), can be recovered from the effort solution to problem SBaux using

$$e_1^*(P, SB) = \tilde{e}_1^*, \qquad e_2^*(P, SB) = \tilde{e}_2^* - \rho\, \tilde{e}_1^*.$$
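The recovery in item 3 is mechanical. A minimal sketch, where e1_aux and e2_aux stand in for the (hypothetical) effort solution of SBaux:

```python
# Recovering the SB effort sequence from the auxiliary solution (item 3).
# e1_aux and e2_aux are hypothetical stand-ins for the SBaux effort solution.
rho = 0.2
e1_aux, e2_aux = 0.14, {'L': 0.08, 'H': 0.12}

e1_sb = e1_aux                                        # e1*(P, SB) = e1_aux
e2_sb = {i: e2_aux[i] - rho * e1_aux for i in 'LH'}   # e2*(P, SB) = e2_aux - rho*e1_aux
print(e1_sb, e2_sb)   # second-period effort is lower once inherited capital is netted out
```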

Importantly, the optimal consumption sequence has the same properties as in
the solution to a standard RMH problem without persistence. Also, the optimal
human capital sequence has the same properties as the effort sequence in a
standard RMH problem. These properties were discussed at length in Section
3. Using these properties, we can reflect on the economic meaning of the ex
post check implied by equation (31).
Whenever the ex post check in (31) is satisfied, the optimal contract asks
the agent to increase human capital in every period. That is, the remaining level
of human capital from the previous period, after depreciation, ρs1 , is never
sufficient to cover the requirement of human capital for the current period, s2i
for i = L, H . In light of the properties of effort in a standard RMH problem,
it is easy to see that this condition may not be satisfied in some examples since
a decrease in the level of human capital from one period to the next could
be part of the optimal solution for the principal. In particular, we learned in
Section 3 that in an interior solution we will typically have e2H < e1 , since the
smoothing of incentives that is present in the first period is not available in the
second, making effort in the second period relatively more expensive. Given
the results we just established for the case with persistence, this means that
we will typically have s2H < s1 in the optimal contract with hidden human
capital accumulation. How does this lead to a violation of the ex post check
in equation (31)? For certain parameters, we may have that s2H is so much smaller than s1 that, in fact, we have s2H < ρs1 + e̲, violating the interiority of effort choices. That is, if it were feasible, the principal would choose to have s2 lower than ρs1 + e̲. However, in the true problem with human capital accumulation (problem SB), effort needs to stay within its domain in each period, i.e., e2i > e̲ for all i, which rules out the possibility of decreasing s2 below ρs1 + e̲. Any adjustment should be made in the first period, when the
principal anticipates the added cost of future incentives. That is, the solution
for s1 should differ from the one that was just presented. Unfortunately,
characterizing how exactly the solution for s1 changes is not easy. Solving for
the optimal contract in this case becomes more complicated. As we argued,
the independence of second-period choices from first-period choices breaks
down, both for the principal and for the agent. In practice, even the numerical
computation of examples is more involved, since all feasible combinations of
effort across the two periods (and choices contingent on realizations of output)
need to be tested for incentive compatibility. The simple recursive structure
with w2i as a state variable is no longer valid, and the dimensionality of the


computational problem is similar to that of the strategy proposed in Fernandes
and Phelan (2000).
The next section presents an example for which the ex post check in
(31) is satisfied, and hence solving for the optimal contract is simple. Using the numerical solution, we discuss the implications of persistence for
consumption and effort paths by comparing the solution to that of the case
without persistence (ρ = 0).

6. NUMERICAL EXAMPLE WITH UNOBSERVED EFFORT: A COMPARISON

For cases in which the equivalence to an RMH is valid, we can find the solution to our problem with persistence using the usual numerical methods for solving standard RMH problems without persistence.

Figures 1 and 2 illustrate the implications for effort and consumption in the solution to an example with the parameter values listed in Table 1. The example without persistence has ρ = 0, while the example with persistence has ρ = 0.2. For the numerical examples we use the functional form u(c) = 2√c and the probability specification in (7). We also set e̲ = 0.01 and ē = 0.99 in order to restrict attention to cases with full support.
In Figure 1, the solution for s and e in the SB problem with persistence
is plotted with a solid line. As we can see in the top panels, the level of
s1 in problem SB is always higher with persistence than without persistence
(dashed line). Since s1 = e1 , a higher level of s1 with persistence reflects the
fact that human capital is accumulated in the first period with the same cost
as nonpersistent effort, but it lasts (partially) until the following period.12
The solutions for the paths of optimal s in the FB model are also represented in Figure 1 (dotted and dash-dotted lines, respectively, for the persistent and nonpersistent cases). The comparison clearly shows that human capital accumulation makes frontloading of s optimal. (This also translates into frontloading of effort, as shown clearly in the bottom panels of Figure 1.) The main difference with the solutions to the respective SB problem is the level (higher in the FB problem). A second difference is that, even without persistence, in the second period the requirement for s may decrease in the SB problem, for incentive reasons, following both realizations (although the decrease may be more pronounced after yH), and hence we have s1 > s2i for all i.
As we can see in the bottom panels of the solution to the SB problem, both with persistence and without, effort is higher in the initial period than in the second.

12 The level of s2i in this example coincides with and without persistence for all i. This is particular to this example and is violated if, for example, the level of w0 is modified. Although human capital in the second period is equivalent to nonpersistent effort (because there are no further periods to exploit the persistence of human capital), the optimal choice for w2i will typically be different across the two models.


Figure 1 Contingent Paths for Human Capital and Effort in the Optimal Contract with and without Effort Persistence, both for the First-Best and the Second-Best Models

[Four panels: human capital s(y1) (top row) and effort e(y1) (bottom row) at t = 1, 2, for first-period histories y1 = yL and y1 = yH; series plotted: FB P, FB NP, SB P, SB NP.]

Notes: s(y1): human capital contingent on history y1; e(y1): effort contingent on history y1; P: see Table 3, ρ = 0.2; NP: see Table 3, ρ = 0.

However, the frontloading of effort is much more pronounced with persistence. This is also true when comparing the solutions for the FB problem: While effort stays constant from one period to the next in the case without persistence, with persistence it is frontloaded, as discussed in Section 2.


Figure 2 Contingent Paths for Consumption in the Optimal Contract with and without Effort Persistence, both for the First-Best and the Second-Best Models

[Four panels: consumption c(y1, y2) at t = 1, 2, one panel per history: c(yL, yL), c(yL, yH), c(yH, yL), c(yH, yH); series plotted: FB P, FB NP, SB P, SB NP.]

Notes: c(y1, y2): consumption contingent on history (y1, y2); P: see Table 3, ρ = 0.2; NP: see Table 3, ρ = 0.

Consumption, depicted in Figure 2, is, in the SB case, virtually the same
with and without persistence. It simply increases when the realization is
yH and decreases when it is yL for the standard incentive provision reasons
discussed in the earlier sections. However, we can see in the FB case that
consumption is slightly lower in the case with persistence. Since the FB case


is calculated numerically but without using a grid, we conclude that most likely consumption is also slightly lower with persistence in the true solution to the unobservable effort case.

Table 3 Summary Statistics

              ρ = 0.2 (FB)      ρ = 0.0 (FB)      ρ = 0.2 (SB)         ρ = 0.0 (SB)
              t = 1    t = 2    t = 1    t = 2    t = 1    t = 2       t = 1    t = 2
E(ct*)        6.12     6.12     5.82     5.82     5.30     5.47        5.16     5.30
E(u(ct*))     4.95     4.95     4.83     4.83     8.26     11.32       7.74     10.90
Var(ct*)      0        0        0        0        4.96     13.69       4.34     13.96
E(et*)        0.22     0.16     0.17     0.17     0.14     0.05        0.11     0.084
Var(et*)      0        0        0        0        0        0.00023     0        0.00022
E(st*)        0.22     0.16     0.17     0.17     0.14     0.0828      0.11     0.0842
Var(st*)      0        0        0        0        0        0.00023     0        0.00022
Table 3 reports the value of some simple statistics of the comparison across the two models presented in Figures 1 and 2. The FB model statistics are included for reference, since they correspond to the solutions reported already in Sections 1 and 2. All expectations in the first period are conditional on s1*, and those in the second are conditional on s2i*. When comparing the statistics for the SB problem, we see that persistence implies a higher level of expected consumption, a higher expected utility, and a slightly higher variance of consumption in the first period. When looking at these three moments across periods, we see that persistence implies a steeper increase of expected consumption over time. Again, the statistics on consumption need to be interpreted with care since they are likely influenced by the use of a grid.
As for expected effort, we see that the level is higher with persistence
in the initial period, but it drops below the no persistence case in the second
period (a much steeper decrease than without persistence). The comparison
of the expected accumulated human capital explains this: The expected level
of s1 with persistence is much higher than the level of e1 without persistence,
but the solution for s2 with persistence is similar (in this particular example,
identical) to the solution for e2 without persistence.

7. CONCLUSION

When learning by doing is an important factor in a repeated agency relationship, solving for the optimal contract is generally very difficult. In the
framework studied here, with linear disutility of effort and the productivity of
the agent being a distributed lag of past efforts, we provide an example with
a simple solution. This allows us to numerically establish some properties of

A. Jarque: Hidden Human Capital Accumulation

371

the optimal contract. On one hand, the human capital of the agent in equilibrium and, hence, his productivity tend to be higher with learning by doing
than without. Moreover, the optimal contract offered to the employee implies
a lower productivity in the final years of the contract. The human capital of
the agent is left to depreciate since, close to the end of the contract, the cost
of incentives of requiring a higher productivity is not justified by the benefit
of future productivity. This implies that, over the contractual relationship,
effort is frontloaded and follows a steeper decreasing pattern than in the case
without learning by doing. On the other hand, we find that the properties of
wage dynamics remain unchanged with respect to those of the optimal contract
without learning by doing.

REFERENCES
Ales, Laurence, and Pricila Maziero. 2009. “Accounting for Private
Information.” Mimeo.
Arrow, Kenneth J. 1962. “The Economic Implications of Learning by
Doing.” The Review of Economic Studies 29 (June): 155–73.
Fernandes, Ana, and Christopher Phelan. 2000. “A Recursive Formulation
for Repeated Agency with History Dependence.” Journal of Economic
Theory 91 (April): 223–47.
Gibbons, Robert, and Kevin J. Murphy. 1992. “Optimal Incentive Contracts
in the Presence of Career Concerns: Theory and Evidence.” Journal of
Political Economy 100 (June): 468–505.
Grossman, Sanford J., and Oliver D. Hart. 1983. “An Analysis of the
Principal-Agent Problem.” Econometrica 51 (January): 7–45.
Heckman, James J., Lance Lochner, and Christopher Taber. 1998.
“Explaining Rising Wage Inequality: Explorations with a Dynamic
General Equilibrium Model of Labor Earnings with Heterogeneous
Agents.” Review of Economic Dynamics 1 (January): 1–58.
Jarque, Arantxa. 2010. “Repeated Moral Hazard with Effort Persistence.”
Journal of Economic Theory 145 (November): 2,412–23.
Jewitt, Ian. 1988. “Justifying the First-Order Approach to Principal-Agent
Problems.” Econometrica 56 (September): 1,177–90.
Kapička, Marek. 2008. “Efficient Allocations in Dynamic Private
Information Economies with Persistent Shocks: A First-Order
Approach.” Mimeo, University of California, Santa Barbara.


Kwon, Illoong. 2006. “Incentives, Wages, and Promotions: Theory and
Evidence.” RAND Journal of Economics 37 (Spring): 100–20.
Lemieux, Thomas, W. Bentley MacLeod, and Daniel Parent. 2009.
“Performance Pay and Wage Inequality.” Quarterly Journal of
Economics 124 (February): 1–49.
Lucas, Robert E., Jr. 1988. “On the Mechanics of Economic Development.”
Journal of Monetary Economics 22 (July): 3–42.
MacLeod, W. Bentley, and Daniel Parent. 1999. “Job Characteristics, Wages,
and the Employment Contract.” Federal Reserve Bank of St. Louis
Review (May): 13–27.
Mukoyama, Toshihiko, and Ayşegül Şahin. 2005. “Repeated Moral Hazard
with Persistence.” Economic Theory 25: 831–54.
Phelan, Christopher. 1994. “Incentives, Insurance, and the Variability of
Consumption and Leisure.” Journal of Economic Dynamics and Control
18: 581–99.
Phelan, Christopher, and Robert M. Townsend. 1991. “Computing
Multi-Period, Information-Constrained Optima.” Review of Economic
Studies 58 (October): 853–81.
Prescott, Edward S. 1999. “A Primer on Moral-Hazard Models.” Federal
Reserve Bank of Richmond Economic Quarterly 85 (Winter): 47–77.
Rogerson, William P. 1985a. “Repeated Moral Hazard.” Econometrica 53
(January): 69–76.
Rogerson, William P. 1985b. “The First-Order Approach to Principal-Agent
Problems.” Econometrica 53 (November): 1,357–67.
Spear, Stephen E., and Sanjay Srivastava. 1987. “On Repeated Moral Hazard
with Discounting.” Review of Economic Studies 54 (October): 599–617.
Wang, Cheng. 1997. “Incentives, CEO Compensation and Shareholder
Wealth in a Dynamic Agency Model.” Journal of Economic Theory 76
(September): 72–105.

Economic Quarterly—Volume 96, Number 4—Fourth Quarter 2010—Pages 373–397

News Shocks and Business
Cycles
Per Krusell and Alisdair McKay

The discussion surrounding the recent deep recession seems to have
shifted the focus from currently used business cycle models to the
standard Keynesian model (by which we mean the “old Keynesian,”
as opposed to the new Keynesian, model). In the Keynesian model, pessimism
among consumers and investors about the economy will simultaneously lower
aggregate consumption and aggregate investment, as well as aggregate output,
through an increase in the rate of unemployment, and more generally through
lower capacity utilization. Moreover, in the Keynesian model, pessimism and
optimism are not determined within the model—they appear exogenously and
they disappear exogenously. The analysis is then about how the economy
reacts to these exogenous events. Undoubtedly, there are many indications
that consumers and investors seemed pessimistic about their prospects during
the recession, but does such pessimism necessitate the reversion back to the
Keynesian model? The present article reviews and contributes to a recent
strand of the “modern” business cycle literature, i.e., the literature that insists
on building a model of the economy that is explicit about its microeconomic
foundations and that addresses a related question: Can news shocks generate
positive co-movement among our macroeconomic aggregates? An example
of a negative news shock would be the sudden arrival of information indicating that future productivity will not be as high as previously thought. Thus,
such a shock would generate current pessimism, and yet be grounded in real
and fundamental developments. Another kind of news shock would be a government announcement about a policy change to be implemented on a future
date (say, that taxes will be raised beginning next year). In this recent literature, thus, optimism and pessimism are examined as determinants of business
Krusell is affiliated with the Institute for International Economic Studies and is a visiting
scholar with the Federal Reserve Bank of Richmond. McKay is affiliated with Boston University. The views expressed here do not necessarily reflect those of the Federal Reserve Bank
of Richmond or the Federal Reserve System. E-mails: Per.Krusell@iies.su.se; amckay@bu.edu.


cycle fluctuations, but as add-ons to otherwise microfounded macroeconomic
models, and moreover they are tied in a systematic way to anticipated changes
in the economy’s fundamentals.
Models of business cycles that rely on microeconomic foundations generate fluctuations in economic activity in response to fluctuations in fundamentals, such as preferences, technology, or government policy. The first
generations of these models (Kydland and Prescott 1982) relied on technology shocks, i.e., shocks to aggregate productivity; such a shock, if positive
and persistent, would raise output directly, via an increase in aggregate employment, and as a consequence raise both consumption and investment, thus
generating the kind of co-movement we observe in aggregate time series.
Shocks to government expenditures have been considered as well, as have
preference shocks (for consumption now versus consumption in the future),
though these shocks alone do not easily generate co-movement in the remaining aggregate variables. For example, when government spending rises there
is strong pressure on either consumption or investment to fall, unless hours
worked (or perhaps capital utilization) rises significantly; hours worked might
increase if there is a significant wealth effect in labor supply, but in standard
parameterizations the wealth effects are not strong enough.
The new literature begins with Beaudry and Portier (2006, 2007), who analyze time-series data and conclude that news about future productivity may be
an important driver of business cycles and then go on to discuss in what model
economies news can generate co-movement. We briefly review the data analysis in Section 1. In Section 2, we explain why news shocks, like some other
shocks, do not readily generate co-movement in standard neoclassical settings.
Beaudry and Portier suggest their own setting, wherein news shocks have the
desired effect, but there are other frameworks that generate co-movement in
response to news shocks as well. Section 3 describes a very simple setting
that we think has most, if not all, of the necessary qualitative effects: the
Pissarides (1985) model. This model is a general-equilibrium description of
labor markets with search/matching frictions in which unemployment is an
equilibrium phenomenon. Capital does not play a major role in the simplest
version of the model, though the number of firms, which is endogenous and
depends on labor market conditions and on (current and future) productivity,
can be given the interpretation of capital, and the creation of new firms can
be interpreted as investment. We show that in this model, news about, say, a
decline in future productivity—pessimism—will lead fewer firms to enter on
impact. Thus, investment falls. Moreover, there is a rise in unemployment,
along with a stock market bust, which we measure as the value of the firms in
the market. If, in addition, the economy has access to a storage technology,
or the economy is open, a fall in consumption can result as well. Thus, the
model can generate co-movement in all macroeconomic variables. We then


review, in Section 4, other settings proposed in the literature that achieve the
same goals, and in Section 5 we offer conclusions.

1. EVIDENCE OF NEWS SHOCKS AND THEIR EFFECTS

Typical Business Cycle Co-Movements
What features of the business cycle might one expect models to capture? Perhaps the key characteristic of the business cycle is the co-movement of broad
measures of economic activity. A business cycle expansion typically involves
rapid growth of output, consumption, and investment and high levels of employment and hours worked. Another distinguishing feature of business cycles
is the frequency of expansions and contractions. Business cycle fluctuations
are typically thought to have a frequency of longer than one year but shorter
than one decade. Finally, one might ask a model to match the magnitude of
business cycle fluctuations in both absolute as well as relative terms. While an
ideal model of the business cycle would be accurate along all these dimensions,
the focus of the discussion here is on matching co-movements.

VARs and Other Evidence
Much of the interest in news shocks stems from the empirical work of Beaudry
and Portier (2006, 2007), who present evidence that news of productivity
shocks arrives in advance of actual changes to productivity. Their evidence
is based on two structural vector autoregressions (VARs). The VARs use the
same two variables, stock prices and total factor productivity (TFP), but they
differ in their structural identification schemes. In the first VAR, the authors
identify a shock to stock prices that is orthogonal to the current TFP shock. In
the second VAR, they use a long-run restriction to identify shocks to long-run
TFP. The authors find that the stock price shock from the first VAR and the long-run TFP shock from the second VAR are highly correlated, which suggests
that stock market participants are able to predict future innovations to TFP.
Information about future economic conditions should be reflected by many
forward-looking variables beyond stock prices. Beaudry and Portier (2006)
introduce consumption and hours into their VAR system and obtain similar
results to their baseline bivariate VAR. Moreover, the authors show that these
“news” shocks explain a substantial fraction of movements in consumption,
investment, and hours worked at business cycle frequencies.
The empirical relevance of news and other informational shocks for business cycle analysis is an active area of research. Barsky and Sims (2008) consider another forward-looking variable: consumer confidence as measured by
the Michigan Survey of Consumers. One of the questions in the Michigan
survey asks respondents for their expectations of national economic conditions for the next five years. Barsky and Sims show that consumer confidence


is a useful predictor of changes in macroeconomic variables. They consider
two interpretations of this finding, which they term the “animal spirits” view
and the “superior information” view. The animal spirits view is that consumer
confidence, or confidence more broadly, directly causes an expansion of economic activity. The superior information view is that consumer confidence
reflects early knowledge of future economic conditions. The authors use a
VAR analysis to distinguish between these two possibilities. The key findings
are that innovations to confidence are highly correlated with innovations to
long-run output and not correlated with transitory innovations to output. These
results suggest that the superior information channel is the operative one because output growth that is not associated with increases in potential output,
as in the animal spirits view, should be short-lived. These results support the
finding of Beaudry and Portier that agents receive signals about productivity
changes ahead of the actual change in productivity.
Sims (2009) proposes a method for identifying news shocks that is an
alternative to the Beaudry and Portier approach. He estimates a VAR with
data on TFP (corrected for capacity utilization), output, consumption, hours,
stock prices, inflation, and consumer confidence. The latter two variables
are intended to augment the information about future productivity provided
by stock prices. After estimating the reduced-form VAR, Sims identifies the
unanticipated shock to TFP with the reduced-form innovation to TFP and
then identifies the news shock as the linear combination of the reduced-form
innovations that best explains the remaining movements in future TFP. The
response of the economy to news shocks under Sims’s identification is quite
different from its response to news shocks under the Beaudry and Portier
identification. Sims finds that a favorable news shock leads to an increase
in consumption but declines in hours, investment, and output on impact. As
we discuss in Section 2, these are the co-movements that the standard real
business cycle (RBC) model would predict for a news shock.
Blanchard, L’Huillier, and Lorenzoni (2009) investigate news shocks in a
context in which agents are unsure about the exact nature of the innovation to
productivity. Their model features permanent shocks to productivity that build
up gradually over time as well as transitory shocks to productivity. Agents
are not able to observe the two components of productivity separately, but
instead observe the level of productivity and a noisy signal about the permanent component of productivity. The noisy signal fluctuates for two reasons:
news and noise. Here news shocks are the permanent productivity shocks
that because of their gradual effect on productivity, are largely information
about future productivity rather than changes in current productivity. Noise
shocks, by contrast, are shocks to the signal that are unrelated to changes in
productivity. Ideally agents would ignore the noise shocks, but they are unable
to fully distinguish between noise and news. The authors assume that agents
smooth consumption completely in the sense that they set consumption equal


to their estimate of the permanent component of productivity. In response to a
permanent productivity shock, consumption responds only gradually because
the agents are unsure if the productivity shock is permanent, and over time
they revise their estimates in favor of the shock being permanent. In response
to a transitory shock or a news shock, consumption responds initially, but over
time agents learn that the shock is transitory or nonexistent and consumption
returns to its initial level. Importantly, the authors demonstrate that a VAR
applied to data on productivity, consumption, and the productivity signal cannot produce impulse responses that match the true ones implied by the model.
The reason is that the model posits that consumption is a random walk, and so
the VAR, which makes use of current and past observations, cannot identify a
shock that has a transitory impact on consumption. If it could identify such a
shock, then the agents in the model, who have at least as much information as
the econometrician, also would see the transitory dynamics in consumption
and would adjust their consumption to eliminate them. Therefore, the consumption response to any shock the econometrician can identify must be flat.
Moreover, it is not enough to allow the econometrician to use observations
from the future. The problem that arises is related to the invertibility problems
discussed by Fernández-Villaverde et al. (2007). When some state variables
are hidden from the econometrician, an innovation in the statistical model may
either be the result of an economic shock or the result of a discrepancy between the econometrician’s beliefs about the state variables and the true state.
Only if the econometrician can infer the value of the state with certainty can
he or she be certain about what is a shock and what is a “mistake” about the
state. Blanchard, L’Huillier, and Lorenzoni show numerically that even with
a large amount of data from the future, the econometrician is still uncertain
about the state and therefore still uncertain about the shocks that generated the
data. While news and noise shocks cannot be identified using VAR analysis,
the model can be estimated structurally and information about the shocks can
be recovered using the Kalman smoother. By imposing more structure on
the data, the authors are able to summarize, but not completely eliminate, the
uncertainty about the state variables and the economic shocks. The resulting
structural estimates imply that noise shocks are an important source of short-run volatility, accounting for 50 percent of the variance in consumption at a
four-quarter horizon. The remaining 50 percent of the variance in consumption is attributable to permanent and transitory productivity shocks in roughly
equal measures. The results suggest that the manner by which information
about changes in productivity disseminates is an important part of business
cycle analysis. An interesting avenue for further research would be to see how
the importance of noise shocks holds up in a richer model.
Additional evidence that noise shocks might be factors in aggregate fluctuations comes from the work of Rodríguez Mora and Schulstad (2007). These
authors observe that official estimates of gross national product (GNP) are


revised over time, and the revisions are often quite substantial. They treat the
final estimate of GNP as the true level of activity in a given quarter and the
initial estimate as the perception of that level at the time. Their main finding
comes from a regression of the true growth in GNP on the true growth and the
perception of growth in the preceding quarter. They find that perceptions of
growth in the previous quarter are useful in predicting future growth, but the
true growth in the previous quarter is not. Moreover, they show that perceptions of growth in the previous quarter affect GNP growth through investment
spending rather than consumption or government spending. These results
suggest that the evolution of macroeconomic aggregates depends in part on
perceptions of economic fundamentals that may not always be correct.
Finally, Schmitt-Grohé and Uribe (2009) investigate the importance of
news shocks using a structural estimation approach. These authors estimate an
RBC model that incorporates a number of real rigidities and structural shocks.
Specifically, they include permanent and transitory shocks to TFP, investment-specific productivity shocks, and government spending shocks. Each of the
shocks is composed of innovations that are anticipated at different horizons
ranging from zero quarters (unanticipated) to three quarters. Their posterior
mode attributes about 70 percent of the variance of output growth to anticipated
shocks and the posterior probability that this share is less than 50 percent is
essentially zero. Moreover, they find that output, hours, consumption, and
investment all increase in response to a positive anticipated transitory shock.
However, hours fall in response to a positive anticipated permanent shock.
The results in this article strongly support anticipated technology shocks as
sources of business cycle fluctuations.
All in all, much of the literature points to news and other informational
shocks as potentially important drivers of aggregate fluctuations. However, it
is far from clear yet how to best model and identify these disturbances. Relatedly, if one wanted operational measures of news shocks that could be fed
into a model and used to predict aggregate economic variables over the near
term, how would these shocks be constructed in practice (perhaps based on
current events)? The empirical studies discussed above define the shocks as
residuals based on an empirical (structural or semi-structural) specification;
direct measurement is hard, and estimates via, say, surveys regarding “consumer confidence” would tend to mix news shocks with other shocks. This
empirical problem, of course, is shared with, and arguably less severe than in,
traditional Keynesian methods.

2. THEORETICAL CHALLENGES
In light of the evidence that changes in TFP can be anticipated to a significant extent, a natural question is how such news shocks play out in the standard real business cycle model. The standard one-sector RBC model has

time-additive preferences for consumption and leisure of the form

$$E_0 \sum_{t=0}^{\infty} \beta^t\, u\!\left(C_t,\, \bar{H} - H_t\right), \qquad (1)$$

where β is the discount factor, u(·, ·) is the period utility function, Ct is the
flow of consumption, and Ht represents hours worked out of a maximum H̄ .
In the standard model, output is produced according to a constant returns-to-scale production function that combines capital and labor. The stochastic disturbances that drive the business cycle enter through the production function in the form of technology shocks. The most commonly assumed functional form for the production function is Cobb-Douglas, which leads to

$$Y_t = F(K_t, H_t, z_t) = z_t K_t^{\alpha} H_t^{1-\alpha},$$

with Kt being the stock of capital at the beginning of the period and zt being the level of technology in period t. Resources evolve according to

$$K_{t+1} = F(K_t, H_t, z_t) + (1 - \delta) K_t - C_t. \qquad (2)$$

This resource constraint implies that there is a single homogeneous good that
is freely used for consumption or as capital.
As the standard model is frictionless, the equilibrium behavior of the
economy can be found through solving a planner’s problem. The planner
chooses stochastic processes for C, H , and K to maximize expected utility
according to equation (1) subject to equation (2), the stochastic process for z,
and the initial condition for K0 . The first-order conditions of this problem can
be expressed as the usual Euler equation




$$u_C\!\left(C_t, \bar{H} - H_t\right) = \beta E_t\!\left[ R_{t+1}\, u_C\!\left(C_{t+1}, \bar{H} - H_{t+1}\right) \right], \qquad (3)$$

where Rt+1 is the marginal product (in equilibrium, the rental rate) of capital in period t + 1:

$$R_{t+1} = F_K(K_{t+1}, H_{t+1}, z_{t+1}) + 1 - \delta, \qquad (4)$$

and the efficiency condition for the labor-leisure tradeoff:

$$u_C\!\left(C_t, \bar{H} - H_t\right) w_t = u_H\!\left(C_t, \bar{H} - H_t\right), \qquad (5)$$

where wt is the marginal product of labor (in equilibrium, the wage rate) in period t:

$$w_t = F_H(K_t, H_t, z_t). \qquad (6)$$

Though a full account of the effects of shocks requires a full solution of the
stochastic general-equilibrium model and examination of its simulated time-series properties, one can obtain significant insight by looking at “unexpected
shocks to steady states.” That is, assume that an economy is in steady state and
will stay there until there is an actual change in the technological opportunities
that occurs with probability zero. The question at hand here is how knowledge


of a future change in technology will affect the economy in the intervening
periods before the changes actually occur. While the Beaudry-Portier evidence suggests that positive news about future productivity should lead to
something of a business cycle expansion, the standard one-sector RBC model
cannot generate such a response. To see why the standard model has trouble
generating a business cycle expansion in response to a positive news shock,
consider what is required of the four main variables in the model: output, consumption, investment, and hours. An expansion is marked by an increase in
both consumption and investment. In the standard model there are no imports
or exports and no government spending, so the aggregate resource constraint
requires that output must rise to allow consumption and investment to rise simultaneously. The only option for an increase in output is for hours worked to
rise as the technological opportunities are initially unchanged and the capital
stock is predetermined by what was installed in the previous period. However,
consumption and leisure are normal goods under standard preferences, so that
at a given wage (marginal product of labor) a household will choose to adjust
consumption and leisure in the same direction, i.e., consumption and hours in
opposite directions. To see this mathematically, equation (5) can be used to
implicitly differentiate H with respect to C. Doing so yields
$$H'(C) = -\,\frac{u_{CC}\, w - u_{CH}}{u_{HH} - u_{CH}\, w}, \qquad (7)$$

and decreasing marginal utility (uCC, uHH < 0) together with the weak complementarity of consumption and hours (uCH ≥ 0) imply this expression is
negative. So hours and consumption must move in opposite directions when
wages are held constant. The only hope for the model is that in equilibrium
wages increase so that the substitution effect raises hours, but, as was already
noted, the capital stock and technology have not changed so increased hours
will lead to lower wages in equilibrium. The implication is that the equilibrium response of the standard one-sector RBC model to a positive news shock
does not look like a business cycle expansion.
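The sign in equation (7) can be verified symbolically. A minimal sketch, assuming the separable period utility u(C, leisure) = ln(C) + ψ ln(leisure) (one convenient member of the class with uCC, uHH < 0 and uCH ≥ 0; the functional form is our choice, not the article's):

```python
# Symbolic check of the sign in equation (7) under an assumed separable
# utility u(C, leisure) = ln(C) + psi*ln(leisure): u_CC, u_HH < 0, u_CH = 0.
import sympy as sp

C, H, Hbar, w, psi = sp.symbols('C H Hbar w psi', positive=True)

# Labor-leisure condition (5) at a fixed wage: u_C * w - u_leisure = 0
foc = w / C - psi / (Hbar - H)

# Implicit function theorem: H'(C) = -foc_C / foc_H
dH_dC = sp.simplify(-sp.diff(foc, C) / sp.diff(foc, H))
print(dH_dC)   # -w*(H - Hbar)**2/(C**2*psi): negative, so hours fall as C rises
```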
If the model does not generate a boom in response to a news shock,
what happens instead? If the preferences exhibit a strong wealth effect then
positive news about future productivity will lead to an increase in consumption.
This increase in consumption is associated with a decline in hours worked as
before, which in turn implies a reduction in output and the aggregate resource
constraint implies a reduction in investment. In contrast, with a weak wealth
effect all of these implications can be reversed.1
1 Using a particular set of functional forms, Beaudry and Portier (2004) show that consumption and investment respond in opposite directions for any set of parameter values.

It will be useful to consider an extreme case for preferences, both for the sake of understanding the workings of the basic model and for the sake of understanding the behavior of the model that is presented in Section 3, which
is based on the Diamond-Mortensen-Pissarides framework. Consider a utility
function that is just linear in consumption u(C, H ) = C, so that leisure is
not valued and labor supply is fixed exogenously. In this case, the return on
capital is pinned down by the discount rate in all periods as shown by the Euler
equation:


$$1 = \beta E_t\!\left[ F_K\!\left(K_{t+1}, \bar{H}, z_{t+1}\right) + 1 - \delta \right]. \qquad (8)$$
This Euler equation implies that in an experiment with perfect foresight, the
capital stock will perfectly track the level of technology—Kt is only a function
of zt and parameters of the model. The result is that in response to a news
shock the capital stock remains unchanged until the period before the change
in productivity takes place when (for a positive news shock) consumption is
reduced to raise the capital stock to its new steady-state level. While this case
yields a simple transition to the new steady state, the dynamics it does generate
have consumption and investment moving in opposite directions and with a
delay.
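A minimal numerical sketch of this logic: inverting the Euler equation (8) gives the capital stock implied by next period's technology, so under perfect foresight capital is flat until the period before the announced change. The parameter values below (α, β, δ, H̄) are illustrative assumptions, not taken from the article.

```python
# Linear-utility case: the Euler equation (8) pins down K_{t+1} period by
# period. Parameter values are illustrative assumptions.
alpha, beta, delta, Hbar = 0.33, 0.96, 0.10, 1.0

def K_star(z):
    # Solve alpha*z*K**(alpha-1)*Hbar**(1-alpha) = 1/beta - 1 + delta for K
    return Hbar * ((1.0 / beta - 1.0 + delta) / (alpha * z)) ** (1.0 / (alpha - 1.0))

# News at t = 0 that z rises 1 percent at T = 5: capital is unchanged until
# the period before the change, when investment spikes and consumption drops.
z_path = [1.0] * 5 + [1.01] * 5
K_path = [K_star(z) for z in z_path[1:]]   # K_{t+1} is pinned down by z_{t+1}
print([round(K, 4) for K in K_path])
```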
An important element of the Beaudry and Portier analysis is the response
of the stock market or, in terms of the model, the relative price of capital. In
the standard one-sector model there is in essence a single good that is used
for both consumption and capital. Therefore, the relative price of capital is
fixed at one unit of the consumption good at all times. A truly satisfactory
explanation of the Beaudry and Portier results would be able to replicate the
behavior of the stock market as well as the usual macroeconomic aggregate
quantities. Christiano et al. (2007), reviewed below, do discuss stock prices
within their model.

3. A SEARCH MODEL
The overall question we discuss in this article is what kinds of theoretical settings can deliver co-movement in response to news shocks. In Section 4, we survey the recent literature and the range of models discussed
there. Here, mostly for the purpose of illustration, we look at a specific,
and very simple, model: one based on the Diamond-Mortensen-Pissarides
search-and-matching model. What we present here is related to Den Haan and
Kaltenbrunner (2009), who study a similar setting. The setting with search
frictions offers something that the standard neoclassical model does not have:
“free resources,” namely, a set of unemployed agents who would gladly work
if they could just find an employer. Therefore, it is at least imaginable that the
frictions are such that when a news shock arrives, employment responds relatively quickly, provided that frictions are endogenous and respond to the news.
The response of frictions in this model is governed by flows of firms in and
out of the market for workers. The idea is, in principle, very simple: If there is

382

Federal Reserve Bank of Richmond Economic Quarterly

positive news, firms flow in immediately and look for workers, which makes it
easier for workers to find employment, leading to an increase in employment
and higher production, so that the overall resources available are increased.
Firms begin posting vacancies immediately upon learning the positive news
because an employed worker is immediately more valuable since, with some
probability, that worker will still be employed by the firm when productivity
rises.

Model Framework
The model framework is the standard continuous-time Diamond-Mortensen-Pissarides search-and-matching model. A more detailed discussion of the
model framework and the determination of steady-state values can be found
in Pissarides (2000) or Hornstein, Krusell, and Violante (2005).
The model economy is populated by a unit continuum of workers. Workers
have linear utility for consumption discounted at the rate r, which implies
they are risk-neutral. The workers each supply one unit of labor inelastically.
Workers can be either employed or unemployed. Employed workers receive a
wage income of w and unemployed workers receive an unemployment benefit
of b, which also can be interpreted as the value of home production during
unemployment. The wage is an endogenous variable that will depend on,
among other things, the tightness of the labor market. The unemployment
benefit is an exogenous feature of the economic environment. Workers cannot
save, and they consume their income flows immediately.
The economy is also populated by an endogenous number of firms that
are also risk-neutral and discount future profits at rate r. Firms all have access
to the same production technology so there are no productivity differences
across firms. Firms are free to enter the labor market, but posting a vacancy
involves a flow cost in the amount c. Production requires a single worker and
a single firm and the amount of output produced by such a pair, p(t), varies
through time. It is assumed that production is always efficient in the sense
that p(t) > b.
There is a search friction in the labor market so that, at any point in time,
there will be a fraction u(t) of workers who are unemployed and looking
for firms and there will be a measure v(t) of firms with vacant jobs looking
for workers. These two groups meet at a rate, m(t), that is determined by a
constant-returns-to-scale matching function M(u(t), v(t)). We use a CobbDouglas matching function, M(u, v) = Auα v 1−α . Given the rate at which
new matches occur, the rate at which an unemployed worker finds a firm
is λw (t) = m(t)/u(t) and the rate at which a vacant firm finds a worker is
similarly λf (t) = m(t)/v(t). The gains from forming a productive workerfirm pair are divided between the worker and the firm by Nash bargaining,
with β going to the worker and 1 − β going to the firm. Existing worker-firm
pairs separate at the exogenous rate σ .


Steady State
To determine the steady-state values of unemployment and wages, we begin
by writing the conditions that must be satisfied by the values for the employed
worker, unemployed worker, matched firm, and vacant firm. Respectively,
these are:
$$\begin{aligned}
rW(t) &= w(t) + \sigma\,[U(t) - W(t)] + \dot{W}(t), & (9)\\
rU(t) &= b + \lambda_w(t)\,[W(t) - U(t)] + \dot{U}(t), & (10)\\
rJ(t) &= p(t) - w(t) + \sigma\,[V(t) - J(t)] + \dot{J}(t), & (11)\\
rV(t) &= -c + \lambda_f(t)\,[J(t) - V(t)] + \dot{V}(t), & (12)
\end{aligned}$$

where a dot over a variable represents the derivative with respect to time. Each
of these equations can be interpreted in terms of the relationship between the
flow value and the capital value of a state. For example, equation (9) states
that the flow value of being an employed worker is equal to the income flow
plus the expected value of the capital loss that occurs upon separation when
the worker becomes unemployed and the change in value over time, possibly
stemming from a changing environment.2
The total surplus of a worker-firm match is the sum of the worker’s gain and
the firm’s gain, S ≡ (W − U ) + (J − V ). The Nash-bargaining determination
of wages implies that the total surplus is divided between workers and firms
according to their bargaining powers:
$$W - U = \beta S, \qquad (13)$$
$$J - V = (1 - \beta) S. \qquad (14)$$

A useful expression for S can be found by adding and subtracting equations
(9)–(12) and using equations (13) and (14):
$$rS = p - b + c - \sigma S - \lambda_f (1 - \beta) S - \lambda_w \beta S + \dot{S}. \qquad (15)$$

This can be viewed as an “asset-pricing” equation: The value of the match—
the worker and the employer—equals a current payoff plus future payoffs,
which are captured by the Ṡ term; they can, in principle, be successively
substituted in so that the price of the asset equals the present value of all
payoffs, present and future. The equation can be rearranged to yield
$$S = \frac{p - b + c + \dot{S}}{r + \sigma + \lambda_f (1 - \beta) + \lambda_w \beta}. \qquad (16)$$

Now use the fact that firms are free to enter (and exit) the labor market, so the
value of a vacant firm must be zero. Setting V equal to zero in equations (12)
2 See footnote 12 in Hornstein, Krusell, and Violante (2005) for a detailed derivation of these
conditions.


Table 1 Model Parameter Values

Symbol   Description                           Value
p        Productivity                          1.000
b        Unemployment benefit                  0.950
α        Elasticity of the matching function   0.720
A        Matching function efficiency          1.350
β        Worker's bargaining share             0.050
r        Interest rate                         0.012
σ        Separation rate                       0.100
c        Vacancy posting cost                  0.357

Notes: One unit of time is equal to one quarter. See Hornstein, Krusell, and Violante (2005) for additional details.

and (14) and combining the results yields

$$S = \frac{c}{\lambda_f (1 - \beta)}. \qquad (17)$$

Combining equations (16) and (17) yields one equation in the two meeting rates:

$$\frac{c}{\lambda_f (1 - \beta)} = \frac{p - b + \dot{S}}{r + \sigma + \lambda_w \beta}. \qquad (18)$$
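The intermediate algebra can be verified symbolically; the short sketch below confirms that, once the free-entry surplus (17) is imposed, condition (16) collapses to (18):

```python
# Symbolic check that substituting (17) into (16) yields the entry
# condition (18): the difference between the two residuals is identically zero.
import sympy as sp

p, b, c, r, sigma, beta, lam_w, lam_f, Sdot = sp.symbols(
    'p b c r sigma beta lambda_w lambda_f Sdot', positive=True)

S = c / (lam_f * (1 - beta))                        # equation (17)
res16 = S * (r + sigma + lam_f * (1 - beta) + lam_w * beta) - (p - b + c + Sdot)
res18 = S * (r + sigma + lam_w * beta) - (p - b + Sdot)
print(sp.simplify(res16 - res18))                   # 0: (16) and (18) agree
```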

As the matching function is constant returns to scale, the meeting rates can be
expressed in terms of a single variable that represents labor market tightness:
$$\begin{aligned}
\theta &\equiv v/u, \\
\lambda_w &= M(u, v)/u = M(1, \theta) = A\theta^{1-\alpha}, \\
\lambda_f &= M(u, v)/v = M(1/\theta, 1) = A\theta^{-\alpha}.
\end{aligned} \qquad (19)$$

In steady state the total surplus is constant, Ṡ = 0, so equation (18) is one
equation in the unknown θ. Once θ has been found, the λs and values follow
immediately from the equations above and equations (17), (14), and (11) can
be used to find the wage as a function of θ and p.
The unemployment rate evolves slowly as workers gradually flow into and
out of unemployment. The evolution of the unemployment rate follows
$$\dot{u}(t) = \sigma\,[1 - u(t)] - \lambda_w(t)\, u(t), \qquad (20)$$

and in steady state, unemployment is simply equal to σ/(λw + σ).
Solving for the steady state of the model requires solving a nonlinear equation in θ (equation [18], with Ṡ = 0). We do this numerically after calibrating the model following Hagedorn and Manovskii (2008). This calibration leads to a steady-state unemployment rate of 6.9 percent and features stronger effects of productivity shocks on firm entry than in alternative calibrations such as Shimer (2005); for a discussion, see Hornstein, Krusell, and Violante (2005). The parameter values used in our calibration appear in Table 1.
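A minimal sketch of this computation: with Ṡ = 0, equation (18) is one equation in θ, which a bracketing root finder solves directly under the Table 1 calibration (the bracket endpoints below are our assumption):

```python
# Steady-state labor market tightness from equation (18) with S-dot = 0,
# under the Table 1 calibration. The bracket [1e-6, 100] is our assumption.
from scipy.optimize import brentq

p, b, alpha, A = 1.000, 0.950, 0.720, 1.350
beta, r, sigma, c = 0.050, 0.012, 0.100, 0.357

def lam_w(theta):
    return A * theta ** (1.0 - alpha)      # job-finding rate, equation (19)

def lam_f(theta):
    return A * theta ** (-alpha)           # vacancy-filling rate, equation (19)

def entry_gap(theta):
    # Difference between the two sides of (18); zero at the steady state
    return c / (lam_f(theta) * (1.0 - beta)) \
        - (p - b) / (r + sigma + lam_w(theta) * beta)

theta_ss = brentq(entry_gap, 1e-6, 100.0)
u_ss = sigma / (lam_w(theta_ss) + sigma)   # steady-state unemployment from (20)
print(f"theta = {theta_ss:.3f}, unemployment = {u_ss:.3f}")   # about 0.069
```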

Transition to Steady State
Before considering the effects of a news shock it is necessary to consider how
the economy transitions to the steady state in a stationary environment. The
key result for the transition dynamics is that labor market tightness immediately reaches its steady-state value regardless of the initial conditions for the
economy. As unemployment is a predetermined variable, the response of the
labor market is driven by a jump in posted vacancies. To see that this must be
the case, rewrite equation (18) as

$$\dot{\theta} = \left[ \frac{r + \sigma}{\alpha} + \frac{\beta A}{\alpha}\,\theta^{1-\alpha} - \frac{(p - b)(1 - \beta) A}{c\,\alpha}\,\theta^{-\alpha} \right] \theta, \qquad (21)$$

so that dynamics are expressed in terms of θ (thus including its time derivative θ̇).3 Notice that the term inside the brackets is increasing in θ, so for θ below the steady-state value the time derivative is negative and for θ above the steady-state value it is positive; therefore, the steady state is unstable, and the only nonexplosive solution to the problem is for θ to jump immediately to the steady state.4 It then follows that the λs also must jump to their new steady-state values, and so then must the values W, U, J, V, and S. Given the new, constant level of λw, one can use equation (20) to trace out the evolution of the unemployment rate to its new steady-state value, and vacancies are then determined by the relationship v = θu. In the end, there are very limited transition dynamics resulting from an unexpected productivity shock, and if productivity is expected to remain constant in the future, then θ must be at its steady-state value.

News Shock (Recession)
We now consider how the model responds to a negative news shock. In
particular we perform the following experiment: Before t = 0, the economy
is in steady state and expected to remain there in perpetuity. At t = 0, news
arrives that at time T = 5 productivity, p, will drop by 1 percent. The arrival of
this news is a zero-probability event, which implies that agents put no weight
3 Equation (17) and its time derivative imply Ṡλf(1 − β) = −cλ̇f/λf, and equation (19) can be used to relate λ̇f to θ̇.
4 Pissarides (2000) considers the system of differential equations formed by equations (20) and (21). The boundary conditions for this system are the initial condition on u and the requirement that the system converge. These conditions can only be met if θ immediately assumes its steady-state level.


on the event in forming their expectations, but does not imply that it cannot
occur.
To calculate the equilibrium response of the economy to this news, we use
the fact that θ must be at its new steady-state value at time T when the change
in productivity occurs because after that point the environment is expected
to be stationary. We use this as a terminal condition and solve the ordinary
differential equation (21) from time t = 0 to T . Having done so, we are
able to calculate the λs, trace the evolution of the unemployment rate, and
solve for all the other equilibrium quantities in the model. Interestingly, our
version of the Pissarides model has nontrivial dynamics, whereas the standard
model does not; in the standard model, there is always an immediate jump
in θ in response to a change in productivity, since this change is known as it
is realized. The slow-moving θ we look at, thus, comes from knowing that
productivity will change at a known future date.
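The following sketch implements this procedure: it integrates equation (21) backward from the terminal condition θ(T) equal to the new steady-state tightness (current productivity stays at its old level until T), and then runs unemployment forward with equation (20). Parameter values are from Table 1; the grid, tolerances, and the forward Euler step are our choices.

```python
# Backward-shooting sketch of the news-shock experiment: theta(T) equals the
# post-shock steady-state tightness, so integrate (21) backward from T to 0,
# then run unemployment forward with (20). Table 1 calibration.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

p0, b, alpha, A = 1.000, 0.950, 0.720, 1.350
beta, r, sigma, c = 0.050, 0.012, 0.100, 0.357
T, p1 = 5.0, 0.99 * 1.000                  # news at t = 0: p falls 1 percent at T

def steady_theta(p):
    def gap(th):
        return c / (A * th ** (-alpha) * (1 - beta)) \
            - (p - b) / (r + sigma + A * th ** (1 - alpha) * beta)
    return brentq(gap, 1e-6, 100.0)

def theta_dot(t, th):
    # Right-hand side of (21); productivity is still p0 before the drop at T
    return ((r + sigma) / alpha + beta * A / alpha * th ** (1 - alpha)
            - (p0 - b) * (1 - beta) * A / (c * alpha) * th ** (-alpha)) * th

sol = solve_ivp(theta_dot, [T, 0.0], [steady_theta(p1)],
                dense_output=True, rtol=1e-10, atol=1e-12)

t_grid = np.linspace(0.0, T, 500)
theta = sol.sol(t_grid)[0]                 # drops at t = 0, then declines to theta(T)

u = np.empty_like(t_grid)                  # unemployment starts at the old steady state
u[0] = sigma / (A * steady_theta(p0) ** (1 - alpha) + sigma)
for k in range(1, len(t_grid)):            # forward Euler on equation (20)
    dt = t_grid[k] - t_grid[k - 1]
    u[k] = u[k - 1] + dt * (sigma * (1 - u[k - 1])
                            - A * theta[k - 1] ** (1 - alpha) * u[k - 1])
print(f"theta(0) = {theta[0]:.3f}, theta(T) = {theta[-1]:.3f}, u(T) = {u[-1]:.4f}")
```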
The results appear in Figure 1, and we begin by comparing the two steady
states. The lower level of productivity results in a decrease in the total surplus of a match, one implication of which is that the value of a productive
firm is lower. This induces fewer firms to enter the labor market until market
tightness falls sufficiently and the probability of finding a new worker rises
to keep the value of a vacant firm at zero. The weaker labor market leads
to a lower job-finding rate for unemployed workers, which leads to a higher
steady-state unemployment rate. In equilibrium the wage decreases, but by
less than productivity, so profits also decrease eventually, though profits first
rise since wages, which are forward-looking, fall and productivity has not yet
fallen. Total resources fall smoothly, which is the effect sought: Firms leave in
anticipation of future falls in profit, which creates additional unemployment—
there are now even more “free resources” in the form of workers who are not
working. The fact that fewer firms are posting vacancies means that fewer
resources are spent on vacancy posting, which we interpret as investment.
Resources net of investment costs rise somewhat during transition but then
drop and are lower in the long run.5 During the transition to the new steady
state from time t = 0 to time T , the value of a productive firm drops initially
and then smoothly falls toward the new steady-state value. Labor market tightness follows the same pattern, which is achieved by an initial jump and then
decreasing path for vacancies. The weaker labor market decreases the speed
at which workers flow out of unemployment and results in a gradual rise in
the unemployment rate. Unlike the other variables, vacancies overshoot their
steady-state level. This overshooting stems from the fact that unemployment
5 A model in which there is endogenous separation (say, because workers or matches are heterogeneous so that only some firm-worker contacts lead to lasting matches) might generate another channel through which more resources are left idle, since then some existing matches could also break up in reaction to negative news.


Figure 1 News about a Coming Fall in Productivity

[Figure: six panels plotting time paths of labor market tightness (θ); unemployment (u) and vacancies (v); the stock market value (the value of one firm and the value of the stock market); total resources (gross and net of vacancy costs); the wage; and the profit flow (p − w).]
has not reached its steady state at time T , but is still below that level. Vacancies must then also be below their steady-state level at T so that labor market
tightness can remain at its steady-state level from T onward. The increase
in the unemployment rate mechanically leads to a decrease in output, and
the level of output jumps when all employed workers become less productive
at T .
The model is successful in generating a decline in employment, output, and
the stock market. What about investment and consumption? If we interpret
firm vacancy-posting costs as investment, then the model also generates a fall
in investment. Consumption, however, must rise on impact if the economy
is closed: No existing matches are broken up endogenously, so on impact no
resources are lost, but investment falls, and thus consumption must rise. An


open-economy version of the model with decreasing marginal utility would
reverse this result, as consumers would then want to smooth consumption over
time and use intertemporal international trade to achieve a smoothly declining
path for consumption.
As Figure 1 shows, labor market tightness, θ , drops initially when the
news is received and then converges to its new steady-state level at date T .
This pattern will hold for any choice of parameter values. Quantitatively,
however, the initial impact of the news on labor market tightness depends on
the way the model is calibrated, and there are two ways that the parameters
can affect this initial impact. First, different calibrations lead to different
steady-state responses of θ to changes in p. This sensitivity is the focus of
the literature that studies the implications of search-and-matching models for
unemployment fluctuations in response to unanticipated productivity shocks
(Shimer 2005; Hagedorn and Manovskii 2008; Pissarides 2009). The more
θ must have changed by date T , the more it must jump initially. The second
consideration is the speed with which the market tightness adjusts to its new
steady-state level. If the model dynamics are such that θ moves rapidly when
it is out of steady state, then only a small drop in θ is needed at date 0 to achieve
the same level of θ at date T . What then determines the speed of convergence
and therefore the size of the initial impact of the news? Mathematically, if
the right-hand side of equation (21) is increasing more quickly in θ, then the
speed of convergence will be higher, and the initial impact of the news will be
smaller. For example, differentiation of equation (21) shows that an increase
in the interest rate, r, leads to a faster speed of convergence. This result is
intuitive as an increase in the interest rate leads firms to discount the future
more heavily and so the value of a firm depends more on the immediate future
and less on the distant future. As the productivity change does not happen
for some time after the news arrives, firms with high discount rates do not
respond as much as firms with low discount rates. Similar logic holds when
the separation rate is high. In this case, firms discount the future because the
match is likely to be destroyed before the change in productivity occurs.
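To see these considerations concretely, approximate the dynamics between dates 0 and T by the linear law θ̇ = κ(θ − θ_old), where κ > 0 is the speed at which θ moves when out of steady state. Imposing θ(T) = θ_new gives θ(0) = θ_old + (θ_new − θ_old)e^(−κT), so a larger κ implies a smaller initial drop. A small check with hypothetical numbers:

    ## Initial jump in theta implied by theta(T) = theta_new when
    ## theta_dot = kappa * (theta - theta_old); all values are hypothetical.
    initial_theta <- function(kappa, Tnews = 5, theta_old = 1.0, theta_new = 0.78)
      theta_old + (theta_new - theta_old) * exp(-kappa * Tnews)

    initial_theta(kappa = 0.5)  # slow convergence: about 0.982, a visible drop
    initial_theta(kappa = 2.0)  # fast convergence (e.g., higher r): almost no drop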
Differentiation of equation (21) also shows that the speed of convergence
is increasing in the worker’s bargaining share, β. Therefore, when workers
have more bargaining power, the initial impact of the news is smaller. To see
the importance of the worker’s bargaining share, consider the case when β is
set to zero. In that case, the worker’s wage is always equal to the value of
leisure, b, and the firm’s flow profit is p − b, which is unchanged until date
T when it jumps down. Now consider a positive bargaining share, β > 0. As
shown in Figure 1, the worker’s wage falls at date 0 and remains below its
initial level thereafter. With a lower wage, flow profits actually rise between
dates 0 and T . So with a positive β, firms are partially compensated for
the future reduction in productivity by a short-term increase in profits. This


Figure 2 Misleading News about a Coming Decline in Productivity

[Figure: the same six panels as in Figure 1, here for the experiment in which the anticipated productivity decline does not materialize at time T.]

short-term increase in profits motivates firms to post vacancies just after the
news arrives, and this force reduces the initial drop in θ.

The News Shock Turns Out to Be Wrong
The second experiment that we consider is to ask what happens if the expected
lower productivity is not realized at time T , but instead productivity remains at
its initial level both before and after T . Specifically, we assume that after the
news shock arrives, there remains a possibility that productivity fails to decline
at time T , although this possibility has zero probability; thus, we consider what
happens when that zero-probability event occurs. The experiment is displayed
in Figure 2, with T = 5 again. Before time T the economy behaves exactly


as in the case when the productivity shock is realized because agents fully
expect that it will be realized. At time T , however, the productivity shock
does not materialize and labor market tightness and the value of a productive
job immediately return to their initial steady states. These developments imply
a stock market boom and an immediate increase in posted vacancies. The new
tightness in the labor market increases the rate at which unemployed workers
find jobs and leads to a gradual fall in the unemployment rate. As employment
rises output also rises, but, as before, the increase in vacancy posting costs
is large enough that it offsets the rise in output so the resources available for
consumption actually decrease.
Looking at this experiment, one might label the shock whose effects are
displayed in Figure 2 “misleading.” More generally, since shocks containing “news” do not necessarily lead the economy in the right direction, one can perhaps speak of “noise”: shocks that are believed to have relevance for productivity but in the end do not. For example, the Internet technology bubble during the last years of the last millennium could have reflected beliefs that eventually turned out to be too optimistic (but may well
have been rational). Thus, the literature on news shocks should be viewed as
closely related to ideas about noise as well. The very recent literature (e.g.,
Angeletos and La’O [2009], or Blanchard, L’Huillier, and Lorenzoni [2009])
takes an explicit signal extraction approach and thus formalizes news and
noise, as shocks driving business cycles, in a slightly different way.

4. OTHER APPROACHES IN THE LITERATURE

We now briefly discuss the main features of the different models, all with
neoclassical underpinnings, that have been proposed as a way of generating
co-movement in response to news shocks. In this discussion, we omit the
very recent contributions to this literature that build on signal processing and
“noise shocks.”

Other Approaches to Labor Market Frictions
Den Haan and Kaltenbrunner (2009) present a version of the RBC model with
a search friction in the labor market. Specifically, production occurs within
“projects” that require an entrepreneur and a worker. Creating a new project
is a time-consuming process as entrepreneurs and workers must search for
one another. In response to a news shock, entrepreneurs and workers begin
preparing for the future productivity increase by entering the labor market to
begin the process of establishing relationships through which they can exploit
the higher future productivity when it arrives, just as in Section 3. There
are two main differences between the model in Section 3 and Den Haan and
Kaltenbrunner’s work. First, in Section 3 the labor supply is inelastic, while


Den Haan and Kaltenbrunner allow it to be elastic. With elastic labor supply,
one of the effects of a news shock is an increase in the demand for leisure
through the wealth effect, which might reverse the result that employment increases in response to the news shock. Den Haan and Kaltenbrunner show that
this effect is sufficiently weak to be overcome by the household’s motivation
to enter the labor market to find a job in anticipation of higher productivity
in the future. Therefore, the result that employment increases in response
to a news shock is not an artifact of the inelastic labor supply. Second, the
standard Diamond-Mortensen-Pissarides model considered in Section 3 does
not include capital, so there are no predictions for the response of investment
to a news shock. Den Haan and Kaltenbrunner show that investment does
respond positively to a news shock except in the first period after the shock.
Production is fixed in the first period because the capital stock and employment are predetermined, so it is impossible for consumption and investment
to rise simultaneously in that period. However, the increase in employment
that occurs in response to the news shock quickly increases output to finance
higher investment as well as higher consumption in subsequent periods.

Multiple Sectors
The standard one-sector RBC model has a tight link between consumption and
investment decisions: Investment directly reduces the resources available for
consumption. Beaudry and Portier (2004) present a three-sector model with
final goods, nondurable intermediate goods, and capital produced in different
sectors. The latter two sectors use labor and a sector-specific fixed factor of
production. In this model the link between consumption and investment is
much weaker because output from the capital goods sector cannot be used for
consumption and the presence of the fixed factors limits the extent to which
the planner is willing to alter the amount of labor in the sectors. This uncoupling of the consumption and investment decisions allows consumption and
investment to both increase in response to a positive news shock. Specifically, Beaudry and Portier assume the news concerns the future productivity
of the nondurable goods sector, and the crucial assumption is that nondurable
goods and capital are complementary in the production of final goods. Under
these assumptions, the planner chooses to build up the capital stock in response to positive news about future nondurable goods productivity because
the complementarity implies that capital will be more productive in the future, when nondurables will be cheaper. The accumulation of capital, however,
makes nondurables more valuable, which leads the planner to expand their
production as well. In the end, the production of capital and nondurables
increases, which is achieved through an expansion of hours worked in each
sector and therefore in total. More capital and nondurables directly translate
into more final output for which the only use is consumption. In this way the


model delivers an expansion of consumption, investment, hours, and output
in response to positive news about nondurables productivity.

Other Model Features
An alternative approach taken in the literature is to keep the single-sector
framework, but modify the standard RBC model along several other dimensions. Jaimovich and Rebelo (2009) present a model with three key modifications. They assume a functional form for preferences that has extremely
weak short-run wealth effects on labor supply. In fact, the preferences used
nest those of Greenwood, Hercowitz, and Huffman (1988) in which there
is no wealth effect on labor supply. The calibration used by Jaimovich and
Rebelo is extremely close to this case. Since these preferences imply an essentially zero wealth effect, they allow the model to generate an increase in hours despite a
substantial increase in consumption. The second modification introduced by
Jaimovich and Rebelo is an adjustment cost for the rate of investment, which
serves to produce an investment boom in response to a positive news shock
as the planner wishes to minimize adjustment costs by smoothing investment
over time. Finally, the authors add variable capacity utilization to the model,
which allows the amount of resources to be expanded in the initial periods
in order to finance simultaneous consumption and investment booms. The
resulting model succeeds in generating a sizable boom in response to news of
a future increase in TFP and in response to news of future investment-specific
technical change.
Christiano et al. (2007) make similar modifications to the standard model
in order to generate a boom in response to a positive news shock. Their
key modifications are to introduce habits in consumption and the adjustment
cost to the flow of investment. Jaimovich and Rebelo also have non-time-separable preferences, but the calibration is such that the habit persistence is
very weak. The habits and adjustment costs in Christiano et al.’s work motivate
the planner to engineer a smooth transition to the new steady state and begin
consuming and investing in advance of the change in productivity. Hours
are able to increase to provide resources for the consumption and investment
booms because there is no longer a tight link between current hours and current
consumption in the presence of habit persistence.
A troubling feature of these models is that the price of capital declines in response to a news shock. As investment is raised to reduce adjustment
costs in anticipation of higher investment in the future, there is, in a sense,
an excess of capital before the shock occurs. The result is that the relative
price of capital falls during the boom. Walentin (2009) presents a model that
is close to that of Christiano et al. (2007), with the modification that there is
limited enforcement of financial contracts. With limited enforcement, there
is a wedge between the value of the firm and the cost of its capital and, in


Walentin’s model, this wedge increases in response to a news shock so that
the value of the firm increases despite the fall in the cost of capital.

Investment-Specific Technical Change
In a model with adjustment costs, the planner chooses to start investing early
in order to minimize the cost of building up the capital stock in response
to a sector-neutral productivity shock. If, however, productivity shocks are
investment-specific, then the only way to take advantage of them is through
investment. Flodén (2007) uses a vintage capital model to argue that the
news that next period’s vintage of capital will be very productive leads to a
boom in the current period. The mechanism draws on the model elements
presented by Greenwood, Hercowitz, and Krusell (2000), which are shocks
to the relative price of capital and variable capacity utilization. The cost of
more intensive utilization of the capital stock is typically modeled as faster
depreciation. When the relative price of capital declines, the replacement
cost of the depreciated capital stock falls. As a result, an investment-specific
technology shock leads to more intensive utilization in the current period,
which raises the marginal product of labor and elicits higher labor supply.
The additional resources produced through the increases in utilization and
labor supply allow consumption to increase at the same time as investment.
Flodén only considers news shocks at a horizon of one period. That is, the
economy learns that the capital being installed in the current period will be
more productive in the next period and thereafter. This short horizon makes
the expectations-driven boom somewhat short-lived, but it may be possible to
extend the boom by extending the period between the receipt of the news and
the technological change.
There is some ambiguity about the timing of the technology shock in that
investment-specific technical change relates to the evolution of resources between periods rather than the productivity within a period. For example,
Greenwood, Hercowitz, and Krusell adopt the timing convention that the
shock relates to the productivity of investment this period and is therefore
a shock in the current period. Flodén, by contrast, considers the same shock to be a shock to the productivity of the capital when it is used in the future, that is, a shock that arrives in the future but is learned about in the current period through the news shock. Both interpretations are valid, but an important consideration is the interpretation used in the construction of the National
Income and Product Accounts (NIPA). In principle the NIPA investment data
are adjusted for quality, and if the vintage of capital that is being installed is
going to be more productive in the future, this may be accounted for in the
measurement of current investment and current TFP. However, if the shock
raises current TFP, it would not be classified as a news shock by Beaudry and
Portier (2006) because news shocks are orthogonal to current TFP shocks.


Financial Frictions
Another way of modifying the model to generate expectations-driven business
cycles is to introduce financial frictions. Chen and Song (2008) consider
a model with two sectors, only one of which requires the use of working
capital. In their model, entrepreneurs have the ability to divert working capital,
and the optimal contract in response to this limited debt enforcement leaves
the sector financially constrained. When a positive news shock arrives, the
entrepreneurs’ continuation value rises because future profits will be higher,
which relaxes the financial constraint. By reducing financial frictions, the
news of higher TFP in the future triggers a reallocation of capital between
the two sectors and raises current TFP. The increase in current TFP leads to
more output that can be used for both more consumption and more investment.
The more efficient use of capital, as well as the accumulation of more capital,
raises the marginal product of labor, which leads to an increase in hours under
Greenwood-Hercowitz-Huffman preferences.
If financial frictions like the ones Chen and Song have proposed are important features of the macroeconomy, then there are implications for other
issues besides expectations-driven business cycles. In particular, there would
be a need for government policy to alleviate the financial constraints of firms.
This could be achieved in a variety of ways; for the same reasons that future profit increases would improve the current allocation of capital, any policy
that increases future profits would have a desirable effect (production subsidies would suffice for this purpose).6 Whether the economy is subject to this
strong inefficiency is perhaps questionable. If there is already government
policy in place designed to correct the inefficiency, no reallocation of capital
in response to news shocks will take place.

5. CONCLUSION

The news shocks literature has generated some interesting new insights about
macroeconomic dynamics that seem relevant for understanding co-movement
of macroeconomic aggregates. The above-discussed settings, including the
simple Pissarides (1985) search/matching model used for illustration, do admit
some channels that are promising ways forward. Some of these settings have
more nonstandard features than others, and it is an open question whether they
will survive more microeconomic scrutiny. It is also, as discussed in Section 1,
still an open question how to identify news shocks and whether they really do
lead to co-movements. All in all, this new literature does offer a challenge to
6 Such policies might involve time inconsistency, since it is only by support of future policy
that the desired effect is attained.


existing macroeconomic settings that do not admit co-movement in response
to news shocks, and, as such, it should perhaps move our priors.
As also briefly mentioned above, a very recent strand of articles is now
exploring explicit signal extraction channels by which news as well as noise
can drive fluctuations. The focus here is on asymmetric information and,
even though Lucas (1972) certainly sparked interest in the importance of this
phenomenon for macroeconomics, there is no quantitatively oriented model
available off the shelf to evaluate. A central reason for this is the theoretical
difficulty of aggregation across agents with different information sets. Therefore, we may have to wait for a closer comparison between models relying on
these ideas and existing representative-agent macroeconomic models.
Finally, the underlying notion in our discussion here is to examine whether
co-movement is possible, in response to the arrival of information, in settings
that are fully microfounded. It should be noted that none of these settings
build on, or admit, coordination failures, which would seem to more easily
admit strong effects of news or noise. With multiple equilibria, however, it
is not clear how the movement across equilibria is supposed to occur, and
there is nothing inherently more attractive about productivity-related shocks
as coordination devices than other shocks, so it would seem that an approach
based on coordination failures would have to be augmented with a theory of
what triggers changes across equilibria. The earlier literature on sunspots (see
Cass and Shell [1983] and later studies) offers an answer, but sunspots are just
coordination devices, and it might be hard, in a reduced-form sense, to distinguish sunspots from true news shocks. If a news shock indicates high future
productivity of capital, investment likely will go up today. Alternatively, in
a model with multiple equilibria because of some form of increasing returns
to capital, say, as an externality of capital use across firms, a sunspot would
trigger either high or low investment, which would both be self-enforcing under the assumption of increasing returns. So this latter model would indeed
justify later movements in productivity, not because of changes in technology,
but through increasing returns and aggregate activity. Ultimately, these two
“stories” could only be told apart by more detailed empirical scrutiny. One
route is through better productivity measurements, perhaps finding ways of
establishing what the returns to scale are at different levels of aggregation.
Alternatively, a more detailed structural description of the model and examination of how the two kinds of economies respond to other shocks could help
identification.


REFERENCES
Angeletos, George-Marios, and Jennifer La’O. 2009. “Noisy Business
Cycles.” Working Paper 14982. Cambridge, Mass.: National Bureau of
Economic Research (May).
Barsky, Robert B., and Eric R. Sims. 2008. “Information, Animal Spirits, and
the Meaning of Innovations in Consumer Confidence.” Mimeo,
University of Michigan.
Beaudry, Paul, and Franck Portier. 2004. “An Exploration Into Pigou’s
Theory of Cycles.” Journal of Monetary Economics 51 (September):
1,183–216.
Beaudry, Paul, and Franck Portier. 2006. “Stock Prices, News, and Economic
Fluctuations.” American Economic Review 96 (September): 1,293–307.
Beaudry, Paul, and Franck Portier. 2007. “When Can Changes in
Expectations Cause Business Cycle Fluctuations in Neo-Classical
Settings?” Journal of Economic Theory 135 (July): 458–77.
Blanchard, Olivier J., Jean-Paul L’Huillier, and Guido Lorenzoni. 2009.
“News, Noise, and Fluctuations: An Empirical Exploration.” Manuscript.
Cass, David, and Karl Shell. 1983. “Do Sunspots Matter?” Journal of
Political Economy 91 (April): 193–227.
Chen, Kaiji, and Zheng Song. 2008. “Financial Frictions on Capital
Allocation: The Engine of TFP Fluctuations.” Unpublished manuscript,
University of Oslo and Fudan University.
Christiano, Lawrence, Cosmin Ilut, Roberto Motto, and Massimo Rostagno.
2007. “Monetary Policy and Stock Market Boom-Bust Cycles.”
Manuscript, Northwestern University and the European Central Bank.
Den Haan, Wouter J., and Georg Kaltenbrunner. 2009. “Anticipated Growth
and Business Cycles in Matching Models.” Journal of Monetary
Economics 56 (April): 309–27.
Fernández-Villaverde, Jesús, Juan F. Rubio-Ramírez, Thomas J. Sargent, and
Mark W. Watson. 2007. “ABCs (and Ds) for Understanding VARs.”
American Economic Review 97 (June): 1,021–6.
Flodén, Martin. 2007. “Vintage Capital and Expectations Driven Business
Cycles.” CEPR Discussion Paper 6113.
Greenwood, Jeremy, Zvi Hercowitz, and Gregory W. Huffman. 1988.
“Investment, Capacity Utilization, and the Real Business Cycle.”
American Economic Review 78 (June): 402–17.


Greenwood, Jeremy, Zvi Hercowitz, and Per Krusell. 2000. “The Role of
Investment-Specific Technological Change in the Business Cycle.”
European Economic Review 44 (January): 91–115.
Hagedorn, Marcus, and Iourii Manovskii. 2008. “The Cyclical Behavior of
Equilibrium Unemployment and Vacancies Revisited.” American
Economic Review 98 (September): 1,692–706.
Hornstein, Andreas, Per Krusell, and Giovanni L. Violante. 2005.
“Unemployment and Vacancy Fluctuations in the Matching Model:
Inspecting the Mechanism.” Federal Reserve Bank of Richmond
Economic Quarterly 91 (Summer): 19–50.
Jaimovich, Nir, and Sergio Rebelo. 2009. “Can News About the Future Drive
the Business Cycle?” American Economic Review 99 (September):
1,097–118.
Kydland, Finn E., and Edward C. Prescott. 1982. “Time to Build and
Aggregate Fluctuations.” Econometrica 50 (November): 1,345–70.
Lucas, Robert E., Jr. 1972. “Expectations and the Neutrality of Money.”
Journal of Economic Theory 4 (April): 103–24.
Pissarides, Christopher A. 1985. “Short-Run Equilibrium Dynamics of
Unemployment, Vacancies, and Real Wages.” American Economic
Review 75 (September): 676–90.
Pissarides, Christopher A. 2000. Equilibrium Unemployment Theory.
Cambridge, Mass.: MIT Press.
Pissarides, Christopher A. 2009. “The Unemployment Volatility Puzzle: Is
Wage Stickiness the Answer?” Econometrica 77 (September): 1,339–69.
Rodríguez Mora, José V., and Paul Schulstad. 2007. “The Effect of GNP
Announcements on Fluctuations of GNP Growth.” European Economic
Review 51 (November): 1,922–40.
Schmitt-Grohé, Stephanie, and Martín Uribe. 2009. “What’s News in
Business Cycles.” Manuscript, Duke University.
Shimer, Robert. 2005. “The Cyclical Behavior of Equilibrium
Unemployment and Vacancies.” American Economic Review 95
(March): 25–49.
Sims, Eric R. 2009. “Expectations Driven Business Cycles: An Empirical
Evaluation.” Mimeo, University of Michigan.
Walentin, Karl. 2009. “Expectation Driven Business Cycles with Limited
Enforcement.” Sveriges Riksbank Working Paper Series 229 (April).

Economic Quarterly—Volume 96, Number 4—Fourth Quarter 2010—Pages 399–416

Risk Sharing, Investment,
and Incentives in the
Neoclassical Growth Model
Emilio Espino and Juan M. Sánchez

The amount of risk sharing among households, regions, or countries
is crucial in determining aggregate welfare. For example, pooling
resources at the national level can help regions better deal with natural
disasters like floods. Similarly, pooling resources with an insurance company
can help individuals deal with shocks like a house fire or a car accident.
Capital accumulation and economic growth also are crucial in determining
aggregate welfare. In particular, they determine the stock of wealth available
for consumption and investment. Importantly, wealthier households, regions,
or countries possess a buffer stock of precautionary assets, a form of self-insurance.
These two important factors in determining welfare have interesting interactions with one another. An important one is how insurance and savings
substitute for each other. For example, individuals may want to save more
when they do not have access to insurance than when they do because the
extra savings can protect against the consequences of an uninsured shock.
Therefore, capital accumulation and growth would be faster in an economy
without perfect insurance than in one with perfect insurance.
This article explores the tradeoffs between insurance and growth in the
neoclassical growth model with two agents and preference shocks. Most of
the analysis reviews the full information version of the model, where there are
no limits on insurance between the two agents, though there is still aggregate
uncertainty that affects aggregate savings behavior. Private information is
Espino is an economist and professor at Universidad Torcuato Di Tella. Sánchez is an
economist affiliated with both the Richmond and St. Louis Federal Reserve Banks. The
authors gratefully acknowledge helpful comments by Arantxa Jarque, Borys Grochulski, Ned
Prescott, Nadezhda Malysheva, and Constanza Liborio. The views expressed here do not necessarily reflect those of the Federal Reserve Bank of Richmond, the Federal Reserve Bank of St.
Louis, or the Federal Reserve System. E-mails: eespino@utdt.edu; juan.m.sanchez@stls.frb.org.


then added to the model to limit the ability to insure the two agents. This is a
much harder problem, as has been observed in the literature, and only a partial
characterization is provided.

Literature Review
Our article relates to the voluminous consumption/savings/capital accumulation literature on two levels. On one hand, there has been a growing
literature focusing on the accumulation effects of demand side shocks in dynamic stochastic general equilibrium models, following the pioneering work
of Baxter and King (1991) and Hall (1997). In general equilibrium models, demand side shocks (such as preference shocks to consumption demand) have a
strong tendency to crowd out investment.1 On the other hand, there is literature
on the impact of inequality on capital accumulation. If preferences aggregate
in the Gorman sense, the distribution of wealth does not affect the evolution of
aggregate variables—see Chatterjee (1994) and Caselli and Ventura (2000).
In our setting, preferences do not aggregate in that strong sense. Thus, distribution matters for aggregate savings and the corresponding dynamics of the
aggregate stock of capital.2
The literature analyzing economic growth and private information is not
as large, and the valuable contributions have relied on different simplifying
assumptions to make the analysis tractable. This article is related to those articles because we are interested in understanding when information is (more)
important to implement the full information allocation. However, we solve
the full information model to obtain the full information allocation and characterize only the incentives to misreport the shocks under that allocation.
Pioneering contributions in the literature on constrained efficient allocations with private information abstracted from capital accumulation, as the
main goal was to study wealth distribution. In a pure exchange economy
setting, Green (1987) and Atkeson and Lucas (1992) show that (constrained)
efficient allocations, independent of the feasibility technologies, display extreme levels of “immiserization”: The expected utility level of (almost) every
agent in the economy converges to the lower bound with probability one.
This result is also present in Thomas and Worrall (1990). Then, in an early
contribution that includes capital accumulation, Marcet and Marimon (1992)
examine a two-agent model where a risk-neutral investor with unlimited resources invests in the technology of a risk-averse producer whose output is
subject to privately observed productivity shocks. They show that the full
information investment policy can be implemented in the private information
1 See Wen (2006) for an overview and references therein.
2 See Lucas and Stokey (1984) for a general early discussion and, more recently, Sorger (2002).


environment. That is, in their setting, a risk-neutral investor can make the
risk-averse entrepreneur follow the full information investment policy and allocate his consumption conditional on output realizations. Thus, they find
that growth levels are as high as with perfect information. The key simplification in this article is that the second agent in the economy is risk-neutral with
unlimited resources.
Khan and Ravikumar (2001) extend Marcet and Marimon (1992) to impose a period-by-period feasibility constraint and endogenous growth. In
particular, they examine the impact of incomplete risk sharing on growth and
welfare in the context of the AK model. The source of market incompleteness
is private information since household technologies are subject to idiosyncratic productivity shocks not observable by others. Risk sharing between
households occurs through contracts with intermediaries. This sort of incomplete risk sharing tends to reduce the rate of growth relative to the complete
risk-sharing benchmark. However, “numerical examples indicate that, on average, the growth and welfare effects on incomplete risk sharing are likely to
be small.” One key simplification in this case is that the allocation solved is
not necessarily the best incentive-compatible allocation.
Recently, Greenwood, Sanchez, and Wang (2010a) embedded the costly
state verification framework into the standard growth model.3 The relationship
between the firm and lender is modeled as a static contract. In the economy
in which information is too costly, undeserving firms are overfinanced and
deserving ones are underfunded. A reduction in the cost of information leads to
more capital accumulation and a redirection of funds away from unproductive
firms toward productive ones. Greenwood, Sanchez, and Wang (2010b) show
that this mechanism has quantitative significance to explain cross-country
differences in capital-to-income ratios and total factor productivity.
Other studies use similar models for other purposes. Espino (2005) studies
a neoclassical growth model that includes a discrete number of agents, like
the one presented in this article. However, he uses the economy with private
information about the preference shock to analyze the validity of Ramsey’s
conjecture about the long-run allocation of an economy in which agents are
heterogeneous in their discount factor. Clementi, Cooley, and Giannatale
(2010) study a repeated bilateral exchange model with hidden action, along
the lines of Spear and Srivastava (1987) and Wang (1997), that includes capital
accumulation. The two agents in the economy are a risk-neutral investor and
a risk-averse entrepreneur. They show that the incentive scheme chosen by
the investor provides a rationale for firm decline.
This article is organized as follows: Section 1 presents the physical environment and the planner’s problem, and derives the optimal allocation. Section
3 See also Khan (2001) and Chakraborty and Lahiri (2007).


2 describes the calibration and the numerical solution of the full information allocation. Section 3 studies in which cases the full information allocation would be incentive compatible in an economy with private information.
Section 4 offers concluding remarks.

1. MODEL

Environment
There is a constant returns to scale aggregate technology to produce the unique
consumption good that is represented by a standard neoclassical production
function, F (K, L), where K is the current stock of capital and L denotes
units of labor. There are two agents in the economy, h = 1, 2. Agent h is
endowed with one unit of time each period and does not value leisure, i.e.,
the time endowment is supplied inelastically in the labor market. The initial
stock of capital at date 0 is denoted by K0 > 0. Capital depreciates at the rate
δ ∈ (0, 1).
At the beginning of date t, agent 1 faces an idiosyncratic preference shock
st ∈ St = {sL, sH}, where sH > sL. This shock is assumed to be i.i.d. across time, where πi > 0 is the probability of si, i = L, H. Notice that st is also the aggregate preference shock at date t. The aggregate history of shocks from date 0 to date t, denoted s^t = (s0, ..., st), has probability at date 0 given by π(s^t) = π(s0) · · · π(st).
Given a consumption plan {c1,t}∞t=0 such that c1,t : S^t → R+, agent 1’s state-dependent preferences are represented by

U1(c1) = E[ Σ∞t=0 β^t u1(st, c1,t) ] = Σ∞t=0 Σ_{s^t} β^t π(s^t) u1(st, c1(s^t)),

where u1 : R+ → R is strictly increasing, strictly concave, and twice differentiable, lim_{ct→0} u′1(ct) = +∞, and β ∈ (0, 1). Similarly, given {c2,t}∞t=0, agent 2’s preferences are represented by

U2(c2) = E[ Σ∞t=0 β^t u2(c2,t) ] = Σ∞t=0 Σ_{s^t} β^t π(s^t) u2(c2(s^t)).

Planner’s Problem
Consider the problem of a fictitious planner choosing the best feasible allocation. Let K′ = {Kt+1}∞t=0 be an investment plan that allocates next period’s capital for every period t. Similarly, let C = {Ct}∞t=0 be a consumption plan, where Ct = (c1t, c2t). Given K0, a sequential allocation (C, K′) is feasible if, for all s^t,

Kt+1(s^t) + c1t(s^t) + c2t(s^t) ≤ F(Kt(s^{t−1}), 1) + (1 − δ)Kt(s^{t−1}).


We will assume throughout the article that the production function F is CobbDouglas with exponent γ .
A Pareto-optimal allocation in this economy is a feasible allocation such that there is no other feasible allocation that provides every agent at least as much lifetime utility and at least one agent strictly more. One reason to be interested in these allocations is that,
under certain conditions, they are equivalent to competitive equilibrium allocations. Under our assumptions, Pareto-optimal allocations can be obtained
by solving the following problem:

max_{(C,K′)} αU1(c1) + (1 − α)U2(c2)

subject to

K(s^t) + c1(s^t) + c2(s^t) ≤ F(K(s^{t−1}), 1) + (1 − δ)K(s^{t−1}), ∀s^t,

where K0 is given and α ∈ [0, 1] is the weight that the planner assigns to agent 1—referred to hereafter as the Pareto weight. Notice that different values of α characterize different points on the Pareto frontier. Later, we will consider different allocations by varying the value of α.
To characterize the problem further, it is simpler to consider the methods
developed by Lucas and Stokey (1984) to solve for Pareto-optimal allocations
in growing economies populated with many consumers. It is actually simple
to adapt their method to analyze this economy. The idea is to make next period’s welfare weights conditional on the current shock.4
The planner’s recursive problem is a fixed point, V, of the functional equation

V(k, α) = max_{c,k′,w} α{πL[u1(sL, c1L) + βw1L] + πH[u1(sH, c1H) + βw1H]}
          + (1 − α){πL[u2(c2L) + βw2L] + πH[u2(c2H) + βw2H]}   (1)

subject to

f(k) + (1 − δ)k ≥ kL + c1L + c2L,   (2)
f(k) + (1 − δ)k ≥ kH + c1H + c2H,   (3)
min_{αL} V(kL, αL) − αLw1L − (1 − αL)w2L ≥ 0,   (4)
min_{αH} V(kH, αH) − αHw1H − (1 − αH)w2H ≥ 0,   (5)

where α = {α, 1 − α} and w are the from-tomorrow-on utilities. The idea in
(1)–(5) is to represent the problem of choosing an optimal allocation for a given
stock of capital k and a vector of Pareto weights (α, 1−α) as one of choosing a
feasible current period allocation of consumption c = {c1L , c1H , c2L , c2H } and
capital goods k′ = {kL, kH}, and a vector of from-tomorrow-on utilities w =
4 See Beker and Espino (2011) for a discussion about the implementation and the corresponding technical details.


{w1L , w1H , w2L , w2H }, subject to the constraint that these utilities be attainable
given the capital accumulation decision, as guaranteed by constraints (4)–(5).
As in Lucas and Stokey (1984), the weights {α L , α H } that attain the minimum
in (4) and (5) will be the new weights used in selecting tomorrow’s allocation,
and so on, ad infinitum.

Characterization
Assume preferences are represented by

u1(s, c) = s · c^{1−σ}/(1 − σ)   and   u2(c) = c^{1−σ}/(1 − σ).

The first-order conditions (FOC) for consumption are

απLsL(c1L)^{−σ} = λL,
απHsH(c1H)^{−σ} = λH,
(1 − α)πL(c2L)^{−σ} = λL,
(1 − α)πH(c2H)^{−σ} = λH,

where λi is the Lagrange multiplier on the resource constraint in state i = L, H. From these equations it is simple to obtain that the consumption of each agent will be a share of the aggregate consumption, Ci:

c1L = (αsL)^{1/σ} / [(αsL)^{1/σ} + (1 − α)^{1/σ}] · CL,
c1H = (αsH)^{1/σ} / [(αsH)^{1/σ} + (1 − α)^{1/σ}] · CH,    (6)
c2L = (1 − α)^{1/σ} / [(αsL)^{1/σ} + (1 − α)^{1/σ}] · CL,
c2H = (1 − α)^{1/σ} / [(αsH)^{1/σ} + (1 − α)^{1/σ}] · CH.

The FOC with respect to w are

απLβ = μLαL,
απHβ = μHαH,
(1 − α)πLβ = μL(1 − αL),
(1 − α)πHβ = μH(1 − αH).

These imply that

απLβ + (1 − α)πLβ = μLαL + μL(1 − αL),

and therefore πLβ = μL and πHβ = μH. Using the FOC with respect to w again, these two conditions imply α = αL = αH. Thus, the Pareto weights will be constant in this problem.


Using the fact that individual consumption is a share of aggregate consumption and that Pareto weights are constant, this problem can be rewritten as
one solving for the consumption (or capital accumulation) of a representative
consumer with aggregate preference shocks. In that case, the state-dependent
utility of the representative consumer, uR, would be

uR(s, C) = [(sα)^{1/σ} + (1 − α)^{1/σ}]^σ · C^{1−σ}/(1 − σ).
Notice here that the level of the shock depends not just on the size of s,
but also on α. This representation is useful to understand that the optimal
investment decision is affected by the realization of the preference shock and
the distributional parameter α. When s is larger, the representative agent
prefers to increase consumption today and decrease investment. Given the
same shock, the size of the drop in investment depends on the Pareto weight
of the agent that received the shock.
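As a numerical check on this representation, the sketch below (in R, with illustrative parameter values) computes the shares in (6) and verifies that the planner’s period objective evaluated at those shares coincides with uR(s, C):

    ## Check: the weighted period objective at the shares in (6) equals uR(s, C).
    sigma <- 0.5; alpha <- 0.75             # illustrative values
    share1 <- function(s)                   # agent 1's share of aggregate C
      (alpha * s)^(1 / sigma) /
        ((alpha * s)^(1 / sigma) + (1 - alpha)^(1 / sigma))
    uR <- function(s, C)
      ((s * alpha)^(1 / sigma) + (1 - alpha)^(1 / sigma))^sigma *
        C^(1 - sigma) / (1 - sigma)

    s <- 2.0; C <- 1.5
    c1 <- share1(s) * C; c2 <- C - c1
    lhs <- alpha * s * c1^(1 - sigma) / (1 - sigma) +
           (1 - alpha) * c2^(1 - sigma) / (1 - sigma)
    all.equal(lhs, uR(s, C))                # TRUE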
The FOC with respect to capital accumulation are
λL = μL ∂V(kL, αL)/∂kL,
λH = μH ∂V(kH, αH)/∂kH.

An application of the envelope conditions makes these conditions imply the standard Euler equations determining capital accumulation:

1 = (F′(kL) + (1 − δ)) β [πLsL(c1L′)^{−σ} + πHsH(c1H′)^{−σ}] / [sL(c1L)^{−σ}],
1 = (F′(kH) + (1 − δ)) β [πLsL(c1L′)^{−σ} + πHsH(c1H′)^{−σ}] / [sH(c1H)^{−σ}],

where a prime denotes the corresponding next-period value.

2. NUMERICAL SOLUTION

This model can be solved on a computer once the values of the parameters are
determined. Most of the parameters are standard in the neoclassical growth
model and take standard values. Others, such as the size of the preference
shock and the probability of occurrence, were chosen only to illustrate the
behavior of the model. In particular, a high preference shock happens on
average every 6.7 years, but it is large enough to demand a significant amount
of resources. Think, for example, of a country in an economic union that requires help or assistance on average every 6.7 years. Table 1 presents the values for
all the parameters of the model.
The right-hand side of (1)–(5) defines a contraction. The computation is
based on value function iteration as follows. Guess a function V . Then solve
for the maximum over (c, w, k′) using V, the FOC described above, and numerical maximization.


Table 1 Parameter Values

Parameter                                                      Value
γ    Exponent of capital in production function                0.30
δ    Depreciation rate of capital                              0.07
β    Discount factor                                           0.97
σ    Relative risk aversion                                    0.50
sL   Low value of the preference shock                         0.95
sH   High value of the preference shock                        1.05 and 2.00
πL   Probability of low value of the preference shock          0.85
πH   Probability of high value of the preference shock         0.15

With this solution, construct a new function V′ and restart the maximization unless V′ is sufficiently close to V.
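A minimal grid-based sketch of this iteration in R follows. It uses the representative-consumer utility uR derived in Section 1 (so the constant Pareto weight α enters only through uR) and the parameter values in Table 1 with sH = 2 and α = 0.75; the capital grid bounds, fineness, and tolerance are our own choices:

    ## Sketch: value function iteration for the representative-consumer
    ## reformulation of the planner's problem.
    gamma <- 0.30; delta <- 0.07; beta <- 0.97; sigma <- 0.50
    sL <- 0.95; sH <- 2.00; piL <- 0.85; piH <- 0.15; alpha <- 0.75

    uR <- function(s, C)
      ((s * alpha)^(1 / sigma) + (1 - alpha)^(1 / sigma))^sigma *
        C^(1 - sigma) / (1 - sigma)

    kgrid <- seq(0.5, 8, length.out = 400)
    nk <- length(kgrid)
    ## Consumption implied by each (k, k') pair: C = f(k) + (1 - delta)k - k'
    C <- outer(kgrid^gamma + (1 - delta) * kgrid, kgrid, "-")
    C[C <= 0] <- NA
    UL <- uR(sL, C); UH <- uR(sH, C)

    V <- rep(0, nk)
    repeat {
      W  <- matrix(beta * V, nk, nk, byrow = TRUE) # continuation value beta*V(k')
      VL <- apply(UL + W, 1, max, na.rm = TRUE)    # value conditional on s = sL
      VH <- apply(UH + W, 1, max, na.rm = TRUE)    # value conditional on s = sH
      Vnew <- piL * VL + piH * VH                  # expectation over the shock
      if (max(abs(Vnew - V)) < 1e-8) break         # stop when V' is close to V
      V <- Vnew
    }
    W  <- matrix(beta * V, nk, nk, byrow = TRUE)
    gL <- apply(UL + W, 1, which.max)  # decision rule for k' when s = sL (index)
    gH <- apply(UH + W, 1, which.max)  # decision rule for k' when s = sH (index)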
Now we discuss the results using the parameters in Table 1 with sH = 2 and Pareto weights {0.75, 0.25}. Figure
1 presents time series for aggregate consumption and capital accumulation in
the steady state of this economy. The top panel shows that aggregate consumption jumps after a preference shock and then returns slowly to a relatively constant value until a new shock hits. As a consequence, capital accumulation drops after a high preference shock to accommodate larger aggregate consumption, as shown on the bottom panel. The effect of this change on the incentives to
misreport a shock—if it would be unobservable—is discussed in the next
section. The distribution of consumption among agents is determined by
equations (6), i.e., agent 1’s share of aggregate consumption increases with
the value of the shock. More on this later.
Figure 2 depicts the stationary distribution of the main variables for the
same example analyzed in Figure 1. The top left panel shows that 15 percent
of the time there is a large preference shock equal to 2 and most of the time
(85 percent) a low shock equal to 0.95. The top right panel presents the
stationary distribution of capital. It is somewhat surprising that very different
values (e.g., 3 and 6) are reached with positive probability. Most of its mass is
accumulated on the higher values, however. Those correspond to periods with
low preference shocks. The lowest values of capital correspond to periods
of several consecutive high preference shocks. Something similar happens
with c2 , on the bottom right panel. A priori, these properties could have been
expected since k  and c2 are the two sources to finance transfers to agent 1
after a high preference shock. The distribution of c1 , presented on the bottom
left panel, has most of the mass around lower values and some mass at higher
values. The highest values correspond to a high preference shock hitting the
economy after a long period of low shocks.
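The stationary distributions in Figure 2 can be approximated, under the sketch above, by drawing the i.i.d. shock and applying the decision rules; following the notes to the figures, 5,000 periods are kept after deleting the first 500. This sketch continues the value function iteration code and reuses kgrid, gL, gH, and the parameters defined there:

    ## Sketch, continuing the iteration code above: simulate the model and
    ## collect the stationary distributions of k', C, and c1.
    set.seed(1)
    Tsim <- 5500
    high <- runif(Tsim) < piH                  # i.i.d. preference shock draws
    ssim <- ifelse(high, sH, sL)
    ik <- which.min(abs(kgrid - 4))            # start near the middle of the grid
    ksim <- Csim <- numeric(Tsim)
    for (t in 1:Tsim) {
      iknext  <- if (high[t]) gH[ik] else gL[ik]
      ksim[t] <- kgrid[iknext]                 # next period capital k'
      Csim[t] <- kgrid[ik]^gamma + (1 - delta) * kgrid[ik] - ksim[t]
      ik <- iknext
    }
    keep <- 501:Tsim                           # delete the first 500 realizations
    sh1  <- (alpha * ssim)^(1 / sigma) /
            ((alpha * ssim)^(1 / sigma) + (1 - alpha)^(1 / sigma))
    c1sim <- sh1 * Csim                        # agent 1's consumption, eq. (6)
    hist(ksim[keep]); hist(c1sim[keep]); hist(Csim[keep] - c1sim[keep])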


Figure 1 Consumption and Capital Paths in the Stationary Distribution

[Figure: two panels of simulated time series; the top panel plots aggregate consumption C together with the preference shock s, and the bottom panel plots next period capital k′ together with s.]

Notes: These paths were computed from time series data of these variables for
5,000 periods after deleting the first 500 realizations.

3. THE ROLE OF INFORMATION
This section investigates the incentives to misreport preference shocks by
agent 1 whenever the full information allocation described above is the target
to be implemented. To do so, consider the value of the following (implicit)
incentive compatibility constraints:
iccHL = sHu(c1H) + βw1H − [sHu(c1L) + βw1L],    (7)
iccLH = sLu(c1L) + βw1L − [sLu(c1H) + βw1H].    (8)
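Given a simulated full information allocation, these objects are straightforward to evaluate. A sketch, assuming the CRRA felicity and the parameters of Table 1 are in scope, with the consumption and from-tomorrow-on utility arguments taken from the solved model:

    ## Sketch: evaluate the incentive compatibility measures (7) and (8);
    ## sigma, beta, sL, and sH are assumed to be defined as in Table 1.
    u <- function(c) c^(1 - sigma) / (1 - sigma)

    icc_HL <- function(c1H, c1L, w1H, w1L)  # truthful high report minus lying
      sH * u(c1H) + beta * w1H - (sH * u(c1L) + beta * w1L)

    icc_LH <- function(c1H, c1L, w1H, w1L)  # truthful low report minus lying
      sL * u(c1L) + beta * w1L - (sL * u(c1H) + beta * w1H)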

The interpretation of these variables is very important for the analysis hereafter.
If the variable iccHL is positive, it means that when state H is realized, agent
1 would prefer truthfully reporting a high preference shock and obtaining
{c1H , w1H } instead of misreporting it and receiving {c1L , w1L }. Similarly, a


Figure 2 Stationary Distribution, Main Variables

[Figure: four histograms showing the stationary distributions of s, the preference shock; k′, next period capital; c1, agent 1 consumption; and c2, agent 2 consumption.]

Notes: These histograms were computed from time series data of these variables for
5,000 periods after deleting the first 500 realizations.

negative value of iccLH means that agent 1 would prefer misreporting a high
preference shock and obtaining {c1H , w1H } to truthfully reporting a low shock
and receiving {c1L , w1L }.
Since c1H > c1L , one may expect that there is no incentive to report the low
shock when the high shock was actually realized, i.e., a positive value of iccH L .
This is actually what happens in the stationary distribution, as shown on the
top panel of Figure 3. In contrast, agent 1 may be tempted to misreport a high
preference shock to obtain higher consumption. Remember that this would
imply that iccLH < 0. This need not always be the case, however. Since k′ is lower after a high preference shock, agent 1’s prospects worsen after such a report. Thus, it will be a race between more consumption
today, c1H > c1L , and less future consumption, w1L > w1H . The results for


Figure 3 Incentive Compatibility in the Stationary Distribution

[Figure: two density plots; the top panel shows iccHL, the utility of truthfully reporting the high shock minus misrepresenting, and the bottom panel shows iccLH, the utility of truthfully reporting the low shock minus misrepresenting.]

Notes: These histograms were computed from time series data of these variables for
5,000 periods after deleting the first 500 realizations.

the example described above are presented in the bottom panel of Figure 3.
There, iccLH is negative more than 80 percent of the time but positive in some
instances. This means that in all such instances, the drop in from-tomorrow-on
utilities caused by reporting a high preference shock is enough to compensate
for the difference in current consumption. What determines whether iccLH is
negative or positive will be studied next by analyzing different examples.
The next two examples capture the role of the size of redistribution versus
disinvestment. The first example is presented in Figure 4. This is the same
example in all the previous figures, but the difference is that the Pareto weight
of agent 1 is only 0.33 (instead of 0.75) and the weight of agent 2 is 0.67. This
implies that agent 2’s consumption is larger, as shown in the top left panel.
The top right panel presents the behavior of capital accumulation. Notice that


Figure 4 Paths with Large Redistribution of Aggregate Consumption

[Figure: four panels of time series showing the consumptions c1 and c2 with the shock s; next period capital k′; the promised utilities w1L and w1H; and the incentive measures iccHL and iccLH.]

Notes: In this economy, the weights on agents 1 and 2 are 0.33 and 0.67, respectively.
The time series data in all four graphs correspond to the initial 35 periods after the
economy is started with a stock of capital smaller than the steady-state level.

the time series in the graphs correspond to the transition toward a higher level
of capital. From these two figures it is clear that a nontrivial part of the rise in
agent 1’s consumption after a high preference shock comes from redistribution
of consumption across agents. As a consequence, the promised utilities from
next period on are not that different after a report of high or low preference
shock, as shown in the bottom left panel. In turn, this implies that iccLH
is always negative, as presented in the right bottom panel. Thus, this is an
example in which the full information allocation would not be implementable
under private information: After a low preference shock, agent 1 would prefer
to falsely report a high preference shock.


Figure 5 Paths with Large Variation in Investment

[Figure: four panels of time series showing the consumptions c1 and c2 with the shock s; next period capital k′; the promised utilities w1L and w1H; and the incentive measures iccHL and iccLH.]

Notes: In this economy, the weights on agents 1 and 2 are 0.85 and 0.15, respectively.
The artificial time series data in all four graphs correspond to 30 periods created after
the steady-state level of capital is reached.

Now consider the example presented in Figure 5. Here, the behavior of
the same series is presented for an economy in which the Pareto weight of
agent 1 is 0.85 and the steady-state distribution of capital is reached. This
implies that agent 1’s consumption is much larger than that of agent 2, as
shown in the top left panel. As a consequence, capital accumulation must
vary significantly to provide more consumption to agent 1 after the realization
of a high preference shock. This is shown in the top right panel. Therefore,
as presented in the bottom left panel, the difference in from-tomorrow-on
utilities associated with low and high preference shocks is large. Thus, both
incentive compatibility constraints are positive in the stationary distribution


Figure 6 Incentive Compatibility and Capital Accumulation

[Figure: four panels of time series showing the consumptions c1 and c2 with the shock s; next period capital k′; the promised utilities w1L and w1H; and the incentive measures iccHL and iccLH.]

Notes: In this economy, the weights on agents 1 and 2 are 0.75 and 0.25, respectively.
The time series data in all four graphs correspond to the initial 35 periods after the
economy is started with a stock of capital larger than the steady-state level.

of this economy (see bottom right panel), and the full information allocation
would be implementable under private information.
The previous two examples are useful to understand that the relative importance of the agent who privately observes the shock matters for the role of
private information. When this agent is more important, her share of aggregate
consumption is larger, and the rise of that agent’s consumption after a shock
comes mainly from disinvestment. This makes misreporting a high preference shock too costly in terms of her own future consumption, and hence the
full information allocation is implementable under private information. Thus,
the size of disinvestment, determined by the importance of the agent with
the preference shock, matters for the provision of incentives under private


information. This suggests that in a fully specified model with private information, the planner would like to increase the Pareto weight of the agent with
private information to reduce the incidence of this friction.
The next example illustrates the role of the outlook for economic growth
at the time of disinvestment in preventing misrepresentation of preference
shocks. First, consider the example in Figure 6. It displays the transition to
the steady state from a larger stock of capital. The weights of agents 1 and
2 are 0.75 and 0.25, respectively. Initially, consumption, capital, and from-tomorrow-on utilities decrease. During this initial phase, while capital is large
and decreasing, iccLH is negative and increasing. This means that when there
is extra capital in the economy, as compared to the stationary distribution,
the optimal drop in capital that a high preference shock would require (and
its corresponding drop in promised utility) is not large enough to provide
incentives to make the report of that shock incentive compatible. Eventually,
a high preference shock hits the economy, the consumption of agent 1 jumps,
and capital drops significantly. Now, the economy is expected to grow in
the coming years, which implies that another high preference shock would
hurt both agents more. Therefore, reporting a high preference shock becomes
incentive compatible for a few years, until the stock of capital reaches a higher
level. The same pattern occurs again a few years later, when another high preference shock hits the economy. Thus, this example illustrates the interaction
of growth and information. Misrepresentation of preference shocks is more
costly if the economy is expected to grow. This finding suggests that a planner
solving for the best incentive-compatible allocation would delay growth to
facilitate the provision of incentives.
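
The role of the growth outlook can be illustrated with a back-of-the-envelope calculation (ours, not the article's computation; the numbers are purely hypothetical). With concave utility, the consumption drop that disinvestment imposes is more painful when consumption is low and the economy is expected to grow back toward its stationary level than when excess capital is being run down:

    # Illustrative sketch (not the article's model): under log utility, an
    # absolute consumption drop of a given size is more costly when the
    # level of consumption is low (recovery phase, growth ahead) than when
    # it is high (excess capital is being decumulated).
    import math

    def utility_loss(c, drop):
        # Utility cost of reducing consumption from c to c - drop.
        return math.log(c) - math.log(c - drop)

    drop = 0.2  # hypothetical disinvestment triggered by a high report
    print(utility_loss(1.0, drop))  # low consumption, growth ahead:   ~0.223
    print(utility_loss(3.0, drop))  # high consumption, extra capital: ~0.069

Along a recovery path, then, claiming a high shock and triggering further disinvestment is relatively expensive, which is consistent with iccLH turning positive after capital drops.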
The last example confirms the importance of the size of disinvestment
and the outlook for economic growth. Consider the artificial time series
presented in Figure 7. The Pareto weight of agent 1 is larger than in the
previous examples, 0.85, but the value of the high preference shock is smaller,
sH = 1.05. First, notice that this example confirms the result in the previous
figure: It is easier to provide incentives (iccLH is larger) when the economy
is expected to grow. In this case, however, iccLH is never greater than zero,
even though agent 1's weight is larger than in all other examples. The key
difference is that the shock is not as large. Thus, the drop in capital
accumulation is small, and therefore the difference between w1L and w1H is
small as well.

4. CONCLUSIONS

This article studies the interaction between growth and risk sharing. First, it
asks how investment is affected by insurance needs, using a stochastic growth
model with two agents and preference shocks. Only one of the agents (or
groups, regions, countries) is affected by this shock, which effectively
increases that agent's need for consumption. When both agents are risk averse,
the socially optimal response to this shock requires both decreasing the
consumption of the other agent and decreasing capital accumulation. Thus, the
occurrence of this shock slows down convergence toward the stationary
distribution of capital.

Figure 7 Paths for the Model with Small Shocks

[Figure: four panels of time series. The top panels plot capital k' and the preference shock s, and the consumptions c1 and c2; the bottom left panel plots the from-tomorrow-on utilities w1L and w1H; the bottom right panel plots the incentive compatibility constraints icc HL and icc LH.]

Notes: In this economy, the weights on agents 1 and 2 are 0.85 and 0.15, respectively. The artificial time series in all four graphs correspond to 30 periods created after the steady-state level of capital is reached.
We then analyze whether the best path of capital accumulation and consumption
allocation is implementable when needs are privately observed by the agents.
That is, if the shocks are privately observed by individuals, do they have an
incentive to misrepresent them? The value of the incentive compatibility
constraints implied by the full information allocation is used to answer this
question. Because investment drops when an agent reports a high preference
shock, the prospects of all agents deteriorate after such a report. This may
be enough to prevent misreporting. The size of disinvestment after the report
of a high preference shock and the outlook for economic growth at the time of
disinvestment are both important for inducing individuals to report a low
realization of the preference shock truthfully. This analysis suggests that in
a fully specified model with private information, the best incentive-compatible
allocation would tend to hurt growth, by decreasing investment, and to increase
inequality, by increasing the share of consumption of the agent with private
information. Of course, this is only a conjecture; solving for the
constrained-efficient allocation in this environment is necessary to verify
its validity. This is left for future research.
