Economic Quarterly—Volume 97, Number 3—Third Quarter 2011—Pages 189–193

Introduction to the Special Issue on Modern Macroeconomic Theory

Andreas Hornstein

The Great Recession of 2007–2009 has generated significant external criticism of the way economists study and try to understand aggregate economic outcomes. Modern macroeconomic theory, in particular, has been criticized for its representation of the economy through highly stylized environments that abstract from distributional issues, ignore or minimize linkages between the financial and nonfinancial sectors of the economy, and, in general, rely too much on highly aggregative frameworks. This issue collects four articles that describe how modern macroeconomic research has dealt with some of these issues as part of a research program that has been ongoing for more than a decade.

The first article, by Nobuhiro Kiyotaki, provides a short history of modern business cycle theory and how it has evolved to potentially address the role of the financial sector in the aggregate economy. Kiyotaki starts with the neoclassical growth model as a reference point for most of modern business cycle theory. This modelling framework, originally known as "real business cycle" theory, starts with the stark abstraction of one representative household and one representative producer in a competitive environment without any frictions on the interactions of consumers and producers. From the perspective of this model, business cycles are driven by exogenous shocks, and the dynamics of the cycle essentially reflect the dynamics of the shocks. In other words, there is only a weak model-internal mechanism that propagates shocks. Kiyotaki then studies a sequence of well-defined deviations from this reference point and asks which deviations are more likely to affect the baseline interpretation of business cycles.
[The views expressed do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mail: andreas.hornstein@rich.frb.org.]

Kiyotaki first shows how heterogeneity in consumption and production can be easily accommodated in this framework given the assumption of complete markets. In a second step, Kiyotaki shows how non-competitive markets, arising either from market power or from limitations on the interactions of agents, can be introduced into the baseline model. Neither of these modifications affects the interpretation of business cycles as being driven by shocks. Finally, Kiyotaki argues that restrictions on the set of available financial contracts significantly affect the way exogenous shocks are propagated in the model economy.

The second article, by Vincenzo Quadrini, elaborates on the role of financial frictions in production decisions. Quadrini illustrates these financial frictions in a simple example where entrepreneurs have to acquire capital to operate an intertemporal production technology. Again, financial frictions are introduced relative to the baseline complete markets framework. Quadrini discusses the two most popular models of market incompleteness—the costly state verification (CSV) model and the collateral constraint (CC) model. Both frameworks limit entrepreneurs to the use of two financial instruments: contingent debt (equity or net worth) and non-contingent debt. In the CSV model, non-contingent debt is the optimal response to a limited information problem, and an entrepreneur's net worth limits his ability to issue debt and finance investment projects. In the CC model, posting collateral allows the entrepreneur to obtain credit despite his inability to credibly commit to the repayment of debt. The main question then becomes how these financial frictions can amplify the effects of shocks to the economy or be themselves a source of shocks to the economy.
Quadrini illustrates the basic mechanism for amplification and propagation in the simple model, and surveys the results from more "realistic" models.

The third article, by Fatih Guvenen, surveys recent research on household heterogeneity in the absence of complete markets. We might be interested in household heterogeneity for two reasons. First, even though we assume in the baseline "real business cycle" model that aggregate consumption and labor supply decisions can be modelled through a representative household construct, we might worry that "distributions" of ability, income, or wealth do matter for the behavior of these aggregate outcomes. Second, observed inequality of income and wealth often gives rise to attempts to redistribute resources. In order to address the costs and benefits of such a policy, one first needs a theory that accounts for the currently observed inequality across households. If we care about inequality because of implied differences in "well-being," then we should care about inequality in consumption and leisure, and we should care about income inequality only to the extent that it gives rise to consumption inequality. Much of the research surveyed by Guvenen studies how, in the absence of complete markets, income inequality gets translated into consumption and wealth inequality. If the level of income and its distribution are exogenous, the redistribution problem is simplified since any attempt to influence consumption and wealth inequality does not feed back into either the level or the distribution of income. But economists are always worried about the labor supply effects of tax policies, that is, the possibility that at least part of income levels and inequality are endogenous. In standard models, these labor supply effects show up as variations in hours worked or labor market participation decisions. In his survey, Guvenen emphasizes a different labor supply decision, namely the accumulation of human capital.
Overall, Guvenen shows that accounting for heterogeneity of households in environments with incomplete markets is feasible, but it also requires the application of advanced computational tools. In the absence of controlled experiments, researchers are essentially compelled to construct artificial worlds with a population of heterogeneous households. Once the consumption and labor supply decisions of the households in the model mirror the observed behavior of households, we can ask how changes in the artificial environment will affect outcomes.

The fourth article, by Diego Restuccia, deviates somewhat from the immediate concerns of the U.S. economy and studies the determination of output in a global framework. During the "Great Recession," U.S. real gross domestic product (GDP) declined by 5 percent from 2007 to 2009, and, as of 2011, real GDP is arguably 10 percent below its long-run trend growth path. While these changes in real output are large, they pale in comparison to observed cross-country income differences: In 2005, the average per capita income in the richest countries was about 65 times that of the poorest countries. Restuccia first surveys the evidence on cross-country differences in per capita income. He shows that, although cross-country per capita income inequality appears to have been increasing over the last 30 years, for individual countries there are success stories and there are failures. The most prominent recent examples of countries that have been catching up with the leading world economy—the United States—are China and India. However, there are countries, such as Zimbabwe and Venezuela, that have been falling further and further behind the United States. Restuccia then argues that the process of structural transformation, that is, the transition from a predominantly agricultural economy to an industrialized economy, and then to a service-oriented economy, can account for some of these differences.
In particular, he points to the relatively low levels of agricultural productivity in poor countries as a major source of income differences. Essentially, Restuccia argues that cross-country differences in aggregate productivity and per capita income can be attributed to differences in sectoral productivities resulting in differences in resource allocation. Restuccia then surveys theories that attribute differences in sectoral productivity to distortions that lead to the inefficient allocation of resources across production establishments. Restuccia's survey reflects how the baseline neoclassical model of production can be modified to account for heterogeneity in production, first at the industry level, then at the establishment level. These modifications are matched to observations, and we can see how much they contribute to differences in aggregate output.

The four articles in this issue represent part of a research program in macroeconomics that takes the basic stochastic growth model with complete markets as its point of departure. Work in this research program then adds various sources of frictions and heterogeneity on the consumption and production side, including restrictions on the set of available markets and on the ability of market participants to pledge to repay debts. This procedure allows macroeconomists to evaluate, relative to a common benchmark model, the contributions of the various features that allow model economies to capture more dimensions of the available empirical evidence.
Another line of research that is part of this program, but is not addressed by these articles, departs from the baseline growth model by introducing nominal price rigidities in order to address monetary non-neutrality.1 In fact, until the Great Recession, research on the role of nominal price rigidities, and on monetary policy institutions in particular, received more attention in macroeconomics than did research on financial market frictions. This ranking of different lines of research simply reflected the historical experience of the U.S. economy and other advanced economies: Apparent inflation-output tradeoffs were considered to be much more important than financial-market instability. For example, in the U.S. economy the stock market crash of 1987 had no appreciable impact on the aggregate economy, and the boom in equity prices in the 1990s, with a subsequent crash in 2001, was followed by one of the shallowest recessions in post-WWII history. For many macroeconomists, the Great Recession changed the perception of how important financial markets might be for the economy. Consequently, attention among economists has shifted toward the lines of research that emphasize financial market frictions and that are described in this special issue. The fact that economists continue to discuss the causes and consequences of the Great Depression should, however, give one pause before expecting any time soon a coherent and generally accepted narrative of the Great Recession and how it relates to the preceding collapse of the housing bubble and the ensuing financial crisis.2

1 For an introduction, see Galí (2008).
2 Lo (forthcoming), in a very instructive survey of the literature on the financial crisis by both academics and journalists, observes that no single narrative has yet emerged from that literature, and that, even for a number of commonly accepted "stylized facts" of the financial crisis, there is no clear-cut empirical evidence.

REFERENCES

Galí, Jordi. 2008. Monetary Policy, Inflation, and the Business Cycle: An Introduction to the New Keynesian Framework. Princeton, N.J.: Princeton University Press.

Lo, Andrew W. Forthcoming. "Reading About the Financial Crisis: A 21-Book Review." Journal of Economic Literature.

Economic Quarterly—Volume 97, Number 3—Third Quarter 2011—Pages 195–208

A Perspective on Modern Business Cycle Theory

Nobuhiro Kiyotaki

[This is an English translation of my Japanese article "A Perspective on Modern Business Cycle Theory" in The 75 Years History of Japanese Economic Association, edited by the Japanese Economic Association (2010). The article is based on my plenary talk at the Japanese Economic Association annual meeting in October 2009.]

The global financial crisis and recession that started in 2007 with the surge of defaults on U.S. subprime mortgages is having a large impact on recent macroeconomic research. The framework of modern macroeconomics that has replaced traditional Keynesian economics since the 1970s has been widely criticized. Many of the criticisms have focused on the assumption of the representative agent and the resulting abstraction from firm and household heterogeneity. Critics are also skeptical about the model's ability to explain unemployment and financial crises because it abstracts from market frictions and irrationality. As a result, modern macroeconomics has often been attacked for failing to provide policy insight in the way that traditional Keynesian economics has done.1 Some criticisms are constructive and others are misleading. I would like to present my thoughts on what I believe are the contributions and shortcomings of modern macroeconomic theory, in particular business cycle theory, by responding to some of these criticisms.2

1. REAL BUSINESS CYCLE THEORY

For the past few decades, real business cycle (RBC) theory has been the focal point of debates in business cycle studies.3 According to the standard
RBC approach, the competitive equilibrium of the market economy achieves a resource allocation that maximizes the representative household's expected utility given the constraints on resources.

[I would like to thank Raoul Minetti, Mako Saito, and Akihisa Shibata for thoughtful comments on the lecture slides and the earlier draft. The opinions expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mail: kiyotaki@princeton.edu.]

1 For example, see Krugman (2009).
2 Because of the limitations of space and my expertise, I will not deal with economic growth or the empirical studies of business cycles. I also admit that the references are heavily biased toward my own work.
3 See Kydland and Prescott (1982) and Prescott (1986) for examples.

Although the RBC approach has often been criticized for its abstraction from firm and household heterogeneity, these charges are incorrect. Instead, it would be more accurate to view the RBC framework as one with heterogeneous firms and households all playing a part in the social division of labor under an ideal market mechanism. Real business cycle theory is a business cycle application of the Arrow-Debreu model, the standard general equilibrium theory of market economies.

Let us briefly outline the mechanics of an RBC model. Consider an economy with a homogeneous product that can be either consumed or invested. Labor, capital, and land are homogeneous inputs to production, and the total supply of land is normalized to unity. There are a number of infinitely lived households (h = 1, 2, ..., H) and firms (j = 1, 2, ..., J). A household's preferences are given by the discounted expected utility of consumption and disutility of work:

    E_0 \sum_{t=0}^{\infty} \beta^t [u_h(c_{ht}) - d_h(n_{ht})].    (1)

Firm j's maximum output is a function of the factors of production—capital, land, and labor (k_{jt}, l_{jt}, n_{jt})—represented by the production function

    y_{jt} = f(k_{jt}, l_{jt}, n_{jt}; z_{jt}),    (2)

where the productivity of firm j, z_{jt}, follows a Markov process. In the goods market equilibrium, aggregate output equals aggregate consumption plus investment:

    \sum_{j=1}^{J} y_{jt} = \sum_{h=1}^{H} c_{ht} + \sum_{j=1}^{J} k_{jt+1} - (1 - \delta) \sum_{j=1}^{J} k_{jt},    (3)

where δ is the depreciation rate of capital. Here, we assume that markets are complete; that is, there exists a complete set of Arrow securities so that state-contingent claims to goods and factors of production for every possible future state can be traded at the initial period. We also assume that capital, land, and labor can be allocated freely across firms every period and that all markets are perfectly competitive. Under these assumptions, the competitive equilibrium achieves an allocation that maximizes a weighted average of all individual households' expected utilities with constant weights λ_h, given the resource constraints (Negishi 1960). We define the representative household's utility function as the weighted average of all household utilities:

    u(C) = \max_{\{c_h\}} \sum_{h=1}^{H} \lambda_h u_h(c_h) \quad \text{s.t.} \quad \sum_{h=1}^{H} c_h = C,

    d(N) = \min_{\{n_h\}} \sum_{h=1}^{H} \lambda_h d_h(n_h) \quad \text{s.t.} \quad \sum_{h=1}^{H} n_h = N.

The aggregate production function is defined as total output given the efficient allocation of the factors of production and can be written as

    Y_t = A_t F(K_t, N_t) = \max_{\{k_{jt}, l_{jt}, n_{jt}\}} \sum_{j=1}^{J} f(k_{jt}, l_{jt}, n_{jt}; z_{jt})
    \quad \text{s.t.} \quad \sum_{j=1}^{J} k_{jt} = K_t, \quad \sum_{j=1}^{J} l_{jt} = 1, \quad \sum_{j=1}^{J} n_{jt} = N_t.    (4)

Here, aggregate productivity A_t is a function of z_{jt} for all j.
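To make concrete how the aggregate production function (4) embodies heterogeneity, the following sketch computes total output under the efficient split of capital and labor between two firms with different productivities. It uses a hypothetical decreasing-returns technology and omits land for simplicity; none of this is the article's specification. The point is that measured aggregate productivity depends on the whole profile of the z_j:

```python
# Illustrative sketch: the aggregate production function as the value of
# an allocation problem across heterogeneous firms.  The technology
# f = z * k**0.3 * n**0.3 and all numbers below are hypothetical.
def f(z, k, n):
    return z * k ** 0.3 * n ** 0.3

def aggregate_output(z1, z2, K, N, grid=2000):
    """Max total output when both inputs are split between two firms."""
    best = 0.0
    for i in range(grid + 1):
        s = i / grid                    # share of K and N given to firm 1
        y = f(z1, s * K, s * N) + f(z2, (1 - s) * K, (1 - s) * N)
        best = max(best, y)
    return best

# Efficient allocation beats a naive 50/50 split when productivities differ.
Y = aggregate_output(1.0, 2.0, K=10.0, N=10.0)
Y_equal_split = f(1.0, 5.0, 5.0) + f(2.0, 5.0, 5.0)
```

With equal productivities the efficient allocation is the symmetric split, so `aggregate_output` reduces to the usual aggregate technology; with unequal productivities the maximized output, and hence the implied A_t, rises with the dispersion of the z_j.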
The competitive equilibrium is described by aggregate quantities (C_t, N_t, Y_t, K_{t+1}) as a function of the state variables K_t and \{z_{jt}\}_{j=1,2,...,J} that maximize the expected utility of the representative household

    E_0 \sum_{t=0}^{\infty} \beta^t [u(C_t) - d(N_t)],    (5)

subject to the resource constraint

    C_t + K_{t+1} - (1 - \delta) K_t = A_t F(K_t, N_t).

Note that the representative household is not an assumption; it arises as an implication of constant Negishi weights under complete markets, as in Negishi (1960). The aggregate production function is also constructed under the assumption that production is efficient in competitive markets without frictions. Therefore, real business cycle theory does not blindly abstract from firm and household heterogeneity. By assuming that markets are functioning "well," we reduce an otherwise general model to one of a representative agent with an aggregate production function and analyze the business cycle phenomenon in this simplified economy.

Now I wish to discuss, in an intuitive manner, how real business cycle theory explains the fluctuation of the aggregate quantities (C_t, N_t, Y_t, K_{t+1}) in response to a shock to aggregate productivity. Suppose that aggregate productivity suddenly increases temporarily. Following this shock, the marginal product of labor will increase, leading to a rise in the real wage and therefore in the quantity of labor supplied. The combined effect of higher productivity and increased use of labor will cause output to rise. But since the productivity increase is temporary, future output is expected to increase less than present output, and permanent income and consumption do not increase as much as present output. Thus, from the goods market equilibrium condition (output = consumption + investment), investment and, hence, the next period's capital stock will increase.
This will increase the next period's marginal product of labor, labor input, and output, leading to another round of increases in aggregate quantities, and so on. Hence, a temporary shock to productivity precipitates a persistent rise in aggregate quantities. However, the biggest problem with the propagation mechanism described above is that short-term changes in investment have little impact on the capital stock. At the same time, if there is a persistent increase in output, permanent income and consumption will increase almost as much as current income, which leaves little room for investment to rise. Therefore, we conclude that capital plays a limited role in the propagation of a productivity shock. Furthermore, the substitution and wealth effects of a productivity shock on labor supply work to cancel each other out: At a higher real wage, the representative agent is willing to supply more labor, but the higher aggregate productivity also increases the agent's wealth, which in turn reduces labor supply. Unless the substitution effect is very large, the overall fluctuations of labor will not be very large. As a result, we need large and persistent aggregate productivity shocks in order to explain the business cycle phenomenon. Because RBC models lack a powerful propagation mechanism whereby small shocks to the economy are amplified into large fluctuations, they rely on large exogenous shocks. But the question is: Where do these exogenous shocks come from? It is difficult to identify such shocks even for the recent global recession or the 1930s Great Depression.

2. OTHER SHOCKS

While exogenous shocks to productivity were the main source of shocks in early RBC analysis, the framework was later extended to include the effects of other potential shocks. For example, in the face of a global downturn, what would be the effects of a decreasing demand for exports?
We cannot address this question in a perfectly competitive economy since individual firms are assumed to make their production decisions taking market prices as given; that is, they cannot perceive changes in demand directly. Therefore, let us assume a monopolistically competitive economy in which each firm j sells a differentiated good. Aggregate output, which can be used as either a consumption good or an investment good, is a function of the many differentiated goods:

    Y_t = \left[ \sum_{j=1}^{J} x_{jt}^{1/\theta} \, y_{jt}^{(\theta-1)/\theta} \right]^{\theta/(\theta-1)},

where θ > 1 is the elasticity of substitution between differentiated goods. The parameter x_{jt} is an exogenous idiosyncratic demand shock to firm j's product. If we let p_{jt} be the price of each good, the price index that corresponds to the above aggregate output is

    P_t = \left[ \sum_{j=1}^{J} x_{jt} \, p_{jt}^{1-\theta} \right]^{1/(1-\theta)}.

Since households and firms choose differentiated goods to maximize their consumption and investment levels subject to their budget constraints, for a given level of aggregate output produced (aggregate demand, for short), the real income of each firm is given by

    \frac{p_{jt}}{P_t} y_{jt} = x_{jt}^{1/\theta} \, Y_t^{1/\theta} \, y_{jt}^{1-1/\theta}.

Export demand shocks, which shift aggregate demand Y_t, will affect the real income of firms and will change production, employment, consumption, and investment levels the way productivity shocks did in the previous section. Although a monopolistically competitive economy yields inefficient equilibrium resource allocations, key features of the business cycle are not significantly different from those of a perfectly competitive economy. In fact, a monopolistically competitive equilibrium corresponds to a perfectly competitive market equilibrium with a value-added tax that redistributes the tax revenue lump sum.
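A quick numerical check (with made-up parameter values, not taken from the article) confirms that the CES aggregator, the price index, and the real-revenue expression above are mutually consistent:

```python
# Hypothetical parameter values; theta > 1 as in the text.
theta = 4.0
x = [0.6, 1.4, 1.0, 0.8, 1.2]        # idiosyncratic demand shifters x_j
p = [0.9, 1.1, 1.0, 0.95, 1.05]      # relative prices p_j

# Price index P = [sum_j x_j p_j^(1-theta)]^(1/(1-theta))
P = sum(xj * pj ** (1 - theta) for xj, pj in zip(x, p)) ** (1 / (1 - theta))

Y = 1.0                               # aggregate demand, normalized
# CES demand for each differentiated good: y_j = x_j (p_j/P)^(-theta) Y
y = [xj * (pj / P) ** (-theta) * Y for xj, pj in zip(x, p)]

# The aggregator should recover Y from the individual quantities.
Y_check = sum(xj ** (1 / theta) * yj ** ((theta - 1) / theta)
              for xj, yj in zip(x, y)) ** (theta / (theta - 1))

# Real revenue (p_j/P) y_j should equal x_j^(1/theta) Y^(1/theta) y_j^(1-1/theta).
revenue = [(pj / P) * yj for pj, yj in zip(p, y)]
formula = [xj ** (1 / theta) * Y ** (1 / theta) * yj ** (1 - 1 / theta)
           for xj, yj in zip(x, y)]
```

The identity between `revenue` and `formula` follows directly from inverting the CES demand curve y_j = x_j (p_j/P)^{-θ} Y_t, which is how a monopolistically competitive firm perceives the demand shift x_j.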
Therefore, simply adding monopolistic competition to an RBC model cannot account for the business cycle phenomenon, and some other source of friction, such as price stickiness, or a different type of shock is necessary.

Now, instead of a shock to aggregate demand, let us consider a shock to the quality of capital. Assume that a fraction ψ_{t+1} of differentiated goods becomes obsolete between periods t and t+1, and that the capital used as an input in the production of those goods also becomes obsolete. In other words, the idiosyncratic demand parameter x_{jt} of goods affected by the obsolescence shock becomes zero, and the corresponding amount of demand shifts toward new goods. As a result, the productive capital stock will decrease to

    K_{t+1} = I_t + (1 - \psi_{t+1})(1 - \delta) K_t.    (6)

This lower capital stock induces a decrease in output and employment. If there are no other sources of friction in the economy, however, investment will increase, encouraging the expansion of labor supply, and contribute to a quick recovery of output. This is similar to the adjustment process of an economy with an initial stock of capital lower than the steady-state equilibrium level in the neoclassical optimal growth model. Again, incorporating capital obsolescence shocks is insufficient to explain standard cases of recessions, in which investment and employment are depressed instead of booming. (See Section 4 for more explanation.)

3. LABOR MARKET FRICTION

Real business cycle theory is often criticized for its lack of implications for the cyclical behavior of unemployment. This issue has been partially addressed in the Diamond-Mortensen-Pissarides framework, which incorporates the matching frictions that exist in the labor market between workers and firms.4 Matching theory assumes that it is costly and time consuming to find productive matches because workers and jobs are heterogeneous.
In order to include this feature in the macroeconomic model, they introduce an aggregate job-matching function, written as an increasing function of job vacancies v_t and the number of unemployed workers (the difference between the workforce and the employment level, \bar{N}_t - N_t):

    N_{t+1} = \mu_t M(v_t, \bar{N}_t - N_t) + (1 - \delta_{nt}) N_t.

Here μ_t represents the efficiency of job matching and δ_{nt} is an exogenous parameter that measures the rate at which current job matches are destroyed. We assume firms incur a recruitment cost of χ units of the output good per vacancy. The goods market equilibrium condition is

    Y_t = C_t + K_{t+1} - (1 - \delta) K_t + \chi v_t.

After firms and workers are matched, wages are determined by Nash bargaining. We assume the Hosios condition (Hosios 1990), under which the firm's bargaining power is equal to the elasticity of the number of aggregate job matches with respect to vacancies, is satisfied. Each household consists of many workers and is therefore able to diversify the labor income risk from unemployment. The competitive equilibrium of such an economy with search maximizes the expected utility of the representative household.

In a search and matching model, search is an investment of current resources for future returns, and we expect substantial fluctuations in unemployment only when labor productivity and demand are expected to change persistently. However, according to Shimer (2005), even with persistent labor productivity shocks, fluctuations in unemployment will be small if the marginal product of labor is significantly larger than the marginal cost of labor supply (the marginal rate of substitution between labor supply and consumption, d'(\bar{N})/u'(C)) and if wages are determined by Nash bargaining.

4 See Mortensen and Pissarides (1994), Merz (1995), and Andolfatto (1996) for descriptions of the Diamond-Mortensen-Pissarides framework and its incorporation into RBC models.
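The employment law of motion above is easy to simulate. The sketch below assumes a Cobb-Douglas matching function and hypothetical round-number parameter values (illustrative only, not taken from the article), holds vacancies constant, and iterates to the implied steady-state unemployment rate:

```python
# Hypothetical parameters: matching efficiency mu, matching elasticity eta,
# separation rate delta_n, and workforce Nbar normalized to one.
mu, eta, delta_n, Nbar = 0.6, 0.5, 0.03, 1.0

def next_employment(N, v):
    """One step of N' = mu*M(v, Nbar - N) + (1 - delta_n)*N
    with M(v, u) = v**eta * u**(1 - eta)."""
    unemployed = Nbar - N
    matches = mu * v ** eta * unemployed ** (1 - eta)
    return min(matches + (1 - delta_n) * N, Nbar)

N = 0.90                                 # initial employment
for _ in range(500):
    N = next_employment(N, v=0.05)       # constant vacancies for simplicity

unemployment_rate = 1 - N
```

In the steady state, new matches exactly offset separations (μ M(v, \bar{N} - N) = δ_n N), so the unemployment rate is pinned down by the flow balance; in the full model, vacancies v_t would respond to the expected value of a filled job rather than being held fixed.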
Therefore, search models appear to be limited to explaining fluctuations in the unemployment of young workers fresh out of school and old workers nearing retirement, for whom the difference between the marginal product of labor and the marginal cost of labor supply is small.

4. HETEROGENEITY AND CREDIT LIMITS

In the Arrow-Debreu economy that underlies RBC theory, credit is considered to be a particular kind of exchange: The borrower receives present goods (or purchasing power to buy goods at present) in exchange for a promise to pay back purchasing power at a future date. In this economy, there is an auctioneer who has the authority to enforce all contracts for all contingencies, thus eliminating any failure of payment in the future. Therefore, an exchange between present goods and future goods in this market is not subject to any frictions and is no different from an exchange between two present goods. If, however, this enforcing auctioneer is absent in a decentralized market economy, then a borrower can default on his payment in the future. Anticipating the possibility of default, the creditor requires collateral for loans and makes the amount of credit contingent on the value of the collateral.

In order to analyze the business cycle in economies with such credit constraints, we assume that it takes one period to transform inputs into output. Instead of production function (2), we use

    y_{jt+1} = f(k_{jt}, l_{jt}, n_{jt}; z_{jt}).    (7)

We assume that the maturity of all outstanding debt is one period and that the debt repayment in the next period, b_{jt}, cannot exceed a fraction φ of the expected value of the collateral, which in this model we assume to be land:

    b_{jt} \le \phi E_t(q_{t+1} l_{jt}).    (8)

Here q_{t+1} is the price of land in period t+1 and l_{jt} is the amount of land posted as collateral. We assume that the repayment amount is independent of the state of the borrower or the economy (i.e., the debt is non-contingent).
So even though we rationalize the imposition of the borrowing constraint by reference to the possibility of default, we assume that there is no default in equilibrium. The entrepreneur's budget constraint is

    c_{jt} + k_{jt} + q_t l_{jt} + w_t n_{jt} = \left[ y_{jt} + (1 - \psi_t)(1 - \delta) k_{jt-1} + q_t l_{jt-1} - b_{jt-1} \right] + \frac{b_{jt}}{r_t}.    (9)

The left-hand side of the equation is the entrepreneur's expenditure on consumption (or dividends) and the factors of production—capital, land, and labor. (We assume the entrepreneur must buy capital and land and cannot rent their services.) The right-hand side represents the firm's sources of finance, where internal finance is in the square bracket: net worth, which equals output plus undepreciated capital and land from the previous period, net of the repayment of old debt. The last term on the right-hand side is the external finance derived from new debt (calculated as the present value of next period's repayment on loans, discounted by the gross real interest rate r_t). Each entrepreneur chooses a sequence of consumption, investment, output, and debt in order to maximize discounted expected utility subject to the constraints of technology, credit, and available funds.

Now let us examine the difference between the RBC model and an economy in which producers are heterogeneous in productivity z_{jt} and are credit constrained.5 First, if there is limited contract enforcement, then insurance is incomplete. Because the insurance company is aware of the fact that the insured may not pay in the future, the company, as a precautionary measure, demands premium payments upon entering an insurance contract. Thus, producers and households with low net worth may not purchase insurance with full coverage. When the economy is then hit by various shocks, the net worth of firms and households with partial insurance coverage fluctuates, which in turn requires an adjustment of the Negishi weights.
As a result, we can no longer maintain the assumptions of the representative household approach. In addition, when borrowing constraints exist, firms must rely on internal finance—their net worth—as a source of financing inputs. When the borrowing constraint is binding for some firms, the marginal products of capital, land, and labor will no longer be equalized across firms. Thus, the assumptions for the existence of a representative firm no longer hold, and an aggregate production function such as that given by equation (4) no longer exists. Now, the aggregate productivity of the economy will fluctuate endogenously with credit levels and net worth, as emphasized in Kiyotaki and Moore (1997a) and Kiyotaki (1998). When productive firms borrow up to the credit limit and also use their own net worth to finance additional investments that the loans could not cover, the impact of a small shock on total productivity, investment, and net worth is large.

5 See Bernanke and Gertler (1989), Kiyotaki and Moore (1997a, 1997b), Kiyotaki (1998), and Bernanke, Gertler, and Gilchrist (1999) for examples of RBC models with credit constraints.

[Figure 1: Credit Cycles. The diagram traces a feedback loop between the present period and future periods t+1, t+2, ...: a negative shock lowers the net worth of constrained firms (A), which lowers their asset demand (B) and the user cost of the asset; net worth (C) and asset demand (D) are expected to remain low in the future, so the user cost is expected to stay low; the expected stagnation lowers the asset price today (E), which feeds back into (A).]

In order to explain the propagation of the effects of the shock, let us assume that the net worth of all firms has declined because of the obsolescence of some of their products, and thus the capital used to produce those goods has also become obsolete. Because highly productive firms have outstanding debt from the previous period, the leverage effect of the debt will result in a sharp reduction of net worth (refer to Figure 1, point A).
These productive firms will decrease their demand for capital and land because they cannot borrow more (Figure 1, point B), and aggregate productivity will fall as the share of investment of productive firms declines. Because it will take some time for the highly productive firms to recover their preshock level of net worth (Figure 1, point C), their demand for assets (capital and land) and labor will be constrained for a while, and therefore aggregate productivity and aggregate demand for assets and investment are also expected to be stagnant for a while (Figure 1, point D). Under these expectations, current-period asset prices drop (Figure 1, point E) and the balance sheets of the highly productive firms further deteriorate (Figure 1, point A). As a result, the small aggregate shock causes a persistent decrease in the share of investment by credit-constrained and highly productive producers, which leads to a persistent decline of aggregate productivity. Thus, with borrowing constraints, the fall in asset prices is responsible for the magnified drop in output.

Joan Robinson once said, "the essence of Keynesian economics is its recognition of the central role of time in human lives. People live in the present moment which is continuously moving from an unknown future to the irrevocable past."6 Because the demand for assets by productive firms that face credit constraints depends on each firm's own net worth (which equals accumulated past savings), it takes time for them to recover from a negative shock to net worth (i.e., the effects of shocks are persistent as firms are held back by their past savings). Meanwhile, asset prices are driven down by expectations of a prolonged stagnation in future asset demand (i.e., expectations about the future affect present asset prices). Notice how the asset market serves as a platform on which past savings and expectations about the future interact in present time.
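The role of expectations in current asset prices can be sketched as a present-value calculation. The interest rate and the user-cost paths below are assumptions for illustration only: a persistent expected slump in future asset demand depresses expected user costs and therefore the price today.

```python
# Sketch (hypothetical numbers): the current asset price as the discounted
# sum of expected future user costs. Expectations of a prolonged slump in
# asset demand lower the price in the present period.

r = 1.04  # gross real interest rate (assumed)

def price(user_costs):
    """Present value of a stream of expected user costs."""
    return sum(u / r ** (t + 1) for t, u in enumerate(user_costs))

normal = [1.0] * 30
# a persistent slump: user costs recover only gradually toward normal
slump = [1.0 - 0.3 * 0.8 ** t for t in range(30)]

# the expected slump lowers today's asset price
assert price(slump) < price(normal)
```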
In the real business cycle model with no constraints on borrowing, shocks from the obsolescence of goods and capital trigger higher investment, leading to a quick recovery of capital stock and output (dotted line in Figure 2). In contrast, in an economy where borrowing constraints exist, the obsolescence shocks significantly reduce the net worth and investment of productive firms, further decreasing capital stock and output. Since it takes time for highly productive, yet credit constrained, firms to recover their net worth and investment levels, total output and productivity will both fall persistently (solid line in Figure 2).

Reinhart and Rogoff (2009) claim that, although financial crises are much like forest fires in the sense that it is difficult to predict when and where they will occur, certain conditions set the stage for crises. They present the following as indicators of an emerging crisis for a country: 1) asset prices rise rapidly, and especially the price-rent ratio for real estate increases sharply; 2) the amount of debt expands faster than aggregate output and asset values, leading to higher leverage (the ratio of total assets to net worth); and 3) the country experiences massive capital inflows. In the presence of borrowing constraints, firm heterogeneity in productivity, and diverse investment opportunities, if an adverse shock arrives when the overall leverage level of the economy is high enough, the powerful propagation mechanism described in Figure 1 will take effect in asset prices, credit levels, and output, forcing the economy into financial crisis.7 When a financial crisis is accompanied by a banking crisis, I expect that the financial system will cause additional problems that will aggravate the crisis.

6 From the introduction to the 1973 Japanese edition of Joan Robinson (1971).
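The contrast between the two impulse responses described for Figure 2 can be mimicked with a toy calculation. The persistence and amplification parameters below are illustrative assumptions, not estimates from any model in the text: frictions both deepen the initial drop and slow the recovery.

```python
# Toy impulse responses (assumed, uncalibrated parameters): deviation of
# capital/output from steady state after a one-time negative shock.

def impulse_response(shock, rho, amplification=1.0, periods=12):
    """rho governs persistence; amplification scales the initial drop."""
    dev = -amplification * shock
    path = [dev]
    for _ in range(periods - 1):
        dev *= rho
        path.append(dev)
    return path

no_friction = impulse_response(shock=0.01, rho=0.3)          # quick recovery
friction = impulse_response(shock=0.01, rho=0.9, amplification=3.0)

# frictions amplify the initial drop and slow the recovery
assert friction[0] < no_friction[0]
assert all(f <= n for f, n in zip(friction, no_friction))
```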
7 Kiyotaki and Moore (1997a) explain how total output, asset prices, and debt fluctuate cyclically when exposed to an exogenous shock, while Matsuyama (2008) suggests that the fluctuation can occur even in the absence of a shock. Aoki, Benigno, and Kiyotaki (2009) distinguish domestic and foreign credit limits, and show that, if the domestic economy has an underdeveloped financial system, it becomes prone to both expansion and contraction after capital account liberalization, as Reinhart and Rogoff (2009) suggest.

[Figure 2, Impulse Response to Capital Quality Shock: plots of capital stock and output against time for the no-friction and financial-friction economies.]

To analyze such crises, we need to look beyond the credit constraints of nonfinancial borrowers and consider the role of financial intermediaries and their financing constraints. Theories of financial intermediation have developed since Diamond and Dybvig (1983), and others, such as Williamson (1987), have extended macroeconomic models to include banks. Although there is active recent research on the source of problems caused by financial intermediaries and their markets (especially "wholesale" or "interbank" financial markets), there is not yet a standard macroeconomic model for the analysis of financial intermediation.8 In addition, note that it is the leverage effect of debt obligations that induces the net worth and investment of highly productive firms to decline persistently in the presence of borrowing constraints. If firms issue preferred stock or other securities whose returns are contingent on firm performance, instead of taking out loans to finance their investments, the leverage effects will not materialize. Therefore, in order to justify the propagation mechanism, we first need to explain why firms would choose to borrow rather than issue contingent securities when procuring their funds. We also need to explain why firms choose not to issue common stock in order to recover net worth when it is deteriorating.

8 Kiyotaki and Moore (1997b, 2008) analyze the effects of productivity and liquidity shocks on aggregate production in an economy where firms are involved in both production and financial intermediation. Gertler and Kiyotaki (2011) study the moral hazard problem of financial intermediaries, the relationship between their balance sheets and business cycles, and the effects of broad monetary policies. These articles also provide more references to the literature.

5. CONCLUSION

In this article, I explain that business cycles in an economy of heterogeneous firms and households can be analyzed using the representative-agent approach if their interactions take place in an economy without frictions and with complete markets. However, in an economy where markets do not function smoothly because of frictions such as credit constraints, the representative household framework may no longer be appropriate, and aggregate productivity changes endogenously with the distribution of wealth and the productivity of firms. Thus, I argue that the interaction of heterogeneous firms and households in the presence of credit constraints is important for business cycle analysis.

Finally, I would like to propose some questions and directions for future research. In the presence of borrowing constraints, capital and land do not move between firms, so the marginal products of capital and land are not equalized across firms; the allocation of capital and land will nonetheless gradually adjust, much as water flows downhill. For example, firms with high marginal products of capital and land do not consume or pay out dividends in excess, and hence accumulate net worth. As a result, they will eventually be less constrained by external finance.
Even in an economy where capital and land do not move freely, if labor can move freely between firms, the marginal product of labor will be equalized across firms. Then, the marginal products of capital and land will also become more equal across firms. One suggestion for future research is to study how the distribution of productivity and net worth of firms evolves and how persistent the differences in marginal products of inputs across firms are.

In order to obtain a deeper understanding of the importance of firm heterogeneity, we need to analyze what determines and changes the productivity of individual firms. According to Bernard et al. (2003), firm labor productivity in an industry varies widely, between less than one-fourth and more than four times the average labor productivity of the industry. Differences in human and physical capital can account for only a small fraction of these productivity differences among firms. In order to explain the diverse productivity across firms, we need to consider the accumulation process of both tangible and intangible assets. As modern growth theory has attempted to extend its models to include endogenous technical progress in addition to the accumulation of the factors of production, perhaps it is about time for modern business cycle theories to look into the source and the propagation of shocks by exploring the endogenous evolution of an individual firm's productivity in general equilibrium.

REFERENCES

Andolfatto, David. 1996. "Business Cycles and Labor-Market Search." American Economic Review 86 (March): 112–32.

Aoki, Kosuke, Gianluca Benigno, and Nobuhiro Kiyotaki. 2009. "Capital Flows and Asset Prices." In NBER International Seminar on Macroeconomics 2007, edited by Richard Clarida and Francesco Giavazzi. Chicago: University of Chicago Press, 175–216.

Bernanke, Ben S., and Mark Gertler. 1989. "Agency Costs, Net Worth, and Business Fluctuations." American Economic Review 79 (March): 14–31.

Bernanke, Ben S., Mark Gertler, and Simon Gilchrist. 1999. "The Financial Accelerator in a Quantitative Business Cycle Framework." In Handbook of Macroeconomics, edited by John B. Taylor and Michael Woodford. Amsterdam: North-Holland, 1,341–93.

Bernard, Andrew B., Jonathan Eaton, J. Bradford Jensen, and Samuel Kortum. 2003. "Plants and Productivity in International Trade." American Economic Review 93 (September): 1,268–90.

Diamond, Douglas W., and Philip H. Dybvig. 1983. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy 91 (June): 401–19.

Gertler, Mark, and Nobuhiro Kiyotaki. 2011. "Financial Intermediation and Credit Policy in Business Cycle Analysis." In Handbook of Monetary Economics, 3(A), edited by Benjamin M. Friedman and Michael Woodford. Amsterdam: North-Holland, 547–99.

Hosios, Arthur J. 1990. "On the Efficiency of Matching and Related Models of Search and Unemployment." Review of Economic Studies 57 (April): 279–98.

Kiyotaki, Nobuhiro. 1998. "Credit and Business Cycles." Japanese Economic Review 49 (March): 18–35. Japanese translation in Survey of Modern Economics 1998, edited by M. Ohtsuki, K. Ogawa, K. Kamiya, and K. Nishimura. Tokyo: Toyo-Keizai Shimpo-sha, 29–51.

Kiyotaki, Nobuhiro, and John Moore. 1997a. "Credit Cycles." Journal of Political Economy 105 (April): 211–48.

Kiyotaki, Nobuhiro, and John Moore. 1997b. "Credit Chains." Mimeo, London School of Economics.

Kiyotaki, Nobuhiro, and John Moore. 2008. "Liquidity, Business Cycles and Monetary Policy." Mimeo, London School of Economics and Princeton University.

Krugman, Paul. 2009. "How Did Economists Get It So Wrong?" New York Times Magazine, 6 September, MM36.

Kydland, Finn E., and Edward C. Prescott. 1982. "Time to Build and Aggregate Fluctuations." Econometrica 50 (November): 1,345–70.

Matsuyama, Kiminori. 2008. "Aggregate Implications of Credit Market Imperfections." In NBER Macroeconomics Annual 2007, edited by Daron Acemoglu, Kenneth Rogoff, and Michael Woodford. Chicago: University of Chicago Press, 1–60.

Merz, Monika. 1995. "Search in the Labor Market and the Real Business Cycle." Journal of Monetary Economics 36 (November): 269–300.

Mortensen, Dale T., and Christopher A. Pissarides. 1994. "Job Creation and Job Destruction in the Theory of Unemployment." Review of Economic Studies 61 (July): 397–415.

Negishi, Takashi. 1960. "Welfare Economics and Existence of an Equilibrium for a Competitive Economy." Metroeconomica 12 (June): 92–7.

Prescott, Edward C. 1986. "Theory Ahead of Business Cycle Measurement." Carnegie-Rochester Conference Series on Public Policy 25 (January): 11–44. Reprinted in the Federal Reserve Bank of Minneapolis Quarterly Review 10 (Fall): 9–22.

Reinhart, Carmen M., and Kenneth S. Rogoff. 2009. This Time Is Different. Princeton, N.J.: Princeton University Press.

Robinson, Joan. 1971. Economic Heresies. New York: Basic Books. Translated into Japanese by Hirofumi Uzawa (1973). Tokyo: Nihon Keizai Shinbun-sha.

Shimer, Robert. 2005. "The Cyclical Behavior of Equilibrium Unemployment and Vacancies." American Economic Review 95 (March): 25–49.

Williamson, Stephen D. 1987. "Financial Intermediation, Business Failures, and Real Business Cycles." Journal of Political Economy 95 (December): 1,196–216.

Economic Quarterly—Volume 97, Number 3—Third Quarter 2011—Pages 209–254

Financial Frictions in Macroeconomic Fluctuations

Vincenzo Quadrini

The financial crisis that developed starting in the summer of 2007 has made it clear that macroeconomic models need to allocate a more prominent role to the financial sector for understanding the dynamics of the business cycle.
Contrary to what has often been reported in the popular press, there is a long and well-established tradition in macroeconomics of adding financial market frictions to standard macroeconomic models and showing the importance of the financial sector for business cycle fluctuations. Bernanke and Gertler (1989) is one of the earliest studies. Kiyotaki and Moore (1997) provide another possible approach to incorporating financial frictions in a general equilibrium model. These two contributions are now the classic references for most of the work done in this area during the last 25 years. Although these studies had an impact in the academic field, formal macroeconomic models used in policy circles have mostly developed while ignoring this branch of economic research. Until recently, the dominant structural model used for analyzing monetary policy was based on the New Keynesian paradigm. There are many versions of this model that incorporate several frictions such as sticky prices, sticky wages, adjustment costs in investment, capital utilization, and various types of shocks. However, the majority of these models are based on the assumption that markets are complete and, therefore, there are no financial market frictions. After the financial crisis hit, it became apparent that these models were missing something crucial about the behavior of the macroeconomy. Since then there have been many attempts to incorporate financial market frictions in otherwise standard macroeconomic models.

I am indebted to Andreas Hornstein and Felipe Schwartzman for very detailed and insightful suggestions that resulted in a much improved version of this article. Of course, I am the only one responsible for possible remaining errors and imprecisions. The opinions expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mail: quadrini@usc.edu.
What I would like to stress here is that the recent approaches are not new in macroeconomics. They are based on ideas already formalized in the macroeconomic field during the last two and a half decades, starting with the work of Bernanke and Gertler (1989). In this article I provide a systematic description of these ideas.

1. WHY MODEL FRICTIONS IN FINANCIAL MARKETS?

Before adding complexity to the model, we would like to understand why it is desirable to have meaningful financial markets in macroeconomic models, besides the obvious observation that they seem to have played an important role in the recent crisis. One motivating observation is that flows of credit are highly pro-cyclical. As shown in the top panel of Figure 1, the change in credit market liabilities moves closely with the cycle. In particular, debt growth drops significantly during recessions. The only exception is perhaps the household sector in the 2001 recession. However, the growth in debt for the business sector also declined in 2001. Especially sizable is the drop in the most recent recession. The pro-cyclicality of corporate debt is also shown in Covas and Den Haan (2011) using Compustat data. The cyclical properties of financial markets can be seen not only in the aggregate dynamics of credit flows (as shown in the top panel of Figure 1), but also in indicators of tightening credit standards. The bottom panel of Figure 1 plots the net fraction of senior bank managers reporting tightening credit standards for commercial and industrial loans in a survey conducted by the Federal Reserve Board. Clearly, the fraction of banks tightening their credit standards rises during recessions. Other indicators of credit tightening, such as credit spreads, that is, interest rate differentials between bonds with differing ratings, convey a similar message, as shown in Gilchrist, Yankov, and Zakrajsek (2009).
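The pro-cyclicality statistic behind this discussion is essentially a correlation between credit growth and output growth. The sketch below computes such a correlation on synthetic random series, not Flow of Funds data; the series are constructed to co-move by assumption, purely to illustrate the calculation.

```python
# Illustrative pro-cyclicality check on SYNTHETIC data (not Flow of Funds).
# In practice one would use the Federal Reserve Board's Flow of Funds series.

import random
random.seed(0)

# synthetic output growth, and credit growth built to co-move with it
output_growth = [random.gauss(0.02, 0.02) for _ in range(200)]
credit_growth = [0.8 * g + random.gauss(0.0, 0.01) for g in output_growth]

def corr(x, y):
    """Sample correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / (vx * vy) ** 0.5

# pro-cyclical by construction: strongly positive correlation
assert corr(credit_growth, output_growth) > 0.5
```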
If markets were complete, the financial structure of individual agents, be they households, firms, or financial intermediaries, would be indeterminate. We would then be in a Modigliani and Miller (1958) world, and there would be no reason for financial flows to follow a cyclical pattern. However, the fact that credit flows are highly pro-cyclical and the index of tightening standards is countercyclical suggests that the complete-market paradigm has some limitations. This is especially true for the index of credit tightening.1 Of course, Figure 1 does not tell us whether it is the macroeconomic recession that causes the contraction in credit growth or the credit contraction that causes or amplifies the macroeconomic recession. It would then be convenient to distinguish three possible channels linking financial flows to real economic activity.

1 Although the pro-cyclicality of financial flows does not contradict Modigliani and Miller since the financial structure is indeterminate, when markets are complete there is no reason for lenders to change their "credit standards" over the business cycle. Here I interpret the index of credit standards as reflecting the characteristics of an individual borrower that are required to receive a loan. So it is something additional to the market-clearing risk-free interest rate.

[Figure 1, Debt and Credit Market Conditions. Panel A: New Debt (Change in Credit Market Liabilities), in percent of GDP, for the nonfinancial business sector and the household sector, 1952–2012. Panel B: Index of Tightening Standards for Commercial and Industrial Loans, in net percent of respondents, 1952–2012.]

Notes: Panel A shows the change in the volume of credit market instruments in the household and business sectors divided by gross domestic product. The data are from the Flow of Funds of the Federal Reserve Board. Panel B shows the index of credit tightening in commercial and industrial loans. The data are from the Senior Loan Officer Opinion Survey on Bank Lending Practices from the Federal Reserve Board. The survey was not conducted from 1984–1990. I thank Egon Zakrajsek for making available the historical data from 1967–1983.

1. Real activity causes movements in financial flows. One hypothesis is that investment and employment respond to changes in real factors such as movements in productivity. In this case, borrowers cut their debt simply because they need fewer funds to conduct economic transactions. If this were the only linkage between real and financial flows, the explicit modeling of the financial sector would be of limited relevance for understanding movements in real economic activity.

2. Amplification. The second hypothesis is that the initial driving forces of movements in economic activity are nonfinancial factors such as drops in productivity or monetary policy shocks. However, as investment and employment fall, the credit capacity of borrowers deteriorates by more than the financing need falls after the drop in economic activity. This could happen, for instance, if the fall in investment generates a fall in the market value of assets used as collateral. The presence of financial frictions will then generate a larger decline in investment and employment than we would observe in the absence of financial frictions. Therefore, financial frictions amplify the macroeconomic impact of the exogenous changes.

3. Financial shocks. A third hypothesis is that the initial disruption arises in the financial sector of the economy. There are no initial changes in the nonfinancial sector.
Because of the disruption in financial markets, fewer funds can be channeled from lenders to borrowers. As a result of the credit tightening, borrowers cut spending and hiring, and this generates a recession. I will refer to these types of exogenous changes as "credit" or "financial" shocks.

Most of the literature in dynamic macrofinance has focused on the second channel, that is, on the "amplification" mechanism generated by financial market frictions. More specifically, the central hypothesis is that financial frictions "exacerbate" a recession but are not its "cause." Something wrong (a negative shock) first happens in the nonfinancial sector. This could be caused by "exogenous" changes in productivity, monetary aggregates, interest rates, preferences, and so on. These shocks would generate a macroeconomic recession even in the absence of financial market frictions. With financial frictions, however, the magnitude of the recession becomes much bigger. The third channel, that is, the analysis of financial shocks as a "source" of business cycle fluctuations, has received less attention in the literature. More recently, however, a few studies have explored this possibility. In this article I will present the main theoretical ideas about the second and third channels, that is, "amplification" and "financial shocks." I will not focus on the first hypothesis only because, as observed above, if this were the most relevant channel linking real and financial flows, the explicit modeling of the financial sector would be of limited relevance for understanding movements in real macroeconomic activity.

2. MODELING FINANCIAL FRICTIONS

Technically, financial frictions emerge when trade in certain assets cannot take place.
In an Arrow-Debreu world with state-contingent trades, markets for some contingencies are missing and, therefore, there is a limit to the feasible range of intertemporal and intratemporal trades. In practical terms this implies that agents are unable to anticipate or postpone spending (for consumption or investment) or to insure against uncertain events (to smooth consumption or investment). Of course, this becomes relevant only if agents are heterogeneous. Therefore, all models with financial frictions share the following features:

1. Missing markets: Some asset trades are not available or feasible.

2. Heterogeneity: Agents are heterogeneous in some important dimension.

I should clarify that these two features are necessary but not sufficient for incomplete markets to play an important role. That we need heterogeneity is obvious: If all agents are homogeneous, there is no reason to trade claims intertemporally or intratemporally, so the fact that some markets are missing becomes irrelevant. Also, if agents could trade any type of contingency, we would have an economy with complete markets. On the other hand, the fact that some markets are missing may be irrelevant if in equilibrium agents voluntarily choose not to trade in these markets. Therefore, market incompleteness and heterogeneity must take specific configurations. In the next two subsections, I will first describe the most common approaches to modeling missing markets, and then I will discuss the most common approaches used to generate heterogeneity.

Missing Markets

The approaches used to model missing markets can be divided into two categories: "exogenous" market incompleteness and "endogenous" market incompleteness.

1. Exogenous market incompleteness. The first category includes models that impose exogenously that certain assets cannot be traded.
For example, it is common to assume that agents can hold bonds (issue debt if negative) but cannot hold assets with payoffs contingent on information that becomes available in the future. This approach does not attempt to explain why certain assets cannot be traded; it takes a more pragmatic approach. Since a large volume of financing observed in the real economy is in the form of standard debt contracts, while the volume of state-contingent contracts is limited, it makes sense to assume that debt contracts are the only financial instruments available. A further restriction, also exogenously imposed, is that the total amount of debt cannot exceed a certain limit (an exogenous borrowing constraint). Of course, the goal of this literature is not to explain why markets are incomplete but to understand the consequences of market incompleteness.

2. Endogenous market incompleteness. The second category includes models in which the set of feasible contracts is derived from agency problems. The idea is that markets are missing because parties are not willing to engage in certain trades that are not enforceable or incentive-compatible. What this means is that the borrower is unable to borrow or insure against risk because, with high liabilities and full insurance, he or she would act against the interests of the lender. Typically, endogenous market incompleteness is derived from two agency problems:

(a) Limited enforcement. The idea of limited enforcement is that the lender is fully capable of observing whether or not the borrower is fulfilling his or her contractual obligations. However, there are no tools the lender can use to enforce the contractual obligations. For example, even if the lender knows that the borrower is not exerting effort or is diverting funds, it may be difficult to prove it in court. There could also be legal limits to what the lender can enforce.
For example, the law does not allow the lender to force the borrower to work in order to repay the debt (no slavery).

(b) Information asymmetry. Information asymmetries also limit the ability of lenders to force borrowers to fulfill their obligations. In this case, the limit derives from the inability to observe the borrower's action. For example, if the repayment depends on the performance of the business and the performance depends on unobservable effort, the borrower may have an incentive to choose low effort.

From a technical point of view, models with limited enforcement are typically easier to analyze than models with information asymmetries. Both models, however, share a common property: the higher the net worth of borrowers, the higher the (incentive-compatible) financing that can be raised externally, a recurrent theme in the theoretical analysis conducted in the remaining sections of this article.

Heterogeneity

There are many approaches used in the literature to generate heterogeneity. One popular approach is to assume that agents are ex-ante identical but subject to idiosyncratic shocks. The heterogeneity then derives from the fact that at any point in time each agent receives a different shock. For example, in the Bewley (1986) economy agents receive stochastic endowments. Because at any point in time some agents have low endowments while others have high endowments, it is optimal to sign state-contingent contracts that insure the endowment risks and allow for consumption smoothing. With these contracts, agents receive payments when their endowments are low and make payments when their endowments are high. If markets were complete, the analysis of this model would be simple. With incomplete markets, however, the model generates high-dimensional heterogeneity.
Even if agents are initially or ex-ante homogeneous, in the long run there will be a continuum of asset holdings. Because this high dimensionality of the state makes the characterization of the equilibrium challenging, the majority of applications of Bewley-type economies have abstracted from aggregate uncertainty and business cycle fluctuations. An exception is Krusell and Smith (1998). Other exceptions include Cooley, Marimon, and Quadrini (2004), where the heterogeneity is on the production side, and, more recently, Guerrieri and Lorenzoni (2010) and Khan and Thomas (2011). In general, however, the majority of studies investigating the importance of financial frictions for macroeconomic fluctuations have tried alternative approaches that keep the degree of heterogeneity small. A common approach is to assume that there are only two types of agents with permanent differences in preferences and/or technology. In equilibrium, one agent ends up being the borrower and the other the lender. Alternatively, there could be a continuum of heterogeneous agents whose aggregate behavior can be characterized by a single representative agent thanks to linear aggregation. This is the case, for example, in Carlstrom and Fuerst (1997); Bernanke, Gertler, and Gilchrist (1999); and Miao and Wang (2010). Although entrepreneurs face uninsurable idiosyncratic risks and there is a distribution of entrepreneurs over net worth, the linearity of technology and preferences allows for the derivation of aggregate policies that are independent of the distribution. So, effectively, the reduced form in these models is also characterized by only two representative agents: households/workers and entrepreneurs. Still, the fact that firms are owned by entrepreneurs and external financing is limited is not enough for financial frictions to play an important role.
Even if entrepreneurs (firms) are temporarily financially constrained, that is, they would like to borrow more than they are allowed, over time they could save enough resources to make the financial constraints nonbinding. Therefore, further assumptions need to be made in order for the borrowing constraints to remain relevant in the long run. This is achieved in different ways.

1. Finite life span. A common modeling approach is based on the assumption that borrowers have a finite life span. For example, in overlapping generations models it is commonly assumed that newborn agents have no initial assets and, therefore, are financially constrained in the first stage of their lives. Over time agents accumulate assets and become unconstrained. However, since there is a continuous entrance of newborns, at any point in time there are always some agents who face binding financial constraints. A similar idea is applied in industry dynamics models where exiting firms are replaced by new entrants.

2. Different discounting. Another common approach is to assume that borrowers are infinitely lived but discount the future more heavily than lenders. This implies that the cost of external financing is lower than the cost of internal funds, so debt is preferred to internal funds. This ensures that borrowers do not save enough to make the borrowing constraint irrelevant. Then, unanticipated shocks can lead to a larger spending response of borrowers because of the binding constraint.

3. Tax benefits. A similar approach to differential discounting is the assumption that there are tax benefits of debt. For example, the tax deductibility of interest payments from corporate earnings generates a preference for debt over equity, and corporations tend to leverage up.
However, if the firm is unexpectedly required to de-leverage and it is difficult to replace debt with equity in the short term, the result could be large drops in investment and employment.

4. Bargaining position. A further assumption proposed in the literature is that external financing (debt/outside equity) is preferred to inside financing (entrepreneurial equity) not because of differential discounting or tax benefits, but because it affects the bargaining position of firms in the negotiation of wages and/or executive compensation. The idea is that, if the compensation of workers and managers is determined through bargaining (in the case of workers, the bargaining could be with unions), highly leveraged firms are able to bargain lower compensation simply because the bargaining surplus is reduced by the debt.

Independent of the particular modeling approach, all models with financial market frictions are characterized by the presence of at least two groups of agents: one group that would like to raise external funds and one group that provides at least some of the funds.

3. A SIMPLE THEORETICAL FRAMEWORK

The discussion conducted so far has provided an informal description of the basic features of models used to study the importance of financial market frictions for the business cycle. Now I provide a more analytical description using a formal model that is rich enough to capture the various ideas proposed in the literature but remains analytically tractable. To achieve this goal, I assume that there are only two periods—period 1 and period 2—and two types of agents—a unit mass of workers and a unit mass of entrepreneurs. Variables that refer to period 2 will be indicated with a prime superscript. The lifetime utility of workers is

E[c − h²/2 + δc′],

where c and h are consumption and labor in period 1 and c′ is consumption in period 2. The lifetime utility of entrepreneurs is

E[c + βc′].
Thus, entrepreneurs’ utility is also linear in consumption but there is no disutility from working. The assumption of risk neutrality is not essential but it simplifies the analysis. When relevant, I will comment on the importance of risk neutrality. I now describe what happens in each of the two periods.

• Period 1. Entrepreneurs enter period 1 with capital K and debt B owed to workers. In principle, B could be negative. However, as we will see, this case is not of theoretical interest. There are two production stages during the first period. In the first stage, intermediate goods are produced with capital and labor. In the second stage, the intermediate goods are used as inputs in the production of consumption and new capital goods.

– Stage 1: Production of intermediate goods. Intermediate goods are produced by entrepreneurs with the production function y = AK^θ h^{1−θ}, where A is the aggregate level of productivity, K is the input of capital, and h is the input of labor supplied by workers.

– Stage 2: Production of final goods. In this stage, intermediate goods are used as inputs in the production of consumption and new capital goods. The transformation into consumption goods is simple: One unit of intermediate goods is transformed into one unit of consumption goods. New capital goods are produced by individual entrepreneurs using the technology k^n = ωi, where i is the quantity of intermediate goods used in the production of new capital goods and ω is the idiosyncratic productivity realized after the choice of i. The cumulative distribution function of ω is denoted by Φ(ω). Later we will consider two cases: Eω = 1 and Eω = 0. In the second case there is no production of investment goods and, therefore, the aggregate stock of capital in period 2 is the same as in period 1.

• Period 2. Second-period production takes place only with the input of capital.
Since this is the terminal period, only consumption goods are produced. There are two sectors of production.

– Sector 1: Entrepreneurial sector. This is composed of firms owned by individual entrepreneurs with technology y′ = A′k′, where k′ is the input of capital acquired by the entrepreneur in period 1.

– Sector 2: Residual sector. The second sector is formed by frictionless firms directly owned by workers with technology y′ = A′G(k′). The function G(.) is strictly increasing and concave and satisfies G′(0) = 1. The key difference between the entrepreneurial sector and the residual sector is that the former is more productive than the latter, that is, G′(k′) < 1 for k′ > 0. As we will see, in the absence of financial frictions, production will take place only in the entrepreneurial sector. With frictions, part of the production could also take place in the less productive but frictionless residual sector. For simplicity I assume that A′ is known in period 1 and, therefore, there is no aggregate uncertainty. Before proceeding I impose the following conditions:

Assumption 1. Entrepreneurs and workers have the same discounting, δ = β. Furthermore, βA′ > 1.

It is often assumed in the literature that δ > β, that is, entrepreneurs (borrowers) are more impatient than workers (lenders). This is an important assumption in an infinite horizon model. With only two periods, however, the discount differential does not play an important role, which motivates the assumption δ = β. The condition βA′ > 1, instead, guarantees that postponing consumption through investment is efficient, since the discounted value of the productivity of capital in period 2 is greater than 1.

Timing Summary

The structure of the model described so far, although stylized, is fairly complex. There are important timing assumptions that are made to keep the model analytically tractable.
To make sure that these assumptions are clear, it is helpful to summarize the timing sequence.

1. Entrepreneurs start period 1 with capital K and debt B. Workers start with wealth B.

2. Entrepreneurs hire workers to produce intermediate goods with the technology y = AK^θ h^{1−θ}. The labor market is competitive and clears at the wage rate w.

3. Entrepreneurs purchase intermediate goods i to produce new capital goods using the technology k^n = ωi. The choice of i is made before observing the idiosyncratic productivity ω.

4. At this point we are at the end of period 1. The idiosyncratic productivities are observed and all incomes are realized. Entrepreneurs and workers allocate their end-of-period wealth between current consumption and savings in the form of capital goods and/or financial instruments (bonds).

5. We are now in period 2. Production takes place with the capital inputs accumulated in the previous period.

6. Entrepreneurs repay the debt to workers and both agents consume their residual wealth.

Plan for the Theoretical Analysis

I have now completed the description of preferences, technology, and timing. What is left to describe are the financial frictions that impose additional constraints on the choices of debt. These will be specified in the analysis of the various cases reviewed in this article. The presentation will be organized in four main sections:

• Section 4 characterizes the equilibrium in the frictionless model. This provides the baseline framework to which I compare the various versions of the model with financial frictions.

• Section 5 presents the costly state verification model based on information asymmetries, where the financial frictions have a direct impact on investment.

• Section 6 presents the collateral/limited enforcement model. I first show the properties of this model when the frictions have a direct impact only on investment.
I then extend the analysis to the case in which the frictions also have a direct impact on the demand of labor.

• Section 7 analyzes the impact of credit shocks. I first present the model with exogenous credit shocks and then I propose one possible approach to make these shocks endogenous through a liquidity channel. In this section I also show the importance of credit shocks in an open economy framework.

4. BASELINE MODEL WITHOUT FINANCIAL FRICTIONS

I start with the characterization of the problem solved by workers:

max_{c,c′,h,k′,b′}  c − h²/2 + δc′

subject to:

B + wh = c + b′/R + qk′
A′G(k′) + b′ = c′
c ≥ 0,  c′ ≥ 0,   (1)

where B is the initial ownership of bonds, R is the gross interest rate, w is the wage rate, and q is the price of capital. Since A′ is known in period 1, workers do not face any uncertainty. The first two constraints are the budget constraints in periods 1 and 2, respectively. They equalize the available resources (left-hand side) to the expenditures (right-hand side). The problem is also subject to the non-negativity of consumption in both periods. However, thanks to Assumption 1, c′ will always be positive and we have to worry only about the non-negativity of consumption in period 1. Intuitively, since capital is very productive in period 2 and preferences are linear, agents may choose to maximize their savings in period 1. The first-order conditions are

h = w(1 + λ)   (2)
(1 + λ)q = δA′G′(k′)   (3)
1 + λ = δR,   (4)

where λ is the Lagrange multiplier associated with the non-negativity constraint on consumption in period 1. The problem solved by entrepreneurs can be written as

max_{h,i,c,k′,b′,c′}  E[c + βc′]

subject to:

AK^θ h^{1−θ} − wh + qK + (qEω − 1)i + b′/R = B + c + qk′
A′k′ = b′ + c′
c ≥ 0,  c′ ≥ 0,  i ≥ 0.

The first two constraints are the budget constraints in periods 1 and 2, respectively. They equalize the available resources (left-hand side) to the expenditures (right-hand side).
The terms AK^θ h^{1−θ} − wh and (qEω − 1)i are, respectively, the profit earned by the entrepreneur in the production of intermediate goods and the (expected) profit earned in the production of new capital goods. As for workers, I do not have to worry about the non-negativity constraint on c′. The first-order conditions are

w = (1 − θ)AK^θ h^{−θ}   (5)
qEω ≤ 1,  (= if i > 0)   (6)
(1 + γ)q = βA′   (7)
1 + γ = βR,   (8)

where γ is the Lagrange multiplier on the non-negativity constraint on consumption in period 1. Since δ = β (by Assumption 1), equations (4) and (8) imply λ = γ. What this means is that the non-negativity of consumption in period 1 is either binding for both agents or not binding for either of them. Substituting the labor supply (2) into the demand of labor (5), we get the wage equation

w = (1 − θ)^{1/(1+θ)} A^{1/(1+θ)} K^{θ/(1+θ)} (1 + λ)^{−θ/(1+θ)}.   (9)

Substituting back into the supply of labor, working hours can be expressed as

h = (1 − θ)^{1/(1+θ)} A^{1/(1+θ)} K^{θ/(1+θ)} (1 + λ)^{1/(1+θ)}.   (10)

Entrepreneurs’ income in period 1, after the production of intermediate goods, is

Y^e = AK^θ h^{1−θ} − wh,   (11)

where w and h are determined in (9) and (10). Therefore, the supply of labor and entrepreneurial income depend on the multiplier λ. The value of this variable depends on the assumption about Eω. When I introduce financial frictions I will consider two cases: Eω = 1 and Eω = 0. The first case defines an economy with capital accumulation while the second case defines an economy with fixed capital.

• Case 1: Eω = 1. Because βA′ > 1, the intermediate goods produced in period 1 are all used in the production of capital goods. Thus, current consumption is zero for both entrepreneurs and workers. This implies that the multiplier λ = γ is positive. Since investment i is positive, condition (6) is satisfied with equality, and therefore, q = 1. Then equations (7) and (8) imply that R = A′.
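As a consistency check on the closed forms, the wage in (9) must satisfy the labor supply condition (2) and the labor demand condition (5) simultaneously, and the income in (11) must equal the capital share θ of output. A minimal numerical sketch (the parameter values are illustrative assumptions, not a calibration):

```python
# Numerical check of the closed-form wage (9), hours (10), and income (11).
# Parameter values are illustrative assumptions, not a calibration.
theta, A, K, lam = 0.36, 1.0, 2.0, 0.25      # lam plays the role of lambda

# Closed forms (9) and (10)
w = (1 - theta)**(1/(1 + theta)) * A**(1/(1 + theta)) \
    * K**(theta/(1 + theta)) * (1 + lam)**(-theta/(1 + theta))
h = (1 - theta)**(1/(1 + theta)) * A**(1/(1 + theta)) \
    * K**(theta/(1 + theta)) * (1 + lam)**(1/(1 + theta))

output = A * K**theta * h**(1 - theta)       # intermediate goods produced
Ye = output - w * h                          # entrepreneurs' income (11)

# Labor supply (2): h = w(1 + lambda)
assert abs(h - w * (1 + lam)) < 1e-12
# Labor demand (5): w = (1 - theta) A K^theta h^(-theta)
assert abs(w - (1 - theta) * A * K**theta * h**(-theta)) < 1e-12
# With a competitive wage, wh is the labor share, so Ye = theta * output
assert abs(Ye - theta * output) < 1e-12
```

Since w and h depend on λ, so does Y^e, which is why the cases Eω = 1 and Eω = 0 below behave differently.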
Agents anticipate that the productivity of capital is high next period, and it becomes convenient to save all income to take advantage of the higher return. The labor supply is higher than the wage since λ = γ > 0 (see equation [2]), and the demand of labor is determined by its marginal product (see equation [5]). All of the capital produced in period 1 is accumulated by entrepreneurs, since the entrepreneurial sector is more productive than the residual sector and there are no agency problems in the repayment of the intertemporal debt b′. It is now easy to see the impact of productivity changes. An increase in current productivity A generates an increase in the supply of labor and output, as we can see from equations (9)–(11) after replacing 1 + γ = βA′ from equation (7) and taking into account that λ = γ. Since the increase in income is saved, the productivity boom also generates an investment boom. There is no impact on current consumption, but this is a consequence of assuming risk neutrality. With risk aversion, consumption in period 1 is also likely to increase in response to a persistent productivity improvement. An increase in A′ also generates an increase in the current supply of labor (see equation [10] after substituting 1 + γ = βA′), which in turn generates an increase in output and savings. Therefore, the model has the typical properties of the neoclassical business cycle model.

• Case 2: Eω = 0. Since Eω = 0, we can see from equation (6) that i = 0, that is, there is no capital accumulation. The whole capital stock K is acquired by entrepreneurs because the entrepreneurial sector is more productive than the residual sector and there are no agency problems in the repayment of the intertemporal debt b′. Since consumption cannot be zero for both workers and entrepreneurs, λ = γ = 0 (in the absence of investment, aggregate consumption in period 1 must be equal to aggregate production in period 1).
This implies that the price of capital is q = βA′ (see equations [3] and [7]). In this way both agents are indifferent between current and future consumption, and the new debt b′ is undetermined. An increase in current productivity A generates an increase in the supply of labor and output, as we can see from (9)–(11) after substituting λ = 0. However, the productivity change in period 1 does not affect next-period production since there is no capital accumulation. Similarly, an increase in A′ generates an increase in next-period production but it does not have any impact on production in period 1. Again, this is because of the absence of capital accumulation. As we will see, this feature of the model will change with financial frictions.

5. COSTLY STATE VERIFICATION MODEL

In the costly state verification model, frictions derive from information asymmetry. This is the centerpiece of the financial accelerator model proposed by Bernanke and Gertler (1989). The model has been further embedded in more complex macroeconomic models with infinitely lived agents by Carlstrom and Fuerst (1997) and Bernanke, Gertler, and Gilchrist (1999). To illustrate the key elements of the financial accelerator, I specialize the analysis to the case in which frictions are only in the production of capital goods and, in the analysis of this section, I assume that Eω = 1. This guarantees that capital goods are produced and there is capital accumulation in the model. The frictions derive from the assumption that ω is freely observable only by entrepreneurs. Other agents could observe ω but only at the cost μi. This limits the feasibility of financial contracts that are contingent on ω. As is well known from the work of Townsend (1979), the optimal contract takes the form of a standard debt contract in which the entrepreneur promises to repay an amount that is independent of the realization of ω.
If the entrepreneur does not repay, the lender incurs the verification cost and confiscates the residual assets.

Optimal Contract with Costly State Verification

The central element of this model is the net worth of entrepreneurs. Before starting the production of new capital goods, entrepreneurs’ net worth is

n = qK + Y^e − B,

where Y^e is defined in (11). Therefore, if the entrepreneur purchases i units of intermediate goods, he or she has to borrow i − n units of intermediate goods on the promise to pay back (1 + r^k)(i − n) units of capital goods. Notice that the interest rate r^k is denominated in capital goods, which explains the different denomination of the loan (denominated in intermediate goods) and the repayment (in capital goods). The particular choice of denomination is a simple convention that is inconsequential for the properties of the model. After the realization of the idiosyncratic shock ω, the entrepreneur defaults only if the production of new capital goods is smaller than the debt repayment, that is, ωi ≤ (1 + r^k)(i − n). We can then define ω̄ as the shock below which the entrepreneur defaults, which is equal to

ω̄ = (1 + r^k)(i − n)/i.

This equation makes clear that the default threshold is increasing in the leverage ratio (i − n)/i and in the interest rate. Assuming competition in financial markets, the interest rate charged by the lender must satisfy the zero-profit condition

q ∫₀^{ω̄(n,i,r^k)} (ω − μ)i dΦ(ω) + q(1 + r^k)(i − n) ∫_{ω̄(n,i,r^k)}^∞ dΦ(ω) = i − n.

Notice that the cost of funds on the right-hand side of the equation carries no interest, since the loan is intraperiod, that is, issued and repaid in the same period. This is different from the intertemporal debt b′. The equation implicitly defines the interest rate charged by the bank as a function of n, i, q, which I denote as r^k(n, i, q).
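Since the zero-profit condition defines r^k(n, i, q) only implicitly, in applications it is solved numerically. The sketch below assumes, purely for illustration, that ω is uniform on [0, 2] (so that Eω = 1) and solves for r^k by bisection; the distribution and all parameter values are my assumptions, not part of the model as stated.

```python
# Solve the lender's zero-profit condition for the loan rate r^k by bisection.
# Illustrative assumption: omega ~ Uniform[0, 2], so E[omega] = 1 and the
# CDF is omega/2 on [0, 2]. All parameter values are also illustrative.
q, mu, n, i = 1.0, 0.1, 0.5, 1.0   # capital price, audit cost, net worth, scale

def lender_revenue(r):
    """Expected value (in units of consumption) received by the lender."""
    wbar = min((1 + r) * (i - n) / i, 2.0)   # default threshold
    # Default region [0, wbar]: lender confiscates (omega - mu)*i capital
    # goods; integrating against the uniform density 1/2 gives:
    default_part = i * (wbar**2 / 4 - mu * wbar / 2)
    # Repayment region [wbar, 2]: lender receives (1+r)(i-n) capital goods
    repay_part = (1 + r) * (i - n) * (1 - wbar / 2)
    return q * (default_part + repay_part)

# Bisection on lender_revenue(r) = i - n over a bracket with a sign change
lo, hi = 0.0, 0.3
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if lender_revenue(mid) < i - n:
        lo = mid
    else:
        hi = mid
rk = 0.5 * (lo + hi)
wbar = (1 + rk) * (i - n) / i
```

At these values the implied threshold ω̄ lies strictly inside the support, and a higher leverage (i − n)/i or a higher audit cost μ shifts the break-even r^k up.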
The default threshold can also be expressed as a function of the same variables, that is, ω̄(n, i, q). Since entrepreneurs are risk neutral, the production choice is independent of the consumption/saving decision. More specifically, the optimal choice of i maximizes the entrepreneur’s expected net worth, that is,

max_i  q ∫_{ω̄(n,i,q)}^∞ [ωi − (1 + r^k(n, i, q))(i − n)] dΦ(ω).

Notice that the integral starts at ω̄ because the entrepreneur defaults for values of ω < ω̄ and the ex-post net worth is zero in the event of default. Let i(n, q) be the optimal scale chosen by the entrepreneur in the production of capital goods. We can define the net worth after production as

π(n, q, ω) = max{ 0 , q[ωi(n, q) − (1 + r^k(n, i(n, q), q))(i(n, q) − n)] }.

Using this function, the consumption/saving problem solved by the entrepreneur can be written as

max_{c,c′,k′,b′}  c + βc′

subject to:

π(n, q, ω) = c + qk′ − b′/R
c′ = A′k′ − b′
c ≥ 0,  c′ ≥ 0.

Equilibrium and Response to Productivity Shocks

There are two possible equilibria depending on the net worth of entrepreneurs. In the first equilibrium, the net worth of entrepreneurs is sufficiently large that the whole production of intermediate goods is used in the production of new capital goods. This case is similar to the baseline model without financial frictions characterized in Section 4. The second type of equilibrium arises when the net worth of entrepreneurs is not large enough for the whole production of intermediate goods to be used to produce new capital goods. We defined above i(n, q) as the production scale of entrepreneurs, that is, the demand of intermediate goods used in the production of new capital goods. Since there is a unit mass of entrepreneurs that are initially homogeneous, i(n, q) is also aggregate investment. If i(n, q) < AK^θ h^{1−θ}, then only part of the production of intermediate goods is used in the production of capital goods.
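The optimal scale i(n, q) can likewise be computed numerically. Keeping the illustrative assumption ω ~ Uniform[0, 2], the zero-profit condition reduces to a quadratic in ω̄, and the entrepreneur's objective simplifies to qi(1 − ω̄/2)²; both closed forms are my derivations under that distributional assumption, and the parameter values are chosen only so that an interior optimum exists:

```python
# Grid search for the optimal production scale i(n, q) under costly state
# verification, with omega ~ Uniform[0, 2] (illustrative assumption).
# With leverage L = (i - n)/i and c = 1 - mu/2, the zero-profit condition
# becomes q*i*(c*wbar - wbar^2/4) = i - n; its smaller root (the threshold
# the entrepreneur prefers) is wbar = 2c - 2*sqrt(c^2 - L/q), and the
# expected net worth q * Int_wbar^2 (omega*i - wbar*i) dPhi(omega)
# reduces to q*i*(1 - wbar/2)^2.
import math

q, mu = 1.08, 0.10                 # illustrative capital price and audit cost

def expected_net_worth(i, n):
    c = 1 - mu / 2
    disc = c * c - ((i - n) / i) / q
    if disc < 0:                   # lenders cannot break even at this scale
        return None
    wbar = 2 * c - 2 * math.sqrt(disc)
    return q * i * (1 - wbar / 2) ** 2

def optimal_scale(n):
    best_value, best_i = q * n, n  # i = n: fully self-financed, no default
    for j in range(1, 1000):
        i = n * (1 + 0.01 * j)
        value = expected_net_worth(i, n)
        if value is not None and value > best_value:
            best_value, best_i = value, i
    return best_i

i_star = optimal_scale(0.5)
```

In this sketch i_star exceeds the entrepreneur's net worth, and re-running optimal_scale with a larger n yields a larger scale, consistent with the claim used below that investment i increases with n.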
This implies that the consumption in period 1 of workers and/or entrepreneurs is positive. Thus, the multiplier associated with the non-negativity of consumption is γ = 0 and the equilibrium satisfies the first-order conditions

q = βA′
1 = βR.

Thus, the price of capital is equal to βA′, which is bigger than one by Assumption 1. I will focus on this particular equilibrium since this is when financial frictions matter. I can now study the response of the economy to productivity shocks, that is, changes in A and A′.

• Increase in A. The increase in A raises the net worth of entrepreneurs n = qK + Y^e − B, where Y^e is defined in (11). Since q = βA′, the price of capital q does not change if A′ does not change. Therefore, the increase in net worth is only determined by the increase in capital income Y^e earned in the first stage of production. The next step is to see what happens to investment in response to the higher net worth. We have already seen that investment i increases with n. Therefore, the productivity improvement generates an investment boom and increases next-period production. In this way the model generates a persistent impact of productivity shocks. This effect, however, is not necessarily bigger than the effect of a productivity shock in the baseline model without frictions characterized in Section 4. For this to be the case, the net worth n has to increase proportionally more than the increase in output. This requires qK − B < 0, which is unlikely to be an empirically relevant condition. Therefore, the model with financial frictions could generate a lower response to nonpersistent productivity shocks. If the shock is persistent, that is, a higher A implies a higher value of A′, then the model would generate an increase in net worth also through the market value of owned capital (as we will see next). The impact on investment could then be bigger.

• Increase in A′.
An anticipated increase in A′ generates an increase in the price of capital today, since q = βA′. The price increase has two effects. First, since entrepreneurs own the capital K, the higher q generates an increase in the entrepreneur’s net worth n = qK + Y^e − B. Notice that the higher the initial leverage, that is, the debt B relative to the owned capital K, the bigger the (proportional) effect on net worth. The increase in net worth affects investment similarly to the increase in current productivity. This first channel induces an increase in the production scale i without changing the probability of default if we assume that the leverage does not change. The second effect derives from the impact on the intraperiod leverage. Since a higher q implies higher profits from producing capital goods, entrepreneurs have an incentive to expand production proportionally more than the increase in net worth, even if this increases the cost of external financing. As a result, the probability of default, or bankruptcy rate, increases in response to an anticipated productivity shock. Thus, the model generates pro-cyclical bankruptcy rates and pro-cyclical interest rate premiums—a point emphasized, among others, by Gomes, Yaron, and Zhang (2003). One reason the model generates a pro-cyclical interest rate premium is that investment is very sensitive to the asset price q. The addition of adjustment costs as in Bernanke, Gertler, and Gilchrist (1999) could revert this property. In this case, the higher price of capital improves the net worth position of the entrepreneur, but the adjustment cost contains the expansion of the production scale. As a result, entrepreneurs could end up with lower leverage and a lower probability of default. See also Covas and Den Haan (2010).

Quantitative Performance

In general, it is not easy for the model to generate large amplification effects in response to productivity changes.
In fact, as observed above, financial frictions could dampen the impact of productivity shocks. Because of the higher profitability in the production of capital goods, entrepreneurs would like to expand the production scale. However, as they produce more, the cost of external financing increases. In a frictionless economy, instead, the cost of external finance does not increase with individual production, so the initial impact on investment is larger. In essence, financial frictions act like adjustment costs in investment, which could dampen aggregate volatility. Wang and Wen (forthcoming) provide a formal analysis of the similarity between financial frictions and adjustment costs at the aggregate level. Even though the model has difficulties generating large amplification, it has the potential to generate greater persistence. In fact, higher profits earned by entrepreneurs allow them to enter the next period with higher net worth. This cannot be shown explicitly with the current model since there are only two periods. However, suppose that entrepreneurs enter period 1 with a higher K made possible by the higher profits earned in previous periods. This will reduce the external cost of financing, allowing entrepreneurs to produce more capital goods, which in turn increases production in future periods. The model could then generate a hump-shaped response of output, as shown in Carlstrom and Fuerst (1997). Although quantitative applications of the financial accelerator do not find large amplification effects of productivity shocks, it could still amplify the macroeconomic response to other types of shocks. For example, Bernanke, Gertler, and Gilchrist (1999) add adjustment costs in the production of capital goods in order to generate larger fluctuations in q and find that the financial accelerator could generate sizable amplification of monetary policy shocks.

6.
COLLATERAL CONSTRAINT MODEL

Here I illustrate the main idea of models with collateral constraints, like the one studied in Kiyotaki and Moore (1997). An alternative to models with collateral constraints is the consideration of optimal contracts subject to enforcement constraints, as in Kehoe and Levine (1993) and Cooley, Marimon, and Quadrini (2004). However, the business cycle implications of these two modeling approaches are similar. To illustrate the idea of the collateral model, I assume that the frictions are not in the production of capital goods, as in the costly state verification model. Instead they derive from the ability of borrowers to repudiate their intertemporal debt. In some models, as in Kiyotaki and Moore (1997), it is even assumed that physical capital is not reproducible. Therefore, in this section I assume that Eω = 0 and all intermediate goods are transformed one to one into consumption goods. I denote by K the aggregate fixed stock of capital. Since capital is not reproducible, its price fluctuates endogenously in response to changing market conditions. This price fluctuation plays a central role in the model. An alternative way to generate price fluctuations is to relax the assumption that capital is not reproducible but add adjustment costs in investment and/or risk aversion.

Frictions on the Intertemporal Margin

From an efficiency point of view, the stock of capital should be allocated between entrepreneurs and workers so as to equalize their marginal products in period 2. More specifically, given K^e, the capital allocated to the entrepreneurial sector (that is, capital purchased by entrepreneurs), efficiency requires

A′ = A′G′(K − K^e).

The first term is the expected marginal productivity in the entrepreneurial sector and the second is the marginal productivity in the residual sector. Since G′(.)
is strictly decreasing and G′(0) = 1, the equalization of marginal productivities requires K^e = K, that is, all the capital should be allocated to the entrepreneurial sector in period 2. The problem is that entrepreneurs may be unable to purchase K^e = K in period 1. Because of the limited enforceability of debt contracts, entrepreneurs are subject to the collateral constraint

b′ ≤ ξq′k′.

Here b′ is the new debt, k′ is the capital purchased by an individual entrepreneur, q′ is the expected price of capital in period 2, and ξ < 1 is a parameter that captures possible losses associated with the reallocation of capital in case of default. The theory underlying this constraint is developed in Hart and Moore (1994) and is based on the idea that entrepreneurs cannot be forced to produce once they renege on the debt. Thus, in case of default the lender can only recover a fraction ξ of the capital, which can be resold at price q′. Since this is the last period in the model, the price of capital would be zero in the second period. In an infinite horizon model, however, the price would not be zero because the capital can still be used in production in future periods. In our two-period model we can achieve the same outcome by assuming that a fraction ξ of the liquidated capital can be reallocated to the residual sector. Therefore, the liquidation price of capital in period 2 is equal to

q′ = A′G′(K − K^e).

Since G′(.) ≤ 1 and only a fraction ξ can be resold, the value of capital for lenders is smaller than for entrepreneurs. This is what limits the entrepreneurs’ ability to borrow. Before continuing I should observe that, in the absence of capital accumulation, period 1 consumption cannot be zero for both workers and entrepreneurs. This is because period 1 production can only be used for consumption. Thus, the first-order conditions for workers are given by (2)–(4) but with λ = 0, and the supply of labor is h = w.
The problem solved by entrepreneurs is

max_{h,k′,b′}  c + βc′

subject to:

c = qK + AK^θ h^{1−θ} − wh − B + b′/R − qk′
b′ ≤ ξq′k′
c′ = A′k′ − b′
c ≥ 0,  c′ ≥ 0,   (12)

which is deterministic since there is no capital production (ω = 0) and A′ is perfectly anticipated. The first-order condition for the input of labor is still given by (5), that is, the entrepreneur equalizes the marginal product of labor to the wage rate. At center stage of the model are the choices of next-period capital and debt. The first-order conditions for k′ and b′ in problem (12) are

(1 + γ)q = βA′ + μξq′   (13)
1 + γ = (β + μ)R,   (14)

where μ and γ are, respectively, the Lagrange multipliers associated with the collateral constraint and the non-negativity of consumption in period 1. I can now use equations (13)–(14) together with (3)–(4) to derive an expression for μ. Using the fact that the liquidation price of capital in period 2 is q′ = A′G′(K − K^e), and noting that (3)–(4) with λ = 0 imply q′ = qR, equations (13)–(14) give (β + μ)G′(K − K^e) = β + μξG′(K − K^e), and therefore

μ = β[1 − G′(K − K^e)] / [(1 − ξ)G′(K − K^e)].   (15)

This equation relates the multiplier μ to the capital accumulated by entrepreneurs, K^e. Since the function G(.) is concave, G′(K − K^e) is increasing in K^e. Therefore, if the capital accumulated by entrepreneurs is higher, μ is lower. The equilibrium can take two configurations.

• All the capital is accumulated by entrepreneurs. In the first equilibrium entrepreneurs have sufficient net worth to purchase all the capital, that is, K^e = K. Equation (15) then implies that μ = 0 since G′(0) = 1. In this case, entrepreneurs’ consumption is positive (γ = 0) and the price of capital is q = βA′. This is possible only if entrepreneurs start with sufficiently high net worth, that is, small B. To see this, consider an entrepreneur’s budget constraint when the entrepreneur borrows up to the limit and chooses zero consumption.
Substituting c = 0 and b′ = ξq′k′, the budget constraint becomes qK + Y^e + ξq′k′/R = B + qk′, which can be rearranged to

(q − ξq′/R) k′ = qK + Y^e − B.   (16)

The term Y^e = AK^θ h^{1−θ} − wh is the entrepreneur’s income earned in period 1. Equations (3)–(4) imply A′G′(K − K^e) = qR. Furthermore, using q′ = A′G′(K − K^e), the above condition can be written as

k′_max = (1/(1 − ξ)) [ K − (B − Y^e)/q ].   (17)

This is the maximum capital that entrepreneurs can buy given the capital price q = βA′ (hence the subscript). It depends negatively on B. Therefore, if the initial net worth is not sufficiently high, entrepreneurs will be unable to purchase K and some of the capital will be inefficiently allocated to the residual sector. In this case, K^e = k′_max < K. We are then in the second type of equilibrium configuration.

• Only part of the capital is accumulated by entrepreneurs. In the second equilibrium, entrepreneurs choose zero consumption and the collateral constraint is binding. Since entrepreneurs cannot purchase enough capital, G′(K − K^e) < 1. Then equation (15) tells us that μ > 0, and equation (14) implies that γ > 0 since βR = 1 (from [4], if workers’ consumption is positive, implying λ = 0). Therefore, the entrepreneur borrows up to the limit and the non-negativity constraint on consumption is binding. Using the binding collateral constraint and zero consumption, the budget constraint can be rewritten again as in (16). This expression provides a simple intuition for the key mechanism of the model. The cost of one unit of capital, q, can be financed with ξq′/R units of debt and the rest must be financed with owned wealth. Therefore, q − ξq′/R is the minimum down payment required on each unit of capital. Multiplied by k′ we get the total down payment necessary to purchase k′ units of capital.
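The equivalence between the down-payment form (16) and the closed form (17) can be checked with illustrative numbers. With λ = 0, conditions (3)–(4) give q = βA′G′(K − K^e) and R = 1/β, so q′ = A′G′(K − K^e) = qR and the down payment per unit reduces to q(1 − ξ). All values below are assumptions for the sketch:

```python
# Illustrative check that dividing net worth by the down payment in (16)
# reproduces the closed form (17). Parameter values are assumptions.
beta, xi = 0.96, 0.70
AG = 1.02                      # stands in for A'G'(K - K^e)
R = 1 / beta                   # from (4) with lambda = 0 and delta = beta
q = beta * AG                  # from (3)
qprime = AG                    # liquidation price q' = A'G'(K - K^e) = qR

K, Ye, B = 2.0, 0.4, 2.0       # high initial debt B, so the constraint binds
net_worth = q * K + Ye - B     # right-hand side of (16)

down_payment = q - xi * qprime / R       # per unit of capital; equals q(1-xi)
k_from_16 = net_worth / down_payment     # capital purchase implied by (16)
k_max = (K - (B - Ye) / q) / (1 - xi)    # closed form (17)

assert abs(k_from_16 - k_max) < 1e-9
assert k_max < K               # entrepreneurs cannot buy the whole stock
```

Raising B lowers k_max through the net-worth term, which is the sense in which (17) depends negatively on B.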
In order to make the down payment, the entrepreneur needs to have enough net worth, which is the term on the right-hand side of (16). Therefore, the lower the entrepreneurs' net worth, the lower the amount of capital allocated to entrepreneurs. Since entrepreneurs are more productive than producers in the residual sector of the economy, lower net worth in period 1 implies lower production in period 2. As equation (16) makes clear, the capital allocated to the entrepreneurial sector depends crucially on the equilibrium prices R, q, and q'. Although all three prices contribute to the equilibrium outcome, it will be helpful to focus on q and q' to see the importance of asset prices. Changes in these prices induce several effects.

– Current price: An increase in the current price, q, has two effects. On the one hand, it increases the entrepreneur's net worth qK + Y^e − B. On the other hand, it increases the cost of purchasing new capital. The first effect has a positive impact on k', while the impact of the second effect is negative.

– Next-period price: An increase in the (expected) next-period price, q', allows entrepreneurs to issue more debt. Therefore, for a given net worth, more capital can be purchased.

Following Kiyotaki and Moore (1997), suppose that q and q' both increase by the same proportion, say by 1 percent.² Provided that B > Y^e, this generates an increase in the capital purchased by entrepreneurs, which, in the next period, increases output. The condition B > Y^e is a leverage condition. Therefore, if entrepreneurs enter the period with high leverage, a persistent increase in prices generates an output boom. How would the response change if contracts were enforceable? This is equivalent to the equilibrium in which the collateral constraint is not binding.
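The partial-equilibrium price experiment above, scaling q and q' by the same factor in equation (16), can be checked numerically: capital purchases rise with prices only under high leverage (B > Y^e). All numbers below are illustrative assumptions.

```python
# Equation (16): (q - xi*q'/R) * k' = q*K + Ye - B, solved for k'.
# A proportional rise in q and q' raises k' only when B > Ye.
# All parameter values are illustrative assumptions.
xi, R, K = 0.80, 1.04, 1.0
q, q_next = 1.0, 1.0   # current and next-period capital prices
Ye = 0.30

def k_demand(B, scale=1.0):
    """Capital bought when borrowing at the limit with zero consumption;
    `scale` raises q and q' by the same proportion (partial equilibrium)."""
    q_s, qn_s = scale * q, scale * q_next
    return (q_s * K + Ye - B) / (q_s - xi * qn_s / R)

# High leverage (B > Ye): a 1 percent price rise increases capital purchases.
print(k_demand(0.8, 1.01) > k_demand(0.8, 1.0))
# Low leverage (B < Ye): the same price rise reduces them.
print(k_demand(0.1, 1.01) < k_demand(0.1, 1.0))
```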
If contracts were enforceable, all the capital would be purchased by entrepreneurs, since they could borrow without limit. A change in prices would then not affect the allocation of K and would have no additional impact on aggregate production beyond the direct impact of the factors that cause the price change.

Response to Productivity Shocks

I will now focus on the equilibrium in which the collateral constraint is binding, that is, the equilibrium that prevails if entrepreneurs are highly leveraged. In a general model with infinitely lived agents this would arise in the long run if entrepreneurs have some incentive to take on more debt. As discussed in Section 2, the literature uses different assumptions to obtain this property. For example, a common assumption is that entrepreneurs (borrowers) are more impatient than workers (lenders). In the simple two-period model considered here, however, we can simply take the initial leverage to be sufficiently high.

² To facilitate the intuition, I take a partial equilibrium approach here and assume that the prices change exogenously.

If the collateral constraint is binding, the capital acquired by entrepreneurs is given by equation (17), which for convenience I rewrite here:

  K^e = \frac{1}{1 - \xi} \left( K - \frac{B - Y^e}{q} \right).   (18)

We now consider the impact of an increase in current and (anticipated) future productivity.

• Increase in A. The higher value of A increases entrepreneurs' income Y^e in period 1 (see equation [11]). We see from equation (18) that this induces an increase in K^e. Essentially, entrepreneurs earn higher capital income in period 1, and this allows them to purchase more capital for period 2. In addition to this direct effect, there is an indirect effect induced by the price of capital. Since K^e increases, equation (3) implies that the current price of capital q also increases.
As long as B > Y^e, that is, as long as entrepreneurs are sufficiently leveraged, the increase in q induces a further increase in K^e. Since entrepreneurs are more productive, that is, G'(·) < 1 for K^e < K, the reallocation of productive capital to the entrepreneurial sector generates an output boom in period 2. This second effect comes from the endogeneity of the collateral constraint, which depends on the market price q. Since the value of capital depends on q while the value of debt is fixed, the change in price has a large impact on net worth if entrepreneurs are highly leveraged. This is the celebrated "amplification" effect of productivity shocks induced by endogenous asset prices.

• Increase in A'. Suppose that A' increases; that is, in period 1 we expect higher productivity in period 2. We can think of this as a "news" shock. In this way it relates to the recent literature that investigates the impact of anticipated future productivity changes on the macroeconomy. See, for example, Beaudry and Portier (2006) and Jaimovich and Rebelo (2009). Here I show that financial markets could be an important transmission channel for these news shocks. From equation (2) we see that an increase in A' generates an increase in the price of capital q. Then, equation (18) shows that the increase in q induces a reallocation of capital to the entrepreneurial sector, further increasing q. This implies that production in period 2 increases more than productivity does. We thus have an "amplification" effect. As far as current production is concerned, however, output does not change. We will see in the next section that, with the addition of working capital, anticipated news can also affect employment in the current period. In that case, in addition to generating an immediate asset price boom, the news shock also generates an immediate macroeconomic boom.
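The two-way feedback between K^e and q behind the amplification effect can be illustrated with a tiny fixed-point computation. The linear price schedule below merely stands in for equation (3), and all functional forms and numbers are illustrative assumptions.

```python
# Amplification sketch: a rise in entrepreneurial income Ye raises Ke directly
# through equation (18) and indirectly through the capital price q.
# The linear price() schedule stands in for equation (3); all numbers are
# illustrative assumptions.
xi, K, B = 0.50, 1.0, 1.05

def price(Ke):
    return 0.9 + 0.3 * Ke  # assumed increasing relation between Ke and q

def Ke_eq(Ye, feedback=True, iters=200):
    """Iterate Ke = (K - (B - Ye)/q)/(1 - xi) to its fixed point."""
    Ke = 0.5
    for _ in range(iters):
        q = price(Ke) if feedback else price(0.5)  # q frozen without feedback
        Ke = min(K, max(0.0, (K - (B - Ye) / q) / (1.0 - xi)))
    return Ke

rise_fixed_q = Ke_eq(0.35, feedback=False) - Ke_eq(0.30, feedback=False)
rise_with_q  = Ke_eq(0.35) - Ke_eq(0.30)
print(rise_with_q > rise_fixed_q > 0)  # endogenous q amplifies the shock
```

The same income shock moves K^e by more when the price is allowed to respond, which is the amplification channel described in the text.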
This news-shock mechanism has been explored in Jermann and Quadrini (2007) and Chen and Song (2009). Although we have considered only the case of nonreproducible capital, similar results apply when there is capital accumulation together with adjustment costs on investment. With investment adjustment costs, the price of capital is no longer always equal to one. An increase in future productivity raises the demand for capital, inducing an asset price boom, which in turn amplifies the impact of the initial productivity improvement. The adjustment costs can also take the form of capital irreversibility, as in Caggese (2007).

Quantitative Performance

There are many quantitative applications of the collateral model. In some, the borrowers are households engaged in real estate investments, as in Iacoviello (2005). Other studies consider firms in need of funds for productive investments. However, the quantitative amplification induced by collateral constraints is often weak. This point has been emphasized in Cordoba and Ripoll (2004). There are two reasons for the weak amplification. First, similar to the simple model described above, in a group of models proposed in the literature the "direct" effect of the frictions is on investment, not on the input of labor. Although this has the potential to generate large fluctuations in investment, the production inputs, capital and labor, are only marginally affected by this mechanism. As a result, output fluctuations are not affected in important ways by the financial frictions. I would also like to point out that the consideration of risk-averse agents will further reduce the amplification effects, since savings, and therefore investment, will become more stable (see Kocherlakota [2000] and Cordoba and Ripoll [2004]). For the financial frictions to generate large output fluctuations in line with the data, they need to have a direct impact on labor. This point will be developed further in the next section.
The second reason for the weak amplification is that typical macromodels do not generate large asset price fluctuations, even with the addition of binding margin requirements (see Coen-Pirani [2005]). The centerpiece of the amplification mechanism induced by the collateral constraint model is that the availability of credit, and therefore investment, depends on the price of assets, that is, b' ≤ ξq'k'. In economic expansions q' increases, and this allows for more capital investment thanks to the relaxation of the borrowing constraint. However, for this mechanism to be quantitatively important, the model should generate sizable fluctuations in q', which is typically not the case in standard macromodels. In this regard, the inability of the model to generate large amplification effects is more a consequence of the poor asset price performance of macromodels (which generate much smaller asset price fluctuations than in the data) than of any weakness of the collateral or financial accelerator mechanisms. This suggests that an improvement in the asset price performance of macromodels could also enhance the amplification effect induced by financial frictions. In making this conjecture, however, we should use some caution. If the model generates large asset price fluctuations, borrowing up to the limit becomes riskier. Agents may then choose to stay away from the limit; that is, they will act in a precautionary manner. As a result, it is not obvious whether large asset price fluctuations will generate large macroeconomic fluctuations since, as shown in the simple model studied above, this requires the collateral constraint to be binding. With precautionary behavior, however, the borrowing limit is only occasionally binding.
Unfortunately, exploring the quantitative importance of occasionally binding constraints cannot be done with local approximation techniques, the dominant approach used to study quantitative general equilibrium models. It is only recently that the importance of occasionally binding constraints for business cycle fluctuations has been fully recognized. Mendoza (2010) is one of the first articles to explore this issue quantitatively. I will return to the issue of occasionally binding collateral constraints later.

Working Capital Model

The financial mechanisms presented so far affect the transmission of productivity shocks through the investment channel. For example, in the costly state verification model, the entrepreneur's net worth affects the production of new capital goods, which in turn affects next-period production. In the model with collateral constraints, the net worth of entrepreneurs also plays a central role: Higher net worth allows entrepreneurs to purchase more capital. As a result, a larger fraction of productive assets is used in the more productive entrepreneurial sector, enhancing aggregate output. In both models the price of capital q plays a central role. However, this mechanism has a limited impact on labor. The intuition for the weak impact on labor is simple. With a Cobb-Douglas production function y = AK^θ h^{1−θ}, an increase in the input of capital increases the demand for labor because h is complementary to K. However, even though investment is highly volatile, the volatility of capital is small. Thus, changes in investment that are quantitatively plausible are unlikely to generate large fluctuations in labor. Empirically, however, labor input fluctuations are an important driver of output volatility. So, in general, financial frictions that primarily affect investment may not be enough for the frictions to play a central role in labor and output fluctuations.
A more direct impact can be obtained if financial frictions directly affect the demand for labor. One way to achieve this is by assuming that employers need working capital, which is complementary to labor. The idea of working capital is not new in macroeconomics. For example, the limited participation models of monetary policy are based on the idea that producers need to finance working capital. See, for example, Christiano and Eichenbaum (1992); Fuerst (1992); Christiano, Eichenbaum, and Evans (1997); and Cooley and Quadrini (1999, 2004). See also Neumeyer and Perri (2005) for the modeling of working capital in a nonmonetary model. On the one hand, besides the need for working capital, there are no other financial frictions in these models. On the other hand, business cycle models with financial frictions have mostly focused on investment, placing little importance on working capital. Jermann and Quadrini (2006), Mendoza (2010), and Jermann and Quadrini (forthcoming) are attempts at merging the two ideas: working capital needs with financially constrained borrowers.

To show how working capital interacts with financial constraints, I consider again the collateral model studied in the previous section. The only additional assumption is that entrepreneurs also need working capital in the first period of production. Specifically, they need to pay wages before the realization of revenues. To make these payments, entrepreneurs must borrow wh. This is an intraperiod loan, and therefore there are no interest payments. The collateral constraint becomes

  b' + wh \leq \xi q' k'.   (19)

The left-hand side is the total debt: the intertemporal debt that will be paid back next period and the intraperiod debt that must be repaid at the end of period 1. The right-hand side is the collateral value of assets.
The problem solved by the entrepreneur is similar to (12) but with the new collateral constraint, that is,

  \max_{h,\,k',\,b'} \; c + \beta c'

subject to:

  c = qK + AK^\theta h^{1-\theta} - wh - B + \frac{b'}{R} - qk',
  \xi q' k' \geq b' + wh,
  c' = A'k' - b',
  c \geq 0, \quad c' \geq 0.   (20)

The first-order conditions are also similar, with the exception of the optimality condition for the input of labor, which becomes

  (1 - \theta)AK^\theta h^{-\theta} = w(1 + \mu).   (21)

The variable μ is the Lagrange multiplier associated with the collateral constraint, as in the model without working capital. The multiplier creates a wedge in the demand for labor.³ When the collateral constraint is tighter, μ increases and the demand for labor declines. Using the supply of labor, h = w, the wage rate is

  w(\mu) = \left( \frac{1-\theta}{1+\mu} \right)^{\frac{1}{1+\theta}} A^{\frac{1}{1+\theta}} K^{\frac{\theta}{1+\theta}}.

We can see that the wage depends negatively on the multiplier μ, which I made explicit in the notation. This also implies that the entrepreneur's income, Y^e(μ) = AK^θ h^{1−θ} − wh, depends on μ. The budget constraint for the entrepreneur under a binding collateral constraint (and zero consumption) is

  \left( q - \frac{\xi q'}{R} \right) k' = qK + Y^e(\mu) - B.   (22)

From this equation I can derive the maximum capital that entrepreneurs can acquire as

  k'_{max} = \frac{1}{1 - \xi} \left( K - \frac{B - Y^e(\mu)}{q} \right).   (23)

The actual capital acquired in equilibrium by entrepreneurs is K^e = min{k'_{max}, K}.

Response to Productivity Shocks

I now consider the impact of changes in current and future productivity.

• Increase in A. Keeping μ constant, the higher productivity induces an increase in entrepreneurial income Y^e(μ). This implies that the net worth of entrepreneurs increases and, as we can see in (23), more capital will be allocated to the entrepreneurial sector. The next step is to see what happens to the price of capital, q, and to the multiplier μ. From equation (3) we see that the higher K^e (smaller capital k accumulated by workers) must be associated with an increase in the price of capital q.
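The wage equation and the distorted labor-demand condition (21) can be verified numerically. The parameter values are illustrative assumptions.

```python
# Working-capital model: labor demand (1-theta)*A*K**theta*h**(-theta) = w*(1+mu)
# combined with labor supply h = w gives
# w(mu) = ((1-theta)*A*K**theta / (1+mu))**(1/(1+theta)).
# Parameter values are illustrative assumptions.
theta, A, K = 0.36, 1.0, 1.0

def wage(mu):
    return ((1.0 - theta) * A * K**theta / (1.0 + mu)) ** (1.0 / (1.0 + theta))

def hours(mu):
    return wage(mu)  # labor supply: h = w

# A tighter collateral constraint (higher mu) lowers the wage and hours.
print(wage(0.0) > wage(0.3))
# The labor-demand condition (21) holds at the implied wage:
mu = 0.3
lhs = (1.0 - theta) * A * K**theta * hours(mu) ** (-theta)
print(abs(lhs - wage(mu) * (1.0 + mu)) < 1e-12)
```

Since h = w, a tighter constraint reduces hours one-for-one with the wage, which is the labor-demand wedge at work.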
As long as B > Y^e(μ), that is, as long as entrepreneurs are sufficiently leveraged, the increase in q further increases K^e. We can now see what happens to the Lagrange multiplier μ. According to equation (15), an increase in K^e must be associated with a decline in μ. Going back to the first-order condition for labor, equation (21), we observe that this reduces the labor wedge and generates an increase in the demand for labor, boosting current production.

³ It is common in the literature to use the phrase "labor wedge" to refer to terms that modify the optimality condition for the input of labor relative to the frictionless case. Later I will discuss the labor wedge in more detail and provide a more precise definition.

To summarize, the model with working capital can generate amplification of productivity shocks in the current period as well, in addition to next-period output. A key element of the amplification mechanism is the endogeneity of the asset price q. Because of the asset price boom, the borrowing constraint is relaxed and firms can borrow more. They use the higher borrowing to increase both current employment and next-period capital.

• Increase in A'. Let us now consider the impact of an anticipated productivity improvement (news shock). From equation (3) we see that an increase in A' generates an increase in the price of capital q. Then, from equation (23), we observe that the increase in q must be associated with a reallocation of capital to the entrepreneurial sector, further increasing q (again from equation [23]). This implies that the increase in next-period production is bigger than the increase in next-period productivity (amplification). As we have seen earlier, the amplification result for period 2 is also obtained in the model without working capital. With working capital, however, the news shock also generates an output boom in the current period.
Therefore, news shocks affect current employment and production even if there is no productivity change in the current period. This mechanism has been studied in Jermann and Quadrini (2007) and Chen and Song (2009), and it is consistent with the findings of Beaudry and Portier (2006) based on the estimation of structural vector autoregressions.

Labor Wedge

Financial frictions have the ability to generate a labor wedge if wages or other costs that are complementary to labor require advance financing (working capital). Since there is an extensive literature studying the importance of the labor wedge for business cycle fluctuations, it will be helpful to relate the properties of the wedge generated by financial frictions to the labor wedge discussed in the literature. The labor wedge is defined in the literature as a deviation from the optimality condition for the supply of labor that we would have in an economy without frictions. Without frictions, the optimality condition equalizes two terms: (i) the marginal rate of substitution between consumption and leisure, and (ii) the marginal product of labor. Thus, the labor wedge is defined as the difference between these two terms. If the difference is zero, we have the same optimality condition as in the frictionless model and, therefore, there is no wedge. If the difference is not zero, we have a labor wedge, since we are deviating from the frictionless optimality condition.

Using a constant elasticity of substitution utility and a Cobb-Douglas production function, the wedge can be written as

  Wedge \equiv mrs - mpl = \frac{\phi C}{1 - H} - (1 - \theta)\frac{Y}{H},   (24)

where C is consumption, H is hours worked, Y is output, and φ and θ are, respectively, preference and technology parameters. With the special utility function for workers used here, the wedge is H − (1 − θ)Y/H. Besides the fact that consumption does not enter the equation, the wedge generated by the model is very similar to the wedge derived from a more standard model. Since the labor supply is H = w and the demand for labor satisfies (1 − θ)Y/H = w(1 + μ), the wedge is equal to −wμ.

Gali, Gertler, and López-Salido (2007) decompose the labor wedge into two components. The first component is the wedge between the marginal rate of substitution (mrs) and the wage rate (w). The second component is the wedge between the wage rate (w) and the marginal product of labor (mpl). More specifically,

  Wedge \equiv mrs - w + w - mpl \equiv Wedge_1 + Wedge_2.

Using postwar data for the United States (although excluding the period of the recent crisis), Gali, Gertler, and López-Salido (2007) show that the first component of the wedge (Wedge_1) has played a predominant role in the dynamics of the whole wedge. In the version of the model studied here, however, the opposite is true, since financial frictions generate only a wedge between the wage rate and the marginal product of labor (Wedge_2). In the model presented here, wages are fully flexible and the mrs is always equal to the wage rate. Therefore, Wedge_1 = 0.

At first, this finding may seem to cast doubt on the empirical relevance of financial frictions for the dynamics of labor. However, it is important to recognize that the problem arises because wages are assumed to be fully flexible. To make this point, suppose that there is some wage rigidity. For example, we could assume that workers update their wages only with some probability (as in Calvo pricing). Then a change in labor demand would lead to a change in labor supply but with only a small change in the wage. As a result, Wedge_1 is no longer zero. To show this point more clearly, suppose that the wage is fixed at w̄. The first component of the wedge is equal to Wedge_1 = H − w̄.
Eliminating H using the firms' first-order condition (1 − θ)AK^θ H^{−θ} = (1 + μ)w̄, we get

  Wedge_1 = \left( \frac{(1-\theta)AK^\theta}{(1+\mu)\bar{w}} \right)^{\frac{1}{\theta}} - \bar{w}.

Therefore, the first component of the wedge now depends on μ, which in turn depends on the shock. So, in principle, by adding wage rigidities the model could capture some of the movements in both components of the wedge.

Quantitative Performance

As discussed above, the addition of working capital gives an extra kick to the amplification potential of the model. As far as productivity shocks are concerned, however, the amplification effect remains weak. As in the collateral model without working capital, large amplification effects require sizable fluctuations in asset prices q'. However, I have already observed that standard macroeconomic models, even with the addition of financial frictions, find it difficult to generate large fluctuations in asset prices. As a result, the amplification effect remains weak.

The amplification of other shocks, besides productivity, has not received much attention in the literature. An exception is Bernanke, Gertler, and Gilchrist (1999). They embed the financial accelerator in a New Keynesian monetary model and find that the amplification effects on monetary policy shocks could be sizable: Based on their calibration, the impulse response of output to a monetary policy shock is about 50 percent larger with financial frictions.

7. MODEL WITH CREDIT SHOCKS

In the analysis conducted in the previous sections, I have focused on the propagation of productivity shocks, that is, shocks that arise in the real sector of the economy.
Although the analysis of real shocks is clearly important for business cycle fluctuations, less attention has been devoted to studying the macroeconomic impact of shocks that arise in the financial sector of the economy, in particular shocks that directly impact the ability of entrepreneurs or other borrowers to raise debt. Of course, we would like a theory of why the ability to borrow could change independently of changes that arise in the real sector of the economy. I will describe one possible theory later. For the moment, however, I start with a reduced-form approach in which the shocks are exogenous.

Model with Exogenous Credit Shocks

Consider the model with working capital analyzed in the previous section, where entrepreneurs face the collateral constraint (19). Now, however, I assume that the constraint factor ξ is stochastic. I will call the stochastic changes in ξ "credit" shocks, since they affect the borrowing capability of entrepreneurs.

Given the analysis conducted in the previous section, it is easy to see how the economy responds to these shocks. The impact is similar to the response to an asset price boom induced by a productivity improvement: By changing the tightness of the collateral constraint, the shock has an immediate impact on the multiplier μ and, therefore, on the labor wedge. The change in ξ also affects the price of capital, and this interacts with the exogenous change in the borrowing limit. Thus, the price mechanism described in the previous section also acts as an amplification mechanism for the credit shock. To show this in more detail, consider again equation (23), derived from the budget constraint of entrepreneurs when the collateral constraint is binding (and consumption is zero). For simplicity I rewrite the equation here:

  K^e = \frac{1}{1 - \xi} \left( K - \frac{B - Y^e(\mu)}{q} \right).   (25)

This equation makes clear that, keeping the multiplier μ constant, an increase in ξ (a positive credit shock) increases the capital allocated to the entrepreneurial sector. As a result of this reallocation, we can see from equation (3) that the price of capital q increases. As long as B > Y^e(μ), that is, as long as entrepreneurs are sufficiently leveraged, the increase in q further increases K^e. Thus, the positive credit shock generates a reallocation of capital to the entrepreneurial sector, which in turn increases next-period output. The reallocation of capital also affects the multiplier μ and, therefore, the labor wedge. From equation (15) we can see that an increase in K^e generates a decline in μ. The multiplier also depends positively on ξ. However, if the negative effect from the increase in K^e dominates the positive effect from ξ, the multiplier μ and the labor wedge both decline in response to the positive credit shock. Therefore, credit shocks also have a positive impact on current employment and production. This is the channel explored in Jermann and Quadrini (forthcoming).

More on Credit Shocks

There are several articles that consider shocks to collateral or enforcement constraints. Some examples are Kiyotaki and Moore (2008), Del Negro et al. (2010), and Gertler and Karadi (2011). In the latter article, the shock arises in the financial intermediation sector. Mendoza and Quadrini (2010) also consider a financial shock to the intermediation sector, but in the form of losses on outstanding loans. The impact of these shocks is very similar to a change in ξ. Christiano, Motto, and Rostagno (2008) propose a different way of modeling a credit shock. They use a version of the costly state verification model described in Section 5 and assume that the "volatility" of the idiosyncratic shock ω is time-varying. Thus, the financial shock is associated with greater investment risk. Since the risk is idiosyncratic and entrepreneurs are risk neutral, the transmission mechanism is similar to that of shocks affecting the verification cost. Furthermore, once we recognize that a higher verification cost reduces the liquidation value of assets, this is not that different from the collateral model in which ξ falls because a smaller portion of the capital can be recovered. The importance of time-varying risk when there are financial frictions is also studied in Arellano, Bai, and Kehoe (2010) and Gilchrist, Sim, and Zakrajsek (2010).

Alternative Specification of the Collateral Constraint

From a quantitative point of view, and abstracting from changes in ξ, the collateral constraint (19) has an undesirable property. In particular, if this constraint is binding, the model generates a volatility of debt that is similar to or higher than the volatility of the market price of capital q'. The reason is that k' co-moves positively with q' in the model. Then, the linear relation between q'k' and the debt b' implies that the volatility of b' is bigger than the volatility of q'. In the data, however, asset prices are much more volatile than debt. Thus, if the model generates plausible fluctuations in asset prices, it also generates excessive fluctuations in the stock of debt. This problem does not arise if we use an enforcement constraint in which the liquidation value of capital is related to the book value, that is,

  b' + wh \leq \xi k'.   (26)

Conceptually, this could derive from the fact that, once the firm enters the liquidation stage, the capital ends up being reallocated to alternative uses and its price differs from q'. With this specification the model can generate plausible fluctuations in both asset prices and debt. Recognizing this, Perri and Quadrini (2011) and Jermann and Quadrini (forthcoming) use a specification of the collateral constraint in which the liquidation value of capital does not depend on q'.
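The volatility comparison between the two constraint specifications can be illustrated with a toy series: under a binding (19), debt inherits the movements of the asset price q', while under (26) it does not. The price and capital paths below are made-up illustrations, not model simulations.

```python
# Debt implied by the two binding constraints, holding the wage bill fixed:
# (19) b' = xi*q'*k' - wh  co-moves with the asset price q';
# (26) b' = xi*k' - wh     depends only on the book value of capital.
# All series and parameters are made-up illustrations.
import statistics

xi, wh = 0.8, 0.2
q_path = [1.00, 1.10, 0.92, 1.05, 0.95, 1.08]         # assumed q' path
k_path = [1.0 + 0.2 * (q - 1.0) for q in q_path]      # k' mildly procyclical

b_price = [xi * q * k - wh for q, k in zip(q_path, k_path)]  # constraint (19)
b_book  = [xi * k - wh for k in k_path]                      # constraint (26)

print(statistics.pstdev(b_price) > statistics.pstdev(b_book))
```

With the book-value constraint, sizable asset price swings no longer force counterfactually large swings in debt, which is the point made in the text.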
Of course, by eliminating q' from the collateral constraint we no longer have the amplification mechanism generated by the price of capital. However, once we focus on credit shocks, the amplification mechanism becomes secondary, since these shocks can already generate significant macroeconomic volatility.

Asset Price Bubbles and Financial Frictions

In the various versions of the model presented so far, we have seen that the price of assets plays an important role when there are financial market frictions. Whatever makes the price of assets move can affect the real sector of the economy by changing the tightness of the borrowing constraint. One factor that could generate movement in asset prices is bubbles. Traditionally, we think of bubbles as situations in which the price of assets keeps growing over time even if nothing "fundamental" changes in the economy. Independently of what can generate and sustain a bubble, it is easy to see the macroeconomic implications in the context of the simple model studied here. Consider the version of the collateral model with working capital studied in Section 6. In this model, the fundamental price of capital in period 2 is A'G'(K − K^e). In the presence of a bubble, the price of capital would be higher. Without going into the details of whether the bubble is rational or not, the asset price with a bubble is A'G'(K − K^e) + B, where B is the bubble component. The macroeconomic effects are similar to the ones we have already examined when the change in asset prices was driven by productivity.

The modeling of rational bubbles is often challenging, especially in models with infinitely lived agents. To avoid this problem, Jermann and Quadrini (2007) design a mechanism that looks like a bubble, that is, it generates asset price movements, but is based on fundamentals.
The idea is that the economy can experience different rates of growth and switches from one growth regime to the other with some probability. When the “believed” probability of switching to a higher growth regime increases, current asset prices increase and the model generates a macroeconomic expansion. Even if the mechanism is not technically a bubble, it generates similar macroeconomic effects. An alternative approach is to work with models where agents have limited life spans. These models allow for rational bubbles if certain conditions about discounting and population growth are met. Examples of these studies are Farhi and Tirole (2011) and Martin and Ventura (2011). A third approach is based on the idea of multiple equilibria as in Kocherlakota (2009). This study is inspired by the study of Kiyotaki and Moore (2008), who develop a model with two monetary equilibria. In the first equilibrium, money is valued because there is the expectation that agents are willing to accept money, while in the second equilibrium money has no value because agents are not willing to accept it. Kocherlakota (2009) reinterprets money more generally as a nonproductive asset that could be used as a collateral. For example, housing. He then considers sunspot equilibria in which the economy switches stochastically from one equilibrium to the other. The switch is associated with asset price fluctuations, which have an impact on the real sector of the economy. Quantitative Performance The study of the quantitative implications of credit shocks is relatively recent but the findings suggest that these shocks play an important role for the business cycle. This is especially true if they directly affect the demand of labor. An important issue in conducting a quantitative exploration of these shocks is their identification. Jermann and Quadrini (forthcoming) propose two approaches. The first approach uses a strategy that is reminiscent of the Solow V. 
residual procedure used to construct productivity shocks. Consider the enforcement constraint specified in equation (26). If this constraint is always binding, we can use empirical time series for debt b, the wage bill wh, and capital k to construct the time series for the credit variable ξ as residuals from this equation. Once we have the time series for ξ, we can feed the constructed series into the (calibrated) model and study the response of the variables of interest. Figure 2 shows the empirical and simulated series of output and labor generated by the model studied in Jermann and Quadrini (forthcoming). According to the simulation, credit shocks have played an important role in capturing the dynamics of labor and output in the U.S. economy during the last two and a half decades.

[Figure 2: Responses to Financial Shocks from Jermann and Quadrini (Forthcoming). Panel A: GDP; Panel B: Hours Worked. Each panel plots the data series against the model-generated series, 1985:II–2008:IV.]

Another approach used to evaluate the importance of credit shocks is to conduct a structural estimation of the model. This, however, requires the consideration of many more shocks because, effectively, a structural estimation has the flavor of a horse race among the shocks included in the model. For that reason Jermann and Quadrini (forthcoming) extend the basic model by adding more frictions and shocks. The estimated model is similar to Smets and Wouters (2007) but with financial frictions and financial shocks. Through the structural estimation they find that credit shocks contributed at least one-third of the variance of U.S. output and labor.
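The residual procedure can be sketched in a few lines. Since equation (26) is not reproduced in this excerpt, the sketch below assumes, purely for illustration, the working-capital form of the constraint used elsewhere in the article, ξ_t q_t k_t = b_t + w_t h_t, and all the "data" series are synthetic placeholders:

```python
import numpy as np

# Residual identification of the credit-shock series xi_t: if the
# enforcement constraint always binds, xi_t is pinned down by observed
# series. Illustrative form assumed here: xi_t * q_t * k_t = b_t + w_t*h_t
# (the working-capital constraint used elsewhere in the article).
# The "data" below are synthetic placeholders, not actual U.S. series.

T = 8
rng = np.random.default_rng(0)
q = np.ones(T)                        # price of capital (normalized)
k = 10.0 + rng.normal(0.0, 0.1, T)   # capital stock
b = 4.0 + rng.normal(0.0, 0.2, T)    # debt
wh = 1.0 + rng.normal(0.0, 0.05, T)  # wage bill (working capital)

# Back out xi_t as the residual that makes the constraint hold with equality.
xi = (b + wh) / (q * k)
print(np.round(xi, 3))

# In Jermann and Quadrini's exercise, the constructed xi_t series is then
# fed into the calibrated model to simulate output and hours responses.
```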
Christiano, Motto, and Rostagno (2008) and Liu, Wang, and Zha (2011) also conduct structural estimations of models with financial frictions and financial shocks, and they find that these shocks contributed significantly to the volatility of aggregate output.

Model with Endogenous Liquidity and Multiple Equilibria

So far the analysis has focused on equilibria in which entrepreneurs face binding collateral constraints. This is typically the case when there is no uncertainty. However, in the presence of uncertainty, and especially with credit shocks, the enforcement constraint may not be binding in some contingencies. The possibility of “occasionally” binding constraints allows us to think about the issue of liquidity and the emergence of multiple equilibria.4 I continue to use the collateral constraint specified in (19) but with further assumptions about the liquidation value of capital. Following Perri and Quadrini (2011), I assume that in the event of debt repudiation, the liquidated capital can be sold not only to the residual sector (as in the previous model) but also to other, nondefaulting entrepreneurs. However, if the capital is sold to the residual sector, only a fraction ξ is usable. If the capital is sold to other entrepreneurs, instead, there is no loss of capital. Since the marginal productivity of capital for entrepreneurs in the next period is A, this is also the price that nondefaulting entrepreneurs would be willing to pay for the liquidated capital. The price obtained by selling capital to the residual sector, instead, is AG(K − K^e). Because ξ < 1 and G(K − K^e) ≤ 1, resale to the entrepreneurial sector is the preferred option. Notice that the default decision is made after all entrepreneurs have decided their borrowing b.
If there were no limits to the ability of nondefaulting entrepreneurs to purchase liquidated capital, then the residual sector would be irrelevant. Thus, I now introduce an additional assumption that in some contingencies limits the ability of the entrepreneurial sector to purchase the liquidated capital: entrepreneurs can purchase the capital of liquidated firms only if they have the liquidity to do so. In this context, liquidity is determined by the credit capacity of entrepreneurs, which in turn depends on their borrowing decision. More specifically, if the collateral constraint binds, entrepreneurs are unable to purchase the capital of liquidated firms since they no longer have access to additional credit. In this case, the only option available to the lender is to sell the liquidated capital to the residual sector at a lower price. If entrepreneurs do not borrow up to the limit, however, they retain access to credit that can be used in the event of an investment opportunity. In this case the capital of a liquidated firm can be sold to entrepreneurs at a higher price.

4 Occasionally binding constraints are a feature of the models studied in Brunnermeier and Sannikov (2010) and Mendoza (2010), although those models abstract from credit shocks and do not generate multiple equilibria. Boz and Mendoza (2010) also consider occasionally binding constraints with credit shocks but not multiple equilibria. See also Guerrieri and Lorenzoni (2009).

We now have all the elements to show that the model has the potential to generate multiple equilibria. Suppose that the initial state B is such that the enforcement constraint (19) is binding if the residual sector is the only outlet for the liquidated capital but is not binding if the liquidated capital can be sold to entrepreneurs.
In the first case the collateral value is ξqk = ξAG(K − K^e), while in the second it is ξqk = ξA. Since the second is larger than the first, it is possible that the collateral constraint is binding in the first case but not in the second. Under these conditions the model admits multiple self-fulfilling equilibria.

• Bad equilibrium. Suppose that agents expect the unit value of the liquidated capital to be ξq = ξAG(K − K^e). This imposes a tight constraint on entrepreneurs and, as a result, they borrow up to the limit. But then, if an entrepreneur defaults, the lender is unable to sell the liquidated capital to other entrepreneurs, since no entrepreneur is capable of purchasing the capital. The recovery value is ξAG(K − K^e) per unit of capital. Therefore, the expectation of a low liquidation price is ex post validated by the lack of “liquidity” available to entrepreneurs.

• Good equilibrium. Suppose that agents expect the unit value of the liquidated capital to be q = A. This relaxes the borrowing constraint on entrepreneurs and allows them to borrow more than required to purchase k = K. Thus, the collateral constraint is not binding. But then, if an entrepreneur defaults, the lender is able to sell the liquidated capital to other entrepreneurs, and the recovery value is A. Therefore, the expectation of a high liquidation price is ex post validated by the “liquidity” available to entrepreneurs.

The possibility of multiple equilibria introduces an endogenous mechanism for fluctuations in ξ. More specifically, ξ is low when the enforcement constraint is binding, and the binding constraint in turn sustains the low value of ξ. Conversely, ξ is high when the enforcement constraint is not binding, and the slack constraint in turn sustains the high value of ξ. In this way the credit shock ξ becomes endogenous and can fluctuate in response to the state of the economy.
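A small numerical check, with illustrative parameter values, confirms that both sets of expectations can be self-fulfilling at the same time: under the low-price expectation the borrowing limit falls below the required borrowing (so the constraint binds and liquidity is absent), while under the high-price expectation it does not:

```python
# Numerical check that both price expectations can be self-fulfilling.
# The expected liquidation price determines the borrowing limit, and the
# resulting slackness of the constraint validates the expectation.
# All parameter values are illustrative.

A = 1.2                   # marginal product of capital for entrepreneurs
G_resid = 0.7             # stand-in for G(K - K^e) <= 1 in the residual sector
xi = 0.6                  # usable fraction when capital goes to the residual sector
k = 1.0
required_borrowing = 0.8  # debt plus wage bill the entrepreneur must finance

limit_bad = xi * A * G_resid * k   # limit under the low-price expectation
limit_good = A * k                 # limit when liquidated capital fetches A per unit

binds_bad = required_borrowing >= limit_bad
binds_good = required_borrowing >= limit_good

# Bad equilibrium: the constraint binds, entrepreneurs have no spare credit,
# liquidated capital can only go to the residual sector: low price validated.
# Good equilibrium: the constraint is slack, entrepreneurs retain liquidity,
# liquidated capital sells at the high price: high price validated.
print("binds under low-price expectation: ", binds_bad)
print("binds under high-price expectation:", binds_good)
```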
This provides a notion of a liquidity-driven crisis: Expectations of high prices increase liquidity, which in turn sustains high prices. Expectations of low prices, instead, generate a contraction in liquidity, which in turn induces a fall in the liquidation price. The transmission of “endogenous” credit shocks to the real sector of a closed economy is similar to that in the model with “exogenous” credit shocks described in the previous section.

The International Transmission of Credit Shocks

The 2007–2009 crisis was characterized by a high degree of international synchronization, in which most industrialized countries experienced large macroeconomic contractions. There are two main explanations for the synchronization. The first is that country-specific shocks are internationally correlated. The second is that shocks arising in one or a few countries are transmitted to other countries because of economic and financial integration. The first hypothesis is not truly an explanation: If shocks are correlated across countries, we would like to understand why they are correlated. Although the answer is obvious for certain shocks (oil shocks, for example), it is less intuitive for others. For instance, if we think that shocks to the labor wedge are important drivers of the business cycle, it is not obvious why they should be correlated across countries. The second hypothesis—international transmission of country-specific shocks—is a more interesting line of research. In this section I show that “credit” shocks that hit one or a few countries can generate large macroeconomic spillovers to other countries if those countries are financially integrated. These shocks are therefore possible candidates to account for the international co-movement in macroeconomic aggregates. To show this, I consider a two-country version of the collateral model described earlier.
The only additional feature I need to specify is the meaning of financial integration. One obvious implication of financial integration is that borrowing and lending can be done internationally. This also implies that the interest rate is equalized across countries (law of one price). In the simple model studied here, however, this is inconsequential: Agents are risk neutral, so interest rates are constant and equal across countries even under financial autarky. Therefore, this is not the dimension of international integration that matters here. Another implication of financial integration is that investors (in our case entrepreneurs) hold domestic and foreign firms. Effectively, it is as if each firm has two units: one operating in country 1 and the other operating in country 2. The problem solved by the entrepreneur can be written as

\[
\max_{\{h_j,\,k_j,\,b_j\}_{j=1}^{2}} \; c + \beta c' \tag{27}
\]

subject to

\[
c = \sum_{j=1}^{2} \left[ A_j K^{\theta} h_j^{1-\theta} - w_j h_j - B_j + \frac{b_j}{R} + q_j \left( K - k_j \right) \right],
\]
\[
\xi_j q_j k_j \ge b_j + w_j h_j, \qquad j = 1, 2,
\]
\[
c' = \sum_{j=1}^{2} \left( A_j k_j - b_j \right), \qquad c \ge 0, \quad c' \ge 0,
\]

where the index j = 1, 2 identifies the country. Since entrepreneurs have operations at home and abroad, they make production and investment decisions in both countries. Notice, however, that they face a consolidated budget constraint. Also notice that the variable ξ is indexed by j, since credit shocks can be country-specific. This would be the case, for example, if there are financial problems in the banking system of one country but not in the other. I now show how a credit shock in country 1 (a change in ξ_1) affects the economies of both countries. This can easily be seen from the first-order conditions with respect to k_1, b_1, k_2, and b_2:

\[
(1+\gamma)\, q_1 = \beta A_1 + \mu_1 \xi_1 E q_1, \tag{28}
\]
\[
1+\gamma = (\beta + \mu_1) R, \tag{29}
\]
\[
(1+\gamma)\, q_2 = \beta A_2 + \mu_2 \xi_2 E q_2, \tag{30}
\]
\[
1+\gamma = (\beta + \mu_2) R. \tag{31}
\]

Equations (29) and (31) imply that the Lagrange multipliers are equalized across countries, that is, μ_1 = μ_2. Now consider the first-order conditions with respect to labor,

\[
(1-\theta) A_1 K^{\theta} h_1^{-\theta} = w_1 (1 + \mu_1),
\]
\[
(1-\theta) A_2 K^{\theta} h_2^{-\theta} = w_2 (1 + \mu_2).
\]

Since μ_1 = μ_2, a credit shock in country 1 (a change in ξ_1) has the same impact on the demand for labor in both countries. Therefore, a country-specific credit shock is propagated to other countries through the labor wedge. This mechanism is emphasized in Perri and Quadrini (2008, 2011). The impact on the accumulation of capital is not perfectly symmetric in the two countries, as we can see from equations (28) and (30). However, k_1 and k_2 move in the same direction.

Endogenous Credit Shocks

Perri and Quadrini (2011) go beyond “exogenous” credit shocks and, adopting a framework with occasionally binding constraints similar to the model described in the previous subsection, study the implications of endogenous ξ_j in an international environment. The emergence of multiple equilibria characterized by different degrees of liquidity also arises in the two-country model. What is interesting is that, if countries are financially integrated, then bad and good equilibrium outcomes become perfectly correlated across countries. Thus, the model provides not only a mechanism for the international transmission of country-specific credit shocks, but also a mechanism through which “endogenous” credit shocks are internationally correlated. It is important to emphasize that the international correlation of ξ_j is not an assumption but an equilibrium property. To see this, consider again the two-country model studied in the previous section. The first-order conditions with respect to k_1, b_1, k_2, and b_2 are still given by (28)–(31). Therefore, μ_1 = μ_2.
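The symmetry of the labor-market response can be verified numerically. Under the hypothetical parameter values below, solving the labor first-order condition for h_j shows that an increase in the common multiplier μ, triggered by a credit shock in either country, lowers the demand for labor in both countries by the same proportion:

```python
# With mu1 = mu2 = mu, the labor first-order condition
#   (1 - theta) * A_j * K**theta * h_j**(-theta) = w_j * (1 + mu)
# yields labor demand in each country. A credit shock that raises the
# common multiplier mu contracts labor demand in BOTH countries by the
# same proportion. Parameter values are illustrative.

theta, K = 0.3, 1.0
A = [1.0, 1.1]    # country productivities
w = [0.7, 0.75]   # country wages

def labor_demand(Aj, wj, mu):
    return ((1 - theta) * Aj * K ** theta / (wj * (1 + mu))) ** (1 / theta)

for mu in (0.0, 0.2):   # before / after a credit tightening
    h1, h2 = labor_demand(A[0], w[0], mu), labor_demand(A[1], w[1], mu)
    print(f"mu={mu}: h1={h1:.4f}, h2={h2:.4f}")

# The proportional drop is identical across countries:
drop = [labor_demand(A[j], w[j], 0.2) / labor_demand(A[j], w[j], 0.0)
        for j in range(2)]
assert abs(drop[0] - drop[1]) < 1e-12
```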
This means that, if the collateral constraint is binding in one country, it must also be binding in the other country. But then it cannot be that in one country the liquidation price of capital is determined by the marginal product in the entrepreneurial sector while in the other country the price is determined by the marginal product in the residual sector. If the collateral constraints are binding in both countries, entrepreneurs lack the liquidity to purchase the capital of liquidated firms, and the collateral value is low in both countries. This makes the collateral constraints tighter, and entrepreneurs borrow up to the limit (the bad equilibrium outcome). If the collateral constraints are slack in both countries, instead, entrepreneurs have the liquidity to purchase the liquidated capital in both countries. The collateral value is high in both countries and firms do not borrow up to the limit. To summarize, either both countries end up in a bad equilibrium or both countries end up in a good equilibrium. In this way self-fulfilling equilibria (endogenous shocks) become perfectly correlated across countries.

Quantitative Performance

To the best of my knowledge, the quantitative properties of the international model with endogenous credit shocks have been explored only in Perri and Quadrini (2011). That article emphasizes four properties. First, the response to credit shocks is highly asymmetric. Negative credit shocks generate large and short-lived macroeconomic contractions, while credit expansions generate gradual and long-lasting macroeconomic booms. The second finding is that credit contractions (negative credit shocks) have larger macroeconomic effects if they arise after long periods of credit expansion. Therefore, long credit expansions create the conditions for highly disruptive financial crises.
A similar prediction is obtained in Gorton and Ordoñez (2011), but with a mechanism based on the information quality of collateral assets. The third finding relates to the difference between exogenous and endogenous credit shocks. While exogenous credit shocks can generate macroeconomic co-movement, they do not generate cross-country co-movement in financial flows or leverage, which is a strong empirical regularity. The model with endogenous credit shocks, however, is also capable of generating co-movement in financial flows, since the ξ_j are endogenously correlated across countries. The last quantitative feature of the model I would like to emphasize is that it can generate sizable fluctuations in asset prices. For this feature, however, the risk aversion of entrepreneurs, from which the simple version of the model presented here abstracts, becomes important. Assuming that there is market segmentation and firms cannot be purchased by workers, a negative credit shock induces firms to pay lower dividends, which in turn reduces the consumption of entrepreneurs. This implies that the discount factor of entrepreneurs, βU′(c_{t+1})/U′(c_t), falls. As a result, their valuation of future dividends falls, leading to an immediate drop in the market value of firms. Since the impact of credit shocks on entrepreneurs’ consumption is large, the model generates sizable drops in asset prices.

8. CONCLUSION

The key principles for adding financial market frictions to general equilibrium models are not new in the macro literature. However, it is only with the recent crisis that the profession has fully recognized the importance of financial markets for business cycle fluctuations. Thus, more effort has been devoted to the construction of models that can capture the role of financial markets in macroeconomic dynamics. This article has reviewed the most common and popular ideas proposed in the literature.
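A back-of-the-envelope example with CRRA utility (illustrative numbers, not from the article) shows the mechanism: a drop in entrepreneurs' current consumption raises current marginal utility, lowers the discount factor βU′(c_{t+1})/U′(c_t), and hence lowers the value of a claim to future dividends:

```python
# How a credit shock depresses asset prices through entrepreneurs'
# discounting. CRRA utility; all numbers are illustrative.

beta, sigma = 0.96, 2.0

def u_prime(c):
    return c ** (-sigma)

c2 = 1.0                        # future consumption, held fixed
for c1 in (1.0, 0.8):           # normal times vs. after a negative credit shock
    sdf = beta * u_prime(c2) / u_prime(c1)   # beta * U'(c2) / U'(c1)
    value = sdf * 1.0           # value of a claim to one unit of future dividends
    print(f"c1={c1}: discount factor={sdf:.4f}, asset value={value:.4f}")

# Lower current consumption -> higher current marginal utility -> the
# discount factor and hence the asset value fall.
```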
Using a stylized model with only two periods and two types of agents, I have shown that the modeling of financial market frictions is useful for understanding several dynamic features of the macroeconomy in general and of the business cycle in particular. The ideas reviewed in this article are all based on the transmission of shocks through the “credit channel,” that is, conditions that limit the availability of funds or increase the cost of funds needed to make investment and hiring decisions. Some authors have also proposed models in which the credit channel and adverse selection in credit markets can generate economic fluctuations even in the absence of exogenous shocks; an example is Suarez and Sussman (1997). Less attention has been devoted in the literature to alternative mechanisms through which financial frictions affect macroeconomic dynamics. One such mechanism is studied in Monacelli, Quadrini, and Trigari (2010), who embed financial market frictions in a matching model of the labor market with wage bargaining. In their model, collateral constraints affect employment not because they limit the amount of funds available to firms for hiring workers, but because they affect wage bargaining. One interesting feature of this mechanism is that the impact of credit shocks on employment is much more persistent than the impact generated by the typical credit channel reviewed in this article.5

REFERENCES

Arellano, Cristina, Yan Bai, and Patrick Kehoe. 2010. “Financial Markets and Fluctuations in Uncertainty.” Research Department Staff Report, Federal Reserve Bank of Minneapolis (April).

Beaudry, Paul, and Franck Portier. 2006. “Stock Prices, News, and Economic Fluctuations.” American Economic Review 96 (September): 1,293–307.

Bernanke, Ben, and Mark Gertler. 1989. “Agency Costs, Net Worth, and Business Fluctuations.” American Economic Review 79 (March): 14–31.
Bernanke, Ben S., Mark Gertler, and Simon Gilchrist. 1999. “The Financial Accelerator in a Quantitative Business Cycle Framework.” In Handbook of Macroeconomics, Vol. 1C, edited by J. B. Taylor and M. Woodford. Amsterdam: Elsevier Science; 1,341–93.

Bewley, Truman F. 1986. “Stationary Monetary Equilibrium with a Continuum of Independent Fluctuating Consumers.” In Contributions to Mathematical Economics in Honor of Gérard Debreu, edited by Werner Hildenbrand and Andreu Mas-Colell. Amsterdam: North-Holland.

Boz, Emine, and Enrique G. Mendoza. 2010. “Financial Innovation, the Discovery of Risk, and the U.S. Credit Crisis.” Working Paper 16020. Cambridge, Mass.: National Bureau of Economic Research (May).

Brunnermeier, Markus K., and Yuliy Sannikov. 2010. “A Macroeconomic Model with a Financial Sector.” Manuscript, Princeton University.

5 Other contributions that embed financial market frictions in models with search and matching frictions are Weil and Wasmer (2004), Chugh (2009), Petrosky-Nadeau (2009), and Petrosky-Nadeau and Wasmer (2010). In these articles, however, the main transmission mechanism is still based on the “credit channel.”

Caggese, Andrea. 2007. “Financing Constraints, Irreversibility and Investment Dynamics.” Journal of Monetary Economics 54 (October): 2,102–30.

Carlstrom, Charles T., and Timothy S. Fuerst. 1997. “Agency Costs, Net Worth, and Business Fluctuations: A Computable General Equilibrium Analysis.” American Economic Review 87 (December): 893–910.

Chen, Kaiji, and Zheng Song. 2009. “Financial Frictions on Capital Allocation: A Transmission Mechanism of TFP Fluctuations.” Manuscript, Emory University.

Christiano, Lawrence J., and Martin Eichenbaum. 1992. “Liquidity Effects and the Monetary Transmission Mechanism.” American Economic Review 82 (May): 346–53.

Christiano, Lawrence J., Martin Eichenbaum, and Charles L. Evans. 1997.
“Sticky Price and Limited Participation Models of Money: A Comparison.” European Economic Review 41 (June): 1,201–49.

Christiano, Lawrence J., Roberto Motto, and Massimo Rostagno. 2008. “Financial Factors in Economic Fluctuations.” Manuscript, Northwestern University and European Central Bank.

Chugh, Sanjay K. 2009. “Costly External Finance and Labor Market Dynamics.” Manuscript, University of Maryland.

Coen-Pirani, Daniele. 2005. “Margin Requirements and Equilibrium Asset Prices.” Journal of Monetary Economics 52 (March): 449–75.

Cooley, Thomas F., Ramon Marimon, and Vincenzo Quadrini. 2004. “Aggregate Consequences of Limited Contract Enforceability.” Journal of Political Economy 112 (August): 817–47.

Cooley, Thomas F., and Vincenzo Quadrini. 1999. “A Neoclassical Model of the Phillips Curve Relation.” Journal of Monetary Economics 44 (October): 165–93.

Cooley, Thomas F., and Vincenzo Quadrini. 2004. “Optimal Monetary Policy in a Phillips-Curve World.” Journal of Economic Theory 118 (October): 174–208.

Cordoba, Juan Carlos, and Marla Ripoll. 2004. “Credit Cycles Redux.” International Economic Review 45 (November): 1,011–46.

Covas, Francisco, and Wouter J. Den Haan. 2010. “The Role of Debt and Equity Finance over the Business Cycle.” Manuscript, University of Amsterdam.

Covas, Francisco, and Wouter J. Den Haan. 2011. “The Cyclical Behavior of Debt and Equity Finance.” American Economic Review 101 (April): 877–99.

Del Negro, Marco, Gauti Eggertsson, Andrea Ferrero, and Nobuhiro Kiyotaki. 2010. “The Great Escape? A Quantitative Evaluation of the Fed’s Non-Standard Policies.” Manuscript, Federal Reserve Bank of New York.

Farhi, Emmanuel, and Jean Tirole. 2011. “Bubbly Liquidity.” Working Paper 16750. Cambridge, Mass.: National Bureau of Economic Research (January).

Fuerst, Timothy S. 1992. “Liquidity, Loanable Funds, and Real Activity.” Journal of Monetary Economics 29 (February): 3–24.
Gali, Jordi, Mark Gertler, and J. David López-Salido. 2007. “Markups, Gaps, and the Welfare Costs of Business Fluctuations.” The Review of Economics and Statistics 89 (November): 44–59.

Gertler, Mark, and Peter Karadi. 2011. “A Model of Unconventional Monetary Policy.” Journal of Monetary Economics 58 (January): 17–34.

Gilchrist, Simon, Jae W. Sim, and Egon Zakrajsek. 2010. “Uncertainty, Financial Frictions, and Investment Dynamics.” Manuscript, Boston University.

Gilchrist, Simon, Vladimir Yankov, and Egon Zakrajsek. 2009. “Credit Market Shocks and Economic Fluctuations: Evidence from Corporate Bond and Stock Markets.” Journal of Monetary Economics 56 (May): 471–93.

Gomes, Joao F., Amir Yaron, and Lu Zhang. 2003. “Asset Prices and Business Cycles with Costly External Finance.” Review of Economic Dynamics 6 (October): 767–88.

Gorton, Gary, and Guillermo Ordoñez. 2011. “Collateral Crises.” Manuscript, Yale University.

Guerrieri, Veronica, and Guido Lorenzoni. 2009. “Liquidity and Trading Dynamics.” Econometrica 77 (November): 1,751–90.

Guerrieri, Veronica, and Guido Lorenzoni. 2010. “Credit Crises, Precautionary Savings and the Liquidity Trap.” Manuscript, University of Chicago Booth and Massachusetts Institute of Technology.

Hart, Oliver, and John Moore. 1994. “A Theory of Debt Based on the Inalienability of Human Capital.” Quarterly Journal of Economics 109 (November): 841–79.

Iacoviello, Matteo. 2005. “House Prices, Borrowing Constraints, and Monetary Policy in the Business Cycle.” American Economic Review 95 (June): 739–64.

Jaimovich, Nir, and Sergio Rebelo. 2009. “Can News about the Future Drive the Business Cycle?” American Economic Review 99 (September): 1,097–118.

Jermann, Urban, and Vincenzo Quadrini. 2006. “Financial Innovations and Macroeconomic Volatility.” Working Paper 12308. Cambridge, Mass.: National Bureau of Economic Research (June).

Jermann, Urban, and Vincenzo Quadrini. 2007.
“Stock Market Boom and the Productivity Gains of the 1990s.” Journal of Monetary Economics 54 (March): 413–32.

Jermann, Urban, and Vincenzo Quadrini. Forthcoming. “Macroeconomic Effects of Financial Shocks.” American Economic Review.

Kehoe, Timothy J., and David K. Levine. 1993. “Debt Constrained Asset Markets.” Review of Economic Studies 60 (October): 865–88.

Khan, Aubhik, and Julia K. Thomas. 2011. “Credit Shocks and Aggregate Fluctuations in an Economy with Production Heterogeneity.” Manuscript, Ohio State University.

Kiyotaki, Nobuhiro, and John H. Moore. 1997. “Credit Cycles.” Journal of Political Economy 105 (April): 211–48.

Kiyotaki, Nobuhiro, and John H. Moore. 2008. “Liquidity, Business Cycles, and Monetary Policy.” Manuscript, Princeton University and Edinburgh University.

Kocherlakota, Narayana R. 2000. “Creating Business Cycles Through Credit Constraints.” Federal Reserve Bank of Minneapolis Quarterly Review 24 (Summer): 2–10.

Kocherlakota, Narayana R. 2009. “Bursting Bubbles: Consequences and Cures.” Manuscript, Federal Reserve Bank of Minneapolis.

Krusell, Per, and Anthony A. Smith, Jr. 1998. “Income and Wealth Heterogeneity in the Macroeconomy.” Journal of Political Economy 106 (October): 867–96.

Liu, Zheng, Pengfei Wang, and Tao Zha. 2011. “Land-Price Dynamics and Macroeconomic Fluctuations.” Working Paper 17045. Cambridge, Mass.: National Bureau of Economic Research (May).

Martin, Alberto, and Jaume Ventura. 2011. “Economic Growth with Bubbles.” Manuscript, Pompeu Fabra University.

Mendoza, Enrique G. 2010. “Sudden Stops, Financial Crises, and Leverage.” American Economic Review 100 (December): 1,941–66.

Mendoza, Enrique G., and Vincenzo Quadrini. 2010. “Financial Globalization, Financial Crises and Contagion.” Journal of Monetary Economics 57 (January): 24–39.

Miao, Jianjun, and Pengfei Wang. 2010.
“Credit Risk and Business Cycles.” Manuscript, Boston University and Hong Kong University of Science and Technology.

Modigliani, Franco, and Merton H. Miller. 1958. “The Cost of Capital, Corporate Finance and the Theory of Investment.” American Economic Review 48 (June): 261–97.

Monacelli, Tommaso, Vincenzo Quadrini, and Antonella Trigari. 2010. “Financial Markets and Unemployment.” Manuscript, Bocconi University and University of Southern California.

Neumeyer, Pablo A., and Fabrizio Perri. 2005. “Business Cycles in Emerging Economies: The Role of Interest Rates.” Journal of Monetary Economics 52 (March): 345–80.

Perri, Fabrizio, and Vincenzo Quadrini. 2008. “Understanding the International Great Moderation.” Manuscript, University of Minnesota and University of Southern California.

Perri, Fabrizio, and Vincenzo Quadrini. 2011. “International Recessions.” Working Paper 17201. Cambridge, Mass.: National Bureau of Economic Research (July).

Petrosky-Nadeau, Nicolas. 2009. “Credit, Vacancies and Unemployment Fluctuations.” Manuscript, Carnegie-Mellon University.

Petrosky-Nadeau, Nicolas, and Etienne Wasmer. 2010. “The Cyclical Volatility of Labor Markets under Frictional Financial Markets.” Manuscript, Carnegie-Mellon University and Sciences Po Paris.

Smets, Frank, and Rafael Wouters. 2007. “Shocks and Frictions in US Business Cycles: A Bayesian DSGE Approach.” American Economic Review 97 (June): 586–606.

Suarez, Javier, and Oren Sussman. 1997. “Endogenous Cycles in a Stiglitz-Weiss Economy.” Journal of Economic Theory 76 (September): 47–71.

Townsend, Robert M. 1979. “Optimal Contracts and Competitive Markets with Costly State Verification.” Journal of Economic Theory 21 (October): 265–93.

Wang, Pengfei, and Yi Wen. Forthcoming. “Hayashi Meets Kiyotaki and Moore: A Theory of Capital Adjustment Costs.” Review of Economic Dynamics.

Weil, Philippe, and Etienne Wasmer. 2004.
“The Macroeconomics of Labor and Credit Market Imperfections.” American Economic Review 94 (September): 944–63.

Economic Quarterly—Volume 97, Number 3—Third Quarter 2011—Pages 255–326

Macroeconomics with Heterogeneity: A Practical Guide

Fatih Guvenen

What is the origin of inequality among men and is it authorized by natural law?
—Academy of Dijon, 1754 (Theme for essay competition)

The quest for the origins of inequality has kept philosophers and scientists occupied for centuries. A central question of interest—also highlighted in the Academy of Dijon’s solicitation for its essay competition1—is whether inequality is determined solely through a natural process or through the interaction of innate differences with man-made institutions and policies. And, if it is the latter, what is the precise relationship between these origins and socioeconomic policies? While many interesting ideas and hypotheses have been put forward over time, the main impediment to progress came from the difficulty of scientifically testing these hypotheses, which would allow researchers to refine ideas that were deemed promising and discard those that were not. Economists, who grapple with the same questions today, have three important advantages that can allow us to make progress. First, modern quantitative economics provides a wide set of powerful tools, which allow researchers to build “laboratories” in which various hypotheses regarding the origins and consequences of inequality can be studied. Second, the widespread availability of rich microdata sources—from cross-sectional surveys to panel data sets from administrative records that contain millions of observations—provides fresh input into these laboratories. Third, thanks to Moore’s law, the cost of computation has fallen radically in the past decades, making it feasible to numerically solve, simulate, and estimate complex models with rich heterogeneity on a typical desktop workstation available to most economists.

For helpful discussions, the author thanks Dean Corbae, Cristina De Nardi, Per Krusell, Serdar Ozkan, and Tony Smith. Special thanks to Andreas Hornstein and Kartik Athreya for detailed comments on the draft. David Wiczer and Cloe Ortiz de Mendivil provided excellent research assistance. The views expressed herein are those of the author and not necessarily those of the Federal Reserve Bank of Chicago, the Federal Reserve Bank of Richmond, or the Federal Reserve System. Guvenen is affiliated with the University of Minnesota, the Federal Reserve Bank of Chicago, and NBER. E-mail: guvenen@umn.edu.

1 The competition generated broad interest from scholars of the time, including Jean-Jacques Rousseau, who wrote his famous Discourse on the Origins of Inequality in response, but failed to win the top prize.

There are two broad sets of economic questions for which economists might want to model heterogeneity. First, and most obviously, these models allow us to study cross-sectional, or distributional, phenomena. The U.S. economy today provides ample motivation for studying distributional issues, with the top 1 percent of households owning almost half of all stocks and one-third of all net worth in the United States, and wage inequality having risen virtually without interruption for the last 40 years. Not surprisingly, many questions of current policy debate are inherently about distributional consequences. For example, heated disagreements about major budget issues—such as reforming Medicare, Medicaid, and the Social Security system—often revolve around the redistributional effects of such changes. Similarly, a crucial aspect of the current debate on taxation is about “who should pay what?” Answering these questions would begin with a sound understanding of the fundamental determinants of different types of inequality.
A second set of questions for which heterogeneity could matter involves aggregate phenomena. This second use of heterogeneous-agent models is less obvious than the first, because various aggregation theorems as well as numerical results (e.g., Ríos-Rull [1996] and Krusell and Smith [1998]) have established that certain types of heterogeneity do not change (many) implications relative to a representative-agent model.2 To understand this result and its ramifications, in Section 1 I start by reviewing some key theoretical results on aggregation (Rubinstein 1974; Constantinides 1982). Our interest in these theorems comes from a practical concern: A subset of the conditions required by these theorems is often satisfied in heterogeneous-agent models, making the aggregate implications of such models closely mimic those of a representative-agent economy. For example, an important theorem proved by Constantinides (1982) establishes the existence of a representative agent if markets are complete.3 This central role of complete markets has, since the late 1980s, turned the spotlight onto its testable implications for perfect risk sharing (or "full insurance"). As I review in Section 2, these implications have been tested by an extensive literature using data sets from all around the world—from developed countries such as the United States to village economies in India, Thailand, Uganda, and elsewhere.

2 These aggregation results do not imply that all aspects of a representative-agent model will be the same as those of the underlying individual problem. I discuss important examples to the contrary in Section 6.

3 (Financial) markets are "complete" when agents have access to a sufficiently rich set of assets that allows them to transfer their wealth/resources across any two dates and/or states of the world.
While this literature delivered a clear statistical rejection, it also revealed a surprising amount of "partial" insurance, in the sense that individual consumption growth (or, more generally, marginal utility growth) does not seem to respond to many seemingly large shocks, such as long spells of unemployment, strikes, and involuntary moves (Cochrane [1991] and Townsend [1994], among others). This raises the more practical question of how far we are from the complete markets benchmark. To answer this question, researchers have recently turned to directly measuring the degree of partial insurance, defined for our purposes as the degree of consumption smoothing over and above what an individual can achieve on her own via "self-insurance" in a permanent income model (i.e., using a single risk-free asset for borrowing and saving). Although this literature is quite new—and so a definitive answer is not yet at hand—it is likely to remain an active area of research in the coming years.

The empirical rejection of the complete markets hypothesis launched an enormous literature on incomplete markets models starting in the early 1990s, which I discuss in Section 3. Starting with Imrohoroglu (1989), Huggett (1993), and Aiyagari (1994), this literature has addressed issues from a very broad spectrum, covering topics as diverse as the equity premium and other puzzles in finance; important life-cycle choices, such as education, marriage/divorce, housing purchases, and fertility; and the aggregate and distributional effects of policies ranging from capital and labor income taxation to the overhaul of Social Security and the reform of the health care system. An especially important set of applications concerns trends in wealth, consumption, and earnings inequality. These are discussed in Section 4.
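The "self-insurance" benchmark mentioned above, in which an individual smooths income shocks using only a single risk-free asset, can be sketched with a certainty-equivalence permanent income example. The sketch below is my own illustration (all parameter values hypothetical, not taken from the article): with i.i.d. transitory income shocks, the permanent-income consumer annuitizes each shock, so consumption growth is far less volatile than income growth.

```python
import random

# Permanent-income consumer with one risk-free asset (certainty equivalence).
# Income is i.i.d.: y_t = mu + eps_t. The optimal plan annuitizes wealth plus
# expected human wealth, which reduces to: c_t = mu + r/(1+r) * (a_t + eps_t).
random.seed(0)
r, mu, sigma, T = 0.04, 1.0, 0.2, 20_000

a, c_path, y_path = 0.0, [], []
for _ in range(T):
    eps = random.gauss(0.0, sigma)
    y = mu + eps
    c = mu + r / (1 + r) * (a + eps)      # permanent-income consumption rule
    a = (1 + r) * (a + y - c)             # budget constraint
    y_path.append(y)
    c_path.append(c)

def var_diff(x):
    """Variance of first differences of a series."""
    d = [x[t] - x[t - 1] for t in range(1, len(x))]
    m = sum(d) / len(d)
    return sum((v - m) ** 2 for v in d) / len(d)

# Transitory shocks are almost fully smoothed: only the annuity value
# r/(1+r) of each shock passes through to consumption.
ratio = var_diff(c_path) / var_diff(y_path)
print(f"var(dc)/var(dy) = {ratio:.4f}")   # tiny: roughly (r/(1+r))^2 / 2
```

Under this rule assets follow a random walk (a_{t+1} = a_t + eps_t), and the pass-through of a transitory shock to consumption is only r/(1+r), which is why self-insurance alone already delivers substantial smoothing over long horizons.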
A critical prerequisite for these analyses is disentangling "ex ante heterogeneity" from "risk/uncertainty" (also called ex post heterogeneity)—two sides of the same coin, with potentially very different implications for policy and welfare. This is a challenging task, because inequality often arises from a mixture of heterogeneity and idiosyncratic risk, making the two difficult to tell apart; it requires researchers to carefully combine cross-sectional information with sufficiently long time-series data. The state-of-the-art methods used in this field increasingly blend the tools developed by quantitative macroeconomists with those used by structural econometricians. Despite the application of these sophisticated tools, there remains significant uncertainty in the profession regarding the magnitudes of idiosyncratic risks, as well as whether or not these risks have increased since the 1970s.

The Imrohoroglu-Huggett-Aiyagari framework sidestepped a difficult issue raised by the lack of aggregation—that aggregates, including prices, depend on the entire wealth distribution. This was accomplished by abstracting from aggregate shocks, which allowed these authors to focus on stationary equilibria in which prices (the interest rate and the average wage) are simply constants to be solved for in equilibrium. A far more challenging problem arises when incomplete markets are combined with aggregate shocks, in which case equilibrium prices become functions of the entire wealth distribution, which varies with the aggregate state. Individuals need to know these equilibrium functions so that they can forecast how prices will evolve as the aggregate state evolves stochastically. Because the wealth distribution is an infinite-dimensional object, an exact solution is typically not feasible.
Krusell and Smith (1998) proposed a solution whereby one approximates the wealth distribution with a finite number of its moments (inspired by the idea that a given probability distribution can be represented by its moment-generating function). In a remarkable finding, they showed that the first moment (the mean) of the wealth distribution was all that individuals needed to track in this economy in order to predict all future prices. This result—generally known as "approximate aggregation"—is a double-edged sword. On the one hand, it makes feasible the solution of a wide range of interesting models with incomplete markets and aggregate shocks. On the other hand, it suggests that ex post heterogeneity does not often generate aggregate implications much different from those of a representative-agent model. So the hope that some aggregate phenomena that were puzzling in representative-agent models could be explained in an incomplete markets framework is weakened by this result. While this is an important finding, there are many examples where heterogeneity does affect aggregates in a significant way; I discuss a variety of such examples in Section 6.

Finally, I turn to computation and calibration. First, in Section 5, I discuss some details of the Krusell-Smith method; a number of potential pitfalls are discussed and alternative checks of accuracy are studied. Second, an important practical issue arises when calibrating/estimating large and complex quantitative models: The objective function we minimize often exhibits jaggedness, small jumps, and/or deep ridges, for a variety of reasons having to do with approximations, interpolations, binding constraints, and so on. Thus, local optimization methods are typically of little help on their own, because they very often get stuck in local minima.
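The trouble that jagged objectives cause for purely local methods can be seen in a toy one-dimensional example. The construction below is entirely my own (a smooth bowl plus high-frequency wiggles, not the article's objective or its Section 7 algorithm): a naive downhill search stalls at a spurious local minimum, while polishing many random starts and keeping the best gets close to the global one.

```python
import math
import random

# A smooth bowl plus high-frequency wiggles: many spurious local minima.
def f(x):
    return (x - 2.0) ** 2 + 0.5 * math.sin(50.0 * x)

def local_descent(x, h=0.01, max_iter=10_000):
    """Naive pattern search: step downhill until neither neighbor improves."""
    for _ in range(max_iter):
        if f(x + h) < f(x):
            x += h
        elif f(x - h) < f(x):
            x -= h
        else:
            break                      # stuck at a (possibly spurious) minimum
    return x

random.seed(1)
x_local = local_descent(-3.0)          # a single local run from a bad start
starts = [random.uniform(-5.0, 5.0) for _ in range(2000)]
x_multi = min((local_descent(s) for s in starts), key=f)

print(f"local search from -3: x = {x_local:.3f}, f = {f(x_local):.3f}")
print(f"multistart:           x = {x_multi:.3f}, f = {f(x_multi):.3f}")
```

The single local run halts almost immediately, because the wiggles create a discrete local minimum near the starting point; the multistart version is trivially parallelizable, which is the property the global algorithm described next exploits.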
In Section 7, I describe a global optimization algorithm that is simple yet powerful and is fully parallelizable without requiring any knowledge of MPI, OpenMP, and so on. It works on any number of computers that are connected to the Internet and have access to a synchronization service such as Dropbox. I also discuss ways to customize the algorithm and options to experiment with.

1. AGGREGATION

Even in a simple static model with no uncertainty, we need a way to deal with consumer heterogeneity. Adding dynamics and risk to this environment makes things more complex and requires a different set of conditions to be imposed. In this section, I review some key theoretical results on various forms of aggregation. I begin with a very simple framework and build up to a fully dynamic model with idiosyncratic (i.e., individual-specific) risk, and I discuss what types of aggregation results one can hope to get and under what conditions.

Our interest in aggregation is not mainly theoretical. As we shall see, some of the conditions required for aggregation are satisfied (sometimes inadvertently!) by commonly used heterogeneous-agent frameworks, making them behave very much like a representative-agent model. Although this often makes a model easier to solve numerically, it can also make its implications "boring"—i.e., too similar to those of a representative-agent model. Thus, learning the assumptions underlying the aggregation theorems can help model builders choose the features of their models carefully so as to avoid such outcomes.

A Static Economy

Consider a finite set I (with cardinality I) of consumers who differ in their preferences (over l types of goods) and wealth in a static environment. Consider a particular good and let x_i(p, w_i) denote the demand function of consumer i for this good, given prices p ∈ R^l and wealth w_i.
Let (w_1, w_2, ..., w_I) be the vector of wealth levels of all I consumers. "Aggregate demand" in this economy can be written as

    x(p, w_1, w_2, ..., w_I) = \sum_{i=1}^{I} x_i(p, w_i).

As seen here, the aggregate demand function x depends on the entire wealth distribution, which is a formidable object to deal with. The key question, then, is: When can we write x(p, w_1, w_2, ..., w_I) ≡ x(p, \sum_i w_i)? For the wealth distribution not to matter, aggregate demand must not change under any redistribution of wealth that keeps aggregate wealth constant (\sum_i dw_i = 0). Taking the total derivative of x and setting it to zero yields

    \sum_{i=1}^{I} \frac{\partial x_i(p, w_i)}{\partial w_i} \, dw_i = 0

for all possible redistributions. This will only be true if

    \frac{\partial x_i(p, w_i)}{\partial w_i} = \frac{\partial x_j(p, w_j)}{\partial w_j}, \quad ∀ i, j ∈ I.

Thus, the key condition for aggregation is that individuals have the same marginal propensity to consume (MPC) out of wealth (equivalently, linear Engel curves). In one of the earliest works on aggregation, Gorman (1961) formalized this idea via restrictions on consumers' indirect utility functions that deliver the required linearity of Engel curves.

Theorem 1 (Gorman 1961) Consider an economy with N < ∞ commodities and a set I of consumers. Suppose that the preferences of each consumer i ∈ I can be represented by an indirect utility function4 of the form

    v_i(p, w_i) = a_i(p) + b(p) w_i,

and that each household i ∈ I has a positive demand for each commodity. Then these preferences can be aggregated and represented by those of a representative household with indirect utility

    v(p, w) = a(p) + b(p) w,

where a(p) = \sum_i a_i(p) and w = \sum_i w_i is aggregate income.

As we shall see later, the importance of linear Engel curves (or constant MPCs) for aggregation is a key insight that carries over to much more general models, all the way up to the infinite-horizon incomplete markets model with aggregate shocks studied in Krusell and Smith (1998).
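The role of the equal-MPC condition can be checked numerically. In the sketch below (hypothetical functional forms and numbers, chosen only for illustration), each consumer's demand has the Gorman form a_i(p) + b(p)·w_i; aggregate demand is then invariant to redistributions of wealth, and the invariance breaks as soon as MPCs differ across consumers.

```python
# Gorman-form individual demands for one good at a fixed price p:
#   x_i(p, w) = a_i(p) + b(p) * w  -- identical wealth slope b(p) across i.
p = 2.0
a = [0.5, 1.2, 0.8, 2.0]            # consumer-specific intercepts a_i(p)
b = lambda price: 1.0 / (1.0 + price)   # common MPC out of wealth, b(p)

def aggregate_demand(wealths, slopes):
    # Sums a_i(p) + slope_i * w_i over consumers (uses the global intercepts a).
    return sum(ai + si * wi for ai, si, wi in zip(a, slopes, wealths))

w1 = [10.0, 20.0, 5.0, 65.0]        # total wealth = 100
w2 = [25.0, 25.0, 25.0, 25.0]       # a mean-preserving redistribution

same_mpc = [b(p)] * 4
print(aggregate_demand(w1, same_mpc))   # identical:
print(aggregate_demand(w2, same_mpc))   # the distribution is irrelevant

diff_mpc = [0.1, 0.2, 0.3, 0.4]     # heterogeneous MPCs break aggregation
print(aggregate_demand(w1, diff_mpc))
print(aggregate_demand(w2, diff_mpc))   # differs from the line above
```

With a common slope, the wealth terms collapse to b(p) times total wealth, exactly the condition derived above; with heterogeneous slopes the same redistribution moves aggregate demand.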
4 Denoting the consumer's utility function over goods by U, the indirect utility function is simply v_i(p, w_i) ≡ U(x_i(p, w_i))—that is, the maximum utility of a consumer who has wealth w_i and faces price vector p.

A Dynamic Economy (No Idiosyncratic Risk)

Rubinstein (1974) extends Gorman's result to a dynamic economy in which individuals consume out of wealth (there is no income stream). Linear Engel curves are again central in this context. Consider a frictionless economy in which each individual solves an intertemporal consumption-savings/portfolio allocation problem. That is, every period, current wealth w_t is apportioned between current consumption c_t and a portfolio of a risk-free and a risky security with respective (gross) returns R_t^f and R_t^s.5 Let α_t denote the portfolio share of the risk-free asset at time t, and let δ denote the subjective time discount factor. Individuals solve

    \max_{\{c_t, α_t\}} E \sum_{t=1}^{T} δ^t U(c_t)

    s.t.  w_{t+1} = (w_t − c_t) [α_t R_t^f + (1 − α_t) R_t^s].

5 We can easily allow for multiple risky securities at the expense of complicating the notation.

Furthermore, assume that the period utility function U belongs to the hyperbolic absolute risk aversion (HARA) class, defined as the set of utility functions with linear risk tolerance:

    T(c) ≡ −U′(c)/U″(c) = ρ + γc,  with γ < 1.6

This class encompasses three utility functions that are well known in economics: U(c) = (γ − 1)^{−1} (ρ + γc)^{1 − γ^{−1}} (generalized power utility; the standard constant relative risk aversion [CRRA] form when ρ ≡ 0); U(c) = −ρ exp(−c/ρ) when γ ≡ 0 (exponential utility); and U(c) = −0.5 (ρ − c)^2, defined for c < ρ (quadratic utility). The following theorem gives six sets of conditions under which aggregation obtains.7

6 "Risk tolerance" is the reciprocal of the Arrow-Pratt measure of "absolute risk aversion," which measures consumers' willingness to bear a fixed amount of consumption risk. See, e.g., Pratt (1964).

7 The language of Theorem 2 differs from Rubinstein's original statement by assuming rational expectations, and it combines his results with the extension to a multiperiod setting in his footnote 5.

Theorem 2 (Rubinstein 1974) Consider the following homogeneity conditions:

1. All individuals have the same resources w_0 and tastes δ and U.
2. All individuals have the same δ and taste parameter γ = 0.
3. All individuals have the same taste parameter γ = 0.
4. All individuals have the same resources w_0 and taste parameters ρ = 0 and γ = 1.
5. A complete market exists and all individuals have the same taste parameter γ = 0.
6. A complete market exists and all individuals have the same resources w_0 and tastes δ, ρ = 0, and γ = 1.

Then all equilibrium rates of return are determined in case (1) as if there exist only composite individuals, each with resources w_0 and tastes δ and U; and equilibrium rates of return are determined in cases (2)–(6) as if there exist only composite individuals with the following economic characteristics: (i) resources: w_0 = \sum_i w_0^i / I; (ii) tastes: σ = \sum_i σ^i (ρ^i / \sum_i ρ^i) (where σ ≡ 1/δ − 1), or δ = \sum_i δ^i / I; and (iii) preference parameters: ρ = \sum_i ρ^i / I and γ.

Several remarks are in order.

Demand Aggregation

An important corollary to this theorem is that whenever a composite consumer can be constructed, equilibrium rates of return are insensitive to the distribution of resources among individuals. This is because the aggregate demand functions (for both consumption and assets) depend only on total wealth and not on its distribution. Thus, we have "demand aggregation."

Aggregation and Heterogeneity in Relative Risk Aversion

Notice that all six cases that give rise to demand aggregation in the theorem require individuals to have the same curvature parameter, γ.
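The linear-risk-tolerance property that defines the HARA class can be verified numerically for its three well-known members. The sketch below (illustrative parameter values of my choosing) computes T(c) = −U′(c)/U″(c) by central finite differences and compares it with ρ + γc.

```python
import math

# Verify T(c) = -U'(c)/U''(c) = rho + gamma*c for three HARA utilities,
# using central finite differences (illustrative parameter values).
def risk_tolerance(U, c, h=1e-4):
    up = (U(c + h) - U(c - h)) / (2 * h)                # approximates U'(c)
    upp = (U(c + h) - 2 * U(c) + U(c - h)) / h ** 2     # approximates U''(c)
    return -up / upp

rho = 2.0
cases = {
    # generalized power: U(c) = (gamma - 1)^(-1) (rho + gamma*c)^(1 - 1/gamma)
    "power (gamma = 0.5)":
        (0.5, lambda c: (0.5 - 1) ** -1 * (rho + 0.5 * c) ** (1 - 1 / 0.5)),
    # exponential (gamma = 0): U(c) = -rho * exp(-c/rho)
    "exponential (gamma = 0)":
        (0.0, lambda c: -rho * math.exp(-c / rho)),
    # quadratic (gamma = -1): U(c) = -0.5 (rho - c)^2, for c < rho
    "quadratic (gamma = -1)":
        (-1.0, lambda c: -0.5 * (rho - c) ** 2),
}

c = 1.0
for name, (gamma, U) in cases.items():
    print(f"{name}: T(c) = {risk_tolerance(U, c):.4f}  "
          f"vs rho + gamma*c = {rho + gamma * c:.4f}")
```

Note that quadratic utility corresponds to γ = −1 in the T(c) = ρ + γc parameterization (risk tolerance ρ − c falls with consumption), while exponential utility has constant risk tolerance ρ.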
To see why this is important, note that (with HARA preferences) the optimal holdings of the risky asset are a linear function of the consumer's wealth: κ_1 + (κ_2/γ) w_t, where κ_1 and κ_2 are constants that depend on the properties of returns. It is easy to see that with identical slopes, κ_2/γ, it does not matter who holds the wealth: Redistributing wealth between any two agents causes changes in their individual asset demands that cancel each other out, because the demands are linear with the same slope. Notice also that while identical curvature is necessary for demand aggregation, it is not sufficient: Each of the six cases adds further conditions on top of the identical-curvature requirement.8

A Dynamic Economy (With Idiosyncratic Risk)

While Rubinstein's (1974) theorem delivers a strong aggregation result, it achieves this by abstracting from a key aspect of dynamic economies: uncertainty that evolves over time. Almost every interesting economy discussed in the coming sections features some kind of idiosyncratic risk that individuals face (coming from labor income fluctuations, shocks to health, shocks to housing prices and asset returns, among others). Rubinstein's (1974) theorem is silent about how the aggregate economy behaves in these scenarios. This is where Constantinides (1982) comes into play: He shows that if markets are complete, then under much weaker conditions (on preferences, beliefs, discount rates, etc.) one can replace heterogeneous consumers with a planner who maximizes a weighted sum of consumers' utilities. In turn, the central planner can be replaced by a composite consumer who maximizes a utility function of aggregate consumption. To show this, consider a private ownership economy with production as in Debreu (1959), with m consumers, n firms, and l commodities.
As in Debreu (1959), these commodities can be thought of as date-event labelled goods (and the concave utility functions U_i as being defined over these goods), allowing us to map these results into an intertemporal economy with uncertainty.

8 Notice also that, because in some cases (such as [2]) heterogeneity in ρ is allowed, individuals will exhibit different relative risk aversions if they have different w_t (for example, in the generalized CRRA case) and still allow aggregation.

Consumer i is endowed with wealth (w_{i1}, w_{i2}, ..., w_{il}) and shares of firms (θ_{i1}, θ_{i2}, ..., θ_{in}), with θ_{ij} ≥ 0 and \sum_{i=1}^{m} θ_{ij} = 1. Let C_i and Y_j denote, respectively, individual i's consumption set and firm j's production set. An equilibrium is an (m + n + 1)-tuple ((c_i^*)_{i=1}^{m}, (y_j^*)_{j=1}^{n}, p^*) such that, as usual, consumers maximize utility, firms maximize profits, and markets clear. Under standard assumptions, an equilibrium exists and is Pareto optimal. Optimality implies that there exist positive numbers λ_i, i = 1, ..., m, such that the solution to the following problem (P1),

    \max_{c, y} \sum_{i=1}^{m} λ_i U_i(c_i)    (P1)

    s.t.  y_j ∈ Y_j, j = 1, 2, ..., n;   c_i ∈ C_i, i = 1, 2, ..., m;
          \sum_{i=1}^{m} c_{ih} = \sum_{j=1}^{n} y_{jh} + \sum_{i=1}^{m} w_{ih},  h = 1, 2, ..., l

(where h indexes commodities), is given by (c_i) = (c_i^*) and (y_j) = (y_j^*).

Let aggregate consumption be z ≡ (z_1, ..., z_l), with z_h ≡ \sum_{i=1}^{m} c_{ih}. Now, for a given z, consider the problem (P2) of efficiently allocating it across consumers:

    U(z) ≡ \max_{c} \sum_{i=1}^{m} λ_i U_i(c_i)    (P2)

    s.t.  c_i ∈ C_i, i = 1, 2, ..., m;   \sum_{i=1}^{m} c_{ih} = z_h,  h = 1, 2, ..., l.

Now, given the production sets of the firms and the aggregate endowment of each commodity, consider the optimal production decision (P3):

    \max_{y, z} U(z)    (P3)

    s.t.  y_j ∈ Y_j, ∀j;   z_h = \sum_{j} y_{jh} + w_h, ∀h.

Theorem 3 (Constantinides [1982, Lemma 1]) (a) The solution to (P3) is (y_j) = (y_j^*) and z_h = \sum_{j=1}^{n} y_{jh}^* + w_h, ∀h.
(b) U(z) is increasing and concave in z.

(c) If z_h = \sum_j y_{jh}^* + w_h, ∀h, then the solution to (P2) is (c_i) = (c_i^*).

(d) Given λ_i, i = 1, 2, ..., m, if the m consumers are replaced by one composite consumer with utility U(z), whose endowment equals the sum of the m consumers' endowments and whose shares equal the sum of their shares, then the (1 + n + 1)-tuple (\sum_{i=1}^{m} c_i^*, (y_j^*)_{j=1}^{n}, p^*) is an equilibrium.

Constantinides versus Rubinstein

Constantinides allows for much more generality than Rubinstein by relaxing two important restrictions. First, no conditions are imposed on the homogeneity of preferences, which was a crucial element in every version of Rubinstein's theorem. Second, Constantinides allows for both exogenous endowments and production at every date and state. In contrast, recall that in Rubinstein's environment individuals start life with a stock of wealth and receive no further income or endowment during life. In exchange, Constantinides requires complete markets and does not obtain demand aggregation.

Notice that the existence of a composite consumer does not imply demand aggregation, for at least two reasons. First, composite demand depends on the weights in the planner's problem and thus on the distribution of endowments. Second, the composite consumer is defined at equilibrium prices, and there is no presumption that its demand curve is identical to the aggregate demand function. Thus, the usefulness of Constantinides's result hinges on (i) the degree to which markets are complete, (ii) whether we want to allow for idiosyncratic risk and heterogeneity in preferences (both of which are restricted in Rubinstein's theorem), and (iii) whether or not we need demand aggregation. Below I address these issues in more detail.
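The allocation problem (P2) behind the composite consumer can be illustrated with a tiny numerical example. With log utility (my illustrative assumption; Constantinides's result only needs concave utilities and complete markets), the planner gives each consumer a fixed share λ_i/Σλ of aggregate consumption, regardless of the level of z, so individual consumption moves one-for-one with the aggregate.

```python
import math

# Planner's allocation problem (P2) for two log-utility consumers:
#   max  lam1*log(c1) + lam2*log(c2)  s.t.  c1 + c2 = z,
# solved by brute force on a fine grid and compared with the closed form
# c_i = (lam_i / sum(lam)) * z. All numbers are illustrative.
lam = (1.0, 3.0)                          # Pareto weights

def planner_split(z, steps=100_000):
    """Return consumer 1's allocation that maximizes the weighted sum."""
    best_c, best_val = None, -math.inf
    for k in range(1, steps):
        c = z * k / steps
        val = lam[0] * math.log(c) + lam[1] * math.log(z - c)
        if val > best_val:
            best_c, best_val = c, val
    return best_c

for z in (1.0, 4.0, 10.0):                # different aggregate endowments
    share = planner_split(z) / z
    print(f"z = {z:4.1f}: consumer 1 share = {share:.4f} "
          f"(theory: {lam[0] / sum(lam):.4f})")
```

The constant shares are exactly the "marginal utility growth equated across individuals" property that Section 2 takes to the data: every consumer's consumption growth equals aggregate consumption growth.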
We will see that, interestingly, even when markets are not complete, in certain cases we not only get close to a composite-consumer representation but can also get quite close to the much stronger result of demand aggregation. An important reason for this outcome is that many heterogeneous-agent models assume identical preferences, which eliminates an important source of heterogeneity and satisfies Rubinstein's conditions on preferences. While these models do feature idiosyncratic risk, as we shall see, when the planning horizon is long such shocks can often be smoothed effectively using even a single risk-free asset. More on this in the coming sections.

Completing Markets by Adding Financial Assets

It is useful to distinguish between "physical" assets—those in positive net supply (e.g., equity shares, capital, housing)—and "financial" assets—those in zero net supply (bonds, insurance contracts, etc.). The latter are simply contracts written on a piece of paper that specify the conditions under which one agent transfers resources to another; in principle, they can be created at little cost. Now suppose that we live in a world with J physical assets and that there are S (> J) states of the world. In this general setting, markets are incomplete. However, if consumers have homogeneous tastes, endowments, and beliefs, then markets can be made (effectively) complete simply by adding enough financial assets (in zero net supply). There is no loss of optimality, and nothing changes as a result, because in equilibrium identical agents do not trade with each other. The bottom line is that the more "homogeneity" we are willing to assume among consumers, the less demanding the complete markets assumption becomes. This point should be kept in mind, as we will return to it later.

2. EMPIRICAL EVIDENCE ON INSURANCE

Dynamic economic models with heterogeneity typically feature individual-specific uncertainty that evolves over time—coming from fluctuations in labor earnings, health status, and portfolio returns, among others. Although this structure does not fit into Rubinstein's environment, it is covered by Constantinides's theorem, which requires complete markets. Thus, a key empirical question is the extent to which complete markets can serve as a useful benchmark and a good approximation to the world we live in. As we shall see in this section, the answer turns out to be more nuanced than a simple yes or no.

To cover the broad variety of evidence that has been brought to bear on this question, this section is structured as follows. First, I discuss a large empirical literature that has tested a key prediction of complete markets—that marginal utility growth is equated across individuals. This is often called "perfect" or "full" insurance, and it is soundly rejected in the data. Next, I discuss an alternative benchmark inspired by this rejection: the permanent income model, in which individuals have access only to borrowing and saving—or "self-insurance." In a way, this is the other extreme of the insurance spectrum. Finally, I discuss studies that take an intermediate view—"partial insurance"—and provide some evidence to support it. We now begin with the tests of full insurance.

Benchmark 1: Full Insurance

To develop the theoretical framework underlying the empirical analyses, start with an economy populated by agents who derive utility, U^i(c_t^i, d_t^i), from consumption c_t^i as well as some other good(s) d_t^i, where i indexes individuals. These other goods can include leisure time (of husband and wife if the unit of analysis is a household), children, lagged consumption (as in habit formation models), and so on. The key implication of perfect insurance can be derived by following two distinct approaches.
The first environment assumes a social planner who pools all individuals' resources and maximizes a social welfare function that assigns a positive weight to every individual. In the second environment, allocations are determined in a competitive equilibrium of a frictionless economy in which individuals are able to trade a complete set of financial securities. Both frameworks make the following strong prediction for the growth rate of individuals' marginal utilities:

    δ \frac{U_c^i(c_{t+1}^i, d_{t+1}^i)}{U_c^i(c_t^i, d_t^i)} = \frac{Λ_{t+1}}{Λ_t},    (1)

where U_c denotes the marginal utility of consumption and Λ_t is the aggregate shock.9 This condition says that every individual's marginal utility must grow in lockstep with the aggregate and, hence, with every other individual's. No individual-specific term appears on the right-hand side—no idiosyncratic income shocks, unemployment, sickness, and so on. All such idiosyncratic events are perfectly insured in this world. From here one can introduce a number of additional assumptions for empirical tractability.

Complete Markets and Cross-Sectional Heterogeneity: A Digression

So far we have focused on what market completeness implies for the study of aggregate phenomena in light of Constantinides's theorem. However, complete markets also impose restrictions on the evolution of the cross-sectional distribution, which can be seen in (1). For a given specification of U, (1) translates into restrictions on the evolution of c_t and d_t (possibly a vector). Although it is possible to choose U to be sufficiently general and flexible (e.g., to include preference shifters or assume non-separability) to generate rich dynamics in cross-sectional distributions, this strategy would attribute all the action to preferences, which are essentially unobservable.
Even in that case, models that are not bound by (1)—and therefore allow idiosyncratic shocks to affect individual allocations—can generate a much richer set of cross-sectional distributions. Whether that extra richness is necessary for explaining salient features of the data is another matter and is not always obvious (see, e.g., Caselli and Ventura [2000], Badel and Huggett [2007], and Guvenen and Kuruscu [2010]).10

9 Alternatively stated, Λ_t is the Lagrange multiplier on the aggregate resource constraint at time t in the planner's problem, or the state price density in the competitive equilibrium interpretation.

10 Caselli and Ventura (2000) show that a wide range of distributional dynamics and income mobility patterns can arise in the Cass-Koopmans optimal savings model and in the Arrow-Romer model of productivity spillovers. Badel and Huggett (2007) show that life-cycle inequality patterns (discussed later) that have been viewed as evidence of incomplete markets can in fact be generated by a complete markets model. Guvenen and Kuruscu (2010) show that a human capital model with heterogeneity in learning ability and skill-biased technical change generates rich nonmonotonic dynamics consistent with the U.S. data since the 1970s, despite featuring no idiosyncratic shocks (and thus complete markets).

Now I return to the empirical tests of (1). In a pioneering article, Altug and Miller (1990) were the first to formally test the implications of (1). They took households as their unit of analysis and specified a rich Beckerian utility function that included husbands' and wives' leisure times as well as consumption (food expenditures), adjusted for demographics (children, age, etc.). Using data from the Panel Study of Income Dynamics (PSID), they could not reject full insurance.

Hayashi, Altonji, and Kotlikoff (1996) revisited this topic a few years later and, using the same data set, rejected perfect risk sharing.11 Given this rejection in the whole population, they investigated whether there might be better insurance within families, whose members presumably have closer ties with each other than the population at large and could therefore provide insurance to members in need. They found that this hypothesis, too, was statistically rejected.12

11 Data sets such as the PSID are known to go through regular revisions, which might account for the discrepancy between the two articles' results.

12 This finding has implications for modeling the household decision-making process as a unitary model as opposed to one in which there is bargaining between spouses.

In a similar vein, Guvenen (2007a) investigates how the extent of risk sharing varies across wealth groups, such as stockholders and nonstockholders. This question is motivated by the observation that stockholders (who made up less than 20 percent of the population for much of the 20th century) own about 80 percent of net worth and 90 percent of financial wealth in the U.S. economy, and therefore play a disproportionately large role in the determination of macroeconomic aggregates. On the one hand, these wealthy individuals have access to a wide range of financial securities that could presumably allow better insurance of risks; on the other hand, they are exposed to risks not faced by the less-wealthy nonstockholders. Using data from the PSID, he strongly rejects perfect risk sharing among stockholders but, perhaps surprisingly, does not find evidence against it among nonstockholders. This finding suggests further focus on risk factors that primarily affect the wealthy, such as entrepreneurial income risk, which is concentrated at the top of the wealth distribution.

A number of other articles impose further assumptions before testing for risk sharing. A very common assumption is separability between c_t and d_t (for example, leisure), which leads to an equation that involves only consumption (Cochrane 1991; Nelson 1994; Attanasio and Davis 1996).13 Assuming power utility in addition to separability, we can take logs of both sides of equation (1) and then time-difference to obtain

    ΔC_{i,t} = Δλ_t,    (2)

where C_t ≡ log(c_t), ΔC_t ≡ C_t − C_{t−1}, and λ_t is an aggregate term common to all individuals. Several articles have tested this prediction by running a regression of the form

    ΔC_{i,t} = Δλ_t + β′ Z_{i,t} + ε_{i,t},    (3)

where the vector Z_{i,t} contains factors that are idiosyncratic to individual/household/group i. Perfect insurance implies that all the elements of β are equal to zero.

13 Non-separability, for example between consumption and leisure, can be allowed for if the planner is assumed to be able to transfer leisure freely across individuals. While transfers of consumption are easy to implement (through taxes and transfers), the transfer of leisure is harder to defend on empirical grounds.

Cochrane (1991), Mace (1991), and Nelson (1994) are the early studies that exploit this simple regression structure. Mace (1991) focuses on whether or not consumption responds to idiosyncratic wage shocks, i.e., Z_{i,t} = ΔW_{i,t}.14 While Mace fails to reject full insurance, Nelson (1994) later points out several issues with the treatment of the data (and of measurement error in particular) that affect Mace's results; Nelson shows that a more careful treatment of these issues results in a strong rejection. Cochrane (1991) raises a different point. He argues that studies such as Mace's, which test risk sharing by examining the response of consumption growth to income, may have low power if income changes are (at least partly) anticipated by individuals.
He instead proposes to use idiosyncratic events that are arguably harder to predict, such as plant closures, long strikes, long illnesses, and so on. Cochrane rejects full insurance for illness or involuntary job loss, but not for long spells of unemployment, strikes, or involuntary moves. Notice that a crucial assumption in all work of this kind is that none of these shocks is correlated with unmeasured factors that determine marginal utility growth. Townsend (1994) tests for risk sharing in village economies of India and concludes that, although the model is statistically rejected, full insurance provides a surprisingly good benchmark. Specifically, he finds that individual consumption co-moves with village-level consumption and is not influenced much by own income, sickness, or unemployment. Attanasio and Davis (1996) observe that equation (2) must also hold for multiyear changes in consumption and when aggregated across groups of individuals.^15 This implies, for example, that even if one group of individuals experiences faster income growth relative to another group during a 10-year period, their consumption growth must be the same. The substantial rise in the education premium in the United States (i.e., the wages of college graduates relative to high school graduates) throughout the 1980s provided a key test of perfect risk sharing.

14 Because individual wages are measured with (often substantial) error in microsurvey data sets, an ordinary least squares estimation of this regression would suffer from attenuation bias, which may lead to a failure to reject full insurance even when it is false. The articles discussed here employ different approaches to deal with this issue (such as using an instrumental variables regression or averaging across groups to average out measurement error).
15 Hayashi, Altonji, and Kotlikoff (1996) also use multiyear changes to test for risk sharing.
Contrary to this hypothesis, Attanasio and Davis (1996) find that the consumption of college graduates grew much faster than that of high school graduates during the same period, violating the premise of perfect risk sharing. Finally, Schulhofer-Wohl (2011) sheds new light on this question. He argues that if more risk-tolerant individuals self-select into occupations with more (aggregate) income risk, then the regressions in (3) used by Cochrane (1991), Nelson (1994), and others (which incorrectly assume away such correlation) will be biased toward rejecting perfect risk sharing. Using self-reported measures of risk attitudes from the Health and Retirement Study, Schulhofer-Wohl establishes such a correlation. He then develops a method to deal with this bias and, applying the corrected regression, finds that consumption growth responds very weakly to idiosyncratic shocks, implying much more risk sharing than found in these previous articles. He also shows that the coefficients estimated from this regression can be mapped into a measure of "partial insurance."

Taking Stock

As the preceding discussion makes clear, with few exceptions, empirical studies agree that perfect insurance in the whole population is strongly rejected in a statistical sense. However, this statistical rejection per se is not sufficient to conclude that complete markets is a poor benchmark for economic analysis, for two reasons. First, there seems to be a fair amount of insurance against certain types of shocks, as documented by Cochrane (1991) and Townsend (1994), and among certain groups of households, such as in some villages in less developed countries (Townsend 1994) or among nonstockholders in the United States (Guvenen 2007a). Second, the reviewed empirical evidence arguably documents statistical tests of an extreme benchmark (equation [1]) that we should not expect to hold precisely—for every household, against every shock.
Thus, with a large enough sample, statistical rejection should not be surprising.^16 What these tests do not do is tell us how "far" the economy is from the perfect insurance benchmark. In this sense, analyses such as Townsend's (1994), which identify the types of shocks that are and are not insured, are somewhat more informative than those in Altug and Miller (1990), Hayashi, Altonji, and Kotlikoff (1996), and Guvenen (2007a), which rely on model misspecification-type tests of risk sharing.

16 One view is that hypothesis tests without an explicit alternative (such as the ones discussed here) often "degenerate into elaborate rituals designed to measure the sample size" (Leamer 1983, 39).

[Figure 1: Within-Cohort Inequality over the Life Cycle. Left panel: cross-sectional variance of log income; right panel: cross-sectional variance of log consumption. Both are plotted against age (30 to 60), with variances ranging from roughly 0.1 to 0.7.]

Benchmark 2: Self-Insurance

The rejection of full consumption insurance led economists to search for other benchmark frameworks for studying individual choices under uncertainty. One of the most influential studies of this kind is Deaton and Paxson (1994), who bring a different kind of evidence to bear. They begin by documenting two empirical facts. Using microdata from the United States, United Kingdom, and Taiwan, they first document that within-cohort inequality of labor income (as measured by the variance of log income) increases substantially and almost linearly over the life cycle. Second, they document that within-cohort consumption inequality shows a very similar pattern and also rises substantially as individuals age. These two empirical facts are replicated in Figure 1 from data in Guvenen (2007b, 2009a). To understand what these patterns imply for the market structure, first consider a complete markets economy.
As we saw in the previous section, if consumption is separable from leisure and other potential determinants of marginal utility, consumption growth will be equalized across individuals, independent of any idiosyncratic shock (equation [2]). Therefore, while consumption levels may differ across individuals because of differences in permanent lifetime resources, this dispersion should not change as the cohort ages.^17 Deaton and Paxson's (1994) evidence has therefore typically been interpreted as contradicting the complete markets framework. I now turn to the details.

17 There are two obvious modifications that preserve complete markets and would be consistent with rising consumption inequality. The first is to introduce heterogeneity in time discounting. This is not very appealing because it "explains" the pattern by relying entirely on unobservable preference heterogeneity. Second, one could question the assumption of separability: If leisure is non-separable and wage inequality rises over the life cycle—which it does—then consumption inequality would also rise to keep marginal utility growth constant (even under complete markets). But this explanation also predicts that hours inequality should rise over the life cycle, a prediction that does not seem to be borne out in the data—although see Badel and Huggett (2007) for an interesting dissenting take on this point.

The Permanent Income Model

The canonical framework for self-insurance is provided by the permanent income life-cycle model, in which individuals have access only to a risk-free asset for borrowing and saving. Therefore, as opposed to full insurance, there is only "self-insurance" in this framework. Whereas the complete markets framework represents the maximum amount of insurance, the permanent income model arguably provides the lower bound on insurance (to the extent that we believe individuals have access to a savings technology, and borrowing is possible subject to some constraints). It is instructive to develop this framework in some detail, as the resulting equations will come in handy in the subsequent exposition. The framework here closely follows Hall and Mishkin (1982) and Deaton and Paxson (1994). Start with an income process with permanent and transitory shocks:

y_t = y_t^P + ε_t,
y_t^P = y_{t−1}^P + η_t.  (4)

Suppose that individuals discount the future at the rate of interest and define δ ≡ 1/(1 + r). Preferences are of quadratic utility form:

max E_0 [ −(1/2) Σ_{t=1}^T δ^t (c* − c_t)² ]
s.t. Σ_{t=1}^T δ^t (y_t − c_t) + A_0 = 0,  (5)

where c* is the bliss level and A_0 is the initial wealth level (which may be zero). This problem can be solved in closed form to obtain a consumption function. First-differencing this consumption rule yields

Δc_t = η_t + γ_t ε_t,  (6)

where γ_t ≡ 1/(Σ_{τ=0}^{T−t} δ^τ) is the annuitization factor.^18 This term is close to zero when the horizon is long and the interest rate is not too high, the well-understood implication being that the response of consumption to transitory shocks is very weak given their low annuitized value. More important: Consumption responds to permanent shocks one-for-one. Thus, consumption changes reflect permanent income changes.

For the sake of this discussion, assume that the horizon is long enough that γ_t ≈ 0 and thus Δc_t ≅ η_t. If we further assume that cov_i(c^i_{t−1}, η^i_t) = 0 (where i indexes individuals and the covariance is taken cross-sectionally), we get

var_i(c^i_t) ≅ var_i(c^i_{t−1}) + var(η_t).

So the rise in consumption inequality from age t − 1 to t is a measure of the variance of the permanent shock between those two ages.

18 Notice that the derivation of (6) requires two more pieces in addition to the Euler equation: It requires us to explicitly specify the budget constraint (5) as well as the stochastic process for income (4).
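The variance identity above is easy to verify by simulation. Here is a minimal Python sketch under the assumption of a long horizon (so that γ_t ≈ 0 and consumption moves one-for-one with permanent shocks); the shock volatility is an illustrative choice, not an estimate:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 50_000, 40                  # households, ages (illustrative)
sig_eta = 0.1                      # std of permanent shocks (assumed)

# Long horizon: gamma_t ~ 0, so by equation (6) consumption responds
# only to permanent shocks: c_t = c_{t-1} + eta_t
eta = rng.normal(0, sig_eta, (N, T))
c = np.cumsum(eta, axis=1)

var_c = c.var(axis=0)              # cross-sectional consumption inequality by age
growth = np.diff(var_c)            # age-to-age rise in inequality
print(growth.mean())               # close to sig_eta**2 = 0.01
```

Plotted against age, var_i(c_t) traces out a nearly straight line with slope var(η), which is the logic behind reading the slope of the consumption inequality profile as a measure of permanent-shock risk.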
Since, as seen in Figure 1, consumption inequality rises significantly and almost linearly, this figure is consistent with permanent shocks to income that are fully accommodated as predicted by the permanent income model.

Deaton and Paxson's Striking Conclusion

Based on this evidence, Deaton and Paxson (1994) argue that the permanent income model is a better benchmark for studying individual allocations than is complete markets. Storesletten, Telmer, and Yaron (2004a) go one step further and show that a calibrated life-cycle model with incomplete markets can be quantitatively consistent with the rise in consumption inequality as long as income shocks are sufficiently persistent (ρ ≳ 0.90). In his presidential address to the American Economic Association, Robert Lucas (2003, 10) succinctly summarized this view: "The fanning out over time of the earnings and consumption distributions within a cohort that Deaton and Paxson [1994] document is striking evidence of a sizable, uninsurable random walk component in earnings." This conclusion was shared by the bulk of the profession in the 1990s and 2000s, giving a strong impetus to the development of incomplete markets models featuring large and persistent shocks that are uninsurable. I review many of these models in Sections 3 and 4. However, a number of recent articles have revisited the original Deaton-Paxson finding and have reached a different conclusion.

Reassessing the Facts: An Opposite Conclusion

Four of these articles, by and large, follow the same methodology as described and implemented by Deaton and Paxson (1994), but each uses a data set that extends the original Consumer Expenditure Survey (CE) sample used by these authors (which covered 1980–1990), and they differ somewhat in their sample selection strategies. Specifically, Primiceri and van Rens (2009, Figure 2) use data from 1980–2000; Heathcote, Perri, and Violante (2010, Figure 14) use the 1980–1998 sample; Guvenen and Smith (2009, Figure 11) use the 1980–1992 sample and augment it with the 1972–73 sample; and Kaplan (2010, Figure 2) uses data from 1980–2003. Whereas Deaton and Paxson (1994, Figures 4 and 8) and Storesletten, Telmer, and Yaron (2004a, Figure 1) document a rise in consumption inequality of about 30 log points (between ages 25 and 65), these four articles find a much smaller rise of about 5–7 log points.

Taking Stock

Taken together, these re-analyses of CE data reveal that Deaton and Paxson's (1994) earlier conclusion is not robust to small changes in the sample period studied. Although more work on this topic certainly seems warranted,^19 these recent studies raise substantial concerns about one of the key pieces of empirical evidence on the extent of market incompleteness. A small rise in consumption inequality is hard to reconcile with the combination of large permanent shocks and self-insurance. Hence, if this latter view is correct, either income shocks are not as permanent as we thought or there is insurance above and beyond self-insurance. Both of these possibilities are discussed next.

An Intermediate Case: Partial Insurance

A natural intermediate case to consider is an environment between the two extremes of full insurance and self-insurance. That is, perhaps individuals have access to various sources of insurance (e.g., through charities, help from family and relatives, etc.) in addition to borrowing and saving, but these forms of insurance still fall short of full insurance. If this is the case, is there a way to properly measure the degree of this "partial insurance"? To address this question, Blundell, Pistaferri, and Preston (2008) examine the response of consumption to innovations in income. They start with equation (6) derived by Hall and Mishkin (1982), which links consumption changes to income innovations, and modify it by introducing two parameters—θ and φ—to encompass a variety of different scenarios:

Δc_t = θ η_t + φ γ_t ε_t.
(7)

Now, at one extreme is the self-insurance model (i.e., the permanent income model): θ = φ = 1; at the other extreme is a model with full insurance: θ = φ = 0. Values of θ and φ between zero and one can be interpreted as the degree of partial insurance—the lower the value, the more insurance there is.

19 For example, as Attanasio, Battistin, and Ichimura (2007) show, the facts regarding the rise in consumption inequality over time are sensitive to whether one uses the "recall survey" or the "diary survey" in the CE data set. All the articles discussed in this section (on consumption inequality over the life cycle, including Deaton and Paxson [1994]) use the recall survey data. It would be interesting to see if the diary survey alters the conclusions regarding consumption inequality over the life cycle.

In their baseline analysis, Blundell, Pistaferri, and Preston (2008) estimate θ ≈ 2/3 and find that it does not vary significantly over the sample period.^20 They interpret this estimate of θ to imply that about 1/3 of permanent shocks are insured above and beyond what can be achieved through self-insurance.^21 A couple of remarks are in order. First, the derivation of equation (6), which forms the basis of the empirical analysis here, requires quadratic preferences. Indeed, this was the maintained assumption in Hall and Mishkin (1982) and Deaton and Paxson (1994). Blundell, Pistaferri, and Preston (2008) show that one can derive, as an approximation, an analogous equation (7) with CRRA utility and self-insurance, but now θ = φ ≈ π_{i,t}, where π_{i,t} is the ratio of human wealth to total wealth. In other words, the coefficients θ and φ are both equal to one under self-insurance only if preferences are of quadratic form; generalizing to CRRA predicts that, even with self-insurance, the response to permanent shocks, given by π_{i,t}, will be less than one-for-one if non-human wealth is positive.
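To illustrate how a coefficient like θ can be recovered from panel data, here is a simplified Python sketch in the spirit of Blundell, Pistaferri, and Preston's (2008) approach. The covariance-ratio moment below is one common textbook rendering of their identification idea, and all parameter values are assumptions chosen for the simulation, not their estimates:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 100_000, 12                    # households, years (illustrative)
theta = 2 / 3                         # true pass-through of permanent shocks (assumed)
eta = rng.normal(0, 0.15, (N, T))     # permanent shocks
eps = rng.normal(0, 0.25, (N, T))     # transitory shocks

dy = eta.copy()
dy[:, 1:] += eps[:, 1:] - eps[:, :-1]     # income growth: eta_t + eps_t - eps_{t-1}
dc = theta * eta                          # consumption growth, taking gamma_t ~ 0

# Moment: cov(dc_t, dy_{t-1} + dy_t + dy_{t+1}) / cov(dy_t, same sum) identifies theta,
# because the transitory terms telescope out of the three-period sum
t = 6
s = dy[:, t - 1] + dy[:, t] + dy[:, t + 1]
theta_hat = np.cov(dc[:, t], s)[0, 1] / np.cov(dy[:, t], s)[0, 1]
print(f"theta_hat = {theta_hat:.3f}")     # recovers ~2/3
```

The key design choice is the three-period sum of income growth: its covariance with Δc_t isolates the permanent innovation η_t, since ε_t and ε_{t−1} cancel within the sum.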
Thus, accumulation of wealth because of precautionary savings or retirement can dampen the response of consumption to permanent shocks and give the appearance of partial insurance. Blundell, Pistaferri, and Preston (2008) examine whether younger individuals (who have less non-human wealth and thus a higher π_{i,t} than older individuals) have a higher response coefficient to permanent shocks. They do find this to be the case.

20 They also find φγ_t = 0.0533 (s.e. 0.0435), indicating very small transmission of transitory shocks to consumption. This is less surprising, since it would also be implied by the permanent income model.
21 The parameter φ is of lesser interest, given that transitory shocks are known to be smoothed quite well even in the permanent income model, and the value of φ one estimates depends on what one assumes about γ_t—and hence the interest rate.

Insurance or Advance Information?

Primiceri and van Rens (2009) conduct an analysis similar to Blundell, Pistaferri, and Preston (2008) and also find a small response of consumption to permanent income movements. However, they adopt a different interpretation of this finding—that income movements are largely "anticipated" by individuals, as opposed to being genuine permanent "shocks." As has been observed as far back as Hall and Mishkin (1982), this alternative interpretation illustrates a fundamental challenge with this kind of analysis: Advance information and partial insurance are difficult to disentangle by simply examining the response of consumption to income.

Insurance or Less Persistent Shocks?

Kaplan and Violante (2010) raise two more issues regarding the interpretation of θ. First, they ask, what if income shocks are persistent but not permanent? This is a relevant question because, as I discuss in the next section, nearly all empirical studies that estimate the persistence coefficient (of an AR(1) or ARMA(1,1) process) find it to be 0.95 or lower—sometimes as low as 0.7. To explore this issue, they simulate data from a life-cycle model with self-insurance only, in which income shocks follow an AR(1) process with a first-order autocorrelation of 0.95. They show that when they estimate θ as in Blundell, Pistaferri, and Preston (2008), they find it to be close to the 2/3 figure reported by these authors.^22 Second, they add a retirement period to the life-cycle model, which has the effect that now even a unit root shock is not permanent, given that its effect does not translate one-for-one into the retirement period. Thus, individuals have even more reason not to respond to permanent shocks, especially when they are closer to retirement. Overall, their findings suggest that the response coefficient of consumption to income can be generated in a model of pure self-insurance to the extent that income shocks are allowed to be slightly less than permanent.^23 One feature this model misses, however, is the age profile of response coefficients, which shows no clear trend in the data according to Blundell, Pistaferri, and Preston (2008), but is upward sloping in Kaplan and Violante's (2010) model.

Taking Stock

Before the early 1990s, economists typically appealed to aggregation theorems to justify the use of representative-agent models. Starting in the 1990s, the widespread rejections of the full insurance hypothesis (necessary for Constantinides's [1982] theorem), combined with the findings of Deaton and Paxson (1994), led economists to adopt versions of the permanent income model as a benchmark for studying individuals' choices under uncertainty (Hubbard, Skinner, and Zeldes [1995], Carroll [1997], Carroll and Samwick [1997], Blundell and Preston [1998], Attanasio et al.
[1999], and Gourinchas and Parker [2002], among many others). The permanent income model has two key assumptions: a single risk-free asset for self-insurance, and permanent—or very persistent—shocks, typically implying substantial idiosyncratic risk. The more recent evidence discussed in this subsection, however, suggests that a more appropriate benchmark needs to incorporate either more opportunities for partial insurance or idiosyncratic risk that is smaller than once assumed.

22 The reason is simple. Because the AR(1) shock decays exponentially, such a shock loses 5 percent of its value in one year, but 1 − 0.95^10 ≈ 40 percent in 10 years, and about 65 percent in 20 years. Thus, the discounted lifetime value of such a shock is significantly lower than that of a permanent shock, which retains 100 percent of its value at all horizons.
23 Another situation in which θ < 1 arises with self-insurance alone is if permanent and transitory shocks are not separately observable and there is estimation risk.

3. INCOMPLETE MARKETS IN GENERAL EQUILIBRIUM

This section and the next discuss incomplete markets models in general equilibrium without aggregate shocks. Bringing in a general equilibrium structure allows researchers to jointly analyze aggregate and distributional issues. As we shall see, the two are often intertwined, making such models very useful. The present section discusses the key ingredients that go into building a general equilibrium incomplete markets model (e.g., the types of risks to consider, borrowing limits, and modeling individuals versus households, among others). The next section presents three broad questions that these models have been used to address: the cross-sectional distributions of consumption, earnings, and wealth. These are substantively important questions and constitute an entry point into broader literatures. I now begin with a description of the basic framework.
The Aiyagari (1994) Model

In one of the first quantitative models with heterogeneity, Imrohoroglu (1989) constructed a model with liquidity constraints and unemployment risk that varied over the business cycle. She assumed that interest rates were constant to avoid the difficulties with aggregate shocks, which were subsequently solved by Krusell and Smith (1998). She used this framework to reassess Lucas's (1987) earlier calculation of the welfare cost of business cycles. She found only a slightly higher figure than Lucas, mainly because of her focus on unemployment risk, which typically has a short duration in the United States.^24 Regardless of its empirical conclusions, this article represents an important early effort in this literature.

In what has become an important benchmark model, Aiyagari (1994) studies a version of the deterministic growth framework with a neoclassical production function and a large number of infinitely lived consumers (dynasties). Consumers are ex ante identical, but there is ex post heterogeneity because of idiosyncratic shocks to labor productivity, which are not directly insurable (via insurance contracts). However, consumers can accumulate a (conditionally) risk-free asset for self-insurance. They can also borrow in this asset, subject to a limit determined in various ways. At each point in time, consumers may differ in the history of productivities experienced, and hence in accumulated wealth.

24 There is a large literature on the costs of business cycles following Lucas's original calculation. I do not discuss these articles here for brevity. Lucas's (2003) presidential address to the American Economic Association is an extensive survey of this literature that also discusses how Lucas's views on this issue have evolved since the original 1987 article.

More concretely, an individual solves the following problem:

max_{c_t} E_0 Σ_{t=0}^∞ δ^t U(c_t)
s.t.
c_t + a_{t+1} = w l_t + (1 + r) a_t,  a_t ≥ −B_min,  (8)

and l_t follows a finite-state first-order Markov process.^25 There are (at least) two ways to embed this problem in general equilibrium. Aiyagari (1994) considers a production economy and views the single asset as capital in the firm, which obviously has positive net supply. In this case, aggregate production is determined by the savings of individuals, and both r and the wage rate w must be determined in general equilibrium. Huggett (1993) instead assumes that the single asset is a household bond in zero net supply. In this case, the aggregate amount of goods in the economy is exogenous (an exchange economy), and the only aggregate variable to be determined is r. The borrowing limit B_min can be set to the "natural" limit, which is defined as the loosest possible constraint consistent with certain repayment of debt: B_min = w l_min / r. Note that if l_min is zero, this natural limit will be zero. Some authors have used this feature to rule out borrowing (e.g., Carroll [1997] and Gourinchas and Parker [2002]). Alternatively, the limit can be set to some ad hoc value stricter than the natural one. More on this later.

The main substantive finding in Aiyagari (1994) is that with incomplete markets, the aggregate capital stock is higher than it is with complete markets, although the difference is not quantitatively very large. Consequently, the interest rate is lower (than the time preference rate), which is also true in Huggett's (1993) exchange economy version. This latter finding initially led economists to conjecture that these models could help explain the equity premium puzzle,^26 which is also generated by a low interest rate. It turns out that while this environment helps, it is neither necessary nor sufficient to generate a low interest rate. I return to this issue later.
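A bare-bones version of the household problem (8) can be solved by value function iteration at fixed prices. The Python sketch below is a partial-equilibrium illustration only: the productivity grid, transition matrix, and all other parameter values are assumptions, and the general equilibrium loop over r is omitted.

```python
import numpy as np

# Minimal value function iteration for problem (8) at fixed prices;
# all parameter values are illustrative, not calibrated.
delta, r, w = 0.95, 0.03, 1.0
l_grid = np.array([0.2, 1.0])                 # low/high labor productivity (assumed)
P = np.array([[0.50, 0.50],
              [0.075, 0.925]])                # Markov transition matrix (assumed)
B_min = 0.0                                   # ad hoc no-borrowing limit
a_grid = np.linspace(-B_min, 20.0, 300)

# Utility of every (current asset, next asset) choice, per productivity state
U = np.empty((2, a_grid.size, a_grid.size))
for s in range(2):
    c = w * l_grid[s] + (1 + r) * a_grid[:, None] - a_grid[None, :]
    U[s] = np.where(c > 1e-10, np.log(np.maximum(c, 1e-10)), -1e10)

V = np.zeros((2, a_grid.size))
for _ in range(2000):
    EV = P @ V                                # expected continuation value by state
    obj = U + delta * EV[:, None, :]          # Bellman objective over savings choices
    pol = obj.argmax(axis=2)                  # savings policy (grid indices)
    V_new = obj.max(axis=2)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print(a_grid[pol[0]].mean(), a_grid[pol[1]].mean())  # high-productivity agents save more
```

In the full Aiyagari (1994) model, one would additionally compute the stationary wealth distribution implied by this policy and adjust r until the asset market clears.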
Aiyagari (1994) also shows that the model generates the right ranking between different types of inequality: Wealth is more dispersed than income, which in turn is more dispersed than consumption.

25 Prior to Aiyagari, the decision problem described here was studied in various forms by, among others, Bewley (undated), Schechtman and Escudero (1977), Flavin (1981), Hall and Mishkin (1982), Clarida (1987, 1990), Carroll (1991), and Deaton (1991). With the exceptions of Bewley (undated) and Clarida (1987, 1990), however, most of these earlier articles did not consider general equilibrium, which is the main focus here.
26 The equity premium puzzle of Mehra and Prescott (1985) is the observation that, in the historical data, stocks yield a much higher return than bonds over long horizons, which has turned out to be very difficult to explain with a wide range of economic models.

The frameworks analyzed by Huggett (1993) and Aiyagari (1994) contain the bare bones of a canonical general equilibrium incomplete markets model. As such, they abstract from many ingredients that would be essential today for conducting serious empirical/quantitative work, especially given that almost two decades have passed since their publication. In the next three subsections, I review three main directions in which the framework can be extended. First, the nature of idiosyncratic risk is often crucial for the implications generated by the model. There is a fair bit of controversy about the precise nature and magnitude of such risks, which I discuss in some detail. Second, as I alluded to above, the treatment of borrowing constraints is very reduced-form here. The recent literature has made significant progress in providing useful microfoundations for a richer specification of borrowing limits.
Third, the Huggett-Aiyagari model considers an economy populated by bachelor(ette)s as opposed to families—a distinction that clearly can have a big impact on economic decisions, which is also discussed.

Nature of Idiosyncratic Income Risk^27

The rejection of perfect insurance brought to the fore idiosyncratic shocks as important determinants of economic choices. However, after three decades of empirical research (since Lillard and Willis [1978]), a consensus among researchers on the nature of labor income risk remains elusive. In particular, the literature in the 1980s and 1990s produced two—quite opposite—views on the subject. To provide context, consider this general specification for the wage process:

y^i_t = g(t, observables, …) + [α^i + β^i t] + [z^i_t + ε^i_t],  (9)
z^i_t = ρ z^i_{t−1} + η^i_t,  (10)

where g(t, observables, …) is the common systematic component, [α^i + β^i t] captures profile heterogeneity, [z^i_t + ε^i_t] is the stochastic component, and η^i_t and ε^i_t are zero-mean innovations that are i.i.d. over time and across individuals. The early articles on income dynamics estimate versions of the process given in (9) from labor income data and find 0.5 < ρ < 0.7 and σ²_β > 0 (Lillard and Weiss 1979; Hause 1980). Thus, according to this first view, which I shall call the "heterogeneous income profiles" (HIP) model, individuals are subject to shocks with modest persistence, while facing life-cycle profiles that are individual-specific (and hence vary significantly across the population). As we will see in the next section, one theoretical motivation for this specification is the human capital model, which implies differences in income profiles if, for example, individuals differ in their ability levels. In an important article, MaCurdy (1982) casts doubt on these findings. He tests the null hypothesis of σ²_β = 0 and fails to reject it.

27 The exposition here draws heavily on Guvenen (2009a).
He then proceeds by imposing σ²_β ≡ 0 before estimating the process in (9), and finds ρ ≈ 1 (see also Abowd and Card [1989], Topel [1990], Hubbard, Skinner, and Zeldes [1995], and Storesletten, Telmer, and Yaron [2004b]). Therefore, according to this alternative view, which I shall call the "restricted income profiles" (RIP) model, individuals are subject to extremely persistent—nearly random walk—shocks, while facing similar life-cycle income profiles.

MaCurdy's (1982) Test

More recently, two articles have revived this debate. Baker (1997) and Guvenen (2009a) have shown that MaCurdy's test has low power, and therefore the lack of rejection does not contain much information about whether or not there is growth rate heterogeneity. MaCurdy's test was generally regarded as the strongest evidence against the HIP specification, and it was repeated in different forms by several subsequent articles (Abowd and Card 1989; Topel 1990; Topel and Ward 1992), so it is useful to discuss it in some detail. To understand its logic, notice that, using the specification in (9) and (10), the nth autocovariance of income growth can be shown to be

cov(Δy^i_t, Δy^i_{t+n}) = σ²_β − ρ^{n−1} [(1 − ρ)/(1 + ρ)] σ²_η,  (11)

for n ≥ 2. The idea of the test is that for sufficiently large n, the second term will vanish (because of the exponential decay in ρ^{n−1}), leaving behind a positive autocovariance equal to σ²_β. Thus, if HIP is indeed important—that is, if σ²_β is positive—then higher-order autocovariances must be positive. Guvenen (2009a) raises two points. First, he asks how large n must be for the second term to be negligible. He shows that for the value of persistence he estimates with the HIP process (ρ ≅ 0.82), the autocovariances in (11) do not even turn positive before the 13th lag (because the second term dominates), whereas MaCurdy studies only the first 5 lags.
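This first point can be checked directly from equation (11). A short Python calculation follows; the values of ρ, σ²_β, and σ²_η are illustrative assumptions, chosen only to be in the rough range discussed in the text, not Guvenen's estimates:

```python
import numpy as np

rho, sig_beta2, sig_eta2 = 0.82, 0.0004, 0.03   # illustrative parameter values

def autocov(n):
    # Equation (11): lag-n autocovariance of income growth, valid for n >= 2
    return sig_beta2 - rho ** (n - 1) * (1 - rho) / (1 + rho) * sig_eta2

lags = np.arange(2, 25)
first_positive = lags[autocov(lags) > 0][0]
print([round(autocov(n), 5) for n in (2, 5, 10)], first_positive)
```

With values in this range, the profile-heterogeneity term σ²_β dominates only after roughly a dozen lags, which is the sense in which examining the first few autocovariances cannot detect it.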
Second, he conducts a Monte Carlo analysis in which he simulates data using equation (9) with substantial heterogeneity in growth rates.^28 The results of this analysis are reproduced here in Table 1. MaCurdy's test does not reject the false null hypothesis of σ²_β = 0 for any sample size smaller than 500,000 observations (column 3)!

28 More concretely, the estimated value of σ²_β used in his Monte Carlo analysis implies that at age 55 more than 70 percent of wage inequality is because of profile heterogeneity.

Table 1 How Informative is MaCurdy's (1982) Test?

                         Autocovariances                                    Autocorrelations
Lag    Data              HIP Process        HIP Process        Data             HIP Process
       N = 27,681        N = 27,681         N = 500,000        N = 27,681       N = 27,681
 0      .1215 (.0023)     .1136 (.00088)     .1153 (.00016)     1.00 (.000)      1.00 (.000)
 1     −.0385 (.0011)    −.04459 (.00077)   −.04826 (.00017)   −.3174 (.010)    −.3914 (.0082)
 2     −.0031 (.0010)    −.00179 (.00075)   −.00195 (.00018)   −.0261 (.008)    −.0151 (.0084)
 3     −.0023 (.0008)    −.00146 (.00079)   −.00154 (.00020)   −.0192 (.009)    −.0128 (.0087)
 4     −.0025 (.0007)    −.00093 (.00074)   −.00120 (.00019)   −.0213 (.010)    −.0080 (.0083)
 5     −.0001 (.0008)    −.00080 (.00081)   −.00093 (.00020)   −.0012 (.007)    −.0071 (.0090)
10     −.0017 (.0006)    −.00003 (.00072)   −.00010 (.00019)   −.0143 (.009)    −.0003 (.0081)
15      .0053 (.0007)     .00017 (.00076)    .00021 (.00020)    .0438 (.008)     .0015 (.0086)
18      .0012 (.0009)     .00036 (.00076)    .00030 (.00018)    .0094 (.011)     .0032 (.0087)

Notes: The table is reproduced from Guvenen (2009a, Table 3). N denotes the sample size (number of individual-years) used to compute the statistics. Standard errors are in parentheses. The statistics in the "Data" columns are calculated from a sample of 27,681 males from the PSID, as described in that article. The counterparts from simulated data are calculated using the same number of individuals and a HIP process fitted to the covariance matrix of income residuals.
Even in that case, only the 18th autocovariance is barely significant (with a t-statistic of 1.67). For comparison, MaCurdy's (1982) data set included around 5,000 observations, and even the more recent PSID data sets typically contain fewer than 40,000 observations. In light of these results, imposing the a priori restriction σ_β² = 0 on the estimation exercise seems a risky route to follow. Baker (1997), Haider (2001), Haider and Solon (2006), and Guvenen (2009a) estimate the process in (9) without imposing this restriction and find substantial heterogeneity in β^i and low persistence, confirming the earlier results of Lillard and Weiss (1979) and Hause (1980). Baker and Solon (2003) use a large panel data set drawn from Canadian tax records and allow for both permanent shocks and profile heterogeneity; they find statistically significant evidence of both components. In an interesting recent article, Browning, Ejrnaes, and Alvarez (2010) estimate an income process that allows for "lots of" heterogeneity. The authors use a simulated method of moments estimator and match a number of moments whose economic significance is more immediate than the covariance matrix of earnings residuals, which has typically served as the basis of generalized method of moments estimation in the bulk of the extant literature. They uncover a lot of interesting heterogeneity, for example, in the innovation variance as well as in the persistence of AR(1) shocks. Moreover, they "find strong evidence against the hypothesis that any worker has a unit root." Gustavsson and Österholm (2010) use a long panel data set (1968–2005) from administrative wage records on Swedish individuals. They employ local-to-unity techniques on individual-specific time series and reject the unit root assumption.
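A minimal version of such a Monte Carlo exercise can be sketched as follows, assuming a HIP data-generating process with an AR(1) component. The parameter values are illustrative, not those used in Guvenen (2009a):

```python
import numpy as np

# Monte Carlo sketch of the power problem: simulate income histories from a HIP
# process with genuine growth-rate heterogeneity, then inspect the low-order
# autocovariances of income growth that MaCurdy's test examines.
# All parameter values below are illustrative.
rng = np.random.default_rng(1)
N, T = 5_000, 20
rho, sig_beta, sig_eta = 0.82, 0.02, 0.17
beta_i = rng.normal(0.0, sig_beta, N)                         # heterogeneous growth rates
z = np.zeros((N, T))
z[:, 0] = rng.normal(0.0, sig_eta / np.sqrt(1 - rho**2), N)   # stationary AR(1) start
for t in range(1, T):
    z[:, t] = rho * z[:, t - 1] + rng.normal(0.0, sig_eta, N)
y = beta_i[:, None] * np.arange(1, T + 1) + z                 # log income residuals
g = np.diff(y, axis=1)                                        # income growth

def acov(n):
    """Pooled sample autocovariance of income growth at lag n."""
    return float(np.mean((g[:, :-n] - g.mean()) * (g[:, n:] - g.mean())))
# Despite sig_beta**2 = 4e-4 > 0, the low-order autocovariances come out negative.
```

With a PSID-sized panel, the heterogeneity term is swamped by the negative AR(1)-induced term at the low lags, which is exactly why failing to reject σ_β² = 0 at those lags is uninformative.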
Inferring Risk versus Heterogeneity from Economic Choices

Finally, a number of recent articles examine the response of consumption to income shocks to infer the nature of income risk. In an important article, Cunha, Heckman, and Navarro (2005) measure the fraction of individual-specific returns to education that is predictable by individuals by the time they make their college decision versus the part that represents uncertainty. Assuming a complete markets structure, they find that slightly more than half of the returns to education represent known heterogeneity from the perspective of individuals. Guvenen and Smith (2009) study the joint dynamics of consumption and labor income (using PSID data) in order to disentangle "known heterogeneity" from income risk (coming from shocks as well as from uncertainty regarding one's own income growth rate). They conclude that a moderately persistent income process (ρ ≈ 0.7–0.8) is consistent with the joint dynamics of income and consumption. Furthermore, they find that individuals have significant information about their own β^i at the time they enter the labor market and hence face little uncertainty from this component. Overall, they conclude that with income shocks of modest persistence and largely predictable income growth rates, the income risk perceived by individuals is substantially smaller than what is typically assumed in calibrating incomplete markets models (many of which borrow their parameter values from MaCurdy [1982], Abowd and Card [1989], and Meghir and Pistaferri [2004], among others). Along the same lines, Krueger and Perri (2009) use rich panel data on Italian households and conclude that the response of consumption to income suggests low persistence of income shocks (or a high degree of partial insurance).29 Studying economic choices to disentangle risk from heterogeneity has many advantages.
Perhaps most importantly, it allows researchers to bring a much broader set of data to bear on the question. For example, many dynamic choices require individuals to carefully weigh the different future risks they perceive against predictable changes before making a commitment. Decisions on home purchases, fertility, college attendance, retirement savings, and so on are all of this sort. At the same time, this line of research also faces important challenges: These analyses need to rely on a fully specified economic model, so the results can be sensitive to assumptions regarding the market structure, the specification of preferences, and so on. Therefore, experimenting with different assumptions is essential before a definitive conclusion can be reached with this approach. Overall, this represents a difficult but potentially very fruitful area for future research.

29 A number of important articles have also studied the response of consumption to income, such as Blundell and Preston (1998) and Blundell, Pistaferri, and Preston (2008). These studies, however, assume the persistence of income shocks to be constant and instead focus on what can be learned about the sizes of income shocks over time.

Wealth, Health, and Other Shocks

One source of idiosyncratic risk that has received relatively little attention until recently comes from shocks to wealth holdings, resulting, for example, from fluctuations in housing prices and stock returns. A large fraction of the fluctuations in housing prices is driven by local or regional factors, and these fluctuations are substantial (as the latest housing market crash showed once again), so they can have profound effects on individuals' economic choices. In one recent example, Krueger and Perri (2009) use panel data on Italian households' income, consumption, and wealth.
They study the response of consumption to income and wealth shocks and find the latter to be very important. Similarly, Mian and Sufi (2011) use individual-level data from 1997–2008 and show that the housing price boom led to significant equity extraction (about 25 cents for every dollar increase in prices), which in turn led to higher leverage and more personal defaults during this period. Their "conservative" estimate is that home equity-based borrowing added $1.25 trillion in household debt and accounted for about 40 percent of new defaults from 2006–2008. Another source of idiosyncratic shocks is out-of-pocket medical expenditures (hospital bills, nursing home expenses, medications, etc.), which can potentially have significant effects on household decisions. French and Jones (2004) estimate a stochastic process for health expenditures, modeled as a normal distribution adjusted to capture the risk of catastrophic health care costs. Simulating this process, they show that every year 0.1 percent of households receive a health cost shock with a present value exceeding $125,000. Hubbard, Skinner, and Zeldes (1994, 1995) represent the earliest efforts to introduce such shocks into quantitative incomplete markets models. The 1995 article shows that accounting for the interaction of such shocks with means-tested social insurance programs is especially important for understanding the very low savings rate of low-income individuals. De Nardi, French, and Jones (2010) ask if the risk of large out-of-pocket medical expenditures late in life can explain the savings behavior of the elderly.
They examine a new and rich data set called AHEAD, which is part of the Health and Retirement Survey conducted by the University of Michigan and which allows them to characterize medical expenditure risk for the elderly (even for those in their 90s) more precisely than previous studies, such as Hubbard, Skinner, and Zeldes (1995) and Palumbo (1999).30 De Nardi, French, and Jones (2010) find that out-of-pocket expenditures rise dramatically at very old ages, which (in their estimated model) provides an explanation for the lack of significant dissaving by the elderly. Ozkan (2010) shows that the life-cycle profile of medical costs (inclusive of the costs paid by private and public insurers to providers) differs significantly between rich and poor households. In particular, on average, the medical expenses of the rich are higher than those of the poor until mid-life, after which the expenses of the poor exceed those of the rich (by 25 percent in absolute terms). Further, the expenses of the poor have thick tails: many individuals with zero expenses and many with catastrophically high costs. He builds a model in which individuals can invest in their health (i.e., preventive care), which affects the future distribution of health shocks and, consequently, expected lifetime. High-income individuals do precisely this, which explains their higher spending early on. Low-income individuals do the opposite, which ends up costing more later in life. He concludes that a reform of the health care system that encourages the use of health care by low-income individuals yields positive welfare gains, even when fully accounting for the increase in taxes required to pay for it.

Endogenizing Credit Constraints

The basic Aiyagari model features a reduced-form specification for borrowing constraints (8) and does not model the lenders' problem that gives rise to such constraints.
As such, it is silent about potentially interesting variations in borrowing limits across individuals and states of the economy. A number of recent articles attempt to close this gap.

30 Palumbo's estimates of medical expenditures are quite a bit smaller than those in De Nardi, French, and Jones (2010), and this difference is largely responsible for the smaller effects he quantifies. De Nardi, French, and Jones (2010) argue that one reason for the discrepancy could be that Palumbo used data from the National Medical Care Expenditure Survey, which, unlike the AHEAD data set, does not contain direct measures of nursing home expenses; his imputations from a variety of sources may miss the large actual magnitude of such expenses found in the AHEAD data set.

In one of the earliest studies of this kind, Athreya (2002) constructs a general equilibrium model of unsecured household borrowing to quantify the welfare effects of the Bankruptcy Reform Act of 1999 in the United States. In the pooling equilibrium of this model (which is what Athreya focuses on), the competitive lending sector charges a borrowing rate above the market lending rate in order to break even (i.e., satisfy a zero-profit condition), accounting for the fraction of households that will default. This framework allows him to study different policies, such as changing the stringency of means testing as well as eliminating bankruptcy altogether. In an important article, Chatterjee et al. (2007) build a model of personal default behavior and endogenous borrowing limits. The model features (i) several types of shocks (to earnings, preferences, and liabilities, e.g., hospital and lawsuit bills, which precede a large fraction of defaults in the United States); (ii) a competitive banking sector; and (iii) post-bankruptcy legal treatment of defaulters that mimics the U.S. Chapter 7 bankruptcy code. The main contribution of Chatterjee et al. (2007) is to show that a separating equilibrium exists in which banks offer a menu of debt contracts to households, with interest rates that vary optimally with the level of borrowing to account for the changing default probability. Using a calibrated version of the model, they quantify the separate contributions of earnings, preference, and liability shocks to debt and default. Chatterjee and Eyigungor (2011) introduce collateralized debt (i.e., mortgage debt) into this framework to examine the causes of the run-up in foreclosures and the crash in housing prices after 2007. Livshits, MacGee, and Tertilt (2007) study a model similar to Chatterjee et al. (2007) in order to quantify the advantages of a "fresh start" bankruptcy system (e.g., U.S. Chapter 7) relative to a European-style system in which debtors cannot fully discharge their debt in bankruptcy. The key tradeoff is that dischargeable debt provides insurance against bad shocks, helping households smooth across states, but the resulting inability to commit to future repayment raises interest rates and limits their ability to smooth across time. Their model is quite similar to Chatterjee et al. (2007), except that they model an explicit overlapping generations structure. They calibrate the model to the age-specific bankruptcy rate and debt-to-earnings ratio. For their baseline parameterization, they find that fresh-start bankruptcy is welfare improving, but that result is sensitive to the process for expenditure and income shocks, the shape of the earnings profile, and household size. Livshits, MacGee, and Tertilt (2010) build on this framework to evaluate several theories for the rise in personal bankruptcies since the 1970s. Finally, Glover and Short (2010) use the model of personal bankruptcy to understand the incorporation decisions of entrepreneurs.
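The zero-profit pricing logic shared by these models can be sketched with a stylized calculation. The default probabilities, loan sizes, and the full-loss-given-default assumption below are all hypothetical simplifications, not the calibrations used in these articles:

```python
# Break-even loan pricing with default (hypothetical numbers, full loss given
# default): competitive lenders satisfy (1 - p)(1 + r_b) = 1 + r_f, so the
# quoted borrowing rate r_b rises with the default probability p. In a
# Chatterjee et al. (2007)-style separating equilibrium, p varies with loan
# size, producing a menu of rates; in Athreya's (2002) pooling equilibrium,
# one economy-wide average p is priced into every loan.
def break_even_rate(r_f, p):
    """Borrowing rate at which a lender earns zero expected profit."""
    assert 0.0 <= p < 1.0
    return (1.0 + r_f) / (1.0 - p) - 1.0

# Hypothetical default schedule: larger loans default more often,
# so the menu of rates slopes upward in the loan size.
schedule = {b: break_even_rate(0.02, p) for b, p in [(1, 0.01), (5, 0.05), (10, 0.15)]}
```

For instance, with a 2 percent risk-free rate, a 5 percent default probability pushes the break-even borrowing rate to roughly 7.4 percent.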
Incorporation protects the owners' personal assets and their access to credit markets in case of default, but by increasing their likelihood of default, incorporation also implies that a risk premium is built into their borrowing rate.

From Bachelor(ette)s to Families

While the framework described above can shed light on some interesting distributional issues (e.g., inequality in consumption, earnings, and wealth), it is completely silent on a crucial source of heterogeneity: the household structure. In reality, individuals marry, divorce, and have kids, and they make their decisions regarding consumption, savings, labor supply, and so on jointly with these other life choices. For many economic and policy questions, the interaction between these domestic decisions and economic choices in an incomplete markets world can have a first-order effect on the answers we get. Just to give a few examples, consider these facts: Men and women are well known to exhibit different labor supply elasticities; the tax treatment of income varies depending on whether an individual is single or married and whether he/she has kids; the trends in the labor market participation rate in the United States since the 1960s have been markedly different for single and married women; and the fractions of individuals who are married and divorced have changed significantly, again since the 1960s. A burgeoning literature works to bring a richer household structure into macroeconomics. For example, in an influential article, Greenwood, Seshadri, and Yorukoglu (2005) study the role of household technologies (the widespread availability of washing machines, vacuum cleaners, refrigerators, etc.) in leading women into the labor market. Greenwood and Guner (2009) extend the analysis to study marriage and divorce patterns since World War II.
Jones, Manuelli, and McGrattan (2003) explore the role of the closing gender wage gap in married women's rising labor supply. Knowles (2007) argues that the working hours of men are too long when viewed through the lens of a unitary model of the household in which the average wage of females rises as in the data; he shows that introducing bargaining between spouses into the model reconciles it with the data. Guner, Kaygusuz, and Ventura (2010) study the effects of potential reforms in the U.S. tax system in a model of families with children and an extensive margin for female labor supply. Guvenen and Rendall (2011) study the insurance role of education for women against divorce risk and the joint evolution of education trends with those in marriage and divorce.

4. INEQUALITY IN CONSUMPTION, WEALTH, AND EARNINGS

A major use of heterogeneous-agent models is to study inequality or dispersion in key economic outcomes, most notably in consumption, earnings, and wealth. The Aiyagari model, as well as its aggregate-shock augmented version (the Krusell-Smith model presented in the next section), takes earnings dispersion to be exogenous and makes predictions about inequality in consumption and wealth. The bulk of the incomplete markets literature follows this lead in its analysis. Some studies introduce an endogenous labor supply choice and instead specify the wage process to be exogenous, delivering earnings dispersion as an endogenous outcome (Pijoan-Mas [2006], Domeij and Floden [2006], and Heathcote, Storesletten, and Violante [2008], among others). While this is a useful step forward, a lot of the dispersion in earnings before age 55 is because of wages and not hours, so the assumption of an exogenous wage process still leaves quite a bit to be understood.
Other strands of the literature attempt to close this gap with models that also generate wage dispersion as an endogenous outcome, for example, because of human capital accumulation (e.g., Guvenen and Kuruscu [2010, forthcoming] and Huggett, Ventura, and Yaron [2011]) or because of search frictions.31

31 The search literature is very large, with many interesting models to cover. I do not discuss these models here because I cannot do justice to this extensive body of work in this limited space. For an excellent survey, see Rogerson, Shimer, and Wright (2005). Note, however, that as Hornstein, Krusell, and Violante (2011) show, search models have trouble generating the magnitudes of wage dispersion we observe in the data.

Consumption Inequality

Two different dimensions of consumption inequality have received attention in the literature. The first one concerns how much within-cohort consumption inequality increases over the life cycle. The different views on this question have been summarized in Section 2.32

32 Another recent article of interest is Aguiar and Hurst (2008), who examine the life-cycle mean and variance profiles of the subcomponents of consumption (housing, utility bills, clothing, food at home, food away from home, etc.). They show rich patterns that vary across categories, whereby the variance rises monotonically for some categories, is hump-shaped for others, and declines monotonically for yet others. The same patterns are observed for the mean profiles. These disaggregated facts provide more food for thought for researchers.

The second dimension concerns whether, and by how much, (overall) consumption inequality has risen in the United States since the 1970s, a question made urgent by the substantial rise in wage inequality during the same period. In one of the earliest articles on this topic, Cutler and Katz (1992) use data from the 1980s on U.S. households from the CE and find that the evolution of consumption inequality closely tracks the rise in wage inequality during the same period. This finding serves as a rejection of earlier claims in the literature (e.g., Jencks 1984) that the rise of means-tested in-kind transfers starting in the 1970s had improved the material well-being of low-income households relative to what would be judged from their income statistics. Interest in this question was reignited more recently by a thought-provoking article by Krueger and Perri (2006), who conclude from an analysis of CE data that, from 1980–2003, within-group income inequality increased substantially more than within-group consumption inequality (in contrast, they find that between-group income and consumption inequality tracked each other).33 They then propose an explanation based on the premise that the development of financial services in the U.S. economy has helped households smooth consumption fluctuations relative to income variation. To investigate this story, they apply a model of endogenous debt constraints as in Kehoe and Levine (1993). In this class of models, what is central is not the ability of households to pay back their debt, but rather their incentive, or willingness, to pay it back. To give the right incentives, lenders can punish a borrower who defaults, for example, by banning her from financial markets forever (autarky). However, if the individual borrows too much or if autarky is not sufficiently costly, it may still make sense to default in certain states of the world. Thus, given the parameters of the economic environment, lenders will compute the optimal state-contingent debt limit, which ensures that the borrower never defaults in equilibrium. Krueger and Perri (2006) notice that if income shocks are really volatile, then autarky is a really bad outcome, giving borrowers less incentive to default.
Lenders who know this, in turn, are more willing to lend, which endogenously loosens the borrowing constraints. This view of the last 30 years therefore holds that the rise in the volatility of income shocks spurred the development of financial markets (more generous lending), which in turn led to a smaller rise in consumption inequality.34 Heathcote, Storesletten, and Violante (2007) argue that the small rise in consumption inequality can be explained simply if the rise in income shocks has been of a more transitory nature, since such shocks are easier to smooth through self-insurance. Indeed, Blundell and Preston (1998) made the same observation earlier and concluded that in the 1980s the rise in income shock variance was mostly permanent in nature (as evidenced by the observation that income and consumption inequality grew together), whereas in the 1990s it was mostly transitory, given that the opposite was true. Heathcote, Storesletten, and Violante (2007) calibrate a fully specified model and show that it can go a long way toward explaining the observed trends in consumption inequality. One point to keep in mind is that these articles take as given that the volatility of income shocks rose during this period, a conclusion that is subject to uncertainty in light of the new evidence discussed above.

33 Attanasio, Battistin, and Ichimura (2007) question the use of the CE interview survey and argue that some expenditure items are poorly measured in it relative to another component of the CE, the diary survey. They propose an optimal way of combining the data from the two surveys and find that consumption inequality, especially in the 1990s, increased more than what is revealed by the interview survey alone.
34 Aguiar and Bils (2011) take a different approach and construct a measure of CE consumption by using data on income and (self-reported) savings rates of households.
They argue that consumption inequality tracked income inequality closely over the past 30 years. Although this is still preliminary work, the article raises some interesting challenges.
Before concluding, a word of caution about measurement. The appropriate price deflator for consumption may have trended differently for households in different parts of the income distribution (i.e., the "Walmart effect" at the lower end). To the extent that this effect is real, the measured trend in consumption inequality could be overstating the actual rise in the dispersion of material well-being. This issue still deserves a fuller exploration.

Wealth Inequality

The main question about wealth inequality is a cross-sectional one: Why do we observe such enormous disparities in wealth, with a Gini coefficient of about 0.80 for net worth and a Gini exceeding 0.90 for financial wealth? Economists have developed several models that can generate highly skewed wealth distributions (see, for example, Huggett [1996], Krusell and Smith [1998], Quadrini [2000], Castañeda, Díaz-Giménez, and Ríos-Rull [2003], Guvenen [2006], and Cagetti and De Nardi [2006]). These models typically use one (or more) of three mechanisms to produce this inequality: (1) dispersion in luck, in the form of large and persistent shocks to labor productivity (the rich are luckier than the poor); (2) dispersion in patience or thriftiness (the rich save more than the poor); and (3) dispersion in rates of return (the rich face higher asset returns than the poor).
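The Gini coefficient cited above can be computed from a wealth sample in a few lines. For intuition about the magnitude, a Pareto distribution with tail parameter α has a Gini of 1/(2α − 1), so a tail parameter near 1.13 delivers a figure around 0.80. A minimal sketch:

```python
import numpy as np

def gini(x):
    """Gini coefficient of a nonnegative sample, via the 1-indexed rank formula:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted ascending."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * (ranks * x).sum() / (n * x.sum()) - (n + 1.0) / n

# Sanity checks: perfect equality gives 0; one person holding everything
# among n people gives 1 - 1/n.
```

Applied to survey net-worth data, this statistic is what yields the roughly 0.80 figure for the United States.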
This subsection describes a baseline model and variations of it that incorporate various combinations of the three main mechanisms that economists have used to generate substantial inequality in general equilibrium models.35

35 Some of the models discussed in this section contain aggregate shocks in addition to idiosyncratic ones. While aggregate shocks raise some technical issues that will be addressed in the next section, they pose no problems for the exposition in this section.

Dispersion in Luck

Huggett (1996) asks how much progress can be made toward understanding wealth inequality using (i) a standard life-cycle model with (ii) Markovian idiosyncratic shocks, (iii) uncertain lifetimes, and (iv) a Social Security system. He finds that although the model can match the Gini coefficient for wealth in the United States, this success comes from low-income households holding too little wealth rather than from the extreme concentration of wealth at the top of the U.S. economy. Moreover, whereas in the U.S. data the dispersion of wealth within each cohort is nearly as large as the dispersion across cohorts, the model understates the former significantly. Castañeda, Díaz-Giménez, and Ríos-Rull (2003) study an enriched model that combines elements of Aiyagari (1994) and Huggett (1996). Specifically, the model (i) has a Social Security system, (ii) has perfectly altruistic bequests, (iii) allows for intergenerational correlation of earnings ability, (iv) has a progressive labor and estate tax system as in the United States, and (v) allows a labor supply decision. As for the stochastic process for earnings, they do not calibrate its properties based on microeconometric evidence on income dynamics, as is commonly done; rather, they choose its features (the 4 × 4 transition matrix and four states of a Markov process) so that the model matches the cross-sectional distribution of earnings and wealth.
To match the extreme concentration of wealth at the upper tail, this calibration procedure implies that individuals must receive a large positive shock (about 1,060 times the median income level) with a small probability. This high income level is also very fleeting (it lasts for about five years), which leads these high-income individuals to save substantially (for consumption smoothing) and results in high wealth inequality.

Dispersion in Patience

Laitner's (1992, 2002) original insight was that wealth inequality could result from the combination of (1) random heterogeneity in lifetime incomes across generations and (2) altruistic bequests, which are constrained to be non-negative. Each newly born consumer in Laitner's model receives a permanent shock to his lifetime income and, unlike in the Aiyagari model, faces no further shocks to income during his lifetime. In essence, in Laitner's model only households that earn higher-than-average lifetime income want to transfer some amount to their offspring, who are not likely to be as fortunate. This altruistic motive makes these households effectively more thrifty (compared with those that earn below-average income) since they also care about future utility. Thus, even small differences in lifetime income can result in large differences in savings rates (a fact empirically documented by Carroll [2000]) and hence in wealth accumulation. The stochastic-beta model of Krusell and Smith (1998) is a variation on this idea in a dynastic framework, in which heterogeneity in thrift (i.e., in the time-discount rate) is imposed exogenously.36 Being more parsimonious, the stochastic-beta model also allows for the introduction of aggregate shocks. Krusell and Smith show that even small differences in the time discount factor, if sufficiently persistent, are enough to generate the extreme skewness of the U.S. wealth distribution. The intuition for this result will be discussed in a moment.
36 Notice that in this article I use δ to denote the time discount factor, while β was used to denote the income growth rate. I will continue with this convention, except when I specifically refer to the Krusell-Smith model, which has come to be known as the stochastic-beta model.

Dispersion in Rates of Return

Guvenen (2006) introduces return differentials into a standard stochastic growth model (i.e., one in which consumers have identical, time-invariant discount factors and there are no idiosyncratic shocks). He allows all households to trade in a risk-free bond but restricts one group of agents from accumulating capital. Quadrini (2000) and Cagetti and De Nardi (2006) study models of inequality with entrepreneurs and workers, which can also generate skewed wealth distributions. The mechanisms have similar flavors: Agents who face higher returns end up accumulating a substantial amount of wealth. The basic mechanism in Guvenen (2006) can be described as follows. Nonstockholders have a precautionary demand for wealth (bonds), but the only way they can save is if stockholders are willing to borrow. In contrast, stockholders have access to capital accumulation, so they could smooth consumption even if the bond market were completely shut down. Furthermore, nonstockholders' asset demand is even more inelastic because they are assumed to have a lower elasticity of intertemporal substitution (consistent with empirical evidence) and therefore a strong desire for consumption smoothing. Therefore, trading bonds for consumption smoothing is more important for nonstockholders than it is for stockholders. As a result, stockholders will only trade in the bond market if they can borrow at a low interest rate. This low interest rate in turn dampens nonstockholders' demand for savings further, and they end up with little wealth in equilibrium (and stockholders end up borrowing very little).
Guvenen (2009b) shows that a calibrated version of this model easily generates the extremely skewed distribution of the relative wealth of stockholders to nonstockholders found in the U.S. data.

Can We Tell Them Apart?

The determination of wealth inequality in the three models discussed so far can be explained using variations of a diagram used by Aiyagari (1994). The left panel of Figure 2 shows how wealth inequality is determined in Laitner's model and, given their close relationship, in the Krusell-Smith model. The top solid curve originating from −B_min plots the long-run asset demand schedule of the impatient agent; the bottom curve is that of the patient agent. A well-known feature of incomplete markets models is that the asset demand schedule is very flat for values of the return that are close to the time preference rate, η (so δ ≡ 1/(1 + η)). Thus, both types of individuals' demand schedules asymptote to their respective time preference rates (with η_patient < η_impatient).37 If the equilibrium return (which must be lower than η_patient for an equilibrium to exist) is sufficiently close to η_patient, the high sensitivity of asset demands to interest rates will generate substantial wealth inequality between the two types of agents.

[Figure 2: Determination of Wealth Inequality in Various Models. Each panel plots the asset return against long-run asset demand. Left panel: Laitner (1992) and Krusell and Smith (1998), with separate demand schedules for the patient and impatient agents; right panel: Guvenen (2006), with returns R^s and R^f and the resulting wealth of stockholders and nonstockholders.]

37 See, for example, Aiyagari (1994) and references therein. This result also holds when asset returns are stochastic (Chamberlain and Wilson 2000).
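The flatness of asset demand near η can be illustrated by solving a small income-fluctuation problem at different fixed interest rates and comparing long-run asset holdings. Everything below (utility, income process, grids, rates) is an illustrative toy parameterization, not any of the calibrations discussed in the text:

```python
import numpy as np

# Toy income-fluctuation problem: log utility, a 2-state Markov income process,
# and a no-borrowing constraint a' >= 0. Illustrative parameters only.
beta = 0.96                                  # discount factor; eta = 1/beta - 1 ~ 4.2%
y = np.array([0.5, 1.5])                     # income states
P = np.array([[0.9, 0.1], [0.1, 0.9]])       # income transition matrix
grid = np.linspace(0.0, 150.0, 301)          # asset grid

def mean_assets(r, tol=1e-7, iters=3000):
    """Long-run average asset holdings at a fixed interest rate r."""
    n = len(grid)
    V = np.zeros((2, n))
    pol = np.zeros((2, n), dtype=int)
    for _ in range(iters):                   # value function iteration
        EV = P @ V                           # expected continuation value
        Vn = np.empty_like(V)
        for s in range(2):
            c = (1 + r) * grid[:, None] + y[s] - grid[None, :]
            u = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)
            obj = u + beta * EV[s][None, :]
            pol[s] = obj.argmax(axis=1)
            Vn[s] = obj[np.arange(n), pol[s]]
        if np.max(np.abs(Vn - V)) < tol:
            break
        V = Vn
    dist = np.full((2, n), 0.5 / n)          # stationary distribution over (s, a)
    for _ in range(5000):
        new = np.zeros_like(dist)
        for s in range(2):
            for s2 in range(2):
                np.add.at(new[s2], pol[s], P[s, s2] * dist[s])
        if np.max(np.abs(new - dist)) < 1e-11:
            break
        dist = new
    return float((dist.sum(axis=0) * grid).sum())
```

Moving the fixed rate closer to η ≈ 4.2 percent raises long-run asset holdings sharply, which is exactly the sensitivity of asset demand that the diagram exploits.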
The right panel shows the mechanism in the limited participation model, which has a similar flavor. For simplicity, let us focus on the case where stockholders and nonstockholders have the same preferences and face the same portfolio constraints. We have η > R^s > R^f. Again, given the sensitivity of asset demand to returns near η, even a small equity premium generates substantial wealth inequality. It should be stressed, however, that large wealth inequality is not a foregone conclusion in any of these models: If returns were too low relative to η, individuals would be on the steeper part of their demand curves, which could result in smaller differences in wealth holdings. While the mechanics described here may appear quite similar across the three models, their substantive implications differ in crucial ways. For example, consider the effect of eliminating aggregate shocks from all three models. In Guvenen (2006), there will be no equity premium without aggregate shocks and, consequently, no wealth inequality. In Krusell and Smith (1998), wealth inequality will increase, as the patient agent holds more of the aggregate wealth (and would own all the wealth if there were no idiosyncratic shocks). In Laitner (1992), wealth inequality will remain unchanged, since it is created by idiosyncratic lifetime income risk. These dramatically different implications suggest that one can devise methods to bring empirical evidence to bear on the relevance of these different mechanisms.

Cagetti and De Nardi (2006)

Cagetti and De Nardi (2006) introduce heterogeneity across individuals in both work and entrepreneurial ability. Entrepreneurial firms operate decreasing-returns-to-scale production functions, and higher entrepreneurial ability implies a higher optimal scale.
Because debt contracts are not perfectly enforceable due to limited commitment, business owners need to put up some of their assets as collateral, a portion of which would be confiscated in case of default. Thus, entrepreneurs with very promising projects have more to lose from default, which induces them to save more for collateral, borrow more against it, and reach their larger optimal scale. The model is able to generate the extreme concentration of wealth at the top of the distribution (a group in which many households are entrepreneurs). Although this model differs from the limited participation framework in many important ways, differential returns to saving are a critical element for generating wealth inequality in both models. This link could be important because many individuals in the top 1 percent and 5 percent of the U.S. wealth distribution hold significant amounts of stocks but are not entrepreneurs (and hold no managerial roles), which the Cagetti/De Nardi model misses. The opposite is also true: Many very rich entrepreneurs are not stockholders (outside of their own company), which does not fit well with Guvenen's model (see Heaton and Lucas [2000] on the empirical facts about wealthy entrepreneurs and stockholders). The view that the very high wealth holdings of these individuals are driven by the higher returns they enjoy, whether as stockholders or as entrepreneurs, can offer a unified theory of saving rate differences.

Wage and Earnings Inequality

Because the consumption-savings decision is the cornerstone of the incomplete markets literature, virtually every model has implications for consumption and wealth inequality. The same is not true for earnings inequality. Many models assume that labor supply is inelastic and that the stochastic process for wages is exogenous, making the implications for wage and earnings inequality mechanical reflections of the model's assumptions.
Even if labor supply is assumed to be endogenous, many properties of the earnings distribution (exceptions noted below) mimic those of the wage distribution. For richer implications, it is the wage process itself that needs to be endogenized. In this subsection, I first review the empirical facts regarding wage inequality, both over the life cycle and over time. These facts are useful for practitioners since they are commonly used as exogenous inputs into incomplete markets models. Unless specifically mentioned, all the facts discussed here pertain to male workers, because the bulk of the existing evidence is available consistently for this group.38 Second, I discuss models that attempt to endogenize wages and explain the reported facts and trends.

Inequality Over the Life Cycle

The main facts about the evolution of (within-cohort) earnings inequality over the life cycle were first documented by Deaton and Paxson (1994) and are shown in the left panel of Figure 1. The same exercise has been repeated by numerous authors using different data sets or time periods (among others, Storesletten, Telmer, and Yaron [2004a], Guvenen [2009a], Heathcote, Perri, and Violante [2010], and Kaplan [2010]). While the magnitudes differ somewhat, the basic fact that wage and earnings inequality rise substantially over the life cycle is well-established. One view is that this fact does not require an elaborate explanation, because wages follow a very persistent, perhaps permanent, stochastic process as implied by the RIP model. Thus, the rising life-cycle inequality is simply a reflection of the accumulation of such shocks, which drive up the variance of log wages in a linear fashion (in the case of permanent shocks).
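The linear variance growth implied by permanent shocks is easy to verify in a short simulation. The sketch below is my own illustration, with hypothetical parameter values (not taken from the studies cited): log wages follow a random walk, so the cross-sectional variance rises by σ²_η each year.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_years, sigma_eta = 100_000, 30, 0.1   # hypothetical calibration

log_w = rng.normal(0.0, 0.3, n_workers)            # dispersion at labor market entry
variances = [log_w.var()]
for t in range(n_years):
    # Permanent shock: each innovation is carried forever, so dispersion accumulates.
    log_w = log_w + rng.normal(0.0, sigma_eta, n_workers)
    variances.append(log_w.var())

# The variance-age profile is (up to sampling error) linear with slope sigma_eta**2.
slopes = np.diff(variances)
print(np.mean(slopes))   # close to sigma_eta**2 = 0.01
```

The same exercise with a mean-reverting (low-persistence) process would instead produce a flat variance profile, which is why rising life-cycle inequality is read as evidence of very persistent shocks under this view.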
I will continue to refer to this view as the RIP model because of its emphasis on persistent "shocks."39 An alternative perspective, which has received attention more recently, emphasizes systematic factors: heterogeneity, as opposed to random shocks. This view is essentially in the same spirit as the HIP model of the previous section. But it goes one step further by endogenizing the wage distribution based on the human capital framework of Becker (1964) and Ben-Porath (1967), among others. In an influential article, Huggett, Ventura, and Yaron (2006) study the distributional implications of the standard Ben-Porath (1967) model by asking what types of heterogeneity one needs to introduce to generate patterns consistent with the U.S. data. They find that too much heterogeneity in initial human capital levels results in the counterfactual implication that wage inequality should fall over the life cycle. In contrast, heterogeneity in learning ability generates a rise in wage inequality consistent with the data. A key implication of this finding is that the rise in wage inequality can be generated without appealing to idiosyncratic shocks of any kind. Instead, it is the systematic fanning out of wage profiles, resulting from different investment rates, that generates rising inequality over the life cycle.

38 Some of the empirical trends discussed also apply to women, while others do not. See Autor, Katz, and Kearney (2008) for a comparative look at wage trends for males and females during the period.

39 Of course, one can write deeper economic models that generate the observation that wages follow a random walk process, such as the learning model of Jovanovic (1979) in a search and matching environment, or the optimal contracts in the limited commitment model of Harris and Holmstrom (1982).

Guvenen, Kuruscu, and Ozkan (2009) and Huggett, Ventura, and Yaron (2011) introduce
idiosyncratic shocks into the Ben-Porath framework. Both articles find that heterogeneous growth rates continue to play the dominant role in the rise in wage inequality. The Ben-Porath formulation is also central for wage determination in Heckman, Lochner, and Taber (1998) and Kitao, Ljungqvist, and Sargent (2008).

Inequality Trends Over Time

A well-documented empirical trend since the 1970s is the rise in wage inequality among male workers in the United States. This trend has been especially prominent above the median of the wage distribution: For example, the log wage differential between the 90th and the 50th percentiles has been expanding in a secular fashion for the past four decades. The changes at the bottom have been more episodic, with the log 50-10 wage differential strongly expanding until the late 1980s and then closing subsequently (see Autor, Katz, and Kearney [2008] for a detailed review of the evidence). Acemoglu (2002) contains an extensive summary of several related wage trends, as well as a review of proposed explanations. Here I only discuss the subset of articles that are most closely relevant to the incomplete markets macro literature.

Larger Shocks or Increasing Heterogeneity?

Economists' interpretations of the rise in wage inequality over the life cycle and over time are intimately related. The RIP view that was motivated by analyses of life-cycle wages was dominant in the 1980s and 1990s, so it was natural for economists to interpret the rise in wage inequality over time through the same lens. Starting with Gottschalk and Moffitt (1994) and Moffitt and Gottschalk (1995), this trend has been broadly interpreted as reflecting a rise in the variances of idiosyncratic shocks, either permanent or transitory (Meghir and Pistaferri 2004; Heathcote, Storesletten, and Violante 2008; etc.).
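The permanent-transitory decomposition underlying this interpretation can be sketched in a few lines. The example below is a stylized illustration with made-up parameter values, not a replication of any of the cited studies: log income is a random walk plus transitory noise, and both shock variances are identified from just two moments of income growth.

```python
import numpy as np

# Income process:  y_it = p_it + eps_it,  p_it = p_{i,t-1} + eta_it.
# For growth dy_it = eta_it + eps_it - eps_{i,t-1}:
#   Var(dy)             = sig_eta**2 + 2*sig_eps**2
#   Cov(dy_t, dy_{t-1}) = -sig_eps**2
rng = np.random.default_rng(1)
n, T = 100_000, 12
sig_eta, sig_eps = 0.15, 0.20          # hypothetical shock standard deviations

eta = rng.normal(0, sig_eta, (n, T))   # permanent (random walk) innovations
eps = rng.normal(0, sig_eps, (n, T))   # transitory shocks
y = np.cumsum(eta, axis=1) + eps       # simulated log income panel
dy = np.diff(y, axis=1)                # income growth

var_dy = dy.var()
cov_dy = np.mean(dy[:, 1:] * dy[:, :-1])   # first-order autocovariance of growth

sig_eps2_hat = -cov_dy                 # recovers sig_eps**2
sig_eta2_hat = var_dy + 2 * cov_dy     # recovers sig_eta**2
print(sig_eta2_hat, sig_eps2_hat)
```

Applied year by year to panel data, estimates of this kind are what allow researchers to attribute a rising cross-sectional variance to larger permanent shocks, larger transitory shocks, or both.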
This approach remains the dominant way to calibrate economic models that investigate changes in economic outcomes from the 1970s to date. However, some recent articles have documented new evidence that seems hard to reconcile with the RIP view. The first group of articles revisits the econometric analyses of wage and earnings data. Among these, Sabelhaus and Song (2009, 2010) use panel data from Social Security records covering millions of American workers, in contrast to the long list of previous studies that use survey data (e.g., the PSID).40

40 These include, among others, Meghir and Pistaferri (2004), Dynan, Elmendorf, and Sichel (2007), Heathcote, Storesletten, and Violante (2008), and Shin and Solon (2011).

While this data set has the potential drawback of under-reporting (because it is based on income reported to the Internal Revenue Service), it has three important advantages: (i) a much larger sample size (on the order of 50+ million observations, compared to at most 50,000 in the PSID), (ii) no survey response error, and (iii) no attrition. Sabelhaus and Song find that the volatility of annual earnings growth increased during the 1970s but declined monotonically during the 1980s and 1990s. Furthermore, applying the standard permanent-transitory decomposition, as in Moffitt and Gottschalk (1995) and Meghir and Pistaferri (2004), reveals that permanent shock variances were stable and transitory shock variances became smaller from 1980 into the 2000s. A separate study conducted by the Congressional Budget Office (2008), also using wage earnings from Social Security records from 1980–2003, reached the same conclusion.41 Finally, Kopczuk, Saez, and Song (2010) document (also using Social Security data) that both long-run and short-run mobility have stayed remarkably stable from the 1960s into the 2000s.
But this finding seems difficult to reconcile with Moffitt and Gottschalk (1995) and the subsequent literature that found permanent and transitory shock variances to have risen in different subperiods from 1970 to the 2000s. If true, the latter would result in fluctuations in mobility patterns over these subperiods, which is not borne out in Kopczuk, Saez, and Song's (2010) analysis. Another piece of evidence from income data is offered by Haider (2001), who estimates a stochastic process for wages similar to the one in Moffitt and Gottschalk (1995) and others, but with one key difference: He allows for individual-specific wage growth rates (HIP), and he also allows for the dispersion of growth rates to vary over time. The stochastic component is specified as an ARMA(1,1). With this more flexible specification, he finds no evidence of a rise in the variance of income shocks after the 1970s, but instead finds a large increase in the dispersion of systematic wage growth rates.

A second strand of the literature studies the trends in labor market flows in the United States. These articles do not find any evidence of rising job instability or churning, which one might expect to see in conjunction with larger idiosyncratic shocks. In contrast, these studies document an across-the-board moderation in labor market flows. For example, Gottschalk and Moffitt (1999) focus on male workers between the ages of 20 and 62 and conclude their analysis as follows:

[W]e believe that a consistent picture is emerging on changes in job stability and job security in the 1980s and 1990s. Job instability does not seem to have increased, and the consequences of separating from an employer do not seem to have worsened.42

41 Sabelhaus and Song attribute the finding of rising variances of wage shocks in some earlier studies (e.g., Moffitt and Gottschalk 2008) to the inclusion of individuals with self-employment income and those who earn less than the Social Security minimum. Even though there are few of these households, Sabelhaus and Song show that they make a big difference in the computed statistics. Similarly, Shin and Solon (2011, 978–80) use PSID data and also do not find a trend in the volatility of wage earnings changes during the 1980s and 1990s. They argue that the increasing volatility found in earlier studies, such as Dynan, Elmendorf, and Sichel (2007), seems to come from the inclusion of some auxiliary types of income (from business, farming, etc.) whose treatment has been inconsistent in the PSID over the years.

Shimer (2005, 2007) and Davis et al. (2010) extend this analysis to cover the 2000s and use a variety of data sets to reach the same conclusion. Further, both articles show that expanding the sample of individuals to include women and younger workers reveals a declining trend in labor market flows and an increase in job security.

Taking Stock

To summarize, the seeming consensus of the 1990s (that rising wage inequality was driven by an increase in idiosyncratic shock variances) is being challenged by a variety of new evidence, some of which comes from data sets many orders of magnitude larger than the surveys used in previous analyses. In addition, the evidence from labor market flows described above, while perhaps more indirect, raises questions about the sources of larger idiosyncratic shocks in a period when labor market transitions seem to have moderated. It would be premature to conclude that the alternative view is the correct one; more evidence is needed to reach a definitive conclusion. Having said that, if this alternative view is true, and income shock variances have not increased, this new "fact" would require economists to rethink a variety of explanations put forward for various trends that assumed a rise in shock variances during this period.

5.
HETEROGENEITY WITH AGGREGATE SHOCKS

Krusell and Smith (1998) add two elements to the basic Aiyagari framework. First, they introduce aggregate technology shocks. Second, they assume that the cumulative discount factor at time t (which was assumed to be δ^t before) now follows the stochastic process δ_t = δ̃ δ_{t−1}, where δ̃ is drawn from a finite-state Markov chain. The stochastic evolution of discount factors within a dynasty captures some elements of an explicit overlapping-generations structure with altruism and less-than-perfect correlation in genes between parents and children, as in Laitner (1992). With this interpretation in mind, the evolution of δ̃ is calibrated so that the average duration of any particular value of the discount factor is equal to the lifetime of a generation. (Krusell and Smith [1997] study a version of this model where consumers are allowed to hold a risk-free bond in addition to capital.)

The specifics of the model are as follows. There are two types of shocks: (i) idiosyncratic employment status, ε ∈ (e, u) ≡ (employed, unemployed); and (ii) aggregate productivity, z ∈ (z_g, z_b) ≡ (expansion, recession). Employment status and aggregate productivity jointly evolve as a first-order Markov process. Assume that ε is i.i.d. conditional on z, so the fraction of employed workers (and hence aggregate labor L) depends only on z. Competitive markets imply

w(K, L, z) = (1 − α) z (K/L)^α, and r(K, L, z) = α z (K/L)^(α−1). (12)

Finally, the entire wealth distribution, which I denote with Γ, is a state variable for this model; let Γ′ = H(Γ, z; z′) denote its endogenous transition function (or law of motion).

42 They also say, "Almost all studies based on the various Current Population Surveys (CPS) supplements...show little change in the overall separation rates through the early 1990s."

Krusell-Smith Algorithm

A key equilibrium object in this class of models is the law of motion, H.
In principle, computing this object is a formidable task since the distribution of wealth is infinite-dimensional. Krusell and Smith (1997, 1998) show, however, that this class of models, when reasonably parameterized, exhibits "approximate aggregation": Loosely speaking, to predict future prices consumers need to forecast only a small set of statistics of the wealth distribution rather than the entire distribution itself. This result makes it possible to use numerical methods to analyze this class of models. Another key feature of the Krusell-Smith algorithm is that it solves the model by simulating it. Specifically, the basic version of the algorithm works as follows:

1. Approximate Γ with a finite number (I) of moments. (H then reduces to a function mapping the I moments today into the I moments tomorrow, depending on today's z.)
(a) We will start by selecting one moment, the mean, so I = 1.43

2. Select a family of functions for H. Following Krusell and Smith, I choose a log-linear function:
log K′ = a0 + a1 log K for z = z_b
log K′ = b0 + b1 log K for z = z_g
Given H, the consumer's problem is
V(k, ε; Γ, z) = max_{c, k′} U(c) + δ E[V(k′, ε′; Γ′, z′) | ε, z]
subject to c + k′ = w(K, L, z) × l × ε + r(K, L, z) × k, and k′ ≥ 0.

3. Make an (educated) initial guess for (a0, a1, b0, b1), which yields an initial guess for H_0. Also make an initial guess for Γ_0.

43 When we add more moments, we do not have to proceed as mean, variance, skewness, and so on. We can include, say, the wealth holdings of the top 10 percent of the population, the mean-to-median wealth ratio, etc.

4. Solve the consumer's dynamic program. Using only the resulting decision rules, simulate {k_{n,t}} for n = 1, ..., N and t = 1, ..., T, with (N, T) large.

5. Update H by estimating (where K̃ = (1/N) Σ_n k_n):
log K̃′ = a0 + a1 log K̃ for z = z_b
log K̃′ = b0 + b1 log K̃ for z = z_g

6.
Iterate on steps 4–5 until the R² of this regression is "sufficiently high" and the forecast variance is "small."
(a) If accuracy remains insufficient, go back to step 1 and increase I.

Details

Educated Initial Guess
As with many numerical methods, a good initial guess is critical. More often than not, the success or failure of a given algorithm will depend on the initial guess. One idea (used by Krusell and Smith) is to first solve a standard representative-agent real business cycle (RBC) model with the same parameterization, then estimate the coefficients (a0, a1, b0, b1) using capital series simulated from this model to obtain an initial guess for step 3 above.44 More generally, a good initial guess can often be obtained by solving a simplified version of the full model. Sometimes this simplification involves ignoring certain constraints, sometimes shutting down certain shocks, and so on.

Discretizing an AR(1) Process
Oftentimes, the exogenous driving force in incomplete markets models is assumed to be generated by an AR(1) process, which is discretized and converted into a Markov chain. One popular method for discretization is described in Aiyagari (1993) and has been used extensively in the literature. However, an alternative method by Rouwenhorst (1995), which received far less attention until recently, is far superior in the quality of the approximation that it provides, especially when the process is very persistent, which is often the case. Moreover, it is very easy to implement. Kopecky and Suen (2010) and Galindev and Lkhagvasuren (2010) provide comprehensive comparisons of different discretization methods, which reveal the general superiority of

44 Can't we update H without simulating? Yes, we can. Den Haan and Rendahl (2009) propose a method where they use the policy functions for capital holdings and integrate them over the distribution Γ(k, ε) of households across capital and employment status: K′ = H_j(K, z) = ∫ k′_j(k, ε; Γ, z) dΓ(k, ε).
This works well when the decision rules are parameterized in a particular way; see den Haan and Rendahl (2009).

Rouwenhorst's (1995) method. They also show how this method can be extended to discretize more general processes.

Non-Trivial Equilibrium Pricing Function
One simplifying feature of Krusell and Smith (1998) is that equilibrium prices (wages and interest rates) are determined trivially by the marginal product conditions (12). Thus, they depend only on the aggregate capital stock and not on its distribution. Some models do not have this structure; instead, pricing functions must be determined by equilibrium conditions, such as market-clearing or zero-profit conditions, that explicitly depend on the wealth distribution. This would be the case, for example, if a household bond is traded in the economy. Its price must be solved for using a market-clearing condition, which is a challenging task. Moreover, if there is an additional asset, such as a stock, two prices must be determined simultaneously, and this must be done in a way that avoids providing arbitrage opportunities at every iteration of the solution algorithm. Otherwise, individuals' portfolio choices will go haywire (in an attempt to take advantage of perceived arbitrage), wreaking havoc with the solution algorithm. Krusell and Smith (1997) solve such a model and propose an algorithm to tackle these issues; I refer the interested reader to that article for details.

Checking for Accuracy of Solution
Many studies with aggregate fluctuations and heterogeneity use two simple criteria to assess the accuracy of the law of motion in their limited-information approximation. If agents are using the law of motion

log K′ = α1 + α2 z + α3 log K + u, (13)

a perfectly solved model should find u = 0.
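As a schematic illustration of these accuracy statistics (using a made-up, mildly nonlinear law of motion rather than an actual solved model), one can estimate equation (13) by ordinary least squares on simulated data and inspect R² and σ_u:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 10_000
z = rng.integers(0, 2, T)              # 0 = z_b (recession), 1 = z_g (expansion)

# Stand-in "true" aggregate dynamics, hypothetical and mildly nonlinear in log K:
logK = np.empty(T + 1)
logK[0] = np.log(10.0)
for t in range(T):
    logK[t + 1] = (0.10 + 0.02 * z[t] + 0.96 * logK[t]
                   - 0.004 * logK[t] ** 2 + rng.normal(0.0, 0.002))

# OLS estimate of equation (13): log K' = alpha1 + alpha2*z + alpha3*log K + u
X = np.column_stack([np.ones(T), z, logK[:-1]])
y = logK[1:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ coef
r2 = 1.0 - u.var() / y.var()
print(r2, u.std())   # R^2 near 1 and small sigma_u despite the misspecification
```

Note that the log-linear rule attains a very high R² here even though the true dynamics are nonlinear, which previews the warning below about relying on R² and σ_u alone.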
Thus, practitioners will continue to solve the model until either the R² of this regression exceeds some minimum or σ_u falls below some threshold (step 6 in the algorithm above). However, one should not rely solely on R² and σ_u. There are at least three reasons for this (den Haan 2010). First, both measures average over all periods of the simulation. Thus, infrequent but large deviations from the forecast rule can be hidden in the average. These errors may be very important to agents and their decision rules. For example, the threat of a very large recession may increase buffer stock saving, but the approximation may understate the movement of capital in such a case. Second, and more importantly, these statistics only measure one-step-ahead forecasts. The regression only considers the dynamics from one period to the next, so errors are only the deviations between the actual next-period capital and the expected amount in the next period. This misses potentially large deviations between the long-term forecast for capital and its actual level. Aware of this possibility, Krusell and Smith (1998) also check the R² for forecasting prices 25 years ahead (100 model periods) and find it to be extremely high as well! (They also check the maximum error in long-term forecasts, which is very small.) Third, R² is scaled by the left-hand side of the regression. An alternative is to check the R² of

log K′ − log K = α1 + α2 z + (α3 − 1) log K.

As a particularly dramatic demonstration, den Haan (2010) uses a savings decision rule that solves the Krusell and Smith (1998) model, simulates it for T periods, and estimates a law of motion in the form of equation (13). He then manipulates α1 and α3 such that T^(−1) Σ_t u_t = 0 still holds but the R² falls from 0.9999 to 0.999 and then to 0.99.
This has economically meaningful consequences: The time series standard deviation of the capital stock simulated from the perturbed versions of equation (13) falls to 70 percent and then to 46 percent of the true figure. Finally, den Haan (2010) proposes a useful test that begins with the approximated law of motion,

log K′ = α̂1 + α̂2 z + α̂3 log K + u, (14)

uses it to generate a sequence of realizations {K̃_{t+1}} for t = 0, ..., T, and then compares these to the sequence generated by aggregating from the decision rules, i.e., the true law of motion. Because {K̃_{t+1}} is obtained by repeatedly applying equation (14) starting from K̃_0, errors can accumulate. This is important because, in the true model, today's choices depend on expectations about the future state, which in turn depend on expectations formed even further in the future, and so errors cascade. To systematically compare K̃_t to K_t, den Haan proposes an "essential accuracy plot": For a sequence of shocks (not those originally used when estimating α to solve the model), generate a sequence of K̃_t and K_t. One can then compare moments of the two simulated sequences. The main focus of the accuracy test is the errors calculated by ũ_t = |log K̃_t − log K_t|, whose maximum should be made close to zero.

Prices are More Sensitive than Quantities
The accuracy of the numerical solution becomes an even more critical issue if the main focus of the analysis is (asset) prices rather than quantities. This is because prices are much more sensitive to approximation errors (see, e.g., Christiano and Fisher [2000] and Judd and Guu [2001]). The results in Christiano and Fisher (2000) are especially striking. These authors compare a variety of different implementations of the "parameterized expectations" method and report the approximation errors resulting from each. For the standard deviation of output, consumption, and investment (i.e., "quantities"), the approximation errors range from less than 0.1 percent of the true value to 1 percent to 2 percent in some cases. For the stock and bond returns and the equity premium, the errors regularly exceed 50 percent and are greater than 100 percent in several cases. The bottom line is that the computation of asset prices requires extra care.

Pros and Cons
An important feature of the Krusell-Smith method is that it is a "local" solution around the stationary recursive equilibrium. In other words, this method relies on simulating a very long time series of data (e.g., a capital series) and making sure that, after this path has converged to the ergodic set, the predictions of agents are accurate for behavior inside that set. This has some advantages and some disadvantages. One advantage is the efficiency gain compared to solving a full recursive equilibrium, which enforces the equilibrium conditions at every point of a somewhat arbitrary grid, regardless of whether or not a given state is ever visited in the stationary equilibrium. One disadvantage is that if you take a larger deviation (say, by setting K_0 to a value well below the steady-state value), your "equilibrium functions" are likely to be inaccurate, and the behavior of the solution may differ significantly from the true solution. Why should we care about this? Suppose you solve your model and then want to study a policy experiment that eliminates taxes on savings. You would need to write a separate program for the "transition" between the two stationary equilibria. Instead, if you solve for the full recursive equilibrium over a grid that contains both the initial and final steady states, you would not need to do this. However, solving for the full equilibrium is often much harder and, therefore, is often overkill.

An Alternative to Krusell-Smith: Tracking the History of Shocks
Some models have few exogenous state variables but a large number of endogenous states.
In such cases, using a formulation that keeps track of all these state variables can make the numerical solution extremely costly or even infeasible. An alternative method begins with the straightforward observation that all current endogenous state variables are nothing more than functions of the infinite history of exogenous shocks. So, one could replace these endogenous states with the infinite history of exogenous states. Moreover, many models turn out to have "limited memory," in the sense that only the recent history of shocks matters in a quantitatively significant way, allowing us to track only a truncated history of exogenous states. The first implementation of this idea I have been able to find is in Veracierto (1997), who studied a model with plant-level investment irreversibilities, which give rise to S-s type policies. He showed that it is more practical to track a short history of the S-s thresholds instead of the current-period endogenous state variables.

As another example, consider an equilibrium model of the housing market where the only exogenous state is the interest rate, which evolves as a Markov process. Depending on the precise model structure, the individual endogenous state variables can include the mortgage debt outstanding, the time-to-maturity of the mortgage contract, etc., and the aggregate endogenous state can include the entire distribution of agents over the individual states. This is potentially an enormously large state space! Arslan (2011) successfully solves a model of this sort, with realistic fixed-rate mortgage contracts, a life-cycle structure, and stochastic interest rates, using four lags of interest rates.
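The limited-memory logic can be seen in a linear toy example (my own illustration, unrelated to the models just cited): an AR(1)-type endogenous state is an exact function of the infinite shock history, and truncating that history at a few lags already yields a good approximation when persistence is below one.

```python
import numpy as np

rng = np.random.default_rng(3)
rho, gamma, T = 0.8, 1.0, 5_000      # hypothetical persistence below one
z = rng.normal(0.0, 1.0, T)          # exogenous shocks

x = np.zeros(T)                      # endogenous state: x_t = rho*x_{t-1} + gamma*z_t
for t in range(1, T):
    x[t] = rho * x[t - 1] + gamma * z[t]

def x_from_history(t, k):
    """Approximate x_t using only the last k shocks (truncated history)."""
    return sum(gamma * rho**j * z[t - j] for j in range(k))

max_err = {}
for k in (2, 4, 8):
    max_err[k] = max(abs(x[t] - x_from_history(t, k)) for t in range(100, T))
    print(k, max_err[k])             # maximum error shrinks roughly like rho**k
```

Because the truncation error is exactly rho**k times the state k periods ago, a handful of lags suffices whenever persistence is moderate, which is the sense in which such models have "limited memory."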
Other recent examples that employ this basic approach include Chien and Lustig (2010), who solve an asset pricing model with aggregate and idiosyncratic risk in which agents are subject to collateral constraints arising from limited commitment, and Lorenzoni (2009), who solves a business cycle model with shocks to individuals' expectations. In all these cases, tracking a truncated history turns out to provide computational advantages over choosing endogenous state variables that evolve in a Markovian fashion. One advantage of this approach is that adding an extra endogenous state variable is often less costly than under the standard approach, because the same number of lags may still be sufficient. One drawback is that if the Markov process has many states or the model has long memory, the method may not work as well.

6. WHEN DOES HETEROGENEITY MATTER FOR AGGREGATES?

As noted earlier, Krusell and Smith (1998) report that the time series behavior of the aggregated incomplete markets model, by and large, looks very similar to that of the corresponding representative-agent model. A similar result was reported in Ríos-Rull (1996). However, it is important to interpret these findings correctly and not to overgeneralize them. For example, even if a model aggregates exactly, modeling heterogeneity can be very important for aggregate problems. This is because the problem solved by the representative agent can look dramatically different from the problem solved by individuals (for example, the two can have very different preferences). Here, I discuss some important problems in macroeconomics where introducing heterogeneity yields conclusions quite different from a representative-agent model.

The Curse of Long Horizon

It is useful to start by discussing why incomplete markets do not matter in many models. Loosely speaking, this outcome follows from the fact that a long horizon makes individuals' savings functions approximately linear in wealth (i.e., a constant MPC out of wealth).
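Why linearity of the savings function delivers aggregation can be shown in a few lines (a hypothetical common linear policy rule, for illustration only): when everyone saves according to the same linear rule, aggregate saving depends only on aggregate wealth, so redistributing wealth leaves the aggregate unchanged.

```python
import numpy as np

a, b = 0.1, 0.3                      # hypothetical common saving rule: s_i = a + b*w_i
rng = np.random.default_rng(4)

w_equal = np.full(1000, 5.0)         # everyone holds the same wealth
w_unequal = rng.exponential(5.0, 1000)
w_unequal *= w_equal.sum() / w_unequal.sum()   # same aggregate wealth, very unequal

S_equal = np.sum(a + b * w_equal)
S_unequal = np.sum(a + b * w_unequal)
print(S_equal, S_unequal)            # aggregate saving identical under both distributions
```

With curvature in the policy rule (as occurs near borrowing constraints), the two totals would differ, which is exactly where the distribution of wealth starts to matter for aggregates.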
As we saw in Section 1, the exact linearity of savings rules delivers exact demand aggregation in both Gorman's (1961) and Rubinstein's (1974) theorems. As it turns out, even with idiosyncratic shocks, this near-linearity holds for wealth levels that are not immediately near borrowing constraints. Thus, even though markets are incomplete, redistributing wealth would matter little, and we have something that looks like demand aggregation! Is there a way to get around this result? It is instructive to look at a concrete example. Mankiw (1986) was one of the first articles in the literature on the equity premium puzzle, and one that gave prominence to the role of heterogeneity. Mankiw (1986) shows that, in a two-period model with incomplete markets and idiosyncratic risk of the right form, one can generate an equity premium as large as desired. However, researchers who followed up on this promising lead (Telmer 1993; Heaton and Lucas 1996; Krusell and Smith 1997) quickly came to a disappointing conclusion: Once agents in these models are allowed to live for multiple periods, trading a single risk-free asset provides sufficient consumption insurance, which in turn results in a tiny equity premium. This result, that a sufficiently long horizon can dramatically weaken the effects of incomplete markets, is quite general. In fact, Levine and Zame (2002) prove that if, in a single-good economy with no aggregate shocks, (i) idiosyncratic income shocks follow a Markov process, (ii) marginal utility is convex, and (iii) all agents have access to a single risk-free asset, then, as individuals' subjective time discount factor (δ) approaches unity, incomplete markets allocations (and utilities) converge to those of a complete markets economy with the same aggregate resources.
Although Levine and Zame’s result is theoretical and holds in the limit (as δ → 1), it still sounds a cautionary note to researchers building incomplete markets models: Unless shocks are extremely persistent and/or individuals are very impatient, these models are unlikely to generate results much different from a representative-agent model. Constantinides and Duffie (1996) show one way to get around the problem of a long horizon, one that is also consistent with the message of the Levine-Zame theorem. Essentially, they assume that individuals face permanent shocks, which eliminates the incentive to smooth such shocks. Individuals therefore behave as if they live in a static world and choose not to trade. Constantinides and Duffie also revive another feature of Mankiw’s (1986) model: Idiosyncratic shocks must have larger variance in recessions (i.e., countercyclical variances) to generate a large equity premium. With these two features, they show that Mankiw’s original insight can be made to work once again in an infinite horizon model. Storesletten, Telmer, and Yaron (2007) find that a calibrated model along the lines suggested by Constantinides and Duffie can generate about half of the equity premium observed in the U.S. data. The bottom line is that, loosely speaking, if incomplete markets matter in a model mainly through their effect on the consumption-saving decision, a long horizon can significantly weaken the bite of incomplete markets. With a long enough horizon, agents accumulate sufficient wealth and end up on the nearly linear portion of their savings function, delivering results not far from a complete markets model. This is also the upshot of Krusell and Smith’s (1998) analysis.

Examples Where Heterogeneity Does Matter

There are many examples in which heterogeneity does matter for aggregate phenomena. Here, I review some of them.
First, aggregating heterogeneous-agent models can give rise to preferences for the representative agent that may have nothing to do with the preferences in the underlying model. A well-known example of such a transformation is present in the early works of Hansen (1985) and Rogerson (1988), who show that in a model in which individuals have no intensive margin of labor supply (i.e., a zero Frisch labor supply elasticity), one can aggregate the model to find that the representative agent has preferences that are linear in leisure (i.e., an infinite Frisch elasticity!). This conclusion challenges one of the early justifications for building models with microfoundations, which was to bring evidence from microdata to bear on the calibration of macroeconomic models. In an excellent survey article, Browning, Hansen, and Heckman (1999) issue an early warning, giving several examples where this approach is fraught with danger. Building on the earlier insights of Hansen and Rogerson, Chang and Kim (2006) construct a model in which the aggregate labor-supply elasticity depends on the reservation-wage distribution in the population. The economy is populated by households (a husband and a wife), each member of which supplies labor only along the extensive margin: they either work full time or stay home. Workers are hit by idiosyncratic productivity shocks, causing them to move in and out of the labor market. The aggregate labor-supply elasticity of such an economy is around one, greater than a typical microestimate and much greater than the Frisch elasticity one would measure at the intensive margin (which is zero) in this model. The model thus provides a reconciliation between the micro- and macro-labor-supply elasticities. In a similar vein, Chang, Kim, and Schorfheide (2010) show that preference shifters that play an important role in discussions of aggregate policy are not invariant to policies if they are generated from the aggregation of a heterogeneous-agent model.
Such a model also generates “wedges” at the aggregate level that do not translate into any well-defined notion of preference shifters at the micro level.45 Finally, Erosa, Fuster, and Kambourov (2009) build a life-cycle model of labor supply by combining and extending the ideas introduced in earlier articles, such as Chang and Kim (2006) and Rogerson and Wallenius (2009). Their goal is to build a model with empirically plausible patterns of hours over the life cycle and examine the response elasticities of labor supply to various policy experiments. In another context, Guvenen (2006) asks why macroeconomic models in the RBC tradition typically need to use a high elasticity of intertemporal substitution (EIS) to explain output and investment fluctuations, whereas Euler equation regressions (such as in Hall [1988] and Campbell and Mankiw [1990]) that use aggregate consumption data estimate a much smaller EIS (close to zero) to fit the data. He builds a model with two types of agents who differ in their EIS. The model generates substantial wealth inequality and much smaller consumption inequality, both in line with the U.S. data. Consequently, capital and investment fluctuations are mainly driven by the rich (who hold almost all the wealth in the economy) and thus reflect the high EIS of this group. Consumption fluctuations, on the other hand, reflect an average that puts much more weight on the EIS of the poor, who contribute significantly to aggregate consumption. Thus, a heterogeneous-agent model is able to explain aggregate evidence that a single representative-agent model has trouble fitting. In an asset pricing context, Constantinides and Duffie (1996), Chan and Kogan (2002), and Guvenen (2011) show a similar result for risk aversion.

45 See Chari, Kehoe, and McGrattan (2007) for the definition of business-cycle wedges.
Constantinides and Duffie (1996) show theoretically how the cross-sectional distribution of consumption in a heterogeneous-agent model gets translated into a higher risk aversion for the representative agent. Guvenen (2011) shows that, in a calibrated model with limited stock market participation that matches several asset pricing facts, aggregate risk aversion is measured to be as high as 80, even though individuals’ risk aversion is only two. These results, as well as the articles discussed above, confirm and amplify the concerns originally highlighted by Browning, Hansen, and Heckman (1999). The conclusion is that researchers must be very careful when using microeconomic evidence to calibrate representative-agent models.

7. COMPUTATION AND CALIBRATION

Because of their complexity, the overwhelming majority of models in this literature are solved on a computer using numerical methods.46 Thus, I now turn to a discussion of computational issues that researchers often have to confront when solving models with heterogeneity.

46 There are a few examples of analytical solutions and theoretical results established with heterogeneous-agent models. See, e.g., Constantinides and Duffie (1996), Heathcote, Storesletten, and Violante (2007), Rossi-Hansberg and Wright (2007), Wang (2009), and Guvenen and Kuruscu (forthcoming).

Calibration and Estimation: Avoid Local Search

Economists often need to minimize an objective function of multiple variables that has lots of kinks, jaggedness, and deep ridges. Consequently, the global minimum is often surrounded by a large number of local minima. A typical example of such a problem arises when a researcher tries to calibrate several structural parameters of an economic model by matching some data moments.
Algorithms based on local optimization methods (e.g., Newton-Raphson style derivative-based methods or Nelder-Mead simplex style routines) very often get stuck in local minima because the objective surface is typically very rough (non-smooth). It is useful to understand some of the sources of this roughness. For example, the linear interpolation that is often used in approximating value functions or decision rules generates an interpolated function that is non-differentiable (i.e., has kinks) at every knot point. Similarly, problems with constraints (borrowing, portfolio, etc.) can create significant kinks. Because researchers use a finite number of individuals to simulate data from the model (to compute moments), a small change in the parameter value (during the minimization of the objective) can move some individuals across a threshold, from being constrained to unconstrained or vice versa, which can cause small jumps in the objective value. And sometimes the moments that the researcher decides to match are inherently discontinuous in the underlying parameters (with a finite number of individuals), such as the median of a distribution (e.g., of wealth holdings). Further compounding the problems, if the moments are not jointly sufficiently informative about the parameters to be calibrated, the objective function will be flat in certain directions. As can be expected, trying to minimize a relatively flat function with lots of kinks, jaggedness, and even small jumps can be a very difficult task indeed.47 While the algorithm described here can be applied to the calibration of any model, it is especially useful in models with heterogeneous agents: since such models are time consuming to solve even once, an exhaustive search of the parameter space becomes prohibitively costly (whereas it might be feasible in simpler models).
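To see concretely how simulated moments produce a discontinuous objective, consider a toy sketch (the income process, the constraint, the moment, and the target value are all hypothetical illustrations, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed: common random numbers
x = rng.lognormal(size=200)          # a small simulated cross-section of incomes

def objective(theta, target=0.3):
    # moment: fraction of simulated agents stuck at the borrowing constraint a = 0
    assets = np.maximum(theta * x - 1.0, 0.0)
    frac_constrained = np.mean(assets == 0.0)
    return (frac_constrained - target) ** 2

# With a finite number of agents, the moment is a step function of theta:
# tiny parameter changes push individual agents across the constraint
# threshold and produce discrete jumps in the objective.
grid = np.linspace(0.5, 2.0, 2001)
vals = np.array([objective(t) for t in grid])
jumps = np.abs(np.diff(vals)) > 0
print(jumps.sum())                   # many discontinuities on a fine grid
```

A derivative-based routine evaluated near any of these jumps would see meaningless gradients, which is exactly why the jaggedness described above defeats purely local methods.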
47 One simple, but sometimes overlooked, point is that when minimizing an objective function of moments to calibrate a model, one should use the same “seeds” for the random elements of the model in successive evaluations of the objective function. Otherwise, some of the change in the objective value will be due to the inherent randomness in different draws of the random variables, which can create significant problems for the minimization procedure.

A Simple Fully Parallelizable Global Optimization Algorithm

Here, I describe a global optimization algorithm that I regularly use for calibrating models and have found to be very practical and powerful. It is relatively straightforward to implement, yet allows full parallelization across any number of central processing unit (CPU) cores as well as across any number of computers that are connected to the Internet. It requires no knowledge of MPI, OpenMP, or related tools, and no knowledge of computer networking other than using some commercially available synchronization tools (such as DropBox, SugarSync, etc.). A broad outline of the algorithm is as follows. As with many global algorithms, this procedure combines a global stage with a local search stage that is restarted at various locations in the parameter space. First, we would like to search the parameter space as thoroughly as possible, but do so in as efficient a way as possible. Thoroughness is essential because we want to be sure that we have found the true global minimum, so we are willing to sacrifice some speed to ensure this. The algorithm proceeds by taking an initial starting point (chosen in a manner described momentarily) and conducting a local search from that point until the minimization routine converges as specified by some tolerance.
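A minimal sketch of this restart loop, using SciPy's Sobol' generator and Nelder-Mead routine (the library choices and the toy objective are illustrative assumptions, not the author's code; the author records minima in a shared text file rather than a Python list):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import qmc

# A hypothetical rough objective standing in for a moment-matching criterion.
def objective(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 0.5) ** 2 + 0.1 * np.abs(np.sin(5 * p[0]))

lo, hi = np.array([-3.0, -3.0]), np.array([3.0, 3.0])
sobol = qmc.Sobol(d=2, scramble=False)
starts = qmc.scale(sobol.random(16), lo, hi)   # quasi-random starting points

# Global stage: restart a local search from each Sobol' point and
# record every local minimum found.
records = []
for s in starts:
    res = minimize(objective, s, method="Nelder-Mead")
    records.append((res.fun, res.x))

best_val, best_p = min(records, key=lambda r: r[0])
print(best_val, best_p)
```

The quasi-random restarts play the role of the global stage; each `minimize` call is one pass through the local stage.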
For the local search, I typically rely on the Nelder-Mead downhill simplex algorithm because it does not require derivative information (which may be inaccurate given the approximation errors in the model’s solution algorithm).48 The minimum function value, as well as the parameter combination that attained that minimum, are recorded in a file saved on the computer’s hard disk. The algorithm then picks the next “random” starting point and repeats the previous step of local minimization. The results are then added to the previous file, which records all the local minima found up to that point. Of course, the most obvious algorithm would be to keep doing a very large number of restarts of this sort and take the minimum of all the minima found in the process. But this would be very time consuming and not particularly efficient. Moreover, in many cases the neighborhood of the global minimum features many deep ridges and kinks, which calls for more extensive searching near the global minimum, whereas purely uniform restarts devote as much time to points far away from the true global minimum as to the points near it. Further, if the starting points are chosen literally at random, this also creates potentially large efficiency losses, because these points have a non-negligible chance of falling near points previously tried. Because those areas have been searched already, devoting more time to them is not optimal.

48 An alternative that can be much faster but requires a bit more tweaking for best performance is the trust region method of Zhang, Conn, and Scheinberg (2010), which builds on Powell’s (2009) BOBYQA algorithm.

A better approach is to use “quasi-random” numbers to generate the starting points. Quasi-random numbers (also called low-discrepancy sequences) are deterministic sequences of numbers that spread over a given space in a maximally separated way.
They avoid the pitfall of random draws that may end up being too close to each other. Each draw in the sequence “knows” the location of the previous points drawn and attempts to fill the gaps as evenly as possible.49 Among the variety of sequences proposed in the literature, the Sobol’ sequence is generally viewed as superior in most practical applications, both because it fills the space very uniformly (i.e., in a maximally separated way) even when a small number of points is drawn and because a very fast algorithm exists for generating it.50 Next, how do we use the accumulated information from previous restarts? As suggested by genetic algorithm heuristics, I combine information from previous best runs to adaptively direct the new restarts to areas that appear more promising. This is explained further below. Now for the specifics of the algorithm.

Algorithm 1. Let p be a J-dimensional parameter vector with generic element p_j, j = 1, ..., J.

• Step 0. Initialization:
– Determine bounds for each parameter, outside of which the objective function is set to a high value.
– Generate a Sobol’ sequence with length Imax (the maximum anticipated number of restarts in the global stage). Set the global iteration number i = 1.

• Step 1. Global Stage:
– Draw the ith (vector) value in the Sobol’ sequence: s_i.
– If i > 1, open and read from the text file “saved parameters.dat” the function values (and corresponding parameter vectors) of previously found local minima. Denote the lowest function value found as of iteration i−1 by f^low_{i−1} and the corresponding parameter vector by p^low_{i−1}.
– Generate a starting point for the local stage as follows:

49 Another common application of low-discrepancy sequences is in quasi-Monte Carlo integration, where they have been found to improve time-to-accuracy by several orders of magnitude.
50 In a wide range of optimization problems, Kucherenko and Sytsko (2005) and Liberti and Kucherenko (2005) find that Sobol’ sequences outperform Halton sequences, both in terms of computation time and the probability of finding the global optimum. The Halton sequence is particularly weak in high-dimensional applications.

∗ If i < Imin (< Imax), then use s_i as the initial guess: S_i = s_i. Here, Imin is the threshold below which we use fully quasi-random starting points in the global stage.
∗ If i ≥ Imin, take the initial guess to be a convex combination of s_i and the parameter value that generated the best local minimum so far: S_i = (1 − θ_i) s_i + θ_i p^low_{i−1}. The parameter θ_i ∈ [0, θ̄] with θ̄ < 1, and it increases with i. For example, I have found that a convex increasing function, such as θ_i = min[θ̄, (i/Imax)²], works well in some applications. An alternative heuristic is given later.
∗ As θ_i is increased, local searches are restarted from a narrower part of the parameter space, the part that yielded the lowest local minima before.

• Step 2. Local Stage:
– Using S_i as a starting point, use the downhill simplex algorithm to search for a local minimum. (For the other vertices of the simplex, randomly draw starting points within the bounds of the parameter space.)
– Stop when either (i) a certain tolerance has been achieved, (ii) function values do not improve by more than a certain amount, or (iii) the maximum iteration number is reached.
– Open saved parameters.dat and record the local minimum found (function value and parameters).

• Step 3. Stopping Rule:
– Stop if the termination criterion described below is satisfied. If not, go to Step 1.

Termination Criterion

One useful heuristic criterion relies on a Bayesian procedure that estimates the probability that the next local search will find a new local minimum, based on the rate at which new local minima have been located in past searches.
More concretely, if W different local minima have been found after K local searches started from a set of uniformly distributed points, then the expected number of local minima is

W_exp = W(K − 1)/(K − W − 2),

provided that K > W + 2. The search procedure is terminated if W_exp < W + 0.5. The idea is that if, after a while of searching, subsequent restarts keep finding the same local minima found before, the chance of improvement in subsequent searches is not worth the additional time cost. Although this is generally viewed as one of the most reliable heuristics, care must be applied, as with any heuristic. Notice also that W_exp can be used to adaptively increase the value of θ_i in the global stage (Step 1 (3) above). The idea is that, when subsequent global restarts stop yielding new local minima with a high enough probability, it is time to narrow the search and further explore the areas around the most promising local minima. Because jaggedness and deep ridges often cause local search methods to get stuck, we want to explore promising areas more thoroughly. One can improve on this basic algorithm in various ways. I mention a few that seem worth exploring.

Refinements: Clustering and Pre-Testing

First, suppose that in iteration k the proposed starting point S_k ends up being “close” to one of the previous minima, say p^low_n, for n < k. Then it is likely that the search starting from S_k will end up converging to p^low_n, and we will have wasted an entire cycle of local search without gaining anything. To prevent this, one class of heuristics (called “clustering methods”) proceeds by defining a “region of attraction” (essentially a J-dimensional ball) centered around each of the local minima found so far.51 The algorithm then discards a proposed restarting point if it falls into the region of attraction of any previous local minimum.
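The region-of-attraction filter and the Bayesian stopping rule described above each reduce to a few lines. In this sketch the fixed radius and the example points are hypothetical (as the text notes, practical formulas for the radius are less settled):

```python
import numpy as np

def keep_start(candidate, found_minima, radius=0.25):
    """Discard a proposed restart point if it falls inside the region of
    attraction (here, a simple fixed-radius ball) of any previous minimum."""
    return all(np.linalg.norm(candidate - m) > radius for m in found_minima)

def should_stop(W, K):
    """Bayesian stopping rule: with W distinct minima after K searches,
    stop once the expected number of minima W_exp < W + 0.5."""
    if K <= W + 2:
        return False
    W_exp = W * (K - 1) / (K - W - 2)
    return W_exp < W + 0.5

found = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
print(keep_start(np.array([0.1, 0.0]), found))   # inside a ball: discarded
print(keep_start(np.array([2.0, 2.0]), found))   # far away: kept
print(should_stop(W=2, K=3), should_stop(W=2, K=40))
```

With W = 2 and K = 40, W_exp = 2 × 39/36 ≈ 2.17 < 2.5, so the rule fires: repeated searches that keep rediscovering the same two minima are no longer worth their cost.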
Because the local minimization stage is the most computationally intensive step, this refinement of restarting the local search only once in a given region of attraction can result in significant computational gains. Extensive surveys of clustering methods can be found in Rinnooy Kan and Timmer (1987a, 1987b). Second, one can add a “pre-test” stage in which N points from the Sobol’ sequence are evaluated before any local search (i.e., in Step 0 above), and only the subset of N∗ < N points with the lowest objective values are used as seeds in the local search. The remaining points, as well as the regions of attraction around them, are ignored as not promising. Notice that while this stage can improve speed, it trades off some reliability in the process.

51 While different formulas have been proposed for determining the optimal radius, these formulas contain some undetermined coefficients, making them less than useful in real-life applications.

Narrowing Down the Search Area

The file saved parameters.dat accumulates a lot of useful information in each iteration of the global stage, which can be used more efficiently as follows. As noted, the Nelder-Mead algorithm requires J + 1 candidate points as inputs (the vertices of the J-dimensional simplex). One of these points is given by S_i, chosen as described above; the other vertices are drawn randomly. But as we accumulate more information with every iteration of the global stage, if we keep finding local minima that concentrate in certain regions, it makes sense to narrow the range of values from which we pick the vertices. One way to do this is as follows: After a sufficiently large number of restarts have been completed, rank all the function values and take the lowest x percent of values (e.g., 10 percent or 20 percent). Then, for each dimension j, pick the minimum (p^j_min) and maximum (p^j_max) parameter values within this set of minima.
Then, to generate the vertices, take randomly sampled points between p^j_min and p^j_max in each dimension j. This allows the simplex algorithm to search more intensively in a narrower area, which can improve results very quickly when there are ridges or jaggedness in the objective function that make the algorithm get stuck.

Parallelizing the Algorithm

The algorithm can be parallelized in a relatively straightforward manner.52 The basic idea is to let each CPU core perform a separate local search, each a time-consuming process, in a different part of the parameter space. If we can do many such searches simultaneously, we can speed up the solution dramatically. One factor that makes parallelization simple is that the CPU cores do not need to communicate with each other during the local search stage. In between the local stages, each CPU core contributes its findings (the last local minimum it found, along with the corresponding parameter vector) to the collective wisdom recorded in saved parameters.dat and also gets the latest information about the best local minimum found so far from the same file. Thus, as long as all CPU cores have access to the same copy of the file saved parameters.dat, parallelization requires no more than a few lines of housekeeping across CPUs. Here are some more specifics. Suppose that we have a workstation with N CPU cores (for example, N = 4, 6, or 12). The first modification we need to make is to change the program to distinguish between the different “copies” of the code running on different CPU cores. This can be done by simply having the program ask the user (only once, upon starting the code) to input an integer value, n, between 1 and N, which uniquely identifies the “sequence number” of the particular instance of the program running. Then open N terminal windows, launch a copy of the program in each window, and for each one enter a unique sequence number n = 1, 2, ..., N.
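A toy sketch of this bookkeeping, run sequentially here rather than in parallel (the file name and indexing scheme follow the text's description; Python stands in for the compiled language the author assumes):

```python
import os
import tempfile

N = 4                                     # number of worker copies
shared_file = os.path.join(tempfile.mkdtemp(), "saved parameters.dat")

def seed_indices(n, N, rounds):
    # worker n (1-based) takes elements n, N+n, 2N+n, ... of the Sobol' sequence
    return [k * N + n for k in range(rounds)]

def record_minimum(path, fval, params):
    # append-only writes: each worker adds one line per completed local search
    with open(path, "a") as f:
        f.write(f"{fval} {' '.join(map(str, params))}\n")

for n in range(1, N + 1):                 # stand-in for N programs running at once
    for i in seed_indices(n, N, rounds=2):
        record_minimum(shared_file, float(i), [0.0, 0.0])

with open(shared_file) as f:
    lines = f.readlines()
print(len(lines))                         # N workers x 2 rounds of local searches
```

Because each worker only ever appends to the shared file and reads it between local searches, no explicit inter-process communication is needed, which is what makes the scheme work across machines synchronized by a file-sharing service.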
52 I am assuming here that a compiled language, such as Fortran or C, is used to write the program, so that multiple parallel copies of the same code can be run in different terminal windows.

Upon starting, each program will first simulate the same quasi-random sequence regardless of n, but each run will pick a different element of this sequence as its own seed. For simplicity, suppose run n chooses the nth element of the sequence as its seed and launches a local search from that point. After completion, each run will open the same file saved parameters.dat and record the local minimum and parameter value it finds.53 Now suppose that all copies of the program complete their respective first local searches, so there are N lines, each written by a different CPU core, in the file saved parameters.dat. Then each run will start its second iteration and pick as its next seed the (N + n)th element of the quasi-random sequence. When the total number of iterations across all CPUs exceeds some threshold Imin, we would like to combine the quasi-random draw with the previous best local minima, as described in Step 1 (3) above. This is simple, since all runs have access to the same copy of saved parameters.dat.54 Notice that this parallelization method is completely agnostic about whether the CPU cores are on the same personal computer (PC) or distributed across many PCs, as long as all computers keep synchronized copies of saved parameters.dat. This can be achieved by using a synchronization service like DropBox. This feature easily allows one to harness the computational power of many idle PCs, distributed geographically, with varying speeds and numbers of CPU cores.

8. FUTURE DIRECTIONS AND FURTHER READING

This article surveys the current state of the heterogeneous-agent models literature and draws several conclusions.
First, two key ingredients in such models are (i) the magnitudes and types of risk that the model builder feeds into the model and (ii) the insurance opportunities allowed in the economy. In many cases, it is difficult, if not impossible, to measure each component separately. In other words, the assumptions a researcher makes regarding insurance opportunities will typically affect the inference drawn about the magnitudes of risks, and vice versa. Further complicating the problem is the measurement of risk: Individuals often have more information than the econometrician about future changes in their lives. So, for example, a rise or fall in income that the econometrician may view as a “shock” may in fact be partially or completely anticipated by the individual. This suggests that equating income movements observed in the data with risk (as is often done in the literature) is likely to overstate the true magnitude.

53 Because this opening and writing stage takes a fraction of a second, the likelihood that two or more programs access the file simultaneously and create a run-time error is negligible.

54 It is often useful for each run to keep track of the total number of local searches completed by all CPUs; call this NLast. For example, sometimes the increase in θ_i can be linked to NLast. This number can be read as the total number of lines recorded up to that point in saved parameters.dat. Another use of this index is for determining which point in the sequence to select as the next seed point: instead of run n selecting the (kN + n)th point in the sequence, where k is the number of local searches completed by run n, it can just pick the (NLast + 1)th number in the sequence. This avoids leaving gaps in the sequence of seeds in case some CPUs are much faster than others and hence finish many more local searches.
This entanglement of “risk,” “anticipated changes,” and “insurance” presents a difficult challenge to researchers in this area. Although some recent progress has been made, more work remains. A number of surveys contain very valuable material that is complementary to this article. First, Heathcote, Storesletten, and Violante (2009) is a recent survey of quantitative macroeconomics with heterogeneous households. Second, Browning, Hansen, and Heckman (1999) contains an extensive review of microeconomic models that are often used as the foundations of heterogeneous-agent models. It highlights several pitfalls in trying to calibrate macroeconomic models using microevidence. Third, Meghir and Pistaferri (2011) provides a comprehensive treatment of how earnings dynamics affect life-cycle consumption choice, which is closely related to the issues discussed in Section 3 of this survey. Finally, because heterogeneous-agent models use microeconomic survey data in increasingly sophisticated ways, a solid understanding of issues related to measurement error (which is pervasive in microdata) is essential. Failure to understand such problems can wreak havoc with the empirical analysis. Bound, Brown, and Mathiowetz (2001) is an extensive and authoritative survey of the subject. The introduction of families into incomplete markets models represents an exciting area of current research. For many questions of empirical relevance, the interactions taking place within a household (implicit insurance, bargaining, etc.) can have first-order effects on how individuals respond to idiosyncratic changes. To give a few examples, Gallipoli and Turner (2011) document that the labor supply responses to disability shocks of single workers are larger and more persistent than those of married workers.
They argue that an important part of this difference has to do with the fact that couples are able to optimally change their time (and task) allocation within the household in response to disability, an option not available to singles. This finding suggests that modeling households is important for understanding the design of disability insurance policies. Similarly, Guner, Kaygusuz, and Ventura (2010) show that, to quantify the effects of alternative tax reforms, it is important to take into account the joint nature of household labor supply. In fact, it is hard to imagine any distributional issue for which the household structure does not figure in an important way. Another promising area is the richer modeling of household finances in an era of ever-increasing sophistication in financial services. The Great Recession, which was accompanied by a housing market crash and soaring personal bankruptcies, home foreclosures, and so on, has created a renewed sense of urgency for understanding household balance sheets. Developments on two fronts (advances in theoretical modeling as discussed in Section 3, combined with richer data sources on credit histories and mortgages that are increasingly becoming available to researchers) will make faster progress feasible in this area.

REFERENCES

Abowd, John M., and David E. Card. 1989. “On the Covariance Structure of Earnings and Hours Changes.” Econometrica 57 (March): 411–45.

Acemoglu, Daron. 2002. “Technical Change, Inequality, and the Labor Market.” Journal of Economic Literature 40 (March): 7–72.

Aguiar, Mark, and Erik Hurst. 2008. “Deconstructing Lifecycle Expenditures.” Working Paper, University of Rochester.

Aguiar, Mark, and Mark Bils. 2011. “Has Consumption Inequality Mirrored Income Inequality?” Working Paper, University of Rochester.

Aiyagari, S. Rao. 1993. “Uninsured Idiosyncratic Risk and Aggregate Saving.” Federal Reserve Bank of Minneapolis Working Paper 502.
Aiyagari, S. Rao. 1994. “Uninsured Idiosyncratic Risk and Aggregate Saving.” The Quarterly Journal of Economics 109 (August): 659–84.

Altug, Sumru, and Robert A. Miller. 1990. “Household Choices in Equilibrium.” Econometrica 58 (May): 543–70.

Arslan, Yavuz. 2011. “Interest Rate Fluctuations and Equilibrium in the Housing Market.” Working Paper, Central Bank of the Republic of Turkey.

Athreya, Kartik B. 2002. “Welfare Implications of the Bankruptcy Reform Act of 1999.” Journal of Monetary Economics 49 (November): 1,567–95.

Attanasio, Orazio, and Steven J. Davis. 1996. “Relative Wage Movements and the Distribution of Consumption.” Journal of Political Economy 104 (December): 1,227–62.

Attanasio, Orazio, Erich Battistin, and Hidehiko Ichimura. 2007. “What Really Happened to Consumption Inequality in the United States?” In Hard-to-Measure Goods and Services: Essays in Honour of Zvi Griliches, edited by E. Berndt and C. Hulten. Chicago: University of Chicago Press.

Attanasio, Orazio P., James Banks, Costas Meghir, and Guglielmo Weber. 1999. “Humps and Bumps in Lifetime Consumption.” Journal of Business & Economic Statistics 17 (January): 22–35.

Autor, David H., Lawrence F. Katz, and Melissa S. Kearney. 2008. “Trends in U.S. Wage Inequality: Revising the Revisionists.” The Review of Economics and Statistics 90 (2): 300–23.

Badel, Alejandro, and Mark Huggett. 2007. “Interpreting Life-Cycle Inequality Patterns as an Efficient Allocation: Mission Impossible?” Working Paper, Georgetown University.

Baker, Michael. 1997. “Growth-Rate Heterogeneity and the Covariance Structure of Life-Cycle Earnings.” Journal of Labor Economics 15 (April): 338–75.

Baker, Michael, and Gary Solon. 2003. “Earnings Dynamics and Inequality among Canadian Men, 1976–1992: Evidence from Longitudinal Income Tax Records.” Journal of Labor Economics 21 (April): 267–88.

Becker, Gary S. 1964.
Human Capital: A Theoretical and Empirical Analysis, with Special Reference to Education. Chicago: University of Chicago Press. Ben-Porath, Yoram. 1967. “The Production of Human Capital and the Life Cycle of Earnings.” Journal of Political Economy 75 (4): 352–65. Bewley, Truman F. Undated. “Interest Bearing Money and the Equilibrium Stock of Capital.” Working Paper. Blundell, Richard, and Ian Preston. 1998. “Consumption Inequality And Income Uncertainty.” The Quarterly Journal of Economics 113 (May): 603–40. Blundell, Richard, Luigi Pistaferri, and Ian Preston. 2008. “Consumption Inequality and Partial Insurance.” American Economic Review 98 (December): 1,887–921. Bound, John, Charles Brown, and Nancy Mathiowetz. 2001. “Measurement Error in Survey Data.” In Handbook of Econometrics, edited by J. J. Heckman and E. E. Leamer. Amsterdam: Elsevier; 3,705–843. Browning, Martin, Lars Peter Hansen, and James J. Heckman. 1999. “Micro Data and General Equilibrium Models.” In Handbook of Macroeconomics, edited by J. B. Taylor and M. Woodford. Amsterdam: Elsevier; 543–633. Browning, Martin, Mette Ejrnaes, and Javier Alvarez. 2010. “Modelling Income Processes with Lots of Heterogeneity.” Review of Economic Studies 77 (October): 1,353–81. 316 Federal Reserve Bank of Richmond Economic Quarterly Cagetti, Marco, and Mariacristina De Nardi. 2006. “Entrepreneurship, Frictions, and Wealth.” Journal of Political Economy 114 (October): 835–70. Campbell, John Y., and N. Gregory Mankiw. 1990. “Consumption, Income, and Interest Rates: Reinterpreting the Time Series Evidence.” Working Paper 2924. Cambridge, Mass.: National Bureau of Economic Research (May). Carroll, Christopher. 2000. “Why Do the Rich Save So Much?” In Does Atlas Shrug? The Economic Consequences of Taxing the Rich, edited by Joel Slemrod. Boston: Harvard University Press, 466–84. Carroll, Christopher D. 1991. 
“Buffer Stock Saving and the Permanent Income Hypothesis.” Board of Governors of the Federal Reserve System Working Paper Series/Economic Activity Section 114. Carroll, Christopher D. 1997. “Buffer-Stock Saving and the Life Cycle/Permanent Income Hypothesis.” The Quarterly Journal of Economics 112 (February): 1–55. Carroll, Christopher D., and Andrew A. Samwick. 1997. “The Nature of Precautionary Wealth.” Journal of Monetary Economics 40 (September): 41–71. Caselli, Francesco, and Jaume Ventura. 2000. “A Representative Consumer Theory of Distribution.” American Economic Review 90 (September): 909–26. Castañeda, Ana, Javier Dı́az-Giménez, and José-Vı́ctor Rı́os-Rull. 2003. “Accounting for the U.S. Earnings and Wealth Inequality.” The Journal of Political Economy 111 (August): 818–57. Chamberlain, Gary, and Charles A. Wilson. 2000. “Optimal Intertemporal Consumption Under Uncertainty.” Review of Economic Dynamics 3 (July): 365–95. Chan, Yeung Lewis, and Leonid Kogan. 2002. “Catching up with the Joneses: Heterogeneous Preferences and the Dynamics of Asset Prices.” Journal of Political Economy 110 (December): 1,255–85. Chang, Yongsung, and Sun-Bin Kim. 2006. “From Individual to Aggregate Labor Supply: A Quantitative Analysis based on a Heterogeneous-Agent Macroeconomy.” International Economic Review 47 (1): 1–27. Chang, Yongsung, Sun-Bin Kim, and Frank Schorfheide. 2010. “Labor Market Heterogeneity, Aggregation, and the Lucas Critique.” Working Paper 16401. Cambridge, Mass.: National Bureau of Economic Research. F. Guvenen: Macroeconomics with Heterogeneity 317 Chari, V. V., Patrick J. Kehoe, and Ellen R. McGrattan. 2007. “Business Cycle Accounting.” Econometrica 75 (3): 781–836. Chatterjee, Satyajit, and Burcu Eyigungor. 2011. “A Quantitative Analysis of the U.S. Housing and Mortgage Markets and the Foreclosure Crisis.” Federal Reserve Bank of Philadelphia Working Paper 11-26 (July). Chatterjee, Satyajit, Dean Corbae, Makoto Nakajima, and José-Vı́ctor Rı́os-Rull. 
2007. “A Quantitative Theory of Unsecured Consumer Credit with Risk of Default.” Econometrica 75 (November): 1,525–89. Chien, YiLi, and Hanno Lustig. 2010. “The Market Price of Aggregate Risk and the Wealth Distribution.” Review of Financial Studies 23 (April): 1,596–650. Christiano, Lawrence J., and Jonas D. M. Fisher. 2000. “Algorithms for Solving Dynamic Models with Occasionally Binding Constraints.” Journal of Economic Dynamics and Control 24 (July): 1,179–232. Clarida, Richard H. 1987. “Consumption, Liquidity Constraints, and Asset Accumulation in the Presence of Random Income Fluctuations.” International Economic Review 28 (June): 339–51. Clarida, Richard H. 1990. “International Lending and Borrowing in a Stochastic, Stationary Equilibrium.” International Economic Review 31 (August): 543–58. Cochrane, John H. 1991. “A Simple Test of Consumption Insurance.” Journal of Political Economy 99 (October): 957–76. Congressional Budget Office. 2008. “Recent Trends in the Variability of Individual Earnings and Family Income.” Washington, D.C.: CBO (June). Constantinides, George M. 1982. “Intertemporal Asset Pricing with Heterogeneous Consumers and Without Demand Aggregation.” Journal of Business 55 (April): 253–67. Constantinides, George M., and Darrell Duffie. 1996. “Asset Pricing with Heterogeneous Consumers.” Journal of Political Economy 104 (April): 219–40. Cunha, Flavio, James Heckman, and Salvador Navarro. 2005. “Separating Uncertainty from Heterogeneity in Life Cycle Earnings.” Oxford Economic Papers 57 (2): 191–261. Cutler, David M., and Lawrence F. Katz. 1992. “Rising Inequality? Changes in the Distribution of Income and Consumption in the 1980’s.” American Economic Review 82 (May): 546–51. 318 Federal Reserve Bank of Richmond Economic Quarterly Davis, Steven J., R. Jason Faberman, John Haltiwanger, Ron Jarmin, and Javier Miranda. 2010. “Business Volatility, Job Destruction, and Unemployment.” Working Paper 14300. 
Cambridge, Mass.: National Bureau of Economic Research (September). De Nardi, Mariacristina, Eric French, and John B. Jones. 2010. “Why Do the Elderly Save? The Role of Medical Expenses.” Journal of Political Economy 118 (1): 39–75. Deaton, Angus. 1991. “Saving and Liquidity Constraints.” Econometrica 59 (September): 1,221–48. Deaton, Angus, and Christina Paxson. 1994. “Intertemporal Choice and Inequality.” Journal of Political Economy 102 (June): 437–67. Debreu, Gerard. 1959. Theory of Value. New York: John Wiley and Sons. den Haan, Wouter J. 2010. “Assessing the Accuracy of the Aggregate Law of Motion in Models with Heterogeneous Agents.” Journal of Economic Dynamics and Control 34 (January): 79–99. den Haan, Wouter J., and Pontus Rendahl. 2009. “Solving the Incomplete Markets Model with Aggregate Uncertainty Using Explicit Aggregation.” Working Paper, University of Amsterdam. Domeij, David, and Martin Floden. 2006. “The Labor-Supply Elasticity and Borrowing Constraints: Why Estimates are Biased.” Review of Economic Dynamics 9 (April): 242–62. Dynan, Karen E., Douglas W. Elmendorf, and Daniel E. Sichel. 2007. “The Evolution of Household Income Volatility.” Federal Reserve Board Working Paper 2007-61. Erosa, Andrés, Luisa Fuster, and Gueorgui Kambourov. 2009. “The Heterogeneity and Dynamics of Individual Labor Supply over the Life Cycle: Facts and Theory.” Working Paper, University of Toronto. Flavin, Marjorie A. 1981. “The Adjustment of Consumption to Changing Expectations About Future Income.” Journal of Political Economy 89 (October): 974–1,009. French, Eric, and John Bailey Jones. 2004. “On the Distribution and Dynamics of Health Care Costs.” Journal of Applied Econometrics 19 (6): 705–21. Galindev, Ragchaasuren, and Damba Lkhagvasuren. 2010. “Discretization of Highly Persistent Correlated AR(1) Shocks.” Journal of Economic Dynamics and Control 34 (July): 1,260–76. Gallipoli, Giovanni, and Laura Turner. 2011. 
“Household Responses to Individual Shocks: Disability and Labour Supply.” Working Paper, University of British Columbia. F. Guvenen: Macroeconomics with Heterogeneity 319 Glover, Andrew, and Jacob Short. 2010. “Bankruptcy, Incorporation, and the Nature of Entrepreneurial Risk.” Working Paper, University of Western Ontario. Gorman, William M. 1961. “On a Class of Preference Fields.” Metroeconomica 13 (June): 53–6. Gottschalk, Peter, and Robert Moffitt. 1994. “The Growth of Earnings Instability in the U.S. Labor Market.” Brookings Papers on Economic Activity 25 (2): 217–72. Gottschalk, Peter, and Robert Moffitt. 1999. “Changes in Job Instability and Insecurity Using Monthly Survey Data.” Journal of Labor Economics 17 (October): S91–126. Gourinchas, Pierre-Olivier, and Jonathan A. Parker. 2002. “Consumption over the Life Cycle.” Econometrica 70 (January): 47–89. Greenwood, Jeremy, Ananth Seshadri, and Mehmet Yorukoglu. 2005. “Engines of Liberation.” Review of Economic Studies 72 (1): 109–33. Greenwood, Jeremy, and Nezih Guner. 2009. “Marriage and Divorce since World War II: Analyzing the Role of Technological Progress on the Formation of Households.” In NBER Macroeconomics Annual, Vol. 23. Cambridge, Mass.: National Bureau of Economic Research, 231–76. Guner, Nezih, Remzi Kaygusuz, and Gustavo Ventura. 2010. “Taxation and Household Labor Supply.” Working Paper, Arizona State University. Gustavsson, Magnus, and Pär Österholm. 2010. “Does the Labor-Income Process Contain a Unit Root? Evidence from Individual-Specific Time Series.” Working Paper, Uppsala University. Guvenen, Fatih. 2006. “Reconciling Conflicting Evidence on the Elasticity of Intertemporal Substitution: A Macroeconomic Perspective.” Journal of Monetary Economics 53 (October): 1,451–72. Guvenen, Fatih. 2007a. “Do Stockholders Share Risk More Effectively than Nonstockholders?” The Review of Economics and Statistics 89 (2): 275–88. Guvenen, Fatih. 2007b. 
“Learning Your Earning: Are Labor Income Shocks Really Very Persistent?” American Economic Review 97 (June): 687–712. Guvenen, Fatih. 2009a. “An Empirical Investigation of Labor Income Processes.” Review of Economic Dynamics 12 (January): 58–79. Guvenen, Fatih. 2009b. “A Parsimonious Macroeconomic Model for Asset Pricing.” Econometrica 77 (November): 1,711–50. Guvenen, Fatih. 2011. “Limited Stock Market Participation Versus External Habit: An Intimate Link.” University of Minnesota Working Paper 450. 320 Federal Reserve Bank of Richmond Economic Quarterly Guvenen, Fatih, and Anthony A. Smith. 2009. “Inferring Labor Income Risk from Economic Choices: An Indirect Inference Approach.” Working Paper, University of Minnesota. Guvenen, Fatih, and Burhanettin Kuruscu. 2010. “A Quantitative Analysis of the Evolution of the U.S. Wage Distribution, 1970–2000.” In NBER Macroeconomics Annual 2009 24 (1): 227–76. Guvenen, Fatih, and Burhanettin Kuruscu. Forthcoming. “Understanding the Evolution of the U.S. Wage Distribution: A Theoretical Analysis.” Journal of the European Economic Association. Guvenen, Fatih, and Michelle Rendall. 2011. “Emancipation Through Education.” Working Paper, University of Minnesota. Guvenen, Fatih, Burhanettin Kuruscu, and Serdar Ozkan. 2009. “Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis.” Working Paper 15526. Cambridge, Mass.: National Bureau of Economic Research (November). Haider, Steven J. 2001. “Earnings Instability and Earnings Inequality of Males in the United States: 1967–1991.” Journal of Labor Economics 19 (October): 799–836. Haider, Steven, and Gary Solon. 2006. “Life-Cycle Variation in the Association between Current and Lifetime Earnings.” American Economic Review 96 (September): 1,308–20. Hall, Robert E. 1988. “Intertemporal Substitution in Consumption.” Journal of Political Economy 96 (April): 339–57. Hall, Robert E., and Frederic S. Mishkin. 1982. 
“The Sensitivity of Consumption to Transitory Income: Estimates from Panel Data on Households.” Econometrica 50 (March): 461–81. Hansen, Gary D. 1985. “Indivisible Labor and the Business Cycle.” Journal of Monetary Economics 16 (November): 309–27. Harris, Milton, and Bengt Holmstrom. 1982. “A Theory of Wage Dynamics.” Review of Economic Studies 49 (July): 315–33. Hause, John C. 1980. “The Fine Structure of Earnings and the On-the-Job Training Hypothesis.” Econometrica 48 (May): 1,013–29. Hayashi, Fumio, Joseph Altonji, and Laurence Kotlikoff. 1996. “Risk-Sharing between and within Families.” Econometrica 64 (March): 261–94. Heathcote, Jonathan, Fabrizio Perri, and Giovanni L. Violante. 2010. “Unequal We Stand: An Empirical Analysis of Economic Inequality in the United States, 1967–2006.” Review of Economic Dynamics 13 (January): 15–51. F. Guvenen: Macroeconomics with Heterogeneity 321 Heathcote, Jonathan, Kjetil Storesletten, and Giovanni L. Violante. 2007. “Consumption and Labour Supply with Partial Insurance: An Analytical Framework.” CEPR Discussion Papers 6280 (May). Heathcote, Jonathan, Kjetil Storesletten, and Giovanni L. Violante. 2008. “The Macroeconomic Implications of Rising Wage Inequality in the United States.” Working Paper 14052. Cambridge, Mass.: National Bureau of Economic Research (June). Heathcote, Jonathan, Kjetil Storesletten, and Giovanni L. Violante. 2009. “Quantitative Macroeconomics with Heterogeneous Households.” Annual Review of Economics 1 (1): 319–54. Heaton, John, and Deborah J. Lucas. 1996. “Evaluating the Effects of Incomplete Markets on Risk Sharing and Asset Pricing.” Journal of Political Economy 104 (June): 443–87. Heaton, John, and Deborah Lucas. 2000. “Portfolio Choice and Asset Prices: The Importance of Entrepreneurial Risk.” Journal of Finance 55 (June): 1,163–98. Heckman, James, Lance Lochner, and Christopher Taber. 1998. 
“Explaining Rising Wage Inequality: Explanations with a Dynamic General Equilibrium Model of Labor Earnings with Heterogeneous Agents.” Review of Economic Dynamics 1 (January): 1–58. Hornstein, Andreas, Per Krusell, and Giovanni L. Violante. 2011. “Frictional Wage Dispersion in Search Models: A Quantitative Assessment.” American Economic Review 101 (December): 2,873–98. Hubbard, R. Glenn, Jonathan Skinner, and Stephen P. Zeldes. 1994. “The Importance of Precautionary Motives in Explaining Individual and Aggregate Saving.” Carnegie-Rochester Conference Series on Public Policy 40 (June): 59–125. Hubbard, R. Glenn, Jonathan Skinner, and Stephen P. Zeldes. 1995. “Precautionary Saving and Social Insurance.” Journal of Political Economy 103 (April): 360–99. Huggett, Mark. 1993. “The Risk-Free Rate in Heterogeneous-Agent Incomplete-Insurance Economies.” Journal of Economic Dynamics and Control 17: 953–69. Huggett, Mark. 1996. “Wealth Distribution in Life-cycle Economies.” Journal of Monetary Economics 38 (December): 469–94. Huggett, Mark, Gustavo Ventura, and Amir Yaron. 2006. “Human Capital and Earnings Distribution Dynamics.” Journal of Monetary Economics 53 (March): 265–90. 322 Federal Reserve Bank of Richmond Economic Quarterly Huggett, Mark, Gustavo Ventura, and Amir Yaron. 2011. “Sources of Lifetime Inequality.” American Economic Review 101 (December): 2,923–54. Imrohoroglu, Ayse. 1989. “Cost of Business Cycles with Indivisibilities and Liquidity Constraints.” Journal of Political Economy 97 (December): 1,364–83. Jencks, Christopher. 1984. “The Hidden Prosperity of the 1970s.” Public Interest 77: 37–61. Jones, Larry E., Rodolfo E. Manuelli, and Ellen R. McGrattan. 2003. “Why are Married Women Working So Much?” Federal Reserve Bank of Minneapolis Staff Report 317 (June). Jovanovic, Boyan. 1979. “Job Matching and the Theory of Turnover.” Journal of Political Economy 87 (October): 972–90. Judd, Kenneth L., and Sy-Ming Guu. 2001. 
“Asymptotic Methods for Asset Market Equilibrium Analysis.” Economic Theory 18 (1): 127–57. Kaplan, Greg. 2010. “Inequality and the Life Cycle.” Working Paper, University of Pennsylvania. Kaplan, Greg, and Giovanni L. Violante. 2010. “How Much Consumption Insurance Beyond Self-Insurance?” American Economic Journal: Macroeconomics 2 (October): 53–87. Kehoe, Timothy J., and David K. Levine. 1993. “Debt-Constrained Asset Markets.” Review of Economic Studies 60 (October): 865–88. Kitao, Sagiri, Lars Ljungqvist, and Thomas J. Sargent. 2008. “A Life Cycle Model of Trans-Atlantic Employment Experiences.” Working Paper, University of Southern California and New York University. Knowles, John. 2007. “Why Are Married Men Working So Much? The Macroeconomics of Bargaining Between Spouses.” Working Paper, University of Pennsylvania. Kopczuk, Wojciech, Emmanuel Saez, and Jae Song. 2010. “Earnings Inequality and Mobility in the United States: Evidence from Social Security Data Since 1937.” Quarterly Journal of Economics 125 (February): 91–128. Kopecky, Karen A., and Richard M. H. Suen. 2010. “Finite State Markov-chain Approximations to Highly Persistent Processes.” Review of Economic Dynamics 13 (July): 701–14. Krueger, Dirk, and Fabrizio Perri. 2006. “Does Income Inequality Lead to Consumption Inequality? Evidence and Theory.” Review of Economic Studies 73 (1): 163–93. F. Guvenen: Macroeconomics with Heterogeneity 323 Krueger, Dirk, and Fabrizio Perri. 2009. “How Do Households Respond to Income Shocks?” Working Paper, University of Minnesota. Krusell, Per, and Anthony A. Smith. 1997. “Income and Wealth Heterogeneity, Portfolio Choice, and Equilibrium Asset Returns.” Macroeconomic Dynamics 1 (June): 387–422. Krusell, Per, and Anthony A. Smith, Jr.. 1998. “Income and Wealth Heterogeneity in the Macroeconomy.” Journal of Political Economy 106 (October): 867–96. Kucherenko, Sergei, and Yury Sytsko. 2005. 
“Application of Deterministic Low-Discrepancy Sequences in Global Optimization.” Computational Optimization and Applications 30: 297–318. Laitner, John. 1992. “Random Earnings Differences, Lifetime Liquidity Constraints, and Altruistic Intergenerational Transfers.” Journal of Economic Theory 58 (December): 135–70. Laitner, John. 2002. “Wealth Inequality and Altruistic Bequests.” American Economic Review 92 (May): 270–3. Leamer, Edward E. 1983. “Let’s Take the Con Out of Econometrics.” American Economic Review 73 (March): 31–43. Levine, David K., and William R. Zame. 2002. “Does Market Incompleteness Matter?” Econometrica 70 (September): 1,805–39. Liberti, Leo, and Sergei Kucherenko. 2005. “Comparison of Deterministic and Stochastic Approaches to Global Optimization.” International Transactions in Operations Research 12: 263–85. Lillard, Lee A., and Robert J. Willis. 1978. “Dynamic Aspects of Earnings Mobility.” Econometrica 46 (September): 985–1,012. Lillard, Lee A., and Yoram Weiss. 1979. “Components of Variation in Panel Earnings Data: American Scientists 1960–70.” Econometrica 47 (March): 437–54. Livshits, Igor, James MacGee, and Michèle Tertilt. 2007. “Consumer Bankruptcy—A Fresh Start.” American Economic Review 97 (March): 402–18. Livshits, Igor, James MacGee, and Michèle Tertilt. 2010. “Accounting for the Rise in Consumer Bankruptcies.” American Economic Journal: Macroeconomics 2 (April): 165–93. Lorenzoni, Guido. 2009. “A Theory of Demand Shocks.” American Economic Review 99 (December): 2,050–84. Lucas, Jr., Robert E. 1987. Models of Business Cycles. New York: Basil Blackwell. 324 Federal Reserve Bank of Richmond Economic Quarterly Lucas, Jr., Robert E. 2003. “Macroeconomic Priorities.” American Economic Review 93 (March): 1–14. Mace, Barbara J. 1991. “Full Insurance in the Presence of Aggregate Uncertainty.” Journal of Political Economy 99 (October): 928–56. MaCurdy, Thomas E. 1982. 
“The Use of Time Series Processes to Model the Error Structure of Earnings in a Longitudinal Data Analysis.” Journal of Econometrics 18 (January) 83–114. Mankiw, N. Gregory. 1986. “The Equity Premium and the Concentration of Aggregate Shocks.” Journal of Financial Economics 17 (September): 211–9. Meghir, Costas, and Luigi Pistaferri. 2004. “Income Variance Dynamics and Heterogeneity.” Econometrica 72 (1): 1–32. Meghir, Costas, and Luigi Pistaferri. 2011. “Earnings, Consumption, and Life Cycle Choices.” In Handbook of Labor Economics, Vol 4B, edited by Orley Ashenfelter and David Card. Amsterdam: Elsevier, 773–854. Mehra, Rajnish, and Edward C. Prescott. 1985. “The Equity Premium: A Puzzle.” Journal of Monetary Economics 15 (March): 145–61. Mian, Atif, and Amir Sufi. 2011. “House Prices, Home Equity-Based Borrowing, and the U.S. Household Leverage Crisis.” American Economic Review 101 (August): 2,132–56. Moffitt, Robert, and Peter Gottschalk. 1995. “Trends in the Covariance Structure of Earnings in the United States: 1969–1987.” Institute for Research on Poverty Discussion Papers 1001-93, University of Wisconsin Institute for Research and Poverty. Moffitt, Robert, and Peter Gottschalk. 2008. “Trends in the Transitory Variance of Male Earnings in the U.S., 1970–2004.” Working Paper, Johns Hopkins University. Nelson, Julie A. 1994. “On Testing for Full Insurance Using Consumer Expenditure Survey Data: Comment.” Journal of Political Economy 102 (April): 384–94. Ozkan, Serdar. 2010. “Income Differences and Health Care Expenditures over the Life Cycle.” Working Paper, University of Pennsylvania. Palumbo, Michael G. 1999. “Uncertain Medical Expenses and Precautionary Saving Near the End of the Life Cycle.” Review of Economic Studies 66 (April): 395–421. Pijoan-Mas, Josep. 2006. “Precautionary Savings or Working Longer Hours?” Review of Economic Dynamics 9 (April): 326–52. F. Guvenen: Macroeconomics with Heterogeneity 325 Powell, Michael J. D. 2009. 
“The BOBYQA Algorithm for Bound Constrained Optimization without Derivatives.” Numerical Analysis Papers NA06, Department of Applied Mathematics and Theoretical Physics, Centre for Mathematical Sciences, Cambridge (August). Pratt, John W. 1964. “Risk Aversion in the Small and in the Large.” Econometrica 32 (1/2): 122–36. Primiceri, Giorgio E., and Thijs van Rens. 2009. “Heterogeneous Life-Cycle Profiles, Income Risk and Consumption Inequality.” Journal of Monetary Economics 56 (January): 20–39. Quadrini, Vincenzo. 2000. “Entrepreneurship, Saving, and Social Mobility.” Review of Economic Dynamics 3 (January): 1–40. Rinnooy Kan, Alexander, and G. T. Timmer. 1987a. “Stochastic Global Optimization Methods Part I: Clustering Methods.” Mathematical Programming 39: 27–56. Rinnooy Kan, Alexander, and G. T. Timmer. 1987b. “Stochastic Global Optimization Methods Part II: Multilevel Methods.” Mathematical Programming 39: 57–78. Rı́os-Rull, José Victor. 1996. “Life-Cycle Economies and Aggregate Fluctuations.” Review of Economic Studies 63 (July): 465–89. Rogerson, Richard. 1988. “Indivisible Labor, Lotteries and Equilibrium.” Journal of Monetary Economics 21 (January): 3–16. Rogerson, Richard, and Johanna Wallenius. 2009. “Micro and Macro Elasticities in a Life Cycle Model With Taxes.” Journal of Economic Theory 144 (November): 2,277–92. Rogerson, Richard, Robert Shimer, and Randall Wright. 2005. “Search-Theoretic Models of the Labor Market: A Survey.” Journal of Economic Literature 43 (December): 959–88. Rossi-Hansberg, Esteban, and Mark L. J. Wright. 2007. “Establishment Size Dynamics in the Aggregate Economy.” American Economic Review 97 (December): 1,639–66. Rouwenhorst, K. Geert. 1995. “Asset Pricing Implications of Equilibrium Business Cycle Models.” In Frontiers of Business Cycle Research. Princeton, N.J.: Princeton University Press, 294–330. Rubinstein, Mark. 1974. “An Aggregation Theorem for Securities Markets.” Journal of Financial Economics 1 (September): 225–44. 
Sabelhaus, John, and Jae Song. 2009. "Earnings Volatility Across Groups and Time." National Tax Journal 62 (June): 347–64.
Sabelhaus, John, and Jae Song. 2010. "The Great Moderation in Micro Labor Earnings." Journal of Monetary Economics 57 (May): 391–403.
Schechtman, Jack, and Vera L. S. Escudero. 1977. "Some Results on an 'Income Fluctuation Problem.'" Journal of Economic Theory 16 (December): 151–66.
Schulhofer-Wohl, Sam. 2011. "Heterogeneity and Tests of Risk Sharing." Federal Reserve Bank of Minneapolis Staff Report 462 (September).
Shimer, Robert. 2005. "The Cyclicality of Hires, Separations, and Job-to-Job Transitions." Federal Reserve Bank of St. Louis Review 87 (4): 493–507.
Shimer, Robert. 2007. "Reassessing the Ins and Outs of Unemployment." Working Paper 13421. Cambridge, Mass.: National Bureau of Economic Research (September).
Shin, Donggyun, and Gary Solon. 2011. "Trends in Men's Earnings Volatility: What Does the Panel Study of Income Dynamics Show?" Journal of Public Economics 95 (August): 973–82.
Storesletten, Kjetil, Christopher I. Telmer, and Amir Yaron. 2004a. "Consumption and Risk Sharing Over the Life Cycle." Journal of Monetary Economics 51 (April): 609–33.
Storesletten, Kjetil, Christopher I. Telmer, and Amir Yaron. 2004b. "Cyclical Dynamics in Idiosyncratic Labor Market Risk." Journal of Political Economy 112 (June): 695–717.
Storesletten, Kjetil, Christopher I. Telmer, and Amir Yaron. 2007. "Asset Pricing with Idiosyncratic Risk and Overlapping Generations." Review of Economic Dynamics 10 (October): 519–48.
Telmer, Christopher I. 1993. "Asset Pricing Puzzles and Incomplete Markets." Journal of Finance 48 (December): 1,803–32.
Topel, Robert H. 1990. "Specific Capital, Mobility, and Wages: Wages Rise with Job Seniority." Working Paper 3294. Cambridge, Mass.: National Bureau of Economic Research (March).
Topel, Robert H., and Michael P. Ward. 1992. "Job Mobility and the Careers of Young Men." Quarterly Journal of Economics 107 (May): 439–79.
Townsend, Robert M. 1994. "Risk and Insurance in Village India." Econometrica 62 (May): 539–91.
Veracierto, Marcelo. 1997. "Plant-Level Irreversible Investment and Equilibrium Business Cycles." Federal Reserve Bank of Minneapolis Discussion Paper 115 (March).
Wang, Neng. 2009. "Optimal Consumption and Asset Allocation with Unknown Income Growth." Journal of Monetary Economics 56 (May): 524–34.
Zhang, Hongchao, Andrew R. Conn, and Katya Scheinberg. 2010. "A Derivative-Free Algorithm for Least-Squares Minimization." SIAM Journal on Optimization 20 (6): 3,555–76.

Economic Quarterly—Volume 97, Number 3—Third Quarter 2011—Pages 329–357

Recent Developments in Economic Growth

Diego Restuccia

A fundamental question in the field of economic growth and development is why some countries are rich and others poor. Both the longer-term historical experience of individual countries and the more recent data for a large number of countries show periods of marked increases in income inequality across countries, as well as episodes in which individual countries catch up with the leading country. What determines when countries start the process of modern economic growth? Why do some countries sustain positive economic growth for long periods of time while other countries fail to catch up with the leading country and even fall behind countries that are able to catch up? Understanding the factors driving income inequality has potentially enormous welfare consequences, and the design of effective economic policy hinges on answers to these and related questions. I start this survey article by describing a broad set of facts from international data on gross domestic product (GDP) per capita as a measure of welfare across countries. These facts motivate most of the inquiry in the field of growth economics.
The main facts can be summarized as follows. First, not only are there remarkable differences in per capita income across countries, but inequality has also increased over the last 30 years. To be concrete, while average GDP per capita of the richest countries was about 25 times that of the poorest countries in 1960, it was about 65 times as large in 2005. Second, the international evidence presents numerous episodes of countries catching up, stagnating, or falling behind in relative income over time.

The author would like to thank Tasso Adamopoulos, Margarida Duarte, Andreas Hornstein, and Pierre Sarte for very useful and detailed comments, and Baran Doda for excellent research assistance. All remaining errors and misinterpretations are the author's. The opinions expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. Restuccia is affiliated with the University of Toronto. E-mail: diego.restuccia@utoronto.ca.

Next, I review the recent literature in growth economics. I take a narrow view of the field, with a focus on quantitative explorations.1 I discuss the literature that directly or indirectly addresses the facts on income differences across countries and over time. Essentially, this literature emphasizes that cross-country differences in aggregate outcomes arise from cross-country differences in the allocation of factors of production and in productivity across heterogeneous production units, where those units can generically refer to sectors/industries or to establishments within sectors. I begin my survey with the literature that focuses on the structural transformation of the economy—broadly described as systematic changes in the allocation of factors of production across sectors in the economy.
I emphasize the role of agriculture for the early stages of development and for the current income differences between rich and poor countries. I also emphasize the reallocation of factors to the service sector in determining recent patterns of aggregate productivity growth across countries. I then discuss models that focus on understanding differences in measured aggregate total factor productivity (TFP) arising from the allocation of factors of production across establishments with heterogenous productivity levels. Substantial work remains to be done on identifying the fundamental determinants of productivity and resource allocation across productive units. The article is organized as follows. In the next section, I lay out the main facts in economic growth and development that organize the ultimate objectives of the recent quantitative literature in growth economics. Section 2 surveys models emphasizing the role of the structural transformation in the economy—changes in the allocation of factors of production across sectors. In Section 3, I discuss the literature that relates measured TFP differences across countries to distortions that misallocate factors of production across heterogeneous establishments. I conclude in Section 4. 1. FACTS In this article, I focus on documenting a narrow set of facts using the recent data on GDP per capita from Heston, Summers, and Aten (2009). The data is often referred to as the Penn World Table (PWT). To provide a broader perspective, I complement the description of the facts from this data with references to the literature where refinements of the basic facts have been made. Let me first describe the data. I use GDP per capita as a measure of welfare in each country.2 A critical element of the data is that the measure of GDP 1 Even with a narrow focus, the survey is bound to leave out the discussion of many important contributions for which I preemptively apologize. 
reported in the PWT is adjusted for price differences across countries (purchasing power parity adjusted) and, hence, represents a measure of income in units that are comparable across countries.3 The data span 1950–2007 for 189 countries in the world. Since I am interested in assessing the evolution of cross-country incomes over time, I restrict attention to a sample of 101 countries that have data for each year from 1960–2007 and a population of more than 1 million people in 2007. I emphasize two sets of facts from these data. First, income differences across rich and poor countries are not only large at any point in time between 1960 and 2007, but have also increased quite substantially in the last two decades. Second, while the dispersion in income per capita has either stayed constant or increased in the last two decades, the data reveal remarkable episodes of individual countries catching up, stagnating, and declining in per capita income relative to that of the United States. I now elaborate on the description of these basic facts.

2 Clearly, GDP per capita is a limited measure of welfare in an economy, as cross-country differences in life expectancy, education, work hours, and inequality, among others, are also relevant measures of a country's welfare. I follow the standard practice in the literature of focusing on GDP per capita as the main determinant of welfare in a country. See Jones and Klenow (2011) for an analysis of welfare across countries and time that includes measures of consumption, leisure, inequality, and mortality.
3 In the version of the PWT I use, international prices refer to world prices of 2005.

D. Restuccia: Recent Developments in Economic Growth 331

Income Differences

To start, for each year between 1960 and 2007, I rank countries by their GDP per capita relative to that of the United States. I use the United States as a benchmark country for comparison since it is a large, stable, and diverse country that has been at the frontier of the world's production technology during the sample period. As a result, changes in income in the United States roughly approximate changes in the world state of knowledge that, in principle, should be available for adoption elsewhere. I then calculate the average GDP per capita for the richest 5 percent of countries and the poorest 5 percent of countries (i.e., I calculate the average of the richest and poorest 5 countries in the sample). The ratio of the average GDP per capita of the richest and poorest 5 percent of countries is reported in Figure 1.4 Income per capita differences across countries are large. GDP per capita in the richest countries is, on average, 40 times that of the poorest countries. Moreover, income differences, while relatively stable from 1960 to about 1985, have been increasing since then, such that in 2007 GDP per capita in the richest countries was, on average, 66 times that of the poorest countries.5 The increase in income inequality between the rich and poor countries is mainly driven by a fall in relative income in the poorest countries, which reflects not necessarily a decline in absolute incomes of poor countries, but a failure of poor countries to grow as fast as the United States. This fact is not a curiosity of the poorest countries alone in this sample, which happen to be mostly in Africa, but continues to hold even when focusing on larger groups of poor countries or on different subgroups of the poorest countries. To illustrate this fact, Table 1 summarizes the evolution of GDP per capita across countries relative to that of the United States for deciles of the income distribution in selected years. The richest 10 percent of countries (Decile 10) gained on average, increasing relative GDP per capita from 0.87 in 1960 to 0.91 in 2007. The poorest 10 percent of countries (Decile 1) failed to keep up with the United States, losing half of their relative income position, from a relative income of 0.04 in 1960 to less than 0.02 in 2007.

4 Parente and Prescott (1993) emphasize the ratio of the richest and poorest 5 percent of countries in GDP per capita as a measure of dispersion in income across countries at a point in time and across time. Duarte and Restuccia (2006) emphasize similar statistics but for measures of labor productivity such as GDP per worker.
5 Note that while there is substantial persistence in cross-country income differences over time, the set of poor and rich countries may be changing over time.

Figure 1 GDP per Capita Ratio of Rich to Poor (vertical axis: ratio of rich to poor, 0 to 70; horizontal axis: year, 1960 to 2010)
Notes: GDP per capita from Heston, Summers, and Aten (2009). The ratio refers to the average of the richest 5 percent of countries to the poorest 5 percent of countries in each year. Since the sample contains 101 countries, these are averages of 5 countries.

Table 1 GDP per Capita Relative to the United States (Percent)

Year    D1    D2    D3    D4    D5    D6    D7    D8    D9   D10
1960   4.3   6.3   8.7  11.4  15.0  20.4  27.3  39.3  57.6  86.8
1970   3.9   6.1   7.5   9.9  15.0  18.9  28.9  43.3  64.5  86.2
1980   3.5   5.2   7.0  10.3  15.4  21.3  28.4  45.3  67.8  87.7
1990   2.8   3.9   5.9   9.1  14.2  17.9  26.8  42.4  68.1  87.0
2000   2.2   3.3   5.1   8.3  12.4  17.1  25.0  47.3  70.4  87.0
2007   1.9   3.6   5.0   7.8  12.7  17.9  24.9  52.7  72.0  91.4

Notes: GDP per capita from Heston, Summers, and Aten (2009). Countries are ranked according to GDP per capita in each year and divided into deciles, with Decile 1 (D1) being the poorest countries and Decile 10 (D10) the richest. As a result, countries in each decile may vary from year to year.

But relative income also declined for most of the other groups
of poor countries, such as Deciles 2–7, even though their relative decline is not as dramatic as in the poorest countries.

One explanation for the large differences in income per capita observed across countries today attributes them to the timing of the start of industrialization: Poor countries are slowly catching up to rich countries that started the process of modern growth much earlier. In particular, Lucas (2000, 2002) describes the cross-country differences in the timing of the takeoff in income per capita growth by looking at historical time series of GDP per capita from 1500 to today.6 Lucas shows that, prior to 1800, differences in income per capita were moderate (about a factor of 2 between rich and poor countries), but that the differences expanded quickly when, with the start of the process of industrialization, GDP per capita no longer remained stagnant for a group of initially western countries and started to increase at positive rates. Lucas conjectures that if today's income differences across countries result from differences in the timing at which modern growth takes off in a country, then the distribution of per capita income may shrink again to pre-industrial levels once all countries have made the transition. This historical interpretation of today's income differences seems difficult to reconcile with the expanding income differences observed in most deciles of the income distribution in the cross-country data reported in Table 1. I will return to this issue in Section 2, where I review the related literature.

6 In related work, Buera, Monge-Naranjo, and Primiceri (2011) study the evolution of state-intervention and market-oriented policies across countries and time in the context of a learning model where past experiences (including those of countries' neighbors) determine policy choices.
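To fix ideas, the rich-to-poor ratio plotted in Figure 1 can be computed with a short script. The incomes below are synthetic stand-ins for the PWT series, and the 6 percent spacing between countries is an arbitrary choice of mine:

```python
# Rich-to-poor ratio as in Figure 1: average GDP per capita of the richest
# 5 percent of countries divided by that of the poorest 5 percent.
def rich_poor_ratio(gdp_pc, top_share=0.05):
    n = max(1, round(len(gdp_pc) * top_share))  # 5 countries when len == 101
    ranked = sorted(gdp_pc)
    poorest = sum(ranked[:n]) / n
    richest = sum(ranked[-n:]) / n
    return richest / poorest

# Synthetic cross-section (NOT the PWT data): 101 incomes spaced by 6 percent.
sample = [500 * 1.06**i for i in range(101)]
ratio = rich_poor_ratio(sample)  # equals 1.06**96 for this geometric profile
```

For a geometric income profile the averages in the numerator and denominator share a common factor, so the ratio collapses to 1.06**96; with the actual PWT sample the statistic rises from about 40 in 1960 to 66 in 2007, as described above.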
In addition to documenting the large income differences across countries, the development accounting literature has established that differences in income per capita are mostly driven by differences in labor productivity (often measured as either GDP per worker or GDP per hour worked), since differences in labor supply (measured as either the employment-to-population ratio or total hours of work per capita) are not large enough to explain a substantial portion of the differences in per capita income across countries. In turn, differences in labor productivity are mostly accounted for by differences in TFP. That is, differences in income per capita are not explained by measurable factors such as employment, physical capital, or human capital.7

Country Experiences over Time

The reported evolution of the income distribution across countries hides tremendous variation in country experiences over time. In the data, there are numerous episodes of catch up, catch up followed by a slowdown, stagnation, and even decline. While reporting time series for 101 countries is impractical, Table 2 attempts to summarize country experiences by reporting the evolution of average GDP per capita relative to that of the United States for 20 groups, each comprising 5 percent of the countries in the sample. Unlike in Table 1 and Figure 1, the countries in each group in Table 2 remain the same over time and reflect the ranking of countries according to relative GDP per capita in 1960. Focusing on the richest and poorest 5 percent of countries in 1960, I find that inequality in GDP per capita actually declined, from a factor of 26 in 1960 to 16 in 2007, as a result of the richest countries in 1960 declining relative to the leading country (from 0.95 in 1960 to 0.81 in 2007) and the poorest countries in 1960 catching up relative to the leading country (from 0.037 in 1960 to 0.049 in 2007).
7 See, for instance, Klenow and Rodríguez-Clare (1997), Hall and Jones (1999), Caselli (2005), and Hsieh and Klenow (2010). A critical element in establishing the relative importance of TFP and factors of production in explaining income differences across countries is the treatment of human capital. There is a recent literature addressing the importance of human capital in amplifying differences in TFP across countries; for instance, Manuelli and Seshadri (2006) and Erosa, Koreshkova, and Restuccia (2010).

Table 2 also shows that episodes of catch up and decline occur throughout the income distribution in 1960, with countries in Group 7 almost tripling their relative income (from 0.11 in 1960 to 0.30 in 2007). Table 2 does not identify individual countries featuring catch up or decline in relative income. To complement the summary in Table 2, Figure 2 documents the time series of GDP per capita for selected countries with remarkable growth experiences in the sample period. I emphasize the episodes of remarkable catch up in per capita income by highlighting Singapore, Botswana, and, more recently, China and India. I also note the growing gap in per capita income between the United
States and Venezuela, Ghana, and Zimbabwe. Explaining these remarkable growth and collapse episodes is a challenging and exciting task for the field of quantitative growth economics.

Table 2 GDP per Capita Relative to the United States (Percent)

Year    G1   G2   G3   G4   G5   G6   G7   G8   G9  G10  G11  G12  G13  G14  G15  G16  G17  G18  G19  G20
1960   3.7  5.1  5.8  6.8  7.9  9.4 10.7 12.1 14.1 15.9 18.8 22.1 25.9 28.7 36.6 41.9 51.8 63.4 78.7 94.9
1970   3.7  5.0  6.2  6.0  7.7  7.4 12.5 10.7 12.2 14.6 17.5 24.9 26.8 32.3 37.9 47.7 55.1 68.6 79.9 91.8
1980   3.6  4.7  6.8  6.3  8.5  6.8 18.0  9.2 12.7 15.3 17.8 25.9 33.7 31.2 37.9 47.3 56.2 72.2 81.3 88.4
1990   3.5  3.9  7.3  6.4  8.8  6.6 21.9  7.6 11.5 14.0 13.2 21.2 39.1 28.8 35.6 45.0 50.0 67.6 80.8 83.2
2000   3.9  3.6  7.1  6.3  8.5  6.4 26.0  5.9 11.6 13.6 10.8 17.5 41.9 30.2 34.3 46.6 54.9 64.3 83.3 79.0
2007   4.9  3.8  7.4  6.3  8.7  6.7 29.5  5.7 12.6 14.5 10.3 14.7 45.7 31.3 34.7 49.1 62.3 65.5 85.2 80.6

Notes: GDP per capita from Heston, Summers, and Aten (2009). Countries are ranked according to GDP per capita in 1960 and divided into 20 groups of 5 percent each (G1 the poorest in 1960, G20 the richest). The country groups remain constant across years. For instance, Group 1 refers to the poorest countries in 1960, whose GDP per capita relative to the United States was 3.7 percent in 1960 and 4.9 percent in 2007.

2. STRUCTURAL TRANSFORMATION

In this section I discuss the recent quantitative literature that emphasizes the role of factor reallocation across sectors in explaining income and growth differences across countries.8 The process of economic development is associated with a systematic reallocation of factors of production across sectors—

8 The literature on structural transformation is too large to be fairly recognized in this article; please see the recent survey in Herrendorf, Rogerson, and Valentinyi (2011) for references. I note, however, that the literature considers several approaches in driving reallocation across sectors.
For example, some models emphasize non-homothetic preferences, such as Echevarria (1997) and Kongsamut, Rebelo, and Xie (2001), while other models emphasize a non-unitary elasticity of substitution across goods and differential productivity growth across sectors, such as Baumol (1967) and Ngai and Pissarides (2007).

Figure 2 GDP per Capita, Selected Countries (in logs) (log GDP per capita, 1960 to 2010, for Botswana, China, Ghana, India, Singapore, the United States, Venezuela, and Zimbabwe)
Notes: GDP per capita data is from Heston, Summers, and Aten (2009).

the structural transformation—whereby factors are reallocated initially from agriculture to industry and services and later from agriculture and industry to services. There is a growing literature, following Kuznets (1966), emphasizing the importance of sectoral reallocation for aggregate outcomes.

The Role of Agriculture

An important development in the understanding of income differences across countries has been the recognition that agriculture plays a crucial role. Progress in this area has been enhanced by the availability of comparable data on agricultural output across countries, allowing a quantitative characterization of the magnitude of agricultural productivity differences, and by quantitative assessments of plausible hypotheses using sectoral models.9

9 See, for instance, Rao (1993) and Restuccia, Yang, and Zhu (2008).

Figure 3 Labor Productivity in Agriculture across Countries (relative GDP per worker in agriculture against relative GDP per worker, both on log scales)
Notes: Data from Restuccia, Yang, and Zhu (2008). Data for 1985.

To start, let me motivate why agriculture is important. From a historical perspective, the reallocation
process away from agriculture—hence, the process of industrialization—has been associated with improvements in agricultural productivity (see, for instance, Kuznets [1966]). In addition, in the more recent cross-country data, we observe that agriculture plays a critical role since, relative to rich countries, labor productivity in agriculture in poor countries is much lower than in the rest of the economy (see Figure 3), and most of their labor is allocated to agriculture. Whereas poor countries allocate more than 85 percent of the labor force to agriculture, rich countries allocate only 4 percent (see Figure 4). Noting that aggregate labor productivity is the sum of labor productivity across sectors weighted by the share of employment in each sector, and using labor productivity and employment data for rich and poor countries, I find that agriculture accounts for 85 percent of the difference in aggregate labor productivity across rich and poor countries.10 Recalling that the bulk of the differences in income per capita across countries are explained by differences in labor productivity, the literature concludes that understanding productivity and labor allocation in agriculture may be at the core of income differences among rich and poor countries.

Figure 4 Share of Employment in Agriculture (share of employment in agriculture against GDP per worker relative to the United States)
Notes: This is Figure 1 in Restuccia, Yang, and Zhu (2008). Data for 1985.

10 The data reported in Restuccia, Yang, and Zhu (2008) suggest that if poor countries were to have the same share of employment and labor productivity in agriculture as the rich countries, then the aggregate labor productivity factor between rich and poor countries would be
approximately 5-fold instead of the actual 34-fold difference. Hence, agriculture accounts for 85 percent (100 − 5/34 × 100) of the difference in aggregate labor productivity between rich and poor countries in the data.

The recognition that agriculture is central to understanding low productivity in poor countries is important in seeking the factors that account for this outcome, whether these factors are policy driven or institutional. There are two broad branches of this literature. The first branch can be roughly summarized as emphasizing the timing of industrialization in explaining current differences in income. The focus is on the delay in the process of structural transformation—broadly described as the process of resource reallocation from agriculture to other sectors of the economy. The second branch focuses on explaining the factors behind the low productivity in agriculture in poor countries observed in the cross-country data at a point in time. The two branches are closely connected, as they seek to assess the relevance of the sectoral structure (agriculture versus non-agriculture in particular) in cross-country income differences. They differ in terms of the relevance of the information that can be extracted from time-series variation in the sectoral structure across countries. I expand on this issue below. While there is an old and extensive literature in development on the role of agriculture and structural transformation, only recently has the literature provided a quantitative assessment.
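The development accounting behind the 85 percent figure in footnote 10 can be reproduced in a few lines. The sectoral levels below are illustrative numbers of my own choosing, not the Restuccia, Yang, and Zhu (2008) data; only the final line reproduces the exact arithmetic quoted in the footnote:

```python
# Aggregate labor productivity is the employment-share-weighted sum of
# sectoral labor productivities: Y/L = s_a * P_a + (1 - s_a) * P_n.
def aggregate_productivity(share_ag, prod_ag, prod_nonag):
    return share_ag * prod_ag + (1 - share_ag) * prod_nonag

# Illustrative (made-up) levels: a poor country keeps most workers in a very
# unproductive agricultural sector, a rich country does the opposite.
rich = aggregate_productivity(share_ag=0.04, prod_ag=30.0, prod_nonag=35.0)
poor = aggregate_productivity(share_ag=0.85, prod_ag=0.5, prod_nonag=4.0)
actual_gap = rich / poor  # roughly 34-fold with these numbers

# Counterfactual: give the poor country the rich country's agricultural
# employment share and agricultural productivity, non-agriculture unchanged.
poor_cf = aggregate_productivity(share_ag=0.04, prod_ag=30.0, prod_nonag=4.0)
counterfactual_gap = rich / poor_cf

# Share of the gap "accounted for" by agriculture, as in footnote 10:
share_from_ag = 100 - counterfactual_gap / actual_gap * 100
quoted = 100 - 5 / 34 * 100  # the exact arithmetic in the footnote, ~85
```

The precise share depends on the assumed sectoral levels; the footnote's quoted figures (a 34-fold gap falling to roughly 5-fold) give about 85 percent.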
Gollin, Parente, and Rogerson (2002) provide a model that rationalizes delays in the process of structural transformation and rising income inequality over long periods of time.11 The model formalizes many ideas in the traditional development literature and provides a quantitative assessment of the importance of the timing of the adoption of modern agricultural technology in explaining current international income differences. The model in Gollin, Parente, and Rogerson (2002) is quite simple. The economy is populated by homogeneous individuals that derive utility from consuming agricultural and non-agricultural goods and there is a subsistence need for agricultural goods. Thus, at low levels of income, individuals spend a bigger fraction of their income on agricultural goods than at high levels of income. There is strong empirical evidence in support of these types of preferences. Agricultural goods can be produced with two alternative technologies: a traditional production function that is linear in labor and features no labor productivity growth, and a modern technology, also linear in labor, that features positive labor productivity growth. The technology for producing non-agricultural goods is standard, featuring capital and labor inputs and positive labor productivity growth. The economy is characterized as follows. When the productivity of the modern agricultural technology is low—below that of the traditional technology—all labor is allocated to agriculture and income per capita is low and stagnant—essentially people are consuming close to their subsistence needs. This characterization resembles economies prior to 1800, where income per capita was roughly constant over time. 
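The takeoff mechanism just described can be illustrated with a stylized simulation. The growth rate, initial productivity levels, and horizon below are made-up parameters of mine, not the authors' calibration:

```python
# Stylized adoption timing: a country switches to the modern agricultural
# technology once its (growing) productivity overtakes the stagnant
# traditional technology; later adoption shows up as lower income today.
def adoption_year(a0_modern, a_traditional=1.0, g=0.02):
    """First year t with a0_modern * (1 + g)**t > a_traditional."""
    t = 0
    a = a0_modern
    while a <= a_traditional:
        a *= 1 + g
        t += 1
    return t

def income_today(a0_modern, today=300, g=0.02):
    """Stylized income: stagnant until adoption, then growing at rate g."""
    years_growing = max(0, today - adoption_year(a0_modern, g=g))
    return (1 + g) ** years_growing

early = income_today(a0_modern=0.90)  # adopts almost immediately
late = income_today(a0_modern=0.30)   # adopts decades later
```

Here a modest difference in initial modern-technology productivity delays adoption by about 55 years and leaves the late adopter roughly three times poorer at the end of the horizon, in the spirit of the experiments described next.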
Because of positive productivity growth in the modern agricultural technology, at some point in time the modern technology becomes more productive than the traditional technology, and the adoption of the modern technology in agriculture starts the process of industrialization and modern growth. With productivity growth in modern agriculture, labor is systematically reallocated from agriculture to non-agriculture over time. In the long run, the economy features properties consistent with the characterization of modern growth—positive and stable per capita income growth.

11 Closely related is the work of Lucas (2000) and Hansen and Prescott (2002), although these articles do not explicitly consider the agricultural sector.

Gollin, Parente, and Rogerson (2002) calibrate a benchmark economy to U.K. data covering about 250 years and show that the model reproduces very well the reallocation of labor out of agriculture over time, as well as the growth in output per capita. The authors then use the model to conduct experiments in which the productivity of modern agriculture is lowered relative to the level in the United Kingdom. Different productivity levels imply different dates at which the modern technology in agriculture becomes more productive than the traditional technology and, hence, different dates at which industrialization and modern growth start. Interestingly, reasonable differences in the timing of adoption of the modern agricultural technology imply large current differences in output per capita across economies. Moreover, the differences in income per capita persist for long periods of time. One conclusion from this study is that, as argued by Lucas (2002), a large portion of today's income differences across countries results from differences in the timing of the adoption of modern technologies.12 There are two issues with this interpretation of the results.
First, the persistence of income gaps over time in the model is related to the assumption that the process of reallocation of employment out of agriculture is common across countries. Cross-country data indicate, however, that countries that started the process of industrialization later than the United States or the United Kingdom have accomplished a comparable transformation in a much shorter time (see Duarte and Restuccia [2007] for the case of Portugal). Second, the model implies that income gaps should diminish over time, which is not observed in the recent cross-country data in Section 1. I conclude that this branch of the literature is useful in understanding cross-country differences in the timing of industrialization and the related transition, but it is unlikely to explain the current differences in agricultural productivity observed between rich and poor countries.

The second branch of the literature focuses on the factors behind low productivity in agriculture in poor countries. The focus is on understanding cross-country differences in the agricultural sector at a point in time, as opposed to cross-country differences over time. Restuccia, Yang, and Zhu (2008) develop a two-sector model of agriculture and non-agriculture emphasizing economy-wide differences in productivity and barriers to intermediate input use and labor mobility in agriculture. Empirical evidence suggests there is a strong systematic relationship between the level of development of a country and two forms of barriers in agriculture: a wedge between wages in agriculture and non-agriculture (a barrier due to limited labor mobility), and a high relative price of non-agricultural intermediate inputs such as fertilizers and pesticides (interpreted broadly as a barrier to intermediate input use).
These empirical regularities suggest that inefficiencies in agriculture may contribute to low agricultural productivity in poor countries and, as a consequence, to a large share of employment in agriculture. Restuccia, Yang, and Zhu (2008) embed these features in a model where preferences for consumption goods feature a subsistence requirement for food. Furthermore, producing non-agricultural goods requires only labor, while producing agricultural goods requires land, labor, and non-agricultural intermediate inputs.

12 See Ngai (2004) for a related study of the importance of barriers to investment in physical capital in the delay of the adoption of modern technologies.

The spirit of the exercise conducted in Restuccia, Yang, and Zhu (2008) is as follows. Since the technology for producing non-agricultural goods is linear in the labor input, data on labor productivity in non-agriculture pin down the level of economy-wide productivity in each country. This level of productivity is assumed to be exogenous in the analysis, but standard explanations of technology adoption and capital accumulation can be applied to this factor. Importantly, these explanations are not specific to the agricultural sector. Restuccia, Yang, and Zhu (2008) also take as given the differences across countries in the land-to-population ratio, the barriers to intermediate input use in agriculture, and the barriers to labor mobility. These objects are directly pinned down by country observations. The question then becomes: How important are all these factors (and each in isolation) in explaining low productivity in agriculture and high agricultural employment in poor countries? There are several results worth emphasizing.
First, if the model reproduces the low productivity in agriculture observed in poor countries (by, for example, lowering an agriculture-specific productivity parameter in poor countries), then it can rationalize the observed large share of employment in agriculture in these countries. Hence, understanding low productivity in agriculture in poor countries is key, with the ensuing reallocation of labor acting as a transmission mechanism to aggregate productivity differences. Second, exogenous differences in economy-wide productivity (measured as differences in non-agricultural productivity) and barriers are important in explaining low productivity in agriculture in poor countries, whereas differences in land endowments are of second-order importance. In particular, the model with exogenous differences in economy-wide productivity, barriers, and land endowments can explain two-thirds of the differences in labor productivity in agriculture between rich and poor countries, still leaving an important factor (about one-third) unexplained. Third, inefficiencies in agriculture are not the only determinant of low productivity in agriculture in poor countries. If productivity in non-agriculture in poor countries were equalized to that of rich countries—even keeping productivity and barriers in agriculture the same—the model would imply levels of productivity and employment in poor countries much closer to those of rich countries than in the baseline model: a share of employment in agriculture of 30 percent versus 68 percent in the baseline, a difference in labor productivity in agriculture of 10-fold versus 23-fold, and an aggregate productivity difference of 1.4-fold versus 10.8-fold.
This result suggests that not all problems lie in agriculture; instead, solving the problems that prevent non-agricultural productivity in poor countries from rising to the level of developed countries can help eliminate a substantial portion of the large differences in income between rich and poor countries.

Figure 5 Average Farm Size across Countries (log of average farm size against log of 1990 real GDP per capita; Corr = 0.61)
Source: Adamopoulos and Restuccia (2011).

Since there is still a large unexplained gap in labor productivity in agriculture, understanding low productivity in agriculture in poor countries has remained an active area of research. Four recent contributions have emphasized the role of transportation infrastructure (Adamopoulos 2011), the role of ability selection into agriculture (Lagakos and Waugh 2011), the role of farm size (Adamopoulos and Restuccia 2011), and the role of trade restrictions on importing food (Tombe 2011).13 In this article, I only summarize the findings on the importance of farm-size differences across countries. Adamopoulos and Restuccia (2011) develop a model of farm size to investigate its importance in understanding the low productivity problem in agriculture. The motivation for why farm size may matter is twofold.

13 See also the recent accounting exercises of the productivity gap between agriculture and non-agriculture in Herrendorf and Schoellman (2011), who emphasize the differences across U.S. states, and in Gollin, Lagakos, and Waugh (2011), who emphasize the differences across developing countries.
First, there are striking differences in average farm sizes and farm-size distributions across countries. Whereas average farm size is 54 hectares (Ha) in the richest 20 percent of countries, it is only 1.6 Ha in the poorest 20 percent of countries, a 34-fold difference. Figure 5 documents the positive relationship between the level of development and average farm size across countries. Cross-country differences in farm-size distributions are systematic. Whereas in poor countries more than 90 percent of farms are small (less than 5 Ha), only around 30 percent of farms in rich countries are small. In poor countries, none of the farms are large (more than 20 Ha), while almost 40 percent of the farms in rich countries are large. (See Figure 6 for a documentation of the share of small and large farms across quintiles of the income distribution.) Second, labor productivity is much higher in large farms than in small farms. For instance, in data from the U.S. Census of Agriculture, average labor productivity in farms greater than 800 Ha relative to farms less than 4 Ha is a factor of between 14- and 34-fold, depending on how operators and hired labor are treated in the measure of farm labor. The question addressed by Adamopoulos and Restuccia (2011) is what explains farm-size differences across countries and whether or not these differences help explain the productivity problem in agriculture in poor countries. Adamopoulos and Restuccia (2011) consider a model of farm size based on the span-of-control model of Lucas (1978), embedded into a standard sectoral model of agriculture and non-agriculture. The production unit in agriculture is a farm that requires the input of a farmer (labor), capital, and land. Farmers differ in their ability to manage a farm, and the farming technology is such that for each type of farmer there is an optimal farm size, where more productive farmers demand more capital and land and, hence, manage larger farms.
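The span-of-control logic behind this optimal farm size can be sketched under an assumed Cobb-Douglas technology; the functional form, the curvature parameter g, and the rental rates below are my assumptions for illustration, not the Adamopoulos and Restuccia (2011) specification:

```python
# Span-of-control sketch: a farmer of ability s operates technology
#   y = s**(1 - g) * x**g,
# where x is a composite land-capital input rented at price r.
# The first-order condition g * s**(1-g) * x**(g-1) = r yields the
# profit-maximizing farm size below, which is increasing in ability s.
def optimal_size(s, r, g=0.5):
    return (g / r) ** (1 / (1 - g)) * s

small = optimal_size(1.0, r=0.1)  # low-ability farmer
large = optimal_size(2.0, r=0.1)  # twice the ability, twice the farm
taxed = optimal_size(2.0, r=0.2)  # a higher effective rental rate (e.g., a
                                  # progressive land tax) compresses farm size
```

With this technology the optimal size is proportional to ability, so an ability distribution maps directly into a farm-size distribution, and size-dependent distortions (a higher effective r for able farmers) compress it, which is the channel the findings below exploit.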
While reallocation between agriculture and non-agriculture in the model depends on the same fundamental channels described in the previous literature (e.g., Gollin, Parente, and Rogerson [2002] and Restuccia, Yang, and Zhu [2008]), productivity in agriculture is also determined by the allocation of factors (capital and land) across farmers. There are three main findings. First, farm-size distortions, such as land reforms that cap the size of farms and progressive land taxes, are the most likely explanation for differences in farm-size distributions. There is overwhelming evidence for these distortions in cross-country data, and measured distortions can account quantitatively for most of the differences in farm-size distributions across countries. Other potential explanations, such as cross-country differences in aggregate factor endowments (land, capital, and economy-wide productivity), can account for, at most, one-fourth of the cross-country farm-size differences. Second, calibrating farm-size distortions to account for the observed farm-size differences helps explain three-fourths of the differences in agricultural and aggregate labor productivity across countries, with the remaining one-fourth explained by differences in aggregate factors. Third, specific distortionary policies in individual countries, such as a land reform in the Philippines and a progressive land taxation reform in Pakistan, are found to generate substantial drops in farm size and agricultural productivity in these countries.

Figure 6 Farm Size Distribution across Countries (fraction of holdings of 0–5 Ha, top panel, and of 20+ Ha, bottom panel, by quintile of the income distribution)
Source: Adamopoulos and Restuccia (2011).
Moreover, other factors occurring at the same time or over time in these countries are found to potentially mask the negative effects of distortionary policies on size and productivity in the agricultural sector, making empirical characterizations of these distortionary policies difficult.

Reallocation to Services

As emphasized earlier, models of structural transformation, that is, the reallocation of labor across sectors in an economy over time, have featured prominently in historical perspectives on growth and the timing of industrialization, such as in Lucas (2000, 2002) and Gollin, Parente, and Rogerson (2002).14 Duarte and Restuccia (2010) argue that structural transformation is also closely connected with the set of facts emphasized in Section 1 about the diversity of growth patterns in the time series for individual countries, the patterns of catch up, slowdown, stagnation, and decline in labor productivity that are observed even for more developed countries. For these countries, agriculture is less important in the economy, and the more relevant transformation involves a substantial shift to services rather than a shift out of agriculture.15 Duarte and Restuccia (2010) develop a tractable model of the structural transformation to quantitatively assess the contribution of sectoral labor productivity growth to understanding the evolution of aggregate productivity across countries. The model consists of three sectors: agriculture, industry, and services, with linear technologies in labor in each sector. Structural transformation is driven in the model by two factors: non-homothetic preferences for agricultural and services goods (with an income elasticity less than one for agriculture and more than one for services) and an elasticity of substitution less than one between industry and services, so that differential productivity growth in industry and services also generates reallocation across these sectors.
Hence, a poor country in the model featuring low productivity in all sectors allocates a large share of labor to agriculture, a low share of labor to services, and the remaining labor to industry. With positive productivity growth in all sectors, labor is reallocated away from agriculture toward industry and services. With faster productivity growth in manufacturing than in services—as documented in the cross-country data by Duarte and Restuccia (2010)—there is further reallocation of labor from industry to services. Further, faster productivity growth in agriculture produces a speedier transformation out of agriculture. The framework is used for two purposes. The first purpose is to infer from the model comparable measures of labor productivity across sectors and countries. These sectoral measures of labor productivity are not generally available for a large cross-section of countries. The second purpose is to assess quantitatively the relevance of sectoral labor productivity growth in driving labor reallocation across sectors and aggregate productivity over time across countries.

14 See also the recent survey article by Herrendorf, Rogerson, and Valentinyi (2011) on models of structural transformation.
15 For example, notice in Figure 7 how, in the earlier stages of structural transformation in Greece, Ireland, and Spain, labor reallocated from agriculture to both industry and services, but in a later stage (and throughout the sample for Canada) reallocation also occurs from industry to services, with the agricultural sector representing a small fraction of total hours.
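The reallocation mechanics described above can be illustrated with a deliberately minimal sketch. This is not the calibrated Duarte-Restuccia model: non-homotheticity is reduced to a subsistence food requirement `abar`, industry and services enter through a CES aggregator with weight `omega` and elasticity `eps < 1`, and all parameter values are illustrative assumptions.

```python
# Minimal sketch of three-sector labor allocation with linear technologies
# y_i = A_i * l_i. With competitive labor markets, prices are p_i = w / A_i,
# so CES demand over industry (m) and services (s) implies
#   l_m / l_s = (omega / (1 - omega))**eps * (A_m / A_s)**(eps - 1),
# which falls as A_m / A_s rises whenever eps < 1.

def labor_shares(A_a, A_m, A_s, abar=0.2, omega=0.5, eps=0.5):
    """Return labor shares (l_a, l_m, l_s); abar, omega, eps are illustrative."""
    l_a = abar / A_a                  # labor needed to meet subsistence food
    ratio = (omega / (1.0 - omega))**eps * (A_m / A_s)**(eps - 1.0)  # l_m / l_s
    l_s = (1.0 - l_a) / (1.0 + ratio)
    l_m = (1.0 - l_a) - l_s
    return l_a, l_m, l_s

# Faster productivity growth in agriculture and industry than in services
# moves labor out of agriculture and raises the services share over time.
early = labor_shares(A_a=0.5, A_m=1.0, A_s=1.0)
late = labor_shares(A_a=2.0, A_m=4.0, A_s=2.0)
print(early, late)
```

With `eps < 1`, the sector with slower productivity growth (services) absorbs a growing share of labor, the Baumol-style mechanism behind the reallocation to services.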
Figure 7 Share of Hours across Sectors, Selected Countries
[Figure: four panels (Greece, Ireland, Spain, Canada) plotting the shares of hours in agriculture, industry, and services over 1960–2000.]
Notes: This is Figure 2 in Duarte and Restuccia (2010).

Two key findings emerge from this framework. The first finding is that labor productivity differences across countries at a point in time are largest in agriculture and services and smallest in industry. These findings have the following mechanical and intuitive implication. Suppose for the moment that labor productivity differences across sectors and countries remain constant over time; that is, assume that growth in labor productivity in each sector is equal across countries. Then, with positive productivity growth in all sectors, the process of structural transformation implies that countries are reallocating labor from agriculture to manufacturing and to services. Since labor productivity is lower in agriculture relative to industry in poor countries compared to rich countries, the reallocation of labor from agriculture to manufacturing can explain an increase (catch up) in relative productivity for the poor countries. As the process of structural transformation continues with reallocation from manufacturing (and to a lesser extent agriculture) to services, a lower ratio of labor productivity in services relative to industry in poor countries compared to rich ones may imply episodes of slowdown, stagnation, and decline in relative aggregate productivity. The cross-country growth pattern across sectors gets a bit more complicated when, in addition, labor productivity gaps are changing over time.

Figure 8 Relative Labor Productivity across Sectors and Countries
[Figure: three panels (agriculture, industry, services) plotting relative labor productivity in the first and last sample years by quintile of aggregate productivity in the first year.]
Notes: This is Figure 6 in Duarte and Restuccia (2010).

In fact, the evidence suggests that there has been substantial cross-country catch up in labor productivity in agriculture and manufacturing over time but not in services, and that this process is important in understanding the evolution of aggregate productivity across countries. Figure 8 shows the implications of the model in Duarte and Restuccia (2010) for the first year in the sample (1956 for most countries) and the last year in the sample (2005 for most countries). Countries in the second, third, and fourth quintiles of the income distribution managed to achieve substantial catch up in relative sectoral productivity for agriculture and industry, but in general there is a lack of catch up in productivity in services. The second finding is that the patterns of sectoral productivity across sectors and countries just emphasized account for most of the labor reallocation observed across countries.16 Moreover, the catch up in manufacturing productivity accounts for 50 percent of the catch up in aggregate productivity across countries, and the lack of catch up in services explains all the experiences of slowdown, stagnation, and decline in aggregate productivity across countries.

16 Duarte and Restuccia (2010) emphasize that, for some countries, sectoral productivity growth generates labor reallocation that is different from the data, suggesting that distortions/frictions may be important for some individual-country experiences.
These findings point to the importance of the service sector in current growth experiences and present a challenge for economic policy in disentangling the relevant policies/regulations that affect the evolution of service-sector and aggregate productivity across countries.

3. REALLOCATION ACROSS ESTABLISHMENTS

A recurrent finding of the development accounting literature, such as Klenow and Rodríguez-Clare (1997) and Prescott (1998), is that TFP is the most important factor in explaining income differences across countries. Most of the analysis explaining productivity differences across countries was done in the context of frameworks with a stand-in or representative firm featuring constant returns to scale in production. The result was then an emphasis on aggregate factors that explain the lack of technology adoption in poor countries. For instance, Parente and Prescott (1994, 2000) develop a framework emphasizing barriers to technology adoption in poor countries. Complementing this work, the evidence from microeconomic studies, such as Baily, Hulten, and Campbell (1992) and Foster, Haltiwanger, and Syverson (2008), suggests that the reallocation of factors of production—from failing to entering firms, and especially from less to more productive firms—accounts for a substantial portion of aggregate productivity growth in the data. For this reason, Restuccia and Rogerson (2008) consider a model of heterogeneous production units where reallocation across these units is at the core of measured productivity in the economy.17

Misallocation and Productivity

The model in Restuccia and Rogerson (2008) embeds the industry equilibrium model of Hopenhayn (1992) into a standard one-sector growth model.18 Production takes place in establishments. The technology at the establishment level differs in TFP and features decreasing returns to scale in capital and labor inputs.
The implication of these two features is that there is an optimal size of establishments, i.e., an optimal amount of capital, labor, and output for each productivity type, and the size of an establishment is positively related to its productivity. In other words, the efficient allocation of factors given these assumptions is such that capital and labor are allocated according to productivity, and the amount of aggregate resources determines the number of establishments. The aggregate production function then features constant returns to scale in the sense that if capital and labor were to double in the economy, then the number of establishments and output would double too. A critical feature of the model is that policies or institutions that affect the prices paid or received by establishments (what Restuccia and Rogerson [2008] call idiosyncratic distortions) generate a reallocation across establishments that lowers productivity. The list of institutions and policies that create such reallocation is large, and they are a prevalent feature of poor countries. For example, non-competitive banking systems offering below-market interest rate loans to selected producers based on non-economic factors, governments exempting certain producers from regulations or taxes, and public enterprises, often associated with low productivity, receiving large subsidies from the government for their operation (financed through taxes on other producers) are the types of distortions that affect the size of certain establishments, inducing a misallocation of factors of production. Labor market regulation and trade restrictions may also lead to idiosyncratic distortions.

17 See also Banerjee and Duflo (2005) for a survey of closely related literature in microeconomic development.
18 An early analysis of the importance of reallocation is in Hopenhayn and Rogerson (1993), who focus on the effect of firing taxes on employment differences across countries.
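The effect of such idiosyncratic distortions can be illustrated in a toy, labor-only version of this environment. This is a sketch under strong assumptions, not the calibrated model: there is no capital, no entry or exit, no revenue-neutral subsidy, and the returns-to-scale parameter and tax rate are illustrative.

```python
import numpy as np

def aggregate_output(s, tau, gamma=0.85, N=1.0):
    """Aggregate output when establishment i with TFP s[i] operates
    y_i = s_i * n_i**gamma and faces a revenue tax tau[i]. Profit
    maximization at a common wage makes labor demand proportional to
    ((1 - tau_i) * s_i)**(1 / (1 - gamma))."""
    weights = ((1.0 - tau) * s) ** (1.0 / (1.0 - gamma))
    n = N * weights / weights.sum()        # labor market clearing
    return float((s * n**gamma).sum())

rng = np.random.default_rng(0)
s = np.exp(rng.normal(0.0, 1.0, size=5_000))   # log-normal establishment TFP
tau = np.where(s >= np.median(s), 0.40, 0.0)   # tax the more productive half

undistorted = aggregate_output(s, np.zeros_like(s))
distorted = aggregate_output(s, tau)
print(f"output loss from the distortion: {1 - distorted / undistorted:.1%}")
```

Because total labor is held fixed, the entire output drop registers as a fall in measured TFP: the tax pushes labor from high- to low-productivity establishments without changing aggregate inputs.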
The approach in Restuccia and Rogerson (2008) is to represent all these potential sources of distortions through a generic form of tax/subsidy schemes and to assess their potential impact on aggregate productivity. Restuccia and Rogerson (2008) study policy configurations whereby a fraction of establishments is taxed at a specified rate and the remaining fraction of establishments is subsidized. The subsidy rate is such that the aggregate capital stock remains the same. The reason for this approach is that the elements that affect capital accumulation are well understood, and research has shown that capital accumulation is not a crucial factor in accounting for income differences (see, for instance, Klenow and Rodríguez-Clare [1997]).19 To make a quantitative assessment, Restuccia and Rogerson (2008) calibrate a benchmark economy with no distortions to data for the United States. The key components in calibrating the model are the elements that allow the model to reproduce the distribution of establishments and their size in the data. Experiments are conducted assuming that all countries are identical to the benchmark economy except in the configuration of idiosyncratic distortions. Even though the experiments are such that aggregate resources and the distribution of production efficiencies are the same as in the benchmark economy, idiosyncratic distortions are shown to have substantial negative effects on measured TFP and output. In particular, a policy configuration where 50 percent of the most productive establishments are taxed at 40 percent implies a drop in TFP and output of 30 percent. Drops in TFP and output can be larger if more establishments are taxed; for instance, if 90 percent of establishments were taxed and only 10 percent subsidized, measured TFP and output would drop by 50 percent.20

While the policy experiments that Restuccia and Rogerson (2008) implement are hypothetical, there is substantial evidence on the types of policies that create idiosyncratic distortions. In related work, Hsieh and Klenow (2009) use microeconomic data on plants in the manufacturing sector for China, India, and the United States to measure the size of policy distortions and evaluate their aggregate impact. They find that eliminating misallocation in China and India (relative to that of the United States) can increase measured TFP by between 30 percent and 60 percent. Roughly speaking, the intuition for how the microeconomic data can uncover the size of policy distortions is that, in an economy without distortions, establishments with access to the same technology (except for TFP) and facing the same prices for output and factor inputs would equalize the marginal products of factors to the aggregate prices. With underlying differences in productivity across establishments, the more productive establishments are larger than less productive establishments. Idiosyncratic policy distortions affect the prices faced by individual establishments and, hence, prevent establishments from equalizing their marginal products. Data on establishment-level output, factor inputs, and input payments permit an evaluation of the price distortions that must be in place for the data to be an equilibrium of the distorted economy.

19 More generally though, idiosyncratic distortions to establishments can also lead to substantial effects on aggregate capital accumulation, which may be of importance for individual-country experiences. See, for instance, Bello, Blyde, and Restuccia (2011) for an assessment of idiosyncratic distortions on capital accumulation in Venezuela.
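The inference logic just described can be sketched in a stripped-down, one-input version. In Hsieh and Klenow (2009) wedges are recovered from capital and labor jointly; here a single revenue wedge (1 - tau_i) distorts a labor-only technology, and all parameter values and distributions are illustrative assumptions.

```python
import numpy as np

gamma, w = 0.85, 1.0                  # illustrative curvature and common wage
rng = np.random.default_rng(1)
s = np.exp(rng.normal(0.0, 1.0, size=1_000))     # establishment TFP
tau_true = rng.uniform(-0.2, 0.4, size=s.size)   # hypothetical wedges (subsidy < 0)

# With y_i = s_i * n_i**gamma, the distorted first-order condition is
# (1 - tau_i) * gamma * s_i * n_i**(gamma - 1) = w, which generates the
# "observed" establishment-level labor choices and output:
n = ((1.0 - tau_true) * gamma * s / w) ** (1.0 / (1.0 - gamma))
y = s * n**gamma

# Inference step: rearranging the same first-order condition gives
#   tau_i = 1 - w * n_i / (gamma * y_i),
# so the wedge is revealed by the establishment's labor share of revenue.
tau_hat = 1.0 - w * n / (gamma * y)
print(np.allclose(tau_hat, tau_true))  # True: wedges recovered from observables
```

Without distortions, w * n_i / y_i equals gamma for every establishment; dispersion in this ratio across establishments is precisely what signals misallocation in the microeconomic data.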
Therefore, given the distortions, an evaluation can be made of the productivity gains from eliminating them.21 Interestingly, Hsieh and Klenow's (2009) empirical work also uncovers important differences between China, India, and the United States in the distribution of establishment-level productivity. The distribution of productivity across establishments is assumed to be the same across countries in Restuccia and Rogerson's (2008) experiments, as the focus is on reallocation across these units. Differences in the distribution of productivity are also abstracted from in the gains from reallocation in Hsieh and Klenow's (2009) calculations.22 The differences in the distribution of productivity across establishments can potentially be the result of distortionary policies and can be studied jointly, for example, by allowing the policy distortions to have an impact on the selection of establishments through entry/exit and on productivity investment by establishments. Recent work has started to allow for an interaction between policy distortions and the distribution of establishments. In these frameworks, the shift in the distribution of establishment-level productivity is a consequence of changes in the amount of investment by establishments in their level of productivity in the face of idiosyncratic distortions that may discourage higher efficiency and of barriers to entry and doing business, which are quite prevalent in poor countries.23 In this regard, Restuccia (2011) and Bello, Blyde, and Restuccia (2011) study variants of the Restuccia and Rogerson (2008) model where policy distortions shift the distribution of productivity across establishments in the economy toward the lower productivity units.24

20 I note that Restuccia and Rogerson (2008) also look at other potential policy configurations whereby distortionary policies are either random (some establishments are subsidized and others taxed, but which establishment is taxed/subsidized is not related to productivity) or the more productive establishments are subsidized. While less damaging, these alternative policy configurations also have a negative impact on aggregate productivity as the size of establishments is distorted.
21 Much work has followed Hsieh and Klenow's (2009) approach using microeconomic data on firms to uncover distortions and productivity gains from reallocation in many countries. See, for instance, Pagés (2010) for applications in Latin American countries.
22 Hsieh and Klenow (2009) calculate the gains from reallocation as the ratio of efficient output to actual output for each country, where efficient output is produced by assuming factors of production are assigned efficiently to the establishments in the country.

Specific Policies and Institutions

A limitation of the empirical measures of idiosyncratic distortions in Hsieh and Klenow (2009) is that they do not directly connect with specific policies and institutions. Such a connection is critical for the determination of policy prescriptions for poor countries. Recent studies have tried to provide a quantitative assessment of specific policies or institutions in accounting for misallocation and low productivity in poor countries.
This literature cannot be described in much detail in this article.25 Broadly speaking, the applications span issues that include: the importance of financial development, as in Buera, Kaboski, and Shin (2011), Greenwood, Sanchez, and Wang (2010, 2011), and Midrigan and Xu (2010); the relevance of size-dependent policies that discourage large-scale operation through heavier regulation and taxes, as in Guner, Ventura, and Xu (2008); the importance of restrictions on foreign direct investment, as in Burstein and Monge-Naranjo (2009); and the relevance of specific policies, such as land reforms and progressive land taxes that discourage large-scale operation in farming, as in Adamopoulos and Restuccia (2011), among many others.26 Focusing on the role of specific factors reduces the scope of potential impact on aggregate productivity and often still involves difficult issues of measurement. As a result, much work remains to be done in identifying and measuring specific policies and institutions and assessing their quantitative significance for the allocation of resources across productive units and, hence, for understanding aggregate productivity differences across countries.

23 See, for instance, the empirical measures of the cost of entry in Doing Business 2011 from the World Bank (2011).
24 See also the interesting work in Ranasinghe (2011a, 2011b) and a related literature in trade that emphasizes a shift in the distribution of productivities, e.g., Atkeson and Burstein (2010) and Rubini (2010).
25 The growing literature on misallocation and productivity will be the subject of a special issue of the Review of Economic Dynamics to be published in January 2013.
26 There is also a growing empirical literature assessing the importance of policies in specific experiences, but often the empirical studies are limited by the availability of good-quality microeconomic data and by the difficulty of accessing the data. Two interesting examples of cases where good microeconomic data are available are the study of trade reform in Colombia in Eslava et al. (2011), where the data include quantity and price information for each producer, allowing for a real measure of productivity at the establishment level as opposed to the typical revenue measure of productivity, and the study of the increase in dispersion in tariffs associated with the Smoot-Hawley Tariff in the United States during the Great Depression in Bond et al. (2011).

4. CONCLUSIONS

Differences in income across nations are large. Moreover, the data show remarkable episodes of growth catch up and collapse. In this article, I reviewed the recent literature in quantitative growth economics that broadly addresses these facts. In a nutshell, substantial progress has been made by studying the determinants of resource allocation across heterogeneous productive units, whether across sectors or across establishments within sectors. Much more work remains to be done in determining the fundamental factors driving resource allocation across productive units. To be more concrete, while agriculture has been shown to be important in explaining the income differences between rich and poor countries, further advances are needed in accounting for the low productivity problem in agriculture in poor countries. For instance, what specific policies and institutions explain the small-scale operations in agriculture in poor countries? Is the lack of well-defined property rights important? Are price distortions or other specific policies that discriminate against large operational scales important? What sort of barriers prevent trade in agricultural goods in low productivity countries?

Similarly, while differences in labor productivity across sectors and countries are found to be important in accounting for the patterns of aggregate labor productivity growth across countries, it remains to be analyzed in detail what factors/policies/institutions explain the observed differences in labor productivity levels and growth rates across sectors and countries. For example, what determines the large gap in labor productivity in the service sector even among relatively developed countries? How do regulations and market structure affect productivity in services across countries? Closely related, misallocation of resources across heterogeneous production units is also found to generate substantial negative effects on measured aggregate TFP. But empirical measures of misallocation have so far been constructed for a relatively small number of countries, and these measures need to be linked with specific policies and institutions. Better measurement of individual policies and institutions affecting productivity at the establishment level, as well as better measurement of productivity at the microeconomic level, are likely to yield important returns in terms of our understanding of productivity differences across countries. These advances are likely to allow for the design of effective policies addressing frictions and market imperfections that prevent an optimal allocation of resources, as well as the removal of barriers that prevent poor countries from operating closer to the technological frontier.

REFERENCES

Adamopoulos, Tasso. 2011. “Transportation Costs, Agricultural Productivity, and Cross-Country Income Differences.” International Economic Review 52 (2): 489–521.

Adamopoulos, Tasso, and Diego Restuccia. 2011.
“The Size Distribution of Farms and International Productivity Differences.” Manuscript, University of Toronto.

Atkeson, Andrew, and Ariel Tomás Burstein. 2010. “Innovation, Firm Dynamics, and International Trade.” Journal of Political Economy 118 (3): 433–84.

Baily, Martin Neal, Charles Hulten, and David Campbell. 1992. “Productivity Dynamics in Manufacturing Plants.” Brookings Papers on Economic Activity: Microeconomics, 187–267.

Banerjee, Abhijit V., and Esther Duflo. 2005. “Growth Theory through the Lens of Development Economics.” In Handbook of Economic Growth, Vol. 1A, edited by Philippe Aghion and Steven Durlauf. New York: North Holland, 473–552.

Baumol, William J. 1967. “Macroeconomics of Unbalanced Growth: The Anatomy of Urban Crisis.” American Economic Review 57 (June): 415–26.

Bello, Omar D., Juan S. Blyde, and Diego Restuccia. 2011. “Venezuela’s Growth Experience.” Latin American Journal of Economics 48 (2): 199–226.

Bond, Rick, Mario Crucini, Tristan Potter, and Joel Rodrigue. 2011. “Misallocation and Productivity Effects of the Hawley-Smoot Tariff of 1930.” Manuscript, Vanderbilt University.

Buera, Francisco J., Alexander Monge-Naranjo, and Giorgio E. Primiceri. 2011. “Learning the Wealth of Nations.” Econometrica 79 (1): 1–45.

Buera, Francisco J., Joseph Kaboski, and Yongseok Shin. 2011. “Finance and Development: A Tale of Two Sectors.” American Economic Review 101 (August): 1,964–2,002.

Burstein, Ariel T., and Alexander Monge-Naranjo. 2009. “Foreign Know-How, Firm Control, and the Income of Developing Countries.” Quarterly Journal of Economics 124 (1): 149–95.

Caselli, Francesco. 2005. “Accounting for Cross-Country Income Differences.” In Handbook of Economic Growth, Vol. 1A, edited by Philippe Aghion and Steven Durlauf. New York: North Holland, 679–741.

Duarte, Margarida, and Diego Restuccia. 2006.
“The Productivity of Nations.” Federal Reserve Bank of Richmond Economic Quarterly 92 (Summer): 195–223.

Duarte, Margarida, and Diego Restuccia. 2007. “The Structural Transformation and Aggregate Productivity in Portugal.” Portuguese Economic Journal 6 (April): 26–46.

Duarte, Margarida, and Diego Restuccia. 2010. “The Role of the Structural Transformation in Aggregate Productivity.” Quarterly Journal of Economics 125 (February): 129–73.

Echevarria, Cristina. 1997. “Changes in Sectoral Composition Associated with Economic Growth.” International Economic Review 38 (May): 431–52.

Erosa, Andres, Tatyana Koreshkova, and Diego Restuccia. 2010. “How Important is Human Capital? A Quantitative Theory Assessment of World Income Inequality.” Review of Economic Studies 77 (October): 1,421–49.

Eslava, Marcela, John Haltiwanger, Adriana Kugler, and Maurice Kugler. 2011. “Trade, Technical Change and Market Selection: Evidence from Manufacturing Plants in Colombia.” Manuscript, University of Maryland.

Foster, Lucia, John Haltiwanger, and Chad Syverson. 2008. “Reallocation, Firm Turnover, and Efficiency: Selection on Productivity or Profitability?” American Economic Review 98 (1): 394–425.

Gollin, Doug, David Lagakos, and Mike Waugh. 2011. “The Agricultural Productivity Gap in Developing Countries.” Manuscript, Arizona State University.

Gollin, Doug, Stephen L. Parente, and Richard Rogerson. 2002. “The Role of Agriculture in Development.” American Economic Review 92 (May): 160–4.

Greenwood, Jeremy, Juan M. Sanchez, and Cheng Wang. 2010. “Financing Development: The Role of Information Costs.” American Economic Review 100 (September): 1,875–91.

Greenwood, Jeremy, Juan M. Sanchez, and Cheng Wang. 2011. “Quantifying the Impact of Financial Development on Economic Development.” Manuscript, University of Pennsylvania.

Guner, Nezih, Gustavo Ventura, and Yi Xu. 2008.
“Macroeconomic Implications of Size-Dependent Policies.” Review of Economic Dynamics 11 (October): 721–44.

Hall, Robert E., and Charles I. Jones. 1999. “Why Do Some Countries Produce So Much More Output per Worker than Others?” The Quarterly Journal of Economics 114 (February): 83–116.

Hansen, Gary D., and Edward C. Prescott. 2002. “Malthus to Solow.” American Economic Review 92 (September): 1,205–17.

Herrendorf, Berthold, and Todd Schoellman. 2011. “Why is Labor Productivity so Low in Agriculture.” Manuscript, Arizona State University.

Herrendorf, Berthold, Richard Rogerson, and Akos Valentinyi. 2011. “Growth and Structural Transformation.” Manuscript, Princeton University. Forthcoming in the Handbook of Economic Growth.

Heston, Alan, Robert Summers, and Bettina Aten. 2009. “Penn World Table Version 6.3.” Center for International Comparisons of Production, Income and Prices at the University of Pennsylvania (August).

Hopenhayn, Hugo A. 1992. “Entry, Exit, and Firm Dynamics in Long Run Equilibrium.” Econometrica 60 (September): 1,127–50.

Hopenhayn, Hugo A., and Richard Rogerson. 1993. “Job Turnover and Policy Evaluation: A General Equilibrium Analysis.” Journal of Political Economy 101 (October): 915–38.

Hsieh, Chang-Tai, and Peter J. Klenow. 2009. “Misallocation and Manufacturing TFP in China and India.” The Quarterly Journal of Economics 124 (November): 1,403–48.

Hsieh, Chang-Tai, and Peter J. Klenow. 2010. “Development Accounting.” American Economic Journal: Macroeconomics 2 (January): 207–23.

Jones, Charles I., and Peter J. Klenow. 2011. “Beyond GDP? Welfare across Countries and Time.” Manuscript, Stanford University.

Klenow, Peter J., and Andrés Rodríguez-Clare. 1997. “The Neoclassical Revival in Growth Economics: Has It Gone Too Far?” In NBER Macroeconomics Annual 1997, Vol. 12, edited by Ben S. Bernanke and Julio Rotemberg. Cambridge, Mass.: MIT Press, 73–114.
Kongsamut, Piyabha, Sergio Rebelo, and Danyang Xie. 2001. “Beyond Balanced Growth.” Review of Economic Studies 68 (October): 869–82.

Kuznets, S. 1966. Modern Economic Growth. New Haven, Conn.: Yale University Press.

Lagakos, David, and Michael Waugh. 2011. “Specialization, Economic Development, and Aggregate Productivity Differences.” Manuscript, New York University.

Lucas, Jr., Robert E. 1978. “On the Size Distribution of Business Firms.” Bell Journal of Economics 9 (Autumn): 508–23.

Lucas, Jr., Robert E. 2000. “Some Macroeconomics for the 21st Century.” Journal of Economic Perspectives 14 (Winter): 159–68.

Lucas, Jr., Robert E. 2002. “The Industrial Revolution: Past and Future.” In Lectures on Economic Growth. Cambridge, Mass.: Harvard University Press, 109–90.

Manuelli, Rodolfo, and Ananth Seshadri. 2006. “Human Capital and the Wealth of Nations.” Manuscript, University of Wisconsin.

Midrigan, Virgiliu, and Daniel Yi Xu. 2010. “Finance and Misallocation: Evidence from Plant-level Data.” Manuscript, New York University.

Ngai, L. Rachel. 2004. “Barriers and the Transition to Modern Growth.” Journal of Monetary Economics 51 (October): 1,353–83.

Ngai, L. Rachel, and Christopher A. Pissarides. 2007. “Structural Change in a Multi-Sector Model of Growth.” American Economic Review 97 (March): 429–43.

Pagés, Carmen. 2010. The Age of Productivity: Transforming Economies from the Bottom Up. New York: Palgrave MacMillan.

Parente, Stephen L., and Edward C. Prescott. 1993. “Changes in the Wealth of Nations.” Federal Reserve Bank of Minneapolis Quarterly Review 17 (Spring): 3–16.

Parente, Stephen L., and Edward C. Prescott. 1994. “Barriers to Technology Adoption and Development.” Journal of Political Economy 102 (April): 298–321.

Parente, Stephen L., and Edward C. Prescott. 2000. Barriers to Riches. Cambridge, Mass.: MIT Press.

Prescott, Edward C. 1998.
“Needed: A Theory of Total Factor Productivity.” International Economic Review 39 (August): 525–51.

Ranasinghe, Ashantha. 2011a. “Property Rights, Extortion and the Misallocation of Talent.” Manuscript, University of Toronto.

Ranasinghe, Ashantha. 2011b. “Impact of Policy Distortions on Plant-level Innovation, Productivity Dynamics and TFP.” Manuscript, University of Toronto.

Rao, D. S. Prasada. 1993. Intercountry Comparisons of Agricultural Output and Productivity. Rome: Food and Agriculture Organization of the United Nations.

Restuccia, Diego, and Richard Rogerson. 2008. “Policy Distortions and Aggregate Productivity with Heterogeneous Establishments.” Review of Economic Dynamics 11 (October): 707–20.

Restuccia, Diego, Dennis Tao Yang, and Xiaodong Zhu. 2008. “Agriculture and Aggregate Productivity: A Quantitative Cross-Country Analysis.” Journal of Monetary Economics 55 (March): 234–50.

Restuccia, Diego. 2011. “The Latin American Development Problem.” Manuscript, University of Toronto.

Rubini, Loris. 2010. “Innovation and the Elasticity of Trade Volumes to Tariff Reductions.” Manuscript, Universidad Carlos III de Madrid.

Tombe, Trevor. 2011. “The Missing Food Problem: How Low Agricultural Imports Contribute to International Income Differences.” Manuscript, University of Toronto.

World Bank. 2011. Doing Business 2011. Prepared by the Doing Business Unit. Washington, D.C.: World Bank (November).