Economic Quarterly—Volume 93, Number 4—Fall 2007—Pages 317–339

Evolving Inflation
Dynamics and the New
Keynesian Phillips Curve
Andreas Hornstein

In most industrialized economies, periods of above average inflation tend to be associated with above average economic activity, for example, as measured by a relatively low unemployment rate. This statistical relationship, known as the Phillips curve, is sometimes invoked when economic
commentators suggest that monetary policy should not try to suppress signs of
inflation. But this interpretation of the Phillips curve implicitly assumes that
the statistical relationship is structural, that is, the relationship will not break
down during periods of persistently high inflation. Starting in the mid-1960s,
Friedman and Phelps argued that the Phillips curve is indeed not structural
and the experience of the United States and other countries with high inflation
and low GDP growth in the late 1960s and 1970s has subsequently borne out
their predictions.
Various theories have been proposed to explain the Phillips curve and
most of these theories agree that there is no significant long-term tradeoff between inflation and the level of economic activity. One theory that provides
a structural interpretation of the short-term inflation-unemployment relationship, and that has become quite popular over the last ten years among central bank economists, is based on explicit models of nominal price rigidity. The
most well-known example of this theory is the New Keynesian Phillips Curve
(NKPC).
In this article, I evaluate how well a structural NKPC can account for
the changing nature of inflation in the United States from the 1950s to today.
First, I document that changes in average inflation have been associated with
I would like to thank Chris Herrington, Thomas Lubik, Yash Mehra, and Alex Wolman for
helpful comments, and Kevin Bryan for excellent research assistance. Any opinions expressed
in this article are my own and do not necessarily reflect those of the Federal Reserve Bank
of Richmond or the Federal Reserve System. E-mail: andreas.hornstein@rich.frb.org.

changes in the dynamics of inflation as measured by inflation persistence and
the co-movement of inflation with measures of real activity that the NKPC
predicts are relevant for inflation. Then I argue that the NKPC with fixed
structural parameters cannot account for these changes in the inflation process.
I conclude that the NKPC does not provide a complete structural interpretation
of the Phillips curve. This is troublesome since the changed inflation dynamics
are related to changes in average inflation, which are presumably driven by
systematic monetary policy. But if the NKPC is not invariant to systematic
changes of monetary policy, then its use for monetary policy is rather limited.
In models with nominal rigidities, sticky-price models for short, monopolistically competitive firms set their prices as markups over their marginal cost.
Since these firms are limited in their ability to adjust their nominal prices, future inflation tends to induce undesired changes in their relative prices. When
firms have the opportunity to adjust their prices they will, therefore, set their
prices contingent on averages of expected future marginal cost and inflation.
The implied relationship between inflation and economic activity is potentially
quite complicated, but for a class of models one can show that to a first-order
approximation current inflation is a function of current marginal cost and expected future inflation, the so-called NKPC. The coefficients in this NKPC
are interpreted as structural in the sense that they are likely to be independent
of monetary policy.
In the U.S. economy, inflation tends to be very persistent, in particular, it
tends to be at least as persistent as is marginal cost. At the same time, inflation
is not that strongly correlated with marginal cost. This observation appears
to be inconsistent with the standard NKPC since here inflation is essentially
driven by marginal cost, and inflation is, at most, as persistent as marginal
cost. But if inflation is as persistent as is marginal cost then the model also
predicts a strong positive correlation between inflation and marginal cost. One
can potentially account for this observation through the use of a hybrid NKPC
which makes current inflation not only a function of expected future inflation,
but also of past inflation as in standard statistical Phillips curves. With a strong
enough backward-looking element, inflation persistence then need not depend
on the contributions from marginal cost alone.
Another feature of U.S. inflation is that average inflation has always been
positive, and it has varied widely: periods of low inflation, such as the 1950s
and 1960s, were followed by a period of very high inflation in the 1970s, and
then low inflation again since the mid-1980s. Cogley and Sbordone (2005,
2006) point out that the NKPC relates inflation and marginal cost defined
in terms of their deviations from their respective trends. In particular, the
standard NKPC defines trend inflation to be zero. Given the variations in
average U.S. inflation, Cogley and Sbordone (2005, 2006) then argue that
accounting for variations in trend inflation will make deviations of inflation
from trend less persistent. Furthermore, as Ascari (2004) shows, the first-order
approximation of the NKPC needs to be modified when the approximation is
taken at a positive inflation rate.
I build on the insight of Cogley and Sbordone (2005, 2006) and study
the implications of a time-varying trend inflation rate for the autocorrelation
and cross-correlation structure of inflation and marginal cost. In this I extend
the work of Fuhrer (2006) who argues that the hybrid NKPC can account
for inflation's autocorrelation structure only through a substantial backward-looking element. In this article, I argue that a hybrid NKPC, modified for
changes in trend inflation, cannot account for changes in the autocorrelation
and cross-correlation structure of inflation and marginal cost in the United
States.
The article is organized as follows. Section 1 describes the dynamic properties of inflation and marginal cost in the baseline NKPC and the U.S. economy. Section 2 describes and calibrates the hybrid NKPC, and it compares
the autocorrelation and cross-correlation structure of inflation and marginal
cost in the model with that of the 1955–2005 U.S. economy. Section 3 characterizes the inflation dynamics in the NKPC modified to account for nonzero
trend inflation. I then study if the changes of inflation dynamics, associated
with changes in trend inflation comparable to the transition into and out of the
high inflation period of the 1970s, are consistent with the changing nature of
inflation dynamics in the U.S. economy for that period.

1. INFLATION AND MARGINAL COST IN THE NKPC

Inflation in the baseline NKPC is determined by expectations about future
inflation and a measure of current economic activity. There are two fundamental differences between the NKPC and more traditional specifications of
the Phillips curve. First, traditional Phillips curves are backward looking and
relate current inflation to lagged inflation rates. Second, the measure of real
activity in the NKPC is based on a measure of how costly it is to produce
goods, whereas traditional Phillips curves use the unemployment rate as a
measure of real activity. More formally, the baseline NKPC is
\hat{\pi}_t = \kappa_0 \hat{s}_t + \beta E_t \hat{\pi}_{t+1} + u_t,   (1)

where $\hat{\pi}_t$ denotes the inflation rate, $\hat{s}_t$ denotes real marginal cost, $E_t \hat{\pi}_{t+1}$ denotes the expected value of next period's inflation rate conditional on current information, $u_t$ is a shock to the NKPC, $\beta$ is a discount factor, $0 < \beta < 1$, and $\kappa_0$ is a function of structural parameters described below. The baseline
NKPC is derived as the local approximation of equilibrium relationships for a
particular model of the economy, the Calvo (1983) model of price adjustment.
For the Calvo model one assumes that all firms are essentially identical,
that is, they face the same demand curves and cost functions. The firms are
monopolistically competitive price setters, but can adjust their nominal prices
only infrequently. In particular, whether a firm can adjust its price is random,
and the probability of price adjustment is constant. Random price adjustment
introduces ex post heterogeneity among firms, since with nonzero inflation
a firm’s relative price will depend on how long ago the firm last adjusted its
price. Since firms are monopolistically competitive they set their nominal (and
relative) price as a markup over their real marginal cost, and since firms can
adjust their price only infrequently they set their price conditional on expected
future inflation and marginal cost.
The NKPC is a linear approximation to the optimal price-setting behavior
of the firms in the Calvo model. Furthermore, the approximation is local to
a state that exhibits a zero-average inflation rate. The inflation rate $\hat{\pi}_t$ should be interpreted as the log-deviation of the gross inflation rate from one, that is, the net-inflation rate, and real marginal cost $\hat{s}_t$ should be interpreted as the log-deviation from its long-run mean. For a derivation of the NKPC, see Woodford (2003).1 The optimal pricing decisions of firms with Calvo-type nominal price adjustment are reflected in the parameter $\kappa_0$ of the NKPC,

\kappa_0 = \frac{1-\alpha}{\alpha}\left(1 - \alpha\beta\right),   (2)

where $\alpha$ is the probability that a firm cannot adjust its nominal price, $0 \le \alpha < 1$.
The shock to the NKPC is usually not derived as part of the linear approximation to the optimal price-setting behavior of firms. Most of the time the shock is simply "tacked on" to the NKPC, although it can be interpreted as a random disturbance to the firms' static markup. Given the absence of serious microfoundations of the cost shock one would not want the shock to play an independent role in contributing to the persistence of inflation. We, therefore, assume that the shock to the NKPC is i.i.d. with mean zero.2

Persistence of Inflation in the NKPC
The NKPC represents a partial equilibrium relationship within a more comprehensive model of the economy. Thus, inflation and marginal cost will
be simultaneously determined as part of a more complete description of the
economy. Conditional on the equilibrium process for marginal cost we can,
however, solve equation (1) forward by repeatedly substituting for future inflation and obtain the current inflation rate as the discounted expected value
1 The NKPC approximated at the zero inflation rate is also a special case of the NKPC

approximated at a positive inflation rate. For a derivation of the latter, see Ascari (2004), Cogley
and Sbordone (2005, 2006), or Hornstein (2007).
2 The shock to the NKPC is often called a “cost-push” shock, but this terminology can be
confusing since the shock is introduced independently of marginal cost.

of future marginal cost,

\hat{\pi}_t = \kappa_0 \sum_{j=0}^{\infty} \beta^j E_t \hat{s}_{t+j} + u_t.   (3)

The behavior of the inflation rate, in particular its persistence, is therefore closely related to the behavior of marginal cost. To get an idea of what
this means for the joint behavior of inflation and marginal cost, assume that
equilibrium marginal cost follows a first-order autoregressive process [AR(1)],
\hat{s}_t = \delta \hat{s}_{t-1} + \varepsilon_t,   (4)

with positive serial correlation, $0 < \delta < 1$, and $\varepsilon_t$ is an i.i.d. mean zero shock with variance $\sigma_\varepsilon^2$. This AR(1) specification is a useful first approximation of the behavior of marginal cost since, as we will see below, marginal cost is a highly persistent process. For such an AR(1) process the conditional expectation of marginal cost j-periods-ahead is simply

E_t \hat{s}_{t+j} = E_t\left[\delta \hat{s}_{t+j-1} + \varepsilon_{t+j}\right] = \delta E_t \hat{s}_{t+j-1} = \ldots = \delta^j \hat{s}_t.   (5)

Substituting for the expected future marginal cost in (3), we get
\hat{\pi}_t = \kappa_0 \sum_{j=0}^{\infty} \beta^j \delta^j \hat{s}_t + u_t = \frac{\kappa_0}{1-\beta\delta}\,\hat{s}_t + u_t = a_0 \hat{s}_t + u_t.   (6)

This is a reduced form relationship between current inflation and marginal
cost. The relationship is in reduced form since it incorporates the presumed
equilibrium law of motion for marginal cost, which is reflected in the fact that
the coefficient on marginal cost, a0 , depends on the law of motion for marginal
cost. If the law of motion for marginal cost changes, then the relation between
inflation and marginal cost will change.
Given the assumed law of motion for marginal cost, inflation is positively
correlated with marginal cost and is, at most, as persistent as is marginal cost.
The second moments of the marginal cost process are
E\left[\hat{s}_t \hat{s}_{t-k}\right] = \delta^k \frac{\sigma_\varepsilon^2}{1-\delta^2} = \delta^k \sigma_s^2,   (7)

where $\sigma_s^2$ is the variance of marginal cost. The implied second moments of
the inflation rate and the cross-products of inflation and marginal cost are
E\left[\hat{\pi}_t \hat{\pi}_{t-k}\right] = a_0^2 E\left[\hat{s}_t \hat{s}_{t-k}\right] + I_{[k=0]}\sigma_u^2 = \delta^k \left(a_0 \sigma_s\right)^2 + I_{[k=0]}\sigma_u^2,   (8)

E\left[\hat{\pi}_t \hat{s}_{t+k}\right] = a_0 E\left[\hat{s}_t \hat{s}_{t+k}\right] = \delta^k a_0 \sigma_s^2,   (9)

where I[.] denotes the indicator function. The autocorrelation coefficients for
inflation and the cross-correlations of inflation with marginal cost are
Corr\left(\hat{\pi}_t, \hat{\pi}_{t-k}\right) = \delta^k \frac{a_0^2}{a_0^2 + \sigma_u^2/\sigma_s^2}, and   (10)

Corr\left(\hat{\pi}_t, \hat{s}_{t+k}\right) = \delta^k \frac{a_0}{\left[a_0^2 + \sigma_u^2/\sigma_s^2\right]^{1/2}}.   (11)

As we can see, the autocorrelation coefficients for inflation are simply scaled
versions of the autocorrelation coefficients for marginal cost, and the scale
parameter depends on the relative volatility of the shocks to the NKPC and
marginal cost. If there are no shocks to the NKPC, σ u = 0, then inflation is
an AR(1) process with persistence parameter δ, and it is perfectly correlated
with marginal cost. If, however, there are shocks to the NKPC, σ u > 0,
then inflation and marginal cost are imperfectly correlated and inflation is less
persistent than is marginal cost.
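To make the mapping from structural parameters to these moments concrete, the following Python sketch computes κ_0 from equation (2), the reduced-form coefficient a_0 from equation (6), and the correlations in equations (10) and (11), and checks them against a Monte Carlo simulation of equations (4) and (6). The parameter values are illustrative assumptions, not estimates from the article.

    import numpy as np

    # Illustrative parameter values (assumptions, not the article's estimates)
    alpha, beta, delta = 0.8, 0.99, 0.9   # Calvo non-adjustment probability, discount factor, AR(1) persistence
    sigma_s, sigma_u = 1.0, 0.5           # standard deviations of marginal cost and of the NKPC shock

    kappa0 = (1 - alpha) * (1 - alpha * beta) / alpha   # equation (2)
    a0 = kappa0 / (1 - beta * delta)                    # equation (6)
    ratio2 = (sigma_u / sigma_s) ** 2

    # Simulate the AR(1) for marginal cost, equation (4), and the reduced form, equation (6)
    rng = np.random.default_rng(0)
    T = 200_000
    s = np.zeros(T)
    eps = rng.normal(0.0, sigma_s * np.sqrt(1 - delta**2), T)  # scaled so that Var(s) = sigma_s**2
    for t in range(1, T):
        s[t] = delta * s[t - 1] + eps[t]
    pi = a0 * s + rng.normal(0.0, sigma_u, T)

    # Compare sample moments with equations (10) and (11)
    for k in range(1, 5):
        analytic = delta**k * a0**2 / (a0**2 + ratio2)          # equation (10)
        print(f"Corr(pi_t, pi_t-{k}): analytic {analytic:.3f}, "
              f"simulated {np.corrcoef(pi[k:], pi[:-k])[0, 1]:.3f}")
    analytic0 = a0 / np.sqrt(a0**2 + ratio2)                    # equation (11), k = 0
    print(f"Corr(pi_t, s_t): analytic {analytic0:.3f}, simulated {np.corrcoef(pi, s)[0, 1]:.3f}")

The exercise simply illustrates the point made above: without NKPC shocks inflation inherits the persistence of marginal cost one-for-one, and raising sigma_u relative to sigma_s lowers both the autocorrelations and the cross-correlation by the same scale factor.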

Inflation and Marginal Cost in the U.S. Economy
In order to make the NKPC operational, we need measures of the inflation rate
and marginal cost. For the inflation rate we will use the rate of change of the
GDP deflator.3 We measure aggregate marginal cost through the wage income
share in the private nonfarm business sector. This choice can be motivated
as follows. Suppose that all firms use the same production technology with
labor as the only input. In particular, assume that the production function is
Cobb-Douglas, $y = z n^{\omega}$, with constant input elasticity $\omega$. Then the nominal marginal cost is the nominal wage divided by the marginal product of labor,

S_t = \frac{W_t}{MPL_t} = \frac{W_t}{\omega y_t / n_t},   (12)

and nominal marginal cost is proportional to nominal average cost. We use the
unit labor cost index for the private nonfarm business sector as our measure
of average labor cost. Deflating nominal average cost with the price index of
the private nonfarm business sector yields real average labor cost, that is, the
labor income share. The log deviation of real marginal cost from its mean is
3 This is the most commonly used price index in the implementation of the NKPC. Other
price indices used include the price index of the private nonfarm business sector or the price index
for Personal Consumption Expenditures (PCE), the consumption component of the GDP deflator.
Although the choice of price deflator affects the results described below, the differences are not
dramatic, e.g., Galí and Gertler (1999). We should also note that only consumption based indices,
such as the PCE index, are commonly mentioned by central banks in their communications on
monetary policy.

Figure 1 Inflation and Marginal Cost in the United States, 1955–2005

[Figure: Panel A. Inflation, π, and Marginal Cost, s, 1955Q1–2005Q4 (left axis: Inflation, Annualized in Percent; right axis: Log of Marginal Cost, 1992=0). Panel B. Persistence: Corr(π_t, π_{t-k}) and Corr(s_t, s_{t-k}). Panel C. Cross-correlation Coefficients: Corr(π_t, s_{t+k}).]

Notes: Inflation and marginal cost are defined in the Appendix. The solid line in Panel A represents the inflation rate and its sample mean, and the dashed line represents marginal cost and its sample mean. In Panel B, the circles (diamonds) denote the sample autocorrelations for inflation (marginal cost). In Panel C, the squares denote the cross-correlations of inflation and marginal cost. In Panels B and C, the boxes denote the 5-percentile to 95-percentile range of the statistic calculated from 1,000 bootstraps of the data.

then equal to the log-deviation of the labor income share from its mean,

\hat{s}_t = \widehat{\frac{W_t n_t}{P_t y_t}}.   (13)

The detailed source information for our data is listed in the Appendix.
In Figure 1.A, we graph the quarterly inflation rate and marginal cost for the time period 1955Q1 to 2005Q4. Inflation varies widely over this time period, from about 1 percent at the low end in the early 1960s, to more than 10 percent in the 1970s, with a 3 1/2 percent average inflation rate, Table 1, column 1. Inflation and marginal cost are both highly persistent, the first-order autocorrelation coefficient is about 0.9 for both variables, Figure 1.B. To the

Table 1 Inflation and Marginal Cost

Sample            π̄      σ_π    s̄        σ_s     δ̄_π            δ̄_s            Corr(π̂, ŝ)
                  (1)     (2)    (3)      (4)     (5)             (6)             (7)
1955Q1–2005Q4     3.6     2.4    0.013    0.021   0.94            0.93            0.33
                                                  [0.88, 0.99]    [0.89, 0.98]    [0.23, 0.43]
1955Q1–1969Q4     2.5     1.4    0.023    0.018   0.97            0.89            -0.12
                                                  [0.83, 0.98]    [0.79, 1.00]    [-0.30, 0.05]
1970Q1–1983Q4     6.5     2.2    0.024    0.016   0.80            0.72            0.29
                                                  [0.62, 0.98]    [0.56, 0.88]    [0.10, 0.46]
1984Q1–1991Q4     3.2     0.9    0.011    0.007   0.60            0.73            0.10
                                                  [0.20, 1.03]    [0.51, 0.95]    [0.09, 0.34]
1992Q1–2005Q4     2.1     0.7    -0.009   0.018   0.76            0.92            -0.06
                                                  [0.50, 1.02]    [0.81, 1.02]    [-0.32, 0.22]

Notes: Columns (1) and (2) contain the average annualized inflation rate, π̄, and its standard deviation, σ_π. Columns (3) and (4) contain the average value and standard deviation of marginal cost, s̄ and σ_s. Marginal cost is in log deviations from its normalized 1992 value. Columns (5) and (6) contain the sum of the autocorrelation coefficients of a univariate OLS regression with four lags for inflation respectively marginal cost, δ̄_π and δ̄_s. Column (7) contains the contemporaneous correlation coefficient between inflation and marginal cost. For the sum of autocorrelation coefficients and the correlation coefficient, columns (5), (6), and (7), we list the 5th and 95th percentile of the respective bootstrapped statistic with 1,000 replications in brackets.

extent that the autocorrelation coefficients of inflation do not decline as fast as
the ones for marginal cost, inflation appears to be somewhat more persistent
than marginal cost. Levin and Piger (2003) use an alternative measure of
persistence in their analysis of inflation in the United States, namely the sum of
lagged coefficients in a univariate regression of a variable on its own lags. This
measure also yields estimates of significant and similar persistence for inflation
and marginal cost, Table 1, columns 5 and 6. Inflation and marginal cost
tend to move together. The cross-correlations between inflation and marginal
cost are positive, 0.33 contemporaneously and above 0.2 at all four lags and
leads, Table 1, column 7, and Figure 1.C. Although the co-movement between
inflation and marginal cost is significant, it is not particularly strong.4
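The persistence statistic in Table 1, columns 5 and 6, and its bootstrapped band can be reproduced along the following lines. The function below regresses a series on four of its own lags and sums the lag coefficients; the moving-block resampling scheme is my assumption, since the article only states that 1,000 bootstrap replications are used, and the synthetic AR(1) series at the bottom is purely a usage illustration.

    import numpy as np

    def sum_ar_coeffs(x, lags=4):
        """Sum of lagged coefficients from an OLS regression of x on its own lags
        (the persistence measure reported in Table 1, columns 5 and 6)."""
        x = np.asarray(x, dtype=float)
        y = x[lags:]
        X = np.column_stack([np.ones(len(y))] + [x[lags - j:-j] for j in range(1, lags + 1)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coef[1:].sum()

    def block_bootstrap_ci(x, stat, n_rep=1000, block=8, seed=0):
        """5th-95th percentile band of stat(x) from a moving-block bootstrap
        (the resampling scheme is an assumption; the article does not specify it)."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x, dtype=float)
        n = len(x)
        draws = []
        for _ in range(n_rep):
            starts = rng.integers(0, n - block + 1, size=int(np.ceil(n / block)))
            xb = np.concatenate([x[s:s + block] for s in starts])[:n]
            draws.append(stat(xb))
        return np.percentile(draws, [5, 95])

    # Usage illustration with a synthetic persistent series standing in for inflation
    rng = np.random.default_rng(1)
    x = np.zeros(204)
    for t in range(1, len(x)):
        x[t] = 0.9 * x[t - 1] + rng.normal()
    print(sum_ar_coeffs(x), block_bootstrap_ci(x, sum_ar_coeffs))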
As we have shown previously, in the basic NKPC model, persistence of
inflation and marginal cost, and co-movement of inflation with marginal cost
go together. The observation that inflation is about as persistent as marginal
cost, but only weakly correlated with marginal cost then seems to be inconsistent with the basic NKPC. We now study if two modifications of the basic
4 The positive cross-correlation coefficients are significant for all four lags and leads. Based

on 1,000 bootstraps the 5-percentile to 95-percentile ranges of the coefficients do not include zero,
Figure 1.C.

NKPC can resolve this apparent inconsistency. The first approach is to make
the NKPC more like a standard Phillips curve by directly introducing lagged
inflation. The second approach argues that some of the observed inflation persistence is spurious. Extended apparent deviations of the inflation rate from
the sample average inflation rate, for example in the 1970s, are interpreted
as sub-sample changes in the mean inflation rate. This approach then suggests that the NKPC has to be modified to take into account changes in trend
inflation. We will discuss these two approaches in the following sections.

2. A HYBRID NKPC
The importance of marginal cost for inflation persistence will be reduced if
there is a source of persistence that is inherent to the inflation process itself.
Two popular approaches that introduce such a backward-looking element of
price determination into the NKPC are “rule-of-thumb” behavior and indexation. For the first approach, one assumes that a fraction ρ of the price-setting
firms do not choose their prices optimally, rather they index their prices to past
inflation. For the second approach one assumes that firms who do not have
the option to adjust their price optimally simply index their price to a fraction
ρ of past inflation.5 The two approaches are essentially equivalent and for the
second case the NKPC becomes
\left(1 - \rho L\right)\hat{\pi}_t = \beta E_t \left(1 - \rho L\right)\hat{\pi}_{t+1} + \kappa_0 \hat{s}_t + u_t,   (14)

where $L$ is the lag operator, $L^j x_t = x_{t-j}$ for any integer $j$.
This modification of the NKPC is also called a hybrid NKPC since current
inflation not only depends on expected inflation as in the baseline NKPC,
but it also depends on past inflation as in a traditional Phillips curve. The
dependence on lagged inflation introduced through backward-looking price
determination is called “intrinsic” persistence since it is an exogenous part
of the model structure. Complementary to intrinsic persistence is “extrinsic”
inflation persistence which comes through the marginal cost process that drives
inflation. To the extent that monetary policy affects marginal cost, it influences
extrinsic inflation persistence.
Note that the hybrid NKPC, equation (14), is of the same form as the basic
NKPC, equation (1), except for the linear transformation of inflation, $\tilde{\pi}_t = \hat{\pi}_t - \rho \hat{\pi}_{t-1}$, replacing the actual inflation rate. Forward-solving equation (14), assuming again that marginal cost follows an AR(1) process, as in equation (4), then yields the following expression for $\tilde{\pi}_t$:

\hat{\pi}_t - \rho \hat{\pi}_{t-1} = \frac{\kappa_0}{1 - \beta\delta}\,\hat{s}_t + u_t = a_0 \hat{s}_t + u_t.   (15)

5 "Rule-of-thumb" behavior was introduced by Galí and Gertler (1999); inflation indexation has been used by Christiano, Eichenbaum, and Evans (2005).

For this specification, inflation can be more persistent than marginal cost
because current inflation is indexed to past inflation.
The autocorrelation coefficients for the linear transformation of inflation, $\tilde{\pi}_t$, are the same as defined in equation (10), but the autocorrelation coefficients for the inflation rate itself are now more complicated functions of the persistence of marginal cost and the intrinsic inflation persistence. In Hornstein (2007), I derive the autocorrelation and cross-correlation coefficients for inflation and marginal cost,

Corr\left(\hat{\pi}_t, \hat{\pi}_{t-k}\right) = \frac{(\sigma_u/\sigma_s)^2 A(k;\rho) + a_0^2 B(k;\rho,\delta)}{(\sigma_u/\sigma_s)^2 A(0;\rho) + a_0^2 B(0;\rho,\delta)}, and   (16)

Corr\left(\hat{\pi}_t, \hat{s}_{t+k}\right) = \frac{a_0 C(k;\rho,\delta)}{\left[(\sigma_u/\sigma_s)^2 A(0;\rho) + a_0^2 B(0;\rho,\delta)\right]^{1/2}},   (17)

where

A(k;\rho) = \frac{\rho^k}{1-\rho^2},

B(k;\rho,\delta) = \frac{1}{(1-\rho/\delta)(1-\rho\delta)}\left[\delta^k - \frac{\rho}{\delta}\,\frac{1-\delta^2}{1-\rho^2}\,\rho^k\right],

C(k;\rho,\delta) = \delta^k \frac{1}{1-\rho\delta} \quad \text{if } k \ge 0, \text{ and}

C(k;\rho,\delta) = \left[\delta^{-k} - \rho^{-k}\,\frac{\rho}{\delta}\,\frac{1-\delta^2}{1-\rho\delta}\right]\frac{1}{1-\rho/\delta} \quad \text{if } k < 0.
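A direct Python transcription of equations (16) and (17) may help fix notation. The evaluation at the bottom uses an illustrative set of parameter values (α = 0.8, β = 0.99, ρ = 0.86, σ_u/σ_s = 2.97, δ = 0.9, the second calibration discussed below); it is a sketch of the formulas above, not estimation code.

    import numpy as np

    def A(k, rho):
        return rho**k / (1 - rho**2)

    def B(k, rho, delta):
        return (delta**k - (rho / delta) * (1 - delta**2) / (1 - rho**2) * rho**k) \
               / ((1 - rho / delta) * (1 - rho * delta))

    def C(k, rho, delta):
        if k >= 0:
            return delta**k / (1 - rho * delta)
        m = -k
        return (delta**m - rho**m * (rho / delta) * (1 - delta**2) / (1 - rho * delta)) \
               / (1 - rho / delta)

    def corr_pi(k, a0, rho, delta, sig_ratio):
        """Equation (16): autocorrelation of inflation at lag k >= 0."""
        num = sig_ratio**2 * A(k, rho) + a0**2 * B(k, rho, delta)
        den = sig_ratio**2 * A(0, rho) + a0**2 * B(0, rho, delta)
        return num / den

    def corr_pi_s(k, a0, rho, delta, sig_ratio):
        """Equation (17): cross-correlation of inflation with marginal cost at lead k."""
        den = np.sqrt(sig_ratio**2 * A(0, rho) + a0**2 * B(0, rho, delta))
        return a0 * C(k, rho, delta) / den

    # Illustrative evaluation at the parameter values of the second calibration discussed below
    alpha, beta, rho, sig_ratio, delta = 0.80, 0.99, 0.86, 2.97, 0.90
    a0 = (1 - alpha) * (1 - alpha * beta) / alpha / (1 - beta * delta)   # kappa_0 / (1 - beta*delta)
    print([round(corr_pi(k, a0, rho, delta, sig_ratio), 2) for k in range(5)])   # autocorrelations
    print(round(corr_pi_s(0, a0, rho, delta, sig_ratio), 2))                     # roughly 0.33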

Inflation Persistence in the Hybrid NKPC
Inflation persistence for the hybrid NKPC depends not only on the persistence
of marginal cost and intrinsic inflation persistence, δ and ρ, but also on the
relative volatility of the shocks to the NKPC and marginal cost, σ u /σ s , and
the reduced form coefficient on marginal cost, a0 . In order to evaluate the
implications of the hybrid NKPC for inflation dynamics we, therefore, need
estimates of the structural parameters of the NKPC and the relative standard
deviation of the NKPC shock. In the following, I study the implications of two
alternative calibrations. The first calibration is based on generalized method
of moments (GMM) estimates of the structural parameters, α, β, and ρ, and
an estimate of the relative volatility of the NKPC shocks that is implicit in
the GMM estimates. This calibration has only limited success in matching
the autocorrelation and cross-correlation properties of inflation and marginal
cost. For the second calibration, I then set intrinsic persistence and the relative
volatility of the NKPC shock to directly match the autocorrelation and cross-correlation properties of inflation and marginal cost.

Table 2 New Keynesian Phillips Curve Estimates, 1960Q1–2005Q4

        α          ρ          β          π̂_{t-1}      π̂_{t+1}      ŝ_t
(1)     0.901      0.164      0.990      0.141        0.851        0.010
        (0.028)    (0.124)    (0.028)    (0.091)      (0.087)      (0.007)
(2)     0.897      0.469      0.944      0.325        0.654        0.012
        (0.021)    (0.095)    (0.043)    (0.046)      (0.048)      (0.005)

Notes: This table reports estimates of the NKPC approximated at a zero inflation rate, equation (14). The first three columns contain estimates of the structural parameters: price non-adjustment probability, α, degree of inflation indexation, ρ, and time discount factor, β. The next three columns contain the implied reduced form coefficients on marginal cost, and lagged and future inflation when the coefficient on current inflation is one. The first row represents estimates of the moment conditions from equation (14). The second row represents estimates of the moment conditions from equation (14) when the coefficient of contemporaneous inflation is normalized to one. The covariance matrix of errors is estimated with a 12 lag Newey-West procedure. Standard errors of the estimates are shown in parentheses.

Galí, Gertler, and López-Salido (2005) (hereafter referred to as GGLS)
estimate the hybrid NKPC for U.S. data using GMM techniques.6 I replicate
their analysis for the hybrid NKPC (14) using the data on inflation and marginal
cost for the time period 1960–2005. The instrument set includes four lags of
the inflation rate, and two lags each of marginal cost, nominal wage inflation,
and the output gap.7 The results reported in Table 2 are not exactly the same
as in GGLS, but they are broadly consistent with GGLS. The time discount
factor, β, is estimated close to one, and the coefficient on marginal cost,
κ 0 = 0.01, is smaller than for GGLS. The small coefficient on marginal
cost translates to a relatively low price adjustment probability: only about 10
percent, 1−α, of all prices are optimally adjusted in a quarter. Similar to GGLS
the estimated degree of inflation indexation depends on the normalization of
the GMM moment conditions. For the first specification, when equation (14)
is estimated directly, we find a relatively low degree of indexation to past
inflation, ρ = 0.16. For the second specification, when the coefficient on
current inflation in equation (14) is normalized to one, we find significantly
more indexation, ρ = 0.47.
We construct an estimate of the volatility of shocks to the NKPC in two
steps. First, we regress current inflation $\hat{\pi}_t$ on the set of instrumental variables.
The instrumental variables contain only lagged variables, that is, information
6 Other work that estimates the NKPC using the same or similar techniques includes Galí and Gertler (1999) and Sbordone (2002). See also the 2005 special issue of the Journal of Monetary Economics vol. 52 (6).
7 The data are described in detail in the Appendix.

Table 3 Calibration

Parameter                                            Calibration
                                                     (1)       (2)
β          Time Discount Factor                      0.99      0.99
α          Probability of No Price Adjustment        0.90      0.80
ρ          Price Indexation                          0.45      0.86
σ_u/σ_s    Relative NKPC Shock Volatility            0.10      2.97
δ          Marginal Cost Persistence                 0.90      0.90

available in the previous period. We then use this regression to obtain an
estimate of the expected inflation rate conditional on available information,
$E_t \hat{\pi}_{t+1}$, and substitute it together with the information on current inflation and marginal cost, and the estimated parameter values in equation (14), and solve
for the shock to the NKPC, ut . The calculated standard deviation of the shock
is about 1/10 of the standard deviation of marginal cost.8
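A minimal sketch of this two-step construction is given below. The function names and the instrument matrix Z are placeholders; the sketch assumes demeaned inflation pi, demeaned marginal cost s, an instrument matrix whose row t contains only information available at t-1, and the estimated parameters beta, rho, and kappa0.

    import numpy as np

    def nkpc_shock(pi, s, Z, beta, rho, kappa0):
        """Back out the NKPC shock u_t from equation (14) in two steps:
        (1) project inflation on lagged instruments to proxy expected inflation,
        (2) solve (1 - rho L) pi_t = beta E_t[(1 - rho L) pi_{t+1}] + kappa0 s_t + u_t for u_t.
        pi, s : demeaned inflation and marginal cost (equal-length arrays)
        Z     : instrument matrix (constant plus variables dated t-1 and earlier)"""
        pi, s = np.asarray(pi, float), np.asarray(s, float)
        coef, *_ = np.linalg.lstsq(Z, pi, rcond=None)     # step 1: pi_t on instruments
        e_pi = Z @ coef          # e_pi[t] proxies E_{t-1} pi_t, so e_pi[t+1] proxies E_t pi_{t+1}
        u = np.full(len(pi), np.nan)
        for t in range(1, len(pi) - 1):                   # step 2
            quasi_diff = pi[t] - rho * pi[t - 1]
            exp_quasi_diff = e_pi[t + 1] - rho * pi[t]    # E_t[(1 - rho L) pi_{t+1}]
            u[t] = quasi_diff - beta * exp_quasi_diff - kappa0 * s[t]
        return u

The standard deviation of the returned series, relative to that of marginal cost, is the shock-volatility ratio referred to in the text.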
Based on the GMM estimates for the second specification of the moment
conditions, I now choose a parameterization of the hybrid NKPC with some
intrinsic inflation persistence, Table 3, column 1.9 For the persistence of
marginal cost, I choose δ = 0.9, which provides a reasonable approximation
of the autocorrelation structure of marginal cost for the period 1955 to 2005.
We can now characterize the inflation dynamics implied by the hybrid
NKPC. The bullet points in Figure 2 display the first four autocorrelation
coefficients of inflation and the cross-correlation coefficients of inflation with
marginal cost implied by the calibrated model. Figure 2 also displays the
bootstrapped 5th to 95th percentile ranges for the autocorrelation and cross-correlation coefficients of inflation and marginal cost for the U.S. economy
from Figure 1.B and 1.C. As we can see, the model does not do too badly
for the autocorrelation structure of inflation: the first-order autocorrelation
coefficient of inflation is just outside the 5th to 95th percentile range, but then
the autocorrelation coefficients are declining too fast relative to the data.10 The
model does generate too much co-movement for inflation and marginal cost
8 Depending on the parameter estimates, σ_u = 0.0019 for specification one and σ_u = 0.0025 for specification two. For either specification the serial correlation of the shocks is quite low, the highest value is 0.2. Fuhrer (2006) argues for a higher relative volatility of the NKPC shock, about 3/10 of the volatility of marginal cost.
9 Choosing a lower value for indexation based on specification one would generate less inflation persistence.
10 Fuhrer (2006) assumes a three times larger relative volatility of the NKPC shocks and,
therefore, requires substantially more intrinsic persistence, that is, a higher ρ, in order to match
inflation persistence.

Figure 2 Inflation Dynamics for the Hybrid NKPC

[Figure: Panel A. Autocorrelation Coefficients: Corr(π_t, π_{t-k}). Panel B. Cross-correlation Coefficients: Corr(π_t, s_{t+k}).]

Notes: The circles (squares) denote autocorrelations and cross-correlations from calibration 1 (2) of the hybrid NKPC. The boxes denote the 5-percentile to 95-percentile range of the statistic calculated from 1,000 bootstraps of data.

relative to the data: the predicted contemporaneous correlation coefficient is
about 0.8, well above the observed value of 0.3.
Given the failure of the GMM-based calibration to account for the autocorrelation and cross-correlation structure of inflation and marginal cost, I now
consider an alternative calibration that exactly matches the first-order autocorrelation of inflation and the contemporaneous cross-correlation of inflation and
marginal cost. As I pointed out above, the estimated price adjustment probability of 10 percent per quarter is quite low. Other work suggests higher price
adjustment probabilities, about 20 percent per quarter, e.g., Galí and Gertler (1999), Eichenbaum and Fisher (2007), or Cogley and Sbordone (2006).11
For the alternative calibration I, therefore, assume that α = 0.8. Conditional
11 The NKPC specification in equation (14) is based on constant firm-specific marginal cost.
Eichenbaum and Fisher (2007) and Cogley and Sbordone (2006) consider the possibility of increasing firm-specific marginal cost. Adjusting their estimates for constant firm-specific marginal
cost yields α = 0.8.

on an unchanged time discount factor, β, this implies a coefficient on marginal
cost, κ 0 = 0.05, which represents an upper bound of what has been estimated
for hybrid NKPCs.
I now choose intrinsic persistence, ρ, and the relative volatility of the
NKPC shock, $\sigma_u/\sigma_s$, to match the sample first-order autocorrelation coefficient of inflation, $Corr(\hat{\pi}_t, \hat{\pi}_{t-1}) = 0.88$, and the contemporaneous correlation of inflation and marginal cost, $Corr(\hat{\pi}_t, \hat{s}_t) = 0.33$. This procedure yields a very large value for inflation indexation, ρ = 0.86, which makes
inflation persistence essentially independent of marginal cost. A very high
relative volatility of the NKPC shock, σ u /σ s = 2.97, can then reduce the
co-movement between inflation and marginal cost without affecting inflation
persistence significantly. The implied parameter values of this calibration are
summarized in the second column of Table 3.
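Given the correlation formulas (16) and (17), this moment-matching step can be reproduced with a standard root finder. The sketch below reuses the corr_pi and corr_pi_s functions from the sketch following equation (17) and treats α = 0.8, β = 0.99, and δ = 0.9 as fixed; it is an illustration of the procedure, not the article's own code.

    import numpy as np
    from scipy.optimize import least_squares

    alpha, beta, delta = 0.80, 0.99, 0.90
    a0 = (1 - alpha) * (1 - alpha * beta) / alpha / (1 - beta * delta)

    def gap(params):
        rho, sig_ratio = params
        return [corr_pi(1, a0, rho, delta, sig_ratio) - 0.88,     # target: Corr(pi_t, pi_t-1)
                corr_pi_s(0, a0, rho, delta, sig_ratio) - 0.33]   # target: Corr(pi_t, s_t)

    # keep rho below delta to avoid the removable singularity in B and C at rho = delta
    sol = least_squares(gap, x0=[0.5, 1.0], bounds=([0.0, 0.01], [0.89, 10.0]))
    print(sol.x)   # should come out near (0.86, 2.97), the values reported in Table 3, column 2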
The autocorrelation and cross-correlation structure of the alternative calibration is represented by the squares in Figure 2. With few exceptions the
cross-correlations predicted by the alternative calibration stay in the 5th to
95th percentile ranges of the observed cross-correlations. The autocorrelation
coefficients continue to decline at a rate that is faster than observed in the data.

3. THE CHANGING NATURE OF INFLATION
The behavior of inflation has changed markedly over time, Table 1, column (1).
Inflation tended to be below the sample mean in the 1950s and 1960s, average
inflation was about 2.5 percent, but inflation increased in the second half of
the 1960s. In the 1970s, inflation increased even more, averaging 6.5 percent
and reaching peaks of up to 12 percent. In the early 1980s, inflation came
down fast, averaging 3.2 percent from 1984 to 1991. Finally, in the period
since the early 1990s, inflation continued to decline, but otherwise remained
relatively stable, averaging about 2 percent.12
Most observers attribute the changes in average inflation since the 1960s
to changes in monetary policy, as represented by different chairmen of the
monetary policy committee of the Federal Reserve System. We have the
"Burns inflation" of the 1970s, the "Volcker disinflation" of the early 1980s, and
the “Greenspan period” with a further reduction and stabilization of inflation
from the late 1980s to 2005. Interestingly enough, these substantial changes
in the mean inflation rate were not associated with comparable changes in
mean marginal cost: average marginal cost differs by at most 3 percent across
the sub-samples, Table 1, column 3.
12 I choose 1970 as the starting point of the high inflation era since mean inflation before
1970 is relatively close to the sample mean. The year 1984 is usually chosen as representing a
definite break with the high inflation regime of the 1970s, e.g., Galí and Gertler (1999) or Roberts (2006). Levin and Piger (2003) argue for a break in the mean inflation rate in 1991.

In the following, we will first show that allowing for changes in mean
inflation rates affects the inflation dynamics as measured by the autocorrelation
and cross-correlation structure. Since it appears that accounting for changes
in the mean inflation rate affects the dynamics of inflation, we investigate
whether the average inflation rate around which we approximate the optimal
price-setting behavior of the firms in the Calvo model affects the dynamics of
the NKPC.

Inflation Dynamics and Average Inflation13
The persistence and co-movement of inflation and marginal cost have varied across decades. In Figure 3, we display the autocorrelations and cross-correlations of inflation and marginal cost for the four periods we have just
mentioned: the 1960s, 1970s, 1980s, and the period beginning in 1992.
In the 1960s, both inflation and marginal cost are highly persistent, with
inflation being somewhat more persistent than marginal cost: the autocorrelation coefficients for inflation do not decline as fast as the ones for marginal
cost. But in the following periods, it appears as if the persistence of inflation
declines, at least relative to marginal cost. This decline of inflation persistence is especially noticeable for the first- and second-order autocorrelation
coefficients from 1984 on, Figure 3, A.3 and A.4.14
The positive correlation between inflation and marginal cost in the full
sample hides substantial variation of co-movement across sub-samples. The
1970s is the only period with a strong positive correlation between inflation
and marginal cost, Figure 3, B.2. At the other extreme are the 1960s when
the correlation between inflation and marginal cost is negative for almost all
leads and lags, Figure 3, B.1. In between are the remaining two sub-samples
from 1984 on, in which the correlation between inflation and marginal cost
tends to be positive, but only weakly so.

The NKPC at Positive Average Inflation
How should we interpret these changes in the time series properties of inflation
and marginal cost? In particular, what do these changes tell us about the NKPC
as a model of inflation? The decline in persistence is especially intriguing since
it coincides with the decline of the average inflation rate. Most observers
13 Articles that discuss changes in the inflation process include Cogley and Sargent (2001),
Levin and Piger (2003), Nason (2006), and Stock and Watson (2007). Roberts (2006) and Williams
(2006) relate the changes in the inflation process to changes in the Phillips curve.
14 We should note, however, that the sum of autocorrelation coefficients from univariate regressions in the inflation rate and marginal cost do not indicate statistically significant changes in
the persistence of inflation or marginal cost across subperiods, Table 1, columns 5 and 6.

Figure 3 Inflation and Marginal Cost Dynamics Over Time

[Figure: Panels A.1–A.4 show Corr(π_t, π_{t-k}) and Corr(s_t, s_{t-k}) for the sub-samples 1955Q1–1969Q4, 1970Q1–1983Q4, 1984Q1–1991Q4, and 1992Q1–2005Q4. Panels B.1–B.4 show Corr(π_t, s_{t+k}) for the same sub-samples.]

Notes: In Panel A, the circles (squares) denote the sub-sample autocorrelations for inflation (marginal cost). In Panel B, the diamonds denote the cross-correlations of inflation and marginal cost. In Panels A and B, the boxes denote the 5-percentile to 95-percentile range of the statistic calculated from 1,000 bootstraps of the sub-sample data.

attribute the reduction of the average inflation rate to monetary policy, but
should one also attribute the reduced inflation persistence to monetary policy?
From the perspective of the reduced form NKPC with no feedback from
inflation to marginal cost, equation (15), monetary policy is unlikely to have
affected the persistence of inflation. In this framework, monetary policy works
through its impact on marginal cost, but if anything, marginal cost has become
more persistent rather than less persistent since the 1990s. We now ask if
this conclusion may be premature since it relies on an approximation of the
inflation dynamics in the Calvo model around a zero-average inflation rate. If
one approximates the inflation dynamics around a positive-average inflation
rate, then inflation persistence depends on the average inflation rate, even when
the other structural parameters of the environment remain fixed.

The modified hybrid NKPC for an approximation at the gross inflation rate $\bar{\pi} \ge 1$ is

E_t\left(1 - \lambda_1 L^{-1}\right)\left(1 - \lambda_2 L^{-1}\right)\left(1 - \rho L\right)\hat{\pi}_t = \kappa_1 E_t\left(1 + \phi L^{-1}\right)\hat{s}_t + u_t.   (18)

The derivation of (18) is described in Hornstein (2007).15 The NKPC is now
a third-order difference equation in inflation and involves current and future
marginal cost. The coefficients λ1 , λ2 , φ, and κ 1 are functions of the underlying
structural parameters, α, β, ρ, and a new parameter θ, representing the firms’
demand elasticity. Furthermore, the coefficients also depend on the average
inflation rate, $\bar{\pi}$, around which we approximate the optimal pricing decisions
of the firms.
The modified hybrid NKPC (18) simplifies to the hybrid NKPC (14) for
zero net-inflation, $\bar{\pi} = 1$. As we increase the average inflation rate, inflation
becomes less responsive to marginal cost in the modified NKPC. In Figure
4.A, we plot the coefficient on marginal cost κ 1 in the modified NKPC as a
function of the average inflation rate for our two calibrations of the hybrid
NKPC. In addition to the parameter values listed in Table 3, we also have to
parameterize the demand elasticity of the monopolistically competitive firms,
θ. Consistent with the literature on nominal rigidities, we assume that θ = 11,
which implies a 10 percent steady-state markup. For both calibrations, the
coefficient on marginal cost declines with the average inflation rate, Figure 4.A.
This suggests that everything else being equal, inflation will be less persistent
and less correlated with marginal cost at higher inflation rates, since marginal
cost has a smaller impact on inflation. The first calibration with a low price
adjustment probability represents an extreme case, in that respect, since the
coefficient on marginal cost converges to zero. On the other hand, for the
second calibration with a higher price adjustment probability, the coefficient
on marginal cost is relatively inelastic with respect to changes in the inflation
rate.
Assuming that marginal cost follows an AR(1) with persistence δ such
that the product of δ and the roots of the lead polynomials in equation (18)
are less than one, |δλi | < 1, we can derive the reduced form of the modified
NKPC as
\left(1 - \rho L\right)\hat{\pi}_t = \kappa_1 \frac{1 + \delta\phi}{\left(1 - \lambda_1 \delta\right)\left(1 - \lambda_2 \delta\right)}\,\hat{s}_t + u_t = a_1 \hat{s}_t + u_t.   (19)

This expression is formally equivalent to the reduced form of the hybrid NKPC,
equation (15), but now the coefficient a1 is a function of the average inflation
rate. Since inflation becomes less responsive to marginal cost in the NKPC
15 Ascari (2004) and Cogley and Sbordone (2005, 2006) also derive the modified NKPC, but choose a different representation. Their representation is based on the hybrid NKPC, equation (14), and adds a term that involves the expected present value of future inflation.

Figure 4 The NKPC and Changes in Average Inflation

[Figure: Panel A. Coefficient on Marginal Cost in NKPC, κ_1, and Panel B. Coefficient on Marginal Cost in Reduced Form NKPC, a_1, each plotted against the Average Annual Inflation Rate in Percent (0 to 8 percent) for Calibration 1 and Calibration 2.]

when the average inflation rate increases, inflation in the reduced form NKPC
also becomes less responsive to marginal cost: a1 declines with the average
inflation rate, Figure 4.B. As with the coefficient on marginal cost in the NKPC,
κ 1 , the coefficient on marginal cost in the reduced form NKPC, a1 , declines
much more for the first calibration with the relatively low price adjustment
probability. This feature is important since the autocorrelations and crosscorrelations of inflation depend on the average inflation rate only through the
responsiveness of inflation to marginal cost, a1 .
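Because the trend-inflation channel operates solely through a_1, its qualitative effect can be illustrated by re-evaluating equations (16) and (17) at progressively smaller values of the reduced-form coefficient. The grid of a_1 values in the sketch below is hypothetical and simply mimics the decline shown in Figure 4.B; it reuses the corr_pi and corr_pi_s functions from the sketch following equation (17).

    # Hypothetical values of a1 standing in for progressively higher trend inflation;
    # rho, delta, and sigma_u/sigma_s are held at the first calibration of Table 3.
    rho, delta, sig_ratio = 0.45, 0.90, 0.10
    for a1 in (0.5, 0.3, 0.1, 0.02):
        acf1 = corr_pi(1, a1, rho, delta, sig_ratio)
        xcf0 = corr_pi_s(0, a1, rho, delta, sig_ratio)
        print(f"a1 = {a1:4.2f}:  Corr(pi_t, pi_t-1) = {acf1:.2f},  Corr(pi_t, s_t) = {xcf0:.2f}")

As a_1 shrinks, the printed autocorrelation falls toward the intrinsic value ρ and the cross-correlation falls toward zero, which is the mechanism behind Figures 5 and 6.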
We now replicate the analysis of Section 2 and calculate the first four
autocorrelation coefficients of inflation and the cross-correlation coefficients
of inflation with marginal cost when the average annual inflation rate varies
from 0 to 8 percent.16 In Figures 5 and 6, we display the autocorrelation
and cross-correlation coefficients for the two calibrations. With a low price
adjustment probability, the first calibration, an increase of the average inflation
rate substantially reduces the persistence of inflation and its co-movement with
marginal cost, Figure 5. Even moderately high annual inflation rates, about 4
16 For the parameter values used in the calibration, the "weighted" roots of the lead polynomials are less than one for all of the average annual inflation rates considered.

Figure 5 The Effects of Average Inflation, Calibration 1

[Figure: Panel A. Autocorrelation Coefficients, Corr(π_t, π_{t-k}), and Panel B. Cross-correlation Coefficients, Corr(π_t, s_{t+k}), plotted for average inflation rates π = 0.00, 0.02, 0.04, 0.06, and 0.08.]
percent, reduce the first-order autocorrelation and the contemporaneous cross-correlation by half. This pattern follows directly from equations (16) and (17)
and the fact that the coefficient a1 converges to zero for the first calibration.
With a higher price adjustment probability, the second calibration, a higher
average inflation rate also tends to reduce persistence and co-movement of
inflation, but the quantitative impact is negligible, Figure 6. Again, this pattern
conforms with the limited impact of changes in average inflation on the reduced
form coefficient of marginal cost.

Changing U.S. Inflation Dynamics and the Modified
NKPC
Based on the modified NKPC, can changes in average inflation account for the
changing U.S. inflation dynamics? Not really. There are two big changes in the
average inflation rate between sub-samples of the U.S. economy. First, average
inflation increased from 2.5 percent in the 1960s to 6.5 percent in the 1970s,
and second, average inflation subsequently declined to 3.2 percent in the 1980s.
These changes in average inflation were associated with significant changes
in the persistence of inflation and the co-movement of inflation with marginal

Figure 6 The Effects of Average Inflation, Calibration 2

[Figure: Panel A. Autocorrelation Coefficients, Corr(π_t, π_{t-k}), and Panel B. Cross-correlation Coefficients, Corr(π_t, s_{t+k}), plotted for average inflation rates π = 0, 0.02, 0.04, 0.06, and 0.08.]
cost. Yet, the predictions of the modified NKPC for inflation persistence
and co-movement based on the observed changes in average inflation are
inconsistent with the observed changes in persistence and co-movement.
On the one hand, a calibration with relatively low price adjustment probabilities, the first calibration, predicts big changes for persistence and co-movement in response to the changes in average inflation, but the changes
either do not take place or are opposite to what the model predicts. In response
to the increase of the average inflation rate from the 1960s to the 1970s, inflation persistence and co-movement should have declined substantially, but
persistence did not change and co-movement increased. Indeed the correlation
between inflation and marginal cost switches from negative, which is inconsistent with the NKPC to begin with, to positive. In response to the reduction
of average inflation in the 1980s, the model predicts more inflation persistence
and more co-movement of inflation and marginal cost. Yet again, the opposite
happens. Inflation persistence declines, at least the first- and second-order
autocorrelation coefficients decline, and the correlation coefficients between
inflation and marginal cost decline.
On the other hand, a calibration of the modified NKPC with relatively
high price adjustment probabilities, the second calibration, cannot account
for any quantitatively important effects on the persistence or co-movement of
inflation based on changes in average inflation.

4. CONCLUSION

We have just argued that a hybrid NKPC, modified to account for changes
in trend inflation, has problems accounting for the changes of U.S. inflation
dynamics over the decades. One way to account for these changes of inflation dynamics within the framework of the NKPC is to allow for changes in
the model’s structural parameters. For example, inflation indexation, that is,
intrinsic persistence, could have increased and decreased to offset the effects
of a higher trend inflation in the 1970s. This pattern of inflation indexation in
response to the changes in trend inflation looks reasonable. However, attributing changes in the dynamics of inflation to systematic changes in the structural
parameters of the NKPC makes this framework less useful for monetary policy
analysis. This is troublesome since several central banks have recently begun
to develop full-blown Dynamic Stochastic General Equilibrium (DSGE) models with versions of the NKPC as an integral part. Ultimately, these DSGE
models are intended for policy analysis, and for this analysis it is presumed
that the model elements, such as the NKPC, are invariant to the policy changes
considered. Based on the analysis in this article, it then seems appropriate to
investigate further the “stability” of the NKPC before one starts using these
models for policy analysis.

APPENDIX
We use seasonally adjusted quarterly data for the time period 1955Q1 to
2005Q4. All data are from HAVER with mnemonics in parentheses. From the
national income accounts we take real GDP (GDPH@USECON) and for the
GDP deflator we take the chained price index (JGDP@USECON). From the
nonfarm business sector we take the unit labor cost index (LXNFU@USECON),
the implicit price deflator (LXNFI@USECON), and the hourly compensation
index (LXNFC@USECON). All of the three nonfarm business sector series
are indices that are normalized to 100 in 1992.
We define inflation as the quarterly growth rate of the GDP deflator and
marginal cost as the log of the ratio of unit labor cost and the nonfarm business
price deflator. We construct the instruments for the GMM estimation other
than lagged inflation and marginal cost following Galí, Gertler, and López-Salido (2005). The output gap is the deviation of log real GDP from a quadratic
trend, and wage inflation is the growth rate of the hourly compensation index.
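As a rough sketch of these transformations, the code below assumes the HAVER series have been exported to a CSV file (a hypothetical data.csv with one row per quarter and columns named after the mnemonics) and constructs the four series used in the article. The x400 annualization of the growth rates is an assumption made for readability of the output; the article only specifies quarterly growth rates.

    import numpy as np
    import pandas as pd

    # Hypothetical CSV export of the HAVER series (file name and layout are assumptions)
    df = pd.read_csv("data.csv")   # columns: GDPH, JGDP, LXNFU, LXNFI, LXNFC

    # Inflation: quarterly growth rate of the GDP deflator (annualized, in percent)
    df["inflation"] = 400 * np.log(df["JGDP"]).diff()

    # Marginal cost: log of unit labor cost relative to the nonfarm business deflator
    # (equal to zero in 1992 because both indices are normalized to 100 in that year)
    df["marginal_cost"] = np.log(df["LXNFU"] / df["LXNFI"])

    # Wage inflation: growth rate of the hourly compensation index (annualized, in percent)
    df["wage_inflation"] = 400 * np.log(df["LXNFC"]).diff()

    # Output gap: deviation of log real GDP from a quadratic trend
    t = np.arange(len(df))
    log_gdp = np.log(df["GDPH"]).to_numpy()
    df["output_gap"] = log_gdp - np.polyval(np.polyfit(t, log_gdp, 2), t)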

REFERENCES
Ascari, Guido. 2004. “Staggered Prices and Trend Inflation: Some
Nuisances.” Review of Economic Dynamics 7 (3): 642–67.
Calvo, Guillermo. 1983. “Staggered Prices in a Utility-Maximizing
Framework.” Journal of Monetary Economics 12 (3): 383–98.
Christiano, Lawrence, Martin Eichenbaum, and Charles Evans. 2005.
“Nominal Rigidities and the Dynamic Effects of a Shock to Monetary
Policy.” Journal of Political Economy 113 (1): 1–45.
Cogley, Timothy, and Thomas Sargent. 2001. “Evolving Post-World War II
U.S. Inflation Dynamics.” In NBER Macroeconomics Annual 2001:
331–72.
Cogley, Timothy, and Argia M. Sbordone. 2005. “A Search for a Structural
Phillips Curve.” Federal Reserve Bank of New York Staff Report No.
203 (March).
Cogley, Timothy, and Argia M. Sbordone. 2006. “Trend Inflation and
Inflation Persistence in the New Keynesian Phillips Curve.” Federal
Reserve Bank of New York Staff Report No. 270 (December).
Eichenbaum, Martin, and Jonas D. M. Fisher. 2007. “Estimating the
Frequency of Price Re-Optimization in Calvo-Style Models.” Journal of
Monetary Economics 54 (7): 2,032–47.
Fuhrer, Jeffrey C. 2006. “Intrinsic and Inherited Inflation Persistence.”
International Journal of Central Banking 2 (3): 49–86.
Galí, Jordi, and Mark Gertler. 1999. "Inflation Dynamics: A Structural Econometric Analysis." Journal of Monetary Economics 44 (2): 195–222.
Galí, Jordi, Mark Gertler, and David López-Salido. 2005. "Robustness of the Estimates of the Hybrid New Keynesian Phillips Curve." Journal of Monetary Economics 52 (6): 1,107–18.
Hornstein, Andreas. 2007. “Notes on the New Keynesian Phillips Curve.”
Federal Reserve Bank of Richmond Working Paper No. 2007-04.
Levin, Andrew T., and Jeremy M. Piger. 2003. “Is Inflation Persistence
Intrinsic in Industrialized Economies?” Federal Reserve Bank of St.
Louis Working Paper No. 2002-023E.
Nason, James. 2006. “Instability in U.S. Inflation: 1967–2005.” Federal
Reserve Bank of Atlanta Economic Review 91 (2): 39–59.

Roberts, John M. 2006. “Monetary Policy and Inflation Dynamics.”
International Journal of Central Banking 2 (3): 193–230.
Sbordone, Argia M. 2002. “Prices and Unit Labor Costs: A New Test of
Price Stickiness.” Journal of Monetary Economics 49 (2): 265–92.
Stock, James H., and Mark W. Watson. 2007. “Why Has Inflation Become
Harder to Forecast?” Journal of Money, Credit, and Banking 39 (1):
3–33.
Williams, John C. 2006. “Inflation Persistence in an Era of Well-Anchored
Inflation Expectations.” Federal Reserve Bank of San Francisco
Economic Letter No. 2006-27.
Woodford, Michael. 2003. Interest and Prices. Princeton, NJ: Princeton
University Press.

Economic Quarterly—Volume 93, Number 4—Fall 2007—Pages 341–360

The Evolution of City
Population Density in the
United States
Kevin A. Bryan, Brian D. Minton, and Pierre-Daniel G. Sarte

The answers to important questions in urban economics depend on the
density of population, not the size of population. In particular, positive
production or residential externalities, as well as negative externalities
such as congestion, are typically modeled as a function of density (Chatterjee
and Carlino 2001, Lucas and Rossi-Hansberg 2002). The speed with which
new knowledge and production techniques propagate, the gain in property
values from the construction of urban public works, and the level of labor
productivity are all affected by density (Carlino, Chatterjee, and Hunt 2006,
Ciccone and Hall 1996). Nonetheless, properties of the distribution of urban
population size have been studied far more than properties of the urban density
distribution.
Chatterjee and Carlino (2001) offer an insightful example as to why density can be more important than population size. They note that though
Nebraska and San Francisco have the same population, urban interactions
occur far less frequently in Nebraska because of its much larger area. Though
the differences in the area of various cities are not quite so stark, there are
meaningful heterogeneities in city densities. Given the importance of urban
density, the stylized facts presented in the article ultimately require explanations such as those given for the evolution of city population.
This article makes two major contributions concerning urban density.
First, we construct an electronic database containing land area, population,
and urban density for every city with population greater than 25,000 in the
We wish to thank Kartik Athreya, Nashat Moin, Roy Webb, and especially Ned Prescott
for their comments and suggestions. The views expressed in this article are those of the
authors and do not necessarily represent those of the Federal Reserve Bank of Richmond
or the Federal Reserve System. Data and replication files for this research can be found at
http://www.richmondfed.org/research/research economists/pierre-daniel sarte.cfm. All errors are
our own.

United States. Second, we document a number of stylized facts about the
urban density distribution by constructing nonparametric estimates of the distribution of city densities over time and across regions.
We compile data for each decade from 1940 to 2000; by 2000, 1,507
cities meet the 25,000 threshold. In addition, we include those statistics for
every “urbanized area” in the United States, decennially from 1950 to 2000.
Though we also present data on Metropolitan Statistical Area (MSA) density
evolution from 1950 to 1980, this definition of a city can be problematic for
work with densities. A discussion of the inherent problems with using MSA
data is found in Section 1. To the best of our knowledge, these data have not
been previously collected in an electronic format.
Our findings document that the distribution of city densities in the United
States has shifted leftward since 1940; that is, cities are becoming less dense.
This shift is not confined to any particular decade. It is evident across regions,
and it is driven both by new cities incorporating with lower densities, and
by old cities adding land faster than they add population. The shift is seen
among several different definitions of cities. A particularly surprising result
is that “legal cities,” defined in this article as regions controlled by a local
government, have greatly decreased in density during the period studied. That
is, since 1940, local governments have been annexing territory fast enough to
counteract the increase in urban population. Annexation is the only way that
cities can simultaneously have increasing population, which is true of the vast
majority of cities in our sample, and yet still have decreasing density.
This article is organized as follows. Section 1 describes how our database
was constructed, and also discusses which definition of city is most appropriate
in different contexts. Section 2 discusses our use of nonparametric techniques
to estimate the distribution of urban density. Section 3 presents our results
and discusses why cities might be decreasing in density. Section 4 concludes.

1. DATA

What is a city? There are at least three well-defined concepts of a city boundary in the United States that a researcher might use: the legal boundary of
the city, the boundary of the built-up, urban region around a central city (an
“urbanized area”), and the boundary of a census-defined Metropolitan Statistical Area (MSA). The legal boundary of a city is perhaps most relevant
when investigating the area that state and local governments believe can be
covered effectively with a single government. Legal boundaries also have
the advantage of a consistent definition over the period studied; this is not
completely true for urbanized areas, and even less true for MSAs. Urbanized
areas parallel nicely with an economist’s mental image of an agglomeration,
as they include the built-up suburban areas around a central city. MSAs,
though commonly used in the population literature, offer a much vaguer
interpretation. Figure 1 displays the city, urbanized area, and MSA boundaries
for Richmond, Virginia, and Las Vegas, Nevada, in the year 2000.

Table 1 Three Definitions of a City

Legal City: The region controlled by a local government or a similar
unincorporated region (CDP). Defined by local and state governments.

Urbanized Area: A region incorporating a central city plus surrounding towns
and cities meeting a density requirement. Defined by the U.S. Census Bureau.

MSA: A region incorporating a central city, the county containing that city,
and surrounding counties meeting a requirement on the percentage of workers
commuting to the center. Defined by the U.S. Census Bureau.
Our database of legal cities is constructed from the decennial U.S. Bureau
of the Census Number of Inhabitants, which is published two to three years
after each census is taken. Population and land area for every U.S. “place” with
a population greater than 2,500 are listed. Places include cities, towns, villages, urban townships, and census-designated places (CDPs). Cities, towns,
and townships are legally defined places containing some form of local government, while a census-designated place (called an “unincorporated place”
before 1980) refers to unincorporated areas with a “settled concentration of
population.” Some of these CDPs can be quite large; for instance, unincorporated Metairie, Louisiana has a population of nearly 150,000 in 2000. Though
CDPs do not represent any legal entity, they are nonetheless defined in line with
settlement patterns determined after census consultation with state and local
officials, and are similar in size and density to incorporated cities.1 Including
CDPs in our database, and not simply incorporated cities, is particularly important as some states only have CDPs (such as Hawaii), and “towns” in eight
states, including all of New England, are only counted as a place when they
appear as a CDP.
From this list, we selected every place (including CDPs) with a population
greater than 25,000 for each census from 1940 to 2000. There are 412 places
in 1940 and 1,507 places in 2000 that meet this restriction. Each place was
coded into one of nine geographical regions in line with the standard census
region definition.2 We also labeled each place as either “new” or “old.” An
old place is a place that had a population greater than 25,000 in 1940 and still
has a population greater than 25,000 in 2000. A new place is one that had a
population less than 25,000 or did not exist at all in 1940, yet has a population
greater than 25,000 in 2000. There are some places which had a population
greater than 25,000 in 1940 but less than 25,000 in 2000 (for instance, a
number of Rust Belt cities with declining populations); we considered these
places neither new nor old. Delineating places in this manner allows us to
investigate whether the leftward shift of the distribution of U.S. cities was
driven by newly founded cities having a larger area, or by old cities annexing
area faster than their population increases.

Figure 1 A Graphic Representation of City Definitions (city, urbanized area, and MSA
boundaries for Richmond, Virginia, and Las Vegas, Nevada, in 2000)

1 1980 Census of Population: Number of Inhabitants. “Appendix A–Area Classification.”
U.S. Department of Commerce, 1983. Note that CDPs did not appear in the 1940 Census.
2 “Census Regions and Divisions of the United States.” Available online at
http://www.census.gov/geo/www/us_regdiv.pdf.
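To make the classification rule above precise, the following short Python sketch (an
illustration added here, not part of the original data work; the function name and the
convention that a missing 1940 population is recorded as None are hypothetical) labels a
place given its 1940 and 2000 populations:

def classify_place(pop_1940, pop_2000, threshold=25000):
    # pop_1940 is None for places that did not exist in 1940.
    # Only places above the threshold in 1940 or 2000 enter the sample.
    above_1940 = pop_1940 is not None and pop_1940 > threshold
    above_2000 = pop_2000 > threshold
    if above_1940 and above_2000:
        return "old"
    if not above_1940 and above_2000:
        return "new"
    return "neither"  # e.g., Rust Belt cities that fell below 25,000 by 2000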
In addition to legal cities, we also construct a series of urbanized areas
from the Number of Inhabitants publication. Beginning in 1950, the U.S. Census defined urbanized areas as places with a population of 50,000 or more,
meeting a minimum density requirement, plus an “urban fringe” consisting
of places roughly contiguous with the central city meeting a small population
requirement; as such, urbanized areas are defined in a similar way as agglomerations in many economic models. Aside from 1960, when the density
requirement for central cities was lowered from approximately 2,000 people
per square mile to 1,000 per square mile, changes in the definition of an urbanized area have been minor.3 Our database includes each urbanized area
from 1950 to 2000; there were 157 such areas in 1950 and 452 in 2000.
Much of the literature on city population uses data on Metropolitan Statistical Areas (MSAs). An MSA is defined as a central urban city, the county
containing that city, and outlying counties that meet certain requirements concerning population density and the number of residents who commute to the
central city for work.4 We believe there are a number of reasons that this data
can be problematic for investigating city density. First, it is difficult to get
consistent data on metro areas. Before 1950, they were not defined at all,
though Bogue (1953) constructed a series of MSA populations for 1900–1940
by adding up the population within the area of each MSA as defined in 1950.
Because, by definition, Bogue holds MSA area constant for 1900–1950, this
data set would not pick up any changes in density caused by the changing
area of a city over time. Furthermore, there was a significant change in how
MSAs are defined in 1983, with the addition of the “Consolidated Metropolitan Statistical Area” (CMSA). Because of this, MSAs between 1980 and 1990
are not comparable. Dobkins and Ioannides (2000) construct MSAs for 1990
using the 1980 definition, but no such series has been constructed for 2000.
Second, the delineation of MSAs is highly dependent on county definitions.
Particularly in the West, counties are often much larger than in the
Midwest and the East. For instance, in 1980, the Riverside-San Bernardino-
Ontario, California MSA had an area of 27,279 square miles and a population
density of 57 people per square mile.5 This MSA has an area three times the
size of and a lower population density than Vermont.6 When looking solely
at population, MSAs can still be useful because the population in outlying
rural areas tends to be negligible; this is not the case with area, and therefore
density.

3 See the Geographic Areas Reference Manual, U.S. Bureau of the Census, chap. 12.
Available online at: http://www.census.gov/geo/www/garm.html.
4 In New England, the town, rather than the county, is the relevant area.
Third, the number of MSAs is problematic in that it truncates the number of available cities such that only the far right-hand tail of the population
distribution is included. For instance, Dobkins and Ioannides’ (2000) MSA
database includes only 162 cities in 1950, rising to 334 by 1990. For cities
and census-designated places, three to four times as much data can be used.
Eeckhout (2004) notes that the distribution of urban population size is completely different when using a full data set versus a truncated selection that
includes only MSAs; it seems reasonable to believe that urban density might
be similar in this regard. Further, nonparametric density estimation, as used in
this article, requires a large data set. For completeness, we show in Section 3
that the distribution of densities in MSAs from 1950 to 1980, when the MSA
definition was roughly consistent, follows a similar pattern to that of urbanized
areas and legal cities.
Other than the database used in this article, we know of no other complete
panel data set of urban density for U.S. cities. For 1990 and 2000, a full
listing of places with area and population is available online as part of the
U.S. Census Gazetteer.7 The County and City Data Books, hosted by the
University of Virginia, Geospatial and Statistical Data Center, hold population
and area data for 1930, 1940, 1950, 1960, and 1975; these data were entered
by hand during the 1970s from the same census books we used.8 However,
crosschecking this data with the actual census publications revealed a number
of minor errors, and further indicated that unincorporated places and urban
towns were not included. For some states (for instance, Connecticut and
Maryland), this means that very few places were included in the data set at
all. Our data set rectifies these omissions.
5 The MSA was made up of two counties: Riverside County with an area of 7,214 square
miles, and San Bernardino County with an area of 20,064 square miles.
6 In fact, the entire planet has a land area of around 58 million square miles and a population
of 6.5 billion, giving a density of 112 people per square mile, or twice the density of the Riverside
MSA.
7 The 1990 data can be found at http://www.census.gov/tiger/tms/gazetteer/places.txt. Data for
2000 are available at: http://www.census.gov/tiger/tms/gazetteer/places2k.txt.
8 County and City Data Books. University of Virginia, Geospatial and Statistical Data Center.
Available online at: http://fisher.lib.virginia.edu/collections/stats/ccdb/.

2. NONPARAMETRIC ESTIMATION

With these density data, we estimate changes in the probability density function (pdf) over time for each definition of a city in order to examine, for
instance, how the distribution of urban densities is changing over time. We
use nonparametric techniques, rather than parametric estimation, because nonparametric estimators make no underlying assumption about the distribution
of the data (for instance, the presence or lack of normality). Assuming, for
instance, an underlying normal distribution might mask evidence of a true
bimodal distribution, and given our lack of priors concerning the distribution
of urban densities, nonparametric estimates offer more flexibility. Potential
pitfalls in nonparametric estimation are the requirement of larger data sets,
and the computational difficulty of calculating pdf estimates with more than
two or three variables;9 however, our data sets are large and our estimated
pdfs are univariate. Nonparametric estimates of a pdf are closely related to
the histogram; a description of this link, and basic nonparametric concepts, is
given in Appendix A.

9 Nonparametric estimates converge to their true values at a rate slower than √n.
One frequently used nonparametric pdf estimator is the Rosenblatt-Parzen
estimator,
$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K(\psi_i),$$

where $n$ is the number of observations, $h$ is a “smoothing factor” to be chosen
below, $\psi_i = (x - x_i)/h$, and $K$ is a nonparametric kernel. The smoothing factor $h$
determines the interval of points around x which are used to compute fˆ(x),
and the kernel determines the manner in which an estimator weighs those
points. For instance, a uniform kernel would weigh all points in the interval
equally.
In practice, the choice of kernel is relatively unimportant. In this article,
we use one of the more common kernels, namely the Gaussian kernel,
$$K(\psi_i) = (2\pi)^{-1/2}\, e^{-\psi_i^2/2}.$$

This kernel uses a weighted average of all observations, with weights declining
in the distance of each observation $x_i$ from $x$.
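As an illustration of how this estimator can be computed, here is a minimal Python sketch
(added for concreteness; it is not the authors' code, and the sample data, grid, and
bandwidth value are invented):

import numpy as np

def gaussian_kernel(psi):
    # K(psi) = (2*pi)^(-1/2) * exp(-psi^2 / 2)
    return np.exp(-0.5 * psi ** 2) / np.sqrt(2.0 * np.pi)

def rosenblatt_parzen(x_grid, sample, h):
    # f_hat(x) = (1 / (n*h)) * sum_i K((x - x_i) / h)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    psi = (x_grid[:, None] - sample[None, :]) / h  # evaluation points by observations
    return gaussian_kernel(psi).sum(axis=1) / (n * h)

# Hypothetical use: estimate the pdf of log city densities on a 1,000-point grid.
log_density = np.log(np.random.lognormal(mean=8.0, sigma=0.7, size=500))  # fake data
grid = np.linspace(log_density.min(), log_density.max(), 1000)
f_hat = rosenblatt_parzen(grid, log_density, h=0.25)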
The choice of bandwidth h, on the other hand, can be important, and is
often chosen so as to minimize an error function of bias and variance. Given a
set of assumptions about the nature of f (x), the Rosenblatt-Parzen estimator
$\hat{f}(x)$ is such that10
$$\mathrm{Bias} = \frac{h^2}{2}\left[\int \psi^2 K(\psi)\,d\psi\right] f''(x) + O(h^2) \qquad (1)$$

and

$$\mathrm{Variance} = \frac{1}{nh}\, f(x) \int K^2(\psi)\,d\psi + O\!\left(\frac{1}{nh}\right). \qquad (2)$$

A low bandwidth, h, gives low bias but high variance, whereas a high h will
give high bias but low variance. That is, choosing too small of a value for h
will cause the estimated density to lack smoothness since not enough sample
points will be used to calculate each $\hat{f}(x_i)$, whereas too high a value for h will
smooth out even relevant bumps such as the trough in a bimodal distribution.
A description of the assumptions necessary for our bias and variance formulas
can be found in Appendix B.
The integrated mean squared error is defined as

$$\int \left[\mathrm{Bias}(\hat{f}(x))^2 + V(\hat{f}(x))\right] dx. \qquad (3)$$

This function simultaneously accounts for bias and variance. It is analogous
to the conventional mean squared error in a parametric estimation. When h
is chosen to minimize (3) after substituting for the bias and variance using
expressions (1) and (2) respectively, we obtain
$$h = c\,n^{-1/5} \quad \text{where} \quad c = \left[\frac{\int K^2(\psi)\,d\psi}{\left[\int \psi^2 K(\psi)\,d\psi\right]^2 \int \left(f''(x)\right)^2 dx}\right]^{1/5}.$$

Since f(x) is unknown, and the formula for h involves knowing the true f(x),
no more can be said about h without making some assumptions about the nature
of f(x). For example, if $f(x) \sim N(\mu, \sigma^2)$, then $c = 1.06\sigma$, and therefore
$h = 1.06\hat{\sigma}\, n^{-1/5}$ exactly.11 This formula is called Silverman’s Rule of Thumb, and
works very well for data that is approximately normally distributed (Silverman
1986). Silverman notes that this rule does not necessarily work well for
bimodal or heavily skewed data, and some of the series in this article (for
instance, city populations) are heavily skewed. In particular, outliers lead to
large increases in the estimated standard deviation, $\hat{\sigma}$, and therefore a very
large value for h. Consequently, this article instead uses Silverman’s more
general specification

$$h = .9\, B\, n^{-1/5}$$
given

$$B = \min\!\left(\hat{\sigma}, \frac{IQR}{1.34}\right),$$

where IQR is the interquartile range of sample data. This formula is much
less sensitive to outliers than the Rule of Thumb. In practice, this has been shown
to be nearly optimal for somewhat skewed data.

10 If $X_n / n^k \rightarrow$ some real number $c$ as $n \rightarrow \infty$, then $X_n$ is $O(n^k)$. $O(A)$ is the largest order
of magnitude of a sequence of real numbers $X_n$.
11 Note that this rule does not imply that the nonparametric estimate will look like a parametric
normal distribution; it merely says that, given data that are roughly normal, $1.06\hat{\sigma}\, n^{-1/5}$ is
the smoothing factor that minimizes both bias and variance.
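A minimal Python sketch of this bandwidth rule (an added illustration under the same
assumptions as the earlier sketch, not the authors' code):

import numpy as np

def silverman_bandwidth(sample):
    # h = .9 * B * n^(-1/5), with B = min(sigma_hat, IQR / 1.34),
    # the outlier-robust variant of Silverman's Rule of Thumb described above.
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    sigma_hat = sample.std(ddof=1)
    q75, q25 = np.percentile(sample, [75, 25])
    b = min(sigma_hat, (q75 - q25) / 1.34)
    return 0.9 * b * n ** (-1.0 / 5.0)

For example, the bandwidth passed to the earlier rosenblatt_parzen sketch could be
computed as h = silverman_bandwidth(log_density).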

3. RESULTS

Using the kernel and smoothing parameter from the previous section, we can
construct estimates of the pdf of the distribution of population, area, and urban
density in each decade.
Figure 2 shows nonparametric estimations of the distributions of population size, area, and density for legal cities as defined in Section 1. Panel C
shows a leftward shift of the distribution of city densities; that is, cities in 2000
are significantly less dense than in 1940. The mean population per square mile
during that period fell from 6,742 to 3,802. This is being driven principally
by an increase in the area of each city; mean area has increased from 19.2
square miles to 35.1 square miles between 1940 and 2000. The distribution
of populations has remained relatively constant during this period.

Figure 2 Legal City Area, Population, and Density (Panel A: Legal City Area
Distributions; Panel B: Legal City Population Distributions; Panel C: Legal City Density
Distributions; each shown for 1940, 1960, 1980, and 2000 in natural logs)
One might imagine that this shift is being driven only by a subset of cities,
such as rapidly-growing suburban and exurban cities, or cities in the West
where land is less scarce. Hence, we divide cities into “new” and “old,” as
defined in Section 1, as well as categorize each city into one of four regions:
East, South, Midwest, and West. Figure 3 shows that the leftward shift in
distribution is similar among both old and new cities; that is, city density
is decreasing both because existing cities are annexing additional area, and
because new cities have lower initial densities than in the past. The number
of cities that change their legal boundaries in a given decade is surprising; for
instance, between 1990 and 2000, nearly 36 percent of the cities in our data
set added or lost at least one square mile. These changes vary enormously
by state; however, in a state such as Massachusetts, where all of the land has
been divided into towns for decades, there is very little opportunity for a city
to add territory. Alternatively, in a state such as Oregon where the majority
of land is unincorporated, annexation is much more common. Might it then
be the case that the shift in city density is specific to the Midwest and West,
where annexation is frequent?

Figure 3 Distributions of New and Old Cities (legal city density distributions; new cities
shown for 1960, 1980, and 2000; old cities for 1940, 1960, 1980, and 2000)
In fact, the leftward shift in city density does not appear to be a regional
phenomenon. Figure 4 shows the distribution of densities in the East, South,
Midwest, and West during the period 1940–2000. Each region showed a similar decline in density. The full distribution of log density from the Rosenblatt-Parzen estimator is particularly useful when examining the relatively small
number of cities in each region when compared to a simple table of moments,
as extreme outliers in the data can result in high skewness. For instance, Juneau,
Alaska, had an area of 2,716 square miles and a population of 30,711 in 2000,
giving a density of approximately 11 people per square mile.

Figure 4 Distributions of Urban Density by Region (legal city density distributions for the
East, South, Midwest, and West, 1940–2000)
The trend in density is even clearer if we look at urbanized areas. Urbanized areas can be reasonably thought of as urban agglomerations; they
represent the built-up area surrounding a central city. Figure 5 shows the estimated distribution of urbanized areas in 1960, 1980, and 2000. As in the
case of legal cities, there has been a clear decrease in the density of urbanized
areas during this period. Because the boundaries of urbanized areas and legal cities are quite different, it is rather striking that, under both definitions,
the decrease in density has been so evident. That is, cities have not simply
expanded into a mass of lower-density suburbs, but the individual cities and
suburbs themselves have decreased in density, primarily by annexing land.

Figure 5 Distribution of Urbanized Areas (density distributions for 1960, 1980, and 2000)
Finally, we consider the density of Metropolitan Statistical Areas. As noted
in Section 1, there are only consistently defined MSA data available for the
period 1950–1980. Furthermore, a decrease in the distribution of MSA density
might simply reflect the increase in the number of MSAs in states with large
counties, since each MSA by definition includes its own county. The urban
economics literature concerning population size, however, often uses MSAs.
Figure 6 shows that the distribution of MSA population density also appears to
be shifting leftward in the same manner as legal cities and urbanized areas, but
again, it is hazardous to give any interpretation to this shift. The definitional
advantages and large data sample size for urbanized areas and legal cities
potentially makes them preferable to MSAs for future work concerning urban
density.

Figure 6 Distribution of MSAs (MSA density distributions for 1960 and 1980)
The importance of these shifts in urban density is underscored by the
long-understood link between density and economic prosperity. Lucas (1988)
cites approvingly Jane Jacobs’ contention that dense cities, not simply cities,
are the economic “nucleus of an atom,” the central building block of development through their role in spurring human capital transfers. Ciccone and Hall
(1996), using county-level data, find that a doubling of employment density in
a county increases labor productivity by 6 percent. In addition to knowledge
transfer, agglomerations arise in order to facilitate effective matches between
employer and employee and to take advantage of external economies of scale
such as a common deepwater port.
Measuring the nature of local knowledge transfer, and in particular whether
the relevant area has expanded as transportation and communication costs have fallen, is difficult. Jaffe, Trajtenberg, and Henderson (1993) find evidence that, given the existing distribution of industries and research activity,
new patents tend to cite existing patents from the same state and MSA at an
unexpectedly high level. Using data on the urbanized portion of a metropolitan
area, Carlino, Chatterjee, and Hunt (2006) find that patents per capita rise 20
percent as the employment density of a city doubles. They also find that the
benefits of density are diminishing over density, so that cities with employment densities similar to Philadelphia and Baltimore, around 2,100 jobs per
square mile, are optimal.
Given the economic benefits of density, the changes in the urban density distribution presented in this article suggest two questions. First, why
have agglomeration densities decreased? Second, why have the areas of legal
jurisdictions increased?
Decreased densities in urban areas have been explained by a number of
processes in the literature, including federal mortgage insurance, the Interstate
Highway System, racial tension, and schooling considerations. Mieszkowski
and Mills (1993) counter that these explanations tend to be both unique to the
United States and phenomena of the postwar period, whereas a decrease
in urban density began as early as 1900 and has occurred across the developed
world. Two theories remain.
First, the decreased transportation costs brought about by the automobile
and the streetcar have allowed congestion in central cities to be avoided by firms
and consumers. Glaeser and Kahn (2003) point out that the automobile also
has a supply-side effect in that it allows factories and other places of work
to decentralize by eliminating the economies of scale seen with barges and
railroads; the rail industry was three times larger than trucking in 1947, but
trucks now carry 86 percent of all commodities in the United States. Whereas
the wealthy in the nineteenth century might have preferred to live in the center
of a city while the poor were forced to walk from the outskirts, the modern
well-to-do are less constrained by transport times and, therefore, occupy land
in less-dense suburban and exurban cities.
Rossi-Hansberg, Sarte, and Owens (2005) present a model in which firms
set up non-integrated operations such that managers work in cities in order
to take advantage of knowledge transfer externalities but production workers
tend to work at the periphery of a city where land costs are lower. They then
show that, as city population grows, the internal structure of cities changes
along a number of dimensions that are consistent with the data.
A second theory, not entirely independent from the first, posits that cities
have become less dense because of a desire for homogenization. When a large
group with relatively homogenous preferences for tax rates and school quality
is able to occupy its own jurisdiction, it can use land-use controls to segregate
itself from potential residents with a different set of preferences. Mieszkowski
and Mills (1993) argue that land-use restrictions have become more stringent
in the postwar era, and that segregation into income-homogenous areas may
be contributing to decreased densities.
There are fewer existent theories about why legal jurisdictions, at a given
population level, have increased in area. Glaeser and Kahn (2003) note that
effective land use requires larger jurisdictions as transportation costs fall. That
is, if a city wished to limit sprawl in an era with high transportation costs, it
could enact effective land-use regulations within small city boundaries. In
an era with low transportation costs, however, such a regulation would simply push residents into another bedroom community and have no effect on
sprawl or traffic. The growing number of regional land-use planning commissions, such as Portland’s Metropolitan Service District and Atlanta’s Regional
Commission, speaks to this trend (Song and Knaap 2004).
Austin (1999) discusses reasons why cities may want to annex territory,
including controlling development on the urban fringe, increasing the tax base,
lowering municipal service costs by exploiting returns to scale, or altering
the characteristics of the city, such as
decreasing the minority proportion of population. External areas may wish
to be annexed because of urban economies of scale, and because urban areas
offer benefits such as cheaper bond issuance than suburban and unincorporated
areas. Austin finds evidence that cities annex for both political and economic
reasons, but that increasing the tax base does not appear to be a relevant
factor, perhaps because of the growing ability of high-wealth areas to avoid
annexation by poorer cities.

4. CONCLUDING REMARKS

This article provides two novel contributions. First, it constructs an electronic data set of urban densities in the United States during the previous
seven decades for three different definitions of a city. Second, it applies nonparametric techniques to estimate the distribution of those densities, and finds
that there has been a stark decrease in density during the period studied. This
deconcentration has been occurring continuously since at least 1940, in every
area of the United States, and among both new and old cities. This result
is striking; increasing population and increasing area across cities do not, by
themselves, tell us what will happen to density.
Falling urban densities suggest that, over the past seven decades, the productivity benefits of dense cities have been weakening. Decreasing costs
of transportation and communication have allowed firms to move production
workers out of high-rent areas, and have allowed residents to move away from
downtowns. It is unclear what effect these changes in the urban landscape
will have on knowledge accumulation and growth in the future. For instance,
it is conceivable that the productivity loss from ever-decreasing spatial density
might be counteracted by decreased long-range communication costs. Understanding the broad properties of urban density in modern economies is merely
a necessary first step in understanding how these changing properties of cities
will affect the broader economy.

APPENDIX A: NONPARAMETRIC ESTIMATORS

Classical density estimation assumes a parametric form for a data set and uses
sample data to estimate those parameters. For instance, if an underlying
process is assumed to generate normal data, the estimated density is
$$\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}},$$

where σ and μ are the sample standard deviation and mean.
Nonparametric density estimation, on the other hand, allows a researcher
to estimate a complete density function from sample data, and therefore estimate each moment of that data, without assuming any underlying functional
form. For instance, if a given distribution is bimodal, estimating moments
under the assumption of normally distributed data will be misleading. Knowing the full distribution of data also makes clear what stylized facts need to be
explained in theory; if the data were skewed heavily to the right and suffered
from leptokurtosis, a theory explaining that data should be able to replicate
these properties. Nonparametric estimation generally requires a larger data
set than parametric estimation to achieve consistency, but is becoming more
common in the literature. Given that our city data set is large, we use nonparametric techniques in this article. A brief introduction to these techniques
can be found in Greene (2003), while a more complete treatment is found in
Pagan and Ullah (1999).
At its core, a nonparametric density estimate is simply a smoothed histogram. Therefore, the nonparametric estimator can be motivated by beginning with a histogram. In a histogram, the full range of n sample values
is partitioned into non-overlapping bins of equal width h. Each bin has a
height equal to the number of sample observations within the range of that
bin divided by the total number of observations. Given an indicator function
I(A), defined as equal to 1 if the statement A is true, and 0 if the statement A
is false, the height of a bin centered at some point x0 , with width h, is
$$H(x_0) = \frac{1}{n}\sum_{i=1}^{n} I\!\left(x_0 - \frac{h}{2} < x_i \leq x_0 + \frac{h}{2}\right).$$

That is, we are simply counting the number of sample observations in each
bin of width h, and dividing that frequency by the sample size; the resulting
height of each bin is the relative frequency. If there are 40 observations,
of which 10 are in the bin (1,2], with h = 1, then the histogram has height
H (1.5) = .25 for all x in (1,2].
This concept can be extended by computing a “local” histogram for each
point x in the range $(x_{\min} - \frac{h}{2}, x_{\max} + \frac{h}{2}]$, where $x_{\min}$ and $x_{\max}$ are the minimum
and maximum values in the sample data.12 In the histogram above, we
computed $H(x_0)$ at only $(x_{\max} - x_{\min})/h$ points in the range; $x_0$ was required to be the
midpoint of a bin. The local histogram will instead calculate $\hat{f}(x)$ for every
x in $(x_{\min} - \frac{h}{2}, x_{\max} + \frac{h}{2})$, where $\hat{f}(x)$ evaluated at a given point $x_0$ is equal
to the number of sample observations within $(x_0 - \frac{h}{2}, x_0 + \frac{h}{2})$, divided by n
to give a frequency.13 That is,
$$\begin{aligned}
\hat{f}(x) &= \frac{1}{n}\sum_{i=1}^{n} I\!\left(x - \frac{h}{2} < x_i < x + \frac{h}{2}\right) \\
 &= \frac{1}{n}\sum_{i=1}^{n} I\!\left(\left|\frac{x - x_i}{h}\right| < \frac{1}{2}\right) \\
 &= \frac{1}{n}\sum_{i=1}^{n} I\!\left(|\psi(x_i)| < \frac{1}{2}\right),
\end{aligned}$$

where $\psi(x_i) = \frac{x - x_i}{h}$. $\hat{f}(x)$ is a proper density function if, first, it is greater
than or equal to zero for all x, which is guaranteed since the indicator function
is always either 0 or 1, and second, if $\int_{-\infty}^{\infty} \hat{f}(x)\,dx = 1$. Dividing $\hat{f}(x)$ by h
ensures that the function integrates to one. To see this, observe first that
$$\int_{-\infty}^{\infty} I\!\left(|\psi(x_i)| < \frac{1}{2}\right) d\psi = \int_{-\frac{1}{2}}^{\frac{1}{2}} I\!\left(|\psi(x_i)| < \frac{1}{2}\right) d\psi = \int_{-\frac{1}{2}}^{\frac{1}{2}} d\psi = 1.$$

In addition, since $\psi(x_i) = \frac{x - x_i}{h}$,

$$\begin{aligned}
\int_{-\infty}^{\infty} \hat{f}(x)\,dx &= \frac{1}{nh}\sum_{i=1}^{n} \int_{-\infty}^{\infty} I\!\left(\left|\frac{x - x_i}{h}\right| < \frac{1}{2}\right) dx \\
 &= \frac{1}{n}\sum_{i=1}^{n} \int_{-\infty}^{\infty} I\!\left(\left|\frac{x - x_i}{h}\right| < \frac{1}{2}\right) d\psi \\
 &= 1.
\end{aligned}$$
12 The local histogram $\hat{f}(x)$ must be computed for $(x_{\min} - \frac{h}{2}, x_{\max} + \frac{h}{2}]$ and not simply
for $(x_{\min}, x_{\max}]$, because $\hat{f}(x) > 0$ for points outside of $(x_{\min}, x_{\max}]$. For instance, if h = 1
and $(x_{\min}, x_{\max}] = (0, 10]$, $\hat{f}(10.4)$ will be greater than zero because it will count the sample
observation $x_0 = 10$.
13 In practice, $\hat{f}(x)$ can only be computed for a finite number of points. The distributions we
display in Section 3 have been computed at 1,000 points evenly divided on the range $(x_{\min}, x_{\max})$.

While local histograms certainly provide a nonparametric estimate of density, and
are smoother than proper histograms, they are still discontinuous. It
seems sensible, then, to attempt to smooth the histogram. This is done by
replacing the indicator function in
$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} I\!\left(\left|\frac{x - x_i}{h}\right| < \frac{1}{2}\right)$$

with another function called a kernel, K(ψ), such that fˆ(x) ≥ 0, integrates
to one and is smooth. An estimator of the form
$$\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K(\psi_i), \quad \text{where } \psi_i = \frac{x - x_i}{h},$$
is a Rosenblatt-Parzen kernel estimator, and the resulting function fˆ(x) depends on the choice of h, called a bandwidth or smoothing parameter, and
the choice of kernel. A “good” density estimate will have low bias (that is,
E(fˆ(x)) − f (x), where f (x) is the true density of the data) and low variance.
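To connect the formulas above to computation, here is a minimal Python sketch of the
naive local-histogram estimator, that is, the Rosenblatt-Parzen estimator with a uniform
(indicator) kernel. It is an added illustration, not the article's code:

import numpy as np

def naive_local_histogram(x_grid, sample, h):
    # f_hat(x) = (1 / (n*h)) * sum_i I(|(x - x_i) / h| < 1/2):
    # the share of observations within h/2 of each grid point, scaled by 1/h so
    # that the estimate integrates to one.  The result is a step function,
    # which is why a smooth kernel such as the Gaussian is preferred.
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    psi = np.abs(x_grid[:, None] - sample[None, :]) / h
    return (psi < 0.5).sum(axis=1) / (n * h)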

APPENDIX B: ROSENBLATT-PARZEN BIAS AND VARIANCE

Bias and variance of a nonparametric estimator can be calculated given the
following four assumptions:
1) The sample observations are i.i.d.
2) The kernel is symmetric around zero and satisfies $\int_{-\infty}^{\infty} K(\psi)\,d\psi = 1$,
$\int_{-\infty}^{\infty} \psi K(\psi)\,d\psi = 0$, and $\int_{-\infty}^{\infty} K^2(\psi)\,d\psi < \infty$.
3) The second-order derivatives of $f$ are continuous and bounded
around x, and
4) $h \rightarrow 0$ and $nh \rightarrow \infty$ as $n \rightarrow \infty$.

It can be shown that the Rosenblatt-Parzen estimator $\hat{f}(x)$ has

$$\mathrm{Bias} = \frac{h^2}{2}\left[\int \psi^2 K(\psi)\,d\psi\right] f''(x) + O(h^2)$$

and

$$\mathrm{Variance} = \frac{1}{nh}\, f(x) \int K^2(\psi)\,d\psi + O\!\left(\frac{1}{nh}\right).$$

The integrated mean squared error (MISE) is defined as

$$\int \left[\mathrm{Bias}(\hat{f}(x))^2 + V(\hat{f}(x))\right] dx.$$

Substituting the formulas for bias and variance, and ignoring the higher
order terms, $O(h^2)$ and $O\!\left(\frac{1}{nh}\right)$, respectively, gives the asymptotic integrated
mean squared error (AMISE):

$$\begin{aligned}
&\frac{h^4}{4}\left[\int \psi^2 K(\psi)\,d\psi\right]^2 \int \left(f''(x)\right)^2 dx + \frac{1}{nh}\int f(x)\,dx \int K^2(\psi)\,d\psi \\
&= \frac{h^4}{4}\left[\int \psi^2 K(\psi)\,d\psi\right]^2 \int \left(f''(x)\right)^2 dx + \frac{1}{nh}\int K^2(\psi)\,d\psi.
\end{aligned}$$

Differentiating with respect to h and setting the result equal to zero, we
have

$$h^3\left[\int \psi^2 K(\psi)\,d\psi\right]^2 \int \left(f''(x)\right)^2 dx - \frac{1}{nh^2}\int K^2(\psi)\,d\psi = 0$$

or

$$h = c\,n^{-1/5}, \quad \text{where } c = \left[\frac{\int K^2(\psi)\,d\psi}{\left[\int \psi^2 K(\psi)\,d\psi\right]^2 \int \left(f''(x)\right)^2 dx}\right]^{1/5}.$$
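As a worked check (added here; it does not appear in the original appendix), specializing
this expression to the Gaussian kernel and a normal f(x) recovers the constant 1.06σ
quoted in Section 2:

$$\int \psi^2 K(\psi)\,d\psi = 1, \qquad \int K^2(\psi)\,d\psi = \frac{1}{2\sqrt{\pi}}, \qquad \int \left(f''(x)\right)^2 dx = \frac{3}{8\sqrt{\pi}\,\sigma^5},$$

so that

$$c = \left[\frac{1/(2\sqrt{\pi})}{1^2 \cdot 3/\left(8\sqrt{\pi}\,\sigma^5\right)}\right]^{1/5} = \left(\frac{4}{3}\right)^{1/5}\sigma \approx 1.06\,\sigma.$$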

REFERENCES
Austin, D. Andrew. 1999. “Politics vs. Economics: Evidence from Municipal
Annexation.” Journal of Urban Economics 45 (3): 501–32.
Bogue, Donald J. 1953. Population Growth in Standard Metropolitan Areas
1900–1950. Oxford, Ohio: Scripps Foundation in Research in
Population Problems.
Carlino, Gerald, Satyajit Chatterjee, and Robert M. Hunt. 2006. “Urban
Density and the Rate of Invention.” Federal Reserve Bank of
Philadelphia Working Paper No. 06-14.
Chatterjee, Satyajit, and Gerald A. Carlino. 2001. “Aggregate Metropolitan
Employment Growth and the Deconcentration of Metropolitan
Employment.” Journal of Monetary Economics 48 (3): 549–83.
Ciccone, Antonio, and Robert E. Hall. 1996. “Productivity and the Density
of Economic Activity.” American Economic Review 86 (1): 54–70.
Dobkins, Linda, and Yannis Ioannides. 2000. “Dynamic Evolution of the
Size Distribution of U.S. Cities.” In The Economics of Cities, eds. J.
Huriot and J. Thisse. New York, NY: Cambridge University Press.
Eeckhout, Jan. 2004. “Gibrat’s Law for (All) Cities.” American Economic
Review 94 (5): 1,429–51.
Glaeser, Edward L., and Matthew E. Kahn. 2003. “Sprawl and Urban
Growth.” In Handbook of Regional and Urban Economics, eds. J. V.
Henderson and J. F. Thisse, 1st ed., vol. 4, chap. 56. North Holland:
Elsevier.

Greene, William. 2003. Econometric Analysis. 5th ed. Upper Saddle River,
NJ: Prentice Hall.
Jaffe, Adam B., Manuel Trajtenberg, and Rebecca Henderson. 1993.
“Geographic Localization of Knowledge Spillovers as Evidenced by
Patent Citations.” Quarterly Journal of Economics 108 (3): 577–98.
Lucas, Robert E., Jr. 1988. “On the Mechanics of Economic Development.”
Journal of Monetary Economics 22 (1): 3–42.
Lucas, Robert E., Jr., and Esteban Rossi-Hansberg. 2002. “On the Internal
Structure of Cities.” Econometrica 70 (4): 1,445–76.
Marshall, Alfred. 1920. Principles of Economics. 8th ed. London: Macmillan
and Co., Ltd.
Mieszkowski, Peter, and Edwin S. Mills. 1993. “The Causes of Metropolitan
Suburbanization.” The Journal of Economic Perspectives 7 (3): 135–47.
Pagan, Adrian, and Aman Ullah. 1999. Nonparametric Econometrics.
Cambridge, UK: Cambridge University Press.
Rossi-Hansberg, Esteban, Pierre-Daniel Sarte, and Raymond Owens III.
2005. “Firm Fragmentation and Urban Patterns.” Federal Reserve Bank
of Richmond Working Paper No. 05-03.
Silverman, B. W. 1986. Density Estimation. London: Chapman and Hall.
Song, Yan, and Gerritt-Jan Knaap. 2004. “Measuring Urban Form: Is
Portland Winning the War on Sprawl?” Journal of the American
Planning Association 70 (2): 210–25.
U.S. Bureau of the Census. “Number of Inhabitants: United States
Summary.” Washington, DC: U.S. Government Printing Office 1941,
1952, 1961, 1971, and 1981.
U.S. Bureau of the Census. 1994. Geographic Areas Reference Manual.
Available online at http://www.census.gov/geo/www/garm.html
(accessed September 4, 2007).

Economic Quarterly—Volume 93, Number 4—Fall 2007—Pages 361–391

Currency Quality and
Changes in the Behavior of
Depository Institutions
Hubert P. Janicki, Nashat F. Moin, Andrea L. Waddle,
and Alexander L. Wolman

The Federal Reserve System distributes currency to and accepts deposits
from Depository Institutions (DIs). In addition, the Federal Reserve
maintains the quality level of currency in circulation by inspecting all
deposited notes. Notes that meet minimum quality requirements (fit notes)
are bundled to be reentered into circulation while old and damaged notes are
destroyed (shredded) and replaced by newly printed notes.
Between July 2006 and July 2007, the Federal Reserve implemented a
Currency Recirculation Policy for $10 and $20 notes. Under the new policy,
Reserve Banks will generally charge DIs a fee on the value of deposits that
are subsequently withdrawn by DIs within the same week. In addition, under
certain conditions the policy allows DIs to treat currency in their own vaults
as reserves with the Fed. It is reasonable to expect that the policy change will
result in DIs depositing a smaller fraction of notes with the Fed. While the
policy is aimed at decreasing the costs to society of currency provision, it may
also lead to deterioration of the quality of notes in circulation since notes that
are deposited less often are inspected less often.
This article analyzes the interaction between deposit behavior of DIs and
the shred decision of the Fed in determining the quality distribution of currency.
For a given decrease in the rate of DIs’ note deposits with the Fed, absent any
change in the Fed’s shred decision, what effect would there be on the quality
distribution of currency in circulation? What kind of changes in the shred
criteria would restore the original quality distribution?

The authors are grateful to Barbara Bennett, Shaun Ferrari, Juan Carlos Hatchondo, Chris
Herrington, Jaclyn Hodges, Larry Hull, Andy McAllister, David Vairo, John Walter, and John
Weinberg for their input. The views expressed in this article are those of the authors and do
not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve
System. Correspondence should be directed to alexander.wolman@rich.frb.org.
To answer these questions, we use the model developed by Lacker and
Wolman (1997).1 In the model, the evolution of the currency quality distribution over time is governed by (i) a quality transition matrix that describes the
probabilistic deterioration of notes from one period to the next, (ii) DIs’ deposit probabilities for notes at each quality level, (iii) the Fed’s shred decision
for notes at each quality level, (iv) the quality distribution of new notes, and
(v) the growth rate of currency.
We estimate three versions of the model for both $5 and $10 notes. We have
not estimated the model for $20 notes because they were redesigned recently,
and the new notes were introduced in October 2003. The transition from
old to new notes makes our estimation procedure impractical; we discuss this
further in the Conclusion.2 Although the policy affects $10 and $20 notes only,
we also estimate the model for $5 notes because the policy change initially
proposed in 2003 included $5 notes. (It is possible that at some point the
recirculation policy might be expanded to cover that denomination.) Also, it
is likely that the reduced deposits of $10 and $20 notes may induce DIs to
change the frequency of transporting notes to the Fed and, hence, affect the
deposit rate of other denominations. The model predicts roughly comparable
results for both denominations.
In each version of our model, we choose parameters so that the model
approximates the age and quality distributions of U.S. currency deposited at
the Fed. For each estimated model, we describe the deterioration of currency
quality following decreases in DI deposit rates of 20 and 40 percent, and we
provide examples of Fed policy changes that would counteract that deterioration. As described in more detail below, we view a 40 percent decrease in
deposit rates as an upper bound on the change induced by the recirculation
policy.
According to the model(s), a 20 percent decrease in the DI deposit rate
would eventually result in an increase in the number of poor quality (unfit)
notes of between 0.8 and 2.5 percentage points. While this range corresponds
to different specifications of the model, not to a statistical confidence interval, it
should be interpreted as indicating the range of uncertainty about our results.
For $10 notes, very small changes in shred policy succeed in preventing a
significant increase in the fraction of unfit notes.3 Slightly larger changes in
shred policy are required to keep the fraction of unfit $5 notes from increasing
in response to a 20 percent lower deposit rate. Naturally, a 40 percent decrease
in deposit rates would cause a larger increase in the number of unfit notes,
although the greatest increase we find is still less than 6 percentage points.
And even in that case there are straightforward changes in shred policy that
would be effective in restoring the level of currency quality.

1 The Appendix to Lacker (1993) contains a simpler model of currency quality that shares
some basic features with the model here.
2 New $10 notes were introduced in March 2006 and new $5 notes are expected to be
introduced in 2008; our data were collected in 2004 and early 2006.
3 We view “fit notes” as referring to any notes that meet a fixed quality standard determined
by the Federal Reserve. Prior to a decrease in deposit rates, a fit note is synonymous with a note
that meets the Fed’s quality threshold for not shredding. If the Fed adjusts its shred policy in
response to a decrease in deposit rates, then it will shred some notes that were fit according to
this fixed standard.

1. INSTITUTIONAL BACKGROUND

Federal Reserve Banks issue new and fit used notes to DIs and destroy previously circulated notes of poor quality. In order to maintain the quality level
of currency in circulation, the Fed uses machines to inspect currency notes
deposited by DIs at Federal Reserve currency processing offices. These machines inspect each note to confirm its denomination and authenticity, and
measure its quality level on many dimensions. The dimensions that are measured include soil level, tears, graffiti or marks, and length and width of the
currency notes. Fit notes are those that pass the threshold quality level on all
dimensions. Once sorted, the fit notes are bundled and then recirculated when
DIs request currency from the Reserve Banks. To replace destroyed notes
and accommodate growth in currency demand, the Federal Reserve orders
new notes from the Bureau of Engraving and Printing (B.E.P.) of the U.S.
Department of Treasury. The Fed purchases the notes from B.E.P. at the cost
of production.4 In 2006, the Federal Reserve ordered 8.5 billion new notes
from the B.E.P., at a cost of $471.2 million (Board of Governors of the Federal
Reserve System 2006a)—approximately 5.5 cents per note.
In 2006, the Federal Reserve took in deposits of 38 billion notes, paid out
39 billion notes, and destroyed 7 billion notes (Federal Reserve Bank of San
Francisco 2006). Of the 19.9 million pounds of notes destroyed every year,
approximately 48 percent are $1 notes, which have a life expectancy of about
21 months. The $5, $10, and $20 denominations last roughly 16, 18, and 24
months, respectively (Bureau of Engraving and Printing 2007). Each day of
2005, the Federal Reserve’s largest cash operation, in East Rutherford, New
Jersey, destroyed approximately 5.2 million notes, worth $95 million (Federal
Reserve Bank of New York 2006).

4 Thus, seigniorage for notes accrues initially to the Federal Reserve. In contrast, the Fed
purchases coins from the U.S. Mint (a part of the Department of Treasury) at face value, so that
seigniorage for coins accrues directly to the Treasury.


Costs and Benefits of Currency Processing and
Currency Quality
The Federal Reserve’s operating costs for currency processing in 2006 were
$319 million (Federal Reserve Bank of San Francisco 2006). DIs benefit from
the Fed’s currency processing services in at least two ways. First, the Federal Reserve ships out only fit currency, whereas DIs accumulate a mixture of
fit and unfit currency; to the extent that DIs’ customers—and their ATMs—
demand fit currency, DIs benefit from the Fed’s sorting of currency. Second,
while DIs need to hold currency to meet their customers’ withdrawals, they
also incur costs by holding inventories of currency in their vaults. Currency
inventories take up valuable space and require expenditures on security systems; in addition, currency in the vault is “idle,” whereas currency deposited
with the Fed is eligible to be lent out in the federal funds market at a positive
nominal interest rate. Thus, the Fed’s currency processing services amount to
an inventory management service for DIs. The benefits DIs accrue from currency processing may not coincide exactly with the benefits to society. On one
hand, positive nominal interest rates make the inventory-management benefit
to DIs of currency processing exceed the social benefit (Friedman 1969). On
the other hand, the social benefits of improved currency quality may exceed
the quality benefits that accrue to DIs: for example, maintaining high currency
quality may deter counterfeiting by making counterfeit notes easier to detect
(Klein, Gadbois, and Christie 2004). On net, it seems unlikely that the social
benefit of currency processing greatly (if at all) exceeds the private benefit.
This implies that it would be optimal for DIs to face some positive price for
currency processing. Lacker (1993) discusses in detail the policy question of
whether the Federal Reserve should subsidize DIs’ use of currency.
Historically, the Federal Reserve did not charge DIs for processing currency deposits and withdrawals.5 Policy did prohibit a DI’s office from cross-shipping currency; cross-shipping is defined as depositing fit currency with the
Fed and withdrawing currency from the Fed within the same five-day period.
However, as explained in the Federal Reserve Board’s request for comments
that introduced the proposed recirculation policy (Board of Governors of the
Federal Reserve System 2003a), the restriction on cross-shipping was not
practical to enforce. Thus, overall the Federal Reserve cash services policy
clearly subsidized DIs’ use of currency.

5 Note, however, that DIs do pay for transporting currency between their own offices and
Federal Reserve offices.
Policy Revision
By 2003, the Federal Reserve had come to view existing policy as leading DIs
to overuse the Fed’s currency processing services (Board of Governors of the
Federal Reserve System 2003b). Factors contributing to this situation included
an increase in the number of ATM machines and a decrease in the magnitude
of required reserves. The former likely increased the value of the Fed’s sorting
services, and the latter meant that for a given flow of currency deposits and
withdrawals by the DIs’ customers, there would be greater demand by DIs to
transform vault cash into reserves with the Fed—which requires utilizing the
Fed’s processing services. In October 2003, the Federal Reserve proposed
and requested comments on changes to its cash services policy, aimed at
reducing DIs’ overuse of the Fed’s processing services (Board of Governors
of the Federal Reserve System 2003a). In March of 2006, a modified version
of the proposal was adopted as the Currency Recirculation Policy (Board of
Governors of the Federal Reserve System 2006b).
The Recirculation Policy has two components, both of which cover only
$10 and $20 denominations. The first component is a custodial inventory
program. This program enables qualified DIs to hold currency at the DI’s
secured facility while transferring it to the Reserve Bank’s ledger—thus making the funds available for lending to other institutions but avoiding both the
transportation cost and the Fed’s processing cost. DIs must apply to be in the
custodial inventory program. One criterion for qualifying is that a DI must
demonstrate that it can recirculate a minimum of 200 bundles (of 1,000 notes
each) of $10 and $20 notes per week in the Reserve Bank zone. The policy’s
second component is a fee of approximately $5 per bundle of cross-shipped
currency. While this new policy is aimed at reducing the social costs incurred because of cross-shipping currency, absent changes in shred policy it is
likely to lower the quality of currency in circulation through reduced deposits
and thus reduced shredding of unfit currency.6 The primary concerns of our
study are the effect on currency quality of the anticipated decrease in deposit
rates, and the measures the Fed can take to offset that decrease in quality. To
address these issues we construct a model of currency quality. We assume
that shredding policy is aimed at restoring or maintaining the original quality
distribution. If the cost of maintaining quality at current levels exceeds the
social benefits of doing so, it would be optimal to let the quality of currency
deteriorate somewhat.

6 Federal Reserve Banks have estimated that over 10 years, the recirculation policy could
reduce their currency processing costs by a present value of $250 million. Taking into account
increased DI costs, the corresponding societal benefit is estimated at $140 million (Board of
Governors of the Federal Reserve System 2006b).

2. THE MODEL
The model applies to one denomination of currency.7 Time is discrete, and
a time period should be thought of as a month. For the purposes of this
study, there are three major dimensions to currency quality: soil level front
(we will use the shorthand “soil level” or SLF), ink wear worst front (“ink
wear” or IWWF), and graffiti worst front (“graffiti” or GWF). There are also
at least 18 minor dimensions to currency: soil level back, graffiti total front,
etc. For a given denomination, we have separate models for each major dimension.8 Those models describe, for example, how the distribution over soil
level evolves over time. For each of those models, however, we use data on
the other dimensions to more accurately describe the probability that a note
of a particular major-dimension quality level will be shredded.9
The basic structure of the model is as follows. At the beginning of each
period, banks deposit currency with the Fed; their deposit decision may be
a function of quality in the major dimension (that is, banks may sort for
fitness). The Fed processes deposited notes, shredding those deemed unfit and
recirculating the rest at the end of the period. The shred decision is based on
quality level in whatever major dimension the model is specified. However,
notes that are fit according to their quality level in the major dimension are
nonetheless shredded with positive probability; this is to account for the fact
that they may be unfit along one of the other (major or minor) dimensions
in which the model is not specified. The stock of currency is assumed to
grow at a constant rate. Banks make withdrawals from the Fed at the end
of the period but these are not specified explicitly; instead, withdrawals can
be thought of as a residual that more than offsets deposits in order to make
the quantity of currency grow at the specified rate. In order to accommodate
growth in currency and replace shredded notes, the Fed must introduce newly
printed notes. Meanwhile, the notes that were not deposited with the Fed
deteriorate in quality stochastically. The quality of notes in circulation at the
end of a period, and thus at the beginning of the next period, is determined by
the quality of notes that have remained in circulation and the quality of notes
withdrawn from the Fed.

Formal Specification of the Model
Time is indexed by a subscript t = 0, 1, 2, .... Soil level can take on values
0, 1, 2, ..., ns − 1; ink wear can take on values 0, 1, 2, ..., ni − 1; and graffiti
can take on values 0, 1, 2, ..., ng − 1; in general, larger numbers denote poorer
quality.10 We will use q to denote a particular (arbitrary) quality level.
7 By changing the parameters appropriately, it can be applied separately to more than one
denomination; indeed we will do just that.
8 The models for the three major dimensions are truly separate, in that they will yield different
predictions.
9 As mentioned earlier, the model was first developed in Lacker and Wolman (1997). That
article studied a different policy question, namely expanding the dimensions of quality
measurements to include limpness.
10 The exception is soil level zero, which is assumed to describe currency that has been
laundered (i.e., has been through a washing machine) and is deemed unfit.

For the DIs’ deposit decision, the vector ρ contains in its q th element the
probability that a DI will deposit a note conditional on that note being of quality
level q. The vector ρ has length Q, where Q = ns or ni or ng , depending
on the particular model in question. For the Fed’s fitness criteria, the Qx1
vector α contains in its q th element the probability that a deposited note of
quality q is put back into circulation. If the model were specified in terms of
every quality characteristic—so that Q were a huge number describing every
possible combination of “soil level front,” “soil level back,” etc.—then the
elements of α would each be zero or one and they would be known parameters,
taken from the machine settings. Because the model is specified in terms of
only one characteristic, the elements of α that would be one according to q
are adjusted downward to account for the fact that some quality-q notes are
unfit according to other dimensions of quality. The values of α must then be
estimated, and we describe in Section 4 how they are estimated.
The net growth rate of the quantity of currency is γ ; that is, if the quantity
of currency is M in period t, then it is (1 + γ ) M in period (t + 1). The
Qx1 vector g describes the distribution of new notes; its q th element is the
probability that a newly printed note is of quality q.11 The deterioration of
non-deposited notes is described by the QxQ matrix π ; the row-r column-c
element of π is the probability that a non-deposited note will become quality
r next period, conditional on it being quality-c this period.12 Note that each
column of π sums to one, because any column q contains the probabilities of
all possible transitions from quality level q.
The model’s endogenous variables are the numbers of notes of different
quality levels, i.e., the quality distribution of currency. At the beginning of
period t, the Qx1 vector mt contains in its q th element the number of notes
in circulation of quality q. The total number of notes in circulation is
M_t = Σ_{q=1}^{Q} m_{q,t}, where m_{q,t} denotes the q-th element of the vector m_t.
11 We allow for new notes to have some variation in quality. However, by choosing g appropriately we can impose the highest quality level for all new notes.
12 We assume that the number of notes is sufficiently large that the probability that a quality
c note makes a transition to quality r is the same as the fraction of type c notes that make the
transition to type r. That is, the law of large numbers applies.


Combining these ingredients, the number of notes at each quality level
evolves as follows:

m_{t+1} = π · ((1 − ρ) ⊙ m_t) + α ⊙ ρ ⊙ m_t + ( Σ_{q=1}^{Q} (1 − α_q) ρ_q m_{q,t} ) g + (γ M_t) g.   (1)

Here π is Q×Q; m_t, ρ, α, and g are Q×1 vectors; and the summation term and (γ M_t) are scalars.
The symbol ⊙ denotes element-by-element multiplication of vectors or matrices.13
Equation (1) is the model, although we will rewrite it in terms of fractions
of notes instead of numbers of notes. On the left-hand side, mt+1 contains the
number of notes at each quality level at the beginning of period t + 1. The
right-hand side describes how mt+1 is determined from the interaction of mt
(the number of notes at each quality level at the beginning of period t) with
the model’s parameters. The first term on the right-hand side is
π · (1 − ρ)

QxQ

Qx1

mt
Qx1

.

(2)

This term accounts for the fractions (1 − ρ) of notes at each quality level that
are not deposited. These notes deteriorate according to the matrix π , and
thus the first term is a Qx1 vector containing in its q th element the number of
circulating notes that were not deposited in period t and that begin period t + 1
with quality q. If banks were to sort for fitness, then the notes that remain in
circulation and deteriorate during the period would be relatively high quality
notes, otherwise they would be a random sample of notes. The matrix π has
Q2 elements; assigning numbers to those elements will be the key difficulty
we face in choosing parameters for the model.
The second term is

α ⊙ ρ ⊙ m_t.   (3)
This term accounts for the fractions α ⊙ ρ of notes at each quality level that
are deposited and not shredded—that is, α ⊙ ρ ⊙ m_t comprises the deposited
notes at each quality level that are fit and will be put back into circulation at
the end of period t. If banks were to sort for fitness in a manner consistent
13 For example, if a = [1, 2] and b = [3, 4], then a ⊙ b = [3, 8].


with the Fed’s fitness definitions, and if banks possessed enough unfit notes to
meet their deposit needs, then this term would disappear—all deposited notes
would be shredded.
The third term, ( Σ_{q=1}^{Q} (1 − α_q) ρ_q m_{q,t} ) g, represents replacement of
shredded notes. The object in parentheses is the number of unfit notes that
are processed (and shredded) each period. Multiplying by the distribution of
new notes g gives the vector of new notes at each quality level that are added
to circulation at the end of period t to replace shredded notes.
The fourth term, (γ M_t) g, represents growth in the quantity of currency.
The number of new notes added to circulation to accommodate growth (as
opposed to shredding) is γ Mt , and the distribution of new notes is g, so this
term is a vector containing the numbers of new notes at each quality level
added to circulation at the end of period t to accommodate growth.
We noted above that withdrawals are not treated explicitly in the model.
The quantity of withdrawals can, however, be calculated. The number of
notes withdrawn in period t must be equal to the sum of deposits and currency
growth. That is, withdrawals equal
( Σ_{q=1}^{Q} ρ_q m_{q,t} ) + γ M_t.   (4)

Note that the model does not incorporate currency inventories at the Fed. New
notes materialize as needed, and fit notes deposited at the Fed are recirculated
at the end of the period.
The evolution of currency quality over time is determined entirely by
equation (1). Given a vector mt describing the distribution of currency quality
at the beginning of any period t, equation (1) determines the vector mt+1
describing the distribution of currency quality at the beginning of period t + 1.
The law of motion is determined by the parameters π , ρ, g, γ , and α.14
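
To make the law of motion concrete, the following sketch (our own illustration in Python with NumPy, not code from the study; the parameter arrays are assumed to be given) advances the vector of note counts by one period according to equation (1).

import numpy as np

def step_counts(m, pi, rho, alpha, g, gamma):
    """One period of equation (1): returns m_{t+1} given m_t.

    m     : (Q,) number of notes at each quality level
    pi    : (Q, Q) deterioration matrix; each column sums to one
    rho   : (Q,) deposit probabilities
    alpha : (Q,) probability that a deposited note is returned to circulation
    g     : (Q,) quality distribution of newly printed notes
    gamma : scalar net growth rate of the currency stock
    """
    M = m.sum()
    kept = pi @ ((1 - rho) * m)          # not deposited, deteriorates
    recirculated = alpha * rho * m       # deposited, judged fit, recirculated
    shredded = ((1 - alpha) * rho * m).sum()
    return kept + recirculated + shredded * g + gamma * M * g

Because the columns of pi and the vector g each sum to one, the total returned by this function is (1 + gamma) times the total passed in, as the model requires.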

The Model in Terms of Fractions of Notes
The model has been expressed in terms of the numbers of notes at each quality
level. To express the model in terms of fractions of notes at each quality level,
we first define ft to be the vector of fractions, that is the Qx1 vector of numbers
of notes at each quality level divided by the total number of notes:
f_t ≡ (1/M_t) · m_t.   (5)
14 We have written the model as if all parameters are constant over time. We maintain that

assumption for the quantitative results described in this report. The model remains valid if the
parameters change over time, although estimation becomes more challenging.


Likewise, the fraction of notes at a particular quality level is
f_{q,t} ≡ (1/M_t) · m_{q,t}.   (6)

Note that the elements of f_t sum to one, because M_t = Σ_{q=1}^{Q} m_{q,t}. Using
these definitions, we can rewrite the model (1) by dividing both sides by Mt
and recalling that M_{t+1} = (1 + γ) M_t:

(1 + γ) f_{t+1} = π · ((1 − ρ) ⊙ f_t) + α ⊙ ρ ⊙ f_t + ( Σ_{q=1}^{Q} (1 − α_q) ρ_q f_{q,t} ) g + γ g.   (7)

With this formulation it will be straightforward to study the model’s steady
state with currency growth.

The Steady-State Distribution of Notes
Under certain conditions, the distribution of currency quality converges to
a steady state in which the distribution f_t is constant over time (see, for
example, Stokey, Lucas, with Prescott 1989, chap. 11). Assuming that a unique
steady-state distribution exists, we will denote it by f*. In the steady state,
the law of motion (7) becomes

(1 + γ) f* = π · ((1 − ρ) ⊙ f*) + α ⊙ ρ ⊙ f* + ( Σ_{q=1}^{Q} (1 − α_q) ρ_q f*_q ) g + γ g.   (8)

Our method of choosing the model’s parameters will require us to compute
the steady-state distribution—we will assume that our data are generated in a
steady-state situation. One way to compute the steady state is to simply iterate
on (7) from some arbitrary initial distribution f0 and hope that the iterations
converge. If they converge, we have found the steady state. Alternatively,
we can use matrix algebra to solve directly for the steady state from (8).
Ultimately, we want to rewrite (8) in the form
Γ · f* = γ g,   (9)

where Γ is a Q×Q matrix. If we can rewrite (8) in this way, then the steady-state
distribution is f* = Γ^{−1} · (γ g). The first step is to note that for any Q×1
vector v, we have v ⊙ f* = diag(v) · f*, where diag(v) denotes the Q×Q
matrix with the vector v on the diagonal and zeros elsewhere. Using this fact,


we can rewrite (8) as
(1 + γ) f* = π · diag(1 − ρ) · f* + diag(α ⊙ ρ) · f* + ( Σ_{q=1}^{Q} (1 − α_q) ρ_q f*_q ) g + γ g.   (10)

Next, note that the scalar Σ_{q=1}^{Q} (1 − α_q) ρ_q f*_q can be rewritten as
((1 − α) ⊙ ρ)′ f*, where ′ denotes transpose. Using this fact, we have

( Σ_{q=1}^{Q} (1 − α_q) ρ_q f*_q ) g = g ((1 − α) ⊙ ρ)′ f*.   (11)

Now we can express (8) in the same form as (9), Γ · f* = γ g, where

Γ ≡ (1 + γ) I − π · diag(1 − ρ) − diag(α ⊙ ρ) − g ((1 − α) ⊙ ρ)′.   (12)

Thus, the steady state can be computed directly as

f* = Γ^{−1} · (γ g).
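
Both routes to the steady state are straightforward to implement. A minimal sketch (our own illustration, assuming the parameter arrays are already in hand) computes f* by iterating on (7) and, alternatively, by solving the linear system (9) with Γ defined as in (12).

import numpy as np

def steady_state_direct(pi, rho, alpha, g, gamma):
    """Solve Gamma f* = gamma g, with Gamma as in equation (12)."""
    Q = len(g)
    Gamma = ((1 + gamma) * np.eye(Q)
             - pi @ np.diag(1 - rho)
             - np.diag(alpha * rho)
             - np.outer(g, (1 - alpha) * rho))
    return np.linalg.solve(Gamma, gamma * g)

def steady_state_iterate(pi, rho, alpha, g, gamma, tol=1e-12, max_iter=200000):
    """Iterate equation (7) from a uniform distribution until convergence."""
    f = np.full(len(g), 1.0 / len(g))
    for _ in range(max_iter):
        f_next = (pi @ ((1 - rho) * f) + alpha * rho * f
                  + ((1 - alpha) * rho * f).sum() * g + gamma * g) / (1 + gamma)
        if np.abs(f_next - f).max() < tol:
            break
        f = f_next
    return f_next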

The steady-state distribution f ∗ contains in its q th element the fraction of
notes with quality q, corresponding to a particular measurement of soil level,
graffiti or ink wear. Thus, f ∗ can be thought of as the marginal distribution
over soil level, graffiti or ink wear. When comparing the model to data, we
will use the marginal distributions for each major quality dimension and the
distribution of notes by age. We use the age distribution because the quality
distribution alone puts few restrictions on the matrix π : we can match a
given quality distribution with many π matrices, each implying a different
age distribution.
The Appendix contains a detailed description of how to calculate the
steady-state age distribution of notes. For now, we simply state the notation: hq,k denotes the fraction of notes that are quality q and age k, and hk
denotes the Q by 1 vector of age k notes, the q th element of which is hq,k .

3. THE DATA
The model’s predictions will depend on the numerical values we assign to the
matrix π describing deterioration of notes, the vector ρ of deposit probabilities,
the vector α of shred probabilities, the quality distribution of new notes g, and
the currency growth rate γ . This section describes the basic data whose features
we attempt to match in choosing the model’s parameters.
The ideal data set for our purposes would be one with a time series of
observations on a large number of currency notes, with observations each
month on the quality of every note. Data of this sort would allow for nearly


[Figure 1: Marginal Quality Distributions, $5 Notes. Three panels, Soil Level Front (SLF), Ink Wear Worst Front (IWWF), and Graffiti Worst Front (GWF), plot the fraction of notes at each quality level q, comparing the 2004 data (solid lines) with the 2006 per-note data (dotted lines).]

direct measurement of the matrix π. Of course such data does not exist,
and probably the only way it could exist would be if individual notes had
built-in sensors and transmitters. Without such data, we need to estimate the
parameters of π . We use two data sets for this purpose. One data set describes
the marginal quality distributions only and has extremely broad coverage. The
other data set is at the level of individual notes, and contains age of notes as
well as quality. It has more limited coverage.

Large Data Set Describing Marginal Distributions
The large data set comprises fitness data for the entire Federal Reserve System
for the months of January 2004 and May 2004, provided by the Currency
Technology Office (CTO) at the Federal Reserve Bank of Richmond. This
data characterizes the marginal quality distributions for more than two-and-


[Figure 2: Marginal Quality Distributions, $10 Notes. Three panels (SLF, IWWF, GWF) plot the fraction of notes at each quality level q, comparing the 2004 data (solid lines) with the 2006 per-note data (dotted lines).]

a-half billion notes. The data are at the level of office location, date, shift,
supervisor, and denomination. For a particular denomination, we assume
that summing these data over all dates, supervisors, and shifts generates a
precise estimate of the steady-state marginal distribution over each quality
level. Figures 1 and 2 plot the marginal distributions over soil level, ink wear
and graffiti for the combined January and May 2004 data, for $5 and $10 notes
(solid lines).15
The raw data have 26 quality levels for each category. However, for many
quality levels there are very few notes, and for speed of computation it is advantageous to decrease the number of quality levels. For each denomination
15 The same data set covers $20 notes, but as described in the Conclusion, our limited analysis of the 20s has not used this data.


and each category (e.g., SLF) we have, therefore, combined multiple quality
levels into one. For example, our new soil level zero for the $10 notes includes
all notes with soil levels zero through 2 in the data. Table 1 contains comprehensive information about how we combine quality levels. Boxes around
multiple quality levels indicate that we have combined them, and the columns
labeled “q” contain the quality level numbers corresponding to our smaller set
of quality levels. After combining in this way, we are left with between 7 and
13 quality levels for each denomination and category. For each denomination
and each dimension, there are three unfit quality levels. For example, for the
$5 notes SLF, quality levels 9, 10, and 11 are unfit.

Per-Note Data
In addition to the comprehensive data set describing marginal distributions,
we use per-note data sets covering approximately 45,000 notes each of $5
notes and $10 notes. These data were gathered at nine Federal Reserve offices
in February and March 2006. For each note, there is information on the date
of issue, as well as quality level in at least 21 categories, including SLF, GWF,
and IWWF. The dotted lines in Figures 1 and 2 are the marginal quality
distributions for SLF, IWWF, and GWF from the per-note data for the $5 and
$10 notes. There are minor differences relative to the marginal distributions
from the large data set, but the broad patterns are the same. This gives us some
confidence that the per-note data are representative samples.
Because the note data contain date of issue for each note, we are able to
get an estimate of the age distribution of notes. In Figure 3, the jagged dotted
line is a smoothed version of the age distribution of unfit notes from the note
data. The smoothing method involves taking a three-month moving average.
Without smoothing, the age distributions would be extremely choppy. Note
that in Figure 3, we plot the age distribution of unfit notes. It is the unfit notes
with which we are most concerned for this study, and whose age distribution
we care most about matching with the model. Unfit notes are those notes
whose quality is worse than the shred threshold in any dimension—major or
minor.

4. CHOOSING THE MODEL'S PARAMETERS

There are Q² + 3Q + 1 parameters in each model; they comprise the Q²
elements of π, the 3Q elements of α, ρ, and g, and the single parameter γ.16
Since Q is between 7 and 13, the number of parameters is between 71 and
209. We select the model's parameters in several stages.17
16 Recall that Q is either n_s, n_i, or n_g, depending on the version of the model.
17 Because our approach to selecting parameters is ad hoc, we hesitate to talk about "estimating the model." However, in effect that is what we are doing.

Table 1 Marginal Quality Distributions and Combined Quality Levels

Quality          $5 Notes                       $10 Notes
Level      SLF     IWWF    GWF            SLF     IWWF    GWF
0          0.000   0.059   0.000          0.000   0.040   0.001
1          0.000   0.342   0.333          0.000   0.300   0.602
2          0.000   0.121   0.454          0.001   0.106   0.257
3          0.004   0.086   0.138          0.014   0.084   0.078
4          0.039   0.067   0.036          0.068   0.075   0.030
5          0.094   0.054   0.015          0.122   0.069   0.012
6          0.116   0.044   0.008          0.143   0.063   0.006
7          0.128   0.037   0.005          0.157   0.056   0.004
8          0.140   0.031   0.003          0.153   0.049   0.003
9          0.139   0.027   0.002          0.130   0.041   0.002
10         0.122   0.023   0.002          0.096   0.033   0.001
11         0.093   0.019   0.001          0.058   0.026   0.001
12         0.056   0.017   0.001          0.027   0.019   0.001
13         0.027   0.014   0.001          0.012   0.013   0.001
14         0.013   0.011   0.001          0.006   0.009   0.000
15         0.007   0.009   0.000          0.004   0.006   0.000
16         0.005   0.007   0.000          0.003   0.003   0.000
17         0.004   0.005   0.000          0.002   0.002   0.000
18         0.003   0.004   0.000          0.002   0.001   0.000
19         0.002   0.004   0.000          0.001   0.001   0.000
20         0.002   0.003   0.000          0.001   0.000   0.000
21         0.002   0.002   0.000          0.000   0.000   0.000
22         0.001   0.002   0.000          0.000   0.000   0.000
23         0.001   0.002   0.000          0.000   0.000   0.000
24         0.001   0.001   0.000          0.000   0.000   0.000
25         0.000   0.008   0.000          0.000   0.001   0.000

Notes: Each column gives the fraction of notes at each of the 26 raw quality levels. In the original table, boxes around groups of raw levels and the accompanying columns labeled "q" show how those levels are combined into the smaller set of model quality levels (between 7 and 13 per denomination and dimension); for example, raw soil levels 0 through 2 for the $10 notes form the new level 0.

[Figure 3: Age Distribution of Unfit Notes. Two panels ($5 notes and $10 notes) plot the fraction of unfit notes by age in months (0 to 80), comparing the smoothed per-note data with the predictions of the SLF, IWWF, and GWF models.]

First, we make some a priori assumptions on the transition matrix π that
decrease the number of free parameters. Next, we pin down g, α, γ , and ρ
based on information from the Federal Reserve System’s Currency Technology
Office, the Federal Reserve Board, and preliminary analysis of the data. We
select the remaining parameters so that the model’s steady-state distribution
matches the quality and age distributions in Figures 1–3.
At this point, it may be useful to remind the reader where we are: we
have specified a model of the evolution of currency quality, and we will now
use data from the period before implementation of the currency recirculation
policy in order to choose parameters of the model. Once the parameters have
been chosen, we will simulate the model under particular assumptions about
how DI behavior will change in response to the recirculation policy. The
recirculation policy itself is “outside the model”; the model does not address
pricing of currency processing by the Fed, and the model does not address
(intraweek) cross-shipping because it is specified at a monthly frequency.


A Priori Restrictions on π
We reduce the number of parameters determining π by imposing the restriction
that notes never improve in quality, except that soil level may “improve” to
zero if a note is laundered (i.e., the note has gone through a washing machine).
This restriction means that almost half the elements of π are zeros. For the
ink wear and graffiti model, all elements above the main diagonal are zero.
For the soil level model, the elements above the main diagonal are zero except
in the first row, which may contain nonzero elements in every column to
account for the possibility of laundered notes; in the first column, the first
row contains a one and all other rows contain zeros, because a laundered
note always remains laundered. The numbers of nonzero elements in π are
thus n_s(n_s + 1)/2 + n_s − 1, n_i(n_i + 1)/2, and n_g(n_g + 1)/2 for the three models. The last
restriction we impose on π is an inherent feature of the model: the columns
of π must sum to one, and π is a stochastic matrix with each element weakly
between zero and one. This adds Q restrictions, subtracting an equal number
of parameters.
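
These zero restrictions are easy to encode as a mask of elements allowed to be nonzero. The sketch below (our own illustration) builds such a mask for Q quality levels and reproduces the counts given above.

import numpy as np

def pi_mask(Q, dimension):
    """Boolean matrix marking the elements of pi that may be nonzero.

    Element (r, c) is the probability of moving from quality c to quality r,
    so "quality never improves" keeps only the lower triangle (with diagonal).
    """
    free = np.tril(np.ones((Q, Q), dtype=bool))
    if dimension == 'soil':
        # Laundering: the first row may be nonzero in every column.  The first
        # column is further pinned down (a one at the top, zeros below),
        # because a laundered note always remains laundered.
        free[0, :] = True
    return free

# pi_mask(12, 'soil').sum() == 12 * 13 // 2 + 11, the count stated in the text.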

Choosing α, g, ρ, and γ
The Federal Reserve chooses the definition of fit notes, so there would be no
difficulty determining α if the model were specified in terms of all quality
dimensions simultaneously; α q would be one for fit notes and zero for unfit
notes. However, since we specify the model in terms of only one dimension,
we need to adjust the shred parameter α to reflect the fact that notes may be
unfit even though they are fit according to the dimension in which the model
is specified. For example, if the model is specified in terms of soil level, a
note that is very clean may nonetheless be unfit because of its level of ink
wear. We adjust for this possibility as follows, using the soil level example:
for each fit degree of soil level q, calculate the fraction of notes with soil level
q that are unfit according to other dimensions and subtract that fraction from
α q . That calculation is necessarily based on the per-note data, as it requires
going beyond marginal distributions. The corrections we make to α are shown
in Table 2.
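
A sketch of that adjustment (our own illustration; the per-note sample is represented by two hypothetical arrays, one holding each note's soil level and one flagging whether the note is unfit along any other dimension):

import numpy as np

def corrected_alpha(alpha_machine, slf, unfit_elsewhere):
    """Lower alpha for fit SLF levels by the share of such notes unfit elsewhere.

    alpha_machine   : (Q,) zero/one vector from the machine settings (1 = fit by SLF)
    slf             : (N,) soil level of each note in the per-note sample
    unfit_elsewhere : (N,) True if the note is unfit in any other dimension
    """
    alpha = alpha_machine.astype(float)
    for q in np.flatnonzero(alpha_machine == 1):
        in_level = (slf == q)
        if in_level.any():
            alpha[q] -= unfit_elsewhere[in_level].mean()
    return alpha
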
The vector g represents the quality distribution of newly printed notes. Our
estimates of g are from the Federal Reserve System’s Currency Technology
Office (unpublished data), and these are presented in Table 3. Sorting behavior
by DIs is captured by the vector ρ.

Table 2 Corrections to α Vector

                 $5 Notes                        $10 Notes
q        SLF      IWWF     GWF           SLF      IWWF     GWF
0        0        0.0850   0.0624        0        0.0255   0.0374
1        0.0375   0.1080   0.1215        0.0215   0.0545   0.1142
2        0.0640   0.1445   0.2795        0.0106   0.0654   0.2294
3        0.0867   0.1754   0.5783        0.0120   0.0868   0.3392
4        0.1150   0.1801   0             0.0272   0.0844   0
5        0.1417   0.2048   0             0.0389   0.0857   0
6        0.1852   0.2109   0             0.0622   0.0878   0
7        0.2546   0.2461   0             0.1132   0.1036   —
8        0.3658   0.2913   —             0.1890   0.1251   —
9        0        0        —             0        0.1496   —
10       0        0        —             0        0        —
11       0        0        —             0        0        —
12       —        —        —             —        0        —

Notes: Dashes mark quality levels that do not exist for that denomination and dimension.

Table 3 Quality Distribution of New Notes

$5 Notes
  SLF:  q = 1: 0.010, q = 2: 0.695, q = 3: 0.295 (all other levels 0)
  IWWF: q = 0: 1 (all other levels 0)
  GWF:  q = 0: 0.935, q = 1: 0.065 (all other levels 0)

$10 Notes
  SLF:  q = 1: 0.965, q = 2: 0.035 (all other levels 0)
  IWWF: q = 0: 1 (all other levels 0)
  GWF:  q = 0: 1 (all other levels 0)

We assume that DIs do not sort, which implies that all elements of ρ are
identical and are equal to the fraction of notes that DIs deposit each period.18
We set each element of ρ to 0.1165 for the $5 notes and 0.1322 for the $10
notes. These numbers are based on data from the Federal Reserve Board (S.
Ferrari, pers. comm.). Finally, γ is the growth rate of the stock of currency.
We have set the annual growth rate at 1.78 percent for the $5 notes, and 0.38
percent for the $10 notes, again based on data from the Federal Reserve Board
(S. Ferrari, pers. comm.).
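
For reference, the reported annual growth rates can be converted to the model's monthly frequency as follows (our own illustration; the compounding convention is our assumption, since the article does not spell it out).

# Monthly currency growth rates implied by the stated annual rates.
gamma_5 = (1 + 0.0178) ** (1 / 12) - 1    # $5 notes, roughly 0.0015 per month
gamma_10 = (1 + 0.0038) ** (1 / 12) - 1   # $10 notes, roughly 0.0003 per month
rho_5, rho_10 = 0.1165, 0.1322            # monthly deposit probabilities from the text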

Matching the Quality and Age Data
We select the remaining parameters of the matrix π—for each specification
of the model—so that the model’s steady-state distribution matches as closely
as possible two features of the data. First, we want to match the marginal
quality distribution from the 2004 comprehensive data (Figures 1 and 2, solid
line). Second, we want to match the age distribution of unfit notes from the
2006 per-note data (Figure 3). Concretely, we select the parameters of π to
minimize a weighted average of (i) the sum of squared deviations between the
marginal quality distribution and that predicted by the model, and (ii) the sum
of squared deviations between the unfit age distributions and that predicted by
18 A recent internal Federal Reserve study confirmed that DIs have not been sorting to any
appreciable extent, as the quality distribution of currency that the Federal Reserve receives from
DIs is close to the quality distribution of currency in circulation (Board of Governors of the Federal
Reserve System 2007). However, the recirculation policy—in particular, the fee for cross-shipping
fit currency—gives DIs an incentive to sort. We address this issue in the Conclusion.


Table 4 π Matrix for $5 Notes According to GWF

  q      0        1        2        3        4        5        6        7
  0    0.9469   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
  1    0.0531   0.9755   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
  2    0.0000   0.0224   0.9647   0.0000   0.0000   0.0000   0.0000   0.0000
  3    0.0000   0.0000   0.0353   0.9945   0.0000   0.0000   0.0000   0.0000
  4    0.0000   0.0022   0.0000   0.0000   0.8828   0.0000   0.0000   0.0000
  5    0.0000   0.0000   0.0000   0.0054   0.1148   0.8294   0.0000   0.0000
  6    0.0000   0.0000   0.0000   0.0001   0.0024   0.1706   0.9995   0.0000
  7    0.0000   0.0000   0.0000   0.0000   0.0001   0.0000   0.0005   1.0000

Notes: The row r, column c element of this matrix is the probability that a note will
become quality r next period, conditional on it being of quality c in this period. For
example, the probability that a note will be of quality 4 in the next period, given that it
is quality 1 in this period is 0.0022, the element in row 4, column 1.

the model.19 Table 4 contains one example of the π matrix; it is for the GWF
model of $5 notes.
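
The selection criterion can be written compactly. The sketch below is our own illustration; steady_state_and_ages stands for a hypothetical helper that maps a candidate pi (together with the fixed rho, alpha, g, and gamma) into the model's steady-state quality distribution and age distribution of unfit notes.

import numpy as np

def fit_criterion(pi_candidate, data_quality, data_unfit_ages, weight,
                  steady_state_and_ages):
    """Weighted sum of squared deviations between model and data.

    data_quality    : (Q,) marginal quality distribution from the 2004 data
    data_unfit_ages : (K,) smoothed age distribution of unfit notes, 2006 per-note data
    weight          : relative weight on the quality-distribution component
    """
    f_model, ages_model = steady_state_and_ages(pi_candidate)
    quality_part = np.sum((f_model - data_quality) ** 2)
    age_part = np.sum((ages_model[:len(data_unfit_ages)] - data_unfit_ages) ** 2)
    return weight * quality_part + (1 - weight) * age_part
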
With respect to the marginal quality distributions, we have no trouble
matching the data. In all of the model specifications, we match the marginal
quality distributions nearly perfectly. The age distributions are a different
matter, which perhaps is not surprising given their choppiness in the data—
the model wants to make the age distribution of unfit notes smooth. Figure 3
plots the age distributions implied by each specification of the model, along
with the age distributions from the data.20 With the exception of the SLF
model for $5 notes, the age distributions implied by the model involve too
many unfit notes more than approximately four years old.

5. SIMULATING A CHANGE IN DI BEHAVIOR

Because the response of the quality distribution to a decrease in deposit rates
depends on the transition matrix π , the fact that we have multiple models
means that we generate a range of responses to a decrease in deposit rates.
Figures 4 and 5 plot the time series for the fraction of unfit notes, in response
to 20 and 40 percent decreases in DIs’ deposit rates, respectively. According
to the final Currency Recirculation Policy (Board of Governors of the Federal
Reserve System 2006b), of the $10 and $20 notes processed by the Fed in
19 We have also experimented with adding to our estimation criterion the fraction of age k

notes that are unfit, for k = 1, 2, ... For moderate weights on this component the results are not
materially affected.
20 In Figure 3, the lines associated with the model stop at 56 months because we did not
attempt to match the age distribution beyond 56 months.


[Figure 4: Response to 20 Percent Decrease in Deposit Rate. Two panels plot the transition paths of the cumulative change in the fraction of unfit $5 and $10 notes over the 80 months after DI deposit rates fall 20 percent, for the SLF, IWWF, and GWF models.]

2004, 40.4 percent were cross-shipped. Thus, a 40 percent decrease in deposits corresponds to DIs ceasing entirely to cross-ship. This seems unlikely,
so we view the 40 percent number as an upper bound on the effect of the recirculation policy. In addition, cross-shipping is likely more important for $20
notes than $10 notes, because of the necessity of having crisp (fit) $20 notes
in ATMs. Since the DIs always receive fit notes from the Federal
Reserve System, a larger volume of $20 notes is cross-shipped than any other
denomination.21 Thus, the 40.4 percent upper bound for $10 notes and $20
notes combined is higher than the upper bound for the $10 notes or $5 notes.
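
Mechanically, each experiment amounts to scaling every element of ρ down and iterating (7) forward from the old steady state. A sketch (our own illustration; the index array unfit lists the unfit quality levels for the dimension being modeled):

import numpy as np

def transition_unfit_path(f_star, pi, rho, alpha, g, gamma, unfit,
                          cut=0.20, months=80):
    """Cumulative change in the unfit fraction after deposit rates fall by `cut`."""
    rho_new = (1 - cut) * rho
    f = f_star.copy()
    baseline = f_star[unfit].sum()
    path = []
    for _ in range(months):
        f = (pi @ ((1 - rho_new) * f) + alpha * rho_new * f
             + ((1 - alpha) * rho_new * f).sum() * g + gamma * g) / (1 + gamma)
        path.append(f[unfit].sum() - baseline)
    return np.array(path)
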
Each line in Figures 4 and 5 represents the transition path for the fraction
of unfit notes for a different major dimension model (soil level, ink wear,
graffiti). In response to a 20 percent decrease in the deposit rate, the models
predict a long-run increase in the fraction of unfit notes of between 0.017
and 0.025 for the $5 notes (i.e., around two percentage points), and between
0.008 and 0.018 for the $10 notes. In our large data sets, the total fractions
21 In 2005, the volume of $5, $10, and $20 notes that were cross-shipped were 12.7 percent,
9.0 percent, and 78.3 percent, respectively.


[Figure 5: Response to 40 Percent Decrease in Deposit Rate. Two panels plot the transition paths of the cumulative change in the fraction of unfit $5 and $10 notes over the 80 months after DI deposit rates fall 40 percent, for the SLF, IWWF, and GWF models.]

of unfit notes are 0.173 for the $5 notes and 0.150 for the $10 notes. Note
that the model that provides the best fit to the age distribution ($5 SLF) is
also the model that predicts the largest increase in the fraction of unfit notes,
0.025. Not surprisingly, a 40 percent decrease in deposit rates generates a
larger increase in the fraction of unfit notes—between 0.044 and 0.055 for the
$5 notes and between 0.019 and 0.044 for the $10 notes.
Figures 6, 7, 8, and 9 provide a different perspective on the effects of
a decrease in deposit rates. These figures plot on the same panel the initial
steady-state quality distribution (prior to the drop in deposit rates) and the new
steady-state quality distribution corresponding to the lower deposit rate. For
the 20 percent experiment (Figures 6 and 7), the long-run effects on quality
are generally small, reinforcing the message of Figure 4. There are, however,
certain quality levels that are strongly affected. For example, the fraction of
$10 notes at soil level 6 (in Figure 7) eventually rises from 0.13 to 0.1832 in
response to the 20 percent drop in deposits. For the 40 percent experiment,
things look somewhat more dramatic: for example, the fraction of $10 notes
at soil level 6 increases from 0.13 to 0.27 (in Figure 9). To put this change in
perspective though, Table 2 tells us that only 6.2 percent of the level 6 SLF


[Figure 6: Effect of 20 Percent Deposit Rate Decrease on Quality Distributions of $5 Notes. Three panels (SLF, IWWF, GWF) compare the old and new steady-state marginal distributions over quality levels.]

$10 notes are unfit, so the big increase in notes at that level (which is still fit
according to SLF) brings with it an increase of less than one percentage point
in unfit notes. Recall that the change in total fraction of unfit notes is shown
in Figures 4 and 5.
If the Fed wished to offset the quality deterioration caused by a decrease
in deposit rates, a natural policy would be to shred notes of higher quality.
Table 5 displays scenarios for the fraction of notes to shred at each quality level
in order to maintain the fraction of unfit notes at its old steady-state level.
For example, if deposit rates fall 20 percent, our SLF model for $5 notes
implies that shredding all notes in the worst-fit category and shredding 35
percent of notes in the second worst-fit category would counteract the deposit
decrease, leaving the fraction of unfit notes unchanged.
should be read independently, as they each apply to distinct models. In other
words, the column labeled $5 SLF provides a policy change for SLF that is
predicted to bring about a stable fraction of unfit notes; no changes are made


Table 5 Policy Response to Offset Effect of Deposit Rate Decrease

20 Percent Decrease in Deposits: Fraction of Notes to Shred

              $5 Notes               $10 Notes
q         SLF       IWWF         SLF       IWWF
0         1         0            1         0
1         0         0            0         0
2         0         0            0         0
3         0         0            0         0
4         0         0            0         0
5         0         0            0         0
6         0         0            0         0
7         0.3512    0.0255       0         0
8         1         1            0.545     0
9         1         1            1         0.6845
10        1         1            1         1
11        1         1            1         1
12        —         —            —         1

40 Percent Decrease in Deposits: Fraction of Notes to Shred

              $5 Notes               $10 Notes
q         SLF       IWWF         SLF       IWWF
0         1         0            1         0
1         0         0            0         0
2         0         0            0         0
3         0         0            0         0.22
4         0         0            0         1
5         0.198     0            0         1
6         1         0.48         0         1
7         1         1            0.125     1
8         1         1            1         1
9         1         1            1         1
10        1         1            1         1
11        1         1            1         1
12        —         —            —         1

Notes: Dashes mark quality levels that do not exist for that denomination and dimension.

to shred thresholds for other dimensions. Note that we have omitted GWF
from the analysis in Table 5; we were not successful in finding policies that
counteracted the quality decline by changing the shred policy for GWF. In
order to counteract the effects of a 40 percent decrease in deposits, Reserve
Banks would have to shred currency at significantly higher quality levels,
depending on the particular model specification. In the most extreme case,
which is the IWWF model for $10 notes, the worst six levels of fit notes would
have to be shredded (quality levels four through nine), and 22 percent of notes
at quality level 3 would have to be shredded to prevent overall quality from
deteriorating. Recall, however, that the 40 percent decrease in deposit rates


[Figure 7: Effect of 20 Percent Deposit Rate Decrease on Quality Distributions of $10 Notes. Three panels (SLF, IWWF, GWF) compare the old and new steady-state marginal distributions over quality levels.]

represents an upper bound on how we expect DIs to change their behavior in
response to the recirculation policy.

6. CONCLUSION

The quality of currency in circulation is an important policy objective for the
Federal Reserve. Changes in the behavior of depository institutions, whether
caused by Fed policy or by independent factors, can have implications for the
evolution of currency quality. Currently the Fed is implementing a recirculation policy, which is expected to cause changes in the behavior of DIs and,
therefore, affect currency quality. The mechanical model of currency quality
in this article can be used to study the effects of changes in DI behavior and
changes in Fed policy on the quality distribution of currency. In general, the
model predicts relatively modest responses of currency quality to decreases in
DI deposit rates that are anticipated to occur as a consequence of the
recirculation policy.

[Figure 8: Effect of 40 Percent Deposit Rate Decrease on Quality Distributions of $5 Notes. Three panels (SLF, IWWF, GWF) compare the old and new steady-state marginal distributions over quality levels.]

For $5 and $10 notes, our model is able to match the marginal
quality distributions perfectly, and the age distributions of unfit notes reasonably well. Thus, we have some confidence in the range of predictions that the
different model specifications make for the effects on currency quality of a
decrease in deposit rates. In what follows, we discuss potential extensions to
the current analysis.
Although our framework allows for sorting by DIs, the quantitative analysis has assumed no sorting occurs. If DIs do sort, then the researcher must
take into account that the distribution of currency in circulation is not the same
as the distribution of currency that visits the Fed. The derivations in this report
do not differentiate between the two distributions, but it is straightforward to
do so. If DIs were to sort using the same criteria as the Federal Reserve, then it
is likely that the results presented here would overstate the decline in currency
quality following implementation of the recirculation policy; by depositing
with the Fed only low-quality notes, DIs would offset the deleterious effect of


[Figure 9: Effect of 40 Percent Deposit Rate Decrease on Quality Distributions of $10 Notes. Three panels (SLF, IWWF, GWF) compare the old and new steady-state marginal distributions over quality levels.]

depositing fewer notes. The recirculation policy clearly provides an incentive
for at least some DIs to sort because it imposes fees for cross-shipment of fit
currency only.
Our analysis has not addressed $20 notes. Figure 10 illustrates the difficulty they present: they are not in a steady state but are transiting from the old
to the new design. Of the old notes, more than 10 percent are unfit, whereas
of the new notes, less than 3 percent are unfit. All the old notes are more
than two years old, whereas all the new notes are less than three years old.
Our model is not inherently restricted to steady state situations. To apply it
to the 20s, one would want to use the form of the model in (7) and also allow
for γ (the growth rate of currency) to be time-varying or at least allow γ to
vary across designs. The non-steady-state form of the model (7) also could be
useful more generally, in providing a check on our estimates. If there is good
data on marginal quality distributions available monthly, then that data can be
used to generate forecast errors for the model on a real-time basis.


[Figure 10: Age Distributions of Unfit $20 Notes. Two panels plot the fraction of unfit notes by age in months: old design (10.08 percent of notes unfit) and new design (2.76 percent of notes unfit).]

One reason to question the steady-state assumption is the possibility that
the payments system is in the midst of a transition away from the use of
currency and toward electronic forms of payment. Although it is difficult to
distinguish a change in the trend from a transitory shock, data on the stock
of currency does give some credence to this concern: from 2002 to 2007
the growth rate of currency has declined steadily, and at 2 percent for the 12
months ending in June 2007 it is currently growing more slowly than most
measures of nominal spending. A decreasing currency growth rate means that
there is a decreasing rate of new notes introduced into circulation. This would
likely require stronger measures by the Federal Reserve to maintain currency
quality in response to a decrease in deposit rates.
The version of the model estimated here is very small and easy to estimate.
Expanding the model so that it describes the joint distribution of all three
quality dimensions studied here leads to an unmanageably large system. A
middle ground that might be worth pursuing would be to specify the model in


terms of two dimensions, say graffiti and soil level, and include information
about unfitness in other dimensions, as we have done here.
Finally, it would be useful to embed the currency quality model of this
article in an economic model of DIs and households. The DIs’ deposit rate
and sorting policy (both summarized by ρ) would then be endogenously determined. Such a model could be used to predict the effects of a change in
the Federal Reserve’s pricing policy on DI behavior. It could also be used
to conduct welfare analysis of different pricing and shredding policies. The
model in Lacker (1993) is a natural starting point.

APPENDIX: DETAILS OF CALCULATING AGE DISTRIBUTION

It is straightforward to compute the age distribution of notes for any quality
level and the quality distribution at any age. Begin by defining the fraction of
notes at quality level q and age k to be hq,k . These fractions satisfy
1 = Σ_{k=0}^{∞} Σ_{q=1}^{Q} h_{q,k}.   (13)

For convenience, define h_k to be the Q×1 vector containing in element q the
fraction of notes that are k periods old and in quality level q:

h_k = [h_{1,k}, h_{2,k}, ..., h_{Q,k}]′.   (14)

We also have that h_{q,k} = e_q′ h_k, where e_q is a Q×1 selection vector with a
1 in the q-th element and zeros elsewhere.
The fraction of brand-new notes is

Σ_{q=1}^{Q} h_{q,0} = [ γ + Σ_{j=1}^{Q} (1 − α_j) ρ_j f*_j ] / (1 + γ),   (15)

and since the quality distribution of new notes is g, the fractions of notes that
are new and in each quality level q are
h_0 = ( [ γ + Σ_{j=1}^{Q} (1 − α_j) ρ_j f*_j ] / (1 + γ) ) · g.   (16)


For one-period old notes, the fractions are
h_1 = ( [ π · diag(1 − ρ) + diag(α ⊙ ρ) ] / (1 + γ) ) · h_0.   (17)

Likewise, we have
h_{k+1} = ( [ π · diag(1 − ρ) + diag(α ⊙ ρ) ] / (1 + γ) )^{k+1} · h_0, for k = 0, 1, ...,   (18)

with h_0 determined by (16). Thus, we can calculate the fraction of notes at
any age-quality combination as

h_{q,k} = e_q′ ( [ π · diag(1 − ρ) + diag(α ⊙ ρ) ] / (1 + γ) )^{k} · h_0.   (19)

The age distribution of quality-q notes is

(1 / Σ_{k=0}^{∞} h_{q,k}) · [h_{q,0}, h_{q,1}, ..., h_{q,∞}]′,   (20)

and the quality distribution of age-k notes is

(1 / Σ_q h_{q,k}) · [h_{1,k}, h_{2,k}, ..., h_{Q,k}]′.   (21)
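
A sketch of the age calculation in (16) through (19) (our own illustration), returning the fractions h_{q,k} up to a chosen maximum age:

import numpy as np

def age_quality_fractions(f_star, pi, rho, alpha, g, gamma, max_age):
    """Return H with H[k] equal to the vector h_k, for k = 0, ..., max_age."""
    survive = (pi @ np.diag(1 - rho) + np.diag(alpha * rho)) / (1 + gamma)
    h0 = (gamma + ((1 - alpha) * rho * f_star).sum()) / (1 + gamma) * g  # eq. (16)
    H = np.empty((max_age + 1, len(g)))
    H[0] = h0
    for k in range(max_age):
        H[k + 1] = survive @ H[k]                                        # eq. (18)
    return H

# Age distribution of unfit notes: sum over the unfit quality levels and renormalize,
#   unfit_age = H[:, unfit].sum(axis=1); unfit_age = unfit_age / unfit_age.sum()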

REFERENCES
Board of Governors of the Federal Reserve System. 2003a. “Federal Reserve
Bank Currency Recirculation Policy. Request for Comment, notice.”
Docket No. OP-1164, October 7. Available at:
http://www.federalreserve.gov/Boarddocs/press/other/2003/20031008/
attachment.pdf (accessed July 13, 2007).
Board of Governors of the Federal Reserve System. 2003b. Press release,
October 8. Available at:
http://www.federalreserve.gov/boarddocs/press/other/2003/20031008/
default.htm (accessed July 13, 2007).
Board of Governors of the Federal Reserve System. 2006a. “Appendix C.
Currency Budget.” Annual Report: Budget Review: 31.


Board of Governors of the Federal Reserve System. 2006b. “Federal
Reserve Currency Recirculation Policy. Final Policy.” Docket No.
OP-1164, March 17. Available at:
http://www.federalreserve.gov/newsevents/press/other/other20060317
a1.pdf (accessed July 13, 2007).
Bureau of Engraving and Printing. 2007. “Money Facts, Fun Facts, Did You
Know?” Available at: http://www.bep.treas.gov/document.cfm/18/106
(accessed October 30, 2007).
Federal Reserve Bank of New York. 2006. “Currency Processing and
Destruction.” Available at:
http://www.ny.frb.org/aboutthefed/fedpoint/fed11.html (accessed May
30, 2007).
Federal Reserve Bank of San Francisco. 2006. “Cash Counts.” Annual
Report: 6–17.
Federal Reserve System, Currency Quality Work Group. 2007. “Federal
Reserve Bank Currency Quality Monitoring Program.” Internal memo,
June.
Ferrari, Shaun. 2005. Division of Reserve Bank Operations and Payment
Systems, Board of Governors of the Federal Reserve System. E-mail
message to author, September 9, 2005.
Friedman, Milton. 1969. Optimum Quantity of Money: And Other Essays.
Chicago, IL: Aldine Publishing Company.
Klein, Raymond M., Simon Gadbois, and John J. Christie. 2004. “Perception
and Detection of Counterfeit Currency in Canada: Note Quality,
Training, and Security Features.” In Optical Security and Counterfeit
Deterrence Techniques V, ed. Rudolf L. van Renesse, SPIE Conference
Proceedings, vol. 5310.
Lacker, Jeffrey M. 1993. “Should We Subsidize the Use of Currency?”
Federal Reserve Bank of Richmond Economic Quarterly 79 (1): 47–73.
Lacker, Jeffrey M., and Alexander L. Wolman. 1997. “A Simple Model of
Currency Quality.” Mimeo, Federal Reserve Bank of Richmond
(November).
Stokey, Nancy L., Robert E. Lucas, Jr., with Edward C. Prescott. 1989.
Recursive Methods in Economic Dynamics. Cambridge, MA: Harvard
University Press.

Economic Quarterly—Volume 93, Number 4—Fall 2007—Pages 393–412

Non-Stationarity and
Instability in Small
Open-Economy Models
Even When They Are
“Closed”
Thomas A. Lubik

Open economies are characterized by the ability to trade goods both
intra- and intertemporally, that is, their residents can move goods
and assets across borders and over time. These transactions are
reflected in the current account, which measures the value of a country’s
export and imports, and its mirror image, the capital account, which captures
the accompanying exchange of assets. The current account serves as a shock
absorber, which agents use to optimally smooth their consumption. The means
for doing so are borrowing and lending in international financial markets. It
almost goes without saying that international macroeconomists have had a
long-standing interest in analyzing the behavior of the current account.
The standard intertemporal model of the current account conceives a small
open economy as populated by a representative agent who is subject to fluctuations in his income. By having access to international financial markets,
the agent can lend surplus funds or make up shortfalls for what is necessary
to maintain a stable consumption path in the face of uncertainty. The international macroeconomics literature distinguishes between an international asset
market that is incomplete and one that is complete. The latter describes a
modeling framework in which agents have access to a complete set of state-contingent securities (and, therefore, can share risk perfectly); when markets
I am grateful to Andreas Hornstein, Alex Wolman, Juan Carlos Hatchondo, and Nashat Moin
for comments that improved the article. I also wish to thank Jinill Kim and Martín Uribe
for useful discussions and comments which stimulated this research. The views expressed in
this article are those of the author, and do not necessarily reflect those of the Federal Reserve
Bank of Richmond or the Federal Reserve System. E-mail: Thomas.Lubik@rich.frb.org.


are incomplete, on the other hand, agents can only trade in a restricted set of
assets, for instance, a bond that pays fixed interest.
The small open-economy model with incomplete international asset markets is the main workhorse in international macroeconomics. However, the
baseline model has various implications that may put into question its usefulness in studying international macroeconomic issues. When agents decide on
their intertemporal consumption path they trade off the utility-weighted return
on future consumption, measured by the riskless rate of interest, against the
return on present consumption, captured by the time discount factor. The basic set-up implies that expected consumption growth is stable only if the two
returns exactly offset each other, that is, if the product of the discount factor
and the interest rate equal one. The entire optimization problem is ill-defined
for arbitrary interest rates and discount factors as consumption would either
permanently decrease or increase.1
Given this restriction on two principally exogenous parameters, the model
then implies that consumption exhibits random-walk behavior since the effects
of shocks to income are buffered by the current account to keep consumption
smooth. The random-walk in consumption, which is reminiscent of Hall’s
(1978) permanent income model with linear-quadratic preferences, is problematic because it implies that all other endogenous variables inherit this nonstationarity so that the economy drifts over time arbitrarily far away from its
initial condition. To summarize, the standard small open-economy model with
incomplete international asset markets suffers from what may be labelled the
unit-root problem. This raises several issues, not the least of which is the overall validity of the solution in the first place, and its usefulness in conducting
business cycle analysis.
In order to avoid this unit-root problem, several solutions have been suggested in the literature. Schmitt-Grohé and Uribe (2003) present an overview
of various approaches. In this article, I am mainly interested in inducing stationarity by assuming a debt-elastic interest rate. Since this alters the effective
interest rate that the economy pays on foreign borrowing, the unit root in the
standard linearized system is reduced incrementally below unity. This preserves a high degree of persistence, but avoids the strict unit-root problem.
Moreover, a debt-elastic interest rate has an intuitive interpretation as an endogenous risk premium. It implies, however, an additional, essentially ad hoc
feedback mechanism between two endogenous variables. Similar to the literature on the determinacy properties of monetary policy rules or models with
1 Conceptually, the standard current account model has a lot of similarities to a model of
intertemporal consumer choice with a single riskless asset. The literature on the latter gets around
some of the problems detailed here by, for instance, imposing borrowing constraints. Much of that
literature is, however, mired in computational complexities as standard linearization-based solution
techniques are no longer applicable.


increasing returns to scale, the equilibrium could be indeterminate or even
non-existent.
I show in this article that commonly used specifications of the risk premium do not lead to equilibrium determinacy problems. In all specifications,
indeterminacy of the rational expectations equilibrium can be ruled out, although in some cases there can be multiple steady states. It is only under
a specific assumption on whether agents internalize the dependence of the
interest rate on the net foreign asset position that no equilibrium may exist.
I proceed by deriving, in the next section, an analytical solution for the
(linearized) canonical small open-economy model which tries to illuminate the
extent of the unit-root problem. Section 2 then studies the determinacy properties of the model when a stationarity-inducing risk-premium is introduced.
In Section 3, I investigate the robustness of the results by considering different
specifications that have been suggested in the literature. Section 4 presents
an alternative solution to the unit-root problem via portfolio adjustment costs,
while Section 5 summarizes and concludes.

1. THE CANONICAL SMALL OPEN-ECONOMY MODEL
Consider a small open economy that is populated by a representative agent2
whose preferences are described by the following utility function:
E_0 Σ_{t=0}^{∞} β^t u(c_t),   (1)

where 0 < β < 1 and Et is the expectations operator conditional on the
information set at time t. The period utility function u obeys the usual Inada
conditions which guarantee strictly positive consumption sequences {c_t}_{t=0}^{∞}.
The economy’s budget constraint is
c_t + b_t ≤ y_t + R_{t−1} b_{t−1},   (2)

where y_t is stochastic endowment income; R_t is the gross interest rate at
which the agent can borrow and lend b_t on the international asset market. The
initial condition is a given value of b_{−1}. In the canonical model, the interest rate is taken
parametrically.
The agent chooses consumption and net foreign asset sequences {c_t, b_t}_{t=0}^{∞}
to maximize (1) subject to (2). The usual transversality condition applies.
First-order necessary conditions are given by
u′(c_t) = β R_t E_t u′(c_{t+1}),   (3)

2 In what follows, I use the terms “agent,” “economy,” and “country,” interchangeably. This

is common practice in the international macro literature and reflects the similarity between small
open-economy models and partial equilibrium models of consumer choice.


and the budget constraint (2) at equality. The Euler equation is standard. At
the margin, the agent is willing to give up one unit of consumption, valued by
its marginal utility, if he is compensated by an additional unit of consumption
next period augmented by a certain (properly discounted) interest rate, and
evaluated by its uncertain contribution to utility. Access to the international
asset market thus allows the economy to smooth consumption in the face of
uncertain domestic income. Since the economy can only trade in a single
asset such a scenario is often referred to as one of “incomplete markets.”
This stands in contrast to a model where agents can trade a complete set of
state-contingent assets (“complete markets”).
In what follows, I assume for ease of exposition that yt is i.i.d. with
mean y, and that the interest rate is constant and equal to the world interest
rate R ∗ > 1. The latter assumption will be modified in the next section.
Given these assumptions a steady state only exists if βR* = 1. Steady-state
consumption is, therefore, c = y + [(1 − β)/β] b. Since consumption is strictly
positive, this imposes a restriction on the admissible level of net foreign assets,
b > −[β/(1 − β)] y. The structure of this model is such that it imposes a restriction on
the two principally structural parameters β and R ∗ , which is theoretically and
empirically problematic; there is no guarantee or mechanism in the model that
enforces this steady-state restriction to hold. Even more so, the steady-state
level of a choice variable, namely net foreign assets b, is not pinned down
by the model’s optimality conditions. Instead, there exists a multiplicity of
steady states indexed by the initial condition b = b−1 .3
Despite these issues, I now proceed by linearizing the first-order conditions
around the steady state for some b. Denoting x̂_t = log x_t − log x and x̃_t =
x_t − x, the linearized system is4

E_t ĉ_{t+1} = ĉ_t,   (4)
c ĉ_t + b̃_t = y ŷ_t + β^{−1} b̃_{t−1}.   (5)

It can be easily verified that the eigenvalues of this dynamic system in (ĉ_t, b̃_t)
are λ_1 = 1, λ_2 = β^{−1} > 1. Since b is a pre-determined variable this results in
a unique rational expectations equilibrium for all admissible parameter values.
The dynamics of the solution are given by (a detailed derivation of the solution
3 In the international real business cycle literature, for instance, Baxter and Crucini (1995),
b is, therefore, often treated as a parameter to be calibrated.
4 Since the interest rate is constant, the curvature of the utility function does not affect the
time path of consumption and, consequently, does not appear in the linearization. Moreover, net
foreign assets are approximated in levels since bt can take on negative values or zero, for which
the logarithm is not defined.


can be found in the Appendix)

ĉ_t = [(1 − β)/β] (b̃_{t−1}/c) + (1 − β)(y/c) ŷ_t,   (6)
b̃_t/c = b̃_{t−1}/c + β (y/c) ŷ_t.   (7)

The contemporaneous effect of a 1 percent innovation to output is to raise foreign
lending as a fraction of steady-state consumption by β(y/c) percent, which
is slightly less than unity in the baseline case b = 0. In line with the permanent
income hypothesis only a small percentage of the increase in income is
consumed presently, so that future consumption can be raised permanently by
(1 − β)/β. The non-stationarity of this solution, the "unit-root problem," is evident
from the unit coefficient on lagged net foreign assets in (7). Temporary innovations have, therefore, permanent effects; the endogenous variables wander
arbitrarily far from their starting values. This also means that the unconditional second moments, which are often used in business cycle analysis to
evaluate a model, do not exist.
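To make the unit-root property concrete, the following sketch (my own illustration, not part of the original analysis) iterates the linearized decision rules (6) and (7) under an assumed calibration of β = 0.98, y = 1, and b = 0 with i.i.d. normal output innovations. The sample variance of net foreign assets keeps growing with the length of the simulation, which is precisely the failure of unconditional second moments just described.

```python
import numpy as np

# Illustrative simulation of the linearized solution (6)-(7).
# Assumed calibration: beta = 0.98, y = 1, steady-state b = 0, so c = y.
beta, y = 0.98, 1.0
c = y
rng = np.random.default_rng(0)

T = 10_000
y_hat = 0.01 * rng.standard_normal(T)   # i.i.d. output innovations (1 percent s.d.)
c_hat = np.zeros(T)                     # log deviation of consumption
b_dev = np.zeros(T)                     # level deviation of net foreign assets

for t in range(1, T):
    # Equation (6): c_hat_t = (1-beta)/beta * b_{t-1}/c + (1-beta)*(y/c)*yhat_t
    c_hat[t] = (1 - beta) / beta * b_dev[t - 1] / c + (1 - beta) * y / c * y_hat[t]
    # Equation (7): b_t = b_{t-1} + beta*y*yhat_t  (a pure random walk)
    b_dev[t] = b_dev[t - 1] + beta * y * y_hat[t]

# The random-walk component shows up as a sample variance that grows with the horizon.
for horizon in (100, 1_000, 10_000):
    print(horizon, np.var(b_dev[:horizon]), np.var(c_hat[:horizon]))
```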
Moreover, the solution is based on an approximation that is technically
only valid in a small neighborhood around the steady state. This condition will
be violated eventually with probability one, thus ruling out the validity of the
linearization approach in the first place. Since an equation system such as (4)–
(5) is at the core of much richer open-economy models, the non-stationarity of
the incomplete markets solution carries over. The unit-root problem thus raises
the question whether (linearized) incomplete market models offer accurate
descriptions of open economies. In the next sections, I study the equilibrium
properties of various modifications to the canonical model that have been used
in the literature to “fix” the unit-root problem.5
2. INDUCING STATIONARITY VIA A DEBT-ELASTIC INTEREST RATE

The unit-root problem arises because of the random-walk property of consumption
in the linearized Euler equation (4). Following Schmitt-Grohé and
Uribe (2003) and Senhadji (2003), a convenient solution is to make the interest
rate the economy faces a function of net foreign assets, R_t = F(b_t − b),
5 In most of the early international macro literature, the unit-root problem tended to be ignored
despite these technical problems, which are valid in principle. The unit root is transferred to the
variables of interest, such as consumption, on the order of the net interest rate, which is quantitatively
very small (in the present example, (1 − β)/β). While second moments do not exist in such a
non-stationary environment, researchers can still compute sample moments to perform business cycle
analysis. Moreover, Schmitt-Grohé and Uribe (2003) demonstrate that the dynamics of the standard
model with and without the random walk in endogenous variables are quantitatively indistinguishable
over a typical time horizon. Their article, thus, gives support for the notion of using the incomplete
market setup for analytical convenience.


where F is decreasing in its argument, b is the steady-state value of net foreign assets, and F(0) = R^∗.
If a country is a net foreign borrower, it pays an interest rate that is higher
than the world interest rate. The reference point for the assessment of the risk
premium is the country’s steady state. Intuitively, b represents the level of
net foreign assets that is sustainable in the long run, either by increasing (if
positive) or decreasing (if negative) steady-state consumption relative to the
endowment.
If a country deviates in its borrowing temporarily from what international
financial markets perceive as sustainable in the long run, it is penalized by
having to pay a higher interest rate than “safer” borrowers. This has the intuitively appealing implication that the difference between the world interest rate
and the domestically relevant rate can be interpreted as a risk premium. The
presence of a debt-elastic interest rate can be supported by empirical evidence
on the behavior of spreads, that is, the difference between a country’s interest rate and a benchmark rate, paid on sovereign bonds in emerging markets
(Neumeyer and Perri, 2005). Relative to interest rates on U.S. Treasuries, the
distribution of spreads has a positive mean, and spreads are considerably more volatile.
A potential added benefit of using a debt-elastic interest rate is that proper
specification of F may allow one to derive the steady-state value of net foreign
assets endogenously. However, the introduction of a new, somewhat arbitrary
link between endogenous variables raises the possibility of equilibrium indeterminacy and non-existence similar to what is found in the literature on
monetary policy rules and production externalities. I study two cases. In
the first case, the small open economy takes the endogenous interest rate as
given. That is, the dependence of the interest rate on the level of outstanding
net assets is not internalized. The second case assumes that agents take the
feedback from assets to interest rates into account.

No Internalization
The optimization problem for the small open economy is identical to the
canonical case discussed above. The agent does not take into account that the
interest charged for international borrowing depends on the amount borrowed.
Analytically, the agent takes R_t as given. The first-order conditions are consequently
(2) and (3). Imposing the interest rate function R_t = F(b_t − b)
yields the Euler equation when the risk premium is not internalized:
u′(c_t) = βF(b_t − b) E_t u′(c_{t+1}).   (8)

The Euler equation highlights the difference from the canonical model. Expected
consumption growth now depends on an endogenous variable, which tilts the
consumption path away from random-walk behavior. However, existence of
a steady state still requires R = R^∗ = β^{−1}. Despite the assumption of an
endogenous risk premium, this model suffers from the same deficiency as the


canonical model in that the first-order conditions do not fully pin down all
endogenous variables in steady state.6
After substituting the interest rate function, the first-order conditions are
linearized around some steady state b. I impose additional structure by assuming
that the period utility function is u(c) = c^{1−1/σ}/(1 − 1/σ), so that u″(c)c/u′(c) = −1/σ,
where σ > 0 is the intertemporal substitution elasticity. Since I am mainly interested
in the determinacy properties of the model, I also abstract from time
variation in the endowment process, y_t = y, ∀t. Furthermore, I assume that
F′(0) = −ψ.7 The linearized equation system is then
E_t ĉ_{t+1} = ĉ_t − βσψ b̃_t,   (9)
c ĉ_t + b̃_t = (1/β − ψb) b̃_{t−1}.

The reduced-form coefficient matrix of this system can be obtained after a few
steps:
[ 1   −βσψ ;  −c   1/β + βσψc − ψb ],   (10)
where c = y + ((1 − β)/β) b as before. I can now establish

Proposition 1 In the model with additively separable risk premium and no
internalization, there is a unique equilibrium for all admissible parameter
values.
Proof.
In order to investigate the determinacy properties of this model,
I first compute the trace, tr = 1 + 1/β + βσψc − ψb, and the determinant,
det = 1/β − ψb. Since there is one predetermined variable, a unique equilibrium
requires one root inside and one root outside the unit circle. Two (respectively, zero)
roots inside the unit circle imply indeterminacy (respectively, non-existence). The Appendix
shows that determinacy requires |tr| > 1 + det, while |det| ≶ 1. The first
condition reduces to βσψc > 0, which is always true because of strictly
positive consumption. Note also that tr > 1 + det. Indeterminacy and
non-existence require |tr| < 1 + det, which cannot hold because of positive
consumption. The proposition then follows immediately.
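As a numerical cross-check of Proposition 1 (an illustration of my own, not part of the original article), the sketch below forms the coefficient matrix in (10) for the baseline parameter values reported in footnote 9 (β = 0.98, σ = 1, ψ = 0.001, y = 1) and an arbitrary grid of admissible steady-state asset levels, and verifies that exactly one eigenvalue lies inside the unit circle.

```python
import numpy as np

# Baseline parameters from the article's numerical example (footnote 9).
beta, sigma, psi, y = 0.98, 1.0, 0.001, 1.0

def eigenvalues_no_internalization(b):
    """Eigenvalues of the coefficient matrix (10) for a steady-state asset level b."""
    c = y + (1 - beta) / beta * b          # steady-state consumption
    m = np.array([[1.0, -beta * sigma * psi],
                  [-c,  1 / beta + beta * sigma * psi * c - psi * b]])
    return np.linalg.eigvals(m)

# Admissible b must keep consumption positive: b > -beta/(1-beta)*y (about -49 here).
for b in (-10.0, 0.0, 50.0, 500.0):
    lam = np.sort(np.abs(eigenvalues_no_internalization(b)))
    print(f"b = {b:6.1f}: |eigenvalues| = {lam}, roots inside unit circle = {np.sum(lam < 1)}")
```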
6 This is an artifact of the assumption of no internalization and the specific assumptions on
the interest rate function.
7 An example of a specific functional form that is consistent with these assumptions and that
has been used in the literature (e.g., Schmitt-Grohé and Uribe 2003) is
R_t = R^∗ + ψ(e^{−(b_t − b)} − 1).


Internalization
An alternative scenario assumes that the agent explicitly takes into account
that the interest rate he pays on foreign borrowing depends on the amount
borrowed. Higher borrowing entails higher future debt service which reduces
the desire to borrow. The agent internalizes the cost associated with becoming
active on the international asset markets in that he discounts future interest
outlays not at the world interest rate but at the domestic interest rate, which is
inclusive of the risk premium.8
The previous assumptions regarding the interest rate function and the
exogenous shock remain unchanged. Since the economy internalizes the dependence of the interest rate on net foreign assets, the first-order conditions
change. Analytically, I substitute the interest rate function into the budget
constraint (2) before taking derivatives, thereby eliminating R from the optimization problem. The modified Euler equation is
u′(c_t) = βF(b_t − b)[1 + ε_F(b_t)] E_t u′(c_{t+1}),   (11)

where ε_F(b_t) = F′(b_t − b) b_t / F(b_t − b) is the elasticity of the interest rate function with
respect to net foreign assets. Compared to the case of no internalization, the
effective interest rate now includes an additional term in the level of net foreign
assets. Whether the steady-state level of b is determined, therefore, depends
on this elasticity. Maintaining the assumption F′(0) = −ψ, it follows that
ε_F(b) = −ψb/R^∗.
This provides the additional restriction needed to pin down the steady
state:
b = (R^∗ − 1/β)/ψ.   (12)
If the country’s discount factor is bigger than 1/R ∗ , that is, if it is more
patient than those in the rest of the world, its steady-state asset position is
strictly positive. A more impatient country, however, accumulates foreign
debt to finance consumption. Note further that R = R^∗, but not necessarily
equal to β^{−1}, while b asymptotically reaches zero as ψ grows large. It is worth
emphasizing that βR^∗ = 1 is no longer a necessary condition for the existence
of a steady state, and that b is, in fact, uniquely determined. Internalization of
the risk premium, therefore, avoids one of the pitfalls of the standard model,
and it also nicely captures the idea that some countries appear to have persistent
levels of foreign indebtedness.
8 The difference between internalization and no internalization of the endogenous risk premium
is also stressed by Nason and Rogers (2006). Strictly speaking, with internalization the country
stops being a price-taker in international asset markets. This is analogous to open-economy models
of "semi-small" countries that are monopolistically competitive and price-setting producers of export
goods. Schmitt-Grohé (1997) has shown that feedback mechanisms of this kind are important
sources of non-determinacy of equilibria.


I now proceed by linearizing the equation system:
E_t ĉ_{t+1} = ĉ_t − βσψ(2 − b) b̃_t,   (13)
c ĉ_t + b̃_t = (R^∗ − ψb) b̃_{t−1}.

The coefficient matrix that determines the dynamics can be derived as:
[ 1   −βσψ(2 − b) ;  −c   1/β + βσψ(2 − b)c ],   (14)
where now b = (R^∗ − 1/β)/ψ and c = y + (R^∗ − 1)b. The determinacy properties
of this case are given in
Proposition 2 In the model with additively separable risk premium and internalization,
the equilibrium is unique if and only if
b < 2, or
b > 2 + 2(1 + β)/(β²σψc).
No equilibrium exists otherwise.
Proof.
The determinant of the system matrix is det = β^{−1} > 1. This
implies that there is at least one explosive root, which rules out indeterminacy.
Since the system contains one jump and one predetermined variable, a unique
equilibrium requires |tr| > 1 + det, where tr = 1 + β^{−1} + βσψc(2 − b).
The lower bound of the condition establishes that βσψ(2 − b)c > 0. Since
c > 0, it must be that b < 2. From −tr > 1 + det, the second part of the
determinacy region follows after simply rearranging terms. The proposition
then follows immediately.
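Proposition 2 is easy to illustrate numerically. The sketch below (my own check, using the baseline values β = 0.98, σ = 1, ψ = 0.001, and y = 1 from footnote 9 and an arbitrary grid of asset levels) treats the steady-state asset level b as given, backs out the implied world interest rate from equation (12), builds the matrix in (14), and classifies the equilibrium by counting explosive roots.

```python
import numpy as np

beta, sigma, psi, y = 0.98, 1.0, 0.001, 1.0   # baseline values from footnote 9

def classify(b):
    """Classify the equilibrium for a steady-state asset level b (internalized premium)."""
    r_star = 1 / beta + psi * b               # world interest rate implied by (12)
    c = y + (r_star - 1) * b                  # steady-state consumption
    m = np.array([[1.0, -beta * sigma * psi * (2 - b)],
                  [-c,  1 / beta + beta * sigma * psi * (2 - b) * c]])
    outside = np.sum(np.abs(np.linalg.eigvals(m)) > 1)
    # One jump and one predetermined variable: unique with exactly one explosive root,
    # non-existent with two, indeterminate with none.
    return {1: "determinate", 2: "non-existent"}.get(outside, "indeterminate")

for b in (-20.0, 0.0, 1.9, 2.1, 100.0, 200.0, 500.0):
    print(f"b = {b:6.1f}: {classify(b)}")
```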
The proposition shows that a sufficient condition for determinacy is that
the country is a net foreign borrower, which implies β^{−1} > R^∗. A relatively
impatient country borrows from abroad to sustain current consumption. Since
this incurs a premium above the world interest rate, the growth rate of debt
is below that of, say, the canonical case, and debt accumulation is, therefore,
nonexplosive. Even if the country is a net foreign lender, determinacy can
still be obtained for 0 < b < 2 or R^∗ < β^{−1} + 2ψ. A slightly more patient
country than the rest of the world would imply a determinate equilibrium if
the (internalized) interest rate premium is large enough.
From a technical point of view, non-existence arises if both roots in (13)
are larger than unity, so that both difference equations are unstable. The budget
constraint then implies an explosive time path for assets b which would violate
transversality. This is driven by explosive consumption growth financed by
interest receipts on foreign asset holdings. In the non-existence region, these
are too large to be balanced by the decline in the interest rate. Effectively,


the economy both over-consumes and over-accumulates assets, which cannot
be an equilibrium. The only possible equilibrium is, therefore, at the (unique)
steady state, while dynamics around it are explosive. This highlights the
importance of the elasticity term 1 + ε_F(b_t) in equation (11), which has the
power to tilt the consumption path away from unit-root (and explosive) behavior
for the right parameterization.
As the proposition shows, the non-existence region has an upper bound
beyond which the equilibrium is determinate again. The following numerical
example using baseline parameter values9 demonstrates, however, that this
boundary is far above empirically reasonable values. Figure 1, Panels A and
B depict the determinacy regions for net foreign assets for varying values of
σ and ψ. Note that below the lower bound b = 2 the equilibrium is always
determinate, while the size of the non-existence region is decreasing in the two
parameters. Recall from equation (12) that the steady-state level b depends
on the spread between the world interest rate and the inverse of the discount
factor. Non-existence, therefore, arises if ψ < (1/2)(R^∗ − β^{−1}). In other words,
if there is a large wedge between R^∗ and β^{−1}, a researcher has to be careful
not to choose an elasticity parameter ψ that is too small.
Normalizing output y = 1, the boundary lies at an asset level that is
twice as large as the country’s GDP. While this is not implausible, net foreign
asset holdings of that size are rarely observed. However, choosing a different
normalization, for instance, y = 10 presents a different picture, in which a
plausible calibration for, say, a primary resource exporter, renders the solution
of the model non-existent. On the other hand, as y becomes large, the upper
bound for the non-existence region in Figure 1, Panels A and B moves inward,
thereby reducing its size. The conclusion for researchers interested in studying
models of this type is to calibrate carefully. Target levels for the net-foreign
asset to GDP ratio cannot be chosen independently of the stationarity-inducing
parameter ψ if equilibrium existence problems are to be avoided. It is worth
pointing out again that indeterminacy, and thus the possibility of sunspot
equilibria, can be ruled out in this model.
While it is convenient to represent the boundaries of the determinacy
region for net foreign assets b, it is nevertheless an endogenous variable, as is c.
The parameter restriction in the above proposition can be rearranged in terms
of R^∗. That is, the economy has a unique equilibrium if either R^∗ < β^{−1} + 2ψ
or R^∗ > β^{−1} + 2ψ[1 + (1 + β)/(β²σ{ψy + (R^∗ − 1)(R^∗ − β^{−1})})]. Again, the equilibrium
is non-existent otherwise. Since the second term in brackets is strictly positive,
the region of non-existence is nonempty. Although the upper bound is still
a function of R^∗ (and has to be computed numerically), this version presents
more intuition.
9 Parameter values used are β = 0.98, σ = 1, ψ = 0.001, and y = 1.


Figure 1 Determinacy Regions for Net Foreign Assets b and Interest
Rates R^∗
[Figure 1 contains four panels. Panels A and B plot the determinacy and non-existence regions for net foreign assets (horizontal axis, 0 to 1,000) against σ and ψ, respectively, with the lower boundary b = 2 marked; Panels C and D plot the corresponding regions for R^∗ (horizontal axis, 1 to 1.4) against σ and ψ.]

Figure 1, Panels C and D depict the determinacy regions for R ∗ with
varying σ and ψ, respectively. The lower bound of the non-existence region
is independent of σ , but increasing in ψ. For a small substitution elasticity,
the equilibrium is non-existent unless the economy is more impatient than the
rest of the world, inclusive of a factor reflecting the risk premium. This is
both consistent with a negative steady-state asset position as well as a small,
positive one as long as b < 2. Figure 1, Panel D shows that no equilibrium
exists even for very small values of ψ. If the economy is a substantial net saver,
then the equilibrium is determinate if the world interest rate is (implausibly)
high. Analytically, this implies that the asset accumulation equation remains
explosive even though there is a large premium to be paid.
To summarize, introducing a debt-elastic interest rate addresses two issues
arising in incomplete market models of open economies, viz., the indeterminacy of the steady-state allocation and the induced non-stationarity of the
linearized solution. If the derivative of the interest rate function with respect
to net asset holdings is nonzero, then the linearized solution is stationary. In
the special case when the economy internalizes the dependence of the interest


rate on net foreign assets, the rational expectations equilibrium can be nonexistent. However, this situation only arises for arguably extreme parameter
values. A nonzero elasticity of the interest rate function is also necessary for
the determinacy of the steady state. It is not sufficient, however, as the special
case without internalization demonstrated.

3. ALTERNATIVE SPECIFICATIONS
The exposition above used the general functional form R_t = F(b_t − b), with
F(0) = R^∗ and F′(0) = −ψ. A parametric example for this function would
be additive in the risk premium term, i.e., R_t = R^∗ + ψ(e^{−(b_t − b)} − 1).
Alternatively, the risk premium could also be chosen multiplicatively, R_t =
R^∗ψ(b_t), with ψ(b) = 1 and ψ′ < 0. With internalization, the Euler equation
can then be written as:
u′(c_t) = βR^∗ψ(b_t)[1 + ε_F(b_t)] E_t u′(c_{t+1}).   (15)

ε_F(b_t) is the elasticity of the risk premium function with respect to foreign
assets. Again, the first-order condition shows how a debt-elastic interest rate
tilts consumption away from pure random-walk behavior.
A specific example for the multiplicative form of the interest rate function
is R_t = R^∗ e^{−ψ(b_t − b)}, which in log-linear form conveniently reduces to R̂_t =
−ψ b̃_t. Assuming no internalization, the steady state is again not pinned down,
so that R = R^∗ = β^{−1} and the above restrictions on b apply. Internalization
of the risk premium leads to b = (R^∗ − 1/β)/(ψR^∗). Again, the economy is a net saver
when it is more patient than the rest of the world. As opposed to the case of
an additive premium, the equilibrium is determinate for the entire parameter
space. This can easily be established in
Proposition 3 In the model with multiplicative risk premium, with either internalization or no internalization, the equilibrium is unique for all parameter
values.
Proof.
See Appendix.
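For concreteness, the sketch below (my own numerical check with arbitrary illustrative parameter values) builds the linearized systems for the multiplicative premium given in the Appendix proof, with and without internalization, and confirms that one root always lies inside and one outside the unit circle, as Proposition 3 states.

```python
import numpy as np

def saddle(m):
    """True if exactly one eigenvalue of the 2x2 matrix m lies outside the unit circle."""
    return np.sum(np.abs(np.linalg.eigvals(m)) > 1) == 1

beta, sigma, psi, y = 0.98, 2.0, 0.01, 1.0     # arbitrary illustrative values

# No internalization: the steady state requires R* = 1/beta; b is a free calibration target.
r_star = 1 / beta
for b in (-20.0, 0.0, 20.0):
    c = y + (r_star - 1) * b
    m = np.array([[1.0, -sigma * psi],
                  [-c,  sigma * psi * c + r_star * (1 - psi * b)]])
    print("no internalization, b =", b, "-> saddle:", saddle(m))

# Internalization: b is pinned down by b = (R* - 1/beta)/(psi R*).
for r_star in (1.01, 1.05):
    b = (r_star - 1 / beta) / (psi * r_star)
    c = y + (r_star - 1) * b
    m = np.array([[1.0, -sigma * psi * (1 + beta * r_star)],
                  [-c,  sigma * psi * c * (1 + beta * r_star) + r_star * (1 - psi * b)]])
    print("internalization, R* =", r_star, "-> saddle:", saddle(m))
```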
Nason and Rogers (2006) suggest a specification for the risk premium
that is additive in net foreign assets relative to aggregate income: R_t = R^∗ −
ψ(b_t/y_t).10 The difference from the additive premium considered above is that even
without internalization, foreign and domestic rates need not be the same in
the steady state. Without internalization, b = (R^∗ − 1/β)/ψ, whereas with internalization,
b = (1/2)(R^∗ − 1/β)/ψ. This shows that the endogenous risk premium reduces asset

10 Note that in this case the general form specification of the interest rate function is R_t =
F(b_t), and not R_t = F(b_t − b).


accumulation when agents take into account the feedback effect on the interest
rate. The determinacy properties of this specification are established in
Proposition 4 If the domestic interest rate is given by R_t = R^∗ − ψ(b_t/y_t), under
either internalization or no internalization, the equilibrium is unique for all
parameter values.
Proof.
See Appendix.
It may appear that the determinacy properties are pure artifacts of the
linearization procedure. While I log-linearized consumption, functions of b_t
were approximated in levels as net foreign assets may very well be negative or
zero.11 Dotsey and Mao (1992), for instance, have shown that the accuracy of
linear approximation procedures depends on the type of linearization chosen.
It can be verified,12 however, that this is not a problem in this simple model
as far as the determinacy properties are concerned. The coefficient matrix for
all model specifications considered is invariant to the linearization.

4. PORTFOLIO ADJUSTMENT COSTS

Finally, I consider one approach to the unit-root problem that does not rely
on feedback from net foreign assets to the interest rate. Several authors, for
example, Schmitt-Grohé and Uribe (2003) and Neumeyer and Perri (2005),
have introduced quadratic portfolio adjustment costs to guarantee stationarity.
It is assumed that agents have to pay a fee in terms of lost output if their
transactions on the international asset market lead to deviations from some
long-run (steady-state) level b. The budget constraint is thus modified as
follows:
c_t + b_t + (ψ/2)(b_t − b)² = y_t + R^∗ b_{t−1},   (16)
where ψ > 0, and the interest rate on foreign assets is equal to the constant
world interest rate R^∗. The Euler equation is
u′(c_t)[1 + ψ(b_t − b)] = βR^∗ E_t u′(c_{t+1}).   (17)

If the economy wants to purchase an additional unit of foreign assets, current
consumption declines by one plus the transaction cost ψ(b_t − b). The payoff
for the next period is higher consumption by one unit plus the fixed (net) world
interest rate.
Introducing this type of portfolio adjustment costs does not pin down
the steady-state value of b. The Euler equation implies the same steady-state
restriction as the canonical model, namely βR^∗ = 1 and b > −(β/(1 − β)) y.
11 The interpretation of the linearized system in terms of percentage deviations from the
steady state can still be preserved by expressing foreign assets relative to aggregate income or
consumption, as in equation (7).
12 Details are available from the author upon request.


However, the Euler equation (17) demonstrates the near equivalence between
the debt-dependent interest rate function and the debt-dependent borrowing-cost
formulation. The key to avoiding a unit root in the dynamic model is
to generate feedback that tilts expected consumption growth, which can be
achieved in various ways.
The coefficient matrix of the two-variable system in (ĉ_t, b̃_t) is given by
[ 1   −σψ ;  −c   β^{−1} + σψc ].

It can be easily verified that both eigenvalues are real and lie on opposite sides
of the unit circle over the entire admissible parameter space. The rational
expectations solution is, therefore, unique. The same conclusion applies when
different linearization schemes, as previously discussed, are used.
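That claim is straightforward to confirm numerically. The sketch below (my own illustration with an arbitrary parameter grid; the only economic restriction imposed is c > 0) evaluates the eigenvalues of this coefficient matrix over combinations of β, σ, ψ, and c and checks that one root is always inside and one outside the unit circle.

```python
import numpy as np
from itertools import product

# Arbitrary illustrative grid; c > 0 is the only restriction used here.
betas  = (0.90, 0.95, 0.99)
sigmas = (0.5, 1.0, 5.0)
psis   = (1e-4, 1e-2, 1.0)
cs     = (0.5, 1.0, 10.0)

all_saddle = True
for beta, sigma, psi, c in product(betas, sigmas, psis, cs):
    m = np.array([[1.0, -sigma * psi],
                  [-c,  1 / beta + sigma * psi * c]])
    lam = np.abs(np.linalg.eigvals(m))
    all_saddle = all_saddle and (np.sum(lam > 1) == 1)

print("one stable and one unstable root at every grid point:", all_saddle)
```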
It is worthwhile to point out that Schmitt-Grohé and Uribe (2003) have
suggested that the model with portfolio adjustment costs and the model with
a debt-elastic interest rate imply similar dynamics. Inspection of the two
respective Euler equations reveals that the debt-dependent discount factors in
the linearized versions are identical for a properly chosen parameterization.
However, portfolio costs do not appear in the linearized budget constraint,
since they are of second order, whereas the time-varying interest rate changes
debt dynamics in a potentially critical way. It follows that this assertion is
true only for that part of the parameter space that results in a unique solution,
but a general equivalence result, such as between internalized and external
risk premia, cannot be derived.

5. CONCLUSION

Incomplete market models of small open economies imply non-stationary
equilibrium dynamics. Researchers who want to work with this type of model
are faced with a choice between theoretical rigor and analytical expediency in
terms of a model solution. In order to alleviate this tension, several techniques
to induce stationarity have been suggested in the literature. This article has
investigated the determinacy properties of models with debt-elastic interest
rates and portfolio adjustment costs. The message is a mildly cautionary
one. Although analytically convenient, endogenizing the interest rate allows
for the possibility that the rational expectations equilibrium does not exist.
I show that an additively separable risk premium with a specific functional
form that is used in the literature can imply non-existence for a plausible
parameterization. I suggest alternative specifications that are not subject to
this problem. In general, however, this article shows that the determinacy
properties depend on specific functional forms, which is not readily apparent
a priori.


A question that remains is to what extent the findings in this article are
relevant in richer models. Since analytical results may not be easily available,
this remains an issue for further research. Moreover, there are other suggested
solutions to the unit-root problem. As the article has emphasized, the key is
to tilt expected consumption growth away from unity. I have only analyzed
approaches that work by endogenizing the interest rate, but just as conceivably
the discount factor β could depend on other endogenous variables as in the
case of Epstein-Zin preferences. The rate at which agents discount future
consumption streams might depend on their utility level, which in turn depends
on consumption and net foreign assets. Again, this would provide a feedback
mechanism from assets to the consumption tilt factor. Little is known about
equilibrium determinacy properties under this approach.

APPENDIX
Solving the Canonical Model
The linearized equation system describing the dynamics of the model is
E_t ĉ_{t+1} = ĉ_t,
c ĉ_t + b̃_t = y ŷ_t + β^{−1} b̃_{t−1}.
I solve the model by applying the method described in Sims (2002). In order
to map the system into Sims’s framework, I define the endogenous forecast
error ηt as follows:
ĉ_t = ξ^c_{t−1} + η_t = E_{t−1} ĉ_t + η_t.
The system can then be rewritten as:
[1  0;  c  1] [ξ^c_t;  b̃_t] = [1  0;  0  β^{−1}] [ξ^c_{t−1};  b̃_{t−1}] + [0;  y] ŷ_t + [1;  0] η_t.
Invert the lead matrix, [1  0;  c  1]^{−1} = [1  0;  −c  1], and multiply through:
[ξ^c_t;  b̃_t] = [1  0;  −c  β^{−1}] [ξ^c_{t−1};  b̃_{t−1}] + [0;  y] ŷ_t + [1;  −c] η_t.

Since the autoregressive coefficient matrix is triangular, the eigenvalues of the
system can be read off the diagonal: λ_1 = 1, λ_2 = β^{−1} > 1. This matrix can
be diagonalized as follows:
[(1 − β)/(cβ)  0;  1  β^{−1}] [1  0;  0  β^{−1}] [cβ/(1 − β)  0;  −cβ²/(1 − β)  β].


Multiply the system by the matrix of right eigenvectors to get:
[cβ/(1 − β)  0;  −cβ²/(1 − β)  β] [ξ^c_t;  b̃_t] = [1  0;  0  β^{−1}] [cβ/(1 − β)  0;  −cβ²/(1 − β)  β] [ξ^c_{t−1};  b̃_{t−1}]
+ [0;  βy] ŷ_t + [cβ/(1 − β);  −cβ/(1 − β)] η_t.
Define w_{1t} = (cβ/(1 − β)) ξ^c_t and w_{2t} = −(cβ²/(1 − β)) ξ^c_t + β b̃_t; then:
[w_{1t};  w_{2t}] = [1  0;  0  β^{−1}] [w_{1,t−1};  w_{2,t−1}] + [0;  βy] ŷ_t + [cβ/(1 − β);  −cβ/(1 − β)] η_t.

Treat λ_1 = 1 as a stable eigenvalue. Then the conditions for stability are
w_{2t} = 0, ∀t,
βy ŷ_t − (cβ/(1 − β)) η_t = 0.
This implies a solution for the endogenous forecast error:
η_t = (1 − β)(y/c) ŷ_t.
The decoupled system can consequently be rewritten as:
[w_{1t};  w_{2t}] = [1  0;  0  0] [w_{1,t−1};  w_{2,t−1}] + [0;  βy] ŷ_t + [βy;  −βy] ŷ_t
= [1  0;  0  0] [w_{1,t−1};  w_{2,t−1}] + [βy;  0] ŷ_t.
Now multiply by the matrix of left eigenvectors, [(1 − β)/(cβ)  0;  1  β^{−1}], to return to
the original set of variables:
[ξ^c_t;  b̃_t] = [1  0;  cβ/(1 − β)  0] [ξ^c_{t−1};  b̃_{t−1}] + [(1 − β)(y/c);  βy] ŷ_t.

Using the definition of ξ^c_t we find after a few steps:
ĉ_t = ĉ_{t−1} + (1 − β)(y/c) ŷ_t,
b̃_t = b̃_{t−1} + βy ŷ_t.

The unit-root component of this model is clearly evident from the solution for
consumption. Once the system is disturbed it will not return to its initial level.
In fact, it will eventually exceed any given bound with probability one, which raises doubts
about the validity of the linearization approach in the first place. Moreover,
there is no limiting distribution for the endogenous variables; the variance of

consumption, for instance, is infinite. Strictly speaking, the model cannot be
used for business cycle analysis.
Alternatively, one can derive the state-space representation of the solution,
that is, expressed in terms of state variables and exogenous shocks. Convenient
substitution thus leads to:
ĉ_t = ((1 − β)/β)(b̃_{t−1}/c) + (1 − β)(y/c) ŷ_t,
b̃_t/c = b̃_{t−1}/c + β(y/c) ŷ_t.
As in the intertemporal approach to the current account, income innovations
only have minor effects on current consumption, but lead to substantial changes
in net foreign assets. Purely temporary shocks, therefore, have permanent
effects.
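The steps above can be reproduced mechanically. The sketch below (a minimal numerical illustration of my own, with an arbitrary calibration of β = 0.98, y = 1, and b = 0) forms the autoregressive matrix of the rewritten system, uses the left eigenvector associated with the explosive root β^{−1} to pin down the forecast error, and recovers the loadings (1 − β)(y/c), (1 − β)/(βc), and βy of the solution.

```python
import numpy as np

# Arbitrary illustrative calibration.
beta, y, b = 0.98, 1.0, 0.0
c = y + (1 - beta) / beta * b            # steady-state consumption

# After inverting the lead matrix: x_t = A x_{t-1} + Q*yhat_t + P*eta_t,
# with x_t = (xi^c_t, b_t) and xi^c_t = E_t c_{t+1}.
A = np.array([[1.0, 0.0], [-c, 1.0 / beta]])
Q = np.array([0.0, y])
P = np.array([1.0, -c])

# Left eigenvector of A (right eigenvector of A') associated with the root 1/beta.
eigvals, left_vecs = np.linalg.eig(A.T)
v2 = left_vecs[:, np.argmax(np.abs(eigvals))]

# Stability requires v2'x_t = 0 for all t, so the forecast error must offset the
# shock, eta_t = -(v2'Q)/(v2'P)*yhat_t, and the stable manifold is xi^c = -(v2[1]/v2[0])*b.
eta_coef = -(v2 @ Q) / (v2 @ P)
slope = -v2[1] / v2[0]

print("forecast-error loading on yhat:", eta_coef, "analytical:", (1 - beta) * y / c)
print("consumption response to b_{t-1}:", slope, "analytical:", (1 - beta) / (beta * c))
print("asset response to yhat:", y - c * eta_coef, "analytical:", beta * y)
```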

Bounding the Eigenvalues
The characteristic equation of a two-by-two matrix A is given by p(λ) =
λ² − tr·λ + det, where tr = trace(A) and det = det(A) are the trace
and determinant, respectively. According to the Schur-Cohn criterion (see
LaSalle 1986, 27), a necessary and sufficient condition that all roots of this
polynomial be inside the unit circle is
|det| < 1 and |tr| < 1 + det.

I am also interested in cases in which there is one root inside the unit
circle or both roots are outside the unit circle. Conditions for the latter
can be derived by noting that the eigenvalues of the inverse of a matrix are
equal to the inverse eigenvalues of the original matrix. Define B = A^{−1}.
Then trace(B) = trace(A)/det(A) and det(B) = 1/det(A). By Schur-Cohn, B has two
eigenvalues inside the unit circle (and therefore both of A's eigenvalues are
outside) if and only if |det(B)| < 1 and |trace(B)| < 1 + det(B). Substituting
the above expressions, I find that |1/det(A)| < 1, which implies |det(A)| > 1.
The second condition is −(1 + 1/det(A)) < trace(A)/det(A) < 1 + 1/det(A). Suppose
first that det(A) > 0. It follows immediately that |trace(A)| < 1 + det(A).
Alternatively, if det(A) < 0, I have |trace(A)| < −(1 + det(A)). However,
since I have restricted |det(A)| > 1, the latter case collapses into the former for
det(A) < −1. Combining these restrictions I can then deduce that a necessary
and sufficient condition for both roots lying outside the unit circle is
|det| > 1 and |tr| < 1 + det.
and |tr| < 1 + det.

Conditions for the case of one root inside and one root outside the unit
circle can be found by including all possibilities not covered by the previous


ones. Consequently, I find this requires
Either |det| < 1 and |tr| > 1 + det,
or |det| > 1 and |tr| > 1 + det.
As a side note, employing the Schur-Cohn criterion and its corollaries is preferable to using Descartes' Rule of Signs or the Fourier-Budan theorem since I
may have to deal with complex eigenvalues (see Barbeau 1989, 170). Moreover, the former can give misleading bounds since it does not treat det < −1
as a separate restriction. This is not a problem in the canonical model where
det = β^{−1} > 1, but may be relevant in the other models.

Proof of Proposition 3
With no internalization of the risk premium, the linearized equation system is
given by
ĉ_t = ĉ_{t−1} − σψ b̃_{t−1},
c ĉ_t + b̃_t = R^∗(1 − ψb) b̃_{t−1}.
Its trace and determinant are tr = 1 + R^∗(1 − ψb) + σψc and det =
R^∗(1 − ψb). Since I have tr = 1 + det + σψc > 1 + det, it follows
immediately that the system contains one stable and one unstable root, so that
the equilibrium is unique for all parameter values.
With internalization of the risk premium, the linearized equation system
is given by
ĉ_t = ĉ_{t−1} − σψ(1 + βR^∗) b̃_{t−1},
c ĉ_t + b̃_t = R^∗(1 − ψb) b̃_{t−1}.
Its trace and determinant are tr = 1 + R^∗(1 − ψb) + σψc(1 + βR^∗) and
det = R^∗(1 − ψb). Since I have tr = 1 + det + σψc(1 + βR^∗) > 1 + det,
it follows immediately that the system contains one stable and one unstable
root, so that the equilibrium is unique for all parameter values. This concludes
the proof of the proposition.

Proof of Proposition 4
With no internalization of the risk premium, the linearized equation system is
given by
ĉ_t = ĉ_{t−1} − (σβψ/y) b̃_{t−1},
c ĉ_t + b̃_t = (1/β − ψ b/y) b̃_{t−1}.


Its trace and determinant are tr = 1 + σβψ(c/y) + 1/β − ψ(b/y) and det = 1/β − ψ(b/y).
Since I have tr = 1 + det + σβψ(c/y) > 1 + det, it follows immediately that
the system contains one stable and one unstable root, so that the equilibrium
is unique for all parameter values.
With internalization of the risk premium, the linearized equation system
is given by
ĉ_t = ĉ_{t−1} − 2(σβψ/y) b̃_{t−1},
c ĉ_t + b̃_t = (1/β − ψ b/y) b̃_{t−1}.
Its trace and determinant are tr = 1 + 2σβψ(c/y) + 1/β − ψ(b/y) and det = 1/β − ψ(b/y).
Since I have tr = 1 + det + 2σβψ(c/y) > 1 + det, it follows immediately that
the system contains one stable and one unstable root, so that the equilibrium is
unique for all parameter values. This concludes the proof of the proposition.

REFERENCES
Barbeau, Edward J. 1989. Polynomials. New York, NY: Springer-Verlag.
Baxter, Marianne, and Mario J. Crucini. 1995. “Business Cycles and the
Asset Structure of Foreign Trade.” International Economic Review 36
(4): 821–54.
Dotsey, Michael, and Ching Sheng Mao. 1992. “How Well Do Linear
Approximation Methods Work? The Production Tax Case.” Journal of
Monetary Economics 29 (1): 25–58.
Hall, Robert E. 1978. “Stochastic Implications of the Life Cycle-Permanent
Income Hypothesis: Theory and Evidence.” Journal of Political
Economy 86 (6): 971–88.
LaSalle, Joseph P. 1986. The Stability and Control of Discrete Processes.
New York, NY: Springer-Verlag.
Nason, James M., and John H. Rogers. 2006. “The Present-Value Model of
the Current Account Has Been Rejected: Round Up the Usual Suspects.”
Journal of International Economics 68 (1): 159–87.
Neumeyer, Pablo A., and Fabrizio Perri. 2005. “Business Cycles in
Emerging Economies: The Role of Interest Rates.” Journal of Monetary
Economics 52 (2): 345–80.


Schmitt-Grohé, Stephanie. 1997. "Comparing Four Models of Aggregate
Fluctuations Due to Self-Fulfilling Expectations." Journal of Economic
Theory 72 (1): 96–147.
Schmitt-Grohé, Stephanie, and Martín Uribe. 2003. "Closing Small Open
Economy Models." Journal of International Economics 61 (1): 163–85.
Senhadji, Abdelhak S. 2003. “External Shocks and Debt Accumulation in a
Small Open Economy.” Review of Economic Dynamics 6 (1): 207–39.
Sims, Christopher A. 2002. “Solving Linear Rational Expectations Models.”
Computational Economics 20 (1–2): 1–20.