
President’s Welcome
James Bullard

As president of the Bank, it is my pleasure to welcome you to the Thirty-Third Annual Policy Conference of the Federal Reserve Bank of St. Louis.
This conference concerns measurement of
the economy’s potential output. The concept of
potential output is straightforward to define—the
economy’s maximum sustained level of output—
but difficult to measure. Inclusion of the term
sustained suggests that the concept of potential
growth is closely tied to inflation—a low, stable
inflation rate is essential if an economy is to
attain maximum economic growth and, hence,
remain through time at or near its potential level
of output.
In macroeconomic stabilization theory and
practice, the concept of potential growth has a
long history. Early analyses focused on the output
gap. Fortunately, belief in an exploitable long-run
tradeoff between the unemployment rate and
the rate of inflation was rejected by economists
decades ago. Today’s classical and New Keynesian
models suggest that, given enough time for adjustment and a benign pattern of shocks, the economy
will adjust in the long run toward its potential
level of output. The speed of such adjustment
depends on the relative flexibility or inflexibility
of wages, prices, and expectations—aptly summarized by Keynes’s quip that “In the long run,
we are all dead.” But, taken literally, Keynes’s
call to action, as we now recognize, can be quite
dangerous when near-term preliminary data
contain significant uncertainty and measurement error, as demonstrated by the papers of
Athanasios Orphanides, John Williams, and
Simon van Norden (e.g., Orphanides and
van Norden, 2002; and Orphanides and Williams,
2005).
The concept of potential output is an important feature of monetary policymaking. At our
conference in 2007 in honor of Bill Poole, Lars
Svensson and Noah Williams (2008, p. 275)
characterized the task of policymakers as seeking
to “navigate the sea of uncertainty.” Correct economic stabilization policy, like correct navigation,
requires a focus on the destination, or long-run
objective. The Federal Reserve, in particular, operates with a dual mandate from the Congress to
achieve both price stability and maximum employment. These goals are not in conflict—both require
fostering an environment to support maximum
sustainable growth. Academic policy models,
while differing one from another, typically include
a concept of potential output. Fixed-parameter
policy rules, such as the Taylor rule, feature an
output gap. Flexible inflation targeting models,
such as those of Lars Svensson (e.g., Svensson,
1997), emphasize that inflation can, and does, at
times, move away from the desired level. Thus,
the choice of an optimal policy that will return
inflation to its target depends on a tradeoff between
the costs of the higher-but-falling inflation and
any induced output gap (i.e., an output gap judged
relative to some measure of potential output). One
lesson of such models is that, even when monetary policymakers focus solely on achieving price

James Bullard is president of the Federal Reserve Bank of St. Louis.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 179-80.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the FOMC. Articles may be reprinted, reproduced, published, distributed,
displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other
derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


stability, the path of the output gap will enter into
their deliberations regarding an optimal policy
to reach that goal.
It is in this spirit of the important policy role
of potential output that I welcome the speakers
who will share their thoughts with us. We are
particularly rich in speakers from abroad, bringing
a distinct international focus to our discussions.
I trust we will all increase our understanding of
the concept of potential output and its role in
policymaking.

REFERENCES
Orphanides, Athanasios and van Norden, Simon.
“The Unreliability of Output Gap Estimates in Real
Time.” Review of Economics and Statistics,
November 2002, 84(4), pp. 569-83.
Orphanides, Athanasios and Williams, John C. “Expectations,
Learning, and Monetary Policy.” Journal of Economic
Dynamics and Control, November 2005, 29(11),
pp. 1807-08.
Svensson, Lars E.O. “Optimal Inflation Targets,
‘Conservative’ Central Banks, and Linear Inflation
Contracts.” American Economic Review, March
1997, 87(1), pp. 98-114.
Svensson, Lars E.O. and Williams, Noah. “Optimal
Monetary Policy Under Uncertainty: A Markov
Jump-Linear-Quadratic Approach.” Federal Reserve
Bank of St. Louis Review, July/August 2008, 90(4),
pp. 275-93.


What Do We Know (And Not Know)
About Potential Output?
Susanto Basu and John G. Fernald
Potential output is an important concept in economics. Policymakers often use a one-sector neoclassical model to think about long-run growth, and they often assume that potential output is a
smooth series in the short run—approximated by a medium- or long-run estimate. But in both the
short and the long run, the one-sector model falls short empirically, reflecting the importance of
rapid technological change in producing investment goods; and few, if any, modern macroeconomic
models would imply that, at business cycle frequencies, potential output is a smooth series.
Discussing these points allows the authors to discuss a range of other issues that are less well
understood and where further research could be valuable. (JEL E32, O41, E60)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 187-213.

The concept of potential output plays a
central role in policy discussions. In
the long run, faster growth in potential
output leads to faster growth in actual
output and, for given trends in population and
the workforce, faster growth in income per capita.
In the short run, policymakers need to assess the
degree to which fluctuations in observed output
reflect the economy’s optimal response to shocks,
as opposed to undesirable deviations from the
time-varying optimal path of output.
To keep the discussion manageable, we confine our analysis of potential output to neoclassical growth models with exogenous technical
progress in the short and the long run; we also
focus exclusively on the United States. We make
two main points. First, in both the short and the
long run, rapid technological change in producing
equipment investment goods is important. This
rapid change in the production technology for
investment goods implies that the two-sector

neoclassical model—where one sector produces
investment goods and the other produces consumption goods—provides a better benchmark for
measuring potential output than the one-sector
growth model. Second, in the short run, the measure of potential output that matters for policymakers is likely to fluctuate substantially over
time. Neither macroeconomic theory nor existing
empirical evidence suggests that potential output
is a smooth series. Policymakers, however, often
appear to assume that, even in the short run,
potential output is well approximated by a smooth
trend.1 Our model and empirical work corroborate these two points and provide a framework
to discuss other aspects of what we know, and
do not know, about potential output.
As we begin, clear definitions are important
to our discussion. “Potential output” is often used
1. See, for example, Congressional Budget Office (CBO, 2001 and 2004) and Organisation for Economic Co-operation and Development (2008).

Susanto Basu is a professor in the department of economics at Boston College, a research associate of the National Bureau of Economic
Research, and a visiting scholar at the Federal Reserve Bank of Boston. John G. Fernald is a vice president and economist at the Federal
Reserve Bank of San Francisco. The authors thank Alessandro Barattieri and Kyle Matoba for outstanding research assistance and Jonas
Fisher and Miles Kimball for helpful discussions and collaboration on related research. They also thank Bart Hobijn, Chad Jones, John
Williams, Rody Manuelli, and conference participants for helpful discussions and comments.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


to describe related, but logically distinct, concepts.
First, people often mean something akin to a
“forecast” for output and its growth rate in the
longer run (say, 10 years out). We will often refer
to this first concept as a “steady-state measure,”
although a decade-long forecast can also incorporate transition dynamics toward the steady state.2
In the short run, however, a steady-state notion
is less relevant for policymakers who wish to
stabilize output or inflation at high frequencies.
This leads to a second concept, explicit in New
Keynesian dynamic stochastic general equilibrium (DSGE) models: Potential output is the rate
of output the economy would have if there were
no nominal rigidities but all other (real) frictions
and shocks remained unchanged.3 In a flexible
price real business cycle model, where prices
adjust instantaneously, potential output is equivalent to actual, equilibrium output. In contrast to
the first definition of potential output as exclusively a long-term phenomenon, the second meaning defines it as relevant for the short run as well,
when shocks push the economy temporarily away
from steady state.
In New Keynesian models, where prices
and/or wages might adjust slowly toward their
long-run equilibrium values, actual output might
well deviate from the short-term measure of potential output. In many of these models, the “output
gap”—the difference between actual and potential
output—is the key variable in determining the
evolution of inflation. Thus, this second definition
also corresponds to the older Keynesian notion
that potential output is the “maximum production without inflationary pressure” (Okun, 1970,
p. 133)—that is, the level of output at which there
is no pressure for inflation to either increase or
decrease. In most, if not all, macroeconomic
models, the second (flexible price) definition
converges in the long run to the first steady-state
definition.
2. In some models, transition dynamics can be very long-lived. For example, Jones (2002) interprets the past century as a time when growth in output per capita was relatively constant at a rate above steady state.

3. See Woodford (2003) for the theory. Neiss and Nelson (2005) construct an output gap from a small, one-sector DSGE model.


Yet a third definition considers potential output as the current optimal rate of output. With
distortionary taxes and other market imperfections (such as monopolistic competition), neither
steady-state output nor the flexible price equilibrium level of output needs to be optimal or efficient. Like the first two concepts, this third
meaning is of interest to policymakers who might
seek to improve the efficiency of the economy.4
(However, decades of research on time inconsistency suggest that such policies should be implemented by fiscal or regulatory authorities, who
can target the imperfections directly, but not by
the central bank, which typically must take these
imperfections as given. See, for example, the
seminal paper by Kydland and Prescott, 1977.)
This article focuses on the first two definitions.
The first part of our article focuses on long-term
growth, which is clearly an issue of great importance for the economy, especially in discussions
of fiscal policy. For example, whether promised
entitlement spending is feasible depends almost
entirely on long-run growth. We show that the predictions of two-sector models lead us to be more
optimistic about the economy’s long-run growth
potential. This part of our article, which corresponds to the first definition of potential output,
will thus be of interest to fiscal policymakers.
The second part of our article, of interest to
monetary policymakers, focuses on a time-varying
measure of potential output—the second usage
above. Potential output plays a central, if often
implicit, role in monetary policy decisions. The
Federal Reserve has a dual mandate to pursue low
and stable inflation and maximum sustainable
employment. “Maximum sustainable employment” is usually interpreted to imply that the
Federal Reserve should strive, subject to its other
mandate, to stabilize the real economy around
its flexible price equilibrium level—which itself
is changing in response to real shocks—to avoid
inefficient fluctuations in employment. In New
Keynesian models, deviations of actual from
potential output put pressure on inflation, so in
4. Justiniano and Primiceri (2008) define “potential output” as this third measure, with no market imperfections; they use the term “natural output” to mean our second, flexible-wage/price measure.


the simplest such models, output stabilization
and inflation stabilization go hand in hand.
The first section of this article compares the
steady-state implications of one- and two-sector
neoclassical models with exogenous technological
progress. That is, we focus on the long-run effects
of given trends in technology, rather than trying
to understand the sources of this technological
progress.5 Policymakers must understand the
nature of technological progress to devise policies
to promote long-run growth, but such analysis is beyond the
scope of our article. In the next section, we use
the two-sector model to present a range of possible scenarios for long-term productivity growth
and discuss some of the questions these different
scenarios pose.
We then turn to short-term implications and
ask whether it is plausible to think of potential
output as a smooth process and compare the
implications of a simple one-sector versus two-sector model. The subsequent section turns to
the current situation (as of late 2008): How does
short-run potential output growth compare with
its steady-state level? This discussion suggests a
number of additional issues that are unknown or
difficult to quantify. The final section summarizes
our findings and conclusions.

THE LONG RUN: WHAT SIMPLE MODEL MATCHES THE DATA?

A common, and fairly sensible, approach for estimating steady-state output growth is to estimate growth in full-employment labor productivity and then allow for demographics to determine the evolution of the labor force. This approach motivates this section’s assessment of steady-state labor productivity growth.

We generally think that, in the long run, different forces explain labor productivity and total hours worked—technology along with induced capital deepening explains the former and demographics explains the latter. The assumption that labor productivity evolves separately from hours worked is motivated by the observation that labor productivity has risen dramatically over the past two centuries, whereas labor supply has changed by much less.6 Even if productivity growth and labor supply are related in the long run, as suggested by Elsby and Shapiro (2008) and Jones (1995), the analysis that follows will capture the key properties of the endogenous response of capital deepening to technological change.

A reasonable way to estimate steady-state labor productivity growth is to estimate underlying technology growth and then use a model to calculate the implications for capital deepening. Let hats over a variable represent log changes. As a matter of identities, we can write output growth, ŷ, as labor-productivity growth plus growth in hours worked, ĥ:

ŷ = (ŷ − ĥ) + ĥ.

We focus here on full-employment labor productivity.

Suppose we define growth in total factor productivity (TFP), or the Solow residual, as

tfp̂ = ŷ − α k̂ − (1 − α) l̂,

where α is capital’s share of income and (1 − α) is labor’s share of income. Defining

l̂ ≡ ĥ + lq̂,

where lq̂ is labor “quality” (composition) growth,7 we can rewrite output-per-hour growth as follows:

(1)  (ŷ − ĥ) = tfp̂ + α (k̂ − l̂) + lq̂.

As an identity, growth in output per hour worked reflects TFP growth; the contribution of capital deepening, defined as α (k̂ − l̂); and increases in labor quality. Economic models suggest mappings between fundamentals and the terms in this identity that are sometimes trivial and sometimes not.

5. Of course, total factor productivity (TFP) can change for reasons broader than technological change alone; improved institutions, deregulation, and less distortionary taxes are only some of the reasons. We believe, however, that long-run trends in TFP in developed countries like the United States are driven primarily by technological change. For evidence supporting this view, see Basu and Fernald (2002).

6. King, Plosser, and Rebelo (1988) suggest a first approximation should model hours per capita as independent of the level of technology and provide necessary and sufficient conditions on the utility function for this result to hold. Basu and Kimball (2002) show that the particular nonseparability between consumption and hours worked that is generally implied by the King-Plosser-Rebelo utility function helps explain the evolution of consumption in postwar U.S. data and resolves several consumption puzzles.
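As a quick numerical illustration of the identity in equation (1), the following sketch plugs in hypothetical annual growth rates (the numbers below are invented for illustration, not data from the article) and confirms that the decomposition adds up:

```python
# Illustrative growth-accounting decomposition (equation (1)).
# All growth figures are hypothetical annual log changes.

alpha = 0.32              # capital's share of income (value used in the article)

y_hat = 0.035             # output growth
h_hat = 0.010             # growth in hours worked
k_hat = 0.040             # capital growth
lq_hat = 0.004            # labor "quality" (composition) growth
l_hat = h_hat + lq_hat    # labor input growth: l-hat = h-hat + lq-hat

# Solow residual: tfp-hat = y-hat - alpha*k-hat - (1 - alpha)*l-hat
tfp_hat = y_hat - alpha * k_hat - (1 - alpha) * l_hat

# Equation (1): (y-hat - h-hat) = tfp-hat + alpha*(k-hat - l-hat) + lq-hat
lhs = y_hat - h_hat
rhs = tfp_hat + alpha * (k_hat - l_hat) + lq_hat

print(f"labor productivity growth: {lhs:.4f}")
print(f"decomposition sum:         {rhs:.4f}")
assert abs(lhs - rhs) < 1e-12  # the identity holds by construction
```

Because equation (1) is an identity, the two sides agree exactly for any inputs; the economics lies in how a model maps fundamentals into each term.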

The One-Sector Model
Perhaps the simplest model that could reasonably be applied to the long-run data is the one-sector neoclassical growth model. Technological
progress and labor force growth are exogenous
and capital deepening is endogenous.
We can derive the key implications from the
textbook Solow version of the model. Consider
an aggregate production function Y = K^α (AL)^{1−α},
where technology A grows at rate g and labor
input L (which captures both raw hours, H, and
labor quality, LQ—henceforth, we do not generally differentiate between the two) grows at rate n.
Expressing all variables in terms of “effective
labor,” AL, yields

(2)  y = k^α,

where y = Y/AL and k = K/AL.
Capital accumulation takes place according
to the perpetual-inventory formula. If s is the
saving rate, so that sy is investment per effective
worker, then in steady state

(3)  sy = (n + δ + g) k.

Because of diminishing returns to capital, the
economy converges to a steady state where y and
k are constant. At that point, investment per effective worker is just enough to offset the effects of
7. Labor quality/composition reflects the mix of hours across workers with different levels of education, experience, and so forth. For the purposes of this discussion, which so far has focused on definitions, suppose there were J types of workers with factor shares of income β_j, where Σ_j β_j = (1 − α). Then a reasonable definition of TFP would be tfp̂ = ŷ − α k̂ − Σ_j β_j ĥ_j. Growth accounting as done by the Bureau of Labor Statistics or by Dale Jorgenson and his collaborators (see, for example, Jorgenson, Gollop, and Fraumeni, 1987) defines l̂ = Σ_j β_j ĥ_j / (1 − α), ĥ = d ln Σ_j H_j, and lq̂ = l̂ − ĥ.

depreciation, population growth, and technological change on capital per effective worker. In
steady state, the unscaled levels of Y and K grow
at the rate g + n; capital deepening, K/L, grows at
rate g. Labor productivity Y/L (i.e., output per unit
of labor input) also grows at rate g.
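The steady-state condition in equation (3) is easy to check numerically. The sketch below iterates a discrete-time version of the capital-accumulation equation under assumed parameter values (all illustrative, not calibrations from the article) and confirms convergence to the analytical steady state:

```python
# Minimal sketch of the one-sector Solow model described above.
# Parameter values are illustrative assumptions.

alpha, s, n, g, delta = 0.32, 0.2, 0.01, 0.02, 0.06

# Iterate capital per effective worker:
# k' = (s*k^alpha + (1 - delta)*k) / ((1 + n)(1 + g))
k = 1.0
for _ in range(2000):
    k = (s * k**alpha + (1 - delta) * k) / ((1 + n) * (1 + g))

# Analytical steady state from s*y = (n + delta + g)*k with y = k^alpha.
# (The exact discrete-time condition is s*k^alpha = (n + delta + g + n*g)*k;
# the cross term n*g is tiny but kept for an exact check.)
k_star = (s / (n + delta + g + n * g)) ** (1 / (1 - alpha))

print(k, k_star)  # the simulated path converges to the analytical value
```

In this steady state, capital and output per effective worker are constant, so unscaled K/L and Y/L both grow at the technology growth rate g, as stated in the text.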
From the production function, measured
TFP growth is related to labor-augmenting technology growth by
tfp̂ = Ŷ − α K̂ − (1 − α) L̂ = (1 − α) g.
The model maps directly to equation (1). In
particular, the endogenous contribution of capital
deepening to labor-productivity growth is
α g = α · tfp̂ / (1 − α).
Output per unit of labor input grows at rate g,
which equals the sum of standard TFP growth,
(1 − α)g, and induced capital deepening, α g.
Table 1 shows how this model performs relative to the data. It uses the multifactor productivity release from the Bureau of Labor Statistics
(BLS), which provides data for TFP growth as
well as capital deepening for the U.S. business
economy. These data are shown in the first two
columns. Note that in the model above, standard
TFP growth reflects technology alone. In practice,
a large segment of the literature suggests reasons
why nontechnological factors might affect measured TFP growth. For example, there are hard-to-measure short-run movements in labor effort
and capital’s workweek, which cause measured
(although not actual) TFP to fluctuate in the short
run. Nonconstant returns to scale and markups
also interfere with the mapping from technological change to measured aggregate TFP. But the
deviations between technology and measured TFP
are likely to be more important in the short run
than in the long run, consistent with the findings
of Basu, Fernald, and Kimball (2006) and Basu
et al. (2008). Hence, for these longer-term comparisons, we assume average TFP growth reflects
average technology growth. Column 3 shows the
predictions of the one-sector neoclassical model
for α = 0.32 (the average value in the BLS multifactor dataset).

Table 1
One-Sector Growth Model Predictions for the U.S. Business Sector

Period       Total TFP    Actual capital deepening    Predicted capital deepening
                          contribution                contribution (one-sector model)
1948-2007    1.39         0.76                        0.65
1948-1973    2.17         0.85                        1.02
1973-1995    0.52         0.62                        0.25
1995-2007    1.34         0.84                        0.63
1995-2000    1.29         1.01                        0.61
2000-2007    1.37         0.72                        0.65

NOTE: Data for columns 1 and 2 are business sector estimates from the BLS multifactor productivity database (downloaded via Haver on August 19, 2008). Capital and labor are adjusted for changes in composition. Actual capital deepening is α (k̂ − l̂), and predicted capital deepening is α · tfp̂ / (1 − α).
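The one-sector prediction in column 3 of Table 1 can be reproduced directly from column 1, using the article's α = 0.32. The sketch below does so; small discrepancies in a couple of periods (e.g., 0.24 versus the table's 0.25 for 1973-1995) presumably reflect rounding or period-specific factor shares in the underlying data:

```python
# Reproducing (approximately) column 3 of Table 1 from column 1:
# the one-sector model predicts capital deepening of alpha*tfp/(1 - alpha).

alpha = 0.32  # average capital share in the BLS multifactor dataset

total_tfp = {          # column 1 of Table 1, percent per year
    "1948-2007": 1.39,
    "1948-1973": 2.17,
    "1973-1995": 0.52,
    "1995-2007": 1.34,
    "1995-2000": 1.29,
    "2000-2007": 1.37,
}

for period, tfp in total_tfp.items():
    predicted = alpha * tfp / (1 - alpha)
    print(f"{period}: predicted capital deepening = {predicted:.2f}")
```

Comparing these numbers with column 2 makes the paper's point concrete: the prediction is far too low for 1973-1995 and somewhat too high for 1948-1973.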

A comparison of columns 2 and 3 shows the
model does not perform particularly well. It
slightly underestimates the contribution of capital
deepening over the entire 1948-2007 period, but
it does a particularly poor job of matching the low-frequency variation in that contribution. In particular, it somewhat overpredicts capital deepening for the pre-1973 period but substantially underpredicts it for the 1973-95 period. That is, given the slowdown in TFP growth, the model predicts a much larger slowdown in the contribution of capital deepening than actually occurred.8
One way to visualize the problem with the
one-sector model is to observe that the model predicts a constant capital-to-output ratio in steady
state—in contrast to the data. Figure 1 shows the
sharp rise in the business sector capital-to-output
ratio since the mid-1960s.

The Two-Sector Model: A Better Match
A growing literature on investment-specific
technical change suggests an easy fix for this
8. Note that output per unit of quality-adjusted labor is the sum of TFP plus the capital deepening contribution, which in the business sector averaged 1.39 + 0.76 = 2.15 percent per year over the full sample. More commonly, labor productivity is reported as output per hour worked. Over the sample, labor quality in the BLS multifactor productivity dataset rose 0.36 percent per year, so output per hour rose 2.51 percent per year.


failure: Capital deepening does not depend on
overall TFP but on TFP in the investment sector.
A key motivation for this body of literature is the
price of business investment goods, especially
equipment and software, relative to the price of
other goods (such as consumption). The relative
price of investment and its main components are
shown in Figure 2.
Why do we see this steady relative price
decline? The most natural interpretation is that
there is a more rapid pace of technological change
in producing investment goods (especially high-tech equipment).9
To realize the implications of a two-sector
model, consider a simple two-sector Solow-type
model, where s is the share of nominal output that
is invested each period.10 One sector produces
investment goods, I, that are used to create capital;
the other sector produces consumption goods, C.
The two sectors use the same Cobb-Douglas production function but with potentially different
technology levels:
9. On the growth accounting side, see, for example, Jorgenson (2001) or Oliner and Sichel (2000); see also Greenwood, Hercowitz, and Krusell (1997).

10. This model is a fixed–saving-rate version of the two-sector neoclassical growth model in Whelan (2003) and is isomorphic to the one in Greenwood, Hercowitz, and Krusell (1997), who choose a different normalization of the two technology shocks in their model.


Figure 1
Capital-to-Output Ratio in the United States (equipment and structures)

[Figure omitted: line chart on a ratio-scale index (1948 = 1), vertical axis from 1.0 to 1.8, horizontal axis 1950-2000.]

SOURCE: BLS multisector productivity database. Equipment and structures (i.e., fixed reproducible tangible capital) is calculated as a Tornquist index of the two categories. Standard Industrial Classification data (from www.bls.gov/mfp/historicalsic.htm) are spliced to North American Industry Classification System data (from www.bls.gov/mfp/mprdload.htm) starting at 1988 (data downloaded October 13, 2008).

I = K_I^α (A_I L_I)^{1−α}

C = Q K_C^α (A_I L_C)^{1−α}.

In the consumption equation, we have implicitly defined labor-augmenting technological change as A_C = Q^{1/(1−α)} A_I to decompose consumption technology into the product of investment technology, A_I, and a “consumption-specific” piece, Q^{1/(1−α)}. Let investment technology, A_I, grow at rate g_I and the consumption-specific piece, Q, grow at rate q. Perfect competition and cost minimization imply that price equals marginal cost. If the sectors face the same factor prices (and the same rate of indirect business taxes), then

P_I / P_C = MC_I / MC_C = Q.

The sectors also choose to produce with the same capital-to-labor ratios, implying that K_I/(A_I L_I) = K_C/(A_I L_C) = K/(A_I L). We can then write the production functions as

I = A_I L_I (K/(A_I L))^α

C = Q A_I L_C (K/(A_I L))^α.

We can now write the economy’s budget constraint in a simple manner:

(4)  Y^{Inv. Units} ≡ I + C/Q = A_I (L_I + L_C)(K/(A_I L))^α, or y^{Inv. Units} = k^α,

where y^{Inv. Units} = Y^{Inv. Units}/(A_I L) and k = K/(A_I L).
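The aggregation into investment units can be verified numerically: when both sectors operate at the common capital-to-labor ratio, I + C/Q collapses to the single production function in equation (4). A sketch with arbitrary illustrative parameter values:

```python
# Numerical check of the two-sector aggregation: with both sectors at the
# common capital-to-labor ratio, I + C/Q equals A_I*(L_I + L_C)*(K/(A_I*L))^alpha.
# All parameter values are arbitrary illustrations.

alpha, A_I, Q = 0.32, 1.5, 1.2
L_I, L_C = 0.4, 0.6
L = L_I + L_C
K = 2.0

k_term = (K / (A_I * L)) ** alpha
I = A_I * L_I * k_term          # investment-sector output
C = Q * A_I * L_C * k_term      # consumption-sector output

Y_inv_units = I + C / Q         # output measured in investment units
assert abs(Y_inv_units - A_I * L * k_term) < 1e-12
print(Y_inv_units)
```

The consumption-specific technology Q cancels out of Y^{Inv. Units} entirely, which previews the consumption-technology-neutrality result discussed below.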
Figure 2
Price of Business Fixed Investment Relative to Other Goods and Services

[Figure omitted: line chart on a ratio scale (2000 = 100), vertical axis from 80 to 200, horizontal axis 1960-2005, showing declining relative-price series for Equipment and Software, Business Fixed Investment, and Structures.]

NOTE: “Other goods and services” constitutes business GDP less business fixed investment.
SOURCE: Bureau of Economic Analysis and authors’ calculations.

“Output” here is expressed in investment units, and “effective labor” is in terms of technology in the investment sector. The economy mechanically invests a share s of nominal output, which implies that investment per effective unit of labor is i = s · y^{Inv. Units}.11 Capital accumulation then takes the same form as in the one-sector model, except that it is only growth in investment technology, g_I, that matters. In particular, in steady state,

(5)  s y^{Inv. Units} = (n + δ + g_I) k.

The production function (4) and capital-accumulation equation (5) correspond exactly to their one-sector counterparts. Hence, the dynamics of capital in this model reflect technology in the investment sector alone. In steady state, capital per unit of labor, K/L, grows at rate g_I, so the contribution of capital deepening to labor-productivity growth from equation (1) is

α g_I = α · tfp̂_I / (1 − α).

11. s · y^{Inv. Units} = [P_I I / (P_I I + P_C C)] · [(I + P_C C / P_I) / (A_I L)] = I / (A_I L).
Consumption technology in this model is “neutral” in that it does not affect investment or capital
accumulation; the same result carries over to the
Ramsey version of this model, with or without
variable labor supply. (Basu et al., 2008, discuss
the idea of consumption-technology neutrality
in greater detail.12)
To apply this model to the data, we need to
decompose aggregate TFP growth (calculated from
12. Note also that output in investment units is not equal to chain output in the national accounts. Chain gross domestic product (GDP) growth is Ŷ = s Î + (1 − s) Ĉ. In contrast, in this model Ŷ^{Inv. Units} = s Î + (1 − s) Ĉ − (1 − s) q̂. Hence, Ŷ = Ŷ^{Inv. Units} + (1 − s) q̂.


chained output) into its consumption and investment components. Given the conditions so far,
the following two equations hold:
tfp̂ = s · tfp̂_I + (1 − s) · tfp̂_C,

P̂_C − P̂_I = tfp̂_I − tfp̂_C.

These are two equations in two unknowns—tfp̂_I and tfp̂_C. Hence, they allow us to decompose aggregate TFP growth into investment and consumption TFP growth.13
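The two-equation system solves by simple substitution. The sketch below uses the article's investment share s = 0.15 but otherwise hypothetical inputs (the aggregate TFP and relative-price numbers are invented; the article's actual series are not reproduced here):

```python
# Solving the two-equation system for sectoral TFP growth, given aggregate
# TFP growth and the change in the relative price of consumption to investment.
# Inputs other than s are hypothetical, chosen only to illustrate the algebra.

s = 0.15            # nominal investment share used in the article
tfp_agg = 1.34      # aggregate TFP growth, percent per year (hypothetical)
rel_price = 2.0     # P_C-hat - P_I-hat, percent per year (hypothetical)

# tfp = s*tfp_I + (1 - s)*tfp_C  and  P_C-hat - P_I-hat = tfp_I - tfp_C.
# Substituting tfp_I = tfp_C + rel_price into the first equation:
tfp_C = tfp_agg - s * rel_price
tfp_I = tfp_C + rel_price

print(f"investment TFP:  {tfp_I:.2f}")
print(f"consumption TFP: {tfp_C:.2f}")
assert abs(s * tfp_I + (1 - s) * tfp_C - tfp_agg) < 1e-12
```

Faster relative-price decline for investment goods maps one-for-one into a larger gap between investment and consumption TFP growth.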
Table 2 shows that the two-sector growth
model does, in fact, fit the data better. All derivations are done assuming an investment share of
0.15, about equal to the nominal value of business
fixed investment relative to the value of business
output.
For the 1948-73 and 1973-95 periods, a comparison of columns 5 and 6 indicates that the
model fits quite well—and much better than the
one-sector model. The improved fit reflects that
although overall TFP growth slowed very sharply,
investment TFP growth (column 3) slowed much
less. Hence, the slowdown in capital deepening
was much smaller.
The steady-state predictions work less well
for the periods after 1995, when actual capital
deepening fell short of the steady-state prediction
for capital deepening. During these periods, not
only did overall TFP accelerate, but the relative
price decline in column 2 also accelerated. Hence,
implied investment TFP accelerated markedly (as
did other TFP). Of course, the transition dynamics imply that capital deepening converges only
slowly to the new steady state, and a decade is a
relatively short time. (In addition, the pace of
investment-sector TFP was particularly rapid in
the late 1990s and has slowed somewhat in the
2000s.) So the more important point is that, qualitatively, the model works in the right direction even over this relatively short period.

13. The calculations below use the official price deflators from the national accounts. Gordon (1990) argues that many equipment deflators are not sufficiently adjusted for quality improvements over time. Much of the macroeconomic literature since then has used the Gordon deflators (possibly extrapolated, as in Cummins and Violante, 2002). Of course, as Whelan (2003) points out, much of the discussion of biases in the consumer price index involves service prices, which also miss many quality improvements.
Despite these uncertainties, a bottom-line
comparison of the one- and two-sector models is
of interest. Suppose that the 1995-2007 rates of
TFP growth continue to hold in both sectors (a big
“if” discussed in the next section). Suppose also
that the two-sector model fits well going forward,
as it did in the 1948-95 period. Then we would
project that future output per hour (like output
per quality-adjusted unit of labor, shown in
Tables 1 and 2) will grow on average about 0.75
percentage points per year faster than the one-sector model would predict (1.38 versus 0.63), as
a result of greater capital deepening. The difference is clearly substantial: It is a significant fraction of the average 2.15 percent growth rate in
output per unit of labor (and 2.5 percent growth
rate of output per hour) over the 1948-2007 period.
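The 0.75-percentage-point wedge comes entirely from the capital-deepening term. A minimal sketch of that arithmetic (the function name is ours; the TFP rates are the 1995-2007 values from Table 2, with a capital share of 0.32):

```python
# Steady-state capital-deepening contribution is alpha/(1-alpha) times
# the relevant TFP growth rate: investment-sector TFP in the two-sector
# model, aggregate TFP in the one-sector model.
ALPHA = 0.32

def capital_deepening(tfp_growth):
    return ALPHA / (1 - ALPHA) * tfp_growth

two_sector = capital_deepening(2.94)   # investment TFP, 1995-2007
one_sector = capital_deepening(1.34)   # aggregate TFP, 1995-2007
print(round(two_sector, 2), round(one_sector, 2))  # 1.38 0.63
print(round(two_sector - one_sector, 2))           # 0.75, the wedge
```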

PROJECTING THE FUTURE
Forecasters, policymakers, and a number of
academics regularly make “structured guesses”
about the likely path of future growth.14 Not surprisingly, the usual approach is to assume that
the future will look something like the past—but
the challenge is to decide which parts of the past
to include and which to downplay.
In making such predictions, economists often
project average TFP growth for the economy as a
whole. However, viewed through the lens of the
two-sector model, one needs to make separate
projections for TFP growth in both the investment and non-investment sectors. We consider
three growth scenarios: low, medium, and high
(Table 3).
Consider the medium scenario, which has
output per hour growing at 2.3 percent (last column).
Investment TFP is a bit slower than its average
in the post-2000 period, reflecting that investment TFP has generally slowed since the burst
of the late 1990s. Other TFP slows to its rate in
14. Oliner and Sichel (2002) use the phrase "structured guesses." In
addition to Oliner and Sichel, recent high-profile examples of
projections have come from Jorgenson, Ho, and Stiroh (2008) and
Gordon (2006). The CBO and the Council of Economic Advisers
regularly include longer-run projections of potential output.

Table 2
Two-Sector Growth Model Predictions for the U.S. Business Sector

                                Relative price of
                                business fixed                            Actual capital   Predicted capital
                                investment to other   Investment  Other   deepening        deepening contribution
Period             Total TFP    goods and services    TFP         TFP     contribution     in two-sector model
1948-2007          1.39         –0.61                 1.91        1.29    0.76             0.90
1948-1973          2.17         0.33                  1.89        2.22    0.85             0.89
1973-1995          0.52         –1.02                 1.39        0.37    0.62             0.66
1995-2007          1.34         –1.90                 2.94        1.04    0.84             1.38
1995-2000          1.29         –2.93                 3.78        0.85    1.01             1.78
2000-2007          1.37         –1.17                 2.36        1.20    0.72             1.11
2004:Q4–2006:Q4    0.21         0.29                  –0.04       0.25    —                —
2006:Q4–2008:Q3    0.98         –1.12                 1.94        0.82    —                —

NOTE: "Other goods and services" constitutes business GDP less business fixed investment. Capital and labor are adjusted for changes
in composition. Actual capital deepening is \alpha(\hat{k} - \hat{l}), and predicted capital deepening is \alpha \cdot \widehat{tfp}_I / (1 - \alpha).
SOURCE: BLS multifactor productivity dataset, Bureau of Economic Analysis relative-price data, and authors' calculations. The final
two rows reflect quarterly estimates from Fernald (2008); because of the very short sample periods, we do not show steady-state
predictions.

Table 3
A Range of Estimates for Steady-State Labor Productivity Growth

Growth      Investment   Other   Overall   Capital deepening   Labor          Output per
scenario    TFP          TFP     TFP       contribution        productivity   hour worked
Low         1.00         0.70    0.7       0.5                 1.2            1.5
Medium      2.00         0.82    1.0       0.9                 2.0            2.3
High        2.50         1.10    1.3       1.2                 2.5            2.8

NOTE: Calculations assume an investment share of output of 0.15 and a capital share in production, \alpha, of 0.32. Column 3 (Overall TFP)
is an output-share-weighted average of columns 1 and 2. Column 4 is column 1 multiplied by \alpha/(1 - \alpha). Column 5 is output per unit
of composition-adjusted labor input and is the sum of columns 3 and 4. Column 6 adds an assumed growth rate of labor quality/
composition of 0.3 percent per year, and therefore equals column 5 plus 0.3 percent.

the second half of the 1990s, reflecting an assumption that the experience of the early 2000s is
unlikely to persist.
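The scenarios in Table 3 follow mechanically from the two TFP assumptions. A sketch of that arithmetic (the function is ours; the parameter values are those stated in the note to Table 3):

```python
# Steady-state labor-productivity arithmetic from the note to Table 3:
# overall TFP is the share-weighted average of the sector TFPs, capital
# deepening is alpha/(1-alpha) times investment TFP, and output per hour
# adds 0.3 percent/year of labor quality/composition growth.
S, ALPHA, LABOR_QUALITY = 0.15, 0.32, 0.3

def scenario(tfp_inv, tfp_other):
    overall = S * tfp_inv + (1 - S) * tfp_other
    deepening = ALPHA / (1 - ALPHA) * tfp_inv
    labor_prod = overall + deepening
    return overall, deepening, labor_prod, labor_prod + LABOR_QUALITY

# Medium scenario: investment TFP 2.00, other TFP 0.82
print([round(x, 2) for x in scenario(2.00, 0.82)])
```

Rounding the results to one decimal reproduces the Table 3 rows to within about 0.1 percentage point; the published table presumably rounds from unrounded underlying inputs.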
Productivity growth averaging about 2.25 percent is close to a consensus forecast. For example,
in the first quarter of 2008, the median estimate
in the Survey of Professional Forecasters (SPF,
2008) was for 2 percent labor-productivity growth
over the next 10 years (and 2.75 percent gross
domestic product [GDP] growth). In September
2008, the Congressional Budget Office estimated
that labor productivity (in the nonfarm business
sector) would grow at an average rate of about
2.2 percent between 2008 and 2018.15
As Table 3 clearly shows, however, small and
plausible changes in assumptions—well within
the range of recent experience—can make a large
difference for steady-state growth projections.
As a result, a wide range of plausible outcomes
exists. In the SPF, the standard deviation across
the 39 respondents for productivity growth over
the next 10 years was about 0.4 percent—with a
range of 0.9 to 3.0 percent. Indeed, the current
median estimate of 2.0 percent is down from an
estimate of 2.5 percent in 2005, but remains much
higher than the one-year estimate of only 1.3 percent in 1997.16
15. Calculated from data in CBO (2008).

16. The SPF has been asking about long-run projections in the first
quarter of each year since 1992. The data are available at
www.philadelphiafed.org/research-and-data/real-time-center/
survey-of-professional-forecasters/data-files/PROD10/.


The two-sector model suggests several key
questions in making long-run projections. First,
what will be the pace of technical progress in
producing information technology (IT) and, more
broadly, equipment goods? For example, for hardware, Moore’s law—that semiconductor capacity
doubles approximately every two years—provides
plausible bounds. For software, however, we really
have very little firm ground for speculation.
Second, how elastic is the demand for IT?
The previous discussion of the two-sector model
assumed that the investment share was constant
at 0.15. But an important part of the price decline
reflected that IT, for which prices have been falling
rapidly, is becoming an increasing share of total
business fixed investment. At some point, a constant share is a reasonable assumption and consistent with a balanced growth path. Yet over the
next few decades, very different paths are possible.
Technology optimists (such as DeLong, 2002)
think that the elasticity of demand for IT exceeds
unity, so that demand will rise even faster than
prices fall. They think that firms and individuals
will find many new uses for computers, semiconductors, and, indeed, information, as these
commodities get cheaper and cheaper. By contrast,
technology pessimists (such as Gordon, 2000)
think that the greatest contribution of the IT revolution is in the past rather than the future. For
example, firms may decide they will not need
much more computing power in the future, so
that as prices continue to fall, the nominal share
F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W

Basu and Fernald

of expenditure on IT will also fall. For example,
new and faster computers might offer few advantages for word processing relative to existing computers, so the replacement cycle might become
longer.
Third, what will happen to TFP in the non-IT-producing sectors? The range of uncertainty here
is very large—larger, arguably, than for the first
two questions. The general-purpose-technology
nature of computing suggests that faster computers and better ability to manage and manipulate
information might well lead to TFP improvements
in computer-using sectors.17 For example, many
important management innovations, such as the
Wal-Mart business model or the widespread diffusion of warehouse automation, are made possible by cheap computing power. Productivity in
research and development may also rise more
directly; auto parts manufacturers, for example,
can design new products on a computer rather
than building physical prototype models. That
is, computers may lower the cost and raise the
returns to research and development.
In addition, are these sorts of TFP spillovers
from IT to non-IT sectors best considered as
growth effects or level effects? For example, the
“Wal-Martization” of retailing raises productivity
levels (as more-efficient producers expand and
less-efficient producers contract) but it does not
necessarily boost long-run growth.
Fourth, the effects noted previously might
well depend on labor market skills. Many endogenous growth models incorporate a key role for
human capital, which is surely a key input into
the innovation process—whether reflected in
formal research and development or in management reorganizations. Beaudry, Doms, and Lewis
(2006) find evidence that the intensity of personal
computer use across U.S. cities is closely related
to education levels in those cities.
We hope we have convinced readers that it is
important to take a two-sector approach to estimating the time path of long-run output. But as
this (non-exhaustive) discussion demonstrates,
knowing the correct framework for analysis is only
one of many inputs to projecting potential output
correctly. Much still remains unknown about
potential output, even along a steady-state growth
path. The biggest problem is the lack of knowledge about the deep sources of TFP growth.

17. See, for example, Basu et al. (2003) for an interpretation of the
broad-based TFP acceleration in terms of intangible organizational
capital associated with using computers. Of course, an intangible-capital
story suggests that the measured share of capital is too low,
and that measured capital is only a subset of all capital—so the
model and calibration in the earlier section are incomplete.

SHORT-RUN CONSIDERATIONS
General Issues in Defining and
Estimating Short-Run Potential Output
Traditionally, macroeconomists have taken
the view expressed in Solow (1997) that a growth model such as the ones
described previously explains the economy's
long-run behavior. Factor supplies and technology
determine output, with little role for “demand”
shocks. However, the short run was viewed very
differently: as Solow (1997) put it, "…fluctuations are predominantly driven by aggregate
demand impulses” (p. 230).
Solow (1997) recognizes that real business
cycle theories take a different view, providing a
more unified vision of long-run growth and short-run fluctuations than traditional Keynesian views
did. Early real business cycle models, in particular,
emphasized the role of high-frequency technology
shocks. These models are also capable of generating fluctuations in response to nontechnological
“demand” shocks, such as government spending.
Since early real business cycle models typically
do not incorporate distortions, they provide examples in which fluctuations driven by government
spending or other impulses could well be optimal
(taking the shocks themselves as given). Nevertheless, traditional Keynesian analyses often presumed that potential output was a smooth trend,
so that any fluctuations were necessarily suboptimal (regardless of whether policy could do anything about them).
Fully specified New Keynesian models provide a way to think formally about the sources of
business cycle fluctuations. These models are
generally founded on a real business cycle model,
albeit one with real distortions, such as firms
J U LY / A U G U S T

2009

197

Basu and Fernald

having monopoly power. Because of sticky wages
and/or prices, purely nominal shocks, such as
monetary policy shocks, can affect real outcomes.
The nominal rigidities also affect how the economy responds to real shocks, whether to technology, preferences, or government spending.
Short-run potential output is naturally defined
as the rate of output the economy would have if
there were no nominal rigidities, that is, by the
responses in the real business cycle model
underlying the sticky price model.18 This is our
approach to producing a time series of potential
output fluctuations in the short run.
In New Keynesian models, where prices
and/or wages might adjust slowly toward their
long-run equilibrium values, actual output might
well deviate from this short-term measure of potential output. In many of these models, the “output
gap”—the difference between actual and potential
output—is the key variable in determining the
evolution of inflation. Kuttner (1994) and Laubach
and Williams (2003) use this intuition to estimate
the output gap as an unobserved component in a
Phillips curve relationship. They find fairly substantial time variation in potential output.
In the context of New Keynesian DSGE models,
is there any reason to think that potential output
is a smooth series? At a minimum, a low variance
of aggregate technology shocks as well as inelastic
labor supply is needed. Rotemberg (2002), for
example, suggests that because of slow diffusion
of technology across producers, stochastic technological improvements might drive long-run
growth without being an important factor at business cycle frequencies.19
18. See Woodford (2003). There is a subtle issue in defining flexible
price potential output when the time path of actual output may be
influenced by nominal rigidities. In theory, the flexible price output
series should be a purely forward-looking construct, which is
generated by "turning off" all nominal rigidities in the model, but
starting from current values of all state variables, including the
capital stock. Of course, the current value of the capital stock might
be different from what it would have been in a flexible price model
with the same history of shocks because nominal rigidities operated
in the past. Thus, in principle, the potential-output series should
be generated by initializing a flexible price model every period,
rather than taking an alternative time-series history from the flexible
price model hit by the same sequence of real shocks. We do the
latter rather than the former because we believe that nominal rigidities
cause only small deviations in the capital stock, but it is possible
that the resulting error in our potential-output series might
actually be important.

Nevertheless, although a priori one might
believe that technology changes only smoothly
over time, there is scant evidence to support this
position. Basu, Fernald, and Kimball (2006) control econometrically for nontechnological factors
affecting the Solow residual—nonconstant returns
to scale, variations in labor effort and capital’s
workweek, and various reallocation effects—and
still find a “purified technology” residual that is
highly variable. Alexopoulos (2006) uses publications of technical books as a proxy for unobserved
technical change and finds that this series is not
only highly volatile, but explains a substantial
fraction of GDP and TFP. Finally, variance decompositions often suggest that innovations to technology explain a substantial share of the variance
of output and inputs at business cycle frequencies;
see Basu, Fernald, and Kimball (2006) and Fisher
(2006).
When producing a time series of short-run
potential output, it is necessary not only to know
“the” correct model of the economy, but also the
series of historical shocks that have affected the
economy. One approach is to specify a model,
which is often complex, and then use Bayesian
methods to estimate the model parameters on
the data. As a by-product, the model estimates
the time series of all the shocks that the model
allows.20 Because DSGE models are “structural”
in the sense of Lucas’s (1976) critique, one can
perform counterfactual simulations—for example, by turning off nominal rigidities and using
the estimated model and shocks to create a time
series of flexible price potential output.
We do not use this approach because we are
not sure that Bayesian estimation of DSGE models
always uses reliable schemes to identify the relevant shocks. The full-information approach of
these models is, of course, preferable in an efficiency sense—if one is sure that one has specified
19. A recent paper by Justiniano and Primiceri (2008) estimates both
simple and complex New Keynesian models and finds that most
of the volatility in the flexible-wage/price economy reflects extreme
volatility in markup shocks. They still estimate that there is considerable
quarter-to-quarter volatility in technology, so that even
if the only shocks were technology shocks, their flexible price
measure of output would also have considerable volatility from
one quarter to the next.

20. See Smets and Wouters (2007).

the correct structural model of the economy
with all its frictions. We prefer to use limited-information methods to estimate the key shocks—
technology shocks, in our case—and then feed
them into small, plausibly calibrated models of
fluctuations. At worst, our method should provide
a robust, albeit inefficient, method of assessing
some of the key findings of DSGE models estimated using Bayesian methods.
We believe that our method of estimating the
key shocks is both more transparent in its identification and robust in its method because it does
not rely on specifying correctly the full model of
the economy, but only small pieces of such a
model. As in the case of the Basu, Fernald, and
Kimball (2006) procedure underlying our shock
series, we specify only production functions and
costs of varying factor utilization and assume that
firms minimize costs—all standard elements of
current “medium-scale” DSGE models. Furthermore, we assume that true technology shocks
are orthogonal to other structural shocks, such
as monetary policy shocks, which can therefore
be used as instruments for estimation. Finally,
because we do not have the overhead of specifying and estimating a complete structural general
equilibrium model, we are able to model the
production side of the economy in greater detail.
Rather than assuming that an aggregate production function exists, we estimate industry-level
production functions and aggregate technology
shocks from these more disaggregated estimates.
Basu and Fernald (1997) argue that this approach
is preferable in principle and solves a number of
puzzles in recent production-function estimation
in practice.
We use time series of “purified” technology
shocks, similar to those presented in Basu,
Fernald, and Kimball (2006) and Basu et al. (2008).
However, these series are at an annual frequency.
Fernald (2008) applies the methods in these
articles to quarterly data and produces higher-frequency estimates of technology shocks. Fernald
estimates utilization-adjusted measures of TFP for
the aggregate economy, as well as for the investment and consumption sectors. In brief, aggregate
TFP is measured using data from the BLS quarterly
labor productivity data, combined with capital-service data estimated from detailed quarterly
investment data. Labor quality and factor shares
are interpolated from the BLS multifactor-productivity dataset. The relative price of investment goods is used to decompose aggregate TFP
into investment and consumption components,
using the (often-used) assumption that relative
prices reflect relative TFPs. The utilization adjustment follows Basu, Fernald, and Kimball (2006),
who use hours per worker as a proxy for utilization change (with an econometrically estimated
coefficient) at an industry level. The input-output
matrix was used to aggregate industry utilization
change into investment and consumption utilization change, following Basu et al. (2008).21
To produce our estimated potential output
series, we feed the technology shocks estimated
by Fernald (2008) into simple one- and two-sector
models of fluctuations (see the appendix). Technology shocks shift the production function
directly, even if they are not amplified by changes
in labor supply in response to variations in wages
and interest rates. If labor supply is elastic, then
a fortiori the changes in potential output will be
more variable for any given series of technology
shocks.
Elastic labor supply also allows nontechnology
shocks to move short-run, flexible price output
discontinuously. Shocks to government spending,
even if financed by lump-sum taxes, cause changes
in labor supply via a wealth effect. Shocks to distortionary tax rates on labor income shift labor
demand and generally cause labor input, and
hence output, to change. Shocks to the preference
for consumption relative to leisure can also cause
changes in output and its components.
The importance of all of these shocks for
movements in flexible price potential output
depends crucially on the size of the Frisch
(wealth-constant) elasticity of labor supply.
Unfortunately, this is one of the parameters in
economics whose value is most controversial, at
least at an aggregate level. Most macroeconomists
assume values between 1 and 4 for this crucial
21. Because of a lack of data at a quarterly frequency, Fernald (2008)
does not correct for deviations from constant returns or for heterogeneity
across industries in returns to scale—issues that Basu,
Fernald, and Kimball (2006) argue are important.

parameter, but not for particularly strong reasons.22 On the other hand, Card (1994) reviews
both microeconomic and aggregative evidence
and concludes there is little evidence in favor of
a nonzero Frisch elasticity of labor supply. The
canonical models of Hansen (1985) and Rogerson
(1988) attempt to bridge the macro-micro divide.
However, Mulligan (2001) argues that the strong
implication of these models, an infinite aggregate
labor supply elasticity, depends crucially on the
assumption that workers are homogeneous, and
can easily disappear when one allows for heterogeneity in worker preferences.
We do not model real, nontechnological
shocks to the economy in creating our series on
potential output. Our decision is partly due to
uncertainty over the correct value of the aggregate
Frisch labor supply elasticity, which as discussed
previously is crucial for calibrating the importance of such shocks. We also make this decision
because in our judgment there is even less consensus in the literature over identifying true innovations to fiscal policy or to preferences than there
is on identifying technology shocks. Our decision
to ignore nontechnological real shocks clearly
has the potential to bias our series on potential
output, and depending on the values of key parameters, this bias could be significant.

One-Sector versus Two-Sector Models
In the canonical New Keynesian Phillips
curve, derived with Calvo price setting and flexible wages, inflation today depends on expected
inflation tomorrow, as well as on the gap between
actual output and the level of output that would
occur with flexible prices.
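In its textbook form this relationship can be written as follows (a standard statement, not an equation given in the text; here \(\beta\) is the household discount factor and \(\kappa\) the slope implied by the Calvo pricing parameters):

```latex
\pi_t = \beta\, E_t \pi_{t+1} + \kappa \left( y_t - y_t^{flex} \right),
```

where \(y_t^{flex}\) is the flexible price (potential) level of output, so \(y_t - y_t^{flex}\) is the output gap.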
To assess how potential and actual output
respond in the short run in a one- versus two-sector model, we used a very simple two-sector
New Keynesian model (see the appendix). As in
the long-run model, we assume that investment
22. In many cases, it is simply because macro models do not "work"—
that is, display sufficient amplification of shocks—for smaller
values of the Frisch labor supply elasticity. In other cases, values
like 4 are rationalized by assuming, without independent evidence,
that the representative consumer's utility from leisure takes the
logarithmic form. However, this restriction is not imposed by the
King-Plosser-Rebelo (1988) utility function, which guarantees
balanced growth for any value of the Frisch elasticity.

and consumption production uses a Cobb-Douglas
technology with the same factor shares but with
a (potentially) different multiplicative technology
parameter. To keep things simple, factors are
completely mobile, so that a one-sector model is
the special case when the same technology shock
hits both sectors.
We simulated the one- and two-sector models
using the utilization-adjusted technology shocks
estimated in Fernald (2008). Table 4 shows standard deviations of selected variables in flexible
and sticky price versions of the one- and two-sector models, along with actual data for the U.S.
economy.
The model does a reasonable job of approximating the variation in actual data, considering
how simple it is and that only technology shocks
are included. Investment in the data is slightly
less volatile than either in the sticky price model
or the two-sector flexible price model. This is not
surprising, given that the model does not have any
adjustment costs or other mechanisms to smooth
out investment. Consumption, labor, and output
in the data are more volatile than in the models.23
Additional shocks (e.g., to government spending,
monetary policy, or preferences) would presumably add volatility to model simulations.
An important observation from Table 4 is that
potential output—the flexible price simulations,
in either the one- or two-sector variants—is highly
variable, roughly as variable as sticky price output. The short-run variability of potential output
in New Keynesian models has been emphasized
by Neiss and Nelson (2005) and Edge, Kiley, and
Laforte (2007).
These models, with the shocks we have added,
show a very high correlation of flexible and sticky
price output. In the two-sector case, the correlation is 0.91. Nevertheless, the implied output gap
(shown in the penultimate line of Table 4 as the
difference between output in the flexible and
sticky price cases) is more volatile than would be
implied if potential output were estimated with
the one-sector model (the final line).
23. The relative volatility of consumption is not that surprising,
because the models do not have consumer durables and we have
not yet analyzed consumption of nondurables and services in the
actual data.

Table 4
Standard Deviations, Model Simulations, and Data

Variable                                          Investment   Consumption   Labor   Output
One-sector, flexible price                        4.40         0.81          0.47    1.52
Two-sector, flexible price                        6.28         0.89          0.73    1.66
One-sector, sticky price                          4.82         0.84          0.64    1.60
Two-sector, sticky price                          5.52         0.87          0.85    1.68
Data                                              4.54         1.12          1.14    1.95
Output gap (two-sector sticky price
  less two-sector flexible price)                 5.78         0.59          0.96    0.72
"One-sector" estimated gap (two-sector sticky
  price less one-sector flexible price)           2.55         0.18          0.59    0.41

NOTE: Model simulations use utilization-adjusted TFP shocks from Fernald (2008). Two-sector simulations use estimated quarterly
consumption and investment technology; one-sector simulations use the same aggregate shock (a share-weighted average of the two
sectoral shocks) in both sectors. All variables are filtered with the Christiano-Fitzgerald bandpass filter to extract variation between 6
and 32 quarters.

Figure 3 shows that the assumption that potential output has no business cycle variation—
which is tantamount to using (Hodrick-Prescott–
filtered) sticky price output itself as a proxy for
the output gap—would overestimate the variation
in the output gap. This would not matter too much
if the output gap were perfectly correlated with
sticky price output itself—then, at least, the sign,
if not the magnitude, would be correct. However,
as the figure shows, the “true” two-sector output
gap in the model (two-sector sticky price output
less two-sector flexible price output) is imperfectly
correlated with sticky price output—indeed, the
correlation is only 0.25. So in this model, policymakers could easily be misled by focusing solely
on output fluctuations rather than the output gap.

Implications for Stabilization Policy
If potential output fluctuates substantially
over time, this has implications
for the desirability of stabilization policy. In particular, policymakers should focus only on
stabilizing undesirable fluctuations.
Of course, the welfare benefits of such policies
remain controversial. Lucas (1987, 2003) famously
argued that, given the fluctuations we observe,
the welfare gains from additional stabilization of
F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W

the economy are likely to be small. In particular,
given standard preferences and the observed
variance of consumption (around a linear trend),
a representative consumer would be willing to
reduce his or her average consumption by only
about ½ of 1/10th of 1 percent in exchange for
eliminating all remaining variability in consumption. Note that this calculation does not necessarily imply that stabilization policy does not
matter, because the calculation takes as given the
stabilization policies implemented in the past.
Stabilization policies might well have been valuable—for example, in eliminating recurrences of
the Great Depression or by minimizing the frequency of severe recessions—but additional stabilization might not offer large benefits.
This calculation amounts to some $5 billion
per year in the United States, or about $16 per
person. Compared with the premiums we pay for
very partial insurance (e.g., for collision coverage
on our cars), this is almost implausibly low. Any
politician would surely vote to pay $5 billion for
a policy that would eliminate recessions.
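The dollar figures can be checked with a back-of-the-envelope calculation. In the sketch below, the consumption and population magnitudes are our assumptions (roughly U.S. values around 2008), not numbers given in the text:

```python
# Lucas-style cost of business cycles: one-half of one-tenth of one
# percent of consumption, scaled to assumed U.S. aggregates.
consumption = 10e12   # assumed: roughly $10 trillion/year of consumption
population = 304e6    # assumed: roughly 304 million people (2008)
share = 0.0005        # 1/2 of 1/10 of 1 percent

total = share * consumption
per_person = total / population
print(f"${total / 1e9:.0f} billion per year, about ${per_person:.0f} per person")
# prints "$5 billion per year, about $16 per person"
```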
Hence, a sizable literature considers ways to
obtain larger costs of business cycle fluctuations,
with mixed results. Arguments in favor of stabilization include Galí, Gertler, and López-Salido
(2007), who argue that the welfare effects of booms

Figure 3
Output Gap and Sticky Price Output

[Line chart, quarterly, 1947:Q3 through 2007:Q3, in percent; series shown: output from the two-sector sticky price model and the two-sector output gap.]

NOTE: Bandpass-filtered (6 to 32 quarters) output from two-sector sticky price model and the corresponding output gap (defined as
sticky price output less flexible price output).

and recessions may be asymmetric. In particular,
because of wage and price markups, steady-state
employment and output are inefficiently low
in their model, so that the costs of fluctuations
depend on how far the economy is from full
employment. Recessions are particularly costly—
welfare falls by more during a business cycle
downturn than it rises during a symmetric expansion. Barlevy (2004) argues in an endogenous-growth framework that stabilization might increase
the economy’s long-run growth rate; this allowed
him to obtain very large welfare effects from business cycle volatility.
This discussion of welfare effects highlights
that much work remains to understand the desirability of observed fluctuations, the ability of
policy to smooth the undesirable fluctuations in

the output gap, and the welfare benefits of such
policies.

WHAT IS CURRENT POTENTIAL
OUTPUT GROWTH?
Consider the current situation, as of late 2008: Is potential output growth relatively high, relatively low, or close to its steady-state value?24 The answer is important for policymakers: statements by Federal Open Market Committee (FOMC) participants have emphasized the importance of economic weakness in reducing inflationary pressures.25 Moreover, a discussion of the issue highlights some of what we know, and do not know, about potential output. Some of the considerations are closely linked to earlier points we have made, but these considerations also allow a discussion of other issues that are not included in the simple models discussed here.

24 We could, equivalently, discuss the magnitude or even sign of the output gap, which is naturally defined in levels. The level is the integral of the growth rates, of course, and growth rates make it a little easier to focus, at least implicitly, on how the output gap is likely to change over time.

FEDERAL RESERVE BANK OF ST. LOUIS REVIEW
Several arguments suggest that potential
output growth might currently be running at a
relatively rapid pace. First, and perhaps most
importantly, TFP growth has been relatively rapid
from the end of 2006 through the third quarter
of 2008 (see Table 2). During this period output
growth itself was relatively weak, and hours per
worker were generally falling; hence, following
the logic in Basu, Fernald, and Kimball (2006),
factor utilization appears to have been falling as
well. As a result, in both the consumption and
the investment sectors, utilization-adjusted TFP
(from Fernald, 2008) has grown at a more rapid
pace than its post-1995 average. This fast pace has
occurred despite the reallocations of resources
away from housing and finance and the high level
of financial stress.
Second, substantial declines in wealth are
likely to increase desired labor supply. Most
obviously, housing wealth has fallen and stock
market values have plunged; but tax and expenditure policies aimed at stabilizing the economy
could also suggest a higher present value of taxes.
Declining wealth has a direct, positive effect on
labor supply. In addition, as the logic of Campbell
and Hercowitz (2006) would imply, rising financial stress could lead to increases in labor supply
as workers need to acquire larger down payments
for purchases of consumer durables. And if there
is habit persistence in consumption, workers
might also seek, at least temporarily, to work more
hours to smooth the effects of shocks to gasoline
and food prices.
Nevertheless, there are also reasons to be concerned that potential output growth is currently
lower than its pace over the past decade or so.

First, Phelps (2008) raises the possibility that
because of a sectoral shift away from housing-related activities and finance, potential output
growth is temporarily low and the natural rate of
unemployment is temporarily high. Although
qualitatively suggestive, it is unclear whether the sectoral shifts argument is quantitatively important.
For example, Valletta and Cleary (2008) look at
the (weighted) dispersion of employment growth
across industries, a measure used by Lilien (1982).
They find that as of the third quarter of 2008, “the
degree of sectoral reallocation…remains low relative to past economic downturns.” Valletta and
Cleary (2008) also consider job vacancy data,
which Abraham and Katz (1986) suggest could
help distinguish between sectoral shifts and pure
cyclical increases in unemployment and employment dispersion. The basic logic is that in a sectoral shifts story, expanding firms should have
high vacancies that partially or completely offset
the low vacancies in contracting firms. Valletta
and Cleary find that the vacancy rate has been
steadily falling since late 2006.26
Third, Bloom (2008) argues that uncertainty
shocks are likely to lead to a sharp decline in output. As he puts it, there has been “a huge surge in
uncertainty that is generating a rapid slow-down
in activity, a collapse of banking preventing many
of the few remaining firms and consumers that
want to invest from doing so, and a shift in the
political landscape locking in the damage through
protectionism and anti-competitive policies”
(p. 4). His argument is based on the model simulations in Bloom (2007), in which an increase in
macro uncertainty causes firms to temporarily
pause investment and hiring. In his model, productivity growth also falls temporarily because
of reduced reallocation from lower- to higher-productivity establishments.
Fourth, the credit freeze could directly reduce
productivity-improving reallocations, along the
lines suggested by Bloom (2007), as well as Eisfeldt
and Rampini (2006). Eisfeldt and Rampini argue
that, empirically, capital reallocation is procyclical, whereas the benefits (reflecting cross-sectional dispersion of marginal products) are countercyclical. These observations suggest that the informational and contractual frictions, including financing constraints, are higher in recessions. The situation as of late 2008 is one in which financing constraints are particularly severe, which is likely to reduce efficient reallocations of both capital and labor.

25 For example, in the minutes from the September 2008 FOMC meeting, participants forecast that over time "increased economic slack would tend to damp inflation" (Board of Governors, 2008).

26 Valletta and Cleary do find some evidence that the U.S. Beveridge curve might have shifted out in recent quarters relative to its position from 2000 to 2006.
Fifth, there could be other effects from the
seize-up of financial markets in 2008. Financial
intermediation is an important intermediate input
into production in all sectors. If it is complementary with other inputs (as in Jones, 2008)—for example, firms need access to the commercial paper market to finance working capital needs—then a freeze in intermediation could lead to substantial disruptions of real operations.
Finally, the substantial volatility in commodity prices, especially oil, in recent years could
affect potential output. That said, although oil is
a crucial intermediate input into production,
changes in oil prices do not have a clear-cut effect
on TFP, measured as domestic value added relative to primary inputs of capital and labor. They
might, nevertheless, influence equilibrium output
by affecting equilibrium labor supply. Blanchard
and Galí (2007) and Bodenstein, Erceg, and
Guerrieri (2008), however, are two recent analyses
in which, because of (standard) separable preferences, there is no effect on flexible price GDP or
employment from changes in oil prices. So there
is no a priori reason to expect fluctuations in oil
prices to have a substantial effect on the level or
growth rate of potential output.
A difficulty for all these arguments that potential output growth might be temporarily low is
the observation already made, that productivity
growth (especially after adjusting for utilization)
has, in fact, been relatively rapid over the past
seven quarters.
It is possible the productivity data have been mismeasured in recent quarters.27 Basu, Fernald, and Shapiro (2001) highlight variations in disruption costs associated with tangible investment.

27 Note also that the data are all subject to revision. For example, the annual revision in 2009 will revise data from 2006 forward. In addition, labor-productivity data for the nonfinancial corporate sector, which are based on income-side rather than expenditure-side data, show less of a slowdown in 2005 and 2006 and less of a pickup since then. That said, even the nonfinancial corporate productivity numbers have remained relatively strong in the past few years.
Comparing 2004:Q4–2006:Q4 (when productivity
growth was weak) with 2006:Q4–2008:Q3 (when
productivity was strong), growth in business fixed
investment was very similar, suggesting that time-varying disruption costs probably explain little of
the recent variation in productivity growth rates.
Basu et al. (2004) and Oliner, Sichel, and
Stiroh (2007) discuss the role of mismeasurement
associated with intangible investments, such as
organizational changes associated with IT. With
greater concerns about credit and cash flow, firms
might have deferred organizational investments
and reallocations; in the short run, such deferral
would imply faster measured productivity growth,
even if true productivity growth (in terms of total
output, the sum of measured output plus unobserved intangible investment) were constant. Basu
et al. (2004) argue for a link between observed
investments in computer equipment and unobserved intangible investments in organizational
change. Growth in computer and software investment does not show a notable difference between
the 2004:Q4–2006:Q4 and 2006:Q4–2008:Q3
periods. If anything, the investment rate was
higher in the latter period—so that this proxy
again does not imply mismeasurement.
Given wealth effects on labor supply and
strong recent productivity performance—along
with the failure of typical proxies for mismeasurement to explain the productivity performance—
there are reasons for optimism about the short-run
pace of potential output growth. Nevertheless, the
major effects of the adverse shocks on potential
output seem likely to be ahead of us. For example,
the widespread seize-up of financial markets has
been especially pronounced only in the second
half of 2008. We expect that as the effects of the
collapse in financial intermediation, the surge in
uncertainty, and the resulting declines in factor
reallocation play out over the next several years,
short-run potential output growth will be constrained relative to where it otherwise would
have been.

CONCLUSION

This article has highlighted a few things we think we know about potential output—namely, the importance in both the short run and the long run of rapid technological change in producing equipment investment goods and the likely time variation in the short-run growth rate of potential. Our discussion of these points has, of course, pointed toward some of the many things we do not know.

Taking a step back, we have advocated thinking about policy in the context of explicit models that suggest ways to think about the world economy, including potential output. But there is an important interplay between theory and measurement, as the discussion suggests. Every day, policymakers grapple with challenges that are not present in the standard models. Not only do they not know the true model of the economy, they also do not know the current state variables or the shocks with any precision; and the environment is potentially nonstationary, with the continuing question of whether structural change (e.g., parameter drift) has occurred. Theory (and practical experience) tells us that our measurements are imperfect, particularly in real time. Not surprisingly, central bankers look at many of the real-time indicators and filter them analytically—relying on theory and experience. Estimating potential output growth is one modest and relatively transparent example of this interplay between theory and measurement.

REFERENCES

Abraham, Katharine G. and Katz, Lawrence F. “Cyclical Unemployment: Sectoral Shifts or Aggregate Disturbances?” Journal of Political Economy, June 1986, 94(3), pp. 507-22.

Alexopoulos, Michelle. “Read All About It! What Happens Following a Technology Shock.” Working Paper, University of Toronto, April 2006.

Barlevy, Gadi. “The Cost of Business Cycles Under Endogenous Growth.” American Economic Review, September 2004, 94(4), pp. 964-90.

Basu, Susanto and Fernald, John G. “Returns to Scale in U.S. Production: Estimates and Implications.” Journal of Political Economy, April 1997, 105(2), pp. 249-83.

Basu, Susanto and Fernald, John G. “Aggregate Productivity and Aggregate Technology.” European Economic Review, June 2002, 46(6), pp. 963-91.

Basu, Susanto; Fernald, John G. and Kimball, Miles S. “Are Technology Improvements Contractionary?” American Economic Review, December 2006, 96(5), pp. 1418-48.

Basu, Susanto; Fernald, John G.; Oulton, Nicholas and Srinivasan, Sylaja. “The Case of the Missing Productivity Growth: Or, Does Information Technology Explain Why Productivity Accelerated in the United States but Not the United Kingdom?” in M. Gertler and K. Rogoff, eds., NBER Macroeconomics Annual 2003. Cambridge, MA: MIT Press, 2004, pp. 9-63.

Basu, Susanto; Fernald, John G. and Shapiro, Matthew D. “Productivity Growth in the 1990s: Technology, Utilization, or Adjustment?” Carnegie-Rochester Conference Series on Public Policy, December 2001, 55(1), pp. 117-65.

Basu, Susanto; Fisher, Jonas; Fernald, John G. and Kimball, Miles S. “Sector-Specific Technical Change.” Unpublished manuscript, University of Michigan, 2008.

Basu, Susanto and Kimball, Miles. “Long Run Labor Supply and the Elasticity of Intertemporal Substitution for Consumption.” Unpublished manuscript, University of Michigan, October 2002; www-personal.umich.edu/~mkimball/pdf/cee_oct02-3.pdf.

Beaudry, Paul; Doms, Mark and Lewis, Ethan.
“Endogenous Skill Bias in Technology Adoption:
City-Level Evidence from the IT Revolution.”
Working Paper No. 2006-24, Federal Reserve Bank
of San Francisco, August 2006; www.frbsf.org/
publications/economics/papers/2006/wp06-24bk.pdf.
Blanchard, Olivier J. and Galí, Jordi. “The Macroeconomic Effects of Oil Price Shocks: Why Are the 2000s So Different from the 1970s?” Working Paper No. 07-01, MIT Department of Economics, August 18, 2007.
Bloom, Nicholas. “The Impact of Uncertainty Shocks.”
NBER Working Paper No. 13385, National Bureau
of Economic Research, September 2007;
www.nber.org/papers/w13385.pdf.
Bloom, Nicholas. “The Credit Crunch May Cause
Another Great Depression.” Stanford University
Department of Economics, October 8, 2008;
www.stanford.edu/~nbloom/CreditCrunchII.pdf.
Board of Governors of the Federal Reserve System.
Minutes of the Federal Open Market Committee.
September 16, 2008; www.federalreserve.gov/
monetarypolicy/fomcminutes20080916.htm.
Bodenstein, Martin; Erceg, Christopher E. and
Guerrieri, Luca. “Optimal Monetary Policy with
Distinct Core and Headline Inflation Rates.”
International Finance Discussion Papers 941,
Board of Governors of the Federal Reserve System,
August 2008; www.federalreserve.gov/pubs/ifdp/
2008/941/ifdp941.pdf.
Calvo, Guillermo. “Staggered Prices in a Utility-Maximizing Framework.” Journal of Monetary Economics, September 1983, 12(3), pp. 383-98.
Campbell, Jeffrey and Hercowitz, Zvi. “The Role of
Collateralized Household Debt in Macroeconomic
Stabilization.” Working Paper No. 2004-24, Federal
Reserve Bank of Chicago, revised December 2006;
www.chicagofed.org/economic_research_and_data/
publication_display.cfm?Publication=6&year=
2000%20AND%202005.
Card, David. “Intertemporal Labor Supply: An
Assessment,” in C.A. Sims, ed., Advances in
Econometrics. Volume 2, Sixth World Congress.
New York: Cambridge University Press, 1994,
pp. 49-80.
Congressional Budget Office. “CBO’s Method for
Estimating Potential Output: An Update.” August
2001; www.cbo.gov/ftpdocs/30xx/doc3020/
PotentialOutput.pdf.


Congressional Budget Office. “A Summary of
Alternative Methods for Estimating Potential GDP.”
March 2004; www.cbo.gov/ftpdocs/51xx/doc5191/
03-16-GDP.pdf.
Congressional Budget Office. “Key Assumptions in
CBO’s Projection of Potential Output” (by calendar
year) in The Budget and Economic Outlook: An
Update. September 2008, Table 2-2; www.cbo.gov/
ftpdocs/97xx/doc9706/Background_Table2-2.xls.
Cummins, Jason G. and Violante, Giovanni L.
“Investment-Specific Technical Change in the US
(1947-2000): Measurement and Macroeconomic
Consequences.” Review of Economic Dynamics,
April 2002, 5(2), pp. 243-84.
DeLong, J. Bradford. “Productivity Growth in the
2000s,” in M. Gertler and K. Rogoff, eds., NBER
Macroeconomics Annual 2002. Cambridge, MA:
MIT Press, 2003.
Edge, Rochelle M.; Kiley, Michael T. and Laforte,
Jean-Philippe. “Natural Rate Measures in an
Estimated DSGE Model of the U.S. Economy.”
Finance and Economics Discussion Series 2007-08,
Board of Governors of the Federal Reserve System,
March 26, 2007; www.federalreserve.gov/pubs/
feds/2007/200708/200708pap.pdf.
Eisfeldt, Andrea and Rampini, Adriano. “Capital
Reallocation and Liquidity.” Journal of Monetary
Economics, April 2006, 53(3), pp. 369-99.
Elsby, Michael and Shapiro, Matthew. “Stepping Off
the Wage Escalator: A Theory of the Equilibrium
Employment Rate.” Unpublished manuscript,
April 2008; www.eief.it/it/files/2008/04/steppingoff-2008-04-01.pdf.
Fernald, John G. “A Quarterly Utilization-Adjusted
Measure of Total Factor Productivity.” Unpublished
manuscript, 2008.
Fisher, Jonas. “The Dynamic Effects of Neutral and
Investment-Specific Technology Shocks.” Journal
of Political Economy, June 2006, 114(3), pp. 413-52.
Galí, Jordi; Gertler, Mark and López-Salido, David J. “Markups, Gaps, and the Welfare Costs of Business Fluctuations.” Review of Economics and Statistics, November 2007, 89, pp. 44-59.
Gordon, Robert J. The Measurement of Durable Goods
Prices. Chicago: University of Chicago Press, 1990.
Gordon, Robert J. “Does the ‘New Economy’ Measure up to the Great Inventions of the Past?” Journal of Economic Perspectives, Fall 2000, 14(4), pp. 49-74.
Gordon, Robert J. “Future U.S. Productivity Growth:
Looking Ahead by Looking Back.” Presented at the
Workshop at the Occasion of Angus Maddison’s
80th Birthday, World Economic Performance: Past,
Present, and Future, University of Groningen,
Netherlands, October 27, 2006.
Greenwood, Jeremy; Hercowitz, Zvi and Krusell, Per.
“Long-Run Implications of Investment-Specific
Technological Change.” American Economic
Review, June 1997, 87(3), pp. 342-62.
Hansen, Gary. “Indivisible Labor and the Business
Cycle.” Journal of Monetary Economics, November
1985, 16, pp. 309-37.
Jones, Chad. “R&D-Based Models of Economic
Growth.” Journal of Political Economy, August 1995,
103, pp. 759-84.
Jones, Chad. “Sources of U.S. Economic Growth in a
World of Ideas.” American Economic Review,
March 2002, 92(1), pp. 220-39.
Jones, Chad. “Intermediate Goods and Weak Links:
A Theory of Economic Development.” NBER
Working Paper No. 13834, National Bureau of
Economic Research, September 2008;
www.nber.org/papers/w13834.pdf.
Jorgenson, Dale W.; Gollop, Frank M. and Fraumeni,
Barbara M. Productivity and U.S. Economic Growth.
Cambridge, MA: Harvard University Press, 1987.
Jorgenson, Dale W. “Information Technology and the
U.S. Economy.” American Economic Review,
March 2001, 91(1), pp. 1-32.
Jorgenson, Dale W.; Ho, Mun S. and Stiroh, Kevin J. “A Retrospective Look at the U.S. Productivity Growth Resurgence.” Journal of Economic Perspectives, Winter 2008, 22(1), pp. 3-24.
Justiniano, Alejandro and Primiceri, Giorgio.
“Potential and Natural Output.” Unpublished
manuscript, Northwestern University, June 2008;
http://faculty.wcas.northwestern.edu/~gep575/
JPgap8_gt.pdf.
King, Robert G.; Plosser, Charles I. and Rebelo,
Sergio T. “Production, Growth and Business
Cycles: I. The Basic Neoclassical Model.” Journal
of Monetary Economics, 1988, 21(2-3), pp. 195-232.
Kuttner, Kenneth. “Estimating Potential Output as a
Latent Variable.” Journal of Business and Economic
Statistics, July 1994, 12(3), pp. 361-68.
Kydland, Finn E. and Prescott, Edward C. “Rules
Rather than Discretion: The Inconsistency of Optimal
Plans.” Journal of Political Economy, June 1977,
85(3), pp. 473-92.
Laubach, Thomas and Williams, John C. “Measuring
the Natural Rate of Interest.” Review of Economics
and Statistics, November 2003, 85(4), pp. 1063-70.
Lilien, David M. “Sectoral Shifts and Cyclical
Unemployment.” Journal of Political Economy,
August 1982, 90(4), pp. 777-93.
Lucas, Robert E. Jr. “Econometric Policy Evaluation:
A Critique.” Carnegie-Rochester Conference Series
on Public Policy, 1976, 1(1), pp. 19-46.
Lucas, Robert E. Jr. Models of Business Cycles.
Oxford: Basil Blackwell Ltd, 1987.
Lucas, Robert E. Jr. “Macroeconomic Priorities.”
American Economic Review, March 2003, 93(1),
pp. 1-14.
Mulligan, Casey. “Aggregate Implications of
Indivisible Labor.” Advances in Macroeconomics,
2001, 1(1), Article 4; www.bepress.com/cgi/
viewcontent.cgi?article=1007&context=bejm.
Neiss, Katherine and Nelson, Edward. “Inflation Dynamics, Marginal Cost, and the Output Gap: Evidence from Three Countries.” Journal of Money, Credit, and Banking, December 2005, 37(6), pp. 1019-45.
Okun, Arthur M. The Political Economy of Prosperity. Washington, DC: Brookings Institution, 1970.

Oliner, Stephen D. and Sichel, Daniel E. “The Resurgence of Growth in the Late 1990s: Is Information Technology the Story?” Journal of Economic Perspectives, Fall 2000, 14(4), pp. 3-22.

Oliner, Stephen D. and Sichel, Daniel E. “Information Technology and Productivity: Where Are We Now and Where Are We Going?” Federal Reserve Bank of Atlanta Economic Review, Third Quarter 2002, pp. 15-44; www.frbatlanta.org/filelegacydocs/oliner_sichel_q302.pdf.

Oliner, Stephen D.; Sichel, Daniel and Stiroh, Kevin. “Explaining a Productive Decade.” Brookings Papers on Economic Activity, 2007, 1, pp. 81-137.

Organisation for Economic Co-operation and Development. Revisions of Quarterly Output Gap Estimates for 15 OECD Member Countries. September 26, 2008; www.oecd.org/dataoecd/15/6/41149504.pdf.

Phelps, Edmund S. “U.S. Monetary Policy and the Prospective Structural Slump.” Presented at the 7th Annual BIS Monetary Policy Conference, Lucerne, June 26-27, 2008; www.bis.org/events/conf080626/phelps.pdf.

Rogerson, Richard. “Indivisible Labor, Lotteries and Equilibrium.” Journal of Monetary Economics, January 1988, 21(1), pp. 3-16.

Rotemberg, Julio J. “Stochastic Technical Progress, Nearly Smooth Trends and Distinct Business Cycles.” NBER Working Paper 8919, National Bureau of Economic Research, May 2002; papers.ssrn.com/sol3/papers.cfm?abstract_id=310466.

Smets, Frank and Wouters, Rafael. “Shocks and Frictions in US Business Cycles: A Bayesian DSGE Approach.” American Economic Review, June 2007, 97(3), pp. 586-606.

Solow, Robert M. “Is There a Core of Usable Macroeconomics We Should All Believe In?” American Economic Review, May 1997, 87(2), pp. 230-32.

Survey of Professional Forecasters. Survey from First Quarter 2008. February 12, 2008; www.philadelphiafed.org/research-and-data/real-time-center/survey-of-professional-forecasters/2008/spfq108.pdf.

Valletta, Robert and Cleary, Aisling. “Sectoral Reallocation and Unemployment.” Federal Reserve Bank of San Francisco FRBSF Economic Letter, No. 2008-32, October 17, 2008; www.frbsf.org/publications/economics/letter/2008/el2008-32.pdf.

Whelan, Karl. “A Two-Sector Approach to Modeling U.S. NIPA Data.” Journal of Money, Credit, and Banking, August 2003, 35(4), pp. 627-56.

Woodford, Michael. Interest and Prices: Foundations of a Theory of Monetary Policy. Princeton, NJ: Princeton University Press, 2003.

APPENDIX

A SIMPLE TWO-SECTOR STICKY PRICE MODEL28
Households
The economy is populated by a representative household which maximizes its lifetime utility,
denoted as
$$\max \; E_0 \sum_{t=0}^{\infty} \beta^t u(C_t, L_t),$$

where $C_t$ is consumption of a constant elasticity of substitution basket of differentiated varieties,

$$C_t = \left( \int_0^1 C_t(z)^{\frac{\xi-1}{\xi}} \, dz \right)^{\frac{\xi}{\xi-1}},$$
and $L_t$ is labor effort. The period felicity function $u$ takes the following form:

$$u_t = \ln C_t - \frac{L_t^{\eta+1}}{\eta+1},$$

where $\eta$ is the inverse of the Frisch elasticity of labor supply. The maximization problem is subject to several constraints. The flow budget constraint, in nominal terms, is the following:

$$B_t + P_t^I I_t + P_t^C C_t = W_t L_t + R_t K_{t-1} + (1+i_{t-1})B_{t-1} + \Delta_t,$$

where

$$I_t = \left( \int_0^1 I_t(z)^{\frac{\xi-1}{\xi}} \, dz \right)^{\frac{\xi}{\xi-1}}.$$

The price indices are defined as follows:

$$P_t^C = \left( \int_0^1 P_t^C(z)^{1-\xi} \, dz \right)^{\frac{1}{1-\xi}}, \qquad P_t^I = \left( \int_0^1 P_t^I(z)^{1-\xi} \, dz \right)^{\frac{1}{1-\xi}}.$$
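A standard step the appendix leaves implicit: expenditure minimization over varieties, given these price indices, yields the usual CES demand curves that the Calvo pricing firms later face,

```latex
C_t(z) = \left( \frac{P_t^C(z)}{P_t^C} \right)^{-\xi} C_t,
\qquad
I_t(z) = \left( \frac{P_t^I(z)}{P_t^I} \right)^{-\xi} I_t ,
```

so $\xi$ is both the elasticity of substitution across varieties and the price elasticity of demand facing each firm.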

Moreover,

(A1)  $K_t = I_t + (1-\delta)K_{t-1}$

(A2)  $L_t = L_t^C + L_t^I$

(A3)  $K_{t-1} = K_t^C + K_t^I.$

28 The appendix was written primarily by Alessandro Barattieri.

Basu and Fernald

Notice that total capital is predetermined, while sector-specific capital is free to move in each period. To solve the problem, we write the Lagrangian as

$$\mathcal{L} = \ln C_t - \frac{L_t^{\eta+1}}{\eta+1} + \ldots - \Lambda_t \left( B_t + P_t^I I_t + P_t^C C_t - W_t L_t - R_t K_{t-1} - (1+i_{t-1})B_{t-1} - \Delta_t \right)$$
$$\qquad - \beta E_t \Lambda_{t+1} \left( B_{t+1} + P_{t+1}^I I_{t+1} + P_{t+1}^C C_{t+1} - W_{t+1} L_{t+1} - R_{t+1} K_t - (1+i_t)B_t - \Delta_{t+1} \right) - \ldots$$
The first-order conditions of the maximization problem for consumption, nominal bonds, labor, and capital are as follows:

(A4)  $\dfrac{1}{C_t} = P_t^C \Lambda_t$

(A5)  $\Lambda_t = \beta E_t \left[ (1+i_t) \Lambda_{t+1} \right]$

(A6)  $L_t^{\eta} = \Lambda_t W_t$

(A7)  $\Lambda_t P_t^I = \beta E_t \left[ \Lambda_{t+1} \left( R_{t+1} + P_{t+1}^I (1-\delta) \right) \right].$
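Combining (A4) and (A5) to eliminate the multiplier gives the consumption Euler equation in real terms, a standard step spelled out here for clarity:

```latex
\frac{1}{C_t} = \beta E_t\left[ (1+i_t)\, \frac{P_t^C}{P_{t+1}^C}\, \frac{1}{C_{t+1}} \right],
```

so consumption growth responds to the nominal rate deflated by consumption-sector inflation.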

Table A1 provides baseline calibrations for all parameters.

Table A1
Baseline Calibration

Parameter   Value                         Parameter    Value
β           0.99                          INV_SHARE    0.2
η           0.25                          C_SHARE      0.8
α^C         0.3                           L^I/L        0.2
α^I         0.3                           L^C/L        0.8
δ           0.025                         K^I/K        0.2
Γ^C         1.1                           K^C/K        0.8
Γ^I         1.1                           ρ_i          0.8
θ^C         0.75                          φ_π          1.5
θ^I         0.75                          φ_µ          0.5
ζ^C         (1−θ^C)(1−βθ^C)/θ^C           ρ^C          0.99
ζ^I         (1−θ^I)(1−βθ^I)/θ^I           ρ^I          0.99
                                          σ_ε^C        1
                                          σ_ε^I        1
                                          σ_v          1
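Given the Table A1 formula ζ = (1−θ)(1−βθ)/θ, the baseline slope of each sectoral Phillips curve is easy to compute. A small sketch (plain Python; nothing here is from the authors' code):

```python
def calvo_slope(theta, beta):
    """Slope of the log-linearized Calvo Phillips curve: (1-theta)(1-beta*theta)/theta."""
    return (1 - theta) * (1 - beta * theta) / theta

# Baseline calibration from Table A1: theta^C = theta^I = 0.75, beta = 0.99
zeta = calvo_slope(0.75, 0.99)  # ≈ 0.0858
```

With prices reset on average every four quarters, the marginal cost gap passes through to inflation with a coefficient below 0.1, so sectoral inflation responds only weakly to current cost pressures.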

Firms
Both sectors are characterized by a unitary mass of atomistic monopolistically competitive firms. Production functions are Cobb-Douglas (possibly with different factor intensities). Productivity in the two sectors is represented by two AR(1) processes. The cost minimization problem for a firm z operating in the consumption sector can be expressed, in nominal terms, as

$$\min \; W_t L_t^C(z) + R_t K_t^C(z) \quad \text{s.t.} \quad Y_t^C(z) = A_t^C \left( K_t^C(z) \right)^{\alpha^C} \left( L_t^C(z) \right)^{1-\alpha^C} - \Phi^C$$

and analogously for the investment sector as

$$\min \; W_t L_t^I(z) + R_t K_t^I(z) \quad \text{s.t.} \quad I_t = A_t^I \left( K_t^I(z) \right)^{\alpha^I} \left( L_t^I(z) \right)^{1-\alpha^I} - \Phi^I.$$

Calling $\mu_t^i$, with $i = C, I$, the multiplier attached to the minimization problem, reflecting nominal marginal cost, we can express the factor demands as follows, where we omit $z$ assuming a symmetric equilibrium:

$$\frac{W_t}{\mu_t^C} = (1-\alpha^C) A_t^C \left( K_t^C \right)^{\alpha^C} \left( L_t^C \right)^{-\alpha^C} \qquad \frac{R_t}{\mu_t^C} = \alpha^C A_t^C \left( K_t^C \right)^{\alpha^C - 1} \left( L_t^C \right)^{1-\alpha^C}$$

$$\frac{W_t}{\mu_t^I} = (1-\alpha^I) A_t^I \left( K_t^I \right)^{\alpha^I} \left( L_t^I \right)^{-\alpha^I} \qquad \frac{R_t}{\mu_t^I} = \alpha^I A_t^I \left( K_t^I \right)^{\alpha^I - 1} \left( L_t^I \right)^{1-\alpha^I}.$$

Taking the ratio for each sector, we get

(A8)  $\dfrac{K_t^C}{L_t^C} = \dfrac{\alpha^C}{1-\alpha^C} \dfrac{W_t}{R_t}$

(A9)  $\dfrac{K_t^I}{L_t^I} = \dfrac{\alpha^I}{1-\alpha^I} \dfrac{W_t}{R_t}.$

Inflation rates are naturally defined as

(A10)  $\Pi_t^j = \dfrac{P_t^j}{P_{t-1}^j}.$

Finally, given the Cobb-Douglas assumption, it is possible to express the nominal marginal cost as follows:

(A11)  $MC^j = \dfrac{1}{A^j} \dfrac{1}{f(\alpha)} R^{\alpha} W^{1-\alpha} \left( Y^j + \Phi^j \right),$

with j = C, I.
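The factor-demand ratios in (A8)-(A9) can be verified numerically. The snippet below is a generic illustration, not part of the model code; alpha matches the Table A1 baseline, while A, W, R, and Y are arbitrary numbers, and the fixed cost Φ is ignored:

```python
import numpy as np

# Illustrative values only: alpha matches Table A1; A, W, R, Y are arbitrary.
alpha, A, W, R, Y = 0.3, 1.0, 2.0, 1.0, 10.0

# Grid-search the cost-minimizing capital-labor ratio subject to
# A * K^alpha * L^(1-alpha) = Y.
k = np.linspace(0.05, 5.0, 200001)   # candidate K/L ratios
L = (Y / A) / k**alpha               # labor required at each ratio
cost = W * L + R * k * L             # nominal cost W*L + R*K, with K = k*L
k_star = k[np.argmin(cost)]

# Analytic answer from (A8): K/L = [alpha/(1-alpha)] * (W/R)
assert abs(k_star - (alpha / (1 - alpha)) * (W / R)) < 1e-3
```

The key property behind (A8)-(A9) is that the optimal capital-labor ratio depends only on factor intensities and relative factor prices, not on the scale of output.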

We introduce nominal rigidities through standard Calvo (1983) pricing. Instead of writing the rather
complex equations for the price levels in the C and I sectors, we jump directly to the log-linearized Calvo
equations for the evolution of inflation rates, equations (A25) and (A26) below.

Monetary Policy
Monetary policy is conducted through a Taylor-type rule with a smoothing parameter and reaction to
inflation and marginal cost. Again, we write the Taylor rule directly in log-linearized form, as equation
(A32) below.

Equilibrium
Beyond the factor market–clearing conditions already expressed, equilibrium also requires a bond
market–clearing condition (B = 0), a consumption goods market–clearing condition (Y C = C ), and an
aggregate adding-up condition (C + I = Y ). (By Walras’s law, we drop the investment market–clearing
condition.)

The Linearized Model
The equations of the model linearized around its nonstochastic steady state are represented by equations (A12) through (A36), which are 25 equations for the 25 unknown endogenous variables c, l^I, l^C, l, k^C, k^I, k, λ, w, w^C, r, i, y^C, I, y, p^I, p^C, π, π^C, π^I, µ, µ^I, µ^C, a^C, a^I, as follows:

(A12)  $k_t = \delta I_t + (1-\delta) k_{t-1}$

(A13)  $\dfrac{L^I}{L} l_t^I + \dfrac{L^C}{L} l_t^C = l_t$

(A14)  $\dfrac{K^I}{K} k_t^I + \dfrac{K^C}{K} k_t^C = k_{t-1}$

(A15)  $-c_t = \lambda_t + p_t^C$

(A16)  $\lambda_t = i_t + \lambda_{t+1}$

(A17)  $\eta l_t = \lambda_t + w_t$

(A18)  $\lambda_t + p_t^I = \lambda_{t+1} + \left(1 - \beta(1-\delta)\right) r_{t+1} + \beta(1-\delta)\, p_{t+1}^I$

(A19)  $y_t^C = \Gamma^C a_t^C + \alpha^C k_t^C + \left(1-\alpha^C\right) l_t^C$

(A20)  $I_t = \Gamma^I a_t^I + \alpha^I k_t^I + \left(1-\alpha^I\right) l_t^I$

(A21)  $k_t^C + r_t = w_t + l_t^C$

(A22)  $k_t^I + r_t = w_t + l_t^I$

(A23)  $\mu_t^C = \alpha r_t + (1-\alpha) w_t - a_t^C$

(A24)  $\mu_t^I = \alpha r_t + (1-\alpha) w_t - a_t^I$

(A25)  $\pi_t^C = \beta \pi_{t+1}^C + \zeta^C \left( \mu_t^C - p_t^C \right)$

(A26)  $\pi_t^I = \beta \pi_{t+1}^I + \zeta^I \left( \mu_t^I - p_t^I \right)$

(A27)  $\pi_t^C = p_t^C - p_{t-1}^C$

(A28)  $\pi_t^I = p_t^I - p_{t-1}^I$

(A29)  $\mu_t = \text{C\_share} \cdot \mu_t^C + \text{INV\_share} \cdot \mu_t^I$

(A30)  $\pi_t = \text{C\_share} \cdot \pi_t^C + \text{INV\_share} \cdot \pi_t^I$

(A31)  $w_t^C = w_t - p_t^C$

(A32)  $i_t = \rho_i i_{t-1} + (1-\rho_i) \left( \phi_\pi \pi_t + \phi_\mu \mu_t \right)$

(A33)  $y_t = \text{C\_share} \cdot c_t + \text{INV\_share} \cdot I_t$

(A34)  $y_t^C = c_t$

(A35)  $a_t^C = \rho_C a_{t-1}^C + \varepsilon_t^C$

(A36)  $a_t^I = \rho_I a_{t-1}^I + \varepsilon_t^I.$
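Equations (A35)-(A36) make the technology levels highly persistent (ρ_C = ρ_I = 0.99 in Table A1). A minimal simulation of one such process, with an arbitrary seed and horizon, illustrates how slowly shocks decay:

```python
import numpy as np

rho, T = 0.99, 400                  # rho^C from Table A1; T quarters
rng = np.random.default_rng(42)
eps = rng.standard_normal(T)        # epsilon_t^C, unit variance (Table A1)

a = np.zeros(T)
for t in range(1, T):
    a[t] = rho * a[t - 1] + eps[t]  # equation (A35)

# Half-life of a shock: rho^h = 0.5  =>  h = ln(0.5)/ln(rho) ≈ 69 quarters
half_life = np.log(0.5) / np.log(rho)
```

With ρ = 0.99 a technology shock retains half its effect after roughly 17 years, which is why flexible price (potential) output in this model itself fluctuates persistently rather than tracking a smooth trend.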


Commentary
Rodolfo E. Manuelli

Basu and Fernald (2009) describe and evaluate alternative theoretical models of potential output to provide a frame of reference for policy analysis.
They also discuss what is (and what is not)
known about potential output and illustrate their
approach by estimating a two-sector model with
price rigidities.
I find the overall theme—that models ought
to be used to guide policy choices—important and
a welcome reminder of the value of using a consistent framework for policy evaluation. I wholeheartedly agree with the approach. When it comes
to specifics, they conclude that to capture some
essential features of the U.S. economy, the standard one-sector model should be abandoned in
favor of a two-sector model with differential technological change. Here, I am not totally convinced
by their arguments. The second major point that
they argue—and I fully agree with them here—is
that any useful notion of potential output cannot
be assumed to be properly described by a smooth
trend, and it is likely to fluctuate even in the short
run. As before, their choice of model and the
empirical strategy they use are subject to debate.

THE LONG RUN: WHAT SIMPLE
MODEL MATCHES THE DATA?
Basu and Fernald argue that the appropriate
notion of potential output is the steady state of
an economy with no distortions. They consider

two models: a standard one-sector model and a
two-sector model with differential technological
change across sectors. They derive the steady-state predictions in each case and confront the
theoretical predictions about capital deepening—
defined as the contribution of the increase in capital per worker to output—with the data. They
conclude that the two-sector model, which allows
for a change in the price of capital, outperforms
the simple one-sector model.
At this level of abstraction, it is not easy to
pick a winner. Basu and Fernald base their preference for the two-sector model on two different
arguments. First, they show that in the data the
relative price of capital has decreased substantially, which is inconsistent with the one-sector
model. Second, they highlight the ability of the
two-sector model to account for the low contribution of capital in the period of productivity
slowdown.
Basu and Fernald’s first argument—the change
in the price of capital—is not completely persuasive. There is no question that capital has
become cheaper, but this does not automatically
imply that this fact is of crucial importance. Of
necessity, models are abstractions of reality and,
by their very nature, will miss some dimension
of the data. To be precise, models that account
for everything are so complex that they cannot be
useful. Thus, adding a sector—which can only improve the model's ability to fit the data—cannot by itself determine a winner. It is easy enough to find other

Rodolfo E. Manuelli is a professor of economics at Washington University in St. Louis and a research fellow at the Federal Reserve Bank of
St. Louis. The author thanks Yongs Shin for useful conversations on this topic.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 215-19.
© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Federal Reserve Bank of St. Louis Review, July/August 2009, p. 215

Manuelli

Table 1
Capital's Contribution: Model Prediction/Data

Period       One-sector model    Two-sector model
1948-73      1.2                 1.05
1973-95      0.4                 1.06
1995-2000    0.6                 1.75
2000-07      0.9                 1.54
1948-2007    0.9                 1.18

changes in relative prices (e.g., some professional
services) that would necessitate a third sector to
accommodate them, and this approach would
logically lead to a complex and useless model.
Basu and Fernald’s primary reason for choosing a two-sector model rests in its ability to explain
capital deepening. Table 1 presents the two
models’ predictions for capital’s contribution to
growth relative to the data for various time periods.
Considering the longest available horizon (1948-2007), it is difficult to choose a winner. The one-sector model underpredicts the contribution of
capital by 15 percent, while the two-sector
model overpredicts it by 18 percent. Depending
on the period, one model clearly dominates the
other, but I see no reason to emphasize the 1973-95 period (where the two-sector model is a clear
winner) over the 2000-07 period (in which the
one-sector model dominates).
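As a quick arithmetic check, the ratios in Table 1 convert directly into percentage prediction errors (a sketch that simply transcribes the rounded table entries; note that the table's rounded 0.9 for 1948-2007 yields roughly a 10 percent underprediction, while the 15 percent figure in the text evidently uses unrounded values):

```python
# Reading Table 1 as (model prediction)/(data) ratios, the percentage
# prediction error is ratio - 1.  Values below are the rounded entries
# printed in the table.
table1 = {
    "1948-73":   {"one_sector": 1.2, "two_sector": 1.05},
    "1973-95":   {"one_sector": 0.4, "two_sector": 1.06},
    "1995-2000": {"one_sector": 0.6, "two_sector": 1.75},
    "2000-07":   {"one_sector": 0.9, "two_sector": 1.54},
    "1948-2007": {"one_sector": 0.9, "two_sector": 1.18},
}

def prediction_error(ratio):
    """Positive = overprediction, negative = underprediction."""
    return ratio - 1.0

for period, ratios in table1.items():
    errs = {m: round(100 * prediction_error(r)) for m, r in ratios.items()}
    print(period, errs)
```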
Basu and Fernald’s preferred model is a two-sector version of the Solow growth model. Using
data on the relative price of capital, they estimate
the productivity growth rates in the general goods
and investment goods sectors. Their estimate
hinges on the assumption that the technologies in
these two sectors are similar. In particular, letting ẑ_j denote the growth rate of total factor productivity (TFP) in sector j and imposing equal capital shares (α_c = α_i), the specification implies that

\hat{P}_i - \hat{P}_c = \hat{z}_c - \hat{z}_i .
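Where this relation comes from can be sketched in one step (assuming perfect competition, so each sector's price equals its unit cost, and a common capital share α across sectors; κ(α) collects the constant share terms):

```latex
% Unit-cost pricing with common factor prices W, R^K and sectoral TFP z_j:
%   P_j = \kappa(\alpha)\, W^{1-\alpha} (R^K)^{\alpha} / z_j , \qquad j \in \{c, i\},
% so the factor-price terms cancel in the ratio:
\frac{P_i}{P_c} = \frac{z_c}{z_i}
\quad\Longrightarrow\quad
\hat{P}_i - \hat{P}_c = \hat{z}_c - \hat{z}_i .
```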
In a version of the model in which the capital
shares are allowed to differ across sectors, the
relative price of consumption satisfies

Table 2
Price of Capital (1948 = 1)

Period    Model    Data
1973      1.05     1.09
1995      0.90     0.91
2000      0.82     0.87
2007      0.77     0.86

\hat{P}_i - \hat{P}_c = \hat{z}_c - \frac{1-\alpha_c}{1-\alpha_i}\,\hat{z}_i .
Valentinyi and Herrendorf (2008) estimate different capital shares across the two sectors (α_c ≠ α_i), with α_i = 0.28, which implies that, relative
to Basu and Fernald’s estimate, the productivity
growth rate of the investment sector was about 9
percent higher. This implies that their estimates
of the contribution of capital deepening must be
increased by almost 10 percent, which exaggerates
even more the overprediction of the two-sector
model relative to the data in the recent past.
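The size of this correction can be checked by plugging the shares into the coefficient on ẑ_i in the unequal-shares equation above (a sketch; α_i = 0.28 is the value quoted from Valentinyi and Herrendorf, while α_c = 0.34 is a hypothetical consumption-sector share chosen only to illustrate the roughly 9 percent rescaling):

```python
# Coefficient on z_i-hat when capital shares differ across sectors:
#   p-hat = z_c-hat - ((1 - alpha_c)/(1 - alpha_i)) * z_i-hat.
alpha_i = 0.28   # investment-sector capital share (quoted in the text)
alpha_c = 0.34   # consumption-sector share -- hypothetical, for illustration

coef = (1 - alpha_c) / (1 - alpha_i)   # multiplies z_i-hat
# To generate the same observed relative-price decline, the implied
# investment-sector TFP growth must be scaled up by 1/coef:
scale_up = 1 / coef
print(round(coef, 3), round(100 * (scale_up - 1), 1))  # ~0.917 and ~9.1 percent
```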
Even accepting as a reasonable approximation that α_c = α_i, there are two measures of the relative price of investment goods (p_i = P_i/P_c) that, according to the model, should coincide. One is given by

p_i = M_k \frac{Y/L}{K/L} ,

where y = Y/L (k = K/L) is output (capital) per hour and M_k is a constant under the balanced growth assumption. Thus, the growth rate of the relative price of capital is

(1)    \hat{p}_i = \hat{y} - \hat{k} .

As above, the model implies that

(2)    \hat{p}_i = \hat{z}_c - \hat{z}_i .

The estimates based on equation (1)—using
Bureau of Labor Statistics data on output per hour
and capital per hour—are presented in the column
labeled “Data” in Table 2, while the values from
equation (2)—based on model-produced estimates
of productivity growth—are labeled “Model.”
Because the model-based measure predicts a
higher decrease in the price of capital, it is not
Figure 1
Transitional Dynamics: TFP Shock (levels, 1960 = 1; series shown: Y/L, Schooling, I/Y; horizontal axis: 1960-2020)
surprising that the theoretical model tends to overpredict the contribution of capital to output. At
this level of abstraction, it is not possible to identify the source of the problem. However, if the
effective cost of capital is changing—a violation
of the balanced growth assumption—then the
“Data” estimate is biased. In any case, the difference should make us cautious about the appropriateness of the model.
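The disagreement between the two columns of Table 2 can be summarized as average annual rates of decline (a minimal sketch using the printed 2007 index values, 1948 = 1):

```python
import math

# Price-of-capital index in 2007 (1948 = 1), from Table 2.
model_2007, data_2007 = 0.77, 0.86
years = 2007 - 1948

# Average annual growth rate implied by each index (negative: price falling).
g_model = math.log(model_2007) / years
g_data = math.log(data_2007) / years

print(f"model: {100*g_model:.2f}% per year, data: {100*g_data:.2f}% per year")
# The model-based measure implies a faster decline, which is one reason the
# two-sector model overpredicts capital deepening relative to the data.
```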
Is it clear that balanced growth is a reasonable
approximation in the long run, given the length
of the horizon covered in the article? It is consistent with the findings of King and Rebelo (1993),
who showed that for reasonable parameterizations,
the standard growth model converges rather rapidly to its balanced growth path. However, recent
work that retains the dynastic specification of
preferences but specifies that individual human
capital completely depreciates at the time of death
(see Manuelli and Seshadri, 2008) has shown that
even one-sector models can display very long
transitions. Figure 1 presents the impact of a once-and-for-all permanent increase in the level of
productivity. From the point of view of this discussion, the interesting result is how long it takes
for the model to reach steady state: approximately
30 years. Thus, if human capital that “disappears”
when an individual dies (even though dynasties
have infinite horizons) is a realistic feature to
incorporate in a model, the balanced growth
assumption is difficult to justify unless the horizon
is very long.
In this case, a second difficulty is associated
with the measurement of productivity. In the
model analyzed by Manuelli and Seshadri (2008),
conventionally measured TFP and actual TFP do
not coincide. The divergence is due to the endogenous response of the quality and availability of
human capital after a shock. Figure 2 displays
measured TFP (computed using the human-capital
series labeled “Mincer”), which shows an upward
trend—that is, one displaying growth—while
“true” TFP jumps in the first period (labeled 1960
in the figure) and remains constant.
In this example, the series labeled “Effective
Human Capital” moves in response to a productivity shock. Because measured TFP is simply z q^{1−α}, where q is the ratio of Mincer to Effective Human Capital, it follows that measured TFP has
a large endogenous component.
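The decomposition can be illustrated numerically (a sketch; the values of z, q, and α below are hypothetical, not taken from the paper):

```python
# Sketch of measured vs. "true" TFP: measured TFP = z * q**(1 - alpha),
# where q is the ratio of Mincerian to effective human capital.
alpha = 0.33   # capital share -- assumed for illustration
z = 1.10       # "true" TFP, constant after its once-and-for-all jump

for q in (1.00, 1.05, 1.10):   # q adjusting along the transition path
    measured = z * q ** (1 - alpha)
    print(f"q = {q:.2f}  measured TFP = {measured:.3f}")
# Measured TFP keeps rising with q even though true TFP (z) stays constant,
# which is the endogenous component described in the text.
```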
Basu and Fernald discuss a variety of scenarios
about future productivity growth and trace the
implications for output growth. The previous
argument suggests that even simple shocks might
have a large impact on conventionally measured
TFP, which would not be captured in their calculations.

Figure 2
TFP Shock, Effective Human Capital, Mincerian Human Capital, and Measured TFP (levels, 1960 = 1; series shown: Effective Human Capital, Measured TFP, Mincer; horizontal axis: 1960-2020)

Moreover, given the model that they use—
essentially one in which the only key decision,
saving, is taken as exogenous—any reduced-form
representation of the economic variables of interest is an appropriate model to forecast the future,
with significantly less structure.

SHORT-RUN CONSIDERATIONS
In this section of their article, Basu and
Fernald describe their estimates of technology
shocks (i.e., TFP) in a two-sector model and
define potential output as the output that would
be obtained in the absence of frictions (e.g., price
stickiness). Their major finding is that the variability of productivity shocks is high, even at the
business cycle frequency, and hence that the prescription that in the short run government policy
should try to stabilize output is suspect.
The key question is whether the technology
shocks they identify are indeed “purified” of
policy-induced fluctuations. I am not totally convinced that simple econometric procedures can
effectively isolate TFP shocks, especially given the
authors’ strong assumption about orthogonality
between measured TFP and policy shocks. In particular, it is relatively easy to introduce policies
in the Manuelli and Seshadri (2008) model that
endogenously change the rate of utilization of
human capital (with no change in measured
employment) that would appear as changes in
technology. Whether these sources of misspecification are important is a question that is difficult
to answer using Basu and Fernald’s partial-specification approach. As they are aware, some
sources of bias can be detected only when they
are fully specified in the model.

CONCLUSION
In this discussion, I have taken issue with
some of the specific choices made by Basu and
Fernald and with their interpretation of the
results. I would like to end on a more important
note: This paper points policy-based economic
research in the right direction because it emphasizes the necessity of being explicit about the
assumptions underlying our models. Moreover, by
making explicit the economies that are modeled,
it is possible to subject the models to a variety of
tests. On the other hand, reduced-form atheoretical
approaches to policymaking must rely on (often
implicit) assumptions to justify their recommendations, and intelligent evaluation of the results
is often very difficult, if not outright impossible.
REFERENCES
Basu, Susanto and Fernald, John G. “What Do We
Know (And Not Know) About Potential Output?”
Federal Reserve Bank of St. Louis Review,
July/August 2009, 91(4), pp. 187-213.
King, Robert G. and Rebelo, Sergio T. “Transitional
Dynamics and Economic Growth in the Neoclassical
Model.” American Economic Review, September
1993, 83(4), pp. 908-31.
Manuelli, Rodolfo E. and Seshadri, Ananth.
“Neoclassical Miracles.” Working paper, University
of Wisconsin–Madison, November 2008;
www.econ.wisc.edu/~aseshadr/working_pdf/miracles.pdf.
Valentinyi, Ákos and Herrendorf, Berthold.
“Measuring Factor Income Shares at the Sectoral
Level.” Review of Economic Dynamics, October
2008, 11(4), pp. 820-35.

Issues on Potential Growth Measurement
and Comparison: How Structural Is the
Production Function Approach?
Christophe Cahn and Arthur Saint-Guilhem
This article aims to better understand the factors driving fluctuations in potential output measured
by the production function approach (PFA). To do so, the authors integrate a production function
definition of potential output into a large-scale dynamic stochastic general equilibrium (DSGE)
model in a fully consistent manner and give two estimated versions based on U.S. and euro-area
data. The main contribution of this article is to provide a quantitative and comparative assessment
of two approaches to potential output measurement, namely DSGE and PFA, in an integrated
framework. The authors find that medium-term fluctuations in potential output measured by the
PFA are likely to result from a large variety of shocks, real or nominal. These results suggest that
international comparisons of potential growth using the PFA could lead to overstating the role of
structural factors in explaining cross-country differences in potential output, while neglecting the
fact that different economies are exposed to different shocks over time. (JEL C51, E32, O11, O47)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 221-40.

International comparisons of potential output growth have received renewed interest
in recent years. Lower economic performance in Europe compared with the United
States over the past 15 years has generated several
publications whose aim is to explain the sources
of divergence in economic performance and
which question how to enhance economic growth
in Europe. In line with the recommendations of
the Lisbon strategy, one general conclusion is that
structural reforms should help to sustain more
vigorous growth in Europe and enable European
economies to catch up to the United States. Such
reforms include labor and product market liberalization, public policies to encourage innovation,
and so forth. Examples can be found in most
recent International Monetary Fund (IMF) or

Organisation for Economic Co-operation and
Development (OECD) country reports on
European economies. For instance, the 2007 IMF
Article IV Staff Report for France (IMF, 2007)
typically incorporates, among others, the important conclusion that “economic policy needs to
address the root cause of France’s growth deficit:
the weakness of its supply potential.” Against
this background, it is important to have a clear
view on how potential output is measured and
what interpretation can be made of cross-country
differences in potential output growth.
Among the different methods of measurement of potential output, the production function
approach (PFA) is probably the most widely used.
With this approach, output growth is expressed as
a sum of the growth of factor inputs (i.e., capital

Christophe Cahn is a doctoral candidate at the Paris School of Economics and an economist with the Banque de France. Arthur Saint-Guilhem
is an economist with the European Central Bank. The authors thank Jon Faust for his helpful comments, as well as Richard Anderson and
all the participants at the conference.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, the regional Federal Reserve Banks, the European Central Bank, the Banque de
France, or the Paris School of Economics. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their
entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only
with prior written permission of the Federal Reserve Bank of St. Louis.


Cahn and Saint-Guilhem

services and labor input) and a residual (i.e., total
factor productivity [TFP] growth). Additional
assumptions are made on the potential level of
the factors of production. For instance, potential
labor input would be calculated by smoothing
some variables (such as total population and the
participation rate) and by approximating the
medium-term equilibrium unemployment rate
with the non-accelerating inflation rate of unemployment. The major advantage of the PFA, compared with statistical aggregate methods, is that it
provides an economic interpretation of the different factors that drive growth in potential output.
This is especially useful in the context of international comparisons. Moreover, conducting additional econometric analysis allows use of the PFA
as a framework to capture the impact on potential
growth of major changes, such as the pickup in
productivity growth that started in the second
half of the 1990s in the United States.
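The PFA bookkeeping described above amounts to standard growth accounting (a minimal sketch with made-up growth rates; alpha and all numbers below are assumptions for illustration, not estimates from the article):

```python
# Output growth decomposed as: dlnY = alpha*dlnK + (1 - alpha)*dlnL + dlnTFP.
# The PFA builds potential output by replacing actual inputs with smoothed
# "potential" counterparts (e.g., labor input consistent with the NAIRU).
alpha = 0.33                  # capital elasticity -- assumed
dlnK, dlnL = 0.030, 0.010     # made-up potential factor growth rates
dlnTFP = 0.012                # made-up trend TFP growth

dlnY_potential = alpha * dlnK + (1 - alpha) * dlnL + dlnTFP
print(f"potential output growth: {100*dlnY_potential:.2f}% per year")
```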
However, this approach raises some difficulties. Estimates of the components are surrounded
by a large degree of uncertainty because analysis
results are highly dependent on the choice of modeling of the different components—for instance,
how trend growth of TFP is estimated. Another
difficulty derives from possible misleading interpretations of potential output as measured by the
PFA. First, in the context of international comparisons, cross-country differences in PFA potential
output are often given a structural interpretation—
say, as being caused by different degrees of rigidities in the labor or goods markets, whereas these
differences in potential output measures could
reflect only the lasting effects of temporary shocks
to the economy. This issue is of particular importance because it casts doubt on the ability of the
PFA to give a satisfactory picture of the structural
components of economic growth. Second, the PFA
leaves unidentified the various shocks (supply,
demand, monetary shocks, and so on) that are
likely to affect potential output in the medium
term. This raises some concern about the measurement of output gaps. Indeed, it is not entirely certain that fluctuations in the output gap measured
by the PFA reflect only inflation-related shocks.
Therefore, the PFA might lead to biased output
gap measures that could make them unreliable
for the assessment of monetary policy conditions.

An alternative approach to the definition and
measurement of potential output can be found in
New Keynesian dynamic stochastic general equilibrium (DSGE) models. The recent literature on
DSGE models has shown significant progress in
developing models that can be applied to the data.
Indeed, recent research has shown that estimated
DSGE models are able to match the data for key
macroeconomic variables and reduced-form vector
autoregressions (Smets and Wouters, 2007). In
these models, “potential output” is generally
defined as the level of output that would prevail in
an economy with fully flexible prices and wages.
According to the DSGE definition, potential output is therefore the level of output at which prices
tend to stabilize. However, the properties of potential output and output gap fluctuations derived
from DSGE models can be quite different from
the ones derived from the PFA (e.g., Neiss and
Nelson, 2005; and Edge, Kiley, and Laforte, 2007).
For example, the DSGE measure of potential output can undergo relatively larger fluctuations than
potential output derived from the PFA. Similarly,
the output gap in DSGE models tends to be less
variable than with the PFA measures. One caveat
of these papers, however, is that they compare
ad hoc PFA measures of potential output with
DSGE measures—comparisons that would be
enhanced if the PFA measure of potential output
were consistent with the model. In this respect,
one of the main contributions of our paper is to
incorporate the PFA measure of potential output
into a DSGE framework in a fully consistent
manner. As shown later, adopting such a method
reveals that different types of shocks are likely to
cause potential output measured by the PFA to
fluctuate.
Our goals are twofold: (i) better understanding
of the factors driving medium-term fluctuations
in the PFA potential output and (ii) providing a
quantitative comparison of the PFA versus DSGE
measure of potential output. To do so, we build a
large-scale DSGE model, calibrate two versions
of the model using U.S. and euro-area data, and
then integrate into this framework a PFA definition of potential output that is fully consistent
with the model. Our PFA is based on previous
work (Cahn and Saint-Guilhem, 2009), where output of the economy is described as a Cobb-Douglas
function. In this respect, the main contribution
of this paper is to provide a quantitative comparison of these two measures of potential output—
the PFA versus the DSGE—in a fully integrated
conceptual framework—namely, an economy
modeled as a large-scale DSGE model with structural parameters calibrated on U.S. and euro-area
data and with an alternative PFA measure of
potential output.
A second contribution of this paper is to assess
the validity of the structural interpretation of crosscountry comparisons of potential output measures
given by the PFA. In general, as described previously, potential output estimates based on the PFA
suggest significant differences across countries
with regard to the sources of potential growth.
However, whether these differences can be attributed to structural factors, such as differences in
labor market or product market institutions,
remains uncertain. Nothing in the PFA guarantees
that this is the case. Our present DSGE framework
enables us to tackle the issue, given that in such
a framework structural differences across two
economies translate into differences of magnitude
across the various parameters of the model. We
can therefore quantify the role of shocks versus
the role of structural factors in explaining crosscountry differences in potential output measured
by the PFA by simulating various counterfactual
scenarios for the two model economies.
Our main results first confirm that the PFA
and the DSGE definitions of potential output are
two different concepts. We find that in an economy
modeled with a DSGE framework, medium-term
fluctuations in potential output measured by the
PFA result from a variety of shocks, such as productivity or monetary shocks. We also find that
differences in potential output between two such
model economies as measured by the PFA can be
attributed not only to structural parameters of
the model but also to the role of some transitory
shocks, real or nominal, affecting the economies.
If we transpose these results into the empirical
field, we see two results: (i) PFA measures of
potential output also reflect the historical pattern
of shocks that affect a given economy, and (ii)
international comparisons of potential output
using the PFA could lead to overestimating the
role of structural factors in explaining crosscountry differences in potential output, while
neglecting the role of “luck,” namely, the fact that
different economies are exposed to different histories of stochastic events on which structural
policies could not act.
The remainder of this paper is organized as
follows. In the next section, we sketch the theoretical specification of our DSGE model. The following section describes how we incorporate and
implement into this framework the PFA measure
of potential output. We then present and discuss
the results of the simulations performed with
regard to the decomposition of potential output
dynamics into the contributions of the various
shocks included in the model. Our summary and
conclusion then follow.

A BENCHMARK DSGE MODEL
FOR THE UNITED STATES AND
THE EURO AREA
In this section, we provide details on the
main optimizing behaviors of economic agents—
households, firms, and the fiscal and monetary
authorities—that lead to building the equations
of our benchmark DSGE model, which is largely
taken from Smets and Wouters (2007).1

The Representative Household
We consider an economy populated by a representative household with access to financial
and physical capital markets so that trading in
bonds, investment goods, and state-contingent
securities can occur. Household wealth is given
by gains from government bonds in nominal per
capita terms, Bt–1, held at the beginning of period t.
Labor income comes from the nominal wage rate,
Wth, and homogeneous labor, lth, pooled by a set
of labor unions, u ∈ [0,1]. Households receive
nominal dividends, Πtf and Πut , from intermediate
producers and labor unions, respectively. Capital
1. Detailed equations are given in a technical appendix not included here. The appendix and Dynare codes are available on request from the authors.


services incomes are r_t^K K̃_t, where r^K is the real rental price of capital services, K̃.

These revenues are used to pay for consumption, P_t C_t, and investment, P_t I_t, goods, and for lump-sum taxes expressed in the output price, P_t T_t. Moreover, the representative household buys discounted government bonds due at the end of period t, B_t/(ε_t^b R_t), where ε_t^b is a risk premium shock. Hence, the budget constraint of such a household is given by the following:

P_t C_t + P_t I_t + P_t T_t + \frac{B_t}{\varepsilon_t^b R_t} \le W_t^h l_t^h + P_t r_t^K \tilde{K}_t + \Pi_t^f + \Pi_t^u + B_{t-1},

which expressed in real terms becomes

C_t + I_t + T_t + \frac{B_t}{\varepsilon_t^b R_t P_t} \le w_t^h l_t^h + r_t^K \tilde{K}_t + \left( \Pi_t^f + \Pi_t^u \right)/P_t + B_{t-1}/P_t,

where w_t^h = W_t^h/P_t is the real wage received by the household.

Capital services come from the combination of physical capital, K_t, adjusted by capacity utilization, z_t, such that K̃_t = z_t K_{t-1}. Physical capital accumulation implies an adjustment cost on investment change, S(·), and a time-varying depreciation process, δ(·), according to

K_t = \left( 1 - \delta(z_t) \right) K_{t-1} + \left[ 1 - S\!\left( \varepsilon_t^i \frac{I_t}{I_{t-1}} \right) \right] I_t,

where ε_t^i is a shock that deforms the adjustment cost function.2

2. We depart from the Smets and Wouters (2007) model by substituting the initial cost function on change in capital with a time-varying depreciation rate.

We define the intertemporal utility function as follows:

U_t = E_t \sum_{j=0}^{+\infty} \beta^j \, \frac{\left( C_{t+j} - \Theta_{t+j} \right)^{1-\sigma_c}}{1-\sigma_c} \, \exp\!\left( \eta\, \varepsilon_{t+j}^l \, \frac{\sigma_c - 1}{1+\sigma_l} \left( l_{t+j}^h \right)^{1+\sigma_l} \right),

where σ_c is the intertemporal substitution parameter of consumption, σ_l the intertemporal substitution elasticity of labor, η a scale parameter, and ε_t^l a labor supply shock. External habits are given by Θ_t = θC_{t-1}, 0 < θ < 1.

The representative household's problem consists of maximizing its intertemporal utility subject to its budget constraint and capital accumulation by choosing the path of C_t, I_t, B_t, z_t, K_t, and l_t^h.

Supply Side

We consider a continuum of intermediate goods producers, f ∈ [0,1]. Each intermediate firm produces a differentiated good used in the production of a final good. Following Kimball (1995), the aggregation function is implicitly given by the following condition:

\int_0^1 G^Y\!\left( \frac{y_t(f)}{\varepsilon_t^y Y_t} \right) df = 1,

where G^Y(·) is an increasing, concave function that verifies G^Y(1) = 1, and ε_t^y is a shock that distorts the aggregator function. The representative firm in the final good sector maximizes its profit given the prices of intermediate goods, P_t(f), and the price of the final good, P_t.

We assume the following technology in the intermediate producer sector:

y_t(f) = \varepsilon_t^a \, \tilde{K}_t(f)^{\alpha} \left( (1+g)^t L_t(f) \right)^{1-\alpha},

where ε_t^a is a productivity shock,

\ln \varepsilon_t^a = (1-\rho_a) \ln \varepsilon^a + \rho_a \ln \varepsilon_{t-1}^a + \nu_t^a, \qquad \nu_t^a \sim N(0, \sigma_a),

and g is the growth rate of a deterministic, Harrod-neutral technological trend. Assuming that the input markets are perfectly competitive, a firm f ∈ [0,1] chooses an input mix, (K̃_t(f), L_t(f)), by solving the following program:

\min_{\{\tilde{K}_t(f),\, L_t(f)\}} \; w_t L_t(f) + r_t^K \tilde{K}_t(f) \quad \text{s.t.} \quad y_t(f) = \varepsilon_t^a \, \tilde{K}_t(f)^{\alpha} \left( (1+g)^t L_t(f) \right)^{1-\alpha},

where the real aggregate labor price, w_t, and the rental rate of capital, r_t^K, are given.

Firms are not allowed to optimally reset their price at each date. With probability ξ_p > 0, the
firm, f, cannot optimally adjust its price at time t; instead, it follows the rule

P_t(f) = \bar{\pi}_t^{1-\gamma_p} \, \pi_{t-1}^{\gamma_p} \, P_{t-1}(f) = \Gamma_t^p P_{t-1}(f);

that is, a nonoptimizing firm sets its price by indexing the current price on a convex combination of past inflation and the inflation target, to be defined subsequently.

The intermediate firm's problem can be written as follows:

\max_{\tilde{P}_t(f)} \; E_t \sum_{j=0}^{+\infty} (\beta \xi_p)^j \, \frac{\lambda_{t+j} P_t}{\lambda_t P_{t+j}} \left[ P_{t+j}(f) - P_{t+j} \, mc_{t+j} \right] y_{t+j}(f)

under the conditions

P_{t+j}(f) = \Gamma_{t,j}^p \, \tilde{P}_t(f), \qquad \Gamma_{t,j}^p \equiv \begin{cases} \prod_{s=1}^{j} \Gamma_{t+s}^p & \text{if } j > 0 \\ 1 & \text{if } j = 0 \end{cases}

and the relative demand function faced by the intermediate firm.

Wage Setting

In this economy, the representative household supplies homogeneous labor, l_t^h, to a unitary continuum of intermediate labor unions indexed by u. Household and labor unions are price takers with regard to the price, W_t^h, of this type of labor, for which the real counterpart corresponds to the marginal rate of substitution of consumption for leisure. The intermediate labor unions aim at differentiating the household's labor and sell this outcome, l_t(u), to a labor agency, setting its price, W_t(u), according to a mechanism à la Calvo (1983). Then the labor agency aggregates these differentiated labor services into a labor package, L_t, and supplies it to productive firms.

Consequently, we assume that the labor agency offers a labor aggregate, L_t, to intermediate firms, derived from differentiated labor unions, l_t(i), according to

\int_0^1 G^L\!\left( \frac{l_t(i)}{\varepsilon_t^w L_t} \right) di = 1,

where G^L(·) is an increasing, concave function that verifies G^L(1) = 1, and ε_t^w is a stochastic shock that distorts the aggregator function. Hence, the labor agency maximizes its profit given by

\Pi_t = W_t L_t - \int_0^1 W_t(i) \, l_t(i) \, di.

Then, the labor unions set their prices following a Calvo scheme, facing the previous relative demand function and given the wage rate paid to households, W_t^h. More precisely, each labor union seeks to maximize its discounted cash flows by setting the wage rate, W̃_t(u). With probability ξ_w, the union cannot optimally adjust its wage rate at time t; instead, the union adjusts the wage from consumer price inflation according to the following rule:

W_t(u) = \bar{\pi}_t^{1-\gamma_w} \, \pi_{t-1}^{\gamma_w} \, (1+g) \, W_{t-1}(u) = \Gamma_t^w W_{t-1}(u).

With probability 1 − ξ_w, the union is able to choose the optimal wage, W̃_t(u). The labor union's problem can be written as follows:

\max_{\tilde{W}_t(u)} \; E_t \sum_{j=0}^{+\infty} (\beta \xi_w)^j \, \frac{\lambda_{t+j} P_t}{\lambda_t P_{t+j}} \left[ W_{t+j}(u) - W_{t+j}^h \right] l_{t+j}(u)

with the condition

W_{t+j}(u) = \Gamma_{t,j}^w \, \tilde{W}_t(u), \qquad \Gamma_{t,j}^w \equiv \begin{cases} \prod_{s=1}^{j} \Gamma_{t+s}^w & \text{if } j > 0 \\ 1 & \text{if } j = 0 \end{cases}

and subject to the relative demand function faced by the labor union.
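A standard property of the Calvo scheme used above is worth keeping in mind: with a constant non-reset probability ξ each period, spell lengths are geometric, so the expected duration of a price or wage is 1/(1 − ξ). A small illustration (the ξ values below are hypothetical, not the calibrated ones):

```python
# Probability that a price set today is still un-reset after j periods: xi**j.
# Expected spell length: sum over j >= 0 of xi**j = 1 / (1 - xi).
def expected_duration(xi):
    """Mean number of periods a Calvo price/wage stays in place."""
    assert 0 <= xi < 1
    return 1.0 / (1.0 - xi)

for xi in (0.66, 0.75):   # hypothetical Calvo probabilities
    print(f"xi = {xi}: average spell of {expected_duration(xi):.1f} periods")
```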

Government, Nominal Distortions, and Aggregation

We assume that government bonds and transfers evolve according to

P_t T_t + \frac{B_t}{\varepsilon_t^b R_t} = B_{t-1} + P_t G_t,

where G_t is an exogenous process such that the ratio G_t/Y_t = ε_t^g follows an AR(1) process in log.3

3. We use the terms “government shocks” and “external shocks” interchangeably in the following text.

In addition, the central bank sets the current interest rate according to the following Taylor rule in its nonlinear form:

R_t = \bar{R}^{1-\rho_r} \, R_{t-1}^{\rho_r} \left[ \left( \frac{\pi_t}{\bar{\pi}_t} \right)^{\phi_\pi} \left( \frac{Y_t}{Y_t^{DSGE}} \right)^{\phi_y} \right]^{1-\rho_r} \left( \frac{Y_t / Y_t^{DSGE}}{Y_{t-1} / Y_{t-1}^{DSGE}} \right)^{r_{\Delta y}} \left( \frac{\pi_t}{\pi_{t-1}} \right)^{r_{\Delta\pi}} \varepsilon_t^m,

where ρ_r represents the central bank's preference for a smooth interest rate, ε_t^m is a monetary shock, π̄_t is a time-varying inflation target, and Y_t^{DSGE} is the output given by a fictional world without nominal rigidities, that is, by setting ξ_p and ξ_w to zero. Hence, this is a measure of the potential output of such a fictional economy.
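The policy rule can be sketched directly in code (a minimal sketch; the coefficient values are illustrative only, gap stands for Y_t/Y_t^DSGE, and r_dy and r_dpi name the reaction coefficients on the gap's growth and on the change in inflation):

```python
def taylor_rate(R_bar, R_lag, pi, pi_target, pi_lag, gap, gap_lag,
                rho_r=0.8, phi_pi=1.5, phi_y=0.125,
                r_dy=0.1, r_dpi=0.1, eps_m=1.0):
    """Nonlinear Taylor rule: interest-rate smoothing, level responses to the
    inflation and output gaps, responses to their changes, and a monetary shock."""
    level = ((pi / pi_target) ** phi_pi * gap ** phi_y) ** (1 - rho_r)
    growth = (gap / gap_lag) ** r_dy * (pi / pi_lag) ** r_dpi
    return R_bar ** (1 - rho_r) * R_lag ** rho_r * level * growth * eps_m

# With inflation at target, a closed and unchanging gap, no shock, and
# R_lag = R_bar, the rule returns the steady-state rate R_bar:
print(taylor_rate(R_bar=1.01, R_lag=1.01, pi=1.005, pi_target=1.005,
                  pi_lag=1.005, gap=1.0, gap_lag=1.0))  # approximately 1.01
```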
Despite the heterogeneity of the wages and prices due to the Calvo scheme, we are able to define aggregates for this economy. In fact, total production, that is, the sum of all productions from intermediate firms, y_t, is a priori different from the aggregate final product, Y_t. Consequently, a price distortion, D_t^p, exists such that y_t = Y_t D_t^p. The same considerations apply to the labor market. Total work effort provided by the representative household is l_t^h. Hence, a wage dispersion exists such that l_t^h = D_t^w L_t.^4
We now close the model by deriving the
clearing condition on the final product market.
First, we need to compute aggregate dividends
from intermediate firms:
Π_t^f = ∫_0^1 [ P_t(f) − P_t mc_t ] y_t(f) df = P_t Y_t − P_t mc_t y_t.

Aggregate dividends from labor unions are

Π_t^u = ∫_0^1 [ W_t(u) − W_t^h ] l_t(u) du = W_t L_t − W_t^h l_t.

Combining these two equations with the
household’s and government nominal budget
constraints, and using the competitive market
condition for production inputs, leads to

Ct + I t + Gt = Yt .
^4 In fact, these nominal distortions disappear in a linearized model, as in Smets and Wouters (2007). Nevertheless, we need to deal with these distortions as we plan to simulate the model at the second order.


ESTIMATION AND IMPLEMENTATION OF THE PFA METHOD
In this section, we first present the estimation of the two versions of the model on U.S. and euro-area data. Then we describe how we integrate a potential output measure based on the PFA into the model in a fully consistent manner.

Functional Forms and Stochastic
Structure
For estimation and simulation, we choose the following functional forms for investment adjustment costs, time-varying depreciation adapted from Greenwood, Hercowitz, and Huffman (1988), and Kimball aggregators that follow the specifications of Dotsey and King (2005):

S(x) = (1 − x(1 + g))^2 / (2φ),   φ > 0,

δ(z) = ψ_1 + ψ_2 z^d / d,

G_i(x) = [1/((1 + ω_i)ς_i)] [(1 + ω_i)x − ω_i]^(ς_i) + 1 − 1/((1 + ω_i)ς_i),   i ∈ {Y, L}.
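For intuition, the normalization G_i(1) = 1 that the aggregation conditions rely on can be checked numerically. The sketch below is our own transcription of the Dotsey-King form above, not the authors' code:

```python
def kimball_G(x, omega, sigma):
    """Kimball aggregator in the Dotsey-King (2005) form:
    G(x) = [(1+w)x - w]^s / ((1+w)s) + 1 - 1/((1+w)s)."""
    a = 1.0 / ((1.0 + omega) * sigma)
    return a * ((1.0 + omega) * x - omega) ** sigma + 1.0 - a
```

At x = 1 the bracketed term equals 1 for any ω, so G(1) = 1 holds identically, consistent with the symmetric steady state of the aggregation constraints ∫ G(·) di = 1.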
We choose the following stochastic structure for the exogenous processes in this model:

ln ε_t^κ = ρ_κ ln ε_{t−1}^κ + ν_t^κ,   ν_t^κ ~ N(0, σ_κ),

with κ ∈ {i, m, p, b, l, y, w}. Finally, we assume that the central bank’s target and the government expenses on production ratio evolve according to

π̄_t = π̄^(1−ρ_π) (π̄_{t−1})^(ρ_π) ε_t^p,

ln ε_t^g = (1 − ρ_g) ln(g_y) + ρ_g ln ε_{t−1}^g + ν_t^g,   ν_t^g ~ N(0, σ_g).
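Shock processes of this log-AR(1) type can be simulated in a few lines; the following is a generic sketch (ours, with illustrative names), where sigma is the innovation standard deviation:

```python
import math
import random

def simulate_log_ar1(rho, sigma, n, seed=0):
    """Simulate eps_t with ln(eps_t) = rho*ln(eps_{t-1}) + nu_t,
    nu_t ~ N(0, sigma^2), starting from the unit mean eps_0 = 1."""
    rng = random.Random(seed)
    log_eps, path = 0.0, []
    for _ in range(n):
        log_eps = rho * log_eps + rng.gauss(0.0, sigma)
        path.append(math.exp(log_eps))
    return path
```

With σ = 0 the shock stays at its unit mean; with ρ close to 1 (e.g., ρ_a ≈ 0.99 in Table 1) realizations are highly persistent.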

Then, before starting estimation procedures, we need to make the model stationary. Indeed, as the model features a balanced growth trend, it is necessary to turn it into its intensive form for simulations. All real variables of interest are deflated by the deterministic trend (1 + g)^t. We then rewrite the model’s equations with intensive variables.^5

Prior Distributions, Calibration, and Data
We use Bayesian techniques to estimate the main free parameters of the model. Broadly speaking, we compute by numerical simulation the maximum of the posterior density of the parameters by confronting a priori knowledge about them, through the likelihood function, with the data.^6 The first column of Table 1 shows the different priors set to estimate both the U.S. and euro-area models.
Almost all of the model’s parameters are estimated, with the following exceptions: The time preference parameter β is set at 0.998; the Kimball function parameters ς_Y and ς_L are calibrated at 1.02 as in Dotsey and King (2005); and the average quarterly growth rate of gross domestic product (GDP), g, is set at 0.66 percent for the euro area and 0.37 percent for the U.S. economy, based on our database. Note that the prior density functions are quite noninformative for most of the estimated parameters, except for the inertia coefficients of productivity shocks and of what we call “government shocks.” We used the finding of highly persistent shocks in previous work (e.g., Smets and Wouters, 2007) as a prior belief.
The data sources are as follows. We use time series from 1970:Q1 to 2007:Q4 (United States) and 2006:Q4 (euro area). For U.S. data, GDP, consumption, investment, and the GDP deflator are from the Bureau of Economic Analysis national accounts. The capacity utilization rate and nominal interest rate (the federal funds effective rate) are from the Federal Reserve Board database. For the euro-area data, GDP, consumption, investment, the short-term interest rate, and the GDP deflator are from the Area-wide Model (AWM) database (Fagan, Henry, and Mestre, 2001). Capacity utilization rate data are from the Eurostat database. Finally, data on labor markets have been used to
^5 As written in the technical appendix.

^6 See Schorfheide (2000) and Smets and Wouters (2003).


detrend extensive variables. Total employment and hours worked for the U.S. and euro-area economies are from the OECD’s Economic Outlook database (OECD, 2005). European total employment data are from the AWM database.
All extensive variables, namely GDP, consumption, and investment, are first detrended through a Hodrick-Prescott (HP) filter with parameter 1600, using a trend in labor that consists of total hours worked. Then, these variables are deflated by the GDP deflator. We therefore compute the average quarterly growth rate of real gross productivity and detrend again all extensive variables by the corresponding deterministic time trend. Finally, these variables are divided by the mean of GDP over the period.^7
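The HP step can be reproduced with a direct solve of the filter's normal equations. The sketch below is a generic implementation of ours (not the authors' code), using the penalized least-squares form of the filter:

```python
import numpy as np

def hp_trend(x, lamb=1600.0):
    """HP trend m solving min sum (x - m)^2 + lamb * sum (D2 m)^2,
    i.e. the linear system (I + lamb * D'D) m = x, where D is the
    second-difference operator stacked into an (n-2, n) matrix."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = (1.0, -2.0, 1.0)
    return np.linalg.solve(np.eye(n) + lamb * D.T @ D, x)
```

A linear series has zero second differences and is returned unchanged, and the implied cyclical part x − hp_trend(x) sums exactly to zero, both useful checks on the implementation.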

Implementing the Production
Function Method
The first step consists of estimating the benchmark DSGE model for the two economies and checking the consistency of the estimates given in the last two columns of Table 1.^8 We then simulate the model to obtain consistent time series for production, investment, labor, and capacity utilization.
We now are able to (i) compute the physical capital stock series according to the permanent inventory method (PIM), taking into account the deterministic trend,

k_t^PIM = ((1 − δ^PIM) / (1 + g)) k_{t−1}^PIM + i_t,

as well as the age of capital,

age_t = ((1 − δ) / (1 + g)) (k_{t−1}^PIM / k_t^PIM) (age_{t−1} + 1),

and (ii) extract the Solow residual, s_t, as

s_t = ln(y_t) − α ln(k_t^PIM) − (1 − α) ln(L_t).
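These recursions are cheap to compute. The following sketch is our own illustration (names are ours): it iterates the PIM capital stock and its age, and evaluates the Solow residual:

```python
import math

def pim_series(invest, k0, age0, delta, g):
    """Iterate k_t = (1-delta)/(1+g) * k_{t-1} + i_t and
    age_t = (1-delta)/(1+g) * (k_{t-1}/k_t) * (age_{t-1} + 1).
    Returns a list of (k_t, age_t) pairs."""
    k, age, out = k0, age0, []
    for i_t in invest:
        k_new = (1.0 - delta) / (1.0 + g) * k + i_t
        age = (1.0 - delta) / (1.0 + g) * (k / k_new) * (age + 1.0)
        k = k_new
        out.append((k, age))
    return out

def solow_residual(y, k, L, alpha):
    """s_t = ln(y_t) - alpha*ln(k_t) - (1-alpha)*ln(L_t)."""
    return math.log(y) - alpha * math.log(k) - (1.0 - alpha) * math.log(L)
```

With zero investment the stock simply depreciates and its age advances by exactly one each period, which is a handy sanity check on the age recursion.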
^7 We deliberately exclude data on wages and labor in the estimation process primarily because of the lack of labor market sophistication in the model.

^8 As a consistency check, one can verify that the posterior modes obtained by the estimation process correspond to the maximum of the likelihood function in the parameter direction. Such representations are given in the technical appendix.


Table 1
Prior Distributions and Posterior Modes

                              Prior distribution            Posterior modes
Parameter                     Type    Mean      SD          Euro area   United States

Preferences
θ                             beta     0.500    0.2000       0.3210      0.2248
σc                            norm     1.500    0.5000       1.1592      1.4674
σl                            norm     2.000    0.5000       1.9061      0.6218

Production and technology
d                             gamma    1.500    0.2000       1.6581      1.8098
δ̄                             beta     0.500    0.2000       0.0820      0.0569
φ                             norm     5.500    5.0000       0.1824      0.0890
α                             beta     0.500    0.2000       0.2409      0.1907

Kimball aggregators
ωY                            norm    –6.000    5.0000      –5.1063     –4.4232
ωL                            norm   –18.000    5.0000     –16.0926    –16.1249

Calvo settings
ξp                            beta     0.500    0.2000       0.5411      0.4252
γp                            beta     0.500    0.2000       0.0432      0.1295
ξw                            beta     0.500    0.2000       0.4431      0.5100
γw                            beta     0.500    0.2000       0.7980      0.7781

Steady-state values
z̄                             norm     0.814    0.1000       0.8255      0.8040
g̅y                            beta     0.200    0.1000       0.1935      0.0461
π̄                             norm     1.014    0.1000       1.0077      1.0125
L̄                             norm     1.000    0.1000       1.0002      1.0001
ȳ                             norm     1.000    0.1000       0.8602      0.9178

Autoregressive parameters of shocks
ρa                            beta     0.990    0.0010       0.9908      0.9904
ρi                            beta     0.500    0.2000       0.1739      0.1655
ρπ                            beta     0.500    0.2000       0.0510      0.0961
ρm                            beta     0.500    0.2000       0.9686      0.9391
ρp                            beta     0.500    0.2000       0.0510      0.0960
ρl                            beta     0.500    0.2000       0.4999      0.4989
ρg                            beta     0.970    0.0100       0.9750      0.9980
ρb                            beta     0.500    0.2000       0.9662      0.9571
ρw                            beta     0.500    0.2000       0.5007      0.4998
ρy                            beta     0.900    0.0500       0.8473      0.9180


Table 1, cont’d
Prior Distributions and Posterior Modes

                              Prior distribution            Posterior modes
Parameter                     Type    Mean      SD          Euro area   United States

Taylor rule
φπ                            norm     2.000    0.5000       2.5860      2.6345
φy                            norm     0.100    0.0500       0.0905      0.1394
r∆y                           norm     0.000    0.5000       0.3419      0.6927
r∆π                           norm     0.300    0.1000       0.1384      0.1252
ρr                            beta     0.500    0.2000       0.9268      0.8505

Standard deviation of shocks
νi                            invg     0.100    2.0000       0.0208      0.0512
νa                            invg     0.010    2.0000       0.0072      0.0061
νp                            invg     0.001    2.0000       0.0033      0.0026
νm                            invg     0.001    2.0000       0.0004      0.0006
νb                            invg     0.100    2.0000       0.0018      0.0020
νl                            invg     0.001    2.0000       0.0005      0.0005
νg                            invg     0.001    2.0000       0.0197      0.0786
νw                            invg     0.001    2.0000       0.0005      0.0005
νy                            invg     0.100    2.0000       0.0184      0.0209

Data density                                                 3,634       3,598

NOTE: This table shows the prior distributions of the benchmark model parameters and estimation results at the mode of the marginal posterior densities. Prior probability density functions are normal (norm), beta (beta), gamma (gamma), or inverse gamma (invg). SD, standard deviation.

Table 2
Estimates of the TFP Equation

Study area       γ0 (intercept)     γ1 (s_{t−1})      γ2 (ln z_t)       γ3 (age_t)           R²
Euro area        –0.0100 (0.0061)   0.9059 (0.0081)   –0.1226 (0.0123)  –4.3e-03 (7.0e-04)   0.9974
United States    –0.0300 (0.0082)   0.8784 (0.0090)   –0.1477 (0.0123)  –2.2e-03 (5.8e-04)   0.9946

NOTE: This table shows estimates of the TFP equation based on simulated series of 3,000 observations, where the first 1,000 have been dropped. We ran 1,000 regressions. The figures in the table correspond to the average parameters over these regressions. Average standard deviations are listed in parentheses.


Figure 1
Impulse Response Function for Production and DSGE/PFA Measures of Potential Output (United States)
[Nine panels plot the responses of y, y^DSGE, and y^PFA to each of the shocks ν^y, ν^w, ν^a, ν^i, ν^m, ν^b, ν^p, ν^l, and ν^g over a 10-year horizon.]
Finally, we estimate the following TFP equation:

s_t = γ_0 + γ_1 s_{t−1} + γ_2 ln(z_t) + γ_3 age_t + ε_t,

where ε_t is an i.i.d. process.^9
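A regression of this form can be run by ordinary least squares; below is a minimal sketch of ours (not the authors' code) using numpy's least-squares solver on a lagged design matrix:

```python
import numpy as np

def estimate_tfp(s, z, age):
    """OLS for s_t = g0 + g1*s_{t-1} + g2*ln(z_t) + g3*age_t + e_t.
    Returns the coefficient vector (g0, g1, g2, g3)."""
    s, z, age = map(np.asarray, (s, z, age))
    y = s[1:]                      # dependent variable, losing one lag
    X = np.column_stack([np.ones(len(y)), s[:-1], np.log(z[1:]), age[1:]])
    gamma, *_ = np.linalg.lstsq(X, y, rcond=None)
    return gamma
```

On data generated exactly from the equation, the true coefficients are recovered, so the estimator can be validated on simulated series before being applied.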
Table 2 gives the estimates of the TFP equation for both the U.S. and euro-area model economies. It is worth noting that the results show a negative coefficient on capacity utilization, contrary to the economic intuition we assumed in the section on benchmarking the DSGE model.
We then compute the potential production based on the PFA. First, we assume that potential capital is taken as k_t^PIM, as computed from the PIM. Then, we use filtered data to assess potential employment, L_t^Filt.^10 Finally, we define the medium-term potential TFP, ŝ_t^MT, from our previous estimates by setting z_t ≡ z̄ and eliminating the lagged term^11:

^9 See Cahn and Saint-Guilhem (2009).
^10 More specifically, we use a moving-average version of the HP filter. Formally, if a process, x_t, can be split between a cyclical part, c_t, and a smooth trend, m_t, the HP filter defines the cyclical part as

c_t = [ λ(1 − L)^2 (1 − L^−1)^2 / (1 + λ(1 − L)^2 (1 − L^−1)^2) ] x_t,

where L is the lag operator. Expanding this expression and considering that c_t = x_t − m_t, we use the following relation to define potential labor:

L_t = L_t^Filt + λ(L^−2 − 4L^−1 + 6I − 4L + L^2) L_t^Filt.

Finally, we set λ = 1600 as is standard for quarterly economic time series.

^11 See Cahn and Saint-Guilhem (2009).


Figure 2
Impulse Response Function for Production and DSGE/PFA Measures of Potential Output (Euro Area)
[Nine panels plot the responses of y, y^DSGE, and y^PFA to each of the shocks ν^y, ν^w, ν^a, ν^i, ν^m, ν^b, ν^p, ν^l, and ν^g over a 10-year horizon.]

ŝ_t^MT = γ_0/(1 − γ_1) + (γ_2/(1 − γ_1)) ln(z̄) + (γ_3/(1 − γ_1)) age_{t+1}.

Consequently, the potential output based on our production function method is given by

y_t^PFA = e^(ŝ_t^MT) (k_t^PIM)^α (L_t^Filt)^(1−α).
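Putting the pieces together, the medium-term TFP and the PFA potential output follow directly from the estimated TFP-equation coefficients γ = (γ0, γ1, γ2, γ3). The sketch below is our own transcription, with names of our choosing:

```python
import math

def pfa_potential(gamma, z_bar, age_next, k_pim, L_filt, alpha):
    """y_PFA = exp(s_MT) * k^alpha * L^(1-alpha), where the medium-term
    TFP is s_MT = (g0 + g2*ln(z_bar) + g3*age_{t+1}) / (1 - g1)."""
    g0, g1, g2, g3 = gamma
    s_mt = (g0 + g2 * math.log(z_bar) + g3 * age_next) / (1.0 - g1)
    return math.exp(s_mt) * k_pim ** alpha * L_filt ** (1.0 - alpha)
```

With all γ set to zero (so ŝ^MT = 0), potential output reduces to the Cobb-Douglas kernel k^α L^(1−α), which pins down the scaling.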

DISCUSSION
In this section, we analyze and compare the
dynamic behavior of the DSGE and PFA estimates
of potential output through impulse response
functions (IRFs) and variance decomposition. In
the following, the terms “U.S. economy/model”
and “euro-area economy/model” refer to the
models estimated on U.S. data or euro-area data,
respectively.

IRF Analysis
Figures 1 and 2 plot the IRFs of the stochastic shocks for actual output and both DSGE- and PFA-based measures of potential output, calculated with the estimated parameters given in Table 1. Figures 3 through 8 show the IRFs for various factors (output, consumption, investment, nominal interest rate, inflation, and DSGE- and PFA-based output gaps). Results for the U.S. and euro-area models are broadly similar, except for the response to an external expenses shock, ν^g, for which the U.S. response appears to be much more inert than the euro-area response. The following analysis applies for both economies, apart from this shock.

Figure 3
Impulse Response Function for Output, Consumption, and Investment (United States)
[Nine panels plot the responses of y, c, and i to each of the nine shocks over a 10-year horizon.]
The figures show that after a positive productivity shock, ν^a, actual, DSGE, and PFA potential outputs rise together, but the PFA measure rises more gradually. Moreover, the response of the model-based potential output seems to be more persistent for the DSGE than for the PFA. Indeed, a positive productivity shock results in an increase in investment, and therefore the age of the capital stock grows gradually, as do the medium-term TFP and PFA potential outputs. On the other hand, the productivity shock instantly affects both the productivity term and the Solow residual. Consequently, after such a shock, both actual and PFA potential outputs evolve similarly, but the gap between them remains constant for a longer time than with the DSGE potential output (see Figures 5 and 8).
A positive investment shock, ν^i (which is quantitatively contractionary), leads to similar dynamics for the three output measures. The shock deforms the adjustment cost function, leading to an increase in the cost of new capital. Hence, investment falls and the capital stock shows a hump-shaped decrease, reflected in its age and then in potential TFP. Interestingly, all these variables cross their steady-state path simultaneously after about 6 years. Before that date, the PFA potential output is below the DSGE potential output; afterward, the ordering reverses, and the actual output lies between the two measures. This implies that the two related gap measures evolve in opposite directions after an investment shock.

Figure 4
Impulse Response Function for Nominal Interest Rate and Inflation (United States)

[Nine panels plot the responses of π and R to each of the nine shocks over a 10-year horizon.]
With respect to a positive labor supply shock, ν^l, PFA potential output does not react, whereas DSGE potential output decreases instantaneously, as does actual output but to a lesser extent. In fact, labor in the world without nominal rigidities can adjust more rapidly, and the reaction of DSGE potential output is one order of magnitude higher than for actual and PFA potential outputs.
Conversely, after a positive government shock, ν^g, actual and DSGE potential outputs shift upward suddenly, whereas PFA potential output gradually reaches their level. After the shock, demand for output shifts upward instantly, coinciding with a higher level of employment. Hence, potential employment grows gradually and then results in the slower increase in PFA-based potential output. Note that the response to the government shock of the U.S. model is more persistent
than for the euro-area model. This is mainly due to a more persistent stochastic structure of the shock estimated for the United States.
Not surprisingly, DSGE potential output does not respond to any nominal shocks (namely, markup, monetary, or equity premium shocks), as these shocks do not enter the real side of the model. The most remarkable fact is that PFA potential output reacts significantly to such shocks. After a positive monetary shock to the interest rate, ν^m, both actual and PFA output show a hump-shaped decrease. The qualitative effects of the equity premium shock, ν^b, are quite similar.
The model economies show similar responses
to the price and wage markup shocks with first
an instantaneous fall in actual output and then a hump-shaped increase. Nevertheless, the actual output reaction to a wage markup shock is about two orders of magnitude less than the response to a price markup shock. PFA potential output responds to these shocks in a similar manner but more gradually, generating a persistent drift.

Figure 5
Impulse Response Function for Inflation-, DSGE-, and PFA-Based Output Gaps (United States)
[Nine panels plot the responses of gap^DSGE, gap^PFA, and π to each of the nine shocks over a 10-year horizon.]

Variance Decomposition

Table 3 shows the contribution of each structural shock to the asymptotic forecast error variance of the endogenous variables shown in Table 4. For the U.S. economy, the productivity shock seems to dominate asymptotically the sources of actual and DSGE-based potential outputs, at about 50 percent and 60 percent, respectively. A government spending shock is the other main source of fluctuations, accounting for 27 percent of actual production and 37 percent of DSGE potential.
The interest rate shock appears to create the most striking difference between actual and DSGE potential output variance; it accounts for about 55 percent of the related output gap measure. For the PFA potential measure, the external spending shock is the main contributor, accounting for 43 percent of the variance. The productivity shock contribution reaches only 15 percent, less than the interest shock (21 percent). All in all, contrary to the DSGE-based measure, the productivity shock accounts for 68 percent of the PFA output gap variance.
For the euro-area model economy, the variance decomposition of actual production is quite similar to that of the United States. Nevertheless, almost all the variance of DSGE-based potential output seems to be derived from the productivity shock, whereas the largest part of the PFA potential output variance comes from the interest shock (76 percent), and from the productivity shock to a lesser extent (12 percent). As a result, DSGE output gap variations come primarily from the interest shock (82 percent), and PFA gap variance is derived from the productivity shock (74 percent).
Finally, Table 4 shows that both DSGE and PFA potential growth are less volatile than actual output. Nevertheless, one cannot conclude that the PFA-based measure is smoother than the DSGE one.

Figure 6
Impulse Response Function for Output, Consumption, and Investment (Euro Area)
[Nine panels plot the responses of y, c, and i to each of the nine shocks over a 10-year horizon.]

Implications
Our analysis suggests that the PFA and the DSGE approaches to potential output measurement differ significantly, at least from a business cycle perspective. For two different models (one close to the U.S. data, the other to the euro-area data), the output gap related to the DSGE measure captures mainly nominal shocks, which together account for more than 80 percent (about 97 percent for the U.S. model) of the gap variance. Alternatively, the PFA gap reacts mainly to the productivity shock (about 70 percent of the variance).
As a consequence, it seems to us that using the
PFA to compute potential output and the related
output gap presents some drawbacks related to its ability to properly reflect inflationary pressures related to nominal shocks. In contrast, the DSGE-based potential output measure could lead to misstatements about potential growth, as this measure reacts to temporary but persistent shocks such as productivity shocks. These two assessments can lead to contradictions in terms of economic diagnostics. For instance, assuming that the model is the one that generates the actual data, one could think that during the 1990-95 period, GDP growth in the United States (2.4 percent) was below its potential based on the DSGE measure (2.7 percent), as stated in Table 5. One would reach the opposite conclusion using the PFA-based measure (1.7 percent). The same contradiction holds for the euro-area economy during the 2000-05 period.

Figure 7
Impulse Response Function for Nominal Interest Rate and Inflation (Euro Area)
[Nine panels plot the responses of π and R to each of the nine shocks over a 10-year horizon.]

From an empirical point of view, these results tend to moderate the possible structural interpretations of international comparisons based on the PFA. Indeed, if one believes that some structural shocks drive the dynamics of economic variables and wants to compare the potential growth of several economies using the PFA, the fact that the results depend on the idiosyncratic shocks faced by each economy must be considered. Consequently, this argues for a normalization of such structural shocks before applying the PFA. For instance, based on the PFA (left side of Table 5), it appears that actual growth in the euro-area economy stood below its PFA potential over the past 15 years. Conversely, the U.S. economy’s
actual growth was above its PFA potential. Moreover, the U.S. PFA potential was higher than the euro area’s. Does this clarify the need for structural reforms in the European economy to keep pace with the U.S. economy? Imagine that both economies interchanged the structural shocks they faced. Would we observe identical behavior? The three right columns of Table 5 present the results of such an experiment; they lead to the exact opposite conclusion regarding the comparison between the United States and the euro area.

Figure 8
Impulse Response Function for Inflation-, DSGE-, and PFA-Based Output Gaps (Euro Area)
[Nine panels plot the responses of gap^DSGE, gap^PFA, and π to each of the nine shocks over a 10-year horizon.]
Alternatively, a monetary authority that must
conduct interest rate policy based on a Taylor
rule that includes an output gap measure could
make the opposite decision depending on the
method used to measure drift between actual and
potential output. For instance, after a positive
productivity shock, the central bank could decide
to instantaneously increase the nominal interest rate if based on the PFA gap estimates, whereas the decision would be to decrease the interest rate (as shown in Figure 4), at least in a DSGE framework.

CONCLUSION
In this article, we compared the PFA measure of potential output with the DSGE definition of potential output in a fully integrated framework. We estimated a DSGE model on U.S. and euro-area data and integrated into the two versions of the model a PFA measure of potential output fully consistent with the model. Results have shown that, in a DSGE framework, the PFA leads to potential output measures that are not exempt from

Table 3
Variance Decomposition

Shocks                     y_t   y_t^DSGE   y_t^PFA   π_t   c_t   i_t   R_t   gap_t^DSGE   gap_t^PFA

United States
Productivity shock          51     59         15       24    65    28     4      14           67
Inflation target shock      —      —          —        11    11    —      —      —            —
Labor supply shock          —      —          —        —     —     —      —      —            —
External spending shock     27     37         43       —     15     8     —      —             1
Investment shock             5      4         10        1     1    45     4       3            4
Equity premium shock         4      —          6       11     4     4    68      18            2
Interest rate shock         11      —         21       32    11    14    12      55            4
Price distortion shock       2      —          5       21     4     1    12      10           22
Wage distortion shock       —      —          —        —     —     —      —      —            —

Euro area
Productivity shock          48     99         12        4    67    22     2       3           73
Inflation target shock      —      —          —         7    —     —      —      —            —
Labor supply shock          —      —          —        —     —     —      —      —            —
External spending shock      1      1          1       —      5    —      —      —             1
Investment shock             1      —          1       —     —      3     1      —             1
Equity premium shock         5      —          7        8     3     7    44      10            3
Interest rate shock         42      —         75       72    22    65    48      82           18
Price distortion shock       3      —          4        9     3     3     5       5            4
Wage distortion shock       —      —          —        —     —     —      —      —            —

NOTE: This table presents the theoretical variance decomposition among the model’s shocks (expressed in percent).

Table 4
Theoretical Moments

                 United States           Euro area
Variable         Mean       SD           Mean       SD
y                0.9178     0.0794       0.8602     0.0909
y^DSGE           0.9178     0.0667       0.8602     0.0611
y^PFA            0.9178     0.0620       0.8602     0.0819
C                0.7237     0.0525       0.5083     0.0433
π                1.0125     0.0066       1.0077     0.0117
R                1.0239     0.0105       1.0136     0.0143
i                0.1517     0.0229       0.1855     0.0347
gap^DSGE         0.0000     0.0394       0.0000     0.0755
gap^PFA          0.0000     0.0534       0.0000     0.0598


Table 5
Annual Potential Growth Comparison

                 United States                 United States*
Period           y     y^DSGE   y^PFA         y     y^DSGE   y^PFA
1990-1995        2.4   2.7      1.7           2.0   1.6      2.6
1995-2000        3.9   3.4      3.8           2.4   2.5      3.0
2000-2005        2.4   1.9      1.7           0.0   1.2      0.6
1990-2005        2.9   2.6      2.4           1.5   1.8      2.1

                 Euro area                     Euro area†
Period           y     y^DSGE   y^PFA         y     y^DSGE   y^PFA
1990-1995        1.8   2.5      1.5           2.4   3.6      2.2
1995-2000        2.4   2.8      2.7           4.2   4.1      2.3
2000-2005        1.9   1.9      2.9           4.8   3.9      3.1
1990-2005        2.0   2.4      2.4           3.8   3.8      2.5

NOTE: This table shows actual and potential growth on average over different subperiods. Figures are given in percent. They also
include both the deterministic and labor trends. *U.S. model simulated with euro-area smoothed shocks. † Euro-area model simulated
with U.S. smoothed shocks.

the effects of nominal or temporary shocks. The empirical implication of these results is that estimates of potential output based on an ad hoc PFA could be highly dependent on transitory phenomena. Moreover, cross-country differences in potential output based on the PFA are likely to reflect not only structural differences, but also different patterns of shocks across time. This calls for an assessment of the quantitative role of shocks in cross-country differences in potential output. One way to address this issue is to implement in a DSGE model a scenario comparing potential output across economies confronted by the same shocks across time, while exhibiting differences in structural parameters.
However, to answer this question in a more satisfactory manner, we need to improve the present study in several directions. First, it would be of particular interest to identify the causes of divergences between PFA and DSGE potential output measures. Such an analysis could be conducted parameter by parameter to assess their weight in the discrepancy between the two assessments. Second, one would need to improve the estimation procedure by identifying the marginal posterior density of the model through Markov-chain Monte Carlo simulations on the one hand, and by allowing structural breaks in the TFP regression equation on the other hand. Finally, one could study the implications for monetary policy of the use of PFA rather than DSGE measures of the output gap in a class of central bank decision rules. Obviously, these studies should be performed with an enhanced model, especially with regard to the modeling of the labor market, with an extension of the model introducing unemployment and participation considerations to account for additional sources of fluctuations in potential output and the output gap.

REFERENCES
Cahn, Christophe and Saint-Guilhem, Arthur.
“Potential Output Growth in Several Industrialised
Countries: A Comparison.” Empirical Economics,
2009 (forthcoming).
Calvo, Guillermo A. “Staggered Prices in a Utility-Maximizing Framework.” Journal of Monetary Economics, September 1983, 12(3), pp. 383-98.

J U LY / A U G U S T

2009

239

Cahn and Saint-Guilhem

Dotsey, Michael and King, Robert G. “Implications
of State-Dependent Pricing for Dynamic
Macroeconomic Models.” Journal of Monetary
Economics, January 2005, 52(1), pp. 213-42.
Edge, Rochelle M.; Kiley, Michael T. and Laforte,
Jean-Philippe. “Natural Rate Measures in an
Estimated DSGE Model of the U.S. Economy.”
Finance and Economics Discussion Series No.
2007-08, Federal Reserve Board, Washington, DC;
www.federalreserve.gov/pubs/feds/2007/200708/
200708pap.pdf.
Fagan, Gabriel; Henry, Jerome and Mestre, Ricardo.
“An Area Wide Model (AWM) for the Euro Area.”
ECB Working Paper No. 42, European Central Bank,
January 2001;
www.ecb.int/pub/pdf/scpwps/ecbwp042.pdf.
Greenwood, Jeremy; Hercowitz, Zvi and Huffman,
Gregory W. “Investment, Capacity Utilization, and
the Real Business Cycle.” American Economic
Review, June 1988, 78(3), pp. 402-17.
International Monetary Fund. “France—2007 Article IV
Consultation Concluding Statement.” International
Monetary Fund, November 19, 2007;
www.imf.org/external/np/ms/2007/111907.htm.

240

J U LY / A U G U S T

2009

Kimball, Miles. “The Quantitative Analytics of the
Basic Neomonetarist Model.” Journal of Money,
Credit, and Banking, 1995, 27(4 Part 2), pp. 1241-77.
Neiss, Katherine and Nelson, Edward. “Inflation
Dynamics, Marginal Costs and the Output Gap:
Evidence from Three Countries.” Journal of
Money, Credit, and Banking, December 2005, 37(6),
pp. 1019-45.
Organisation for Economic Co-operation and
Development. OECD Economic Outlook No. 78.
December 2005.
Schorfheide, Frank. “Loss Function-Based Evaluation of DSGE Models.” Journal of Applied Econometrics, 2000, 15(6), pp. 645-70.
Smets, Frank and Wouters, Rafael. “An Estimated
Dynamic Stochastic General Equilibrium Model of
the Euro Area.” Journal of the European Economic
Association, 2003, 1(5), pp. 1123-75.
Smets, Frank and Wouters, Rafael. “Shocks and
Frictions in U.S. Business Cycles: A Bayesian DSGE
Approach.” American Economic Review, June 2007,
97(3), pp. 586-606.

FEDERAL RESERVE BANK OF ST. LOUIS REVIEW

Commentary
Jon Faust

The Economic Policy conference at the Federal Reserve Bank of St. Louis has for several decades been one of the premier monetary policy conferences worldwide, and it is a great privilege to participate in this conference focusing on the measurement and forecasting of potential growth. I am
particularly pleased to be discussing the paper
by Christophe Cahn and Arthur Saint-Guilhem
(2009), which is a beautiful example of a broad
class of work that explores how traditional economic concepts and measures relate to similar
concepts in the context of dynamic stochastic
general equilibrium (DSGE) models that are rapidly coming into the policy process. This class
of work is vitally important if policymakers are
to meld successfully traditional methods and
wisdom with the new models to improve the
policy process. I will mainly attempt to explain
this class of work, why it is important, and some
techniques for improving it. While my points
are fairly generic, the Cahn–Saint-Guilhem paper
provides an excellent case study for illustrating
the key issues.

DSGE MODELS AND A NEW
CLASS OF RESEARCH
Around 1980, Lucas, Sims, and others issued
devastating critiques of existing monetary policy
models. One basis for these critiques was the claim
that the existing methods were substantially ad hoc

relative to the ideal to which the profession should
aspire. While this critique was undeniably valid,
the absence of better-founded alternatives meant
that more-or-less traditional ad hoc approaches
continued to be used and refined at central banks
for the next 25 years or so. Meanwhile, the profession did the basic research required to create
models with sounder foundations.
In the past few years, DSGE models have
advanced to the point that they are coming into
widespread use at central banks around the world.
These models are still rife with ad hoc elements,
but there is no doubt that there has been an order
of magnitude advance in the interpretability of
the predictions of the model in terms of well-articulated economic theory.
There is still considerable disagreement, however, over the degree to which the new models
should supplant the traditional methods. I do not
want to argue this point. Rather, I want to assert
that these models have at least advanced to the
point that they constitute interesting laboratories
in which to explore various claims and principles
that are important in the policy process. My focus
is on how the models can best play this role.
Consider an analogy to medical research. In
attempting to understand the toxicology of drugs
in humans, we often use animal models. That is,
we check if the drug kills the rat before we give
it to humans. In any given pharmacological context, there is generally substantial disagreement
on how literally we should take the model when
extrapolating the results to humans. Despite this disagreement, there is a broad consensus that the rat model is extremely valuable in formulating policy.

Jon Faust is the Louis J. Maccini Professor of Economics at Johns Hopkins University.

Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 241-46.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.
Similarly, I think we should all agree that
DSGE models have at least attained something
akin to rat, or at least fruit fly, status. Under this
agreement, a wide range of work becomes valuable
and important. In particular, I think we should
aggressively explore basic macroeconomic propositions, treating these model economies as interesting economic organisms.
Although I am not sure the authors view it
this way, the Cahn–Saint-Guilhem paper can be
viewed from this perspective. Some notion of potential output is often at the center of policy discussions. One traditional measure of potential is based
on the production function approach (PFA) as
clearly described in the paper. Analysis of optimal
policy in DSGE models suggests that for some
purposes we should focus on a concept of potential as measured by the efficient level of output,
known as flexible price output (FPO). FPO potential measures what output would be if certain
distortions were not present.1
If we are to smoothly and coherently bring
DSGE models and the associated measures into
the policy process, it is important to know how
PFA and FPO potential relate in the real world.
One very useful step in this process, I argue, is
exploring how both concepts operate in the simpler context of the DSGE model. That is, first
understand the concepts as fully as possible in
the rat before moving to the human case.
This type of work is relatively straightforward
conceptually. Broadly, we must specify how to
compute a model-based analog of both PFA and
FPO potential. Then we simulate a zillion samples
from the model, calculate both measures on each,
and then summarize apparent similarities and dissimilarities.2 For example, we might ask whether
our traditional interpretation of PFA potential is
correct in the context of the DSGE model.
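The simulate-and-compare recipe can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' actual procedure: the two-shock toy economy and the functions `simulate_economy`, `pfa_potential`, and `fpo_potential` are hypothetical stand-ins for an estimated DSGE model and the two potential-output measures discussed in the paper.

```python
import math
import random
import statistics

def corr(x, y):
    """Pearson correlation of two equal-length series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_economy(T=200, rho_a=0.95, rho_d=0.7, seed=0):
    """Toy log-linear economy: observed (log) output y_t is the sum of a
    persistent technology component a_t and a transitory demand-type
    disturbance d_t.  Stands in for samples drawn from a DSGE model."""
    rng = random.Random(seed)
    a = d = 0.0
    y, tech = [], []
    for _ in range(T):
        a = rho_a * a + rng.gauss(0.0, 0.01)  # technology (supply side)
        d = rho_d * d + rng.gauss(0.0, 0.01)  # transitory demand-type shock
        tech.append(a)
        y.append(a + d)
    return y, tech

def pfa_potential(y, window=8):
    """'PFA-style' potential: a smooth trend of observed output, standing
    in for a production-function trend built from inputs and trend TFP."""
    return [statistics.fmean(y[max(0, t - window + 1):t + 1])
            for t in range(len(y))]

def fpo_potential(tech):
    """'FPO-style' potential: output with the demand distortion absent,
    which in this toy economy is just the technology component."""
    return list(tech)

def compare(n_reps=50):
    """Simulate many samples, compute both gap measures on each sample,
    and summarize similarity by the average correlation of the two gaps."""
    corrs = []
    for rep in range(n_reps):
        y, tech = simulate_economy(seed=rep)
        gap_pfa = [yi - pi for yi, pi in zip(y, pfa_potential(y))]
        gap_fpo = [yi - fi for yi, fi in zip(y, fpo_potential(tech))]
        corrs.append(corr(gap_pfa, gap_fpo))
    return statistics.fmean(corrs)
```

In a real exercise the simulation step would draw from the estimated DSGE model itself, but the logic — one model-based analog per measure, many replications, then a summary of agreement — is the same.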
1. Of course, many important issues remain in assessing this counterfactual, but these issues will not be important in this discussion.

2. I add one important conceptual step in the discussion below.

The paper focuses on a particular question of this type. Movements in PFA potential are, in practice, often attributed to medium-term
“structural features” of the economy as opposed
to transitory demand or supply features. Is the
interpretation warranted in the DSGE model?
The paper finds (see their Table 3) that it is not.
That is, a large portion of the variance of PFA
potential is attributable to factors we would not
usually consider “structural” in the sense this
term is used in these discussions. FPO potential
looks more structural in this regard. The paper
elaborates this key result in a number of useful
ways. What I want to discuss, however, is what
we should make of this general class of work and
how we can make it better.
Let me note that this sort of work is multiplying. For example, I have been involved in a long
line of work regarding the reliability of structural
inferences based on long-run identifying restrictions in vector autoregressions (Faust and Leeper,
1997). At a presentation of my work with Eric
Leeper in the early 1990s, Bob King asked why I
did not assess the practical importance of the
points using a DSGE model. I did not see the full
merits of this at that time, but Erceg, Guerrieri,
and Gust (2005) and Chari, Kehoe, and McGrattan
(2008) have now taken up this suggestion (illustrating far more points than were raised in the Faust-Leeper work) and considerably advanced the debate.
I would go so far as to argue that this sort of
analysis should be considered a necessary component of best practice. That is, if anyone proposes
a macroeconomic claim or advocates an econometric technique that is well defined in the new
class of DSGE models, assessing the merits of the
claim in the DSGE context should be mandatory.
If it is coherent to apply the idea in the rat, we
should do so before advocating its use in humans.
The work I am advocating cannot, however,
be seen as part of some necessary or sufficient
conditions for drawing reliable conclusions about
reality. The mere fact that a particular claim is
warranted in the DSGE model is neither necessary nor sufficient for the claim to be useful in
practice. Similarly, the mere fact that a drug does
not kill the rat is neither necessary nor sufficient
for the drug’s safety in humans. Just as judgment
is required to draw lessons from animal studies,
judgment will be required to draw lessons from
DSGE studies. I believe that the results can be
valuable nonetheless. In the remainder of the
discussion, I highlight three points that, I believe,
can make work in this spirit much more useful.

DOING IT BETTER
Don’t Confuse the Rat with the Human
In animal studies, there is very rarely any
confusion about when the authors are talking
about the rats and when they are talking about
the humans. The core of the research paper rigorously assesses some feature of the toxicology in
rats and is clearly about the rat. Whatever one
believes about the usefulness of the rat model,
the point of the body of the paper is to support
claims about the rat. This portion of the paper
can be rigorously assessed without getting into
unresolved issues about the ultimate adequacy
of the rat model.
After settling issues about the rat, there is an
active discussion about how the rat model results
should be extrapolated to the human context.3
This process is illustrated in the conclusions of a
joint working group of the U.S. Environmental
Protection Agency and Health Canada regarding
the human relevance of animal studies of tumor
formation (Cohen et al., 2004). They summarized
their proposed framework for policy in the following four steps:
(i) Is the weight of evidence sufficient to establish the mode of action (MOA) in animals?
(ii) Are key events in the animal MOA plausible in humans?
(iii) Taking into account kinetic and dynamic factors, is the animal MOA plausible in humans?
(iv) Conclusion: Statement of confidence, analysis, and implications. (p. 182)
In the first step, we clarify the result in the model. The remaining steps involve asking serious questions about whether the transmission mechanisms in the model—to borrow a monetary policy term—plausibly operate similarly in the relevant reality.

3. See Faust (2009) for a more complete discussion of this issue.
In contrast, it is customary in macroeconomics
to discuss quantities computed in the context of
a DSGE model in a way that leaves it ambiguous
at best whether the authors are advocating (or
hoping) that we take them as statements about
reality. I suspect that researchers arrive at the
practice of treating statements about the model
and reality as more-or-less equivalent under the
rubric of “taking the model seriously.” This seems
to presume that the best way to take the model
seriously is to take it literally. In toxicology, there
is no doubt that policymakers take animal models
seriously, but this never seems to require equating
rats and humans. In my view, we should not confuse rats and humans; neither should we confuse
DSGE models and reality.

Conceptual Clarity Before Computation
Broadly speaking, the point of the Cahn–
Saint-Guilhem paper is to compare and contrast
the behavior of two measures of potential output
using a computational exercise on a DSGE model.
Because it is so conceptually simple to implement
computational experiments of the sort described
above, it is very tempting to jump straight to the
computer. I think work of this type would be better
clarified by starting with careful conceptual analysis of the measures before computation. We can
clearly lay out the expected differences and then
many aspects of the computational work become
exercises in measuring the empirical magnitude
of effects that have been clearly defined.
I think this is particularly important in the
macro profession where we seem to have a penchant for reusing labels for concepts that are quite
distinct. “PFA potential” and “FPO potential”
illustrate this point. “PFA potential” is meant to
measure the level of output that would be attained
if the current capital stock were used at some
notion of “full capacity.” “FPO potential” is,
roughly speaking, the level of output that would
be obtained if inputs were used efficiently as
opposed to fully. It is clear that these two concepts
of potential need not even be closely correlated.
In any model in which the efficient level of, say,
labor fluctuates considerably around the full
employment level of labor, the two measures may
be quite different. Clearly laying out the conceptual differences can be an incredibly enlightening
step in what ultimately becomes a computational
exercise.
One minor critique of the Cahn–Saint-Guilhem
paper in this regard is that the work refers to FPO
potential as simply the DSGE measure. There are
many concepts of “potential” that might be useful
for different questions in a DSGE model, and
indeed we can discuss many versions of FPO
potential, depending on how we implement the
counterfactual regarding “if prices were flexible.”
Specific labels and careful analysis of the associated concepts can be very helpful.
Used properly, the sort of computational exercises with DSGE models that I am advocating
can be an important tool for clarifying important
conceptual issues. It may, at times, be tempting
to simply substitute the relatively straightforward
computational step for the sometimes painful
step of careful conceptual analysis. Giving in to
this temptation would be to miss an important
opportunity.

Better Lab Technique
While the computational exercises I am advocating are conceptually straightforward, there are
myriad subtle issues that fall under the umbrella
of “lab technique.” The new DSGE models are
complicated and not fully understood. The
Bayesian techniques being developed to analyze
these models are also complicated and not fully
understood. What we know from experience to
date with DSGE models, and with similar tools
applied in other areas, is that we can very easily
create misleading results. For example, Sims
(2003) has discussed such issues at length.
Much of the profession has long experience
with the use of frequentist statistics and has
become familiar with the myriad ways that one
might inadvertently mislead. We need to be mindful of the fact that the profession is very new at
assessing the adequacy of the new DSGE models
using Bayesian techniques.
John Geweke (2005, 2007) has been at the
forefront in developing flexible Bayesian tools
for assessing model adequacy in the context of
models that are known to be incomplete or imperfect descriptions of the target of the modeling.
Abhishek Gupta (a Johns Hopkins graduate student) and I have recently been exploring these
methods as they apply to DSGE models intended
for policy analysis (Faust 2008, 2009; Faust and
Gupta, 2009; and Gupta, 2009). I present just a
flavor of one result with possible bearing on the
topic of the Cahn–Saint-Guilhem analysis. The
example is from Faust (2009), which reports
results for the RAMSES model, a version of which
is used by the Swedish Riksbank in its policy
process.4
The simplest form of the idea is to take some
feature of the data that is well defined outside the
context of any particular macroeconomic model
and about which we may have some prior beliefs.
In the simplest form, we simply check whether
the formal prior (which is largely arbitrary in
current DSGE work) corresponds to our actual
prior regarding this feature. Further, we check
how both the formal prior and posterior compare
with the data. A somewhat subtler version of this
analysis instead considers prior and posterior
predictive results for these features of interest.
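Such a prior predictive check can be sketched concretely. Everything below is a deliberately simplified stand-in, not RAMSES or its actual prior: `draw_prior`, `simulate_statistic`, and the two-parameter toy "model" are illustrative assumptions chosen only to reproduce the flavor of the exercise — drawing structural parameters from a prior, simulating data under each draw, and asking where the implied distribution of the feature of interest puts its mass relative to the sample value.

```python
import math
import random
import statistics

def corr(x, y):
    """Pearson correlation of two equal-length series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

def draw_prior(rng):
    """One draw from a purely illustrative prior over two 'structural'
    parameters: a risk-aversion-type coefficient and rate persistence."""
    return {
        "sigma": abs(rng.gauss(2.0, 0.5)) + 0.1,
        "rho_r": min(max(rng.gauss(0.8, 0.1), 0.0), 0.99),
    }

def simulate_statistic(params, T=150, rng=None):
    """Simulate the toy model under `params` and return the feature of
    interest: corr(consumption growth, short rate).  The stylized saving
    channel here makes consumption growth fall when the rate is high."""
    rng = rng or random.Random(0)
    r, rates, dc = 0.0, [], []
    for _ in range(T):
        r = params["rho_r"] * r + rng.gauss(0.0, 0.01)
        dc.append(-r / params["sigma"] + rng.gauss(0.0, 0.01))
        rates.append(r)
    return corr(dc, rates)

def prior_predictive(n_draws=200, seed=1):
    """Prior predictive distribution of the statistic: draw parameters
    from the prior and simulate one sample per draw."""
    rng = random.Random(seed)
    return [simulate_statistic(draw_prior(rng), rng=rng)
            for _ in range(n_draws)]

# How much prior mass sits on strongly negative correlations, to be set
# against the (roughly zero) sample value discussed in the text?
draws = prior_predictive()
share_strongly_negative = sum(d < -0.25 for d in draws) / len(draws)
```

The same recipe applies to the posterior: replace `draw_prior` with draws from the posterior sampler and compare the resulting predictive distribution of the statistic with its sample value.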
As an example, consider the correlation
between consumption growth and the short-term,
nominally risk-free interest rate. Much evidence
suggests that there is not a strong relation between
short-term fluctuations in short-term rates and
consumption growth. The upper panel of Figure 1
shows this marginal prior density implied by the
prior over the structural parameters used in estimating RAMSES. The prior puts almost all mass
on a fairly strong negative correlation, with the
mode larger in magnitude than –0.5. The vertical
line gives the value on the estimation sample of
approximately zero. In short, the prior used in
the analysis strongly favors the mechanism that
higher interest rates raise saving and lower consumption. In the posterior (bottom panel), the
mass is moved toward a negative correlation that
is a bit smaller in magnitude, but the sample value
is actually farther into the tail than it was in the
prior.
4. The developers of RAMSES (Riksbank Aggregated Macromodel for Studies of the Economy of Sweden) were exceedingly generous in helping me conduct this work.

Figure 1
Prior and Posterior Densities

[Figure omitted: two density plots over correlations from −1 to 1. Upper panel: prior and sample value. Lower panel: posterior and sample value.]

NOTE: The figure shows the prior (upper panel) and posterior (lower panel) densities along with the sample value for the contemporaneous correlation between the short-term interest rate and quarterly consumption growth in a version of the RAMSES model.

SOURCE: Author’s calculations using computer code provided by Riksbank.

This result and related ones in the work cited
above convince me that current DSGE models
continue to have difficulty matching basic patterns
in consumption and investment as mediated by
the interest rate. If we were to use this model in
policy, we might want to ask whether this is one
feature—like differences between rats and
humans—that we should explicitly adjust for in
moving from model results to reality.
Of course, the forces driving short-run fluctuations in consumption are at the very center of
the distinction between PFA potential and FPO
potential. These results and others like them convince me that while the DSGE model provides an
interesting lab, there is good reason to question
how literal we should be in extrapolating these
results to the real economy.
A more general lesson is that the methods
just sketched can be applied to any data feature,
including statistics like those reported, for example, in Table 4 in the Cahn–Saint-Guilhem paper.
These techniques allow one to coherently take
estimation and model uncertainty into account
and to evaluate the importance of arbitrary aspects
in the formal prior. I strongly urge the authors to
move in this direction.
CONCLUSION
I commend the St. Louis Fed for holding a
conference on this issue that is vital to the monetary policymaking process, and I commend
Christophe and Arthur for their interesting work
illuminating how two competing measures of
potential output behave in the context of modern
DSGE models. This line of work is extremely
important. I have made three suggestions that I
believe would improve any work of this type. I
hope that these suggestions contribute to making
work of this sort even more influential.

REFERENCES

Cahn, Christophe and Saint-Guilhem, Arthur. “Issues on Potential Growth Measurement and Comparison: How Structural Is the Production Function Approach?” Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 221-40.

Chari, V.V.; Kehoe, Patrick J. and McGrattan, Ellen R. “Are Structural VARs with Long-Run Restrictions Useful in Developing Business Cycle Theory?” NBER Working Paper No. 14430, National Bureau of Economic Research, October 2008; www.nber.org/papers/w14430.pdf.

Cohen, Samuel M.; Klaunig, James; Meek, M. Elizabeth; Hill, Richard N.; Pastoor, Timothy; Lehman-McKeeman, Lois; Bucher, John; Longfellow, David G.; Seed, Jennifer; Dellarco, Vicki; Fenner-Crisp, Penelope and Patton, Dorothy. “Evaluating the Human Relevance of Chemically Induced Animal Tumors.” Toxicological Sciences, April 2004, 78(2), pp. 181-86.

Erceg, Christopher J.; Guerrieri, Luca and Gust, Christopher. “Can Long-Run Restrictions Identify Technology Shocks?” Journal of the European Economic Association, December 2005, 3(6), pp. 1237-78.

Faust, Jon. “DSGE Models in a Second-Best World of Policy Analysis.” Unpublished manuscript, Johns Hopkins University, 2008; http://e105.org/faustj/download/opolAll.pdf.

Faust, Jon. “The New Macro Models: Washing Our Hands and Watching for Icebergs.” Economic Review, March 23, 2009, 1, pp. 45-68.

Faust, Jon and Gupta, Abhishek. “Bayesian Evaluation of Incomplete DSGE Models.” Unpublished manuscript, Johns Hopkins University, 2009.

Faust, Jon and Leeper, Eric M. “When Do Long-Run Identifying Restrictions Give Reliable Results?” Journal of Business and Economic Statistics, July 1997, 15(3), pp. 345-53.

Geweke, John. Contemporary Bayesian Econometrics and Statistics. Hoboken, NJ: Wiley, 2005.

Geweke, John. “Bayesian Model Comparison and Validation.” Unpublished manuscript, University of Iowa, 2007; www.aeaweb.org/annual_mtg_papers/2007/0105_0800_0403.pdf.

Gupta, Abhishek. “A Forecasting Metric for DSGE Models.” Unpublished manuscript, Johns Hopkins University, 2009.

Sims, Chris. “Probability Models for Monetary Policy Decisions.” Unpublished manuscript, Princeton University, 2003.


Parsing Shocks: Real-Time Revisions to
Gap and Growth Projections for Canada
Russell Barnett, Sharon Kozicki, and Christopher Petrinec
The output gap—the deviation of output from potential output—has played an important role in
the conduct of monetary policy in Canada. This paper reviews the Bank of Canada’s definition of
potential output, as well as the use of the output gap in monetary policy. Using a real-time staff
economic projection dataset from 1994 through 2005, a period during which the staff used the
Quarterly Projection Model to construct economic projections, the authors investigate the relationship between shocks (data revisions or real-time projection errors) and revisions to projections of
key macroeconomic variables. Of particular interest are the interactions between shocks to real gross
domestic product (GDP) and inflation and revisions to the level of potential output, potential
growth, the output gap, and real GDP growth. (JEL C53, E32, E37, E52, E58)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 247-65.

Potential output is an important economic concept underlying the design
of sustainable economic policies and
decisionmaking in forward-looking
environments. Stabilization policy is designed to
minimize economic variation around potential
output. Estimates of potential output may be used
to obtain cyclically adjusted estimates of fiscal
budget balances; projections of potential output
may indicate trend demand for use in investment
planning or trend tax revenues for use in fiscal
planning; and potential output provides a measure of production capacity for assessing wage or
inflation pressures.
Although potential output is an important
economic concept, it is not observable. The Bank
of Canada defines “potential output” as the sustainable level of goods and services that the economy can produce without adding to or subtracting
from inflationary pressures. This definition is
intrinsic to the methodology used by the Bank of
Canada to construct historical estimates of poten-

tial output. In addition to using a production
function to guide estimation of long-run trends
influencing the supply side of the economy, the
procedure incorporates information on the demand
side that relates inflationary and disinflationary
pressures to, respectively, situations where output exceeds and falls short of potential output.
Potential output and the “output gap,” defined
as the deviation of output from potential output,
play central roles in monetary policy decisionmaking and communications at the Bank of Canada.
Macklem (2002) describes the information and
analysis presented to the Bank’s Governing
Council in the two to three weeks preceding a
fixed announcement date.1 As described in that
document, the output gap—both its level and
rate of change—is the central aggregate-demand
1. In late 2000, the Bank of Canada adopted a system of eight preannounced dates per year when it may adjust its policy rate—the target for the overnight rate of interest. The Bank retains the option of taking action between fixed dates in extraordinary circumstances.

Russell Barnett was a principal researcher at the time of preparation of this article, Sharon Kozicki is a deputy chief, and Christopher Petrinec is a research assistant at the Bank of Canada.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, the regional Federal Reserve Banks, or the Bank of Canada. Articles may be
reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation
are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank
of St. Louis.


link between the policy actions and inflation
responses.2
In addition to being central to policy deliberations, the output gap has played a critical role
in Bank of Canada communications. The concept
of the output gap is simple to explain and understand. It has been used effectively to simultaneously provide a concise and intuitive view of the
current state of the economy and inflationary
pressures. It also provides a point of reference in
relation to current policy actions and helps align
the Bank’s current thoughts on the economy with
those held by the public.
Use of the output gap as a key communications
device with the public is clearly seen in Monetary
Policy Reports (MPRs) and speeches by governors
and deputy governors of the Bank. The Bank of
Canada began publishing MPRs semiannually in
May 1995 (with two additional Monetary Policy
Report Updates per year starting in 2000), and the
output gap has been prominent in the reports from
the beginning.3 Indeed, a Technical Box appears
in the first MPR regarding the strategy used by the
Bank to estimate potential output.4 Not only is
the Bank’s estimate of the output gap referenced
in the text of the MPR as a source of inflationary
(or disinflationary) pressure in the economy, but
the estimates of recent history of the output gap
up to the current quarter are also charted.
Governors and deputy governors have extensively used the output gap to explain to the general public how the monetary policy framework works. Common elements across these speeches include discussions on how potential output is estimated, how it is used to construct the output gap, and how the output gap affects monetary policy decisions. These discussions are nontechnical to enhance understanding by noneconomists. For instance, when discussing the factors affecting potential output in a speech to the Standing Senate Committee on Banking, Trade and Commerce in 2001, Governor David Dodge stated:

[T]he level of potential rises over time as more workers join the labour force; businesses increase their investments in new technology, machinery and equipment; policy measures are taken to make product and labour markets more flexible; and all of us become more efficient and productive in what we do.

One important challenge associated with the use of potential output and the output gap as tools for communication of monetary policy decisions is that they cannot be directly observed and must be estimated. Moreover, estimates are prone to revision as historical data are revised and new information becomes available. Consequently, the Bank has directly addressed uncertainty surrounding estimates of the output gap and the drivers behind revisions in policy communications. A discussion of the implications of uncertainty for the conduct of monetary policy appeared in the May 1999 MPR (Bank of Canada, 1999, p. 26):

[P]oint estimates of the level of potential output and of the output gap should be viewed cautiously. This has particular significance when the output gap is believed to be narrow and when inflation expectations are adequately anchored. In this situation, to keep inflation in the target range, policy-makers may have more success by placing greater weight on the economy’s inflation performance relative to expectations and less on the point estimate of the output gap.

At about the same time, the Bank started providing standard error bands around recent estimates of the output gap.5

2. The important role of the output gap as a guide to monetary policymakers, over and above that of growth, was expressed by Governor Thiessen (1997):

Some people apparently assume that it is the speed at which the economy is growing that determines whether inflationary pressures will increase or decrease. While the rate of the growth is not irrelevant, what really matters is the level of economic activity relative to the production capacity of the economy—in other words…the output gap in the economy. The size of the output gap, interacting with inflation expectations, is the principal force behind increased or decreased inflationary pressure.

3. By contrast, incorporation of Governing Council projections has been more recent, with projections of core inflation first appearing in the April 2003 MPR and projections of gross domestic product (GDP) growth first appearing in the July 2005 MPR.

4. The material in this box (May 1995) gives readers an idea of how the output gap is constructed without being overly technical. Publishing such statistics and the methods underlying their estimation has contributed importantly to monetary policy transparency in Canada.

5. Standard error bands were provided around recent estimates from 1998 to 2007.


Revisions to historical estimates of potential
output and the output gap also have been explicitly discussed in MPRs.6 The discussions relate
the revisions to recent developments in wage and
price inflation and revised assessments of trends
in labor input and labor productivity. Overall,
transparency in the construction of the output
gap, in understanding sources of revisions to past
estimates of the output gap, and in uncertainty
around the output gap has contributed to the effectiveness of the output gap as a key communications tool for enhancing understanding of the
monetary policy process and of policy decisions
in real time.
Implicit in the policy use of potential output
and the output gap has been an effective strategy
for managing volatility in estimates of the output
gap. In particular, given the central role of potential output and the output gap in monetary policy,
volatility in time series of the output gap or in
revisions to estimates of the output gap can hinder
the effectiveness of monetary policy communications, and therefore of monetary policy itself.
The next section reviews the methodology
used by the Bank of Canada to estimate potential
output and the output gap in Canada. While the
methodology was designed to be consistent with
the economic structure of the model used by Bank
of Canada staff to construct projections, the
Quarterly Projection Model (QPM), the procedure
is designed to also incorporate information outside the scope of the model, such as demographics
and structural details related to the labor market.7
Features designed to contain end-of-sample revisions to estimates in response to updates of underlying economic data and to the availability of
additional observations are discussed. This paper
examines the extent to which such concerns were
addressed by the methodologies developed to
estimate and project potential output and the
output gap in real time.
6. See, for instance, Technical Box 3 in Bank of Canada (2000).

7. The QPM was used for economic projections between September 1993 and December 2005. Although there have been marginal changes in the procedure used to estimate the output gap over time, at the time of writing, the Bank continued to use basically the same methodology to generate its "conventional" estimate of the output gap in Canada.


We next describe a dataset on real-time revisions to economic data and projections that has
been constructed from a historical database of
real-time economic projections made by Bank of
Canada staff. The properties of these real-time
revisions are explored in the subsequent section. While the main focus of the analysis is
the parsing of economic shocks into revisions to
projections of (i) the level of potential output,
(ii) the output gap, (iii) real GDP growth, and (iv)
potential growth, the response of projections of
inflation and short-term interest rates to shocks
is also examined.

POTENTIAL OUTPUT IN CANADA
This section describes the techniques used by
Bank of Canada staff to estimate historical values
and project future values of potential output in
Canada. In real time, Bank staff make ongoing
marginal changes to the estimation methodology.
Consequently, the description in this section
should be taken only as broadly indicative of the
procedures followed and the inputs to the estimation exercise.
A unifying assumption underlying both historical estimates and projections of potential output
is that aggregate production can be represented
by a Cobb-Douglas production function:
(1)  Y = (TFP)·N^a·K^(1−a),

where Y is output, N is labor input, K is the aggregate capital stock, TFP is the level of total factor
productivity, and a is the labor-output elasticity
(or labor’s share of income). This production
function also was used in the now-discontinued
model QPM to describe the supply side of the
Canadian economy.
The next subsection describes the process by
which historical estimates of potential output
were estimated, while the following section
focuses on assumptions underlying projections
of potential output.

Historical Estimates of Potential Output
The methodology used to estimate potential
output was heavily influenced by the requirements

of the monetary policy framework in which it was
to be used. Thus, it was judged that the methodology should be consistent both with the QPM
and the requirements associated with using the
model to prepare economic projections. In this
context, Butler (1996) notes that the following
properties were judged to be of prime concern:
consistency with the economic model (QPM);
the ability to incorporate judgment in a flexible
manner; the ability to both reduce and quantify
uncertainty about the current level of potential
output; and robustness to a variety of specifications of the trend component. In addition, given
concerns about the feasibility and efficiency of
estimates of potential output based solely on a
model of the supply side of the economy, use of
information from a variety of sources to better
disentangle supply and demand shocks was
deemed desirable.
With these guiding principles in mind, in the
1990s researchers at the Bank of Canada developed
a new methodology to estimate potential output
based on a multivariate filter that incorporates
economic structure, as well as econometric techniques designed to isolate particular aspects of
the data.8 The main innovation was the development of a filter, known as the extended multivariate filter (EMVF), that solves a minimization
problem similar to that underlying the Hodrick-Prescott (HP) filter (Hodrick and Prescott, 1997),
but the EMVF also incorporates information on
economic structure and includes modifications
to penalize large revisions and excess sensitivity
to observations near the end of the sample. For a
variable or vector of variables, x, the general filter
estimates the trend(s), x*, as follows:

(2)  x* = max_x̂ {−(x − x̂)′Wx(x − x̂) − λ x̂′D′D x̂ − ε′Wε ε − (s − x̂)′Ws(s − x̂) − x̂′P′Wg P x̂ − (x*_pr − x̂)′Wpr(x*_pr − x̂)}.

8. See the discussion in Laxton and Tetlow (1992), Butler (1996), and St-Amant and van Norden (1997).

This filter nests the HP filter, which is clearly evident for univariate x, by setting Wε, Ws, Wpr, and Wg to zero, leaving only

{−(x − x̂)′Wx(x − x̂) − λ x̂′D′D x̂}.

Information on economic structure and judgment can be introduced through the two terms

{−ε′Wε ε − (s − x̂)′Ws(s − x̂)}.

The term ε′Wε ε is the main channel through which information on the demand side of the economy may be introduced to assist in better separating demand shocks and supply shocks. In general, ε represents residuals from key economic relationships that depend on x̂. For instance, if the unobserved trend to be estimated is the nonaccelerating inflation rate of unemployment (NAIRU), ε may contain residuals from a Phillips curve that relate inflation developments to deviations of the unemployment rate from the NAIRU. In this sense, residuals may be interpreted as deviations from a structural economic relationship, perhaps drawing on cyclical economic relationships in the QPM. With this term in the filter, the estimate of the trend may be shifted to reduce such deviations from the embedded economic theory.
Additional external structural information on trends may be introduced through the term (s − x̂)′Ws(s − x̂). In this expression, s generally represents an estimate of the trend based on information outside the general scope of the model. For instance, in the case of the trend participation rate, s may be based on external analysis including information on demographics and otherwise informed judgment.
Finally, the last two terms,

{−x̂′P′Wg P x̂ − (x*_pr − x̂)′Wpr(x*_pr − x̂)},

provide a means to limit revisions to trend estimates. In general, procedures such as the EMVF
are subject to one-sided filtering asymmetries at
the ends of the sample. Although the filter is a
symmetric two-sided weighted moving average
within the sample period, near the end (and beginning) of the sample, filter weights become one-sided. Intuitively, weights that would have been
assigned to future observations if they were available are redistributed across recent observations.
As a consequence, trend estimates near the end of
the sample place large weights on recent data and
tend to be revised considerably as additional observations become available.9 The term x̂′P ′Wg Px̂
penalizes large end-of-sample changes in the
trend estimates and reduces the importance of the
last few observations for the end-of-sample estimate of the trend. The term (x*_pr − x̂)′Wpr(x*_pr − x̂) penalizes revisions to trend estimates between
two successive projection exercises attributable
to any source. In the absence of such a penalty,
trend estimates could be revised more than is
judged desirable due to (i) revisions to historical
data, (ii) the availability of data for an additional
quarter, or (iii) changes to external information
or judgment as summarized by s.10
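Because every penalty in the filter is quadratic in x̂, the maximizer solves a linear system; with Wε, Ws, Wpr, and Wg set to zero this collapses to the HP case, (Wx + λD′D)x̂ = Wx·x. A minimal numerical sketch of that special case (the series and weights are illustrative, not Bank of Canada inputs):

```python
import numpy as np

def hp_trend(x, lam=1600.0, Wx=None):
    """Trend from max_xhat -(x - xhat)'Wx(x - xhat) - lam*xhat'D'D xhat.

    D is the (T-2) x T second-difference operator, so the first-order
    condition is (Wx + lam*D'D) xhat = Wx x, the HP filter when Wx = I.
    """
    T = len(x)
    if Wx is None:
        Wx = np.eye(T)
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]   # second difference
    return np.linalg.solve(Wx + lam * (D.T @ D), Wx @ x)

# Illustrative series: a smooth trend plus a 20-quarter cycle
t = np.arange(80)
x = 0.02 * t + 0.5 * np.sin(2 * np.pi * t / 20)
xhat = hp_trend(x, lam=1600.0)

# The filtered series is much smoother than the data
rough = lambda z: np.sum(np.diff(z, 2) ** 2)
print(rough(xhat) < 0.05 * rough(x))   # True
```

With λ = 1600 the period-20 cycle is almost entirely assigned to the gap, which is why that value is described below as excluding "typical" business cycle frequencies from trend estimates.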
In many ways, the methodology of the EMVF
was at the leading edge of research contributions
in this area. For instance, although the methodology tends to be applied to estimate the trend in
a single trending variable at a time, the theory is
sufficiently general to include joint estimation of
multiple trends, including situations with common trend restrictions. Stock and Watson (1988)
developed a common trends representation for a
cointegrated system, and state-space models as
outlined in Harvey (1989) could also accommodate common trend restrictions. However, within
the context of filters such as the HP filter, the bandpass filter of Baxter and King (1999), or the exponential-smoothing filter used by King and Rebelo
(1993), imposition of common trend restrictions
was not explored elsewhere in the academic literature until Kozicki (1999).
9. Orphanides and van Norden (2002) show that revisions associated with the availability of additional data tend to dominate those related to revisions to historical data.

10. An alternative possibility that was not explored was to penalize revisions of deviations from the trend rather than just revisions to the trend, by replacing the last term with the following: (x − x̂ − (x_pr − x*_pr))′Wpr(x − x̂ − (x_pr − x*_pr)). In the absence of data revisions (x_pr = x), all revisions to deviations would be due to revisions to the trend and both alternatives would yield the same results.
Another interesting aspect of the EMVF is
that the methodology proposed approaches to
reduce the importance of the one-sided filtering
problem well before it was addressed elsewhere
in the literature. Orphanides and van Norden
(2002) drew attention to the result that many
estimation methodologies yield large revisions to
real-time end-of-sample estimates of the output
gap. One potential approach to mitigating the onesided filtering problem was proposed by Mise,
Kim, and Newbold (2005).
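The one-sided filtering problem is easy to reproduce: filter a sample, append observations, refilter, and compare. A sketch with a plain HP-type filter and an illustrative series (no Wg or Wpr penalty, so the endpoint revision is unrestrained):

```python
import numpy as np

def hp_trend(x, lam=1600.0):
    # Solve (I + lam*D'D) xhat = x, with D the second-difference matrix
    T = len(x)
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    return np.linalg.solve(np.eye(T) + lam * (D.T @ D), x)

t = np.arange(120)
x = 0.02 * t + np.sin(2 * np.pi * t / 32)   # trend plus cycle, made up

early = hp_trend(x[:100])   # "real-time" estimate, sample ends at t = 99
late = hp_trend(x)          # estimate after 20 more observations arrive

# The estimate at the old sample endpoint is revised far more than a
# mid-sample estimate, where the filter weights were already two-sided
end_rev = abs(late[99] - early[99])
mid_rev = abs(late[30] - early[30])
print(end_rev > 5 * mid_rev)   # True
```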
As noted, an important characteristic of the
EMVF is its ability to incorporate, within a filtering environment designed to extract fluctuations
of targeted frequencies, information drawn from
structural economic relationships, information
from data sources external to the QPM, and judgment. The next few paragraphs provide details on
the economic structure incorporated in the EMVF
and the mechanism by which demographic information and structural features of the Canadian
labor market could influence estimates of potential output.
Estimates of potential output are based on
the Cobb-Douglas production function in equation (1). Recognizing that for the specification in
equation (1), the marginal product of labor is
∂Y/∂N = aY/N, the logarithm of output can be
represented as
(3)  y = n + µ − α,

where each term is expressed in logarithms and
n is labor input, µ is the marginal product of labor,
and α = log(a), where a is the labor-output elasticity (also labor's share of income). The decision to use µ in
constructions of historical estimates of potential
output rather than data on the capital stock was
motivated by concerns about the lack of timely
(or quarterly) data and measurement problems.
To construct log potential output, y*, trends in
log employment, n*, the log marginal product of
labor, µ*, and the log labor share of income, α*,
are estimated separately and then summed.
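Equation (3) is an identity of the Cobb-Douglas specification: since the marginal product of labor is aY/N, log output equals log labor input plus the log marginal product minus log a. A quick numerical check with made-up values:

```python
import numpy as np

# Illustrative values only
TFP, N, K, a = 1.3, 100.0, 400.0, 0.7

Y = TFP * N**a * K**(1.0 - a)   # equation (1)
mpl = a * Y / N                 # marginal product of labor, dY/dN

y, n, mu, alpha = np.log(Y), np.log(N), np.log(mpl), np.log(a)

# Equation (3): y = n + mu - alpha
print(np.isclose(y, n + mu - alpha))   # True
```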
One component of log potential output, trend
log employment, n*, is estimated using another
decomposition:
(4)  n* = Pop + p* + log(1 − u*),

where Pop is the logarithm of the working-age
population, p is the logarithm of the participation
rate, and u* is the NAIRU.11 As for aggregate output, trend employment is constructed as the sum
of the estimated trends of each component. The
trend participation rate, p*, is estimated with the
EMVF using an external estimate of the trend
participation rate for s, setting Wε, Wpr, and Wg
to zero. Around the time of Butler’s (1996) writing,
the smoothness parameter λ was set to a very high
number (λ = 16000) to obtain a very smooth estimate of the trend participation rate. However, the
value of this parameter has been adjusted considerably over time and more recently has been set
to λ = 1600, a value typically used to exclude
fluctuations of “typical” business cycle frequencies from trend estimates. The external estimate
of the trend participation rate accounts for demographic developments, including, for instance,
trends in the workforce participation rate of
women and school enrollment rates.12 The
NAIRU is also estimated using the EMVF, with
an external estimate of the trend unemployment
rate based on the work of Côté and Hostland (1996)
used for s, and residuals, ε, obtained from a price-unemployment Phillips curve drawing on the
work of Laxton, Rose, and Tetlow (1993). The
external estimate of the trend unemployment rate
incorporates information on structural features
of Canadian labor markets, including the proportion of the labor force that is unionized and payroll taxes.
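Because the external-trend term and the Phillips-curve residual term are both quadratic in the trend, a NAIRU estimate of this kind reduces to a stacked least-squares problem. A sketch under stated assumptions: the Phillips-curve slope, all weights, and the simulated data below are illustrative, not the Bank's actual inputs.

```python
import numpy as np

def emvf_nairu(u, dpi, s, beta=0.5, lam=1600.0, w_s=0.1, w_eps=1.0):
    """EMVF-style trend: HP smoothness plus an external trend estimate s and
    Phillips-curve residuals eps_t = dpi_t + beta*(u_t - nairu_t).

    Every penalty is quadratic in the trend, so the problem is solved as
    one stacked least squares.
    """
    T = len(u)
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    I = np.eye(T)
    # Rows: fit u; smoothness; pull toward s; shrink Phillips residuals
    A = np.vstack([I,
                   np.sqrt(lam) * D,
                   np.sqrt(w_s) * I,
                   np.sqrt(w_eps) * beta * I])
    b = np.concatenate([u,
                        np.zeros(T - 2),
                        np.sqrt(w_s) * s,
                        np.sqrt(w_eps) * (dpi + beta * u)])
    nairu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return nairu

# Simulated data: slowly drifting true NAIRU, noisy unemployment, and
# inflation changes driven by the unemployment gap
rng = np.random.default_rng(0)
T = 60
true_nairu = 7.0 + 0.01 * np.arange(T)
u = true_nairu + rng.normal(0, 0.5, T)
dpi = -0.5 * (u - true_nairu) + rng.normal(0, 0.2, T)
s = np.full(T, 7.0)   # stand-in for an external structural estimate

est = emvf_nairu(u, dpi, s)
print(float(np.max(np.abs(est - true_nairu))) < 1.0)   # True
```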
A second component of log potential output is the trend value of the log labor-output elasticity, α*. This component is estimated as the smooth
trend obtained by applying an HP filter with a
large smoothing parameter (λ = 10000) to data on
labor’s share of income.
The third component of log potential output,
the trend log marginal product of labor, µ*, is
also estimated by applying the EMVF. The real
producer wage is used for s rather than an external
11. Barnett (2007) provides recent estimates and projections of trend labor input using a cohort-based analysis that incorporates anticipated demographic changes. Barnett's analysis also accounts for trend movements in hours.

12. See Technical Box 2 of Bank of Canada (1996).
estimate of the trend, and ε is the residual from
an inflation/marginal product of labor relationship. The latter is motivated by the idea that the
deviation of the marginal product of labor from
its trend level can be interpreted as a factor utilization gap and, hence, provides an alternative
index of excess demand pressures.

Projecting Potential Output
Projections of potential output are based on
the Cobb-Douglas production function, equation
(1), but are driven by consideration of supply-side
features:
(5)  y* = tfp* + a*·n* + (1 − a*)·k,

where lower-case letters indicate the logarithm
of the respective capitalized notation and an
asterisk denotes that a variable is set to its trend
or equilibrium value.13 Thus, projections of
potential output are constructed with projections
of tfp*, a*, n*, and k.
The capital stock, k, is constructed from the
cumulated projected investment flows given the
actual capital stock at the start of the projection.
The equilibrium labor-output elasticity, a*, is set
to a constant equal to the historical average labor
share of income.
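The projection arithmetic in equation (5) can be sketched directly: cumulate projected investment onto the starting capital stock, hold a* at its historical average, and add the trend components. All numbers below are made up, and the depreciation rate is an assumption; the text says only that investment flows are cumulated.

```python
import numpy as np

# Illustrative inputs (not Bank of Canada data)
K0 = 400.0                                    # capital stock at start of projection
invest = np.array([20.0, 21.0, 22.0, 23.0])   # projected investment flows
delta = 0.025                                 # assumed quarterly depreciation rate
a_star = 0.7                                  # historical average labor share
tfp_star = np.log(1.3) + 0.002 * np.arange(1, 5)   # projected log trend TFP
n_star = np.log(100.0) + 0.001 * np.arange(1, 5)   # projected log trend labor input

# Cumulate investment onto the starting capital stock, then take logs
K, k = K0, []
for inv in invest:
    K = (1.0 - delta) * K + inv
    k.append(np.log(K))
k = np.array(k)

# Equation (5): y* = tfp* + a*·n* + (1 - a*)·k
y_star = tfp_star + a_star * n_star + (1.0 - a_star) * k
print(y_star.round(3))
```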
The typical assumption is that in the medium
to long term, trend total factor productivity, tfp*,
will converge toward the level of productivity of
the United States at the historical rate of convergence.14 A short-run path for tfp* links the historical estimate at the start of the projection to the
medium-term path for tfp*, with short-run behavior based on typical cyclical variation.
The equilibrium employment rate, n*, is
based on an analysis of population growth, labor
force participation, and structural effects on the
NAIRU (Bank of Canada, 1995). Analysis draws
on information outside the scope of the QPM.
For instance, labor force participation is related
to demographic factors (Bank of Canada, 1996);
population growth may be influenced by immigration policy; and the NAIRU may be related to structural factors.15 To a large extent, this series can be thought of as corresponding to an external structural estimate of s as used in the EMVF. Thus, projections are as if they are generated from an application of the EMVF with all weights other than Ws set to zero.

13. See the discussion in Butler (1996).

14. Crawford (2002) discusses determinants of trends in labor productivity growth in Canada.
Although numerous studies—including Butler
(1996), Guay and St-Amant (1996), St-Amant and
van Norden (1997), and Rennison (2003)—have
compared the properties of historical estimates of the output gap across alternative estimation approaches, no similar
studies exist to examine properties of projections
of potential output or the output gap. This is
one area to which the current study hopes to
contribute.

MEASURING SHOCKS AND
REVISIONS TO PROJECTIONS
The empirical analysis is designed to assess
the sensitivity of economic projections to new
information. If economic projections were “raw”
outputs from application of the QPM, then our
analysis would be merely recovering information
about the structure of the QPM, which is available
elsewhere.16 However, in general, economic projections are influenced by judgment to account
for features of the economy outside the scope of
the economic model. In addition, the QPM is
primarily a business cycle model, designed to
project deviations of economic variables from
their respective trend levels. Consequently, while
potential output and other trends are constructed
to be consistent with the economic structure of
the QPM, evolution of these trends is modeled
outside the QPM.
15. Poloz (1994) and Côté and Hostland (1996) discuss the effects of structural factors, such as demographics, unionization, and fiscal policies influencing unemployment insurance, the minimum wage, and payroll taxation, on the NAIRU. More information on demographic implications for labor force participation is provided by Ip (1998).

16. A nontechnical description of the QPM is provided in Poloz, Rose, and Tetlow (1994). Detailed information on the QPM is provided in the trio of Bank of Canada Technical Reports by Black et al. (1994), Armstrong et al. (1995), and Coletti et al. (1996).

Real-Time QPM Projections and Data
The analysis uses real-time data from the
Bank of Canada’s staff economic projection database. Bank staff generate projections quarterly to
inform the policy decisionmaking process. The
projection data analyzed in this project were
generated by the QPM. It is important to note
that the projections in these data correspond to
staff economic projections and may not be the
same as projections implicitly underlying policy
decisions, or, in later years, as published in the
MPR, as such projections would correspond to
the views of the Governing Council.
Analysis is limited to projection data for the
period September 1993 through December 2005,
the period during which the QPM was used by
Bank staff producing projections. By limiting
empirical analysis to data within this period, the
likelihood of structural breaks in projections
associated with large changes in the projection
model is small. An additional advantage of this
sample is that it falls entirely within the inflation-targeting regime in Canada, removing concerns
about structural breaks associated with changes
in policy regime.
The database includes a total of 50 vintages
of data, one vintage for each quarterly projection
exercise. As is standard in the real-time-data literature, the term “vintage” is used to refer to the
dataset corresponding to the data from a specific
projection. Vintages are described by the month
and year when the projection was made. Projections were generated four times per year, once per
quarter, in March, June, September, and December.
For each vintage, the database contains the history of the conditioning data as available at the
time of the projection, as well as the projections.
This database is used to construct measures
of shocks and projection revisions. Both shocks
and revisions are constructed as the difference
between values of economic variables (either historical observations or the projection of a specific
variable) for a given quarter as recorded in two
successive vintages of data. The term “revision”
is reserved to reflect a change in the projection
of a variable, whereas the term “shock” is used
to reflect the difference between a new or revised
J U LY / A U G U S T

2009

253

Barnett, Kozicki, Petrinec

observation for a variable and its value (either
an observation or a projection) as recorded in the
previous vintage of data. For each economic
variable, 2 sets of shocks series and 12 sets of
revisions series are constructed.
The timing of the publication of data is critical
to understanding the distinction between shocks
and revisions. In general, data for a full quarter,
t, are not published until the next quarter, t + 1.
Thus, for instance, in the month when Bank of
Canada staff were conducting a projection exercise, the values of variables recorded for the current quarter were “0-quarter-ahead” projections;
values for the next quarter were “1-quarter-ahead”
projections; and values for the prior quarter were
published data. Letting x_t^v denote the value of variable x for quarter t as recorded in vintage v of the dataset, x_t^v denotes a (t − v)-quarter-ahead projection for t ≥ v and is treated as an observation
of published data if t < v. The term “published”
is somewhat of a misnomer and is more appropriate for data on inflation, real GDP, and interest
rates, for instance, than for potential output, potential growth, or the output gap as the latter three
concepts are not directly observed, nor are they
measured or constructed by the statistical agency
of Canada, Statistics Canada. As discussed earlier,
values of these variables are estimated internally
by Bank of Canada staff. Nevertheless, for notational convenience and to facilitate parsimonious
exposition, language such as “observation,”
“data,” and “published” is used synonymously
in reference to all series according to the timing
convention previously described.
The term “shock” is generally used to refer
to marginal information from one vintage to the
next provided by new observations on market
interest rates, new or updated data produced by
Statistics Canada, or new or updated historical
estimates of potential output (and related series)
constructed by the Bank of Canada. Two measures of shocks are examined:

(6)  shock1_t = x_t^{t+1} − x_t^t

is the difference between the published value of variable x for quarter t as available in quarter t + 1 (the first quarter it is published) and the 0-quarter-ahead (or contemporaneous) projection of variable x as made in t and recorded in vintage v = t. Thus, shock1 is a projection error. The second measure of shocks captures the first quarterly update to the published data and is constructed as

(7)  shock2_t = x_{t−1}^{t+1} − x_{t−1}^t.

The term "revisions" is used to refer to changes in Bank of Canada staff projections of a variable between successive vintages. Twelve measures of revisions are examined, with each corresponding to a different projection horizon,

(8)  revision_{k,t} = x_{t+k+1}^{t+1} − x_{t+k+1}^t,

where k = 0,…,11.
The analysis in this article concentrates on shocks and revisions to nine variables as defined below:
• EXCH: the bilateral exchange rate between Canada and the United States, expressed as $US per $CDN;
• GAP: the output gap, defined as the percent deviation of real GDP from potential real GDP (potential output);
• GDP: real GDP growth (an annualized quarterly growth rate);
• GDPLEV: log-real GDP level, constructed as an index for a given quarter by taking GDPLEV for the prior quarter and adding (100/4)·log(1 + GDP/100) to it, with current-vintage data for a given quarter early in the sample used to initiate the recursive construction;17
• POT: potential output growth, calculated as the annualized one-period percent change in POTLEV;
• POTLEV: log potential output level, constructed as GDPLEV − GAP, an index;
• INF: CPI inflation (annualized quarterly growth rate);

17. During the period of analysis, Statistics Canada rebased GDP several times. From 1994 to 1996, the base year used for real GDP calculations was 1986. The base year changed to 1992 from 1996 to July 2001. From July 2001 to May 2007, the base year used was 1997. However, as GDPLEV and POTLEV were constructed as indices, these rebasings would not affect the analysis of this study.

• INFX: core CPI inflation (annualized quarterly growth rate). The definition of core
CPI has changed over our period of analysis.
Before May 2001 the Bank of Canada used
CPI excluding food, energy, and indirect
taxes (CPIxFET) as the measure of core inflation. After May 2001 the Bank changed its
official measure of core inflation to CPI excluding the eight most volatile components (CPIX); and
• R90: a nominal 90-day short-term interest
rate.
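The index arithmetic behind GDPLEV, POTLEV, and POT can be sketched as follows. Two assumptions are flagged: the /100 conversion takes the growth rates to be quoted in annualized percent, and POT is computed as four times the quarterly index change (annualized log growth in percent), one plausible reading of "annualized one-period percent change." Numbers are made up.

```python
import numpy as np

# Illustrative annualized quarterly GDP growth (percent) and output gap (percent)
gdp = np.array([3.0, 2.0, -1.0, 4.0])
gap = np.array([0.5, 0.2, -0.6, 0.1])

# GDPLEV: add (100/4)*log(1 + GDP/100) to the prior quarter's index
gdplev = np.empty(len(gdp))
gdplev[0] = 100.0   # arbitrary initialization early in the sample
for t in range(1, len(gdp)):
    gdplev[t] = gdplev[t - 1] + (100.0 / 4.0) * np.log(1.0 + gdp[t] / 100.0)

# POTLEV = GDPLEV - GAP; POT approximated as annualized change in the log index
potlev = gdplev - gap
pot = 4.0 * np.diff(potlev)
print(np.round(pot, 2))   # [3.18 2.19 1.12]
```

Because POTLEV is an index, rebasings of the underlying GDP data (footnote 17) drop out of this construction.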
The information content of contemporaneous,
k = 0, projections will differ across variables projected, implying that for some variables shock1
will be much smaller than for others. In particular,
projections are made in the third month of each
quarter. However, the initial release of the national
accounts is at the end of the second month or early
in the third month of the quarter. Data in these
national accounts releases, such as GDP, extend
only through the prior quarter. For example, for
the national accounts release in late August 2008
(the second month of Q3), the latest GDP observations are for 2008:Q2. However, some statistics are
available in a more timely manner. For example,
interest rate data are available in real time. Thus,
by the third month in a quarter, two months of
interest rate data are already available. Likewise,
for some variables shock2 will be much smaller
(and in some cases zero) than for others, because
some published data series, such as GDP, are
revised in quarters after the initial release, while
others, such as interest rates, are not.
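Under the timing convention above (x_t^v is the value for quarter t recorded in vintage v), shock1, shock2, and revision_k are simple vintage differences. A sketch with a toy two-vintage dataset; all numbers are invented:

```python
# Toy vintages: vintage[v][t] = value of x for quarter t recorded in vintage v.
# For t >= v the entry is a (t - v)-quarter-ahead projection; for t < v, data.
vintage = {
    10: {8: 1.9, 9: 2.1, 10: 2.0, 11: 2.2, 12: 2.4},
    11: {8: 1.9, 9: 2.0, 10: 2.3, 11: 2.1, 12: 2.5},
}

def shock1(t):
    """Equation (6): first-published value minus contemporaneous projection."""
    return vintage[t + 1][t] - vintage[t][t]

def shock2(t):
    """Equation (7): first quarterly update to published data for quarter t - 1."""
    return vintage[t + 1][t - 1] - vintage[t][t - 1]

def revision(k, t):
    """Equation (8): change in the k-quarter-ahead projection across vintages."""
    return vintage[t + 1][t + k + 1] - vintage[t][t + k + 1]

print(round(shock1(10), 2), round(shock2(10), 2), round(revision(0, 10), 2))
# 0.3 -0.1 -0.1
```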

PROPERTIES OF PROJECTION
REVISIONS
New information becomes available in the
period between projection exercises. This information takes many forms, including new or revised
data published by statistical agencies, new observations from financial markets, as well as anecdotal information from surveys or the press, among
others. For interest rates, inflation, and real GDP
growth, the information in shock1 reflects projection errors, whereas the information in shock2
reflects revised data. By contrast, shocks to potential output, the output gap, and potential growth
generally are a function of shocks to data (including, but not limited to, interest rates, inflation,
and real GDP growth), updated judgment on the
part of Bank of Canada staff, and updates to external structural information on trends. Revisions
may reflect some or all of the varying types of
new information. New observations of some published data directly enter into model projections,
but other information may inform judgment and
also be incorporated.
This section examines the properties of
shocks and revisions. The analysis examines the
relative size of revisions to projections of trends
compared with revisions to projections of cyclical
dynamics. Another issue of particular interest is
the parsing of shocks to real GDP growth, interest
rates, inflation, and exchange rates into permanent
and transitory components that will, in turn, affect
shocks and revisions of projections of potential
output and the output gap.

Properties of Projection Revisions and
Shocks
Figure 1 shows the standard deviations of
shocks and revisions to GAP, GDP, GDPLEV, POT,
and POTLEV. This figure shows that both shocks
and revisions to potential output growth (POT)
were small at all horizons. By contrast, projection
errors (shock1) and near-term revisions to real
GDP growth (GDP) tend to be considerably larger.
Both results are consistent with what would generally be expected. By definition, potential is
meant to capture low-frequency movements in
output and is constructed to be smooth. Consequently, it would be surprising to see either volatile
potential growth or frequent large revisions to
potential growth. Real GDP growth, however,
tends to be volatile. Thus, not surprisingly, revisions, particularly to current and one-quarterahead projections, can be sizable. Much of the
volatility of both the underlying growth rate data
and the revisions is likely related to the allocation
and reallocation of inventory investment, imports,
and exports across quarters. At longer horizons,

[Figure 1: Standard Deviations of shocks (shock2, shock1) and revisions (rev0 through rev11) for GAP, GDP, GDPLEV, POT, and POTLEV (chart not reproduced in this extraction).]

the standard deviation of GAP projection revisions
remains quite large, and the standard deviation
of revisions to projections of GDP growth are considerably larger than revisions to projections of
potential growth. These observations suggest considerable persistence in business cycle propagation of economic shocks. Even at a 2- to 3-year
horizon, real GDP growth does not consistently
converge to potential output growth in projections.
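The statistics plotted in Figures 1 and 2 are standard deviations taken across projection vintages, one per shock or revision series. A minimal sketch with simulated revisions (the scales are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
n_vintages, n_horizons = 50, 12

# Simulated revision series: one column per horizon k, with revisions
# shrinking at longer horizons, as for a cyclical variable
revs = rng.normal(scale=np.linspace(0.8, 0.3, n_horizons),
                  size=(n_vintages, n_horizons))

# One standard deviation per horizon, across vintages, mirrors one
# line of Figure 1 or Figure 2
sd_by_horizon = revs.std(axis=0, ddof=1)
print(sd_by_horizon.round(2))
```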
Whereas shocks and revisions to potential
growth are considerably smaller than revisions to
GDP growth, the same is not true for the log levels
of GDP (GDPLEV) and potential output (POTLEV).
For these variables, the standard deviations of
shocks are essentially the same. As expected,
GDPLEV revisions tend to be larger than POTLEV
revisions, but not by nearly as much as was the
case for their growth rates (GDP and POT, respectively). In fact, at the longest horizon, k = 11, the
magnitudes of revisions to the levels are, on average, essentially the same.

Figure 2 shows the standard deviations of
shocks and revisions to INF, INFX, and R90.
Shocks to all variables are quite small. As noted
earlier, some monthly data are available for the
contemporaneous quarter, likely explaining the
larger differences in standard deviations of the
projection error (shock1) relative to the first
forecast revision (rev0).18 Revisions to near-term
projections tend to be larger than those to longer-horizon projections for inflation. This may reflect
the effects of endogenous policy designed to
achieve the 2 percent target at a roughly 2-year
horizon.
Very different properties are evident for the
short-term interest rate (R90). Shocks to interest
rates are generally small, owing to the fact that
interest rate data are available daily in real time
(so that much of the current-quarter information
18. For INF and INFX, annual updates to seasonal adjustments to the data are the main source of nonzero values of shock2. The change in definition of INFX in May 2001 also leads to a nonzero value of shock2 for this variable.

[Figure 2: Standard Deviations of shocks (shock2, shock1) and revisions (rev0 through rev11) for INF, INFX, and R90 (chart not reproduced in this extraction).]

is already available at the time of the contemporaneous-quarter projections) and are not very
volatile. Standard deviations of revision0 are similar to those of inflation. However, as the forecast
horizon increases, standard deviations of revisions to interest rates rise somewhat before leveling off, and in contrast to the results for inflation,
they do not noticeably decline for longer forecast
horizons.
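The horizon-by-horizon standard deviations just described can be computed directly from a table of projection vintages. The sketch below is an illustration only: the simulated array `proj`, its noise structure, and the function name are assumptions, not the Bank of Canada's actual projection database. `proj[v, t]` holds the value for quarter t in the vintage-v projection, and the horizon-k revision is the change in the projection of quarter v + k between vintages v − 1 and v.

```python
import numpy as np

# Simulated stand-in for a projection database: proj[v, t] holds the value
# for quarter t recorded in the vintage produced at quarter v. Estimates of
# a given quarter sharpen as the vintage approaches it, so near-term
# revisions come out larger than long-horizon ones.
rng = np.random.default_rng(0)
T = 60
truth = np.cumsum(rng.normal(0.0, 1.0, T))
proj = np.empty((T, T))
for v in range(T):
    scale = 0.5 / np.maximum(1, np.abs(np.arange(T) - v))
    proj[v] = truth + rng.normal(0.0, 1.0, T) * scale

def revision_std(proj, k):
    """Standard deviation of the horizon-k revision: the change, between
    vintages v-1 and v, in the projection for quarter v + k."""
    T = proj.shape[0]
    revs = [proj[v, v + k] - proj[v - 1, v + k] for v in range(1, T - k)]
    return float(np.std(revs))
```

On this toy design, `revision_std(proj, 0)` exceeds `revision_std(proj, 6)`, mirroring the decline of revision volatility with horizon reported for inflation.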
Table 1 provides information on the persistence of projection revisions across forecast horizons.19 Persistence should vary considerably across different economic variables. In general, revisions to trend levels should be expected to be permanent, while revisions to cyclical variables should be expected to dissipate. Each column of Table 1 provides correlations of shocks and revisions with revision0 for a single variable. When revision0 of GAP is revised, so are revisions to GAP projections at other horizons, although the correlation diminishes as the projection horizon increases. Potential growth revisions are also positively correlated but display a somewhat different pattern, with much lower correlation at near-term horizons. GDP growth revisions show strong near-term momentum, but negative correlations suggest near-term revisions tend to be partially reversed further out.

19. Note that an alternative definition of persistence would examine the persistence of revisions by horizon across time.
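The Table 1 statistics, correlations of each horizon's revision with revision0 across projection rounds, can be sketched as follows. Everything here (the simulated revision series, the decay rate, the function name) is illustrative; only the statistic itself, corr(rev_k, rev0), matches what the table reports.

```python
import numpy as np

# Simulated revisions for a cyclical, gap-like variable: each round's news
# moves projections at all horizons, but with influence decaying in k, so
# the correlation of rev_k with rev_0 falls as the horizon lengthens.
rng = np.random.default_rng(1)
n_rounds, horizons = 200, 12
news = rng.normal(size=n_rounds)
rev = np.vstack([0.85 ** k * news + 0.3 * rng.normal(size=n_rounds)
                 for k in range(horizons)])

def corr_with_rev0(rev):
    """One Table-1-style column: corr(rev_k, rev_0) for k = 0, 1, ..."""
    return np.array([np.corrcoef(rev[k], rev[0])[0, 1]
                     for k in range(rev.shape[0])])

col = corr_with_rev0(rev)   # col[0] is 1.0 by construction; then it declines
```

A trend-like variable would instead be simulated with no decay in k, giving near-equal correlations at every horizon, which is the contrast between trends and cycles the paper emphasizes.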
Correlations across horizons of revisions to
projections of the three level variables, GAP,
GDPLEV, and POTLEV, clearly reveal the differing
persistence properties of trends and cycles. When
the level of potential output is revised, it tends to
be revised by nearly equal amounts at all projection horizons. By contrast, as noted previously,
when the contemporaneous-quarter projection
of the output gap is revised, subsequent projections are revised in the same direction, but by
diminishing amounts as the projection horizon
increases. By construction, GDPLEV is the sum
Table 1
Correlations of Revisions Across Projection Horizons
Each column reports, for one variable (gap, GDP growth, potential growth, CPI inflation, core inflation, short-term interest rate, GDP level, potential level, CPI level, core CPI level, exchange rate), the correlations of shock2, shock1, and rev0 through rev11 with rev0; the rev0 correlations equal 1.00 by construction. [Table entries omitted.]
Table 2
Correlations among Projection Errors (shock1)

                            Gap    GDP     Potential  Potential   CPI        Core       Short-term     Exchange
                                   growth  growth     log level   inflation  inflation  interest rate  rate
Gap                         1.00    0.62   –0.08      –0.25        0.14       0.05      –0.01          –0.01
GDP growth                  0.62    1.00    0.32       0.29        0.01      –0.01       0.07          –0.01
Potential growth           –0.08    0.32    1.00       0.47        0.09      –0.14      –0.13           0.11
Potential log level        –0.25    0.29    0.47       1.00       –0.15      –0.09       0.01           0.07
CPI inflation               0.14    0.01    0.09      –0.15        1.00       0.63      –0.09          –0.17
Core inflation              0.05   –0.01   –0.14      –0.09        0.63       1.00      –0.14          –0.31
Short-term interest rate   –0.01    0.07   –0.13       0.01       –0.09      –0.14       1.00          –0.02
Exchange rate              –0.01   –0.01    0.11       0.07       –0.17      –0.31      –0.02           1.00
Table 3
Correlations among Data Revisions (shock2)

                      Gap     GDP growth  Potential growth  Potential log level
Gap                   1.00    –0.07       –0.35             –0.53
GDP growth           –0.07     1.00        0.46              0.31
Potential growth     –0.35     0.46        1.00              0.59
Potential log level  –0.53     0.31        0.59              1.00
of POTLEV and GAP, so it should not be surprising that persistence properties are intermediate
to the two components. On average, about half of
the contemporaneous projection revision is permanent, whereas the other half shrinks with
longer forecast horizons.
This result is rather striking, as is the result
(evident in Figure 1) that the standard deviation
of shocks to the level of GDP is about the same as
the standard deviation of shocks to the level of
potential GDP. Moreover, the standard deviations
of revisions to projections of the level of potential
output are only somewhat smaller than the standard deviations of revisions to projections of the
output gap (and are smaller for only three near-term forecasting horizons). Overall, these results
suggest almost the same amount of uncertainty is
associated with the level of potential as with the
gap. Of course, all else equal, revisions to the level
of potential output do not have policy implications, whereas revisions to the output gap do.
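Because GDPLEV is, by construction, the sum of POTLEV and GAP, their revisions obey the same identity, and the revision variances decompose accordingly. A minimal numeric check with made-up revision series:

```python
import numpy as np

# Check of the identity at the level of revisions: a GDPLEV revision equals
# the POTLEV revision plus the GAP revision, so its variance is the sum of
# the component variances plus twice their covariance. Series are simulated.
rng = np.random.default_rng(2)
potlev_rev = rng.normal(0.0, 0.5, 1000)   # trend (permanent) revisions
gap_rev = rng.normal(0.0, 0.4, 1000)      # cyclical revisions
gdplev_rev = potlev_rev + gap_rev

lhs = np.var(gdplev_rev)
rhs = (np.var(potlev_rev) + np.var(gap_rev)
       + 2.0 * np.cov(potlev_rev, gap_rev, bias=True)[0, 1])
# lhs equals rhs up to floating-point error.
```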
In the case of inflation, revisions to core
inflation projections tend to have some, albeit
low, persistence, whereas those to overall inflation do not. This result is consistent with the
observation that near-term revisions to overall
inflation are generally driven by information on
the volatile components excluded from the core
measures. By contrast, revisions to exchange rate
projections are very persistent. The persistence
of revisions to projections of the short-term
interest rate is roughly similar to the persistence
of revisions to the gap, perhaps indicating a link
between the two. This possibility is explored in
the next subsection.
Correlations among projection errors (shock1)
are presented in Table 2. A few interesting results
emerge. First, correlations among projection errors
to GDP growth, core inflation, R90, and the
exchange rate are very low. Second, the correlation
between projection errors to GDP growth and the
output gap is quite high. This result likely signals that, for a given level of potential output, higher than expected GDP data would raise both GDP growth and the gap. Similarly, the correlation between projection errors to CPI inflation and core inflation is high, consistent with the fact that CPI inflation is an aggregate that contains core inflation, so that errors in core inflation would also show up in CPI inflation. Correlations among data revisions (Table 3) are of the same sign as those among projection errors, although the former are generally stronger.

Table 4
Regression Results: Responses of Revisions to Shocks
For each dependent variable (revisionk of GAP, GDP, POT, POTLEV, INF, INFX, and R90 at horizons k = 0 through 7 and 11), the table reports coefficients on GDPshock1, GDPshock2, INFXshock1, R90shock1, and EXCHshock1, together with the adjusted R2. *Significant at 10 percent; **significant at 5 percent; ***significant at 1 percent. [Coefficient estimates omitted.]

Trend versus Cycle: Projection
Revisions in Response to Shocks
An important element of projection exercises
is parsing shocks into permanent components
that influence trends but do not have inflationary
consequences, and transitory components that
affect cyclical dynamics and generally affect inflationary pressures. The QPM was the primary tool
used to map the implications of transitory structural shocks into economic projections. While
judgment may have also entered into projections,
particularly for understanding near-term economic
variation, at medium to longer horizons, endogenously generated model dynamics would play a
more dominant role. As noted earlier, the properties of the QPM are well documented. However,
the implications of shocks for trend projections
are less well understood.
In this section, the responses of projections
of several main economic variables to shocks to
GDP, INFX, R90, and EXCH are analyzed. To a
certain extent, shocks to these variables might be
considered exogenous, as they directly reveal new
information from financial markets (in the case
of interest rates and exchange rates) or as published by Statistics Canada. Revisions to potential output (and variables constructed using potential output) might be thought of as responses to this new information.20 To assess the importance of these sources of new information, regressions of the following format were estimated:

(9)   revisionkt = c + βG1 GDPshock1t + βG2 GDPshock2t + βI INFXshock1t + βR R90shock1t + βE EXCHshock1t.

20. In examining the empirical results in the table, it is important to keep in mind that some shocks have smaller standard deviations than others. In particular, because interest rates tend to move gradually and two of three months of interest rate data are available for the contemporaneous quarter during the projection exercise, shocks to interest rates are generally of smaller magnitude. This feature may explain the somewhat larger coefficients on interest rate shocks in the tables.

Only one shock variable was included for inflation, the short-term interest rate, and the exchange
rate, as these variables are essentially unrevised.
Results are presented in Table 4.
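Equation (9) is an ordinary least squares regression of each horizon's revision on the five shock series. The sketch below uses simulated data: the coefficient vector, sample size, and noise level are invented, and only the regression form follows equation (9).

```python
import numpy as np

# Simulated stand-in for the revision/shock data. Columns of `shocks` play
# the roles of GDPshock1, GDPshock2, INFXshock1, R90shock1, and EXCHshock1.
rng = np.random.default_rng(3)
n = 50
shocks = rng.normal(size=(n, 5))
beta_true = np.array([0.30, 0.47, 0.35, -1.00, 0.10])
revision_k = shocks @ beta_true + 0.1 * rng.normal(size=n)

X = np.column_stack([np.ones(n), shocks])             # constant c first
coef, *_ = np.linalg.lstsq(X, revision_k, rcond=None)  # OLS estimates
resid = revision_k - X @ coef
r2 = 1.0 - resid.var() / revision_k.var()
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - X.shape[1])  # adjusted R2
```

One such regression is run per dependent variable and horizon k, which is how a table of coefficients and adjusted R2 values like Table 4 is assembled.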
The most important variable in terms of influencing projection revisions is GDP. Shocks to GDP
tend to lead to revisions of the same sign to projections of the output gap, inflation, core inflation,
the short-term interest rate, the level of potential
output, and near-term projections of real GDP
growth; and to revisions of the opposite sign to
longer-term projections of real GDP growth. By
contrast, there is no evidence that potential growth
is responsive to these shocks. In terms of parsing
GDP shocks, a fraction of these shocks (about 1/3)
is mapped into permanent shocks that lead to
parallel shifts of the level of potential without
influencing the growth rate. The remainder of the
GDP shocks are assessed as cyclical (transitory),
with some persistence, and lead to revisions to
gap projections at horizons out to five quarters,
with the largest revisions being to revision1 and
revision2 (about 2/3 of GDP shocks are mapped into
GAP revisions for k = 1,2). For positive shocks,
near-term growth is revised upward and the output gap becomes larger. The additional inflationary pressures lead to tighter monetary policy,
which is consistent with more rapid reductions
in the size of the gap and therefore downward
revisions to GDP growth, both of which are consistent with a closing of the gap after two years.
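The parsing just described can be sketched numerically. The one-third permanent share and five-quarter dissipation follow the empirical pattern in the text, but the decay rate and the exact shares below are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

# Stylized parsing of a unit GDP shock: roughly one-third becomes a
# permanent, parallel shift in the level of potential, while the remainder
# enters the gap and dissipates over a few quarters. The GDP-level revision
# is their sum, by the identity GDPLEV = POTLEV + GAP.
def parse_gdp_shock(shock, horizons=12, permanent_share=1.0 / 3.0, decay=0.6):
    k = np.arange(horizons)
    potlev_rev = np.full(horizons, permanent_share * shock)  # parallel shift
    gap_rev = (1.0 - permanent_share) * shock * decay ** k   # dissipating
    gdplev_rev = potlev_rev + gap_rev                        # identity
    return potlev_rev, gap_rev, gdplev_rev

pot, gap, gdp = parse_gdp_shock(1.0)
```

As the gap component dies out, the GDP-level revision converges to the permanent component, consistent with a closing of the gap at longer horizons.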
There are two noteworthy aspects to this parsing of GDP shocks into potential output and the
output gap. First, parsing explicitly recognizes
that not all shocks are transitory demand shocks.
In the EMVF filter, the HP terms imply that estimates of potential output are informed by historical output data. Thus, shocks may lead to revised
estimates of potential output for the last few observations of the historical data. The empirical results
suggest that this revision is linked into a new projection of potential output by shifting the previously projected level of potential output up or
down in an essentially parallel fashion so that
the shock has permanent effects.
Second, parallel revisions to the level of potential are consistent with smaller revisions to the
output gap and potential growth, variables that
play more prominent roles in communication.
For communications purposes, it is preferable to
focus on the main underlying signal of the state
of the economy that indicates the extent of inflationary pressures. Large or frequent revisions to
the recent history of the output gap or to projections of economic activity, particularly when
reversed, would be undesirable. The historical
mapping of a fraction of shocks into parallel shifts
of potential output reduces the size of real-time
revisions to the output gap and to projections of
potential growth. In combination with communications about data revisions and uncertainty
surrounding measures of potential output and
the output gap, this may have provided a practical approach to dealing with real-time challenges
of noisy and revised data.21
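The endpoint mechanism behind these parallel level shifts can be illustrated with a plain HP filter, which supplies the smoothness terms inside the EMVF. The series, sample size, and smoothing parameter below are illustrative assumptions.

```python
import numpy as np

# Minimal HP-filter illustration of the endpoint effect: when one new
# observation arrives, the filtered trend is revised most near the end of
# the sample, which in an EMVF-style setting shows up as a revision to the
# recent history of potential output.
def hp_trend(y, lam=1600.0):
    n = len(y)
    d2 = np.zeros((n - 2, n))                 # second-difference operator
    for i in range(n - 2):
        d2[i, i:i + 3] = (1.0, -2.0, 1.0)
    # trend solves (I + lam * D2'D2) g = y
    return np.linalg.solve(np.eye(n) + lam * d2.T @ d2, y)

rng = np.random.default_rng(4)
y = np.cumsum(rng.normal(0.3, 1.0, 80))       # random walk with drift
trend_old = hp_trend(y[:-1])                  # vintage without the last point
trend_new = hp_trend(y)                       # vintage with the last point
rev = np.abs(trend_new[:-1] - trend_old)      # trend revision by quarter
```

The revision is concentrated at the endpoint and fades quickly toward the interior of the sample, which is why the last few historical observations of potential are the ones revised when a shock arrives.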
Finally, the pattern of revisions to projections
of the output gap and R90 in response to GDP
growth shock1 may explain why there are only
small effects of shocks on inflation. In particular,
in general equilibrium, monetary policy responds
(gradually according to the empirical results) to
the revisions in the output gap projections. But
with lags in the response of inflation to aggregate
demand pressures, policy is “ahead of the curve”
and attenuates inflationary implications. A similar
outcome may occur with shocks to the exchange
rate. In particular, projections of R90 at longer
horizons respond positively to EXCH shocks
(which are quite persistent, as evident in Table 1),
possibly indicating slow pass-through of exchange
rate movements to inflation, and therefore a
delayed policy response to such shocks.
21. Such a strategy is not unlike the strategy of using a measure of core inflation to indicate "underlying inflation" when a few components of the total CPI are subject to large transitory shocks.
CONCLUSION
The output gap plays a central role in monetary policy decisions and communications at the
Bank of Canada. The methodology used to estimate and project potential output was designed
to be consistent with the structure of the Bank’s
projection model (the QPM), allow estimates to
be (flexibly) influenced by judgment and external
structural estimates of trends, and incorporate
information from a variety of sources to better
disentangle supply and demand shocks. In practice, information sources that are external to the
QPM, such as demographics or structural details
of the Canadian labor market, are important
drivers of the trend labor input component of
potential output.
Analysis of revisions to real-time Bank of
Canada staff economic projections reveals several
interesting results. First, the similar size of typical
revisions to projections of log potential output
and the output gap suggests as much uncertainty
about the trend as about the cycle. Second, real
GDP shocks provided information about both the
trend and the cycle. These shocks were parsed
into permanent components that led to parallel
shifts in projections of potential output and transitory components that led to persistent near-term
revisions of the output gap that, with endogenous
policy, dissipated over the projection horizon.

REFERENCES

Armstrong, John; Black, Richard; Laxton, Douglas and Rose, David. "The Bank of Canada's New Quarterly Projection Model, Part 2: A Robust Method for Simulating Forward-Looking Models." Technical Report No. 73, Bank of Canada, February 1995; www.bankofcanada.ca/en/res/tr/1995/tr73.pdf.

Bank of Canada. Monetary Policy Report: May 1995. May 1995; www.bankofcanada.ca/en/mpr/pdf/mpr_apr_1995.pdf.

Bank of Canada. Monetary Policy Report: May 1996. May 1996; www.bankofcanada.ca/en/mpr/pdf/mpr_apr_1996.pdf.

Bank of Canada. Monetary Policy Report: May 1999. May 19, 1999; www.bankofcanada.ca/en/mpr/pdf/mpr_may_1999.pdf.

Bank of Canada. Monetary Policy Report: November 2000. November 9, 2000; www.bankofcanada.ca/en/mpr/pdf/mpr_nov_2000.pdf.

Barnett, Russell. "Trend Labour Supply in Canada: Implications of Demographic Shifts and the Increasing Labour Force Attachment of Women." Bank of Canada Review, Summer 2007, pp. 5-18; www.bankofcanada.ca/en/review/summer07/review_summer07.pdf.

Baxter, Marianne and King, Robert G. "Measuring Business Cycles: Approximate Band-Pass Filters for Economic Time Series." Review of Economics and Statistics, November 1999, 81(4), pp. 575-93.

Black, Richard; Laxton, Douglas; Rose, David and Tetlow, Robert. "The Bank of Canada's New Quarterly Projection Model, Part 1: The Steady-State Model: SSQPM." Technical Report No. 72, Bank of Canada, November 1994; www.bankofcanada.ca/en/res/tr/1994/tr72.pdf.

Butler, Leo. "The Bank of Canada's New Quarterly Projection Model, Part 4: A Semi-Structural Method to Estimate Potential Output: Combining Economic Theory with a Time-Series Filter." Technical Report No. 77, Bank of Canada, October 1996; www.bankofcanada.ca/en/res/tr/1996/tr77.pdf.

Coletti, Donald; Hunt, Benjamin; Rose, David and Tetlow, Robert. "The Bank of Canada's New Quarterly Projection Model, Part 3: The Dynamic Model: QPM." Technical Report No. 75, Bank of Canada, May 1996; www.bankofcanada.ca/en/res/tr/1996/tr75.pdf.

Côté, Denise and Hostland, Doug. "An Econometric Examination of the Trend Unemployment Rate in Canada." Working Paper No. 96-7, Bank of Canada, May 1996; www.bankofcanada.ca/en/res/wp/1996/wp96-7.pdf.

Crawford, Allan. "Trends in Productivity Growth in Canada." Bank of Canada Review, Spring 2002, pp. 19-32.

Dodge, David. Opening statement before the Standing Senate Committee on Banking, Trade and Commerce, November 29, 2001; www.bankofcanada.ca/en/speeches/2001/state01-4.html.

Guay, Alain and St-Amant, Pierre. "Do Mechanical Filters Provide a Good Approximation of Business Cycles?" Technical Report No. 78, Bank of Canada, November 1996; www.bankofcanada.ca/en/res/tr/1996/tr78.pdf.

Harvey, Andrew C. Forecasting, Structural Time Series Models and the Kalman Filter. New York: Cambridge University Press, 1989.

Hodrick, Robert J. and Prescott, Edward C. "Post-War U.S. Business Cycles: An Empirical Investigation." Journal of Money, Credit, and Banking, February 1997, 29(1), pp. 1-16.

Ip, Irene. "Labour Force Participation in Canada: Trends and Shifts." Bank of Canada Review, Summer 1998, pp. 29-52; www.bankofcanada.ca/en/review/1998/r983b.pdf.

King, Robert G. and Rebelo, Sergio T. "Low Frequency Filtering and Real Business Cycles." Journal of Economic Dynamics and Control, 1993, 17(1-2), pp. 207-31.

Kozicki, Sharon. "Multivariate Detrending under Common Trend Restrictions: Implications for Business Cycle Research." Journal of Economic Dynamics and Control, June 1999, 23(7), pp. 997-1028.

Laxton, Douglas; Rose, David E. and Tetlow, Robert. "Is the Canadian Phillips Curve Non-Linear?" Working Paper 93-7, Bank of Canada, July 1993; www.douglaslaxton.org/sitebuildercontent/sitebuilderfiles/LRT3.pdf.

Laxton, Douglas and Tetlow, Robert. "A Simple Multivariate Filter for the Measurement of Potential Output." Technical Report No. 59, Bank of Canada, June 1992; www.douglaslaxton.org/sitebuildercontent/sitebuilderfiles/LT.pdf.

Macklem, Tiff. "Information and Analysis for Monetary Policy: Coming to a Decision." Bank of Canada Review, Summer 2002, pp. 11-18; www.bankofcanada.ca/en/review/2002/macklem_e.pdf.

Mise, Emi; Kim, Tae-Hwan and Newbold, P. "On Suboptimality of the Hodrick-Prescott Filter at Time-Series Endpoints." Journal of Macroeconomics, March 2005, 27(1), pp. 53-67.

Orphanides, Athanasios and van Norden, Simon. "The Unreliability of Output Gap Estimates in Real Time." Review of Economics and Statistics, 2002, 84(4), pp. 569-83.

Poloz, Stephen S. "The Causes of Unemployment in Canada: A Review of the Evidence." Working Paper 94-11, Bank of Canada, November 1994; www.bankofcanada.ca/en/res/wp/1994/wp94-11.pdf.

Poloz, Stephen; Rose, David and Tetlow, Robert. "The Bank of Canada's New Quarterly Projection Model (QPM): An Introduction." Bank of Canada Review, Autumn 1994, pp. 23-38; www.bankofcanada.ca/en/review/1994/r944a.pdf.

Rennison, Andrew. "Comparing Alternative Output Gap Estimators: A Monte Carlo Approach." Working Paper 2003-8, Bank of Canada, March 2003; www.bankofcanada.ca/en/res/wp/2003/wp03-8.pdf.

St-Amant, Pierre and van Norden, Simon. "Measurement of the Output Gap: A Discussion of Recent Research at the Bank of Canada." Technical Report No. 79, Bank of Canada, August 1997; www.bankofcanada.ca/en/res/tr/1997/tr79.pdf.

Stock, James H. and Watson, Mark W. "Testing for Common Trends." Journal of the American Statistical Association, December 1988, 83(404), pp. 1097-107.

Thiessen, Gordon. "Monetary Policy and the Prospects for a Stronger Canadian Economy." Notes for remarks to the Canadian Association for Business Economics and the Ottawa Economics Association, Ottawa, Ontario, Canada, March 21, 1997; www.bankofcanada.ca/en/speeches/1997/sp97-3.html.

Commentary
Gregor W. Smith

parse: v. tr. resolve (a sentence) into its
component parts and describe grammatically

In their thought-provoking and informative
paper, Barnett, Kozicki, and Petrinec (2009)
describe how the Bank of Canada used its
quarterly projection model (QPM) between
1994 and 2005 to resolve changes in macroeconomic variables into their component parts. They
make four distinct contributions by
(i) giving a history of the QPM,
(ii) describing how potential output was modeled with a multivariate filter that was outside the QPM and is still in use,
(iii) outlining the Bank of Canada's forecasting methods for potential output, and
(iv) illustrating the properties of multivariate forecast (or projection) errors and revisions.
In commenting on these contributions, I begin
by looking at the history and forecasts of potential
output as modeled by the Bank of Canada and
then I draw attention to their findings concerning
forecast revisions and forecast errors.

HISTORY AND FORECASTS OF
POTENTIAL OUTPUT
During the 1990s the Bank of Canada began
to model potential output with its extended multivariate filter (EMVF), a development that was

ahead of its time. The Bank still uses this filter
today. The filter is multivariate in the sense that
it takes a range of indicators (e.g., the participation
rate and the unemployment rate) as inputs in addition to output itself. The filter is extended in the
sense that it uses economic information to define
the output gap. This information includes restrictions requiring a common trend for some series
or a positive correlation between the output gap
and the inflation rate. Finally, the EMVF also is two-sided, using both previous values of its input
variables and subsequent values (or their forecasts, when potential output is being estimated
for recent quarters).
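The structure Smith describes can be sketched as a small penalized least-squares problem. Everything below is a toy assumption, not the Bank's actual EMVF: the weights, the Phillips-curve slope, and the single-indicator setup are invented purely to show how an HP-style smoothness term and an economic restriction are combined in one objective.

```python
import numpy as np

# Toy "extended" multivariate filter: choose potential g to trade off fit
# to output y, smoothness of the trend (the HP term), and a Phillips-curve-
# style link between the gap (y - g) and observed inflation changes dinf.
def emvf_like(y, dinf, lam=1600.0, mu=5.0, phi=0.2):
    n = len(y)
    d2 = np.zeros((n - 2, n))                 # second-difference operator
    for i in range(n - 2):
        d2[i, i:i + 3] = (1.0, -2.0, 1.0)
    # minimize ||y - g||^2 + lam ||D2 g||^2 + mu ||dinf - phi (y - g)||^2;
    # the first-order condition gives the linear system below.
    A = (1.0 + mu * phi ** 2) * np.eye(n) + lam * d2.T @ d2
    b = (1.0 + mu * phi ** 2) * y - mu * phi * dinf
    return np.linalg.solve(A, b)

# With phi = 0 the inflation term drops out and this is a plain HP filter.
```

With positive inflation surprises, the estimated potential is pulled below output, so the gap opens; that is the sense in which the extra indicator "extends" the univariate filter.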
The Bank’s projection method thus involved
two sets of parameters—one in the EMVF and
another in the QPM. It is relatively easy to think of
situations in which identifying parameters in the
second component might depend on the parameterization in the first component. For example,
the EMVF used parameter values that built in
some smoothness in the series for potential output.
The paper by Basu and Fernald (2009), in this
issue, skeptically discusses the use of smoothness
restrictions in defining potential output.
An alternative to the Bank’s procedure would
have been to smooth later in the process in the
QPM. For example, using an unsmooth potential
output series as an input in the QPM presumably
would have led to a calibration of the QPM that
involved smaller reactions to potential output
or that used reactions to both current and lagged
values of potential output, so that the smoothing

Gregor W. Smith is the Douglas D. Purvis Professor of Economics at Queen’s University.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 267-70.
© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

effectively occurred at this second stage. This
particular sequence of calibrations or parameterizations is now history, for the Bank of Canada
has replaced the QPM with the Terms-of-Trade
Economic Model (ToTEM, as described by Fenton
and Murchison, 2006), but the general point about
possible interdependency remains.
The EMVF is a two-sided filter, which naturally raises the questions of how to replace some
future values by forecasts, how to deal with revisions in data, and what to do near the end of a
time-series sample. Here my own vote favors a
one-sided approach using the Kalman filter to
forecast, filter, and then smooth as data vintages
accumulate. Of course, the two-sided filter can be
written in terms of forecasts so that it is one sided.
My suggestion is simply that such a process might
be a clearer place to begin, because I do not interpret Barnett, Kozicki, and Petrinec as arguing for
any special interest in the parameters of the twosided version. Anderson and Gascon (2009), also
in this issue, provide a comprehensive application
of a one-sided approach to U.S. data.
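The one-sided approach Smith favors can be sketched with a scalar Kalman filter. The state-space model, drift, and variance parameters below are illustrative assumptions, not calibrated to Canadian data; the point is only that the trend estimate at each date uses current and past observations alone.

```python
import numpy as np

# One-sided trend extraction: a Kalman filter for a local-level model with
# drift, y_t = g_t + e_t and g_t = g_{t-1} + d + w_t, so the real-time
# estimate of the trend g never peeks at future data.
def kalman_trend(y, drift=0.0, q=0.1, r=1.0):
    g, p = y[0], 1.0                    # initial state mean and variance
    out = []
    for obs in y:
        g_pred, p_pred = g + drift, p + q          # predict
        gain = p_pred / (p_pred + r)               # Kalman gain
        g = g_pred + gain * (obs - g_pred)         # update with current obs
        p = (1.0 - gain) * p_pred
        out.append(g)
    return np.array(out)

rng = np.random.default_rng(5)
trend = np.cumsum(np.full(100, 0.2))               # true trend, drift 0.2
y = trend + rng.normal(0.0, 1.0, 100)
est = kalman_trend(y, drift=0.2)                   # one-sided trend estimate
```

As later vintages arrive, the same model can be re-run as a smoother over the accumulated data, which is exactly the forecast-filter-smooth sequence Smith has in mind.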
Historical and forecast series for potential output at the Bank of Canada are based on different
input series and restrictions. For example, forecasts of potential output use forecasts for the
capital stock and total factor productivity, while
historical estimates do not use these series. These
two measures obviously serve different purposes.
The Bank of Canada uses potential output and the
output gap to convey the idea that accumulated
events or output relative to the path of potential
matter to current events such as the inflation rate.
That communication can counteract the view that
only the most recent growth rates of macroeconomic variables matter to the subsequent evolution of the economy. I would worry, though, that
having different procedures for measuring historical potential output and forecasting current and future potential output might hinder the communication effort.
Barnett, Kozicki, and Petrinec also discuss
the issue of the sensitivity of the Bank of Canada’s
measure of potential output to the endpoint. As
they note, noisiness in the output gap limits its
usefulness as a communication device. The alternative—the Kalman-filter approach—delivers a
lower weight on the observation equation in preliminary data than in revised data to reflect this
uncertainty. Therefore, that alternative approach
again may be a natural framework for this issue.
However, it is always possible that current measures of today’s potential output and output gap
are simply too noisy to be useful guides to policy,
as Croushore (2009) suggests in his commentary
in this issue.

FORECAST REVISIONS AND
FORECAST ERRORS
Barnett, Kozicki, and Petrinec next focus on
the properties of multivariate forecasts, also
known as projections. Readers do not observe
the exact model used to produce the forecasts
because that model combines the QPM with estimates of trends. But studying forecasts is a natural
way to evaluate the model anyway, so their study
of forecast errors and revisions between 1993 and
2005 is welcome.
A key piece of notation is that x_t^v denotes a variable for quarter t as measured at quarter v
(for vintage data). For given t, as v counts up, a
switch occurs from forecasts to preliminary data
and then to revised data. Unobserved variables
involve only a succession of forecasts. Bearing in
mind this sequence, I thus apply a gestalt switch
to their Figures 1 and 2. I read them from right to
left so that they describe the changes over time
as the date t to which the forecasts apply first is
approached then left behind.
To comment on their informative reporting, I
use some notation. Let us define

ε_t^v = x_t^v − x_t^{v−1},

which is the one-vintage-apart change in the forecast. With v ≤ t this is a forecast revision; with
v > t it is a forecast error. This updating applies
to three different types of variables: (i) those that
are eventually observed and not subsequently
revised (like the consumer price index [CPI]);
(ii) those that are eventually observed but then
revised (like gross domestic product [GDP]); and
(iii) those that are never observed (like potential
GDP). To help readers understand the forecast
process, Barnett, Kozicki, and Petrinec next provide three different types of statistics involving ε_t^v.

Standard Deviation Over Time

A first way to provide information about forecast errors and revisions is to document their variability over time using the standard deviation,

std_t(ε_t^v),

and then see how this varies with v. For most values of v, these standard deviations in potential GDP are roughly as large as those in revisions to actual GDP. But, of course, the revisions or errors in actual output and potential output are correlated, so the forecast revisions or errors for the output gap are much less volatile.

Correlation Across Horizons

A second way to study revisions or forecast errors is to look at their correlation across horizons:

corr_k(ε_{t+k}^v, ε_t^v).

This correlation naturally reflects the implied, underlying persistence. (Reporting correlations, if any, over time also would be interesting.)

Correlations of Forecast Errors Across Variables

A third, informative statistic is the correlation of revisions or forecast errors across macroeconomic variables. For example, using y to denote output and π to denote inflation, an interesting correlation is

corr(ε_{πt}^v, ε_{yt}^v).

This correlation is high between GDP growth and the output gap. It is low for inflation (or core inflation) and the output gap: 0.14 (or 0.05). The output gap could be measured or defined based on this correlation. I stress that that is not what the authors try to do. But since the Bank of Canada does use the output gap to try to communicate its views on inflation, it seems natural to test whether this "news" correlation is significantly different from zero.

Correlations of Forecast Errors Across Variables and Horizons

Perhaps the correlation at a longer horizon for inflation would be more interesting than the one between contemporaneous revisions. That would tell us how news about the output gap leads to immediate revisions in forecasts for subsequent inflation. The authors' Table 4 provides exactly this type of statistic:

corr(ε_{πt+k}^v, ε_{yt}^v).
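The three statistics can be sketched on simulated revisions. The series and the 0.15 loading of inflation revisions on output-gap news are invented for illustration; they are not the authors' Canadian vintage data.

```python
import numpy as np

# Simulated one-vintage-apart revisions eps_t^v for the output gap (y) and
# inflation (pi). The 0.15 loading is an invented illustrative value.
rng = np.random.default_rng(0)
n = 200
eps_y = rng.normal(0.0, 1.0, n)                    # output-gap revisions
eps_pi = 0.15 * eps_y + rng.normal(0.0, 1.0, n)    # inflation revisions

# (i) variability over time: std_t(eps_t^v)
sd_y = eps_y.std()

# (ii) persistence: correlation across horizons, corr_k(eps_{t+k}^v, eps_t^v)
k = 1
corr_horizon = np.corrcoef(eps_y[k:], eps_y[:-k])[0, 1]

# (iii) correlation across variables: corr(eps_{pi,t}^v, eps_{y,t}^v)
corr_vars = np.corrcoef(eps_pi, eps_y)[0, 1]
```

With real vintage data the same three calls, applied separately for each distance v − t, would reproduce the type of statistics reported by the authors.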

Barnett, Kozicki, and Petrinec find a small,
positive effect of GDP forecast errors on later inflation forecasts, an effect that is statistically significant at five to seven quarters (I am not sure of the
units so cannot report on the economic significance). The policy interest rate rises too (as do
market rates) but not enough to fully offset the
effect of the change in the output gap on later,
forecasted inflation.
Can we parameterize potential output so the
output gap causes inflation? (Or can we test the
causal role for a gap measured with a production
function?) I think the answer is no, we cannot. At
longer horizons, nothing should lead to revisions
to inflation forecasts. Imagine a least squares regression like this:

π_{t+k}^t − 2.0 = β_0 + β_1 z_t^t.

In this regression, we should find that the coefficients are indistinguishable from zero for any variable z_t^t and any horizon, say, k > 4 quarters. Thus, this regression could not identify parameters of the output gap or potential output.
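A sketch of this thought experiment on simulated data, generated under the null that the deviation of forecast inflation from the 2 percent target is pure noise; all series and names are illustrative:

```python
import numpy as np

# Under successful inflation targeting, deviations of the k-quarter-ahead
# inflation forecast from the 2 percent target are unpredictable, so
# beta_1 should be indistinguishable from zero for any candidate regressor z.
rng = np.random.default_rng(1)
n = 120
z = rng.normal(0.0, 1.0, n)          # candidate predictor (e.g., output gap)
pi_dev = rng.normal(0.0, 0.3, n)     # pi_{t+k}^t - 2.0 under the null

X = np.column_stack([np.ones(n), z])               # constant and z
beta, *_ = np.linalg.lstsq(X, pi_dev, rcond=None)  # OLS estimates

resid = pi_dev - X @ beta
sigma2 = resid @ resid / (n - 2)
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_stat = beta[1] / se   # should be statistically indistinguishable from zero
```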
After 1995 the official inflation target (and
average inflation rate) was 2 percent. Forecasts
should equal that value at and beyond the horizon
over which the policy interest rate has effect.
Kuttner and Posen (1999), Rowe and Yetman
(2002), and Otto and Voss (2009) have outlined
this unforecastability of inflation departures from
target under successful inflation targeting. So
deviations from 2 percent in the Bank’s inflation
forecasts could reflect overflexible inflation targeting or an insufficient response of the policy
interest rate. A role for the historical (two-sided)
output gap in this regression would show that
alternative measurement to be misleading. But the response of the overnight (policy) interest rate perhaps could identify learning by the central bank about the output gap.

CONCLUSION

This commentary has followed Barnett, Kozicki, and Petrinec's article by beginning with how potential is and was measured in Canada and then turning to the properties of revisions and forecast errors. But perhaps this sequence could come full circle in the Bank of Canada's research: Studying the properties of forecast (projection) errors might well lead to changes in how the Bank measures potential output.

Under inflation targeting there is no information in inflation forecasts with which to test or identify lagged effects of potential output or the output gap on inflation. So, statistically, the output gap might be better thought of not as the thing that predicts inflation but rather as the thing to which the policy interest rate reacts and, implicitly, about which the Bank of Canada learns.

I conclude with a brief observation I would like to emphasize. Full credit goes to the Bank of Canada and its researchers for publicizing these data from past projections and documenting their properties. As this article shows, these data provide a rich source of insights into the tools used in monetary policy.

REFERENCES

Anderson, Richard G. and Gascon, Charles S. "Estimating U.S. Output Growth with Vintage Data in a State-Space Framework." Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 271-90.

Barnett, Russell; Kozicki, Sharon and Petrinec, Christopher. "Parsing Shocks: Real-Time Revisions to Gap and Growth Projections for Canada." Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 247-65.

Basu, Susanto and Fernald, John G. "What Do We Know (And Not Know) about Potential Output?" Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 187-213.

Croushore, Dean. Commentary on "Estimating U.S. Output Growth with Vintage Data in a State-Space Framework." Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 371-81.

Fenton, Paul and Murchison, Stephen. "ToTEM: The Bank of Canada's New Projection and Policy-Analysis Model." Bank of Canada Review, Autumn 2006, pp. 5-18.

Kuttner, Kenneth N. and Posen, Adam S. "Does Talk Matter After All? Inflation Targeting and Central Bank Behavior." Staff Report No. 88, Federal Reserve Bank of New York, October 1999; www.newyorkfed.org/research/staff_reports/sr88.pdf.

Otto, Glenn and Voss, Graham. "Tests of Inflation Forecast Targeting Models." Unpublished manuscript, Department of Economics, University of Victoria, 2009.

Rowe, Nicholas and Yetman, James. "Identifying a Policymaker's Target: An Application to the Bank of Canada." Canadian Journal of Economics, May 2002, 35(2), pp. 239-56.


The Challenges of Estimating
Potential Output in Real Time
Robert W. Arnold
Potential output is an estimate of the level of gross domestic product attainable when the economy
is operating at a high rate of resource use. A summary measure of the economy’s productive capacity,
potential output plays an important role in the Congressional Budget Office (CBO)’s economic
forecast and projection. The author briefly describes the method the CBO uses to estimate and
project potential output, outlines some of the advantages and disadvantages of that approach, and
describes some of the challenges associated with estimating and projecting potential output. Chief
among these is the difficulty of estimating the underlying trends in economic data series that are
volatile, subject to structural change, and frequently revised. Those challenges are illustrated using
examples based on recent experience with labor force growth, the Phillips curve, and labor productivity growth. (JEL E17, E32, E62)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 271-90.

Assessing current economic conditions,
gauging inflationary pressures, and
projecting long-term economic growth
are central aspects of the Congressional
Budget Office (CBO)’s economic forecasts and
baseline projections. Those tasks require a summary measure of the economy’s productive capacity. That measure, known as potential output, is
an estimate of “full-employment” gross domestic
product (GDP)—the level of GDP attainable when
the economy is operating at a high rate of resource
use. Although it is a measure of the productive
capacity of the economy, potential output is not
a technical ceiling on output that cannot be
exceeded. Rather, it is a measure of sustainable
output, where the intensity of resource use is
neither adding to nor subtracting from short-run
inflationary pressure. If actual output exceeds
its potential level, then constraints on capacity
begin to bind, restraining further growth and
contributing to inflationary pressure. If output

falls below potential, then resources are lying
idle and inflation tends to fall.
In addition to being a measure of aggregate
supply in the economy, potential output is also
an estimate of trend GDP. The long-term trend in
real GDP is generally upward as more resources—
primarily labor and capital—become available and
technological change allows more productive
use of existing resources. Real GDP also displays
short-term variation around that long-run trend,
influenced primarily by the business cycle but
also by random shocks whose sources are difficult
to pinpoint. Analysts often want to estimate the
underlying trend, or general momentum, in GDP
by removing short-term variation from it. A distinct, but related, objective is to remove the fluctuations that arise solely from the effects of the
business cycle.
Potential output plays a role in several areas
associated with the CBO’s economic forecast. In
particular, we use potential output to set the
level of real GDP in our medium-term (or 10-year)

Robert W. Arnold is principal analyst in the Congressional Budget Office.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


projections. In doing so, we assume that the gap
between GDP and potential GDP will equal zero
on average in the medium term. Therefore, the
CBO projects that any gap that remains at the end
of the short-term (or two-year) forecast will close
during the following eight years. We also use the
level of potential output as one gauge of inflationary pressures in the near term. For example, an
increase in inflation that occurs when real GDP is
below potential (and monetary growth is moderate) can probably be attributed to temporary factors and is unlikely to persist. Finally, potential
output is an important input in computing the
standardized-budget surplus or deficit, which the
CBO uses to evaluate the stance of fiscal policy
and reports regularly as part of its mandate.

THE CBO METHOD FOR ESTIMATING POTENTIAL OUTPUT

The CBO model for estimating potential output is based on the framework of a neoclassical,
or Solow, growth model. The model includes a
Cobb-Douglas production function for the nonfarm business (NFB) sector with two factor inputs,
labor (measured as hours worked) and capital
(measured as an index of capital services provided by the physical capital stock), and total
factor productivity (TFP), which is calculated as
a residual. NFB is by far the largest sector in the
economy, accounting for 76 percent of GDP in
2007, compared with less than 10 percent for each
of the other sectors. For smaller sectors of the
economy, including farms, federal government,
state and local government, households, and nonprofit institutions, simpler equations are used to
model output. Those equations generally relate
the growth of output in a sector to the growth of
the factor input—either capital or labor—that is
more important for production in that sector.1
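The production-function bookkeeping can be sketched as follows. The 0.7 labor / 0.3 capital split is a standard growth-accounting assumption used here for illustration, not necessarily the CBO's exact parameterization.

```python
import numpy as np

# Cobb-Douglas production function for the NFB sector with imposed factor
# shares (illustrative values).
ALPHA_LABOR, ALPHA_CAPITAL = 0.7, 0.3

def nfb_output(tfp, hours, capital_services):
    return tfp * hours**ALPHA_LABOR * capital_services**ALPHA_CAPITAL

def tfp_residual(output, hours, capital_services):
    # TFP is backed out as a residual once output and the inputs are measured.
    return output / (hours**ALPHA_LABOR * capital_services**ALPHA_CAPITAL)

y = nfb_output(tfp=1.2, hours=100.0, capital_services=50.0)
```

Feeding measured output and inputs back through `tfp_residual` recovers the TFP term, which is how the residual calculation works in a growth-accounting framework.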
To compute historical values for potential output, we cyclically adjust the factor inputs and
then combine them using the production function.
Cyclical adjustment removes the variation in a

series that is attributable solely to business cycle
fluctuations. Ideally, the resulting series will not only reflect the trend in the series but also be benchmarked to some measure of capacity in the economy and, therefore, can be interpreted as the potential level of the series.
For most variables in the model, we use a
cyclic-adjustment equation that combines a relationship based on Okun’s law with linear time
trends to produce potential values for the factor
inputs. Okun (1970) postulated an inverse relationship between the size of the output gap (the
percentage difference between GDP and potential
GDP) and the size of the unemployment gap (the
difference between the unemployment rate and
the “natural” rate of unemployment). According
to that relationship, actual output exceeds its
potential level when the unemployment rate is
below the natural rate of unemployment and falls
short of potential output when the unemployment
rate is above its natural rate (Figure 1).
For the natural rate of unemployment, we use
the CBO estimate of the non-accelerating inflation
rate of unemployment (NAIRU). That rate corresponds to a particular notion of full employment—
the rate of unemployment that is consistent with
a stable rate of inflation. The historical estimate
of the NAIRU derives from an estimated relationship known as a Phillips curve, which connects
the change in inflation to the unemployment rate
and other variables, including changes in productivity trends, oil price shocks, and wage and price
controls. The historical relationship between the
unemployment gap and the change in the rate of
inflation appears to have weakened since the mid-1980s.2 However, a negative correlation still exists;
when the unemployment rate is below the NAIRU,
inflation tends to rise, and when it exceeds the
NAIRU, inflation tends to fall. Consequently, the
NAIRU, while it is less useful for inflation forecasts, is still useful as a benchmark for potential
output.
The assumption of linear time trends in the
cyclic-adjustment equation implies that the poten-

1 This section gives an overview of the CBO method. For a more complete description, see CBO (2001).
2 For a description of the procedure used to estimate the NAIRU, see Arnold (2008).

Figure 1
Okun's Law: The Output Gap and the Unemployment Gap

[Line chart, 1950-2005: the output gap in percent (left scale) plotted with the unemployment gap (inverted, right scale).]

NOTE: Gray bars in Figures 1, 2, 3, 5, 6, and 8 indicate recession as determined by the National Bureau of Economic Research.

tial version of each variable grows at a constant
rate during each historical business cycle. Rather
than constraining the potential series to follow a
single time trend throughout the entire sample,
the model allows for several time trends, each
beginning at the peak of a business cycle. Defining
the intervals of the time trends using full business
cycles helps to ensure that the trends are estimated
consistently throughout the historical sample.
Most economic variables have distinct cyclical
patterns—they behave differently at different
points in the business cycle. Specifying breakpoints for the trends that occur at different stages
of different business cycles (say, from trough to
peak) would likely provide a misleading view of
the underlying trend.
The cyclic-adjustment equation has the following form:

(1)   log(X) = Constant + α(U − U*) + β_1T_1953 + β_2T_1957 + … + β_8T_1990 + ε,

where X = the series to be cyclically adjusted,
U = unemployment rate,
U* = NAIRU, and
T_i = zero until the business-cycle peak occurring in year i, after which it equals the number of quarters elapsed since that peak.
Equation (1), a piecewise linear regression,
is estimated using quarterly data and ordinary
least squares. Potential values for the series being
adjusted are calculated as the fitted values from
the regression, with U constrained to equal U *.
Setting the unemployment rate to equal the NAIRU
removes the estimated effects of fluctuations in
the business cycle; the resulting estimate gives
the equation’s prediction of what the dependent
variable (X) would be if the unemployment rate
never deviated from the NAIRU. An example of
the results of using the cyclic-adjustment equation is illustrated in Figure 2, which shows TFP
and potential TFP.
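The mechanics of this cyclical adjustment can be sketched as a small piecewise-trend regression. The peak dates and data below are synthetic, so this illustrates the form of equation (1) rather than the CBO's actual estimation.

```python
import numpy as np

# Regress log(X) on (U - U*) and piecewise linear trends starting at
# business-cycle peaks; potential is the fitted value with U set to U*.
def cyclical_adjust(x, u, u_star, peak_quarters):
    n = len(x)
    q = np.arange(n)
    cols = [np.ones(n), u - u_star]
    for peak in peak_quarters:
        cols.append(np.where(q > peak, q - peak, 0.0))  # T_i: quarters since peak i
    Z = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(Z, np.log(x), rcond=None)
    Z_pot = Z.copy()
    Z_pot[:, 1] = 0.0               # impose U = U*: cyclical term removed
    return np.exp(Z_pot @ beta)     # potential level of the series

# Tiny synthetic example with two invented peak dates.
rng = np.random.default_rng(2)
n = 80
u_star = np.full(n, 5.0)
u = u_star + rng.normal(0.0, 1.0, n)
x = np.exp(0.01 * np.arange(n) - 0.02 * (u - u_star)) * 100.0
potential = cyclical_adjust(x, u, u_star, peak_quarters=[20, 50])
```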
One question that arises is when to add a new
trend break to the equation. Typically, we do not
add a new breakpoint immediately after a business
cycle peak because doing so would create, at least
initially, a very short trend segment for the period
after the peak. Such a segment would be subject
to large swings as new data points were added

Figure 2
TFP and Potential TFP

[Line chart, log scale (index, 1996 = 1.0), 1950-2005: TFP plotted with potential TFP.]
to the sample. Because the final segment of the trend is carried forward into the projection, those swings would create instability in our medium-term projections. Consequently, we typically wait until a
full business cycle has concluded before adding
a new break to the trend. For example, the model
does not yet include a break in 2001, though the
addition of one appears to be increasingly likely.
Equation (1) is used for most, but not all,
inputs in the model. One important exception is
the capital input, which does not need to be cyclically adjusted to create a “potential” level because
the unadjusted capital input already represents
its potential contribution to output. Although use
of the capital stock varies greatly during the business cycle, the potential flow of capital services
is always related to the total size of the capital
stock, not to the amount currently being used.
Other exceptions include several variables of
lesser importance that do not vary with the business cycle. Those series are smoothed using the
Hodrick-Prescott filter.
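For reference, the Hodrick-Prescott filter chooses a trend that trades off fit against smoothness of second differences; a direct linear-algebra implementation (with the conventional λ = 1600 for quarterly data) looks like this:

```python
import numpy as np

# The HP trend tau solves (I + lamb * D'D) tau = y, where D takes second
# differences; the cycle is the remainder y - tau.
def hp_filter(y, lamb=1600.0):
    n = len(y)
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lamb * (D.T @ D), y)
    return trend, y - trend          # trend and cyclical components

# Synthetic series: linear trend plus noise.
rng = np.random.default_rng(3)
t = np.arange(100)
y = 0.5 * t + rng.normal(0.0, 2.0, 100)
trend, cycle = hp_filter(y)
```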
As noted earlier, the method for computing
historical values of potential output in the other
sectors of the economy differs slightly from that
used for the NFB sector. In general, the approach
274

J U LY / A U G U S T

2009

1980

1985

1990

1995

2000

2005

is to express real GDP in each of the other sectors
as a function of the primary factor input (either
labor or capital) in that sector and the productivity
of that input. The potential levels of the primary
input and its productivity are cyclically adjusted
using an analog to equation (1) and then combined
to estimate potential output in that sector. The
list below describes how each sector is modeled.
• Farm sector: Potential GDP in this sector is
modeled as a function of potential farm
employment and potential output per
employee.
• Government sector: Potential GDP in this
sector is the sum of potential GDP in the
federal government and state and local
governments. Potential GDP at each level
of government equals the sum of the compensation of general government employees (adjusted to potential) and government
depreciation. Compensation is modeled as a function of total employment and compensation per employee, and depreciation is modeled as a function of the government capital stock.
• Nonprofit sector: Potential GDP in this
sector is modeled as a function of potential

nonprofit employment and potential output
per employee.
• Household sector: Although some of the
GDP in the household sector consists of
the compensation of domestic workers, the
majority is composed of imputed rent on
owner-occupied housing. As such, output
in this sector is composed of a stream of
housing services provided almost entirely
from the capital stock. Potential GDP in this
sector is modeled as a function of the stock
of owner-occupied housing and an estimate
of the productivity of that stock. Similar
to the capital input in the NFB sector, the
housing capital stock is not adjusted to
potential because the unadjusted stock
reflects the potential contribution to output.
For projections of potential output, the same
framework is used for these sectors as is used for
the NFB sector. Given projections of several exogenous variables—of which potential labor force,
potential TFP growth, and the national saving
rate are the most important—the growth model
computes the capital stock endogenously and
combines the factor inputs into an estimate of
potential output. In most cases, projecting the
exogenous variables is straightforward: The CBO
generally extrapolates the trend growth rate from
recent history through the 10-year projection
period. However, the projections for some exogenous variables, most notably the saving rate, are
taken from the CBO economic forecast.
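The projection step can be sketched as a one-sector Solow recursion. The depreciation rate, saving rate, and initial capital below are illustrative, while the 0.7 percent labor and 1.4 percent TFP growth rates echo the projection columns of Table 1.

```python
import numpy as np

# Given exogenous paths for potential labor and TFP plus a saving rate,
# capital accumulates endogenously and potential output follows from the
# production function. Parameter values are illustrative.
ALPHA = 0.3      # capital share
DELTA = 0.06     # annual depreciation rate

def project(k0, labor, tfp, saving_rate):
    k, output = k0, []
    for lab_t, tfp_t in zip(labor, tfp):
        y = tfp_t * k**ALPHA * lab_t**(1.0 - ALPHA)
        output.append(y)
        k = (1.0 - DELTA) * k + saving_rate * y   # endogenous capital stock
    return np.array(output)

years = np.arange(10)
labor = 100.0 * 1.007**years     # potential labor grows 0.7 percent per year
tfp = 1.0 * 1.014**years         # potential TFP grows 1.4 percent per year
path = project(k0=300.0, labor=labor, tfp=tfp, saving_rate=0.2)
```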

Advantages and Disadvantages of the
CBO Method
The CBO method for estimating and projecting potential output has several key advantages.
First, it looks explicitly at the supply side of the
economy. Potential output is a measure of productive capacity, so any estimate of it is likely to
benefit from explicit dependence on factors of
production. For example, if growth in the available pool of labor increases, then this method
will show an acceleration in potential output (all
other things being equal). With our approach, an
increase in investment spending would also be
reflected in faster growth in productive capacity.
Another advantage of a growth model is that it
allows for a transparent accounting of the sources
of growth. Such a growth-accounting exercise,
which divides the growth of potential GDP into
the contributions from each of the factor inputs, is
especially useful when explaining the factors that
caused a change to CBO projections. A growth-accounting exercise for our current projection is
shown in Table 1.3 The table displays the growth
rates of potential output and its components for
the overall economy and the NFB sector. Note that
the growth rates of the factor inputs (top and middle panels of the table) are not weighted; they do
not sum to the growth in potential output.
A third advantage of using a growth model
to calculate potential output is that it supplies a
projection for potential output that is consistent
with the CBO projection for the federal budget.
That consistency allows the CBO to incorporate
the effects of changes in fiscal policy into its
medium-term (10-year) economic and budget
projections. Fiscal policy has obvious effects on
aggregate demand in the short run, effects that are
reflected in our short-term forecast. However,
fiscal policy will also influence the growth in
potential output over the medium term through its
effect on national saving and capital accumulation.
Because the growth model explicitly includes
capital as a factor of production, it captures that
effect.
Table 1 also shows the contribution of each
factor input to the growth of potential output in
the NFB sector by weighting each input’s growth
rate by its coefficient in the production function.
The sum of the contributions equals the growth
of potential output in the NFB sector. Computing
the contributions to growth highlights the sources
of any quickening or slowdown in growth. For
example, the CBO estimates that potential output
in the NFB sector grew at an average annual rate
of 3.3 percent during the 1982-90 period and 3.5
percent during the 1991-2001 period. That acceleration can be attributed to faster growth in the
capital input (which contributed 1.2 percentage
points to the growth of potential output during
the first period and 1.4 percentage points in the
3 See CBO (2008).
Table 1
Key Assumptions in the CBO's Projection of Potential Output

                                            Average annual growth (%)                      Projected average annual growth (%)
                                        1950-  1974-  1982-  1991-  2002-   Total,         2008-  2014-  Total,
                                        73     81     90     2001   2007*   1950-2007*     2013   2018   2008-2018

Overall economy
Potential output                        3.9    3.2    3.1    3.1    2.7     3.4            2.5    2.4    2.4
Potential labor force                   1.6    2.5    1.6    1.2    1.1     1.6            0.8    0.5    0.7
Potential labor force productivity†     2.3    0.7    1.4    1.9    1.6     1.8            1.6    1.9    1.7

Nonfarm business sector
Potential output                        4.0    3.6    3.3    3.5    3.0     3.6            2.8    2.8    2.8
Potential hours worked                  1.4    2.3    1.7    1.1    1.0     1.5            0.7    0.4    0.6
Capital input                           3.8    4.2    4.1    4.6    2.5     3.9            2.9    3.5    3.2
Potential TFP                           1.9    0.7    0.9    1.3    1.5     1.4            1.4    1.4    1.4
  Potential TFP excluding adjustments   1.9    0.7    0.9    1.3    1.3     1.4            1.3    1.3    1.3
  TFP adjustments                       0.0    0.0    0.0    0.1    0.2     ‡              0.1    0.1    0.1
    Price measurement§                  0.0    0.0    0.0    0.1    0.1     ‡              0.1    0.1    0.1
    Temporary adjustment¶               0.0    0.0    0.0    ‡      ‡       ‡              0.0    0.0    0.0

Contributions to the growth of potential output (percentage points)
Potential hours worked                  1.0    1.6    1.2    0.8    0.7     1.0            0.5    0.3    0.4
Capital input                           1.1    1.3    1.2    1.4    0.8     1.2            0.9    1.0    1.0
Potential TFP                           1.9    0.7    0.9    1.3    1.5     1.4            1.4    1.4    1.4
Total contributions                     4.0    3.6    3.3    3.5    2.9     3.6            2.8    2.8    2.8

Potential labor productivity
in the NFB sector#                      2.6    1.3    1.6    2.4    1.9     2.1            2.1    2.3    2.2

NOTE: Data are for calendar years. Numbers in the table may not add up to totals because of rounding.
*Values as of August 22, 2008.
†The ratio of potential output to the potential labor force.
‡Between zero and 0.05 percent.
§An adjustment for a conceptual change in the official measure of the GDP chained price index.
¶An adjustment for the unusually rapid growth of TFP between 2001 and 2003.
#The estimated trend in the ratio of output to hours worked in the NFB sector.
SOURCE: CBO.


second) and faster growth of potential TFP (which
contributed 0.9 and 1.3 percentage points to the
growth of potential output during the two periods,
respectively). Faster growth in those two factors
more than offset a slowdown in potential hours
worked between the two periods. This point is
addressed later.
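The weighting arithmetic can be checked against Table 1's 1991-2001 column; the 0.7/0.3 weights are illustrative stand-ins for the production-function coefficients.

```python
# Growth accounting for the NFB sector, 1991-2001, using Table 1's growth
# rates. Each input's growth rate is weighted by its (illustrative)
# production-function coefficient, and the contributions sum to potential
# output growth.
ALPHA_LABOR, ALPHA_CAPITAL = 0.7, 0.3

hours_growth = 1.1      # potential hours worked, percent per year
capital_growth = 4.6    # capital input, percent per year
tfp_growth = 1.3        # potential TFP, percent per year

contrib_hours = ALPHA_LABOR * hours_growth        # about 0.8 percentage points
contrib_capital = ALPHA_CAPITAL * capital_growth  # about 1.4 percentage points
output_growth = contrib_hours + contrib_capital + tfp_growth  # about 3.5
```

The weighted contributions line up with the figures quoted in the text (0.8 and 1.4 percentage points) and sum to roughly the 3.5 percent growth of potential NFB output.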
Fourth, by using a disaggregated approach,
the CBO method can reveal more insights about
the economy than a more-aggregated model would.
For example, the model calculates the capital
input to the production function as a weighted
average of the services provided by seven types
of capital. Those data indicate a shift over the
past few decades to capital goods with shorter
service lives: A larger share of total fixed investment is going to producers’ durable equipment
(PDE) relative to structures, and a larger share of
PDE is going to computers and other information
technology (IT) capital. Because shorter-lived
capital goods depreciate more rapidly, the shift
toward PDE and IT capital increases the share of
investment dollars used to replace worn-out capital and tends to lower net investment and the
capital input. Shorter-lived capital goods are also
more productive per year of service life than those
that last longer and are therefore weighted more
heavily in the growth model’s capital input. A
model that ignores the capital input or that does
not disaggregate capital is likely to miss both of
those effects.
On the negative side, the simplicity of our
model could be perceived as a drawback. The
model uses some parameters—most notably, the
coefficients on labor and capital in the production
function—that are imposed rather than econometrically estimated. Although that approach is
standard practice in the growth-accounting literature (in part because it has empirical support),
it is tantamount to assuming the magnitude of
the contribution that each factor input makes to
growth. With such an approach, the magnitude
of that contribution will not change from year to
year as the economy evolves, as it would in an
econometrically estimated model. Moreover, it
requires some strong assumptions that may not
be consistent with the data.
A second disadvantage of using a growth
model to estimate potential output is that including the capital stock introduces measurement
error. Most economic variables are subject to measurement error, but the problem is particularly
acute for capital, for two basic reasons. First, measuring the stock of any particular type of capital
is difficult because depreciation is hard to define
or measure. Purchases of plant and equipment can
be tallied to produce a historical series for investment, but no corresponding source of data exists
for depreciation. Second, even if accurate estimates of individual stocks were available, aggregating them into a single index would be difficult
because capital is heterogeneous, differing with
respect to characteristics such as durability and
productivity.4
A third point of contention regarding the CBO
approach is the use of deterministic time trends
to cyclically adjust many variables in the model.
Some analysts assert that relying on fixed time
trends provides a misleading view of the cyclical
behavior of some economic time series. They
argue, on the basis of empirical studies of the
business cycle, that using variable rather than
fixed time trends is more appropriate for most
data series.5 However, the evidence on this point
is mixed—it is very difficult to determine whether
the trend in a data series is deterministic or stochastic using existing econometric techniques—
and the methods used to estimate stochastic trends
often yield results that are not useful for estimating potential output. That is, stochastic trends
tend to produce estimates of the output gap that
are not consistent with other indicators of the
business cycle.
Fourth, the CBO growth model is based on
an estimate of the amount of slack in the labor
market, which in turn requires an estimate of the
natural rate of unemployment or the NAIRU. Such
4 The CBO capital input uses capital stock estimates (and the associated assumptions about depreciation) from the Bureau of Economic Analysis and uses an aggregation equation that is based on the approach used by the Bureau of Labor Statistics (BLS) to construct the capital input that underlies the multifactor productivity series. The CBO estimate of the capital input is quite similar to that calculated by the BLS.
5 See, for example, Stock and Watson (1988).
estimates are highly uncertain. Few economists
would claim that they can confidently identify
the current NAIRU to within a percentage point.
Our method is not very sensitive to possible
errors in the average level of the estimated NAIRU,
but it is sensitive to errors in identifying how that
level changes from year to year.
Finally, the CBO model does not contain
explicit channels of influence for all major effects
of government policy on potential output. For
example, it does not include an explicit link
between tax rates and labor supply, productivity,
or the personal saving rate; nor does it include
any link between changes in regulatory policy
and those variables. However, that does not mean
that the model precludes a relationship between
policy changes and any of those variables. If a
given policy change is estimated to be large
enough to affect the incentives governing work
effort, productivity, or saving, then those effects
can be included in our projection or in a policy
simulation by adjusting the relevant variable in
the model. For example, changes in marginal tax
rates have the potential to affect labor supply.
Because the Solow model does not explicitly
model the factors that affect the labor input, our
model includes a separate adjustment to incorporate such effects. Indeed, for the past several
years, such an adjustment has been included in
our model to account for the effects on the labor
supply of the scheduled expiration in 2011 of the
tax laws passed in 2001 and 2003. The structure
of our model makes it easier to isolate (and incorporate) the effects of such policy changes than
would be the case with a time-series–based model.

CHALLENGES ASSOCIATED WITH
ESTIMATING AND PROJECTING
POTENTIAL OUTPUT
Potential output plays several key roles in the
CBO economic forecast and projection. Perhaps
the two most important are estimating the output
gap (the percentage difference between GDP and
potential GDP) and providing a target for the
10-year projection of GDP. Important challenges
are associated with both roles.

Challenges Associated with Estimating
the Output Gap
Any method used to estimate the trend in a
series, including potential output, is subject to
an “end-of-sample” problem, which means that
estimating the trend is especially difficult near
the end of a data sample. In the case of the output
gap, this is usually the period of greatest interest.
Three examples from the period since 2000 illustrate the difficulties associated with estimating
the level of potential output at the end of the
sample period.
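The end-of-sample problem can be made concrete with a toy experiment: estimate a trend in "real time," then re-estimate after more data arrive and compare the two estimates at the same date. The simulated series and the linear-trend fit are illustrative assumptions, not the CBO method:

```python
import random

random.seed(2)

# A simulated (log) output series whose growth rate quietly shifts late
# in the sample; all numbers are invented for illustration.
y, level = [], 0.0
for t in range(120):
    level += 0.005 if t < 100 else 0.012   # trend steepens at t = 100
    y.append(level + random.gauss(0, 0.005))

def trend_at(series, t):
    """Value at time t of an OLS linear trend fitted to `series`."""
    n = len(series)
    ts = list(range(n))
    mt, my = sum(ts) / n, sum(series) / n
    b = (sum((ti - mt) * (yi - my) for ti, yi in zip(ts, series))
         / sum((ti - mt) ** 2 for ti in ts))
    a = my - b * mt
    return a + b * t

# Trend at t = 99 estimated in "real time" (data through t = 99), and
# again after 20 more observations arrive: the end-point estimate moves.
early = trend_at(y[:100], 99)
later = trend_at(y, 99)
print(round(later - early, 4))
```

The revision is purely a consequence of where the sample ends; nothing about the economy at t = 99 changed, only the data available for estimating its trend.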
Potential Labor Force. Fundamentally, the
number of hours worked in the economy is determined by the size of the labor force, which, in
turn, is largely influenced by two factors: growth
in the population and the rate of labor force participation. Neither of those series is especially
sensitive to business cycle fluctuations, but both
are subject to considerable low-frequency variation. The discussion here focuses on how the rate
of labor force participation has changed during
recent years and how we have modified the CBO
labor force projections as a result.
After a long-running rise that started in the
early 1960s, the labor force participation rate
plateaued at about 67 percent of the civilian population during the late 1990s, declined sharply
between 2000 and 2003, and varied in a narrow
range near 66 percent between 2003 and 2008
(Figure 3). Had that decline in the participation
rate not occurred, the labor force would have
had approximately 2.3 million more workers in
2008 than it actually did.
In assessing the impact on potential output, the
challenge during the early 2000s was to determine
whether the decline in the participation rate was
cyclical (i.e., workers had dropped out of the labor
force because their prospects of getting a job
were dim) or structural (i.e., prospective labor
force participants had weighed the alternatives
and found that options such as education, retirement, or child-rearing were more attractive). If
the decline were due to cyclical reasons, then the
dip in participation should not be reflected in the
estimate of potential labor force. If the decline
were due to structural reasons, however, then the
estimates of potential labor force and potential output should be lowered to reflect the decreased size of the potential workforce.

Figure 3
Labor Force Participation Rate
(Percent, 1950-2008)

The drop in the participation rate also complicated the interpretation of movements in the unemployment rate, which peaked at 6.1 percent in mid-2003 and declined thereafter. During 2006 and 2007, the unemployment rate was below 5 percent, which suggested considerable tightness in the labor market. However, the decline in the participation rate implied that there existed a pool of untapped labor that could have been drawn into the workforce had there been a significant speedup in the pace of job creation. Consequently, at that time, the unemployment rate probably understated the degree of slack that existed in the labor market. Indeed, in the early stages of the expansion following the 2001 recession, we projected that the participation rate would recover as job creation picked up. It never did, though, and the CBO has since concluded that the decline in the participation rate was more structural than cyclical.6

6. That conclusion was based on an analysis of the factors affecting the participation rates of various demographic subgroups in the population; see CBO (2004).

FEDERAL RESERVE BANK OF ST. LOUIS REVIEW
Potential NFB Employment. The second challenge is associated with the behavior of employment since the end of the 2001 recession and its
implications for the estimate of potential hours
worked in the NFB sector. One striking feature of
the economic landscape since the 2001 business
cycle trough is very weak growth in employment,
especially for measures derived from the Bureau
of Labor Statistics’ establishment survey. For
example, since the trough in the fourth quarter
of 2001, growth in nonfarm payroll employment
averaged 0.8 percent at an annual rate, which
means that payrolls were roughly 5 percent higher
in the second quarter of 2008 than they were at
the end of the 2001 recession. However, based on
patterns in past cycles, one would have expected
much faster growth in payroll employment—2.4
percent on average—and a much higher level of
employment—17 percent higher than its trough
value—by the second quarter of 2008 (Figure 4).
A similar pattern holds for employment in
the NFB sector (which differs from the headline
payroll number by excluding employees in private households and nonprofit institutions and
including proprietors). In the second quarter of
2008, NFB employment was about 4 percent above
its level at the end of the 2001 recession. Had it grown according to the pattern seen in a typical business cycle expansion, it would have been about 15 percent higher than its level at the trough of the recession.

Figure 4
Payroll Employment in the Current Expansion Compared with an “Average” Cycle
(Percent difference from trough value, by quarters after trough; series shown: average business cycle and current cycle)

Figure 5
NFB Sector Employment as a Percent of the Civilian Labor Force
(Percent, 1960-2008)

Figure 6
NFB Employment as a Percent of the Civilian Labor Force and Two Counterfactual Paths
(Percent, 1960-2008; Counterfactual I assumes an average recovery and expansion, Counterfactual II a 1990s-style recovery and expansion; both shown against the actual path)
The behavior of NFB employment since the
business cycle peak in 2001 also looks very
unusual when viewed from another perspective.
When measured as a share of the labor force
(which controls for the decline in the rate of labor
force participation), NFB employment barely
grew during the expansion that followed the 2001
recession (Figure 5). This is extremely unusual
on two counts. First, it departs from the very
strong procyclical pattern seen in most recovery
and expansion periods. Typically, NFB sector
employment grows much faster than the labor
force during business cycle expansions, which
causes a rapid increase in its share. Second, the
recent behavior breaks with the long-standing
upward trend in the NFB share of the labor force.
Since roughly the mid-1970s, trend growth in NFB
employment has exceeded trend growth in the
labor force on average, leading to a steady increase
in the share. Examining the peaks is a rough-and-ready
way to control for business cycle variation:
The share increased from about 74 percent in
1973, to just under 75 percent in 1979, to just
over 76 percent in 1989, to just under 80 percent
in 1999.
After the trough in 2001, the share of NFB
employment declined for another two years and
then increased somewhat but not anything like
a normal cyclical rebound. The reasons for this
behavior are not fully clear—shifts of employment
to other sectors, including government and nonprofit institutions, can explain only part of the
shortfall—but it has important implications for
the estimate of potential employment and hours
worked. Specifically, the estimate of potential
employment in the NFB sector is much lower than
it would have been had actual NFB employment
followed a more typical cyclical pattern since 2001.
To illustrate this point, consider what the NFB
employment share would have looked like had it
followed a more typical cyclical pattern. Figure 6
shows NFB employment as a share of the labor
force along with two counterfactual paths for the
share. The thin solid line shows the evolution of
the NFB employment share had it grown since
2001 at the same rate as an “average” historical
expansion. That path embodies much stronger

employment growth than does the actual path
and would imply a much higher level of potential
employment as well. Arguably, that path is too
strong, given that employment growth has been
sluggish in the recoveries that followed the past
two recessions. So the figure includes a second
counterfactual path (dotted line) showing the
evolution of the NFB employment share had it
grown as it did during the expansion of the 1990s.
It too implies much stronger employment growth
than actually occurred.
For the first few years of the current business
cycle, it was reasonable to expect a typical rebound
in the NFB employment share, even if it was
delayed relative to past expansions. If so, then the
period of sluggish growth in NFB employment
could be interpreted as a cyclical pattern and
would not necessarily imply that the level of
potential NFB employment was lower. However,
as the period of sluggish growth grew longer and
in light of the possibility of a business cycle peak
in early 2008, the position that NFB employment
would eventually rebound became increasingly
untenable.
Instead, it seems increasingly likely that NFB
employment will merely match the growth in the
labor force in the future, rather than grow at a faster
pace. One implication of that interpretation is that
the experience of the late 1990s, when the NFB
employment share of the labor force was very
high, was unusual and is unlikely to be repeated.
Changes in the Phillips Curve and NAIRU.
As noted previously, the natural rate of unemployment is an important input in the CBO model. It
serves as the benchmark used to estimate the
potential values of the factor inputs and, consequently, potential output. Any uncertainties
associated with the size of the unemployment
gap, or difference between the unemployment
rate and the natural rate, will translate directly
into uncertainty about the size of the output gap.
Our estimate of the natural rate, known as
the NAIRU, is based on a standard Phillips curve,
which relates changes in inflation to the unemployment rate, expected inflation, and various
supply shock variables. In particular, the NAIRU
estimate relies on the existence of a negative correlation between inflation and unemployment: If

inflation tends to rise when the unemployment
rate is low and tends to fall when the unemployment rate is high, then there must be an unemployment rate at which there is no tendency for
inflation to rise or fall. This does not mean that
the rate is stable or that it is precisely estimated,
just that it must exist.
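That existence argument can be made concrete with a stylized accelerationist Phillips curve, Δπ_t = a + b·u_t + ε_t, estimated by ordinary least squares: the implied NAIRU is the unemployment rate at which fitted Δπ equals zero, i.e., -a/b. All numbers below are simulated, not CBO estimates:

```python
import random

random.seed(0)

# Simulate data from an accelerationist Phillips curve:
# change in inflation = slope * (u - NAIRU) + noise.
true_nairu, slope = 5.0, -0.5
u = [random.uniform(3.0, 8.0) for _ in range(200)]      # unemployment rate
dpi = [slope * (x - true_nairu) + random.gauss(0, 0.3) for x in u]

# Simple OLS of dpi on u: dpi = a + b*u.
n = len(u)
mu, md = sum(u) / n, sum(dpi) / n
b = (sum((x - mu) * (y - md) for x, y in zip(u, dpi))
     / sum((x - mu) ** 2 for x in u))
a = md - b * mu

# Implied NAIRU: the unemployment rate at which inflation
# neither rises nor falls.
nairu_hat = -a / b
print(round(nairu_hat, 2))  # close to the true value of 5.0
```

With a flatter slope b (as in the late sample discussed below), the same noise translates into a much wider band around -a/b, which is one reason a flatter Phillips curve makes the NAIRU harder to pin down.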
However, during the past 20 or so years, significant changes in how the economy functions
have affected the relationship between inflation
and unemployment and, consequently, estimates
of the Phillips curve and the NAIRU. Most notably,
the rate of inflation has been lower and much less
volatile since the mid-1980s, a phenomenon often
referred to as the Great Moderation. At the same
time, the unemployment rate has trended downward, which suggests that the natural rate of
unemployment has declined also. Researchers had
identified several factors that would be expected
to lower the natural rate, including the changing
demographic composition of the workforce,
changes in disability policies, and improved efficiency of the labor market’s matching process.
Based on internal evaluation of those factors, the
CBO began to lower its estimate of the NAIRU for
the period since 1990, overriding the econometric
estimate at that time.7
More recent Phillips curve estimates are consistent with the hypothesis that a change occurred sometime during the past 20 or so years. In a recent working paper, I presented regression results from estimates of several Phillips curve specifications that suggested the presence of significant structural change since the mid-1980s.8 Using the full data sample, from 1955 through 2007, the equations' performance appeared to be satisfactory. They fit the data well and their estimated coefficients had the correct sign, were of reasonable magnitude, and were statistically significant. However, the full-sample regressions masked evidence of a breakdown in performance that began during the mid-1980s. Estimation results from equations that allowed for structural change indicated that the fit of the equations deteriorated and that the coefficients were smaller and less statistically significant during the latter part of the data sample than they were during the earlier part. In general, the results suggest that the NAIRU is lower now than it had been during the period from 1955 through the mid-1980s, a conclusion consistent with evidence from the labor market suggesting a decline in the natural rate. The results also indicate that the Phillips curve has become less useful for predicting inflation.

7. That analysis was later summarized in a CBO paper; see Brauer (2007).

8. See Arnold (2008).

Figure 7
Married-Male Unemployment and the Change in Inflation
(Top panel: early sample, 1957-90; bottom panel: late sample, 1991-2007. Each panel plots the change in inflation, in percentage points, against the married-male unemployment rate, in percent.)
NOTE: The change in inflation is defined as the difference between the quarterly rate of inflation in the personal consumption expenditure (PCE) price index and a 24-quarter moving average of PCE inflation.
However, the relationship between inflation
and unemployment, though not as strong as it
once was, has not collapsed completely. Consider
Figure 7, which plots changes in a measure of
unanticipated inflation against the married-male
unemployment rate. The top panel shows data
from the 1957-90 period, while the bottom panel
shows data from 1991 through 2007.9 Comparing
the two panels reveals four features of the latter
period. First, both graphs show a negative correlation between the two series, so there still appears
to be a tradeoff between inflation and unemployment. Second, the point at which the regression
line intersects the horizontal axis has
moved to the left in the second panel, which is
consistent with the idea that the NAIRU is lower
now than it had been earlier. Third, the slope of
the trend line is lower during the second part of
the sample, which suggests that the inflation-unemployment tradeoff is somewhat flatter during
the second period (i.e., inflation is less responsive
to changes in the unemployment rate). Fourth,
much less variation has occurred in both inflation
and unemployment during the past 20 or so years
than previously.
What do these observations imply for the estimate of potential output? The first observation—
that a negative correlation still exists—means that
there is still an unemployment rate consistent with
a stable rate of inflation. The second observation—
that the NAIRU has declined—implies that the
level of potential output is higher than it would
have been had the NAIRU been constant. This observation also serves as a reminder that structural change in macro equations is a fact of life. It is important to monitor such equations continually to identify how economic events will affect their conclusions. The final two observations imply that Phillips curves, and by extension the NAIRU and potential output, are less useful indicators of inflationary pressure than they once were.

9. The working paper estimated Phillips curve equations using different price indices and used Chow tests to determine when the structural break occurred in each equation. For the personal consumption expenditures price index, the break was found in 1991. The married-male unemployment rate was used in the estimation because it is better insulated from demographic shifts than the overall unemployment rate.

Challenges Associated with Projecting
Potential Output
Potential output is used for more than gauging
the state of the business cycle. It is also used to
set the path for real GDP in the 10-year forecast
that underlies the CBO budget projections. A separate set of challenges is associated with projecting the variables that underlie our estimate of
potential output.
Projecting Labor Productivity I: The Late-1990s' Acceleration. Labor productivity growth
during the late 1990s provides an important example of the challenges associated with projecting
potential GDP.10 The broad outlines of the story
are familiar: After a long period of sluggish growth,
labor productivity accelerated sharply during the
second half of the 1990s and continued to grow
rapidly during the 2000s. Moreover, the upswing
was substantial. Trend growth in labor productivity averaged 2.7 percent between the end of
1995 and the middle of 2008, considerably faster
than the 1.4 percent pace from 1974 to 1995
(Figure 8). Had it followed that pre-1996 trend of
1.4 percent instead of growing as it did, labor
productivity would be 15 percent lower than it
is today. Furthermore, if the 2.7 percent trend is
sustained over the next decade, then the level of
real GDP will be nearly 30 percent higher in 2018
than the level that would have resulted from the
pre-1996 rate of growth.
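The compounding arithmetic behind those figures can be checked directly; the window lengths used here (roughly 12.5 years to mid-2008 and 22.5 years to 2018) are approximations, so the computed gaps only roughly match the rounded figures in the text:

```python
# Compound a 2.7 percent productivity trend against a 1.4 percent trend.
# Window lengths are approximate: end-1995 to mid-2008 is ~12.5 years,
# end-1995 to 2018 is ~22.5 years.
fast, slow = 1.027, 1.014

shortfall_2008 = 1 - (slow / fast) ** 12.5   # level shortfall by mid-2008
gap_2018 = (fast / slow) ** 22.5 - 1         # level gap by 2018

print(f"{shortfall_2008:.1%}")  # roughly 15 percent lower
print(f"{gap_2018:.1%}")        # on the order of 30 percent higher
```

The point of the exercise is how quickly a 1.3-percentage-point growth differential compounds into double-digit level differences over a decade or two.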
One problem for forecasters was that the productivity acceleration was largely unexpected.
In the mid-1990s, few analysts anticipated such
a dramatic increase in the trend rate of growth.
10. In our model, we actually project potential TFP—the projection for potential labor productivity is implied by the projections for potential TFP and capital accumulation.

Figure 8
Labor Productivity Growth and Trend (1950-2008)
(Index, data in logs; series shown: productivity, trend productivity, and the pre-1996 trend extrapolated)
In January 1995, for example, the CBO projected
that labor productivity growth would average
1.3 percent annually for the 1995-2000 period, a
pace similar to the average for the prior 20 years.
The Clinton administration and the Blue Chip
Consensus of private forecasters projected similar
rates of growth.
Another problem for forecasters was that the
productivity surge in the late 1990s went unrecognized until very late in the decade for two basic
reasons. First, labor productivity is fairly volatile,
with growth rates that can swing widely from
quarter to quarter. As a result, a period of two or
three years is a short window within which to
discern a new trend. Moreover, the acceleration
followed a period of subpar growth (productivity
growth averaged only 0.22 percent annually
between the end of 1992 and 1995:Q3); so, initially, the faster growth appeared to be just making
up lost ground rather than establishing a new,
higher trend growth rate. The postwar data sample includes several episodes of faster- or slowerthan-trend productivity growth that were later
reversed.
Second, early vintages of productivity data
for the late 1990s proved to be understated and,
therefore, painted a misleading picture of the
productivity trend. Only after several revisions
did a stronger pattern emerge. Using real-time data
culled from our forecast databases, Table 2 shows
that data available in 1996, 1997, and 1998 showed
only a small rise in productivity growth starting
in late 1995. For example, data available in early
1997 showed labor productivity growing by only
0.3 percent on average from 1995:Q4 through
1996:Q3. The story changes markedly using currently available data: Labor productivity growth
for that period was actually 3 percent.
A similar case holds for 1998 and 1999. Data
available in January 1998 showed labor productivity growth averaging 1.8 percent between
1995:Q4 and 1997:Q3. That rate has since been
revised upward by 0.6 percentage points, to 2.4
percent. The growth rate for the three-year period
ending in 1998:Q3 also has been revised from 2.0
percent (using data from early 1999) to 2.5 percent
(using currently available data).
The information in Table 2 highlights an
important point. Productivity data are revised
frequently, and the revisions can be large enough
to alter analyses of trends in productivity growth.
Indeed, after being revised upward several times
during the late 1990s, productivity data have been
revised downward somewhat during recent years.
Table 2
Changes in Estimates of Average Annual Growth Rate for Labor Productivity
(Average annual rate of growth, %)

Date of forecast   Period            Initial estimate       Current estimate      Revision
                                     (using original data)  (using current data)  (percentage points)
January 1997       1995:Q4–1996:Q3   0.3                    3.0                    2.7
January 1998       1995:Q4–1997:Q3   1.8                    2.4                    0.6
January 1999       1995:Q4–1998:Q3   2.0                    2.5                    0.5
January 2000       1995:Q4–1999:Q3   2.7                    2.4                   –0.3

NOTE: Each forecast is based on productivity data that extend through the third quarter of the previous year. Numbers in the table may not add up to totals because of rounding.
SOURCE: CBO based on data from the BLS.

In January 2000, labor productivity growth for
1995:Q4 to 1999:Q3 was estimated at 2.7 percent;
that estimate has since been revised to 2.4 percent.
The revisions to productivity data highlight
the difficulty in recognizing a change in the underlying trend growth rate and suggest that we should
be circumspect about data series until they have
undergone revision. This is especially true if the
data show a shift in trend (as in the late 1990s)
or if they are not consistent with other economic
indicators.
Projecting Labor Productivity II: Shifting
Sources of Growth. Another aspect of labor productivity growth during the past decade—a shift
in its sources—has complicated the analysis of
trends and made projections difficult. With our
model we can easily divide the growth in labor
productivity into two components: capital deepening (increases in the amount of capital available
per worker) and TFP. Capital per worker can rise
over time not only because investment provides
more capital goods for workers to use, but also
because the quality of those goods improves over
time and investment can shift from assets with
relatively low levels of productivity (e.g., factories)
to those with higher productivity levels (e.g., computers). Because TFP is calculated as a residual—
the growth contributions of labor and capital are
subtracted from the growth in output—any growth
in labor productivity that is not attributed to capital deepening will be assigned to TFP.
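Mechanically, that residual calculation looks like the following sketch, which assumes Cobb-Douglas growth accounting with an illustrative capital share of 0.3; the function and all input numbers are invented for illustration and are not CBO estimates:

```python
# Growth accounting: split labor productivity growth into capital
# deepening and a TFP residual. The capital share (alpha) and all
# growth rates below are illustrative assumptions.

def decompose(output_growth, capital_growth, hours_growth, alpha=0.3):
    """Return (labor productivity growth, capital-deepening contribution,
    TFP residual), all in percentage points."""
    lp_growth = output_growth - hours_growth             # output per hour
    deepening = alpha * (capital_growth - hours_growth)  # K/L contribution
    tfp = lp_growth - deepening                          # residual
    return lp_growth, deepening, tfp

lp, deep, tfp = decompose(output_growth=3.5, capital_growth=4.5, hours_growth=1.0)
print(round(lp, 2), round(deep, 2), round(tfp, 2))  # 2.5 1.05 1.45
```

Because TFP is whatever is left over, anything mismeasured in output or the inputs lands in the residual, which is one reason data revisions can move estimated TFP so much.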

With this in mind, the contributions of capital
deepening and TFP to the growth in labor productivity since 1995 can be calculated. The results
of such a growth-accounting exercise are shown
in Table 3. Those results show that capital deepening was the primary source of the surge in labor
productivity growth in the late 1990s and that
faster TFP growth was the primary source of productivity growth during the period after the
business cycle peak in 2001. Between the early
(1991-95) and the late (1996-2001) part of the
past decade, labor productivity growth stepped
up from about 1.5 percent, on average, to 2.5
percent per year. Growth in capital per worker
accounted for 80 percent (0.8 percentage points)
of that 1-percentage-point increase, according to
our estimates. Faster TFP growth was responsible
for the rest of the step-up in productivity growth,
or about 0.2 percentage points.
Since the 2001 recession, however, the sources
of labor productivity growth have completely
reversed. Business investment fell substantially
in 2001 and 2002 and remained weak in 2003,
thus slowing the growth in capital deepening
relative to that in the late 1990s. Consequently,
the contribution of capital per worker to labor
productivity growth fell by 0.7 percentage points
between 2001 and 2005 relative to the 1996-2001
period. At the same time, however, TFP growth
was accelerating sharply, especially in 2003. The
CBO estimates that TFP was responsible for all
of the acceleration in labor productivity in the 2001-06 period.

Table 3
Contributions of Capital Deepening and TFP to Labor Productivity Growth (1990-2006)

                                      Average annual growth rate         Change
                                      1991-95  1996-2001  2002-06    1991-95 to   1996-2001 to
                                                                     1996-2001    2002-06
Contribution of capital deepening
(percentage points)                    0.50     1.33       0.62       0.83        –0.72
Contribution of TFP growth
(percentage points)                    1.04     1.21       2.07       0.18         0.86
Labor productivity (%)                 1.54     2.54       2.65       1.00         0.11

NOTE: Numbers in the table may not add up to totals because of rounding.
SOURCE: CBO using data from the BLS and Bureau of Economic Analysis.
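As a quick consistency check, the rounded contributions reported in Table 3 can be verified to sum to labor productivity growth within rounding error (the 0.05-percentage-point tolerance is an assumption about rounding, not a CBO figure):

```python
# Verify that the rounded Table 3 contributions (capital deepening, TFP)
# sum to labor productivity growth within rounding.
rows = {
    "1991-95":   (0.50, 1.04, 1.54),
    "1996-2001": (1.33, 1.21, 2.54),
    "2002-06":   (0.62, 2.07, 2.65),
}
for period, (deepening, tfp, total) in rows.items():
    gap = abs((deepening + tfp) - total)
    assert gap <= 0.05, (period, gap)
print("contributions sum to totals within rounding")
```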
A natural question is whether labor productivity will grow as rapidly over the next 10 years
as during the past decade. But the experience since
1995 illustrates why that question is so hard to
answer. Labor productivity growth is volatile, its
measurement is subject to large revisions, and the
reasons for changes in its rate of growth are not
well understood. Consequently, it is a difficult
variable to forecast; past patterns and recent data
provide only a rough guide to future labor productivity. Explanations for the recent acceleration
help to determine whether any of the changes to
growth since 1995 will reverse or recur in the
next 10 years.
Projecting Labor Productivity III: Explaining
the Acceleration. Although it is hard to say conclusively that one factor is the sole cause of the
post-1995 acceleration in productivity growth,
most economists point to IT as the primary source.
This case is easiest for the late 1990s and more
difficult for the period since 2001. As noted previously, the majority of the productivity acceleration for 1996-2000 can be attributed to capital
deepening, which was one result of a huge increase
in business investment. During the late 1990s, not
only did investment boom, but it was heavily
tilted toward IT capital (Figure 9). The CBO
estimates suggest that faster capital deepening
accounted for 80 percent of the upswing in labor
productivity growth during the late 1990s and

that IT capital accounted for 75 percent of the
contribution from capital deepening.
In addition, it appears that rapid technological
change in IT industries (including computers,
software, and telecommunications) caused faster
TFP growth in those industries. It also appears
that the pace of technological change was fast
enough, and those industries were large enough,
for faster TFP growth in that sector of the economy
to support overall TFP growth during the late
1990s.11 However, because overall TFP growth did not
accelerate during the late 1990s, it appears that
faster TFP growth in the IT sectors merely offset
slower TFP growth elsewhere.
It is somewhat harder to make the case that IT
spending was the primary source of the continued
rapid growth in labor productivity since the business cycle peak in 2001. One obvious problem
with this explanation is that spending on IT capital
collapsed after 2000, which strongly suggests that
IT capital was not the reason for the continued
surge. According to our estimates, nearly 80 percent of the post-2001 growth in labor productivity
can be attributed to TFP, with only 20 percent
accounted for by capital deepening.
11. According to estimates by Oliner and Sichel (2000), for example, the computer and semiconductor industries accounted for about half of TFP growth from 1996 through 1999, even though those industries composed only about 2.5 percent of GDP in the NFB sector during those years.

Figure 9
Investment in Producers' Durable Equipment
(Percentage of GDP, 1960-2008; series shown: PDE and PDE excluding information technology)

Despite those estimates, the continued growth in labor productivity could still be the result of
IT spending if a lag exists between the time when
the capital is installed and when businesses
achieve productivity gains. Several theories, not
necessarily mutually exclusive, have been proposed to explain why such a delay could occur.
They include the possibility that there are adjustment costs associated with large changes in the
capital stock; the possibility that computers are
an example of a general-purpose technology, like
dynamos and electric motors, that fundamentally
change the way businesses operate but take time
to produce results; and the possibility that there
is a link between IT spending and investment in
intangible capital, which refers to spending that
is intended to increase future output more than
current production but does not result in ownership of a physical asset. As computing power
becomes cheaper and more pervasive, managers
can invent new business processes, work practices, and organizational structures, which in turn
allow companies to produce entirely new goods
and services or to improve their existing products’
convenience, quality, or variety.
All of these theories could explain the increase
in TFP growth. However, all would be expected
to have a gradual effect on TFP, raising the growth
rate by a small amount over an extended period.
In fact, the TFP data display a very steady trend
during the 1980s, 1990s, and early 2000s; then a
very abrupt increase, occurring entirely in 2003;
and then a return to the previous growth trend
thereafter (Figure 10). This behavior is somewhat
puzzling and hard to reconcile with explanations
that rely on a lagged impact of IT spending during
the late 1990s. We interpret the abrupt increase
as a one-time boost to productivity engendered
by the IT revolution—the burst of investment in
IT capital allowed firms to raise their efficiency
to a higher level but not to permanently increase
the rate of productivity growth. Our estimate of
potential TFP includes an adjustment that temporarily raises its growth rate to include a level
shift similar to that shown in Figure 10.
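Such an adjustment amounts to fitting a trend line with a one-time step dummy. The sketch below uses synthetic data with made-up magnitudes (a 1.5 percent trend and a 2 percent level shift in 2003, not CBO estimates) and recovers both pieces by least squares:

```python
import numpy as np

# Synthetic log-TFP: steady trend growth plus a one-time level shift in 2003,
# mimicking the pattern described in the text. Magnitudes are illustrative.
years = np.arange(1980, 2009)
rng = np.random.default_rng(0)
log_tfp = (0.015 * (years - 1980)            # 1.5 percent trend growth
           + 0.02 * (years >= 2003)          # 2 percent level shift in 2003
           + rng.normal(0.0, 0.002, years.size))

# Regressors: constant, linear trend, and a step dummy for 2003 onward.
X = np.column_stack([np.ones(years.size),
                     (years - 1980).astype(float),
                     (years >= 2003).astype(float)])
const, growth, shift = np.linalg.lstsq(X, log_tfp, rcond=None)[0]
```

The fitted `growth` stays close to the pre-shift trend while `shift` absorbs the 2003 jump, which is the sense in which the investment burst raised the level of efficiency without raising trend growth.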

CONCLUSION
Potential output is a difficult variable to estimate largely because it is an unobservable concept.
There are many ways to compute the economy’s
productive potential. Some methods rely on
purely statistical techniques. Others, including
the CBO method, rely on statistical procedures

Figure 10
TFP and Trend (1980-2008)
Index (1996 = 1.0)
[Chart omitted: TFP and trend TFP, 1980-2008.]
NOTE: Data are adjusted to exclude the effects of methodological changes in the measurement of prices.

grounded in economic theory. However, all of
the methods have difficulty estimating the trend
in GDP near the end of the data sample, which is
usually the period of greatest interest. Because the
trend at the end of the data sample is the trend
that is projected into the future, any errors in
estimating the end-of-sample trend will be carried
forward into the projection. The process is further
complicated by factors that alter the interpretation
of recent economic events, including data revisions and structural change.
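The end-of-sample difficulty is easy to demonstrate with a purely statistical detrender such as the Hodrick-Prescott filter; this is only an illustration of the endpoint problem, not the CBO's procedure:

```python
import numpy as np

def hp_trend(y, lam=1600.0):
    """Hodrick-Prescott trend: minimizes sum((y - tau)^2) + lam * sum((d2 tau)^2).
    Closed form: tau = (I + lam * K'K)^(-1) y, where K takes second differences."""
    n = len(y)
    K = np.zeros((n - 2, n))
    for i in range(n - 2):
        K[i, i:i + 3] = [1.0, -2.0, 1.0]
    return np.linalg.solve(np.eye(n) + lam * K.T @ K, np.asarray(y, dtype=float))

# A random-walk-with-drift "log GDP" series, 120 quarters.
rng = np.random.default_rng(1)
y = np.cumsum(0.5 + rng.normal(0.0, 1.0, 120))

t_short = hp_trend(y[:-1])   # trend estimated before the last quarter arrives
t_full = hp_trend(y)         # trend re-estimated with one more observation

# The revision to the trend at a given date is typically far larger near the
# end of the sample than in the middle of it.
end_revision = abs(t_full[-2] - t_short[-1])
mid_revision = abs(t_full[60] - t_short[60])
```

Because the filtered trend at the endpoint leans heavily on the most recent observations, each new data point revises it, and that revised endpoint is exactly what gets projected forward.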
In addition to describing the CBO method and
highlighting the pros and cons of our approach,
this paper describes how we dealt with some
developments during the past several years that
complicated estimation of potential output. As a
general principle, we try to make our estimate
of potential output as objective as possible, but
as this review of recent problems indicates, estimating potential GDP in real time often involves
weighing contradictory evidence. Deciding
whether or not, or how much, to change a trend
growth rate for TFP, for example, often has a
large effect on the estimate of potential for the
medium term.

This review demonstrates that the economic
landscape is continually changing and that estimates of the trend in any variable, including
potential GDP, are affected by those changes.
Oftentimes, what looks like a new trend in a series
disappears after successive revisions. This factor
argues for a conservative approach to estimating
such trends and being judicious about changes
in those trends.

REFERENCES

Arnold, Robert. “Reestimating the Phillips Curve and the NAIRU.” Working Paper 2008-06, Congressional Budget Office, August 2008; www.cbo.gov/ftpdocs/95xx/doc9515/2008-06.pdf.

Brauer, David. “The Natural Rate of Unemployment.” Working Paper 2007-06, Congressional Budget Office, April 2007; www.cbo.gov/ftpdocs/80xx/doc8008/2007-06.pdf.

Congressional Budget Office. CBO’s Method for Estimating Potential Output: An Update. Washington, DC: Government Printing Office, August 2001; www.cbo.gov/ftpdocs/30xx/doc3020/PotentialOutput.pdf.

Congressional Budget Office. “CBO’s Projections of the Labor Force.” CBO Background Paper, September 2004; www.cbo.gov/ftpdocs/58xx/doc5803/09-15-LaborForce.pdf.

Congressional Budget Office. The Budget and Economic Outlook: An Update. Washington, DC: Government Printing Office, September 2008; www.cbo.gov/ftpdocs/97xx/doc9706/09-08-Update.pdf.

Okun, Arthur M. “Potential GNP: Its Measurement and Significance,” in The Political Economy of Prosperity. Appendix. Washington, DC: Brookings Institution, 1970, pp. 132-45.

Oliner, Steven and Sichel, Daniel. “The Resurgence of Growth in the Late 1990s: Is Information Technology the Story?” Journal of Economic Perspectives, Fall 2000, 14(4), pp. 3-22.

Stock, James and Watson, Mark. “Variable Trends in Economic Time Series.” Journal of Economic Perspectives, Summer 1988, 2(3), pp. 147-74.


Commentary
Robert J. Tetlow

Robert Arnold (2009) clearly and completely lays out the approach used by the Congressional Budget Office (CBO) for measuring potential output and discusses the limitations therein.
In this commentary, I revisit arguments made by the authors and discussants of a paper on this same subject at a 1978 Carnegie-Rochester conference to show how little the CBO methodology differs from methods used 30 years ago. I then speculate on why current methods have been impervious both to the critiques made at that conference and to the econometric developments in the years since.
The measurement of potential output clearly
matters, and matters even more in real time, at
least for some decisionmakers. The growth rate of
potential pins down the tax base for fiscal authorities and lawmakers; it provides a baseline for
GDP growth for economic forecasters; and it helps
establish a benchmark for policymakers and financial market participants to interpret the real-time
data.1 The level of potential defines the point to
which the economy is expected to gravitate over
the medium term and so is important for monetary
authorities, forecasters, and anyone who needs
to interpret business cycles. I review why and for
whom it matters and critique the methods used
by the CBO. The CBO methodology is not unique
to that institution; rather, it is my impression that

a number of other, large macroeconomic forecast
teams around the world use broadly similar tools.
To the extent this is true, this critique is germane
to a broader set of model builders and users.
After I provide some background, my comments get more specific. I argue that issues of
econometric identification limit the confidence
with which we can approach the CBO estimates;
I argue against the widespread use of deterministic
time trends, particularly in the real-time context;
and I question the uncritical application of Okun’s
law.

WHITHER POTENTIAL?
Who needs potential output measures and for
what reason? One way of illustrating this question
from the perspective of a policymaker is to refer
to a simple forecast-based Taylor rule, like the
one shown below:
     ∂rr*/∂Δy* ≥ 1         ∂E_t π_{t+1}/∂y* < 0        ∂(y − y*)/∂y* < 0
(1)  R_t  =  rr*  +  φ_π · E_t π_{t+1}  +  φ_y · (y_t − y_t*)  +  u_t ,

1. An example of the latter is the recent decline in labor force participation. Until a few years ago, sustainable employment growth from the establishment survey was estimated at around 120,000 per month, and levels below that would have been interpreted as foreshadowing possible easing in monetary policy and increases in bond prices. The work of Aaronson et al. (2006) showed that sustainable additions to employment are probably much lower now than before.

Robert J. Tetlow is a senior economist in the Division of Research and Statistics at the Federal Reserve Board. The original discussion slides
from the conference are available at the online version of this Review article. These slides—but not this text—use Federal Reserve Board/U.S.
model vintages and associated databases and Greenbook forecast and historical databases (the latter of which under Federal Reserve Board rules
are permissible for use only for forecasts currently dated before December 2002). The author thanks Peter Tulip, David Reifschneider, and Joyce
Zickler for useful comments and Trevor Davis for help with the presentation slides.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 291-96.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


where R is the nominal federal funds rate, rr is
the real funds rate, π is inflation, y is (the natural
logarithm of) real gross domestic product (GDP),
and u is a stochastic term. The asterisks on the
real rate and on output represent “potential”
(or “natural”) levels; these natural levels are not
observable. The coefficients, φ_π and φ_y, would
normally be expected to be positive. The partial
derivatives above the equation itself show how
changes in potential output affect the rule and
hence decisionmaking. Starting with the term farthest to the right, an increase in the level of potential—that is, ∂y* > 0—decreases estimates of the
output gap, y – y*, all else equal. Higher potential
would also reduce expected future inflation—
E_t π_{t+1}—because smaller gaps usually mean less
inflation and both of these would be expected to
lead to a lower federal funds rate. An increase in
the growth rate of potential—∂(Δy*) > 0—raises
the equilibrium real interest rate, which would
call for an increase in the funds rate, all else equal,
but it would also have complex, model-dependent
effects on current and future output gaps and
inflation.2
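The arithmetic of equation (1) can be sketched directly; the coefficient values below are conventional illustrative choices (φ_π = 1.5, φ_y = 0.5), not values from the text, and the inflation target is normalized out as in footnote 3:

```python
# Evaluate the forecast-based rule in equation (1) and show how raising the
# level of potential output lowers the prescribed funds rate, all else equal.
def taylor_rate(rr_star, exp_infl, y, y_star, phi_pi=1.5, phi_y=0.5, u=0.0):
    """R_t = rr* + phi_pi * E_t[pi_{t+1}] + phi_y * (y_t - y_t*) + u_t."""
    return rr_star + phi_pi * exp_infl + phi_y * (y - y_star) + u

base = taylor_rate(rr_star=2.0, exp_infl=2.0, y=0.0, y_star=0.0)              # 5.0
higher_potential = taylor_rate(rr_star=2.0, exp_infl=2.0, y=0.0, y_star=1.0)  # 4.5
```

Raising y* by 1 percentage point widens the (negative) gap and, through φ_y, lowers the prescribed rate by half a point; the further effects through rr* and expected inflation discussed in the text are model dependent and not captured in this static sketch.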
What complicates this is that the only observables in the equation are current output, which
is subject to revision, and the federal funds rate
itself. A policymaker—in this instance, the Fed—
is obliged to add structure to this underidentified
equation through the use of a macroeconomic
model of some sort. For their part, interpreters
of the data—Fed watchers, among others—are
obliged to “invert” the (perceived) policy rule
and infer what the Fed’s estimates of rr*, Δy*, y*, and E_t π_{t+1} might be.3 The only inevitability is
that all parties will get it wrong; the question is
in what way and how critically.4
2. I am thinking of a closed economy here, or at least one that, if open, is not “small.”

3. Of course, what Fed watchers might also want to infer from policy decisions given a policy rule is an estimate of the target rate of inflation. The target rate has been normalized out of our policy rule, for simplicity.

4. It makes a difference whether it is the Fed that is “getting it wrong” or the private sector. The more the Fed gets things wrong, the harder it is for the private sector to infer something about the economy from Fed behavior. This is, of course, one of the reasons behind arguments for transparency in monetary policy.


METHODOLOGY: A DÉJÀ VU
EXPERIENCE
Bob Arnold’s paper does a solid job of explaining the CBO’s methodology for measuring and
projecting potential output. He also shows substantial awareness of the limitations of their
approach; there is little for me to add on that
score. To provide a different perspective, in this
section I offer readers a “blast from the past,”
from 30 years ago, in fact. I describe the approach
of Perloff and Wachter (1979) from a Carnegie-Rochester conference in 1978. Like Arnold,
Perloff and Wachter start with an estimate of the
non-accelerating inflation rate of unemployment
(NAIRU) from a previous paper; then, they estimate potential labor input as follows:

(2)  log(n) = c + α(u − u*) + β₁t + β₂t² + β₃t³ + ε_t ,

where t^k, k = 1, 2, 3, are polynomial time trends, u
is the unemployment rate, c is a constant, and ε
is a residual. Potential labor input, n*, is evaluated using this equation by setting cyclical and
noise terms to zero; in this instance, u = u* and
ε = 0 for all t. Perloff and Wachter follow the same
procedure with potential capital input, except that
the equation in this case is a “cyclically sensitive
translog production function” (p. 122) augmented
with more polynomial time trends. The similarity
to Arnold’s equation (1) is remarkable.5
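The procedure in equation (2) can be sketched in a few lines of least squares: regress log labor input on the unemployment gap and polynomial trends, then evaluate the fitted equation at u = u* and ε = 0. The data below are synthetic, and u* is taken as given (exogenous), which is exactly the practice Gordon criticizes:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 160
t = np.arange(T, dtype=float) / 100.0           # scaled time index, for conditioning
u_star = 5.0                                    # NAIRU, taken as given
u = u_star + 1.5 * np.sin(np.arange(T) / 8.0)   # cyclical unemployment
log_n = 4.0 - 0.4 * (u - u_star) + 0.3 * t + rng.normal(0.0, 0.01, T)

# Equation (2): log(n) = c + alpha*(u - u*) + b1*t + b2*t^2 + b3*t^3 + eps.
X = np.column_stack([np.ones(T), u - u_star, t, t**2, t**3])
b = np.linalg.lstsq(X, log_n, rcond=None)[0]

# Potential labor input n*: set the cyclical term and the residual to zero,
# leaving only the constant and the time trends.
log_n_star = X[:, [0, 2, 3, 4]] @ b[[0, 2, 3, 4]]
```

As Gordon's comment suggests, the polynomial trends do the bulk of the work in shaping `log_n_star`; the production-function apparatus adds little beyond them.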
With this sameness in mind, I can make my
job as discussant easier by shamelessly stealing
from Perloff and Wachter’s discussants. Gordon
(1979) focused on estimation:
[W]ithout making any statistically significant
difference in the wage equation, one could
come up with an estimated increase in u*
between 1956 and 1974 ranging anywhere
from 0.58 to 1.61 percentage points… (p. 190)

In other words, taking u* as exogenous, rather
than estimating a complete system, particularly
while ignoring the imprecision of the first-stage
5. From a real-time perspective, the CBO’s methodology could be
more problematic than Perloff and Wachter’s in that the CBO uses
trends dated back from the previous business cycle peak. No doubt
this is to avoid the political heat that might come from making a
call on a potentially contentious issue in real time. By definition,
this method will miss turning points, possibly by wide margins.


estimates, is problematic. Elsewhere, Gordon
remarks on overparameterization:
Taking this set of data for u*, one can compute
an acceptable and consistent natural output
series without any use of production functions
at all. (p. 188)

That is, because the time-trend variables are
doing the bulk of the work, it is not clear that
there is anything unambiguously “supply side”
in the calculation. The other discussants, Plosser
and Schwert (1979), focused on interpretation of
the results and the related issue of econometric
identification:
[A]ggregate demand policies are not necessarily appropriate in a world where actual output
is viewed as the outcome of aggregate supply
and demand...In such an equilibrium world,
“potential output” ceases to have any significance. (p. 184)

Thus, even though the real business cycle
literature had yet to emerge, the seeds of the idea
were clearly already planted.
Both commentaries remark, in their own way,
on econometric identification. How does one
differentiate between supply (or potential) and
demand (the gap)? Does it even make sense to try?
The use of time trends, which are both deterministic and smooth, is an identifying assumption
made by both Perloff and Wachter (1979) and
Arnold (2009). Their use implies that supply
shocks have not happened often historically and
can be safely ignored in real time for forecast purposes. When Perloff and Wachter were writing,
the literature on unit roots in real GDP—which
would come to include, as it happens, an important contribution by Nelson and Plosser (1982)—
had not yet arisen. But this is not so for the CBO
or any of a variety of other institutions that use
similar approaches.6 Why, then, has the methodology on measuring potential output apparently
6. Barnett, Kozicki, and Petrinec (2009) note that the Bank of Canada
has used a stochastic method for measuring potential since 1992.
The Federal Reserve Board’s FRB/U.S. model forecast uses a stochastic state-space method. The Fed’s official Greenbook forecast—being
judgmental—is more complicated. The Board staff consult a variety
of models for guidance on adjusting potential output and its constituent parts, but they do so on an ad hoc basis. There is, however, a
significant smoothness prior on trend labor productivity, and hence
on potential output, and a prior that Okun’s law holds fairly strongly.


not absorbed anything from the literature on unit
roots and stochastic trends over the past 30 years?
My conjecture is threefold. First, the CBO—
like most macroeconomic policy institutions—
maintains a distinctly Keynesian perspective on
how the economy works, a view that maintains
that the majority of fluctuations in real GDP come
from demand disturbances and that policy plays
a key role in smoothing those fluctuations. This
approach is natural enough; policy institutions do
tend to draw individuals who believe that policy
is highly consequential. And to paraphrase the
old line: When one likes to use hammers, the
object of interest tends to look like a nail. My
second conjecture is more subtle. Economists at
institutions like the CBO must be able to answer
a wide variety of questions from decisionmakers
and they need a structure that allows them to do
so in short order. The complex, deterministic
accounting structure that Arnold describes allows
the CBO to do that, although one could of course
quarrel with the efficacy of the advice that comes
from such a structure. Third, while I would argue
that the literature on unit roots shows that permanent shocks to GDP—shocks that can fairly be
characterized as supply shocks—are important,
that literature has not yet provided high-precision
tools for measuring those shocks in real time. The
standard errors of estimates of potential output
and the output gap are large.7 And the problem
gets worse as the parameter space of the model
grows.
Nonetheless, I would argue that even though adopting the stochastic approach involves tackling some difficult issues, it is still a step worth taking. These same issues exist with the extant method, but they have been swept under the rug through the identification by assumption implicit in the use of time trends to represent aggregate supply. We are dealing with unobserved variables here; it only makes sense that, with the passage of time, our backcasts of potential output would differ significantly from our nowcasts. To “assume away” the stochastic properties of the data only ignores the issue; it doesn’t solve it. A more clear-eyed view, in my opinion, is to accept the stochastic nature of potential and adjust procedures and interpretations to this reality by being prepared to adapt estimates rapidly and efficiently in real time (see, e.g., Laxton and Tetlow, 1992).

7. The discussion slides show an example of the bootstrapped standard errors from a simple unobserved components model of potential output. These are available at the online version of this Review article.
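A minimal version of the stochastic alternative is an unobserved-components model filtered with the Kalman filter: log output as a random-walk-with-drift trend (potential) plus an AR(1) gap. This is an illustrative sketch with made-up calibrations, not the Laxton-Tetlow filter or any institution's model:

```python
import numpy as np

def kalman_trend_gap(y, mu=0.005, phi=0.9, sig_trend=0.004, sig_gap=0.006):
    """Filtered estimates of [trend, gap] for y_t = trend_t + gap_t, where
    trend_t = trend_{t-1} + mu + eta_t and gap_t = phi*gap_{t-1} + eps_t."""
    F = np.array([[1.0, 0.0], [0.0, phi]])   # state transition
    c = np.array([mu, 0.0])                  # drift enters the trend equation
    Q = np.diag([sig_trend**2, sig_gap**2])  # state shock variances
    H = np.array([1.0, 1.0])                 # observation: y = trend + gap
    x = np.array([y[0], 0.0])                # start the trend at the first obs
    P = np.eye(2) * 0.01
    trend, gap = [], []
    for obs in y:
        x = F @ x + c                        # predict
        P = F @ P @ F.T + Q
        v = obs - H @ x                      # output "surprise"
        S = H @ P @ H                        # its predicted variance
        K = P @ H / S                        # Kalman gain splits the surprise
        x = x + K * v                        # between trend and gap
        P = P - np.outer(K, H @ P)
        trend.append(x[0])
        gap.append(x[1])
    return np.array(trend), np.array(gap)
```

Because the gain K loads on both states, every output surprise moves the estimate of potential a little in real time; the deterministic approach corresponds to forcing the trend component's gain toward zero.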

OKUN’S LAW
I have already noted the strong Keynesian
prior implicit in the methods for measuring potential output at the CBO and other policy institutions. As noted, this prior is evident in the use of
deterministic time trends. It is also a function of
the fact that potential output—and hence output
gaps—are constructed beginning with estimates
of the NAIRU, and hence the unemployment gap,
using Okun’s law. This is illustrated in Arnold’s
Figure 1, which shows the CBO output gap and
the unemployment gap on the same chart. The
chart provides an “ocular regression” of Okun’s
law: The two lines are nearly on top of one another,
meaning that a linear, static relationship between
the two concepts fits the (constructed) data very
well. In essence, this means that the output gap
and the unemployment gap are nearly the same.
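The "ocular regression" can be made literal with a stylized simulation. The Okun coefficient of 2 used below is a conventional textbook value, not the CBO's estimate:

```python
import numpy as np

# If Okun's law holds tightly, the output gap is close to a fixed multiple of
# the unemployment gap, so a scatter of one against the other lies almost on
# a line. Simulated gaps, in percent and percentage points:
rng = np.random.default_rng(3)
u_gap = rng.normal(0.0, 1.0, 200)                   # unemployment gap
y_gap = -2.0 * u_gap + rng.normal(0.0, 0.1, 200)    # output gap, little noise

slope = np.polyfit(u_gap, y_gap, 1)[0]              # estimated Okun coefficient
corr = np.corrcoef(u_gap, y_gap)[0, 1]              # near -1: the two lines
                                                    # sit on top of one another
```

A correlation this close to −1 is what the near-coincident lines in Arnold's Figure 1 amount to; the point developed below is that such tightness leaves little room for a meaningful productivity gap.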
The view that the unemployment gap and
output gap are isomorphic—that is, the view that
Okun’s law really is something that approaches a
“law”—has important implications for the characterization of business cycles. The following loglinearized Cobb-Douglas production function
shows this:
(3)  y = a + θn + (1 − θ)k,

where a is total factor productivity, and we measure potential output using full-employment labor input, n*, and the actual capital stock, k, as is usually the case:

(4)  y* = a* + θn* + (1 − θ)k,

and then subtract equation (4) from equation (3) to show the relationship between output gaps, y − y*, and the labor market gap, n − n*:8

(5)  y − y* = (a − a*) + θ(n − n*).

8. I am blurring the distinction between the unemployment gap and the labor market gap—the difference being what might be called the average workweek gap and the labor force participation rate gap. This distinction is important to my point only if one thinks that all productivity adjustment—movements in a relative to a*—is carried out on these two margins, which seems unlikely.

9. Whether there is any meaningful distinction among these three stories depends on the underlying model.

10. My suggestions here are particularly relevant for a decisionmaking body when the level of the gap is important. I think this is true for almost all policy institutions but is undoubtedly “more true” for, say, a central bank, than for a fiscal authority.

Now Arnold’s Figure 1 implies that y – y* –
θ(n − n*) is small and unimportant—taken to the
limit, Okun’s law implies that it should be white
noise. This, in turn, means that what we might call
the productivity gap, a – a*, must also be small
and unimportant. Should it be? Should anyone
care? What is the productivity gap anyway? The
productivity gap can represent any or all of a variable workweek of capital, variable capacity utilization, or labor adjustment costs to productivity
shocks.9 Loosely speaking, fluctuations in a that
are not in response to shocks to a* are labor adjustment shocks, whereas shocks to a*, all else equal,
are classic productivity shocks. The productivity
gap, (a − a*), can be unimportant only in the
unlikely circumstance that actual productivity,
a, moves instantaneously with a productivity
shock, a*, and disturbances to a, holding a* constant, are themselves close to white noise. In short,
the only way the productivity gap could be small
and unimportant—and, therefore, the only way
that Okun’s law can hold so tightly as to be called
a law—is either because aggregate demand moves
instantaneously with productivity or if there are
no productivity shocks in the first place. Neither
of these possibilities seems plausible.
My own preference would be to drop the
deterministic time trends, relaxing somewhat
the iron grip of Okun’s law, and treat potential
output as a stochastic variable. Doing so would
allow for meaningful supply-side shocks, modeled using state-space techniques, probably with
the Kalman filter.10 From an operational point of
view, this shifts the prior on the incidence of
shocks somewhat. Under the deterministic prior,
all real surprises are demand shocks and this view


is adjusted only rarely and after the fact; with the
stochastic view, the default option becomes one
wherein some portion of a given output surprise
is characterized as a supply shock. The model
user could override that prior, but it would be a
conscious decision on the user’s part to do so.
In this way, the stochastic approach would be
responsive in real time, allowing estimates to
adapt to developments such as the productivity
boom of the late 1990s in a way that the deterministic approach would not. Such a property is an
important one, particularly for institutions whose
policy instruments may be adjusted with relatively high frequency. State-space models also
allow the modeling of nonlinearities—for example, to capture different dynamics when cycles
are being driven largely by supply shocks rather
than by demand shocks or to allow for “jobless
recoveries”—although the econometric hurdles
are correspondingly higher.11
Such an approach comes at some cost, however, because either the parameter space must be
small or the user must be willing to impose priors
on enough parameters to give the estimator a
chance of producing reasonable results. Still, this
approach would likely impose fewer restrictions
than the current approach. At a minimum, weakening the prior that all shocks are demand shocks
opens the door for model users to consider what
kind of shocks might have produced the cross section of measured surprises—positive for output
and negative for inflation, for example—in real
time. This, in turn, would allow a more rapid
adjustment to new information and smaller and
less persistent forecast and policy errors than
would otherwise be the case.
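The shift in prior can be summarized with a textbook signal-extraction weight: the default share of an output surprise attributed to supply equals the supply-shock variance relative to total shock variance. The variance values below are purely illustrative:

```python
def supply_share(var_supply, var_demand):
    """Fraction of an output surprise attributed to supply shocks."""
    return var_supply / (var_supply + var_demand)

# Deterministic-trend prior: supply variance is zero, so every surprise is
# read as demand.
all_demand = supply_share(0.0, 1.0)       # 0.0
# Stochastic prior: with equal variances, half of each surprise is supply.
even_split = supply_share(1.0, 1.0)       # 0.5
```

A cross section of surprises (output up and inflation down, say) would push the split further toward supply; that is the kind of real-time reassessment the stochastic approach permits.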

CONCLUSION
Bob Arnold has outlined a detailed and
sophisticated approach to measuring potential
output as used by the CBO. In my opinion, the
approach is representative of the perspective and
11. Bayesian methods can be helpful in this regard, particularly for
policy institutions that tend to be unapologetic about having prior
beliefs.


needs of a range of policy institutions. In general
terms, the remarkable thing about the CBO
method, and methods like it, is how little it differs
from methods used 30 years ago. This lack of penetration of academic ideas into the policymaking
sphere is perplexing in some ways. However, it
reflects, in part, the needs of institutions to be able
to answer myriad questions using the same model.
This practice tends to result in the construction of
large, elaborate models, and unfortunately not all
modern econometric techniques scale up well to
large models. The good news is that new methods
in Bayesian econometrics offer considerable help
in estimating larger systems while paying proper
heed to the priors of the model builders and users.
Another source of the lack of progress, in my view,
is the strong Keynesian prior regarding the sources
of business cycle fluctuations. Many public policy
institutions regard supply shocks as rare enough
to be ignored. I would argue that this prior is
overly strong—we know for a fact it was dead
wrong in the United States in the late 1990s (see,
e.g., Anderson and Kliesen, 2005; and Tetlow and
Ironside, 2007). It might also be deleterious for
policymaking because the perspective that all
shocks are demand shocks leads directly to the
view that all fluctuations should be smoothed out,
which is arguably a recipe for “fine-tuning.”
We are now in a period in which the CBO
methodology is being tested. By construction,
the CBO will have concluded that the current
“financial stress shock” to the U.S. economy is
entirely a demand-side phenomenon with large
implications for the output gap and eventually for
inflation. This is a contestable position. It would
not be hard to fashion an argument that the desired
capital stock, and hence the level of potential
output, has shifted down; interpreting the shock
in this less devoutly Keynesian way would mean
smaller output gaps, less disinflationary pressure,
and somewhat less need for expansionary policy,
all else equal. We shall see. In any case, quite
apart from the methods detailed therein, Bob
Arnold’s paper shows a mindful understanding
of the uncertainties involved, which is probably
more important. It thereby serves the Congress
well.

REFERENCES

Aaronson, Stephanie; Fallick, Bruce; Figura, Andrew; Pingle, Jonathan and Wascher, William. “The Recent Decline in the Labor Force Participation Rate and Its Implications for Potential Labor Supply.” Brookings Papers on Economic Activity, Spring 2006, 1, pp. 69-134.

Anderson, Richard G. and Kliesen, Kevin L. “Productivity Measurement and Monetary Policymaking During the 1990s.” Working Paper No. 2005-067A, Federal Reserve Bank of St. Louis, October 2005; http://research.stlouisfed.org/wp/2005/2005-067.pdf.

Arnold, Robert. “The Challenges of Estimating Potential Output in Real Time.” Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 271-90.

Barnett, Russell; Kozicki, Sharon and Petrinec, Christopher. “Parsing Shocks: Real-Time Revisions to Gap and Growth Projections for Canada.” Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 247-65.

Gordon, Robert. “A Comment on the Perloff and Wachter Paper.” Carnegie-Rochester Conference Series on Public Policy, January 1979, 10(1), pp. 187-94.

Laxton, Douglas and Tetlow, Robert J. “A Simple Multivariate Filter for the Measurement of Potential Output.” Technical Report No. 59, Bank of Canada, June 1992.

Nelson, Charles R. and Plosser, Charles I. “Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications.” Journal of Monetary Economics, September 1982, 10(2), pp. 139-62.

Perloff, Jeffrey M. and Wachter, Michael L. “A Production Function—Nonaccelerating Inflation Approach to Potential Output: Is Measured Potential Output Too High?” Carnegie-Rochester Conference Series on Public Policy, January 1979, 10(1), pp. 113-63.

Plosser, Charles I. and Schwert, William G. “Potential GNP: Its Measurement and Significance: A Dissenting Opinion.” Carnegie-Rochester Conference Series on Public Policy, January 1979, 10(1), pp. 179-86.

Tetlow, Robert J. and Ironside, Brian. “Real-Time Model Uncertainty in the United States: The Fed, 1996-2003.” Journal of Money, Credit and Banking, October 2007, 39(7), pp. 1533-61.


Trends in the Aggregate Labor Force
Kenneth J. Matheny
Trend growth in the labor force is a key determinant of trends in employment and gross domestic
product (GDP). Forecasts by Macroeconomic Advisers (MA) have long anticipated a marked slowing
in trend growth of the labor force that would contribute to a slowing in potential GDP growth. This
is reflected in MA’s forecast that the aggregate rate of labor force participation will trend down,
especially after 2010, largely in response to the aging of the baby boom generation, whose members
are beginning to approach typical retirement ages. Expectations for a downward trajectory for the
participation rate and a slowing in trend labor force growth are not unique. However, this article
reports on MA research suggesting that the opposite is possible: that the slowdown in trend labor
force growth could be relatively modest and that the trend in the aggregate rate of labor force
participation will decline little, if at all, over the next decade. (JEL E01, J11)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 297-309.

Projections of population and labor force growth are essential elements of any projection of the economy’s potential output growth. Often, however,
these projections are driven primarily by trends
and dummy variables. The research reported
here constructs a labor force projection from a
much richer set of behavioral determinants of
labor force trends than are typically used. The
set of determinants also is richer than that contained in the aggregate labor force equation that
appears in the current version (as of this writing)
of the Macroeconomic Advisers (MA) commercial
macroeconomic model.
In its September 24, 2008, issue of Long-Term
Economic Outlook, MA projected that the labor
force participation rate would decline by about

1½ percentage points over the next decade—to
64.6 percent in 2017—and that the growth of the
labor force would slow from roughly 1 percent
or a little higher on average in recent years to an
average of 0.6 percent from 2013 to 2017 (Tables 1
and 2). These estimates are comparable to recent
estimates from the Congressional Budget Office.
However, they are considerably stronger than
trend estimates in a recent paper by Aaronson et al.
(2006). Their research suggests that demographic
and other developments could result in a much
larger decline in the participation rate—to 62.5
percent by the middle of the next decade—and a
reduction in trend labor force growth to just 0.2
percent from 2013 to 2015.
The research summarized here leans in the
other direction. It suggests that trend growth of

Kenneth J. Matheny is a senior economist at Macroeconomic Advisers, LLC. The author thanks James Morley of Washington University for
research advice and assistance. Other staff at Macroeconomic Advisers contributed to this research in various ways, including Joel Prakken,
chairman; Chris Varvares, president; Ben Herzon, senior economist; Neal Ghosh, economic analyst; and Kristin Krapja, economic analyst. The
author also acknowledges the following for their assistance or feedback: Robert Arnold, Congressional Budget Office; Jonathan Pingle, Brevan
Howard Asset Management, LLP; Mary Bowler, Sharon Cohany, John Glaser, Emy Sok, Shawn Sprague, and especially Steve Hipple and Mitra
Toosi of the Bureau of Labor Statistics; Steven Braun of the President’s Council of Economic Advisers; and William Wascher of the Federal
Reserve Board of Governors. The staff of Haver Analytics provided assistance locating certain data. Ross Andrese, a former employee of
Macroeconomic Advisers, provided research assistance during an early phase of this project.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

FEDERAL RESERVE BANK OF ST. LOUIS REVIEW, JULY/AUGUST 2009

Matheny

Table 1
Growth of the Civilian Labor Force

Year    MA Long-Term Economic    Model prediction    CBO (2008)           Aaronson et al. (2006)
        Outlook (2008)           of trend*           estimate of trend    estimate of trend
2008    0.8                      1.3                 1.1                  0.4
2009    0.8                      1.1                 1.0                  0.4
2010    1.1                      0.9                 0.9                  0.4
2011    1.0                      0.9                 0.6                  0.4
2012    0.8                      0.9                 0.6                  0.3
2013    0.6                      1.0                 0.6                  0.2
2014    0.6                      0.9                 0.5                  0.2
2015    0.6                      0.9                 0.5                  0.2
2016    0.6                      0.9                 0.5                  NA
2017    0.6                      0.9                 0.4                  NA

NOTE: Data represent annual averages in percent. *Based on the level terms of the regression in Table 3 after removing cyclical contributions from the unemployment and wealth terms, as described in the text.

Table 2
Labor Force Participation Rate

Year    MA Long-Term Economic    Model prediction    CBO (2008)           Aaronson et al. (2006)
        Outlook (2008)           of trend*           estimate of trend    estimate of trend
2008    66.0                     65.7                66.1                 65.2
2009    65.9                     65.8                66.0                 64.7
2010    65.8                     65.7                65.9                 64.4
2011    65.8                     65.7                65.7                 64.0
2012    65.7                     65.7                65.4                 63.7
2013    65.5                     65.8                65.2                 63.3
2014    65.3                     65.9                64.9                 62.9
2015    65.1                     66.0                64.6                 62.5
2016    64.9                     66.0                64.3                 NA
2017    64.6                     66.0                63.9                 NA

NOTE: Data represent annual averages in percent. *Based on the level terms of the regression in Table 3 after removing cyclical contributions from the unemployment and wealth terms, as described in the text.



the labor force might not slow as much over the
next decade as previously anticipated. According
to the model, the trend in labor force growth will
edge down slightly to an average of 0.9 percent
through 2017, and the trend in the labor force
participation rate will dip only slightly from recent
levels to average just under 66 percent from now
through 2017. The research reported here updates
our measure of the pure demographic contribution
to the change in the labor force to reflect more age
detail than in our existing model and to match the
population concept on which it is based with the
one that underpins the official estimates of the
labor force and the participation rate from the
Bureau of Labor Statistics (BLS). The updated
model addresses a bedeviling problem with discontinuities in the official estimates of the labor
force and the civilian noninstitutional population.
Unfortunately, data limitations prevent the complete elimination of the spurious impacts of these
discontinuities, which stem from updates to
“population controls” that are entered into the
official data in response to the results of decennial
censuses and for other population-related data.
The research reported here shows a much
richer set of behavioral determinants of labor
force trends than are contained in the equation
for the aggregate labor force that appears in the
MA commercial model at the time of this writing.
Specifically, this analysis drops previously used
deterministic trend and shift terms; instead, the
model includes a small set of factors believed to
exert important behavioral influences on the
labor force.

DEMOGRAPHIC CONTRIBUTION
TO THE LABOR FORCE
As part of our modeling, the pure demographic
contribution to the change in the labor force is
separated from its behavioral influences. We
typically measure the demographic contribution
with a chain-weighted index of the populations
for 30 different age and gender brackets, using
lagged labor force participation rates as weights,
which we label LFCADJL.1 Population details from
the civilian noninstitutional population 16 years and older are used to construct the series. With
lower-case p’s denoting participation rates and
lower-case nc’s denoting population details,
LFCADJL is updated according to

(1)  LFCADJL_t = LFCADJL_{t−1} × [Σ_{i=1}^{30} p_{i,t−1} × nc_{i,t}] / [Σ_{i=1}^{30} p_{i,t−1} × nc_{i,t−1}]

The series is indexed to equal the actual labor
force in 2000. (This has no impact on the results
that follow.) Changes in LFCADJL from one quarter to the next are due to changes in the detailed
populations across age and gender brackets (the
nc’s) holding fixed the weights (the p’s). In this
sense, growth of LFCADJL is a comprehensive
measure of the pure demographic contribution
to the change in the labor force. Growth of the
actual labor force and growth of LFCADJL are
displayed together in Figure 1.2 Forecast projections for LFCADJL reflect growth in the population
detail, holding fixed the within-group participation rates. We observe that growth of LFCADJL is
projected to moderate in the forecast, with an
average of 0.4 percent from 2015 to 2017.
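The chain update in equation (1) can be sketched in a few lines of code. The example below uses two invented age/gender groups rather than the 30 BLS brackets, so the numbers are purely illustrative.

```python
# Sketch of the chain-weighted demographic index in equation (1).
# p_prev holds lagged participation rates (the fixed weights for one update);
# nc_prev and nc_curr hold the detailed populations in adjacent quarters.

def chain_index_update(index_prev, p_prev, nc_curr, nc_prev):
    """One quarterly update of LFCADJL: population growth at fixed weights."""
    numer = sum(p * n for p, n in zip(p_prev, nc_curr))
    denom = sum(p * n for p, n in zip(p_prev, nc_prev))
    return index_prev * numer / denom

# Invented data for two groups: participation rates and populations (millions).
p_prev = [0.80, 0.15]        # lagged participation rates, used as weights
nc_prev = [100.0, 30.0]      # populations last quarter
nc_curr = [100.5, 30.6]      # populations this quarter

lfcadjl = chain_index_update(150.0, p_prev, nc_curr, nc_prev)
```

Because the weights are held at their lagged values, the update moves only with the detailed populations, which is what makes growth of LFCADJL a pure demographic contribution.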

BEHAVIORAL COMPONENT OF
THE LABOR FORCE
The behavioral component of the labor force
can be measured by the log-ratio of the actual
labor force to the demographic measure,
log(LFC/LFCADJL). This series (Figure 2) is obviously nonstationary, and tests confirm that it appears to be I(1); that is, the series is stationary after differencing, implying that cointegration-based techniques provide a useful framework for econometric analysis. We found evidence that this
1. Male and female populations and labor forces are separated into 15 non-overlapping age brackets, specifically, 16-17, 18-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-61, 62-64, 65-69, 70-74, and 75 years and older.

2. We return in a subsequent section to the appearance of sharp swings in the growth rates of these series stemming from updated population controls that are entered into the population and labor force data without adjustment. The most recent discontinuity occurs in data for the first quarter of 2008, reflected in a sharp, temporary drop in the growth of LFCADJL.


Figure 1
Labor Force Growth: Actual and Demographic Contribution
[Line chart, 1960-2016: quarterly percent change at an annual rate for the actual labor force and for the demographic contribution (LFCADJL).]
SOURCE: BLS and MA.

Figure 2
Prediction for Behavioral Component
[Line chart, 1960-2016: actual, predicted, and trend-prediction values of the behavioral component, log(LFC/LFCADJL).]


variable is cointegrated with the following set of
“behavioral” variables (and a constant).
Dependency Ratio (YOUNG015): The ratio
of persons 15 years and younger to the entire resident population. This series has generally trended
down over the past several decades, roughly mirroring the inverse of the relative participation
rate for women. This term is typically among the
most robust and statistically significant variables
in labor force regressions.
Life Expectancy (WT65F_LEF65): The life
expectancy of women at the age of 65 years, multiplied by the share of women aged 65 and older
in the total adult, civilian, noninstitutional population. Life expectancy represents the number
of years one would expect to live, on average,
conditional on having attained the age of 65.3
A subsequent section addresses our choice of a
female-weighted, female life expectancy.
Welfare Reform (WR1996): Intended as a
proxy for the effect of welfare reform in the late
1990s. This series is constructed as the product of
several terms, beginning with a dummy variable
that is zero up to the second quarter of 1996 and
one thereafter, to mark the enactment of federal
welfare reform in August 1996.4 The zero-one
dummy is multiplied by one minus the share of
women who are married, by the dependency ratio
(YOUNG015), and by the ratio of the population
of women aged 18 to 49 to the total adult civilian
noninstitutional population.5
3. Estimates for life expectancy are from the "intermediate-cost" assumptions of the Social Security Administration (www.ssa.gov/OACT/TR/TR07/lr5A4.html). Interpolation from annual to quarterly estimates is accomplished using a cubic spline. We take a centered nine-quarter moving average to smooth sometimes odd movements in the first differences that arise because of interpolation. Smoothing has very little effect on the regression results.

4. The Personal Responsibility and Work Opportunity Reconciliation Act of 1996 was signed into law by President Clinton on August 22, 1996. Some states began instituting welfare reforms during the same era or before. We also considered slightly different versions of this term where the dummy variable switches from zero to one either before or after the third quarter of 1996. For dates near the third quarter of 1996, the regression results were little affected.

5. One might suppose that, in the regression, the welfare reform term is capturing a behavioral increase in the labor force as persons were "pulled" into labor markets during a period of strong labor demand beginning in the late 1990s. We discount this possibility for two reasons. First, the welfare reform term is significant with the unemployment rate present, and the unemployment rate arguably accounts for any "demand pull" effect. Second, the size of the effect from the welfare reform term is comparable to estimates from other researchers about the impact of welfare reform in the 1990s.


Household Net Worth (NW_SCALED5564):
The ratio of per capita household net worth to
hourly labor compensation, multiplied by the
population share of persons aged 55 to 64. The
traditional theory of the labor/leisure choice notes
that increases in wealth cause a reduction in labor
supply if leisure is a “normal” good. However,
previous research on the existence of wealth
effects on labor supply has been mixed.6 We
found ambiguous results when the wealth ratio
is not scaled by the population share but robust
results consistent with traditional theory when
the wealth ratio is premultiplied by the share of
the population aged 55 to 64. In other research
on participation rates for individual age brackets,
we found evidence of wealth effects on participation rates for this age bracket.
Unemployment Rate (LURC): The official
unemployment rate, expressed in percent. Its
presence is motivated by search-theoretic considerations, namely, that the expected return to
searching for employment is negatively related
to the level of unemployment.
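As one concrete example of how these regressors could be assembled, the welfare reform term WR1996 is a product of a reform dummy and three shares. The sketch below uses invented placeholder values rather than actual BLS or Census series; only the construction mirrors the description above.

```python
# Sketch of the WR1996 regressor: a zero-one reform dummy (one from 1996:Q3
# onward) times (1 - share of women who are married) times the dependency
# ratio YOUNG015 times the share of women aged 18-49 in the adult population.
# All input series below are invented placeholders.

def build_wr1996(quarters, married_share, young015, women1849_share,
                 reform_quarter="1996Q3"):
    out = []
    for q, m, y, w in zip(quarters, married_share, young015, women1849_share):
        dummy = 1.0 if q >= reform_quarter else 0.0  # "YYYYQn" sorts correctly
        out.append(dummy * (1.0 - m) * y * w)
    return out

quarters = ["1996Q1", "1996Q2", "1996Q3", "1996Q4"]
wr = build_wr1996(quarters,
                  married_share=[0.55, 0.55, 0.55, 0.55],
                  young015=[0.22, 0.22, 0.22, 0.22],
                  women1849_share=[0.32, 0.32, 0.32, 0.32])
# wr is zero before the reform and a small positive number afterward, in the
# same general range as the value of about 0.035 reported in footnote 13.
```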
A simple levels regression among these variables and a constant suggested cointegration, so
a dynamic levels regression was estimated that
also includes leads and lags of the first differences
of all the regressors to control for serial correlation.7 The results of the dynamic regression are
summarized in Table 3.8,9 All five regressors enter
as expected, with positive coefficients on the life
expectancy and welfare reform terms, and negative coefficients on the dependency ratio, the
6. Goodstein (2008) finds that increases in wealth do lead to earlier withdrawal from the labor force in a panel dataset of older men. He argues that previous researchers who investigated the issue in panel datasets found small and statistically insignificant effects of wealth on retirement because they did not control for differences in "tastes," including risk aversion and preference for work, thereby producing a spurious positive correlation between wealth and labor force participation.

7. Along with a correction for heteroskedasticity, t-statistics from the dynamic levels regression are asymptotically valid.

8. To conserve space, the differenced terms, which are immaterial to what follows, are suppressed in the table.

9. Sample means of the first-differenced terms were removed before estimation. This has no effect on the estimated coefficients, except for the constant, and ensures that the predicted value of the level terms is consistent with the level of the dependent variable during the estimation sample.


unemployment rate, and the wealth term. All
terms are statistically significant.
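The dynamic levels regression described above, levels of the dependent variable on levels of the regressors plus leads and lags of their first differences, is in the spirit of dynamic OLS for cointegrated systems. The sketch below illustrates the mechanics on synthetic data with a single I(1) regressor and one lead and lag, using only the standard library; the paper's actual regression has five level regressors, three leads and lags of each difference, and HAC standard errors.

```python
# Minimal dynamic-levels (leads-and-lags) OLS sketch on synthetic data.
# y is cointegrated with a random-walk x; the level coefficient on x
# estimates the cointegrating relation (true value 0.5, intercept 2.0).

import random

random.seed(0)
T = 200
x = [0.0]
for _ in range(T - 1):
    x.append(x[-1] + random.gauss(0, 1))                  # x is I(1)
y = [2.0 + 0.5 * xi + random.gauss(0, 0.1) for xi in x]   # stationary error

dx = [x[t] - x[t - 1] for t in range(1, T)]

# Rows [1, x_t, dx_{t-1}, dx_t, dx_{t+1}] for t where all terms exist.
rows, ys = [], []
for t in range(2, T - 1):
    rows.append([1.0, x[t], dx[t - 2], dx[t - 1], dx[t]])
    ys.append(y[t])

# Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
k = len(rows[0])
XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(rows, ys)) for i in range(k)]
for i in range(k):
    piv = XtX[i][i]
    XtX[i] = [v / piv for v in XtX[i]]
    Xty[i] /= piv
    for m in range(k):
        if m != i:
            f = XtX[m][i]
            XtX[m] = [a - f * b for a, b in zip(XtX[m], XtX[i])]
            Xty[m] -= f * Xty[i]
beta = Xty  # beta[0] ~ 2.0 (constant), beta[1] ~ 0.5 (level coefficient)
```

The leads and lags of the differences soak up short-run dynamics so that the level coefficients can be read as the long-run (cointegrating) relation, which is how the level terms in Table 3 are used in the trend model.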
We noted at the outset that the primary focus
of this research was on the determinants of trends
in the labor force. Toward that end, we removed
the direct “cyclical” contribution by replacing
the unemployment rate (LURC) with our estimate
of the long-run natural rate of unemployment
(NAIRU).10 The wealth term is also subject to
cyclical influences, though the issue of identifying its cyclical contribution is ambiguous. On the
one hand, it might not matter much in the forecast beyond 2010, because the contribution from
the wealth term does not vary much after that date.
Nevertheless, we did attempt to reduce the obvious cyclicality in the wealth term as follows. First,
we regressed the unscaled wealth ratio (that is,
per capita wealth divided by hourly compensation without scaling by the population share) on
several leads and lags of the unemployment rate,
along with a constant and trend. We then substituted the contribution from the unemployment
rate with a contribution computed using the
NAIRU and the same coefficients. The adjusted
wealth rate was once again multiplied by the
55- to 64-year-old population share
(NW_SCALED5564LR).11
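The substitution just described can be sketched as follows, assuming for illustration a single contemporaneous unemployment term rather than several leads and lags; every coefficient and series below is invented.

```python
# Sketch of removing the cyclical component of the wealth ratio: the fitted
# unemployment contribution b*LURC is replaced by b*NAIRU, leaving the
# constant, trend, and residual parts of the ratio untouched.

def adjust_wealth(wealth_ratio, lurc, nairu, b_unemp):
    """Swap the unemployment-rate contribution for a NAIRU contribution."""
    return [w - b_unemp * u + b_unemp * n
            for w, u, n in zip(wealth_ratio, lurc, nairu)]

wealth_ratio = [5.2, 5.0, 4.7]   # per capita net worth / hourly compensation
lurc = [4.5, 6.0, 8.0]           # actual unemployment rate, percent
nairu = [5.0, 5.0, 5.0]          # assumed natural rate, percent
adj = adjust_wealth(wealth_ratio, lurc, nairu, b_unemp=-0.10)

# Multiplying by the 55-64 population share yields NW_SCALED5564LR.
share5564 = 0.12
nw_scaled_lr = [a * share5564 for a in adj]
```

With a negative unemployment coefficient, the adjusted ratio sits above the raw ratio whenever unemployment exceeds the NAIRU, which is the direction of the cyclical correction intended here.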
With these adjustments, the model for “trend”
in the behavioral component of the labor force is
given by
10. Our estimate of the NAIRU is not a constant because it includes a gradually evolving adjustment for changes in the age profile of the labor force. For example, younger adults on average experience higher unemployment rates, so an increase in their share of the labor force would, all else equal, be associated with an increase in the unemployment rate.

11. An alternative procedure to reduce the influence of cyclical movements in the unemployment rate on the model's prediction for the labor force would be to replace the unemployment rate with the NAIRU and to replace the original wealth term with the "adjusted" version when estimating the regression. In this alternative, the NAIRU is not statistically significant, but the coefficient on the adjusted wealth term is little changed. Moreover, there is a substantial increase in the coefficient on the life expectancy term that leads to a much higher forecast for the participation rate—approximately 2 percentage points higher by 2017—which we would be uncomfortable showing as a base-case scenario. In any event, this exercise suggests that the forecast projections based on the original model (derived from the level terms in the regression in Table 3) are not overly optimistic.

(2)  0.1233 + 0.0784 × WT65F_LEF65_t − 0.9330 × YOUNG015_t + 0.2146 × WR1996Q3_t − 0.2682 × NW_SCALED5564LR_t − 0.0048 × NAIRU_t.
The coefficients in this expression are identical to those on the corresponding level terms in
Table 3. The predicted value for this model over
both history and forecast is displayed in Figure 2,
along with a prediction that does not remove
cyclical contributions from the unemployment
rate. Forecast assumptions for the wealth ratio
and the NAIRU are from MA's most recent Long-Term Economic Outlook publication.
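Equation (2) can be evaluated directly. The regressor values below are invented placeholders, not MA's actual inputs, so the output merely illustrates the arithmetic of the trend expression.

```python
# Evaluating the trend expression in equation (2) at one illustrative
# (invented) set of regressor values.

def trend_log_ratio(wt65f_lef65, young015, wr1996, nw_scaled_lr, nairu):
    """Trend prediction for log(LFC/LFCADJL) from the level terms."""
    return (0.1233
            + 0.0784 * wt65f_lef65
            - 0.9330 * young015
            + 0.2146 * wr1996
            - 0.2682 * nw_scaled_lr
            - 0.0048 * nairu)

log_ratio = trend_log_ratio(wt65f_lef65=1.10, young015=0.21,
                            wr1996=0.033, nw_scaled_lr=0.60, nairu=5.0)
# The trend labor force is then LFCADJL times exp(log_ratio).
```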
The model easily incorporates the secular
increase in the log-ratio from the early 1960s to
the 1990s. It also easily replicates the flattening
that began in the late 1990s and, to some extent,
the downturn in the first half of the current decade.
As of 2008:Q2, the actual and predicted ratios
differ by just 0.6 percent. According to the model,
about three-fourths of the increase in the ratio of
LFC to LFCADJL is “explained” by the dependency
term, with most of the remainder accounted for
by life expectancy, with smaller and roughly offsetting contributions on net from the other terms.
According to the model, welfare reform raised the
level of the labor force by approximately 0.75
percent beginning in 1996:Q3, or by about 1.0
million persons. This figure is comparable to
estimates by other researchers of the impact of
welfare reform.12 To a first approximation, the
impact on the labor force from the welfare reform
term is nearly constant through the end of the
estimation sample and in the forecast.13 The estimate of "trend" for the behavioral component is a little higher than the unadjusted prediction for periods when the unemployment rate is above the NAIRU.

12. Blank (2004) notes that between 1995 and 2001, a period over which, on net, there was little change in the aggregate unemployment rate, employment of single mothers rose by approximately 820,000, as welfare caseloads fell by roughly double that amount. The author argues that 820,000 likely understates the full effect on employment of welfare reform. The impact on the labor force was likely even larger than the impact on employment. Bartik (2000) estimated that welfare reform expanded the labor force of less-skilled women by over 1 million persons.

13. The value of WR1996 rises from zero to about 0.035 in 1996:Q3. On balance, it drifts down through the end of the estimation sample, to about 0.031 as of 2008:Q2. Based on the estimated model in Table 3, the percentage contribution from this term declined from about 0.75 percent in late 1996 to about 0.66 percent in early 2008. In level terms, the estimated contribution to the labor force in early 2008 (of 1 million persons) is essentially identical to the contribution from this term as of 1996:Q3.

Table 3
Summary of Regression Results

Dependent Variable: log(LFC/LFCADJL)
Sample: 1960:Q1–2008:Q2
Included observations: 194

Variable          Coefficient    HAC SE    t-Statistic    p-Value
CONSTANT          0.1233         0.0392    3.1492         0.0020
YOUNG015          –0.9330        0.0556    –16.7784       0.0000
WT65F_LEF65       0.0784         0.0135    5.7987         0.0000
WR1996            0.2146         0.0761    2.8191         0.0055
LURC              –0.0048        0.0005    –10.7430       0.0000
NW_SCALED5564     –0.2682        0.0436    –6.1487        0.0000

R²                             0.9968     Mean dependent variable        –0.0524
Adjusted R²                    0.9960     SD dependent variable           0.0473
Standard error of regression   0.0030     Akaike information criterion   –8.5967
Sum squared residual           0.0014     Schwarz criterion              –7.9061
Log likelihood                 874.88     F-statistic                  1,195.81
Durbin-Watson statistic        0.7750     Probability (F-statistic)       0.0000

NOTE: HAC SE, heteroskedasticity and autocorrelation consistent standard error; SD, standard deviation. Not shown are the coefficients on the leads and lags of first differences for each of the level regressors (excluding the constant). Three leads and lags and contemporaneous values were included for each of the differenced terms. Sample means were deducted from the first differences before estimation.
The model’s forecast includes a pronounced
upward movement in the behavioral component
of the labor force, especially after 2011, mostly in
response to an increasing (indeed, accelerating)
contribution from the life expectancy term, along
with a small increase in the contribution from
the dependency term. The contribution from the
welfare reform term is nearly constant in the forecast,
and the contribution from the adjusted wealth
term to the change in the forecast through 2017
is small. We return to a discussion of the life
expectancy term and its contribution to the forecast in a subsequent section.

DISCONTINUITIES IN
POPULATION CONTROLS
The historical time series on the civilian
noninstitutional population periodically exhibits
sharp swings stemming from changes in the population controls that are used to extrapolate survey
results (population data are published by the
U.S. Bureau of the Census in the Current Population
Survey). The reason is that when the population
controls are updated, their effects are not normally
backdated or smoothed when entered into the
official estimates for the civilian noninstitutional
population. For example, when the population
control for January 2000 was raised to reflect
the results of Census 2000, it led to an upward
adjustment to the official estimate for the civilian
noninstitutional population as of that date of

approximately 2.6 million persons. Data for previous periods were not restated upward to reflect
the new, higher population control for January
2000, resulting in a discontinuity in the official
data.14 Similar discontinuities surround previous
decennial censuses and other dates. Discontinuities exist for the same reason in the official data
on the labor force.
The existence of discontinuities affects our
measure of the demographic contribution to the
labor force, LFCADJL, because the population
details used in its construction are subject to the
same discontinuities. This does not represent a
problem for our regression analysis, because the
estimates of the civilian labor force (LFC) and
LFCADJL are subject to discontinuities from the
same source and are consistent. However, the
existence of population control–related discontinuities does affect estimation of “trend” in
these series (and in the civilian noninstitutional
population).15
Estimates of the effect of revised population
controls on the aggregates for the civilian noninstitutional population and for the total labor force
are available in BLS publications for several
decades of data, but detailed information necessary to smooth the impacts on the population
details used to construct LFCADJL is not available. Given these discontinuities, what is the best
way to proceed? Although highly imperfect, we
adjust LFCADJL by multiplying it by the ratio of
the adjusted to the unadjusted totals for the civilian noninstitutional population. This reduces but
clearly does not eliminate some of the spikes in
the growth of LFCADJL over history (Figure 3).
As seen later, this results in extra variability in
the model’s prediction for trend growth of the
labor force.
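The ratio adjustment just described can be sketched as follows. The population and index series are invented, with a discrete jump at one date mimicking the introduction of a new population control.

```python
# Sketch of the population-control adjustment: scale LFCADJL by the ratio of
# a smoothed ("adjusted") civilian noninstitutional population to the official
# ("unadjusted") series, which jumps when new controls are introduced.
# All series are invented; the jump at index 2 mimics a control update.

pop_unadjusted = [210.0, 210.4, 213.4, 213.8]   # discrete jump at t = 2
pop_adjusted   = [212.2, 212.6, 213.4, 213.8]   # jump smoothed into history
lfcadjl        = [141.0, 141.3, 143.0, 143.3]   # index inherits the jump

lfcadjl_adj = [l * a / u
               for l, u, a in zip(lfcadjl, pop_unadjusted, pop_adjusted)]

def growth(series):
    """Period-to-period percent change."""
    return [(s1 / s0 - 1.0) * 100 for s0, s1 in zip(series, series[1:])]

# The spurious growth spike at the control date shrinks after adjustment.
spike_before = growth(lfcadjl)[1]
spike_after = growth(lfcadjl_adj)[1]
```

After the control date the two population series coincide, so the adjustment leaves recent levels unchanged while damping the artificial growth spike at the splice point.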
14. The BLS estimates that the introduction of new population controls based on Census 2000 raised the civilian noninstitutional population 16 and older (N16C) and LFC by approximately 2.6 and 1.6 million, respectively. Civilian employment was raised by about 1.6 million at the same time. The aggregate unemployment rate was essentially unaffected by updated population controls based on Census 2000.

15. The participation rates are usually not affected greatly by the introduction of updated population controls, as the revisions to the totals for the labor force and the civilian noninstitutional population are approximately proportional.

AN ESTIMATE OF TREND
GROWTH IN THE LABOR FORCE
Figure 4 displays the growth rate of the civilian labor force after adjustments that smooth the
effects of updated population controls, along with
a forecast from MA’s most recent long-term outlook.
The figure also shows the prediction of the trend
in the adjusted labor force. The latter includes the
version of LFCADJL adjusted for revised population controls (the adjustment is admittedly incomplete) and the estimate of “trend” for the behavioral
component of the labor force based on the model
described previously. Figure 5 shows a corresponding set of estimates for the labor force participation rate.
One of the most obvious features is that the
estimate of trend growth is not smooth, especially
in history. In part this reflects changes in its behavioral determinants, but it also reflects discontinuities from updated population controls that,
given available information, we are able to reduce
but not eliminate. The spike in 1990 is an example,
as are a pair of sharp declines in the 1960s.
According to the model, trend growth in the
labor force peaked in the early 1970s at slightly
below 3 percent; but it soon subsided and, for
most of the 1980s and the first half of the 1990s,
trend growth fluctuated between 1 and 2 percent.
It rose briefly in 1996 in response to welfare
reform. Declines in the net worth term generated
brief increases in the model’s prediction for potential labor force growth in the earlier 2000s and
again recently (and through the first couple years
of the forecast).
Turning to the forecast, trend growth of the
labor force is projected to average 0.9 percent
from 2008 to 2017, three-tenths of a percentage
point higher than in our most recent forecast.
The trend in the labor force participation rate is
projected to edge down slightly but remain close
to 66 percent throughout the forecast through
2017, well above our previous forecast of a decline
to 64.6 percent by 2017. The model’s predictions
are also higher than trend estimates from the
Congressional Budget Office (2008) and especially
those from Aaronson et al. (2006).
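The pieces combine multiplicatively: the trend labor force is the adjusted demographic index times the exponential of the trend behavioral component, so trend growth is approximately the sum of the two growth contributions. The magnitudes below are invented, chosen only to echo the rough split between a demographic contribution and a behavioral contribution.

```python
# How demographic and behavioral trends combine into trend labor force growth.
# Illustrative magnitudes only: ~0.5% demographic growth plus a rising
# behavioral log-ratio contributing ~0.4%, for ~0.9% trend growth.

import math

lfcadjl_growth = 0.005                    # assumed demographic growth, annual
behavioral_log_ratio = [-0.140, -0.136]   # assumed trend log(LFC/LFCADJL)

lfcadjl = [150.0, 150.0 * (1 + lfcadjl_growth)]
trend_lfc = [l * math.exp(b) for l, b in zip(lfcadjl, behavioral_log_ratio)]

trend_growth = trend_lfc[1] / trend_lfc[0] - 1.0   # close to 0.009, i.e. 0.9%
```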

Figure 3
Demographic Contribution to Labor Force Growth with Population Adjustments
[Line chart, 1960-2016: quarterly percent change at an annual rate for LFCADJL, unadjusted and adjusted to smooth population controls.]

Figure 4
Trend Growth of the Labor Force with Population Control Adjustments
[Line chart, 1960-2016: 4-quarter percent change for the actual (adjusted) labor force with the MA forecast, and for the model's estimate of potential (trend) growth.]


Figure 5
Labor Force Participation (Actual and Trend) with Smoothed Population Controls
[Line chart, 1960-2016, in percent: the actual participation rate with the MA forecast, and the model's trend estimate.]

MODEL SPECIFICATION DETAILS
We contemplated a larger set of potential
behavioral influences on the labor force than
those shown in the model in Table 3. Many terms
that were considered do not appear in the featured
specification because the econometric results did
not support their inclusion, including (i) the difference between the marginal and average net-of-tax rates for labor income and the ratio of the
marginal to the average net-of-tax rates; (ii) the
marriage rate for women; (iii) the ratio of the
female to the male participation rate; (iv) the
ratio of after-tax Social Security retirement benefits to after-tax hourly labor compensation and
the same ratio multiplied by the population share
for age 65 and older; (v) a zero-one dummy variable for the elimination in 2000 of the Social
Security earnings test for persons who have
reached normal retirement age; (vi) replacement
of the unemployment rate with separate regressors
for the NAIRU and the difference between the
unemployment rate and the NAIRU; and (vii) a
linear time trend.16

Limitations on data availability and labor
resources precluded assessing other factors that
might influence work/retirement decisions, such
as the cost of medical care; parameters that affect
Social Security retirement benefits, such as a
more nuanced assessment of changes in the earnings test, and changes to the delayed retirement
credit; the evolution from defined-benefit to
defined-contribution retirement plans; and educational attainment and involvement. These issues certainly merit further investigation.

16. Although one of our goals was to develop a behavioral model without relying on ad hoc deterministic trends or shift terms, we did investigate the effect of adding a trend to evaluate whether one or more of the regressors in the featured specification appeared to be significant because it (or they) simply filled the role of a time trend. Fortunately, we did not find that to be the case. When a linear time trend is added to the regression for log(LFC/LFCADJL), it enters with a negative coefficient and is borderline statistically significant, with a t-statistic of –1.90, while the existing level regressors remain statistically significant. The coefficient on the life expectancy term rose by more than one-third, and the sum of the contributions from the trend and life expectancy terms in the forecast would have resulted in a prediction for the participation rate that by 2017 is 0.5 percentage points higher than for the featured model. The prediction from the featured model is already stronger than existing forecasts, including our own previous long-term projection, so we are hesitant to adopt specifications that imply even faster labor force growth in the forecast without a compelling reason to do so, a hurdle that we did not feel was cleared with a t-statistic of about –1.90 on a deterministic trend term.
The labor force participation rate of young
adults in the 16- to 19-year-old age bracket has
declined from a peak near 59 percent in the late
1970s to 41 percent as of 2008:Q2. The possibility
of further declines in the participation rate of this
bracket might constitute a downside risk to projections for the labor force but one that we believe
is small. If the participation rate of this age bracket
fell during the forecast horizon at a pace comparable to its decline over the past decade (which
is steeper than the decline over the entire period
from the late 1970s to now), then it would, all
else equal, lower the aggregate participation rate
in 2017 by approximately 0.4 percentage points.
Furthermore, we think the downside risk to the
forecast could be even less than suggested by the
static calculation. Why? First, our estimation
sample, which begins in 1960, includes the entire
period of decline in this age bracket, so the model
should not be “surprised” by continued declines
comparable to those experienced over history.
Second, as noted previously, when we added a
trend term to the model, the projection for the
labor force was actually higher than for the featured model. Third, we tried adding a trend premultiplied by the population share for 16- to 19-year-olds, but it was essentially zero, statistically
insignificant (t-statistic of –0.1), and produced
no discernible changes in other coefficients or in
the model’s predictions.17 Splitting the weighted
trend into separate terms for the period up to 1978
and thereafter was equally ineffective. Finally,
the decline in the participation rate of 16- to 19-year-olds seems to be related to increasing educational involvement of this group. For 16- and 17-year-olds, school enrollments have risen to more
than 95 percent, which presumably leaves relatively little room for additional increases. There
might be more room for increased participation
17. We also considered whether adding a similar term to the model, equal to the population share for the 16- to 24-year-old age bracket times a linear time trend, would change the results. This term did enter with a negative sign when added to the levels regression for log(LFC/LFCADJL). However, the in-sample predictions were similar to the model without this term, and the out-of-sample forecast projections were virtually identical. Based on this evidence, we chose not to include this term in the model.

F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W

for 18- and 19-year-olds, for whom the enrollment
percentage has risen to a little over 67 percent.

WHY USE FEMALE LIFE EXPECTANCY DATA?
The rising contribution from the life expectancy term is clearly the most important element
of the model that produces a prediction for labor
force growth that is higher than in other forecasts.
However, we are not inclined to conclude that
the model produces an overly optimistic projection for the labor force over the next decade. We
are comfortable with the notion that increases in
life expectancy raise the amount of wealth required
to support a given flow of expenditures in retirement and thereby contribute to increases in the
participation rates for older age brackets. Furthermore, other developments are likely to complement the impact of rising life expectancy and
contribute to future increases in participation
rates in older age brackets, including changes in the parameters that influence Social Security retirement benefits (the ongoing increase in the normal retirement age, the gradual weakening of the earnings test, and the expansion of the delayed retirement credit); rising educational
attainment and the increasingly knowledge-based
nature of employment; rising costs for health care;
the expansion of defined-contribution retirement
plans at the expense of defined-benefit plans; and
the possibility that employers will adapt to a slowdown in the growth of the population of prime-aged adults by increasing their recruitment and
retention efforts for older, skilled workers.
These factors aside, why did we choose the
particular form of the life expectancy term—the
life expectancy of women at the age of 65, multiplied by the share of women 65 and older in the civilian noninstitutional population (“female-weighted, female life expectancy”)? We considered other life expectancy terms, including the male-weighted, male life expectancy at age 65, and the
male-weighted, female life expectancy, among
others, but we did not include them for several
reasons. First, the female-weighted, female life
expectancy worked well, with a positive coefficient (as expected) and a high t-statistic of nearly 6.
Second, adding male-weighted life expectancies
(for either men or women at age 65) did not materially improve fit and led to similar estimates of
the contribution of changes in life expectancy
to the growth of the labor force over the forecast
period. Third, replacing the female-weighted,
female life expectancy with either the maleweighted, male life expectancy or the maleweighted, female life expectancy caused the fit
to deteriorate somewhat. Fourth, replacing the
female-weighted, female life expectancy with the
male- and female-weighted average life expectancy
for men and women at age 65 caused the fit of the
equation to deteriorate slightly. Fifth, we found
support for a strong role for female life expectancy
in a preliminary investigation into the labor force
participation rates for specific age brackets of
older men and women but not for male life
expectancy.
Is the more important role for female life expectancy indicated by these statistical results reasonable? We think it is, for several reasons. First,
except when ill health intervenes, spouses tend
to coordinate their work/retirement decisions,
suggesting that the decisions of husbands will
depend in part on the life expectancy of their
wives, and vice versa.18 Second, on average,
women live longer than men, suggesting that the
life expectancy of wives is more important for
savings and retirement decisions within the household.19 Third, Goda, Shoven, and Slavov (2007)
demonstrate that many individuals experience a
sharp increase in their net Social Security tax rate
as they age; and, because of the parameters that
determine taxes and benefits, on average men
are likely to experience a sharper increase than
women and at a much earlier age than women.
For many men, the sharp increase occurs at or before the normal retirement age, creating a financial incentive toward earlier retirement that, on average, is larger for men than for women. This feature of Social Security tends to diminish the role of male life expectancy and, in the context of household decisionmaking, accentuates the importance of female life expectancy for the retirement decisions of both genders.

18. Munnell and Sass (2007) discuss many factors that influence the supply of labor for older Americans. They cite several papers showing a strong tendency for husbands and wives to retire within one to two years of each other.

19. On a related point, consider the work/retirement decisions of widows and widowers. They are likely to be influenced by their own life expectancy but not by the statistical life expectancy of the opposite gender. There are more widows than widowers, which accentuates the role of female life expectancy relative to male life expectancy.

CONCLUSION: IMPLICATIONS FOR POTENTIAL OUTPUT
The estimate for trend growth of the labor
force can be combined with other procedures
described in a June 2008 presentation to generate
a consistent estimate of potential GDP growth.20
Here we briefly sketch the implication for potential growth over the forecast through 2017. The
main elements of potential GDP are (i) potential
growth of hours worked in the nonfarm business
sector, (ii) structural productivity growth in the
nonfarm business sector, and (iii) trend growth
in other GDP. The sum of the first two elements
(apart from compounding) provides an estimate
of potential GDP growth in the nonfarm business
sector. That sector accounts for approximately
three-fourths of total GDP.
Trend hours in the nonfarm business sector
equals the trend in the workweek, which we
assume is roughly flat in the forecast, times potential employment in that sector. The latter is equal
to total potential civilian employment less employment in the “other” sectors outside the nonfarm
business sector. Our procedures ensure that the
“other” employments, which account for roughly
20 percent of total employment, are consistent
with our forecasts for the “other” output, which
includes government output, output of the household and institutional sectors, and agricultural
output. Potential civilian employment is simply
one minus the NAIRU (in decimal form) times
the potential labor force. Using these methods,
we estimate that potential hours growth through
2017 will be close to the estimate of potential labor
force growth, or about 0.9 percent per annum. This reflects an assumption of a roughly flat trend in the workweek, an essentially constant value for the NAIRU, and an increase in “other” employment that averages about 1.1 percent.

20. See Matheny’s presentation entitled “Research Update: Potential GDP” delivered at MA’s June 10, 2008, Quarterly Outlook Meeting.
Our estimate of structural productivity growth
reflects contributions from capital deepening and
growth of total factor productivity. We assume in
the forecast that the latter will increase at a 1.2
percent annual rate. Based on projections regarding
the growth of capital services in Macroeconomic
Advisers’ most recent long-term outlook as of this
writing, capital deepening is expected to add
another 0.8 percentage points to productivity
growth in the forecast, resulting in structural
productivity growth of about 2.0 percent and
potential GDP growth in the nonfarm business
sector of about 2.9 percent. Allowing for a contribution from “other” GDP of about 0.4 percentage
points on average, this implies that total potential
GDP growth would be expected to average about
2.6 percent through 2017. This is two-tenths of a percentage point
higher than the Congressional Budget Office’s
(2008) projection that potential GDP growth will
average 2.4 percent over the same period.
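The growth accounting in this paragraph can be reproduced directly from the stated figures. The sketch below treats the sums as simple (non-compounded) contributions, as the text does; the three-fourths nonfarm business share and the 0.4-point “other” GDP contribution are the stated approximations.

```python
# Sketch of the potential GDP growth accounting described above,
# using the figures reported in the text (illustrative arithmetic only).

tfp_growth = 1.2             # total factor productivity growth, % per year
capital_deepening = 0.8      # contribution of capital deepening, pct. points
hours_growth = 0.9           # potential hours growth in nonfarm business, %

structural_productivity = tfp_growth + capital_deepening       # about 2.0
nfb_potential_growth = structural_productivity + hours_growth  # about 2.9 (apart from compounding)

nfb_share = 0.75             # nonfarm business share of GDP (~three-fourths)
other_gdp_contribution = 0.4 # contribution of "other" GDP, pct. points

total_potential_growth = nfb_share * nfb_potential_growth + other_gdp_contribution
print(round(total_potential_growth, 1))  # about 2.6 percent, vs. the CBO's 2.4
```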

REFERENCES

Aaronson, Stephanie; Fallick, Bruce; Figura, Andrew; Pingle, Jonathan and Wascher, William. “The Recent Decline in the Labor Force Participation Rate and Its Implications for Potential Labor Supply.” Brookings Papers on Economic Activity, Spring 2006, 1, pp. 69-134.

Bartik, Timothy J. “Displacement and Wage Effects of Welfare Reform,” in David Card and Rebecca M. Blank, eds., Finding Jobs: Work and Welfare Reform. New York: Russell Sage Foundation, 2000, pp. 72-122.

Blank, Rebecca M. “What Did the 1990s Welfare Reform Accomplish?” Written for the Berkeley Symposium on Poverty and Demographics, the Distribution of Income, and Public Policy, December 2003; updated 2004; http://urbanpolicy.berkeley.edu/pdf/Ch2Blank0404.pdf.

Congressional Budget Office. “The Budget and Economic Outlook: An Update.” CBO, September 2008; www.cbo.gov/ftpdocs/97xx/doc9706/09-08-Update.pdf.

Goda, Gopi S.; Shoven, John B. and Slavov, Sita N. “Removing the Disincentives in Social Security for Long Careers.” NBER Working Paper No. 13110, National Bureau of Economic Research, May 2007; www.nber.org/papers/w13110.pdf?new_window=1.

Goodstein, Ryan. “The Effect of Wealth on the Labor Force Participation of Older Men.” Unpublished manuscript, University of North Carolina–Chapel Hill, March 2008; www.unc.edu/~rmgoodst/wealth.pdf.

Macroeconomic Advisers. Long-Term Economic Outlook. September 24, 2008.

Matheny, Ken. “Research Update: Potential GDP,” presented at Macroeconomic Advisers’ Quarterly Outlook Meeting, June 10, 2008, St. Louis, Missouri.

Munnell, Alicia H. and Sass, Steven A. “The Labor Supply of Older Americans.” Working Paper No. 2007-12, Center for Retirement Research at Boston College, June 2007; http://crr.bc.edu/images/stories/Working_Papers/wp_2007-12.pdf?phpMyAdmin=43ac483c4de9t51d9eb41.

Social Security Administration. Personal Responsibility and Work Opportunity Reconciliation Act of 1996 (Public Law 104-193; §115.42 U.S.C. 862a), August 22, 1996; www.ssa.gov/OP_Home/comp2/F104-193.html.

U.S. Census Bureau. Current Population Survey. www.census.gov/cps/.

Commentary
Ellis W. Tallman

Macroeconomists ranging from
policymakers to business and
economic forecasters use the concept of potential output in specific
economic constructs. In some applications,
economists look at the “output gap”—the difference between an estimate of potential output and
the measure of actual real output—as a forecasting
tool for inflation to gauge whether deviations of
real output from potential should lead to increases
or decreases in future inflation. Monetary policymakers use potential output in this way in applications of the Taylor rule framework. Separately,
economic forecasters use the estimate of potential
output as a comprehensive measure of the underlying trend in real output growth for the economy.
In the latter usage, calculating an estimate for
potential output typically starts with estimates
of the primary factors of production—capital
and labor inputs.
The motivation for the paper “Trends in the
Aggregate Labor Force” (Matheny, 2009) is the
search for a more accurate and comprehensive
measure of the labor input for potential output
estimates. The goal is commendable, and there
are few reasons to fault the author for committing
resources toward producing an improved estimate
for the labor input. Matheny uses a more detailed
set of labor data series from which to calculate
an estimate of the available labor force and ultimately to create an estimate of the labor input
measure. Even in a preliminary form, the paper
provides a concise survey of a work in progress

as it outlines a number of additional issues that
remain unsettled. Among the main findings is an
influential role of factors that could influence the
labor force participation of women 55 and older
as inferred from an estimated regression model.
This participation rate has increased over time
and is currently higher than has been observed
historically. The bottom line from the research is
that estimates of potential output that do not take
into account behavioral responses that reflect
increasing labor force participation rates of the
older population will underestimate the growth
in the labor force and thereby underestimate the
growth rate for potential output.
In this discussion, I focus my comments on
these central findings of the research. First, my
discussion outlines the contribution of the paper
with respect to the calculation of the demographic
component of the labor force. Next, the comments
focus on the main explanatory variable in the
aggregate labor force participation rate regression—
the population proportion of women 65 and older
weighted by life expectancy of women at age 65
(the behavioral variable WT65F_LEF65 in the
paper). Next, the discussion investigates whether
other, additional factors may explain the strong
observed correlation between the dependent variable (a change in the aggregate labor force participation rate) and the WT65F_LEF65 variable. More
narrowly, I ask whether there are underlying
variables that may explain the increased labor
force participation of women 65 and older in
addition to the rising life expectancy of women.
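For concreteness, the WT65F_LEF65 regressor is the product of a population share and a life expectancy. The sketch below uses an invented 10 percent share together with the 19.7-year life expectancy value for 2008 that appears later in this discussion.

```python
# Construction of the behavioral regressor discussed above:
# (share of women 65 and older in the civilian noninstitutional
# population) x (life expectancy of women at age 65, in years).
# The population share here is invented for illustration.

women_65plus_share = 0.10    # assumed population share of women 65 and older
life_expectancy_f65 = 19.7   # years at age 65 (the 2008 value cited in the text)

wt65f_lef65 = women_65plus_share * life_expectancy_f65
print(round(wt65f_lef65, 2))  # 1.97, in line with the scale shown in Figure 2
```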

Ellis W. Tallman is the Danforth-Lewis Professor of Economics at Oberlin College.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 311-16.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


Further, the discussion investigates whether the
implied elasticity of labor force participation
with respect to the WT65F_LEF65 variable in the
regression is consistent with feasible changes in
the labor force participation rate of women 65
and older. These observations raise numerous interesting research questions, for labor economists in particular. Finally, I make some suggestions for
broadening the appeal of the work.

CALCULATION OF THE LABOR INPUT
The bottom-line finding of the paper is that
a revised labor input measure contributes an
increase of nearly 0.5 percentage points to the
estimation of potential output growth. The increase sounds small, but an adjustment of that size is significant, especially if it is an accurate forecast.
Clearly, the labor input for the estimation of potential output is only one of several inputs important
for that calculation. Rather than highlighting the
limitation of focusing on only one factor input,
this discussion adopts the view, as stated in the
paper, that refining the labor input measure for a
potential output estimate is “low-hanging fruit.”
The treatment of labor force growth is central
to the paper, and it clarifies the distinction between
the components of labor force growth that reflect
only shifting population demographics and those
that reflect labor force participation rates of the
demographic subcategories (gender and age categories). The population demographics can be
predicted reliably from population data. In contrast, the labor force participation rates may vary
as a result of changes in economic situation, life
expectancy, and so on and therefore may deviate
from a trend labor force participation rate. The
paper makes a notable contribution to the measurement of the labor input estimate from the calculation of additional gender/age brackets and the
incorporation of the related labor force participation rates. Specifically, the paper increases the
number of age brackets from 7 to 15, thereby
increasing the detail of the population characteristics and likely affording a more comprehensive
labor force estimate. Further, the paper uses the
narrower population measure (civilian noninstitutional population) rather than resident population data to generate more precise estimates of the labor force. Using these population demographic data,
the author calculates a chain index of the age-and-gender population detail at the quarterly frequency.
The labor force series uses the participation
rates from the previous period (t–1) as weights for
the population demographics for each age and
gender category in the current period and thereby
emphasizes the impact of demographic factors.
The series, listed as LFCADJL, measures the
quarter-to-quarter growth as entirely due to demographic factors. The previous description understates the amount of meticulous data analysis
required to formulate an improved labor force
growth estimate.
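A minimal sketch of this fixed-weight construction, using two hypothetical brackets and invented numbers, may clarify why quarter-to-quarter growth in the resulting series reflects demographics alone.

```python
# Illustrative sketch (not the author's code) of the demographic-only labor
# force series described above: current-period population by age/gender
# bracket is weighted by the PREVIOUS period's participation rates, so
# period-to-period growth reflects demographic shifts alone.

def demographic_labor_force(pop_now, part_prev):
    """pop_now, part_prev: dicts keyed by (gender, age_bracket)."""
    return sum(part_prev[g] * pop_now[g] for g in pop_now)

# Hypothetical two-bracket example: population shifts toward an older,
# lower-participation bracket while participation rates are held fixed.
part_prev = {("F", "25-54"): 0.75, ("F", "65+"): 0.15}
pop_prev = {("F", "25-54"): 100.0, ("F", "65+"): 20.0}   # millions (invented)
pop_now  = {("F", "25-54"): 100.0, ("F", "65+"): 24.0}

lf_prev = demographic_labor_force(pop_prev, part_prev)   # 78.0
lf_now  = demographic_labor_force(pop_now, part_prev)    # 78.6
growth = 100 * (lf_now / lf_prev - 1)  # growth due to demographics alone (~0.77%)
```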
The influence of population growth in a given
demographic component on the labor force relies
on the proportion of that demographic group in
the labor force (noting the dating differences of
the aggregate and age-gender bracket). Clearly, if
a demographic group—like those 75 and older—
grows rapidly, but the share of that demographic
group in the labor force is low, then the influence
of that population growth on the labor force is
small. As noted previously, this labor force measure highlights the demographic components of population growth and their effects on the labor force when participation rates are held fixed.
The accounting aspect of the investigation,
that is, the addition of demographic subcategories
in the labor input measure, provides only the
groundwork for the economic analysis of the
behavioral element of the labor force input. Still,
the general work on the comprehensive dataset
offered opportunities to investigate the labor force
participation rates of various age and gender
brackets.
Figure 1 isolates the labor force participation rate for women 55 and older and its subcategories (55-59, 60-64, 65-69, and 70 and older). The observation of rising labor force participation rates of women 65 and older compels further investigation, and the empirical work investigates whether including a measure of the life expectancy of women at age 65 multiplied by the population proportion of women 65 and older adds explanatory power to a regression to forecast the behavioral element (labor force participation rates) of the aggregate labor input. The research provides an interesting initial inquiry into a regression-based empirical model to explain (and then predict) the aggregate labor force participation rate.

[Figure 1: Female Labor Force Participation by Age. Percent, annual data 1948-2003; series: Total, 16 and Over; 55 to 59 Years; 60 to 64 Years; 65 to 69 Years; 70 and Over.]

THE REGRESSION
The regression analysis in the paper uses a
set of explanatory variables intended to account
for the behavioral changes in the aggregate labor
force participation rate. The paper outlines and
describes the regression in detail; my discussion
here focuses on one key explanatory variable—
life expectancy of women at age 65 times the share
of women age 65 and older in the adult population. This variable is especially important for the
forecast period 2011-17 and largely explains the
increase in labor force participation in the new
estimate for the labor input.
The finding raises a number of questions; the
main one is whether a regression model that is
meant to explain the behavioral variations in
aggregate labor force participation rates attributes
too much influence to this particular variable. It
would be helpful to have an explicit accounting
for the quantitative increase in the labor force
generated by increases in WT65F_LEF65. First,
the explanatory series should have a positive effect
on the participation rates of women 65 and older.
Second, the increase in the participation rate of
women 65 and older times the population of
women 65 and older should generate an increase
in the labor force of women 65 and older of a similar magnitude to the one generated by the aggregate labor force participation rate regression.1
Alternatively, the author can work in the opposite direction by taking the increase in the labor force implied by the aggregate labor force participation rate regression coefficient and investigating the required increase in the labor force participation
rate for women 65 and older that would be necessary to generate the labor force observation.

1. A related question is whether the rise in life expectancy for women at age 65 has significant explanatory power for the participation rate of women 65 and older.

[Figure 2: Comparison of Weighted versus Unweighted Population Proportion of Women 65 and Older, 2008-17; series: Fixed Life Expectancy (held at the 2008 value) and Weighted Life Expectancy. SOURCE: Population projections from www.bls.gov/emp/emplab1.htm.]
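The feasibility check proposed in this passage is simple arithmetic. The sketch below works in the reverse direction just described, with invented magnitudes for both inputs.

```python
# Hypothetical back-of-the-envelope version of the proposed check: take the
# labor force increase implied by the aggregate regression and back out the
# required change in the participation rate of women 65 and older, which can
# then be judged against historically observed rates.
# Both inputs below are invented for illustration.

pop_women_65plus = 22.0      # population of women 65 and older, millions (assumed)
implied_lf_increase = 1.1    # labor force increase implied by the regression, millions (assumed)

required_rate_change = implied_lf_increase / pop_women_65plus
print(round(100 * required_rate_change, 1))  # required rise, in percentage points
```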
Second, the variable itself is composed of
two increasing components—the population
proportion of women 65 and older and the life
expectancy of women at age 65. Figure 2 shows
the estimated series for 2008-17 along with a
series in which the life expectancy after age 65 is
held fixed at 19.7 years (the expectancy in 2008).
Clearly, the dominant component of the series is
the population proportion of women 65 and older,
which reflects the demographic influence of the
large baby boom generation. If the life expectancy
component of the measure were important to the
regression results, then a regression using only
the population proportion of women 65 and older
should not have much explanatory power. If, on
the other hand, the regression results are similar,
then the result suggests that behavioral variations
in the aggregate labor force participation rate
respond to demographic movements. Such an
explanation would be unsatisfying.
The author could also try a few other techniques to assess the feasibility of the result. The

data on the population of women 65 and older
could be used to carry the demographic analysis
out to the forecast year 2017, given standard
assumptions for the mortality rate, and so on.
Then, the analysis can focus on examining a set
of possible labor force participation rates for
women 65 and older and how different labor
force participation rates affect the aggregate labor
force. For example, a particular labor force participation rate for this specific group could be chosen
to determine what that participation rate suggests
for the aggregate labor force calculations. The
accounting of the population demographics is
noncontroversial; the examination of the labor
force implications of various labor force participation rates for this demographic can be thought of
as a conditional forecasting exercise. The analysis
would allow an inference for (i) whether the
explanatory power of the life expectancy of
women at age 65 reflects only the contribution
of women 65 and older to the labor force or (ii)
whether the measure reflects further influences
as a proxy. Other factors may be correlated with
the specific regressor variable; that result, if found,

would allow further refinement of the initial
finding. Additional research would then aim at
uncovering the additional factors with the goal
of identifying (or at least clarifying) other underlying sources for the increase in the labor force participation rate.
The regression model is meant to explain the
behavioral aspects of labor force participation,
although the current findings also introduce some
intriguing questions that, the author admits,
remain unsettled. Some of these questions are
addressed in the paper. For example, the author
investigates whether aggregate wealth calculations
explain the increased labor force participation;
initial results suggest that a measure of wealth
was not associated with the increase in labor force
participation. The result may be only preliminary,
however, because it uses an aggregate measure of
per capita wealth. In accord with the previous
suggestions, an analysis of disaggregate wealth
measures that relate to specific demographic
groups—for example, the population 65 and
older—may have explanatory power for the labor
force participation of that subcategory.
Increased life expectancy of women at age 65
may explain the higher-than-anticipated labor
force participation of women 65 and older; it
makes intuitive sense. Separately, there may be
important cost-of-living elements that drive a
higher labor force participation rate for those 65
and older. Recent empirical work by Broda and
Romalis (2008) suggests that economic analysis
can be more precise with respect to the “wage
gap” with more precise price deflators that relate
more closely to the prices and to the expenditure
patterns of the relevant income groups in the comparison. Perhaps a similar approach can be used
for the population 65 and older. The consumer
basket for a person 65 or older could be notably
different from the standard basket of goods used
in the calculation of the consumer price index.
One might expect a larger component of spending
on prescription drugs and for health services for
those 65 and older; then, there might be a faster
rate of inflation for that cohort than for the general
public. A rising cost of living for those facing fixed
incomes might lead to a higher-than-expected rate

of labor force participation. In this case, longer
work lives may also be related to the increased
life expectancy of those 65 and older.
These comments and criticisms aim to refine
and dissect a notable result. The basic finding of
the regression highlights a major flaw in the use
of fixed or trend participation rates in the calculation of “potential” labor force. That contribution
remains even though several other factors remain
to be investigated as potential sources for a forecast of increased labor force participation in the
aggregate labor input measure. Specifically, the
empirical work captures some of the observed
changes in the labor force decisions of older individuals and the effect of those changes on the
labor force projections for the future. The point
is especially important given the demographic
impact of the baby boom generation on the labor
force as that generation approaches retirement
age. If the baby boomers stay in the labor force
longer than anticipated, there will be important
labor market effects, and this paper emphasizes
that point.

IDEAS FOR ILLUSTRATING THE IMPORTANCE OF THE LABOR INPUT REVISION
The paper provides ample evidence to suggest
that the labor force participation rate increase
among those aged 65 and older may increase the
potential labor force above the pessimistic forecasts offered by the demographic data alone. Yet,
the labor input is only one component of the calculation of potential output. In addition, some
influential treatments of estimating potential
output have instead focused on the calculation
of the effect of computers on economic growth
(Jorgenson, 2005). The paper can use the impending baby boomer retirement wave to motivate the relevance
of the labor input in the calculation of a real-time
potential output estimate. The estimate of potential output that incorporates revised labor force
participation rates (new behavioral labor force
estimates) displays a deviation from the previous
potential gross domestic product estimate that is

larger than at any earlier period in the estimation
sample.
Revisions to potential gross domestic product
measures have been the subject of numerous
empirical investigations (Orphanides, 2001); the
paper could incorporate some of these findings
to illustrate where prior estimates of potential
output failed to account for certain factors. It is
likely that the current labor force participation
rates are undergoing an adjustment that will seem apparent only in retrospect.
It may be worthwhile, though not necessarily
for this research agenda, to determine whether
there are precedents for the labor force participation rate underestimate. Perhaps the increase in
female labor force participation through the 1970s
and 1980s was relatively unexpected. More
recently, the influence of immigration may have
affected estimates of the labor input. The paper
can highlight further its relevance if it can isolate
historical episodes in which more accurate labor
input measures for a potential output estimate
were empirically important.

CONCLUSION
The paper offers an interesting contribution
to the calculation of the labor input for a potential
output estimate by increasing the disaggregation
of the demographic components of the labor force
input. Further, the paper provides initial results
for a model of the behavioral element of the labor
force input, essentially, a model of the aggregate
labor force participation rate. The data-based
enhancements for the labor input measure are
noncontroversial and should offer a roadmap for
other estimates of potential output growth. The
model-based predictions regarding the aggregate
labor force participation rates are intended to
stimulate discussion rather than be taken as ultimate findings. The discussion highlights a number
of avenues to pursue to refine our understanding


of the estimated regression model and to assess
its robustness.
The overall implication of the regression
analysis suggests that the pessimistic forecasts
of labor force growth in the United States may be
too low, and that suggestion contributes to an
interesting debate about labor force dynamics in
the medium term. The paper raises a number of
interesting research topics from the aggregate labor
data. Perhaps other interesting research could
use the aggregate research results as motivation
for modeling the behavioral decisions for labor
force participation on the level of the disaggregate
population demographics. Although these ideas
are not part of the author’s research agenda, labor
economists could offer findings that then help
isolate additional sources of the increased labor
force participation rate.

REFERENCES
Broda, Christian and Romalis, John. “Inequality and Prices: Does China Benefit the Poor in America?” Unpublished manuscript, University of Chicago, May 2008; http://faculty.chicagogsb.edu/christian.broda/website/research/unrestricted/BrodaRomalis_TradeInequality.pdf.

Jorgenson, Dale W. “Accounting for Growth in the Information Age,” in Philippe Aghion and Steven Durlauf, eds., Handbook of Economic Growth. Volume 1A, Chapter 10. Amsterdam: Elsevier, 2005, pp. 743-815; www.economics.harvard.edu/faculty/jorgenson/files/acounting_for_growth_050121.pdf.

Matheny, Kenneth J. “Trends in the Aggregate Labor Force.” Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 297-309.

Orphanides, Athanasios. “Monetary Policy Rules Based on Real-Time Data.” American Economic Review, September 2001, 91(4), pp. 964-85.


Potential Output in a Rapidly Developing Economy:
The Case of China and a Comparison with the
United States and the European Union
Jinghai Zheng, Angang Hu, and Arne Bigsten
The authors use a growth accounting framework to examine growth of the rapidly developing
Chinese economy. Their findings support the view that, although feasible in the intermediate term,
China’s recent pattern of extensive growth is not sustainable in the long run. The authors believe
that China will be able to sustain a growth rate of 8 to 9 percent for an extended period if it moves
from extensive to intensive growth. They next compare potential growth in China with historical
developments in the United States and the European Union. They discuss the differences in production structure and level of development across the three economies that may explain the countries’
varied intermediate-term growth prospects. Finally, the authors provide an analysis of “green” gross
domestic product and the role of natural resources in China’s growth. (JEL L10, L96, O30)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 317-42.

The rapid development of emerging
markets is changing the landscape of
the world economy and may have
profound implications for international
relations. China has often been regarded as the
most influential emerging market economy.
Projections indicate that the absolute size of the
Chinese economy may be larger than that of the
United States within two to three decades. While
China’s growth performance since reform has
been hailed as an economic miracle (Lin, Cai, and
Li, 1996), concerns over the sustainability of its
growth pattern have emerged in recent years
as measured total factor productivity (TFP)
growth has slowed.

In recent years, economists have increasingly
referred to China’s growth pattern as “extensive.”
Extensive growth is intrinsically unsustainable
because growth is generated mainly through an
increase in the quantity of inputs rather than
increased productivity. In a previous paper (Zheng,
Bigsten, and Hu, 2009), we focused on China’s
capital deepening versus TFP growth and private
versus government initiatives. In this article, we
first compare China’s growth performance with
what would otherwise have been feasible, taking
into account the main factors commonly employed
to generate growth in rapidly developing economies. In other words, we compare official statistics with estimates of “potential” output growth to
shed further light on China’s recent growth patterns.

Jinghai Zheng is a senior research fellow in the Department of International Economics at the Norwegian Institute of International Affairs,
Norway, an associate professor in the department of economics at Gothenburg University, Sweden, and a guest research fellow at the Centre
for China Studies, Tsinghua University, Beijing, China. Angang Hu is a professor at the School of Public Policy and Management at Tsinghua
University. Arne Bigsten is a professor in the department of economics at Gothenburg University. The authors thank Justin Yifu Lin for his
support and encouragement of this project and Xiaodong Zhu for useful discussion. The study was also presented at the Chinese Economic
Association (Europe) Inaugural Ceremony, December 17, 2008, Oslo, Norway. The authors thank participants at the event and especially those
who commented on their paper. The study benefited from research funding from the Center of Industrial Development and Environmental
Governance at the School of Public Policy and Management, Tsinghua University. Yuning Gao provided excellent research assistance.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Second, we provide projections of the future
potential of the Chinese economy and discuss
China’s impact on the world economy. Specifically, we compare potential growth in China with
that for the United States and European Union
(EU). We note that structural characteristics, rapid
accumulation in capital stock, and improvement
in labor quality are the major factors behind
China’s phenomenal economic growth. China’s
future TFP growth is likely to be faster than that
of the United States and EU because of the stock
of world knowledge it may easily access at affordable prices to enhance its production possibilities
(Prescott, 2002).
Nobel laureate Ed Prescott (1998) asked why
“growth miracles” are a recent phenomenon. We
suspect that the main reasons are differences in
production structure and in the level of development. Examples of such miracles include the East Asian newly
industrialized countries (NICs), to some extent
post-WWII Japan and Germany, and the Soviet
Union between the first and second World Wars
and in the early years of the Cold War. Now, due
to rapid industrialization, China will soon join the
ranks of the high-performing East Asian nations.
Understanding the causes and conditions of economic miracles may prove useful for developing
countries. Understanding differences in production structure and the level of development may
also help explain why productivity slowed in the
United States and EU in the early 1970s, then
started to surge in the United States but stagnated
in Europe in the mid-1990s.
To analyze growth potential, we consider the
usual suspects of demographics, rural-urban
migration, and aging. In addition, we discuss
how estimates of potential output have affected
Chinese government policy regarding growth
planning. Because environmental regulations
and concerns are of increasing international
importance, we assess in the final section of this
analysis the influence of environmental factors—
specifically, to what extent past economic growth
reflected environmental “inputs” not elsewhere
accounted for.
THE ANALYTICAL FRAMEWORK
Years before the current worldwide credit
crunch, the economics literature included many
works that foresaw the looming economic crisis
(e.g., Gordon, 2005; Phelps, 2004; Stiglitz, 2002;
and Brenner, 2000 and 2004). Gordon’s (2005)
application of the growth accounting framework
to the study of the U.S. productivity revival and
slowdown stands out as convincing evidence that
economic theory can powerfully inform empirical
analysis for macroeconomic planning.
Since the publication of Solow’s seminal work
on technical progress and the aggregate production function, growth accounting has been used
to assess the economic performance of the former
Soviet Union (Ofer, 1987), raise concerns about the
sustainability of the economies of the East Asian
“tigers” (Hong Kong, South Korea, Singapore, and
Taiwan) just a few years before the East Asian
financial crisis (Young, 1995; Kim and Lau, 1994;
and Krugman, 1994), and, recently, forewarn planners about the macroeconomic imbalances in
China (Zheng and Hu, 2006).
Adequately implemented and understood,
growth accounting is a useful instrument for
improving the analysis of growth potential for
many countries and regions. Several examples
in the literature show that growth accounting
methods are sensitive enough to detect significant
changes in productivity performance if production
parameters are carefully chosen.
Growth accounting decomposes growth in output into its components:

(1)  Ẏ/Y = Ȧ/A + α(K̇/K) + (1 − α)(L̇/L),

where Y is gross domestic product (GDP) and Ẏ the change in GDP over time; K is capital stock and K̇ the change in capital stock; L is labor and L̇ the change in labor input; Ȧ/A is TFP growth; 0 < α < 1 is the output elasticity of capital; and (1 − α) is the output elasticity of labor.

Potential output growth may be calculated via equation (1) from knowledge of the potential growth of each of the right-hand-side components, plus estimates of output elasticities for the various inputs. Obviously, both the growth potentials
and the output elasticities will differ among countries, reflecting structural differences. Typical growth accounting structures are represented as follows: For China (Chow and Li, 2002; and Chow, 2008),

(2)  Ẏ/Y = Ȧ/A + 0.6(K̇/K) + 0.4(L̇/L);

for the United States (Congressional Budget Office [CBO], 2001),

(3)  Ẏ/Y = Ȧ/A + 0.3(K̇/K) + 0.7(L̇/L);

and for the EU (Musso and Westermann, 2005),1

(4)  Ẏ/Y = Ȧ/A + 0.4(K̇/K) + 0.6(L̇/L).

China has an output elasticity of capital of 0.6,
compared with 0.3 for the United States. Differences of this magnitude are large enough to generate a significant difference between the growth
potential of the two economies. For example, a
capital stock growth rate of 10 percent would
enable China to grow by at least 6 percent per
year, whereas, all else constant, it would increase
the U.S. growth rate by only 3 percent per year.
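As a quick numerical check, the country-specific decompositions in equations (2) through (4) can be written out directly. The sketch below is purely illustrative; the dictionary and function name are ours, not from the paper:

```python
# Growth accounting with the country-specific output elasticities of capital
# from equations (2)-(4): 0.6 for China, 0.3 for the United States, 0.4 for
# the EU. Rates are in percent.

ALPHA = {"China": 0.6, "United States": 0.3, "EU": 0.4}

def output_growth(country, g_A, g_K, g_L):
    alpha = ALPHA[country]
    return g_A + alpha * g_K + (1 - alpha) * g_L

# The comparison from the text: 10 percent capital stock growth alone
# (g_A = g_L = 0) adds about 6 percentage points to Chinese output growth
# but only about 3 to U.S. output growth.
print(output_growth("China", 0, 10, 0))          # -> 6.0
print(output_growth("United States", 0, 10, 0))  # -> 3.0
```

The difference comes entirely from the capital elasticity, which is why the same investment push moves the two economies so differently.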
Growth differences can also be related to differences in investment in physical capital as well
as in TFP growth. For developing economies such
as China, investment opportunities abound
because of the country’s relatively low level of
development compared with that of the United
States, EU, and other industrialized countries.
For the same reason, China more easily absorbs
and benefits from existing worldwide technology,
whereas developed economies, especially the
United States, have to rely on new knowledge and
innovations to shift their production frontier.

Steady-State and Sustainable Growth
The growth accounting framework provides
a compact formula for the study of potential output growth. We define "potential output" as the highest level of real GDP that can be sustained over the period of interest. Growth associated with potential output can therefore be termed "sustainable growth." We divide sustainable growth into three categories according to the different time frames considered.

1. Proietti, Musso, and Westermann (2007) set capital elasticity at 0.35 and labor elasticity at 0.65.
The first concept of sustainable growth refers
to circumstances in which certain measures of
output growth are maintained permanently as
time goes to infinity. The literature offers two different though related output measures that may be
used in this context. Different studies have used
them for different purposes. Sustainable growth
can be defined as a growth pattern that generates
sustained growth in per capita income over an
infinite time horizon. Usually, per capita income
is treated as a measure of living standards. Following Romer (2006), the standard Solow growth
model can be expressed as follows:

Y = F ( K, L, A (t )), A (t ) ≥ 0,
(5)

A (t ) = e xt , L (t ) = e nt ,

where Y is total output, F 共.兲 is the production
function, K is capital input, L is labor input, and
A共t兲 is the level of technology that progresses at
the exponential rate x while the labor force grows
at the exponential rate n. The change in capital
stock is given by
(6)

K = I − δK = s ⋅ F (K, L, A (t )) − δK ,

where I is investment, δ the real depreciation rate,
and s the saving rate. For I = sY = s . F共K,L,A共t兲兲,
we have
(7)

K = s ⋅ F ( K, L ,A (t )) − δ ⋅ K .

Dividing by labor input, L, on both sides of equation (7) yields
(8)

K
1 dK

K
= s ⋅ F  , A (t ) − δ .


L dt
L
L

Because k = K/L, the growth rate of k can be
written as

d (K L ) 1 dK K dL 1 dK K
=
−
=
− n,
(9) k =
dt
L dt L2 dt L dt L
where n =

∆L
.
L
Rearrange equation (9):

(10)  (1/L)(dK/dt) = k̇ + nk = s·F(k, A(t)) − δk;

combine equations (8) and (10):

(11)  k̇ = s·F(k, A(t)) − (n + δ)·k;

then divide both sides by k to get the growth rate of k,

(12)  γ_k = s·F(1, A(t)/k) − (n + δ).

At the steady state, γ_k* is constant, which requires that s, n, and δ are also constant. Thus the average product of capital, F(k, A(t))/k, is constant in the steady state. Because of constant returns to scale, F(1, A(t)/k) is constant only if k and A(t) grow at the same rate; that is, γ_k* = x. Output per capita is given by

(13)  y = F(k, A(t)) = k·F(1, A(t)/k),

and the steady-state growth rate of y equals x. This implies that, in the long run, output per capita grows at the rate of technical progress, x. Note that this conclusion is conditional on the parameters of the model staying constant, including the saving rate and, hence, the rate of capital formation. This property of the model may explain why developing economies can grow faster than developed economies, as exhibited by the growth miracles in the East Asian NICs: The potential for absorbing new technologies is larger in the former.

Another important implication of the Solow growth model is that less-advanced economies, such as China, will tend to have higher growth rates in per capita income than more-advanced economies, such as the United States and EU, because there are more investment opportunities in developing nations. The World Bank (1997, p. 12) called this phenomenon "the advantages of the backwardness." This property is also referred to as "absolute convergence" when the analysis is not conditioned on other characteristics of the economies and "conditional convergence" when the analysis is valid only among economies with the same steady-state positions. However, caution should be exercised when applying this property of the model to real-world situations. The property demonstrates only what the supply side of the economy could achieve if other factors, such as demand conditions, efficiency of the economy, and political stability, are present.
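The steady-state result can be checked with a minimal simulation. The sketch below assumes an illustrative Cobb-Douglas form of F and parameter values of our own choosing, not estimates from the paper:

```python
import math

# Minimal Solow simulation illustrating the steady-state result: output per
# worker eventually grows at the rate of technical progress x. We assume an
# illustrative Cobb-Douglas technology Y = K**alpha * (A*L)**(1-alpha) with
# parameter values chosen for the sketch.

alpha, s, n, delta, x = 0.5, 0.3, 0.01, 0.05, 0.02
K, L, A = 1.0, 1.0, 1.0
y_path = []
for t in range(3000):
    Y = K ** alpha * (A * L) ** (1 - alpha)  # Y = F(K, L, A(t))
    y_path.append(Y / L)                     # output per worker
    K += s * Y - delta * K                   # equation (6): investment less depreciation
    L *= math.exp(n)                         # L(t) = e^(nt)
    A *= math.exp(x)                         # A(t) = e^(xt)

long_run_growth = math.log(y_path[-1] / y_path[-2])
print(round(long_run_growth, 4))  # -> 0.02, the rate of technical progress x
```

Raising the saving rate s shifts the level of the path but not this long-run growth rate, which is the sense in which input-driven growth cannot permanently raise per capita income growth.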
Extensive versus Intensive Growth
Sustainable growth may also be interpreted as growth of GDP only, which is particularly relevant if one is interested in the absolute size of the economy. It is the size of the aggregate economy that matters for international influence in both economics and politics. Sustainable growth in this context means the rate of investment need not rise in order to maintain a given rate of GDP growth. Such sustainable growth is considered intensive growth. A borderline case for sustainable growth of this kind is when the capital stock growth rate equals the GDP growth rate.
Extensive growth refers to a growth strategy focused on increasing the quantities of inputs (Irmen, 2005). Although capital accumulation and growth of the labor force raise the growth rate of aggregate output, because of diminishing returns these effects will not have a permanent effect on per capita income growth (Irmen, 2005). In contrast, intensive growth focuses on TFP. In our model, labor growth and TFP growth are exogenous; the only input with endogenous growth is capital. A key feature of the extensive growth model is that capital grows faster than GDP (or gross national product) because of the high growth rate of capital on the one hand and few productivity advancements on the other. Consequently, the share of investment in GDP, in constant prices, must grow continuously to sustain the growth rate of capital (Ofer, 1987). Specifically, the relation between I (investment), K (the capital stock), and Y (national product) in real terms can be written as follows:

(14)  I/K = (I/Y)(Y/K).

Notice that the growth accounting formula is given by Ẏ/Y = Ȧ/A + α(K̇/K) + (1 − α)(L̇/L), where a dot denotes a change over time, L is labor, and A is the level of technology. Given the growth rate in the labor input and the rate of technological progress, sustainable growth in Y requires sustainable growth in K. Under intensive growth, K̇/K < Ẏ/Y, so Y/K rises over time. For I/K (= K̇/K + δ) to stay constant, I/Y must decline; that is, İ/I < Ẏ/Y. In other words, the gross capital formation rate does not have to rise to sustain a given growth rate in output, which is feasible.
Under extensive growth, K̇/K > Ẏ/Y, so Y/K declines and a constant I/K implies a rising I/Y, which is not sustainable in the long run. Moreover, the share of investment in GNP in current prices may be written as I^C/Y^C = (I·P_I)/(Y·P_Y), where the superscript C denotes "in current prices" and P_I and P_Y are the price levels of investment and output. A change in the relative price of I (for example, due to faster technological change) may slow the rise of I/Y in real terms.
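The mechanics can be made concrete with hypothetical numbers. Assuming capital grows 11 percent per year against 9 percent output growth (figures chosen purely for illustration), holding I/K fixed forces the investment share of GDP upward through equation (14):

```python
# Why extensive growth is not sustainable, with hypothetical numbers: if the
# capital stock grows faster than output (g_K > g_Y), then Y/K falls, and by
# equation (14), I/K = (I/Y)(Y/K), holding I/K fixed forces the investment
# share I/Y to rise without bound.

g_K, g_Y = 0.11, 0.09   # capital grows 11 percent/yr, output 9 percent/yr
i_over_k = 0.11         # I/K held constant (depreciation ignored for simplicity)
Y, K = 1.0, 2.0         # illustrative initial levels
initial_share = i_over_k * K / Y
for year in range(40):
    Y *= 1 + g_Y
    K *= 1 + g_K
final_share = i_over_k * K / Y
print(round(initial_share, 2), round(final_share, 2))  # -> 0.22 0.46
```

Over 40 years the required investment share of GDP roughly doubles; no feasible saving rate can support such a path indefinitely.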
If the initial capital stock growth rate is sufficiently low, the economy will grow for a sustained
period of time even if capital stock growth exceeds
GDP growth substantially. Examples are the Soviet
economy in the 1950s and 1960s, the Japanese
economy during about the same period, and later
the East Asian NICs from the 1960s to 1980s. These
economies all experienced rapid economic growth
in a relatively short period of time. If the capital
growth rate is too high, extensive growth may not
be sustainable in the intermediate term or the long
run. In some typical cases, sustainable growth requires that the saving rate (and hence the capital formation rate) vary only within a feasible range (say, between 0 and 50 percent) if borrowing is not allowed. Compared with sustainable growth in per capita income in the long run, this type of growth can be sustained only for a limited time because it relies on transitional dynamics rather than on steady-state growth capabilities.

Sustainable Growth with Input Constraints

A third concept of sustainable/potential growth is related to sustainable growth in inputs, especially labor. Everything else equal, change in the labor input can be crucial for growth to be sustained. The economic history of many countries shows that demographics are important for rapid economic growth. In many developed countries, faster growth rates may not be sustained simply because labor is lacking. A country with a large population either too young or too old to work will have a lower growth rate than one with a large working-age population.

Following Musso and Westermann (2005), we decompose labor input into its components. Because we do not have hours worked as a measure of labor, we use employment. Employment, E, at time t is defined as the difference between the labor force, N, and total unemployment, U, and can be expressed as a function of the unemployment rate, ur. The labor force is the product of the participation rate, pr, and the working-age population, P^WA. The working-age population is a function of total population, P, and the dependency ratio, dr, where the latter is defined as the ratio between the number of persons below 15 and above 64 years of age and the working-age population. These relationships are summarized as follows:

(15)  E_t ≡ N_t − U_t = N_t·(1 − ur_t),
      N_t ≡ pr_t·P_t^WA,
      P_t^WA ≡ P_t·[1/(1 + dr_t)].

The potential GDP growth of China may be expressed as

(16)  gY = gA + α(i − δ) + (1 − α)[gh − (ur/(1 − ur))gur + gpr − (dr/(1 + dr))gdr + gp],

where the variables are

gY, the growth rate of potential output, Y;
gA, growth of total factor productivity, A;
α, the output elasticity of capital;
(1 − α), the output elasticity of labor;
i, the investment rate;
δ, the depreciation rate;
gh, growth in years of schooling, h;
gur, growth of the unemployment rate, ur;
gpr, growth of the participation rate, pr;
gdr, growth of the dependency ratio, dr; and
gp, the growth rate of the population, P.
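Equation (16) translates directly into a small calculator. The function name and all input values below are ours, chosen for illustration rather than taken from the paper; note the assumption that i is measured relative to the capital stock, so that i − δ equals capital stock growth, matching the paper's later use of "the investment rate minus the depreciation rate":

```python
# Equation (16) as a function (name and inputs are illustrative).
# All rates are decimals. We assume i is investment relative to the capital
# stock, so that i - delta equals capital stock growth.

def potential_growth(g_A, alpha, i, delta, g_h, g_ur, ur, g_pr, g_dr, dr, g_p):
    labor = g_h - (ur / (1 - ur)) * g_ur + g_pr - (dr / (1 + dr)) * g_dr + g_p
    return g_A + alpha * (i - delta) + (1 - alpha) * labor

# Illustrative inputs: 2 percent TFP growth, capital elasticity 0.5, capital
# stock growth of 8 percent (i - delta), rising schooling, a mildly falling
# dependency ratio, and 1 percent population growth.
g_Y = potential_growth(g_A=0.02, alpha=0.5, i=0.13, delta=0.05,
                       g_h=0.02, g_ur=0.0, ur=0.04, g_pr=0.0,
                       g_dr=-0.01, dr=0.4, g_p=0.01)
print(round(g_Y, 4))  # -> 0.0764, potential growth of about 7.6 percent
```

The structure makes clear how a rising dependency ratio (gdr > 0) or a falling participation rate subtracts directly from potential growth through the (1 − α) labor term.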
A Comparison of the Three Concepts
All three concepts above can be derived from the Solow growth model, but each emphasizes a different aspect of the growth problem. Sustainable growth is derived according
to the steady-state solution of the dynamic system.
It refers to the mathematical long run, given that
the saving rate, depreciation, and population
growth are fixed. The variable of interest is the
growth rate of per capita income and, hence, TFP
growth. In this case, the Solow growth model
predicts that low-income economies will grow
faster than high-income economies, which leads
to the concept of convergence.
The second concept, extensive versus intensive growth, directly concerns the GDP growth rate rather than per capita income. In this case,
the saving rate is allowed to change. When the
investment rate is so high that the saving rate must
rise to, say, over 50 percent of GDP, then the
growth pattern is considered problematic. Such
growth is not sustainable, even in the intermediate
term. However, the problem may arise slowly:
If the capital stock growth rate is only 3 percent
per year, it may take two or three decades for the
saving rate to rise to 30 percent of GDP, even if
the growth pattern was initially extensive. This
is a major difference between rapidly developing
and developed economies. The latter need not
worry much about the intermediate term if growth
in capital stock exceeds GDP growth, because
growth rates are generally low in the relevant
variables. In the long run, however, no extensive
growth pattern is sustainable. This concept heightens the need to pay attention to the pattern of
capital accumulation.
The third concept emphasizes the input constraints. Growth will be sustained only as long
as sufficient inputs are available at a given point
in time. This formulation is often used to separate
the labor input into its components.

EMPIRICAL RESULTS
In this section, we present two sets of empirical results within the framework outlined previously. We first use data from 1978-2007 to update
our growth accounting results from Zheng,
Bigsten, and Hu (2009). The emphasis is the time-series behavior of TFP growth. Based on our TFP
estimates, we provide potential growth measures
conditional on the given investment rate and
factors related to labor input, including demographics and the labor participation rate. We then
compare the estimated potential growth with
official statistics.
The second set of results offers projections of
future growth. The growth scenarios should not
be seen as simple extrapolations based on historical data. In fact, our calibration exercise relies
heavily on knowledge of the production structure
of the Chinese economy and the concept of intensive growth.

China’s Growth Pattern and Potential
China’s development strategy in recent years
has been successful in promoting rapid economic
growth, but it has also created a series of macroeconomic imbalances. The rapid growth has benefited
China both economically and politically, but
whether it can or will be sustained, and for how
long, is uncertain. The growth has been generated
mainly through the expansion of investment
(extensive growth) and only marginally through
increased productivity (intensive growth). Some
economists fear that if corrective measures are not
taken, per capita income will eventually cease to
grow. Kuijs and Wang (2006) point out that, if
China’s current growth strategies are unchanged,
the investment-to-GDP ratio would need to reach
unprecedented levels in the next two decades in
order to maintain GDP growth of 8 percent per
year. Our estimates in Table 1 show that China’s
growth pattern has been extremely extensive, with
capital stock growth exceeding GDP growth by
3.56 percentage points during 1995-2007.
Next, we use equation (16) to calculate a
measure of potential growth during 1978-2007.
Our measure is built from estimates of the potential growth of each of the main factors that contribute to sustainable growth, that is, the terms
on the right-hand side of equation (16). The first
term in equation (16), gA, is the TFP growth rate.
We use a growth rate of 3.3 percent for the period
1978-95 and 1.9 percent for 1995-2007, as in
Figure 1
GDP Growth, 1978-2007 (percent): potential GDP growth versus actual GDP growth.

Zheng, Bigsten, and Hu (2009). The second term
in equation (16) is the contribution of capital
(equal to the investment rate minus the depreciation rate, multiplied by an output elasticity of
capital of 0.5). The third term in equation (16) is
the contribution of labor: the sum of the growth
rates of hours worked per person, labor force participation, and population, minus the weighted
growth rate of the unemployment rate and dependency ratio. We replace the growth of hours worked
per person with the growth of quality-adjusted
employment (multiplied by the average years of
schooling). Figure 1 shows that, starting in 2002,
actual GDP growth exceeded potential growth
for six consecutive years. This result is consistent with the growth accounting result based
on the realized production data in Table 1.2

Projections for the Medium Term
Many projections for China’s future output
potential have appeared in recent years. We provide our own estimates using the analytical framework introduced earlier. We show that it is a valid
concern that China’s growth pattern as measured
2. "Green" GDP estimates and TFP trends will be discussed later in the paper.

Table 1
Growth Accounting for China (percent)

Variable                  1978-95    1995-2007
GDP                        10.11       9.25
Capital stock               9.12      12.81
Quality-adjusted labor      3.49       2.78
TFP                         3.80       1.45

SOURCE: Updated to 2007 from Zheng, Bigsten, and Hu (2009), with an output elasticity of 0.5.

by potential output may not be sustainable. The
growth accounting result is striking when compared with what the government considers a sustainable growth target (8 percent for 2008).
Our projections rely heavily on two basic
premises: (i) Capital stock growth cannot exceed
GDP growth and (ii) a TFP growth rate of 2 to 3
percent must prevail for the foreseeable future.
China’s government was concerned about maintaining a GDP growth rate of 8 percent, both in
the wake of the East Asian financial crisis of 1997
and when the Chinese economy started to overheat in 2003. We show how the “magic” 8 perJ U LY / A U G U S T

2009

323

Zheng, Hu, Bigsten

Table 2
Sustainable Growth for the Chinese Economy
α

gK

gL

gA

α gK

(1– α )gL

gY

0.5

11

3

3

5.5

1.5

10.0

0.6

11

3

3

6.6

1.2

10.8

0.5

11

3

2

5.5

1.5

9.0

0.6

11

3

2

6.6

1.2

9.8

0.5

10

3

3

5.0

1.5

9.5

0.6

10

3

3

6.0

1.2

10.2

0.5

10

3

2

5.0

1.5

8.5

0.6

10

3

2

6.0

1.2

9.2

0.5

9

3

3

4.5

1.5

9.0

0.6

9

3

3

5.4

1.2

9.6

0.5

9

3

2

4.5

1.5

8.0

0.6

9

3

2

5.4

1.2

8.6

0.5

8

3

3

4.0

1.5

8.5

0.6

8

3

3

4.8

1.2

9.0

0.5

8

3

2

4.0

1.5

7.5

0.6

8

3

2

4.8

1.2

8.0

cent growth rate can be derived from the growth
accounting framework. Suppose that the borderline growth rate between extensive and intensive
growth can be expressed as

gY =

gA

(1 − α )

+ gL,

which can be derived from the usual growth
accounting formula assuming that the capital
stock growth rate, gK, equals the output growth
rate, gY . In Table 2, the GDP growth rate, gY , is in
the far-right-hand column; other columns show
combinations of parameters consistent with gY .
With a 3 percent TFP growth rate and 0.05 output
elasticity of capital, the maximum sustainable
output growth rate would be 9 percent. With a 2
percent TFP growth rate and 0.06 output elasticity of capital, the maximum sustainable output
growth rate would be 8 percent, which is consistent with the Chinese government’s growth target
for 2008 (Wen, 2008).
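Both the rows of Table 2 and the borderline formula can be verified mechanically. The sketch below (function names ours) spot-checks each:

```python
# Spot-checking Table 2 and the borderline formula. Each row of Table 2
# applies g_Y = g_A + alpha*g_K + (1 - alpha)*g_L; at the extensive/intensive
# borderline (g_K = g_Y) this collapses to g_Y = g_A/(1 - alpha) + g_L.
# Rates are in percent.

def accounting(alpha, g_K, g_L, g_A):
    return g_A + alpha * g_K + (1 - alpha) * g_L

def borderline(alpha, g_L, g_A):
    return g_A / (1 - alpha) + g_L

print(round(accounting(0.5, 11, 3, 3), 1))  # -> 10.0, first row of Table 2
print(round(accounting(0.6, 8, 3, 2), 1))   # -> 8.0, last row of Table 2

# The "magic" 8 percent: 2 percent TFP growth, 3 percent labor growth, and a
# capital elasticity of 0.6 give a borderline sustainable rate of exactly 8.
print(round(borderline(0.6, 3, 2), 1))      # -> 8.0
print(round(borderline(0.5, 3, 3), 1))      # -> 9.0
```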
The magical 8 percent growth rate also has
interesting implications for the structural
parameters of the production function. When the
assumed output elasticity of capital is 0.6, the
corresponding sustainable growth rate is exactly
8 percent if TFP growth is 2 percent per year.
However, 8 percent growth will not be sustainable
if TFP growth is 2 percent per year but the output
elasticity of capital is 0.5. Sustainable growth will
be slightly more than 8 percent if TFP growth is
3 percent per year. Slower growth in the labor
input (demographics) will reduce the projected
output growth rate to some extent, but the trends
will remain the same.
Economic growth in developing economies
is considered to be mainly affected by three factors: rural-urban migration, demographics, and
educational attainment. In the late 1990s, Chinese
planners were preoccupied with maintaining a
growth rate of 8 percent in the face of the East
Asian financial crisis. Such forecasts relied on
China’s ability to maintain high capital formation—but if the capital growth rate exceeds the
GDP growth rate, the result is extensive growth,
which is likely not sustainable in the longer run,
as discussed above.
We offer one more example. For simplicity,
we omit the role of human capital accumulation
(see Zheng, Bigsten, and Hu, 2009). Assuming, say,
the output elasticity of capital is 0.5, the capital
stock increases 8 percent per year, and the labor
force grows slightly above 1 percent (as it has in
the past decade), the TFP growth rate would be
required to be 3.5 percent to achieve 8 percent GDP
growth. Further, this would require the TFP contribution to GDP growth to reach 44 percent, which
may be difficult to achieve in practice. Using this
as a benchmark, the 5-year forecasts presented
for China’s 10th and 11th congressional sessions
appear wildly overoptimistic because they require
TFP growth to contribute 54 to 60 percent of GDP
growth (see Zheng, Bigsten, and Hu, 2009).
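This benchmark is easy to verify by solving the growth accounting identity for TFP growth as a residual, under the stated assumptions (capital elasticity 0.5, 8 percent capital growth, roughly 1 percent labor growth):

```python
# Verifying the benchmark in the text: with capital elasticity 0.5, 8 percent
# capital stock growth, and about 1 percent labor force growth, reaching
# 8 percent GDP growth requires TFP growth of 3.5 percent, i.e., a TFP
# contribution of roughly 44 percent of GDP growth. Human capital is
# omitted, as in the text. Rates are in percent.

alpha, g_K, g_L, g_Y = 0.5, 8.0, 1.0, 8.0
g_A = g_Y - alpha * g_K - (1 - alpha) * g_L  # required TFP growth, as a residual
contribution = g_A / g_Y                     # share of GDP growth due to TFP
print(g_A, round(100 * contribution, 2))     # -> 3.5 43.75
```

A 44 percent TFP contribution is far above what Table 1 shows China achieved after 1995, which is why the text judges the official 5-year forecasts overoptimistic.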

COMPARISONS WITH THE U.S.
AND EU ECONOMIES
In this section, given the structural differences,
we calibrate the model to compare a typical scenario for the Chinese economy with the U.S. and
EU economies. We demonstrate that growth potential varies across the three major economies
because of differences in production structure,
the level of development, and opportunities for
absorbing foreign technologies. Growth in developed countries relies mainly on technological
innovations because investment opportunities are
far fewer than in developing countries. Because
technology development often presents patterns
of cyclical fluctuations, attempts to counterbalance business cycles or alter the trajectory of
growth potentials may result in short-term gains
but long-term losses. Understanding this is crucial
for central banks to carry out sound monetary
policies and to prevent future financial crises.
China clearly has benefitted from extensive
growth in the intermediate term, but as previously
shown this level of growth is not sustainable in
the long run. However, China may still enjoy a
high growth rate of 8 to 9 percent if it manages
the transformation from extensive to intensive
growth (see Table 2).
In his report to the First Session of the 11th National People's Congress in March 2008, Premier Wen Jiabao stated that "On the basis of improving the economic structure, productivity, energy efficiency and environmental protection, the GDP should grow by about eight percent" (Xinhua, 2008). This was the fourth consecutive year China set its GDP growth target at 8 percent (after five consecutive years of double-digit GDP growth) to emphasize the government's desire to promote both sound and fast economic and social development.3 Until recently, China had tightened monetary policy to curb inflation and an overheated property market and to help the transition from extensive to intensive growth.
However, it appears that China's measures to cool its economy to a sustainable level were not well timed, considering recent developments in the world economy. By November 2008, most rich economies were facing recession. The U.S. economy had been in recession since December 2007 (as confirmed by the National Bureau of Economic Research in December 2008). In November 2008, the European economy officially fell into its first recession since the euro was introduced. In China, industrial production grew by only 8.2 percent from January through October 2008, less than half the pace of the previous year and its slowest in seven years. In early November 2008, China announced a massive stimulus package ($586 billion).
Although the stimulus package was intended
to boost domestic demand and create more jobs,
the World Bank pointed out that the stimulus
policies provide China with a good opportunity
to rebalance its economy in line with the objectives of the 11th Congress’s five-year plan: “The
stimulus package contains many elements that
support China’s overall long-term development
and improve people’s living standards. Some of
the stimulus measures give some support to the
rebalancing of the pattern of growth from investment, exports, and industry to consumption and services. The government can use the opportunity of the fiscal stimulus package to take more rebalancing measures, including on energy and resource pricing; health, education, and the social safety net; financial sector reform; and institutional reforms" (World Bank, 2008, p. 1).

Zheng, Hu, Bigsten

Table 3
Growth Projections (2009-30, percent)

Countries        Capital stock   Labor   TFP   GDP
China                 8.0         3.0    2.5   8.0
United States         4.0         0.5    1.2   3.1
EU                    3.0         0.0    1.0   2.2

NOTE: Output elasticity of both capital and labor is 0.5 for China, 0.4 for the EU, and 0.3 for the United States.

3. China revised its GDP growth for 2007 from 11.9 percent to 13.0 percent and in that year overtook Germany to become the world's third-largest economy (Reuters, 2009). The growth figure was announced by the National Bureau of Statistics of China (NBSC) and was the fastest since 1993 (when GDP grew 13.5 percent).
In the longer term, China will be able to maintain its momentum as a rapidly developing economy well into the next two decades or so, while the United States and EU may manage a growth rate of only 2 to 3 percent (as calibrated in Table 3).4 Structural differences help explain the large differences in growth potential between China and the United States and EU. The contribution of capital in China is twice that in the United States. The level of development provides even greater opportunities for China than for the United States and EU: Investment opportunities in China are nearly double those in the United States,5 and the potential for China to absorb new technologies from developed nations is double that for the United States and EU.
Moreover, a shortage of labor (another important input to the production process) in developed economies will hinder faster growth of these economies in the intermediate term. In about 20 years China will face the same problem as its population ages. Demographic change due to China's baby boomers of the 1960s and 1970s entering retirement age may significantly affect the labor supply and the country's capacity to save and invest.

4. The growth rate in Table 3 is somewhat too optimistic for U.S. economists: "[M]ainstream economists are exceptionally united right now around the proposition that the trend growth rate of real gross domestic product (GDP) in the United States—the rate at which the unemployment rate neither rises nor falls—is in the 2 percent to 2.5 percent range" (Blinder, 1997, p. 57).

5. Sterman (1985) presented a behavioral model of the economic long wave, which showed that "capital self-ordering" was sufficient to generate long waves. In Sterman (1983), capital self-ordering means that the capital-producing sector must order capital equipment such as large machinery to build up productive capacity.
In the long run, economic prosperity depends on innovation-driven productivity growth. There is evidence, however, that worldwide innovation might have become less effective in recent decades. The literature on diminishing technological opportunities since the early 1960s and recent studies on endogenous growth address this phenomenon (see, e.g., Jones, 1999; Segerstrom, 1998; and Kortum, 1997). In a series of recent articles, Gordon (e.g., 2004) addresses the issue in terms of demand creation for new products and technological advances and suggests that the U.S. productivity revival that began in 1995 might not be sustainable (see Table 4). This suggests that the productivity slowdown that began in other developed countries in the early 1970s may continue into the next decade or so. Given the input constraints on potential output growth in the United States and EU, productivity is left as the only source of extra growth.
In this regard, historical lessons from the
former Soviet Union need to be taken seriously.
Soviet growth was spectacular: Its industrial structure changed from an economy with an 82 percent rural population and a GNP produced mainly by agriculture to one with a 78 percent urban population and 40 to 45 percent of GNP originating in manufacturing and related industries (Ofer, 1987). This pattern of extensive growth lasted nearly 70 years, from the late 1920s to the mid-1980s. By 1970, Soviet TFP growth had fallen to zero, and it was negative thereafter (see Table 4). Although the current problem in Western countries is different because their patterns of growth have not been as extensive (for example, growth in capital stock has been 3 to 4 percent), their limited growth in TFP is worrisome.

Table 4
Productivity Slowdowns in the Soviet Union, United States, and EU

Countries        Period       GDP   Capital   Labor   TFP
Soviet Union     1950-70      5.4     8.8      1.8    1.6
                 1970-85      2.7     7.0      1.1   –0.4
United States    1950-72      3.9     2.6      1.4    1.6
                 1972-96      3.3     3.1      1.7    0.6
                 1996-2004    3.6     2.6      0.7    1.5
EU (euro zone)   1960-73      5.1     4.8      3.2
                 1973-2003    2.2     2.8      1.0    0.5

SOURCE: Mostly period averages calculated from Ofer (1987) for the former Soviet Union, Gordon (2006) for the United States, and Musso and Westermann (2005) for the euro zone.

5. (continued) Investment expansions in the 1950s and 1960s accumulated large excess capacity in the United States and European Union. "But while stimulating basic research and training the labor force for 'new-wave' technologies are important, innovation alone will not be sufficient to lift the economy into a sustained recovery as long as excess capacity in basic industries continues to depress investment" (Sterman, 1983, p. 1276).
Limited TFP growth has important implications for macroeconomic planning. A straightforward strategy to boost productivity growth, of
course, is to increase spending on research and
development. Though many policymakers would
like to believe that research and development for
information and computer technologies (ICT) may
benefit an economy in the long run, when managing the macroeconomy they need to consider the
lag between the emergence of a new technology
and the generation of sufficient demand. For
example, the U.S. economy has recorded impressive productivity growth since the mid-1990s
thanks to innovations and massive investments
in ICT. But the ongoing financial crisis may dramatically alter the interpretation of the U.S. productivity boom of the past decade. Some critics
suggest that the problem lies in the desire to
maintain growth above what is sustainable by
encouraging excessive investment in technology
and loosening regulations for risky innovations
in the financial sector. As far as macroeconomic
planning is concerned, this amounts to taking the
concept of “potential output” seriously.6

ENVIRONMENTAL CONSTRAINTS
The environment is a constraint on growth
in China. Increased environmental awareness at
both the central government and grassroots levels
will put greater pressure on regional authorities
to seek alternative patterns of growth. In Zheng,
Bigsten, and Hu (2009) we note
The Chinese government has been working on
criteria and indexes of a green GDP, which
deducts the cost of environmental damage
and resources consumption from the traditional gross domestic product (People's Daily, March 12, 2004). Preliminary results in the recently issued Green GDP Accounting Study Report (2004) suggest that economic losses due to environmental pollution reached 512 billion yuan, corresponding to 3.05% of GDP in 2004, while the imputed treatment cost is 287 billion yuan, corresponding to 1.80% of GDP (The Central People's Government of the People's Republic of China, 2006). Although the concept of and measurement for green GDP are rather controversial, the report may serve as a wakeup call to the government's strategy of growth at all costs.
From a productivity analysis perspective, the concept of green GDP can be straightforwardly extended to TFP, that is, green TFP. A slower green TFP growth may imply a slower (green) GDP growth. (p. 881)

6. Krugman (1997) notes that standard economic analysis suggests that the United States should not expect its economy to grow at much more than 2 percent over the next few years. He notes further that if the Federal Reserve tries to force faster growth by keeping interest rates low, serious inflation could result. Of course, inflation did not rise until recently, but the U.S. economy had already started overheating in the mid-1990s. Jorgenson, Ho, and Stiroh (2006) project the best-case scenario for U.S. GDP growth to be 2.97 percent per annum for 2005-15, with an uncertainty range of 1.9 to 3.5 percent. McNamee and Magnusson (1996) give a detailed discussion of why a long-run growth rate of 2 percent could be a problem for the U.S. economy as a whole.

We demonstrate that although the green GDP
level has increased as environmental factors have
been taken into account, “green TFP” growth
reveals a similar trend, as shown in the main text
of this article.

Environmental Factors
The World Bank (1997) first proposed the
concept and calculation of “genuine domestic
savings,” that is, a country’s saving rate calculated after subtracting from total output the costs
of depletion of natural resources (especially the
nonreproducible resources) and environmental
pollution.
A formal model of the genuine savings rate is given by Hamilton and Clemens (1999):

(17)  G = GNP − C − δK − n(R − g) − σ(e − d) + m.

Here, GNP − C is traditional gross savings (which includes foreign savings), where GNP is gross national product and C is consumption; GNP − C − δK is traditional net savings, where δK is the depreciation of produced assets; −n(R − g) is resource depletion: resource stocks S grow by an amount g and are depleted by extraction R (so their net change is −(R − g)), are assumed to be costless to produce, and n is the net marginal resource rental rate; −σ(e − d) is pollution emission costs: the pollution stock X grows by e − d, where e is emissions, d is the quantity of natural dissipation of the pollution stock, and σ is the marginal social cost of pollution; and m is investment in human capital (current education expenditures), which does not depreciate (and may be considered a form of disembodied knowledge).
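As a check on the accounting in equation (17), a minimal sketch follows. The magnitudes are invented purely for illustration (none of these numbers come from the article): resource depletion and pollution costs reduce genuine savings, while education spending adds to it.

```python
def genuine_savings(gnp, c, depreciation, n, extraction, growth, sigma, emissions, dissipation, m):
    """Equation (17): G = GNP - C - dK - n(R - g) - sigma(e - d) + m."""
    return (gnp - c - depreciation
            - n * (extraction - growth)      # resource depletion, n(R - g)
            - sigma * (emissions - dissipation)  # pollution emission costs, sigma(e - d)
            + m)                             # human capital investment

# Hypothetical magnitudes: gross savings 40, depreciation 10,
# resource depletion 0.5*(8-2)=3, pollution cost 0.02*(50-10)=0.8, education 3
G = genuine_savings(gnp=100.0, c=60.0, depreciation=10.0,
                    n=0.5, extraction=8.0, growth=2.0,
                    sigma=0.02, emissions=50.0, dissipation=10.0,
                    m=3.0)  # -> 29.2
```

The point of the exercise: two economies with identical conventional savings can have very different genuine savings once depletion and pollution are priced in.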
Natural resource depletion is measured by the rent from exploiting and procuring natural resources. The rent is the difference between the price received by producers (measured by the international price) and total production costs, including the depreciation of fixed capital and the return on capital.
Rational exploitation of natural resources is necessary for economic growth; however, if resource rents are too low, overexploitation may result. If the resource rents are not reinvested (e.g., in human resources) but instead used for consumption, the exploitation is also "irrational."
Pollution loss refers mostly to harm caused by CO2 pollution. It is calculated from the global marginal loss caused by one ton of CO2 emissions, which Fankhauser (1995) suggests is US$20.
We expand the green GDP measure from the
World Bank to include not only natural capital
lost (negative factor) and education expenditure
(positive factor),7 but also net imports of primary
goods (positive factor) and sanitation expenditure
(positive factor). We calculate three different versions of GDP from 1978 to 2004: real GDP, World
Bank–adjusted green GDP, and our author-adjusted
green GDP (Table 5).

Green Capital
In the measurement of productivity, the different measures of capital formation greatly influence
the measured capital stock constructed with the
perpetual inventory method. We can define the green capital stock K′ following the method of Hamilton, Ruta, and Tajibaeva (2005):

(18)  K′_it = K′_i,t−1 (1 − δ_it) + I′_it,

where δ_it is the depreciation rate and I′_it is green fixed capital formation. Our depreciation rate increases
7. We use total education expenditures from NBSC (2006) instead of the education expenditures from World Development Indicators (2006).


Figure 2
Capital Formation as a Percentage of GDP
[Line chart, 1970-2003 (percent): traditional, World Bank–adjusted, and author-adjusted series.]
SOURCE: NBSC (2007a) and World Bank (2006).

Figure 3
Capital Stock, 1978-2005
[Line chart (100 million yuan, 1987 prices): traditional, World Bank–adjusted, and author-adjusted capital stock series.]
SOURCE: NBSC (2007a) and World Bank (2006).


Table 5
Different Measures of Green GDP (percent of real GDP)

Columns: (1) Year; (2) Real GDP; (3) Natural capital lost; (4) Education expenditure; (5) World Bank–adjusted green GDP; (6) Total expense on education; (7) Total expense on sanitation; (8) Net import of primary goods; (9) Author-adjusted green GDP.

(1)     (2)     (3)      (4)     (5)     (6)    (7)     (8)      (9)
1978    100   –23.01    1.85   78.84   2.10   3.10   –0.73    81.46
1979    100   –26.52    1.84   75.33   2.31   3.20   –0.77    78.22
1980    100   –27.54    2.08   74.54   2.51   3.30   –0.71    77.56
1981    100   –29.91    2.11   72.20   2.51   3.40   –0.77    75.23
1982    100   –28.48    2.19   73.70   2.59   3.50   –0.86    76.75
1983    100   –19.92    2.16   82.24   2.61   3.60   –1.27    85.02
1984    100   –17.35    2.07   84.72   2.51   3.30   –2.18    86.28
1985    100   –16.96    2.05   85.09   2.51   3.00   –2.80    85.75
1986    100   –12.76    2.10   89.34   2.62   3.10   –1.90    91.06
1987    100   –14.41    1.90   87.49   2.31   3.20   –1.97    89.13
1988    100   –13.57    1.87   88.30   2.22   3.30   –1.08    90.87
1989    100   –13.74    1.87   88.13   3.07   3.40   –0.74    91.99
1990    100   –15.26    1.79   86.53   3.56   4.03   –1.56    90.77
1991    100   –13.93    1.79   87.86   3.38   4.11   –1.31    92.25
1992    100   –12.50    1.70   89.20   3.25   4.09   –0.78    94.06
1993    100   –10.88    1.71   90.82   3.00   3.96   –0.40    95.68
1994    100    –8.07    2.14   94.07   3.09   3.78   –0.58    98.22
1995    100    –7.57    1.97   94.40   3.09   3.86    0.40    99.78
1996    100    –7.27    2.01   94.74   3.18   4.21    0.41   100.53
1997    100    –5.89    2.01   96.12   3.21   4.29    0.49   102.10
1998    100    –3.98    1.97   97.99   3.49   4.47    0.24   104.22
1999    100    –3.43    1.94   98.51   3.73   4.66    0.64   105.60
2000    100    –4.87    1.95   97.07   3.88   4.62    1.78   105.41
2001    100    –4.07    1.94   97.88   4.23   4.58    1.46   106.20
2002    100    –4.03    1.95   97.92   4.55   4.81    1.43   106.76
2003    100    –4.30    1.96   97.66   4.57   4.85    2.31   107.43
2004    100    –4.58    1.97   97.39   4.53   4.75    3.97   108.67

NOTE: World Bank–adjusted green GDP is the sum of columns 2, 3, and 4; the author-adjusted green GDP is the sum of columns 2, 3, 6, 7, and 8.
SOURCE: World Bank (2006) and NBSC (2006).
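The identities in the table's note can be verified directly. The snippet below (our illustration, not the authors' code) checks the 1978 row: the World Bank–adjusted measure adds natural capital lost and education expenditure to real GDP, while the author-adjusted measure instead adds total education expense, sanitation expense, and net primary-goods imports.

```python
# 1978 row of Table 5 (percent of real GDP)
real_gdp = 100.0
natural_capital_lost = -23.01
education_expenditure = 1.85
total_edu_expense = 2.10
sanitation_expense = 3.10
net_primary_imports = -0.73

# World Bank-adjusted green GDP = columns 2 + 3 + 4
wb_green = real_gdp + natural_capital_lost + education_expenditure      # -> 78.84
# Author-adjusted green GDP = columns 2 + 3 + 6 + 7 + 8
author_green = (real_gdp + natural_capital_lost + total_edu_expense
                + sanitation_expense + net_primary_imports)             # -> 81.46
```

The same two identities reproduce every row of the table, which is a useful consistency check when rebuilding the series from the underlying World Bank and NBSC data.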


Table 6
GDP, Green GDP, and TFP Growth, 1978-2004 (percent)

Variable     1978-92         1992-2004       1978-2004
GDP           9.02 (100.0)   10.12 (100.0)    9.61 (100.0)
K             7.74  (34.3)   11.27  (44.5)    9.56  (39.8)
L             2.96   (9.8)    1.07   (3.2)    2.44   (7.6)
H             2.25   (7.5)    1.90   (5.6)    2.02   (6.3)
TFP1          4.36  (48.3)    4.72  (46.6)    4.45  (46.3)
GGDP1         9.87 (100.0)   11.06 (100.0)   10.51 (100.0)
K′            5.95  (24.1)   15.88  (57.4)   10.42  (39.7)
L             2.96   (9.0)    1.07   (2.9)    2.44   (7.0)
H             2.25   (6.8)    1.90   (5.2)    2.02   (5.8)
TFP2′         5.93  (60.1)    3.82  (34.5)    5.00  (47.6)
GGDP2        10.47 (100.0)   10.75 (100.0)   10.60 (100.0)
K′′           5.80  (22.2)   15.97  (59.4)   10.37  (39.1)
L             2.96   (8.5)    1.07   (3.0)    2.44   (6.9)
H             2.25   (6.4)    1.90   (5.3)    2.02   (5.7)
TFP3′′        6.59  (62.9)    3.47  (32.3)    5.11  (48.2)

NOTE: GDP here is real GDP in 1978 prices; GGDP1 is the World Bank–adjusted green GDP; GGDP2 is the author-adjusted green GDP. K denotes capital services input, L labor input, and H inputs of education, sanitation expenditure, and imports of primary goods. TFP denotes total factor productivity. The shares of capital, labor, and human resources are 0.4, 0.3, and 0.3, respectively. Numbers in parentheses are the contribution ratio of each factor.
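The TFP rows in Table 6 follow from the stated factor shares (0.4, 0.3, 0.3). A small check against the 1978-92 traditional-GDP column (our code, not the authors'):

```python
def tfp_growth(g_y, g_k, g_l, g_h, a_k=0.4, a_l=0.3, a_h=0.3):
    """TFP growth as the residual: output growth minus share-weighted input growth."""
    return g_y - a_k * g_k - a_l * g_l - a_h * g_h

# 1978-92 column of Table 6: GDP 9.02, K 7.74, L 2.96, H 2.25
tfp = tfp_growth(9.02, 7.74, 2.96, 2.25)   # ~4.36, matching the TFP1 row
# Contribution ratio of capital: share-weighted growth over GDP growth
k_contribution = 0.4 * 7.74 / 9.02 * 100   # ~34.3, the value in parentheses
```

The same residual calculation, applied to the GGDP1 and GGDP2 rows with K′ and K′′, yields the green TFP series discussed below.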

along a linear trend from 4 percent in 1952 to 6 percent in 2004, and I′ is green fixed capital formation. In Figure 3, the World Bank analysis (Hamilton and Clemens, 1999) measures green investment in any geographic or political area as

(19)  I′_it = I_it − n_it(R_it − g_it) − σ_it(e_it − d_it) + m_it,

where I_it is traditional investment, n_it(R_it − g_it) + σ_it(e_it − d_it) is the natural capital lost, and m_it is the education expenditure.
In this article, the author-adjusted green capital stock, K′′, measures green investment as

(20)  I′′_it = I_it − n_it(R_it − g_it) − σ_it(e_it − d_it) + m_it + n_it + r_it,

where m_it is total education expenditure (from the NBSC), n_it is sanitation expenditure (a symbol reused from the resource rental rate in equation 19), and r_it is net imports of primary goods.
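Equations (18)-(20) combine into a simple recursion: compute green investment by netting out natural capital lost and adding the human-capital items, then accumulate it with the perpetual inventory method under the rising depreciation rate described above. The sketch below uses invented magnitudes purely for illustration; the term names are ours.

```python
def depreciation_rate(year, start=1952, end=2004, d_start=0.04, d_end=0.06):
    """Depreciation rising along a linear trend from 4 percent (1952) to 6 percent (2004)."""
    frac = (year - start) / (end - start)
    return d_start + (d_end - d_start) * frac

def green_investment(i, resource_rent, pollution_cost, edu, sanitation, net_primary_imports):
    """Equation (20): traditional investment minus natural capital lost, plus human-capital items."""
    return i - resource_rent - pollution_cost + edu + sanitation + net_primary_imports

def green_capital_stock(k0, flows, years):
    """Equation (18): K'_t = K'_{t-1} (1 - delta_t) + I'_t (perpetual inventory)."""
    k = k0
    path = []
    for year, inv in zip(years, flows):
        k = k * (1 - depreciation_rate(year)) + inv
        path.append(k)
    return path

# Hypothetical two-year example starting from a stock of 100
flows = [green_investment(12.0, 2.0, 0.5, 1.0, 0.5, -0.5) for _ in range(2)]  # 10.5 each year
path = green_capital_stock(100.0, flows, years=[1978, 1979])
# path[0] = 100*(1 - 0.05) + 10.5 = 105.5, since delta(1978) = 0.04 + 0.02*(26/52) = 0.05
```

Dropping the green adjustments from `green_investment` collapses the recursion back to the traditional capital stock, which is exactly the contrast plotted in Figure 3.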
F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W

Green TFP
As shown in Table 6, compared with traditional GDP, the two adjusted green GDPs (the World Bank and authors' measures) have average TFP growth rates about 0.5 to 0.6 percentage points higher over the 1978-2004 period, with lower TFP growth in the 1992-2004 period (the author-adjusted measure is the lowest) than in the 1978-92 period. The TFP growth rate of traditional GDP is more stable and shows the opposite trend.
As shown in Figure 4, the annual TFP growth rates of the adjusted green GDPs are higher than that of traditional GDP in most years before 1992. They reached a high of 13 percent in 1983 and then began to fall, roughly maintaining a gap of 1 to 2 percentage points with traditional GDP through 2004. In 2004, the green GDP measures reached their lowest TFP growth rate, –4 percent.
Our analysis finds that China's growth has varied between episodes of extensive and intensive growth. Economic growth in the 1980s was intensive growth—higher TFP growth compensated for the diminishing contribution of natural resources, that is, of "natural capital." During the 1990s, as a result of the comparative decline of its natural resource consumption, China's capital stock began to increase rapidly and its growth became more extensive, especially with respect to capital.

Figure 4
TFP Growth, 1979-2005
[Line chart (percent): annual TFP growth of traditional, World Bank–adjusted, and author-adjusted GDP, 1979-2005.]
NOTE: This accounting does not include human capital. The shares of capital and labor come from Bai, Hsieh, and Qian's (2006) estimation.

CONCLUSION
In this study, we have updated our previously
published results on China’s growth pattern,
estimated China’s potential output growth using
official Chinese statistics, and compared China’s
medium-term growth perspectives with those for
the United States and EU. Our findings suggest
that China’s extensive growth pattern might be
sustainable in the intermediate term but not in
the long run. However, China may still sustain a
high growth rate of 8 to 9 percent if it manages
the transformation from extensive to intensive
growth. Several factors explain this possibility.
Compared with the United States and EU, China
is in a more favorable position with regard to (i) production structure, (ii) the potential to absorb new technologies, and (iii) investment opportunities. Perhaps these three factors largely explain Ed Prescott's (1998) query as to why economic "miracles" have been only a recent phenomenon.
China’s reform policy since 1978 has dramatically increased its GDP as well as its role in the
world economy. China was a marginal economy
in 1978, but by 2007 its share of world GDP
reached 5.99 percent at regular exchange rates
(or 10.83 percent at purchasing power parity rates)
(International Monetary Fund, 2008). This means
that China now has the same economic weight
as, for example, Germany. Because of China’s
rapid growth in recent years, its contribution to
world growth has been substantial. In 2007 it was
about 17 percent at regular exchange rates and as
much as 33 percent at purchasing power parity
rates. Even at regular exchange rates, China’s
contribution to global growth was considerably
larger than that of the United States or EU. The
global slowdown and financial contagion have now reduced the growth rate in China. Still, we
believe China can continue to grow at a high rate
over an extended period of time, which suggests
that it will continue to be an important driver of
world growth.
F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W

Zheng, Hu, Bigsten

China’s importance in the global markets for
goods and services also has increased considerably. In 1978, China contributed 0.6 percent of
world exports, but by 2006 its share was over 7
percent (World Bank, 2008). This is an amazingly
fast expansion and entry into the global market.
The export boom China experienced during
the years before the world financial crisis of 2008
could not have continued at its rapid pace even in
the absence of a worldwide economic slowdown.
China’s rapid growth has been driven by U.S.
expansionary policy, China’s acceptance into the
World Trade Organization, the shift of assembly
plants from other countries to China, and the
undervaluation of China’s currency, the yuan.
The impact of these factors, however, cannot be
sustained. Export growth was further supported
by shifting the production structure toward the
international market. However, with exports
approaching half of GDP, there will be less scope
for further shifts. In the future it is likely that
export growth will more or less keep pace with
GDP growth. As long as China continues to grow
faster than the world average, it will increase its
global market share. China's current strategy is to shift its production toward more-sophisticated goods and, even if the impact of this shift is as yet limited, it is very likely that the trend will continue.
This means that in the future we will likely see
more and more intra-industry trade between China
and the Organisation for Economic Co-operation
and Development (OECD).
In the short term, the world is facing a severe international financial crisis. Debate about
China’s role in this crisis is intense. International
financial markets are clearly more integrated than
in earlier crises, although China has not opened its
capital account yet. With the rapid and extended
global economic growth, economic imbalances
have emerged, particularly between the United
States and China. The United States has been
undersaving and China has been oversaving. A
key issue is the character and speed of the rebalancing process. In the United States, to build savings, consumption needs to increase at a slower
pace than incomes, which could hinder growth
for several years. How much China can counteract this by stimulating domestic demand remains
to be seen, but steps in this direction have been
taken. China has implemented a large fiscal
expansion. The focus of policy reforms in the
near future will probably be on domestic issues,
and macroeconomic policy interventions will
likely seek to stimulate local demand. The process
of adjustment will take a long time, however,
unless there is concerted policy action.
Another important issue is how international
negotiations about the future design of the financial system will evolve, particularly because of
the conflict between national sovereignty and
the needs of global capital markets. Still, future
discussions will have to involve China in a substantial way.
China is also expanding its economic operations abroad by aggressively using its sovereign
wealth to acquire assets. It currently has extensive
resources, whereas other global investors are facing problems. So the current crisis is an opportunity for China to invest abroad on a larger scale
than ever before. As much as 75 percent of China’s
investments are in developed countries to develop
marketing channels, access more advanced technology, and earn a good return on its capital.
There is a risk, though, that China may overpay
in its eagerness to acquire assets.
China enters the current crisis better prepared
than it was when hit by the 1997 East Asian financial crisis. Still, even if China today is one of the
most resilient economies in the world, it may not
be able to have a very large impact on the Western
economies. It buys many inputs from Asia, assembles goods at home, and then sells final goods to
the OECD. This means that it will suffer during a
recession in wealthier countries. The export markets have also been hurt by the disruptions in the
trade credit market. Most intra-Asian trade is for
intermediates that are assembled in China and
then exported. Few Asian exports are for Chinese
demand. Thus, other Asian countries suffer when
China cannot export final goods to the OECD.
Countries that can supply the Chinese domestic
market may be able to benefit from Chinese
development, though.
Overall it is likely that China will continue
to grow and increase its market share and control
of wealth, which in turn will increase its economic
and political influence over the longer term.
REFERENCES
Bai, Chong-En; Hsieh, Chang-Tai and Qian, Yingyi.
“The Return to Capital in China.” NBER Working
Paper No. 12755, National Bureau of Economic
Research, December 2006;
www.nber.org/papers/w12755.pdf?new_window=1.
Baker, Dean. “The New Economy Does Not Lurk in
the Statistical Discrepancy.” Challenge, July-August
1998a, 41(4), pp. 5-13.
Baker, Dean. “The Computer-Driven Productivity
Boom.” Challenge, November-December 1998b,
41(6), pp. 5-8.
Baker, Dean. “The Supply-Side Effect of a Stock
Market Crash.” Challenge, September-October 2000,
43(5), pp. 107-17.
Baker, Dean. “Is the New Economy Wearing Out?”
Challenge, January-February 2002, 45(1), pp. 117-21.
Blinder, Alan S. “The Speed Limit: Fact and Fancy
in the Growth Debate.” American Prospect,
September-October 1997, Issue 34, pp. 57-62.
Bosworth, Barry P. and Triplett, Jack E. “The Early
21st Century Productivity Expansion Is Still in
Services.” International Productivity Monitor,
Spring 2007, Issue 14, pp. 3-19.
Brenner, Robert. “The Boom and the Bubble.” New
Left Review, November-December 2000, 6, pp. 5-39.
Brenner, Robert. “New Boom or New Bubble?” New
Left Review, January-February 2004, 25, pp. 57-100.
Central People’s Government of the People’s Republic
of China. Green GDP Accounting Study Report
2004. September 11, 2006; www.gov.cn/english/
2006-09/11/content_384596.htm.
China Cement. “Discussion of Financing Mode for
Improving the Energy Efficiency of China’s Cement
Industry” (in Chinese). 2007.
Chow, Gregory C. “Another Look at the Rate of
Increase in TFP in China.” Journal of Chinese
Economic and Business Studies, May 2008, 6(2),
pp. 219-24.


Chow, Gregory C. and Li, Kui-Wai. “China’s Economic
Growth: 1952-2010.” Economic Development and
Cultural Change, October 2002, 51(1), pp. 247-56.
Congressional Budget Office. “CBO’s Method for
Estimating Potential Output: An Update.” August
2001; www.cbo.gov/ftpdocs/30xx/doc3020/
PotentialOutput.pdf.
Congressional Budget Office. “R&D and Productivity
Growth: A Background Paper.” June 2005.
Fankhauser, Samuel. Valuing Climate Change: The
Economics of the Greenhouse. London: Earthscan,
1995.
Gordon, Robert J. “Does the New Economy Measure
Up to the Great Inventions of the Past?” Journal of
Economic Perspectives, Fall 2000, 14(4), pp. 49-74.
Gordon, Robert J. “Hi-tech Innovation and Productivity
Growth: Does Supply Create Its Own Demand?”
NBER Working Paper No. 9437, National Bureau of
Economic Research, January 2003;
www.nber.org/papers/w9437.pdf?new_window=1.
Gordon, Robert J. “Five Puzzles in the Behaviour of
Productivity, Investment and Innovation.” CEPR
Discussion Paper No. 4414, Centre for Economic
and Policy Research, June 2004.
Gordon, Robert J. “The 1920s and the 1990s in Mutual
Reflection.” NBER Working Paper No. W11778,
National Bureau of Economic Research, November
2005; www.nber.org/papers/w11778.pdf?new_
window=1.
Gordon, Robert J. “Future U.S. Productivity Growth:
Looking Ahead by Looking Back.” Presented at the
Workshop at the Occasion of Angus Maddison’s
80th Birthday, World Economic Performance: Past,
Present, and Future. University of Groningen, The
Netherlands, October 27, 2006.
Hamilton, Kirk and Clemens, Michael. "Genuine
Savings Rates in Developing Countries." World Bank
Economic Review, June 1999, 13(2), pp. 333-56.
Hamilton, Kirk; Ruta, Giovanni and Tajibaeva, Liaila.
“Capital Accumulation and Resource Depletion:

A Hartwick Rule Counterfactual.” Policy Research
Working Paper 3480, World Bank, January 2005.
Holz, Carsten A. “The Quantity and Quality of Labor
in China 1978-2000-2025.” Working Paper, Hong
Kong University of Science and Technology, May
2005; http://ihome.ust.hk/~socholz/Labor/HolzLabor-quantity-quality-2July05-web.pdf.
International Monetary Fund. World Economic
Outlook: Financial Stress, Downturns, and
Recoveries. Washington, DC: IMF, October 2008.
Irmen, Andreas. “Extensive and Intensive Growth in
a Neoclassical Framework.” Journal of Economic
Dynamics and Control, August 2005, 29(8),
pp. 1427-48.
Jones, Charles I. “Growth: With or Without Scale
Effects?” American Economic Review, May 1999,
89(2), pp. 139-44.
Jorgenson, Dale W.; Ho, Mun S.; Samuels, Jon D. and
Stiroh, Kevin J. “Industry Origins of the American
Productivity Resurgence.” Economic Systems
Research, September 2007, 19(3), pp. 229-52.
Jorgenson, Dale W.; Ho, Mun S. and Stiroh, Kevin J.
“Potential Growth of the U.S. Economy: Will the
Productivity Resurgence Continue?” Business
Economics, January 2006, 41(1), pp. 7-16.
Kim, Jong-Il and Lau, Lawrence J. “The Sources of
Economic Growth of the East Asian Newly
Industrialized Countries.” Journal of the Japanese
and International Economies, September 1994, 8(3),
pp. 235-71.

Kuijs, Louis and Wang, Tao. "China's Pattern of
Growth: Moving to Sustainability and Reducing
Inequality." China and World Economy, January-
February 2006, 14(1), pp. 1-14.
Lin, Justin Yifu; Cai, Fang and Li, Zhou. The China
Miracle: Development Strategy and Economic
Reform. Hong Kong: Chinese University Press, 1996.
McNamee, Mike and Magnusson, Paul. “Let’s Get
Growing: The Economy Can Run Faster. Here’s How
To Make It Happen.” Business Week, July 8, 1996,
pp. 90-98.
Musso, Alberto and Westermann, Thomas. “Assessing
Potential Output Growth in the Euro Area: A Growth
Accounting Perspective.” ECB Occasional Paper
No. 22, European Central Bank, January 2005;
www.ecb.int/pub/pdf/scpops/ecbocp22.pdf.
Nan, Liangjin and Xue, Jinjun. “Estimation of
China’s Population and Labor Force, 1949-1999”
(in Chinese). China Population Science, 2002,
No. 4, pp. 1-16.
National Bureau of Statistics of China. China
Statistical Yearbook. Beijing: China Statistics Press,
2005a, 2006, 2007a, and 2008.
National Bureau of Statistics of China. Comprehensive
Statistical Data and Materials on 55 Years of New
China. Beijing: China Statistics Press, 2005b.
National Bureau of Statistics of China. China Labour
Statistical Yearbook. Beijing: China Statistics Press,
2007b.

Kortum, Samuel S. “Research, Patenting, and
Technological Change.” Econometrica, November
1997, 65(6), pp. 1389-419.

National Bureau of Statistics of China. Historical Data
on China’s Gross Domestic Product Accounting,
1952-2004. Beijing: China Statistics Press, 2007c.

Krugman, Paul. “The Myth of Asia’s Miracle.”
Foreign Affairs, November-December 1994, 73(6),
pp. 62-78.

Nelson, Richard R. and Romer, Paul M. “Science,
Economic Growth, and Public Policy.” Challenge,
March-April 1996, 39(2), pp. 9-21.

Krugman, Paul. “How Fast Can the U.S. Economy
Grow?” Harvard Business Review, July-August
1997, 75(4), pp. 123-29.

Ofer, Gur. “Soviet Economic Growth: 1928-1985.”
Journal of Economic Literature, December 1987,
25(4), pp. 1767-833.


Oliner, Stephen D.; Sichel, Daniel E. and Stiroh,
Kevin J. “Explaining a Productive Decade.”
Brookings Papers on Economic Activity, 2007, 1,
pp. 81-152.
People’s Daily (Beijing, China). “Green GDP System
to Debut in 3-5 Years in China.” March 12, 2004.
Phelps, Edmund S. “The Boom and the Slump: A
Causal Account of the 1990s/2000s and the 1920s/
1930s.” Journal of Policy Reform, March 2004,
7(1), pp. 3-19.
Prescott, Edward C. “Needed: A Theory of Total
Factor Productivity.” International Economic Review,
August 1998, 39(3), pp. 525-51.
Prescott, Edward C. “Richard T. Ely Lecture: Prosperity
and Depression.” American Economic Review, May
2002, 92(2), pp. 1-15.
Proietti, Tommaso; Musso, Alberto and Westermann,
Thomas. “Estimating Potential Output and the
Output Gap for the Euro Area: A Model-Based
Production Function Approach.” Empirical
Economics, July 2007, 33(1), pp. 85-113.
Reuters. “China’s Revised 2007 GDP Growth Moves
It Past Germany.” January 15, 2009.
Romer, David. Advanced Macroeconomics. Third
Edition. Boston: McGraw-Hill Irwin, 2006.
Schiff, Lenore. “Economic Intelligence: Is the Long
Wave About To Turn Up?” Fortune, February 22,
1993, 127(4), p. 24.
Segerstrom, Paul S. “Endogenous Growth without
Scale Effects.” American Economic Review,
December 1998, 88(5), pp. 1290-310.
Sterman, John D. “The Long Wave.” Science,
March 18, 1983, 219(4590), p. 1276.
Sterman, John D. “A Behavioral Model of the
Economic Long Wave.” Journal of Economic
Behavior and Organization, 1985, 6(1), pp. 17-53.
Sterman, John. “The Long Wave Decline and the
Politics of Depression.” Bank Credit Analyst, 1992,
44(4), pp. 26-42.


Solow, Robert M. “Perspectives on Growth Theory.”
Journal of Economic Perspectives, Winter 1994,
8(1), pp. 45-54.
Stiglitz, Joseph. “The Roaring Nineties.” Atlantic
Monthly, October 2002, 290(3), pp. 76-89.
Stiroh, Kevin J. “Is There a New Economy?”
Challenge, July-August 1999, 42(4), pp. 82-101.
Vatter, Harold G. and Walker, John F. “Did the 1990s
Inaugurate a New Economy?” Challenge, January-February 2001, 44(1), pp. 90-116.
Walsh, John. “Is R&D the Key to the Productivity
Problem?” Science, February 1981, 211(13),
pp. 685-88.
Wen, Jiabao. “Report on the Work of the Government.”
Presented at the First Session of the 11th National
People’s Congress, March 5, 2008.
World Bank. Expanding the Measure of Wealth:
Indicators of Environmentally Sustainable
Development. Washington, DC: Environment
Department, World Bank, 1997.
World Bank. World Development Indicator CD-ROM.
Washington, DC: World Bank, 2006 and 2008.
World Bank. China Quarterly Update. Washington, DC:
World Bank, December 2008.
Xinhua. “Highlights of Chinese Premier Wen Jiabao’s
Government Work Report.” March 8, 2008, update.
Young, Alwyn. “The Tyranny of Numbers: Confronting
the Statistical Realities of the East Asian Growth
Experience.” Quarterly Journal of Economics,
August 1995, 110(3), pp. 641-80.
Zheng, Jinghai. “On Chinese Productivity Studies.”
Journal of Chinese Economic and Business Studies,
May 2008, 6(2, Special Issue), pp. 109-19.
Zheng, Jinghai; Bigsten, Arne and Hu, Angang. “Can
China’s Growth Be Sustained: A Productivity
Perspective?” World Development, April 2009,
37(4, Special Issue), pp. 874-88.


Zheng, Jinghai and Hu, Angang. “An Empirical Analysis of Provincial Productivity in China (1979-2001).” Journal of Chinese Economic and Business Studies, 2006, 4(3), pp. 221-39.

APPENDIX: DATA DESCRIPTION
The main variables investigated in the study are aggregate output (GDP at a constant price), aggregate labor (the number of people employed), and capital stock (accumulated fixed capital investment
at a constant price). For details of the treatment of data, see Zheng, Bigsten, and Hu (2009, appendix).
Here we outline the data used in addition to those in that study.

Capital Stock
We construct a capital stock series by accumulating total social fixed asset investment since 1978. We use the price indices of gross fixed capital formation from Historical Data on China’s Gross Domestic Product Accounting, 1952-2004 (NBSC, 2007c) to deflate investment data before 1990. For investment after 1990, we use the price indices of fixed asset investment from the China Statistical Yearbook (NBSC, 2005a, 2006, 2007a, and 2008). See Figures A1 through A3 for time plots of the series and related measures.
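The accumulation step described above is a standard perpetual-inventory calculation. The sketch below is illustrative only: the constant retirement rate `delta`, the initial stock, and the investment figures are hypothetical stand-ins, not values from the study (Figure A2 shows that the actual retirement rate varies over time).

```python
def perpetual_inventory(real_investment, k0, delta=0.05):
    """Accumulate deflated investment flows into a capital stock series:
    K_t = (1 - delta) * K_{t-1} + I_t."""
    stocks = []
    k = k0
    for inv in real_investment:
        k = (1.0 - delta) * k + inv
        stocks.append(k)
    return stocks

# Hypothetical deflated investment flows and initial stock, for illustration.
series = perpetual_inventory([0.7, 0.9, 1.1], k0=10.0, delta=0.05)
```

Deflating each year's nominal investment by the appropriate price index before accumulation is the step that makes the resulting stock a constant-price series.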

Labor
The labor force data used are the economically active population data from Comprehensive Statistical Data and Materials on 55 Years of New China (NBSC, 2005b) and are extended to 2007 based on the growth rate for each year from the China Statistical Yearbook (NBSC, 2005a, 2006, 2007a, and 2008). Because official data before 1990 are inconsistent, we use an adjusted labor force series from Nan and Xue (2002) to update our pre-1990 data. The data on employment are from the China Labour Statistical Yearbook (NBSC, 2007b). We generate a new pre-1990 employment series based on official unemployment (defined as the gap between the economically active population and employment) and the labor force data from Nan and Xue (2002). (See Figures A4 to A6.)

Human Capital
To measure human capital, we use average years of schooling of Chinese laborers to adjust for
labor quality improvement. Data for 1978-2005 are from Holz (2005) and include two series, one with
and one without military service included. We use the former. “Labor” is defined as quality-adjusted
laborers, that is, the number of employees multiplied by the average years of schooling. (See Figures A7
to A9.)
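The quality adjustment just described reduces to a single multiplication per year; a minimal sketch with hypothetical magnitudes (not the Holz, 2005, series):

```python
def quality_adjusted_labor(employed, avg_years_schooling):
    """Quality-adjusted labor input: the number of employees
    multiplied by their average years of schooling."""
    return employed * avg_years_schooling

# Hypothetical values, for illustration only.
labor_input = quality_adjusted_labor(750.0, 8.0)
```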

Energy Consumption and Carbon Dioxide Emission
Data on energy consumption and its structure are from Comprehensive Statistical Data and Materials on 55 Years of New China (NBSC, 2005b), which provides consumption of fossil fuel (used to estimate carbon dioxide [CO2] emissions) together with cement production data. CO2 emissions based on energy consumption are calculated according to the following formula:

CO2 Emissions = Consumption of Fossil Fuel⁸ × Carbon Emission Factor × Fraction of Carbon Oxidized + Production of Cement × Processing Emission Factor.
The fraction-of-carbon-oxidized term converts the quantity of carbon oxidized into the quantity of CO2 emitted, a constant ratio of 3.67 (44:12, the molecular-weight ratio of CO2 to carbon). The most important coefficient here is the carbon emission factor, which gives the equivalent carbon emissions per unit of fossil fuel consumed. We use the factor from the Energy Research Institute of China’s National Development and Reform Committee, 0.67 tons of carbon per ton of coal-equivalent fuel. Further, the production of cement emits CO2 in addition to that from the consumption of fossil fuel because of the calcining of limestone, which on average creates 0.365 tons of CO2 per ton of cement produced (China Cement, 2007).
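The formula translates directly into code. The coefficients below are the ones quoted in the text (0.67 tons of carbon per ton of coal-equivalent fuel, the 44:12 conversion from carbon to CO2, and 0.365 tons of CO2 per ton of cement); the tonnages in the example call are hypothetical.

```python
def co2_emissions(fossil_fuel_tce, cement_tons,
                  carbon_emission_factor=0.67,  # tons of carbon per ton of coal-equivalent fuel
                  carbon_to_co2=44.0 / 12.0,    # converts carbon to CO2 (about 3.67)
                  cement_factor=0.365):         # tons of CO2 per ton of cement produced
    """CO2 emissions from fossil-fuel combustion plus cement calcining."""
    return (fossil_fuel_tce * carbon_emission_factor * carbon_to_co2
            + cement_tons * cement_factor)

# Hypothetical consumption and production figures, for illustration only.
emissions = co2_emissions(fossil_fuel_tce=100.0, cement_tons=10.0)
```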

Figure A1: Gross Capital Stock Growth (percent, 1978-2006).

8. A more accurate calculation would exclude the carbon sink. We use the approximate amount because of the limited availability of data.


Figure A2: Gross Capital Stock and Its Components (percent, 1978-2006): investment-to-capital ratio and retirement rate.

Figure A3: Determinants of the Investment-to-Capital Ratio (percent, 1978-2006): investment-to-GDP ratio and GDP-to-capital ratio.


Figure A4: Labor Force Growth (percent, 1978-2005).

Figure A5: Unemployment Rate (1978-2005): unemployment rate change (left scale) and unemployment rate (right scale), percent.


Figure A6: Working-Age Population Growth (percent, 1978-2005).

Figure A7: Participation Rate (1978-2005): participation rate change (left scale) and participation rate (right scale), percent.


Figure A8: Population Growth (percent, 1978-2005).

Figure A9: Dependency Ratio (1978-2005): dependency ratio change (left scale) and dependency ratio (right scale), percent.


Commentary
Xiaodong Zhu

China’s growth performance over the past three decades has been remarkable, if not unprecedented. A natural question is whether China’s recent pattern of growth is sustainable in the long run.
Zheng, Hu, and Bigsten (2009) use a standard
growth accounting framework to address this
question. They assume that the aggregate production function is Cobb-Douglas:

Y_t = A_t K_t^(1−α) L_t^α,

where A_t, K_t, and L_t are total factor productivity (TFP), capital stock, and employment, respectively, and α is the income share of labor. According to their calculation using a labor share of 0.5,
the contribution of TFP growth to China’s gross
domestic product (GDP) growth has declined in
recent years. As they reported in their Table 1, the
average annual growth rates of GDP and TFP were
10.11 percent and 3.8 percent, respectively, for
1978-95 but 9.25 percent and 1.45 percent, respectively, for 1995-2007. In other words, the contribution of TFP growth to GDP growth declined from 38 percent in the first period to 16 percent in the second period. In contrast, the average growth
rate of the capital stock increased from 9.12 percent in the first period to 12.81 percent in the
second period. So the contribution of physical
capital accumulation increased from 45 percent
in the first period to 69 percent in the second
period. Based on these calculations, the authors
suggest that in recent years China has pursued
an extensive growth strategy that relies heavily

on capital accumulation rather than TFP growth.
Because investment as a percentage of GDP has
exceeded 40 percent, the authors argue that further
increases in the investment rate, which would be
needed to maintain a growth rate of capital stock
similar to its recent average, are not sustainable and
therefore extensive growth cannot be sustained
in the long run. They suggest that a switch from
extensive to intensive growth is needed for China
to sustain its recent growth performance; hence the emphasis on productivity increases.
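The decomposition behind these figures can be reproduced in a few lines. Under the Cobb-Douglas production function, TFP growth is GDP growth minus share-weighted input growth. The labor growth rate used below (3.5 percent for 1978-95) is not reported in the excerpt above; it is the value implied by the reported GDP, capital, and TFP figures with α = 0.5, so treat it as an assumption.

```python
def growth_contributions(g_y, g_k, g_l, alpha=0.5):
    """Growth accounting for Y = A * K**(1 - alpha) * L**alpha.

    g_y, g_k, g_l: average annual growth rates (percent) of GDP,
    capital stock, and labor; alpha is labor's income share.
    Returns TFP growth and the shares of GDP growth attributable
    to TFP, capital, and labor."""
    g_a = g_y - (1.0 - alpha) * g_k - alpha * g_l
    return g_a, g_a / g_y, (1.0 - alpha) * g_k / g_y, alpha * g_l / g_y

# 1978-95: GDP growth 10.11 percent, capital 9.12 percent,
# implied labor growth 3.5 percent (an assumption, see above).
g_a, tfp_share, k_share, l_share = growth_contributions(10.11, 9.12, 3.5)
```

With these inputs, TFP growth comes out near 3.8 percent and the TFP and capital shares near 38 percent and 45 percent, matching the figures quoted in the text.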
The paper addresses an important question,
and growth accounting is the right place to start.
I am also sympathetic to the authors’ arguments,
especially their suggestion that TFP growth is
crucial for China’s growth performance in the long
run. However, a few puzzling facts about China’s
recent growth performance need to be accounted
for before we can judge the relative role of capital
accumulation and TFP in China’s recent growth
and make projections about its future growth.
First, given the high investment rates in recent
years, low returns to capital might be expected.
However, Bai, Hsieh, and Qian (2006) show that
this is not the case. They find that China’s returns
to capital have been around 20 percent in recent
years, which is not significantly lower than returns
to capital worldwide. If there has been no significant TFP growth, how could China increase its
investment rate without lowering the returns to
capital?
Second, since 1978, when economic reform
started in China, TFP has grown substantially.

Xiaodong Zhu is a professor of economics at the University of Toronto and a special term professor at Tsinghua University in Beijing.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 343-47.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


Figure 1: China’s Investment-to-GDP Ratio (percent, 1978-2004).
SOURCE: Brandt and Zhu (2009).

According to a standard neoclassical growth
model, an increase in the TFP growth rate would
result in a sharp and immediate increase in the
investment rate followed by a gradual decline.
The actual investment rate in China, however,
behaves quite differently. Figure 1 shows that it
has increased gradually over time. Arguably, this
gradual increase in the investment rate may have
been due to a gradual increase in the growth rate
of TFP or the labor input. Figure 2 shows the
growth rates of TFP and employment in China, and neither has had an upward trend. Why, then,
didn’t the investment rate grow more rapidly?
The answers to these questions are important
for understanding the nature of China’s growth
performance and cannot be easily answered using
an aggregate growth accounting framework. I suggest addressing these questions by looking at more
disaggregated data. Figure 3 shows the returns-to-capital and capital-to-labor ratios in the state and
non-state nonagricultural sectors, respectively,
and their significant differences. In the state sector,
the capital-to-labor ratio increased steadily before
1997 and dramatically afterward. Correspondingly,
returns to capital were roughly constant at 10
percent before 1997 and declined sharply afterward. Such behavior is consistent with what
Zheng, Hu, and Bigsten (2009) find at the aggregate level. It suggests that, in the state sector, capital accumulation played a much more important
role than TFP growth in recent years. For the non-state sector, however, the story is quite different.
The capital-to-labor ratio in this sector actually
declined in the early years, which coupled with
TFP growth resulted in a sharp increase in returns
to capital. In recent years, the non-state sector’s
capital-to-labor ratio increased, but the returns
to capital did not decline. This sector has maintained a relatively high rate of return to capital
(around 60 percent) because of rapid TFP growth
(Figure 4).
So, the answer to the question of whether
China’s recent growth pattern is extensive or intensive depends on which part of the Chinese economy is analyzed. If the focus is on the state sector,
then it clearly follows an extensive growth path.

Figure 2: China’s TFP and Employment Growth Rates (percent, 1979-2004): top panel, TFP growth rates; bottom panel, employment growth rates.
SOURCE: Brandt and Zhu (2009).


Figure 3: China’s Returns to Capital and Capital-to-Labor Ratios, state and non-state sectors, 1978-2004: top panel, returns to capital (percent); bottom panel, capital-to-labor ratios.
SOURCE: Brandt and Zhu (2009).


Figure 4: China’s TFP, non-state and state sectors, 1978-2004.
SOURCE: Brandt and Zhu (2009).

The non-state sector, on the other hand, follows
an intensive growth path that relies much more
on TFP growth than capital accumulation. As
Zheng, Hu, and Bigsten argue in their paper, intensive growth is more likely to be sustainable than
extensive growth. The sustainability of China’s
recent growth performance, then, will depend
on the relative importance of the two sectors.
Measured by the share of employment, the non-state sector’s importance has increased over time.
According to Brandt and Zhu’s (2009) estimates,
the non-state sector’s share of nonagricultural
employment increased from 48 percent in 1978
to 87 percent in 2004. Measured by the share of
investment, however, the picture of the non-state
sector is not as rosy.
Despite its lackluster TFP growth performance
and declining employment share, the state sector’s
share of investment has always stayed above 60
percent. Given the high TFP growth in the non-state sector and the high investment rate in the state sector, China can increase both the aggregate efficiency of the economy and the GDP growth
rate without increasing the aggregate investment
rate, by shifting investment from the state sector
to the non-state sector.

REFERENCES
Bai, Chong-En; Hsieh, Chang-Tai and Qian, Yingyi.
“The Return to Capital in China.” Brookings Papers
on Economic Activity, 2006, Issue 2, pp. 61-88.
Brandt, Loren and Zhu, Xiaodong. “Explaining
China’s Growth.” Working paper, University of
Toronto, 2009.
Zheng, Jinghai; Hu, Angang and Bigsten, Arne.
“Potential Output in a Rapidly Developing Economy:
A Comparison of China with the United States and
the European Union.” Federal Reserve Bank of St.
Louis Review, July/August 2009, 91(4), pp. 317-42.


Estimating U.S. Output Growth with
Vintage Data in a State-Space Framework
Richard G. Anderson and Charles S. Gascon
This study uses a state-space model to estimate the “true” unobserved measure of total output in
the U.S. economy. The analysis uses the entire history (i.e., all vintages) of selected real-time data
series to compute revisions and corresponding statistics for those series. The revision statistics,
along with the most recent data vintage, are used in a state-space model to extract filtered estimates
of the “true” series. Under certain assumptions, Monte Carlo simulations suggest this framework
can improve published estimates by as much as 30 percent, lasting an average of 11 periods. Real-time experiments using a measure of real gross domestic product show improvement closer to 10
percent, lasting for 1 to 2 quarters. (JEL C10, C53, E01)
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 349-69.

Statistical agencies face a tradeoff between accuracy and timely reporting of macroeconomic data. As a result, agencies release their best estimates of the “true” unobserved series in the preceding
month, quarter, or year with some measurement
error.1 As agencies collect more information, they
revise their estimates, and the data are said to be
more “mature.” As the reported data mature, the
estimates, on average, are assumed to converge
toward the “true” unobserved values. This study
examines a methodology in which the “true”
value of an economic variable is latent in the
sense of the state vector in a state-space model. In
doing so, we use recent modeling suggestions by
1

In Appendix B we address the philosophical question of why an
econometrician might believe s/he can improve published data—
after all, statisticians who produce data have access to the same
historical data used by econometricians and, hence, should create
models using the same understanding of the revision process that
econometricians use. Over long periods, benchmarks and redefinitions muddy the analysis. But, it is an act of hubris to assert that any
simple statistical model can produce consistently more-accurate
near-term data than are produced by the specialists constructing the
published data. Hubris aside, we have written this paper regardless.

Jacobs and van Norden (2006) and Cunningham
et al. (2007) regarding relationships among real-time data, measurement error as a heteroskedastic
stochastic process, and the latent, “true” data
for an economic variable of interest.
The importance of potential output growth
in policymaking motivates our study. Forward-looking macroeconomic models suggest that the
predicted future path of the output gap should
be important to policymakers. To the extent
that policymakers are concerned with a Federal
Reserve–style “dual mandate,” an output gap
equal to 1 percent of potential output may be quite
alarming if projections suggest it will continue,
but relatively innocuous if the gap is expected to
shrink rapidly during the next few quarters. Recent
studies on inflation forecasting conclude that the
output gap, when measured in real time using
vintage data, has little predictive power for inflation (e.g., Orphanides and van Norden, 2005; and
Stock and Watson, 2007). It is also important to
study the real-time measurement of potential
output because policymakers occasionally face

Richard G. Anderson is a vice president and economist and Charles S. Gascon is a senior research associate at the Federal Reserve Bank of
St. Louis.



possible changes/breaks in the underlying growth
trend of productivity and, hence, potential output.
Our objective in this study is not to assess
inflation-forecasting models, although that has
been a major use of potential output measures;
rather, it is to estimate the “true” value of real
output for use in the construction of trend-like
measures of potential output. One of the larger
recent studies in this vein, albeit focused on inflation prediction, is by Orphanides and van Norden
(2005). The study considers, as predictive variables for inflation, both a wide range of output
gap measures (which differ with respect to data
vintage and the trend estimator) and lagged values
of real output growth. Their conclusion regarding
output gap models as predictors of inflation is
straightforward—the output gap does not reliably
predict inflation, although the differences in forecast performance between output-gap and outputgrowth models are not statistically significant:
[O]ur analysis suggests that a practitioner could
do well by simply taking into account the information contained in real output growth without
attempting to measure the level of the output
gap. This model was consistently among the
best performers, particularly over the post-1983
forecast sample. (p. 597)

Motivated by these findings, this article models the true (unobserved) value of real output and the implications for estimators of a real output trend. To do so, we explore the
measurement error and subsequent data-revision
process for real gross domestic product (RGDP).

LITERATURE REVIEW
Early studies of real-time data focused on the
sensitivity of certain statistics to data vintage
(Howrey, 1978 and 1984; Croushore and Stark,
2001; Diebold and Rudebusch, 1991; and
Orphanides and van Norden, 2002 and 2005).
Later research posed the problem more formally
as a signal-extraction problem (Kishor and Koenig,
2005; Aruoba, Diebold, and Scotti, 2008; and
Aruoba, 2008). Both approaches emphasized the
sensitivity of statistical inferences, including measures of the forecasting power of the output gap.
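The signal-extraction framing can be illustrated with a minimal scalar Kalman filter: the latent "true" series follows an AR(1) process, and each published figure is the true value plus measurement error. This is a toy sketch under those assumptions, not the models of Kishor and Koenig (2005) or Aruoba (2008), and the parameter values are arbitrary.

```python
def kalman_filter(observations, phi, q, r):
    """Filter y_t = x_t + v_t (measurement-noise variance r), where the
    latent state follows x_t = phi * x_{t-1} + w_t (innovation variance q).
    Returns the filtered estimates E[x_t | y_1..y_t]."""
    x_hat = 0.0
    p = q / (1.0 - phi ** 2)  # initialize at the stationary variance
    estimates = []
    for y in observations:
        # Predict one step ahead.
        x_hat = phi * x_hat
        p = phi ** 2 * p + q
        # Update with the noisy observation.
        gain = p / (p + r)
        x_hat = x_hat + gain * (y - x_hat)
        p = (1.0 - gain) * p
        estimates.append(x_hat)
    return estimates

# With no measurement noise (r = 0) the filter reproduces the data exactly;
# as r grows, the estimates shrink toward the model's own prediction.
exact = kalman_filter([1.0, 2.0, 1.5], phi=0.5, q=1.0, r=0.0)
```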

Recent analyses have focused on “the possibility that the sequence of vintages released over
time may contain useful information with which
to interpret the most recent vintage of data and
to anticipate future outcomes” (Garratt et al., 2008,
p. 792). Such a possibility was discussed by
Howrey (1978) but only recently has become the
centerpiece of certain studies.
A long literature has addressed the use of real-time data, starting with Howrey’s 1978 paper on
forecasting with preliminary data and including
Croushore and Stark’s (2000) release of a vintage
economic dataset at the Philadelphia Fed. This
literature, until recently, has focused on three main
issues: (i) embedding an estimate of the data revision process into forecasting models, (ii) assessing
the sensitivity of statistical inferences in macroeconomic data to data vintage, and (iii) checking
the forecastability of revisions, in the context of
Mankiw and Shapiro’s (1986) classic discussion
of “news vs. noise.”2
Some authors have argued there are policy
implications of such issues. Croushore (2007)
argues that revisions to published personal consumption expenditures (PCE) inflation rates are
forecastable, at least from August to August of
the following year, and identifies an upward
bias to revisions, indicating that initial estimates
consistently are too low. He suggests that policymakers should “account for” this bias and predictability in setting monetary policy. Kozicki
(2004) analyzes vintages of the output gap, employment gap, and inflation data and finds that revised
data and real-time data suggest differing policy
actions. Kozicki suggests that policymakers should
place greater emphasis on more-certain data and
be less aggressive in response to changes in data
subject to large revisions. Previously, Orphanides
and van Norden (2002 and 2005) argued that failure to appreciate the difference between real-time
and final data risks serious policy errors.
2

Our analysis is silent on the discussion of “news vs. noise” in realtime data analysis—“news” meaning that the statistical agency
publishes efficient estimates using all available information, “noise”
meaning there is measurement error unrelated to the true value.
These are not mutually exclusive; both conditions may hold.
News implies revisions have mean zero, noise does not. Empirical
results suggest that noise dominates the data-generating process.
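Mankiw and Shapiro's distinction can be checked mechanically: under pure news the revision is uncorrelated with the initial announcement, while under pure noise it is uncorrelated with the final value. A sketch of that diagnostic, with made-up data:

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def news_noise_diagnostic(initial, final):
    """Return cov(revision, initial) and cov(revision, final).
    A near-zero first covariance is consistent with news;
    a near-zero second covariance is consistent with noise."""
    revisions = [f - i for i, f in zip(initial, final)]
    return covariance(revisions, initial), covariance(revisions, final)
```

In practice these covariances are estimated over many vintages and tested formally; the two-line version only conveys the idea.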


Recently, data availability has encouraged
researchers to explore a methodology in which
they estimate the “true” values and measurement
errors as a latent state vector. In such studies, the
revisions are modeled as a statistical process,
emphasizing the “maturity” of each observation,
rather than the vintage of the time series. These
models permit forecasts of data that are to be
released, as well as “backcasts” of data already
published. The methodology may be applied to
individual observations, as well as various trend
estimators, such as those considered by Orphanides
and van Norden (2005).3 Recent work includes
Jacobs and van Norden (2006); Cunningham et al.
(2007); Aruoba (2008); Aruoba, Diebold, and Scotti
(2008); Garratt, Koop, and Vahey (2008); and
Garratt et al. (2008).
This recent literature traces its beginning to Jacobs and van Norden (2006). They argue that previous state-space models built on a transition process for the vintage data plus a set of forecasting equations do not allow adequately rich dynamics in the data-revision process:

    Our formulation of the state-space model is novel in that it defines the measured series as a set of various vintage estimates for a given point in time, rather than a set of estimates from the same vintage. We find this leads to a more parsimonious state-space representation and a cleaner distinction between various aspects of measurement error. It also allows us to augment the model of published data with forecasts in a straightforward way. (p. 3)

This modeling framework has been applied by the staffs of the Bank of England and the European Central Bank (Cunningham et al., 2007). Their model differs somewhat from that of Jacobs and van Norden and focuses more attention on modeling the measurement-error process, including potential bias and heteroskedasticity, but the underlying philosophy is similar. Our research applies the Jacobs and van Norden framework to U.S. data on quarterly GDP from the Federal Reserve Bank of St. Louis real-time ArchivaL Federal Reserve Economic Data (ALFRED) database.

The rich modeling framework proposed by Cunningham et al. (2007) allows serial correlation in measurement errors, nonzero correlation between the state of the economy and measurement errors, and maturity-dependent heteroskedasticity in measurement errors. Because of the richness of the statistical specification and the number of dimensions of the data, the estimation is divided into two parts: First, all available data vintages are used to estimate selected parameters governing measurement-error bias and variance; second, the most recently published release is used to estimate the state-space model.

In this spirit, we note the differences between using state-space models as estimators of unobserved components such as trends (perhaps across various vintages of real-time data) and as estimators of "true" underlying data. In the former, each datum within a time series of a particular vintage is implicitly assumed to be equally accurately measured; the trend (usually, a time-varying direction vector) is extracted without explicit concern for measurement error, except insofar as the robustness of the extracted trend may be explored across vintages. In the latter, each datum within a time series is assumed to contain an amount of measurement error that is inversely correlated with the maturity of the datum: Interpreted loosely, an older datum for a given activity date is asserted to contain more information about the "true" value of that datum than a more recent datum for the same activity date.

3 These trend estimators are discussed in Orphanides and van Norden (2005, Appendix A).

FEDERAL RESERVE BANK OF ST. LOUIS REVIEW, JULY/AUGUST 2009

STATISTICAL FRAMEWORK

The modeling setup is as follows.4 Let the data-generating process for the true (unobserved) variable of interest, y_t, t = 1,…,T, be a simple autoregressive (AR(q)) process:

(1)    A(L) y_t = ε_t,

where the lag polynomial A(L) is defined in the usual manner and the stationary disturbance is spherical (homoskedastic): E(ε_t) = 0, V(ε_t) = E(ε_t ε_t′) = σ_ε² I. Trends (deterministic or stochastic) and structural breaks, including regime shifts, are explicitly ruled out (and perhaps have been handled by prefiltering the data).

4 The model follows Jacobs and van Norden (2006) and Cunningham et al. (2007).

Anderson and Gascon

UNDERSTANDING REAL-TIME DATA

The table presents a stylized real-time dataset. The columns denote the release date, or vintage, of the data. The rows denote the activity date, or observation date, of the data. Economic data are normally released with a one-period lag; that is, data for January are reported in February. Therefore, the release date, v, lags the activity date, t, by one period.

Each element in the dataset is reported with a subscript identifying the activity date and a superscript identifying the maturity, j. Data of constant maturity are reported along each diagonal.

Stylized Real-Time Dataset

                                      Vintage (v)
Activity date (t)   v=2     v=3     v=4     ...   v=T-1        v=T          v=T+1
t=1                 y_1^1   y_1^2   y_1^3   ...   y_1^{T-2}    y_1^{T-1}    y_1^T
t=2                         y_2^1   y_2^2   ...   y_2^{T-3}    y_2^{T-2}    y_2^{T-1}
t=3                                 y_3^1   ...   y_3^{T-4}    y_3^{T-3}    y_3^{T-2}
...
t=T-2                                             y_{T-2}^1    y_{T-2}^2    y_{T-2}^3
t=T-1                                                          y_{T-1}^1    y_{T-1}^2
t=T                                                                         y_T^1
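The triangular structure of the boxed insert is easy to sketch in code. Below is a minimal NumPy illustration (with made-up numbers, and names of our own choosing, not from the paper) of a vintage-by-activity-date matrix in which the entry for activity date t and vintage v holds y_t^{v-t}, and of how a constant-maturity series is read off a diagonal:

```python
import numpy as np

T = 6  # activity dates t = 1..T; vintages v = 2..T+1

# rows = activity date t, columns = vintage v; NaN where data are not yet released
rt = np.full((T, T), np.nan)
rng = np.random.default_rng(0)
for t in range(1, T + 1):          # activity date
    for v in range(t + 1, T + 2):  # first release appears at v = t + 1
        j = v - t                  # maturity of this estimate
        # placeholder numbers: a "true" value plus noise that shrinks with maturity
        rt[t - 1, v - 2] = t + rng.normal(scale=1.0 / j)

def constant_maturity(rt, j):
    """Series of j-th releases: y_t^j lives at column v = t + j."""
    T = rt.shape[0]
    return np.array([rt[t - 1, t + j - 2] for t in range(1, T - j + 2)])

first_releases = constant_maturity(rt, 1)   # the main diagonal of the table
print(first_releases.shape)                 # one first release per activity date
```

Reading down a column instead of along a diagonal would give a single vintage, i.e., the dataset as it appeared on one release date.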

Measurement-Error Model

Let the data published by the statistical agency be denoted

    y_t^j,  t = 1,…,T;  j = 1,…,J,

where t is an activity date and j is the maturity of the data for that activity date. (See the boxed insert.) We assume that initial publication of data for t occurs in period t+1, so that j ≥ 1. Period T is the final revision date for data published in period T+1. We assume the published data are decomposable as

(2)    y_t^j ≡ y_t + c^j + v_t^j,

where y_t denotes the true "unobserved" value, c^j denotes a bias in published data of maturity j, and v_t^j is a measurement error.

Previous studies have suggested that early data releases tend to be biased estimates of the later releases. Let c^j denote the bias of data at maturity j, such that c^1 is the bias for initially published data. We assume the bias is independent of vintage, is solely a function of maturity, j, and decays according to the rule

(3)    c^j = c^1 (1 + λ)^{j-1},    -1 ≤ λ ≤ 0.

We assume the measurement error, v_t^j, follows a simple AR(q) process:

(4)    B(L) v_t^j = η_t^j,

where E(η_t^j) = 0. The measurement-error variance is assumed heteroskedastic in maturity and decays toward zero:
    V(η_t^j) = E(η_t^j η_t^j′) = Σ_η^j,

with main diagonal

(5)    σ²_{η^j} = σ²_{η^1} (1 + δ)^{j-1},    -1 ≤ δ ≤ 0.

It is necessary to designate a maturity at which data are assumed "fully mature." Here, we denote that horizon as N and refer to it below as the "revision horizon." For RGDP data, we set N = 20 (5 years of quarterly data). This choice, to some extent, is arbitrary, and hence it is useful to examine the robustness of results to the value chosen. Our choices are guided by visual examination of the revised time series and are discussed further in a later section.
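Equations (2) through (5) are straightforward to simulate. The sketch below uses illustrative parameter values of our own (not the paper's estimates) to generate a true AR(1) series and then builds published estimates y_t^j = y_t + c^j + v_t^j with a geometrically decaying bias and an AR(1) measurement error whose innovation variance decays with maturity:

```python
import numpy as np

rng = np.random.default_rng(1)
T, J = 200, 20
alpha = 0.6                           # AR(1) coefficient for the true series
c1, lam = 0.5, -0.3                   # bias rule, eq (3): c^j = c1*(1+lam)**(j-1)
beta, sig_eta1, delta = 0.2, 1.0, -0.06

# true series, eq (1) with A(L) = 1 - alpha*L
y = np.zeros(T)
for t in range(1, T):
    y[t] = alpha * y[t - 1] + rng.normal()

# measurement error, eqs (4)-(5): AR(1) in t, innovation variance decaying in j
v = np.zeros((T, J))
for j in range(1, J + 1):
    sig = np.sqrt(sig_eta1 * (1 + delta) ** (j - 1))
    v[0, j - 1] = rng.normal(scale=sig)
    for t in range(1, T):
        v[t, j - 1] = beta * v[t - 1, j - 1] + rng.normal(scale=sig)

# published data, eqs (2)-(3): truth plus maturity bias plus measurement error
c = c1 * (1 + lam) ** np.arange(J)
y_pub = y[:, None] + c[None, :] + v

# later releases should sit closer to the truth on average
rmse = np.sqrt(((y_pub - y[:, None]) ** 2).mean(axis=0))
print(rmse[0], rmse[-1])
```

Because both the bias and the error variance shrink with maturity, the RMSE of the simulated published data declines from the first to the last maturity.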

STATE-SPACE MODEL

The measurement equation of the state-space model has as its dependent variable a vector of the most recent release of data,

    y_t^j = [ y_1^T, y_2^{T-1}, …, y_{T-1}^2, y_T^1 ]′.

The superscript j denotes the maturity of the datum for activity date t that is available at vintage T+1. Note that the maturities of the elements of this vector differ—some elements may be the 10th or 20th release of data for a specific activity date, while the last element is the initial release of data for activity period T. The measurement equation equates this vector to the sum of a vector of maturity-related measurement biases, c^j; the unknown true value, y_t; and a measurement error, v_t^T:

(6)    y_t^j = c^j + [1  1] [ y_t ]  + 0.
                           [v_t^T]

The transition equation for the state vector is

(7)    [ y_t ]   [ μ ]   [ α  0 ] [ y_{t-1} ]   [ ε_t ]
       [v_t^T] = [ 0 ] + [ 0  β ] [v_{t-1}^T] + [η_t^T]

with disturbance covariance matrix

(8)    [ σ²_{ε,T}        σ²_{ε,η(j,t)} ]
       [ σ²_{ε,η(j,t)}   σ²_{η(j,t)}   ].

Note that the variance of η_t^T, denoted σ²_{η^j}, is a function of t through its dependence on j, the maturity of each datum, reflecting the assumed heteroskedasticity in the measurement-error process.5 Similarly, the covariance between the shocks to the variable of interest and the measurement error, σ²_{ε,η(j,t)} = ρ_{ε,η} σ_ε σ_{η^j}, is a function of maturity, j.
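Equations (6) and (7) define a linear Gaussian state-space model, so the filtered estimate of y_t can be produced by a standard Kalman recursion. The following is a bare-bones sketch with illustrative parameter values of our own (not the paper's estimates); for simplicity it holds the measurement-error variance fixed rather than letting it decay with maturity, and it assumes the bias c^j has already been removed from the observations:

```python
import numpy as np

def kalman_filter(z, mu, alpha, beta, sig_eps2, sig_eta2, rho):
    """Filter the 2-state system of eqs (6)-(7): state s_t = [y_t, v_t]'.

    z: observed (bias-adjusted) published series, z_t = [1 1] s_t as in eq (6).
    """
    F = np.array([[alpha, 0.0], [0.0, beta]])         # transition matrix, eq (7)
    cvec = np.array([mu, 0.0])
    cov = rho * np.sqrt(sig_eps2 * sig_eta2)
    Q = np.array([[sig_eps2, cov], [cov, sig_eta2]])  # disturbance covariance, eq (8)
    H = np.array([[1.0, 1.0]])                        # measurement row [1 1]
    s = np.zeros(2)
    P = np.eye(2) * 10.0                              # diffuse-ish initial variance
    filtered_y = []
    for zt in z:
        # predict
        s = cvec + F @ s
        P = F @ P @ F.T + Q
        # update (no observation noise beyond the v-state itself)
        innov = zt - (H @ s)[0]
        S = (H @ P @ H.T)[0, 0]
        K = (P @ H.T).ravel() / S
        s = s + K * innov
        P = P - np.outer(K, H @ P)
        filtered_y.append(s[0])
    return np.array(filtered_y)

z = np.array([0.3, 0.5, 0.1, -0.2, 0.4, 0.6])
yhat = kalman_filter(z, mu=0.0, alpha=0.6, beta=0.2,
                     sig_eps2=1.0, sig_eta2=0.5, rho=-0.5)
print(yhat.shape)  # one filtered estimate of y_t per observation
```

The paper's full model additionally lets the η variance and the ε-η covariance vary with maturity j; that extension changes only the construction of Q at each step.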
Cunningham et al. (2007) make the interesting suggestion that the measurement equation may be augmented with auxiliary variables y_t^Z:

(9)    [ y_t^T ]   [ c^j ]   [ I    0   I   0 ]         [  0   ]
       [ y_t^Z ] = [ c^Z ] + [ Λ^Z  0   0   0 ] x_t  +  [v_t^Z ],

where x_t = ( y_t, …, y_{t-q+1}, v_t, …, v_{t-p+1} )′. Candidate variables include surveys and/or private-sector measures and forecasts, the assertion being that private-sector agents already have solved their own variants of the signal-extraction problem. At this time, our model omits the use of auxiliary data.

The estimation is partitioned into two parts. Assuming the measurement equation has been augmented with an auxiliary variable and allowing for AR(1) processes in the transition equation, the parameters to be estimated in the state-space model are

    Φ_1 = ( α_1, σ²_{v^Z}, μ, σ²_ε, c^Z, Λ^Z ),

conditional on estimated parameters for the measurement error's data-generating process,

    Φ_2 = ( σ²_{η^1}, δ, β_1, ρ_{ε,η}, c^1, λ ).

The estimation of Φ_2 proceeds assuming that successive revisions to each datum are well-behaved, in the statistical sense that the revisions may be used for estimation.6 Let W denote a matrix with J rows, in which each row is regarded as a vector of revisions to data of maturity j. The number of columns is T−N, that is, the number of published data vectors minus the revision horizon. A general expression for a representative row in the revision matrix W is

(10)    W(j,·) = y_t^{j+N} − y_t^j,    j + N < T, 1 ≤ t ≤ T.

5 State-space models with deterministically time-dependent variances are discussed by Durbin and Koopman (2001, pp. 172-74) and Kim and Nelson (1999).

Consider j = 1 and N = 20. In this case, the numbers in the first row of W are

    W(1,·) = y_t^{1+20} − y_t^1,    1 + N < T.

Similarly, consider j = 12 and N = 20:

    W(12,·) = y_t^{12+20} − y_t^{12},    12 + N < T.

Clearly, W has J rows and T−N columns.

Consider an estimator for the bias process,

(11)    c^j = c^1 (1 + λ)^{j-1},    -1 ≤ λ ≤ 0, 1 ≤ j ≤ J.

The row means of W provide sample measures of c^j. The parameters c^1, the mean revision of the initial release, and λ, the revision decay rate, are estimated via generalized method of moments (GMM) subject to the constraint −1 ≤ λ ≤ 0.

Next, we need an estimator for ρ_{ε,η} as part of σ²_{ε,η} = ρ_{ε,η} σ_ε σ_{η^j}. Cunningham et al. (2007) propose an estimator based on an approximation to ρ_{ε,η}, designated ρ*_{y,v}, calculated as the mean (across the J maturities) of the J correlations between the jth rows of W and the corresponding vector of published data at maturity j + N; that is, by the construction of W, N + j < T − t.

Finally, estimators are required for σ²_{η^1}, δ, and β_1 (assuming an AR(1) process in v). A sample estimate of the variance-covariance matrix is obtained as J⁻¹WW′. The analytical covariance matrix for the first-order case is

(12)    V = [ σ²_{η^1} / (1 − (1+δ)β_1²) ] ×

        [ 1                        (1+δ)β_1                 …   (1+δ)^{J-1}β_1^{J-1} ]
        [ (1+δ)β_1                 (1+δ)                    …   (1+δ)^{J-1}β_1^{J-2} ]
        [ ⋮                        ⋮                            ⋮                    ]
        [ (1+δ)^{J-1}β_1^{J-1}     (1+δ)^{J-1}β_1^{J-2}     …   (1+δ)^{J-1}          ],

and we estimate σ²_{η^1}, δ, and β_1 via GMM by minimizing

(13)    (vec V − vec V̂)′ (vec V − vec V̂).

Cunningham et al. (2007) suggest methods to obtain covariance matrices for higher lag orders.
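The construction of W and the first-stage bias estimates can be sketched directly. Below, revisions y_t^{j+N} − y_t^j are computed from a simulated vintage array; the parameter values are illustrative, and the grid search is our simplification of the paper's GMM step, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(2)
T, J, N = 120, 10, 20          # activity dates, rows of W, revision horizon
c1_true, lam_true = 0.6, -0.3  # mean initial revision and its decay rate

# y_est[t, k]: estimate of activity date t at maturity k+1 (0-based k).
# Early estimates understate the truth by c^j = c1*(1+lam)**(j-1), eq (3).
k = np.arange(J + N)
bias = c1_true * (1 + lam_true) ** k
y_true = rng.normal(size=T)
noise = rng.normal(scale=0.3, size=(T, J + N)) * (0.97 ** k)
y_est = y_true[:, None] - bias[None, :] + noise

# eq (10): row j of W holds revisions y_t^{j+N} - y_t^j
# (in this toy setup every activity date has all maturities,
#  so W has T columns rather than the paper's T - N)
W = np.stack([y_est[:, j + N] - y_est[:, j] for j in range(J)])

# row means of W are sample measures of c^j, eq (11);
# a coarse grid search stands in for the constrained GMM step
m = W.mean(axis=1)
grid = [(c1, lam)
        for c1 in np.linspace(0.1, 1.0, 46)
        for lam in np.linspace(-0.9, 0.0, 46)]
sse = [((m - c1 * (1 + lam) ** np.arange(J)) ** 2).sum() for c1, lam in grid]
c1_hat, lam_hat = grid[int(np.argmin(sse))]
print(c1_hat, lam_hat)  # should land near the true values 0.6 and -0.3
```

The sample variance-covariance J⁻¹WW′ computed from the same W is what equations (12) and (13) would then be fit against.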

SIMULATION RESULTS

We conduct Monte Carlo simulations to explore the ability of the state-space framework to extract a "true" series from a "published" series that has been contaminated with measurement error.

The simulations evaluate the ability of the model's state vector [ŷ_t, v̂_t] to track the vector of true values, y_t, relative to the tracking ability of the vector of most recently "published" values, y_t^T. For each parameterization, T = 100 and we calculate 1,000 replications.

The specification of the experiment is as follows:

• The "true" data:
    y_t = α y_{t-1} + ε_t,  t = 2,…,T,  ε_t ~ N(0,1),  y_1 = ε_1 (that is, y_0 = 0).

• The "published" data:
    y_t^j = y_t + v_t^j,  t = 1,…,T,
  where the subscript t and superscript j denote, respectively, the activity date and the maturity of the most recently published data.

• The measurement error:
    v_t^j = β v_{t-1}^j + η_t^j,  t = 2,…,T;  η_t^j ~ N(0, σ²_{η_t^j});
    σ²_{η^1} = 1;  σ²_{η_t^j} = σ²_{η^1} (1 + δ)^{j-1};
    v_1^j = η_1^j (that is, v_0^j = 0), with δ = −0.06.

• The covariance between the state of the economy and the measurement error:
    cov(ε_t, η_t^j) = ρ_{ε,η} σ_ε σ_{η_t^j}.

Figure 1 shows root mean square errors (RMSEs) of the "published" values, y_t − y_t^T, and the filtered values, y_t − ŷ_t, for one parameterization. The top panel shows the RMSE at each maturity; the bottom panel shows the difference between the published and filtered RMSEs. The figure indicates that the filtered values are better estimates of the true values for the first 13 maturities, after which the filtered values cease to provide an advantage.

[Figure 1. RMSE by Maturity, j. Model-simulation parameters: α = 0.60, β = 0.10, ρ_{ε,η} = −0.50. Top panel: RMSE by maturity, j = 0 to 20 quarters (published and filtered). Bottom panel: relative gain (published RMSE minus filtered RMSE).]

Table 1 provides corresponding results.7 The first three columns report varying parameterizations. The fourth and fifth columns report the improvement due to the state-space filter for data maturities 1 and 10, respectively. At maturity 1, the RMSEs of filtered estimates are approximately 70 percent of the RMSEs obtained when using the most recently "published" data; the values range from 64 to 75 percent, depending on the parameterization. The sixth column reports the earliest maturity at which the filtered values cease to provide an advantage; these values range from 8 to 20 periods, with an average of 11 periods.

Table 1
Measurement Accuracy Improvement Due to the State-Space Filter

Parameterization              RMSE_filtered/RMSE_published       Earliest maturity at which
ρ_{ε,η}    α      β          At maturity 1    At maturity 10     RMSE_published < RMSE_filtered
0.5        0.1    0.1        0.7179           1.1338             8
0.5        0.1    0.6        0.6489           1.0762             9
0.5        0.6    0.1        0.7500           1.1437             8
0.5        0.6    0.6        0.7344           1.1581             7
0          0.1    0.1        0.7179           1.0357             10
0          0.1    0.6        0.6479           1.0277             9
0          0.6    0.1        0.7493           1.0226             10
0          0.6    0.6        0.7337           1.0478             9
−0.5       0.1    0.1        0.7177           0.9378             15
−0.5       0.1    0.6        0.6537           0.9377             20
−0.5       0.6    0.1        0.7578           0.9494             14
−0.5       0.6    0.6        0.7331           0.9431             14

Our simulations suggest that the state-space framework may promise significant gains in measurement accuracy for recently released data if actual data are well behaved and tend to follow a low-order AR process. Previous studies suggest this might be reasonable for RGDP.

6 Hereafter, this exercise is conditional on the revision horizon N.

7 In some respects, the results presented in Table 1 are similar to those in Cunningham et al. (2007), while others are puzzlingly different. The comparable table in Cunningham et al. (2007, Table C) displays a sharp deterioration of the model's advantage as the covariance between the shock to the economy and the measurement error decreases. Additionally, the gains reported there are much larger and more persistent; in some cases, the filtered RMSEs are close to 50 percent of the published RMSEs at maturity 1 and remain so at maturity 9, and the gains from filtering last at least 18 and sometimes over 100 periods. At this time we have not resolved these discrepancies, but we thank the authors for graciously providing their simulation code.
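The experiment's data-generating process can be reproduced in a few lines. The sketch below simulates one replication of the "true" and "published" series under the stated design (α, β, δ, and ρ_{ε,η} taken from the Figure 1 parameterization in the text); the Kalman-filtering step is omitted, so only the published-data RMSE by maturity is computed:

```python
import numpy as np

rng = np.random.default_rng(3)
T, J = 100, 20
alpha, beta, delta, rho = 0.60, 0.10, -0.06, -0.50

# innovation standard deviations by maturity: sig_eta1 = 1, decaying at rate delta
sig_eta = np.sqrt((1 + delta) ** np.arange(J))

# draw eps (shock to y, sigma_eps = 1) and eta correlated with it so that
# cov(eps_t, eta_t^j) = rho * sig_eps * sig_eta_j (Cholesky-style construction)
eps = rng.normal(size=T)
eta = sig_eta * (rho * eps[:, None]
                 + np.sqrt(1 - rho ** 2) * rng.normal(size=(T, J)))

y = np.zeros(T)
v = np.zeros((T, J))
y[0], v[0] = eps[0], eta[0]
for t in range(1, T):
    y[t] = alpha * y[t - 1] + eps[t]        # the "true" data
    v[t] = beta * v[t - 1] + eta[t]         # the measurement error

y_pub = y[:, None] + v                       # "published" data by maturity
rmse = np.sqrt(((y_pub - y[:, None]) ** 2).mean(axis=0))
print(rmse.round(2))                         # tends to shrink as maturity grows
```

Averaging such RMSE curves over 1,000 replications, and adding the filtered counterpart, would reproduce the shape of the top panel of Figure 1.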

EMPIRICAL MODEL

Our empirical work examines vintage data on the annualized growth rate of quarterly RGDP constructed with data from the Federal Reserve Bank of St. Louis ALFRED database, specifically, nominal GDP and the implicit price deflator (GDPDEF).8 The construction of RGDP accounts for changes in the base year of GDPDEF, so as to maintain the correct interactions between the base year and subsequent vintages. Thus, in this vintage RGDP matrix, the most recently published data vector matches the data available from the Bureau of Economic Analysis (BEA). The specifics of the process are described in Appendix B.

Estimation proceeds in two steps. First, we estimate the parameters of the measurement-error process,

    Φ_2 = ( σ²_{η^1}, δ, β_1, ρ_{ε,η}, c^1, λ ).

Second, conditional on these parameters, we estimate the parameters of the state-space model (omitting any auxiliary data),

    Φ_1 = ( α_1, σ²_{v^Z}, μ, σ²_ε ).

Cunningham et al. (2007) note that one reason to use this two-step procedure is that identification conditions may fail if all parameters were estimated together.9 Moreover, because the framework set forth requires only the most recently published vintage of data, joint estimation of the parameters would require inputting the entire history of revisions into the model.

Table 2
Estimated Revision-Bias Parameters

RGDP    Estimate    Lower bound    Upper bound
c^1     0.5793      0.3317         0.8269
λ       −0.2828     −0.4515        −0.1141

NOTE: Upper and lower bounds represent 95 percent confidence intervals (CIs).

Table 3
Estimated v̂ Parameters

RGDP                                       Estimate    Lower bound    Upper bound
Initial variance, σ²_{η^1}                 2.7002      2.3988         3.0017
Variance decay, δ                          −0.0786     −0.0931        −0.0641
First-order serial correlation, β_1        0.2004      0.1451         0.2557
Correlation with mature data, ρ*_{yv}      0.3181      0.1399         0.4963

NOTE: Upper and lower bounds represent 95 percent CIs.

8 The adoption of a chain-weighted price index in the middle of the sample adds an additional dynamic to the RGDP revision process. It would be ideal to use only post-chain-weighted data; however, the sample size is not sufficient for estimation.

9 Cunningham et al. (2007) do not explore the satisfaction and/or violation of the relevant conditions; neither have we, although doing so seems a worthwhile task, to say the least.

Estimation of Φ2 Parameters

As noted previously, the first step of the estimation is to choose the revision horizon; here it is N = 20 (5 years). Our choice is explored in Appendix A. We input the W matrix (produced by equation (10)) into equation (11) to estimate values for the mean revision of the initial release, c^1, and the revision decay parameter, λ. For robustness purposes, Figure A2 shows the estimated and actual values of c^j at different horizons.

Table 2 reports the parameters estimated via GMM. The mean revision to the initial release is statistically different from zero: The initial release of RGDP is, on average, 0.57 percentage points lower than the RGDP reported five years later. The revision decay parameter, λ, describes the rate at which revisions shrink as the data mature: At maturity 2, the revision to RGDP is estimated to be 28 percent smaller than the initial revision.

The next step is to calculate the correlation between the measurement error and the "true" unobserved state of the economy, ρ_{ε,η}. Because we observe neither the measurement error nor the "true" state, we continue to use revisions to data of maturity j as a proxy for the measurement error, and we assume data reported at maturity j + N are good estimates of the "true" state of the economy. The estimated mean correlation between the revisions of maturity j and the reported values at j + N is used as an estimate for ρ_{ε,η}, denoted ρ*_{y,v}; that is,

    ρ_{ε,η} ≈ ρ*_{y,v} = (1/J) Σ_{j=1}^{J} ρ( y^{j+N}, W(j,·) ).

The last row of Table 3 reports the estimate of ρ*_{yv}. The correlation between revisions to the data and the estimated "true" state is positive; although it is not reported, the correlation is positive for all j. Appendix Figure A3 explores the choice of the revision horizon: The values of ρ*_{yv} stabilize for sufficiently large revision horizons.

The final set of first-stage estimates—the serial correlation between revisions, the initial variance of revisions, and the variance decay rate (β_1, σ²_{η^1}, and δ, respectively)—are derived from the variance-covariance matrix of W, denoted V̂. The parameters are estimated per equations (12) and (13). Table 3 reports the estimated parameters for our preferred N. All estimates are significantly different from zero. Notice that the first-order serial correlation in the revisions is positive. The initial variance is markedly higher than that assumed in the simulation; however, the variance decay parameter, −0.0786, is close to the −0.06 value used in the simulation.

[Figure 2. Actual and Filtered RGDP Growth (vintage 07/31/2008). Percent change at an annual rate, 1998-2008. NOTE: The fan chart depicts the probability of various outcomes for RGDP after the data are "fully revised" (i.e., data reported in 5 years). Fully revised data are expected to fall within the fan chart 90 percent of the time. Each pair of shaded regions indicates an additional 10 percent CI.]
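The ρ*_{y,v} approximation above is simply an average of J correlations. Given a revision matrix W and the corresponding matrix of mature published values (both illustrative arrays below, not ALFRED data), it can be computed as:

```python
import numpy as np

def rho_star(W, y_mature):
    """Mean over the J maturities of corr(published data at maturity j+N, W(j,.))."""
    J = W.shape[0]
    cors = [np.corrcoef(y_mature[j], W[j])[0, 1] for j in range(J)]
    return float(np.mean(cors))

# illustrative stand-ins: 5 maturities, 40 aligned observations each
rng = np.random.default_rng(4)
y_mat = rng.normal(size=(5, 40))            # stand-in for mature published data
W = 0.3 * y_mat + rng.normal(size=(5, 40))  # revisions sharing a common signal
print(round(rho_star(W, y_mat), 2))
```

With real data, each row of `y_mature` must be aligned column-by-column with the activity dates used to build the corresponding row of W.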

Estimation of Φ1 Parameters

Using the parameters estimated in the previous section, the vector of recently published data is put into the state-space model.10 The parameter driving the state-space model's covariance matrix, σ²_ε—the variance of the shock to the AR(q) data-generating process for the "true" data—is estimated to be 3.55 (0.55); for U.K. investment data, Cunningham et al. (2007) report an error variance of 3.22 (0.67). Estimation results are shown in Figure 2. The solid black line is the most recently published data, the darkest band is the mean filtered value, and the outermost band is the 90 percent confidence interval (CI). As the variance of the revisions decays, so does the width of the CI.

The RGDP growth rate at the most recent data point, 2008:Q2, was initially published as 1.89 percent on July 31, 2008. The estimated value is 2.47 percent, with a 90 percent CI between −5.17 percent and 10.23 percent. As of the February 27, 2009, release, 2008:Q2 RGDP was reported as 2.83 percent.

This state-space modeling framework shares features with multivariate stochastic volatility models. Harvey, Ruiz, and Shephard (1994) introduce such a model as an alternative to generalized autoregressive conditional heteroskedasticity (GARCH) models for high-frequency data. Their problem differs from ours—in their model, "multivariate" refers to four countries' exchange rates modeled together, rather than 20 or more maturities of a single activity date—but it is similar to the extent that the stochastic variance is assumed to follow an AR(1) process. This line of econometrics deserves further investigation.

10 Estimation of the model is problematic. Although the data-generating processes for both the "true" data and the measurement error are initially asserted to be AR(q), in the model the AR(q) parameters are not identified. The parameters are also omitted from estimation in Cunningham et al. (2007). Absent promising findings in the next section, far more estimation is necessary before confidence may be placed in such results.
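The narrowing of the fan chart in Figure 2 follows mechanically from the maturity-dependent error variance of equation (5). The sketch below computes approximate 90 percent half-widths using the Table 3 point estimates as plug-in values; the paper's bands come from the full filter, so this normal-approximation shortcut is ours, not the authors':

```python
import numpy as np

sig2_eta1, delta = 2.7002, -0.0786    # Table 3 point estimates
j = np.arange(1, 21)                   # maturities 1..20 quarters
sd_j = np.sqrt(sig2_eta1 * (1 + delta) ** (j - 1))  # eq (5) standard deviations
half_width_90 = 1.645 * sd_j           # approximate 90% band half-width
print(half_width_90[0].round(2), half_width_90[-1].round(2))
```

The band is widest at maturity 1 and shrinks geometrically as the datum matures, mirroring the taper of the fan chart toward the "fully revised" horizon.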

REAL-TIME MODEL EVALUATION

Using the real-time RGDP data series to evaluate model accuracy follows closely the methodology of our simulation exercise. The main restriction with the actual data is that we do not observe the true value of each datum. We proxy the true values with data that have become "fully mature" at time t + N (where N = 20). Our real-time sample is restricted to those vintages of data with 10 years of data preceding (to estimate the parameters in Φ_2) and 5 years of data following (to evaluate the forecast) the vintages of interest. This exercise uses the data range 1985:Q4–2003:Q2 and vintages between v_0 = 01/30/2002 and v_k = 07/31/2003 as the most recently published data. This does not limit our ability to make real-time forecasts with the current data; however, it does inhibit us from testing the forecasting performance for 5 years.

We estimate the model for vintages v_0, v_1,…,v_k independently, keeping the number of observations and maturities fixed across vintages. For each successive release, we omit the oldest datum and add the most recent. This process corresponds nicely with the idea of running k iterations of the model simulation. The metric used to evaluate model performance is the ratio of the RMSE using the filtered datum as a predictor of the "mature" datum to the RMSE using the published datum as a predictor of the "mature" datum.

As described previously, the first stage in the real-time forecasting exercise is to estimate the parameters in Φ_2. Given a revision horizon of N = 20 (5 years) for RGDP, at least 10 years of vintage data are used to estimate the W matrix and corresponding parameters.11 For each successive vintage, we update the dataset (i.e., add a column to the W matrix) and reestimate the parameters in Φ_2.

The stability of each parameter is assessed in Figure 3. The horizontal axis reports the vintage: 01/30/02 indicates that data available only on or before January 30, 2002, were used in the estimation of the parameters; thus, the final column of the revision matrix W contains revisions to data released 20 quarters prior. The latest data points are identical to those reported in Tables 2 and 3.

The mean revision of the initial release (top-left panel of Figure 3) steadily decreases as the real-time sample includes more recent data, indicating some improvement in the BEA's ability to report less-biased estimates of the "true" values. Conversely, the initial variance of the measurement error (middle-left panel) increases, indicating increased uncertainty around the initial estimates. The decay parameters corresponding to the initial mean and variance of the revisions are reported by λ and δ, respectively.

The parameters in Figure 3 are displayed for all available vintages; however, the real-time forecasting exercise uses only the vintages for which the fully mature data are available. As noted earlier, this reduces the real-time sample to only the first seven vintages of data. For example, the mature values for January 30, 2002, data are reported 20 quarters later, on January 31, 2007.

[Figure 3. Estimated Φ_2 Parameters in Real Time. Six panels plot the first-stage estimates—c^1, λ, σ²_{η^1}, δ, β_1, and ρ*_{yv}—against estimation vintages from 01/30/02 through 07/31/09.]

11 We require that the W matrix have at least as many vintages (or columns) as it has maturities (or rows). The calculation in equation (10) requires at least N vintages of data as well as the observed "true" values. In other words, for every vintage, v, we must also observe the data at v + N.

The top panel of Figure 4 plots the RMSEs of the published data and of the filtered values. The bars in the bottom panel measure the difference

between the two series.12 In some ways, the results are similar to those simulated in Figure 1: The filtered values tend to be superior estimates for the first 6 quarters, and the advantage then diminishes. Unlike in the simulation, however, the transition between quarters is not particularly smooth. In the top panel of Figure 4, the RMSEs of both series actually increase for data maturities 3 through 6 and steeply decline thereafter. According to the bars in the bottom panel of Figure 4, the filtered values show greater improvement for maturity 2 than for the initial release.

[Figure 4. Real-Time Model Performance, RGDP. Top panel: RMSE by maturity, j = 0 to 20 quarters (published and filtered). Bottom panel: relative gain (published RMSE minus filtered RMSE).]

Table 4
Real-Time Model Performance

Improvement due to state-space filter (RMSE_filtered/RMSE_published)

Release of        Release of          Maturities 1-4   Maturities 5-8   Maturities 9-12   Maturities 13-16   Maturities 17-20
published data    fully mature data   (Year 1)         (Year 2)         (Year 3)          (Year 4)           (Year 5)
1/30/2002         1/31/2007           0.9387           0.8561           1.0018            1.0018             1.0002
4/26/2002         4/27/2007           0.8933           0.9988           0.9885            0.9957             1.0002
7/31/2002         7/27/2007           0.9056           0.9601           1.0179            0.9942             1.0003
10/31/2002        10/31/2007          0.9184           0.9980           1.0128            0.9977             1.0000
1/30/2003         1/30/2008           0.9809           0.9725           1.0083            0.9985             1.0000
4/25/2003         4/30/2008           0.9900           0.9815           1.0254            0.9979             0.9988
7/31/2003         7/31/2008           0.4852           1.0039           1.0032            0.9997             0.9996
Average                               0.8731           0.9673           1.0083            0.9979             0.9999

12 The following results should be interpreted with some caution; they are constrained by only seven consecutive releases of data, and additional data points may drastically alter the results.

13 This outlier is driven by a particularly inaccurate initial release of 2.37 percent for 2003:Q1; the filtered value was 3.74 percent, and the value reported on January 31, 2007, was 3.46 percent.

For further examination, Table 4 reports the improvement due to the state-space filter for the seven vintages. The first two columns report the vintages of the published data and the fully mature data, respectively. The remaining five columns report the improvement due to the state-space filter for data of varying maturities. The bottom row of the table reports the average across the seven vintages. During the first year, the average RMSE of the filtered values is 87 percent of the average RMSE of the published data. The filtered values most improve the data published July 31, 2003: The RMSE of the filtered values is 48 percent of the RMSE of the published data.13 For data 2 years old, there is only modest improvement: The RMSE of the filtered values is 97 percent of

the RMSE of the published data. Thus, after data have been revised for 2 years, there is no apparent gain from the filtered values.

In addition to improved estimates of the true values, as measured by the RMSEs, the filtered values also provide CIs, which are not provided by the data releases. The CIs indicate the extent to which incoming data are likely to be revised, providing some assessment of how much weight to assign to each datum. Of the 504 mature data points observed, approximately 83 percent fell within the 90 percent CI and approximately 41 percent fell within the 50 percent CI. These numbers seem reasonable, as the sample consists of seven consecutive vintages—meaning any given outlier could be repeated up to seven times within our sample.

CONCLUSION
A long line of papers has explored methods
to pool vintages of economic data, seeking to
extract the true (or, at least strongest) underlying
signal for a variable of interest. A recent, and likely
fruitful, path is to introduce a cohort-style analysis
that examines the revisions as a function of the
age of the data and estimates the “true” unobservable values via a state-space framework. Here, we
have begun the application of such techniques to
U.S. data, specifically using a measure of RGDP.
The framework is equally applicable to quarterly
or monthly data, although we have not yet considered the case of mixed frequencies (including
when monthly observations are published quarterly, such as for GDP).
Monte Carlo experiments suggest that, for a
wide range of parameter values in AR datagenerating processes, the framework explored
here may be able to extract estimates of recent
values of economic variables and reduce uncertainty by as much as 30 percent. Obviously, empirical application of such techniques introduces
statistical challenges when pooling data across
cohorts. The “revision horizon,” to a large extent,
is an arbitrary selection, and robustness experiments are required. Further, if underlying unobserved true data are to be recovered as a state
362

J U LY / A U G U S T

2009

vector, issues regarding the lack of statistical identification require further exploration. It appears,
however, even with these caveats, the modeling
framework does provide estimators for two important variances—the variance of the empirical
measurement error embedded in each published
datum and the variance of the data-generating
process of the true underlying economic variable.
Real-time experiments, albeit with limited data, suggest that uncertainty in RGDP estimates appears to be reduced by close to 10 percent at early maturities. In addition, CIs extracted from the model provide information unattainable from data releases alone. Both, perhaps, will assist economists and policymakers by providing a set of "revision CIs" around releases of incoming data.
One limitation of the methodology is the large amount of data required to produce and evaluate estimates in real time. Nonfarm payroll employment data, with monthly revisions and a long release history, are a good candidate for the application of this methodology.

REFERENCES
Aruoba, S. Borağan. "Data Revisions Are Not Well-Behaved." Journal of Money, Credit and Banking,
Aruoba, S. Borağan; Diebold, Francis X. and Scotti,
Chiara. “Real-Time Measurement of Business
Conditions.” Unpublished manuscript, April 2007;
2008 version: Working Paper 08-19, Federal Reserve
Bank of Philadelphia;
www.philadelphiafed.org/research-and-data/
publications/working-papers/2008/wp08-19.pdf.
Boskin, Michael J. “Getting the 21st Century GDP Right:
Progress and Challenges.” American Economic
Review, May 2000, 90(2), AEA Papers and
Proceedings, pp. 247-52.
Croushore, Dean. “Revisions to PCE Inflation
Measures: Implications for Monetary Policy.”
Unpublished manuscript, University of Richmond,
November 2007.

F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W

Anderson and Gascon

Croushore, Dean and Stark, Tom. “A Funny Thing
Happened on the Way to the Data Bank: A Real-Time
Data Set for Macroeconomists.” Federal Reserve
Bank of Philadelphia Business Review, September/
October 2000, pp. 15-27; www.phil.frb.org/
research-and-data/publications/business-review/
2000/september-october/brso00dc.pdf.

Harvey, Andrew; Ruiz, Esther and Shephard, Neil.
“Multivariate Stochastic Variance Models.” Review
of Economic Studies, April 1994, 60(2), pp. 247-64.

Croushore, Dean and Stark, Tom. “Data Revisions
and the Identification of Monetary Policy Shocks.”
Journal of Econometrics, November 2001, 105(1),
pp. 111-30.

Howrey, E. Philip. “Data Revision, Reconstruction,
and Prediction: An Application to Inventory
Investment.” Review of Economics and Statistics,
August 1978, 66(3), pp. 386-93.

Cunningham, Alastair; Eklund, Jana; Jeffrey, Chris;
Kapetanios, George and Labhard, Vincent. “A State
Space Approach to Extracting the Signal from
Uncertain Data.” Working Paper 336, Bank of
England, November 2007; www.bankofengland.co.uk/
publications/workingpapers/wp336.pdf.

Jacobs, Jan P.A.M. and van Norden, Simon. “Modeling
Data Revisions: Measurement Error and Dynamics
of ‘True’ Values.” CCSO Working Paper 2006/07,
CCSO Centre for Economic Research, December
2006; www.eco.rug.nl/ccso/quarterly/200607.pdf.

Diebold, Francis X. and Rudebusch, Glenn D.
“Forecasting Output with the Composite Leading
Index: A Real-Time Analysis.” Journal of the
American Statistical Association, September 1991,
86(415), pp. 603-10.
Durbin, James and Koopman, Siem J. Time Series
Analysis by State Space Methods. Oxford: Oxford
University Press, 2001.
Faust, Jon; Rogers, John H. and Wright, Jonathan.
“News and Noise in G-7 GDP Announcements.”
Journal of Money, Credit, and Banking, June 2005,
37(3), pp. 403-19.
Garratt, Anthony; Koop, Gary and Vahey, Shaun P.
“Forecasting Substantial Data Revisions in the
Presence of Model Uncertainty.” Economic Journal,
July 2008, 118(530), pp. 1128-44.

Howrey, E. Philip. “The Use of Preliminary Data in
Econometric Forecasting.” Review of Economics
and Statistics, May 1978, 60(2), pp. 193-200.

Kim, Chang-Jin and Nelson, Charles R. State-Space
Models with Regime Switching. Cambridge, MA:
MIT Press, 1999.
Kishor, N. Kundan and Koenig, Evan F. “VAR
Estimation and Forecasting When Data Are Subject
to Revision.” Working Paper 0501, Federal Reserve
Bank of Dallas, February 2005; http://dallasfed.org/
research/papers/2005/wp0501.pdf.
Kozicki, Sharon. “How Do Data Revisions Affect the
Evaluation and Conduct of Monetary Policy?”
Federal Reserve Bank of Kansas City Economic
Review, First Quarter 2004, pp. 5-37.
Landefeld, J. Steven; Seskin, Eugene P. and Fraumeni,
Barbara M. “Taking the Pulse of the Economy:
Measuring GDP.” Journal of Economic Perspectives,
Spring 2008, 22(2), pp. 193-216.

Garratt, Anthony; Lee, Kevin; Mise, Emi and Shields,
Kalvinder. “Real Time Representations of the
Output Gap.” Review of Economics and Statistics,
November 2008, 90(4), pp. 792-804.

Mankiw, N. Gregory and Shapiro, Matthew D. “News
or Noise? An Analysis of GNP Revisions.” NBER
Working Paper No. 1939, National Bureau of
Economic Research, June 1986;
www.nber.org/papers/w1939.pdf?new_window=1.

Grimm, Bruce T. and Weadock, Teresa L. “Gross
Domestic Product: Revisions and Source Data.”
Survey of Current Business, February 2008, 86(8),
pp. 11-15.

Orphanides, Athanasios and van Norden, Simon.
“The Unreliability of Output-Gap Estimates in Real
Time.” Review of Economics and Statistics,
November 2002, 84(4), pp. 569-83.


Orphanides, Athanasios and van Norden, Simon.
“The Reliability of Inflation Forecasts Based on
Output Gap Estimates in Real Time.” Journal of
Money, Credit, and Banking, June 2005, 37(3),
pp. 583-601.

Stock, James H. and Watson, Mark W. “Why Has U.S.
Inflation Become Harder to Forecast?” Journal of
Money, Credit, and Banking, January 2007, 39(1),
pp. 3-33.

Sargent, Thomas. “Two Models of Measurements and
the Investment Accelerator.” Journal of Political
Economy, April 1989, 97(2), pp. 251-87.


APPENDIX A: SELECTION OF REVISION HORIZON (N)
Figure A1 shows the rows of the W matrix produced by equation (10) for RGDP over time at two
different maturities, j, and horizons, N. The top panel shows the revision to the initial release after 1
quarter; the second panel shows the revision to the 20th release after 1 quarter. As expected, the revisions to the 20th release after 1 quarter tend to be zero. When the horizon is extended from 1 quarter
to 5 years (bottom two panels), the 20th release does exhibit revision. Figure A2 shows the estimated
and actual revision process of data subject to a revision horizon N = 1, 5, 10, and 20. Figure A3 shows
the correlation between revisions and the “true” data subject to a revision horizon.

Figure A1
Revisions to Initial Release and 5-Year-Old RGDP Data at Different Horizons
[Four panels, revision in percent (−5 to 5) against observations 0-70: (i) revision to initial release after 1 quarter, W(1,.), N = 1, t = 1 (1991:Q4); (ii) revision to 20th release after 1 quarter, W(20,.), N = 1, t = 1 (1987:Q1); (iii) revision to initial release after 5 years, W(1,.), N = 20, t = 1 (1991:Q4); (iv) revision to 20th release after 5 years, W(20,.), N = 20, t = 1 (1987:Q1).]


Figure A2
Mean Revisions to RGDP by Maturity, j, at Revision Horizon N
[Four panels (N = 1, 5, 10, 20), mean revision in percent (−0.5 to 1.0) against maturity j = 0-20 quarters; each panel plots model and actual revisions.]

Figure A3
Correlation of Revisions Between Maturity, j, and Published Estimates at Maturity, j + N
[Single panel: correlation ρyv* (0.05 to 0.40) against N = 0-20.]


APPENDIX B: ABOUT THE DATA
How the Real GDP Data Are Created
The Federal Reserve Bank of St. Louis ALFRED database allows researchers to retrieve vintage
versions of economic data that were available on specific dates in history. Most data are available in
real and nominal terms. If a researcher is interested in one vintage of data, the real series may be suitable; however, in our case we are interested in all vintages of real GDP. As reported in ALFRED, the
unit of measure on the real series changes by vintage. For example, between December 4, 1991, and
January 18, 1996, real GDP is reported in billions of 1987 dollars, whereas between January 19, 1996,
and October 28, 1999, the series is reported in billions of chained 1992 dollars. Due to changes in the
deflator, it is not suitable to obtain the real series from ALFRED and simply calculate the revisions. As an alternative, the nominal GDP (GDPA) and GDP implicit price deflator (GDPDEF) series are used to create a vintage real GDP series.14
As with real GDP, the unit of measure of GDPDEF changes across vintages. Therefore, before deflating GDPA, GDPDEF must be reindexed. The data available in ALFRED are “as reported,” meaning the
base year varies from 1987 = 100 for vintages before January 18, 1996, to 2000 = 100 for vintages after
December 12, 2003. Further complicating the issue, the data released in the base years (1985, 1992, 1996,
and 2000) are also subject to revision; therefore the indexing of GDPDEF can also change between vintages within the same base year. Because we are interested in revisions to the data resulting from new
information, and not simply changes in the base year, we reindex all GDPDEF data to a constant base
year. To match the new series to the most recently reported data, we choose to index all of the data by
setting 2000 = 100 in the July 31, 2008, release. We denote the new deflator series DEFL.
The real GDP series are constructed by dividing each date and vintage of GDPA by the corresponding date and vintage of DEFL. After deflating the data, annualized growth rates of each vintage are calculated, and we denote the resulting series RGDP.
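The rebase-and-deflate construction can be sketched as follows. The series names GDPA and GDPDEF follow the text, but all numbers below are invented for illustration:

```python
def reindex(deflator, base_idx):
    """Rescale a deflator vintage so that the chosen base period equals 100."""
    base = deflator[base_idx]
    return [100.0 * d / base for d in deflator]

def deflate(nominal, deflator):
    """Real series = nominal / (deflator / 100), observation by observation."""
    return [n / (d / 100.0) for n, d in zip(nominal, deflator)]

def annualized_growth(series):
    """Annualized quarterly growth rates, in percent."""
    return [100.0 * ((series[t] / series[t - 1]) ** 4 - 1)
            for t in range(1, len(series))]

gdpa = [10000.0, 10150.0, 10320.0]   # nominal GDP, one vintage (invented)
gdpdef = [95.0, 95.5, 96.1]          # deflator in an old base year (invented)
defl = reindex(gdpdef, base_idx=0)   # rebased so the chosen period = 100
rgdp = deflate(gdpa, defl)
growth = annualized_growth(rgdp)     # annualized real growth, percent
```

Applying `reindex` to every vintage with a common base period removes base-year changes, so remaining differences across vintages reflect revisions due to new information.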
Because the models are not well suited for mixed-frequency data,15 we elect to use only the data
vintages in which a new advance estimate is released. Consistent with our dataset, the first maturity
(n = 1) in national income and product accounts (NIPA) data is the advance estimate. In the NIPA data
from ALFRED, the preliminary estimate would be the second maturity; however, we omit this vintage,
as well as the final estimate. We label the fourth release, which is released at the same time as the subsequent quarter’s advance estimate, as the second maturity (n = 2).
Table B1 presents a stylized real-time dataset after the preliminary and final vintages have been
removed from the data. The columns denote the data vintages; the rows denote the dates of the observations. For descriptive purposes, each element in the dataset is reported with a superscript identifying
the maturity, j, of the observation.
The analysis in this paper hinges on the value chosen for the maturity horizon, or “look-ahead
distance," denoted J. The value of J is the horizon at which the data are assumed to be true, in that no further revisions to the data will occur. This paper does not include a formal discussion of the appropriate
horizon. Our visual inspection of the data, summarized in Appendix A, and data limitations lead us
to set a 5-year horizon ( J = 20) for GDP and RGDP. For robustness purposes, in Figures A1, A2, and A3
all parameters in Φ2 are reported for alternative values for J.
14. The GDPDEF is chosen over the preferred chain-type price index (GDPCTPI) when available. The oldest vintage for GDPDEF is December 4, 1991, whereas the oldest vintage for GDPCTPI is January 19, 1996.

15. The BEA releases the quarterly GDP series at a monthly frequency: The first release is the advance, the second the preliminary, and the third the final release.


Table B1
Real-Time Dataset: Annualized Growth Rate of Real GDP

Activity date (t) | Vintage (v): 1999:Q3 | 1999:Q4 | 2000:Q1 | … | 2008:Q1 | 2008:Q2 | 2008:Q3
1999:Q2 | 5.6^1 | 4.7^2 | 4.7^3 | … | 6.3^35 | 6.3^36 | 6.3^37
1999:Q3 | NA | 6.8^1 | 7.9^2 | … | 7.7^34 | 7.7^35 | 7.7^36
1999:Q4 | NA | NA | 10.1^1 | … | 11.0^33 | 11.0^34 | 11.0^35
⯗
2007:Q4 | NA | NA | NA | … | 5.8^1 | 5.5^2 | 4.9^3
2008:Q1 | NA | NA | NA | … | NA | 5.9^1 | 6.1^2
2008:Q2 | NA | NA | NA | … | NA | NA | 4.1^1

NOTE: Superscripts denote maturity, j. Following the notation in the paper, y_t^j denotes y at time t at maturity j; for example, y_{2007:Q4}^3 equals 4.9 percent.

Why Are NIPA Data Revised?
Clearly, revisions to NIPA data are not caused by statisticians at the BEA finding computational
errors and fixing them. Two main causes of such revisions to NIPA data are that over shorter horizons
new data become available (thus prompting revisions) and over longer horizons methodology changes.
Statisticians and economists at the BEA are well aware of these problems and over time have made
significant updates to the data collection and publication process. At the same time, this paper assumes
that by mining the data and revision process we can more accurately predict the true values of a series
of interest. We make this assumption not because of any inadequacy of the BEA’s work, but rather because
of the complexity of the task.
Short-term data revisions are largely a result of the tradeoff faced by the BEA. On one hand, there
is pressure for timely releases of information; on the other hand, there is an assumption that the data
released accurately measure the underlying variable of interest. Because of the desire for timely estimates, the BEA releases their first, or “advance,” estimate with only 75 percent of data for the past quarter
(Landefeld et al., 2008). The estimates of the true value are revised as more data become available.
Table B2 outlines the four data types used to construct the GDP series as well as the total share of each for 2003:Q3, as reported by Grimm and Weadock (2008) in the Survey of Current Business. Trend-based
data are imputed data; complete data are data that have been reported for the quarter for all three months
of the quarter; monthly trend-based data include two months of data and imputed-data for the third
month of the quarter; and revised data are simply revised estimates of the complete data. Notice that
the advance estimate (n = 1) does not contain any revised data and less than half of the data is complete,
whereas over three-fourths of the data in the final release (n ≈ 2) is complete or has been revised. At
the time of the annual revision,16 over 90 percent of the data is complete or has been revised. Detailed
information on the data sources, revision process, and methodology used to create the NIPA data are
provided by Landefeld et al. (2008).

16. The maturity of these data is a function of t. For Q1 data the annual revision will occur at n ≈ 4; for Q2 data at n ≈ 3; for Q3 data at n ≈ 2; and Q4 data will not be subject to an annual revision until the next year, or n ≈ 5.


Table B2
Data Sources for Short-Term Revisions to GDP (percent)

Data source | Share of 2003:Q3 GDP: Advance estimate | Final estimate
Trend-based data | 25.1 | 20.9
Monthly and trend-based data | 29.7 | 1.2
Complete data | 45.3 | 8.4
Revised data | — | 69.5

SOURCE: Grimm and Weadock (2008).

In addition to problems caused by the lack of available data, challenges exist in quantifying the actions of economic agents, such as measuring growth in the service sector, identifying new products as they enter the economy, and capturing quality improvements in existing products (see Boskin, 2000). Because
of the large scale of these problems, the BEA normally addresses these issues of definitions and methodology in 5-year “benchmark” revisions. In forecasting the true values of GDP, we make no assumptions
about the changes these revisions make. The inability of the model to forecast changes that occur during
benchmark revisions is a shortcoming of our work as well as that of other scholars in this field.


Commentary
Dean Croushore

It is a pleasure to discuss Richard Anderson
and Charles Gascon’s (2009) article on their
attempt to develop a state-space model to
measure potential output growth in the face
of data revisions. They use the methodology of
Cunningham et al. (2007) applied to real output,
to see if they can develop a better measure of
potential output than other researchers. Such an
approach seems promising, and they develop a
unique method to study the data.
This approach holds promise because many
practical approaches based on standard statistical
models or production functions have not proven
reliable indicators of potential output. One reason
these methods may fail could be that the data are
revised and the methods used do not account for
such revisions. By accounting for data revisions
in a systematic way, the authors hope to develop
an improved calculation of potential output.
However, if the potential output series is subject to breaks not easily detected for many years,
this approach may not be fruitful—you simply
must wait many years to determine what potential
output is. The state-space method may be ideal
for calculating latent variables that correspond
to an observable variable subject to large data
revisions, but it is not helpful for early detection
of breaks in series like potential output. There is
simply no getting around the fundamental fact
that potential output inherently requires the use
of a two-sided filter and will be tremendously
imprecise at the end of the sample when only a
one-sided filter can be used.

CONGRESSIONAL BUDGET
OFFICE MEASURES OF
POTENTIAL OUTPUT
Many economic models rely on the concept
of potential output, yet it is not observable. As
new data arrive over time, practitioners who need
a measure of potential output for their models use
various statistical procedures to revise their view
of potential output. One such practitioner is the
Congressional Budget Office (CBO), which has
produced a measure of potential output since 1991.
An examination of some of the changes in their
measure of potential output over time helps illustrate some of the difficulties of using the concept.
Figure 1 shows the CBO January 1991 and
January 1996 versions of potential output growth.
The vertical bars indicate the dates the series were
created. In the 1991 version, potential output
growth rises in discrete steps over time; in the
1996 version, growth rates evolve more smoothly.
In the 1996 version, there is substantial volatility
in potential output growth in the 1970s and early
1980s; in the 1991 version it is smoother.
Figure 2 compares the CBO 1996 and 2001
versions. Differences in the series’ volatility in
the 1970s and early 1980s and growth rates in the
1990s and 2000s are substantial. For example, in
1996, the CBO thought potential output growth
for 1996 was about 2 percent per year; but in 2001,
they thought it was about 3 percent.
The CBO 2001 and 2008 versions of potential output growth (Figure 3) show even greater volatility in the 1970s and early 1980s than earlier published series (see Figures 1 and 2) and a large difference in growth rates in the 2000s.

Dean Croushore is a professor of economics and Rigsby Fellow at the University of Richmond.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 371-81.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Figure 1
CBO Potential Output Growth, 1991 and 1996
[Line chart: growth rate (percent per year), 0 to 6, over 1950-2010; 1991 and 1996 vintages.]
NOTE: The vertical bars indicate the dates the data series were created.
SOURCE: Federal Reserve Bank of St. Louis ArchivaL Federal Reserve Economic Data (ALFRED) database (series ID: GDPPOT).

Figure 2
CBO Potential Output Growth, 1996 and 2001
[Line chart: growth rate (percent per year), 0 to 6, over 1950-2010; 1996 and 2001 vintages.]
SOURCE: Federal Reserve Bank of St. Louis ALFRED database (series ID: GDPPOT).

Figure 3
CBO Potential Output Growth, 2001 and 2008
[Line chart: growth rate (percent per year), 0 to 6, over 1950-2010; 2001 and 2008 vintages.]
SOURCE: Federal Reserve Bank of St. Louis ALFRED database (series ID: GDPPOT).
Thus, in the CBO’s view, the period with the
greatest revisions to potential output growth over
time is the 1970s and early 1980s. In addition, in
both the 1996 and 2001 versions, the potential
growth rates at the end points of the sample
changed substantially over time. This end-point
problem is the major challenge to constructing a
better measure of potential output.

KEY ASPECTS OF THE ANDERSON
AND GASCON APPROACH
Key aspects of the approach taken by Anderson
and Gascon include using a state-space approach
(a very reasonable method) and exploiting the
forecastability of data revisions following
Cunningham et al. (2007). However, the real-time research literature, as described in detail in
Croushore (2008a), includes few examples of

macroeconomic variables whose revisions are
forecastable in real time. Forecastable variables
include U.S. retail sales (see Conrad and Corrado,
1979), Mexican industrial production (see
Guerrero, 1993), gross domestic product (GDP)
in Japan and the United Kingdom (see Faust,
Rogers, and Wright, 2005), and U.S. core personal
consumption expenditures (PCE) inflation (see
Croushore, 2008b). U.K. GDP is the focus of
Cunningham et al. (2007). For U.S. GDP, revisions
are not likely forecastable at all. And if this indeed
is the case, the major feature of the Anderson and
Gascon article could be a false trail.
A simulated out-of-sample exercise using
real-time data must be performed to determine
whether revisions are forecastable. Simply running
a regression using an entire sample of data is not
sufficient because finding a significant coefficient
using the whole sample does not mean revisions
are forecastable in real time.
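A minimal sketch of such a simulated out-of-sample exercise, on synthetic data in which revisions are predictable by construction, so the regression-based forecast should beat the "no revision" benchmark (all values invented):

```python
import random

def ols(x, y):
    """Intercept and slope of a simple least-squares regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

random.seed(0)
initial = [random.gauss(3.0, 1.0) for _ in range(120)]
# Predictable ("noise") revisions: a function of the initial release.
revision = [0.5 - 0.2 * i0 + random.gauss(0, 0.1) for i0 in initial]

errs_model, errs_naive = [], []
for s in range(60, 120):                   # expanding estimation window
    a, b = ols(initial[:s], revision[:s])  # fit only on data known at s
    errs_model.append((a + b * initial[s]) - revision[s])
    errs_naive.append(0.0 - revision[s])   # naive: expect no revision

def rmse(errors):
    return (sum(e * e for e in errors) / len(errors)) ** 0.5
```

Comparing `rmse(errs_model)` with `rmse(errs_naive)` is the real-time test: only if the regression forecast wins out of sample can the revisions be called forecastable.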
The proper procedure to determine whether revisions are forecastable is described in Croushore (2008b) regarding forecasting revisions to core PCE inflation. Suppose you think the initial release of data is not a good forecast of data to be released in the annual July revision of the national income and product accounts. Specifically, suppose you are standing in the second quarter of 1985 and have just received the initial release of the PCE inflation rate for 1985:Q1. You need to run a regression using as the dependent variable all the data on revisions from the initial release through the government's annual release in the current period, so the sample period is 1965:Q3–1983:Q4. So, you regress the revisions on the initial release for each date and a constant term:

(1) Revision(t) = α + β·initial(t) + ε(t).

Next, use the estimates of α and β to make a forecast of the annual revision that will occur in 1986:

r̂(1985:Q1) = α̂ + β̂·initial(1985:Q1).

Repeat this procedure for releases from 1985:Q2 to 2006:Q4. Finally, forecast the value of the annual revision for each date from 1985:Q1 to 2006:Q4 based on the formula

(2) Â(t) = initial(t) + r̂(t).

At the end of this process, examine the root mean squared forecast errors (RMSEs) as follows: Take the annual release value as the realization and compare the RMSE of the forecast of that value (given by equation (2)) with the RMSE of the forecast of that value assuming that the initial release is an optimal forecast. In such a case, the results show that it is possible to forecast the annual revision. Indeed, had the Federal Reserve used this procedure, it would have forecast an upward revision to core PCE inflation in 2002 and might not have worried so much about the unwelcome fall in inflation that was a major concern in this period. However, following such a method does not appear to work for U.S. real GDP. Cunningham et al. (2007) found that it worked for U.K. real GDP, but Anderson and Gascon's attempt to use it for U.S. real GDP is less likely to be fruitful.

This is not to say that the initial release of U.S. real GDP data is an optimal forecast of the latest data, only that no one has successfully forecasted the revisions in the manner described above. You could argue that we should always assume that real GDP will be revised upward because the statistical agencies will always fall behind innovative processes, so GDP will be higher than initially reported. But the major reasons for upward revisions to GDP in the past include the reclassification of government spending on capital goods as investment, the change in the treatment of business software, and similar innovations that raised the entire level of real GDP. Whether similar upward revisions will occur in the future is uncertain.

THE STRUCTURE OF REAL-TIME DATA

Researchers of real-time data begin by developing a vintage matrix, consisting of the data as
reported by the government statistical agency at
various dates. An example is given in Table 1.
In the vintage matrix, each column represents
a vintage, that is, the date on which a data series
is published. For example, the first column reports
the dates from 1947:Q1 to 1965:Q3 for data that
would have been observable in November 1965.
Each row in the matrix represents an activity date,
that is, the date for which economic activity is
measured. For example, the first row shows various measures for 1947:Q1. Moving across rows
shows how data for a particular activity date are
revised over time. The main diagonal of the matrix
shows initial releases of the data for each activity
date, which moves across vintages. Huge jumps
in numbers indicate benchmark revisions with
base-year changes. For example, in the first row,
for 1947:Q1 the value rises from 306.4 in early
vintages to 1570.5 in the most recent vintages.
Until about 1999, researchers studying monetary policy or forecasters building models ignored
the vintage matrix and simply used the last column of the matrix available at the time—the latest
data. If data revisions are small and white noise,
this is a reasonable procedure. But in 1999, the
Federal Reserve Bank of Philadelphia put together
a large real-time dataset for macroeconomists,
and it became possible for researchers and forecasters to use the entire vintage matrix (see Croushore and Stark, 2001). Subsequent work at the Federal Reserve Bank of St. Louis expanded the Philadelphia Fed's work to create the vintage matrix for a much larger set of variables. The availability of such data has allowed researchers of real-time data to study data revisions and how they affect monetary policy and forecasting. The data revisions turn out to be neither small nor white noise, so accounting for data revisions is paramount.

Table 1
The Vintage Matrix: Real Output

Activity date | Vintage (v): 11/65 | 02/66 | 05/66 | … | 11/07 | 02/08
1947:Q1 | 306.4 | 306.4 | 306.4 | … | 1,570.5 | 1,570.5
1947:Q2 | 309.0 | 309.0 | 309.0 | … | 1,568.7 | 1,568.7
1947:Q3 | 309.6 | 309.6 | 309.6 | … | 1,568.0 | 1,568.0
⯗
1965:Q3 | 609.1 | 613.0 | 613.0 | … | 3,214.1 | 3,214.1
1965:Q4 | NA | 621.7 | 624.4 | … | 3,291.8 | 3,291.8
1966:Q1 | NA | NA | 633.8 | … | 3,372.3 | 3,372.3
⯗
2007:Q1 | NA | NA | NA | … | 11,412.6 | 11,412.6
2007:Q2 | NA | NA | NA | … | 11,520.1 | 11,520.1
2007:Q3 | NA | NA | NA | … | 11,630.7 | 11,658.9
2007:Q4 | NA | NA | NA | … | NA | 11,677.4

SOURCE: Federal Reserve Bank of Philadelphia Real-Time Dataset for Macroeconomists (RTDSM; series ID: ROUTPUT).
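In code, a vintage matrix of this kind reduces to a table keyed by activity date and vintage. The sketch below uses a few values from Table 1 and shows how the initial-release "diagonal" and the latest-available estimates are read off:

```python
# Rows are activity dates, columns are vintages; None marks "not yet
# published." Values are a few entries from Table 1.
matrix = {
    "1965:Q3": {"11/65": 609.1, "02/66": 613.0, "05/66": 613.0},
    "1965:Q4": {"11/65": None,  "02/66": 621.7, "05/66": 624.4},
    "1966:Q1": {"11/65": None,  "02/66": None,  "05/66": 633.8},
}
vintages = ["11/65", "02/66", "05/66"]      # left-to-right = older to newer

def initial_release(row):
    """First published value for an activity date: the matrix 'diagonal'."""
    return next(row[v] for v in vintages if row[v] is not None)

def latest(row):
    """Most recent available estimate: the last vintage with data."""
    return next(row[v] for v in reversed(vintages) if row[v] is not None)

first = {t: initial_release(r) for t, r in matrix.items()}
last = {t: latest(r) for t, r in matrix.items()}
```

Using only `last` reproduces the pre-1999 practice of working with the final column; the full `matrix` is what real-time analysis requires.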
Researchers of real-time data have explored
a number of ways to study what happens in the
vintage matrix. One of the main distinctions in
the literature that is crucial to econometric evaluation of data revisions is the distinction between
“news” and “noise.” Data revisions contain news
if the initial release of the data is an optimal forecast of the later data. If so, then data revisions
are not predictable. On the other hand, if data
revisions reduce noise, then each data release
equals the truth plus a measurement error; but
because the data release is not an optimal forecast,
it is predictable.
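The news/noise distinction can be illustrated with a simple correlation diagnostic: under news, the revision is uncorrelated with the initial release; under noise, it is correlated. A minimal sketch on synthetic data (this is an illustration of the idea, not the full Mankiw-Shapiro test battery):

```python
import random

def corr(x, y):
    """Sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(1)

# News: the initial release is an optimal forecast; the revision adds
# information independent of the release itself.
init_news = [random.gauss(3.0, 1.0) for _ in range(500)]
final_news = [i0 + random.gauss(0, 0.5) for i0 in init_news]
rev_news = [f - i0 for f, i0 in zip(final_news, init_news)]

# Noise: the release equals truth plus measurement error, so the revision
# (which removes the error) is correlated with the release.
final_noise = [random.gauss(3.0, 1.0) for _ in range(500)]
init_noise = [f + random.gauss(0, 0.5) for f in final_noise]
rev_noise = [f - i0 for f, i0 in zip(final_noise, init_noise)]
```

Here `corr(rev_news, init_news)` is near zero while `corr(rev_noise, init_noise)` is clearly negative, which is the predictability that makes noise-type revisions forecastable.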

Empirical findings concerning news and noise
are mixed. Money-supply data contain noise,
according to Mankiw, Runkle, and Shapiro (1984),
but GDP releases represent news, according to
Mankiw and Shapiro (1986). Different releases
of the same variable can vary in their news and
noise content, as Mork (1987) found. For U.K. data,
releases of most components of GDP contain
noise, according to Patterson and Heravi (1991).
The distinction between news and noise is vital
to some state-space models, such as the one
developed by Jacobs and van Norden (2007).
Anderson and Gascon ignore the distinction
between news and noise because they develop a
new and unique way to slice up the vintage
matrix. Rather than focus on the vintage date,
their analysis is a function of the “maturity” of
data—that is, how long a piece of data for a given
activity date has matured. They then track that
piece of data over a length of time that they call
the “revision horizon,” which they can vary to
discover different properties in the data of the
revisions. This is a clever procedure and has the
potential to lead to interesting results.

Figure 4
Stark Plot: December 1985–November 1991
[Line chart: demeaned log difference (−0.08 to 0.08) by activity date, 1947-2002.]
SOURCE: Author’s calculations using data from the Federal Reserve Bank of Philadelphia RTDSM (series ID: ROUTPUT).

The statistical model used by Anderson and
Gascon is based on the following equation:

y_t^j = y_t + c^j + v_t^j.
A measured piece of data of some maturity j for
activity date t is equal to the true value of the
variable at activity date t, plus a bias term that is
a function of maturity (but not vintage or activity
date), plus a measurement error that is a function
of both maturity and the activity date. This is the
same method used by Cunningham et al. (2007).
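A minimal sketch of this measurement equation on simulated data, estimating the maturity-specific bias c^j by comparing each maturity's releases with fully mature values; all parameter values are invented, and this simple panel average stands in for the full state-space estimator:

```python
import random

random.seed(2)
T, J = 200, 4
c = [0.30, 0.10, 0.05, 0.0]    # maturity-specific biases; fully mature: 0
y_true = [random.gauss(3.0, 1.0) for _ in range(T)]

# y[t][j] = y_true[t] + c[j] + v[t][j], with v an idiosyncratic error.
y = [[y_true[t] + c[j] + random.gauss(0, 0.2) for j in range(J)]
     for t in range(T)]

# Treat the oldest maturity as "true" and estimate each bias from the panel.
c_hat = [sum(y[t][j] - y[t][J - 1] for t in range(T)) / T for j in range(J)]

# Bias-corrected first releases:
adjusted = [y[t][0] - c_hat[0] for t in range(T)]
```

Because c^j depends only on maturity, pooling across activity dates identifies it; the remaining scatter in `adjusted` reflects the measurement error v.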

The Problem of Benchmark Revisions
Unfortunately, the Anderson and Gascon
method may not work well if there are large and
significant benchmark revisions to the data,
because then the relationships in question would
be a function of not only the activity date and
maturity, but also a function of vintage, because
benchmark revisions hit only one vintage of data
every five years or so. But when they do hit, they
affect the values of a different maturity for every
activity date. So, if benchmark revisions are significant, then the Anderson and Gascon procedure could face problems.
Are benchmark revisions significant? I like to
investigate the size of benchmark revisions using
Stark plots, which I named after my frequent
coauthor Tom Stark, who invented the plot (see
Croushore and Stark, 2001). Let X(t,s) represent the level of a variable for activity date t as recorded in vintage s, and consider two vintages a and b, where vintage b is farther to the right in the vintage matrix and thus later in time than vintage a. Let m be the mean of log[X(τ,b)/X(τ,a)] over all the activity dates τ that are common to both vintages. The Stark plot is a plot of log[X(t,b)/X(t,a)] – m. Such a plot would be a flat line if the new vintage were just a scaled-up version of the old one, that is, if X(t,b) = λX(t,a).
If the plot shows an upward trend, then later data
have more upward revisions than earlier data.
Spikes in the plot show idiosyncratic data revisions. More important to analysis of data revisions
would be any persistent deviation of the Stark
plot from the zero line, which would imply a correlation of revisions arising from the benchmark revision.

[Figure 5. Stark Plot: December 1995–October 1999, Chain Weighting; Government Purchases Reclassified as Investment. Demeaned log difference, activity dates 1947–2002. SOURCE: Author's calculations using data from the Federal Reserve Bank of Philadelphia RTDSM (series ID: ROUTPUT).]

[Figure 6. Stark Plot: October 1999–November 2003; Software Reclassified as Investment. Demeaned log difference, activity dates 1947–2002. SOURCE: Author's calculations using data from the Federal Reserve Bank of Philadelphia RTDSM (series ID: ROUTPUT).]

[Figure 7. Real Consumption Growth, 1973:Q2. Percent, plotted by vintage, 1973–2006. SOURCE: Author's calculations using data from the Federal Reserve Bank of Philadelphia RTDSM (series ID: RCON).]
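The Stark plot computation just described can be sketched directly. This is a minimal illustration under my own naming, not the author's code:

```python
import numpy as np

def stark_plot_values(x_a, x_b):
    """Demeaned log difference between two vintages of a series.

    x_a, x_b: levels for the activity dates common to both vintages,
    with x_b the later vintage. Returns log(x_b/x_a) minus its mean;
    a flat zero line means the later vintage is just a scaled-up
    version of the earlier one."""
    d = np.log(np.asarray(x_b, dtype=float) / np.asarray(x_a, dtype=float))
    return d - d.mean()
```

A pure rescaling of the old vintage yields a flat line at zero, as the definition implies.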
In Figures 4, 5, and 6, we examine Stark
plots that span particular benchmark revisions.
Figure 4 shows how vintage data were revised
from December 1985 to November 1991 for activity dates from 1947:Q1 to 1985:Q3. The data early
in the sample period show upward revisions and
those later in the sample period show downward
revisions. There is a clear pattern in the data,
which is mainly driven by the benchmark revision
to the data that was released in late December
1985 (the December 1985 vintage date corresponds to the data as it existed in the middle of
the month).
Figure 5 shows the revisions from December
1995 to October 1999, illustrating the impact of
the benchmark revision of January 1996, which
introduced chain weighting and reclassified government investment expenditures, previously treated as a current expense, as investment subject to depreciation. The impact is very large,
with data early in the sample showing downward
revisions relative to data later in the sample.
Figure 6 illustrates the impact of the November
1999 benchmark revision, in which business software was reclassified as investment; we look at
the changes from the October 1999 vintage to the
November 2003 vintage. The nonlinear Stark plot
suggests little change in growth rates in the early
part of the sample, but increasing growth rates
later in the sample. The impact of these changes
in the benchmarks is considerable. There is clearly
a significant change in the entire trajectory of the
variable over time, which should be accounted
for in any empirical investigation of the variable.
Do revisions ever settle down and stop occurring? In principle, they do under chain weighting,
except for redefinitions that occur in benchmark
revisions. For example, Figure 7 shows the growth
rate of real consumption spending for activity
date 1973:Q2. It has been revised by several percentage points over time and changed significantly
as recently as 2003, some 30 years after the activity date. Thus, we cannot be confident that data
are ever truly final and that there will never be a
significant future revision.
[Figure 8. Mean Revision, Initial to Latest: mean revision of initial-release real output growth computed against the latest-available data of each vintage date, 1980–2006. SOURCE: Author's calculations using data from the Federal Reserve Bank of Philadelphia RTDSM (series ID: ROUTPUT).]

One key idea in the Anderson and Gascon
article is to exploit the apparent bias in initial
releases of the data. Unfortunately the bias seems
to jump at benchmarks, as the Stark plots suggest.
To see the jumps more clearly, Figure 8 plots what
one would have thought the bias was at different
vintage dates for real output growth. That is, it calculates the mean revision from the initial release
to the latest available data, where for the sample
of data from 1965:Q3 to 1975:Q3 the vintages of
the latest available data are from 1980:Q3 to
2007:Q2. If we were standing in 1980:Q3, Figure
8 indicates we would have thought that the bias
in the initial release of real output growth was 0.28
percentage points. But someone observing the
data in the period from 1980:Q4 to 1982:Q1 would
have thought it was 0.45 percentage points. And
the apparent bias keeps changing over time, ending in 2007:Q2 at 0.62 percentage points. So, the
bias changes depending on the date when you
measure the bias. The same is true if you allow
the sample period to change, rather than focusing
on just one sample period as we did in Figure 8.
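The Figure 8 exercise amounts to recomputing the mean revision against each successive "latest available" vintage. A minimal sketch, with names of my own choosing:

```python
import numpy as np

def apparent_bias_by_vintage(initial, latest_by_vintage):
    """Mean revision of initial releases measured against each
    candidate 'latest available' vintage in turn. The resulting list
    shows how the apparent bias of the initial release drifts as
    later vintages arrive."""
    initial = np.asarray(initial, dtype=float)
    return [float(np.mean(np.asarray(latest, dtype=float) - initial))
            for latest in latest_by_vintage]
```

Each benchmark revision shifts the "truth" being compared against, so the same initial releases can imply a different bias at every vintage date.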
The Stark plots provide important information
for researchers—that the bias is a function of the
benchmark dates, not just maturity. Thus, equation (3) in Anderson and Gascon,

$c^j = c^1 (1 + \lambda)^{j-1}$,

which treats the bias solely as maturity dependent,
is not likely to work across benchmark revisions.
The other key assumption that Anderson and
Gascon use in their empirical framework is that
the measurement error follows an autoregressive
(AR(q)) process. The Stark plots suggest that such an assumption is not well justified, because the process at different benchmark revisions is much more complicated than any AR(q) process can capture.

WHERE NEXT?
Given the issues identified here, how should
the authors proceed with their research? I offer
five suggestions. First, they should compare their
results on the potential output series generated by their method with those from a benchmark, such as some of the series generated in Orphanides and
van Norden (2002). Second, they should examine
forecasts of revisions that can be generated by the
model to see if they match up reasonably well
with actual revisions. Third, they should see how
stable their model is when it encounters a benchmark revision. That is, if the model were used in
real time to generate a series for potential output
and then suddenly hit a benchmark revision, what
would that do to the potential output series?
Fourth, they should attempt to reconcile the
Stark plots with their assumptions about the
data to see how much damage such assumptions
might do. Finally, because they have ignored
the distinction between news and noise, they
might want to consider the impact the results of
Jacobs and van Norden (2007) would have on
their empirical model.

CONCLUSION

The research by Anderson and Gascon is an interesting and potentially valuable contribution to estimating potential output. However, practical issues, in particular the existence of benchmark revisions, may derail it. It may be that no new empirical method can handle revisions and produce better estimates of potential output in real time than current methods. If so, then we may have to conclude that potential output cannot be measured accurately enough in real time to be of any value for policymakers.

REFERENCES

Anderson, Richard and Gascon, Charles. "Estimating U.S. Output Growth with Vintage Data in a State-Space Framework." Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 349-69.

Conrad, William and Corrado, Carol. "Application of the Kalman Filter to Revisions in Monthly Retail Sales Estimates." Journal of Economic Dynamics and Control, May 1979, 1(2), pp. 177-98.

Croushore, Dean. "Frontiers of Real-Time Data Analysis." Working Paper No. 08-4, Federal Reserve Bank of Philadelphia, March 2008a; www.philadelphiafed.org/research-and-data/publications/working-papers//2008/wp08-4.pdf.

Croushore, Dean. "Revisions to PCE Inflation Measures: Implications for Monetary Policy." Working Paper No. 08-8, Federal Reserve Bank of Philadelphia, March 2008b.

Croushore, Dean and Stark, Thomas. "A Real-Time Data Set for Macroeconomists." Journal of Econometrics, November 2001, 105(1), pp. 111-30.

Cunningham, Alastair; Eklund, Jana; Jeffery, Christopher; Kapetanios, George and Labhard, Vincent. "A State Space Approach to Extracting the Signal from Uncertain Data." Working Paper No. 336, Bank of England, November 2007.

Faust, Jon; Rogers, John H. and Wright, Jonathan H. "News and Noise in G-7 GDP Announcements." Journal of Money, Credit, and Banking, June 2005, 37(3), pp. 403-19.

Guerrero, Victor M. "Combining Historical and Preliminary Information to Obtain Timely Time Series Data." International Journal of Forecasting, December 1993, 9(4), pp. 477-85.

Jacobs, Jan P.A.M. and van Norden, Simon. "Modeling Data Revisions: Measurement Error and Dynamics of 'True' Values." Working paper, HEC Montreal, June 2007.

Mankiw, N. Gregory; Runkle, David E. and Shapiro, Matthew D. "Are Preliminary Announcements of the Money Stock Rational Forecasts?" Journal of Monetary Economics, July 1984, 14(1), pp. 15-27.

Mankiw, N. Gregory and Shapiro, Matthew D. "News or Noise? An Analysis of GNP Revisions." Survey of Current Business, May 1986, 66(5), pp. 20-25.

Mork, Knut Anton. "Ain't Behavin': Forecast Errors and Measurement Errors in Early GNP Estimates." Journal of Business and Economic Statistics, April 1987, 5(2), pp. 165-75.

Orphanides, Athanasios and van Norden, Simon. "The Unreliability of Output Gap Estimates in Real Time." Review of Economics and Statistics, November 2002, 84(4), pp. 569-83.

Patterson, K.D. and Heravi, S.M. "Data Revisions and the Expenditure Components of GDP." Economic Journal, July 1991, 101(407), pp. 887-901.

Panel Discussion

The Role of Potential Output
Growth in Monetary
Policymaking in Brazil
Carlos Hamilton Araujo

Potential output is important in policymaking for a number of reasons:

• It is a key variable in most macroeconomic
models because it enables construction of
measures of the output gap. These measures
are often used in the IS and Phillips curves
and the Taylor rule, among others.

• It provides a measure of economic slack
(i.e., its cyclical position).
• It helps to gauge future inflation pressures.
• It is important for estimating cyclically
adjusted variables (e.g., structural fiscal
deficit).
However, potential output is difficult to handle. As a latent variable, it is hard to measure in
any circumstance, and frequent data revisions
worsen the accuracy of any estimation. For example, in Brazil, the available time series has a short
data span and the methodology for calculating
gross domestic product (GDP) has changed frequently. Another potential problem is that geographic data might also be inadequate (e.g., the
unemployment rate).
The Central Bank of Brazil uses a variety of
statistical methods to measure potential output.

The most common are statistical filters, including
the Hodrick-Prescott filter, band-pass filters,
Kalman filters, and Beveridge-Nelson decompositions. These methods are not based on economic
theory or models, and each has its idiosyncrasies—
sometimes with opposite identifying assumptions.
As a general rule, the rationale is the same for all:
to decompose the GDP time series into a permanent component and a transitory, cyclical component to measure the output gap. It is a shortcoming
of these measures that they do not consider information other than GDP itself. They often behave
like moving averages and, hence, perform poorly
when the original GDP series faces large and sudden changes. In addition, the resulting filtered
time series is frequently judged too volatile relative to the prior beliefs of senior policymakers.
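As an illustration of the statistical-filter idea, here is a minimal Hodrick-Prescott filter. This is my own sketch of the standard algorithm, not the Bank's implementation; note that, as the text observes, it uses no information other than the GDP series itself.

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott filter: the trend solves
    (I + lam * K'K) * trend = y, where K is the second-difference
    operator and lam = 1600 is the conventional quarterly smoothing
    parameter. Returns (trend, cycle)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    K = np.zeros((n - 2, n))
    for i in range(n - 2):
        K[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lam * K.T @ K, y)
    return trend, y - trend
```

A series that is already a straight line has zero second differences, so the filter returns it unchanged and the cycle is identically zero.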
We also use macroeconomic methods, including Cobb-Douglas production functions, structural
vector autoregressions, dynamic stochastic general
equilibrium models, and other macro models. To
some extent, these are based on economic theory
and may impose quite strong restrictions on the
data. In addition, estimates are model dependent,
which often leads to disagreement regarding the
“true” model; furthermore, estimates are sensitive
to model specification error. Given these restrictions, these models might be more difficult to
estimate than with the previous statistical methods
and may ignore key determinants of potential
output.
Now, consider the simplest production function approach:

$Y_t = A_t K_t^{\alpha} L_t^{1-\alpha}$.

Carlos Hamilton Araujo is the head of the research department at the Central Bank of Brazil.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 383-85.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, the regional Federal Reserve Banks, or the Central Bank of Brazil. Articles may
be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation
are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank
of St. Louis.

[Figure 1. Unemployment Rate and Capacity Utilization: unemployment rate (left scale, percent) and industrial capacity utilization (right scale, percent).]

This approach is based on widely accepted economic theory and explicitly specifies the sources
of economic growth. An alternative specification
that we find useful is the following:

Gapt = α log (cut ) − ln ( NAIRCU t )
+ (1 − α ) log (1 − unt ) − ln (1 − NAIRU t ) ,
where NAIRCU is the natural rate of capacity
utilization and NAIRU is the natural unemployment rate. The unemployment rate and capacity
utilization rate for Brazil are shown in Figure 1.
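The weighted-gap formula above translates directly into code. A minimal sketch with hypothetical parameter values (the rates below are illustrative, not Brazilian data):

```python
import math

def output_gap(cu, naircu, un, nairu, alpha):
    """Output gap as a weighted combination of the capacity
    utilization gap and the (employment-rate) unemployment gap,
    following the formula in the text. All rates are fractions,
    e.g., 0.82 for 82 percent."""
    return (alpha * (math.log(cu) - math.log(naircu))
            + (1.0 - alpha) * (math.log(1.0 - un) - math.log(1.0 - nairu)))
```

When utilization sits at its natural rate and unemployment at the NAIRU, the gap is exactly zero; utilization above NAIRCU or unemployment below NAIRU pushes it positive.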
Regardless of the adopted measure, potential
output estimates are always uncertain. In this
sense, the Bank relies on additional economic
variables as a cross-check of economic activity;
these variables include unemployment, capacity
utilization, industrial production, retail sales,
wage growth, and surveys of corporate confidence.
Thus, various potential output measures are
compared by computer simulations, focused on
using Phillips curves to forecast inflation and on
comparisons with predictions from Okun’s law.
The relationship between output gap estimates
and potential inflationary pressures is of utmost
importance to the Monetary Policy Committee.
Yet, in my view, indicators of inflation expectations are more important drivers of policy decisions than the output gap.
Potential output and capital growth are essential elements of capital deepening and output
growth. Both have shown significant acceleration
in recent years. Although explanations for the
acceleration are uncertain, possible reasons
include increased macroeconomic stability due
to a new political environment (more favorable
to planning); strong inflows of foreign capital in
the form of foreign direct investment, bringing
with it new technology; exchange rate appreciation, which sharply reduced the cost of imported
capital goods; and the culmination of educational
improvements, resulting in a higher-quality labor
force.
Looking forward, we anticipate a slowing of
economic growth. It is likely that slower GDP
growth will adversely affect potential output
through similar channels: a reduction in foreign
direct investment inflows; less-accommodative
credit conditions, both in the domestic and international markets; and exchange rate depreciation
that will increase the cost of imported capital
goods.
In a nutshell: Potential output is regarded as
a key indicator for assessing the slack in the economy and gauging the buildup of inflationary
pressures. Because it is not observable, potential

output estimates are imprecise and worsened by
short and volatile time series. At the Central Bank
of Brazil, we use many purely statistical and
structural methods to assess potential output.
Evidence and experience favor structural methods. We seek to mitigate the related uncertainties
by using several methods, as well as other excess
demand indicators. In policymaking, the Bank
places a larger weight on inflation expectations,
in addition to its estimates of the output gap.

The Role of Potential Growth
in Policymaking

Nowadays, it is common to use a specific
economic model to estimate potential output, its
growth rate, and the output gap. Here is a simple
example. Consider an economy with perfect competition and a Cobb-Douglas production function,
$Y_t = A_t K_t^{\alpha} N_t^{1-\alpha}$. The log-linearization of this production function is

Seppo Honkapohja

DEFINITIONS OF POTENTIAL
OUTPUT AND POTENTIAL
GROWTH

Currently, differing concepts of potential output and potential growth are used in
output and potential growth are used in
both academic research and policy discussions. Traditionally, potential output and
potential growth are measures of the average productive capacity of an economy and its change
over time. Correspondingly, the output gap is
the deviation of actual output from its potential
value, that is, from average output. If potential
output is viewed as (in some sense) average output, then potential output is naturally measured
by fitting a statistical trend on the path of output
over time. John Taylor’s (1993) seminal paper on
estimated interest rate rules used such a traditional measure for the output gap. Alternatively,
potential growth might be measured by fitting
trends to paths of factor supplies and using these
in an estimated production function.

(1)    $y_t = \alpha k_t + (1 - \alpha) n_t + a_t$,

where lower-case letters denote logarithms of output, capital, labor input, and total factor productivity (TFP). The log of TFP, $a_t$, evolves exogenously, while the actual values of $y_t$ and $n_t$ are determined as part of the competitive equilibrium.
The difficulty is that TFP cannot be directly
observed but must be obtained as a residual from
equation (1), using an estimate or calibrated value
for α.1 With such an estimate, we obtain potential output, $y_t^p$, as

$y_t^p = \bar{\alpha}_t k_t + (1 - \bar{\alpha}_t) n_t + \bar{a}_t$,

where $\bar{a}_t$ and $\bar{\alpha}_t$ are estimates of TFP and the parameter α, respectively, for period t. Although model-based, this calculation often produces measures
1 As is well known, there are also more sophisticated ways to estimate TFP. For example, see Chambers (1988, Chap. 6) for an introductory discussion.

Seppo Honkapohja is a member of the Board of the Bank of Finland and a research fellow at the Centre for Economic Policy Research.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 385-89.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, the regional Federal Reserve Banks, or the Bank of Finland. Articles may be
reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation
are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank
of St. Louis.

close to the traditional statistical measures. In this case, the output gap is $y_t - y_t^p$.
In the preceding formulation, $\bar{a}_t$ and $\bar{\alpha}_t$ may be ex post or real-time estimates of TFP and α,
respectively. In practice, there are short- to
medium-term policy concerns that require real-time measurement of the output gap. A possible
policy objective can be to smooth fluctuations in
aggregate output. Of course, in a competitive
economy without distortions, there is no reason
to offset the random fluctuations in yt, as the equilibrium is Pareto efficient. If there are distortions,
one might have some interest in measuring the
real-time output gap, but this depends on the
nature of the distortions and whether they vary
cyclically. Naturally, measurement of potential
output and potential growth is also important for
setting growth policy; for example, the so-called
Lisbon Agenda was devised to address the sluggish growth of most Western European Union
countries. Such issues are long-run policy concerns and the background studies are based on,
for example, growth accounting methodologies
with ex post estimates for parameters. In such
cases, there is no urgency to obtain real-time
measurement for potential output.
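The residual calculation from equation (1) can be sketched in one line. This is an illustrative helper of my own, assuming logs and a calibrated α:

```python
def tfp_residual(y, k, n, alpha):
    """Back out log TFP as the residual of the log-linearized
    production function in equation (1):
    a_t = y_t - alpha * k_t - (1 - alpha) * n_t,
    where y, k, n are logs of output, capital, and labor."""
    return y - alpha * k - (1.0 - alpha) * n
```

By construction, feeding in output built from known factor inputs and a known TFP level recovers that TFP level exactly.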

$c_t = E_t c_{t+1} - \sigma^{-1}(i_t - E_t \pi_{t+1} - \rho)$,

where $\rho = -\log \beta$, β is the subjective discount factor of the economy, and σ is a utility function parameter. In equilibrium,

$C_t = Y_t$
and, therefore, we obtain the dynamic IS curve,

(2)    $y_t = E_t y_{t+1} - \sigma^{-1}(i_t - E_t \pi_{t+1} - \rho)$.

Here lower-case variables denote log-deviations
from the steady state. Equation (2) indicates that
aggregate output in the economy depends positively on expectations of next-period output and
negatively on the real rate of interest, where the
latter is defined in terms of expected next-period
inflation.
The dynamics of inflation are described by
an aggregate supply curve, also called the NK
Phillips curve:

$\pi_t = \beta E_t \pi_{t+1} + \lambda (mc_t - mc)$,
where inflation (as a deviation from the steady
state) depends on expected inflation and the
deviation of marginal cost from its steady-state
value. Here λ is a function of several structural
parameters. It can be shown that

STICKY PRICES
Though some disagreements exist, it is now
a common view that the perfectly competitive,
flexible-price model is not relevant for short- to
medium-run policymaking. The current workhorse for monetary policy analysis is the New
Keynesian (NK) model, which differs from the
perfect-competition model in two crucial respects:
In it the economy is imperfectly competitive and
displays nominal price and/or wage rigidity.2
We modify the model outlined above by
introducing differentiated goods and imperfect
competition. Log-linearized optimal consumption
behavior as a log-deviation from the steady state
is described by the Euler equation,
2 There are several good expositions of the NK model. The formal details below are based on the excellent exposition of the NK model in Galí (2008).

$mc_t = w_t - p_t - \frac{1}{1-\alpha}(a_t - \alpha y_t) - \log(1 - \alpha)$.

It is possible to write $mc_t - mc$ in terms of a
new measure of the output gap:

y t = y t − y tn , where
y tn =

(1 − α )( µ − log (1 − α ))
1+ψ
.
at −
σ (1 − α ) + ψ + α
σ (1 − α ) + ψ + α

Here $y_t^n$ is the natural level of output, that is, aggregate output at the flexible-price (but monopolistically competitive) level.3 Note that $y_t^n < y_t^{CE}$ because of imperfect competition. Note also that the natural level of output is different from potential output of the economy.
3 ψ is a utility function parameter, whereas μ is the log of the steady-state markup.
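The natural-output formula can be checked numerically. A sketch with hypothetical parameter values of my own choosing, not a calibration from the text:

```python
import math

def natural_output(a, alpha, sigma, psi, mu):
    """Flexible-price (natural) level of output implied by the NK
    formula in the text: a slope on log TFP a_t minus a constant that
    depends on the steady-state log markup mu."""
    denom = sigma * (1.0 - alpha) + psi + alpha
    slope = (1.0 + psi) / denom
    const = (1.0 - alpha) * (mu - math.log(1.0 - alpha)) / denom
    return slope * a - const
```

In the special case α = 0, σ = 1, ψ = 0, μ = 0 (no markup), both the slope and the constant simplify so that natural output equals log TFP, a useful sanity check on the algebra.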

Using the output gap, the inflation equation
can be written

π t = β Et π t +1 + κ y t ,
which implies that from a business cycle viewpoint the output gap, measured as just explained,
is the relevant concept for monetary policy analysis. The dynamic IS curve can also be written in
terms of the output gap as

$\tilde{y}_t = E_t \tilde{y}_{t+1} - \sigma^{-1}(i_t - E_t \pi_{t+1} - r_t^n)$, where

$r_t^n = \rho + \sigma \frac{1+\psi}{\sigma(1-\alpha)+\psi+\alpha}\, E_t(\Delta a_{t+1})$.

Here $r_t^n$ is the natural rate of interest.
To summarize, monetary policy analysis uses two different concepts of the output gap. The traditional concept of potential output and the output
gap are defined by the deviation from trend,
whereas the recent model-based notion of the output gap is defined as the difference between actual
output and the flexible price level of output. The
two concepts are different and can behave in different ways, as vividly illustrated by Edge, Kiley,
and Laforte (2008, Figure 1) and also studied by,
for example, Andres, Lopez-Salido, and Nelson
(2005) and Justiniano and Primiceri (2008).

NOISY DATA
The NK model outlined above suggests that,
in theory, the output gap measure $\tilde{y}_t$, derived in
the NK model (or analogously in dynamic stochastic general equilibrium models), is the appropriate
measure of potential output for monetary policy
analysis. It should be emphasized that this view
holds only in theory for several reasons. First,
any model-based output-gap measure is model
dependent and thus capable of generating misleading recommendations. One should always
examine the robustness of conclusions based on
a specific model and the corresponding measure.
Second, how a measure will be used should
be considered before deciding which model to use.
The output gap measure based on the NK model
is intended for analysis of inflation control and
often does not measure well the economy’s deviations from its long-term productive capacity.
Third, even if one opts for the measure based
on the NK model, assumptions about the availability of output gap information are very strong
in the standard analysis of monetary policy in
dynamic stochastic general equilibrium models.
Policies that perform well under the usual rational
expectations (RE) assumption, for example, often
do not perform well if the measurement of the
output gap and other variables contain significant
noise. Orphanides emphasizes this problem in a
number of papers (e.g., see Orphanides, 2003). In
particular, he states that naive optimal policies
derived under RE often do poorly if there are
noisy measurements of the true variables.4
In principle, optimal control policies that
take into account the measurement problem can
be calculated using Kalman filters; however, this
approach can be sensitive to measurement problems caused by imperfect knowledge. Neither the
“correct model” nor the data are, in practice, fully
known to the policymaker (further discussed
below). The use of well-performing simple rules
offers another approach to the problem of noisy
measurements. In some cases, though not optimal,
simple rules work better than naive optimal rules.
Such simple rules have the same functional form
as naive optimal rules but respond to noisy realtime data appropriately when the policy coefficients are chosen optimally.

OTHER ASPECTS OF IMPERFECT
KNOWLEDGE
Noisy data are just one aspect of knowledge
imperfections that policymakers face, and
although there are several others, I focus on learning effects—that is, that economic evolution in
the short to medium run can be significantly
influenced by learning effects from economic
agents trying to improve their knowledge. The
literature on learning and macroeconomics has
been widely researched in recent years, and monetary policy design has been shown to be affected by one's learning viewpoint.

4 In addition, data revisions are often significant and make it difficult to use model-based measures, so they will not be discussed further.
The basic ideas in learning are that (i) agents
and policymakers have imperfect knowledge, (ii)
expectations are based on existing knowledge and
updated over time using econometric techniques,
and (iii) expectations feed into decisions by agents
and hence to actual outcomes and future forecasts.
Learning dynamics converge to an RE equilibrium,
provided that the economy satisfies an expectational stability criterion. Good policy facilitates
convergence of learning.
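The updating step in (ii) can be illustrated with the simplest recursive scheme. This is a stylized sketch of my own, not a model from the text:

```python
def update_belief(b, x, gain):
    """One step of adaptive learning: the belief b moves toward the
    latest observation x in proportion to the gain. A decreasing gain
    (such as 1/t) mimics recursive least squares; a constant gain
    discounts old data and tracks structural change."""
    return b + gain * (x - b)
```

Repeated updates against a stable observation converge toward it geometrically, which is the intuition behind convergence of learning dynamics to an RE equilibrium.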
Basic learning models use fairly strong
assumptions: (i) Functional forms of agents’ forecasting models are correctly specified relative to
the RE equilibrium, (ii) agents accurately observe
relevant variables, and (iii) economic agents trust
their forecasting model. Most of these assumptions
have been weakened in the recent literature. Misspecification is certainly one concern because it
can inhibit convergence to an RE equilibrium and
create a restricted-perceptions equilibrium. However, the implications of this for policy design
are not further discussed here.
Noisy measurements have been incorporated
into some models of monetary policy that include
learning, most notably by Orphanides and
Williams (2007 and forthcoming). Basically, these
models show that the ideas discussed above still
hold. One can try to consider filtering and learning
together, but this is likely to be formally demanding and has not been studied. Alternatively, one
can use simple rules that work well. In particular,
the recent papers by Orphanides and Williams
(2007, forthcoming) suggest the use of rules that
do not rely on data subject to significant noise.
A specific measurement problem is agents’
private expectations. It has been shown that
expectations-based optimal rules would work
well for optimal monetary policy design. If there
are significant errors in measuring private-sector
expectations, one can try to develop proxies for
them. This is, in fact, typically done, perhaps
using survey data from either professional forecasters or consumer surveys. An alternative is
model-based proxies from a variety of sources,
including indexed and non-indexed bonds, swaps,
or information from purely statistical forecasting
models.
If agents do not trust their personal forecasting
model, then they may wish to allow for uncertainty
in their forecasting model and/or their behavioral
attitudes. If one allows for unspecified model
uncertainty in estimation, then robust estimation
methods can be used. In fact, a “maximally robust”
estimation leads to so-called constant-gain stochastic gradient algorithms, which have been studied
for learning in Evans, Honkapohja, and Williams
(forthcoming). Of course, literature on economic
behavior in the presence of unstructured model
uncertainty abounds (see Hansen and Sargent,
2007). In policy design, one can also incorporate
aspects of robust policy with respect to learning
by private agents. Usually, it is assumed that the
policymaker does not know the learning rules of
private agents,5 but considers as policy constraints
E-stability conditions for private-agent learning;
that is, recursive least squares learning is assumed.
One could make additional assumptions about
learning or even identify stability conditions that
are robust in some sense (see, e.g., Tetlow and
von zur Muehlen, 2009).
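To make the constant-gain stochastic gradient idea concrete, the sketch below tracks the coefficients of a simple linear PLM; the two-regressor setup, the gain of 0.02, and the noise scale are illustrative assumptions, not values taken from the papers cited above.

```python
import numpy as np

def sg_update(phi, x, y, gain=0.02):
    """One constant-gain stochastic-gradient step for the PLM y_t = phi'x_t + noise.
    Unlike recursive least squares, no second-moment matrix is carried along."""
    return phi + gain * x * (y - x @ phi)

# Illustrative data-generating process with coefficients the learner must track.
rng = np.random.default_rng(1)
true_phi = np.array([0.5, 0.9])
phi = np.zeros(2)
for _ in range(5000):
    x = np.array([1.0, rng.normal()])       # constant plus one stochastic regressor
    y = x @ true_phi + 0.1 * rng.normal()   # observed outcome
    phi = sg_update(phi, x, y)
```

Because the gain is constant rather than decreasing, the estimate never settles exactly on the truth; it hovers in a neighborhood of it, which is what makes the algorithm robust to occasional structural change.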

REFERENCES
Andres, Javier; Lopez-Salido, J. David and Nelson,
Edward. “Sticky-Price Models and the Natural Rate
Hypothesis.” Journal of Monetary Economics, July
2005, 52(5), pp. 1025-53.
Chambers, Robert G. Applied Production Analysis.
Cambridge: Cambridge University Press, 1988.
Edge, Rochelle M.; Kiley, Michael T. and Laforte,
Jean-Philippe. “Natural Rate Measures in an
Estimated DSGE Model of the U.S. Economy.”
Journal of Economic Dynamics and Control,
August 2008, 32(8), pp. 2512-35.
Evans, George W.; Honkapohja, Seppo and Williams,
Noah. “Generalized Stochastic Gradient Learning.”
International Economic Review (forthcoming).
5. A few papers consider policy optimization with respect to learning rules of private agents.

FEDERAL RESERVE BANK OF ST. LOUIS REVIEW

Panel Discussion

Galí, Jordi. Monetary Policy, Inflation, and the
Business Cycle. Princeton: Princeton University
Press, 2008.
Hansen, Lars Peter and Sargent, Thomas J. Robustness.
Princeton: Princeton University Press, 2007.
Justiniano, Alejandro and Primiceri, Giorgio E.
“Potential and Natural Output.” Unpublished
manuscript, Northwestern University, June 2008;
http://faculty.wcas.northwestern.edu/~gep575/
JPgap8_gt.pdf.
Orphanides, Athanasios. “Monetary Policy Evaluation
with Noisy Information.” Journal of Monetary
Economics, April 2003, 50(3), pp. 605-31.

Orphanides, Athanasios and Williams, John C. "Robust Monetary Policy with Imperfect Knowledge." Journal of Monetary Economics, July 2007, 54(5), pp. 1406-35.

Orphanides, Athanasios and Williams, John C. "Monetary Policy Mistakes and the Evolution of Inflation Expectations," in Michael D. Bordo and Athanasios Orphanides, eds., The Great Inflation. Chicago: University of Chicago Press (forthcoming).

Taylor, John B. "Discretion versus Policy Rules in Practice." Carnegie-Rochester Conference Series on Public Policy, December 1993, 39, pp. 195-214.

Tetlow, Robert and von zur Muehlen, Peter. "Robustifying Learnability." Journal of Economic Dynamics and Control, February 2009, 33(2), pp. 292-316.

The Role of Potential Output
in Policymaking*
James Bullard

Often, economists equate potential output
with the trend in real gross domestic
product (GDP) growth. My discussion
is focused on “proper” detrending of aggregate
data. I will emphasize the idea that theory is
needed to satisfactorily detrend data—explicit
theory that encompasses simultaneously both
longer-run growth and shorter-run fluctuations.
The point of view I wish to explore stresses that
both growth and fluctuations must be included in the same theoretical construct if data are to be properly detrended. Common atheoretic statistical methods are not acceptable. When detrending data, an economist should detrend by the theoretical growth path so as to correctly distinguish variation in output due to growth from variation due to cyclical fluctuations.

* This discussion is based on the panel discussion, "The Role of Potential Output in Policymaking," available at http://research.stlouisfed.org/econ/bullard/FallPolicyConferenceBullard16oct2008.pdf, and on Bullard and Duffy (2004).
The quest to fully integrate growth and cycle was Prescott's initial ambition; however, it is difficult to develop a model that can match the curvy, time-varying growth path often envisioned as describing an economy's long-run development. Instead, the Hodrick-Prescott (HP) filter was proposed to remove from the data a flexible time-varying trend (Hodrick and Prescott, 1980). My argument is that this procedure is unsatisfactory.
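For reference, the procedure being criticized is easy to state: the HP trend solves a penalized least-squares problem, and with the conventional smoothing parameter of 1600 for quarterly data it admits a one-line linear-algebra solution. A minimal sketch (dense matrices, so suitable only for short samples):

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Split y into trend and cycle. The trend tau minimizes
    sum (y_t - tau_t)^2 + lam * sum (second difference of tau)^2,
    i.e., it solves (I + lam * K'K) tau = y, where K is the
    (T-2) x T second-difference operator."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    K = np.zeros((T - 2, T))
    for t in range(T - 2):
        K[t, t:t + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(T) + lam * K.T @ K, y)
    return trend, y - trend
```

A perfectly log-linear series passes through untouched (zero cycle), and any series whatsoever is decomposed without asking what growth theory generated it; that atheoretic flexibility is precisely the target of the argument here.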
The idea in question is: How can we specify a model that will make the growth path look like the one we see in the data? My suggestion is that, as an initial approach, we use a mainstream core growth model augmented with occasional trend breaks and learning. Learning helps the model fit the data and has important implications for policy analysis. I will discuss some applications of this idea in Real Business Cycle (RBC) and New Keynesian (NK) models from Bullard and Duffy (2004) and Bullard and Eusepi (2005).

James Bullard is president of the Federal Reserve Bank of St. Louis.
Federal Reserve Bank of St. Louis Review, July/August 2009, 91(4), pp. 389-95.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the FOMC. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

MAIN IDEAS
The equilibrium business cycle literature
encompasses a wide class of models, including
RBC, NK, and multisector growth models. Various frictions can be introduced in all of these
approaches. Many analyses do not include any
specific reference to growth, but all are based on
the concept of a balanced growth path.
I will focus on a framework that is very close to the RBC model. This will provide a well-understood benchmark. However, I stress that
these ideas have wide applicability in other models
as well, and I will briefly discuss an NK application at the end.
Empirical studies, such as Perron (1989) and
Hansen (2001), have suggested breaks in the trend
growth of U.S. economic activity. One reasonable
characterization of the data is to assume log-linear
trends with occasional trend breaks but no discontinuous jumps in level—that is, a linear spline.
This is the bottom line of Perron’s econometric
work. These breaks, however, suggest a degree of
nonstationarity that is difficult to reconcile with
available theoretical models. This is where adding
learning can be helpful.
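Perron's characterization, log-linear trends with occasional slope changes but no jumps in level, can be sketched as a least-squares spline fit with a grid search over candidate break dates; the function name and the minimum-segment guard below are illustrative choices:

```python
import numpy as np

def fit_spline_trend(log_y, min_seg=8):
    """Fit log_y = a + b*t + c*max(t - tb, 0) by least squares, choosing the
    break date tb with the smallest sum of squared residuals. Continuity at
    tb makes this a linear spline: the slope changes, the level does not jump."""
    T = len(log_y)
    t = np.arange(T, dtype=float)
    best = None
    for tb in range(min_seg, T - min_seg):
        X = np.column_stack([np.ones(T), t, np.maximum(t - tb, 0.0)])
        coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
        ssr = float(np.sum((log_y - X @ coef) ** 2))
        if best is None or ssr < best[0]:
            best = (ssr, tb, coef)
    return best[1], best[2]
```

With more than one break, additional kink regressors of the same form can be added; the grid search is then over combinations of break dates.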

CURRENT PRACTICE
The standard approach in macroeconomics today is to analyze business cycles and long-term growth separately. The core of this analysis is the statistical trend-cycle decomposition. The standard method in the literature is to use atheoretic, univariate statistical filters, that is, to conduct the decomposition series by series (see, for example, King and Rebelo, 1999). This method ignores an important dictum implied by the balanced growth path assumption: There are restrictions on how the model's variables can grow through time and, therefore, on how one is allowed to detrend data. In an appalling lack of discipline, economists ignore this dictum and detrend data individually, series by series, which makes little sense in any growth theory. An acceptable theory specifies growth paths for the model's variables (i.e., consumption, investment, output); individual trends should not be taken out of the data. Still, the ad hoc practice dominates the literature.
Most of my criticisms are well known:
• Statistical filters do not remove the "trend" that the balanced growth path requires.
• Current practice does not respect the cointegration of the variables, that is, the multivariate trend that the model implies.
• Filtered trends imply changes in growth rates over time; agents would want to react to these changes and adjust their behavior. A model without growth does not allow for this change in behavior.
• The "business cycle facts" are not independent of the statistical filter employed. The econometrics literature normally, but not always, filters the data so as to achieve stationarity for estimation and inference without regard to the underlying theory's balanced growth assumption. Even recent sophisticated models (for example, Smets and Wouters, 2007) do not address this issue.

HOW TO IMPROVE ON THIS?
The criticisms are correct in principle, and they are quantitatively important. These issues cannot be resolved by using alternative statistical filters, because those filters are also atheoretic. Instead,
theory should be used to tell us what the growth
path should look like; then, this theoretical trend
can be used to detrend the data.
In the model I discuss, agents are allowed to
react to trend changes. The ability to react to
changes in trends alters agents’ behavior—how
much they save, how much they consume, and

so forth. Of course, this is demanding territory. I
am insisting that the theorist specify both the
longer-term growth and short-run business cycle
aspects of a model, and then explain the model’s
coherence to observed data. This is the research
agenda I propose.

Core Ideas
The core idea is that modelers should use
“model-consistent detrending,” that is, the trends
that are removed from the data are the same as the
trends implied by the specified model. Presumably, changes in trend are infrequent and, perhaps
with some lag, are recognized by agents who then
react to them. This suggests a role for learning.
In addition, the cointegration of the variables or
the different trends in the various variables implied
by the balanced growth path is respected.

FEATURES OF THE ENVIRONMENT
As an example, I will discuss briefly the most
basic equilibrium business cycle model with
exogenous stochastic growth, but replace rational
expectations with learning as in Evans and
Honkapohja (2001). This model perhaps is appropriate when there is an unanticipated, rare break
in the trend (for example, a labor productivity
slowdown or acceleration). I assume agents possess a tracking algorithm and are able to anticipate
the characteristics of the new balanced growth
path that will prevail after the productivity slowdown occurs. If there is no trend break for a sufficient period, then there is convergence to the
rational expectations equilibrium associated with
that balanced growth path. Learning helps around
points where there is a structural break of some
type by allowing the economy to converge to the
new balanced growth path following the structural
break. In order for this to work, of course, the
model must be expectationally stable such that
the model’s implied stochastic process will remain
near the growth path.

Environment
The environment studied by Bullard and
Duffy (2004) is a standard equilibrium business

cycle model such as the one studied by Cooley
and Prescott (1995) or King and Rebelo (1999).
A representative household maximizes utility
defined over consumption and leisure. Physical
capital is the only asset. Business cycles are driven
by shocks to technology. Bullard and Duffy (2004)
include explicit growth in the model. Growth in
aggregate output is driven by exogenous improvements in technology over time and labor force
growth. The growth rate is exogenous and constant, except for the rare trend breaks that are
incorporated. The production technology is standard. Under these assumptions, aggregate output,
consumption, investment, and capital will all grow
at the same rate along a balanced growth path.

Structural Change
The idea of structural change in this setting
is simply that either the growth rate of technology
or of the labor force takes on a new value. In the
model, changes of this type are unanticipated.
This will dictate a new balanced growth path, and
the agents learn this new balanced growth path.
In order to use the learning apparatus as in
Evans and Honkapohja (2001), a linear approximation is needed. Using logarithmic deviations
from steady state, one can define and rewrite the
system appropriately, as Bullard and Duffy (2004)
discuss extensively. One must be careful about
this transformation because the steady-state values
can be inferred from some types of linear approximations, but we really don’t want to inform the
agents that the steady state of the system has
changed. We want the agents to be uncertain
where the balanced growth path is and learn the
path over time.

Recursive Learning
Bullard and Duffy (2004) study this system
under a recursive learning assumption as in Evans
and Honkapohja (2001). They assume agents
have no specific knowledge of the economy in
which they operate, but are endowed with a perceived law of motion (PLM) and are able to use
this PLM—a vector autoregression—to learn the
rational expectations equilibrium. The rational
expectations equilibrium of the system is determinate under the given parameterizations of the
model.
Should a trend break occur—say, a productivity slowdown or speedup—the change will be
manifest in the coefficients associated with the
rational expectations equilibrium of this system.
The coefficients will change; agents will then
update the coefficients in their corresponding
regressions, eventually learning the correct coefficients. These will be the coefficients that correspond to the rational expectations equilibrium
after the structural change has occurred.
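The coefficient updating just described can be sketched with constant-gain recursive least squares on a scalar autoregressive PLM; the gain, noise scale, and break point below are illustrative, not the Bullard and Duffy (2004) calibration:

```python
import numpy as np

def rls_step(phi, R, x, y, gain=0.02):
    """Constant-gain recursive least squares: update the second-moment
    matrix R and move phi toward reducing the latest forecast error."""
    R = R + gain * (np.outer(x, x) - R)
    phi = phi + gain * np.linalg.solve(R, x) * (y - x @ phi)
    return phi, R

# AR(1) data whose coefficient breaks halfway through the sample.
rng = np.random.default_rng(0)
phi, R = np.zeros(1), np.eye(1)
y_prev = 0.0
for t in range(4000):
    b = 0.5 if t < 2000 else 0.9            # structural break at t = 2000
    x = np.array([y_prev])
    y = b * y_prev + 0.1 * rng.normal()
    phi, R = rls_step(phi, R, x, y)
    y_prev = y
```

Because the gain is constant, old observations are discounted, so the estimate migrates toward the post-break coefficient instead of averaging over both regimes.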

Expectational Stability
For this to work properly the system must be
expectationally stable. Agents form expectations
that affect actual outcomes; these actual outcomes
feed back into expectations. This process must
converge so that, once a structural change occurs,
we can expect the agents to locate the new balanced growth path. Expectational stability
(E-stability) is determined by the stability of a
corresponding matrix differential equation, as
discussed extensively by Evans and Honkapohja
(2001). A particular minimal state variable (MSV)
solution is E-stable if the MSV fixed point of the
differential equation is locally asymptotically
stable at that point. Bullard and Duffy (2004) calculated E-stability conditions for this model and
found that E-stability holds at baseline parameter
values (including the various values of technology and labor force growth used).
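Once the Jacobian DT of the T-map at the MSV fixed point is in hand, the stability check itself is mechanical; a minimal sketch:

```python
import numpy as np

def is_e_stable(DT):
    """The MSV fixed point is E-stable iff it is locally asymptotically stable
    under the ODE d(a)/dtau = T(a) - a, i.e., iff every eigenvalue of DT - I
    has negative real part."""
    DT = np.atleast_2d(np.asarray(DT, dtype=float))
    eigs = np.linalg.eigvals(DT - np.eye(DT.shape[0]))
    return bool(np.all(eigs.real < 0))
```

For instance, in the scalar forward-looking model y_t = mu + beta*E_t y_{t+1}, the T-map is T(a) = mu + beta*a, so DT = [[beta]] and E-stability reduces to beta < 1.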

WHAT THE MODEL DOES
The description above yields an entire system—one possible growth theory along with a
business cycle theory laid on top of that. A simulation of the model will yield growing output and
growing consumption, and so on, but at an uneven
trend rate depending on when the trend shocks
occur and how fast the learning guides the economy to the new balanced growth path following
such a shock. The data produced by the model
look closer to the raw data we obtain on the economy, and now we would like to somehow match
up simulated data with actual data.

Of course, this model is too simple to match
directly with the data, but it is also a well-known
benchmark model so it is possible to assess how
important structural change is when determining
the nature of the business cycle as well as for the
performance of the model relative to the data.
One aspect of this approach is that the model
provides a global theory of the whole picture of
the data. The components of the data have to add
up to total output. This is because in the model
it adds up and one is using that fact to detrend
across all of the different variables. When considering the U.S. data, then, one has to think about
the pieces that are not part of the model and how
those might match up to objects inside the model.
Bullard and Duffy (2004) discuss this extensively.

Breaks Along the Balanced Growth Path
The slowdown in measured productivity
growth in the U.S. economy beginning sometime
in the late 1960s or early 1970s is well known,
and econometric evidence on this question is
reviewed in Hansen (2001). Perron (1989) associated the 1973 slowdown with the oil price shock.
The analysis by Bai, Lumsdaine, and Stock (1998)
suggests the trend break most likely occurred in
1969:Q1.
The Bullard and Duffy (2004) model says
that the nature of the balanced growth path—the
trend—is dictated by increases in productivity
units X(t) and increases in the labor input N(t).
To find break dates, instead of relying on econometric evidence alone, Bullard and Duffy (2004)
designed an algorithm that uses a simulated
method of moments search process (genetic algorithm)1 to choose break dates for the growth factors and the growth rates of these factors, based
on the principle that the trend in measured productivity and hours from the model should match
the trend in measured productivity and hours
from the data. Table 1 reports their findings. The
algorithm suggests one trend break date in the
early 1960s for the labor input and two break dates
for productivity: one in the early 1970s and one
in the 1990s.
1. See Appendix B in Bullard and Duffy (2004).


Table 1
Optimal Trend Breaks

                                          N(t)       X(t)
Initial annual growth rate (percent)      1.20       2.47
Break date                                1961:Q2    1973:Q3
Mid-sample annual growth rate (percent)   1.91       1.21
Break date                                —          1993:Q3
Ending annual growth rate (percent)       1.91       1.86

Table 2
Business Cycle Statistics, Model-Consistent Detrending

               Volatility         Relative volatility   Contemporaneous correlations
               Data      Model    Data      Model       Data      Model
Output         3.25      3.50     1.00      1.00        1.00      1.00
Consumption    3.40      2.16     1.05      0.62        0.60      0.75
Investment    14.80      8.86     4.57      2.53        0.65      0.92
Hours          2.62      1.54     0.81      0.44        0.65      0.80
Productivity   2.52      2.44     0.77      0.70        0.61      0.92

According to Table 1, productivity grows
rapidly early in the sample, then slowly from the
’70s to the ’90s and then somewhat faster after
1993. After each one of those breaks the agents
in the model are somewhat surprised, but their
tracking algorithm allows them to find the new
balanced growth path that is implied by the new
growth rates.
This model includes both a trend and a cycle.
Looking at the simulated data from the model,
what would a trend be? A trend is the economy’s
path if only low-frequency shocks occur. Bullard
and Duffy (2004) turn off the noise on the business
cycle shock and just trace out the evolution of
the economy if only the low-frequency breaks
in technology and labor force growth occur.
Importantly, the multivariate trend defined this
way is then the same one that is removed from
the actual data. In this sense, the model and the
data are treated symmetrically: The growth theory
that is used to design the model is dictating the
trends that are removed from the actual data.
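In a toy version of this exercise, the trend is literally the model rerun with the cycle shock shut off; the growth rates, persistence, and shock size below are illustrative:

```python
import numpy as np

def simulate(g_path, rho=0.95, sigma=0.007, shocks=True, seed=0):
    """Log output = cumulated growth (the model's trend component) plus an
    AR(1) cycle. With shocks=False the simulated path IS the trend."""
    rng = np.random.default_rng(seed)
    level, z, log_y = 0.0, 0.0, []
    for g in g_path:
        level += g                              # low-frequency growth; may break
        if shocks:
            z = rho * z + sigma * rng.normal()  # business cycle shock
        log_y.append(level + z)
    return np.array(log_y)

# Growth slowdown halfway through the sample.
g = np.concatenate([np.full(100, 0.010), np.full(100, 0.005)])
y = simulate(g, shocks=True)
trend = simulate(g, shocks=False)   # shocks off: the model-consistent trend
cycle = y - trend                   # the same trend would be removed from actual data
```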

Business Cycle Statistics
The reaction of the economy to changes in
the balanced growth path will depend in part on
what business cycle shocks occur in tandem
with the growth rate changes. Bullard and Duffy
(2004) average over a large number of economies
to calculate business cycle statistics for artificial
economies. They collect 217 quarters of data for
each economy, with trends breaking as described
above. They detrend the actual data using the
same (multivariate) trend that is used for the
model data.
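The three statistics reported in Table 2 (volatilities, volatilities relative to output, and contemporaneous correlations with output) can be computed from any set of detrended series; a minimal sketch, with the percent standard-deviation convention an assumption on my part:

```python
import numpy as np

def bc_stats(series, output_key="output"):
    """For each detrended series return (volatility as percent s.d.,
    volatility relative to output, contemporaneous correlation with output)."""
    out = np.asarray(series[output_key], dtype=float)
    rows = {}
    for name, x in series.items():
        x = np.asarray(x, dtype=float)
        rows[name] = (
            100.0 * np.std(x),                 # volatility, percent
            np.std(x) / np.std(out),           # relative volatility
            float(np.corrcoef(x, out)[0, 1]),  # correlation with output
        )
    return rows
```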
The numbers in Table 2 are not the standard
ones for this type of exercise. In fact, they are quite
different from the ones that are typically reported
for this model, both for the data and for the model
relative to the data. This shows that the issues of
the underlying growth theory and its implications
for the trends we expect to observe are key issues
in assessing theories. One simple message from Table 2 is that we obtain almost twice as much volatility in this model as there would be in the standard business cycle in this economy. This is so even though the technology shock is calibrated in the standard way.

Figure 1
U.S. Core PCE Inflation (percent, 1965-2005; actual versus model)

New Keynesian Application
A similar approach can be used in the NK
model. This was done by Bullard and Eusepi
(2005). In the NK model (with capital), a monetary
authority plays an important role in the economy’s
equilibrium. In Bullard and Eusepi (2005), the
monetary authority follows a Taylor-type policy
rule. The trend breaks and the underlying growth
theory are the same as in Bullard and Duffy (2004).
Now, however, one can ask how the policymaker
responds using the Taylor rule given a productivity slowdown that must be learned. The policymaker initially misperceives how big the output
gap is, which leads policy to set the interest rate too low, pushing the inflation rate up. How
large is this effect? According to Bullard and Eusepi
(2005), the effect is about 300 basis points on the
inflation rate for a productivity slowdown of the
magnitude experienced in the 1970s (Figure 1).
So, this does not explain all of the inflation in
the 1970s but it helps explain a big part of it.
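The mechanism can be illustrated with the original Taylor (1993) rule, i = r* + pi + 0.5(pi - pi*) + 0.5*gap, using Taylor's r* = pi* = 2 percent. If an unrecognized productivity slowdown leads the policymaker to overstate potential output, the perceived gap sits below the true gap and the rule prescribes too low a rate; the 2-percentage-point misperception below is purely illustrative:

```python
def taylor_rate(inflation, output_gap, r_star=2.0, pi_star=2.0):
    """Taylor (1993) rule, all variables in percent."""
    return r_star + inflation + 0.5 * (inflation - pi_star) + 0.5 * output_gap

# Potential overstated by 2 points: perceived gap is -2 when the true gap is 0,
# so the prescribed rate is 1 percentage point (100 basis points) too low.
shortfall = taylor_rate(2.0, 0.0) - taylor_rate(2.0, -2.0)
```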

CONCLUSION
The approach outlined above provides some
microfoundations for the largely atheoretical
practices that are currently used in the literature.
Structural change is not a small matter, and structural breaks likely account for a large fraction of
the observed variability of output. One way to
think of structural change is as a series of piecewise balanced growth paths. Learning is a glue
that can hold together these piecewise paths.

I think this is an interesting approach and I
would like to encourage more research that goes
in this direction. It doesn’t have to be a simple
RBC-type model; one could instead use a more
elaborate model that incorporates more empirically realistic ideas about what is driving growth
and what is driving the business cycle. The
approach I have outlined forces the researcher
to lay out a growth theory, which is a tough and
rather intensive task, but also leads to a more
satisfactory detrending method and a model that
is congruent with the macroeconomic data in a
broad way.

REFERENCES
Bai, Jushan; Lumsdaine, Robin L. and Stock, James H.
“Testing for and Dating Common Breaks in
Multivariate Time Series.” Review of Economic
Studies, July 1998, 65(3), pp. 395-432.
Bullard, James and Duffy, John. “Learning and
Structural Change in Macroeconomic Data.”
Working Paper No. 2004-016A, Federal Reserve
Bank of St. Louis, August 2004;
http://research.stlouisfed.org/wp/2004/2004-016.pdf.
Bullard, James and Eusepi, Stefano. “Did the Great
Inflation Occur Despite Policymaker Commitment
to a Taylor Rule?” Review of Economic Dynamics,
April 2005, 8(2), pp. 324-59.
Cooley, Thomas F. and Prescott, Edward C. “Economic
Growth and Business Cycles,” in T.F. Cooley, ed.,
Frontiers of Business Cycle Research. Princeton, NJ:
Princeton University Press, 1995, pp. 1-38.


Evans, George W. and Honkapohja, Seppo. Learning
and Expectations in Macroeconomics. Princeton, NJ:
Princeton University Press, 2001.
Hansen, Bruce E. “The New Econometrics of
Structural Change: Dating Breaks in U.S. Labour
Productivity.” Journal of Economic Perspectives,
Fall 2001, 15(4), pp. 117-28.
Hodrick, Robert J. and Prescott, Edward C. “Postwar
U.S. Business Cycles: An Empirical Investigation.”
Discussion Paper 451, Carnegie-Mellon University,
May 1980.
King, Robert G. and Rebelo, Sergio T. “Resuscitating
Real Business Cycles,” in J.B. Taylor and M.
Woodford, eds., Handbook of Macroeconomics.
Volume 1C. Chap. 14. Amsterdam: Elsevier, 1999,
pp. 927-1007.
Orphanides, Athanasios and Williams, John C. “The
Decline of Activist Stabilization Policy: Natural
Rate Misperceptions, Learning and Expectations.”
Journal of Economic Dynamics and Control,
November 2005, 29(11), pp. 1927-50.
Perron, Pierre. “The Great Crash, the Oil Price Shock,
and the Unit Root Hypothesis,” Econometrica,
November 1989, 57(6), pp. 1361-401.
Smets, Frank and Wouters, Rafael. “Shocks and
Frictions in U.S. Business Cycles: A Bayesian DSGE
Approach.” CEPR Discussion Paper No. 6112,
Centre for Economic Policy Research, February
2007; www.cepr.org/pubs/dps/DP6112.asp.

J U LY / A U G U S T

2009

395

396

J U LY / A U G U S T

2009

F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W