The Fed's Monetary Policy Rule

William Poole

This article was originally presented as a speech at the Cato Institute, Washington, D.C., October 14, 2005. Federal Reserve Bank of St. Louis Review, January/February 2006, 88(1), pp. 1-11.

In 1936, Henry Simons published a paper, "Rules Versus Authorities in Monetary Policy," that not only became a classic but also is still highly relevant to today's policy debates (Simons, 1936). I rediscovered several important points in the paper while preparing this lecture. In thinking about policy rules in recent years, I have tended to separate the political and economic cases for a rule. Simons argues for a much more integrated view of the issue:

There are, of course, many special responsibilities which may wisely be delegated to administrative authorities with substantial discretionary power...The expedient must be invoked sparingly, however, if democratic institutions are to be preserved; and it is utterly inappropriate in the monetary field. An enterprise cannot function effectively in the face of extreme uncertainty as to the action of monetary authorities or, for that matter, as to monetary legislation. (Simons, 1936, pp. 1-2)

Thus, Simons argues that the rule of law that characterizes a democracy is also required to provide monetary policy predictability, which, in turn, is necessary for efficient operation of a market economy.

I've chosen a title designed to be provocative, for I suspect that few consider current Federal Reserve policy as characterized by a monetary rule. My logic is this: There is now a large body of evidence, which I'll review shortly, that Fed policy has been highly predictable over the past decade or so. If the market can predict the Fed's policy actions, then it must be the case that Fed policy follows a rule, or policy regularity, of some sort. My purpose is to explore the nature of that rule. Contrary to Simons's implication, the behavior of authorities can be predictable.
Before digging into specifics, consider what the "rules versus discretion" debate is about. Advocates of discretion, as I interpret them, are primarily arguing against a formal policy rule, and certainly against a legislated rule. They believe that policy will be more effective if characterized by "discretion." Discretion surely cannot mean that policy is haphazard, capricious, random, or unpredictable. Advocates of discretion agree with Simons that "many special responsibilities...may wisely be delegated to administrative authorities with substantial discretionary power." However, they do not agree with Simons that discretion "is utterly inappropriate in the monetary field." Interestingly, Simons argued that a fixed money stock would be the best rule, but only if substantial institutional reforms were in place in financial markets, such as 100 percent reserve requirements against bank deposits.

William Poole is the president of the Federal Reserve Bank of St. Louis. The author appreciates comments provided by his colleagues at the Federal Reserve Bank of St. Louis. Robert H. Rasche, senior vice president and director of research, provided special assistance. The views expressed are the author's and do not necessarily reflect official positions of the Federal Reserve System. © 2006, The Federal Reserve Bank of St. Louis. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.
Given the institutional structure, Simons argued for a rule focused on price-level stabilization, because "no monetary system can function effectively or survive politically in the face of extreme alternations of hoarding and dishoarding" (Simons, 1936, p. 5). That is, Simons believed that large variations in the velocity of money would make a fixed money stock rule work poorly. Despite the nature of his argument for a price-level stabilization rule, elsewhere in the same paper Simons argued that, "[o]nce well established and generally accepted as the basis of anticipations, any one of many different rules (or sets of rules) would probably serve about as well as another" (p. 29). I think his first argument was correct—that different rules, even once fully understood, would have different operating properties in the economy, and that a choice among various possible rules should depend on which rule yields better economic outcomes.

My view has evolved over time to this general position: Monetary economists have not yet developed a formal rule that is likely to have better operating properties than the Fed's current practice. Nevertheless, it is highly desirable that policy practice be formalized to the maximum possible extent. Or, more precisely, monetary economists should embark on a program of continuous improvement and enhanced precision of the Fed's monetary rule. It is possible to say a lot about the systematic characteristics of current Fed practice, even though I do not know how to write down the current practice in an equation. It is in this sense that I'll be describing the Fed's policy rule. Given that, as far as I know, there is no other effort to state in one place the main characteristics of the Fed's policy rule, I'm sure that subsequent work will refine and correct the way I characterize it. Thus, I am redefining the "rule" to fit current practice, which has yielded an environment in which policy actions are highly, though not perfectly, predictable in the markets.
Before proceeding, I want to emphasize that the views I express here are mine and do not necessarily reflect official positions of the Federal Reserve System. I thank my colleagues at the Federal Reserve Bank of St. Louis for their comments—especially Bob Rasche, senior vice president and director of research.

POLICY PREDICTABILITY—A SUMMARY OF FINDINGS

I've discussed the predictability of Fed policy decisions on a number of occasions, most recently in a speech on October 4, 2005, entitled "How Predictable Is Fed Policy?" Let me summarize the main findings.

Over the past decade, the Federal Open Market Committee (FOMC) has undertaken a number of steps toward greater transparency that have greatly improved the ability of markets to predict future policy actions. Among these steps are the announcement of policy actions at the conclusion of each FOMC meeting; the restriction of policy actions to regularly scheduled FOMC meetings, except under extraordinary conditions; the announcement of a specific numeric target for the federal funds rate in the post-FOMC-meeting press releases and in the Directive to the Manager of the open market desk at the Federal Reserve Bank of New York; the inclusion of the individual votes at the FOMC meeting in the press release; and the expedited release of the minutes of the FOMC meetings. In addition, since 1989 all FOMC policy actions to change the target for the funds rate have been in multiples of 25 basis points. With the exception of one change of 75 basis points, all the changes have been either 25 or 50 basis points. As I have noted previously, I believe that the evidence supports the conclusion that these steps toward increased transparency have brought the markets into much better "synch" with FOMC thinking about appropriate policy actions.
My metric for judging how well markets have anticipated FOMC policy actions is the reaction of the yield on the 1-month-ahead federal funds futures contract between the close of business on the day before the FOMC meets and the close of business on the day of the meeting. Our research suggests that changes of less than 5 basis points are "noise"; larger changes reflect surprises to market expectations. Since the middle of 1995, when the FOMC has undertaken policy actions at regularly scheduled meetings, the markets have been surprised only 12 times, as measured by a change of 5 basis points or more in the 1-month-ahead federal funds futures contract. Since the middle of 2003, when the FOMC introduced "forward looking" language into the press release, there have been no surprises. In contrast, on all four occasions when the FOMC instituted intermeeting policy actions, the markets were taken by surprise. On the other side of the coin, FOMC decisions to leave the funds rate target unchanged have also become largely predictable: since the middle of 1995 there have been only two occasions when the markets expected a change in the funds rate target and the FOMC left it unchanged.

These findings open this question: What are the circumstances under which market expectations of FOMC actions are adjusted, so that, by the time the FOMC meets, the outcomes are generally correctly foreseen? There is a substantial literature documenting interest rate responses to arriving information. Given that the federal funds futures market predicts FOMC policy decisions quite accurately, that literature provides insight into how the FOMC responds to new information. What I'll do now is to step back from that level of detail to discuss policy regularities at a high level, starting with policy goals.
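The surprise metric just described can be expressed in a few lines. This is a sketch with hypothetical yields; the function names are mine, and only the 5-basis-point noise threshold comes from the text.

```python
# Sketch of the surprise metric: a policy action counts as a "surprise" when
# the 1-month-ahead federal funds futures yield moves 5 basis points or more
# between the close before the FOMC meeting and the close on meeting day.
# Yields below are hypothetical, in percent.

def surprise_bp(yield_before: float, yield_after: float) -> float:
    """Change in the futures yield, in basis points (1 bp = 0.01 percentage point)."""
    return (yield_after - yield_before) * 100.0

def is_surprise(yield_before: float, yield_after: float,
                threshold_bp: float = 5.0) -> bool:
    """Changes smaller than the threshold are treated as noise, not surprises."""
    return abs(surprise_bp(yield_before, yield_after)) >= threshold_bp

# Hypothetical examples: a 2 bp move is noise; a 10 bp move is a surprise.
print(is_surprise(4.00, 4.02))   # False
print(is_surprise(4.00, 4.10))   # True
```

The threshold is applied to the absolute change, so downside surprises (an unexpected cut) are caught as well as upside ones.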
POLICY GOALS

The dual mandate in the Federal Reserve Act, as amended, and in other legislation provides for goals of maximum purchasing power, usually interpreted as price stability, and maximum employment. There are two aspects to achieving the employment goal. First, achieving low and stable inflation maximizes the economy's growth potential and, probably, maximizes the sustainable level of employment. Second, the Fed can enhance employment stability through timely adjustments in its policy stance. A subsidiary goal of general financial stability is closely related to both inflation and employment goals.

On many occasions, dating back to Paul Volcker's confirmation hearing in 1979, Fed officials have stated that the goal of low and stable inflation is there because it maximizes the economy's sustainable rate of economic growth. (See, for example, Committee on Banking, Housing and Urban Affairs, United States Senate, Ninety-sixth Congress, first session, Hearings on the Nomination of Paul A. Volcker to be Chairman, Board of Governors of the Federal Reserve System, July 30, 1979, p. 20; Committee on Banking, Housing and Urban Affairs, United States Senate, Ninety-eighth Congress, first session, The Renomination of Paul A. Volcker to be Chairman, Board of Governors of the Federal Reserve System for a term of 4 years ending August 6, 1987, July 14, 1983, p. 15; Committee on Banking, Housing and Urban Affairs, United States Senate, One Hundredth Congress, first session, The Nomination of Alan Greenspan of New York, to be a member of the Board of Governors of the Federal Reserve System for the unexpired term of 14 years from February 1, 1978, vice Paul A. Volcker, resigned; and, to be Chairman, Board of Governors of the Federal Reserve System for a term of 4 years, vice Paul A. Volcker, resigned, July 21, 1987, p. 29; and Committee on Banking, Finance and Urban Affairs, United States House of Representatives, Testimony of Alan Greenspan, February 23, 1988, reprinted in the Federal Reserve Bulletin, April 1988, p. 227.)

The Fed has gravitated to a specification of the inflation goal stated in terms of the core personal consumption expenditures (PCE) index. At the FOMC meeting of December 21, 1999, Chairman Greenspan provided a clear statement of the case for focusing on the PCE price index rather than on the consumer price index (CPI):

The reason the PCE deflator is a better indicator in my view is that it incorporates a far more accurate estimate of the weight of housing in total consumer prices than the CPI. The latter is based upon a survey of consumer expenditures, which as we all know very dramatically underestimates the consumption of alcohol and tobacco, just to name a couple of its components. It also depends on people's recollections of what they spent, and we have much harder evidence of that in the retail sales data, which is where the PCE deflator comes from. (FOMC Transcript, December 21, 1999, p. 49)

There is evidence that the goal is effectively a 1 to 2 percent annual rate of change, averaged over a "reasonable" period whose precise definition depends on context. Evidence supporting this view of the inflation goal appears in the minutes of the FOMC meetings of May 6, 2003, and August 9, 2005 (www.federalreserve.gov/fomc/minutes/20030506.htm and www.federalreserve.gov/fomc/minutes/20050809.htm).

I regard inflation stability as the primary goal not because it is more important in a welfare sense than maximum employment but because achieving low and stable inflation is prerequisite to achieving employment goals. Inflation stability also enhances, but does not guarantee, financial stability. I take note of, but will not further discuss here, the ongoing debate as to whether the inflation goal should be formalized as a particular numerical goal or range.
CHARACTERISTICS OF THE FED POLICY RULE

The Fed policy rule has a number of elements that can be identified and, in many cases, quantified. I'll now discuss the most important of these.

The Taylor Rule

Statements and testimony of Chairmen Volcker and Greenspan and other FOMC participants, supplemented by the transcripts and minutes of FOMC discussions over the past 25 years, clearly indicate that the long-run objective of Federal Reserve monetary policy is to maintain price stability, usually phrased as "low and stable inflation." In the short run, policy actions are undertaken with the intention of alleviating or moderating cyclical fluctuations, as Chairman Greenspan has noted:

[M]onetary policy does have a role to play over time in guiding aggregate demand into line with the economy's potential to produce. This may involve providing a counterweight to major, sustained cyclical tendencies in private spending, though we can not be overconfident in our ability to identify such tendencies and to determine exactly the appropriate policy response. (Testimony of Alan Greenspan before the Committee on Banking, Finance and Urban Affairs, U.S. House of Representatives, July 13, 1988; reprinted in Federal Reserve Bulletin, September 1988, p. 611)

Over 10 years ago, John Taylor (1993) noted that these characteristics of FOMC policy actions could be summarized in a simple expression:

    i = p + 0.5(p − p*) + 0.5y + r* = 1.5(p − p*) + 0.5y + (r* + p*),

where i is the nominal federal funds rate, p is the inflation rate, p* is the target inflation rate, y is the percentage deviation of real gross domestic product (GDP) from a target, and r* is an estimate of the "equilibrium" real federal funds rate. Under this characterization of the systematic or "rule-like" character of FOMC policy actions, the funds rate is raised (lowered) when actual inflation exceeds (falls short of) the long-run inflation objective and is raised (lowered) when output exceeds (falls short of) a target level. In Taylor's example, the target for GDP was constructed from a 2.2 percent per annum trend of real GDP starting with the first quarter of 1984.
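To make the arithmetic concrete, here is a minimal sketch of a Taylor-type rule. The parameter values below (a 2 percent inflation target and a 2 percent equilibrium real rate, Taylor's original calibration) are illustrative only and are not a statement of the FOMC's actual objectives.

```python
import math

def output_gap_pct(real_gdp: float, potential_gdp: float) -> float:
    """Percent output gap measured as 100 * ln(GDP / potential GDP)."""
    return 100.0 * math.log(real_gdp / potential_gdp)

def taylor_rule(inflation: float, gap_pct: float, *, a: float = 1.5,
                b: float = 0.5, target_inflation: float = 2.0,
                r_star: float = 2.0) -> float:
    """
    Nominal funds rate (percent) from a Taylor-type rule:
        i = a*(p - p*) + b*y + (r* + p*)
    where p is inflation, y the percent output gap. A coefficient a > 1
    is needed for the rule to provide a nominal anchor.
    """
    return a * (inflation - target_inflation) + b * gap_pct + (r_star + target_inflation)

# Illustrative: inflation at 3 percent, output 1 percent above potential.
rate = taylor_rule(3.0, 1.0)
print(rate)  # 6.0, i.e., 1.5*(3-2) + 0.5*1 + (2+2)
```

Raising `b` makes the rule respond more aggressively to the output gap, which is the experiment reported below with b = 0.8.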
In subsequent analyses this target has been interpreted as a measure of "potential GDP." When inflation and real GDP are on target, the policy setting of the real funds rate is the estimated equilibrium value of the real rate. This formulation of an interest rate monetary policy rule satisfies McCallum's properties for a rule that provides a "nominal anchor" to the economy (McCallum, 1981). Taylor showed that his equation closely tracked the actual federal funds rate from 1987 through 1992 except around the stock market crash in October 1987.

For such a rule to be operational, data on the inflation rate and GDP must be known to the FOMC. In practice, the equation can be specified with lagged data on inflation and GDP. More generally, the equation can be written as follows:

    i_t = a(p_{t-1} − p*) + 100b ln(y_{t-1}/y^P_{t-1}) + (r* + p*),

where p_{t-1} is the previous quarter's PCE inflation rate measured on a year-over-year basis, y_{t-1} is the previous quarter's level of real GDP, and y^P_{t-1} is the level of potential real GDP as estimated by the Congressional Budget Office. To ensure a "nominal anchor" for the economy, the coefficient a must be greater than 1.0.

[Figure 1: Greenspan Years: Funds Rate and Taylor Rules (p* = 1.5, r* = 2.0; a = 1.5, b = 0.5), plotting the actual federal funds rate against Taylor rules based on core PCE and PCE inflation, April 1987–April 2005.]

[Figure 2: Greenspan Years: Funds Rate and Taylor Rules (p* = 1.5, r* = 2.3; a = 1.5, b = 0.8), plotting the same three series.]

Figure 1 shows the equation with the Taylor coefficients (a = 1.5, b = 0.5), an assumed equilibrium real rate of interest of 2.0, and an assumed inflation target of 1.5 percent. The solid blue line shows the actual federal funds rate and the dashed lines the two Taylor rule funds rates. The small-dash black line is the rule constructed with the core PCE inflation rate; the long-dash light blue line, with the PCE inflation rate. (Taylor originally specified his equation in terms of CPI inflation. Since the FOMC has stated a preference for PCE measures of inflation, those measures are used here.) The average differences between the two "Taylor rules" and the actual funds rate over the entire period are 15 and 7 basis points, respectively. However, the volatility of each of the two Taylor rules is much less than that of the actual funds rate.

Figure 2 shows the comparison of the two Taylor rules with a larger coefficient on the output gap (b = 0.8) and a slightly higher assumed equilibrium real rate (r* = 2.3). With these assumptions the average differences between the two equations and the funds rate over the entire period are 2 and −3 basis points, respectively, and the volatility of the two equations better approximates the volatility of the actual funds rate.

My purpose here is not to try to find the equation that reveals the policy rule of the Greenspan Fed; as I stated earlier, I do not know how to write down the current practice in an equation, and the FOMC certainly does not view itself as implementing an equation. Rather, the illustrations should be viewed as evidence in support of the proposition that the general contours of FOMC policy actions are broadly predictable.

Policy Asymmetry

Under most circumstances the direction of FOMC policy actions is "biased" in a sense I'll explain. Policy bias exists because turning points in economic activity—peaks and troughs of business cycles—are infrequent. Changes in economic activity as measured by output and employment are highly persistent. This persistence can be seen in Figure 3, which shows month-to-month changes in nonfarm payroll employment from January 1947 through August 2005. During expansions, employment changes are consistently positive; during recessions, consistently negative. Changes opposite to the cyclical direction are rare and generally the consequence of identifiable transitory shocks such as those from strikes and weather disturbances.

[Figure 3: Monthly Changes in Nonfarm Payroll Employment, January 1947–August 2005 (thousands). NOTE: Shaded bars indicate recessions.]

[Figure 4: Autocorrelations of Monthly Payroll Employment Changes at lags 1 through 12, January 1947–August 2005.]
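This persistence can be illustrated with a short simulation. The sketch below generates data from an ARMA(1,1) process for monthly employment changes using the coefficients reported in this article's footnote estimate (AR coefficient 0.96, MA coefficient −0.64); the shock scale and sample length are arbitrary illustrative choices, since only the autocorrelation pattern matters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate d_t = 0.96*d_{t-1} + e_t - 0.64*e_{t-1}, the ARMA(1,1) form of
# the estimate reported in the text for monthly payroll employment changes.
phi, theta, n = 0.96, -0.64, 5000
e = rng.normal(scale=100.0, size=n)   # shock scale is arbitrary
d = np.zeros(n)
for t in range(1, n):
    d[t] = phi * d[t - 1] + e[t] + theta * e[t - 1]

def autocorr(x: np.ndarray, lag: int) -> float:
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float(np.dot(x[lag:], x[:-lag]) / np.dot(x, x))

# Autocorrelations are high at short lags and decay slowly -- the persistence
# that makes the direction of policy predictable away from turning points.
print([round(autocorr(d, k), 2) for k in (1, 3, 6, 12)])
```

The theoretical lag-1 autocorrelation of this process is roughly 0.68, decaying at rate 0.96 per additional lag, which is qualitatively the pattern shown in Figure 4.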
This pattern of business cycles generates strong autocorrelations in the month-to-month changes in payroll employment, as shown in Figure 4. (An estimated ARIMA model for monthly changes in nonfarm payroll employment over the period since 1947 indicates that ΔPayroll_Emp_t − 0.96ΔPayroll_Emp_{t−1} = ε_t − 0.64ε_{t−1}.) Given such persistence, once it becomes apparent that a cyclical peak likely has occurred, the issue is never whether the Fed will raise the target funds rate but whether and how much the Fed will cut the target rate. Similarly, once it is apparent that an expansion is underway, the question is not whether the Fed will cut the target rate, but the extent and timing of increases.

Data Anomalies

Fed policy responds to incoming information, as it should. Sometimes data ought to be discounted because of anomalous behavior. For example, the FOMC has indicated that it monitors inflation developments as measured by the core rather than the total PCE inflation rate. This approach is appropriate because the impacts on inflation of food and energy prices are largely transitory; the difference between the inflation rate as measured by the total PCE index and as measured by the core PCE index fluctuates around zero.

Another example was the increase in tobacco prices in late 1998. Tobacco prices had a transitory impact on measured inflation, for both total and core indices, during December 1998 and January 1999, but produced no lasting effect on trend inflation. (From the December 1998 CPI release, issued in January 1999: "Three-fourths of the December rise in the index for all items less food and energy was accounted for by an 18.8 percent rise in the index for cigarettes, reflecting the pass-through to retail of the 45-cents-a-pack wholesale price increase announced by major tobacco companies in late November.")

Similarly, information about real activity sometimes arrives that indicates transitory shocks to aggregate output and employment. An example of such a transitory shock is the strike against General Motors in June and July 1998. (From the July 16, 1998, Federal Reserve Statistical Release G.17, Industrial Production and Capacity Utilization: "Industrial production declined 0.6 percent in June after a revised gain of 0.3 percent in May. Ongoing strikes, which have curtailed the output of motor vehicles and parts, accounted for the decrease in industrial production." From the Employment Situation: July 1998, released August 7, 1998: "Nonfarm payroll employment edged up by 66,000 to 125.8 million, as growth was curtailed by strikes and plant shutdowns in automobile-related manufacturing.") Similarly, the September 2005 employment report reflects the impact of Hurricane Katrina. Transitory and anomalous shocks to the data are ordinarily rather easy to identify. Both Fed and market economists develop estimates of these aberrations in the data shortly after they occur.

The principle of looking through aberrations is easy to state but probably impossible to formalize with any precision. We know these shocks when we see them, but could never construct a completely comprehensive list of such shocks ex ante. Policymakers piece together a picture of the economy from a variety of data, including anecdotal observations. When the various observations fit together to provide a coherent picture, the Fed can adjust the intended rate with some confidence. The market generally understands this process, as it draws similar conclusions from the same data.

Crisis Management

The above rules are suspended when necessary to respond to a financial crisis. The major examples of the Greenspan era are the stock market crash of 1987; the combination of financial market events in late summer and early fall 1998 that culminated in the near failure of Long-Term Capital Management; crisis avoidance coming up to the century date change at the end of 1999; and the 9/11 terrorist attacks. In each case, the nature of the response was tailored to the circumstances unique to the event. In all cases, crisis responses were helpful because markets had confidence in the Federal Reserve, including confidence that extra provision of liquidity would be withdrawn before risking an inflation problem. In the absence of such confidence, the Fed's ability to respond would be severely curtailed.

The history of Fed crisis management since World War II is generally a happy one. Before the Greenspan era, significant events include the failure of Penn Central in 1970 and the near failure of Continental Illinois in 1984. Perhaps just as important, the Fed has not responded to certain events where it was called on to do so. Examples include the New York City financial crisis in 1975 and the failure of Drexel Burnham Lambert in 1990. (Drexel Burnham Lambert was first investigated by the Securities and Exchange Commission in late 1987 and charged with securities fraud in June 1988. A settlement was reached in December 1988, but the firm declared bankruptcy in February 1990.)

Other Regularities in Policy Stance

Since August 1989, the FOMC has adjusted the intended federal funds rate in multiples of 25 basis points only. After February 1994, when the FOMC first began to announce its policy decision at the conclusion of its meeting, all adjustments, with few exceptions, have been made at regularly scheduled meetings. The exceptions were April 18, 1994; September 29, 1998; January 3, 2001; April 18, 2001; and September 17, 2001.
In general, the Fed can use intermeeting adjustments to respond to special circumstances, such as the rate cut on September 17, 2001, in response to 9/11, or to provide information to the market about a major change in policy thinking or direction, such as the rate cut on April 18, 2001. My own preference is to confine intermeeting adjustments to circumstances in which delaying action to the next meeting would have significant costs. In general, if the market believes that changed circumstances will lead to a changed decision at the next regularly scheduled meeting, then little is gained by acting between meetings. By reserving almost all actions for regularly scheduled meetings, the FOMC gives intermeeting actions special force, which can be valuable in meeting financial crises.

ISSUES TO BE RESOLVED

The rules-versus-discretion debate historically was framed in terms of policy actions. The focus on policy actions was natural because, historically, central bankers were reticent to comment on the rationale for their policy actions and only rarely provided hints about the future course of policy actions. Over the past 15 years, as central bankers, including the FOMC, have striven for greater transparency in monetary policy, communication in the form of policy statements has moved to center stage. It is clear that policy statements are just as important as policy actions, at least in the short run, because significant market effects can flow from these statements. We need to face a new question: Can policy statements become predictable?
I think the answer in principle is largely in the affirmative, although evidence on the issue is scanty and I do not believe that policy statements are currently highly predictable.

Two significant elements in FOMC policy statements are the "balance of risks" assessment introduced in January 2000 and the "forward looking" language introduced in August 2003. The balance-of-risks assessment was introduced to replace the long-standing "bias" statement in the Directive to the Open Market Desk. Historically, the bias statement had referred to the intermeeting period and was not even made public in timely fashion until May 1999. With the regularization of FOMC policy actions on scheduled meeting dates, and the issuance of a statement following every meeting starting with May 1999 to indicate whether or not the funds rate target was changed, a consensus emerged among FOMC participants that the bias formulation did not provide clear public communication.

The balance-of-risks statement attempted to provide insight into the major policy concerns of FOMC members over the "foreseeable future." Initially, the Committee sought to summarize the risks for policy in the foreseeable future in a single assessment covering the prospects for both real economic activity and inflation. In June 2003, the assessment of the risk to sustainable growth was unbundled from the risk to inflation, allowing the Committee to express concerns in different directions about the two risks. Until April 2005, the balance-of-risks assessment was an unconditional statement; since then, the assessment has been conditioned upon "appropriate monetary policy action." Over the 49 FOMC meetings since February 2000, there have been 10 substantive changes in the wording of the balance-of-risks statement (on December 19, 2000; March 19, 2002; August 13, 2002; November 6, 2002; March 18, 2003; May 6, 2003; June 25, 2003; December 9, 2003; May 4, 2004; and March 22, 2005). One of these changes was a decision not to make a balance-of-risks assessment on March 18, 2003, in light of the uncertainty associated with the Iraq war.
In the remaining 10 formulations of the statement, 5 assessed the risks as roughly balanced (or balanced conditional on appropriate policy), 3 indicated concern about economic weakness, 1 indicated concern about heightened inflation pressures, and 1 indicated concern about the risk that inflation might become "undesirably low." The switch in language on December 19, 2000, from a concern about heightened inflation pressures to one about economic weakness, was followed by a reduction in the federal funds target of 50 basis points at an unscheduled FOMC meeting on January 3, 2001. On August 13, 2002, the risk assessment was changed from balanced to weighted toward economic weakness, but the FOMC took no policy action until it reduced the target for the funds rate by 50 basis points at its scheduled meeting on November 6, 2002—the second FOMC meeting after the change in language. The risk assessment was changed from balanced to weighted toward weakness at the May 6, 2003, scheduled FOMC meeting, and the federal funds rate target was reduced by 25 basis points at the subsequent FOMC meeting on June 25, 2003. Prior to August 2003, no policy actions were undertaken at a given FOMC meeting or its subsequent meeting when the risk assessment was balanced.

Beginning in August 2003, the FOMC added "forward looking" language to the press statement. Initially, the language indicated that "policy accommodation can be maintained for a considerable period." In January 2004, the Committee changed the language to indicate that it could be "patient in removing its policy accommodation." The FOMC did not change the target federal funds rate while these statements were in effect.
In May 2004, the Committee indicated that it "believes that policy accommodation can be removed at a pace that is likely to be measured." At its following meeting, the FOMC raised the federal funds rate target by 25 basis points. The Committee then raised the target rate by 25 basis points at each subsequent meeting up to the time this speech was written; the most recent such meeting was September 20, 2005.

At a minimum, the FOMC can and should aspire to policy statements that are clear and do not themselves create uncertainty and ambiguity. The record since 2000 suggests that the balance-of-risks statement and, more recently, the forward-looking language included in the press releases have provided consistent signals about the direction of future policy actions. In interpreting the FOMC's policy statements, it is important that each statement be read against previous ones. Changes in the wording are critical to understanding the perspective of the FOMC members about future policy actions.

RULE ENFORCEMENT

Obviously, there exists no legal enforcement mechanism for the current rule. Nevertheless, there are certainly incentives for the Fed Chairman to follow the rule, or to work to define improvements. The most powerful incentives arise from market reactions to Fed policy actions. The federal funds futures market provides a sensitive measure of near-term market expectations and the eurodollar futures market a sensitive measure of longer-term funds rate expectations. The spread between conventional and indexed Treasury securities provides information on inflation expectations or, more accurately, inflation compensation. Options in these markets provide information on the diffusion of investor expectations. Volatility of market rates and accompanying market commentary provide quick feedback as to market
It is not in the Fed's interest to confuse or whipsaw markets, and for this reason market reactions provide an incentive for the Fed to conduct policy in a predictable fashion that at the same time achieves policy goals. Policy actions should be unpredictable only in response to events that are themselves unpredictable. The response function itself should be as predictable as possible. That is, given the arrival of new information, the goal is that the market should be able to predict the policy action in response to that information.

Although market responses are the most important disciplining force, FOMC members other than the Chairman also provide input, including input through dissents when a member feels strongly that a different policy decision would be better. Reserve Bank directors weigh in through discount rate decisions. Since 1994, except in unusual circumstances, the FOMC has not changed the intended federal funds rate unless several Reserve Banks have proposed corresponding discount rate changes.16

Finally, the general role of public discussion, including the highly visible congressional hearings, bears on the process. Skillful public officials do not want to be forced into a defensive posture when confronting questions in hearings and in Q&A sessions following speeches. I'll leave it to political scientists to study the matter in detail, but will guess that public opinion plays a more important role than formal legal processes in enforcing many legislated and common law rules. If so, then public opinion can play an important role in enforcing extra-legal rules as well.

A SUMMING UP

Federal Reserve policy has become highly predictable in recent years; in the future this predictability will, I am sure, be seen as one of the hallmarks of the Greenspan era. Little has been institutionalized, and for this reason the current Federal Reserve policy rule must be regarded as somewhat fragile. Still, future Chairmen will want to extend Alan Greenspan's successful era, and therefore it will be in their interest to commit to pursue policy regularities that work well.

I do not claim to have accurately identified all aspects of the Fed's current policy rule. I am tempted to call it the "Greenspan policy rule," for Alan Greenspan has surely had far more to do with its construction than anyone else. Nevertheless, I believe that most elements of the rule have become part of a general Fed culture, understood at least roughly by other FOMC members and by staff. While it is appropriate to refer to the "Greenspan rule," I believe that FOMC debates and staff contributions have had a lot to do with the development of the rule. For this reason, I believe that we should be hopeful that consistent and predictable Fed policy is likely to continue into the future.

16 During this period there were 7 occasions when the target funds rate was changed without an accompanying action by the Board of Governors to change the discount rate. Of the remaining 36 changes in the intended funds rate, 33 were accompanied by changes in the discount rate at four or more Federal Reserve Banks. On 24 of these occasions, the discount rate was changed at a majority of the Federal Reserve Banks.

REFERENCES

McCallum, Bennett T. "Price Level Determinacy with an Interest Rate Policy Rule and Rational Expectations." Journal of Monetary Economics, November 1981, 8(3), pp. 319-29.

Poole, William. "How Predictable Is Fed Policy?" Speech at the University of Washington, Seattle, October 4, 2005; www.stlouisfed.org/news/speeches/2005/10_04_05.htm. Federal Reserve Bank of St. Louis Review, November/December 2005, 87(6), pp. 659-68.

Simons, Henry C. "Rules Versus Authorities in Monetary Policy." Journal of Political Economy, February 1936, 44(1), pp. 1-30.

Taylor, John B.
"Discretion versus Policy Rules in Practice." Carnegie-Rochester Conference Series on Public Policy. Amsterdam: North-Holland, 1993, 39, pp. 195-214.

On the Size and Growth of Government

Thomas A. Garrett and Russell M. Rhine

The size of the U.S. federal government, as well as state and local governments, increased dramatically during the 20th century. This paper reviews several theories of government size and growth that are dominant in the public choice and political science literature. The theories are divided into two categories: citizen-over-state theories and state-over-citizen theories. The relationship between the 16th Amendment to the U.S. Constitution and the timing of government growth is also presented. It is likely that portions of each theory can explain government size and growth, but the challenge facing economists is to develop a single unifying theory of government growth.

Federal Reserve Bank of St. Louis Review, January/February 2006, 88(1), pp. 13-30.

Economists have long been divided on the role of government in a society.1 John Maynard Keynes and John Kenneth Galbraith have argued that an economy needs to be continually fine-tuned by an activist government to operate efficiently.2 Thus, as an economy grows, a growing government is also necessary to correct private-sector inefficiencies. This school of thought grew primarily out of the Great Depression, when markets seemed to fail and government intervention was viewed as the means to restore economic stability.
Other 20th century economists, such as Friedrich von Hayek and Milton Friedman, have argued that an activist government is the cause of economic instability and inefficiencies in the private sector.3 Government should exist to ensure that a private market operates efficiently; it should not act to replace the market mechanism.

Various data clearly suggest that the size of the federal government in the United States has grown dramatically during the 20th century.4 One measure of government growth is federal expenditures per capita. The history of real (2000 dollars) federal government expenditures per capita from 1792 to 2004 is shown in Figure 1. This growth did not occur gradually, however. In the early years of the United States, the federal government spent about $30 per person annually. By the 1910s, government expenditures per capita were about $129, or slightly more than four times the 1792 level. In 2004, the federal government spent $7,100 per capita, nearly 55 times more than was spent per capita in the 1910s. Spending growth did slow in the mid-1980s and actually decreased in the mid-1990s. By the year 2000, however, per capita spending increased once again.

It is clear from Figure 1 that spending on national defense can have a substantial impact on the level of government spending. Figure 2 is a graph of total per capita expenditures with and without defense spending over the period 1947-2004. It is evident that the long-term growth in total per capita government spending is not solely a function of national defense.

1 The evolution of this debate is presented in Yergin and Stanislaw (2002).

2 John Maynard Keynes's book, The General Theory of Employment, Interest, and Money, is one of the most influential economic books of the 20th century. Keynes states the need for substantial increases in government spending during times of economic contractions. Similarly, John Kenneth Galbraith argued for an expansionary fiscal policy to increase economic activity and employment.

3 Of the many publications of both these Nobel Prize-winning economists, the most influential are Hayek's The Road to Serfdom and Milton Friedman and Anna Schwartz's A Monetary History of the United States, 1867-1960.

4 All data on federal, state, and local government expenditures are from the Office of Management and Budget (www.whitehouse.gov/omb) and the U.S. Census Bureau.

Thomas A. Garrett is a research officer at the Federal Reserve Bank of St. Louis, and Russell M. Rhine is an assistant professor at St. Mary's College of Maryland. Lesli Ott provided research assistance.

© 2006, The Federal Reserve Bank of St. Louis. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Table 1
Cabinet Departments

Department                          Year established
State                               1789
Treasury                            1789
Justice                             1789
Defense*                            1789
Interior                            1849
Agriculture                         1889
Commerce                            1913
Labor                               1913
Health and Human Services           1953
Housing and Urban Development       1965
Transportation                      1966
Energy                              1977
Education                           1979
Veterans Affairs                    1989
Environmental Protection Agency†    1990
Homeland Security                   2002

NOTE: *The date refers to the Department of War; the Department of Defense was officially created in 1949. The Department of War (1789), the Department of the Navy (1798), the Department of the Army (1947), and the Department of the Air Force (1947) were all reorganized under the Department of Defense in 1949. See www.dod.gov. †Cabinet-level rank under George W. Bush; see www.whitehouse.gov/government/cabinet.html.

SOURCE: Cabinet department websites.
Federal spending has also increased relative to gross domestic product (GDP) throughout much of this country's history, as seen in Figure 3. Expanded government during World War II is clearly evident in Figure 3, as is the slowdown in government growth during the 1980s and 1990s. Figure 1 shows that the federal government has historically spent more per person each year, but Figure 3 suggests that this growth in spending has been less than the growth in GDP at the end of the 20th century.

An examination of the components of federal government spending provides insight into the areas in which the government has increased its activity. Figure 4 plots several components of federal government spending per capita from 1947 to 2004. Although total per capita spending increased following World War II, several components of federal government expenditures stayed relatively constant or even decreased slightly over the next 50 years: physical resources (e.g., transportation, energy), national defense, and "other functions" (e.g., agriculture, general government, international affairs). In fact, much of the reduction in federal expenditures per person occurring in the mid- to late 1990s can be attributed to a reduction in national defense spending. However, spending on net interest payments on the national debt and on human resources grew substantially over the same period. The dramatic increase in human resources spending reflects the growth in Social Security payments and the inception of entitlement programs such as Medicare (in 1965).

Another measure of the size of the federal government is the number of cabinet departments. Eight cabinet departments were created from 1789 to 1952; since 1953, an additional eight cabinet departments have been established. Table 1 provides a list of all executive cabinet departments and the dates they were established.
One can infer from Table 1 and Figure 1 that the increase in per capita expenditures during the 20th century was due to an increase in the physical size of government as well as an increase in spending by existing government agencies.

[Figure 1: Real Per Capita Federal Expenditures, 1792-2004 (2002 dollars)]
[Figure 2: Real Per Capita Federal Expenditures, 1947-2004 (2002 dollars): total, and total less defense]
[Figure 3: Total Federal Expenditures as a Percent of GDP, 1930-2004]
[Figure 4: Real Per Capita Federal Expenditures by Component, 1947-2004 (2002 dollars): national defense, human resources, other functions, physical resources, net interest, and total]

In addition to the increase in federal government expenditures, state and local government expenditures per capita have also increased since World War II, as seen in Figure 5. Inflation-adjusted expenditures per person were about $759 in 1948, compared with over $4,300 per person in 2004. The average annual growth rate in real per capita state and local government expenditures was 3.2 percent, compared with an average annual growth
rate of 2.7 percent for real federal expenditures per person. Total government expenditures per person (federal + state + local) totaled $2,350 in 1948 and nearly $12,150 in 2004.

[Figure 5: Real Per Capita State and Local Government Expenditures, 1948-2004 (2000 dollars)]

The data illustrated in Figures 1 through 5 provide convincing evidence that the size of government in the United States has grown throughout the 20th century.5 An important question asked by economists and political scientists is why this growth has occurred. This paper presents several popular theories of government size and growth that have received attention in the economics and

5 Another measure of government size is federal employment relative to total employment. Plotting this series over time reveals that federal employment is a diminishing share of total employment throughout the 20th century. A closer inspection of the data reveals that most of this decrease in federal employment is a result of a reduction in defense employment, which suggests that the number of federal government employees is not a good measure of the size of the government, because subcontractors complete much of its work. For example, the federal government does not build military aircraft; it pays subcontractors like Lockheed-Martin to build them. So, thousands of people working on the construction of aircraft at Lockheed-Martin receive their pay indirectly from the federal government, and they are not included in government employment figures. In 2004, Lockheed-Martin had sales of $35.5 billion; nearly 80 percent of sales were to the U.S. Department of Defense/Intelligence and Civil Government/Homeland Security (www.lockheedmartin.com).
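The average annual growth rates quoted above follow from the standard compound-growth formula. A quick sketch (the function name and the 56-year span are my own framing, not the authors') roughly confirms the figures:

```python
def avg_annual_growth(start_value, end_value, years):
    """Compound average annual growth rate between two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# State and local spending per person: about $759 (1948) to $4,300 (2004).
state_local = avg_annual_growth(759, 4300, 56)    # roughly 0.031

# Total government spending per person: $2,350 (1948) to $12,150 (2004).
total_gov = avg_annual_growth(2350, 12150, 56)    # roughly 0.030
```

The computed state and local rate comes out near 3.1 percent rather than exactly 3.2 percent; the small difference reflects the rounding of the endpoint dollar values quoted in the text.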
political science literature.6 Since government and the citizenry are made up of individuals, all theories of government considered here approach the issue from a microeconomic perspective; specifically, they consider the incentives of voters and public officials, and the inherent inefficiencies that may arise in a representative democracy. Note that some theories are better suited to explain government size and others to explain government growth.

The theories of government size and growth fall into two distinct categories. The first category is citizen-over-state theories of government. These theories begin with the premise that citizens demand government programs and, as a republic, the government is simply responding to the will of the people. The other category is state-over-citizen theories of government growth. Here, the size of government is independent of citizen demand, and government grows because of inherent inefficiencies in public sector activities and the incentives facing government bureaucrats. The paper concludes with a discussion of the potential importance of the 16th Amendment to the U.S. Constitution, which allows the federal government to tax wage and business income. As will be discussed, the timing of the 16th Amendment and the start of government growth may be more than a coincidence.

6 Kliesen (2003) discusses the increase in government size during the 20th century.

CITIZEN-OVER-STATE THEORIES OF GOVERNMENT SIZE AND GROWTH

The citizen-over-state theories of government size and growth begin with the premise that government growth occurs because citizen demand for government programs has increased over time. It will become evident here that the demand for government can come from individual citizens or a collection of citizens organized into special interest groups.
This section discusses three distinct citizen-over-state theories of government size and growth.

The Government as a Provider of Goods and a Reducer of Externalities

Voters decide which goods the government will provide and which negative externalities the government will correct.7 The tool economists and other social scientists use to determine where the government will intervene is the median voter theorem. Hotelling (1929) and Downs (1957 and 1961) rank voters by political ideology, placing the most conservative individual on the far right and the most liberal on the far left. Assuming a two-party system, the voters must choose either the conservative candidate or the liberal candidate. Since each voter will choose the candidate with the views closest to his or her own, whichever candidate wins the median voter will have a majority of votes and win the election.

An assumption of the median voter theorem is the use of majority rule voting. Additional assumptions are that citizens vote directly on government spending issues and that government spending is the only issue on the ballot. Thus, the median voter determines the demand for publicly provided goods, which is a function of income, the relative price of public goods to private goods, and tastes.

The price elasticity of demand for government and the price of government both determine whether government grows or contracts. Government will grow if the demand for government is price inelastic and the price of government increases. In other words, if the price of government goods or services increases and the quantity demanded of the goods or services does not decrease by a proportionate amount, total government spending increases.

7 A negative externality is a negative (costly) spillover from an activity onto a nonconsenting third party. An example is pollution from a factory that is dumped into a river and has an adverse effect on everyone downstream.
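The elasticity argument can be stated compactly. Writing total government spending as price times quantity demanded, a sketch in standard notation (not the authors' own) is:

```latex
S(p) = p\,q(p), \qquad
\frac{dS}{dp} = q(p) + p\,q'(p) = q(p)\bigl(1 + \varepsilon\bigr),
\qquad \varepsilon \equiv \frac{p}{q}\,\frac{dq}{dp} \le 0 .
```

With inelastic demand ($-1 < \varepsilon < 0$), $dS/dp > 0$, so a rising price of government raises total spending; with elastic demand ($\varepsilon < -1$), $dS/dp < 0$, so total spending rises when the price of government falls.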
The other possibility for government growth is an elastic demand for government and a falling price of government. That is, if the price of government goods or services decreases and the quantity demanded of the goods or services increases by a more-than-proportionate amount, total spending increases. The literature presents evidence in support of an increasing price and an inelastic demand for government.

Baumol (1967) addresses the issue of relative private and public sector prices in terms of government growth. He shows that the increase in the price of public sector goods and services relative to the price of private sector goods and services is due to productivity gains in manufacturing. Since most government programs are services (i.e., national defense, education, and police), they have not experienced the same efficiency gains as manufacturing specifically and the private sector overall; thus, the relative price of public goods has been increasing.

Mueller (2003) presents additional evidence of the Baumol (1967) effect in OECD countries.8 He found that 20 of the 25 OECD countries showed the expected growth in government expenditures as a percent of GDP from 1960 to 1995. Of the five countries that did not increase government expenditures by the amount predicted by the Baumol effect, four still increased expenditures to some extent; only one decreased expenditures, as a result of decreased defense spending after the end of the Cold War.

In addition to the productivity differences, Ferris and West (1999) found that wages in the public sector, which contribute to the price of government, are increasing faster than those in the private sector. They find evidence of this in the salaries of unionized versus non-unionized public school teachers.

8 OECD is the Organisation for Economic Co-operation and Development; OECD countries are listed at www.oecd.org.
The near-monopoly nature of publicly provided goods and services encourages the creation of unions, which will demand higher wages. The government will appease the unions and simply pass these costs on to the taxpayers.

The remaining determinant that the literature uses to help explain the government as a provider of goods and services and a reducer of externalities is citizen tastes and preferences. Over time, tastes for publicly provided goods and services change, and subsequently so will the demand for these goods and services. One such good is the redistribution of income and wealth for insurance purposes. Rodrik (1998) looks at the risk associated with open economies and presents evidence to support the hypothesis that the more open the economy, the larger the government. Specifically, he argues that the volatility of income and employment that corresponds with open economies is an insurable risk. The government programs that act as a form of insurance to protect workers are social programs (i.e., unemployment insurance and Social Security).9

However, as pointed out by Mueller (2003), a problem with Rodrik's (1998) findings is that the large social programs in the United States grew at a time of significant slowdown in the domestic economy (the Great Depression), not of increased openness of the U.S. economy. Thus, social insurance programs are meant to reduce the risk of households' income volatility due, at least in part, to business cycles. The programs also attempt to smooth cash flow over a citizen's lifetime and across income levels.

9 Ex post, not all citizens will benefit from social programs, as suggested by Garrett and Rhine (2005), who show that less than 5 percent of 2003 retirees benefit from the Social Security system. Ex ante, however, the social program is publicly provided insurance.
The Government as a Redistributor of Income and Wealth

The second citizen-over-state theory of government surmises that government serves as a redistributor of income and wealth. All government programs are seen as mechanisms for redistribution. Meltzer and Richard (1978, 1981, 1983) present a model where leisure is inversely related to the fraction of total time worked, consumption is inversely related to the tax rate and positively related to a lump-sum grant received from the government, and income is positively related to productivity. Their model produces a well-known result: a higher level of productivity equates to a higher level of income, and the higher income increases consumption and well-being.

Meltzer and Richard (1978, 1981, 1983) show that individuals will demand the combination of tax rates and lump-sum payments that maximizes their well-being. Individuals with a lower level of productivity, and subsequently a lower level of income, will demand a higher tax rate and a higher lump-sum payment from the government. The extreme case is individuals who do not work and pay no taxes; they will simply want to maximize their lump-sum payment and will demand a higher tax rate than that demanded by working individuals. This model explains the growth in government in part because, over time, new entrants into the voting population are lower-income workers. These lower-income workers will cast votes for the candidate who will levy higher taxes and increase the amount of redistribution.

Kristov, Lindert, and McClelland (1992) explain that the amount of redistribution is based on social affinity. The closer the middle class feels to the poor, or the slower incomes are growing, the greater the amount of redistribution. The authors study the period immediately preceding and during the Great Depression as support for their claim.
They explain that, when the economy was expanding, Americans voted not to increase taxes to fund relief for the poor. But, after the economy changed direction in the 1930s, social programs increased dramatically. Taxes on high earners increased, and the number of programs that redistributed income and wealth increased as well.

Peltzman (1980) explains that candidates promise transfer payments to groups of citizens in order to gain their support. If the distribution of incomes over different socioeconomic classes is similar, then the candidate must offer a greater amount of redistribution to gain supporters. With a trend toward more evenly distributed incomes in the years prior to the Peltzman (1980) study, greater redistribution by the federal government was undertaken.

Interest Groups

Interest groups can increase the size of government by organizing members and applying political pressure more effectively than individual citizens (Olson, 1965, and Moe, 1980). Examples of interest groups mentioned frequently in the popular press include the Sierra Club, the National Organization for Women, and the National Rifle Association. One can think of an interest group as an organized collection of individual voters (or businesses) having the same preference for a specific policy. Through concentrated lobbying, an interest group can obtain a desired policy that has direct benefits for the interest group, while the costs of the policy are spread across millions of taxpayers. Elected officials play a key role in this process as they weigh the political costs and benefits of each policy. Such disconnectedness between costs and benefits will result in inefficient levels of government expenditures; that is, the societal costs of the policy will be greater than the societal benefits.

Supply and demand analysis can be used to model an interest group economy (McCormick and Tollison, 1981).
"Demanders" of a policy will be those groups that can organize and lobby for, say, $100 at a cost of less than $100. "Suppliers" (individual taxpayers) are those for whom it would cost more than $100 to lobby against losing that $100.10 The incentives facing elected officials are such that they will target unorganized suppliers with low losses from any transfer while courting demanders who are organized and active in the political process. Thus, costs are spread across many taxpayers but the benefits are concentrated within the interest group. If too little or too much wealth is transferred, the political process will discipline the elected official at the polls.

Although economic theory can be used to explain how interest groups operate in a political market for transfers, economics has said little about how interest groups form (Olson, 1965). In fact, economic theory suggests that there would be little or no interest group formation because of the free-rider problem. Because the benefits of lobbying are nonrival and nonexcludable, it is rational for individuals who would benefit from lobbying to free-ride.11 Despite a lack of theory for interest group formation, economics has produced dozens of papers that provide theoretical and empirical evidence on the link between interest groups and the size of government.12

Weingast, Shepsle, and Johnsen (1981) offer a rational explanation for the inefficiency (costs > benefits) of special interest projects. The authors focus on distributive policies that concentrate benefits within a geographic area and disperse the costs (taxes) over all constituencies.

10 As suggested by Mueller (2003), the term suppliers should be taken loosely, because individuals would likely engage in the transfer only under coercion.
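The logic of concentrated benefits and dispersed costs can be illustrated with a small numeric sketch. The numbers and the helper function below are illustrative assumptions of mine, not taken from Weingast, Shepsle, and Johnsen:

```python
def district_supports(local_benefit, project_cost, n_districts):
    # A district backs a project when its local benefit exceeds its own
    # share of the cost, which is financed by a uniform national tax.
    return local_benefit > project_cost / n_districts

# Illustrative numbers: 50 districts, each with a local project worth 100
# to its own district but costing 300 nationally.
n_districts, benefit, cost = 50, 100.0, 300.0

# Each district pays only 300/50 = 6 toward its own project, so all approve...
every_district_approves = all(
    district_supports(benefit, cost, n_districts) for _ in range(n_districts)
)

# ...even though each project is socially inefficient: cost exceeds benefit.
each_project_inefficient = cost > benefit
```

With only a handful of districts the cost share would bite and such projects would fail; the inefficiency arises precisely because the cost is spread thinly across many constituencies.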
In the model of Weingast, Shepsle, and Johnsen, the national constituency is divided into districts, each of which is assumed to maximize its net benefits from any redistributive project and to have only one representative. Because each district is only a fraction of the national constituency, the cost of a project is spread out over the entire constituency. A district does not take into account the costs that are being placed on other constituencies when evaluating its own benefits. Thus, because the net benefits of a given project are overstated, the project is larger than the efficient project size. Furthermore, because local projects are larger than the efficient level, the district's representative has even greater interest in acquiring projects that benefit his or her district.

11 There are several ways in which interest groups can, at least partially, overcome the free-rider problem. One way is through coercion or mandatory membership, as in the case of labor unions and state bar associations. Other interest groups may provide valuable private benefits to members, such as publications and educational material, at a relatively low cost of joining; the American Association of Retired Persons (AARP) is an example. Political entrepreneurs can also overcome the free-rider problem; examples include many large corporations that have offices in Washington, D.C. The employees of large corporations also serve as informal lobbyists.

12 See Ekelund and Tollison (2001) for a detailed overview of the literature on interest group theory.

Becker (1983) presents a theory of public policies that result from competition among special interest groups (or "pressure groups," according to Becker).
Becker views political pressure as a public good.13 An increase in interest group membership will increase pressure, but because pressure is a public good, free-riding (by would-be group members) will increase. Because free-riding increases, so do the costs of implementing pressure. Becker finds that efficiency in producing pressure is partly determined by the costs of controlling pressure: greater control over free-riding increases the amount of pressure. With higher amounts of pressure, a special interest group is able to acquire more benefits (lower taxes or higher subsidies). Becker believes that efficiency is improved not only by controlling the free-rider problem, but also through the competition that occurs between tax groups and subsidy groups that consider their losses via taxes or subsidies. Therefore, interaction among competing special interest groups increases the power of the special interest lobby, and thus special interest spending.14

Sobel (2001) provides empirical evidence on the positive relationship between political action committees (PACs) and federal government spending.15 He notes that the rise in federal government spending during the 1970s and 1980s and the subsequent slowdown in the 1990s parallel the

13 A public good is nonrival (consumption by one person does not deny consumption by others) and nonexcludable (no price mechanism exists to deny consumption). National defense is a classic example of a public good.

14 Note a key difference between Weingast, Shepsle, and Johnsen (1981) and Becker (1983): Weingast, Shepsle, and Johnsen believe interest groups arise as a result of the concentrated benefits and dispersed costs that follow from the existence of independent districts (each district is an interest group), and it is this dispersion between costs and benefits that leads to larger government.
Becker, however, believes that it is the competition among interacting interest groups that increases the power of the special interest lobby, and thus increases special interest spending. 15 A PAC is an organization whose goal is to raise campaign funds for candidates seeking political office. F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E V I E W increase and eventual decrease in the number of PACs over this same period. He finds that a 10 percent increase (decrease) in the number of PACs in time t –1 is associated with a 1.07 to 1.57 percent increase (decrease) in federal spending in time t. However, one issue is whether the number of registered PACs, as opposed to PAC membership, accurately represents the scope and power of the special interest lobby in the United States. Although interest group theory may provide, at least in part, a reasonable explanation for the size of government, it is not without its theoretical and empirical challenges.16 One issue is that of causality. Specifically, does interest group activity cause government spending, or do changes in the level of government spending influence interest group activity (Mueller and Murrell, 1985, 1986)? Another issue mentioned earlier is that of interest group formation. Although there are anecdotal explanations as to how interest groups can overcome the free-rider problem, such an idea has yet to be incorporated into a reasonable economic model. Finally, there is debate as to whether the interest group theory is in fact a citizen-over-state theory or a state-over-citizen theory given the pivotal role that elected officials play in the link between interest groups and government growth. STATE-OVER-CITIZEN THEORIES OF GOVERNMENT SIZE AND GROWTH The previous section discussed several citizenover-state theories of government size and growth. 
Inherent in these theories is the idea that government is demand driven: that is, government size and growth occur because citizen demand for government has increased. This demand can come from individual citizens or from groups of citizens (the interest group theory), with each party desiring some form of publicly provided good, externality reduction, or redistribution of income. The following section presents several theories of government growth that start from the completely opposite premise, namely, that the size of government is supply driven rather than demand driven. These theories posit that the incentives facing public officials and the nature of our representative form of government provide an environment for government growth to occur in the absence of citizen demand. Government grows because of government itself: its inherent inefficiencies, its structure (e.g., direct democracy versus representative democracy), and the incentives facing public officials. Appropriately, then, the following three theories are classified as state-over-citizen theories of government growth.

16 Ekelund and Tollison (2001).

Bureaucracy Theory

Goods and services provided by the government do not arise out of thin air; rather, they must be created by a government agency. The supply of government output, then, may be a function not only of citizen demand (as the previous theories suggest) but also of the demands of government bureaucrats. Niskanen's (1971) theory of bureaucracy postulates that government bureaucrats maximize the size of their agencies' budgets in accordance with their own preferences and are able to do so because of the unique monopoly position of the bureaucrat.
Because the bureaucrat provides output in response to his or her own personal preferences (e.g., the desire for salary, prestige, or power), it is possible that the bureaucrat's budget will be greater than the budget required to meet the demands of the citizenry. An important point is that bureaucracy theory does not deny the citizen demand models of government discussed in the previous section; rather, it suggests that bureaucrats can generate budgets in excess of what citizen demand warrants. The ability of a bureaucrat to acquire a budget greater than the efficient level depends on several institutional assumptions (Niskanen, 1971, 2001). First, unlike private sector production, the public sector does not produce a specific number of units but rather supplies a level of activity. This creates a monitoring problem for oversight agencies: It is difficult, if not impossible, for monitors to accurately judge the efficiency of production when no tangible or countable unit of output is available. Second, the monopoly nature of most bureaus shields them from the competitive pressures necessary for efficiency and also denies funding agencies (Congress, the executive branch) comparable information with which to judge the efficiency of the bureau. Third, because bureau funding is provided by agents external to the bureau, only the bureau knows its true cost schedule. This gives bureaucrats an opportunity to overstate their costs in order to receive a larger budget. Finally, the bureaucrat can make take-it-or-leave-it budget proposals to the funding agency. Niskanen (1971) shows that the bureaucrat will maximize his or her budget subject to the constraint that the budget must cover the costs of producing the good or service.
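This budget-maximization problem can be illustrated with a small numerical sketch. The quadratic benefit and cost schedules below are assumptions chosen only for illustration (they satisfy the model's usual curvature conditions); they do not come from Niskanen's text.

```python
# Numerical sketch of Niskanen's budget-maximizing bureau.
# Assumed schedules (hypothetical): B(Q) = 10Q - Q^2/2, C(Q) = 2Q + Q^2/2,
# which satisfy B' > 0, B'' < 0, C' > 0, C'' > 0 over the relevant range.

def B(q):
    """Public benefit schedule B(Q), known to the funding agency."""
    return 10 * q - q**2 / 2

def C(q):
    """Cost schedule C(Q), known only to the bureau."""
    return 2 * q + q**2 / 2

qs = [i / 100 for i in range(1, 1001)]  # grid of output levels

# Efficient output: maximize net benefit B(Q) - C(Q), i.e., set MB = MC.
q_efficient = max(qs, key=lambda q: B(q) - C(q))

# Bureaucrat's output: maximize the budget B(Q) subject to B(Q) >= C(Q).
q_bureau = max((q for q in qs if B(q) >= C(q)), key=B)

print(q_efficient)  # 4.0
print(q_bureau)     # 8.0 -- twice the efficient level here; MC exceeds MB at the margin
```

With these schedules the bureau's output is exactly double the efficient level, and the budget constraint B(Q) = C(Q) binds, matching the model's implication described next.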
The implication of the model is that the bureau's budget (and output) is expanded beyond the point where the marginal public benefit of the good or service equals the bureau's marginal cost of providing it.17 Although the model presents clear reasoning on how a bureau can expand output and costs beyond the efficient level, in reality many bureaus cannot expand output beyond the level demanded by the citizenry. Examples at the local level include school districts and garbage collection: School districts cannot educate more students than are already attending school, and garbage collectors cannot haul more garbage than is available for disposal. Even in these cases, however, a bureau may expand its budget beyond the efficient level, not by providing more than the efficient amount of output but by providing its services at a higher cost than necessary. An ample literature has compared the costs of public and private organizations that provide similar services.

17 A simple formulation of Niskanen's (1971) model of bureaucracy is as follows. The bureau's budget is B = B(Q), where Q is the perceived output of the bureau; the funding agency is aware of this public benefit schedule, and B′(Q) > 0 and B″(Q) < 0. The bureau's cost function is C = C(Q), known only to the bureau, with C′(Q) > 0 and C″(Q) > 0. The bureaucrat maximizes his or her budget subject to the constraint that the budget covers the cost of producing Q, so the bureau's objective function is OB = B(Q) + λ[B(Q) – C(Q)]. Differentiating with respect to Q and λ and rearranging terms gives

(1) ∂B/∂Q = [λ/(1 + λ)]·∂C/∂Q and
(2) B(Q) = C(Q).

Because λ/(1 + λ) < 1, condition (1) implies that output is expanded until marginal benefit falls below marginal cost. Mueller (2003, Chap. 16) provides a detailed analysis of bureaucracy theory and presents extensions of the model that relax several of the initial assumptions.
The activities or firms studied include, but are not limited to, hospitals (Clarkson, 1972), refuse collection (Bennett and Johnson, 1979, and Kemper and Quigley, 1976), water utilities (Morgan, 1977), and fire protection (Ahlbrandt, 1973). Mueller (2003, Chap. 16) provides a summary of 70 studies that examined the cost of public versus private provision of identical services. In all but five of the studies cited, the cost of public provision is significantly greater than that of private provision, lending support to the bureaucracy theory of government. However, the cost difference between private and public organizations may simply result from a lack of competitive pressure rather than from direct attempts by bureaucrats to maximize their budgets. In addition, Mueller (2003) suggests that many of the assumptions necessary for the bureaucracy theory to hold may be too strong and may actually weaken the ability of the bureaucrat to manipulate price and output. For example, the ability of a bureau to present a take-it-or-leave-it budget proposal may be lessened if the funding agency or an oversight agency is aware of the advantage such a position affords the bureau. Thus, the funding agency may request that the bureau present several cost and output scenarios; if the bureau must present a cost schedule, it becomes more likely that the bureau will announce its true costs.18 Also, several agencies, such as the U.S. General Accounting Office, exist for the sole purpose of detecting excessive costs and inefficiencies in government bureaus. The possibility of an audit, and the negative attention such an action brings, creates an incentive for bureaucrats to limit their pricing power and, at least somewhat, to promote an efficient organization.

18 Bendor, Taylor, and Van Gaalen (1985) show that a bureau can charge a price higher than the efficient level (where marginal cost equals marginal benefit) only when demand for the bureau's service is inelastic.

Although these constraints on bureaucracy seem reasonable, they are somewhat limited given the number of local, state, and federal agencies that exist relative to the number of funding and oversight agencies. However, although the literature presents strong evidence that bureaucracy may partly explain government size, much less work has been done on how bureaucracy theory may explain government growth. One explanation, put forth by Mueller (2003), is that the ability of a bureau to misrepresent its cost and/or output schedule is likely to be directly correlated with the bureau's size. Thus, larger bureaus can better manipulate their budgets than smaller bureaus can, and any manipulation of a bureau's budget will increase the size of the bureau, which in turn increases the bureau's ability to manipulate its budget. Despite the limits of bureaucracy theory, it remains a plausible explanation for the scope of government seen today. The common inefficiencies of large organizations, be they private or public, are well known to the general public, who often work in such organizations. In addition, it is not uncommon for the media to report waste or fraud at large private and public organizations. The bureaucracy theory thus fits arguably well with the real-world experiences of many people.

Fiscal Illusion

The fiscal illusion theory assumes that government, specifically legislators and the executive branch, can deceive voters as to the true size of government. This theory is similar to the bureaucracy theory, which postulated that bureaus can deceive legislators and funding agencies as to the true size of the bureau. The concept of fiscal illusion has been discussed in the economics literature for nearly a century, but Buchanan (1967) formulated the idea into a theory of government size and growth.
Fiscal illusion assumes that citizens measure the size of government by the quantity of taxes they pay. As such, taxes and tax collection measures that are less obvious to citizens are more likely to be used by government. Examples include the federal withholding of income taxes and property tax collection through monthly mortgage payments. Although the income tax is considered a direct tax (versus indirect taxes such as gasoline or cigarette taxes), the ability of direct taxes to be disguised suggests that the collection method of some direct taxes may hide citizens' tax bills better than indirect taxes do. Mueller (2003, p. 527) suggests that determining which taxes are hidden from citizens is largely an empirical issue. Oates (1988) provides an overview of the empirical literature on fiscal illusion. He summarizes the empirical findings and develops five hypotheses in support of the fiscal illusion theory of government, concluding that (i) tax burdens are more difficult to evaluate when the tax structure is more complicated; (ii) progressive tax structures, which increase a citizen's tax bill as income rises, are less obvious than legislated changes to the tax code; (iii) homeowners are better able to judge their portion of property taxes than are renters; (iv) the issuance of debt (and thus the likelihood of future tax increases) appears less costly to voters than current tax increases; and (v) the "fly-paper effect" of government spending is real. The fly-paper effect hypothesis deserves some explanation given the attention it has received in the literature (see Hines and Thaler, 1995). Economic theory predicts that a lump-sum increase in income to one level of government from another (say, a lump-sum grant from the federal government to a state government) will increase government spending by the same amount as would an equal increase in citizen income in that state.
Increases in income (revenue via taxes) or grants to the voter's government are identical in this view because both increase the financial resources of the government. Government sets the level of expenditures desired by the median voter. Thus, when grant monies are obtained by the government, the voter can treat those grant funds as an increase in personal income via a reduction in taxes. Through an efficient political process, then, any additional revenue from grants is offset by a decrease in tax revenue demanded by voters. Typically, a $1 increase in personal income increases government spending by $0.05 to $0.10.19 In the absence of a fly-paper effect, one should therefore expect every $1 of a lump-sum grant to state governments (an income increase to state governments) to increase government spending by the same amount, $0.05 to $0.10. However, the literature has shown that lump-sum grants increase government spending by $0.20 to $1 for every $1 in grant money, significantly more than the $0.05 to $0.10 increase that would arise from an increase in median voter income.20 The grant money thus "sticks" where it is sent, hence the term fly-paper effect. Inefficiencies in the political process and a disconnect between the preferences of the median voter and the government are cited as reasons why the fly-paper effect may exist. If the fly-paper effect exists, then governments can increase spending without apparent tax increases. Increases in intergovernmental grants will still need to be financed by taxes, but this tax revenue (and the resulting tax burden on citizens) is not directly linked to the expenditures of the state governments. The fly-paper effect and the broader issue of fiscal illusion are not without critics.

19 Hines and Thaler (1995) and Fisher (1996).

20 Hines and Thaler (1995) summarize the results of numerous studies that present empirical estimates of the fly-paper effect.
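The benchmark comparison can be made concrete with a small illustration. The propensities used here ($0.075 per $1 of income, $0.45 per $1 of grant) are hypothetical midpoints of the ranges quoted in the text, not estimates from any particular study, and the grant amount is invented.

```python
# Fly-paper effect: median-voter benchmark vs. observed spending response.
# All numbers below are hypothetical, chosen from the ranges cited in the text.

income_propensity = 0.075  # spending rise per $1 of median-voter income ($0.05-$0.10)
grant_propensity = 0.45    # spending rise per $1 of lump-sum grant ($0.20-$1.00)

grant = 1_000_000  # hypothetical lump-sum federal grant to a state government

predicted = income_propensity * grant  # what equivalence (no fly-paper effect) predicts
observed = grant_propensity * grant    # what the empirical literature tends to find

print(f"predicted spending increase: ${predicted:,.0f}")  # $75,000
print(f"observed spending increase:  ${observed:,.0f}")   # $450,000
print(f"grant dollars that 'stick':  ${observed - predicted:,.0f}")
```

The gap between the two figures is the anomaly the fly-paper literature tries to explain.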
Doubters of the fly-paper effect argue that misspecification of empirical models and of the political processes in the public sector, as well as the failure to distinguish among the numerous types of grants, may explain the fly-paper effect found in the literature (Hamilton, 1983, and Chernick, 1979).21 Regarding fiscal illusion more broadly, the literature does not explain exactly how government will grow if fiscal illusion is indeed present. Just because voters are unaware of their true tax bill, that does not mean there is a clear method for government officials (legislators, bureaucrats) to take advantage of this situation to increase the size of government.

21 Grants can be lump-sum, matching (where the receiving government must match a certain percentage of the expenditure), closed-ended, or open-ended. Whereas a lump-sum grant creates only an income effect, a matching grant creates both an income effect and a substitution effect. Economic theory predicts that matching grants will result in higher government spending than lump-sum grants (see Fisher, 1996, Chap. 9). Disentangling the effects of the various forms of grants greatly complicates empirical analyses of the fly-paper effect.

Table 2
Vote Trading, Bundling, and Government Size
Net Benefits (+) or Costs (–) to Each Voter's District

Voters of district   Construction of      Dredging harbor   Construction of        Total
                     post office in A     in B              military base in C
A                    +$10                 –$3               –$3                    +$4
B                    –$3                  +$10              –$3                    +$4
C                    –$3                  –$3               +$10                   +$4
D                    –$3                  –$3               –$3                    –$9
E                    –$3                  –$3               –$3                    –$9
Total                –$2                  –$2               –$2                    –$6

SOURCE: Gwartney and Stroup (1997, p. 503).
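The arithmetic of Table 2 can be verified with a short script; the district labels and dollar figures below are taken directly from the table.

```python
# Vote trading and bundling (Table 2): three projects, five districts.
# net[d][p] = net benefit (+) or cost (-) to district d from project p.
projects = ["post_office_A", "harbor_B", "military_base_C"]
net = {
    "A": [10, -3, -3],
    "B": [-3, 10, -3],
    "C": [-3, -3, 10],
    "D": [-3, -3, -3],
    "E": [-3, -3, -3],
}

# Voted on separately, each project wins only its home district, failing 1-to-4.
for i, project in enumerate(projects):
    yes = sum(1 for d in net if net[d][i] > 0)
    print(project, "passes separately:", yes > len(net) / 2)  # False for all three

# Bundled, districts A, B, and C each come out +$4 ahead, so the package
# passes 3-to-2 even though its total net benefit is -$6 (inefficient).
yes_bundle = sum(1 for d in net if sum(net[d]) > 0)
total = sum(sum(row) for row in net.values())
print("bundle passes:", yes_bundle > len(net) / 2, "| total net benefit:", total)
```

This is the mechanism discussed next: bundling converts three projects that would each lose 4-to-1 into a package that passes 3-to-2.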
Mueller (2003) argues that, for fiscal illusion to explain government size and growth, it must be combined with the other theories of government growth discussed earlier to form a single model of government growth.

Monopoly Government and Leviathan

The idea that representative governments behave as monopolists was first suggested by Breton (1974). The party in control of the legislature has an objective function that includes the probability of reelection, personal pecuniary gain, and the pursuit of personal ideals. While providing basic public goods, such as police and fire protection (in the case of a local government), the monopoly government can achieve its objectives by bundling narrowly defined issues that benefit individual members of the government with the more popular public-good services it provides. This idea stems from the neoclassical view of monopoly, in which a private monopolist can increase his profit by bundling products he does not monopolize with his monopolized product. Consumers will then buy the monopolist's package as long as their consumer surplus on the bundled products exceeds the cost of the individual packages. In the case of governments, this bundling of services results in higher levels of government output. Tullock (1959) provides a comprehensive analysis of how the bundling of goods and vote-trading among legislators can increase the size of government. The example shown in Table 2 illustrates the point made by Tullock (1959): A five-member legislature is considering three projects, each of which is inefficient because its total costs outweigh its total benefits.22 As a result, if each project were voted on separately (and each legislator voted according to the preferences of his constituency), none of the projects would be implemented, because each would lose by a 4-to-1 margin.
But bundling the three projects will garner "yes" votes from the legislators representing districts A, B, and C, allowing the legislation to pass 3-to-2 and thereby increasing the size of government expenditures.

22 As noted in Gwartney and Stroup (1997, p. 503), vote-trading and bundling can also lead to efficient measures. The point made in the above example is that bundling can lead to greater government size.

The monopolist view of government has been extended further by Brennan and Buchanan (1977, 1980). In their model of a "leviathan" government, the monopoly government's sole objective is to maximize revenue. The citizenry is assumed to have lost all control over their government, and political competition is seen as an ineffective constraint on the growth of government.23 This leviathan view of government is the opposite of the government assumed in the citizen-over-state theories, the latter being a benevolent provider of goods, reducer of externalities, and redistributor of income. According to Brennan and Buchanan (1977), only constitutional constraints on the government's authority to tax and issue debt can limit a leviathan government.24

Empirical evidence for the monopoly view of government has provided mixed results. The studies are often conducted at the local rather than the national level because of data availability. Many tests for monopoly government have a goal similar to that of tests of the bureaucracy theory: to show that the cost of public services is greater than the cost of identical services provided by the private sector. Additional research has hypothesized that one constraint on a monopoly government is competition from neighboring governments (Martin and Wagner, 1978). This research on the monopoly power of government has shown that restrictions on municipal incorporation raise the costs of existing local governments.
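The revenue-maximization condition behind the leviathan model (footnote 24: revenue is maximized where the elasticity of the tax base with respect to the tax rate equals –1) can be checked numerically. The linear base schedule below is an assumption chosen only for illustration.

```python
# Leviathan revenue maximization: T(r) = r * B(r).
# Assumed base schedule, for illustration only: B(r) = 100 * (1 - r),
# i.e., the tax base shrinks as the tax rate rises.

def base(r):
    return 100 * (1 - r)

def revenue(r):
    return r * base(r)

rates = [i / 1000 for i in range(1, 1000)]
r_star = max(rates, key=revenue)  # revenue-maximizing tax rate

# Elasticity of the base with respect to the rate: (dB/dr) * (r / B).
dB_dr = -100  # derivative of the assumed base schedule
elasticity = dB_dr * r_star / base(r_star)

print(r_star)      # 0.5 -- well below a 100 percent rate
print(elasticity)  # -1.0 at the revenue-maximizing rate
```

Even a revenue-maximizing leviathan stops raising rates once further increases shrink the base proportionally faster than the rate rises.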
Tests for leviathan government begin with the premise that leviathan behavior should be less likely when government is relatively small and intergovernmental competition is strong. As with the studies of monopoly government, much of the literature on leviathan has focused on local governments (Oates, 1972, Nelson, 1987, and Zax, 1989). The mixed results obtained in these studies are due, at least in part, to the variety of methods authors use to proxy for government size. At the national level, Oates (1985) finds that a federalist constitution (many levels of government) has a negative, but statistically insignificant, effect on government growth. Much more empirical testing must be done before the leviathan view of government is broadly accepted as one plausible explanation for government growth.

23 This is a result of the rational ignorance of voters (voters do not inform themselves about the political process because the costs of doing so outweigh any benefit from their single vote) and collusion among elected officials.

24 A revenue-maximizing government will typically not maximize revenue at a 100 percent tax rate because the tax base shrinks as the tax rate increases. Consider T = r · B(r), where T is tax revenue, r is the tax rate, and B is the tax base. Differentiating tax revenue with respect to the tax rate and rearranging terms gives

(∂B/∂r)·(r/B) = –1.

This expression shows that tax revenue is maximized when the elasticity of the tax base with respect to the tax rate equals –1. If the elasticity is less than –1, an increase in the rate decreases the base proportionally more, thereby decreasing revenue; if the elasticity is greater than –1, an increase in the rate decreases the base proportionally less, thereby increasing revenue.

A NOTE ON THE 16TH AMENDMENT TO THE U.S. CONSTITUTION

Prior to the adoption of the 16th amendment to the U.S.
Constitution in 1913, the federal government was constrained from directly taxing personal income by Article 1, Section 9 of the U.S. Constitution, which reads as follows: "No Capitation, or other direct, Tax shall be laid, unless in Proportion to the Census or Enumeration herein before directed to be taken." A careful reading of this clause reveals that the federal government actually could levy a personal income tax (a direct tax) prior to the 16th amendment, but income tax collection had to be apportioned according to population. The 16th amendment negated the apportionment requirement of Article 1, Section 9. The 16th amendment reads as follows: "The Congress shall have power to lay and collect taxes on incomes, from whatever source derived, without apportionment among the several States, and without regard to any census or enumeration." The 16th amendment was passed by Congress on July 2, 1909, and ratified on February 3, 1913.25 What makes this amendment interesting with regard to government growth is that the dramatic rise in the size of the federal government (see Figure 1) began immediately following its ratification.

25 For an interesting history of the 16th amendment, see National Archives and Records Administration (1995) and www.ourdocuments.gov (keyword search "16th amendment").

The option to levy a federal income tax does not itself imply that government will grow. The option to tax personal income means only that government has another source of revenue with which to finance its growth. Explaining government growth must be done using the theories presented earlier; income taxes are simply the fuel that enables the engine of government growth to start.
However, the government has increased its reliance on federal income taxes over the past 90 years, the same period in which government expenditures have increased dramatically. Personal income tax revenue as a percentage of all federal tax revenue increased from about 2 percent in 1913 to over 43 percent in 2004. Also, because of large exemptions, few people paid personal income taxes in 1913, and those who did faced rates much lower than today's. For example, in 1913 the lowest tax bracket was $0 to $20,000, with a 1 percent marginal tax rate; the highest bracket was on taxable income over $500,000, taxed at a 7 percent rate; and the personal married exemption was $4,000. In 2004 dollars, the lowest 1913 bracket and the married exemption would be equal to $381,616 and $76,323, respectively, and the top 1913 bracket would correspond to a 2004 income of $10,495,000.26 Compare this with actual 2004 tax statistics: The married exemption (no children) was $6,200, the lowest tax bracket was 10 percent on taxable income up to $7,150, and the top marginal tax rate was 35 percent on taxable income over $319,100.27 Although the strength of any causality between the 16th amendment and later expansions of the income tax must be determined empirically, the strong correlation between these two events is compelling.28

26 Calculations were made using the consumer price index (CPI).

27 Internal Revenue Service: www.irs.gov/pub/irs-soi/02inpetr.pdf and 2004 Form 1040.

28 Holcombe and LaCombe (1998) discuss the ratification of the 16th amendment and government growth.

SUMMARY AND CONCLUSIONS

The past 90 years have seen a dramatic rise in the size and growth of government in the United States. This article presented various data illustrating this increase and then focused on several economic theories that attempt to explain it.
The theories fit into one of two philosophies of government growth: Either (i) the growth of government is driven by citizen demand or (ii) the growth of government is a result of government itself, brought on by inherent inefficiencies in the public sector, the personal incentives of public officials, and representative democracy. The theories discussed in this article are not the only theories of government growth that have been proposed. Researchers have suggested that electoral cycles, in conjunction with citizen demand, may play a role in the size and growth of government (Downs, 1957, and Coughlin, 1992). The expansion of the voting franchise, an arguably more controversial explanation for government growth, was suggested by Meltzer and Richard (1981); their idea is that the groups of individuals given the right to vote were typically from the lower end of the income distribution and demanded greater government services. Although each theory was presented here as a stand-alone explanation for government size and growth, the complexity of the public sector and the political process, as well as the limits of empirical economic analysis, suggest that government growth is likely a function of some or all of the above theories. In addition, many of the theories do a better job of explaining either size or growth: Few adequately explain both the current size of government and its growth over time. Some of the theories have not withstood empirical tests, and debate continues as to whether this is a result of incorrect theory or incorrect empirical modeling. The challenge for economists and political scientists is to formulate a single cohesive theory that accounts for all aspects of the citizen-over-state and state-over-citizen theories presented here.

REFERENCES

Ahlbrandt, Roger S. Jr. "Efficiency in the Provision of Fire Services." Public Choice, Fall 1973, 16, pp. 1-15.

Baumol, William J. "The Macroeconomics of Unbalanced Growth: The Anatomy of Urban Crisis." American Economic Review, June 1967, 57(3), pp. 415-26.

Becker, Gary S. "A Theory of Competition among Pressure Groups for Political Influence." Quarterly Journal of Economics, August 1983, 98(3), pp. 371-400.

Bendor, Jonathan; Taylor, Serge and Van Gaalen, Roland. "Bureaucratic Expertise versus Legislative Authority: A Model of Deception and Monitoring in Budgeting." American Political Science Review, December 1985, 79(4), pp. 1041-60.

Bennett, James T. and Johnson, Manuel H. "Public versus Private Provision of Collective Goods and Services: Garbage Collection Revisited." Public Choice, 1979, 34(1), pp. 55-63.

Brennan, Geoffrey and Buchanan, James M. "Towards a Tax Constitution for Leviathan." Journal of Public Economics, December 1977, 8(3), pp. 255-73.

Brennan, Geoffrey and Buchanan, James M. The Power to Tax: Analytical Foundations of a Fiscal Constitution. Cambridge: Cambridge University Press, 1980.

Breton, Albert. The Economic Theory of Representative Government. Chicago: Aldine, 1974.

Buchanan, James. Public Finance in Democratic Processes. Chapel Hill, NC: University of North Carolina Press, 1967.

Chernick, Howard. "An Econometric Model of the Distribution of Project Grants," in P. Mieszkowski and W. Oakland, eds., Fiscal Federalism and Grants-in-Aid. Washington, DC: The Urban Institute, 1979.

Clarkson, Kenneth W. "Some Implications of Property Rights in Hospital Management." Journal of Law and Economics, October 1972, 15(2), pp. 363-84.

Coughlin, Peter. Probabilistic Voting Theory. Cambridge: Cambridge University Press, 1992.

Downs, Anthony. An Economic Theory of Democracy. New York: Harper and Row, 1957.

Downs, Anthony. "Problems of Majority Voting: In Defense of Majority Voting." Journal of Political Economy, April 1961, 69(2), pp. 192-99.

Ekelund, Robert and Tollison, Robert. "The Interest Group Theory of Government," in William Shughart and Laura Razzolini, eds., The Elgar Companion to Public Choice. Northampton: Edward Elgar, 2001, pp. 357-78.

Ferris, J. Stephen and West, Edwin G. "Cost Disease versus Leviathan Explanations of Rising Government Costs: An Empirical Investigation." Public Choice, March 1999, 98(3-4), pp. 307-16.

Fisher, Ronald. State and Local Public Finance. Chicago: Irwin, 1996.

Friedman, Milton and Schwartz, Anna J. A Monetary History of the United States, 1867-1960. Princeton: Princeton University Press, 1963.

Garrett, Thomas A. and Rhine, Russell M. "Social Security versus Private Retirement Accounts: A Historical Analysis." Federal Reserve Bank of St. Louis Review, March/April 2005, 87(2), Part 1, pp. 103-21.

Gwartney, James and Stroup, Richard L. Microeconomics: Private and Public Choice. 8th Edition. Chicago: Dryden Press, 1997.

Hamilton, Bruce W. "The Fly Paper Effect and Other Anomalies." Journal of Public Economics, December 1983, 22(3), pp. 347-61.

Hayek, Friedrich A. von. The Road to Serfdom. London: George Routledge and Sons, 1944.

Hines, James R. Jr. and Thaler, Richard H. "The Fly Paper Effect." Journal of Economic Perspectives, Fall 1995, 9(4), pp. 217-26.

Holcombe, Randall G. and LaCombe, Donald J. "Interests versus Ideology in the Ratification of the 16th and 17th Amendments." Economics and Politics, July 1998, 10(2), pp. 143-59.

Hotelling, Harold. "Stability in Competition." Economic Journal, March 1929, 39(153), pp. 41-57.

Kemper, Peter and Quigley, John M. The Economics of Refuse Collection. Cambridge, MA: Ballinger, 1976.

Keynes, John Maynard. The General Theory of Employment, Interest, and Money. New York: Harcourt Brace, 1936.

Kliesen, Kevin. "Big Government: The Comeback Kid?" Federal Reserve Bank of St. Louis Regional Economist, January 2003.

Kristov, Lorenzo; Lindert, Peter and McClelland, Robert. "Pressure Groups and Redistribution." Journal of Public Economics, July 1992, 48(2), pp. 135-63.

Martin, Dolores T. and Wagner, Richard E. "The Institutional Framework for Municipal Incorporations: An Economic Analysis of Local Agency Formation Commissions in California." Journal of Law and Economics, October 1978, 21(2), pp. 409-25.

McCormick, Robert and Tollison, Robert. Politicians, Legislation, and the Economy: An Inquiry into the Interest Group Theory of Government. Boston: Martinus Nijhoff, 1981.

Meltzer, Allan H. and Richard, Scott F. "Why Government Grows (and Grows) in a Democracy." Public Interest, Summer 1978, 52, pp. 111-18.

Meltzer, Allan H. and Richard, Scott F. "A Rational Theory of the Size of Government." Journal of Political Economy, October 1981, 89(5), pp. 914-27.

Meltzer, Allan H. and Richard, Scott F. "Tests of a Rational Theory of the Size of Government." Public Choice, 1983, 41(3), pp. 403-18.

Moe, Terry M. The Organization of Interests: Incentives and the Internal Dynamics of Political Interest Groups. Chicago: University of Chicago Press, 1980.

Morgan, W. "Investor Owned vs. Publicly Owned Water Agencies: An Evaluation of the Property Rights Theory of the Firm." Water Resources Bulletin, 1977, 13(4), pp. 775-81.

Mueller, Dennis C. Public Choice III. Cambridge: Cambridge University Press, 2003.

Mueller, Dennis and Murrell, Peter. "Interest Groups and the Political Economy of Government Size," in Francesco Forte and Alan Peacock, eds., Public Expenditures and Government Growth. Oxford: Basil Blackwell, 1985.

Mueller, Dennis C. and Murrell, Peter. "Interest Groups and the Size of Government." Public Choice, 1986, 48(2), pp. 125-45.

National Archives and Records Administration. Milestone Documents in the National Archives. Washington, DC: 1995, pp. 69-73.

Nelson, Michael A. "Searching for Leviathan: Comment and Extension." American Economic Review, March 1987, 77(1), pp. 198-204.

Niskanen, William. Bureaucracy and Representative Government. Chicago: Aldine-Atherton, 1971.

Niskanen, William. "Bureaucracy," in William Shughart and Laura Razzolini, eds., The Elgar Companion to Public Choice. Northampton: Edward Elgar, 2001.

Oates, Wallace E. Fiscal Federalism. New York: Harcourt Brace Jovanovich, 1972.

Oates, Wallace E. "Searching for Leviathan: An Empirical Study." American Economic Review, September 1985, 75(4), pp. 748-57.

Oates, Wallace E. "On the Nature and Measurement of Fiscal Illusion: A Survey," in G. Brennan et al., eds., Taxation and Fiscal Federalism: Essays in Honour of Russell Mathews. Sydney: Australian National University Press, 1988.

Olson, Mancur. The Logic of Collective Action: Public Goods and the Theory of Groups. Cambridge: Harvard University Press, 1965.

Peltzman, Sam. "The Growth of Government." Journal of Law and Economics, October 1980, 23(2), pp. 209-87.

Rodrik, Dani. "Why Do More Open Economies Have Bigger Governments?" Journal of Political Economy, October 1998, 106(5), pp. 997-1032.

Sobel, Russell S. "The Budget Surplus: A Public Choice Explanation." Working Paper 2001-05, West Virginia University, 2001.

Tullock, Gordon. "Problems of Majority Voting." Journal of Political Economy, December 1959, 67(6), pp. 571-79.

Weingast, Barry R.; Shepsle, Kenneth A. and Johnsen, Christopher. "The Political Economy of Benefits and Costs: A Neoclassical Approach to Distributive Politics." Journal of Political Economy, August 1981, 89(4), pp. 642-64.

Yergin, Daniel and Stanislaw, Joseph. The Commanding Heights: The Battle for the World Economy. New York: Simon and Schuster, 2002.

Zax, Jeffrey S. "Is There a Leviathan in Your Neighborhood?" American Economic Review, June 1989, 79(3), pp. 560-67.
The Evolution of the Subprime Mortgage Market

Souphala Chomsisengphet and Anthony Pennington-Cross

This paper describes subprime lending in the mortgage market and how it has evolved through time. Subprime lending has introduced a substantial amount of risk-based pricing into the mortgage market by creating a myriad of prices and product choices largely determined by borrower credit history (mortgage and rental payments, foreclosures and bankruptcies, and overall credit scores) and down payment requirements. Although subprime lending still differs from prime lending in many ways, much of the growth (at least in the securitized portion of the market) has come in the least-risky (A–) segment of the market. In addition, lenders have imposed prepayment penalties to extend the duration of loans and required larger down payments to lower their credit risk exposure from high-risk loans.

Federal Reserve Bank of St. Louis Review, January/February 2006, 88(1), pp. 31-56.

INTRODUCTION AND MOTIVATION

Homeownership is one of the primary ways that households can build wealth. In fact, in 1995, the typical household held no corporate equity (Tracy, Schneider, and Chan, 1999), implying that most households find it difficult to invest in anything but their home. Because homeownership is such a significant economic factor, a great deal of attention is paid to the mortgage market.

Subprime lending is a relatively new and rapidly growing segment of the mortgage market that expands the pool of credit to borrowers who, for a variety of reasons, would otherwise be denied credit. For instance, potential borrowers who would fail credit history requirements in the standard (prime) mortgage market have greater access to credit in the subprime market. Two of the major benefits of this type of lending, then, are the increased numbers of homeowners and the opportunity for these homeowners to create wealth.
Of course, this expanded access comes with a price: At its simplest, subprime lending can be described as high-cost lending.

Borrower cost associated with subprime lending is driven primarily by two factors: credit history and down payment requirements. This contrasts with the prime market, where borrower cost is primarily driven by the down payment alone, given that minimum credit history requirements are satisfied.

Because of its complicated nature, subprime lending is simultaneously viewed as having great promise and great peril. The promise of subprime lending is that it can provide the opportunity for homeownership to those who were either subject to discrimination or could not qualify for a mortgage in the past.1 In fact, subprime lending is most prevalent in neighborhoods with high concentrations of minorities and weaker economic conditions (Calem, Gillen, and Wachter, 2004, and Pennington-Cross, 2002).

1 See Hillier (2003) for a thorough discussion of the practice of “redlining” and the lack of access to lending institutions in predominately minority areas. In fact, in the 1930s the Federal Housing Authority (FHA) explicitly referred to African Americans and other minority groups as adverse influences. By the 1940s, the Justice Department had filed criminal and civil antitrust suits to stop redlining.

Souphala Chomsisengphet is a financial economist at the Office of the Comptroller of the Currency. Anthony Pennington-Cross is a senior economist at the Federal Reserve Bank of St. Louis. The views expressed here are those of the individual authors and do not necessarily reflect the official positions of the Federal Reserve Bank of St. Louis, the Federal Reserve System, the Board of Governors, the Office of the Comptroller of the Currency, or other officers, agencies, or instrumentalities of the United States government.

© 2006, The Federal Reserve Bank of St. Louis. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

However, because poor credit history is associated with substantially more delinquent payments and defaulted loans, the interest rates for subprime loans are substantially higher than those for prime loans. Preliminary evidence indicates that the probability of default is at least six times higher for nonprime loans (loans with high interest rates) than for prime loans. In addition, nonprime loans are less sensitive to interest rate changes and, as a result, subprime borrowers have a harder time taking advantage of available cheaper financing (Pennington-Cross, 2003, and Capozza and Thomson, 2005). The Mortgage Bankers Association of America (MBAA) reports that subprime loans in the third quarter of 2002 had a delinquency rate 5-1/2 times higher than that for prime loans (14.28 versus 2.54 percent) and that the rate at which foreclosures were begun for subprime loans was more than 10 times that for prime loans (2.08 versus 0.20 percent). Therefore, the propensity of borrowers of subprime loans to fail as homeowners (default on the mortgage) is much higher than for borrowers of prime loans. This failure can lead to reduced access to financial markets, foreclosure, and loss of any equity and wealth achieved through mortgage payments and house price appreciation. In addition, any concentration of foreclosed property can adversely affect the value of property in the neighborhood as a whole.
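The multiples cited above are simple ratios of the MBAA figures. A quick check (rates in percent; this is arithmetic on the numbers quoted in the text, nothing more):

```python
# Verifying the delinquency and foreclosure-start multiples quoted in the
# text (MBAA, third quarter of 2002, rates in percent).
subprime_dq, prime_dq = 14.28, 2.54   # delinquency rates
subprime_fc, prime_fc = 2.08, 0.20    # foreclosure-start rates

print(round(subprime_dq / prime_dq, 1))  # 5.6, about 5-1/2 times
print(round(subprime_fc / prime_fc, 1))  # 10.4, "more than 10 times"
```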
Traditionally, the mortgage market set minimum lending standards based on a borrower’s income, payment history, down payment, and the local underwriter’s knowledge of the borrower. This approach can best be characterized as nonprice credit rationing. The subprime market, however, has introduced many different pricing tiers and product types, which has helped to move the mortgage market closer to price rationing, or risk-based pricing. The success of the subprime market will in part determine how fully the mortgage market eventually incorporates pure price rationing (i.e., risk-based prices for each borrower).

This paper provides basic information about subprime lending and how it has evolved, to aid the growing literature on the subprime market and related policy discussions. We use data from a variety of sources to study the subprime mortgage market: For example, we characterize the market with detailed information on 7.2 million loans leased from a private data provider called LoanPerformance. With these data, we analyze the development of subprime lending over the past 10 years and describe what the subprime market looks like today. We pay special attention to the role of credit scores, down payments, and prepayment penalties.

The results of our analysis indicate that the subprime market has grown substantially over the past decade, but the path has not been smooth. For instance, the market expanded rapidly until 1998, then suffered a period of retrenchment, but currently seems to be expanding rapidly again, especially in the least-risky segment of the subprime market (A– grade loans). Furthermore, lenders of subprime loans have increased their use of mechanisms such as prepayment penalties and large down payments to, respectively, increase the duration of loans and mitigate losses from defaulted loans.

WHAT MAKES A LOAN SUBPRIME?
From the borrower’s perspective, the primary distinguishing feature between prime and subprime loans is that the upfront and continuing costs are higher for subprime loans. Upfront costs include application fees, appraisal fees, and other fees associated with originating a mortgage. The continuing costs include mortgage insurance payments, principal and interest payments, late fees and fines for delinquent payments, and fees levied by a locality (such as property taxes and special assessments). Very little data have been gathered on the extent of upfront fees and how they differ from prime fees. But, as shown by Fortowsky and LaCour-Little (2002), many factors, including borrower credit history and prepayment risk, can substantially affect the pricing of loans.

Figure 1 compares interest rates for 30-year fixed-rate loans in the prime and the subprime markets. The prime interest rate is collected from the Freddie Mac Primary Mortgage Market Survey. The subprime interest rate is the average 30-year fixed rate at origination as calculated from the LoanPerformance data set. The difference between the two in each month is defined as the subprime premium. The premium charged to a subprime borrower is typically around 2 percentage points. It increases a little when rates are higher and decreases a little when rates are lower.

[Figure 1: Interest Rates. Interest rate at origination, 1995-2004, for prime, subprime, and the subprime premium. NOTE: Prime is the 30-year fixed interest rate reported by the Freddie Mac Primary Mortgage Market Survey. Subprime is the average 30-year fixed interest rate at origination as calculated from the LoanPerformance data set. The subprime premium is the difference between the prime and subprime rates.]

Table 1
Underwriting and Loan Grades

                            Premier Plus       Premier            A–                 B                  C                  C–
Mortgage delinquency (days) 0 x 30 x 12        1 x 30 x 12        2 x 30 x 12        1 x 60 x 12        1 x 90 x 12        2 x 90 x 12
Foreclosures                >36 months         >36 months         >36 months         >24 months         >12 months         >1 day
Bankruptcy, Chapter 7       Discharged >36 mo. Discharged >36 mo. Discharged >36 mo. Discharged >24 mo. Discharged >12 mo. Discharged
Bankruptcy, Chapter 13      Discharged >24 mo. Discharged >24 mo. Discharged >24 mo. Discharged >18 mo. Filed >12 mo.      Pay
Debt ratio                  50%                50%                50%                50%                50%                50%

SOURCE: Countrywide, downloaded from www.cwbc.com on 2/11/05.

From the lender’s perspective, the cost of a subprime loan is driven by the loan’s termination profile.2 The MBAA reports (through the MBAA delinquency survey) that 4.48 percent of subprime and 0.42 percent of prime fixed-rate loans were in foreclosure during the third quarter of 2004. According to LoanPerformance data, 1.55 percent of fixed-rate loans were in foreclosure during the same period. (See the section “The Evolution of Subprime Lending” for more details on the differences between these two data sources.) Figure 2 depicts the prime and subprime loans in foreclosure from 1998 to 2004. For comparison, the rates are all normalized to 1 in the first quarter of 1998 and only fixed-rate loans are included.

[Figure 2: Foreclosures in Progress. Rate normalized to 1 in 1998:Q1, 1998-2004, for LP-subprime, MBAA-subprime, and MBAA-prime loans. NOTE: The rate of foreclosure in progress is normalized to 1 in the first quarter of 1998. MBAA indicates the source is the Mortgage Bankers Association of America and LP indicates that the rate is calculated from the LoanPerformance ABS data set.]
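The subprime premium and the normalization used in Figure 2 are both one-line transformations. A minimal sketch with made-up series (the article’s actual inputs come from the Freddie Mac survey and the LoanPerformance data, which are not reproduced here):

```python
# Illustrative only: the rate series below are invented, not the article's.

def subprime_premium(prime_rates, subprime_rates):
    """Monthly premium: subprime rate minus prime rate, in percentage points."""
    return [s - p for p, s in zip(prime_rates, subprime_rates)]

def normalize_to_base(series, base_index=0):
    """Rescale a series so it equals 1 at the base observation
    (Figure 2 uses 1998:Q1 as the base)."""
    base = series[base_index]
    return [x / base for x in series]

prime = [7.0, 6.9, 6.8]        # hypothetical prime 30-year fixed rates
subprime = [9.1, 8.8, 8.9]     # hypothetical subprime averages
print([round(x, 1) for x in subprime_premium(prime, subprime)])  # [2.1, 1.9, 2.1]

foreclosure_rate = [0.5, 0.9, 1.5]  # hypothetical quarterly rates, percent
print(normalize_to_base(foreclosure_rate))  # [1.0, 1.8, 3.0]
```

With real data, the premium series would show the roughly 2-percentage-point gap described above, and the normalized foreclosure series would reproduce the threefold-to-fourfold increases visible in Figure 2.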
The figure shows that foreclosures on prime loans declined slightly from 1998 through the third quarter of 2004. In contrast, both measures of subprime loan performance showed substantial increases. For example, from the beginning of the sample to their peaks, the MBAA measure increased nearly fourfold and the LoanPerformance measure increased threefold. Both measures have been declining since 2003. These results show that the performance and termination profiles for subprime loans are much different from those for prime loans, and after the 2001 recession it took nearly two years for foreclosure rates to start declining in the subprime market.

It is also important to note that, after the recession, the labor market weakened but the housing market continued to thrive (high volume with steady and increasing prices). Therefore, there was little or no equity erosion caused by price fluctuations during the recession. It remains to be seen how subprime loans would perform if house prices declined while unemployment rates increased.

2 The termination profile determines the likelihood that the borrower will either prepay or default on the loan.

The rate sheets and underwriting matrices from Countrywide Home Loans, Inc. (downloaded from www.cwbc.com on 2/11/05), a leading lender and servicer of prime and subprime loans, provide some details typically used to determine whether a loan application meets subprime underwriting standards. Countrywide reports six levels, or loan grades, in its B&C lending rate sheet: Premier Plus, Premier, A–, B, C, and C–. The loan grade is determined by the applicant’s mortgage or rent payment history, bankruptcies, and total debt-to-income ratio. Table 1 provides a summary of the four underwriting requirements used to determine the loan grade. For example, to qualify for the Premier Plus grade, the applicant may have had no mortgage payment 30 or more days delinquent in the past year (0 x 30 x 12). The requirement is slowly relaxed for each loan grade: the Premier grade allows one payment to be 30 days delinquent; the A– grade allows two payments to be 30 days delinquent; the B grade allows one payment to be 60 days delinquent; the C grade allows one payment to be 90 days delinquent; and the C– grade allows two payments to be 90 days delinquent. The requirements for foreclosures are also relaxed for the lower loan grades. For example, whereas the Premier Plus grade stipulates no foreclosures in the past 36 months, the C grade stipulates no foreclosures only in the past 12 months, and the C– grade stipulates only no active foreclosures.

For most loan grades, Chapter 7 and Chapter 13 bankruptcies typically must have been discharged at least a year before application; the lowest grade, C–, requires only that Chapter 7 bankruptcies have been discharged and Chapter 13 bankruptcies at least be in repayment. All loan grades apply the same limit on the debt ratio: monthly debt servicing costs (which include all outstanding debts) may not exceed 50 percent of monthly income.

Loan grade alone does not determine the cost of borrowing (that is, the interest rate on the loan). Table 2 provides a matrix of the credit scores and loan-to-value (LTV) ratios that determine the pricing of the mortgage within each loan grade, for a 30-year loan with a 3-year fixed interest rate and a 3-year prepayment penalty.

Table 2
Underwriting and Interest Rates (percent)

                              LTV
Loan grade     Credit score   60%    70%    80%    90%    100%
Premier Plus   680            5.65   5.75   5.80   5.90   7.50
               660            5.65   5.75   5.85   6.00   7.85
               600            5.75   5.80   5.90   6.60   8.40
               580            5.75   5.85   6.00   6.90   8.40
               500            6.40   6.75   7.90   n.a.   n.a.
Premier        680            5.80   5.90   5.95   5.95   7.55
               660            5.80   5.90   6.00   6.05   7.90
               600            5.90   5.95   6.05   6.65   8.45
               580            5.90   6.00   6.15   6.95   n.a.
               500            6.55   6.90   8.05   n.a.   n.a.
A–             660            6.20   6.25   6.35   6.45   n.a.
               600            6.35   6.45   6.50   6.70   n.a.
               580            6.35   6.45   6.55   7.20   n.a.
               500            6.60   6.95   8.50   n.a.   n.a.
B              660            6.45   6.55   6.65   n.a.   n.a.
               600            6.55   6.60   6.75   n.a.   n.a.
               580            6.55   6.65   6.85   n.a.   n.a.
               500            6.75   7.25   9.20   n.a.   n.a.
C              600            6.95   7.20   n.a.   n.a.   n.a.
               580            7.00   7.30   n.a.   n.a.   n.a.
               500            7.45   8.95   n.a.   n.a.   n.a.
C–             580            7.40   7.90   n.a.   n.a.   n.a.
               500            8.10   9.80   n.a.   n.a.   n.a.

NOTE: The first three years are at a fixed interest rate, and there is a three-year prepayment penalty. n.a. indicates that no rate is quoted for that combination.
SOURCE: Countrywide California B&C Rate Sheet, downloaded from www.cwbc.com on 2/11/05.

For example, loans in the Premier Plus grade with credit scores above 680 and down payments of 40 percent or more (an LTV of 60 percent) carry an interest rate of 5.65 percent, according to the Countrywide rate sheet for California. As the down payment gets smaller (as the LTV rises), the interest rate increases: an applicant with the same credit score and a 100 percent LTV is charged a 7.50 percent rate. Note, though, that the interest rate is fairly stable until the down payment drops below 10 percent; at that point the lender begins to worry about possible negative equity positions in the near future due to appraisal error or price depreciation. It is the combination of smaller down payments and lower credit scores that leads to the highest interest rates. In addition, applicants in lower loan grades tend to pay higher interest rates than similar applicants in a higher loan grade. This extra charge reflects the marginal risk associated with missed mortgage payments, foreclosures, or bankruptcies in the past. The highest rate quoted is 9.80 percent, for a C– grade loan with the lowest credit score and a 30 percent down payment.
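The grade assignment summarized in Table 1 amounts to screening an applicant against a sequence of increasingly lenient limits. A sketch of that logic follows; the thresholds mirror the Countrywide matrix described in the text, but the data structure, function, and the assumption that a lower grade tolerates everything a higher grade does are our own simplifications, not Countrywide’s actual underwriting system:

```python
# (grade, max 30-day lates, max 60-day lates, max 90-day lates in the
#  past 12 months, months that must have passed since any foreclosure)
GRADE_RULES = [
    ("Premier Plus", 0, 0, 0, 36),
    ("Premier",      1, 0, 0, 36),
    ("A-",           2, 0, 0, 36),
    ("B",            2, 1, 0, 24),   # nesting of lower grades is assumed
    ("C",            2, 1, 1, 12),
    ("C-",           2, 1, 2, 0),    # ">1 day": no active foreclosure
]

def loan_grade(lates30, lates60, lates90, months_since_foreclosure):
    """Return the best (first) grade whose limits the applicant meets."""
    for grade, m30, m60, m90, fc_months in GRADE_RULES:
        if (lates30 <= m30 and lates60 <= m60 and lates90 <= m90
                and months_since_foreclosure > fc_months):
            return grade
    return None  # fails even C- underwriting

print(loan_grade(0, 0, 0, 999))  # Premier Plus
print(loan_grade(2, 0, 0, 40))   # A-
print(loan_grade(2, 1, 1, 13))   # C
```

A production rate sheet would then price the approved grade from the score-by-LTV matrix in Table 2; the screen above only selects the grade.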
The range of interest rates charged indicates that the subprime mortgage market actively price discriminates (that is, it uses risk-based pricing) on the basis of multiple factors: delinquent payments, foreclosures, bankruptcies, debt ratios, credit scores, and LTV ratios. In addition, stipulations are made that reflect the risks associated with the loan grade, including any prepayment penalties, the length of the loan, the flexibility of the interest rate (adjustable, fixed, or hybrid), the lien position, the property type, and other factors.

The lower the grade or credit score, the larger the down payment requirement. This requirement is imposed because loss severities are strongly tied to the amount of equity in the home (Pennington-Cross, forthcoming) and price appreciation patterns. As shown in Table 2, not all combinations of down payments and credit scores are available to the applicant. For example, Countrywide does not provide an interest rate for A– grade loans with no down payment (LTV = 100 percent). Therefore, an applicant qualifying for grade A– but having no down payment must be rejected. As a result, subprime lending rations credit through a mixture of risk-based pricing (price rationing) and minimum down payment requirements, given other risk characteristics (nonprice rationing).

In summary, in its simplest form, what makes a loan subprime is the existence of a premium, above the prevailing prime market rate, that a borrower must pay. In addition, this premium varies over time, based on the expected risks of the borrower failing as a homeowner and defaulting on the mortgage.

A BRIEF HISTORY OF SUBPRIME LENDING

It was not until the mid- to late 1990s that the strong growth of the subprime mortgage market gained national attention. Immergluck and Wiles (1999) reported that more than half of subprime refinances3 originated in predominately African-American census tracts, whereas only one tenth of prime refinances did. Nichols, Pennington-Cross, and Yezer (2005) found that credit-constrained borrowers with substantial wealth are the most likely to finance the purchase of a home with a subprime mortgage.

The growth of subprime lending in the past decade has been quite dramatic. Using data reported by the magazine Inside B&C Lending, Table 3 shows that total subprime (B&C) originations have grown from $65 billion in 1995 to $332 billion in 2003.

Table 3
Total Originations—Consolidation and Growth

        Total B&C      Top 25 B&C     Top 25 market   Total           B&C market
Year    originations   originations   share of B&C    originations    share of total
1995    $65.0          $25.5          39.3%           $639.4          10.2%
1996    $96.8          $45.3          46.8%           $785.3          12.3%
1997    $124.5         $75.1          60.3%           $859.1          14.5%
1998    $150.0         $94.3          62.9%           $1,450.0        10.3%
1999    $160.0         $105.6         66.0%           $1,310.0        12.2%
2000    $138.0         $102.2         74.1%           $1,048.0        13.2%
2001    $173.3         $126.8         73.2%           $2,058.0        8.4%
2002    $213.0         $187.6         88.1%           $2,680.0        7.9%
2003    $332.0         $310.1         93.4%           $3,760.0        8.8%

NOTE: Origination amounts are in billions of dollars.
SOURCE: Inside B&C Lending. Individual firm data are from Inside B&C Lending and are generally based on security issuance or previously reported data.

Despite this dramatic growth, the market share of subprime loans (referred to in the table as B&C) has dropped from a peak of 14.5 percent in 1997 to 8.8 percent in 2003. During this period, homeowners refinanced existing mortgages in surges as interest rates dropped. Because subprime loans tend to be less responsive to changing interest rates (Pennington-Cross, 2003), the subprime market share should tend to drop during refinancing booms.

The financial markets have also increasingly securitized subprime loans.
Table 4 provides the securitization rates, calculated as the ratio of total dollars securitized to total dollars originated in each calendar year. This number therefore only roughly approximates the actual securitization rate: it could understate or overstate the actual rate because of the packaging of seasoned loans.4 The subprime loan securitization rate has grown from less than 30 percent in 1995 to over 58 percent in 2003. The securitization rates for conventional and jumbo loans have also increased over the same time period.5 For example, conventional securitization rates have increased from close to 50 percent in 1995-97 to more than 75 percent in 2003. In addition, all or almost all government-insured loans are securitized. The subprime mortgage market has therefore become more similar to the prime market over time; in fact, the 2003 securitization rate of subprime loans is comparable to that of prime loans in the mid-1990s.

3 A refinance is a new loan that replaces an existing loan, typically to take advantage of a lower interest rate on the mortgage.

4 Seasoned loans refers to loans sold into securities after the date of origination.

5 Conventional loans are loans that are eligible for purchase by Fannie Mae and Freddie Mac because of loan size and include loans purchased by Fannie Mae and Freddie Mac, as well as those held in a portfolio or securitized through a private label. Jumbo loans are loans with amounts above the government-sponsored enterprise (conventional conforming) loan limit.
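The rate just defined is a single ratio. A minimal sketch follows; the dollar amounts are hypothetical (chosen only so the ratio matches the 58.7 percent subprime figure Table 4 reports for 2003), not figures taken from the article:

```python
# Securitization rate as defined in the text: dollars of securities issued
# divided by dollars originated in the same calendar year, in percent.
def securitization_rate(securities_issued, originations):
    # Can exceed 100 percent when seasoned loans (originated in earlier
    # years) are packaged into current-year securities, as with FHA/VA.
    return 100.0 * securities_issued / originations

# Hypothetical inputs, in billions of dollars.
print(round(securitization_rate(195.0, 332.0), 1))  # 58.7
```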
Table 4
Securitization Rates

                     Loan type
Year    FHA/VA    Conventional    Jumbo    Subprime
1995    101.1%    45.6%           23.9%    28.4%
1996    98.1%     52.5%           21.3%    39.5%
1997    100.7%    45.9%           32.1%    53.0%
1998    102.3%    62.2%           37.6%    55.1%
1999    88.1%     67.0%           30.1%    37.4%
2000    89.5%     55.6%           18.0%    40.5%
2001    102.5%    71.5%           31.4%    54.7%
2002    92.6%     72.8%           32.0%    57.6%
2003    94.9%     75.9%           35.1%    58.7%

NOTE: Subprime securities include both MBS and ABS backed by subprime loans. Securitization rate = securities issued divided by originations in dollars.
SOURCE: Inside MBS & ABS.

Many factors have contributed to the growth of subprime lending. Most fundamentally, it became legal. The ability to charge high rates and fees to borrowers was not possible until the Depository Institutions Deregulation and Monetary Control Act (DIDMCA) of 1980, which preempted state interest rate caps. The Alternative Mortgage Transaction Parity Act (AMTPA) of 1982 permitted the use of variable interest rates and balloon payments. These laws opened the door for the development of a subprime market, but subprime lending would not become a viable large-scale lending alternative until the Tax Reform Act of 1986 (TRA). The TRA increased the demand for mortgage debt because it prohibited the deduction of interest on consumer loans yet allowed interest deductions on mortgages for a primary residence as well as one additional home. This made even high-cost mortgage debt cheaper than consumer debt for many homeowners. In environments of low and declining interest rates, such as the late 1990s and early 2000s, cash-out refinancing6 becomes a popular mechanism for homeowners to access the value of their homes.

6 Cash-out refinancing indicates that the new loan is larger than the old loan and the borrower receives the difference in cash.
In fact, slightly over one-half of subprime loan originations have been for cash-out refinancing.7

In addition to changes in the law, market changes also contributed to the growth and maturation of subprime loans. In 1994, for example, interest rates increased and the volume of originations in the prime market dropped. Mortgage brokers and mortgage companies responded by looking to the subprime market to maintain volume. The growth through the mid-1990s was funded by issuing mortgage-backed securities (MBS, sometimes also referred to as private label or asset-backed securities [ABS]). In addition, subprime loans were originated mostly by nondepository and monoline finance companies. During this time period, subprime mortgages were relatively new and apparently profitable, but the performance of the loans in the long run was not known. By 1997, delinquent payments and defaulted loans were above projected levels, and an accounting construct called “gains-on-sales accounting” magnified the cost of the unanticipated losses.

7 One challenge the subprime industry will face in the future is the need to develop business plans to maintain volume when interest rates rise. This will likely include a shift back to home equity mortgages and other second-lien mortgages.

Table 5
Top Ten B&C Originators, Selected Years

Rank   2003                                2002
1      Ameriquest Mortgage, CA             Household Finance, IL
2      New Century, CA                     CitiFinancial, NY
3      CitiFinancial, NY                   Washington Mutual, WA
4      Household Finance, IL               New Century, CA
5      Option One Mortgage, CA             Option One Mortgage, CA
6      First Franklin Financial Corp, CA   Ameriquest Mortgage, DE
7      Washington Mutual, WA               GMAC-RFC, MN
8      Countrywide Financial, CA           Countrywide Financial, CA
9      Wells Fargo Home Mortgage, IA       First Franklin Financial Corp, CA
10     GMAC-RFC, MN                        Wells Fargo Home Mortgage, IA

Rank   2001                                2000
1      Household Finance, IL               CitiFinancial Credit Co, MO
2      CitiFinancial, NY                   Household Financial Services, IL
3      Washington Mutual, WA               Washington Mutual, WA
4      Option One Mortgage, CA             Bank of America Home Equity Group, NC
5      GMAC-RFC, MN                        GMAC-RFC, MN
6      Countrywide Financial, CA           Option One Mortgage, CA
7      First Franklin Financial Corp, CA   Countrywide Financial, CA
8      New Century, CA                     Conseco Finance Corp. (Green Tree), MN
9      Ameriquest Mortgage, CA             First Franklin, CA
10     Bank of America, NC                 New Century, CA

Rank   1996
1      Associates First Capital, TX
2      The Money Store, CA
3      ContiMortgage Corp, PA
4      Beneficial Mortgage Corp, NJ
5      Household Financial Services, IL
6      United Companies, LA
7      Long Beach Mortgage, CA
8      EquiCredit, FL
9      Aames Capital Corp., CA
10     AMRESCO Residential Credit, NJ

NOTE: B&C loans are defined as less-than-A-quality non-agency (private label) paper loans secured by real estate. Subprime mortgage and home equity lenders were asked to report their origination volume by Inside B&C Lending. Wholesale purchases, including loans closed by correspondents, are counted.
SOURCE: Inside B&C Lending.
In hindsight, many lenders had underpriced subprime mortgages in the competitive and high-growth market of the early to mid-1990s (Temkin, Johnson, and Levy, 2002). By 1998, the effects of these events spilled over into the secondary market: MBS prices dropped, and lenders had difficulty finding investors to purchase the high-risk tranches. At about the same time, the 1998 Asian financial crisis greatly increased the cost of borrowing and again reduced liquidity in all real estate markets. This impact can be seen in Table 4, where the securitization rate of subprime loans drops from 55.1 percent in 1998 to 37.4 percent in 1999. In addition, Table 3 shows that originations by the top 25 firms dropped from $105.6 billion in 1999 to $102.2 billion in 2000. Both of these trends proved only transitory, as both volume and securitization rates recovered in 2000-03.

Partially because of these events, the structure of the market also changed dramatically through the 1990s and early 2000s. The rapid consolidation of the market is shown in Table 3: the market share of the top 25 firms making subprime loans grew from 39.3 percent in 1995 to over 90 percent in 2003. Many firms that started the subprime industry either have failed or were purchased by larger institutions. Table 5 shows the top 10 originators for 2000-03 and 1996. From 2000 forward the list of top originators is fairly stable. For example, CitiFinancial, a member of Citigroup, appears each year, as do Washington Mutual and Countrywide Financial. The largest firms increasingly dominated the smaller firms from 2000 through 2003, when the market share of the top 25 originators increased from 74 percent to 93 percent. In contrast, many of the firms in the top 25 in 1996 do not appear in the later time periods, owing to a mixture of failures and mergers.
For example, Associates First Capital was acquired by Citigroup, which at least partially explains Citigroup's position as one of the top originators and servicers of subprime loans. Long Beach Mortgage was purchased by Washington Mutual, one of the nation's largest thrifts. United Companies filed for bankruptcy, and Aames Capital Corporation was delisted after significant financial difficulties. Household Financial Services, one of the original finance companies, has remained independent and survived the period of rapid consolidation. In fact, in 2003 it was the fourth largest originator and number two servicer of loans in the subprime industry.

THE EVOLUTION OF SUBPRIME LENDING

This section provides a detailed picture of the subprime mortgage market and how it has evolved from 1995 through 2004. We use individual loan data leased from LoanPerformance. The data track securities issued in the secondary market. Data sources include issuers, broker dealers/deal underwriters, servicers, master servicers, bond and trust administrators, trustees, and other third parties. As of March 2003, more than 1,000 loan pools were included in the data. LoanPerformance estimates that the data cover over 61 percent of the subprime market. The data set therefore represents the segment of the subprime market that is securitized and could potentially differ from the subprime market as a whole. For example, the average rate of subprime loans in foreclosure reported by the LoanPerformance data is 35 percent of the rate reported by the MBAA. The MBAA, which notes that its sample of loans is not representative of the market, classifies loans as subprime based on lender name. Its survey of lenders of prime and subprime loans includes approximately 140 participants. As will be noted later in this section, the LoanPerformance data set is dominated by the A–, or least risky, loan grade, which may in part explain the higher rate of foreclosures in the MBAA data.
In addition, the demand for subprime securities should impact product mix. The LoanPerformance data set provides a host of detailed information about individual loans that is not available from other data sources. (For example, the MBAA data report delinquency and foreclosure rates but do not indicate any information about the credit score of the borrower, down payment, existence of prepayment penalties, or interest rate of the loan.8) The data set includes many of the standard loan application variables such as the LTV ratio, credit score, loan amount, term, and interest rate type. Some "cleaning" of the data is conducted. For example, in each tabulation, only available data are used. Therefore, each figure may represent a slightly different sample of loans. In addition, to help make the results more comparable across figures, only adjustable- and fixed-rate loans to purchase or refinance a home (with or without cash out) are included, from January 1995 through December 2004. But because of the delay in data reporting, the estimates for 2004 will not include all loans from that year.

Volume

Although the subprime mortgage market emerged in the early 1980s with the adoption of DIDMCA, AMTPA, and TRA, subprime lending grew rapidly only after 1995, when MBS with subprime-loan collateral became more attractive to investors. Figure 3 illustrates this pattern using our LoanPerformance data sample. In 1995, for example, the number of subprime fixed-rate mortgages (FRMs) originated was just slightly above 62,000 and the number of subprime adjustable-rate mortgages (ARMs) originated was just above 21,000.
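The per-tabulation cleaning step described earlier in this section (each figure uses whatever records are non-missing for its own variable) can be sketched in pandas; the column names and records here are illustrative stand-ins, not the LoanPerformance schema:

```python
import pandas as pd

# Hypothetical loan-level records; field names are made up for
# illustration, not the vendor's actual schema.
loans = pd.DataFrame({
    "orig_year": [1995, 1995, 1996, 1996],
    "fico":      [580, None, 610, 655],
    "ltv":       [85.0, 90.0, None, 80.0],
})

# "In each tabulation, only available data are used": each figure
# keeps the rows that are non-missing for its own variable, so the
# samples behind different figures need not be identical.
fico_sample = loans.dropna(subset=["fico"])
ltv_sample = loans.dropna(subset=["ltv"])
print(len(fico_sample), len(ltv_sample))           # 3 3
print(fico_sample.index.equals(ltv_sample.index))  # False: different rows
```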
Since then, subprime lending has increased substantially, with the number of FRM originations peaking at almost 780,000 and ARM originations peaking (and surpassing FRMs) at over 866,000.9 The subprime market took a temporary downturn when the total number of FRM subprime originations declined during the 1998-2000 period; this observation is consistent with our earlier brief history discussion and the downturn in originations reported by Inside Mortgage Finance (2004) and shown in Table 3. Since 2000, however, the subprime market has resumed its momentum.

8 An additional source of information on the subprime market is a list of lenders published by the United States Department of Housing and Urban Development (HUD) Policy Development and Research (PD&R). This list has varied from a low of 51 lenders in 1993 to a high of 256 in 1996; in 2002, the last year available, 183 subprime lenders are identified. The list can then be matched to the Home Mortgage Disclosure Act (HMDA) data set. The list is compiled by examining trade publications and HMDA data analysis. Lenders with high denial rates and a high fraction of home refinances are potential candidates. The lenders are then called to confirm that they specialize in subprime lending. As a result, loans identified as subprime using the HUD list include only firms that specialize in subprime lending (not full-service lenders); many subprime loans will be excluded and some prime loans will be included in the sample. Very little detail beyond the interest rate of the loan and whether the rate is adjustable is included. For example, the existence of prepayment penalties, a unique and key feature of subprime lending, is unknown. Still, this lender list has proved useful in characterizing the neighborhoods in which these loans are originated. See, for example, Pennington-Cross (2002) and Calem, Gillen, and Wachter (2004).
In fact, from 2002 to 2003 the LoanPerformance data show a 62 percent increase and the Inside Mortgage Finance data show a 56 percent increase in originations. During the late 1990s, house prices increased and interest rates dropped to some of the lowest rates in 40 years, thus providing low-cost access to the equity in homes. Of the total number of subprime loans originated, just over one-half were for cash-out refinancing, whereas more than one-third were for a home purchase (see Figure 4). In 2003, for example, the total number of loans for cash-out refinancing was over 560,000, the number of loans for a home purchase totaled more than 820,000, and loans for no-cash-out refinancing amounted to just under 250,000. In the prime market, Freddie Mac estimated that, in 2003, 36 percent of loans for refinancing took at least 5 percent of the loan in cash (downloaded from the Cash-Out Refi Report at www.freddiemac.com/news/finance/data.html on 11/4/04). This estimate is in contrast with typical behavior in the subprime market, which always has had more cash-out refinancing than no-cash-out refinancing.

Given the characteristics of an application, lenders of subprime loans typically identify borrowers and classify them in separate risk categories. Figure 5 exhibits four risk grades, with A– being the least risky and D being the riskiest grade.10 The majority of the subprime loan originations in this data set are classified into the lowest identified risk category (grade A–), particularly after 1998. In addition, the proportion of grade A– loans to the total number of loans has continuously increased, from slightly over 50 percent in 1995 to approximately 84 percent in 2003. On the other hand, the shares of grade B, C, and D loans have all declined since 2000. Overall, these observations illustrate that, since 1998-99, the subprime market (or at least the securitized segment of the market) has been expanding in its least-risky segment. It seems likely, then, that the move toward the A– segment of subprime loans is in reaction to (i) the events of 1998, (ii) the difficulty in correctly pricing the higher-risk segments (B, C, and D credit grades), and, potentially, (iii) changes in the demand for securities for subprime loans in the secondary market.

9 Similarly, Nichols, Pennington-Cross, and Yezer (2005) note that the share of subprime mortgage lending in the overall mortgage market grew from 0.74 percent in the early 1990s to almost 9 percent by the end of the 1990s.

10 Loan grades are assigned by LoanPerformance and reflect only the rank ordering of any specific firm's classifications. Because these classifications are not uniform, there will be mixing of loan qualities across grades. Therefore, these categories will likely differ from the Countrywide examples used earlier.

[Figure 3: Number of Loans Originated (adjustable rate vs. fixed rate), 1995-2003. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 4: Number of Loans Originated by Purpose (purchase; refinance, cash out; refinance, no cash out), 1995-2003. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 5: Number of Loans Originated by Grade (A–, B, C, D), 1995-2003. SOURCE: LoanPerformance ABS securities database of subprime loans.]

Credit Scores

On average, ARM borrowers have lower credit scores than FRM borrowers (see Figure 6).
In 2003, for example, the average FICO score (a credit score created by Fair Isaac Corporation to measure consumer creditworthiness) for ARMs is roughly 50 points lower than for FRMs (623 versus 675). During the 1990s, average credit scores tended to decline each year, particularly for ARM borrowers; but since 2000, credit scores have tended to improve each year. Hence, it appears that subprime lenders expanded during the 1990s by extending credit to less-creditworthy borrowers. Subsequently, the lower credit quality unexpectedly instigated higher delinquency and default rates (see also Temkin, Johnson, and Levy, 2002). With the improved credit quality since 2000, the average FICO score has jumped from just under 622 in 2000 to just over 651 in 2004 (closing in on the 669 average conventional FICO reported by Nichols, Pennington-Cross, and Yezer, 2005). As shown in Figure 7, lenders of subprime loans are increasing the number of borrowers with scores in the 500-600 and 700-800 ranges and decreasing the number with scores below 500. Specifically, from 2000 to 2003, the share of borrowers with FICO scores between 700 and 800 rose from approximately 14 percent to 22 percent.

[Figure 6: Average Credit Score (FICO), adjustable rate vs. fixed rate, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 7: Share of Loans by Credit Score (FICO ≤ 500, 500-600, 600-700, 700-800, ≥ 800), 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]
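The credit-score grouping behind Figure 7 amounts to a simple bucketing step. A sketch with made-up scores; the half-open bin convention is an assumption, since the figure's own labels mix ≤ and <:

```python
import pandas as pd

# Made-up FICO scores; bin edges follow Figure 7's labels, treating
# each bin as half-open on the left (an assumed convention).
scores = pd.Series([480, 552, 601, 640, 705, 760, 810])
bins = [300, 500, 600, 700, 800, 850]
labels = ["<=500", "500-600", "600-700", "700-800", ">=800"]

buckets = pd.cut(scores, bins=bins, labels=labels)
# Share of loans in each bucket, in percent, as in Figure 7.
shares = buckets.value_counts(normalize=True).sort_index() * 100
print(shares.round(1))
```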
[Figure 8: Loan Amounts by Credit Score, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 9: House Prices by Credit Score, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 10: Loan-to-Value Ratio (LTV), adjustable rate vs. fixed rate, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

Moreover, lenders have on average provided smaller loans to higher-risk borrowers, presumably to limit risk exposure (see Figure 8). As noted previously, these changes in underwriting patterns are consistent with lenders looking for new ways to limit risk exposure. In addition, although loan amounts have increased for all borrowers, the amounts have increased the most, on average, for borrowers with better credit scores. Also, as expected, borrowers with the best credit scores purchased the most expensive houses (see Figure 9).

Down Payment

Figure 10 depicts average LTV ratios for subprime loan originations over a 10-year period. The primary finding here is that down payments for FRMs were reduced throughout the 1990s but have increased steadily since. (Note that this change in business strategy occurs just after the 1998 crisis.) In contrast, over the same period, down payments for ARMs were reduced.
On first inspection, it may look like lenders are adding more risk by originating more ARMs with higher LTVs; however, this change primarily reflects borrowers with better credit scores and more loans classified as A–. Therefore, this is additional evidence that lenders of subprime loans reacted to the losses sustained in 1998 by moving to less-risky loans, primarily to borrowers with higher credit scores. As shown in Figure 11, this shift in lending strategy was accomplished by (i) steadily reducing loans with a large down payment (LTV ≤ 70), (ii) decreasing loans with negative equity (LTV > 100), and (iii) increasing loans with a 10 percent down payment. Overall, lenders of subprime loans have been increasing loan amounts, shifting the distribution of down payments, and increasing credit score requirements, on average, since 2000. In general, borrowers with larger down payments tend to purchase more expensive homes (Figure 12). By tying the amount of the loan to the size of the down payment, lenders limit their exposure to credit risk.

[Figure 11: Share of Loans by LTV (≤ 70, 70-80, 80-90, 90-100, > 100), 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 12: House Prices by LTV, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]
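The LTV arithmetic used throughout this section ties the loan amount to the house value, so down payment and LTV are two views of the same number. A minimal sketch with made-up figures:

```python
# LTV = loan amount / house value. A 10 percent down payment on a
# $200,000 house therefore corresponds to an LTV of 90.
# All dollar figures are made-up illustrations.
house_value = 200_000
down_payment = 20_000

loan_amount = house_value - down_payment
ltv = 100 * loan_amount / house_value
print(ltv)  # 90.0
```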
The LTV-FICO Trade-off

In Figure 13, we observe that borrowers with the best credit scores tend to also provide the largest down payments. But, beyond this observation, there seems to be little correlation between credit scores and down payments. In contrast, Figure 14 shows a clear ordering of down payments (LTV ratios) by loan grade. Loans in higher loan grades have smaller down payments on average. In fact, over time, especially after 2000, the spread tends to increase. This finding is consistent with the philosophy that loans identified as being more risky must compensate lenders through larger down payments. This helps to reduce the credit risk associated with trigger events, such as periods of unemployment and changes in household structure, which can make it difficult for borrowers to make timely payments. Consistent with the loan grade classifications, Figure 15 shows that lower-grade loans have lower credit scores. Therefore, as loans move to better grades, credit scores improve and down payments decrease.

[Figure 13: LTV by Credit Score, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

INTEREST RATES

This section examines patterns in the interest rate that borrowers are charged at the origination of the loan. This rate does not reflect the full cost of borrowing because it does not include any fees and upfront costs that are borne by the borrower. In addition, the borrower can pay extra fees, called points, to lower the interest rate. Despite these stipulations, we are able to find relationships between the observed interest rates and underwriting characteristics.
There is not much difference in the average interest rate (the interest rate on the loan excluding all upfront and continuing fees) at origination for FRMs and ARMs (see Figure 16). But both product types have experienced a large drop in interest rates, from over 10 percent in 2000 to approximately 7 percent in 2004.

Underwriting standards usually rely heavily on credit history and LTVs to determine the appropriate risk-based price. In Figures 17 and 18 we see evidence of risk-based pricing based on borrower credit scores and, to some small extent, on borrower down payments. For example, borrowers with the highest FICO scores tend to receive a lower interest rate. In 2004, average interest rates vary by over 2 percentage points from the highest to the lowest FICO scores. This range of interest rates does not hold when pricing is based solely on down payments. In fact, the striking result from Figure 18 is that, on average, the pricing of subprime loans is very similar for all down-payment sizes, except for loans with LTVs greater than 100, which pay a substantial premium.

[Figure 14: LTV by Loan Grade (A–, B, C, D), 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 15: Credit Score by Loan Grade (A–, B, C, D), 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 16: Interest Rates, adjustable rate vs. fixed rate, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]
One way to interpret these results is that lenders have found good mechanisms to compensate for the risks of smaller down payments and, as a result, down payments in themselves do not lead to higher borrower costs. However, if the equity in the home is negative, no sufficient compensating factor can typically be found to reduce expected losses and maintain pricing parity. The borrower has a financial incentive to default on the loan because the loan amount is larger than the value of the home. As a result, the lender must increase the interest rate to decrease its loss if a default occurs.

Figure 19 shows the average interest rate by loan grade. The riskiest borrowers (grade D) receive the highest interest rate, whereas the least-risky borrowers (grade A–) receive the lowest interest rate. Interestingly, although interest rates overall changed dramatically, the spread between the rates by grade has remained nearly constant after 1999. This may indicate that the risks, and hence the need for risk premiums, are in levels, not proportions, across risk grades.

[Figure 17: Interest Rates by Credit Score, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

Prepayment Penalties

It is beyond the scope of this paper to define specific examples of predatory lending, but prepayment penalties have been associated with predatory practices. A joint report by the U.S. Department of Housing and Urban Development (HUD) and the U.S. Department of Treasury (Treasury) (2002) defined predatory lending as lending that strips home equity and places borrowers at an increased risk of foreclosure.
The characteristics include excessive interest rates and fees, the use of single-premium credit life insurance, and prepayment penalties that provide no compensating benefit, such as a lower interest rate or reduced fees. In addition, some public interest groups, such as the Center for Responsible Lending, believe that prepayment penalties are by their very nature predatory because they reduce borrower access to lower rates (Goldstein and Son, 2003). Both Fannie Mae and Freddie Mac changed their standards to prohibit loans (i.e., they will not purchase them) that include some types of prepayment penalties. Effective October 1, 2002, Freddie Mac no longer allowed the purchase of subprime loans with a prepayment penalty period longer than three years; loans originated before that date were not affected by the restriction (see www.freddiemac.com/singlefamily/ppmqanda.html, downloaded on 2/14/05).

[Figure 18: Interest Rates by LTV, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 19: Interest Rates by Loan Grade (A–, B, C, D), 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]
If a subprime loan stipulates a prepayment penalty, Fannie Mae will consider the loan for purchase only if (i) the borrower receives a reduced interest rate or reduced fees, (ii) the borrower is provided an alternative mortgage choice, (iii) the nature of the penalty is disclosed to the borrower, and (iv) the penalty cannot be charged if the borrower defaults on the loan and the note is accelerated (www.fanniemae.com/newsreleases/2000/0710.jhtml).11 Therefore, we may expect to see a decline in the use of prepayment penalties starting in 2000 and 2002, at least in part due to changes in the demand for subprime securities.

Despite these concerns, prepayment penalties have become a very important part of the subprime market. When interest rates are declining or steady, subprime loans tend to be prepaid at elevated rates compared with prime loans (Pennington-Cross, 2003, and UBS Warburg, 2002). In addition, subprime loans tend to default at elevated rates. As a result, the expected life of an average subprime loan is much shorter than that of a prime loan. Therefore, there are fewer good (nonterminated) loans to generate income for an investor to compensate for terminated (defaulted and prepaid) loans. One mechanism to reduce the break-even price on these fast-terminating loans is to use prepayment penalties (Fortowsky and LaCour-Little, 2002).

11 When a borrower defaults, the lender typically will send an acceleration note informing the borrower that the mortgage contract has been violated and all of the remaining balance and fees on the loan are due immediately.

[Figure 20: Share of Loans with a Prepayment Penalty, adjustable rate vs. fixed rate, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]
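The break-even intuition behind prepayment penalties can be illustrated with a deliberately simplified toy calculation. Every number below (rates, expected lives, penalty size) is an assumption for illustration, not a figure from the paper, and simple rather than compound interest is used:

```python
# Toy sketch: a loan with a shorter expected life earns less interest
# income for the investor; a prepayment penalty recovers part of the
# shortfall when the loan terminates early. All inputs are assumed.
balance = 100_000.0
note_rate = 0.095        # assumed subprime note rate
prime_life = 7.0         # assumed expected life of a prime loan, years
subprime_life = 3.0      # assumed (shorter) subprime expected life
penalty_share = 0.03     # assumed penalty: 3 percent of balance at payoff

income_if_prime_life = balance * note_rate * prime_life
income_no_penalty = balance * note_rate * subprime_life
income_with_penalty = income_no_penalty + balance * penalty_share

print(income_if_prime_life, income_no_penalty, income_with_penalty)
```

The penalty narrows, but does not close, the gap between the fast-terminating loan and the longer-lived one, which is consistent with penalties being one of several compensating mechanisms rather than a full offset.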
Although this same mechanism is used in the prime market, it is not as prevalent. Figure 20 shows that, prior to 2000, the use of prepayment penalties grew quickly. Substantially more ARMs than FRMs face a prepayment penalty: for loans originated in 2000-02, approximately 80 percent of ARMs were subject to a prepayment penalty, compared with approximately 45 percent of FRMs. Equally important, the share of ARMs and FRMs subject to a prepayment penalty rose dramatically from 1995 to 2000. In fact, at the end of the five-year period, ARMs were five times as likely and FRMs twice as likely to have prepayment penalties. This rapid increase can at least partially be attributed to regulatory changes in the interpretation of the 1982 AMTPA by the Office of Thrift Supervision (OTS). Before 1996, the OTS interpreted AMTPA as allowing states to restrict finance companies (which make many of the subprime loans) from using prepayment penalties, but it exempted regulated federal depository institutions from these restrictions. In 1996, the OTS allowed finance companies the same exemption. However, this position was short lived, and the OTS returned to its prior interpretation in 2002.

In 2003 and 2004, prepayment penalties declined for ARMs and held steady for FRMs. This was likely caused by (i) the introduction of predatory lending laws in many states and cities (typically these include ceilings on interest rates and upfront fees, restrictions on prepayment penalties, and other factors)12; (ii) the evolving position of Fannie Mae and Freddie Mac on prepayment penalties; and (iii) the reversed OTS interpretation of AMTPA in 2002 (see 67 Federal Register 60542, September 26, 2002), which again made state laws apply to finance companies just as they had prior to 1996.

The share of loans containing a prepayment penalty is lowest among borrowers with the highest, or best, FICO scores (see Figure 21). In 2003, for instance, about 20 percent of borrowers with a FICO score above 800 were subject to a prepayment penalty, whereas over 60 percent of borrowers with a FICO score below 700 faced such a penalty. To understand the prevalence of these penalties, one must also know how long they last. Figure 22 shows that the length of the penalty has generally been declining since 2000. Again, the introduction and threat of predatory lending laws and the Freddie Mac purchase requirement (that the term of a prepayment penalty be no more than three years) are likely playing a role in this trend. In addition, FRMs tend to have much longer prepayment penalties. For example, in 2003, the average penalty lasted almost three years for FRMs and a little over two years for ARMs, both of which meet current Freddie Mac guidelines.

12 For more details on predatory lending laws that are both pending and in force, the MBAA has a "Predatory Lending Law Resource Center" available at www.mbaa.org/resources/predlend/ and the Law Offices of Herman Thordsen provide detailed summaries of predatory lending laws at www.lendinglaw.com/predlendlaw.htm.

[Figure 21: Share of Loans with a Prepayment Penalty by Credit Score, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]

[Figure 22: Length of Prepayment Penalty (months), adjustable rate vs. fixed rate, 1995-2004. SOURCE: LoanPerformance ABS securities database of subprime loans.]
CONCLUSION

As the subprime market has evolved over the past decade, it has experienced two distinct periods. The first period, from the mid-1990s through 1998-99, is characterized by rapid growth, with much of the growth in the most-risky segments of the market (B and lower grades). In the second period, 2000 through 2004, volume again grew rapidly as the market became increasingly dominated by the least-risky loan classification (A– grade loans). In particular, the subprime market has shifted its focus since 2000 by providing loans to borrowers with higher credit scores, allowing larger loan amounts, and lowering the down payments for FRMs. Furthermore, the subprime market has reduced its risk exposure by limiting the loan amount of higher-risk loans and imposing prepayment penalties on the majority of ARMs and low-credit-score loans. The use of prepayment penalties has declined in the past few years because the securities market has adjusted to public concern about predatory lending and the regulation of finance companies has changed.

The evidence also shows that the subprime market has provided a substantial amount of risk-based pricing in the mortgage market by varying the interest rate of a loan based on the borrower's credit history and down payment. In general, we find that lenders of subprime loans typically require larger down payments to compensate for the higher risk of lower-grade loans. However, even with these compensating factors, borrowers with low credit scores still pay the largest premiums.

REFERENCES

Calem, Paul; Gillen, Kevin and Wachter, Susan. "The Neighborhood Distribution of Subprime Mortgage Lending." Journal of Real Estate Finance and Economics, 2004, 29(4), pp. 393-410.

Capozza, Dennis R. and Thomson, Thomas A. "Subprime Transitions: Long Journey into Foreclosure." Presented at the American Real Estate and Urban Economics Annual Meeting, Philadelphia, PA, January 2005.

Fortowsky, Elaine B. and LaCour-Little, Michael. "An Analytical Approach to Explaining the Subprime-Prime Mortgage Spread." Presented at the Georgetown University Credit Research Center Symposium on Subprime Lending, 2002.

Goldstein, Debbie and Son, Stacey Strohauer. "Why Prepayment Penalties Are Abusive in Subprime Home Loans." Center for Responsible Lending Policy Paper No. 4, April 2, 2003.

Hillier, Amy E. "Spatial Analysis of Historical Redlining: A Methodological Exploration." Journal of Housing Research, November 2003, 14(1), pp. 137-67.

Immergluck, Daniel and Wiles, Marti. Two Steps Back: The Dual Mortgage Market, Predatory Lending, and the Undoing of Community Development. Chicago: The Woodstock Institute, 1999.

Inside Mortgage Finance. The 2004 Mortgage Market Statistical Annual. Washington, DC: 2004.

Nichols, Joseph; Pennington-Cross, Anthony and Yezer, Anthony. "Borrower Self-Selection, Underwriting Costs, and Subprime Mortgage Credit Supply." Journal of Real Estate Finance and Economics, March 2005, 30(2), pp. 197-219.

Pennington-Cross, Anthony. "Subprime Lending in the Primary and Secondary Markets." Journal of Housing Research, 2002, 13(1), pp. 31-50.

Pennington-Cross, Anthony. "Credit History and the Performance of Prime and Nonprime Mortgages." Journal of Real Estate Finance and Economics, November 2003, 27(3), pp. 279-301.

Pennington-Cross, Anthony. "The Value of Foreclosed Property." Journal of Real Estate Research (forthcoming).

Temkin, Kenneth; Johnson, Jennifer E.H. and Levy, Diane. Subprime Markets, the Role of GSEs, and Risk-Based Pricing. Washington, DC: U.S. Department of Housing and Urban Development, Office of Policy Development and Research, March 2002.

Tracy, Joseph; Schneider, Henry and Chan, Sewin.
"Are Stocks Over-Taking Real Estate in Household Portfolios?" Current Issues in Economics and Finance, Federal Reserve Bank of New York, April 1999, 5(5).

UBS Warburg. "Credit Refis, Credit Curing, and the Spectrum of Mortgage Rates." UBS Warburg Mortgage Strategist, May 21, 2002, pp. 15-27.

U.S. Department of Housing and Urban Development and U.S. Department of Treasury, National Predatory Lending Task Force. Curbing Predatory Home Mortgage Lending. Washington, DC: 2002.

Are the Causes of Bank Distress Changing? Can Researchers Keep Up?

Thomas B. King, Daniel A. Nuxoll, and Timothy J. Yeager

Since 1990, the banking sector has experienced enormous legislative, technological, and financial changes, yet research into the causes of bank distress has slowed. One consequence is that traditional supervisory surveillance models may not capture important risks inherent in the current banking environment. After reviewing the history of these models, the authors provide empirical evidence that the characteristics of failing banks have changed in the past ten years and argue that the time is right for new research that employs new empirical techniques. In particular, dynamic models that use forward-looking variables and address various types of bank risk individually are promising lines of inquiry. Supervisory agencies have begun to move in these directions, and the authors describe several examples of this new generation of early-warning models that are not yet widely known among academic banking economists.

Federal Reserve Bank of St. Louis Review, January/February 2006, 88(1), pp. 57-80.

Understanding the causes of insolvency at financial institutions is important for both academic and regulatory reasons, and the effort to model bank deterioration was once a vibrant area of study in empirical finance. Significant advances were made between the late 1960s and late 1980s.
Since then, research on the characteristics of banks headed for trouble has slowed considerably, reflecting a sense among researchers that the causes of banking problems are unchanging and well understood. In this article, we argue that this complacency may be unwarranted.1 The rapid pace of technological and institutional change in the banking sector in recent years suggests that the dominant models may no longer accurately represent the nature of bank deterioration. Indeed, the few observations that we have of recent bank failures provide evidence consistent with this hypothesis.

The changes in the banking environment call for renewed research into the causes of bank distress. The federal supervisory agencies have established research programs pursuing this goal, but—because regulatory banking economists often work on projects with confidential data and because many ongoing projects are not formally disclosed to the public—it can be difficult for outside economists to benefit from this work. By describing some efforts that are currently underway to develop new early-warning models at the Federal Reserve and Federal Deposit Insurance Corporation (FDIC), we attempt to bridge that gap in the hope of stimulating more research in this area beyond that done by government agencies. One strand of the new monitoring devices attempts to complement traditional early-warning models by adopting a more theoretical approach using forward-looking variables. Another strand isolates and models unique banking risks to facilitate the risk-focused approach to bank supervision. A common objective of these models is an increased flexibility that will allow off-site surveillance to better keep pace with the dynamic banking environment going forward.

1 Note that we are not claiming that bank regulators have grown complacent, only that the academic community has focused its attention away from this issue.

Thomas B. King is an economist at the Federal Reserve Bank of St. Louis; Daniel A. Nuxoll is an economist at the Division of Insurance and Research, Federal Deposit Insurance Corporation; and Timothy J. Yeager is the Arkansas Bankers’ Association Chair of Banking at the University of Arkansas. (Yeager was an assistant vice president and economist at the Federal Reserve Bank of St. Louis at the time this article was written.) The authors thank Alton Gilbert, Hui Guo, Andy Meyer, Greg Sierra, and David Wheelock for helpful comments. The views expressed are those of the authors and are not necessarily official positions of the Federal Reserve Bank of St. Louis, the Board of Governors of the Federal Reserve System, or the FDIC.

© 2006, The Federal Reserve Bank of St. Louis. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

SURVEILLANCE MODELS IN HISTORICAL CONTEXT

Federal bank supervisors primarily use limited-dependent-variable regression models for off-site monitoring. Although we argue later that these models (like all models) have shortcomings, they reflect years of advancement in academic research, econometric modeling, and computer technology. In this section, we describe the evolution of off-site surveillance models, paying particular attention to the link between academic research and supervisory applications. Table 1 summarizes the evolution of various off-site surveillance systems at the Federal supervisory agencies from the mid-1970s to the present. The systems transitioned from simple screens, to hybrid models, to the econometric models used today.
Discriminant Analysis and Supervisory Screens

During the 1960s, several studies attempted to determine the usefulness of various financial ratios in predicting bankruptcy in non-bank firms. In his seminal article, Altman (1968) used discriminant analysis over five variables to determine the characteristics of manufacturing firms headed for bankruptcy. His paper ushered in a wave of research applying similar methodology specifically to depository institutions, including Stuhr and van Wicklen (1974), Sinkey (1975, 1978), Altman (1977), and Rose and Scott (1978). Much of this early research on bank distress was conducted by economists within supervisory agencies, and some of it was specifically directed toward the establishment of an off-site early-warning model for use in everyday supervision.

Because discrete-response-regression techniques were still relatively new and too computationally intensive to be practical, the initial screen-based systems adopted by all three federal agencies relied on a variant of discriminant analysis, comparing selected ratios to predetermined cutoff points and classifying banks accordingly. The Office of the Comptroller of the Currency (OCC) adopted the first formal screen-based system, the National Bank Surveillance System (NBSS), in 1975. Previously, off-site monitoring had consisted largely of informal rules of thumb based on individual financial ratios. According to White (1992), the impetus for the shift toward a more systematic approach was the OCC’s failure to detect the financial difficulties at two large institutions—United States National Bank and Franklin National Bank—that became insolvent in the early 1970s. The OCC’s response to these shortcomings in off-site surveillance was, in part, to avail itself of new computing technology to condense the call-report data into key financial ratios for each bank under its supervision.
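The screen logic just described is simple enough to sketch in code. The ratio names and cutoff values below are invented for illustration; the agencies’ actual benchmarks were set by examiner judgment and are not reproduced in this article.

```python
# Sketch of a screen-based off-site monitor: each financial ratio is
# compared to a predetermined cutoff, and a bank that "fails" a screen
# is flagged for additional follow-up. Ratios and cutoffs are
# hypothetical.
SCREENS = {
    # ratio name: (cutoff, direction) -- "min" fails values below the
    # cutoff, "max" fails values above it
    "equity_to_assets":     (0.05, "min"),
    "past_due_to_loans":    (0.04, "max"),
    "net_income_to_assets": (0.00, "min"),
}

def failed_screens(bank_ratios):
    """Return the list of screens a bank fails."""
    failures = []
    for name, (cutoff, direction) in SCREENS.items():
        value = bank_ratios[name]
        if direction == "min" and value < cutoff:
            failures.append(name)
        elif direction == "max" and value > cutoff:
            failures.append(name)
    return failures

bank = {"equity_to_assets": 0.04,
        "past_due_to_loans": 0.06,
        "net_income_to_assets": 0.002}
print(failed_screens(bank))  # flags the capital and past-due screens
```

A composite score, as in the MBSS described below, would additionally combine the normalized ratios into a single number rather than treating each screen separately.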
One component of the NBSS, the Anomaly Severity Ranking System, ranked selected bank ratios within peer groups to detect outliers. The FDIC and the Federal Reserve quickly followed the OCC with similar screen-based models of their own. In 1977, the FDIC introduced the Integrated Monitoring System. One component of this system was the humbly titled “Just A Warning System,” which consisted of 12 financial ratios. The system compared each ratio with a benchmark ratio determined by examiner judgment. Banks with ratios that “failed” various screens were flagged for additional follow-up. The Federal Reserve adopted the Minimum Bank Surveillance System (later, the Uniform Bank Surveillance Screen), which examined seven bank ratios. These ratios were weighted by their Z-scores, which were then summed to yield a composite score for each bank. MBSS, which resulted from the research program described in Korobow, Stuhr, and Martin (1977), was the first surveillance model adopted by a supervisory body to employ formal statistical techniques.

Discrete-Response Models and Hybrid Systems

The development of discrete-response regression techniques, together with the increased availability of the computing power necessary to apply them to large datasets, aided the advancement of bank-distress models beginning in the late 1970s (Hanweck, 1977; Korobow, Stuhr, and Martin, 1977; and Martin, 1977). Because of its analytical simplicity, the logistic specification has been the favorite model of this type, although arctangent and probit models have also appeared occasionally.2 As pointed out by Martin (1977), discriminant analysis can be viewed as a special case of logistic regression in that the existence of a unique linear discriminant function implies the existence of a unique logit equation, whereas the converse is not true.
However, the existence of a linear discriminant function is commonly rejected when the number of observations of one class is substantially smaller than that in the other class. For this reason, early discriminant studies typically used subsamples of the population of safe banks (which have always far outnumbered risky banks by any measure), either matching them according to certain non-risk characteristics or randomly selecting the control sample. The use of a logit model obviates the need for these restrictive sampling methods.

Martin’s (1977) study set the standard for discrete-response models of bank-failure prediction. Whereas most previous research had focused on a small sample of banks over two or three years, Martin used all Fed-supervised institutions during a seven-year period in the 1970s, yielding over 33,000 observations. In what would become a standard approach, he confronted the data agnostically with 25 financial ratios and ran several different specifications in search of the best fit. He found that capital ratios, liquidity measures, and profitability were the most significant determinants of failure over his sample period. Although Martin did not employ direct measures of asset quality, his indirect measures—provision expense and loan concentration—also turned out to be significant. A host of other studies around the same time, using both logit and discriminant analysis, confirmed these basic results. Table 2 summarizes a selection of these papers. Poor asset quality and low capital ratios are the two characteristics of banks that have most consistently been associated with banking problems over time (Sinkey, 1978).

2 Linear regression analysis was explored early on by Meyer and Pifer (1970).
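A minimal sketch of how a logit failure model of this kind scores a bank follows. The coefficients and ratios are made up for illustration; Martin’s estimated coefficients are not reproduced in this article.

```python
import math

# Sketch of a Martin (1977)-style logit failure model. The coefficients
# below are hypothetical; their signs follow the article's findings
# (capital and profitability lower failure odds, loan concentration
# raises them).
COEFFS = {
    "intercept":            -4.0,
    "capital_to_assets":   -35.0,
    "loans_to_assets":       3.0,
    "net_income_to_assets": -20.0,
}

def failure_probability(bank):
    """Logistic response: P(fail) = 1 / (1 + exp(-x'b))."""
    score = COEFFS["intercept"]
    for name, beta in COEFFS.items():
        if name != "intercept":
            score += beta * bank[name]
    return 1.0 / (1.0 + math.exp(-score))

weak   = {"capital_to_assets": 0.03, "loans_to_assets": 0.70,
          "net_income_to_assets": -0.01}
strong = {"capital_to_assets": 0.10, "loans_to_assets": 0.55,
          "net_income_to_assets": 0.012}
assert failure_probability(weak) > failure_probability(strong)
```

Unlike a discriminant score, the logit output is a probability, so banks can be ranked or flagged by comparing it with a chosen threshold.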
Indeed, as described in Putnam (1983), early-warning research in the 1970s and 1980s displayed a remarkable consistency in the variables that emerged as important predictors of banking problems: profitability, capital, asset quality, and liquidity appeared as statistically significant in almost every study, even though they were often measured using different ratios.3

Motivated in part by the consistency of the pattern of bank deterioration, the federal banking agencies adopted the Uniform Financial Rating System in November 1979.4 Under this system—which is still the primary rating mechanism for U.S. bank supervision—capital adequacy (C), asset quality (A), management competence (M), earnings performance (E), and liquidity risk (L) are each explicitly evaluated by examiners and rated on a 1 (best) to 5 (worst) scale. (Beginning in 1997, sensitivity to market risk (S) was adopted as a sixth component.) Examiners also assign a composite rating (CAMELS) on the same scale, reflecting the overall safety and soundness of the institution. From a supervisory perspective, modeling CAMELS ratings allows examiners to observe estimates of current supervisory ratings on a quarterly basis, rather than only during an on-site exam. The availability of consistent supervisory-rating data beginning in 1979 allowed researchers to employ ordered logit techniques to estimate bank ratings. (See West, 1985; and Whalen and

3 More recently, some research has investigated the potential for local and regional economic data to add information about future banking conditions. However, the results have largely rejected this idea (e.g., Meyer and Yeager, 2001; Nuxoll, O’Keefe, and Samolyk, 2003; and Yeager, 2004). On the other hand, Neely and Wheelock (1997) show that bank earnings are highly correlated with state-level personal-income growth.
4 Prior to 1979, the three federal regulatory agencies assigned banks scores for capital (1 to 4), asset quality (A to D), and management (S, F, or P), as well as a composite score (1 to 4).

Table 1
Evolution of Key Off-Site Surveillance Systems

Screen-Based Systems

National Bank Surveillance System (NBSS). Agency: OCC. Period used: 1975 to ?
Condensed the call-report data into key financial ratios and compared them to peer ratios. One output of the NBSS, the Anomaly Severity Ranking System, ranked bank ratios by peer group to detect outliers. Another output was the Bank Performance Report. In cooperation with the Fed and FDIC, the OCC transformed the Bank Performance Report into the Uniform Bank Performance Report (UBPR). Although the OCC no longer uses the NBSS, the UBPR is used presently by all federal and state supervisory agencies for both on-site and off-site analysis.

Minimum Bank Surveillance Screen (MBSS). Agency: Federal Reserve. Period used: Late 1970s to mid-80s.
Employed a set of ratios as off-site screens and added institutions that lay outside a critical range to an “exception list” that received extra scrutiny. A composite score was also constructed by summing the normalized values of seven of these ratios.

Integrated Monitoring System (IMS). Agency: FDIC. Period used: 1977 to 1985.
A screening device within the IMS, called the “Just A Warning System” (JAWS), compared 12 key financial ratios to critical values as determined by examiner expertise. JAWS did not compute composite scores or make direct comparisons to peer levels.

Uniform Bank Surveillance Screen (UBSS). Agency: Federal Reserve. Period used: Mid-1980s to 1993.
Improvement upon the MBSS. Computed peer-group percentiles of six financial ratios and summed them to derive the composite score. Banks in the highest percentiles of the composite score were placed on a watch list.

Hybrid Systems

CAEL. Agency: FDIC. Period used: 1985 to late 1998.
Replaced IMS.
An “expert system,” designed to replicate the financial analysis that an examiner would perform to assign an examination rating. Ratios were chosen to evaluate capital (C), asset quality (A), earnings (E), and liquidity (L). Analysts subjectively determined the weights for each of the ratios that fed into the four CAEL components. The CAEL components were multiplied by their respective weights and summed to yield a composite CAEL score.

Canary. Agency: OCC. Period used: 2000 to present.
Canary consists of a package of tools organized into four components: Benchmarks, Credit Scope, Market Barometers, and Predictive Models. Benchmarks are screen-based ratios that indicate risky thresholds. The Peer Group Risk Model is a predictive model that projects a bank’s return on assets over the next three years under various economic scenarios.

Limited-Dependent Variable Systems

System to Estimate Examination Ratings (SEER). Agency: Federal Reserve. Period used: 1993 to present.
Replaced the UBSS. First named the Financial Institutions Monitoring System, SEER is a logit model that consists of two components, a “risk-rank” model that forecasts bank-failure probabilities and a “rating” model that estimates current CAMELS scores.

Statistical CAMELS Off-site Rating (SCOR). Agency: FDIC. Period used: 1998 to present.
Replaced CAEL. Like SEER, the model consists of two components: a CAMELS downgrade forecast and a rating forecast. The downgrade forecast computes the probability that a 1- or 2-rated bank will receive a 3, 4, or 5 rating at the next examination. The OCC also uses output from the SCOR model in off-site surveillance.

CAMELS Downgrade Probability (CDP). Agency: Federal Reserve Bank of St. Louis. Period used: 1999 to present.
Similar to the downgrade forecast of SCOR, the CDP estimates the probability that a 1- or 2-rated bank will be downgraded to a 3, 4, or 5 rating over the next two years.
Forward-Looking Early-Warning Systems

Growth Monitoring System (GMS). Agency: FDIC. Period used: 2000 to present.
Although GMS was initially developed as an expert system and implemented in the 1980s, it was revised significantly in the late 1990s to employ explicit statistical techniques. GMS is a logit model of downgrades that estimates which institutions that are currently rated satisfactory are most likely to be classified as problem banks at the end of three years. Rather than using credit-quality measures as independent variables, GMS includes forward-looking variables such as loan growth and noncore funding that can be precursors of problems that have yet to manifest themselves.

Liquidity and Asset Growth Screen (LAGS). Agency: Federal Reserve Bank of St. Louis. Period used: 2002 to present.
LAGS is conceptually similar to GMS, but it uses a dynamic vector autoregression approach to forecast the set of banks most likely to exploit moral-hazard incentives. Such banks exhibit rapid loan growth, increasing dependence on funding sources with no market discipline, and declining capital ratios. Like GMS, the model uses forward-looking variables.

Risk-Focused Systems

Real Estate Stress Test (REST). Agency: FDIC. Period used: 2000 to present.
REST attempts to identify those banks and thrifts that are most vulnerable to problems in real estate markets by subjecting them to the same stress as the New England real estate crisis of the early 1990s. Forecast measures of bank performance are translated to CAMELS ratings using the SCOR model. The result is a REST rating that ranges from 1 to 5.

Economic Value Model (EVM). Agency: Federal Reserve. Period used: 1998 to present.
The EVM is a duration-based economic value of equity model that estimates the loss in a bank’s market value of equity given an instantaneous 200-basis-point interest rate increase. The model is useful for assessing the bank’s long-run sensitivity to interest rate risk.
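The duration-based EVM calculation just described can be illustrated with a stylized duration-gap example. The balance sheet, durations, and the simple netting formula here are illustrative assumptions; the actual Fed model applies estimated duration weights to detailed call-report asset and liability categories.

```python
# Stylized economic-value-of-equity (EVE) sensitivity calculation.
# For a parallel rate shock dr, a first-order (duration-only,
# convexity ignored) approximation is:
#     dEVE ~= -(D_A * A - D_L * L) * dr
# All balance-sheet figures and durations below are invented.

def eve_change(assets, dur_assets, liabilities, dur_liabilities, rate_shock):
    """Approximate change in economic value of equity."""
    return -(dur_assets * assets - dur_liabilities * liabilities) * rate_shock

A, L = 100.0, 90.0   # balance sheet, $ millions
D_A, D_L = 4.5, 1.5  # effective durations, years
shock = 0.02         # the EVM's instantaneous 200-basis-point increase

loss = eve_change(A, D_A, L, D_L, shock)
equity = A - L
print(round(loss, 2))                  # dollar change in equity value
print(round(100 * loss / equity, 1))   # change as a percent of equity
```

Because this hypothetical bank holds long-duration assets against short-duration liabilities, the 200-basis-point shock wipes out a large share of its economic equity, which is exactly the long-run interest rate exposure the EVM is built to flag.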
Table 2
Comparison of Selected Early Studies Predicting Bank Condition

Model  Study                                Dependent variable  Technique             No. of obs.  Sample period
(1)    Meyer and Pifer, 1970                Failure             OLS                   60           1948-65
(2)    Stuhr and van Wicklen, 1974          Rating              Discriminant analysis 214          1967-68
(3)    Martin, 1977                         Failure             Logit                 33,627       1969-76
(4)    Hanweck, 1977                        Failure             Probit                221          1971-76
(5)    Bovenzi, Marino, and McFadden, 1983  Failure             Probit                820          1980-83
(6)    West, 1985                           Rating              Factor + Logit        ~5,700       1980-82
(7)    Pantalone and Platt, 1987            Failure             Logit                 339          1983-84
(8)    Whalen and Thomson, 1988             Rating              Factor + Logit        70           1983-86

The explanatory variables examined across these studies include the loans vs. securities mix; efficiency, net operating expense, or overhead; ROA or ROE; capital/assets; classified loans; loan mix; size; charge-offs; deposit mix; past-due or nonperforming loans; liquid assets; volatile liabilities or jumbo CDs; dividend payout ratio; interest income, expense, or margin; interest-rate sensitivity; provision expense; insider activity; income volatility; balance sheet volatility; asset or loan growth; income growth; loan-loss reserves; and other variables.

NOTE: Variables listed are those included in each study. In most cases, variables were selected because of their significance, and so the list also largely reflects variables that were significant in predicting bank problems. In some studies, additional variables were considered but were found to be statistically insignificant.
Thomson, 1988.)5 Cole and Gunther (1998) demonstrate that actual supervisory ratings can become obsolete within as little as six months after being assigned. Similarly, Hirtle and Lopez (1999) find that the private supervisory information contained in these ratings decays as they age. These studies suggest that early-warning models that estimate current supervisory ratings are useful tools for supervisors to keep up with bank fundamentals without incurring the cost of an examination.6

The FDIC’s CAEL model, introduced in 1985, represented a significant breakthrough in off-site monitoring devices. This “hybrid” system—a discrete-response framework coupled with examiner input—estimated ratings for four of the five CAMEL components based on quarterly call-report data. (‘M’ was not estimated.) For each CAEL component, experienced examiners subjectively weighted the relevant bank ratios; a rating table then mapped the model output to a rating ranging from 1 to 5. The rating table was updated each quarter to mirror the actual distribution of component CAMEL ratings in the previous year. CAEL then weighted the four estimated components themselves to yield a composite rating. In essence, the model was a calibrated limited-dependent-variable model, with examiner guidance replacing the computationally intensive econometric procedure.

The Current Surveillance Regime

A wealth of data on bank failures and CAMELS ratings throughout the 1980s and the rapid pace

5 West (1985) and Wang and Sauerhaft (1989) model supervisory ratings in a factor-analytic framework. Supervisory ratings had previously been used to measure composite risk in a discriminant-analysis study by Stuhr and van Wicklen (1974). Two other, related, lines of research begun in this period involve modeling time to failure (rather than failure probability) and regulatory closure-decision rules.
Examples of the time-to-failure models, which typically involve Cox (1972) proportional-hazard specifications, can be found in Lane, Looney, and Wansley (1986), Whalen (1991), Helwege (1996), and Wheelock and Wilson (1995, 2000, 2005). For models of supervisory closure behavior, see Barth et al. (1989), Demirgüç-Kunt (1989), Thomson (1992), and Cole (1993).

6 It is important to recognize that these models are intended as complements to, rather than substitutes for, on-site examination. Although CAMELS ratings do become stale rather quickly, Nuxoll, O’Keefe, and Samolyk (2003) and Wheelock and Wilson (2005) show that they still retain marginal predictive power for failures, beyond that contained in the call-report data. Thus, on-site examination appears to recover some information that is not available in bank financial statements.

of computer technology in the 1980s and early 1990s allowed supervisory agencies to “catch up” with the banking and econometric research and develop off-site monitoring devices employing limited-dependent-variable econometric techniques. Table 3 compares the explanatory variables used in select previous and current early-warning systems. Two systems—SEER and SCOR—are the primary surveillance tools used today by the Fed and the FDIC, respectively.

In 1993, the Federal Reserve adopted as its in-house early-warning model the Financial Institutions Monitoring System, which was modified slightly and renamed the System to Estimate Examination Ratings (SEER). This model consists of two components: a “risk-rank” or failure model that estimates bank-failure probabilities and a “rating” model that estimates current CAMELS scores. The SEER failure model is designed to detect deficiencies in balance sheet and income statement ratios that are severe enough to cause an outright failure or a critical shortfall in capital.
Because these events have been rare since the inception of SEER, the variables and coefficient estimates have remained frozen since they were first estimated on late-1980s and early-1990s failures. The SEER rating model, in contrast, is reestimated on a quarterly basis, allowing for different coefficient estimates—and indeed different independent variables—in each quarter. This model has the advantage of allowing for new sources of bank risk, but it can be difficult to interpret changes in risk when the main driver of the change is the inclusion of a variable that was not present in the model in the previous quarter. The two models are used together to achieve a balance between flexibility and consistency. As Cole, Cornyn, and Gunther (1995), Cole and Gunther (1998), and Gilbert, Meyer, and Vaughan (1999) demonstrate, SEER’s performance is superior to that of a variety of other early-warning systems, including actual CAMELS scores assigned by examiners, in terms of the trade-off between its type-I and type-II errors.7

7 In this case, a type-I error occurs when a bank is not predicted to fail but does. A type-II error occurs when a bank is predicted to fail but does not. For obvious reasons, regulators are more concerned with type-I errors.
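The type-I/type-II trade-off just described can be made concrete with a small sketch. The predicted probabilities and outcomes below are hypothetical; real evaluations of SEER and its peers use thousands of bank-quarters.

```python
# Sketch of the type-I / type-II error trade-off for a failure model.
# Following the article's usage: a type-I error is a failing bank the
# model did NOT flag; a type-II error is a flagged bank that did not
# fail. Data are hypothetical.
def error_rates(predictions, outcomes, threshold):
    """Return (type-I rate, type-II rate) at a given flagging threshold.

    predictions: estimated P(fail) for each bank
    outcomes:    True if the bank actually failed
    """
    flagged = [p >= threshold for p in predictions]
    failures = sum(outcomes)
    nonfailures = len(outcomes) - failures
    type1 = sum(1 for f, y in zip(flagged, outcomes) if y and not f)
    type2 = sum(1 for f, y in zip(flagged, outcomes) if f and not y)
    return type1 / failures, type2 / nonfailures

probs  = [0.90, 0.40, 0.15, 0.05, 0.02]
failed = [True, True, False, False, False]

# Lowering the threshold catches more failures (fewer type-I errors)
# at the cost of more false alarms (more type-II errors).
print(error_rates(probs, failed, 0.50))
print(error_rates(probs, failed, 0.10))
```

Because regulators weight type-I errors more heavily, surveillance systems are typically tuned toward low thresholds, accepting extra false alarms in exchange for missing fewer troubled banks.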
Table 3
Comparison of Early-Warning Systems

System     Agency  Model type
JAWS       FDIC    Screens
UBSS       FRB     Screens
CAEL       FDIC    Hybrid
SEER       FRB     Logit
SCOR       FDIC    Logit
Downgrade  FRB     Logit
GMS        FDIC    Logit
LAGS       FRB     VAR

The explanatory variables compared across these systems include tier-1 or tangible capital; total or risk-weighted assets; loans past due 30 days; loans past due 90 days; nonaccruals; OREO; residential real estate loans; C&I loans; securities; jumbo CDs; net income (ROA); liquid assets; loan growth; charge-offs; provision expense; total or risk-weighted asset growth; volatile liability expense; volatile liabilities; loan-loss reserves; the loan/deposit ratio; interest expense; loans and long-term securities; NCNRP funding; operating expenses or revenues; change in capital; change in deposits; dividends; region; prior composite supervisory rating; and prior supervisory management rating.

NOTE: For purposes of comparison, some liberties have been taken with variable definitions; e.g., categories such as liquid assets and tangible capital have been defined in slightly different ways in the various models, and the construction of certain ratios differs slightly.

In 1998, the FDIC developed a model similar to SEER, known as the Statistical CAMELS Off-site Rating (SCOR). The SCOR model, which replaced CAEL, also consists of two components: a rating forecast and a CAMELS-downgrade forecast.8 The rating component of the FDIC’s SCOR model is similar to the SEER rating model. SCOR uses a multinomial logit model to estimate a composite CAMELS rating as well as ratings for all six of the CAMELS components, in keeping with the formulation of the preceding CAEL system.
SCOR’s downgrade component estimates probabilities that safe banks (those with ratings of 1 or 2) will receive ratings of 3, 4, or 5 at the next examination. The Federal Reserve has recently undertaken a similar effort in modeling downgrades. Gilbert, Meyer, and Vaughan (2002) use a logistic model to estimate downgrade probabilities for CAMELS composites. The authors concluded that the variables included in SEER were also the most appropriate for their purposes; but one advantage of the CAMELS downgrade model relative to the SEER failure model is the ability to update the coefficients on a periodic basis.

In sum, researchers and practitioners have made considerable progress in developing models to predict bank distress. However, as we discuss below, these models must be complemented with newer models to account for evolution in the banking industry and nontraditional sources of bank risk.

THE NEED FOR NEW WORK

The sophistication of off-site early-warning systems has certainly improved since 1970; but, given the dramatic changes in the banking sector over the past decade, we may expect that the current systems—like the screen-based mechanisms that preceded them—have already fallen behind the pace of financial evolution.9 The main criticism of prevailing early-warning techniques is the implicit assumption that future episodes of bank distress will look similar to past episodes of distress. However, significant changes in the banking environment since 1990, combined with empirical evidence that bank-distress patterns may be changing, suggest that new early-warning research is needed.

8 See Cole, Cornyn, and Gunther (1995) and Collier et al. (2003a).

9 Hooks (1995) and Helwege (1996) provide evidence on the parameter instability of traditional early-warning models over time.
Recent Changes in the Banking Environment

Shifts in the banking environment erode confidence in early-warning models because the future is less likely to reflect the past. Since 1990, banks have faced significant legislative, financial, and technological innovations. The post-1990 legislation, summarized in Table 4, was intended to impose more market discipline on banks and remove anti-competitive barriers. The Federal Deposit Insurance Corporation Improvement Act of 1991 and the National Depositor Preference Act of 1993 shifted more of the burden of bank failure from taxpayers to uninsured creditors. Several studies have documented the changes in market discipline that appear to have been caused by this legislation (Flannery and Sorescu, 1996; Cornett, Mehran, and Tehranian, 1998; Marino and Bennett, 1999; Hall et al., 2002; Goldberg and Hudgins, 2002; Flannery and Rangan, 2003; and King, 2005). In addition, legislation removed geographic branching restrictions (Riegle-Neal Act of 1994) and product restrictions (Financial Services Modernization Act of 1999). Many banks have expanded into investment banking, insurance, and other financial services, and a small but increasing fraction of bank revenue derives from fee income generated by these operations (Yeager, Yeager, and Harshman, 2005). A likely outcome of these legislative changes is a more competitive banking industry that has the ability to assume different kinds of credit risk than it assumed in the past.

In addition to the legislative changes, financial markets have widened and deepened, presenting banks with new asset and liability management opportunities and challenges. Previously illiquid assets have become more liquid as secondary markets have developed and government-sponsored enterprises such as Fannie Mae and Freddie Mac have facilitated the growth of the mortgage market.
Many of these products, however, contain embedded options that could increase exposure to interest rate risk.

Table 4
Key Legislative Changes in the 1990s

Financial Institutions Reform, Recovery, and Enforcement Act of 1989 (FIRREA)
Opened FHLB membership to commercial banks. Previously, membership had been available only to thrifts and certain insurance companies. Advances from the FHLB are a ready source of non-risk-priced funding. Over two-thirds of all banks are now FHLB members, and over half of them routinely utilize advances. As Stojanovic, Vaughan, and Yeager (2001) show, risky banks are more likely to rely on advances than safer banks.

Federal Deposit Insurance Corporation Improvement Act of 1991 (FDICIA)
Restricted regulatory forbearance and creditor protection through prompt-corrective-action and least-cost-resolution provisions. This legislation may have induced greater discipline in uninsured credit markets (see Goldberg and Hudgins, 2002, and Hall et al., 2002), resulting in higher funding costs and different liability structures for troubled institutions. Mandatory closure rules potentially increased the mean and reduced the variance of the capital levels of failing banks.

National Depositor Preference (1993)
Enacted as part of the Omnibus Budget Reconciliation Act of 1993, this legislation changed the failure-resolution hierarchy to make domestic depositors more senior claimants than foreign depositors. Like FDICIA, this legislation may have changed funding costs for risky banks and caused them to rearrange their liability structures. See Marino and Bennett (1999).

Riegle-Neal Interstate Banking and Branching Efficiency Act of 1994
Allowed bank branching across state lines. Although this Act allowed for greater geographic diversification, it also exposed banks to increased competition.

The Gramm-Leach-Bliley Act of 1999 (Financial Services Modernization Act)
Repealed the Glass-Steagall Act and allowed financial holding companies to engage in insurance, securities underwriting and brokerage services, and merchant banking. This Act introduced new potential sources of risk in banking, although it facilitated the diversification of some traditional sources of risk.

Liabilities have also evolved since 1990. Banks are relying increasingly on noncore funding such as brokered deposits and jumbo CDs (over $100,000) as traditional checking and savings accounts and local CDs are shrinking. In addition, the Federal Home Loan Bank (FHLB) opened its doors to commercial banks in 1989, quickly becoming an important nondeposit source of funding. (See Stojanovic, Vaughan, and Yeager, 2001; Bennett et al., 2005; and Craig and Thomson, 2003.) These changes potentially alter both interest rate and liquidity risks. Derivatives usage at commercial banks has also exploded—the notional amount of derivatives at commercial banks increased tenfold to more than $70 trillion between 1991 and 2003. Derivatives can be used to hedge risk, but they can also be used to speculate on market movements.10 In addition, over-the-counter derivatives potentially expose banks to counterparty risk.

Finally, as in many other industries, technological innovations revolutionized the business of banking in the 1990s. Electronic payments, online banking, and credit scoring are now common and quickly growing activities. As Claessens, Glaessner, and Klingebiel (2002) argue, these developments have the potential to change the competitive landscape dramatically. They also allow for increased operational risk, including data theft from security vulnerabilities and the facilitation of money laundering.
Overall, the new products and markets that have become available to banks in the past decade provide opportunities to diversify and hedge risk in new ways. Yet they also carry dangers: if they are not fully understood or properly managed, new business lines may end up increasing risks for banks that move into them too hastily. With the increasing intensity of competition, many institutions have likely been tempted to do exactly that. The net effect on banks' risk positions is an empirical question.

10 The literature on the risk effects of derivative use is large. Recent contributions include Instefjord (2005), Duffee and Zhou (2001), and Sinkey and Carter (2000).

Table 5
Trends at Failed Banks, Before and After 1995

                                        Ratios at failed banks              Ratios at failed banks less peer values
Variable (quarters prior       1995-2003  1984-94  Difference of means     1995-2003  1984-94  Difference of means
to failure)                       (%)       (%)      (%) (t-statistic)        (%)       (%)      (%) (t-statistic)

Jumbo CDs (1)                     14.70     18.80    –4.10** (–2.06)           4.10      9.30    –5.2*** (–2.65)
Jumbo CDs (6)                     13.40     21.30    –7.90*** (–5.04)          3.60     12.10    –8.5*** (–5.46)
Federal funds purchased (1)        0.37      0.99    –0.62*** (–3.30)         –1.15     –0.20    –0.95*** (–5.07)
Federal funds purchased (6)        0.77      1.29    –0.52 (–1.64)            –0.69      0.17    –0.86*** (–2.72)
Demand deposits (1)               12.70     14.90    –2.20 (–1.23)             0.70     –4.80     5.5*** (3.12)
Demand deposits (6)               11.70     15.20    –3.5** (–1.99)           –0.40     –6.30     5.8*** (3.29)
Loan-loss reserves/loans (1)       4.04      3.14     0.90** (2.15)            2.51      1.90     0.61 (1.46)
Loan-loss reserves/loans (6)       2.63      1.87     0.76*** (3.7)            1.06      0.67     0.38* (1.86)
Cash & due (1)                     7.11      8.20    –1.08 (–1.28)             1.81     –0.45     2.26*** (2.67)
Cash & due (6)                     6.14      9.03    –2.89*** (–3.7)           0.85      0.17     0.68 (0.87)
Commercial real estate loans (1)  15.80     11.60     4.1** (2.16)            –0.10      3.10    –3.3* (–1.70)
Commercial real estate loans (6)  15.80     11.60     4.2** (2.36)             1.10      3.50    –2.50 (–1.40)
Fee income (1)                     2.57      1.11     1.46** (2.44)            1.58      0.34     1.24** (2.07)
Fee income (6)                     2.87      1.00     1.86 (1.59)              1.91      0.29     1.62 (1.38)
OREO (1)                           1.70      3.48    –1.78*** (–3.78)          1.54      3.11    –1.57*** (–3.33)
OREO (6)                           1.49      1.70    –0.22 (–0.55)             1.30      1.40    –0.10 (–0.25)
Total assets (1)                  $133M     $161M    –$28M (–0.57)            –$88M     $47M    –$135M*** (–2.72)
Total assets (6)                  $137M     $192M    –$55M (–0.88)            –$51M     $88M    –$139M** (–2.23)

NOTE: This table shows differences in means for selected risk variables between failing banks in the period 1995-2003 compared with those in 1984-94. Both the differences in levels and the differences in levels less peer values for the corresponding periods are given, at both 1- and 6-quarter horizons. *, **, and *** indicate statistical significance at the 10, 5, and 1 percent levels, respectively. All of the nine variables reported here have exhibited significant changes since 1995 (by at least one of these difference-of-means tests) in their patterns as failure approaches.

Evidence of Changes in the Nature of Bank Distress

Although none of the institutional changes mentioned above necessarily implies any fundamental change in the process through which banks deteriorate, together they constitute a prima facie case that, at the very least, the previous results should be reaffirmed. Simple empirical analysis indicates that some of the above changes may indeed have had an impact on the typical pattern of bank distress.

Figure 1 plots nine key ratio averages for failing banks in the 12 quarters leading to failure between 1984 and 1994 and between 1995 and 2003 against the contemporaneous averages for banks that did not fail.11 Of course, the number of failures in the earlier period was much larger (1,371 compared with 44), yet the patterns that emerge suggest that many characteristics of banks in the quarters before failure may have changed between the two time periods.

11 The December 1994 cutoff was chosen to exclude the failures of the early-1990s banking crisis from the more recent sample. Other break dates around the same time yield similar results.

Table 5, which reports difference-of-means tests
for the same series, shows that, despite the low number of failure observations in the second period, many of these changes are statistically significant. (The table reports the tests for one and six quarters prior to failure. The choice of the six-quarter horizon reflects the average time between bank exams.)

Figure 1
Trends at Failed Banks, Before and After 1995
[First six panels of nine: Jumbo CDs/Assets, Federal Funds Purchased/Assets, Demand Deposits/Assets, Loan-Loss Reserves/Loans, Cash/Assets, and Commercial RE Loans/Assets; each panel plots the 12 quarters prior to failure for the 1995-2003 and 1984-94 periods.]
NOTE: This figure presents the information in Table 5 in graphical form. In each case, the thin black line indicates the path of a failing bank as the failure date approaches and the thick blue line indicates the average values for non-failing banks. Values on the horizontal axis indicate the number of quarters prior to failure. For every variable reported here, there is an obvious change in the pattern between the two periods.
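A difference-of-means test of the kind reported in Table 5 can be sketched in a few lines. The function below is a standard Welch two-sample t-statistic; the two samples are hypothetical peer-adjusted jumbo-CD ratios invented for illustration, not the authors' data.

```python
import math

def welch_t(sample_a, sample_b):
    """Welch two-sample t-statistic for a difference in means,
    allowing unequal variances in the two samples."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    se = math.sqrt(var_a / na + var_b / nb)  # standard error of the difference
    return mean_a - mean_b, (mean_a - mean_b) / se

# Hypothetical jumbo-CD ratios (percent of assets, net of peer averages)
# one quarter before failure, for failed banks in each period.
recent = [3.0, 5.0, 4.0, 4.5, 3.5, 4.0]    # stand-in 1995-2003 cohort
earlier = [9.0, 10.0, 8.5, 9.5, 8.0, 9.0]  # stand-in 1984-94 cohort

diff, t = welch_t(recent, earlier)
print(f"difference of means = {diff:.2f}, t = {t:.2f}")
```

A large negative t-statistic, as here, is what the table's starred entries summarize: failing banks in the later period relied on markedly less of this funding source than their 1984-94 counterparts.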
[Figure 1, cont'd: OREO/Assets, Fee Income/Assets, and Total Assets ($ thousands) panels.]

Failing banks in the 1995-2003 period had lower relative levels of liquidity risk than banks in the 1984-94 period. Specifically, between 1995 and 2003, failing banks relied substantially less on jumbo CDs and the purchase of federal funds, both in absolute terms and relative to safe banks. Although the ratio of demand deposits to total assets was lower for all banks in the later period, failing banks between 1995 and 2003 had ratios nearly identical to those at non-failing banks. In contrast, failing banks on average had significantly fewer demand deposits as a percentage of assets than non-failing banks in the 1984-94 period. Finally, the cash-to-assets ratio increased at failing banks in the quarters leading up to failure in the 1995-2003 period, whereas that ratio displayed little pre-failure trend in the earlier period. These interperiod differences in liquidity risk could reflect the increased depositor discipline imposed by the legislative changes of the 1990s, as risky banks in the 1995-2003 period may have had a more difficult time attracting uninsured funds.

Credit-risk ratios also reflect significant differences between the two periods.
Commercial real estate lending was significantly higher (about 4 percentage points, scaled by assets) at failing banks relative to non-failing banks in the earlier period. In the later period, the ratio was about the same at both failing and non-failing banks. Other real estate owned (OREO) as a percent of assets, previously one of the best predictors of failure, did not change substantially in the 1995-2003 period during the quarters leading up to failure. Although this ratio continues to be somewhat higher at failing banks relative to non-failing banks, the gap has shrunk, and the upward trend has nearly vanished. The loan-loss reserves–to–total loans ratio was higher for failing banks in the later period than in the earlier period, although the ratio increased prior to failure in both time periods. The diminished importance of credit-risk ratios could reflect the improved risk-management processes at banks facilitated by the deepening of financial markets. Indeed, Schuermann (2004) argues that most banks came through the 2001 recession in excellent shape in part because of more effective risk management. Advances in credit scoring allowed banks to better risk-price their syndicated, retail, and small-business loans.

Two other ratios demonstrate the increased importance of diversification and nontraditional lines of business in recent years. Fee income as a percentage of assets, which was previously about the same at safe and failing banks, is now substantially higher for failing banks. Finally, failing banks were larger on average than non-failing banks in the earlier period but smaller in the later period, potentially reflecting the diversification benefits that banks receive from expanding in size and product offerings.

Despite the differences, we should be cautious about drawing strong conclusions from these graphs.
The 1995-2003 sample contains only 44 bank-failure observations, so that, although most of our statistical tests yield statistically significant differences, the sample may not be entirely representative. In addition, some series that we have not emphasized have remained fairly constant. For example, failing banks continue to hold fewer mortgages and securities, and the pattern of capital deterioration has changed little. However, the fact remains that fundamental shifts in the banking environment make it possible that the path to bank distress has changed, and the recent data that are available are at least consistent with this possibility. Moreover, the shifts in the data (in variables associated with liquidity, credit, and operational risk) line up well with the types of institutional changes we know occurred during this period.

Because much of the academic research and most of the prevailing early-warning systems are based on data from the 1984-94 period, the above comparison gives us cause for concern. Indeed, these models tend to emphasize the variables that our evidence indicates have been most affected by the recent institutional changes. For example, 8 of the 11 variables in SEER and 10 of the 12 variables in SCOR reflect either asset quality or liquidity. Recognition of the recent fundamental shifts in the nature of banking has motivated supervisors to consider new approaches to off-site monitoring.

NEW DIRECTIONS IN BANK-DISTRESS MODELS

In this section we describe some recent attempts by supervisory economists to build bank-distress models that (i) are less vulnerable than traditional models to the changing banking environment and (ii) are designed to assess risks that current models potentially overlook. We group the new models into two types: forward-looking models and risk-focused models.
Forward-looking early-warning models may prove more robust to the changing bank environment because they rely on theory rather than past financial ratios to detect the circumstances that can lead banks to increase risk-taking. Risk-focused models reflect the shift to risk-focused supervision as explained in the Board of Governors' Supervision and Regulation Letter 97-25, titled "Risk-Focused Framework for the Supervision of Community Banks." The document, dated October 1, 1997, states the following:

The objective of a risk-focused examination is to effectively evaluate the safety and soundness of the bank...focusing resources on the bank's highest risks. The exercise of examiner judgment to determine the scope of the examination during the planning process is crucial to the implementation of the risk-focused supervision framework, which provides obvious benefits such as higher quality examinations, increased efficiency, and reduced on-site examiner time... [E]ach Reserve Bank maintains various surveillance reports that identify outliers when a bank is compared to its peer group. The review of this information assists examiners in identifying both the strengths and vulnerabilities of the bank and provides a foundation from which to determine the examination activities to be conducted.

Rather than identifying banks with high levels of overall risk, risk-focused monitoring devices attempt to assess the particular risks of banking organizations, allowing examiners to allocate resources to upcoming exams more efficiently. Risk-focused models have the added advantage that they scrutinize risks that traditional models may overlook because those risks were not systematically important in historical episodes of bank distress. We emphasize, however, that the new models should be viewed as complements to rather than substitutes for the more comprehensive and time-tested systems.
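The peer-group outlier screens mentioned in the letter can be illustrated with a simple z-score flag. The banks, the ratio values, and the two-standard-deviation cutoff below are all hypothetical; actual Reserve Bank surveillance reports are considerably richer than this sketch.

```python
from statistics import mean, stdev

def peer_outliers(ratios, threshold=2.0):
    """Flag banks whose ratio lies more than `threshold` peer-group
    standard deviations from the peer-group mean; return their z-scores."""
    mu = mean(ratios.values())
    sigma = stdev(ratios.values())  # sample standard deviation
    return {bank: round((value - mu) / sigma, 2)
            for bank, value in ratios.items()
            if abs(value - mu) > threshold * sigma}

# Hypothetical noncore-funding ratios (percent of assets) for one peer group.
noncore = {"Bank A": 12.0, "Bank B": 14.5, "Bank C": 13.0,
           "Bank D": 11.5, "Bank E": 38.0, "Bank F": 12.5}

print(peer_outliers(noncore))  # only the bank far from its peers is flagged
```

An examiner planning an exam would then focus on the flagged bank's funding strategy rather than re-reviewing the whole peer group.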
Forward-Looking Models

Forward-looking models tend to focus on asset growth and liquidity as key risk indicators. Adverse selection and moral hazard incentives provide complementary stories for why banks pursuing rapid asset-growth strategies may be ramping up risk.

The adverse selection story views banks as having well-established relationships with a core set of customers. On the liability side of the balance sheet, these customers provide stable low-cost funding, while on the asset side the bank has information about the creditworthiness of these customers that generally is not available to other lenders. Banks that pursue a rapid growth strategy must move into new markets or offer new products, finding both a new set of borrowers and the funds to finance the growth. Although growth is not a problem per se, the bank will suffer from adverse selection if its pool of prospective new borrowers is composed disproportionately of those who have been rejected by other banks. The question is whether the bank has sufficient expertise and devotes sufficient resources to address the credit problems inherent in rapid growth. These problems are not observable immediately because it takes time for loans to become delinquent.

The moral hazard story views deposit insurance and other sources of collateralized funding as vehicles for bank risk-taking. Banks keep the profits if the risks pay off, but leave the losses to the FDIC in the event of failure. Banks with relatively high capital ratios have incentives to manage their banks prudently because the owners of the bank have their own funds at stake. If capital ratios begin to slip, however, those incentives may erode (Keeley, 1990). When bank performance begins to deteriorate for whatever reason, managers and owners increasingly face the prospect of losing their wealth and jobs should regulators close the bank.
Rather than watch the bank fail, management might prefer to gamble for resurrection by booking high-risk loans funded with insured or collateralized funding. Indeed, this type of behavior is often blamed, in part, for the magnitude of the 1980s thrift crisis (White, 1991).

Banks traditionally have tried to avoid market discipline by relying on core deposits, and some evidence suggests that riskier banks shift to core funding for exactly this reason.12 Managers adopting this strategy, however, run up against two constraints. First, banks that deliberately try to sidestep market discipline with FDIC-insured deposits may invite greater regulatory scrutiny. Second, the limited supply of core funding imposes a natural ceiling on asset growth. Since the early 1990s, competition for insured deposits has intensified. Faced with less insured funding and greater demand for bank assets, managers have sought new funding sources. Banks that want to grow quickly but are unwilling to pay the risk premia demanded by uninsured liability holders may turn to noncore, non-risk-priced (NCNRP) sources of funding such as insured brokered deposits and FHLB advances. Brokered deposits funded much of the risky growth at thrifts during the 1980s. FHLB advances, which were historically available only to thrifts but became available to commercial banks in 1989, have many of the same properties as brokered deposits.13 Both types of funding are easily accessible in large quantities, and neither is priced according to the failure risk of the borrower.

12 Billet, Garfinkle, and O'Neal (1998).

Figure 2
Noncore, Non-Risk-Priced Funding at U.S. Banks
[Line chart, March 1984 through March 2002: FHLB advances and brokered deposits, each as a percentage of total bank assets (vertical axis, 0 to 7 percent).]
Brokered deposits are insured by the FDIC, while FHLB advances are fully collateralized. The lenders, therefore, have little incentive to monitor a borrowing bank's condition. As Figure 2 illustrates, bank reliance on brokered deposits and FHLB advances is at a historically high level, both in absolute terms and as a percentage of total bank assets. Advances in particular have grown from essentially 0 to 3.5 percent of banks' balance sheets since the early 1990s. Furthermore, rapid loan growth has accompanied the growth in noncore funding at many institutions. Between 1994 and 2004, bank lending increased 39 percent faster than total national income. Although aggregate capital levels and overall bank condition remained relatively sound over this period, the rapid growth could be an indication of imprudent lending.

13 Stojanovic, Vaughan, and Yeager (2001) provide further discussion of why the FHLB might create incentives for abnormal risk-taking and evidence in support of this hypothesis. Wang and Sauerhaft (1989) show that thrift reliance on FHLB advances and brokered deposits was associated with worse supervisory ratings in the 1980s.

The FDIC and the Federal Reserve Bank of St. Louis have independently developed alternative early-warning models, called the Growth Monitoring System (GMS) and the Liquidity and Asset Growth Screen (LAGS), respectively, to address the adverse selection and moral hazard concerns. We briefly describe each in turn.

Growth Monitoring System. The FDIC has used the GMS as part of its off-site review process since the mid-1980s. The original model was an "expert system" in that its parameter values were assigned based on professional judgment rather than statistical analysis. Weights were assigned to a number of growth-related variables in an attempt to identify those institutions most in danger of a rating downgrade.
In the late 1990s, the FDIC developed a new version of this model using statistical techniques. This newer version of GMS, implemented in 2000, uses a logit model of downgrades, much like more traditional models, estimating which institutions currently rated satisfactory are most likely to be classified as problem banks at the end of three years. Rather than using credit-quality measures as independent variables, GMS includes forward-looking variables that can be precursors of problems that have yet to become manifest. The key variables in the model are indicated in Table 3.14 Two variables have the most effect on the results: loan growth and noncore funding. Although the coefficient magnitudes vary somewhat over time, they are both statistically and economically significant. More rapid loan growth and heavy dependence on noncore funding generally lead to higher estimated default probabilities.

Back-testing of GMS shows that the model has significant forecasting power.15 Between 1996 and 2000, approximately 30 percent of the banks with GMS rankings at or above the 98th percentile received a rating of 3 or worse over the next five years.16 Among the banks with rankings at or below the 79th percentile, just 8 percent were downgraded, so banks in the top two percentiles were several times more likely to receive a rating of 3 or worse. The performance of GMS is even better when flagging more severe problems. Banks with GMS rankings at or above the 98th percentile were downgraded to a CAMELS 4 or 5 or failed 9.5 percent of the time; in contrast, banks with GMS ratings at or below the 79th percentile were downgraded to a rating of 4 or 5 or failed only 1.3 percent of the time.
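The downgrade logit just described can be sketched as follows. The coefficient values are invented for illustration (the actual GMS coefficients are estimated from call-report data and reestimated over time); only the qualitative direction, in which faster loan growth and heavier noncore funding raise the estimated downgrade probability, reflects the text.

```python
import math

def downgrade_probability(loan_growth, noncore_funding, coef=None):
    """Logit-style estimate of the probability that a currently
    satisfactory bank becomes a problem bank within three years.
    All coefficients below are hypothetical."""
    if coef is None:
        coef = {"intercept": -4.0, "loan_growth": 0.05, "noncore": 0.04}
    z = (coef["intercept"]
         + coef["loan_growth"] * loan_growth      # year-over-year loan growth (%)
         + coef["noncore"] * noncore_funding)     # noncore funding / assets (%)
    return 1.0 / (1.0 + math.exp(-z))             # logistic link

# Slow-growing, core-funded bank vs. fast-growing, noncore-funded bank.
slow = downgrade_probability(loan_growth=5.0, noncore_funding=10.0)
fast = downgrade_probability(loan_growth=40.0, noncore_funding=35.0)
print(f"slow grower: {slow:.3f}, fast grower: {fast:.3f}")
```

Ranking all satisfactory banks by this fitted probability and flagging the top percentiles is the essence of how GMS-style output is used.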
Finally, banks with GMS rankings at or above the 98th percentile were over eight times more likely to fail (0.76 percent) than those banks with rankings at or below the 79th percentile (0.09 percent). It should be noted that while the GMS model has notable success in identifying risky institutions, many banks with high GMS rankings are never downgraded. In other words, the type-II error rate is high.

14 Noncore funding, loans to total assets, and assets per employee are adjusted for size peers. The growth variables and the change in loan mix are not adjusted because there is no evidence that the size peers differ. All growth rates are measured year over year to avoid problems of seasonal adjustment. The growth rates of loans and assets are adjusted for mergers, but the growth rates of noncore funding and equity are not. This adjustment means that the model ignores acquisitions unless the acquisitions have eroded equity or made the bank more dependent on noncore funding.

15 The GMS system has also had particular success identifying recent failures due to fraud, although the exact reasons for this success require further investigation.

16 Of course, the full five years has not passed for ratings assigned in the year 2000. The results are for those banks that survived five years or that filed a September 2003 call report.

Liquidity and Asset Growth Screen. Like GMS, LAGS attempts to flag banks that use particular funding vehicles to fuel rapid asset growth. The central idea is that a bank that experiences a combination of falling capital ratios, rapid asset growth, and a surge in noncore, non-risk-priced funding exhibits the classic characteristics of moral hazard. The LAGS model consists of ten separate panel vector autoregressions (VARs), identical in their variables but estimated on banks of different inflation-adjusted asset classes.
The four dependent variables in the VARs are the quarterly growth rate of risk-weighted assets; the ratio of brokered deposits and FHLB advances to total assets; the CAMELS composite score; and the ratio of equity to total assets.17 The equations are estimated on rolling samples of quarterly data, updated every three months to include the most recent figures available. The key variable in the model is the CAMELS score. Banks that have higher forecasted CAMELS ratings over a three-year horizon are interpreted as being in greater danger of moral-hazard-induced risk.

17 Eight quarterly lags of each of these four variables are included as regressors in each of the four equations. The equations also include intercept terms. In total, then, LAGS consists of 40 linear regression equations, each containing 36 variables. Banks are excluded from the sample if they are less than eight quarters old or have merged with another institution within the previous eight quarters. As of June 30, 2004, the dataset included approximately 175,000 observations.

The charts in Figure 3 show how LAGS works for a hypothetical bank as of June 2004. In each of the four panels, the data to the left of the vertical black lines represent the bank's behavior over the previous two years. To the right of the black lines, the graphs show the LAGS forecasts. LAGS predicts that the sample bank's CAMELS score will rise from its present level of 1 to 1.78 over the next three years. LAGS ranks banking institutions by the predicted rise in total risk. A closer look at the sample bank's recent history gives us an idea of why the model predicts
such a dramatic rise in risk. The bank grew rapidly between June 2002 and June 2004, increasing its assets by half and ratcheting up its risk-weighted asset ratio. The bank funded a substantial portion of this growth with FHLB advances and brokered deposits. As of June 2004, these liabilities supported over 35 percent of the bank's total assets, a ratio that rose more than 10 percentage points during the previous two years. Meanwhile, capital declined by about 100 basis points. The bank, therefore, displays key moral hazard characteristics.

Figure 3
LAGS Forecasts for an Anonymous Bank as of June 2004
[Four panels covering June 2002 through June 2007, each split by a vertical line into two years of historical performance and three years of LAGS forecasts: the CAMELS composite, the equity-to-assets ratio, risk-weighted assets and total assets ($ millions), and FHLB advances and brokered deposits (percent of assets).]

Given the narrow focus of the LAGS model, we would not expect its performance to be as impressive as that of a more comprehensive model such as SEER, yet LAGS does display significant discriminatory ability. Between March 1998 and June 2001, 21.7 percent of banks with a CAMELS rating of 2 and a LAGS score at the 90th percentile or above were downgraded to a CAMELS score of 3, 4, or 5 or failed within the following three years.
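A heavily simplified sketch of the CAMELS equation in a LAGS-style VAR is below: one lag per variable instead of eight, invented coefficients, and the other three variables held at their last observed values rather than forecast jointly. It illustrates only the mechanics of iterating a forecast forward twelve quarters, not the estimated model itself.

```python
# Hypothetical coefficients for a one-lag CAMELS equation (the real LAGS
# equations use eight quarterly lags of all four variables, estimated by OLS).
COEF = {
    "intercept": 0.20,
    "camels": 0.80,       # persistence of the composite rating
    "rwa_growth": 0.02,   # quarterly growth of risk-weighted assets (%)
    "ncnrp": 0.01,        # (brokered deposits + FHLB advances) / assets (%)
    "equity": -0.03,      # equity / assets (%): more capital lowers risk
}

def forecast_camels(state, quarters=12):
    """Iterate the CAMELS equation forward `quarters` steps, feeding each
    forecast back in as the lagged value (a full VAR would also forecast
    the other three variables)."""
    camels = state["camels"]
    for _ in range(quarters):
        camels = (COEF["intercept"]
                  + COEF["camels"] * camels
                  + COEF["rwa_growth"] * state["rwa_growth"]
                  + COEF["ncnrp"] * state["ncnrp"]
                  + COEF["equity"] * state["equity"])
    return camels

# A bank resembling the article's example: rapid risk-weighted asset growth,
# heavy NCNRP funding, and thinning capital.
bank = {"camels": 1.0, "rwa_growth": 10.0, "ncnrp": 35.0, "equity": 6.0}
print(round(forecast_camels(bank), 2))  # forecast drifts above the current 1 rating
```

The bank's predicted rating climbs over the three-year horizon, which is exactly the signal LAGS uses to rank institutions by moral-hazard risk.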
In addition, 47.1 percent of the 2-rated banks with a LAGS score at the 99th percentile or above were downgraded or failed within three years. By contrast, only 12.7 percent of banks below the 90th percentile were subsequently downgraded or failed.18

18 As noted, the LAGS coefficients are reestimated every quarter. The numbers reported in this paragraph reflect the estimates actually used in each quarter (rather than, say, the most recent set). In other words, they reflect out-of-sample forecasting ability.

Risk-Focused Models

In addition to becoming more forward-looking, bank-distress models are also evolving to accommodate the risk-focused framework. Several off-site monitoring devices have already been developed by the FDIC and the Federal Reserve, and more are in development. We describe two of these models here.

Real Estate Stress Test. Real estate crises have been perennial causes of bank failures.19 In 2000, the FDIC implemented a Real Estate Stress Test (REST) that attempts to identify those banks and thrifts that are most vulnerable to problems in real estate markets.20 The REST model incorporates the experience of the New England real estate crisis of the early 1990s. Conceptually, the model subjects banks to the same stress as that crisis and forecasts the resulting CAMELS ratings. REST was developed by regressing performance data for New England banks in December 1990 on performance and portfolio data for the same banks in December 1987. These regressions identify the factors that were observable in 1987 that later were associated with safety and soundness concerns. A concentration in construction and development loans is the primary risk factor, but there are a host of secondary factors, such as concentrations in commercial mortgages, commercial and industrial loans, and mortgages on multifamily housing; reliance on noncore funding; and rapid growth.
These regressions are used to forecast measures of bank performance, which are then translated to CAMELS ratings using the SCOR model. The result is a REST rating that ranges from 1 to 5. The output from the model is distributed to FDIC examiners as well as examiners from other federal and state banking agencies. The model has been validated using data from other real estate downturns; it can identify banks that are vulnerable from real estate exposure three to seven years in advance. Because of the long horizon, banks with poor REST ratings are not an immediate concern. More importantly, the model does not consider the underwriting standards and other aspects of risk management that the bank uses to control its exposure to real estate downturns. Consequently, examiners use the output from the REST model for examination planning. The model produces a set of "weights" indicating which variables are the most responsible for the poor rating, giving examiners a sense of the aspects of a bank's operations that deserve the most attention.

19 See Herring and Wachter (1999).

20 See Collier et al. (2003b).

Interest Rate Risk. The savings and loan crisis of the 1980s focused increased attention in the banking industry on interest rate risk. Economists at the Board of Governors responded by developing a duration-based measure of interest rate risk that could be used for surveillance and risk-scoping purposes.21 The model, titled the Economic Value Model (EVM), was launched in the first quarter of 1998 by producing a confidential quarterly surveillance report (called the Focus report) for each commercial bank. The EVM aggregates balance sheet items into various buckets based upon maturity and optionality. The model then uses the duration from a proxy financial instrument for each bucket to calculate the "risk weight," or the change in economic value of those items that would result from a 200-basis-point instantaneous rise in rates.
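The bucket revaluation can be sketched as follows. The buckets and book values are invented, as are all risk weights except the 7.0 percent mortgage decline the article uses as its example; risk weights are written here as signed percentage changes, so a 7.0 percent decline appears as -7.0.

```python
# Each bucket: (book value in $ millions, risk weight = percent change in
# economic value for an instantaneous 200-basis-point rise in rates).
assets = {
    "mortgages_5_to_15yr": (120.0, -7.0),    # the article's example weight
    "short_term_securities": (60.0, -1.5),   # hypothetical
    "loans_over_15yr": (40.0, -10.0),        # hypothetical
}
liabilities = {
    "core_deposits": (150.0, -3.0),          # hypothetical
    "fhlb_advances": (50.0, -2.0),           # hypothetical
}

def change_in_value(buckets):
    """Dollar change in economic value summed across all buckets."""
    return sum(value * weight / 100.0 for value, weight in buckets.values())

d_assets = change_in_value(assets)
d_liabilities = change_in_value(liabilities)
# Predicted change in the economic value of equity is the difference
# between the predicted change in assets and in liabilities.
d_equity = d_assets - d_liabilities
print(f"assets {d_assets:+.1f}, liabilities {d_liabilities:+.1f}, "
      f"equity {d_equity:+.1f} ($ millions)")
```

Because the hypothetical assets reprice more slowly than the liabilities, equity value falls when rates rise; a bank with a large such decline relative to peers would be flagged as an interest-rate-risk outlier.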
For example, the EVM places all residential mortgages that reprice or mature within 5 to 15 years in the same bucket. If the risk weight for the 5- to 15-year mortgages were 7.0, the value of the 5- to 15-year mortgages would be estimated to decline by 7.0 percent following an immediate 200-basis-point rate hike. This calculation is repeated for each balance sheet bucket. The predicted change in economic value of the bank's equity, then, is the difference between the predicted change in assets and the predicted change in liabilities.

Recent research by Sierra and Yeager (2004) shows that the EVM effectively ranks banks by their exposure to rising interest rates. That is, banks that the model predicts to be the most vulnerable to rising interest rates suffer the largest declines in income and equity following an interest rate hike. These banks also show the largest gains in income and equity following interest rate declines. Bank supervisors can use the model's output to rank banks by interest rate risk. If a bank is found to be an outlier, the examiner in charge will emphasize that risk in the next exam.

21 See Embersit and Houpt (1991) and Houpt and Wright (1996) for details.

CONCLUSION

After their initial introduction in the 1970s, studies on the causes of bank distress made rapid progress, fueled by considerable academic interest. In recent years, this interest has waned outside the regulatory community, possibly reflecting a belief that the causes of bank distress are well understood. However, significant legislative, financial, and technological innovations may make it necessary to supplement the prevailing academic and regulatory models with a new generation of forward-looking and risk-focused monitoring systems. Newer forward-looking models at the FDIC and the Federal Reserve include the Growth Monitoring System and the Liquidity and Asset Growth Screen.
Risk-focused models include the Real Estate Stress Test and the Economic Value Model. Additional monitoring devices, such as those analyzing liquidity risk, operational risk, and counterparty risk, seem promising lines of inquiry.

REFERENCES

Altman, Edward I. “Financial Ratios, Discriminant Analysis, and the Prediction of Corporate Bankruptcy.” Journal of Finance, September 1968, 23(4), pp. 589-609.

Altman, Edward I. “Predicting Performance in the Savings and Loan Association Industry.” Journal of Monetary Economics, October 1977, 3(4), pp. 443-66.

Barth, James; Brumbaugh, Dan Jr.; Sauerhaft, Daniel and Wang, George H.K. “Thrift Institution Failures: Estimating the Regulator’s Closure Rule,” in George Kaufman, ed., Research in Financial Services. Greenwich, CT: JAI Press, 1989, pp. 1-25.

Bennett, Rosalind L.; Vaughan, Mark D. and Yeager, Timothy J. “Should the FDIC Worry about the FHLB? The Impact of Federal Home Loan Bank Advances on the Bank Insurance Fund.” Supervisory Policy Analysis Working Paper 2005-1, Federal Reserve Bank of St. Louis, July 2005.

Billett, Matthew T.; Garfinkel, Jon A. and O’Neal, Edward S. “The Cost of Market vs. Regulatory Discipline in Banking.” Journal of Financial Economics, 1998, 48(3), pp. 333-58.

Bovenzi, John F.; Marino, James A. and McFadden, Frank E. “Commercial Bank Failure Prediction Models.” Federal Reserve Bank of Atlanta Economic Review, November 1983, 68, pp. 14-26.

Claessens, Stijn; Glaessner, Thomas and Klingebiel, Daniela. “Electronic Finance: Reshaping the Financial Landscape around the World.” Journal of Financial Services Research, August-October 2002, 22(1-2), pp. 29-61.

Cole, Rebel A. “When Are Thrift Institutions Closed? An Agency-Theoretic Model.” Journal of Financial Services Research, December 1993, 7(4), pp. 283-307.

Cole, Rebel A.; Cornyn, Barbara G. and Gunther, Jeffery W.
“FIMS: A New Monitoring System for Banking Institutions.” Federal Reserve Bulletin, January 1995, 81(1), pp. 1-15.

Cole, Rebel A. and Gunther, Jeffery W. “Predicting Bank Failures: A Comparison of On- and Off-Site Monitoring Systems.” Journal of Financial Services Research, April 1998, 13(2), pp. 103-17.

Collier, Charles; Forbush, Sean; Nuxoll, Daniel A. and O’Keefe, John. “The SCOR System of Off-Site Monitoring.” FDIC Banking Review, Third Quarter 2003a, 15(3), pp. 17-32.

Collier, Charles; Forbush, Sean and Nuxoll, Daniel A. “The Vulnerability of Banks and Thrifts to a Real Estate Crisis.” FDIC Banking Review, Fourth Quarter 2003b, 15(4), pp. 19-36.

Cornett, Marcia M.; Mehran, Hamid and Tehranian, Hassan. “The Impact of Risk-Based Premiums on FDIC-Insured Institutions.” Journal of Financial Services Research, April 1998, 13(2), pp. 153-69.

Cox, D.R. “Regression Models and Life Tables.” Journal of the Royal Statistical Society, 1972, Series B (34), pp. 187-220.

Craig, Ben R. and Thomson, James B. “Federal Home Loan Bank Lending to Community Banks: Are Targeted Subsidies Desirable?” Journal of Financial Services Research, February 2003, 23(1), pp. 5-28.

Demirgüç-Kunt, Asli. “Modeling Large Commercial-Bank Failures: A Simultaneous-Equations Analysis.” Working Paper 8905, Federal Reserve Bank of Cleveland, March 1989.

Duffee, Gregory R. and Zhou, Chunsheng. “Credit Derivatives in Banking: Useful Tools for Managing Risk?” Journal of Monetary Economics, August 2001, 48(1), pp. 25-54.

Embersit, James A. and Houpt, James V. “A Method for Evaluating Interest Rate Risk in U.S. Commercial Banks.” Federal Reserve Bulletin, August 1991, 77(8), pp. 625-37.

Flannery, Mark J. and Rangan, K. “Market Forces at Work in the Banking Industry: Evidence from the Capital Buildup of the 1990s.” Working paper, University of Florida–Gainesville, 2003.

Flannery, Mark J. and Sorescu, Sorin M.
“Evidence of Bank Market Discipline in Subordinated Debenture Yields: 1983–1991.” Journal of Finance, September 1996, 51(4), pp. 1347-77.

Gilbert, R. Alton; Meyer, Andrew P. and Vaughan, Mark D. “The Role of Supervisory Screens and Econometric Models in Off-Site Surveillance.” Federal Reserve Bank of St. Louis Review, November/December 1999, 81(6), pp. 2-27.

Gilbert, R. Alton; Meyer, Andrew P. and Vaughan, Mark D. “Could a CAMELS Downgrade Model Improve Off-Site Surveillance?” Federal Reserve Bank of St. Louis Review, January/February 2002, 84(1), pp. 47-63.

Goldberg, Lawrence G. and Hudgins, Sylvia C. “Depositor Discipline and Changing Strategies for Regulating Thrift Institutions.” Journal of Financial Economics, February 2002, 63(2), pp. 263-74.

Hall, John R.; King, Thomas B.; Meyer, Andrew P. and Vaughan, Mark D. “Did FDICIA Enhance Market Discipline at Community Banks?” in George G. Kaufman, ed., Research in Financial Services: Private and Public Policy. Volume 14. Boston: Elsevier, 2002, pp. 63-94.

Hanweck, Gerald A. “Predicting Bank Failure.” Board of Governors of the Federal Reserve System, Research Papers in Banking and Financial Economics, November 1977, 19.

Helwege, Jean. “Determinants of Savings and Loan Failures: Estimates of a Time-Varying Proportional Hazard Function.” Journal of Financial Services Research, December 1996, 10(4), pp. 373-92.

Herring, Richard J. and Wachter, Susan M. “Real Estate Booms and Banking Busts—An International Perspective.” Occasional Paper No. 58. Group of Thirty, 1999.

Hirtle, Beverly J. and Lopez, Jose A. “Supervisory Information and the Frequency of Bank Examinations.” Federal Reserve Bank of New York Economic Policy Review, April 1999, 5(1), pp. 1-19.

Hooks, Linda M. “Bank Asset Risk: Evidence from Early-Warning Models.” Contemporary Economic Policy, October 1995, 13(4), pp. 36-50.

Houpt, James V. and Wright, David M.
“An Analysis of Commercial Bank Exposure to Interest Rate Risk.” Federal Reserve Bulletin, February 1996, 82(2), pp. 115-28.

Instefjord, Norvald. “Risk and Hedging: Do Credit Derivatives Increase Bank Risk?” Journal of Banking and Finance, February 2005, 29(2), pp. 333-45.

Keeley, Michael C. “Deposit Insurance, Risk, and Market Power in Banking.” American Economic Review, December 1990, 80(5), pp. 1183-200.

King, Thomas B. “Discipline and Liquidity in the Market for Federal Funds.” Supervisory Policy Analysis Working Paper 2003-2, Federal Reserve Bank of St. Louis, October 2005.

Korobow, Leon; Stuhr, David P. and Martin, Daniel. “A Nationwide Test of Early Warning Research in Banking.” Federal Reserve Bank of New York Quarterly Review, Autumn 1977, 2(2), pp. 37-52.

Lane, William R.; Looney, Stephen W. and Wansley, James W. “An Application of the Cox Proportional Hazards Model to Bank Failure.” Journal of Banking and Finance, December 1986, 10(4), pp. 511-31.

Marino, James A. and Bennett, Rosalind L. “The Consequences of National Depositor Preference.” FDIC Banking Review, October 1999, 12(2), pp. 19-38.

Martin, Daniel. “Early Warning of Bank Failure: A Logit Regression Approach.” Journal of Banking and Finance, November 1977, 1, pp. 249-76.

Meyer, Paul A. and Pifer, Howard W. “Prediction of Bank Failures.” Journal of Finance, September 1970, pp. 853-68.

Meyer, Andrew P. and Yeager, Timothy J. “Are Small Rural Banks Vulnerable to Local Economic Downturns?” Federal Reserve Bank of St. Louis Review, March/April 2001, 83(2), pp. 25-38.

Sinkey, Joseph F. Jr. “A Multivariate Statistical Analysis of the Characteristics of Problem Banks.” Journal of Finance, March 1975, 30(1), pp. 21-36.

Sinkey, Joseph F. Jr. “Identifying ‘Problem’ Banks: How Do the Banking Authorities Measure a Bank’s Risk Exposure?” Journal of Money, Credit, and Banking, May 1978, 10(2), pp. 184-93.

Neely, Michelle C.
and Wheelock, David C. “Why Does Bank Performance Vary Across States?” Federal Reserve Bank of St. Louis Review, March/April 1997, 79(2), pp. 27-40.

Nuxoll, Daniel; O’Keefe, John and Samolyk, Katherine. “Do Local Economic Data Improve Off-Site Bank Warning Models?” FDIC Banking Review, 2003, 15(2), pp. 39-53.

Pantalone, Coleen C. and Platt, Marjorie B. “Predicting Commercial Bank Failure since Deregulation.” Federal Reserve Bank of Boston New England Economic Review, July/August 1987, pp. 37-47.

Putnam, Barron H. “Early-Warning Systems and Financial Analysis in Bank Monitoring.” Federal Reserve Bank of Atlanta Economic Review, November 1983, 68, pp. 6-14.

Rose, Peter S. and Scott, William L. “Risk in Commercial Banking: Evidence from Postwar Failures.” Southern Economic Journal, July 1978, 45(1), pp. 90-106.

Sinkey, Joseph F. Jr. and Carter, David A. “Evidence on the Financial Characteristics of Banks That Do and Do Not Use Derivatives.” Quarterly Review of Economics and Finance, Winter 2000, 40(4), pp. 431-49.

Stojanovic, Dusan; Vaughan, Mark D. and Yeager, Timothy J. “Do Federal Home Loan Bank Advances and Membership Lead to More Bank Risk?” in Federal Reserve Bank of Chicago, ed., The Financial Safety Net: Costs, Benefits, and Implications for Regulations—Proceedings of the 37th Annual Conference on Bank Structure and Competition, May 2001, pp. 165-96.

Stuhr, David P. and van Wicklen, Robert. “Rating the Financial Condition of Banks: A Statistical Approach to Aid Bank Supervision.” Federal Reserve Bank of New York Monthly Review, September 1974, pp. 233-8.

Thomson, James B. “Modeling the Bank Regulator’s Closure Option: A Two-Step Logit Regression Approach.” Journal of Financial Services Research, May 1992, 6(1), pp. 5-23.

Wang, George H.K. and Sauerhaft, Daniel. “Examination Ratings and the Identification of Problem/Non-Problem Thrift Institutions.” Journal of Financial Services Research, October 1989, 2(4), pp. 319-42.

Schuermann, Til.
“Why Were Banks Better Off in the 2001 Recession?” Federal Reserve Bank of New York Current Issues in Economics and Finance, January 2004, 10(1), pp. 1-7.

Sierra, Gregory E. and Yeager, Timothy J. “What Does the Federal Reserve’s Economic Value Model Tell Us about Interest Rate Risk at U.S. Community Banks?” Federal Reserve Bank of St. Louis Review, November/December 2004, 86(6), pp. 45-60.

West, Robert Craig. “A Factor-Analytic Approach to Bank Condition.” Journal of Banking and Finance, June 1985, 9(2), pp. 253-66.

Whalen, Gary. “A Proportional Hazards Model of Bank Failure: An Examination of Its Usefulness as an Early Warning Tool.” Federal Reserve Bank of Cleveland Economic Review, First Quarter 1991, pp. 20-31.

Whalen, Gary and Thomson, James B. “Using Financial Data to Identify Changes in Bank Condition.” Federal Reserve Bank of Cleveland Economic Review, Second Quarter 1988, pp. 17-26.

Wheelock, David C. and Wilson, Paul W. “Explaining Bank Failures: Deposit Insurance, Regulation, and Efficiency.” Review of Economics and Statistics, 1995, 77(4), pp. 689-700.

Wheelock, David C. and Wilson, Paul W. “Why Do Banks Disappear? The Determinants of U.S. Bank Failures and Acquisitions.” Review of Economics and Statistics, February 2000, 82(1), pp. 127-38.

Wheelock, David C. and Wilson, Paul W. “The Contribution of On-Site Examination Ratings to an Empirical Model of Bank Failures.” Review of Accounting and Finance, 2005 (forthcoming).

White, Eugene N. The Comptroller and the Transformation of American Banking, 1960-1990. Washington, DC: Office of the Comptroller of the Currency, 1992.

White, Lawrence J. The S&L Debacle: Public Policy Lessons for Bank and Thrift Regulation. New York: Oxford University Press, 1991.

Yeager, Timothy J. “The Demise of Community Banks?
Local Economic Shocks Are Not to Blame.” Journal of Banking and Finance, September 2004, 28(9), pp. 2135-53.

Yeager, Timothy J.; Yeager, Fred C. and Harshman, Ellen. “The Financial Services Modernization Act: Evolution or Revolution?” Federal Reserve Bank of St. Louis Working Paper No. 2004-05, 2005.

Replicability, Real-Time Data, and the Science of Economic Research: FRED, ALFRED, and VDC

Richard G. Anderson

This article discusses the linkages between two recent themes in economic research: “real time” data and replication. These two themes share many of the same ideas, specifically, that scientific research itself has a time dimension. In research using real-time data, this time dimension is the date on which particular observations, or pieces of data, became available. In work with replication, it is the date on which a study (and its results) became available to other researchers and/or was published. Recognition of both dimensions of scientific research is important. A project at the Federal Reserve Bank of St. Louis to place large amounts of historical data on the Internet holds promise to unify these two themes.

Federal Reserve Bank of St. Louis Review, January/February 2006, 88(1), pp. 81-93.

REPLICATION AND REAL-TIME ECONOMETRICS

During the past 25 years, two themes have flowed steadily, albeit often quietly, through economic research: “real time” data and replication. In replication studies, the issue is determining which data were used and whether the author performed the calculations as described; in real-time data studies, the issue is determining the robustness of the study’s findings to data revisions. These themes share the same core idea: that scientific research has an inherent time dimension.
In both real-time data and replication studies, the time dimension is the date on which particular observations, or pieces of data, became available to researchers. Projects at Harvard University and at the Federal Reserve Bank of St. Louis promise to improve the quality of empirical economic research by unifying these themes.1

Although replication studies focus on the correctness of results and real-time studies on their robustness, economic theory suggests that these are related—the likelihood that an author’s error will become visible to other researchers is an inverse function of the cost of conducting tests for replicability and robustness. Yet, for the profession, excessive emphasis on the criminal-detection aspects of replication (Did the author fake the results? Or did the author cease experimenting prematurely when a favorable result appeared?) has tended to increase the reluctance of researchers to share data and program code. That is, to the extent that the profession overemphasizes the manhunt of David Dodge’s 1952 To Catch a Thief, it risks forgoing the benefits of Sir Isaac Newton’s 1676 dictum, “If I have seen further it is by standing on the shoulders of giants.”

The incentives and disincentives for a researcher to share data have been discussed by numerous authors (e.g., Fienberg, Martin, and Straf,

1 The fallacy that neither real-time data nor replication studies are needed because “important” results always will sift to the top through repeated studies is addressed at length in Anderson et al. (2005).

Richard G. Anderson is a vice president and economist at the Federal Reserve Bank of St. Louis. This is a revised version of a paper prepared for the American Economic Association meeting, Philadelphia, PA, January 2005. The author thanks Bruce McCullough of Drexel University, Gary King of Harvard University, and Dan Newlon of the National Science Foundation for their comments. Giang Ho provided research assistance.
© 2006, The Federal Reserve Bank of St. Louis. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

1985; Boruch and Cordray, 1985; Dewald, Thursby, and Anderson, 1986; Feigenbaum and Levy, 1993; Anderson and Dewald, 1994; Bornstein, 1991; Bailar, 2003).2 Researchers receive a stream of rewards for the new knowledge contained in a published article, which begins with publication and eventually tapers to near zero. Furnishing the data to other researchers invites the risk that a replication will demonstrate the article’s results to be false, an event that immediately ends the reward stream. If the replication further uncovers malicious or unprofessional behavior (such as fraud or other unethical conduct), “negative rewards” flow to the researcher.

Creating original research manuscripts for professional journals is craft work. Although often referred to as “knowledge workers,” researchers might equally well be regarded as artisans, with creative tasks that include collecting data, writing code for statistical analysis or model simulation, and authoring the final manuscript.3 Similar to the work of other craftsmen, researchers’ output contains intellectual property—not only the final manuscript, but also the data and programs developed during its creation. Yet, for academic-type researchers, some of the intellectual property must be relinquished so the work can be published in peer-reviewed journals.
This conflict creates a strategic game in which the researcher feels compelled to reveal a sufficient amount of his material to elicit publication, while simultaneously seeking to retain for himself as much of the intellectual property as possible. There are few, if any, models of this process in the economics literature. One such analysis is presented by Anderson et al. (2005), based on the Crawford and Sobel (1982) model of strategic information withholding. A complete presentation of their theoretical analysis is beyond the scope of this paper. The results buttress, however, the commonsense intuition that, so long as withholding data and program code does not reduce the post-publication stream of rewards (and disclosure of data and program code does not increase it), researchers will rationally choose not to disclose data and programs.4 Such models largely explain the well-known proclivity of academic researchers in many disciplines, including economics, to keep secret their data and programs.

For the progress of scientific economic research, such an equilibrium is suboptimal. One solution to suboptimal equilibria is collective action. One collective action is for professional journals to archive data and program code.5 Such archives—which permit low-cost, anonymous, ad hoc replication—can improve the quality of published research by way of an effect reminiscent of Baumol-like credible threats of market entry. This process was well described by the University of Chicago’s John Bailar (2003) at a recent National Research Council conference:

2 Some data cannot be shared. Examples include confidential banking data held by the Federal Reserve; micro data held by the Bureau of the Census; and various financial data, including that licensed by the University of Chicago’s Center for Research in Security Prices (CRSP).
In some cases, the owners/licensors of such data have archived the datasets built and used by individual researchers and made the datasets available to subscribers.

3 Indeed, economists and other scientists often refer to “polishing” a final manuscript, in the spirit of woodworkers or stone masons polishing their work.

4 The model of Feigenbaum and Levy (1993), in which rewards to researchers are driven by citations, also suggests that the divergence between the search for truth and rational individual choice will be largest for younger researchers (such as those without academic tenure), who will be less inclined to search for errors than older researchers and less inclined to devote scarce time to documenting their work.

5 Historical data, cataloged and indexed by the day on which the data became available to the public, often are referred to as “vintage” data.

Of all the public myths about how science is done, one of the broadest and most persistent is that scientific method rests on replication of critical observations [i.e., results]. Straight replication is in fact uncommon, largely, I believe, because no scientist gets much professional credit for straightforward replication unless the findings are critical, there is suspicion of fraud, or there is some other unusual condition such that slavish replication of the methods reported might have some meaning not attached to the first round. Here I exclude replication by an independent investigator for the sole purpose of assuring himself or herself that the original results are correct and that the methods are working properly, as a preliminary to going further in some way. Overall, replication...seems to be one of those ideals that get a fair amount of discussion
Perhaps what is most important is that the original investigators publish background and methods with enough detail and precision for a knowledgeable reader to replicate the study if he had the resources and inclination to do so. [emphasis added] In a recent article, Pesaran and Timmermann (2005, p. 221) offer a formal statement of the correspondence between the universe of all possible datasets and an article’s specific dataset: Let χ denote the time-invariant universe of all possible prediction variables that could be considered in the econometric model, while Nxt is the number of regressors available at time t so X t = (x1t ,…,xNxt ) # χ. Nxt is likely to grow at a faster rate than the sample size, T. At some point there will therefore be more regressors than time-series observations. However, most new variables will represent different measurements of a finite number of underlying economic factors such as output/activity, inflation and interest rates. Below, we use the notation X t to denote the set of all observations [values, measurements], on a fixed list of economic variables, that have been published as of (up to and including) date t. Assuming that an author has not falsified or erroneously transcribed data values, the true dataset for a published article will be contained within the universe of all such datasets X t, where t is no greater than the date on which the original author completed his research. Unfortunately, such datasets often are too large to be compiled by individual researchers. Historical data, cataloged and indexed by the day on which the data became available to the public, are referred to as “vintage” data. Collections of such data—X t in the notation above—are referred to as “real-time” datasets and are indexed by the date of the most recent data included, t. 
The first large-scale project to collect and make available to the public vintage macroeconomic data was started in 1991 by the Federal Reserve Bank of Philadelphia to assess the accuracy of forecasts collected in the Survey of Professional Forecasters (Croushore and Stark, 2001). That project, and its data, is referred to as the Real Time Dataset for Macroeconomists (RTDSM). The design of the Philadelphia RTDSM project seeks to provide snapshots of the values of certain macroeconomic variables as they were available to the public at the end of the 15th day of the center month of each quarter. Hence, although both monthly and quarterly data are included, the dataset’s primary value is for macroeconomic modeling and forecasting at quarterly frequencies.6

Data Projects

This essay discusses two ongoing projects to support the collection and dissemination of vintage data. The first, ALFRED™, is the Federal Reserve Bank of St. Louis’s Archival Federal Reserve Economic Data. (Some features of ALFRED are still under development and may not be available at the time this article is printed; see alfred.stlouisfed.org for an introduction, instructions, and updates.) The second, VDC, is Harvard University’s Virtual Data Center project. The projects differ significantly from each other, and from the Philadelphia project, in several aspects:

• Data frequency: Both the RTDSM and ALFRED projects focus on macroeconomic data. The RTDSM dataset is designed for a quarterly observational data frequency. The ALFRED project is designed for data at a daily frequency or lower (e.g., weekly, biweekly, monthly, etc.)—that is, any data frequency currently supported in the Federal Reserve Bank of St. Louis’s FRED® (Federal Reserve Economic Data) database. The VDC project, discussed further below, focuses on archiving and sharing complete datasets from specific research studies and articles.
As such, it is data-frequency independent.

• Data vintages: The RTDSM project provides snapshots of the values of certain macroeconomic variables as they were known by the public at the close of business on the 15th day of each quarter’s center month. The ALFRED project, operating at a daily frequency, provides daily (end-of-day) snapshots of the values of all variables in the FRED database. The VDC project, because it stores complete datasets as provided by researchers, has no explicit vintage component. The lack of a vintage component restricts the value of its design, for economists, to replication alone—an important function, but distinctly different from studies of robustness that require vintage-indexed data such as RTDSM and ALFRED.

• Data updating: Neither the RTDSM nor the VDC project has a mechanism to automatically update its data vintages. Data are added to RTDSM as Philadelphia staff determine which figures were available to the public on the specified days. Datasets are added to the VDC project as researchers place them on the Internet and the VDC servers index their location. The architecture of ALFRED differs. “Under the hood,” ALFRED and FRED share the same database architecture. In this shared design, data values on FRED that are revised—that is, replaced with newly released numbers—are automatically added to ALFRED as vintage data. Combined with a history of release dates for major economic indicators such as GDP and employment, ALFRED uniquely provides a day-by-day vintage snapshot of the evolution of macroeconomic variables. This architecture, as discussed further below, allows ALFRED to be used for both replication and robustness studies.

6 See the Federal Reserve Bank of Philadelphia’s web site for details and documentation, available at www.phil.frb.org/econ/forecast/reaindex.html, as of October 18, 2005.
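A minimal sketch of this revision-triggered vintage mechanism follows. The data structure, series name, release dates, and values are invented for illustration and are not the actual FRED/ALFRED schema; the point is that superseded values are archived with their vintage dates, so an "as-of" query can reconstruct what the public knew on any given day.

```python
import bisect

class VintageSeries:
    """Toy vintage-indexed series: keeps every (vintage_date, value) pair
    ever released for each observation period, in release order."""

    def __init__(self):
        # period -> list of (vintage_date, value), sorted by vintage_date
        self.vintages = {}

    def release(self, period, vintage_date, value):
        """Record a new or revised value; earlier vintages are kept, not overwritten."""
        self.vintages.setdefault(period, []).append((vintage_date, value))

    def as_of(self, period, date):
        """Return the value for `period` as the public knew it on `date`,
        or None if nothing had been released yet."""
        history = self.vintages.get(period, [])
        dates = [d for d, _ in history]      # ISO dates compare lexicographically
        i = bisect.bisect_right(dates, date)
        return history[i - 1][1] if i > 0 else None

# Hypothetical growth-rate releases for one quarter: advance, then two revisions.
gdp = VintageSeries()
gdp.release("2005Q2", "2005-07-29", 3.4)
gdp.release("2005Q2", "2005-08-31", 3.3)
gdp.release("2005Q2", "2005-09-29", 3.3)

print(gdp.as_of("2005Q2", "2005-08-15"))  # 3.4: only the first release was public
print(gdp.as_of("2005Q2", "2005-10-01"))  # 3.3: the latest revision
print(gdp.as_of("2005Q2", "2005-07-01"))  # None: not yet released
```

The two uses the text distinguishes map directly onto the query: replication fixes `date` at the day a study's author collected the data, while a robustness study loops `date` over many vintages and re-estimates the model on each snapshot.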
Further discussion of ALFRED’s time-indexed architecture is contained in a subsequent section.

The Literature

Studies that explore the sensitivity of empirical results to data vintage, and how policymaking might incorporate the data revision process, have a long history. Early studies are reviewed by Croushore and Stark (2001). Selected recent studies include Neely, Roy, and Whiteman (2001), Orphanides (2001), Stark and Croushore (2002), Christoffersen, Ghysels, and Swanson (2002), Orphanides and van Norden (2002, 2003), Bernanke and Boivin (2003), Faust, Rogers and Wright (2003), Koenig, Dolmas and Piger (2003), Svensson and Woodford (2003, 2004), Clark and Kozicki (2004), and Kishor and Koenig (2005). The analysis of vintage, or “real-time,” data also is a popular conference topic—examples include the Federal Reserve Bank of Philadelphia’s 2001 “Conference on Real Time Data Analysis,”7 the Bundesbank’s 2004 conference “Real-Time Data and Monetary Policy,”8 and the CIRANO/Bank of Canada’s October 2005 “Workshop on Macroeconomic Forecasting, Analysis and Policy with Data Revision.”9

One of the earlier experiments to demonstrate the dependence of empirical results on data vintage is reported in Dewald, Thursby, and Anderson (1986). During their project at the Journal of Money, Credit, and Banking from 1982 to 1984, a large number of authors, when asked to submit datasets and programs, replied that they did not save publicly available macroeconomic data because the data could easily be collected from published sources and their empirical results were nearly invariant to the vintage of the data.10 To test this assertion, Dewald, Thursby, and Anderson examined in detail one article, Goldberg and Saunders (1981), which contains a model of the growth of foreign banks in the United States. The article’s authors furnished their banking data (which had required considerable effort to collect) but not their macroeconomic data, which they said had been collected from various issues of the Survey of Current Business and Federal Reserve Bulletin, with no record made of which numbers were obtained from which issues. Dewald, Thursby, and Anderson collected from the Survey of Current Business all published values of three macroeconomic variables used in the article (imports, investment, and GNP) during the period 1972:Q4 through 1982:Q3.11 From these data, they estimated 500 variants of the Goldberg-Saunders model, summarizing the results in a set of histograms (Dewald, Thursby, and Anderson, 1986, p. 599). Overall, the coefficient estimates obtained varied widely, and the modal values often were far from the coefficients in the Goldberg and Saunders article.

7 Program and papers were available at www.phil.frb.org/econ/conf/rtdaconf.html, as of October 18, 2005.

8 Program and papers were available at www.bundesbank.de/vfz/vfz_daten.en.php, as of October 18, 2005.

9 Program and papers were available at www.cirano.qc.ca/financegroup/Real-timeData/program.php, as of October 18, 2005.

10 Even authors who submitted data seldom noted the dates on which they collected the data or the dates on which the data had been published.

DATA SHARING

The arguments above suggest that the quality of empirical economic science is positively correlated with the extent to which researchers preserve and share datasets and program code. This theme is commonplace in science. In 1979, the Committee on National Statistics of the National Research Council (of the National Academy of Sciences) sponsored a conference on the role of data sharing in social science research.
A subsequent subcommittee on sharing research data stated the issues clearly (Fienberg, Martin, and Straf, 1985, pp. 3-4):

Data are the building blocks of empirical research, whether in the behavioral, social, biological, or physical sciences. To understand fully and extend the work of others, researchers often require access to data on which that work is based. Yet many members of the scientific community are reluctant or unwilling to share their data even after publication of analyses of them. Sometimes this unwillingness results from the conditions under which data were gathered; sometimes it results from a desire to carry out further analyses before others do; and sometimes it results from the anticipated costs, in time or money, or both. The Committee on National Statistics believes that sharing scientific data with colleagues reinforces the practice of open scientific inquiry. Cognizant of the often substantial costs to the original investigator for sharing data, the committee seeks to foster attitudes and practices within the scientific community that encourage researchers to share data with others as much as feasible.

The subcommittee offered 16 recommendations (see the appendix) for improving the quality of social science research through data sharing. The recommendations are so straightforward as to seem self-evident: Sharing data should be standard practice; researchers should retain data for a reasonable period after publication; researchers requesting data should bear the costs of providing data; funding organizations should encourage data sharing by requesting a data-sharing plan in requests for funding; and journals should encourage authors to share data.

11 The published Goldberg-Saunders article used data through 1980:Q1; Dewald, Thursby, and Anderson collected data through the December 1982 issue of the Survey.
Yet, two decades later, most of these are not yet standard operating procedures in economic research.

At the time of the National Research Council's 1979 conference, the National Science Foundation's (NSF) policies embodied many of the Council's later recommendations. The NSF Grant Policy Manual NSF-77-47, as revised October 1979, states the following in paragraph 754.2:

Data banks and software, produced with the assistance of NSF grants, having utility to others in addition to the grantee, shall be made available to users, at no cost to the grantee, by publication or, on request, by duplication or loan for reproduction by others...Any out of pocket expenses incurred by the grantee in providing information to third parties may be charged to the third party.

Subsequent to publication of Dewald, Thursby, and Anderson (1986), the NSF's social science program adopted a policy of requiring that investigators place data and software in a public archive after their award expired.[12] The NSF also began asking researchers, in applications for subsequent funding, what data and software from previous awards had been disseminated.

Today, the NSF policy is clear. The NSF's current Grant Proposal Guide (NSF 04-23, effective September 2004), section VI, paragraph I, states that NSF

advocates and encourages open scientific communication...It expects PIs [principal investigators] to share with other researchers, at no more than incremental cost and within a reasonable time, the data, samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages grantees to share software and inventions, once appropriate protection for them has been secured, and otherwise act to make the innovations they embody widely useful and usable.

[12] I am indebted to Dan Newlon, head of the economics program at the National Science Foundation, for the information contained in this paragraph.
NSF program management will implement these policies, in ways appropriate to field and circumstances, through the proposal review process; through award negotiations and conditions; and through appropriate support and incentives for data cleanup, documentation, dissemination, storage and the like. Adjustments and, where essential, exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results and the integrity of collections, or to accommodate legitimate interests of investigators.

The NSF's Grant Policy Manual (NSF 02-151, effective August 2, 2002), paragraph 734, "Dissemination and Sharing of Research Results," contains similar statements:

b. Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. Privileged or confidential information should be released only in a form that protects the privacy of individuals and subjects involved. General adjustments and, where essential, exceptions to this sharing expectation may be specified by the funding NSF Program or Division for a particular field or discipline to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate the legitimate interest of investigators. A grantee or investigator also may request a particular adjustment or exception from the cognizant NSF Program Officer.

c. Investigators and grantees are encouraged to share software and inventions created under the grant or otherwise make them or their products widely available and usable.

d.
NSF normally allows grantees to retain principal legal rights to intellectual property developed under NSF grants to provide incentives for development and dissemination of inventions, software and publications that can enhance their usefulness, accessibility and upkeep. Such incentives do not, however, reduce the responsibility that investigators and organizations have as members of the scientific and engineering community, to make results, data and collections available to other researchers. With such a strong policy in place, data warehousing in economic research should be commonplace. In fact, it remains rare. As of this writing, 10 professional economics journals have data and/or program archives: American Economic Review; Econometrica; Macroeconomic Dynamics; Journal of Money, Credit, and Banking; Federal Reserve Bank of St. Louis Review; Economic Journal; Journal of Applied Econometrics; Review of Economic Studies; Journal of Political Economy; and Journal of Business and Economic Statistics. Some require both data and program files, others only data. In addition, a public archive for data and programs from published articles in any professional journal has been maintained since 1995 by the Inter-university Consortium for Political and Social Research at the University of Michigan; except for articles related to the Panel Study of Income Dynamics at Michigan, all of the economics-related articles’ data and programs in the archive are from the Federal Reserve Bank of St. Louis Review. The ALFRED project has the potential, for macroeconomic research, to eliminate the need for journals to store authors’ datasets. ALFRED, as explained further below, will sharply reduce the costs that a researcher in macroeconomics would bear in documenting, storing, and distributing the data used in a research project. 
In this respect, data archives at journals are a "complementary technology" to the vintage-archive structure of ALFRED and, in both concept and execution, are more similar to the dataset-archiving-and-indexing design of the VDC project.

DATA WAREHOUSING: THE FRASER AND ALFRED PROJECTS

The collection and distribution of data have classic public-good characteristics, including economies of scale, network effects, and first-mover advantages. Yet, large-scale systems for archiving and distributing economic data are rare. As noted above, data warehousing for "real time" economic research began with the Federal Reserve Bank of Philadelphia (Croushore and Stark, 2001, p. 112):

In creating our real-time data set, our goal is to provide a basic foundation for research on issues related to data revision by allowing researchers to use a standard data set, rather than collecting real-time data themselves for every different study.

Two current data-warehousing projects of the Federal Reserve Bank of St. Louis are in the same spirit: FRASER® (Federal Reserve Archival System for Economic Research) and ALFRED. Together, these projects seek to provide a comprehensive archive of economic statistical publications and data. Initially, the projects will focus on government macroeconomic data but eventually will be expanded to include less aggregate data.

FRASER

The FRASER project (fraser.stlouisfed.org) is an Internet archive of images of statistical publications. The long-term goal, essentially, is to include all the statistical documents ever published by the U.S. government, plus other selected documents from both private and public sources. FRASER is an "open standards" project—that is, any organization that wishes to submit images is encouraged to do so, provided that the images satisfy the requirements suggested by the U.S.
Government Printing Office's committee of experts on digital preservation.[13] To date, however, most contributions have been boxes of printed paper materials, rather than images.

ALFRED

The ALFRED project is an archive of machine-readable real-time data. Since 1989, the Federal Reserve Bank of St. Louis has provided data to the public on its FRED system.[14] Initially, ALFRED will be populated with archived FRED data beginning December 1996. Later, other historical data will be added, including data extracted from FRASER images. Links will be provided between ALFRED data and the historical FRASER publications. The purpose of the FRED data system is to distribute the most recent value for each variable and date; the purpose of the ALFRED system is to distribute, in a similar manner, all data values previously entered into FRED, plus additional data. (Again, ALFRED is a work under construction; some features outlined here are part of the ALFRED architecture, and some are not yet available.)

ALFRED is a relational database built on PostgreSQL. Users are able to request subsets of data by means of an automated interface. In ALFRED, every data point is tagged with the name of its source and its publication date. For real-time projects, a researcher need only submit a variable list and a desired range of vintages (that is, "as of" dates); ALFRED will return all values for those variables that were available during the specified date range, each tagged with its as-of date (that is, its publication date). For replication studies, in the unlikely circumstance that the original researcher used the most recently published values for all variables and dates, a researcher need submit only a list of variable names and the as-of date on which the original researcher collected his data.
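This vintage-query semantics can be sketched in a few lines of Python. Nothing here is ALFRED's actual interface: the row layout and the function name `vintage_query` are illustrative assumptions. The sample values are the published 2004:Q1 GDP revisions discussed later in the article (Table 1).

```python
from datetime import date

# Hypothetical mini-archive: each value tagged with its source and its
# publication (as-of) date, in the spirit of ALFRED. The values are the
# 2004:Q1 GDP revisions from Table 1; the tuple layout is an assumption.
ARCHIVE = [
    ("GDP", "BEA", date(2004, 4, 29), 11447.8),  # "advance"
    ("GDP", "BEA", date(2004, 5, 27), 11459.6),  # "preliminary"
    ("GDP", "BEA", date(2004, 6, 25), 11451.2),  # "final"
]

def vintage_query(variables, start, end):
    """Return every value of the requested variables that was published
    during [start, end], each tagged with its as-of date."""
    return [(name, pub, value)
            for name, _src, pub, value in ARCHIVE
            if name in variables and start <= pub <= end]

# Everything a researcher collecting GDP in May-June 2004 could have seen:
rows = vintage_query({"GDP"}, date(2004, 5, 1), date(2004, 6, 30))
```

A replicator who knows only that the original data were collected "sometime in late spring 2004" can thus retrieve every candidate vintage in one request.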
In the more common circumstance, in which the original researcher is uncertain whether he collected the then-most-recent data, a putative range of collection (as-of) dates may be submitted; with luck, some mixture of the retrieved values perhaps will reproduce the original published results. Combined, the FRASER and ALFRED projects are a "statistical time machine" that, on request, furnishes to researchers both universes of data, χ, and time-indexed "real-time" subsets, Nxt.

[13] U.S. Government Printing Office (2004a); Federal Reserve Bank of St. Louis (2004).
[14] Initially, FRED operated as a dial-up computer bulletin board system. In 1995, shortly after the release of version 1.0 of the Mosaic web browser, FRED appeared as a web site on the Internet.

As of this writing (October 2005), portions of ALFRED remain in development and not all features are implemented. Essential, however, will be a scheme to uniquely identify the historiography of data retrieved from ALFRED. The current design proposal includes the concept of a research dataset signature, or RDS. The proposed RDS is a human-readable string of ASCII characters that uniquely identifies a data series extracted from FRED or ALFRED. A dataset containing multiple time series (variables) will have an RDS for each series. The proposed character encoding pattern is as follows:

• 1-20: the FRED/ALFRED variable name[15];
• 21-22: the frequency code, e.g., "A", "Q", "BW";
• 23-25: the seasonal adjustment code, either the string "SA" or "NSA"[16];
• 26-34: the 24-hour Greenwich Mean Time date on which the data were downloaded, in the form DDMMMYYYY (that is, the form "27JAN2001");
• 35-42: the 24-hour Greenwich Mean Time time-of-day at which the data were downloaded, in the form HH:MM:SS (HH=hour, MM=minute, SS=second).
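The proposed layout can be made concrete with a short sketch. The RDS is only a design proposal, and `make_rds` below is my own illustrative helper, not ALFRED code; it assumes character fields are left-justified and padded on the right with spaces, as the proposal specifies.

```python
from datetime import datetime

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def make_rds(name, freq, sa, stamp):
    """Assemble a 42-character RDS per the proposed layout:
    cols 1-20  variable name (left-justified, space-padded),
    cols 21-22 frequency code,
    cols 23-25 seasonal adjustment code ("SA" or "NSA"),
    cols 26-34 GMT download date, DDMMMYYYY,
    cols 35-42 GMT download time, HH:MM:SS."""
    gmt_date = "%02d%s%04d" % (stamp.day, MONTHS[stamp.month - 1], stamp.year)
    gmt_time = stamp.strftime("%H:%M:%S")
    return name.ljust(20) + freq.ljust(2) + sa.ljust(3) + gmt_date + gmt_time

# A quarterly, seasonally adjusted series downloaded at 14:30:05 GMT
# on January 27, 2001:
sig = make_rds("GDP", "Q", "SA", datetime(2001, 1, 27, 14, 30, 5))
```

Because the string is fixed-width plain ASCII, it can be pasted directly into a working paper and later parsed by simple slicing.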
Greenwich Mean Time is used because it does not vary with the geographic location of the researcher or the season of the year.[17] These date and time-of-day formats are sufficiently general to accommodate researchers located anywhere in the world. Within each RDS string, character fields will be left-justified and padded on the right with spaces (ASCII 32). The RDS is somewhat shorter than might be anticipated because it does not include a "start" and "end" date. In FRED and ALFRED, users are not permitted to select and download a subset of a time series; a time series must be downloaded in its entirety or not at all. (The user is free to discard any unwanted data after download.) This permits a shorter signature.[18] Finally, as plain text, RDS strings may easily be included in working papers and journal articles.

[15] Currently, all FRED/ALFRED variable names are 8 characters in length and composed only of the characters 0 to 9 and A to Z—that is, ASCII characters 48 to 57 and 65 to 90. The signature's name field, to allow future expansion, is 20 characters in length and allows the underscore, ASCII 95, as well as 0 to 9 and A to Z.
[16] In the current FRED nomenclature, the seasonally adjusted character of the series can be inferred from the variable name. This field is included to increase the human usability of the signature and for possible future expansion of the FRED/ALFRED nomenclature.
[17] Greenwich Mean Time is named for the Royal Observatory at Greenwich, England. A discussion of Greenwich Mean Time is available at www.greenwichmeantime.com, as are conversions to local time zones.

The proposed architecture for ALFRED follows, in part, the bitemporal SQL (structured query language) database structure of Snodgrass and Jensen (1999).[19] In this design, each datum (observation) for each time series will be stored with three 2-element date vectors.
One vector demarcates the beginning and end of the measurement interval for the observation, the second the beginning and end of the validity interval, and the third the beginning and end of the transaction interval.

Measurement intervals are straightforward. The measurement interval for GDP during 2004:Q1, for example, would be {1Jan2004, 31Mar2004}; a daily interest rate might have an interval of the form {5Jan2004, 5Jan2004}; and a monthly average interest rate might have an interval of the form {1Jan2004, 31Jan2004}. This system encompasses, in a uniform way, all data frequencies.

Validity intervals demarcate the time periods during which a datum was the most recently published value. As an example, consider 2004:Q1 GDP (see Table 1). During 2004, the Bureau of Economic Analysis published four measurements of 2004:Q1 GDP: April 29 ("advance"), May 27 ("preliminary"), June 25 ("final"), and July 30 (this fourth value has no commonly used label). During 2005, the BEA published one measurement, on July 29, as part of the 2005 benchmark revisions. The validity intervals shown in Table 1 reflect these dates. Note that the fifth validity interval is open-ended and will remain so until the next revised value is published.

[18] Internally, the database software distinguishes between dates on which a variable is not defined (such as the Federal Reserve's M2 monetary aggregate prior to 1959) and dates on which the series is defined (that is, was visible to observers monitoring the series at that date) but for which values are missing because, for example, certain printed publications cannot be located.
[19] The database design is due to George Essig, senior web developer, Federal Reserve Bank of St. Louis.
Table 1
Revision History for 2004:Q1 GDP, April 29 through December 29, 2004

Value     Variable name   Measurement interval     Validity interval        Transaction interval
11447.8   GDP             {1Jan2004, 31Mar2004}    {29Apr2004, 26May2004}   {29Apr2004, 31Dec9999}
11459.6   GDP             {1Jan2004, 31Mar2004}    {27May2004, 24Jun2004}   {27May2004, 31Dec9999}
11451.2   GDP             {1Jan2004, 31Mar2004}    {25Jun2004, 29Jul2004}   {25Jun2004, 31Dec9999}
11472.6   GDP             {1Jan2004, 31Mar2004}    {30Jul2004, 28Jul2005}   {30Jul2004, 31Dec9999}
11457.1   GDP             {1Jan2004, 31Mar2004}    {29Jul2005, 31Dec9999}   {29Jul2005, 31Dec9999}

Transaction intervals show dates on which Federal Reserve Bank of St. Louis staff entered or changed data values. The first date is the date on which the datum was added to the database. The ending date of the interval is infinity (open-ended) and will remain so indefinitely unless the datum (in the first column) is erroneous. Erroneous values in the database never are changed or removed—doing so would destroy the database's historical integrity. Once a datum has been made visible to web site customers, integrity of the database requires that the row never be modified or deleted. Instead, erroneous values are corrected by adding an additional row to the database for the same measurement and validity intervals. When a new row is added to correct an error, the end date of the erroneous row's validity interval will be set to the day on which the new row is added, and the start date of the new row's transaction interval will be set to the same date.[20] A customer selecting a date interval that includes the correction date will receive both the original erroneous datum and the corrected datum, plus a message warning that the observation for that date was corrected. The customer is responsible for checking his empirical results using both values. The initial version of ALFRED will not include transaction intervals.
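Validity intervals reduce the question "which value was current on date d?" to a simple containment test. The sketch below uses the Table 1 revision history; the tuple layout is my own assumption, and, as in the table, a far-future date stands in for an open-ended interval.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # stands in for an open-ended interval

# 2004:Q1 GDP revision history from Table 1:
# (value, validity_start, validity_end)
HISTORY = [
    (11447.8, date(2004, 4, 29), date(2004, 5, 26)),
    (11459.6, date(2004, 5, 27), date(2004, 6, 24)),
    (11451.2, date(2004, 6, 25), date(2004, 7, 29)),
    (11472.6, date(2004, 7, 30), date(2005, 7, 28)),
    (11457.1, date(2005, 7, 29), OPEN_END),
]

def value_as_of(day):
    """Return the most recently published 2004:Q1 GDP value as of `day`,
    or None if `day` precedes the advance estimate."""
    for value, start, end in HISTORY:
        if start <= day <= end:
            return value
    return None
```

A replicator reconstructing a study dated mid-June 2004 would retrieve the "preliminary" value; a query dated after July 29, 2005, falls in the open-ended interval and retrieves the benchmark-revision value.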
This omission matters not at all so long as data never are changed after being made visible to researchers via the Internet. Because initial data will be machine-loaded from archival files, data entry errors and corrections are unlikely. Programming ALFRED with the three-interval architecture is significantly more difficult than with a two-interval (measurement, validity) design and would significantly lengthen ALFRED's development.

[20] Once created, the integrity of the database requires that every row be retained in the database; otherwise, entering the same data signature string on different dates would retrieve different data, which is unacceptable. Including a transaction interval is an essential design element in a database system that guarantees time-invariance of retrieved data when the data are subject both to revision by the publisher and to possible human data-entry error.

The ALFRED project will make it unnecessary to archive datasets for studies based on data obtained from FRED, so long as the author retains the RDS signatures for the dataset.[21] Yet, what of the careless or forgetful researcher who does not retain the RDS? For such researchers, the current design includes automatic archiving and retrieval of RDS data if the researcher signs up for a user account. After doing so, they will be offered the opportunity to save, on FRED and ALFRED, the RDS strings for every series they download. Putative replicators need only ask the original researcher to retrieve the RDS strings and make them available. The ALFRED system, when completed, promises to unify the concepts of real-time data and replication in economic research.

[21] An additional, related part of the FRASER/ALFRED project, nearing completion, is a catalog of available federal, state, and local data series.
In its intent and structure, the catalog resembles the "Statistical Knowledge Network" discussed by Hert, Denn, and Haas (2004), except that the catalog does not attempt to explain or instruct in the ways that the data might be used to conduct economic analyses, as these authors suggest their metadata might be able to do. Also, to the extent that descriptive metadata are stored as XML tags, the design of Hert, Denn, and Haas is compatible with the VDC/DDI initiative to cross-index data from various servers on the Internet. Further, since the St. Louis economic data catalog will index data from all government agencies, it partially circumvents the barrier to cross-government IT collaboration discussed by Mullen (2003), although the GPO Access project (U.S. Government Printing Office, 2001) has been charged by the Congress to promote electronic dissemination of data and documents.

THE VDC PROJECT

The VDC project of Harvard University, similar to ALFRED, has as its goal increasing the replicability of research.[22] Unlike ALFRED, however, the VDC project itself does not include data collection. Rather, the heart of the VDC project is to provide a low-cost, integrated suite of software that will give other researchers a forum for archiving and sharing data. More precisely, the VDC project furnishes "an OSS [open-source software] digital library system 'in a box' for numeric data" that "provides a complete system for the management, dissemination, exchange, and citation of virtual collections of quantitative data."[23] An essential component of the VDC project's architecture is the set of tools that encourage researchers—including individuals, professional journals, and research institutions—to use a single set of formatting and labeling standards when they place datasets on the Internet. In turn, a loosely coupled web of VDC servers will locate, index, and catalog the datasets, making them available to other researchers.
The VDC's proposed formatting and labeling standards are those of the University of Michigan's Data Documentation Initiative (DDI) project, an accepted standard in the document and knowledge management arena.[24] The formatting consists solely of inserting plain-text XML tags within text data files, easily done within many programs or a simple text editor. If successful, the VDC project promises the type of network effects, well known to economists, that accompany (and drive) the adoption of standards. Because each VDC node maintains a catalog of materials held on other VDC nodes, as the VDC network expands, additional researchers will find it increasingly attractive to join, so as to make their work visible to the growing community.

[22] Altman et al. (2001, p. 464).
[23] See http://thedata.org/. The VDC and St. Louis projects share the same core open-source components: Linux, Apache, and PostgreSQL. The St. Louis middleware is coded as server-side PHP scripts, similar in spirit if not code to the Java servlets used in VDC.
[24] On DDI, see Blank and Rasmussen (2004) and www.icpsr.umich.edu/DDI/.

A COMPARISON OF PROJECTS: ALFRED AND VDC

Both the VDC and ALFRED projects provide tools that promise to improve the scientific quality of empirical economic research, but their philosophies and architectures differ. Altman et al. (2001), for example, write that "The basic object managed in the [VDC] system is the study" [emphasis added]. In the St. Louis ALFRED project, the basic objects managed are a published study's set of signature strings, which permit repeated extraction of the same dataset from an underlying, encompassing database. Because the VDC project focuses on preserving specific datasets from specific studies, it is well suited for archiving both experimental and non-experimental data.
The ALFRED project's focus on archiving vintages of non-experimental data makes it better suited to macroeconomic research, in both real-time and replication studies. In replication studies, the issue is determining which data were used in a study and whether the calculations were performed as described; in real-time data studies, the issue is determining the robustness of the study's findings to data revisions. At least for aggregate macroeconomic data, an archival system that can do two things—provide a later investigator with the previous researcher's original data as well as provide earlier and later published values of the same variables—has the promise of combining a "simple" replication study with a real-time, data-based robustness study. In Pesaran and Timmermann's notation, the archival system must be able to produce, on demand, both the universe of all observations on the variables of interest, χ, and all possible time-indexed "real time" subsets, Nxt. Although careful use of XML tags in a VDC/DDI system might permit support for such real-time econometrics, it likely would require attaching XML tags to each data point in each study.

CONCLUSION

Data archiving, data sharing, and replication are hallmarks of science, necessary to explore the correctness of published results. Real-time data studies are important to address the robustness of published results. These two lines of inquiry are linked by their recognition that empirical economic research is inherently time-indexed. Although quite different, the VDC and ALFRED projects promise to assist and improve the quality of empirical economic research by reducing the cost of both lines of inquiry.

REFERENCES

Altman, Micah; Andreev, Leonid; Diggory, Mark; King, Gary; Sone, Akio; Verba, Sidney; Kiskis, Daniel L. and Krot, Michael.
"A Digital Library for the Dissemination and Replication of Quantitative Social Science Research: The Virtual Data Center." Social Science Computer Review, 2001, 19, pp. 458-70.

Anderson, Richard G.; Greene, William H.; McCullough, Bruce D. and Vinod, H.D. "The Role of Data & Program Code Archives in the Future of Economic Research." Working Paper 2005-14, Federal Reserve Bank of St. Louis, 2005.

Anderson, Richard G. and Dewald, William G. "Replication and Scientific Standards in Applied Economics a Decade After the Journal of Money, Credit and Banking Project." Federal Reserve Bank of St. Louis Review, November/December 1994, 76(6), pp. 79-83.

Bailar, John C. "The Role of Data Access in Scientific Replication." Presented at the October 16-17, 2003, workshop "Confidential Data Access for Research Purposes," held by the Panel on Confidential Data Access for Research Purposes, Committee on National Statistics, National Research Council, 2003; www7.nationalacademies.org/cnstat/John_Bailar.pdf.

Bernanke, Ben S. and Boivin, Jean. "Monetary Policy in a Data-Rich Environment." Journal of Monetary Economics, 2003, 50, pp. 525-46.

Blank, Grant and Rasmussen, Karsten Boye. "The Data Documentation Initiative." Social Science Computer Review, Fall 2004, 22(3), pp. 307-18.

Bornstein, Robert F. "Publication Politics, Experimenter Bias and the Replication Process in Social Science Research," in James Neuliep, ed., Replication Research in the Social Sciences. Thousand Oaks, CA: Sage Publications, 1991, pp. 71-84.

Boruch, Robert F. and Cordray, David S. "Professional Codes and Guidelines in Data Sharing," in Stephen E. Fienberg, Margaret E. Martin, and Miron L. Straf, eds., Sharing Research Data. Washington, DC: National Academy Press, 1985; www.nap.edu/openbook/030903499X/html/index.html.

Christoffersen, Peter; Ghysels, Eric and Swanson, Norman R.
"Let's Get 'Real' About Using Economic Data." Journal of Empirical Finance, 2002, 9, pp. 343-60.

Clark, Todd E. and Kozicki, Sharon. "Estimating Equilibrium Real Interest Rates in Real Time." Working Paper 04-08, Federal Reserve Bank of Kansas City, September 2004.

Crawford, Vincent P. and Sobel, Joel. "Strategic Information Transmission." Econometrica, November 1982, 50(6).

Croushore, Dean and Stark, Tom. "A Real-Time Data Set for Macroeconomists." Journal of Econometrics, 2001, 105, pp. 111-30.

Dewald, William G.; Thursby, Jerry G. and Anderson, Richard G. "Replication in Empirical Economics: The Journal of Money, Credit and Banking Project." American Economic Review, September 1986, 76(4), pp. 587-603.

Faust, Jon; Rogers, John H. and Wright, Jonathan H. "Exchange Rate Forecasting: The Errors We've Really Made." Journal of International Economics, May 2003, 60(1), pp. 35-59.

Federal Reserve Bank of St. Louis. "As It Happened: Economic Data and Publications as Snapshots in Time," presentation by Robert Rasche, Katrina Stierholz, Robert Suriano, and Julie Knoll at the Fall Federal Depository Librarians Conference, October 19, 2004, Washington, DC; fraser.stlouisfed.org/fdlp_final.pdf.

Feigenbaum, S. and Levy, D. "The Market for (Ir)Reproducible Econometrics" and "Response to the Commentaries." Social Epistemology, 1993, 7(3), pp. 215-32 and pp. 286-92.

Fienberg, Stephen E.; Martin, Margaret E. and Straf, Miron L., eds. Sharing Research Data. Washington, DC: National Academy Press, 1985; www.nap.edu/openbook/030903499X/html/index.html.

Goldberg, Lawrence G. and Saunders, Anthony. "The Growth of Organizational Forms of Foreign Banks in the U.S." Journal of Money, Credit, and Banking, August 1981, pp. 365-74.

Hert, Carol A.; Denn, Sheila and Haas, Stephanie W. "The Role of Metadata in the Statistical Knowledge Network." Social Science Computer Review, Spring 2004, 22(1), pp. 92-99.

Kishor, N. Kundan and Koenig, Evan F.
"VAR Estimation and Forecasting When Data Are Subject to Revision." Working Paper 2005-01, Federal Reserve Bank of Dallas, February 2005.

Koenig, Evan F.; Dolmas, Sheila and Piger, Jeremy. "The Use and Abuse of Real-Time Data in Economic Forecasting." Review of Economics and Statistics, 2003, 85, pp. 618-28.

Mullen, Patrick R. "The Need for Government-Wide Information Capacity." Social Science Computer Review, Winter 2003, 21(4), pp. 456-63.

National Research Council. Access to Research Data in the 21st Century: An Opening Dialogue Among Interested Parties. Report of a workshop on the Shelby Amendment held by the Science, Technology and Law Panel of the National Research Council, March 12, 2001. Washington, DC: National Academy Press, 2002.

Neely, Christopher J.; Roy, Amlan and Whiteman, Charles H. "Risk Aversion versus Intertemporal Substitution: A Case Study of Identification Failure in the Intertemporal Consumption Capital Asset Pricing Model." Journal of Business and Economic Statistics, October 2001, 19(4), pp. 395-403.

Orphanides, Athanasios. "Monetary Policy Rules Based on Real-Time Data." American Economic Review, September 2001, 91(4), pp. 964-85.

Orphanides, Athanasios and van Norden, Simon. "The Unreliability of Output-Gap Estimates in Real Time." Review of Economics and Statistics, November 2002, pp. 569-83.

Orphanides, Athanasios and van Norden, Simon. "The Reliability of Inflation Forecasts Based on Output Gap Estimates in Real Time." Working Paper 2003s-01, CIRANO, HEC, Montreal, 2003.

Pesaran, Hashem and Timmermann, Allan. "Real-Time Econometrics." Econometric Theory, 2005, 21(1), pp. 212-31.

Snodgrass, Richard T. and Jensen, Christian S. Developing Time-Oriented Database Applications in SQL. San Francisco, CA: Morgan Kaufmann, 1999.

Stark, Thomas and Croushore, Dean. "Forecasting with a Real-Time Dataset for Macroeconomists." Journal of Macroeconomics, 2002, 24, pp. 507-31.

Svensson, Lars E.O. and Woodford, Michael.
"Indicator Variables for Optimal Policy." Journal of Monetary Economics, 2003, 50, pp. 691-720.

Svensson, Lars E.O. and Woodford, Michael. "Indicator Variables for Optimal Policy Under Asymmetric Information." Journal of Economic Dynamics and Control, 2004, 28, pp. 661-90.

U.S. Government Printing Office. Biennial Report to Congress on the Status of GPO Access. Washington, DC: U.S. GPO, 2001; www.gpoaccess.gov/biennial/index.html.

U.S. Government Printing Office. Report on the Meeting of Experts on Digital Preservation. Washington, DC: U.S. GPO, March 12, 2004a; www.gpoaccess.gov/about/reports/preservation.pdf.

U.S. Government Printing Office. Concept of Operations for the Future Digital System. Washington, DC: U.S. GPO, October 1, 2004b; www.gpo.gov/news/2004/ConOps_1004.pdf.

APPENDIX

Recommendations of the Committee on National Statistics of the National Research Council and the National Academy of Sciences

The Committee on National Statistics' final report (Fienberg, Martin, and Straf, 1985) offered 16 specific recommendations regarding sharing research data. These recommendations, little noticed during the past 20 years, are as relevant today as then. Taken together, they form a foundation, or body of knowledge, for best practice in empirical scientific research. Their recommendations are reproduced here because, although they sound scientific and sensible, most have been ignored in economic science.

For all researchers:

1. Sharing data should be a regular practice.

For initial investigators:

2. Investigators should share their data by the time of publication of initial major results of analyses of the data, except in compelling circumstances.

3. Data relevant to public policy should be shared as quickly and widely as possible.

4.
Plans for data sharing should be an integral part of a research plan whenever data sharing is feasible.

5. Investigators should keep data available for a reasonable period after publication of results from analyses of the data.

For subsequent analysts:

6. Subsequent analysts who request data from others should bear the associated incremental costs.

7. Subsequent analysts should endeavor to keep the burdens of data sharing on initial investigators to a minimum and explicitly acknowledge the contribution of the initial investigators.

For institutions that fund research:

8. Funding organizations should encourage data sharing by careful consideration and review of plans to do so in applications for research funds.

9. Organizations funding large-scale, general-purpose data sets should be alert to the need for data archives and consider encouraging such archives where a significant need is not now being met.

For editors of scientific journals:

10. Journal editors should require authors to provide access to data during the peer review process.

11. Journals should give more emphasis to reports of secondary analyses and to replications.

12. Journals should require full credit and appropriate citations to original data collections in reports based on secondary analyses.

13. Journals should strongly encourage authors to make detailed data accessible to other researchers.

For other institutions:

14. Opportunities to provide training on data-sharing principles and practices should be pursued and expanded.

15. A comprehensive reference service for computer-readable social science data should be developed.

16. Institutions and organizations through which scientists are rewarded should recognize the contributions of appropriate data-sharing practices.