Federal Reserve Bank
of Chicago

Eleventh International
Banking Conference

Second Quarter 2008

Economic

perspectives
Does education improve health? A reexamination of the
evidence from compulsory schooling laws
Bhashkar Mazumder

How do EITC recipients spend their refunds?
Andrew Goodman-Bacon and Leslie McGranahan

Are inflation targets good inflation forecasts?
Marie Diron and Benoit Mojon

President
Charles L. Evans

Senior Vice President and Director of Research
Daniel G. Sullivan
Research Department
Financial Studies
Douglas Evanoff, Vice President

Macroeconomic Policy Research
Jonas Fisher, Economic Advisor and Team Leader
Microeconomic Policy
Daniel Aaronson, Economic Advisor and Team Leader

Payment Studies
Richard Porter, Vice President
Regional Programs
William A. Testa, Vice President

Economics Editor
Anna L. Paulson, Senior Financial Economist
Editor
Helen O’D. Koshy

Associate Editors
Kathryn Moran
Han Y. Choi

Graphics
Rita Molloy
Production
Julia Baker
Economic Perspectives is published by the Research
Department of the Federal Reserve Bank of Chicago. The
views expressed are the authors’ and do not necessarily
reflect the views of the Federal Reserve Bank of Chicago
or the Federal Reserve System.
© 2008 Federal Reserve Bank of Chicago
Economic Perspectives articles may be reproduced in
whole or in part, provided the articles are not reproduced
or distributed for commercial gain and provided the source
is appropriately credited. Prior written permission must
be obtained for any other reproduction, distribution,
republication, or creation of derivative works of Economic
Perspectives articles. To request permission, please
contact Helen Koshy, senior editor, at 312-322-5830 or email
Helen.Koshy@chi.frb.org.
Economic Perspectives and other Bank
publications are available on the World Wide Web
at www.chicagofed.org.

ISSN 0164-0682

Contents

Second Quarter 2008, Volume XXXII, Issue 2

Does education improve health? A reexamination of the evidence
from compulsory schooling laws
Bhashkar Mazumder
This article analyzes the impact of compulsory schooling laws early in the twentieth century
on long-term health. The author finds no compelling evidence for a causal link between
education and health using this research design. Further, the results suggest that only a small
fraction of health conditions are affected by education, and several of those are conditions,
such as sight and hearing, where economic theories don’t appear to be relevant.

How do EITC recipients spend their refunds?
Andrew Goodman-Bacon and Leslie McGranahan
The authors determine what items are purchased using the earned income tax credit (EITC)—
one of the largest sources of public support for lower-income working families in the U.S.
They find that recipient households’ EITC payments are used primarily for vehicle purchases
and transportation spending, both of which are crucial to job access and consistent with the
EITC’s prowork goals.

Are inflation targets good inflation forecasts?
Marie Diron and Benoit Mojon

The authors show that quantified inflation objectives, which have been adopted by many
industrialized countries, can be used as rule-of-thumb forecasting devices. Remarkably,
they yield smaller forecast errors than widely used forecasting models and the forecasts
of professional experts.

International Banking Conference
The Credit Market Turmoil of 2007-08:
Implications for Public Policy

Does education improve health? A reexamination of the
evidence from compulsory schooling laws
Bhashkar Mazumder

Introduction and summary
Improving the long-term health of the population is
clearly an important goal for policymakers. It is also
likely to become even more so in the coming years
with the aging of the baby boomers and the anticipated
health-related costs that will accompany this demographic
change. Therefore, understanding which policy levers
might improve health is of interest. In a provocatively
titled front page article, “A surprising secret to a long
life: Stay in school,” the New York Times recently
suggested that many researchers now believe that education is the key factor in promoting health.1 While
social scientists have long known that there is a strong
positive correlation between education and longevity,
many researchers have speculated that this association was not truly causal, meaning one didn’t necessarily lead to the other. Rather, the link was thought
to reflect either the fact that for a variety of other reasons (for example, parental income and personal attitudes), people who tend to acquire more schooling
also tend to be in better health, or that healthier children
stayed in school longer. Of course, in the absence of
evidence of a causal link, there is no reason to expect
that policies aimed at increasing educational attainment will result in improvements in health.
The New York Times article was based upon the
results of a recent study by economist Adriana Lleras-Muney (2005) that provides perhaps the strongest evidence to date that education has a causal effect on
health. By implementing an instrumental variables
(IV) strategy, this research analyzes changes in compulsory schooling and child labor laws across different states early in the twentieth century and uses this
information to infer the effects of education on mortality. The idea behind this strategy is that if differences in these laws induced people born in different
states in different years to obtain different levels of
schooling for reasons that are unrelated to any other
determinants of health, then one can estimate a true
causal effect that is not confounded by the other factors. Lleras-Muney finds that increased schooling due
to these laws led to dramatic reductions in mortality rates
during the 1960s and 1970s. In fact, the results imply
that one more year of schooling would lower the mortality rate over a ten-year period by nearly 60 percent—a result that is perhaps implausibly large.
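
As a rough check on the size of this effect, the sketch below divides the IV estimate quoted in the mortality results of this article (–0.063, from Lleras-Muney, 2006) by the mean ten-year death rate in the replication sample (0.108, table 1). Pairing these two numbers to reach the "nearly 60 percent" figure is an assumption made here for illustration, not a calculation stated in the original study, and the variable names are hypothetical.

```python
# Illustrative arithmetic only; pairing these two numbers is an assumption.
iv_estimate = -0.063      # change in the ten-year death rate per extra year of schooling (IV)
mean_death_rate = 0.108   # mean ten-year death rate, replication sample (table 1)

# Express the effect relative to the baseline death rate.
relative_change = iv_estimate / mean_death_rate
print(f"Relative change in ten-year mortality: {relative_change:.1%}")
```

Under these assumptions the ratio is a little under 60 percent in absolute value, which matches the order of magnitude described above.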
If it is true that more education leads to improved
health, such a finding also raises a second important
question—namely, how, exactly, does education affect
health? Economists have proposed a variety of theories
including: that more education leads to better jobs
and more financial resources; that education improves
knowledge and decision-making ability, which improves
health; and that education influences other kinds of
behavioral responses that, in turn, lead to better health
outcomes. So far, however, there is little convincing
empirical evidence on how to evaluate the importance
of these factors.
In this article, I reexamine the use of these compulsory schooling laws as a way of identifying the
causal effects of education on health through the IV
approach. Given the fundamental importance of the
question of whether more education is causally linked
to better health, it is worth investigating the robustness of the relationship. I estimate the same types of
models used in the earlier research, using a much larger
sample and improved measures of compulsory schooling laws. I also present alternative specifications of
the statistical model that may better account for other
reforms that were going on during the same period.
For example, during the early period of the twentieth
century, there were fairly dramatic improvements in
public health measures that led to large declines in
concurrent mortality (Cutler and Miller, 2005). For
school-age children specifically, new nutrition and
vaccination programs may have resulted in improved
long-term health, independent of any effects of increased education.

Bhashkar Mazumder is a senior economist in the Economic
Research Department and the executive director of the
Chicago Research Data Center at the Federal Reserve
Bank of Chicago. The author thanks Douglas Almond,
Claudia Goldin, Adriana Lleras-Muney, Anna Paulson,
and Diane Schanzenbach.

2Q/2008, Economic Perspectives

In addition, if compulsory schooling laws can be
used to identify a causal relationship, then they also
ought to be useful in identifying how education improves health. This can be analyzed by using data on
very specific health conditions for which existing theories might favor one explanation versus another. For
example, if processing information and decision-making
ability are the critical channels by which education
affects health, then we might expect lower incidences
of chronic diseases, such as arthritis, cancer, diabetes,
lung disease, and heart disease. These are conditions
that might respond better to more sophisticated management plans or behavioral changes. If the key factor
is increased access to high-quality health care due to
greater financial resources, then we might expect that
a broad range of health outcomes would be improved.
Therefore, it makes sense to apply the same methodology to other outcomes besides mortality.
A careful analysis of how education affects health
using the IV approach also serves as a credibility check
on the methodology. If, for example, all of the health
effects appeared to be related to the long-term effects
of poor nutrition, then a plausible alternative hypothesis would be that changes in compulsory schooling laws
are really just picking up the long-term health effects of
improved nutrition in schools. In that case, the assumption that these laws represent exogenous sources
of schooling differences would be invalid, and the estimates would not represent a causal relationship between education and health.
In order to address these issues, I first reexamine
the effects of education on mortality from Lleras-Muney
(2005, 2006) by replicating the results and extending
them by adding significantly more data and employing
a variety of robustness checks. I find that the effect of
education on mortality is not robust to the inclusion
of state-specific time trends, casting doubt on whether
there is a true causal effect. At a minimum, my results
show that the point estimates are much smaller than
those previously found in the literature. Moreover,
the results appear to be driven by the earliest cohorts
(born in 1901–12) during the 1960–70 period.
Second, I use individual-level data on health
outcomes from the U.S. Census Bureau’s Survey of
Income and Program Participation (SIPP) to further
investigate the causal pathways between compulsory
schooling and health. In contrast to the U.S. Census
data, which requires the use of a cohort grouping strategy
to infer mortality, the SIPP provides data on the health
status of each individual so that we can be sure that those
who were affected by the compulsory schooling laws
are indeed the same individuals registering the change
in health. Using the SIPP with the same IV strategy,
I find large and statistically significant effects of education on general health status that are robust to the
inclusion of state-specific time trends. This suggests
that the SIPP micro data are able to overcome the
limitations of the U.S. Census data.
However, when I turn to the results that identify
which specific health conditions were affected by education improvements induced by compulsory schooling
laws, the results do not point to a coherent story of
how education affects health. For example, only a
small fraction of health conditions are affected by
education, and several of those affected are conditions,
such as sight and hearing, where economic theories
don’t appear to be relevant. What is also striking is
the absence of effects among many chronic diseases
where decision-making ability is believed pivotal.
A limitation of the data, however, is that specific conditions are only identified for a subset of the sample
that report having some health limitations. Nevertheless, this pattern of results suggests that the use of
compulsory schooling laws as an instrument may be
suspect. I also note that in a recent working paper, Clark
and Royer (2007) use an even more sophisticated approach to analyze the effects of compulsory schooling
law changes in the United Kingdom on mortality. Their
findings also cast doubt on whether there is a strong
causal connection between education and health.
Background and previous literature
Kitagawa and Hauser (1973) were the first to
document the sharp differences in health in the United
States by socioeconomic status. A large number of
studies have since replicated this basic finding of a
“gradient” in health by education or income, and this
pattern has also been found in other countries.2 For
policymakers, a critical question is whether this gradient reflects a causal relationship that can be exploited to improve the long-term health of the population.
For example, in a document soliciting research proposals on the pathways linking education to health,
the National Institutes of Health (2003) cautioned
that: “The association or pathway between formal
education and either important health behaviors or
diseases may not be causal. Instead it may reflect the
influence of confounding or co-existing determinants
or may be bi-directional.”
A review of the literature on whether the education gradient in health is causal may be found in
Grossman (2005). While these studies typically find
an effect of more education leading to better health,
in most cases it is questionable whether the instruments are truly exogenous. For example, Dhir and
Leigh (1997) use parent schooling, parent income,
and state of residence as instruments, all of which
could plausibly affect long-term health independently
of their effects through schooling. The innovation by
Lleras-Muney (2005) to use changes in compulsory
schooling laws early in the twentieth century appears
to be more compelling, since it is more plausibly exogenous than instruments used in prior work. Nevertheless, other changes in public policy that coincided
with changes in compulsory schooling laws might
have led to long-run improvements in health. Cutler
and Miller (2005) find that the introduction of clean
water technologies during this period could explain
as much as half of the concurrent decline in mortality.
Similarly, many states introduced food programs in
schools, recognizing that compulsory schooling was
pointless if children were malnourished. Near the beginning of the twentieth century, Robert Hunter (1904)
wrote in the book Poverty: “There must be thousands—
very likely sixty or seventy thousand children—in
New York City alone who often arrive at school hungry and unfitted to do well the work required. It is utter
folly, from the point of view of learning, to have a compulsory school law which compels children, in that weak
physical and mental state which results from poverty,
to drag themselves to school and to sit at their desks,
day in and day out, for several years, learning little or
nothing.” In response to this situation, Philadelphia,
Boston, Milwaukee, New York, Cleveland, Cincinnati,
and St. Louis all began large-scale programs to provide
food in public schools during the 1900s and 1910s
(Gunderson, 1971). Mazumder (2007) also provides
suggestive evidence that the mechanism by which
compulsory schooling laws might have improved long-term health was through school requirements for vaccination against smallpox. If improvements in nutrition
and vaccination programs were coincident with changes
in compulsory schooling laws, then these might explain
some or all of the long-term health improvements that
were associated with changes in these laws.
Supposing that it is true that more education leads
to improved health, this finding raises an interesting question—namely, how, exactly, does education affect health?
As Richard Suzman of the National Institute on Aging
recently stated, “Education ... is a particularly powerful
factor in both life expectancy and health expectancy,
though truthfully, we’re not quite sure why.”3 Economists have proposed a variety of explanations. These
theories typically emphasize the role of education in
affecting various proximate determinants of health,
including financial resources, knowledge and decision-making ability, and other behavioral characteristics that could lead to better health outcomes.
Financial resources come into play because
better educated individuals may obtain higher paying
and more stable jobs and thereby may be able to afford better quality health care and health insurance.
With greater economic resources, they may also choose
safer and more secure living and work environments.
One might expect that if financial resources are the
key factor behind the link between education and
health, then we should expect to see virtually all
forms of health conditions affected by exogenous
sources of increased education.
The second explanation is that higher levels of
schooling may lead to greater knowledge and an improved ability to process information and make better
choices or take better advantage of technological improvements. In one widely cited paper, Goldman and
Smith (2002) note that better educated patients may
manage chronic conditions better. Those with more
schooling adhere more closely to treatment regimens
for human immunodeficiency virus (HIV) infection
and diabetes, which can be fairly complex. For such
conditions, the ability to form independent judgments
and comprehend treatments is important, and apparently is fostered by schooling. Accordingly, Goldman
and Smith (2002, p. 10934) state that “self-maintenance is an important reason for the very steep SES
[socioeconomic status] gradient in health outcomes.”
Glied and Lleras-Muney (2003) argue that “the most
educated make the best initial use of new information
about different aspects of health,” permitting them to
respond more adeptly to evolving medical technologies.
Finally, it could be that education induces other
kinds of behavioral changes. For example, the better
educated may value the future more than the present
compared with those with less education, and therefore, the better educated may take better care of their
health (Becker and Mulligan, 1997). Others have argued that education improves one’s perception of one’s
relative status in society and that improved social standing is associated with better health (Marmot, 1994).
Mortality analysis: Methodology and data
The first part of the analysis estimates the effects
of education on mortality, using the approach developed by Lleras-Muney (2005). In the absence of a
large sample of data on individuals containing both
education and lifespan, I use group-level data from
successive U.S. Decennial Censuses to estimate mortality rates. Specifically, I use population estimates for
groups defined by state of birth, gender, and year of
birth to estimate the mortality rate across ten-year periods. The mortality rate at time t for birth cohort c of
gender g born in state s, $M_{cgst}$, is simply measured as
the percentage decline in the population count, $N_{cgst}$,
within these cells over the subsequent ten years:

1) $$M_{cgst} = \frac{N_{cgst} - N_{cgs,t+1}}{N_{cgst}}.$$
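
The calculation in equation 1 can be sketched in a few lines. This is a minimal illustration, not the author's code: the function name and the population counts are made up, and a real application would use weighted Census population estimates for each state of birth, cohort, and gender cell.

```python
def ten_year_mortality(n_start: float, n_end: float) -> float:
    """Equation 1: share of a cell's population lost between two censuses."""
    if n_start <= 0:
        raise ValueError("base-period population count must be positive")
    return (n_start - n_end) / n_start


# Hypothetical population estimates for one state-of-birth, cohort, gender cell.
cell_counts = {1960: 50_000, 1970: 44_500, 1980: 37_000}

# One mortality rate per ten-year period, indexed by (base year, end year).
rates = {
    (1960, 1970): ten_year_mortality(cell_counts[1960], cell_counts[1970]),
    (1970, 1980): ten_year_mortality(cell_counts[1970], cell_counts[1980]),
}

for period, rate in rates.items():
    print(period, round(rate, 3))
```

Because the counts are sample-based estimates rather than a full enumeration, nothing in this formula prevents a small cell from showing a negative rate when the later estimate exceeds the earlier one.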

I then model the mortality rate for each cell as follows:
2) $$M_{cgst} = a + E_{cgst}\pi + W_{cs}\delta + \gamma_c + \alpha_s + \theta_{cr} + fem + \tau_t + \varepsilon_{cgst},$$
where $E_{cgst}$ is the average education level for that
cell at time t and $W_{cs}$ measures a set of cohort and
state-specific controls measured at age 14 intended
to capture differences in other potential early life determinants of mortality (for example, manufacturing
share of employment and doctors per capita). The model
also includes a set of cohort dummies $\gamma_c$, state of birth
dummies $\alpha_s$, interactions between cohort and region of
birth $\theta_{cr}$, a female dummy (fem), and year dummies $\tau_t$.
One straightforward way to estimate π in equation 2 would be through weighted least squares (WLS),
with the weights corresponding to the population represented by each cell. However, this would produce a
biased estimate because of omitted variables. Any number of factors could plausibly be associated with both
higher education and lower mortality even at the group
level. Therefore, I use two-stage least squares, where
in the first stage, education is instrumented with the
set of compulsory schooling laws, $CL_{cs}$, in place for
each cohort and state of birth:

3) $$E_{cgst} = b + CL_{cs}\rho + X_{cgst}\beta + W_{cs}\delta + \gamma_c + \alpha_s + \theta_{cr} + fem + \tau_t + u_{cgst}.$$
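
The contrast between WLS and two-stage least squares can be illustrated with a deliberately stripped-down sketch. It assumes a single binary law indicator as the only instrument, no control variables, and four made-up cells in which an unobserved factor raises education and lowers mortality; all names and numbers are hypothetical and are not taken from the Census data. The mortality values were generated with a true education effect of –0.05.

```python
def wmean(v, w):
    """Weighted mean of v with weights w."""
    return sum(wi * vi for vi, wi in zip(v, w)) / sum(w)


def wcov(a, b, w):
    """Weighted covariance of a and b."""
    ma, mb = wmean(a, w), wmean(b, w)
    return sum(wi * (ai - ma) * (bi - mb) for ai, bi, wi in zip(a, b, w)) / sum(w)


def wls(y, x, w):
    """Weighted least squares of y on a constant and one regressor x."""
    slope = wcov(x, y, w) / wcov(x, x, w)
    intercept = wmean(y, w) - slope * wmean(x, w)
    return intercept, slope


# Made-up cells: law indicator, average education, ten-year death rate, population.
law = [0, 0, 1, 1]
education = [10.0, 6.0, 11.0, 7.0]
mortality = [0.20, 0.60, 0.15, 0.55]
weights = [100, 100, 300, 300]

# Naive WLS: confounded by the unobserved factor.
_, wls_slope = wls(mortality, education, weights)

# Stage 1: predict education from the law. Stage 2: regress mortality on the prediction.
b0, b1 = wls(education, law, weights)
fitted_education = [b0 + b1 * z for z in law]
_, iv_slope = wls(mortality, fitted_education, weights)

print(round(wls_slope, 4), round(iv_slope, 4))
```

In this toy example the WLS slope overstates the effect, while the two-stage estimate recovers the –0.05 used to build the data. Running the stages by hand like this gives the right point estimate, but the second-stage standard errors would need to be corrected, which dedicated IV routines do automatically.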

In Lleras-Muney (2005), the instruments for the
compulsory schooling laws were constructed in the
following way. The variable childcom measured the
minimum required age for work minus the maximum
age before a child is required to enter school, by state
of birth and by the year the cohort is age 14. This
variable takes on one of eight values. A set of indicator variables was then used as instruments. In addition, an indicator for whether school continuation
laws were in place in that state was also used. These
laws required workers of school age to continue school
part time. However, it probably makes more sense to
match individuals to the laws concerning the maximum
age for school entry around the age at which students
start school, rather than to the laws in place when they
were age 14. Therefore, I use a different set of data
independently collected by Goldin and Katz (2003).4
Goldin and Katz carefully compared their series with
other codings of the compulsory schooling laws (for
example, Lleras-Muney, 2005; and Acemoglu and
Angrist, 2001) and resolved differences wherever possible. Since the Goldin and Katz data go back further
in time, it is possible to match all of the cohorts to the
school entry age laws in effect when the cohorts were
younger than 14. I use these data to measure the required age for school entry when the cohorts were at
age 8 instead of 14. In principle, incorporating these
data should provide a better measure of the total
years of compulsory schooling.
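
The construction of the instruments described above can be sketched as follows. The law records, state labels, and function names are hypothetical, and dropping the lowest value of childcom as the omitted category is an assumption made here to avoid collinearity with the constant; the article does not say which category was omitted.

```python
def childcom(min_work_age: int, max_entry_age: int) -> int:
    """Implied years of compulsory schooling: work-permit age minus school-entry age."""
    return min_work_age - max_entry_age


# Hypothetical law records keyed by (state of birth, cohort birth year).
laws = {
    ("A", 1905): {"min_work_age": 12, "max_entry_age": 8, "continuation": False},
    ("A", 1915): {"min_work_age": 14, "max_entry_age": 7, "continuation": True},
    ("B", 1905): {"min_work_age": 14, "max_entry_age": 8, "continuation": False},
}

# Distinct values of childcom observed in the (made-up) data.
values = sorted({childcom(r["min_work_age"], r["max_entry_age"]) for r in laws.values()})


def instruments(record: dict) -> dict:
    """Indicator variables for childcom plus a continuation-school indicator."""
    cc = childcom(record["min_work_age"], record["max_entry_age"])
    # Assumption: the lowest observed value serves as the omitted category.
    dummies = {f"childcom_{v}": int(cc == v) for v in values[1:]}
    dummies["continuation"] = int(record["continuation"])
    return dummies


print(instruments(laws[("A", 1915)]))
```

Matching the school-entry law at age 8 rather than age 14, as proposed above, would change only which year's entry-age law is looked up for each cohort, not the structure of this calculation.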
Several estimation samples are constructed for
this part of the analysis. Initially, I produce a sample
combining data from the 1 percent Integrated Public
Use Microdata Series (IPUMS) from the 1960, 1970,
and 1980 U.S. Censuses in order to replicate the basic
results in Lleras-Muney (2005, 2006).5 I then expand
the analysis in stages. First, I replace the 1 percent
samples in 1970 and 1980 with a 2 percent sample for
1970 and a 5 percent sample for 1980. Second, I also
expand the periods by adding 5 percent samples for
1990 and 2000. Following the literature, I restrict the
analysis to cohorts born between 1901 and 1925, topcode years of education at 18 starting in 1980, and exclude immigrants and blacks.6 For the expanded samples,
I also exclude cases where age, state of birth, and education are imputed by the U.S. Census Bureau. The
descriptive statistics for the replication sample and
the expanded sample are shown in table 1.
It is worth noting that the death rate for the 1970–80
period is quite a bit larger with the expanded sample
but that the standard deviation is about 20 percent lower.
There are now also five additional cells that had missing
data when using just the 1 percent samples. The death
rates for the 1980–90 and 1990–2000 periods are much
higher because I follow these same cohorts when they
are much older. Figure 1 plots the death rates by age for
each U.S. Census year. This highlights the importance
of controlling for age in the specifications, which is
done by adding polynomials in age to the models.



Table 1
Summary statistics for Integrated Public Use Microdata Series samples

                                                   1960 1%, 1970 1%, and              1960 1%, 1970 2%, 1980 5%,
                                                   1980 1% samples                    1990 5%, and 2000 5% samples
                                                           Standard     Number of             Standard     Number of
Variables                                           Mean  deviation  observations      Mean  deviation  observations

Ten-year death rates
  Overall                                          0.108      0.136         4,792     0.213      0.173         8,636
  1960–70                                          0.110      0.119         2,395     0.113      0.105         2,397
  1970–80                                          0.105      0.152         2,397     0.154      0.125         2,400
  1980–90                                              —          —             —     0.287      0.170         2,399
  1990–2000                                            —          —             —     0.433      0.122         1,440

Individual characteristics
  Education                                       10.548      0.990         4,795    10.729      1.002         8,636
  1960 dummy                                       0.471      0.499         4,795     0.325      0.469         8,636
  1970 dummy                                           —          —             —     0.289      0.453         8,636
  1990 dummy                                           —          —             —     0.142      0.349         8,636
  Female                                           0.517      0.500         4,795     0.532      0.499         8,636
  Age                                             50.366      8.482         4,795    56.811     11.287         8,636
  Born in 1905                                     0.031      0.174         4,795     0.025      0.157         8,636
  Born in 1910                                     0.038      0.191         4,795     0.031      0.174         8,636
  Born in 1915                                     0.044      0.205         4,795     0.047      0.211         8,636
  Born in 1920                                     0.048      0.213         4,795     0.052      0.222         8,636
  Born in 1925                                     0.050      0.217         4,795     0.057      0.232         8,636

State of birth characteristics
  Percentage urban                                53.523     21.279         4,795    53.778     21.153         8,636
  Percentage foreign-born                         11.737      8.523         4,795    11.562      8.430         8,636
  Percentage black                                 8.983     11.901         4,795     8.945     11.787         8,636
  Percentage employed in manufacturing             0.067      0.038         4,795     0.066      0.037         8,636
  Annual manufacturing wage ($)                 7,171.39   1,343.09         4,795  7,206.15   1,353.57         8,636
  Value of farm per acre ($)                      540.05     276.35         4,795    535.18     272.57         8,636
  Per capita number of doctors                     0.001      0.000         4,795     0.001      0.000         8,636
  Per capita education expenditures ($)            97.01      42.05         4,795     99.78      41.71         8,636
  Number of school buildings per square mile       0.174      0.090         4,795     0.172      0.090         8,636

Notes: Summary statistics are for state of birth, cohort, and gender cells. All means and standard deviations use sample weights where the weights
are the population estimates for the cell in the base period.
Source: Author’s calculations based on data from the University of Minnesota, Minnesota Population Center, Integrated Public Use Microdata Series.

Health analysis: Methodology and data
The methodological approach changes only
slightly when I turn to using individual-level data
from the SIPP. Many of the outcomes in the SIPP are
indicator variables that take on the value of 1 if a particular health problem is present and 0 otherwise.
Therefore, I now use two-stage conditional maximum
likelihood, or 2SCML (Rivers and Vuong, 1988), rather
than IV.7 Rivers and Vuong show that 2SCML has
desirable statistical properties, is easy to implement,
and produces a simple test for exogeneity. I continue
to use IV for the few continuous dependent variables.
Also, all of the analysis is now done using individual-level data. The statistical model is similar to equation 2,
only now I use the latent variable framework:



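4) $$y^*_{it} = a + E_i\pi + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s + trend_s + \tau_t + fem + \varepsilon_{it},$$

5) $$y_{it} = 1 \text{ if } y^*_{it} > 0, \qquad y_{it} = 0 \text{ if } y^*_{it} \le 0.$$

In the first stage, I run a similar regression as before:

6) $$E_i = b + CL_{cs}\rho + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s + trend_s + \tau_t + d + \varepsilon_{it}.$$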
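
For the binary outcomes, equations 4 through 6 combine into a second-stage probit of the general control-function form in Rivers and Vuong (1988). The expression below is a sketch under that reading: $\hat{\varepsilon}_{it}$ denotes the fitted residual from equation 6, and the coefficient symbol $\lambda$ on that residual is notation introduced here for illustration, not the author's. As in any probit, the coefficients are identified only up to scale.

```latex
\Pr\left(y_{it} = 1 \mid E_i, X_i, W_{cs}, \hat{\varepsilon}_{it}\right)
  = \Phi\left(a + E_i\pi + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s
      + trend_s + \tau_t + fem + \hat{\varepsilon}_{it}\lambda\right),
\qquad H_0\colon \lambda = 0 \ \text{(education is exogenous)}.
```

A significance test on the estimated $\lambda$ is then the simple exogeneity test that makes 2SCML convenient: if the first-stage residual has no explanatory power in the probit, an ordinary probit on education would have sufficed.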
To implement 2SCML, I use the predicted residuals
from equation 6, $\hat{\varepsilon}_{it}$, and I include them as an additional
right-hand side variable (along with the actual value
of $E_i$) when running the second stage probit. For comparability, I use the same sample restrictions and
covariates as in the U.S. Census results, with only a
few exceptions. I include a quadratic in age and use
state-specific cohort trends to address concerns that
region of birth interacted with cohort may not adequately
control for state-specific factors that are
smoothly changing over time.8
The sample is constructed by pooling individuals
from the 1984, 1986–88, 1990–93, and 1996 SIPP panels.
Each SIPP panel surveys approximately 20,000 to
40,000 households, and most panels are representative
of the noninstitutionalized population.9 Because participation
in many programs is closely related to an individual’s
health and disability status, the SIPP routinely
collects information on health and medical conditions.
The SIPP is also ideally suited for this analysis because
it contains the state of birth of all sample members,
which allows me to implement the IV strategy of using
compulsory schooling laws during childhood.
One especially useful outcome is self-reported
health (SRH). The SRH is on a 1–5 scale, where 1 is
“excellent,” 2 is “very good,” 3 is “good,” 4 is “fair,”
and 5 is “poor.” The SRH has been found to be an excellent
predictor of mortality and changes in functional
abilities among the elderly (Case, Lubotsky, and Paxson,
2002). I experiment with this measure in a few ways.
First, I use it as a continuous variable. Second, I use
indicators for being in poor health or in fair or poor
health. Finally, I use the health utility scale that measures
the differences between the categories in a health
model using the National Health Interview Survey
(conducted by the U.S. Department of
Health and Human Services, Centers for
Disease Control and Prevention, National
Center for Health Statistics).10
I also examine some other general
outcomes. These are whether the individual
was hospitalized during the past year,
the number of times she was hospitalized,
the total number of nights spent in the
hospital, and the number of days spent in
bed in the past four months.
There are also questions dealing with
functional activities, activities of daily living,
and instrumental activities of daily living that are derived from the International
Classification of Impairments, Disabilities,
and Handicaps (ICIDH). I assembled a common
set of questions that were consistently
asked across surveys. These are whether
the individual has “difficulty” with seeing,
hearing, speaking, lifting, walking,
and climbing stairs, as well as whether
the person can perform any of these activities “at all.” In addition, there is information
on whether individuals have difficulty getting
around inside the house, going outside of the house,
or getting in or out of bed, as well as whether they
need the assistance of others for these activities.
For a subset of individuals who report limited
abilities in certain tasks or who have been classified
as having a work disability (“health limitation”), detailed
information is collected on a number of very
specific health conditions including: arthritis or rheumatism;
back or spine problems; blindness or vision
problems; cancer; deafness or serious trouble hearing;
diabetes; heart trouble; hernia; high blood pressure
(hypertension); kidney stones or chronic kidney trouble;
mental illness; missing limbs; lung problems; paralysis;
senility/dementia/Alzheimer’s disease; stiffness
or deformity of limbs; stomach trouble; stroke; thyroid
trouble or goiter; tumors (cyst or growth); or other.11
Since the specific health ailments are only asked
of specific subsamples, they probably only pick up on
the most severe cases. Even though many of the sample
individuals are not actually asked about these specific
health conditions, I still include them in the
estimation sample so that the sample is not a selected
sample of only those in poor health. The summary
statistics for these data are shown in table 2.

Figure 1
Ten-year mortality rates, by age, across U.S. Census years
[Line chart: death rates (vertical axis, 0 to 0.8) plotted against age (horizontal axis, 35 to 80), with one line each for 1960, 1970, 1980, and 1990.]
Source: Author’s calculations based on data from the University of Minnesota, Minnesota Population Center, Integrated Public Use Microdata Series.

Table 2
Summary statistics for Survey of Income and Program Participation sample

                                                                 Standard     Number of
Variables                                                 Mean  deviation  observations

Outcomes
  Self-reported health (1 is excellent, 5 is poor)       3.084      1.138        26,030
  Poor health                                            0.119      0.324        26,030
  Fair or poor health                                    0.357      0.479        26,030
  Health index (1–100 scale)                            67.992     24.842        26,030
  Hospitalized in last year                              0.180      0.384        26,484
  Days in bed, last four months                          3.937     17.030        25,223
  Number of times hospitalized                           0.282      1.029        22,229
  Number of nights in hospital                           1.908      7.898        26,274
  Trouble seeing                                         0.136      0.342        20,853
  Trouble hearing                                        0.152      0.359        20,845
  Trouble speaking                                       0.021      0.144        20,834
  Trouble lifting                                        0.237      0.425        20,837
  Trouble walking                                        0.289      0.453        20,799
  Trouble with stairs                                    0.276      0.447        20,820
  Trouble getting around outside the home                0.129      0.335        17,401
  Trouble getting around inside the home                 0.059      0.235        17,643
  Trouble getting in/out of bed                          0.079      0.270        17,636
  Trouble seeing at all                                  0.023      0.149        20,811
  Trouble hearing at all                                 0.013      0.114        20,819
  Trouble speaking at all                                0.003      0.052        15,138
  Trouble lifting at all                                 0.115      0.319        20,789
  Trouble walking at all                                 0.154      0.361        20,723
  Trouble with stairs at all                             0.116      0.321        20,775
  Needs help getting around outside                      0.088      0.283        13,610
  Needs help getting around inside                       0.024      0.154        13,893
  Needs help getting in/out of bed                       0.025      0.156        13,868
  Work limitation due to health conditions               0.423      0.494        19,073
  Arthritis                                              0.129      0.335        19,073
  Back                                                   0.062      0.242        19,073
  Blind                                                  0.026      0.159        19,073
  Cancer                                                 0.016      0.125        19,073
  Deaf                                                   0.023      0.149        19,073
  Deformity                                              0.027      0.162        19,073
  Diabetes                                               0.030      0.170        19,073
  Heart                                                  0.090      0.287        19,073
  Hernia                                                 0.006      0.080        19,073
  Hypertension                                           0.036      0.185        19,073
  Kidney                                                 0.005      0.067        19,073
  Lung                                                   0.043      0.203        19,073
  Mental illness                                         0.005      0.067        19,073
  Missing limb                                           0.003      0.056        19,073
  Paralysis                                              0.006      0.075        19,073
  Senility                                               0.007      0.084        19,073
  Stomach                                                0.010      0.099        19,073
  Stroke                                                 0.021      0.144        19,073
  Thyroid                                                0.003      0.056        19,073
  Other                                                  0.066      0.247        19,073

Individual characteristics
  Education                                             11.432      3.208        26,030
  Female                                                 0.580      0.494         4,795
  Age                                                   72.079      5.606         4,795

Source: Author’s calculations based on data from the U.S. Census Bureau, Survey of Income and Program Participation.

Mortality results
I begin by trying to match the estimates of the effect
of education on ten-year mortality rates shown in
Lleras-Muney (2006).12 Using WLS, Lleras-Muney’s
estimate is –0.036, and using IV, her estimate is –0.063.
These estimates imply huge effects. For example, the
IV estimate implies that one additional year of education
would reduce the ten-year mortality rate by about 60
percent.13 In table 3, I show the results of the replication exercise, as well as the effects of expanding the
sample and employing additional robustness checks.
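
The percentage reduction quoted above follows from dividing a coefficient by the mean ten-year mortality rate. A minimal sketch of that arithmetic, assuming the 10.6 percent mean reported in note 13; the function name `relative_effect` and the constant names are illustrative and do not come from the article:

```python
def relative_effect(coefficient, mean_outcome):
    """Express a coefficient as a fraction of the mean outcome."""
    return coefficient / mean_outcome


# Values taken from the text and from note 13; the names are hypothetical.
MEAN_MORTALITY = 0.106   # mean ten-year mortality rate in Lleras-Muney (2005)
IV_ESTIMATE = -0.063     # Lleras-Muney (2006) IV estimate
WLS_ESTIMATE = -0.036    # Lleras-Muney (2006) WLS estimate

iv_change = relative_effect(IV_ESTIMATE, MEAN_MORTALITY)
wls_change = relative_effect(WLS_ESTIMATE, MEAN_MORTALITY)

print(f"IV:  {iv_change:.1%} change in ten-year mortality per year of schooling")
print(f"WLS: {wls_change:.1%} change in ten-year mortality per year of schooling")
```

The same division applies to any coefficient in table 3, although the mean mortality rate in the expanded samples may differ from 10.6 percent.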


Table 3

New estimates of effects of education on mortality

Sample and specification                                  WLS               IV                Number of
                                                                                              observations

A. 1960–80
1960–1980 1%:
  No age controls, region × cohort                        –0.036 (0.004)    –0.072 (0.025)    4,792
1960 1%, 1970 2%, and 1980 5%:
  No age controls, region × cohort                        –0.045 (0.004)    –0.045 (0.024)    4,797
  With age cubic, region × cohort                         –0.039 (0.004)    –0.047 (0.024)    4,797
  With age cubic × Census year, region × cohort           –0.040 (0.004)    –0.047 (0.024)    4,797
  With age cubic × Census year, state × cohort trend      –0.048 (0.004)    –0.016 (0.024)    4,797

B. 1960–2000
1960 1%, 1970 2%, and 1980–2000 5%:
  With age cubic × Census year                            –0.034 (0.003)    –0.026 (0.015)    8,636
  With age cubic × Census year, state × cohort trend      –0.036 (0.003)    –0.012 (0.016)    8,636

C. 1960–2000, by Census year
1960 1%, 1970 2%, and 1980–2000 5% with age cubic × Census year:
  Estimated effect for 1960–70                            –0.025 (0.006)    –0.081 (0.052)    2,397
  Estimated effect for 1970–80                            –0.061 (0.005)    –0.023 (0.033)    2,400
  Estimated effect for 1980–90                            –0.043 (0.004)     0.023 (0.029)    2,399
  Estimated effect for 1990–2000                          –0.012 (0.005)     0.027 (0.039)    1,440

D. 1960–2000, by age
1960 1%, 1970 2%, and 1980–2000 5% with age cubic × Census year:
  35–54 year olds                                         –0.017 (0.005)    –0.067 (0.036)    2,879
  55–64 year olds                                         –0.039 (0.005)     0.063 (0.053)    2,398
  65–89 year olds                                         –0.030 (0.003)    –0.047 (0.023)    3,071

E. 1960–2000, by cohort
1960 1%, 1970 2%, and 1980–2000 5% with age cubic × Census year:
  Cohorts born in 1901–12                                 –0.019 (0.004)    –0.203 (0.125)    3,644
  Cohorts born in 1913–25                                 –0.017 (0.004)     0.025 (0.023)    4,992

Notes: WLS means weighted least squares. IV means instrumental variables. The dependent variable is the ten-year mortality rate; table entries
are the coefficient on education. All specifications include year dummies, cohort dummies, state of birth dummies, region of birth interacted with
cohort, and an intercept (except for panel A, fifth row, and panel B, second row). Estimates are weighted using the number of observations in the
cell in the base year. Standard errors, shown in parentheses, are clustered at the state of birth and cohort level.




In the first row of panel A of table 3, I match the WLS
estimate of –0.036 exactly, although my IV estimate
of –0.072 is slightly larger. It is also worth pointing
out that the partial F statistic on the first stage regression is reasonable at 7.5.14 The second row of panel A
uses the 1960 (1 percent) sample, as well as the larger
samples for 1970 (2 percent) and 1980 (5 percent), and
utilizes the Goldin and Katz (2003) data for constructing the instruments. I find that the WLS estimate rises in magnitude
to –0.045 and that the IV estimate drops considerably in magnitude,
also to –0.045. Had I used the Lleras-Muney data for constructing the instruments, the estimate would be exactly the same at –0.045. However, the standard error
would have declined by about 25 percent relative to the
first row, suggesting that expanding the sample provides considerably more precision. In the third and
fourth rows of panel A, I control for age and find that
this lowers the WLS estimates a little and increases
the IV estimates a little. In the fifth row, I drop the region of birth interactions with cohort and instead use
state-specific linear (cohort) trends. This raises the WLS
estimate to –0.048, but I now find that the IV coefficient
is sharply lower at –0.016 and is no longer statistically significant. However, the fact that the standard error does not rise suggests that the precision is the
same when including the state-specific trends.
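
The contrast between the least squares and IV columns can be illustrated with a toy simulation. The sketch below is not the article's estimator: it uses simulated data, a single hypothetical instrument, no weights, and no controls, and every name in it (`simulate`, `law`, `schooling`, `outcome`) is an assumption made for illustration. It shows only the general mechanism by which an unobserved factor that raises schooling and lowers mortality can make a least squares coefficient more negative than the true effect, while an instrument that is unrelated to that factor recovers it.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n=20000, true_effect=-0.5, seed=0):
    """Simulate an instrument, an endogenous regressor, and an outcome."""
    rng = random.Random(seed)
    law, schooling, outcome = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)          # instrument (stand-in for a schooling law)
        u = rng.gauss(0, 1)          # unobserved confounder
        s = z + u + rng.gauss(0, 1)  # schooling depends on both
        y = true_effect * s - u + rng.gauss(0, 1)  # confounder lowers the outcome
        law.append(z)
        schooling.append(s)
        outcome.append(y)
    return law, schooling, outcome


law, schooling, outcome = simulate()

# Least squares slope: cov(schooling, outcome) / var(schooling).
ols_slope = cov(schooling, outcome) / cov(schooling, schooling)

# IV slope with one instrument: cov(instrument, outcome) / cov(instrument, schooling).
iv_slope = cov(law, outcome) / cov(law, schooling)

print(f"true effect: -0.500  least squares: {ols_slope:.3f}  IV: {iv_slope:.3f}")
```

With one instrument and one endogenous regressor, the ratio of covariances is numerically the same as two-stage least squares; the article's specifications add weights, dummies, and multiple instruments on top of this idea.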
In panel B of table 3, I add data from the 5 percent
samples of the 1990 and 2000 U.S. Censuses. With this
larger data set, I construct death rates over four ten-year periods and therefore follow cohorts over a longer
period with a considerably larger sample. Given that
the sample also tracks the cohorts later in life when
mortality rates are much higher, the age controls are
essential. I use a cubic in age, although I find that the
results are not very sensitive to the choice of the polynomial. Since medical technology and other health-related factors might change over time, I have also
interacted the cubic in age with the U.S. Census year.
In this specification (the first row of panel B), I now
find that the WLS estimate is about –0.034 and that
the IV estimate is –0.026. Both of these estimates are
a bit more plausible than the ones mentioned previously.
The IV estimate is now significant at the 10 percent
level, but not at the 5 percent level. With this larger
sample, the inclusion of state-specific cohort trends
again results in a point estimate that is much smaller
in magnitude (–0.012) and not statistically distinguishable from zero (the second row of panel B), despite a
similar degree of precision.
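
The significance statements for panel B can be checked from the coefficients and standard errors alone. The following is a rough sketch that assumes a standard normal reference distribution for the ratio of coefficient to standard error; the article does not state which reference distribution it uses with its clustered standard errors, and the function name `two_sided_p` is illustrative.

```python
import math


def two_sided_p(coefficient, std_error):
    """Two-sided p-value for H0: coefficient = 0 under a normal approximation."""
    z = coefficient / std_error
    return math.erfc(abs(z) / math.sqrt(2))


# IV coefficients and standard errors from panel B of table 3.
p_region_cohort = two_sided_p(-0.026, 0.015)  # age cubic x Census year
p_state_trend = two_sided_p(-0.012, 0.016)    # adds state x cohort trend

print(f"first row of panel B:  p = {p_region_cohort:.3f}")
print(f"second row of panel B: p = {p_state_trend:.3f}")
```

Under this approximation the first-row p-value falls between 0.05 and 0.10, and the second-row p-value is well above 0.10, in line with the description in the text.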
In the remaining panels of table 3, I examine how
the effects vary by year, age, and cohort. In panel C,
I separately estimate the education coefficient for each
U.S. Census year. Since the specification includes a
full set of cohort dummies, these are equivalent to
age controls when using a single U.S. Census year.
Although the WLS estimates are significant in all years,
they peak in 1970–80 at –0.061 and drop to only –0.012
by 1990–2000. The IV estimates have large standard
errors and are therefore imprecisely estimated.
Nonetheless, the point estimate is large only for 1960–70
and is actually positive for 1980–90 and 1990–2000.
In panel D, I stratify the sample by three age ranges:
35–54, 55–64, and 65–89. Here I observe different
patterns between the WLS and IV specifications. The
WLS estimates suggest that the largest effect may be
for those aged 55–64, while the IV estimates are largest for those aged 35–54. Given the imprecision of
the estimates, I cannot draw any meaningful inferences regarding the age pattern.
Panel E of table 3, however, provides a striking
result when using the IV specification. It appears that
the entire effect of education on mortality arising from
compulsory schooling laws is due to cohorts born in
1901–12, who constitute just over 40 percent of the
sample. In fact, for those born in 1913–25, the point
estimate is actually positive.
Interpreting the mortality results
I interpret the results in the fifth row of panel A
and the second row of panel B of table 3 as suggesting that I cannot reject the null hypothesis that the effect of education on mortality is zero. In other words,
education has no causal effect on mortality once I adequately control for state time trends. An alternative
view might be that once one includes state time trends,
the coefficient is smaller but still negative, and that
the standard errors are simply too large to estimate
the effect precisely, and therefore, I cannot rule out a
causal effect. One might be concerned, for example,
that the instruments are highly collinear with the time
trends. However, as I have shown, the standard errors
do not rise when including the time trends. In any case,
this alternative interpretation of the results would implicitly start with the hypothesis that there is a causal
effect and that the results here do not offer sufficient
evidence to reject that hypothesis—a strong assumption given that the literature has yet to successfully
identify a causal effect.
If one takes seriously the point estimates shown
in the fifth row of panel A and the second row of panel B of table 3 (despite their statistical insignificance),
then this implies that the causal effects of education
on mortality are much smaller than previously thought.
A more reasonable estimate then is that an additional
year of schooling lowers mortality risk over a ten-year period by about 10 percent. This is still a large
effect that might reflect the true causal effect. Still, it
bears repeating that using the current research design,
I am unable to reject the hypothesis that the true effect
is actually zero.
My analysis also suggests that, upon closer inspection, the results are driven by cohorts born very early
in the century and their mortality experience during
the 1960–70 period. One possible explanation could
be that the effect of education stayed roughly constant
but that compulsory schooling laws had their biggest
effect on those born earlier in the century. However,
I have run the first-stage regressions by these cohort
groupings and found that the partial F statistics on the
instruments are actually much higher for the 1913–25
cohorts. This suggests that the schooling laws may
actually have been more binding for the later cohorts,
casting doubt on this alternative explanation.
Health outcome results
Table 4 presents the results using the microdata
on health outcomes using the SIPP. The first column
shows the effects of education using a simple probit
(or ordinary least squares, or OLS), which does not
account for endogeneity. The second column presents
the 2SCML (or IV) estimates using the compulsory
schooling laws as instruments. Given the possible effects of education on mortality and the fact that outcomes in the SIPP are not observed until at least 1984,
one might not expect any remaining health effects to
be apparent. As it turns out, I do find significant effects
using the instruments for several broad health outcomes.
The first row of panel A shows that self-reported health
measured as a continuous variable is affected by education. The IV estimate of –0.23 is more than twice
the OLS estimate of –0.09. In the fourth column using a Hausman test of exogeneity, I can reject that the
OLS and IV coefficients are the same at the 7 percent
level (shown as 0.074 in the table). Translating the SRH
into a health index on a 1–100 scale following Johnson
and Schoeni’s (2007) approach, the IV estimate implies
that an increase in schooling by one year improves
the health index by 4.5 points, or about 7 percent
evaluated at the mean (third column). I also estimate
that the probability of being in fair or poor health is
reduced by 8.2 percentage points with an additional
year of schooling, a fairly large effect that is statistically different from the naive probit at the 18 percent
significance level. I do not find, however, that any of
the measures of hospitalization or days spent in bed
are significant when accounting for endogeneity.
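
The effect sizes quoted in this paragraph appear to be the IV/2SCML coefficient divided by the sample mean of the outcome. The sketch below reproduces that calculation for three general health outcomes, taking coefficients from table 4 and means from table 2; the assumption that the effect-size column is computed exactly this way is mine, and the helper name `effect_size` is illustrative.

```python
def effect_size(iv_coefficient, sample_mean):
    """IV coefficient expressed as a fraction of the outcome's sample mean."""
    return iv_coefficient / sample_mean


# (outcome, IV/2SCML coefficient from table 4, sample mean from table 2)
rows = [
    ("Self-reported health", -0.2289, 3.084),
    ("Health index (1-100 scale)", 4.5345, 67.992),
    ("Fair or poor health", -0.0824, 0.357),
]

sizes = {name: effect_size(coef, mean) for name, coef, mean in rows}

for name, size in sizes.items():
    print(f"{name}: {size:+.3f}")
```

The results are close to the effect sizes reported in table 4 (–0.074, 0.067, and –0.230); small differences in the third decimal place would be expected from rounding in the published means and coefficients.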
Looking across a variety of measures of physical
function, I find that, while all of the naive probit estimates are significant and of the expected sign, the
two-stage estimates are typically not significant. Those
who have an additional year of schooling because of
compulsory schooling laws are no less likely to have
trouble lifting, walking, climbing stairs, getting around
outside the house, getting around inside the house, or
getting into or out of bed. In fact, for many of these
outcomes, the coefficients are actually positive, which would suggest a greater propensity for worse health.
On the other hand, those with greater schooling associated with compulsory schooling laws are dramatically less likely to experience problems with seeing,
hearing, or speaking. In almost all of these cases, the
differences between the simple probit and the 2SCML
estimates are very large and statistically different at
about the 10 percent level. For example, the 2SCML
estimates imply that an additional year of schooling
reduces the probability of having trouble “seeing” by
5.6 percentage points. In this sample, the mean rate of
this health outcome is 13.6 percent. These results might
suggest that the channel by which general health is
compromised for those with less schooling may be
related to sensory functions.
Next, I estimate results based on the incidence of
specific health conditions. Recall that these conditions
are only identified for subsets of individuals and that
the screening criteria changed across SIPP survey
years. Also recall that all individuals are included regardless of whether they were screened for this question, so as to avoid using a sample of only those in
poor health. Generally, the underlying health conditions
were only asked of individuals who reported particular kinds of activity limitations, reported having a
work disability, or reported being in fair or poor health.
This is captured by the variable “health limitation,”
which, not surprisingly, is significant under both probit and 2SCML. When I turn to the estimated likelihood of having one of the underlying health conditions,
the probit estimates once again are significant in every case. The 2SCML estimates, however, are only
negative and significant for four outcomes: back or
spine problems; stiffness or deformity of a limb; diabetes; and senility/dementia/Alzheimer’s disease. It is
important to point out that “trouble seeing,” “trouble
hearing,” and “trouble speaking” were never used as
screening criteria for asking about an underlying
health condition. This likely explains why blindness
and deafness are not significant within the subsamples.
Table 4

Estimates of effects of education on health outcomes

Dependent variable                            OLS/probit             IV/2SCML               IV/2SCML      Exogeneity test   Number of
                                                                                            effect size   p value           observations

A. General health outcomes
Self-reported health
  (1 is excellent, 5 is poor)                 –0.0941 (0.0023)       –0.2289 (0.0745)       –0.074        0.074             26,030
Health index (1–100 scale)                     1.9674 (0.0511)        4.5345 (1.6738)        0.067        0.131             26,030
Fair or poor health                           –0.0359 (0.0010)       –0.0824 (0.0343)       –0.230        0.176             26,030
Poor health                                   –0.0141 (0.0006)       –0.0269 (0.0206)       –0.226        0.533             26,030
Hospitalized in last year                     –0.0049 (0.0008)       –0.0268 (0.0241)       –0.149        0.364             26,484
Days in bed, last four months                 –0.3310 (0.0364)        2.1526 (1.4848)        0.547        0.074             25,223
Number of times hospitalized                  –0.0101 (0.0024)       –0.0944 (0.0884)       –0.335        0.329             22,229
Number of nights in hospital                  –0.0730 (0.0186)       –1.0828 (0.7668)       –0.567        0.185             26,289

B. Functional limitations/activities of daily living/instrumental activities of daily living
Trouble seeing                                –0.0122 (0.0007)       –0.0559 (0.0254)       –0.412        0.085             20,853
Trouble hearing                               –0.0103 (0.0007)       –0.0499 (0.0247)       –0.329        0.109             20,845
Trouble speaking                              –0.0019 (0.0002)       –0.0192 (0.0079)       –0.909        0.039             20,573
Trouble lifting                               –0.0198 (0.0009)       –0.0055 (0.0330)       –0.023        0.667             20,837
Trouble walking                               –0.0251 (0.0011)        0.0130 (0.0325)        0.045        0.242             20,797
Trouble with stairs                           –0.0250 (0.0010)       –0.0066 (0.0324)       –0.024        0.993             20,820
Trouble getting around outside the home       –0.0120 (0.0008)       –0.0146 (0.0257)       –0.114        0.918             17,401
Trouble getting around inside the home        –0.0048 (0.0005)        0.0051 (0.0208)        0.087        0.635             17,463
Trouble getting in/out of bed                 –0.0056 (0.0006)        0.0013 (0.0230)        0.016        0.764             17,621
Trouble seeing at all                         –0.0020 (0.0002)       –0.0078 (0.0084)       –0.343        0.490             20,589
Trouble hearing at all                        –0.0008 (0.0001)       –0.0100 (0.0045)       –0.758        0.060             20,256
Trouble speaking at all                        0.0000 (0.0001)       –0.0008 (0.0001)       –0.284        0.000              7,516
Trouble lifting at all                        –0.0100 (0.0007)       –0.0029 (0.0250)       –0.025        0.775             20,789
Trouble walking at all                        –0.0148 (0.0008)        0.0107 (0.0260)        0.069        0.328             20,723
Trouble with stairs at all                    –0.0114 (0.0006)        0.0071 (0.0202)        0.061        0.359             20,775
Needs help getting around outside             –0.0066 (0.0007)        0.0044 (0.0153)        0.050        0.470             13,598
Needs help getting around inside              –0.0010 (0.0002)        0.0108 (0.0078)        0.446        0.125             13,757
Needs help getting in/out of bed              –0.0011 (0.0003)        0.0092 (0.0080)        0.372        0.191             13,794

C. Specific health conditions
Health limitation                             –0.0250 (0.0013)       –0.0743 (0.0348)       –0.175        0.157             19,073
Arthritis                                     –0.0088 (0.0008)       –0.0043 (0.0217)       –0.034        0.836             19,012
Back                                          –0.0028 (0.0005)       –0.0349 (0.0167)       –0.561        0.061             18,924
Blind                                         –0.0014 (0.0003)        0.0145 (0.0084)        0.557        0.060             18,454
Cancer                                        –0.0007 (0.0002)        0.0025 (0.0078)        0.161        0.677             18,569
Deaf                                          –0.0003 (0.0002)       –0.0041 (0.0064)       –0.179        0.568             18,422
Deformity                                     –0.0006 (0.0002)       –0.0159 (0.0066)       –0.591        0.018             18,821
Diabetes                                      –0.0023 (0.0003)       –0.0258 (0.0082)       –0.868        0.007             18,688
Heart                                         –0.0062 (0.0006)       –0.0014 (0.0194)       –0.016        0.804             19,025
Hernia                                        –0.0003 (0.0001)        0.0023 (0.0037)        0.362        0.454             17,179
Hypertension                                  –0.0031 (0.0004)        0.0376 (0.0124)        1.053        0.000             18,683
Kidney                                        –0.0001 (0.0001)        0.0042 (0.0027)        0.938        0.072             16,593
Lung                                          –0.0037 (0.0005)        0.0203 (0.0152)        0.472        0.106             19,060
Mental illness                                –0.00009 (0.00008)     –0.0002 (0.0424)       –0.045        0.932             15,794
Missing limb                                  –0.00007 (0.00005)     –0.0019 (0.0016)       –0.580        0.155             14,565
Paralysis                                     –0.00011 (0.00006)      0.0016 (0.0020)        0.287        0.348             17,301
Senility                                      –0.00005 (0.00002)     –0.0015 (0.0006)       –0.214        0.070             17,993
Stomach                                       –0.0006 (0.0002)        0.0069 (0.0060)        0.695        0.195             17,701
Stroke                                        –0.0008 (0.0003)        0.0084 (0.0090)        0.397        0.295             18,918
Thyroid                                       –0.0000001 (0.000000)   0.000001 (0.000000)    0.000        0.000             14,559
Other                                         –0.0023 (0.0005)       –0.0013 (0.0152)       –0.019        0.947             19,060

Notes: OLS means ordinary least squares. IV means instrumental variables. 2SCML means two-stage conditional maximum likelihood. Standard
errors, shown in parentheses, are clustered at the state of birth and cohort level.

Surprisingly, both kidney problems and hypertension appear to be positively associated with more
schooling. This is especially notable because these
are two outcomes for which self-management and
recent technological advances appear to be especially
important. According to appendix table B of
Glied and Lleras-Muney (2003), treatment of kidney
infections experienced substantial innovation. Among
the 56 causes of death, kidney disease experienced the
fastest decline in age-adjusted mortality from 1986 to
1995—falling more than 9 percent per year (Glied
and Lleras-Muney, 2003, p. 8, appendix table B).
Accordingly, a steep (negative) gradient between education and kidney disease would presumably be expected.
It is therefore of note that the 2SCML specification
finds an increase in the incidence of kidney problems
among those with high education. Treatment of diabetes is “often considered the prototype for chronic
disease management” (Goldman and Smith, 2002).
My findings, which cover a broad range of health
conditions and chronic diseases, suggest that,
insofar as formal schooling is concerned, diabetes
appears to be an exception. In the SIPP data, diabetes
enters in the expected direction; that is, increases in
schooling appear to reduce the incidence of severe
cases of diabetes.
On the one hand, since diabetes is also associated
with loss of limbs and poor vision, the diabetes result
could be a plausible explanation for those findings. On
the other hand, kidney problems and hypertension,
which are also commonly associated with diabetes,
go in the wrong direction. Further, there is no well-established connection between diabetes and speech,
hearing, and back problems. An alternative explanation for the diabetes result could be that states that
had higher compulsory schooling levels also promoted
nutritional policies that might have reduced adult onset of diabetes. Overall, however, one conclusion that
may be drawn from this table is that there is little
support for the “decision-making” hypothesis.
I would also note that explanations for the link
between education and health that focus on better health
care access due to more financial resources (for example,
from higher income and a better paying occupation)
or unobserved time preferences do not appear to be
consistent with these results. These explanations would
likely imply that many outcomes ought to be affected,
not just a few.
There are two important limitations to this analysis. First, I observe individuals only if they have survived into the 1980s and 1990s when they are anywhere
between the ages of 59 and 83. This sample is almost
certainly positively selected on education and health,
so it is unclear to what extent the results may be generalized.
I suspect that because of this selection, my results are
biased against finding any effects of education on improving health, making it still surprising that there are
very large negative coefficients on the incidence of
several negative health outcomes. Second, because
specific health conditions are only asked of those who
report an activity limitation or being in fair or poor
health, some individuals with a particular condition
may not be captured in the analysis. Nonetheless, it
may be even more meaningful to identify the effects
of education on specific conditions that were severe
enough to cause an activity limitation.
Conclusion
In this article, I expand upon the growing literature that attempts to identify whether there is a causal
effect of education on health. I closely examine the
effects of education induced by compulsory schooling
laws early in the twentieth century on long-term health,
using several approaches. First, I revisit the results in
Lleras-Muney (2005, 2006) by expanding the U.S.
Census sample and employing a variety of robustness
checks. The main finding is that the effects of education on mortality induced by changes in compulsory
schooling laws are not robust to including state-specific time trends, suggesting that a causal interpretation is unwarranted.
Second, I use the SIPP to identify not only general health effects but also specific health outcomes that
were induced by changes in state compulsory schooling laws to see if these outcomes correspond to our
existing theories of how education affects health. The
results suggest that there is a large effect of education
on general health status arising from compulsory schooling laws and that this effect is robust to state time trends. However,
I find that, with the important exception of diabetes,
none of the other specific health conditions that are
associated with education (for example, vision, hearing, speaking ability, back problems, deformities, and
senility) correspond to the leading theories of how
education improves health (for example, technological improvements, better decision-making, lower discount rates, higher income). This suggests that either
our theories are incorrect or that the compulsory schooling laws are suspect instruments. An important caveat,
however, is that the SIPP analysis uses a sample of
older individuals who are almost surely positively selected on education and health. While this likely makes
it more difficult to detect effects of education on improved health, it also raises questions as to how far
one can generalize these results.
A few other studies have begun to implement
strategies to better identify the causal effects of education on health with mixed findings. In a working
paper, Clark and Royer (2007) use differences in compulsory schooling laws affecting very narrowly defined birth cohorts in the United Kingdom, combined
with individual-level mortality data and find very small
effects of education on mortality, which are consistent with the results here. In another working paper,
Deschenes (2007) uses plausibly exogenous variation
based on cohort size in the U.S. and estimates a statistically significant and large effect of education on mortality using a grouped estimator. Deschenes’ estimates
suggest that an additional year of schooling adds an
additional year to life expectancy. Because we are still
only in the early stages of understanding this
issue, it is important to conduct replication
and extension exercises on the small number of studies that have used more credible research strategies.

NOTES

1 Kolata (2007).

2 For example, Deaton and Paxson (2004) document that there is a
strong association between education and health in the United
Kingdom.

3 See Lyman (2006). The National Institute on Aging is part of the
National Institutes of Health.

4 The results from using the Lleras-Muney (2005) instruments instead
of the Goldin and Katz (2003) instruments are not very different,
and are in an earlier version of this article, Mazumder (2007).

5 The IPUMS are from the University of Minnesota, Minnesota
Population Center.

6 Lleras-Muney (2002) found no effect of compulsory schooling laws
on the education levels of blacks.

7 I thank Jay Bhattacharya for this suggestion. In a previous version
of the article, I found very similar results using two-stage least squares
for the dichotomous outcomes.

8 I generally found that the IV results were larger and more significant when using the state trends than when using region of birth interacted with cohort. The ordinary least squares results were virtually
identical under either specification.

9 The 1990 and 1996 panels include an oversample of poorer households. The restriction to the noninstitutionalized population means
that those living in nursing homes are not included in the survey.
However, more than 90 percent of the disabled and more than 80 percent of those requiring long-term care live outside of institutions;
for further details, see http://aspe.hhs.gov/daltcp/reports/rn11.htm.

10 See Johnson and Schoeni (2007) and the citations therein for a
discussion of this approach.

11 I pool responses from the 1984, 1990–93, and 1996 SIPPs in order to maximize sample size. Unfortunately, different criteria were
used across the SIPP survey years to select the subsamples for which
specific health conditions were asked. For example, in 1996 the health
conditions were asked of those who reported being in fair or poor
health. I found that it was important to combine all of the subsamples in all of the years in order to have enough power to identify effects.
There is also an additional set of ten outcomes that are not used
because they were not available in the 1984 SIPP. Experimentation
with a smaller sample suggests that the conclusions are not altered
by dropping these other outcomes.

12 Note that these are estimates from errata that correct the previous
estimates in Lleras-Muney (2005). See Mazumder (2007) for more
details.

13 The mean ten-year mortality rate in Lleras-Muney (2005) is
10.6 percent, so a reduction of 6.3 percentage points implies a 59
percent reduction in mortality.

14 The partial F statistic rises to 9.07 when using the expanded
sample.

references

Acemoglu, D., and J. Angrist, 2001, “How large are
human capital externalities? Evidence from compulsory
schooling laws,” in NBER Macroeconomics Annual
2000, B. S. Bernanke and K. S. Rogoff (eds.),
Cambridge, MA: MIT Press, pp. 9–59.
Becker, G. S., and C. B. Mulligan, 1997, “The endogenous determination of time preference,” Quarterly
Journal of Economics, Vol. 112, No. 3, August,
pp. 729–758.
Case, A., D. Lubotsky, and C. Paxson, 2002, “Economic status and health in childhood: The origins of
the gradient,” American Economic Review, Vol. 92,
No. 5, December, pp. 1308–1334.

Federal Reserve Bank of Chicago

Clark, D., and H. Royer, 2007, “The effect of education on longevity: Evidence from the United Kingdom,”
Case Western Reserve University, working paper.
Cutler, D., and G. Miller, 2005, “The role of public
health improvements in health advances: The twentiethcentury United States,” Demography, Vol. 42, No. 1,
February, pp. 1–22.
Deaton, A., and C. Paxson, 2004, “Mortality, income,
and income inequality over time in Britain and the
United States,” in Perspectives on the Economics
of Aging, D. A. Wise (ed.), Chicago: University of
Chicago Press, pp. 247–280.

15

Deschenes, O., 2007, “The effect of education on adult
mortality: Evidence from the baby boom generation,”
University of California, Santa Barbara, working paper.

Kolata, G., 2007, “A surprising secret to a long life:
Stay in school,” New York Times, January 3, available
at www.nytimes.com/2007/01/03/health/03aging.html.

Dhir, R., and J. P. Leigh, 1997, “Schooling and frailty
among seniors,” Economics of Education Review,
Vol. 16, No. 1, February, pp. 45–57.

Lleras-Muney, A., 2006, “Erratum: The relationship
between education and adult mortality in the United
States,” Review of Economic Studies, Vol. 73, No. 3,
p. 847.

Glied, S., and A. Lleras-Muney, 2003, “Health inequality,
education, and medical innovation,” National Bureau
of Economic Review, working paper, No. 9738, June.
Goldin, C., and L. F. Katz, 2003, “Mass secondary
schooling and the state,” National Bureau of Economic
Review, working paper, No. 10075, November.
Goldman, D. P., and J. P. Smith, 2002, “Can patient
self-management help explain the SES health gradient?,”
Proceedings of the National Academy of Sciences,
Vol. 99, No. 16, August 6, pp. 10929–10934.
Grossman, M., 2005, “Education and nonmarket
outcomes,” National Bureau of Economic Review,
working paper, No. 11582, August.
Gunderson, G. W., 1971, “The National School
Lunch Program: Background and development,”
U.S. Department of Agriculture, Food and Nutrition
Service, report, available at www.fns.usda.gov/cnd/
Lunch/AboutLunch/ProgramHistory.htm.
Hunter, R., 1904, Poverty, New York: Macmillan.
Johnson, R. C., and R. F. Schoeni, 2007, “The influence of early-life events on human capital, health
status, and labor market outcomes over the life course,”
University of California, Berkeley, Institute for
Research on Labor and Employment, working paper,
No. iirwps-140-07, January 2.
Kitagawa, E. M., and P. M. Hauser, 1973, Differential Mortality in the United States: A Study in Socioeconomic Epidemiology, Cambridge, MA: Harvard
University Press.

Lleras-Muney, A., 2005, “The relationship between education and adult mortality in the United States,” Review
of Economic Studies, Vol. 72, No. 1, pp. 189–221.
Lleras-Muney, A., 2002, “Were compulsory attendance
and child labor laws effective? An analysis from 1915
to 1939,” Journal of Law and Economics, Vol. 45,
No. 2, part 1, October, pp. 401–435.
Lyman, R., 2006, “Census report foresees no crisis
over aging generation’s health,” New York Times,
March 10, available at www.nytimes.com/2006/03/
10/national/10aging.html.
Mazumder, B., 2007, “How did schooling laws
improve long-term health and lower mortality?,”
Federal Reserve Bank of Chicago, working paper,
No. WP-2006-23, revised January 24, 2007.
Marmot, M. G., 1994, “Social differences in health
within and between populations,” Daedalus,
Vol. 123, No. 4, pp. 197–216.
National Institutes of Health, 2003, “Pathways
linking education to health,” report, Bethesda, MD,
No. RFA OB-03-001, January 8, available at
http://grants1.nih.gov/grants/guide/rfa-files/RFAOB-03-001.html.
Rivers, D., and Q. H. Vuong, 1988, “Limited information estimators and exogeneity tests for simultaneous probit models,” Journal of Econometrics, Vol. 39,
No. 3, November, pp. 347–366.


11/12/10
ERRATUM, Corrected Table 3: New Estimates of effects of education on mortality

Sample and specification                              WLS               IV                Number of observations

A. 1960–1980
1960–1980 1%:
No age controls, region × cohort                      -0.036 (0.004)    -0.072 (0.025)    4792
1960 1%, 1970 2%, and 1980 5%:
No age controls, region × cohort                      -0.045 (0.004)    -0.045 (0.024)    4797
With age cubic, region × cohort                       -0.039 (0.004)    -0.047 (0.024)    4797
With age cubic × Census year, region × cohort         -0.039 (0.004)    -0.047 (0.024)    4797
With age cubic × Census year, state × cohort trend    -0.040 (0.004)    0.003 (0.038)     4797

B. 1960–2000
1960 1%, 1970 2%, and 1980–2000 5%:
With age cubic × Census year                          -0.034 (0.003)    -0.029 (0.015)    8636
With age cubic × Census year, state × cohort trend    -0.035 (0.003)    0.006 (0.031)     8636

C. 1960–2000, by Census year
1960 1%, 1970 2%, and 1980–2000 5% with age cubic:
Estimated effect for 1960–70                          -0.025 (0.006)    -0.081 (0.052)    2397
Estimated effect for 1970–80                          -0.061 (0.005)    -0.023 (0.033)    2400
Estimated effect for 1980–90                          -0.043 (0.004)    0.023 (0.029)     2399
Estimated effect for 1990–2000                        -0.012 (0.005)    0.027 (0.039)     1440

D. 1960–2000, by age
1960 1%, 1970 2%, and 1980–2000 5% with age cubic × Census year:
35–54 year olds                                       -0.017 (0.005)    -0.064 (0.036)    2879
55–64 year olds                                       -0.039 (0.005)    0.063 (0.053)     2398
65–89 year olds                                       -0.031 (0.003)    -0.052 (0.022)    3359

E. 1960–2000, by cohort
1960 1%, 1970 2%, and 1980–2000 5% with age cubic × Census year:
Cohorts born in 1901–12                               -0.019 (0.004)    -0.200 (0.124)    3644
Cohorts born in 1913–25                               -0.017 (0.004)    0.029 (0.023)     4992

Notes: WLS means weighted least squares. IV means instrumental variables. The dependent variable is the ten-year mortality rate; table
entries are the coefficient on education. All specifications include year dummies, cohort dummies, state of birth dummies, region of
birth interacted with cohort, and an intercept (except for panel A, fifth row, and panel B, second row). Estimates are weighted using the
number of observations in the cell in the base year. Standard errors, shown in parentheses, are clustered at the state of birth and
cohort level.

How do EITC recipients spend their refunds?
Andrew Goodman-Bacon and Leslie McGranahan

Introduction and summary
The earned income tax credit (EITC) is one of the
largest sources of public support for lower-income
working families in the U.S. The EITC operates as a
tax credit that serves to offset the payroll taxes and
supplement the wages of low-income workers. For
tax year 2004, the EITC transferred over $40 billion
to 22 million recipient families (U.S. Internal Revenue
Service, 2006b). Nearly 90 percent of program expenditures come in the form of tax refunds; the remaining
10 percent serve to reduce tax liability. While other
income support programs distribute benefits fairly
evenly across the calendar year, EITC payments are
concentrated in February and March when tax refunds
are received. Because the EITC makes one relatively
large payment per year, it may provide low-income,
credit-constrained households with a rare opportunity
to make important big-ticket purchases.
Research on the EITC has tended to focus on the
important labor supply effects of the program (Eissa
and Liebman, 1996; Meyer and Rosenbaum, 2001;
and Grogger, 2003). Relatively little is known about
how recipient households actually use EITC refunds.
In this article, we use data from the U.S. Bureau of
Labor Statistics’ Consumer Expenditure Survey (CES)
over the period 1997–2006 to investigate how households
spend EITC refunds.1 Following the methodology of
Barrow and McGranahan (2000), we rely on the particular timing of EITC payouts to identify the effects
of the credit on expenditures. Barrow and McGranahan
found that the EITC has a larger effect on spending
on durable goods than on nondurable goods. In this
article, we are particularly interested in determining
what items within the durables and nondurables categories are purchased using the credit and whether
these expenditures reinforce the EITC’s prowork and
prochild goals. Our primary finding is that recipient
household spending in response to EITC payments is
concentrated in vehicle purchases and transportation
spending. Given the crucial link between transportation and access to jobs, we believe this finding is consistent with the EITC’s goals. In the next section, we
present a brief history of the EITC and the key features of the program. We then review prior research
on the uses of the EITC by recipient families. Next,
we introduce the CES data and the methodology we
use to investigate the data. Finally, we present our results and discuss their implications.
History and structure of the EITC
Congress created the EITC in 1975 to offset payroll taxes paid by low-income workers with children.
The credit is structured as a supplement to earned income equaling a percentage of earnings up to a specific threshold (the “phase-in” range), at which point
the credit amount stays constant for an additional
amount of earnings (the “plateau” range). Then this
maximum credit is reduced by a given percentage of
earnings until it equals zero (the “phase-out” range).
Income thresholds, the phase-in and phase-out rates,
and, therefore, the credit amount also vary by the number of qualified children in a household and by marital status; and all these factors have varied over time.2
Figure 1 graphs the EITC program parameters for selected years. The program is implemented as a part of
the tax code, and recipients must file taxes in order to
apply for the program. For tax year 2006, a single mother
with two children earning between $11,340 and $14,810
would have received the maximum credit of $4,536.
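As a rough illustration of the phase-in, plateau, and phase-out structure described above, the schedule can be sketched in Python. This is a simplified sketch, not the statutory computation. The maximum credit and the two earnings thresholds come from the tax year 2006 example in the text; the 40 percent phase-in rate is implied by those figures ($4,536 / $11,340); the phase-out rate of 21.06 percent is an assumption that the text does not state. The function name eitc_credit and its parameter names are hypothetical.

```python
def eitc_credit(earnings, phase_in_rate, max_credit, phase_out_start, phase_out_rate):
    """Credit for a given level of earnings under a three-range schedule."""
    if earnings <= 0:
        return 0.0
    # Phase-in and plateau: the credit is a percentage of earnings,
    # capped at the maximum credit.
    credit = min(phase_in_rate * earnings, max_credit)
    # Phase-out: the credit is reduced by a percentage of earnings
    # above the phase-out threshold until it reaches zero.
    if earnings > phase_out_start:
        credit -= phase_out_rate * (earnings - phase_out_start)
    return max(credit, 0.0)


# Tax year 2006, unmarried parent of two children.
PARAMS_2006 = dict(
    phase_in_rate=0.40,        # implied by the text: 4,536 / 11,340
    max_credit=4536.0,         # from the text
    phase_out_start=14810.0,   # from the text
    phase_out_rate=0.2106,     # assumption, not stated in the text
)

if __name__ == "__main__":
    for earnings in (5000, 11340, 14810, 25000, 40000):
        print(earnings, round(eitc_credit(earnings, **PARAMS_2006), 2))
```

Any earnings level between $11,340 and $14,810 returns the maximum credit of $4,536, which matches the example in the text.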
Andrew Goodman-Bacon is currently a graduate student
in economics at the University of Michigan and a former
associate economist at the Federal Reserve Bank of
Chicago. Leslie McGranahan is an economist in the
Economic Research Department at the Federal Reserve
Bank of Chicago. The authors thank Lisa Barrow,
Eric French, and Anna Paulson for helpful comments.

figure 1
EITC program parameters for selected years
[Line chart. Vertical axis: EITC benefit in dollars, unadjusted for inflation (0 to 5,000). Horizontal axis: adjusted gross income in dollars (0 to 37,500). Series: 1975, 1987, 1996, and 2006.]
Notes: EITC means earned income tax credit. The data are for an unmarried parent of two.
Source: Tax Policy Center, 2007, Earned Income Tax Credit Parameters, 1975–2008, table.

figure 2
EITC recipients and benefits, 1975–2005
[Line chart. Left-hand scale: thousands of recipient families (0 to 25,000). Right-hand scale: real average credit in 2005 dollars (0 to 2,500). Horizontal axis: 1975 to 2005. Series: Recipients (LHS) and Real average credit (RHS).]
Notes: EITC means earned income tax credit. LHS means left-hand scale. RHS means right-hand scale.
Sources: Authors’ calculations based on data from the U.S. House of Representatives, Committee on Ways and Means (2004); and U.S. Internal Revenue Service.

The EITC began as a small program, but its generosity and coverage have expanded frequently in its 30-year history, as shown in figures 1 and 2. Particularly large expansions enacted in 1986 and 1993 led to rapid program growth. In 1994, childless families started to receive a small credit. In 1975, the EITC represented 3.1 percent of federal means-tested transfers and 9.7 percent of federal means-tested cash transfers; by 2002, these proportions had increased by three times and four and a half times, respectively, and the EITC was the second largest means-tested cash transfer program behind Supplemental Security Income (SSI). In figure 2, we graph the average credit and number of recipient families by year. As the figure shows, the size of the EITC was relatively constant in its first decade, but between 1986 and 2005, both the number of recipient families and the real average credit amount grew by more than three times, increasing real federal expenditures on the program by almost 12 times. In 1986, just over 7 million families received earned income tax credits averaging $501 in 2005 dollars. By 2002, over 20 million families received credits averaging $1,911 in 2005 dollars (U.S. House of Representatives, Committee on Ways and Means, 2004).
Unlike other transfer programs that have monthly benefits, the EITC pays out a lump sum once per year. The EITC does permit recipients to receive some portion of payments monthly prior to tax filing in the form of the advance earned income tax credit, but in 2004 only 0.6 percent of recipient households received any credit in this manner, representing just 0.2 percent of payments (U.S. Internal Revenue Service, 2006b). Figure 3 shows the distribution of refundable EITC payments from the U.S. Internal Revenue Service by month for 2005, a year with a payment pattern typical of recent years. As the figure shows, nearly all EITC payments are made in February and March, and most of these come in February. The modal month of EITC payments has changed over time, but since 1997 more payments have been made in February than in any other month. This pattern is a result of the timing of tax filing. Taxes can be filed anytime after W-2s (employee wage report forms) are received (by January 31), and refunds are received within six weeks.3

figure 3
Fraction of EITC payments, by month, 2005
[Bar chart. Vertical axis: fraction of payments (0.0 to 0.6). Horizontal axis: months, January through December.]
Note: EITC means earned income tax credit.
Source: Authors’ calculations based on data from the U.S. Department of the Treasury, 2005–06, Monthly Treasury Statement of Receipts and Outlays of the United States Government, various issues.

The lump sum payment structure also means that EITC refunds represent a relatively large share of recipients’ income in the month when they are received. For tax year 2004, the average EITC refund for recipient families with children was $2,113, or 12 percent of their annual average adjusted gross income (AGI) of $16,981. Assuming income was earned evenly across the calendar year, the average recipient household’s income would be approximately two and a half times its usual monthly value in the month when the EITC payment was received.4
For comparison, the mean overpayment refund for non-EITC recipients in tax year 2004 was $1,692, or 2.9 percent of annual average AGI among nonrecipients.5 Overpayment refunds are less concentrated in the first quarter of the year than EITC refunds. While 87 percent of EITC refunded dollars for 2004 were distributed in the first quarter, 47 percent of non-EITC refunded dollars were distributed in the first quarter, and an additional 42 percent were distributed in the second quarter (U.S. Internal Revenue Service, 2006c). It is worth noting that the Consumer Expenditure Survey, the data set used for our analysis, provides additional evidence to show that EITC refunds are concentrated earlier in the year than other tax refunds. Among families who made an expenditure on “accounting services,” including tax preparation, 43 percent of EITC eligible families did so in January or February, versus 29 percent of noneligible families.
The one-time EITC average refund of $2,113 among families with children in 2004 is also large when compared with the average monthly payments to recipient families in other transfer programs in 2004, such as SSI ($429); Temporary Assistance to Needy Families, or TANF ($397); the Food Stamp Program ($200); and unemployment insurance ($1,141).6
Use of EITC refunds
The majority of research on the
EITC and expenditure patterns has relied
on surveys of EITC recipients about how
they spent or planned to spend refunds.
The consensus from these surveys is that
the primary use of EITC refunds is to pay
bills. Sixty-three percent of respondents in
a survey of participants in the University
of Georgia’s Consumer Financial Literacy
Program reported that they planned to use
most of their refund to pay or catch up on
bills or debts (Linnenbrink et al., 2006).
Similarly, 44 percent of mothers in a study tracking
the well-being of rural families indicated that they
used their refund to pay bills (Mammen and Lawrence,
2006). Using surveys of free tax preparation clients in
Chicago, Smeeding, Phillips, and O’Connor (2000)
report that tax filers who anticipate an EITC refund
most often plan to use it to pay bills. These studies
also find that recipients used their refunds to purchase
or repair cars and buy other durables, such as home
furnishings. Some families also report buying children’s clothing and going on vacation. Very few families planned to save their refund for a rainy day or
for retirement.
In contrast to these studies, Barrow and McGranahan
(2000) use the nationally representative Consumer
Expenditure Survey to investigate expenditure uses
of EITC refunds. They rely on the unique seasonal
pattern of EITC refunds to determine whether EITC
eligible households have expenditure patterns that differ
from those of noneligible households. They find that
EITC eligible households have higher expenditures
on durable goods in February, the modal month of
EITC receipt, relative to noneligible households. They
attribute this increased spending on durables to the
EITC. Barrow and McGranahan do not measure health
care, housing, or utility expenditures, so they do not
measure much of what other studies categorize as “bills.”
Here we use CES data over the period 1997–2006
to build upon the work of Barrow and McGranahan
(2000). We investigate on which goods, particularly
within the durable goods category, the EITC recipient
households spend more. We also look at both the extensive and intensive margin of expenditure. In other
words, we ask both whether households are more
likely to make any expenditure and whether they make
larger expenditures, given that they make a purchase.
We focus on those goods that have been identified
in the literature as either those that recipients report
that they plan to purchase or those that further the EITC
program’s goals of “strengthen[ing] the incentive to
work,” “help[ing] low-wage working families make
ends meet,” and promoting the well-being of children
(Frost, 1993). Vehicle expenditures fall into both of
these categories. Recipients have mentioned them as an intended use of the EITC, and they are
particularly supportive of work. According to a Brookings
Institution report, 88 percent of low-income Americans
commute in a personal vehicle (Blumenberg and Waller,
2003). In fact, other antipoverty and income support
programs explicitly recognize the link between car
ownership and employment through more lenient limits
on cars than on other forms of assets. For example,
the federal SSI program exempts one vehicle from its
resource limit. Similarly, most states exclude the value of one or more vehicles from resource limits used
to determine eligibility for the Food Stamp Program
and TANF Program. In addition to vehicles, we focus
on expenditures on household furnishings and home
electronics, as well as on children’s clothing. We do
not look at bill paying because the nature of the CES
data precludes such an analysis.
Our primary contribution is to provide evidence
on detailed actual expenditures, using nationally representative survey data. Time-series variation in EITC
payments over the year and cross-sectional variation
in imputed eligibility allow us to identify the EITC’s
impact. Similar to Barrow and McGranahan (2000),
we find that receiving EITC refunds increases household expenditures on both durable and nondurable
goods, but more so for durables. Eligible households
are more likely both to purchase big-ticket items in
February and to spend more on them, given that they
make any expenditure. Within durables, the strongest
patterns are found for vehicles, confirming the responses
given in surveys. Eligible households also spend slightly
more on all other major subcategories of durables—
household goods, appliances, and home electronics.
Within nondurables, the strongest patterns are found
for transportation expenses, such as car repairs.7
Data
We create a monthly household-level data set of
expenditure, income, and family structure, using the
CES’s interview survey data covering the period
1997–2006. Households, which are called consumer
units (CUs) in the data, are interviewed five times for
the survey.8 The first interview provides baseline
asset information. The second through fifth interviews cover detailed expenditure information for the
three months prior to the interview date. These interviews occur three months apart. As a result, in the
absence of attrition, a full year of expenditure data is
collected for each household. Households enter and
exit the survey each month. Information on income
in the 12 months leading up to the survey date is collected in the second and fifth interviews. Demographic information is updated every interview.
We begin with the 1997 data because February
has been the modal month of EITC payouts since 1997.
This consistency in payments across time allows us to
focus on the February expenditures of recipient households. In most years prior to 1997, March was the
modal month of EITC payments.9
We consider a CU to belong to the calendar year
in which we observe February expenditure (or would
have observed it if the household had responded). Since
1997, this is when the CU is most likely to have received the previous tax year’s EITC refund payment.
Therefore, data over the period 1997–2006 allow us
to consider EITC policies in place during tax years
1996–2005. The average number of observations in
our 120 month-year cells is 4,888, and in total we
have 589,568 observations.
Information on EITC receipt is not provided in the
CES, so we use the income and family structure variables to impute EITC eligibility and the magnitude of
EITC payments. Because of our reliance on the income
data, we delete those with incomplete income reports
from the analysis. We assume all households without
children are not eligible for the EITC despite the small
credit for childless families that has been available since
1994.10 The CUs may contain more than one tax filing
unit (TU). We impute EITC payments and eligibility
for each TU within the CU and combine these to determine CU eligibility and EITC amount. Ideally, we
would observe the income and family structure of each
TU for the year preceding their February interview.
However, we lack information on TU composition
and on tax year income. To generate our best guess
of income for the year preceding the February interview, we use the income information in the second
and fifth interviews. For some individuals, our best
guess of tax year income is the reported income
from the second interview; for others, we compute
a weighted average of the two income reports where
the weights depend on the number of months for
which the year covered by the income report and tax
year overlap.
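The overlap weighting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not our actual imputation code: each income report is assumed to cover the 12 calendar months ending in a given month, the tax year is assumed to be a calendar year, and the weights are assumed to be proportional to the months of overlap. The function names and the example incomes are hypothetical.

```python
def month_index(year, month):
    """Map a calendar month to a running integer so months can be compared."""
    return year * 12 + (month - 1)


def overlap_months(end_year, end_month, tax_year):
    """Months shared by a 12-month income report and the calendar tax year.

    The report is assumed to cover the 12 months ending in (end_year, end_month).
    """
    end = month_index(end_year, end_month)
    start = end - 11
    ty_start = month_index(tax_year, 1)
    ty_end = month_index(tax_year, 12)
    return max(0, min(end, ty_end) - max(start, ty_start) + 1)


def tax_year_income(income_2, end_2, income_5, end_5, tax_year):
    """Overlap-weighted average of the second- and fifth-interview income reports."""
    w2 = overlap_months(end_2[0], end_2[1], tax_year)
    w5 = overlap_months(end_5[0], end_5[1], tax_year)
    if w2 + w5 == 0:
        return None  # neither report overlaps the tax year
    return (w2 * income_2 + w5 * income_5) / (w2 + w5)


if __name__ == "__main__":
    # Second report covers Sept. 2002 to Aug. 2003; fifth covers June 2003 to May 2004.
    print(tax_year_income(15000, (2003, 8), 30000, (2004, 5), 2003))
    # Second report covers Apr. 2003 to Mar. 2004; fifth covers calendar year 2004.
    print(tax_year_income(20000, (2004, 3), 24000, (2004, 12), 2003))
```

In the second example the fifth-interview report does not overlap tax year 2003 at all, so the estimate reduces to the second-interview income, which corresponds to the first case described in the text.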
To assign adults to TUs and generate TU income,
we use sex, marital status, relationship to reference
person, and individual income information. To assign
children to TUs for the purpose of the EITC computation, we use the EITC eligibility rules in place during the year before their February interview. Before
2001, EITC rules assigned all qualifying children in
a family to the TU with the highest income, but since
2001, families have been free to choose which TU
claimed qualifying children. Thus, before 2001 we
give all children to the highest-income TU, and after
2001 we give all qualifying children to the TU for
which they generate the largest EITC refund.11
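The assignment rule for children can be sketched in the same spirit. The sketch below is illustrative only: assign_children and toy_credit are hypothetical names, the credit schedule is a simplified stand-in (2006-style parameters for a family with children, with an assumed phase-out rate), and the choice of which rule applies is passed in as a flag because the exact treatment of the year 2001 itself is not spelled out here.

```python
def toy_credit(income, n_children):
    """Hypothetical credit schedule, for illustration only."""
    if n_children == 0 or income <= 0:
        return 0.0
    credit = min(0.40 * income, 4536.0)
    if income > 14810.0:
        credit -= 0.2106 * (income - 14810.0)  # assumed phase-out rate
    return max(credit, 0.0)


def assign_children(tu_incomes, n_children, new_rule, credit_fn=toy_credit):
    """Return the index of the tax unit (TU) that is given all qualifying children."""
    if n_children == 0 or not tu_incomes:
        return None
    indices = range(len(tu_incomes))
    if not new_rule:
        # Earlier rule: all children go to the highest-income TU.
        return max(indices, key=lambda k: tu_incomes[k])
    # Later rule: all children go to the TU for which they generate the largest credit.
    return max(indices, key=lambda k: credit_fn(tu_incomes[k], n_children))


if __name__ == "__main__":
    incomes = [30000.0, 12000.0]  # two TUs within one consumer unit
    print(assign_children(incomes, 2, new_rule=False))
    print(assign_children(incomes, 2, new_rule=True))
```

With these example incomes the two rules pick different tax units: the higher earner is already in the phase-out range, so the children generate a larger credit when assigned to the lower earner.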
Because of this imputation, we are measuring EITC
eligibility rather than EITC receipt. Two issues may
affect the accuracy of these imputations. First, some
households that are eligible for the EITC may not take
it up. According to a study by the U.S. General
Accounting Office (2001), approximately 85 percent
of eligible households with children participate in the
EITC program. Second, we may be incorrectly imputing that eligible households are ineligible or that
ineligible households are eligible because either child
or income information is incorrect in the CES. There
is some underreporting of income in the CES, so we
may be assigning eligibility to some households that
are in fact beyond the maximum income for EITC receipt. We also may be assigning some children to an
incorrect TU. These issues make it harder for us to
find an effect of the EITC on consumption. As a result, our estimates represent a lower bound on the
effect of the EITC on recipient consumption patterns.
Table 1 gives variable means for the demographic, income, and EITC variables for all families and by
imputed EITC eligibility. In the full sample, 13 percent of household-months (shown as 0.13 in the first
column, fourth row of table 1) were eligible for an
average credit of $2,116 in the February in which we
observed them. These percentages and values change
over time in keeping with the changes in eligibility
and refund amounts presented in figure 2 (p. 18).
When we compare the EITC eligible and noneligible
populations, we find differences that are consistent
with the program rules. For example, EITC eligible
households earn approximately 60 percent of what
noneligible households earn on average, and have
more children. In addition, EITC eligible households
are also less likely to have a white household head,
are more likely to be headed by a single parent, and
are less educated than noneligible households. These
additional findings are not related explicitly to the
program rules, but result from patterns of earnings
in the U.S., and are consistent with the attributes of
participants in other income support programs.

Table 1

Summary statistics

                                                    All         Non-EITC    EITC
Median real income (2004 dollars)                   32,346      36,590      22,548
Mean real income (2004 dollars)                     44,130      46,468      28,599
EITC amount (2004 dollars)                          277         n.a.        2,116
EITC eligible                                       0.13        0.00        1.00
Number of children                                  0.71        0.52        1.97
White household head                                0.84        0.85        0.75

Household head’s highest educational attainment:
  Some high school                                  0.13        0.12        0.22
  High school diploma                               0.25        0.24        0.34
  Some college                                      0.20        0.20        0.23
  College degree                                    0.42        0.45        0.22

Family type:
  Husband, wife, and own kids                       0.27        0.25        0.37
  Single parent                                     0.06        0.03        0.25
  Single person                                     0.28        0.32        0.00
  Other family type                                 0.39        0.40        0.39

Observations (family months)                        589,568     512,405     77,163
Observations (distinct families)                    59,595      51,824      7,771

Note: EITC means earned income tax credit. n.a. indicates not applicable.
Source: Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.


Our next goal is to generate monthly expenditure
data. We combine all available interviews for each
CU. Sixty-three percent of CUs have 12 months of
data, and the average CU has 9.9 months of data.
The CES contains very detailed information on expenditures, which we distill into durable goods and
nondurable goods, as well as subcategories of those
groups. Durable goods include household goods (such
as furniture, linens, and carpets); appliances (such as
dishwashers, silverware, and kitchen electronics);
electronics (such as televisions and computers); and
new and used vehicle purchases. Nondurables include
food, alcohol, and tobacco; apparel; trips (out-of-town
travel and expenditure while traveling); transportation expenses (except vehicle purchases); entertainment; child support, alimony, and charity; and pensions,
insurance, and social security payments. We do not
measure expenditure on items that we do not consider
to be durable or nondurable goods. In particular, we
exclude utilities, rent, education expenses, and health
care. These obligations may be difficult for households to alter on a month-to-month basis. In addition,
the rent and utility variables reported on the survey
capture the amount owed in a given month rather than
the amount paid, making it impossible to assess whether
households are spending money to catch up on overdue payments or prepay obligations.12
Table 2 provides summary statistics on expenditures in all of our categories as calculated from the CES.
It provides three different measures of expenditure
for each category. The first set of three columns presents expenditure that occurs on the goods category in
the average month as a percent of total annual expenditures on durable and nondurable goods. The entry
for durable goods in the first column indicates that in
the average month, a household spends 1.5 percent of
its total annual durable and nondurable goods expenditures on durable goods. The second set of three columns reports the probability that a household makes
any expenditure in a category in an average month.
In the average month, 84.5 percent of households purchase a durable good. The third set of three columns
reports the proportion of total annual expenditure for
durable and nondurable goods in that category in a
month, given that some expenditure was made. Among
households purchasing durables in a given month, the
average household spends 1.8 percent of total annual
durable and nondurable goods expenditures on durables.
Table 3 reports the average dollar amount (in 2004 dollars based on the Personal Consumption Expenditures
deflator) spent per month conditional on expenditure.
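The three measures reported in table 2 can be illustrated with a short Python sketch. The function name and the example figures are hypothetical; the sketch only shows how the measures relate to one another. In particular, the average share equals the probability of any expenditure multiplied by the average share conditional on a purchase; in table 2, for durable goods, 0.845 × 0.018 is approximately 0.015.

```python
def expenditure_measures(monthly, annual_total):
    """Compute the three measures for one spending category.

    monthly: spending in the category for each household-month.
    annual_total: the matching household's total annual spending on
    durable and nondurable goods.
    """
    shares = [x / a for x, a in zip(monthly, annual_total)]
    positive = [s for s, x in zip(shares, monthly) if x > 0]
    mean_share = sum(shares) / len(shares)       # first set of columns
    prob_any = len(positive) / len(shares)       # second set of columns
    cond_share = sum(positive) / len(positive) if positive else float("nan")  # third set
    return mean_share, prob_any, cond_share


if __name__ == "__main__":
    # Four hypothetical household-months; two of them include a purchase.
    monthly = [0.0, 300.0, 0.0, 1200.0]
    annual = [20000.0, 30000.0, 25000.0, 24000.0]
    mean_share, prob_any, cond_share = expenditure_measures(monthly, annual)
    print(mean_share, prob_any, cond_share)
    # The unconditional share is the product of the other two measures.
    print(abs(mean_share - prob_any * cond_share) < 1e-12)
```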

As seen in table 2, average monthly expenditure
shares are fairly consistent for EITC and non-EITC
families with a few exceptions. The EITC families
spend a high share on food and on children’s clothing. The higher expenditure share on food is consistent with the general finding that food expenditure
shares are higher for lower-income households in the
U.S. The higher expenditure share on children’s clothing
arises from our restriction that all EITC eligible households have children, while many noneligible households
do not. From the second group of columns in table 2,
we observe that EITC families are generally less likely than non-EITC families to make expenditures in
almost every category in an average month. As shown
in table 3, in dollar terms, conditional on nonzero expenditure, EITC families spend less on everything
except for tobacco, food, and gasoline. Our analysis
continues by examining the effect of EITC eligibility
on spending in the nondurables category and the nondurable goods subcategories of children’s clothing
and transportation, and then we focus our analysis on
durable goods expenditures and specifically on expenditures for vehicles and consumer electronics.
Methodology
We measure expenditure by household i in month t
on category j in three ways: the proportion of annual
expenditure in each month, $\left( \frac{X_{itj}}{X_{i,Annual}} \right)$; the probability
of making any expenditure, $P(X_{itj} > 0)$; and the
proportion of annual expenditure conditional on making
an expenditure, $\left( \frac{X_{itj}}{X_{i,Annual}} \,\middle|\, X_{itj} > 0 \right)$.13
We estimate clustered probit models for the discrete
measure of expenditure and generalized least squares
(GLS) regression models for the expenditure proportion variables. Letting X be one of the three dependent
variables, we estimate the following equation:
1)  $X_{itj} = \alpha + \gamma_t M_t + \phi \, EITC_i + \lambda_t (EITC_i \times M_t) + \beta C_i + \varepsilon_{it}$,

where M is a vector of month dummies, EITC is a
dummy variable equal to 1 if the household is imputed to be EITC eligible, and C is a vector of household-level controls—year of first quarter interview;
income, race, sex, and education of household head;
family size; number of children; family type; and region (all rural households are the omitted “region”).
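To see what the interaction coefficients in equation 1 capture, consider a stripped-down version with no controls C and with one month omitted as the baseline. In that saturated case, the least squares estimate of each interaction coefficient equals a difference in differences of cell means: the change in the outcome for eligible households between the baseline month and month t, minus the same change for noneligible households. The sketch below computes that quantity directly. It is an illustration only, not our estimation procedure, which uses GLS and probit models with controls; the choice of January as the baseline month, the function name, and the example data are assumptions.

```python
from collections import defaultdict


def interaction_effects(rows, base_month=1):
    """Difference-in-differences of cell means for each non-baseline month.

    rows: iterable of (month, eitc, outcome), where eitc is 0 or 1.
    Every (month, eitc) cell must contain at least one observation.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, eitc, y in rows:
        sums[(month, eitc)] += y
        counts[(month, eitc)] += 1
    mean = {cell: sums[cell] / counts[cell] for cell in sums}

    effects = {}
    for month in sorted({m for m, _ in mean}):
        if month == base_month:
            continue
        eligible_change = mean[(month, 1)] - mean[(base_month, 1)]
        noneligible_change = mean[(month, 0)] - mean[(base_month, 0)]
        effects[month] = eligible_change - noneligible_change
    return effects


if __name__ == "__main__":
    # Hypothetical expenditure shares by month and imputed eligibility.
    rows = [
        (1, 1, 0.080), (1, 1, 0.080), (1, 0, 0.082),
        (2, 1, 0.095), (2, 0, 0.083),
        (3, 1, 0.081), (3, 0, 0.082),
    ]
    for month, effect in interaction_effects(rows).items():
        print(month, round(effect, 4))
```

A large positive value for February relative to other months is the pattern that would point to spending out of EITC refunds.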


Table 2

Expenditure patterns, by expenditure category and EITC eligibility
	
	
	
	

	
Column groups:
(A) Monthly expenditure/annual expenditure
(B) Probability of expenditure
(C) Monthly expenditure/annual expenditure, conditional on nonzero expenditure

                                                 (A)       (A)       (A)       (B)       (B)       (B)       (C)       (C)       (C)
                                                 All  Non-EITC      EITC       All  Non-EITC      EITC       All  Non-EITC      EITC
Total                                          0.084     0.084     0.083     1.000     1.000     1.000     0.084     0.084     0.083

Durable goods                                  0.015     0.015     0.015     0.845     0.848     0.822     0.018     0.018     0.019
Household goods                                0.003     0.003     0.002     0.285     0.291     0.248     0.010     0.010     0.009
Furniture                                      0.001     0.001     0.001     0.048     0.048     0.048     0.027     0.028     0.024
Drapes, linens, and floor coverings            0.000     0.001     0.000     0.098     0.098     0.094     0.005     0.005     0.004
Miscellaneous household equipment              0.001     0.001     0.001     0.210     0.216     0.169     0.005     0.005     0.004
Appliances                                     0.001     0.001     0.001     0.101     0.102     0.100     0.009     0.009     0.008
Major appliances                               0.001     0.001     0.001     0.031     0.031     0.032     0.022     0.023     0.019
Minor appliances                               0.000     0.000     0.000     0.076     0.076     0.074     0.003     0.003     0.003
Electronics                                    0.004     0.004     0.004     0.809     0.813     0.783     0.005     0.005     0.005
Vehicle purchases                              0.007     0.007     0.008     0.024     0.022     0.034     0.302     0.316     0.244

Nondurables                                    0.068     0.068     0.068     0.999     0.999     1.000     0.068     0.068     0.068
Food, alcohol, and tobacco                     0.030     0.030     0.034     0.997     0.997     0.998     0.030     0.030     0.035
Food                                           0.023     0.022     0.028     0.992     0.991     0.994     0.023     0.022     0.028
Alcohol                                        0.001     0.001     0.001     0.350     0.363     0.265     0.002     0.002     0.002
Tobacco                                        0.002     0.002     0.002     0.257     0.244     0.341     0.007     0.007     0.007
Food away from home                            0.005     0.005     0.004     0.808     0.813     0.772     0.006     0.007     0.005
Apparel                                        0.006     0.005     0.007     0.647     0.643     0.674     0.009     0.009     0.010
Trips                                          0.003     0.003     0.002     0.186     0.196     0.119     0.017     0.017     0.014
Transportation                                 0.017     0.016     0.017     0.939     0.938     0.949     0.018     0.018     0.018
Gasoline                                       0.006     0.006     0.007     0.893     0.894     0.891     0.007     0.007     0.008
Other vehicle expenses                         0.010     0.010     0.009     0.671     0.673     0.660     0.014     0.015     0.014
Public transportation                          0.001     0.001     0.001     0.134     0.133     0.135     0.005     0.005     0.005
Entertainment                                  0.006     0.006     0.005     0.901     0.908     0.858     0.007     0.007     0.005
Fees, admissions, toys, and sports             0.004     0.004     0.003     0.670     0.675     0.635     0.006     0.006     0.005
Personal care services                         0.001     0.001     0.001     0.734     0.749     0.635     0.002     0.002     0.002
Reading                                        0.001     0.001     0.000     0.587     0.611     0.428     0.001     0.001     0.001
Other nondurables                              0.006     0.007     0.004     0.534     0.554     0.404     0.012     0.012     0.009
Child support, alimony, and charity            0.005     0.005     0.002     0.456     0.475     0.330     0.010     0.011     0.007
Pensions, insurance, and social security       0.002     0.002     0.001     0.189     0.196     0.142     0.009     0.009     0.008

Children’s clothing                            0.001     0.001     0.003     0.199     0.171     0.386     0.006     0.005     0.008
Children’s clothing only among
  families with children                       0.003     0.002     0.003     0.411     0.425     0.386     0.006     0.006     0.008

Notes: EITC means earned income tax credit. For each column, the subcategories may not total because of rounding. Children’s clothing is a portion of the apparel subcategory.
Source: Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.
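
As a reading aid for the three column groups, the sketch below checks an accounting identity that should hold up to rounding: the unconditional share equals the probability of any expenditure times the share conditional on nonzero expenditure. The figures are copied from the table; the function names and the tolerance of 0.001 are illustrative assumptions, not part of the authors' method.

```python
# Each tuple: (share of annual expenditure, probability of any expenditure,
#              share conditional on nonzero expenditure), as reported in the table.
TABLE2_ROWS = {
    "Durable goods, all": (0.015, 0.845, 0.018),
    "Vehicle purchases, all": (0.007, 0.024, 0.302),
    "Vehicle purchases, EITC": (0.008, 0.034, 0.244),
    "Transportation, all": (0.017, 0.939, 0.018),
}


def implied_unconditional(probability, conditional_share):
    """Unconditional mean share implied by the other two columns."""
    return probability * conditional_share


def is_consistent(unconditional, probability, conditional_share, tol=0.001):
    """True if the reported share matches the implied one within rounding (tol is assumed)."""
    implied = implied_unconditional(probability, conditional_share)
    return abs(implied - unconditional) <= tol


results = {label: is_consistent(*row) for label, row in TABLE2_ROWS.items()}
for label, ok in results.items():
    print(f"{label}: {'consistent' if ok else 'inconsistent'}")
```

The vehicle rows show why the decomposition matters: few families buy a vehicle in a given month, but those that do devote a large share of annual spending to it.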


Table 3

Expenditure amounts, by EITC eligibility, conditional on expenditure
(2004 dollars)

                                                   All    Non-EITC        EITC
Total                                         1,788.78    1,822.02    1,568.02

Durable goods                                   475.38      484.59      414.21
Household goods                                  73.36       77.29       47.30
Furniture                                        34.27       35.81       24.02
Drapes, linens, and floor coverings              12.24       12.94        7.60
Miscellaneous household equipment                26.85       28.54       15.68
Appliances                                       20.24       20.97       15.38
Major appliances                                 14.97       15.52       11.32
Minor appliances                                  5.27        5.45        4.07
Electronics                                      79.38       80.87       69.47
Vehicle purchases                               302.40      305.47      282.05

Nondurables                                   1,313.40    1,337.43    1,153.81
Food, alcohol, and tobacco                      491.73      487.31      521.08
Food                                            349.55      341.46      403.29
Alcohol                                          15.00       15.66       10.60
Tobacco                                          26.19       24.80       35.42
Food away from home                             101.00      105.40       71.77
Apparel                                         119.46      119.98      115.99
Trips                                            77.85       83.91       37.57
Transportation                                  323.12      326.16      302.99
Gasoline                                        109.67      108.25      119.06
Other vehicle expenses                          201.27      205.32      174.40
Public transportation                            12.19       12.59        9.53
Entertainment                                   144.07      151.83       92.53
Fees, admissions, toys, and sports              104.97      110.84       65.97
Personal care services                           25.22       26.08       19.52
Reading                                          13.88       14.91        7.04
Other nondurables                               157.15      168.23       83.63
Child support, alimony, and charity             119.03      127.82       60.63
Pensions, insurance, and social security         38.13       40.40       23.00

Children’s clothing                              25.20       22.12       45.68
Children’s clothing only among
  families with children                         55.31       60.62       45.68

Notes: EITC means earned income tax credit. For each column, the subcategories may not total because of rounding. Children’s clothing is a portion of the apparel subcategory.
Source: Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.

We allow for correlation among errors (ε) within a consumer unit over time.
The coefficients in the vector γ_t measure the common seasonal pattern of expenditure for all households relative to September (the omitted month). For the equation measuring the percentage of total expenditure, γ_t indicates the fraction of total expenditure on good j in month t relative to the fraction of total expenditure in September. The coefficient φ measures the constant difference in the fraction of expenditures between EITC eligible and noneligible households. Our coefficients of interest are the elements of the vector λ_t, which measure the monthly differences in expenditure (the different seasonality) between eligible and noneligible households. If all households perfectly smoothed their consumption across months, γ_t would be 0 and the difference in expenditures between EITC eligible and noneligible households would be constant and entirely captured by φ. We interpret the coefficient on the EITC × February interaction (λ_Feb) as an indicator of whether the EITC changes the expenditure patterns of recipients and report p values for a test of the hypothesis that λ_Feb = 0.
Our identification strategy relies on two sources of variation: cross-sectional differences in eligibility and the particular timing of EITC refunds. We have no reason to believe, a priori, that unobserved factors such as prices or preferences influence February expenditure among low-income, working families with children differently than other families.14 Thus, we feel confident interpreting our λ_Feb as the impact of the EITC.
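
To make the roles of γ_t, φ, and λ_t concrete, here is a minimal sketch under a simplifying assumption: the regression contains only month dummies (September omitted), an EITC indicator, and their interactions, with no other controls. In that saturated case the least squares coefficients equal differences in cell means, so λ_Feb is a difference-in-differences. The function name, the input layout, and the data are hypothetical, and the sketch does not compute the within-consumer-unit correlated standard errors described above.

```python
from collections import defaultdict
from statistics import mean

OMITTED = "Sep"  # reference month, as in the text


def seasonal_coefficients(rows):
    """rows: iterable of (month, eitc_flag, outcome) observations.

    Returns (gamma, phi, lam) for a model with month dummies, an EITC
    indicator, and EITC x month interactions and no other controls.
    """
    cells = defaultdict(list)
    for month, eitc, y in rows:
        cells[(month, bool(eitc))].append(y)
    avg = {key: mean(values) for key, values in cells.items()}

    months = sorted({m for m, _ in avg if m != OMITTED})
    base_non, base_eitc = avg[(OMITTED, False)], avg[(OMITTED, True)]

    # gamma_t: seasonal deviation from September for noneligible households
    gamma = {m: avg[(m, False)] - base_non for m in months}
    # phi: constant September gap between eligible and noneligible households
    phi = base_eitc - base_non
    # lambda_t: extra seasonal deviation for eligible households
    lam = {m: (avg[(m, True)] - base_eitc) - gamma[m] for m in months}
    return gamma, phi, lam


# Hypothetical monthly expenditure shares, for illustration only.
rows = [
    ("Sep", 0, 0.084), ("Sep", 0, 0.086),
    ("Feb", 0, 0.080), ("Feb", 0, 0.082),
    ("Sep", 1, 0.083), ("Sep", 1, 0.085),
    ("Feb", 1, 0.086), ("Feb", 1, 0.088),
]
gamma, phi, lam = seasonal_coefficients(rows)
print(gamma["Feb"], phi, lam["Feb"])
```

With perfect smoothing every γ_t would be zero and the eligible-noneligible gap would be φ in every month, so any nonzero λ_Feb reflects seasonality specific to eligible households.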


Results
Figure 4 shows overall expenditure
seasonality relative to September. There
are a number of notable patterns in the
data. High expenditure in December due
to the holiday season dominates expenditure patterns. We also observe high durable goods expenditures in the summer
months when many individuals buy cars
and household items. There is also an increase in nondurable goods expenditures
in August in part because of back-to-school
shopping. Finally, expenditure is low in
February, the shortest month of the year.
Table 4 presents estimates of λ_Feb and the associated p value for the two continuous specifications of equation 1 and marginal effects based on λ_Feb and the associated p value for the probit model. We present these results for total, durable, and nondurable expenditure and for numerous subcategories of expenditure.
Figures 5–10 graph the coefficients γ_t, λ_t, and (γ_t + λ_t), labeled “Non-EITC families,” “Marginal EITC effect,” and “EITC families,” respectively, in the legend, for the three different specifications of equation 1 and for selected expenditure categories. Since we omit September and do not graph φ, the “Non-EITC families” and “EITC families” lines represent deviations from their respective September expenditure measures. “Marginal EITC effect” is the difference between these two lines. In order to facilitate comparison between goods, for the continuous variables, the figures scale the estimated coefficients by the dependent variable mean (the average monthly expenditure on that good). For the probit model, we divide the coefficient by the estimated probability of expenditure. The denominators are listed in each figure panel, along with the p value for a test of the hypothesis that λ_Feb = 0. If we fail to reject λ_Feb = 0, then we cannot reject the hypothesis that the EITC does not affect expenditure on that good.
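
The scaling can be illustrated with numbers reported elsewhere in the article: the February EITC coefficients for unconditional expenditure in table 4 and the dependent variable means printed in the corresponding figure panels. The helper name below is an assumption made for illustration.

```python
def scaled_effect(coefficient, denominator):
    """Coefficient expressed as a fraction of the dependent variable mean."""
    return coefficient / denominator


# (February EITC coefficient from table 4, dependent variable mean from the figure panel)
examples = {
    "Nondurable goods": (0.0027, 0.0681),
    "Transportation": (0.0011, 0.0165),
    "Durable goods": (0.0039, 0.0153),
}

scaled = {good: scaled_effect(c, m) for good, (c, m) in examples.items()}
for good, value in scaled.items():
    print(f"{good}: {value:.1%} of the average monthly share")
```

These ratios come to roughly 4 percent, 7 percent, and 25 percent, which is close to the February gaps between EITC and non-EITC families described in the results for nondurables, transportation, and durable goods.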

Figure 4

Overall expenditure seasonality, 1997–2006
[Line chart omitted. Vertical axis: month coefficients/dependent variable mean. Horizontal axis: calendar months, January through December. Series: total, durable goods, nondurable goods.]
Note: The data are for all families’ fraction of annual expenditure.
Source: Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.

Figure 5

Nondurable goods, proportion of annual expenditure, 1997–2006
[Line chart omitted. Vertical axis: month coefficients/dependent variable mean. Horizontal axis: calendar months, January through December. Series: non-EITC families, marginal EITC effect, EITC families. Dependent variable mean = 0.0681; p value of EITC × February = 0.0000.]
Note: EITC means earned income tax credit.
Source: Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.

Nondurable goods
Figure 5 depicts seasonal expenditure patterns for nondurable goods expenditures by EITC eligibility status.

As shown in the figure, we find a small, but statistically significant and positive, February effect on unconditional expenditures for EITC families (p value = 0.000). While noneligible families spend about 4 percent less on nondurables in February than in September, EITC families spend about the same in February and September. We do not investigate conditional or discrete expenditure because the probability of making nondurable goods expenditure is nearly 1 in a given month.

Table 4

Effects of EITC eligibility on February expenditures

Columns:
(1) Unconditional expenditure, Feb. coefficient    (2) p value
(3) Conditional expenditure, Feb. coefficient      (4) p value
(5) Discrete expenditure, Feb. marginal effect     (6) p value

                                                  (1)        (2)        (3)        (4)        (5)        (6)
Total                                          0.0067     0.0000     0.0067     0.0000

Durable goods                                  0.0039     0.0004     0.0043     0.0012     0.0144     0.0023
Household goods                                0.0009     0.0001     0.0024     0.0087     0.0312     0.0002
Furniture                                      0.0008     0.0000     0.0050     0.1133     0.0195     0.0000
Drapes, linens, and floor coverings            0.0000     0.7757    –0.0008     0.2527     0.0205     0.0004
Miscellaneous household equipment              0.0001     0.2578     0.0001     0.9176     0.0238     0.0020
Appliances                                     0.0003     0.0125     0.0003     0.8035     0.0304     0.0000
Major appliances                               0.0001     0.1710    –0.0021     0.4006     0.0094     0.0069
Minor appliances                               0.0001     0.0020     0.0013     0.0336     0.0212     0.0001
Electronics                                    0.0005     0.0067     0.0005     0.0176     0.0091     0.0854
Vehicle purchases                              0.0023     0.0332     0.0004     0.9844     0.0092     0.0008

Nondurables                                    0.0027     0.0000
Food, alcohol, and tobacco                     0.0009     0.0007     0.0009     0.0008     0.0001     0.5506
Food                                           0.0007     0.0035     0.0007     0.0063     0.0011     0.1427
Alcohol                                        0.0000     0.5658     0.0000     0.8426    –0.0052     0.4628
Tobacco                                        0.0001     0.1492     0.0000     0.7804     0.0083     0.0847
Food away from home                            0.0001     0.1018     0.0001     0.5328     0.0138     0.0136
Apparel                                       –0.0002     0.3891    –0.0004     0.2278     0.0140     0.0763
Trips                                          0.0008     0.0000     0.0020     0.1402     0.0243     0.0024
Transportation                                 0.0011     0.0003     0.0011     0.0006     0.0022     0.2173
Gasoline                                       0.0002     0.0427     0.0002     0.0358     0.0007     0.7920
Other vehicle expenses                         0.0008     0.0047     0.0009     0.0187     0.0122     0.0918
Public transportation                          0.0001     0.0177     0.0004     0.2704     0.0135     0.0061
Entertainment                                  0.0000     0.8229     0.0000     0.8136    –0.0024     0.5228
Fees, admissions, toys, and sports             0.0000     0.9810    –0.0001     0.7525     0.0057     0.4336
Personal care services                         0.0000     0.1728     0.0001     0.1741     0.0052     0.4177
Reading                                        0.0000     0.9818     0.0000     0.5927     0.0063     0.4132
Other nondurables                              0.0001     0.6995    –0.0001     0.7713     0.0068     0.3439
Child support, alimony, and charity            0.0002     0.1614     0.0002     0.6355     0.0143     0.0436
Pensions, insurance, and social security      –0.0001     0.1145    –0.0004     0.4439    –0.0154     0.0120

Children’s clothing only among
  families with children                       0.0002     0.1490    –0.0007     0.0197     0.0415     0.0000

Notes: EITC means earned income tax credit. Children’s clothing is a portion of the apparel subcategory.
Source: Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.
In figure 6, we present results for a subset of nondurables that is particularly relevant to the EITC’s goals:
expenditures on children’s clothes. We estimate these
models only for families with children so that the nonEITC control group is not dominated by childless families. Overall seasonal patterns between EITC families
with children and non-EITC families with children are
very similar, exhibiting a large increase in expenditures
before school starts in September and during the holiday season (panel A). The EITC families are more likely
to buy children’s clothes in February than non-EITC
families (panel B), but since they spend a slightly lower
proportion of their total annual expenditure conditional
on buying children’s clothes (panel C), we do not find
a statistically significant unconditional effect.
In figure 7, we present results for the nondurables
portion of transportation. This includes gasoline, local
public transportation, and car expenses outside of
vehicle purchases. We find that EITC eligible households spend about 4 percent more in February than
September, while noneligible households spend about
3 percent less (panel A). Most of this difference arises
from higher spending conditional on positive expenditure
(panel C). If we look at the first column of table 4, we
find that transportation spending increases in February
are the largest single contributor to the overall nondurables increase. From table 4, we also observe that
EITC households spend relatively more on food and
on trips than noneligible households in February.
Durable goods
Figure 8 presents results for all durable goods.
The difference in expenditure patterns between EITC
and non-EITC families in February is much more pronounced than for nondurable goods. While non-EITC
families spend about 8 percent less on durables in
February than in September, EITC families spend about
18 percent more (panel A). The EITC families are significantly more likely both to make a durable goods
purchase in February and to spend more conditional
on making a purchase (panels B and C, respectively).
We now examine the subcategories of durable goods that drive the patterns depicted in figure 8. Figure 9 presents results for new and used vehicle purchases.15 While non-EITC families spend about 17 percent less on vehicles in February than in September, EITC families spend 18 percent more (panel A), for a statistically significant difference of 35 percent (p value = 0.0332). This difference is entirely attributable to the fact that relative to September, EITC families are more than 600 percent more likely than non-EITC families to buy a car in February (panel B). This difference is about twice as large in February as in either January or March. These findings are also consistent with the research of Adams, Einav, and Levin (2007), which shows high demand for subprime auto loans in tax rebate season among households likely to be receiving an EITC refund. Interestingly, though, among families making a vehicle purchase in February (panel C), all families spend the same proportion of their annual expenditure on these goods (p value = 0.9844). Recall that in dollars, this amount is considerably smaller for EITC families.

[Figures 6–10: line charts omitted. Each panel plots, by calendar month, the series for non-EITC families, the marginal EITC effect, and EITC families. Panels A and C show month coefficients/dependent variable mean; panel B shows month coefficients/estimated probability.]

Figure 6

Expenditure on children’s clothing only among families with children, 1997–2006
A. Proportion of annual expenditure: dependent variable mean = 0.0012; p value of EITC × February = 0.1490
B. Probability of making expenditure: estimated probability = 0.4108; p value of EITC × February = 0.0000
C. Proportion of annual expenditure conditional on nonzero expenditure: dependent variable mean = 0.0061; p value of EITC × February = 0.0196

Figure 7

Expenditure on transportation excluding vehicle purchases, 1997–2006
A. Proportion of annual expenditure: dependent variable mean = 0.0165; p value of EITC × February = 0.0002
B. Probability of making expenditure: estimated probability = 0.9393; p value of EITC × February = 0.2172
C. Proportion of annual expenditure conditional on nonzero expenditure: dependent variable mean = 0.0175; p value of EITC × February = 0.0006

Figure 8

Durable goods expenditures, 1997–2006
A. Proportion of annual expenditure: dependent variable mean = 0.0153; p value of EITC × February = 0.0004
B. Probability of making expenditure: estimated probability = 0.8446; p value of EITC × February = 0.0023
C. Proportion of annual expenditure conditional on nonzero expenditure: dependent variable mean = 0.0181; p value of EITC × February = 0.0012

Figure 9

Vehicle purchases expenditures, 1997–2006
A. Proportion of annual expenditure: dependent variable mean = 0.0072; p value of EITC × February = 0.0332
B. Probability of making expenditure: estimated probability = 0.0239; p value of EITC × February = 0.0007
C. Proportion of annual expenditure conditional on nonzero expenditure: dependent variable mean = 0.3024; p value of EITC × February = 0.9844

Figure 10

Consumer electronics expenditures, 1997–2006
A. Proportion of annual expenditure: dependent variable mean = 0.0042; p value of EITC × February = 0.0066
B. Probability of making expenditure: estimated probability = 0.8086; p value of EITC × February = 0.0853
C. Proportion of annual expenditure conditional on nonzero expenditure: dependent variable mean = 0.0052; p value of EITC × February = 0.0175

Notes (figures 6–10): EITC means earned income tax credit. Horizontal axes are in calendar months.
Source (figures 6–10): Authors’ calculations based on data from the U.S. Bureau of Labor Statistics, Consumer Expenditure Survey.
Figure 10 graphs the coefficients from models
of spending on consumer electronics, which include
television sets, computers, and video and music players.
Considering all observations, non-EITC households
spend about 5 percent more on consumer electronics
in February than in September, and EITC households
spend about 15 percent more (panel A). However, the
February effect is relatively small compared with the
overall February effect for durable goods and substantially smaller than the effect for vehicles.
Results for other subcategories of durable goods are similar to the findings for electronics. In February, EITC eligible households spend more than noneligible households on both household goods and appliances, but the magnitude of these effects is smaller than the magnitude of the effect for vehicles.
Conclusion
The results presented here indicate that EITC
families spend at least a portion of their refund immediately upon receipt. Consistent with Barrow and
McGranahan (2000), we find that recipients spend more
on durables than on nondurables in response to the
EITC. In particular, recipients are far more likely to
purchase vehicles after receiving EITC refunds. The
EITC increases relative average monthly spending on
vehicles in February by about 35 percent for EITC
families compared with their non-EITC counterparts.
Within nondurables, expenditure increases are concentrated in transportation. Given the crucial role of
access to transportation in promoting work, this leads
to the conclusion that recipient spending patterns support the program’s prowork goals. The EITC recipients
are also more likely to spend money within the other
durable goods categories, as well as on trips and food.
In future work, we hope to further analyze the
consumption effects of the EITC by taking advantage of
differences in state EITCs and by exploiting expansions in the EITC since its inception, as well as the
changes in the timing of EITC payments.


NOTES

1. The 2005 CES contains data for the first quarter of 2006.

2. A qualifying child must meet three requirements. First, this individual must be a child, stepchild, foster child, sibling, half sibling, stepsibling, or a descendent of a sibling of the tax filer. Second, the qualifying child must be younger than 19 at the end of the year, younger than 24 and a full-time student, or permanently disabled. Third, the qualifying child must live with the tax filer in the U.S. for at least six months out of the year. If two tax filers can claim one qualifying child, they can choose which one claims the child, but they both cannot claim the same child (U.S. Internal Revenue Service, 2006a). Starting in 2002, some married taxpayers filing jointly had higher benefits than singles with the same income and number of children.

3. For e-filing, the e-file window needs to be open. This occurs in early January and happened on January 12, 2007.

4. This was determined from authors’ calculations based on data from the U.S. Internal Revenue Service (2006b).

5. These figures for tax year 2004 are based on calculations using U.S. Internal Revenue Service (2006b) data. We assume that all overpayment refunds not due to the EITC are given to non-EITC recipients. The 26 percent of non-EITC taxpayers who did not receive a refund are included as zeros in this calculation.

6. U.S. Social Security Administration (2006); and U.S. Department of Agriculture, Food and Nutrition Service (2008).

7. The nondurables portion of transportation consists of gasoline and motor oil (42 percent), other vehicle expenses (49 percent), and public transportation (9 percent), according to the U.S. Bureau of Labor Statistics (2007).

8. A consumer unit is defined to be an individual or a group of individuals who are either related or use their income to make joint expenditures on two of three categories (housing, food, or other living expenses).

9. In future work, we plan to take advantage of changes in the timing of EITC payments and of expansions in the EITC to further investigate consumption responses to the program.

10. In 2004, the credit for childless families accounted for only 3 percent of EITC payments despite representing 21 percent of returns receiving the EITC (U.S. Internal Revenue Service, 2006b).

11. Our method of dealing with qualifying children could falsely impute EITC eligibility or inflate refund amounts for CUs with children and multiple, unrelated TUs. This is only a potential problem for the 4 percent of CUs that contain multiple TUs, have any qualifying children, and were assigned the EITC. Furthermore, if EITC eligibility “truly” has an impact on expenditure, then misallocating households into the EITC group should bias our results away from finding a difference in expenditure seasonality between eligible and noneligible CUs.

12. Throughout the analysis, we rely on the monthly data in the CES. In some cases the monthly information is unreliable because of the random attribution of some expenditure to months in the survey. This attribution would likely operate in the same manner for EITC recipient and nonrecipient households.

13. For households with 12 observations, $X_{i,Annual} = \sum_{t=1}^{12} X_{i,t}^{Total}$. In order to adjust monthly expenditure proportions for households with fewer than 12 observations, we regress $X_{it}^{Total}$ on household characteristics for 12-month households only and then generate predicted expenditure proportions for all households. The sum of these predicted monthly proportions gives the expected proportion of annual expenditures that we actually observe for households with fewer than 12 observations. Thus, we estimate true annual expenditures by dividing the sum of $m$ ($m < 12$) observed expenditures by the sum of $m$ expected monthly proportions:

$$\frac{\sum_{t=1}^{m} X_{it}^{Total}}{\sum_{t=1}^{m} E\left[ \dfrac{X_{it}^{Total}}{X_{i,Annual}} \right]}.$$

We use this expression as the denominator of monthly expenditure proportions for households with fewer than 12 observations. It is because of this adjustment that average monthly expenditures are not equal to 1/12, or 8.33 percent, in table 2. We do not adjust the estimated standard errors in our regressions for this imputation.

14. In their study of retail markdowns in Ann Arbor, Michigan, Warner and Barsky (1995) note that “prices are indeed lowest in January, but tend to return in February to December’s level.” We do not correct for the fact that February has fewer days than other months, which should, all else being equal, reduce February expenditures for both EITC recipient and nonrecipient households.

15. According to the CES documentation, vehicle expenditures are defined as the purchase price minus the trade-in value on new and used domestic and imported cars and trucks and other vehicles, including motorcycles and private planes.
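
The annualization described in note 13 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the expected monthly proportions are made-up inputs standing in for the regression predictions the note describes.

```python
import math


def estimate_annual_total(observed, expected_shares):
    """Estimated annual expenditure for a household observed for m < 12 months.

    observed: the m observed monthly total expenditures.
    expected_shares: the m predicted shares of annual expenditure for the
        same months (assumed to come from a first-stage regression).
    """
    if len(observed) != len(expected_shares):
        raise ValueError("need one expected share per observed month")
    return sum(observed) / sum(expected_shares)


def monthly_proportions(observed, expected_shares):
    """Monthly expenditure as a proportion of estimated annual expenditure."""
    annual = estimate_annual_total(observed, expected_shares)
    return [x / annual for x in observed]


# Hypothetical household observed for three months.
observed = [1500.0, 1400.0, 1600.0]
expected_shares = [0.080, 0.075, 0.085]  # illustrative predicted shares

annual = estimate_annual_total(observed, expected_shares)
proportions = monthly_proportions(observed, expected_shares)
print(annual, proportions)
```

By construction, the adjusted proportions for the observed months sum to the expected share of the year that those months represent, rather than to 1.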


references

Adams, William, Liran Einav, and Jonathan D.
Levin, 2007, “Liquidity constraints and imperfect information in subprime lending,” National Bureau of
Economic Research, working paper, No. 13067, April.
Barrow, Lisa, and Leslie McGranahan, 2000, “The
effects of the earned income credit on the seasonality
of household expenditures,” National Tax Journal,
Vol. 53, No. 4, part 2, December, pp. 1211–1244.
Blumenberg, Evelyn, and Margy Waller, 2003,
“The long journey to work: A federal transportation
policy for working families,” Brookings Institution
Series on Transportation Reform, Brookings Institution,
Center on Urban and Metropolitan Policy, report, July.
Frost, Jonas Martin, III (D-TX), 1993, speaking
before the U.S. House of Representatives, 103rd
Cong., 1st sess., Congressional Record, Vol. 139,
July 29, p. H5502.
Grogger, Jeffrey, 2003, “The effects of time limits,
the EITC, and other policy changes on welfare use,
work, and income among female-headed families,”
Review of Economics and Statistics, Vol. 85, No. 2,
pp. 394–408.
Eissa, Nada, and Jeffrey B. Liebman, 1996, “Labor
supply response to the earned income tax credit,”
Quarterly Journal of Economics, Vol. 111, No. 2,
May, pp. 605–637.
Linnenbrink, Mary, Michael Rupured, Teresa
Mauldin, and Joan Koonce Moss, 2006, “The
earned income tax credit: Experiences from and implications of the voluntary income tax assistance program in Georgia,” in 2006 Eastern Family Economics
and Resource Management Association Conference
Proceedings, section A, pp. 11–16, available at
http://mrupured.myweb.uga.edu/conf/2.pdf.
Mammen, Sheila, and Frances Lawrence, 2006,
“Use of the earned income tax credit by rural working families,” in 2006 Eastern Family Economics and
Resource Management Association Conference
Proceedings, section B, pp. 29–37, available at
http://mrupured.myweb.uga.edu/conf/4.pdf.
Meyer, Bruce D., and Dan T. Rosenbaum, 2001,
“Welfare, the earned income tax credit, and the labor
supply of single mothers,” Quarterly Journal of
Economics, Vol. 116, No. 3, August, pp. 1063–1114.


Smeeding, Timothy M., Katherin Ross Phillips, and
Michael O’Connor, 2000, “The EITC: Expectation,
knowledge, use, and economic and social mobility,”
National Tax Journal, Vol. 53, No. 4, part 2, December,
pp. 1187–1210.
U.S. Bureau of Labor Statistics, 2007, “Consumer
expenditures in 2005,” report, Washington, DC,
No. 998, February, available at www.bls.gov/cex/
csxann05.pdf, accessed on February 5, 2008.
U.S. Department of Agriculture, Food and Nutrition
Service, 2008, “Food Stamp Program,” report,
Alexandria, VA, February 19, available at www.fns.
usda.gov/fsp/faqs.htm, accessed on March 3, 2008.
U.S. General Accounting Office, 2001,
“Earned income tax credit participation,” report,
Washington, DC, No. GAO-02-290R, December 14.
U.S. House of Representatives, Committee on
Ways and Means, 2004, 2004 Green Book:
Background Material and Data on the Programs
Within the Jurisdiction of the Committee on Ways
and Means, Washington, DC, April.
U.S. Internal Revenue Service, 2006a, “Publication
596 (2006), earned income credit (EIC),” report,
Washington, DC, available at www.irs.gov/publications/
p596/index.html, accessed on August 6, 2007.
__________, 2006b, “SOI tax stats—Individual income tax returns publication 1304 (complete report),”
report, Washington, DC, available at www.irs.gov/
taxstats/indtaxstats/article/0,,id=134951,00.html,
accessed on June 4, 2007.
__________, 2006c, “SOI tax stats—Issuing refunds,”
report, Washington, DC, available at www.irs.gov/
taxstats/compliancestats/article/0,,id=97270,00.html.
U.S. Social Security Administration, 2006, “Highlights and trends,” in Annual Statistical Supplement,
2005, Baltimore, MD, February, pp. 1–8, available at
www.ssa.gov/policy/docs/statcomps/supplement/2005/
highlights.pdf, accessed on March 3, 2008.
Warner, Elizabeth, and Robert B. Barsky, 1995,
“The timing and magnitude of retail store markdowns:
Evidence from weekends and holidays,” Quarterly
Journal of Economics, Vol. 110, No. 2, May, pp. 321–352.


Are inflation targets good inflation forecasts?
Marie Diron and Benoît Mojon

Introduction and summary
The growing use of inflation targeting and other forms
of quantified inflation objectives has marked the history of monetary policy since 1990. Indeed, a majority
of industrialized countries have either adopted some
form of inflation targeting or, most notably for the
15 countries that have adopted the euro, defined a quantified inflation objective. In the United States, the
Federal Reserve System aims to conduct the nation’s
monetary policy by influencing the monetary and credit
conditions in the economy in pursuit of “maximum employment, stable prices, and moderate long-term interest rates.”1 The Fed does not have an inflation target.
An inflation target is a numerical point or range
for the inflation of a given price index that the central
bank declares to be its objective for inflation. For instance, the Bank of Canada aims to keep inflation at
the 2 percent target. And the European Central Bank
(ECB) aims to keep inflation below but close to 2 percent. Central banks that have a quantified inflation
objective do structure the communication of their
monetary policy around this objective.2 Table 1 shows
how various central banks currently define their inflation objectives, as reported on the central banks’ websites. Table 2 shows when these targets were adopted
and how they have changed. Inflation point targets
and the midpoints of inflation target ranges are usually between 2 percent and 2.5 percent. These targets
were first introduced between the early 1990s and the
early 2000s. There is a broad consensus among economists that, as shown in figure 1, countries that have
adopted an inflation target have stabilized inflation
close to the inflation target.
In theory, a major virtue of quantified inflation objectives is to anchor inflation expectations—a key ingredient for the success of monetary policy. Stabilizing
inflation expectations is important3 because prices
and wages adjust relatively infrequently (for the most
up-to-date evidence, see Dhyne et al., 2005; Fabiani
et al., 2005; Vermeulen et al., 2007; and the references
therein). The people and institutions in the economy
(we call these economic agents) usually set prices and
wages over some horizon, and the level of these prices
and wages would reflect their expectation of the evolution of inflation. If these economic agents know what
the official inflation target is and the target is credible,
they will expect the general price level to grow at the
rate of the preannounced objective of the central
bank. This expectation in itself then helps to deliver
realized inflation close to the target.
While many economists find this argument to be
convincing, there has been little research so far on
whether the central banks’ targets actually do a better
job at forecasting inflation than other inflation benchmarks. In this article, we evaluate the potential benefits of inflation targets by comparing the performance
of benchmark forecasts of inflation (model-based and
published forecasts) and forecasts that are set equal to
the inflation target. We conduct this comparison of
forecast performance for the euro area, Australia,
Canada, New Zealand, Norway, Sweden, Switzerland,
and the United Kingdom, all of which have established
inflation targets as shown in table 1.
Marie Diron is an economist at Oxford Economics. She
worked on this project while she was at the European Central
Bank. Benoît Mojon is an economist at the Federal Reserve
Bank of Chicago, and is on leave from the European Central
Bank. The authors thank Gonzalo Camba-Mendez, Han Choi,
Michael Ehrmann, Gabriel Fagan, Jonas Fisher, Alejandro
Justiniano, Simone Manganelli, Sergio Nicoletti-Altimari,
Athanasios Orphanides, Anna Paulson, Frank Smets, Lars
Svensson, David Vestin, and participants in a Chicago Fed
research seminar and the Eurosystem Inflation Persistence
Network September 2005 meeting for comments and suggestions. The views expressed here are the authors’ and do not
necessarily reflect the views of the European Central Bank.


Table 1

Inflation objectives in selected Organization for Economic Cooperation and
Development countries and in the euro area
Euro area	

The primary objective of the European Central Bank’s (ECB) monetary policy is to maintain
price stability. The ECB aims at (harmonized index of consumer prices, or HICP) inflation rates
of below, but close to, 2 percent over the medium term.

Australia	

In pursuing the goal of medium-term price stability, both the bank and the government agree
on the objective of keeping consumer price inflation between 2 percent and 3 percent, on
average, over the cycle. This formulation allows for the natural short-run variation in inflation
over the business cycle while preserving a clearly identifiable performance benchmark over time.

Canada	

The Bank of Canada aims to keep inflation at the 2 percent target, the midpoint of the
1 percent to 3 percent inflation-control target range. This target is expressed in terms of
total Consumer Price Index (CPI) inflation, but the bank uses a measure of core inflation as
an operational guide. Core inflation provides a better measure of the underlying trend of
inflation and tends to be a better predictor of future changes in the total CPI.

New Zealand	

The Reserve Bank uses monetary policy to maintain price stability as defined in the policy
targets agreement (PTA). The current PTA requires the bank to keep inflation between
1 percent and 3 percent on average over the medium term. The bank implements monetary
policy by setting the official cash rate (OCR), which is reviewed eight times a year.

Norway	

The government has defined an inflation target for monetary policy in Norway. The operational
target is an inflation rate of 2.5 percent over time (with annual consumer price inflation of
approximately 2.5 percent over time).

Sweden	

According to the Sveriges Riksbank Act, the objective of monetary policy is to “maintain
price stability.” The Riksbank [or the central bank of Sweden] has interpreted this objective
to mean a low, stable rate of inflation. More precisely, the Riksbank’s objective is to keep
inflation around 2 percent per year, as measured by the annual change in the Consumer Price
Index (CPI). There is a tolerance range of plus/minus 1 percentage point around this target. 
At the same time, the range is an expression of the Riksbank’s ambition to limit such
deviations. In order to keep inflation around 2 percent, the Riksbank adjusts its key interest
rate, the repo rate.

Switzerland	

The Swiss National Bank equates price stability with a rise in the national Consumer Price
Index (CPI) of less than 2 percent per annum. In so doing, it takes account of the fact that
not every price movement is necessarily inflationary in nature. Furthermore, it believes that
inflation cannot be measured accurately. Measurement problems arise, for example, when
the quality of goods and services improves. Such changes are not properly accounted for in
the CPI; as a result, the measured level of inflation will tend to be slightly overstated.

United Kingdom	

A principal objective of any central bank is to safeguard the value of the currency in terms of
what it will purchase. Rising prices—inflation—reduces the value of money. ... In May 1997,
the government gave the bank independence to set monetary policy by deciding the level of
interest rates to meet the government’s inflation target—currently 2 percent. [The inflation
target of 2 percent is expressed in terms of an annual rate of inflation based on the
Consumer Prices Index (CPI).]

Sources: European Central Bank, www.ecb.int/mopo/html/index.en.html; Reserve Bank of Australia, www.rba.gov.au/MonetaryPolicy/; Bank of Canada, www.bank-banque-canada.ca/en/monetary/monetary_main.html; Reserve Bank of New Zealand, www.rbnz.govt.nz/monpol/index.html; Norges Bank, www.norges-bank.no/Pages/Section____11330.aspx; Sveriges Riksbank, www.riksbank.com/templates/SectionStart.aspx?id=10602; Swiss National Bank, www.snb.ch/en/iabout/monpol; and Bank of England, www.bankofengland.co.uk/monetarypolicy/index.htm.


Table 2

Characteristics of the inflation quantified objectives

              First introduction                              Most recent modifications
Country       Date            Range (%)     Point target (%)  Date                Range (%)     Point target (%)        Horizon                 Target forecast (%)  Forecast evaluation period
Euro area     November 1998   0–2           none              2003                              close to 2 from below   medium run              1.9                  2001:Q1–2007:Q4
Australia     April 1993      2–3           none                                                                        business cycle          2.5                  1995:Q1–2007:Q4
Canada        February 1991   1–3           2                 1995                                                      multiyear; indefinite   2                    1995:Q1–2007:Q4
New Zealand   March 1990      3–5           none              2003                1–3                                   business cycle          2                    1995:Q1–2007:Q4
Norway        March 2001      none          2.5                                                                                                 2.5                  2003:Q1–2007:Q4
Sweden        January 1993    1–3           2                                                                                                   2                    1995:Q1–2007:Q4
Switzerland   1999            0–2           none                                                                                                1                    2001:Q1–2007:Q4
UK            October 1992    1–4 for RPIX  none              2004 switch to CPI  1–3 for CPI   2                                               2.5 RPIX             1995:Q1–2003:Q4
                                                                                                                                                2.0 CPI              2004:Q1–2007:Q4

Notes: The euro itself was introduced in January 1999. However, the monetary policy strategy of the European Central Bank was announced in November 1998. In the United Kingdom, CPI is the Consumer
Prices Index, and RPIX is the Retail Prices Index excluding mortgage interest payments. The last two columns define the target forecasts and the sample of forecast evaluation.
Sources: Roger and Stone (2005); and Swiss National Bank, www.snb.ch/en/iabout/monpol.


We also report results for the
U.S., where inflation is often measured with the core Personal Consumption Expenditures (PCE) Price
Index—a broad measure of consumer prices that excludes the more volatile and seasonal food and energy
prices. Although the Federal Reserve
does not have an inflation target,
many market participants and economists assume that the U.S. central
bank’s price stability mandate can
be associated with numerical values
for the core PCE inflation rate:
Some have argued that this rate is
close to 2 percent,4 while others
think that the Federal Reserve may
have a “comfort zone” that is between 1 percent and 2 percent. Figure 2 shows that core PCE inflation
was indeed close to these numerical
values over the last decade. So, for
comparison, we also assess the forecasting performance of two selected
“constant forecast benchmarks” for
the U.S.—one of core PCE inflation
at 2 percent and the other at 1.5 percent (which is the midpoint of the
alleged “comfort zone”).
Our results provide support for
inflation targeting as a monetary
policy strategy. In all the countries
in our sample and in the euro area,
forecasting that inflation will be at
the inflation “target” implies a smaller forecasting error than alternative
models. This is true for both one- and two-year horizon forecasts. Forecasting inflation to be at the target
also beats the mean of professional
economists’ forecasts published in
Consensus Forecasts for the euro
area, Canada, and Sweden, as well
as for two-years-ahead forecasts in
Switzerland and in the United Kingdom.5 In the case of the U.S., forecasting core PCE inflation to be a
constant benchmark (either at 2 percent or 1.5 percent) also implies a
relatively small error on average
over the past 12 years.


Figure 1

Inflation and quantified inflation objectives

[Eight panels plotting inflation, in percent, from 1986 through 2007: A. Australia; B. Canada; C. New Zealand; D. Norway; E. Sweden; F. Switzerland; G. United Kingdom (RPIX inflation and CPI inflation); H. Euro area. The legend identifies the target ranges and the point target.]

Notes: For panel G, in the United Kingdom, CPI is the Consumer Prices Index, and RPIX is the Retail Prices Index excluding mortgage interest payments.
See tables 1 and 2 for details on the inflation objectives over the time period.
Sources: Roger and Stone (2005) and authors’ calculations based on data from Haver Analytics.


Figure 2

Inflation and selected constant forecast benchmarks for the U.S.

[Line chart of core PCE inflation, in percent, from 1986 through 2007.]

Notes: The U.S. Federal Reserve System does not have an
inflation target. Core PCE is the Personal Consumption
Expenditures Price Index excluding food and energy prices.
See the text for further details on the selected constant
forecast benchmarks.
Source: Authors’ calculations based on data from Haver
Analytics.

To our knowledge, this article is the first one to
show that, while inflation is never exactly at the target, the central bank’s target has provided an ex ante
reliable and, to a large extent, unbeatable inflation
forecasting device in countries that have adopted a
quantified inflation objective. When agents in the
economy choose the inflation target as their expectation of future inflation, it is more likely that the target
is actually hit or at least that low and stable inflation
is maintained.
In the next section, we discuss the role of inflation targets in the formation of inflation expectations.
Then, we describe the forecasting models and report
the results of a “horse race” of inflation forecasts,
comparing the error incurred by taking the target as a
forecast with other widely used forecasting approaches.
Rule-of-thumb expectations and inflation targets
The formation of inflation expectations plays a
large role in the success of monetary policy. Since all
prices and wages cannot be readjusted constantly, anchoring inflation expectations at a low level is essential to ensure price stability.
The academic debate on inflation expectations
has centered on the operational mode of expectation
formation. However, inflation expectations are not
observable. As a result, several views on expectation
formation that are mutually exclusive cannot easily
be proven to be inconsistent with the data (Lindé, 2001).


The most popular view has long been that inflation expectations are rational. Rational expectations
take two complementary meanings. First, expectations
need to fulfill certain criteria to be rational. Thus, rational expectations cannot be systematically or persistently wrong. As a result, a good approximation of
rational expectations is the result of a regression of
future realizations of inflation on past and present observable economic variables. By construction, this
procedure yields expectation errors that are zero on
average. In addition, if the set of economic variables
taken into account is comprehensive enough, this procedure is consistent with the requirement that expectations take into account all available information.
The second meaning of rational expectations holds that in any given model of the economy, agents
form their expectations in a way that is consistent
with the functioning of the model.
Although the assumption of rational expectations
is frequently used in model construction and simulations, the empirical relevance is still controversial.6
In particular, inflation expectations seem to depend
significantly on past and present values of inflation
(for example, Estrella and Fuhrer, 1999). Hence,
some economists have advocated that expectations
should be approximated by simpler expectation
mechanisms, such as projecting inflation to be at the
level observed in the past.
Note that such “rule-of-thumb” expectations are
not necessarily irrational to the extent that rules deriving future inflation from its past values may be the
most efficient use of current available information to
derive the outlook for inflation. A good rationale for
such a rule of thumb is precisely that inflation proves
extremely difficult to forecast with multivariate economic models.7 Simple rules of thumb may therefore
optimally solve the trade-off between accuracy of the
expectations and effort spent to derive them.8 However, especially at times of persistent changes in inflation, such backward-looking rules will lead to
recurring forecast errors of persistent signs.
In countries where the central bank has announced
an inflation target, a natural rule of thumb consists of
expecting that future inflation would be at the target.
The forecast error of this rule of thumb is given by
the deviation of realized inflation from the preannounced target. It is different from zero because the
central bank cannot deliver an inflation rate that is exactly on target every period. However, the degree of
forecast error will depend on which benchmarks are
used and, in particular, on whether alternative forecasts are better or worse.


How well do you forecast inflation if you believe in the central bank’s target?
We first check how the “forecasts” of agents
taking the central bank’s target for granted (henceforth,
“target forecasts”) would perform compared with forecasts based on six alternative benchmarks: a random walk;
a track record or past mean inflation; three specifications of an autoregressive (AR) model of inflation,
that is, a model where past and current inflation help
forecast future inflation; and, finally, the mean inflation forecast published in Consensus Forecasts. These
models, which are standard benchmarks in the forecast evaluation literature, have proved difficult to beat
when trying to forecast inflation (Stock and Watson,
2003; and Banerjee, Marcellino, and Masten, 2003).
The quantified inflation objectives of central banks
An inflation target takes the form of either a numerical value or a range for inflation and a commitment by the central bank to stabilize inflation close to
the target level. Central banks that have a quantified
inflation objective put it at the core of the communication of their monetary policy.9 Table 1 (p. 34) reports
the current (as of January 2008) definitions of the central banks’ inflation objectives taken from their websites. Table 2 (p. 35) shows the timing of the adoption
of the targets and how they have changed over time.
Figure 1 (p. 36) shows how the targets compare with
actual inflation. The central banks’ inflation targets are
now typically between 1 percent and 3 percent. Some
central banks target a range (Australia, euro area, and
Switzerland) and others a specific rate (Norway). Some
banks have changed the definition of their objective
over time (euro area and UK), while some have not
(Australia). Changes have involved the range target
(New Zealand and euro area) or even a change in the
index for which the target is defined (UK).10
Going from the definition of the inflation targets
to a target forecast requires two main assumptions.
The first one is to choose a numerical value for the
target. We choose the effective point target when the
central bank has defined one (Canada, Norway, Sweden,
and the UK from 1996 onward) or, in the case of
countries with inflation range targets (Australia, New
Zealand, Switzerland, and UK before 1996), we use
the midpoint of the range in order to have a point
estimate to which actual inflation can be compared
(following Castelnuovo, Nicoletti-Altimari, and
Rodrígues-Palenzuela, 2003). In the case of the euro
area, the choice of a specific number for the inflation
quantified objective is somewhat more delicate. In
1998, the ECB had defined its inflation objective as
a positive inflation rate less than 2 percent over the
medium run. In May 2003, the ECB clarified its inflation objective as below but close to 2 percent.11 We set
the inflation objective for the euro area at 1.9 percent.
While this choice is somewhat arbitrary and not necessarily in line with the perception of the ECB objective between 1999 and 2003, we believe it is consistent
with the ECB strategy both before and after May 2003.
Finally, we also analyze the case of the U.S. As
noted earlier, in contrast with the other central banks
we study in this article, the Federal Reserve does not
set a target for inflation. However, some observers
have suggested that the Federal Reserve has an implicit target of 2 percent for core PCE inflation.12
Some others consider that the Federal Reserve has a
“comfort zone” that is between 1 percent and 2 percent. We thought it would be interesting to apply the
same type of test to the forecasting performance of
these working assumptions as we do to the official
inflation targets of other countries, purely as an academic exercise. We therefore assess the size of the errors implied by forecasting core PCE inflation rates to
be constant, either at 2 percent or 1.5 percent.
The second assumption we need to make is our
choice of forecast evaluation period. Given the medium-term nature of the central banks’ objectives, which
we interpret as a two-year horizon, we start our forecast evaluation period two years after the inflation
target has been announced. Hence, in the case of
Australia, where the inflation targeting strategy was
launched in 1993, the forecast evaluation commences
for forecasts of inflation for the first quarter of 1995.
In the case of the euro area, we record forecast performances from 2001 onward. The level of the inflation forecast and the first date of the forecast evaluation
are reported in table 2 (p. 35). In the case of the U.S.,
we arbitrarily start the forecasting evaluation in 1995.
Forecasting models
The target forecast model (“Target” in tables 3–5
on pp. 41–42) is simply:

$\pi_{t+h \mid t} = \pi^{*},$

where $\pi_t = \frac{P_t - P_{t-4}}{P_{t-4}} \times 100$, that is, it is the inflation
rate for four quarters, h is either four quarters or eight
quarters, P is the level of the price index, and π* is
the inflation quantified objective defined in the next
to last column of table 2. The range of t + h dates for
which the model is evaluated is given in the last column of table 2.


We should stress that the forecasts are the same
whatever the horizon of the forecast. In this article,
we report results only for h equal to four and eight
quarters ahead.13
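
A minimal Python sketch of this definition follows. It is an illustration only, not the code behind the results reported here: the function names and the price index values are hypothetical. It computes four-quarter inflation from a price index and the error made by forecasting inflation at the target, which by construction does not depend on the horizon h.

```python
def yoy_inflation(prices):
    """Four-quarter inflation in percent: 100 * (P_t - P_{t-4}) / P_{t-4}.

    Returns one value per quarter, starting with the fifth observation.
    """
    return [100.0 * (prices[t] - prices[t - 4]) / prices[t - 4]
            for t in range(4, len(prices))]


def target_forecast_errors(inflation, target):
    """Deviation of realized inflation from the target, quarter by quarter."""
    return [pi - target for pi in inflation]


# Hypothetical quarterly price index levels (not actual data).
prices = [100.0, 100.5, 101.0, 101.5, 102.0, 102.5, 103.0, 104.0]
inflation = yoy_inflation(prices)
errors = target_forecast_errors(inflation, target=2.0)
```

Because the forecast is the constant π*, the same error series applies to the four-quarters-ahead and the eight-quarters-ahead comparisons.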
We compare the target forecast performance with
the forecasts from our six alternative measures. The
first of these is the random walk forecast; that is, we
forecast inflation to be equal to the inflation observed
over the year to the date when the forecast is made:
1) $\pi_{t+h \mid t} = \pi_t.$

This forecasting model is sometimes formulated
in the first difference of inflation, that is, changes in
inflation from one period to the next. We stick to a level
formulation, however, because inflation shows no
trend for the sample over which the forecast evaluation is conducted. We also record the forecast performance of considering that future inflation would be
well approximated by the average inflation level over
the past five years (or 20 quarters). This naive forecast
considers that the recent track record of inflation is
the most informative about where inflation should be:

2) $\pi_{t+h \mid t} = \left( \sum_{i=1}^{20} \pi_{t-i} \right) / 20.$

The main advantage with respect to the first model
(equation 1) is that it may smooth out temporary noise
in current inflation.
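
The two naive benchmarks in equations 1 and 2 can be sketched in a few lines of Python. This is a hedged illustration: the function names and the inflation series are hypothetical, and the series is indexed so that position t is the quarter in which the forecast is made.

```python
def random_walk_forecast(inflation, t):
    """Equation 1: the forecast for t + h is the inflation observed at t."""
    return inflation[t]


def track_record_forecast(inflation, t, window=20):
    """Equation 2: mean of inflation over the `window` quarters before t."""
    if t < window:
        raise ValueError("not enough history for the trailing mean")
    return sum(inflation[t - i] for i in range(1, window + 1)) / window


# Hypothetical year-on-year inflation rates in percent (not actual data).
inflation = [3.0] * 10 + [2.0] * 10 + [1.0]
rw = random_walk_forecast(inflation, t=20)
tr = track_record_forecast(inflation, t=20)
```

In this made-up series, the random walk forecast follows the latest reading of 1 percent, while the five-year mean stays at 2.5 percent, which illustrates how the second benchmark smooths out recent movements.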
We then base inflation forecasts on three autoregressive models.14 The first of these models simply
relates current inflation to its lag levels, where the minimum lag is defined by the forecasting horizon. It is:
3) $\pi_t = C + \alpha \pi_{t-h} + \beta \pi_{t-h-4} + \varepsilon_t,$

where C, α, and β are parameters and ε an error term,
which are to be estimated recursively by ordinary
least squares over the sample from the first quarter of
1986 to t.
This simplifies the forecasting procedure as it can
be computed in one step rather than rolling the model
over intermediate forecasts:

3a) $\pi_{t+h \mid t} = \hat{C} + \hat{\alpha} \pi_t + \hat{\beta} \pi_{t-4},$

where the coefficients with ^ have been estimated.
We also present results for two variants of this
model. First, we formulate the autoregressive model
on the first difference of inflation. This formulation
has the advantage that any change in the level of inflation would affect the forecasting performance of the
model only for one observation (Banerjee, Marcellino,
and Masten, 2003):

4) $\Delta\pi_t = C + \alpha \Delta\pi_{t-h} + \beta \Delta\pi_{t-h-1} + \varepsilon_t,$

4a) $\pi_{t+h \mid t} = \pi_t + \hat{C} + \hat{\alpha} \Delta\pi_t + \hat{\beta} \Delta\pi_{t-1},$

where $\Delta\pi_t = \pi_t - \pi_{t-4}$.
Second, in line with Labhard, Kapetanios, and
Price (2007), we take into account potential breaks in
the mean of inflation due to announcements of changes
in the inflation objective by the central banks.15 Hence,
we enrich the AR model by allowing for changes in
the intercept eight quarters after a change to the inflation targeting regime. In the case of Australia, for instance, the central bank announced its objective in
1993. We therefore include a one-step dummy taking
a zero value before 1995 (1993 plus eight quarters)
and one thereafter. We refer to this second set of
models as “AR models with breaks.” They are:

5) $\pi_t = C + \sum_i C_i \mathrm{Ind}_i + \alpha \pi_{t-h} + \beta \pi_{t-h-4} + \varepsilon_t,$

5a) $\pi_{t+h \mid t} = \hat{C} + \sum_i \hat{C}_i \mathrm{Ind}_i + \hat{\alpha} \pi_t + \hat{\beta} \pi_{t-4},$

where $\mathrm{Ind}_i$ is a dummy variable that takes a value 1
from eight quarters after the announced change in
the target.
We estimate the models from the first quarter of
1986 onward with year-on-year inflation rates.16 The
out-of-sample forecast evaluation is then carried out
in pseudo real time. For example, the models are estimated from the first quarter of 1986 through the fourth
quarter of 1994. Based on this estimation, we calculate
forecasts at horizons four quarters and eight quarters
ahead. Then we store the associated forecast errors
and the one of taking the inflation forecast equal to
the central bank’s quantified objective π*, defined as
follows:

$\pi_{1995\mathrm{Q}1 \mid 1994\mathrm{Q}1} - \pi_{1995\mathrm{Q}1}$ and $\pi_{1995\mathrm{Q}1} - \pi^{*}$,

$\pi_{1996\mathrm{Q}1 \mid 1994\mathrm{Q}1} - \pi_{1996\mathrm{Q}1}$ and $\pi_{1996\mathrm{Q}1} - \pi^{*}$.

The setup is brought forward sequentially by one
quarter until the end of the evaluation sample.
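
The recursive scheme can be sketched as follows for the AR model in levels (equations 3 and 3a). This is an illustrative sketch under stated assumptions, not the code used for the results in this article: it relies on NumPy's least squares routine, the function names are hypothetical, index 0 of the series stands for the first available quarter, and the series itself is synthetic.

```python
import numpy as np


def direct_ar_forecast(pi, t, h):
    """Forecast pi[t + h] with equation 3a, using data up to index t only."""
    pi = np.asarray(pi, dtype=float)
    # Estimation sample: all dates s <= t for which both lags exist.
    s = np.arange(h + 4, t + 1)
    X = np.column_stack([np.ones(len(s)), pi[s - h], pi[s - h - 4]])
    coef, *_ = np.linalg.lstsq(X, pi[s], rcond=None)  # OLS estimates
    c_hat, a_hat, b_hat = coef
    return c_hat + a_hat * pi[t] + b_hat * pi[t - 4]


def recursive_forecast_errors(pi, first_origin, h):
    """Re-estimate at each forecast origin and store forecast minus outcome."""
    errors = []
    for t in range(first_origin, len(pi) - h):
        errors.append(direct_ar_forecast(pi, t, h) - pi[t + h])
    return errors


# Synthetic series that follows equation 3 exactly with h = 4 and no noise,
# so the recursive forecasts should be essentially exact.
pi = [1.0, 5.0, 2.0, 6.0, 3.0, 7.0, 2.5, 4.5]
for s in range(8, 40):
    pi.append(1.0 + 0.5 * pi[s - 4] + 0.25 * pi[s - 8])
errors = recursive_forecast_errors(pi, first_origin=20, h=4)
```

Each pass through the loop uses only observations dated t or earlier, which is what makes the exercise "pseudo real time"; with actual data, the stored errors would then feed the accuracy statistics.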
Finally, we compare target forecasts to the
Consensus Forecasts (hereafter, referred to as the


“consensus”), which is the mean of the forecasts surveyed by Consensus Economics Inc. from F professional forecasters:

6) $\pi_{t+h \mid t} = \left( \sum_{f=1}^{F} \pi^{f}_{t+h \mid t} \right) / F.$


The consensus should represent informed forecasts produced on the basis of comprehensive information sets. Notably, respondents to the survey should
be aware of the central bank’s inflation objective. In
principle, differences between the views of economists
on future inflation and the central bank’s stated objective can indicate that such an objective lacks credibility.
However, inflation targets could be credible, albeit only
in the medium run. For shorter horizons, economists
may take into account a variety of factors that make
actual inflation deviate temporarily from the target.
Data on the professionals’ forecasts for future inflation (for the current and following years) have been available
since 1990 for Canada, Norway, Sweden, Switzerland,
and the UK and since 2002 for the euro area. For the euro area,
we compile pre-2002 data as averages of country-level
data (except Luxembourg), with fixed weights corresponding to the countries’ share in euro area consumption.17 This current and following year framework
differs from the rolling forecast horizon used to evaluate models 1–5. In order to compare the performance
of the consensus with the degree of accuracy that target forecasts would have yielded had they been formed
at the same time as the consensus surveys, we need to
pay attention to the calendar of inflation data releases
and the timetable of the consensus surveys. Publication delays of inflation data differ from one country
to another and, in some cases, have changed over the
period we study here. However, inflation data are typically published about one month after the end of the
reference period. Meanwhile, the consensus survey results for a month, M, correspond to answers collected
up to the middle of the previous month M – 1. We can
therefore make the following comparisons. Consensus
forecasts of inflation in the current year published in
February rely on inflation data up to December of the
previous year. Therefore, we need to forecast the whole
year. We then compare these forecasts with four-quarters-ahead target forecasts. Similarly, we compare
forecasts of inflation in the following year published
in February with eight-quarters-ahead target forecasts.
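
As a rough illustration of equation 6 and of the fixed-weight aggregation used for the euro area before 2002, consider the following Python sketch. The function names, the forecast values, the country labels, and the weights are all hypothetical; they are not taken from Consensus Forecasts or from consumption data.

```python
def consensus_mean(forecasts):
    """Equation 6: simple mean of the F individual forecasts."""
    return sum(forecasts) / len(forecasts)


def fixed_weight_average(country_values, weights):
    """Weighted average of country-level values with fixed weights."""
    total = sum(weights[c] for c in country_values)
    return sum(weights[c] * v for c, v in country_values.items()) / total


# Hypothetical forecasts (percent) and consumption weights, not actual data.
panel = [1.8, 2.0, 2.2]
country_consensus = {"A": 1.5, "B": 2.5}
weights = {"A": 0.75, "B": 0.25}
mean_forecast = consensus_mean(panel)
area_forecast = fixed_weight_average(country_consensus, weights)
```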
Results
Tables 3 and 4 show the mean absolute errors
(MAEs) and the root mean square errors (RMSEs)18
of the target forecast and the five alternative quarterly
models laid out in equations 1–5. Table 5 compares
similar statistics for Consensus Forecasts and the target forecasts at an annual frequency. These statistics
are computed for the forecast evaluation periods that
begin either in 1995 or eight quarters after the introduction of the inflation quantified objective. For most
countries, this is from 1995 through 2007—that is,
for 52 quarterly forecasts for tables 3 and 4 and for
13 annual observations for table 5. However, the
forecast evaluation starts only in 2001 for the euro
area and Switzerland and in 2003 for Norway. In the
case of the UK, the forecast evaluation is split in
2004 to reflect the change in the underlying price index.
For each row in tables 3–5, the numbers in bold
indicate the smallest forecast errors. In tables 3 and 4,
for each column we also compute the mean performance
of each model across countries as the mean distance
to the best performing model for each country.
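
The accuracy statistics and the cross-country summary can be sketched in Python as follows. The function names are hypothetical and the code is an illustration rather than the original computation; the input figures are the four-quarters-ahead mean absolute errors reported in table 3, listed for the target forecast and then for models 1 to 5.

```python
import math


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean square error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mean_distance_to_best(table):
    """For each model (column), average across countries (rows) of the gap
    between its error statistic and the smallest one in the same row."""
    n_models = len(next(iter(table.values())))
    gaps = [0.0] * n_models
    for row in table.values():
        best = min(row)
        for j, value in enumerate(row):
            gaps[j] += value - best
    return [g / len(table) for g in gaps]


# Four-quarters-ahead MAEs from table 3: Target, then models 1 to 5.
mae_4q = {
    "Euro area": [0.36, 0.43, 0.46, 0.46, 0.33, 0.47],
    "Australia": [1.11, 1.55, 1.10, 1.46, 1.59, 1.56],
    "Canada": [0.65, 1.05, 1.19, 0.86, 1.23, 0.75],
    "New Zealand": [0.99, 1.12, 1.19, 0.91, 1.77, 0.95],
    "Norway": [0.91, 1.43, 1.00, 1.06, 2.13, 1.04],
    "Sweden": [1.06, 0.97, 1.70, 1.48, 1.29, 1.21],
    "Switzerland": [0.39, 0.60, 0.42, 0.61, 0.54, 0.55],
    "UK CPI": [0.57, 0.45, 0.69, 0.38, 0.46, 0.46],
    "UK RPIX": [0.54, 0.33, 0.85, 0.85, 0.32, 0.56],
}
summary = [round(x, 2) for x in mean_distance_to_best(mae_4q)]
```

Applied to these figures, the summary is 0.07 for the target forecast and 0.22, 0.29, 0.23, 0.41, and 0.18 for models 1 to 5, which matches the "mean difference with best model" row for the four-quarters-ahead forecasts in table 3.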
Our results provide strong support for the inflation target forecasts as good devices for inflation forecasting. This is especially true at the eight quarters
horizon, where forecasting the target systematically
beats all other forecasting approaches (that is, has both
the smallest MAE and smallest RMSE) except in the
UK, where the best model for the Consumer Prices
Index (CPI) is the simple AR model in equation 3.
But one should take this particular result for the UK
with a grain of salt because our evaluation is conducted
only over 16 observations (from 2004 through 2007).
At the four quarters horizon, the performance of
forecasting the target remains very impressive. This
model is the best performing one in terms of either
mean absolute errors (table 3) or root mean square
errors (table 4) in Canada, Norway, and Switzerland.
In both tables 3 and 4, the performance of forecasting
the target is very close to the best model in most other
cases: less than 0.05 percentage points above the best
model in the euro area and Australia and less than
0.10 percentage points above the best model in New
Zealand and Sweden. In the UK, the target forecast
has an MAE and RMSE about 0.20 percentage points
above the best model for either the RPIX or the
CPI. However, even at a four quarters horizon, the
target forecast is the most robust approach in the
sense that it is, on average, the closest to the best performing model of each country.
The target forecasts yield significantly more accurate forecasts than any of the autoregressive models
and, hence, given the evidence reported in Stock and
Watson (2003) and Banerjee, Marcellino, and Masten
(2003), than most inflation forecast models (see note 7).


Table 3

Mean absolute errors at four quarters and eight quarters horizons

Four-quarters-ahead forecasts
                          Alternative models
                  Target  1       2       3       4       5
Euro area         0.36    0.43    0.46    0.46    0.33    0.47
Australia         1.11    1.55    1.10    1.46    1.59    1.56
Canada            0.65    1.05    1.19    0.86    1.23    0.75
New Zealand       0.99    1.12    1.19    0.91    1.77    0.95
Norway            0.91    1.43    1.00    1.06    2.13    1.04
Sweden            1.06    0.97    1.70    1.48    1.29    1.21
Switzerland       0.39    0.60    0.42    0.61    0.54    0.55
UK CPI            0.57    0.45    0.69    0.38    0.46    0.46
UK RPIX           0.54    0.33    0.85    0.85    0.32    0.56
Mean difference
with best model   0.07    0.22    0.29    0.23    0.41    0.18

Eight-quarters-ahead forecasts
                          Alternative models
                  Target  1       2       3       4       5
Euro area         0.36    0.51    0.53    0.64    0.52    0.65
Australia         1.11    1.84    1.15    2.04    2.53    2.42
Canada            0.65    0.97    1.57    1.35    1.05    1.34
New Zealand       0.99    1.31    1.55    1.11    1.31    1.16
Norway            0.91    1.20    1.09    1.01    1.06    1.18
Sweden            1.06    1.42    2.10    2.70    1.41    1.87
Switzerland       0.39    0.71    0.45    1.17    0.70    0.84
UK CPI            0.57    0.63    0.77    0.48    0.82    0.66
UK RPIX           0.54    0.41    1.19    2.37    0.45    1.70
Mean difference
with best model   0.02    0.30    0.45    0.73    0.39    0.61

Notes: In the United Kingdom, CPI is the Consumer Prices Index, and RPIX is the Retail Prices Index excluding mortgage interest payments. The forecast comparison is conducted in real time over the period
1995:Q1–2007:Q4 for Australia, Canada, New Zealand, and Sweden; over the period 2001:Q1–2007:Q4 for the euro area and Switzerland; over the period 2003:Q1–2007:Q4 for Norway; over the period
2004:Q1–2007:Q4 for UK CPI; and over the period 1995:Q1–2003:Q4 for UK RPIX. The numbers in bold indicate the best model.
Model 1 is the random walk, current year inflation; model 2 is the mean inflation over the last five years; model 3 is an autoregressive (AR) model in levels; model 4 is an AR model in first differences; and
model 5 is an AR model in levels with breaks in the mean of inflation. See equations 1–5 in the text for the exact specification of the forecast.

Table 4

Root mean square errors at four quarters and eight quarters horizons

Four-quarters-ahead forecasts
                          Alternative models
                  Target  1       2       3       4       5
Euro area         0.46    0.53    0.58    0.56    0.42    0.57
Australia         1.51    1.97    1.50    1.88    1.94    1.98
Canada            0.86    1.32    1.80    1.18    1.66    0.99
New Zealand       1.25    1.43    1.66    1.18    2.43    1.25
Norway            1.23    1.95    1.31    1.42    2.88    1.38
Sweden            1.28    1.22    2.19    2.00    1.55    1.59
Switzerland       0.45    0.72    0.49    0.73    0.64    0.66
UK CPI            0.70    0.53    0.82    0.45    0.54    0.53
UK RPIX           0.64    0.42    1.11    0.97    0.41    0.68
Mean difference
with best model   0.07    0.26    0.42    0.29    0.53    0.21

Eight-quarters-ahead forecasts
                          Alternative models
                  Target  1       2       3       4       5
Euro area         0.46    0.68    0.68    0.84    0.71    0.89
Australia         1.51    2.41    1.57    3.62    3.23    3.96
Canada            0.86    1.15    2.37    2.78    1.48    2.75
New Zealand       1.25    1.67    2.42    1.46    1.88    1.51
Norway            1.23    1.54    1.40    1.24    1.34    1.50
Sweden            1.28    1.61    2.74    3.78    1.72    2.99
Switzerland       0.45    0.83    0.56    1.44    0.81    1.11
UK CPI            0.70    0.73    0.92    0.57    0.95    0.77
UK RPIX           0.64    0.49    1.50    2.83    0.57    2.23
Mean difference
with best model   0.03    0.34    0.68    1.16    0.51    1.07


Notes: In the United Kingdom, CPI is the Consumer Prices Index, and RPIX is the Retail Prices Index excluding mortgage interest payments. The forecast comparison is conducted in real time over the period
1995:Q1–2007:Q4 for Australia, Canada, New Zealand, and Sweden; over the period 2001:Q1–2007:Q4 for the euro area and Switzerland; over the period 2003:Q1–2007:Q4 for Norway; over the period
2004:Q1–2007:Q4 for UK CPI; and over the period 1995:Q1–2003:Q4 for UK RPIX. The numbers in bold indicate the best model.
Model 1 is the random walk, current year inflation; model 2 is the mean inflation over the last five years; model 3 is an autoregressive (AR) model in levels; model 4 is an AR model in first differences; and
model 5 is an AR model in levels with breaks in the mean of inflation. See equations 1–5 in the text for the exact specification of the forecast.
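
The "Mean difference with best model" row of table 4 is not defined in the notes. One plausible reading, offered here only as a sketch, is that it averages, across the nine rows, the gap between each column's RMSE and the lowest RMSE in that row. The Python below applies that reading to the four-quarters-ahead RMSEs. The column order (target first, then models 1 to 5) and the function name are assumptions of this sketch, not something the notes state.

```python
# Four-quarters-ahead RMSEs from table 4.
# Assumed column order: Target, model 1, model 2, model 3, model 4, model 5.
COLUMNS = ["Target", "1", "2", "3", "4", "5"]
RMSE_4Q = {
    "Euro area":   [0.46, 0.53, 0.58, 0.56, 0.42, 0.57],
    "Australia":   [1.51, 1.97, 1.50, 1.88, 1.94, 1.98],
    "Canada":      [0.86, 1.32, 1.80, 1.18, 1.66, 0.99],
    "New Zealand": [1.25, 1.43, 1.66, 1.18, 2.43, 1.25],
    "Norway":      [1.23, 1.95, 1.31, 1.42, 2.88, 1.38],
    "Sweden":      [1.28, 1.22, 2.19, 2.00, 1.55, 1.59],
    "Switzerland": [0.45, 0.72, 0.49, 0.73, 0.64, 0.66],
    "UK CPI":      [0.70, 0.53, 0.82, 0.45, 0.54, 0.53],
    "UK RPIX":     [0.64, 0.42, 1.11, 0.97, 0.41, 0.68],
}


def mean_difference_with_best(table):
    """Average, over rows, of each column's gap to the row minimum.

    This definition is an assumption made for illustration.
    """
    n_cols = len(COLUMNS)
    totals = [0.0] * n_cols
    for row in table.values():
        best = min(row)  # lowest error in this row
        for j, value in enumerate(row):
            totals[j] += value - best
    return [total / len(table) for total in totals]


mean_diff = mean_difference_with_best(RMSE_4Q)
for name, value in zip(COLUMNS, mean_diff):
    print(f"{name:>6}: {value:.2f}")
```

Rounded to two decimals, this gives 0.07, 0.26, 0.42, 0.29, 0.53, and 0.21, which agrees with the row reported for the four-quarters-ahead panel. That agreement suggests, but does not prove, that the assumed definition is the one used.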

Table 5

Forecasting errors of target forecasts and Consensus Forecasts

Mean absolute errors
                            Consensus         Consensus
                            one-year-ahead    two-years-ahead
                Target      forecasts         forecasts
Euro area         0.27        0.29              0.41
Canada            0.29        0.41              0.38
Sweden            0.65        0.69              0.67
Switzerland       0.24        0.21              0.41
UK                0.41        0.34              0.44

Root mean square errors
                            Consensus         Consensus
                            one-year-ahead    two-years-ahead
                Target      forecasts         forecasts
Euro area         0.31        0.31              0.45
Canada            0.36        0.54              0.43
Sweden            0.88        0.95              0.88
Switzerland       0.27        0.33              0.48
UK                0.53        0.42              0.56

Notes: The forecast comparison is conducted in real time over the period 1995–2007 for Canada, Sweden, and the UK and over the period
2001–07 for the euro area and Switzerland. The consensus forecasts are the ones published in the February issue of Consensus Forecasts
of the current year for one-year-ahead forecasts and the past year for the two-years-ahead forecasts. The numbers in bold indicate the best model.

Table 6

Performance of selected constant forecast benchmarks and model-based forecasts
of U.S. core PCE inflation

                          Constant forecast
                          benchmarks          Alternative models
                          1.5%     2.0%       1       2       3       4       5
Mean absolute errors
Forecast horizon
  Four quarters           0.49     0.32     0.30    0.57    0.33    0.37    0.32
  Eight quarters          0.49     0.32     0.38    0.74    0.65    0.34    0.64

Root mean square errors
Forecast horizon
  Four quarters           0.40     0.38     0.36    0.70    0.40    0.47    0.37
  Eight quarters          0.40     0.38     0.45    0.92    0.87    0.42    0.86

Notes: The U.S. Federal Reserve System does not have an inflation target. Core PCE is the Personal Consumption Expenditures Price Index
excluding food and energy prices. The forecasting performance of the constant forecast benchmarks for U.S. core PCE inflation is purely illustrative.
The forecast comparison is conducted in real time over the period 1995:Q1–2007:Q4. The numbers in bold indicate the best model.
Model 1 is the random walk, current year inflation; model 2 is the mean inflation over the last five years; model 3 is an autoregressive (AR) model
in levels; model 4 is an AR model in first differences; and model 5 is an AR model in levels with breaks in the mean of inflation. See equations 1–5
in the text for the exact specification of the forecast.

Table 5 shows the MAEs and the RMSEs of target forecasts and the Consensus Forecasts, though
this time using yearly observations. For two-years-ahead inflation forecasts, using the central bank’s
target has yielded smaller forecasting errors than the
consensus forecasts in terms of either MAEs or
RMSEs for all countries under review. This is also
observed at one-year-ahead forecasts, except for the
UK according to both the MAE and RMSE criteria
and for Switzerland according to the MAE criterion.
One caveat applying to these results is that they
are based on relatively short samples because of the
availability of consensus forecasts for only the past
15 years and the even more recent switch to quantified inflation objectives by central banks. However, in
our view, the paths of the forecasts obtained from the
autoregressive models, the consensus, and the central
banks’ targets suggest that the central banks’ targets
may constitute a new benchmark for forecast evaluation.
Finally, table 6 reports MAEs and RMSEs of
the constant forecast benchmarks of 1.5 percent and
2 percent for U.S. core PCE inflation. Forecasting
constant inflation at 2 percent has been the best at the
eight quarters horizon and very close to the best at the
four quarters horizon. These results show that, although
the Federal Reserve does not have an inflation target,
core PCE inflation has become remarkably stable in
the U.S. since 1995.
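
In table 6, each constant benchmark shows the same error at the four quarters and eight quarters horizons, while the model-based forecasts do not. A likely reason is that a constant forecast does not depend on when it is made, so scoring it against the same outcomes gives the same errors at any horizon. The Python sketch below illustrates this with made-up numbers. The inflation series, the 2 percent benchmark, and the simplified random walk are all hypothetical stand-ins, not the data or the exact model specifications used in the article.

```python
import math


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean square error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical quarterly inflation rates in percent (illustrative only).
inflation = [2.1, 1.8, 2.4, 1.6, 2.0, 2.3, 1.9, 2.2, 1.7, 2.0, 2.5, 1.8]
benchmark = 2.0    # hypothetical constant forecast, in percent
first_outcome = 8  # score only outcomes reachable at both horizons


def constant(origin):
    """Constant forecast: ignores the quarter in which it is made."""
    return benchmark


def random_walk(origin):
    """Simplified random walk: the rate observed at the forecast origin."""
    return inflation[origin]


def errors_at(horizon, forecast):
    """Outcome minus the forecast made `horizon` quarters earlier."""
    return [inflation[t] - forecast(t - horizon)
            for t in range(first_outcome, len(inflation))]


for horizon in (4, 8):
    for name, forecast in (("constant", constant), ("random walk", random_walk)):
        errs = errors_at(horizon, forecast)
        print(f"h={horizon} {name:<11} MAE={mae(errs):.2f} RMSE={rmse(errs):.2f}")
```

The constant forecast prints identical MAE and RMSE for both horizons because its error list is the same in each case. The random walk's errors change with the horizon because its forecast moves with the origin.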
Taking a broader perspective, our results provide
concrete evidence of the success of preannounced
quantified objectives for inflation. One possible interpretation of this success is that economic agents have
indeed adopted the inflation target of the central bank
as their inflation expectation for the general price level.
The inflation target may have become the focal point
onto which decentralized inflation expectations have
converged. This would occur if the target of the central
bank is credible, that is, if the central bank is always
willing to take measures to ensure the target is reached
over the specified horizon.
Conclusion
We have shown that quantified inflation objectives can be used as rule-of-thumb forecasting devices.
The experience of various countries that have adopted
such objectives shows that, to a large extent, such a
rule of thumb yields smaller forecast errors than
widely used forecasting models and the forecasts
of professional experts published by Consensus
Economics Inc. While inflation is never exactly at
the target, the central banks’ targets have provided
ex ante reliable and, to a large extent, unbeatable
inflation forecasting devices in countries that have
adopted a quantified inflation objective. These findings suggest that the central banks that have set explicit targets for inflation have been successful in their
often stated goal of anchoring inflation expectations.

NOTES

1 This is according to the Federal Reserve Act; see
www.federalreserve.gov/generalinfo/fract/sect02a.htm.

2 See Roger and Stone (2005) for a detailed description of inflation targeting in OECD (Organization for Economic Cooperation
and Development) and emerging economies.

3 See the discussion in Castelnuovo, Nicoletti-Altimari, and
Rodríguez-Palenzuela (2003); Gürkaynak, Levin, and Swanson
(2006); Levin, Natalucci, and Piger (2004); and Svensson (1999).

4 A prominent example is Goodfriend (2007).

5 Consensus Forecasts, a monthly publication by Consensus
Economics Inc., reports the forecasts of inflation by various
investment banks and public and private organizations that have
their own inflation forecasts. For further details, see
www.consensuseconomics.com.

6 See, for instance, Rudd and Whelan (2006) and Sargent (1993).

7 Stock and Watson (2003); Banerjee, Marcellino, and Masten
(2003); and Banerjee and Marcellino (2003) show that multivariate
models of inflation (that is, models where inflation dynamics are
influenced by the evolution of other economic variables, such as output
and unemployment) hardly ever improve the forecast of inflation
with respect to univariate nonstructural models of inflation. See
also Fisher, Liu, and Zhou (2002) and Brave and Fisher (2004).

8 The recent discussion of rational inattention (Sims, 2003; Mankiw
and Reis, 2002; and Maćkowiak and Wiederholt, 2005) models explicitly how the cost of information processing could cause agents
to restrict the information on which they base economic decisions.

9 Again, see Roger and Stone (2005) for a detailed description of
inflation targeting in OECD and emerging economies.

10 In December 2003, the UK’s Chancellor of the Exchequer announced that the Bank of England would change its inflation target
from one based on the Retail Prices Index excluding mortgage interest payments (RPIX) to one based on the Consumer Prices Index
(CPI), also known there as the Harmonized Index of Consumer
Prices (HICP).

11 See Issing (2003).

12 Goodfriend (2007).

13 In a previous version of this article, we showed that the target
forecast does not perform well at a one-quarter horizon, a result
that is not surprising given that all central banks with an inflation
target insist that inflation can be brought back to the target only
over the medium run. In other words, it is widely agreed that monetary policy should not aim at cancelling the high frequency volatility of inflation.

14 Other lag structures did not improve the forecasting results, so we
use the simplest possible lag structure here.

15 An obvious weakness of this model is that it assumes that the
econometrician himself is convinced that the central bank announcement of a new target will immediately have an effect on the
inflation process.

16 Inflation time series were taken from Haver Analytics.

17 Since respondents to Consensus Forecasts vary from country to
country, these euro area constructs are not, strictly speaking, forecasts for the euro area economy. However, unless respondents of a
particular country have systematic biases in their inflation forecast,
the average inflation forecast across countries should be close to a
forecast by an “average” forecaster for the average of the countries,
that is, for the euro area as a whole.

18 These two statistics are the most frequently used statistics to evaluate our sample forecasting performance.


REFERENCES

Banerjee, A., and M. Marcellino, 2003, “Are there
any reliable leading indicators for U.S. inflation and
GDP growth?,” Innocenzo Gasparini Institute for
Economic Research, working paper, No. 236, April.
Banerjee, A., M. Marcellino, and I. Masten, 2003,
“Leading indicators of euro area inflation and GDP
growth,” Center for Economic Policy Research,
discussion paper, No. 3893, May.
Brave, Scott, and Jonas D. M. Fisher, 2004, “In
search of a robust inflation forecast,” Economic
Perspectives, Federal Reserve Bank of Chicago,
Vol. 28, No. 4, Fourth Quarter, pp. 12–31.
Castelnuovo, E., S. Nicoletti-Altimari, and D.
Rodríguez-Palenzuela, 2003, “Definition of price
stability, range, and point inflation targets: The anchoring of long-term inflation expectations,” in
Background Studies for the ECB’s Evaluation of its
Monetary Policy Strategy, O. Issing (ed.), Frankfurt,
Germany: European Central Bank, pp. 43–90.
Dhyne, E., L. Álvarez, H. Le Bihan, G. Veronese,
D. Dias, J. Hoffmann, N. Jonker, P. Lünnemann,
F. Rumler, and J. Vilmunen, 2005, “Price setting
in the euro area: Some stylized facts from individual
consumer price data,” European Central Bank,
Eurosystem Inflation Persistence Network,
working paper, No. 524, September.
Estrella, A., and J. Fuhrer, 1999, “Are ‘deep’
parameters stable? The Lucas critique as an
empirical hypothesis,” Federal Reserve Bank of
Boston, working paper, No. 99-4, September.
Fabiani, S., M. Druant, I. Hernando, C. Kwapil,
B. Landau, C. Loupias, F. Martins, T. Mathä,
R. Sabbatini, H. Stahl, and A. Stockman, 2005,
“The pricing behavior of firms in the euro area:
New survey evidence,” European Central Bank,
Eurosystem Inflation Persistence Network, working
paper, No. 535, October.
Fisher, Jonas D. M., Chin Te Liu, and Ruilin Zhou,
2002, “When can we forecast inflation?,” Economic
Perspectives, Federal Reserve Bank of Chicago,
Vol. 26, No. 1, First Quarter, pp. 30–42.


Goodfriend, M., 2007, “How the world achieved
consensus on monetary policy,” Journal of Economic
Perspectives, Vol. 21, No. 4, Fall, pp. 47–68.
Gürkaynak, R., A. Levin, and E. Swanson, 2006,
“Does inflation targeting anchor long-run inflation
expectations? Evidence from long-term bond yields
in the U.S., UK, and Sweden,” Bilkent University,
Ankara, Turkey, working paper, March 1, available
at www.bilkent.edu.tr/~refet/Gurkaynak_Levin_
Swanson_2006mar01.pdf.
Issing, O. (ed.), 2003, Background Studies for the
ECB’s Evaluation of its Monetary Policy Strategy,
Frankfurt, Germany: European Central Bank.
Labhard, V., G. Kapetanios, and S. Price, 2007,
“Forecast combination and the Bank of England’s
suite of statistical forecasting models,” Bank of
England, working paper, No. 323, May.
Levin, A. T., F. M. Natalucci, and J. M. Piger, 2004,
“Explicit inflation objectives and macroeconomic outcomes,” European Central Bank, Eurosystem Inflation
Persistence Network, working paper, No. 383, August.
Lindé, J., 2001, “The empirical relevance of simple
forward- and backward-looking models: A view from
a dynamic general equilibrium model,” Sveriges
Riksbank, working paper, No. 130, December.
Maćkowiak, B., and M. Wiederholt, 2005, “Optimal
sticky prices under rational inattention,” Humboldt
University, discussion paper, No. 2005-040, August 4,
available at http://sfb649.wiwi.hu-berlin.de/papers/
pdf/SFB649DP2005-040.pdf.
Mankiw, G., and R. Reis, 2002, “Sticky information
versus sticky prices: A proposal to replace the new
Keynesian Phillips curve,” Quarterly Journal of
Economics, Vol. 117, No. 4, November, pp. 1295–1328.
Roger, S., and M. Stone, 2005, “On target? The international experience with achieving inflation targets,” International Monetary Fund, working paper,
No. WP/05/163, August.


Rudd, J., and K. Whelan, 2006, “Can rational
expectations sticky-price models explain inflation
dynamics?,” American Economic Review, Vol. 96,
No. 1, March, pp. 303–320.
Sargent, T., 1993, Bounded Rationality in
Macroeconomics, Oxford: Clarendon Press.
Sims, C., 2003, “Implications of rational inattention,”
Journal of Monetary Economics, Vol. 50, No. 3,
April, pp. 665–690.
Stock, J., and M. Watson, 2003, “Forecasting output
and inflation: The role of asset prices,” Journal of
Economic Literature, Vol. 41, No. 3, September,
pp. 788–829.
Svensson, L., 1999, “Inflation targeting as a monetary policy rule,” Journal of Monetary Economics,
Vol. 43, No. 3, June, pp. 607–654.
Vermeulen, P., D. Dias, M. Dossche, E. Gautier,
I. Hernando, R. Sabbatini, P. Sevestre, and
H. Stahl, 2007, “Price setting in the euro area:
Some stylized facts from individual producer price
data,” European Central Bank, Eurosystem Inflation
Persistence Network, working paper, No. 727,
February.
