Federal Reserve Bank of St. Louis

REGIONAL ECONOMIC
DEVELOPMENT
VOLUME 4, NUMBER 1

2008

Selected Papers from
Federal Reserve Bank of St. Louis Economists
Local Price Variation and Labor Supply Behavior
Dan A. Black, Natalia A. Kolesnikova,
and Lowell J. Taylor
Regional Aggregation in Forecasting:
An Application to the
Federal Reserve’s Eighth District
Kristie M. Engemann, Rubén Hernández-Murillo,
and Michael T. Owyang
The Economic Impact of a
Smoking Ban in Columbia, Missouri:
An Analysis of Sales Tax Data for the First Year
Michael R. Pakko
Urban Decentralization and Income Inequality:
Is Sprawl Associated with Rising
Income Segregation Across Neighborhoods?
Christopher H. Wheeler

Editor’s Introduction
Thomas A. Garrett

Local Price Variation and Labor Supply Behavior
Dan A. Black, Natalia A. Kolesnikova, and Lowell J. Taylor

Regional Aggregation in Forecasting: An Application to the Federal Reserve’s Eighth District
Kristie M. Engemann, Rubén Hernández-Murillo, and Michael T. Owyang

The Economic Impact of a Smoking Ban in Columbia, Missouri: An Analysis of Sales Tax Data for the First Year
Michael R. Pakko

Urban Decentralization and Income Inequality: Is Sprawl Associated with Rising Income Segregation Across Neighborhoods?
Christopher H. Wheeler

Director of Research
Robert H. Rasche

Deputy Director of Research
Cletus C. Coughlin

Editor
Thomas A. Garrett

Center for Regional Economics–8th District (CRE8)
Director
Howard J. Wall
Subhayu Bandyopadhyay
Cletus C. Coughlin
Thomas A. Garrett
Rubén Hernández-Murillo
Natalia A. Kolesnikova
Michael R. Pakko
Christopher H. Wheeler

Managing Editor
George E. Fortier

Editors
Judith A. Ahlers
Lydia H. Johnson

Graphic Designer
Donna M. Stiller

The views expressed are those of the individual authors and do not necessarily reflect official positions of the Federal Reserve Bank of St. Louis, the Federal Reserve System, or the Board of Governors.

FEDERAL RESERVE BANK OF ST. LOUIS REGIONAL ECONOMIC DEVELOPMENT
VOLUME 4, NUMBER 1
2008

Regional Economic Development is published occasionally by the Research Division of
the Federal Reserve Bank of St. Louis and may be accessed through our web site:
research.stlouisfed.org/regecon/publications/. All nonproprietary and nonconfidential
data and programs for the articles written by Federal Reserve Bank of St. Louis staff and
published in Regional Economic Development also are available to our readers on this
web site.
General data can be obtained through FRED (Federal Reserve Economic Data), a database
providing U.S. economic and financial data and regional data for the Eighth Federal
Reserve District. You may access FRED through our web site: research.stlouisfed.org/fred.
Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included.
Please send a copy of any reprinted, published, or displayed materials to George Fortier,
Research Division, Federal Reserve Bank of St. Louis, P.O. Box 442, St. Louis, MO 63166-0442; george.e.fortier@stls.frb.org. Please note: Abstracts, synopses, and other derivative
works may be made only with prior written permission of the Federal Reserve Bank of St.
Louis. Please contact the Research Division at the above address to request permission.
© 2008, Federal Reserve Bank of St. Louis.
ISSN 1930-1979


Contributing Authors
Dan A. Black
University of Chicago and
National Opinion Research Center
danblack@uchicago.edu
Kristie M. Engemann
Federal Reserve Bank of St. Louis
kristie.m.engemann@stls.frb.org
Thomas A. Garrett
Federal Reserve Bank of St. Louis
tom.a.garrett@stls.frb.org
Rubén Hernández-Murillo
Federal Reserve Bank of St. Louis
ruben.hernandez@stls.frb.org

Michael T. Owyang
Federal Reserve Bank of St. Louis
michael.t.owyang@stls.frb.org
Michael R. Pakko
Federal Reserve Bank of St. Louis
michael.r.pakko@stls.frb.org
Lowell J. Taylor
Carnegie Mellon University
lt20@andrew.cmu.edu
Christopher H. Wheeler
Federal Reserve Bank of St. Louis (formerly)

Natalia A. Kolesnikova
Federal Reserve Bank of St. Louis
natalia.a.kolesnikova@stls.frb.org


Editor’s Introduction
Thomas A. Garrett

The Center for Regional Economics–8th District (CRE8) at the Federal Reserve Bank of St. Louis sponsored the fourth annual meeting of the Business and Economics Research Group (BERG). This year’s meeting was part of the eighth annual Missouri Economics Conference held in Columbia, Missouri, in March 2008 and sponsored by the Federal Reserve Bank of St. Louis and the Department of Economics at the University of Missouri–Columbia.1
This issue of Regional Economic Development contains four research papers by St. Louis Fed economists, several of which were presented at the recent BERG meeting. Dan Black, Natalia Kolesnikova, and Lowell Taylor present empirical and theoretical evidence that labor supply decisions are not just a function of wages (as often assumed in empirical and theoretical models of labor supply) but also depend on the prices of other goods. Kristie Engemann, Rubén Hernández-Murillo, and Michael Owyang compare the predictive power of various forecasting models of employment that use different levels of data aggregation. Michael Pakko explores the economic impact of a smoking ban in Columbia, Missouri, using a time series of sales tax data for eating and drinking establishments. Finally, Christopher Wheeler examines whether urban sprawl resulted in rising income segregation in 359 U.S. metropolitan areas over a 20-year period.

1 The agenda for the eighth annual Missouri Economics Conference, which includes sessions sponsored by BERG, can be found at http://research.stlouisfed.org/conferences/moconf/8th_annual_agenda.pdf.

Thomas A. Garrett is an assistant vice president and economist at the Federal Reserve Bank of St. Louis.
Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), p. 1.
© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


Local Price Variation and Labor Supply Behavior
Dan A. Black, Natalia A. Kolesnikova, and Lowell J. Taylor

In standard economic theory, labor supply decisions depend on the complete set of prices: wages
and the prices of relevant consumption goods. Nonetheless, most theoretical and empirical work
in labor supply studies ignores prices other than wages. We address the question of whether the
common practice of ignoring local price variation in labor supply studies is as innocuous as generally assumed. We describe a simple model to demonstrate that the effects of wage and nonlabor
income on labor supply typically differ by location. In particular, we show that the derivative of
the labor supply with respect to nonlabor income is independent of price only when the labor
supply takes a form based on an implausible separability condition. Empirical evidence demonstrates that the effect of price on labor supply is not a simple “up-or-down shift” that would be
required to meet the separability condition in our key proposition. (JEL J01, J21, R23)
Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), pp. 2-14.

In standard economic theory, labor supply
decisions depend on the complete set of
prices: the wages and the prices of relevant
consumption goods. Nonetheless, as Abbott
and Ashenfelter (1976) noted some 30 years ago,
economists generally have found it a useful
abstraction, in both theoretical and empirical
work, to ignore prices other than wages in labor
supply studies. For example, none of the empirical
results on labor supply discussed in the prominent reviews of Pencavel (1986), Killingsworth
and Heckman (1986), or Blundell and MaCurdy
(1999) are derived by procedures that account for
variation in any price other than wages.1
However, most empirical work on labor does
use national datasets of individuals who live in
different locations and therefore face different
prices for locally priced goods. These price differences can be quite large, especially for housing. For example, according to 1990 Census data, the median housing price in New York is more than three times the median housing price in Cleveland.2 The question addressed in this paper is whether the common practice of ignoring local price variation in labor supply studies is as innocuous as has generally been assumed.
1 Abbott and Ashenfelter’s (1976) evaluation of labor supply in the United States for the 1929-67 period exploits time-series changes in relative prices but does not evaluate possible impacts of cross-sectional variation (which, as they state, is “expected to be small”). Some work conducts sensitivity analysis using Bureau of Labor Statistics information on the cost of living to “adjust” wages. See, for instance, DaVanzo, DeTray, and Greenberg (1973) and Masters and Garfinkel (1977).

2 Gabriel and Rosenthal (2004) and Chen and Rosenthal (forthcoming) show that massive housing price differences pertain across cities even after careful adjustment for quality.

Dan A. Black is a professor in the Harris School, University of Chicago, and a senior fellow at the National Opinion Research Center; Natalia A.
Kolesnikova is an economist at the Federal Reserve Bank of St. Louis; and Lowell J. Taylor is a professor of economics and public policy at the
Heinz School, Carnegie Mellon University.


Black, Kolesnikova, Taylor

To examine the issue, we first present a simple
theoretical model: an economy in which people
live in different locations with differing levels of a
production or consumption amenity. Following
logic familiar in urban economics (e.g., Roback,
1982), equilibrium prices will differ across locations. We demonstrate that labor supply behavior
also can vary across locations.
Next, we demonstrate that, when prices vary
across locations, local variation in prices can be
safely ignored only when preferences take a very
specific and peculiar form. We also show that the
responsiveness of labor supply to wage changes will
be the same across locations only if the responsiveness of labor supply to nonlabor income changes
is the same across locations.
In our third step we evaluate the potential
empirical importance of our theoretical observations. We present results obtained by using the Public Use Microdata Samples (PUMS) of the 1990 U.S. Census that examine labor supply in the
nation’s 50 largest cities. We focus on the labor
force participation and hours decisions of white
married women aged 30 to 50—a group whose
labor decisions are quite responsive to changes in
wages and nonlabor income.
In general, we analyze the basic “building
block” empirical relationship that would underlie
any empirical analysis of labor supply for this
group: the relationship between nonlabor income
and labor supply. Our innovation is examining
this relationship for each of the 50 cities separately
and demonstrating the significant systematic variation that exists among them.
We find that the basic correlation—between
labor supply and nonlabor income—differs across
cities. For example, women who have relatively
high nonlabor income (primarily a husband’s
income) work relatively fewer hours and have
lower participation rates. An important observation, from our perspective, is that this anticipated
negative relationship is substantially more pronounced in cities with inexpensive housing than
in cities with expensive housing.

A MODEL OF LOCAL LABOR
MARKETS WITH STONE-GEARY
PREFERENCES
We begin our study by presenting a simple
model of local price variation along the lines of
Roback (1982) and Haurin (1980). Locations differ
based on two criteria: (i) A location may be inherently more pleasant (i.e., have a higher level of a
“consumption amenity,” such as nice weather),
or (ii) a location may be associated with inherently
higher productivity (e.g., owing to the presence of
a natural resource or an agglomeration of economies
in production). For simplicity we restrict attention
to cases in which people choose to live in one of
two cities.
In contrast to the standard urban location
models such as those of Roback (1982) or Haurin
(1980), which fix labor supply as a constant, we
allow labor supply to be a choice variable. Preferences are assumed to be Stone-Geary. This is a
particularly transparent form of utility, and as
Ashenfelter and Ham (1979) note, it is the simplest
functional form of utility used in applied empirical
work examining labor supply.3 We assume, in particular, that individual i has utility $u_i$ as a function of a consumption good x, leisure l (which is scaled so that $0 \le l \le 1$), and an amenity level $A_j$ (that is specific to location j), according to a simple Stone-Geary form as follows:

$$ u_i = \theta_{ij} A_j (x - c)^{\delta} l^{1-\delta}, \tag{1} $$


where c and δ are parameters that are common across individuals and $\theta_{ij}$ is a positive idiosyncratic parameter that equals 1 for a typical individual, but allows for the possibility that person i has a particular attraction, or distaste, for location j (as $\theta_{ij}$ is greater than, or less than, 1).
A person living in location j maximizes utility subject to a budget constraint, $p_j x = w_j(1 - l) + N$, where $p_j$ is the price for the local consumption good, $w_j$ is the local wage, and N is nonlabor income. Assuming an interior solution pertains,
demand for leisure and for the consumption good are, respectively,

$$ l(w_j, p_j) = \frac{(1-\delta)(N + w_j - c p_j)}{w_j} \tag{2} $$

and

$$ x(w_j, p_j) = \frac{\delta (N + w_j - c p_j)}{p_j} + c. \tag{3} $$

Substituting equations (2) and (3) into equation (1) gives indirect utility for person i in location j:

$$ V_{ij} = \frac{\theta_{ij} A_j \delta^{\delta} (1-\delta)^{1-\delta} (N + w_j - c p_j)}{p_j^{\delta} w_j^{1-\delta}}. \tag{4} $$

In equilibrium each individual chooses to live in the location that yields the highest level of utility. There are two locations: j = 1 or 2. We present two cases: one with differing consumption amenities and one with differing levels of productivity in the locations.

Case 1: Differing Levels of the Consumption Amenity

Suppose there is general agreement that Location 1 is nicer than Location 2, $A_1 > A_2$, and for the moment assume further that there are no idiosyncratic differences in opinion about location, so that $\theta_{ij} = 1$ for all individuals. Because workers are equally productive in the two locations, wages $w_1$ and $w_2$ must be the same, say w.4 In an equilibrium in which people live in both locations, we must have $V_{i1} = V_{i2}$, so using equation (4), it is clear that $p_1$ and $p_2$ must solve

$$ \frac{A_1 (N + w - c p_1)}{p_1^{\delta} w^{1-\delta}} = \frac{A_2 (N + w - c p_2)}{p_2^{\delta} w^{1-\delta}}. \tag{5} $$

Inspection of equation (5) confirms the intuitive result that $p_1 > p_2$: The local consumption good is more expensive in Location 1, the high-amenity city.
This logic continues to hold if we add back the idiosyncratic taste component to utility. If for the marginal individual $\theta_{i1} = \theta_{i2} = 1$, equation (5) still characterizes equilibrium prices. In this instance, however, some individuals will have a strict preference with regard to location. For example, an individual with $\theta_{i1} > \theta_{i2}$ will have a strict preference for Location 1 over Location 2.
We turn next to labor supply. Let h be the fraction of time that a person works, h = 1 – l. From equation (2), we have

$$ h(w, p_j) = \frac{\delta w - (1-\delta)(N - c p_j)}{w}. \tag{6} $$

Although wages are the same in both locations, the labor supply differs. In this example, $h(w,p_1) > h(w,p_2)$; individuals supply more labor when they work in the more expensive city.
Suppose instead the focus is on the effect of a wage change in a local labor market (studying people who would not move in response to a small change in the wage)5:

$$ \frac{\partial h(w, p_j)}{\partial w} = \frac{(1-\delta)(N - c p_j)}{w^2}. \tag{7} $$

Notice that in this example, the responsiveness of the labor supply to a wage change is greater in the inexpensive city than in the expensive city,

$$ \frac{\partial h(w, p_2)}{\partial w} > \frac{\partial h(w, p_1)}{\partial w}. $$

In contrast, if we focus on how a change in nonlabor income affects labor supply,

$$ \frac{\partial h(w, p_j)}{\partial N} = -\frac{1-\delta}{w}, \tag{8} $$

we find that the relationship is independent of the local price; that is, it can be written as $\partial h(w)/\partial N$.

3 See also Blundell and MaCurdy (1999) for a discussion of the Stone-Geary form, as well as other forms used in applied work on labor supply.

4 For simplicity, we are implicitly assuming that labor is the only factor of production, so that firms will be indifferent in hiring if the wage is the same in the two cities. This would not be true, for example, if land were a major factor of production and land prices differed in the two cities.

5 In general, if the wage increases in a labor market, this factor can attract new individuals to that location. Here, we are interested in the effect on the labor supply of individuals who are already in the market, for example, people who have an idiosyncratic taste for that location.
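The Case 1 comparisons can be checked numerically. What follows is a minimal Python sketch, not the authors' code: the parameter values (DELTA, C, N, W, A1, A2, P2) and all function names are illustrative assumptions, chosen so that an interior solution holds. It solves equation (5) for the high-amenity price by bisection and then evaluates equations (6) through (8) in both cities.

```python
"""Numerical sketch of Case 1 (equal wages, different amenities)."""

# Illustrative parameter values (assumptions, not taken from the paper).
DELTA = 0.6          # Stone-Geary exponent on consumption
C = 0.2              # subsistence level of the consumption good
N = 0.3              # nonlabor income
W = 1.0              # common wage in both cities
A1, A2 = 1.2, 1.0    # amenity levels, with A1 > A2
P2 = 1.0             # price in the low-amenity city, taken as given


def hours(w, p, n=N):
    """Equation (6): labor supply h = 1 - l, with l from equation (2)."""
    return (DELTA * w - (1 - DELTA) * (n - C * p)) / w


def indirect_utility(a, w, p, n=N):
    """Equation (4) with theta = 1."""
    scale = DELTA**DELTA * (1 - DELTA) ** (1 - DELTA)
    return a * scale * (n + w - C * p) / (p**DELTA * w ** (1 - DELTA))


def solve_p1(lo=P2, hi=(N + W) / C, tol=1e-12):
    """Bisection on equation (5): find p1 such that V(A1, p1) = V(A2, P2)."""
    target = indirect_utility(A2, W, P2)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Utility falls as p rises, so raise the lower bound while V is too high.
        if indirect_utility(A1, W, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def d_hours(w, p, n=N, wrt="w", eps=1e-6):
    """Central finite difference of hours with respect to the wage or N."""
    if wrt == "w":
        return (hours(w + eps, p, n) - hours(w - eps, p, n)) / (2 * eps)
    return (hours(w, p, n + eps) - hours(w, p, n - eps)) / (2 * eps)


if __name__ == "__main__":
    p1 = solve_p1()
    print("equilibrium prices:", p1, P2)
    print("hours:", hours(W, p1), hours(W, P2))
    print("dh/dw:", d_hours(W, p1, wrt="w"), d_hours(W, P2, wrt="w"))
    print("dh/dN:", d_hours(W, p1, wrt="N"), d_hours(W, P2, wrt="N"))
```

Under these assumed values the printed hours and wage responses differ between the two cities while the nonlabor-income responses coincide, which is the pattern described for Case 1. Other parameter choices may violate the interior-solution assumption and should be checked before use.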


Case 2: Differing Levels of Productivity
Now suppose that Locations 1 and 2 are viewed
as equally pleasant, A1 = A2, but productivity is
higher in Location 1 than in Location 2, so that
w1 > w2. The equilibrium condition corresponding
to equation (5)—that the marginal individual is
indifferent between locations (i.e., Vi1 = Vi2)—is then

$$ \frac{N + w_1 - c p_1}{p_1^{\delta} w_1^{1-\delta}} = \frac{N + w_2 - c p_2}{p_2^{\delta} w_2^{1-\delta}}. \tag{9} $$

As for labor supply, in city j,

$$ h(w_j, p_j) = \frac{\delta w_j - (1-\delta)(N - c p_j)}{w_j}. \tag{10} $$

In general, labor supply differs in the two locations, but even with $p_1 > p_2$ and $w_1 > w_2$ the location that will have the larger labor supply cannot be predicted. Similarly, in general

$$ \frac{\partial h(w_1, p_1)}{\partial w} \neq \frac{\partial h(w_2, p_2)}{\partial w}, $$

and we cannot determine in which city the labor supply is more responsive to wage changes. On the other hand, in this example the derivative of labor supply with respect to nonlabor income,

$$ \frac{\partial h(w_j, p_j)}{\partial N} = -\frac{1-\delta}{w_j}, \tag{11} $$

turns out to be independent of $p_j$. It does depend on the local wage, however: because in equilibrium the high-productivity city has relatively higher wages, we expect to observe that $\partial h/\partial N$ will be smaller (in absolute value) in the expensive city.
Our examples illustrate two important points.
First, cross-sectional variation in wages and prices
may be associated with variation in labor supply,
although that cross-sectional variation is of no
value for understanding the behavioral effect of
wage changes on labor supply. For instance, in our Case 2, even if in both cities $\partial h(w_j, p_j)/\partial w > 0$, identical individuals may well supply less labor in the high-wage city than in the low-wage city, depending on the local price-wage combination.
Second, the responsiveness of labor supply to
changes in the wage or nonlabor income typically
varies across locations.
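Case 2 can be illustrated in the same way. The sketch below is a hedged example under assumed parameter values (DELTA, C, N, W1, W2, P2 are not from the paper, and the function names are hypothetical): it solves equation (9) for the price in the high-wage city and compares the nonlabor-income response in equation (11) across the two cities.

```python
"""Numerical sketch of Case 2 (equal amenities, different wages)."""

# Illustrative parameter values (assumptions, not taken from the paper).
DELTA, C, N = 0.6, 0.2, 0.3
W1, W2 = 1.3, 1.0    # productivity, and hence the wage, is higher in city 1
P2 = 1.0             # price in the low-wage city, taken as given


def hours(w, p, n=N):
    """Equation (10): labor supply in a city with wage w and price p."""
    return (DELTA * w - (1 - DELTA) * (n - C * p)) / w


def utility_index(w, p, n=N):
    """Either side of equation (9)."""
    return (n + w - C * p) / (p**DELTA * w ** (1 - DELTA))


def solve_p1(tol=1e-12):
    """Bisection on equation (9) for the price p1 that equalizes utility."""
    target = utility_index(W2, P2)
    # The index exceeds the target at p = P2 and equals zero at the upper bound.
    lo, hi = P2, (N + W1) / C
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if utility_index(W1, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def dh_dN(w):
    """Equation (11): depends on the local wage but not on the local price."""
    return -(1 - DELTA) / w


if __name__ == "__main__":
    p1 = solve_p1()
    print("prices:", p1, P2)
    print("hours:", hours(W1, p1), hours(W2, P2))
    print("dh/dN:", dh_dN(W1), dh_dN(W2))
```

The ranking of hours across the two cities printed here is specific to the assumed values; as the text notes, it cannot be signed in general. The smaller absolute value of the nonlabor-income response in the high-wage city follows directly from equation (11).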

WHEN DOES PRICE VARIATION
MATTER FOR LOCAL LABOR
SUPPLY?
As noted previously, housing prices vary
widely across U.S. cities, presumably because of
differences in consumption or production amenities across these locations. The examples in the
previous section indicate that labor supply varies
across locations even in the unusually simple and
transparent case of Stone-Geary preferences. We
now turn to a more systematic investigation of
conditions on preferences under which price and
income effects on labor supply do not depend on
location. As is common in the literature, attention
is restricted to the case of quasi-homothetic preferences (of which Stone-Geary is a special case).6
Given this common simplification, what further
restrictions are necessary to allow investigators to
ignore variation across locations when examining
labor supply?7
Under quasi-homothetic preferences, indirect utility takes the form

$$ V(p,w,N) = \alpha(p,w) + (N + w)\beta(p,w), \tag{12} $$

where, as before, p is the local price, w is the local wage, and N is the nonlabor income. Using Roy’s identity we derive the demand for leisure

$$ l(p,w,N) - 1 = -\frac{\partial V/\partial w}{\partial V/\partial N} = -\frac{\alpha_w(p,w) + \beta(p,w) + (N+w)\beta_w(p,w)}{\beta(p,w)} = -\frac{\alpha_w(p,w) + (N+w)\beta_w(p,w)}{\beta(p,w)} - 1, \tag{13} $$

$$ l(p,w,N) = -\frac{\alpha_w(p,w) + (N+w)\beta_w(p,w)}{\beta(p,w)}. $$

It then follows that hours of labor supply are

$$ h(p,w,N) = 1 - l(p,w,N) = 1 + \frac{\alpha_w(p,w) + (N+w)\beta_w(p,w)}{\beta(p,w)} = a(p,w) + (N+w)\,b(p,w), \tag{14} $$

where $a(p,w) = 1 + \alpha_w/\beta$ and $b(p,w) = \beta_w/\beta$.

Consider the effect of the change in nonlabor income on the labor supply,

$$ \frac{\partial h}{\partial N} = b(p,w) = \frac{\beta_w(p,w)}{\beta(p,w)}. $$

Obviously, $\partial h/\partial N$ is independent of p (and thus is the same across locations) if and only if $b(p,w) \equiv b(w)$. The following claim provides the condition under which this holds:

Claim

$$ \frac{\beta_w(p,w)}{\beta(p,w)} = b(w) \iff \beta(p,w) = \beta_1(p)\,\beta_2(w). $$

Proof. The proof of sufficiency is trivial. To prove necessity, we have

$$ \frac{\beta_w(p,w)}{\beta(p,w)} = b(w), $$
$$ \frac{\partial}{\partial w} \ln \beta(p,w) = b(w), $$
$$ \ln \beta(p,w) = \int b(w)\,dw + c(p), $$
$$ \beta(p,w) = e^{\int b(w)\,dw + c(p)} = \beta_1(p)\,\beta_2(w), $$

where $\beta_1(p) = e^{c(p)}$ and $\beta_2(w) = e^{\int b(w)\,dw}$.

The above observations can be summarized as follows:

Proposition 1 When preferences are quasi-homothetic, $\partial h/\partial N$ is independent of location if and only if preferences satisfy a separability condition $\beta(p,w) = \beta_1(p)\,\beta_2(w)$.

Next consider the response of labor supply to wage changes,

$$ \frac{\partial h}{\partial w} = a_w(p,w) + b(p,w) + (N+w)\,b_w(p,w). $$

Again, the goal is to derive conditions under which $\partial h/\partial w$ does not depend on local prices, p. If $b(p,w) = b(w)$, as above, then the only other necessary condition is that $a_w(p,w)$ be independent of p. Now $a_w(p,w)$ is independent of p if and only if it is equal to some function of w only: $a_w(p,w) = f(w)$. Integrating both sides with respect to w, we get $a(p,w) = F(w) + c(p)$. Then the supply of hours of work takes an additively separable form, $h(p,w,N) = c(p) + F(w) + (N+w)\,b(w)$.
We have established, therefore,

Proposition 2 When preferences are quasi-homothetic, $\partial h/\partial N$ and $\partial h/\partial w$ are independent of location if and only if labor supply has the additively separable form

$$ h(p,w,N) = c(p) + F(w) + (N+w)\,b(w). \tag{15} $$

Notice that in equation (15) the effect of local price variation is to simply shift the labor supply function up or down. In this case, it might suffice to merely incorporate location-specific dummies when estimating labor supply functions.8 Without this separability, however, local price variation would have a fundamental impact on the shape of the labor supply function itself.

6 Quasi-homothetic preferences are useful because they preserve a linear expansion path of homothetic preferences, but they do not require the path to go through the origin. Thus, under quasi-homothetic preferences, income elasticities of demand need not equal 1, as is the case with homothetic preferences.

7 We could attempt to analyze cases that are even more general, but as we shall see, matters are sufficiently discouraging even for the quasi-homothetic case.

8 In fact, in empirical work on labor supply, researchers generally do not even take this simple step.


These two propositions demonstrate that even in the simple case of quasi-homothetic preferences, rather strong conditions are necessary for location-independent labor supply responses to income and wage changes.
The Stone-Geary example used in the previous section illustrates this point. Indirect utility can be written in the form $V = \alpha(p,w) + (N + w)\beta(p,w)$, where

$$ \alpha(p,w) = -\frac{c p\,\theta A \delta^{\delta} (1-\delta)^{1-\delta}}{p^{\delta} w^{1-\delta}} = -c p^{1-\delta}\,\theta A \delta^{\delta} (1-\delta)^{1-\delta} \cdot \frac{1}{w^{1-\delta}}, \tag{16} $$

$$ \beta(p,w) = \frac{\theta A \delta^{\delta} (1-\delta)^{1-\delta}}{p^{\delta} w^{1-\delta}} = \frac{\theta A \delta^{\delta} (1-\delta)^{1-\delta}}{p^{\delta}} \cdot \frac{1}{w^{1-\delta}}. \tag{17} $$

Since $\beta(p,w)$ is separable in p and w, the separability condition of Proposition 1 is satisfied. Recall from equation (6) that

$$ h(p,w,N) = \frac{\delta w - (1-\delta)(N - c p)}{w}. $$

Obviously, this function does not have an additively separable form as required in Proposition 2. So it is not surprising that the derivative of labor supply with respect to nonlabor income, N,

$$ \frac{\partial h}{\partial N} = -\frac{1-\delta}{w}, $$

is independent of p, whereas the derivative of labor supply with respect to the wage, w,

$$ \frac{\partial h}{\partial w} = \frac{(1-\delta)(N - c p)}{w^2}, $$

depends on p.
As noted earlier, labor supply studies generally focus on the responsiveness of labor supply to changes in wages. Here, we want to evaluate how price variations, in addition to changes in wages, affect the results. The ideal experiment would be one in which wages are exogenously shifted in each of many different U.S. cities and in which changes in labor supplied in each city can be traced. Finding data that correspond to such an experiment is a formidable task. The following work instead focuses exclusively on the sensitivity of labor supply to nonlabor income. We can justify this focus with the following result:

Proposition 3 In general, labor supply, $h(p,w,F)$, depends on the price of the local good, the wage, and full income, $F = w + N$.9 If the key relationship $\partial h/\partial w$ is independent of p, then $\partial h/\partial N$ is independent of p.

To prove this proposition we consider first the effect of a change in nonlabor income on labor supply:

$$ \frac{\partial h(p,w,F)}{\partial N} = \frac{\partial h(p,w,F)}{\partial F} \cdot \frac{\partial F}{\partial N} = \frac{\partial h(p,w,F)}{\partial F}. $$

This is independent of price, p, if and only if

$$ \frac{\partial h(p,w,F)}{\partial F} = G(w,F). \tag{18} $$

Integrating both sides of equation (18), we then notice that labor supply must have the following additively separable form:

$$ h(p,w,F) = g(w,F) + c(p,w) = g(w, w+N) + c(p,w). \tag{19} $$

Similarly, the effect of the change in the wage on labor supply does not depend on p if and only if

$$ \frac{\partial h(p,w,F)}{\partial w} = Q(w,F), \tag{20} $$

or, integrating both sides of equation (20),

$$ h(p,w,F) = q(w,F) + k(p) = q(w, w+N) + k(p). \tag{21} $$

Compare the additive separability requirements shown in equations (19) and (21). The latter takes the same basic form but is more restrictive. It follows that when $\partial h/\partial w$ is independent of the local price, p, $\partial h/\partial N$ is independent of the local price, p.

9 Recall that full-time work entails h = 1, so that the maximum possible labor income is w, making full income w + N.
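The Stone-Geary illustration of Propositions 1 and 2 can be verified with finite differences. This is a minimal sketch under assumed parameter values (DELTA, C, THETA, A and the evaluation points are illustrative, and the function names are hypothetical): it builds the indirect utility of equation (12) from equations (16) and (17), recovers labor supply through Roy's identity as $h = (\partial V/\partial w)/(\partial V/\partial N)$, and compares the result with the closed form in equation (6).

```python
"""Finite-difference check of equations (12)-(17) for Stone-Geary preferences."""

# Illustrative parameter values (assumptions, not taken from the paper).
DELTA, C, THETA, A = 0.6, 0.2, 1.0, 1.0
K = THETA * A * DELTA**DELTA * (1 - DELTA) ** (1 - DELTA)


def beta(p, w):
    """Equation (17): multiplicatively separable in p and w."""
    return K / (p**DELTA * w ** (1 - DELTA))


def alpha(p, w):
    """Equation (16): alpha equals -c * p * beta."""
    return -C * p * beta(p, w)


def V(p, w, n):
    """Equation (12): quasi-homothetic indirect utility."""
    return alpha(p, w) + (n + w) * beta(p, w)


def hours_roy(p, w, n, eps=1e-6):
    """Labor supply from Roy's identity, h = (dV/dw) / (dV/dN)."""
    dV_dw = (V(p, w + eps, n) - V(p, w - eps, n)) / (2 * eps)
    dV_dN = (V(p, w, n + eps) - V(p, w, n - eps)) / (2 * eps)
    return dV_dw / dV_dN


def hours_closed(p, w, n):
    """Equation (6): closed-form Stone-Geary labor supply."""
    return (DELTA * w - (1 - DELTA) * (n - C * p)) / w


def slope(f, p, w, n, wrt, eps=1e-4):
    """Central difference of f(p, w, n) with respect to 'w' or 'n'."""
    if wrt == "w":
        return (f(p, w + eps, n) - f(p, w - eps, n)) / (2 * eps)
    return (f(p, w, n + eps) - f(p, w, n - eps)) / (2 * eps)


if __name__ == "__main__":
    for p in (1.0, 1.5):
        print("p =", p)
        print("  h (Roy, closed):", hours_roy(p, 1.0, 0.3), hours_closed(p, 1.0, 0.3))
        print("  dh/dN:", slope(hours_closed, p, 1.0, 0.3, "n"))
        print("  dh/dw:", slope(hours_closed, p, 1.0, 0.3, "w"))
```

At the two assumed prices the nonlabor-income slope is the same while the wage slope changes, consistent with the separability condition of Proposition 1 holding and the additive form of Proposition 2 failing for this utility function.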

EMPIRICAL RESULTS
The theoretical considerations outlined in the
preceding section suggest that unless preferences
are strongly restricted, the responsiveness of labor
supply to nonlabor income and to the wage will
vary across locations. It is possible, of course, that
the differences are insignificant and do not pose a
problem for empirical work. We examine this possibility using a dataset of married white women—
a group that is likely to have substantial variation
in labor supply (e.g., in response to differences in
wage, nonlabor income, and possibly local prices).
Data used in the analysis are from the 1990 PUMS10;
data include married non-Hispanic white women,
aged 30 to 50, who live in the 50 largest metropolitan statistical areas (MSAs) in the United States.
One goal of this exploration is to see if there
are any systematic differences in labor supply
related to differences in local prices. We consider
the relationship between labor supply and nonlabor income; the latter term is defined as family
income minus the woman’s own total income.
Given previous research on married women’s labor
supply, an inverse relationship would be expected
between nonlabor income and labor supply (i.e., leisure is likely a “normal good”). The question
here is whether that relationship differs in a systematic way across cities.
Examining the relationship between nonlabor
income and married women’s labor supply in cross
section is far from “state of the art” in estimating
labor supply. Still, it seems a reasonable first pass
at the issue, especially given that our focus is not
on any estimated relationship per se but on differences in the relationships in expensive and inexpensive urban areas.
In our investigation of the differences in the response of labor supply to the change in nonlabor income, we do not want to specify any parametric form because of concerns that results might be sensitive to the functional form.11 Instead, we use a nonparametric matching estimator. Two measures of labor supply are used: annual hours of work and an employment participation dummy variable.12 The data do not allow us to perform this analysis for each city because they do not provide enough support. Instead, we divide the sample roughly into thirds and examine differences between the most “expensive” cities (the 17 MSAs within the top one-third of housing prices) and “inexpensive” cities (the 17 MSAs with the lowest housing prices).

10 Data were provided by the Minnesota Population Center (Ruggles et al., 2004).

Our comparison of married women’s labor
supply in inexpensive and expensive cities then
follows three additional steps. The first step is to
divide households into deciles according to “nonlabor income” (which is predominately the husband’s income). Then within each decile we
compare the labor supply of women who live in
the expensive cities relative to the labor supply of
women who live in inexpensive cities. The goal is
to compare the labor supply of otherwise similar
women, so we use an estimator that matches women
with exactly the same age and level of education.
Separate analyses also are conducted for women
with high school education and college education.
Thus, the second step is to match women living in
an expensive city with corresponding women living in inexpensive cities (i.e., we match women in
each nonlabor income decile, $d_i$ ($i = 1, \ldots, 10$), with
age and education vector $x = X$, to women with
these same characteristics living in inexpensive
cities). In the analysis that centers on annual work
hours, this is

$$\Delta(X, d_i) = E(h_1 \mid x = X, d_i) - E(h_0 \mid x = X, d_i), \tag{22}$$

where $h_1$ and $h_0$ are annual hours of work in expensive
and inexpensive cities, respectively. In the absence
of selection, this might be taken to be the causal
effect on labor supply (measured in hours per year)
of living in an expensive city relative to an inexpensive city. The third step is to average the quantity in equation (22) across all women in each decile $d_i$:

$$\Delta_n(d_i) = \int \Delta(x, d_i)\, dF_n(x \mid d_i), \tag{23}$$

where $dF_n(x \mid d_i)$ is the national distribution of $x$ in the decile $d_i$.

The analysis is repeated using a second measure of labor supply, a labor force participation dummy variable. When these empirical exercises are performed separately for women with a high school diploma and those with a college degree, $x$ is simply an age vector.

11 See, for example, DaVanzo, DeTray, and Greenberg (1973).

12 We also repeated the analysis with several other measures of labor force participation, such as an indicator of full-time employment. The results remain essentially the same.

Table 1
Differences in Annual Hours and Participation Rates Between Expensive and Inexpensive Locations by Nonlabor Income Deciles

| Nonlabor income decile | All women: change in annual hours | All women: change in participation rates | High school diploma: change in annual hours | High school diploma: change in participation rates | College degree: change in annual hours | College degree: change in participation rates |
|---|---|---|---|---|---|---|
| 1 | –117.34 (14.23) | –0.04 (0.0065) | –136.1 (24.57) | –0.04 (0.012) | –78.08 (34.88) | –0.02 (0.016) |
| 2 | –75.46 (14.32) | –0.01 (0.0063) | –75.72 (24.36) | 0.00 (0.011) | –99.43 (36.47) | –0.02 (0.016) |
| 3 | –54.14 (13.74) | –0.01 (0.0060) | –19.42 (23.39) | 0.00 (0.012) | –46.71 (33.98) | –0.01 (0.015) |
| 4 | –15.14 (13.88) | 0.00 (0.0062) | –28.97 (23.63) | –0.01 (0.012) | –20.59 (37.16) | 0.00 (0.016) |
| 5 | –20.68 (13.31) | 0.01 (0.0063) | –51.79 (24.14) | 0.00 (0.012) | –13.31 (34.57) | 0.03 (0.015) |
| 6 | 2.59 (13.66) | 0.02 (0.0068) | –39.52 (24.14) | 0.00 (0.013) | 59.98 (31.66) | 0.05 (0.015) |
| 7 | 12.47 (14.38) | 0.01 (0.0072) | –16.11 (24.79) | 0.00 (0.013) | 85.6 (30.99) | 0.03 (0.015) |
| 8 | 83.55 (14.62) | 0.05 (0.0076) | 81.95 (26.78) | 0.05 (0.014) | 139.38 (30.24) | 0.08 (0.015) |
| 9 | 83.61 (15.80) | 0.04 (0.0083) | 88.98 (33.44) | 0.03 (0.017) | 128.59 (30.84) | 0.06 (0.016) |
| 10 | 82.59 (18.45) | 0.04 (0.0098) | 15.74 (41.52) | 0.00 (0.023) | 172.35 (28.04) | 0.07 (0.015) |

NOTE: Authors’ calculations, based on 5 percent 1990 PUMS data. The sample consists of white, non-Hispanic married women, aged 30 to 50. Bootstrapped standard errors using 999 replications are reported in parentheses.
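The three steps described above (split by decile, match on covariates, average the matched gaps) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `matched_gap`, the (cell, outcome) input layout, and the use of pooled cell counts as a stand-in for the national distribution $F_n$ are all assumptions made for the example.

```python
from collections import defaultdict


def matched_gap(expensive, inexpensive):
    """Average expensive-minus-inexpensive gap within one nonlabor income decile.

    Each argument is a list of (cell, outcome) pairs, where `cell` is a
    hashable covariate value such as (age, education) and `outcome` is
    annual hours or a 0/1 participation indicator.
    """
    def cell_stats(rows):
        total, count = defaultdict(float), defaultdict(int)
        for cell, y in rows:
            total[cell] += y
            count[cell] += 1
        return total, count

    tot1, n1 = cell_stats(expensive)
    tot0, n0 = cell_stats(inexpensive)

    # Only cells observed in both city groups have a defined gap.
    common = [c for c in n1 if c in n0]
    if not common:
        raise ValueError("no covariate cell appears in both samples")

    # Assumption: pooled cell counts approximate the weights dF_n(x | d_i).
    weight_sum = sum(n1[c] + n0[c] for c in common)
    gap = 0.0
    for c in common:
        delta = tot1[c] / n1[c] - tot0[c] / n0[c]    # equation (22) for this cell
        gap += delta * (n1[c] + n0[c]) / weight_sum  # weighted average, equation (23)
    return gap
```

Calling the function once per decile and per outcome would give one row of a table shaped like Table 1. Cells that appear in only one city group are dropped here, which is one simple way to respect the support problem the text mentions.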
Results are reported in Table 1. The difference
in annual hours of work between women living in
expensive and inexpensive cities is substantial
(and statistically significant) for many of the nonlabor income deciles. For example, ninth-decile
women in expensive cities work considerably
longer hours than corresponding women in inexpensive cities. College-educated women in this
decile average 129 more work hours, whereas
women with a high school education work an
average of 89 hours more.
An apparent and striking pattern is shown in
Table 1 and Figure 1. First, as might be expected,
among these married women, leisure appears to
be a normal good; women with higher levels of
outside income generally work fewer hours per
year and have lower labor force participation rates.
More important, for our purposes, is that the relationship between nonlabor income and labor supply is quite different for expensive and inexpensive cities. At the very lowest levels of nonlabor
income (e.g., deciles 1 and 2), women in expensive
cities have lower labor supply than women in
inexpensive cities. The opposite is essentially true
for women in the high nonlabor income deciles;
among women with high nonlabor income, labor
force participation and average hours worked are
higher in expensive cities than in inexpensive
cities.

Figure 1
Variation Between Expensive and Inexpensive Locations in Annual Hours and Participation Rates, by Nonlabor Income Decile
[Figure not reproduced. Four panels: Annual Hours (vertical axis: Hours) and Participation Rates (vertical axis: Percent), each shown for High School Graduates and College Graduates. Each panel plots Expensive Locations and Inexpensive Locations against Nonlabor Income Deciles 1 through 10.]

In short, the labor/leisure choice appears not to
conform to the additively separable form described
in Proposition 2; local prices do not merely shift
labor supply up or down. The derivative $\partial h / \partial N$
is generally negative (at least beyond the lowest
decile levels of N) and is smaller (in absolute value)
in the expensive city. This generalization holds
true for both high school– and college-educated
women.
Also, as noted, results are similar when “average hours” or “labor force participation rates” are
used as the measure of labor supply. Of note, in
these cities 66 percent of high school–educated
women and 70 percent of college-educated women
are employed on average. Thus, differences of 5 to
7 percentage points between expensive and inexpensive cities represent differentials of 8 to 10
percent, which seem (to us) quite substantial.
Our nonparametric approach does have one
disadvantage: The nonlabor income distribution
within each decile might differ somewhat for
women in expensive cities. An alternative flexible
parametric approach to estimation, described in
the Appendix, provides nearly identical inferences.
Our empirical findings are roughly consistent
with theoretical predictions in Case 2. In that
equilibrium example with Stone-Geary preferences,
the responsiveness of labor supply to nonlabor
income must be greater in inexpensive (low-productivity) cities than in expensive (high-productivity) cities.

CONCLUSION
We describe a simple model to demonstrate
that the effects of wage and nonlabor income on
labor supply typically differ by location. In particular, we show the derivative of the labor supply
with respect to nonlabor income is independent of
price only when labor supply takes a form based
on an implausible separability condition.
Empirical evidence demonstrates that the
effect of price on labor supply is not a simple “up-or-down shift” that would be required to meet the
separability condition in our key proposition. For
example, among women with low nonlabor income,
living in an inexpensive city is associated with
higher labor force participation and longer work
hours, whereas among women with high nonlabor
income, living in an inexpensive city is associated
with lower labor force participation and shorter
work hours.

This work has a number of implications for
empirical strategies in estimating labor supply and
other policy research. First, our research makes
clear that empirical work should never use cross-sectional variation in wages to estimate parameters
in labor supply models. We document significant
differences for married women in quantity of labor
supplied across cities that may have little connection with behavioral responses to cross-sectional
variation in wages.
Second, because labor supply elasticities vary
by location, researchers must be careful in interpreting results based on instrumental variable (IV)
strategies. For example, suppose an IV approach
is used in which the IV is the price of coal. Variation in the price of coal arguably serves as an excellent source of wage variation in the coal industry,
but the resulting estimates of the effect on labor
supply would apply only for regions where the
coal industry is a major employer. If local prices
differ in those regions from other parts of the country, the estimated relationships will not be generalizable to the entire country.
Third, using a back-of-the-envelope example,
we show that the evidence in Table 1 is consistent
with the possibility that wage elasticities of labor
supply (for married women) are quite different
across cities. Notice that the Slutsky equation, in
elasticity form, gives the relationship

$$\varepsilon_w = \varepsilon_w^H + \left(\frac{wh}{N}\right) \varepsilon_N, \tag{24}$$

where $\varepsilon_w$ is the observed wage elasticity of supply,
$\varepsilon_w^H$ is the corresponding Hicksian elasticity (reflecting the pure substitution effect), and $\varepsilon_N$ is the elasticity of labor supply with respect to nonlabor
income. Now consider college-educated married
women at the median level of nonlabor income. If
we take as causal the relationship drawn in Figure 1,
moving from the fourth to sixth deciles in income
we would estimate a nonlabor income elasticity,
$\varepsilon_N$, of –0.46 in the expensive cities and –0.29 in
the inexpensive cities. Suppose that the Hicksian
elasticity, $\varepsilon_w^H$, is 0.50 (and is the same in both
cities). We estimate that for the average woman at
the fourth decile wh/N is 0.57 in inexpensive
cities and 0.61 in expensive cities.13 Thus, the
uncompensated labor supply elasticity is more
than a third higher in expensive cities than inexpensive cities, 0.33 versus 0.24.
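The arithmetic behind equation (24) can be checked with a short Python sketch. The function name `uncompensated_elasticity` is illustrative only. The loop simply evaluates the formula at each combination of the inputs quoted above (a Hicksian elasticity of 0.50, earnings ratios of 0.57 and 0.61, and nonlabor income elasticities of –0.29 and –0.46); it makes no claim about which pairing underlies the two figures reported in the text.

```python
def uncompensated_elasticity(hicksian, earnings_ratio, income_elasticity):
    """Equation (24): eps_w = eps_w^H + (wh / N) * eps_N."""
    return hicksian + earnings_ratio * income_elasticity


# Inputs quoted in the passage; the grid over them is an illustrative choice.
for ratio in (0.57, 0.61):
    for eps_n in (-0.29, -0.46):
        value = uncompensated_elasticity(0.50, ratio, eps_n)
        print(f"wh/N={ratio:.2f}, eps_N={eps_n:.2f} -> eps_w={value:.3f}")
```

Because the income elasticity is negative, a larger earnings ratio or a larger (absolute) income elasticity pulls the uncompensated elasticity further below the Hicksian value.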
Fourth, as an example of an application to
policy-related research, locational differences
may occur in the response of female labor supply
to changes in taxes. Changes in income taxes, for
instance, would have different effects in different
cities. A closely related implication centers on the
analysis of social welfare policy. (Recall, for example, that wives of husbands with low earnings work
less in more expensive cities.) We believe that further analysis of policy implications is warranted.

13 In fact, the ratio of women’s earnings to nonlabor household income (primarily men’s earnings) is larger in expensive cities than in inexpensive cities at every decile.

REFERENCES
Abbott, Michael and Ashenfelter, Orley. “Labour Supply, Commodity Demand and the Allocation of Time.” Review of Economic Studies, 1976, 43(3), pp. 389-411.
Ashenfelter, Orley and Ham, John C. “Education, Unemployment, and Earnings.” Journal of Political Economy, 1979, 87(5), pp. 99-116.
Blundell, Richard and MaCurdy, Thomas. “Labor Supply: A Review of Alternative Approaches,” in Orley Ashenfelter and David Card, eds., Handbook of Labor Economics. Princeton, NJ: Princeton University Press, 1999, Vol. 3, pp. 1559-95.
Chen, Yong and Rosenthal, Stuart S. “Local Amenities and Life Cycle Migration: Do People Move for Jobs or Fun?” Journal of Urban Economics, 2008 (forthcoming).
DaVanzo, Julie; DeTray, Dennis N. and Greenberg, David H. “Estimating Labor Supply Response: A Sensitivity Analysis,” publication No. R-1372-OEO. Santa Monica, CA: The RAND Corporation, 1973.
Gabriel, Stuart A. and Rosenthal, Stuart S. “Quality of the Business Environment Versus Quality of Life: Do Firms and Households Like the Same Cities?” Review of Economics and Statistics, 2004, 86(1), pp. 438-44.
Haurin, Donald R. “The Regional Distribution of Population, Migration, and Climate.” Quarterly Journal of Economics, 1980, 95(2), pp. 293-308.
Killingsworth, Mark R. and Heckman, James J. “Female Labor Supply: A Survey,” in Orley Ashenfelter and Richard Layard, eds., Handbook of Labor Economics. Princeton, NJ: Princeton University Press, 1986, Vol. 1, pp. 103-204.
Masters, Stanley H. and Garfinkel, Irwin. Estimating the Labor Supply Effects of Income Maintenance Alternatives. New York: Academic Press, 1977.
Pencavel, John. “Labor Supply of Men: A Survey,” in Orley Ashenfelter and Richard Layard, eds., Handbook of Labor Economics. Princeton, NJ: Princeton University Press, 1986, Vol. 1, pp. 3-102.
Roback, Jennifer. “Wages, Rents, and the Quality of Life.” Journal of Political Economy, 1982, 90(6), pp. 1257-78.
Ruggles, Steven; Sobek, Matthew; Alexander, Trent; Fitch, Catherine; Goeken, Ronald; Hall, Patricia; King, Miriam and Ronnander, Chad. Integrated Public Use Microdata Series: Version 3.0 (machine-readable database). Minneapolis, MN: Minnesota Population Center, 2004; www.ipums.org.
Ruggles, Steve; Sobek, Matthew; Alexander, Trent; Fitch, Catherine; Goeken, Ronald; Hall, Patricia; King, Miriam and Ronnander, Chad. Integrated Public Use Microdata Series: Version 4.0 (machine-readable database). Minneapolis, MN: Minnesota Population Center, 2008; usa.ipums.org/usa/.

APPENDIX
The empirical inferences in Table 1 are based on an entirely nonparametric approach. We divided
our sample into 10 nonlabor income deciles and compared labor supply across women within each of
these cells. Our primary finding is that for women in low nonlabor income deciles, the labor supply is
lower in expensive cities than in inexpensive cities, whereas for women in high nonlabor income deciles,
labor supply is higher in expensive cities than in inexpensive cities.
Here we present a flexible parametric approach that leads to this same inference. We estimate labor
supply regressions with the independent variables age (entered as 21 dummy variables for each age, 30 to
50 years inclusive) and nonlabor income (entered as a fourth-order polynomial). We estimate regressions—
separately for high school–educated women and college-educated women, as well as for each labor supply
variable (employment and hours worked)—using the sample of women from the expensive cities. We
similarly estimate corresponding regressions for the sample of women from the inexpensive cities. Then
for each woman i who lives in the expensive cities, we estimate the outcome of interest $\hat{y}_{1i}$ (e.g., “predicted”
employment, or “predicted” hours worked) using the regression parameters from the expensive cities, and
similarly estimate $\hat{y}_{0i}$ using regression parameters from the inexpensive cities. Finally, we form the estimated
gap,

$$\hat{\Delta}_i = \hat{y}_{1i} - \hat{y}_{0i},$$

for each individual. Notice that this last quantity is the “impact of the treatment on the treated,” where
the “treatment” is location in an expensive city rather than an inexpensive city.
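The parametric procedure just described can be sketched with NumPy. This is a hedged illustration rather than the authors' code: the function names (`design_matrix`, `fit`, `treated_gap`) and the tuple layout of each sample are assumptions, and nonlabor income is assumed to be rescaled (for example, to tens of thousands of dollars) so that the fourth-order terms stay well conditioned.

```python
import numpy as np


def design_matrix(age, nonlabor, ages=range(30, 51), degree=4):
    """Age dummies (one per year, no separate intercept) plus a polynomial in nonlabor income."""
    age = np.asarray(age)
    nonlabor = np.asarray(nonlabor, dtype=float)
    dummies = np.column_stack([(age == a).astype(float) for a in ages])
    poly = np.column_stack([nonlabor ** k for k in range(1, degree + 1)])
    return np.hstack([dummies, poly])


def fit(age, nonlabor, y):
    """Least-squares coefficients for one city group."""
    X = design_matrix(age, nonlabor)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta


def treated_gap(exp_sample, inexp_sample):
    """Return y1_hat - y0_hat for every woman in the expensive-city sample.

    Each sample is a tuple (age, nonlabor_income, outcome) of equal-length arrays.
    """
    beta1 = fit(*exp_sample)    # parameters estimated on expensive cities
    beta0 = fit(*inexp_sample)  # parameters estimated on inexpensive cities
    X_treated = design_matrix(exp_sample[0], exp_sample[1])
    return X_treated @ beta1 - X_treated @ beta0
```

Both predictions are evaluated at the covariates of the expensive-city women, which is what makes the resulting gap an effect on the treated. Averaging the returned array within nonlabor income deciles would give entries shaped like those in Table A1.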
To summarize findings in a manner comparable to Table 1, we aggregate estimates into deciles of
nonlabor income. Results are presented in Table A1. Bootstrapped standard errors using 999 replications
are reported in parentheses.14

14 The bootstrap procedure in this case involves 999 replications of drawing a random sample with replacement from the original dataset and estimating the parameter of interest for that sample. After 999 replications, we have a sampling distribution of the parameter estimate. The standard deviation of that distribution is the standard error of the parameter estimate.
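The bootstrap recipe in this footnote translates directly into code. The sketch below uses only the Python standard library; the function name `bootstrap_se`, the fixed seed, and the use of the sample standard deviation are choices made for the example, not details taken from the paper.

```python
import random
import statistics


def bootstrap_se(sample, estimator, replications=999, seed=0):
    """Standard error of `estimator` via resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    draws = []
    for _ in range(replications):
        # Draw n observations with replacement from the original data.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(resample))
    # The spread of the replicated estimates is the bootstrap standard error.
    return statistics.stdev(draws)
```

For example, `bootstrap_se(hours, statistics.mean)` would give a standard error for mean hours, where `hours` is a hypothetical list of observations. Any estimator that accepts a list, including a matched gap, can be passed in the same way.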


Table A1
Differences in Annual Hours and Participation Rates Between Expensive and Inexpensive Locations by Nonlabor Income Deciles, Parametric Approach

| Nonlabor income decile | High school diploma: change in annual hours | High school diploma: change in participation rates | College degree: change in annual hours | College degree: change in participation rates |
|---|---|---|---|---|
| 1 | –128.7 (22.04) | –0.034 (0.0110) | –118.1 (34.23) | –0.027 (0.0143) |
| 2 | –93.4 (12.42) | –0.021 (0.0066) | –72.5 (17.76) | –0.016 (0.0079) |
| 3 | –68.6 (11.10) | –0.013 (0.0059) | –36.6 (16.07) | –0.002 (0.0074) |
| 4 | –47.1 (10.82) | –0.005 (0.0056) | –9.5 (15.23) | 0.009 (0.0071) |
| 5 | –28.1 (10.26) | 0.001 (0.0056) | 19.1 (14.59) | 0.021 (0.0066) |
| 6 | –2.1 (11.15) | 0.01 (0.0056) | 46.5 (14.18) | 0.032 (0.0066) |
| 7 | 23.8 (12.73) | 0.019 (0.0061) | 76.5 (14.59) | 0.045 (0.0071) |
| 8 | 55.3 (15.28) | 0.030 (0.0077) | 108.6 (17.27) | 0.058 (0.0082) |
| 9 | 87.5 (20.48) | 0.042 (0.0102) | 143.5 (20.89) | 0.075 (0.0099) |
| 10 | 81.6 (38.06) | 0.036 (0.0207) | 123.1 (30.26) | 0.066 (0.0151) |

NOTE: Authors’ calculations, based on 1990 PUMS data. The sample consists of all married, white, non-Hispanic women between the ages of 30 and 50 inclusive. The covariates are nonlabor income and age. Using a fourth-order polynomial, we use the sample of women from expensive cities to estimate the outcome of interest, which we denote $\hat{y}_{1i}$ for the ith woman. Using the sample of women from inexpensive cities, we estimate parameters for a fourth-order polynomial and then evaluate the function using the covariates of women from the expensive city sample, which we denote $\hat{y}_{0i}$ for the ith woman. We then form the parameter for the “impact of treatment on the treated” as $\hat{\Delta}_i = \hat{y}_{1i} - \hat{y}_{0i}$. We then aggregate estimates into deciles of nonlabor income. Bootstrapped standard errors using 999 replications are reported in parentheses.


Regional Aggregation in Forecasting:
An Application to the Federal Reserve’s
Eighth District
Kristie M. Engemann, Rubén Hernández-Murillo, and Michael T. Owyang
Hernández-Murillo and Owyang (2006) showed that accounting for spatial correlations in regional
data can improve forecasts of national employment. This paper considers whether the predictive
advantage of disaggregate models remains when forecasting subnational data. The authors conduct
horse races among several forecasting models in which the objective is to forecast regional- or
state-level employment. For some models, the objective is to forecast using the sum of further
disaggregated employment (i.e., forecasts of metropolitan statistical area (MSA)-level data are
summed to yield state-level forecasts). The authors find that the spatial relationships between
states have sufficient predictive content to overcome small increases in the number of estimated
parameters when forecasting regional-level data; this is not always true when forecasting state- and regional-level data using the sum of MSA-level forecasts. (JEL C31, C53)
Federal Reserve Bank of St. Louis Regional Economic Development, 2008 4(1), pp. 15-29.

Kristie M. Engemann is a senior research associate, Rubén Hernández-Murillo is a senior economist, and Michael T. Owyang is a research officer at the Federal Reserve Bank of St. Louis. This paper was prepared for the 4th Annual Business and Economics Research Group conference sponsored by the Federal Reserve Bank of St. Louis and the Center for Regional Economics—8th District. The authors thank Dave Rapach for comments.

© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Forecasting, especially as it pertains to policymaking, is typically conducted at the national level.1 However, a few recent papers have indicated that aggregating regional forecasts may improve forecasts of national indicators. For example, Hendry and Hubrich (2006) use disaggregate models to form forecasts for aggregate variables. Similarly, Giacomini and Granger (2004) show that using a disaggregate model that accounts for spatial correlations can reduce the root mean squared error of the forecasts. Their disaggregate forecasts take advantage of cross-regional correlations yet still restrict the number of parameters estimated.2 They argue that, under certain conditions, the sum of the forecasts from an order-p,q space-time autoregression [ST-AR(p,q)] can outperform both aggregate models and models that do not account for the spatial nature of the data. The ST-AR(p,q) model includes p temporal lags and q spatially distributed lags, that is, lags of the other regional series weighted by proximity. Thus, the ST-AR(p,q) model exploits both the spatial correlations and the information content in the disaggregated series.

Hernández-Murillo and Owyang (2006) take this approach to national employment data, showing that out-of-sample forecasts can be improved by modeling the spatial interactions between Bureau of Economic Analysis regions. They compare a ST-AR(p,q) model with vector autoregressions (VARs) with various levels of disaggregation. They conclude that, as predicted by Giacomini and Granger (2004), information in regional employment data is useful for forecasting national employment.

1 There are, however, some notable exceptions of forecasting economic indicators at the subnational level (dates and regions noted in parentheses): Glickman (1971, Philadelphia MSA); Ballard and Glickman (1977, Delaware Valley); Crow (1973, Northeast Corridor); Baird (1983, Ohio); Liu and Stocks (1983, Youngstown-Warren MSA); Duobinis (1981, Chicago MSA); LeSage and Magura (1986, 1990, Ohio); and Rapach and Strauss (2005, Missouri; 2007, Eighth Federal Reserve District).

2 Compared with a standard vector autoregression (VAR), the space-time autoregression (AR) model posited in Giacomini and Granger (2004) requires the estimation of $(n^2 - n - 1)p$ fewer parameters for the same lag order p.

In this paper, we are interested in whether the
information content of regional data can be observed
at a more disaggregated level. In particular, we ask
whether information for states helps forecast
regional data and whether information from cities
helps forecast state data. To this end, we construct
horse races among four competing models with
different levels of disaggregation. We then conduct
out-of-sample tests to determine which model produces the best short- and long-horizon forecasts.
The data used in these experiments are state- and
metropolitan statistical area (MSA)-level payroll
employment. In each experiment, the disaggregate
data are summed to yield either state- or regional-level aggregates. In each case, we ask whether
models using the disaggregate data provide lower
mean squared prediction errors (MSPEs) than the
aggregate alternatives. We find that the spatial
relationships among states have sufficient predictive content to overcome small increases in the
number of estimated parameters. The same is not
always true when forecasting state- and regional-level variables using the sum of MSA-level forecasts.
The next section reviews the four models used
in the horse races, followed by a section that discusses the subnational data and the construction
of the “aggregate” data. The results of the out-of-sample experiments are then presented, followed
by the conclusion.

MODELS
The goal of this experiment is to produce an h-period-ahead forecast of an aggregate time series, such as employment. In this context, “aggregate” does not necessarily mean “national,” although it is an obvious interpretation. Instead, here aggregate time series are data that are the sum or weighted sum of a number of (forecastable) disaggregate series. These series can be disaggregated in any manner (e.g., by regions or industries). The aggregate forecast then can be constructed directly from aggregate data or from the sum (or weighted sum) of its components. We examine four alternatives.

Suppose that period-t aggregate employment is denoted $Y_t$ and can be written as the sum of its N disaggregate counterparts (henceforth referred to as “regions,” which depending on the application may refer to either states or metro areas), $y_{nt}$, without error.3 Let $\hat{Y}_{t+h}$ be the h-period-ahead forecast of Y. A forecast from the simplest model, a univariate aggregate order-p autoregression (AR(p), Model 1), has the form

$$\hat{Y}_{t+h} = \sum_{j=1}^{p} \Phi_j Y_{t+h-j}, \tag{1}$$

where p is the number of lags and $\Phi_j$ are scalar coefficients.4

A similar univariate model can be constructed to forecast each of the individual components, in particular, region n’s h-period-ahead level of employment, $\hat{y}_{n,t+h}$.5 The aggregate forecast is the sum of the N regional forecasts (Model 2):

$$\hat{Y}_{t+h} = \sum_{n=1}^{N} \hat{y}^{uni}_{n,t+h} = \sum_{n=1}^{N} \sum_{j=1}^{p} \phi_{nj}\, y_{n,t+h-j}, \tag{2}$$

where $\hat{y}^{uni}_{n,t+h}$ is region n’s employment forecast from the univariate AR(p) model and $\phi_{nj}$ are scalar coefficients.

3 The implicit assumption made here is that the aggregate is exactly the sum of its component parts. That is, $Y_t = \sum_{n=1}^{N} y_{nt}$ holds identically. Of course, the validity of this assumption depends greatly on the choice of data.

4 Potential constants and time trends are suppressed in this section for notational convenience.

5 Henceforth, we refer to the disaggregate components as “regions,” although they can, in principle, be of any type (e.g., industry, state, MSA).

An alternative to Model (2) that accounts for the comovement between the regions is a VAR forecast (Model 3). The aggregate forecast obtained from such a model can be written as

$$\hat{Y}_{t+h} = \sum_{n=1}^{N} \hat{y}^{var}_{n,t+h} = \sum_{n=1}^{N} \sum_{k=1}^{N} \sum_{j=1}^{p} \Gamma_{nkj}\, y_{k,t+h-j}, \tag{3}$$

where $\hat{y}^{var}_{n,t+h}$ is region n’s employment forecast and $\Gamma_{nkj}$ is the (scalar) lag-j effect of region k on region n’s employment taken from the VAR coefficient matrices.

Finally, we consider a ST-AR(p,q) model (Model 4), which accounts explicitly for the spatial correlations between regions by imposing a relationship that depends on the proximity to a region’s neighbors. The spatial weights $w_{nk}$ are chosen a priori and are intended to reflect proximity between pairs of regions, for example, in terms of geographic characteristics such as contiguity or distance. Interaction between regions is governed by a weighting matrix $W = \{w_{nk}\}$ satisfying

$$w_{nk} \geq 0, \quad w_{nn} = 0, \quad \text{and} \quad \sum_{k \neq n} w_{nk} = 1.$$

The aggregate forecast is then

$$\hat{Y}_{t+h} = \sum_{n=1}^{N} \hat{y}^{star}_{n,t+h} = \sum_{n=1}^{N} \left( \sum_{j=1}^{p} \phi_j\, y_{n,t+h-j} + \sum_{k=1}^{N} \sum_{l=1}^{q} \psi_l\, w_{nk}\, y_{k,t+h-l} \right), \tag{4}$$

where $\phi_j$ and $\psi_l$ are scalar autoregressive and scalar spatial lag coefficients, respectively. The weighting matrices used in the empirical applications are discussed below.
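A small NumPy sketch may help make Model 4 concrete. It is an illustration under stated assumptions, not the authors' procedure: the names `row_normalize` and `star_forecast` are hypothetical, the weights come from a simple 0/1 contiguity matrix, the horizon is fixed at one step ahead, and the coefficients are taken as given rather than estimated.

```python
import numpy as np


def row_normalize(adjacency):
    """Turn a 0/1 contiguity matrix into weights with w_nn = 0 and rows summing to 1."""
    W = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    row_sums = W.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every region needs at least one neighbor")
    return W / row_sums


def star_forecast(history, W, phi, psi):
    """One-step-ahead ST-AR(p, q) forecasts for each region, following equation (4).

    history: array of shape (T, N), most recent observation in the last row.
    phi: p temporal-lag coefficients; psi: q spatial-lag coefficients.
    """
    history = np.asarray(history, dtype=float)
    forecast = np.zeros(history.shape[1])
    for j, coef in enumerate(phi, start=1):
        forecast += coef * history[-j]        # own lag j
    for l, coef in enumerate(psi, start=1):
        forecast += coef * (W @ history[-l])  # neighbors' lag l, weighted by proximity
    return forecast
```

Summing the returned vector gives the aggregate forecast. Setting `psi` to an empty list removes the spatial terms and leaves a pooled univariate autoregression, which shows how few extra parameters the spatial lags add compared with a full VAR.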
The primary differences among the four models involve a tension between modeling the (in-sample) cross-spatial correlations and parameter proliferation. Clearly, Models (1) and (2) are the most parsimonious models. However, these models neglect potentially predictive information in the comovement between the variables. On the other hand, the VAR depicted in Model (3) may overfit the in-sample data. Under parameter certainty, the VAR forecast in Model (3) weakly dominates the three alternative Models (1), (2), and (4). However, Giacomini and Granger (2004) show that forecasting from an estimated VAR (Model 3) is less efficient than forecasting from the ST-AR model (Model 4).6 Because the ST-AR model is a restricted form of the VAR, the error associated with parameter uncertainty decreases. Giacomini and Granger, however, are unable to determine whether the ST-AR model or the univariate model is more theoretically efficient (i.e., whether interaction between regions yields significant information for forecasting). In the following section, we investigate whether accounting for spatial interaction in regional employment data is sufficiently informative to warrant the use of disaggregate data in forecasting.

6 Under certain conditions, the univariate aggregate model yields a lower mean squared error. For a discussion of these conditions, see Giacomini and Granger (2004).

EMPIRICAL DETAILS
Hernández-Murillo and Owyang (2006) tested
the forecasting efficacy of the spatially disaggregated model for national employment. Here, we
consider further disaggregation by examining the
model’s ability to forecast state- and Federal Reserve
District–level employment. We conduct three
experiments. First, we forecast Eighth District
employment using the sum of state-level employment.7 Second, we forecast District employment
using the sum of Eighth District MSA–level employment.8 Finally, we forecast state-level employment
for each of the seven District states using MSA-level employment.

Data
Although a number of aggregate business cycle
indicators exist, relatively few series are available
at the disaggregate level. Two series available at a
state level with both a reasonable frequency and
sufficiently large sample are personal income
(quarterly) and employment (monthly).9 At an
MSA-level, only employment is readily available. We, therefore, concentrate our efforts on the appropriate employment forecasts.

7 The Federal Reserve’s Eighth District contains portions of seven states: Missouri, Illinois, Tennessee, Arkansas, Kentucky, Indiana, and Mississippi. Only Arkansas lies entirely in the Eighth District. However, for purposes of this experiment, we make the simplifying assumption that the District consists of the entirety of all seven states.

8 In constructing District-level employment for this experiment and state-level employment for the next experiment, we use the sum of MSA-level employment. For the former, we include only MSAs located in the Eighth District, and for the latter, we include all MSAs in the states. Rural employment is omitted in each case.

9 Gross state product, which is the state-level equivalent to national gross domestic product, is annual and only available at a one-year lag.

Figure 1
Eighth District States’ Payroll Employment
[Figure not reproduced. Line chart of payroll employment for Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, and Tennessee; horizontal axis ticks run from Jan. 1990 to Jan. 2006; vertical axis labeled “$ Thousands,” 0 to 6,500.]
NOTE: The employment series for each state is seasonally adjusted.

For our forecasting experiments, we use state- and MSA-level employment data from the Bureau
of Labor Statistics’ payroll employment survey.
For the first experiment, state-level employment
is summed to yield an approximation of the Eighth
District employment level. In the same manner,
the appropriate aggregates are constructed from
MSA-level data in the following two experiments
for forecasting District- and state-level data. For
each exercise, the full sample is January 1990 to
December 2007. For convenience, the state- and
MSA-level data are plotted in Figures 1 and 2,
respectively. Summary statistics for the data are
provided in Tables 1 and 2.
For each of the last two experiments, we construct the District- and state-level aggregates by
omitting rural employment. Table 3 shows that
the rural component of employment in each Eighth District state is sizable,
ranging from 11.6 percent in Illinois to 52.5 percent in Mississippi in 2006.
The difficulty, however, of adding rural employment to the forecasting regressions (at least those
that account for cross-regional correlations) lies in
modeling the comovements between rural and
urban employment. In particular, for the spatial
model (4), modeling the distance between the
rural and MSA centroids is problematic.

Forecasting Scheme
We could use one of two forecasting schemes—
recursive or rolling window. A recursive forecasting
scheme fixes the initial period for the in-sample
data. Each additional period is added to the sample
and the model is reestimated. Thus, the estimation
window expands as the sample expands. Conversely, the rolling window scheme fixes the size
of the dataset used to make the forecast. With each
new period, recent data are added and data at the
beginning of the sample are dropped. The rolling
window scheme is particularly useful for cases in
which the data-generating process experiences
structural breaks. This has been shown to be the
case for both state- and MSA-level employment
(see Owyang, Piger, and Wall, 2005 and forthcoming,
and Owyang et al., forthcoming). Therefore, we
choose to use a rolling window forecasting scheme
with a 13-year sampling period. The number of

Figure 2A
Eighth District MSAs' Payroll Employment, by State
[Line chart. Vertical axis: thousands, 0 to 2,000; horizontal axis: monthly, ticks from Jan. 1990 to Jan. 2006. Series: Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, Tennessee.]

NOTE: The employment series for each state is seasonally adjusted and consists of the sum of all MSAs in that state.

Figure 2B
Total State MSAs' Payroll Employment, by State
[Line chart. Vertical axis: thousands, 0 to 5,500; horizontal axis: monthly, ticks from Jan. 1990 to Jan. 2006. Series: Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, Tennessee.]

NOTE: The employment series for each state is seasonally adjusted and consists of the sum of all MSAs in that state.


Table 1
State-Level Summary Statistics

Level (thousands)
State          Mean      Variance    Maximum    Date         Minimum    Date
Arkansas       1,095.8    7,670.6    1,209.4    (see note)     912.2    Feb. 1990
Illinois       5,711.0   75,994.0    6,059.8    (see note)   5,201.4    Mar. 1992
Indiana        2,824.3   26,891.1    3,015.2    (see note)   2,492.5    Mar. 1991
Kentucky       1,705.5   17,009.6    1,859.8    (see note)   1,461.7    Apr. 1990
Mississippi    1,085.0    5,442.7    1,171.2    Dec. 2007      928.8    Jan. 1990
Missouri       2,603.0   26,105.7    2,805.4    (see note)   2,294.9    Apr. 1991
Tennessee      2,560.9   39,700.8    2,816.8    Dec. 2007    2,171.0    Apr. 1991

Skewness and growth rate (percent)
State          Skewness   Mean   Variance   Maximum   Date         Minimum   Date
Arkansas       –0.73      1.6     7.8         9.0     Jan. 1995     –5.2     Jan. 1991
Illinois       –0.64      0.8     6.8        11.7     Sep. 1995     –6.8     Jul. 2001
Indiana        –0.85      1.0    12.0        10.6     Feb. 1999    –10.4     Jan. 1999
Kentucky       –0.64      1.4    16.4        29.6     May 1992     –20.1     Apr. 1992
Mississippi    –1.02      1.4    15.9        15.9     Oct. 1993    –18.6     Sep. 2005
Missouri       –0.66      1.0    10.6        11.0     Apr. 1993    –14.6     Jan. 1991
Tennessee      –0.74      1.5    12.8        17.1     Nov. 1994    –13.2     Apr. 1996

NOTE: Monthly growth rates are annualized. The maximum-level dates for Arkansas, Illinois, Indiana, Kentucky, and Missouri appear in the source without row alignment: Nov. 2007, May 2000, Jun. 2000, Mar. 2007, May 2007.


55.6

42.0

345.8

171.4

153.3

111.6

Decatur, IL

Kankakee-Bradley, IL

Lake CountyKenosha County,
IL-WI

Peoria, IL

Rockford, IL

Springfield, IL

5.3

80.6

103.0

1,945.4

5.3

3.7

69.7

0.9

27,386.5

30.9

87.6

7.2

17.4

11.4

102.3

117.3

166.1

188.2

401.5

44.6

60.7

190.0

35.2

3,925.5

114.9

92.6

56.6

41.0

348.4

49.8

39.6

126.0

208.6

Level

Aug. 2000

Jul. 2000

Jun. 2007

Jun. 2007

Jan. 2007

Mar. 2000

Nov. 1999

Nov. 1996

Jan. 2001

Sep. 2007

Feb. 2002

Oct. 2007

Nov. 2003

Dec. 2007

Dec. 2007

Jul. 2007

Aug. 2007

Mar. 2007

Date

Maximum

106.1

133.2

149.1

265.1

35.8

52.5

164.5

31.2

3,393.5

98.0

65.3

47.0

36.5

255.1

36.6

27.0

89.9

105.2

Feb. 1990

Aug. 1991

Apr. 1992

Jan. 1990

Jan. 1990

Apr. 1992

Apr. 1990

Dec. 2006

Dec. 1991

Apr. 1990

Jan. 1990

May 1991

Feb. 1991

Feb. 1990

Jan. 1990

Jan. 1990

Jan. 1990

Jan. 1990

Date

Minimum
Level

Level (thousands)

0.03

–0.77

–0.40

–0.38

–1.10

1.05

–0.45

–0.31

–0.62

–0.28

–0.43

–0.15

–0.70

–0.53

–0.65

–0.48

–0.53

0.09

Skewness

0.6

1.4

1.4

2.4

1.6

0.5

0.9

0.3

0.6

1.2

2.5

1.0

0.7

1.8

2.0

2.4

2.1

4.0

Mean

66.7

81.9

81.4

20.8

85.1

78.2

28.0

137.9

8.3

102.3

120.5

34.9

75.3

13.8

58.2

62.7

37.0

25.1

Variance

44.3

35.7

69.9

26.4

31.5

54.2

23.9

48.0

12.4

52.9

48.4

22.3

35.3

24.5

34.0

37.1

22.9

38.0

Growth

Aug. 1993

Feb. 1990

May 1992

Oct. 1993

Jan. 1998

May 1992

Jul. 1998

Apr. 1991

Apr. 1993

Sep. 2007

Jun. 1992

Jan. 1996

Apr. 1991

Jan. 2001

Jul. 1993

May 2003

Apr. 1994

Jan. 2001

Date

Maximum

Growth rate (percent)

NOTE: *Indicates an MSA located in the Eighth District and used in the second experiment; monthly growth rates are annualized.

33.1

179.6

3,707.6

ChicagoNapervilleJoliet, IL

Davenport-MolineRock Island IA-IL

107.1

ChampaignUrbana, IL

Danville, IL

81.9

BloomingtonNormal, IL

1.1

39.3

Little RockNorth Little Rock,
AR*

51.9

44.3

307.2

Jonesboro, AR*

Pine Bluff, AR*

34.0

Hot Springs, AR*

Texarkana, TX-AR*

698.0

110.0

Fort Smith, AR-OK*

993.8

156.2

FayettevilleSpringdaleRogers, AR*

Variance

Mean

MSA

MSA-Level Summary Statistics

Table 2

–30.0

–28.4

–38.4

–8.7

–28.5

–29.9

–18.8

–42.4

–8.2

–23.4

–38.4

–16.2

–37.0

–5.5

–15.8

–26.2

–13.3

–8.1

Growth

Sep. 1992

Aug. 2000

Jul. 1994

Jun. 1993

Apr. 1992

Nov. 1991

Jan. 1999

Jan. 1999

Jul. 2001

Nov. 1999

Jan. 1995

Sep. 1993

Jul. 2006

Sep. 2001

Jan. 2000

Feb. 2006

Nov. 2006

Apr. 2007

Date

Minimum


51.2

Terre Haute, IN

Bowling Green, KY*

231.3

579.1

47.8

HuntingtonAshland,
WV-KY-OH

Lexington-Fayette,
KY

Louisville, KY-IN*

Owensboro, KY*

12.0

1,495.5

407.3

35.6

16.1

4,351.7

39.8

6.5

62.7

6.4

1.6

40.3

7.5

6,428.6

66.0

80.2

110.5

104.8

9.7

38.5

6.3

Variance

51.6

629.4

257.1

121.2

48.8

1,047.4

62.8

78.2

150.9

62.9

50.3

97.0

55.7

921.8

285.6

220.0

182.0

133.7

46.0

85.2

51.1

Level

Dec. 2007

Jul. 2007

Dec. 2007

Sep. 2007

Nov. 2007

Oct. 2007

Dec. 2007

Mar. 2000

May 2000

Jul. 1995

Apr. 2000

Aug. 2007

Dec. 1999

Aug. 2007

Jun. 1998

Dec. 1998

Nov. 2002

Feb. 2006

Oct. 2007

Aug. 2006

Mar. 1990

Date

Maximum

40.8

504.2

194.2

99.8

35.2

859.0

39.4

68.2

123.6

52.0

45.1

74.8

44.1

666.4

257.9

190.6

150.2

95.1

34.2

62.3

41.4

Jul. 1991

Apr. 1991

Apr. 1990

Sep. 1991

Jan. 1990

Jan. 1990

Jan. 1990

Jul. 1990

Apr. 1992

Jul. 2005

Sep. 1991

May 1991

Jul. 2003

Apr. 1990

Apr. 1990

Mar. 1991

Feb. 1990

Mar. 1991

Feb. 1990

Jul. 1990

Jun. 2007

Date

Minimum
Level

Level (thousands)

–0.92

–0.70

–0.62

0.07

0.00

–0.47

–0.15

–0.85

–1.02

0.03

0.04

–0.59

–0.01

–0.32

–0.34

–0.74

–0.70

–0.58

–0.88

–0.80

–0.70

Skewness

1.5

1.3

1.7

1.1

2.0

1.2

3.0

0.6

0.9

0.6

0.4

1.7

2.7

1.9

0.5

0.7

1.1

1.6

2.0

2.4

–0.7

Mean

38.6

17.3

22.9

34.3

36.6

9.4

71.1

43.8

35.9

129.1

36.8

104.8

857.6

15.7

23.2

27.3

28.9

52.8

85.2

202.9

70.7

Variance

24.5

16.2

20.5

34.0

22.1

15.3

46.7

24.5

23.8

68.3

21.8

37.1

312.8

18.4

21.6

21.5

26.6

22.9

47.0

80.4

30.0

Growth

Apr. 1999

Dec. 2006

Jul. 2000

Feb. 1994

Jan. 2006

Oct. 1993

Sep. 1991

Jan. 1993

Oct. 2004

Jun. 2003

Sep. 1999

Jan. 1995

Aug. 2003

Oct. 1998

Jan. 2001

Jul. 2005

Jan. 2001

Apr. 1992

Aug. 1992

Jan. 1993

Aug. 1994

Date

Maximum

Growth rate (percent)

NOTE: *Indicates an MSA located in the Eighth District and used in the second experiment; monthly growth rates are annualized.

41.9

110.0

Elizabethtown, KY*

965.5

74.4

South BendMishawaka, IN

CincinnatiMiddletown,
OH-KY-IN

57.0

140.9

Muncie, IN

47.5

Michigan CityLa Porte, IN

Gary, IN

87.8

208.5

273.5

Fort Wayne, IN

Lafayette, IN

170.2

Evansville, IN-KY*

50.9

117.9

Elkhart-Goshen, IN

802.3

41.5

Columbus, IN

Kokomo, IN

75.7

Indianapolis, IN

47.4

Bloomington, IN

Mean

Anderson, IN

MSA

MSA-Level Summary Statistics

Table 2, cont’d

–16.3

–17.4

–11.2

–17.6

–15.6

–6.2

–32.4

–34.3

–20.6

–31.3

–22.1

–51.1

–74.9

–13.1

–17.5

–17.6

–18.0

–20.9

–32.3

–39.9

–24.0

Growth

Feb. 2003

Jan. 1991

Jul. 1995

Jan. 1994

Apr. 2000

Jan. 2005

Jan. 1992

Jan. 2003

Apr. 1992

Jun. 1997

Oct. 1997

Jan. 2003

Jul. 2003

Jan. 1999

Sep. 2007

Jan. 1999

Oct. 1994

Apr. 1995

Jul. 1992

Jan. 2003

Jul. 1994

Date

Minimum


293.9

765.6

51.9

647.1

338.9

125.7

81.7

62.3

42.9

85.7

248.9

202.9

1,362.6

59.6

1,024.2

80.1

80.3

93.3

60.8

263.4

62.1

116.2

Level

Dec. 2007

May 2006

Dec. 2007

Sep. 2007

Apr. 1997

Nov. 2007

Jun. 2007

Jan. 2007

Oct. 2007

Jul. 2007

Dec. 2007

May 2007

Aug. 2007

Aug. 2007

Dec. 2007

Dec. 2007

Oct. 2007

Jul. 1999

Oct. 2007

Aug. 2007

Jan. 2005

Date

Maximum

71.0

522.9

38.1

487.5

240.8

107.4

64.2

45.7

32.2

53.4

200.6

131.2

1,165.0

42.8

824.1

60.2

59.0

61.2

46.1

194.6

43.7

Feb. 1991

Jan. 1990

Mar. 1991

Apr. 1990

Jan. 1990

Feb. 1990

Jan. 1991

Aug. 1991

Dec. 1990

May 1991

Jul. 1990

Jun. 1991

Jul. 1991

Jun. 1991

Feb. 1991

Apr. 1990

Jun. 1990

Jan. 1990

Jan. 1990

May 1991

Jan. 1990

Date

Minimum
Level

Level (thousands)

–0.38

–0.75

–0.66

–0.23

–1.11

–0.10

–0.59

–0.74

–0.26

–0.38

–0.31

–0.58

0.50

–0.60

–0.70

–0.72

–0.24

–0.03

–0.40

–0.01

–0.75

Skewness

2.2

2.1

1.6

2.0

1.0

1.6

1.9

2.1

2.7

1.2

2.5

0.8

1.9

1.2

1.7

1.9

2.5

2.9

1.7

2.0

3.7

Mean

18.3

87.2

22.9

18.2

52.2

52.7

57.8

169.0

45.4

24.7

17.0

8.1

59.6

12.3

27.7

32.0

26.9

392.5

10.6

35.9

176.1

Variance

16.9

46.4

23.4

14.4

25.1

31.7

31.6

109.3

25.7

19.7

16.8

10.0

35.2

14.0

20.0

22.1

22.7

191.7

18.0

52.5

75.1

Growth

Nov. 1994

Jul. 1993

Jan. 1994

Apr. 1993

Jul. 1990

Sep. 2003

Jul. 1999

Jan. 1994

Sep. 1993

Jul. 1999

Aug. 1990

Jan. 1998

Oct. 1991

Jul. 2007

Jul. 2004

Jul. 1994

Oct. 1993

Apr. 2007

Jul. 2003

Oct. 2005

Aug. 1992

Date

Maximum

Growth rate (percent)

NOTE: *Indicates an MSA located in the Eighth District and used in the second experiment; monthly growth rates are annualized.

18.1
5,679.1

46.9

2,773.8

777.1

21.5

22.7

30.5

Nashville-Davidson- 651.1
Murfreesboro, TN

Morristown, TN

581.0

Knoxville, TN

Memphis,
TN-AR-MS*

73.2

119.1

55.8

Jackson, TN*

Kingsport-BristolBristol, TN-VA

38.7

Cleveland, TN

Johnson City, TN

100.4

70.8

Clarksville, TN-KY
10.5

193.9

Chattanooga, TN-GA 226.9

425.1

167.5

Springfield, MO*

4,295.7

20.7

1,280.5

49.4

St. Joseph, MO-KS

3,513.1

35.2

44.2

99.2

9.2

471.5

26.4

202.7

Variance

St. Louis, MO-IL*

71.7

72.2

Jefferson City, MO*

932.0

77.9

Columbia, MO*

Kansas City, MO-KS

54.2

Pascagoula, MS

Joplin, MO

232.9

Jackson, MS

99.3

51.7

Hattiesburg, MS

Mean

Gulfport-Biloxi, MS

MSA

MSA-Level Summary Statistics

Table 2, cont’d

–7.4

–22.5

–12.2

–11.9

–24.0

–28.0

–28.4

–42.9

–18.8

–12.1

–11.2

–7.5

–19.8

–14.8

–14.7

–16.3

–13.2

–71.6

–7.3

–10.5

–78.9

Growth

Oct. 2001

Oct. 2002

Aug. 1993

Jan. 2003

Sep. 1990

Jul. 1997

Aug. 1999

Mar. 1993

Jul. 1990

Jan. 1996

Jan. 2003

Dec. 2000

Sep. 1996

Jul. 2002

May 2002

Apr. 1995

Apr. 1990

Sep. 2005

Aug. 1990

Sep. 2005

Sep. 2005

Date

Minimum


Table 3
Rural Employment by State in 2006

State          Rural employment (percent)
Arkansas       36.3
Illinois       11.6
Indiana        20.0
Kentucky       36.0
Mississippi    52.5
Missouri       23.9
Tennessee      22.1
Average        28.9

SOURCE: USDA, Economic Research Service, State Fact Sheets.

lags for each model is chosen using the Bayesian
information criterion (BIC) on the initial subsample and remains fixed for the entire forecasting
experiment.
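
The rolling window scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the univariate AR model, the simulated series, the `max_lag` bound, and all function names are hypothetical. The window of 156 months corresponds to the 13-year sampling period, and the lag order is chosen once by the BIC on the initial subsample and then held fixed.

```python
import numpy as np

def fit_ar(y, p):
    """OLS fit of an AR(p) with intercept; returns (coefficients, residual variance)."""
    y = np.asarray(y, dtype=float)
    lags = [y[p - j:len(y) - j] for j in range(1, p + 1)]  # column j holds y_{t-j}
    X = np.column_stack([np.ones(len(y) - p)] + lags)
    target = y[p:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return beta, resid @ resid / len(target)

def select_lag_bic(y, max_lag):
    """Choose p in 1..max_lag by BIC, using the same estimation sample for every p."""
    y = np.asarray(y, dtype=float)
    n_obs = len(y) - max_lag
    best_p, best_bic = 1, np.inf
    for p in range(1, max_lag + 1):
        _, sigma2 = fit_ar(y[max_lag - p:], p)  # target is always y[max_lag:]
        bic = n_obs * np.log(sigma2) + (p + 1) * np.log(n_obs)
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p

def forecast_ar(beta, history, horizon):
    """Iterate the fitted AR forward to produce 1- to horizon-step-ahead forecasts."""
    p = len(beta) - 1
    path = list(history[-p:])
    for _ in range(horizon):
        recent_first = path[::-1][:p]
        path.append(beta[0] + float(np.dot(beta[1:], recent_first)))
    return np.array(path[p:])

def rolling_forecasts(y, window, horizon, max_lag=6):
    """Fixed-size window: each step adds the newest observation and drops the oldest."""
    y = np.asarray(y, dtype=float)
    p = select_lag_bic(y[:window], max_lag)  # chosen once, then held fixed
    errors = []
    for start in range(len(y) - window - horizon + 1):
        sample = y[start:start + window]
        beta, _ = fit_ar(sample, p)
        actual = y[start + window:start + window + horizon]
        errors.append(actual - forecast_ar(beta, sample, horizon))
    return p, np.array(errors)

# Simulated stand-in for a monthly series, January 1990 to December 2007 (216 months).
rng = np.random.default_rng(0)
y = np.zeros(216)
for t in range(1, 216):
    y[t] = 0.2 + 0.6 * y[t - 1] + rng.normal()

lag_order, errors = rolling_forecasts(y, window=156, horizon=12, max_lag=6)
mspe_by_horizon = np.mean(errors ** 2, axis=0)
```

A recursive scheme would differ only in the sample definition: `sample = y[:start + window]` keeps the initial period fixed and lets the estimation window expand.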

Spatial Weighting
Two sets of weights are considered for the first
forecasting experiment. The first set of weights takes
into account the distance between the centroids of
economic regions, and the second considers geographic contiguity as a categorical qualification.
Under the first definition,

$$w_{nk} = \frac{1/d_{nk}}{\sum_{k \neq n} 1/d_{nk}},$$

where $d_{nk}$ is the distance between the geographic
centroids of regions $n$ and $k$. Under the second
definition,

$$w_{nk} = \frac{\eta_{nk}}{\sum_{k \neq n} \eta_{nk}},$$

where $\eta_{nk} = 1$ if regions $n$ and $k$ are geographically
adjacent, and $\eta_{nk} = 0$ otherwise. Both of the final
two experiments use only the distance between
centroids because contiguity cannot be established
for most MSAs.
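
As a small illustration of the two weighting definitions, the sketch below builds both row-normalized matrices with NumPy. The three centroids, the adjacency matrix, and the function names are hypothetical, and the straight-line (Euclidean) distance on planar coordinates is an assumption; the distance measure applied to the actual centroids is not specified in this section.

```python
import numpy as np

def distance_weights(centroids):
    """w_nk = (1/d_nk) / sum over k != n of (1/d_nk), with a zero diagonal."""
    pts = np.asarray(centroids, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # Euclidean distance (assumption)
    inverse = np.zeros_like(dist)
    off_diagonal = ~np.eye(len(pts), dtype=bool)
    inverse[off_diagonal] = 1.0 / dist[off_diagonal]
    return inverse / inverse.sum(axis=1, keepdims=True)

def contiguity_weights(adjacency):
    """w_nk = eta_nk / sum over k != n of eta_nk, where eta_nk = 1 for adjacent regions."""
    eta = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(eta, 0.0)
    return eta / eta.sum(axis=1, keepdims=True)

# Three hypothetical regions on a line; regions 0 and 2 are not adjacent.
centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]

w_distance = distance_weights(centroids)
w_contiguity = contiguity_weights(adjacency)
```

In this toy case the distance weights still link regions 0 and 2 (with weight 0.25 in row 0), whereas the contiguity weights set that link to zero. A region with no neighbors would produce a zero row sum under the contiguity definition and would need separate handling.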

RESULTS
A few broadly consistent features are notable for the three forecasting experiments. In particular, for the District forecasts, the aggregate AR exhibits a greater MSPE at every horizon than the ST-AR model. The difference in MSPEs between the ST-AR model and a more parsimoniously parameterized VAR is often small, especially at short horizons; and the disaggregate AR can provide some (small) forecasting advantages over the more heavily parameterized ST-AR model at short horizons but is inferior at long horizons.

Forecasting District Employment with
State-Level Data
The first set of results considers forecasting Eighth Federal Reserve District employment using state-level data. As mentioned previously, state-level data support two possible spatial weighting matrices for the ST-AR model: distance and contiguity. We present results for both weighting matrices.
Figure 3 shows the relative decline in MSPEs for the ST-AR model, using centroid distance as the spatial metric, relative to each of the other forecasting models. These results show that weighting state-level interactions by distance provides some advantage to aggregate forecasting over weighting by contiguity. The advantage may result because a contiguity weighting scheme would suppress potentially important interactions between noncontiguous states.10
For both weighting schemes, the informational advantage in modeling the regional interactions is obvious. The VAR and the ST-AR models yield lower MSPEs for almost every horizon. At very short horizons, the disaggregate AR has predictive ability similar to that of the VAR and the ST-AR models. However, at longer horizons, neglecting the regional interactions can increase the MSPE by up to 90 percent.
The regional VAR and the ST-AR models produce an interesting comparison. First, it is important to note that the lag order chosen by the BIC for the VAR is much shorter than that for the ST-AR. This negates, to some extent, the reduction in the MSPEs gained by reducing parameter uncertainty in the more parsimoniously parameterized ST-AR model. Figure 4 demonstrates the informational advantage for an ST-AR model versus a VAR with equal lag length. This finding is consistent with the theoretical findings in Giacomini and Granger (2004): Increasing the number of estimated parameters in the VAR with equal lags leads to potential overfitting and an increase in the MSPEs.

10

As alluded to above, the weighting matrix in spatial econometrics
is determined exogenously. Conley and Molinari (2007) propose a
test of the spatial weighting matrix. However, their test is conducted
in-sample and is a joint test of model and spatial weighting
misspecification.

Figure 3
Efficiency Gain for ST-AR Model, Using Eighth District States
[Line chart. Vertical axis: –0.2 to 1.0; horizontal axis: 0 to 25. Series: Aggregate AR(3); Disaggregate AR(3); Disaggregate VAR(1); Disaggregate ST-AR(3,3); Disaggregate ST-AR(3,3) Contiguity.]

Figure 4
Efficiency Gain for ST-AR Model, Using Eighth District States (Setting Equal Lag Lengths)
[Line chart. Vertical axis: –0.2 to 1.0; horizontal axis: 0 to 25. Series: Aggregate AR(3); Disaggregate AR(3); Disaggregate VAR(3); Disaggregate ST-AR(3,3).]

Forecasting District Employment with
MSA-Level Data
As Figure 5 shows, the results for disaggregating at the MSA level are broadly consistent with those for the state data. The disaggregate models perform better out of sample than the aggregate AR model. The ST-AR model is more efficient than the disaggregate AR at long horizons. At shorter horizons, this information advantage is eroded and sometimes negative. Moreover, the VAR performs better in this case than the ST-AR model for most horizons.
These results suggest several possible explanations. First, in the previous case, District data were disaggregated into seven states; here, the District is disaggregated into 18 MSAs. Although the increase in the number of disaggregate units may not seem significant, it leads to a substantial increase in the number of estimated parameters for the ST-AR model. This increase may erode the model's forecasting advantage because of the increased uncertainty from estimating the extra parameters. Second, the MSA may be an improper level of disaggregation. A third possibility is that the spatial weighting matrix used in this exercise does not properly model the interactions. This could potentially explain why the VAR model performs better than the ST-AR model despite estimating a comparable number of parameters.

Figure 5
Efficiency Gain for ST-AR Model, Using Eighth District MSAs
[Line chart. Vertical axis: –0.2 to 1.0; horizontal axis: 0 to 25. Series: Aggregate AR(3); Disaggregate AR(4); Disaggregate VAR(1); Disaggregate ST-AR(3,3).]
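
To make the comparisons in Figures 3 through 5 concrete, the following sketch computes an MSPE at each horizon and a relative gain of the ST-AR model over an alternative model. The exact normalization behind the plotted efficiency gains is not given in this section, so the formula used here (the alternative model's MSPE divided by the ST-AR MSPE, minus one) is an assumption, as are the function names and the toy forecast errors.

```python
import numpy as np

def mspe(errors):
    """Mean squared prediction error by horizon; errors has shape (n_forecasts, n_horizons)."""
    errors = np.asarray(errors, dtype=float)
    return np.mean(errors ** 2, axis=0)

def efficiency_gain(errors_alternative, errors_star):
    """Proportional increase in MSPE from using the alternative model instead of the ST-AR.

    Assumed normalization: MSPE_alternative / MSPE_ST-AR - 1.
    Positive values favor the ST-AR model; negative values favor the alternative.
    """
    return mspe(errors_alternative) / mspe(errors_star) - 1.0

# Toy forecast errors for two forecast origins and two horizons (hypothetical numbers).
errors_star = np.array([[1.0, 2.0],
                        [-1.0, -2.0]])
errors_ar = np.array([[1.0, 3.0],
                      [-1.0, -3.0]])

gain = efficiency_gain(errors_ar, errors_star)
```

Under this normalization, a value of 0.9 at some horizon would correspond to an MSPE that is 90 percent higher for the alternative model than for the ST-AR model.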

Forecasting State Employment with
MSA-Level Data
We conducted similar experiments using the level of employment in the seven states in the Eighth District as the aggregate and the MSAs in those states as the disaggregate components. Our motivation is to determine the optimal level of disaggregation in forecasting employment. Unfortunately, few results are consistent across states (Figure 6). For example, most states yield lower MSPEs for the disaggregate forecasting models versus the aggregate AR model. Mississippi is an exception: The aggregate AR gives MSPEs roughly similar to those of the VAR and much lower than those of either the ST-AR or disaggregate AR model. Overall, the model with the lowest MSPE for each state differs. The ST-AR model provides the lowest MSPE for about half of the states but performs considerably worse than even the aggregate AR for Mississippi and Indiana.

Figure 6
Efficiency Gain for ST-AR Model, Forecasting State Employment with MSAs
[Seven line-chart panels: Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, Tennessee. Horizontal axis in each panel: 0 to 25. Series in each panel: Aggregate AR, Disaggregate AR, Disaggregate VAR, and Disaggregate ST-AR, with lag orders that vary by state.]

One notable fact in these results is that, for a given model, the lag order called for by the (in-sample) BIC varies substantially across the states. Not surprisingly, the ST-AR model tends to perform worse in states in which the in-sample criterion calls for longer lags. Longer lags can increase parameter uncertainty or lead to overfitting.11 Similarly, for states in which long lags are called for in the AR model, this model performs poorly. We therefore conclude that, although some information may be gleaned from modeling spatial relationships, disaggregation to the MSA level should be done with some caution.

CONCLUSION
Recent studies have shown that, at times, aggregate variables can be more accurately forecasted
by summing disaggregate forecasts. In particular,
using models that take into account the spatial
interactions of the disaggregate series can improve
forecasting performance. This occurs at the expense
of estimating additional parameters. This tension
naturally leads to the question of how much disaggregation is “optimal.”
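
The basic comparison behind this question, forecasting the aggregate directly versus summing forecasts of its components, can be sketched as follows. The example is deliberately minimal and rests on assumptions: the regional series are simulated, each series is fit with an AR(1) by OLS, and the function name is hypothetical. It leaves out the spatial interactions that the VAR and ST-AR models add.

```python
import numpy as np

def ar1_forecast(y, horizon):
    """Fit y_t = a + b * y_{t-1} by OLS and iterate forward for `horizon` steps."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (a, b), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    path, last = [], y[-1]
    for _ in range(horizon):
        last = a + b * last
        path.append(last)
    return np.array(path)

# Simulated stand-in for seven regional series sharing a common shock (hypothetical data).
rng = np.random.default_rng(0)
n_regions, n_obs, horizon = 7, 156, 12
shocks = rng.normal(size=(n_obs, n_regions)) + rng.normal(size=(n_obs, 1))
regions = np.zeros((n_obs, n_regions))
for t in range(1, n_obs):
    regions[t] = 0.5 * regions[t - 1] + shocks[t]

# Aggregate approach: sum the regions first, then forecast the total.
aggregate_direct = ar1_forecast(regions.sum(axis=1), horizon)

# Disaggregate approach: forecast each region, then sum the forecasts.
aggregate_from_parts = sum(ar1_forecast(regions[:, n], horizon) for n in range(n_regions))
```

The disaggregate approach estimates two parameters per region rather than two in total, which is the simplest form of the trade-off between exploiting regional information and estimating additional parameters.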
We conducted a number of forecasting experiments along these lines. In general, we find that
disaggregation can produce better forecasts. For
example, by disaggregating a regional variable (the
Eighth Federal Reserve District’s employment level)
into states, we achieved a significant reduction in
the MSPE versus the aggregate AR. Using the state
level as the aggregate, however, yields less consistent results, which suggests that the exploitable
regional interactions at the MSA level may not be
sufficiently informative to overcome the increase
in estimated parameters. We imagine that further
disaggregation—perhaps to the county level—might
increase this tension between exploitable spatial
interactions and increased parameter uncertainty.
11

The tension between in-sample and out-of-sample fit is not surprising
(see Hansen, 2008).


REFERENCES
Baird, Catherine A. “A Multiregional Econometric
Model of Ohio.” Journal of Regional Science,
November 1983, 23(4), pp. 501-15.
Ballard, Kenneth and Glickman, Norman J.
“A Multiregional Econometric Forecasting System:
A Model for the Delaware Valley.” Journal of Regional
Science, August 1977, 17(2), pp. 161-77.
Conley, Timothy G. and Molinari, Francesca. “Spatial
Correlation Robust Inference with Errors in Location
or Distance.” Journal of Econometrics, September
2007, 140(1), pp. 76-96.
Crow, Robert T. “A Nationally-Linked Regional
Econometric Model.” Journal of Regional Science,
August 1973, 13(2), pp. 187-204.
Duobinis, Stanley F. “An Econometric Model of the
Chicago Standard Metropolitan Statistical Area.”
Journal of Regional Science, August 1981, 21(3),
pp. 293-319.
Giacomini, Raffaella and Granger, Clive W.J.
“Aggregation of Space-Time Processes.” Journal of
Econometrics, January 2004, 118(1/2), pp. 7-26.
Glickman, Norman J. “An Econometric Forecasting
Model for the Philadelphia Region.” Journal of
Regional Science, April 1971, 11(1), pp. 15-32.
Hansen, Peter R. “In-Sample Out-of-Sample Fit: Their
Joint Distribution and Its Implications for Model
Selection.” Unpublished manuscript, 2008.
Hendry, David F. and Hubrich, Kirstin. “Forecasting
Economic Aggregates by Disaggregates.” Working
Paper No. 589, European Central Bank, February
2006; www.ecb.eu/pub/pdf/scpwps/ecbwp589.pdf.
Hernández-Murillo, Rubén and Owyang, Michael T.
“The Information Content of Regional Employment
Data for Forecasting Aggregate Conditions.”
Economics Letters, March 2006, 90(3), pp. 335-39.
LeSage, James P. and Magura, Michael. “Econometric
Modeling of Interregional Labor Market Linkages.”
Journal of Regional Science, August 1986, 26(3),
pp. 567-77.


LeSage, James P. and Magura, Michael. “Using Bayesian
Techniques for Data Pooling in Regional Payroll
Forecasting.” Journal of Business and Economic
Statistics, January 1990, 8(1), pp. 127-35.
Liu, Yih-wu and Stocks, Anthony H. “A Labor-Oriented
Quarterly Econometric Forecasting Model of the
Youngstown-Warren SMSA.” Regional Science and
Urban Economics, August 1983, 13(3), pp. 317-40.
Owyang, Michael T.; Piger, Jeremy and Wall, Howard J.
“Business Cycle Phases in U.S. States.” Review of
Economics and Statistics, November 2005, 87(4),
pp. 604-16.
Owyang, Michael T.; Piger, Jeremy and Wall, Howard J.
“A State-Level Analysis of the Great Moderation.”
Working Paper No. 2007-003D, Federal Reserve
Bank of St. Louis, revised May 22, 2008.
(Forthcoming in Regional Science and Urban
Economics.)
Owyang, Michael T.; Piger, Jeremy M.; Wall, Howard J.
and Wheeler, Christopher H. “The Economic
Performance of Cities: A Markov-Switching
Approach.” Working Paper No. 2006-056C, Federal
Reserve Bank of St. Louis, revised January 21, 2007.
(Forthcoming in Journal of Urban Economics.)
Rapach, David E. and Strauss, Jack K. “Forecasting
Employment Growth in Missouri with Many
Potentially Relevant Predictors: An Analysis of
Forecast Combining Methods.” Federal Reserve Bank
of St. Louis Regional Economic Development, 2005,
1(1), pp. 97-112; research.stlouisfed.org/publications/
red/2005/01/RapachStrauss.pdf.
Rapach, David E. and Strauss, Jack K. “Forecasting Real
Housing Price Growth in the Eighth District States.”
Federal Reserve Bank of St. Louis Regional Economic
Development, November 2007, 3(2), pp. 33-42;
research.stlouisfed.org/publications/red/2007/02/
Rapach.pdf.


The Economic Impact of a Smoking Ban
in Columbia, Missouri:
An Analysis of Sales Tax Data for the First Year
Michael R. Pakko
In January 2007, an ordinance took effect in Columbia, Missouri, banning smoking in all bars,
restaurants, and workplaces. This paper analyzes data for sales tax collections at eating and
drinking establishments from January 2001 through December 2007, including the first 12 months
of the smoking ban. The analysis accounts for trends, seasonality, general business conditions, and
weather. The findings suggest that the smoking ban has been associated with statistically significant losses in sales tax revenues at Columbia’s bars and restaurants, with an average decline of
approximately 3½ to 4 percent. Businesses that serve only food show no statistically significant
effects of the smoking ban. Those that serve food and alcohol, or alcohol only, show significant
losses with estimates in the range of 6½ to 11 percent (with the larger losses associated with bars).
Some individual businesses within each category may have been unaffected, whereas others are
likely to have incurred much greater losses. (JEL I18, D78, H11)
Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), pp. 30-40.

In January 2007, the city of Columbia,
Missouri, implemented a smoke-free ordinance, banning smoking in all public places,
including bars and restaurants. This paper
analyzes data on sales tax collections at bars and
restaurants for the period before and after this
smoking ban was implemented. The sample period
covers the first year after the implementation of
the new law.1
The enactment of laws restricting smoking in
bars and restaurants has been a growing trend
among states and municipalities around the nation.
According to the American Nonsmokers' Rights
Foundation, 748 municipalities have provisions
for 100 percent smoke-free environments in bars,
restaurants, and workplaces. Of these, 555 require
smoke-free restaurants and 426 require smoke-free
bars.2

1

This paper represents an extension of my previous study (Pakko,
2007).

As more U.S. communities have adopted such
laws, economic data have accumulated, allowing
economists to better identify some of the economic
costs of these restrictions. A large body of early
evidence on the economic impact of smoking bans,
much of which was published in medical and public health journals, tended to find no statistically
significant effects.3 This finding sometimes has been
interpreted as demonstrating that there is no negative economic impact of smoke-free laws whatsoever.
2

These counts are as of July 1, 2008. See American Nonsmokers’
Rights Foundation (2008).

3

Scollo et al. (2003) provide a review of previous literature.

Michael R. Pakko is a research officer and economist at the Federal Reserve Bank of St. Louis. Joshua Byrge provided research assistance. The
author thanks Laura Peveler, budget officer for the city of Columbia, for providing the data used in this analysis, and John Schultz of the Boone
Liberty Coalition for providing information from a survey of bar and restaurant smoking policies in 2006.

© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


Figure 1
Sales Tax Revenues at Columbia Eating and Drinking Places
Annual Totals, Percent Change
[Bar chart. Vertical axis: percent change, 0 to 9; horizontal axis: 2002 to 2007. Bar labels in the source, not aligned to years: 8.1, 7.9, 7.5, 6.2, 4.5, 0.6.]

This interpretation is far too simplistic. Recent
economic research has made it increasingly clear
that there are significant economic effects—for
some specific businesses—when 100 percent smoking bans are implemented. The evidence suggests
that economic costs are borne by businesses that
tend to be frequented by smokers. Statistically significant costs have been identified for casinos and
bars, in particular.4
One of the cities in the Eighth Federal Reserve
District that recently adopted a smoking ban is
Columbia, Missouri. Since January 9, 2007, all bars
and restaurants in Columbia have been required
to be smoke free. Only some sections of outdoor
patios are exempt from the requirement.
Some local businesses continued to oppose
Columbia’s smoke-free ordinance throughout its
first year in effect. Petitions to repeal the law by
ballot initiative were circulated, but the campaign
was ultimately unsuccessful.5 According to local
press reports, at least seven establishments cited
the smoking ban as a factor in their decision to
close their doors in 2007.6 The owner of one business was quoted as reporting a 40 percent drop in
alcohol sales and a 20 to 30 percent drop in food
sales over the first several months of the smoking
ban.7 Although such reports are informative, they
are anecdotal. A more thorough, systematic analysis
of objective data is necessary to properly identify
economic costs.

4 For a review of some recent economic research, see Pakko (2008a).

5 In November 2007, the petition drive fell short of gathering enough valid signatures.

6 See, for example, LeBlanc (2007) and Coleman (2007).

7 See Lynch (2007). The business, Otto’s Corner Bar and Grill, closed in late 2007, citing the smoking ban as a factor in its demise.

SALES TAX REVENUES AT ALL EATING AND DRINKING ESTABLISHMENTS
Data from the city of Columbia show a distinct
decline in the growth rate of sales tax receipts at
bars and restaurants (Figure 1). The total for 2007
was only 0.6 percent above 2006. Revenues over
the previous four years had risen at an average rate
of 7.4 percent. In 2006—the year preceding the
implementation of the smoking ban—revenues
were 8.1 percent higher than the previous year.
The dramatic slowdown in sales tax revenues from eating and drinking establishments after the smoking ban was implemented is consistent with the anecdotal reports of revenue losses at Columbia bars and restaurants. However, a simple comparison of growth rates before and after the smoking ban is insufficient for drawing any firm conclusions.

Figure 2
Sales Tax Collected from All Eating and Drinking Establishments
$ Thousands, 2001-2007; series shown: Non-Seasonally Adjusted and Seasonally Adjusted

This section reports findings from a more rigorous analysis of the data covering all of Columbia’s
bars and restaurants. Using regression analysis to
account for trends, seasonality, general business
conditions, and weather, I find that the smoking ban
has been associated with statistically significant
losses in sales tax revenues. Point estimates indicate
an average loss of approximately 3½ to 4 percent.8

Sales Tax Data
The data series examined in this section consists of monthly sales tax revenues for all bars and restaurants in Columbia. Because no changes were made in tax rates over the sample period (January 2001–December 2007), sales tax revenues serve as a direct proxy for sales. Total sales tax receipts also were obtained from the city of Columbia for use as a control variable for overall economic activity. The data are also disaggregated, allowing independent analysis of bars and restaurants (see “Analysis of Disaggregated Data” below).

8 The range of estimates in this paper represents slightly smaller losses than in my earlier, preliminary analysis of the data (Pakko, 2007). In the earlier paper, the total included establishments classified as “eating places only” and “eating and drinking places.” The new dataset also includes “drinking places—alcoholic beverages only.” Because the latter category is a very small component of the total (about 4 to 5 percent over the sample period), its inclusion has little impact on the empirical findings. The new estimates reflect the additional data that have accumulated during the second half of 2007.

Figure 2 shows a plot of the raw data for total
bar and restaurant tax receipts, along with a series
that has been seasonally adjusted using the Census
X-12 ARIMA procedure. A cursory examination of
the data shows an evident surge in growth during
the latter part of 2005 and into early 2006. Growth
slowed in late 2006 and turned negative for much
of 2007. By December 2007, revenues were down
6 percent from a year earlier.
The appropriate question is not, however, whether growth in sales tax revenues has been positive or negative since the Columbia Smoke-Free Ordinance took effect, but whether the pattern is different from what it would have been in the ordinance’s absence. More formal statistical analysis is required to address this question.

Regression Analysis
To test the hypothesis of a significant effect of the Columbia smoking ban, I estimated a series of least-squares regressions. The dependent variable of the regressions is the log of restaurant sales tax revenues. Each regression includes a constant and a time trend, in addition to a dummy variable representing the implementation of the smoking ban (which has a value of 0 before 2007 and 1 for January-December 2007). The full regression also includes controls for overall economic activity and for weather:

$$\ln(\mathit{DiningTax}_t) = \gamma\,\mathit{SmokingBan}_t + \beta_0 + \beta_1\,\mathit{TimeTrend}_t + \beta_2 \ln(\mathit{OtherTax}_t) + \beta_3\,\mathit{Snowfall}_t + u_t.$$

The variable OtherTax is the total amount of non-food and beverage taxes collected by the city of Columbia. To control for the influence of adverse weather, the full specification also includes the variable Snowfall, which is entered as the deviation of actual monthly snowfall from historic averages. The focus of the analysis is the coefficient on the smoking-ban dummy variable ($\gamma$). All regressions include a first-order autoregressive error term, $u_t = \rho u_{t-1} + \varepsilon_t$ (although the autoregressive coefficient is not significant in many of the regressions). Estimation uses ordinary least squares regression with standard errors adjusted for general autocorrelation and heteroskedasticity using the Newey-West (1987) procedure.

Table 1
Regression Results for All Eating and Drinking Establishments

Variable                   (1a)         (1b)         (2a)         (2b)         (3a)         (3b)
Smoking ban                –0.0523***   –0.0518***   –0.0364***   –0.0376***   –0.0365***   –0.0403***
                           (0.0176)     (0.0157)     (0.0098)     (0.0091)     (0.0091)     (0.0091)
Constant                   11.6432***   11.7693***   5.5311***    6.1317***    6.6745***    7.3420***
                           (0.0120)     (0.0072)     (1.5513)     (1.6131)     (1.3621)     (1.3576)
Time trend                 0.0056***    0.0056***    0.0038***    0.0040***    0.0042***    0.0044***
                           (0.0002)     (0.0002)     (0.0005)     (0.0005)     (0.0004)     (0.0004)
Non-dining tax revenues                              0.4423***    0.4051***    0.3585***    0.3178***
                                                     (0.1122)     (0.1158)     (0.0986)     (0.0975)
Snowfall                                                                       –0.0049***   –0.0033***
                                                                               (0.0014)     (0.0011)
AR(1) coefficient          0.2522*      0.2255*      0.1078       0.0674       0.0778       0.0915
                           (0.1313)     (0.1340)     (0.1135)     (0.1092)     (0.1252)     (0.1281)
Seasonally adjusted data   No           Yes          No           Yes          No           Yes
Seasonal dummy variables   Yes          No           Yes          No           Yes          No
Adjusted R-squared         0.9642       0.9636       0.9728       0.9709       0.9766       0.9739

NOTE: *, **, and *** denote significance at 10, 5, and 1 percent, respectively. The dependent variable for all equations is the log of dining-sector tax revenue. Regressions labeled (a) use data that are not seasonally adjusted, whereas those labeled (b) use data that are adjusted using the Census X-12 ARIMA procedure.
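As a rough illustration of the full specification described in this section, the following Python sketch fits the log-linear equation by ordinary least squares. It is not the estimation code used for this paper: the data are synthetic and noise-free, the "true" coefficients are made-up values, and the names (`solve`, `ols`, `log_other_tax`, `snow_dev`) are hypothetical. The sketch also omits the seasonal dummy variables, the AR(1) error term, and the Newey-West standard errors.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[math.fsum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [math.fsum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

NAMES = ["smoking_ban", "constant", "time_trend", "log_other_tax", "snowfall"]
TRUE = [-0.04, 6.0, 0.004, 0.35, -0.004]  # made-up values, not the paper's estimates

rows, y = [], []
for t in range(84):  # 84 months: January 2001 through December 2007
    ban = 1.0 if t >= 72 else 0.0                 # 1 for January-December 2007
    log_other_tax = 14.0 + 0.2 * math.sin(0.7 * t)  # hypothetical ln(OtherTax)
    snow_dev = 3.0 * math.cos(1.9 * t)              # hypothetical snowfall deviation (inches)
    x = [ban, 1.0, float(t + 1), log_other_tax, snow_dev]
    rows.append(x)
    y.append(sum(b * v for b, v in zip(TRUE, x)))   # noise-free ln(DiningTax)

estimates = dict(zip(NAMES, ols(rows, y)))
for name in NAMES:
    print(f"{name}: {estimates[name]:.4f}")
```

Because the synthetic series contains no noise, the fit simply recovers the made-up coefficients; with real data the estimates would carry sampling error, which is why the paper reports adjusted standard errors.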
Baseline Specification. The results of a naive
baseline specification, including only a constant
and a time trend (plus the autoregressive error
term), are shown in the first two columns of Table 1.
Regression (1a) uses the non-seasonally adjusted
data for the dependent variable and includes a set
of monthly dummy variables to account for seasonal patterns (coefficient estimates not reported).
Regression (1b) uses the seasonally adjusted data.
Each of these basic regressions suggests a highly
statistically significant decline in tax revenues
associated with the implementation of the smoking
ban. Point estimates for the coefficients on the
smoking ban dummy variable indicate an average
decline of approximately 5 percent.9
9 Because the dependent variable is in logs, the coefficient estimates on the dummy variable can be interpreted (approximately) as percentage changes.
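The approximation mentioned in this footnote can be checked directly. In a log-linear model, a dummy coefficient c implies an exact proportional change of exp(c) - 1, which is close to c itself when c is small. The short sketch below applies that standard conversion to the smoking-ban coefficients printed in Table 1; the function name `log_points_to_percent` is a hypothetical helper, not something taken from the paper.

```python
import math

def log_points_to_percent(coef):
    """Exact percent change implied by a dummy coefficient in a log-linear model."""
    return 100.0 * (math.exp(coef) - 1.0)

# Smoking-ban coefficients as printed in Table 1
table1_smoking_ban = {
    "(1a)": -0.0523, "(1b)": -0.0518,
    "(2a)": -0.0364, "(2b)": -0.0376,
    "(3a)": -0.0365, "(3b)": -0.0403,
}

for name, coef in table1_smoking_ban.items():
    approx = 100.0 * coef
    exact = log_points_to_percent(coef)
    print(f"{name}: approximate {approx:.2f}%, exact {exact:.2f}%")
```

For coefficients of this size the two readings differ by little more than a tenth of a percentage point, so reading the coefficients directly as percentage changes is a reasonable shortcut here.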

Figure 3
Sales Tax Collected from Non-Dining Establishments
$ Thousands, 2001-2007; series shown: Non-Seasonally Adjusted and Seasonally Adjusted

Controlling for General Business Conditions.
Although these initial estimates control for general
trends and seasonality in the data, other factors
could be associated with the decline in restaurant
tax revenues. In fact, the data suggest an overall
decline in non-dining retail sales in Columbia that
is unlikely to be associated with the smoking ban.
Subtracting dining tax receipts from data for total sales tax receipts yields a measure of non-dining tax receipts. Figure 3 shows this measure of non-dining sales tax receipts on both a seasonally adjusted and non-seasonally adjusted basis.
A clear slowdown in 2006 and 2007 roughly corresponds with the timing of the slowdown in tax receipts at restaurants and bars. Non-dining tax receipts showed some recovery in early 2007 but sagged through the rest of the year. Overall yearly revenues were flat: the total for 2007 was 0.16 percent lower than in 2006. As of December 2007, non-dining sales tax revenues were down approximately 4.7 percent from a year earlier.
Regressions (2a) and (2b) add the (logged) non-dining revenue variable to the baseline specification to control for this slowdown in business activity. Regression (2a) includes the non-seasonally adjusted measure, whereas regression (2b) uses the seasonally adjusted version. In both cases, the coefficient on non-dining tax revenue is positive and highly significant. The addition of this factor does, in fact, account for some of the slowdown in dining tax revenues: Point estimates for losses associated with the smoking ban are smaller than in the baseline specification. Nevertheless, the coefficients on the smoking ban dummy variable are still highly significant, with point estimates indicating a decline of more than 3½ percent. These results indicate that the slowdown in dining tax receipts is partly related to a slowdown in overall economic activity, but the decline in revenues at bars and restaurants is greater than past patterns would predict.10
Controlling for Weather. Another factor that can be particularly important for revenues at bars and restaurants (for obvious reasons) is inclement weather.11 Figure 4 shows the average monthly snowfall for Columbia compared with actual snowfall over the sample period.12 The low snowfall totals during the winter of 2006-07 clearly represent a departure from average weather conditions. These relatively mild winter conditions might help explain the apparent surge in dining tax revenues during that period. In contrast, the relatively heavy snowfall near the end of 2007 might be associated with slower business at bars and restaurants.

Figure 4
Average and Actual Snowfall—Columbia, MO
Inches, 2001-2007; series shown: Actual and Average

10 The 2008 budget report for the city of Columbia also indicates that dining and entertainment sectors are lagging the rest of the local economy: “General retail sales remain steady, however the current trend indicates the home improvement/construction and dining and entertainment sectors are declining” (City of Columbia, 2007).

11 Adams and Cotti (2007) find that changes in restaurant employment after the implementation of smoking bans in warm-weather states differ from those in cold-weather states. They speculate that the difference might be related to the feasibility of providing outdoor seating areas where smoking might be permitted. Pakko (2008b) finds that a severe snowstorm on the East Coast had a significant effect on gambling revenues in Delaware after the implementation of a smoking ban in that state.

Regressions (3a) and (3b) add this consideration to the analysis, introducing a variable that is equal to the difference between actual and average snowfall (in inches). The coefficient on this snowfall variable is of the expected sign, and it is statistically significant. The point estimates in Table 1 indicate that one inch of snowfall in excess of the average tends to lower sales tax revenues by 0.3 percent (in the seasonally adjusted specification) to 0.5 percent (in the non-seasonally adjusted regression). The addition of the snowfall variable improves the overall fit of the model, but it has little impact on the significance of the smoking ban dummy variable. There remains a highly significant downturn beginning in January 2007, measuring approximately 3½ to 4 percent.13
12 Average snowfall is calculated for the period 1971-2000 (National Oceanic and Atmospheric Administration).

A Specification Test. The association of the smoking ban dummy variable with the Columbia Smoke-Free Ordinance in the reported regressions relies on the timing of its adoption. It is possible for a dummy variable to indicate statistically significant effects even if the restaurant sales slowdown began either before or after the implementation of the smoking ban. To test whether the dummy variable is accurately identifying the effects of the smoking ban and not an independent, unidentified factor, the regression specifications in (3a) and (3b) were reestimated using alternative dummy variables to evaluate the timing of the downturn more carefully.14 Possible breakpoints from July 2006 through June 2007 were considered.
Figure 5 shows the adjusted R-squared statistics from these regressions. For both methods of seasonal controls, the results show that the dummy variable specifying a breakpoint of January 2007 provides the best model fit. These results suggest that January 2007 does, indeed, represent the relevant breakpoint in the data series on bar and restaurant sales tax revenues.

Figure 5
Adjusted R-Squared Statistics for Different Breakpoints
Breakpoints from Jul 06 through Jun 07; series shown: Non-Seasonally Adjusted and Seasonally Adjusted

13 Although these estimates are lower than in my preliminary analysis (Pakko, 2007), the difference between the new estimates and the previous estimate of 5 percent is not statistically significant.

14 Regressions (3a) and (3b) were reestimated using alternative dummy variables that have a value of 1 for all months after and including a particular starting month and a value of 0 for all previous months.
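The breakpoint comparison in this specification test can be sketched in a few lines of Python. The example below is a deliberately simplified stand-in, not the paper's procedure: it regresses a synthetic log-revenue series on a constant and a single breakpoint dummy (so fitted values are just the two group means), whereas the paper reestimates the full specifications (3a) and (3b). The series, the 4 percent level shift, and all function names are assumptions made for illustration.

```python
import math

N_MONTHS = 84  # January 2001 through December 2007

def month_label(t):
    """Map a 0-based month index to (year, month)."""
    return (2001 + t // 12, t % 12 + 1)

def breakpoint_dummy(start, n=N_MONTHS):
    """1 for all months at or after `start`, 0 for all earlier months."""
    return [1 if t >= start else 0 for t in range(n)]

def adjusted_r2(y, dummy):
    """Adjusted R-squared from regressing y on a constant and one dummy."""
    n = len(y)
    groups = {0: [], 1: []}
    for value, d in zip(y, dummy):
        groups[d].append(value)
    # With only a constant and a dummy, the fitted values are the group means.
    ssr = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ssr += sum((v - mean) ** 2 for v in values)
    grand_mean = sum(y) / n
    sst = sum((v - grand_mean) ** 2 for v in y)
    r2 = 1.0 - ssr / sst
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # two estimated parameters

# Synthetic log revenues: a 4 percent level shift in January 2007 (index 72)
# plus a small deterministic wiggle standing in for noise.
TRUE_BREAK = 72
y = [11.6 - 0.04 * (t >= TRUE_BREAK) + 0.005 * math.sin(1.3 * t)
     for t in range(N_MONTHS)]

candidates = range(66, 78)  # July 2006 through June 2007
fits = {month_label(s): adjusted_r2(y, breakpoint_dummy(s)) for s in candidates}
best_break = max(fits, key=fits.get)
print("best breakpoint (year, month):", best_break)
```

The design mirrors the logic of Figure 5: each candidate month gets its own dummy, the model is refit, and the breakpoint with the highest adjusted R-squared is taken as the best description of when the shift occurred.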

Analysis of Disaggregated Data
In addition to sales tax data for the total bar
and restaurant sector of Columbia, I requested and
received data on sales tax revenues for three subsets
of the total, along with listings of the specific businesses that fall within each category. The designations correspond roughly to the following SIC codes:
• Group 1 (SIC code 5811): “Eating Places Only”
• Group 2 (SIC code 5812): “Eating and Drinking Places”
• Group 3 (SIC code 5813): “Drinking Places—Alcoholic Beverages”
The categories are not precisely distinguished;
business owners select their own category when
filing their tax statements. Undoubtedly, some
classifications are questionable. Nevertheless, the
three categories are distinguished by the types of
businesses prevalent on each list.
Group 1 includes fast-food, take-out restaurants, coffeehouses, and many common sit-down restaurants. Group 2 includes restaurants that might be commonly categorized as “bar and grill” establishments, as well as many common sit-down restaurants. The restaurants in group 2 are more likely to have separate bar areas than those in group 1. Group 3, the smallest category, primarily includes establishments that would be commonly classified as “bars.”
Figure 6 shows the data series (seasonally
adjusted and non-seasonally adjusted) for each of
the three groups. Group 2 is the largest of the three,
accounting for approximately 61 percent of the
total over the sample period. Group 1 accounts for
just over one-third (34 percent), while group 3
accounts for only about 5 percent. Over time, the
share of total tax revenues for group 1 establishments has been rising slightly (reaching 35 percent
in 2007), and the share from group 3 has been falling
(4 percent in 2007).
The Columbia Smoke-Free Ordinance is likely
to have affected these three categories of businesses
differently. Previous research has suggested that the
impact on bars differs from the impact on restaurants. For example, both Adams and Cotti (2007)
and Phelps (2006) use data from the Bureau of
Labor Statistics to identify significant effects on
bar employment but find no significant effect for
restaurants as a separate category.
One relevant distinction among businesses in these categories is that they may have differed in their smoking policies before enactment of the smoking ban. If few businesses within a category were affected by the new law, it is unlikely that a significant effect would be found in the data. If many businesses had to change their policies, the impact of the smoking ban might be more distinct. To examine the importance of this factor, the list of businesses in each category was cross-referenced against a list of bar and restaurant smoking policies compiled by the Boone Liberty Coalition (BLC) before enactment of the smoking ban.15 Many of the businesses on the sales tax list were not covered by the BLC survey, including those that had gone out of business before mid-2006 and those that have newly opened since that time. In fact, more than half of the listed establishments were in these unclassified categories. A clear pattern is evident, however, in those covered in the survey: Among restaurants in group 1, only 18 percent permitted indoor smoking before the smoking ban was enacted. For businesses in group 2, 56 percent allowed smoking, while for group 3, 71 percent did.16

Figure 6
Tax Revenues by Type of Establishment
Panels: Eating Places Only; Eating and Drinking Places; Drinking Places—Alcoholic Beverages
$ Thousands, 2001-2007; series shown in each panel: Non-Seasonally Adjusted and Seasonally Adjusted

Table 2
Disaggregated Regression Results

                           Non-seasonally adjusted data           Seasonally adjusted data
Variable                   Group 1      Group 2      Group 3      Group 1      Group 2      Group 3
Smoking ban                0.0107       –0.0642***   –0.1102***   0.0008       –0.0671***   –0.1074***
                           (0.0161)     (0.0120)     (0.0312)     (0.0180)     (0.0124)     (0.0287)
Constant                   6.1855***    6.2645***    3.5898       6.9832***    7.1419***    4.7455
                           (1.5714)     (1.2468)     (3.3697)     (1.5918)     (1.2459)     (3.2460)
Time trend                 0.0042***    0.0045***    0.0010       0.0045***    0.0048***    0.0012
                           (0.0005)     (0.0004)     (0.0011)     (0.0005)     (0.0004)     (0.0010)
Non-dining tax revenues    0.3137***    0.3526***    0.3751       0.2655**     0.2962***    0.2980
                           (0.1138)     (0.0903)     (0.2440)     (0.1144)     (0.0896)     (0.2333)
Snowfall                   –0.0046***   –0.0047***   –0.0038      –0.0022      –0.0041***   –0.0024
                           (0.0018)     (0.0014)     (0.0039)     (0.0014)     (0.0011)     (0.0029)
AR(1) coefficient          0.3334***    0.2807***    0.2422**     0.4114***    0.3197***    0.2103**
                           (0.1028)     (0.1060)     (0.1046)     (0.0984)     (0.1055)     (0.1052)
Seasonally adjusted data   No           No           No           Yes          Yes          Yes
Seasonal dummy variables   Yes          Yes          Yes          No           No           No
Adjusted R-squared         0.9572       0.9707       0.6863       0.9536       0.9700       0.4008

NOTE: *, **, and *** denote significance at 10, 5, and 1 percent, respectively. Regressions in each panel are estimated simultaneously using the technique of Seemingly Unrelated Regressions. The dependent variable for each equation is the log of tax revenue for a subset of the bar and restaurant sector. Group 1 includes food only, Group 2 includes food and beverage establishments, and Group 3 includes those businesses that serve only beverages. Regressions in the “Non-seasonally adjusted data” columns use data that are not seasonally adjusted, whereas those in the “Seasonally adjusted data” columns use data that are adjusted using the Census X-12 ARIMA procedure.

Regressions of the same general form as reported in Table 1 were estimated for each of the three subsectors. For both the non-seasonally adjusted and the seasonally adjusted data, a three-equation system was estimated using the technique of seemingly unrelated regressions. This technique allows for possible correlation among the residuals of the three equations (a distinct possibility in this case). In addition, it allows for testing cross-equation restrictions.

15 The BLC was active in opposition to the enactment of the Columbia smoking ban. They circulated a report (Boone Liberty Coalition, 2006) indicating that nearly two-thirds of Columbia’s restaurants had smoke-free policies before the ban was adopted.

16 Businesses that allowed smoking on patios before the ban are not counted in the totals for smoking permitted, since the Columbia Smoke-Free Ordinance included an exemption that allowed for some smoking sections to remain in outdoor seating areas.

Table 3
Wald Tests for Equality of Smoking Ban Coefficients Across Equations

                       Non-seasonally adjusted data              Seasonally adjusted data
Test                   Chi-square (1) statistic   Probability    Chi-square (1) statistic   Probability
Group 1 = Group 2      18.8373                    0.0000         13.7525                    0.0002
Group 1 = Group 3      12.4516                    0.0004         10.9588                    0.0009
Group 2 = Group 3      2.5268                     0.1119         2.3193                     0.1278

Not surprisingly, estimated effects of the smoking ban differed among these three groups. The
results of regression equations for the three groups
are reported in Table 2. Both non-seasonally
adjusted and seasonally adjusted data are shown.
The results are similar for each technique. For the
restaurants in group 1, there is no statistically significant effect associated with the smoking ban.
For businesses in group 2, the impact is negative
and highly statistically significant. The point estimates suggest losses of about 6½ percent. For the
bars in group 3, the small sample size means that
there is more noise in the data, so the fit of the
regression equation is much less precise.17 Nevertheless, the coefficient on the smoking ban dummy
variable is highly significant, with the estimates
suggesting losses of nearly 11 percent.
Wald test statistics (reported in Table 3) were calculated for testing the significance of the cross-equation differences in the smoking ban coefficients.
The coefficients on the smoking ban dummy variable in the equations for groups 2 and 3 were each
significantly different from the coefficient estimated
for group 1. However, because of the relatively
large standard errors for the group 3 estimates, the
hypothesis that the effect on group 2 and group 3
businesses was the same could not be rejected at
standard levels of statistical significance.18
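The probabilities in Table 3 follow from the chi-square distribution with one degree of freedom. As a small check, the Python sketch below recomputes them from the printed test statistics, using the fact that a chi-square(1) variable is the square of a standard normal. The helper name `chi2_sf_1df` is hypothetical, and the sketch does not reproduce the Wald statistics themselves, which would require the estimated cross-equation covariances.

```python
import math

def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    If Z is standard normal, Z**2 is chi-square(1), so
    P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))

# (test, data, chi-square statistic, probability) as printed in Table 3
table3 = [
    ("Group 1 = Group 2", "non-seasonally adjusted", 18.8373, 0.0000),
    ("Group 1 = Group 3", "non-seasonally adjusted", 12.4516, 0.0004),
    ("Group 2 = Group 3", "non-seasonally adjusted", 2.5268, 0.1119),
    ("Group 1 = Group 2", "seasonally adjusted", 13.7525, 0.0002),
    ("Group 1 = Group 3", "seasonally adjusted", 10.9588, 0.0009),
    ("Group 2 = Group 3", "seasonally adjusted", 2.3193, 0.1278),
]

for test, data, stat, published in table3:
    p = chi2_sf_1df(stat)
    print(f"{test} ({data}): p = {p:.4f}, published {published:.4f}")
```

Only the Group 2 = Group 3 comparison yields a probability above conventional significance thresholds, which is the basis for the statement that equality of the effects on those two groups cannot be rejected.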
17 Although neither the time trend nor the other tax revenues variable is individually significant in these regressions, the two variables are jointly significant (p-value < 0.001), and together account for much of the explanatory power of the equation.

18 In a regression equation estimated using the (logged) sum of group 2 and group 3 revenues as the dependent variable (full results not reported), the coefficient on the smoking ban dummy variable was found to be –0.065 for the non-seasonally adjusted data and –0.068 for a regression using seasonally adjusted data.

DISCUSSION AND CONCLUSION
The results reported in this paper indicate
statistically significant losses to bar and restaurant
sales tax revenues following the implementation
of the Columbia Smoke-Free Ordinance in January
2007. After accounting for trends, seasonality, an
overall downturn in retail sales, and an unusually
harsh winter, there remains a 3½ to 4 percent loss
in dining tax revenues associated with the smoking
ban. The effects of the smoking ban vary for different
types of businesses. Restaurants that serve primarily food only show no significant effect, whereas
bars and restaurants with bars show significantly
greater losses. For the latter categories, losses are
estimated to be in the range of 6½ to 11 percent.
It is important to note that the point estimates
identify only average losses. Many businesses in
this category are likely to have been unaffected (e.g.,
take-out businesses, fast-food franchises, and other
restaurants that already had smoke-free policies).
Accordingly, some businesses are likely to have
incurred losses that are far greater than the average.
Anecdotal reports from specific business owners
suggesting losses in the range of 30 percent do not
seem unreasonable.
One interesting feature of the Columbia experience is the response of restaurant owners to the patio exemption. According to the Columbia Missourian, owners of at least two bars are building or planning outdoor patio expansions. One owner was quoted as saying, “You have to have a patio to survive.”19 These renovations may help offset losses in sales revenue at these establishments, but the associated expenses also represent profit losses above and beyond the measured declines in revenues.

19 Solberg (2007), Greaney (2007).

Measuring the economic effects of smoking
bans can sometimes be difficult. For the case of
Columbia, Missouri, this analysis of data on sales
tax revenues indicates that losses are of a magnitude that is clearly identifiable and statistically
significant.

REFERENCES
Adams, Scott and Cotti, Chad D. “The Effect of Smoking Bans on Bars and Restaurants: An Analysis of Changes in Employment.” The B.E. Journal of Economic Analysis & Policy, February 8, 2007, 7(1); www.bepress.com/bejeap/vol7/iss1/art12.
American Nonsmokers’ Rights Foundation. “Municipalities with Local 100% Smokefree Laws?” July 1, 2008; www.no-smoke.org/pdf/100ordlisttabs.pdf.
Boone Liberty Coalition. “Proposed Smoking Ordinance Position Paper.” Unpublished manuscript, June 9, 2006; booneliberty.org/StopTheBan/BooneLibertySmokingBan.pdf.
City of Columbia, Missouri. “FY 2008 Adopted Budget.” October 4, 2007; www.gocolumbiamo.com/Finance/Services/Financial_Reports/FY2008/index.php.
Coleman, Kevin. “Ban Leaves Billiards Behind the Eight Ball.” Columbia Tribune, June 16, 2007; archive.columbiatribune.com/2007/jun/20070616busi003.asp.
Greaney, T.J. “In Smoking Ban Era, Patios a Hot Commodity.” Columbia Tribune, July 9, 2007; archive.columbiatribune.com/2007/jul/20070709news051.asp.
LeBlanc, Matthew. “Smoking Ban Fighters Fade in Their Effort: Petition Drive Has Grown ‘Apathetic.’ ” Columbia Tribune, April 26, 2007; archive.columbiatribune.com/2007/apr/20070426news007.asp.
Lynch, Andrew. “Petition to End Smoking Ban Awaits Signature Verification.” KBIA News, November 6, 2007; publicbroadcasting.net/kbia/news.newsmain?action=article&ARTICLE_ID=1178464&sectionID=1.
National Oceanic and Atmospheric Administration. “Climatological Data for St. Louis and Columbia.” www.crh.noaa.gov/lsx/?n=cli_archive.
Newey, Whitney K. and West, Kenneth D. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica, May 1987, 55(3), pp. 703-8.
Pakko, Michael R. “The Economic Impact of a Smoking Ban in Columbia, Missouri: A Preliminary Analysis of Sales Tax Data.” Federal Reserve Bank of St. Louis Center for Regional Economics CRE8 Occasional Report No. 2007-02, December 11, 2007; research.stlouisfed.org/regecon/op/CRE8OP-2007002.pdf.
Pakko, Michael R. “Clearing the Haze: New Evidence on the Economic Impact of Smoking Bans.” Federal Reserve Bank of St. Louis Regional Economist, January 2008a, pp. 10-11; stlouisfed.org/publications/re/2008/a/pages/smoking-ban.html.
Pakko, Michael R. “No Smoking at the Slot Machines: The Effect of Smoke-Free Laws on Gaming Revenues.” Applied Economics, July 2008b, 40(14), pp. 1769-74.
Phelps, Ryan. “The Economic Impact of 100% Smoking Bans” in Kentucky Annual Economic Report 2006. Lexington, KY: Center for Business and Economic Research, Gatton College of Business and Economics, University of Kentucky, 2006, pp. 31-34; gatton.uky.edu/CBER/Downloads/Phelps-06.pdf.
Scollo, Michelle; Lal, Anita; Hyland, Andrew and Glantz, Stanton. “Review of the Quality of Studies on the Economic Effects of Smoke-free Policies on the Hospitality Industry.” Tobacco Control, March 2003, 12(1), pp. 13-20; www.tobaccoscam.ucsf.edu/pdf/scollotc.pdf.
Solberg, Christy. “Effects of Smoking Ban Still Debated.” Columbia Missourian, September 27, 2007; www.columbiamissourian.com/stories/2007/09/27/effects-smoking-ban-still-debated/.


Urban Decentralization and Income Inequality:
Is Sprawl Associated with Rising
Income Segregation Across Neighborhoods?
Christopher H. Wheeler
Existing research shows an inverse relationship between urban density and the degree of income
inequality within metropolitan areas; this information suggests that as urban areas spread out, they
become increasingly segregated by income. This paper examines this hypothesis using data covering more than 165,000 block groups within 359 U.S. metropolitan areas for the years 1980, 1990,
and 2000. The findings indicate that income inequality—defined by the variance of the log household income distribution—does indeed rise significantly as urban density declines. This increase,
however, is associated with rising inequality within block groups as cities spread farther from their
central core. The extent of income variation between different block groups, by contrast, shows
virtually no association with population density. Accordingly, little evidence supports the notion
that urban sprawl is systematically associated with greater residential segregation of households
by income. (JEL D31, R11, R23)
Federal Reserve Bank of St. Louis Regional Economic Development, 2008 4(1), pp. 41-57.

For much of the past century, the population within U.S. metropolitan areas has
shown a persistent tendency to move outward as residents leave central cities for
suburban locales. This movement has been striking
within the past 50 years. In 1950, 41.5 percent of
metropolitan populations resided in suburban
areas (i.e., those outside central cities); a half century later, more than 62 percent did. As a consequence, the density of population within the
nation’s urban areas has changed dramatically.
Between 1950 and 2000, the average central-city
population density decreased from 7,517 residents
per square mile to 2,716. At the same time, suburban densities increased from 175 residents per
square mile to 208.1
1 All of these figures are derived from the U.S. Census of Population and Housing, as reported by Hobbs and Stoops (2002).

Undoubtedly, urban decentralization largely
reflects the decisions of individuals and employers
to expand their activities over more space. Improved
transportation technology and infrastructure, for
example, have made longer commutes easier.
These changes have encouraged workers and firms
to locate on the outer fringes of their metropolitan
areas where land tends to be more plentiful and
less costly.
Despite the “voluntary” nature of this process,
urban decentralization has generated several concerns about the welfare of metropolitan area populations. One such concern is a rising disparity
between neighborhoods, especially the decline of
incomes in central cities relative to those of their
suburban counterparts. As metropolitan areas
expand, the majority of both employment opportunities and relatively high-income households
may shift from the central core to the periphery,

Christopher H. Wheeler was a research officer at the Federal Reserve Bank of St. Louis at the time this article was written.

© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the
views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced,
published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts,
synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.


thereby creating a widening income gap between
these two areas. Over time, these differences may
become more pronounced as the poor become
increasingly isolated from productive interactions
with wealthier neighbors.²
Existing evidence seems to support this idea.
Margo (1992), for example, argues that the movement of metropolitan populations in the United
States toward suburban locales over the latter half
of the twentieth century can be linked, to a significant degree, to the rise in personal incomes. As
individual incomes increased, so did the demand
for land. One rather straightforward implication
of this hypothesis is that decentralization should
be accompanied by a rise in the extent of income
segregation. Individuals migrating to the suburbs
(i.e., those with a particularly high demand for
space) should also be those with relatively high
incomes. As a result, urban decentralization would
be expected to lead to the accumulation of high-income households on the outskirts of cities, while
poorer residents remain within the central cores.
A number of studies do suggest that poverty
became more concentrated within the country’s
urban areas over this same period. Mayer (1996)
reports that in 1964, families in the bottom quintile
of the income distribution were 1.2 times as likely
to reside in a central city as wealthier families. By
1994, they were 1.4 times as likely to reside in central cities. In studies of the largest U.S. cities and
metropolitan areas, Kasarda (1993) and Abramson,
Tobin, and VanderGoot (1995) find that individuals
living in poverty became increasingly concentrated
within poor neighborhoods (defined by Census
tracts) between 1970 and 1990. Although these two
particular studies do not consider the issue of urban
decentralization per se, the figures documented
therein certainly characterize a period during which
metropolitan populations were shifting from central
areas toward suburban ones.

² The movement of high-income individuals away from the poor, for example, may leave the poor with relatively few jobs (e.g., Kain, 1968) or reduce the extent to which the rich confer positive spillovers on the poor (e.g., Wilson, 1987, and Benabou, 1996).

Research on the spatial mismatch hypothesis
offers a similar conclusion. This idea, advanced by
Kain (1968), holds that inner-city residents tend
to experience adverse economic outcomes as population and employment opportunities leave those
inner cities because it becomes increasingly difficult for them to find and sustain employment.
Therefore, the gap between the incomes earned by
residents of suburban neighborhoods and those
earned by residents of the central city should be
expected to rise as populations spread out. Many
studies on this topic have found that inner-city
minorities do seem to experience worse labor market outcomes, usually measured by employment
status and earnings, as economic activity leaves
urban centers, although the literature is far from
unanimous on this point.³
On the specific topic of income inequality,
Wheeler (2004) finds that urban density exhibits
a strong negative correlation with the degree of
spread in the distribution of labor earnings.
Thus, as a metropolitan area’s population spreads
out, its wage distribution tends to widen. Although
the results apply to white male workers with a
strong attachment to the labor force (and so do not
offer direct evidence on spatial mismatch, which
tends to focus on differences by race), they certainly
are consistent with the concept that urban decentralization leads to greater segregation of high-income
and low-income workers across neighborhoods.
Despite the findings of existing work, surprisingly little research has directly studied the evolution of interneighborhood income differentials as
populations become increasingly dispersed, particularly among neighborhoods defined at levels
finer than central cities and suburbs. A notable
exception is Yang and Jargowsky (2006), who look
at the relationship between sprawl and a neighborhood segregation index based on urban tracts in the
United States between 1990 and 2000. This paper
performs a related, although different, exercise. In
particular, I examine the relationship between urban
density and the degree of income inequality both
within and between neighborhoods defined by
Census block groups. More specifically, I use data
on household income to compute the variance of
the income distribution for each of 359 U.S. metropolitan areas for the years 1980, 1990, and 2000. I
then exploit data covering more than 165,000 block groups to decompose these variances into components associated with the dispersion of incomes within block groups and components associated with the dispersion across them.

³ See, for example, Ihlanfeldt and Sjoquist (1989), Holzer (1991), and Weinberg (2000, 2004) for a discussion of these issues.

The results suggest that even though a strong
negative association exists between the variance of a
metropolitan area’s household income distribution
and its overall population density, the association
operates through a within-neighborhood channel
rather than a between-neighborhood channel. That
is, as the population of a metropolitan area spreads
out, household income inequality increases largely
because the extent of income variation among
households within the same block group rises,
not because neighborhoods become more segregated by income.
On closer inspection, the data do reveal some
evidence that decentralization tends to be accompanied by rising between-neighborhood income
gaps, but this occurs only at the top of the block-group income distribution. Specifically, the income
differential between the block group at the 90th
percentile of the household income distribution
and the block group at the median does increase
significantly as metropolitan areas decentralize.
However, the gap between the median and the
block group at the 10th percentile tends to decrease,
which leaves measures of the overall spread in the
between-neighborhood income distribution relatively unchanged. Moreover, there appears to be
little association between density and either the
average income of the block group at the 90th percentile or that of the block group at the 10th percentile. Similar results hold when the analysis is
repeated using Census tracts instead of block groups.
Notably, these results should not be interpreted
as suggesting that certain neighborhoods do not
experience particularly adverse economic outcomes
as populations decentralize. Some inner-city areas
indeed may become increasingly poor as activity
moves outward. However, the extent to which this
process occurs evidently has little effect on the
overall level of between-neighborhood income
inequality in a metropolitan area.
The remainder of the paper proceeds as follows.
The next section provides a brief description of the
data and some of the computational issues. The
results section is followed by concluding remarks.

DATA AND MEASUREMENT
The primary data source used for the analysis
is the decennial U.S. Census of Population and
Housing for the years 1980, 1990, and 2000 as compiled by GeoLytics.⁴ The GeoLytics data files report
a variety of demographic and economic characteristics (e.g., income, industry of employment, age,
race, gender, education, place of birth, employment-unemployment status) for individuals at a variety
of geographic levels, including counties, tracts,
and block groups. Unfortunately, individual-level
observations are not reported in the data; only
summary measures taken across the individuals
located within each geographic unit are reflected.
This feature thereby limits the types of statistics
that can be calculated. The primary advantage of
these data is the consistency of the geographic
units—the data have been constructed based on
consistent geographic definitions over all three
Census years.
This study focuses on average household
income and a variety of other economic and demographic data among residents in block groups,
which are used as the basis for a “neighborhood.”
Although neighborhoods could also be (and frequently are) defined by Census tracts, the focus is
on block groups in this paper because they represent the finest grouping available in the data. Across
the 359 metro areas in the sample, there are more
than 165,000 block groups that each contained, on
average, 526.5 households and had a median land
area of approximately 0.33 square miles in the year
2000.⁵ Tracts tend to be larger (1,648.8 households,
on average, and a median land area of 1.31 square
miles in 2000), and therefore, they may be less
appropriate when considering neighborhoods,
which are meant to encompass areas over which
individuals can reasonably be expected to interact
with one another. As demonstrated below, the
principal findings are mostly invariant to the
choice of block groups or tracts.
⁴ The data can be obtained from GeoLytics, Inc. at http://www.geolytics.com.

⁵ Metropolitan area definitions follow the Census Bureau’s definitions as of November 2004. They were accessed at www.census.gov/population/www/estimates/metrodef.html.


Table 1
Summary Statistics: Block Group Income Inequality

Year  Variable           Mean  Standard deviation  Minimum  Maximum
1980  Variance           0.55  0.06                0.43     0.75
      Within component   0.47  0.05                0.37     0.64
      Between component  0.07  0.04                0.003    0.24
1990  Variance           0.64  0.07                0.48     0.94
      Within component   0.50  0.05                0.39     0.65
      Between component  0.14  0.05                0.04     0.31
2000  Variance           0.65  0.08                0.48     1.05
      Within component   0.52  0.05                0.41     0.70
      Between component  0.13  0.05                0.02     0.38

NOTE: Statistics taken across 359 metropolitan areas.

I estimate the variance of a metropolitan area’s income distribution as follows. For each year, the number of households with incomes falling into each of N closed intervals is reported in the GeoLytics files.⁶ I use these figures to compute the fraction of households with incomes less than N distinct levels, which allows N quantiles of the household income distribution to be estimated for each metro area. For example, if 14 percent of all households have income less than $25,000, I estimate the 0.14 quantile by 25,000. Label these quantiles \(X_\alpha\). I then match these N quantiles to their corresponding values from a normal (0,1) distribution. Label these quantiles \(U_\alpha\). Assuming a lognormal household income distribution, \(X_\alpha\) and \(U_\alpha\) are related as follows:

\[ X_\alpha = \exp(\zeta + U_\alpha \sigma), \tag{1} \]

where ζ and σ are the mean and standard deviation (SD) parameters characterizing the lognormal distribution (see Johnson and Kotz, 1970, p. 117). These parameters are readily obtained by transforming equation (1) logarithmically and estimating by ordinary least squares (OLS). The fit of these regressions tended to be quite high in all cases. Across the 359 metro areas, the mean adjusted \(R^2\) was approximately 0.98 for each year, and the minimum across all metro area-year observations was 0.95. With the SD, σ, the variance follows simply as \(\sigma^2\).

⁶ For 1980, there are 15 income categories; for 1990, there are 24; for 2000, there are 15. See the Appendix for details.
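The estimation behind equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `fit_lognormal`, the bracket limits, and the household counts are all hypothetical, and the sketch assumes that the top income bracket is open-ended and therefore contributes no quantile (a cumulative share of 1 has no finite normal quantile).

```python
from math import log
from statistics import NormalDist


def fit_lognormal(upper_bounds, counts):
    """Estimate (zeta, sigma) of a lognormal from binned household counts.

    upper_bounds: upper income limit of every bin except the last one.
    counts: households per bin; one entry longer than upper_bounds because
            the last bin is assumed (hypothetically) to be open-ended.
    """
    total = sum(counts)
    std_normal = NormalDist()  # the normal (0,1) distribution
    u, log_x = [], []
    cumulative = 0
    for bound, count in zip(upper_bounds, counts):
        cumulative += count
        alpha = cumulative / total            # share of households below `bound`
        if 0.0 < alpha < 1.0:                 # inv_cdf needs 0 < alpha < 1
            u.append(std_normal.inv_cdf(alpha))   # U_alpha
            log_x.append(log(bound))              # log of X_alpha
    # OLS of log X_alpha on U_alpha: intercept = zeta, slope = sigma.
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(log_x) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, log_x))
    sigma = sxy / sxx
    zeta = mean_y - sigma * mean_u
    return zeta, sigma


# Hypothetical metro area: five income brackets, the last one open-ended.
bounds = [15_000, 25_000, 50_000, 100_000]
households = [1_200, 1_400, 3_100, 2_900, 1_400]
zeta, sigma = fit_lognormal(bounds, households)
variance = sigma ** 2  # estimated variance of log household income
```

The regression here is a simple one-regressor OLS written out by hand so that the sketch depends only on the standard library; any regression routine would serve equally well.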
Summary statistics describing metropolitan
area–level income variances appear in Table 1.
Most notably, they demonstrate that, on average,
the degree of dispersion exhibited by metropolitan
area–level (log) income distributions increased
between 1980 and 2000, with the majority of this
increase between 1980 and 1990. Over these two
decades, the mean income variance rose by a total
of 10 log points (approximately 18 percent). Of this
10 log point increase, the majority—9 log points—
was experienced during the 1980s. Qualitatively,
of course, this finding is consistent with what has
now been widely established in the inequality literature (e.g., Katz and Murphy, 1992; Juhn, Murphy,
and Pierce, 1993).

EMPIRICAL FINDINGS
Urban Decentralization and Income
Inequality
Consider first the relationship between metropolitan area–level population density and the extent
of income inequality. To do so, let the variance of
the (log) income distribution for metropolitan area
m in year t have the following characterization:
\[ \sigma^2_{mt} = \mu_m + \mu_t + \beta X_{mt} + \gamma D_{mt} + \varepsilon_{mt}, \tag{2} \]

where \(\mu_m\) is a metro area–specific fixed effect, \(\mu_t\) is a year-specific term, \(X_{mt}\) is a vector of covariates described in greater detail below, \(D_{mt}\) is the logarithm of population density, and \(\varepsilon_{mt}\) is a residual. To eliminate the metro area fixed effects, I take 10-year differences of equation (2), yielding

\[ \Delta\sigma^2_{mt} = \Delta\mu_t + \beta\,\Delta X_{mt} + \gamma\,\Delta D_{mt} + \Delta\varepsilon_{mt}, \tag{3} \]

which serves as the primary estimating equation in the analysis. Given the nature of the differenced error term, there is nonzero correlation between the residuals for the same metro area. The standard errors are adjusted to account for this correlation.
Density is calculated for each metropolitan area as the weighted average of county-level population densities, where the weights are given by each county’s share of total metropolitan area population. This measure is used instead of average metropolitan area density (calculated as the ratio of total metropolitan area population to total land area) to mitigate the influence of extremely large but relatively unpopulated counties, which appear in many metropolitan areas of the West. County-weighted population density gives these counties less weight in the computations and, therefore, may provide a better sense of how densely clustered a city’s population is.⁷ Table 2 lists the 10 most and least densely populated metropolitan areas in each year.
Among the covariates included in the vector \(X_{mt}\) are some basic characteristics commonly associated with the degree of income inequality in an economy. These characteristics include the percentages of the resident population that are black, female, foreign-born, younger than age 25, and older than age 65; the fraction of the population 25 years of age or older that has completed at least a bachelor’s degree; shares of employment in 9 broad industries⁸; the fraction of the labor force that is represented by a union; and the unemployment rate. I also include three region dummies to account for any basic geographic differences in the inequality trends across different parts of the country.⁹
Results appear in Table 3. I consider three different specifications of the covariates in the estimation of equation (3) to gauge the robustness of the density-inequality relationship. The first limits the regressors to log density, the three region dummies, and a time effect for the 1980-90 decade. The second then adds the population demographics of each metro area (age, race, gender, education, foreign-born status). The third includes the remainder of the covariates that provide a basic description of the metro area’s labor market (industry employment shares, unionization, unemployment).¹⁰
Several fairly standard findings are evident. Larger proportions of women and individuals younger than age 24 in the local population are strongly, positively associated with inequality, which likely reflects the relatively low average income among these individuals. Some evidence (although not always statistically significant) indicates that inequality increases with the percentages of foreign-born residents and individuals older than age 65 in the local population. Furthermore, inequality in a metro area tends to rise significantly as the unemployment rate increases, suggesting that households at the bottom end of the income distribution are more sensitive economically to the business cycle than wealthier households. Inequality is also significantly, negatively associated with the extent of union coverage in the local labor force, which is a relatively common finding. Although union workers typically receive an earnings premium over nonunion labor, union contracts tend to equalize earnings across workers (e.g., Fortin

⁷ I also repeated all of the estimations using weighted averages of block group–level population densities for each metro area. The results were qualitatively similar to those reported here.

⁸ The sectors are manufacturing; agriculture, forestry, fisheries, and mining; construction; wholesale trade; retail trade; finance, insurance, real estate; public administration; education services; health services. I do not use a more detailed industrial classification scheme, in part, to avoid difficulties associated with the change from the Standard Industrial Classification system in 1980 and 1990 to the North American Industry Classification System in 2000.

⁹ Because metropolitan area boundaries frequently cross state borders and region definitions are based on states, parts of some metro areas are in different regions. I assign these multiregion metropolitan areas to the regions in which the majority of their populations lie.

¹⁰ The unionization rate for each metropolitan area is based on state-level union coverage rates reported by Hirsch, Macpherson, and Vroman (2001) (available at www.unionstats.com). Metropolitan area–level union rates are calculated as weighted averages of their constituent state-level rates, where the weights are given by the fraction of each metro area’s labor force located in each state.
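The differencing step that leads from equation (2) to equation (3) can be illustrated with a small synthetic panel. The sketch below is only an illustration under simplifying assumptions: it omits the covariate vector, the region dummies, and the clustered standard errors, and the metro names, fixed effects, year effects, and densities are invented. Decade effects are absorbed by demeaning the differenced data within each decade, which is equivalent to including decade dummies in the regression.

```python
def first_difference_slope(panel):
    """OLS slope of the 10-year change in variance on the 10-year change in
    log density, with a separate intercept for each decade.

    panel: {metro: [(variance, log_density) for 1980, 1990, 2000]}
    """
    by_decade = {0: [], 1: []}  # 0: 1980-90 changes, 1: 1990-2000 changes
    for observations in panel.values():
        for d in (0, 1):
            dy = observations[d + 1][0] - observations[d][0]
            dx = observations[d + 1][1] - observations[d][1]
            by_decade[d].append((dx, dy))
    sxx = sxy = 0.0
    for pairs in by_decade.values():
        mean_dx = sum(dx for dx, _ in pairs) / len(pairs)
        mean_dy = sum(dy for _, dy in pairs) / len(pairs)
        for dx, dy in pairs:
            # Demeaning within the decade removes the differenced year effect.
            sxx += (dx - mean_dx) ** 2
            sxy += (dx - mean_dx) * (dy - mean_dy)
    return sxy / sxx


# Synthetic data (all values hypothetical): variance = mu_m + mu_t + gamma * D.
GAMMA = -0.07
year_effects = [0.00, 0.09, 0.10]
metros = {
    "A": (0.50, [5.0, 5.2, 5.1]),
    "B": (0.62, [7.1, 7.0, 6.8]),
    "C": (0.55, [3.0, 3.3, 3.7]),
}
panel = {
    name: [(mu_m + mu_t + GAMMA * dens, dens)
           for mu_t, dens in zip(year_effects, densities)]
    for name, (mu_m, densities) in metros.items()
}
gamma_hat = first_difference_slope(panel)  # recovers GAMMA up to rounding error
```

Because the metro-specific terms drop out of every 10-year difference, the estimate does not depend on the values chosen for the fixed effects.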


Table 2
Most and Least Densely Populated Metro Areas

Year  Top 10                                              Density   Bottom 10        Density
1980  New York-Northern New Jersey-Long Island, NY-NJ-PA  14,740.0  Flagstaff, AZ    4.03
      Philadelphia-Camden-Wilmington, PA-NJ-DE-MD         4,927.0   Prescott, AZ     8.4
      Washington-Arlington-Alexandria, DC-VA-MD-WV        4,374.1   St. George, UT   10.7
      Baltimore-Towson, MD                                4,017.3   Casper, WY       13.5
      San Francisco-Oakland-Fremont, CA                   3,996.1   Wenatchee, WA    14.3
      Chicago-Naperville-Joliet, IL-IN-WI                 3,959.4   Farmington, NM   14.8
      Boston-Cambridge-Quincy, MA-NH                      2,930.6   Yuma, AZ         16.4
      Milwaukee-Waukesha-West Allis, WI                   2,885.7   Bend, OR         20.6
      Detroit-Warren-Livonia, MI                          2,556.5   Rapid City, SD   20.9
      Cleveland-Elyria-Mentor, OH                         2,435.9   El Centro, CA    22.1
1990  New York-Northern New Jersey-Long Island, NY-NJ-PA  15,161.5  Flagstaff, AZ    5.2
      Philadelphia-Camden-Wilmington, PA-NJ-DE-MD         4,385.6   Casper, WY       11.5
      San Francisco-Oakland-Fremont, CA                   4,171.9   Prescott, AZ     13.3
      Washington-Arlington-Alexandria, DC-VA-MD-WV        3,886.3   Farmington, NM   16.6
      Chicago-Naperville-Joliet, IL-IN-WI                 3,783.4   Wenatchee, WA    16.7
      Baltimore-Towson, MD                                3,440.1   Yuma, AZ         19.4
      Boston-Cambridge-Quincy, MA-NH                      2,942.5   St. George, UT   20.0
      Milwaukee-Waukesha-West Allis, WI                   2,806.9   Rapid City, SD   24.4
      Los Angeles-Long Beach-Santa Ana, CA                2,369.0   Bend, OR         24.8
      Detroit-Warren-Livonia, MI                          2,292.3   El Centro, CA    26.2
2000  New York-Northern New Jersey-Long Island, NY-NJ-PA  16,125.0  Flagstaff, AZ    6.2
      San Francisco-Oakland-Fremont, CA                   4,419.8   Casper, WY       12.5
      Philadelphia-Camden-Wilmington, PA-NJ-DE-MD         4,027.1   Prescott, AZ     20.6
      Chicago-Naperville-Joliet, IL-IN-WI                 3,880.0   Farmington, NM   20.6
      Washington-Arlington-Alexandria, DC-VA-MD-WV        3,573.1   Wenatchee, WA    21.2
      Boston-Cambridge-Quincy, MA-NH                      3,036.4   Rapid City, SD   26.5
      Baltimore-Towson, MD                                2,813.0   Yuma, AZ         29.0
      Milwaukee-Waukesha-West Allis, WI                   2,634.7   Great Falls, MT  29.8
      Los Angeles-Long Beach-Santa Ana, CA                2,634.6   Cheyenne, WY     30.4
      Detroit-Warren-Livonia, MI                          2,231.9   Duluth, MN-WI    32.9

NOTE: Population densities are calculated as (population-share) weighted averages of county-level densities (in residents per square mile).
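The density measure described in the note to Table 2 can be contrasted with simple average density in a short sketch. The two-county metro area below is hypothetical, as are the function names; the example is meant only to show why a large, sparsely populated county pulls average density down much more than it pulls the population-weighted measure.

```python
def weighted_density(counties):
    """Population-share weighted average of county densities.

    counties: list of (population, land_area_in_square_miles) pairs.
    """
    total_population = sum(pop for pop, _ in counties)
    return sum((pop / total_population) * (pop / area) for pop, area in counties)


def average_density(counties):
    """Total metro population divided by total metro land area."""
    total_population = sum(pop for pop, _ in counties)
    total_area = sum(area for _, area in counties)
    return total_population / total_area


# Hypothetical metro area: one dense county and one very large, sparse county.
counties = [(900_000, 500.0), (100_000, 9_500.0)]
weighted = weighted_density(counties)  # roughly 1,621 residents per square mile
average = average_density(counties)    # 100 residents per square mile
```

In this example the sparse county holds 95 percent of the land but only 10 percent of the residents, so it dominates the simple ratio while receiving a weight of just 0.1 in the weighted measure.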


Table 3
Overall Inequality Results

Variable                                            I (SE)           II (SE)          III (SE)
Log density                                         –0.07* (0.009)   –0.086* (0.01)   –0.07* (0.01)
Percent bachelor’s degree                           n/a              0.54* (0.08)     0.52* (0.09)
Percent female                                      n/a              0.73* (0.28)     0.44* (0.25)
Percent black                                       n/a              0.05 (0.11)      0.03 (0.10)
Percent <24 years                                   n/a              0.35* (0.14)     0.23* (0.13)
Percent >65 years                                   n/a              0.31* (0.16)     0.23 (0.15)
Percent foreign-born                                n/a              0.28* (0.13)     0.20 (0.13)
Percent manufacturing                               n/a              n/a              –0.35* (0.07)
Percent agriculture, forestry, fishing, and mining  n/a              n/a              –0.04 (0.11)
Percent construction                                n/a              n/a              –0.28* (0.12)
Percent wholesale trade                             n/a              n/a              –0.10 (0.15)
Percent retail trade                                n/a              n/a              0.11 (0.11)
Percent finance, insurance, and real estate         n/a              n/a              –0.46* (0.15)
Percent public administration                       n/a              n/a              –0.34* (0.14)
Percent education services                          n/a              n/a              –0.28* (0.13)
Percent health services                             n/a              n/a              0.11 (0.13)
Unemployment rate                                   n/a              n/a              0.46* (0.08)
Percent union representation                        n/a              n/a              –0.12* (0.05)
R²                                                  0.64             0.69             0.74

NOTE: Data represent 718 observations. The dependent variable is the change in the variance of the log income distribution for a metropolitan area. Each regressor is expressed in terms of contemporaneous 10-year changes. All specifications also include three region dummies and a time effect for the 1980-90 decade. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. A cell marked n/a indicates that the regressor is not included in that specification. * Significant at ≥10 percent.

and Lemieux, 1997). Shares of local employment
in manufacturing and construction—two sectors
frequently associated with relatively high earnings
for relatively low-skilled labor—correlate negatively
with income inequality.
The coefficient on the primary regressor of interest, the logarithm
of population density, is uniformly negative and
statistically significant across all three specifications. Based on the point estimates, a 1 SD decrease
in the change in population density corresponds to
a 1 log point increase in the change in log income
variance. This figure is far from negligible, representing approximately 20 percent of the mean
change in log income variance over the two decades
considered in this study. Again, this basic finding
has already been established, at least in a qualitative
sense, in some of the works previously described.

The following text takes a closer look at this result
to determine the extent to which it reflects an
increase in the degree of income segregation
across neighborhoods.
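The magnitudes quoted in this paragraph can be checked with a little arithmetic, using only the Table 1 means (0.55 in 1980 and 0.65 in 2000) and a density coefficient of roughly –0.07 from Table 3. The last line is a back-of-the-envelope inference about the implied SD of the change in log density; that number is not reported in the text and should be read as an approximation only.

```latex
\begin{align*}
\text{mean change in variance per decade} &\approx \frac{0.65 - 0.55}{2} = 0.05, \\
\frac{1 \text{ log point}}{\text{mean change per decade}} &= \frac{0.01}{0.05} = 0.20
  \quad \text{(the ``20 percent'' figure)}, \\
\text{implied SD of } \Delta D_{mt} &\approx \frac{0.01}{|\gamma|} = \frac{0.01}{0.07} \approx 0.14 .
\end{align*}
```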

Decomposing Income Inequality
Consider the following standard decomposition
of a metropolitan area’s income inequality. The
variance of household income in a metropolitan
area, \(\sigma^2\), can be estimated as

\[ \sigma^2 = \frac{1}{H} \sum_{n=1}^{N} \sum_{h=1}^{H_n} \left( y_{h,n} - \bar{y} \right)^2, \tag{4} \]

where \(y_{h,n}\) is the income of household h of neighborhood n, \(\bar{y}\) is the mean household income for the entire metropolitan area, \(H_n\) is the total number of households in neighborhood n, N is the total number of neighborhoods, and H is the total number of households, \(\sum_n H_n\).¹¹ This expression can be rewritten as the sum of two terms:

\[ \sigma^2 = \frac{1}{H} \sum_{n=1}^{N} \sum_{h=1}^{H_n} \left( y_{h,n} - \bar{y}_n \right)^2 + \frac{1}{H} \sum_{n=1}^{N} \sum_{h=1}^{H_n} \left( \bar{y}_n - \bar{y} \right)^2, \tag{5} \]

where \(\bar{y}_n\) represents the mean household income
in neighborhood n. The first of the terms on the
right-hand side of equation (5) is the “within”
neighborhood component, which measures the
degree of income dispersion among households
within the same neighborhood. The second term,
the “between” component, captures the amount of
income variation across different neighborhoods.
The within component cannot be computed
directly because data from individual households
are unavailable. However, the between component
can be computed. Using the estimates of the variance, \(\sigma^2\), derived above, the within-neighborhood
component is constructed as the difference
between the total variance and the between component.
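The decomposition in equation (5), and the residual construction of the within component, can be sketched as follows. The household-level incomes here are invented purely to verify the identity; in the paper the total variance comes from the lognormal fit rather than from household data, so `total_variance_from_households` is a hypothetical helper used only for this check.

```python
def between_component(counts, means):
    """Between term of equation (5), computed from block-group household
    counts H_n and block-group mean incomes."""
    total_households = sum(counts)
    grand_mean = sum(h * m for h, m in zip(counts, means)) / total_households
    return sum(h * (m - grand_mean) ** 2
               for h, m in zip(counts, means)) / total_households


def within_component(total_variance, counts, means):
    """Within term obtained as a residual: total variance minus between term."""
    return total_variance - between_component(counts, means)


def total_variance_from_households(neighborhoods):
    """Equation (4); computable only when household-level incomes are observed."""
    incomes = [y for hood in neighborhoods for y in hood]
    grand_mean = sum(incomes) / len(incomes)
    return sum((y - grand_mean) ** 2 for y in incomes) / len(incomes)


# Hypothetical log incomes for three tiny neighborhoods (illustration only).
neighborhoods = [[10.0, 10.4, 10.9], [11.2, 11.6], [9.8, 10.1, 10.3, 10.6]]
counts = [len(hood) for hood in neighborhoods]
means = [sum(hood) / len(hood) for hood in neighborhoods]

total = total_variance_from_households(neighborhoods)
between = between_component(counts, means)
within = within_component(total, counts, means)
```

Only the counts and means are needed for the between term, which is why it can be computed from summary data while the within term has to be backed out as a residual.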
Table 1 lists some summary statistics describing the within- and between-block-group components. Two features are immediately apparent. First, in each of the three years considered (1980, 1990, 2000), the extent of income variation within neighborhoods is considerably larger than the extent of variation between them. In the year 2000, for instance, the within-neighborhood component accounted for 80 percent of total metropolitan area income variation, on average. This finding is roughly similar to what Epple and Sieg (1999) report for municipalities in Boston and is consistent with the results of Ioannides (2004) and Hardman and Ioannides (2004), who document a substantial degree of income heterogeneity within small residential clusters in the United States. Second, the 10 years between 1980 and 1990 saw a sharp rise in the proportion of total income variation attributable to between-neighborhood differences. Over this decade, the average fraction of total income variation associated with differences across neighborhoods rose from 12.7 percent to 21.9 percent. Hence, although income variation remained predominantly a within-neighborhood phenomenon in 2000, the between-neighborhood component became increasingly important between 1980 and 2000.

¹¹ The average numbers of households per metropolitan area are relatively large: 180,164.6 for 1980, 208,780.9 for 1990, and 240,407.2 for 2000. Across all three years, the minimum number of households is 8,681. Hence, the difference between using a factor of 1/H in equation (4) and using 1/(H – 1) is extremely small.
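The shares quoted above can be approximated from the means reported in Table 1. This is a rough check rather than a replication: the text refers to the average of metro-level fractions, whereas the sketch below takes ratios of the rounded table means, so agreement is expected only up to rounding. The dictionary layout is an assumption made for the example.

```python
# Means from Table 1 (rounded as published).
table1_means = {
    1980: {"variance": 0.55, "within": 0.47, "between": 0.07},
    1990: {"variance": 0.64, "within": 0.50, "between": 0.14},
    2000: {"variance": 0.65, "within": 0.52, "between": 0.13},
}

# Ratio of each component's mean to the mean total variance, by year.
shares = {
    year: {
        "within": row["within"] / row["variance"],
        "between": row["between"] / row["variance"],
    }
    for year, row in table1_means.items()
}

# Change in the between share over the 1980s, in percentage points.
rise_in_between_share = 100 * (shares[1990]["between"] - shares[1980]["between"])
```

Under these assumptions the between share is close to 12.7 percent in 1980 and 21.9 percent in 1990, and the within share in 2000 is about 80 percent, in line with the figures in the text.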

Decentralization and Inequality:
Within versus Between Neighborhoods
To determine whether urban decentralization is associated with growing
inequality through a within- or a between-neighborhood channel (or possibly both),
I follow the procedure above and estimate
three specifications of equation (3) in which the
dependent variables are the changes in within- and
between-neighborhood income variation rather
than the change in the total variance of log income.
The estimates are shown in Table 4. Interestingly, they demonstrate some striking differences
in the estimated associations across the two sets of
results. In looking just at the longest specification,
III, the change in a metro area’s degree of income
variation within its block groups is positively and
significantly tied to changes in the fraction of the
population with a bachelor’s degree, the fraction
that is black, and the fraction that is foreign-born.
On the other hand, increases in the percentages of
total employment in manufacturing and finance,
insurance, and real estate correlate negatively with
income inequality within neighborhoods.
Between-neighborhood inequality shows a
similar positive and significant association with
the fraction of college graduates in the local population and with a number of quantities that did not
relate significantly to within-block group inequality: the percentages of the population accounted for
by women, individuals younger than age 24, and
the unemployment rate. Increases in these three
variables tend to be associated with increases in
the extent of income variation between different
block groups. In addition, between-neighborhood
inequality is significantly, negatively tied to the
fraction of the local population that is black, the
shares of total employment accounted for by construction and education services, and the extent
of union representation in the local labor force.


Table 4
Within- and Between-Neighborhood Inequality Results

                                                    Within-neighborhood                                Between-neighborhood
Variable                                            I                II               III              I                II               III
Log density                                         –0.069* (0.009)  –0.075* (0.01)   –0.064* (0.01)   –0.001 (0.008)   –0.01 (0.009)    –0.006 (0.008)
Percent bachelor’s degree                           n/a              0.38* (0.08)     0.35* (0.09)     n/a              0.16* (0.07)     0.17* (0.08)
Percent female                                      n/a              –0.006 (0.21)    –0.004 (0.23)    n/a              0.73* (0.20)     0.44* (0.17)
Percent black                                       n/a              0.24* (0.11)     0.24* (0.11)     n/a              –0.19* (0.10)    –0.21* (0.10)
Percent <24 years                                   n/a              0.10 (0.12)      0.06 (0.12)      n/a              0.25* (0.11)     0.17* (0.11)
Percent >65 years                                   n/a              0.37* (0.15)     0.22 (0.14)      n/a              –0.06 (0.16)     0.01 (0.15)
Percent foreign-born                                n/a              0.21* (0.10)     0.18* (0.09)     n/a              0.07 (0.06)      0.024 (0.06)
Percent manufacturing                               n/a              n/a              –0.27* (0.07)    n/a              n/a              –0.08 (0.05)
Percent agriculture, forestry, fishing, and mining  n/a              n/a              –0.13 (0.12)     n/a              n/a              0.09 (0.11)
Percent construction                                n/a              n/a              0.035 (0.12)     n/a              n/a              –0.32* (0.10)
Percent wholesale trade                             n/a              n/a              0.10 (0.17)      n/a              n/a              –0.19 (0.15)
Percent retail trade                                n/a              n/a              0.03 (0.10)      n/a              n/a              0.08 (0.09)
Percent finance, insurance, and real estate         n/a              n/a              –0.26* (0.14)    n/a              n/a              –0.20 (0.13)
Percent public administration                       n/a              n/a              –0.18 (0.13)     n/a              n/a              –0.16 (0.10)
Percent education services                          n/a              n/a              0.12 (0.15)      n/a              n/a              –0.39* (0.14)
Percent health services                             n/a              n/a              0.02 (0.13)      n/a              n/a              0.09 (0.11)
Unemployment rate                                   n/a              n/a              0.02 (0.09)      n/a              n/a              0.44* (0.09)
Percent union representation                        n/a              n/a              0.01 (0.05)      n/a              n/a              –0.13* (0.05)
R²                                                  0.17             0.23             0.28             0.66             0.68             0.72

NOTE: Data represent 718 observations. Dependent variables are the changes in within- and between-neighborhood income variation for a metropolitan area. Each regressor is expressed in terms of contemporaneous 10-year changes. All specifications also include three region dummies and a time effect for the 1980-90 decade. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. A cell marked n/a indicates that the regressor is not included in that specification. * Significant at ≥10 percent.


Why are there such differences in the associations of these variables with the two measures of
inequality? One possible explanation relates to how
residential patterns change with each quantity.
Increases in the fraction of black residents in a
metro area’s total population, for instance, may
be associated with increasing racial heterogeneity
within block groups (hence, higher within-neighborhood income variation), and as a consequence, declining heterogeneity between them
(thus, lower between-neighborhood variation).
Similarly, fluctuations in unemployment and union
membership may influence workers in particular
neighborhoods much more than a city’s general
population. This would lead to fluctuations in the
degree of inequality between neighborhoods rather
than within them.
For the variable of primary interest—population
density—the results demonstrate a clear, negative
association with the extent of income variation
within neighborhoods. As the change in population
density decreases by 1 SD in the cross section, the
change in (log) income variance within block groups
increases by approximately 1 percentage point.
(Recall that this magnitude is virtually identical
to the one estimated for overall income variation).
Given this finding, it is perhaps not surprising
that the estimated association between density and
between-neighborhood inequality is extremely
small. None of the three specifications produces a
statistically or economically significant coefficient
on the change in population density. Based on these
results, there is little evidence that urban decentralization is associated with rising income differentials
between neighborhoods. The negative association
between density and the variance of household
income observed in Table 3 seems to be driven
almost entirely by the change in within-neighborhood income differences.
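
The within/between split discussed above rests on the law of total variance: overall variance equals the household-weighted average of neighborhood variances plus the household-weighted variance of neighborhood means. The sketch below illustrates this with simulated household-level log incomes. That setup is an assumption, since the paper works from grouped Census income data, and the function name `decompose_variance` is hypothetical.

```python
import numpy as np

def decompose_variance(log_income, neighborhood):
    """Split the variance of log income into within- and between-neighborhood parts."""
    log_income = np.asarray(log_income, dtype=float)
    neighborhood = np.asarray(neighborhood)
    grand_mean = log_income.mean()
    within = 0.0
    between = 0.0
    for g in np.unique(neighborhood):
        y = log_income[neighborhood == g]
        share = y.size / log_income.size          # household share of neighborhood g
        within += share * y.var()                 # variance around the neighborhood mean
        between += share * (y.mean() - grand_mean) ** 2
    return within, between

rng = np.random.default_rng(1)
income = rng.normal(loc=10.5, scale=0.6, size=3000)   # simulated log incomes

# Fully segregated city: households sorted by income into 30 neighborhoods.
order = np.argsort(income)
segregated = np.empty(income.size, dtype=int)
segregated[order] = np.arange(income.size) // 100

# Mixed city: the same households assigned to neighborhoods at random.
mixed = rng.integers(0, 30, size=income.size)

w_seg, b_seg = decompose_variance(income, segregated)
w_mix, b_mix = decompose_variance(income, mixed)
print("segregated: within =", w_seg, "between =", b_seg)
print("mixed:      within =", w_mix, "between =", b_mix)
```

Because the same households appear in both layouts, total variance is identical. Reshuffling alone only moves variance between the two components.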

Instrumental Variables Estimates
One obvious criticism of this estimation is the
potential endogeneity of changes in density with
respect to changes in inequality. A rise in the degree
of income dispersion in a metro area, for example,
may induce residents to segregate further, possibly
leading to greater decentralization. It is not implausible that high-income households may seek to
move farther from low-income households as the
gap between the two groups increases.12
I use an instrumental variables (IV) estimation
to address this matter. I consider two different sets
of instruments for the change in density: (i) the
lagged level of density within a metropolitan area,
and (ii) lagged shares of employment in each of
the nine industries previously considered.
The rationale for each is straightforward. Initial density should capture a city’s capacity for increased
levels of density over time. With all else equal,
initially dense cities should be less likely to see
further increases in their densities because they
face greater space constraints.13 Because different
types of employers have different propensities to
decentralize their operations (e.g., Glaeser and
Kahn, 2004), initial industry shares should also
predict future changes in population density.
Weinberg (2004), for example, has exploited this
feature of industry location patterns to instrument
for job centralization in a study of spatial mismatch.
Of course, because initial density or sectoral
employment shares may be correlated with unobserved factors influencing subsequent changes in
inequality (e.g., density or the manufacturing share
in 1990 may be endogenous with respect to the
change in inequality between 1990 and 2000), I use
density and each industry share in 1980 to instrument for the change in density between 1990 and
2000.14
Table 5 shows the results using all three
inequality measures and all three specifications.
For the sake of conciseness, I have reported only
the coefficients on the change in density. The
results generally are very similar to the estimates
in Tables 3 and 4. Density and inequality are negatively related, and the association operates primarily through a within-neighborhood channel rather
than a between-neighborhood channel.

12

Rising income differentials, for example, may generate greater differences in the demand for certain local public goods or an increasing
desire to avoid “negative” neighborhood effects.

13

In fact, a strong negative connection exists between the initial level
of density in a metro area and the extent to which it decentralizes over
the next 10 years. A simple regression of the change in density on
its initial level in the data used here produces a coefficient (standard
error) of –0.04 (0.004) with a goodness-of-fit statistic equal to 0.14.

14

As demonstrated by the results from F tests of marginal significance
reported in Table 5, both sets of instruments are significant predictors
of the change in density between 1990 and 2000.

Table 5
Instrumental Variables Estimates

                                            IV (density)                  IV (industry shares)
Dependent variable                          I         II        III       I         II        III
Variance change log income distribution     –0.24*    –0.10*    –0.04     –0.07     –0.10*    –0.04
                                            (0.06)    (0.03)    (0.03)    (0.05)    (0.04)    (0.04)
Within-neighborhood inequality component    –0.20*    –0.11*    –0.066*   –0.07     –0.10*    –0.07*
                                            (0.05)    (0.03)    (0.03)    (0.05)    (0.04)    (0.04)
Between-neighborhood inequality component   –0.04     0.01      0.02      –0.003    0.001     0.03
                                            (0.03)    (0.02)    (0.03)    (0.04)    (0.03)    (0.03)
F test                                      40.2      95.1      88.03     5.26      9.79      8.72
                                            (0)       (0)       (0)       (0)       (0)       (0)

NOTE: Data represent 359 observations. Coefficients are for the change in log population density. Dependent variables are the changes
in the variance, the within-neighborhood component, and the between-neighborhood component between 1990 and 2000. Instruments
are log density or industry employment shares in 1980. Specifications follow data reported in Tables 3 and 4. Standard errors (reported
in parentheses, except for F tests) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms.
F test reports results from test of the (marginal) significance of the instruments from the first-stage regression for the appropriate specification (p-value under null that the IV coefficients are zero appears in parentheses).
* Significant at ≥10 percent.
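
To make the mechanics behind Table 5 concrete, the following Python sketch runs two-stage least squares with a single endogenous regressor and reports a first-stage F statistic for the excluded instruments. Everything here is an assumption for illustration: the function name `two_stage_least_squares` is hypothetical, the data are simulated, and the F statistic is the classical homoskedastic version rather than the robust version used in the paper.

```python
import numpy as np

def two_stage_least_squares(y, x_endog, exog, instruments):
    """2SLS with one endogenous regressor.

    Returns the coefficient on x_endog and the classical first-stage
    F statistic for the excluded instruments.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x_endog, dtype=float)
    W = np.asarray(exog, dtype=float)            # included exogenous regressors (with constant)
    Z = np.column_stack([W, instruments])        # full instrument set

    # First stage: regress the endogenous regressor on all instruments.
    pi, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ pi
    rss_unrestricted = np.sum((x - x_hat) ** 2)
    gamma, *_ = np.linalg.lstsq(W, x, rcond=None)
    rss_restricted = np.sum((x - W @ gamma) ** 2)
    q = Z.shape[1] - W.shape[1]                  # number of excluded instruments
    f_stat = ((rss_restricted - rss_unrestricted) / q) / (
        rss_unrestricted / (x.size - Z.shape[1])
    )

    # Second stage: replace the endogenous regressor with its fitted value.
    X2 = np.column_stack([x_hat, W])
    beta, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return beta[0], f_stat

# Simulated example: an unobserved shock moves both density and inequality.
rng = np.random.default_rng(2)
n = 359
lag_density = rng.normal(size=n)                 # stand-in for log density in 1980
confounder = rng.normal(size=n)                  # unobserved by the researcher
d_density = -0.5 * lag_density + confounder + rng.normal(size=n)
d_inequality = -0.10 * d_density + 0.3 * confounder + rng.normal(scale=0.5, size=n)

const = np.ones((n, 1))
iv_coef, first_stage_f = two_stage_least_squares(d_inequality, d_density, const, lag_density)
ols_coef = np.linalg.lstsq(np.column_stack([d_density, const]), d_inequality, rcond=None)[0][0]
print("OLS coefficient:", ols_coef)
print("IV coefficient:", iv_coef)
print("first-stage F:", first_stage_f)
```

With one instrument and one endogenous regressor, the 2SLS coefficient equals the ratio of the instrument's covariance with the outcome to its covariance with the regressor, which provides a simple check on the implementation.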

Other Measures of Between-Neighborhood Inequality
This section expands on the analysis of
between-neighborhood inequality by considering
how changes in metropolitan area density influence
some alternative measures of income differences
across block groups. In particular, how do differences among the 90th, 50th, and 10th percentiles
of the block group (average) household income
distribution within each metropolitan area change
as metropolitan areas decentralize?15 Although percentile differences are not typically used in studies
of neighborhood income inequality, they are commonly used to quantify inequality between individuals (e.g., Juhn, Murphy, and Pierce, 1993).
Table 6 shows the results from the same three
specifications considered above, each of which is
estimated by OLS and IV.16 Regardless of whether
the percentiles are computed in a weighted or
unweighted fashion (where the weights are given
by the number of households in each block group),
the estimated coefficients on density are quite
similar. The OLS results suggest that, instead of
decreases in density generating greater inequality
between neighborhoods, they may generate smaller
interneighborhood income differences.

15

On average, metropolitan areas in the sample contain 460 block
groups each (minimum = 27, maximum = 14,019), so calculating
percentiles is a reasonable exercise with these data.

16

Recall that in all cases, standard errors are adjusted for heteroskedasticity and within–metro area correlation.

This result, however, may be the product of
endogeneity, whereby some aspect of rising
between-neighborhood inequality may cause density to rise. For example, rising income segregation
between neighborhoods may be associated with
rising returns to the highly educated residents, who
may desire to live in traditional city centers (e.g.,
Brueckner and Rosenthal, 2008). This would create
an upward bias in a truly negative association
between density and inequality.
I consider, therefore, the use of IVs, which
produces a somewhat different set of conclusions.
These suggest little association between density
and the difference between the neighborhoods at
the 90th and 10th percentiles of the log income
distribution, which is consistent with the results
examining the between-neighborhood component
of total income variation documented above. When
separated into 90-50 and 50-10 differentials, however, the difference between the 90th percentile
and the median tends to increase significantly as
cities decentralize. At the same time, the difference
between the median and the 10th percentile appears
to decrease as a metro area’s population spreads out.
Indeed, the estimated associations between density
and the 50-10 gap are significantly positive when
initial density is used as an instrument for its future
change. When combined, of course, these two
observations are perfectly compatible with the
finding that the 90-10 differential shows little
association with changes in density.
This evidence suggests that, although there
seems to be little association between urban decentralization and measures of the overall degree of
income variation across different neighborhoods,
the same is not true for all parts of the income distribution. As city populations spread out, there
appears to be an increase in the average incomes
of neighborhoods at the top relative to the middle.
In particular, high-income households may segregate themselves to a larger extent as populations
spread out. On the other hand, the gap between
the average incomes at the middle of the distribution and those at the bottom shrinks, which may
reflect greater income mixing among middle- to
lower-income households.

Table 6
Alternative Measures of Between-Neighborhood Inequality

                                        OLS                           IV (density)                  IV (industry shares)
Dependent variable                      I         II        III       I         II        III       I         II        III
Unweighted 90-10 percentile difference  0.04      0.07*     0.10*     –0.30*    –0.055    0.06      –0.23     –0.20     –0.07
                                        (0.03)    (0.04)    (0.04)    (0.15)    (0.10)    (0.11)    (0.18)    (0.13)    (0.13)
Unweighted 90-50 percentile difference  0.02      0.035     0.06*     –0.41*    –0.20*    –0.13*    –0.23*    –0.24*    –0.15*
                                        (0.03)    (0.03)    (0.03)    (0.10)    (0.07)    (0.07)    (0.11)    (0.08)    (0.08)
Unweighted 50-10 percentile difference  0.02      0.03      0.03      0.09      0.14*     0.19*     –0.01     0.04      0.07
                                        (0.02)    (0.02)    (0.02)    (0.10)    (0.07)    (0.08)    (0.11)    (0.09)    (0.10)
Weighted 90-10 percentile difference    0.05      0.067*    0.09*     –0.35*    –0.07     0.01      –0.08     –0.14     0.002
                                        (0.04)    (0.04)    (0.036)   (0.13)    (0.09)    (0.09)    (0.17)    (0.13)    (0.13)
Weighted 90-50 percentile difference    0.02      0.027     0.06*     –0.40*    –0.19*    –0.16*    –0.10     –0.17*    –0.09
                                        (0.03)    (0.03)    (0.03)    (0.10)    (0.06)    (0.07)    (0.10)    (0.08)    (0.08)
Weighted 50-10 percentile difference    0.03      0.04*     0.03      0.06      0.12*     0.17*     0.01      0.03      0.09
                                        (0.02)    (0.02)    (0.03)    (0.10)    (0.06)    (0.07)    (0.10)    (0.08)    (0.08)

NOTE: Coefficients are for the change in log population density. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. Specifications follow data reported in Tables 3 and 4.
* Significant at ≥10 percent.
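
As an illustration of the percentile gaps discussed above, the sketch below computes 90-10, 90-50, and 50-10 differences in block group log average income, both unweighted and weighted by household counts. The data are simulated, the helper names are hypothetical, and the weighted percentile rule (the smallest value whose cumulative household share reaches the target) is one common convention, not necessarily the one used in the paper.

```python
import numpy as np

def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct percent."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum_share = np.cumsum(w) / w.sum()
    idx = np.searchsorted(cum_share, pct / 100.0)
    return v[min(idx, v.size - 1)]

def percentile_gaps(log_mean_income, households=None):
    """Return the 90-10, 90-50, and 50-10 gaps across block groups."""
    if households is None:
        p10, p50, p90 = np.percentile(log_mean_income, [10, 50, 90])
    else:
        p10, p50, p90 = (
            weighted_percentile(log_mean_income, households, p) for p in (10, 50, 90)
        )
    return {"90-10": p90 - p10, "90-50": p90 - p50, "50-10": p50 - p10}

# Hypothetical metro area with 460 block groups.
rng = np.random.default_rng(3)
log_mean_income = rng.normal(loc=10.6, scale=0.4, size=460)
households = rng.integers(200, 900, size=460)

unweighted = percentile_gaps(log_mean_income)
weighted = percentile_gaps(log_mean_income, households)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

By construction the 90-10 gap is the sum of the 90-50 and 50-10 gaps, which is why opposite-signed movements in the two halves can leave the 90-10 differential roughly unchanged.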

Table 7 shows a more detailed set of results
describing these associations; it reports the coefficients on the change in density in regressions in
which these three individual quantiles are specified as the dependent variables. The OLS results
again suggest that declining density may lead to
smaller income differences between block groups
because the estimated associations are positive and
increasing in moving from the 10th percentile to
the 90th. Hence, decreases in density ought to
reduce the average income at the top of the block
group distribution by more than they do at either
the middle or the bottom.
The OLS results may be biased, however
(again, because of the likely endogeneity of changes
in population density in relation to changes in
inequality). IVs, therefore, may offer more reliable
estimates. The IV results indicate that the 90th and
10th percentiles of the block group income distribution vary little with population density. Only two
of the 24 estimates for these two quantiles differ
statistically from zero. This finding is interesting
because it suggests that urban decentralization is
not associated with the top of the neighborhood
income distribution pulling away from the rest of
the distribution. It is also not associated with the
bottom of the income distribution falling farther
behind the remainder of the distribution. The
median, however, does show significantly positive
variation with density in most instances, suggesting
that urban decentralization may be associated with
a decline in the incomes of neighborhoods at the
middle of the distribution. This result, of course,
explains why the gap between the top and the middle
of the income distribution rises while the gap between
the middle and the bottom falls.

Table 7
Individual Quantile Results

                            OLS                           IV (density)                  IV (industry shares)
Dependent variable          I         II        III       I         II        III       I         II        III
Unweighted 90th percentile  0.26*     0.21*     0.17*     –0.18     0.01      0.02      –0.08     –0.02     0.01
                            (0.03)    (0.03)    (0.03)    (0.11)    (0.07)    (0.07)    (0.11)    (0.08)    (0.08)
Unweighted 50th percentile  0.24*     0.17*     0.10*     0.23*     0.21*     0.15*     0.15*     0.23*     0.16*
                            (0.03)    (0.02)    (0.02)    (0.08)    (0.05)    (0.05)    (0.07)    (0.06)    (0.06)
Unweighted 10th percentile  0.22*     0.14*     0.07*     0.13      0.07      –0.05     0.16      0.19*     0.09
                            (0.03)    (0.03)    (0.03)    (0.13)    (0.08)    (0.08)    (0.12)    (0.10)    (0.10)
Weighted 90th percentile    0.27*     0.20*     0.16*     –0.28*    –0.01     –0.04     –0.02     –0.02     –0.01
                            (0.04)    (0.03)    (0.03)    (0.12)    (0.07)    (0.06)    (0.11)    (0.08)    (0.08)
Weighted 50th percentile    0.24*     0.17*     0.10*     0.13      0.18*     0.12*     0.08      0.15*     0.08
                            (0.03)    (0.02)    (0.02)    (0.08)    (0.05)    (0.04)    (0.07)    (0.06)    (0.05)
Weighted 10th percentile    0.21*     0.13*     0.066*    0.07      0.06      –0.05     0.07      0.12      –0.01
                            (0.04)    (0.03)    (0.03)    (0.11)    (0.07)    (0.07)    (0.12)    (0.09)    (0.09)

NOTE: Coefficients are for the change in log population density. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. Specifications follow data reported in Tables 3 and 4.
* Significant at ≥10 percent.
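
The link between the individual-quantile results and the percentile-gap results can be checked arithmetically. OLS and IV estimators are linear in the dependent variable, so when the sample, regressors, and instruments are identical (assumed here), the coefficient for a percentile gap should equal the difference between the coefficients for the two percentiles, up to rounding. The short Python check below uses the unweighted coefficients as reported in Tables 6 and 7. The helper name is hypothetical, and because each figure is rounded to about two decimals, mismatches of up to roughly 0.015 are to be expected.

```python
# Coefficients on the change in log density, unweighted percentiles.
# Column order: OLS I-III, IV (density) I-III, IV (industry shares) I-III.
p90 = [0.26, 0.21, 0.17, -0.18, 0.01, 0.02, -0.08, -0.02, 0.01]
p50 = [0.24, 0.17, 0.10, 0.23, 0.21, 0.15, 0.15, 0.23, 0.16]
p10 = [0.22, 0.14, 0.07, 0.13, 0.07, -0.05, 0.16, 0.19, 0.09]

# Reported coefficients for the corresponding percentile differences.
gap_90_50 = [0.02, 0.035, 0.06, -0.41, -0.20, -0.13, -0.23, -0.24, -0.15]
gap_50_10 = [0.02, 0.03, 0.03, 0.09, 0.14, 0.19, -0.01, 0.04, 0.07]

def max_rounding_gap(upper, lower, reported_gap):
    """Largest absolute mismatch between (upper - lower) and the reported gap."""
    return max(abs((u - l) - g) for u, l, g in zip(upper, lower, reported_gap))

gap_top = max_rounding_gap(p90, p50, gap_90_50)
gap_bottom = max_rounding_gap(p50, p10, gap_50_10)
print(f"largest mismatch, 90-50: {gap_top:.3f}")
print(f"largest mismatch, 50-10: {gap_bottom:.3f}")
```

For example, under IV (density), specification I, the 90th percentile coefficient of –0.18 minus the median coefficient of 0.23 gives –0.41, matching the reported 90-50 coefficient.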

Inequality Within and Between Tracts
While the basic geographic unit of analysis in
this paper is the block group, many existing studies
of neighborhood-level economic outcomes have
typically focused on Census tracts, which represent a larger geographic area. The median Census
tract consists of approximately 1,649 households
and covers roughly 1.3 square miles compared with
526 households and 0.33 square miles for block
groups. Given the prevalence of tract-level analyses
in the literature on neighborhood outcomes, this
section considers whether the definition of neighborhoods as tracts, rather than block groups, alters
the results in any substantive way.17

Table 8 reports the coefficients on the change
in log density from every specification previously
estimated with block group–level observations,
now re-estimated at the tract level. In general,
the tract-level results yield very similar conclusions.
The extent of income inequality observed within
tracts shows a strong, negative association with
population density, whereas between-tract inequality shows little correlation with density.
With regard to the percentile differences, the
OLS results again suggest that, if anything, urban
decentralization may be associated with smaller
between-neighborhood gaps, not larger. The IV
estimates are mostly insignificant, although there
is some evidence that the gap between the top and
middle of the neighborhood income distribution
widens somewhat as population density declines.
As noted previously, this finding seems to reflect
a decrease in the median relative to the 90th percentile, which could be the product of greater mixing of medium- and low-income households in
suburban neighborhoods.
17

On average, metropolitan areas in the sample contain 147 tracts
each (minimum = 10, maximum = 4,507).


Table 8
Tract-Level Results

                                        OLS                           IV (density)                  IV (industry shares)
Dependent variable                      I         II        III       I         II        III       I         II        III
Within component                        –0.07*    –0.08*    –0.07*    –0.21*    –0.11*    –0.06*    –0.08     –0.10*    –0.06*
                                        (0.009)   (0.01)    (0.01)    (0.05)    (0.03)    (0.03)    (0.05)    (0.04)    (0.03)
Between component                       –0.001    –0.005    –0.001    –0.03     0.009     0.02      0.005     0.002     0.02
                                        (0.007)   (0.008)   (0.007)   (0.03)    (0.02)    (0.02)    (0.04)    (0.03)    (0.03)
Unweighted 90-10 percentile difference  0.037     0.067     0.08*     –0.17     –0.0002   0.05      –0.24     –0.26*    –0.25
                                        (0.04)    (0.04)    (0.04)    (0.15)    (0.11)    (0.12)    (0.19)    (0.15)    (0.17)
Unweighted 90-50 percentile difference  0.05      0.05      0.07*     –0.26*    –0.09     –0.11     –0.17     –0.18*    –0.15
                                        (0.03)    (0.03)    (0.036)   (0.10)    (0.07)    (0.08)    (0.12)    (0.10)    (0.10)
Unweighted 50-10 percentile difference  –0.01     0.01      0.007     0.10      0.09      0.16      –0.07     –0.08     –0.10
                                        (0.03)    (0.03)    (0.03)    (0.13)    (0.09)    (0.10)    (0.13)    (0.11)    (0.13)
Weighted 90-10 percentile difference    0.04      0.056     0.09*     –0.30*    –0.09     –0.05     –0.27     –0.27*    –0.17
                                        (0.04)    (0.04)    (0.04)    (0.17)    (0.12)    (0.14)    (0.18)    (0.15)    (0.16)
Weighted 90-50 percentile difference    0.05      0.05      0.08*     –0.27*    –0.11     –0.08     –0.25*    –0.25*    –0.21*
                                        (0.03)    (0.04)    (0.04)    (0.13)    (0.10)    (0.11)    (0.15)    (0.12)    (0.13)
Weighted 50-10 percentile difference    –0.01     0.005     0.01      –0.03     0.02      0.03      –0.01     –0.02     0.04
                                        (0.02)    (0.03)    (0.03)    (0.12)    (0.08)    (0.10)    (0.12)    (0.10)    (0.11)
Unweighted 90th percentile              0.30*     0.24*     0.19*     –0.05     0.10      0.01      –0.05     0.02      –0.04
                                        (0.04)    (0.04)    (0.04)    (0.11)    (0.07)    (0.08)    (0.13)    (0.10)    (0.10)
Unweighted 50th percentile              0.25*     0.18*     0.11*     0.21*     0.19*     0.12*     0.12      0.20*     0.10*
                                        (0.03)    (0.02)    (0.02)    (0.08)    (0.05)    (0.06)    (0.08)    (0.07)    (0.06)
Unweighted 10th percentile              0.26*     0.17*     0.11*     0.12      0.10      –0.04     0.19      0.28*     0.21*
                                        (0.04)    (0.03)    (0.03)    (0.13)    (0.09)    (0.09)    (0.14)    (0.12)    (0.12)
Weighted 90th percentile                0.27*     0.20*     0.17*     –0.15     0.03      –0.01     –0.14     –0.09     –0.11
                                        (0.04)    (0.03)    (0.04)    (0.13)    (0.09)    (0.10)    (0.14)    (0.11)    (0.12)
Weighted 50th percentile                0.23*     0.15*     0.08*     0.12      0.14*     0.07      0.11      0.16*     0.10*
                                        (0.03)    (0.02)    (0.02)    (0.08)    (0.06)    (0.06)    (0.08)    (0.06)    (0.06)
Weighted 10th percentile                0.23*     0.15*     0.07*     0.15      0.12      0.05      0.13      0.17*     0.07
                                        (0.04)    (0.03)    (0.03)    (0.13)    (0.08)    (0.10)    (0.11)    (0.10)    (0.11)

NOTE: Coefficients are for the change in log population density. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. Specifications follow data reported in Tables 3 and 4.
* Significant at ≥10 percent.


Thus, just as with block groups, urban decentralization tends to be accompanied by widening
income gaps within Census tracts. There is little
evidence that between-neighborhood income gaps
rise in sprawling cities.

CONCLUSION
City populations in the United States have
decentralized for more than a century. Although
the process was driven largely by the decisions of
individuals to live farther from historical city
centers, it has generated numerous concerns about
segregation of households by income. Given the
evidence documented in previous work and herein
that urban decentralization tends to be accompanied by significant increases in income inequality,
these concerns certainly seem warranted.
This paper has examined this issue further by
exploring the extent to which the increased income
inequality with decreasing density emanates from
a rise in the degree of income variation exhibited
across different neighborhoods. In general, the findings suggest that between-neighborhood income
gaps do not rise significantly as central cities spread
out. Neither the difference between the 90th and
10th percentiles of the block group–level income
distribution nor the degree of variation associated
with between-block group income differentials
rises (or falls) significantly as a metropolitan
area’s population spreads out. This result should
not be interpreted as suggesting that all between-neighborhood income differentials are completely
invariant to the outward movement of people in a
city. A rising gap between the absolute poorest
neighborhoods and the remainder of the metropolitan area may still exist. However, the extent
to which this potential gap contributes to overall
income inequality within a local market appears
decidedly small.
Instead, the rise of income dispersion as cities
decentralize is largely associated with an increase
in the degree of income heterogeneity within neighborhoods. One straightforward interpretation of
this result is that urban decentralization is associated with greater income mixing within neighborhoods, regardless of whether they are defined by
block groups or tracts. Because they are less densely

populated, for instance, suburban neighborhoods
may more readily accommodate households with
widely varying income levels than central cities,
where individuals reside in closer proximity. This
may be similar to the finding reported by Glaeser
and Kahn (2004) that suburbs are more racially
integrated than central cities.
Unfortunately, why overall income inequality
increases with urban decentralization remains
unresolved. If sprawling cities were simply reorganizing their populations from dense, segregated
collections of neighborhoods into less-dense, heterogeneous sets of neighborhoods, the rise in within-neighborhood inequality should be offset by a drop
in between-neighborhood inequality. The data show
little evidence of any such drop.
One possible explanation is that urban decentralization may be associated with greater industrial
heterogeneity (beyond what this analysis controls
for), at least in the sense that suburban areas might
have large numbers of particularly low-wage jobs,
high-wage jobs, or both. A large presence of jobs in
typically low-wage sectors, such as food services
and accommodation or retail trade, for example,
may contribute to higher inequality within neighborhoods. On a more speculative level, less-dense
suburban areas might be characterized by fewer
social interactions among individuals of different
groups, as defined by income or education. That
is, although suburban neighborhoods may have a
more heterogeneous mix of residents, the extent
of productive interaction among them may be relatively low. Following Glaeser (1999), this may
lead to greater income inequality as “less-skilled”
workers have fewer opportunities to learn from
their “more-skilled” counterparts.
At this point, both explanations are purely
hypothetical and, therefore, require further research.
Given the relative dearth of studies of the inequality–
urban decentralization issue, such research certainly seems worthwhile.

REFERENCES
Abramson, Alan J.; Tobin, Mitchell S. and
VanderGoot, Matthew R. “The Changing Geography
of Metropolitan Opportunity: The Segregation of the
Poor in Metropolitan Areas, 1970 to 1990.” Housing
Policy Debate, 1995, 6(1), pp. 45-72;
www.fanniemaefoundation.org/programs/hpd/pdf/hpd_0601_abramson.pdf.

Benabou, Roland. “Heterogeneity, Stratification, and
Growth: Macroeconomic Implications of Community
Structure and School Finance.” American Economic
Review, 1996, 86, pp. 584-609.

Brueckner, Jan K. and Rosenthal, Stuart S.
“Gentrification and Neighborhood Housing Cycles:
Will America’s Future Downtowns Be Rich?”
CESifo Working Paper Series No. 1579, University
of California-Irvine, April 2, 2008;
www.socsci.uci.edu/~jkbrueck/gentrification.pdf.

Epple, Dennis and Sieg, Holger. “Estimating Equilibrium
Models of Local Jurisdictions.” Journal of Political
Economy, August 1999, 107(4), pp. 645-81.

Fortin, Nicole M. and Lemieux, Thomas. “Institutional
Changes and Rising Wage Inequality: Is There a
Linkage?” Journal of Economic Perspectives, Spring
1997, 11(2), pp. 75-96.

Glaeser, Edward L. “Learning in Cities.” Journal of
Urban Economics, September 1999, 46(2), pp. 254-77.

Glaeser, Edward L. and Kahn, Matthew E. “Sprawl
and Urban Growth,” in J. Vernon Henderson and
Jacques-François Thisse, eds., Handbook of
Regional and Urban Economics, volume 4: Cities
and Geography (Handbooks in Economics), chapter
56. New York: Elsevier, 2004, pp. 2481-528.

Hardman, Anna and Ioannides, Yannis. “Neighbors’
Income Distribution: Economic Segregation and
Mixing in US Urban Neighborhoods.” Journal of
Housing Economics, December 2004, 13(4), pp. 368-82.

Hirsch, Barry T.; Macpherson, David A. and Vroman,
Wayne G. “Estimates of Union Density by State.”
Monthly Labor Review, July 2001, 124(7), pp. 51-55;
http://www.bls.gov/opub/mlr/2001/07/ressum2.htm.

Hobbs, Frank and Stoops, Nicole. “Demographic Trends
in the 20th Century.” US Census Bureau, Census 2000
Special Reports, Series CENSR-4 (November 2002).
Washington, DC: US Government Printing Office;
www.census.gov/prod/2002pubs/censr-4.pdf.

Holzer, Harry J. “The Spatial Mismatch Hypothesis:
What Has the Evidence Shown?” Urban Studies,
February 1991, 28(1), pp. 105-22.

Ihlanfeldt, Keith R. and Sjoquist, David L. “The Impact
of Job Decentralization on the Economic Welfare of
Central City Blacks.” Journal of Urban Economics,
July 1989, 26(1), pp. 110-30.

Ioannides, Yannis M. “Neighborhood Income
Distributions.” Journal of Urban Economics,
November 2004, 56(3), pp. 435-57.

Johnson, Norman L. and Kotz, Samuel. Continuous
Univariate Distributions. Boston: Houghton Mifflin,
1970.

Juhn, Chinhui; Murphy, Kevin M. and Pierce, Brooks.
“Wage Inequality and the Rise in Returns to Skill.”
Journal of Political Economy, 1993, 101(3), pp. 410-42.

Kain, John F. “Housing Segregation, Negro Employment,
and Metropolitan Decentralization.” Quarterly Journal
of Economics, May 1968, 82(2), pp. 175-97.

Kasarda, John D. “Inner-City Concentrated Poverty and
Neighborhood Distress: 1970-1990.” Housing Policy
Debate, 1993, 4(3), pp. 253-302.

Katz, Lawrence F. and Murphy, Kevin M. “Changes in
Relative Wages, 1963-1987: Supply and Demand
Factors.” Quarterly Journal of Economics, February
1992, 107(1), pp. 35-78.

Margo, Robert A. “Explaining the Postwar
Suburbanization of the Population of the United
States: The Role of Income.” Journal of Urban
Economics, May 1992, 31(3), pp. 301-10.

Mayer, Christopher J. “Does Location Matter?” New
England Economic Review, May/June 1996, pp. 26-40.

Weinberg, Bruce A. “Black Residential Centralization
and the Spatial Mismatch Hypothesis.” Journal of
Urban Economics, July 2000, 48(1), pp. 110-34.

Weinberg, Bruce A. “Testing the Spatial Mismatch
Hypothesis Using Inter-City Variations in Industrial
Composition.” Regional Science and Urban
Economics, September 2004, 34(5), pp. 505-32.

Wheeler, Christopher H. “Wage Inequality and Urban
Density.” Journal of Economic Geography, 2004,
4(4), pp. 421-37.

Wilson, William J. The Truly Disadvantaged: The
Inner City, the Underclass, and Public Policy. Chicago:
University of Chicago Press, 1987.

Yang, Rebecca and Jargowsky, Paul. “Suburban
Development and Economic Segregation in the
1990s.” Journal of Urban Affairs, June 2006, 28(3),
pp. 253-73.

APPENDIX
Income Categories Used in Analysis*

1980 Income Categories ($)    1990 Income Categories ($)    2000 Income Categories ($)
0-4,999                       0-4,999                       0-9,999
5,000-7,499                   5,000-9,999                   10,000-14,999
7,500-9,999                   10,000-12,499                 15,000-19,999
10,000-12,499                 12,500-14,999                 20,000-24,999
12,500-14,999                 15,000-17,499                 25,000-29,999
15,000-17,499                 17,500-19,999                 30,000-34,999
17,500-19,999                 20,000-22,499                 35,000-39,999
20,000-22,499                 22,500-24,999                 40,000-44,999
22,500-24,999                 25,000-27,499                 45,000-49,999
25,000-27,499                 27,500-29,999                 50,000-59,999
27,500-29,999                 30,000-32,499                 60,000-74,999
30,000-34,999                 32,500-34,999                 75,000-99,999
35,000-39,999                 35,000-37,499                 100,000-124,999
40,000-49,999                 37,500-39,999                 125,000-149,999
50,000-74,999                 40,000-42,499                 150,000-199,999
—                             42,500-44,999                 —
—                             45,000-47,499                 —
—                             47,500-49,999                 —
—                             50,000-54,999                 —
—                             55,000-59,999                 —
—                             60,000-74,999                 —
—                             75,000-99,999                 —
—                             100,000-124,999               —
—                             125,000-149,999               —

NOTE: * See footnote 6.
