Federal Reserve Bank of St. Louis
REGIONAL ECONOMIC DEVELOPMENT
VOLUME 4, NUMBER 1, 2008
Selected Papers from Federal Reserve Bank of St. Louis Economists

Contents

Editor’s Introduction
Thomas A. Garrett

Local Price Variation and Labor Supply Behavior
Dan A. Black, Natalia A. Kolesnikova, and Lowell J. Taylor

Regional Aggregation in Forecasting: An Application to the Federal Reserve’s Eighth District
Kristie M. Engemann, Rubén Hernández-Murillo, and Michael T. Owyang

The Economic Impact of a Smoking Ban in Columbia, Missouri: An Analysis of Sales Tax Data for the First Year
Michael R. Pakko

Urban Decentralization and Income Inequality: Is Sprawl Associated with Rising Income Segregation Across Neighborhoods?
Christopher H. Wheeler

Director of Research: Robert H. Rasche
Deputy Director of Research: Cletus C. Coughlin
Editor: Thomas A. Garrett
Center for Regional Economics–8th District (CRE8): Howard J. Wall (Director), Subhayu Bandyopadhyay, Cletus C. Coughlin, Thomas A. Garrett, Rubén Hernández-Murillo, Natalia A. Kolesnikova, Michael R. Pakko, Christopher H. Wheeler
Managing Editor: George E. Fortier
Editors: Judith A. Ahlers, Lydia H. Johnson
Graphic Designer: Donna M. Stiller

The views expressed are those of the individual authors and do not necessarily reflect official positions of the Federal Reserve Bank of St. Louis, the Federal Reserve System, or the Board of Governors.

Regional Economic Development is published occasionally by the Research Division of the Federal Reserve Bank of St. Louis and may be accessed through our web site: research.stlouisfed.org/regecon/publications/. All nonproprietary and nonconfidential data and programs for the articles written by Federal Reserve Bank of St. Louis staff and published in Regional Economic Development also are available to our readers on this web site.

General data can be obtained through FRED (Federal Reserve Economic Data), a database providing U.S. economic and financial data and regional data for the Eighth Federal Reserve District. You may access FRED through our web site: research.stlouisfed.org/fred.

Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Please send a copy of any reprinted, published, or displayed materials to George Fortier, Research Division, Federal Reserve Bank of St. Louis, P.O. Box 442, St. Louis, MO 63166-0442; george.e.fortier@stls.frb.org. Please note: Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis. Please contact the Research Division at the above address to request permission.

© 2008, Federal Reserve Bank of St. Louis. ISSN 1930-1979

Contributing Authors

Dan A. Black, University of Chicago and National Opinion Research Center, danblack@uchicago.edu
Kristie M. Engemann, Federal Reserve Bank of St. Louis, kristie.m.engemann@stls.frb.org
Thomas A. Garrett, Federal Reserve Bank of St. Louis, tom.a.garrett@stls.frb.org
Rubén Hernández-Murillo, Federal Reserve Bank of St. Louis, ruben.hernandez@stls.frb.org
Natalia A. Kolesnikova, Federal Reserve Bank of St. Louis, natalia.a.kolesnikova@stls.frb.org
Michael T. Owyang, Federal Reserve Bank of St. Louis, michael.t.owyang@stls.frb.org
Michael R. Pakko, Federal Reserve Bank of St. Louis, michael.r.pakko@stls.frb.org
Lowell J. Taylor, Carnegie Mellon University, lt20@andrew.cmu.edu
Christopher H. Wheeler, Federal Reserve Bank of St. Louis (formerly)

Editor’s Introduction
Thomas A. Garrett

The Center for Regional Economics–8th District (CRE8) at the Federal Reserve Bank of St. Louis sponsored the fourth annual meeting of the Business and Economics Research Group (BERG). This year’s meeting was part of the eighth annual Missouri Economics Conference held in Columbia, Missouri, in March 2008 and sponsored by the Federal Reserve Bank of St. Louis and the Department of Economics at the University of Missouri–Columbia.1

This issue of Regional Economic Development contains four research papers by St. Louis Fed economists, several of which were presented at the recent BERG meeting. Dan Black, Natalia Kolesnikova, and Lowell Taylor present empirical and theoretical evidence that labor supply decisions are not just a function of wages, as often assumed in empirical and theoretical models of labor supply, but also are dependent on the prices of other goods. Kristie Engemann, Rubén Hernández-Murillo, and Michael Owyang compare the predictive power of various forecasting models of employment that use different levels of data aggregation. Michael Pakko explores the economic impact of a smoking ban in Columbia, Missouri, using a time series of sales tax data for eating and drinking establishments.
Finally, Christopher Wheeler examines whether urban sprawl resulted in rising income segregation in 359 U.S. metropolitan areas over a 20-year period.

1 The agenda for the eighth annual Missouri Economics Conference, which includes sessions sponsored by BERG, can be found at http://research.stlouisfed.org/conferences/moconf/8th_annual_agenda.pdf.

Thomas A. Garrett is an assistant vice president and economist at the Federal Reserve Bank of St. Louis. Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), p. 1.

© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks.

Local Price Variation and Labor Supply Behavior
Dan A. Black, Natalia A. Kolesnikova, and Lowell J. Taylor

In standard economic theory, labor supply decisions depend on the complete set of prices: wages and the prices of relevant consumption goods. Nonetheless, most theoretical and empirical work in labor supply studies ignores prices other than wages. We address the question of whether the common practice of ignoring local price variation in labor supply studies is as innocuous as generally assumed. We describe a simple model to demonstrate that the effects of wage and nonlabor income on labor supply typically differ by location.
In particular, we show that the derivative of the labor supply with respect to nonlabor income is independent of price only when the labor supply takes a form based on an implausible separability condition. Empirical evidence demonstrates that the effect of price on labor supply is not a simple “up-or-down shift” that would be required to meet the separability condition in our key proposition. (JEL J01, J21, R23)

Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), pp. 2-14.

In standard economic theory, labor supply decisions depend on the complete set of prices: the wages and the prices of relevant consumption goods. Nonetheless, as Abbott and Ashenfelter (1976) noted some 30 years ago, economists generally have found it a useful abstraction, in both theoretical and empirical work, to ignore prices other than wages in labor supply studies. For example, none of the empirical results on labor supply discussed in the prominent reviews of Pencavel (1986), Killingsworth and Heckman (1986), or Blundell and MaCurdy (1999) are derived by procedures that account for variation in any price other than wages.1

However, most empirical work on labor does use national datasets of individuals who live in different locations and therefore face different prices for locally priced goods. These price differences can be quite large, especially for housing. For example, according to 1990 Census data, the median housing price in New York is more than three times that of the median housing price in Cleveland.2 The question addressed in this paper is whether the common practice of ignoring local price variation in labor supply studies is as innocuous as has generally been assumed.

1 Abbott and Ashenfelter’s (1976) evaluation of labor supply in the United States for the 1929-67 period exploits time-series changes in relative prices but does not evaluate possible impacts of cross-sectional variation (which, as they state, is “expected to be small”).
Some work conducts sensitivity analysis using Bureau of Labor Statistics information on the cost of living to “adjust” wages. See, for instance, DaVanzo, DeTray, and Greenberg (1973) and Masters and Garfinkel (1977).

2 Gabriel and Rosenthal (2004) and Chen and Rosenthal (forthcoming) show that massive housing price differences pertain across cities even after careful adjustment for quality.

Dan A. Black is a professor in the Harris School, University of Chicago, and a senior fellow at the National Opinion Research Center; Natalia A. Kolesnikova is an economist at the Federal Reserve Bank of St. Louis; and Lowell J. Taylor is a professor of economics and public policy at the Heinz School, Carnegie Mellon University.

To examine the issue, we first present a simple theoretical model: an economy in which people live in different locations with differing levels of a production or consumption amenity. Following logic familiar in urban economics (e.g., Roback, 1982), equilibrium prices will differ across locations. We demonstrate that labor supply behavior also can vary across locations.
Next, we demonstrate that, when prices vary across locations, local variation in prices can be safely ignored only when preferences take a very specific and peculiar form. We also show that the responsiveness of labor supply to wage changes will be the same across locations only if the responsiveness of labor supply to nonlabor income changes is the same across locations.

In our third step we evaluate the potential empirical importance of our theoretical observations. We present results obtained by using the Public Use Microdata Samples (PUMS) of the 1990 U.S. Census that examine labor supply in the nation’s 50 largest cities. We focus on the labor force participation and hours decisions of white married women aged 30 to 50, a group whose labor decisions are quite responsive to changes in wages and nonlabor income. In general, we analyze the basic “building block” empirical relationship that would underlie any empirical analysis of labor supply for this group: the relationship between nonlabor income and labor supply. Our innovation is examining this relationship for each of the 50 cities separately and demonstrating the significant systematic variation that exists among them.

We find that the basic correlation between labor supply and nonlabor income differs across cities. For example, women who have relatively high nonlabor income (primarily a husband’s income) work relatively fewer hours and have lower participation rates. An important observation, from our perspective, is that this anticipated negative relationship is substantially more pronounced in cities with inexpensive housing than in cities with expensive housing.

A MODEL OF LOCAL LABOR MARKETS WITH STONE-GEARY PREFERENCES

We begin our study by presenting a simple model of local price variation along the lines of Roback (1982) and Haurin (1980).
Locations differ based on two criteria: (i) A location may be inherently more pleasant (i.e., have a higher level of a “consumption amenity,” such as nice weather), or (ii) a location may be associated with inherently higher productivity (e.g., owing to the presence of a natural resource or an agglomeration of economies in production). For simplicity we restrict attention to cases in which people choose to live in one of two cities.

In contrast to the standard urban location models such as those of Roback (1982) or Haurin (1980), which fix labor supply as a constant, we allow labor supply to be a choice variable. Preferences are assumed to be Stone-Geary. This is a particularly transparent form of utility, and as Ashenfelter and Ham (1979) note, it is the simplest functional form of utility used in applied empirical work examining labor supply.3 We assume, in particular, that individual $i$ has utility $u_i$ as a function of a consumption good $x$, leisure $l$ (which is scaled so that $0 \le l \le 1$), and an amenity level $A_j$ (that is specific to location $j$), according to a simple Stone-Geary form as follows:

\[ u_i = \theta_{ij} A_j (x - c)^{\delta}\, l^{1-\delta}, \tag{1} \]

where $c$ and $\delta$ are parameters that are common across individuals and $\theta_{ij}$ is a positive idiosyncratic parameter that equals 1 for a typical individual, but allows for the possibility that person $i$ has a particular attraction, or distaste, for location $j$ (as $\theta_{ij}$ is greater than, or less than, 1).

3 See also Blundell and MaCurdy (1999) for a discussion of the Stone-Geary form, as well as other forms used in applied work on labor supply.

A person living in location $j$ maximizes utility subject to a budget constraint, $p_j x = w_j (1 - l) + N$, where $p_j$ is the price for the local consumption good, $w_j$ is the local wage, and $N$ is nonlabor income. Assuming an interior solution pertains, demand for leisure and for the consumption good are, respectively,

\[ l(w_j, p_j) = \frac{(1-\delta)(N + w_j - c p_j)}{w_j}, \tag{2} \]

and

\[ x(w_j, p_j) = \frac{\delta (N + w_j - c p_j)}{p_j} + c. \tag{3} \]

Substituting equations (2) and (3) into equation (1) gives indirect utility for person $i$ in location $j$,

\[ V_{ij} = \theta_{ij} A_j\, \delta^{\delta} (1-\delta)^{1-\delta}\, \frac{N + w_j - c p_j}{p_j^{\delta}\, w_j^{1-\delta}}. \tag{4} \]

In equilibrium each individual chooses to live in the location that yields the highest level of utility. There are two locations: $j = 1$ or 2. We present two cases: one with differing consumption amenities and one with differing levels of productivity in the locations.

Case 1: Differing Levels of the Consumption Amenity

Suppose there is general agreement that Location 1 is nicer than Location 2, $A_1 > A_2$, and for the moment assume further that there are no idiosyncratic differences in opinion about location, so that $\theta_{ij} = 1$ for all individuals. Because workers are equally productive in the two locations, wages $w_1$ and $w_2$ must be the same, say $w$.4 In an equilibrium in which people live in both locations, we must have $V_{i1} = V_{i2}$, so using equation (4), it is clear that $p_1$ and $p_2$ must solve

\[ \frac{A_1 (N + w - c p_1)}{p_1^{\delta}\, w^{1-\delta}} = \frac{A_2 (N + w - c p_2)}{p_2^{\delta}\, w^{1-\delta}}. \tag{5} \]

4 For simplicity, we are implicitly assuming that labor is the only factor of production, so that firms will be indifferent in hiring if the wage is the same in the two cities. This would not be true, for example, if land were a major factor of production and land prices differed in the two cities.

Inspection of equation (5) confirms the intuitive result that $p_1 > p_2$: The local consumption good is more expensive in Location 1, the high-amenity city. This logic continues to hold if we add back the idiosyncratic taste component to utility. If for the marginal individual $\theta_{i1} = \theta_{i2} = 1$, equation (5) still characterizes equilibrium prices. In this instance, however, some individuals will have a strict preference with regard to location. For example, an individual with $\theta_{i1} > \theta_{i2}$ will have a strict preference for Location 1 over Location 2.

We turn next to labor supply. Let $h$ be the fraction of time that a person works, $h = 1 - l$. From equation (2), we have

\[ h(w, p_j) = \frac{\delta w - (1-\delta)(N - c p_j)}{w}. \tag{6} \]

Although wages are the same in both locations, the labor supply differs. In this example, $h(w, p_1) > h(w, p_2)$; individuals supply more labor when they work in the more expensive city.

Suppose instead the focus is on the effect of a wage change in a local labor market (studying people who would not move in response to a small change in the wage)5:

\[ \frac{\partial h(w, p_j)}{\partial w} = \frac{(1-\delta)(N - c p_j)}{w^2}. \tag{7} \]

Notice that in this example, the responsiveness of the labor supply to a wage change is greater in the inexpensive city than in the expensive city,

\[ \frac{\partial h(w, p_2)}{\partial w} > \frac{\partial h(w, p_1)}{\partial w}. \]

In contrast, if we focus on how a change in nonlabor income affects labor supply,

\[ \frac{\partial h(w, p_j)}{\partial N} = -\frac{1-\delta}{w}, \tag{8} \]

we find that the relationship is independent of the local price; that is, it can be written as $\partial h(w)/\partial N$.

5 In general, if the wage increases in a labor market, this factor can attract new individuals to that location. Here, we are interested in the effect on the labor supply of individuals who are already in the market, for example, people who have an idiosyncratic taste for that location.

Case 2: Differing Levels of Productivity

Now suppose that Locations 1 and 2 are viewed as equally pleasant, $A_1 = A_2$, but productivity is higher in Location 1 than in Location 2, so that $w_1 > w_2$.
The equilibrium condition corresponding to equation (5), that the marginal individual is indifferent between locations (i.e., $V_{i1} = V_{i2}$), is then

\[ \frac{N + w_1 - c p_1}{p_1^{\delta}\, w_1^{1-\delta}} = \frac{N + w_2 - c p_2}{p_2^{\delta}\, w_2^{1-\delta}}. \tag{9} \]

As for labor supply, in city $j$,

\[ h(w_j, p_j) = \frac{\delta w_j - (1-\delta)(N - c p_j)}{w_j}. \tag{10} \]

In general, labor supply differs in the two locations, but even with $p_1 > p_2$ and $w_1 > w_2$ the location that will have the larger labor supply cannot be predicted. Similarly, in general

\[ \frac{\partial h(w_1, p_1)}{\partial w} \neq \frac{\partial h(w_2, p_2)}{\partial w}, \]

and we cannot determine in which city the labor supply is more responsive to wage changes. On the other hand, in this example the derivative of labor supply with respect to nonlabor income,

\[ \frac{\partial h(w_j, p_j)}{\partial N} = -\frac{1-\delta}{w_j}, \tag{11} \]

turns out to be independent of $p_j$. Because in equilibrium the high-productivity city has relatively higher wages, however, we expect to observe that $\partial h/\partial N$ will be smaller (in absolute value) in the expensive city.

Our examples illustrate two important points. First, cross-sectional variation in wages and prices may be associated with variation in labor supply, although that cross-sectional variation is of no value for understanding the behavioral effect of wage changes on labor supply. For instance, in our Case 2, even if in both cities

\[ \frac{\partial h(w_j, p_j)}{\partial w} > 0, \]

identical individuals may well supply less labor in the high-wage city than in the low-wage city, depending on the local price-wage combination. Second, the responsiveness of labor supply to changes in the wage or nonlabor income typically varies across locations.

WHEN DOES PRICE VARIATION MATTER FOR LOCAL LABOR SUPPLY?

As noted previously, housing prices vary widely across U.S. cities, presumably because of differences in consumption or production amenities across these locations.
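As a numerical check on the Stone-Geary examples of the previous section, the labor supply function in equations (6) and (10) and its derivatives in equations (7), (8), and (11) can be evaluated directly. The sketch below is only illustrative: the function names and all parameter values (delta, c, N, w, and the two prices) are assumptions chosen so that an interior solution holds, and the prices are not solved from the equilibrium conditions (5) or (9).

```python
def hours(w, p, N, delta, c):
    """Stone-Geary labor supply, equations (6)/(10):
    h = [delta*w - (1 - delta)*(N - c*p)] / w."""
    return (delta * w - (1 - delta) * (N - c * p)) / w


def dh_dw(w, p, N, delta, c):
    """Wage response, equation (7): (1 - delta)*(N - c*p) / w**2."""
    return (1 - delta) * (N - c * p) / w ** 2


def dh_dN(w, delta):
    """Nonlabor-income response, equations (8)/(11): -(1 - delta) / w.
    The local price p does not appear."""
    return -(1 - delta) / w


# Hypothetical values (not taken from the article): same wage in both
# cities, as in Case 1, with city 1 the expensive one.
delta, c, N, w = 0.5, 0.1, 0.2, 1.0
p_expensive, p_inexpensive = 1.5, 1.0

for label, p in (("expensive", p_expensive), ("inexpensive", p_inexpensive)):
    print(label,
          "h =", hours(w, p, N, delta, c),
          "dh/dw =", dh_dw(w, p, N, delta, c),
          "dh/dN =", dh_dN(w, delta))
```

With these assumed values, hours are higher and the wage response is smaller in the expensive city, while the nonlabor-income response is the same in both, which is the pattern described for Case 1.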
The examples in the previous section indicate that labor supply varies across locations even in the unusually simple and transparent case of Stone-Geary preferences. We now turn to a more systematic investigation of conditions on preferences under which price and income effects on labor supply do not depend on location. As is common in the literature, attention is restricted to the case of quasi-homothetic preferences (of which Stone-Geary is a special case).6 Given this common simplification, what further restrictions are necessary to allow investigators to ignore variation across locations when examining labor supply?7

6 Quasi-homothetic preferences are useful because they preserve a linear expansion path of homothetic preferences, but they do not require the path to go through the origin. Thus, under quasi-homothetic preferences, income elasticities of demand need not equal 1, as is the case with homothetic preferences.

7 We could attempt to analyze cases that are even more general, but as we shall see, matters are sufficiently discouraging even for the quasi-homothetic case.

Under quasi-homothetic preferences, indirect utility takes the form

\[ V(p, w, N) = \alpha(p, w) + (N + w)\,\beta(p, w), \tag{12} \]

where, as before, $p$ is the local price, $w$ is the local wage, and $N$ is the nonlabor income. Using Roy’s identity we derive the demand for leisure

\[ l(p,w,N) - 1 = -\frac{\partial V/\partial w}{\partial V/\partial N} = -\frac{\alpha_w(p,w) + \beta(p,w) + (N+w)\,\beta_w(p,w)}{\beta(p,w)} = -\frac{\alpha_w(p,w) + (N+w)\,\beta_w(p,w)}{\beta(p,w)} - 1, \]

\[ l(p,w,N) = -\frac{\alpha_w(p,w) + (N+w)\,\beta_w(p,w)}{\beta(p,w)}. \tag{13} \]

It then follows that hours of labor supply are

\[ h(p,w,N) = 1 - l(p,w,N) = 1 + \frac{\alpha_w(p,w) + (N+w)\,\beta_w(p,w)}{\beta(p,w)} = a(p,w) + (N+w)\,b(p,w), \tag{14} \]

where

\[ a(p,w) = 1 + \frac{\alpha_w}{\beta}, \qquad b(p,w) = \frac{\beta_w}{\beta}. \]

Consider the effect of the change in nonlabor income on the labor supply,

\[ \frac{\partial h}{\partial N} = b(p,w) = \frac{\beta_w(p,w)}{\beta(p,w)}. \]

Obviously, $\partial h/\partial N$ is independent of $p$ (and thus is the same across locations) if and only if $b(p,w) \equiv b(w)$. The following claim provides the condition under which this holds:

Claim

\[ \frac{\beta_w(p,w)}{\beta(p,w)} = b(w) \iff \beta(p,w) = \beta_1(p)\,\beta_2(w). \]

Proof. The proof of sufficiency is trivial. To prove necessity, we have

\[ \frac{\beta_w(p,w)}{\beta(p,w)} = b(w), \]
\[ \frac{\partial \ln \beta(p,w)}{\partial w} = b(w), \]
\[ \ln \beta(p,w) = \int b(w)\,dw + c(p), \]
\[ \beta(p,w) = e^{\int b(w)\,dw + c(p)} = \beta_1(p)\,\beta_2(w), \]

where $\beta_1(p) = e^{c(p)}$ and $\beta_2(w) = e^{\int b(w)\,dw}$.

The above observations can be summarized as follows:

Proposition 1 When preferences are quasi-homothetic, $\partial h/\partial N$ is independent of location if and only if preferences satisfy a separability condition $\beta(p,w) = \beta_1(p)\,\beta_2(w)$.

Next consider the response of labor supply to wage changes,

\[ \frac{\partial h}{\partial w} = a_w(p,w) + b(p,w) + (N+w)\,b_w(p,w). \]

Again, the goal is to derive conditions under which $\partial h/\partial w$ does not depend on local prices, $p$. If $b(p,w) = b(w)$, as above, then the only other necessary condition is that $a_w(p,w)$ be independent of $p$. Now $a_w(p,w)$ is independent of $p$ if and only if it is equal to some function of $w$ only: $a_w(p,w) = f(w)$. Integrating both parts with respect to $w$, we get $a(p,w) = F(w) + c(p)$. Then the supply of hours of work takes an additively separable form, $h(p,w,N) = c(p) + F(w) + (N+w)\,b(w)$. We have established, therefore,

Proposition 2 When preferences are quasi-homothetic, $\partial h/\partial N$ and $\partial h/\partial w$ are independent of location if and only if labor supply has the additively separable form

\[ h(p,w,N) = c(p) + F(w) + (N+w)\,b(w). \tag{15} \]

Notice that in equation (15) the effect of local price variation is to simply shift the labor supply function up or down. In this case, it might suffice to merely incorporate location-specific dummies when estimating labor supply functions.8 Without this separability, however, local price variation would have a fundamental impact on the shape of the labor supply function itself.

8 In fact, in empirical work on labor supply, researchers generally do not even take this simple step.

These two propositions demonstrate that even in the simple case of quasi-homothetic preferences, rather strong conditions are necessary for location-independent labor supply responses to income and wage changes. The Stone-Geary example used in the previous section illustrates this point. Indirect utility can be written in the form $V = \alpha(p,w) + (N+w)\,\beta(p,w)$, where

\[ \alpha(p,w) = -\frac{c\,p\,\theta A\,\delta^{\delta}(1-\delta)^{1-\delta}}{p^{\delta}\, w^{1-\delta}} = -c\,p^{1-\delta}\,\theta A\,\delta^{\delta}(1-\delta)^{1-\delta} \cdot \frac{1}{w^{1-\delta}}, \tag{16} \]

\[ \beta(p,w) = \frac{\theta A\,\delta^{\delta}(1-\delta)^{1-\delta}}{p^{\delta}\, w^{1-\delta}} = \frac{\theta A\,\delta^{\delta}(1-\delta)^{1-\delta}}{p^{\delta}} \cdot \frac{1}{w^{1-\delta}}. \tag{17} \]

Since $\beta(p,w)$ is separable in $p$ and $w$, the separability condition of Proposition 1 is satisfied. Recall from equation (6) that

\[ h(p,w,N) = \frac{\delta w - (1-\delta)(N - c p)}{w}. \]

Obviously, this function does not have an additively separable form as required in Proposition 2. So it is not surprising that the derivative of labor supply with respect to nonlabor income, $N$,

\[ \frac{\partial h}{\partial N} = -\frac{1-\delta}{w}, \]

is independent of $p$, whereas the derivative of labor supply with respect to the wage, $w$,

\[ \frac{\partial h}{\partial w} = \frac{(1-\delta)(N - c p)}{w^2}, \]

depends on $p$.

As noted earlier, labor supply studies generally focus on the responsiveness of labor supply to changes in wages. Here, we want to evaluate how price variations, in addition to changes in wages, affect the results. The ideal experiment would be one in which wages are exogenously shifted in each of many different U.S. cities and in which changes in labor supplied in each city can be traced. Finding data that correspond to such an experiment is a formidable task. The following work instead focuses exclusively on the sensitivity of labor supply to nonlabor income. We can justify this focus with the following result:

Proposition 3 If the key relationship $\partial h/\partial w$ is independent of $p$, then $\partial h/\partial N$ is independent of $p$.

In general, labor supply, $h(p,w,F)$, depends on the price of the local good, the wage, and full income, $F = w + N$.9 To prove this proposition we consider first the effect of a change in nonlabor income on labor supply:

\[ \frac{\partial h(p,w,F)}{\partial N} = \frac{\partial h(p,w,F)}{\partial F} \cdot \frac{\partial F}{\partial N} = \frac{\partial h(p,w,F)}{\partial F}. \]

This is independent of price, $p$, if and only if

\[ \frac{\partial h(p,w,F)}{\partial F} = G(w,F). \tag{18} \]

Integrating both sides of equation (18), we then notice that labor supply must have the following additively separable form:

\[ h(p,w,F) = g(w,F) + c(p,w) = g(w, w+N) + c(p,w). \tag{19} \]

Similarly, the effect of the change in the wage on labor supply does not depend on $p$ if and only if

\[ \frac{\partial h(p,w,F)}{\partial w} = Q(w,F), \tag{20} \]

or, integrating both sides of equation (20),

\[ h(p,w,F) = q(w,F) + k(p) = q(w, w+N) + k(p). \tag{21} \]

Compare the additive separability requirements shown in equations (19) and (21). The latter takes the same basic form but is more restrictive. It follows that when $\partial h/\partial w$ is independent of the local price, $p$, $\partial h/\partial N$ is independent of the local price, $p$.

9 Recall that full-time work entails $h = 1$, so that the maximum possible labor income is $w$, making full income $w + N$.
EMPIRICAL RESULTS

The theoretical considerations outlined in the preceding section suggest that unless preferences are strongly restricted, the responsiveness of labor supply to nonlabor income and to the wage will vary across locations. It is possible, of course, that the differences are insignificant and do not pose a problem for empirical work. We examine this possibility using a dataset of married white women, a group that is likely to have substantial variation in labor supply (e.g., in response to differences in wage, nonlabor income, and possibly local prices).

Data used in the analysis are from the 1990 PUMS10; data include married non-Hispanic white women, aged 30 to 50, who live in the 50 largest metropolitan statistical areas (MSAs) in the United States. One goal of this exploration is to see if there are any systematic differences in labor supply related to differences in local prices.

10 Data were provided by the Minnesota Population Center (Ruggles et al., 2004).

We consider the relationship between labor supply and nonlabor income; the latter term is defined as family income minus the woman’s own total income. Given previous research on married women’s labor supply, an inverse relationship would be expected between nonlabor income and labor supply (i.e., leisure is likely a “normal good”). The question here is whether that relationship differs in a systematic way across cities.

Examining the relationship between nonlabor income and married women’s labor supply in cross section is far from “state of the art” in estimating labor supply. Still, it seems a reasonable first pass at the issue, especially given that our focus is not on any estimated relationship per se but on differences in the relationships in expensive and inexpensive urban areas.

In our investigation of the differences in the response of labor supply to the change in nonlabor income, we do not want to specify any parametric form because of concerns that results might be sensitive to the functional form.11 Instead, we use a nonparametric matching estimator. Two measures of labor supply are used: annual hours of work and an employment participation dummy variable.12

The data do not allow us to perform this analysis for each city because they do not provide enough support. Instead, we divide the sample roughly into thirds and examine differences between the most “expensive” cities (the 17 MSAs within the top one-third of housing prices) and “inexpensive” cities (the 17 MSAs with the lowest housing prices).

Our comparison of married women’s labor supply in inexpensive and expensive cities then follows three additional steps. The first step is to divide households into deciles according to “nonlabor income” (which is predominately the husband’s income). Then within each decile we compare the labor supply of women who live in the expensive cities relative to the labor supply of women who live in inexpensive cities. The goal is to compare the labor supply of otherwise similar women, so we use an estimator that matches women with exactly the same age and level of education. Separate analyses also are conducted for women with high school education and college education. Thus, the second step is to match women living in an expensive city with corresponding women living in inexpensive cities (i.e., we match women in each nonlabor income decile, $d_i$ ($i = 1, \ldots, 10$), with age and education vector $x = X$, to women with these same characteristics living in inexpensive cities). In the analysis that centers on annual work hours, this is

\[ \Delta(X, d_i) = E(h_1 \mid x = X, d_i) - E(h_0 \mid x = X, d_i), \tag{22} \]

where $h_1$, $h_0$ are annual hours of work in expensive and inexpensive cities, respectively.
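The exact-matching comparison in equation (22) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout (keys `decile`, `age`, `educ`, `expensive`, `hours`) and both function names are hypothetical, cells lacking women from either city type are simply dropped, and the decile-level summary weights each matched cell by its pooled sample count as a stand-in for a population distribution of characteristics.

```python
from collections import defaultdict
from statistics import mean


def cell_differences(records):
    """Mean hours in expensive minus inexpensive cities within each
    (decile, age, education) cell, as in equation (22).

    Returns {cell: (difference, number of women in the cell)} for cells
    observed in both city types (common support)."""
    cells = defaultdict(lambda: {True: [], False: []})
    for r in records:
        key = (r["decile"], r["age"], r["educ"])
        cells[key][r["expensive"]].append(r["hours"])
    diffs = {}
    for key, groups in cells.items():
        if groups[True] and groups[False]:
            delta = mean(groups[True]) - mean(groups[False])
            diffs[key] = (delta, len(groups[True]) + len(groups[False]))
    return diffs


def decile_average(diffs, decile):
    """Count-weighted average of the cell differences within one decile.
    The weighting scheme is an assumption made for this sketch."""
    num = den = 0
    for (d, _age, _educ), (delta, n) in diffs.items():
        if d == decile:
            num += delta * n
            den += n
    return num / den if den else None
```

A call such as `decile_average(cell_differences(records), 9)` would then give a single hours difference for the ninth nonlabor income decile; replacing `hours` with a 0/1 participation indicator gives the analogous difference in participation rates.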
In the absence of selection, this might be taken to be the causal effect on labor supply (measured in hours per year) of living in an expensive city relative to an inexpensive city. The third step is to average the quantity in equation (22) across all women in each decile $d_i$:

(23)   $\Delta_n(d_i) = \int \Delta(x, d_i)\, dF_n(x \mid d_i)$,

where $dF_n(x \mid d_i)$ is the national distribution of $x$ in the decile $d_i$. The analysis is repeated using a second measure of labor supply, a labor force participation dummy variable. When these empirical exercises are performed separately for women with a high school diploma and those with a college degree, $x$ is simply an age vector.

11 See, for example, DaVanzo, DeTray, and Greenberg (1973).

12 We also repeated the analysis with several other measures of labor force participation, such as an indicator of full-time employment. The results remain essentially the same.

Results are reported in Table 1. The difference in annual hours of work between women living in expensive and inexpensive cities is substantial (and statistically significant) for many of the nonlabor income deciles. For example, ninth-decile women in expensive cities work considerably longer hours than corresponding women in inexpensive cities. College-educated women in this decile average 129 more work hours, whereas women with a high school education work an average of 89 hours more.

Table 1
Differences in Annual Hours and Participation Rates Between Expensive and Inexpensive Locations by Nonlabor Income Deciles

          All women                           Women with a high school diploma    Women with a college degree
Nonlabor  Change in         Change in         Change in         Change in         Change in         Change in
income    annual hours      participation     annual hours      participation     annual hours      participation
decile                      rates                               rates                               rates
1         –117.34 (14.23)   –0.04 (0.0065)    –136.1 (24.57)    –0.04 (0.012)     –78.08 (34.88)    –0.02 (0.016)
2         –75.46 (14.32)    –0.01 (0.0063)    –75.72 (24.36)    0.00 (0.011)      –99.43 (36.47)    –0.02 (0.016)
3         –54.14 (13.74)    –0.01 (0.0060)    –19.42 (23.39)    0.00 (0.012)      –46.71 (33.98)    –0.01 (0.015)
4         –15.14 (13.88)    0.00 (0.0062)     –28.97 (23.63)    –0.01 (0.012)     –20.59 (37.16)    0.00 (0.016)
5         –20.68 (13.31)    0.01 (0.0063)     –51.79 (24.14)    0.00 (0.012)      –13.31 (34.57)    0.03 (0.015)
6         2.59 (13.66)      0.02 (0.0068)     –39.52 (24.14)    0.00 (0.013)      59.98 (31.66)     0.05 (0.015)
7         12.47 (14.38)     0.01 (0.0072)     –16.11 (24.79)    0.00 (0.013)      85.6 (30.99)      0.03 (0.015)
8         83.55 (14.62)     0.05 (0.0076)     81.95 (26.78)     0.05 (0.014)      139.38 (30.24)    0.08 (0.015)
9         83.61 (15.80)     0.04 (0.0083)     88.98 (33.44)     0.03 (0.017)      128.59 (30.84)    0.06 (0.016)
10        82.59 (18.45)     0.04 (0.0098)     15.74 (41.52)     0.00 (0.023)      172.35 (28.04)    0.07 (0.015)

NOTE: Authors’ calculations, based on 5 percent 1990 PUMS data. The sample consists of white, non-Hispanic married women, aged 30 to 50. Bootstrapped standard errors using 999 replications are reported in parentheses.

Figure 1
Variation Between Expensive and Inexpensive Locations in Annual Hours and Participation Rates, by Nonlabor Income Decile
[Four panels: annual hours and participation rates, each shown for high school graduates and for college graduates. Horizontal axis: nonlabor income deciles (1 to 10). Series: expensive locations and inexpensive locations.]

An apparent and striking pattern is shown in Table 1 and Figure 1. First, as might be expected, among these married women, leisure appears to be a normal good; women with higher levels of outside income generally work fewer hours per year and have lower labor force participation rates. More important, for our purposes, is that the relationship between nonlabor income and labor supply is quite different for expensive and inexpensive cities. At the very lowest levels of nonlabor income (e.g., deciles 1 and 2), women in expensive cities have lower labor supply than women in inexpensive cities. The opposite is essentially true for women in the high nonlabor income deciles; among women with high nonlabor income, labor force participation and average hours worked are higher in expensive cities than in inexpensive cities.

In short, the labor/leisure choice appears to not conform to the additively separable form described in Proposition 2; local prices do not merely shift labor supply up or down. The derivative $\partial h / \partial N$ is generally negative (at least beyond the lowest decile levels of $N$) and is smaller (in absolute value) in the expensive city. This generalization holds true for both high school– and college-educated women.
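The three-step matching procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the record layout, and the toy data are hypothetical; deciles are taken as already assigned; cells with no exact match in the other group are dropped; and pooled cell counts stand in for the national distribution $dF_n(x \mid d_i)$.

```python
from collections import defaultdict
from statistics import mean


def matched_gap_by_decile(rows):
    """Average expensive-minus-inexpensive gap in an outcome, by decile.

    Each row is a dict with keys 'expensive' (bool), 'decile' (int),
    'age' (int), 'educ' (str) and 'y' (hours, or a 0/1 participation dummy).
    """
    # Steps 1-2: group outcomes by decile and exact (age, education) cell.
    cells = defaultdict(lambda: {True: [], False: []})
    for r in rows:
        cells[(r["decile"], r["age"], r["educ"])][r["expensive"]].append(r["y"])

    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for (decile, _age, _educ), groups in cells.items():
        expensive, inexpensive = groups[True], groups[False]
        if not expensive or not inexpensive:
            continue  # no exact match: the cell lacks common support
        delta = mean(expensive) - mean(inexpensive)  # cell gap, as in equation (22)
        weight = len(expensive) + len(inexpensive)   # assumed stand-in for dF_n
        weighted_sum[decile] += weight * delta
        total_weight[decile] += weight

    # Step 3: weighted average of cell gaps within each decile, as in equation (23).
    return {d: weighted_sum[d] / total_weight[d] for d in sorted(total_weight)}


# Hypothetical toy records (not drawn from the PUMS sample).
toy = [
    {"expensive": True, "decile": 1, "age": 35, "educ": "hs", "y": 1000},
    {"expensive": True, "decile": 1, "age": 35, "educ": "hs", "y": 1200},
    {"expensive": False, "decile": 1, "age": 35, "educ": "hs", "y": 1300},
    {"expensive": True, "decile": 1, "age": 40, "educ": "hs", "y": 900},
    {"expensive": False, "decile": 1, "age": 40, "educ": "hs", "y": 1000},
    {"expensive": False, "decile": 1, "age": 40, "educ": "hs", "y": 1100},
    {"expensive": False, "decile": 1, "age": 40, "educ": "hs", "y": 900},
    {"expensive": True, "decile": 1, "age": 45, "educ": "hs", "y": 2000},  # unmatched
]
gaps = matched_gap_by_decile(toy)
print(gaps)
```

In the toy data the two matched cells have gaps of –200 and –100 hours with weights 3 and 4, so the decile-level gap is –1000/7, or about –143 hours. Passing a 0/1 employment indicator as `y` would give the corresponding gap in participation rates.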
Also, as noted, results are similar when “average hours” or “labor force participation rates” are used as the measure of labor supply. Of note, in these cities 66 percent of high school–educated women and 70 percent of college-educated women are employed on average. Thus, differences of 5 to 7 percentage points between expensive and inexpensive cities represent differentials of 8 to 10 percent, which seem (to us) quite substantial.

Our nonparametric approach does have one disadvantage: The nonlabor income distribution within each decile might differ somewhat for women in expensive cities. An alternative flexible parametric approach to estimation, described in the Appendix, provides nearly identical inferences.

Our empirical findings are roughly consistent with theoretical predictions in Case 2. In that equilibrium example with Stone-Geary preferences, the responsiveness of labor supply to nonlabor income must be greater in inexpensive (low-productivity) cities than expensive (high-productivity) cities.

CONCLUSION

We describe a simple model to demonstrate that the effects of wage and nonlabor income on labor supply typically differ by location. In particular, we show the derivative of the labor supply with respect to nonlabor income is independent of price only when labor supply takes a form based on an implausible separability condition.

Empirical evidence demonstrates that the effect of price on labor supply is not a simple “up-or-down shift” that would be required to meet the separability condition in our key proposition. For example, among women with low nonlabor income, living in an inexpensive city is associated with higher labor force participation and longer work hours, whereas among women with high nonlabor income, living in an inexpensive city is associated with lower labor force participation and shorter work hours.

This work has a number of implications for empirical strategies in estimating labor supply and other policy research.
First, our research makes clear that empirical work should never use cross-sectional variation in wages to estimate parameters in labor supply models. We document significant differences for married women in quantity of labor supplied across cities that may have little connection with behavioral responses to cross-sectional variation in wages.

Second, because labor supply elasticities vary by location, researchers must be careful in interpreting results based on instrumental variable (IV) strategies. For example, suppose an IV approach is used in which the IV is the price of coal. Variation in the price of coal arguably serves as an excellent source of wage variation in the coal industry, but the resulting estimates of the effect on labor supply would apply only for regions where the coal industry is a major employer. If local prices differ in those regions from other parts of the country, the estimated relationships will not be generalizable to the entire country.

Third, using a back-of-the-envelope example, we show that the evidence in Table 1 is consistent with the possibility that wage elasticities of labor supply (for married women) are quite different across cities. Notice that the Slutsky equation, in elasticity form, gives the relationship

(24)   $\varepsilon_w = \varepsilon_w^H + \frac{wh}{N}\,\varepsilon_N$,

where $\varepsilon_w$ is the observed wage elasticity of supply, $\varepsilon_w^H$ is the corresponding Hicksian elasticity (reflecting the pure substitution effect), and $\varepsilon_N$ is the elasticity of labor supply with respect to nonlabor income. Now consider college-educated married women at the median level of nonlabor income. If we take as causal the relationship drawn in Figure 1, moving from the fourth to sixth deciles in income we would estimate a nonlabor income elasticity, $\varepsilon_N$, of –0.29 in the expensive cities and –0.46 in the inexpensive cities. Suppose that the Hicksian elasticity, $\varepsilon_w^H$, is 0.50 (and is the same in both cities).
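As a rough check on the arithmetic, equation (24) can be evaluated directly. The sketch below is illustrative only: the function and variable names are hypothetical, the Hicksian elasticity of 0.50 is the value assumed in the text, and the earnings ratios $wh/N$ (0.57 in inexpensive cities, 0.61 in expensive cities) are the fourth-decile values the authors report. It pairs the income elasticity that is smaller in absolute value (–0.29) with expensive cities, the assignment that matches the flatter hours profile of expensive cities in Table 1 and that brings the results close to the reported 0.33 and 0.24.

```python
def uncompensated_elasticity(hicksian, income_elasticity, earnings_ratio):
    """Slutsky equation in elasticity form: e_w = e_w^H + (wh/N) * e_N."""
    return hicksian + earnings_ratio * income_elasticity


HICKSIAN = 0.50  # assumed equal across city types, as in the text

# Inputs as reported in the text; the pairing of e_N with city type is an assumption.
cities = {
    "expensive": {"income_elasticity": -0.29, "earnings_ratio": 0.61},
    "inexpensive": {"income_elasticity": -0.46, "earnings_ratio": 0.57},
}

results = {
    name: uncompensated_elasticity(HICKSIAN, **inputs)
    for name, inputs in cities.items()
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
# expensive: 0.323
# inexpensive: 0.238
```

Under these inputs the uncompensated elasticity is a little over a third higher in expensive cities; small differences from the published figures would be expected from rounding of the inputs.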
We estimate that for the average woman at the fourth decile $wh/N$ is 0.57 in inexpensive cities and 0.61 in expensive cities.13 Thus, the uncompensated labor supply elasticity is more than a third higher in expensive cities than inexpensive cities, 0.33 versus 0.24.

13 In fact, the ratio of women’s earnings to nonlabor household income (primarily men’s earnings) is larger in expensive cities than in inexpensive cities at every decile.

Fourth, as an example of an application to policy-related research, locational differences may occur in the response of female labor supply to changes in taxes. Changes in income taxes, for instance, would have different effects in different cities. A closely related implication centers on the analysis of social welfare policy. (Recall, for example, that wives of husbands with low earnings work less in more expensive cities.) We believe that further analysis of policy implications is warranted.

REFERENCES

Abbott, Michael and Ashenfelter, Orley. “Labour Supply, Commodity Demand and the Allocation of Time.” Review of Economic Studies, 1976, 43(3), pp. 389-411.

Ashenfelter, Orley and Ham, John C. “Education, Unemployment, and Earnings.” Journal of Political Economy, 1979, 87(5), pp. 99-116.

Blundell, Richard and MaCurdy, Thomas. “Labor Supply: A Review of Alternative Approaches,” in Orley Ashenfelter and David Card, eds., Handbook of Labor Economics. Princeton, NJ: Princeton University Press, 1999, Vol. 3, pp. 1559-95.

Chen, Yong and Rosenthal, Stuart S. “Local Amenities and Life Cycle Migration: Do People Move for Jobs or Fun?” Journal of Urban Economics, 2008 (forthcoming).

DaVanzo, Julie; DeTray, Dennis N. and Greenberg, David H. “Estimating Labor Supply Response: A Sensitivity Analysis,” publication No. R-1372-OEO. Santa Monica, CA: The RAND Corporation, 1973.

Gabriel, Stuart A. and Rosenthal, Stuart S. “Quality of the Business Environment Versus Quality of Life: Do Firms and Households Like the Same Cities?” Review of Economics and Statistics, 2004, 86(1), pp. 438-44.

Haurin, Donald R. “The Regional Distribution of Population, Migration, and Climate.” Quarterly Journal of Economics, 1980, 95(2), pp. 293-308.

Killingsworth, Mark R. and Heckman, James J. “Female Labor Supply: A Survey,” in Orley Ashenfelter and Richard Layard, eds., Handbook of Labor Economics. Princeton, NJ: Princeton University Press, 1986, Vol. 1, pp. 103-204.

Masters, Stanley H. and Garfinkel, Irwin. Estimating the Labor Supply Effects of Income Maintenance Alternatives. New York: Academic Press, 1977.

Pencavel, John. “Labor Supply of Men: A Survey,” in Orley Ashenfelter and Richard Layard, eds., Handbook of Labor Economics. Princeton, NJ: Princeton University Press, 1986, Vol. 1, pp. 3-102.

Roback, Jennifer. “Wages, Rents, and the Quality of Life.” Journal of Political Economy, 1982, 90(6), pp. 1257-78.

Ruggles, Steven; Sobek, Matthew; Alexander, Trent; Fitch, Catherine; Goeken, Ronald; Hall, Patricia; King, Miriam and Ronnander, Chad. Integrated Public Use Microdata Series: Version 3.0 (machine-readable database). Minneapolis, MN: Minnesota Population Center, 2004; www.ipums.org.

Ruggles, Steven; Sobek, Matthew; Alexander, Trent; Fitch, Catherine; Goeken, Ronald; Hall, Patricia; King, Miriam and Ronnander, Chad. Integrated Public Use Microdata Series: Version 4.0 (machine-readable database). Minneapolis, MN: Minnesota Population Center, 2008; usa.ipums.org/usa/.

APPENDIX

The empirical inferences in Table 1 are based on an entirely nonparametric approach.
We divided our sample into 10 nonlabor income deciles and compared labor supply across women within each of these cells. Our primary finding is that for women in low nonlabor income deciles, the labor supply is lower in expensive cities than in inexpensive cities, whereas for women in high nonlabor income deciles, labor supply is higher in expensive cities than in inexpensive cities.

Here we present a flexible parametric approach that leads to this same inference. We estimate labor supply regressions with the independent variables age (entered as 21 dummy variables for each age, 30 to 50 years inclusive) and nonlabor income (entered as a fourth-order polynomial). We estimate regressions (separately for high school–educated women and college-educated women, as well as for each labor supply variable, employment and hours worked) using the sample of women from the expensive cities. We similarly estimate corresponding regressions for the sample of women from the inexpensive cities. Then for each woman $i$ who lives in the expensive cities, we estimate the outcome of interest $\hat{y}_{1i}$ (e.g., “predicted” employment, or “predicted” hours worked) using the regression parameters from the expensive cities, and similarly estimate $\hat{y}_{0i}$ using regression parameters from the inexpensive cities. Finally, we form the estimated gap,

$\hat{\Delta}_i = \hat{y}_{1i} - \hat{y}_{0i}$,

for each individual. Notice that this last quantity is the “impact of the treatment on the treated,” where the “treatment” is location in an expensive city rather than an inexpensive city.

To summarize findings in a manner comparable to Table 1, we aggregate estimates into deciles of nonlabor income. Results are presented in Table A1. Bootstrapped standard errors using 999 replications are reported in parentheses.14

14 Bootstrap procedure in this case involves 999 replications of generating a random sample with replacement from the original dataset and estimating the parameter of interest for that sample. After 999 replications, we have a sampling distribution of the parameter estimate. The standard deviation of that distribution is the standard error of the parameter estimate.

Table A1
Differences in Annual Hours and Participation Rates Between Expensive and Inexpensive Locations by Nonlabor Income Deciles, Parametric Approach

          Women with a high school diploma    Women with a college degree
Nonlabor  Change in         Change in         Change in         Change in
income    annual hours      participation     annual hours      participation
decile                      rates                               rates
1         –128.7 (22.04)    –0.034 (0.0110)   –118.1 (34.23)    –0.027 (0.0143)
2         –93.4 (12.42)     –0.021 (0.0066)   –72.5 (17.76)     –0.016 (0.0079)
3         –68.6 (11.10)     –0.013 (0.0059)   –36.6 (16.07)     –0.002 (0.0074)
4         –47.1 (10.82)     –0.005 (0.0056)   –9.5 (15.23)      0.009 (0.0071)
5         –28.1 (10.26)     0.001 (0.0056)    19.1 (14.59)      0.021 (0.0066)
6         –2.1 (11.15)      0.01 (0.0056)     46.5 (14.18)      0.032 (0.0066)
7         23.8 (12.73)      0.019 (0.0061)    76.5 (14.59)      0.045 (0.0071)
8         55.3 (15.28)      0.030 (0.0077)    108.6 (17.27)     0.058 (0.0082)
9         87.5 (20.48)      0.042 (0.0102)    143.5 (20.89)     0.075 (0.0099)
10        81.6 (38.06)      0.036 (0.0207)    123.1 (30.26)     0.066 (0.0151)

NOTE: Authors’ calculations, based on 1990 PUMS data. The sample consists of all married, white, non-Hispanic women between the ages of 30 and 50 inclusive. The covariates are nonlabor income and age. Using a fourth-order polynomial, we use the sample of women from expensive cities to estimate the outcome of interest, which we denote $\hat{y}_{1i}$ for the $i$th woman. Using the sample of women from inexpensive cities, we estimate parameters for a fourth-order polynomial and then evaluate the function using the covariates of women from the expensive city sample, which we denote $\hat{y}_{0i}$ for the $i$th woman. We then form the parameter for the “impact of treatment on the treated” as $\hat{\Delta}_i = \hat{y}_{1i} - \hat{y}_{0i}$. We then aggregate estimates into deciles of nonlabor income. Bootstrapped standard errors using 999 replications are reported in parentheses.

Regional Aggregation in Forecasting: An Application to the Federal Reserve’s Eighth District

Kristie M. Engemann, Rubén Hernández-Murillo, and Michael T. Owyang

Hernández-Murillo and Owyang (2006) showed that accounting for spatial correlations in regional data can improve forecasts of national employment. This paper considers whether the predictive advantage of disaggregate models remains when forecasting subnational data. The authors conduct horse races among several forecasting models in which the objective is to forecast regional- or state-level employment. For some models, the objective is to forecast using the sum of further disaggregated employment (i.e., forecasts of metropolitan statistical area (MSA)-level data are summed to yield state-level forecasts). The authors find that the spatial relationships between states have sufficient predictive content to overcome small increases in the number of estimated parameters when forecasting regional-level data; this is not always true when forecasting state- and regional-level data using the sum of MSA-level forecasts. (JEL C31, C53)

Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), pp. 15-29.

Forecasting, especially as it pertains to policymaking, is typically conducted at the national level.1 However, a few recent papers have indicated that aggregating regional forecasts may improve forecasts of national indicators. For example, Hendry and Hubrich (2006) use disaggregate models to form forecasts for aggregate variables. Similarly, Giacomini and Granger (2004) show that using a disaggregate model that accounts for spatial correlations can reduce the root mean squared error of the forecasts.
Their disaggregate forecasts take advantage of cross-regional correlations yet still restrict the number of parameters estimated.2 They argue that, under certain conditions, the sum of the forecasts from an order-p,q space-time autoregression [ST-AR(p,q)] can outperform both aggregate models and models that do not account for the spatial nature of the data. The ST-AR(p,q) model includes p temporal lags and q spatially distributed lags (that is, lags of the other regional series weighted by proximity). Thus, the ST-AR(p,q) model exploits both the spatial correlations and the information content in the disaggregated series.

1 There are, however, some notable exceptions of forecasting economic indicators at the subnational level (dates and regions noted in parentheses): Glickman (1971, Philadelphia MSA); Ballard and Glickman (1977, Delaware Valley); Crow (1973, Northeast Corridor); Baird (1983, Ohio); Liu and Stocks (1983, Youngstown-Warren MSA); Duobinis (1981, Chicago MSA); LeSage and Magura (1986, 1990, Ohio); and Rapach and Strauss (2005, Missouri; 2007, Eighth Federal Reserve District).

2 Compared with a standard vector autoregression (VAR), the space-time autoregression (ST-AR) model posited in Giacomini and Granger (2004) requires the estimation of $(n^2 - n - 1)p$ fewer parameters for the same lag order p.

Kristie M. Engemann is a senior research associate, Rubén Hernández-Murillo is a senior economist, and Michael T. Owyang is a research officer at the Federal Reserve Bank of St. Louis. This paper was prepared for the 4th Annual Business and Economics Research Group conference sponsored by the Federal Reserve Bank of St. Louis and the Center for Regional Economics—8th District. The authors thank Dave Rapach for comments.

© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Hernández-Murillo and Owyang (2006) take this approach to national employment data, showing that out-of-sample forecasts can be improved by modeling the spatial interactions between Bureau of Economic Analysis regions. They compare an ST-AR(p,q) model with vector autoregressions (VARs) with various levels of disaggregation. They conclude that, as predicted by Giacomini and Granger (2004), information in regional employment data is useful for forecasting national employment.

In this paper, we are interested in whether the information content of regional data can be observed at a more disaggregated level. In particular, we ask whether information for states helps forecast regional data and whether information from cities helps forecast state data. To this end, we construct horse races among four competing models with different levels of disaggregation. We then conduct out-of-sample tests to determine which model produces the best short- and long-horizon forecasts.

The data used in these experiments are state- and metropolitan statistical area (MSA)-level payroll employment. In each experiment, the disaggregate data are summed to yield either state- or regional-level aggregates. In each case, we ask whether models using the disaggregate data provide lower mean squared prediction errors (MSPEs) than the aggregate alternatives.
We find that the spatial relationships among states have sufficient predictive content to overcome small increases in the number of estimated parameters. The same is not always true when forecasting state- and regional-level variables using the sum of MSA-level forecasts.

The next section reviews the four models used in the horse races, followed by a section that discusses the subnational data and the construction of the “aggregate” data. The results of the out-of-sample experiments are then presented, followed by the conclusion.

MODELS

The goal of this experiment is to produce an h-period-ahead forecast of an aggregate time series, for example, employment. In this context, “aggregate” does not necessarily mean “national,” although it is an obvious interpretation. Instead, here aggregate time series are data that are the sum or weighted sum of a number of (forecastable) disaggregate series. These series can be disaggregated in any manner (e.g., by regions or industries). The aggregate forecast then can be constructed directly from aggregate data or from the sum (or weighted sum) of its components. We examine four alternatives.

Suppose that period-t aggregate employment is denoted $Y_t$ and can be written as the sum of its N disaggregate counterparts (henceforth referred to as “regions,” which depending on the application may refer to either states or metro areas), $y_{nt}$, without error.3 Let $\hat{Y}_{t+h}$ be the h-period-ahead forecast of Y. A forecast from the simplest model, a univariate aggregate order-p autoregression (AR(p), Model 1), has the form

(1)   $\hat{Y}_{t+h} = \sum_{j=1}^{p} \Phi_j Y_{t+h-j}$,

where p is the number of lags and $\Phi_j$ are scalar coefficients.4

A similar univariate model can be constructed to forecast each of the individual components, in particular, region n’s h-period-ahead level of employment, $\hat{y}_{n,t+h}$.5 The aggregate forecast is the sum of the N regional forecasts (Model 2):

(2)   $\hat{Y}_{t+h} = \sum_{n=1}^{N} \hat{y}^{uni}_{n,t+h} = \sum_{n=1}^{N} \sum_{j=1}^{p} \phi_{nj}\, y_{n,t+h-j}$,

where $\hat{y}^{uni}_{n,t+h}$ is region n’s employment forecast from the univariate AR(p) model and $\phi_{nj}$ are scalar coefficients.

An alternative to Model (2) that accounts for the comovement between the regions is a VAR forecast (Model 3). The aggregate forecast obtained from such a model can be written as

(3)   $\hat{Y}_{t+h} = \sum_{n=1}^{N} \hat{y}^{var}_{n,t+h} = \sum_{n=1}^{N} \sum_{k=1}^{N} \sum_{j=1}^{p} \Gamma_{nkj}\, y_{k,t+h-j}$,

where $\hat{y}^{var}_{n,t+h}$ is region n’s employment forecast and $\Gamma_{nkj}$ is the (scalar) lag-j effect of region k on region n’s employment taken from the VAR coefficient matrices.

3 The implicit assumption made here is that the aggregate is exactly the sum of its component parts. That is, $Y_t = \sum_{n=1}^{N} y_{nt}$ holds identically. Of course, the validity of this assumption depends greatly on the choice of data.

4 Potential constants and time trends are suppressed in this section for notational convenience.

5 Henceforth, we refer to the disaggregate components as “regions,” although they can, in principle, be of any type (e.g., industry, state, MSA).

Finally, we consider an ST-AR(p,q) model (Model 4), which accounts explicitly for the spatial correlations between regions by imposing a relationship that depends on the proximity to a region’s neighbors. The spatial weights $w_{nk}$ are chosen a priori and are intended to reflect proximity between pairs of regions, for example, in terms of geographic characteristics such as contiguity or distance.
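To make the role of the spatial weights concrete, the following sketch builds a contiguity-based weight matrix in which a region places no weight on itself and spreads equal weight over its neighbors, so that each row sums to one. It is an illustration only: the function names are hypothetical, the contiguity map simply records which of the seven Eighth District states share a border, the equal-share rule is an assumption rather than the weighting the authors use, and the employment levels are made-up round numbers.

```python
def row_normalized_weights(neighbors):
    """Build W = {w_nk} from a contiguity map (region -> set of neighbors)."""
    regions = sorted(neighbors)
    weights = {}
    for n in regions:
        adjacent = [k for k in regions if k != n and k in neighbors[n]]
        if not adjacent:
            raise ValueError(f"region {n!r} has no neighbors; its row cannot sum to 1")
        share = 1.0 / len(adjacent)  # equal weight on each neighbor (assumption)
        weights[n] = {k: (share if k in adjacent else 0.0) for k in regions}
    return weights


def spatial_lag(weights, y):
    """Proximity-weighted average of other regions' values: sum_k w_nk * y_k."""
    return {n: sum(w * y[k] for k, w in row.items()) for n, row in weights.items()}


# Shared borders among the seven Eighth District states (illustrative weights only).
CONTIGUITY = {
    "AR": {"MO", "TN", "MS"},
    "IL": {"IN", "KY", "MO"},
    "IN": {"IL", "KY"},
    "KY": {"IL", "IN", "MO", "TN"},
    "MS": {"AR", "TN"},
    "MO": {"AR", "IL", "KY", "TN"},
    "TN": {"AR", "KY", "MS", "MO"},
}
W = row_normalized_weights(CONTIGUITY)

# Hypothetical employment levels in thousands (not the BLS data).
employment = {"AR": 1100.0, "IL": 5700.0, "IN": 2800.0, "KY": 1700.0,
              "MS": 1100.0, "MO": 2600.0, "TN": 2600.0}
lagged = spatial_lag(W, employment)
print(lagged["MS"])  # average of the AR and TN values
```

A distance-based alternative would replace the equal shares with weights that decline in the distance between region centroids before normalizing each row.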
Interaction between regions is governed by a weighting matrix $W = \{w_{nk}\}$ satisfying $w_{nk} \geq 0$, $w_{nn} = 0$, and $\sum_{k \neq n} w_{nk} = 1$. The aggregate forecast is then

(4)   $\hat{Y}_{t+h} = \sum_{n=1}^{N} \hat{y}^{star}_{n,t+h} = \sum_{n=1}^{N} \left( \sum_{j=1}^{p} \phi_j\, y_{n,t+h-j} + \sum_{k=1}^{N} \sum_{l=1}^{q} \psi_l\, w_{nk}\, y_{k,t+h-l} \right)$,

where $\phi_j$ and $\psi_l$ are scalar autoregressive and scalar spatial lag coefficients, respectively. The weighting matrices used in the empirical applications are discussed below.

The primary differences among the four models involve a tension between modeling the (in-sample) cross-spatial correlations and parameter proliferation. Clearly, Models (1) and (2) are the most parsimonious models. However, these models neglect potentially predictive information in the comovement between the variables. On the other hand, the VAR depicted in Model (3) may overfit the in-sample data.

Under parameter certainty, the VAR forecast in Model (3) weakly dominates the three alternative Models (1), (2), and (4). However, Giacomini and Granger (2004) show that forecasting from an estimated VAR (Model 3) is less efficient than forecasting from the ST-AR model (Model 4).6 Because the ST-AR model is a restricted form of the VAR, the error associated with parameter uncertainty decreases. Giacomini and Granger, however, are unable to determine whether the ST-AR model or the univariate model is more theoretically efficient (i.e., whether interaction between regions yields significant information for forecasting). In the following section, we investigate whether accounting for spatial interaction in regional employment data is sufficiently elucidative to warrant the use of disaggregate data in forecasting.

6 Under certain conditions, the univariate aggregate model yields a lower mean squared error. For a discussion of these conditions, see Giacomini and Granger (2004).

EMPIRICAL DETAILS

Hernández-Murillo and Owyang (2006) tested the forecasting efficacy of the spatially disaggregated model for national employment.
Here, we consider further disaggregation by examining the model’s ability to forecast state- and Federal Reserve District–level employment. We conduct three experiments. First, we forecast Eighth District employment using the sum of state-level employment.7 Second, we forecast District employment using the sum of Eighth District MSA–level employment.8 Finally, we forecast state-level employment for each of the seven District states using MSA-level employment.

Data

Although a number of aggregate business cycle indicators exist, relatively few series are available at the disaggregate level. Two series available at a state level with both a reasonable frequency and sufficiently large sample are personal income (quarterly) and employment (monthly).9 At an MSA-level, only employment is readily available. We, therefore, concentrate our efforts on the appropriate employment forecasts.

7 The Federal Reserve’s Eighth District contains portions of seven states: Missouri, Illinois, Tennessee, Arkansas, Kentucky, Indiana, and Mississippi. Only Arkansas lies entirely in the Eighth District. However, for purposes of this experiment, we make the simplifying assumption that the District consists of the entirety of all seven states.

8 In constructing District-level employment for this experiment and state-level employment for the next experiment, we use the sum of MSA-level employment. For the former, we include only MSAs located in the Eighth District, and for the latter, we include all MSAs in the states. Rural employment is omitted in each case.

9 Gross state product, which is the state-level equivalent to national gross domestic product, is annual and only available at a one-year lag.

For our forecasting experiments, we use state- and MSA-level employment data from the Bureau of Labor Statistics’ payroll employment survey. For the first experiment, state-level employment is summed to yield an approximation of the Eighth District employment level. In the same manner, the appropriate aggregates are constructed from MSA-level data in the following two experiments for forecasting District- and state-level data. For each exercise, the full sample is January 1990 to December 2007. For convenience, the state- and MSA-level data are plotted in Figures 1 and 2, respectively. Summary statistics for the data are provided in Tables 1 and 2.

Figure 1
Eighth District States’ Payroll Employment
[Time-series plot with one line each for Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, and Tennessee. Vertical axis label: $ Thousands. Horizontal axis: months, labeled from Jan. 1990 to Jan. 2006.]
NOTE: The employment series for each state is seasonally adjusted.

For each of the last two experiments, we construct the District- and state-level aggregates by omitting rural employment. Table 3 shows that the rural component of employment for each state in the Federal Reserve’s Eighth District is significant. The difficulty, however, of adding rural employment to the forecasting regressions (at least those that account for cross-regional correlations) lies in modeling the comovements between rural and urban employment. In particular, for the spatial model (4), modeling the distance between the rural and MSA centroids is problematic.

Forecasting Scheme

We could use one of two forecasting schemes: recursive or rolling window.
A recursive forecasting scheme fixes the initial period for the in-sample data. Each additional period is added to the sample and the model is reestimated. Thus, the estimation window expands as the sample expands. Conversely, the rolling window scheme fixes the size of the dataset used to make the forecast. With each new period, recent data are added and data at the beginning of the sample are dropped. The rolling window scheme is particularly useful for cases in which the data-generating process experiences structural breaks. This has been shown to be the case for both state- and MSA-level employment (see Owyang, Piger, and Wall, 2005, forthcoming, and Owyang et al., forthcoming). Therefore, we choose to use a rolling window forecasting scheme with a 13-year sampling period. The number of

Figure 2A
Eighth District MSAs’ Payroll Employment, by State
[Time-series plot with one line each for Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, and Tennessee. Vertical axis label: $ Thousands. Horizontal axis: months, labeled from Jan. 1990 to Jan. 2006.]
NOTE: The employment series for each state is seasonally adjusted and consists of the sum of all MSAs in that state.

Figure 2B
Total State MSAs’ Payroll Employment, by State
[Time-series plot with one line each for Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, and Tennessee. Vertical axis label: $ Thousands. Horizontal axis: months, labeled from Jan. 1990 to Jan. 2006.]
NOTE: The employment series for each state is seasonally adjusted and consists of the sum of all MSAs in that state.
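The difference between the recursive and rolling window schemes described above comes down to which observations enter each estimation sample. The sketch below enumerates those samples for a monthly series running from January 1990 to December 2007 with a 13-year window. It is a simplified illustration: the function names are hypothetical, each pair `(start, end)` is a half-open index range of in-sample months, and it assumes every window must leave at least one later month for evaluation, which need not match the authors' exact set of forecast origins or horizons.

```python
def recursive_windows(n_obs, first_window):
    """Expanding samples: the start stays fixed and the end advances one period at a time."""
    return [(0, end) for end in range(first_window, n_obs)]


def rolling_windows(n_obs, window):
    """Fixed-length samples: each step adds the newest period and drops the oldest."""
    return [(end - window, end) for end in range(window, n_obs)]


N_MONTHS = 18 * 12  # January 1990 through December 2007
WINDOW = 13 * 12    # 13-year estimation window

rolling = rolling_windows(N_MONTHS, WINDOW)
recursive = recursive_windows(N_MONTHS, WINDOW)

print(len(rolling), rolling[0], rolling[-1])        # 60 (0, 156) (59, 215)
print(len(recursive), recursive[0], recursive[-1])  # 60 (0, 156) (0, 215)
```

Both schemes start from the same first sample; afterward the rolling sample keeps its 156-month length while the recursive sample grows, which is why the rolling scheme is less exposed to structural breaks early in the sample.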
LO U I S R E G I O N A L E C O N O M I C D E V E LO P M E N T V O LU M E 4 , N U M B E R 1 2008 19 20 V O LU M E 4 , N U M B E R 1 2008 39,700.8 26,105.7 5,442.7 17,009.6 26,891.1 75,994.0 7,670.6 Variance Level Nov. 2007 May 2000 Jun. 2000 Mar. 2007 May 2007 2,816.8 Dec. 2007 2,805.4 1,171.2 Dec. 2007 1,859.8 3,015.2 6,059.8 Date Maximum Jan. 1990 2,171.0 Apr. 1991 2,294.9 Apr. 1991 928.8 1,461.7 Apr. 1990 2,492.5 Mar. 1991 5,201.4 Mar. 1992 912.2 Feb. 1990 Date Minimum Level Level (thousands) 1,209.4 NOTE: Monthly growth rates are annualized. 2,560.9 Tennessee 1,705.5 Kentucky 1,085.0 2,824.3 Indiana 2,603.0 5,711.0 Illinois Missouri 1,095.8 Arkansas Mississippi Mean State State-Level Summary Statistics Table 1 –0.74 –0.66 –1.02 –0.64 –0.85 –0.64 –0.73 Skewness 1.5 1.0 1.4 1.4 1.0 0.8 1.6 Mean 12.8 10.6 15.9 16.4 12.0 6.8 7.8 Variance 17.1 11.0 15.9 29.6 10.6 11.7 9.0 Growth Nov. 1994 Apr. 1993 Oct. 1993 May 1992 Feb. 1999 Sep. 1995 Jan. 1995 Date Maximum –13.2 –14.6 –18.6 –20.1 –10.4 –6.8 –5.2 Apr. 1996 Jan. 1991 Sep. 2005 Apr. 1992 Jan. 1999 Jul. 2001 Jan. 1991 Date Minimum Growth Growth rate (percent) Engemann, Hernández-Murillo, Owyang F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E G I O N A L E C O N O M I C D E V E LO P M E N T F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E G I O N A L E C O N O M I C D E V E LO P M E N T V O LU M E 4 , N U M B E R 1 55.6 42.0 345.8 171.4 153.3 111.6 Decatur, IL Kankakee-Bradley, IL Lake CountyKenosha County, IL-WI Peoria, IL Rockford, IL Springfield, IL 5.3 80.6 103.0 1,945.4 5.3 3.7 69.7 0.9 27,386.5 30.9 87.6 7.2 17.4 11.4 102.3 117.3 166.1 188.2 401.5 44.6 60.7 190.0 35.2 3,925.5 114.9 92.6 56.6 41.0 348.4 49.8 39.6 126.0 208.6 Level Aug. 2000 Jul. 2000 Jun. 2007 Jun. 2007 Jan. 2007 Mar. 2000 Nov. 1999 Nov. 1996 Jan. 2001 Sep. 2007 Feb. 2002 Oct. 2007 Nov. 2003 Dec. 2007 Dec. 2007 Jul. 2007 Aug. 2007 Mar. 
2007 Date Maximum 106.1 133.2 149.1 265.1 35.8 52.5 164.5 31.2 3,393.5 98.0 65.3 47.0 36.5 255.1 36.6 27.0 89.9 105.2 Feb. 1990 Aug. 1991 Apr. 1992 Jan. 1990 Jan. 1990 Apr. 1992 Apr. 1990 Dec. 2006 Dec. 1991 Apr. 1990 Jan. 1990 May 1991 Feb. 1991 Feb. 1990 Jan. 1990 Jan. 1990 Jan. 1990 Jan. 1990 Date Minimum Level Level (thousands) 0.03 –0.77 –0.40 –0.38 –1.10 1.05 –0.45 –0.31 –0.62 –0.28 –0.43 –0.15 –0.70 –0.53 –0.65 –0.48 –0.53 0.09 Skewness 0.6 1.4 1.4 2.4 1.6 0.5 0.9 0.3 0.6 1.2 2.5 1.0 0.7 1.8 2.0 2.4 2.1 4.0 Mean 66.7 81.9 81.4 20.8 85.1 78.2 28.0 137.9 8.3 102.3 120.5 34.9 75.3 13.8 58.2 62.7 37.0 25.1 Variance 44.3 35.7 69.9 26.4 31.5 54.2 23.9 48.0 12.4 52.9 48.4 22.3 35.3 24.5 34.0 37.1 22.9 38.0 Growth Aug. 1993 Feb. 1990 May 1992 Oct. 1993 Jan. 1998 May 1992 Jul. 1998 Apr. 1991 Apr. 1993 Sep. 2007 Jun. 1992 Jan. 1996 Apr. 1991 Jan. 2001 Jul. 1993 May 2003 Apr. 1994 Jan. 2001 Date Maximum Growth rate (percent) NOTE: *Indicates an MSA located in the Eighth District and used in the second experiment; monthly growth rates are annualized. 33.1 179.6 3,707.6 ChicagoNapervilleJoliet, IL Davenport-MolineRock Island IA-IL 107.1 ChampaignUrbana, IL Danville, IL 81.9 BloomingtonNormal, IL 1.1 39.3 Little RockNorth Little Rock, AR* 51.9 44.3 307.2 Jonesboro, AR* Pine Bluff, AR* 34.0 Hot Springs, AR* Texarkana, TX-AR* 698.0 110.0 Fort Smith, AR-OK* 993.8 156.2 FayettevilleSpringdaleRogers, AR* Variance Mean MSA MSA-Level Summary Statistics Table 2 –30.0 –28.4 –38.4 –8.7 –28.5 –29.9 –18.8 –42.4 –8.2 –23.4 –38.4 –16.2 –37.0 –5.5 –15.8 –26.2 –13.3 –8.1 Growth Sep. 1992 Aug. 2000 Jul. 1994 Jun. 1993 Apr. 1992 Nov. 1991 Jan. 1999 Jan. 1999 Jul. 2001 Nov. 1999 Jan. 1995 Sep. 1993 Jul. 2006 Sep. 2001 Jan. 2000 Feb. 2006 Nov. 2006 Apr. 
2007 Date Minimum Engemann, Hernández-Murillo, Owyang 2008 21 22 V O LU M E 4 , N U M B E R 1 2008 51.2 Terre Haute, IN Bowling Green, KY* 231.3 579.1 47.8 HuntingtonAshland, WV-KY-OH Lexington-Fayette, KY Louisville, KY-IN* Owensboro, KY* 12.0 1,495.5 407.3 35.6 16.1 4,351.7 39.8 6.5 62.7 6.4 1.6 40.3 7.5 6,428.6 66.0 80.2 110.5 104.8 9.7 38.5 6.3 Variance 51.6 629.4 257.1 121.2 48.8 1,047.4 62.8 78.2 150.9 62.9 50.3 97.0 55.7 921.8 285.6 220.0 182.0 133.7 46.0 85.2 51.1 Level Dec. 2007 Jul. 2007 Dec. 2007 Sep. 2007 Nov. 2007 Oct. 2007 Dec. 2007 Mar. 2000 May 2000 Jul. 1995 Apr. 2000 Aug. 2007 Dec. 1999 Aug. 2007 Jun. 1998 Dec. 1998 Nov. 2002 Feb. 2006 Oct. 2007 Aug. 2006 Mar. 1990 Date Maximum 40.8 504.2 194.2 99.8 35.2 859.0 39.4 68.2 123.6 52.0 45.1 74.8 44.1 666.4 257.9 190.6 150.2 95.1 34.2 62.3 41.4 Jul. 1991 Apr. 1991 Apr. 1990 Sep. 1991 Jan. 1990 Jan. 1990 Jan. 1990 Jul. 1990 Apr. 1992 Jul. 2005 Sep. 1991 May 1991 Jul. 2003 Apr. 1990 Apr. 1990 Mar. 1991 Feb. 1990 Mar. 1991 Feb. 1990 Jul. 1990 Jun. 2007 Date Minimum Level Level (thousands) –0.92 –0.70 –0.62 0.07 0.00 –0.47 –0.15 –0.85 –1.02 0.03 0.04 –0.59 –0.01 –0.32 –0.34 –0.74 –0.70 –0.58 –0.88 –0.80 –0.70 Skewness 1.5 1.3 1.7 1.1 2.0 1.2 3.0 0.6 0.9 0.6 0.4 1.7 2.7 1.9 0.5 0.7 1.1 1.6 2.0 2.4 –0.7 Mean 38.6 17.3 22.9 34.3 36.6 9.4 71.1 43.8 35.9 129.1 36.8 104.8 857.6 15.7 23.2 27.3 28.9 52.8 85.2 202.9 70.7 Variance 24.5 16.2 20.5 34.0 22.1 15.3 46.7 24.5 23.8 68.3 21.8 37.1 312.8 18.4 21.6 21.5 26.6 22.9 47.0 80.4 30.0 Growth Apr. 1999 Dec. 2006 Jul. 2000 Feb. 1994 Jan. 2006 Oct. 1993 Sep. 1991 Jan. 1993 Oct. 2004 Jun. 2003 Sep. 1999 Jan. 1995 Aug. 2003 Oct. 1998 Jan. 2001 Jul. 2005 Jan. 2001 Apr. 1992 Aug. 1992 Jan. 1993 Aug. 1994 Date Maximum Growth rate (percent) NOTE: *Indicates an MSA located in the Eighth District and used in the second experiment; monthly growth rates are annualized. 
41.9 110.0 Elizabethtown, KY* 965.5 74.4 South BendMishawaka, IN CincinnatiMiddletown, OH-KY-IN 57.0 140.9 Muncie, IN 47.5 Michigan CityLa Porte, IN Gary, IN 87.8 208.5 273.5 Fort Wayne, IN Lafayette, IN 170.2 Evansville, IN-KY* 50.9 117.9 Elkhart-Goshen, IN 802.3 41.5 Columbus, IN Kokomo, IN 75.7 Indianapolis, IN 47.4 Bloomington, IN Mean Anderson, IN MSA MSA-Level Summary Statistics Table 2, cont’d –16.3 –17.4 –11.2 –17.6 –15.6 –6.2 –32.4 –34.3 –20.6 –31.3 –22.1 –51.1 –74.9 –13.1 –17.5 –17.6 –18.0 –20.9 –32.3 –39.9 –24.0 Growth Feb. 2003 Jan. 1991 Jul. 1995 Jan. 1994 Apr. 2000 Jan. 2005 Jan. 1992 Jan. 2003 Apr. 1992 Jun. 1997 Oct. 1997 Jan. 2003 Jul. 2003 Jan. 1999 Sep. 2007 Jan. 1999 Oct. 1994 Apr. 1995 Jul. 1992 Jan. 2003 Jul. 1994 Date Minimum Engemann, Hernández-Murillo, Owyang F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E G I O N A L E C O N O M I C D E V E LO P M E N T F E D E R A L R E S E R V E B A N K O F S T . LO U I S R E G I O N A L E C O N O M I C D E V E LO P M E N T 293.9 765.6 51.9 647.1 338.9 125.7 81.7 62.3 42.9 85.7 248.9 202.9 1,362.6 59.6 1,024.2 80.1 80.3 93.3 60.8 263.4 62.1 116.2 Level Dec. 2007 May 2006 Dec. 2007 Sep. 2007 Apr. 1997 Nov. 2007 Jun. 2007 Jan. 2007 Oct. 2007 Jul. 2007 Dec. 2007 May 2007 Aug. 2007 Aug. 2007 Dec. 2007 Dec. 2007 Oct. 2007 Jul. 1999 Oct. 2007 Aug. 2007 Jan. 2005 Date Maximum 71.0 522.9 38.1 487.5 240.8 107.4 64.2 45.7 32.2 53.4 200.6 131.2 1,165.0 42.8 824.1 60.2 59.0 61.2 46.1 194.6 43.7 Feb. 1991 Jan. 1990 Mar. 1991 Apr. 1990 Jan. 1990 Feb. 1990 Jan. 1991 Aug. 1991 Dec. 1990 May 1991 Jul. 1990 Jun. 1991 Jul. 1991 Jun. 1991 Feb. 1991 Apr. 1990 Jun. 1990 Jan. 1990 Jan. 1990 May 1991 Jan. 
1990 Date Minimum Level Level (thousands) –0.38 –0.75 –0.66 –0.23 –1.11 –0.10 –0.59 –0.74 –0.26 –0.38 –0.31 –0.58 0.50 –0.60 –0.70 –0.72 –0.24 –0.03 –0.40 –0.01 –0.75 Skewness 2.2 2.1 1.6 2.0 1.0 1.6 1.9 2.1 2.7 1.2 2.5 0.8 1.9 1.2 1.7 1.9 2.5 2.9 1.7 2.0 3.7 Mean 18.3 87.2 22.9 18.2 52.2 52.7 57.8 169.0 45.4 24.7 17.0 8.1 59.6 12.3 27.7 32.0 26.9 392.5 10.6 35.9 176.1 Variance 16.9 46.4 23.4 14.4 25.1 31.7 31.6 109.3 25.7 19.7 16.8 10.0 35.2 14.0 20.0 22.1 22.7 191.7 18.0 52.5 75.1 Growth Nov. 1994 Jul. 1993 Jan. 1994 Apr. 1993 Jul. 1990 Sep. 2003 Jul. 1999 Jan. 1994 Sep. 1993 Jul. 1999 Aug. 1990 Jan. 1998 Oct. 1991 Jul. 2007 Jul. 2004 Jul. 1994 Oct. 1993 Apr. 2007 Jul. 2003 Oct. 2005 Aug. 1992 Date Maximum Growth rate (percent) NOTE: *Indicates an MSA located in the Eighth District and used in the second experiment; monthly growth rates are annualized. 18.1 5,679.1 46.9 2,773.8 777.1 21.5 22.7 30.5 Nashville-Davidson- 651.1 Murfreesboro, TN Morristown, TN 581.0 Knoxville, TN Memphis, TN-AR-MS* 73.2 119.1 55.8 Jackson, TN* Kingsport-BristolBristol, TN-VA 38.7 Cleveland, TN Johnson City, TN 100.4 70.8 Clarksville, TN-KY 10.5 193.9 Chattanooga, TN-GA 226.9 425.1 167.5 Springfield, MO* 4,295.7 20.7 1,280.5 49.4 St. Joseph, MO-KS 3,513.1 35.2 44.2 99.2 9.2 471.5 26.4 202.7 Variance St. Louis, MO-IL* 71.7 72.2 Jefferson City, MO* 932.0 77.9 Columbia, MO* Kansas City, MO-KS 54.2 Pascagoula, MS Joplin, MO 232.9 Jackson, MS 99.3 51.7 Hattiesburg, MS Mean Gulfport-Biloxi, MS MSA MSA-Level Summary Statistics Table 2, cont’d –7.4 –22.5 –12.2 –11.9 –24.0 –28.0 –28.4 –42.9 –18.8 –12.1 –11.2 –7.5 –19.8 –14.8 –14.7 –16.3 –13.2 –71.6 –7.3 –10.5 –78.9 Growth Oct. 2001 Oct. 2002 Aug. 1993 Jan. 2003 Sep. 1990 Jul. 1997 Aug. 1999 Mar. 1993 Jul. 1990 Jan. 1996 Jan. 2003 Dec. 2000 Sep. 1996 Jul. 2002 May 2002 Apr. 1995 Apr. 1990 Sep. 2005 Aug. 1990 Sep. 2005 Sep. 
2005 Date Minimum

Table 3: Rural Employment by State in 2006

State          Rural employment (percent)
Arkansas       36.3
Illinois       11.6
Indiana        20.0
Kentucky       36.0
Mississippi    52.5
Missouri       23.9
Tennessee      22.1
Average        28.9

SOURCE: USDA, Economic Research Service, State Fact Sheets.

lags for each model is chosen using the Bayesian information criterion (BIC) on the initial subsample and remains fixed for the entire forecasting experiment.

Spatial Weighting

Two sets of weights are considered for the first forecasting experiment. The first set of weights takes into account the distance between the centroids of economic regions, and the second considers geographic contiguity as a categorical qualification. Under the first definition, $w_{nk} = (1/d_{nk}) / \sum_{k \neq n} (1/d_{nk})$, where $d_{nk}$ is the distance between the geographic centroids of regions $n$ and $k$. Under the second definition, $w_{nk} = \eta_{nk} / \sum_{k \neq n} \eta_{nk}$, where $\eta_{nk} = 1$ if regions $n$ and $k$ are geographically adjacent, and $\eta_{nk} = 0$ otherwise. Both of the final two experiments use only the distance between centroids because contiguity cannot be established for most MSAs.

RESULTS

A few broadly consistent features are notable for the three forecasting experiments. In particular, for the District forecasts, the aggregate AR exhibits greater MSPE at every horizon than the ST-AR model. The difference in MSPEs for the ST-AR model and a more parsimoniously parameterized VAR is often small, especially for short horizons; and the disaggregate AR can provide some (small) forecasting advantages over the more heavily parameterized ST-AR model at short horizons but is inferior at long horizons.

Forecasting District Employment with State-Level Data

The first set of results considers forecasting employment in the Eighth Federal Reserve District using state-level data. As mentioned previously, state-level data support two possible spatial weighting matrices for the ST-AR model: distance and contiguity. We present results for both weighting matrices. Figure 3 shows the relative decline in MSPEs for the ST-AR model using centroid distance as the spatial metric relative to each of the forecasting models. Obvious from these results is that weighting state-level interactions by distance provides some advantage to aggregate forecasting over weighting by contiguity. The advantage may result because a contiguity weighting scheme would suppress potentially important interactions between noncontiguous states.10

10 As alluded to above, the weighting matrix in spatial econometrics is determined exogenously. Conley and Molinari (2007) propose a test of the spatial weighting matrix. However, their test is conducted in-sample and is a joint test of model and spatial weighting misspecification.

For both weighting schemes, the informational advantage in modeling the regional interactions is obvious. The VAR and the ST-AR models yield lower MSPEs for almost every horizon. At very short horizons, the disaggregate AR has predictive ability similar to that of the VAR and the ST-AR models. However, at longer horizons, neglecting the regional interactions can increase the MSPE by up to 90 percent.

The regional VAR and the ST-AR models produce an interesting comparison. First, it is important to note that the lag order chosen by the BIC for the VAR is much shorter than that for the ST-AR. This negates, to some extent, the reduction in the MSPEs gained by reducing parameter uncertainty in the more parsimoniously parameterized ST-AR model. Figure 4 demonstrates the informational advantage for an ST-AR model versus a VAR with equal lag length. This finding is consistent with the theoretical findings in Giacomini and Granger (2004): Increasing the number of estimated parameters in the VAR with equal lags leads to potential overfitting and an increase in the MSPEs.

Figure 3: Efficiency Gain for ST-AR Model, Using Eighth District States (horizons 0 to 25). Legend entries: Disaggregate ST-AR(3,3) Contiguity; Disaggregate VAR(1); Aggregate AR(3); Disaggregate ST-AR(3,3); Disaggregate AR(3).

Figure 4: Efficiency Gain for ST-AR Model, Using Eighth District States (Setting Equal Lag Lengths) (horizons 0 to 25). Legend entries: Aggregate AR(3); Disaggregate AR(3); Disaggregate VAR(3); Disaggregate ST-AR(3,3).

Forecasting District Employment with MSA-Level Data

As Figure 5 shows, the results for disaggregating at the MSA level are broadly consistent with those for the state data. The disaggregate models perform better out of sample than the aggregate AR model. The ST-AR model is more efficient than the disaggregate AR at long horizons. At shorter horizons, this information advantage is eroded and sometimes negative. Moreover, the VAR performs better in this case than the ST-AR model for most horizons.

Figure 5: Efficiency Gain for ST-AR Model, Using Eighth District MSAs.

These results suggest several possible explanations. First, in the previous case, District data were disaggregated into seven states; here, the District is disaggregated into 18 MSAs.
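For reference, the two row-normalized weighting schemes defined under Spatial Weighting, which underlie the ST-AR results discussed here, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the three regions, their distances, and their adjacency pattern are all made up, and the treatment of a region with no neighbors is an assumption of the sketch.

```python
# Minimal sketch of the two row-normalized weighting schemes (not the authors' code).
# The distances and adjacency pattern below are made up purely for illustration.

def distance_weights(dist):
    """w[n][k] = (1/d[n][k]) / sum over k != n of (1/d[n][k]); w[n][n] = 0."""
    n_regions = len(dist)
    weights = []
    for n in range(n_regions):
        inv = [0.0 if k == n else 1.0 / dist[n][k] for k in range(n_regions)]
        total = sum(inv)
        weights.append([x / total for x in inv])
    return weights

def contiguity_weights(adjacent):
    """w[n][k] = eta[n][k] / sum over k != n of eta[n][k], with eta in {0, 1}."""
    n_regions = len(adjacent)
    weights = []
    for n in range(n_regions):
        eta = [0.0 if k == n else float(adjacent[n][k]) for k in range(n_regions)]
        total = sum(eta)
        # A region with no neighbors gets an all-zero row (an assumption of this sketch).
        weights.append([x / total if total else 0.0 for x in eta])
    return weights

# Three hypothetical regions: centroid distances in arbitrary units.
dist = [[0, 100, 200],
        [100, 0, 400],
        [200, 400, 0]]
# Regions 0 and 1 touch, regions 0 and 2 touch, regions 1 and 2 do not.
adjacent = [[0, 1, 1],
            [1, 0, 0],
            [1, 0, 0]]

w_dist = distance_weights(dist)
w_contig = contiguity_weights(adjacent)
```

In this toy case the contiguity scheme assigns zero weight between regions 1 and 2, while the distance scheme still assigns them a small positive weight (0.2 in row 1), which mirrors the point made earlier that contiguity weights suppress interactions between noncontiguous regions.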
Although the increase in the number of disaggregate units may not seem large, it leads to a substantial increase in the number of estimated parameters for the ST-AR model. This increase may erode the model’s forecasting advantage because of the increased uncertainty from estimating the extra parameters. Second, the MSA may be an improper level of disaggregation. A third possibility is that the spatial weighting matrix used in this exercise does not properly model the interactions. This could potentially explain why the VAR model performs better than the ST-AR model despite estimating a comparable number of parameters.

Figure 5 legend entries: Disaggregate VAR(1); Disaggregate ST-AR(3,3); Aggregate AR(3); Disaggregate AR(4).

Forecasting State Employment with MSA-Level Data

We conducted similar experiments using the level of employment in the seven states in the Eighth District as the aggregate and the MSAs in those states as the disaggregate components. Our motivation is to determine the optimal level of disaggregation in forecasting employment. Unfortunately, few results are consistent across states (Figure 6). For example, most states yield lower MSPEs for the disaggregate forecasting models versus the aggregate AR model. Mississippi is an exception: The aggregate AR gives MSPEs roughly similar to those of the VAR and much lower MSPEs than either the ST-AR or disaggregate AR model. Overall,
the model with the lowest MSPE for each state differs. The ST-AR model provides the lowest MSPE for about half of the states but performs considerably worse than even the aggregate AR for Mississippi and Indiana.

Figure 6: Efficiency Gain for ST-AR Model, Forecasting State Employment with MSAs (panels: Arkansas, Illinois, Indiana, Kentucky, Mississippi, Missouri, Tennessee; each panel covers horizons 0 to 25 and compares the aggregate AR, disaggregate AR, disaggregate VAR, and disaggregate ST-AR models, with lag orders that differ by state).

One notable fact in these results is that, for a given model, the lag order called for by the (in-sample) BIC varies substantially across the states. Not surprisingly, the ST-AR model tends to perform worse in states in which the in-sample criterion calls for longer lags.
This can lead to an increase in parameter uncertainty or overfitting.11 Similarly, for states in which long lags are called for in the AR model, this model performs poorly. We therefore conclude that, although some information may be gleaned from modeling spatial relationships, disaggregation to the MSA level should be done with some caution.

11 The tension between in-sample and out-of-sample fit is not surprising (see Hansen, 2008).

CONCLUSION

Recent studies have shown that, at times, aggregate variables can be more accurately forecast by summing disaggregate forecasts. In particular, using models that take into account the spatial interactions of the disaggregate series can improve forecasting performance. This occurs at the expense of estimating additional parameters. This tension naturally leads to the question of how much disaggregation is “optimal.”

We conducted a number of forecasting experiments along these lines. In general, we find that disaggregation can produce better forecasts. For example, by disaggregating a regional variable (the Eighth Federal Reserve District’s employment level) into states, we achieved a significant reduction in the MSPE versus the aggregate AR. Using the state level as the aggregate, however, yields less consistent results, which suggests that the exploitable regional interactions at the MSA level may not be sufficiently informative to overcome the increase in estimated parameters. We imagine that further disaggregation, perhaps to the county level, might increase this tension between exploitable spatial interactions and increased parameter uncertainty.

REFERENCES

Baird, Catherine A. “A Multiregional Econometric Model of Ohio.” Journal of Regional Science, November 1983, 23(4), pp. 501-15.

Ballard, Kenneth and Glickman, Norman J. “A Multiregional Econometric Forecasting System: A Model for the Delaware Valley.” Journal of Regional Science, August 1977, 17(2), pp. 161-77.

Conley, Timothy G.
and Molinari, Francesca. “Spatial Correlation Robust Inference with Errors in Location or Distance.” Journal of Econometrics, September 2007, 140(1), pp. 76-96.

Crow, Robert T. “A Nationally-Linked Regional Econometric Model.” Journal of Regional Science, August 1973, 13(2), pp. 187-204.

Duobinis, Stanley F. “An Econometric Model of the Chicago Standard Metropolitan Statistical Area.” Journal of Regional Science, August 1981, 21(3), pp. 293-319.

Giacomini, Raffaella and Granger, Clive W.J. “Aggregation of Space-Time Processes.” Journal of Econometrics, January 2004, 118(1/2), pp. 7-26.

Glickman, Norman J. “An Econometric Forecasting Model for the Philadelphia Region.” Journal of Regional Science, April 1971, 11(1), pp. 15-32.

Hansen, Peter R. “In-Sample Out-of-Sample Fit: Their Joint Distribution and Its Implications for Model Selection.” Unpublished manuscript, 2008.

Hendry, David F. and Hubrich, Kirstin. “Forecasting Economic Aggregates by Disaggregates.” Working Paper No. 589, European Central Bank, February 2006; www.ecb.eu/pub/pdf/scpwps/ecbwp589.pdf.

Hernández-Murillo, Rubén and Owyang, Michael T. “The Information Content of Regional Employment Data for Forecasting Aggregate Conditions.” Economics Letters, March 2006, 90(3), pp. 335-39.

LeSage, James P. and Magura, Michael. “Econometric Modeling of Interregional Labor Market Linkages.” Journal of Regional Science, August 1986, 26(3), pp. 567-77.

LeSage, James P. and Magura, Michael. “Using Bayesian Techniques for Data Pooling in Regional Payroll Forecasting.” Journal of Business and Economic Statistics, January 1990, 8(1), pp. 127-35.

Liu, Yih-wu and Stocks, Anthony H. “A Labor-Oriented Quarterly Econometric Forecasting Model of the Youngstown-Warren SMSA.” Regional Science and Urban Economics, August 1983, 13(3), pp. 317-40.
Owyang, Michael T.; Piger, Jeremy and Wall, Howard J. “Business Cycle Phases in U.S. States.” Review of Economics and Statistics, November 2005, 87(4), pp. 604-16.

Owyang, Michael T.; Piger, Jeremy and Wall, Howard J. “A State-Level Analysis of the Great Moderation.” Working Paper No. 2007-003D, Federal Reserve Bank of St. Louis, revised May 22, 2008. (Forthcoming in Regional Science and Urban Economics.)

Owyang, Michael T.; Piger, Jeremy M.; Wall, Howard J. and Wheeler, Christopher H. “The Economic Performance of Cities: A Markov-Switching Approach.” Working Paper No. 2006-056C, Federal Reserve Bank of St. Louis, revised January 21, 2007. (Forthcoming in Journal of Urban Economics.)

Rapach, David E. and Strauss, Jack K. “Forecasting Employment Growth in Missouri with Many Potentially Relevant Predictors: An Analysis of Forecast Combining Methods.” Federal Reserve Bank of St. Louis Regional Economic Development, 2005, 1(1), pp. 97-112; research.stlouisfed.org/publications/red/2005/01/RapachStrauss.pdf.

Rapach, David E. and Strauss, Jack K. “Forecasting Real Housing Price Growth in the Eighth District States.” Federal Reserve Bank of St. Louis Regional Economic Development, November 2007, 3(2), pp. 33-42; research.stlouisfed.org/publications/red/2007/02/Rapach.pdf.

The Economic Impact of a Smoking Ban in Columbia, Missouri: An Analysis of Sales Tax Data for the First Year

Michael R. Pakko

In January 2007, an ordinance took effect in Columbia, Missouri, banning smoking in all bars, restaurants, and workplaces. This paper analyzes data for sales tax collections at eating and drinking establishments from January 2001 through December 2007, including the first 12 months of the smoking ban. The analysis accounts for trends, seasonality, general business conditions, and weather.
The findings suggest that the smoking ban has been associated with statistically significant losses in sales tax revenues at Columbia’s bars and restaurants, with an average decline of approximately 3½ to 4 percent. Businesses that serve only food show no statistically significant effects of the smoking ban. Those that serve food and alcohol, or alcohol only, show significant losses with estimates in the range of 6½ to 11 percent (with the larger losses associated with bars). Some individual businesses within each category may have been unaffected, whereas others are likely to have incurred much greater losses. (JEL I18, D78, H11)

Federal Reserve Bank of St. Louis Regional Economic Development, 2008, 4(1), pp. 30-40.

In January 2007, the city of Columbia, Missouri, implemented a smoke-free ordinance, banning smoking in all public places, including bars and restaurants. This paper analyzes data on sales tax collections at bars and restaurants for the period before and after this smoking ban was implemented. The sample period covers the first year after the implementation of the new law.1

1 This paper represents an extension of my previous study (Pakko, 2007).

The enactment of laws restricting smoking in bars and restaurants has been a growing trend among states and municipalities around the nation. According to the American Nonsmokers’ Rights Foundation, 748 municipalities have provisions for 100 percent smoke-free environments in bars, restaurants, and workplaces. Of these, 555 require smoke-free restaurants and 426 require smoke-free bars.2 As more U.S. communities have adopted such laws, economic data have accumulated, allowing economists to better identify some of the economic costs of these restrictions.
2 These counts are as of July 1, 2008. See American Nonsmokers’ Rights Foundation (2008).

3 Scollo et al. (2003) provide a review of previous literature.

Michael R. Pakko is a research officer and economist at the Federal Reserve Bank of St. Louis. Joshua Byrge provided research assistance. The author thanks Laura Peveler, budget officer for the city of Columbia, for providing the data used in this analysis, and John Schultz of the Boone Liberty Coalition for providing information from a survey of bar and restaurant smoking policies in 2006.

© 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Figure 1: Sales Tax Revenues at Columbia Eating and Drinking Places (annual totals, percent change, 2002 to 2007).

A large body of early evidence on the economic impact of smoking bans, much of which was published in medical and public health journals, tended to find no statistically significant effects.3 This finding sometimes has been interpreted as demonstrating that there is no negative economic impact of smoke-free laws whatsoever. This interpretation is far too simplistic. Recent economic research has made it increasingly clear that there are significant economic effects, for some specific businesses, when 100 percent smoking bans are implemented.
The evidence suggests that economic costs are borne by businesses that tend to be frequented by smokers. Statistically significant costs have been identified for casinos and bars, in particular.4

One of the cities in the Eighth Federal Reserve District that recently adopted a smoking ban is Columbia, Missouri. Since January 9, 2007, all bars and restaurants in Columbia have been required to be smoke free. Only some sections of outdoor patios are exempt from the requirement.

Some local businesses continued to oppose Columbia’s smoke-free ordinance throughout its first year in effect. Petitions to repeal the law by ballot initiative were circulated, but the campaign was ultimately unsuccessful.5 According to local press reports, at least seven establishments cited the smoking ban as a factor in their decision to close their doors in 2007.6 The owner of one business was quoted as reporting a 40 percent drop in alcohol sales and a 20 to 30 percent drop in food sales over the first several months of the smoking ban.7

Although such reports are informative, they are anecdotal. A more thorough, systematic analysis of objective data is necessary to properly identify economic costs.

4 For a review of some recent economic research, see Pakko (2008a).

5 In November 2007, the petition drive fell short of gathering enough valid signatures.

6 See, for example, LeBlanc (2007) and Coleman (2007).

7 See Lynch (2007). The business, Otto’s Corner Bar and Grill, closed in late 2007, citing the smoking ban as a factor in its demise.

SALES TAX REVENUES AT ALL EATING AND DRINKING ESTABLISHMENTS

Data from the city of Columbia show a distinct decline in the growth rate of sales tax receipts at bars and restaurants (Figure 1). The total for 2007 was only 0.6 percent above that for 2006. Revenues over the previous four years had risen at an average rate of 7.4 percent. In 2006, the year preceding the implementation of the smoking ban, revenues were 8.1 percent higher than in the previous year.
The dramatic slowdown in sales tax revenues from eating and drinking establishments after the smoking ban was implemented is consistent with the anecdotal reports of revenue losses at Columbia bars and restaurants. However, a simple comparison of growth rates before and after the smoking ban is insufficient for drawing any firm conclusions.

This section reports findings from a more rigorous analysis of the data covering all of Columbia’s bars and restaurants. Using regression analysis to account for trends, seasonality, general business conditions, and weather, I find that the smoking ban has been associated with statistically significant losses in sales tax revenues. Point estimates indicate an average loss of approximately 3½ to 4 percent.8

8 The range of estimates in this paper represents slightly smaller losses than in my earlier, preliminary analysis of the data (Pakko, 2007). In the earlier paper, the total included establishments classified as “eating places only” and “eating and drinking places.” The new dataset also includes “drinking places—alcoholic beverages only.” Because the latter category is a very small component of the total (about 4 to 5 percent over the sample period), its inclusion has little impact on the empirical findings. The new estimates reflect the additional data that have accumulated during the second half of 2007.

Figure 2: Sales Tax Collected from All Eating and Drinking Establishments ($ thousands; monthly, 2001 to 2007; non-seasonally adjusted and seasonally adjusted series).

Sales Tax Data

The data series examined in this section consists of monthly sales tax revenues for all bars and restaurants in Columbia. Because no changes were made in tax rates over the sample period (January 2001–December 2007), sales tax revenues serve as
a direct proxy for sales. Total sales tax receipts also were obtained from the city of Columbia for use as a control variable for overall economic activity. The data are also disaggregated, allowing independent analysis of bars and restaurants (see “Analysis of Disaggregated Data” below).

Figure 2 shows a plot of the raw data for total bar and restaurant tax receipts, along with a series that has been seasonally adjusted using the Census X-12 ARIMA procedure. A cursory examination of the data shows an evident surge in growth during the latter part of 2005 and into early 2006. Growth slowed in late 2006 and turned negative for much of 2007. By December 2007, revenues were down 6 percent from a year earlier.

The appropriate question is not, however, whether growth in sales tax revenues has been positive or negative since the Columbia Smoke-Free Ordinance took effect, but whether the pattern is different from what it would have been in its absence. More formal statistical analysis is required to address this question.

Regression Analysis

To test the hypothesis of a significant effect of the Columbia smoking ban, I estimated a series of least-squares regressions. The dependent variable
of the regressions is the log of restaurant sales tax revenues. Each regression includes a constant and a time trend, in addition to a dummy variable representing the implementation of the smoking ban (which has a value of 0 before 2007 and 1 for January-December 2007). The full regression also includes controls for overall economic activity and for weather:

$$\ln(\mathit{DiningTax}_t) = \gamma\,\mathit{SmokingBan}_t + \beta_0 + \beta_1\,\mathit{TimeTrend}_t + \beta_2 \ln(\mathit{OtherTax}_t) + \beta_3\,\mathit{Snowfall}_t + u_t .$$

The variable OtherTax is the total amount of sales taxes collected by the city of Columbia from sources other than food and beverage sales at bars and restaurants (non-dining tax receipts).

Table 1
Regression Results for All Eating and Drinking Establishments

Variable                  (1a)         (1b)         (2a)         (2b)         (3a)         (3b)
Smoking ban               -0.0523***   -0.0518***   -0.0364***   -0.0376***   -0.0365***   -0.0403***
                          (0.0176)     (0.0157)     (0.0098)     (0.0091)     (0.0091)     (0.0091)
Constant                  11.6432***   11.7693***   5.5311***    6.1317***    6.6745***    7.3420***
                          (0.0120)     (0.0072)     (1.5513)     (1.6131)     (1.3621)     (1.3576)
Time trend                0.0056***    0.0056***    0.0038***    0.0040***    0.0042***    0.0044***
                          (0.0002)     (0.0002)     (0.0005)     (0.0005)     (0.0004)     (0.0004)
Non-dining tax revenues                             0.4423***    0.4051***    0.3585***    0.3178***
                                                    (0.1122)     (0.1158)     (0.0986)     (0.0975)
Snowfall                                                                      -0.0049***   -0.0033***
                                                                              (0.0014)     (0.0011)
AR(1) coefficient         0.2522*      0.2255*      0.1078       0.0674       0.0778       0.0915
                          (0.1313)     (0.1340)     (0.1135)     (0.1092)     (0.1252)     (0.1281)
Seasonally adjusted data  No           Yes          No           Yes          No           Yes
Seasonal dummy variables  Yes          No           Yes          No           Yes          No
Adjusted R-squared        0.9642       0.9636       0.9728       0.9709       0.9766       0.9739

NOTE: *, **, and *** denote significance at 10, 5, and 1 percent, respectively. Standard errors are in parentheses. The dependent variable for all equations is the log of dining-sector tax revenue. Regressions labeled (a) use data that are not seasonally adjusted, whereas those labeled (b) use data that are adjusted using the Census X-12 ARIMA procedure.
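As a rough illustration of how a specification of this form identifies the smoking-ban coefficient, the sketch below fits a stripped-down version by ordinary least squares. It is not the estimation code used in the paper: the data are synthetic and noise-free, the parameter values (`TRUE_GAMMA`, `TRUE_B0`, `TRUE_B1`) are hypothetical, and the seasonal controls, the OtherTax and Snowfall regressors, the AR(1) error term, and the Newey-West adjustment are all omitted for brevity.

```python
# Minimal sketch: OLS on SYNTHETIC data, not the Columbia tax series.

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# 84 months: month 1 is January 2001, month 73 is January 2007.
months = list(range(1, 85))
ban = [1.0 if t >= 73 else 0.0 for t in months]

# Hypothetical "true" parameters used only to generate the synthetic series.
TRUE_GAMMA, TRUE_B0, TRUE_B1 = -0.04, 11.6, 0.005
log_dining_tax = [TRUE_B0 + TRUE_B1 * t + TRUE_GAMMA * d for t, d in zip(months, ban)]

# Regressors in the order: smoking-ban dummy, constant, time trend.
X = [[d, 1.0, float(t)] for t, d in zip(months, ban)]
gamma_hat, b0_hat, b1_hat = ols(X, log_dining_tax)
print(f"gamma = {gamma_hat:.4f}, constant = {b0_hat:.4f}, trend = {b1_hat:.4f}")
```

Because the synthetic series contains no noise, the fitted coefficients reproduce the assumed parameters; with real data the dummy coefficient would instead capture the average shift in log revenue after January 2007, conditional on the other regressors.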
To control for the influence of adverse weather, the full specification also includes the variable Snowfall, which is entered as the deviation of actual monthly snowfall from historic averages. The focus of the analysis is the coefficient on the smoking-ban dummy variable ($\gamma$). All regressions include a first-order autoregressive error term, $u_t = \rho u_{t-1} + \varepsilon_t$ (although the autoregressive coefficient is not significant in many of the regressions). Estimation uses ordinary least squares regression with standard errors adjusted for general autocorrelation and heteroskedasticity using the Newey-West (1987) procedure.

Baseline Specification. The results of a naive baseline specification, which includes only the smoking ban dummy variable, a constant, and a time trend (plus the autoregressive error term), are shown in the first two columns of Table 1. Regression (1a) uses the non-seasonally adjusted data for the dependent variable and includes a set of monthly dummy variables to account for seasonal patterns (coefficient estimates not reported). Regression (1b) uses the seasonally adjusted data. Each of these basic regressions suggests a highly statistically significant decline in tax revenues associated with the implementation of the smoking ban. Point estimates for the coefficients on the smoking ban dummy variable indicate an average decline of approximately 5 percent.9

9 The coefficient estimates on the dummy variable can be interpreted (approximately) as percentage changes.

[Figure 3: Sales Tax Collected from Non-Dining Establishments. Monthly series in $ thousands, 2001-2007, non-seasonally adjusted and seasonally adjusted.]

Controlling for General Business Conditions. Although these initial estimates control for general trends and seasonality in the data, other factors could be associated with the decline in restaurant tax revenues.
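The approximate percentage reading of the dummy coefficients mentioned in footnote 9 can be compared with the exact transformation. When the dependent variable is in logs, a dummy coefficient γ implies a proportional change of exp(γ) − 1. The short sketch below applies that standard formula to the smoking-ban coefficients reported in Table 1; the function name `dummy_pct_change` is an illustrative choice, not something taken from the paper.

```python
import math

def dummy_pct_change(gamma: float) -> float:
    """Exact percent change implied by a dummy coefficient in a log-linear model."""
    return 100.0 * (math.exp(gamma) - 1.0)

# Smoking-ban coefficients as reported in Table 1, columns (1a) through (3b).
table1_gamma = {
    "1a": -0.0523, "1b": -0.0518,
    "2a": -0.0364, "2b": -0.0376,
    "3a": -0.0365, "3b": -0.0403,
}

for label, gamma in table1_gamma.items():
    approx = 100.0 * gamma            # footnote 9's approximate reading
    exact = dummy_pct_change(gamma)   # exact transformation
    print(f"({label}) approximate {approx:6.2f}%   exact {exact:6.2f}%")
```

For coefficients this small the two readings differ by only a fraction of a percentage point, which is why the approximation in footnote 9 is adequate here.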
In fact, the data suggest an overall decline in non-dining retail sales in Columbia that is unlikely to be associated with the smoking ban. Subtracting dining tax receipts from data for total sales tax receipts yields a measure of non-dining tax receipts. Figure 3 shows this measure of non-dining sales tax receipts on both a seasonally adjusted and non-seasonally adjusted basis. A clear slowdown in 2006 and 2007 roughly corresponds with the timing of the slowdown in tax receipts at restaurants and bars. Non-dining tax receipts showed some recovery in early 2007 but sagged through the rest of the year. Overall yearly revenues were flat: the total for 2007 was 0.16 percent lower than in 2006. As of December 2007, non-dining sales tax revenues were down approximately 4.7 percent from a year earlier.

Regressions (2a) and (2b) add the (logged) non-dining revenue variable to the baseline specification to control for this slowdown in business activity. Regression (2a) includes the non-seasonally adjusted measure, whereas regression (2b) uses the seasonally adjusted version. In both cases, the coefficient on non-dining tax revenue is positive and highly significant. The addition of this factor does, in fact, account for some of the slowdown in dining tax revenues: Point estimates for losses associated with the smoking ban are smaller than in the baseline specification. Nevertheless, the coefficients on the smoking ban dummy variable are still highly significant, with point estimates indicating a decline of more than 3½ percent. These results indicate that the slowdown in dining tax receipts is partly related to a slowdown in overall economic activity, but the decline in revenues at bars and restaurants is greater than past patterns would predict.10

Controlling for Weather.
Another factor that can be particularly important for revenues at bars and restaurants (for obvious reasons) is inclement weather.11 Figure 4 shows the average monthly snowfall for Columbia compared with actual snowfall over the sample period.12 The low snowfall totals during the winter of 2006-07 clearly represent a departure from average weather conditions. These relatively mild winter conditions might help explain the apparent surge in dining tax revenues during that period. In contrast, the relatively heavy snowfall near the end of 2007 might be associated with slower business at bars and restaurants.

10 The 2008 budget report for the city of Columbia also indicates that dining and entertainment sectors are lagging the rest of the local economy: “General retail sales remain steady, however the current trend indicates the home improvement/construction and dining and entertainment sectors are declining” (City of Columbia, 2007).

11 Adams and Cotti (2007) find that changes in restaurant employment after the implementation of smoking bans in warm-weather states differ from those in cold-weather states. They speculate that the difference might be related to the feasibility of providing outdoor seating areas where smoking might be permitted. Pakko (2008b) finds that a severe snowstorm on the East Coast had a significant effect on gambling revenues in Delaware after the implementation of a smoking ban in that state.

[Figure 4: Average and Actual Snowfall, Columbia, MO. Monthly snowfall in inches, 2001-2007, actual and average.]

Regressions (3a) and (3b) add this consideration to the analysis, introducing a variable that is equal to the difference between actual and average snowfall (in inches). The coefficient on this snowfall variable is of the expected sign, and it is statistically significant.
The point estimate indicates that one inch of snowfall in excess of the average tends to lower sales tax revenues by 0.3 percent (in the seasonally adjusted specification) to 0.5 percent (in the non-seasonally adjusted regression). The addition of the snowfall variable improves the overall fit of the model, but it has little impact on the significance of the smoking ban dummy variable. There remains a highly significant downturn beginning in January 2007, measuring approximately 3½ to 4 percent.13

12 Average snowfall is calculated for the period 1971-2000 (National Oceanic and Atmospheric Administration).

13 Although these estimates are lower than in my preliminary analysis (Pakko, 2007), the difference between the new estimates and the previous estimate of 5 percent is not statistically significant.

A Specification Test. The association of the smoking ban dummy variable with the Columbia Smoke-Free Ordinance in the reported regressions relies on the timing of its adoption. It is possible for a dummy variable to indicate statistically significant effects even if the restaurant sales slowdown began either before or after the implementation of the smoking ban. To test whether the dummy variable is accurately identifying the effects of the smoking ban and not an independent, unidentified factor, the regression specifications in (3a) and (3b) were reestimated using alternative dummy variables to evaluate the timing of the downturn more carefully.14 Possible breakpoints from July 2006 through June 2007 were considered. Figure 5 shows the adjusted R-squared statistics from these regressions. For both methods of seasonal controls, the results show that the dummy variable specifying a breakpoint of January 2007 provides the best model fit. These results suggest that January 2007 does, indeed, represent the relevant breakpoint in the data series on bar and restaurant sales tax revenues.

14 Regressions (3a) and (3b) were reestimated using alternative dummy variables that have a value of 1 for all months after and including a particular starting month and a value of 0 for all previous months.

[Figure 5: Adjusted R-Squared Statistics for Different Breakpoints. Breakpoints from Jul 06 through Jun 07, non-seasonally adjusted and seasonally adjusted specifications.]

Analysis of Disaggregated Data

In addition to sales tax data for the total bar and restaurant sector of Columbia, I requested and received data on sales tax revenues for three subsets of the total, along with listings of the specific businesses that fall within each category. The designations correspond roughly to the following SIC codes:

• Group 1 (SIC code 5811): “Eating Places Only”
• Group 2 (SIC code 5812): “Eating and Drinking Places”
• Group 3 (SIC code 5813): “Drinking Places: Alcoholic Beverages”

The categories are not precisely distinguished; business owners select their own category when filing their tax statements. Undoubtedly, some classifications are questionable. Nevertheless, the three categories are distinguished by the types of businesses prevalent on each list. Group 1 includes fast-food and take-out restaurants, coffeehouses, and many common sit-down restaurants. Group 2 includes restaurants that might be commonly categorized as “bar and grill” establishments, as well as many common sit-down restaurants. The restaurants in group 2 are more likely to have separate bar areas than those in group 1.
Group 3, the smallest category, primarily includes establishments that would be commonly classified as “bars.”

Figure 6 shows the data series (seasonally adjusted and non-seasonally adjusted) for each of the three groups. Group 2 is the largest of the three, accounting for approximately 61 percent of the total over the sample period. Group 1 accounts for just over one-third (34 percent), while group 3 accounts for only about 5 percent. Over time, the share of total tax revenues for group 1 establishments has been rising slightly (reaching 35 percent in 2007), and the share from group 3 has been falling (4 percent in 2007).

[Figure 6: Tax Revenues by Type of Establishment. Three panels (Eating Places Only; Eating and Drinking Places; Drinking Places: Alcoholic Beverages), each in $ thousands, 2001-2007, non-seasonally adjusted and seasonally adjusted.]

The Columbia Smoke-Free Ordinance is likely to have affected these three categories of businesses differently. Previous research has suggested that the impact on bars differs from the impact on restaurants. For example, both Adams and Cotti (2007) and Phelps (2006) use data from the Bureau of Labor Statistics to identify significant effects on bar employment but find no significant effect for restaurants as a separate category.

One relevant distinction among businesses in these categories is that they may have differed in
their smoking policies before enactment of the smoking ban.

Table 2
Disaggregated Regression Results

                          Non-seasonally adjusted data           Seasonally adjusted data
Variable                  Group 1      Group 2      Group 3      Group 1      Group 2      Group 3
Smoking ban               0.0107       -0.0642***   -0.1102***   0.0008       -0.0671***   -0.1074***
                          (0.0161)     (0.0120)     (0.0312)     (0.0180)     (0.0124)     (0.0287)
Constant                  6.1855***    6.2645***    3.5898       6.9832***    7.1419***    4.7455
                          (1.5714)     (1.2468)     (3.3697)     (1.5918)     (1.2459)     (3.2460)
Time trend                0.0042***    0.0045***    0.0010       0.0045***    0.0048***    0.0012
                          (0.0005)     (0.0004)     (0.0011)     (0.0005)     (0.0004)     (0.0010)
Non-dining tax revenues   0.3137***    0.3526***    0.3751       0.2655**     0.2962***    0.2980
                          (0.1138)     (0.0903)     (0.2440)     (0.1144)     (0.0896)     (0.2333)
Snowfall                  -0.0046***   -0.0047***   -0.0038      -0.0022      -0.0041***   -0.0024
                          (0.0018)     (0.0014)     (0.0039)     (0.0014)     (0.0011)     (0.0029)
AR(1) coefficient         0.3334***    0.2807***    0.2422**     0.4114***    0.3197***    0.2103**
                          (0.1028)     (0.1060)     (0.1046)     (0.0984)     (0.1055)     (0.1052)
Seasonally adjusted data  No           No           No           Yes          Yes          Yes
Seasonal dummy variables  Yes          Yes          Yes          No           No           No
Adjusted R-squared        0.9572       0.9707       0.6863       0.9536       0.9700       0.4008

NOTE: *, **, and *** denote significance at 10, 5, and 1 percent, respectively. Standard errors are in parentheses. Regressions in each panel are estimated simultaneously using the technique of Seemingly Unrelated Regressions. The dependent variable for each equation is the log of tax revenue for a subset of the bar and restaurant sector. Group 1 includes food only, Group 2 includes food and beverage establishments, and Group 3 includes those businesses that serve only beverages. Regressions in the “Non-seasonally adjusted data” columns use data that are not seasonally adjusted, whereas those in the “Seasonally adjusted data” columns use data that are adjusted using the Census X-12 ARIMA procedure.
If few businesses within a category were affected by the new law, it is unlikely that a significant effect would be found in the data. If many businesses had to change their policies, the impact of the smoking ban might be more distinct.

To examine the importance of this factor, the list of businesses in each category was cross-referenced against a list of bar and restaurant smoking policies compiled by the Boone Liberty Coalition (BLC) before enactment of the smoking ban.15 Many of the businesses on the sales tax list were not covered by the BLC survey, including those that had gone out of business before mid-2006 and those that have newly opened since that time. In fact, more than half of the listed establishments were in these unclassified categories. A clear pattern is evident, however, in those covered in the survey: Among restaurants in group 1, only 18 percent permitted indoor smoking before the smoking ban was enacted. For businesses in group 2, 56 percent allowed smoking, while for group 3, 71 percent did.16

Regressions of the same general form as reported in Table 1 were estimated separately for the three subsectors. Using both the non-seasonally adjusted and seasonally adjusted data, three-equation systems were estimated using the technique of seemingly unrelated regressions. This technique allows for possible correlation among the residuals of the three equations (a distinct possibility in this case). In addition, it allows for testing cross-equation restrictions.

15 The BLC was active in opposition to the enactment of the Columbia smoking ban. They circulated a report (Boone Liberty Coalition, 2006) indicating that nearly two-thirds of Columbia’s restaurants had smoke-free policies before the ban was adopted.
16 Businesses that allowed smoking on patios before the ban are not counted in the totals for smoking permitted, since the Columbia Smoke-Free Ordinance included an exemption that allowed for some smoking sections to remain in outdoor seating areas.

Not surprisingly, estimated effects of the smoking ban differed among these three groups. The results of regression equations for the three groups are reported in Table 2. Both non-seasonally adjusted and seasonally adjusted data are shown. The results are similar for each technique. For the restaurants in group 1, there is no statistically significant effect associated with the smoking ban. For businesses in group 2, the impact is negative and highly statistically significant. The point estimates suggest losses of about 6½ percent. For the bars in group 3, the small sample size means that there is more noise in the data, so the fit of the regression equation is much weaker.17 Nevertheless, the coefficient on the smoking ban dummy variable is highly significant, with the estimates suggesting losses of nearly 11 percent.

Wald test statistics (reported in Table 3) were calculated for testing the significance of the cross-equation differences in the smoking ban coefficients. The coefficients on the smoking ban dummy variable in the equations for groups 2 and 3 were each significantly different from the coefficient estimated for group 1.

Table 3
Wald Tests for Equality of Smoking Ban Coefficients Across Equations

                     Non-seasonally adjusted data     Seasonally adjusted data
Test                 Chi-square (1)   Probability     Chi-square (1)   Probability
Group 1 = Group 2    18.8373          0.0000          13.7525          0.0002
Group 1 = Group 3    12.4516          0.0004          10.9588          0.0009
Group 2 = Group 3    2.5268           0.1119          2.3193           0.1278
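The probabilities in Table 3 follow directly from the reported chi-square statistics, each of which has one degree of freedom. As a hedged check (a sketch, not the software used in the paper), the upper-tail probability can be computed from the standard normal relationship P(χ² > x) = erfc(√(x/2)); the function name `chi2_1df_pvalue` is an illustrative choice.

```python
import math

def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail probability for a chi-square statistic with 1 degree of freedom."""
    # If Z is standard normal, Z**2 is chi-square(1), so
    # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

# Wald statistics as reported in Table 3.
# NSA = non-seasonally adjusted data, SA = seasonally adjusted data.
table3 = {
    ("Group 1 = Group 2", "NSA"): 18.8373,
    ("Group 1 = Group 3", "NSA"): 12.4516,
    ("Group 2 = Group 3", "NSA"): 2.5268,
    ("Group 1 = Group 2", "SA"): 13.7525,
    ("Group 1 = Group 3", "SA"): 10.9588,
    ("Group 2 = Group 3", "SA"): 2.3193,
}

for (test, data), stat in table3.items():
    p = chi2_1df_pvalue(stat)
    print(f"{test}  [{data}]  chi-square = {stat:8.4f}  p = {p:.4f}")
```

Only the two tests comparing group 2 with group 3 produce probabilities above conventional significance thresholds, matching the pattern in the table.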
However, because of the relatively large standard errors for the group 3 estimates, the hypothesis that the effect on group 2 and group 3 businesses was the same could not be rejected at standard levels of statistical significance.18

17 Although neither the time trend nor the other tax revenues variable is individually significant in these regressions, the two variables are jointly significant (p-value < 0.001), and together account for much of the explanatory power of the equation.

18 In a regression equation estimated using the (logged) sum of tax revenues for group 2 and group 3 businesses as the dependent variable (full results not reported), the coefficient on the smoking ban dummy variable was found to be -0.065 for the non-seasonally adjusted data and -0.068 for a regression using seasonally adjusted data.

DISCUSSION AND CONCLUSION

The results reported in this paper indicate statistically significant losses to bar and restaurant sales tax revenues following the implementation of the Columbia Smoke-Free Ordinance in January 2007. After accounting for trends, seasonality, an overall downturn in retail sales, and an unusually harsh winter, there remains a 3½ to 4 percent loss in dining tax revenues associated with the smoking ban.

The effects of the smoking ban vary for different types of businesses. Restaurants that primarily serve only food show no significant effect, whereas bars and restaurants with bars show significantly greater losses. For the latter categories, losses are estimated to be in the range of 6½ to 11 percent.

It is important to note that the point estimates identify only average losses. Many businesses are likely to have been unaffected (e.g., take-out businesses, fast-food franchises, and other restaurants that already had smoke-free policies). Accordingly, some businesses are likely to have incurred losses that are far greater than the average.
Anecdotal reports from specific business owners suggesting losses in the range of 30 percent do not seem unreasonable.

One interesting feature of the Columbia experience is the response of restaurant owners to the patio exemption. According to the Columbia Missourian, owners of at least two bars are building or planning outdoor patio expansions. One owner was quoted as saying, “You have to have a patio to survive.”19 These renovations may help offset losses in sales revenue at these establishments, but the associated expenses also represent profit losses above and beyond the measured declines in revenues.

19 Solberg (2007), Greaney (2007).

Measuring the economic effects of smoking bans can sometimes be difficult. For the case of Columbia, Missouri, this analysis of data on sales tax revenues indicates that losses are of a magnitude that is clearly identifiable and statistically significant.

REFERENCES

Adams, Scott and Cotti, Chad D. “The Effect of Smoking Bans on Bars and Restaurants: An Analysis of Changes in Employment.” The B.E. Journal of Economic Analysis & Policy, February 8, 2007, 7(1); www.bepress.com/bejeap/vol7/iss1/art12.

American Nonsmokers’ Rights Foundation. “Municipalities with Local 100% Smokefree Laws?” July 1, 2008; www.no-smoke.org/pdf/100ordlisttabs.pdf.

Boone Liberty Coalition. “Proposed Smoking Ordinance Position Paper.” Unpublished manuscript, June 9, 2006; booneliberty.org/StopTheBan/BooneLibertySmokingBan.pdf.

City of Columbia, Missouri. “FY 2008 Adopted Budget.” October 4, 2007; www.gocolumbiamo.com/Finance/Services/Financial_Reports/FY2008/index.php.

Coleman, Kevin. “Ban Leaves Billiards Behind the Eight Ball.” Columbia Tribune, June 16, 2007; archive.columbiatribune.com/2007/jun/20070616busi003.asp.

Greaney, T.J.
“In Smoking Ban Era, Patios a Hot Commodity.” Columbia Tribune, July 9, 2007; archive.columbiatribune.com/2007/jul/20070709news051.asp.

LeBlanc, Matthew. “Smoking Ban Fighters Fade in Their Effort: Petition Drive Has Grown ‘Apathetic.’” Columbia Tribune, April 26, 2007; archive.columbiatribune.com/2007/apr/20070426news007.asp.

Lynch, Andrew. “Petition to End Smoking Ban Awaits Signature Verification.” KBIA News, November 6, 2007; publicbroadcasting.net/kbia/news.newsmain?action=article&ARTICLE_ID=1178464&sectionID=1.

National Oceanic and Atmospheric Administration. “Climatological Data for St. Louis and Columbia.” www.crh.noaa.gov/lsx/?n=cli_archive.

Newey, Whitney K. and West, Kenneth D. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica, May 1987, 55(3), pp. 703-8.

Pakko, Michael R. “The Economic Impact of a Smoking Ban in Columbia, Missouri: A Preliminary Analysis of Sales Tax Data.” Federal Reserve Bank of St. Louis Center for Regional Economics CRE8 Occasional Report No. 2007-02, December 11, 2007; research.stlouisfed.org/regecon/op/CRE8OP-2007002.pdf.

Pakko, Michael R. “Clearing the Haze: New Evidence on the Economic Impact of Smoking Bans.” Federal Reserve Bank of St. Louis Regional Economist, January 2008a, pp. 10-11; stlouisfed.org/publications/re/2008/a/pages/smoking-ban.html.

Pakko, Michael R. “No Smoking at the Slot Machines: The Effect of Smoke-Free Laws on Gaming Revenues.” Applied Economics, July 2008b, 40(14), pp. 1769-74.

Phelps, Ryan. “The Economic Impact of 100% Smoking Bans” in Kentucky Annual Economic Report 2006. Lexington, KY: Center for Business and Economic Research, Gatton College of Business and Economics, University of Kentucky, 2006, pp. 31-34; gatton.uky.edu/CBER/Downloads/Phelps-06.pdf.

Scollo, Michelle; Lal, Anita; Hyland, Andrew and Glantz, Stanton.
“Review of the Quality of Studies on the Economic Effects of Smoke-free Policies on the Hospitality Industry.” Tobacco Control, March 2003, 12(1), pp. 13-20; www.tobaccoscam.ucsf.edu/pdf/scollotc.pdf.

Solberg, Christy. “Effects of Smoking Ban Still Debated.” Columbia Missourian, September 27, 2007; www.columbiamissourian.com/stories/2007/09/27/effects-smoking-ban-still-debated/.

Urban Decentralization and Income Inequality: Is Sprawl Associated with Rising Income Segregation Across Neighborhoods?

Christopher H. Wheeler

Existing research shows an inverse relationship between urban density and the degree of income inequality within metropolitan areas, which suggests that as urban areas spread out, they become increasingly segregated by income. This paper examines this hypothesis using data covering more than 165,000 block groups within 359 U.S. metropolitan areas for the years 1980, 1990, and 2000. The findings indicate that income inequality (defined by the variance of the log household income distribution) does indeed rise significantly as urban density declines. This increase, however, is associated with rising inequality within block groups as cities spread farther from their central core. The extent of income variation between different block groups, by contrast, shows virtually no association with population density. Accordingly, little evidence supports the notion that urban sprawl is systematically associated with greater residential segregation of households by income. (JEL D31, R11, R23)

Federal Reserve Bank of St. Louis Regional Economic Development, 2008 4(1), pp. 41-57.

For much of the past century, the population within U.S. metropolitan areas has shown a persistent tendency to move outward as residents leave central cities for suburban locales. This movement has been striking within the past 50 years.
In 1950, 41.5 percent of metropolitan populations resided in suburban areas (i.e., those outside central cities); a half century later, more than 62 percent did. As a consequence, the density of population within the nation’s urban areas has changed dramatically. Between 1950 and 2000, the average central-city population density decreased from 7,517 residents per square mile to 2,716. At the same time, suburban densities increased from 175 to 208 residents per square mile.1

1 All of these figures are derived from the U.S. Census of Population and Housing, as reported by Hobbs and Stoops (2002).

Christopher H. Wheeler was a research officer at the Federal Reserve Bank of St. Louis at the time this article was written. © 2008, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

Undoubtedly, urban decentralization largely reflects the decisions of individuals and employers to expand their activities over more space. Improved transportation technology and infrastructure, for example, have made longer commutes easier. These changes have encouraged workers and firms to locate on the outer fringes of their metropolitan areas where land tends to be more plentiful and less costly.

Despite the “voluntary” nature of this process, urban decentralization has generated several concerns about the welfare of metropolitan area populations. One such concern is a rising disparity between neighborhoods, especially the decline of incomes in central cities relative to those of their suburban counterparts. As metropolitan areas expand, the majority of both employment opportunities and relatively high-income households may shift from the central core to the periphery, thereby creating a widening income gap between these two areas. Over time, these differences may become more pronounced as the poor become increasingly isolated from productive interactions with wealthier neighbors.2

Existing evidence seems to support this idea. Margo (1992), for example, argues that the movement of metropolitan populations in the United States toward suburban locales over the latter half of the twentieth century can be linked, to a significant degree, to the rise in personal incomes. As individual incomes increased, so did the demand for land. One rather straightforward implication of this hypothesis is that decentralization should be accompanied by a rise in the extent of income segregation. Individuals migrating to the suburbs (i.e., those with a particularly high demand for space) should also be those with relatively high incomes. As a result, urban decentralization would be expected to lead to the accumulation of high-income households on the outskirts of cities, while poorer residents remain within the central cores.

A number of studies do suggest that poverty became more concentrated within the country’s urban areas over this same period. Mayer (1996) reports that in 1964, families in the bottom quintile of the income distribution were 1.2 times as likely to reside in a central city as wealthier families. By 1994, they were 1.4 times as likely to reside in central cities. In studies of the largest U.S.
cities and metropolitan areas, Kasarda (1993) and Abramson, Tobin, and VanderGoot (1995) find that individuals living in poverty became increasingly concentrated within poor neighborhoods (defined by Census tracts) between 1970 and 1990. Although these two particular studies do not consider the issue of urban decentralization per se, the figures documented therein certainly characterize a period during which metropolitan populations were shifting from central areas toward suburban ones.

2 The movement of high-income individuals away from the poor, for example, may leave the poor with relatively few jobs (e.g., Kain, 1968) or reduce the extent to which the rich confer positive spillovers on the poor (e.g., Wilson, 1987, and Benabou, 1996).

Research on the spatial mismatch hypothesis offers a similar conclusion. This idea, advanced by Kain (1968), holds that inner-city residents tend to experience adverse economic outcomes as population and employment opportunities leave those inner cities because it becomes increasingly difficult for them to find and sustain employment. Therefore, the gap between the incomes earned by residents of suburban neighborhoods and those earned by residents of the central city should be expected to rise as populations spread out. Many studies on this topic have found that inner-city minorities do seem to experience worse labor market outcomes, usually measured by employment status and earnings, as economic activity leaves urban centers, although the literature is far from unanimous on this point.3

On the specific topic of income inequality, Wheeler (2004) finds that urban density exhibits a strong negative correlation with the degree of spread in the distribution of labor earnings. Thus, as a metropolitan area’s population spreads out, its wage distribution tends to widen.
Although the results apply to white male workers with a strong attachment to the labor force (and so do not offer direct evidence on spatial mismatch, which tends to focus on differences by race), they certainly are consistent with the concept that urban decentralization leads to greater segregation of high-income and low-income workers across neighborhoods.

Despite the findings of existing work, surprisingly little research has directly studied the evolution of interneighborhood income differentials as populations become increasingly dispersed, particularly among neighborhoods defined at levels finer than central cities and suburbs. A notable exception is Yang and Jargowsky (2006), who look at the relationship between sprawl and a neighborhood segregation index based on urban tracts in the United States between 1990 and 2000.

This paper performs a related, although different, exercise. In particular, I examine the relationship between urban density and the degree of income inequality both within and between neighborhoods defined by Census block groups. More specifically, I use data on household income to compute the variance of the income distribution for each of 359 U.S. metropolitan areas for the years 1980, 1990, and 2000. I then exploit data covering more than 165,000 block groups to decompose these variances into components associated with the dispersion of incomes within block groups and components associated with the dispersion across them.

3 See, for example, Ihlanfeldt and Sjoquist (1989), Holzer (1991), and Weinberg (2000, 2004) for a discussion of these issues.
The results suggest that even though a strong negative association exists between the variance of a metropolitan area’s household income distribution and its overall population density, the association operates through a within-neighborhood channel rather than a between-neighborhood channel. That is, as the population of a metropolitan area spreads out, household income inequality increases largely because the extent of income variation among households within the same block group rises, not because neighborhoods become more segregated by income.

On closer inspection, the data do reveal some evidence that decentralization tends to be accompanied by rising between-neighborhood income gaps, but this occurs only at the top of the block group income distribution. Specifically, the income differential between the block group at the 90th percentile of the household income distribution and the block group at the median does increase significantly as metropolitan areas decentralize. However, the gap between the median and the block group at the 10th percentile tends to decrease, which leaves measures of the overall spread in the between-neighborhood income distribution relatively unchanged. Moreover, there appears to be little association between density and either the average income of the block group at the 90th percentile or that of the block group at the 10th percentile. Similar results hold when the analysis is repeated using Census tracts instead of block groups.

Notably, these results should not be interpreted as suggesting that certain neighborhoods do not experience particularly adverse economic outcomes as populations decentralize. Some inner-city areas indeed may become increasingly poor as activity moves outward. However, the extent to which this process occurs evidently has little effect on the overall level of between-neighborhood income inequality in a metropolitan area.

The remainder of the paper proceeds as follows.
The next section provides a brief description of the data and some of the computational issues. The results section is followed by concluding remarks.

DATA AND MEASUREMENT

The primary data source used for the analysis is the decennial U.S. Census of Population and Housing for the years 1980, 1990, and 2000 as compiled by GeoLytics.4 The GeoLytics data files report a variety of demographic and economic characteristics (e.g., income, industry of employment, age, race, gender, education, place of birth, employment-unemployment status) for individuals at a variety of geographic levels, including counties, tracts, and block groups. Unfortunately, individual-level observations are not reported in the data; only summary measures taken across the individuals located within each geographic unit are reflected. This feature thereby limits the types of statistics that can be calculated. The primary advantage of these data is the consistency of the geographic units: the data have been constructed based on consistent geographic definitions over all three Census years.

This study focuses on average household income and a variety of other economic and demographic data among residents in block groups, which are used as the basis for a “neighborhood.” Although neighborhoods could also be (and frequently are) defined by Census tracts, the focus is on block groups in this paper because they represent the finest grouping available in the data. Across the 359 metro areas in the sample, there are more than 165,000 block groups that each contained, on average, 526.5 households and had a median land area of approximately 0.33 square miles in the year 2000.5 Tracts tend to be larger (1,648.8 households, on average, and a median land area of 1.31 square miles in 2000), and therefore, they may be less appropriate when considering neighborhoods, which are meant to encompass areas over which individuals can reasonably be expected to interact with one another.
As demonstrated below, the principal findings are mostly invariant to the choice of block groups or tracts.

4 The data can be obtained from GeoLytics, Inc. at http://www.geolytics.com.

5 Metropolitan area definitions follow the Census Bureau’s definitions as of November 2004. They were accessed at www.census.gov/population/www/estimates/metrodef.html.

Table 1
Summary Statistics: Block Group Income Inequality

Year  Variable           Mean  Standard deviation  Minimum  Maximum
1980  Variance           0.55  0.06                0.43     0.75
      Within component   0.47  0.05                0.37     0.64
      Between component  0.07  0.04                0.003    0.24
1990  Variance           0.64  0.07                0.48     0.94
      Within component   0.50  0.05                0.39     0.65
      Between component  0.14  0.05                0.04     0.31
2000  Variance           0.65  0.08                0.48     1.05
      Within component   0.52  0.05                0.41     0.70
      Between component  0.13  0.05                0.02     0.38

NOTE: Statistics taken across 359 metropolitan areas.

I estimate the variance of a metropolitan area’s income distribution as follows. For each year, the number of households with incomes falling into each of N closed intervals is reported in the GeoLytics files.6 I use these figures to compute the fraction of households with incomes less than N distinct levels, which allows N quantiles of the household income distribution to be estimated for each metro area. For example, if 14 percent of all households have income less than $25,000, I estimate the 0.14 quantile by 25,000. Label these quantiles X_α. I then match these N quantiles to their corresponding values from a normal (0,1) distribution. Label these quantiles U_α. Assuming a lognormal household income distribution, X_α and U_α are related as follows:

$$X_{\alpha} = \exp\left(\zeta + U_{\alpha}\,\sigma\right), \tag{1}$$

where ζ and σ are the mean and standard deviation (SD) parameters characterizing the lognormal distribution (see Johnson and Kotz, 1970, p. 117).
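Taking logs of equation (1) gives ln X_α = ζ + σ U_α, a straight line whose intercept and slope are the two lognormal parameters. The following is a minimal Python sketch of that fit, not the author's code. The function name `fit_lognormal_from_bins`, the income cutoffs, and the cumulative shares are all hypothetical; the shares are generated from a known lognormal only so that the recovered parameters can be checked.

```python
import math
from statistics import NormalDist


def fit_lognormal_from_bins(cutoffs, cum_shares):
    """Estimate (zeta, sigma) of a lognormal from binned household counts.

    cutoffs    -- income levels X_alpha (upper bounds of the reported intervals)
    cum_shares -- fraction alpha of households with income below each cutoff
    Both argument names are hypothetical. The regression fitted here is
    ln X_alpha = zeta + sigma * U_alpha, estimated by simple OLS.
    """
    std_normal = NormalDist()  # the N(0, 1) reference distribution
    # Shares of exactly 0 or 1 have no finite normal quantile, so skip them.
    points = [
        (std_normal.inv_cdf(p), math.log(x))
        for x, p in zip(cutoffs, cum_shares)
        if 0.0 < p < 1.0
    ]
    if len(points) < 2:
        raise ValueError("need at least two interior quantiles")

    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_uu = sum((u - mean_u) ** 2 for u, _ in points)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in points)

    sigma = s_uy / s_uu             # OLS slope
    zeta = mean_y - sigma * mean_u  # OLS intercept
    return zeta, sigma


# Illustrative check: build shares from a known lognormal, then recover it.
true_log_income = NormalDist(mu=10.5, sigma=0.75)
cutoffs = [10_000, 25_000, 50_000, 100_000]
cum_shares = [true_log_income.cdf(math.log(x)) for x in cutoffs]

zeta_hat, sigma_hat = fit_lognormal_from_bins(cutoffs, cum_shares)
variance_hat = sigma_hat ** 2  # variance of log income, the quantity in Table 1
```

With real Census intervals the points would not lie exactly on a line, which is why the goodness of fit of the regression is worth reporting.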
These parameters are readily obtained by transforming equation (1) logarithmically and estimating by ordinary least squares (OLS). The fit of these regressions tended to be quite high in all cases. Across the 359 metro areas, the mean adjusted R² was approximately 0.98 for each year, and the minimum across all metro area-year observations was 0.95. With the SD, σ, the variance follows simply as σ².

6 For 1980, there are 15 income categories; for 1990, there are 24; for 2000, there are 15. See the Appendix for details.

Summary statistics describing metropolitan area–level income variances appear in Table 1. Most notably, they demonstrate that, on average, the degree of dispersion exhibited by metropolitan area–level (log) income distributions increased between 1980 and 2000, with the majority of this increase between 1980 and 1990. Over these two decades, the mean income variance rose by a total of 10 log points (approximately 18 percent). Of this 10 log point increase, the majority (9 log points) was experienced during the 1980s. Qualitatively, of course, this finding is consistent with what has now been widely established in the inequality literature (e.g., Katz and Murphy, 1992; Juhn, Murphy, and Pierce, 1993).

EMPIRICAL FINDINGS

Urban Decentralization and Income Inequality

Consider first the relationship between metropolitan area–level population density and the extent of income inequality. To do so, let the variance of the (log) income distribution for metropolitan area m in year t have the following characterization:

$$\sigma^2_{mt} = \mu_m + \mu_t + \beta X_{mt} + \gamma D_{mt} + \varepsilon_{mt}, \tag{2}$$

where µ_m is a metro area–specific fixed effect, µ_t is a year-specific term, X_mt is a vector of covariates described in greater detail below, D_mt is the logarithm of population density, and ε_mt is a residual.
To eliminate the metro area fixed effects, I take 10-year differences of equation (2), which yields equation (3), the primary estimating equation in the analysis. Given the nature of the differenced error term, there is nonzero correlation between the residuals for the same metro area: the 1980-90 and 1990-2000 differences both contain the 1990 error term. The standard errors are adjusted to account for this correlation.

Density is calculated for each metropolitan area as the weighted average of county-level population densities, where the weights are given by each county’s share of total metropolitan area population. This measure is used instead of average metropolitan area density (calculated as the ratio of total metropolitan area population to total land area) to mitigate the influence of extremely large but relatively unpopulated counties, which appear in many metropolitan areas of the West. County-weighted population density gives these counties less weight in the computations and, therefore, may provide a better sense of how densely clustered a city’s population is.7 Table 2 lists the 10 most and least densely populated metropolitan areas in each year.

Among the covariates included in the vector X_mt are some basic characteristics commonly associated with the degree of income inequality in an economy. These characteristics include the percentages of the resident population that are black, female, foreign-born, younger than age 25, and older than age 65; the fraction of the population 25 years of age or older that has completed at least a bachelor’s degree; shares of employment in 9 broad industries8; the fraction of the labor force that is represented by a union; and the unemployment rate. I also include three region dummies to account for any basic geographic differences in the inequality trends across different parts of the country.9 Results for these characteristics appear in Table 3.
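The difference between the two density measures described above can be illustrated with a short Python sketch. The two-county metro area and both function names are hypothetical; only the formulas (a population-share-weighted average of county densities versus total population over total land area) come from the text.

```python
def weighted_density(counties):
    """Population-share-weighted average of county densities.

    counties -- list of (population, land_area_sq_miles) pairs (hypothetical input).
    """
    total_pop = sum(pop for pop, _ in counties)
    return sum((pop / total_pop) * (pop / area) for pop, area in counties)


def average_density(counties):
    """Total metro population divided by total metro land area."""
    total_pop = sum(pop for pop, _ in counties)
    total_area = sum(area for _, area in counties)
    return total_pop / total_area


# Hypothetical metro area: one compact urban county and one very large,
# sparsely populated county, each given as (population, square miles).
metro = [(900_000, 300.0), (100_000, 10_000.0)]

weighted = weighted_density(metro)  # 0.9 * 3000 + 0.1 * 10
average = average_density(metro)    # 1,000,000 / 10,300
```

In this made-up case the weighted measure is about 2,701 residents per square mile, while the simple average is about 97, because the large empty county dominates the land area but holds only a tenth of the population.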
$$\Delta\sigma^2_{mt} = \Delta\mu_t + \beta\,\Delta X_{mt} + \gamma\,\Delta D_{mt} + \Delta\varepsilon_{mt}, \tag{3}$$

I consider three different specifications of the covariates in the estimation of equation (3) to gauge the robustness of the density-inequality relationship. The first limits the regressors to log density, the three region dummies, and a time effect for the 1980-90 decade. The second then adds the population demographics of each metro area (age, race, gender, education, foreign-born status). The third includes the remainder of the covariates that provide a basic description of the metro area’s labor market (industry employment shares, unionization, unemployment).10

Several fairly standard findings are evident. Larger proportions of women and individuals younger than age 24 in the local population are strongly, positively associated with inequality, which likely reflects the relatively low average income among these individuals. Some evidence (although not always statistically significant) indicates that inequality increases with the percentages of foreign-born residents and individuals older than age 65 in the local population. Furthermore, inequality in a metro area tends to rise significantly as the unemployment rate increases, suggesting that households at the bottom end of the income distribution are more sensitive economically to the business cycle than wealthier households. Inequality is also significantly, negatively associated with the extent of union coverage in the local labor force, which is a relatively common finding. Although union workers typically receive an earnings premium over nonunion labor, union contracts tend to equalize earnings across workers (e.g., Fortin and Lemieux, 1997). Shares of local employment in manufacturing and construction, two sectors frequently associated with relatively high earnings for relatively low-skilled labor, correlate negatively with income inequality.

7 I also repeated all of the estimations using weighted averages of block group–level population densities for each metro area. The results were qualitatively similar to those reported here.

8 The sectors are manufacturing; agriculture, forestry, fisheries, and mining; construction; wholesale trade; retail trade; finance, insurance, real estate; public administration; education services; health services. I do not use a more detailed industrial classification scheme, in part, to avoid difficulties associated with the change from the Standard Industrial Classification system in 1980 and 1990 to the North American Industry Classification System in 2000.

9 Because metropolitan area boundaries frequently cross state borders and region definitions are based on states, parts of some metro areas are in different regions. I assign these multiregion metropolitan areas to the regions in which the majority of their populations lie.

10 The unionization rate for each metropolitan area is based on state-level union coverage rates reported by Hirsch, Macpherson, and Vroman (2001) (available at www.unionstats.com). Metropolitan area–level union rates are calculated as weighted averages of their constituent state-level rates, where the weights are given by the fraction of each metro area’s labor force located in each state.

Table 2
Most and Least Densely Populated Metro Areas

Year  Top 10                                              Density   Bottom 10         Density
1980  New York-Northern New Jersey-Long Island, NY-NJ-PA  14,740.0  Flagstaff, AZ     4.03
      Philadelphia-Camden-Wilmington, PA-NJ-DE-MD         4,927.0   Prescott, AZ      8.4
      Washington-Arlington-Alexandria, DC-VA-MD-WV        4,374.1   St. George, UT    10.7
      Baltimore-Towson, MD                                4,017.3   Casper, WY        13.5
      San Francisco-Oakland-Fremont, CA                   3,996.1   Wenatchee, WA     14.3
      Chicago-Naperville-Joliet, IL-IN-WI                 3,959.4   Farmington, NM    14.8
      Boston-Cambridge-Quincy, MA-NH                      2,930.6   Yuma, AZ          16.4
      Milwaukee-Waukesha-West Allis, WI                   2,885.7   Bend, OR          20.6
      Detroit-Warren-Livonia, MI                          2,556.5   Rapid City, SD    20.9
      Cleveland-Elyria-Mentor, OH                         2,435.9   El Centro, CA     22.1
1990  New York-Northern New Jersey-Long Island, NY-NJ-PA  15,161.5  Flagstaff, AZ     5.2
      Philadelphia-Camden-Wilmington, PA-NJ-DE-MD         4,385.6   Casper, WY        11.5
      San Francisco-Oakland-Fremont, CA                   4,171.9   Prescott, AZ      13.3
      Washington-Arlington-Alexandria, DC-VA-MD-WV        3,886.3   Farmington, NM    16.6
      Chicago-Naperville-Joliet, IL-IN-WI                 3,783.4   Wenatchee, WA     16.7
      Baltimore-Towson, MD                                3,440.1   Yuma, AZ          19.4
      Boston-Cambridge-Quincy, MA-NH                      2,942.5   St. George, UT    20.0
      Milwaukee-Waukesha-West Allis, WI                   2,806.9   Rapid City, SD    24.4
      Los Angeles-Long Beach-Santa Ana, CA                2,369.0   Bend, OR          24.8
      Detroit-Warren-Livonia, MI                          2,292.3   El Centro, CA     26.2
2000  New York-Northern New Jersey-Long Island, NY-NJ-PA  16,125.0  Flagstaff, AZ     6.2
      San Francisco-Oakland-Fremont, CA                   4,419.8   Casper, WY        12.5
      Philadelphia-Camden-Wilmington, PA-NJ-DE-MD         4,027.1   Prescott, AZ      20.6
      Chicago-Naperville-Joliet, IL-IN-WI                 3,880.0   Farmington, NM    20.6
      Washington-Arlington-Alexandria, DC-VA-MD-WV        3,573.1   Wenatchee, WA     21.2
      Boston-Cambridge-Quincy, MA-NH                      3,036.4   Rapid City, SD    26.5
      Baltimore-Towson, MD                                2,813.0   Yuma, AZ          29.0
      Milwaukee-Waukesha-West Allis, WI                   2,634.7   Great Falls, MT   29.8
      Los Angeles-Long Beach-Santa Ana, CA                2,634.6   Cheyenne, WY      30.4
      Detroit-Warren-Livonia, MI                          2,231.9   Duluth, MN-WI     32.9

NOTE: Population densities are calculated as (population-share) weighted averages of county-level densities (in residents per square mile).

Table 3
Overall Inequality Results

Variable                                            I (SE)           II (SE)          III (SE)
Log density                                         –0.07* (0.009)   –0.086* (0.01)   –0.07* (0.01)
Percent bachelor’s degree                           --               0.54* (0.08)     0.52* (0.09)
Percent female                                      --               0.73* (0.28)     0.44* (0.25)
Percent black                                       --               0.05 (0.11)      0.03 (0.10)
Percent <24 years                                   --               0.35* (0.14)     0.23* (0.13)
Percent >65 years                                   --               0.31* (0.16)     0.23 (0.15)
Percent foreign-born                                --               0.28* (0.13)     0.20 (0.13)
Percent manufacturing                               --               --               –0.35* (0.07)
Percent agriculture, forestry, fishing, and mining  --               --               –0.04 (0.11)
Percent construction                                --               --               –0.28* (0.12)
Percent wholesale trade                             --               --               –0.10 (0.15)
Percent retail trade                                --               --               0.11 (0.11)
Percent finance, insurance, and real estate         --               --               –0.46* (0.15)
Percent public administration                       --               --               –0.34* (0.14)
Percent education services                          --               --               –0.28* (0.13)
Percent health services                             --               --               0.11 (0.13)
Unemployment rate                                   --               --               0.46* (0.08)
Percent union representation                        --               --               –0.12* (0.05)
R²                                                  0.64             0.69             0.74

NOTE: Data represent 718 observations. The dependent variable is the change in the variance of the log income distribution for a metropolitan area. Each regressor is expressed in terms of contemporaneous 10-year changes. All specifications also include three region dummies and a time effect for the 1980-90 decade. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. * Significant at ≥10 percent. A "--" marks a regressor not included in that specification.

The primary regressor of interest, the logarithm of population density, is uniformly negative and statistically significant across all three specifications.
Based on the point estimates, a 1 SD decrease in the change in population density corresponds to a 1 log point increase in the change in log income variance. This figure is far from negligible, representing approximately 20 percent of the mean change in log income variance over the two decades considered in this study. Again, this basic finding has already been established, at least in a qualitative sense, in some of the works previously described. The following text takes a closer look at this result to determine the extent to which it reflects an increase in the degree of income segregation across neighborhoods.

Decomposing Income Inequality

Consider the following standard decomposition of a metropolitan area’s income inequality. The variance of household income in a metropolitan area, σ², can be estimated as

$$\sigma^2 = \frac{1}{H}\sum_{n=1}^{N}\sum_{h=1}^{H_n}\left(y_{h,n}-\bar{y}\right)^2, \tag{4}$$

where y_{h,n} is the income of household h of neighborhood n, ȳ is the mean household income for the entire metropolitan area, H_n is the total number of households in neighborhood n, N is the total number of neighborhoods, and H is the total number of households, Σ_n H_n.11 This expression can be rewritten as the sum of two terms:

$$\sigma^2 = \frac{1}{H}\sum_{n=1}^{N}\sum_{h=1}^{H_n}\left(y_{h,n}-\bar{y}_n\right)^2 + \frac{1}{H}\sum_{n=1}^{N}\sum_{h=1}^{H_n}\left(\bar{y}_n-\bar{y}\right)^2, \tag{5}$$

where ȳ_n represents the mean household income in neighborhood n. The first of the terms on the right-hand side of equation (5) is the “within” neighborhood component, which measures the degree of income dispersion among households within the same neighborhood. The second term, the “between” component, captures the amount of income variation across different neighborhoods. The within component cannot be computed directly because data from individual households are unavailable. However, the between component can be computed.
Using the estimates of the variance, σ², derived above, the within-neighborhood component is constructed as the difference between the total variance and the between component. Table 1 lists some summary statistics describing the within- and between-block group components. Two features are immediately apparent. First, in each of the three years considered (1980, 1990, 2000), the extent of income variation within neighborhoods is considerably larger than the extent of variation between them. In the year 2000, for instance, the within-neighborhood component accounted for 80 percent of total metropolitan area income variation, on average. This finding is roughly similar to what Epple and Sieg (1999) report for municipalities in Boston and is consistent with the results of Ioannides (2004) and Hardman and Ioannides (2004), who document a substantial degree of income heterogeneity within small residential clusters in the United States. Second, the 10 years between 1980 and 1990 saw a sharp rise in the proportion of total income variation attributable to between-neighborhood differences. Over this decade, the average fraction of total income variation associated with differences across neighborhoods rose from 12.7 percent to 21.9 percent. Hence, although income variation remained predominantly a within-neighborhood phenomenon in 2000, the between-neighborhood component became increasingly important between 1980 and 2000.

11 The average numbers of households per metropolitan area are relatively large: 180,164.6 for 1980, 208,780.9 for 1990, and 240,407.2 for 2000. Across all three years, the minimum number of households is 8,681. Hence, the difference between using a factor of 1/H in equation (4) instead of 1/(H – 1) is extremely small.
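The decomposition in equation (5), and the way the within component can be backed out as a residual, can be sketched in Python as follows. This is an illustration on hypothetical household-level data, which is exactly what the Census files do not provide; the function name `decompose_variance` and the toy incomes are assumptions made for the example.

```python
def decompose_variance(neighborhoods):
    """Split the variance of household income into within and between parts.

    neighborhoods -- list of lists; each inner list holds the incomes of the
    households in one neighborhood (hypothetical microdata).
    Returns (total, within, between), each divided by H as in equations (4)-(5).
    """
    H = sum(len(incomes) for incomes in neighborhoods)
    grand_mean = sum(sum(incomes) for incomes in neighborhoods) / H

    total = sum(
        (y - grand_mean) ** 2 for incomes in neighborhoods for y in incomes
    ) / H

    within = 0.0
    between = 0.0
    for incomes in neighborhoods:
        mean_n = sum(incomes) / len(incomes)
        within += sum((y - mean_n) ** 2 for y in incomes)
        # Every household in the neighborhood contributes the same squared gap.
        between += len(incomes) * (mean_n - grand_mean) ** 2
    return total, within / H, between / H


# Two toy neighborhoods with means 2 and 7 and an overall mean of 5.
toy = [[1.0, 3.0], [5.0, 7.0, 9.0]]
total, within, between = decompose_variance(toy)

# When only neighborhood means and household counts are observed, the between
# part is still computable, and the within part follows as a residual.
within_residual = total - between
```

In the toy data the total variance of 8 splits into a within part of 2 and a between part of 6, and the residual calculation reproduces the within part exactly.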
Decentralization and Inequality: Within versus Between Neighborhoods

To determine whether urban decentralization is associated with growing inequality through a within- or a between-neighborhood channel (or possibly both), I estimate a series of regressions following the above procedure. Specifically, I estimate the three specifications of equation (3) with the changes in within- and between-neighborhood income variation as the dependent variables, rather than the change in the total variance of log income.

The estimates are shown in Table 4. Interestingly, they demonstrate some striking differences in the estimated associations across the two sets of results. In looking just at the longest specification, III, the change in a metro area’s degree of income variation within its block groups is positively and significantly tied to changes in the fraction of the population with a bachelor’s degree, the fraction that is black, and the fraction that is foreign-born. On the other hand, increases in the percentages of total employment in manufacturing and finance, insurance, and real estate correlate negatively with income inequality within neighborhoods.

Between-neighborhood inequality shows a similar positive and significant association with the fraction of college graduates in the local population and with a number of quantities that did not relate significantly to within-block group inequality: the percentages of the population accounted for by women, individuals younger than age 24, and the unemployment rate. Increases in these three variables tend to be associated with increases in the extent of income variation between different block groups. In addition, between-neighborhood inequality is significantly, negatively tied to the fraction of the local population that is black, the shares of total employment accounted for by construction and education services, and the extent of union representation in the local labor force.
Table 4
Within- and Between-Neighborhood Inequality Results

Within-neighborhood
Variable                                            I                II               III
Log density                                         –0.069* (0.009)  –0.075* (0.01)   –0.064* (0.01)
Percent bachelor’s degree                           --               0.38* (0.08)     0.35* (0.09)
Percent female                                      --               –0.006 (0.21)    –0.004 (0.23)
Percent black                                       --               0.24* (0.11)     0.24* (0.11)
Percent <24 years                                   --               0.10 (0.12)      0.06 (0.12)
Percent >65 years                                   --               0.37* (0.15)     0.22 (0.14)
Percent foreign-born                                --               0.21* (0.10)     0.18* (0.09)
Percent manufacturing                               --               --               –0.27* (0.07)
Percent agriculture, forestry, fishing, and mining  --               --               –0.13 (0.12)
Percent construction                                --               --               0.035 (0.12)
Percent wholesale trade                             --               --               0.10 (0.17)
Percent retail trade                                --               --               0.03 (0.10)
Percent finance, insurance, and real estate         --               --               –0.26* (0.14)
Percent public administration                       --               --               –0.18 (0.13)
Percent education services                          --               --               0.12 (0.15)
Percent health services                             --               --               0.02 (0.13)
Unemployment rate                                   --               --               0.02 (0.09)
Percent union representation                        --               --               0.01 (0.05)
R²                                                  0.17             0.23             0.28

Between-neighborhood
Variable                                            I                II               III
Log density                                         –0.001 (0.008)   –0.01 (0.009)    –0.006 (0.008)
Percent bachelor’s degree                           --               0.16* (0.07)     0.17* (0.08)
Percent female                                      --               0.73* (0.20)     0.44* (0.17)
Percent black                                       --               –0.19* (0.10)    –0.21* (0.10)
Percent <24 years                                   --               0.25* (0.11)     0.17* (0.11)
Percent >65 years                                   --               –0.06 (0.16)     0.01 (0.15)
Percent foreign-born                                --               0.07 (0.06)      0.024 (0.06)
Percent manufacturing                               --               --               –0.08 (0.05)
Percent agriculture, forestry, fishing, and mining  --               --               0.09 (0.11)
Percent construction                                --               --               –0.32* (0.10)
Percent wholesale trade                             --               --               –0.19 (0.15)
Percent retail trade                                --               --               0.08 (0.09)
Percent finance, insurance, and real estate         --               --               –0.20 (0.13)
Percent public administration                       --               --               –0.16 (0.10)
Percent education services                          --               --               –0.39* (0.14)
Percent health services                             --               --               0.09 (0.11)
Unemployment rate                                   --               --               0.44* (0.09)
Percent union representation                        --               --               –0.13* (0.05)
R²                                                  0.66             0.68             0.72

NOTE: Data represent 718 observations. Dependent variables are the changes in within- and between-neighborhood income variation for a metropolitan area. Each regressor is expressed in terms of contemporaneous 10-year changes. All specifications also include three region dummies and a time effect for the 1980-90 decade. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. * Significant at ≥10 percent. A "--" marks a regressor not included in that specification.
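Because the within component is constructed as the total variance minus the between component, and every regression uses the same regressors, the OLS coefficients in Table 4 should add up to those in Table 3, apart from rounding. The following arithmetic is offered as a reader's consistency check on the reported log density estimates, not as a calculation from the paper:

```latex
\begin{align*}
\text{Identity:}\quad & \hat{\gamma}^{\,\text{total}} = \hat{\gamma}^{\,\text{within}} + \hat{\gamma}^{\,\text{between}} \\
\text{I:}\quad        & -0.069 + (-0.001) = -0.070 \qquad (\text{Table 3: } -0.07) \\
\text{II:}\quad       & -0.075 + (-0.010) = -0.085 \qquad (\text{Table 3: } -0.086) \\
\text{III:}\quad      & -0.064 + (-0.006) = -0.070 \qquad (\text{Table 3: } -0.07)
\end{align*}
```

The same check works row by row; for example, the specification III coefficients on the manufacturing share, –0.27 and –0.08, sum to the –0.35 reported in Table 3.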
Why are there such differences in the associations of these variables with the two measures of inequality? One possible explanation relates to how residential patterns change with each quantity. Increases in the fraction of black residents in a metro area’s total population, for instance, may be associated with increasing racial heterogeneity within block groups (hence, higher within-neighborhood income variation), and as a consequence, declining heterogeneity between them (thus, lower between-neighborhood variation). Similarly, fluctuations in unemployment and union membership may influence workers in particular neighborhoods much more than a city’s general population. This would lead to fluctuations in the degree of inequality between neighborhoods rather than within them.

For the variable of primary interest, population density, the results demonstrate a clear, negative association with the extent of income variation within neighborhoods. As the change in population density decreases by 1 SD in the cross section, the change in (log) income variance within block groups increases by approximately 1 percentage point. (Recall that this magnitude is virtually identical to the one estimated for overall income variation.)

Given this finding, it is perhaps not surprising that the estimated association between density and between-neighborhood inequality is extremely small. None of the three specifications produces a statistically or economically significant coefficient on the change in population density. Based on these results, there is little evidence that urban decentralization is associated with rising income differentials between neighborhoods.
The negative association between density and the variance of household income observed in Table 3 seems to be driven almost entirely by the change in within-neighborhood income differences.

Instrumental Variables Estimates

One obvious criticism of this estimation is the potential endogeneity of changes in density with respect to changes in inequality. A rise in the degree of income dispersion in a metro area, for example, may induce residents to segregate further, possibly leading to greater decentralization. It is not implausible that high-income households may seek to move farther from low-income households as the gap between the two groups increases.12

I use an instrumental variables (IV) estimation to address this matter. I consider two different sets of instruments for the change in density: (i) the lagged level of density within a metropolitan area and (ii) lagged shares of employment in each of the nine industries previously considered. The rationale for each is straightforward. Initial density should capture a city’s capacity for increased levels of density over time. With all else equal, initially dense cities should be less likely to see further increases in their densities because they face greater space constraints.13 Because different types of employers have different propensities to decentralize their operations (e.g., Glaeser and Kahn, 2004), initial industry shares should also predict future changes in population density. Weinberg (2004), for example, has exploited this feature of industry location patterns to instrument for job centralization in a study of spatial mismatch.

Of course, because initial density or sectoral employment shares may be correlated with unobserved factors influencing subsequent changes in inequality (e.g., density or the manufacturing share in 1990 may be endogenous with respect to the change in inequality between 1990 and 2000), I use density and each industry share in 1980 to instrument for the change in density between 1990 and 2000.14

Table 5 shows the results using all three inequality measures and all three specifications. For the sake of conciseness, I have reported only the coefficients on the change in density. The results generally are very similar to the estimates in Tables 3 and 4. Density and inequality are negatively related, and the association operates primarily through a within-neighborhood channel rather than a between-neighborhood channel.

12 Rising income differentials, for example, may generate greater differences in the demand for certain local public goods or an increasing desire to avoid “negative” neighborhood effects.

13 In fact, a strong negative connection exists between the initial level of density in a metro area and the extent to which it decentralizes over the next 10 years. A simple regression of the change in density on its initial level in the data used here produces a coefficient (standard error) of –0.04 (0.004) with a goodness-of-fit statistic equal to 0.14.

14 As demonstrated by the results from F tests of marginal significance reported in Table 5, both sets of instruments are significant predictors of the change in density between 1990 and 2000.
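The logic of instrumenting for the change in density can be illustrated with a deliberately small Python sketch. It assumes a single instrument and no covariates, which is much simpler than the specifications in the paper (several covariates, up to nine instruments, and adjusted standard errors). All data and function names below are hypothetical and chosen so that the arithmetic is exact.

```python
def slope_ols(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def slope_iv(z, x, y):
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    n = len(z)
    mean_z, mean_x, mean_y = sum(z) / n, sum(x) / n, sum(y) / n
    s_zy = sum((a - mean_z) * (b - mean_y) for a, b in zip(z, y))
    s_zx = sum((a - mean_z) * (b - mean_x) for a, b in zip(z, x))
    return s_zy / s_zx


# Hypothetical data. u is an unobserved shock that raises both x and y,
# and z is an instrument that is uncorrelated with u by construction.
z = [-1.0, 1.0, -1.0, 1.0]
u = [-1.0, -1.0, 1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]               # stand-in for the change in density
y = [-0.5 * xi + 2.0 * ui for xi, ui in zip(x, u)]  # stand-in for the change in inequality

ols = slope_ols(x, y)   # contaminated by the common shock u
iv = slope_iv(z, x, y)  # uses only the variation in x that comes from z
```

In this constructed example the structural slope is –0.5, OLS returns +0.5 because the common shock pushes the estimate upward, and the IV estimator returns –0.5.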
Table 5
Instrumental Variables Estimates

IV (density)
Dependent variable                          I               II              III
Variance change log income distribution     –0.24* (0.06)   –0.10* (0.03)   –0.04 (0.03)
Within-neighborhood inequality component    –0.20* (0.05)   –0.11* (0.03)   –0.066* (0.03)
Between-neighborhood inequality component   –0.04 (0.03)    0.01 (0.02)     0.02 (0.03)
F test                                      40.2 (0)        95.1 (0)        88.03 (0)

IV (industry shares)
Dependent variable                          I               II              III
Variance change log income distribution     –0.07 (0.05)    –0.10* (0.04)   –0.04 (0.04)
Within-neighborhood inequality component    –0.07 (0.05)    –0.10* (0.04)   –0.07* (0.04)
Between-neighborhood inequality component   –0.003 (0.04)   0.001 (0.03)    0.03 (0.03)
F test                                      5.26 (0)        9.79 (0)        8.72 (0)

NOTE: Data represent 359 observations. Coefficients are for the change in log population density. Dependent variables are the changes in the variance, the within-neighborhood component, and the between-neighborhood component between 1990 and 2000. Instruments are log density or industry employment shares in 1980. Specifications follow data reported in Tables 3 and 4. Standard errors (reported in parentheses, except for F tests) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. F test reports results from test of the (marginal) significance of the instruments from the first-stage regression for the appropriate specification (p-value under null that the IV coefficients are zero appears in parentheses). * Significant at ≥10 percent.

Other Measures of Between-Neighborhood Inequality

This section expands on the analysis of between-neighborhood inequality by considering how changes in metropolitan area density influence some alternative measures of income differences across block groups.
In particular, how do differences among the 90th, 50th, and 10th percentiles of the block group (average) household income distribution within each metropolitan area change as metropolitan areas decentralize?15 Although percentile differences are not typically used in studies of neighborhood income inequality, they are commonly used to quantify inequality between individuals (e.g., Juhn, Murphy, and Pierce, 1993).

Table 6 shows the results from the same three specifications considered above, each of which is estimated by OLS and IV.16 Regardless of whether the percentiles are computed in a weighted or unweighted fashion (where the weights are given by the number of households in each block group), the estimated coefficients on density are quite similar.

15 On average, metropolitan areas in the sample contain 460 block groups each (minimum = 27, maximum = 14,019), so calculating percentiles is a reasonable exercise with these data.

16 Recall that in all cases, standard errors are adjusted for heteroskedasticity and within–metro area correlation.

The OLS results suggest that, instead of decreases in density generating greater inequality between neighborhoods, they may generate smaller interneighborhood income differences. This result, however, may be the product of endogeneity, whereby some aspect of rising between-neighborhood inequality may cause density to rise. For example, rising income segregation between neighborhoods may be associated with rising returns to the highly educated residents, who may desire to live in traditional city centers (e.g., Brueckner and Rosenthal, 2008). This would create an upward bias in a truly negative association between density and inequality. I consider, therefore, the use of IVs, which produces a somewhat different set of conclusions.
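The direction of the bias described above can be illustrated with a small simulation. The sketch below is not the paper's estimation: the data are simulated, every coefficient value is an arbitrary assumption, and `ols` and `two_stage_least_squares` are hypothetical helper names. It assumes an unobserved factor that raises both the change in density and the change in inequality, so that OLS is biased upward, and a single instrument that is unrelated to that factor.

```python
import numpy as np


def ols(y, X):
    """Return least-squares coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, x, z):
    """Slope on one endogenous regressor x, using z as the instrument.

    Only the point estimate is returned; valid 2SLS standard errors need
    an additional correction that is omitted in this sketch.
    """
    ones = np.ones(len(x))
    Z = np.column_stack([ones, z])
    x_fitted = Z @ ols(x, Z)                              # first stage
    return ols(y, np.column_stack([ones, x_fitted]))[1]   # second stage


# Simulated data; all numbers are hypothetical, not taken from the paper.
rng = np.random.default_rng(0)
n = 50_000
z = rng.normal(size=n)                   # instrument, e.g. initial log density
u = rng.normal(size=n)                   # unobserved factor raising both variables
x = -0.5 * z + u + rng.normal(size=n)    # change in log density
true_beta = -0.10                        # assumed "true" effect on inequality
y = true_beta * x + 0.5 * u + rng.normal(size=n)

beta_ols = ols(y, np.column_stack([np.ones(n), x]))[1]
beta_iv = two_stage_least_squares(y, x, z)
print(f"OLS: {beta_ols:.3f}  2SLS: {beta_iv:.3f}  assumed true value: {true_beta:.3f}")
```

Under these assumptions the OLS slope comes out positive even though the assumed true effect is negative, while the two-stage estimate lands close to the assumed value. With real data the validity of an instrument cannot be checked this way; it has to be argued.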
Table 6
Alternative Measures of Between-Neighborhood Inequality

| Dependent variable | OLS I | OLS II | OLS III | IV (density) I | IV (density) II | IV (density) III | IV (industry shares) I | IV (industry shares) II | IV (industry shares) III |
|---|---|---|---|---|---|---|---|---|---|
| Unweighted 90-10 percentile difference | 0.04 (0.03) | 0.07* (0.04) | 0.10* (0.04) | –0.30* (0.15) | –0.055 (0.10) | 0.06 (0.11) | –0.23 (0.18) | –0.20 (0.13) | –0.07 (0.13) |
| Unweighted 90-50 percentile difference | 0.02 (0.03) | 0.035 (0.03) | 0.06* (0.03) | –0.41* (0.10) | –0.20* (0.07) | –0.13* (0.07) | –0.23* (0.11) | –0.24* (0.08) | –0.15* (0.08) |
| Unweighted 50-10 percentile difference | 0.02 (0.02) | 0.03 (0.02) | 0.03 (0.02) | 0.09 (0.10) | 0.14* (0.07) | 0.19* (0.08) | –0.01 (0.11) | 0.04 (0.09) | 0.07 (0.10) |
| Weighted 90-10 percentile difference | 0.05 (0.04) | 0.067* (0.04) | 0.09* (0.036) | –0.35* (0.13) | –0.07 (0.09) | 0.01 (0.09) | –0.08 (0.17) | –0.14 (0.13) | 0.002 (0.13) |
| Weighted 90-50 percentile difference | 0.02 (0.03) | 0.027 (0.03) | 0.06* (0.03) | –0.40* (0.10) | –0.19* (0.06) | –0.16* (0.07) | –0.10 (0.10) | –0.17* (0.08) | –0.09 (0.08) |
| Weighted 50-10 percentile difference | 0.03 (0.02) | 0.04* (0.02) | 0.03 (0.03) | 0.06 (0.10) | 0.12* (0.06) | 0.17* (0.07) | 0.01 (0.10) | 0.03 (0.08) | 0.09 (0.08) |

NOTE: Coefficients are for the change in log population density. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. Specifications follow data reported in Tables 3 and 4. * Significant at ≥10 percent.

The IV estimates suggest little association between density and the difference between the neighborhoods at the 90th and 10th percentiles of the log income distribution, which is consistent with the results examining the between-neighborhood component of total income variation documented above. When separated into 90-50 and 50-10 differentials, however, the difference between the 90th percentile and the median tends to increase significantly as cities decentralize.
At the same time, the difference between the median and the 10th percentile appears to decrease as a metro area's population spreads out. Indeed, the estimated associations between density and the 50-10 gap are significantly positive when initial density is used as an instrument for its future change. When combined, of course, these two observations are perfectly compatible with the finding that the 90-10 differential shows little association with changes in density.

This evidence suggests that, although there seems to be little association between urban decentralization and measures of the overall degree of income variation across different neighborhoods, the same is not true for all parts of the income distribution. As city populations spread out, there appears to be an increase in the average incomes of neighborhoods at the top relative to the middle. In particular, high-income households may segregate themselves to a larger extent as populations spread out. On the other hand, the gap between the average incomes at the middle of the distribution and those at the bottom shrinks, which may reflect greater income mixing among middle- to lower-income households.

Table 7 shows a more detailed set of results describing these associations; it reports the coefficients on the change in density in regressions in which these three individual quantiles are specified as the dependent variables. The OLS results again suggest that declining density may lead to smaller income differences between block groups because the estimated associations are positive and increasing in moving from the 10th percentile to the 90th. Hence, decreases in density ought to reduce the average income at the top of the block group distribution by more than they do at either the middle or the bottom. The OLS results may be biased, however (again, because of the likely endogeneity of changes in population density in relation to changes in inequality).
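The percentile gaps discussed in this section can be sketched in a few lines. The paper does not state which percentile definition it uses, so the rule below (the smallest value whose cumulative weight share reaches the target) is an assumption, as are the function names `weighted_percentile` and `percentile_gaps` and the example incomes and household counts. By construction, the 90-10 gap equals the sum of the 90-50 and 50-10 gaps, which is why opposite movements in the two halves can leave the 90-10 gap roughly unchanged.

```python
import numpy as np


def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight share is at least q (0 < q <= 1)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    sorted_values = values[order]
    cum_share = np.cumsum(weights[order]) / weights.sum()
    idx = np.searchsorted(cum_share, q, side="left")
    return sorted_values[min(idx, len(sorted_values) - 1)]


def percentile_gaps(log_income, households=None):
    """90-10, 90-50, and 50-10 gaps; equal weights if households is None."""
    log_income = np.asarray(log_income, dtype=float)
    if households is None:
        households = np.ones(len(log_income))
    p10, p50, p90 = (weighted_percentile(log_income, households, q)
                     for q in (0.10, 0.50, 0.90))
    return {"90-10": p90 - p10, "90-50": p90 - p50, "50-10": p50 - p10}


# Hypothetical block groups: average household income and household counts.
log_income = np.log([20_000, 35_000, 50_000, 80_000, 150_000])
households = np.array([900, 700, 500, 300, 100])

unweighted = percentile_gaps(log_income)             # each block group counts once
weighted = percentile_gaps(log_income, households)   # weighted by household counts
```

In this made-up example the weighted gaps are smaller than the unweighted ones because the populous block groups sit at the lower end of the distribution. The regressions in Tables 6 and 7 use changes in such measures over time, which this sketch does not attempt.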
Table 7
Individual Quantile Results

| Dependent variable | OLS I | OLS II | OLS III | IV (density) I | IV (density) II | IV (density) III | IV (industry shares) I | IV (industry shares) II | IV (industry shares) III |
|---|---|---|---|---|---|---|---|---|---|
| Unweighted 90th percentile | 0.26* (0.03) | 0.21* (0.03) | 0.17* (0.03) | –0.18 (0.11) | 0.01 (0.07) | 0.02 (0.07) | –0.08 (0.11) | –0.02 (0.08) | 0.01 (0.08) |
| Unweighted 50th percentile | 0.24* (0.03) | 0.17* (0.02) | 0.10* (0.02) | 0.23* (0.08) | 0.21* (0.05) | 0.15* (0.05) | 0.15* (0.07) | 0.23* (0.06) | 0.16* (0.06) |
| Unweighted 10th percentile | 0.22* (0.03) | 0.14* (0.03) | 0.07* (0.03) | 0.13 (0.13) | 0.07 (0.08) | –0.05 (0.08) | 0.16 (0.12) | 0.19* (0.10) | 0.09 (0.10) |
| Weighted 90th percentile | 0.27* (0.04) | 0.20* (0.03) | 0.16* (0.03) | –0.28* (0.12) | –0.01 (0.07) | –0.04 (0.06) | –0.02 (0.11) | –0.02 (0.08) | –0.01 (0.08) |
| Weighted 50th percentile | 0.24* (0.03) | 0.17* (0.02) | 0.10* (0.02) | 0.13 (0.08) | 0.18* (0.05) | 0.12* (0.04) | 0.08 (0.07) | 0.15* (0.06) | 0.08 (0.05) |
| Weighted 10th percentile | 0.21* (0.04) | 0.13* (0.03) | 0.066* (0.03) | 0.07 (0.11) | 0.06 (0.07) | –0.05 (0.07) | 0.07 (0.12) | 0.12 (0.09) | –0.01 (0.09) |

NOTE: Coefficients are for the change in log population density. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. Specifications follow data reported in Tables 3 and 4. * Significant at ≥10 percent.

Given this potential bias in the OLS results, the IV estimates may be more reliable. The IV results indicate that the 90th and 10th percentiles of the block group income distribution vary little with population density. Only two of the 24 estimates for these two quantiles differ statistically from zero. This finding is interesting because it suggests that urban decentralization is not associated with the top of the neighborhood income distribution pulling away from the rest of the distribution. It is also not associated with the bottom of the income distribution falling farther behind the remainder of the distribution.
The median, however, does show significantly positive variation with density in most instances, suggesting that urban decentralization may be associated with a decline in the incomes of neighborhoods at the middle of the distribution. This result, of course, explains why the gap between the top and the middle of the income distribution rises while the gap between the middle and the bottom falls.

Inequality Within and Between Tracts

While the basic geographic unit of analysis in this paper is the block group, many existing studies of neighborhood-level economic outcomes have focused on Census tracts, which represent a larger geographic area. The median Census tract consists of approximately 1,649 households and covers roughly 1.3 square miles, compared with 526 households and 0.33 square miles for the median block group. Given the prevalence of tract-level analyses in the literature on neighborhood outcomes, this section considers whether the definition of neighborhoods as tracts, rather than block groups, alters the results in any substantive way.17

Table 8 reports the coefficients on the change in log density from every specification previously estimated with block group–level observations, re-estimated here with tracts. In general, the tract-level results yield very similar conclusions. The extent of income inequality observed within tracts shows a strong, negative association with population density, whereas between-tract inequality shows little correlation with density. With regard to the percentile differences, the OLS results again suggest that, if anything, urban decentralization may be associated with smaller between-neighborhood gaps, not larger. The IV estimates are mostly insignificant, although there is some evidence that the gap between the top and middle of the neighborhood income distribution widens somewhat as population density declines.
As noted previously, this finding seems to reflect a decrease in the median relative to the 90th percentile, which could be the product of greater mixing of medium- and low-income households in suburban neighborhoods. Thus, just as with block groups, urban decentralization tends to be accompanied by widening income gaps within Census tracts. There is little evidence that between-neighborhood income gaps rise in sprawling cities.

17 On average, metropolitan areas in the sample contain 147 tracts each (minimum = 10, maximum = 4,507).

Table 8
Tract-Level Results

| Dependent variable | OLS I | OLS II | OLS III | IV (density) I | IV (density) II | IV (density) III | IV (industry shares) I | IV (industry shares) II | IV (industry shares) III |
|---|---|---|---|---|---|---|---|---|---|
| Within component | –0.07* (0.009) | –0.08* (0.01) | –0.07* (0.01) | –0.21* (0.05) | –0.11* (0.03) | –0.06* (0.03) | –0.08 (0.05) | –0.10* (0.04) | –0.06* (0.03) |
| Between component | –0.001 (0.007) | –0.005 (0.008) | –0.001 (0.007) | –0.03 (0.03) | 0.009 (0.02) | 0.02 (0.02) | 0.005 (0.04) | 0.002 (0.03) | 0.02 (0.03) |
| Unweighted 90-10 percentile difference | 0.037 (0.04) | 0.067 (0.04) | 0.08* (0.04) | –0.17 (0.15) | –0.0002 (0.11) | 0.05 (0.12) | –0.24 (0.19) | –0.26* (0.15) | –0.25 (0.17) |
| Unweighted 90-50 percentile difference | 0.05 (0.03) | 0.05 (0.03) | 0.07* (0.036) | –0.26* (0.10) | –0.09 (0.07) | –0.11 (0.08) | –0.17 (0.12) | –0.18* (0.10) | –0.15 (0.10) |
| Unweighted 50-10 percentile difference | –0.01 (0.03) | 0.01 (0.03) | 0.007 (0.03) | 0.10 (0.13) | 0.09 (0.09) | 0.16 (0.10) | –0.07 (0.13) | –0.08 (0.11) | –0.10 (0.13) |
| Weighted 90-10 percentile difference | 0.04 (0.04) | 0.056 (0.04) | 0.09* (0.04) | –0.30* (0.17) | –0.09 (0.12) | –0.05 (0.14) | –0.27 (0.18) | –0.27* (0.15) | –0.17 (0.16) |
| Weighted 90-50 percentile difference | 0.05 (0.03) | 0.05 (0.04) | 0.08* (0.04) | –0.27* (0.13) | –0.11 (0.10) | –0.08 (0.11) | –0.25* (0.15) | –0.25* (0.12) | –0.21* (0.13) |
| Weighted 50-10 percentile difference | –0.01 (0.02) | 0.005 (0.03) | 0.01 (0.03) | –0.03 (0.12) | 0.02 (0.08) | 0.03 (0.10) | –0.01 (0.12) | –0.02 (0.10) | 0.04 (0.11) |
| Unweighted 90th percentile | 0.30* (0.04) | 0.24* (0.04) | 0.19* (0.04) | –0.05 (0.11) | 0.10 (0.07) | 0.01 (0.08) | –0.05 (0.13) | 0.02 (0.10) | –0.04 (0.10) |
| Unweighted 50th percentile | 0.25* (0.03) | 0.18* (0.02) | 0.11* (0.02) | 0.21* (0.08) | 0.19* (0.05) | 0.12* (0.06) | 0.12 (0.08) | 0.20* (0.07) | 0.10* (0.06) |
| Unweighted 10th percentile | 0.26* (0.04) | 0.17* (0.03) | 0.11* (0.03) | 0.12 (0.13) | 0.10 (0.09) | –0.04 (0.09) | 0.19 (0.14) | 0.28* (0.12) | 0.21* (0.12) |
| Weighted 90th percentile | 0.27* (0.04) | 0.20* (0.03) | 0.17* (0.04) | –0.15 (0.13) | 0.03 (0.09) | –0.01 (0.10) | –0.14 (0.14) | –0.09 (0.11) | –0.11 (0.12) |
| Weighted 50th percentile | 0.23* (0.03) | 0.15* (0.02) | 0.08* (0.02) | 0.12 (0.08) | 0.14* (0.06) | 0.07 (0.06) | 0.11 (0.08) | 0.16* (0.06) | 0.10* (0.06) |
| Weighted 10th percentile | 0.23* (0.04) | 0.15* (0.03) | 0.07* (0.03) | 0.15 (0.13) | 0.12 (0.08) | 0.05 (0.10) | 0.13 (0.11) | 0.17* (0.10) | 0.07 (0.11) |

NOTE: Coefficients are for the change in log population density. Standard errors (reported in parentheses) are adjusted for both heteroskedasticity and within–metro area correlation of the regression error terms. Specifications follow data reported in Tables 3 and 4. * Significant at ≥10 percent.

CONCLUSION

City populations in the United States have decentralized for more than a century. Although the process was driven largely by the decisions of individuals to live farther from historical city centers, it has generated numerous concerns about segregation of households by income. Given the evidence documented in previous work and herein that urban decentralization tends to be accompanied by significant increases in income inequality, these concerns certainly seem warranted.
This paper has examined this issue further by exploring the extent to which the increased income inequality with decreasing density emanates from a rise in the degree of income variation exhibited across different neighborhoods. In general, the findings suggest that between-neighborhood income gaps do not rise significantly as central cities spread out. Neither the difference between the 90th and 10th percentiles of the block group–level income distribution nor the degree of variation associated with between-block group income differentials rises (or falls) significantly as a metropolitan area’s population spreads out.

This result should not be interpreted as suggesting that all between-neighborhood income differentials are completely invariant to the outward movement of people in a city. A rising gap between the absolute poorest neighborhoods and the remainder of the metropolitan area may still exist. However, the extent to which this potential gap contributes to overall income inequality within a local market appears decidedly small.

Instead, the rise of income dispersion as cities decentralize is largely associated with an increase in the degree of income heterogeneity within neighborhoods. One straightforward interpretation of this result is that urban decentralization is associated with greater income mixing within neighborhoods, regardless of whether they are defined by block groups or tracts. Because they are less densely populated, for instance, suburban neighborhoods may more readily accommodate households with widely varying income levels than central cities, where individuals reside in closer proximity. This may be similar to the finding reported by Glaeser and Kahn (2004) that suburbs are more racially integrated than central cities.

Unfortunately, why overall income inequality increases with urban decentralization remains unresolved.
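The within- and between-neighborhood components discussed above can be made concrete with the standard decomposition of a variance into a household-weighted average of within-group variances plus the weighted variance of group means. The sketch below assumes household-level log incomes are observed directly, which is a simplification (the paper works from categorical Census income data); the function name `variance_decomposition` and the four-household example are hypothetical.

```python
import numpy as np


def variance_decomposition(log_income, neighborhood):
    """Split the population variance of log income into (within, between) parts."""
    log_income = np.asarray(log_income, dtype=float)
    neighborhood = np.asarray(neighborhood)
    grand_mean = log_income.mean()
    within = 0.0
    between = 0.0
    for g in np.unique(neighborhood):
        y = log_income[neighborhood == g]
        share = len(y) / len(log_income)                  # household share of group g
        within += share * y.var()                         # variance inside group g
        between += share * (y.mean() - grand_mean) ** 2   # spread of group means
    return within, between


# Four hypothetical households, arranged two ways across neighborhoods A and B.
incomes = np.array([1.0, 1.0, 3.0, 3.0])   # made-up log incomes
segregated = variance_decomposition(incomes, ["A", "A", "B", "B"])
mixed = variance_decomposition(incomes, ["A", "B", "A", "B"])
```

In this toy case, moving the same four households from the segregated arrangement to the mixed one leaves total variance unchanged and shifts all of it from the between component to the within component.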
If sprawling cities were simply reorganizing their populations from dense, segregated collections of neighborhoods into less-dense, heterogeneous sets of neighborhoods, the rise in within-neighborhood inequality should be offset by a drop in between-neighborhood inequality. The data show little evidence of any such drop.

One possible explanation is that urban decentralization may be associated with greater industrial heterogeneity (beyond what this analysis controls for), at least in the sense that suburban areas might have large numbers of particularly low-wage jobs, high-wage jobs, or both. A large presence of jobs in typically low-wage sectors, such as food services and accommodation or retail trade, for example, may contribute to higher inequality within neighborhoods.

On a more speculative level, less-dense suburban areas might be characterized by fewer social interactions among individuals of different groups, as defined by income or education. That is, although suburban neighborhoods may have a more heterogeneous mix of residents, the extent of productive interaction among them may be relatively low. Following Glaeser (1999), this may lead to greater income inequality as “less-skilled” workers have fewer opportunities to learn from their “more-skilled” counterparts.

At this point, both explanations are purely hypothetical and, therefore, require further research. Given the relative dearth of studies of the inequality–urban decentralization issue, such research certainly seems worthwhile.

REFERENCES

Abramson, Alan J.; Tobin, Mitchell S. and VanderGoot, Matthew R. “The Changing Geography of Metropolitan Opportunity: The Segregation of the Poor in Metropolitan Areas, 1970 to 1990.” Housing Policy Debate, 1995, 6(1), pp. 45-72; www.fanniemaefoundation.org/programs/hpd/pdf/hpd_0601_abramson.pdf.

Benabou, Roland. “Heterogeneity, Stratification, and Growth: Macroeconomic Implications of Community Structure and School Finance.” American Economic Review, 1996, 86, pp. 584-609.

Brueckner, Jan K. and Rosenthal, Stuart S. “Gentrification and Neighborhood Housing Cycles: Will America’s Future Downtowns Be Rich?” CESifo Working Paper Series No. 1579, University of California-Irvine, April 2, 2008; www.socsci.uci.edu/~jkbrueck/gentrification.pdf.

Epple, Dennis and Sieg, Holger. “Estimating Equilibrium Models of Local Jurisdictions.” Journal of Political Economy, August 1999, 107(4), pp. 645-81.

Fortin, Nicole M. and Lemieux, Thomas. “Institutional Changes and Rising Wage Inequality: Is There a Linkage?” Journal of Economic Perspectives, Spring 1997, 11(2), pp. 75-96.

Glaeser, Edward L. “Learning in Cities.” Journal of Urban Economics, September 1999, 46(2), pp. 254-77.

Glaeser, Edward L. and Kahn, Matthew E. “Sprawl and Urban Growth,” in J. Vernon Henderson and Jacques-François Thisse, eds., Handbook of Regional and Urban Economics, volume 4: Cities and Geography (Handbooks in Economics), chapter 56. New York: Elsevier, 2004, pp. 2481-528.

Hardman, Anna and Ioannides, Yannis. “Neighbors’ Income Distribution: Economic Segregation and Mixing in US Urban Neighborhoods.” Journal of Housing Economics, December 2004, 13(4), pp. 368-82.

Hirsch, Barry T.; Macpherson, David A. and Vroman, Wayne G. “Estimates of Union Density by State.” Monthly Labor Review, July 2001, 124(7), pp. 51-55; http://www.bls.gov/opub/mlr/2001/07/ressum2.htm.

Hobbs, Frank and Stoops, Nicole. “Demographic Trends in the 20th Century.” US Census Bureau, Census 2000 Special Reports, Series CENSR-4 (November 2002). Washington, DC: US Government Printing Office; www.census.gov/prod/2002pubs/censr-4.pdf.

Holzer, Harry J. “The Spatial Mismatch Hypothesis: What Has the Evidence Shown?” Urban Studies, February 1991, 28(1), pp. 105-22.

Ihlanfeldt, Keith R. and Sjoquist, David L. “The Impact of Job Decentralization on the Economic Welfare of Central City Blacks.” Journal of Urban Economics, July 1989, 26(1), pp. 110-30.

Ioannides, Yannis M. “Neighborhood Income Distributions.” Journal of Urban Economics, November 2004, 56(3), pp. 435-57.

Johnson, Norman L. and Kotz, Samuel. Continuous Univariate Distributions. Boston: Houghton Mifflin, 1970.

Juhn, Chinhui; Murphy, Kevin M. and Pierce, Brooks. “Wage Inequality and the Rise in Returns to Skill.” Journal of Political Economy, 1993, 101(3), pp. 410-42.

Kain, John F. “Housing Segregation, Negro Employment, and Metropolitan Decentralization.” Quarterly Journal of Economics, May 1968, 82(2), pp. 175-97.

Kasarda, John D. “Inner-City Concentrated Poverty and Neighborhood Distress: 1970-1990.” Housing Policy Debate, 1993, 4(3), pp. 253-302.

Katz, Lawrence F. and Murphy, Kevin M. “Changes in Relative Wages, 1963-1987: Supply and Demand Factors.” Quarterly Journal of Economics, February 1992, 107, pp. 35-78.

Margo, Robert A. “Explaining the Postwar Suburbanization of the Population of the United States: The Role of Income.” Journal of Urban Economics, May 1992, 31(3), pp. 301-10.

Mayer, Christopher J. “Does Location Matter?” New England Economic Review, May/June 1996, pp. 26-40.

Weinberg, Bruce A. “Black Residential Centralization and the Spatial Mismatch Hypothesis.” Journal of Urban Economics, July 2000, 48(1), pp. 110-34.

Weinberg, Bruce A. “Testing the Spatial Mismatch Hypothesis Using Inter-City Variations in Industrial Composition.” Regional Science and Urban Economics, September 2004, 34(5), pp. 505-32.

Wheeler, Christopher H. “Wage Inequality and Urban Density.” Journal of Economic Geography, 2004, 4(4), pp. 421-37.

Wilson, William J. The Truly Disadvantaged: The Inner City, the Underclass, and Public Policy. Chicago: University of Chicago Press, 1987.

Yang, Rebecca and Jargowsky, Paul A. “Suburban Development and Economic Segregation in the 1990s.” Journal of Urban Affairs, June 2006, 28(3), pp. 253-73.

APPENDIX

Income Categories Used in Analysis*

| 1980 Income Categories ($) | 1990 Income Categories ($) | 2000 Income Categories ($) |
|---|---|---|
| 0-4,999 | 0-4,999 | 0-9,999 |
| 5,000-7,499 | 5,000-9,999 | 10,000-14,999 |
| 7,500-9,999 | 10,000-12,499 | 15,000-19,999 |
| 10,000-12,499 | 12,500-14,999 | 20,000-24,999 |
| 12,500-14,999 | 15,000-17,499 | 25,000-29,999 |
| 15,000-17,499 | 17,500-19,999 | 30,000-34,999 |
| 17,500-19,999 | 20,000-22,499 | 35,000-39,999 |
| 20,000-22,499 | 22,500-24,999 | 40,000-44,999 |
| 22,500-24,999 | 25,000-27,499 | 45,000-49,999 |
| 25,000-27,499 | 27,500-29,999 | 50,000-59,999 |
| 27,500-29,999 | 30,000-32,499 | 60,000-74,999 |
| 30,000-34,999 | 32,500-34,999 | 75,000-99,999 |
| 35,000-39,999 | 35,000-37,499 | 100,000-124,999 |
| 40,000-49,999 | 37,500-39,999 | 125,000-149,999 |
| 50,000-74,999 | 40,000-42,499 | 150,000-199,999 |
| | 42,500-44,999 | |
| | 45,000-47,499 | |
| | 47,500-49,999 | |
| | 50,000-54,999 | |
| | 55,000-59,999 | |
| | 60,000-74,999 | |
| | 75,000-99,999 | |
| | 100,000-124,999 | |
| | 125,000-149,999 | |

NOTE: * See footnote 6.