Working Paper Series

Does Greater Inequality Lead to More Household Borrowing? New Evidence from Household Data

WP 14-01

Olivier Coibion, UT Austin and NBER
Yuriy Gorodnichenko, UC Berkeley and NBER
Marianna Kudlyak, Federal Reserve Bank of Richmond
John Mondragon, UC Berkeley

This paper can be downloaded without charge from: http://www.richmondfed.org/publications/

This Draft: January 10th, 2014

Abstract: One suggested hypothesis for the dramatic rise in household borrowing that preceded the financial crisis is that low-income households increased their demand for credit to finance higher consumption expenditures in order to “keep up” with higher-income households. Using household level data on debt accumulation during 2001-2012, we show that low-income households in high-inequality regions accumulated less debt relative to income than their counterparts in lower-inequality regions, which negates the hypothesis. We argue instead that these patterns are consistent with supply-side interpretations of debt accumulation patterns during the 2000s. We present a model in which banks use applicants’ incomes, combined with local income inequality, to infer the underlying type of the applicant, so that banks ultimately channel more credit toward lower-income applicants in low-inequality regions than high-inequality regions. We confirm the predictions of the model using data on individual mortgage applications in high- and low-inequality regions over this time period.

JEL: E21, E51, D14, G21
Keywords: inequality, household debt, Great Recession

We are grateful to Meta Brown and Donghoon Lee for helpful comments about the data, and seminar participants at the Richmond Fed, St. Louis Fed, and CES-Ifo conference.
The views expressed here are those of the authors and do not reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System or any other institution with which the authors are affiliated. Mondragon thanks the Richmond Fed for their generous support while part of this paper was written. Gorodnichenko thanks the NSF and Sloan Foundation for financial support.

1 Introduction

The financial crisis of 2008-09 was preceded by an exceptional rise in borrowing by U.S. households, accounted for primarily by a rise in mortgage debt. This increasing mortgage debt was securitized and ultimately played a key role in bringing down the financial system once housing prices began to decline and the associated mortgage-backed securities fell sharply in value. Why did households take on so much new debt in the years immediately preceding the financial crisis? There are two main views about this process.

The first view is that the rise in borrowing reflected “credit supply” factors. Proponents point to the progress in information technology (Sanchez 2009 and Athreya, Tam and Young 2012) and rising financialization of debt (especially mortgages) as increasing the supply of credit to households with a disproportionally larger increase of credit to low-income and high risk households (Drozd and Serrano-Padial, 2013). Others also point to political motivations for expanding credit supply. For example, Rajan (2010) argues that, in response to rising income inequality, credit was made increasingly available to lower income groups to support their consumption levels in the face of stagnant incomes.

According to the second view (“demand for credit”), there was a rise in the demand for borrowing on the part of U.S. households, especially low-income households. One motivation for such a rise in demand for borrowing again stems from rising inequality in the U.S.
Specifically, rising consumption on the part of wealthy households could have generated a rise in the demand for borrowing on the part of lower-income households in their attempts to “keep up” with their wealthier neighbors, the so-called “keeping up with the Joneses” effect. Indeed, there is a positive correlation between income inequality in the U.S. (income share of the top 5%) and household debt relative to GDP (Figure 1) over time. Both were stable from 1967 to around 1980, then both measures rose gradually over the course of the 1980s as noted in Iacoviello (2008). But while income inequality then went up sharply in the early 1990s, household debt only caught up over the 2000s. The correlation is certainly consistent with the possibility of a causal relationship running from inequality to household borrowing. In this paper, we focus specifically on the link between inequality and household borrowing. In particular, we investigate whether borrowing patterns on the part of low-, middle- and high-income households differed depending on the level of local income inequality (where we define “local” as ranging from as fine a geographic level as the zip code to as aggregated a level as the state). Local inequality is, from a household’s point of view, likely to be the most relevant metric for “keeping up with Joneses”. Furthermore, with most of the rise in income inequality in the U.S. since the 1970s reflecting a rise in inequality within regions rather than inequality across regions, any sensitivity of borrowing to local inequality levels could readily have translated into aggregate effects. 2 To assess whether borrowing patterns differed depending on local inequality levels, we study the changes in debt to income ratios at the household level over the course of the 2000s and their relationship with households’ relative standings in the income distribution and the amount of local income inequality. 
We use unique data from the New York Federal Reserve Bank Consumer Credit Panel/Equifax (CCP) which provides comprehensive debt measures for millions of U.S. households since 1999, including detailed decompositions of debt by type (i.e. mortgage, auto, credit cards, etc.). Because this dataset does not include a measure of household income, we use the relationship between household debt and income, conditional on observable household characteristics, in the Survey of Consumer Finances to predict initial household income in 2001. This imputation allows us to study the relationship between income and debt in unprecedented detail. We then characterize the evolution of household debt levels, relative to initial income levels, across income groups in areas with different levels of income inequality, which is akin to a “difference-in-differences” approach across income groups and regional inequality levels. Our main finding is that high-income households in high-inequality regions accumulated more debt relative to their incomes than did low-income households in the same regions, or equivalently that low-income households in high-inequality regions borrowed relatively less than similar households in low-inequality regions. This effect is precisely the opposite of what one would have expected from “keeping up with the Joneses” driving the rise in household debt during the 2000s. We show that this result is remarkably robust and holds up to an extensive array of robustness checks: e.g. we find these patterns within households with low or high credit scores, within regions which experienced either high or low home price appreciation, within households with either low or high initial debt levels, etc. We measure inequality at the zip code, county and state and find similar results across levels of aggregation. 
The fact that the baseline results are robust to controlling for a wide range of other local factors that are correlated with inequality levels suggests that it is indeed the level of inequality that matters rather than inequality being a stand-in for other economic channels.

Because our data provides disaggregated information on household debt, we assess the link between local inequality and different forms of debt: mortgage debt, auto debt, and credit card debt. We find strong evidence that low-income households in high-inequality regions borrowed less in terms of both mortgage and auto debt than those in low-inequality regions. A unique feature of the data is that we have information on both credit card balances as well as credit card limits. This is particularly useful because the latter can be interpreted as largely representing credit supply whereas the former primarily reflects the demand for credit. We find that low-income households in high-inequality regions saw their credit limits rise by less than those in lower inequality regions as was the case with mortgage and auto debt. At the same time, no economically significant heterogeneity is observed in terms of credit card balances. We interpret this contrast as pointing to supply side factors as being at the root of the differential debt accumulation patterns that we observe in the data.

To illustrate how supply-side factors can explain the differential borrowing behavior tied to regional inequality, we present a model in which each region is composed of two types of households. High-type households have higher income on average than low-type households and are also less likely to (exogenously) default on debt. A continuum of banks in each region lends to these households but banks do not observe households’ types, only their income and another signal correlated with the underlying type.
As income inequality rises, banks treat an applicant’s income as an increasingly precise signal about their type and therefore target lending toward higher income households on average. How they do so, however, can vary with the local banking structure. For example, if banks are perfectly competitive and can charge different interest rates to different applicants, then higher-income applicants will on average face lower interest rates than low-income applicants, and this difference will be increasing in the amount of local income inequality. If instead we model the banking system as being monopolistic and forced to charge a common interest rate to all applicants, then this bank will reject low-income applicants more frequently than high-income applicants, and this difference will again be increasing in the amount of local inequality. In both cases, banks will make credit more readily accessible (or cheaper) to high-income households when local inequality is higher because the latter implies that income is a more precise signal of applicant types. The credit supply mechanism in the model has some testable implications. If banks use individual incomes combined with regional inequality as a signal about individuals’ types, then we would expect to see richer households be denied less often when applying for mortgages in high-inequality regions than in low-inequality regions, holding other characteristics constant. Similarly, one would expect richer households in high-inequality regions to be less likely to pay higher interest rates on a loan. We test these theoretical predictions using detailed mortgage application information from the publicly available Home Mortgage Disclosure Act data (HMDA). These data track mortgage applications as they go through the origination process and contain information on applicants (including their income, the amount of the loan requested, their locale, and whether the loan is denied or originated). 
We document that high-income households in high-inequality regions were less likely to be denied than their counterparts in low-inequality regions, precisely as suggested by the theory. High-income households in high-inequality regions were also less likely to be charged higher interest rates for their mortgages than equivalent households in low-inequality regions. Thus, both theoretical predictions from the model are confirmed in the data.

In summary, we document a systematic relationship between local inequality and differential borrowing patterns across richer and poorer households in the U.S. that contradicts predictions based on “keeping up with the Joneses” motives. We argue that these results can instead be explained through an information channel: applicants’ incomes are a stronger signal of their underlying quality when local inequality is high so banks are likely to channel relatively more credit to low-income applicants when the level of local inequality is low. These results have implications for interpreting the sources of the dramatic rise in borrowing by households during the housing boom, indicating that the source was more likely to stem from an expansion in credit supply than credit demand.

This paper is most closely related to recent work evaluating the strength of “keeping up with the Joneses” forces. Most notably, Bertrand and Morse (2013) study whether rising consumption of the rich induces the non-rich to consume more.1 Using the Consumer Expenditure Survey (CES), they find that, within a state, the consumption of the rich (the top quintile of the income distribution) predicts higher consumption for the non-rich, holding everything else constant including own income. Bertrand and Morse interpret their estimates as supporting the view that rising income inequality in a geographic market translates into more demand for credit by low and middle-income households (see, for example, Rajan 2010).
In contrast, by focusing explicitly on the borrowing decisions of households and exploiting a finer level of geographic variation, we document that low and middle income households living in high-inequality regions borrowed no more, and in fact less, than similar households in low-inequality regions. This need not be interpreted as contradicting the empirical results of Bertrand and Morse (2013), since the differences in consumption that they document could have been financed through channels other than debt, e.g. through increased labor force participation, longer working hours, etc. But our results indicate that “keeping up with the Joneses” forces are unlikely to have played a primary role in accounting for the dramatic rise in household leverage during the 2000s and therefore in laying the groundwork for the financial crisis of 2008-2009.

This paper therefore also relates to a broader line of research investigating the macroeconomic consequences of income inequality, such as whether inequality is systematically related to financial crises. Kumhof, Ranciere and Winant (2013), for example, argue that a rise in inequality driven by an increase in the share of income going to those at the top of the income distribution induces the latter to save more, lowering interest rates and inducing poorer households to borrow more, ultimately leading to more financial fragility and a higher likelihood of a financial crisis. Bordo and Meissner (2012) find little evidence of such a link based on aggregate data since 1920 for fourteen advanced economies, whereas Perugini, Holscher and Collier (2013) find a positive link between income inequality and private sector indebtedness since 1970 across eighteen economies.
We contribute to this literature by documenting how, within U.S. regions, debt accumulation patterns across different segments of the population over the course of the 2000s were systematically related to local levels of income inequality. We also provide a novel interpretation for these effects: local income inequality can be used in combination with an applicant’s income level to refine inference about borrower types. In such a setting, higher levels of income inequality will induce banks to reallocate credit toward higher income applicants and away from lower income applicants, thereby potentially amplifying the implications of a more unequal income distribution for the distribution of consumption.

1 Prior evidence in the same spirit as Bertrand and Morse (2013) includes Neumark and Postlewaite (1998), Zizzo and Oswald (2001), Christen and Morgan (2005), Luttmer (2005), Daly and Wilson (2006), Maurer and Meier (2008), Charles, Hurst and Roussanov (2009), Kuhn et al. (2010), Heffetz (2011), and Guven and Sorensen (2012).

The relationship between income inequality and the allocation of credit emphasized in our paper also relates to the literature on consumption and income inequality. Krueger and Perri (2006) and related works argue that consumption inequality during the last decades did not rise with income inequality.2 Krueger and Perri argue that low-income households have experienced income shocks that increased income inequality, but due to enhanced financial intermediation these households have been able to smooth their consumption such that consumption inequality remained stable. Iacoviello (2008) replicates the trend and cyclicality of household debt since the 1960s and also argues that increased access to credit has allowed households to smooth increasingly volatile income processes. As income inequality increases, households use credit markets to smooth the temporary income shocks so that the aggregate level of debt increases with inequality.
In contrast, Aguiar and Bils (2012) argue that, when one corrects for measurement errors associated with underreporting of consumption expenditures over time and across different goods, consumption inequality has tracked income inequality closely over the last three decades. While this line of research appeals to financial intermediation as a key link between consumption and income inequality, it could not measure directly the quantitative importance of formal borrowing for smoothing shocks and its relation to inequality due to data constraints. We examine this issue directly using household level data on debt accumulation. Our results are consistent with the findings in Aguiar and Bils (2012) because if low-income households were smoothing shocks to the extent suggested by Krueger and Perri then we would expect low-income households to have accumulated relatively more debt in areas where inequality is higher.

2 Related papers are Blundell, Pistaferri, and Preston (2008), Heathcote, Storesletten, and Violante (2010), and Heathcote, Perri, and Violante (2010).

We also contribute to the vast literature on household borrowing that covers such diverse topics as pricing of mortgages, optimal portfolios of household debt, risk scoring, and determinants of default probabilities. Our paper is most related to studies of default determinants (e.g., Fay, Hurst, and White 2002, Gross and Souleles 2002) and lenders’ treatment of loan applications (e.g., Tootell 1996, Munnell et al. 1996, Turner and Skidmore 1999) in the sense that we attempt to understand who obtains credit and at what terms. However, while previous research studies these aspects for borrowers (or lenders) without relating a given individual to the pool of borrowers, we explicitly focus on how the relative positions of borrowers in the income distribution as well as the properties of the income distribution can affect the level of debt they accumulate.
Thus, in contrast to the previous literature, we examine directly the interplay between debt and inequality, which have both been salient subjects of recent policy and academic debates. This paper is structured as follows. We describe our primary source of data in section 2 as well as our imputation procedure for household income. In section 3, we consider household-level regressions describing the differential debt accumulation patterns across income levels in regions with different levels of income inequality. Section 4 presents a model that can explain these patterns. In section 5, we test and confirm the additional predictions of the model using data on mortgage applications by individuals in different inequality areas. Section 6 concludes. 2 Data In this section, we first describe the dataset used to measure household debt accumulation over the course of the 2000s. Second, we discuss how we impute household income based on observed patterns in the Survey of Consumer Finances. Third, we construct local income inequality measures and describe some of their properties. 2.1. The New York Federal Reserve Bank Consumer Credit Panel/Equifax We measure household debt accumulation using the New York Federal Reserve Bank Consumer Credit Panel/Equifax (CCP) data. The CCP is a quarterly panel of individuals with detailed information on consumer liabilities, delinquency, some demographic information, credit scores, and geographic identifiers to the zip level. 3 The core of the database constitutes a 5% random sample of all U.S. individuals with credit files. The database also contains information on all individuals with credit files residing in the same household as the individuals in the primary sample. The household members are added to the sample based on the mailing address in the existing credit files. Using the households’ identifiers, we aggregate individual records into households’ records and construct measures of households’ debt. 
Thus, the resulting sample is a sample of U.S. households in which at least one member has a credit file. The data in the CCP are updated quarterly. We use 100% of the CCP sample. Lee and van der Klaauw (2010) provide an excellent detailed description of the database.

The data cover all major categories of household debt including mortgages, home equity lines of credit (HELOC), credit cards, and student loans. Because of the large sample size, the breadth of variables observed, detailed location, and the ability to construct a quarterly household panel, these data provide the most detailed picture of household debt available.

3 For complete details on the data set and variables construction see Appendix B.

2.2. Income Rank Imputation

While the CCP provides detailed records of household debt and geographical location, it does not include information on household income. To address this issue, we impute income in the CCP using information from the Survey of Consumer Finances (SCF). The SCF is a household-level survey that contains information on debt balances and income as well as a rich set of demographic characteristics. However, the SCF does not provide geographic identifiers in the publicly available data. We use the SCF to estimate how household income relates to debt and demographic characteristics available in both the CCP and SCF data sets. We then use these estimates to impute household income in the CCP data. Finally, we use the imputed income and the estimated error terms from the SCF to impute the household’s income rank in the household’s geographical area.

In our analysis, we restrict the sample to households for whom the household head’s age is between 20 and 65 to minimize potential age related selection effects. We use data from the third quarter of the CCP for years 2001 - 2012. We follow Brown et al.
(2011) and choose the third quarter to maximize the match with the SCF survey (typically administered between April and December), which we use to impute the initial income distribution as described below. For consistency, we then use the third quarter of each subsequent year to generate annual measures of household debt. Table 1 contains the summary statistics from the CCP and SCF samples from the third quarter of 2001. The statistics from the SCF and CCP are similar for most categories with the exception of credit card balances. This finding is consistent with Brown et al. (2011) reporting that overall and in the majority of disaggregated debt categories (mortgages, auto loans and HELOCs), borrower characteristics and environment cells, debt levels reported in the SCF and CCP are similar. Brown et al. (2011) suggest that some of the discrepancy between the credit card balance statistics in the two datasets might come from the way credit card balances are recorded: the CCP contains records of all credit card balances, whereas the households in the SCF might only report the fraction of the balance they intend to roll over. 4 The mortgage balance and HELOCs in the CCP are slightly higher than in the SCF because the CCP measure includes secondary/investment properties, while in the SCF it does not (see Brown et al. 2011). 4 In the CCP, the credit balance is recorded on some date during the quarter. For some individuals, this can be the date right before they pay off most of their credit balance, and the balance might largely reflect the transaction use of the credit cards. For other individuals, the date might be the date after they pay off the intended balance and the remaining amount reflects the carry-over balances. In the SCF, the credit balance reported likely does not reflect the use of credit card for transactions, but rather the debt that the household does not plan to repay in the current period. 
In addition, the households in the SCF might forget older balances.

The auto debt balance is also slightly higher in the CCP because the CCP always includes auto leases, while in the SCF respondents usually do not report car leases as auto debt. The bankruptcy rates are very similar between the two samples. The tables also show some differences between the delinquency statistics in the two datasets. The SCF households probably report only severe delinquencies on large quantities of debt and do not report delinquencies that they regard as temporary or small.5

To impute the rank in the income distribution for a household in the CCP, we first estimate the following relationship between the household’s gross income and observable characteristics in the 2001 SCF,

$$\log(Y_{i,SCF}) = f(\beta X_{i,SCF}) + \epsilon_{i,SCF}, \tag{1}$$

where $Y_{i,SCF}$ is the income of household $i$, and $X_{i,SCF}$ is the vector of the household’s characteristics that include (logs of) mortgage balance, credit card balance, credit card limit, an indicator for positive credit card limit, the credit card utilization rate conditional on positive credit card limit, auto loan balance, HELOC balance, student loan balance, an indicator for bankruptcy, an indicator of 60 days or more past due on any loan, the age of the head of the household and the household size. $f(\cdot)$ is a function that includes polynomials, interaction terms, and dummy variables. Appendix F provides more information on the specification and variables. We estimate equation (1) using OLS (with the SCF sampling weights) and eliminate outliers using Cook's distance.6 The adjusted $R^2$ for this regression is 0.55.

Using the estimated $\hat{\beta}$, we construct the expected imputed (log) income for each household $i$ in the third quarter of 2001 in the CCP data: $E[\log(Y_i)] = f(\hat{\beta} X_{i,CCP})$, and the expected imputed income (in levels) $E[Y_i] = \exp\left[E[\log(Y_i)] + 0.5\,\sigma^2_{\hat{\epsilon}_{i,SCF}}\right]$, where $\sigma^2_{\hat{\epsilon}_{i,SCF}} = 0.3721$ is the variance of $\epsilon_{i,SCF}$ estimated in equation (1).
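The step from expected log income to expected income in levels can be illustrated with a minimal sketch. The residual variance 0.3721 is taken from the text; the function name and the example household are hypothetical, and reading the 0.5 times variance term as the usual log-normal adjustment (that is, treating the residuals as approximately normal) is our assumption rather than something stated above.

```python
import math

# Residual variance reported in the text for equation (1).
SIGMA2_EPS = 0.3721


def expected_income(expected_log_income, sigma2=SIGMA2_EPS):
    """Return exp(E[log Y] + 0.5 * sigma^2), the level prediction used in the text.

    Interpreting the 0.5 * sigma^2 term as a log-normal correction is an
    assumption of this sketch.
    """
    return math.exp(expected_log_income + 0.5 * sigma2)


# Hypothetical household whose fitted log income equals log(50,000).
fitted_log_income = math.log(50_000)
naive_level = math.exp(fitted_log_income)       # ignores the variance term
corrected_level = expected_income(fitted_log_income)

# Ratio of corrected to naive prediction: exp(0.5 * 0.3721), roughly 1.20.
correction_factor = corrected_level / naive_level
```

With a residual variance of this size the adjustment raises the level prediction by roughly a fifth for every household, so it does not change the ordering of households by fitted income.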
5 In the SCF data, the 60DPD indicator is the indicator of whether a household has ever been delinquent on any loan for 60 days or longer. In the CCP data, the 60DPD indicator is the indicator of whether a household is delinquent on any loan for 60 days or longer in the current quarter.

6 Equation (1) is estimated only for observations with positive values of income. We also restrict our analysis to the 50 U.S. states and the District of Columbia, dropping the observations from Puerto Rico and U.S.-owned territories.

Having imputed households’ income in the CCP, we then estimate the household’s rank in the local income distribution. For each household $i$ in area $c$ we construct its income rank in 2001, $R_{i,c,2001}$, as the rank of the household's expected imputed income, $E[\log(Y_{i,2001})]$, in the imputed income distribution for location $c$. We approximate the local income distribution through a simple resampling procedure. In particular, we assume that the distribution of income residuals estimated in the SCF is the same across all locations. Note that to the extent that this assumption is not appropriate, we will tend to bias our results against finding any role for inequality in accounting for debt dynamics. After drawing a household from location $c$ in the CCP and calculating its expected income, we add a randomly drawn residual estimated on the SCF sample to obtain the actual household income: $\log(Y_{i,c,CCP}) = f(\hat{\beta} X_{i,c,CCP}) + \hat{\epsilon}_{SCF}$. By repeating the process 50,000 times, with draws done with replacement, we approximate the local income distribution. We then calculate each household’s percentile rank ($R_{i,c,2001}$) as well as distributional statistics. The higher the value of $R_{i,c,2001}$, the richer household $i$ is relative to other households in its geographical location $c$ in 2001. We separately construct the rank of the household by the household's location at the three different levels of aggregation: zip code, county and state.
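The resampling procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy fitted values and the toy residual pool are hypothetical, and the tie-handling convention in the rank (share of simulated draws at or below the household's fitted value) is our choice, since the text does not specify one.

```python
import bisect
import random


def simulate_local_distribution(fitted_log_incomes, scf_residuals,
                                n_draws=50_000, seed=0):
    """Approximate one location's log-income distribution by resampling.

    Each draw picks a household from the location (with replacement) and adds
    a residual drawn (with replacement) from the pooled SCF residuals.
    """
    rng = random.Random(seed)
    draws = [rng.choice(fitted_log_incomes) + rng.choice(scf_residuals)
             for _ in range(n_draws)]
    draws.sort()
    return draws


def percentile_rank(fitted_log_income, sorted_draws):
    """Percent of simulated incomes at or below the household's fitted value."""
    position = bisect.bisect_right(sorted_draws, fitted_log_income)
    return 100.0 * position / len(sorted_draws)


# Hypothetical location with three households and a toy residual pool.
fitted = [10.2, 10.8, 11.5]
residuals = [-0.6, -0.1, 0.0, 0.2, 0.5]

local_draws = simulate_local_distribution(fitted, residuals)
ranks = [percentile_rank(value, local_draws) for value in fitted]
```

Because the same residual pool is used in every location, differences in simulated local distributions come only from differences in the fitted incomes of the households living there, which mirrors the common-residual assumption stated in the text.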
When the measure is constructed at the zip code level, we restrict the analysis to zip codes with at least 100 households in our CCP sample. This gives us 14,529 distinct zip codes in 2001. When the measure is constructed at the county level, we restrict the analysis to counties with at least 300 households in our CCP sample. This procedure gives us 2,303 counties in 2001, covering over 35,000 zip codes. We check the quality of our imputation in a number of ways. Table 2 presents the moments of the income distribution imputed in the CCP and the same moments calculated from the SCF. The two sets of moments are very similar, suggesting that our imputation function is sensible. We also check the quality of our income imputation procedure by bringing income information to the CCP data from an alternative source. In particular, we merge the CCP data with the data from a proprietary database. This database has detailed mortgage-level panel data that contain information on a majority of mortgages originated in the U.S.. These data include the debt-to-income ratio associated with each mortgage at the time of origination. We use information on the mortgage origination month, location (zip code) and balance from this proprietary database and the same attributes from the mortgage trade-line data in the CCP to match households in the two datasets. 7 The earliest year when the debt-to-income variable is available in both the proprietary dataset and the SCF is 2007; thus we merge the data using the first mortgages originated in 2007. Prior to the merge, we eliminate all cases of multiple mortgages with the same combination of open month, initial balance and zip code in both datasets to ensure that the match is unique. For the sample of matched households we then use the debt-to-income ratio from the proprietary database and the debt in the CCP to estimate the income. 
For this subset of matched households we compare the income rank derived from the proprietary data with the income rank derived from the SCF-CCP imputation. The two measures of rank are highly and positively correlated (Spearman correlation coefficient is 0.55), confirming that our imputation procedure provides a good measure of income. When we regress the imputed CCP measure of income on the actual measure of income from the proprietary database, the estimate of the slope is practically one and thus measurement errors arising from the imputation do not appear to be mean-reverting to any significant extent.

7 See Elul et al. (2010) for a similar merge procedure.

2.3. Local Inequality Measures

Having imputed income in the CCP, we construct the local inequality measures for 2001 ($I_{c,2001}$). Our preferred measure of inequality is the difference between expected log income at the 90th percentile and expected log income at the 10th percentile, i.e.,

$$I_{c,2001} = p90_c\left[E\{\log(Y_{i,c,2001})\}\right] - p10_c\left[E\{\log(Y_{i,c,2001})\}\right].$$

We then compare this measure to inequality measures constructed from alternative sources. At the zip code level, we use data from the IRS on household adjusted gross income (AGI) drawn from the 2001 tax returns. At the county level, we use the Census data on household income from 2000. Both of these sources provide income bins and the fraction of the population within each bin. Using this information, we construct a simple approximation to the Gini coefficient. The CCP measure constructed from imputed incomes is highly correlated with Gini coefficients based on Census or IRS data. For example, the correlation between Gini coefficients from the 2000 Census and 90-10 differences in the CCP data at the county level is 0.59.

Figure 2 plots a map of U.S. inequality at the county level. Inequality is on average highest in the southern states, as well as California and the Pacific Northwest.
Midwestern states, in contrast, stand out for having some of the lowest levels of inequality on average. The map also shows that inequality tends to be higher in large cities than in more rural areas. The map, which plots inequality at the county level, masks even greater regional heterogeneity in inequality at the zip code level. Figure 3 plots histograms of our CCP inequality measure at each level of aggregation. Average inequality is higher at lower levels of aggregation with a mean across zip codes of 2.24 and a mean of 1.68 across states. The standard deviation of inequality is twice as high (0.15) at the zip level compared to the state level (0.07).

We focus on local income inequality for a number of reasons. First, this is likely to be the most relevant metric when households compare themselves to others. Second, it avoids measurement issues associated with comparing incomes across very different areas (e.g. $100K in New York vs. Tulsa). Third, much of the rise in aggregate inequality in the U.S. reflects rising inequality within regions rather than across regions.⁸ Finally, there is much more variation in income inequality across regions than in aggregate inequality over time, which is necessary for identifying any potential effects of inequality on household behavior.

⁸ In Appendix C, we describe in detail a decomposition of aggregate income inequality in the U.S. from 1970 to 2000 measured using Census income data. When we measure the relative importance of differences in mean incomes across regions (“between” inequality) versus the dispersion of incomes within regions (“within” inequality) for each Census, we find that “between” inequality has consistently accounted for less than two percent of total inequality and that this share has, if anything, been declining over time.

3 Empirical Analysis of Debt and Inequality

In this section, we investigate whether households’ borrowing patterns from 2001 to 2012 varied with local inequality.
We do so using household level regressions of debt to income changes over time as a function of household characteristics, the household’s position in the local income distribution, and interactions of the latter with local inequality measures. We find that while the evidence supports the notion that local inequality affected debt accumulation patterns across income groups, the direction of the effect is opposite to what one would expect from “keeping up with the Joneses” effects. We document the robustness of this result along a variety of dimensions.

3.1. Baseline Results

We are interested in estimating the role of initial local income inequality on the relationship between the household's debt accumulation and the household's rank in the initial local income distribution. In particular, we estimate the change in the household's debt between 2001 and year $t$, $2002 \le t \le 2012$, as a function of the household's income rank in the 2001 local income distribution, conditional on local income inequality in 2001. The benchmark specification is

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + c^{+} + \epsilon_{ict}, \tag{2}$$

where $\Delta D_{ict}/E[Y]_{ic,2001}$ is the change from year 2001 to year $t$ in the debt of household $i$ that resides in location $c$ relative to the household's (imputed expected) income in 2001 (in levels), i.e.,

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} \equiv \frac{D_{ict} - D_{ic,2001}}{E[Y]_{ic,2001}},$$

where $D_{ict}$ is deflated by the CPI-U and expressed in 2001 dollars. $c^{+}$ is the fixed effect of the geographical location that is at one level of aggregation higher than the geographic area used to construct the income distribution and the income inequality measure.⁹ We use the 2001 measure of local income inequality because it is predetermined relative to subsequent household debt accumulation decisions and it is highly persistent over time.

Parameters $\alpha$, $\beta$ and $\gamma$ describe the relationship between the household’s debt accumulation and local inequality.
If $\alpha < 0$, low-rank households within an area accumulate relatively more debt than the high-rank households. If $\beta = \gamma = 0$, then local inequality is irrelevant for household debt accumulation. This case is shown in Panel A of Figure 4. Panel B of Figure 4 illustrates the case when $\alpha < 0$, $\beta > 0$, $\gamma < 0$. If $\beta > 0$, an area with higher inequality is associated with higher debt accumulation. If $\gamma < 0$, this effect weakens as household rank increases. Such a case is an example of the “keeping up with the Joneses” hypothesis. Specification (2) can be interpreted as a “difference-in-differences” approach in which we compare high- and low-ranked households across high- and low-inequality regions, with $\gamma$ being the key parameter that determines whether such differences have been important.

⁹ For example, in the regressions with zip code-level distribution of income and inequality, we control for county-level fixed effects. In the regressions with county-level rank and inequality, we control for state-level fixed effects. We do not control for the geographical fixed effects in the regressions with state-level income rank and inequality.

We estimate equation (2) separately for each year $t$, $2002 \le t \le 2012$. In each year $t$, we follow Guerrieri, Hartley and Hurst (2013) and restrict the sample to households that reside in the same geographical area $c$ in 2001 and in $t$. In each regression, we exclude the observations below the 2nd and above the 98th percentile of the distribution of $\Delta D_{ict}/E[Y]_{ic,2001}$ in year $t$. The standard errors are clustered by geographic location $c$.¹⁰

Our baseline estimates of equation (2), estimated at the zip code level with county fixed effects for years ranging from 2002 to 2012, are reported in Panel A of Table 3. Our first finding is that the coefficient on a household’s rank in the income distribution ($\alpha$) is consistently negative, with a peak absolute value in 2007.
Hence, debt accumulation over the course of the early to mid-2000s was, on average, greater for lower income households. Second, the estimated coefficient on the inequality level of the zip code is systematically negative, again peaking in absolute value in 2007. This implies that, holding everything else constant, households living in the more unequal areas within a county accumulated less debt over the early to mid-2000s than did those in lower inequality areas in the same county.

The key parameter for us is $\gamma$, which captures the interaction of household rank in the local income distribution and local inequality. Our main finding is that $\gamma$ is positive over this time period. This implies that debt accumulation was relatively higher for (sufficiently) high-income households in high-inequality regions than in low-inequality regions, or equivalently that lower income households in high-inequality regions borrowed relatively less than their counterparts in lower inequality regions. This result is precisely the opposite of what one would have expected from “keeping up with the Joneses” effects. Panel C of Figure 4 illustrates our results qualitatively. Households with rank to the right of the crossing accumulate more debt on average as inequality increases. Households to the left of the crossing accumulate relatively less debt as inequality increases.

To give a sense of the economic magnitudes, we calculate the change in debt accumulation in response to a one standard deviation increase in local inequality for households of several different ranks. Figure 5 plots these calculated effects at the 80th, 50th, and 20th percentiles for each time sample. At the 80th percentile the increase in inequality means the increase in household debt over expected income was higher by almost nine percentage points in 2007. At the 20th percentile we estimate that households decreased debt relative to income by a little over ten percentage points in 2007. In the same year the median household saw a decline in debt-to-income of less than one percentage point.

¹⁰ Each specification below is estimated using household sampling weights from year 2001. See Appendix B for details on the construction of household sampling weights.

3.2. Specifications with Additional Controls

Our baseline specification does not include any household-specific controls other than their rank in the income distribution. To control for potentially confounding household characteristics, we consider an expanded specification augmented to include a vector of household-specific regressors:

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + c^{+} + \epsilon_{ict}, \tag{3}$$

where $X_{ic}$ is the set of household-specific controls. The latter include the age of the head of the household, household size, (logarithm of) the level of household’s mortgage debt, (logarithm of) the level of household’s auto debt, (logarithm of) the level of household’s HELOC debt, (logarithm of) the level of household’s student loan debt, an indicator for a non-zero credit card debt limit, (logarithm of) the level of household’s credit card debt, (logarithm of) the level of household’s credit card limit, the credit card utilization rate conditional on non-zero credit card limit, default indicators, and the average of household members’ credit scores. All controls are from 2001, with the exception of credit scores for which we include both 2001 values (to control for initial access to credit) as well as year $t$ values (to control for access to credit in subsequent years). Results from this augmented specification are presented in Panel B of Table 3. The results for the estimated effects of rank, inequality and the interaction of the two are almost identical to those from the parsimonious specification.
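The interaction logic of specifications (2) and (3) can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration: the coefficient values, the sample, and the noise are invented, and the sketch omits the fixed effects, controls, sampling weights and clustered standard errors used in the paper. It shows how the implied effect of inequality at rank $R$ is $\beta + \gamma R$, and how the "crossing" rank in Panel C of Figure 4 corresponds to $-\beta/\gamma$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_households, n_zips = 20000, 200

# Synthetic data (illustrative only, not the CCP).
rank = rng.uniform(0.0, 1.0, n_households)             # R: income rank in the local distribution
zip_id = rng.integers(0, n_zips, n_households)
inequality = rng.normal(2.2, 0.15, n_zips)[zip_id]     # I: 90-10 log income gap of the zip code

# Assumed "true" coefficients with the signs reported in the text.
alpha, beta, gamma = -4.0, -1.0, 2.0
debt_change = (alpha * rank + beta * inequality + gamma * rank * inequality
               + rng.normal(0.0, 0.5, n_households))

# OLS with an intercept, rank, inequality and their interaction.
X = np.column_stack([np.ones(n_households), rank, inequality, rank * inequality])
coef, *_ = np.linalg.lstsq(X, debt_change, rcond=None)
a_hat, b_hat, g_hat = coef[1], coef[2], coef[3]

def effect_of_inequality(r, d_inequality):
    """Implied change in debt-to-income at rank r when inequality rises by d_inequality."""
    return (b_hat + g_hat * r) * d_inequality

crossing_rank = -b_hat / g_hat   # rank at which inequality has no effect on debt accumulation
one_sd = 0.15
effects = {r: effect_of_inequality(r, one_sd) for r in (0.2, 0.5, 0.8)}
```

With a negative $\beta$ and a positive $\gamma$, households ranked above the crossing accumulate more debt as inequality rises and households ranked below it accumulate less, which is the pattern described for Figure 5.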
A second concern one might have is that regional inequality is correlated with other regional economic characteristics and that it is the latter that are most relevant for household debt accumulation decisions. We control for this possibility in several ways. First, we include an additional vector of zip-level control variables:

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \kappa W_{c} + c^{+} + \epsilon_{ict}, \tag{4}$$

where $W_c$ is the set of location-specific controls. The set of location-specific controls includes the median expected income in the zip code in 2001, the median of (log of) the household’s total debt in 2001, and the median of (log of) the household’s mortgage debt in 2001. Results are presented in Panel C of Table 3. Again, our baseline estimates of the effects of household rank, local inequality and their interaction are almost unchanged. This is also illustrated graphically in Panel B of Figure 5: our estimates with both household and regional controls suggest that increasing inequality by one standard deviation is associated with households at the 80th percentile increasing borrowing relative to income by almost 13 percentage points, at the 50th percentile households increase borrowing over income by 3.5 percentage points, and at the 20th percentile households decrease borrowing over income by almost 6 percentage points. The difference between high- and low-rank households is essentially the same as before.

Another way to control for regional characteristics is to estimate our baseline specification with fixed effects at the level of the zip code rather than the county:

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \delta_{c} + \epsilon_{ict}. \tag{5}$$

With zip code-specific fixed effects $\delta_c$, we can no longer separate the effect of local inequality from other regional characteristics, but we can still estimate the coefficient on the interaction term between the household’s income rank and local inequality, $\gamma$.
The results from estimating equation (5) are presented in Panel D of Table 3: the estimate of $\gamma$ is again almost unchanged relative to those from our parsimonious specification (2) or specifications augmented with household (3) and regional controls (4).

We also check for omitted variable bias in the interaction term by adding the interaction of the household credit risk score with local inequality to the specification in equation (3). If the measure of income rank primarily picked up the relative importance of the household’s credit risk score, one would expect the estimate of $\gamma$ to differ significantly after including this interaction. We estimated the following modification of specification (3):

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \phi\, Risk_{ic,2001} + \sigma\, Risk_{ic,2001} \times I_{c,2001} + c^{+} + \epsilon_{ict}. \tag{3'}$$

The estimates of $\gamma$ across all years (Panel A, Table 4) are robust to the inclusion of the interaction term. Similarly, we check whether the results are sensitive to including an interaction of the household’s initial debt level with local inequality in specification (3):

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \phi\, Debt_{ic,2001} + \sigma\, Debt_{ic,2001} \times I_{c,2001} + c^{+} + \epsilon_{ict}. \tag{3''}$$

Our baseline findings are unchanged with these additional controls (Panel B of Table 4).

Finally, we verify that our results do not hinge on the CCP measure of income inequality. We replicate our results from Table 3 in Appendix Table A1 using the measure of inequality constructed from IRS data and described in section 2.3 and find almost identical results. In short, the differential debt-accumulation patterns by households of differing income levels across inequality regions are a robust feature of the data.

3.3 Subsample analysis

Our finding that debt accumulation was higher for poorer households in low-inequality regions than high-inequality regions is robust to controlling for a wide variety of household and regional controls.
One may be concerned, however, that our interaction effect is capturing some other nonlinear characteristic of household borrowing, which need not be captured by linear controls. To address this possibility, we consider an additional set of robustness checks in which we verify that our results still obtain within subsets of the data. Specifically, we break our regions along four dimensions: geographic areas, initial debt burdens, credit scores and house price growth.

For geographic areas, we estimate our specification with household and regional controls (equation (4)) separately for each of the four Census regions: Midwest, Northeast, South and West. We present the results of the household level regressions of debt accumulation from 2001 to 2007 (the main period over which household debt increased sharply) for each region in Panel A of Table 5, with the full set of yearly regressions by region available in Appendix Table A2. For each region, the coefficients are of the same sign as before and of approximately the same order of magnitude. Hence, our baseline results are confirmed within each region of the country.

Second, we decompose zip codes by the average level of credit scores among households in each locale in 2001. Specifically, we group zip codes into three bins: low credit scores (below the 33rd percentile of average credit score distribution), medium (between the 33rd and 67th percentiles) and high credit scores (above the 67th percentile of the average credit score distribution). We then rerun our specification with household and regional controls within each of these three credit score areas. The results for 2001-2007 are presented in Panel B of Table 5, with the full set of yearly regressions by credit score grouping available in Appendix Table A3. Again, the results are qualitatively similar across credit score groups, although they are somewhat smaller in high credit score regions.
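The three-way grouping of zip codes used in these subsample checks can be sketched as follows. The function name `tercile_bins` and the toy credit scores are hypothetical; the cutoffs at the 33rd and 67th percentiles follow the text, while the interpolation rule for percentiles is simply NumPy's default.

```python
import numpy as np

def tercile_bins(values):
    """Label each zip code 'low', 'medium' or 'high' using 33rd and 67th percentile cutoffs."""
    values = np.asarray(values, dtype=float)
    lower, upper = np.percentile(values, [33, 67])
    return np.where(values < lower, "low",
                    np.where(values > upper, "high", "medium"))

# Toy zip-level average credit scores (illustrative values only).
avg_credit_score = np.array([610, 640, 655, 670, 690, 700, 715, 730, 760])
groups = tercile_bins(avg_credit_score)

# The regression would then be re-estimated separately on each subset of zip codes.
subsets = {g: avg_credit_score[groups == g] for g in ("low", "medium", "high")}
```

The same helper applies unchanged to the other splitting variables, for example median debt-to-income ratios or house price growth.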
Third, we split zip codes according to median debt-to-income ratios in 2001. Specifically, we construct median initial debt-to-income ratios across all households in a zip code, then split zip codes into three groups based on these median ratios: low initial debt levels (below the 33rd percentile of the debt-to-income distribution), medium (between the 33rd and 67th percentiles) and high debt-to-income ratios (above the 67th percentile of the debt-to-income distribution). We then estimate our specification with household and regional controls within each of these three subsets of zip codes. We again present results for 2001-2007 in Panel C of Table 5, with the full set of yearly regressions by initial debt-to-income ratio available in Appendix Table A4. We find that our qualitative result holds across zip codes of different initial debt-to-income ratios but that the differential effects of inequality on household borrowing across income groups were largest in regions with higher initial debt-to-income ratios.

Finally, we separate zip codes by the average growth rate of home prices from 2001-2005, as in section 2. We calculate zip code house price appreciation using data from the Core Logic index. These data are only available for a subset of our zip codes (about 6,600) which constitutes about 70% of our original sample. We group zip codes into three bins: low house price growth (below the 33rd percentile), medium (between the 33rd and 67th percentiles), and high house price growth (above the 67th percentile). We re-estimate the specification with household and regional controls within each subgrouping of zip codes and present results from 2001-2007 in Panel D of Table 5, with the full set of yearly regressions by house price growth in Appendix Table A5.
Once again, the interaction of household rank and local inequality remains statistically significant within each subset of the data, with the differential effects of regional inequality being stronger in zip codes which experienced higher growth in house prices.

3.4 Results from a Nonparametric Specification

The specification in equation (2) assumes a linear relationship between debt accumulation, income rank and local inequality. In this section, we relax this assumption and estimate a nonparametric specification. Specifically, we first split the sample of households into three bins according to the level of local inequality. In particular, each location (zip code) is assigned to one of the three bins based on the location’s level of inequality in the distribution of inequality across locations in 2001, i.e., low-inequality bin (less than the 20th percentile of the distribution of local inequality levels), mid-level inequality bin (between the 20th and 80th percentile), and high-inequality bin (above the 80th percentile). The assignment of locations to inequality bins remains constant through 2002-2012. For the households in each bin, we run a regression of household relative debt accumulation on a dummy for income rank below 0.2, a dummy for income rank above 0.8, a full set of household and regional controls and the county-specific fixed effects for each year separately. The omitted category is the dummy for income rank between 0.2 and 0.8.

Figure 6 shows the estimated coefficients on the dummy for income rank below 0.2 and the dummy for income rank above 0.8, relative to the dummy for the income rank between 0.2 and 0.8. The differences across inequality regions for high-ranked households (i.e. those above the 80th percentile) are small throughout the time sample.
In contrast, low-ranked households display much larger differences in debt accumulation patterns across low- and high-inequality regions, with differences in debt accumulation reaching nearly 20 percent of initial income levels by 2008. Hence, the link between inequality and debt accumulation was relatively more important for low-income households than for high-income households.

3.5 Results with County- and State-Level Income Distribution and Inequality Measures

Previous work on inequality and consumption has been done using measures of inequality at the state level (see Bertrand and Morse, 2013) and most discussion of inequality and debt has focused on measures of inequality at the national level, as in Figure 1. We explore how our results vary as we increase the level of geographic aggregation for inequality by estimating equation (4) using the income distribution at the county and state level. We construct the area income distribution using the same resampling procedure we used for zip codes and now we compute a household’s percentile rank within the larger area (e.g. county) income distribution and inequality statistics of that distribution. We keep all household and regional-level controls that we used before except now we include state fixed effects for county-level regressions and no fixed effects for state-level regressions.

Panels A and B of Table 6 report the results with county- and state-level income distribution and inequality measures, respectively. At the county level, we find very similar results to our zip code regressions once we consider that the standard deviation of inequality is smaller at the county level. Similarly, we also find very similar estimates of the interaction term when inequality is measured at the state level, although there is some loss of precision in our estimates due to the aggregation.
Also noteworthy is that the estimate of $\beta$ is positive at the state level, implying that households on average accumulated relatively more debt in states with higher levels of inequality. This is similar to the result obtained by Bertrand and Morse (2013) that typical households consumed more in states where consumption of the rich was higher.

3.6. Decomposition by Form of Debt

We now consider debt accumulation patterns along different dimensions of debt: mortgages, auto loans and credit cards. For each, we reproduce our household-level regressions with household and regional controls and county fixed effects and report yearly results in Table 7. Panel A documents that the results for mortgages are almost identical to those found for total debt. Because mortgage debt on average accounts for two-thirds of total debt, it is likely the primary driver of total debt patterns described above.

Panel B documents that very similar qualitative results obtain for auto loans: both $\alpha$ and $\beta$ are estimated to be negative while the interaction term $\gamma$ is positive. However, the interaction effects are significantly smaller for auto loans than for mortgages, even if we adjust them for the relative magnitudes of each form of debt (i.e. convert to growth rates). For example, the peak interaction effect on auto loans is about 0.09, which when adjusted by the average ratio of auto debt to mortgage debt (mortgage debt is almost eight times as large as auto debt on average) becomes 0.71 or one-third of the mortgage interaction effect. Thus, even though auto loans display the same qualitative patterns, the mapping from local inequality to differential borrowing patterns across households is quantitatively weaker for auto loans than for mortgages.

Panels C and D report equivalent results for credit card balances and credit card limits.
The distinction between credit card balances and limits is useful because the former can be interpreted as reflecting the demand for credit on the part of households while the latter largely reflects credit availability.¹¹ Strikingly, we find very different results for the two measures. With credit card limits, we recover the same qualitative features as in our baseline estimates for total debt: $\alpha$ and $\beta$ are both estimated to be systematically negative while the interaction term $\gamma$ is positive. With credit card limits being approximately half of mortgage debt on average, the estimated peak level of $\gamma$ of around 0.5 is approximately half as large as the peak interaction effect estimated for mortgages in terms of implied growth rates of each form of debt. In contrast, we find no consistent or economically significant relationship between local inequality and the credit card balances of households across different income groups: both $\beta$ and $\gamma$ are estimated to be very small (in some years becoming statistically insignificant) and the sign of $\gamma$ is unstable across years.

Thus, to the extent that we can interpret credit card balances and limits as reflecting credit demand and supply, respectively, these results suggest that the differential borrowing patterns of lower and higher income households across regions of different inequality reflect differential credit supply conditions, not differential credit demand as would be the case under “keeping up with the Joneses”. In section 4, we propose one channel through which credit supply can vary with local inequality in a way that can account for these patterns, namely if banks use an applicant’s income in combination with local inequality to make inferences about the applicant’s underlying type.
This interpretation of the data would be consistent not just with the difference in our findings for credit card limits and credit card balances, but also with the quantitative differences in the size of estimated effects of inequality across other forms of debt. Mortgages, for example, represent much larger loan amounts than other forms of debt and it is relatively difficult for financial institutions to recover the home or office associated with the loan in case of default. Auto loans, on the other hand, are much smaller in size and banks face fewer hurdles to repossessing a car. Hence, the incentive of financial institutions to devote resources toward identifying applicants’ underlying credit-worthiness should be much lower for auto loans than mortgages, leading to weaker utilization of the information provided by local income inequality as found in Table 7. While credit card debt is of the same order of magnitude on average as auto debt in the CCP, credit card debt is unsecured so that financial institutions bear more risk than they do with automobiles. One would therefore expect stronger incentives to utilize available information in extracting credit risk for credit cards than autos, which is again consistent with what we observe in the data.

¹¹ This distinction is somewhat offset by the fact that households can endogenously raise their credit limits by applying for more credit cards or requesting higher limits from their current credit card providers.

4. Model

In this section, we develop a stylized model in which banks use local inequality to extract information about applicant types and which results in borrowing patterns similar to those we find in the CCP data. We show how local inequality affects bank lending decisions under perfect competition and monopoly. Suppose there are two types of households: High (H) and Low (L).
To simplify algebra, we assume that High type households never default on debt while Low type households default with probability $d$ and that the share of High type households is 0.5.¹² The income for each type $j \in \{H, L\}$ is given by $y_j = \mu_j + e_j$ where $\mu_H > \mu_L$ are constants and $e_j \sim N(0, \sigma^2)$. Hence, $y_H \sim N(\mu_H, \sigma^2)$ and $y_L \sim N(\mu_L, \sigma^2)$. Denote the pdfs for each distribution with $\phi_H$ and $\phi_L$. The average income in this economy is $\bar{y} = \frac{1}{2}\mu_H + \frac{1}{2}\mu_L$. We also assume banks observe $s$, another signal about the quality of borrowers that can incorporate other information about borrowers and is not observed by the econometrician, to capture the idea that loan officers have more information than econometricians. Similar to the income signal, $s_j = \rho_j + \eta_j$ where $\rho_H > \rho_L$ are constants and $\eta_j \sim iid\ N(0, \omega^2)$. Denote the pdfs for each distribution with $q_H$ and $q_L$. To simplify algebra, we assume without loss of generality that income $y_j$ and signal $s_j$ are independent.

Banks do not observe household types directly but they observe applicants’ incomes and signal $s$.¹³ They can then infer the probability of a given type conditional on observed income. Specifically, using Bayes’ law, the posterior probability of being High type for a household $i$ with signals $y_i$ and $s_i$ is given by

$$\Pr(H \mid y_i, s_i) = \frac{\Pr(y_i \mid H)\Pr(s_i \mid H)\Pr(H)}{\Pr(y_i \mid H)\Pr(s_i \mid H)\Pr(H) + \Pr(y_i \mid L)\Pr(s_i \mid L)\Pr(L)} = \frac{\phi_H(y_i)\,q_H(s_i)\,\frac{1}{2}}{\phi_H(y_i)\,q_H(s_i)\,\frac{1}{2} + \phi_L(y_i)\,q_L(s_i)\,\frac{1}{2}} = \frac{\Phi(y_i)Q(s_i)}{\Phi(y_i)Q(s_i) + 1}, \tag{6}$$

where $\Phi(y_i) \equiv \phi_H(y_i)/\phi_L(y_i)$ and $Q(s_i) \equiv q_H(s_i)/q_L(s_i)$ are the likelihood ratios. Given our assumptions, we have $\Phi' > 0$ and $Q' > 0$, that is, High type households are monotonically more likely to be observed as income $y$ or signal $s$ increases. Since there are only two types, it follows that

$$\Pr(L \mid y_i, s_i) = 1 - \Pr(H \mid y_i, s_i) = \frac{1}{\Phi(y_i)Q(s_i) + 1}. \tag{7}$$

Clearly, $\frac{\partial \Pr(L \mid y_i, s_i)}{\partial y_i} < 0$, $\frac{\partial \Pr(L \mid y_i, s_i)}{\partial s_i} < 0$, $\frac{\partial \Pr(H \mid y_i, s_i)}{\partial s_i} > 0$, and $\frac{\partial \Pr(H \mid y_i, s_i)}{\partial y_i} > 0$.

¹² We document in Appendix E that high-income households are indeed less likely to default than low-income households.

¹³ Obviously, banks observe many other characteristics of households. We abstract from this additional information available to banks to simplify derivations. One may interpret this approach as partialling out these other characteristics. Typically, one of the important indicators of an individual’s risk is the individual’s credit score. In the analysis in section 3, we show that the household’s income rank has explanatory power for the household’s debt even after we control for the credit score.

Banks potentially have two margins to determine which borrowers obtain loans: 1) price of loans; 2) loan denial probability. While in reality banks are likely to use both margins, we consider polar cases to illustrate the workings of each margin separately. For the price margin, we will assume that banks can price discriminate borrowers perfectly, banks compete in all population segments, and banks can freely obtain resources at rate $R_0$ (“perfect competition”). For the loan denial probability, we assume that there is only one bank serving the market but this bank is threatened by entry of other banks if this bank makes a profit (“monopoly”).

4.1 Perfect Competition

With perfect competition and free entry in each lending segment, banks can have only one interest rate for a borrower of a given quality. Since there is a continuum of borrower quality, there is also a continuum of markets where each market is indexed by borrower quality. Consider a set of households with income $y_i$ and signal $s_i$. Given the zero-profit condition, the interest rate is set by

$$R^{*}\{(1-d)\Pr(L \mid y_i, s_i) + \Pr(H \mid y_i, s_i)\} = R_0 \;\Longrightarrow\; R^{*} = \frac{R_0}{(1-d)\Pr(L \mid y_i, s_i) + \Pr(H \mid y_i, s_i)} = R_0\,\frac{\Phi(y_i)Q(s_i) + 1}{\Phi(y_i)Q(s_i) + (1-d)} = R^{*}(y_i, s_i). \tag{8}$$

Note that households with other levels of $y$ and $s$ pay the same interest rate as long as $\Phi(y_i)Q(s_i) = \Phi(y)Q(s)$.
That is, each lending segment is characterized by a pair of signals

$$\mathcal{S}(R^{*}) = \left\{(y, s) : R_0\,\frac{\Phi(y)Q(s) + 1}{\Phi(y)Q(s) + (1-d)} = R^{*}\right\},$$

where $R^{*}$ is a sufficient statistic for the quality of borrowers. Because the quality of borrowers is the same in $\mathcal{S}(R^{*})$, every borrower in $\mathcal{S}(R^{*})$ obtains a loan at the interest rate $R^{*}$. Borrowers of a worse quality are offered loans at higher interest rates while borrowers of better quality can obtain a loan with a lower interest rate. Clearly, $\frac{\partial R^{*}}{\partial y} < 0$ and $\frac{\partial R^{*}}{\partial s} < 0$ so that households with high income $y$ and strong signal $s$ pay lower rates because banks believe that these applicants are more likely to be of the High type.

To see the tradeoff between $y$ and $s$, one can fix $R^{*}(y, s)$ at level $R^{\#}$ and find the required signal $s$ to allow a household to borrow at rate $R^{\#}$ given that this household has income $y$:

$$s^{*}(y) = Q^{-1}\left(\frac{1}{\Phi(y)} \times \frac{R_0 - R^{\#}(1-d)}{R^{\#} - R_0}\right), \tag{9}$$

where $Q^{-1}$ is the inverse function of $Q$. Given that $Q' > 0$ and $\Phi' > 0$, it follows that $\frac{\partial s^{*}(y)}{\partial y} < 0$.

Although we (unlike loan officers) do not observe signal $s$ in the data, we can still calculate the interest rate paid on average by households with income $y$, which is observed by the econometrician:

$$R^{*}(y) = \int R^{*}(y, s)\left\{\tfrac{1}{2}q_H(s) + \tfrac{1}{2}q_L(s)\right\} ds. \tag{10}$$

Given that $R^{*}(y, s)$ is differentiable and otherwise well behaved as well as $\frac{\partial R^{*}(y, s)}{\partial y} < 0$, we have that

$$\frac{\partial R^{*}(y)}{\partial y} = \int \frac{\partial R^{*}(y, s)}{\partial y}\left\{\tfrac{1}{2}q_H(s) + \tfrac{1}{2}q_L(s)\right\} ds < 0. \tag{11}$$

Hence, the model predicts that the interest rate decreases in household income.

One can then consider a thought experiment of raising the income inequality in this economy without changing the mean level of income. Specifically, we increase the distance between $\mu_H$ and $\mu_L$ but the average income $\bar{y}$ is held constant.
Because income levels are now a stronger signal of an applicant’s type, banks put a higher weight on signal $y$; hence the slope of the tradeoff becomes steeper, as it takes a larger change in signal $s$ to justify lending at a given interest rate (see Panel A of Figure 7). This will lead to higher borrowing on the part of low-income households in low-inequality regions than in high-inequality regions because, in the former, banks are less sure about the underlying type of the applicant based on income and therefore are more willing to lend to households of different incomes. In other words, $R^*_{equal}(y)<R^*_{unequal}(y)$ when $y<\bar{y}$ and $R^*_{equal}(y)>R^*_{unequal}(y)$ when $y>\bar{y}$, where “equal” and “unequal” denote the level of inequality, captured by mean-preserving changes in $\mu_H$ and $\mu_L$. Panel B of Figure 7 illustrates this point. In short, banks charge lower interest rates to high-income households than to low-income households and the difference in the interest rates across income groups rises as the difference between these groups widens.14

In another thought experiment, we study the effects of an increase in the supply of credit. Since perfect competition prices each borrower type fairly, we can only increase the supply of credit by reducing the cost of funds rate $R_0$. Equation (9) shows that a decrease in $R_0$ shifts schedule $s^*(y)$ down and hence all borrowers enjoy a lower cost of credit. A combination of a positive credit supply shock ($R_0$ decreases) and an increase in inequality ($\mu_H-\mu_L$ increases) can reconcile how all types of households increased their borrowing on average over the course of the mid 2000s with the cross-sectional variation in debt-accumulation patterns across income groups at different levels of local inequality documented in section 3. The supply shock by itself can explain the former while the increased inequality by itself can explain only the latter.
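The mechanics of equations (8) to (11) and of the two thought experiments can be checked numerically. The sketch below is illustrative only: it assumes that $\Phi(y)$ and $Q(s)$ are likelihood ratios of equal-variance normal densities for the High and Low types, and every parameter value and function name is a hypothetical choice, not something taken from the paper. Under those assumptions it evaluates the zero-profit rate, the required signal $s^*(y)$, and the average rate $R^*(y)$ for a narrow and a wide gap $\mu_H-\mu_L$ around the same mean income.

```python
import math
from statistics import NormalDist

# Illustrative parameter values (assumptions, not taken from the paper).
R0, D = 1.03, 0.5                      # cost of funds R0 and parameter d in (8)
MEAN_Y, SD_Y = 10.5, 1.0               # average income is held fixed
M_LOW, M_HIGH, SD_S = -0.5, 0.5, 1.0   # assumed signal means by type


def log_phi(y, gap):
    """log Phi(y) when mu_H = MEAN_Y + gap/2 and mu_L = MEAN_Y - gap/2."""
    return gap * (y - MEAN_Y) / SD_Y ** 2


def log_q(s):
    """log Q(s) for equal-variance normal signal densities."""
    return (M_HIGH - M_LOW) * (s - 0.5 * (M_HIGH + M_LOW)) / SD_S ** 2


def rate(y, s, gap, r0=R0):
    """Equation (8): zero-profit rate for income y and signal s."""
    odds = math.exp(log_phi(y, gap) + log_q(s))
    return r0 * (odds + 1.0) / (odds + 1.0 - D)


def required_signal(y, target, gap, r0=R0):
    """Equation (9): signal s*(y) at which income y borrows at `target`.

    Requires r0 < target < r0 / (1 - D) so that the implied odds are positive.
    """
    odds = (r0 - target * (1.0 - D)) / (target - r0)
    needed_log_q = math.log(odds) - log_phi(y, gap)
    # log Q(s) is linear in s in the normal case, so Q can be inverted directly.
    return 0.5 * (M_HIGH + M_LOW) + SD_S ** 2 * needed_log_q / (M_HIGH - M_LOW)


def average_rate(y, gap, r0=R0, n=4000):
    """Equation (10): R*(y, s) averaged over 0.5*q_H + 0.5*q_L (midpoint rule)."""
    q_high, q_low = NormalDist(M_HIGH, SD_S), NormalDist(M_LOW, SD_S)
    lo, hi = M_LOW - 8 * SD_S, M_HIGH + 8 * SD_S
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        s = lo + (k + 0.5) * step
        total += rate(y, s, gap, r0) * 0.5 * (q_high.pdf(s) + q_low.pdf(s)) * step
    return total


if __name__ == "__main__":
    equal, unequal = 1.0, 2.0  # two values of mu_H - mu_L with the same mean
    for y in (9.5, 10.5, 11.5):
        print(y, round(average_rate(y, equal), 4), round(average_rate(y, unequal), 4))
```

With these assumed functional forms, the average rate falls with income, the two inequality regimes cross at the mean income, and lowering `r0` reduces both the rate and the required signal at every income level, which is the qualitative pattern described in the text.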
14 Note that the value at which a household does not experience a change in the interest rate is equal to the average income $\bar{y}$. This value is insensitive to the level of inequality because by construction the average income is held constant and at the average income the likelihood ratios are equal to 1 and therefore the posterior probability is equal to 1/2. This value, however, can move in more complex models and alternative parameterizations.

4.2 Monopoly

In practice, regulatory or informational constraints limit the ability of banks to charge different prices to different borrowers and therefore they often can charge only one rate or a limited number of rates for a given type of loan. To keep the exposition simple, suppose that i) the market has only one bank and it is threatened by entry of other banks, ii) regulators impose a minimum quality of borrowers who may obtain loans (e.g., to qualify for Freddie Mac and Fannie Mae guarantees), and iii) the bank can charge only one rate $\bar{R}$. To model assumption ii), recall that $R^*(y,s)$ can be used as a sufficient statistic for the quality of a borrower. The bank makes a profit on borrowers with $(y,s)$ such that $R^*(y,s)<\bar{R}$ and losses on borrowers with $(y,s)$ such that $R^*(y,s)>\bar{R}$. We denote by $R^+$ the cutoff interest rate that meets the regulatory requirements. With this cutoff rate, the threat of entry sets $\bar{R}$ at the level that yields zero profits, as implied by assumption i):

$$\bar{R}\;\frac{\iint_{(y,s):\,R^*(y,s)\le R^+}\{(1-d)\Pr(L|y,s)+\Pr(H|y,s)\}\,\bar{\phi}(y)\,\bar{q}(s)\,dy\,ds}{\iint_{(y,s):\,R^*(y,s)\le R^+}\bar{\phi}(y)\,\bar{q}(s)\,dy\,ds}=R_0,$$

where $\bar{\phi}(y)\equiv\tfrac{1}{2}\phi_L(y)+\tfrac{1}{2}\phi_H(y)$ and $\bar{q}(s)\equiv\tfrac{1}{2}q_L(s)+\tfrac{1}{2}q_H(s)$. Using the insight of equation (9), we can find the threshold level of signal $s$ such that a bank will lend to a household with income $y$:

$$s^+(y)=Q^{-1}\left(\frac{1}{\Phi(y)}\times\frac{R_0-R^+(1-d)}{R^+-R_0}\right) \qquad (12)$$

As before, we have $\partial s^+(y)/\partial y<0$.
The set of households who obtain a loan is

$$\mathcal{S}^+(R^+)=\left\{(y,s):\; R_0\,\frac{\Phi(y)Q(s)+1}{\Phi(y)Q(s)+(1-d)}\le R^+\right\},$$

that is, the households whose zero-profit rate $R^*(y,s)$ does not exceed the cutoff $R^+$. The probability that a household with income $y$ is denied a loan is

$$\Pr(\text{denied loan}\,|\,y)=\Pr(s<s^+(y))=\int_{-\infty}^{s^+(y)}\bar{q}(s)\,ds.$$

Since $\partial s^+(y)/\partial y<0$, it follows that $\partial\Pr(\text{denied loan}\,|\,y)/\partial y<0$: the probability of loan denial decreases in income.

Now we repeat the thought experiment with rising inequality. Similar to the perfect competition case, it takes a larger increment in signal $s$ to compensate for a given decrease in income $y$ because income is a more informative signal. As a result, if the quality of lending standard $R^+$ is held constant, some low-income households may be denied a loan more often (see Panel C of Figure 7). Panel D of Figure 7 shows how the denial probability changes with rising inequality. The probability of denial increases for households with $y<\bar{y}$ and decreases for households with $y>\bar{y}$.

In contrast to the perfect competition case, the monopoly case has two ways to model an increase in the supply of credit. First, one can continue to model it as a reduction in the cost of funds rate $R_0$. Second, one can model it as an increase in $R^+$, i.e., relaxing lending standards to cover high-risk borrowers. In the first case, a decrease in $R_0$ lowers $\bar{R}$ and thus makes credit cheaper for households with $R^*\le R^+$. However, it does not affect the interest rate for households with $R^*>R^+$ as these continue to receive no loans (they do not meet lending requirements). In the second case, an increase in $R^+$ raises $\bar{R}$ because a wider coverage now includes high-risk households and losses made on these high-risk households have to be compensated by larger profit margins on low-risk households. Thus, while credit is now available to a broader spectrum of households, the cost of borrowing increases for relatively high-income borrowers. On the other hand, the probability of obtaining a loan increases for all households as schedule $s^+(y)$ shifts down.
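The denial-probability comparative statics of the monopoly case can be illustrated in the same spirit. The sketch below again assumes equal-variance normal densities for income and for the signal, so that $\Phi$ and $Q$ are exponential likelihood ratios; the constants, the cutoff values, and the function names are hypothetical choices made for illustration rather than the paper's calibration.

```python
import math
from statistics import NormalDist

# Illustrative values (assumptions, not taken from the paper).
R0, D = 1.03, 0.5                      # cost of funds R0 and parameter d
MEAN_Y, SD_Y = 10.5, 1.0               # mean income held fixed across experiments
M_LOW, M_HIGH, SD_S = -0.5, 0.5, 1.0   # assumed signal means by type


def threshold_signal(y, cutoff, gap):
    """Equation (12): lowest signal s+(y) that still qualifies at cutoff R+.

    gap = mu_H - mu_L; requires R0 < cutoff < R0 / (1 - D).
    """
    odds = (R0 - cutoff * (1.0 - D)) / (cutoff - R0)
    log_phi = gap * (y - MEAN_Y) / SD_Y ** 2
    needed_log_q = math.log(odds) - log_phi
    return 0.5 * (M_LOW + M_HIGH) + SD_S ** 2 * needed_log_q / (M_HIGH - M_LOW)


def denial_probability(y, cutoff, gap):
    """Pr(s < s+(y)) under the mixture 0.5*q_L + 0.5*q_H."""
    s_plus = threshold_signal(y, cutoff, gap)
    return 0.5 * (NormalDist(M_LOW, SD_S).cdf(s_plus)
                  + NormalDist(M_HIGH, SD_S).cdf(s_plus))


if __name__ == "__main__":
    for y in (9.5, 10.5, 11.5):
        print(y,
              round(denial_probability(y, cutoff=1.2, gap=1.0), 3),   # "equal"
              round(denial_probability(y, cutoff=1.2, gap=2.0), 3),   # "unequal"
              round(denial_probability(y, cutoff=1.3, gap=1.0), 3))   # looser R+
```

Under these assumptions, a wider gap raises the denial probability below the mean income and lowers it above, while a higher cutoff $R^+$ lowers the denial probability at every income level.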
Hence, although high-income households pay a higher price for credit, they are denied loans less frequently.

Our model can therefore potentially account for why lower-income households accumulated relatively less debt in high-inequality regions than did similar households in low-inequality regions during the 2000s: banks in higher-inequality regions would have placed more weight on applicants’ incomes as a signal of their underlying creditworthiness and therefore channeled more funds toward higher-income applicants than did banks in lower-inequality regions. Under perfect competition, this differential access to funds is predicted to happen through higher interest rates being offered to low-income applicants than to high-income applicants, whereas under monopoly banking, our model predicts that banks will reject low-income applicants more frequently than high-income applicants. Because banking in the U.S. lies in between these two extremes, we expect both margins to be present in the data, a prediction to which we now turn.

5 Results from the Mortgage Application Data

Our model suggests that variation in inequality across regions should be reflected in lending decisions of banks if regional inequality can be used to make inferences about applicants’ default probabilities. In this section, we use information on mortgage applications from the publicly available Home Mortgage Disclosure Act database (HMDA), 2001-2011, to test these implications. The HMDA data are compiled from reports filed by mortgage lenders. The HMDA was passed by Congress in 1975 and began requiring lenders to submit data reports in 1989. The initial intention of the act, according to the Consumer Financial Protection Bureau (2012), was to monitor the provision of credit in urban neighborhoods. Later requirements to submit data reports were intended to monitor discriminatory lending practices.
Dell’Ariccia, Igan, and Laeven (2012) find that HMDA covers between 77% and 95% of all mortgage originations from 2000 to 2006. Reporting criteria differ between depository and nondepository institutions and across years. Depository institutions have typically been required to report if they satisfy an asset threshold, make at least one home mortgage, are federally regulated or insured, and have a branch in a metropolitan area. Nondepository institutions were required to report if the share of home mortgages exceeded a threshold of all loan originations, the lender operated in an MSA, and met an asset threshold. In 2004 the share threshold was supplemented with a level of home mortgage originations to increase the coverage of the market. Lenders who file reports include detailed information on every mortgage application received by the lender during a calendar year. All years of the data contain the size of the loan, income on the application, location of the property down to the census tract, demographics of the applicants, a lender identifier, and the action taken on the loan. Since 2004 the data include additional information including a censored picture of interest rates and the loan’s lien status. We use a 15% random sample of all HMDA records.

While the data are very detailed in many respects there are some limitations. First, the data do not identify “piggyback” loans, i.e. loans with subordinate liens used to finance a larger first-lien loan. These secondary loans can be used to lower financing costs and to avoid requirements that a loan being sold to Fannie Mae or Freddie Mac be accompanied by private mortgage insurance if a traditional loan would not meet certain standards. The HMDA does not require lenders to report piggyback loans if they are issued as HELOCs and some piggyback loans might be issued by a lender not covered by HMDA.
But some piggyback loans are included in the dataset and, given that these loans are not identified as such, a researcher might infer a much lower loan to value ratio than the actual loan to value on the property. Since we are not able to identify piggyback loans reliably and these loans are relatively small, we drop all applications where the loan-to-income (LTI) ratio is less than one. Second, we conduct the HMDA analysis at the county level rather than the zip code level. Although the data are available at the census tract, we aggregate to the county in order to use measures of inequality consistent with the CCP analysis. Finally, in contrast to the CCP database, the HMDA data set does not track applicants over time and hence we do not have a panel of applicants/borrowers. We focus on supply-side variables in line with the theoretical predictions of the model. First, we assess whether the probability of a loan being rejected is invariant to the applicant’s income rank interacted with regional inequality. Second, we consider whether the probability of the loan being “high-interest” (conditional on a loan application being approved) varies with inequality and the applicant’s rank. 15 Both of these can be interpreted as directly capturing credit supply factors, namely whether banks use local inequality to make inferences about applicants’ underlying types when one conditions on other observable characteristics of the applicant such as the loan-to-income ratio in the application. 
If banks use an applicant’s position in the income distribution to help make inferences about their underlying default risk, as suggested by the model, then one would expect banks to reject otherwise similar applications by high-income applicants less frequently in high-inequality regions than in low-inequality regions, or equivalently to reject otherwise similar applications by low-income applicants more frequently in high-inequality regions than in low-inequality regions. By the same logic, we should observe low-income applicants being charged higher interest rates on their loans more frequently in high-inequality regions than in low-inequality regions.

We test these predictions in a framework very similar to that used in the CCP data. For a given outcome, we estimate the following regression:16

$$Outcome_{ict}=\alpha\,Rank_{ict}+\gamma\,Rank_{ict}\times Inequality_{c,2001}+\beta Z_{ict}+\lambda_c+error, \qquad (6)$$

where $Rank_{ict}$ is the percentile rank of applicant $i$’s income within the pool of applicants in area $c$ in year $t$.17 The inequality measure and the income distribution are defined at the county level. The explanatory variables in vector $Z_{ict}$ include indicators for whether or not the loan is for an owner-occupied property, several race categories and gender, as well as interactions of the applicant’s income rank with the share of applicants in the county who are nonwhite.18 We also control for the loan-to-income ratio in the application. While we estimate these models with county fixed effects $\lambda_c$, the results are very similar if we use state fixed effects (Appendix Table A6).

15 The HMDA reporting guidelines require lenders to report the spread between the Treasury yield and the mortgage interest rate if the spread is greater than three percentage points for first-lien loans or five percentage points for subordinate-lien loans.
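To make the role of the interaction coefficient $\gamma$ in equation (6) concrete, the following sketch estimates a stripped-down version of the regression on synthetic data. It is an illustration under strong simplifying assumptions: the data are simulated, the control vector $Z_{ict}$ is omitted, there is a single cross-section rather than year-by-year estimates, no clustered standard errors are computed, and the "true" coefficients (`alpha=0.3`, `gamma=-0.4`) and all function names are hypothetical. County fixed effects are absorbed by demeaning within county, which also explains why the level of county inequality does not appear separately in (6).

```python
import random


def simulate(n_counties=300, n_per_county=200, alpha=0.3, gamma=-0.4, seed=0):
    """Synthetic applications as (county, rank, inequality, denied) tuples."""
    rng = random.Random(seed)
    rows = []
    for county in range(n_counties):
        inequality = rng.uniform(1.0, 1.8)      # county-level measure (assumed range)
        county_effect = rng.uniform(0.45, 0.55)  # plays the role of lambda_c
        for _ in range(n_per_county):
            rank = rng.random()                  # income rank in [0, 1)
            p_denied = county_effect + alpha * rank + gamma * rank * inequality
            rows.append((county, rank, inequality, float(rng.random() < p_denied)))
    return rows


def within_county_ols(rows):
    """OLS of the outcome on Rank and Rank x Inequality with county fixed effects."""
    sums = {}
    for county, rank, inequality, y in rows:
        acc = sums.setdefault(county, [0, 0.0, 0.0, 0.0])
        acc[0] += 1
        acc[1] += rank
        acc[2] += rank * inequality
        acc[3] += y
    s11 = s12 = s22 = s1y = s2y = 0.0
    for county, rank, inequality, y in rows:
        n, sum_rank, sum_inter, sum_y = sums[county]
        x1 = rank - sum_rank / n                  # demeaned Rank
        x2 = rank * inequality - sum_inter / n    # demeaned Rank x Inequality
        y_dm = y - sum_y / n                      # demeaned outcome
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * y_dm
        s2y += x2 * y_dm
    det = s11 * s22 - s12 * s12                   # solve the 2x2 normal equations
    alpha_hat = (s22 * s1y - s12 * s2y) / det
    gamma_hat = (s11 * s2y - s12 * s1y) / det
    return alpha_hat, gamma_hat


if __name__ == "__main__":
    alpha_hat, gamma_hat = within_county_ols(simulate())
    print("alpha_hat:", round(alpha_hat, 3), "gamma_hat:", round(gamma_hat, 3))
```

A negative estimate of $\gamma$ in this setup means that the denial probability falls faster with income rank in more unequal counties, which is the sign pattern the model predicts for the rejection regressions.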
16 Our baseline specification includes a county fixed effect because the county-level controls are not as detailed as those we can construct in the CCP data.

17 The results we present are also robust to using a measure of an applicant’s rank relative to the distribution of income across all households in the county.

18 We include the share of non-whites as an additional control because previous studies suggested that banks may treat areas with predominantly non-white populations differently. See Turner and Skidmore (1999) for a review.

We restrict the analysis to loans for home purchases, applications where the loan-to-income ratio is at most eight and not less than one, loans where the reporter was explicitly making the origination decision, and where the loan did not fail because of incompleteness or because it was not pre-approved. Notice that we retain in the sample loans that are not denied but also not originated. Excluding these does not change our results. As before, we are interested in the sign of the interaction term between income rank and inequality, $\gamma$. All standard errors are clustered at the county level. The regressions are estimated separately for each year, 2001-2011. We use the log of the 90/10 income ratio derived from the income imputed in the CCP data in 2001 as the measure of inequality, but the results are essentially the same using the Gini coefficient derived from the Census data.

We present the results for the probability of an application being rejected by a bank in Panel A of Table 8 and results for the probability of a loan being high-interest, conditional on origination, in Panel B of Table 8. For the probability of being rejected, the key finding is that the estimated $\gamma$ is consistently negative: applications from high-ranked households in high-inequality regions are less likely to be rejected than those from high-ranked households in low-inequality regions. This result is consistent with the theoretical predictions of the model in which banks use an applicant’s position in the local income distribution, along with the dispersion of that distribution, to make inferences about default risk. Using our 2007 estimates, our results suggest that a one standard deviation increase in inequality will decrease the probability of denial of a household in the 80th percentile rank relative to the 20th percentile rank by approximately 2.3 percentage points. This is comparable in magnitude to the association between rank and the probability of denial.

Similar results obtain with the probability of the loan being high-interest (this variable is not available before 2004): high-rank applicants are less likely to face higher rate loans in high-inequality regions than in low-inequality regions. Again, this is precisely the type of price discrimination predicted by the model. Doing the same calculation as above with the 2007 estimate, we find that high-rank households will see the probability that they pay a high-interest loan decline by 0.7 percentage points relative to low-rank households.

We can also consider whether the size of the mortgage (intensive margin) varies across inequality regions and ranks within the income distribution by using the loan-to-income ratios associated with each originated mortgage. We use the same controls as with rejection probabilities (with the exception of LTI ratios) and county fixed effects. The results for each year are presented in Panel C of Table 8. Unlike with mortgage rejection rates and interest rate premia, we find little evidence that loan-to-income ratios in originated loans vary across households in different inequality regions. The estimates of $\gamma$ are almost always insignificantly different from zero, with 2004 and 2007 being the only exceptions.
To the extent that requested loans reflect demand for credit by households, we again find little evidence that demand-side factors related to local inequality levels mattered for the debt-accumulation decisions of households. However, the HMDA dataset does not allow us to establish if households have multiple loans or reliably link piggyback loans to standard loans. Thus, while our results point mainly toward channels operating through credit supply (namely through the banks’ use of a household’s income rank combined with the amount of income inequality in that region to make inferences about applicants’ creditworthiness), more work needs to be done to better understand the intensive margin.

6 Conclusions

Using household level measures of debt over the course of 2001-2012, we document a systematic link between local levels of income inequality and the debt-accumulation decisions of households of different income levels. Specifically, we find that low-income households in low-inequality regions accumulated more debt during the mid-2000s than did low-income households in high-inequality regions, with reverse (albeit smaller) effects operating for high-income households. While these results point to an economic channel linking economic inequality and borrowing by households of different income groups, they are inconsistent with “keeping up with the Joneses” being a significant force behind the great leveraging of households over this period.

Instead, we argue that causality is likely to run from the banking system to households. We develop a model where income inequality is informative for evaluating credit risk. In the model, this channel leads to relatively more credit being allocated to low-income applicants when local inequality is low rather than high, since higher levels of inequality imply that applicant incomes are stronger signals of creditworthiness.
Consistent with this view, we document that lower-income mortgage applicants in high-inequality regions are rejected more frequently and pay higher mortgage rates than similar applicants in low-inequality regions. While it is possible that income inequality implicitly captures other factors that are not included in the model or data, our findings suggest that the causality between inequality and debt is running through the credit supply channel.

Our results support the notion that the growth in household borrowing during the mid-2000s was driven in large part by credit supply expansions targeted toward lower-income households. This is because we find no evidence for credit-demand forces such as “keeping up with the Joneses” effects in the data and instead argue that causal links running from inequality to debt accumulation would point toward less relative debt accumulation by low-income households during periods of rising inequality, the opposite of what occurred in the U.S. during this time period. However, to the extent that this expansion in the supply of credit to lower-income households is unlikely to continue (for example if it reflected a one-time securitization of household debt), our results suggest that a continuation of recent trends toward rising inequality is likely to reduce access to credit for lower-income households. Because limited access to credit restricts households’ ability to smooth their consumption and to engage in long-term investments (e.g. sending children to college, retraining for different careers), such differential access to credit could ultimately have negative longer-term consequences. To the extent that many of these activities likely have positive societal externalities not captured in our model, such a development could have important policy implications.

References

Autor, David, Lawrence Katz, and Melissa S. Kearney. 2008. “Trends in U.S. Wage Inequality: Revising the Revisionists.” Review of Economics and Statistics, Vol.
90(2), pp. 300–23.
Aguiar, Mark and Mark Bils. 2012. “Has Consumption Inequality Mirrored Income Inequality,” University of Rochester, mimeo.
Aiyagari, S. R. 1994. “Uninsured Idiosyncratic Risk and Saving.” Quarterly Journal of Economics, 109, 659-684.
Athreya, Kartik, Xuan Tam, and Eric Young. 2012. "A Quantitative Theory of Information and Unsecured Credit," American Economic Journal: Macroeconomics, Vol. 4 (3): 153-183.
Bertrand, Marianne, and Adair Morse. 2013. “Trickle-Down Consumption,” NBER Working Paper No. 18883.
Blundell, Richard, Luigi Pistaferri, and Ian Preston. 2008. “Consumption Inequality and Partial Insurance,” American Economic Review, 98, 1887–1921.
Bordo, Michael D. and Christopher M. Meissner, 2012. “Does Inequality Lead to a Financial Crisis?” NBER Working Paper 17896.
Brown, Meta, Andrew Haughwout, Donghoon Lee, and Wilbert van der Klaauw. 2011. "Do We Know What We Owe? A Comparison of Borrower- and Lender-Reported Consumer Debt." Federal Reserve Bank of New York, Staff Report no. 523.
Charles, Kerwin K., Erik Hurst, and Nikolai Roussanov, 2009. “Conspicuous Consumption and Race,” Quarterly Journal of Economics 124(2), 42-67.
Christen, Markus and Ruskin M. Morgan, 2005. “Keeping Up with the Joneses: Analyzing the Effect of Income Inequality on Consumer Borrowing,” Quantitative Marketing and Economics 3, 145-173.
Consumer Financial Protection Bureau. 2012. “Supervision and Examination Manual.”
Daly, Mary C. and Daniel J. Wilson, 2006. “Keeping Up with the Joneses and Staying Ahead of the Smiths: Evidence from Suicide Data,” Federal Reserve Bank of San Francisco WP 2006-12.
Dell’Ariccia, Giovanni, Deniz Igan, and Luc Laeven. 2012. “Credit Booms and Lending Standards: Evidence from the Subprime Mortgage Market,” Journal of Money, Credit and Banking, Vol. 44 (2-3): 367–384.
Drozd, Lukasz A., and Ricardo Serrano-Padial. 2013.
“Modeling the Credit Card Revolution: The Role of Debt Collection and Informal Bankruptcy,” FRB of Philadelphia Working Paper No. 13-12.
Elul, Ronel, Nicholas S. Souleles, Souphala Chomsisengphet, Dennis Glennon, and Robert Hunt. 2010. "What "Triggers" Mortgage Default?" The American Economic Review Papers and Proceedings, forthcoming. (FRB Philadelphia Working Paper 10-13).
Fay, Scott, Erik Hurst, and Michelle J. White. 1998. “The Bankruptcy Decision: Does Stigma Matter?,” University of Michigan Working Paper No. 98-01.
Fay, Scott, Erik Hurst, and Michelle J. White, 2002. "The Household Bankruptcy Decision," American Economic Review 92(3), 706-718.
Goldin, Claudia, and Lawrence F. Katz, 2007. “Long-Run Changes in the U.S. Wage Structure: Narrowing, Widening, Polarizing.” Brookings Papers on Economic Activity. Vol. 2, 135-165.
Gross, David B., and Nicholas S. Souleles, 2002. "An Empirical Analysis of Personal Bankruptcy and Delinquency," Review of Financial Studies 15(1), 319-347.
Guerrieri, Veronica, Daniel Hartley, and Erik Hurst, 2013. “Endogenous Gentrification and Housing Price Dynamics,” forthcoming in Journal of Public Economics.
Guven, Cahit and Bent E. Sorensen, 2012. “Subjective Well-Being: Keeping Up with the Joneses. Real or Perceived?” Social Indicators Research 109, 439-469.
Heathcote, Jonathan, Fabrizio Perri, and Gianluca Violante, 2010. “Unequal We Stand: An Empirical Analysis of Economic Inequality in the United States, 1967-2006,” Review of Economic Dynamics, 13, 15-51.
Heathcote, Jonathan, Kjetil Storesletten, and Giovanni L. Violante. 2004. “The Macroeconomic Implications of Rising Wage Inequality in the United States.” Journal of Political Economy, 118(4), 681-722.
Heffetz, Ori, 2011. “A Test of Conspicuous Consumption: Visibility and Income Elasticities,” Review of Economics and Statistics 93(4), 1101-1117.
Huggett, Mark. 1993.
“The Risk-Free Rate in Heterogeneous-Agent Incomplete-Insurance Economies.” Journal of Economic Dynamics and Control 17 (September-November): 953-69.
Iacoviello, Matteo, 2008. “Household Debt and Income Inequality: 1963-2003,” Journal of Money, Credit and Banking 40(5), 929-965.
Kennickell, Arthur B. 1991. “Imputation of the 1989 Survey of Consumer Finances," Proceedings of the Section on Survey Research Methods, 1990 Joint Statistical Meetings, Atlanta, GA.
Kennickell, Arthur B., 1998, “Multiple imputation in the Survey of Consumer Finances,” Working paper, Federal Reserve Board, available at: http://www.federalreserve.gov/pubs/oss/oss2/method.html.
Kopczuk, Wojciech and Saez, Emmanuel. 2004. “Top Wealth Shares in the United States, 1916–2000: Evidence from Estate Tax Returns.” National Tax Journal, 57(2), pp. 445–87.
Krueger, Dirk, and Fabrizio Perri. 2006. "Does Income Inequality Lead to Consumption Inequality? Evidence and Theory," Review of Economic Studies, Vol. 73(1): 163-193.
Kuhn, Peter, Peter Kooreman, Adriaan R. Soetevent, and Arie Kapteyn, 2010. “The Effects of Lottery Prizes on Winners and their Neighbors: Evidence from the Dutch Postcode Lottery,” IZA Discussion Paper 4950.
Lee, Donghoon, and Wilbert van der Klaauw. 2010. "An Introduction to the FRBNY Consumer Credit Panel." Federal Reserve Bank of New York, Staff Report no. 4799.
Luttmer, Erzo F. P., 2005. “Neighbors as Negatives: Relative Earnings and Well-Being,” Quarterly Journal of Economics 120(3), 963-1002.
Maurer, Jurgen and Andre Meier, 2008. “Smooth It Like the Joneses? Estimating Peer-Group Effects in Intertemporal Consumption Choice,” The Economic Journal 118, 454-476.
Munnell, Alicia H., Geoffrey M. B. Tootell, Lynn E. Browne, and James McEneaney, 1996. "Mortgage Lending in Boston: Interpreting HMDA Data," American Economic Review 86(1), 25-53.
Neumark, David and Andrew Postlewaite, 1998.
“Relative Income Concerns and the Rise in Married Women’s Employment,” Journal of Public Economics 70, 157-183.
Perugini, Cristiano, Jens Holscher, and Simon Collie, 2013. “Inequality, Credit Expansion and Financial Crises,” Munich Personal RePec Archive Paper 51336.
Piketty, Thomas and Emmanuel Saez. 2003. “Income Inequality in the United States: 1913-1998.” Quarterly Journal of Economics, Vol. 118 (1), pp. 1–39.
Rajan, Raghuram G., 2010. Fault Lines: How Hidden Fault Lines Still Threaten the World Economy, Princeton University Press, Princeton N.J.
Sanchez, Juan M., 2009. "The IT Revolution and the Unsecured Credit Market," FRB Richmond Working paper no. 09-4.
Tootell, Geoffrey M. B., 1996. "Redlining in Boston: Do Mortgage Lenders Discriminate against Neighborhoods?," Quarterly Journal of Economics 111(4), 1049-1079.
Turner, Margery Austin, and Felicity Skidmore, 1999. Mortgage Lending Discrimination: A Review of Existing Evidence. The Urban Institute, Washington D.C.
Zizzo, Daniel J. and Andrew J. Oswald, 2001. “Are People Willing to Pay to Reduce Others’ Income,” Annales d’Economie et de Statistique 63/64, 39-65.

FIGURE 1: INEQUALITY AND DEBT IN THE U.S.
[Figure: two time series, 1967-2012. Left axis: Household Debt to GDP Ratio. Right axis: Income Share of Top 5%.]
Note: The figure plots the income share of the top 5% of U.S. households (source: IRS) and the ratio of household (and non-profit) total liabilities relative to GDP (source: Federal Reserve).

FIGURE 2: INEQUALITY ACROSS U.S. COUNTIES
[Figure: county map. Legend bins: 1.0951489 - 1.3558859; 1.3558859 - 1.4067343; 1.4067343 - 1.4433647; 1.4433647 - 1.4748088; 1.4748088 - 1.5081176; 1.5081176 - 1.7359808; No data.]
Note: The figure plots inequality in 2001 at the county level. Inequality is measured as the difference in log expected incomes at the 90th and 10th percentiles computed from the CCP.
Darker counties are more unequal with each bin representing a quintile of the distribution across counties.

FIGURE 3: CROSS-SECTIONAL INEQUALITY IN THE U.S.
[Figure: three density plots titled “Distribution of inequality by zip code”, “Distribution of inequality by county”, and “Distribution of inequality by state”. Horizontal axis: inequality (CCP): p90-p10. Vertical axis: Density.]
Note: The figures plot the regional distribution of inequality, measured using differences in expected log income between the 90th and 10th percentiles as computed from the CCP, at three levels of aggregation: zip code, county and state level.

FIGURE 4: DEBT ACCUMULATION, INCOME RANK AND LOCAL INEQUALITY
Panel A: $\alpha<0,\ \beta=0,\ \gamma=0$
Panel B: $\alpha<0,\ \beta>0,\ \gamma<0$
Panel C: $\alpha<0,\ \beta<0,\ \gamma>0,\ |\gamma|>|\beta|$
Note: The figure plots qualitative predictions for various theories of how borrowing and inequality interact. Panel A shows a case where the local inequality is irrelevant for borrowing. Panel B demonstrates a special case of “keeping up with the Joneses” when the debt accumulation of the richest household does not depend on the local inequality. Panel C shows the case where increased inequality ($I_H>I_L$) allows high-income households to borrow more. See section 3.1 in the text for details.

FIGURE 5: THE ESTIMATED EFFECT OF ONE SD INCREASE IN INEQUALITY ON DEBT ACCUMULATION, $\sigma(Inequality)\times(\beta+\gamma\times Rank)$
Panel A: Parsimonious Specification
Panel B: Specification with Full Set of Controls
Note: These figures plot the calculated effects of a one standard deviation increase in inequality using estimated coefficients on rank, inequality, and the interaction of rank and inequality from the baseline specification (Table 3: Panel A) and the specification with full controls (Table 3: Panel C).

FIGURE 6.
DEBT ACCUMULATION BY LOW AND HIGH-RANK HOUSEHOLDS AND LOCAL INEQUALITY, NONPARAMETRIC SPECIFICATION
Note: The figure shows the estimated coefficients on the income rank dummies from the nonparametric regressions of the relative household debt accumulation between 2001 and year $t$. Each regression contains a dummy for income rank below 0.2, a dummy for income rank above 0.8, a full set of controls described in equation (3), and county-specific fixed effects. The omitted category is the dummy for income rank between 0.2 and 0.8. The regressions are estimated by year. In each year, the regression is estimated separately for each of the three categories: low-inequality locations (below the 20th percentile of the inequality distribution across zip codes in 2001), mid-level inequality locations (between the 20th and 80th percentiles), and high-inequality locations (above the 80th percentile). Each location (zip code) is assigned to one of the three categories in 2001 and the assignment remains constant through 2002-2012. The standard errors are clustered by zip code. The dotted lines show the 95% confidence interval.

FIGURE 7. THEORETICAL EFFECTS OF A CHANGE IN INEQUALITY ON PROVISION OF CREDIT
Bank Sorting and Inequality under Perfect Competition: Panel A, Panel B
Bank Sorting and Inequality under Monopoly Banking: Panel C, Panel D
Note: Panel A shows the tradeoff $s^*(y)$ for the baseline income distribution (“equal”) and a more unequal income distribution (“unequal”). Panel B plots the interest rate for each income level and for different levels of income inequality. In Panels A and B banks can price discriminate perfectly. Panel C plots sets of households with signals $s$ and $y$ who obtain loans for the “equal” and “unequal” income distributions. Shaded regions indicate combinations of signals that yield an approved loan. Panel D plots the loan denial probability as a function of income. In Panels C and D, the bank charges the same rate for all applicants.
TABLE 1: SUMMARY STATISTICS

                                                    Percentiles
Category                      Mean      St. Dev.  10       25       50       75        90

Panel A: FRBNY Consumer Credit Panel/Equifax, Q3 2001
Age of head of household      42.6      11.0      28       34       42       51        58
Household size                3.0       1.7       1        2        3        4         5
Housing debt                  56,423    99,938    0        0        12,351   83,255    156,082
Mortgage                      54,658    97,202    0        0        8,267    81,163    153,000
HELOC                         1,765     12,565    0        0        0        0         0
Auto loans                    6,876     11,543    0        0        0        10,805    21,376
Credit card limit             30,459    36,452    1,609    6,127    19,320   42,288    73,009
Credit card balance           8,884     14,812    261      1,120    3,923    10,881    22,893
Student loan                  1,639     7,849     0        0        0        0         2,723
Consumer financing            929       5,861     0        0        0        178       2,033
Other debt                    4,044     22,158    0        0        0        0         10,410
Total debt                    78,794    112,167   1,368    9,437    42,311   111,335   193,395
Bankruptcy rate               0.12      0.32      0.00     0.00     0.00     0.00      1.00
Delinquency rate              0.30      0.46      0.00     0.00     0.00     1.00      1.00
Credit card utilization rate  0.41      0.35      0.02     0.09     0.31     0.71      0.99

Panel B: Survey of Consumer Finances, 2001
Age of head of household      43.3      11.3      28       35       43       52        59
Household size                2.8       1.4       1        2        2        4         5
Housing debt                  60,783    119,310   0        0        29,000   90,000    150,000
Mortgage debt                 57,643    90,243    0        0        27,000   88,000    147,000
HELOC                         3,140     73,981    0        0        0        0         0
Auto loans                    5,182     8,280     0        0        0        8,700     18,000
Credit card limit             19,290    43,636    1,400    4,500    10,000   22,000    42,000
Credit card balance           2,586     5,459     0        0        500      3,000     7,200
Student loan                  2,271     9,786     0        0        0        0         5,000
Consumer financing
Other debt
Total debt                    70,822    121,163   30       6,140    40,000   101,000   164,800
Bankruptcy rate               0.10      0.30      0.00     0.00     0.00     0.00      1.00
Delinquency rate              0.05      0.21      0.00     0.00     0.00     0.00      0.00
Credit card utilization rate  0.27      0.34      0.00     0.00     0.08     0.47      0.93

Note: The sample is restricted to the households with 20-65 year old head of household. The statistics are calculated using sampling weights. Housing debt is the sum of Mortgage and HELOC. The credit card limit is the maximum of the originally recorded credit card limit in the CCP and the credit card balance. The credit card utilization rate is calculated using this credit card limit.
The table shows the statistics from the sample restricted to observations with a nonzero credit card limit. The delinquency rate is the share of households with at least one member with an account that is 60 days or more past due. The number of observations in Panel A is 7,710,406. The number of observations in Panel B is 14,356.

TABLE 2: INCOME STATISTICS FROM SCF (ACTUAL) AND CCP (IMPUTED)

                                          Percentiles
                        Mean    St. dev.  10      25      50      75      90
Ln(Y), actual in SCF    10.62   0.91      9.47    10.09   10.67   11.20   11.62
Ln(Y), imputed in CCP   10.72   0.98      9.54    10.10   10.70   11.28   11.88

Note: The sample is restricted to households with a 20-65 year old head of household and positive gross income. The sample in the SCF is further restricted to remove outliers. See text for more details.

TABLE 3: BASELINE RESULTS ON HOUSEHOLD DEBT ACCUMULATION

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Parsimonious Specification
α   -1.23***   -2.04***   -2.86***   -3.32***   -3.81***   -3.98***   -3.85***   -3.74***   -3.38***   -3.02***   -2.66***
    (0.02)     (0.03)     (0.04)     (0.04)     (0.05)     (0.06)     (0.06)     (0.06)     (0.05)     (0.05)     (0.05)
β   -0.39***   -0.59***   -0.96***   -1.04***   -1.15***   -1.10***   -0.98***   -0.93***   -0.75***   -0.58***   -0.38***
    (0.01)     (0.01)     (0.02)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)     (0.02)     (0.02)     (0.02)
γ   0.63***    1.07***    1.58***    1.80***    2.05***    2.09***    1.95***    1.87***    1.62***    1.37***    1.10***
    (0.01)     (0.02)     (0.03)     (0.03)     (0.04)     (0.04)     (0.04)     (0.04)     (0.04)     (0.04)     (0.04)
N   5,925,610  5,449,695  4,837,540  4,387,387  4,050,160  3,792,576  3,581,989  3,438,004  3,295,854  3,178,324  3,069,446
R2  0.012      0.017      0.025      0.030      0.038      0.041      0.043      0.042      0.039      0.038      0.037

Panel B: Specification with Household Controls
α   -1.09***   -1.87***   -2.69***   -3.12***   -3.62***   -3.80***   -3.72***   -3.66***   -3.29***   -2.90***   -2.51***
    (0.02)     (0.03)     (0.04)     (0.05)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.05)
β   -0.41***   -0.56***   -0.83***   -0.94***   -1.07***   -1.12***   -1.09***   -1.10***   -1.01***   -0.92***   -0.79***
    (0.01)     (0.01)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.02)
γ   0.56***    0.93***    1.37***    1.59***    1.85***    1.95***    1.91***    1.89***    1.73***    1.53***    1.33***
    (0.01)     (0.02)     (0.03)     (0.04)     (0.04)     (0.05)     (0.05)     (0.05)     (0.04)     (0.04)     (0.04)
N   5,760,568  5,287,149  4,684,857  4,244,767  3,920,565  3,668,685  3,468,033  3,326,869  3,185,764  3,069,465  2,964,013
R2  0.047      0.057      0.062      0.069      0.074      0.080      0.088      0.091      0.097      0.107      0.119

Panel C: Specification with Household and Zip-Level Controls
α   -1.08***   -1.86***   -2.67***   -3.09***   -3.59***   -3.77***   -3.68***   -3.62***   -3.26***   -2.87***   -2.49***
    (0.02)     (0.03)     (0.04)     (0.05)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.05)
β   -0.31***   -0.44***   -0.62***   -0.68***   -0.76***   -0.79***   -0.76***   -0.76***   -0.69***   -0.61***   -0.53***
    (0.01)     (0.01)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.02)
γ   0.57***    0.94***    1.40***    1.63***    1.91***    2.02***    1.98***    1.97***    1.81***    1.61***    1.40***
    (0.01)     (0.02)     (0.03)     (0.03)     (0.04)     (0.05)     (0.05)     (0.05)     (0.04)     (0.04)     (0.04)
N   5,760,568  5,287,149  4,684,857  4,244,767  3,920,565  3,668,685  3,468,033  3,326,869  3,185,764  3,069,465  2,964,013
R2  0.048      0.058      0.064      0.071      0.076      0.082      0.090      0.093      0.098      0.109      0.120

Panel D: Specification with Zip-Level Fixed Effects
α   -1.08***   -1.86***   -2.66***   -3.09***   -3.56***   -3.76***   -3.68***   -3.61***   -3.25***   -2.85***   -2.47***
    (0.10)     (0.15)     (0.24)     (0.30)     (0.38)     (0.43)     (0.42)     (0.40)     (0.38)     (0.32)     (0.27)
γ   0.57***    0.94***    1.39***    1.63***    1.90***    2.00***    1.97***    1.96***    1.79***    1.59***    1.38***
    (0.07)     (0.10)     (0.17)     (0.21)     (0.27)     (0.31)     (0.30)     (0.28)     (0.26)     (0.22)     (0.18)
N   5,760,568  5,287,149  4,684,857  4,244,767  3,920,565  3,668,685  3,468,033  3,326,869  3,185,764  3,069,465  2,964,013
R2  0.052      0.061      0.068      0.076      0.082      0.088      0.096      0.099      0.105      0.115      0.126

Note: The table presents estimates of specifications (2), (3), (4) and (5) in Panels A through D, respectively. Coefficient α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. In Panels A-C, the standard errors are clustered by zip code; in Panel D, standard errors are clustered by state. See sections 3.1 and 3.2 in the text for details.

TABLE 4: INTERACTIONS OF RANK WITH CREDIT SCORES AND INITIAL DEBT LEVELS

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Include Interaction of Household Credit Score and Local Inequality
α   -1.054***  -1.690***  -2.500***  -2.867***  -3.299***  -3.465***  -3.440***  -3.351***  -3.049***  -2.678***  -2.354***
    (0.0267)   (0.0394)   (0.0525)   (0.0643)   (0.0766)   (0.0834)   (0.0838)   (0.0842)   (0.0792)   (0.0741)   (0.0690)
β   -1.011***  -1.876***  -2.526***  -3.136***  -3.841***  -4.087***  -3.962***  -3.909***  -3.593***  -3.212***  -2.879***
    (0.0287)   (0.0417)   (0.0550)   (0.0691)   (0.0885)   (0.102)    (0.106)    (0.109)    (0.103)    (0.0979)   (0.0926)
γ   0.542***   0.802***   1.250***   1.441***   1.658***   1.758***   1.761***   1.729***   1.607***   1.425***   1.260***
    (0.0195)   (0.0289)   (0.0388)   (0.0475)   (0.0570)   (0.0621)   (0.0624)   (0.0625)   (0.0589)   (0.0551)   (0.0511)
φ   -0.745***  -1.925***  -2.730***  -3.744***  -4.911***  -5.286***  -4.914***  -4.600***  -4.234***  -3.845***  -3.523***
    (0.0494)   (0.0718)   (0.0931)   (0.116)    (0.146)    (0.168)    (0.173)    (0.178)    (0.167)    (0.157)    (0.149)
σ   0.864***   1.816***   2.388***   3.065***   3.831***   4.095***   3.947***   3.908***   3.577***   3.205***   2.874***
    (0.0351)   (0.0514)   (0.0666)   (0.0831)   (0.105)    (0.122)    (0.126)    (0.129)    (0.121)    (0.114)    (0.108)
N   3,971,367  3,621,115  3,182,620  2,862,799  2,631,983  2,453,874  2,314,493  2,215,144  2,116,638  2,036,909  1,964,385
R2  0.049      0.058      0.063      0.070      0.074      0.080      0.089      0.091      0.096      0.106      0.117

Panel B: Include Interaction of Initial Household Debt Level and Local Inequality
α   -0.516***  -1.171***  -2.017***  -2.422***  -2.970***  -3.069***  -2.916***  -2.814***  -2.316***  -1.848***  -1.309***
    (0.0275)   (0.0387)   (0.0489)   (0.0605)   (0.0732)   (0.0814)   (0.0849)   (0.0857)   (0.0802)   (0.0769)   (0.0710)
β   -0.312***  -0.452***  -0.670***  -0.758***  -0.878***  -0.910***  -0.881***  -0.857***  -0.770***  -0.659***  -0.556***
    (0.0118)   (0.0170)   (0.0224)   (0.0273)   (0.0329)   (0.0357)   (0.0370)   (0.0374)   (0.0365)   (0.0348)   (0.0328)
γ   0.233***   0.530***   0.987***   1.203***   1.481***   1.529***   1.460***   1.433***   1.221***   1.014***   0.744***
    (0.0200)   (0.0282)   (0.0359)   (0.0443)   (0.0540)   (0.0600)   (0.0627)   (0.0631)   (0.0591)   (0.0564)   (0.0520)
φ   -2.97***   -3.79***   -4.09***   -4.47***   -4.59***   -5.00***   -5.37***   -5.49***   -6.05***   -6.21***   -6.876***
    (0.089)    (0.115)    (0.125)    (0.147)    (0.167)    (0.200)    (0.214)    (0.213)    (0.199)    (0.214)    (0.195)
σ   1.67***    2.15***    2.49***    2.81***    3.05***    3.38***    3.54***    3.55***    3.67***    3.53***    3.71***
    (0.063)    (0.0824)   (0.891)    (0.105)    (0.122)    (0.147)    (0.158)    (0.153)    (0.144)    (0.152)    (0.140)
N   3,989,837  3,643,849  3,203,783  2,882,349  2,650,275  2,470,570  2,329,399  2,228,828  2,128,927  2,047,809  1,974,388
R2  0.053      0.061      0.064      0.070      0.074      0.079      0.088      0.091      0.098      0.109      0.124

Note: The table presents estimates of specifications (3’) and (3’’) in section 3.2. Coefficient α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Coefficient φ represents the effect of each additional variable (household credit score in Panel A and initial household debt level in Panel B), while σ captures the interaction of this household variable with local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. The standard errors are clustered by zip code. In Panel B, coefficients φ and σ and the respective standard errors are multiplied by 10^6.

TABLE 5: HOUSEHOLD DEBT ACCUMULATION ALONG SUBSETS OF DATA

             α          β          γ          N          R2

Grouping Zip Codes by Census Region
Midwest      -2.619***  -0.385***  1.222***   873,543    0.096
             (0.110)    (0.054)    (0.084)
Northeast    -3.765***  -0.832***  2.141***   739,380    0.071
             (0.120)    (0.055)    (0.092)
South        -4.059***  -0.811***  2.145***   1,329,937  0.094
             (0.108)    (0.0457)   (0.077)
West         -6.078***  -1.558***  3.507***   725,825    0.056
             (0.176)    (0.072)    (0.125)

Grouping Zip Codes by Average Credit Ratings
Low          -4.489***  -1.178***  2.483***   1,005,563  0.092
             (0.124)    (0.044)    (0.088)
Middle       -4.202***  -0.934***  2.251***   1,185,270  0.095
             (0.098)    (0.044)    (0.072)
High         -2.289***  -0.321***  1.214***   1,477,852  0.092
             (0.074)    (0.035)    (0.056)

Grouping Zip Codes by Initial Average Debt-to-Income Ratios
Low          -2.189***  -0.380***  1.002***   960,459    0.070
             (0.171)    (0.066)    (0.117)
Middle       -3.101***  -0.606***  1.508***   1,244,084  0.081
             (0.127)    (0.053)    (0.092)
High         -3.754***  -0.783***  1.988***   1,464,142  0.090
             (0.108)    (0.050)    (0.086)

Grouping Zip Codes by House Price Growth (2001-2005)
Low          -3.084***  -0.548***  1.536***   836,682    0.102
             (0.117)    (0.055)    (0.088)
Middle       -4.320***  -0.981***  2.456***   819,222    0.076
             (0.138)    (0.063)    (0.104)
High         -5.561***  -1.346***  3.139***   797,970    0.057
             (0.163)    (0.068)    (0.117)

Note: The table presents estimates of specification (4) in the text using household debt accumulation from 2001 to 2007. Panel A presents separate estimates for households located in each of four Census regions. Panel B presents estimates for households in zip codes with low, medium, or high initial average credit ratings. Panel C presents estimates for households in zip codes with low, medium, or high initial average debt-to-income ratios. Panel D decomposes zip codes by growth of house prices between 2001 and 2005. See section 3.3 in the text for details. Coefficient α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. The standard errors are clustered by zip code.

TABLE 6: MEASURING INEQUALITY AT DIFFERENT LEVELS OF AGGREGATION

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Inequality at the County Level
α   -1.174***  -2.073***  -3.108***  -3.949***  -4.756***  -5.179***  -5.055***  -4.996***  -4.560***  -4.176***  -3.631***
    (0.0865)   (0.134)    (0.252)    (0.321)    (0.417)    (0.475)    (0.493)    (0.475)    (0.452)    (0.445)    (0.382)
β   -0.241***  -0.310***  -0.456***  -0.548***  -0.570***  -0.578**   -0.519**   -0.501**   -0.475**   -0.467**   -0.426**
    (0.0423)   (0.0671)   (0.118)    (0.156)    (0.202)    (0.232)    (0.237)    (0.227)    (0.209)    (0.200)    (0.174)
γ   0.583***   0.986***   1.531***   1.993***   2.413***   2.626***   2.545***   2.534***   2.343***   2.170***   1.861***
    (0.0606)   (0.0943)   (0.175)    (0.224)    (0.293)    (0.334)    (0.344)    (0.330)    (0.314)    (0.309)    (0.264)
N   6,640,570  6,257,495  5,782,494  5,435,548  5,172,907  4,966,746  4,793,457  4,661,838  4,531,493  4,421,495  4,319,303
R2  0.048      0.060      0.070      0.079      0.086      0.091      0.098      0.100      0.105      0.115      0.125

Panel B: Inequality at the State Level
α   -0.926**   -1.710***  -2.852**   -4.036***  -5.283***  -5.651***  -5.592***  -5.545***  -4.969***  -4.482***  -3.795***
    (0.359)    (0.543)    (1.114)    (1.412)    (1.667)    (1.697)    (1.612)    (1.525)    (1.476)    (1.391)    (1.224)
β   0.0490     0.0832     0.254      0.478      0.839**    1.317***   1.472***   1.386***   1.193**    1.001**    0.863*
    (0.114)    (0.163)    (0.259)    (0.324)    (0.394)    (0.458)    (0.469)    (0.483)    (0.479)    (0.468)    (0.447)
γ   0.393      0.695*     1.280*     1.937**    2.616**    2.765**    2.711**    2.708**    2.409**    2.170**    1.770**
    (0.242)    (0.367)    (0.754)    (0.954)    (1.125)    (1.144)    (1.080)    (1.019)    (0.988)    (0.929)    (0.815)
N   7,015,125  6,704,094  6,344,116  6,088,596  5,893,406  5,737,576  5,600,035  5,490,380  5,383,103  5,293,822  5,209,929
R2  0.049      0.062      0.071      0.082      0.088      0.092      0.099      0.100      0.108      0.119      0.130

Note: The table presents estimates of specification (4) while measuring inequality at different levels of aggregation: county level in Panel A and state level in Panel B. Coefficient α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. See section 3.4 in the text for details.
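The star notation used in the table notes can be related to the reported coefficients and standard errors with a small calculation. The sketch below is illustrative only. It assumes two-sided tests with standard normal critical values (1.645, 1.960, 2.576), which may differ from the critical values behind the clustered standard errors in the tables, so estimates close to a threshold can be classified differently. The function name is hypothetical.

```python
def significance_stars(coef, se):
    """Map a coefficient and its standard error to the star notation.

    ASSUMPTION: two-sided test with standard normal critical values;
    this is an approximation, not necessarily the paper's procedure.
    """
    t_stat = abs(coef / se)
    if t_stat >= 2.576:   # 1% level
        return "***"
    if t_stat >= 1.960:   # 5% level
        return "**"
    if t_stat >= 1.645:   # 10% level
        return "*"
    return ""


if __name__ == "__main__":
    # (coefficient, standard error) pairs as they read in Table 6.
    examples = [
        (1.317, 0.458),   # Panel B, β, 2007
        (-0.578, 0.232),  # Panel A, β, 2007
        (0.695, 0.367),   # Panel B, γ, 2003
        (0.393, 0.242),   # Panel B, γ, 2002
    ]
    for coef, se in examples:
        print(f"{coef:+.3f} ({se:.3f}) -> '{significance_stars(coef, se)}'")
```

For these four entries, the normal approximation gives the same stars as those shown in the table.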
TABLE 7: RESULTS BY FORM OF DEBT

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Mortgage Debt Accumulation
α   0.926***   -1.618***  -2.393***  -2.748***  -3.304***  -3.560***  -3.461***  -3.361***  3.089***   -2.778***  -2.468***
    (0.017)    (0.024)    (0.034)    (0.040)    (0.048)    (0.053)    (0.052)    (0.052)    (0.050)    (0.047)    (0.044)
β   -0.304***  -0.447***  -0.632***  -0.693***  -0.813***  -0.865***  -0.828***  -0.811***  -0.756***  -0.669***  -0.606***
    (0.008)    (0.011)    (0.015)    (0.017)    (0.021)    (0.023)    (0.023)    (0.023)    (0.022)    (0.022)    (0.020)
γ   0.568***   0.948***   1.393***   1.606***   1.936***   2.087***   2.021***   1.970***   1.823***   1.634***   1.462***
    (0.012)    (0.017)    (0.025)    (0.029)    (0.035)    (0.039)    (0.038)    (0.038)    (0.036)    (0.035)    (0.032)
N   5,759,852  5,286,511  4,684,155  4,244,067  3,919,926  3,667,964  3,467,395  3,326,197  3,185,052  3,068,773  2,963,305
R2  0.052      0.063      0.068      0.078      0.082      0.087      0.096      0.099      0.109      0.122      0.138

Panel B: Auto Debt Accumulation
α   -0.118***  -0.189***  -0.227***  -0.236***  -0.222***  -0.191***  -0.151***  -0.110***  -0.072***  -0.068***  -0.078***
    (0.00315)  (0.00433)  (0.00518)  (0.006)    (0.006)    (0.006)    (0.006)    (0.006)    (0.005)    (0.005)    (0.005)
β   -0.042***  -0.054***  -0.060***  -0.058***  -0.052***  -0.044***  -0.033***  -0.024***  -0.014***  -0.015***  -0.016***
    (0.001)    (0.002)    (0.003)    (0.003)    (0.003)    (0.003)    (0.003)    (0.003)    (0.002)    (0.002)    (0.003)
γ   0.060***   0.076***   0.087***   0.087***   0.078***   0.061***   0.042***   0.023***   0.009**    0.010***   0.017***
    (0.002)    (0.003)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)
N   5,761,261  5,287,505  4,684,632  4,244,481  3,920,470  3,668,662  3,468,178  3,327,099  3,185,871  3,069,547  2,964,371
R2  0.084      0.110      0.123      0.134      0.145      0.158      0.183      0.201      0.221      0.228      0.226

Panel C: Credit Card Balance Accumulation
α   -0.012***  0.005      0.030***   0.038***   0.048***   0.035***   0.030***   0.034***   0.045***   0.053***   0.061***
    (0.003)    (0.003)    (0.004)    (0.005)    (0.005)    (0.005)    (0.006)    (0.006)    (0.006)    (0.006)    (0.005)
β   0.003**    0.005***   0.008***   0.013***   0.014***   0.009***   0.005*     0.006**    0.002      -0.000     0.000
    (0.001)    (0.001)    (0.002)    (0.002)    (0.002)    (0.002)    (0.003)    (0.003)    (0.003)    (0.002)    (0.002)
γ   -0.005***  -0.007***  -0.013***  -0.018***  -0.017***  -0.010**   -0.003     -0.004     0.009**    0.016***   0.015***
    (0.002)    (0.003)    (0.003)    (0.003)    (0.004)    (0.004)    (0.004)    (0.005)    (0.004)    (0.040)    (0.004)
N   5,237,870  4,732,987  4,180,218  3,803,373  3,512,251  3,293,491  3,111,432  2,946,652  2,798,243  2,699,679  2,602,130
R2  0.084      0.119      0.144      0.154      0.167      0.161      0.159      0.164      0.202      0.232      0.251

Panel D: Credit Card Limits
α   -0.291***  -0.422***  -0.502***  -0.730***  -0.766***  -0.888***  -0.911***  -0.844***  -0.743***  -0.711***  -0.698***
    (0.006)    (0.009)    (0.010)    (0.013)    (0.014)    (0.016)    (0.017)    (0.016)    (0.015)    (0.015)    (0.015)
β   -0.063***  -0.097***  -0.132***  -0.166***  -0.180***  -0.208***  -0.203***  -0.237***  -0.241***  -0.228***  -0.214***
    (0.002)    (0.003)    (0.004)    (0.006)    (0.006)    (0.007)    (0.008)    (0.007)    (0.006)    (0.006)    (0.007)
γ   0.112***   0.189***   0.255***   0.313***   0.356***   0.393***   0.390***   0.471***   0.476***   0.478***   0.462***
    (0.004)    (0.006)    (0.007)    (0.010)    (0.010)    (0.012)    (0.013)    (0.012)    (0.011)    (0.011)    (0.011)
N   5,761,018  5,287,685  4,685,051  4,245,070  3,920,739  3,669,099  3,468,561  3,327,102  3,185,934  3,069,635  2,964,353
R2  0.042      0.069      0.102      0.127      0.131      0.139      0.143      0.165      0.204      0.227      0.237

Note: The table presents estimates of specification (4) for different forms of household debt: mortgage debt in Panel A, auto debt in Panel B, credit card balances in Panel C and credit card limits in Panel D. Coefficient α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. See section 3.6 in the text for details.

TABLE 8: MORTGAGE APPLICATIONS AND LOCAL INEQUALITY

    2001       2002       2003       2004       2005       2006       2007       2008       2009       2010       2011

Panel A: Probability of Mortgage Application Being Rejected
α   0.287**    0.240**    0.222***   0.223***   0.195***   0.228***   0.169***   0.165***   0.116**    0.209***   0.265***
    (0.124)    (0.101)    (0.079)    (0.052)    (0.053)    (0.056)    (0.051)    (0.058)    (0.048)    (0.054)    (0.060)
γ   -0.441***  -0.355***  -0.314***  -0.331***  -0.320***  -0.320***  -0.256***  -0.253***  -0.185***  -0.293***  -0.356***
    (0.092)    (0.075)    (0.058)    (0.038)    (0.039)    (0.041)    (0.037)    (0.042)    (0.035)    (0.040)    (0.044)
N   644,680    647,685    722,326    790,699    890,889    798,332    577,110    395,574    371,967    382,851    359,100
R2  0.126      0.099      0.070      0.063      0.058      0.058      0.060      0.052      0.044      0.057      0.073

Panel B: Probability of Mortgage Being High-Interest (conditional on approval)
α                                    0.112***   0.095      0.081      0.103**    0.069*     0.012      0.063**    0.077**
                                     (0.041)    (0.058)    (0.063)    (0.045)    (0.042)    (0.026)    (0.026)    (0.036)
γ                                    -0.187***  -0.219***  -0.204***  -0.196***  -0.161***  -0.073***  -0.107***  -0.132***
                                     (0.030)    (0.042)    (0.045)    (0.033)    (0.030)    (0.019)    (0.019)    (0.027)
N                                    598,307    644,987    567,623    415,484    287,400    283,357    286,764    268,874
R2                                   0.113      0.175      0.141      0.085      0.072      0.057      0.094      0.092

Panel C: Loan-to-Income Ratios of Mortgage Applications (conditional on approval)
α   -0.661***  -0.698***  -0.823***  -0.872***  -0.614***  -0.623***  -0.785***  -0.704***  -0.702***  -0.790***  -0.720***
    (0.099)    (0.096)    (0.092)    (0.072)    (0.068)    (0.061)    (0.062)    (0.077)    (0.077)    (0.080)    (0.082)
γ   0.041      0.039      0.108      0.156***   -0.017     -0.005     0.101**    0.055      0.004      0.068      0.046
    (0.073)    (0.071)    (0.068)    (0.051)    (0.049)    (0.044)    (0.044)    (0.056)    (0.055)    (0.058)    (0.059)
N   501,296    513,101    565,412    598,307    644,987    567,623    415,484    287,400    283,357    286,764    268,874
R2  0.327      0.349      0.368      0.354      0.338      0.352      0.376      0.382      0.405      0.413      0.393

Note: The table presents estimates of specification (6) for different dependent variables as indicated in each panel. Coefficient α corresponds to the partial correlation of applicant’s income rank and the dependent variable in the year indicated by each column. Coefficient γ corresponds to the interaction of local inequality and applicant’s income rank. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. See section 5 in the text for details.

APPENDIX NOT FOR PUBLICATION

APPENDIX A: ADDITIONAL TABLES AND FIGURES

APPENDIX TABLE A1: ROBUSTNESS TO USING IRS MEASURE OF INEQUALITY

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Parsimonious Specification
α   -0.890***  -1.515***  -2.088***  -2.460***  -2.826***  -2.934***  -2.871***  -2.861***  -2.692***  -2.464***  -2.250***
    (0.0194)   (0.0293)   (0.0390)   (0.0484)   (0.0583)   (0.0641)   (0.0640)   (0.0638)   (0.0606)   (0.0558)   (0.0521)
β   -0.760***  -1.181***  -1.826***  -2.048***  -2.307***  -2.291***  -2.054***  -1.933***  -1.601***  -1.258***  -0.929***
    (0.0253)   (0.0380)   (0.0551)   (0.0672)   (0.0829)   (0.0896)   (0.0897)   (0.0895)   (0.0827)   (0.0747)   (0.0690)
γ   1.222***   2.216***   3.224***   3.742***   4.241***   4.208***   3.924***   3.892***   3.538***   3.049***   2.539***
    (0.0438)   (0.0659)   (0.0875)   (0.109)    (0.131)    (0.144)    (0.144)    (0.144)    (0.137)    (0.127)    (0.118)
N   5,924,527  5,448,830  4,837,105  4,387,141  4,049,988  3,792,440  3,581,901  3,437,924  3,295,790  3,178,262  3,069,406
R2  0.012      0.016      0.023      0.029      0.037      0.041      0.043      0.041      0.038      0.038      0.037

Panel B: Specification with Household and Regional Controls
α   -0.760***  -1.415***  -2.038***  -2.398***  -2.811***  -2.901***  -2.824***  -2.809***  -2.536***  -2.201***  -1.920***
    (0.0210)   (0.0309)   (0.0429)   (0.0533)   (0.0639)   (0.0701)   (0.0700)   (0.0700)   (0.0673)   (0.0620)   (0.0571)
β   -0.600***  -0.934***  -1.339***  -1.517***  -1.717***  -1.708***  -1.609***  -1.597***  -1.466***  -1.255***  -1.108***
    (0.0271)   (0.0394)   (0.0554)   (0.0687)   (0.0846)   (0.0934)   (0.0940)   (0.0949)   (0.0907)   (0.0843)   (0.0791)
γ   1.064***   1.951***   2.961***   3.561***   4.225***   4.353***   4.253***   4.337***   4.016***   3.524***   3.077***
    (0.0475)   (0.0698)   (0.0976)   (0.121)    (0.146)    (0.160)    (0.160)    (0.160)    (0.154)    (0.142)    (0.131)
N   5,759,501  5,286,304  4,684,443  4,244,552  3,920,426  3,668,580  3,467,968  3,326,809  3,185,721  3,069,425  2,963,983
R2  0.048      0.057      0.063      0.070      0.076      0.081      0.089      0.092      0.098      0.108      0.120

Note: The table reproduces the results in Table 3 of the text using the IRS measure of inequality rather than the CCP measure. See section 3.2 in the text for details.

APPENDIX TABLE A2: ROBUSTNESS TO GEOGRAPHIC REGION

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Midwest
α   -0.887***  -1.550***  -2.137***  -2.241***  -2.598***  -2.619***  -2.507***  -2.531***  -2.211***  -1.923***  -1.635***
    (0.0416)   (0.0552)   (0.0730)   (0.0906)   (0.105)    (0.110)    (0.110)    (0.110)    (0.104)    (0.0993)   (0.0922)
β   -0.252***  -0.330***  -0.435***  -0.367***  -0.384***  -0.385***  -0.327***  -0.353***  -0.322***  -0.264***  -0.237***
    (0.0198)   (0.0266)   (0.0353)   (0.0422)   (0.0509)   (0.0542)   (0.0553)   (0.0540)   (0.0506)   (0.0483)   (0.0456)
γ   0.450***   0.747***   1.043***   1.047***   1.233***   1.222***   1.167***   1.226***   1.092***   0.960***   0.802***
    (0.0313)   (0.0413)   (0.0554)   (0.0688)   (0.0796)   (0.0844)   (0.0850)   (0.0834)   (0.0788)   (0.0751)   (0.0698)
N   1,310,459  1,214,540  1,089,234  994,179    926,258    873,543    829,532    799,413    767,757    742,214    717,954
R2  0.055      0.064      0.072      0.082      0.089      0.096      0.108      0.111      0.122      0.137      0.153

Panel B: Northeast
α   -0.883***  -1.603***  -2.422***  -2.904***  -3.481***  -3.765***  -3.725***  -3.610***  -3.411***  -3.005***  -2.708***
    (0.0372)   (0.0523)   (0.0717)   (0.0886)   (0.107)    (0.120)    (0.122)    (0.122)    (0.113)    (0.109)    (0.101)
β   -0.236***  -0.365***  -0.587***  -0.669***  -0.777***  -0.832***  -0.819***  -0.803***  -0.820***  -0.731***  -0.696***
    (0.0169)   (0.0244)   (0.0338)   (0.0413)   (0.0494)   (0.0548)   (0.0568)   (0.0565)   (0.0538)   (0.0514)   (0.0489)
γ   0.473***   0.826***   1.329***   1.619***   1.950***   2.141***   2.132***   2.074***   2.012***   1.797***   1.640***
    (0.0281)   (0.0395)   (0.0551)   (0.0677)   (0.0823)   (0.0920)   (0.0946)   (0.0940)   (0.0876)   (0.0848)   (0.0783)
N   1,105,516  1,025,502  919,783    843,686    785,904    739,380    702,128    674,140    645,604    623,447    602,832
R2  0.044      0.051      0.055      0.062      0.066      0.071      0.077      0.080      0.084      0.093      0.102

Panel C: South
α   -1.242***  -2.187***  -3.026***  -3.560***  -3.939***  -4.059***  -3.962***  -3.905***  -3.390***  -3.071***  -2.639***
    (0.0378)   (0.0547)   (0.0695)   (0.0847)   (0.0993)   (0.108)    (0.108)    (0.110)    (0.108)    (0.103)    (0.0992)
β   -0.364***  -0.541***  -0.716***  -0.802***  -0.847***  -0.811***  -0.764***  -0.779***  -0.648***  -0.605***  -0.500***
    (0.0157)   (0.0227)   (0.0289)   (0.0353)   (0.0419)   (0.0457)   (0.0462)   (0.0472)   (0.0462)   (0.0450)   (0.0431)
γ   0.668***   1.148***   1.613***   1.919***   2.092***   2.145***   2.113***   2.129***   1.869***   1.746***   1.509***
    (0.0266)   (0.0385)   (0.0491)   (0.0598)   (0.0704)   (0.0766)   (0.0761)   (0.0776)   (0.0765)   (0.0725)   (0.0700)
N   2,104,747  1,931,826  1,709,240  1,547,622  1,425,285  1,329,937  1,253,811  1,202,853  1,152,915  1,109,302  1,071,093
R2  0.056      0.068      0.076      0.085      0.089      0.094      0.103      0.107      0.115      0.128      0.141

Panel D: West
α   -1.644***  -2.848***  -4.113***  -4.832***  -5.695***  -6.078***  -5.941***  -5.899***  -5.584***  -4.794***  -4.232***
    (0.0546)   (0.0801)   (0.105)    (0.131)    (0.162)    (0.176)    (0.174)    (0.178)    (0.165)    (0.158)    (0.149)
β   -0.499***  -0.769***  -1.076***  -1.267***  -1.468***  -1.558***  -1.529***  -1.521***  -1.479***  -1.254***  -1.092***
    (0.0228)   (0.0329)   (0.0437)   (0.0534)   (0.0658)   (0.0719)   (0.0706)   (0.0742)   (0.0711)   (0.0671)   (0.0613)
γ   0.928***   1.573***   2.310***   2.738***   3.255***   3.507***   3.418***   3.400***   3.274***   2.789***   2.455***
    (0.0382)   (0.0567)   (0.0746)   (0.0932)   (0.116)    (0.125)    (0.123)    (0.127)    (0.117)    (0.113)    (0.105)
N   1,239,846  1,115,281  966,600    859,280    783,118    725,825    682,562    650,463    619,488    594,502    572,134
R2  0.039      0.047      0.048      0.051      0.053      0.056      0.061      0.062      0.065      0.071      0.083

Note: The table replicates the results in Panel A of Table 5 in the main text for each year in our sample.
APPENDIX TABLE A3: ROBUSTNESS TO AVERAGE LOCAL CREDIT RATINGS

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Low Average Credit Ratings
α   -0.676***  -1.496***  -2.324***  -3.087***  -3.960***  -4.489***  -4.488***  -4.474***  -4.067***  -3.590***  -3.190***
    (0.0372)   (0.0524)   (0.0704)   (0.0886)   (0.110)    (0.124)    (0.128)    (0.129)    (0.125)    (0.118)    (0.113)
β   -0.257***  -0.463***  -0.693***  -0.861***  -1.075***  -1.178***  -1.175***  -1.204***  -1.118***  -1.003***  -0.897***
    (0.0125)   (0.0180)   (0.0248)   (0.0310)   (0.0395)   (0.0443)   (0.0461)   (0.0471)   (0.0464)   (0.0445)   (0.0426)
γ   0.343***   0.786***   1.228***   1.686***   2.170***   2.483***   2.500***   2.540***   2.375***   2.136***   1.923***
    (0.0260)   (0.0367)   (0.0494)   (0.0625)   (0.0778)   (0.0875)   (0.0909)   (0.0915)   (0.0890)   (0.0837)   (0.0802)
N   1,818,129  1,653,710  1,424,164  1,243,808  1,110,832  1,005,563  922,130    869,345    816,880    768,349    729,247
R2  0.058      0.074      0.077      0.087      0.090      0.092      0.098      0.100      0.110      0.125      0.141

Panel B: Medium Average Local Credit Ratings
α   -1.226***  -2.123***  -3.045***  -3.519***  -4.070***  -4.202***  -4.289***  -4.198***  -3.778***  -3.383***  -2.912***
    (0.0349)   (0.0507)   (0.0655)   (0.0777)   (0.0909)   (0.0984)   (0.101)    (0.103)    (0.0979)   (0.0929)   (0.0893)
β   -0.394***  -0.534***  -0.750***  -0.822***  -0.906***  -0.934***  -0.943***  -0.927***  -0.853***  -0.760***  -0.640***
    (0.0155)   (0.0224)   (0.0294)   (0.0348)   (0.0405)   (0.0443)   (0.0460)   (0.0470)   (0.0449)   (0.0429)   (0.0408)
γ   0.664***   1.085***   1.615***   1.870***   2.176***   2.251***   2.340***   2.307***   2.111***   1.922***   1.665***
    (0.0253)   (0.0368)   (0.0479)   (0.0567)   (0.0669)   (0.0724)   (0.0744)   (0.0756)   (0.0717)   (0.0679)   (0.0655)
N   1,909,604  1,731,554  1,517,591  1,372,556  1,265,579  1,185,270  1,121,699  1,075,653  1,029,665  993,281    959,535
R2  0.052      0.062      0.074      0.084      0.091      0.095      0.104      0.105      0.110      0.120      0.130

Panel C: High Average Local Credit Ratings
α   -1.055***  -1.451***  -1.983***  -2.083***  -2.226***  -2.289***  -2.180***  -2.137***  -2.011***  -1.848***  -1.685***
    (0.0309)   (0.0427)   (0.0537)   (0.0614)   (0.0679)   (0.0736)   (0.0725)   (0.0749)   (0.0722)   (0.0693)   (0.0668)
β   -0.283***  -0.279***  -0.368***  -0.340***  -0.326***  -0.321***  -0.290***  -0.279***  -0.266***  -0.251***  -0.226***
    (0.0142)   (0.0199)   (0.0252)   (0.0287)   (0.0323)   (0.0349)   (0.0347)   (0.0357)   (0.0345)   (0.0329)   (0.0316)
γ   0.587***   0.701***   1.053***   1.094***   1.173***   1.214***   1.143***   1.119***   1.036***   0.940***   0.805***
    (0.0231)   (0.0320)   (0.0404)   (0.0460)   (0.0511)   (0.0557)   (0.0548)   (0.0561)   (0.0541)   (0.0520)   (0.0501)
N   2,032,835  1,901,885  1,743,102  1,628,403  1,544,154  1,477,852  1,424,204  1,381,871  1,339,219  1,307,835  1,275,231
R2  0.057      0.066      0.080      0.085      0.089      0.092      0.102      0.105      0.108      0.116      0.125

Note: The table replicates the results in Panel B of Table 5 in the main text for each year in our sample.

APPENDIX TABLE A4: ROBUSTNESS TO AVERAGE INITIAL DEBT-TO-INCOME RATIOS

    2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Low Average Initial Debt-to-Income Ratio
α   -0.681***  -1.128***  -1.696***  -1.843***  -2.132***  -2.189***  -2.097***  -2.142***  -1.790***  -1.674***  -1.421***
    (0.0484)   (0.0684)   (0.0988)   (0.123)    (0.151)    (0.171)    (0.169)    (0.166)    (0.159)    (0.155)    (0.141)
β   -0.181***  -0.230***  -0.350***  -0.346***  -0.366***  -0.380***  -0.351***  -0.390***  -0.327***  -0.316***  -0.284***
    (0.0182)   (0.0262)   (0.0378)   (0.0479)   (0.0585)   (0.0661)   (0.0663)   (0.0661)   (0.0640)   (0.0616)   (0.0599)
γ   0.346***   0.513***   0.820***   0.869***   0.981***   1.002***   0.949***   1.023***   0.881***   0.874***   0.744***
    (0.0332)   (0.0469)   (0.0671)   (0.0837)   (0.103)    (0.117)    (0.115)    (0.113)    (0.109)    (0.107)    (0.0961)
N   1,550,820  1,419,702  1,246,854  1,124,143  1,033,758  960,459    900,996    861,336    821,234    786,886    756,966
R2  0.046      0.055      0.057      0.064      0.066      0.070      0.078      0.083      0.094      0.110      0.126

Panel B: Medium Average Initial Debt-to-Income Ratio
α   -0.864***  -1.353***  -2.150***  -2.556***  -2.929***  -3.101***  -3.040***  -2.928***  -2.623***  -2.196***  -1.829***
    (0.0376)   (0.0540)   (0.0766)   (0.0924)   (0.115)    (0.127)    (0.127)    (0.129)    (0.119)    (0.112)    (0.105)
β   -0.212***  -0.252***  -0.425***  -0.490***  -0.554***  -0.606***  -0.565***  -0.543***  -0.504***  -0.413***  -0.340***
    (0.0161)   (0.0226)   (0.0320)   (0.0385)   (0.0482)   (0.0533)   (0.0528)   (0.0548)   (0.0506)   (0.0484)   (0.0453)
γ   0.423***   0.585***   1.020***   1.234***   1.419***   1.508***   1.497***   1.454***   1.342***   1.142***   0.942***
    (0.0269)   (0.0386)   (0.0549)   (0.0664)   (0.0830)   (0.0917)   (0.0915)   (0.0927)   (0.0861)   (0.0808)   (0.0756)
N   1,942,792  1,785,659  1,581,595  1,436,540  1,326,941  1,244,084  1,176,586  1,129,839  1,083,324  1,044,333  1,009,370
R2  0.048      0.059      0.061      0.070      0.075      0.081      0.091      0.095      0.103      0.116      0.129

Panel C: High Average Initial Debt-to-Income Ratio
α   -1.212***  -1.964***  -2.748***  -3.115***  -3.616***  -3.754***  -3.653***  -3.622***  -3.227***  -2.849***  -2.492***
    (0.0349)   (0.0507)   (0.0697)   (0.0848)   (0.101)    (0.108)    (0.109)    (0.107)    (0.103)    (0.0954)   (0.0886)
β   -0.367***  -0.495***  -0.632***  -0.684***  -0.765***  -0.783***  -0.757***  -0.758***  -0.669***  -0.596***  -0.532***
    (0.0161)   (0.0235)   (0.0320)   (0.0389)   (0.0470)   (0.0502)   (0.0518)   (0.0512)   (0.0504)   (0.0472)   (0.0428)
γ   0.647***   0.980***   1.416***   1.617***   1.901***   1.988***   1.939***   1.947***   1.732***   1.528***   1.331***
    (0.0272)   (0.0398)   (0.0551)   (0.0670)   (0.0803)   (0.0855)   (0.0865)   (0.0851)   (0.0815)   (0.0755)   (0.0697)
N   2,266,956  2,081,788  1,856,408  1,684,084  1,559,866  1,464,142  1,390,451  1,335,694  1,281,206  1,238,246  1,197,677
R2  0.053      0.062      0.068      0.076      0.082      0.090      0.100      0.101      0.105      0.112      0.121

Note: The table replicates the results in Panel C of Table 5 in the main text for each year in our sample.
49 APPENDIX TABLE A5: ROBUSTNESS TO AVERAGE HOUSE PRICE GROWTH (2001-2005) 2002 2003 2004 2005 -1.222*** (0.0453) -0.366*** (0.0202) 0.667*** (0.0331) -2.145*** (0.0640) -0.540*** (0.0285) 1.157*** (0.0470) -2.751*** (0.0839) -0.632*** (0.0373) 1.458*** (0.0617) -3.007*** (0.0963) -0.628*** (0.0430) 1.578*** (0.0708) N R2 1,291,855 0.055 1,189,375 0.067 1,050,610 0.081 957,101 0.093 α -1.198*** (0.0417) -0.373*** (0.0204) 0.675*** (0.0307) -1.918*** (0.0634) -0.481*** (0.0298) 1.014*** (0.0468) -2.828*** (0.0818) -0.660*** (0.0386) 1.549*** (0.0616) -3.174*** (0.101) -0.670*** (0.0460) 1.728*** (0.0761) N R2 1,313,788 0.051 1,194,060 0.061 1,059,157 0.061 970,151 0.065 α -1.287*** (0.0449) -0.378*** (0.0196) 0.690*** (0.0323) -2.193*** (0.0665) -0.547*** (0.0291) 1.124*** (0.0477) -3.399*** (0.0911) -0.848*** (0.0395) 1.834*** (0.0657) -4.486*** (0.124) -1.125*** (0.0526) 2.509*** (0.0887) 1,365,724 0.043 1,237,680 0.048 1,072,853 0.048 935,547 0.051 α β γ β γ β γ N R2 2006 2007 2008 Panel A: Low Average House Price Growth -3.117*** -3.084*** -3.488*** (0.108) (0.117) (0.126) -0.561*** -0.548*** -0.658*** (0.0496) (0.0549) (0.0603) 1.577*** 1.536*** 1.851*** (0.0806) (0.0884) (0.0954) 889,277 0.098 836,682 0.102 782,313 0.110 2009 2010 2011 2012 -3.912*** (0.138) -0.808*** (0.0672) 2.163*** (0.104) -3.337*** (0.127) -0.670*** (0.0593) 1.832*** (0.0948) -2.793*** (0.119) -0.524*** (0.0565) 1.529*** (0.0886) -2.283*** (0.110) -0.440*** (0.0545) 1.228*** (0.0822) 733,016 0.108 697,421 0.116 672,838 0.126 658,547 0.140 -3.464*** (0.132) -0.760*** (0.0629) 1.931*** (0.0981) -3.116*** (0.127) -0.649*** (0.0608) 1.752*** (0.0942) -2.833*** (0.117) -0.639*** (0.0552) 1.637*** (0.0863) 729,503 0.096 701,465 0.101 673,369 0.111 654,435 0.120 -4.616*** (0.145) -1.112*** (0.0625) 2.625*** (0.106) -4.144*** (0.137) -1.008*** (0.0618) 2.385*** (0.100) -3.710*** (0.128) -0.896*** (0.0575) 2.167*** (0.0932) -3.325*** (0.121) -0.776*** (0.0541) 1.932*** (0.0880) 752,625 0.071 
717,752 0.074 690,702 0.083 651,403 0.093 Panel B: Medium Average House Price Growth -4.026*** -4.330*** -3.964*** -3.643*** (0.125) (0.138) (0.142) (0.139) -0.880*** -0.981*** -0.816*** -0.691*** (0.0584) (0.0632) (0.0656) (0.0645) 2.277*** 2.462*** 2.179*** 1.967*** (0.0949) (0.104) (0.105) (0.103) 897,573 0.069 819,222 0.076 754,658 0.091 Panel C: High Average House Price Growth -5.332*** -5.561*** -5.057*** (0.153) (0.163) (0.153) -1.335*** -1.346*** -1.252*** (0.0636) (0.0677) (0.0651) 2.982*** 3.139*** 2.861*** (0.109) (0.117) (0.111) 845,133 0.052 797,970 0.057 777,522 0.065 Note: The table replicates the results in Panel D of Table 5 in the main text for each year in our sample. 50 APPENDIX TABLE A6: MORTGAGE APPLICATIONS AND LOCAL INEQUALITY WITH STATE FE 2001 2002 2003 2004 2005 2006 2007 0.297** (0.130) 0.241** (0.104) 0.229*** (0.083) 0.242*** (0.054) 0.219*** (0.055) 0.244*** (0.057) 0.177*** (0.052) β 0.446*** (0.054) 0.374*** (0.040) 0.342*** (0.028) 0.374*** (0.026) 0.358*** (0.029) 0.351*** (0.028) γ -0.463*** (0.096) -0.364*** (0.077) -0.327*** (0.062) -0.356*** (0.039) -0.346*** (0.040) N R2 644680 0.090 647685 0.071 722326 0.050 790699 0.048 890889 0.045 2008 2009 2010 2011 0.152** (0.059) 0.115** (0.048) 0.203*** (0.054) 0.260*** (0.060) 0.314*** (0.025) 0.306*** (0.027) 0.223*** (0.026) 0.260*** (0.029) 0.286*** (0.032) -0.339*** (0.041) -0.270*** (0.038) -0.251*** (0.043) -0.191*** (0.035) -0.296*** (0.040) -0.362*** (0.045) 798332 0.046 577110 0.046 395574 0.035 371967 0.027 382851 0.035 359100 0.044 Panel A: Probability of Mortgage Application Being Rejected 𝛼 Panel B: Probability of Mortgage Being High-Interest (conditional on approval) 𝛼 0.132*** (0.042) 0.140** (0.059) 0.119* (0.062) 0.105** (0.046) 0.065 (0.041) 0.007 (0.026) 0.055** (0.026) 0.062* (0.036) β 0.241*** (0.026) 0.284*** (0.036) 0.270*** (0.038) 0.208*** (0.027) 0.169*** (0.026) 0.075*** (0.015) 0.114*** (0.014) 0.120*** (0.020) γ -0.208*** (0.031) -0.262*** (0.042) 
-0.244*** (0.045) -0.205*** (0.033) -0.165*** (0.030) -0.073*** (0.019) -0.104*** (0.019) -0.122*** (0.026) N R2 598307 0.099 644987 0.160 567623 0.123 415484 0.064 287400 0.047 283357 0.028 286764 0.042 268874 0.043 Panel C: Loan-to-Income Ratios of Mortgage Applications (conditional on approval) 𝛼 -0.696*** (0.099) -0.733*** (0.096) -0.853*** (0.096) -0.924*** (0.075) -0.664*** (0.073) -0.655*** (0.067) -0.797*** (0.068) -0.697*** (0.078) -0.728*** (0.080) -0.802*** (0.083) -0.731*** (0.083) β -0.219*** (0.050) -0.231*** (0.051) -0.265*** (0.053) -0.304*** (0.051) -0.223*** (0.046) -0.218*** (0.041) -0.240*** (0.043) -0.167*** (0.052) -0.114** (0.050) -0.138*** (0.048) -0.092** (0.047) γ 0.069 (0.073) 0.070 (0.070) 0.138* (0.070) 0.207*** (0.053) 0.032 (0.052) 0.029 (0.048) 0.118** (0.048) 0.056 (0.057) 0.025 (0.057) 0.077 (0.060) 0.053 (0.060) N R2 501296 0.289 513101 0.310 565412 0.327 598307 0.318 644987 0.308 567623 0.322 415484 0.344 287400 0.344 283357 0.365 286764 0.376 268874 0.357 Note: The table replicates the results in Table 8 using state fixed effects rather than county fixed effects. Standard errors clustered at the county level. 51 APPENDIX B: ADDITIONAL INFORMATION ON CCP DATA The Equifax FRBNY Consumer Credit Panel is a longitudinal database with detailed information on consumer debt and credit. The core of the database constitutes a 5% random sample of all U.S. individuals with credit (i.e., the primary sample). The database also contains information on all individuals with credit files residing in the same household as the individuals in the primary sample. The household members are added to the sample based on the mailing address in the existing credit files. Thus, the resulting sample is a sample of U.S. households in which at least one member has a credit file. 
The individual records in the CCP contain information on mortgage debt, credit card debt and credit card limits, home equity lines of credit, student loans, auto loans, bankruptcy, and delinquencies. The data include residential location at the census block level and the birth year of individuals. The data in the CCP are updated quarterly. We use 100% of the CCP sample.

The unit of analysis in the paper is a household. The CCP is primarily an individual-level dataset; however, it contains two identifiers that allow us to construct the household records in each period and then link the household records from period to period. In each quarter, a unique identifier is given to all individuals who reside in the same household as an individual in the primary sample. We use this identifier to aggregate the individual-level information and construct the household-level credit variables. The household identifier identifies household members only within one period. We then use the second identifier in the CCP data, an individual identifier that remains constant from period to period, to link household records from one quarter to another.

To construct the longitudinal household record, we proceed as follows. Let i denote the identification number of a household in 2001. To identify the continuation of household i in year t, t > 2001, we first determine which members of household i are present in year t using individual identifiers. Then we determine the identification number of the household to which these members belong in year t. If there is a unique household to which these members belong in year t and the new household does not have any members from any other 2001 household, we identify this new household as a continuation record for household i.

While the primary sample of individuals in the CCP is a random sample of all U.S. individuals with credit reports, the resulting sample of households is not random.
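As an illustration only, the continuation rule for household records described above can be sketched in Python. The function name, the dictionary inputs (individual identifier mapped to household identifier), and the toy identifiers are assumptions made for this sketch; they are not taken from the CCP data or from the authors' code. The sketch also assumes that individuals with no 2001 record may join a household without breaking the link, which is one reading of the rule.

```python
from collections import defaultdict


def link_households(base_year, year_t):
    """Map each base-year household id to its continuation id in year t.

    Both arguments are dicts of individual_id -> household_id (hypothetical
    inputs for this sketch; the CCP identifiers themselves are not shown).
    """
    members_base = defaultdict(set)
    for person, household in base_year.items():
        members_base[household].add(person)

    members_t = defaultdict(set)
    for person, household in year_t.items():
        members_t[household].add(person)

    links = {}
    for household, members in members_base.items():
        # Households in year t that contain members of this base-year household.
        destinations = {year_t[p] for p in members if p in year_t}
        if len(destinations) != 1:
            continue  # no member is present in year t, or the members split up
        candidate = next(iter(destinations))
        # Base-year households represented among the candidate's members;
        # people with no base-year record are ignored (an assumption here).
        origins = {base_year[p] for p in members_t[candidate] if p in base_year}
        if origins == {household}:
            links[household] = candidate
    return links


# Toy example: household 1 continues as 10 (joined by a new person "g"),
# households 2 and 3 are not linked (3 splits, and part of it merges with 2),
# and household 4 continues as 40.
base_year = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 3, "f": 4}
year_t = {"a": 10, "b": 10, "g": 10, "c": 20, "d": 20, "e": 30, "f": 40}
print(link_households(base_year, year_t))  # {1: 10, 4: 40}
```

Households that split or merge are simply left unlinked in this sketch, which matches the requirement that the continuation be unique and contain no members of another 2001 household.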
Following Lee and van der Klaauw (2010), we define the sampling weights as the inverse of the probability of being included in the sample,

$w_h = \frac{1}{1 - .05^{N}}$,

where N is the number of individuals in the household who are in the primary sample.

For each individual, the data contain a record of her debt by detailed category as well as a record of the balances on joint or cosigned accounts. In aggregating the debt to the household level, we use a correction to avoid double counting of the balances on joint accounts. This choice follows Brown, Haughwout, Lee and van der Klaauw (2011). In particular, while aggregating, we discount the total debt of the household members by 50% of the total debt on joint accounts of the household members. The exact formula that we use is

$d_{h,j} = \max\{\sum_i (d^{i}_{h,j} - .5\,d^{i,c}_{h,j}),\ .5\,d^{i,c}_{h,j}\}$,

where $d^{i}_{h,j}$ is the total debt in category j of member i in household h and $d^{i,c}_{h,j}$ is the debt in joint accounts. The second input to the maximum function addresses the situation that arises with so-called “thin” credit records, or records with at most two credit report-worthy debts. The individuals with thin records are not included in the primary sample, but they are included in the additional sample. These individuals might have records on joint accounts that are missed on individual accounts. We thank Donghoon Lee for this suggestion.

Variable Descriptions

Here we provide a short description of the variables used in the CCP analysis. For a detailed description of the CCP dataset, please see Lee and van der Klaauw (2010).

Age: We follow Brown, Haughwout, Lee, and van der Klaauw (2011) and define age as the median age of adult members of the household.

Auto debts: These are any loans taken out explicitly for the purchase of a car, including loans from banks and those from automobile financing institutions.

Bankruptcy: An indicator in the CCP, taken from public records, of whether or not an individual has filed for bankruptcy.

Credit Card Balance: The sum of reported balances across bank cards as well as retail cards. These cards reflect revolving accounts at banks, credit unions, credit card companies, and others. Importantly, the CCP does not distinguish between balances rolled over billing periods (and so potentially subject to interest charges) and cards where the balance is paid every month.

Credit Card Limits: We take the maximum of reported limits and balances across all bank and retail cards to ensure that reported utilization is not greater than one.

Credit Card Utilization Rate: This is the ratio of the credit card balance to the credit card limit.

Delinquency: Indicator for whether or not a household is at least 60 days delinquent on any of its accounts in the current quarter.

HELOC Debt: The sum of home equity lines of credit, or home equity revolving accounts. We use the classification of HELOCs vs. installment loans provided by the CCP data.

Mortgage Debt: The sum of all mortgage installment loans.

Riskscore: A variable constructed by Equifax and similar to FICO. A higher number is interpreted as a lower default risk. We construct the household riskscore by taking the average of individual riskscores within the household.

Size: Household size is the number of distinct social security numbers that can be linked by household identifiers in a specific time period. We restrict the household size to at most 10.

Student Loans: These include loans financing education from private and public institutions.

Total debt: Constructed as the sum of mortgage debt balance, credit card balances, auto debts, balance on home equity lines of credit, and student loans.

APPENDIX C: DECOMPOSING U.S. INEQUALITY SINCE 1970

The decomposition is constructed using the following IPUMS samples: the 1970, 1980, and 1990 1% metro samples and the 2000 1% unweighted sample.
Within each of these samples we use the metro area geographies defined by IPUMS in the following way: “Metropolitan areas are counties or combinations of counties centering on a substantial urban area. METAREA identifies the metropolitan area where the household was enumerated, if that metropolitan area was large enough to meet confidentiality requirements.” We restrict the sample to the set of metro areas that can be identified in each year, which gives 117 metro areas containing roughly 60% of the entire sample within each year. We also restrict the sample to households where the respondent’s age is between 25 and 65 and the respondent is the head of the household or the spouse of the head of the household. These restrictions are not important for the results. To calculate income we use family total income. While not exactly the same as household income, it is available for all years, whereas household income is not available in 1970.

We estimate the following model of log family income for each year of the sample:

$\log(y_{ia}) = \alpha_a + \epsilon_i$

Estimating this function gives estimates of the variance of the fixed effects and the variance of the residuals for each year. We then calculate the share of variance explained by the variance of the fixed effects as:

$Share = \frac{\hat{\sigma}_a^2}{\hat{\sigma}_a^2 + \hat{\sigma}_i^2}$

APPENDIX FIGURE C1: DECOMPOSING AGGREGATE U.S. INEQUALITY
Note: The left-hand figure plots the ratio of “between” variance of mean incomes to the total variance of incomes. The right-hand figure plots the standard deviation of log income across all households.

APPENDIX 3: TIME VARIATION IN LOCAL INEQUALITY RATES

To get a sense of how inequality within counties has varied across time, we computed Gini coefficients at the county level using 1970 and 2000 Census aggregates available from ICPSR. To compute the Gini coefficient we follow the same procedure outlined in the Appendix and reproduced below.
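The binned Gini computation that this appendix defines can be illustrated with a minimal Python sketch. The function names and the example bins are hypothetical and are not taken from the authors' code. The sketch assumes contiguous bins described by their lower bounds, so that a bin's midpoint is the average of its lower bound and the next bin's lower bound, and it places the open-ended top bin at its lower boundary plus the width of the preceding bin.

```python
def bin_values(lower_bounds):
    """Representative income for each bin, given ascending lower bounds.

    Assumption of this sketch: bins are contiguous, so the midpoint of a bin
    is the average of its lower bound and the next bin's lower bound.
    Requires at least two bins.
    """
    values = [(lo + hi) / 2 for lo, hi in zip(lower_bounds, lower_bounds[1:])]
    # Open-ended top bin: lower boundary plus the width of the previous bin.
    last, prev = lower_bounds[-1], lower_bounds[-2]
    values.append(last + (last - prev))
    return values


def binned_gini(shares, values):
    """G = 1 - sum_i f(y_i) (S_{i-1} + S_i) / S_n, with S_i = sum_{j<=i} f(y_j) y_j."""
    pairs = sorted(zip(values, shares))        # order bins so incomes ascend
    total = sum(share for _, share in pairs)   # normalise so that f sums to one
    s_prev = 0.0                               # S_{i-1}, starting from S_0 = 0
    numerator = 0.0
    for y, share in pairs:
        f = share / total
        s_i = s_prev + f * y
        numerator += f * (s_prev + s_i)
        s_prev = s_i
    return 1.0 - numerator / s_prev            # s_prev now holds S_n


# Hypothetical location with three bins: $0-149,999, $150,000-199,999, $200,000+.
values = bin_values([0, 150000, 200000])       # top bin is assigned $250,000
print(binned_gini([0.5, 0.3, 0.2], values))
```

Because the top bin is capped at a finite value, a sketch like this will tend to understate inequality where many households fall in the open-ended bin.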
Because the number of bins used to compute the coefficient is not the same in both years (1970 has fewer bins), the levels of the Gini coefficients are not directly comparable. Using the Census data we match 3,122 counties.

Let $f(y_i)$ be a discrete probability function where $i = 1, \dots, n$ and $y_i < y_{i+1}$. Then the Gini coefficient G is defined as

$G = 1 - \frac{\sum_{i=1}^{n} f(y_i)(S_{i-1} + S_i)}{S_n}$

where $S_i = \sum_{j=1}^{i} f(y_j) y_j$ and $S_0 = 0$. We approximate the discrete probability function with the share of a location’s population within each bin reported by the Census. For all bins but the last we assume all the mass is distributed at the midpoint of the bin. For the very last bin we add the last increment to the lower boundary. For example, if the last bin is incomes of $200,000 and up and the bin before was $150,000 to $199,999, we assign the last bin the value $250,000. This assumption limits the impact the very top bin will have on the coefficient, but should provide a reasonable approximation of inequality at low levels of aggregation.

The figure reported below shows a high degree of correlation between inequality in 1970 and inequality in 2000. The R-squared is 0.26 and the Spearman correlation is 0.52, suggesting inequality is quite persistent.

APPENDIX FIGURE C3: PERSISTENCE OF LOCAL INEQUALITY
Note: The figure plots Gini coefficients for income inequality in U.S. counties in 1970 versus 2000.

APPENDIX D: SUMMARY STATISTICS FROM HMDA DATA

Table 1 in this appendix provides summary statistics from the 15% HMDA samples. We report the fraction of applications denied, originated, for owner-occupied properties, high interest, the race of the primary applicant, and the regulator of the lender. When using the HMDA data it is important to recognize that changes in reporting requirements from 2003 to 2004 had significant effects on the coverage of the mortgage market and so on the statistics we calculate.
This can be seen clearly when comparing the change in the racial composition of applicants from 2003 to 2004. While some of this might reflect real shifts in the provision of credit to nonwhite groups, it also reflects the increased coverage of rural areas and smaller, non-bank lenders. This can also be seen in the large increase in applications filed at lenders regulated by HUD. While mortgage company activity was almost certainly increasing over this period, many lenders were simply not reporting in the HMDA data.

The health of the mortgage market can be traced out by changes in the sample size. The number of applications reported peaked in 2007 and then declined steadily until 2011. Interestingly, the fraction of loans with high interest rates has also declined sharply, probably reflecting fewer loans with junior liens. Notice that the mean applicant income reported in the HMDA data is substantially higher than the average household income reported in the SCF data and the imputed CCP data. However, average income is comparable to the average income of homeowners as reported in the 2007 SCF, which is about $99,500.

Table 2 provides some sample correlations from 2007, most of which are qualitatively similar to other years. Owner-occupied applications are less likely to be denied, while applications with high LTI ratios are more likely to be denied. Applicants applying to HUD-regulated lenders are more likely to be denied, which could reflect the stress of mortgage companies in this period or an increased likelihood that the applicant is subprime. Applicants to HUD lenders tend to have smaller incomes and higher LTI ratios.
APPENDIX TABLE D1: SUMMARY STATISTICS FROM HMDA

Year  Denied  Originated  OOC   LTI   sd    Loan    sd      Income  sd     High Int
2001  0.15    0.78        0.94  2.31  0.88  140.16  96.03   64.84   47.46  -
2002  0.13    0.79        0.93  2.43  0.94  154.40  104.30  68.46   49.75  -
2003  0.13    0.78        0.92  2.58  1.03  168.24  111.90  70.72   50.95  -
2004  0.15    0.76        0.90  2.65  1.08  193.11  147.30  78.13   63.29  0.08
2005  0.16    0.72        0.88  2.67  1.08  212.85  165.15  85.41   70.48  0.16
2006  0.18    0.71        0.90  2.63  1.04  223.00  173.16  91.21   76.46  0.16
2007  0.18    0.72        0.91  2.72  1.11  226.41  180.86  91.01   81.55  0.08
2008  0.17    0.73        0.92  2.72  1.10  207.03  155.68  84.15   73.44  0.06
2009  0.14    0.76        0.94  2.81  1.12  198.34  141.21  78.02   65.42  0.04
2010  0.15    0.75        0.94  2.79  1.12  203.31  148.88  80.84   68.73  0.02
2011  0.15    0.75        0.93  2.70  1.10  200.69  151.88  82.38   71.28  0.03

Year  White  Black  OCC   FRS   FDIC  OTS   NCUA  HUD   N
2001  0.89   0.08   0.28  0.11  0.09  0.11  0.02  0.39  644680
2002  0.88   0.07   0.27  0.18  0.08  0.10  0.02  0.36  647685
2003  0.88   0.08   0.26  0.18  0.07  0.09  0.02  0.38  722326
2004  0.74   0.08   0.23  0.15  0.07  0.08  0.02  0.45  790699
2005  0.71   0.10   0.20  0.16  0.06  0.09  0.02  0.47  890889
2006  0.69   0.11   0.23  0.16  0.06  0.08  0.02  0.45  798332
2007  0.73   0.10   0.32  0.17  0.06  0.10  0.03  0.33  577110
2008  0.76   0.08   0.32  0.09  0.09  0.08  0.05  0.36  395574
2009  0.76   0.07   0.29  0.09  0.11  0.06  0.04  0.40  371967
2010  0.76   0.07   0.31  0.08  0.11  0.05  0.04  0.41  382851
2011  0.77   0.07   0.06  0.04  0.09  0.00  0.04  0.43  359100

Note: The table provides sample means for all variables and standard deviations for continuous variables for all years of the HMDA data under the sample restrictions identified in the text. Denied gives the probability that an application was formally denied while originated gives the probability a loan was approved and the funds disbursed to the borrower. OOC indicates that the application is for an owner-occupied home. LTI is the loan-to-income ratio on the application constructed from the application’s stated loan and income. High Int indicates if a loan was ultimately originated as a high interest loan. White and black both refer to the race of the primary applicant.
OCC indicates a loan filed at a lender regulated by the Office of the Comptroller of the Currency. Similarly, FRS indicates a lender regulated by the Federal Reserve System, OTS regulated by the Office of Thrift Supervision, NCUA the National Credit Union Administration, and HUD the Department of Housing and Urban Development. 57 APPENDIX TABLE D2: SAMPLE CORRELATIONS FROM 2007 HMDA Denied Denied Originated OOC LTI Loan Inc White Black 1.000 Originated OOC LTI Loan Inc -0.762*** -0.0192*** 0.053*** 0.001 -0.028*** 1.000 0.021*** -0.060*** -0.020*** 0.014*** 1.000 0.200*** -0.0308*** -0.169*** 1.000 0.208*** -0.238*** 1.000 0.815*** White Black OCC FRS FDIC -0.145*** 0.116*** -0.066*** 0.051*** -0.044*** 0.146*** -0.113*** 0.120*** -0.070*** 0.045*** -0.0105*** 0.007*** -0.005*** -0.002 -0.031*** -0.116*** 0.050*** -0.012*** -0.022*** -0.031*** -0.033*** -0.053*** 0.056*** -0.023*** -0.060*** 0.034*** -0.074*** 0.063*** -0.011*** -0.041*** 1.000 -0.545*** 0.006*** 0.001 0.078*** 1.000 -0.025*** 0.004** -0.037*** OTS NCUA HUD N 0.0547*** -0.025*** 0.022*** 577110 -0.009*** 0.008*** -0.084*** -0.022*** 0.029*** 0.026*** -0.003* -0.004** 0.048*** 0.081*** -0.042*** -0.042*** 0.070*** -0.040*** -0.062*** -0.027*** 0.039*** -0.044*** 0.006*** -0.020*** 0.044*** 1.000 Note: The table provides correlations for all years of the HMDA data under the sample restrictions identified in the text. Denied gives the probability that an application was formally denied while originated gives the probability a loan was approved and the funds disbursed to the borrower. OOC indicates that the application is for an owner-occupied home. LTI is the loan-to-income ratio on the application constructed from the application’s stated loan and income. High Int indicates if a loan was ultimately originated as a high interest loan. White and black both refer to the race of the primary applicant. OCC indicates a loan filed at a lender regulated by the Office of the Comptroller of the Currency. 
Similarly, FRS indicates a lender regulated by the Federal Reserve System, OTS regulated by the Office of Thrift Supervision, NCUA the National Credit Union Administration, and HUD the Department of Housing and Urban Development.

APPENDIX E: INCOME AND DEFAULT

We use the CCP data to verify our assumption about the probability of default conditional on income. In particular, we estimate a linear probability model of the probability of default as a function of household income. The dependent variable takes the value 1 if any member of the household in year t is 60 days past due or longer on any account (mortgage, auto loan, credit card, etc.). The explanatory variable of interest is the (log of the) household income in year 2001 (using the expected imputed income).

We first estimate a parsimonious specification with only the income measure. We then estimate a specification with the measure of income and the full set of household and regional controls. The household-level controls are the following variables measured in 2001: dummies for the age of the head of household and for the size of the household; the amount of mortgage, auto loan, credit card balance, credit card limit, HELOC, and student loan debt; dummies for bankruptcy and for 60 DPD or longer; and the risk score. The regional-level controls are the following zip code-level variables measured in 2001: income inequality, the median of total household debt, the median of household mortgage debt, house price growth between 2001 and year t, and the ratio of the median house price to the median income, together with county-level fixed effects.

In the estimation, the standard errors are clustered by zip code. We use a linear probability model since the mean of the dependent variable is in the range 0.25-0.30. The equation is estimated for each year from 2002 to 2012 for the sample of households used in the benchmark regression of our analysis (i.e., the households that do not change location between year 2001 and year t). We report results in Appendix Table E1.
We find that higher income households and households with higher income ranks have lower probability of default. 59 Appendix Table E1. Income and default. 2002 2003 2004 2005 rank -0.325*** (0.00178) -0.279*** (0.00179) -0.289*** (0.00180) -0.261*** (0.00173) N R2 6,172,512 0.022 5,676,766 0.017 5,039,109 0.018 rank -0.324*** (0.00179) -0.278*** (0.00180) N R2 6,172,512 0.052 rank 2009 2010 2011 2012 Panel A: No Controls -0.247*** -0.224*** -0.212*** (0.00170) (0.00169) (0.00167) -0.194*** (0.00168) -0.186*** (0.00169) -0.181*** (0.00169) -0.187*** (0.00172) 4,570,211 0.015 4,218,948 0.013 3,731,267 0.010 3,581,280 0.008 3,433,201 0.008 3,310,773 0.007 3,197,351 0.008 -0.288*** (0.00182) -0.261*** (0.00174) Panel B: County Fixed Effects -0.247*** -0.224*** -0.213*** (0.00169) (0.00167) (0.00165) -0.196*** (0.00165) -0.188*** (0.00165) -0.184*** (0.00165) -0.190*** (0.00168) 5,676,766 0.046 5,039,109 0.050 4,570,211 0.047 4,218,948 0.045 3,581,280 0.034 3,433,201 0.033 3,310,773 0.032 3,197,351 0.033 -0.0303*** (0.00152) -0.0374*** (0.00171) -0.0398*** (0.00189) Panel C: Household-specific Characteristics and County Fixed Effects -0.0448*** -0.0470*** -0.0448*** -0.0348*** -0.0261*** (0.00202) (0.00213) (0.00223) (0.00233) (0.00245) -0.0235*** (0.00250) -0.0189*** (0.00244) -0.0120*** (0.00253) N R2 4,195,007 0.460 3,836,566 0.359 3,380,052 0.326 3,047,381 0.274 2,803,886 0.244 2,619,591 0.213 2,470,908 0.187 2,367,350 0.177 2,265,545 0.171 2,182,951 0.161 2,105,700 0.159 ln(y) -0.159*** (0.000625) -0.144*** (0.000626) -0.152*** (0.000644) -0.143*** (0.000638) Panel D: No Controls -0.139*** -0.129*** (0.000636) (0.000639) -0.122*** (0.000647) -0.113*** (0.000665) -0.109*** (0.000684) -0.106*** (0.000691) -0.109*** (0.000711) N R2 6,172,512 0.040 5,676,766 0.033 5,039,109 0.036 4,570,211 0.033 4,218,948 0.031 3,731,267 0.025 3,581,280 0.020 3,433,201 0.019 3,310,773 0.018 3,197,351 0.019 ln(y) -0.145*** (0.000669) -0.129*** (0.000667) -0.135*** (0.000681) -0.126*** 
(0.000660) Panel E: County Fixed Effects -0.121*** -0.113*** -0.109*** (0.000647) (0.000638) (0.000635) -0.101*** (0.000647) -0.0980*** (0.000647) -0.0958*** (0.000646) -0.0993*** (0.000657) N R2 6,172,512 0.062 5,676,766 0.055 5,039,109 0.059 4,570,211 0.056 3,581,280 0.041 3,433,201 0.040 3,310,773 0.038 3,197,351 0.039 ln(y) -0.010*** (0.000606) -0.012*** (0.000683) Panel F: Household-specific Characteristics and County Fixed Effects -0.013*** -0.016*** -0.017*** -0.015*** -0.012*** -0.009*** (0.000752) (0.000811) (0.000857) (0.000904) (0.000949) (0.000995) -0.008*** (0.00102) -0.006*** (0.000994) -0.004*** (0.00103) N R2 4,195,007 0.460 3,836,566 0.359 3,380,052 0.326 2,265,545 0.171 2,182,951 0.161 2,105,700 0.159 3,047,381 0.274 2006 4,218,948 0.054 2,803,886 0.244 2007 3,950,618 0.011 3,950,618 0.040 3,950,618 0.027 3,950,618 0.049 2,619,591 0.213 2008 3,731,267 0.037 3,731,267 0.045 2,470,908 0.187 2,367,350 0.177
Notes: The table reports estimated coefficients on income rank (Panels A-C) and log income (Panels D-F) in the linear regression where the dependent variable is a dummy variable equal to one if a household defaults in a given year and zero otherwise. Standard errors (clustered by zip code) are reported in parentheses. ***, **, * denote statistical significance at 1%, 5% and 10%.

APPENDIX F: IMPUTATION OF INCOME

In the first step of our work, we estimate the relationship between income and observables in the SCF and then use this relationship to impute income in the CCP. In this appendix, we describe how variables are constructed and what specification is estimated.

In the table below, we describe how variables are constructed in the CCP and the SCF. We use only variables that are available in both the CCP and the SCF. While there are some differences in the definitions across datasets, we made every effort to make them as comparable as possible.
Variable                SCF                                                             Counterpart in CCP
Auto loans              X2218 + X2318 + X2418 + X7169 + X2424 + X2507 + X2607           Auto loan bank and auto loan finance balance
Bankruptcy flag         X6772                                                           Chapter 7 or Chapter 13 bankruptcy flag
Credit Card Limit [19]  X414                                                            Bank card + retail card high credit
Credit Card Balance     X413 + X427 + X421 + X424 + X430                                Bank card + retail card balance
Delinquency flag        X3005                                                           A flag if any account is 60 DPD or more
HELOC Balance           X1108 + X1119 + X1130 + X1136                                   Home equity revolving balance
Income                  X5729                                                           None
Mortgage Debt           X805 + X905 + X1005 + X1715 + X1815 + X1915 + X2006 + X2016     First mortgage balance + home equity installment balance
Student Loans           X7824 + X7847 + X7870 + X7924 + X7947 + X7970                   Student loans balance

[19] We code responses of “no limit” in the SCF as 1,000,000.

We also use household size and head of household age. The CCP does not include racial identifiers, so we do not use these. In our imputation, we use all of the SCF replicates, which are discussed in detail by Kennickell (1998). Because the SCF intentionally oversamples wealthy households, we apply the SCF-computed weights X42001. Note that we take the natural log of one plus the level for all continuous variables to make the distribution of these variables better behaved and to avoid dropping observations with zero values. We also restrict the sample to households where the head’s age was between 20 and 65. We dropped outliers using Cook’s distance.

As discussed in the text, our regression has the general form

$\log(Y_{i,SCF}) = f(\beta X_{i,SCF}) + \epsilon_{i,SCF}$.

In choosing the specific form of f, we aimed to capture as much of the joint distribution of the observables and income as we could with a flexible assumption. Terms were added if it was found that they were meaningful predictors of log income. The function f was composed of

1. Third-order Chebyshev polynomials of mortgage, auto, and credit card limits,
2. Credit card, HELOC, and student loan balances,
3. Nine age bins in five-year intervals,
4. Interactions of all age bins with each type of debt balance,
5. Household size and interactions of household size with debt balances and age bins,
6. Indicators for bankruptcy and delinquency and interactions of these indicators with other indicators,
7. Indicators for positive credit card limit and interactions of this variable with various variables,
8. Interactions of household size, age, and debt levels.

Table 2 shows that, using data from 2001, the aggregate income statistics computed directly from the SCF match those we impute in the CCP very closely.
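To make the first ingredient of f more concrete, the following Python sketch builds third-order Chebyshev terms from the log of one plus a debt level. It is an illustration under stated assumptions, not the authors' specification: the function names are hypothetical, the polynomials are taken to be Chebyshev polynomials of the first kind, and the rescaling of the logged variable to [-1, 1] with an upper bound of 1,000,000 is chosen here only for convenience. The regression step itself is not shown.

```python
import math


def chebyshev_terms(x):
    """First-, second- and third-order Chebyshev terms (first kind) at x."""
    return [x, 2 * x * x - 1, 4 * x ** 3 - 3 * x]


def log_debt_features(level, upper=1_000_000.0):
    """Chebyshev terms of log(1 + level), rescaled to [-1, 1].

    `upper` is a hypothetical cap used only to rescale the logged variable;
    levels above it map outside [-1, 1].
    """
    z = math.log1p(level)                  # log of one plus the level keeps zeros
    x = 2 * z / math.log1p(upper) - 1      # rescaling is an assumption of this sketch
    return chebyshev_terms(x)


# A household with no mortgage and one with a $120,000 mortgage.
print(log_debt_features(0.0))
print(log_debt_features(120_000.0))
```

In a full imputation these terms would enter a weighted regression of log income alongside the age bins, indicators, and interactions listed above, and the fitted relationship would then be applied to the same variables in the CCP.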