Working Paper Series

Does Greater Inequality Lead to More
Household Borrowing?
New Evidence from Household Data

WP 14-01

Olivier Coibion
UT Austin and NBER
Yuriy Gorodnichenko
UC Berkeley and NBER
Marianna Kudlyak
Federal Reserve Bank of Richmond
John Mondragon
UC Berkeley

This paper can be downloaded without charge from:
http://www.richmondfed.org/publications/

Working Paper No. 14-01
This Draft: January 10th, 2014

Abstract: One suggested hypothesis for the dramatic rise in household borrowing that
preceded the financial crisis is that low-income households increased their demand for
credit to finance higher consumption expenditures in order to “keep up” with higher-income households. Using household-level data on debt accumulation during 2001-2012,
we show that low-income households in high-inequality regions accumulated less debt
relative to income than their counterparts in lower-inequality regions, which negates the
hypothesis. We argue instead that these patterns are consistent with supply-side
interpretations of debt accumulation patterns during the 2000s. We present a model in
which banks use applicants’ incomes, combined with local income inequality, to infer the
underlying type of the applicant, so that banks ultimately channel more credit toward
lower-income applicants in low-inequality regions than high-inequality regions. We
confirm the predictions of the model using data on individual mortgage applications in
high- and low-inequality regions over this time period.

JEL: E21, E51, D14, G21
Keywords: inequality, household debt, Great Recession
We are grateful to Meta Brown and Donghoon Lee for helpful comments about the data, and seminar
participants at the Richmond Fed, St. Louis Fed, and CES-Ifo conference. The views expressed here are
those of the authors and do not reflect those of the Federal Reserve Bank of Richmond or the Federal
Reserve System or any other institution with which the authors are affiliated. Mondragon thanks the
Richmond Fed for their generous support while part of this paper was written. Gorodnichenko thanks the
NSF and Sloan Foundation for financial support.

1 Introduction
The financial crisis of 2008-09 was preceded by an exceptional rise in borrowing by U.S. households,
accounted for primarily by a rise in mortgage debt. This increasing mortgage debt was securitized and
ultimately played a key role in bringing down the financial system once housing prices began to decline
and the associated mortgage-backed securities fell sharply in value. Why did households take on so much
new debt in the years immediately preceding the financial crisis?
There are two main views about this process. The first view is that the rise in borrowing reflected
“credit supply” factors. Proponents point to progress in information technology (Sanchez 2009; Athreya,
Tam and Young 2012) and the rising financialization of debt (especially mortgages) as increasing
the supply of credit to households, with a disproportionately large increase in credit to low-income and
high-risk households (Drozd and Serrano-Padial 2013). Others also point to political motivations for
expanding credit supply. For example, Rajan (2010) argues that, in response to rising income inequality,
credit was made increasingly available to lower income groups to support their consumption levels in the
face of stagnant incomes.
According to the second view (“demand for credit”), there was a rise in the demand for
borrowing on the part of U.S. households, especially low-income households. One motivation for such a
rise in demand for borrowing again stems from rising inequality in the U.S. Specifically, rising
consumption on the part of wealthy households could have generated a rise in the demand for borrowing
on the part of lower-income households in their attempts to “keep up” with their wealthier neighbors, the
so-called “keeping up with the Joneses” effect. Indeed, there is a positive correlation between income
inequality in the U.S. (income share of the top 5%) and household debt relative to GDP (Figure 1) over
time. Both were stable from 1967 to around 1980, then both measures rose gradually over the course of
the 1980s as noted in Iacoviello (2008). But while income inequality then went up sharply in the early
1990s, household debt only caught up over the 2000s. The correlation is certainly consistent with the
possibility of a causal relationship running from inequality to household borrowing.
In this paper, we focus specifically on the link between inequality and household borrowing. In
particular, we investigate whether borrowing patterns on the part of low-, middle- and high-income
households differed depending on the level of local income inequality (where we define “local” as
ranging from as fine a geographic level as the zip code to as aggregated a level as the state). Local
inequality is, from a household’s point of view, likely to be the most relevant metric for “keeping up with
the Joneses”. Furthermore, with most of the rise in income inequality in the U.S. since the 1970s reflecting a
rise in inequality within regions rather than inequality across regions, any sensitivity of borrowing to local
inequality levels could readily have translated into aggregate effects.

To assess whether borrowing patterns differed depending on local inequality levels, we study the
changes in debt to income ratios at the household level over the course of the 2000s and their relationship
with households’ relative standings in the income distribution and the amount of local income inequality.
We use unique data from the New York Federal Reserve Bank Consumer Credit Panel/Equifax (CCP)
which provides comprehensive debt measures for millions of U.S. households since 1999, including
detailed decompositions of debt by type (e.g., mortgage, auto, and credit card debt). Because this dataset does
not include a measure of household income, we use the relationship between household debt and income,
conditional on observable household characteristics, in the Survey of Consumer Finances to predict initial
household income in 2001. This imputation allows us to study the relationship between income and debt
in unprecedented detail. We then characterize the evolution of household debt levels, relative to initial
income levels, across income groups in areas with different levels of income inequality, which is akin to a
“difference-in-differences” approach across income groups and regional inequality levels.
Our main finding is that high-income households in high-inequality regions accumulated more
debt relative to their incomes than did low-income households in the same regions, or equivalently that
low-income households in high-inequality regions borrowed relatively less than similar households in
low-inequality regions. This effect is precisely the opposite of what one would have expected from
“keeping up with the Joneses” driving the rise in household debt during the 2000s. We show that this
result is remarkably robust and holds up to an extensive array of robustness checks: e.g. we find these
patterns within households with low or high credit scores, within regions which experienced either high
or low home price appreciation, within households with either low or high initial debt levels, etc. We
measure inequality at the zip code, county, and state levels and find similar results across levels of aggregation.
The fact that the baseline results are robust to controlling for a wide range of other local factors that are
correlated with inequality levels suggests that it is indeed the level of inequality that matters rather than
inequality being a stand-in for other economic channels.
Because our data provide disaggregated information on household debt, we assess the link
between local inequality and different forms of debt: mortgage debt, auto debt, and credit card debt. We
find strong evidence that low-income households in high-inequality regions borrowed less in terms of
both mortgage and auto debt than those in low-inequality regions. A unique feature of the data is that we
have information on both credit card balances and credit card limits. This is particularly useful
because the latter can be interpreted as largely representing credit supply whereas the former primarily
reflects the demand for credit. We find that low-income households in high-inequality regions saw their
credit limits rise by less than those in lower inequality regions as was the case with mortgage and auto
debt. At the same time, no economically significant heterogeneity is observed in terms of credit card
balances. We interpret this contrast as pointing to supply side factors as being at the root of the
differential debt accumulation patterns that we observe in the data.
To illustrate how supply-side factors can explain the differential borrowing behavior tied to
regional inequality, we present a model in which each region is composed of two types of households.
High-type households have higher income on average than low-type households and are also less likely to
(exogenously) default on debt. A continuum of banks in each region lends to these households but banks
do not observe households’ types, only their income and another signal correlated with the underlying
type. As income inequality rises, banks treat an applicant’s income as an increasingly precise signal about
their type and therefore target lending toward higher income households on average. How they do so,
however, can vary with the local banking structure. For example, if banks are perfectly competitive and
can charge different interest rates to different applicants, then higher-income applicants will on average
face lower interest rates than low-income applicants, and this difference will be increasing in the amount
of local income inequality. If instead we model the banking system as being monopolistic and forced to
charge a common interest rate to all applicants, then this bank will reject low-income applicants more
frequently than high-income applicants, and this difference will again be increasing in the amount of local
inequality. In both cases, banks will make credit more readily accessible (or cheaper) to high-income
households when local inequality is higher because the latter implies that income is a more precise signal
of applicant types.
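The signal-extraction logic in this paragraph can be illustrated with a minimal sketch. It is not the model presented in Section 4: it assumes, purely for illustration, that log income is normally distributed within each type with a common standard deviation, that both types are equally common, and that higher inequality means a larger gap between the two type means. The function names and all parameter values are hypothetical.

```python
import math


def posterior_high_type(log_income, mean_low, mean_high, sd=1.0, prior_high=0.5):
    """P(high type | log income) by Bayes' rule with normal likelihoods (an assumption)."""
    def normal_pdf(x, mu):
        z = (x - mu) / sd
        return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

    weight_high = normal_pdf(log_income, mean_high) * prior_high
    weight_low = normal_pdf(log_income, mean_low) * (1.0 - prior_high)
    return weight_high / (weight_high + weight_low)


def posterior_gap(mean_gap):
    """Difference in the inferred probability of being a high type between an
    applicant one log point above the centre and one log point below it."""
    mean_low, mean_high = -mean_gap / 2.0, mean_gap / 2.0
    richer = posterior_high_type(1.0, mean_low, mean_high)
    poorer = posterior_high_type(-1.0, mean_low, mean_high)
    return richer - poorer


low_inequality_gap = posterior_gap(0.5)   # type means close together
high_inequality_gap = posterior_gap(2.0)  # type means far apart
```

Under these assumptions the same income difference between two applicants moves the lender's beliefs much more when the type means are far apart, which is the sense in which income becomes a more precise signal as inequality rises.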
The credit supply mechanism in the model has some testable implications. If banks use individual
incomes combined with regional inequality as a signal about individuals’ types, then we would expect to
see richer households be denied less often when applying for mortgages in high-inequality regions than in
low-inequality regions, holding other characteristics constant. Similarly, one would expect richer
households in high-inequality regions to be less likely to pay higher interest rates on a loan.
We test these theoretical predictions using detailed mortgage application information from the
publicly available Home Mortgage Disclosure Act data (HMDA). These data track mortgage applications as
they go through the origination process and contain information on applicants (including their income, the
amount of the loan requested, their locale, and whether the loan is denied or originated). We document that
high-income households in high-inequality regions were less likely to be denied than their counterparts in
low-inequality regions, precisely as suggested by the theory. High-income households in high-inequality
regions were also less likely to be charged higher interest rates for their mortgages than equivalent
households in low-inequality regions. Thus, both theoretical predictions from the model are confirmed in the
data.
In summary, we document a systematic relationship between local inequality and differential
borrowing patterns across richer and poorer households in the U.S. that contradicts predictions based on
“keeping up with the Joneses” motives. We argue that these results can instead be explained through an
information channel: applicants’ incomes are a stronger signal of their underlying quality when local
inequality is high so banks are likely to channel relatively more credit to low-income applicants when the
level of local inequality is low. These results have implications for interpreting the sources of the
dramatic rise in borrowing by households during the housing boom, indicating that the source was more
likely to stem from an expansion in credit supply than credit demand.
This paper is most closely related to recent work evaluating the strength of “keeping up with the
Joneses” forces. Most notably, Bertrand and Morse (2013) study whether rising consumption of the rich
induces the non-rich to consume more.1 Using the Consumer Expenditure Survey (CES), they find that,
within a state, the consumption of the rich (the top quintile of the income distribution) predicts higher
consumption for the nonrich, holding everything else constant including own income. Bertrand and Morse
interpret their estimates as supporting the view that rising income inequality in a geographic market
translates into more demand for credit by low- and middle-income households (see, for example, Rajan
2010). In contrast, by focusing explicitly on the borrowing decisions of households and exploiting a finer
level of geographic variation, we document that low- and middle-income households living in high-inequality
regions borrowed no more, and in fact less, than similar households in low-inequality regions.
This need not be interpreted as contradicting the empirical results of Bertrand and Morse (2013), since the
differences in consumption that they document could have been financed through channels other than
debt, e.g. through increased labor force participation, longer working hours, etc. But our results indicate
that “keeping up with the Joneses” forces are unlikely to have played a primary role in accounting for the
dramatic rise in household leverage during the 2000s and therefore in laying the groundwork for the
financial crisis of 2008-2009.
This paper therefore also relates to a broader line of research investigating the macroeconomic
consequences of income inequality, such as whether it is systematically related to financial crises.
Kumhof, Ranciere and Winant (2013), for example, argue that a rise in inequality driven by an increase in
the share of income going to those at the top of the income distribution induces the latter to save more,
lowering interest rates and inducing poorer households to borrow more, ultimately leading to more
financial fragility and a higher likelihood of a financial crisis. Bordo and Meissner (2012) find little
evidence of such a link based on aggregate data since 1920 for fourteen advanced economies, whereas
Perugini, Holscher and Collier (2013) find a positive link between income inequality and private sector
indebtedness since 1970 across eighteen economies. We contribute to this literature by documenting how,
within U.S. regions, debt accumulation patterns across different segments of the population over the
course of the 2000s were systematically related to local levels of income inequality. We also provide a
novel interpretation for these effects: local income inequality can be used in combination with an
applicant’s income level to refine inference about borrower types. In such a setting, higher levels of
income inequality will induce banks to reallocate credit toward higher income applicants and away from
lower income applicants, thereby potentially amplifying the implications of a more unequal income
distribution for the distribution of consumption.

[Footnote 1: Prior evidence in the same spirit as Bertrand and Morse (2013) includes Neumark and Postlewaite (1998), Zizzo
and Oswald (2001), Christen and Morgan (2005), Luttmer (2005), Daly and Wilson (2006), Maurer and Meier
(2008), Charles, Hurst and Roussanov (2009), Kuhn et al. (2010), Heffetz (2011), and Guven and Sorensen (2012).]

The relationship between income inequality and the allocation of credit emphasized in our paper
also relates to the literature on consumption and income inequality. Krueger and Perri (2006) and related
works argue that consumption inequality during the last decades did not rise with income inequality. 2
Krueger and Perri argue that low-income households have experienced income shocks that increased
income inequality, but due to enhanced financial intermediation these households have been able to
smooth their consumption such that consumption inequality remained stable. Iacoviello (2008) replicates
the trend and cyclicality of household debt since the 1960s and also argues that increased access to credit
has allowed households to smooth increasingly volatile income processes. As income inequality increases,
households use credit markets to smooth temporary income shocks, so that the aggregate level of debt
increases with inequality. In contrast, Aguiar and Bils (2012) argue that, when one corrects for
measurement errors associated with underreporting of consumption expenditures over time and across
different goods, consumption inequality has tracked income inequality closely over the last three decades.
While this line of research appeals to financial intermediation as a key link between consumption and
income inequality, it could not measure directly the quantitative importance of formal borrowing for
smoothing shocks and its relation to inequality due to data constraints. We examine this issue directly
using household-level data on debt accumulation. Our results are more consistent with the findings in Aguiar
and Bils (2012) than with those of Krueger and Perri (2006): if low-income households were smoothing shocks
to the extent suggested by Krueger and Perri, then we would expect low-income households to have accumulated
relatively more debt in areas where inequality is higher, which is the opposite of what we find.
We also contribute to the vast literature on household borrowing that covers such diverse topics
as pricing of mortgages, optimal portfolios of household debt, risk scoring, and determinants of default
probabilities. Our paper is most related to studies of default determinants (e.g., Fay, Hurst, and White
2002, Gross and Souleles 2002) and lenders’ treatment of loan applications (e.g., Tootell 1996, Munnell
et al. 1996, Turner and Skidmore 1999) in the sense that we attempt to understand who obtains credit and
at what terms. However, while previous research studies these aspects for borrowers (or lenders) without
relating a given individual to the pool of borrowers, we explicitly focus on how the relative positions of
borrowers in the income distribution as well as the properties of the income distribution can affect the
level of debt they accumulate. Thus, in contrast to the previous literature, we examine directly the
interplay between debt and inequality, which have both been salient subjects of recent policy and
academic debates.

[Footnote 2: Related papers are Blundell, Pistaferri, and Preston (2008), Heathcote, Storesletten, and Violante (2010), and
Heathcote, Perri, and Violante (2010).]

This paper is structured as follows. We describe our primary source of data in section 2 as well as
our imputation procedure for household income. In section 3, we consider household-level regressions
describing the differential debt accumulation patterns across income levels in regions with different levels
of income inequality. Section 4 presents a model that can explain these patterns. In section 5, we test and
confirm the additional predictions of the model using data on mortgage applications by individuals in
different inequality areas. Section 6 concludes.

2 Data
In this section, we first describe the dataset used to measure household debt accumulation over the course
of the 2000s. Second, we discuss how we impute household income based on observed patterns in the
Survey of Consumer Finances. Third, we construct local income inequality measures and describe some
of their properties.

2.1. The New York Federal Reserve Bank Consumer Credit Panel/Equifax

We measure household debt accumulation using the New York Federal Reserve Bank Consumer Credit
Panel/Equifax (CCP) data. The CCP is a quarterly panel of individuals with detailed information on
consumer liabilities, delinquency, some demographic information, credit scores, and geographic
identifiers down to the zip code level. 3 The core of the database consists of a 5% random sample of all U.S.
individuals with credit files. The database also contains information on all individuals with credit files
residing in the same household as the individuals in the primary sample. The household members are
added to the sample based on the mailing address in the existing credit files. Using the households’
identifiers, we aggregate individual records into households’ records and construct measures of
households’ debt. Thus, the resulting sample is a sample of U.S. households in which at least one member
has a credit file. The data in the CCP are updated quarterly. We use 100% of the CCP sample. Lee and
van der Klaauw (2010) provide an excellent detailed description of the database.
The data cover all major categories of household debt including mortgages, home equity lines of
credit (HELOC), credit cards, and student loans. Because of the large sample size, the breadth of variables
observed, detailed location, and the ability to construct a quarterly household panel, these data provide the
most detailed picture of household debt available.

[Footnote 3: For complete details on the data set and variables construction see Appendix B.]

2.2. Income Rank Imputation

While the CCP provides detailed records of household debt and geographical location, it does not include
information on household income. To address this issue, we impute income in the CCP using information
from the Survey of Consumer Finances (SCF). The SCF is a household-level survey that contains
information on debt balances and income as well as a rich set of demographic characteristics. However,
the SCF does not provide geographic identifiers in the publicly available data. We use the SCF to
estimate how household income relates to debt and demographic characteristics available in both the CCP
and SCF data sets. We then use these estimates to impute household income in the CCP data. Finally, we
use the imputed income and the estimated error terms from the SCF to impute the household’s income
rank in the household’s geographical area.
In our analysis, we restrict the sample to households for whom the household head’s age is
between 20 and 65 to minimize potential age-related selection effects. We use data from the third
quarter of the CCP for the years 2001-2012. We follow Brown et al.
(2011) and choose the third quarter to maximize the match with the SCF survey (typically administered
between April and December), which we use to impute the initial income distribution as described below.
For consistency, we then use the third quarter of each subsequent year to generate annual measures of
household debt.
Table 1 contains the summary statistics from the CCP and SCF samples from the third quarter of
2001. The statistics from the SCF and CCP are similar for most categories with the exception of credit
card balances. This finding is consistent with Brown et al. (2011), who report that debt levels in the SCF
and CCP are similar overall and across most disaggregated debt categories (mortgages, auto loans, and
HELOCs), borrower characteristics, and environment cells. Brown et al. (2011) suggest
that some of the discrepancy between the credit card balance statistics in the two datasets might come
from the way credit card balances are recorded: the CCP contains records of all credit card balances,
whereas the households in the SCF might only report the fraction of the balance they intend to roll over. 4
The mortgage balance and HELOCs in the CCP are slightly higher than in the SCF because the CCP
measure includes secondary/investment properties, while in the SCF it does not (see Brown et al. 2011).
The auto debt balance is also slightly higher in the CCP because the CCP always includes auto leases,
while in the SCF respondents usually do not report car leases as auto debt. The bankruptcy rates are very
similar between the two samples. The tables also show some differences between the delinquency
statistics in the two datasets. The SCF households probably report only severe delinquencies on large
quantities of debt and do not report delinquencies that they regard as temporary or small. 5

[Footnote 4: In the CCP, the credit balance is recorded on some date during the quarter. For some individuals, this can be the
date right before they pay off most of their credit balance, and the balance might largely reflect the transaction use of
the credit cards. For other individuals, the date might be the date after they pay off the intended balance and the
remaining amount reflects the carry-over balances. In the SCF, the credit balance reported likely does not reflect the
use of credit card for transactions, but rather the debt that the household does not plan to repay in the current period.
In addition, the households in the SCF might forget older balances.]

To impute the rank in the income distribution for a household in the CCP, we first estimate the
following relationship between the household’s gross income and observable characteristics in the 2001
SCF,

$$\log(Y_{i,SCF}) = f(\beta X_{i,SCF}) + \epsilon_{i,SCF}, \qquad (1)$$

where $Y_{i,SCF}$ is the income of household $i$, and $X_{i,SCF}$ is the vector of the household’s characteristics that
include (logs of) mortgage balance, credit card balance, credit card limit, an indicator for positive credit
card limit, the credit card utilization rate conditional on positive credit card limit, auto loan balance,
HELOC balance, student loan balance, an indicator for bankruptcy, an indicator of 60 days or more past
due on any loan, the age of the head of the household and the household size. $f(\cdot)$ is a function that
includes polynomials, interaction terms, and dummy variables. Appendix F provides more information on
the specification and variables. We estimate equation (1) using OLS (with the SCF sampling weights) and
eliminate outliers using Cook's distance. 6 The adjusted $R^2$ for this regression is 0.55.
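The estimation step can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes NumPy, a design matrix that already contains the polynomial, interaction, and dummy terms of f(·), Cook's distance computed on the weight-transformed regression, and a 4/n cutoff, which is a common rule of thumb that the paper does not state. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def weighted_ols(X, y, w):
    """Weighted least squares via the square-root-weight transformation."""
    sw = np.sqrt(w)
    Xw = X * sw[:, None]
    yw = y * sw
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta, Xw, yw


def cooks_distance(Xw, yw, beta):
    """Cook's distance for each observation of the transformed regression."""
    n, p = Xw.shape
    resid = yw - Xw @ beta
    q, _ = np.linalg.qr(Xw)              # leverage = squared row norms of Q
    leverage = np.sum(q ** 2, axis=1)
    s2 = resid @ resid / (n - p)         # residual variance estimate
    return resid ** 2 / (p * s2) * leverage / (1.0 - leverage) ** 2


def fit_without_outliers(X, y, w, cutoff=None):
    """Fit once, drop high-influence observations, and refit."""
    beta, Xw, yw = weighted_ols(X, y, w)
    distance = cooks_distance(Xw, yw, beta)
    if cutoff is None:
        cutoff = 4.0 / len(y)            # assumed rule of thumb, not from the paper
    keep = distance <= cutoff
    beta_trimmed, _, _ = weighted_ols(X[keep], y[keep], w[keep])
    return beta_trimmed, keep


# Synthetic example: a linear relationship plus one gross outlier.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=200)
x[0], y[0] = 10.0, -50.0                 # contaminated observation
X = np.column_stack([np.ones_like(x), x])
w = rng.uniform(0.5, 1.5, size=200)      # stand-in for survey weights
beta_hat, kept = fit_without_outliers(X, y, w)
```

In the synthetic example the contaminated observation is dropped and the refit recovers coefficients close to the true intercept and slope.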
Using the estimated $\hat{\beta}$, we construct the expected imputed (log) income for each household $i$ in the
third quarter of 2001 in the CCP data:

$$E[\log(Y_i)] = f(\hat{\beta} X_{i,CCP}),$$

and the expected imputed income (in levels)

$$E[Y_i] = \exp\left[ E[\log(Y_i)] + 0.5\,\sigma^2_{\hat{\epsilon}_{i,SCF}} \right],$$

where $\sigma^2_{\hat{\epsilon}_{i,SCF}} = 0.3721$ is the variance of $\epsilon_{i,SCF}$ estimated in equation (1).
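The 0.5σ² term in the level formula is the correction that applies when the residual of the log regression is normally distributed, so that income is lognormal. A small sketch of the conversion, with a hypothetical function name and an illustrative predicted log income of 10.8 that is not taken from the paper:

```python
import math

SIGMA2_SCF = 0.3721  # residual variance reported for equation (1)


def expected_income_level(expected_log_income, sigma2=SIGMA2_SCF):
    """Convert a predicted log income into an expected income in levels,
    assuming normally distributed residuals (lognormal income)."""
    return math.exp(expected_log_income + 0.5 * sigma2)


naive = math.exp(10.8)                      # ignores the residual variance
corrected = expected_income_level(10.8)     # applies the 0.5 * sigma^2 term
correction_factor = corrected / naive       # exp(0.5 * 0.3721), about 1.20
```

With the reported variance, simply exponentiating the predicted log income would understate expected income by roughly 17 percent.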

Having imputed households’ income in the CCP, we then estimate the household’s rank in the
local income distribution. For each household $i$ in area $c$ we construct its income rank in 2001, $R_{i,c,2001}$,
as the rank of the household's expected imputed income, $E[\log(Y_{i,2001})]$, in the imputed income
distribution for location $c$. We approximate the local income distribution through a simple resampling
procedure. In particular, we assume that the distribution of income residuals estimated in the SCF is the
same across all locations. Note that to the extent that this assumption is not appropriate, we will tend to
bias our results against finding any role for inequality in accounting for debt dynamics. After drawing a
household from location $c$ in the CCP and calculating its expected income, we add a randomly drawn
residual estimated on the SCF sample to obtain the actual household income:

$$\log(Y_{i,c,CCP}) = f(\hat{\beta} X_{i,c,CCP}) + \hat{\epsilon}_{SCF}.$$

By repeating the process 50,000 times, with draws done with replacement, we approximate the local
income distribution. We then calculate each household’s percentile rank ($R_{i,c,2001}$) as well as
distributional statistics. The higher the value of $R_{i,c,2001}$, the relatively richer is household $i$ in its
geographical location $c$ in 2001.

[Footnote 5: In the SCF data, the 60DPD indicator is the indicator of whether a household has ever been delinquent on any loan
for 60 days or longer. In the CCP data, the 60DPD indicator is the indicator of whether a household is delinquent on
any loan for 60 days or longer in the current quarter.]

[Footnote 6: Equation (1) is estimated only for observations with positive values of income. We also restrict our analysis to the
50 U.S. states and the District of Columbia, dropping the observations from Puerto Rico and U.S.-owned territories.]
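The resampling procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names, the three-household location, and the toy residual pool are hypothetical, and the percentile rank is taken as the share of simulated incomes at or below the household's expected log income.

```python
import bisect
import random


def simulate_local_distribution(expected_log_incomes, scf_residuals,
                                n_draws=50_000, seed=0):
    """Approximate a location's log-income distribution by repeatedly drawing a
    household and an SCF residual, both with replacement, and adding them."""
    rng = random.Random(seed)
    draws = [rng.choice(expected_log_incomes) + rng.choice(scf_residuals)
             for _ in range(n_draws)]
    draws.sort()
    return draws


def percentile_rank(value, sorted_draws):
    """Share (0 to 100) of simulated log incomes at or below `value`."""
    return 100.0 * bisect.bisect_right(sorted_draws, value) / len(sorted_draws)


# Hypothetical location with three households and a toy pool of residuals.
local_expected = [10.2, 10.9, 11.6]
residual_pool = [-0.6, -0.2, 0.0, 0.2, 0.6]
dist = simulate_local_distribution(local_expected, residual_pool)
ranks = [percentile_rank(v, dist) for v in local_expected]
```

Because the same residual pool is used everywhere, differences in simulated distributions across locations come only from differences in the households' expected incomes, which mirrors the common-residual assumption stated in the text.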

We separately construct the rank of the household by the household's location at the three
different levels of aggregation: zip code, county and state. When the measure is constructed at the zip
code level, we restrict the analysis to zip codes with at least 100 households in our CCP sample. This
gives us 14,529 distinct zip codes in 2001. When the measure is constructed at the county level, we
restrict the analysis to counties with at least 300 households in our CCP sample. This procedure gives us
2,303 counties in 2001, covering over 35,000 zip codes.
We check the quality of our imputation in a number of ways. Table 2 presents the moments of the
income distribution imputed in the CCP and the same moments calculated from the SCF. The two sets of
moments are very similar, suggesting that our imputation function is sensible. We also check the quality
of our income imputation procedure by bringing income information to the CCP data from an alternative
source. In particular, we merge the CCP data with the data from a proprietary database. This database has
detailed mortgage-level panel data that contain information on a majority of mortgages originated in the
U.S. These data include the debt-to-income ratio associated with each mortgage at the time of
origination. We use information on the mortgage origination month, location (zip code) and balance from
this proprietary database and the same attributes from the mortgage trade-line data in the CCP to match
households in the two datasets. 7 The earliest year when the debt-to-income variable is available in both
the proprietary dataset and the SCF is 2007; thus we merge the data using the first mortgages originated
in 2007. Prior to the merge, we eliminate all cases of multiple mortgages with the same combination of
open month, initial balance and zip code in both datasets to ensure that the match is unique. For the
sample of matched households we then use the debt-to-income ratio from the proprietary database and the
debt in the CCP to estimate the income. For this subset of matched households we compare the income
rank derived from the proprietary data with the income rank derived from the SCF-CCP imputation. The
two measures of rank are highly and positively correlated (Spearman correlation coefficient is 0.55),
confirming that our imputation procedure provides a good measure of income. When we regress the
imputed CCP measure of income on the actual measure of income from the proprietary database, the
estimate of the slope is practically one and thus measurement errors arising from the imputation do not
appear to be mean-reverting to any significant extent.

[Footnote 7: See Elul et al. (2010) for a similar merge procedure.]
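The merge used for this check can be sketched as follows. The record layout, field names, and function names are hypothetical. The sketch also assumes that the debt-to-income ratio is expressed as a debt balance over annual income, so that income equals debt divided by the ratio; the text does not spell out the exact definition.

```python
from collections import Counter


def unique_by_key(records):
    """Keep only records whose (open month, initial balance, zip code) key is unique."""
    def key(record):
        return (record["open_month"], record["balance"], record["zip"])

    counts = Counter(key(r) for r in records)
    return {key(r): r for r in records if counts[key(r)] == 1}


def match_and_impute(ccp_mortgages, proprietary_mortgages):
    """Match mortgages on the unique key and back out income as debt / DTI."""
    ccp = unique_by_key(ccp_mortgages)
    prop = unique_by_key(proprietary_mortgages)
    implied_income = {}
    for k in ccp.keys() & prop.keys():
        dti = prop[k]["dti"]
        if dti > 0:
            implied_income[ccp[k]["household_id"]] = ccp[k]["total_debt"] / dti
    return implied_income


ccp_sample = [
    {"household_id": "A", "open_month": "2007-03", "balance": 200000,
     "zip": "23219", "total_debt": 240000},
    {"household_id": "B", "open_month": "2007-05", "balance": 150000,
     "zip": "94704", "total_debt": 150000},
    {"household_id": "C", "open_month": "2007-05", "balance": 150000,
     "zip": "94704", "total_debt": 180000},
]
proprietary_sample = [
    {"open_month": "2007-03", "balance": 200000, "zip": "23219", "dti": 3.0},
    {"open_month": "2007-05", "balance": 150000, "zip": "94704", "dti": 2.5},
]
implied = match_and_impute(ccp_sample, proprietary_sample)
```

Households B and C share the same key in the CCP sample, so both are dropped before matching and only household A receives an implied income.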

2.3. Local Inequality Measures

Having imputed income in the CCP, we construct the local inequality measures for 2001 ($I_{c,2001}$). Our
preferred measure of inequality is the difference between expected log income at the 90th percentile and
expected log income at the 10th percentile, i.e.,

$$I_{c,2001} = p90_c\left[ E\{\log(Y_{i,c,2001})\} \right] - p10_c\left[ E\{\log(Y_{i,c,2001})\} \right].$$

We then compare this measure to inequality measures constructed from alternative sources. At the zip code
level, we use data from the IRS on household adjusted gross income (AGI) drawn from the 2001 tax returns.
At the county level, we use the Census data on household income from 2000. Both of these sources provide
income bins and the fraction of the population within each bin. Using this information, we construct a
simple approximation to the Gini coefficient. The CCP measure constructed from imputed incomes is highly
correlated with Gini coefficients based on Census or IRS data. For example, the correlation between Gini
coefficients from the 2000 Census and 90-10 differences in the CCP data at the county level is 0.59.
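The text does not spell out the approximation used for the Gini coefficient. One common choice for binned data, shown here purely as an assumption, is to build a piecewise-linear Lorenz curve from the population share and a representative income (for example, the bin midpoint) of each bin. The function name `gini_from_bins` and the inputs are hypothetical.

```python
import numpy as np

def gini_from_bins(pop_shares, bin_incomes):
    """Approximate Gini coefficient from binned income data.

    pop_shares: fraction of households in each income bin.
    bin_incomes: assumed representative income of each bin (e.g., midpoint).
    """
    shares = np.asarray(pop_shares, dtype=float)
    incomes = np.asarray(bin_incomes, dtype=float)
    order = np.argsort(incomes)                      # poorest bin first
    shares = shares[order] / shares.sum()
    incomes = incomes[order]
    income_shares = shares * incomes / np.sum(shares * incomes)
    lorenz = np.concatenate([[0.0], np.cumsum(income_shares)])
    # Trapezoid rule for the area under the Lorenz curve, times two.
    return 1.0 - np.sum(shares * (lorenz[1:] + lorenz[:-1]))

# Hypothetical area: half the households earn 1 unit, half earn 3 units.
example_gini = gini_from_bins([0.5, 0.5], [1.0, 3.0])
```

This treats all households within a bin as having the same income, so it understates inequality when bins are wide.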
Figure 2 plots a map of U.S. inequality at the county level. Inequality is on average highest in the
southern states, as well as California and the Pacific Northwest. Midwestern states, in contrast, stand out for
having some of the lowest levels of inequality on average. The map also shows that inequality tends to be
higher in large cities than in more rural areas.
The map, which plots inequality at the county level, masks even greater regional heterogeneity in
inequality at the zip code level. Figure 3 plots histograms of our CCP inequality measure at each level of
aggregation. Average inequality is higher at lower levels of aggregation with a mean across zip codes of
2.24 and a mean of 1.68 across states. The standard deviation of inequality is twice as high (0.15) at the zip
level compared to the state level (0.07).
We focus on local income inequality for a number of reasons. First, this is likely to be the most
relevant metric when households compare themselves to others. Second, it avoids measurement issues
associated with comparing incomes across very different areas (e.g. $100K in New York vs. Tulsa).
Third, much of the rise in aggregate inequality in the U.S. reflects rising inequality within regions rather
than across regions. 8 Finally, there is much more variation in income inequality across regions than in
aggregate inequality over time, which is necessary for identifying any potential effects of inequality on
household behavior.

8 In Appendix C, we describe in detail a decomposition of aggregate income inequality in the U.S. from 1970 to
2000 measured using Census income data. When we measure the relative importance of differences in mean
incomes across regions (“between” inequality) versus the dispersion of incomes within regions (“within” inequality)
for each Census, we find that “between” inequality has consistently accounted for less than two percent of total
inequality and that this share has, if anything, been declining over time.

3 Empirical Analysis of Debt and Inequality
In this section, we investigate whether households’ borrowing patterns from 2001 to 2012 varied with
local inequality. We do so using household level regressions of debt to income changes over time as a
function of household characteristics, the household’s position in the local income distribution, and
interactions of the latter with local inequality measures. We find that while the evidence supports the
notion that local inequality affected debt accumulation patterns across income groups, the direction of the
effect is opposite to what one would expect from “keeping up with the Joneses” effects. We document the
robustness of this result along a variety of dimensions.

3.1. Baseline Results

We are interested in estimating the role of initial local income inequality on the relationship between the
household's debt accumulation and the household's rank in the initial local income distribution. In
particular, we estimate the change in the household's debt between 2001 and year $t$, $2002 \le t \le 2012$, as
a function of the household's income rank in the 2001 local income distribution, conditional on local
income inequality in 2001. The benchmark specification is

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + c^{+} + \epsilon_{ict}, \tag{2}$$

where $\Delta D_{ict}/E[Y]_{ic,2001}$ is the change from year 2001 to year $t$ in the debt of household $i$ that resides in location $c$
relative to the household's (imputed expected) income in 2001 (in levels), i.e.,

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} \equiv \frac{D_{ict} - D_{ic,2001}}{E[Y]_{ic,2001}},$$

where $D_{ict}$ is deflated by the CPI-U and expressed in 2001 dollars. $c^{+}$ is the fixed effect of the
geographical location that is at one level of aggregation higher than the geographic area used to construct
the income distribution and the income inequality measure. 9 We use the 2001 measure of local income
inequality because it is predetermined relative to subsequent household debt accumulation decisions and
it is highly persistent over time.
9 For example, in the regressions with zip code-level distribution of income and inequality, we control for county-level fixed effects. In the regressions with county-level rank and inequality, we control for state-level fixed effects.
We do not control for the geographical fixed effects in the regressions with state-level income rank and inequality.

Parameters 𝛼, 𝛽 and 𝛾 describe the relationship between the household’s debt accumulation and
local inequality. If 𝛼 < 0, low-rank households within an area accumulate relatively more debt than the
high-rank households. If 𝛽 = 𝛾 = 0, then local inequality is irrelevant for household debt accumulation.
This case is shown in Panel A of Figure 4. Panel B of Figure 4 illustrates the case when 𝛼 < 0, 𝛽 > 0, 𝛾 < 0.
If 𝛽 > 0, an area with higher inequality is associated with higher debt accumulation. If 𝛾 < 0,
this effect weakens as household rank increases. Such a case is an example of the “keeping up with
the Joneses” hypothesis. Specification (2) can be interpreted as a “difference-in-differences” approach in
which we compare high- and low-ranked households across high- and low-inequality regions, with 𝛾
being the key parameter that determines whether such differences have been important.
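To make the structure of specification (2) concrete, the following sketch simulates data from that equation and recovers the coefficients with a fixed-effects (within) estimator. Everything here is an assumption made for illustration: the sample sizes, the "true" coefficient values, and the noise levels are hypothetical and are not estimates from the paper, and the sketch omits sampling weights and clustered standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and "true" coefficients, chosen only for illustration.
n_counties, zips_per_county, hh_per_zip = 20, 10, 50
alpha, beta, gamma = -0.5, -0.3, 0.4

n_zips = n_counties * zips_per_county
county_of_zip = np.repeat(np.arange(n_counties), zips_per_county)
inequality_of_zip = rng.normal(2.2, 0.15, n_zips)    # I_c, one value per zip code
county_effect = rng.normal(0.0, 0.2, n_counties)     # fixed effect one level up

zip_id = np.repeat(np.arange(n_zips), hh_per_zip)
county = county_of_zip[zip_id]
income_rank = rng.uniform(0.0, 1.0, zip_id.size)     # R_ic, rank within the zip code
inequality = inequality_of_zip[zip_id]
debt_change = (alpha * income_rank + beta * inequality
               + gamma * income_rank * inequality
               + county_effect[county] + rng.normal(0.0, 0.1, zip_id.size))

def demean_by_group(x, groups):
    """Subtract group means; this absorbs the group fixed effects."""
    group_means = np.bincount(groups, weights=x) / np.bincount(groups)
    return x - group_means[groups]

regressors = [income_rank, inequality, income_rank * inequality]
X = np.column_stack([demean_by_group(x, county) for x in regressors])
y = demean_by_group(debt_change, county)
(alpha_hat, beta_hat, gamma_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
print(alpha_hat, beta_hat, gamma_hat)
```

The inequality coefficient is identified only from variation across zip codes within the same county, which is why the fixed effect is placed one level of aggregation above the inequality measure.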
We estimate equation (2) separately for each year $t$, $2002 \le t \le 2012$. In each year $t$, we follow
Guerrieri, Hartley and Hurst (2013) and restrict the sample to households that reside in the same
geographical area $c$ in 2001 and in $t$. In each regression, we exclude the observations below the 2nd and
above the 98th percentile of the distribution of $\Delta D_{ict}/E[Y]_{ic,2001}$ in year $t$. The standard errors are clustered by
geographic location $c$. 10
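The percentile trimming described above can be sketched as follows. The helper `trim_outliers` is a hypothetical name, and the use of NumPy's default (linearly interpolated) percentiles is an assumption, since the text does not say how the cutoffs are computed.

```python
import numpy as np

def trim_outliers(values, lower_pct=2, upper_pct=98):
    """Keep observations between the lower and upper percentiles (inclusive)."""
    values = np.asarray(values, dtype=float)
    lower, upper = np.percentile(values, [lower_pct, upper_pct])
    keep = (values >= lower) & (values <= upper)
    return values[keep], keep

# Hypothetical debt-change-to-income ratios for one year's regression sample.
ratios = np.arange(101.0)
trimmed, keep_mask = trim_outliers(ratios)
```

The boolean mask is returned alongside the trimmed values so that the same rows can be dropped from the regressors.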

Our baseline estimates of equation (2), estimated at the zip code level with county fixed effects
for years ranging from 2002 to 2012, are reported in Panel A of Table 3. Our first finding is that the
coefficient on a household’s rank in the income distribution (α) is consistently negative, with a peak
absolute value in 2007. Hence, debt accumulation over the course of the early to mid-2000s was, on
average, greater for lower income households. Second, the estimated coefficient on the inequality level of
the zip code is systematically negative, again peaking in absolute value in 2007. This implies that, holding
everything else constant, households living in the more unequal areas within a county accumulated less
debt over the early to mid-2000s than did those in lower inequality areas in the same county.
The key parameter for us is γ, which captures the interaction of household rank in the local
income distribution and local inequality. Our main finding is that γ is positive over this time period. This
implies that debt accumulation was relatively higher for (sufficiently) high-income households in high-inequality regions than in low-inequality regions, or equivalently that lower income households in high-inequality regions borrowed relatively less than their counterparts in lower inequality regions. This result
is precisely the opposite of what one would have expected from “keeping up with the Joneses” effects.
Panel C of Figure 4 illustrates our results qualitatively. Households with rank to the right of the crossing
accumulate more debt on average as inequality increases. Households to the left of the crossing
accumulate relatively less debt as inequality increases.
10 Each specification below is estimated using household sampling weights from year 2001. See Appendix B for
details on the construction of household sampling weights.

To give a sense of the economic magnitudes, we calculate the change in debt accumulation in
response to a one standard deviation increase in local inequality for households of several different ranks.
Figure 5 plots these calculated effects at the 80th, 50th, and 20th percentiles for each time sample. At the
80th percentile the increase in inequality means the increase in household debt over expected income was
higher by almost nine percentage points in 2007. At the 20th percentile we estimate that households
decreased debt relative to income by a little over ten percentage points in 2007. In the same year the
median household saw a decline in debt-to-income of less than one percentage point.
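Under specification (2), the effect of a one standard deviation increase in inequality on a household of rank $R$ is $(\beta + \gamma R)$ times that standard deviation, and the effect changes sign at $R = -\beta/\gamma$. The sketch below works through this arithmetic. The coefficient values are hypothetical, picked only so that the output roughly mimics the 2007 magnitudes quoted above; they are not the Table 3 estimates. The standard deviation of 0.15 is the zip-level figure from section 2.3.

```python
def effect_of_inequality(rank, beta, gamma, sd_inequality):
    """Change in debt-to-income from a one-s.d. rise in local inequality."""
    return (beta + gamma * rank) * sd_inequality

def crossing_rank(beta, gamma):
    """Income rank at which the effect of inequality switches sign."""
    return -beta / gamma

# Hypothetical coefficients (illustration only, not the paper's estimates).
beta, gamma, sd_inequality = -1.09, 2.11, 0.15

for rank in (0.2, 0.5, 0.8):
    effect_pp = 100 * effect_of_inequality(rank, beta, gamma, sd_inequality)
    print(f"rank {rank:.1f}: {effect_pp:+.1f} percentage points")
print(f"crossing rank: {crossing_rank(beta, gamma):.2f}")
```

With a negative $\beta$ and a positive $\gamma$, households below the crossing rank borrow less as inequality rises and households above it borrow more.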

3.2. Specifications with Additional Controls

Our baseline specification does not include any household-specific controls other than their rank in the
income distribution. To control for potentially confounding household characteristics, we consider an
expanded specification augmented to include a vector of household-specific regressors:

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + c^{+} + \epsilon_{ict}, \tag{3}$$

where 𝑋𝑖𝑐 is the set of household-specific controls. The latter include the age of the head of the
household, household size, (logarithm of) the level of household’s mortgage debt, (logarithm of) the level

of household’s auto debt, (logarithm of) the level of household’s HELOC debt, (logarithm of) the level of
household’s student loan debt, an indicator for a non-zero credit card debt limit, (logarithm of) the level
of household’s credit card debt, (logarithm of) the level of household’s credit card limit, the credit card
utilization rate conditional on non-zero credit card limit, default indicators, and the average of household
members’ credit scores. All controls are from 2001, with the exception of credit scores for which we
include both 2001 values (to control for initial access to credit) as well as year t values (to control for
access to credit in subsequent years). Results from this augmented specification are presented in Panel B
of Table 3. The results for the estimated effects of rank, inequality and the interaction of the two are
almost identical to those from the parsimonious specification.
A second concern one might have is that regional inequality is correlated with other regional
economic characteristics and that it is the latter that are most relevant for household debt accumulation
decisions. We control for this possibility in several ways. First, we include an additional vector of zip-level control variables:

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \kappa W_{c} + c^{+} + \epsilon_{ict}, \tag{4}$$

where 𝑊𝑐 is the set of location-specific controls. The set of location-specific controls includes the median
expected income in the zip code in 2001, the median of (log of) the household’s total debt in 2001, and

the median of (log of) the household’s mortgage debt in 2001. Results are presented in Panel C of Table
3. Again, our baseline estimates of the effects of household rank, local inequality and their interaction are
almost unchanged. This is also illustrated graphically in Panel B of Figure 5: our estimates with both
household and regional controls suggest that increasing inequality by one standard deviation is associated
with an increase in borrowing relative to income of almost 13 percentage points for households at the
80th percentile and of 3.5 percentage points at the 50th percentile, and with a decrease of almost 6
percentage points at the 20th percentile. The
difference between high- and low-rank households is essentially the same as before.
Another way to control for regional characteristics is to estimate our baseline specification with
fixed effects at the level of the zip code rather than the county:

$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \delta_{c} + \epsilon_{ict}. \tag{5}$$

With zip code-specific fixed effects δc, we can no longer separate the effect of local inequality from other
regional characteristics, but we can still estimate the coefficient on the interaction term between the
household’s income rank and local inequality, 𝛾. The results from estimating equation (5) are presented in

Panel D of Table 3: the estimate of 𝛾 is again almost unchanged relative to those from our parsimonious

specification (2) or specifications augmented with household (3) and regional controls (4).

We also check for omitted variable bias in the interaction term by adding the interaction of the
household credit risk score with local inequality to the specification in equation (3). If the measure of
income rank primarily picked up the relative importance of the household’s credit risk score, one would
expect the estimate of 𝛾 to differ significantly after including this interaction. We estimated the following
modification of specification (3):
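
$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \phi\, Risk_{ic,2001} + \sigma\, Risk_{ic,2001} \times I_{c,2001} + c^{+} + \epsilon_{ict}. \tag{3'}$$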

The estimates of 𝛾 across all years (Panel A, Table 4) are robust to the inclusion of the interaction term.

Similarly, we check whether the results are sensitive to including an interaction of the

household’s initial debt level with local inequality in specification (3):
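
$$\frac{\Delta D_{ict}}{E[Y]_{ic,2001}} = \alpha R_{ic,2001} + \beta I_{c,2001} + \gamma R_{ic,2001} \times I_{c,2001} + \psi X_{ic} + \phi\, Debt_{ic,2001} + \sigma\, Debt_{ic,2001} \times I_{c,2001} + c^{+} + \epsilon_{ict}. \tag{3''}$$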

Our baseline findings are unchanged with these additional controls (Panel B of Table 4).
Finally, we verify that our results do not hinge on the CCP measure of income inequality. We
replicate our results from Table 3 in Appendix Table A1 using the measure of inequality constructed from
IRS data and described in section 2.3 and find almost identical results. In short, the differential debt-accumulation patterns by households of differing income levels across inequality regions are a robust
feature of the data.

3.3 Subsample Analysis

Our finding that debt accumulation was higher for poorer households in low-inequality regions than high-inequality regions is robust to controlling for a wide variety of household and regional controls. One may
be concerned, however, that our interaction effect is capturing some other nonlinear characteristic of
household borrowing, which need not be captured by linear controls. To address this possibility, we
consider an additional set of robustness checks in which we verify that our results still obtain within
subsets of the data. Specifically, we break our regions along four dimensions: geographic areas, initial
debt burdens, credit scores and house price growth.
For geographic areas, we estimate our specification with household and regional controls
(equation (4)) separately for each of the four Census regions: Midwest, Northeast, South and West. We
present the results of the household level regressions of debt accumulation from 2001 to 2007 (the main
period over which household debt increased sharply) for each region in Panel A of Table 5, with the full
set of yearly regressions by region available in Appendix Table A2. For each region, the coefficients are
of the same sign as before and of approximately the same order of magnitude. Hence, our baseline results
are confirmed within each region of the country.
Second, we decompose zip codes by the average level of credit scores among households in each
locale in 2001. Specifically, we group zip codes into three bins: low credit scores (below the 33rd
percentile of average credit score distribution), medium (between the 33rd and 67th percentiles) and high
credit scores (above the 67th percentile of the average credit score distribution). We then rerun our
specification with household and regional controls within each of these three credit score areas. The
results for 2001-2007 are presented in Panel B of Table 5, with the full set of yearly regressions by
credit score grouping available in Appendix Table A3. Again, the results are qualitatively similar across
credit score groups, although they are somewhat smaller in high credit score regions.
Third, we split zip codes according to median debt-to-income ratios in 2001. Specifically, we
construct median initial debt-to-income ratios across all households in a zip code, then split zip codes into
three groups based on these median ratios: low initial debt levels (below the 33rd percentile of the debt-to-income distribution), medium (between the 33rd and 67th percentiles) and high debt-to-income ratios
(above the 67th percentile of the debt-to-income distribution). We then estimate our specification with
household and regional controls within each of these three subsets of zip codes. We again present results
for 2001-2007 in Panel C of Table 5, with the full set of yearly regressions by initial debt-to-income ratio
available in Appendix Table A4. We find that our qualitative result holds across zip codes of different
initial debt-to-income ratios but that the differential effects of inequality on household borrowing across
income groups were largest in regions with higher initial debt to income ratios.
Finally, we separate zip codes by the average growth rate of home prices from 2001-2005, as in
section 2. We calculate zip code house price appreciation using data from the Core Logic index. These
data are only available for a subset of our zip codes (about 6,600) which constitutes about 70% of our
original sample. We group zip codes into three bins: low house price growth (below the 33rd percentile),
medium (between the 33rd and 67th percentiles), and high house price growth (above the 67th
percentile). We re-estimate the specification with household and regional controls within each sub-grouping of zip codes and present results from 2001-2007 in Panel D of Table 5, with the full set of
yearly regressions by house price growth in Appendix Table A5. Once again, the interaction of household
rank and local inequality remains statistically significant within each subset of the data, with the
differential effects of regional inequality being stronger in zip codes which experienced higher growth in
house prices.

3.4 Results from a Nonparametric Specification

The specification in equation (2) assumes a linear relationship between debt accumulation, income
rank and local inequality. In this section, we relax this assumption and estimate a nonparametric
specification. Specifically, we first split the sample of households into three bins according to the level of
local inequality. In particular, each location (zip code) is assigned to one of the three bins based on the
location’s level of inequality in the distribution of inequality across locations in 2001, i.e., low-inequality
bin (less than the 20th percentile of the distribution of local inequality levels), mid-level inequality bin
(between the 20th and 80th percentile), and high-inequality bin (above the 80th percentile). The assignment
of locations to inequality bins remains constant through 2002-2012. For the households in each bin, we
run a regression of household relative debt accumulation on a dummy for income rank below 0.2, a
dummy for income rank above 0.8, a full set of household and regional controls and the county-specific
fixed effects for each year separately. The omitted category is the dummy for income rank between 0.2
and 0.8.
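The binning procedure can be sketched as below. The function names are hypothetical, and the sketch deliberately leaves out the household and regional controls, the county fixed effects, and the sampling weights, so it illustrates only the construction of the inequality bins and rank dummies.

```python
import numpy as np

def inequality_bin(local_inequality, low_pct=20, high_pct=80):
    """Assign each location to bin 0 (low), 1 (mid) or 2 (high inequality)."""
    x = np.asarray(local_inequality, dtype=float)
    low_cut, high_cut = np.percentile(x, [low_pct, high_pct])
    return np.where(x < low_cut, 0, np.where(x > high_cut, 2, 1))

def rank_gaps(debt_change, income_rank):
    """Regress debt change on dummies for rank below 0.2 and above 0.8.

    Returns the two dummy coefficients, each measured relative to the
    omitted middle group (rank between 0.2 and 0.8).
    """
    income_rank = np.asarray(income_rank, dtype=float)
    low = (income_rank < 0.2).astype(float)
    high = (income_rank > 0.8).astype(float)
    X = np.column_stack([np.ones_like(low), low, high])
    coef, *_ = np.linalg.lstsq(X, np.asarray(debt_change, dtype=float), rcond=None)
    return coef[1], coef[2]

# Hypothetical data: 101 locations and six households in one inequality bin.
bins = inequality_bin(np.arange(101.0))
low_gap, high_gap = rank_gaps([1.0, 3.0, 0.0, 2.0, 5.0, 7.0],
                              [0.1, 0.1, 0.5, 0.5, 0.9, 0.9])
```

Running `rank_gaps` separately on the households in each inequality bin, year by year, gives one pair of coefficients per bin and year to compare across bins.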
Figure 6 shows the estimated coefficients on the dummy for income rank below 0.2 and the
dummy for income rank above 0.8, relative to the dummy for the income rank between 0.2 and 0.8. The
differences across inequality regions for high-ranked households (i.e. those above the 80th percentile) are
small throughout the time sample. In contrast, low-ranked households display much larger differences in
debt accumulation patterns across low- and high-inequality regions, with differences in debt accumulation
reaching nearly 20 percent of initial income levels by 2008. Hence, the link between inequality and debt
accumulation was relatively more important for low-income households than for high-income households.

3.5 Results with County- and State-Level Income Distribution and Inequality Measures

Previous work on inequality and consumption has been done using measures of inequality at the state
level (see Bertrand and Morse, 2013) and most discussion of inequality and debt has focused on measures
of inequality at the national level, as in Figure 1. We explore how our results vary as we increase the level
of geographic aggregation for inequality by estimating equation (4) using the income distribution at the
county and state level. We construct the area income distribution using the same resampling procedure we
used for zip codes and now we compute a household’s percentile rank within the larger area (e.g. county)
income distribution and inequality statistics of that distribution. We keep all household and regional-level
controls that we used before except now we include state fixed effects for county-level regressions and no
fixed effects for state-level regressions.
Panels A and B of Table 6 report the results with county- and state-level income distribution and
inequality measures, respectively. At the county level, we find very similar results to our zip code
regressions once we consider that the standard deviation of inequality is smaller at the county level.
Similarly, we also find very similar estimates of the interaction term when inequality is measured at the state
level, although there is some loss of precision in our estimates due to the aggregation. Also noteworthy is
that the estimate of β is positive at the state level, implying that households on average accumulated
relatively more debt in states with higher levels of inequality. This is similar to the result obtained by
Bertrand and Morse (2013) that typical households consumed more in states where consumption of the rich
was higher.

3.6. Decomposition by Form of Debt

We now consider debt accumulation patterns along different dimensions of debt: mortgages, auto loans
and credit cards. For each, we reproduce our household-level regressions with household and regional
controls and county fixed effects and report yearly results in Table 7. Panel A documents that the results
for mortgages are almost identical to those found for total debt. Because mortgage debt on average
accounts for two-thirds of total debt, it is likely the primary driver of total debt patterns described above.
Panel B documents that very similar qualitative results obtain for auto loans: both α and β are estimated to
be negative while the interaction term γ is positive. However, the interaction effects are significantly
smaller for auto loans than for mortgages, even if we adjust them for the relative magnitudes of each form
of debt (i.e. convert to growth rates). For example, the peak interaction effect on auto loans is about 0.09,
which when scaled by the average ratio of mortgage debt to auto debt (mortgage debt is almost eight
times as large as auto debt on average) becomes 0.71 or one-third of the mortgage interaction effect.
Thus, even though auto loans display the same qualitative patterns, the mapping from local inequality to
differential borrowing patterns across households is quantitatively weaker for auto loans than for
mortgages.
Panels C and D report equivalent results for credit card balances and credit card limits. The
distinction between credit card balances and limits is useful because the former can be interpreted as
reflecting the demand for credit on the part of households while the latter largely reflects credit
availability. 11 Strikingly, we find very different results for the two measures. With credit card limits, we
recover the same qualitative features as in our baseline estimates for total debt: α and β are both estimated
to be systematically negative while the interaction term γ is positive. With credit card limits being
approximately half of mortgage debt on average, the estimated peak level of γ of around 0.5 is
approximately half as large as the peak interaction effect estimated for mortgages in terms of implied
growth rates of each form of debt. In contrast, we find no consistent or economically significant
relationship between local inequality and the credit card balances of households across different income
groups: both β and γ are estimated to be very small (in some years becoming statistically insignificant)
and the sign of γ is unstable across years. Thus, to the extent that we can interpret credit card balances and
limits as reflecting credit demand and supply, respectively, these results suggest that the differential
borrowing patterns of lower and higher income households across regions of different inequality reflect
differential credit supply conditions, not differential credit demand as would be the case under “keeping
up with the Joneses”.
In section 4, we propose one channel through which credit supply can vary with local inequality
in a way that can account for these patterns, namely if banks use an applicant’s income in combination
with local inequality to make inferences about the applicant’s underlying type. This interpretation of the
data would be consistent not just with the difference in our findings for credit card limits and credit card
balances, but also with the quantitative differences in the size of estimated effects of inequality across
other forms of debt. Mortgages, for example, represent much larger loan amounts than other forms of debt
and it is relatively difficult for financial institutions to recover the home or office associated with the loan
in case of default. Auto loans, on the other hand, are much smaller in size and banks face fewer hurdles to
repossessing a car. Hence, the incentive of financial institutions to devote resources toward identifying
applicants’ underlying credit-worthiness should be much lower for auto loans than mortgages, leading to
weaker utilization of the information provided by local income inequality as found in Table 7. While
credit card debt is of the same order of magnitude on average as auto debt in the CCP, credit card debt is
unsecured so that financial institutions bear more risk than they do with automobiles. One would therefore
expect stronger incentives to utilize available information in extracting credit risk for credit cards than
autos, which is again consistent with what we observe in the data.

11 This distinction is somewhat offset by the fact that households can endogenously raise their credit limits by
applying for more credit cards or requesting higher limits from their current credit card providers.

4. Model
In this section, we develop a stylized model in which banks use local inequality to extract information
about applicant types and which results in borrowing patterns similar to those we find in the CCP data.
We show how local inequality affects bank lending decisions under perfect competition and monopoly.
Suppose there are two types of households: High (H) and Low (L). To simplify algebra, we
assume that High type households never default on debt while Low type households default with
probability $d$ and that the share of High type households is 0.5. 12 The income for each type $j \in \{H, L\}$ is
given by $y_j = \mu_j + e_j$ where $\mu_H > \mu_L$ are constants and $e_j \sim N(0, \sigma^2)$. Hence, $y_H \sim N(\mu_H, \sigma^2)$ and
$y_L \sim N(\mu_L, \sigma^2)$. Denote the pdfs for each distribution with $\phi_H$ and $\phi_L$. The average income in this
economy is $\bar{y} = \frac{1}{2}\mu_H + \frac{1}{2}\mu_L$.

We also assume banks observe 𝑠, another signal about the quality of borrowers that can incorporate

other information about borrowers and is not observed by the econometrician, to capture the idea that loan
officers have more information than econometricians. Similar to the income signal, $s_j = \rho_j + \eta_j$ where
$\rho_H > \rho_L$ are constants and $\eta_j \sim iid\ N(0, \omega^2)$. Denote the pdfs for each distribution with $q_H$ and $q_L$. To
simplify algebra, we assume without loss of generality that income $y_j$ and signal $s_j$ are independent.

Banks do not observe household types directly but they observe applicants’ incomes and signal
$s$. 13 They can then infer the probability of a given type conditional on observed income. Specifically,
using Bayes’ law, the posterior probability of being High type for a household $i$ with signals $y_i$ and $s_i$ is
given by

$$\Pr(H|y_i,s_i) = \frac{\Pr(y_i|H)\Pr(s_i|H)\Pr(H)}{\Pr(y_i|H)\Pr(s_i|H)\Pr(H) + \Pr(y_i|L)\Pr(s_i|L)\Pr(L)} = \frac{\phi_H(y_i)\,q_H(s_i)\,\frac{1}{2}}{\phi_H(y_i)\,q_H(s_i)\,\frac{1}{2} + \phi_L(y_i)\,q_L(s_i)\,\frac{1}{2}} = \frac{\Phi(y_i)Q(s_i)}{\Phi(y_i)Q(s_i)+1}, \tag{6}$$

where $\Phi(y_i) \equiv \phi_H(y_i)/\phi_L(y_i)$ and $Q(s_i) \equiv q_H(s_i)/q_L(s_i)$ are the likelihood ratios. Given our
assumptions, we have $\Phi' > 0$ and $Q' > 0$, that is, High type households are monotonically more likely to
be observed as income $y$ or signal $s$ increase. Since there are only two types, it follows that

$$\Pr(L|y_i,s_i) = 1 - \Pr(H|y_i,s_i) = \frac{1}{\Phi(y_i)Q(s_i)+1}. \tag{7}$$

Clearly, $\partial\Pr(L|y_i,s_i)/\partial y_i < 0$, $\partial\Pr(L|y_i,s_i)/\partial s_i < 0$, $\partial\Pr(H|y_i,s_i)/\partial s_i > 0$, and $\partial\Pr(H|y_i,s_i)/\partial y_i > 0$.

12 We document in Appendix E that high-income households are indeed less likely to default than low-income
households.

13 Obviously, banks observe many other characteristics of households. We abstract from this additional information
available to banks to simplify derivations. One may interpret this approach as partialling out these other
characteristics. Typically, one of the important indicators of individual’s risk is individual’s credit score. In the
analysis in section 3, we show that the household’s income rank has explanatory power for the household’s debt
even after we control for the credit score.

Banks potentially have two margins to determine which borrowers obtain loans: 1) price of loans;
2) loan denial probability. While in reality banks are likely to use both margins, we consider polar cases
to illustrate the workings of each margin separately. For the price margin, we will assume that banks can
price discriminate borrowers perfectly, banks compete in all population segments, and banks can freely
obtain resources at rate 𝑅0 (“perfect competition”). For the loan denial probability, we assume that there

is only one bank serving the market but this bank is threatened by entry of other banks if this bank makes
a profit (“monopoly”).

4.1 Perfect Competition
With perfect competition and free entry in each lending segment, banks can have only one interest rate for
a borrower of a given quality. Since there is a continuum of borrower quality, there is also a continuum of
markets where each market is indexed by borrower quality. Consider a set of households with income 𝑦𝑖
and signal $s_i$. Given the zero-profit condition, the interest rate $R^*$ satisfies

$$R^*\{(1-d)\Pr(L|y_i,s_i) + \Pr(H|y_i,s_i)\} = R_0 \implies R^* = \frac{R_0}{(1-d)\Pr(L|y_i,s_i)+\Pr(H|y_i,s_i)} = R_0\,\frac{\Phi(y_i)Q(s_i)+1}{\Phi(y_i)Q(s_i)+(1-d)} = R^*(y_i,s_i). \tag{8}$$

Note that households with other levels of $y$ and $s$ pay the same interest rate as long as $\Phi(y_i)Q(s_i) = \Phi(y)Q(s)$. That is, each lending segment is characterized by a set of signal pairs

$$\mathcal{S}(R^*) = \left\{(y,s): R_0\,\frac{\Phi(y)Q(s)+1}{\Phi(y)Q(s)+(1-d)} = R^*\right\},$$

where $R^*$ is a sufficient statistic for the quality of borrowers. Because the quality of borrowers is the same in
$\mathcal{S}(R^*)$, every borrower in $\mathcal{S}(R^*)$ obtains a loan at the interest rate $R^*$. Borrowers of a worse quality are
offered loans at higher interest rates while borrowers of better quality can obtain a loan with a lower interest
rate.
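A small numerical sketch may help fix ideas for equations (6) through (8). With normal densities of equal variance, each likelihood ratio has a closed form, for example $\Phi(y) = \exp\{(\mu_H - \mu_L)(y - \bar{y})/\sigma^2\}$. The parameter values below are hypothetical and chosen only for illustration; the function names are likewise not from the paper.

```python
import math

# Hypothetical parameter values, for illustration only.
PARAMS = {"mu_H": 60.0, "mu_L": 40.0, "sigma": 15.0,   # income by type
          "rho_H": 1.0, "rho_L": 0.0, "omega": 1.0,    # second signal by type
          "d": 0.2, "R0": 1.05}                        # default prob., funding rate

def likelihood_ratio(x, mean_high, mean_low, sd):
    """Ratio of the N(mean_high, sd^2) density to the N(mean_low, sd^2) density at x."""
    return math.exp((mean_high - mean_low) * (x - 0.5 * (mean_high + mean_low)) / sd ** 2)

def odds_high(y, s, p=PARAMS):
    """Phi(y) * Q(s): posterior odds of the High type (prior odds are one)."""
    return (likelihood_ratio(y, p["mu_H"], p["mu_L"], p["sigma"])
            * likelihood_ratio(s, p["rho_H"], p["rho_L"], p["omega"]))

def posterior_high(y, s, p=PARAMS):
    """Equation (6): probability of the High type given income y and signal s."""
    x = odds_high(y, s, p)
    return x / (x + 1.0)

def zero_profit_rate(y, s, p=PARAMS):
    """Equation (8): interest rate at which expected repayment equals R0."""
    x = odds_high(y, s, p)
    return p["R0"] * (x + 1.0) / (x + 1.0 - p["d"])

for income in (30.0, 50.0, 70.0):
    print(income, round(posterior_high(income, 0.5), 3), round(zero_profit_rate(income, 0.5), 4))
```

The rate always lies between $R_0$ (a borrower known to be High type) and $R_0/(1-d)$ (a borrower known to be Low type).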
Clearly, $\partial R^*/\partial y < 0$ and $\partial R^*/\partial s < 0$, so that households with high income $y$ and strong signal $s$ pay lower rates because banks believe that these applicants are more likely to be of the High type. To see the tradeoff between $y$ and $s$, one can fix $R^*(y,s)$ at level $R^{\#}$ and find the required signal $s$ to allow a household to borrow at rate $R^{\#}$ given that this household has income $y$:

$$s^*(y) = Q^{-1}\left(\frac{1}{\Phi(y)} \times \frac{R_0 - R^{\#}(1-d)}{R^{\#} - R_0}\right) \tag{9}$$

where $Q^{-1}$ is the inverse function of $Q$. Given that $Q' > 0$ and $\Phi' > 0$, it follows that $\partial s^*(y)/\partial y < 0$.
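For readers who want the intermediate steps, equation (9) follows from inverting equation (8) at $R^*(y,s) = R^{\#}$. The sketch below assumes $R_0 < R^{\#} < R_0/(1-d)$ so that the constant $K$ is positive; this restriction is one reading of the setup rather than a condition stated in the text, and $K$ is shorthand introduced here.

```latex
\begin{align*}
R^{\#} &= R_0\,\frac{\Phi(y)Q(s)+1}{\Phi(y)Q(s)+(1-d)} \\
R^{\#}\bigl[\Phi(y)Q(s)+(1-d)\bigr] &= R_0\bigl[\Phi(y)Q(s)+1\bigr] \\
\Phi(y)Q(s)\,\bigl(R^{\#}-R_0\bigr) &= R_0 - R^{\#}(1-d) \\
Q\bigl(s^*(y)\bigr) &= \frac{1}{\Phi(y)}\times\frac{R_0-R^{\#}(1-d)}{R^{\#}-R_0}
  \equiv \frac{K}{\Phi(y)}, \qquad K>0 \\
Q'\bigl(s^*(y)\bigr)\,\frac{\partial s^*(y)}{\partial y} &= -\frac{K\,\Phi'(y)}{\Phi(y)^2} < 0
  \;\Longrightarrow\; \frac{\partial s^*(y)}{\partial y} < 0
\end{align*}
```

The last line differentiates both sides with respect to $y$ and uses $Q' > 0$, $\Phi' > 0$, and $\Phi > 0$.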

Although we (unlike loan officers) do not observe signal $s$ in the data, we can still calculate the interest rate paid on average by households with income $y$, which is observed by the econometrician:

$$R^*(y) = \int R^*(y,s)\left(\tfrac{1}{2}q_H(s) + \tfrac{1}{2}q_L(s)\right) ds \tag{10}$$

Given that $R^*(y,s)$ is differentiable and otherwise well behaved, as well as $\partial R^*(y,s)/\partial y < 0$, we have that

$$\frac{\partial R^*(y)}{\partial y} = \int \frac{\partial R^*(y,s)}{\partial y}\left(\tfrac{1}{2}q_H(s) + \tfrac{1}{2}q_L(s)\right) ds < 0. \tag{11}$$

Hence, the model predicts that the interest rate decreases in household income.

One can then consider a thought experiment of raising the income inequality in this economy without changing the mean level of income. Specifically, we increase the distance between $\mu_H$ and $\mu_L$ but the average income $\bar{y}$ is held constant. Because income levels are now a stronger signal of an applicant’s type, banks put a higher weight on signal $y$, hence the slope of the tradeoff becomes steeper as it takes a larger change in signal $s$ to justify lending at a given interest rate (see Panel A of Figure 7). This will lead to higher borrowing on the part of low-income households in low-inequality regions than in high-inequality regions because, in the former, banks are less sure about the underlying type of the applicant based on income and therefore are more willing to lend to households of different incomes. In other words, $R^*_{equal}(y) < R^*_{unequal}(y)$ when $y < \bar{y}$ and $R^*_{equal}(y) > R^*_{unequal}(y)$ when $y > \bar{y}$, where “equal” and “unequal” denote the level of inequality, captured by mean-preserving changes in $\mu_H$ and $\mu_L$. Panel B of Figure 7 illustrates this point. In short, banks charge lower interest rates to high-income households than to low-income households and the difference in the interest rates across income groups rises as the difference between these groups widens.14
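The comparison between the “equal” and “unequal” economies can be illustrated numerically. The sketch below averages the break-even rate over the unobserved signal, as in equation (10), for two values of the gap $\mu_H - \mu_L$ around the same mean income. It assumes normal income and signal densities with equal variances across types; the functional forms, the midpoint integration rule, and all parameter values are hypothetical choices made for illustration.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
R0 = 1.03                          # gross cost of funds
D = 0.40                           # expected loss share on a Low-type borrower
MEAN_INCOME, SD_Y = 50.0, 15.0     # mean income is held fixed across experiments
M_H, M_L, SD_S = 1.0, -1.0, 1.5    # signal distributions of High and Low types


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(x, mean_high, mean_low, sd):
    """High-type over Low-type normal density at x (equal variances assumed)."""
    midpoint = 0.5 * (mean_high + mean_low)
    return math.exp((mean_high - mean_low) * (x - midpoint) / sd ** 2)


def average_rate(y, gap, n=2000):
    """Average break-even rate at income y when mu_H - mu_L equals `gap`.

    The signal is integrated out with weights q_H(s)/2 + q_L(s)/2 using a
    midpoint rule on a wide grid.
    """
    mu_h = MEAN_INCOME + 0.5 * gap     # mean-preserving change in the two means
    mu_l = MEAN_INCOME - 0.5 * gap
    phi = likelihood_ratio(y, mu_h, mu_l, SD_Y)
    lo, hi = M_L - 8.0 * SD_S, M_H + 8.0 * SD_S
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        s = lo + (k + 0.5) * step
        x = phi * likelihood_ratio(s, M_H, M_L, SD_S)
        rate = R0 * (x + 1.0) / (x + (1.0 - D))
        weight = 0.5 * normal_pdf(s, M_H, SD_S) + 0.5 * normal_pdf(s, M_L, SD_S)
        total += rate * weight * step
    return total


if __name__ == "__main__":
    for income in (35.0, 50.0, 65.0):
        print(income, average_rate(income, gap=10.0), average_rate(income, gap=30.0))
```

In this parameterization the average rate is higher in the more unequal economy for incomes below the mean, lower for incomes above it, and unchanged at the mean, which matches the crossing point discussed in footnote 14.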
In another thought experiment, we study the effects of an increase in the supply of credit. Since
perfect competition prices each borrower type fairly, we can only increase the supply of credit by
reducing the cost of funds rate 𝑅0 . Equation (9) shows that a decrease in 𝑅0 shifts schedule 𝑠 ∗ (𝑦) down

and hence all borrowers enjoy a lower cost of credit.

A combination of a positive credit supply shock ($R_0$ decreases) and an increase in inequality ($\mu_H - \mu_L$ increases) can reconcile how all types of households increased their borrowing on average over the course of the mid-2000s with the cross-sectional variation in debt-accumulation patterns across income groups at different levels of local inequality documented in section 3. The supply shock by itself can explain only the former, while the increased inequality by itself can explain only the latter.

4.2 Monopoly
In practice, regulatory or informational constraints limit the ability of banks to charge different prices to different borrowers and therefore they often can charge only one rate or a limited number of rates for a given type of loan. To keep exposition simple, suppose that i) the market has only one bank and it is threatened by entry of other banks, ii) regulators impose a minimum quality of borrowers who may obtain loans (e.g., to qualify for Freddie Mac and Fannie Mae guarantees), and iii) the bank can charge only one rate $\bar{R}$.

14 Note that the value at which a household does not experience a change in the interest rate is equal to the average income $\bar{y}$. This value is insensitive to the level of inequality because by construction the average income is held constant and at the average income the likelihood ratios are equal to 1 and therefore the posterior probability is equal to 1/2. This value, however, can move in more complex models and alternative parameterizations.

To model assumption ii), we know that $R^*(y,s)$ can be used as a sufficient statistic for the quality of a borrower. The bank makes a profit on borrowers with $(y,s)$ such that $R^*(y,s) < \bar{R}$ and losses on borrowers with $(y,s)$ such that $R^*(y,s) > \bar{R}$. We will denote by $R^+$ the cutoff interest rate that meets the regulation requirements. With this cutoff rate, the threat of entry sets $\bar{R}$ at the level that yields zero profits as implied by assumption i):

$$\bar{R}\;\frac{\iint_{(y,s):\,R^*(y,s)\le R^+}\{(1-d)\Pr(L|y,s) + \Pr(H|y,s)\}\,\bar{\phi}(y)\,\bar{q}(s)\,dy\,ds}{\iint_{(y,s):\,R^*(y,s)\le R^+}\bar{\phi}(y)\,\bar{q}(s)\,dy\,ds} = R_0$$

where $\bar{\phi}(y) \equiv \tfrac{1}{2}\phi_L(y) + \tfrac{1}{2}\phi_H(y)$ and $\bar{q}(s) \equiv \tfrac{1}{2}q_L(s) + \tfrac{1}{2}q_H(s)$. Using the insight of equation (9), we can find the threshold level of signal $s$ such that a bank will lend to a household with income $y$:

$$s^+(y) = Q^{-1}\left(\frac{1}{\Phi(y)} \times \frac{R_0 - R^+(1-d)}{R^+ - R_0}\right) \tag{12}$$

As before, we have $\partial s^+(y)/\partial y < 0$. The set of households who obtain a loan is:
$$\mathcal{S}^+(R^+) = \left\{(y,s):\; R_0\,\frac{\Phi(y)Q(s)+1}{\Phi(y)Q(s)+(1-d)} \le R^+\right\}$$

The probability that a household with income $y$ is denied a loan is

$$\Pr(\text{denied loan}\,|\,y) = \Pr\bigl(s < s^+(y)\bigr) = \int_{-\infty}^{s^+(y)} \bar{q}(s)\,ds$$

Since $\partial s^+(y)/\partial y < 0$, it follows that $\partial \Pr(\text{denied loan}\,|\,y)/\partial y < 0$: the probability of loan denial decreases in income.

Now we repeat the thought experiment with rising inequality. Similar to the perfect competition case, it takes a larger increment in signal $s$ to compensate for a given decrease in income $y$ because income is a more informative signal. As a result, if the quality of lending standard $R^+$ is held constant, some low-income households may be denied a loan more often (see Panel C of Figure 7). Panel D of Figure 7 shows how the denial probability changes with rising inequality. The probability of denial increases for households with $y < \bar{y}$ and decreases for households with $y > \bar{y}$.
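The denial margin can be sketched in the same way. The code below evaluates the threshold signal of equation (12) and the implied denial probability for two levels of inequality. It assumes normal income and signal densities with equal variances across types, which gives $Q^{-1}$ a closed form, and it requires $R_0 < R^+ < R_0/(1-d)$ so that the logarithm is defined. These assumptions and all parameter values are hypothetical.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
R0 = 1.03                          # gross cost of funds
R_PLUS = 1.20                      # regulatory cutoff rate; must lie in (R0, R0/(1-D))
D = 0.40                           # expected loss share on a Low-type borrower
MEAN_INCOME, SD_Y = 50.0, 15.0     # mean income is held fixed across experiments
M_H, M_L, SD_S = 1.0, -1.0, 1.5    # signal distributions of High and Low types


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def threshold_signal(y, gap):
    """Threshold s+(y) from equation (12) when mu_H - mu_L equals `gap`."""
    k = (R0 - R_PLUS * (1.0 - D)) / (R_PLUS - R0)    # required value of Phi(y) * Q(s)
    log_phi = gap * (y - MEAN_INCOME) / SD_Y ** 2    # log of the income likelihood ratio
    signal_midpoint = 0.5 * (M_H + M_L)
    # With normal signals, log Q(s) = (M_H - M_L) * (s - midpoint) / SD_S**2.
    return signal_midpoint + SD_S ** 2 * (math.log(k) - log_phi) / (M_H - M_L)


def denial_probability(y, gap):
    """Pr(s < s+(y)) under the equal-weight mixture of the two signal densities."""
    s_plus = threshold_signal(y, gap)
    return (0.5 * normal_cdf((s_plus - M_H) / SD_S)
            + 0.5 * normal_cdf((s_plus - M_L) / SD_S))


if __name__ == "__main__":
    for income in (35.0, 50.0, 65.0):
        print(income, denial_probability(income, gap=10.0),
              denial_probability(income, gap=30.0))
```

With these inputs the denial probability falls with income, rises with inequality for applicants below mean income, and falls with inequality for applicants above it.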

In contrast to the perfect competition case, the monopoly case has two ways to model an increase in the supply of credit. First, one can continue to model it as a reduction in the cost of funds rate $R_0$. Second, one can model it as an increase in $R^+$, i.e., relaxing lending standards to cover high-risk borrowers. In the first case, a decrease in $R_0$ lowers $\bar{R}$ and thus makes credit cheaper for households with $R^* \le R^+$. However, it does not affect the interest rate for households with $R^* > R^+$ as these continue to receive no loans (they do not meet lending requirements). In the second case, an increase in $R^+$ raises $\bar{R}$ because a wider coverage now includes high-risk households and losses made on these high-risk households have to be compensated by larger profit margins on low-risk households. Thus, while credit is now available to a broader spectrum of households, the cost of borrowing increases for relatively high-income borrowers. On the other hand, the probability of obtaining a loan increases for all households as schedule $s^+(y)$ shifts down. Hence, although high-income households pay a higher price for credit, they are denied loans less frequently.
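The two credit supply experiments can be illustrated with a rough numerical sketch of the pooled zero-profit rate. The code below approximates the double integral in the zero-profit condition on a grid and reports the pooled rate and the share of approved applicants. It assumes normal densities with equal variances across types, and it expresses the lending standard as a minimum value of $\Phi(y)Q(s)$ that stays fixed when $R_0$ changes; both are simplifying assumptions of this sketch, and all parameter values are hypothetical.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
D = 0.40                               # expected loss share on a Low-type borrower
MU_H, MU_L, SD_Y = 60.0, 40.0, 15.0    # income distributions of High and Low types
M_H, M_L, SD_S = 1.0, -1.0, 1.5        # signal distributions of High and Low types


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(x, mean_high, mean_low, sd):
    """High-type over Low-type normal density at x (equal variances assumed)."""
    midpoint = 0.5 * (mean_high + mean_low)
    return math.exp((mean_high - mean_low) * (x - midpoint) / sd ** 2)


def quality_cutoff(r_plus, r0):
    """Smallest Phi(y) * Q(s) with R*(y, s) <= R+; needs r0 < r_plus < r0 / (1 - D)."""
    return (r0 - r_plus * (1.0 - D)) / (r_plus - r0)


def pooled_rate(x_min, r0, n=120):
    """Return (pooled rate, approved share) for the standard Phi(y) * Q(s) >= x_min."""
    y_lo, y_hi = MU_L - 5.0 * SD_Y, MU_H + 5.0 * SD_Y
    s_lo, s_hi = M_L - 5.0 * SD_S, M_H + 5.0 * SD_S
    dy, ds = (y_hi - y_lo) / n, (s_hi - s_lo) / n
    mass = 0.0      # probability mass of approved applicants
    repaid = 0.0    # mass weighted by the expected repayment share
    for i in range(n):
        y = y_lo + (i + 0.5) * dy
        phi_bar = 0.5 * normal_pdf(y, MU_L, SD_Y) + 0.5 * normal_pdf(y, MU_H, SD_Y)
        ratio_y = likelihood_ratio(y, MU_H, MU_L, SD_Y)
        for j in range(n):
            s = s_lo + (j + 0.5) * ds
            x = ratio_y * likelihood_ratio(s, M_H, M_L, SD_S)
            if x < x_min:
                continue    # applicant falls below the lending standard
            q_bar = 0.5 * normal_pdf(s, M_L, SD_S) + 0.5 * normal_pdf(s, M_H, SD_S)
            weight = phi_bar * q_bar * dy * ds
            mass += weight
            repaid += weight * (x + 1.0 - D) / (x + 1.0)
    return r0 * mass / repaid, mass


if __name__ == "__main__":
    base_cutoff = quality_cutoff(r_plus=1.20, r0=1.03)
    print("baseline        ", pooled_rate(base_cutoff, r0=1.03))
    print("cheaper funds   ", pooled_rate(base_cutoff, r0=1.01))
    print("relaxed standard", pooled_rate(quality_cutoff(r_plus=1.30, r0=1.03), r0=1.03))
```

Holding the approved set fixed, a lower cost of funds reduces the pooled rate proportionally. Relaxing the standard enlarges the approved set with lower-quality borrowers, which raises both the approval share and the pooled rate.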

Our model can therefore potentially account for why lower-income households accumulated relatively less debt in high-inequality regions than did similar households in low-inequality regions during the 2000s: banks in higher-inequality regions may have placed more weight on applicants’ incomes as a signal of their underlying creditworthiness and therefore channeled more funds toward higher-income applicants than did banks in lower-inequality regions. Under perfect competition, this differential access to funds is predicted to happen through higher interest rates being offered to low-income applicants than to high-income applicants, whereas under monopoly banking, our model predicts that banks will reject low-income applicants more frequently than high-income applicants. Because banking in the U.S. lies in between these two extremes, we expect both margins to be present in the data, a prediction to which we now turn.

5 Results from the Mortgage Application Data
Our model suggests that variation in inequality across regions should be reflected in lending decisions of
banks if regional inequality can be used to make inferences about applicants’ default probabilities. In this
section, we use information on mortgage applications from the publicly available Home Mortgage
Disclosure Act database (HMDA), 2001 – 2011, to test these implications.
The HMDA data are compiled from reports filed by mortgage lenders. The HMDA was passed by
Congress in 1975 and began requiring lenders to submit data reports in 1989. The initial intention of the act
according to the Consumer Financial Protection Bureau (2012) was to monitor the provision of credit in
urban neighborhoods. Later requirements to submit data reports were intended to monitor discriminatory
lending practices. Dell’Ariccia, Igan, and Laeven (2012) find that HMDA covers between 77% and 95% of
all mortgage originations from 2000 to 2006. Reporting criteria differ between depository and
nondepository institutions and across years. Depository institutions have typically been required to report if
they satisfy an asset threshold, make at least one home mortgage, are federally regulated or insured, and
have a branch in a metropolitan area. Nondepository institutions were required to report if the share of home
mortgages exceeded a threshold of all loan originations, the lender operated in an MSA, and met an asset
threshold. In 2004 the share threshold was supplemented with a level of home mortgage originations to
increase the coverage of the market. Lenders who file reports include detailed information on every
mortgage application received by the lender during a calendar year. All years of the data contain the size of
the loan, income on the application, location of the property down to the census tract, demographics of the
applicants, a lender identifier, and the action taken on the loan. Since 2004 the data include additional
information including a censored picture of interest rates and the loan’s lien status. We use a 15% random
sample of all HMDA records.
While the data are very detailed in many respects there are some limitations. First, the data do not
identify “piggyback” loans, i.e. loans with subordinate liens used to finance a larger first-lien loan. These
secondary loans can be used to lower financing costs and to avoid requirements that a loan being sold to
Fannie Mae or Freddie Mac be accompanied by private mortgage insurance if a traditional loan would not
meet certain standards. The HMDA does not require lenders to report piggyback loans if they are issued as
HELOCs and some piggyback loans might be issued by a lender not covered by HMDA. But some
piggyback loans are included in the dataset and, given that these loans are not identified as such, a
researcher might infer a much lower loan to value ratio than the actual loan to value on the property. Since
we are not able to identify piggyback loans reliably and these loans are relatively small, we drop all
applications where the loan-to-income (LTI) ratio is less than one. Second, we conduct the HMDA analysis
at the county level rather than the zip code level. Although the data are available at the census tract, we
aggregate to the county in order to use measures of inequality consistent with the CCP analysis. Finally, in
contrast to the CCP database, the HMDA data set does not track applicants over time and hence we do not
have a panel of applicants/borrowers.
We focus on supply-side variables in line with the theoretical predictions of the model. First, we
assess whether the probability of a loan being rejected is invariant to the applicant’s income rank interacted
with regional inequality. Second, we consider whether the probability of the loan being “high-interest”
(conditional on a loan application being approved) varies with inequality and the applicant’s rank. 15 Both of
these can be interpreted as directly capturing credit supply factors, namely whether banks use local
inequality to make inferences about applicants’ underlying types when one conditions on other observable
characteristics of the applicant such as the loan-to-income ratio in the application. If banks use an
applicant’s position in the income distribution to help make inferences about their underlying default risk, as
suggested by the model, then one would expect banks to reject otherwise similar applications by high-income applicants less frequently in high-inequality regions than in low-inequality regions, or equivalently to reject otherwise similar applications by low-income applicants more frequently in high-inequality regions than in low-inequality regions. By the same logic, we should observe low-income applicants being charged higher interest rates on their loans more frequently in high-inequality regions than in low-inequality regions.

15 The HMDA reporting guidelines require lenders to report the spread between the Treasury yield and the mortgage interest rate if the spread is greater than three percentage points for first-lien loans or five percentage points for subordinate-lien loans.

We test these predictions in a framework very similar to that used in the CCP data. For a given
outcome, we estimate the following regression 16

$$Outcome_{ict} = \alpha\,Rank_{ict} + \gamma\,Rank_{ict} \times Inequality_{c,2001} + \beta Z_{ict} + \lambda_c + error, \tag{6}$$

where $Rank_{ict}$ is the percentile rank of applicant $i$’s income within the pool of applicants in area $c$ in year $t$. 17 The inequality measure and the income distribution are defined at the county level. The explanatory variables in vector $Z_{ict}$ include indicators for whether or not the loan is for an owner-occupied property, several race categories and gender, as well as interactions of the applicant’s income rank with the share of applicants in the county who are nonwhite. 18 We also control for the loan-to-income ratio in the application. While we estimate these models with county fixed effects $\lambda_c$, the results are very similar if we use state fixed effects (Appendix Table A6). We restrict the analysis to loans for home purchases, applications where the loan-to-income ratio is at most eight and not less than one, loans where the reporter was explicitly making the origination decision, and where the loan did not fail because of incompleteness or because it was not pre-approved. Notice that we retain in the sample loans that are not denied but also not originated. Excluding these does not change our results. As before, we are interested in the sign of the interaction term between income rank and inequality, $\gamma$. All standard errors are clustered at the county level. The regressions are estimated separately for each year, 2001 – 2011. We use the log of
the 90/10 income ratio derived from the income imputed in the CCP data in 2001 as the measure of
inequality, but the results are essentially the same using the Gini coefficient derived from the Census data.
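The mechanics of this specification can be sketched on simulated data. The code below generates hypothetical applications with county effects, removes the county fixed effects by demeaning within county, and solves the two-regressor least-squares problem for $\alpha$ and $\gamma$. It is a simplified illustration rather than the estimation code behind Table 8: it omits the controls $Z_{ict}$ and the clustered standard errors, uses a continuous outcome instead of a rejection indicator, assumes the rank is measured on a 0-1 scale, and all names and parameter values are hypothetical.

```python
import random


def simulate(n_counties=60, n_per_county=50, alpha=-0.20, gamma=-0.10, seed=0):
    """Simulated rows of (county, rank, rank * inequality, outcome)."""
    rng = random.Random(seed)
    rows = []
    for county in range(n_counties):
        inequality = rng.uniform(1.1, 1.7)     # hypothetical log 90/10 income ratio
        county_effect = rng.gauss(0.0, 0.05)   # absorbed by the fixed effect
        for _ in range(n_per_county):
            rank = rng.random()                # income rank on a 0-1 scale
            outcome = (alpha * rank + gamma * rank * inequality
                       + county_effect + rng.gauss(0.0, 0.01))
            rows.append((county, rank, rank * inequality, outcome))
    return rows


def within_ols(rows):
    """Within-county (fixed effects) OLS of the outcome on rank and rank * inequality."""
    totals = {}
    for county, x1, x2, y in rows:
        acc = totals.setdefault(county, [0, 0.0, 0.0, 0.0])
        acc[0] += 1
        acc[1] += x1
        acc[2] += x2
        acc[3] += y
    s11 = s12 = s22 = s1y = s2y = 0.0
    for county, x1, x2, y in rows:
        n, t1, t2, ty = totals[county]
        d1, d2, dy = x1 - t1 / n, x2 - t2 / n, y - ty / n   # county-demeaned values
        s11 += d1 * d1
        s12 += d1 * d2
        s22 += d2 * d2
        s1y += d1 * dy
        s2y += d2 * dy
    det = s11 * s22 - s12 * s12
    alpha_hat = (s22 * s1y - s12 * s2y) / det
    gamma_hat = (s11 * s2y - s12 * s1y) / det
    return alpha_hat, gamma_hat


def rank_gap_effect(gamma_hat, sd_inequality, high_rank=0.8, low_rank=0.2):
    """Change in the high-minus-low rank outcome gap per one-SD rise in inequality."""
    return gamma_hat * sd_inequality * (high_rank - low_rank)


if __name__ == "__main__":
    alpha_hat, gamma_hat = within_ols(simulate())
    print(alpha_hat, gamma_hat, rank_gap_effect(gamma_hat, sd_inequality=0.2))
```

The helper `rank_gap_effect` shows one way to translate an estimate of $\gamma$ into the difference between applicants at the 80th and 20th percentile ranks for a one standard deviation change in inequality; the standard deviation passed to it here is a placeholder.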
We present the results for the probability of an application being rejected by a bank in Panel A of
Table 8 and results for the probability of a loan being high-interest, conditional on origination, in Panel B
of Table 8. For the probability of being rejected, the key finding is that estimated γ is consistently
negative: applications from high-ranked households in high-inequality regions are less likely to be
rejected than those from high-ranked households in low-inequality regions. This result is consistent with
the theoretical predictions of the model in which banks use an applicant’s position in the local income
distribution, along with the dispersion of that distribution, to make inferences about default risk. Using
our 2007 estimates, our results suggest that a one standard deviation increase in inequality will decrease the probability of denial of a household in the 80th percentile rank relative to the 20th percentile rank by approximately 2.3 percentage points. This is comparable in magnitude to the association between rank and the probability of denial. Similar results obtain with the probability of the loan being high-interest (this variable is not available before 2004): high-rank applicants are less likely to face higher rate loans in high-inequality regions than in low-inequality regions. Again, this is precisely the type of price discrimination predicted by the model. Doing the same calculation as above with the 2007 estimate, we find that high-rank households will see the probability that they pay a high-interest loan decline by 0.7 percentage points relative to low-rank households.

16 Our baseline specification includes a county fixed effect because the county-level controls are not as detailed as those we can construct in the CCP data.
17 The results we present are also robust to using a measure of an applicant’s rank relative to the distribution of income across all households in the county.
18 We include the share of non-whites as an additional control because previous studies suggested that banks may treat differentially areas with predominantly non-white population. See Turner and Skidmore (1999) for a review.

We can also consider whether the size of the mortgage (intensive margin) varies across inequality
regions and ranks within the income distribution by using the loan to income ratios associated with each
originated mortgage. We use the same controls as with rejection probabilities (with the exception of LTI
ratios) and county fixed effects. The results for each year are presented in Panel C of Table 8. Unlike with
mortgage rejection rates and interest rate premia, we find little evidence that loan-to-income ratios in
originated loans vary across households in different inequality regions. The estimates of $\gamma$ are almost always insignificantly different from zero, with 2004 and 2007 being the only exceptions. To the extent that requested loans reflect demand for credit by households, we again find little evidence that demand-side factors related to local inequality levels mattered for the debt-accumulation decisions of households.
However, the HMDA dataset does not allow us to establish if households have multiple loans or reliably
link piggyback loans to standard loans. Thus, while our results point mainly toward channels operating
through credit supply— namely through the banks’ use of a household’s income rank combined with the
amount of income inequality in that region to make inferences about applicants’ credit worthiness—more
work needs to be done to better understand the intensive margin.

6 Conclusions
Using household level measures of debt over the course of 2001 - 2012, we document a systematic link
between local levels of income inequality and the debt-accumulation decisions of households of different
income levels. Specifically, we find that low-income households in low-inequality regions accumulated
more debt during the mid-2000s than did low-income households in high-inequality regions, with reverse
(albeit smaller) effects operating for high-income households. While these results point to an economic
channel linking economic inequality and borrowing by households of different income groups, they are
inconsistent with “keeping up with the Joneses” being a significant force behind the great leveraging of
households over this period.
Instead, we argue that causality is likely to run from the banking system to households. We
develop a model where income inequality is informative for evaluating credit risk. In the model, this
channel leads to relatively more credit being allocated to low-income applicants when local inequality is
low rather than high, since higher levels of inequality imply that applicant incomes are stronger signals of
credit-worthiness. Consistent with this view, we document that lower-income mortgage applicants in
high-inequality regions are rejected more frequently and pay higher mortgage rates than similar applicants
in low-inequality regions. While it is possible that income inequality implicitly captures other factors that
are not included in the model or data, our findings suggest that the causality between inequality and debt
is running through the credit supply channel.
Our results support the notion that the growth in household borrowing during the mid-2000s was
driven in large part by credit supply expansions targeted toward lower-income households. This is
because we find no evidence for credit-demand forces such as “keeping up with the Joneses” effects in the
data and instead argue that causal links running from inequality to debt accumulation would point toward
less relative debt accumulation by low-income households during periods of rising inequality, the
opposite of what occurred in the U.S. during this time period.
However, to the extent that this expansion in the supply of credit to lower income households is
unlikely to continue (for example if it reflected a one-time securitization of household debt), our results
suggest that a continuation of recent trends toward rising inequality is likely to reduce access to credit for
lower-income households. Because limited access to credit restricts households’ ability to smooth their
consumption and to engage in long-term investments (e.g. sending children to college, retraining for
different careers), such differential access to credit could ultimately have negative longer term
consequences. To the extent that many of these activities likely have positive societal externalities not
captured in our model, such a development could have important policy implications.

References
Autor, David, Lawrence Katz, and Melissa S. Kearney. 2008. “Trends in U.S. Wage Inequality: Revising
the Revisionists.” Review of Economics and Statistics, Vol. 90(2), pp: 300–23.
Aguiar, Mark and Mark Bils. 2012. “Has Consumption Inequality Mirrored Income Inequality,”
University of Rochester, mimeo.
Aiyagari, S. R. 1994. “Uninsured Idiosyncratic Risk and Aggregate Saving.” Quarterly Journal of Economics, 109,
659-684.
Athreya, Kartik, Xuan Tam, and Eric Young. 2012. "A Quantitative Theory of Information and
Unsecured Credit," American Economic Journal: Macroeconomics, Vol. 4 (3): 153-183.
Bertrand, Marianne, and Adair Morse. 2013. “Trickle-Down Consumption,” NBER Working Paper No.
18883.
Blundell, Richard, Luigi Pistaferri, and Ian Preston. 2008. “Consumption Inequality and Partial
Insurance,” American Economic Review, 98, 1887–1921.
Bordo, Michael D. and Christopher M. Meissner, 2012. “Does Inequality Lead to a Financial Crisis?”
NBER Working Paper 17896.
Brown, Meta, Andrew Haughwout, Donghoon Lee, and Wilbert van der Klaauw. 2011. "Do We Know
What We Owe? A Comparison of Borrower- and Lender-Reported Consumer Debt." Federal
Reserve Bank of New York, Staff Report no. 523.

Charles, Kerwin K., Erik Hurst, and Nikolai Roussanov, 2009. “Conspicuous Consumption and Race,”
Quarterly Journal of Economics 124(2), 42-67.
Christen, Markus and Ruskin M. Morgan, 2005. “Keeping Up with the Joneses: Analyzing the Effect of
Income Inequality on Consumer Borrowing,” Quantitative Marketing and Economics 3, 145-173.
Consumer Financial Protection Bureau. 2012. “Supervision and Examination Manual”
Daly, Mary C. and Daniel J. Wilson, 2006. “Keeping Up with the Joneses and Staying Ahead of the
Smiths: Evidence from Suicide Data,” Federal Reserve Bank of San Francisco WP 2006-12.
Dell’Ariccia, Giovanni, Deniz Igan, and Luc Laeven. 2012. “Credit Booms and Lending Standards:
Evidence from the Subprime Mortgage Market,” Journal of Money, Credit and Banking, Vol. 44
(2-3): 367–384.
Drozd, Lukasz A., and Ricardo Serrano-Padial. 2013. “Modeling the Credit Card Revolution: The Role of
Debt Collection and Informal Bankruptcy,” FRB of Philadelphia Working Paper No. 13-12.
Elul, Ronel, Nicholas S. Souleles, Souphala Chomsisengphet, Dennis Glennon, and Robert Hunt. 2010.
"What "Triggers" Mortgage Default?" The American Economic Review Papers and Proceedings,
forthcoming. (FRB Philadelphia Working Paper 10-13).
Fay, Scott, Erik Hurst, and Michelle J. White. 1998. “The Bankruptcy Decision: Does Stigma Matter?,”
University of Michigan Working Paper No. 98-01.
Fay, Scott, Erik Hurst, and Michelle J. White, 2002. "The Household Bankruptcy Decision," American
Economic Review 92(3), 706-718.
Goldin Claudia, and Lawrence F. Katz, 2007. “Long-Run Changes in the U.S. Wage Structure:
Narrowing, Widening, Polarizing.” Brookings Papers on Economic Activity. Vol. 2, 135-165.
Gross, David B., and Nicholas S. Souleles, 2002. "An Empirical Analysis of Personal Bankruptcy and
Delinquency," Review of Financial Studies 15(1), 319-347.
Guerrieri, Veronica, Daniel Hartley, and Erik Hurst, 2013. “Endogenous Gentrification and Housing Price
Dynamics,” forthcoming in Journal of Public Economics.
Guven, Cahit and Bent E. Sorensen, 2012. “Subjective Well-Being: Keeping Up with the Joneses. Real or
Perceived?” Social Indicators Research 109, 439-469.
Heathcote, Jonathan, Fabrizio Perri, and Gianluca Violante, 2010. “Unequal We Stand: An Empirical
Analysis of Economic Inequality in the United States, 1967-2006,” Review of Economic
Dynamics, 13, 15-51.
Heathcote, Jonathan, Kjetil Storesletten, and Giovanni L. Violante. 2004. “The Macroeconomic Implications of Rising Wage Inequality in the United States.” Journal of Political Economy, 118(4),
681-722
Heffetz, Ori, 2011. “A Test of Conspicuous Consumption: Visibility and Income Elasticities,” Review of
Economics and Statistics 93(4) 1101-1117.
Huggett, Mark. 1993. “The Risk-Free Rate in Heterogeneous-Agent Incomplete-Insurance Economies.”
Journal of Economic Dynamics and Control 17 (September-November): 953-69.
Iacoviello, Matteo, 2008. “Household Debt and Income Inequality: 1963-2003,” Journal of Money, Credit
and Banking 40(5), 929-965.
Kennickell, Arthur B. 1991 “Imputation of the 1989 Survey of Consumer Finances," Proceedings of the
Section on Survey Research Methods, 1990 Joint Statistical Meetings, Atlanta, GA.
Kennickell, Arthur B., 1998, “Multiple imputation in the Survey of Consumer Finances,” Working paper,
Federal Reserve Board, available at: http://www.federalreserve.gov/pubs/oss/oss2/method.html.
Kopczuk, Wojciech and Saez, Emmanuel. 2004. “Top Wealth Shares in the United States, 1916–2000:
Evidence from Estate Tax Returns.” National Tax Journal, 57(2), pp. 445–87.
Krueger, Dirk, and Fabrizio Perri. 2006. "Does Income Inequality Lead to Consumption Inequality?
Evidence and Theory," Review of Economic Studies, Vol. 73(1): 163-193.
Kuhn, Peter, Peter Kooreman, Adriaan R. Soetevent, and Arie Kapteyn, 2010. “The Effects of Lottery
Prizes on Winners and their Neighbors: Evidence from the Dutch Postcode Lottery,” IZA
Discussion Paper 4950.

Lee, Donghoon, and Wilbert van der Klaauw. 2010. "An Introduction to the FRBNY Consumer Credit
Panel." Federal Reserve Bank of New York, Staff Report no. 479.
Luttmer, Erzo F. P., 2005. “Neighbors as Negatives: Relative Earnings and Well-Being,” Quarterly
Journal of Economics 120(3), 963-1002.
Maurer, Jurgen and Andre Meier, 2008. “Smooth It Like the Joneses? Estimating Peer-Group Effects in
Intertemporal Consumption Choice,” The Economic Journal 118, 454-476.
Munnell, Alicia H., Geoffrey M. B. Tootell, Lynn E. Browne, and James McEneaney, 1996. "Mortgage
Lending in Boston: Interpreting HMDA Data," American Economic Review 86(1), 25-53.
Neumark, David and Andrew Postlewaite, 1998. “Relative Income Concerns and the Rise in Married
Women’s Employment,” Journal of Public Economics 70, 157-183.
Perugini, Cristiano, Jens Holscher, and Simon Collie, 2013. “Inequality, Credit Expansion and Financial
Crises,” Munich Personal RePec Archive Paper 51336.
Piketty, Thomas and Emmanuel Saez. 2003. “Income Inequality in the United States: 1913-1998.”
Quarterly Journal of Economics, Vol. 118 (1), pp: 1–39.
Rajan, Raghuram G., 2010. Fault Lines: How Hidden Fault Lines Still Threaten the World Economy,
Princeton University Press, Princeton N.J.
Sanchez, Juan M., 2009. "The IT Revolution and the Unsecured Credit Market," FRB Richmond Working
paper no. 09-4.
Tootell, Geoffrey M. B., 1996. "Redlining in Boston: Do Mortgage Lenders Discriminate against
Neighborhoods?," Quarterly Journal of Economics 111(4), 1049-1079.
Turner, Margery Austin, and Felicity Skidmore, 1999. Mortgage Lending Discrimination: A Review of
Existing Evidence. The Urban Institute, Washington D.C.
Zizzo, Daniel J. and Andrew J. Oswald, 2001. “Are People Willing to Pay to Reduce Others’ Income,”
Annales d’Economie et de Statistique 63/64 39-65.

FIGURE 1: INEQUALITY AND DEBT IN THE U.S.

[Figure: time series plot, 1967 to 2012. Left axis: Household Debt to GDP Ratio. Right axis: Income Share of Top 5%.]

Note: The figure plots the income share of the top 5% of U.S. households (source: IRS) and the ratio of household
(and non-profit) total liabilities relative to GDP (source: Federal Reserve).

FIGURE 2: INEQUALITY ACROSS U.S. COUNTIES

[Figure: map of U.S. counties shaded by inequality. Legend bins: 1.0951489 - 1.3558859; 1.3558859 - 1.4067343; 1.4067343 - 1.4433647; 1.4433647 - 1.4748088; 1.4748088 - 1.5081176; 1.5081176 - 1.7359808; No data.]

Note: The figure plots inequality in 2001 at the county level. Inequality is measured as the difference in log expected
incomes at the 90th and 10th percentiles computed from the CCP. Darker counties are more unequal with each bin
representing a quintile of the distribution across counties.

FIGURE 3: CROSS-SECTIONAL INEQUALITY IN THE U.S.

[Figure: three density plots titled "Distribution of inequality by zip code", "Distribution of inequality by county", and "Distribution of inequality by state". Horizontal axis in each panel: inequality (CCP): p90-p10, from 0.8 to 1.8. Vertical axis: Density.]

Note: The figures plot the regional distribution of inequality, measured using differences in expected log income
between the 90th and 10th percentiles as computed from the CCP, at three levels of aggregation: zip code, county and
state level.

FIGURE 4: DEBT ACCUMULATION, INCOME RANK AND LOCAL INEQUALITY
A) 𝜶 < 𝟎, 𝜷 = 𝟎, 𝜸 = 𝟎

B) 𝜶 < 𝟎, 𝜷 > 𝟎, 𝜸 < 𝟎

C) 𝜶 < 𝟎, 𝜷 < 𝟎, 𝜸 > 𝟎, |𝜸| > |𝜷|

Note: The figure plots qualitative predictions for various theories of how borrowing and inequality interact. Panel A
shows a case where the local inequality is irrelevant for borrowing. Panel B demonstrates a special case of “keeping up
with the Joneses” when the debt accumulation of the richest household does not depend on the local inequality. Panel C
shows the case where increased inequality (𝐼𝐻 > 𝐼𝐿 ) allows high-income households to borrow more. See section 3.1 in
the text for details.

FIGURE 5: THE ESTIMATED EFFECT OF ONE SD INCREASE IN INEQUALITY ON DEBT ACCUMULATION
𝝈(𝑰𝒏𝒆𝒒𝒖𝒂𝒍𝒊𝒕𝒚) ∗ (𝜷 + 𝜸 ∗ 𝑹𝒂𝒏𝒌)

Panel A: Parsimonious Specification

Panel B: Specification with Full Set of Controls

Note: These figures plot the calculated effects of a one standard deviation increase in inequality using estimated
coefficients on rank, inequality, and the interaction of rank and inequality from the baseline specification (Table 3:
Panel A) and the specification with full controls (Table 3: Panel C).

FIGURE 6. DEBT ACCUMULATION BY LOW AND HIGH-RANK HOUSEHOLDS
AND LOCAL INEQUALITY, NONPARAMETRIC SPECIFICATION

Note: The figure shows the estimated coefficients on the income rank dummies from the nonparametric regressions of
the relative household debt accumulation between 2001 and year 𝑡. Each regression contains a dummy for income rank
below 0.2, a dummy for income rank above 0.8, and a full set of controls described in equation (3) and the county-specific fixed effects. The omitted category is the dummy for income rank between 0.2 and 0.8. The regressions are
estimated by year. In each year, the regression is estimated separately for each of the three categories: low-inequality
locations (below the 20th percentile of the inequality distribution across zip codes in 2001), mid-level inequality
locations (between the 20th and 80th percentiles), and high-inequality locations (above the 80th percentile). Each location
(zip code) is assigned to one of the three categories in 2001 and the assignment remains constant through 2002-2012.
The standard errors are clustered by zip code. The dotted lines show the 95%-confidence interval.
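The grouping of locations described in this note can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the zip-code identifiers are hypothetical, and the percentile interpolation rule (the default of Python's `statistics.quantiles`) is an assumption, since the note does not say how percentiles are computed.

```python
import statistics


def assign_inequality_groups(inequality_2001):
    """Assign each zip code to 'low', 'mid', or 'high' inequality.

    inequality_2001 maps a zip code to its 2001 inequality measure.
    The assignment uses 2001 values only, so it stays fixed in later years.
    """
    # quantiles(..., n=5) returns the 20th, 40th, 60th, and 80th percentile cut points.
    cuts = statistics.quantiles(inequality_2001.values(), n=5)
    p20, p80 = cuts[0], cuts[-1]
    groups = {}
    for zip_code, value in inequality_2001.items():
        if value < p20:
            groups[zip_code] = "low"
        elif value > p80:
            groups[zip_code] = "high"
        else:
            groups[zip_code] = "mid"
    return groups


if __name__ == "__main__":
    # Hypothetical inequality values for ten made-up zip codes.
    sample = {f"zip{i}": float(i) for i in range(1, 11)}
    print(assign_inequality_groups(sample))
```

Each year's regression would then be run separately on the households in each of the three groups.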


FIGURE 7. THEORETICAL EFFECTS OF A CHANGE IN INEQUALITY ON PROVISION OF CREDIT
Bank Sorting and Inequality under Perfect Competition
Panel A

Panel B

Bank Sorting and Inequality under Monopoly Banking
Panel C

Panel D

Note: Panel A shows the tradeoff 𝑠*(𝑦) for the baseline income distribution (“equal”) and a more unequal income
distribution (“unequal”). Panel B plots the interest rate for each income level and for different levels of income
inequality. In Panels A and B, banks can price discriminate perfectly. Panel C plots the sets of households with signals 𝑠 and
𝑦 who obtain loans under the “equal” and “unequal” income distributions. Shaded regions indicate combinations of
signals that yield an approved loan. Panel D plots the loan denial probability as a function of income. In Panels C and D, the
bank charges the same rate to all applicants.


TABLE 1: SUMMARY STATISTICS
                                                                   Percentiles
Category                            Mean  St. Dev.        10        25        50        75        90

Panel A: FRBNY Consumer Credit Panel/Equifax, Q3 2001
Age of head of household            42.6      11.0        28        34        42        51        58
Household size                       3.0       1.7         1         2         3         4         5
Housing debt                      56,423    99,938         0         0    12,351    83,255   156,082
Mortgage                          54,658    97,202         0         0     8,267    81,163   153,000
HELOC                              1,765    12,565         0         0         0         0         0
Auto loans                         6,876    11,543         0         0         0    10,805    21,376
Credit card limit                 30,459    36,452     1,609     6,127    19,320    42,288    73,009
Credit card balance                8,884    14,812       261     1,120     3,923    10,881    22,893
Student loan                       1,639     7,849         0         0         0         0     2,723
Consumer financing                   929     5,861         0         0         0       178     2,033
Other debt                         4,044    22,158         0         0         0         0    10,410
Total debt                        78,794   112,167     1,368     9,437    42,311   111,335   193,395
Bankruptcy rate                     0.12      0.32      0.00      0.00      0.00      0.00      1.00
Delinquency rate                    0.30      0.46      0.00      0.00      0.00      1.00      1.00
Credit card utilization rate        0.41      0.35      0.02      0.09      0.31      0.71      0.99

Panel B: Survey of Consumer Finances, 2001
Age of head of household            43.3      11.3        28        35        43        52        59
Household size                       2.8       1.4         1         2         2         4         5
Housing debt                      60,783   119,310         0         0    29,000    90,000   150,000
Mortgage debt                     57,643    90,243         0         0    27,000    88,000   147,000
HELOC                              3,140    73,981         0         0         0         0         0
Auto loans                         5,182     8,280         0         0         0     8,700    18,000
Credit card limit                 19,290    43,636     1,400     4,500    10,000    22,000    42,000
Credit card balance                2,586     5,459         0         0       500     3,000     7,200
Student loan                       2,271     9,786         0         0         0         0     5,000
Consumer financing
Other debt
Total debt                        70,822   121,163        30     6,140    40,000   101,000   164,800
Bankruptcy rate                     0.10      0.30      0.00      0.00      0.00      0.00      1.00
Delinquency rate                    0.05      0.21      0.00      0.00      0.00      0.00      0.00
Credit card utilization rate        0.27      0.34      0.00      0.00      0.08      0.47      0.93
Note: The sample is restricted to households whose head is 20-65 years old. The statistics are calculated
using sampling weights. Housing debt is the sum of Mortgage and HELOC. The credit card limit is the maximum of the
originally recorded credit card limit in the CCP and the credit card balance. The credit card utilization rate is calculated
using this credit card limit. The table shows statistics from the sample restricted to observations with a nonzero credit
card limit. The delinquency rate is the share of households with at least one member with an account that is 60 days or
more past due. The number of observations in Panel A is 7,710,406. The number of observations in Panel B is 14,356.
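The variable definitions in this note can be written out as a short sketch. The function names are hypothetical and the numbers in the example are made up; returning `None` for a zero limit is an assumption that mirrors the restriction to observations with a nonzero credit card limit.

```python
def housing_debt(mortgage, heloc):
    """Housing debt is the sum of mortgage debt and HELOC debt."""
    return mortgage + heloc


def effective_card_limit(recorded_limit, balance):
    """Credit card limit: the larger of the recorded limit and the balance."""
    return max(recorded_limit, balance)


def card_utilization(recorded_limit, balance):
    """Utilization rate computed with the effective limit.

    Returns None when the effective limit is zero, since such observations
    are outside the restricted sample.
    """
    limit = effective_card_limit(recorded_limit, balance)
    if limit <= 0:
        return None
    return balance / limit


if __name__ == "__main__":
    # Made-up household: recorded limit below the balance, so the balance is used as the limit.
    print(housing_debt(50_000, 2_000))
    print(card_utilization(500, 800))
    print(card_utilization(1_000, 250))
```

Because the limit is never below the balance under this definition, the utilization rate cannot exceed 1.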


TABLE 2: INCOME STATISTICS FROM SCF (ACTUAL) AND CCP (IMPUTED)

                                                          Percentiles
                            Mean  St. Dev.        10        25        50        75        90
Ln(Y), actual in SCF       10.62      0.91      9.47     10.09     10.67     11.20     11.62
Ln(Y), imputed in CCP      10.72      0.98      9.54     10.10     10.70     11.28     11.88
Note: The sample is restricted to households whose head is 20-65 years old and that have positive gross income. The
sample in the SCF is further restricted to remove outliers. See text for more details.


TABLE 3: BASELINE RESULTS ON HOUSEHOLD DEBT ACCUMULATION
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Parsimonious Specification
α        -1.23***   -2.04***   -2.86***   -3.32***   -3.81***   -3.98***   -3.85***   -3.74***   -3.38***   -3.02***   -2.66***
           (0.02)     (0.03)     (0.04)     (0.04)     (0.05)     (0.06)     (0.06)     (0.06)     (0.05)     (0.05)     (0.05)
β        -0.39***   -0.59***   -0.96***   -1.04***   -1.15***   -1.10***   -0.98***   -0.93***   -0.75***   -0.58***   -0.38***
           (0.01)     (0.01)     (0.02)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)     (0.02)     (0.02)     (0.02)
γ         0.63***    1.07***    1.58***    1.80***    2.05***    2.09***    1.95***    1.87***    1.62***    1.37***    1.10***
           (0.01)     (0.02)     (0.03)     (0.03)     (0.04)     (0.04)     (0.04)     (0.04)     (0.04)     (0.04)     (0.04)
N       5,925,610  5,449,695  4,837,540  4,387,387  4,050,160  3,792,576  3,581,989  3,438,004  3,295,854  3,178,324  3,069,446
R2          0.012      0.017      0.025      0.030      0.038      0.041      0.043      0.042      0.039      0.038      0.037

Panel B: Specification with Household Controls
α        -1.09***   -1.87***   -2.69***   -3.12***   -3.62***   -3.80***   -3.72***   -3.66***   -3.29***   -2.90***   -2.51***
           (0.02)     (0.03)     (0.04)     (0.05)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.05)
β        -0.41***   -0.56***   -0.83***   -0.94***   -1.07***   -1.12***   -1.09***   -1.10***   -1.01***   -0.92***   -0.79***
           (0.01)     (0.01)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.02)
γ         0.56***    0.93***    1.37***    1.59***    1.85***    1.95***    1.91***    1.89***    1.73***    1.53***    1.33***
           (0.01)     (0.02)     (0.03)     (0.04)     (0.04)     (0.05)     (0.05)     (0.05)     (0.04)     (0.04)     (0.04)
N       5,760,568  5,287,149  4,684,857  4,244,767  3,920,565  3,668,685  3,468,033  3,326,869  3,185,764  3,069,465  2,964,013
R2          0.047      0.057      0.062      0.069      0.074      0.080      0.088      0.091      0.097      0.107      0.119

Panel C: Specification with Household and Zip-Level Controls
α        -1.08***   -1.86***   -2.67***   -3.09***   -3.59***   -3.77***   -3.68***   -3.62***   -3.26***   -2.87***   -2.49***
           (0.02)     (0.03)     (0.04)     (0.05)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.06)     (0.05)
β        -0.31***   -0.44***   -0.62***   -0.68***   -0.76***   -0.79***   -0.76***   -0.76***   -0.69***   -0.61***   -0.53***
           (0.01)     (0.01)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)     (0.02)
γ         0.57***    0.94***    1.40***    1.63***    1.91***    2.02***    1.98***    1.97***    1.81***    1.61***    1.40***
           (0.01)     (0.02)     (0.03)     (0.03)     (0.04)     (0.05)     (0.05)     (0.05)     (0.04)     (0.04)     (0.04)
N       5,760,568  5,287,149  4,684,857  4,244,767  3,920,565  3,668,685  3,468,033  3,326,869  3,185,764  3,069,465  2,964,013
R2          0.048      0.058      0.064      0.071      0.076      0.082      0.090      0.093      0.098      0.109      0.120

Panel D: Specification with Zip-Level Fixed Effects
α        -1.08***   -1.86***   -2.66***   -3.09***   -3.56***   -3.76***   -3.68***   -3.61***   -3.25***   -2.85***   -2.47***
           (0.10)     (0.15)     (0.24)     (0.30)     (0.38)     (0.43)     (0.42)     (0.40)     (0.38)     (0.32)     (0.27)
γ         0.57***    0.94***    1.39***    1.63***    1.90***    2.00***    1.97***    1.96***    1.79***    1.59***    1.38***
           (0.07)     (0.10)     (0.17)     (0.21)     (0.27)     (0.31)     (0.30)     (0.28)     (0.26)     (0.22)     (0.18)
N       5,760,568  5,287,149  4,684,857  4,244,767  3,920,565  3,668,685  3,468,033  3,326,869  3,185,764  3,069,465  2,964,013
R2          0.052      0.061      0.068      0.076      0.082      0.088      0.096      0.099      0.105      0.115      0.126

Note: The table presents estimates of specifications (2), (3), (4) and (5) in Panels A through D respectively. Coefficient
α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year
indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of
local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local
inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is
indicated
indicated by ***, **, and * respectively. In Panels A-C, the standard errors are clustered by zip code; in Panel D,
standard errors are clustered by state. See sections 3.1 and 3.2 in the text for details.


TABLE 4: INTERACTIONS OF RANK WITH CREDIT SCORES AND INITIAL DEBT LEVELS
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Include Interaction of Household Credit Score and Local Inequality
α       -1.054***  -1.690***  -2.500***  -2.867***  -3.299***  -3.465***  -3.440***  -3.351***  -3.049***  -2.678***  -2.354***
         (0.0267)   (0.0394)   (0.0525)   (0.0643)   (0.0766)   (0.0834)   (0.0838)   (0.0842)   (0.0792)   (0.0741)   (0.0690)
β       -1.011***  -1.876***  -2.526***  -3.136***  -3.841***  -4.087***  -3.962***  -3.909***  -3.593***  -3.212***  -2.879***
         (0.0287)   (0.0417)   (0.0550)   (0.0691)   (0.0885)    (0.102)    (0.106)    (0.109)    (0.103)   (0.0979)   (0.0926)
γ        0.542***   0.802***   1.250***   1.441***   1.658***   1.758***   1.761***   1.729***   1.607***   1.425***   1.260***
         (0.0195)   (0.0289)   (0.0388)   (0.0475)   (0.0570)   (0.0621)   (0.0624)   (0.0625)   (0.0589)   (0.0551)   (0.0511)
φ       -0.745***  -1.925***  -2.730***  -3.744***  -4.911***  -5.286***  -4.914***  -4.600***  -4.234***  -3.845***  -3.523***
         (0.0494)   (0.0718)   (0.0931)    (0.116)    (0.146)    (0.168)    (0.173)    (0.178)    (0.167)    (0.157)    (0.149)
σ        0.864***   1.816***   2.388***   3.065***   3.831***   4.095***   3.947***   3.908***   3.577***   3.205***   2.874***
         (0.0351)   (0.0514)   (0.0666)   (0.0831)    (0.105)    (0.122)    (0.126)    (0.129)    (0.121)    (0.114)    (0.108)
N       3,971,367  3,621,115  3,182,620  2,862,799  2,631,983  2,453,874  2,314,493  2,215,144  2,116,638  2,036,909  1,964,385
R2          0.049      0.058      0.063      0.070      0.074      0.080      0.089      0.091      0.096      0.106      0.117

Panel B: Include Interaction of Initial Household Debt Level and Local Inequality
α       -0.516***  -1.171***  -2.017***  -2.422***  -2.970***  -3.069***  -2.916***  -2.814***  -2.316***  -1.848***  -1.309***
         (0.0275)   (0.0387)   (0.0489)   (0.0605)   (0.0732)   (0.0814)   (0.0849)   (0.0857)   (0.0802)   (0.0769)   (0.0710)
β       -0.312***  -0.452***  -0.670***  -0.758***  -0.878***  -0.910***  -0.881***  -0.857***  -0.770***  -0.659***  -0.556***
         (0.0118)   (0.0170)   (0.0224)   (0.0273)   (0.0329)   (0.0357)   (0.0370)   (0.0374)   (0.0365)   (0.0348)   (0.0328)
γ        0.233***   0.530***   0.987***   1.203***   1.481***   1.529***   1.460***   1.433***   1.221***   1.014***   0.744***
         (0.0200)   (0.0282)   (0.0359)   (0.0443)   (0.0540)   (0.0600)   (0.0627)   (0.0631)   (0.0591)   (0.0564)   (0.0520)
φ        -2.97***   -3.79***   -4.09***   -4.47***   -4.59***   -5.00***   -5.37***   -5.49***   -6.05***   -6.21***  -6.876***
          (0.089)    (0.115)    (0.125)    (0.147)    (0.167)    (0.200)    (0.214)    (0.213)    (0.199)    (0.214)    (0.195)
σ         1.67***    2.15***    2.49***    2.81***    3.05***    3.38***    3.54***    3.55***    3.67***    3.53***    3.71***
          (0.063)   (0.0824)    (0.891)    (0.105)    (0.122)    (0.147)    (0.158)    (0.153)    (0.144)    (0.152)    (0.140)
N       3,989,837  3,643,849  3,203,783  2,882,349  2,650,275  2,470,570  2,329,399  2,228,828  2,128,927  2,047,809  1,974,388
R2          0.053      0.061      0.064      0.070      0.074      0.079      0.088      0.091      0.098      0.109      0.124

Note: The table presents estimates of specification (3’) and (3’’) in section 3.2. Coefficient α corresponds to the partial correlation of household income rank and
debt accumulation between 2001 and the year indicated in each column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation
of local inequality and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Coefficient φ represents the
effect of each additional variable (household credit score in Panel A and initial household debt level in Panel B), while σ captures the interaction of this
household variable with local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by
***, **, and * respectively. The standard errors are clustered by zip code. In Panel B, coefficients φ and σ and the respective standard errors are multiplied by
10^6.


TABLE 5: HOUSEHOLD DEBT ACCUMULATION ALONG SUBSETS OF DATA
                     α           β           γ           N          R2

Panel A: Grouping Zip Codes by Census Region
Midwest      -2.619***   -0.385***    1.222***     873,543       0.096
               (0.110)     (0.054)     (0.084)
Northeast    -3.765***   -0.832***    2.141***     739,380       0.071
               (0.120)     (0.055)     (0.092)
South        -4.059***   -0.811***    2.145***   1,329,937       0.094
               (0.108)    (0.0457)     (0.077)
West         -6.078***   -1.558***    3.507***     725,825       0.056
               (0.176)     (0.072)     (0.125)

Panel B: Grouping Zip Codes by Average Credit Ratings
Low          -4.489***   -1.178***    2.483***   1,005,563       0.092
               (0.124)     (0.044)     (0.088)
Middle       -4.202***   -0.934***    2.251***   1,185,270       0.095
               (0.098)     (0.044)     (0.072)
High         -2.289***   -0.321***    1.214***   1,477,852       0.092
               (0.074)     (0.035)     (0.056)

Panel C: Grouping Zip Codes by Initial Average Debt-to-Income Ratios
Low          -2.189***   -0.380***    1.002***     960,459       0.070
               (0.171)     (0.066)     (0.117)
Middle       -3.101***   -0.606***    1.508***   1,244,084       0.081
               (0.127)     (0.053)     (0.092)
High         -3.754***   -0.783***    1.988***   1,464,142       0.090
               (0.108)     (0.050)     (0.086)

Panel D: Grouping Zip Codes by House Price Growth (2001-2005)
Low          -3.084***   -0.548***    1.536***     836,682       0.102
               (0.117)     (0.055)     (0.088)
Middle       -4.320***   -0.981***    2.456***     819,222       0.076
               (0.138)     (0.063)     (0.104)
High         -5.561***   -1.346***    3.139***     797,970       0.057
               (0.163)     (0.068)     (0.117)

Note: The table presents estimates of specification (4) in the text using household debt accumulation from 2001 to 2007.
Panel A presents separate estimates for households located in each of four Census regions. Panel B presents estimates
for households in zip codes with low, medium, or high initial average credit ratings. Panel C presents estimates for
households in zip codes with low, medium, or high initial average debt-to-income ratios. Panel D decomposes zip codes
by growth of house prices between 2001 and 2005. See section 3.3 in the text for details. Coefficient α corresponds to
the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each
column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality
and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Each
regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **,
and * respectively. The standard errors are clustered by zip code.


TABLE 6: MEASURING INEQUALITY AT DIFFERENT LEVELS OF AGGREGATION
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Inequality at the County Level
α       -1.174***  -2.073***  -3.108***  -3.949***  -4.756***  -5.179***  -5.055***  -4.996***  -4.560***  -4.176***  -3.631***
         (0.0865)    (0.134)    (0.252)    (0.321)    (0.417)    (0.475)    (0.493)    (0.475)    (0.452)    (0.445)    (0.382)
β       -0.241***  -0.310***  -0.456***  -0.548***  -0.570***   -0.578**   -0.519**   -0.501**   -0.475**   -0.467**   -0.426**
         (0.0423)   (0.0671)    (0.118)    (0.156)    (0.202)    (0.232)    (0.237)    (0.227)    (0.209)    (0.200)    (0.174)
γ        0.583***   0.986***   1.531***   1.993***   2.413***   2.626***   2.545***   2.534***   2.343***   2.170***   1.861***
         (0.0606)   (0.0943)    (0.175)    (0.224)    (0.293)    (0.334)    (0.344)    (0.330)    (0.314)    (0.309)    (0.264)
N       6,640,570  6,257,495  5,782,494  5,435,548  5,172,907  4,966,746  4,793,457  4,661,838  4,531,493  4,421,495  4,319,303
R2          0.048      0.060      0.070      0.079      0.086      0.091      0.098      0.100      0.105      0.115      0.125

Panel B: Inequality at the State Level
α        -0.926**  -1.710***   -2.852**  -4.036***  -5.283***  -5.651***  -5.592***  -5.545***  -4.969***  -4.482***  -3.795***
          (0.359)    (0.543)    (1.114)    (1.412)    (1.667)    (1.697)    (1.612)    (1.525)    (1.476)    (1.391)    (1.224)
β          0.0490     0.0832      0.254      0.478    0.839**   1.317***   1.472***   1.386***    1.193**    1.001**     0.863*
          (0.114)    (0.163)    (0.259)    (0.324)    (0.394)    (0.458)    (0.469)    (0.483)    (0.479)    (0.468)    (0.447)
γ           0.393     0.695*     1.280*    1.937**    2.616**    2.765**    2.711**    2.708**    2.409**    2.170**    1.770**
          (0.242)    (0.367)    (0.754)    (0.954)    (1.125)    (1.144)    (1.080)    (1.019)    (0.988)    (0.929)    (0.815)
N       7,015,125  6,704,094  6,344,116  6,088,596  5,893,406  5,737,576  5,600,035  5,490,380  5,383,103  5,293,822  5,209,929
R2          0.049      0.062      0.071      0.082      0.088      0.092      0.099      0.100      0.108      0.119      0.130

Note: The table presents estimates of specification (4) while measuring inequality at different levels of aggregation: county level in Panel A and state level in
Panel B. Coefficient α corresponds to the partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each
column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality and household debt accumulation.
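Coefficient γ is for the interaction of household income and local inequality. Each regression is run at the household level. Statistical significance at the 1%, 5%,
and 10% levels is indicated by ***, **, and * respectively. See section 3.4 in the text for details.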


TABLE 7: RESULTS BY FORM OF DEBT
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Mortgage Debt Accumulation
α        0.926***  -1.618***  -2.393***  -2.748***  -3.304***  -3.560***  -3.461***  -3.361***   3.089***  -2.778***  -2.468***
          (0.017)    (0.024)    (0.034)    (0.040)    (0.048)    (0.053)    (0.052)    (0.052)    (0.050)    (0.047)    (0.044)
β       -0.304***  -0.447***  -0.632***  -0.693***  -0.813***  -0.865***  -0.828***  -0.811***  -0.756***  -0.669***  -0.606***
          (0.008)    (0.011)    (0.015)    (0.017)    (0.021)    (0.023)    (0.023)    (0.023)    (0.022)    (0.022)    (0.020)
γ        0.568***   0.948***   1.393***   1.606***   1.936***   2.087***   2.021***   1.970***   1.823***   1.634***   1.462***
          (0.012)    (0.017)    (0.025)    (0.029)    (0.035)    (0.039)    (0.038)    (0.038)    (0.036)    (0.035)    (0.032)
N       5,759,852  5,286,511  4,684,155  4,244,067  3,919,926  3,667,964  3,467,395  3,326,197  3,185,052  3,068,773  2,963,305
R2          0.052      0.063      0.068      0.078      0.082      0.087      0.096      0.099      0.109      0.122      0.138

Panel B: Auto Debt Accumulation
α       -0.118***  -0.189***  -0.227***  -0.236***  -0.222***  -0.191***  -0.151***  -0.110***  -0.072***  -0.068***  -0.078***
        (0.00315)  (0.00433)  (0.00518)    (0.006)    (0.006)    (0.006)    (0.006)    (0.006)    (0.005)    (0.005)    (0.005)
β       -0.042***  -0.054***  -0.060***  -0.058***  -0.052***  -0.044***  -0.033***  -0.024***  -0.014***  -0.015***  -0.016***
          (0.001)    (0.002)    (0.003)    (0.003)    (0.003)    (0.003)    (0.003)    (0.003)    (0.002)    (0.002)    (0.003)
γ        0.060***   0.076***   0.087***   0.087***   0.078***   0.061***   0.042***   0.023***    0.009**   0.010***   0.017***
          (0.002)    (0.003)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)    (0.004)
N       5,761,261  5,287,505  4,684,632  4,244,481  3,920,470  3,668,662  3,468,178  3,327,099  3,185,871  3,069,547  2,964,371
R2          0.084      0.110      0.123      0.134      0.145      0.158      0.183      0.201      0.221      0.228      0.226

Panel C: Credit Card Balance Accumulation
α       -0.012***      0.005   0.030***   0.038***   0.048***   0.035***   0.030***   0.034***   0.045***   0.053***   0.061***
          (0.003)    (0.003)    (0.004)    (0.005)    (0.005)    (0.005)    (0.006)    (0.006)    (0.006)    (0.006)    (0.005)
β         0.003**   0.005***   0.008***   0.013***   0.014***   0.009***     0.005*    0.006**      0.002     -0.000      0.000
          (0.001)    (0.001)    (0.002)    (0.002)    (0.002)    (0.002)    (0.003)    (0.003)    (0.003)    (0.002)    (0.002)
γ       -0.005***  -0.007***  -0.013***  -0.018***  -0.017***   -0.010**     -0.003     -0.004    0.009**   0.016***   0.015***
          (0.002)    (0.003)    (0.003)    (0.003)    (0.004)    (0.004)    (0.004)    (0.005)    (0.004)    (0.040)    (0.004)
N       5,237,870  4,732,987  4,180,218  3,803,373  3,512,251  3,293,491  3,111,432  2,946,652  2,798,243  2,699,679  2,602,130
R2          0.084      0.119      0.144      0.154      0.167      0.161      0.159      0.164      0.202      0.232      0.251

Panel D: Credit Card Limits
α       -0.291***  -0.422***  -0.502***  -0.730***  -0.766***  -0.888***  -0.911***  -0.844***  -0.743***  -0.711***  -0.698***
          (0.006)    (0.009)    (0.010)    (0.013)    (0.014)    (0.016)    (0.017)    (0.016)    (0.015)    (0.015)    (0.015)
β       -0.063***  -0.097***  -0.132***  -0.166***  -0.180***  -0.208***  -0.203***  -0.237***  -0.241***  -0.228***  -0.214***
          (0.002)    (0.003)    (0.004)    (0.006)    (0.006)    (0.007)    (0.008)    (0.007)    (0.006)    (0.006)    (0.007)
γ        0.112***   0.189***   0.255***   0.313***   0.356***   0.393***   0.390***   0.471***   0.476***   0.478***   0.462***
          (0.004)    (0.006)    (0.007)    (0.010)    (0.010)    (0.012)    (0.013)    (0.012)    (0.011)    (0.011)    (0.011)
N       5,761,018  5,287,685  4,685,051  4,245,070  3,920,739  3,669,099  3,468,561  3,327,102  3,185,934  3,069,635  2,964,353
R2          0.042      0.069      0.102      0.127      0.131      0.139      0.143      0.165      0.204      0.227      0.237

Note: The table presents estimates of specification (4) for different forms of household debt: mortgage debt in Panel A,
auto debt in Panel B, credit card balances in Panel C and credit card limits in Panel D. Coefficient α corresponds to the
partial correlation of household income rank and debt accumulation between 2001 and the year indicated in each
column (relative to household’s 2001 income). Coefficient β corresponds to the partial correlation of local inequality
and household debt accumulation. Coefficient γ is for the interaction of household income and local inequality. Each
regression is run at the household level. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **,
and * respectively. See section 3.6 in the text for details.


TABLE 8: MORTGAGE APPLICATIONS AND LOCAL INEQUALITY
             2001       2002       2003       2004       2005       2006       2007       2008       2009       2010       2011

Panel A: Probability of Mortgage Application Being Rejected
α         0.287**    0.240**   0.222***   0.223***   0.195***   0.228***   0.169***   0.165***    0.116**   0.209***   0.265***
          (0.124)    (0.101)    (0.079)    (0.052)    (0.053)    (0.056)    (0.051)    (0.058)    (0.048)    (0.054)    (0.060)
γ       -0.441***  -0.355***  -0.314***  -0.331***  -0.320***  -0.320***  -0.256***  -0.253***  -0.185***  -0.293***  -0.356***
          (0.092)    (0.075)    (0.058)    (0.038)    (0.039)    (0.041)    (0.037)    (0.042)    (0.035)    (0.040)    (0.044)
N         644,680    647,685    722,326    790,699    890,889    798,332    577,110    395,574    371,967    382,851    359,100
R2          0.126      0.099      0.070      0.063      0.058      0.058      0.060      0.052      0.044      0.057      0.073

Panel B: Probability of Mortgage Being High-Interest (conditional on approval)
α                                          0.112***      0.095      0.081    0.103**     0.069*      0.012    0.063**    0.077**
                                            (0.041)    (0.058)    (0.063)    (0.045)    (0.042)    (0.026)    (0.026)    (0.036)
γ                                         -0.187***  -0.219***  -0.204***  -0.196***  -0.161***  -0.073***  -0.107***  -0.132***
                                            (0.030)    (0.042)    (0.045)    (0.033)    (0.030)    (0.019)    (0.019)    (0.027)
N                                           598,307    644,987    567,623    415,484    287,400    283,357    286,764    268,874
R2                                            0.113      0.175      0.141      0.085      0.072      0.057      0.094      0.092

Panel C: Loan-to-Income Ratios of Mortgage Applications (conditional on approval)
α       -0.661***  -0.698***  -0.823***  -0.872***  -0.614***  -0.623***  -0.785***  -0.704***  -0.702***  -0.790***  -0.720***
          (0.099)    (0.096)    (0.092)    (0.072)    (0.068)    (0.061)    (0.062)    (0.077)    (0.077)    (0.080)    (0.082)
γ           0.041      0.039      0.108   0.156***     -0.017     -0.005    0.101**      0.055      0.004      0.068      0.046
          (0.073)    (0.071)    (0.068)    (0.051)    (0.049)    (0.044)    (0.044)    (0.056)    (0.055)    (0.058)    (0.059)
N         501,296    513,101    565,412    598,307    644,987    567,623    415,484    287,400    283,357    286,764    268,874
R2          0.327      0.349      0.368      0.354      0.338      0.352      0.376      0.382      0.405      0.413      0.393

Note: The table presents estimates of specification (6) for different dependent variables as indicated in each panel. Coefficient α corresponds to the partial
correlation of applicant’s income rank and the dependent variable in the year indicated by each column. Coefficient γ corresponds to the interaction of local
inequality and applicant’s income rank. Statistical significance at the 1%, 5%, and 10% levels is indicated by ***, **, and * respectively. See section 5 in the
text for details.


APPENDIX
NOT FOR PUBLICATION


APPENDIX A: ADDITIONAL TABLES AND FIGURES
APPENDIX TABLE A1: ROBUSTNESS TO USING IRS MEASURE OF INEQUALITY
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Parsimonious Specification
α       -0.890***  -1.515***  -2.088***  -2.460***  -2.826***  -2.934***  -2.871***  -2.861***  -2.692***  -2.464***  -2.250***
         (0.0194)   (0.0293)   (0.0390)   (0.0484)   (0.0583)   (0.0641)   (0.0640)   (0.0638)   (0.0606)   (0.0558)   (0.0521)
β       -0.760***  -1.181***  -1.826***  -2.048***  -2.307***  -2.291***  -2.054***  -1.933***  -1.601***  -1.258***  -0.929***
         (0.0253)   (0.0380)   (0.0551)   (0.0672)   (0.0829)   (0.0896)   (0.0897)   (0.0895)   (0.0827)   (0.0747)   (0.0690)
γ        1.222***   2.216***   3.224***   3.742***   4.241***   4.208***   3.924***   3.892***   3.538***   3.049***   2.539***
         (0.0438)   (0.0659)   (0.0875)    (0.109)    (0.131)    (0.144)    (0.144)    (0.144)    (0.137)    (0.127)    (0.118)
N       5,924,527  5,448,830  4,837,105  4,387,141  4,049,988  3,792,440  3,581,901  3,437,924  3,295,790  3,178,262  3,069,406
R2          0.012      0.016      0.023      0.029      0.037      0.041      0.043      0.041      0.038      0.038      0.037

Panel B: Specification with Household and Regional Controls
α       -0.760***  -1.415***  -2.038***  -2.398***  -2.811***  -2.901***  -2.824***  -2.809***  -2.536***  -2.201***  -1.920***
         (0.0210)   (0.0309)   (0.0429)   (0.0533)   (0.0639)   (0.0701)   (0.0700)   (0.0700)   (0.0673)   (0.0620)   (0.0571)
β       -0.600***  -0.934***  -1.339***  -1.517***  -1.717***  -1.708***  -1.609***  -1.597***  -1.466***  -1.255***  -1.108***
         (0.0271)   (0.0394)   (0.0554)   (0.0687)   (0.0846)   (0.0934)   (0.0940)   (0.0949)   (0.0907)   (0.0843)   (0.0791)
γ        1.064***   1.951***   2.961***   3.561***   4.225***   4.353***   4.253***   4.337***   4.016***   3.524***   3.077***
         (0.0475)   (0.0698)   (0.0976)    (0.121)    (0.146)    (0.160)    (0.160)    (0.160)    (0.154)    (0.142)    (0.131)
N       5,759,501  5,286,304  4,684,443  4,244,552  3,920,426  3,668,580  3,467,968  3,326,809  3,185,721  3,069,425  2,963,983
R2          0.048      0.057      0.063      0.070      0.076      0.081      0.089      0.092      0.098      0.108      0.120

Note: The table reproduces the results in Table 3 of the text using the IRS measure of inequality rather than the CCP
measure. See section 3.2 in the text for details.


APPENDIX TABLE A2: ROBUSTNESS TO GEOGRAPHIC REGION
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Midwest
α       -0.887***  -1.550***  -2.137***  -2.241***  -2.598***  -2.619***  -2.507***  -2.531***  -2.211***  -1.923***  -1.635***
         (0.0416)   (0.0552)   (0.0730)   (0.0906)    (0.105)    (0.110)    (0.110)    (0.110)    (0.104)   (0.0993)   (0.0922)
β       -0.252***  -0.330***  -0.435***  -0.367***  -0.384***  -0.385***  -0.327***  -0.353***  -0.322***  -0.264***  -0.237***
         (0.0198)   (0.0266)   (0.0353)   (0.0422)   (0.0509)   (0.0542)   (0.0553)   (0.0540)   (0.0506)   (0.0483)   (0.0456)
γ        0.450***   0.747***   1.043***   1.047***   1.233***   1.222***   1.167***   1.226***   1.092***   0.960***   0.802***
         (0.0313)   (0.0413)   (0.0554)   (0.0688)   (0.0796)   (0.0844)   (0.0850)   (0.0834)   (0.0788)   (0.0751)   (0.0698)
N       1,310,459  1,214,540  1,089,234    994,179    926,258    873,543    829,532    799,413    767,757    742,214    717,954
R2          0.055      0.064      0.072      0.082      0.089      0.096      0.108      0.111      0.122      0.137      0.153

Panel B: Northeast
α       -0.883***  -1.603***  -2.422***  -2.904***  -3.481***  -3.765***  -3.725***  -3.610***  -3.411***  -3.005***  -2.708***
         (0.0372)   (0.0523)   (0.0717)   (0.0886)    (0.107)    (0.120)    (0.122)    (0.122)    (0.113)    (0.109)    (0.101)
β       -0.236***  -0.365***  -0.587***  -0.669***  -0.777***  -0.832***  -0.819***  -0.803***  -0.820***  -0.731***  -0.696***
         (0.0169)   (0.0244)   (0.0338)   (0.0413)   (0.0494)   (0.0548)   (0.0568)   (0.0565)   (0.0538)   (0.0514)   (0.0489)
γ        0.473***   0.826***   1.329***   1.619***   1.950***   2.141***   2.132***   2.074***   2.012***   1.797***   1.640***
         (0.0281)   (0.0395)   (0.0551)   (0.0677)   (0.0823)   (0.0920)   (0.0946)   (0.0940)   (0.0876)   (0.0848)   (0.0783)
N       1,105,516  1,025,502    919,783    843,686    785,904    739,380    702,128    674,140    645,604    623,447    602,832
R2          0.044      0.051      0.055      0.062      0.066      0.071      0.077      0.080      0.084      0.093      0.102

Panel C: South
α       -1.242***  -2.187***  -3.026***  -3.560***  -3.939***  -4.059***  -3.962***  -3.905***  -3.390***  -3.071***  -2.639***
         (0.0378)   (0.0547)   (0.0695)   (0.0847)   (0.0993)    (0.108)    (0.108)    (0.110)    (0.108)    (0.103)   (0.0992)
β       -0.364***  -0.541***  -0.716***  -0.802***  -0.847***  -0.811***  -0.764***  -0.779***  -0.648***  -0.605***  -0.500***
         (0.0157)   (0.0227)   (0.0289)   (0.0353)   (0.0419)   (0.0457)   (0.0462)   (0.0472)   (0.0462)   (0.0450)   (0.0431)
γ        0.668***   1.148***   1.613***   1.919***   2.092***   2.145***   2.113***   2.129***   1.869***   1.746***   1.509***
         (0.0266)   (0.0385)   (0.0491)   (0.0598)   (0.0704)   (0.0766)   (0.0761)   (0.0776)   (0.0765)   (0.0725)   (0.0700)
N       2,104,747  1,931,826  1,709,240  1,547,622  1,425,285  1,329,937  1,253,811  1,202,853  1,152,915  1,109,302  1,071,093
R2          0.056      0.068      0.076      0.085      0.089      0.094      0.103      0.107      0.115      0.128      0.141

Panel D: West
α       -1.644***  -2.848***  -4.113***  -4.832***  -5.695***  -6.078***  -5.941***  -5.899***  -5.584***  -4.794***  -4.232***
         (0.0546)   (0.0801)    (0.105)    (0.131)    (0.162)    (0.176)    (0.174)    (0.178)    (0.165)    (0.158)    (0.149)
β       -0.499***  -0.769***  -1.076***  -1.267***  -1.468***  -1.558***  -1.529***  -1.521***  -1.479***  -1.254***  -1.092***
         (0.0228)   (0.0329)   (0.0437)   (0.0534)   (0.0658)   (0.0719)   (0.0706)   (0.0742)   (0.0711)   (0.0671)   (0.0613)
γ        0.928***   1.573***   2.310***   2.738***   3.255***   3.507***   3.418***   3.400***   3.274***   2.789***   2.455***
         (0.0382)   (0.0567)   (0.0746)   (0.0932)    (0.116)    (0.125)    (0.123)    (0.127)    (0.117)    (0.113)    (0.105)
N       1,239,846  1,115,281    966,600    859,280    783,118    725,825    682,562    650,463    619,488    594,502    572,134
R2          0.039      0.047      0.048      0.051      0.053      0.056      0.061      0.062      0.065      0.071      0.083

Note: The table replicates the results in Panel A of Table 5 in the main text for each year in our sample.


APPENDIX TABLE A3: ROBUSTNESS TO AVERAGE LOCAL CREDIT RATINGS
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Low Average Credit Ratings
α       -0.676***  -1.496***  -2.324***  -3.087***  -3.960***  -4.489***  -4.488***  -4.474***  -4.067***  -3.590***  -3.190***
         (0.0372)   (0.0524)   (0.0704)   (0.0886)    (0.110)    (0.124)    (0.128)    (0.129)    (0.125)    (0.118)    (0.113)
β       -0.257***  -0.463***  -0.693***  -0.861***  -1.075***  -1.178***  -1.175***  -1.204***  -1.118***  -1.003***  -0.897***
         (0.0125)   (0.0180)   (0.0248)   (0.0310)   (0.0395)   (0.0443)   (0.0461)   (0.0471)   (0.0464)   (0.0445)   (0.0426)
γ        0.343***   0.786***   1.228***   1.686***   2.170***   2.483***   2.500***   2.540***   2.375***   2.136***   1.923***
         (0.0260)   (0.0367)   (0.0494)   (0.0625)   (0.0778)   (0.0875)   (0.0909)   (0.0915)   (0.0890)   (0.0837)   (0.0802)
N       1,818,129  1,653,710  1,424,164  1,243,808  1,110,832  1,005,563    922,130    869,345    816,880    768,349    729,247
R2          0.058      0.074      0.077      0.087      0.090      0.092      0.098      0.100      0.110      0.125      0.141

Panel B: Medium Average Local Credit Ratings
α       -1.226***  -2.123***  -3.045***  -3.519***  -4.070***  -4.202***  -4.289***  -4.198***  -3.778***  -3.383***  -2.912***
         (0.0349)   (0.0507)   (0.0655)   (0.0777)   (0.0909)   (0.0984)    (0.101)    (0.103)   (0.0979)   (0.0929)   (0.0893)
β       -0.394***  -0.534***  -0.750***  -0.822***  -0.906***  -0.934***  -0.943***  -0.927***  -0.853***  -0.760***  -0.640***
         (0.0155)   (0.0224)   (0.0294)   (0.0348)   (0.0405)   (0.0443)   (0.0460)   (0.0470)   (0.0449)   (0.0429)   (0.0408)
γ        0.664***   1.085***   1.615***   1.870***   2.176***   2.251***   2.340***   2.307***   2.111***   1.922***   1.665***
         (0.0253)   (0.0368)   (0.0479)   (0.0567)   (0.0669)   (0.0724)   (0.0744)   (0.0756)   (0.0717)   (0.0679)   (0.0655)
N       1,909,604  1,731,554  1,517,591  1,372,556  1,265,579  1,185,270  1,121,699  1,075,653  1,029,665    993,281    959,535
R2          0.052      0.062      0.074      0.084      0.091      0.095      0.104      0.105      0.110      0.120      0.130

Panel C: High Average Local Credit Ratings
α       -1.055***  -1.451***  -1.983***  -2.083***  -2.226***  -2.289***  -2.180***  -2.137***  -2.011***  -1.848***  -1.685***
         (0.0309)   (0.0427)   (0.0537)   (0.0614)   (0.0679)   (0.0736)   (0.0725)   (0.0749)   (0.0722)   (0.0693)   (0.0668)
β       -0.283***  -0.279***  -0.368***  -0.340***  -0.326***  -0.321***  -0.290***  -0.279***  -0.266***  -0.251***  -0.226***
         (0.0142)   (0.0199)   (0.0252)   (0.0287)   (0.0323)   (0.0349)   (0.0347)   (0.0357)   (0.0345)   (0.0329)   (0.0316)
γ        0.587***   0.701***   1.053***   1.094***   1.173***   1.214***   1.143***   1.119***   1.036***   0.940***   0.805***
         (0.0231)   (0.0320)   (0.0404)   (0.0460)   (0.0511)   (0.0557)   (0.0548)   (0.0561)   (0.0541)   (0.0520)   (0.0501)
N       2,032,835  1,901,885  1,743,102  1,628,403  1,544,154  1,477,852  1,424,204  1,381,871  1,339,219  1,307,835  1,275,231
R2          0.057      0.066      0.080      0.085      0.089      0.092      0.102      0.105      0.108      0.116      0.125

Note: The table replicates the results in Panel B of Table 5 in the main text for each year in our sample.


APPENDIX TABLE A4: ROBUSTNESS TO AVERAGE INITIAL DEBT-TO-INCOME RATIOS
             2002       2003       2004       2005       2006       2007       2008       2009       2010       2011       2012

Panel A: Low Average Initial Debt-to-Income Ratio
α       -0.681***  -1.128***  -1.696***  -1.843***  -2.132***  -2.189***  -2.097***  -2.142***  -1.790***  -1.674***  -1.421***
         (0.0484)   (0.0684)   (0.0988)    (0.123)    (0.151)    (0.171)    (0.169)    (0.166)    (0.159)    (0.155)    (0.141)
β       -0.181***  -0.230***  -0.350***  -0.346***  -0.366***  -0.380***  -0.351***  -0.390***  -0.327***  -0.316***  -0.284***
         (0.0182)   (0.0262)   (0.0378)   (0.0479)   (0.0585)   (0.0661)   (0.0663)   (0.0661)   (0.0640)   (0.0616)   (0.0599)
γ        0.346***   0.513***   0.820***   0.869***   0.981***   1.002***   0.949***   1.023***   0.881***   0.874***   0.744***
         (0.0332)   (0.0469)   (0.0671)   (0.0837)    (0.103)    (0.117)    (0.115)    (0.113)    (0.109)    (0.107)   (0.0961)
N       1,550,820  1,419,702  1,246,854  1,124,143  1,033,758    960,459    900,996    861,336    821,234    786,886    756,966
R2          0.046      0.055      0.057      0.064      0.066      0.070      0.078      0.083      0.094      0.110      0.126

Panel B: Medium Average Initial Debt-to-Income Ratio
α       -0.864***  -1.353***  -2.150***  -2.556***  -2.929***  -3.101***  -3.040***  -2.928***  -2.623***  -2.196***  -1.829***
         (0.0376)   (0.0540)   (0.0766)   (0.0924)    (0.115)    (0.127)    (0.127)    (0.129)    (0.119)    (0.112)    (0.105)
β       -0.212***  -0.252***  -0.425***  -0.490***  -0.554***  -0.606***  -0.565***  -0.543***  -0.504***  -0.413***  -0.340***
         (0.0161)   (0.0226)   (0.0320)   (0.0385)   (0.0482)   (0.0533)   (0.0528)   (0.0548)   (0.0506)   (0.0484)   (0.0453)
γ        0.423***   0.585***   1.020***   1.234***   1.419***   1.508***   1.497***   1.454***   1.342***   1.142***   0.942***
         (0.0269)   (0.0386)   (0.0549)   (0.0664)   (0.0830)   (0.0917)   (0.0915)   (0.0927)   (0.0861)   (0.0808)   (0.0756)
N       1,942,792  1,785,659  1,581,595  1,436,540  1,326,941  1,244,084  1,176,586  1,129,839  1,083,324  1,044,333  1,009,370
R2          0.048      0.059      0.061      0.070      0.075      0.081      0.091      0.095      0.103      0.116      0.129

Panel C: High Average Initial Debt-to-Income Ratio
α       -1.212***  -1.964***  -2.748***  -3.115***  -3.616***  -3.754***  -3.653***  -3.622***  -3.227***  -2.849***  -2.492***
         (0.0349)   (0.0507)   (0.0697)   (0.0848)    (0.101)    (0.108)    (0.109)    (0.107)    (0.103)   (0.0954)   (0.0886)
β       -0.367***  -0.495***  -0.632***  -0.684***  -0.765***  -0.783***  -0.757***  -0.758***  -0.669***  -0.596***  -0.532***
         (0.0161)   (0.0235)   (0.0320)   (0.0389)   (0.0470)   (0.0502)   (0.0518)   (0.0512)   (0.0504)   (0.0472)   (0.0428)
γ        0.647***   0.980***   1.416***   1.617***   1.901***   1.988***   1.939***   1.947***   1.732***   1.528***   1.331***
         (0.0272)   (0.0398)   (0.0551)   (0.0670)   (0.0803)   (0.0855)   (0.0865)   (0.0851)   (0.0815)   (0.0755)   (0.0697)

2,266,956
0.053

2,081,788
0.062

1,856,408
0.068

1,684,084
0.076

1,559,866
0.082

1,464,142
0.090

1,390,451
0.100

1,335,694
0.101

1,281,206
0.105

1,238,246
0.112

1,197,677
0.121

γ

N
R2

Note: The table replicates the results in Panel C of Table 5 in the main text for each year in our sample.


APPENDIX TABLE A5: ROBUSTNESS TO AVERAGE HOUSE PRICE GROWTH (2001-2005)
Year | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012

Panel A: Low Average House Price Growth
α | -1.222*** (0.0453) | -2.145*** (0.0640) | -2.751*** (0.0839) | -3.007*** (0.0963) | -3.117*** (0.108) | -3.084*** (0.117) | -3.488*** (0.126) | -3.912*** (0.138) | -3.337*** (0.127) | -2.793*** (0.119) | -2.283*** (0.110)
β | -0.366*** (0.0202) | -0.540*** (0.0285) | -0.632*** (0.0373) | -0.628*** (0.0430) | -0.561*** (0.0496) | -0.548*** (0.0549) | -0.658*** (0.0603) | -0.808*** (0.0672) | -0.670*** (0.0593) | -0.524*** (0.0565) | -0.440*** (0.0545)
γ | 0.667*** (0.0331) | 1.157*** (0.0470) | 1.458*** (0.0617) | 1.578*** (0.0708) | 1.577*** (0.0806) | 1.536*** (0.0884) | 1.851*** (0.0954) | 2.163*** (0.104) | 1.832*** (0.0948) | 1.529*** (0.0886) | 1.228*** (0.0822)
N | 1,291,855 | 1,189,375 | 1,050,610 | 957,101 | 889,277 | 836,682 | 782,313 | 733,016 | 697,421 | 672,838 | 658,547
R2 | 0.055 | 0.067 | 0.081 | 0.093 | 0.098 | 0.102 | 0.110 | 0.108 | 0.116 | 0.126 | 0.140

Panel B: Medium Average House Price Growth
α | -1.198*** (0.0417) | -1.918*** (0.0634) | -2.828*** (0.0818) | -3.174*** (0.101) | -4.026*** (0.125) | -4.330*** (0.138) | -3.964*** (0.142) | -3.643*** (0.139) | -3.464*** (0.132) | -3.116*** (0.127) | -2.833*** (0.117)
β | -0.373*** (0.0204) | -0.481*** (0.0298) | -0.660*** (0.0386) | -0.670*** (0.0460) | -0.880*** (0.0584) | -0.981*** (0.0632) | -0.816*** (0.0656) | -0.691*** (0.0645) | -0.760*** (0.0629) | -0.649*** (0.0608) | -0.639*** (0.0552)
γ | 0.675*** (0.0307) | 1.014*** (0.0468) | 1.549*** (0.0616) | 1.728*** (0.0761) | 2.277*** (0.0949) | 2.462*** (0.104) | 2.179*** (0.105) | 1.967*** (0.103) | 1.931*** (0.0981) | 1.752*** (0.0942) | 1.637*** (0.0863)
N | 1,313,788 | 1,194,060 | 1,059,157 | 970,151 | 897,573 | 819,222 | 754,658 | 729,503 | 701,465 | 673,369 | 654,435
R2 | 0.051 | 0.061 | 0.061 | 0.065 | 0.069 | 0.076 | 0.091 | 0.096 | 0.101 | 0.111 | 0.120

Panel C: High Average House Price Growth
α | -1.287*** (0.0449) | -2.193*** (0.0665) | -3.399*** (0.0911) | -4.486*** (0.124) | -5.332*** (0.153) | -5.561*** (0.163) | -5.057*** (0.153) | -4.616*** (0.145) | -4.144*** (0.137) | -3.710*** (0.128) | -3.325*** (0.121)
β | -0.378*** (0.0196) | -0.547*** (0.0291) | -0.848*** (0.0395) | -1.125*** (0.0526) | -1.335*** (0.0636) | -1.346*** (0.0677) | -1.252*** (0.0651) | -1.112*** (0.0625) | -1.008*** (0.0618) | -0.896*** (0.0575) | -0.776*** (0.0541)
γ | 0.690*** (0.0323) | 1.124*** (0.0477) | 1.834*** (0.0657) | 2.509*** (0.0887) | 2.982*** (0.109) | 3.139*** (0.117) | 2.861*** (0.111) | 2.625*** (0.106) | 2.385*** (0.100) | 2.167*** (0.0932) | 1.932*** (0.0880)
N | 1,365,724 | 1,237,680 | 1,072,853 | 935,547 | 845,133 | 797,970 | 777,522 | 752,625 | 717,752 | 690,702 | 651,403
R2 | 0.043 | 0.048 | 0.048 | 0.051 | 0.052 | 0.057 | 0.065 | 0.071 | 0.074 | 0.083 | 0.093

Note: The table replicates the results in Panel D of Table 5 in the main text for each year in our sample.


APPENDIX TABLE A6: MORTGAGE APPLICATIONS AND LOCAL INEQUALITY WITH STATE FE
Year | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011

Panel A: Probability of Mortgage Application Being Rejected
α | 0.297** (0.130) | 0.241** (0.104) | 0.229*** (0.083) | 0.242*** (0.054) | 0.219*** (0.055) | 0.244*** (0.057) | 0.177*** (0.052) | 0.152** (0.059) | 0.115** (0.048) | 0.203*** (0.054) | 0.260*** (0.060)
β | 0.446*** (0.054) | 0.374*** (0.040) | 0.342*** (0.028) | 0.374*** (0.026) | 0.358*** (0.029) | 0.351*** (0.028) | 0.314*** (0.025) | 0.306*** (0.027) | 0.223*** (0.026) | 0.260*** (0.029) | 0.286*** (0.032)
γ | -0.463*** (0.096) | -0.364*** (0.077) | -0.327*** (0.062) | -0.356*** (0.039) | -0.346*** (0.040) | -0.339*** (0.041) | -0.270*** (0.038) | -0.251*** (0.043) | -0.191*** (0.035) | -0.296*** (0.040) | -0.362*** (0.045)
N | 644680 | 647685 | 722326 | 790699 | 890889 | 798332 | 577110 | 395574 | 371967 | 382851 | 359100
R2 | 0.090 | 0.071 | 0.050 | 0.048 | 0.045 | 0.046 | 0.046 | 0.035 | 0.027 | 0.035 | 0.044

Panel B: Probability of Mortgage Being High-Interest (conditional on approval)
α |  |  |  | 0.132*** (0.042) | 0.140** (0.059) | 0.119* (0.062) | 0.105** (0.046) | 0.065 (0.041) | 0.007 (0.026) | 0.055** (0.026) | 0.062* (0.036)
β |  |  |  | 0.241*** (0.026) | 0.284*** (0.036) | 0.270*** (0.038) | 0.208*** (0.027) | 0.169*** (0.026) | 0.075*** (0.015) | 0.114*** (0.014) | 0.120*** (0.020)
γ |  |  |  | -0.208*** (0.031) | -0.262*** (0.042) | -0.244*** (0.045) | -0.205*** (0.033) | -0.165*** (0.030) | -0.073*** (0.019) | -0.104*** (0.019) | -0.122*** (0.026)
N |  |  |  | 598307 | 644987 | 567623 | 415484 | 287400 | 283357 | 286764 | 268874
R2 |  |  |  | 0.099 | 0.160 | 0.123 | 0.064 | 0.047 | 0.028 | 0.042 | 0.043

Panel C: Loan-to-Income Ratios of Mortgage Applications (conditional on approval)
α | -0.696*** (0.099) | -0.733*** (0.096) | -0.853*** (0.096) | -0.924*** (0.075) | -0.664*** (0.073) | -0.655*** (0.067) | -0.797*** (0.068) | -0.697*** (0.078) | -0.728*** (0.080) | -0.802*** (0.083) | -0.731*** (0.083)
β | -0.219*** (0.050) | -0.231*** (0.051) | -0.265*** (0.053) | -0.304*** (0.051) | -0.223*** (0.046) | -0.218*** (0.041) | -0.240*** (0.043) | -0.167*** (0.052) | -0.114** (0.050) | -0.138*** (0.048) | -0.092** (0.047)
γ | 0.069 (0.073) | 0.070 (0.070) | 0.138* (0.070) | 0.207*** (0.053) | 0.032 (0.052) | 0.029 (0.048) | 0.118** (0.048) | 0.056 (0.057) | 0.025 (0.057) | 0.077 (0.060) | 0.053 (0.060)
N | 501296 | 513101 | 565412 | 598307 | 644987 | 567623 | 415484 | 287400 | 283357 | 286764 | 268874
R2 | 0.289 | 0.310 | 0.327 | 0.318 | 0.308 | 0.322 | 0.344 | 0.344 | 0.365 | 0.376 | 0.357

Note: The table replicates the results in Table 8 using state fixed effects rather than county fixed effects. Standard errors
are clustered at the county level.


APPENDIX B: ADDITIONAL INFORMATION ON CCP DATA
The Equifax FRBNY Consumer Credit Panel is a longitudinal database with detailed information on
consumer debt and credit. The core of the database is a 5% random sample of all U.S. individuals
with credit (i.e., the primary sample). The database also contains information on all individuals with credit
files residing in the same household as the individuals in the primary sample. The household members are
added to the sample based on the mailing address in the existing credit files. Thus, the resulting sample is a
sample of U.S. households in which at least one member has a credit file.
The individual records in the CCP contain information on the mortgage debt, credit card debt and credit card
limits, home equity lines of credit, student loans, auto loans, bankruptcy and delinquencies. The data include
residential location on the census block level and the birth year of individuals. The data in the CCP are
updated quarterly. We use 100% of the CCP sample.
The unit of the analysis in the paper is a household. The CCP is primarily an individual-level dataset;
however, it contains two identifiers that allow us to construct the household records in each period and then
link the household records from period to period. In each quarter, a unique identifier is given for all
individuals who reside in the same household as an individual in the primary sample. We use this identifier
to aggregate the individual level information to construct the household level credit variables.
The household identifier identifies household members only in one period. We then use the second identifier
in the CCP data, an individual identifier that remains constant from period to period, to link household
records from one quarter to another. To construct the longitudinal household record, we proceed as follows.
Let i denote the identification number of a household in 2001. To identify the continuation of household i in
year t, t > 2001, we first determine what members of household i are present in year t using individual
identifiers. Then we determine the identification number of the household to which these members belong in
year t. If there is a unique household to which these members belong in year t and the new household does
not have any members from any other household in year 2001, we identify this new household as a
continuation record for household i. While the primary sample of individuals in the CCP is a random sample
of all U.S. individuals with credit reports, the resulting sample of households is not random. Following
Lee and van der Klaauw (2010), we define the sampling weights as the inverse of the probability of being
included in the sample, $w_h = \frac{1}{1 - .05N}$, where N is the number of individuals in the household who are in the
primary sample.
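
The household-linking rule described above can be illustrated with a minimal Python sketch. The function name `link_households` and the dictionary inputs (individual identifier mapped to household identifier) are hypothetical and chosen only for illustration; they are not the CCP's actual field names or the authors' code.

```python
from collections import defaultdict

def link_households(members_2001, members_t):
    """Return {household_2001: household_t} for continuation records.

    members_2001, members_t: hypothetical dicts mapping a constant
    individual identifier to that period's household identifier.
    """
    origins = defaultdict(set)  # year-t household -> 2001 households feeding it
    targets = defaultdict(set)  # 2001 household -> year-t households of its members
    for person, hh_2001 in members_2001.items():
        hh_t = members_t.get(person)
        if hh_t is None:
            continue  # member of the 2001 household not observed in year t
        origins[hh_t].add(hh_2001)
        targets[hh_2001].add(hh_t)

    links = {}
    for hh_2001, hh_ts in targets.items():
        if len(hh_ts) != 1:
            continue  # members are split across several year-t households
        (hh_t,) = hh_ts
        if origins[hh_t] == {hh_2001}:
            links[hh_2001] = hh_t  # no members from any other 2001 household
    return links
```

In this sketch, individuals who appear only in year t do not block a link, which matches the rule that the new household must not contain members of any other 2001 household.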
For each individual, the data contain a record of her debt by detailed category as well as a record of the
balances on the joint or cosigned accounts. In aggregating the debt on the household level, we use a
correction to avoid double counting of the balances on joint accounts. This choice follows Brown,
Haughwout, Lee and van der Klaauw (2011). In particular, while aggregating, we discount the total debt of
the household members by 50% of the total debt on joint accounts of the household members. The exact
formula that we use is
$$d_{h,j} = \max\left\{ \sum_i \left( d^{i}_{h,j} - .5\, d^{i,c}_{h,j} \right),\ .5\, d^{i,c}_{h,j} \right\},$$
where $d^{i}_{h,j}$ is the total debt in category j of member i in household h and $d^{i,c}_{h,j}$ is the debt in joint accounts.

The second input to the maximum function addresses the situation that arises with so-called “thin” credit
records, or records with at most two credit report-worthy debts. The individuals with thin records are not
included in the primary sample, but they are included in the additional sample. These individuals might have
records on joint accounts that are missed on individual accounts. We thank Donghoon Lee for this
suggestion.
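
As a rough illustration of the joint-account correction, the following Python sketch aggregates one debt category for one household. The function name `household_debt` is hypothetical. The formula does not state how the second input to the maximum is aggregated across members, so the sketch assumes the joint balances are summed; that choice is an assumption, not the authors' stated method.

```python
def household_debt(member_totals, member_joint):
    """Aggregate one debt category to the household level.

    member_totals: total balance reported for each household member
    member_joint:  balance on joint or cosigned accounts for each member
    """
    # First input: individual balances, each discounted by half of the
    # member's joint balance to avoid double counting.
    discounted = sum(t - 0.5 * c for t, c in zip(member_totals, member_joint))
    # Second input (assumption: joint balances summed across members).
    # It binds for "thin" records whose joint debt is missing from the totals.
    joint_floor = 0.5 * sum(member_joint)
    return max(discounted, joint_floor)
```

For two members who each report 100 in total, 60 of which is a shared joint account, the sketch returns 140, that is, 40 + 40 in individual debt plus the joint 60 counted once.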

Variable Descriptions
Here we provide a short description of the variables used in the CCP analysis. For a detailed description of
the CCP dataset please see Lee and van der Klaauw (2010).
Age: We follow Brown, Haughwout, Lee, and van der Klaauw (2011) and define age as the median age of
adult members of the household.
Auto debts: These are any loans taken out explicitly for the purchase of a car, including loans from banks
and those from automobile financing institutions.
Bankruptcy: An indicator in the CCP, taken from public records, that details whether or not an individual has
filed for bankruptcy.
Credit Card Balance: The sum of reported balances across bank cards as well as retail cards. These cards
reflect revolving accounts at banks, credit unions, credit card companies, and others. Importantly, the CCP
does not distinguish between balances rolled over billing periods (and so potentially subject to interest
charges) and cards where the balance is paid every month.
Credit Card Limits: We take the maximum of reported limits and balances across all bank and retail cards
to ensure that reported utilization is not greater than one.
Credit Card Utilization Rate: This is the ratio of the credit card balance to the credit card limit.
Delinquency: Indicator for whether or not a household is at least 60 days delinquent on any of its accounts
in the current quarter.
HELOC Debt: The sum of home equity lines of credit, or home equity revolving accounts. We use the
classification of HELOCs vs. installment loans provided by the CCP data.
Mortgage Debt: The sum of all mortgage installment loans.
Riskscore: A variable constructed by Equifax and similar to FICO. A higher number is interpreted as a
lower default risk. We construct the household riskscore by taking the average of individual riskscores
within the household.
Size: Household size sums the number of distinct social security numbers that can be linked by household
identifiers in a specific time period. We restrict the household size to at most 10.
Student Loans: These include loans financing education from private and public institutions.
Total debt: Constructed as the sum of mortgage debt balance, credit card balances, auto debts, balance on
home equity lines of credit, and student loans.
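
A few of the constructed variables above can be expressed compactly in Python. This is only a sketch: the function names are hypothetical, the inputs are assumed to be household-level sums, and returning `None` when there is no credit card limit is an assumption made here for illustration.

```python
def credit_card_limit(reported_limit, balance):
    # Take the larger of the reported limit and the balance so that
    # utilization cannot exceed one.
    return max(reported_limit, balance)

def utilization_rate(reported_limit, balance):
    limit = credit_card_limit(reported_limit, balance)
    if limit == 0:
        return None  # assumption: utilization is undefined without any limit
    return balance / limit

def total_debt(mortgage, credit_card, auto, heloc, student):
    # Sum of the five debt categories listed under "Total debt".
    return mortgage + credit_card + auto + heloc + student
```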


APPENDIX C: DECOMPOSING U.S. INEQUALITY SINCE 1970
The decomposition is constructed using the following IPUMS samples: 1970, 1980, and 1990 1% metro
samples and the 2000 1% unweighted sample. Within each of these samples we use the metro area
geographies defined by IPUMS in the following way:
“Metropolitan areas are counties or combinations of counties centering on a substantial urban area.
METAREA identifies the metropolitan area where the household was enumerated, if that
metropolitan area was large enough to meet confidentiality requirements.”
We restrict the sample to the set of metro areas that can be identified in each year to get 117 metro areas
containing roughly 60% of the entire sample within each year. We also restrict the sample to households
where the respondent’s age is between 25 and 65 and the respondent is the head of the household or the
spouse of the head of the household. These restrictions are not important for the results.
To calculate income we use family total income. While not exactly the same as household income, it
is available for all years, whereas household income is not available in 1970. We estimate the following
model of log family income for each year of the sample:
$$\log(y_{ia}) = \alpha_a + \epsilon_i$$
Estimating this model gives estimates of the variance of the fixed effects and the variance of the
residuals for each year. We then calculate the share of variance explained by the variance of the fixed effects as:
$$\text{Share} = \frac{\hat{\sigma}_a^2}{\hat{\sigma}_a^2 + \hat{\sigma}_i^2}$$
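
The variance share can be sketched in a few lines of Python. The function name `between_area_share` is hypothetical. The sketch assumes that the metro-area fixed effects are estimated by area means of log income and that both variances are computed across households (so each area is weighted by its sample size); the text does not specify the weighting, so this is an assumption.

```python
from collections import defaultdict

def between_area_share(log_incomes, areas):
    """Share of the variance of log income explained by area fixed effects."""
    groups = defaultdict(list)
    for y, a in zip(log_incomes, areas):
        groups[a].append(y)
    # With only area dummies, the estimated fixed effect is the area mean.
    area_mean = {a: sum(v) / len(v) for a, v in groups.items()}

    n = len(log_incomes)
    grand_mean = sum(log_incomes) / n
    var_between = sum((area_mean[a] - grand_mean) ** 2 for a in areas) / n
    var_within = sum((y - area_mean[a]) ** 2
                     for y, a in zip(log_incomes, areas)) / n
    return var_between / (var_between + var_within)
```

With household-weighted variances, the between and within components add up exactly to the total variance of log income, so the share lies between zero and one.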
APPENDIX FIGURE C1: DECOMPOSING AGGREGATE U.S. INEQUALITY

Note: The left-hand figure plots the ratio of “between” variance of mean incomes to the total variance of incomes. The
right-hand figure plots the standard deviation of log income across all households.


APPENDIX 3: TIME VARIATION IN LOCAL INEQUALITY RATES
To get a sense of how inequality within counties has varied across time we computed Gini coefficients at the
county level using 1970 and 2000 Census aggregates available from ICPSR. To compute the Gini coefficient
we follow the same procedure outlined in the Appendix and reproduced below. Because the number of bins
used to compute the coefficient is not the same in both years (1970 has fewer bins) the levels of the Gini
coefficients are not directly comparable. Using the Census data we match 3,122 counties.
Let $f(y_i)$ be a discrete probability function where $i = 1, \dots, n$ and $y_i < y_{i+1}$. Then the Gini
coefficient G is defined as
$$G = 1 - \frac{\sum_{i=1}^{n} f(y_i)\,(S_{i-1} + S_i)}{S_n}$$
where $S_i = \sum_{j=1}^{i} f(y_j)\, y_j$ and $S_0 = 0$.
We approximate the discrete probability function with the share of a location’s population within
each bin reported by the Census. For all bins but the last we assume all the mass is distributed at the
midpoint of the bin. For the very last bin we add the last increment to the lower boundary. For example, if
the last bin is incomes of $200,000 and up and the bin before was $150,000 to $199,999 we assign the last
bin to have the value $250,000. This assumption limits the impact the very top bin will have on the
coefficient, but should provide a reasonable approximation of inequality at low levels of aggregation.
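
The binned Gini computation described above can be sketched in Python as follows. The function names `bin_values` and `gini` are hypothetical, and the sketch takes bin lower bounds as input, so midpoints are computed from consecutive lower bounds (for example 175,000 rather than 174,999.5 for the $150,000 to $199,999 bin), which is a small simplification.

```python
def bin_values(lower_bounds):
    """Representative income for each bin, given ascending lower bounds.

    The last bin is open-ended ("and up").
    """
    values = []
    for lo, hi in zip(lower_bounds[:-1], lower_bounds[1:]):
        values.append((lo + hi) / 2)  # all mass at the midpoint of the bin
    last_increment = lower_bounds[-1] - lower_bounds[-2]
    # Top bin: add the last increment to its lower boundary.
    values.append(lower_bounds[-1] + last_increment)
    return values

def gini(shares, values):
    """Gini coefficient G = 1 - sum_i f(y_i) (S_{i-1} + S_i) / S_n."""
    total = sum(shares)
    f = [s / total for s in shares]  # normalize population shares to sum to one
    s_prev = 0.0                     # S_0 = 0
    weighted = 0.0
    for f_i, y_i in zip(f, values):  # values must be in ascending order
        s_cur = s_prev + f_i * y_i   # S_i = S_{i-1} + f(y_i) y_i
        weighted += f_i * (s_prev + s_cur)
        s_prev = s_cur
    return 1 - weighted / s_prev     # s_prev now equals S_n
```

For example, with lower bounds of 0, 50,000, 100,000, 150,000, and 200,000, the top bin is assigned 250,000, as in the text.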
The figure reported below shows a high degree of correlation between inequality in 1970 and
inequality in 2000. The R-squared is 0.26 and the Spearman correlation is 0.52, suggesting inequality is quite
persistent.
APPENDIX FIGURE C3: PERSISTENCE OF LOCAL INEQUALITY

Note: The figure plots Gini coefficients for income inequality in U.S. counties
in 1970 versus 2000.


APPENDIX D: SUMMARY STATISTICS FROM HMDA DATA
Table 1 in this appendix provides summary statistics from the 15% HMDA samples. We report the fraction of
applications denied, originated, for owner-occupied properties, high interest, the race of the primary applicant,
and the regulator of the lender. When using the HMDA data it is important to recognize that changes in
reporting requirements from 2003 to 2004 had significant effects on the coverage of the mortgage market and
so on the statistics we calculate. This can be seen clearly when comparing the change in racial composition of
applicants from 2003 to 2004. While some of this might reflect real shifts in the provision of credit to nonwhite groups it also reflects the increased coverage of rural areas and smaller, non-bank lenders. This can also
be seen by the large increase in applications filed at lenders regulated by HUD. While mortgage company
activity was almost certainly increasing over this period many lenders were simply not reporting in the HMDA
data.
The health of the mortgage market can be traced out by changes in the sample size. The number of
applications reported peaked in 2005 and then declined steadily until 2011. Interestingly, the fraction of
loans with high interest rates has also declined sharply, probably reflecting fewer loans with junior liens.
Notice that the mean applicant income reported in the HMDA data is substantially higher than the
average household income reported in the SCF data and the imputed CCP data. However, average income is
comparable to the average income of homeowners as reported in the 2007 SCF, which is about $99,500.
Table 2 provides some sample correlations from 2007, most of which are qualitatively similar to other
years. Owner-occupied applications are less likely to be denied while applications with high LTI ratios are
more likely to be denied. Applicants applying to HUD-regulated lenders are more likely to be denied, which
could reflect the stress of mortgage companies in this period or an increased likelihood that the applicant is
subprime. Applicants to HUD lenders tend to have smaller incomes and higher LTI ratios.


APPENDIX TABLE D1: SUMMARY STATISTICS FROM HMDA
Year | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011
Denied | 0.15 | 0.13 | 0.13 | 0.15 | 0.16 | 0.18 | 0.18 | 0.17 | 0.14 | 0.15 | 0.15
Originated | 0.78 | 0.79 | 0.78 | 0.76 | 0.72 | 0.71 | 0.72 | 0.73 | 0.76 | 0.75 | 0.75
OOC | 0.94 | 0.93 | 0.92 | 0.90 | 0.88 | 0.90 | 0.91 | 0.92 | 0.94 | 0.94 | 0.93
LTI | 2.31 | 2.43 | 2.58 | 2.65 | 2.67 | 2.63 | 2.72 | 2.72 | 2.81 | 2.79 | 2.70
sd (LTI) | 0.88 | 0.94 | 1.03 | 1.08 | 1.08 | 1.04 | 1.11 | 1.10 | 1.12 | 1.12 | 1.10
Loan | 140.16 | 154.40 | 168.24 | 193.11 | 212.85 | 223.00 | 226.41 | 207.03 | 198.34 | 203.31 | 200.69
sd (Loan) | 96.03 | 104.30 | 111.90 | 147.30 | 165.15 | 173.16 | 180.86 | 155.68 | 141.21 | 148.88 | 151.88
Income | 64.84 | 68.46 | 70.72 | 78.13 | 85.41 | 91.21 | 91.01 | 84.15 | 78.02 | 80.84 | 82.38
sd (Income) | 47.46 | 49.75 | 50.95 | 63.29 | 70.48 | 76.46 | 81.55 | 73.44 | 65.42 | 68.73 | 71.28
High Int |  |  |  | 0.08 | 0.16 | 0.16 | 0.08 | 0.06 | 0.04 | 0.02 | 0.03
White | 0.89 | 0.88 | 0.88 | 0.74 | 0.71 | 0.69 | 0.73 | 0.76 | 0.76 | 0.76 | 0.77
Black | 0.08 | 0.07 | 0.08 | 0.08 | 0.10 | 0.11 | 0.10 | 0.08 | 0.07 | 0.07 | 0.07
OCC | 0.28 | 0.27 | 0.26 | 0.23 | 0.20 | 0.23 | 0.32 | 0.32 | 0.29 | 0.31 | 0.06
FRS | 0.11 | 0.18 | 0.18 | 0.15 | 0.16 | 0.16 | 0.17 | 0.09 | 0.09 | 0.08 | 0.04
FDIC | 0.09 | 0.08 | 0.07 | 0.07 | 0.06 | 0.06 | 0.06 | 0.09 | 0.11 | 0.11 | 0.09
OTS | 0.11 | 0.10 | 0.09 | 0.08 | 0.09 | 0.08 | 0.10 | 0.08 | 0.06 | 0.05 | 0.00
NCUA | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.03 | 0.05 | 0.04 | 0.04 | 0.04
HUD | 0.39 | 0.36 | 0.38 | 0.45 | 0.47 | 0.45 | 0.33 | 0.36 | 0.40 | 0.41 | 0.43
N | 644680 | 647685 | 722326 | 790699 | 890889 | 798332 | 577110 | 395574 | 371967 | 382851 | 359100

Note: The table provides sample means for all variables and standard deviations for continuous variables for all years of
the HMDA data under the sample restrictions identified in the text. Denied gives the probability that an application was
formally denied, while Originated gives the probability that a loan was approved and the funds disbursed to the borrower.
OOC indicates that the application is for an owner-occupied home. LTI is the loan-to-income ratio on the application
constructed from the application’s stated loan and income. High Int indicates if a loan was ultimately originated as a
high interest loan. White and Black both refer to the race of the primary applicant. OCC indicates a loan filed at a lender
regulated by the Office of the Comptroller of the Currency. Similarly, FRS indicates a lender regulated by the Federal
Reserve System, FDIC the Federal Deposit Insurance Corporation, OTS the Office of Thrift Supervision, NCUA the
National Credit Union Administration, and HUD the Department of Housing and Urban Development.


APPENDIX TABLE D2: SAMPLE CORRELATIONS FROM 2007 HMDA
 | Denied | Originated | OOC | LTI | Loan | Inc | White | Black
Denied | 1.000
Originated | -0.762*** | 1.000
OOC | -0.0192*** | 0.021*** | 1.000
LTI | 0.053*** | -0.060*** | 0.200*** | 1.000
Loan | 0.001 | -0.020*** | -0.0308*** | 0.208*** | 1.000
Inc | -0.028*** | 0.014*** | -0.169*** | -0.238*** | 0.815*** | 1.000
White | -0.145*** | 0.146*** | -0.0105*** | -0.116*** | -0.033*** | 0.034*** | 1.000
Black | 0.116*** | -0.113*** | 0.007*** | 0.050*** | -0.053*** | -0.074*** | -0.545*** | 1.000
OCC | -0.066*** | 0.120*** | -0.005*** | -0.012*** | 0.056*** | 0.063*** | 0.006*** | -0.025***
FRS | 0.051*** | -0.070*** | -0.002 | -0.022*** | -0.023*** | -0.011*** | 0.001 | 0.004**
FDIC | -0.044*** | 0.045*** | -0.031*** | -0.031*** | -0.060*** | -0.041*** | 0.078*** | -0.037***
OTS | 0.0547*** | -0.009*** | -0.022*** | -0.003* | 0.081*** | 0.070*** | -0.027*** | 0.006***
NCUA | -0.025*** | 0.008*** | 0.029*** | -0.004** | -0.042*** | -0.040*** | 0.039*** | -0.020***
HUD | 0.022*** | -0.084*** | 0.026*** | 0.048*** | -0.042*** | -0.062*** | -0.044*** | 0.044***
N | 577110

Note: The table provides correlations for the 2007 HMDA data under the sample restrictions identified in the text.
Denied gives the probability that an application was formally denied while originated gives the probability a loan was
approved and the funds disbursed to the borrower. OOC indicates that the application is for an owner-occupied home.
LTI is the loan-to-income ratio on the application constructed from the application’s stated loan and income. High Int
indicates if a loan was ultimately originated as a high interest loan. White and black both refer to the race of the primary
applicant. OCC indicates a loan filed at a lender regulated by the Office of the Comptroller of the Currency. Similarly,
FRS indicates a lender regulated by the Federal Reserve System, OTS regulated by the Office of Thrift Supervision,
NCUA the National Credit Union Administration, and HUD the Department of Housing and Urban Development.


APPENDIX E: INCOME AND DEFAULT
We use the CCP data to verify our assumption about the probability of default conditional on income. In
particular, we estimate a linear probability model of default as a function of household
income.
The dependent variable takes the value 1 if any member of the household in year t is 60 days past due or
longer on any account (mortgage, auto loan, credit card, etc.). The explanatory variable of interest is the (log
of the) household income in year 2001 (using the expected imputed income). We first estimate a
parsimonious specification with only the income measure. We then estimate a specification with the measure
of income and the full set of household and regional controls. These household-level controls are the
following variables measured at 2001: dummies for age of the head of household and for the size of the
household; amount of mortgage, auto loan, credit card balance, credit card limit, HELOC, student loan;
dummies for bankruptcy and 60 DPD or longer, and risk score. The regional-level controls are the following
zip code-level variables measured in 2001: income inequality, median of total household debt, median of
household mortgage, house price growth between 2001 and year t, the ratio of the median house price to the
median income, and the county level fixed effects. In the estimation, the standard errors are clustered by zip
code. We use a linear probability model since the mean of the dependent variable is in the range 0.25-0.30.
The equation is estimated for each year from 2002 to 2012 for the sample of the households used in the
benchmark regression of our analysis (i.e., the households that do not change location between year 2001
and year t).
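
The parsimonious specification, with the income measure as the only regressor, amounts to a bivariate least-squares fit of the default dummy on log income. The Python sketch below shows that calculation only; the function name `fit_lpm` is hypothetical, and the sketch omits the controls, county fixed effects, and zip-code-clustered standard errors used in the full specification.

```python
def fit_lpm(default, log_income):
    """OLS intercept and slope for a 0/1 default dummy regressed on log income."""
    n = len(default)
    mean_x = sum(log_income) / n
    mean_y = sum(default) / n
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(log_income, default))
    s_xx = sum((x - mean_x) ** 2 for x in log_income)
    slope = s_xy / s_xx                  # change in default probability per log point
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

A negative slope indicates that the estimated probability of default falls as income rises.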
We report results in Appendix Table E1. We find that higher income households and households
with higher income ranks have lower probability of default.


Appendix Table E1. Income and default.
Year | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012

Panel A: No Controls
rank | -0.325*** (0.00178) | -0.279*** (0.00179) | -0.289*** (0.00180) | -0.261*** (0.00173) | -0.247*** (0.00170) | -0.224*** (0.00169) | -0.212*** (0.00167) | -0.194*** (0.00168) | -0.186*** (0.00169) | -0.181*** (0.00169) | -0.187*** (0.00172)
N | 6,172,512 | 5,676,766 | 5,039,109 | 4,570,211 | 4,218,948 | 3,950,618 | 3,731,267 | 3,581,280 | 3,433,201 | 3,310,773 | 3,197,351
R2 | 0.022 | 0.017 | 0.018 | 0.015 | 0.013 | 0.011 | 0.010 | 0.008 | 0.008 | 0.007 | 0.008

Panel B: County Fixed Effects
rank | -0.324*** (0.00179) | -0.278*** (0.00180) | -0.288*** (0.00182) | -0.261*** (0.00174) | -0.247*** (0.00169) | -0.224*** (0.00167) | -0.213*** (0.00165) | -0.196*** (0.00165) | -0.188*** (0.00165) | -0.184*** (0.00165) | -0.190*** (0.00168)
N | 6,172,512 | 5,676,766 | 5,039,109 | 4,570,211 | 4,218,948 | 3,950,618 | 3,731,267 | 3,581,280 | 3,433,201 | 3,310,773 | 3,197,351
R2 | 0.052 | 0.046 | 0.050 | 0.047 | 0.045 | 0.040 | 0.037 | 0.034 | 0.033 | 0.032 | 0.033

Panel C: Household-specific Characteristics and County Fixed Effects
rank | -0.0303*** (0.00152) | -0.0374*** (0.00171) | -0.0398*** (0.00189) | -0.0448*** (0.00202) | -0.0470*** (0.00213) | -0.0448*** (0.00223) | -0.0348*** (0.00233) | -0.0261*** (0.00245) | -0.0235*** (0.00250) | -0.0189*** (0.00244) | -0.0120*** (0.00253)
N | 4,195,007 | 3,836,566 | 3,380,052 | 3,047,381 | 2,803,886 | 2,619,591 | 2,470,908 | 2,367,350 | 2,265,545 | 2,182,951 | 2,105,700
R2 | 0.460 | 0.359 | 0.326 | 0.274 | 0.244 | 0.213 | 0.187 | 0.177 | 0.171 | 0.161 | 0.159

Panel D: No Controls
ln(y) | -0.159*** (0.000625) | -0.144*** (0.000626) | -0.152*** (0.000644) | -0.143*** (0.000638) | -0.139*** (0.000636) | -0.129*** (0.000639) | -0.122*** (0.000647) | -0.113*** (0.000665) | -0.109*** (0.000684) | -0.106*** (0.000691) | -0.109*** (0.000711)
N | 6,172,512 | 5,676,766 | 5,039,109 | 4,570,211 | 4,218,948 | 3,950,618 | 3,731,267 | 3,581,280 | 3,433,201 | 3,310,773 | 3,197,351
R2 | 0.040 | 0.033 | 0.036 | 0.033 | 0.031 | 0.027 | 0.025 | 0.020 | 0.019 | 0.018 | 0.019

Panel E: County Fixed Effects
ln(y) | -0.145*** (0.000669) | -0.129*** (0.000667) | -0.135*** (0.000681) | -0.126*** (0.000660) | -0.121*** (0.000647) | -0.113*** (0.000638) | -0.109*** (0.000635) | -0.101*** (0.000647) | -0.0980*** (0.000647) | -0.0958*** (0.000646) | -0.0993*** (0.000657)
N | 6,172,512 | 5,676,766 | 5,039,109 | 4,570,211 | 4,218,948 | 3,950,618 | 3,731,267 | 3,581,280 | 3,433,201 | 3,310,773 | 3,197,351
R2 | 0.062 | 0.055 | 0.059 | 0.056 | 0.054 | 0.049 | 0.045 | 0.041 | 0.040 | 0.038 | 0.039

Panel F: Household-specific Characteristics and County Fixed Effects
ln(y) | -0.010*** (0.000606) | -0.012*** (0.000683) | -0.013*** (0.000752) | -0.016*** (0.000811) | -0.017*** (0.000857) | -0.015*** (0.000904) | -0.012*** (0.000949) | -0.009*** (0.000995) | -0.008*** (0.00102) | -0.006*** (0.000994) | -0.004*** (0.00103)
N | 4,195,007 | 3,836,566 | 3,380,052 | 3,047,381 | 2,803,886 | 2,619,591 | 2,470,908 | 2,367,350 | 2,265,545 | 2,182,951 | 2,105,700
R2 | 0.460 | 0.359 | 0.326 | 0.274 | 0.244 | 0.213 | 0.187 | 0.177 | 0.171 | 0.161 | 0.159

Notes: The table reports estimated coefficients on income rank (Panels A-C) and log income (Panels D-F) in the linear regression where the dependent variable is
a dummy variable equal to one if a household defaults in a given year and zero otherwise. Standard errors (clustered by zip code) are reported in parentheses.
***, **, * denote statistical significance at the 1%, 5%, and 10% levels.


APPENDIX F: IMPUTATION OF INCOME
In the first step of our work, we estimate the relationship between income and observables in the SCF and then
use this relationship to impute income in the CCP. In this appendix, we describe how variables are constructed
and what specification is estimated.
In the table below, we describe how variables are constructed in CCP and SCF. We use only variables
which are available in both CCP and SCF. While there are some differences in the definitions across datasets, we
made every effort to make them as comparable as possible.
Variable | SCF | Counterpart in CCP
Auto loans | X2218 + X2318 + X2418 + X7169 + X2424 + X2507 + X2607 | Auto loan bank and auto loan finance balance
Bankruptcy flag | X6772 | Chapter 7 or Chapter 13 bankruptcy flag
Credit Card Limit [19] | X414 | Bank card + retail card high credit
Credit Card Balance | X413 + X427 + X421 + X424 + X430 | Bank card + retail card balance
Delinquency flag | X3005 | A flag if any account is 60 DPD or more
HELOC Balance | X1108 + X1119 + X1130 + X1136 | Home equity revolving balance
Income | X5729 | None
Mortgage Debt | X805 + X905 + X1005 + X1715 + X1815 + X1915 + X2006 + X2016 | First mortgage balance + home equity installment balance
Student Loans | X7824 + X7847 + X7870 + X7924 + X7947 + X7970 | Student loans balance

We also use household size and head of household age. The CCP does not include racial identifiers so we
do not use these. In our imputation, we use all of the SCF replicates, which are discussed in detail by Kennickell
(1998). Because the SCF intentionally oversamples wealthy households, we apply the SCF-computed weights
X42001. Note that we take the natural log of one plus the level for all continuous variables to make the
distribution of these variables more well-behaved and to avoid dropping observations with zero values. We also
restrict the sample to households where the head’s age was between 20 and 65. We dropped outliers using Cook’s
distance.
As discussed in the text, our regression has the general form
$$\log(Y_{i,SCF}) = f(\beta X_{i,SCF}) + \epsilon_{i,SCF}.$$

In choosing the specific form of f, we aimed to capture as much of the joint distribution of the observables and
income as we could with a flexible assumption. Terms were added if it was found that they were meaningful
predictors of log income. The function f was composed of
1. Third-order Chebychev polynomials of mortgage, auto, and credit card limits,
2. Credit card, HELOC, and student loan balances,
3. Nine age bins in five year intervals,
4. Interactions of all age bins with each type of debt balance,
5. Household size and interactions of household size with debt balances and age bins,
6. Indicators for bankruptcy and delinquency and interactions of these indicators with other indicators,
7. Indicators for positive credit card limit and interactions of this variable with various variables,
8. Interactions of household size, age, and debt levels.

19 We code responses of “no limit” in the SCF as 1,000,000.
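
Some of the building blocks of f can be illustrated with a short Python sketch. All function names here are hypothetical. The sketch assumes Chebyshev polynomials of the first kind evaluated on an input that has already been rescaled to the interval [-1, 1], and it assumes that age 65 is placed in the ninth age bin; neither detail is stated in the text.

```python
import math

def log1p_level(x):
    # Natural log of one plus the level, so zero balances are retained.
    return math.log(1.0 + x)

def chebyshev_terms(z, order=3):
    """T_1(z), ..., T_order(z) via the recurrence T_{k+1} = 2 z T_k - T_{k-1}.

    Assumes z has been rescaled to [-1, 1].
    """
    terms = [1.0, z]  # T_0 and T_1
    for _ in range(order - 1):
        terms.append(2 * z * terms[-1] - terms[-2])
    return terms[1:]  # drop T_0, which duplicates the regression constant

def age_bin(age, start=20, width=5, n_bins=9):
    # Bins 0..8 cover ages 20-24, 25-29, ..., 60-64; age 65 joins the last bin
    # (an assumption made here so that the 20-65 sample fits nine bins).
    return min((age - start) // width, n_bins - 1)
```

Interaction terms such as those in items 4 through 8 would then be products of these features, for example an age-bin indicator multiplied by a transformed debt balance.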

Table 2 shows that, using data from 2001, the aggregate income statistics computed directly from the SCF match
those we impute in the CCP very closely.
