REAL ESTATE RESEARCH

April 20, 2015

Income Growth, Credit Growth, and Lending Standards: Revisiting the Evidence
Almost a decade has passed since the peak of the housing boom, and a handful of economics papers have emerged as fundamental influences on the way that
economists think about the boom—and the ensuing bust. One example is a paper by Atif Mian and Amir Sufi that appeared in the Quarterly Journal of Economics in
2009 (MS2009 hereafter). A key part of this paper is an analysis of income growth and mortgage-credit growth in individual U.S. ZIP codes. The authors find that from
2002 to 2005, ZIP codes with relatively low growth in incomes experienced high growth in mortgage credit; that is, income growth and credit growth were negatively
correlated during this period.
Economists often cite this negative correlation as evidence of improper lending practices during the housing boom. The thinking is that prudent lenders would have
generated a positive correlation between area-level growth in income and mortgage credit, because borrowers in ZIP codes with high income growth would be in the
best position to repay their loans. A negative correlation suggests that lenders instead channeled credit to borrowers who couldn't repay.
Some of the MS2009 results are now being reexamined in a new paper by Manuel Adelino, Antoinette Schoar, and Felipe Severino (A2S hereafter). The A2S paper
argues that the statistical evidence in MS2009 is not robust and that using borrower-level data, rather than data aggregated up to the ZIP-code level, is the best way to
investigate lending patterns. The A2S paper has already received a lot of attention, which has centered primarily on the quality of the alternative individual-level data
that A2S sometimes employ.1 To understand the relevant issues in this debate, it's helpful to go back to MS2009's original statistical work that uses data aggregated to
the ZIP-code level to get a sense of what it does and doesn't show.

Chart 1 summarizes the central MS2009 result. We generated this chart from information we found in either MS2009 or its supplementary online appendix. The dark
blue bars depict the coefficients from separate regressions of ZIP-code level growth in new purchase mortgages on growth in ZIP-code level incomes.2 (These
regressions also include county fixed effects, which we discuss further below.) Each regression corresponds to a different sample period. The first regression projects
ZIP-level changes in credit between 1991 and 1998 on ZIP-level changes in income between these two years. The second uses growth between 1998 and 2001, and
so on.3 During the three earliest periods, ZIP-level income growth enters positively in the regressions, but in 2002–04 and 2004–05, the coefficients become negative.
A key claim of MS2009 is that this flip signals an important and unwelcome change in the behavior of lenders. Moreover, the abstract points out that the negative
coefficients are anomalous: "2002 to 2005 is the only period in the past eighteen years in which income and mortgage credit growth are negatively correlated."

There are, however, at least three reasons to doubt that the MS2009 coefficients tell us anything about lending standards. First of all, the coefficients for the 2005–06
and 2006–07 regressions are positive—for the latter period, strongly so. By MS2009's logic, these positive coefficients indicate that lending standards improved after
2005, but in fact loans made in 2006 and 2007 were among the worst-performing loans in modern U.S. history. Chart 2 depicts the share of currently active loans that are 90-plus days delinquent or in foreclosure, using data from Black Knight Financial Services. To be sure, loans made in 2005 did not
perform well during the housing crisis, but the performance of loans made in 2006 and 2007 was even worse.4 This poor performance is not consistent with the
improvement in lending standards implied by MS2009's methodology.
A second reason that sign changes among the MS2009 coefficients may not be informative is that these coefficients are not really comparable. The 1991–98
regression is based on growth in income and credit across seven years, while later regressions are based on growth over shorter intervals. This difference in time
horizon matters, because area-level income and credit no doubt fluctuate from year to year while they also trend over longer periods. A "high-frequency" correlation
calculated from year-to-year growth rates may therefore turn out to be very different from a "low-frequency" correlation calculated by comparing growth rates across
more-distant years. One thing we can't do is think of a low-frequency correlation as an "average" of high-frequency correlations. Note that MS2009 also run a
regression with growth rates calculated over the entire 2002–05 period, obtaining a coefficient of -0.662. This estimate, not pictured in our graph, is much larger in
absolute value than either of the coefficients generated in the subperiods 2002–04 and 2004–05, which are pictured.
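The gap between high-frequency and low-frequency correlations can be illustrated with a small simulation. The following is a minimal sketch in Python with NumPy, using an entirely hypothetical data-generating process: the variances, the number of ZIP codes, and the eight-year window are assumptions chosen for illustration, not estimates from MS2009 or A2S. A persistent ZIP-level trend raises income and credit growth together, while transitory yearly shocks push them in opposite directions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_zips, n_years = 5000, 8  # hypothetical sample dimensions

# Hypothetical data-generating process (assumed, not taken from MS2009):
# a persistent ZIP-level trend shared by income and credit growth, plus a
# transitory yearly shock that moves the two in opposite directions.
trend = rng.normal(0.0, 0.01, size=(n_zips, 1))
shock = rng.normal(0.0, 0.03, size=(n_zips, n_years))
noise = rng.normal(0.0, 0.03, size=(n_zips, n_years))

income_growth = trend + shock
credit_growth = trend - 0.5 * shock + noise


def corr(x, y):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# "High-frequency": one cross-sectional correlation per year.
yearly = [corr(income_growth[:, t], credit_growth[:, t]) for t in range(n_years)]
avg_high_freq = float(np.mean(yearly))

# "Low-frequency": correlation of growth averaged over the whole window
# (a simple stand-in for annualized growth between distant years).
low_freq = corr(income_growth.mean(axis=1), credit_growth.mean(axis=1))

print(f"average year-to-year correlation: {avg_high_freq:.3f}")
print(f"correlation of full-window growth: {low_freq:.3f}")
```

In this setup every year-to-year correlation is negative, yet the correlation of growth averaged over the full window is positive, because averaging shrinks the transitory shocks and leaves the common trend. The long-horizon figure therefore cannot be read as an average of the short-horizon ones.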
A third and perhaps more fundamental problem with the MS2009 exercise is that the authors do not report correlations between income growth and credit growth but
rather regression coefficients.5 And while a correlation coefficient of 0.5 indicates that income growth and credit growth move closely together, a regression coefficient
of the same magnitude could be generated with much less comovement. MS2009 supply the data needed to convert their regression coefficients into correlation
coefficients, and we depict those correlations as green bars in chart 1.6 Most of the correlations are near 0.1 in absolute value or smaller. To calculate how much
comovement these correlations imply, recall that the R-squared of a regression of one variable on another is equal to the square of their correlation coefficient. A
correlation coefficient of 0.1 therefore indicates that a regression of credit growth deviated from county-level means on similarly transformed income growth would have
an R-squared in the neighborhood of 1 percent. The reported R-squareds from the MS2009 regressions are much larger, but that is because the authors ran their
regressions without demeaning the data first, letting the county fixed effects do the demeaning automatically. While this is standard practice, this specification forces
the reported R-squared to encompass the explanatory power of the fixed effects. The correlation coefficients that we have calculated indicate that the explanatory
power of within-county income growth for within-county credit growth is extremely low.7 Consequently, changes in the sign of this correlation are not very informative.
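The arithmetic behind this argument can be sketched in a few lines of Python. The first part applies the conversion described in footnote 6 to numbers quoted in this post: the -0.662 coefficient for the full 2002–05 period and the within-county standard deviations for 2002–05 (0.031 for income growth, 0.15 for credit growth). Pairing these particular figures is our choice for illustration; the result is not one of the bars in chart 1. The second part uses simulated data, with all parameters assumed, to show why a regression with county fixed effects reports a much larger R-squared than the within-county relationship warrants.

```python
import numpy as np


def coef_to_corr(beta, sd_x, sd_y):
    """Convert a univariate regression slope into a correlation coefficient."""
    return beta * sd_x / sd_y


# Part 1: figures quoted in the post for 2002-05.
r_2002_05 = coef_to_corr(-0.662, 0.031, 0.15)
r2_2002_05 = r_2002_05 ** 2  # R-squared of the demeaned regression
print(f"implied correlation: {r_2002_05:.3f}, implied R-squared: {r2_2002_05:.3f}")

# Part 2: simulated ZIP codes nested in counties (all values hypothetical).
rng = np.random.default_rng(1)
n_counties, zips_per_county = 200, 15
county = np.repeat(np.arange(n_counties), zips_per_county)
county_effect = rng.normal(0.0, 0.10, n_counties)[county]
x = rng.normal(0.0, 0.03, county.size) + rng.normal(0.0, 0.02, n_counties)[county]
y = county_effect - 0.5 * x + rng.normal(0.0, 0.15, county.size)


def demean_by_group(v, g):
    """Subtract each group's mean from its members."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]


x_w, y_w = demean_by_group(x, county), demean_by_group(y, county)
beta = (x_w @ y_w) / (x_w @ x_w)       # same slope as OLS with county dummies
resid = y_w - beta * x_w               # same residuals as OLS with county dummies
r_within = float(np.corrcoef(x_w, y_w)[0, 1])

# R-squared of the demeaned regression: equals the squared correlation.
r2_within = 1 - (resid @ resid) / (y_w @ y_w)

# R-squared reported when county dummies do the demeaning: total variation
# is measured around the overall mean, so the dummies get credit too.
y_c = y - y.mean()
r2_with_fe = 1 - (resid @ resid) / (y_c @ y_c)

print(f"within correlation: {r_within:.3f}")
print(f"within R-squared: {r2_within:.3f}, R-squared with county dummies: {r2_with_fe:.3f}")
```

With the quoted 2002–05 figures, the implied correlation is roughly -0.14, so the implied within-county R-squared is under 2 percent. In the simulation, the two R-squared values come from identical residuals and differ only in the total variation used as the benchmark, which is the mechanism the paragraph above describes.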
How do these arguments relate to A2S's paper? Part of that paper provides further evidence that the negative coefficients in the MS2009 regressions do not tell us
much about lending standards. For example, A2S extend a point acknowledged in MS2009: expanding the sample of ZIP codes used for the regressions weakens the
evidence of a negative correlation. The baseline income-credit regressions in MS2009 use less than 10 percent of the ZIP codes in the United States (approximately
3,000 out of more than 40,000 total U.S. ZIP codes). Omitted from the main sample are ZIP codes that do not have price-index data or that lack credit-bureau data.8
MS2009 acknowledge that if one relaxes the restriction related to house-price data, the negative correlations weaken. Our chart 1 conveys this information with the
correlation coefficients depicted in red, which are even closer to zero. A2S go further, showing that if the data set also includes ZIP codes that lack credit-bureau data,
the negative correlation and regression coefficients become positive.
But perhaps a deeper contribution of A2S is to remind researchers that outstanding questions about the housing boom should be attacked with individual-level data.
No one doubts that credit expanded during the boom, especially to subprime borrowers. But how much of the aggregate increase in credit went to subprime borrowers,
and how did factors like income, credit scores, and expected house-price appreciation affect both borrowing and lending decisions? Even under the best of
circumstances, it is hard to study these questions with aggregate data, as MS2009 did. People who take out new-purchase mortgages typically move across ZIP-code
boundaries. Their incomes and credit scores may be different from those of the people who lived in their new neighborhoods one, two, or seven years before. A2S
therefore argue for the use of HMDA individual-level income data so that credit allocation can be studied at the individual level. This use has been criticized by Mian
and Sufi, who believe that fraud undermines the quality of the individual-level income data that appear in HMDA records. We should take these criticisms seriously. But
the debate over whether lending standards are best studied with aggregate or individual-level data should take place with the understanding that aggregate data on
incomes and credit may not be as informative as previously believed.
By Chris Foote, senior economist and policy adviser at the Federal Reserve Bank of Boston, Kris Gerardi, financial economist and associate policy adviser at the
Federal Reserve Bank of Atlanta, and Paul Willen, senior economist and policy adviser at the Federal Reserve Bank of Boston.
1. Mian and Sufi's contribution to the data-quality debate can be found here.

2. Data on new-purchase mortgage originations come from records generated by the Home Mortgage Disclosure Act (HMDA). Average income at the ZIP-code level is tabulated in the selected years
by the Internal Revenue Service.
3. Growth rates used in the regressions are annualized. The uneven lengths of the sample periods are necessitated by the sporadic availability of the IRS income data, especially early on. The 1991
data are no longer available because IRS officials have concerns about their quality.
4. Chart 2 includes data for both prime and subprime loans. The representativeness of the Black Knight/LPS data improves markedly in 2005, so LPS loans originated before that year may not be
representative of the universe of mortgages made at the same time. For other evidence specific to the performance of subprime loans made in 2006 and 2007, see Figure 2 of Christopher Mayer,
Karen Pence, and Shane M. Sherlund, "The Rise in Mortgage Defaults," Journal of Economic Perspectives (2009), and Figure 1 of Yuliya Demyanyk and Otto Van Hemert, "Understanding the
Subprime Mortgage Crisis," Review of Financial Studies (2009). For data on the performance of GSE loans made in 2006 and 2007, see Figure 8 of W. Scott Frame, Kristopher Gerardi, and Paul S.
Willen, "The Failure of Supervisory Stress Testing: Fannie Mae, Freddie Mac, and OFHEO," Atlanta Fed Working Paper (2015).
5. MS2009 often refer to their regression coefficients as "correlations" in the text as well as in the relevant tables and figures, but these statistics are indeed regression coefficients. Note that in the
fourth table of the supplemental online appendix, one of the "correlations" exceeds 1, which is impossible for an actual correlation coefficient.
6. Because a regression coefficient from a univariate regression is Cov(X,Y)/Var(X), multiplying this coefficient by StdDev(X)/StdDev(Y) gives Cov(X,Y)/(StdDev(X)*StdDev(Y)), which is the
correlation coefficient. Here, the Y variable is ZIP-code–level credit growth, demeaned from county-level averages, while X is similarly demeaned income growth. As measures of the standard
deviations, we use the within-county standard deviations displayed in Table I of MS2009. Specifically, we use the within-county standard deviation of "mortgage origination for home purchase annual
growth" calculated over the 1996–02 and 2002–05 periods (0.067 and 0.15, respectively) and the within-county standard deviation of "income annualized growth" over the 1991–98, 1998–2002,
2002–05, and 2005–06 periods (0.022, 0.017, 0.031, and 0.04, respectively). Unfortunately, the time periods over which the standard deviations were calculated do not line up exactly with the time
periods over which the regression coefficients were calculated, so our conversion to correlation coefficients is an approximation.
7. It is true that the coefficients in the MS2009 regressions often have large t-statistics, so one may argue that ZIP-level income growth has sometimes been a statistically significant
determinant of ZIP-level credit growth. But the low correlation coefficients indicate that income growth has never been an economically significant determinant of credit allocation within counties. It is
therefore hard to know what is driving the income-credit correlation featured in MS2009, or what may be causing its sign to fluctuate.
8. Though house prices and credit bureau data are not required to calculate a correlation between income growth and mortgage-credit growth, the authors use house prices and credit bureau data in
other parts of their paper.

April 20, 2015 in Credit conditions, Housing boom, Housing crisis, Mortgage crisis, Subprime mortgages
