The Credit Crisis and Cycle-Proof Regulation

Raghuram G. Rajan

This article was originally presented as the Homer Jones Memorial Lecture, organized by the Federal Reserve Bank of St. Louis, St. Louis, Missouri, April 15, 2009. Federal Reserve Bank of St. Louis Review, September/October 2009, 91(5, Part 1), pp. 397-402.

First, I would like to thank the St. Louis Fed, especially Kevin Kliesen, and the National Association for Business Economics for inviting me to give this talk. I share with Homer Jones an affiliation with the University of Chicago. He was an important influence on Milton Friedman, and if that were all he did, he would deserve a place in history. But in addition, he was a very inquisitive economist with a reputation for thinking outside the box. He made major contributions to monetary economics. It is an honor to be asked to deliver a lecture in his name, especially at this critical time in the nation's regulatory history.

WHAT CAUSED THE CRISIS?

The current financial crisis can be blamed on many factors and even some particular players in financial markets and regulatory institutions. But in pinning the disaster on specific agents, we could miss the cause that links them all. I argue that this common cause is cyclical euphoria; and, unless we recognize this, our regulatory efforts are likely to fall far short of preventing the next crisis.

Let me start at the beginning. There is some consensus that the proximate causes of the crisis are as follows:

(i) The U.S. financial sector misallocated resources to real estate, financed through the issuance of exotic new financial instruments.
(ii) A significant portion of these instruments found their way, directly or indirectly, onto commercial and investment bank balance sheets.
(iii) These investments were financed largely with short-term debt.
(iv) The mix was potent and caused large-scale disruption in 2007.

On these matters, there is broad agreement. But let us dig a little deeper.
This is a crisis born in some ways from previous financial crises. A wave of crises swept through the emerging markets in the late 1990s: East Asian economies collapsed, Russia defaulted, and Argentina, Brazil, and Turkey faced severe stress. In response to these problems, emerging markets became far more circumspect about borrowing from abroad to finance domestic demand. Instead, their corporations, governments, and households cut back on investment and reduced consumption. Formerly net absorbers of financial capital from the rest of the world, a number of these countries became net exporters of financial capital. Combined with the savings of habitual exporters such as Germany and Japan, these circumstances created what Chairman Bernanke referred to as a "global saving glut" (Bernanke, 2005).

Clearly, the net financial savings generated in one part of the world must be absorbed by deficits elsewhere. Corporations in industrialized countries initially absorbed these savings by expanding investment, especially in information technology, but this proved unsustainable, and investment was cut back sharply after the collapse of the information technology bubble.

[Raghuram G. Rajan is the Eric Gleacher Distinguished Service Professor of Finance at the Booth School of Business, University of Chicago. © 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.]
Extremely accommodative monetary policy by the world's central banks, led by the Federal Reserve, ensured the world did not suffer a deep recession. Instead, the low interest rates in a number of countries ignited demand in interest-sensitive sectors such as automobiles and housing. House prices started rising, as did housing investment. U.S. price growth was by no means the highest. Housing prices reached higher values relative to rent or incomes in Ireland, Spain, the Netherlands, the United Kingdom, and New Zealand, for example.

Then why did the crisis first manifest itself in the United States? Probably because the United States went further with financial innovation, thus drawing more buyers with marginal credit quality into the market. Holding a home mortgage loan directly is very hard for an international investor because it requires servicing, is of uncertain credit quality, and has a high propensity for default. Securitization dealt with some of these concerns. If the mortgage was packaged together with mortgages from other areas, diversification would reduce the risk. Furthermore, the riskiest claims against the package could be sold to those with the capacity to evaluate them and an appetite for bearing the risk, while the safest AAA-rated portions could be held by international investors.

Indeed, because of the demand from international investors for AAA paper, securitization focused on squeezing the most AAA paper out of an underlying package of mortgages: The lower-quality securities issued against the initial package of mortgages were repackaged once again with similar securities from other packages, and a new range of securities, including a large quantity rated AAA, was issued by this "collateralized debt obligation." The "originate-to-securitize" process had the unintended consequence of reducing the due diligence undertaken by originators.
Of course, originators could not completely ignore the true quality of borrowers because they were held responsible for initial defaults, but because house prices were rising steadily over this period, even this source of discipline weakened. If the buyer could not make even the nominal payments on the initial low mortgage teaser rates, the lender could repossess the house, sell it quickly in the hot market, and recoup any losses through the price appreciation. In the liquid housing market, as long as the buyer could scrawl an "X" on the dotted line, he or she could own a home.

The slicing and dicing through repeated securitization of the original package of mortgages created very complicated securities. The problems in valuing these securities were not obvious when house prices were rising and defaults were few. But as house prices stopped rising and defaults started increasing, valuing these securities became very difficult.

MALEVOLENT BANKERS OR FOOLISH NAÏFS?

It was not entirely surprising that bad investments would be made in the housing boom. What was surprising was that the originators of these complex securities—the financial institutions that should have understood the deterioration of the underlying quality of mortgages—held on to so many of the mortgage-backed securities (MBS) in their own portfolios. Simply put: Why did the sausage-makers, who knew what was in the sausage, keep so many sausages for personal consumption? The explanation has to be that at least one arm of the bank thought these securities were worthwhile investments, despite their risk. Investment in MBS seemed to be part of a culture of excessive risk-taking that had overtaken banks.
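The tranching mechanics described above (safer senior claims carved out of a mortgage pool, with junior slices absorbing the first losses) can be illustrated with a toy loss waterfall. The tranche names and dollar sizes below are hypothetical, chosen only to show why senior AAA paper looks safe until pool losses grow large; real deals have many more tranches and far more complicated rules.

```python
# Toy securitization loss waterfall (hypothetical tranche sizes).

def tranche_losses(pool_loss, tranche_sizes):
    """Allocate a pool loss across tranches from most junior to most senior.

    tranche_sizes: list of (name, size) pairs ordered junior-first.
    Returns a dict mapping tranche name to the loss it absorbs.
    """
    losses = {}
    remaining = pool_loss
    for name, size in tranche_sizes:
        hit = min(remaining, size)  # each tranche absorbs losses up to its size
        losses[name] = hit
        remaining -= hit
    return losses

# A $100 pool: a thin equity slice, a mezzanine slice, and a large AAA senior slice.
structure = [("equity", 5.0), ("mezzanine", 15.0), ("senior AAA", 80.0)]

# In a "normal" year with $3 of pool losses, the equity slice absorbs everything
# and the AAA tranche is untouched.
print(tranche_losses(3.0, structure))
# With $25 of pool losses, equity and mezzanine are wiped out and even the
# AAA tranche is impaired.
print(tranche_losses(25.0, structure))
```

The same mechanism explains the repackaging step in the text: the mezzanine slices of many such pools were themselves pooled and tranched again, producing a second layer of nominally AAA paper whose safety depended on losses across pools staying uncorrelated.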
A key factor contributing to this culture is that, over short periods of time, it is very hard, especially in the case of new products, to tell whether a financial manager is generating true excess returns adjusted for risk or whether the current returns are simply compensation for a risk that has not yet shown itself but will eventually materialize. Such difficulty could engender excess risk-taking both at the top of and within the firm.

For instance, the performance of CEOs is evaluated in part on the basis of the earnings they generate relative to their peers. To the extent that some leading banks can generate legitimately high returns, this puts pressure on other banks to keep up. CEOs of "follower" banks may take excessive risks to boost various observable measures of performance. Indeed, even if managers recognize that this type of strategy is not truly value creating, a desire to pump up their bank's stock price and their own reputations may nevertheless make it their most attractive option. There is anecdotal evidence of such pressure on top management—perhaps most famously from Citigroup chairman Chuck Prince, in describing why his bank continued financing buyouts despite mounting risks: "When the music stops, in terms of liquidity, things will be complicated. But, as long as the music is playing, you've got to get up and dance. We're still dancing" (Wighton, 2007).

Even if top management wants to maximize long-term bank value, it may be difficult to create incentives and control systems that steer subordinates in this direction. Given the competition for talent, traders have to be paid generously based on performance, but many of the compensation schemes paid for short-term, risk-adjusted performance.
This setting gave traders an incentive to take risks that were not recognized by the system, so they could generate income that appeared to stem from their superior abilities, even though it was in fact only a market-risk premium. The classic case of such behavior is writing insurance on infrequent events such as defaults, assuming what is termed "tail" risk. If traders are allowed to boost bonuses by treating the entire insurance premium as income, instead of setting aside a significant fraction as a reserve for an eventual payout, they have an excessive incentive to engage in this sort of trade. Indeed, traders who bought AAA-rated MBS were essentially getting the additional spread on these instruments relative to corporate AAA securities (the spread being the insurance premium) while ignoring the additional default risk entailed in these untested securities. The traders in AIG's financial products division took all this to an extreme by writing credit default swaps, pocketing the premiums as bonuses, and not bothering to set aside reserves in case the bonds covered by the swaps actually defaulted.

This is not to say that risk managers in banks were unaware of such incentives. However, they may have been unable to fully control them, because tail risks are by their nature rare and therefore hard to quantify with precision before they occur. Although managers could try to impose crude limits on the activities of the traders taking the most risk, such trades were likely to have been very profitable (before the risk actually was realized), and any limitation on those profits was unlikely to sit well with a top management being pressured for profits.

Finally, all these shaky assets were financed with short-term debt. Why?
Because in good times, short-term debt seems relatively cheap compared with long-term capital, and the market is willing to supply it because the costs of illiquidity appear remote. Markets seem to favor a bank capital structure that is heavy on short-term leverage. In bad times, though, the costs of illiquidity become more salient, while risk-averse (and burnt) bankers are unlikely to take on excessive risk. The markets then encourage a capital structure that is heavy on capital.

Given the conditions that led banks to hold large quantities of MBS and other risky loans (such as those to private equity), financed with a capital structure heavy on short-term debt, the crisis had a certain degree of inevitability. As house prices stopped rising, and indeed started falling, mortgage defaults started increasing. MBS fell in value, became more difficult to price, and their prices became more volatile. They became hard to borrow against, even over the short term. Banks became illiquid and eventually insolvent. Only heavy intervention has kept the financial system afloat, and though the market seems to believe that the worst is over, its relief may be premature.

The Blame Game

Who is to blame for the financial crisis? As my discussion suggests, there are many possible suspects—the exporting countries that still do not understand that their thrift is a burden, not a blessing, to the rest of the world; the U.S.
households that have spent way beyond their means in recent years; the monetary and fiscal authorities who were excessively ready to intervene to prevent short-term pain, even though they only postponed problems into the future; the bankers who took the upside and left the downside to the taxpayer; the politicians who tried to expand their vote banks by extending homeownership even to those who could not afford it; the markets that tolerated high leverage in the boom only to become risk averse in the bust. The list goes on. There are plenty of suspects and enough blame to spread.

But if all are to blame, should we not also admit that they all had a willing accomplice—the euphoria generated by the boom? After all, who is there to stand for stability and against prosperity and growth in a boom? Internal risk managers, who repeatedly pointed to risks that never materialized during the upswing, have little credibility and influence—that is, if they still have jobs. It is also very hard for contrarian investors to bet against the boom: As Keynes said, the market can stay irrational longer than investors can stay solvent. Politicians have an incentive to ride the boom, indeed to abet it, through the deregulation sought by bankers. After all, bankers have not only the money to influence legislation but also the moral authority conferred by prosperity. And what of regulators? When everyone is "for" the boom, how can regulators stand against it? They are reduced to rationalizing why it would be technically impossible for them to stop it.

Everyone is therefore complicit in the crisis because all were ultimately aided and abetted by cyclical euphoria. And unless we recognize this, the next crisis will be hard to prevent.
For we typically regulate in the midst of a bust, when righteous politicians feel the need to do something, when bankers' frail balance sheets and vivid memories make them eschew any risk, and when regulators' backbones are stiffened by public disapproval of past laxity.

THE ROLE OF REGULATION

We reform under the delusion that the regulated—and the markets they operate in—are static and passive and that the regulatory environment will not vary with the cycle. Ironically, faith in draconian regulation is strongest at the bottom of the cycle—when there is little need for participants to be regulated. By contrast, the misconception that markets will take care of themselves is most widespread at the top of the cycle—the point of maximum danger to the system. We need to acknowledge these differences and enact cycle-proof regulation, for a regulation set against the cycle will not stand.

Consider the dangers of ignoring this point. Recent studies such as the Geneva Report (Brunnermeier et al., 2009) have argued for "countercyclical" capital requirements—raising bank capital requirements significantly in good times while allowing them to fall somewhat in bad times. Although this approach is sensible prima facie, these proposals may be far less effective than intended. To see why, we need to recognize that in boom times the market demands very low levels of capital from financial intermediaries, in part because euphoria makes losses seem remote. So when regulated financial intermediaries are forced to hold more costly capital than the market requires, they have an incentive to shift activity to unregulated intermediaries, as banks did in setting up structured investment vehicles and conduits during the current crisis.
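As a stylized illustration of the countercyclical idea in the text, a capital requirement might be indexed to a boom indicator such as credit growth above trend. The rule, its sensitivity, and its floor and cap below are assumptions for illustration, not the Geneva Report's actual calibration; the article's point is precisely that such a rule, however sensible, can be arbitraged or diluted over the cycle.

```python
# Stylized countercyclical capital rule (hypothetical calibration).

def required_capital_ratio(base_ratio, credit_gap,
                           sensitivity=0.5, floor=0.04, cap=0.15):
    """Raise the requirement in booms, relax it (down to a floor) in busts.

    base_ratio: through-the-cycle minimum capital ratio (e.g., 0.08).
    credit_gap: credit growth above trend, in percentage points;
                positive in booms, negative in busts.
    """
    ratio = base_ratio + sensitivity * (credit_gap / 100.0)
    return max(floor, min(cap, ratio))  # clamp to [floor, cap]

print(required_capital_ratio(0.08, 6.0))   # boom: requirement rises above base
print(required_capital_ratio(0.08, -4.0))  # bust: requirement eases toward the floor
```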
Changes in Regulation

Even if regulations are strengthened to detect and prevent this shift in activity, banks can subvert capital requirements by assuming risks the regulators do not see or do not penalize adequately with capital charges. Attempts to reduce capital requirements in busts are equally fraught. The risk-averse market wants banks to hold much more capital than regulators require, and its will naturally prevails. Even the requirements themselves may not be immune to the cycle: Once memories of the current crisis fade and the ideological cycle turns, the political pressure to soften capital requirements or their enforcement will be enormous.

To have a better chance of creating stability through the cycle—of being cycle-proof—new regulations should be comprehensive, contingent, and cost-effective. Regulations that apply comprehensively to all levered financial institutions are less likely to encourage the drift of activities from heavily regulated to lightly regulated institutions over the boom, a source of instability because the damaging consequences of such drift come back to hit the heavily regulated institutions during the bust through channels no one foresees. Regulations should also be contingent, so that they have maximum force when the private sector is most likely to do itself harm but bind less the rest of the time. This will make regulations more cost-effective, which in turn makes them less prone to arbitrage or dilution.

Consider some examples of such regulations. First, instead of asking institutions to raise permanent capital, ask them to arrange for capital to be infused when the institution or the system is in trouble. Because these "contingent capital" arrangements will be contracted in good times (when the chances of a downturn seem remote), they will be relatively cheap (compared with raising new capital in the midst of a recession) and thus easier to enforce.
Also, because the infusion is seen as an unlikely possibility, firms cannot go out and increase their risks using the future capital as backing. Finally, because the infusions occur in bad times, when capital is really needed, they protect the system and the taxpayer in the right contingencies.

One version of contingent capital would require banks to issue debt that automatically converts to equity when two conditions are met: first, the system is in crisis, either based on an assessment by regulators or based on objective indicators; and second, the bank's capital ratio has fallen below a certain value (Squam Lake Working Group on Financial Regulation, 2009). The first condition ensures that banks that do badly because of their own idiosyncratic errors, and not when the system is in trouble, do not avoid the disciplinary effects of debt. The second condition rewards well-capitalized banks by allowing them to avoid the forced conversion (the number of shares into which the debt converts would be set at a level that substantially dilutes the value of old equity), while also giving banks that anticipate losses an incentive to raise new equity well in advance.

Another version of contingent capital would require systemically important levered financial institutions to buy fully collateralized insurance policies (from unlevered institutions, foreigners, or the government) that will infuse capital into these institutions when the system is in trouble (Kashyap, Rajan, and Stein, 2009). Here is one way this type of system could operate. Megabank would issue capital insurance bonds—say, to sovereign wealth funds—and invest the proceeds in Treasury bonds, which would then be placed in a custodial account at State Street Bank.
Every quarter, Megabank would pay a pre-agreed insurance premium (contracted at the time the capital insurance bond is issued) which, together with the interest accumulated on the Treasury bonds held in the custodial account, would be paid to the sovereign fund. If the aggregate losses of the banking system exceeded a certain prespecified amount, Megabank would start receiving a payout from the custodial account to bolster its capital. The sovereign wealth fund would then face losses on the principal it had invested, but on average it would be compensated by the insurance premium.

Consider next regulations aimed at "too big to fail" institutions. Regulations limiting their size and activities will become very onerous when growth is high, thus increasing the incentive to dilute them. Perhaps a more cyclically sustainable regulation would instead make these institutions easier to close down. What if systemically important financial institutions were required to develop a plan that would enable them to be resolved over a weekend? Such a "shelf bankruptcy" plan would require banks to track, and document, their exposures much more carefully and in a timely manner, probably through much better use of technology. The plan would require periodic stress testing by regulators and the support of enabling legislation—such as provisions facilitating an orderly transfer of a troubled institution's swap books to precommitted partners. Not only would the requirement to develop resolution plans give these institutions an incentive to reduce unnecessary complexity and improve management, it also would not be much more onerous in the boom and might indeed force management to think the unthinkable at such times.

CONCLUSION

A crisis offers us a rare window of opportunity to implement reforms—it is a terrible thing to waste. The temptation will be to overregulate, as we have done in the past.
This creates its own perverse dynamic. For as we start eliminating senseless regulations once the recovery takes hold, we will find that deregulation adds so much economic value that it further empowers the deregulatory camp. Eventually, though, the deregulatory momentum will cause us to eliminate regulatory muscle rather than fat. Perhaps, rather than swinging maniacally between too much and too little regulation, it would be better to think in terms of cycle-proof regulation.

REFERENCES

Bernanke, Ben S. "The Global Saving Glut and the U.S. Current Account Deficit." Remarks by Governor Ben S. Bernanke at the Homer Jones Memorial Lecture, St. Louis, Missouri, April 14, 2005; www.federalreserve.gov/boarddocs/speeches/2005/20050414/default.htm.

Brunnermeier, Markus K.; Crockett, Andrew; Goodhart, Charles A.; Persaud, Avinash D. and Shin, Hyun Song. The Fundamental Principles of Financial Regulation: Geneva Reports on the World Economy 11. London: Centre for Economic Policy Research, 2009.

Kashyap, Anil K.; Rajan, Raghuram G. and Stein, Jeremy C. "Rethinking Capital Regulation," in Federal Reserve Bank of Kansas City Symposium, Maintaining Stability in a Changing Financial System, February 2009, pp. 431-71; www.kc.frb.org/publicat/sympos/2008/KashyapRajanStein.03.12.09.pdf.

Squam Lake Working Group on Financial Regulation. "An Expedited Resolution Mechanism for Distressed Financial Firms: Regulatory Hybrid Securities." Working paper, Council on Foreign Relations, Center for Geoeconomic Studies, April 2009; www.cfr.org/content/publications/attachments/Squam_Lake_Working_Paper3.pdf.

Wighton, David. "Citigroup Chief Stays Bullish on Buy-Outs." Financial Times, July 9, 2007; www.ft.com/cms/s/0/80e2987a-2e50-11dc-821c-0000779fd2ac.html?nclick_check=1.

Systemic Risk and the Financial Crisis: A Primer

James Bullard, Christopher J. Neely, and David C.
Wheelock

How did problems in a relatively small portion of the home mortgage market trigger the most severe financial crisis in the United States since the Great Depression? Several developments played a role, including the proliferation of complex mortgage-backed securities and derivatives with highly opaque structures, high leverage, and inadequate risk management. These, in turn, created systemic risk—that is, the risk that a triggering event, such as the failure of a large financial firm, will seriously impair financial markets and harm the broader economy. This article examines the role of systemic risk in the recent financial crisis. Systemic concerns prompted the Federal Reserve and U.S. Department of the Treasury to act to prevent the bankruptcy of several large financial firms in 2008. The authors explain why the failures of financial firms are more likely to pose systemic risks than the failures of nonfinancial firms and discuss possible remedies for such risks. They conclude that the economy could benefit from reforms that reduce systemic risks, such as the creation of an improved regime for resolving failures of large financial firms. (JEL E44, E58, G01, G21, G28)

Federal Reserve Bank of St. Louis Review, September/October 2009, 91(5, Part 1), pp. 403-17.

The financial crisis of 2008-09—the most severe since the 1930s—had its origins in the housing market. After several years of rapid growth and profitability, banks and other financial firms began to realize significant losses on their investments in home mortgages and related securities in the second half of 2007. Those losses triggered a full-blown financial crisis when banks and other lenders suddenly demanded much higher interest rates on loans to risky borrowers, including other banks, and trading in many financial instruments declined sharply.
A string of failures and near-failures of major financial institutions—including Bear Stearns, IndyMac Federal Bank, the Federal National Mortgage Association (Fannie Mae), the Federal Home Loan Mortgage Corporation (Freddie Mac), Lehman Brothers, American International Group (AIG), and Citigroup—kept financial markets on edge throughout much of 2008 and into 2009. The financial turmoil is widely considered the primary cause of the economic recession that began in late 2007.

As individual firms lurched toward collapse, market speculation focused on which firms the government would consider "too big" or "too connected" to allow to fail. Why should any firm, large or small, be protected from failure? For financial firms, the answer centers on systemic risk. Systemic risk refers to the possibility that a triggering event, such as the failure of an individual firm, will seriously impair other firms or markets and harm the broader economy.

[James Bullard is president and chief executive officer of the Federal Reserve Bank of St. Louis. Christopher J. Neely is an assistant vice president and economist and David C. Wheelock is a vice president and economist at the Federal Reserve Bank of St. Louis. The authors thank Richard Anderson, Rajdeep Sengupta, and Yi Wen for comments on a previous draft of this article. Craig P. Aubuchon provided research assistance.]

Systemic risk concerns were at the heart of the Federal Reserve's decision to facilitate the acquisition of Bear Stearns by JPMorgan Chase in March 2008 and the U.S. Department of the Treasury's decisions to place Fannie Mae and Freddie Mac into conservatorship[1] and to assume control of AIG in September 2008. Federal Reserve Chairman Bernanke (2008b) explained the Fed's decision to facilitate the acquisition of Bear Stearns as follows:

Our analyses persuaded us…that allowing Bear Stearns to fail so abruptly at a time when the financial markets were already under considerable stress would likely have had extremely adverse implications for the financial system and for the broader economy. In particular, Bear Stearns' failure under those circumstances would have seriously disrupted certain key secured funding markets and derivatives markets and possibly would have led to runs on other financial firms.

[Footnote 1: A conservatorship is a legal arrangement in which one party is given control of another party's legal or financial affairs. In this case, the Federal Housing Finance Agency was appointed conservator of Fannie Mae and Freddie Mac by the U.S. Treasury Department in accordance with the Federal Housing Finance Regulatory Reform Act of 2008.]

This article describes how the failure of a single financial firm or market could endanger the entire U.S. financial system and economy and how this possibility influenced the response of policymakers to the recent crisis. Further, we explain why failures of financial institutions are more likely to pose systemic risks than failures of nonfinancial firms and discuss possible remedies for the systemic risks exposed by this particular financial crisis.[2]

[Footnote 2: This article is based on and extends "Systemic Risk and the Macroeconomy" (see Bullard, 2008).]

A BRIEF GUIDE TO THE FINANCIAL CRISIS

We begin with a brief review of the evolution of the financial crisis and its origins in the housing market to understand systemic risk in the context of this crisis. U.S. house prices began to rise far above historical values in the late 1990s. Figure 1 shows the growth in an index of house prices relative to the consumer price index (CPI), an index of residential rents, and median family income, all normalized to equal 1 in the first quarter of 1995. House prices rose rapidly relative to consumer price inflation, rents, and median family income between 1998 and 2006. Analysts attribute the rapid growth in the demand for homes and the associated rise in house prices to unusually low interest rates, large capital inflows, rapid income growth, and innovations in the mortgage market.[3]

[Figure 1: U.S. House Prices Relative to the CPI, Rents, and Median Family Income (1995:Q1–2008:Q4). Series shown: HPI/CPI (excluding shelter), HPI/Rent, HPI/Income. NOTE: The house price index (HPI) shown in the figure is the S&P/Case-Shiller National Home Price Index; the consumer price index (CPI) data exclude the shelter component of the index; the rent index is a separate component of the CPI; median family income is an aggregated monthly series from the National Association of Realtors; and recession dates (vertical gray bars) are from the National Bureau of Economic Research.]

A rapid rise in the share of nonprime loans, especially nonprime loans with unconventional terms, was a key feature of the mortgage market during the housing boom. Nonprime loans increased from 9 percent of new mortgage originations in 2001 to 40 percent in 2006 (DiMartino and Duca, 2007).
Most nonprime mortgage loans were made to homebuyers with weak credit histories, minimal down payments, low income-to-loan ratios, or other deficiencies that prevented them from qualifying for a prime loan.[4] Many nonprime loans also had adjustable interest rates or other features that kept the initial payments low but subjected borrowers to risk if interest rates rose or house prices declined. The rise in nonprime loans was accompanied by a sharp increase in the percentage of nonprime loans that originating lenders sold to banks and other financial institutions.

[Footnote 3: Bernanke (2005) describes the "global saving glut" and changing pattern of international capital flows during the 1990s and early 2000s, and Caballero, Farhi, and Gourinchas (2008) discuss the role of capital inflows in fueling the housing boom. Taylor (2009), by contrast, blames the housing boom primarily on loose monetary policy during 2002-05.]

[Footnote 4: Mortgage loans are typically classified as prime or nonprime, depending on the risk that a borrower will default on the loan. Nonprime loans are further distinguished between "subprime" and "alternative-A" (Alt-A), again depending on credit risk. Generally, borrowers qualify for prime mortgages if their credit scores are 660 or higher and the loan-to-value ratio is below 80 percent. Borrowers with lower credit scores or other financial deficiencies, such as a previous record of delinquency, foreclosure, or bankruptcy, or higher loan-to-value ratios, are more likely to qualify only for a nonprime loan. See Sengupta and Emmons (2007) for more information about nonprime mortgage lending.]
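The prime/nonprime rule of thumb cited in the footnote (credit scores of 660 or higher and a loan-to-value ratio below 80 percent) can be sketched as a simple classifier. The function and its example figures are illustrative only; real underwriting weighs many more criteria, such as documentation, delinquency history, and debt-to-income ratios.

```python
# Illustrative prime/nonprime classification using the rough cutoffs
# cited in the text: credit score >= 660 and loan-to-value < 80 percent.

def classify_mortgage(credit_score, loan_amount, home_value):
    ltv = loan_amount / home_value  # loan-to-value ratio
    if credit_score >= 660 and ltv < 0.80:
        return "prime"
    return "nonprime"

print(classify_mortgage(720, 150_000, 250_000))  # prime: strong score, LTV 0.60
print(classify_mortgage(620, 150_000, 250_000))  # nonprime: score below 660
print(classify_mortgage(700, 240_000, 250_000))  # nonprime: LTV 0.96
```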
The practice of selling conventional prime mortgages has been common since the 1930s, when the federal government established Fannie Mae to promote the flow of capital to the mortgage market.5 The federal government chartered Freddie Mac in 1970 to compete with Fannie Mae, which had been sold to private investors in 1968. Both firms purchase large amounts of prime mortgage loans, which they finance by selling bonds in the capital markets. Before the 1990s, Fannie Mae, Freddie Mac, and other firms rarely purchased nonprime loans. Instead, the originating lenders held most nonprime loans, which comprised a relatively small portion of the mortgage market, until they matured.6

When a lender sells a loan rather than holding it until maturity, the lender has less incentive to ensure that the borrower is creditworthy. Many analysts contend that lax underwriting standards contributed to the high rate of nonprime loan delinquencies.7 Although purchasers of loans do have an incentive to verify the creditworthiness of borrowers, many evidently failed to appreciate or manage the level of risk in their portfolios during the recent housing boom (Bernanke, 2008a). In some instances, investors may have relied too heavily on the judgments of credit rating agencies.8 The banks and other financial institutions that purchased nonprime mortgage loans typically created residential mortgage-backed securities (RMBSs) based on pools of mortgage loans.

Footnote 5: Wheelock (2008) discusses the establishment of Fannie Mae and other agencies and programs to alleviate home mortgage distress during the Great Depression.

Footnote 6: Fannie Mae and Freddie Mac are not permitted to purchase loans that exceed a specific limit (currently $417,000) except in designated high-cost areas. Further, Fannie Mae and Freddie Mac require minimum documentation and other standards on the loans they purchase, and hence they purchase relatively few nonprime loans.
Footnote 7: Demyanyk and Van Hemert (2008) and Bhardwaj and Sengupta (2008) provide alternative perspectives on the role of lax underwriting of nonprime loans.

Footnote 8: Critics charge that the rating agencies had a conflict of interest because bond issuers paid for the ratings (New York Times, 2007; Fons, 2008a,b). In addition, the rating agencies used inadequate risk models that did not account for the possibility of a serious drop in housing prices. See Fons (2008a,b).

[Figure 2: U.S. House Prices and Foreclosures. The figure plots new foreclosures started (percent, left axis) against the year/year percent change in U.S. house prices (right axis), 1998-2008. NOTE: Foreclosures data are from the Mortgage Bankers Association; the house price index (HPI) is the S&P/Case-Shiller National Home Price Index. Vertical gray bars indicate recessions.]

An RMBS redistributes the income stream from the underlying mortgage pool among bonds that differ by the seniority of their claim. Sometimes additional securities, known as collateralized mortgage obligations (CMOs) or collateralized debt obligations, are created by combining multiple RMBSs (or parts of RMBSs) and then selling portions of the income streams derived from the mortgage pool or RMBSs to investors with different appetites for risk. The securities rating agencies assigned high ratings to many of the mortgage-related securities created to finance purchases of nonprime loans.
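The seniority structure just described can be illustrated with a minimal payment waterfall. The three-tranche deal and its balances below are hypothetical; real structures add many features such as overcollateralization and performance triggers:

```python
def waterfall(pool_cash, tranches):
    """Distribute the pool's cash flow to tranches in order of seniority.
    `tranches` is a list of (name, amount_owed), most senior first."""
    payouts = {}
    for name, owed in tranches:
        paid = min(pool_cash, owed)   # each tranche is paid before the next
        payouts[name] = paid
        pool_cash -= paid
    return payouts

tranches = [("senior", 70), ("mezzanine", 20), ("equity", 10)]
# No losses: every tranche is paid in full.
print(waterfall(100, tranches))  # {'senior': 70, 'mezzanine': 20, 'equity': 10}
# A 20 percent shortfall in the pool falls entirely on the junior claims,
# which is why the senior bonds could carry high ratings.
print(waterfall(80, tranches))   # {'senior': 70, 'mezzanine': 10, 'equity': 0}
```

The sketch also shows the vulnerability: once losses exceed the junior cushion, even the "safe" senior tranche begins to absorb them.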
As long as house prices were rising, most nonprime loans performed well because borrowers were usually able to refinance or sell their house—at a higher price—if they were unable to make their loan payments.9 When house prices began to fall, many borrowers found that they owed more on their house than it was worth. This situation made it impossible for some borrowers to repay their loan by selling their house or refinancing their mortgage, and it also created an incentive simply to default. Consequently, loan defaults and foreclosures rose sharply, as shown in Figure 2, which plots data on the percentage of home mortgages entering foreclosure in a given quarter and the year-over-year percentage change in the S&P/Case-Shiller National Home Price Index. Rising loan delinquencies caused many RMBSs and CMOs backed by home mortgage loans to default, and investment banks and other investors that held large portfolios of RMBSs and CMOs experienced substantial losses. Ultimately, the decline in house prices and the increase in mortgage loan defaults that began in 2006 were the root cause of the financial crisis. The following sections explore how systemic risks caused losses on nonprime mortgages and mortgage-related securities to disrupt the entire financial system.

Footnote 9: Most nonprime loan originations were refinances of existing mortgages in which borrowers withdrew accumulated equity from their homes (a phenomenon known as a “cash-out” refinance). See Bhardwaj and Sengupta (2009).

SYSTEMIC RISK

Systemic Risk and Information Cascades

Sophisticated investors and counterparties will cease to do business with a firm once the firm’s weak condition becomes known, as they did with Bear Stearns and Lehman Brothers.
However, the inability to sort perfectly between good and bad risks can lead banks and other investors to pull away from nearly all lending during a crisis. The tendency of lenders to seek safe investments during a crisis explains why trading in risky assets declined sharply and their market yields rose relative to yields on federal government debt during 2007-08. Sometimes all firms in an industry are “tarred by the same brush” and one firm’s failure leads investors to shun an entire industry. For example, before the introduction of federal deposit insurance in 1933, the failure of individual banks sometimes caused the public to shift a large portion of its funds from bank deposits into cash. Why should the failure of a single firm cause the public to suspect an entire industry? Again, the answer is related to the fact that people have imperfect information. Because depositors lack complete information about the condition of their bank, the failure of one bank can trigger mass withdrawals by depositors of other banks to avoid losses in the event their own bank fails. Indeed, even if a particular depositor believed that his bank was fundamentally sound, it would still make sense for him to withdraw his money if he thought that withdrawals by other depositors might cause the bank to fail. Banking panics are especially dangerous because large-scale deposit withdrawals can make bank failures more likely, as well as cause banks to reduce their lending in an effort to boost liquidity. Several severe banking panics during the nineteenth and early twentieth centuries resulted in widespread bank failures, financial distress, and economic contractions.12 Federal deposit insurance has largely ended the problem of banking panics.
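The self-fulfilling logic of a run can be made concrete with a stylized sequential-service calculation in the spirit of Diamond and Dybvig (1983). All balance-sheet numbers and the fire-sale discount below are hypothetical:

```python
def recovery_if_wait(liquid, illiquid, deposits, fraction_withdrawing,
                     fire_sale_discount=0.5):
    """Stylized run mechanics: withdrawals are paid first from liquid assets,
    then by selling illiquid loans at `fire_sale_discount` cents on the dollar.
    Returns what a patient depositor recovers per dollar of deposits."""
    demanded = deposits * fraction_withdrawing
    shortfall = max(0.0, demanded - liquid)
    sold_face_value = shortfall / fire_sale_discount  # selling cheap burns assets
    remaining = max(0.0, illiquid - sold_face_value)
    patient_claims = deposits * (1.0 - fraction_withdrawing)
    return min(1.0, remaining / patient_claims)

# Few withdrawals: the bank is sound, so waiting costs a patient depositor nothing.
print(recovery_if_wait(20, 90, 100, 0.1))  # 1.0
# Mass withdrawals: fire sales leave patient depositors 25 cents on the dollar,
# so merely expecting a run makes joining it individually rational.
print(recovery_if_wait(20, 90, 100, 0.6))  # 0.25
```

The same fundamentally solvent bank thus supports two outcomes, which is why a rumor alone can trigger withdrawals.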
Systemic Risk, Counterparty Risk, and Asymmetric Information

In the recent financial crisis, the most important type of risk to the financial system has been “counterparty risk,” which is also known as “default risk.”10 Counterparty risk is the danger that a party to a financial contract will fail to live up to its obligations. Counterparty risk exists in large part because of asymmetric information. Individuals and firms typically know more about their own financial condition and prospects than do other individuals and firms. Much of the recent concern about systemic risk has focused on investment banks that deal in complex financial contracts. Consider the following example: Suppose Bank A purchases an option from Bank B to hedge the risk of a change in the term structure of interest rates. If Bank B later fails, perhaps because of bad investments in home mortgages, then the option sold by Bank B may lose value or even become worthless. Thus, Bank A—which thought it was carefully hedging its risk—is adversely affected by Bank B’s problems in housing markets. Of course, financial firms can protect themselves to some degree in such simple situations. The logic of self-interested behavior combined with market clearing would lead to an appropriate pricing of risk; Bank A would have considered the possibility of the failure of Bank B and taken this into account in its contingency plan. For example, Bank A might require Bank B to post collateral to protect the value of the option in case Bank B failed. But in actual financial markets, arrangements are so complex that the nature of the risk that firms face might not be obvious. In addition, the value of collateral fluctuates, and thus even carefully collateralized deals are subject to some risk.11

Footnote 10: Taylor (2009) argues that the financial crisis was associated mainly with an increase in counterparty risk and not a shortage of liquidity.

When IndyMac Bank was rumored to be near failure in 2008,
many depositors withdrew their funds from the bank. However, rather than holding their funds as cash, IndyMac’s depositors merely moved their deposits to other banks. Similarly, the run on IndyMac did not trigger mass deposit withdrawals at other banks. Panic-like phenomena have occurred during the recent financial crisis, however. For example, when the Reserve Primary Fund, a large money market mutual fund, halted investor redemptions after the net asset value of its shares fell below $1 in September 2008, share redemptions rose sharply at other money market mutual funds. Although most money market mutual funds had ample reserves and good assets, investors interpreted the troubles of the Reserve Primary Fund (which held a large amount of Lehman Brothers debt) as a possible indicator of problems at other mutual funds. The federal government quickly guaranteed the value of existing accounts in money market mutual funds to discourage panic withdrawals from such funds. The dramatic declines in trading volume and liquidity in the markets for mortgage-related securities during the recent financial crisis also reflected investor panic. Trading in all RMBSs declined sharply when defaults and ratings downgrades made investors wary of RMBSs in general.

Footnote 11: Kiyotaki and Moore (1997) and Pintus and Wen (2008) discuss how procyclical fluctuations in the value of collateral can exacerbate financial booms and busts and contribute to macroeconomic fluctuations.

Footnote 12: Calomiris and Gorton (2000), Dwyer and Gilbert (1989), and Wicker (2000) are among the numerous studies of the causes and effects of U.S. banking panics. Diamond and Dybvig (1983) provide an important theoretical analysis of banking panics.
Heavy reliance by investors on the evaluation of mortgage instruments by the rating agencies may have exacerbated swings in market liquidity. For example, a ratings downgrade, especially of a previously highly rated security, could induce panic selling by signaling possible downgrades or losses on similar securities. Ratings downgrades and declining asset values can also force borrowers to post additional collateral to maintain a given level of borrowing. AIG collapsed in September 2008 when it was unable to raise additional collateral in the wake of a downgrade of its debt rating (Son, 2008). In general, deterioration in the collateral value of borrower assets was an important amplification mechanism during the recent financial crisis. Falling asset prices caused lenders to demand more collateral, which caused borrowers to dump risky assets, thereby exacerbating declines in their market values and leading to further demands for more collateral (Brunnermeier, 2008).

Why the Financial System Is Special

Many aspects of systemic risk are not unique to financial institutions or markets. The failure of a nonfinancial firm, such as an automobile manufacturer, will affect the firm’s suppliers and dealerships, as well as the local economies where manufacturing plants and other operations are located. By the same token, a default by an airline company on its debt obligations might cause investors to shun the debt of other airline companies if investors believe that the default reflected an industry-wide problem, such as rising fuel prices. Still, over the past decade, some very large firms have failed, including Enron, WorldCom, and several major airlines, yet none caused significant problems beyond its immediate shareholders, employees, suppliers, and customers. The failure of a nonfinancial firm would rarely threaten the solvency of a competitor, let alone significantly affect the economy more broadly.
Instead, the failure of a large firm could increase the market shares and profitability of the remaining firms in an industry, as well as provide opportunities for smaller firms to enter previously inaccessible markets.

Why do we think the failure of a large financial firm presents systemic risks that the failure of a nonfinancial firm does not? There are at least three reasons. The first is interconnectedness. In the normal course of business, large commercial and investment banks lend and trade with each other through interbank lending and deposit markets, transactions in over-the-counter (OTC) derivatives, and wholesale payment and settlement systems. Settlement risk—the risk that one party to a financial transaction will default after the other party has delivered—is a major concern for large financial institutions whose daily exposures routinely run into many billions of dollars. The lightning speed of financial transactions and the complex structures of many banks and securities firms make it especially difficult for a firm to fully monitor the counterparties with which it deals, let alone the counterparties of counterparties. The rapid failure of a seemingly strong bank could potentially expose other firms to large losses. Even firms that do not transact directly with the affected bank can be exposed through their dealings with affected third parties.13

A second reason why the financial sector is especially vulnerable to systemic risk is leverage. Compared with most nonfinancial firms, banks and other financial institutions are highly leveraged—that is, they fund a substantial portion of their assets by issuing debt rather than selling equity. During the housing boom, many banks, hedge funds, and other firms that invested heavily in mortgage-related securities financed their holdings by borrowing heavily in debt markets.
Investment banks were especially highly leveraged before the crisis, with debt-to-equity ratios of approximately 25 to 1. That is, for every dollar of equity, investment banks issued an average of $25 of debt. By comparison, commercial banks, which are subject to minimum capital requirements, had leverage ratios of approximately 12 to 1.14 High leverage meant that financial firms enjoyed high rates of return on equity when times were good but also a high risk of failing when markets turned against them. Because investment banks held a mere $4 of equity for every $100 of assets on their balance sheets, a relatively modest (4 percent) decline in the value of an investment bank’s assets would wipe out the bank’s equity, forcing it to raise additional capital and/or sell some of its assets. Many investment banks and other financial institutions sustained large losses on their portfolios of RMBSs and were forced to raise additional capital to remain solvent. Similarly, Fannie Mae and Freddie Mac ran into financial difficulties in part because of their extreme leverage. The federal government placed both Fannie Mae and Freddie Mac into conservatorship in September 2008 because losses on their portfolios of mortgages and RMBSs drove the firms to the brink of insolvency. Had those firms held more capital, they could have withstood larger losses without becoming insolvent.

Footnote 13: Lagunoff and Schreft (2001) present a model in which a financial crisis can arise as losses spread among firms whose portfolios are linked to those of other firms.

Footnote 14: See Economic Report of the President (2009, p. 71).

A third reason why the financial sector is especially vulnerable to systemic risk is the tendency of financial firms to finance their holdings of relatively illiquid long-term assets with short-term debt.
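The leverage arithmetic described above can be checked in a few lines. The balance-sheet numbers are the illustrative ratios from the text, not those of any particular firm:

```python
def equity_after_shock(assets, debt_to_equity, decline):
    """Equity left after asset values fall by `decline` (a fraction).
    With a debt-to-equity ratio k, equity is assets/(1+k) and debt is
    the rest; the entire loss falls on equity."""
    equity = assets / (1 + debt_to_equity)
    debt = assets - equity
    return assets * (1 - decline) - debt

# 25-to-1 leverage: roughly $3.85 of equity per $100 of assets, so a
# 4 percent decline in asset values leaves the firm insolvent.
print(equity_after_shock(100, 25, 0.04))  # about -0.15
# 12-to-1 leverage absorbs the same shock with equity to spare.
print(equity_after_shock(100, 12, 0.04))  # about 3.69
```

The comparison makes the point in the text concrete: the same 4 percent shock that wipes out an investment bank's thin equity cushion leaves a less-leveraged commercial bank solvent.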
Not only are financial institutions typically highly leveraged, but the nature of their business entails an inherent mismatch in the maturities of their assets and liabilities that can make them vulnerable to interest rate or liquidity shocks. Most financial intermediaries borrow short and lend long—that is, they fund long-term, relatively illiquid investments with short-term debt. For example, commercial banks traditionally have used demand deposits, which depositors can withdraw at any time, to fund loans and other long-term investments. Many investment banks and securities firms rely heavily on commercial paper, repurchase agreements (repos),15 and other short-term funding sources to finance long-term investments. If depositors suddenly pull their funds from a commercial bank or lenders refuse to purchase a securities firm’s commercial paper or repos, the bank or securities firm could be forced into bankruptcy. Bear Stearns collapsed when investors refused to purchase the firm’s short-term debt. Other firms faced sharply higher funding costs in 2007-08 as markets reevaluated the creditworthiness of borrowers. The speed with which the markets can “turn off the tap” makes financial institutions especially vulnerable to temporary disruptions of liquidity in financial markets.16

Footnote 15: A repo is a trade in which one party agrees to sell securities to a second party and to buy those securities back at a prespecified price and date. It amounts to collateralized borrowing.

MITIGATING SYSTEMIC RISK

Recognizing the problem of systemic risk, financial firms have long cooperated to limit risks associated with the failures of other financial firms. For example, before the creation of the Federal Reserve System in 1913, commercial banks devised clearinghouse arrangements in an
attempt to protect themselves from banking panics. The primary purpose of a clearinghouse is to clear checks and other forms of payment among member banks. In the nineteenth century, clearinghouses developed mechanisms to protect their members from banking panics and to provide additional liquidity for banks facing deposit runs. For example, clearinghouse members could borrow certificates to settle their balances with other member banks in lieu of cash or other reserves. Further, clearinghouse members collectively guaranteed the payment obligations of members threatened by deposit withdrawals.17

Footnote 16: Acharya, Gale, and Yorulmazer (2009) present a model that can explain a sudden collapse of liquidity in a financial market associated with a change in the information structure of the assets traded in the market.

Financial market exchanges, such as the Chicago Mercantile Exchange, are also private arrangements that limit systemic risks. Securities and commodities exchanges arose centuries ago to settle trades efficiently under clear, fixed rules. Exchanges are the central counterparty to every transaction. Like bank clearinghouses, exchanges reduce default risk by requiring their members to meet minimum capital and disclosure requirements. If a member of the exchange does default, the other members bear that firm’s obligations according to the exchange’s loss-sharing rules. Thus, membership requirements and loss-sharing arrangements lessen the risk that default by one firm will adversely affect other members of the exchange. Many derivatives trade in OTC markets, which consist of financial institutions doing business directly with each other rather than through an exchange.
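The exposure-reducing role of a central counterparty can be illustrated with a toy multilateral netting computation. The three dealers and the amounts are hypothetical:

```python
from collections import defaultdict

def net_exposures(trades):
    """Net each firm's obligations across all trades, as a central
    counterparty would. `trades` is a list of (payer, payee, amount)."""
    net = defaultdict(float)
    for payer, payee, amount in trades:
        net[payer] -= amount
        net[payee] += amount
    return dict(net)

# Hypothetical bilateral obligations among three dealers.
trades = [("A", "B", 100), ("B", "C", 90), ("C", "A", 80)]
gross = sum(amount for _, _, amount in trades)
print(gross)                  # 270 of gross bilateral exposure...
print(net_exposures(trades))  # ...nets to A: -20, B: +10, C: +10 against the CCP
```

With the clearinghouse interposed, each dealer faces only its small net position against a capitalized central counterparty instead of large gross claims on each individual counterparty, which is the mechanism behind the clearing proposals discussed next.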
Many analysts have identified weaknesses in OTC derivatives markets, especially in the market for credit default swaps, as important contributors to the recent financial crisis.18 The use of credit default swaps and other financial derivatives has grown enormously in recent years. Although useful for hedging risks, the proliferation of OTC derivatives is widely believed to have increased systemic risks in the financial system by increasing the extent to which large financial firms are interconnected and by reducing transparency. Many analysts believe that these risks could be substantially reduced by establishing a central exchange or clearinghouse for derivatives trading.19 Because exchange-traded derivatives are standardized contracts that are traded among many parties every day, they could be valued more precisely than the custom products traded among individual firms on OTC markets. In addition, the requirement for exchange participants to post margins against potential losses and mark positions to market daily would help reduce counterparty risks. Exchange participants that cannot cover their losses will have their positions closed out before the losses become too large.

Cooperative arrangements, such as clearinghouses and exchanges, are one way of reducing systemic risks. However, in many circumstances private measures might be insufficient to ameliorate systemic risk. For example, individual firms could be reluctant to reveal private business information to competitors, which might impair a loss-sharing agreement. Further, firms often have little incentive to mitigate costs borne by others. Thus, a firm whose failure poses systemic risk will tend to behave less cautiously than society would desire and, hence, government involvement might be necessary to limit systemic risks.20

Footnote 19: For example, see Bernanke (2008c) and Counterparty Risk Management Policy Group III (2008).

Footnote 17: See Gorton (1985), Timberlake (1984), and White (1983, pp.
74-83) for more information about the role of clearinghouses in mitigating banking panics.

Footnote 18: Wallison (2008) discusses the credit default insurance (or swap) market, and Schinasi et al. (2001) describe the OTC derivatives market.

Footnote 20: Systemic risk constitutes a “negative externality” in the sense that the actions of one firm harm others. The situation is analogous to a firm that pollutes the environment. Because others bear at least some of the costs of the pollution, the firm will tend to pollute more than it would if it had to compensate others for these costs. Negative externalities are an example of a market failure that may require government intervention to ameliorate.

Proposals for Government Policies to Control Systemic Risks

The recent financial crisis has prompted numerous proposals for enhanced government regulation and supervision of large financial firms and markets to address systemic risks. Many proposals call for increased supervision of systemically important financial institutions, as well as new rules for resolving insolvent firms. Other proposals recommend regulation to limit risk-taking and to ensure ample liquidity in financial markets. This section reviews some of the regulatory and legal proposals suggested in response to the recent crisis.

Many reform proposals call for the creation of a systemic financial regulator with responsibilities for “macroprudential” oversight of the financial system. A macroprudential regulator would consider broad economic trends and the impact of a firm’s actions on the entire financial system, not just the firm’s own risk of default (Bernanke, 2008c).
To some extent, regulators already consider broad economic trends and effects, but several proposals argue for bringing all large or systemically important financial institutions under the umbrella of a systemic regulator.21 One justification for the regulation and supervision of systemically important firms is that governments are unlikely to permit such firms to fail, or if they do fail, the government will substantially protect many, if not all, of the firm’s creditors from loss. Such a government guarantee—either explicit or implicit—can encourage firms to take greater risks than they otherwise would, which increases the likelihood of their failure.22 Consequently, regulation and supervision are required to offset the incentive to take excessive risk.

Footnote 21: For example, the Group of Thirty (2009, p. 17) argues that at the start of 2008, there were five U.S. investment banks (Bear Stearns, Goldman Sachs, Lehman Brothers, Merrill Lynch, and Morgan Stanley), one insurance company (AIG), and two government-sponsored enterprises (Fannie Mae and Freddie Mac) that were systemically significant and therefore should have been subject to stringent regulation and supervision. During 2008, all but two of those firms (Goldman Sachs and Morgan Stanley) failed or suffered large losses that required government intervention, and both Goldman Sachs and Morgan Stanley became bank holding companies.

Footnote 22: “Moral hazard” describes the idea that individuals and firms engage in riskier behavior when they are protected from the danger that such behaviors create. For example, a person who purchases fire insurance might be less concerned with fire hazards than one who would personally bear the full cost of a fire.

Federal deposit insurance is one example of a government guarantee that can encourage excessive risk-taking. Without deposit insurance, rational, fully informed depositors would require banks with risky assets to hold more capital or pay higher deposit rates than banks with less-risky
assets. However, with insurance, depositors have little incentive to monitor the risks their banks take; hence, deposit insurance gives banks an incentive to assume greater risks than they otherwise would.23 Whereas deposit insurance is an explicit guarantee, the public’s expectation that the federal government would stand behind the liabilities of Fannie Mae and Freddie Mac is an example of an implicit guarantee. The perception that the government would guarantee the liabilities of Fannie Mae and Freddie Mac enabled those firms to borrow at relatively low interest rates to fund their purchases of mortgages and RMBSs, including securities backed by nonprime mortgages. Fannie Mae and Freddie Mac grew rapidly and operated with much lower capital ratios than other financial firms. Ultimately, financial losses eroded their thin capital cushions and pushed both Fannie Mae and Freddie Mac to the brink of failure before they were placed into government conservatorship.24

Many policymakers and analysts have called for new rules for shutting down large financial firms that become insolvent. The current bankruptcy regime is widely criticized as inadequate for dealing with failures of systemically important financial institutions.25 Delays and uncertainties inherent in the bankruptcy process of a systemically important firm could precipitate or exacerbate a financial crisis. Several reform proposals advocate subjecting nonbank financial firms to “prompt corrective action” in the event their capital ratios fall below prescribed levels. The Federal Deposit Insurance Corporation Improvement Act of 1991 already

Footnote 23: Merton (1977) shows that banks maximize the value of deposit insurance to themselves by maximizing their risk. Capital requirements and other measures can limit the excessive risk-taking encouraged by deposit insurance.
Many analysts blame lax regulation and supervision, coupled with an increase in deposit insurance coverage, for increased risk-taking and the high failure rates of banks and, especially, savings and loan associations during the 1980s. For example, see Kane (1989) and White (1991).

Footnote 24: Poole (2002, 2003, and 2007) was among those warning of the risks inherent with the implicit government guarantee of Fannie Mae and Freddie Mac debt. Stern and Feldman (2004) discuss the effects of “too big to fail” policies in general.

Footnote 25: For example, see Bernanke (2008c) and Congressional Oversight Panel (2009, p. 24).

mandates prompt corrective action for commercial banks. For example, bank supervisors can limit the growth, executive compensation, and payment of dividends by undercapitalized banks. Supervisors can also place critically undercapitalized banks into conservatorship or receivership.26 Federal Reserve Chairman Bernanke (2008c) and others argue that prompt corrective action could reduce systemic risks and discourage large financial holding companies and nonbank financial firms from taking excessive risks. Further, the authority to place a critically undercapitalized firm into conservatorship or receivership would enable the government to resolve failures in an orderly way that imposes the failing firm’s losses on the firm’s creditors and equity holders rather than on taxpayers.

Prompt corrective action is one potential component of a general strengthening of the oversight of large financial firms. Another potential component is a more comprehensive approach to the supervision of complex and systemically important financial firms. Proponents argue that broader supervision of systemically significant firms might have prevented the failure of AIG, which required a government rescue to avoid bankruptcy in September 2008. AIG is a large financial conglomerate with global operations.
The traditional business of AIG is insurance—automobile, life, and so on. In the United States, state government authorities regulate insurance firms—New York State in the case of AIG. State insurance regulations and supervision are designed to ensure the solvency of insurance companies so that they are fairly certain to meet their contingent claims. But insurance regulators have little or no oversight of the other subsidiaries and operations of conglomerates such as AIG. Besides owning an insurance company, AIG also owns a federally chartered savings bank (AIG Bank, FSB), which places AIG under the supervision of the Office of Thrift Supervision. Bank and thrift regulators, however, traditionally have focused on the condition of the depository institution rather than on the systemic risks posed by its parent holding company. The Office of Thrift Supervision has neither the resources to supervise the activities of the entire conglomerate nor the mandate to regulate the extent to which AIG poses systemic risk to the financial system. AIG’s unregulated activities, notably the underwriting of credit default insurance, created substantial losses as the housing market slumped badly in 2006-08. These unregulated operations had grown so large that government officials feared that AIG’s sudden collapse could impose severe losses on other firms and seriously impair the functioning of the entire financial system. To avoid this outcome, the U.S. Treasury and Federal Reserve provided AIG with loans and a capital injection in September 2008 when it appeared that the firm would default on its outstanding debts.

Footnote 26: Aggarwal and Jacques (1998) and Spong (2000, pp. 84-98) provide additional information about commercial bank capital requirements and prompt corrective action. Evanoff and Wall (2003) argue that the use of subordinated debt spreads might be useful to trigger prompt corrective action.
Many proposals for reforming financial regulation call for the supervision of large, complex financial institutions such as AIG by strong regulators with sweeping oversight and enforcement powers that can focus on the systemic risks posed by such organizations.[27] Brunnermeier et al. (2009) argue that an effective macroprudential regulator must have the political independence to impose unpopular measures. To limit discretion, the study argues, regulation should follow preset rules as much as possible. Writing rules to cover every possible contingency is difficult, if not impossible, however, and before sweeping oversight and enforcement authority is assigned to a systemic regulator, the scope of the regulator's authority would have to be carefully delineated.

[27] For example, see Brunnermeier et al. (2009), Congressional Oversight Panel (2009), Group of Thirty (2009), and Paulson et al. (2008).

In addition to enhanced macroprudential oversight, proposals for mitigating systemic risks in the financial system include the imposition of minimum capital requirements on large financial firms, regulations on the use of short-term debt to finance holdings of long-term assets, and changes to market value accounting rules. Many analysts contend that extreme leverage contributed to the recent financial crisis by making large financial firms especially vulnerable to losses. This view has prompted proposals to strengthen capital requirements for commercial banks and to extend those requirements to previously unregulated financial firms, such as investment banks. Some analysts argue that large, systemically significant firms should be required to hold more capital as a percentage of their assets than smaller firms (e.g., Congressional Oversight Panel, 2009, p. 26). Some proposals call for discouraging the funding of long-term, illiquid assets with short-term debt.
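The mechanism these proposals target (a rollover failure forces asset sales, the sales depress prices, and the price declines erode the equity of every holder of the same assets, forcing further sales) can be sketched in a toy simulation. Every parameter below is an illustrative assumption, not calibrated to any real firm or market:

```python
def fire_sale_rounds(assets, equity, funding_lost, price_impact, rounds=5):
    """Iterate the fire-sale feedback loop: forced sales depress the price,
    mark-to-market losses erode equity, and the losses force further sales.
    All parameters are illustrative, not calibrated to any real firm."""
    price, path = 1.0, [1.0]
    to_sell = funding_lost  # assets sold to replace lost short-term funding
    for _ in range(rounds):
        if to_sell <= 0 or equity <= 0:
            break
        prev = price
        price *= 1 - price_impact * to_sell  # price impact of forced sales
        loss = assets * (prev - price)       # mark-to-market loss on holdings
        equity -= loss
        path.append(price)
        to_sell = loss  # weaker balance sheets force more deleveraging
    return path, equity

path, equity_left = fire_sale_rounds(assets=100.0, equity=8.0,
                                     funding_lost=10.0, price_impact=0.002)
print([round(p, 4) for p in path])  # price ratchets down each round
print(round(equity_left, 4))        # equity eroded by the spiral
```

With these numbers the spiral converges after a few rounds; with a larger price impact or a thinner equity cushion it need not, which is the scenario the capital proposals are meant to rule out.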
A firm that cannot roll over its short-term debt could be forced to sell assets, and if many firms are in the same predicament, then asset prices could decline sharply. Such price declines would impose further losses on firms, forcing a spiral of still more sales and further price declines. As the recent financial crisis intensified, especially in September 2008, firms that relied heavily on short-term debt faced sharply higher interest rates as banks suddenly became less willing to lend and investors fled to the safety of U.S. Treasury securities.[28] Future systemic risks could be reduced by discouraging excessive leverage and the use of short-term debt to fund long-term asset holdings, for example, by requiring firms to hold more capital against long-term, relatively illiquid assets funded with short-term debt than against more-liquid assets or assets funded with long-term debt (e.g., Brunnermeier et al., 2009, pp. 38-39). Kotlikoff and Leamer (2009) offer a more radical solution to the problem of short-term debt financing illiquid assets: "limited-purpose banking." This scheme would convert all financial firms to mutual funds so that individual depositors, not the financial firms, would bear the risk of the asset holdings.

[28] The danger of issuing short-term debt is not limited to firms. Neely (1996) describes the role of short-term debt in triggering Mexico's December 1994 peso crisis.

Noting the tendency of financial firms to increase their use of leverage when asset prices are rising and to reduce leverage when prices are falling, some analysts argue that capital requirements should become increasingly stringent when asset prices are rising. Some proposals call for tying capital requirements explicitly to the growth in the value of a bank's assets (e.g., Congressional Oversight Panel, 2009, pp. 27-28); others call on bank supervisors to encourage banks to build capital and liquidity when times are good and to allow banks to draw down their buffers during difficult times. For example, the Group of Thirty (2009, p. 43) recommends capital requirements "expressed as a broad range within which capital ratios should be managed, with the expectation that, as part of supervisory guidance, firms will operate at the upper end of such a range in periods when markets are exuberant and tendencies for underestimating and underpricing risk are great."

One of the more hotly debated issues surrounding the recent financial crisis is the extent to which fair value accounting rules contributed to the crisis. In textbook financial markets, valuations are the considered outcomes of the views of rational, relatively risk-tolerant speculators with deep pockets. In the real world, however, imperfect information and limited risk tolerance are facts of life that can inhibit the rational speculation necessary to drive prices back to long-term fundamental values. Trading in certain assets might cease during a crisis, or trades might occur at widely disparate prices, making the determination of their market value problematic just when the value of transparency is greatest. In addition, by forcing financial firms to realize declines in asset prices immediately, mark-to-market rules might exacerbate a crisis by encouraging asset sales when prices are already falling, leading to further write-downs and financial losses.[29] The Group of Thirty (2009, p. 46) calls for applying "more realistic" accounting guidelines to less-liquid assets and distressed markets but is generally supportive of fair value accounting. Similarly, Brunnermeier et al. (2009, pp. 36-37) advocate a "mark-to-funding" approach to fair value accounting in which the value of an asset is tied to the funding of that asset.
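A minimal sketch of how a mark-to-funding valuation might be computed follows. The mean-reversion model of the expected price, its speed, and all numbers are illustrative assumptions, not part of the Brunnermeier et al. proposal:

```python
def mark_to_funding_value(market_price, fundamental_value,
                          funding_days, reversion_rate=0.01):
    """Value an asset at its expected price when its funding matures.
    The mean-reversion model of the expected price (and its daily speed)
    is an illustrative assumption, not part of the original proposal."""
    weight = (1.0 - reversion_rate) ** funding_days
    return weight * market_price + (1.0 - weight) * fundamental_value

# An asset trading at a distressed 70 (long-run fundamental value 100) but
# funded with 30-day debt is marked near its expected 30-day price:
print(round(mark_to_funding_value(70.0, 100.0, funding_days=30), 2))
# Funded overnight, the same asset stays close to today's market price:
print(round(mark_to_funding_value(70.0, 100.0, funding_days=1), 2))
```

The longer the funding, the more weight the mark places on the fundamental value rather than on a possibly distressed market print, which is the intended contrast with pure mark-to-market.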
For example, if an asset that matures in 20 years is financed with debt that matures in 30 days, the asset should be valued at the expected price of the asset in 30 days. Of course, calculating expected future prices in any reasonable way is difficult, and the authors acknowledge that their scheme would give firms some discretion over the valuation of their assets. However, they argue that it would more accurately relate the value of assets to funding risks.

[29] Blanchard (2008) compares the current financial crisis with nineteenth-century bank runs. He points out that the opacity of mortgage-backed assets has served to amplify the financial crisis by making those assets particularly difficult to value, lowering their resale price, and increasing uncertainty about financial firms' solvency. Similarly, the high degree of leverage of financial institutions increases the probability that any losses will lead to insolvency.

CONCLUSION

The recent financial crisis has claimed many victims. Several prominent firms, including Bear Stearns, Lehman Brothers, AIG, Fannie Mae, and Freddie Mac, have gone bankrupt or required government intervention to prevent their failure. When the U.S. Treasury Department and the Federal Reserve intervened to prevent a failure, their goal was to protect the financial system—and the economy—from systemic risk. Financial firms are much more susceptible to systemic risk than nonfinancial firms because financial firms are typically highly interconnected with one another, highly leveraged, and tend to use short-term debt to finance their holdings of long-term, relatively illiquid assets. In the recent crisis, the possible failure of counterparties in complex transactions created systemic risk.
Financial firms are cognizant of systemic risk and traditionally have tried to reduce their vulnerability to it by participating in clearinghouses or trading through financial exchanges. Nevertheless, because firms do not bear all the costs of their own failure, government has a role to play in limiting systemic risk in the financial system to protect the broader economy. Analysts have proposed regulatory reforms to reduce the danger from systemic risk in the future. In particular, some advocate the creation of a powerful macroprudential regulator that considers a firm's impact on the stability of the entire financial system. Other ideas for reducing systemic risk include limiting the use of leverage and short-term debt and revising market value accounting rules.

It is too soon to fully determine the causes of the recent financial crisis. Asset price booms and busts that impair the financial system and the entire economy have occurred before. However, the complex nature of recently developed financial instruments has transmitted the consequences of the housing bust to the entire financial system and, ultimately, to the overall economy. Accordingly, many analysts favor measures to increase the use of organized exchanges for trading derivatives. An improved regime for resolving large insolvent financial firms would also limit systemic risk and excessive risk-taking. When the government has intervened to protect the economy from the failure of a large systemically important financial firm, the shareholders of these firms usually received little or no value for their equity and their senior managers were dismissed or subjected to compensation limits. However, bondholders doubtless received more compensation than they would have in the absence of government intervention.
A legal reform that permits rapid resolution of failing financial firms, including appropriate reductions in payments to bondholders, would help to create incentives for bondholders to be mindful of the risk of their investments. This, in turn, would discourage excessive risk-taking by increasing the borrowing costs of risky firms. The economy could benefit from reforms that reduce the risks to the financial system imposed by firms that are "too big to fail."

REFERENCES

Acharya, Viral; Gale, Douglas and Yorulmazer, Tanju. "Rollover Risk and Market Freezes." New York University and Federal Reserve Bank of New York Working Paper, October 2008, updated February 2009; www.newyorkfed.org/research/conference/2009/cblt/Acharya-Gale-Yorulmazer.pdf.

Aggarwal, Raj and Jacques, Kevin T. "Assessing the Impact of Prompt Corrective Action on Bank Capital and Risk." Federal Reserve Bank of New York Economic Policy Review, October 1998, pp. 23-32; www.newyorkfed.org/research/epr/98v04n3/9810agga.pdf.

Bernanke, Ben S. "The Global Saving Glut and the U.S. Current Account Deficit." Remarks at the Homer Jones Memorial Lecture, St. Louis, April 14, 2005; www.federalreserve.gov/boarddocs/speeches/2005/20050414/default.htm.

Bernanke, Ben S. "Risk Management in Financial Institutions." Speech at the Federal Reserve Bank of Chicago's Annual Conference on Bank Structure and Competition, Chicago, May 15, 2008a; www.federalreserve.gov/newsevents/speech/bernanke20080515a.htm.

Bernanke, Ben S. "Financial Regulation and Financial Stability." Speech at the Federal Deposit Insurance Corporation's Forum on Mortgage Lending for Low and Moderate Income Households, Arlington, Virginia, July 8, 2008b; www.federalreserve.gov/newsevents/speech/bernanke20080708a.htm.

Bernanke, Ben S. "Reducing Systemic Risk." Speech at the Federal Reserve Bank of Kansas City's Annual Economic Symposium, Jackson Hole, Wyoming, August 22, 2008c; www.federalreserve.gov/newsevents/speech/bernanke20080822a.htm.

Bhardwaj, Geetesh and Sengupta, Rajdeep. "Where's the Smoking Gun? A Study of Underwriting Standards for U.S. Subprime Mortgages." Working Paper No. 2008-036B, Federal Reserve Bank of St. Louis, October 27, 2008; revised May 2009; http://research.stlouisfed.org/wp/2008/2008-036.pdf.

Bhardwaj, Geetesh and Sengupta, Rajdeep. "Did Prepayments Sustain the Subprime Market?" Working Paper No. 2008-039B, Federal Reserve Bank of St. Louis, April 2009; http://research.stlouisfed.org/wp/2008/2008-039.pdf.

Blanchard, Olivier J. "The Crisis: Basic Mechanisms, and Appropriate Policies." Working Paper No. 09-01, MIT Department of Economics, December 29, 2008; http://ssrn.com/abstract=1324280.

Brunnermeier, Markus. "Deciphering the 2007-08 Liquidity and Credit Crunch." Working paper, Princeton University, May 2008.

Brunnermeier, Markus K.; Crockett, Andrew; Goodhart, Charles A.; Persaud, Avinash D. and Shin, Hyun. The Fundamental Principles of Financial Regulation: Geneva Reports on the World Economy 11 (Preliminary Conference Draft). London: Centre for Economic Policy Research, 2009; www.voxeu.org/reports/Geneva11.pdf.

Bullard, James. "Systemic Risk and the Macroeconomy: An Attempt at Perspective." Speech at Indiana University, October 2, 2008; www.stlouisfed.org/newsroom/speeches/2008_10_02.cfm.

Caballero, Ricardo J.; Farhi, Emmanuel and Gourinchas, Pierre-Olivier. "Financial Crash, Commodity Prices and Global Imbalances." NBER Working Paper 14521, National Bureau of Economic Research, November 17, 2008; www.nber.org/papers/w14521.pdf?new_window=1.

Calomiris, Charles W. and Gorton, Gary. "The Origins of Banking Panics: Models, Facts and Bank Regulation," in Charles W. Calomiris, ed., U.S. Bank Deregulation in Historical Perspective. New York: Cambridge University Press, 2000, pp. 93-163.

Congressional Oversight Panel. "Special Report on Regulatory Reform." January 2009; http://cop.senate.gov/documents/cop-012909report-regulatoryreform.pdf.

Counterparty Risk Management Policy Group III. "Containing Systemic Risk: The Road to Reform." August 6, 2008; www.crmpolicygroup.org/docs/CRMPG-III.pdf.

Demyanyk, Yuliya S. and Van Hemert, Otto. "Understanding the Subprime Mortgage Crisis." Review of Financial Studies, Advance Access published online May 4, 2009; doi:10.1093/rfs/hhp033.

Diamond, Douglas and Dybvig, Philip. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, June 1983, 91(3), pp. 401-19.

DiMartino, Danielle and Duca, John V. "The Rise and Fall of Subprime Mortgages." Federal Reserve Bank of Dallas Economic Letter, November 2007, 2(11); www.dallasfed.org/research/eclett/2007/el0711.html.

Dwyer, Gerald P. Jr. and Gilbert, R. Alton. "Bank Runs and Private Remedies." Federal Reserve Bank of St. Louis Review, May/June 1989, pp. 43-61; http://research.stlouisfed.org/publications/review/89/05/Remedies_May_Jun1989.pdf.

Economic Report of the President, 2009. Washington, DC: U.S. Government Printing Office, January 2009; www.gpoaccess.gov/eop/2009/2009_erp.pdf.

Evanoff, Douglas D. and Wall, Larry D. "Subordinated Debt and Prompt Corrective Regulatory Action." Working Paper No. WP 2003-03, Federal Reserve Bank of Chicago, 2003; www.chicagofed.org/publications/workingpapers/papers/wp2003-03.pdf.

Fons, Jerome S. "Rating Competition and Structured Finance." Journal of Structured Finance, Fall 2008a, 14(3).

Fons, Jerome S. Testimony Before the Committee on Oversight and Government Reform, United States House of Representatives, October 22, 2008b; http://oversight.house.gov/documents/20081022102726.pdf.

Gorton, Gary. "Clearinghouses and the Origin of Central Banking in the United States." Journal of Economic History, June 1985, 45(2), pp. 277-83.

Group of Thirty. Financial Reform: A Framework for Financial Stability. Washington, DC: The Group of Thirty, 2009; www.group30.org/pubs/recommendations.pdf.

Kane, Edward J. The S&L Insurance Mess: How Did It Happen? Washington, DC: Urban Institute Press, 1989.

Kiyotaki, Nobuhiro and Moore, John. "Credit Cycles." Journal of Political Economy, April 1997, 105(2), pp. 211-48.

Kotlikoff, Laurence J. and Leamer, Edward. "A Banking System We Can Trust." Forbes, April 23, 2009; www.forbes.com/2009/04/22/loan-mortgage-mutualfund-wall-street-opinions-contributors-bank.html.

Lagunoff, Roger and Schreft, Stacey L. "A Model of Financial Fragility." Journal of Economic Theory, July/August 2001, 99(1-2), pp. 220-64.

Merton, Robert C. "An Analytic Derivation of the Cost of Deposit Insurance and Loan Guarantees: An Application of Modern Option Pricing Theory." Journal of Banking and Finance, June 1977, 1(1), pp. 3-11.

Neely, Christopher J. "The Giant Sucking Sound: Did NAFTA Swallow the Peso?" Federal Reserve Bank of St. Louis Review, July/August 1996, 78(4), pp. 33-47; http://research.stlouisfed.org/publications/review/96/07/9607cn.pdf.

New York Times. "Senators Accuse Rating Agencies of Conflicts of Interest in Market Turmoil," September 26, 2007; www.nytimes.com/2007/09/26/business/worldbusiness/26iht-credit.4.7646763.html.

Paulson, Henry M. Jr.; Steel, Robert K.; Nason, David G. et al. The Department of the Treasury Blueprint for a Modernized Financial Regulatory Structure, March 2008; www.treas.gov/press/releases/reports/Blueprint.pdf.

Pintus, Patrick A. and Wen, Yi. "Excessive Demand and Boom-Bust Cycles." Working Paper 2008-014B, Federal Reserve Bank of St. Louis, June 2008; http://research.stlouisfed.org/wp/2008/2008-014.pdf.

Poole, William. "Financial Stability." Remarks at the Council of State Governments Southern Legislative Conference Annual Meeting, New Orleans, Louisiana, August 4, 2002; http://fraser.stlouisfed.org/historicaldocs/wp2002/download/41143/20020804.pdf.

Poole, William. "Housing in the Macroeconomy." Remarks at the Office of Federal Housing Enterprise Oversight Symposium, Washington, DC, March 10, 2003; http://fraser.stlouisfed.org/historicaldocs/wp2003/download/41135/20030310.pdf.

Poole, William. "Reputation and the Non-Prime Mortgage Market." Remarks at the St. Louis Association of Real Estate Professionals, July 20, 2007; http://fraser.stlouisfed.org/historicaldocs/wp2007/download/40917/20070720.pdf.

Schinasi, Garry J.; Craig, R. Sean; Drees, Burkhard and Kramer, Charles. "Modern Banking and OTC Derivatives Markets: The Transformation of Global Finance and Its Implications for Systemic Risk." Occasional Paper No. 203, International Monetary Fund, January 9, 2001; www.imf.org/external/pubs/nft/op/203/.

Sengupta, Rajdeep and Emmons, William R. "What Is Subprime Lending?" Federal Reserve Bank of St. Louis Economic Synopses No. 13, 2007; http://research.stlouisfed.org/publications/es/07/ES0713.pdf.

Son, Hugh. "AIG Plunges as Downgrades Threaten Quest for Capital." Bloomberg.com, September 16, 2008; www.bloomberg.com/apps/news?pid=20601087&sid=aP5rm0.62wqo.

Spong, Kenneth. Banking Regulation: Its Purposes, Implementation, and Effects. Fifth edition. Kansas City, MO: Federal Reserve Bank of Kansas City, 2000; www.kansascityfed.org/banking/bankingpublications/RegsBook2000.pdf.

Stern, Gary H. and Feldman, Ron J. Too Big to Fail: The Hazards of Bank Bailouts. Washington, DC: Brookings Institution Press, 2004.

Taylor, John B. Getting Off Track: How Government Actions and Intervention Caused, Prolonged, and Worsened the Financial Crisis. Stanford, CA: Hoover Institution Press, 2009.

Timberlake, Richard H. Jr. "The Central Banking Role of Clearinghouse Associations." Journal of Money, Credit, and Banking, February 1984, 16(1), pp. 1-15.

Wallison, Peter J. "Everything You Wanted to Know about Credit Default Swaps—But Were Never Told." American Enterprise Institute for Public Policy Research Outlook Series, December 2008; www.aei.org/docLib/20090107_12DecFSOg.pdf.

Wheelock, David C. "The Federal Response to Home Mortgage Distress: Lessons from the Great Depression." Federal Reserve Bank of St. Louis Review, May/June 2008, 90(3, Part 1), pp. 133-48; http://research.stlouisfed.org/publications/review/08/05/Wheelock.pdf.

White, Eugene N. The Regulation and Reform of the American Banking System, 1900-1929. Princeton, NJ: Princeton University Press, 1983.

White, Lawrence J. The S&L Debacle: Public Policy Lessons for Bank and Thrift Regulation. New York: Oxford University Press, 1991.

Wicker, Elmus. Banking Panics of the Gilded Age. Cambridge: Cambridge University Press, 2000.

Can the Term Spread Predict Output Growth and Recessions? A Survey of the Literature

David C. Wheelock and Mark E. Wohar

This article surveys recent research on the usefulness of the term spread (i.e., the difference between the yields on long-term and short-term Treasury securities) for predicting changes in economic activity. Most studies use linear regression techniques to forecast changes in output or dichotomous choice models to forecast recessions. Others use time-varying parameter models, such as Markov-switching models and smooth transition models, to account for structural changes or other nonlinearities.
Many studies find that the term spread predicts output growth and recessions up to one year in advance, but several also find its usefulness varies across countries and over time. In particular, many studies find that the ability of the term spread to forecast output growth has diminished in recent years, although it remains a reliable predictor of recessions. (JEL C53, E37, E43)

Federal Reserve Bank of St. Louis Review, September/October 2009, 91(5, Part 1), pp. 419-40.

Information about a country's future economic activity is important to consumers, investors, and policymakers. Since Kessel (1965) first discussed how the term structure of interest rates varies with the business cycle, many studies have examined whether the term structure is useful for predicting various measures of economic activity. The term spread (the difference between the yields on long-term and short-term Treasury securities) has been found useful for forecasting such variables as output growth, inflation, industrial production, consumption, and recessions, and the ability of the spread to predict economic activity has become something of a "stylized fact" among macroeconomists.

This article surveys recent research investigating the ability of the term spread to forecast output growth and recessions.[1] The article briefly discusses theoretical explanations for why the spread might predict future economic activity and then surveys empirical studies that investigate how well the spread predicts output growth and recessions. The survey describes the data and methods used in various studies to investigate the predictive power of the term spread, as well as key findings. In general, the literature has not reached a consensus about how well the term spread predicts output growth. Although many studies do find that the spread predicts output growth at one-year horizons, studies also find considerable variation across countries and over time.
In particular, many studies find that the ability of the spread to forecast output growth has declined since the mid-1980s. The empirical literature provides more consistent evidence that the term spread is useful for predicting recessions. Furthermore, the relationship appears robust to the inclusion of other variables and nonlinearities in the forecasting model.

[1] Surveys of the older literature include Berk (1998), Dotsey (1998), Estrella and Hardouvelis (1991), Plosser and Rouwenhorst (1994), and Stock and Watson (2003). Stock and Watson (2003) also survey research on the usefulness of asset prices for forecasting inflation.

David C. Wheelock is a vice president and economist at the Federal Reserve Bank of St. Louis. Mark E. Wohar is a professor of economics at the University of Nebraska at Omaha. The authors thank Michael Dueker, Massimo Guidolin, and Dan Thornton for comments on a previous draft of this article. Craig P. Aubuchon provided research assistance. © 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks. Articles may be reprinted, reproduced, published, distributed, displayed, and transmitted in their entirety if copyright notice, author name(s), and full citation are included. Abstracts, synopses, and other derivative works may be made only with prior written permission of the Federal Reserve Bank of St. Louis.

A LOOK AT THE DATA

Yields on long-term securities typically exceed those on otherwise comparable short-term securities, reflecting the preference of most investors to hold instruments with shorter maturities. Hence, the yield curve, which is a plot of the yields on otherwise comparable securities of different maturities, is typically upward sloping. Analysts have long noted, however, that most recessions are preceded by a sharp decline in the slope of the yield curve and frequently by an inversion of the yield curve (i.e., by short-term yields rising above those on long-term securities).

Figure 1: U.S. Term Spread and Recessions. Spread between 10-year and 3-month Treasury security yields. NOTE: The term spread is calculated as the difference between the yields on 10-year and 3-month Treasury securities. The shaded areas denote recessions as determined by the National Bureau of Economic Research.

Figure 1 shows the difference between the yields on 10-year and 3-month U.S. Treasury securities for 1953-2008. The shaded regions indicate recession periods as defined by the National Bureau of Economic Research.[2] As Figure 1 shows, every U.S. recession since 1953 was preceded by a large decline in the yield on 10-year Treasury securities relative to the yield on 3-month Treasury securities, and several recessions were preceded by an inversion of the yield curve. Moreover, the only occasion when the 3-month Treasury security yield exceeded the (constant-maturity) 10-year Treasury yield without a subsequent recession was in December 1966.

[2] National Bureau of Economic Research, "Information on Recessions and Recoveries, the NBER Business Cycle Dating Committee, and Related Topics"; www.nber.org/cycles/main.html.

Figure 2: German Term Spread and Recessions. Spread between 10-year government bond yield and 3-month Treasury bill yield. NOTE: The term spread is calculated as the difference between the yields on 10-year and 3-month Treasury securities. The shaded areas denote recessions as determined by the Economic Cycle Research Institute.

Similar data for Germany and the United Kingdom are shown in Figures 2 and 3, respectively. Germany experienced recessions beginning in 1966, 1974, 1980, 1991, 2000, and 2008. All but the 1966 recession were preceded by a sharp decline in long-term Treasury security yields relative to short-term yields that resulted in a flat or inverted yield curve. The only inversion that was not followed by a recession occurred in 1970. The United Kingdom experienced recessions beginning in 1974, 1979, 1990, and 2008. All were preceded by or coincided with a yield curve inversion. However, large inversions in 1985 and 1997-98 were not followed by recessions.[3]

[3] Recession dates for Germany and the United Kingdom are from the Economic Cycle Research Institute, as reported by Haver Analytics. Interest rate data for Germany and the United Kingdom are from Global Insight.

Table 1 summarizes additional information about the association between the term spread and economic activity. The table presents correlations between the term spread (measured as a quarterly average of monthly observations) and the year-over-year percentage change in real gross domestic product (GDP) for the United States, Germany, and the United Kingdom. The table presents the contemporaneous correlation between the two variables, as well as correlations at various leads and lags of the term spread relative to GDP growth. The top panel of the table reports correlations between GDP growth in one quarter and the term spread in the same quarter (t) and in six preceding quarters (t - 1 and so on).
The bottom panel reports the correlations between GDP growth in one quarter and the term spread in the same quarter and in the six subsequent quarters (t + 1 and so on).

Figure 3: U.K. Term Spread and Recessions. Spread between 10-year government bond yield and 3-month Treasury bill yield. NOTE: The term spread is calculated as the difference between the yields on 10-year and 3-month Treasury securities. The shaded areas denote recessions as determined by the Economic Cycle Research Institute.

The contemporaneous correlation between GDP growth and the term spread is not statistically different from zero for any of the three countries (column 1 in Table 1). By contrast, the correlations between GDP growth and the term spread lagged from one to six quarters are uniformly positive and statistically significant (indicated by p-values of 0.10 or less) for all three countries, except for the correlation between U.S. GDP growth and the term spread lagged by one quarter. Thus, the correlations indicate that, in general, the higher the yield on 10-year Treasury securities relative to the yield on 3-month Treasury securities—that is, the more steeply sloped the yield curve—the higher the rate of future GDP growth. Similarly, the less steeply sloped the yield curve, the lower the subsequent rate of GDP growth. The correlations between current GDP growth and future term spreads shown in the lower panel are negative and for the most part statistically significant for all three countries. Thus, a higher GDP growth rate in one quarter is associated with a less steeply sloped yield curve in subsequent quarters.
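Lead-lag correlations of this kind are straightforward to compute for any pair of quarterly series. The sketch below uses synthetic placeholder data (not the actual U.S., German, or U.K. series), constructed so that growth responds to the spread four quarters earlier:

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lead_lag_corr(spread, growth, k):
    """corr(growth in quarter t, spread in quarter t - k):
    k > 0 lags the spread (upper panel); k < 0 leads it (lower panel)."""
    if k > 0:
        return pearson(spread[:-k], growth[k:])
    if k < 0:
        return pearson(spread[-k:], growth[:k])
    return pearson(spread, growth)

# Synthetic quarterly series standing in for the real data: growth loads
# on the spread four quarters earlier, so the k = 4 correlation is strong.
random.seed(0)
raw = [random.gauss(1.5, 1.0) for _ in range(204)]
growth = [0.5 * raw[t - 4] + random.gauss(2.5, 0.5) for t in range(4, 204)]
spread = raw[4:]  # align: spread[i] and growth[i] are the same quarter

for k in (-4, 0, 4):
    print(k, round(lead_lag_corr(spread, growth, k), 2))
```

On this synthetic data the four-quarter lag shows a strong positive correlation while the contemporaneous correlation is near zero, the same qualitative pattern as in the upper panel of Table 1; the p-values reported in the table would require an additional significance test not included in this sketch.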
Table 1
Correlation of GDP Growth and Lagged and Future Term Spreads by Country

Lagged term spread
                      t        (t - 1)    (t - 2)    (t - 3)    (t - 4)    (t - 5)    (t - 6)
United States      -0.0449     0.0999     0.2557     0.3605     0.4141     0.3957     0.3196
                   (0.5047)   (0.1379)   (0.0001)   (0.0001)   (0.0001)   (0.0001)   (0.0001)
Germany            -0.0003     0.1641     0.2991     0.3689     0.3845     0.3649     0.3421
                   (0.9970)   (0.0455)   (0.0002)   (0.0001)   (0.0001)   (0.0001)   (0.0001)
United Kingdom      0.0723     0.1816     0.2486     0.3025     0.3379     0.3166     0.2607
                   (0.3319)   (0.0144)   (0.0008)   (0.0001)   (0.0001)   (0.0001)   (0.0005)

Future term spread
                      t        (t + 1)    (t + 2)    (t + 3)    (t + 4)    (t + 5)    (t + 6)
United States      -0.0449    -0.1428    -0.2374    -0.2994    -0.3372    -0.3538    -0.3421
                   (0.5047)   (0.0335)   (0.0004)   (0.0001)   (0.0001)   (0.0001)   (0.0001)
Germany            -0.0003    -0.1722    -0.3414    -0.4424    -0.4548    -0.4545    -0.4110
                   (0.9970)   (0.0357)   (0.0001)   (0.0001)   (0.0001)   (0.0001)   (0.0001)
United Kingdom      0.0723    -0.0364    -0.1366    -0.2116    -0.2306    -0.2204    -0.2261
                   (0.3319)   (0.6244)   (0.0652)   (0.0040)   (0.0017)   (0.0001)   (0.0021)

NOTE: U.S. data are for 1953:Q1-2008:Q4; German data are for 1973:Q1-2008:Q2 (West Germany, 1973-1991); U.K. data are for 1958:Q1-2008:Q2. Numbers in parentheses represent p-values.

As discussed in more detail in the following section, the pattern of positive correlation between current GDP growth and lagged term spreads and negative correlation between current GDP growth and future term spreads is consistent with more than one explanation of the relationship between the yield curve and output growth. Further, although the unconditional correlation between output growth and the term spread is high, the correlation might reflect the influence of some other variable, in which case the term spread would not forecast output growth if that other influence is included in the forecasting model. After discussing why the term spread might forecast economic activity in the next section, we review empirical research on the usefulness of the term spread for forecasting output growth and recessions in subsequent sections.

WHY MIGHT THE TERM SPREAD FORECAST ECONOMIC ACTIVITY?

Although many empirical studies find that the term spread predicts future economic activity, there is no universally agreed-upon theory as to why a relationship between the term spread and economic activity should exist. To a large extent, the usefulness of the spread for forecasting economic activity remains a "stylized fact in search of a theory" (Benati and Goodhart, 2008, p. 1237).

The expectations hypothesis of the term structure is the foundation of many explanations of the term spread's usefulness in forecasting output growth and recessions. The expectations hypothesis holds that long-term interest rates equal the average of current and expected future short-term interest rates plus a term premium. The term premium explains why the yield curve usually slopes upward—that is, why the yields on long-term securities usually exceed those on short-term securities.
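The expectations-hypothesis arithmetic can be made concrete with a small sketch; the rate paths and the 1-percentage-point term premium below are illustrative numbers only, not estimates:

```python
def long_yield(expected_short_rates, term_premium):
    """Expectations hypothesis: the long yield is the average of the current
    and expected future short rates plus a term premium (all in percent)."""
    return sum(expected_short_rates) / len(expected_short_rates) + term_premium

# Illustrative numbers only. Short rates expected to hold at 4% over the
# next 40 quarters: the premium keeps the 10-year yield above the 3-month rate.
flat_path = [4.0] * 40
slope_normal = long_yield(flat_path, term_premium=1.0) - flat_path[0]

# Short rates expected to fall steadily toward about 1% as policy eases:
# the 10-year yield drops below today's short rate, inverting the curve
# even though the term premium is unchanged and positive.
easing_path = [4.0 - 0.075 * t for t in range(40)]
slope_easing = long_yield(easing_path, term_premium=1.0) - easing_path[0]

print(slope_normal)            # positive: upward-sloping curve
print(round(slope_easing, 4))  # negative: inverted curve
```

With a flat expected-rate path the premium keeps the curve upward sloping; an expected easing large enough to outweigh the premium flattens or inverts it.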
However, the yield curve flattens or inverts (slopes downward) if the public expects short-term interest rates to fall. In that case, investors bid up the prices of longer-term securities, which causes their yields to fall relative to current yields on short-term securities.

Many studies attribute the apparent ability of the term spread to forecast economic activity to actions by monetary authorities to stabilize output growth. For example, a monetary policy tightening causes both short- and long-term interest rates to rise. Short-term rates are likely to rise more than long-term rates, however, if policy is expected to ease once economic activity slows or inflation declines. Hence, a policy tightening is likely to cause the yield curve to flatten or possibly invert.

Monetary policy explanations usually have been stated with little underlying theory.4 However, as noted by Feroli (2004), Estrella (2005), and Estrella and Trubin (2006), the extent to which the term spread is a good predictor of output growth depends on the monetary authority's policy objectives and reaction function. For example, the term spread forecasts output growth better the more responsive the monetary authority is to deviations of output growth from potential. The spread forecasts less accurately if monetary authorities concentrate exclusively on controlling inflation. Further, changes in the relative responsiveness of the monetary authority to either output growth or inflation could cause changes in the ability of the term spread to forecast output growth.

4. For example, Estrella and Hardouvelis (1991) and Berk (1998) refer to simple dynamic IS-LM models but do not explicitly derive testable hypotheses from those models (see also Bernanke and Blinder, 1992; Dueker, 1997; and Dotsey, 1998).
In contrast to explanations that focus on monetary policy, theories of intertemporal consumption derive a relationship between the slope of the yield curve and future economic activity explicitly from the structure of the economy (e.g., Harvey, 1988; Hu, 1993). The central assumption of Harvey (1988), for example, is that individuals prefer stable consumption to high consumption during periods of rising income and low consumption when income is falling. Thus, when consumers expect a recession one year in the future, they will sell short-term financial instruments and purchase one-year discount bonds to obtain income during the recession year. As a result, the term structure flattens or inverts.5

The theoretical implications of consumption-smoothing models apply to the real term structure, that is, the term structure adjusted for expected inflation. However, much of the empirical evidence on the information content of the term structure pertains to the nominal term structure. Whether the empirical evidence linking the nominal yield curve to changes in output is consistent with the theoretical relationship depends on the persistence of inflation. If inflation were a random walk, implying that shocks to inflation are permanent, then inflation shocks would have no impact on the slope of the nominal yield curve because expected inflation would change by an identical amount at all horizons. However, if inflation has little persistence, an inflation shock will affect near-term expected inflation more than long-term expected inflation, causing the slope of the nominal yield curve to change.

5. Rendu de Lint and Stolin (2003) study the relationship between the term structure and output growth in a dynamic equilibrium asset pricing model. They find that the term spread predicts future consumption and output growth at long horizons in a stochastic endowment economy model augmented with endogenous production.
Hence, the extent to which changes in the slope of the nominal yield curve reflect changes in the real yield curve depends on the persistence of inflation which, in turn, reflects the underlying monetary regime.6

Much of the empirical literature has focused on estimating the precision with which the term spread forecasts economic activity, rather than on attempting to discriminate between the monetary policy and consumption-smoothing explanations. Laurent (1988, 1989) argues that the yield curve reflects the stance of monetary policy and finds that the term spread predicts changes in the growth rate of real GDP. On the other hand, several studies find that the term spread has significant predictive power for economic growth independent of the information contained in measures of current and future monetary policy, suggesting that monetary policy alone cannot explain all of the observed relationship (see, e.g., Estrella and Hardouvelis, 1991; Plosser and Rouwenhorst, 1994; Estrella and Mishkin, 1997; Benati and Goodhart, 2008). Harvey (1988) and Rendu de Lint and Stolin (2003) offer support for the consumption-smoothing explanation by showing that the slope of the yield curve is useful for forecasting both consumption and output growth. Benati and Goodhart (2008), however, find that changes over time in the marginal predictive content of the nominal term spread for output growth do not match changes in inflation persistence, which they argue is evidence against the consumption-smoothing explanation.

Several studies find that the spread has forecast output growth less accurately since the mid-1980s, which some attribute to greater stability of output growth and other key macroeconomic data (e.g., D'Agostino, Giannone, and Surico, 2006). It remains to be seen how incorporating data for the recession that began in 2007 affects the performance of forecasting models that use the term spread to predict economic activity and whether the additional information sheds light on alternative explanations for the forecasting relationship.

6. Under fiat monetary regimes, inflation has tended to be highly persistent. However, inflation tends to exhibit little persistence under metallic and inflation-targeting regimes (see, e.g., Shiller and Siegel, 1977; Barsky, 1987; Bordo and Schwartz, 1999; and Benati, 2006, 2008).

DOES THE TERM SPREAD FORECAST OUTPUT GROWTH?

Numerous studies using a wide variety of data and methods investigate how well the term spread forecasts output growth. Although many studies use post-World War II U.S. data, several recent studies investigate how well the term spread predicts future economic activity using data from other countries or time periods. Such efforts can indicate whether the association between the term spread and output growth is an artifact of the postwar U.S. experience and shed light on the validity of alternative explanations for why the spread might forecast economic activity. Our survey focuses primarily on the literature published or written since the mid-1990s. However, we briefly discuss some earlier studies to set the stage for a more detailed discussion of recent work.

Much of the evidence on the accuracy of the term spread in forecasting output growth comes from the estimation of linear models, such as the following linear regression, or some variant of it:

(1) ΔY_t = α + β Spread_t + γ(L)ΔY_{t–1} + ε_t,

where ΔY_t is the growth rate of output (e.g., real GDP); Spread_t is the difference between the yields on long-term and short-term Treasury securities; γ(L) is a lag polynomial, typically of length four (current and three lags, assuming quarterly data);7 and ε_t is an error term.

Laurent (1988), Harvey (1988, 1989), and Estrella and Hardouvelis (1991) were among the first to present empirical evidence on the strength of the relationship between the term spread and output growth using U.S. data. Harvey (1989), for example, finds that the spread between the yields on 5-year and 3-month U.S. Treasury securities predicts real gross national product growth from 1 to 5 quarters ahead. Similarly, Estrella and Hardouvelis (1991) find that the spread between yields on 10-year and 3-month Treasury securities is useful for forecasting U.S. output growth and recessions, as well as consumption and investment, especially at 4- to 6-quarter horizons.

7. For example, γ(L) = γ_1 L + γ_2 L^2 + γ_3 L^3 + γ_4 L^4, where L^i ΔY_t = ΔY_{t–i}.

Evidence from Outside the United States

Although the earliest studies were based on U.S. data, several others have explored the usefulness of the spread for forecasting output growth using data from other countries. Often these studies show considerable variation across countries in how well the spread forecasts output growth. For example, Plosser and Rouwenhorst (1994) find that term spreads are useful for predicting GDP growth in Canada and Germany, as well as the United States, but not in France or the United Kingdom. Plosser and Rouwenhorst (1994) also find that foreign term spreads help predict future changes in output in individual countries.
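A regression of the form of equation (1) can be estimated by ordinary least squares. The following is a minimal sketch on synthetic data; all data-generating values are invented for illustration, and a real application would use quarterly output growth and Treasury yield spreads:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic quarterly data: output growth depends on the term spread and
# on four of its own lags, mimicking the structure of equation (1).
# The coefficients (0.5, 0.4, ...) are invented for this sketch.
T = 200
spread = rng.normal(1.0, 1.0, T)
dy = np.zeros(T)
for t in range(4, T):
    dy[t] = (0.5 + 0.4 * spread[t]
             + 0.2 * dy[t - 1] - 0.1 * dy[t - 2]
             + 0.05 * dy[t - 3] + 0.05 * dy[t - 4]
             + rng.normal(0.0, 0.5))

# Regressor matrix [1, Spread_t, dY_{t-1}, ..., dY_{t-4}] for t = 4..T-1.
X = np.column_stack([
    np.ones(T - 4),
    spread[4:],
    dy[3:T - 1], dy[2:T - 2], dy[1:T - 3], dy[0:T - 4],
])
y = dy[4:]

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha, beta = coef[0], coef[1]  # beta estimates the spread's effect
```

With enough observations, `beta` should land near the true value of 0.4 used to generate the data; the point of the sketch is only the mechanics of building the lag matrix that the lag polynomial γ(L) implies.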
Davis and Fagan (1997) find that the term spread has statistically significant within-sample explanatory power for output growth in six of nine European Union countries, but that the spread improves out-of-sample forecasts and satisfies conditions for statistical significance and stability in only three countries (Belgium, Denmark, and the United Kingdom). A related study by Berk and van Bergeijk (2001) examines 12 euro-area countries over the period 1970-98 and finds that the term spread contains only limited information about future output growth.

Several studies examine whether the term spread contains information about future output growth in Japan. Harvey (1991) finds that the spread contains no information about future economic activity in Japan for the period 1970-89. By contrast, Hu (1993) finds a positive correlation between the term spread and future economic activity in Japan for the period from January 1957 to April 1991, but that lagged changes in stock prices and output growth have more explanatory power than the term spread. Kim and Limpaphayom (1997) argue that heavy regulation prevented interest rates from reflecting market expectations before 1984. Their study finds that the spread is useful for predicting output growth up to five quarters ahead during 1984-91 (see also Nakaota, 2005).

Evidence from Multivariate Models

Several studies examine the marginal predictive content of the term spread in models that also include other explanatory variables. Estrella and Hardouvelis (1991), Plosser and Rouwenhorst (1994), Estrella and Mishkin (1997), Hamilton and Kim (2002), and Feroli (2004) are among several studies that find the term spread has significant predictive power for economic growth even when a short-term interest rate or other measure of the stance of monetary policy is included as an additional explanatory variable. These results suggest that monetary policy alone does not explain why the term spread predicts output growth. However, Stock and Watson (2003) show that including other explanatory variables does not improve forecasts obtained from a bivariate model of the term spread and output growth.8

Aretz and Peel (2008) include both the term spread and professional forecasts in a model of output growth and find that both variables individually forecast real GDP growth and that the term spread contains information not captured by professional forecasts. However, Aretz and Peel (2008) find that the term spread contributes no information beyond that in the professional forecasts in models that assume that forecasters' loss functions become more skewed as the forecast horizon lengthens.

Hamilton and Kim (2002) note that (i) the term spread consists of an expected interest rate component and a term premium component and (ii) determining the relative usefulness of one or the other component for forecasting output growth could help distinguish among alternative hypotheses for why the term spread predicts output growth. Hamilton and Kim (2002) find that the expected change in the short-term interest rate and the time-varying term premium both contribute to forecasts of real GDP growth up to eight quarters ahead. However, expected changes in short-term rates explain significantly more of the output growth than does the term premium. Hence, the most important reason that an inverted yield curve predicts slower output growth in the future is that a low term spread implies falling future short-term interest rates, rather than, say, an increase in the term premium associated with higher interest rate volatility near the end of economic expansions.

Table 2 summarizes the methods and principal findings of several recent studies of the ability of the term spread to forecast output growth.
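The two-component decomposition of Hamilton and Kim (2002) is an accounting identity under the expectations hypothesis: the spread splits into an expected-rate-change component and a term premium component. With hypothetical numbers:

```python
# Decomposing the term spread into an expected-rate-change component and
# a term premium component, in the spirit of Hamilton and Kim (2002).
# Under the expectations hypothesis the long rate is the average expected
# short rate plus a premium, so the split is additive. All numbers are
# hypothetical.

short_now = 5.0
expected_avg_short = 4.0  # markets expect short rates to fall
term_premium = 0.6

long_rate = expected_avg_short + term_premium
spread = long_rate - short_now  # negative: the curve is inverted

expected_change_component = expected_avg_short - short_now  # -1.0
premium_component = term_premium                            # +0.6
# The two components sum to the spread by construction.
```

Here the inversion is driven entirely by expected declines in short rates, the case Hamilton and Kim find empirically more important than movements in the premium.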
Recent Research on the Stability of the Forecasting Relationship

Much of the research during the past decade focuses on the stability of the forecasting relationship over time. Several studies find that the spread has been less useful for forecasting output growth since the mid-1980s, at least for the United States.9 For example, Dotsey (1998) finds that the spread forecasts cumulative output growth up to two years in the future, but does so less accurately for 1985-97 than for earlier years. Further, Dotsey (1998) finds that the spread forecasts less accurately when past values of output growth and short-term interest rates are included in the forecasting model and contributes no information to forecasts for the 1985-97 period.

Estrella, Rodrigues, and Schich (2003) test for unknown breakpoints in the in-sample forecasting relationship between the term spread and output growth using data for the United States and Germany. Although the study detects a generally strong relationship between the term spread and output growth one year in the future for both countries, it identifies a break in September 1983 for the United States using models with one-year forecast horizons. Estrella, Rodrigues, and Schich (2003), however, detect no breaks in longer-horizon forecasting models for the United States or in short- or long-horizon models estimated using data for Germany.

8. Similarly, Cozier and Tkacz (1994) and Hamilton and Kim (2002) find that the spread predicts future changes in output growth in forecasting models that include the output gap and changes in the price of oil, respectively, as an explanatory variable.

9. In addition to the studies summarized in Table 2, other studies that find a break in the forecasting relationship in the mid-1980s include Haubrich and Dombrosky (1996), Estrella and Mishkin (1997), and Smets and Tsatsaronis (1997).
Table 2
Selective Summary of Studies of the Usefulness of the Term Spread for Predicting Output Growth

Dotsey (1998). Single-equation linear and nonlinear regression; U.S., quarterly (1955-97). Spread is useful for predicting cumulative GDP growth up to 2 years ahead, but less accurate during 1985-97 than previously. Spread has marginal predictive power only up to 6 quarters; adding the spread to a VAR containing lagged output growth increases forecast errors.

Galbraith and Tkacz (2000). Single-equation linear regression and smooth transition nonlinear asymmetric threshold model; G-7 developed countries, quarterly (1960s to late 1990s; varies by country). Spread predicts changes in output; evidence for the U.S. and Canada of asymmetric nonlinear behavior, where the impact of the spread is greater on one side of a threshold than on the other. Across a variety of specifications, the spread has its most significant predictive power when it is negative.

Shaaf (2000). Single-equation linear models and neural networks; U.S., quarterly (1959-97). Spread forecasts output growth: a 5 percent increase in the yield spread results in a 9.33 percent increase in output growth. Out-of-sample simulations indicate that forecasts from artificial neural networks are more accurate, with less error and lower variation, than forecasts from linear models.

Berk and van Bergeijk (2001). Single-equation linear models; 12 developed countries and the euro area, quarterly (1970-98). Term spread has little information about future output growth beyond that contained in lagged output growth for most countries; the U.S. is an exception. Evidence of parameter instability for the U.S. in the latter part of the sample but not for other countries or the euro area.

Tkacz (2001). Neural networks; Canada, quarterly (1968-99). Four-quarter forecasts of output growth outperform 1-quarter forecasts. Neural network models outperform linear models at a 4-quarter horizon but not at a 1-quarter horizon.

Hamilton and Kim (2002). Linear regression and GARCH models; U.S., quarterly (1953-98). Cyclical behavior of interest rate volatility is an important determinant of the spread and the term premium and a useful predictor of future interest rates. Cyclical movements in volatility are unable to account for the spread and the term premium in forecasting output growth.

Estrella, Rodrigues, and Schich (2003). Single-equation linear models; U.S. and Germany, monthly industrial production (1955-98 for U.S.; 1967-98 for Germany). Spread forecasts output growth well at 1-year horizons in both countries but less accurately at 2- and 3-year horizons. Results are robust across several maturity combinations for the spread; little evidence of instability for Germany, but a break in 1983 for the U.S. at a 1-year horizon.

Stock and Watson (2003). Linear regression and combination forecasts; Canada, France, Germany, Italy, Japan, U.K., and U.S., quarterly (1959-99). Some asset prices have predictive content for output growth, but results vary across time and by country; forecasts based on individual indicators are unstable. Simple combination forecasts, such as the median or trimmed mean of a panel of forecasts, seem to circumvent issues of instability in that they yield smaller errors than the autoregressive benchmark model; combination forecasts are stable even though the individual predictive relations are unstable.

Venetis, Paya, and Peel (2003). Smooth nonlinear transition models, regime-switching models, and time-varying models; U.S., U.K., and Canada, quarterly (early 1960s to 2000; varies by country). Threshold effects for the U.S., the U.K., and Canada: the term spread–output growth relationship is stronger when past values of the term spread do not exceed a positive threshold value. Spread is less useful for predicting output growth in recent years.

Jardet (2004). Single-equation linear model, with a VAR-VECM to identify sources of structural breaks; U.S., monthly industrial production and employment (1957-2001). Spread forecasts output growth well, especially at 1-year horizons; a structural break occurs in 1984 with diminished forecasting strength thereafter. VAR estimates suggest that the structural break is due to a drop in the contributions of monetary policy and supply shocks to the covariance between the spread and output growth.

Duarte, Venetis, and Paya (2005). Linear and nonlinear threshold models; euro area and U.S., quarterly (1970-2000). Significant nonlinearity exists in the term spread–output growth relation with respect to time and past output growth; the nonlinear model outperforms the linear model in 1-year out-of-sample forecasts. With linear models, the term spread is a useful indicator of future output growth for the euro area, although the linear models show signs of instability; spreads are successful in predicting output growth when output growth has slowed.

Nakaota (2005). Single-equation linear model; Japan, monthly industrial production (1985-2001). Spread forecasts output at 1- to 24-month horizons in models that account for a structural break in July 1991; usefulness of the spread is robust to inclusion of other variables. Expected future changes in short-term rates appear to contribute useful information both before and after 1991, but the term premium is useful only after 1991.

Ang, Piazzesi, and Wei (2006). Linear models and VARs; U.S., quarterly (1952-2001). Recommends using the longest yield spread to predict output growth regardless of forecast horizon; results indicate that the level of the short-term rate contains more information about output growth than any yield spread. VAR model forecasts are superior to linear model forecasts both in and out of sample, with the factor structure largely responsible for most of the efficiency gains; the lagged spread does not predict output growth in the 1990s, but high short-term rates forecast negative output growth.

D'Agostino, Giannone, and Surico (2006). Single-equation linear model and bivariate VAR; U.S., monthly personal income, industrial production, unemployment rate, and employment (1959-2003). Spread dominates other variables in forecasting output and employment at 12-month horizons during 1959-84 but not during 1985-2003. A general decline occurs in forecast accuracy for the spread, other variables, and professional forecasts after 1984 relative to a random walk.

Giacomini and Rossi (2006). Structural break tests, both single break and multiple breaks; U.S., monthly industrial production (1965-2001). Evidence of forecast breakdown in the relation between the yield spread and output growth, especially during the Burns-Miller and Volcker monetary policy regimes. Results parallel the empirical evidence on structural breaks in the relation between the spread and output growth documented in the literature.

Aretz and Peel (2008). Single-equation linear model; U.S., quarterly GDP/GNP (1981-2006). Spread forecasts output growth at various horizons and includes information beyond that in the Survey of Professional Forecasters; results are robust to the use of real-time or vintage data. The spread contributes no information in models that assume forecasters have asymmetric loss functions.

Benati and Goodhart (2008). Bayesian VARs with time-varying parameters; U.S. and U.K., quarterly (1875-2005); euro area, quarterly (1970-2003); Australia, quarterly (1957-2005); Canada, quarterly (1975-2005). Spread has considerable marginal predictive content for the U.S. before World War I and in the 1980s, but little during the interwar period or before or after the 1980s; similar parameter instability is found in forecasts for other countries and in models that also include inflation and a short-term interest rate. Results fail to distinguish clearly between leading explanations for why the spread may be useful for predicting output growth.

Bordo and Haubrich (2008). Single-equation linear model; U.S., quarterly GNP, spread between corporate bonds and 6-month commercial paper (1875-1997). Spread improves the forecasting model in only three of nine subperiods: 1875-1913, 1971-84, and, to a lesser extent, 1985-97. Spread performs somewhat better in forecasts based on rolling regressions.

NOTE: Unless otherwise noted, the dependent variable in each study is the growth rate of real GDP. GARCH, generalized autoregressive conditional heteroskedasticity; GNP, gross national product; VAR, vector autoregression; VAR-VECM, VAR–vector error correction model.

Stock and Watson (2003) examine the stability of the forecasting relationship between the term spread and output growth for the United States and other countries and consider both in-sample and out-of-sample forecasts. Like prior studies, Stock and Watson (2003) find that the term spread forecasts U.S. output growth less accurately after 1985. The study also finds that the spread forecasts output less accurately during 1985-99 than a simple autoregressive model.

A recent study by Giacomini and Rossi (2006) reexamines the forecasting performance of the yield curve for output growth using forecast breakdown tests developed by Giacomini and Rossi (2009). Giacomini and Rossi (2006) show that output growth models are characterized by a breakdown of predictability. In particular, they find strong evidence of forecast breakdowns at the one-year horizon during 1974-76 and 1979-87.

Several studies that find diminished performance of term spread forecasts of output growth in recent years point to the increased stability of output growth and other macroeconomic variables since the mid-1980s (at least until 2007) as a possible reason for the apparent change. As noted previously, a change in the relative responsiveness of monetary policy to output growth and inflation could affect how well the term spread predicts output growth. Bordo and Haubrich (2004, 2008) investigate the ability of the term spread to forecast U.S. output growth across different monetary regimes from 1875 to 1997. The authors examine periods distinguished by major changes in the monetary and interest rate environment, including the founding of the Federal Reserve System in 1914, World War II, the Treasury-Fed Accord of 1951, and the closing of the U.S.
gold window and collapse of the Bretton Woods system in 1971. Bordo and Haubrich (2004, 2008) find that the term spread improves the forecast of output growth, as indicated by the mean squared forecast error, in three of the nine subperiods they consider: (i) the period preceding the establishment of the Federal Reserve System (1875-1913), (ii) the first 13 years after the collapse of the Bretton Woods system (1971-84), and, to a lesser extent, (iii) the 1985-97 period.10 The term spread does not improve forecasts of output growth during the interwar period or the Bretton Woods era that followed World War II.

Bordo and Haubrich (2004, 2008) find that the term spread tends to forecast output growth better during periods when the persistence of inflation was relatively high, such as the first 13 years after the collapse of the Bretton Woods system. In such periods, inflation shocks increase both short- and long-term interest rates and thus do not affect the slope of the yield curve. Real shocks that are expected to be temporary, however, increase short-term rates by more than long-term rates and signal a future downturn in economic activity. Bordo and Haubrich (2004, 2008) find that the term spread forecasts output growth less accurately when inflation persistence is relatively low, as it was during the interwar period and the Bretton Woods era. In such periods, both inflation and real shocks increase short-term interest rates more than long-term rates. Bordo and Haubrich argue, however, that only real shocks are likely to affect future output growth and, hence, the lower the persistence of inflation, the noisier the signal produced by the term spread about future output growth.
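Subperiod comparisons of this kind can be sketched in a few lines. The example below is synthetic, with an invented break in the spread's predictive power at mid-sample; real studies would split actual growth and spread series at regime dates such as 1914, 1951, or 1971:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic illustration of a forecasting break: the spread is
# informative for growth in the first subperiod but not in the second.
# All parameter values are invented.
T = 240
spread = rng.normal(1.0, 1.0, T)
beta = np.where(np.arange(T) < T // 2, 0.8, 0.0)  # break at mid-sample
growth = 0.5 + beta * spread + rng.normal(0.0, 0.5, T)

def spread_r2(y, x):
    """R^2 from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1.0 - resid.var() / y.var()

r2_early = spread_r2(growth[:T // 2], spread[:T // 2])  # spread informative
r2_late = spread_r2(growth[T // 2:], spread[T // 2:])   # power has vanished
```

Comparing fit (or out-of-sample forecast error) across subperiods in this way is the basic logic behind the subperiod and rolling-regression evidence discussed above.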
Benati and Goodhart (2008) extend the work of Bordo and Haubrich (2004, 2008) by (i) considering the marginal predictive content of the term spread for forecasting output growth in a multivariate model and (ii) attempting to date more precisely changes in the marginal predictive content of the spread over time. Whereas Bordo and Haubrich (2004, 2008) estimate bivariate regression models similar to equation (1), Benati and Goodhart (2008) estimate Bayesian time-varying parameter vector autoregressions (VARs).

Benati and Goodhart (2008) find that the term spread forecasts U.S. output growth better during the 1880s and 1890s than during the first two decades of the twentieth century. Further, like Bordo and Haubrich (2004, 2008), Benati and Goodhart (2008) find that the spread has almost no predictive content for the interwar years or the Bretton Woods era. In addition, the study finds that the term spread contains significant predictive information about output growth during 1979-87 but none for other postwar years. Benati and Goodhart (2008) also find that estimates of the marginal predictive content of the spread are sensitive to whether a short-term interest rate and inflation are included in the forecasting model, and they find considerable variation in the marginal predictive content of the term spread over time for other countries and for different forecast horizons. Thus, like Bordo and Haubrich (2004, 2008), Benati and Goodhart (2008) find numerous breaks in the relationship between the term spread and future changes in output over time.

10. Bordo and Haubrich (2004, 2008) also estimate rolling regressions with 24-quarter windows and find that the term spread predicts output less accurately during the pre-Fed period than suggested by their original estimates. However, their results for the post-Bretton Woods era are robust to the use of rolling regressions.
However, unlike those identified by Bordo and Haubrich (2004, 2008), the breaks identified by Benati and Goodhart (2008) are not clearly associated with changes in the monetary regime or inflation persistence.

Evidence from Nonlinear Models

Much of the literature investigating the performance of the term spread in forecasting output growth relies on linear models. However, variation over time in the ability of the term spread to forecast output growth suggests possible nonlinearities in the forecasting relationship, and some recent studies using data for the United States and Canada find this to be the case. Further, researchers are beginning to use models that capture such nonlinearities. For example, Galbraith and Tkacz (2000) find evidence of a threshold effect in the relationship between the term spread and conditional expectations of output growth for the United States and Canada but not for other major developed countries. Specifically, the authors find a large and statistically significant impact of the term spread on conditional expectations of output growth. However, the marginal effect of an increase in the spread on predicted output growth is lower when the level of the term spread rises above a certain point.

Shaaf (2000) and Tkacz (2001) use neural network models to account for nonlinearity in the relationship between the term spread and output growth. Both studies find that this class of models produces smaller forecast errors than linear models. Venetis, Paya, and Peel (2003) use nonlinear smooth transition models that can accommodate regime-type nonlinear behavior and time-varying parameters to examine the predictive power and stability of the term spread–output growth relationship.
Using data for the United States, United Kingdom, and Canada, Venetis, Paya, and Peel (2003) find that the term spread–output growth relationship is stronger when past values of the term spread do not exceed a positive threshold value.11 Duarte, Venetis, and Paya (2005) use both linear regression and nonlinear models to examine the predictive accuracy of the term spread–output growth relationship among euro-area countries. The authors find that linear indicator and nonlinear threshold indicator models predict output growth well at four-quarter horizons and that the term spread is a useful indicator of future output growth and recessions in the euro area. The linear models show signs of instability, however, and the authors find evidence of significant nonlinearities with respect to time and lagged output growth. Further, the authors’ nonlinear model outperforms their linear model in out-of-sample forecasts of one-year-ahead output growth. Ang, Piazzesi, and Wei (2006) point out that the regressions typically used to investigate the predictive content of the term spread are unconstrained, and the authors argue for a model that treats both the term spread and output growth as endogenous variables. Ang, Piazzesi, and Wei (2006) build a dynamic model of GDP growth and bond yields that completely characterizes expectations of GDP growth. Using quarterly U.S. data for 1952-2001, the authors find that, contrary to previous research, the short-term interest rate outperforms the term spread in forecasting real GDP growth both in and out of sample and that including the term spread does not significantly improve forecasts of output growth. In summary, the recent empirical literature on the usefulness of the term spread for forecasting output growth finds that the spread predicts output growth less accurately in some countries and some periods than in others.
Notably, several studies find that the term spread’s power to forecast output has diminished since the mid-1980s. Several recent studies find evidence of significant nonlinearities, such as threshold effects, in the empirical relationship between the term spread and output growth.

11 For a discussion of smooth transition regression, see Granger and Teräsvirta (1993) or Teräsvirta (1998).

DOES THE TERM SPREAD FORECAST RECESSIONS?

As an alternative to using the term spread to forecast output growth, many studies examine the extent to which the term spread is useful for forecasting the onset of recessions. Several of those studies are summarized in Table 3. Most recession-forecasting studies estimate a probit model of the following type, in which the dependent variable is a categorical variable set equal to 1 for recession periods and to 0 otherwise:

(2) P(recession_t) = F(α_0 + α_1 S_{t−k}),

where F indicates the cumulative normal distribution function. If the coefficient α_1 is statistically significant, then the term spread, S_{t−k}, is deemed useful for forecasting a recession k periods ahead. Models of the following form are often used to test how well the spread predicts recessions when additional explanatory variables are included in the model:

(3) P(recession_t) = F(α_0 + α_1 S_{t−k} + α_2 X_{t−k}),

where X_{t−k} is a vector of additional explanatory variables. If α_1 is significant in equation (2) but not in equation (3), then the ability of the spread to predict recessions is not robust to the inclusion of other variables. Using probit estimation, Estrella and Hardouvelis (1991) and Estrella and Mishkin (1998) find that the term spread significantly outperforms other financial and macroeconomic variables in forecasting U.S. recessions.
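The probit specification in equations (2) and (3) can be sketched in a few lines of code. The following is a minimal illustration in Python with NumPy, fitting equation (2) by maximum likelihood via Fisher scoring on simulated data; the series, coefficient values, and variable names are illustrative assumptions, not the data or estimation routines of the studies surveyed here.

```python
# Minimal probit estimator for equation (2): P(recession_t) = F(a0 + a1 * S_{t-k}),
# where F is the standard normal CDF. All data below are simulated for illustration.
import numpy as np
from math import erf

def Phi(z):
    """Standard normal cumulative distribution function (vectorized)."""
    return 0.5 * (1.0 + np.array([erf(v / np.sqrt(2.0)) for v in np.atleast_1d(z)]))

def phi(z):
    """Standard normal density."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def probit_fit(X, y, iterations=50):
    """Fit a probit model by Fisher scoring; returns the coefficient vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(iterations):
        z = X @ beta
        p = np.clip(Phi(z), 1e-9, 1.0 - 1e-9)
        score_w = phi(z) * (y - p) / (p * (1.0 - p))  # generalized residuals
        info_w = phi(z) ** 2 / (p * (1.0 - p))        # Fisher information weights
        XtWX = X.T @ (X * info_w[:, None])
        beta = beta + np.linalg.solve(XtWX, X.T @ score_w)
    return beta

# Simulate a term spread and a recession indicator whose probability falls as the
# lagged spread rises (true intercept 0.5 and slope -1.0 are assumed values).
rng = np.random.default_rng(0)
n = 2000
spread = rng.normal(1.0, 1.0, n)
recession = (rng.uniform(size=n) < Phi(0.5 - 1.0 * spread)).astype(float)

X = np.column_stack([np.ones(n), spread])
a0_hat, a1_hat = probit_fit(X, recession)
print(f"intercept: {a0_hat:.2f}, slope on spread: {a1_hat:.2f}")  # slope should be negative
```

A statistically significant negative α_1 is what this literature reads as "a flat or inverted yield curve raises the probability of recession"; equation (3) is the same fit with the additional explanatory variables appended as extra columns of X.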
Estrella and Hardouvelis (1991) show that the spread between the yields on 10-year and 3-month Treasury securities is a useful predictor of recessions, as well as of future growth of output, consumption, and investment. Estrella and Mishkin (1998) compare the ability of several financial variables, including interest rates, interest rate spreads, stock prices, and monetary aggregates, to predict U.S. recessions out of sample. They find that stock prices are useful for predicting recessions at one- to three-quarter horizons but that the term spread outperforms all other variables beyond a one-quarter forecast horizon. Moreover, based on U.S. data for 1955-98 and German data for 1967-98, Estrella, Rodrigues, and Schich (2003) find that models that use the term spread to predict recessions are more stable than forecasting models for continuous variables, such as GDP growth and industrial production. The term spread appears useful for predicting recessions in many countries. Using probit estimation, Bernard and Gerlach (1998) find that the term spread forecasts recessions up to two years ahead in eight countries (Belgium, Canada, France, Germany, Japan, the Netherlands, United Kingdom, and United States) over the 1972-93 period. Similarly, Moneta (2005) finds that the spread is useful for predicting recession probabilities for the euro area as a whole, as well as in individual countries.12 Several studies test whether the term spread remains useful for predicting recessions in multivariate forecasting models. For example, Dueker’s (1997) probit model includes the change in an index of leading economic indicators, real money stock growth, the spread between the 6-month commercial paper and Treasury bill rates, and the percentage change in a stock price index, as well as the difference in yields on 30-year Treasury bonds and 3-month Treasury bills as a measure of the term spread.
Dueker (1997) finds that among the variables, the term spread is the dominant predictor of recessions at horizons beyond three months. Bernard and Gerlach (1998) include both an index of leading indicators and foreign interest rate term spreads in a recession-forecasting model. The index of leading indicators contains information beyond that in the term spreads, but the information is useful only for forecasting recessions in the immediate future. Bernard and Gerlach (1998) find that in addition to the domestic term spread, the term spreads of Germany and the United States are particularly useful for forecasting recessions in Japan and the United Kingdom, respectively. Sensier et al. (2004) use logit models to predict recessions in four European countries. The authors find that international data (in particular, the U.S. index of leading indicators and short-term interest rate) are useful for predicting business cycles in the four countries. The domestic term spread helps forecast recessions in Germany when international variables are included in the model, and short- and long-term interest rates entered separately help forecast recessions in France and the United Kingdom. Wright (2006) confirms previous studies in finding that the term spread is highly statistically significant in a bivariate probit recession model estimated on U.S. data for 1964-2005. However, Wright (2006) also finds that a model that includes both the federal funds rate and term spread fits the data much better than the bivariate model and provides superior out-of-sample recession forecasts.

12 Moneta (2005) examines the predictive power of 10 yield spreads, representing different segments of the yield curve, and finds that the spread between the yield on 10-year government bonds and the 3-month interbank rate outperforms all other spreads in predicting recessions in the euro area.
Similarly, King, Levin, and Perli (2007) find that a model that includes a corporate credit spread produces superior in- and out-of-sample recession forecasts compared with a model that includes only the term spread. In addition, they find that the multivariate model produces a much lower incidence of false-positive recession predictions. Rosenberg and Maurer (2008) investigate whether recession forecasts can be improved by distinguishing between the interest rate expectations and term premium components of the term spread. Their approach is similar to that of Hamilton and Kim (2002) discussed previously. If changes in the term premium distort the empirical relationship between the spread and recessions, a model that isolates interest rate expectations might yield superior recession forecasts. Rosenberg and Maurer (2008) find that the expectations component is more useful for forecasting recessions than the term premium and that only the coefficient on the expectations component is statistically significant in the probit model. Their study finds, however, that the term spread and expectations component generally produce similar recession probability forecasts. Moreover, between August 2006 and May 2007, the term spread model predicted a significantly higher recession probability than did the expectations component model. Several recent studies investigate nonlinearities in recession-forecasting models. For example, Dueker (1997) estimates a probit model with Markov-switching coefficient variation and a lagged dependent variable. He finds that allowing for Markov-switching coefficient variation on the term spread improves forecast accuracy, especially at longer horizons, while including the lagged value of the recession indicator improves the model’s fit and forecast accuracy, especially at 3- to 12-month horizons.
Further, Dueker (1997) finds that the nonlinear model produces fewer false warnings of recessions than a linear model. Ahrens (2002) estimates a probit forecasting model in which the term spread is assumed to follow a two-state Markov process. Using data for 1970-96 for eight countries of the Organisation for Economic Co-operation and Development (Canada, France, Germany, Italy, Japan, the Netherlands, the United Kingdom, and the United States), Ahrens (2002) finds that the term spread is a reliable predictor of business cycle peaks and troughs. Like Dueker (1997), Ahrens (2002) finds that the regime-switching framework produces more-accurate estimates of recession probabilities.

Other studies that estimate augmented probit (or logit) models, or compare results from probit estimation with those obtained using other methods, include Chauvet and Potter (2005), Galvao (2006), and Dueker (2005). Chauvet and Potter (2005) compare recession forecasts obtained using four different probit model specifications: (i) a time-invariant conditionally independent version, (ii) a business cycle–specific conditionally independent model, (iii) a time-invariant probit model with autocorrelated errors, and (iv) a business cycle–specific probit model with autocorrelated errors. Chauvet and Potter (2005) find evidence in favor of the business cycle–specific probit model with autocorrelated errors, which allows for multiple structural breaks across business cycles and autocorrelation.

Table 3
Selective Summary of Studies of the Usefulness of the Term Spread for Predicting Recessions

- Estrella and Hardouvelis (1991). Methodology: Probit model. Data: U.S. (1955-88). Finding(s): Spread is useful for forecasting recessions 4 quarters ahead. Notes: Results are robust to including short-term interest rate and other variables in model.
- Dueker (1997). Methodology: Dynamic probit with Markov switching. Data: U.S. (1959-95). Finding(s): Spread is useful for prediction up to 12 months ahead. Notes: Results are robust to including other variables, including lagged recession indicator and regime switching.
- Dotsey (1998). Methodology: Probit model. Data: U.S. (1955-97). Finding(s): Spread is useful for prediction; outperforms naive model. Notes: Spread failed to accurately forecast 1990-91 recession.
- Estrella and Mishkin (1998). Methodology: Probit model. Data: U.S. (1959-95). Finding(s): Spread is useful for prediction, especially at 2- to 6-quarter horizons. Notes: Spread dominates other financial variables for out-of-sample prediction.
- Bernard and Gerlach (1998). Methodology: Probit model. Data: Eight industrialized countries (1972-93). Finding(s): Spread is useful for prediction at 4- to 8-quarter horizons. Notes: Foreign spreads add little information, except for Japan (German spread) and the U.K. (U.S. spread).
- Ahrens (2002). Methodology: Probit model with Markov switching. Data: Eight industrialized countries (1971-96). Finding(s): Spread is useful for prediction, especially cycle peaks. Notes: Regime-switching framework allows onset and ending of recessions to be determined endogenously.
- Estrella, Rodrigues, and Schich (2003). Methodology: Probit model. Data: U.S. (1955-98) and Germany (1967-98). Finding(s): Spread is useful for prediction at 12-month horizons, less so at 24- and 36-month horizons. Notes: Results are generally robust to alternative term spreads, with little evidence of instability over time.
- Sensier et al. (2004). Methodology: Logistic regression model. Data: Germany, France, Italy, and U.K. (1970-2001). Finding(s): Interest rates generally predict recessions at 3-month horizon. Notes: Short- and long-term rates entered separately; U.S. and German interest rates were useful for predicting recessions in other countries.
- Chauvet and Potter (2005). Methodology: Variants of probit model allowing for multiple structural breaks and autoregression. Data: U.S. (1954-2001). Finding(s): Spread is useful for prediction at 12-month horizon. Notes: Model with breakpoints and autocorrelated errors fits better in sample than basic probit model.
- Duarte, Venetis, and Paya (2005). Methodology: Dynamic probit model. Data: Euro area (1970-2004). Finding(s): Spread is useful for prediction at 3-quarter horizon. Notes: Both EMU and U.S. spreads useful, but EMU spread dominates.
- Moneta (2005). Methodology: Standard and dynamic probit model. Data: Euro area, Germany, France, and Italy (1970-2002). Finding(s): Spread is useful for predicting at 1-year horizon; dynamic model outperforms standard probit model. Notes: Spread between 10-year and 3-month Treasury securities dominates other spreads in forecasts.
- Galvao (2006). Methodology: Structural break threshold VAR model. Data: U.S. (1953-2003). Finding(s): Spread is useful for predicting output at 2-quarter horizons. Notes: Model allowing for structural breaks and nonlinearities outperforms standard VAR both in and out of sample.
- Wright (2006). Methodology: Probit model. Data: U.S. (1964-2005). Finding(s): Spread is useful for predicting recessions. Notes: Models that include the level of the federal funds rate produced superior in- and out-of-sample forecasts.
- Rosenberg and Maurer (2008). Methodology: Probit model. Data: U.S. (1961-2006). Finding(s): The expectations component of the spread is more accurate than the term premium component at forecasting recessions. Notes: The spread remains useful when the federal funds rate is included in the model.

NOTE: EMU, European Monetary Union.

Galvao (2006) estimates a recession-forecasting model that accounts for time-varying nonlinearity and structural breaks in the relationship between the term spread and recessions. The author finds that a model with time-varying thresholds predicts the timing of recessions better than models with a constant threshold or that allow only a structural break.
Finally, Dueker (2005) proposes a VAR (“Qual-VAR”) model to forecast recessions using data on the term spread, GDP growth, inflation, and the federal funds rate. He finds that the model fits well in sample and accurately forecasts the 2001 recession out of sample.

In summary, most empirical research to date finds that the term spread is useful for forecasting recessions—both for the United States and other countries—and that the spread predicts recessions more reliably than it does output growth. However, a few studies find that multivariate models that include other financial indicators besides the term spread improve recession-forecasting performance, as do models that account for threshold effects or other nonlinearities in the empirical relationship between the term spread and recessions.

CONCLUSION

The literature on the relationship between the yield curve and economic activity is large and expanding rapidly. Much of the literature examines empirically how well the term spread forecasts output growth or recessions, with less emphasis on why the yield curve predicts economic activity. To a great extent, the observation that changes in the slope of the yield curve appear to forecast changes in economic activity remains, as Benati and Goodhart (2008, p. 1237) contend, “a stylized fact in search of a theory.” Does the yield spread forecast output growth? Does it forecast recessions? The answer to both questions is a qualified “yes.” Early studies based on estimation of linear forecasting models using postwar U.S. data, as well as several recent studies, find that the term spread forecasts output growth well. Much research finds that the term spread is useful for forecasting output growth, especially at horizons of 6 to 12 months, and that the term spread remains useful even if other variables, including measures of monetary policy, are added to the forecasting model.
However, several recent studies also find considerable variation in the ability of the spread to forecast output growth across countries and time periods. In particular, several studies find that the spread’s ability to predict output growth has diminished since the mid-1980s. The literature also provides considerable evidence of nonlinearities and structural breaks in the relationship between the term spread and output growth. In general, studies show that the term spread is a more reliable predictor of recessions than of output growth and that the spread provides good recession forecasts, especially up to one year ahead. Researchers generally obtain superior forecasting performance from (i) probit models that include a lagged recession indicator and Markov-switching coefficients or other nonlinearities and (ii) other nonlinear approaches, such as smooth transition regression and multivariate adaptive regression splines estimation. The literature has not reached a consensus regarding the reasons for structural breaks or nonlinearities in the empirical relationship between the term spread and future economic activity. Several studies note that the relationship between the nominal yield curve and future economic activity is likely to depend on the nature of the monetary regime, including the relative responsiveness of the monetary authority to output and inflation. For example, the term spread is likely to forecast output growth better when the monetary authority is more responsive to output than inflation and when inflation is relatively persistent. Further estimation refinements, as well as additional research based on dynamic structural models (Ang, Piazzesi, and Wei, 2006), might provide insights into the interactions among the policy regime, financial variables, and output growth that help explain the questions posed by the empirical literature.

REFERENCES

Ahrens, Ralf.
“Predicting Recessions with Interest Rate Spreads: A Multicountry Regime-Switching Analysis.” Journal of International Money and Finance, August 2002, 21(4), pp. 519-37.

Ang, Andrew; Piazzesi, Monika and Wei, Min. “What Does the Yield Curve Tell Us About GDP Growth?” Journal of Econometrics, March/April 2006, 131(1/2), pp. 359-403.

Aretz, Kevin and Peel, David A. “Spreads versus Professional Forecasters as Predictors of Future Output Change.” Working paper, Lancaster University, 2008; http://ssrn.com/abstract=1123949.

Barsky, Robert B. “The Fisher Hypothesis and the Forecastability and Persistence of Inflation.” Journal of Monetary Economics, January 1987, 19(1), pp. 3-24.

Benati, Luca. “UK Monetary Regimes and Macroeconomic Stylized Facts.” Bank of England Working Paper 290, Bank of England, March 2006; www.bankofengland.co.uk/publications/workingpapers/wp290.pdf.

Benati, Luca. “Investigating Inflation Persistence Across Monetary Regimes.” Quarterly Journal of Economics, August 2008, 123(3), pp. 1005-60.

Benati, Luca and Goodhart, Charles. “Investigating Time-Variation in the Marginal Predictive Power of the Yield Spread.” Journal of Economic Dynamics and Control, April 2008, 32(4), pp. 1236-72.

Berk, Jan M. “The Information Content of the Yield Curve for Monetary Policy: A Survey.” De Economist, July 1998, 146(2), pp. 303-20.

Berk, Jan M. and van Bergeijk, Peter A.G. “On the Information Content of the Yield Curve: Lessons for the Eurosystem?” Kredit und Kapital, 2001, 1, pp. 28-47.

Bernanke, Ben S. and Blinder, Alan S. “The Federal Funds Rate and the Channels of Monetary Transmission.” American Economic Review, September 1992, 82(4), pp. 901-21.

Bernard, Henri and Gerlach, Stefan. “Does the Term Structure Predict Recessions? The International Evidence.” International Journal of Finance and Economics, July 1998, 3(3), pp. 195-215.

Bordo, Michael D. and Haubrich, Joseph G.
“The Yield Curve, Recessions, and the Credibility of the Monetary Regime: Long-Run Evidence, 1875-1997.” NBER Working Paper No. 10431, National Bureau of Economic Research, April 2004; www.nber.org/papers/w10431.pdf?new_window=1.

Bordo, Michael D. and Haubrich, Joseph G. “The Yield Curve as a Predictor of Growth: Long-Run Evidence, 1875-1997.” Review of Economics and Statistics, February 2008, 90(1), pp. 182-85.

Bordo, Michael D. and Schwartz, Anna J. “Monetary Policy Regimes and Economic Performance: The Historical Record,” in John B. Taylor and Michael Woodford, eds., Handbook of Macroeconomics. Chap. 2. Amsterdam: North Holland, 1999.

Chauvet, Marcelle and Potter, Simon. “Forecasting Recessions Using the Yield Curve.” Journal of Forecasting, March 2005, 24(2), pp. 77-103.

Cozier, Barry and Tkacz, Greg. “The Term Structure and Real Activity in Canada.” Bank of Canada Working Paper No. 94-3, Bank of Canada, March 1994; www.bankofcanada.ca/en/res/wp/1994/wp94-3.pdf.

D’Agostino, Antonello; Giannone, Domenico and Surico, Paolo. “(Un)predictability and Macroeconomic Stability.” European Central Bank Working Paper No. 605, European Central Bank, April 2006; www.ecb.int/pub/pdf/scpwps/ecbwp605.pdf.

Davis, E. Phillip and Fagan, Gabriel. “Are Financial Spreads Useful Indicators of Future Inflation and Output Growth in EU Countries?” Journal of Applied Econometrics, November/December 1997, 12(6), pp. 701-14.

Dotsey, Michael. “The Predictive Content of the Interest Rate Yield Spread for Future Economic Growth.” Federal Reserve Bank of Richmond Economic Quarterly, Summer 1998, 84(3), pp. 31-51; www.richmondfed.org/publications/research/economic_quarterly/1998/summer/pdf/dotsey.pdf.

Duarte, Augustin; Venetis, Ioannis A. and Paya, Ivan. “Predicting Real Growth and the Probability of Recession in the Euro Area Using the Yield Spread.” International Journal of Forecasting, April/June 2005, 21(2), pp.
262-77.

Dueker, Michael J. “Strengthening the Case for the Yield Curve as a Predictor of U.S. Recessions.” Federal Reserve Bank of St. Louis Review, March/April 1997, 79(2), pp. 41-51; http://research.stlouisfed.org/publications/review/97/03/9703md.pdf.

Dueker, Michael J. “Dynamic Forecasts of Qualitative Variables: A Qual VAR Model of U.S. Recessions.” Journal of Business and Economic Statistics, January 2005, 23(1), pp. 96-104.

Estrella, Arturo. “Why Does the Yield Curve Predict Output and Inflation?” Economic Journal, July 2005, 115(505), pp. 722-44.

Estrella, Arturo and Hardouvelis, Gikas A. “The Term Structure as a Predictor of Real Economic Activity.” Journal of Finance, June 1991, 46(2), pp. 555-76.

Estrella, Arturo and Mishkin, Frederic S. “The Predictive Power of the Term Structure of Interest Rates in Europe and the United States: Implications for the European Central Bank.” European Economic Review, July 1997, 41(7), pp. 1375-401.

Estrella, Arturo and Mishkin, Frederic S. “Predicting U.S. Recessions: Financial Variables as Leading Indicators.” Review of Economics and Statistics, February 1998, 80(1), pp. 45-61.

Estrella, Arturo; Rodrigues, Anthony P. and Schich, Sebastian. “How Stable Is the Predictive Power of the Yield Curve? Evidence from Germany and the United States.” Review of Economics and Statistics, August 2003, 85(3), pp. 629-44.

Estrella, Arturo and Trubin, Mary R. “The Yield Curve as a Leading Indicator: Some Practical Issues.” Federal Reserve Bank of New York Current Issues in Economics and Finance, July/August 2006, 12(5), pp. 1-7; www.newyorkfed.org/research/current_issues/ci12-5.pdf.

Feroli, Michael. “Monetary Policy and the Information Content of the Yield Spread.” Topics in Macroeconomics, September 2004, 4(1), Article 13.

Galbraith, John W. and Tkacz, Greg.
“Testing for Asymmetry in the Link Between the Yield Spread and Output in the G-7 Countries.” Journal of International Money and Finance, October 2000, 19(5), pp. 657-72.

Galvão, Ana Beatriz C. “Structural Break Threshold VARs for Predicting U.S. Recessions Using the Spread.” Journal of Applied Econometrics, May/June 2006, 21(4), pp. 463-87.

Giacomini, Raffaella and Rossi, Barbara. “How Stable Is the Forecasting Performance of the Yield Curve for Output Growth?” Oxford Bulletin of Economics and Statistics, December 2006, 68(Suppl. 1), pp. 783-95.

Giacomini, Raffaella and Rossi, Barbara. “Detecting and Predicting Forecast Breakdown.” Review of Economic Studies, April 2009, 76(2), pp. 669-705.

Granger, Clive W.J. and Teräsvirta, Timo. Modeling Nonlinear Economic Relationships. New York: Oxford University Press, 1993.

Hamilton, James D. and Kim, Dong H. “A Re-Examination of the Predictability of the Yield Spread for Real Economic Activity.” Journal of Money, Credit, and Banking, May 2002, 34(2), pp. 340-60.

Harvey, Campbell R. “The Real Term Structure and Consumption Growth.” Journal of Financial Economics, December 1988, 22(2), pp. 305-33.

Harvey, Campbell R. “Forecasts of Economic Growth From the Bond and Stock Markets.” Financial Analysts Journal, September/October 1989, 45(5), pp. 38-45.

Harvey, Campbell R. “The Term Structure and World Economic Growth.” Journal of Fixed Income, June 1991, 1(1), pp. 7-19.

Haubrich, Joseph G. and Dombrosky, Ann M. “Predicting Real Growth Using the Yield Curve.” Federal Reserve Bank of Cleveland Economic Review, First Quarter 1996, 32(1), pp. 26-35; www.clevelandfed.org/Research/Review/1996/96-q1-haubrich.pdf.

Hu, Zuliu. “The Yield Curve and Real Economic Activity.” IMF Staff Papers, December 1993, 40(4), pp. 781-806.

Jardet, Caroline. “Why Did the Term Structure of Interest Rates Lose Its Predictive Power?” Economic Modelling, May 2004, 21(3), pp.
509-24.

Kessel, Reuben A. “The Cyclical Behavior of the Term Structure of Interest Rates.” NBER Occasional Paper 91, National Bureau of Economic Research, 1965.

Kim, Kenneth A. and Limpaphayom, Piman. “The Effect of Economic Regimes on the Relation Between Term Structure and Real Activity in Japan.” Journal of Economics and Business, July/August 1997, 49(4), pp. 379-92.

King, Thomas B.; Levin, Andrew T. and Perli, Roberto. “Financial Market Perceptions of Recession Risk.” Finance and Economics Discussion Series No. 2007-57, Board of Governors of the Federal Reserve System; www.federalreserve.gov/pubs/feds/2007/200757/index.html.

Laurent, Robert D. “An Interest Rate-Based Indicator of Monetary Policy.” Federal Reserve Bank of Chicago Economic Perspectives, 1988, 12(1), pp. 3-14; www.chicagofed.org/publications/economicperspectives/1988/ep_jan_feb1988_part1_laurent.pdf.

Laurent, Robert D. “Testing the ‘Spread.’” Federal Reserve Bank of Chicago Economic Perspectives, 1989, 13(4), pp. 22-34; www.chicagofed.org/publications/economicperspectives/1989/ep_jul_aug1989_part3_laurent.pdf.

Moneta, F. “Does the Yield Spread Predict Recession in the Euro Area?” International Finance, Summer 2005, 8(2), pp. 263-301.

Nakaota, Hiroshi. “The Term Structure of Interest Rates in Japan: The Predictability of Economic Activity.” Japan and the World Economy, August 2005, 17(3), pp. 311-26.

Plosser, Charles I. and Rouwenhorst, K. Geert. “International Term Structures and Real Economic Growth.” Journal of Monetary Economics, February 1994, 33(1), pp. 133-55.

Rendu de Lint, Christel and Stolin, David. “The Predictive Power of the Yield Curve: A Theoretical Assessment.” Journal of Monetary Economics, October 2003, 50(7), pp. 1603-22.

Rosenberg, Joshua V. and Maurer, Samuel. “Signal or Noise?
Implications of the Term Premium for Recession Forecasting.” Federal Reserve Bank of New York Economic Policy Review, July 2008, 14(1), pp. 1-11; www.newyorkfed.org/research/epr/08v14n1/0807rose.pdf.

Sensier, Marianne; Artis, Michael J.; Osborn, Denise R. and Birchenhall, Chris. “Domestic and International Influences on Business Cycle Regimes in Europe.” International Journal of Forecasting, April/June 2004, 20(2), pp. 343-57.

Shaaf, Mohamad. “Predicting Recessions Using the Yield Curve: An Artificial Intelligence and Econometric Comparison.” Eastern Economic Journal, Spring 2000, 26(2), pp. 171-90.

Shiller, Robert J. and Siegel, Jeremy J. “The Gibson Paradox and Historical Movements in Real Long-term Interest Rates.” Journal of Political Economy, October 1977, 85(1), pp. 11-30.

Smets, Frank and Tsatsaronis, Kostas. “Why Does the Yield Curve Predict Economic Activity? Dissecting the Evidence for Germany and the United States.” CEPR Discussion Paper No. 1758, Centre for Economic Policy Research, December 1997.

Stock, James H. and Watson, Mark W. “Forecasting Output and Inflation: The Role of Asset Prices.” Journal of Economic Literature, September 2003, 41(3), pp. 788-829.

Teräsvirta, Timo. “Modeling Economic Relationships with Smooth Transition Regressions,” in Aman Ullah and David E.A. Giles, eds., Handbook of Applied Economic Statistics. Chap. 15. New York: Marcel Dekker, 1998, pp. 507-32.

Tkacz, Greg. “Neural Network Forecasting of Canadian GDP Growth.” International Journal of Forecasting, January/March 2001, 17(1), pp. 57-69.

Venetis, Ioannis A.; Paya, Ivan and Peel, David A. “Re-Examination of the Predictability of Economic Activity Using the Yield Spread: A Nonlinear Approach.” International Review of Economics and Finance, 2003, 12(2), pp. 187-207.

Wright, Jonathan H. “The Yield Curve and Predicting Recessions.” Finance and Economics Discussion Series No.
2006-07, Federal Reserve Board of Governors, February 2006; www.federalreserve.gov/pubs/feds/2006/200607/200607pap.pdf.

Mexico’s Integration into NAFTA Markets: A View from Sectoral Real Exchange Rates

Rodolphe Blavy and Luciana Juvenal

The authors use a threshold autoregressive model to confirm the presence of nonlinearities in sectoral real exchange rate dynamics across Mexico, Canada, and the United States for the periods before and after the North American Free Trade Agreement (NAFTA). Although trade liberalization is associated with reduced transaction costs and lower relative price differentials among countries, the authors find, by using estimated threshold bands, that Mexico still faces higher transaction costs than its developed counterparts. Other determinants of transaction costs are distance and nominal exchange rate volatility. The authors’ results show that the half-lives of sectoral real exchange rate shocks, calculated by Monte Carlo integration, imply much faster adjustment in the post-NAFTA period. (JEL F31, F36, F41)

Federal Reserve Bank of St. Louis Review, September/October 2009, 91(5, Part 1), pp. 441-64.

The analysis of relative price differentials across countries and sectors offers a way to evaluate the degree of market integration. The law of one price (LOOP) states that identical goods should sell for the same price across countries when prices are expressed in a common currency. Evidence has shown, however, that prices of goods fail to fully equalize between countries, indicating that markets are not perfectly integrated. Prices of homogeneous goods tend to differ across countries because the presence of transaction costs—such as transport costs and (explicit or implicit) trade barriers—limits price arbitrage.
Rodolphe Blavy is an economist at the International Monetary Fund and Luciana Juvenal is an economist at the Federal Reserve Bank of St. Louis. The authors thank the staff of the Banco de Mexico for their helpful comments; Steven Phillips for his contributions at various stages of preparation of this paper; and Roberto Benelli, Roberto Garcia-Saltos, David J. Robinson, Lucio Sarno, and seminar participants at the International Monetary Fund and at the Latin American and Caribbean Economic Association 2007 conference for comments. Volodymyr Tulin and Douglas Smith provided research assistance.

© 2009, The Federal Reserve Bank of St. Louis. The views expressed in this article are those of the author(s) and do not necessarily reflect the views of the Federal Reserve System, the Board of Governors, or the regional Federal Reserve Banks.

The study of the LOOP among members of the North American Free Trade Agreement (NAFTA) is of particular interest because it allows an assessment of whether regional trade liberalization results in faster price convergence, smaller price differentials across countries, and greater market integration. This paper focuses on three issues. First, we assess the degree of market integration between the United States, Mexico, and Canada by analyzing the validity of the LOOP between the countries. Second, we determine whether markets became more integrated, with reduced transaction costs, after the introduction of NAFTA. Finally, we analyze whether transaction costs are related to economic determinants.

Our study focuses on the role of transaction costs in modeling deviations from the LOOP. Several theoretical studies (see Dumas, 1992; Sercu and Raman, 1995; and O'Connell, 1998) show that because of transaction costs, it may not be profitable to arbitrage away relative price differences across countries when the marginal costs of arbitrage exceed the marginal benefits. This
situation generates a band of no trade within which prices in two locations fail to equalize. Outside this threshold band, arbitrage is profitable and the sectoral real exchange rate (SRER) can become mean-reverting. This dynamic implies nonlinearities in SRERs and is well captured by a threshold autoregressive (TAR) model for each sectoral relative price (see Tong, 1990; and Hansen, 1996 and 1997). The TAR model allows deviations from the LOOP to exhibit unit root behavior inside the threshold band and to become mean-reverting outside the band. If there is no mean reversion in the outer regime, relative prices fail to equalize between countries, a sign of weak market integration. In this way, the estimated threshold bands provide a measure of transaction costs.

The empirical methodology analyzes dynamics in relative price adjustment and innovates by taking the perspective of an emerging market, Mexico.1 Motivated by the previous literature, we investigate the presence of threshold-type nonlinearities in deviations from the LOOP by comparing the monthly real U.S. dollar/Mexican peso, U.S. dollar/Canadian dollar, and Mexican peso/Canadian dollar exchange rates over 1980-2006. Nonlinearities are captured using a self-exciting threshold autoregressive (SETAR) model. More precisely, we estimate SETAR models for each SRER for the pre- and post-NAFTA periods. This estimation gives a measure of transaction costs (the threshold band) and the autoregressive parameter outside the band.
Footnote 1: There is now an established literature on the nonlinear behavior of SRERs for developed markets (see Obstfeld and Taylor, 1997; Imbs et al., 2003; Sarno, Taylor, and Chowdhury, 2004; and Juvenal and Taylor, 2008).

We determine whether deviations from the LOOP show mean-reverting properties by testing whether the nonlinear specification is superior to a nonstationary model for each subsample. This requires testing whether the autoregressive process outside the band is significantly different from the random walk observed inside the band. We also test whether the threshold bands differ significantly for each SRER between the pre- and post-NAFTA periods, thus allowing an assessment of whether NAFTA led to greater market integration.

The results show that transaction costs are larger for the Mexico-U.S. and Mexico-Canada country pairs than for the Canada-U.S. pair, thus suggesting a higher degree of market integration between the United States and Canada. We also find that NAFTA significantly reduced transaction costs and price differentials between the United States and Mexico, although this was not uniform across sectors. Finally, our estimated transaction costs are negatively related to trade liberalization, commonly shared geographic borders, and lower exchange rate volatility.

To measure the speed of mean reversion, we use generalized impulse response functions to compute the half-life of exchange rates, which is the time it takes for 50 percent of the effect of a shock to dissipate (see Koop, Pesaran, and Potter, 1996). We find that half-lives are substantially reduced after the introduction of NAFTA, especially for the Mexico-U.S. country pair. This implies that reduced arbitrage costs were accompanied by faster adjustments in price differentials.

The remainder of the paper is organized as follows.
The next section reviews theoretical considerations on nonlinear dynamics in SRERs and presents the corresponding econometric methodology. The following sections first discuss the results and then provide a battery of robustness tests. The last section concludes.

NONLINEARITIES: MOTIVATION AND EMPIRICAL FRAMEWORK

According to the LOOP, similar goods should be priced the same across countries when prices are expressed in a common currency. At the aggregate level, the LOOP translates into purchasing power parity. The LOOP is based on the assumption of frictionless goods arbitrage: an environment in which there are no impediments to trade or transaction costs that would prevent perfect arbitrage. Ample empirical evidence (Isard, 1977; Richardson, 1978; and Giovannini, 1988) suggests that relative prices do not converge, or do so only over a very long horizon, and that price differentials are persistent. These studies also find that relative price differentials are significant and highly correlated with exchange rate movements.

One reason that prices of homogeneous commodities may not be the same across different countries is the existence of transaction costs arising from transport costs, tariffs, and nontariff barriers.2 A number of theoretical papers suggest the importance of transport and trade barriers in creating price differences between countries (e.g., Dumas, 1992; Sercu and Raman, 1995; and O'Connell, 1998). The models described in such studies have incorporated different assumptions regarding the nature of trade costs. Overall, price differences driven by transaction costs can be expressed as

S^i P^i_j = P^R_j + A_j,

where S^i is the nominal exchange rate between country i's currency and the reference country's currency, P^i_j is the price of good j in country i, P^R_j is the price of good j in the reference country, and A_j is the marginal transaction cost.
Footnote 2: Heckscher (1916) first pointed out the possibility of nonlinearities in relative prices in the presence of trade frictions. In the case of Mexico, González and Rivadeneyra (2004) investigate the LOOP between Mexican cities and provide empirical evidence that transaction costs (including tariff and nontariff barriers) explain departures from the LOOP.

In particular, A_j is the minimum price difference that makes arbitrage profitable between country i and the reference country. In the presence of perfectly competitive markets and constant returns to scale technology, and in the absence of sellers' pricing power, price differences larger than the transaction costs will be arbitraged away. Thus,

(1)  -A_j ≤ S^i P^i_j - P^R_j ≤ A_j.

In this framework, transaction costs generate two regimes: (i) when price differentials are smaller than transaction costs, there is a regime of no arbitrage, described by equation (1); and (ii) when price differences exceed transaction costs, arbitrage is profitable and equation (1) does not hold. This implies that price differentials behave in a nonlinear fashion: they follow a nonstationary process within the transaction costs band (or threshold band), and outside the band they are mean-reverting toward the band because of arbitrage effects.

The condition expressed in equation (1) can be written in terms of each SRER as

(2)  1 - A_j/P^R_j ≤ S^i P^i_j / P^R_j ≤ 1 + A_j/P^R_j,

where S^i P^i_j / P^R_j is the SRER between country i's currency and the reference country's currency for good j. Condition (2) implies that transaction cost bands and nonlinearities are both good-specific and country pair-specific.

Based on this theoretical framework, a number of empirical studies analyze the nonlinear nature of deviations from the LOOP in terms of a TAR model (e.g., Tong, 1990). The TAR model allows for the presence of a threshold band within which arbitrage is not profitable.
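The two-regime logic implied by equation (1) can be sketched in a few lines of code. This is our own minimal illustration (the function name and the numeric values are ours, not the authors'):

```python
def loop_regime(x, kappa):
    """Classify a LOOP deviation x relative to a transaction-cost band
    [-kappa, kappa], following equation (1): inside the band the deviation
    is left unarbitraged; outside it, arbitrage becomes profitable."""
    if abs(x) <= kappa:
        return "no-arbitrage"  # equation (1) holds; the deviation can drift
    return "arbitrage-above" if x > kappa else "arbitrage-below"

# A 12 percent price gap is not arbitraged when transaction costs are
# 15 percent, but it is when they are only 10 percent.
print(loop_regime(0.12, 0.15))  # no-arbitrage
print(loop_regime(0.12, 0.10))  # arbitrage-above
```

The band is good- and country pair-specific, so in an application κ would differ across each sector and pair, as condition (2) implies.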
Consequently, deviations from the LOOP follow a unit root process inside the band; outside the band the process can become mean-reverting. Recent contributions that use this model to analyze SRER dynamics of developed markets include Obstfeld and Taylor (1997), Sarno, Taylor, and Chowdhury (2004), Imbs et al. (2003), and Juvenal and Taylor (2008). In particular, Obstfeld and Taylor (1997), who used disaggregated data on clothing, food, and fuel, find evidence of nonlinearities in a sample of 32 locations. Sarno, Taylor, and Chowdhury (2004) provide support for nonlinear mean reversion with considerable cross-country and sectoral heterogeneity; they use annual price data interpolated into quarterly data for nine sectors and quarterly data on five exchange rates vis-à-vis the U.S. dollar. Juvenal and Taylor (2008) study the presence of nonlinearities in deviations from the LOOP for 19 sectors in 10 European countries and find significant evidence of threshold adjustment, with transaction costs varying considerably across sectors and countries.

Empirical Framework

Data. We use disaggregated monthly data on consumer price indices (CPIs) for 18 sectors from January 1980 to December 2006 for Mexico, the United States, and Canada. Data on CPIs were obtained from the Bank of Mexico, the U.S. Bureau of Labor Statistics, and Statistics Canada. The sectors analyzed are bread, meat, fish, dairy, fruits, veg (vegetables), nonalco (nonalcoholic beverages), alco (alcoholic beverages), tobac (tobacco), clothw (women's clothing), clothm (men's clothing), foot (footwear), fuel, furniture, medic (medication), vehicles, gasoline, and photo (photographic equipment). Table 1 lists the sectors analyzed in this study and the description of the category for each country. Monthly nominal exchange rates are period averages from International Financial Statistics of the International Monetary Fund.

Model.
We model deviations from the LOOP using a SETAR model for each sectoral exchange rate to analyze the patterns in relative price convergence. More precisely, we investigate the presence of nonlinearities in deviations from the LOOP using a threshold-type model with two regimes.

Our modeling process involves four steps. First, we estimate TAR models for each SRER. Second, we test the validity of the nonlinear threshold model against a null hypothesis of a unit root process; this allows us to test for the existence of some degree of price convergence as opposed to no price convergence at all.3 Third, when we find evidence that a nonlinear specification is superior to a nonstationary model, we determine whether price convergence is characterized by an asymmetric threshold adjustment consistent with arbitrage arguments; that is, we test whether a nonlinear model fits the data better than a stationary linear one. Finally, when we find evidence of nonlinear price convergence in the pre- and post-NAFTA periods, we determine whether the size of the threshold band is equal in both periods.

Footnote 3: A failure to reject the unit root hypothesis implies that deviations from the LOOP are a uniform unit root process and, thus, prices in two locations are disconnected. This test allows identification of any difference in the autoregressive parameters between the inner-band and outer-band regimes. It is an important addition to the methodology generally used in the literature: earlier studies directly test for nonlinearity with respect to a linear model but do not determine whether the outer regime is nonstationary. An exception is found in Peel and Taylor (2002), who present a unit root testing procedure to study covered interest parity. We use the procedure developed by Enders and Granger (1998) to test the null hypothesis of nonstationarity against an alternative of stationarity with threshold adjustment.
The existence of transaction costs, in the form of transport costs or trade barriers, is one explanation for the lack of price convergence. As described previously, frictions to trade imply the presence of significant nonlinearities in SRER dynamics. That is, transaction costs generate a band in which the marginal costs of arbitrage exceed the marginal benefit. Within this band, there is a zone of no trade and consequently prices in two locations fail to equalize. Outside this band, arbitrage is profitable and the SRER can become mean-reverting. Empirically, this pattern is described by a TAR model, which was originally popularized by Balke and Fomby (1997) in the context of testing for purchasing power parity and the LOOP.

Let x^i_jt be the deviation from the LOOP for sector j in country i at time t, defined as follows:

(3)  x^i_jt = s^i_t + p^i_jt - p^R_jt,

where s^i_t is the logarithm of the nominal exchange rate between country i's currency and the reference country's currency, p^i_jt is the logarithm of the price of good j in country i at time t, and p^R_jt is the logarithm of the price of good j in the reference country at time t.

A simple three-regime TAR model may be written as

(4)  q^i_jt = α q^i_jt-1 + ε^i_jt   if |q^i_jt-d| ≤ κ

(5)  q^i_jt = κ(1 - ρ) + ρ q^i_jt-1 + ε^i_jt   if q^i_jt-d > κ

(6)  q^i_jt = -κ(1 - ρ) + ρ q^i_jt-1 + ε^i_jt   if q^i_jt-d < -κ

(7)  ε^i_jt ~ N(0, σ²),

where q^i_jt is the demeaned component of the relative price difference x^i_jt, given by x^i_jt = c^i_j + q^i_jt (q^i_jt is estimated as an ordinary least squares [OLS] residual), κ is the threshold parameter,4 and q^i_jt-d is the threshold variable for sector j and country i.

Footnote 4: Note that κ is country and sector specific.

The parameter d accounts for the delay with
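The three-regime dynamics of equations (4)-(7) are straightforward to simulate. The sketch below is our own illustration with arbitrary parameter values (function name and defaults are ours); it generates a series that follows a random walk inside the band and reverts toward the band edge outside it:

```python
import numpy as np

def simulate_setar(T, kappa, rho, d=1, sigma=0.05, seed=0):
    """Simulate the three-regime TAR of equations (4)-(7) with alpha = 1:
    a random walk inside [-kappa, kappa] and mean reversion at speed rho
    toward the nearer band edge outside it."""
    rng = np.random.default_rng(seed)
    q = np.zeros(T)
    for t in range(d, T):
        eps = rng.normal(0.0, sigma)
        trigger = q[t - d]  # threshold variable q_{t-d}
        if trigger > kappa:        # equation (5): revert toward +kappa
            q[t] = kappa * (1 - rho) + rho * q[t - 1] + eps
        elif trigger < -kappa:     # equation (6): revert toward -kappa
            q[t] = -kappa * (1 - rho) + rho * q[t - 1] + eps
        else:                      # equation (4) with alpha = 1: random walk
            q[t] = q[t - 1] + eps
    return q

q = simulate_setar(T=500, kappa=0.15, rho=0.8)
```

A smaller ρ makes the outer-regime pull stronger, while a wider κ lets deviations from the LOOP wander longer before arbitrage bites.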
which economic agents react to real exchange rate deviations. Hereafter, we restrict the value of α to unity, so that, inside the band, deviations from the LOOP are persistent and follow a random walk.5 Outside the band, when |q^i_jt-d| > κ, the process becomes mean-reverting as long as ρ < 1. The model described is a TAR(1, 2, d), where 1 is the autoregressive order, 2 is the number of thresholds, and d is the delay parameter.

Table 1
Categories of Goods in the CPIs

Sector    | Mexico                                   | United States                       | Canada
Bread     | Bread, tortillas, and cereals            | Cereals and bakery products         | Bakery and other cereal products
Meat      | Meat                                     | Meat                                | Meat
Fish      | Fish and seafood                         | Fish and seafood                    | Fish and other seafood
Dairy     | Milk, dairy products, and eggs           | Dairy and related products          | Dairy products and eggs
Fruits    | Fresh fruits                             | Fresh fruits                        | Fruit, fruit preparation, and nuts
Veg       | Fresh vegetables                         | Fresh vegetables                    | Fresh vegetables
Nonalco   | Sugar, coffee, and packaged refreshments | Nonalcoholic beverages              | —
Alco      | Alcoholic beverages                      | Alcoholic beverages                 | Alcoholic beverages
Tobac     | Tobacco                                  | Tobacco                             | Tobacco products and smokers' supplies
Clothw    | Women's clothing                         | Women's apparel                     | Women's wear
Clothm    | Men's clothing                           | Men's apparel                       | Men's wear
Foot      | Footwear                                 | Footwear                            | Footwear
Fuel      | Electricity and fuel                     | Fuel and utilities                  | Water, fuel, and electricity
Furniture | Furniture                                | Furniture and bedding               | Furniture
Medic     | Medications and equipment                | Medical care commodities            | —
Vehicles  | Acquisition of vehicles                  | New vehicles                        | Purchase of automotive vehicles
Gasoline  | Gasoline and lubricants/oil              | Gasoline (all types)                | Gasoline
Photo     | Photographic equipment and material      | Photographic equipment and supplies | —

Figure 1
Footwear Real Exchange Rate and Threshold Bands: Deviations from the LOOP, 1980-2006. [chart omitted]
Further, because the threshold variable is assumed to be the lagged dependent variable, the model is called a SETAR(1, 2, d) with the given parameters. Figure 1 shows an example of the estimated model. The graph contains the time series for q^i_jt (solid line), which represents the demeaned real exchange rate between Mexico and the United States for the footwear sector, and the estimated κ (dashed lines).

Footnote 5: This restriction is widely used in the literature; see Obstfeld and Taylor (1997), Imbs et al. (2003), Sarno, Taylor, and Chowdhury (2004), and Juvenal and Taylor (2008).

Estimation. Using indicator functions 1(q^i_jt-d > κ) and 1(q^i_jt-d < -κ), which take the value of 1 when the corresponding inequality is satisfied, the model in equations (4) through (7) can be simplified to equation (8):

(8)  Δq^i_jt = (ρ - 1)(q^i_jt-1 - κ) 1(q^i_jt-d > κ) + (ρ - 1)(q^i_jt-1 + κ) 1(q^i_jt-d < -κ) + ε^i_jt.

Note that the model in equation (8) is assumed to be symmetric. Thus, deviations from the LOOP outside the threshold band are treated the same regardless of whether prices are higher in the United States or in another country. This specification assumes that reversion is toward the edge of the band. Let us rewrite equation (8) as

(9)  Δq^i_jt = B^i_jt(κ, d)′ Γ + ε^i_jt,

where B^i_jt(κ, d)′ is a (1 × 2) row vector that describes the behavior of Δq^i_jt in the outer regime and Γ is a (2 × 1) vector containing the autoregressive parameters to be estimated. More precisely,

(10)  B^i_jt(κ, d)′ = [X′ 1(q^i_jt-d > κ)   Y′ 1(q^i_jt-d < -κ)],

where

(11)  X′ = q_t-1 - κ,  Y′ = q_t-1 + κ,  and  Γ′ = [ρ - 1   ρ - 1].

The parameters of interest are Γ, κ, and d. Equation (8) is a regression equation nonlinear in parameters that can be estimated using least squares.
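For fixed κ and d, equations (8)-(11) reduce to OLS on the two outer-regime regressors. A sketch of that regression in NumPy (our own illustration, not the authors' code; the function name is ours):

```python
import numpy as np

def estimate_gamma(q, kappa, d):
    """Least-squares estimate of Gamma in equation (9) for given (kappa, d),
    built from the design vector of equations (10)-(11)."""
    T = len(q)
    idx = np.arange(d, T)      # time indices t for which q_{t-d} exists
    dq = q[idx] - q[idx - 1]   # Delta q_t
    lag = q[idx - 1]           # q_{t-1}
    trig = q[idx - d]          # threshold variable q_{t-d}
    up = trig > kappa          # indicator 1(q_{t-d} > kappa)
    dn = trig < -kappa         # indicator 1(q_{t-d} < -kappa)
    # columns of B: X'1(q_{t-d} > kappa) and Y'1(q_{t-d} < -kappa)
    B = np.column_stack([(lag - kappa) * up, (lag + kappa) * dn])
    gamma, *_ = np.linalg.lstsq(B, dq, rcond=None)
    return gamma               # each element estimates rho - 1
```

Both elements of the returned vector estimate ρ - 1, consistent with the symmetry restriction Γ′ = [ρ - 1  ρ - 1]; values near zero indicate weak outer-regime reversion.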
For given values of κ and d, the least-squares estimate of Γ is

(12)  Γ̂(κ, d) = [Σ_{t=1}^T B^i_jt(κ, d) B^i_jt(κ, d)′]⁻¹ Σ_{t=1}^T B^i_jt(κ, d) Δq^i_jt,

with residuals

ε̂^i_jt(κ, d) = Δq^i_jt - B^i_jt(κ, d)′ Γ̂(κ, d)

and residual variance

(13)  σ̂²(κ, d) = (1/T) Σ_{t=1}^T ε̂^i_jt(κ, d)².

Because the values of κ and d are not given, they should be estimated together with the autoregressive parameter, ρ. Hansen (1997) suggests a methodology to identify the model in equation (9) that consists of the simultaneous estimation of κ, d, and ρ via a grid search over κ and d. The model is estimated by sequential least squares for values of d from 1 to 6, and the values of κ and d that minimize the sum of squared residuals are chosen. The range for the grid search is selected to contain the 15th and 85th percentiles of the threshold variable. This can be written as

(14)  (κ̂, d̂) = argmin_{κ ∈ Θ, d ∈ Ψ} σ̂²(κ, d),

where Θ = [κ̲, κ̄] is the interval searched for κ. The least-squares estimator of Γ is Γ̂ = Γ̂(κ̂, d̂), with residuals ε̂^i_jt(κ̂, d̂) = Δq^i_jt - B^i_jt(κ̂, d̂)′ Γ̂(κ̂, d̂) and residual variance σ̂²(κ̂, d̂) = (1/T) Σ_{t=1}^T ε̂^i_jt(κ̂, d̂)².

Testing Procedures. Before explaining the results, it is important to determine whether the TAR-type nonlinear model is superior when tested against a unit root process and against a linear AR(1) process. These tests require preestimation of both the linear model under the null hypothesis and the TAR model under the alternative.

First, we determine whether the SETAR specification is superior to a unit root process for each SRER using the Enders and Granger (1998) threshold unit root test.6 The method is a generalization of the Dickey-Fuller test. The null hypothesis is

H_0^A: ρ = 1

against an alternative of stationarity with threshold adjustment. This test allows identification of any difference in the autoregressive parameters between the inner and outer regimes. Its main advantage is that it is generally more powerful than the Dickey-Fuller test. A failure to reject the unit root null hypothesis implies that the LOOP does not hold and prices in two locations are disconnected. We interpret this as conveying that transaction costs are so high that the entire series is contained within the threshold bands; thus, the inner and outer regimes cannot be distinguished. When the unit root null hypothesis is rejected, we continue with our analysis.

Our second step is to test a linear AR(1) specification against a nonlinear stationary SETAR. Let β be the autoregressive parameter implied by the linear AR(1). The linear null hypothesis is

H_0^B: β = ρ.

When we find evidence of nonlinearities in the pre- and post-NAFTA periods, we determine whether the size of the threshold band is equal in both periods. Let τ^i_j be the threshold in the post-NAFTA period and θ^i_j the threshold in the pre-NAFTA period. The null hypothesis is

H_0^C: τ^i_j = θ^i_j.

As noted in Hansen (1997), testing hypotheses H_0^B and H_0^C is not straightforward.

Footnote 6: Other tests for the null hypothesis of a unit root against a nonlinear model have been proposed in the literature. Recent contributions include Kapetanios and Shin (2006) and Bec, Guay, and Guerre (2008). In particular, Kapetanios and Shin (2006) propose a Wald statistic to test a unit root null hypothesis against a three-regime SETAR process, and Bec, Guay, and Guerre (2008) develop a more general procedure that consists of an adaptive threshold SupWald unit root test. We emphasize that the decision to use the Enders and Granger (1998) test does not represent a criticism of other methods. Overall, simulations have not provided evidence in favor of one test or another, and this analysis is beyond the scope of our paper.
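The grid search of equation (14) can be sketched as follows. This is our simplification of the procedure (in particular, we build the κ grid from the 15th-85th percentiles of |q_t|, and the function name and grid size are ours):

```python
import numpy as np

def grid_search_setar(q, max_delay=6, n_grid=50):
    """Estimate (kappa, d) as in equation (14): sequential least squares on
    equation (8) over a grid of kappa values and delays d = 1..max_delay,
    keeping the pair that minimizes the sum of squared residuals."""
    lo, hi = np.percentile(np.abs(q), [15, 85])
    best_ssr, best_kappa, best_d = np.inf, None, None
    for d in range(1, max_delay + 1):
        idx = np.arange(d, len(q))
        dq, lag, trig = q[idx] - q[idx - 1], q[idx - 1], q[idx - d]
        for kappa in np.linspace(lo, hi, n_grid):
            up, dn = trig > kappa, trig < -kappa
            # outer-regime design matrix from equations (10)-(11)
            B = np.column_stack([(lag - kappa) * up, (lag + kappa) * dn])
            gamma, *_ = np.linalg.lstsq(B, dq, rcond=None)
            ssr = np.sum((dq - B @ gamma) ** 2)
            if ssr < best_ssr:
                best_ssr, best_kappa, best_d = ssr, kappa, d
    return best_kappa, best_d, best_ssr
```

Because κ and d enter the model nonlinearly, profiling them out by grid search and solving the remaining linear problem by OLS is the standard way to make the estimation tractable.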
A statistical problem is present because conventional tests have asymptotic nonstandard distributions. To overcome inference problems, the asymptotic distribution of the conventional F-statistic must be calculated using a Monte Carlo simulation. Following Hansen (1997) and Peel and Taylor (2002), if the errors are i.i.d., the null hypotheses H_0^B and H_0^C can be tested using the statistic

(15)  F_T(κ, d) = T [σ̃² - σ̂²(κ, d)] / σ̂²(κ, d),

where F_T is the F-statistic when κ and d are known, T is the sample size, and σ̂²(κ, d) and σ̃² are the unrestricted and restricted estimates of the residual variance, respectively. Hence, σ̂²(κ, d) is obtained from the unconstrained nonlinear least-squares estimation of equation (8), and σ̃² results from the estimation of equation (8) with the restriction to be tested imposed. Because κ and d are not identified under the null hypothesis, the distribution of F_T(κ, d) is not a standard chi-square distribution. Hansen (1997) shows that the asymptotic distribution of F_T(κ, d) may be approximated using the following bootstrap procedure: (i) generate y^i*_jt, t = 1,…,T, from i.i.d. N(0,1) random draws; (ii) set q^i*_jt = y^i*_jt; (iii) using q^i*_jt-1 for t = 1,…,T, regress y^i*_jt on q^i*_jt-1, estimate the restricted and unrestricted models, and obtain the residual variances σ̃*² and σ̂*²(κ, d), respectively; and (iv) with these residual variances, calculate the F-statistic

(16)  F*_T(κ, d) = T [σ̃*² - σ̂*²(κ, d)] / σ̂*²(κ, d).

The bootstrap approximation to the asymptotic p-value of the test is calculated by counting the number of bootstrap samples for which F*_T(κ, d) exceeds the observed F_T(κ, d).

ESTIMATION RESULTS

Testing for Nonlinearity

Tables 2A, 2B, and 2C show the results of the estimation of the SETAR model for the Mexico-U.S., Canada-U.S., and Mexico-Canada country pairs, respectively.
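Steps (i)-(iv) of the bootstrap can be sketched generically. In the sketch below, `fit_stat` stands in for whatever routine computes the F-statistic of equation (15) on a series; the toy statistic comparing a random walk with an AR(1) is purely our illustration, not the authors' test:

```python
import numpy as np

def bootstrap_pvalue(f_obs, fit_stat, T, n_boot=199, seed=0):
    """Hansen (1997)-style bootstrap p-value: draw i.i.d. N(0,1) series,
    recompute the F-statistic on each draw, and report the share of
    bootstrap statistics exceeding the observed one."""
    rng = np.random.default_rng(seed)
    f_star = np.array([fit_stat(rng.normal(size=T)) for _ in range(n_boot)])
    return float(np.mean(f_star > f_obs))

def toy_f(y):
    """Toy F-statistic: restricted model is a driftless random walk,
    unrestricted model an AR(1) fitted by OLS, as in equation (15)'s
    restricted-vs-unrestricted variance comparison."""
    dy = np.diff(y)
    s_restricted = np.mean(dy ** 2)
    lag = y[:-1]
    b = lag @ y[1:] / (lag @ lag)
    s_unrestricted = np.mean((y[1:] - b * lag) ** 2)
    return len(dy) * (s_restricted - s_unrestricted) / s_unrestricted

y = np.random.default_rng(1).normal(size=200)
p = bootstrap_pvalue(toy_f(y), toy_f, T=200)
```

Simulating the statistic's distribution under the null sidesteps the nonstandard asymptotics caused by κ and d being unidentified under the null.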
The first step consists of testing the null hypothesis of a unit root using the Enders and Granger (1998) threshold unit root test. Essentially, this allows us to determine whether the autoregressive process is the same outside and inside the threshold band. A failure to reject the null hypothesis implies that the SRER is nonstationary and consequently prices in two locations are disconnected; thus, the LOOP does not hold. Our interpretation of such a case is that transaction costs are so large that arbitrage is not profitable and the threshold band is wide enough to contain the entire time series of the SRER.

For the Mexico-U.S. country pair, the test rejects the unit root null hypothesis in half of the series for the pre-NAFTA period. By contrast, in the post-NAFTA period nonstationarity is found in four of the sectors. We interpret these results as evidence that NAFTA has been associated with greater integration between the United States and Mexico. The behavior of relative prices between Mexico and Canada shows a similar pattern, even though the degree of market integration has not improved as much in the post-NAFTA period as in the case of the United States and Mexico.

The deviations from the LOOP in the Canada-U.S. country pair show a different behavior. The unit root null hypothesis is rejected in 73 percent of the series in the pre-NAFTA period and in all the series except one in the post-NAFTA period. These results suggest that the Canadian and American markets have been more closely integrated, with a slight improvement with NAFTA.

To further test the validity of the SETAR model, the second step consists of testing whether the nonlinear model is superior to a linear AR(1) process by applying the Hansen test described previously.
Table 2A
SETAR Estimation Results: Mexico–United States

Sector    | κ(pre) | ρ(pre) | p H0A(pre) | p H0B(pre) | κ(post) | ρ(post) | p H0A(post) | p H0B(post) | p H0C
Bread     | —      | —      | 0.52       | —          | —       | —       | 0.24        | —           | —
Meat      | 0.27   | 0.92   | —          | 0.00       | 0.09    | 0.96    | —           | 0.00        | 0.00
Fish      | —      | —      | 0.15       | —          | 0.02    | 0.96    | —           | 0.00        | —
Dairy     | 0.28   | 0.85   | —          | —          | 0.10    | 0.75    | —           | 0.00        | 0.00
Fruits    | —      | —      | 0.25       | —          | 0.05    | 0.84    | —           | 0.00        | —
Veg       | 0.09   | 0.78   | —          | 0.00       | 0.15    | 0.70    | —           | 0.00        | 0.05
Nonalco   | —      | —      | 0.35       | —          | 0.15    | 0.81    | —           | 0.00        | —
Alco      | 0.10   | 0.92   | —          | 0.00       | —       | —       | 0.11        | —           | —
Tobac     | 0.32   | 0.73   | —          | 0.00       | 0.14    | 0.86    | —           | 0.00        | 0.00
Clothw    | 0.18   | 0.86   | —          | 0.00       | 0.09    | 0.83    | —           | 0.00        | 0.01
Clothm    | —      | —      | 0.13       | —          | 0.16    | 0.87    | —           | 0.00        | —
Foot      | 0.07   | 0.95   | —          | 0.02       | 0.08    | 0.87    | —           | 0.00        | 0.64
Fuel      | —      | —      | 0.34       | —          | —       | —       | 0.59        | —           | —
Furniture | —      | —      | 0.28       | —          | 0.18    | 0.86    | —           | 0.01        | —
Medic     | —      | —      | 0.14       | —          | 0.20    | 0.85    | —           | 0.00        | —
Vehicles  | 0.14   | 0.75   | —          | 0.00       | 0.12    | 0.64    | —           | 0.00        | 0.39
Gasoline  | —      | —      | 0.23       | —          | —       | —       | 0.11        | —           | —
Photo     | 0.19   | 0.97   | —          | 0.03       | 0.19    | 0.85    | —           | 0.00        | 0.00

NOTE: This table shows the results from the estimation of the SETAR(1, 2, d) model in equation (8). κ is the value of the threshold and ρ is the outer root of the TAR process. The estimation of κ, ρ, and d is done simultaneously via a grid search over κ and d as described in the text. The p-values H0A, H0B, and H0C represent, respectively, the marginal significance levels of the null hypothesis of a unit root in the outer regime, the null hypothesis of linearity, and the null hypothesis of equality of thresholds during the pre- and post-NAFTA periods.
Table 2B
SETAR Estimation Results: Canada–United States

Sector    | κ(pre) | ρ(pre) | p H0A(pre) | p H0B(pre) | κ(post) | ρ(post) | p H0A(post) | p H0B(post) | p H0C
Bread     | —      | —      | 0.36       | —          | 0.09    | 0.93    | —           | 0.00        | —
Meat      | 0.06   | 0.91   | —          | 0.00       | 0.04    | 0.94    | —           | 0.00        | 0.39
Fish      | 0.08   | 0.85   | —          | 0.00       | 0.04    | 0.90    | —           | 0.00        | 0.08
Dairy     | 0.07   | 0.91   | —          | 0.00       | 0.07    | 0.95    | —           | 0.00        | —
Fruits    | 0.16   | 0.95   | —          | 0.02       | 0.09    | 0.79    | —           | 0.00        | —
Veg       | 0.14   | 0.80   | —          | 0.00       | 0.05    | 0.79    | —           | 0.00        | 0.01
Alco      | 0.15   | 0.89   | —          | 0.00       | 0.14    | 0.93    | —           | 0.00        | 0.47
Tobac     | —      | —      | 0.14       | —          | —       | —       | 0.41        | —           | —
Clothw    | 0.05   | 0.94   | —          | 0.00       | 0.13    | 0.81    | —           | 0.00        | 0.07
Clothm    | —      | —      | 0.23       | —          | 0.14    | 0.93    | —           | 0.00        | —
Foot      | —      | —      | 0.18       | —          | 0.08    | 0.96    | —           | 0.00        | —
Fuel      | 0.08   | 0.95   | —          | 0.00       | 0.04    | 0.94    | —           | 0.00        | 0.07
Furniture | 0.16   | 0.91   | —          | 0.00       | 0.10    | 0.95    | —           | 0.01        | 0.02
Vehicles  | 0.08   | 0.92   | —          | 0.00       | 0.07    | 0.94    | —           | 0.00        | 0.54
Gasoline  | 0.27   | 0.79   | —          | 0.00       | 0.28    | 0.72    | —           | 0.00        | 0.46

NOTE: See Table 2A. In some cases, fewer sectors are shown because data were not available.
Table 2C
SETAR Estimation Results: Mexico–Canada

Sector    | κ(pre) | ρ(pre) | p H0A(pre) | p H0B(pre) | κ(post) | ρ(post) | p H0A(post) | p H0B(post) | p H0C
Bread     | —      | —      | 0.34       | —          | —       | —       | 0.53        | —           | —
Meat      | 0.24   | 0.90   | —          | 0.00       | 0.76    | —       | —           | 0.00        | 0.03
Fish      | 0.14   | 0.87   | —          | 0.00       | 0.14    | —       | —           | 0.01        | —
Dairy     | 0.30   | 0.80   | —          | 0.00       | 0.19    | —       | —           | 0.00        | 0.00
Fruits    | —      | —      | 0.17       | —          | 0.15    | —       | —           | 0.00        | —
Veg       | 0.15   | 0.71   | —          | 0.00       | 0.21    | —       | —           | 0.00        | 0.07
Alco      | 0.23   | 0.92   | —          | 0.00       | 0.27    | —       | —           | 0.00        | 0.58
Tobac     | —      | —      | 0.14       | —          | —       | —       | 0.25        | —           | —
Clothw    | 0.15   | 0.80   | —          | 0.00       | 0.21    | —       | —           | 0.00        | 0.14
Clothm    | 0.17   | 0.90   | —          | 0.00       | 0.20    | —       | —           | 0.00        | 0.19
Foot      | 0.10   | 0.90   | —          | 0.00       | 0.20    | —       | —           | 0.00        | 0.03
Fuel      | —      | —      | 0.27       | —          | —       | —       | 0.61        | —           | —
Furniture | —      | —      | 0.16       | —          | 0.22    | —       | —           | 0.00        | 0.01
Vehicles  | —      | —      | 0.18       | —          | —       | —       | 0.66        | —           | —
Gasoline  | —      | —      | 0.13       | —          | —       | —       | 0.24        | —           | —

NOTE: See Table 2A.

We conduct this test only for cases in which the Enders and Granger (1998) test rejects the unit root null hypothesis.7 Our results show that the outcomes of the Hansen test are in line with those of the Enders and Granger (1998) test: when the Enders and Granger test finds evidence of threshold behavior, the Hansen test rejects the linear null hypothesis.

A few sectoral-level points should be highlighted. For the Mexico-U.S. country pair, the following sectors show evidence of unit root behavior: bread, a low-cost subsidized food sector; sectors subject to intervention through taxation, such as alcoholic and nonalcoholic beverages; and a sector with a high degree of differentiation, such as furniture. Interestingly, nonstationary behavior is found in sectors such as gasoline and fuel, which are characterized by a high degree of monopolistic power. Similarly, for the Mexico-Canada country pair there is evidence of a unit root in gasoline and bread, further suggesting the potential role of specific regulations in price differences.
In the Canada-U.S. country pair, nonstationary behavior is present in sectors subject to government intervention, such as tobacco, clothing, and footwear. By contrast, threshold adjustment is significant in food products sectors except for bread. Estimated Transaction Costs Tables 2A, 2B, and 2C show the estimated threshold bands for each SRER for the three country pairs. These bands are interpreted as a measure of transaction costs and thus reflect the degree of market integration. Evidence of a strong NAFTA effect is found for the Mexico-U.S. SRERs. Transaction costs bands and the heterogeneity of the threshold values are significantly reduced after the introduction of NAFTA. In the pre-NAFTA period, they range from 7 percent (footwear) to 32 percent (tobacco). By contrast, in the post-NAFTA period, threshold values range from 2 percent (fish products) to 20 percent (medical commodities). At an indi7 The Hansen test requires that the series are stationary; this is why we apply this test only for the series in which the unit root null hypothesis is rejected. 452 S E P T E M B E R / O C TO B E R , PA R T 1 2009 vidual level, in sectors such as nonalcoholic beverages, clothing, furniture, and medication, transaction costs decrease from “very large” (unit root process) in the pre-NAFTA period to “measurable” with a threshold model in the postNAFTA period. In sectors that exhibit significant nonlinear behavior in both periods, threshold bands are significantly smaller in the post-NAFTA period for meat, dairy, vegetables, tobacco, women’s clothing, and photo equipment. The reduction in the transaction costs bands suggests a greater market integration. Considering those sectors in which nonlinearities are detected, average transaction costs in the Mexico-U.S. pair are smaller than those for the Mexico-Canada pair. Moreover, the latter pair shows evidence of unit root behavior in a greater number of sectors. 
This means that transaction costs are so high that arbitrage is not worthwhile. Transaction costs between the United States and Canada are the lowest among the three country pairs examined. Overall, average transaction costs are 34 percent higher between the United States and Mexico than between the United States and Canada. This result confirms previous evidence that the United States and Canada are the most integrated among NAFTA members.[8] We also find less dispersion in the threshold bands in the pre- and post-NAFTA periods. The fact that integration between Canada and the United States started before the introduction of NAFTA could explain this result.

A further look at sectoral characteristics confirms that highly homogeneous sectors such as fish and fruits show relatively low threshold bands. This is a standard result in the literature, reported in studies for other country pairs (see Juvenal and Taylor, 2008).

[8] One possible alternative explanation for the lower thresholds between the United States and Canada than between Mexico and the United States is that goods are more homogeneous between the first two countries. More generally, the comparability of the sectors may vary across country pairs. First, wealth effects may be at play: the relatively large income differences between Mexico, on the one hand, and the United States and Canada, on the other, affect the specific goods sampled in each CPI category, and this disparity may complicate the analysis through the varying composition of luxury, middle, and ordinary products across countries. Second, statistical differences exist in the compilation of price-level data, notably in adjustments for quality changes. A solution to this problem is to look at more disaggregated price indices and SRERs.

Compared with the work of Juvenal
and Taylor (2008), threshold bands among NAFTA members are on average slightly lower than those between the United States and European countries.

Table 3A
Half-Lives: Mexico–United States

             Pre-NAFTA shock (%)         Post-NAFTA shock (%)
Sector       10   20   30   40   50      10   20   30   40   50
Bread         —    —    —    —    —       —    —    —    —    —
Meat         36   26   20   17   15      29   25   23   22   21
Fish          —    —    —    —    —      19   18   18   18   18
Dairy        20   15   11    9    8       7    5    5    5    5
Fruits        —    —    —    —    —       6    5    5    5    5
Veg           4    4    4    4    4       5    5    5    5    5
Nonalco       —    —    —    —    —       7    7    6    6    6
Alco         13   12   12   11   11       —    —    —    —    —
Tobac        18   12    8    7    6       8    7    7    7    7
Clothw       10   10   10    9    9       5    5    5    5    5
Clothm        —    —    —    —    —      10    8    8    7    7
Foot         18   17   16   16   16       6    6    6    6    6
Fuel          —    —    —    —    —       —    —    —    —    —
Furniture     —    —    —    —    —      14   10    8    8    8
Medic         —    —    —    —    —       8    8    8    8    7
Vehicles      6    5    5    4    3       6    4    4    4    4
Gasoline      —    —    —    —    —       —    —    —    —    —
Photo        55   49   44   40   37      24   14   10    9    8
Average      20   17   14   13   12      11    9    8    8    8

NOTE: This table shows the estimated half-lives (in months) of deviations from the LOOP for five shocks of various sizes: 10, 20, 30, 40, and 50 percent. The half-lives were calculated conditional on average initial history using the generalized impulse response function procedure developed by Koop et al. (1996).

Half-Lives of Relative Price Adjustment

A usual measure of the speed of mean reversion is the half-life, the time required for the effect of 50 percent of a shock to die out. Tables 3A, 3B, and 3C report the estimated half-lives (in months) of price deviations from the LOOP for the Mexico-U.S., Canada-U.S., and Mexico-Canada SRERs.[9] The speed of mean reversion is generally computed by taking into account the adjustment in the outer regime, which depends on the value of ρ. In this case, the half-life is calculated as if the model were linear, that is, as ln(0.5)/ln(ρ). Lo and Zivot (2001) emphasize the uncertainty of whether the computation of half-lives for linear models is applicable to nonlinear models.
However, studies based on a SETAR model generally use this measure (see, for example, Taylor, 2001). As highlighted in Juvenal and Taylor (2008), although the estimated half-lives of the outer regime yield some insight into the speed of mean reversion, the measure is limited because it does not consider the regime switching within the SETAR model.

[9] We compute the half-lives only for cases in which we find evidence of threshold behavior.

Thus, we compute the half-life using the generalized impulse response functions proposed by Koop, Pesaran, and Potter (1996). This method considers the nonlinear nature of the SETAR model and the different adjustment speeds in the inner and outer regimes: the SETAR model exhibits an infinite half-life within the threshold band, while adjustment outside the band depends on ρ. A shock may cause the model to switch regimes, and this adjustment is not captured by the first methodology. Following Taylor, Peel, and Sarno (2001), we compute the impulse response functions conditional on average initial history using Monte Carlo integration for shocks of 10, 20, 30, 40, and 50 percent.

Table 3B
Half-Lives: Canada–United States

             Pre-NAFTA shock (%)         Post-NAFTA shock (%)
Sector       10   20   30   40   50      10   20   30   40   50
Bread         —    —    —    —    —      14   12   12   11   11
Meat         11   10   10   10    9      13   12   12   12   12
Fish          6    5    4    4    4       9    8    8    8    8
Dairy        12   10   10   10   10      16   15   15   14   14
Fruits       27   24   21   20   19       5    5    5    5    5
Veg           7    6    6    6    6       5    5    5    5    5
Alco         13   10    9    9    9      17   16   15   14   13
Tobac         —    —    —    —    —       —    —    —    —    —
Clothw       14   13   12   12   11       7    7    6    6    6
Clothm        —    —    —    —    —      18   15   14   13   13
Foot          —    —    —    —    —      25   22   20   20   19
Fuel         17   15   15   15   15      12   12   12   12   11
Furniture    21   15   13   12   12      29   24   21   19   18
Vehicles     13   12   11   11   11      14   13   13   13   12
Gasoline      8    7    6    6    6       7    5    5    5    5
Average      14   12   11   10   10      12   11   11   10   10

NOTE: See Table 3A.

For the Mexico-U.S. pair, the average relative price adjustment is significantly faster in the post-NAFTA period.
For example, for a 10 percent shock, the average pre-NAFTA half-life is 20 months, whereas the average is reduced to 11 months in the post-NAFTA period (see Table 3A). Our results also yield additional observations. In the post-NAFTA period, the speed of mean reversion varies less across different shock sizes than in the pre-NAFTA period. This suggests that relative prices adjust more quickly, independent of the size of the price shock. Half-lives also vary substantially across sectors: relative prices adjust fairly quickly for homogeneous goods, such as food products, whereas the relative prices of more high-end products (e.g., furniture and photographic equipment) take longer to adjust.

The speed of relative price adjustment in the post-NAFTA period is comparable for the Mexico-U.S. and the Canada-U.S. pairs. For a 10 percent shock, the average half-lives are 11 months and 12 months, respectively. This contrasts with significant differences in the pre-NAFTA period, when Mexico-U.S. relative prices were much slower to adjust than Canada-U.S. prices (see Tables 3A and 3B). The half-lives for the Mexico-Canada country pair are also shorter in the post-NAFTA period (see Table 3C).

Table 3C
Half-Lives: Mexico-Canada

             Pre-NAFTA shock (%)         Post-NAFTA shock (%)
Sector       10   20   30   40   50      10   20   30   40   50
Bread         —    —    —    —    —       —    —    —    —    —
Meat         24   17   13   12   11       7    6    6    6    6
Fish         10    8    7    7    6      16   14   12   12   12
Dairy         9    7    6    5    5      11    9    9    8    8
Fruits        —    —    —    —    —       5    4    4    4    4
Veg           4    4    4    4    4       5    4    4    4    4
Alco         16   14   13   12   11      16   15   14   14   14
Tobac         —    —    —    —    —       —    —    —    —    —
Clothw       10   10    9    8    8      11   10    9    8    8
Clothm       12   11   11   10    9      14   13   12   12   11
Foot          9    8    8    8    7      15   13   12   12   11
Fuel          —    —    —    —    —       —    —    —    —    —
Furniture     —    —    —    —    —       8    6    6    5    5
Vehicles      —    —    —    —    —       —    —    —    —    —
Gasoline      —    —    —    —    —       —    —    —    —    —
Average      12   10    9    8    8      11   10    9    9    9

NOTE: See Table 3A.
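The two half-life notions used above can be illustrated concretely. The sketch below is our own construction with hypothetical parameter values, not the authors' code: the first function is the linear formula ln(0.5)/ln(ρ), and the second averages shocked-minus-baseline paths driven by common noise for a symmetric band-SETAR(1), in the spirit of the generalized impulse responses of Koop, Pesaran, and Potter (1996).

```python
import numpy as np

def linear_half_life(rho):
    """Half-life (in periods) of a linear AR(1) with coefficient rho:
    the horizon at which half of a shock has died out, ln(0.5)/ln(rho)."""
    return np.log(0.5) / np.log(rho)

def girf_half_life(kappa, rho, shock, sigma=0.05, n_sim=500, horizon=200, seed=1):
    """Monte Carlo generalized-IRF half-life for a symmetric band-SETAR(1):
    average the gap between a shocked and an unshocked path driven by the
    same noise, and report the first horizon at which the average gap is
    at most half the initial shock."""
    rng = np.random.default_rng(seed)

    def step(q, e):
        if abs(q) <= kappa:                 # inner regime: unit root
            return q + e
        edge = np.copysign(kappa, q)        # outer regime: revert toward the edge
        return edge + rho * (q - edge) + e

    gap = np.zeros(horizon)
    for _ in range(n_sim):
        base, hit = 0.0, shock
        for t in range(horizon):
            e = sigma * rng.standard_normal()
            base, hit = step(base, e), step(hit, e)
            gap[t] += hit - base
    gap = np.abs(gap) / n_sim
    below = np.nonzero(gap <= 0.5 * abs(shock))[0]
    return int(below[0]) + 1 if below.size else None
```

For ρ = 0.9 the linear half-life is about 6.6 periods; the generalized-IRF measure additionally reflects the infinite persistence inside the band and the possibility of regime switching along the adjustment path.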
Determinants of Thresholds

Based on the estimates of the SETAR models, we assess whether transaction costs are related to economic variables. To do this, we estimate a regression explaining the threshold parameter obtained in the section on estimated transaction costs:

(17)   κ_ij = λ_ij + Σ_{c=1}^{C} Φ_ij(c) z_ij(c) + ε_ij,

where κ is the threshold parameter and z_ij is a vector of explanatory variables. In equation (17) we thus assess whether transaction costs, measured by the estimated thresholds, are explained by selected explanatory variables.

The explanatory variables are intended to capture the size and nature of transaction costs. The first variable (distance) is a proxy for shipping costs. Given the small number of country pairs and their relative proximity, however, distance appears to be a poor measure; instead, we include a dummy variable that takes the value 1 when countries share a common border. The second variable is the volatility of the nominal exchange rate, which is intended to capture uncertainty about the macroeconomic environment; it is measured as the standard deviation of monthly exchange rate observations. Third, we include a measure of "tradability," defined as the sum of imports and exports relative to total output in a sector for a given country, sourced from the United Nations Industrial Development Organization (UNIDO) database.
Fourth, we use the number of establishments in each sector as a proxy for competition, or concentration, also obtained from the UNIDO database. Finally, a dummy for the post-NAFTA period is included.

We examine the determinants of thresholds for the entire sample, including all three country pairs.[10,11] The results, shown in Table 4, indicate that three variables are significant in all specifications: the post-NAFTA dummy, the shared border, and nominal exchange rate volatility. We find that thresholds are lower when countries share a border. Nominal exchange rate volatility is also significant, indicating that uncertainty about the macroeconomic environment limits arbitrage. The post-NAFTA dummy is also highly significant: the negative coefficient indicates that the introduction of NAFTA is associated with lower transaction costs. Neither the number of firms in a sector nor the degree of "tradability" in a sector is statistically significant (column 1 in Table 4).[12]

Table 4
Threshold Regressions

Variable                     (1)                  (2)
Distance                     –0.042 (0.054)*      –0.036 (0.058)*
Dummy post-NAFTA             –0.105 (0.002)**     –0.111 (0.001)**
Exchange rate volatility      4.468 (0.000)***     4.266 (0.000)***
Firms                        –0.002 (0.477)       —
Tradability                  –0.045 (0.259)       —
R²                            0.34                 0.33
N                             89                   89

NOTE: This table shows the results from the estimation of equation (17); p-values are shown in parentheses. *, **, and *** denote significance at the 10, 5, and 1 percent levels, respectively.

[10] Because we cannot obtain data on firms and tradability disaggregated for clothing (women) and clothing (men), but only for a generic clothing sector, we take the average threshold value of clothing (women) and clothing (men) as the κ̂ value for clothing.
[11] When we find evidence of unit root behavior in deviations from the LOOP, we consider κ to be the highest value of the threshold variable in the grid search.
This implies that transaction costs are so high that the entire SRER series remains within the threshold band.

In column 2, these two variables are excluded, with little change in the results. Overall, thresholds appear to be determined by distance (border) and exchange rate volatility. These results are consistent with findings in the literature. For example, Imbs et al. (2003) find that distance and exchange rate volatility explain threshold values. Another strand of the literature has analyzed the determinants of relative price differentials between the United States and Canada using different types of models, and our results are consistent with the findings of these studies. As an example, Engel and Rogers (1996) study the nature of deviations from the LOOP using CPI data for 14 goods sectors for different U.S. and Canadian cities; they show that the Canadian and U.S. markets are not perfectly integrated and that distance and the border are major determinants of price differences. In a related study, Engel et al. (2005) investigate the LOOP between U.S. and Canadian cities using actual prices (instead of price indices). They find that absolute price differences between U.S. and Canadian prices are higher than 7 percent. In addition, their results show that the border plays a significant role in explaining price differentials between cities.

ROBUSTNESS OF RESULTS

We conduct three robustness checks to gauge the sensitivity of the empirical results to underlying assumptions and variable definitions. First, we consider the possibility of long-run trends in the measured price differentials, arising from aggregation issues in price indices or from the presence of nontradable components or quality differences. We define q_jt^i as the detrended and demeaned component of the price difference x_jt^i, given by x_jt^i = c_j^i + θt + q_jt^i. As described previously, it is estimated as an OLS residual.
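The detrending step just described, which recovers q as the OLS residual of the price difference on a constant and a linear trend, can be sketched as follows (an illustrative helper with hypothetical naming, not the authors' code):

```python
import numpy as np

def detrend_srer(x):
    """Return q_t: the residual from an OLS regression of the sectoral
    price difference x_t on a constant and a linear time trend, i.e. the
    detrended, demeaned component in x_t = c + theta*t + q_t."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size, dtype=float)
    X = np.column_stack([np.ones_like(t), t])      # [constant, trend]
    beta, *_ = np.linalg.lstsq(X, x, rcond=None)   # OLS coefficients (c, theta)
    return x - X @ beta                            # residual = deviation series
```

Dropping the trend column gives the demeaned series used in the baseline estimation.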
Overall, our baseline findings prove robust to using detrended SRERs instead of the demeaned series. Tables 5A, 5B, and 5C show the results of the estimation of the SETAR model with detrended SRERs.

[12] Poor data quality is a probable explanation for the lack of significance.

Table 5A
SETAR Estimation Results (Detrended Data): Mexico–United States

             Pre-NAFTA                                       Post-NAFTA
             Threshold  Outer      Unit root  Hansen        Threshold  Outer      Unit root  Hansen
Sector       κ          regime ρ   p (H0A)    p (H0B)       κ          regime ρ   p (H0A)    p (H0B)
Bread        —          —          0.31       —             —          —          0.14       —
Meat         0.26       0.92       —          0.00          0.03       0.94       —          0.00
Fish         —          —          0.18       —             0.03       0.95       —          0.00
Dairy        0.29       0.84       —          —             0.09       0.83       —          0.00
Fruits       —          —          0.13       —             0.02       0.82       —          0.00
Veg          0.06       0.77       —          0.00          0.15       0.78       —          0.00
Nonalco      —          —          0.16       —             0.10       0.76       —          0.00
Alco         0.22       0.79       —          0.00          —          —          0.17       —
Tobac        —          —          0.15       0.00          0.16       0.90       —          0.00
Clothw       0.17       0.88       —          0.00          0.18       0.80       —          0.00
Clothm       —          —          0.33       —             0.15       0.77       —          0.00
Foot         0.11       0.93       —          0.02          0.09       0.88       —          0.00
Fuel         —          —          0.22       —             —          —          0.70       —
Furniture    —          —          0.46       —             0.16       0.81       —          0.01
Medic        —          —          0.27       —             0.15       0.88       —          0.00
Vehicles     0.16       0.79       —          0.00          0.09       0.70       —          0.00
Gasoline     —          —          0.19       —             —          —          0.17       —
Photo        0.16       0.96       —          0.02          0.17       0.90       —          0.00

NOTE: See Table 2A.
Table 5B
SETAR Estimation Results (Detrended Data): Canada–United States

             Pre-NAFTA                                       Post-NAFTA
             Threshold  Outer      Unit root  Hansen        Threshold  Outer      Unit root  Hansen
Sector       κ          regime ρ   p (H0A)    p (H0B)       κ          regime ρ   p (H0A)    p (H0B)
Bread        —          —          0.40       —             0.15       0.83       —          0.00
Meat         —          —          0.23       —             0.03       0.95       —          0.00
Fish         0.11       0.85       —          0.00          0.02       0.94       —          0.00
Dairy        0.05       0.94       —          0.00          0.07       0.92       —          0.00
Fruits       0.11       0.88       —          0.02          0.09       0.83       —          0.00
Veg          0.04       0.72       —          0.00          0.03       0.85       —          0.00
Alco         0.08       0.91       —          0.00          0.10       0.82       —          0.00
Tobac        —          —          0.22       —             —          —          0.22       —
Clothw       0.04       0.90       —          0.00          0.09       0.80       —          0.00
Clothm       0.06       0.88       —          0.00          0.11       0.94       —          0.00
Foot         —          —          0.12       —             0.05       0.90       —          0.00
Fuel         0.05       0.90       —          0.00          0.09       0.86       —          0.00
Furniture    0.08       0.87       —          0.00          0.16       0.91       —          0.00
Vehicles     0.09       0.80       —          0.00          0.10       0.95       —          0.00
Gasoline     0.16       0.97       —          0.00          0.05       0.80       —          0.00

NOTE: See Table 2A.

Table 5C
SETAR Estimation Results (Detrended Data): Mexico-Canada

             Pre-NAFTA                                       Post-NAFTA
             Threshold  Outer      Unit root  Hansen        Threshold  Outer      Unit root  Hansen
Sector       κ          regime ρ   p (H0A)    p (H0B)       κ          regime ρ   p (H0A)    p (H0B)
Bread        0.28       0.82       —          0.00          0.21       0.72       —          0.00
Meat         0.22       0.92       —          0.00          0.11       0.88       —          0.00
Fish         —          —          0.13       —             0.12       0.92       —          0.00
Dairy        0.31       0.91       —          0.00          0.20       0.87       —          0.00
Fruits       —          —          0.11       —             0.08       0.78       —          0.00
Veg          0.08       0.75       —          0.00          0.12       0.70       —          0.00
Alco         0.22       0.83       —          0.00          0.25       0.93       —          0.01
Tobac        —          —          0.19       —             —          —          0.55       —
Clothw       0.24       0.94       —          0.02          0.24       0.72       —          0.00
Clothm       0.23       0.93       —          0.01          0.24       0.82       —          0.00
Foot         0.15       0.85       —          0.00          0.20       0.92       —          0.00
Fuel         —          —          0.35       —             —          —          0.31       —
Furniture    —          —          0.19       —             0.18       0.86       —          0.00
Vehicles     —          —          0.17       —             —          —          0.15       —
Gasoline     —          —          0.18       —             —          —          0.39       —

NOTE: See Table 2A.
The conceptual problem with including a trend in the real exchange rate is that it implies that the real exchange rate converges to a different mean over time, an implication somewhat contradictory to the LOOP. Hence, our preferred measure is the demeaned series. The stability of our results across the two measures indicates that the trend component may not be of the utmost importance.

Second, we test the sensitivity of the results to a structural break in the Mexican series over the study period (1980-2006) during the Tequila Crisis. The results reported herein assume a constant mean over the period, consistent with the LOOP hypothesis. However, as a robustness check, we also test the sensitivity of the results to two conditions: (i) allowing for a different mean over the Tequila Crisis (1994:12–1995:12) and (ii) restricting the estimation period to 1996-2006. This was intended to assess whether the Tequila Crisis would significantly affect our findings. Our baseline findings are again robust to these checks. Tables 6A, 6B, and 6C report the estimated thresholds for each SRER, allowing for a different mean for the real exchange rate during the Tequila Crisis.

Table 6A
SETAR Estimation Results (Different Mean during Tequila Crisis): Mexico–United States

             Post-NAFTA
             Threshold  Outer      Unit root  Hansen
Sector       κ          regime ρ   p (H0A)    p (H0B)
Bread        —          —          0.54       —
Meat         0.14       0.82       —          0.00
Fish         0.13       0.91       —          0.00
Dairy        0.07       0.71       —          0.00
Fruits       0.05       0.77       —          0.00
Veg          0.04       0.83       —          0.00
Nonalco      0.14       0.78       —          0.00
Alco         0.11       0.93       —          0.00
Tobac        0.08       0.89       —          0.00
Clothw       0.09       0.83       —          0.00
Clothm       0.10       0.79       —          0.00
Foot         0.08       0.94       —          0.00
Fuel         0.14       0.75       —          0.00
Furniture    0.11       0.90       —          0.00
Medic        0.17       0.77       —          0.00
Vehicles     0.12       0.83       —          0.00
Gasoline     —          —          0.25       —
Photo        0.12       0.91       —          0.00

NOTE: See Table 2A.
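The "different mean during the Tequila Crisis" adjustment amounts to demeaning with a crisis-window dummy. A minimal sketch, assuming a boolean crisis mask supplied by the user (a hypothetical helper, not the authors' code):

```python
import numpy as np

def demean_with_break(x, crisis_mask):
    """Remove the sample mean while allowing a different mean inside a
    crisis window: OLS of x on a constant and a 0/1 crisis dummy, with
    the residual serving as the adjusted deviation series."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(crisis_mask, dtype=float)       # 1 inside the crisis window
    X = np.column_stack([np.ones_like(x), d])
    beta, *_ = np.linalg.lstsq(X, x, rcond=None)
    return x - X @ beta
```

For monthly data starting in 1980:01, the mask would flag the 1994:12–1995:12 observations; the adjusted series then feeds the same SETAR estimation as before.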
Across sectors, homogeneous goods have lower transaction costs than other goods in the sample. Across country pairs, average transaction costs among NAFTA members are 27 percent higher between the United States and Mexico than between the United States and Canada, slightly less than the result when the Tequila Crisis is ignored. The results of the latter robustness analysis (not reported here but available upon request) are broadly consistent with the ones discussed here; thus, the Tequila Crisis does not significantly affect our findings.

Table 6B
SETAR Estimation Results (Different Mean during Tequila Crisis): Canada–United States

             Post-NAFTA
             Threshold  Outer      Unit root  Hansen
Sector       κ          regime ρ   p (H0A)    p (H0B)
Bread        0.09       0.93       —          0.00
Meat         0.04       0.94       —          0.00
Fish         0.04       0.90       —          0.00
Dairy        0.07       0.95       —          0.00
Fruits       0.09       0.79       —          0.00
Veg          0.05       0.79       —          0.00
Alco         0.14       0.93       —          0.00
Tobac        0.05       0.95       —          0.03
Clothw       0.13       0.81       —          0.00
Clothm       0.14       0.93       —          0.00
Foot         0.08       0.96       —          0.00
Fuel         0.04       0.94       —          0.00
Furniture    0.10       0.95       —          0.00
Vehicles     0.07       0.94       —          0.00
Gasoline     0.26       0.72       —          0.00

NOTE: See Table 2A.

CONCLUSION

Using a SETAR model, we find strong evidence of nonlinearities in SRER dynamics across Mexico, Canada, and the United States in the pre- and post-NAFTA periods. This result is consistent with the predictions of theoretical models that incorporate some form of market segmentation. Overall, mean reversion occurs when deviations from the LOOP are significant and the benefits of arbitrage are higher than transaction costs.

We obtain two key parameters from the estimation of the SETAR models. The first is the threshold, taken as a measure of transaction costs. The second is the autoregressive parameter in the outer regime, which determines the speed of mean reversion. We obtain these parameters for each SRER corresponding to the three country pairs for both periods.
Our findings indicate that the value of transaction costs is highly heterogeneous across sectors and countries. The estimated price thresholds range from 2 percent to 32 percent for the Mexico-U.S. and Canada-U.S. country pairs. The results generally confirm that highly homogeneous sectors, such as fish and fruits, show low threshold bands. Overall, average transaction costs among NAFTA members are 34 percent higher between the United States and Mexico than between the United States and Canada, indicating that Mexico and the United States are relatively less integrated than Canada and the United States. In turn, threshold bands are higher still for the Mexico-Canada pair.

We relate the value of the threshold band to plausible economic determinants. Our results show that the border effect and exchange rate volatility are significant determinants of transaction costs. The post-NAFTA dummy is also strongly significant and negative, confirming that the introduction of NAFTA is associated with lower transaction costs.

Table 6C
SETAR Estimation Results (Different Mean during Tequila Crisis): Mexico-Canada

             Post-NAFTA
             Threshold  Outer      Unit root  Hansen
Sector       κ          regime ρ   p (H0A)    p (H0B)
Bread        —          —          0.74       —
Meat         0.20       0.92       —          0.00
Fish         0.13       0.91       —          0.00
Dairy        0.08       0.97       —          0.05
Fruits       0.08       0.83       —          0.00
Veg          0.04       0.80       —          0.00
Alco         0.06       0.95       —          0.02
Tobac        —          —          0.25       —
Clothw       0.10       0.90       —          0.00
Clothm       0.11       0.89       —          0.00
Foot         0.06       0.95       —          0.02
Fuel         0.14       0.77       —          0.01
Furniture    —          —          0.16       —
Vehicles     —          —          0.13       —
Gasoline     —          —          0.07       —

NOTE: See Table 2A.

To shed some light on the mean-reverting properties of the SRERs, we consider the regime switching that occurs inside and outside the band in the SETAR model and compute the half-lives using generalized impulse response functions. Overall, the speed of mean reversion depends on the size of the shock.
Larger shocks mean-revert much faster than smaller ones. On average, the half-lives are substantially reduced after the introduction of NAFTA: for the Mexico-U.S. country pair, the average half-life falls from 20 months in the pre-NAFTA period to 11 months in the post-NAFTA period. The post-NAFTA period also shows less variation in the speed of mean reversion across different shock sizes than the pre-NAFTA period.

Our analysis therefore supports the arguments that (i) emerging markets (in this case, Mexico) still face higher transaction costs than their developed counterparts and (ii) trade liberalization may help lower relative price differentials between countries. We suspect that lack of competition may be a major determinant of high price thresholds but cannot prove this empirically.

The main conclusion of our analysis is that Mexico has made progress but still has considerable room for improvement in reducing barriers to goods market integration and achieving the full benefits of globalization. Future research should focus on why transaction costs between Mexico and the United States continue to exceed those between Canada and the United States for many types of goods, and on whether these costs can be reduced through policy actions. Examples of such actions include developing logistics, transportation, and internal distribution mechanisms, enhancing the state of competition among domestic firms, and reducing remaining barriers to external trade.

REFERENCES

Balke, Nathan S. and Fomby, Thomas B. "Threshold Cointegration." International Economic Review, August 1997, 38, pp. 627-45.

Bec, Frédérique; Guay, Alain and Guerre, Emmanuel. "Adaptive Consistent Unit-Root Tests Based on Autoregressive Threshold Model." Journal of Econometrics, January 2008, 142(1), pp. 94-133.

Dumas, Bernard.
"Dynamic Equilibrium and the Real Exchange Rate in a Spatially Separated World." Review of Financial Studies, June 1992, 5(2), pp. 153-80.

Enders, Walter and Granger, C.W.J. "Unit-Root Tests and Asymmetric Adjustment with an Example Using the Term Structure of Interest Rates." Journal of Business and Economic Statistics, July 1998, 16, pp. 304-12.

Engel, Charles and Rogers, John H. "How Wide Is the Border?" American Economic Review, December 1996, 86(5), pp. 1112-25.

Engel, Charles; Rogers, John H. and Wang, Shing-Yi. "Revisiting the Border: An Assessment of the Law of One Price Using Very Disaggregated Consumer Price Data," in Rebecca Driver, Peter Sinclair and Christoph Thoenissen, eds., Exchange Rates, Capital Flows and Policy. London: Routledge, 2005, pp. 187-203.

Giovannini, Alberto. "Exchange Rates and Traded Goods Prices." Journal of International Economics, February 1988, 24(1-2), pp. 45-68.

González, Marco and Rivadeneyra, Francisco. "La Ley de un Solo Precio en México: Un Análisis Empírico" [The Law of One Price in Mexico: An Empirical Analysis]. Gaceta de Economía, 2004, 19, pp. 91-115.

Hansen, Bruce E. "Inference When a Nuisance Parameter Is Not Identified under the Null Hypothesis." Econometrica, March 1996, 64, pp. 413-30.

Hansen, Bruce E. "Inference in TAR Models." Studies in Nonlinear Dynamics and Econometrics, April 1997, 2(1), pp. 1-14.

Heckscher, Eli F. "Växelkursens Grundval vid Pappersmyntfot" [The Basis of the Exchange Rate under a Paper Standard]. Ekonomisk Tidskrift, 1916, 18(10), pp. 309-12.

Imbs, Jean; Mumtaz, Haroon; Ravn, Morten O. and Rey, Hélène. "Nonlinearities and Real Exchange Rate Dynamics." Journal of the European Economic Association, April 2003, 1(2-3), pp. 639-49.

Isard, Peter. "How Far Can We Push the Law of One Price?" American Economic Review, December 1977, 67(5), pp. 942-48.

Juvenal, Luciana and Taylor, Mark P. "Threshold Adjustment of Deviations from the Law of One Price." Studies in Nonlinear Dynamics and Econometrics, September 2008, 12(3), Article 8.
Kapetanios, George and Shin, Yongcheol. "Unit Root Tests in Three-Regime SETAR Models." Econometrics Journal, June 2006, 9(2), pp. 252-78.

Koop, Gary; Pesaran, M. Hashem and Potter, Simon M. "Impulse Response Analysis in Nonlinear Multivariate Models." Journal of Econometrics, September 1996, 74(1), pp. 119-47.

Lo, Ming Chien and Zivot, Eric. "Threshold Cointegration and Nonlinear Adjustment to the Law of One Price." Macroeconomic Dynamics, September 2001, 5(4), pp. 533-76.

Obstfeld, Maurice and Taylor, Alan M. "Nonlinear Aspects of Goods-Market Arbitrage and Adjustment: Heckscher's Commodity Points Revisited." Journal of the Japanese and International Economies, December 1997, 11(4), pp. 441-79.

O'Connell, Paul G.J. "The Overvaluation of Purchasing Power Parity." Journal of International Economics, February 1998, 44(1), pp. 1-19.

Peel, David A. and Taylor, Mark P. "Covered Interest Rate Arbitrage in the Interwar Period and the Keynes-Einzig Conjecture." Journal of Money, Credit, and Banking, February 2002, 34(1), pp. 51-75.

Richardson, J. David. "Some Empirical Evidence on Commodity Arbitrage and the Law of One Price." Journal of International Economics, May 1978, 8(2), pp. 341-51.

Sarno, Lucio; Taylor, Mark P. and Chowdhury, Ibrahim. "Nonlinear Dynamics in Deviations from the Law of One Price: A Broad-Based Empirical Study." Journal of International Money and Finance, February 2004, 23(1), pp. 1-25.

Sercu, Piet and Uppal, Raman. "The Exchange Rate in the Presence of Transaction Costs: Implications for Tests of Purchasing Power Parity." Journal of Finance, September 1995, 50(4), pp. 1309-19.

Taylor, Alan M. "Potential Pitfalls for the Purchasing-Power-Parity Puzzle? Sampling and Specification Biases in Mean-Reversion Tests of the Law of One Price." Econometrica, March 2001, 69(2), pp. 473-98.

Taylor, Mark P.; Peel, David A. and Sarno, Lucio.
"Nonlinear Mean-Reversion in Real Exchange Rates: Towards a Solution to the Purchasing Power Parity Puzzles." International Economic Review, November 2001, 42(4), pp. 1015-42.

Tong, Howell. Non-linear Time Series: A Dynamical System Approach. Oxford, UK: Clarendon Press, 1993.