
The Check Float Puzzle

Jeffrey M. Lacker

Although the last few years have seen a dramatic surge in interest in new electronic payment instruments, consumers and businesses in the United States still write checks in vast numbers. Nearly 63 billion checks were written in 1995 according to one estimate, representing 78.6 percent of all noncash payments (Committee on Payment and Settlement Systems of the central banks of the Group of Ten countries 1995). Check use has continued to expand in recent years, despite the increased use of debit cards and the automated clearinghouse; the per capita number of checks written grew at an average annual rate of 1.3 percent from 1991 to 1995. Moreover, forecasts call for check use to remain around current levels for the foreseeable future (Humphrey 1996). Because the social costs associated with the use of paper checks constitute the majority of the real resource costs of the payment system—65.4 percent according to David Humphrey and Allen Berger (1990)—it will be important to continue to seek improvements in the efficiency of the check system in the years ahead.

The efficiency of check clearing is affected by the arrangements governing presentment and payment. These arrangements have a feature that is, for economists, puzzling. Helen writes a check to John for, say, $100. When the check is ultimately presented to Helen's bank for payment, the bank pays $100 and deducts $100 from Helen's account. What is surprising, from an economist's point of view, is that the bank pays the same amount, $100, no matter how long it took for the check to be presented. This implies that John's bank earns an additional day's interest by getting the check to Helen's bank one day sooner. This feature is puzzling because it is difficult to identify any significant social benefits to Helen or Helen's bank from getting a check from John's bank one day sooner; certainly nothing approaching the magnitude of one day's interest.
Helen Upton deserves grateful thanks for research assistance. Gayle Brett, Andreas Hornstein, Tom Humphrey, Ned Prescott, Marsha Shuler, and John Weinberg provided helpful comments on an earlier draft, but the author remains solely responsible for the contents of this article. The views expressed do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System.

Federal Reserve Bank of Richmond Economic Quarterly Volume 83/3 Summer 1997

Check float is the time between when a check is tendered in payment and when usable funds are made available to the payee (John in our example).1 Because John and his bank bear the opportunity cost of foregone interest until the check is presented, they have an incentive to minimize the float. But check float provides interest income for Helen and her bank. Under current arrangements Helen and her bank implicitly reward John and his bank for reducing check float: Helen's bank stands ready to turn over its float earnings, so John's bank has an incentive to capture those float earnings by accelerating presentment. Another way to state the puzzle is that the benefits to Helen and her bank do not seem to justify the incentive provided to John and his bank to minimize check float. For this reason I call it the "check float puzzle."

The resolution of this puzzle is of more than intellectual interest. Because collecting banks forgo interest earnings on the checks in their possession, they have a strong incentive to present them as quickly as possible in order to minimize the interest foregone. Collecting banks are motivated to incur significant real resource costs to accelerate the presentment of checks. Check processors, including the Federal Reserve Banks, routinely compare the cost of accelerating presentment to the value of the float. Checks are sorted at night and rapidly shipped across the country.
But if there is little or no social benefit to accelerating the presentment of checks, then much of the real resource cost associated with check processing and transportation would represent waste from the point of view of the economy as a whole. It may be possible to alter this puzzling arrangement and improve the efficiency of the payment system.

The check float puzzle can be directly attributed to the fact that the laws and regulations governing check clearing mandate par presentment; the payor owes the face value of the check, no matter when the check arrives. Par presentment implies that the real present discounted value of the proceeds of clearing the check is larger the faster the check is presented. Par presentment essentially fixes the relative monetary rewards to alternative methods of clearing, taxing slower methods of clearing relative to faster methods. As with any regulation that fixes relative prices, there is the potential to distort resource allocations. In this article I argue that the distortion appears to be significant.

This is only part of the story, however. There could be offsetting benefits that make par presentment a good thing. To justify current arrangements there would have to be social benefits of clearing checks quickly that payees and their banks—the ones deciding how fast to clear the check—do not take into account.

The check float puzzle is of interest to the Federal Reserve System (the Fed), both as payment system regulator and as the largest processor of checks. In the 1970s the Federal Reserve Banks established a number of Remote Check Processing Centers (RCPCs) around the country with the avowed goal of accelerating the presentment of checks (Board of Governors of the Federal Reserve System 1971, 1972). Critics have argued recently that Federal Reserve operations should be consolidated to take advantage of economies of scale in check sorting (Benston and Humphrey 1997). But closing down Fed offices could increase the amount of time it takes to collect some checks. Should this result be counted against the decision to close an office? More generally, when performing a cost-benefit analysis of alternative payment system arrangements, what value should be placed on changes in the speed of check collection?

1 This use of the word float follows Humphrey and Berger (1990, p. 51). The reader should be aware that some writers use the term float in a narrow sense to refer to the time between when the payee is credited and the payor is debited; see, for example, Veale and Price (1994).

Check Float

A few words about how check clearing works will be useful as background. Checks provide a simple arrangement for making payments by transferring ownership of book-entry deposits. Helen (the "payor") writes a check and gives it to John (the "payee"). John deposits the check in his bank, which then initiates clearing and settlement of the obligation.

A check is a type of financial instrument or contingent claim. It entitles the person or entity named on the check, the payee, to obtain monetary assets if the check is exchanged in accordance with the governing laws and regulations. One noteworthy feature of the check is that the holder of the check is entitled to choose when the check is exchanged for monetary assets. In other words, the check represents a demandable debt.

John's bank has a number of options available for getting the check to Helen's bank for presentment. John's bank could present directly, transporting the check itself or by courier to Helen's bank. Alternatively, the check could be presented through a clearinghouse arrangement in which a group of banks exchange checks at a central location.
Another option is to send the check through a correspondent bank that presents the check in turn to Helen's bank. Or the check could be deposited with a Federal Reserve Bank, which then presents the check to Helen's bank. These intermediary institutions could themselves send the check through further intermediaries, such as clearinghouses, other correspondent banks, or other Reserve Banks.

The length of time it takes to present a check depends on where the check is going and on how John's bank decides to get it there. First, the checks received by John's bank during the business day are sorted based on their destination. Sorting generally occurs during the early evening hours. Afterward, many checks can be presented to the paying bank overnight. A check drawn on a nearby bank might be presented directly early the next morning. A group of neighboring banks that consistently present many checks to each other might find it convenient to organize a regular check exchange or clearinghouse in which all agree to accept presentment at a central location. Checks drawn on local clearinghouse banks can generally be presented before the next business day.

For checks drawn on other nearby banks it might be advantageous to clear via a third party, such as a check courier, a correspondent bank, or the Federal Reserve. A third-party check processor posts a deadline, usually late in the evening, by which local checks must be deposited in order to be presented the next day. Third parties also clear checks drawn on distant banks. Often such checks can be presented by the next day as well, especially checks drawn on banks located in cities with convenient transportation links. For checks drawn on remote and distant locations, however, an additional day or two may be needed to get the check where it is going.
For example, a check drawn on a bank in Birmingham, Alabama, and deposited at the Federal Reserve Bank of Richmond is usually presented to the Birmingham bank in one day, while a check drawn on a bank in Selma, Alabama, is usually presented in two days.

When does John's bank collect funds from Helen's bank? If the two banks do not have an explicit agreement providing otherwise, Helen's bank is obligated to pay John's bank on the day her bank receives the check, provided it is received before the appropriate cutoff time. If the check is presented by a Federal Reserve Bank, the cutoff time is 2:00 p.m.; if anyone else presents the check, the cutoff time is 8:00 a.m. Helen's bank is obligated to pay by transfer of account balances at a Reserve Bank or in currency; in practice Reserve Bank account balances are the rule. Checks presented after the cutoff are considered presented on the following business day.

A majority of the checks in the United States are presented in time for payment the next business day. According to a recent survey by the American Bankers Association (1994), over 80 percent of local checks are presented within one business day, while only about half of nonlocal checks are presented within one business day (Table 1). Over 90 percent of the dollar volume of checks cleared through the Federal Reserve are presented within one business day.

What's the Puzzle?

The puzzle is that the paying bank pays the same nominal amount no matter how many days it takes to clear the check. Helen's bank pays John's bank the face value of the check whether it takes one day, two days, or two weeks to clear. To put it another way, an outstanding check does not earn interest while the check is being cleared. The implication is that clearing a check one day faster allows the presenting bank to earn an extra day's interest. The presenting bank's gain is the paying bank's loss, however; Helen's bank gives up one day's interest. Why are arrangements structured this way?
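The size of the interest transfer at stake can be made concrete with a short calculation. The sketch below is a hypothetical illustration: it discounts a check's face value at the 5.83 percent average overnight rate cited later in the article, and the simple daily compounding convention and function names are assumptions of the example, not anything from the original.

```python
# Discount a check's face value by the overnight interest rate to see
# what one day of float is worth. The rate is the 1995 average overnight
# interbank rate cited in the article; daily compounding is a
# simplifying assumption.
ANNUAL_RATE = 0.0583
DAILY_RATE = ANNUAL_RATE / 365

def present_value(face_value: float, days_to_present: int) -> float:
    """Value today of receiving face_value after the given delay."""
    return face_value / (1 + DAILY_RATE) ** days_to_present

# Par presentment pays the face value whenever the check arrives, so the
# presenting bank gains roughly the one-day discount by arriving a day
# sooner: about a cent and a half on a $100 check.
gain = present_value(100.0, 1) - present_value(100.0, 2)
print(round(gain, 4))
```

Small as that gain looks per check, the article's later aggregate calculation shows it is substantial when summed over billions of checks.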
At a superficial level the answer is transparent. The presentment of checks is governed by the Uniform Commercial Code, the Federal Reserve Act, and Federal Reserve regulations. In their current form, these legal restrictions require that checks presented before the relevant cutoff time be paid at par on the same day.2 The result is that paying banks do not compensate collecting banks for the interest lost while a check clears. Legal restrictions effectively mandate that John's bank is rewarded with an extra day's interest if it clears a check one day faster. The check float puzzle is thus an artifact of legal restrictions that mandate par presentment.

Table 1 Number of Days It Takes to Receive Available Funds on Checks Deposited through Banks' Check Clearing Network
Average Percentage of Item Volume, by Bank Assets in Millions of Dollars

                                 Less than $500   $500 to $4,999   $5,000 or More
Local Checks
  Up to 1 business day                83.7             85.9             93.8
  2 business days                     12.7             11.0              5.9
  More than 2 business days            3.5              3.1              0.3
  Number of banks responding           159               61               29

Nonlocal Checks
  Up to 1 business day                42.2             53.2             65.7
  2 business days                     40.8             31.1             24.3
  More than 2 business days           17.0             15.7             10.0
  Number of banks responding           159               60               26

Source: American Bankers Association (1994).

A deeper puzzle remains, however. Can we identify any economic benefits to Helen and her bank from faster check clearing? Are they large enough to warrant the interest earnings captured by presenting faster? The answer, as I will argue below, appears to be no. Note that it is irrelevant how Helen and her bank divide between them the additional interest earnings due to check float. The question is why Helen and her bank, taken together, would want to compensate John and his bank (or someone presenting the check on their behalf) for presenting the check early.
Similarly, it is irrelevant how John and his bank divide between them the opportunity cost of foregone interest earnings. Taken together, they have an incentive to accelerate the presentment of Helen's check.

2 Under Regulation CC, checks presented by a depository institution before 8:00 a.m. on a business day must either be paid in reserve account balances by the close of Fedwire (currently 6:00 p.m.) or returned (12 CFR 229.36(f)). Under Regulation J, checks presented by a Reserve Bank before 2:00 p.m. on a business day must be settled the same day; the exact time is currently determined by each Reserve Bank's operating circular (12 CFR 210.9(a)).

Some Efficiency Implications of the Allocation of Check Float

The check float puzzle would be merely an intellectual curiosity if it had little or no consequence for real resource allocations. Unfortunately, it appears that the allocation of check float earnings has a substantial effect on real resource allocation.

Consider the situation of John's bank, which has a range of options for clearing Helen's check. Some of these options are likely to differ in the speed with which they get the check to Helen's bank. Some clearing mechanisms might present the check in one day and some, particularly if Helen's bank is located far away, might take two or three days to present. The one-day methods have a distinct advantage for John's bank, because investable funds are obtained one day earlier. At the margin, John's bank is willing to incur real resource costs, in an amount up to one day's worth of interest earnings, in order to clear a check one day faster. If, as I argue below, there is no identifiable social benefit of clearing a check one day faster, then the incremental resources expended to accelerate check collection and capture the interest earnings are wasted from society's point of view. The situation is illustrated in Figure 1.
Check clearing speed is measured in days along the horizontal axis in Figure 1 and is increasing to the right. The position labeled "0" represents checks cleared the day they are first received, the position labeled "1" represents checks cleared one day after they are received, and so on. For a hypothetical check, the bars labeled MPC represent the marginal cost to the payees of clearing a check one day faster; the height for a clearing time of one day is the incremental cost of clearing in one day rather than two, the height for a clearing time of two days is the incremental cost of clearing in two days rather than three, and so on. Since these are real resource costs, they coincide with marginal social costs, so MPC = MSC. The marginal benefit to payees is measured by the horizontal line MPB; the height is the extra interest gained from earlier presentment.3 If MPB exceeds MPC, the check is not being cleared too fast, from the payees' point of view, while if MPC exceeds MPB, the check is being cleared too fast. Payees will choose the fastest method of clearing checks that results in marginal benefits exceeding marginal costs.4 For the checks portrayed in Figure 1, payees will present in one day; the marginal private cost of accelerating presentment in order to clear the same day exceeds the marginal private benefit.

[Figure 1: Marginal social costs (MSC = MPC, bars) and marginal benefits plotted against the number of days to clear, from 0 to 7. The overnight interest rate i equals the marginal private benefit (MPB), drawn as a horizontal line well above the marginal social benefit (MSB). The gap between MSC and MSB at fast clearing speeds is the deadweight loss from one-day clearing.]

3 I abstract from weekends, for which the extra interest would be three times as large as for weekdays.

4 If interest compounds continuously and costs vary continuously with speed, then the payee bank would choose a method for which the marginal cost of accelerating presentment equaled the interest rate (MPB).
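The decision rule just described, clearing at the fastest speed whose marginal benefit still covers the marginal cost, can be sketched in a few lines. The cost figures below are invented for illustration, since the article gives no explicit estimates; only the $0.18 marginal private benefit echoes the article's own rough calculation.

```python
# Payees pick the fastest clearing speed whose marginal benefit covers
# the marginal cost, as in Figure 1. marginal_cost[d] is the incremental
# cost of clearing in d days rather than d + 1; the numbers are made up.
marginal_cost = {1: 0.10, 2: 0.05, 3: 0.02, 4: 0.01}

def chosen_speed(marginal_benefit: float, mc: dict) -> int:
    """Fastest clearing time (in days) worth paying for."""
    for days in sorted(mc):              # try the fastest options first
        if marginal_benefit >= mc[days]:
            return days
    return max(mc) + 1                   # no acceleration is worth it

MPB = 0.18   # a day's float interest on the average check
MSB = 0.01   # near-zero social benefit, as the article argues

print(chosen_speed(MPB, marginal_cost))  # private choice: 1 day
print(chosen_speed(MSB, marginal_cost))  # social optimum: 4 days
```

With these illustrative numbers the private choice is one-day clearing while the social optimum is four days, mirroring the wedge drawn in Figure 1.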
I provide evidence below suggesting that the marginal social benefit of accelerating presentment is actually very small. Figure 1 therefore portrays the marginal social benefit curve MSB as relatively low for one-day clearing. Although the quantities in Figure 1 are not based on explicit empirical estimates, they are selected to illustrate the likely relative magnitudes involved. The socially optimal speed of check clearing in Figure 1 is four days; clearing any faster incurs marginal social costs that are greater than marginal social benefits. The gaps between MSC and MSB between four days and one day—the cross-hatched bars—represent the deadweight social loss associated with the way check float earnings currently are allocated, as compared to a hypothetical arrangement that results in the optimal clearing time. In this sense the deadweight loss is "caused" by our existing check float arrangements.

The value of daily check float provides an upper bound on the incentive to expend resources to accelerate presentment. A rough calculation gives a sense of the potential magnitudes involved. The total value of the checks cleared in 1995 was approximately $73.5 trillion, or an average of $201 billion per day (Committee on Payment and Settlement Systems of the central banks of the Group of Ten countries 1995). The overnight interbank interest rate averaged 5.83 percent that year, which corresponds to 0.016 percent per day. Multiplying this overnight rate by the value of checks cleared yields $32.2 million per day ($201 billion times 0.000160), or $11.7 billion per year. This works out to about $0.18 per check and represents the amount of real resource costs that would willingly be incurred by payees, like John and his bank, to present their checks one day faster. This corresponds to the height of the marginal private benefit line (MPB) in Figure 1.
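The back-of-the-envelope calculation above can be reproduced directly; every input below is a figure quoted in the text, and the variable names are simply labels for this sketch.

```python
# Reproduce the article's rough estimate of the value of daily check float.
value_cleared_1995 = 73.5e12   # total value of checks cleared, dollars
checks_written_1995 = 63e9     # number of checks written in 1995
overnight_rate = 0.0583        # average overnight interbank rate, 1995

value_per_day = value_cleared_1995 / 365        # ~ $201 billion per day
daily_rate = overnight_rate / 365               # ~ 0.016 percent per day
float_per_day = value_per_day * daily_rate      # ~ $32.2 million per day
float_per_year = float_per_day * 365            # ~ $11.7 billion per year
float_per_check = float_per_year / checks_written_1995   # ~ $0.18 per check
```

Each intermediate quantity matches the corresponding figure in the paragraph above to the precision reported there.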
Since payee banks will ensure that MSC does not exceed MPB, it follows that MSC could be as large as $0.18 for the average size check. If, as I argue below, MSB is close to zero, then the cross-hatched bar for day 1 in Figure 1 is likely to be close to $0.18, or $11.7 billion in total. For comparison, Kirstin Wells (1996) estimates that the total cost to banks of processing and handling checks is between $0.15 and $0.43 per item.5 If the marginal social benefits of accelerating presentment by a day are close to zero, then a substantial proportion of bank and payee processing costs could represent socially wasteful expenditures. Moreover, additional resources might be saved by clearing checks in three or more days, as illustrated in Figure 1 by the cross-hatched bars for times to presentment of two and three days.6

5 These estimates are only an upper bound on the relevant cost figures since they include the processing costs associated with receiving checks at paying banks.

6 Note that float earnings (MPB) vary in proportion to the face value of the check, while costs generally do not. Marginal social benefits from reduced fraud losses are probably at least proportional to the face value of the check. Thus if payees are able to choose different clearing methods for different checks, then for large value checks the MPB and the MSB curves will be shifted upward, while the MPC curve will stay fixed. If it is too costly for payees to discriminate between checks, it is the average values of MPB and MSB that are relevant.

The prices of private package delivery services—United Parcel Service (UPS) and Federal Express—provide another rough guide to the cost of accelerating check presentment. The major services offer different delivery speeds at different prices. Assuming that prices in these relatively competitive businesses closely reflect costs, the price of overnight delivery can be compared to the price of slower delivery options to provide a crude estimate of the relative cost of overnight presentment and slower presentment.7 The analogy between check presentment and package delivery is certainly imperfect; check presentment deadlines do not precisely match package company delivery deadlines, the items being shipped have different physical properties, and the package companies are able to track shipments in real time. Nonetheless, there are important similarities that make the comparison useful. Both use the same transportation technologies—airplanes and trucks. Both involve substantial sorting en route. And both process substantial volumes—63 billion checks annually (bundled together in packages) versus over 900 million items annually for Federal Express and 180 million items annually for UPS. In fact, both UPS and Federal Express contract with check processing firms to transport and present checks for them.

7 The analogy assumes that the price of delivery within a certain time frame closely approximates the average cost of delivery within that time frame. One potential weakness of this analogy is the possibility that there is a large fixed cost component and that the price differentials reflect different demand elasticities rather than different average costs. Price differentials are nonetheless limited by incremental and stand-alone costs; for either delivery option, slow or fast, the price must lie above the incremental cost and below the stand-alone cost for prices to be efficient and sustainable; see Weinberg (1994). If the demand for fast delivery is less elastic, as one might expect, then the price for slow delivery will lie close to the incremental cost of slow delivery, in which case the price differential will be no less than the difference in incremental costs.

Table 2 displays sample shipping costs for UPS and Federal Express from Richmond, Virginia, to various locations. The Federal Reserve presents checks to all these locations by 2:00 p.m. the next day at the latest.

Table 2 Shipping Rates from Richmond, Virginia (in dollars)

UPS: Letter

Destination       Next day     Second day   Second day          Third day
                  10:30 a.m.   noon         close of business   close of business
Baltimore         $11.00       $7.50        $5.75               $4.40
Birmingham         12.50        8.00         6.25                4.90
San Francisco      13.50        9.50         7.25                5.80

UPS: One-Pound Package

Baltimore         $14.00       $7.75        $6.25               $4.40
Birmingham         17.25        8.25         6.75                4.90
San Francisco      20.00       10.50         8.25                5.80

Federal Express: One-Pound Package (all locations)

Next day 8:00 a.m.       $47.50
Second day 10:30 a.m.    $22.50
Second day 4:30 p.m.     $ 9.95

Sources: United Parcel Service (1997); Federal Express Corporation (1996).

For UPS letter delivery, delaying delivery by 25 1/2 hours, from 10:30 a.m. the next day to noon the second day, saves over 30 percent of the cost of next-day delivery. Delaying next-day delivery until late the second day (yielding third-day funds availability under current check presentment rules) saves about half the cost, while delaying delivery until late the third day (fourth-day funds availability) saves about 60 percent of the cost. For a one-pound package with UPS, delaying delivery to the third day saves about 70 percent of the costs. For a one-pound package sent via Federal Express, the savings are even larger. Delivery late the second day (third-day funds availability) reduces costs by almost 80 percent. These figures suggest that delaying check presentment could eliminate a substantial portion of check processing and handling costs.

Rough empirical calculations indicate, therefore, that current check float arrangements impose potentially significant social costs on the payment system. Are there offsetting social benefits?

Some Attempts to Explain the Check Float Puzzle

Eliminating Nonpar Presentment

As mentioned above, the presentment of checks is governed by legal restrictions that require that checks be paid at par on the day they are presented (see footnote 2).
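The percentage savings quoted from Table 2 follow directly from the listed rates. The sketch below uses the UPS letter rates for Baltimore; the dictionary keys are ad hoc labels for this example, not UPS service names.

```python
# Savings from slower UPS letter delivery, Richmond to Baltimore,
# relative to next-day 10:30 a.m. delivery (rates from Table 2).
letter_rates = {
    "next day 10:30 a.m.": 11.00,
    "second day noon": 7.50,
    "second day close of business": 5.75,
    "third day close of business": 4.40,
}

base = letter_rates["next day 10:30 a.m."]
savings = {
    option: round(100 * (1 - rate / base))   # percent saved vs. next day
    for option, rate in letter_rates.items()
}
# Matches the text: roughly 32, 48, and 60 percent savings for the
# three slower delivery options.
print(savings)
```

The same arithmetic applied to the one-pound package rates reproduces the roughly 70 percent (UPS) and 80 percent (Federal Express) savings cited in the text.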
Do such legal restrictions serve any efficiency-enhancing role that might justify the inefficiencies caused by excessively rapid check presentment?

The current system of presentment regulations arose over the eight decades since the founding of the Fed. Before the Fed was established in 1914, many banks charged presentment or "exchange" fees on checks sent to them for payment. Some state laws at the time held that a check presented "over the counter" shall be paid at par, but presentment fees could be charged when the collecting bank presented by indirect means, such as by mail. The banks charging presentment fees (so-called nonpar banks) were often small and rural, and they justified their fees as a way of covering the cost of remitting funds by shipping bank notes to the collecting bank.8

8 The term par presentment is generally taken to refer broadly to the right to present by indirect means such as mail or courier service and still receive par.

In drafting the Federal Reserve Act, the Reserve Banks were given the power to clear and collect checks, in part to help attract members to the Federal Reserve System (Stevens 1996). While national banks were required to become members, few state-chartered banks joined the System in the early years. At first the Reserve Banks tried a voluntary clearing system in which they accepted at par only checks drawn on other members who agreed to accept checks at par. This scheme failed to attract enough participants and was abandoned after a year in favor of the somewhat misnamed "compulsory" system in July 1916.9 Under the new scheme Reserve Banks accepted checks drawn on any member banks or on nonmember banks that agreed to accept checks at par. The Reserve Banks campaigned hard to get banks to agree to accept at par and had greater success.
Congress helped by revising the Federal Reserve Act in 1917, adding a provision that no presentment fees could be charged against the Fed, although specifically authorizing "reasonable charges" against other presenting banks. The Reserve Banks thus acquired the unique legal privilege of being able to present at par by indirect means, such as by mail. Membership increased dramatically in the years that followed, and the Reserve Banks were successful in significantly curtailing, though not eliminating, nonpar banking. Presentment fees were effectively eliminated in 1994 when the Fed introduced regulations that mandated same-day settlement for checks presented by 8:00 a.m.

The conventional view is that par presentment regulations were instrumental in allowing the Fed to enter the check clearing business and that this enhanced the efficiency of the check collection system. If so, then eliminating inefficiencies in check collection represents a social benefit that might outweigh the social waste due to excessively fast presentment. One potential explanation of the check float puzzle, then, is that it reflects a side effect of a par presentment regime whose net social benefits are positive.

Two types of claims have been made about the efficiency-enhancing role of par presentment. The first argument, advanced by contemporary observers just after the founding of the Fed, was that presentment fees resulted in wasteful practices on the part of collecting banks seeking to avoid them. After the check is written and accepted in payment, the paying bank has a monopoly on the ability to redeem the check. Paying banks would set charges well above costs to extract rents from collecting banks (Spahr 1926). Payee banks would in turn try to avoid paying what they saw as exorbitant fees. A bank typically would have a network of correspondent banks with whom it exchanged checks.
A correspondent bank would present checks directly on behalf of the sending bank or would send the check on to another correspondent, hoping it had an arrangement for direct presentment. The second correspondent might then send the check further on, and so forth. Checks sometimes traveled circuitous routes as banks sought a correspondent whom they hoped would allow them to avoid presentment charges (Cannon 1901). Such practices, it was asserted, resulted in wasteful shipping costs and inefficient delay in payment.

9 One reason the voluntary scheme failed was the policy of crediting and debiting banks immediately when checks were received. There was a lag before banks were informed of debits, which made reserve management difficult and overdrafts frequent.

A second argument for the efficiency-enhancing role of par presentment is advanced by modern critics of the pre-Fed check collection system. Unilaterally set presentment fees allow a bank to increase retail market share by raising the costs of rival depository institutions (McAndrews and Roberds 1997; McAndrews 1995). Nonpar banking allows a "vertical price squeeze" in which a bank inefficiently raises the price of an upstream input (presentment) purchased by a bank that is a rival in a downstream market (retail deposit-taking).10 Presentment fees are an anticompetitive practice, according to this argument, and the establishment of par presentment eliminated the associated inefficiencies.11

These two arguments fail to explain the check float puzzle. Regarding the first argument, it is not at all obvious that nonpar banking was inefficient. It is important to note that a collecting bank was not completely at the mercy of the paying bank. Collecting banks always had the option of finding a correspondent to present directly on their behalf, thereby avoiding the presentment fee.
Competition between correspondent banks ultimately governed the cost of clearing checks drawn on distant banks and placed a ceiling on the presentment fees banks could charge. Moreover, the occasional circuitous routing of checks is not obviously inefficient, given the necessity of relying on a network of bilateral relationships (Weinberg 1997). It is a common feature of network transportation and communication arrangements; after all, the circuitous routing of telephone calls is not taken as evidence of inefficiency.

Another common feature of network arrangements is the presence of fixed costs. In such settings there typically is a range of prices consistent with efficiency and sustainability. Each participant obviously will prefer to bear as little of the fixed costs as possible. Critics of presentment fees wanted paying banks to bear more of the common costs of check clearing. Defenders of presentment fees wanted collecting banks to bear more of the costs. The par presentment controversy appears to have had more to do with distributional issues than with economic efficiency.

The view that presentment fees can facilitate a vertical price squeeze is based on models that take many important aspects of the institutional arrangements governing check clearing as fixed. Models in which such arrangements are endogenous can have very different predictions. For example, Weinberg (1997) describes a model of check clearing in which outcomes are efficient, even without restrictions on presentment fees. Such models are attractive in this setting because, historically, check clearing has often involved cooperative arrangements between banks, such as clearinghouses. Moreover, the banks most susceptible to a vertical price squeeze by the nonpar banks were located close by, and were the very banks that could present directly. The banks that bore the brunt of presentment fees were those located at a distance and thus least likely to lose retail customers to the paying bank.

10 See Salop and Scheffman (1983) for a basic exposition, and Laffont (1996) and Economides, Lopomo, and Woroch (1996) for applications to network industries.

11 McAndrews (1995) argues that the imposition of any uniform presentment fee would suffice to eliminate this inefficiency.

More to the point, check clearing arrangements provided the same incentives to accelerate presentment both before and after the founding of the Fed. Under state laws and established common law principles, the presenting bank was entitled to immediate payment at par for checks presented over the counter. Thus a bank presenting directly to the paying bank faced the same relative incentives before and after the entry of the Fed into check clearing; getting the check there one day earlier resulted in one day's worth of interest. Over-the-counter presentment served as an anchor for the prices of other means of presentment. It placed a bound on the payee bank's willingness to pay an exchange fee for presenting by mail or to pay a correspondent bank for collecting the check. Neither the paying bank nor the correspondent bank had any incentive to compensate the payee bank for the interest foregone before remitting the check. Thus the relevant property of the par presentment regime predates the Fed's entry into check clearing. The elimination of nonpar presentment cannot explain the check float puzzle.

Reducing Check Fraud

Another possible explanation of the check float puzzle is that clearing checks faster reduces check fraud losses to paying banks and their customers. Helen's bank might be willing to compensate John's bank for getting the checks to them sooner because it reduces the expense associated with check fraud.

There are various ways in which banks and their customers can lose money to check fraud. Someone possessing lost or stolen checks can forge the account holder's signature or the endorsement.
Checks can be altered without the account holder's approval. Counterfeit checks resemble genuine checks and can sometimes be used to obtain funds. Checks can be written on closed accounts. Fraudulent balances can be created through "kiting"—writing a check before covering funds have been deposited.

When Helen's check is presented for payment, her bank can verify the signature and the authenticity of the check and can verify that the account contains sufficient funds. If her bank chooses to dishonor the check, it must initiate return of the check by midnight of the business day following the day the check was presented. The check is then returned to John's bank. If Helen's bank paid the check when it was presented, then a payment is made in the opposite direction when the check is returned. Otherwise Helen's bank returns the check without paying. Note, however, that if Helen's bank returns the check, Helen's bank bears no loss. John and his bank now have a check that was dishonored, and between them they bear the loss (or else seek compensation from Helen). John and his bank can be expected to take into account the effect of the speed of check clearing on the likelihood of their fraud losses. Therefore, the losses experienced by payees and their banks do not help explain the check float puzzle.

The losses that are relevant to our puzzle are those borne by Helen and her bank. They would be willing to compensate John's bank to induce more rapid clearing if that helped reduce their own check fraud losses.12 There are a number of reasons why check fraud losses to the paying bank might be reduced if it received the check faster. Helen's bank may allow the time limit for check returns to elapse before finding out that the check is forged or that Helen has closed her account. Some banks, for example, do not routinely verify signatures. In this case, Helen's bank bears the loss.
Such losses might be lower for checks presented faster. Helen's bank might therefore want to provide an implicit reward to John's bank for rapid presentment. In principle, then, the desire to encourage rapid check clearing in order to discourage check fraud might explain the check float puzzle.

But is the check fraud effect large enough empirically to explain the check float puzzle? Does getting the check to Helen's bank one day faster reduce fraud losses at Helen's bank by enough to justify providing John's bank with one more day's interest on the funds? According to a recent Board of Governors report to Congress (Board of Governors 1996), check fraud losses incurred by U.S. commercial banks, thrifts, and credit unions amounted to $615.4 million in 1995. Some check fraud losses occur to banks in their role as collectors of checks drawn on other banks, and some occur to banks in their role as payors of checks drawn on them. Of the total estimated check fraud loss mentioned above, only about half—$310.6 million—represents losses to banks as payors. The remainder represents losses to banks as collectors. As noted above, only check fraud losses to the payor are directly relevant to the check float puzzle.

The figures just cited are gross losses, however. The Board study reports that depository institutions recovered a total of $256.0 million on past check fraud losses in 1995, although it does not indicate how these recoveries were divided between paying banks and collecting banks. If we take these as estimates of steady-state losses and recoveries, and if we assume that recoveries are the same fraction of gross losses for both collecting banks and paying banks, then paying banks experienced net check fraud losses of $181.4 million in 1995.13 Average net check fraud losses at paying banks therefore amounted to less than 0.0003 cents per dollar in 1995.14 In comparison, one day's interest on the check, at a 5.5 percent annual rate (the current overnight Fed funds rate), is worth 0.015 cents per dollar, more than 50 times the average rate of net check fraud losses at paying banks.

The check fraud loss figure is the average net loss, however. The relevant figure is the marginal effect on net fraud loss of clearing a check one day faster. It could conceivably be the case that, say, the expected fraud loss on a check cleared in two days exceeds the expected loss on a check cleared in one day by 0.015 cents per dollar, the value of the float, even while the average check fraud loss is 0.0003 cents per dollar. Unfortunately, no figures are available that would allow us to estimate marginal net fraud losses directly. However, for the average net expected loss to be as small as 0.0003 cents while the marginal loss associated with clearing a check in two days rather than one is as large as 0.015 cents would require that no more than 2 percent of checks take two or more days to clear.15 That is quite implausible, however, given the figures in Table 1, which show that a substantial portion of checks take two days or more to clear.

12 Figure 1 could be modified to account for the desire of John and his bank to reduce their check fraud losses. The marginal benefit from reducing their expected losses should be added to the marginal private benefit curve, MPB. The same amount should be added to the marginal social benefit curve, MSB, as well, so the net distortion remains the same.

13 Recoveries by paying banks are (50.5%) × ($256.0 million), or $129.2 million, so net losses are $310.6 million minus $129.2 million, or $181.4 million. The resulting figure is conservative in the sense that if check volume is growing, then this procedure underestimates the ratio of recoveries to gross losses.
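As a check on the arithmetic above, the figures can be combined in a few lines of Python; the only assumption beyond the numbers cited in the text is footnote 13's proportional split of recoveries between paying and collecting banks:

```python
# Check fraud losses at paying banks, 1995 (figures from Board of Governors 1996
# and the CPSS payment statistics cited in the text).
gross_total = 615.4e6   # gross check fraud losses, all depository institutions
gross_payor = 310.6e6   # portion borne by banks in their role as payors
recoveries = 256.0e6    # total recoveries on past check fraud losses

# Footnote 13's assumption: recoveries split in proportion to gross losses.
payor_share = gross_payor / gross_total            # about 50.5 percent
net_payor = gross_payor - payor_share * recoveries
print(round(net_payor / 1e6, 1))                   # 181.4 (million dollars)

# Average net fraud loss per dollar of checks written, in cents per dollar.
checks_written = 73.5e12                           # dollar value of 1995 checks
avg_loss = 100 * net_payor / checks_written
print(round(avg_loss, 4))                          # 0.0002 (the text rounds up to 0.0003)

# One day's float at a 5.5 percent annual rate, in cents per dollar.
one_day_float = 100 * 0.055 / 365
print(round(one_day_float, 4))                     # 0.0151
print(round(one_day_float / avg_loss))             # 61: far above the average loss rate
```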
Thus, even though we do not have a direct measure of the marginal expected fraud loss associated with clearing a check one day slower, the evidence strongly suggests that fraud loss at paying banks does not explain the distribution of check float earnings.

Check writers themselves sometimes suffer losses due to check fraud. Perhaps Helen's desire to limit her own check fraud losses makes her and her bank willing to forego the extra interest earnings in order to induce more rapid clearing of her checks. There are two principal methods by which a depositor could lose money due to check fraud. The first arises if Helen fails to inspect her periodic bank statements for forged or unauthorized checks; in that case she can be apportioned some of the loss on grounds of negligence. But the timeliness of check clearing is only marginally important in such cases, since they involve inspecting monthly bank statements. The second involves "demand drafts," one-time pre-authorized checks written by merchants or vendors after taking a depositor's bank account number over the phone. In place of the customer's signature the check is stamped "pre-approved" or "signature on file." Demand drafts are cleared the same way as conventional checks and have many legitimate uses, but they have been used in telemarketing scams. It seems unlikely that the detection and prosecution of such fraud depends significantly on the speed with which demand drafts are cleared. Most cases seem to be discovered when a depositor's bank statement is inspected. Moreover, such fraud affects only demand drafts, and these are a tiny fraction of all checks written.16 So in neither case does fraud loss by check writers appear to be a plausible rationale for the allocation of check float earnings.

There is an additional reason to doubt that fraud losses could ever explain why the collecting bank should lose interest earnings until the check is presented. The relevant interest rate is the nominal overnight rate, and thus will vary directly with expected inflation, other things being equal.

14 Calculated as $181.4 million divided by $73.5 trillion (the dollar value of checks written in 1995 [Committee on Payment and Settlement Systems of the central banks of the Group of Ten countries 1995]) = 0.0003 cents per dollar.

15 Let αi be the fraction of checks (by value) cleared in i days, and let γi be the expected fraud loss on checks cleared in i days. Expected fraud loss is then α1 γ1 + α2 γ2 + · · · = 0.0003. Suppose, hypothetically, that the marginal loss associated with clearing one extra day, γi+1 − γi, is at least 0.015. What values of α1 are consistent with these two assumptions? The most optimistic case, in the sense that the allowable range for α1 is the largest, is one in which all checks clear in either one or two days, because the longer a check takes to clear the larger the expected loss. As long as γi+1 ≥ γi, the best case is for αi to be as small as possible for i ≥ 3, because increasing the weights on the days with larger losses makes it harder to match the average loss figure of 0.0003. Assume therefore that αi = 0 for i ≥ 3. Similarly, the most optimistic assumption to make about γ1 is γ1 = 0, because increasing γ1, the expected loss on the lowest-loss day, just makes it harder to match the average loss figure. Our two postulates are now (1 − α1)γ2 = 0.0003 and γ2 ≥ 0.015, which together imply that 1 − α1 ≤ 0.0003/0.015 = 0.02. Looked at another way, for given fractions αi, how large can γ2 − γ1 be and still satisfy α1 γ1 + α2 γ2 + · · · = 0.0003 and γi+1 ≥ γi? The answer is 0.0003/(1 − α1). For the figures in Table 1 this ranges from 0.0005 to 0.005, or 3.5 to 32.3 percent of the monetary value of one day's worth of float.
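The bound in footnote 15 is easy to verify numerically. The sketch below follows the footnote's most favorable case (γ1 = 0, all checks clearing in one or two days); the "slow shares" in the loop are hypothetical stand-ins, since Table 1 is not reproduced here, chosen because the footnote's 0.0005-to-0.005 range implies slow shares between roughly 0.6 and 0.06:

```python
# Footnote 15's best case: all checks clear in one or two days and g1 = 0,
# so the average loss is (1 - a1) * g2 = 0.0003 cents per dollar.
avg = 0.0003     # average net fraud loss at paying banks, cents per dollar
one_day = 0.015  # one day's float, cents per dollar

# If the marginal loss g2 - g1 = g2 is to be at least one day's float, the
# share of checks taking two or more days can be at most avg / one_day.
max_slow_share = avg / one_day
print(round(max_slow_share, 4))    # 0.02: no more than 2 percent of checks

# Conversely, for a given slow share 1 - a1, the largest admissible g2,
# and that g2 as a fraction of one day's float (shares are hypothetical).
for slow_share in (0.6, 0.06):
    g2 = avg / slow_share
    print(round(g2, 4), round(g2 / one_day, 3))
```

With these hypothetical shares the largest admissible marginal loss runs from about 3.3 to 33.3 percent of one day's float, bracketing the 3.5-to-32.3-percent range the footnote reports from the actual Table 1 figures.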
There is no reason why the additional expected fraud loss associated with clearing a check in two days rather than one should bear any necessary relationship to the inflation rate. Indeed, the inefficiency caused by the fact that checks do not bear interest exactly parallels the traditional welfare cost of anticipated inflation, which is caused by the fact that currency does not bear interest. The inefficiency of currency use arises because people go to excessive lengths to avoid holding it. Similarly, check float arrangements cause banks to go to excessive lengths to avoid holding checks. In both cases the problem is that the rate of return is artificially depressed by inflation. The difference between the two is that, apart from changing the inflation rate, altering the rate of return on currency, say by paying interest, appears to be technologically difficult. In contrast, as I argue below, the technology to alter the rate of return on checks appears to be readily available.17

The Expedited Funds Availability Act

When an account holder deposits a check at a bank, the common banking practice is to place a "hold" on the funds for a number of days until the bank is certain that the check has cleared. The bank customer is not allowed to withdraw the funds until the hold is removed. This practice protects the bank from fraud by shifting some of the risk to the account holder.

16 Legitimate demand drafts probably amount to less than $1 billion a year. Jodie Bernstein, Director of the Bureau of Consumer Protection, reported one estimate that "nine of the current twenty demand draft service bureaus process approximately 38,000 demand drafts weekly, totaling over five million dollars. . . ." In other words, roughly $250 million annually (Bernstein 1996).

17 Reducing inflation to the socially optimal rate would accomplish the desired objective, but I take that as outside the realm of check regulatory policy.
In 1987 Congress passed the Expedited Funds Availability Act (EFAA), which directed the Federal Reserve to promulgate regulations limiting the length of time banks can hold customers' funds. Maximum holds vary from one to five business days, depending on the type of check and whether or not it is a "local" item. Legal restrictions on the duration of holds can be an incentive to accelerate check presentment: after the hold is released the funds may be withdrawn, and the bank may suffer a loss if the check is returned unpaid. Does this explain the check float puzzle?

The answer is clearly no. Congress enacted the EFAA in response to concerns that holds were longer than necessary to ascertain whether a check would be returned unpaid. The EFAA explicitly instructs the Federal Reserve Board to reduce the allowable time periods to the minimum consistent with allowing a bank to "reasonably expect to learn of the nonpayment of most items." The hold periods, in other words, are tailored to the speed with which checks are actually being collected, not the other way around. The EFAA constrains the division of the risk of nonpayment between the payee and the payee's bank, but it does nothing to alter the incentive both parties have to take steps to reduce their joint losses from fraud. The EFAA does increase the ability of payees to perpetrate fraud on their banks, and so provides an extra incentive for payee banks to accelerate presentment. If the EFAA artificially discouraged faster presentment, such discouragement might explain the need for the compensating stimulus provided by the current check float arrangement. But if anything, the EFAA heightens the incentive to accelerate presentment.

What Can Be Done?

I conclude that the social benefit of accelerating check presentment is negligible in comparison to the reward to collecting banks in the form of captured interest earnings.
Apparently this feature of the check clearing system has no identifiable economic rationale. Without any offsetting social benefits, we are left with just the social costs described earlier.

Is there an alternative to the current arrangements governing check float? Is there a practical way to eliminate the artificial incentive to accelerate the presentment of checks? After all, it could be that the current scheme, despite its deadweight social costs, is superior to all feasible alternatives. Is there a feasible alternative that avoids those costs?

Consider first what properties an ideal arrangement would possess. In an ideal arrangement the value to John's bank of presenting a check one day sooner would equal the real value to Helen and Helen's bank of receiving the check one day sooner. Fraud losses (to the payor bank) aside, John's bank should implicitly earn interest on the check while it is being cleared. Helen's bank should implicitly pay interest to John's bank from the time at which John's bank received the check. John's bank would then face no artificial inducement to accelerate presentment. Note that John's bank still has an incentive to clear the check, since fraud losses to the payee bank are likely to increase the longer the check takes to clear. But the magnitude of the incentive to accelerate presentment would match the social value of accelerating presentment. Check fraud losses to the payor bank constitute an additional social value of accelerating presentment. To account for these precisely, the implicit interest rate on checks should be reduced by the marginal effect of delaying presentment on payor fraud losses, resulting in a slight penalty for delaying presentment. As noted previously, however, the marginal effect on payor bank fraud losses is likely to be quite small when compared with the interest earnings at stake.
In an ideal arrangement, therefore, we should see checks in the process of collection implicitly bearing interest at close to the overnight rate. Implementing such an arrangement would require revising the current par presentment regulations. One possibility is to have the paying bank pay explicit interest on the face value of the check from the date the check was first accepted by the bank of first deposit. The interest would be paid directly to the presenting institution, at a rate determined by reference to a publicly available overnight rate. Regulations would stipulate that upon presentment, the paying bank is accountable for the amount of the check plus accrued interest from the date of first deposit. The regulation would constrain only the obligations of the paying bank. If the presenting bank was acting on behalf of some other bank, the banks could divide the interest between them as they saw fit; presumably each bank would receive the interest accruing while the check was in its possession. Similarly, the regulation would be silent on the division of interest between the bank of first deposit and its customer.

A second possibility is for checks to be payable at par only at a fixed maturity date—say, five business days after the check is first deposited in a bank. Checks presented before five business days would be discounted, again using a publicly available overnight interest rate as reference. After five days an outstanding check would accrue interest at the reference rate. The maturity date would determine the implicit division of float revenues between paying banks and payee banks.

The main practical difficulty facing any such scheme is recording and transmitting the date on which the check is first deposited. Currently, the Federal Reserve's Regulation CC requires the bank at which a check is first deposited to print certain information on the back of the check (the indorsement), including the date.
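The two settlement rules just described can be sketched in a few lines. Everything here is illustrative rather than existing regulation: the 5.5 percent reference rate, simple daily interest, the five-day maturity, and the function names are all assumptions for the sake of the example.

```python
# Two possible settlement rules for an implicit-interest regime (illustrative).
RATE = 0.055 / 365   # per-day rate derived from a public overnight rate

def settle_with_interest(face, days_since_deposit):
    """Variant 1: the paying bank owes face value plus interest accrued
    since the date of first deposit (simple daily interest)."""
    return face * (1 + RATE * days_since_deposit)

def settle_at_maturity(face, days_since_deposit, maturity=5):
    """Variant 2: the check is payable at par only at a fixed maturity;
    it is discounted before maturity and accrues interest afterwards."""
    return face * (1 + RATE * (days_since_deposit - maturity))

# Settlement amounts on a $100 check presented 1, 3, 5, and 7 days
# after first deposit, under each variant.
for days in (1, 3, 5, 7):
    print(days,
          round(settle_with_interest(100.0, days), 4),
          round(settle_at_maturity(100.0, days), 4))
```

Under either rule the presenting bank no longer gains a day's interest by presenting a day sooner: the float it gives up while holding the check is returned through the settlement amount, so the artificial incentive to accelerate presentment disappears.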
This information is used mostly in the process of returning checks and is not machine-readable. Some information on a check is machine-readable, however. At some point early in the clearing process, the dollar amount is printed in magnetic ink along the bottom of the check, beside the paying bank's routing number and the payor's bank account number. The resulting string of digits and symbols—the so-called MICR line at the bottom of the check—is read automatically as the check is subsequently processed. One possibility would be to expand the MICR coding format to include the date of first deposit as well. The implicit interest obligation could then be handled with the same automated techniques used to handle the face amount. Although this alternative regime would certainly involve transitional costs, the figures discussed above indicate that the potential benefits are substantial—perhaps as large as billions of dollars per year.

Note that this proposal would have the side benefit of facilitating improved contractual arrangements between banks and their customers by giving them more readily usable information on when a check was cleared. Banks could use this information to penalize kiting if they so desired, or to charge check writers for the interest paid to the bank presenting a check. Such arrangements would be a matter of contractual choice for banks and their customers, however, and would not affect the desirability of the proposal.

In the Meantime, There Are Some Important Implications

Until we establish a more rational scheme for allocating check float earnings, payment system policymakers apparently face a dilemma. They are often asked to contemplate changes to the payment system that would alter the speed with which some checks are cleared. One example is a proposal to close down the Fed's Remote Check Processing Centers (Benston and Humphrey 1997). This would likely slow down the collection of some checks.
Another example is a proposal for electronic check presentment (ECP), which involves transmitting the encoded information on checks electronically to paying banks (Stavins 1997). In this case, checks would likely be collected somewhat faster on average. How should such changes in check float affect the decision?

One point of view (the "zero-sum view") asserts that the change in float earnings is merely a transfer: the gain realized by payees and their banks from faster presentment is exactly matched by a corresponding loss to payors and their banks. In this view, changes in float should be ignored in policy analysis; that is, in a social cost-benefit analysis, no weight should be given to changes in float. This view accords with the evidence cited above that the social benefit of accelerating check clearing is negligible. The danger in this approach, however, is that payment system participants respond to the (distorted) incentives embodied in the current arrangements; consequently their reactions could be misgauged. Imagine that the Fed is considering a change that would increase check float. For example, suppose that the closure of an RCPC slowed down the collection of some deposited checks. For the checks the Fed continues to process, the slowdown would reduce the amount of resources wasted on accelerating presentment. But it would do nothing to reduce the incentive banks have to accelerate presentment. Banks could respond by clearing directly themselves, or through private service providers rather than through the Fed, in order to minimize float. If the social cost of clearing checks outside the Fed is greater than the cost of clearing them through the Fed, then there might be no net social savings from closing the RCPC, since the increase in private costs might outweigh the decrease in Fed costs. A cost-benefit analysis that ignored the effect of changes in float could be seriously misleading.
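The pitfall can be made concrete with a stylized calculation; every dollar figure below is invented for illustration. The point is that the float itself is excluded as a pure transfer, while the resource costs of any switching the float induces are counted in full:

```python
# Stylized analysis of closing an RCPC (all figures hypothetical, $ millions/yr).
fed_cost_saving = 10.0     # Fed resource costs saved by closing the center
private_extra_cost = 12.0  # added resource cost as banks switch to private
                           # clearing channels to avoid the increased float
float_transfer = 8.0       # interest earnings reallocated between banks

# Naive zero-sum accounting: ignore the float AND the behavior it induces.
naive_net_benefit = fed_cost_saving
print(naive_net_benefit)   # 10.0: closure looks attractive

# Correct accounting: still exclude the float (a pure transfer), but count
# the real resources consumed by the induced switching.
net_benefit = fed_cost_saving - private_extra_cost
print(net_benefit)         # -2.0: closure would raise social costs
```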
An alternative approach (the "empirical view") would treat the overnight interest rate as the social value of accelerating presentment, as if there were some as-yet-undiscovered social benefit of reducing check float. This approach has the advantage of aligning policy objectives with the incentives faced by private participants in the check collection industry. The danger in this approach is the risk of favoring speedy check presentment when it is not really in society's best interest. Suppose again that the Fed is considering closing an RCPC, but that no banks would switch to other means of clearing checks. Under the empirical view, the increase in float would be counted against closing the facility. It could turn out that, disregarding the increased float, the net social benefits of closing the facility are positive (due to the resources saved by clearing more slowly) but become negative when the value of the lost interest earnings to payee banks is deducted.18 In this case, the empirical approach recommends against closing the facility even though it really should be closed. By adopting the empirical view, policymakers would be joining in the private sector's wasteful pursuit of float.

The dilemma is more apparent than real, however. Policymakers should focus on the implications of the proposals they are considering for real resource costs and should exclude the purely pecuniary impact of reallocations of check float. But they should keep in mind that although float does not reflect any direct social benefits, it does affect behavior. To the extent that reallocations of float induce behavioral changes that alter real resource use, the induced changes in resource costs must be included in any cost-benefit analysis.

Current float arrangements can be thought of as a tax, paid by presenting banks, on checks cleared by slower methods, with the proceeds automatically passed on to payor banks. The proper treatment of a tax in cost-benefit analysis is well understood.
Absent other interventions, the taxed service (slow clearing) will be undersupplied relative to the untaxed service (fast clearing) for which it is a substitute. If a public entity like the Fed is active in supplying the untaxed good and unilaterally cuts back on its supply, providing more of the taxed good instead, the net effect will depend on the market for the untaxed good. At one extreme, the Fed might have many competitors whose costs and prices are close to the Fed's own. In this case reducing the supply of the untaxed service merely causes customers to switch to competitors, and no improvement in efficiency results. At the other extreme, if the Fed has few competitors in the supply of the untaxed service—no other suppliers have costs close to the Fed's—then customers can be induced to switch to the socially superior taxed good. Here, slowing down Fed check collection does not drive customers away; check collection does indeed slow down, saving social resources. Note that this outcome could increase costs to Fed customers, in the sense that Fed fees plus float costs increase, even though social costs decrease.

In the decision to close an RCPC, for example, the analysis should take into account the effect of increased float on depositing banks' check clearing choices. To the extent that increased float causes banks to switch to other providers—private check clearing services or correspondent banks, for example—the increase in the real resource costs of the alternative check clearing operations should be counted against any savings in real resource costs associated with Fed check clearing.

18 The float that Reserve Banks experience is passed back to depositing banks. If, for example, 97 percent of a particular class of checks is cleared in one day and the rest in two days, on average, depositors receive 97 percent of their funds in one day and the rest in two days.
The change in float earnings itself should be excluded from the calculation of net social benefits, but the effect on banks' choices must be taken into account.

In evaluating ECP, the float benefits to payees from faster presentment should not count as a social benefit, as Joanna Stavins (1997) correctly points out. If ECP is offered under current par presentment regulations, however, the float arising from faster presentment (assuming it is passed back to depositing banks, as is current Fed practice) would be an artificial stimulus to the adoption of ECP. Even if ECP is offered at prices that are efficient relative to its real resource costs, so long as the extra float earnings from faster presentment are passed on to payees, ECP may be adopted where it is not socially efficient.19 For some checks ECP might be more costly than physical presentment, and yet customers would prefer ECP because of the benefits of reduced float. The Fed should avoid deploying ECP in market segments where it would increase social costs, even if it would decrease Fed customers' costs (including float costs).

More generally, the check float problem can distort the process of technological innovation by artificially promoting techniques that accelerate check presentment. Payment system participants have an incentive to find new ways to reduce their holdings of non-interest-bearing assets, like currency and checks (Lacker 1996).

19 ECP with check truncation is often said to involve "network effects," because such a scheme would be most valuable if universally adopted, eliminating all paper presentment. The same logic applies, however: the set of prices that are efficient and sustainable relative to resource costs alone will not in general coincide with the set of prices that are efficient relative to the aggregate of resource costs and float costs. See Weinberg (1997) regarding network effects in payment arrangements.
This incentive is merely an artifact of the inflation tax, and thus does not represent any fundamental social benefit (Emmons 1996). The check float problem is another example of the way inflation can distort the payment system.

The check float puzzle has important implications for the role of the Federal Reserve in the check clearing industry. The Fed currently enjoys certain competitive advantages over private participants. One involves the disparity in presentment times mentioned above: the Fed can present until 2:00 p.m. for same-day funds, while others must present before 8:00 a.m. for same-day funds (unless varied by agreement). This disparity gives the Fed a competitive advantage, because it can offer depositors a later deposit deadline at a cost lower than that of a private provider. Such an advantage would allow the Fed, should it so desire, to improve the efficiency of check collection by slowing down presentment and increasing check float beyond the level the private market would provide.20 It gives the Fed an ability to offset some of the deleterious side effects of par presentment regulations. Note that this outcome is the opposite of the original justification for the Fed's role in check clearing offered by opponents of presentment fees, who claimed that the Fed's entry would result in more rapid check clearing.

The Fed's advantage over private providers of check clearing services has been eroding over time. In 1980 Congress passed the Monetary Control Act, which required the Fed to charge prices for its payment services comparable to those that would be charged by private providers. Effective in 1994, Regulation CC was amended to allow "same-day settlement"—private presentment as late as 8:00 a.m. for same-day funds. Because of these changes and other factors, the Fed's market share has been steadily eroding in recent years (Summers and Gilbert 1996).
Payment system efficiency no doubt helped motivate this movement toward a "level playing field." And yet these changes have reduced the Fed's ability to unilaterally improve the efficiency of check collection by slowing down check presentment. Now is a good time, therefore, to reexamine the Fed's role in the check collection industry and the payment system more broadly.21 As noted earlier, the rationale for the Fed's original entry into check collection was to improve efficiency.

20 To see this, consider the following simplified situation. The Fed faces private providers with costs of γ1 for clearing a check in one day and γ2 for clearing a check in two days. The value of one day's float on a typical item is i. Under competitive conditions the cost to a depositor is γ1 + i for clearing privately in one day, and γ2 + 2i for clearing privately in two days. Clearing in two days is socially optimal, so γ1 > γ2, there being no other relevant social costs or benefits associated with check clearing. But under the current regime checks are collected (inefficiently) in one day; that is, γ1 + i < γ2 + 2i, or γ1 − i < γ2. The Fed offers check clearing, but only two-day clearing. Suppose the Fed's cost of clearing in two days is δ2, and the Fed charges p per item. Cost recovery requires (a) p ≥ δ2. Can the Fed attract depositors that are now clearing privately in one day? This requires (b) p + 2i < γ1 + i. Together, (a) and (b) are feasible if δ2 < γ1 − i < γ2. The Fed's presentment time advantage implies that the Fed can present checks in a given number of days at lower cost than the private sector can present checks in the same number of days; in other words, δ2 is strictly less than γ2, as required. Thus the Fed's presentment time advantage allows the Fed to lengthen check clearing time from one day to two days in this example, improving the efficiency of check collection.
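The inequalities in footnote 20 can be checked with a numerical example; the parameter values below are hypothetical, chosen only to satisfy the stated conditions.

```python
# Numerical check of footnote 20 (all parameter values hypothetical).
g1 = 1.00   # private cost of clearing a check in one day
g2 = 0.80   # private cost of clearing in two days; g1 > g2, so two days is optimal
i = 0.30    # value of one day's float per item
d2 = 0.60   # Fed's cost of clearing in two days (presentment time advantage)

# Under the current regime checks clear (inefficiently) in one day:
assert g1 + i < g2 + 2 * i       # equivalently, g1 - i < g2

# The Fed offers only two-day clearing, at price p per item.
p = 0.65
assert p >= d2                   # (a) cost recovery
assert p + 2 * i < g1 + i        # (b) depositors are better off switching

# (a) and (b) are jointly feasible exactly because d2 < g1 - i < g2.
assert d2 < g1 - i < g2
print(round(g1 - g2, 2))         # resource saving per item from slower clearing
```

Note that δ2 < γ2 alone does not guarantee feasibility; the example works because the Fed's cost advantage is large enough that δ2 also falls below γ1 − i.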
But the par presentment regulations that once aided the Fed's entry are now clearly an impediment to efficiency. Can the Fed still play an efficiency-enhancing role in the presence of par presentment regulations? Can the Fed implement technological improvements to the payment system without removing inefficient par presentment regulations? These questions should be at the heart of any reexamination of the Fed's role in the payment system.

21 In October 1996 Federal Reserve Chairman Alan Greenspan appointed a committee, headed by Board Vice Chair Alice M. Rivlin, to review the Fed's role in the payment system.

REFERENCES

American Bankers Association. "1994 ABA Check Fraud Survey." Washington: American Bankers Association, 1994.
Benston, George J., and David B. Humphrey. "The Case for Downsizing the Fed," Banking Strategies (1997), pp. 30–37.
Bernstein, Jodie. "Demand Draft Fraud." Prepared statement before the House Banking Committee. Washington: Federal Trade Commission, 1996.
Board of Governors of the Federal Reserve System. "Report to the Congress on Funds Availability Schedules and Check Fraud at Depository Institutions." Washington: Board of Governors of the Federal Reserve System, 1996.
Board of Governors of the Federal Reserve System. "Guidelines Approved for New Check-Clearing System," Federal Reserve Bulletin, vol. 58 (February 1972), pp. 195–97.
Board of Governors of the Federal Reserve System. "Statement of Policy on Payments Mechanism," Federal Reserve Bulletin, vol. 57 (June 1971), pp. 546–47.
Cannon, James G. Clearing-Houses: Their History, Methods and Administration. London: Smith, Elder, & Co., 1901.
Committee on Payment and Settlement Systems of the central banks of the Group of Ten countries. "Statistics on Payment Systems in the Group of Ten Countries." Basle: Bank for International Settlements, 1995.
Economides, Nicholas, Giuseppe Lopomo, and Glenn Woroch. "Strategic Commitments and the Principle of Reciprocity in Interconnection Pricing." New York: Stern School of Business, NYU, 1996.
Emmons, William R. "Price Stability and the Efficiency of the Retail Payments System," Federal Reserve Bank of St. Louis Review, vol. 78 (September/October 1996), pp. 49–68.
Federal Express Corporation. FedEx Quick Guide. Memphis: Federal Express Corporation, 1996.
Humphrey, David B. "Checks Versus Electronic Payments: Costs, Barriers, and Future Use." Manuscript. Florida State University, 1996.
Humphrey, David B., and Allen N. Berger. "Market Failure and Resource Use: Economic Incentives to Use Different Payment Instruments," in David B. Humphrey, ed., The U.S. Payment System: Efficiency, Risk and the Role of the Federal Reserve. Boston: Kluwer, 1990.
Lacker, Jeffrey M. "Stored Value Cards: Costly Private Substitutes for Government Currency," Federal Reserve Bank of Richmond Economic Quarterly, vol. 82 (Summer 1996), pp. 1–25.
Laffont, Jean-Jacques, Patrick Rey, and Jean Tirole. "Network Competition: I. Overview and Nondiscriminatory Pricing." Manuscript, 1996.
McAndrews, James J. "Commentary," Federal Reserve Bank of St. Louis Review, vol. 77 (November/December 1995), pp. 55–59.
McAndrews, James J., and William Roberds. "A Model of Check Exchange." Manuscript. Philadelphia: Federal Reserve Bank of Philadelphia, 1997.
Salop, Steven C., and David T. Scheffman. "Raising Rivals' Costs," American Economic Review, vol. 73 (May 1983, Papers and Proceedings), pp. 267–71.
Spahr, Walter Earl. The Clearing and Collection of Checks. New York: Bankers Publishing Co., 1926.
Stavins, Joanna. "A Comparison of Social Costs and Benefits of Paper Check Presentment and ECP with Truncation," New England Economic Review (July/August 1997), pp. 27–44.
Stevens, Ed. "The Founders' Intentions: Sources of the Payments Services Franchise of the Federal Reserve Banks." Cleveland: Financial Services Working Paper Series, 1996.
Summers, Bruce J., and R. Alton Gilbert. "Clearing and Settlement of U.S. Dollar Payments: Back to the Future?" Federal Reserve Bank of St. Louis Review, vol. 78 (September/October 1996), pp. 3–27.
United Parcel Service. "Quick Cost Calculator." Available: http://www.ups.com [April 1997].
Veale, John M., and Robert W. Price. "Payment System Float and Float Management," in Bruce J. Summers, ed., The Payment System: Design, Management, and Supervision. Washington: International Monetary Fund, 1994.
Weinberg, John A. "The Organization of Private Payment Networks," Federal Reserve Bank of Richmond Economic Quarterly, vol. 83 (Spring 1997), pp. 25–43.
Weinberg, John A. "Selling Federal Reserve Payments Services: One Price Fits All?" Federal Reserve Bank of Richmond Economic Quarterly, vol. 80 (Fall 1994), pp. 1–23.
Wells, Kirstin E. "Are Checks Overused?" Federal Reserve Bank of Minneapolis Quarterly Review, vol. 20 (Fall 1996), pp. 2–12.

A Review of the Recent Behavior of M2 Demand

Yash P. Mehra

It is now known that the public's M2 demand experienced a leftward shift in the early 1990s. Since about 1990 M2 growth has been weak relative to what is predicted by standard money demand regressions. It is widely believed that this shift in money demand reflected the public's desire to redirect savings flows from bank deposits to long-term financial assets including bond and stock mutual funds. Recognizing this, policymakers have not paid much attention to M2 in the short-run formulation of monetary policy since July of 1993.1

In this article, I review the recent behavior of M2 demand. I then evaluate the hypothesis that the recent shift in M2 demand can be explained if we allow for the effect of the long-term interest rate on money demand. The long-term interest rate supposedly captures household substitutions out of M2 and into long-term financial assets.
The author wishes to thank Robert Hetzel, Roy Webb, and Alex Wolman for many helpful comments. The views expressed are those of the author and not necessarily those of the Federal Reserve Bank of Richmond or the Federal Reserve System.

1 See Greenspan (1993). The issue of the stability of money demand is central in assessing M2's usefulness for formulating policy. If M2 weakens, policymakers have to determine whether this weakness has resulted from a shift in money demand or whether it indicates that the Fed has been supplying an inadequate amount of money to the economy. If it's the latter, weak M2 growth may portend weakness in the economy. To remind readers, the current definition of M2 includes currency, demand deposits, other checkable deposits, savings deposits, small-denomination time deposits, retail money market mutual funds, and overnight repurchase agreements and Eurodollar deposits.

Federal Reserve Bank of Richmond Economic Quarterly Volume 83/3 Summer 1997

The evidence here indicates that a standard M2 demand regression augmented to include the bond rate spread can account for most of the "missing M2" since 1990 if the estimation includes the missing-M2 period. Furthermore, changes in the missing M2 are highly correlated with changes in household holdings of bond and stock mutual funds from 1990 to 1994. This evidence lends credence to the view that the steepening of the yield curve in the early 1990s encouraged households to substitute out of M2 and into other financial assets and that part of this missing M2 ended up in bond and stock mutual funds.

However, a few caveats suggest caution in interpreting the twin role of the long-term interest rate and the growth of the mutual fund industry in influencing money demand. One is that the bond rate has no predictive content for M2 demand in the pre-missing M2 period.
And during the past two years, 1995 and 1996, actual M2 growth has been in line with that predicted by the money demand regression estimated either with or without the bond rate. Hence, the result that the bond rate can account for the missing M2 from 1990 to 1994 is interesting, but it does not necessarily indicate a systematic influence of the yield curve on M2 demand. The other caveat is that household holdings of bond and stock mutual funds continued to increase in 1995 and 1996, and that increase has not come at the expense of weak M2 growth. In fact, the strong correlation noted above between the missing M2 and household holdings of bond and stock mutual funds disappears when post-'94 observations are included. This result indicates that changes in household holdings of bond and stock mutual funds do not necessarily imply instability in M2 demand.

Taken together, one interpretation of this evidence is that special factors, such as the unusual steepening of the yield curve in the early '90s and the increased availability and liquidity of mutual funds since then, caused the public to redirect part of savings balances from bank deposits to bond and stock mutual funds. Those factors probably have not changed the character of M2 demand beyond causing a one-time permanent shift in the level of M2 balances demanded by the public.2 The result that the leftward shift in M2 demand ended two years ago should now be of interest to monetary policymakers.

The plan of this article is as follows. Section 1 presents the standard M2 demand regression and reviews the econometric evidence indicating the existence of the missing M2 since 1990. Section 2 presents an explanation of the missing M2 and Section 3 examines the role of the bond rate in explaining the missing M2. Section 4 contains concluding observations.
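Throughout the article, the "missing M2" is measured by cumulating quarterly growth prediction errors into a gap between the predicted and actual levels of M2. A minimal sketch of that bookkeeping, using made-up growth rates and a made-up starting level rather than the article's data:

```python
# How growth prediction errors cumulate into a "missing M2" level gap.
# All numbers below are illustrative, not the article's data.
actual_growth = [2.0, 1.5, 1.0, 1.8]     # annualized quarterly growth, percent
predicted_growth = [5.0, 4.5, 4.0, 4.2]  # regression predictions, percent

level_actual = level_predicted = 3000.0  # hypothetical starting level, $ billions
for ga, gp in zip(actual_growth, predicted_growth):
    level_actual *= (1 + ga / 100) ** 0.25      # compound one quarter
    level_predicted *= (1 + gp / 100) ** 0.25   # at the annualized rate

gap = level_predicted - level_actual  # dollar overprediction of the M2 level
pct = 100 * gap / level_actual        # "level percent error"
print(round(gap, 1), round(pct, 2))
```

This is the same arithmetic behind statements such as an overprediction "of about $490 billion, or 14 percent" later in the article, just on invented inputs.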
2 Other special factors that have usually been cited are resolution of thrifts by the Resolution Trust Corporation; the credit crunch; the downsizing of consumer balance sheets accomplished by using M2 balances to pay off debt; rising deposit insurance premiums; and the imposition of new, higher capital standards for depositories (resulting in a decreasing proportion of intermediation through the traditional banking sector). But none of these other special factors offers as satisfactory an explanation of the missing M2 from 1990 to 1994 as does the steepening of the yield curve. See Duca (1993), Darin and Hetzel (1994), and Feinman (1994) for a further discussion of these special factors.

1. A STANDARD M2 DEMAND EQUATION AND ITS PREDICTIVE FAILURE IN THE EARLY 1990S

An M2 Demand Model

The money demand model that underlies the empirical work here is in error-correction form and is reproduced below (Mehra 1991, 1992):

$$m_t = a_0 + a_1 y_t + a_2 (R - RM2)_t + U_t \qquad (1)$$

and

$$\Delta m_t = b_0 + \sum_{s=1}^{n_1} b_{1s}\,\Delta m_{t-s} + \sum_{s=0}^{n_2} b_{2s}\,\Delta y_{t-s} + \sum_{s=0}^{n_3} b_{3s}\,\Delta (R - RM2)_{t-s} + \lambda U_{t-1} + \epsilon_t, \qquad (2)$$

where m is real M2 balances; y is real GDP; R is a short-term nominal interest rate; RM2 is the own rate on M2; U and ε are the random disturbance terms; and Δ is the first-difference operator. All variables are in their natural logs except interest rates.

Equation (1) is the long-run equilibrium M2 demand function and is standard in the sense that the public's demand for real M2 balances depends upon a scale variable measured by real GDP and an opportunity cost variable measured as the difference between a short-term nominal rate of interest and the own rate of return on M2. The parameter a1 measures the long-run income elasticity and a2 is the long-run opportunity cost parameter. Equation (2) is the short-run money demand equation, which is in a dynamic error-correction form.
The parameters $b_{is}$ (i = 2, 3) measure short-run responses of real M2 to changes in the income and opportunity cost variables. The parameter λ is the error-correction coefficient. It is assumed that if variables in (1) are nonstationary in levels, they are cointegrated (Engle and Granger 1987). The presence of the error-correction mechanism indicates that if actual real money balances are high relative to what the public wishes to hold ($U_{t-1} > 0$), then the public will be reducing its holdings of money balances. Hence the parameter λ that appears on $U_{t-1}$ in (2) is negative.

The long- and short-run money demand equations given above can be estimated jointly. This is shown in (3), which is obtained by solving for $U_{t-1}$ in (1) and substituting in (2) (Mehra 1992):

$$\Delta m_t = d_0 + \sum_{s=1}^{n_1} b_{1s}\,\Delta m_{t-s} + \sum_{s=0}^{n_2} b_{2s}\,\Delta y_{t-s} + \sum_{s=0}^{n_3} b_{3s}\,\Delta(R - RM2)_{t-s} + d_1 m_{t-1} + d_2 y_{t-1} + d_3 (R - RM2)_{t-1} + \epsilon_t, \qquad (3)$$

where $d_0 = b_0 - \lambda a_0$; $d_1 = \lambda$; $d_2 = -\lambda a_1$; and $d_3 = -\lambda a_2$. As can be seen, the long-term income elasticity can be recovered from the long-run part of the money demand equation (3); i.e., $a_1$ is $-d_2$ divided by $d_1$. If the long-term income elasticity is unity ($a_1 = 1$ in [1]), then this assumption implies the following restriction on the long-run part of equation (3):

$$d_1 + d_2 = 0. \qquad (4)$$

Equation (4) says that the coefficients that appear on $y_{t-1}$ and $m_{t-1}$ sum to zero. The short-run part of (3) yields another estimate of the long-term income elasticity, i.e., $\sum_{s=0}^{n_2} b_{2s} \big/ \big(1 - \sum_{s=1}^{n_1} b_{1s}\big)$. If the same scale variable appears in the long- and short-run parts of the model, then a convergence condition can be imposed on equation (3) to ensure that one gets the same point estimate of the long-run scale elasticity. The convergence condition implies another restriction (5) on the short-run part of equation (3):

$$\sum_{s=0}^{n_2} b_{2s} \Big/ \Big(1 - \sum_{s=1}^{n_1} b_{1s}\Big) = 1. \qquad (5)$$

Equivalently, (5) can be expressed as

$$\sum_{s=0}^{n_2} b_{2s} + \sum_{s=1}^{n_1} b_{1s} = 1.$$

That is, the coefficients that appear on $\Delta m_{t-s}$ and $\Delta y_{t-s}$ in (3) sum to unity. Equation (3) can be estimated by ordinary least squares or by instrumental variables if the income and/or opportunity cost variables are contemporaneously correlated with the disturbance term.

An Estimated Standard M2 Demand Regression: 1960Q4 to 1989Q4

Panel A in Table 1 presents results of estimating the standard money demand regression (3) over the pre-missing M2 period, 1960Q4 to 1989Q4. Regressions are estimated using the new, chain-weighted price and income data.3,4

3 The empirical work here uses quarterly data over the period 1959Q3 to 1996Q4. Variables that appear in (3) are measured as follows. Real money balances (m) are the log of nominal M2 deflated by the GDP deflator; scale variables are the logs of real GDP and real consumer spending. All income and price data used are chain-weighted. R is the four-to-six-month commercial paper rate; RM2 is the weighted average of the explicit rates paid on the components of M2. The bond rate (R10) used later is the nominal yield on ten-year Treasury bonds. The data on household holdings of bond and equity mutual funds are from the Board of Governors and are constructed by adding net assets of mutual funds but netting out institutional and IRA/Keogh balances (Collins and Edwards 1994).

4 Instrumental variables are used to estimate the money demand regressions. The instruments used are just lagged values of the right-hand-side explanatory variables. Ordinary least squares is not used mainly out of concern for simultaneity bias. Both procedures yield similar estimates of the long-run parameters, even though estimates of the short-run parameters differ. The convergence condition is usually rejected if ordinary least squares is used, but that is not the case with instrumental variables. That result favors instrumental variables.
Nevertheless, the Hausman statistic (Hausman 1978) that tests the hypothesis that ordinary least squares estimates of all parameters are identical to those using the instrumental variables procedure is small, indicating that simultaneity may not be a serious problem.

Table 1: Instrumental Variable Estimates of M2 Demand Regressions, 1960Q4 to 1989Q4

Regression A: M2 Demand without the Bond Rate

$$\Delta m_t = -0.05 + 0.23\,\Delta m_{t-1} + 0.08\,\Delta m_{t-2} + 0.45\,\Delta c_t + 0.24\,\Delta c_{t-1} - 0.002\,\Delta(R - RM2)_t - 0.003\,\Delta(R - RM2)_{t-1} - 0.11\,m_{t-1} + 0.11\,\tilde{y}_{t-1} - 0.002\,(R - RM2)_{t-1} - 0.72\,T_t + 0.03\,D83Q1$$

(absolute t-statistics, in order: 4.3, 2.9, 1.0, 4.3, 3.6, 1.6, 3.7, 4.6, 4.6, 3.8, 4.2, 5.5)

CRSQ = 0.78; SER = 0.0047; Q(2) = 1.5; Q(4) = 5.1; Q(29) = 22.6; N(R−RM2) = −0.02; Nc = Ny = 1.0; F1(2,105) = 0.99

Regression B: M2 Demand with the Bond Rate

$$\Delta m_t = -0.05 + 0.26\,\Delta m_{t-1} + 0.08\,\Delta m_{t-2} + 0.40\,\Delta c_t + 0.26\,\Delta c_{t-1} - 0.002\,\Delta(R - RM2)_t - 0.004\,\Delta(R - RM2)_{t-1} - 0.11\,m_{t-1} + 0.11\,\tilde{y}_{t-1} - 0.002\,(R - RM2)_{t-1} - 0.63\,T_t + 0.03\,D83Q1 - 0.005\,(R10 - RM2)_{t-1} - 0.002\,\Delta(R10 - RM2)_{t-1}$$

(absolute t-statistics, in order: 4.2, 3.5, 1.1, 4.2, 4.0, 1.5, 4.0, 4.4, 4.4, 3.1, 3.3, 5.2, 0.7, 1.5)

CRSQ = 0.79; SER = 0.0045; Q(2) = 1.5; Q(4) = 5.2; Q(29) = 26.9; N(R−RM2) = −0.02; N(R10−RM2) = −0.004; Nc = Ny = 1.0; F1(2,105) = 0.99; F2(2,105) = 2.24

Notes: m is real M2 balances; c is real consumer spending; ỹ is $(y_t + y_{t-1})/2$, where y is real GDP; R is the four-to-six-month commercial paper rate; RM2 is the own rate on M2; R10 is the nominal yield on ten-year U.S. Treasury bonds; D83Q1 is a dummy that equals 1 in 1983Q1 and 0 otherwise; Δ is the first-difference operator. All variables are in their natural logs, except interest rate variables. CRSQ is the corrected R-squared; SER is the standard error of the regression; Q(k) is the Ljung-Box Q-statistic based on k autocorrelations of the residuals. Ny is the long-term income elasticity; Nc is the long-term consumption elasticity; N(R−RM2) is the long-term opportunity cost parameter.
F1 tests the restriction Ny = Nc = 1; F2 tests the restriction that the bond rate spread variables are not significant in the regression (the 5 percent critical value is 3.1). The instruments used for estimation are just lagged values of the right-hand-side explanatory variables. The reported coefficient on the trend is to be divided by 1,000.

I present the version estimated using real consumer spending as the short-run scale variable and real GDP as the long-run scale variable. The evidence reported in Mankiw and Summers (1986), Small and Porter (1989), and Mehra (1992) indicates that in the short run changes in real money balances are correlated more with changes in consumer spending than with real GDP.5 The regression, however, is estimated under the assumption that the long-run scale elasticity is unity, computed using either the long-run part or the short-run part of (3). That is, restrictions (4) and (5) are imposed on equation (3). In addition, the regression includes a deterministic time trend and a dummy for the introduction of super-NOW accounts and money market deposit accounts.6

As can be seen, the coefficients that appear on the scale and opportunity cost variables have theoretically correct signs and are statistically significant.7 F1 tests the restrictions that the long-run income and consumer spending elasticities are unity. This F-statistic is small, indicating that those restrictions are consistent with the data (see Table 1). The long-run opportunity cost parameter is −0.02, indicating that a 1 percentage point increase in M2's opportunity cost (R − RM2) from its current level would reduce equilibrium M2 demand by about 2 percent. It is also worth noting that the long-run part of the money demand equation is well estimated. In particular, the estimated error-correction coefficient is correctly signed and significant, indicating the presence of a cointegrating M2 relation in the pre-1990 period.
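The recovery of the long-run parameters from the one-step regression (3) can be illustrated on simulated data. The sketch below is not the article's estimation (no instrumental variables, no lag dynamics, no trend or dummy terms); it only shows how $\lambda = d_1$, $a_1 = -d_2/d_1$, and $a_2 = -d_3/d_1$ fall out of a least-squares fit. All data-generating values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
a0, a1, a2, lam = 0.5, 1.0, -0.02, -0.1  # assumed long-run parameters and lambda

# Simulated driving processes: log real GDP as a random walk, AR(1) opportunity cost.
y = 5.0 + np.cumsum(0.01 * rng.standard_normal(T))
oc = np.empty(T)
oc[0] = 2.0
for t in range(1, T):
    oc[t] = 0.2 + 0.9 * oc[t - 1] + 0.1 * rng.standard_normal()

# Money adjusts toward long-run demand m* = a0 + a1*y + a2*oc (equation (1))
# at error-correction speed lam (equation (2), stripped of lag dynamics).
m = np.empty(T)
m[0] = a0 + a1 * y[0] + a2 * oc[0]
for t in range(1, T):
    u = m[t - 1] - (a0 + a1 * y[t - 1] + a2 * oc[t - 1])  # U_{t-1}
    m[t] = m[t - 1] + lam * u + 0.001 * rng.standard_normal()

# One-step regression (3): dm_t on a constant, m_{t-1}, y_{t-1}, oc_{t-1}.
dm = np.diff(m)
X = np.column_stack([np.ones(T - 1), m[:-1], y[:-1], oc[:-1]])
d0, d1, d2, d3 = np.linalg.lstsq(X, dm, rcond=None)[0]

lam_hat = d1         # d1 = lambda (should be negative)
a1_hat = -d2 / d1    # long-run income elasticity
a2_hat = -d3 / d1    # long-run opportunity cost parameter
print(lam_hat, a1_hat, a2_hat)
```

With this simulated sample the regression recovers values close to the assumed λ = −0.1, a1 = 1, and a2 = −0.02, which is the logic behind reading the long-run elasticities off the level terms in Table 1.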
Evidence on the Missing M2 during the 1990s

Panel A in Table 2 presents the dynamic, out-of-sample predictions of M2 growth from 1990Q1 to 1996Q4. Those predictions are generated using the standard M2 demand regression given in Table 1. Actual M2 growth and prediction errors (with summary statistics) are also reported. As shown in the table, this money demand regression overpredicts M2 growth from 1990 to 1994.

5 I prefer to work with this specification because the restrictions that the long-run scale elasticity, computed using either the long-run part or the short-run part, is unity are consistent with the data in this specification. Those restrictions are usually found to be inconsistent with the data when real GDP is instead used in the short-run part. Nevertheless, the results here are not sensitive to the use of different scale variables in the short- and long-run parts of the money demand equation. In particular, with real GDP in the short-run part we still have the episode of the missing M2 from 1990 to 1994 and the result that M2 growth was on track in the years 1995 and 1996.

6 In the empirical money demand literature, time-trend variables generally proxy for the effect of ongoing financial innovation on the demand for money. Estimates reported in many previous studies indicate that the statistical significance of trend variables in money demand regressions is not robust across different specifications and sample periods. For example, a time trend included in the Federal Reserve Board M2 demand model is significant (Small and Porter 1989; Duca 1995; Koenig 1996), whereas that is not the case in the specifications reported in Hetzel and Mehra (1989) and Mehra (1991, 1992). Different sample periods used in these studies may account for these different results.

7 The Ljung-Box Q-statistics presented in Table 1 indicate that serial correlation is not a problem.
Those prediction errors cumulate to an overprediction in the level of M2 of about $490 billion, or 14 percent, by the fourth quarter of 1994.8 However, since 1995 M2 growth has been in line with that predicted by the money demand regression. The cumulative overprediction in the level of M2 has stabilized, and there has been no tendency for the level percent error to increase since then (see Panel A in Table 2). This evidence indicates that the leftward shift in the public's M2 demand seen early in the 1990s may have ended.

2. AN EXPLANATION OF THE MISSING M2

Portfolio-Substitution Hypothesis

It is widely held that the weak M2 growth observed in the early '90s was due to household substitutions out of bank deposits (in M2) and into long-term financial assets including bond and stock mutual funds.9 Two developments may have contributed to such portfolio substitution. One is the increased availability and liquidity of bond and stock mutual funds brought about by reductions in transaction costs, improvements in computer technology, and the introduction of check writing on mutual funds. The other is the steepening of the yield curve brought about mainly by a reduction in short-term market interest rates in general and bank deposit rates in particular.10 It is suggested that the combination of these factors reduced the public's demand for savings in the form of bank deposits, leading them to redirect savings balances into long-term financial assets including bond and stock mutual funds.11

8 This predictive failure is confirmed by formal tests of stability. The conventional Chow test with the shift date (1978Q4) located near the midpoint of the sample period indicates that the M2 demand regression is unstable from 1960Q4 to 1996Q4. The Dufour test (Dufour 1980), which is a variant of the Chow test, examines stability over the particular interval 1990Q1 to 1994Q4.
This test uses an F-statistic to test the joint significance of dummy variables introduced for each observation over 1990Q1 to 1994Q4. The results here indicate that the individual coefficients that appear on these shift dummies are generally large and statistically significant. The F-statistic is large and significant at the 10 percent level. (The F-statistic, however, is not significant at the 5 percent level.) Together these results indicate that the M2 demand regression is not stable over this interval.

9 Darin and Hetzel (1994), Wenninger and Partlan (1992), Feinman and Porter (1992), Collins and Edwards (1994), Orphanides, Reid, and Small (1994), Duca (1995), and Koenig (1996). Wenninger and Partlan (1992) argued that weakness in M2 growth was due to weakness in its small time deposits component.

10 Many analysts have argued that the decline in the size of taxpayers' subsidy to the depository sector also may have contributed to a reduction in offering rates on bank deposits. It is argued that rising premiums for deposit insurance, higher capital requirements, and more stringent standards for depository borrowing and lending in both wholesale and retail markets may have pressured many banks and thrifts to widen intermediation margins, resulting in lower offering rates on many bank deposits (Feinman and Porter 1992).

11 It may, however, be noted that bond and stock funds also grew rapidly in the mid '80s, shortly after IRA, 401(k), and Keogh regulations were liberalized. Such growth, however, did not destabilize M2 demand. The flow-of-funds data discussed in Duca (1994) indicate that the assets that households then shifted into bond and equity funds came from direct holdings of bonds and equities rather than from M2 deposits. By contrast, more of the inflows into bond and stock funds in the early '90s reflected shifts out of M2 rather than out of direct bond and equity holdings.

Tests in Previous Studies

The portfolio-substitution hypothesis outlined above has been tested in two different ways. The first attempts to internalize such substitutions by adding bond and/or stock mutual funds to M2. Duca (1995) adds bond funds to M2 and finds the expanded M2 more explainable from 1990Q3 to 1992Q4. Darin and Hetzel (1994) shift-adjust M2, and Orphanides, Reid, and Small (1994) simply add bond and stock funds to M2. While the resulting monetary aggregates do explain part of the missing M2 or improve the predictive content of M2 in the missing-M2 period, they worsen performance in other periods.12

The other approach attempts to capture the increased substitution of mutual funds for bank deposits by redefining the opportunity cost of M2 to include the long-term bond rate. This approach assumes that the bond rate is a proxy for the return available on long-term financial assets including bond and stock mutual funds. Hence M2 demand is assumed to be sensitive to both short- and long-term interest rates (Feinman and Porter 1992; Mehra 1992; Koenig 1996). This approach has been relatively more successful in explaining the missing M2 than the other one discussed above.

The main issue here, however, is whether the character of M2 demand has changed since 1990. In Koenig (1996) long-term interest rates are found to influence M2 demand even before the period of missing money, suggesting that the character of M2 demand did not change and that standard M2 demand regressions estimated without the long-term interest rate are misspecified. In contrast, the empirical work in Feinman and Porter (1992) and Mehra (1992) is consistent with the observation that long-term interest rates did not add much toward explaining M2 demand in pre-1990 sample periods. In the next section I examine further the quantitative importance of the long-term interest rate in explaining M2 demand.

12 For example, Orphanides, Reid, and Small (1994) report that money demand equations that add bond and stock funds to M2 fail Chow tests of stability. Koenig (1996) shows that the bond-fund-adjusted M2 demand equation, while it improved forecast performance from 1990 to 1994, worsened performance in the earlier sample period. I show later (see footnote 16) that adding bond and stock funds to M2 worsened performance over the last couple of years.

3. THE ROLE OF THE BOND RATE IN M2 DEMAND

Pre-1990 M2 Demand Regression with the Bond Rate

Panel B in Table 1 presents the standard M2 demand regression augmented to include the bond rate spread variable, measured as the difference between the nominal yield on ten-year Treasury bonds and the own rate of return on M2. I include both the level and first differences of this spread. The regression is estimated over the pre-missing M2 demand period, 1960Q4 to 1989Q4. It is evident that the coefficient that appears on the level of the bond rate spread variable is small and statistically not different from zero. F2 is the F-statistic that tests the hypothesis that the coefficients that appear on both the level and first differences of the bond rate spread variable are zero. This statistic is small, indicating that the bond rate spread did not influence M2 demand in the pre-1990 period (see Regression B in Table 1).

Including the bond rate spread in the M2 demand regression estimated using only pre-1990 sample observations does not solve the missing M2 puzzle either. The evidence on this point is indicated by the dynamic out-of-sample simulations of M2 demand given in Panel B of Table 2. The augmented M2 demand regression continues to overpredict M2 growth from 1990 to 1994. Those prediction errors cumulate to an overprediction in the level of M2 of about $464 billion, or 13.2 percent, by the end of 1994.
Including the bond rate spread does yield a somewhat lower root mean squared error, but this improvement is very small (compare the prediction errors in Panels A and B of Table 2).13

Full-Sample M2 Demand Regression with the Bond Rate

Table 3 presents M2 demand regressions estimated including post-'90 sample observations. In Regression D the bond rate spread enters interacting with a slope dummy that is unity since 1989 and zero otherwise. In that specification the restriction that the bond rate spread did not influence M2 demand in the pre-1990 period is imposed on the regression.

13 In the money demand regression above, the long- and short-rate spreads are included in an unrestricted fashion. It is possible to get the result that the bond rate influenced M2 demand even before the missing M2 demand period if the opportunity cost of holding M2 is alternatively measured as a weighted average of long and short rates: $OC_t = (w \cdot R10_t + (1 - w) \cdot RCP_t) - RM2_t$, where $OC_t$ is the opportunity cost, $w$ is the weighting coefficient, and the other variables are defined as before. If w = 0, then the bond rate is not relevant in influencing M2 demand. The money demand regression (3) here also is estimated using this alternative measure. Estimation results using pre-1990 sample observations indicate that the standard error of the M2 demand regression is minimized when w = 0.4. In that regression the opportunity cost variable is correctly signed and significant, indicating that the long-term interest rate influenced M2 demand even before the missing M2 demand period. This finding is similar to the one reported in Koenig (1996). However, this empirical specification does not solve the missing M2 problem. M2 growth predicted by this regression remains large relative to actual M2 growth from 1990 to 1994. Those prediction errors still cumulate to generate an overprediction in the level of M2 of about $441 billion, or 12.6 percent, by the end of 1994.
The magnitude of this prediction error is somewhat smaller than the one generated by assuming w = 0, but the improvement is small. This empirical specification does not solve the missing M2 problem because the increased explanatory power of the bond rate in the M2 demand regression comes at the cost of the short rate.

Table 3: Instrumental Variables Estimates of M2 Demand Regressions, 1960Q4 to 1996Q4

Regression C: M2 Demand with the Bond Rate, but No Slope Dummies

$$\Delta m_t = -0.01 + 0.33\,\Delta m_{t-1} + 0.13\,\Delta m_{t-2} + 0.36\,\Delta c_t + 0.17\,\Delta c_{t-1} - 0.001\,\Delta(R - RM2)_t - 0.005\,\Delta(R - RM2)_{t-1} - 0.03\,m_{t-1} + 0.03\,\tilde{y}_{t-1} - 0.000\,(R - RM2)_{t-1} - 0.000\,(R10 - RM2)_{t-1} - 0.004\,\Delta(R10 - RM2)_{t-1} - 0.51\,T_t + 0.02\,D83Q1$$

(absolute t-statistics, in order: 2.1, 4.8, 1.9, 3.9, 2.7, 0.9, 5.9, 2.4, 2.4, 0.3, 0.1, 3.2, 3.0, 5.1)

CRSQ = 0.78; SER = 0.0047; Q(2) = 2.7; Q(4) = 6.6; Q(29) = 35.1; Ny = Nc = 1; N(R−RM2) = −0.005; N(R10−RM2) = −0.005; F2(2,133) = 5.5*

Regression D: M2 Demand with the Bond Rate Interacting with Slope Dummy

$$\Delta m_t = -0.03 + 0.25\,\Delta m_{t-1} + 0.08\,\Delta m_{t-2} + 0.49\,\Delta c_t + 0.18\,\Delta c_{t-1} - 0.002\,\Delta(R - RM2)_t - 0.003\,\Delta(R - RM2)_{t-1} - 0.07\,m_{t-1} + 0.07\,\tilde{y}_{t-1} - 0.001\,(R - RM2)_{t-1} - 0.002\,(D \cdot (R10 - RM2))_{t-1} - 0.003\,\Delta(R10 - RM2)_{t-1} - 0.56\,T_t + 0.02\,D83Q1$$

(absolute t-statistics, in order: 3.8, 3.5, 1.1, 5.2, 2.8, 1.5, 4.2, 3.9, 3.9, 2.5, 3.2, 2.1, 3.6, 5.3)

CRSQ = 0.77; SER = 0.0046; Q(2) = 1.7; Q(4) = 6.0; Q(36) = 35.3; Ny = Nc = 1; N(R−RM2) = −0.02; N(R10−RM2) = −0.03

* Significant at the 5 percent level.

Notes: D is a dummy that is 1 from 1989Q1 to 1996Q4 and 0 otherwise. N(R10−RM2) is the long-run bond rate opportunity cost parameter. See also the notes to Table 1.

I also present Regression C, in which no such slope dummy is included. Both differences and the level of the bond rate spread are included in these regressions.
38 Federal Reserve Bank of Richmond Economic Quarterly

As can be seen, the coefficient that appears on the level of the bond rate spread is significant only in the regression where the spread is included interacting with the slope dummy.14 Moreover, in that regression other coefficients, including the one that appears on the error-correction variable, have expected signs and are statistically significant. In contrast, none of the coefficients that appear on levels of the interest rate spreads are significant in the regression without the slope dummy (compare coefficients in regressions C and D of Table 3).15 Together this evidence indicates that a significant role for the impact of the long-term interest rate on M2 demand emerges only in the post-1990 period.16 Panel D in Table 4 presents the dynamic, within-sample simulations of M2 growth from 1990 to 1996 generated using the regression with the slope dummy. As shown in the table, this regression can account for most of the missing M2 since 1990. The prediction errors now cumulate to an overprediction in the level of M2 of about $41 billion, or 1.2 percent, by the end of 1994. Since then, the percent error in the level of M2 has displayed no tendency to increase over time. This

14 The intuition behind this result is that the least squares regression coefficient measures the average response of M2 demand to the spread variables over the full sample. If for most of the sample period—as is the case here—this response is small or zero, then the estimated regression coefficient that simply averages such responses over the full sample will be small or zero. But when the slope dummy is included, the estimated regression coefficient gives full weight to the part of the sample over which the response is believed to be strong. I have not reported the slope dummy on the first difference of the bond rate spread because it is not significant in the regression.
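The intuition in footnote 14, that a full-sample least squares coefficient averages a zero pre-1989 response with a strong post-1989 response while a slope dummy isolates the latter, can be reproduced with simulated data (all numbers below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 148
x = rng.normal(0, 1, T)
d = (np.arange(T) >= 113).astype(float)   # dummy: the last 35 "quarters," like 1989Q1 on
y = -0.5 * d * x + rng.normal(0, 0.1, T)  # y responds to x only in the dummy period

# Full-sample regression without the slope dummy: the coefficient on x
# averages the zero early response with the late response, so it is attenuated
X1 = np.column_stack([np.ones(T), x])
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)

# With the interaction d*x, the slope-dummy coefficient recovers the late-sample response
X2 = np.column_stack([np.ones(T), x, d * x])
b2, *_ = np.linalg.lstsq(X2, y, rcond=None)

print(f"no dummy: {b1[1]:.2f}   with slope dummy: {b2[2]:.2f}")
```

The full-sample coefficient comes out roughly proportional to the fraction of the sample in which the response is active, while the interaction term recovers the full response of -0.5.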
15 In the regression C without the slope dummy, the error-correction coefﬁcient is small in magnitude and only marginally signiﬁcant. In fact, if restrictions that long-run scale elasticities are unity are not imposed on the regression, then none of the coefﬁcients that appear on levels of variables are signiﬁcant. Hence in these regressions the hypothesis that there exists a cointegrating M2 demand relation is easily rejected. This ﬁnding is similar in spirit to the one in Miyao (1996), where it is shown that once post-’90 sample observations are included in the estimation period, evidence supports no M2 cointegration. 16 Alternatively, the hypothesis that most of the missing M2 went into bond and stock mutual funds can be tested by broadening the deﬁnition of M2 to include such mutual funds. If the hypothesis is correct, then the broadly deﬁned monetary aggregate should be more explainable from 1990 to 1994. This procedure yields similar results. To explain it further, consider the behavior of the monetary aggregate that simply adds bond and stock mutual funds to M2, denoted hereafter as M2+ (Orphanides, Reid, and Small 1994). This aggregate has grown at the following rates (in percent) in recent years: 4.1 in 1990, 6.2 in 1991, 4.6 in 1992, 5.5 in 1993, 0.9 in 1994, 6.3 in 1995, and 7.9 in 1996. For those years M2 growth predicted by the standard M2 demand regression is 6.4, 3.5, 6.4, 4.8, 3.0, 3.5, and 3.9, respectively. The corresponding prediction errors are −2.3, 2.0, −1.7, 0.6, −2.0, 2.9, and 3.9. As can be easily veriﬁed, for the period 1990 to 1994 the mean prediction error is −0.57 percentage point and the root mean squared error is 2.0 percentage points. These prediction errors are smaller than those generated using the narrowly deﬁned M2; for the latter the mean error is −1.78 and the root mean squared error is 2.52. Thus M2+ is more explainable over the period 1990 to 1994 than is M2. 
However, adding bond and stock funds to M2 does not yield a more stable money demand equation. As can be seen, strong growth in M2+ over the period 1995 to 1996 is not easily predicted when conventional money demand parameters are used to characterize M2+ demand. The analysis above, however, is subject to the caveat that the opportunity cost variable in M2+ demand is different from the one that shows up in M2 demand. In particular, the own rate of return on M2+ must include the returns on bond and stock mutual funds.

evidence indicates that the steepening of the yield curve contributed to weak M2 growth in the early '90s.

The Missing M2 and Bond and Stock Mutual Funds

Figure 1 charts the missing M2 as explained by the bond rate spread since 1990.17 It also charts the cumulative change (since 1989) in household holdings of bond and stock mutual funds.18 As can be seen, these two series comove from 1990 to 1994. But this comovement ends in the years 1995 and 1996. Furthermore, in the beginning years (1990, 1991, 1992) of this missing M2 period, the magnitude of the missing M2 somewhat exceeds the cumulative increase in household holdings of bond and stock mutual funds. These data support the view that weak M2 growth in the early '90s is due to households' substitution out of M2 and into bond and stock mutual funds. But not all of the missing M2 first went into bond and stock funds. A part might have gone into direct holdings of bonds, stocks, and other long-term savings vehicles (Duca 1993; Darin and Hetzel 1994). If part of the missing M2 ended up in bond and stock mutual funds, then changes in missing M2 balances should be correlated with changes in household holdings of bond and stock funds. This implication is tested by running the following regression: ∆BS_t = a0 + a1 ∆BS_{t-1} + a2 ∆MM2_t + ε_t, where BS is household holdings of bond and stock funds, MM2 is the missing M2, and ε_t is the random disturbance term.
The series on BS and MM2 are reported in Table 4 and charted in Figure 1. Estimation results indicate that from 1991Q1 to 1994Q4 a2 ≠ 0, but from 1991Q1 to 1996Q4 a2 = 0.19 These results are consistent with the hypothesis that part of the missing M2 during the 1990s ended up in bond and stock mutual funds.

17 The series on the missing M2 is generated in the following way. The M2 demand regression D, which includes the bond rate interacting with a slope dummy, is estimated from 1960Q4 to 1996Q4. This regression is dynamically simulated from 1990Q1 to 1996Q4, first using actual values of the bond rate spread over the prediction interval and then repeating the simulation with the bond rate spread set to zero. The difference between the predicted values so generated gives the M2 demand explained by the bond rate.

18 This series is constructed by Collins and Edwards (1994) and is the plus part of the monetary aggregate (M2+) discussed in the previous footnote. As noted before, the plus part is the market value of household holdings of bond and stock mutual funds. The current definition of the conventional M2 aggregate includes currency, demand deposits, other checkable deposits, savings deposits, small time deposits, retail money market mutual funds, and overnight RPs and Eurodollar deposits. Since this definition does not include institutional and IRA/Keogh balances, household holdings of bond and stock funds are also measured net of such assets. However, unlike M2, those household holdings can increase if bonds and stocks appreciate and thus do not necessarily represent funds out of new savings.

19 The regressions use quarterly observations on year-over-year changes in BS and MM2 and are run from 1991Q1 to 1996Q4.
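The counterfactual exercise in footnote 17 amounts to dynamically simulating the fitted equation twice, once with the actual bond rate spread and once with the spread set to zero, and taking the difference. A minimal sketch, with made-up coefficients standing in for the estimated regression D:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 28  # 1990Q1-1996Q4

# Hypothetical fitted coefficients for a toy version of regression D (not Mehra's estimates)
a, b_lag, b_spread = 0.01, 0.3, -0.02
spread = np.abs(rng.normal(1.5, 0.5, T))  # toy bond rate spread over the prediction interval

def simulate(spread_path, m0=0.0):
    """Dynamically simulate money growth, feeding predicted lags back in."""
    m = np.empty(T)
    lag = m0
    for t in range(T):
        m[t] = a + b_lag * lag + b_spread * spread_path[t]
        lag = m[t]
    return m

with_spread = simulate(spread)          # actual values of the spread
without_spread = simulate(np.zeros(T))  # same dynamic simulation with the spread set to zero

# The gap between the two simulated paths is the growth shortfall attributed to the spread
missing_m2_growth = without_spread - with_spread
print(missing_m2_growth.round(4))
```

Because the spread enters with a negative coefficient, the gap is positive in every period; cumulating it over the prediction interval gives the "missing M2" series charted in Figure 1.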
Figure 1 The Missing M2 and the Cumulative Change in Household Holdings of Bond and Stock Mutual Funds since 1990

[Figure: two quarterly series, "Missing M2" and "Cumulative Change in Household Holdings of Bond and Stock Mutual Funds," plotted from 1990:1 through 1996:1 on a 0 to 700 scale.]

Notes: The missing M2 is the reduction in M2 demand that is due to the bond rate spread. Household holdings are net of institutional and IRA/Keogh assets.

4. CONCLUDING OBSERVATIONS

It is now known that the public's demand for M2 experienced a leftward shift in the early '90s. It is widely believed that this shift reflected the public's desire to redirect savings balances from bank deposits to long-term financial assets, including bond and stock mutual funds. In this article, I test this popular hypothesis. In particular, I present evidence that a standard M2 demand regression augmented to capture the impact of the long-term interest rate on money demand can account for most of the missing M2 since 1990 and that changes in this missing M2 are highly correlated with changes in household holdings of bond and stock mutual funds in the early 1990s. The evidence here, however, also indicates that the long-term interest rate has no predictive content for M2 demand in the pre-missing M2 period. That result suggests caution in assigning a causal role to the independent influence of the long-term rate on M2 demand found in the missing M2 period. Furthermore, household holdings of bond and stock mutual funds continued to increase in the years 1995 and 1996, but that increase has not been accompanied by any weakness in M2. Hence increases in household holdings of bond and stock mutual funds may not necessarily signal instability in M2 demand. One interpretation of the recent behavior of M2 demand is that some special factors caused a leftward shift in the public's M2 demand.
The evidence here is consistent with the view that those special factors included the combination of the unusual steepening of the yield curve and the increased availability, liquidity, and public awareness of bond and stock mutual funds. The evidence so far is that those special factors have not fundamentally changed the character of M2 demand beyond causing a one-time permanent shift in the level of M2 balances demanded by the public. Hence the result that the leftward shift in M2 demand ended two years ago should now be of interest to monetary policymakers.

REFERENCES

Collins, Sean, and Cheryl L. Edwards. "An Alternative Monetary Aggregate: M2 Plus Household Holdings of Bond and Equity Mutual Funds," Federal Reserve Bank of St. Louis Review, vol. 76 (November/December 1994), pp. 7–29.

Darin, Robert, and Robert L. Hetzel. "A Shift-Adjusted M2 Indicator of Monetary Policy," Federal Reserve Bank of Richmond Economic Quarterly, vol. 80 (Summer 1994), pp. 25–47.

Duca, John V. "Should Bond Funds be Added to M2?" Journal of Banking and Finance, vol. 19 (April 1995), pp. 131–52.

———. "Commentary," Federal Reserve Bank of St. Louis Review, vol. 76 (November/December 1994), pp. 67–70.

———. "RTC Activity and the 'Missing M2,'" Economics Letters, vol. 41 (1993), pp. 67–71.

Dufour, Jean-Marie. "Dummy Variables and Predictive Tests for Structural Change," Economics Letters, vol. 6 (1980), pp. 241–47.

Engle, Robert F., and C. W. J. Granger. "Co-Integration and Error Correction: Representation, Estimation, and Testing," Econometrica, vol. 55 (March 1987), pp. 251–76.

Feinman, Joshua. "Commentary," Federal Reserve Bank of St. Louis Review, vol. 76 (November/December 1994), pp. 71–73.

———, and Richard D. Porter. "The Continuing Weakness in M2," Finance and Economics Discussion Series Working Paper 209. Board of Governors of the Federal Reserve System, 1992, pp. 1–41.

Greenspan, Alan. "Statement to the Congress," Federal Reserve Bulletin, vol.
79 (September 1993), pp. 849–55.

Hausman, J. A. "Specification Tests in Econometrics," Econometrica, vol. 46 (November 1978), pp. 1251–72.

Hetzel, Robert L., and Yash P. Mehra. "The Behavior of Money Demand in the 1980s," Journal of Money, Credit, and Banking, vol. 21 (November 1989), pp. 455–63.

Koenig, Evan F. "Long-Term Interest Rates and the Recent Weakness in M2," Journal of Economics and Business, vol. 48 (May 1996), pp. 81–101.

Mankiw, N. Gregory, and Lawrence H. Summers. "Money Demand and the Effects of Fiscal Policies," Journal of Money, Credit, and Banking, vol. 18 (November 1986), pp. 415–29.

Mehra, Yash P. "Has M2 Demand Become Unstable?" Federal Reserve Bank of Richmond Economic Review, vol. 78 (September/October 1992), pp. 27–35.

———. "An Error-Correction Model of U.S. M2 Demand," Federal Reserve Bank of Richmond Economic Review, vol. 77 (May/June 1991), pp. 3–12.

Miyao, Ryuzo. "Does a Cointegrating M2 Demand Relation Really Exist in the United States?" Journal of Money, Credit, and Banking, vol. 28 (August 1996), pp. 365–80.

Orphanides, Athanasios, Brian Reid, and David H. Small. "The Empirical Properties of a Monetary Aggregate that Adds Bond and Stock Funds to M2," Federal Reserve Bank of St. Louis Review, vol. 76 (November/December 1994), pp. 31–51.

Small, David H., and Richard D. Porter. "Understanding the Behavior of M2 and V2," Federal Reserve Bulletin, vol. 75 (April 1989), pp. 244–54.

Wenninger, John, and John Partlan. "Small Time Deposits and the Recent Weakness in M2," Federal Reserve Bank of New York Quarterly Review, vol. 17 (Spring 1992), pp. 21–35.

On the Identification of Structural Vector Autoregressions

Pierre-Daniel G. Sarte

Following seminal work by Sims (1980a, 1980b), the economics profession has become increasingly concerned with studying sources of economic fluctuations.
Sims's use of vector autoregressions (VARs) made it possible to address both the relative importance and the dynamic effect of various shocks on macroeconomic variables. This type of empirical analysis has had at least two important consequences. First, by deepening policymakers' understanding of how economic variables respond to demand versus supply shocks, it has enabled them to better respond to a constantly changing environment. Second, VARs have become especially useful in guiding macroeconomists towards building structural models that are more consistent with the data. According to Sims (1980b), VARs simply represented an atheoretical technique for describing how a set of historical data was generated by random innovations in the variables of interest. This reduced-form interpretation of VARs, however, was strongly criticized by Cooley and LeRoy (1985), as well as by Bernanke (1986). At the heart of the critique lies the observation that VAR results cannot be interpreted independently of a more structural macroeconomic model. Recovering the structural parameters from an estimation procedure requires that some restrictions be imposed. These are known as identifying restrictions. Implicitly, the choice of variable ordering in a reduced-form VAR constitutes such an identifying restriction. As a result of the Cooley-LeRoy/Bernanke critique, economists began to focus more precisely upon the issue of identifying restrictions. The extent to which specific innovations were allowed to affect some subset of variables,

I would like to thank Tom Cooley, Michael Dotsey, Bruce Hansen, Tom Humphrey, Yash Mehra, and Alex Wolman for more than helpful comments. I would also like to thank Sergio Rebelo, Vassilios Patikis, and Mark Watson for their suggestions. The opinions expressed herein are the author's and do not represent those of the Federal Reserve Bank of Richmond or the Federal Reserve System.
Federal Reserve Bank of Richmond Economic Quarterly, Volume 83/3, Summer 1997

either in the short run or in the long run, began to be derived explicitly from structural macroeconomic models. Consequently, what were previously considered random surprises could be interpreted in terms of specific shocks, such as technology or fiscal policy shocks. This more refined use of VARs, known as structural vector autoregressions (SVARs), has become a popular tool for evaluating economic models, particularly in the macroeconomics literature. The fact that nontrivial restrictions must be imposed for SVARs to be identified suggests, at least in principle, that estimation results may be contingent on the choice of restrictions. To take a concrete and recent example, in estimating a system containing employment and productivity variables, Gali (1996) achieves identification by assuming that aggregate demand shocks do not affect productivity in the long run. Using postwar U.S. data, he is then able to show that, surprisingly, employment responds negatively to a positive technology shock. One may wonder, however, whether his results would change significantly under alternative restrictions. This article consequently investigates how the use of different identifying restrictions affects empirical evidence about business fluctuations. Two important conclusions emerge from the analysis. First, by thinking of SVARs within the framework of instrumental variables estimation, it will become clear that the method is inappropriate for certain identifying restrictions. This finding occurs because SVARs use the estimated residual from a previous equation in the system as an instrument in the current equation. Since estimation of this residual depends on some prior identifying restriction, the identification scheme necessarily determines the strength of the instrument.
By drawing from the literature on estimation with weak instruments, this article points out that in some cases SVARs will not yield meaningful parameter estimates. The second finding of interest suggests that even in cases where SVAR parameters can be properly estimated, different identification choices can lead to contradictory results. For example, in Gali (1996) the restriction that aggregate demand shocks not affect productivity in the long run also implies that employment responds negatively to a positive technology shock. But the opposite result emerges when aggregate demand shocks are allowed to have a small negative effect on productivity in the long run. This latter restriction is appropriate if demand shocks are interpreted as fiscal policy shocks in a real business cycle model. More importantly, this observation suggests that sensitivity analysis should form an integral part of deciding what constitutes a stylized fact within the confines of SVAR estimation. This article is organized as follows. We first provide a brief description of reduced-form VARs as well as the basic idea underlying the Cooley-LeRoy/Bernanke critique. In doing so, the important assumptions underlying the use of VARs are laid out explicitly for the nonspecialist reader. We then introduce the mechanics of SVARs—that is, the details of how SVARs are usually estimated—and link the issue of identification to the estimation procedure.1

P.-D. G. Sarte: Structural Vector Autoregressions 47

The next section draws from the literature on instrumental variables in order to show the conditions in which the SVAR methodology fails to yield meaningful parameter estimates. We then describe the type of interpretational ambiguities that may arise when the same SVAR is estimated using alternative identifying restrictions. Finally, we offer a brief summary and some conclusions.

1.
REDUCED-FORM VARS AND THE COOLEY-LEROY/BERNANKE CRITIQUE

In this section, we briefly describe the VAR approach first advocated by Sims (1980a, 1980b). In doing so, we will show that the issue of identification already emerges in interpreting estimated dynamic responses for a given set of variables. To make matters more concrete, the analysis in both this and the next section is framed within the context of a generic bivariate system. However, the basic issues under consideration are invariant with respect to the size of the system. Thus, consider the joint time series behavior of the vector (∆y_t, ∆x_t)', which we summarize as

B(L)Y_t = e_t, with B(0) = B_0 = I,  (1)

where Y_t = (∆y_t, ∆x_t)' and B(L) denotes a matrix polynomial in the lag operator L. B(L) is thus defined as B_0 + B_1 L + ... + B_k L^k + ..., where L^k Y_t = Y_{t-k}. Since B(0) = I, equation (1) is an unrestricted VAR representation of the joint dynamic behavior of the vector Y_t. In Sims's (1980a) original notation, the vector e_t = (e_{yt}, e_{xt})' would carry the meaning of "surprises" or innovations in ∆y_t and ∆x_t respectively. In its simplest interpretation, the reduced form in (1) is a model that describes how the historical data contained in Y_t was generated by some random mechanism. As such, few would question its usefulness as a forecasting tool. However, in the analysis of the variables' dynamic responses to the various innovations, the implications of the unrestricted VAR are not unambiguous. Specifically, let us rewrite (1) as a moving average representation,

Y_t = B(L)^{-1} e_t = C(L)e_t,  (2)

where C(L) is defined to be equal to B(L)^{-1}, with C(L) = C_0 + C_1 L + ... + C_k L^k + ... and C_0 = C(0) = B(0)^{-1} = I.
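The inversion C(L) = B(L)^{-1} can be computed recursively from the identity sum_{j=0}^{k} B_j C_{k-j} = 0 for k >= 1, with C_0 = I. A short numerical sketch (the VAR coefficient matrix below is an arbitrary illustration, not an estimate):

```python
import numpy as np

# Toy bivariate VAR(1) in the text's convention B(L)Y_t = e_t with B(0) = I,
# i.e. Y_t = -B1 @ Y_{t-1} + e_t  (B1 is hypothetical, for illustration)
B1 = np.array([[-0.5, -0.1],
               [-0.2, -0.3]])
B = [np.eye(2), B1]  # B_0, B_1; higher lags are zero

def ma_coefficients(B, horizon):
    """Invert B(L) to get C(L) = B(L)^{-1} via the recursion
    sum_{j=0}^{k} B_j C_{k-j} = 0 (k >= 1), with C_0 = I."""
    n = B[0].shape[0]
    C = [np.eye(n)]
    for k in range(1, horizon + 1):
        acc = np.zeros((n, n))
        for j in range(1, min(k, len(B) - 1) + 1):
            acc += B[j] @ C[k - j]
        C.append(-acc)
    return C

C = ma_coefficients(B, horizon=4)
# For a VAR(1) the recursion collapses to C_k = (-B1)^k
print(np.allclose(C[3], np.linalg.matrix_power(-B1, 3)))  # True
```

The same recursion applies for any lag length; only the inner loop over the nonzero B_j changes.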
To obtain the comparative dynamic responses of ∆y_t and ∆x_t, Sims (1980a) first suggested orthogonalizing the vector of innovations e_t by defining f_t = Ae_t, such that A is a lower triangular matrix with 1s on its diagonal and f_t has a normalized diagonal covariance matrix. This particular transformation is known as a Choleski factorization, and the newly defined innovations, f_t = (f_{yt}, f_{xt})', have unit variance and are orthogonal. Equation (2) can therefore also be expressed as

Y_t = C(L)A^{-1} Ae_t = D(L)f_t,  (3)

with D(L) = C_0 A^{-1} + C_1 A^{-1} L + ... + C_k A^{-1} L^k + .... Responses to innovations at different horizons, also known as impulse responses, are then given by

∂E_t Y_{t+k} / ∂f_t' = C_k A^{-1}, for k = 0, 1, ....  (4)

The advantage of computing dynamic responses in this way is that the innovations f_t are uncorrelated. Therefore it is very simple to compute the variances associated with any linear combinations involving them. Note that

E_t Y_{t+k} - E_{t-1} Y_{t+k} = C_k A^{-1} f_t,  (5)

so that the jth row of C_k A^{-1} gives the marginal effect of f_t on the jth variable's k step-ahead forecast error. Since the f_t's are uncorrelated with unit variance, squaring the elements of C_k A^{-1} leads to the contributions of the elements of f_t to the variance of the k step-ahead forecast error. This latter process is known as variance decomposition and describes the degree to which a particular innovation contributes to observed fluctuations in Y_t. Note that the variance decomposition of the contemporaneous forecast error is given by the squared elements of C_0 A^{-1} = A^{-1}. More importantly, since A is a lower triangular matrix, A^{-1} is also lower triangular. This implies that the innovation in the first equation, f_{yt}, explains 100 percent of the variance in the contemporaneous forecast error of ∆y_t.

1 Note that the details of the estimation procedure described in this article apply directly to the work of King and Watson (1997).
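The orthogonalization and the contemporaneous variance decomposition can be verified numerically. The sketch below uses the common unit-variance Choleski normalization (a variant of the text's normalization of A with 1s on the diagonal); the innovation covariance matrix is hypothetical:

```python
import numpy as np

# Hypothetical reduced-form innovation covariance for e_t = (e_yt, e_xt)'
Omega = np.array([[1.0, 0.4],
                  [0.4, 0.5]])

# Choleski factorization: Omega = P P' with P lower triangular.
# Setting f_t = P^{-1} e_t gives cov(f_t) = P^{-1} Omega P^{-1}' = I,
# i.e. orthogonal, unit-variance innovations (A = P^{-1} in the text's notation).
P = np.linalg.cholesky(Omega)
A = np.linalg.inv(P)

cov_f = A @ Omega @ A.T
print(np.allclose(cov_f, np.eye(2)))  # orthogonal, unit-variance innovations

# The contemporaneous responses C_0 A^{-1} = A^{-1} = P are lower triangular, so
# f_yt accounts for 100 percent of the contemporaneous forecast-error variance of dy_t
impact = np.linalg.inv(A)  # equals P
shares = impact**2 / (impact**2).sum(axis=1, keepdims=True)
print(shares[0, 0])  # share of the first variable's variance due to f_yt: 1.0
```

Reordering the variables changes P and hence the shares, which is exactly the implicit identifying restriction discussed next.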
But this is precisely an identifying restriction on the dynamic behavior of Y_t. In a larger system, the variance of the contemporaneous forecast error in the jth variable would be entirely accounted for by the first j innovations in a recursive fashion. Each of these restrictions would then implicitly constitute a prior identifying restriction. In this sense, the ordering of variables in a reduced-form VAR is of crucial significance. This last point was made, perhaps most vigorously, in Cooley and LeRoy (1985): "if the models (i.e., VARs) are interpreted as non-structural, we view the conclusions as unsupportable, being structural in nature. If the models are interpreted as structural, on the other hand, the restrictions on error distributions adopted in atheoretical macroeconometrics are not arbitrary renormalizations, but prior identifying restrictions." On a related note, Bernanke (1986) also writes that the standard Choleski decomposition, while "sometimes treated as neutral . . . in fact embodies strong assumptions about the underlying economic structure." Following these criticisms, several authors, including Blanchard and Watson (1984), Sims (1986), Bernanke (1986), and Blanchard and Quah (1989), addressed the issue of identification explicitly. The error terms in these latter models were given structural interpretations, and the results no longer had to depend on an arbitrary orthogonalization. However, this latter methodology possesses its own problems, both in terms of the validity of the estimation procedure and the interpretation of the results. This is the subject to which we now turn our attention.

2. INTRODUCTION TO THE MECHANICS OF STRUCTURAL VARS

The reduced form in equation (1) could simply be thought of as a way to summarize the full data set Y_t.
In contrast, suppose that a theoretical model tells us that y_t actually evolves according to a specific stochastic process,

∆y_t = Θ_ya(L) ε_at + Θ_yb(L) ε_bt + (1 - L)Φ_ya(L) ε_at + (1 - L)Φ_yb(L) ε_bt,  (6)

where ε_at and ε_bt now possess well-defined structural interpretations. Thus, y_t might represent national output, while ε_at and ε_bt might denote shocks to technology and labor supply respectively. This specification for y_t is quite general in that it allows shocks to have both permanent and temporary effects. The polynomial in the lag operator Φ(L) captures temporary deviations in y_t, while the polynomial Θ(L) keeps track of permanent changes in its steady-state level. Similarly, suppose that x_t follows a process that can be described by

∆x_t = Θ_xa(L) ε_at + Θ_xb(L) ε_bt + (1 - L)Φ_xa(L) ε_at + (1 - L)Φ_xb(L) ε_bt.  (7)

With this specification in hand, it is possible to summarize the system as

Y_t = S(L) ε_t,  (8)

where Y_t is defined as in the previous section, ε_t = (ε_at, ε_bt)', and

S(L) = [ Θ_ya(L) + (1 - L)Φ_ya(L)    Θ_yb(L) + (1 - L)Φ_yb(L)
         Θ_xa(L) + (1 - L)Φ_xa(L)    Θ_xb(L) + (1 - L)Φ_xb(L) ].  (9)

Equation (8) therefore denotes the structural moving average representation of the variables y_t and x_t, as a function of the exogenous innovations ε_at and ε_bt. Let us assume that S(L) is invertible so that equation (8) can also be expressed in autoregressive form:

T(L)Y_t = S(L)^{-1} Y_t = ε_t,  (10)

that is,

T(L) (∆y_t, ∆x_t)' = (ε_at, ε_bt)', with T(0) = S(0)^{-1} = I.  (11)

Since the two exogenous processes that govern the behavior of y_t and x_t in (6) and (7) are assumed stationary, we also assume that the roots of the polynomial matrix |T(z)| lie outside the unit circle. At this stage it is not possible to disentangle the structural effects of ε_at and ε_bt in equation (11). Put another way, we cannot currently identify the structural error terms ε_at and ε_bt with the residuals in the two equations implicit in (11).
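The claim that the (1 - L)Φ(L) component is purely transitory while Θ(1) governs the permanent effect can be checked with a scalar version of equation (6); the lag polynomials below are arbitrary illustrations:

```python
import numpy as np

# Scalar analogue of equation (6): dy_t = theta(L) eps_t + (1 - L) phi(L) eps_t,
# with hypothetical polynomials theta(L) = 0.5 + 0.2 L and phi(L) = 0.8 + 0.3 L
theta = np.array([0.5, 0.2])
phi = np.array([0.8, 0.3])

H = 20
eps = np.zeros(H)
eps[0] = 1.0  # one-time unit structural impulse at t = 0

def lagpoly(coefs, x):
    """Apply a lag polynomial c_0 + c_1 L + ... to the series x."""
    out = np.zeros_like(x)
    for j, c in enumerate(coefs):
        out[j:] += c * x[:len(x) - j]
    return out

# (1 - L) phi(L) eps_t is the first difference of the phi-filtered impulse
dy = lagpoly(theta, eps) + np.diff(lagpoly(phi, eps), prepend=0.0)
y = np.cumsum(dy)  # level of y

# The differenced phi term sums to zero, so the long-run level change equals theta(1)
print(y[-1], theta.sum())
```

The simulation confirms that the permanent effect of the impulse on the level of y equals Θ(1) = 0.7, while the Φ(L) component leaves no trace in the long run.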
This is a well-known problem that naturally leads us to the issue of identification.

Identification in Structural VARs

To get a handle on the problem of identification, observe the relationship between the reduced form in (1) and equation (11). Since T(L)Y_t = T_0 Y_t + T_1 Y_{t-1} + ..., it follows that T_0^{-1} T(L)Y_t = Y_t + T_0^{-1} T_1 Y_{t-1} + ... = T_0^{-1} ε_t. We then see that T_0^{-1} T(L)Y_t is the reduced form, that is, T_0^{-1} T(L) = B(L), so that

T(0)^{-1} T(L)Y_t = B(L)Y_t = e_t = T(0)^{-1} ε_t.  (12)

Hence, if Σ = cov(ε_t) and Ω = cov(e_t), the following relation also holds:

T(0)^{-1} Σ T(0)^{-1}' = Ω.  (13)

Since Ω can be estimated from the reduced form, the problem of identification relates to the conditions under which the structural parameters in T(0)^{-1} Σ T(0)^{-1}' can be recovered from Ω. Equation (13) potentially establishes a set of three equations in seven unknowns. Specifically, the unknowns consist of four parameters in T(0) and two variances and one covariance term in Σ. The SVAR literature typically reduces the size of this problem by making the following two assumptions. First, T(0) is normalized to contain 1s on its diagonal. Second, Σ is diagonalized, which reflects the assumption that the structural disturbance terms are taken to be uncorrelated. This leaves us with four unknowns; therefore, one further restriction must be imposed for the structural form to be identified. This additional restriction will generally reflect the econometrician's beliefs and, as will be apparent below, will allow one to separate the effects of the two structural error terms. As we have just pointed out, only one restriction needs to be imposed upon the dynamics of the system in (11) for the parameters to be identified. One possibility is to specify a priori one of the parameters in the contemporaneous matrix T(0). Another popular approach, the one we focus on here, is to pre-specify a particular long-run relationship between the variables and therefore constrain the matrix of long-run multipliers T(1).
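The counting argument behind equation (13) can be made concrete: with the two normalizations imposed there remain four unknowns but only three distinct moments in Ω, so observationally equivalent structures exist. A numerical illustration (all numbers are toy values):

```python
import numpy as np

# Equation (13): T(0)^{-1} Sigma T(0)^{-1}' = Omega. With 1s on T(0)'s diagonal and
# a diagonal Sigma there are 4 unknowns (a, b, s1, s2) but only 3 distinct moments
# in the symmetric 2x2 matrix Omega.

def implied_omega(a, b, s1, s2):
    """Reduced-form covariance implied by T(0) = [[1, a], [b, 1]] and Sigma = diag(s1, s2)."""
    T0 = np.array([[1.0, a], [b, 1.0]])
    T0inv = np.linalg.inv(T0)
    return T0inv @ np.diag([s1, s2]) @ T0inv.T

Om = implied_omega(0.5, 0.0, 1.0, 1.0)  # the "true" structure

# For any fixed b, choose a so that T0 @ Om @ T0' has a zero off-diagonal entry;
# the result is a different admissible structure with the same reduced form.
b = 0.2
a = -(b * Om[0, 0] + Om[0, 1]) / (b * Om[0, 1] + Om[1, 1])
T0 = np.array([[1.0, a], [b, 1.0]])
Sigma = T0 @ Om @ T0.T  # diagonal by construction of a

print(np.allclose(implied_omega(a, b, Sigma[0, 0], Sigma[1, 1]), Om))  # True
```

Two distinct parameterizations of T(0) and Σ reproduce the same Ω, which is why one further restriction must be fixed in advance.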
This approach is the one followed by Shapiro and Watson (1988), Blanchard and Quah (1989), King, Plosser, Stock, and Watson (1991), and Gali (1992, 1996), among others. To be concrete, define

T(1) = [ 1 - θ_yy    -θ_yx
         -θ_xy    1 - θ_xx ] = [ Θ_ya(1)  Θ_yb(1)
                                 Θ_xa(1)  Θ_xb(1) ]^{-1} = S(1)^{-1}.  (14)

One way to achieve identification would be to impose the restriction that the exogenous process with innovation ε_at not affect the level of x_t in the long run. That is, impose the restriction that

Θ_xa(1) = 0.  (15)

Since inverses of triangular matrices are themselves triangular, setting Θ_xa(1) = 0 is tantamount to setting θ_xy = 0. It would then be possible to estimate all the remaining parameters in equations (6) and (7). This type of restriction, known as an exclusion restriction, is used for identification in the papers cited above. Note, however, that in theory there is no reason why the parameters fixed for identification should be set to zero as opposed to any other value. All that is required is that this set of parameters be fixed in advance, whether at zero or not. For example, if ε_at denotes a shock to technology and x_t represents labor supply, imposing Θ_xa(1) = 0 would mean the structural model we have in mind implies that changes in technology do not affect labor supply in the long run. However, in a standard real business cycle model, the permanent effect of technology on labor supply depends on whether the income or the substitution effect dominates. This effect in turn depends on whether the elasticity of intertemporal substitution is greater or less than one. Therefore, there is no reason why exclusion restrictions should necessarily be used as an identification strategy. The fact that Θ_xa(1), or alternatively θ_xy, does not have to be set to zero as a way to identify the model means that estimated parameters, and therefore estimated dynamic responses, can vary depending on the identification scheme adopted.
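A long-run exclusion restriction of this kind can be imposed constructively. The sketch below follows the familiar Blanchard-Quah construction under unit-variance structural shocks, a normalization that differs from the text's 1s-on-the-diagonal convention; all coefficient values are invented:

```python
import numpy as np

# Toy reduced-form VAR(1): Y_t = A1 Y_{t-1} + e_t, cov(e_t) = Omega (hypothetical values)
A1 = np.array([[0.4, 0.1],
               [0.0, 0.3]])
Omega = np.array([[1.0, 0.3],
                  [0.3, 0.8]])

# Reduced-form long-run multiplier C(1) = (I - A1)^{-1}
C1 = np.linalg.inv(np.eye(2) - A1)

# Long-run identification: pick S(1) lower triangular with S(1) S(1)' = C(1) Omega C(1)',
# so the second shock has no long-run effect on the first variable (an exclusion restriction)
S1 = np.linalg.cholesky(C1 @ Omega @ C1.T)

# Impact matrix of the structural shocks: e_t = S0 eps_t with S0 = C(1)^{-1} S(1)
S0 = np.linalg.inv(C1) @ S1

print(np.allclose(S0 @ S0.T, Omega))  # the structural shocks reproduce Omega
print(abs(S1[0, 1]) < 1e-12)          # the exclusion restriction holds exactly
```

Replacing the zero in S(1) with any other fixed value delivers an alternative, equally admissible identification, which is precisely the sensitivity issue raised in the text.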
This observation carries with it two potential problems. First, different identification schemes might lead to different comparative dynamic responses of the variables. Therefore, in using SVARs to establish stylized facts, some sensitivity analysis appears to be essential. Second, the estimation procedure may fail in a statistical sense for some values of θ_xy in the relevant parameter space. Before looking at each of these problems, however, we first need to explain SVAR estimation.

Structural VAR Estimation Procedure

The most popular way of imposing identifying restrictions as part of the estimation procedure in a SVAR is to take an instrumental variables (IV) approach, specifically two-stage least squares. In applying this approach to our bivariate system, we examine a simple case involving one lag. This will help in keeping matters tractable. Thus, the second equation in (11) can be written as

∆x_t = β_xy0 ∆y_t + β_xy1 ∆y_{t-1} + β_xx1 ∆x_{t-1} + ε_bt.  (16)

To see how the long-run multipliers θ_xx and θ_xy in T(1) implicitly enter in equation (16), observe that this equation can also be expressed as

∆x_t - θ_xy ∆y_t = γ_xy0 ∆²y_t + θ_xx ∆x_{t-1} + ε_bt,  (17)

where ∆²y_t denotes the second difference in y_t, θ_xx = β_xx1, γ_xy0 = -β_xy1, and θ_xy = β_xy0 + β_xy1.2 By setting a predetermined value for θ_xy, not necessarily zero, the parameters of equation (17) can then be estimated. Since ∆²y_t is correlated with ε_bt, ordinary least squares estimation is inappropriate, but two-stage least squares can be performed using the set Z = {∆x_{t-1}, ∆y_{t-1}} as instruments. In a similar fashion, the equation for ∆y_t can be written as

∆y_t = β_yy1 ∆y_{t-1} + β_yx0 ∆x_t + β_yx1 ∆x_{t-1} + ε_at.
Equation (18) can be estimated using the same set of instruments as for (17), plus the estimated residual for b_t.³ Recall that in order to achieve identification, the structural disturbances were assumed to be uncorrelated, thereby allowing the use of the estimated residual as an instrument. Furthermore, this residual is the only candidate instrument that remains; additional lags of the endogenous variables, if relevant, should already have been included in the original equations. The key point to note at this stage is that since the left-hand side of equation (17) varies with θ_xy, the parameters as well as the error term in that equation are contingent upon the identification scheme. This raises a question as to the validity of the estimated residual from equation (17) as an instrument. Not only is zero correlation between the structural disturbances necessary; a high correlation between the instrument and the variable it is instrumenting for is also essential. This point is emphasized by Nelson and Startz (1990). As we shall now see, because the time series behavior of the estimated residual in (17) varies with θ_xy, the validity of the estimation procedure in the subsequent equation will be implicitly tied to the choice of identifying restriction.

3. IDENTIFICATION FAILURE IN STRUCTURAL VARS

To gain insight into the problems that may arise in this framework, given the identification strategy adopted, let us rewrite equation (17) as follows:

Δx_t − θ_xy Δy_t = X φ + b_t,   (19)

where X = {Δ²y_t, Δx_{t−1}} and φ = (γ_xy0, θ_xx)′. Then, the two-stage least squares estimator φ̂ is given by

φ̂ = (Z′X)^{−1} Z′(Δx_t − θ_xy Δy_t).   (20)

From equation (20), the parameter estimates in φ̂ will change as θ_xy takes on different values. This is also true of the estimated residual, which we therefore

² As an intermediate step, equation (16) can also be expressed as Δx_t = (β_xy0 + β_xy1 − β_xy1)Δy_t + β_xy1 Δy_{t−1} + β_xx1 Δx_{t−1} + b_t.
³ Observe that, analogously to (17), this equation can also be written as Δy_t = θ_yy Δy_{t−1} + θ_yx Δx_t + γ_yx0 Δ²x_t + a_t, where θ_yy = β_yy1, γ_yx0 = −β_yx1, and θ_yx = β_yx0 + β_yx1.

denote by ê_bt(θ_xy) to underscore its dependence on the adopted identification strategy. Since this estimated residual can be computed as

ê_bt(θ_xy) = (Δx_t − θ_xy Δy_t) − X φ̂
           = (Δx_t − θ_xy Δy_t) − X (Z′X)^{−1} Z′(Δx_t − θ_xy Δy_t),   (21)

observe that ê_bt(θ_xy)′ Z = 0 for all θ_xy. This last condition summarizes what are sometimes called the normal equations. Now, the second equation to be estimated, (18), can also be expressed as

Δy_t = Z β + Δx_t β_yx0 + a_t,   (22)

where β = (β_yx1, β_yy1)′ and Δx_t is the endogenous variable of interest. Since the relevant set of instruments for the estimation of equation (22) is given by {Z, ê_bt(θ_xy)}, it follows that the two-stage least squares estimator for β is given by

[ β̂      ]   [ Z′Z               Z′Δx_t            ]^{−1} [ Z′Δy_t            ]
[ β̂_yx0 ] = [ ê_bt(θ_xy)′ Z    ê_bt(θ_xy)′ Δx_t ]      [ ê_bt(θ_xy)′ Δy_t ].   (23)

This last expression can be thought of as a set of two equations in two unknowns, specifically,

Z′Z β̂ + Z′Δx_t β̂_yx0 = Z′Δy_t   (24)

and

ê_bt(θ_xy)′ Z β̂ + ê_bt(θ_xy)′ Δx_t β̂_yx0 = ê_bt(θ_xy)′ Δy_t.   (25)

Therefore it follows that

β̂_yx0 = [ê_bt(θ_xy)′ M_z Δx_t]^{−1} [ê_bt(θ_xy)′ M_z Δy_t],   (26)

where M_z is the projection matrix I − Z(Z′Z)^{−1}Z′. But we have just seen that ê_bt(θ_xy)′ Z = 0 for all θ_xy; hence equation (26) simplifies to

β̂_yx0 = [ê_bt(θ_xy)′ Δx_t]^{−1} [ê_bt(θ_xy)′ Δy_t].   (27)

In other words, the two-stage least squares estimator for β_yx0, and hence for the long-run multiplier θ_yx, depends on two key elements: the correlations of the estimated residual from the previous equation, equation (19), with both Δx_t and Δy_t. This is because each equation in a SVAR possesses many regressors in common.
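The simplification from (23) to (27) is easy to verify numerically. In the sketch below (simulated data, hypothetical parameter values), the full just-identified two-stage least squares estimate of β_yx0 from equation (22) coincides with the simple ratio of correlations in equation (27):

```python
import numpy as np

# Simulate the same toy bivariate system (hypothetical coefficients).
rng = np.random.default_rng(0)
T = 500
a, b = rng.normal(size=T), rng.normal(size=T)
dy, dx = np.zeros(T), np.zeros(T)
for t in range(1, T):
    dy[t] = 0.5 * dy[t - 1] + a[t]
    dx[t] = 0.3 * dy[t] + 0.2 * dy[t - 1] + 0.4 * dx[t - 1] + b[t]

theta_xy = 0.5
lhs = dx[2:] - theta_xy * dy[2:]
X = np.column_stack([dy[2:] - dy[1:-1], dx[1:-1]])
Z = np.column_stack([dx[1:-1], dy[1:-1]])
e_b = lhs - X @ np.linalg.solve(Z.T @ X, Z.T @ lhs)   # residual from (17)

# Equation (22): dy_t on {Z, dx_t}, instruments {Z, e_b}; eq (23).
R = np.column_stack([Z, dx[2:]])        # regressors
W = np.column_stack([Z, e_b])           # instruments
coef = np.linalg.solve(W.T @ R, W.T @ dy[2:])

# Equation (27): beta_yx0 depends only on e_b's correlations with dx and dy.
beta_yx0_direct = (e_b @ dy[2:]) / (e_b @ dx[2:])
print(coef[-1], beta_yx0_direct)
```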
Since the “extra” instrument ê_bt(θ_xy) in the second equation is the residual from the first equation, it is by construction orthogonal to the other instruments in the second equation. It then follows that the two-stage least squares estimator for β_yx0 depends only on the correlations of this residual with Δx_t and Δy_t, as shown by (27). To see that certain identification schemes may be problematic, define θ*_xy such that ê_bt(θ*_xy)′ Δx_t = 0. Then, as long as ê_bt(θ*_xy)′ Δy_t remains finite, β̂_yx0 diverges as θ_xy → θ*_xy. In more standard IV settings this result would not emerge: residuals from other equations would not generally be used as instruments, and hence parameter estimates would depend on more than one correlation.

To determine the exact value of the problematic identifying restriction θ*_xy, given the data under consideration, it suffices to take the transpose of equation (21), post-multiply the result by Δx_t, and set it to zero, which yields

θ*_xy = (Δx_t′ W Δx_t) / (Δy_t′ W Δx_t),   where W = Z(X′Z)^{−1}X′ − I.   (28)

To continue with our discussion, observe from equations (22) and (27) that

β̂_yx0 − β_yx0 = [ê_bt(θ_xy)′ Δx_t]^{−1} [ê_bt(θ_xy)′ a_t].   (29)

Therefore a lower bound for the variance of the two-stage least squares estimator β̂_yx0 is given by

var(β̂_yx0) = σ²_a [ê_bt(θ_xy)′ Δx_t]^{−1} [ê_bt(θ_xy)′ ê_bt(θ_xy)] [ê_bt(θ_xy)′ Δx_t]^{−1},   (30)

where σ²_a = E(a_t²).⁴ As θ_xy → θ*_xy, this variance diverges at the square of the rate at which β̂_yx0 itself diverges. Taken together, equations (27) and (30) tell us that for identification strategies in a neighborhood of θ*_xy, it is not possible to obtain a meaningful estimate of β_yx0: both the estimator and its associated confidence interval become arbitrarily large. The above analysis has been mechanical in nature, in order to make clear the source of identification failure in SVAR estimation.
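Equation (28) can be evaluated directly on data. The sketch below (simulated data, hypothetical parameters) computes θ*_xy and confirms that at this value the residual from equation (17) is indeed orthogonal to Δx_t, so the ratio in (27) is undefined:

```python
import numpy as np

# Locate the problematic identifying restriction theta*_xy of equation (28)
# on simulated data (hypothetical data-generating coefficients).
rng = np.random.default_rng(0)
T = 500
a, b = rng.normal(size=T), rng.normal(size=T)
dy, dx = np.zeros(T), np.zeros(T)
for t in range(1, T):
    dy[t] = 0.5 * dy[t - 1] + a[t]
    dx[t] = 0.3 * dy[t] + 0.2 * dy[t - 1] + 0.4 * dx[t - 1] + b[t]

X = np.column_stack([dy[2:] - dy[1:-1], dx[1:-1]])
Z = np.column_stack([dx[1:-1], dy[1:-1]])
dxt, dyt = dx[2:], dy[2:]

# W = Z (X'Z)^{-1} X' - I, as in equation (28).
W = Z @ np.linalg.inv(X.T @ Z) @ X.T - np.eye(len(dxt))
theta_star = (dxt @ W @ dxt) / (dyt @ W @ dxt)

# At theta*_xy the residual from (17) is orthogonal to dx_t by construction.
lhs = dxt - theta_star * dyt
e_b = lhs - X @ np.linalg.solve(Z.T @ X, Z.T @ lhs)
print("theta_star =", round(theta_star, 4), " e'dx =", e_b @ dxt)
```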
One may wonder further, however, about the relationship between the distributional properties of β̂_yx0 and the identifying restriction θ_xy. The questions of statistical inference and asymptotic distribution can be answered to some degree, it turns out, as a special case of the analysis carried out by Staiger and Stock (1993). Their analysis indicates that conventional asymptotic inference procedures are no longer valid when ê_bt(θ_xy) is weakly related to Δx_t in a regression of Δx_t on its instruments.⁵ Since residuals are recursively used as instruments in the estimation of SVARs, the “validity” of the estimation procedure implicitly depends on the nature of the identifying restrictions adopted. That is, the strength of the instruments is contingent upon the identification scheme. Some structural economic models may then be impossible to investigate empirically within the confines of a just-identified SVAR. In particular, as long as an identification strategy generates a small correlation between a recursively estimated residual and the variable it is meant to instrument for in the subsequent equation, coefficient estimates will lose their standard distributional properties.

⁴ This is only a lower bound, since ê_bt(θ_xy) is a generated regressor and therefore possesses some variation not accounted for in equation (30).
⁵ See the Appendix.

An Illustrative Example

Although the analysis in this section has been carried out with a long-run identifying restriction in mind, the arguments above are also relevant in settings incorporating short-run identifying restrictions. As an example, consider a recent paper on long-run neutrality by King and Watson (1997). The authors estimate a bivariate system in output and money in order to test long-run money neutrality. In doing so, they recognize the importance of considering alternative identifying restrictions for robustness. A subset of their results is reproduced in Figure 1.
In panel A of Figure 1, King and Watson (1997) report point estimates and confidence intervals for the hypothesis of long-run superneutrality when the short-run elasticity of money demand with respect to output is allowed to vary. Observe that as this value approaches −0.2, both the coefficient estimate for long-run superneutrality and its confidence intervals begin to blow up. In a similar fashion, panel C shows long-run superneutrality results under various assumptions with respect to the long-run response of money to exogenous permanent shifts in the level of output. Here, γ_Δm,y corresponds to θ_xy, so that in our notation Δx_t is the money variable while y_t is the output variable. As in the case where a short-run identifying restriction was considered, the estimate for long-run superneutrality and its associated confidence intervals start to diverge as γ_Δm,y approaches −0.35. Thus, it should be clear that in looking for robustness across different identification schemes, one may be confronted with cases where the SVAR methodology cannot be meaningfully implemented.

At this stage, there remains at least one other obvious issue of interest. In our context, there may exist a plausible range of identifying restrictions θ_xy for which the residual ê_bt(θ_xy) is, in fact, a proper instrument. If this were the case, one would naturally wonder whether comparative dynamic response estimates are sensitive to the particular identifying restriction imposed upon the system. The next section provides an example of the interpretation ambiguities associated with precisely this issue.

4. INTERPRETING STRUCTURAL VARS: TECHNOLOGY SHOCKS AND AGGREGATE EMPLOYMENT FLUCTUATIONS

One topic of considerable interest in macroeconomics is the relationship between technology shocks and aggregate fluctuations in employment. Real business cycle models typically predict that technological innovations raise the level of employment.
This result reflects the increase in the marginal productivity of labor associated with the positive technology shock when labor supply is relatively less variable. In a recent paper, however, Gali (1996) suggests that this feature of real business cycle models does not hold empirically. By using a bivariate SVAR in labor productivity and employment, he is able to show that technology shocks appear to induce a persistent decline in employment. Furthermore, labor productivity increases temporarily in response to demand shocks.

Figure 1: Money Growth and Output. Panels A through C show 95 percent confidence intervals for γ_y,Δm as functions of, respectively, the two impact elasticities and the long-run elasticity γ_Δm,y; panel D shows the 95 percent confidence ellipse when γ_y,Δm = 0. [Plots not reproduced.]

To motivate the identification of the particular SVAR he uses, Gali (1996) suggests a stylized model whose key features are monopolistic competition, predetermined prices, and variable effort.⁶ In such a framework, a positive technology shock enhances labor productivity while leaving aggregate demand unchanged due to sticky prices; employment must therefore fall. In addition, a positive demand shock would be met by a higher level of “unobserved” effort as well as higher “measured” employment. Given a strong enough effort response, labor productivity would temporarily rise.
Formally, the structure of Gali’s (1996) model implies that employment evolves according to

Δh_t = Θ_hη(L) η_t + Θ_hξ(L) ξ_t + (1−L) Φ_hη(L) η_t + (1−L) Φ_hξ(L) ξ_t,   (31)

where η_t and ξ_t denote money growth and technology shocks, respectively. Here, money growth shocks are associated with the management of aggregate demand by the monetary authority and hence serve as a proxy for demand shocks. Since technology shocks induce a persistent decline in employment, we have Θ_hξ(1) < 0. Similarly, labor productivity is given by

Δq_t = Θ_qη(L) η_t + Θ_qξ(L) ξ_t + (1−L) Φ_qη(L) η_t + (1−L) Φ_qξ(L) ξ_t,   (32)

with Θ_qη(0) + Φ_qη(0) > 0 to capture the contemporaneous positive effect of a demand shock on labor productivity. As in Section 2, this system of equations can be summarized as

T(L) Y_t = ε_t,   (33)

where Y_t = (Δh_t, Δq_t)′, ε_t = (η_t, ξ_t)′, and

T(L) = [ Θ_hη(L) + (1−L)Φ_hη(L)   Θ_hξ(L) + (1−L)Φ_hξ(L) ]^{−1}
       [ Θ_qη(L) + (1−L)Φ_qη(L)   Θ_qξ(L) + (1−L)Φ_qξ(L) ]      .   (34)

The key identifying restriction that Gali (1996) imposes upon the dynamics of his system is that demand shocks do not have a permanent effect on labor productivity. In terms of our earlier notation, we have

T(1) = [ Θ_hη(1)  Θ_hξ(1) ]   [ 1−θ_hh   −θ_hq  ]^{−1}
       [ Θ_qη(1)  Θ_qξ(1) ] = [ −θ_qh    1−θ_qq ]      ,   (35)

with

Θ_qη(1) = θ_qh = 0.   (36)

⁶ For the details of the model, refer to Gali (1996).

Figure 2 plots impulse response functions for the bivariate SVAR we have just described. The data comprise the log of hours worked in the nonfarm business sector as well as gross domestic product (in 1987 dollars) less gross domestic product in the farm sector. The log of productivity was hence computed as the log of gross domestic product less the log of hours worked. Four lags were used in estimation, and the sample period covers 1949:1 to 1992:4. As in Gali (1996), observe that the structural response of employment to a positive technology shock is negative, both in the short and the long run.
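A minimal sketch of this kind of long-run identification, in the spirit of Blanchard and Quah (1989), is given below. It estimates a reduced-form VAR(1) on simulated data (not the hours and productivity series used in the text; all data-generating values are hypothetical) and chooses the structural impact matrix so that the second shock has no permanent effect on the first variable, the analogue of the restriction Θ_qη(1) = 0:

```python
import numpy as np

# Long-run identification sketch: lower-triangular cumulative impact.
rng = np.random.default_rng(1)
T = 400
A_true = np.array([[0.4, 0.1], [0.2, 0.3]])      # hypothetical VAR(1) dynamics
chol_true = np.array([[1.0, 0.0], [0.5, 1.0]])   # hypothetical innovation mixing
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + chol_true @ rng.normal(size=2)

# OLS estimate of the reduced form Y_t = A Y_{t-1} + u_t.
Xlag, Ylead = Y[:-1], Y[1:]
A_hat = np.linalg.solve(Xlag.T @ Xlag, Xlag.T @ Ylead).T
U = Ylead - Xlag @ A_hat.T
Sigma = U.T @ U / len(U)

# Cumulative long-run impact of reduced-form innovations: C(1) = (I - A)^{-1}.
C1 = np.linalg.inv(np.eye(2) - A_hat)
# Pick S0 with S0 S0' = Sigma so that C(1) S0 is lower triangular:
# shock 2 has no permanent effect on variable 1.
LR = np.linalg.cholesky(C1 @ Sigma @ C1.T)
S0 = np.linalg.solve(C1, LR)
print("long-run matrix:\n", np.round(C1 @ S0, 4))
```

The same device, applied to the levels specification with four lags, underlies the impulse responses discussed in the text.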
Furthermore, this is true even within a 90 percent confidence interval.⁷ Note also that the contemporaneous response of productivity to a demand shock is positive and, by construction, eventually vanishes. Of course, since we have used data very similar to that used in the original study, these results are hardly surprising. However, Gali (1996) argues that since these estimates seem to hold for the majority of G7 countries, the impact of technology shocks “yields a picture which is hard to reconcile with the prediction of (real business cycle) models.” This statement makes it clear that, among other results, the persistent employment decline in response to a technology shock is implicitly interpreted as a stylized fact. As we know, however, Gali’s (1996) estimates derive from his choice of identification scheme; deviations from that scheme must be considered in order to decide what constitutes a stylized fact.

Alternative Identification Strategies

There are several different ways to think about Gali’s (1996) initial SVAR setup. First, supposing that aggregate demand shocks account for more than just money growth shocks, demand shocks may have a permanent impact on productivity. For instance, a permanent increase in taxes in a real business cycle model would yield an increase in the steady-state ratio of employment to capital. Given a standard production function with constant returns to scale, this increase in the ratio of labor to capital would necessarily be accompanied by a fall in labor productivity. This would invalidate the restriction that Θ_qη(1) = θ_qh = 0. Moreover, since θ_qh represents the long-run elasticity of productivity with respect to employment, it might not be unreasonable to expect that θ_qh < 0. Figure 3 shows the impulse response functions that result in Gali’s (1996) framework when θ_qh is set to −0.5. Under this alternative identification strategy, the response of employment to a technology shock is no longer negative.
In fact, both the short- and long-run responses of employment are now positive. By comparing Figures 2B and 3B, observe that this latter result seems to hold even when standard errors are taken into account; that is, there is little overlap of the corresponding confidence intervals.

⁷ To construct the standard error bands, Monte Carlo simulations were done using draws from the normal distribution for each of the two structural innovations. One thousand Monte Carlo draws were carried out in each case.

Figure 2: Identification assumption: demand shocks have no long-run impact on productivity. Panel A: structural impulse response of employment to a demand shock; panel B: structural impulse response of employment to a technology shock; panel C: structural impulse response of productivity to a demand shock; panel D: structural impulse response of productivity to a technology shock. [Plots not reproduced.]

Figure 3: Identification assumption: demand shocks have a negative long-run impact on productivity. Panels A through D as in Figure 2. [Plots not reproduced.]
Moreover, the contemporaneous effect of a demand shock on productivity is no longer positive but negative, as shown in panel C. Viewed in this light, the dynamic response estimates initially reported in Gali (1996) may appear somewhat fragile. In particular, his contention that the data do not coincide with the predictions of real business cycle models does not necessarily hold.

In Figure 4, we show the results obtained when Gali’s (1996) SVAR is identified using yet a third alternative. In this case, we require that technology shocks not have a long-run impact on employment. In terms of equation (35), this implies that Θ_hξ(1) = θ_hq = 0. This identifying restriction is used by Shapiro and Watson (1988). It also emerges as a steady-state result in a real business cycle model when utility is logarithmic in consumption and leisure: under this parameterization, the income and substitution effects resulting from a positive technology shock cancel out, leaving labor supply unchanged in the steady state (see King, Plosser, and Rebelo [1988]). Note in panel C of Figure 4 that under this third alternative, the long-run response of productivity to a demand shock is negative, which provides further evidence against Gali’s (1996) initial identifying restriction. As already noted, this result is also consistent with a permanent increase in taxes in a real business cycle framework. Put another way, when one identifies Gali’s (1996) bivariate system in a way that is consistent with the steady state generated by a standard real business cycle model, the empirical findings generated by the SVAR are consistent with the predictions of that model. Of course, that is not to say that real business cycle models represent a more compelling framework when gauged against the data; the empirical results reported by Gali (1996) are themselves consistent with the theoretical model he uses to identify his SVAR.
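The sensitivity exercise just described, replacing the zero restriction with a grid of assumed long-run effects, can be sketched as follows. The data are again simulated and the grid values hypothetical; the point is only that the implied impact responses change as the identifying value moves:

```python
import numpy as np

# Sweep the assumed long-run effect of shock 2 on variable 1 over a grid
# and recompute the implied structural impact matrix each time.
rng = np.random.default_rng(1)
T = 400
A_true = np.array([[0.4, 0.1], [0.2, 0.3]])
chol_true = np.array([[1.0, 0.0], [0.5, 1.0]])
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A_true @ Y[t - 1] + chol_true @ rng.normal(size=2)

Xlag, Ylead = Y[:-1], Y[1:]
A_hat = np.linalg.solve(Xlag.T @ Xlag, Xlag.T @ Ylead).T
U = Ylead - Xlag @ A_hat.T
Sigma = U.T @ U / len(U)
C1 = np.linalg.inv(np.eye(2) - A_hat)     # long-run impact of innovations

# Any admissible impact matrix is S0 = P Q, with P = chol(Sigma) and Q a
# rotation; choose the angle so the (1,2) long-run entry equals a target c.
P = np.linalg.cholesky(Sigma)
r = (C1 @ P)[0]
Rnorm = np.hypot(r[0], r[1])
delta = np.arctan2(r[0], r[1])

S0_grid = {}
for frac in (-0.5, 0.0, 0.5):
    c = frac * Rnorm                      # target long-run effect (|c| <= Rnorm)
    phi = np.arccos(c / Rnorm) - delta
    Q = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    S0_grid[frac] = P @ Q
    print(f"c = {c:+.3f}: impact of shock 1 =", np.round((P @ Q)[:, 0], 3))
```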
It is simply that in this case, what can be read from the data can vary sharply with one’s prior beliefs concerning the theoretical nature of the data-generating mechanism.

While we have just shown that some of the key results in Gali (1996) are sensitive to the way one thinks about the long-run impact of various demand or supply shocks, this is not always the case. Observe that the structural impulse response of employment to a demand shock is similar in both direction and magnitude across Figures 2, 3, and 4. This is also true of the structural impulse response of productivity to a technology shock. Since these latter results emerge across estimated systems with varying identifying restrictions, they may reasonably be considered stylized facts.

Figure 4: Identification assumption: technology shocks have no long-run impact on employment. Panels A through D as in Figure 2. [Plots not reproduced.]

5. SUMMARY AND CONCLUSIONS

We have investigated the extent to which identification issues can matter when using SVARs to characterize data. Although the main focus was on the estimation of bivariate systems, it should be clear that most of the above analysis applies to larger systems as well. At a purely mechanical level, the source of the problem lies with the recursive use of an estimated residual as an instrument.
The assumption made in SVAR estimation that the structural disturbances be uncorrelated is not sufficient to guarantee a proper estimation procedure. One must also pay attention to the degree of correlation between the estimated residual and the endogenous variable it is meant to be instrumenting for. This observation has long been made for simultaneous equations systems, and in this sense it is important not to lose sight of the fact that SVARs are in effect sets of simultaneous equations. At another level, we have also seen that even when the residual from a previously estimated equation is a valid instrument, SVARs can yield ambiguous results. This was the case, even when confidence intervals are taken into account, in the bivariate example in hours and productivity: it was unclear whether employment responds positively or negatively to a technology shock, in both the short and the long run. There may therefore be a sense in which SVARs can fail in a way that is reminiscent of the Cooley and LeRoy (1985) critique. In reduced-form VARs, different results emerge when alternative orthogonalizations of the error terms are adopted; in structural VARs, the results can be directly contingent upon specific identifying restrictions. In effect, these are two facets of the same problem. We have also seen in our example that certain results may be relatively robust with respect to the particular identification strategy of interest: for example, the response of productivity to a technology shock was estimated to be positive in both the short and long run across the various systems. Thus, two conclusions ultimately emerge from this investigation. First, special emphasis should be given to the derivation of identifying restrictions. The proper use of SVARs is contingent upon such restrictions, and the case of identification failure cannot be ruled out a priori.
Second, sensitivity analysis can be quite helpful in gaining a sense of the range of dynamics consistent with a given set of data. Assessing such a range seems an essential step in establishing stylized facts.

APPENDIX

This appendix derives the asymptotic distribution of β̂_yx0 in the text. The derivation is based on Staiger and Stock (1993). In the estimation of equation (22), suppose that the relationship that ties Δx_t to its instruments can be described as

Δx_t = Z α + ê_bt(θ_xy) α_xe + ν_t,   (A1)

where ν_t is uncorrelated with a_t. Furthermore, let us consider the set of identifying restrictions Π_θxy for which α_xe = N^{−1/2} g(θ_xy), where N is the sample size of our dataset and g(θ_xy): Π_θxy → ℝ. In other words, Π_θxy denotes a set of identifying restrictions for which the instrument ê_bt(θ_xy) is only weakly related to the endogenous variable Δx_t, in the local-to-zero sense: the coefficient α_xe goes to zero as the sample size itself becomes arbitrarily large. To proceed with the argument, rewrite equation (29) as

β̂_yx0 − β_yx0 = [(N^{−1/2} Δx_t′ ê_bt(θ_xy)) (N^{−1} ê_bt(θ_xy)′ ê_bt(θ_xy))^{−1} (N^{−1/2} ê_bt(θ_xy)′ Δx_t)]^{−1}
                × [(N^{−1/2} Δx_t′ ê_bt(θ_xy)) (N^{−1} ê_bt(θ_xy)′ ê_bt(θ_xy))^{−1} (N^{−1/2} ê_bt(θ_xy)′ a_t)].   (A2)

Given the assumptions embodied in (A1), and using Z′ê_bt(θ_xy) = 0, it follows that

N^{−1/2} Δx_t′ ê_bt(θ_xy) = N^{−1/2} [α′Z′ + α_xe ê_bt(θ_xy)′ + ν_t′] ê_bt(θ_xy)
                           = N^{−1} ê_bt(θ_xy)′ ê_bt(θ_xy) g(θ_xy) + N^{−1/2} ν_t′ ê_bt(θ_xy).   (A3)

Under suitable conditions, the first term in the above equation will converge to some constant almost surely as the sample size becomes large. The second term, on the other hand, will converge asymptotically to a normal distribution by the Central Limit Theorem. Therefore, although the coefficient on the relevant instrument, ê_bt(θ_xy), in the first-stage equation converges to zero, if the rate of convergence is slow enough, the right-hand side of equation (A2) will not diverge asymptotically.
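The local-to-zero device in (A1) is easy to visualize by simulation. The sketch below (a generic just-identified IV model, not the SVAR itself; all parameter values hypothetical) compares the sampling distribution of the 2SLS estimator under a strong first stage and under a first stage that shrinks at rate N^{−1/2}:

```python
import numpy as np

# Monte Carlo sketch of the Staiger-Stock weak-instrument setting.
rng = np.random.default_rng(42)
N, reps, beta = 200, 2000, 1.0

def iv_draws(pi):
    """Sampling distribution of just-identified 2SLS for first stage pi."""
    est = np.empty(reps)
    for r in range(reps):
        z = rng.normal(size=N)                    # instrument
        v = rng.normal(size=N)                    # first-stage error
        u = 0.8 * v + 0.6 * rng.normal(size=N)    # structural error, endogenous
        x = pi * z + v
        y = beta * x + u
        est[r] = (z @ y) / (z @ x)                # 2SLS with one instrument
    return est

strong = iv_draws(1.0)                  # strong first stage
weak = iv_draws(0.2 / np.sqrt(N))       # local-to-zero first stage, as in (A1)
iqr = lambda s: np.percentile(s, 75) - np.percentile(s, 25)
print("median (strong, weak):", np.median(strong), np.median(weak))
print("IQR    (strong, weak):", iqr(strong), iqr(weak))
```

Under the strong first stage the estimator is tightly centered on the true value; under the shrinking first stage its distribution is badly dispersed and biased, which is the failure of conventional inference described below.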
Nevertheless, in this case, the two-stage least squares estimator β̂_yx0 is asymptotically distributed as a ratio of quadratic forms in two jointly distributed normal variables. Hence, for identification strategies that belong to the set Π_θxy, conventional asymptotic inference procedures will fail. In fact, in the so-called leading case where g(θ_xy) = 0, Phillips (1989), Hillier (1985), and Staiger and Stock (1993) point out that β̂_yx0 asymptotically possesses a t distribution. We now provide a sketch of the basic arguments. To this end, we assume that the following moment conditions are satisfied, where “→p” and “⇒” denote convergence in probability and convergence in distribution, respectively.

(a) (N^{−1} X′X, N^{−1} Z′X, N^{−1} X′Δx_t, N^{−1} Z′Δx_t, N^{−1} X′Δy_t, N^{−1} Z′Δy_t) →p (Σ_XX, Σ_ZX, Σ_XΔx, Σ_ZΔx, Σ_XΔy, Σ_ZΔy);

(b) (N^{−1} Δx_t′Δx_t, N^{−1} Δy_t′Δy_t, N^{−1} Δx_t′Δy_t) →p (Σ_ΔxΔx, Σ_ΔyΔy, Σ_ΔxΔy);

(c) (N^{−1/2} ν_t′Δx_t, N^{−1/2} ν_t′Δy_t, N^{−1/2} ν_t′X, N^{−1/2} a_t′Δx_t, N^{−1/2} a_t′Δy_t, N^{−1/2} a_t′X) ⇒ (Ψ_νΔx, Ψ_νΔy, Ψ_νX, Ψ_aΔx, Ψ_aΔy, Ψ_aX).

Note two particular points embodied in assumptions (a) through (c). First, assumptions (a) and (b) would naturally hold under standard conditions governing stationarity and ergodicity of the variables in the reduced form. Second, since these are primitive assumptions, they do not depend on the identifying restriction θ_xy. It now remains to specify the asymptotic properties of three terms in (A2) and (A3), namely N^{−1} ê_bt(θ_xy)′ê_bt(θ_xy), N^{−1/2} ν_t′ê_bt(θ_xy), and N^{−1/2} ê_bt(θ_xy)′a_t, to determine the asymptotic behavior of β̂_yx0(θ_xy) − β_yx0 when θ_xy ∈ Π_θxy. Let us then examine each of these terms in turn. Recall from equation (21) that ê_bt(θ_xy) = (Δx_t − θ_xy Δy_t) − X(Z′X)^{−1}Z′(Δx_t − θ_xy Δy_t). It follows that N^{−1} ê_bt(θ_xy)′ê_bt(θ_xy) is quadratic in θ_xy.
Therefore, under assumptions (a) and (b), N^{−1} ê_bt(θ_xy)′ê_bt(θ_xy) →p Σ(θ_xy) uniformly, where Σ(θ_xy) also depends on Σ_XX, Σ_ZX, etc. Next, consider N^{−1/2} ν_t′ê_bt(θ_xy). We have

N^{−1/2} ν_t′ê_bt(θ_xy) = N^{−1/2} [ν_t′Δx_t − θ_xy ν_t′Δy_t − ν_t′X(Z′X)^{−1}Z′(Δx_t − θ_xy Δy_t)],

which is linear in θ_xy. Therefore N^{−1/2} ν_t′ê_bt(θ_xy) ⇒ Ψ_ν(θ_xy) uniformly, where Ψ_ν(θ_xy) = Ψ_νΔx − θ_xy Ψ_νΔy − Ψ_νX Σ_ZX^{−1} [Σ_ZΔx − θ_xy Σ_ZΔy]. Finally, N^{−1/2} ê_bt(θ_xy)′a_t is given by

N^{−1/2} [Δx_t′a_t − θ_xy Δy_t′a_t − (Δx_t − θ_xy Δy_t)′Z(X′Z)^{−1}X′a_t],

which is also linear in θ_xy. Hence N^{−1/2} ê_bt(θ_xy)′a_t ⇒ Ψ_a(θ_xy) uniformly, where Ψ_a(θ_xy) = Ψ_aΔx − θ_xy Ψ_aΔy − [Σ_ZΔx − θ_xy Σ_ZΔy]′ (Σ_ZX′)^{−1} Ψ_aX. With these results in mind, it follows that β̂_yx0(θ_xy) converges in distribution to

β_yx0 + [(g(θ_xy)Σ(θ_xy)^{1/2} + Ψ_ν(θ_xy)Σ(θ_xy)^{−1/2})′ (g(θ_xy)Σ(θ_xy)^{1/2} + Ψ_ν(θ_xy)Σ(θ_xy)^{−1/2})]^{−1}
       × [(g(θ_xy)Σ(θ_xy)^{1/2} + Ψ_ν(θ_xy)Σ(θ_xy)^{−1/2})′ Σ(θ_xy)^{−1/2} Ψ_a(θ_xy)].

This implies that for identification schemes in Π_θxy, the two-stage least squares estimator is not only biased; it is asymptotically distributed as a ratio of quadratic forms in the jointly distributed normal random variables Ψ_ν(θ_xy) and Ψ_a(θ_xy).

REFERENCES

Bernanke, Ben S. “Alternative Explanations of the Money-Income Correlation,” Carnegie-Rochester Conference Series on Public Policy, vol. 25 (Autumn 1986), pp. 49–99.

Blanchard, Olivier J., and Danny Quah. “The Dynamic Effects of Aggregate Demand and Supply Disturbances,” American Economic Review, vol. 79 (September 1989), pp. 655–73.

Blanchard, Olivier J., and Mark W. Watson. “Are Business Cycles All Alike?” in Robert J. Gordon, ed., The American Business Cycle: Continuity and Change. Chicago: University of Chicago Press, 1984.

Cooley, Thomas F., and Stephen F. LeRoy. “Atheoretical Macroeconometrics: A Critique,” Journal of Monetary Economics, vol. 16 (June 1985), pp. 283–308.

Gali, Jordi.
“Technology, Employment, and the Business Cycle: Do Technology Shocks Explain Aggregate Fluctuations?” Mimeo, New York University, 1996.

Gali, Jordi. “How Well Does the IS-LM Model Fit Postwar U.S. Data?” Quarterly Journal of Economics, vol. 107 (May 1992), pp. 709–38.

Hillier, Grant H. “On the Joint and Marginal Densities of Instrumental Variables Estimators in a General Structural Equation,” Econometric Theory, vol. 1 (April 1985), pp. 53–72.

King, Robert G., Charles I. Plosser, and Sergio T. Rebelo. “Production, Growth, and Business Cycles: II. New Directions,” Journal of Monetary Economics, vol. 21 (March 1988), pp. 309–42.

King, Robert G., Charles I. Plosser, James H. Stock, and Mark W. Watson. “Stochastic Trends and Economic Fluctuations,” American Economic Review, vol. 81 (September 1991), pp. 819–40.

King, Robert G., and Mark W. Watson. “Testing Long-Run Neutrality,” Federal Reserve Bank of Richmond Economic Quarterly, vol. 83 (Summer 1997), pp. 69–101.

Nelson, Charles R., and Richard Startz. “The Distribution of the Instrumental Variables Estimator and its t-Ratio when the Instrument Is a Poor One,” Journal of Business, vol. 63 (January 1990), pp. 125–40.

Phillips, Peter C. B. “Partially Identified Models,” Econometric Theory, vol. 5 (August 1989), pp. 181–240.

Shapiro, Matthew D., and Mark W. Watson. “Sources of Business Cycle Fluctuations,” NBER Working Paper 1246, 1988.

Sims, Christopher A. “Are Forecasting Models Usable for Policy Analysis?” Federal Reserve Bank of Minneapolis Quarterly Review, vol. 10 (Winter 1986), pp. 2–16.

Sims, Christopher A. “Comparison of Interwar and Postwar Business Cycles: Monetarism Reconsidered,” American Economic Review, vol. 70 (May 1980a), pp. 250–59.

Sims, Christopher A. “Macroeconomics and Reality,” Econometrica, vol. 48 (January 1980b), pp. 1–47.

Staiger, Douglas, and James H. Stock. “Instrumental Variables Regressions with Weak Instruments.” Mimeo, Kennedy School of Government, Harvard University, 1993.
Testing Long-Run Neutrality

Robert G. King and Mark W. Watson

Key classical macroeconomic hypotheses specify that permanent changes in nominal variables have no effect on real economic variables in the long run. The simplest “long-run neutrality” proposition specifies that a permanent change in the money stock has no long-run consequences for the level of real output. Other classical hypotheses specify that a permanent change in the rate of inflation has no long-run effect on unemployment (a vertical long-run Phillips curve) or on real interest rates (the long-run Fisher relation). In this article we provide an econometric framework for studying these classical propositions and use the framework to investigate their relevance for the postwar U.S. experience.

Testing these propositions is a subtle matter. For example, Lucas (1972) and Sargent (1971) provide examples in which it is impossible to test long-run neutrality using reduced-form econometric methods. Their examples feature rational expectations together with short-run nonneutrality and exogenous variables that follow stationary processes, so that the data generated by these models do not contain the sustained changes necessary to test long-run neutrality directly. In the context of these models, Lucas and Sargent argued that it was necessary to construct fully articulated behavioral models to test the neutrality propositions. McCallum (1984) extended these arguments and showed that low-frequency band spectral estimators calculated from reduced-form models were also subject to the Lucas-Sargent critique. While these arguments stand on firm logical ground, empirical analysis following the Lucas-Sargent prescriptions has not yet yielded convincing evidence on the neutrality propositions. This undoubtedly reflects a lack of consensus among macroeconomists on the appropriate behavioral model to use for the investigation.
The authors thank Marianne Baxter, Michael Dotsey, Robert Hetzel, Thomas Humphrey, Bennett McCallum, Yash Mehra, James Stock, and many seminar participants for useful comments and suggestions. This research was supported in part by National Science Foundation grants SES-89-10601, SES-91-22463, and SBR-9409629. The views expressed are those of the authors and do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System.

Federal Reserve Bank of Richmond Economic Quarterly, Volume 83/3, Summer 1997

The specific critique offered by Lucas and Sargent depends critically on stationarity. In models in which nominal variables follow integrated processes, long-run neutrality can be defined and tested without complete knowledge of the behavioral model. Sargent (1971) makes this point clearly in his paper, and it is discussed in detail in Fisher and Seater (1993).1 But, even when variables are integrated, long-run neutrality cannot be tested using a reduced-form model. Instead, what is required is the model’s “final form,” showing the dynamic response of the variables to underlying structural disturbances.2 Standard results from the econometric analysis of simultaneous equations show that the final form of a structural model is not econometrically identified, in general, because a set of a priori restrictions is necessary to identify the structural disturbances. Our objective in this article is to summarize the reduced-form information in the postwar U.S. data and relate it to the long-run neutrality propositions under alternative identifying restrictions. We do this by systematically investigating a wide range of a priori restrictions and asking which restrictions lead to rejections of long-run neutrality and which do not.
For example, in our framework the estimated value of the long-run elasticity of output with respect to money depends critically on what is assumed about one of three other elasticities: (i) the impact elasticity of output with respect to money, (ii) the impact elasticity of money with respect to output, or (iii) the long-run elasticity of money with respect to output. We present neutrality test results for a wide range of values for these elasticities, using graphical methods. Our procedure stands in stark contrast to the traditional method of exploring a small number of alternative identifying restrictions, and it has consequent costs and benefits. The key benefit is the extent of the information conveyed: researchers with strong views about plausible values of key parameters can learn about the result of a neutrality test appropriate for their beliefs; other researchers can learn what range of parameter values results in particular conclusions about neutrality. The key cost is that the methods that we use are only practical in small models, and we demonstrate them here using bivariate models. This raises important questions about the effects of potential omitted variables, and we discuss this issue below in the context of specific empirical models.

We organize our discussion as follows. In Section 1 below, we begin with the theoretical problem of testing for neutrality in economies that are consistent with the Lucas-Sargent conclusions. Our goal is to show the restrictions that long-run neutrality imposes on the final-form model, and how these restrictions are related to the degree of integration of the variables. In Section 2, we discuss issues of econometric identification. Section 3 contains an empirical investigation of (i) the long-run neutrality of money, (ii) the long-run superneutrality of money, and (iii) the long-run Fisher relation. Even with an unlimited amount of data, the identification problems discussed above make it impossible to carry out a definitive test of the long-run propositions. Instead, we investigate the plausibility of the propositions across a wide range of observationally equivalent models. In Section 4 we investigate the long-run relation between inflation and the unemployment rate, i.e., the slope of the long-run Phillips curve. Here, the identification problem is more subtle than in the other examples. As we show, the estimated long-run relationship depends in an important way on whether the Phillips curve slope is calculated from a “supply” equation, as in Sargent (1976) for example, or from a “price” equation, as in Solow (1969) or Gordon (1970). Previewing our empirical results, we find unambiguous evidence supporting the neutrality of money but more qualified support for the other propositions. Over a wide range of identifying assumptions, we find there is little evidence in the data against the hypothesis that money is neutral in the long run.

1 Also see Geweke (1986), Stock and Watson (1988), King, Plosser, Stock, and Watson (1991), and Gali (1992).

2 Throughout this article we use the traditional jargon of dynamic linear simultaneous equations. By “structural model” we mean a simultaneous equations model in which each endogenous variable is expressed as a function of the other endogenous variables, exogenous variables, lags of the variables, and disturbances that have structural interpretation. By “reduced-form model” we mean a set of regression equations in which each endogenous variable is expressed as a function of lagged dependent variables and exogenous variables. By “final-form model” we mean a set of equations in which the endogenous variables are expressed as a function of current and lagged values of shocks and exogenous variables in the model. For the standard textbook discussion of these terms, see Goldberger (1964), chapter 7.
Thus the finding that money is neutral in the long run is robust to a wide range of identifying assumptions. Conclusions about the other long-run neutrality propositions are not as unambiguous: these propositions are rejected for a range of identifying restrictions that we find arguably reasonable, but they are not rejected for others. Yet many general conclusions are robust. For example, the rejections of the long-run Fisher effect suggest that a one percentage point permanent increase in inflation leads to a smaller than one percentage point increase in nominal interest rates. Moreover, a wide range of identifying restrictions leads to very small estimates of the long-run effect of inflation on unemployment. On the other hand, the sign and magnitude of the estimated long-run effect of money growth on the level of output depends critically on the specific identifying restriction employed.

1. THE ROLE OF UNIT ROOTS IN TESTS FOR LONG-RUN NEUTRALITY

Early empirical researchers investigated long-run neutrality by examining the coefficients in the distributed lag:

y_t = Σ_j α_j m_{t−j} + error = α(L)m_t + error, (1)

where y is the logarithm of output, m is the logarithm of the money supply, α(L) = Σ_j α_j L^j, and L is the lag operator.3 If m_t is increased by one unit permanently, then (1) implies that y_t will eventually increase by the sum of the α_j coefficients. Hence, investigating the long-run multiplier, α(1) = Σ_j α_j, appears to be a reasonable procedure for investigating long-run neutrality. However, Lucas (1972) and Sargent (1971) demonstrated that in models with short-run nonneutrality and rational expectations, this approach can be very misguided. The Lucas-Sargent critique can be exposited as follows.
Consider a model consisting of an aggregate supply schedule (2a), a monetary equilibrium condition (2b), and a money supply rule (2c):

y_t = θ(p_t − E_{t−1}p_t), (2a)
p_t = m_t − δy_t, and (2b)
m_t = ρm_{t−1} + ε_t^m, (2c)

where y_t is the logarithm of output; p_t is the logarithm of the price level; E_{t−1}p_t is the expectation of p_t formed at t − 1; m_t is the logarithm of the money stock; and ε_t^m is a mean-zero serially independent shock to money. The solution for output is

y_t = π(m_t − E_{t−1}m_t) = π(m_t − ρm_{t−1}) = π(1 − ρL)m_t = α(L)m_t, (3)

with π = θ/(1 + δθ) and α(L) = α_0 + α_1 L = π(1 − ρL). As in Lucas (1973), the model is constructed so that only surprises in the money stock are nonneutral, and these have temporary real effects. Permanent changes in money have no long-run effect on output. However, the reduced-form equation y_t = α(L)m_t suggests that a one-unit permanent increase in money will increase output by α_0 + α_1 = α(1) = π(1 − ρ). Moreover, as noted by McCallum (1984), the reduced form also implies that there is a long-run correlation between money and output, as measured by the spectral density matrix of the variables at frequency zero. On this basis, Lucas (1972), Sargent (1971), and McCallum (1984) argue that a valid test of long-run neutrality can only be conducted by determining the structure of monetary policy (ρ) and its interaction with the short-run response to monetary shocks (π), which depends on the behavioral relations in the model (δ and θ). While this is easy enough to determine in this simple setting, it is much more difficult in richer dynamic models or in models with a more sophisticated specification of monetary policy. However, if ρ = 1, there is a straightforward test of the long-run neutrality proposition in this simple model.

3 See Sargent (1971) for references to these early empirical analyses.
Adding and subtracting ρm_t on the right-hand side of (3) yields

y_t = πρ∆m_t + π(1 − ρ)m_t, (3′)

so that with ρ = 1 there is a zero effect of the level of money under the neutrality restriction. Hence, one can simply examine whether the coefficient on the level of money is zero when m_t is included in a bivariate regression that also involves ∆m_t as a regressor. With permanent variations in the money stock, the reduced form of this simple model has two key properties: (i) the coefficient on m_t corresponds to the experiment of permanently changing the level of the money stock; and (ii) the coefficient on ∆m_t captures the short-run nonneutrality of monetary shocks. Equivalently, with ρ = 1, the neutrality hypothesis implies that in the specification y_t = Σ_j α_j m_{t−j}, the sum of the distributed lag coefficients is zero: α(1) = Σ_j α_j = 0.

While the model in (2a)–(2c) is useful for expositing the Lucas-Sargent critique, it is far too simple to be used in empirical analysis. Standard macroeconomic models include several other important features: shocks other than ε_t^m are incorporated to capture other sources of fluctuations; the simple specification of an exogenous money supply in (2c) is discarded in favor of a specification that allows the money supply to respond to the endogenous variables in the model; and finally, the dynamics of the model are generalized through the incorporation of sticky prices, costs of adjusting output, information lags, etc. In these more general settings, it is still the case that long-run neutrality can sometimes be determined by examining the model’s final form. To see this, consider a macroeconomic model that is linear in both the observed variables and the structural shocks.
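The algebra of equations (3) and (3′) is easy to verify numerically. The sketch below (Python, with illustrative values for θ, δ, and ρ) shows that the naive long-run multiplier α(1) = π(1 − ρ) is nonzero even though the model is neutral by construction, and that the coefficient on the level of money collapses to zero exactly when ρ = 1:

```python
# Reduced form of the model (2a)-(2c): y_t = pi*(1 - rho*L)*m_t,
# so alpha(L) = pi - pi*rho*L. Parameter values are illustrative only.
theta, delta, rho = 2.0, 0.5, 0.8

pi_ = theta / (1 + delta * theta)   # impact response to a money surprise
alpha = [pi_, -pi_ * rho]           # distributed-lag coefficients alpha_0, alpha_1

# Naive long-run multiplier: the sum alpha(1) of the lag coefficients.
# It equals pi*(1 - rho), which is nonzero whenever rho < 1 -- even though
# anticipated money has no real effect in this model.
long_run_multiplier = sum(alpha)

# Equation (3'): y_t = pi*rho*dm_t + pi*(1 - rho)*m_t. The coefficient on the
# level of money, pi*(1 - rho), vanishes exactly when rho = 1.
coef_growth = pi_ * rho
coef_level = pi_ * (1 - rho)
coef_level_unit_root = pi_ * (1 - 1.0)
```

With ρ = 1 the level coefficient is identically zero, which is why a unit root in money turns α(1) = 0 into a testable neutrality restriction.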
Then, if the growth rates of both output and money are stationary, the model’s final form can be written as

∆y_t = µ_y + θ_yη(L)ε_t^η + θ_ym(L)ε_t^m and (4a)
∆m_t = µ_m + θ_mη(L)ε_t^η + θ_mm(L)ε_t^m, (4b)

where ε_t^η is a vector of shocks, other than money, that affect output; θ_mm(L)ε_t^m = Σ_j θ_mm,j ε_{t−j}^m; and the other terms are similarly defined. Rich dynamics are incorporated in the model via the lag polynomials θ_yη(L), θ_ym(L), θ_mη(L), and θ_mm(L). These final-form lag polynomials will be functions of the model’s behavioral parameters in a way that depends on the specifics of the model, but the particular functional relation need not concern us here.

The long-run neutrality tests that we conduct all involve the answer to the following question: does an unexpected and exogenous permanent change in the level of m lead to a permanent change in the level of y? If the answer is no, then we say that m is long-run neutral towards y. In equations (4a) and (4b), the ε_t^m are exogenous unexpected changes in money. The permanent effect of ε_t^m on future values of m is given by Σ_j θ_mm,j ε_t^m = θ_mm(1)ε_t^m. Similarly, the permanent effect of ε_t^m on future values of y is given by Σ_j θ_ym,j ε_t^m = θ_ym(1)ε_t^m. Thus, the long-run elasticity of output with respect to permanent exogenous changes in money is

γ_ym = θ_ym(1)/θ_mm(1). (5)

Within this context, we say that the model exhibits long-run neutrality when γ_ym = 0. That is, the model exhibits long-run neutrality when the exogenous shocks that permanently alter money, ε_t^m, have no permanent effect on output. In an earlier version of this article (King and Watson 1992) and in King and Watson (1994), we explored the relationship between the restriction γ_ym = 0 and the traditional notion of long-run neutrality using a dynamic linear rational expectations model with sluggish short-run price adjustment.
We required that the model display theoretical neutrality, in that its real variables were invariant to proportionate changes in all nominal variables. We showed that this requirement implied long-run neutrality in the sense investigated here. That is, unexpected permanent changes in m_t had no effect on y_t. Further, like the simple example presented in equations (2) and (3) above, the model also implied that long-run neutrality could be tested within a system like (4) if (and only if) the money stock is integrated of order one. Finally, in the theoretical model, long-run neutrality implied that γ_ym = 0.

In the context of equations (4a)–(4b), the long-run neutrality restriction γ_ym = 0 can only be investigated when money is integrated. If the money process does not contain a unit root, then there are no permanent changes in the level of m_t and θ_mm(1) = 0. In this case, γ_ym in (5) is undefined, and the model’s final form says nothing about long-run neutrality. This is the point of the Lucas-Sargent critique. The intuition underlying this result is simple: long-run neutrality asks whether a permanent change in money will lead to a permanent change in output. If permanent changes in money did not occur in the historical data (that is, money is stationary), then these data are uninformative about long-run neutrality. On the other hand, when the exogenous changes in money permanently alter the level of m, then θ_mm(1) ≠ 0, money has a unit root, γ_ym is well defined in (5), and the question of long-run neutrality can be answered from the final form of the model.

2. ECONOMETRIC ISSUES

In general, it is not possible to use data to determine the parameters of the final-form equations (4a)–(4b). Econometric identification problems must first be solved. We approach the identification problem in an unusual way. Rather than “solve” it by imposing a single set of a priori restrictions, our empirical strategy is to investigate long-run neutrality for a large set of observationally equivalent models. Our hope is that this will provide researchers with a clearer sense of the robustness of any conclusions about long-run neutrality. Before presenting the empirical results, we review the issues of econometric identification that arise in the estimation of sets of equations like (4a) and (4b). This discussion motivates the set of observationally equivalent models analyzed in our empirical work.

To begin, assume that ε_t = (ε_t^η, ε_t^m)′ is a vector of unobserved mean-zero serially independent random variables, so that (4a)–(4b) can be interpreted as a vector moving average model. The standard estimation strategy begins by inverting the moving average model to form a vector autoregressive model (VAR). The VAR, which is assumed to be finite order, is then analyzed as a dynamic linear simultaneous equations model.4 We will work within this framework. Estimation and inference in this framework requires two distinct sets of assumptions. The first set of assumptions is required to transform the vector moving average model into a VAR. The second set of assumptions is required to econometrically identify the parameters of the VAR. These sets of assumptions are intimately related: the moving average model can only be inverted if the VAR includes enough variables to reconstruct the structural shocks. In the context of (4a)–(4b), if ε_t is an n × 1 vector, then there must be at least n variables in the VAR. But identification of an n-variable VAR requires n × (n − 1) a priori restrictions, so that the necessary number of identifying restrictions increases with the square of the number of structural shocks. In our empirical analysis we will assume that n = 2, so that only bivariate VARs are required.
To us, this seems the natural starting point, and it has been employed by many other researchers in the study of the neutrality propositions discussed below. We also do this for tractability: when n = 2, only two identifying restrictions are necessary. This allows us to investigate thoroughly the set of observationally equivalent models. The cost of this simplification is that some of our results may be contaminated by omitted variables bias. We discuss this possibility more in the context of the empirical results.

To derive the set of observationally equivalent models, let X_t = (∆y_t, ∆m_t)′, and stack (4a)–(4b) as

X_t = Θ(L)ε_t, (6)

where ε_t = (ε_t^η, ε_t^m)′ is the 2 × 1 vector of structural disturbances. Assume that |Θ(z)| has all of its zeros outside the unit circle, so that Θ(L) can be inverted to yield the VAR:5

α(L)X_t = ε_t, (7)

where α(L) = Σ_{j=0}^∞ α_j L^j, with α_j a 2 × 2 matrix. Unstacking the ∆y_t and ∆m_t equations yields

∆y_t = λ_ym ∆m_t + Σ_{j=1}^p α_j,yy ∆y_{t−j} + Σ_{j=1}^p α_j,ym ∆m_{t−j} + ε_t^η and (8a)
∆m_t = λ_my ∆y_t + Σ_{j=1}^p α_j,my ∆y_{t−j} + Σ_{j=1}^p α_j,mm ∆m_{t−j} + ε_t^m, (8b)

which is written under the assumption that the VAR in (7) is of order p. Equation (7), or equivalently equations (8a) and (8b), is a set of dynamic simultaneous equations, and econometric identification can be studied in the usual way. Writing Σ = E(ε_t ε_t′), the reduced form of (7) is

X_t = Σ_{i=1}^p Φ_i X_{t−i} + e_t, (9)

where Φ_i = −α_0^{−1}α_i and e_t = α_0^{−1}ε_t. The matrices α_i and Σ are determined by the set of equations

α_0^{−1}α_i = −Φ_i, i = 1, . . . , p, and (10)
α_0^{−1} Σ α_0^{−1}′ = Σ_e = E(e_t e_t′). (11)

When there are no restrictions on the coefficients on lags entering (9), equation (10) imposes no restrictions on α_0; it serves to determine the α_i as a function of α_0 and the Φ_i. Equation (11) determines both α_0 and Σ as a function of Σ_e.

4 Standard references are Blanchard and Watson (1986), Bernanke (1986), and Sims (1986). See Watson (1994) for a survey.
Since Σ_e (a 2 × 2 symmetric matrix) has only three unique elements, only three unknown parameters in α_0 and Σ can be identified. Equations (8a) and (8b) place 1s on the diagonal of α_0, but evidently only three of the remaining parameters var(ε_t^m), var(ε_t^η), cov(ε_t^m, ε_t^η), λ_my, and λ_ym can be identified. We follow the standard practice in structural VAR analysis and assume that the structural shocks are uncorrelated. Since λ_my and λ_ym are allowed to be nonzero, the assumption places no restriction on the contemporaneous correlation between y and m. Moreover, nonzero values of λ_my and λ_ym allow both y and m to respond to ε^m and ε^η shocks within the period. With the assumption that cov(ε_t^m, ε_t^η) = 0, only one additional identifying restriction is required.

Where might this additional restriction come from? One approach is to assume that the model is recursive, so that either λ_my = 0 or λ_ym = 0. Geweke (1986), Stock and Watson (1988), Rotemberg, Driscoll, and Poterba (1995), and Fisher and Seater (1993) present tests for neutrality under the assumption that λ_ym = 0; Geweke (1986) also presents results under the assumption that λ_my = 0. Alternatively, neutrality might be assumed, and the restriction γ_ym = 0 used to identify the model. This assumption has been used by Gali (1992), by King, Plosser, Stock, and Watson (1991), by Shapiro and Watson (1988), and by others to disentangle the structural shocks ε^m and ε^η. Finally, an assumption such as γ_my = 1 might be used to identify the model; this assumption is consistent with long-run price stability under the assumption of stable velocity. The approach that we take in the empirical section is more eclectic and potentially more informative.

5 The unit roots discussion of Section 1 is important here, since the invertibility of Θ(L) requires that Θ(1) has full rank. This implies that y_t and m_t are both integrated processes, and (y_t, m_t) are not cointegrated.
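The counting argument can be made concrete. Under the recursive restriction λ_my = 0, for example, the three remaining free parameters can be recovered in closed form from the three unique elements of Σ_e. A minimal numerical sketch (Python/NumPy, with hypothetical parameter values):

```python
import numpy as np

# Hypothetical structural parameters for (8a)-(8b) with lambda_my = 0 imposed.
lam_ym, var_eta, var_m = 0.5, 1.0, 2.0

# With lambda_my = 0 the reduced-form innovations are
#   e_y = eps_eta + lam_ym * eps_m   and   e_m = eps_m,
# so the reduced-form covariance matrix Sigma_e is:
Sigma_e = np.array([
    [var_eta + lam_ym**2 * var_m, lam_ym * var_m],
    [lam_ym * var_m,              var_m],
])

# Sigma_e has three unique elements; with lambda_my = 0 imposed, exactly
# three structural parameters are recoverable from it:
var_m_hat = Sigma_e[1, 1]
lam_ym_hat = Sigma_e[0, 1] / Sigma_e[1, 1]
var_eta_hat = Sigma_e[0, 0] - lam_ym_hat**2 * var_m_hat
```

A fourth free parameter (say, a nonzero λ_my) would require a fourth equation that Σ_e cannot supply, which is why one identifying restriction beyond cov(ε^m, ε^η) = 0 is needed.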
Rather than report results associated with a single identifying restriction, we summarize results for a wide range of observationally equivalent estimated models. This allows the reader to gauge the robustness of conclusions about γ_ym and long-run neutrality to specific assumptions about λ_ym, λ_my, or γ_my. Our method is in the spirit of robustness calculations carried out by sophisticated users of structural VARs such as Sims (1989) and Blanchard (1989).

3. EVIDENCE ON THE NEUTRALITY PROPOSITIONS IN THE POSTWAR U.S. ECONOMY

While our discussion has focused on the long-run neutrality of money, we can test a range of related long-run neutrality propositions by varying the definition of X_t in equation (7). As we have shown, using X_t = (∆y_t, ∆m_t)′, with m_t assumed to follow an I(1) process, the model can be used to investigate the neutrality of money. If the process describing m_t is I(2) rather than I(1), then the framework can be used to investigate superneutrality by using X_t = (∆y_t, ∆²m_t)′.6 In economies in which the rate of inflation, π_t, and the nominal interest rate, R_t, follow integrated processes, we can study the long-run effect of inflation on real interest rates by setting X_t = (∆π_t, ∆R_t)′. Finally, if both the inflation rate and the unemployment rate are I(1), then the slope of the long-run Phillips curve can be investigated using X_t = (∆π_t, ∆u_t)′.

We investigate these four long-run neutrality hypotheses using postwar quarterly data for the United States. We use gross national product for output;

6 Long-run neutrality cannot be tested in a system in which output is I(1) and money is I(2). Intuitively this follows because neutrality concerns the relationship between shocks to the level of money and to the level of output. When money is I(2), shocks affect the rate of growth of money, and there are no shocks to the level of money.
To see this formally, write equation (8a) as α_yy(L)∆y_t = α_ym(L)∆m_t + ε_t^η = α_ym(1)∆m_t + α*_ym(L)∆²m_t + ε_t^η, where α*_ym(L) = (1 − L)^{−1}[α_ym(L) − α_ym(1)]. When money is I(1), the neutrality restriction is α_ym(1) = 0. But when money is I(2) and output is I(1), α_ym(1) = 0 by construction. (When α_ym(1) ≠ 0, output is I(2).) For a more detailed discussion of neutrality restrictions with possibly different orders of integration, see Fisher and Seater (1993).

money is M2; unemployment is the civilian unemployment rate; price inflation is calculated from the consumer price index; and the nominal interest rate is the yield on three-month Treasury bills.7 Since the unit root properties of the data play a key role in the analysis, Table 1 presents statistics describing these properties of the data. We use two sets of statistics: (i) augmented Dickey-Fuller (ADF) t-statistics and (ii) 95 percent confidence intervals for the largest autoregressive root. (These were constructed from the ADF statistics using Stock’s [1991] procedure.) The ADF statistics indicate that unit roots cannot be rejected at the 5 percent level for any of the series. From this perspective, output (y_t), money (m_t), money growth (∆m_t), inflation (π_t), unemployment (u_t), and nominal interest rates (R_t) all can be taken to possess the nonstationarity necessary for investigating long-run neutrality using the final form (7). Moreover, a unit root cannot be rejected for r_t = R_t − π_t, consistent with the hypothesis that R_t and π_t are not cointegrated. However, the confidence intervals are very wide, suggesting a large amount of uncertainty about the unit root properties of the data. For example, the real GNP data are consistent with the hypothesis that the process is I(1), but also are consistent with the hypothesis that the data are trend stationary with an autoregressive root of 0.89.
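As an aside, the ADF statistics in Table 1 come from regressions of ∆x_t on a constant, the lagged level x_{t−1}, and lagged differences, with the t-statistic on the lagged level compared against Dickey-Fuller (not normal) critical values. A rough sketch of the demeaned (τ̂_µ) variant on simulated data; the series and lag length are illustrative, not the article's data:

```python
import numpy as np

def adf_t_stat(x, lags=6):
    """t-statistic on gamma in dx_t = c + gamma*x_{t-1} + sum_j phi_j*dx_{t-j} + u_t.

    Demeaned ("tau_mu") ADF regression; the statistic is compared against
    Dickey-Fuller critical values (about -2.86 at the 5 percent level).
    """
    dx = np.diff(x)
    rows, y = [], []
    for t in range(lags, len(dx)):
        # Regressors: constant, lagged level, and `lags` lagged differences.
        rows.append([1.0, x[t]] + list(dx[t - lags:t][::-1]))
        y.append(dx[t])
    X, y = np.array(rows), np.array(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(0)
shocks = rng.standard_normal(2000)
random_walk = np.cumsum(shocks)      # unit root: gamma = 0
stationary = np.zeros(2000)          # AR(1) with root 0.5: gamma = -0.5
for t in range(1, 2000):
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]
```

For the stationary series the statistic falls far below −2.86, so the unit root is rejected; for the random walk it typically is not, mirroring the failures to reject in Table 1.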
The money supply data are consistent with the trend stationary, I(1), and I(2) hypotheses. The results in Table 1 suggest that while it is reasonable to carry out an empirical investigation of the neutrality propositions predicated on integrated processes, as is usual in models with unit root identifying restrictions, the results must be interpreted with some caution.

Our empirical investigation centers around the four economic interpretations of equation (7) discussed above. For each interpretation, we estimate the model using the following identifying assumptions: (i) α_0 has 1s on the diagonal, (ii) Σ is diagonal, and, defining X_t = (x_t1, x_t2)′, one of the following:

(iii.a) the impact elasticity of x1 with respect to x2 is known (e.g., λ_ym is known in the money-output system),

7 Data sources: Output: Citibase series GNP82 (real GNP). Money: The monthly Citibase M2 series (FM2) was used for 1959–1989; the earlier data were formed by splicing the M2 series reported in Banking and Monetary Statistics, 1941–1970, Board of Governors of the Federal Reserve System, to the Citibase data in January 1959. Inflation: Log first differences of Citibase series PUNEW (CPI-U: All Items). Unemployment Rate: Citibase series LHUR (unemployment rate: all workers, 16 years and over [percent, sa]). Interest Rate: Citibase series FYGM3 (yield on three-month U.S. Treasury bills). Monthly series were averaged to form the quarterly data.

Table 1 Unit Root Statistics

Variable   ADF τ̂_τ   ADF τ̂_µ   95% CI for ρ, Detrended Data   95% CI for ρ, Demeaned Data
y_t        −2.53      —          (0.89, 1.02)                   —
m_t        −2.40      —          (0.90, 1.03)                   —
∆m_t       −2.76      −2.90      (0.86, 1.02)                   (0.84, 1.01)
π_t        −3.27      −2.86      (0.81, 1.02)                   (0.84, 1.02)
u_t        −3.35      −2.34      (0.81, 1.01)                   (0.89, 1.02)
R_t        −3.08      −1.87      (0.84, 1.02)                   (0.92, 1.02)
r_t        −3.34      −2.94      (0.82, 1.02)                   (0.85, 1.01)

Notes: The regressions used to calculate the ADF statistics included six lagged differences of the variable.
All regressions were carried out over the period 1949:1 to 1990:4 using quarterly data, except those involving u_t, which began in 1950:1. The variables y_t and m_t are the logarithms of output and money multiplied by 400, so that their first differences represent rates of growth at annual rates; similarly, π_t represents price inflation at an annual rate. The 95 percent confidence intervals were based on the ADF statistics using the procedure developed in Stock (1991).

(iii.b) the impact elasticity of x2 with respect to x1 is known (e.g., λ_my is known in the money-output system),

(iii.c) the long-run elasticity of x1 with respect to x2 is known (e.g., γ_ym is known in the money-output system),

(iii.d) the long-run elasticity of x2 with respect to x1 is known (e.g., γ_my is known in the money-output system).

The models are estimated using simultaneous equation methods. The details are provided in the appendix, but the basic strategy is quite simple, and we describe it here using the money-output system. If λ_ym in (8a) were known, then the equation could be estimated by regressing ∆y_t − λ_ym ∆m_t onto the lagged values of the variables in the equation. However, the money supply equation (8b) cannot be estimated by ordinary least squares regression, since it contains ∆y_t, which is potentially correlated with the error term. The maximum likelihood estimator of this equation is constructed by instrumental variables, using the residual from the estimated output supply equation together with lags of ∆m_t and ∆y_t as instruments. The residual is a valid instrument because of assumption (ii). In the appendix we show how a similar procedure can be used when assumptions (iii.b)–(iii.d) are maintained. Formulae for the standard errors of the estimators are also provided in the appendix.

We report results for a wide range of values of the parameters in assumptions (iii.a)–(iii.d). All of the models include six lags of the relevant variables.
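The instrumental-variables logic can be illustrated in a stripped-down, static version of (8a)–(8b) with no lags; the parameter values below are hypothetical. OLS on the money equation is biased by simultaneity, while the residual from the output equation, computed with λ_ym treated as known, is a valid instrument:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
lam_ym, lam_my = 0.5, 0.2   # hypothetical structural parameters

# Uncorrelated structural shocks (assumption (ii)).
eps_eta = rng.standard_normal(n)
eps_m = rng.standard_normal(n)

# Solve the simultaneous system dy = lam_ym*dm + eps_eta, dm = lam_my*dy + eps_m.
det = 1 - lam_ym * lam_my
dy = (lam_ym * eps_m + eps_eta) / det
dm = (lam_my * eps_eta + eps_m) / det

# Step 1: with lam_ym known, the output-equation residual recovers the real shock.
u = dy - lam_ym * dm

# OLS on the money equation is inconsistent: dy is correlated with eps_m.
lam_my_ols = (dy @ dm) / (dy @ dy)

# IV using u as the instrument for dy is consistent, because u is
# uncorrelated with eps_m by assumption (ii).
lam_my_iv = (u @ dm) / (u @ dy)
```

In the article's actual procedure the instrument list also includes lags of ∆m_t and ∆y_t; the residual is a valid instrument for exactly the reason stated in the text, the assumed orthogonality of the structural shocks.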
The sample period is 1949:1–1990:4 for the models that did not include the unemployment rate; when the unemployment rate was included in the model, the sample period is 1950:1–1990:4. Data prior to the initial periods were used as lags in the regressions. The robustness of the results to the choice of lag length and sample period is discussed below. We now discuss the empirical evidence on the four long-run neutrality propositions.

Neutrality of Money

Figure 1 plots the estimates of the stochastic trends or permanent components in output and money. These were computed as the multivariate Beveridge-Nelson (1981) trends from the estimated bivariate VAR. Also shown in the graph are the NBER business cycle peak and trough dates. Changes in these series at a given date represent changes in the long-run forecasts of output and money associated with the VAR residuals at that date.8 A scatterplot of these residuals, or innovations in the stochastic trends, is shown in Figure 2. The simple correlation between these innovations is −0.25. Thus, money and output appear to have a negative long-run correlation, at least over this sample period. The important question is the direction of causation explaining this correlation. Simply put, does money cause output or vice versa? This question cannot be answered without an identifying restriction, and we now present results for a range of different identifying assumptions.

Since we estimate the final form (7) using literally hundreds of different identifying assumptions, there is a tremendous amount of information that can potentially be reported. In Figure 3 we summarize the information on long-run neutrality. Figure 3 presents the point estimates and 95 percent confidence intervals for γ_ym for a wide range of values of λ_my (panel A), λ_ym (panel B), and γ_my (panel C). Long-run neutrality is not rejected at the 5 percent level if γ_ym = 0 is contained in the 95 percent confidence interval.
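The stochastic trends plotted in Figure 1 are multivariate Beveridge-Nelson trends; the univariate version conveys the idea. If growth ∆y_t follows an AR(1) with mean µ and parameter φ, the permanent component is the current level plus all expected future above-trend growth, τ_t = y_t + [φ/(1 − φ)](∆y_t − µ). A sketch with made-up numbers:

```python
import numpy as np

# Illustrative AR(1) growth dynamics: (dy_t - mu) = phi*(dy_{t-1} - mu) + e_t.
phi, mu = 0.4, 0.5

def bn_trend(y, phi, mu):
    """Beveridge-Nelson permanent component: current level plus expected
    future above-trend growth, tau_t = y_t + (phi/(1-phi))*(dy_t - mu)."""
    dy = np.diff(y)
    return y[1:] + (phi / (1 - phi)) * (dy - mu)

def long_run_forecast(y_t, dy_t, phi, mu, horizon=200):
    """Direct check: iterate E_t[dy_{t+j}] = mu + phi**j*(dy_t - mu) forward,
    then remove the deterministic drift; the limit is the BN trend."""
    level, growth_dev = y_t, dy_t - mu
    for _ in range(horizon):
        growth_dev *= phi
        level += mu + growth_dev
    return level - horizon * mu

y = np.array([0.0, 0.7, 1.1, 1.9])   # made-up (log) output levels
tau = bn_trend(y, phi, mu)
```

Changes in τ_t are exactly the revisions in long-run forecasts mentioned in the text; the bivariate version behind Figure 1 replaces the AR(1) with the estimated VAR.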
For example, from panel A, when λ_my = 0, the point estimate for γ_ym is 0.23 and the 95 percent confidence interval is −0.18 ≤ γ_ym ≤ 0.64. Thus, when λ_my = 0, the data do not reject the long-run neutrality hypothesis. Indeed, as is evident from the figure, long-run neutrality cannot be rejected at the 5 percent level for any value of λ_my ≤ 1.40. Thus, the interpretation of the evidence on long-run neutrality depends critically on the assumed value of λ_my. The precise value of λ_my depends on the money supply process. For example, if the central bank’s reserve position is adjusted to smooth interest rates, then m_t will adjust to accommodate shifts in money demand arising from changes in y_t. In this case, λ_my corresponds to the short-run elasticity of money demand, and a reasonable range of values is 0.1 ≤ λ_my ≤ 0.6. For all values of λ_my in this range, the null hypothesis of long-run neutrality cannot be rejected.

Panel B of Figure 3 shows that long-run neutrality is not rejected for values of λ_ym > −4.61. Since traditional monetary models of the business cycle imply that λ_ym ≥ 0 (output does not decline on impact in response to a monetary expansion), the results in panel B again suggest that the data are consistent with the long-run neutrality hypothesis. Finally, the results in panel C suggest that the long-run neutrality hypothesis cannot be rejected for the entire range of values of γ_my shown in Figure 3.

8 Because the VAR residuals sum to zero over the entire sample, the trends are constrained to equal zero in the final period. In addition, they are normalized to equal zero in the initial period. This explains their “Brownian Bridge” behavior.

Figure 1 Stochastic Trends
[Panel A: Stochastic Trend in Real Output; Panel B: Stochastic Trend in Nominal Money. Time series plots of percent against date, roughly 1950–1990.]
To interpret the results in this figure, recall that γmy represents the long-run response of mt to exogenous permanent shifts in the level of yt. If (M2) velocity is reasonably stable over long periods, then price stability would require γmy = 1. Consequently, values of γmy < 1 represent long-run deflationary policies and γmy > 1 represent long-run inflationary policies. Thus, when γmy = 1 + δ, the long-run level of prices increases by δ percent when the long-run level of output increases by 1 percent. In the figure we show that long-run neutrality cannot be rejected for values of γmy as large as 2.5; we have estimated the model using values of γmy as large as 5.7 and found no rejections of the long-run neutrality hypothesis.

[Figure 2: Innovations in Stochastic Trends. Scatterplot of the innovations in the stochastic trend in real output (vertical axis) against the innovations in the stochastic trend in nominal money (horizontal axis).]

An alternative way to interpret the evidence from panels A–C of Figure 3 is to use long-run neutrality as an identifying restriction and to estimate the other parameters of the model. From the figure, when γym = 0, the point estimates are λmy = 0.22, λym = −0.59, and γmy = −0.51, and the implied 95 percent confidence intervals are −0.18 ≤ λmy ≤ 0.62, −1.93 ≤ λym ≤ 0.74, and −2.1 ≤ γmy ≤ 1.06. By definition, these intervals contain the true values of λmy, λym, and γmy 95 percent of the time, if long-run neutrality is true. Thus, if the confidence intervals contain only nonsensical values of these parameters, then this provides evidence against long-run neutrality. We find that the confidence intervals include many reasonable values of the parameters and conclude that they provide little evidence against the neutrality hypothesis.

[Figure 3: Money and Output. Panel A: 95 percent confidence interval for γym as a function of λmy. Panel B: 95 percent confidence interval for γym as a function of λym. Panel C: 95 percent confidence interval for γym as a function of γmy. Panel D: 95 percent confidence ellipse when γym = 0.]

Multivariate confidence intervals can also be constructed. Panel D of Figure 3 provides an example. It shows the 95 percent confidence ellipse for (λmy, λym) constructed under the assumption of long-run neutrality.9 If long-run neutrality holds, then 95 percent of the time this ellipse will cover the true values of the pair (λym, λmy). Thus, if reasonable values for the pair of parameters are not included in this ellipse, then this provides evidence against long-run neutrality.

Table 2 summarizes selected results for variations in the specification. The VAR lag length (6 in the results discussed above) is varied between 4 and 8, and the model is estimated over various subsamples. Overall, the table suggests that the results are robust to these changes in the specification.10

These conclusions are predicated on the two-shock model that forms the basis of the bivariate specification. That is, the analysis is based on the assumption that money and output are driven by only two structural disturbances, here interpreted as a monetary shock and a real shock. This is clearly wrong, as there are many sources of real shocks (productivity, oil prices, tax rates, etc.) and nominal shocks (factors affecting both money supply and money demand).
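The confidence-ellipse construction behind panel D can be sketched as a Wald quadratic form: a candidate parameter pair lies inside the 95 percent ellipse when (θ − θ̂)′V⁻¹(θ − θ̂) is below the chi-squared(2) critical value of 5.99. The point estimate below is the one reported in the text under γym = 0; the covariance matrix V is purely hypothetical, supplied only to make the sketch runnable.

```python
import numpy as np

# Wald-ellipse membership test for a two-parameter confidence region.

def in_ellipse(theta, theta_hat, V, crit=5.99):
    """True when theta lies inside the 95% ellipse centered at theta_hat."""
    d = np.asarray(theta, dtype=float) - np.asarray(theta_hat, dtype=float)
    return float(d @ np.linalg.solve(V, d)) <= crit

theta_hat = np.array([0.22, -0.59])          # (lambda_my, lambda_ym) when gamma_ym = 0
V = np.array([[0.04, 0.00], [0.00, 0.45]])   # hypothetical covariance matrix

inside = in_ellipse([0.22, -0.59], theta_hat, V)   # the estimate is always inside
```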
However, deducing the effects of these omitted variables on the analysis is difficult, since what matters is both the relative variability of these different shocks and their different dynamic effects on y and m. Indeed, as shown in Blanchard and Quah (1989), a two-shock model will provide approximately correct answers if the dynamic responses of y and m to shocks with large relative variances are sufficiently similar.

Superneutrality of Money

Evidence on the superneutrality of money is summarized in Figure 4 and in panel B of Table 2. Figure 4 is read the same way as Figure 3, except that now the experiment involves the effects of changes in the rate of growth of money, so that the parameters are λ∆m,y, λy,∆m, γ∆m,y, and γy,∆m. There are two substantive conclusions to be drawn from the table and figure.

Footnote 9: This confidence ellipse is computed in the usual way. For example, see Johnston (1984), p. 190.

Footnote 10: These results are not robust to certain other changes in the specification. For example, Rotemberg, Driscoll, and Poterba (1995) report results using monthly data on M2 and U.S. Industrial Production (IP) for a specification that includes a linear time trend, 12 monthly lags, and is econometrically identified using the restriction that λmy = 0. These authors report an estimate of γym = 1.57 that is significantly different from zero and thus reject long-run neutrality. Stock and Watson (1988) report a similar finding using monthly data on IP and M1. The sample period and output measure seem to be responsible for the differences between these results and those reported here. For example, assuming λym = 0 and using quarterly IP and M2 results in an estimated value of γym of 0.43 (0.31) using data from 1949:1 to 1990:4. (The standard error of the estimate is shown in parentheses.) As in Table 2, when the sample is split and the model estimated over the periods 1949:1 to 1972:4 and 1973:1 to 1990:4, the resulting estimates are 0.56 (0.37) and 1.32 (0.70). Thus, point estimates of γym are larger using IP in place of real GNP, and tend to increase in the second half of the sample period.

Table 2: Robustness to Sample Period and Lag Length

A. Neutrality of Money, Xt = (∆mt, ∆yt): estimates of γym when

Sample Period   Lags   λmy = 0        λym = 0        γmy = 0
1949–1990       6      0.23 (0.21)    0.17 (0.19)    0.18 (0.08)
1949–1972       6      0.15 (0.24)    0.13 (0.24)    0.04 (0.05)
1973–1990       6      0.77 (0.47)    0.65 (0.37)    0.40 (0.18)
1949–1990       4      0.24 (0.17)    0.20 (0.15)    0.15 (0.06)
1949–1990       8      0.12 (0.19)    0.07 (0.17)    0.26 (0.08)

B. Superneutrality of Money, Xt = (∆²mt, ∆yt): estimates of γy,∆m when

Sample Period   Lags   λ∆m,y = 0      λy,∆m = 0      γ∆m,y = 0
1949–1990       6      3.80 (1.74)    3.12 (1.36)    −0.95 (1.57)
1949–1972       6      3.50 (1.66)    3.32 (1.49)    1.67 (1.99)
1973–1990       6      4.02 (4.57)    2.65 (2.62)    −4.11 (1.14)
1949–1990       4      1.81 (0.90)    1.31 (0.63)    −1.55 (0.97)
1949–1990       8      3.94 (1.81)    3.43 (1.53)    0.10 (1.66)

C. Long-Run Fisher Effect, Xt = (∆πt, ∆Rt): estimates of γRπ when

Sample Period   Lags   λπR = 0        λRπ = 0        γπR = 0
1949–1990       6      0.34 (0.09)    −0.12 (0.19)   0.08 (0.12)
1949–1972       6      0.07 (0.06)    0.04 (0.27)    0.03 (0.09)
1973–1990       6      0.53 (0.16)    0.02 (0.25)    0.23 (0.20)
1949–1990       4      0.28 (0.07)    −0.04 (0.17)   0.07 (0.09)
1949–1990       8      0.39 (0.09)    −0.18 (0.18)   0.14 (0.13)

D. Long-Run Phillips Curve, Xt = (∆πt, ∆ut): estimates of γuπ when

Sample Period   Lags   λπu = 0        λuπ = 0        γπu = 0
1950–1990       6      0.03 (0.09)    0.06 (0.09)    −0.17 (0.11)
1950–1972       6      −0.04 (0.10)   −0.03 (0.09)   −0.07 (0.14)
1973–1990       6      0.29 (0.35)    0.51 (0.56)    −0.21 (0.16)
1950–1990       4      −0.03 (0.06)   −0.00 (0.05)   −0.18 (0.07)
1950–1990       8      0.08 (0.09)    0.12 (0.09)    −0.11 (0.10)

Note: Standard errors are shown in parentheses.

[Figure 4: Money Growth and Output. Panel A: 95 percent confidence interval for γy,∆m as a function of λ∆m,y. Panel B: 95 percent confidence interval for γy,∆m as a function of λy,∆m. Panel C: 95 percent confidence interval for γy,∆m as a function of γ∆m,y. Panel D: 95 percent confidence ellipse when γy,∆m = 0.]

The first conclusion is that it is possible to find evidence against superneutrality. For example, superneutrality is rejected at the 5 percent level for all values of λ∆m,y between −0.25 and 0.08, and for all values of λy,∆m between −0.26 and 1.02. On the other hand, the figures suggest that these rejections are marginal, and the rejections are not robust to all of the lag-length and sample-period specification changes reported in Table 2. Moreover, a wide range of (arguably) reasonable identifying restrictions lead to the conclusion that superneutrality cannot be rejected. For example, superneutrality is not rejected for any value of λ∆m,y in the interval 0.08 to 0.53. Because of the lags in the model, the impact multiplier λ∆m,y has the same interpretation as λmy in the discussion of long-run neutrality, and we argued above that the interval (0.08, 0.53) was a reasonable range of values for this parameter. In addition, from panel C, superneutrality cannot be rejected for values of γ∆m,y < 0.07. To put this into perspective, note that γ∆m,y measures the long-run elasticity of the rate of growth of money with respect to permanent changes in the level of output. Thus a value of γ∆m,y = 0 corresponds to a non-accelerationist policy.

The second substantive conclusion is that the identifying assumption has a large effect on the sign and the magnitude of the estimated value of γy,∆m. For example, when λ∆m,y = 0 the estimated value of γy,∆m is 3.8.
Thus, a 1 percent permanent increase in the money growth rate is estimated to increase the flow of output by 3.8 percent per year in perpetuity. Our sense is that even those who believe that the Tobin (1965) effect is empirically important do not believe that it is this large. The estimated value of γy,∆m falls sharply as λ∆m,y is increased, reaching zero when λ∆m,y = 0.30. For values of λ∆m,y > 0.30, the point estimate of γy,∆m is negative, consistent with the predictions of cash-in-advance models in which sustained inflation is a tax on investment activity (Stockman 1981) or on labor supply (Aschauer and Greenwood 1983 or Cooley and Hansen 1989).

The Fisherian Theory of Inflation and Interest Rates

In the Fisherian theory of interest, the interest rate is determined as the sum of a real component, rt, and an expected inflation component, Etπt+1. A related long-run neutrality proposition—also suggested by Fisher—is that the level of the real interest rate is invariant to permanent changes in the rate of inflation. If inflation is integrated, then this proposition can be investigated using our framework: when Xt = (∆πt, ∆Rt), then permanent changes in πt will have no effect on real interest rates when γRπ = 1.

We find mixed evidence against the classical Fisherian link between long-run components of inflation and nominal interest rates, interpreted here as γRπ = 1. For example, from Figure 5, maintaining a positive value of either λπR or γπR leads to an estimate of γRπ that is significantly less than 1. A mechanical explanation of this finding is that the VAR model implies substantial volatility in trend inflation: the estimated standard deviation of the inflation trend is much larger (1.25) than that of nominal rates (0.75). Thus, to reconcile the data with γRπ = 1, a large negative effect of nominal interest rates on inflation is required.

[Figure 5: Inflation and Nominal Rates. Panel A: 95 percent confidence interval for γRπ as a function of λπR. Panel B: 95 percent confidence interval for γRπ as a function of λRπ. Panel C: 95 percent confidence interval for γRπ as a function of γπR. Panel D: 95 percent confidence ellipse when γRπ = 1.]

However, from panel B of the figure, γRπ = 1 cannot be rejected for a value of λRπ > 0.55. One way to interpret the λRπ parameter is to decompose the impact effect of π on R into an expected inflation effect and an effect on real rates. If π has no impact effect on real rates, so that only the expected inflation effect is present, then λRπ = ∂Etπt+1/∂εtπ. For our data, ∂Etπt+1/∂εtπ = 0.6 when the model is estimated using λRπ = 0.6 as an identifying restriction, suggesting that this is a reasonable estimate of the expected inflation effect. The magnitude of the real interest effect is more difficult to determine since different macroeconomic models lead to different conclusions about the effect of nominal shocks on real rates. For example, models with liquidity effects imply that real rates fall (e.g., Lucas [1990], Fuerst [1992], and Christiano and Eichenbaum [1994]), while the sticky nominal wage and price models in King (1994) imply that real rates rise. In this regard, the interpretation of the evidence on the long-run Fisher effect is seen to depend critically on one's belief about the impact effect of a nominal disturbance on the real interest rate. If this effect is negative, then there is significant evidence in the data against this neutrality hypothesis. The confidence intervals suggest that the evidence against the long-run Fisher relation is not overwhelming.
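The expected-inflation reading of λRπ can be illustrated with a deliberately simple special case: if inflation changes followed an AR(1), ∆πt = φ∆πt−1 + et, then Etπt+1 = πt + φ∆πt, so a unit inflation innovation moves expected inflation by 1 + φ on impact. The φ below is made up to reproduce the 0.6 magnitude discussed in the text; it is not a value estimated from the paper's VAR.

```python
# Impact effect of an inflation innovation on expected inflation when
# inflation changes follow an AR(1): the innovation raises pi_t by one
# and raises the forecast of the next change by phi, giving 1 + phi.

def expected_inflation_effect(phi):
    """d E_t[pi_{t+1}] / d e_t under the AR(1)-in-differences special case."""
    return 1.0 + phi

effect = expected_inflation_effect(-0.4)   # illustrative phi, giving 0.6
```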
When γRπ = 1 is maintained, the implied confidence intervals for the other parameters are wide (−43.7 ≤ λπR ≤ 15.6, 0.0 ≤ λRπ ≤ 2.1, −154.8 ≤ γπR ≤ 116.4) and contain what are arguably reasonable values of these parameters. This is also evident from the confidence ellipse in panel D of Figure 5.

One interpretation is that these results reflect the conventional finding that nominal interest rates do not adjust fully to sustained inflation in the postwar U.S. data. This result obtains for a wide range of identifying assumptions. One possible explanation is that the failure depends on the particular specification of the bivariate model that we employ, suggesting the importance of extending this analysis to multivariate models. Another candidate source of potential misspecification is cointegration between nominal rates and inflation. This is discussed in some detail in papers by Evans and Lewis (1993), Mehra (1995), and Mishkin (1992).11

Footnote 11: These authors suggest that real rates Rt − πt are I(0). Evans and Lewis (1993) and Mishkin (1992) find estimates suggesting that nominal rates do not respond fully to permanent changes in inflation and attribute this to a small-sample bias associated with shifts in the inflation process. Mehra (1995) finds that permanent changes in interest rates do respond one-for-one with permanent changes in inflation. In contrast to these papers, our results are predicated on the assumption that πt and Rt are I(1) and are not cointegrated over the entire sample. As the results in Table 1 make clear, both the I(0) and I(1) hypotheses are consistent with the data.

4. EVIDENCE ON THE LONG-RUN PHILLIPS CURVE

As discussed in King and Watson (1994), the interpretation of the evidence on the long-run Phillips curve is more subtle than the other neutrality propositions.12 Throughout this article we have examined neutrality by examining the long-run multiplier in equations relating real variables to nominal variables. This suggests examining the neutrality proposition embodied in the long-run Phillips curve using the equation

αuu(L)ut = αuπ(L)πt + εtu.    (12)

Of course, as in Sargent (1976), equation (12) is one standard way of writing the Phillips curve.

Footnote 12: A greatly expanded version of the analysis in this section is contained in King and Watson (1994).

Figure 6 shows estimates of γuπ for a wide range of identifying assumptions. When the model is estimated using λπu as an identifying assumption, a vertical Phillips curve (γuπ = 0) is rejected when λπu > 2.3.13 Thus, neutrality is rejected only if one assumes that positive changes in the unemployment rate have a large positive impact effect on inflation. From panel B of the figure, γuπ = 0 is rejected for maintained values of λuπ < −0.07. Since λuπ can be interpreted as the slope of the short-run (impact) Phillips curve, this figure shows the relationship between maintained assumptions and conclusions about short-run and long-run neutrality. The data are consistent with the pair of parameters λuπ and γuπ being close to zero; the data also are consistent with the hypothesis that these parameters are both less than zero. If short-run neutrality is maintained (λuπ = 0), the estimated long-run effect of inflation on unemployment is very small (γuπ = 0.06). If long-run neutrality is maintained (γuπ = 0), the estimated short-run effect of inflation on unemployment is very small (λuπ = −0.02). This latter result is consistent with the small estimated real effects of nominal disturbances found by King, Plosser, Stock, and Watson (1991), Gali (1992), and Shapiro and Watson (1988), who all used long-run neutrality as an identifying restriction.

Footnote 13: Recall that the Phillips curve is drawn with inflation on the vertical axis and unemployment on the horizontal axis. Thus, a vertical long-run Phillips curve corresponds to the restriction γuπ = 0.

[Figure 6: Inflation and Unemployment. Panel A: 95 percent confidence interval for γuπ as a function of λπu. Panel B: 95 percent confidence interval for γuπ as a function of λuπ. Panel C: 95 percent confidence interval for γuπ as a function of γπu. Panel D: 95 percent confidence ellipse when γuπ = 0.]

Several researchers, relying on a variety of specifications and identifying assumptions, have produced estimates of the short-run Phillips curve slope. For example, Sargent (1976) estimates λuπ using innovations in population, money, and various fiscal policy variables as instruments. He finds an estimate of λuπ = −0.07. Estimates of λuπ ranging from −0.07 to −0.18 can be extracted from the results in Barro and Rush (1980), who estimated the unemployment and inflation effects of unanticipated money shocks. Values of λuπ in this range lead to a rejection of the null γuπ = 0, but they suggest a very steep long-run tradeoff. For example, when λuπ = −0.10, the corresponding point estimate of γuπ is −0.20, so that the long-run Phillips curve has a slope of −5.0 (= 1/γuπ). By contrast, the conventional view in the late 1960s and early 1970s was that there was a much more favorable tradeoff between inflation and unemployment. For example, in discussing Gordon's famous (1970) test of an accelerationist Phillips curve model, Solow calculated that there was a one-for-one long-run tradeoff implied by Gordon's results. This calculation was sufficiently conventional that it led to no sharp discussion among the participants at the Brookings panel.
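The slope arithmetic above is worth making explicit: because the Phillips curve is drawn with inflation on the vertical axis, the long-run slope is the reciprocal of γuπ, the long-run multiplier from the unemployment equation. Using the point estimate from the text:

```python
# Long-run Phillips curve slope, d(pi)/d(u), as the reciprocal of the
# long-run multiplier gamma_u_pi from the unemployment equation.

def phillips_slope(gamma_u_pi):
    """Slope of the long-run Phillips curve (inflation on the vertical axis)."""
    return 1.0 / gamma_u_pi

slope = phillips_slope(-0.20)   # the text's point estimate, giving -5.0
```

A small γuπ in absolute value therefore translates into a very steep long-run tradeoff, which is the sense in which these estimates are "close to vertical."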
Essentially the same tradeoff was suggested by the 1969 Economic Report of the President, which provided a graph of inflation and unemployment between 1954 and 1968.14

Footnote 14: See McCallum (1989, p. 180) for a replication and discussion of this graph.

What is responsible for the difference between our estimates and the conventional estimates from the late '60s? Panel D in Table 2 suggests that the sample period cannot be the answer: the full-sample results are very similar to the results obtained using data from 1950 through 1972. Instead, the answer lies in differences between the identifying assumptions employed. The traditional Gordon-Solow estimate was obtained from a price equation of the form15

αππ(L)πt = απu(L)ut + εtπ.    (13)

The estimated slope of the long-run Phillips curve was calculated as γ = απu(1)/αππ(1). Thus, in the traditional Gordon-Solow framework, the long-run Phillips curve was calculated as the long-run multiplier from the inflation equation. In contrast, our estimate (1/γuπ) is calculated from the unemployment equation. The difference is critical, since it means that the two parameters represent responses to different shocks. Using our notation, the long-run multiplier from (13) is

γπu = [limk→∞ ∂πt+k/∂εtu] / [limk→∞ ∂ut+k/∂εtu],

while the inverse of the long-run multiplier from the unemployment equation (12) is

1/γuπ = [limk→∞ ∂πt+k/∂εtπ] / [limk→∞ ∂ut+k/∂εtπ].

Footnote 15: Equation (13) served as a baseline model for estimating the Phillips curve. Careful researchers employed various shift variables in the regression to capture the effects of demographic shifts on the unemployment rate and the effects of price controls on inflation. For our purposes, these complications can be ignored.

Thus, the traditional estimate measures the relative effect of shocks to unemployment, while our estimate corresponds to the relative effect of shocks to inflation. Figure 7 presents our estimates of γπu.
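The Gordon-Solow calculation γπu = απu(1)/αππ(1) amounts to evaluating each lag polynomial at L = 1, which simply sums its coefficients. The sketch below shows that arithmetic; the coefficient lists are hypothetical, chosen only so that the ratio equals the one-for-one tradeoff (−1) discussed in the text.

```python
# Long-run multiplier from the inflation equation (13) as a ratio of
# lag polynomials evaluated at L = 1.

def lag_poly_at_one(coeffs):
    """alpha(1): the sum of the lag-polynomial coefficients."""
    return sum(coeffs)

alpha_pi_u = [-0.5, -0.3, -0.2]   # hypothetical alpha_pi_u(L) coefficients
alpha_pi_pi = [1.0, -0.4, 0.4]    # hypothetical alpha_pi_pi(L) coefficients

gamma_pi_u = lag_poly_at_one(alpha_pi_u) / lag_poly_at_one(alpha_pi_pi)
```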
Evidently, the Gordon-Solow value of γπu = −1 is consistent with a wide range of the identifying restrictions shown in the figure.

But the question is not whether the long-run multiplier is calculated from the unemployment equation, αuu(L)ut = αuπ(L)πt + εtu, or from the inflation equation, αππ(L)πt = απu(L)ut + εtπ. By choosing between these two specifications under a specific identification scheme, one is also choosing a way of representing the experiment of a higher long-run rate of inflation, presumably originating from a higher long-run rate of monetary expansion. Under the Gordon-Solow procedure, the idea is that the shock to unemployment—the εtu shock defined by a particular identifying restriction—is the indicator of a shift in aggregate demand. Its consequences are traced through the inflation equation since unemployment is the right-hand side variable in that equation. Under the Lucas-Sargent procedure, the idea is that the shock to inflation—the εtπ shock defined by a particular identifying restriction—is the indicator of a shift in aggregate demand.

To interpret the Gordon-Solow estimate of γπu we must determine the particular identifying assumption that they used. Their assumption can be deduced from the way that they estimated γπu, namely from the ordinary least squares estimators of equation (13). Recall that OLS requires that the variables on the right-hand side of (13) be uncorrelated with the error term. Since ut appears on the right-hand side of (13), this will be true only when λuπ = 0. Thus, the particular identifying assumption employed in the Gordon-Solow specification is λuπ = 0.

What does this identifying assumption mean? When λuπ = 0, the Gordon-Solow interpretation implies that autonomous shocks to aggregate demand are one-step-ahead forecast errors in ut. The other shocks in the system can affect prices on impact but cannot affect unemployment.
Thus, in this sense, prices are flexible, since they can be affected on impact by all shocks, but unemployment is sticky, since it can be affected on impact only by aggregate demand shocks. For today's "new Keynesians" this may appear to be a very unreasonable identifying restriction (and so must any evidence about the Phillips curve that follows from it). However, the identifying restriction is consistent with the traditional Keynesian model of the late 1960s.16

Footnote 16: What we have in mind is a block recursive model in which the unemployment rate is determined in an IS-LM block, and wages and prices are determined in a wage-price block. This interpretation is further explored in King and Watson (1994).

[Figure 7: Unemployment and Inflation. Panel A: 95 percent confidence interval for γπu as a function of λπu. Panel B: 95 percent confidence interval for γπu as a function of λuπ. Panel C: 95 percent confidence interval for γπu as a function of γuπ. Panel D: 95 percent confidence ellipse when γπu = 0.]

5. CONCLUDING REMARKS

We have investigated four long-run neutrality propositions using bivariate models and 40 years of quarterly observations. We conclude that the data contain little evidence against the long-run neutrality of money and suggest a very steep long-run Phillips curve. These conclusions are robust to a wide range of identifying assumptions. Conclusions about the long-run Fisher effect and the superneutrality of money are not robust to the particular identifying assumption. Over a fairly broad range of identifying restrictions, the data suggest that nominal interest rates do not move one-for-one with permanent shifts in inflation.
The sign and magnitude of the estimated long-run effect of money growth on the level of output depend critically on the specific identifying restriction employed.

These conclusions are tempered by four important caveats. First, the results are predicated on specific assumptions concerning the degree of integration of the data, and with 40 years of data the degree of integration is necessarily uncertain. Second, even if the degree of integration were known, only limited "long-run" information is contained in data that span 40 years. This suggests that a useful extension of this work is to carry out similar analyses on long annual series. Third, the analysis has been carried out using bivariate models. If there are more than two important sources of macroeconomic shocks, then bivariate models may be subject to significant omitted-variable bias. Thus another extension of this work is to expand the set of variables under study to allow a richer set of structural macroeconomic shocks. The challenge is to do this in a way that produces results that can be easily interpreted in spite of the large number of identifying restrictions required. Fourth, we have analyzed each of these propositions separately, and yet there are obvious and important theoretical connections between them. Future work on multivariate extensions of this approach may allow for a unified econometric analysis of these long-run neutrality propositions.

APPENDIX

Estimation Methods

Under each alternative identifying restriction, the Gaussian maximum likelihood estimates can be constructed using standard regression and instrumental variable calculations. When λym is assumed known, equation (8a) can be estimated by ordinary least squares by regressing ∆yt − λym∆mt onto {∆yt−i, ∆mt−i}, i = 1, ..., p. Equation (8b) cannot be estimated by OLS because ∆yt, one of the regressors, is potentially correlated with εtm. Instrumental variables must be used.
The appropriate instruments are {∆yt−i, ∆mt−i}, i = 1, ..., p, together with the residual from the estimated (8a). This residual is a valid instrument because of the assumption that ηt and εtm are uncorrelated. When λmy is assumed known, rather than λym, this process is reversed.

When a value for γmy is used to identify the model, a similar procedure can be used. First, rewrite (8b) as

∆mt = αmy(1)∆yt + βmm∆mt−1 + Σ_{j=1}^{p−1} α̃mm,j ∆²mt−j + Σ_{j=0}^{p−1} α̃my,j ∆²yt−j + εtm,    (A1)

where βmm = Σ_{j=1}^{p} αmm,j. Equation (A1) replaces the regressors (∆yt, ∆yt−1, ..., ∆yt−p, ∆mt−1, ..., ∆mt−p) in (8b) with the equivalent set of regressors (∆yt, ∆mt−1, ∆²yt, ∆²yt−1, ..., ∆²yt−p+1, ∆²mt−1, ..., ∆²mt−p+1). In (A1), the long-run multiplier is γmy = αmy(1)/(1 − βmm), so that αmy(1) = γmy − βmmγmy. Making this substitution, (A1) can be written as

∆mt − γmy∆yt = βmm(∆mt−1 − γmy∆yt) + Σ_{j=1}^{p−1} α̃mm,j ∆²mt−j + Σ_{j=0}^{p−1} α̃my,j ∆²yt−j + εtm.    (A2)

Equation (A2) can be estimated by instrumental variables by regressing ∆mt − γmy∆yt onto (∆mt−1 − γmy∆yt, ∆²yt, ∆²yt−1, ..., ∆²yt−p+1, ∆²mt−1, ..., ∆²mt−p+1) using {∆yt−i, ∆mt−i}, i = 1, ..., p, as instruments. (Instruments are required because of the potential correlation between ∆yt and the error term.) Equation (8a) can now be estimated by instrumental variables using the residual from the estimated (A2) together with {∆yt−i, ∆mt−i}, i = 1, ..., p. When a value for γym is used to identify the model, this process is reversed.

Two complications arise in the calculation of standard errors for the estimated models. The first is that the long-run multipliers, γym and γmy, are nonlinear functions of the regression coefficients. Their standard errors are calculated from standard formulas derived from delta-method arguments. The second complication arises because one of the equations is estimated using instruments that are residuals from another equation.
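The two-step logic described above can be sketched in a few lines for the case in which λym is the identifying assumption. The sketch uses a stand-in bivariate system with one lag (p = 1) and made-up parameter values; it illustrates the mechanics (OLS on the transformed output equation, then IV for the money equation using the step-1 residual as an instrument), not the paper's actual estimates or data.

```python
import numpy as np

# Two-step estimation when lambda_ym is assumed known, p = 1.
# Stand-in structural system (all parameter values are invented):
#   dy_t = lam_ym * dm_t + 0.5 * dy_{t-1} + eta_t        (output equation)
#   dm_t = lam_my * dy_t + 0.2 * dm_{t-1} + eps_m_t      (money equation)

def iv_estimate(y, X, Z):
    """Just-identified IV estimator, (Z'X)^{-1} Z'y."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

rng = np.random.default_rng(1)
T = 5000
lam_ym, lam_my = 0.4, 0.3
dy, dm = np.zeros(T), np.zeros(T)
for t in range(1, T):
    eta, eps_m = rng.normal(), rng.normal()   # real and money shocks
    a = np.array([[1.0, -lam_ym], [-lam_my, 1.0]])
    b = np.array([0.5 * dy[t - 1] + eta, 0.2 * dm[t - 1] + eps_m])
    dy[t], dm[t] = np.linalg.solve(a, b)      # solve the simultaneous system

# Step 1 (OLS): dy_t - lam_ym * dm_t on the lags; the residual estimates eta_t.
lhs = dy[1:] - lam_ym * dm[1:]
X1 = np.column_stack([dy[:-1], dm[:-1]])
beta1 = np.linalg.solve(X1.T @ X1, X1.T @ lhs)
resid1 = lhs - X1 @ beta1

# Step 2 (IV): dm_t on (dy_t, lags); dy_t is endogenous, so use the step-1
# residual (uncorrelated with eps_m by assumption) as the extra instrument.
X2 = np.column_stack([dy[1:], dy[:-1], dm[:-1]])
Z2 = np.column_stack([resid1, dy[:-1], dm[:-1]])
beta2 = iv_estimate(dm[1:], X2, Z2)   # beta2[0] estimates lam_my
```

In the simulation, step 1 should recover the lag coefficient 0.5, and step 2 should recover the impact effect lam_my = 0.3, because the step-1 residual is correlated with dy_t (relevance) yet uncorrelated with the money shock (validity).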
This introduces the kind of "generated regressor" problems discussed in Pagan (1984). To see the problem in our context, notice that all of the models under consideration can be written as

y1t = x1t′δ1 + ε1t    (A3)
y2t = x2t′δ2 + ε2t,    (A4)

where, for example, when λmy is assumed known, y1t = ∆mt − λmy∆yt, x1t represents the set of regressors {∆yt−i, ∆mt−i}, i = 1, ..., p, y2t = ∆yt, and x2t represents the set of regressors [∆mt, {∆yt−i, ∆mt−i}, i = 1, ..., p]. Alternatively, when γmy is assumed known, y1t = ∆mt − γmy∆yt, x1t represents the set of regressors [∆mt−1 − γmy∆yt, ∆²yt, {∆²yt−i, ∆²mt−i}, i = 1, ..., p−1], y2t = ∆yt, and x2t represents the set of regressors [∆mt, {∆yt−i, ∆mt−i}, i = 1, ..., p].

Equations (A3) and (A4) allow us to discuss estimation of all the models in a unified way. First, (A3) is estimated using zt = {∆yt−i, ∆mt−i}, i = 1, ..., p, as instruments. Next, equation (A4) is estimated using ût = (ε̂1t, zt) as instruments, where ε̂1t is the estimated residual from (A3). If ε1t rather than ε̂1t were used as an instrument, standard errors could be calculated using standard formulae. However, when ε̂1t, an estimate of ε1t, is used, a potential problem arises. This problem will only affect the estimates in (A4) since ε̂1t is not used as an instrument in (A3).

To explain the problem, some additional notation will prove helpful. Stack the observations for each equation so that the model can be written as

Y1 = X1δ1 + ε1    (A5)
Y2 = X2δ2 + ε2,    (A6)

where Y1 is T × 1, etc. Denote the matrix of instruments for the first equation by Z, the matrix of instruments for the second equation by Û = [ε̂1 Z], and let U = [ε1 Z]. Since ε̂1 = ε1 − X1(δ̂1 − δ1), Û = U − [X1(δ̂1 − δ1) 0]. Let V1 = σ²ε1 plim [T(Z′X1)⁻¹(Z′Z)(X1′Z)⁻¹] denote the asymptotic covariance matrix of T^(1/2)(δ̂1 − δ1). Now write

T^(1/2)(δ̂2 − δ2) = (T⁻¹Û′X2)⁻¹(T^(−1/2)Û′ε2)
= (T⁻¹Û′X2)⁻¹(T^(−1/2)U′ε2) − (T⁻¹Û′X2)⁻¹ [T^(1/2)(δ̂1 − δ1)′(T⁻¹X1′ε2); 0].    (A7)

It is straightforward to verify that plim T⁻¹Û′U = plim T⁻¹U′U and that plim T⁻¹Û′X2 = plim T⁻¹U′X2. Thus, the first term on the right-hand side of (A7) is standard: it is asymptotically equivalent to the expression for T^(1/2)(δ̂2 − δ2) that would obtain if U rather than Û were used as instruments. This expression converges in distribution to a random variable distributed as N(0, σ²ε2 plim [T(Û′X2)⁻¹(Û′Û)(X2′Û)⁻¹]), which is the usual expression for the asymptotic distribution of the IV estimator.

Potential problems arise because of the second term on the right-hand side of (A7). Since T^(1/2)(δ̂1 − δ1) converges in distribution, the second term can only be disregarded asymptotically when plim T⁻¹X1′ε2 = 0, that is, when the regressors in (A3) are uncorrelated with the error terms in (A4). In our context, this will occur when λmy and λym are assumed known, since in this case x1t contains only lagged variables. However, when γmy or γym is assumed known, x1t will contain the contemporaneous value of ∆yt or ∆mt, and thus x1t and ε2t will be correlated. In this case the covariance matrix of δ̂2 must be modified to account for the second term on the right-hand side of (A7).

The necessary modification is as follows. Standard calculations show that T^(1/2)(δ̂1 − δ1) and T^(−1/2)U′ε2 are asymptotically independent under the maintained assumption that E(ε2 | ε1) = 0; thus, the two terms on the right-hand side of (A7) are asymptotically uncorrelated. A straightforward calculation demonstrates that T^(1/2)(δ̂2 − δ2) converges to a random variable with a N(0, V2) distribution, where V2 = σ²ε2 plim [T(Û′X2)⁻¹(Û′Û)(X2′Û)⁻¹] + plim [T(Û′X2)⁻¹D(X2′Û)⁻¹], where D is a matrix with all elements equal to zero, except that D11 = (ε2′X1)(TV1)(X1′ε2), and where TV1 = σ²ε1 (Z′X1)⁻¹(Z′Z)(X1′Z)⁻¹.
Similarly, it is straightforward to show that the asymptotic covariance between $T^{1/2}(\hat{\delta}_1 - \delta_1)$ and $T^{1/2}(\hat{\delta}_2 - \delta_2)$ is $-\mathrm{plim}\,[V_1(T^{-1}X_1'\epsilon_2) \;\; 0](T^{-1}X_2'\hat{U})^{-1}$.

An alternative to this approach is the GMM estimator in Hausman, Newey, and Taylor (1987). This approach treats the estimation problem as a GMM problem with moment conditions $E(z_t\epsilon_t^1) = 0$, $E(z_t\epsilon_t^2) = 0$, and $E(\epsilon_t^1\epsilon_t^2) = 0$. The GMM approach is more general than the one we have employed, and when the error terms are non-normal and the model is over-identified, it may produce more efficient estimates.

R. G. King and M. W. Watson: Testing Long-Run Neutrality

REFERENCES

Aschauer, David, and Jeremy Greenwood. "A Further Exploration in the Theory of Exchange Rate Regimes," Journal of Political Economy, vol. 91 (October 1983), pp. 868–72.

Barro, Robert J., and Mark Rush. "Unanticipated Money and Economic Activity," in Stanley Fischer, ed., Rational Expectations and Economic Policy. Chicago: University of Chicago Press, 1980.

Bernanke, Ben S. "Alternative Explanations of the Money-Income Correlation," Carnegie-Rochester Conference Series on Public Policy, vol. 25 (Autumn 1986), pp. 49–99.

Beveridge, Stephen, and Charles R. Nelson. "A New Approach to Decomposition of Economic Time Series into Permanent and Transitory Components with Particular Attention to Measurement of the 'Business Cycle,'" Journal of Monetary Economics, vol. 7 (March 1981), pp. 151–74.

Blanchard, Olivier J. "A Traditional Interpretation of Macroeconomic Fluctuations," American Economic Review, vol. 79 (December 1989), pp. 1146–64.

———, and Danny Quah. "The Dynamic Effects of Aggregate Demand and Supply Disturbances," American Economic Review, vol. 79 (September 1989), pp. 655–73.

———, and Mark Watson. "Are Business Cycles All Alike?" in Robert J. Gordon, ed., The American Business Cycle: Continuity and Change. Chicago: University of Chicago Press, 1986.

Christiano, Lawrence, and Martin Eichenbaum.
"Liquidity Effects, Monetary Policy, and the Business Cycle," Journal of Money, Credit, and Banking, vol. 27 (November 1995), pp. 1113–36.

Cooley, Thomas F., and Gary D. Hansen. "The Inflation Tax in a Real Business Cycle Model," American Economic Review, vol. 79 (September 1989), pp. 733–48.

Economic Report of the President, 1969. Washington: Government Printing Office, 1969.

Evans, Martin D. D., and Karen L. Lewis. "Do Expected Shifts in Inflation Affect Estimates of the Long-Run Fisher Relation?" Manuscript. University of Pennsylvania, 1993.

Fisher, Mark E., and John J. Seater. "Long-Run Neutrality and Superneutrality in an ARIMA Framework," American Economic Review, vol. 83 (June 1993), pp. 402–15.

Fuerst, Timothy S. "Liquidity, Loanable Funds, and Real Activity," Journal of Monetary Economics, vol. 29 (February 1992), pp. 3–24.

Gali, Jordi. "How Well Does the IS-LM Model Fit Postwar U.S. Data?" Quarterly Journal of Economics, vol. 107 (May 1992), pp. 709–38.

Geweke, John. "The Superneutrality of Money in the United States: An Interpretation of the Evidence," Econometrica, vol. 54 (January 1986), pp. 1–21.

Goldberger, Arthur S. Econometric Theory. New York: John Wiley and Sons, 1964.

Gordon, Robert J. "The Recent Acceleration of Inflation and Its Lessons for the Future," Brookings Papers on Economic Activity, 1:1970, pp. 8–41.

Hausman, Jerry A., Whitney K. Newey, and William E. Taylor. "Efficient Estimation and Identification of Simultaneous Equation Models with Covariance Restrictions," Econometrica, vol. 55 (July 1987), pp. 849–74.

Johnston, J. Econometric Methods, 3d ed. New York: McGraw-Hill, 1984.

King, Robert G., and Charles I. Plosser. "Money Business Cycles," Journal of Monetary Economics, vol. 33 (April 1994), pp. 405–38.

———, Charles I. Plosser, James H. Stock, and Mark W. Watson. "Stochastic Trends and Economic Fluctuations," American Economic Review, vol. 81 (September 1991), pp. 819–40.
King, Robert G., and Mark W. Watson. "The Post-War U.S. Phillips Curve: A Revisionist Econometric History," Carnegie-Rochester Conference Series on Public Policy, vol. 41 (December 1994), pp. 157–219.

———. "Testing Long-Run Neutrality," Working Paper 4156. Cambridge, Mass.: National Bureau of Economic Research, September 1992.

Lucas, Robert E., Jr. "Liquidity and Interest Rates," Journal of Economic Theory, vol. 50 (April 1990), pp. 237–64.

———. "Some International Evidence on Output-Inflation Trade-offs," American Economic Review, vol. 63 (June 1973), pp. 326–34.

———. "Econometric Testing of the Natural Rate Hypothesis," in Otto Eckstein, ed., The Econometrics of Price Determination. Washington: Board of Governors of the Federal Reserve System, 1972.

McCallum, Bennett T. Monetary Economics: Theory and Policy. New York: Macmillan, 1989.

———. "On Low-Frequency Estimates of Long-Run Relationships in Macroeconomics," Journal of Monetary Economics, vol. 14 (July 1984), pp. 3–14.

Mehra, Yash P. "Some Key Empirical Determinants of Short-Term Nominal Interest Rates," Federal Reserve Bank of Richmond Economic Quarterly, vol. 81 (Summer 1995), pp. 33–51.

Mishkin, Frederic S. "Is the Fisher Effect Real? A Reexamination of the Relationship between Inflation and Interest Rates." Manuscript. Columbia University, 1992.

Pagan, Adrian. "Econometric Issues in the Analysis of Regressions with Generated Regressors," International Economic Review, vol. 25 (February 1984), pp. 221–48.

Phillips, A. W. "The Relation between Unemployment and the Rate of Change of Money Wage Rates in the United Kingdom, 1861–1957," Economica, vol. 25 (1958), pp. 283–99.

Rotemberg, Julio J., John C. Driscoll, and James M. Poterba. "Money, Output, and Prices: Evidence from a New Monetary Aggregate," Journal of Business and Economic Statistics, vol. 13 (January 1995), pp. 67–84.

Sargent, Thomas J.
"A Classical Macroeconometric Model for the United States," Journal of Political Economy, vol. 84 (April 1976), pp. 207–37.

———. "A Note on the Accelerationist Controversy," Journal of Money, Credit, and Banking, vol. 3 (August 1971), pp. 50–60.

Shapiro, Matthew, and Mark W. Watson. "Sources of Business Cycle Fluctuations," National Bureau of Economic Research Macroeconomics Annual, vol. 3 (1988), pp. 111–56.

Sims, Christopher A. "Models and Their Uses," American Journal of Agricultural Economics, vol. 71 (May 1989), pp. 489–94.

———. "Are Forecasting Models Usable for Policy Analysis?" Federal Reserve Bank of Minneapolis Quarterly Review, vol. 10 (Winter 1986), pp. 2–16.

Solow, Robert. Price Expectations and the Behavior of the Price Level. Manchester, U.K.: Manchester University Press, 1969.

Stock, James H. "Confidence Intervals for the Largest Autoregressive Root in U.S. Macroeconomic Time Series," Journal of Monetary Economics, vol. 28 (December 1991), pp. 435–60.

———, and Mark W. Watson. "Interpreting the Evidence on Money-Income Causality," Journal of Econometrics, vol. 40 (January 1989), pp. 161–81.

Stockman, Alan C. "Anticipated Inflation and the Capital Stock in a Cash-in-Advance Economy," Journal of Monetary Economics, vol. 8 (November 1981), pp. 387–93.

Tobin, James. "Money and Economic Growth," Econometrica, vol. 33 (October 1965), pp. 671–84.

Watson, Mark W. "Vector Autoregressions and Cointegration," in Robert Engle and Daniel McFadden, eds., Handbook of Econometrics, Vol. IV. Amsterdam: Elsevier, 1994.