FEDERAL RESERVE BANK OF ATLANTA
ECONOMIC REVIEW First Quarter 2006

Transparency, Expectations, and Forecasts

ANDREW BAUER, ROBERT A. EISENBEIS, DANIEL F. WAGGONER, AND TAO ZHA

Bauer is a senior economic analyst in the macropolicy section, Eisenbeis is executive vice president and director of research, Waggoner is a research economist and assistant policy adviser in the financial section, and Zha is a research economist and policy adviser in the macropolicy section, all in the Atlanta Fed’s research department. They thank Jinill Kim, Brian Madigan, John Robertson, and Ellis Tallman for critical comments and Cindy Soo and Eric Wang for research assistance. A similar version of this research is also published with the same title as Federal Reserve Bank of Atlanta Working Paper 2006-3.

Many macroeconomists have argued that a central bank should be transparent about its objectives, its views about the economic outlook, and the reasoning behind its policy changes (see Faust and Leeper 2005). In 1994 the Federal Open Market Committee (FOMC) began to release statements accompanying changes in the federal funds rate target. Since then, the degree of specificity of the statements and the guidance provided on the likely course of future policy have evolved significantly.¹

In a recent paper, Woodford (2005) discusses two kinds of central-bank communications: current policy decisions and the central bank’s view of likely future policy. He articulates four categories of information (the central bank’s view of current economic conditions, current operating targets, strategies guiding policy decision making, and the outlook for future policy) that a central bank might seek to communicate to the public. Woodford argues that these open communications are “beneficial, not only from the point of view of reducing the uncertainty with which traders and other economic decision makers must contend, but also from that of enhancing the accuracy with which the FOMC is able to achieve the effects on the economy that it desires, by keeping the expectations of market participants more closely synchronized with its own.”

This article investigates whether the public’s views about the economy’s current path and about future policy have been affected by changes in the Federal Reserve’s communications policy as reflected in private-sector forecasts of future economic conditions and policy moves. In particular, has private agents’ ability to predict the direction of the economy improved since 1994, when the FOMC began to publicly state its views of the economic outlook? If so, on which dimensions has the ability to forecast improved?

The analysis focuses on both the short-term and longer-term economic forecasts of key macroeconomic variables (such as inflation, gross domestic product (GDP) growth, and unemployment) and of policy variables such as short-term interest rates. Private agents’ current-year and next-year forecasts are used as proxies for the public’s short-term and longer-term expectations, and empirical evidence is presented regarding whether such forecasts have performed better in predicting future economic and policy conditions since 1994. The private-agent forecasts used in this article are those of individual participants as well as the consensus (average) forecasts contained in the monthly Blue Chip Economic Indicators surveys from 1986 to 2004, which include both the pre-FOMC-statement subperiod (1986:01–1993:12) and the post-FOMC-statement subperiod (1994:01–2004:12).
We employ the econometric methodology of Eisenbeis, Waggoner, and Zha (2002), which permits us to evaluate the accuracy of forecasts both in cross section and across time and to examine the errors in forecasting key economic variables on both a univariate and a multivariate basis. The latter is important because agents are not simply forecasting one economic variable but rather a set of variables that presumably are interrelated and jointly capture important dimensions of economic performance. Good forecasts on one dimension but poor overall performance may provide some indication of the internal consistency of the forecaster’s approach.

This cross-sectional data set enables us to decompose forecast accuracy into two components: the common error that affects all individual participants and the idiosyncratic error that reflects discrepant views across individuals about future economic and policy conditions. According to Woodford (2005), one should expect the idiosyncratic error to become smaller as FOMC open communications become more transparent. But the common error may not change much because it is likely to be affected by factors other than changes in policy transparency, such as unforeseen business cycles.

To preview the main result, we find that since 1994 the idiosyncratic errors for key macroeconomic variables have steadily declined and the expectations of market participants are more closely synchronized to one another. We find no evidence, however, that the common error has become smaller since 1994, especially for the longer-term forecasts.

The Methodology

Let $\mu_t$ be an $n \times 1$ vector of economic variables at time $t$, let $y_t$ be the realized value of these economic variables, and let $y_t^i$ be the $i$th individual’s forecast value of the variables. Assume that $y_t$ is normally distributed with mean $\mu_t$ and an economywise (common) covariance matrix $\Omega_t^R$ and that $y_t^i$ is normally distributed with mean $\mu_t$ and a forecastwise covariance matrix $\Omega_t^F$.
(The superscripts $R$ and $F$ stand for “realized” and “forecast,” respectively.) The covariance matrix $\Omega_t^R$ reflects the aggregate shocks that affect the realized value of $\mu_t$; the covariance matrix $\Omega_t^F$ captures the discrepancy in forecasts across individual participants. The assumption that the mean forecast among individual participants is $\mu_t$ is reasonable because previous work has suggested that the Blue Chip Consensus forecast, serving as a proxy for the mean forecast, is close to being an unbiased estimate of $\mu_t$ (Bauer et al. 2003).

We denote the forecast error for the $i$th forecaster by $x_t^i = y_t^i - y_t$. Therefore, the individual forecast error $x_t^i$ has mean zero and a variance matrix $\Omega_t = \Omega_t^R + \Omega_t^F$, which indicates that $x_t^i$ is subject to both idiosyncratic and common shocks.² The standard statistical theory implies that

$$\chi_t^i \equiv x_t^{i\prime}\, \Omega_t^{-1}\, x_t^i \sim \chi^2(n),$$

where $\chi^2(n)$ denotes the $\chi^2$ distribution with $n$ degrees of freedom and $\chi_t^i$ is a square error weighted by $\Omega_t$. The above expression shows that the weighted square error $\chi_t^i$ follows the $\chi^2$ distribution with $n$ degrees of freedom.

To measure the forecast accuracy for each individual participant, we compute a score value (p value) associated with this $\chi^2$ distribution and call it an “accuracy score.” The score for individual forecaster $i$ at forecast time $t$ is a function of $\chi_t^i$ and $n$:

$$p(\chi_t^i, n) = 1 - \chi^2\mathrm{cdf}(\chi_t^i, n),$$

where $\chi^2\mathrm{cdf}(\chi_t^i, n)$ is the probability that a random observation from the $\chi^2$ distribution with $n$ degrees of freedom falls in the interval $[0, \chi_t^i]$.³ As Eisenbeis, Waggoner, and Zha (2002) point out, the summary measure $p(\chi_t^i, n)$ is a probability that is invariant to the underlying scales-of-error variances. One possible interpretation is that the $i$th participant’s forecast is closer to the realized value than that of $100\,p(\chi_t^i, n)$ percent of all possible forecasters.
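The score defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `chi2_cdf`, `solve`, and `accuracy_score` are hypothetical, the covariance matrix is taken as given, and the chi-square CDF is computed with a textbook series for the regularized incomplete gamma function so that the block needs only the standard library (a library routine such as `scipy.stats.chi2.sf` would serve equally well).

```python
import math


def chi2_cdf(x, n, max_terms=10000):
    """Chi-square CDF with n degrees of freedom, evaluated at x.

    Uses the series for the regularized lower incomplete gamma function
    P(a, z) with a = n/2 and z = x/2 (an implementation choice, not from
    the article).
    """
    if x <= 0:
        return 0.0
    a, z = n / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for k in range(1, max_terms):
        term *= z / (a + k)
        total += term
        if abs(term) < 1e-16 * abs(total):
            break
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def solve(matrix, rhs):
    """Solve matrix * w = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][c] * w[c] for c in range(i + 1, n))
        w[i] = (m[i][n] - rest) / m[i][i]
    return w


def accuracy_score(error, omega):
    """Return (chi, p): the weighted square error x' Omega^{-1} x and the
    score p = 1 - chi2cdf(chi, n), where n is the number of variables."""
    w = solve(omega, error)
    chi = sum(e * wi for e, wi in zip(error, w))
    return chi, 1.0 - chi2_cdf(chi, len(error))


# Illustrative numbers only: a two-variable forecast error and a made-up
# covariance matrix with positive correlation between the two errors.
chi, p = accuracy_score([1.0, 1.0], [[2.0, 1.0], [1.0, 2.0]])
print(f"weighted square error = {chi:.4f}, accuracy score = {100 * p:.1f}%")
```

Because the error is weighted by the inverse covariance matrix, rescaling a variable (for example, from percent to basis points) rescales both the error and its variance and leaves the score unchanged, which is the invariance property noted above.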
Moreover, the score $p(\chi_t^i, n)$ can be compared across forecasters, within a forecast period, and across periods.

Bauer et al. (2003) show how to estimate the covariance matrices $\Omega_t^R$ and $\Omega_t^F$. The matrix $\Omega_t^R$ can be estimated as the sample covariance matrix of the Blue Chip Consensus forecast errors across time under the assumption that $\Omega_t^R$ is the same across years for each month but varies across months within a year. Thus, the variances on the diagonal of $\Omega_t^R$ become smaller as $t$ approaches the end of the year because more information becomes available to forecast economic conditions for the current year. The covariance matrix $\Omega_t^F$ can be estimated as the sample covariance matrix of forecast errors across individual forecasters; this covariance varies both across months and across years.⁴

The estimate of $\Omega_t$, denoted by $\hat{\Omega}_t$, is the sum of the estimates of $\Omega_t^R$ and $\Omega_t^F$. Given this estimate, the weighted-square error can be calculated as

$$\hat{\chi}_t^i = x_t^{i\prime}\, \hat{\Omega}_t^{-1}\, x_t^i.$$

At each time $t$, the average accuracy score is

$$\hat{p}_t(n) = \frac{1}{N_t} \sum_{i=1}^{N_t} p(\hat{\chi}_t^i, n),$$

where $N_t$ is the number of individual forecasters at time $t$. One can also calculate the cross-sectional distribution of accuracy scores; the process is described in detail in the sidebar on page 6.

1. Kohn and Sack (2003) characterize several distinct periods of increasing transparency in FOMC statements: statements on changes in the discount rate (1989–93), statements on changes in the federal funds rate (1994–98), statements including policy tilt (1998–99), and statements including assessment of the balance of risks (2000–04). In May 2003 a further refinement was added to separately state the committee’s views on the risks to inflation and growth. And, finally, in August 2003 the committee provided explicit guidance on the likelihood that policy would remain accommodative.
2. In future research, we intend to relax the assumptions that the Blue Chip Consensus forecast is equal to $\mu_t$ and idiosyncratic shocks are independent of common shocks.
3. If the assumptions used are valid, the distribution of accuracy scores from 1986 to 2004 should be uniform. We have verified that such a distribution is more or less uniform, taking into account small-sample uncertainty.
4. Other estimates can also be constructed using model-based methods.

Figure 1. Blue Chip Average of Individual Scores for the Current Year. Panels: average scores and standard deviations; skewness and kurtosis (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Vintage Data and Forecast Errors

The monthly Blue Chip Economic Indicators report the forecasts of key macroeconomic variables for the current and next years. We study the annual average forecasts of five key variables: the three-month Treasury bill (T-bill) rate, the consumer price index (CPI) inflation rate, real gross national product (GNP) for 1986 to 1995 or real gross domestic product (GDP) from 1996 to 2004, the unemployment rate, and the long-term bond yield (the corporate bond yield from 1986 to 1995 or the ten-year Treasury note yield from 1996 to 2004). The three-month T-bill rate, the CPI inflation rate, the unemployment rate, and the long-term bond yield are monthly variables while real GNP/GDP is a quarterly variable. This frequency difference is important to note when evaluating forecasts. (See Appendix 1 for a description of and sources for these data.)
More information becomes available about the actual current-year data as the end of the year approaches, and therefore the forecast errors for both the current and next years get smaller. For example, the forecasters participating in the December Blue Chip survey will have monthly data on the three-month T-bill rate and the long-term bond yield through November, data on the unemployment rate through October or November, and data on the CPI inflation rate through October. However, since GNP/GDP data are released quarterly, forecasters will have information regarding GNP/GDP only through the third quarter of the year. The weighted-square error $\hat{\chi}_t^i$ is designed to avoid the influence of different amounts of available data so that the errors are comparable across time.

To gauge forecast errors, the realized values of each variable at a given time must be used. The values of some variables are revised over time by the agencies responsible for reporting those variables. In particular, real GNP/GDP is reported quarterly and revised twice. Every year additional benchmark revisions may be made in July to past GDP data. Hence, the information reported is actually the continuously changing estimates of many key economic variables’ final values. Finally, sometimes the definition of GDP is changed and the series is completely revised.

Figure 2. Blue Chip Average of Individual Scores for the Next Year. Panels: average scores and standard deviations; skewness and kurtosis (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Such revisions raise the question, What vintage data should one use to evaluate forecast errors?
From a macropolicy perspective, one could argue that the focus should be on the “best” estimate of the final value of the variable of interest. Often, however, that value is not known for several years, and sometimes the difference between even a preliminary estimate and its nearest neighbor estimates can be very large. For example, the advanced estimate for real GDP for the first quarter of 2005 was 3.1 percent. This number was revised upward by the Bureau of Economic Analysis (BEA) to 3.4 percent and finally to 3.8 percent as more data on the performance of the economy became available. Policymakers might have inferred that the economy was growing below trend according to the first number but above trend based on the final estimate. Such differences could have significantly different implications for policy. For this reason, we would argue that the focus should be on forecast methods that best approximate the final number rather than the initial estimate. Also, a priori knowledge of the expected performance of a model or forecasting method can help policymakers decide how to weigh the evidence when significant differences exist between the initial releases of data and forecasts. For the purposes of this study, for the current-year forecasts, we use vintage data available at the end of January following the current year; for the next-year forecasts, we use data available at the end of January following the next year. This study uses vintage data so that its results will be comparable with those of previous studies. It also provides a comparison between the average Blue Chip Consensus score using vintage and final data, using January 2005 for the final data. Accuracy Scores This section looks at the distribution of scores at each month and examines whether the distribution has changed over time, especially from the prestatement subperiod to the poststatement period. 
The technical details of how to characterize the cross-sectional distribution of scores are provided in the sidebar on page 6.

Characterizing the Distribution of Accuracy Scores

The distribution of accuracy scores can be summarized by the first four moments. The method for calculating the mean or average score $\hat{p}_t(n)$ is shown in the text. The other three moments (standard deviation, skewness, and kurtosis) can be calculated as follows:

$$\hat{\sigma}_t(n) = \left[ \frac{1}{N_t} \sum_{i=1}^{N_t} \left( p(\hat{\chi}_t^i, n) - \hat{p}_t(n) \right)^2 \right]^{1/2},$$

$$\hat{s}_t(n) = \frac{\frac{1}{N_t} \sum_{i=1}^{N_t} \left( p(\hat{\chi}_t^i, n) - \hat{p}_t(n) \right)^3}{\hat{\sigma}_t(n)^3}, \quad \text{and}$$

$$\hat{u}_t(n) = \frac{\frac{1}{N_t} \sum_{i=1}^{N_t} \left( p(\hat{\chi}_t^i, n) - \hat{p}_t(n) \right)^4}{\hat{\sigma}_t(n)^4},$$

where $\sigma$ stands for the standard deviation, $s$ the skewness, and $u$ the kurtosis.

The first panel of Figure 1 shows the time-series paths of average scores and standard deviations of scores for the current year. The first panel of Figure 2 shows similar paths for the next year. The measure of standard deviation is often used to approximate the volatility of the public’s expectations or forecasts at each point in time. As the first panel of Figure 1 shows, both the average score and the standard deviation of scores fluctuate over time. No noticeable differences exist in the degree of fluctuation before and after 1994, nor are there differences for any subperiods after 1994. No trend appears in which the average score has increased or the standard deviation of scores has decreased since 1994. The figures clearly display periods when forecasters made big errors, such as missing the onset of the recessions in 1990 and 2001. In addition, while the average scores increased in 2004, so did the standard deviations of the scores. Similarly, the average scores dropped significantly in 1995 primarily because the definition of the GDP series changed.
In January 1996 the BEA changed the measurement of GDP to a chain-weighted system, but the forecasts made before January 1996 might be based on the non-chain-weighted series. Interestingly, this change seems to have had relatively less effect on the longer-term forecast errors (the second panel of Figure 2). The average score for the next year (Figure 2) shows no improvement since 1994 and in fact appears to have drifted lower since 1996. The standard deviation of scores since 2001 has drifted steadily upward. The pattern of the drift in the standard deviation is similar to that just prior to and coming out of the 1990–91 recession. As discussed further in the next section, these lower scores after 1996 are most likely associated with the nature of the business cycle and a surge of unexpected productivity growth in the late 1990s. The second panels of Figures 1 and 2 display the skewness and kurtosis of accuracy scores. Skewness measures the asymmetry of the score distribution. The more negative this measure is, the more scores spread out toward 0 percent. Conversely, the more positive this measure is, the more scores spread out toward 100 percent. Kurtosis measures the likelihood that the score distribution has extreme outliers that may affect the average score. The bigger the value of this measure is, the more likely the presence of outliers in the score distribution is. For the current-year forecasts, the skewness and kurtosis have remained stable except for a few periods. The 1995 spike is the result of the redefinition of GDP, and the small spikes around 2001 are associated with the recent recession. For the next-year forecasts, again, no clear pattern or trend is apparent in which skewness and kurtosis have changed since 1994. Two spikes in skewness and kurtosis correspond to the Asian financial crisis and the recent recession. 
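The four summary statistics discussed in this section (average score, standard deviation, skewness, and kurtosis) can be sketched as follows. This is a minimal illustration assuming the population-style $1/N_t$ divisor given in the sidebar; the function name `score_moments` and the sample scores are hypothetical.

```python
import math


def score_moments(scores):
    """Return (mean, std, skewness, kurtosis) of one month's accuracy scores.

    All moments use the 1/N divisor. Kurtosis is not excess kurtosis, so a
    normal distribution would give a value near 3.
    """
    n = len(scores)
    mean = sum(scores) / n
    dev = [s - mean for s in scores]
    std = math.sqrt(sum(d ** 2 for d in dev) / n)
    if std == 0.0:
        raise ValueError("all scores are identical; skewness and kurtosis are undefined")
    skew = (sum(d ** 3 for d in dev) / n) / std ** 3
    kurt = (sum(d ** 4 for d in dev) / n) / std ** 4
    return mean, std, skew, kurt


# Made-up scores for one survey month: most forecasters score high and one
# scores very low, which pulls skewness negative and raises kurtosis.
mean, std, skew, kurt = score_moments([0.85, 0.80, 0.90, 0.75, 0.10])
print(f"mean={mean:.3f} std={std:.3f} skewness={skew:.3f} kurtosis={kurt:.3f}")
```

A single poor forecaster in an otherwise tight group produces exactly the combination described in the text: a long tail toward 0 percent (negative skewness) and a large kurtosis that flags the outlier's influence on the average.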
Figure 3. Blue Chip Consensus Scores and the Averages of the Five Top and Bottom Forecaster Scores. Panels: current year; next year. Series: Blue Chip Consensus, average of five top scores, average of five bottom scores (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Figure 4. Cross-Sectional Standard Deviations of Three-Month Treasury Bill Forecasts. Panels: monthly data; twelve-month moving average. Series: current year, next year (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Further information about the distributional changes of accuracy scores is provided in Figure 3, which displays the time-series paths of accuracy scores of the Blue Chip Consensus forecast and the average of the top and bottom five forecasts for each month. The consensus forecast is of particular interest because its score is on average the highest (see Appendix 2 for details) and because it performs better than any single individual forecaster over the sample. Again, Figure 3 demonstrates that these scores have had no tendency to improve over time since 1994. In fact, the scores of consensus forecasts appear to be slightly lower after 1996 than before, especially for the next-year forecast. Moreover, the drop in the consensus scores around the recent recession and again following September 11, 2001, suggests that events and exogenous shocks affected forecast performance much more than FOMC statements did.
The drop in the scores toward the end of 1995 is attributable to the redefinition of GDP. The average scores for the five top and the five poorest forecasters suggest that the data have fat tails, with most of the forecasts being clustered at the high end with a few really poor performers on the bottom. All these findings suggest that the individual participant’s forecast performance relative to other participants has not improved between the prestatement and poststatement periods.

Although the accuracy score is a powerful summary measure of forecasting performance, it is a nonlinear function of the square forecast errors weighted by the overall covariance matrix $\Omega_t$. Separating $\Omega_t$ and forecast errors for further analysis would be informative. In the next section, we examine whether the covariance matrix $\Omega_t^F$ has changed over time and study the sources of forecast errors that do not depend on $\Omega_t^F$ and $\Omega_t^R$.⁵

Transparency and Sources of Forecast Errors

Kohn and Sack (2003) and Woodford (2005) argue that the contents of FOMC statements have become more transparent since 1994. To evaluate this argument, it is important to determine whether the expectations of market participants as reflected in the forecasts of key economic variables have become more synchronized in the poststatement period than in the prestatement subperiod. If the statement contains useful information, then one might expect an overall improvement in forecast accuracy, ceteris paribus, or at least more agreement among forecasters (that is, a tighter distribution of idiosyncratic errors). A positive answer may provide evidence about the effects of the FOMC statements on the private sector’s agreement on the direction of the future economy.
Figure 5. Cross-Sectional Standard Deviations of CPI Forecasts. Panels: monthly data; twelve-month moving average. Series: current year, next year (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

We also examine the sources of forecast errors by directly decomposing the mean square error (MSE) into the idiosyncratic component that reflects the discrepancy in individual participants from the Blue Chip surveys and the common component that is associated with unanticipated aggregate shocks and affects all participants. The technical details of this decomposition are provided in the sidebar on page 19. The MSE is the average of square errors across individual forecasters. Arguably, both the idiosyncratic and common errors may show a decreasing trend if the statement contains useful information and forecasters gain better understanding of the economy over time, especially after 1994. To the extent that the common error is affected by exogenous aggregate shocks and the distribution of the shocks is not constant, no clear inference may exist about the size of the common error. However, we hypothesize that the more important impact is likely to be seen for the idiosyncratic component, in that the idiosyncratic errors should be tighter; that is, greater agreement should be evident among the forecasters. The empirical results presented below confirm this hypothesis.

The degree of synchronization among market participants’ expectations is measured by the cross-sectional standard deviations of all the variables, which are equal to square roots of the diagonal elements of $\Omega_t^F$.
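For a single variable in a single survey month, one way to split the MSE into the two components is sketched below. The sidebar with the authors' exact formulas is not reproduced in this section, so the sketch assumes the standard identity in which the MSE equals the cross-sectional variance of forecasts around the consensus (idiosyncratic) plus the squared consensus error (common), with a $1/N$ divisor. The function name `decompose_mse` and the numbers are hypothetical.

```python
import math


def decompose_mse(forecasts, realized):
    """Split one variable's mean square error into two parts.

    Assumed identity (not taken from the article's sidebar):
        MSE = mean((f_i - consensus)^2) + (consensus - realized)^2
    Returns (idiosyncratic, common, mse).
    """
    n = len(forecasts)
    consensus = sum(forecasts) / n
    idiosyncratic = sum((f - consensus) ** 2 for f in forecasts) / n
    common = (consensus - realized) ** 2
    mse = sum((f - realized) ** 2 for f in forecasts) / n
    return idiosyncratic, common, mse


# Made-up forecasts of one variable (in percent) from three participants.
idio, common, mse = decompose_mse([2.0, 3.0, 4.0], realized=2.5)
print(f"idiosyncratic={idio:.4f} common={common:.4f} overall MSE={mse:.4f}")

# Under this assumption, the square root of the idiosyncratic component is
# the cross-sectional standard deviation used to gauge synchronization.
print(f"cross-sectional standard deviation={math.sqrt(idio):.4f}")
```

Under this identity, tighter agreement among forecasters lowers only the first term, while a shock that surprises everyone raises only the second, which is why the two components can trend differently.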
Figures 4–8 report the cross-sectional standard deviation of each of the five macroeconomic variables considered in this study. These charts clearly show that the trend for these variables has been downward, and the standard deviations tend to be smaller after 1994 than before 1994. These findings suggest that individual participants’ forecasts have indeed been more synchronized since 1994 in terms of both their overall view of the economy and the interest rate variable most closely tied to policy.

5. The reader may recall that by assumption $\Omega_t^R$ does not change from one year to another. We intend to relax this assumption in future research.

Figure 6. Cross-Sectional Standard Deviations of GDP Forecasts. Panels: monthly data; twelve-month moving average. Series: current year, next year (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Figure 7. Cross-Sectional Standard Deviations of Unemployment Rate Forecasts. Panels: monthly data; twelve-month moving average. Series: current year, next year (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Figures 9–14 show the time-series paths of decompositions for each of the five key variables as well as all the variables jointly. One uniform result seen in the first panel of each figure is that the time path of idiosyncratic errors shows a pattern of steady decline as well as a seasonal pattern for the current-year forecasts.
Within the current year, the individual participant’s forecast error becomes much smaller as December approaches. The seasonal pattern is much less obvious for the next-year forecasts (the second panel of each figure) partly because the uncertainty about the economy during the coming year is still large even if one tries to forecast as of December in the current year. For both the current-year and next-year forecasts, a clear pattern of smaller idiosyncratic errors emerges after 1994. Again, these results are consistent with the hypothesis that individual forecasts have been more synchronized since 1994.

Figure 8. Cross-Sectional Standard Deviations of Ten-Year Treasury Note Forecasts. Panels: monthly data; twelve-month moving average. Series: current year, next year (1986–2004). Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.

Patterns of common errors are distinctively different from those of idiosyncratic ones, and the difference seems to be associated with business cycles unrelated to the FOMC statements. One can see from Figures 9–14 that the common errors in the current-year forecast are large relative to the idiosyncratic errors whereas the common errors are dominant in the next-year forecasts. But there is no apparent pattern that the common errors are smaller after 1994 than before. According to the first panel of Figure 9, unusually large common errors for the current-year forecasts of the short-term interest rate occur in 2001. These errors are associated with the unexpected sharp decline of the federal funds rate.
The large common errors of longer-term (next-year) forecasts seem to be associated with missing the turning point of the federal funds rate in the early 2000s and failing to predict the unchanged rate in 2002 and 2003 (the second panel of Figure 9). For CPI inflation, except for two unusually large common errors before 1994, the common errors of the current-year forecasts have similar patterns before and after 1994 (the first panel of Figure 10). The common errors for the next-year forecasts tend to be larger in the period after 1996 than before (the second panel of Figure 10), and no tendency is apparent that these errors have become smaller than before 1994. Typically, as the end of the year approaches, both idiosyncratic and common errors become smaller for the current-year forecasts. But unusually large common errors of the current-year forecasts of real GNP/GDP develop toward the end of 1995, caused mainly by the definition change of the GDP series. When divided by the diminishing variances of forecast errors, these errors are amplified, accounting for the steep drop of accuracy scores toward the end of 1995 (see the first panel of Figure 3). In the first panel of Figure 11, the errors are not divided by the variances of forecast errors and thus are not as visually dramatic as in Figure 3. The substantial, ECONOMIC REVIEW First Quarter 2006 11 F E D E R A L R E S E R V E B A N K O F AT L A N TA Figure 9 Mean Square Errors of Three-Month Treasury Bill Forecasts Current year 5 10 4 8 3 6 2 4 1 2 Federal funds rate (percent) Errors (percentage points) 9/11/2001 0 0 1986 1988 1990 1992 1994 1996 1998 2000 2002 2004 Next year 10 15 12 8 9 6 6 4 3 2 Federal funds rate (percent) Errors (percentage points) 9/11/2001 0 0 1986 1988 1990 1992 Idiosyncratic error 1994 Common error 1996 1998 Overall MSE 2000 2002 2004 Federal funds rate Note: The shaded vertical bars indicate recessions. 
Source: Authors’ calculations from monthly Blue Chip Economic Indicators data persistent common errors of the next-year forecasts in the late 1990s are consistent with the sustained increase in productivity growth being largely unexpected by the public, while the federal funds rate did not change much. The common errors in forecasting the unemployment rate for the current year appear to be somewhat smaller after 1994 than before, but those errors for the next year have similar patterns before and after 1994 (Figure 12). The large common errors for the next-year forecasts have much to do with business cycles and with the errors in predicting output growth. 12 ECONOMIC REVIEW First Quarter 2006 F E D E R A L R E S E R V E B A N K O F AT L A N TA Figure 10 Mean Square Errors of CPI Forecasts Current year 9/11/2001 10 2.4 8 1.8 6 1.2 4 0.6 2 0 Federal funds rate (percent) Errors (percentage points) 3.0 0 1986 1988 1990 1992 1994 1996 1998 2000 2002 2004 Next year 10 2.5 2.0 8 1.5 6 1.0 4 0.5 2 Federal funds rate (percent) Errors (percentage points) 9/11/2001 0 0 1986 1988 1990 1992 Idiosyncratic error 1994 Common error 1996 1998 Overall MSE 2000 2002 2004 Federal funds rate Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data No clear patterns exist in which the common forecast errors of the long-term bond yield have become smaller since 1994 (Figure 13). In particular, the errors around the recent recession are relatively large in magnitude. Interestingly, a noticeable drop in the idiosyncratic errors in both the current-year and next-year forecasts occurs after 1987, when Alan Greenspan became chairman and the effects of the stock-market problems dissipated. Figure 14 summarizes the decomposition of the MSE for the five variables combined. 
For the current-year forecasts, the seasonal pattern is evident, as explained earlier in this article. For the next-year forecasts, the large common errors occurred in the periods around the last two recessions. The persistent and volatile common errors since 1994 are mainly caused by the correlation effect among forecast errors across variables because the forecast errors for individual variables other than GNP/GDP do not share these features. Overall, no evidence indicates that the public’s forecasts of key macroeconomic variables have improved since 1994, following the FOMC’s efforts to increase transparency.

[Figure 11. Mean Square Errors of GDP Forecasts. Panels: current year; next year. Axes: errors (percentage points) and federal funds rate (percent), 1986–2004. Series: idiosyncratic error, common error, overall MSE, federal funds rate; 9/11/2001 is marked. Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.]

The table reports the average percentages of the MSE that are attributed to the idiosyncratic component and the common component. Two methods are used to compute the average percent contributions. The first is to calculate the percent contributions of idiosyncratic and common errors for each period and then average them over all the periods. This method reduces the influence of outliers with extremely large errors, so the results may not conform to the patterns in the charts. The top panel of the table reports these results. The second method is to accumulate the forecast errors of both types throughout the entire sample and then calculate the percent contributions of idiosyncratic and common errors (see the bottom panel of the table). This method is likely to be influenced by outliers but will be consistent with the patterns shown in the charts.

[Figure 12. Mean Square Errors of Unemployment Rate Forecasts. Panels: current year; next year. Axes: errors (percentage points) and federal funds rate (percent), 1986–2004. Series: idiosyncratic error, common error, overall MSE, federal funds rate; 9/11/2001 is marked. Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.]

[Figure 13. Mean Square Errors of Ten-Year Treasury Note Forecasts. Panels: current year; next year. Axes: errors (percentage points) and federal funds rate (percent), 1986–2004. Series: idiosyncratic error, common error, overall MSE, federal funds rate; 9/11/2001 is marked. Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.]

In the top panel of the table, the idiosyncratic errors for the current-year forecasts, except for GNP/GDP, contribute much more to the total errors than the common errors do, even though the common errors are much larger at times. But for all the variables jointly, the common errors become more important.
This result implies that while predicting a single variable may be relatively easy, predicting a set of economic variables may be more difficult.6 For the longer-term (next-year) forecasts, the picture is completely different: The common errors are clearly a driving force for almost all variables (except for CPI), individually and jointly.

[Figure 14. Mean Square Errors of All Variables Forecasts. Panels: current year; next year. Axes: errors (percentage points) and federal funds rate (percent), 1986–2004. Series: idiosyncratic error, common error, overall MSE, federal funds rate; 9/11/2001 is marked. Note: The shaded vertical bars indicate recessions. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.]

Compared to the results in the top panel of the table, the results in the bottom panel give a more dominant role to the common errors, partly because the common errors are much larger than the idiosyncratic errors in some periods. All in all, the common errors clearly play a dominant role in overall forecast errors.

6. One might also infer that different models are being used and that these models perform better on some variables than others, but in aggregate significant differences exist among the forecasts.

Table. Decomposition of the Mean Square Error (percent of MSE)

By average percent contribution to error in each period

| Forecasts | Component | All variables | 3-month T-bill | CPI | GDP | Unempl. rate | 10-year T-note |
|---|---|---|---|---|---|---|---|
| Current-year (1986–2004) | Idiosyncratic | 44.5 | 57.0 | 69.7 | 43.3 | 64.0 | 58.7 |
| Current-year (1986–2004) | Common | 55.5 | 43.0 | 30.3 | 56.7 | 36.0 | 41.3 |
| Next-year (1986–2003) | Idiosyncratic | 30.0 | 40.0 | 52.7 | 41.0 | 36.6 | 48.5 |
| Next-year (1986–2003) | Common | 70.0 | 60.0 | 47.3 | 59.0 | 63.4 | 51.5 |

By percent contribution of total error across sample

| Forecasts | Component | All variables | 3-month T-bill | CPI | GDP | Unempl. rate | 10-year T-note |
|---|---|---|---|---|---|---|---|
| Current-year (1986–2004) | Idiosyncratic | 31.9 | 30.9 | 40.6 | 28.0 | 39.6 | 32.0 |
| Current-year (1986–2004) | Common | 68.1 | 69.1 | 59.4 | 72.0 | 60.4 | 68.0 |
| Next-year (1986–2003) | Idiosyncratic | 22.1 | 15.1 | 38.6 | 20.1 | 24.7 | 32.1 |
| Next-year (1986–2003) | Common | 77.9 | 84.9 | 61.4 | 79.9 | 75.3 | 67.9 |

This finding suggests that unexpected shocks, which of course are also not anticipated in the FOMC statements, are dominant factors in affecting forecast performance, and improvements in policy transparency would be unlikely to make the forecast errors smaller except on the margins.7 Another possibility is that clearer patterns may show up as more observations become available; the FOMC only began in August 2003 to provide explicit guidance on the likely path of future policy and its contingency on future economic conditions. Given the data available today, however, we find no empirical evidence of significant improvement in the common forecast errors over the period in which the FOMC attempted to clarify its views of the economy or the likely course for future policy.

This finding does not necessarily suggest that the movement toward transparency has been a failure. It may simply indicate that no new information was provided in the statements that had not already been inferred by market participants. Given the unpredictable nature of business cycles, moreover, the common error may be mostly affected by factors other than monetary policy transparency.
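The difference between the two averaging methods used for the table can be seen in a small numerical sketch. The Python helpers below are hypothetical, and the error series are made-up numbers chosen so that one period has an unusually large common error; they are not the Blue Chip data.

```python
# Illustrative sketch only: helper names and data are hypothetical.

def average_of_period_shares(idiosyncratic, common):
    """Method 1: common share of the MSE in each period, then averaged.

    Assumes every period has a positive total error.
    """
    shares = [c / (i + c) for i, c in zip(idiosyncratic, common)]
    return 100.0 * sum(shares) / len(shares)


def share_of_accumulated_errors(idiosyncratic, common):
    """Method 2: errors accumulated over the sample, then the common share."""
    return 100.0 * sum(common) / (sum(idiosyncratic) + sum(common))


# Four periods; the last one has an outlying common error.
idiosyncratic = [1.0, 1.0, 1.0, 1.0]
common = [1.0, 1.0, 1.0, 9.0]

print(round(average_of_period_shares(idiosyncratic, common), 1))     # 60.0
print(round(share_of_accumulated_errors(idiosyncratic, common), 1))  # 75.0
```

In this example the single outlying period pulls the accumulated share up to 75 percent, while the per-period average stays at 60 percent, which is the same direction of difference as between the bottom and top panels of the table.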
Vintage Data versus Final Data

One could argue that whenever forecast errors for a particular period are evaluated, final data available at that time should be used. The reason is obvious: From a policy perspective, being able to accurately predict initially released data that are subsequently revised may lead to policy errors, especially when turning points are imminent or when the revisions may substantially alter one’s view of the economy. However, when policy formulation relies heavily upon model forecasts, it is important that those forecasts capture, as well as possible, the true underlying paths for key economic variables. If they do not, then the risk of serious policy errors may be increased. Furthermore, deciding how to choose the vintage data at various points in time is completely arbitrary, and no statistical or economic foundation exists to guide such decisions. The public knows that data such as GDP are often revised and sometimes thoroughly revised. Forecasters take such unpredictable outcomes into account and make their forecasts as accurate as possible on average.

Box: Decomposition of the Mean Square Error

Let the estimate of $\mu_t$ be

$$\hat{\mu}_t = \frac{1}{N_t} \sum_{i=1}^{N_t} y_t^i .$$

Note that $\hat{\mu}_t$ is also the Blue Chip Consensus forecast. The weighted mean square error at time $t$ can be decomposed as

$$
\begin{aligned}
\frac{1}{N_t} \sum_{i=1}^{N_t} x_t^{i\prime} x_t^i
&= \frac{1}{N_t} \sum_{i=1}^{N_t}
\left[ \left( y_t^i - \hat{\mu}_t \right) - \left( y_t - \hat{\mu}_t \right) \right]^{\prime}
\left[ \left( y_t^i - \hat{\mu}_t \right) - \left( y_t - \hat{\mu}_t \right) \right] \\
&= \frac{1}{N_t} \sum_{i=1}^{N_t} \left( y_t^i - \hat{\mu}_t \right)^{\prime} \left( y_t^i - \hat{\mu}_t \right)
+ \frac{1}{N_t} \sum_{i=1}^{N_t} \left( y_t - \hat{\mu}_t \right)^{\prime} \left( y_t - \hat{\mu}_t \right),
\end{aligned}
$$

where the first term on the right-hand side is the MSE attributed to the idiosyncratic component and the second term is the MSE attributed to the common component. The cross term is zero because

$$
\frac{1}{N_t} \sum_{i=1}^{N_t} \left( y_t^i - \hat{\mu}_t \right)^{\prime} \left( y_t - \hat{\mu}_t \right)
= \left( \hat{\mu}_t - \hat{\mu}_t \right)^{\prime} \left( y_t - \hat{\mu}_t \right) = 0 .
$$
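As a rough illustration of the decomposition in the box, the following Python sketch computes the overall cross-sectional MSE for a single period and splits it into the idiosyncratic and common components. The function names and the sample numbers are hypothetical and are not taken from the Blue Chip data, and the sketch assumes that the forecast and realized vectors have already been scaled by whatever weighting is applied to the errors.

```python
# Illustrative sketch only: names and numbers are hypothetical,
# and inputs are assumed to be already weighted (scaled).

def consensus(forecasts):
    """Cross-sectional mean forecast (the estimate mu-hat)."""
    n = len(forecasts)
    width = len(forecasts[0])
    return [sum(f[k] for f in forecasts) / n for k in range(width)]


def squared_length(vector):
    """Inner product of a vector with itself."""
    return sum(value * value for value in vector)


def decompose_mse(forecasts, actual):
    """Return (overall, idiosyncratic, common) MSE for one period."""
    mu_hat = consensus(forecasts)
    n = len(forecasts)
    # Overall MSE: average squared distance of each forecast from the outcome.
    overall = sum(
        squared_length([f - a for f, a in zip(fc, actual)]) for fc in forecasts
    ) / n
    # Idiosyncratic part: dispersion of forecasts around the consensus.
    idiosyncratic = sum(
        squared_length([f - m for f, m in zip(fc, mu_hat)]) for fc in forecasts
    ) / n
    # Common part: squared distance of the consensus from the outcome.
    common = squared_length([a - m for a, m in zip(actual, mu_hat)])
    return overall, idiosyncratic, common


# Two hypothetical forecasters, two variables.
forecasts = [[2.0, 3.0], [4.0, 5.0]]
actual = [2.0, 6.0]
overall, idiosyncratic, common = decompose_mse(forecasts, actual)
print(overall, idiosyncratic, common)  # 7.0 2.0 5.0
```

The two components add up to the overall MSE because the cross term vanishes, exactly as in the box: deviations of individual forecasts from the consensus average to zero.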
In this section, we use the revised and most current data available at the beginning of 2005 to recompute the forecast errors. Figure 15 displays the Blue Chip Consensus accuracy scores with the vintage data and the final data for both the current-year and next-year forecasts. The average current-year score using vintage data is 70.9, while the average current-year score using final data is 67.0, just 3.9 points lower. For the next-year forecast, the average scores using vintage data and final data are very similar: 57.4 using vintage data and 56.4 using final data. During several periods (1992, 1995–96, and 1998) the next-year forecast scores are lower using final data, but several periods (1994, 1999, and 2002) have higher scores. These results indicate that future data revisions are random enough that they do not introduce a bias that significantly affects forecast scores on average. More important, the findings also suggest that the data revisions do not pose significant risks for policymakers.

One would expect, perhaps, a greater disparity between the two scores given that additional revision errors are unpredictable. However, an important advantage of using the final data is that one can avoid the distorted GDP forecast errors caused by the 1995 data revision. By comparing the first panels of Figures 6 and 11, one can see that the distortion is completely eliminated when the final data are used to measure the forecast accuracy. Still, when the 1995 period is excluded, the difference between the current-year scores using vintage and final data increases from 3.9 to 7.7. Looking more closely at the source of this difference, we find that it can be attributed mostly to the GNP/GDP forecast error.

Figure 16 displays the decompositions of forecast errors for GNP/GDP using the final data as realized values. A comparison of this figure with Figure 11 reveals some notable differences in the composition of the errors for both the current-year and next-year forecasts.
In the first panel of Figure 11, we see larger overall errors in 1992 and in the 1996–2004 period that are due to increases in the common component of the forecast error. Consequently, a greater proportion of the error each period is due to the common component. The average contribution of the common component to the overall error rises to 73.9 percent from 56.7 percent. In addition, the overall error in 1995 using vintage data (which resulted from the change to chain-weighted GDP) is no longer present.

7. This interpretation is consistent with the results of Stock and Watson (2003) and Sims and Zha (2006).

[Figure 15. Blue Chip Consensus Scores: Current versus Real-Time Actual Data. Panels: current year; next year. Vertical axis: score (0 to 100), 1986–2004. Series: January 2005 data for actual; real-time actual data. Note: The shaded vertical bars indicate recessions. In the first panel, the average score using real-time actual data is 70.9; the average score using January 2005 data for actual data is 67.0. In the second panel, the average score using real-time actual data is 57.4; the average score using January 2005 data for actual data is 56.3. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.]

For the next-year forecasts in the second panel of Figure 11, we again see that the overall error has increased but to a considerably more modest degree. The overall forecast error prior to the 1990–91 recession is smaller using final data but is greater (in aggregate) for the 1996–2000 period.
But once again, this increase in overall error is attributable to the common component. The average contribution of the common component rises to 61.8 percent from 59 percent.

[Figure 16. Mean Square Error (Using January 2005 Data as Actual Data) of GDP Forecasts. Panels: current year; next year. Axes: errors (percentage points) and federal funds rate (percent), 1986–2004. Series: idiosyncratic error, common error, overall MSE, federal funds rate. Note: The shaded vertical bars indicate recessions. In the first panel, the idiosyncratic percent of total error (per period average) is 26.1; the common percent of total error (per period average) is 73.9. In the second panel, the idiosyncratic percent of total error (per period average) is 38.2; the common percent of total error (per period average) is 61.8. Source: Authors’ calculations from monthly Blue Chip Economic Indicators data.]

Our findings suggest that using final data or vintage data may make little difference when evaluating forecasts. The results show that the average Blue Chip Consensus score is modestly affected for current-year forecasts and almost unchanged for next-year forecasts. In addition, the decrease in score for current-year and next-year forecasts results from an increase in the common component of the forecast error and does not affect the idiosyncratic component. Therefore, the effect of a switch to final data for evaluating individual forecast scores should be roughly equal across forecasts. The use of final data eliminates the need for arbitrarily choosing among different vintages.

Conclusion

In 1994 the FOMC began to release statements after each meeting.
The amount of policy information released in the statements has increased and changed over time. The findings from Kohn and Sack (2003) and Ehrmann and Fratzscher (2004) suggest that financial markets are sensitive to the information revealed in these statements. While knowing whether the statements have affected markets is important, understanding whether the statements are providing strong signals concerning the FOMC’s views about the future path of the economy or economic policy is also important. That is, has the public’s ability to forecast future economic and financial conditions improved since 1994? This question is important because one hopes that transparency, if appropriately communicated, enhances market participants’ ability to forecast (Woodford 2005). This article analyzes the forecast errors across a large section of forecasters and for a set of five key macroeconomic variables. The analysis finds evidence that the individuals’ forecasts have been more synchronized since 1994, implying the possible effects of the FOMC’s transparency. On the other hand, we find little evidence that the common forecast errors, which are the driving force of overall forecast errors, have become smaller since 1994. In fact, common forecast errors have increased and have become more volatile on several dimensions. These common errors seem to be associated with business cycles and other economic shocks. Transparent monetary policy may not necessarily enhance the public’s ability to predict business cycles. On the other hand, it is possible that we do not have a long-enough sample to observe the effects of transparency because the FOMC just began in August 2003 to provide more explicit guidance on the likely path of future policy and its contingency on future economic conditions. We hope that our findings will generate more research on this important topic. 
Appendix 1
Data Description

Three-month Treasury bill rate: 1986–2004. Secondary market, monthly average. Source: Board of Governors of the Federal Reserve System.
Unemployment rate: 1986–2004. All workers sixteen years or older. Source: U.S. Department of Labor, Bureau of Labor Statistics.
Consumer price index: 1986–2004. CPI-U (all urban consumers). Source: U.S. Department of Labor, Bureau of Labor Statistics.
Corporate bond yield: 1986–95. Aaa, monthly average. Source: Moody’s Investors Service Inc.
Gross national/domestic product: 1986–95, not chained; 1996–2004, chained. Source: U.S. Department of Commerce, Bureau of Economic Analysis.
Ten-year Treasury note yield: 1996–2004. Constant maturity, monthly average. Source: Board of Governors of the Federal Reserve System.

Appendix 2
Scores and Ranks for Individual Forecasters

In this appendix, the following table shows the average scores for all the individual forecasters who have continued to participate in the surveys in recent years. The table also includes the consensus forecast and the Bayesian vector autoregressive (BVAR) model. The BVAR model is often used in the empirical literature as a benchmark for model comparison (Robertson and Tallman 1999, 2001), and reporting the real-time forecasting performance of this model is of particular interest to academic researchers. For completeness, we also report other forecasters’ scores toward the end of the table. The years in which each forecaster participated in the Blue Chip surveys are also reported in the table.

Table. Overall Performance: Score

| Forecaster name | Overall avg. score | Overall std. dev. | Current-year avg. score | Current-year std. dev. | Next-year avg. score | Next-year std. dev. | Participation, current year | Participation, next year |
|---|---|---|---|---|---|---|---|---|
| BC (average of top 10) | 82.24 | 16.86 | 86.45 | 15.99 | 77.81 | 16.66 | 228 | 216 |
| BC (consensus) | 64.36 | 23.49 | 70.92 | 24.07 | 57.43 | 20.77 | 228 | 216 |
| Macroeconomic Advisers, LLC | 62.58 | 27.71 | 71.57 | 26.25 | 53.10 | 26.06 | 227 | 215 |
| Schwab Washington Research Group | 62.04 | 28.26 | 69.97 | 27.11 | 53.64 | 27.07 | 197 | 186 |
| Atlanta BVAR | 59.69 | 31.19 | 69.21 | 29.54 | 49.64 | 29.75 | 228 | 216 |
| U.S. Trust Company | 59.25 | 27.15 | 64.61 | 26.25 | 49.96 | 26.25 | 227 | 131 |
| ClearView Economics | 59.23 | 28.94 | 66.69 | 27.72 | 50.10 | 27.99 | 66 | 54 |
| Banc of America Corporation | 59.22 | 27.10 | 63.28 | 27.82 | 54.87 | 25.68 | 204 | 190 |
| Northern Trust Company | 58.75 | 28.01 | 63.34 | 27.27 | 53.17 | 27.95 | 222 | 183 |
| Wayne Hummer & Company | 55.89 | 27.27 | 58.05 | 27.61 | 53.58 | 26.78 | 228 | 214 |
| Moody’s Investors Service | 55.04 | 28.03 | 65.77 | 28.63 | 42.35 | 21.34 | 78 | 66 |
| Perna Associates | 54.61 | 26.31 | 60.90 | 28.35 | 47.82 | 22.08 | 167 | 155 |
| Merrill Lynch | 54.50 | 27.41 | 58.36 | 28.97 | 50.32 | 25.02 | 206 | 190 |
| Wells Capital Management | 53.58 | 28.71 | 59.83 | 28.19 | 46.83 | 27.81 | 161 | 149 |
| National Association of Home Builders | 53.56 | 26.06 | 58.77 | 26.69 | 47.93 | 24.21 | 176 | 163 |
| Nomura Securities | 52.55 | 28.87 | 55.77 | 29.82 | 48.57 | 27.41 | 63 | 51 |
| National City Bank of Cleveland | 52.01 | 26.08 | 56.75 | 26.56 | 46.93 | 24.61 | 224 | 209 |
| DuPont | 51.68 | 25.60 | 57.06 | 28.14 | 46.00 | 21.23 | 228 | 216 |
| Georgia State University | 51.67 | 27.39 | 51.72 | 28.64 | 51.62 | 26.07 | 223 | 211 |
| Fannie Mae | 51.43 | 28.00 | 59.67 | 29.13 | 41.81 | 23.35 | 84 | 72 |
| DaimlerChrysler AG | 51.34 | 29.13 | 58.94 | 29.24 | 43.35 | 26.85 | 226 | 215 |
| Standard & Poors | 51.25 | 30.43 | 58.86 | 30.09 | 42.78 | 28.63 | 120 | 108 |
| Eggert Economic Enterprises | 50.79 | 25.90 | 50.12 | 27.56 | 51.48 | 24.09 | 225 | 215 |
| Siff, Oakley, Marks Inc. | 50.66 | 28.19 | 56.56 | 27.41 | 44.77 | 27.78 | 197 | 197 |
| Evans, Carrol and Associates | 50.43 | 29.77 | 58.01 | 30.35 | 42.86 | 27.21 | 202 | 202 |
| Bank One | 49.82 | 31.39 | 56.87 | 31.75 | 42.21 | 29.22 | 205 | 190 |
| Bear Stearns & Company Inc. | 49.67 | 29.96 | 53.11 | 30.39 | 43.95 | 28.59 | 98 | 59 |
| BC (average of individual scores) | 48.13 | 16.08 | 51.84 | 16.22 | 44.21 | 15.00 | 228 | 216 |
| La Salle National Bank | 47.47 | 29.73 | 54.13 | 32.16 | 40.22 | 24.97 | 158 | 145 |
| Prudential Securities | 47.07 | 31.41 | 47.40 | 33.06 | 46.57 | 28.88 | 175 | 117 |
| Prudential Financial | 47.01 | 26.68 | 50.54 | 28.97 | 43.31 | 23.55 | 201 | 192 |
| Goldman Sachs & Company | 46.28 | 27.19 | 59.47 | 25.49 | 30.49 | 19.85 | 79 | 66 |
| National Association of Realtors | 46.10 | 29.24 | 51.08 | 29.08 | 40.10 | 28.56 | 64 | 53 |
| Conference Board | 45.08 | 29.38 | 52.22 | 31.03 | 37.46 | 25.46 | 224 | 210 |
| Chamber of Commerce, USA | 44.97 | 27.68 | 48.35 | 28.34 | 41.20 | 26.50 | 214 | 192 |
| General Motors Corporation | 44.30 | 28.03 | 46.05 | 29.42 | 42.42 | 26.40 | 162 | 150 |
| Econoclast | 43.29 | 27.13 | 42.32 | 30.94 | 44.32 | 22.44 | 227 | 215 |
| Eaton Corporation | 43.04 | 28.51 | 40.92 | 30.07 | 45.37 | 26.62 | 127 | 115 |
| Turning Points (Micrometrics) | 43.04 | 27.86 | 41.15 | 29.28 | 45.04 | 26.19 | 185 | 174 |
| Comerica | 42.41 | 25.44 | 43.88 | 29.34 | 40.84 | 20.41 | 178 | 166 |
| UCLA Business Forecast | 42.12 | 30.19 | 45.32 | 32.23 | 38.75 | 27.55 | 227 | 215 |
| Motorola Inc. | 42.02 | 28.76 | 50.83 | 31.73 | 31.91 | 20.92 | 102 | 89 |
| JPMorgan Chase | 40.92 | 27.12 | 47.57 | 29.12 | 33.40 | 22.55 | 104 | 92 |
| Kellner Economic Advisers | 40.79 | 23.00 | 41.86 | 24.65 | 39.55 | 21.02 | 91 | 79 |
| Genetski.com | 40.46 | 32.50 | 50.61 | 32.88 | 29.53 | 28.37 | 154 | 143 |
| Wachovia Securities | 40.39 | 27.19 | 44.60 | 31.05 | 35.69 | 21.33 | 98 | 88 |
| Federal Express Corporation | 39.92 | 26.15 | 41.80 | 28.76 | 37.63 | 22.60 | 65 | 53 |
| DRI-WEFA | 39.02 | 27.09 | 48.32 | 28.48 | 27.99 | 20.65 | 77 | 65 |
| Morgan Stanley & Company | 35.95 | 29.39 | 38.27 | 31.92 | 32.30 | 24.75 | 85 | 54 |
| Inforum–University of Maryland | 35.72 | 26.46 | 33.15 | 27.15 | 38.46 | 25.47 | 222 | 208 |
| Deutsche Banc Alex Brown | 30.71 | 28.22 | 31.86 | 26.76 | 29.08 | 30.33 | 91 | 64 |
| Naroff Economic Advisors | 29.96 | 28.67 | 33.36 | 32.65 | 25.86 | 22.59 | 70 | 58 |
| Ford Motor Company | 25.80 | 25.12 | 27.32 | 26.09 | 23.69 | 23.73 | 103 | 74 |
| BC (average of bottom 10) | 7.50 | 6.35 | 6.12 | 6.32 | 8.96 | 6.05 | 228 | 216 |

REFERENCES

Bauer, Andy, Robert A. Eisenbeis, Daniel F. Waggoner, and Tao Zha. 2003. Forecast evaluation with cross-sectional data: The Blue Chip Surveys. Federal Reserve Bank of Atlanta Economic Review 88, no. 2:17–31.

Ehrmann, Michael, and Marcel Fratzscher. 2004. Central bank communication: Different strategies, same effectiveness? European Central Bank, unpublished paper.

Eisenbeis, Robert A., Daniel F. Waggoner, and Tao Zha. 2002. Evaluating Wall Street Journal survey forecasters: A multivariate approach. Business Economics 37, no. 3:11–21.

Faust, Jon, and Eric M. Leeper. 2005. Forecasts and inflation reports: An evaluation. Paper presented at the Sveriges Riksbank conference “Inflation Targeting: Implementation, Communication and Effectiveness,” Stockholm, June 10–12.

Kohn, Donald L., and Brian P. Sack. 2003. Central bank talk: Does it matter and why? Board of Governors of the Federal Reserve System Finance and Economics Discussion Series No. 2003-55, November.

Robertson, John C., and Ellis W. Tallman. 1999. Vector autoregressions: Forecasting and reality. Federal Reserve Bank of Atlanta Economic Review 84, no. 1:4–18.

Robertson, John C., and Ellis W. Tallman. 2001. Improving federal-funds rate forecasts in VAR models used for policy analysis. Journal of Business and Economic Statistics 19, no. 3:324–30.

Sims, Christopher A., and Tao Zha. 2006. Were there regime switches in U.S. monetary policy? American Economic Review 96, no. 1:54–81.

Stock, James H., and Mark W. Watson. 2003. Has the business cycle changed? Evidence and explanations. In Monetary policy and uncertainty: Adapting to a changing economy. Federal Reserve Bank of Kansas City.

Woodford, Michael. 2005. Central-bank communication and policy effectiveness. In The Greenspan era: Lessons for the future. Federal Reserve Bank of Kansas City.
Merchant Acquirers and Payment Card Processors: A Look inside the Black Box

RAMON P. DEGENNARO

The author is the SunTrust Professor of Finance at the University of Tennessee and a visiting scholar at the Federal Reserve Bank of Atlanta. He thanks Jerry Dwyer, Dick Fraher, Scott Frame, Will Roberds, and Lynn Woosley for useful comments and discussions. He is grateful to Timothy Miller and Mario Beltran of NOVA Information Systems for explaining important institutional details and to Lee Cohen and Victoria L. Messman for research assistance.

Like most consumers, you probably take your credit and debit card transactions for granted. You and others like you carry millions of cards and use them billions of times annually. But unless a transaction goes awry, you rarely think about how your cards work. In fact, a great deal happens after you produce your card to pay for a purchase and before the merchant receives funds and you receive your bill.

What happens during the few seconds between the time you swipe your card and the terminal flashes a result? How does that swipe translate into a line on your bill from the institution that issued the card? When making a purchase using a card online or over the telephone, why are you sometimes asked for the three- or four-digit number printed on the back of the card, the card’s expiration date, or arcane information such as your mother’s maiden name? From the merchant’s perspective, how is that same card swipe turned into cash to pay for the goods or services provided? Why does a merchant pay a larger fee when it accepts a card in some circumstances than it does in others? And why was the representative from the payment card company so interested in the merchant’s personal information before the merchant was even permitted to accept cards?

This article answers such questions.
It explains how the card network signs up merchants to accept payment cards and how the sales slips that consumers sign are converted into cash for the merchants. The discussion begins with an explanation of the simplest type of card transaction, one using a private-label card (a card that is accepted by only one merchant), but the focus is primarily on the Visa and MasterCard networks in the United States. The major aspects of payment cards are similar in other countries, although details may differ, especially for cards other than Visa and MasterCard.

The key institutions in this transactions process are the merchant acquirer and the payment card processor. The largest of these often perform both functions. Together, merchant acquirers and processors serve as the communications and transactions link between the merchants and the card issuers.

Merchant acquirers and card processors are important for several reasons. First, every card issuer deals with at least one payment processor, and every merchant that accepts cards has a relationship with a merchant acquirer. Without them, the payment system as we know it would not exist. According to Gerdes et al. (2005), U.S. consumers used credit cards for 19 billion transactions and debit cards for another 15.6 billion in 2003. These figures represent a dollar volume of $1.7 trillion for credit cards and $600 billion for debit cards. In terms of dollar value, annual growth for credit cards between 2000 and 2003 was 9.9 percent, and for debit cards, 21.9 percent. According to the Nilson Report (2005a), in 2004 consumers in the United States held 795.5 million MasterCard and Visa cards (about three cards for every man, woman, and child in the country).

Second, the industry generates revenues through merchant fees, which merchants must recover either through higher prices or more sales, and the dollar amount is substantial. Lucas (2004), for example, reports that debit and credit card fees are the fourth-largest expense for gas stations and convenience stores after labor, rent, and utility costs.

Third, the merchant acquiring and processing industry employs many workers. Jeff Johnson, vice president of search and recruitment with CSH Consulting, estimates that the industry employs about 50,000 people.1

Despite the size of the industry, few people understand the function of merchant acquirers and processors, and almost no academic research on this topic exists.2 The next section describes how regulations and card association rules set the boundaries of the Black Box. A description of a private-label transaction follows. This is the simplest type of card transaction because the card is accepted by only one merchant. The article then identifies some major types of institutions in the payment card industry and traces the transactions process. The following sections describe how chargebacks and fraud affect a merchant acquirer and identify cross-sectional risk differences among card transactions from the perspective of the merchant acquirer.

The Boundaries of the Black Box

Figure 1 presents a schematic of a credit and debit transaction, in which the cardholder is typically aware only of the issuing bank and the merchant. The cardholder deals with the issuing bank and with the merchant under the protection of Regulations Z and E. The issuing bank and the merchant are liminal figures that deal with the cardholder in the realm of these regulations and with the Black Box through the associations and the merchant acquirers.
Unbeknownst to the cardholder, card-based transactions actually travel through the Black Box, a highly evolved group of intermediaries that sign up merchants to accept cards, handle card transactions, manage the dispute-resolution process, and, along with regulatory agencies, set rules that govern card transactions.

[Figure 1. The Boundaries of the Black Box. Labels shown: the Black Box, issuing bank, cardholder, card associations, merchant acquirers, third-party processors, merchant. Caption: Cardholders generally interact with the Black Box only through merchants and issuing banks.]

Things can and do go wrong with card purchases and billings. Sometimes the culprit is poor quality or bad service. Sometimes the merchant fails to deliver the product. Cardholders and merchants may dispute a refund, and fraud by both cardholders and merchants is a constant challenge. Although these matters can cause serious headaches for cardholders and issuing banks, in most cases the financial impact is relatively minor from the cardholder’s perspective. This situation exists because Regulation Z and card association rules limit an innocent cardholder’s liability to at most $50 in almost all cases involving credit card fraud, and Regulation E and association rules provide essentially the same protection for debit card users.3

Regulations Z and E thus shift liability for fraud from the (innocent) cardholder to other parties. By means of contracts, the parties within the Black Box and the issuing banks assume and allocate this liability. In practice, then, Regulations Z and E ensure that most of the losses that result from card-based transactions are allocated among the entities within the boundaries of the Black Box. Aside from initiating a transaction with a merchant at the point of sale, the only time a cardholder interacts with the Box itself is during a dispute.
Even then, if an attempt at resolution between the cardholder and the merchant fails, the cardholder typically turns to the issuing bank for relief. For their part, issuing banks usually interact with the Black Box only through the card associations.

Private-Label Cards

This section describes a simplified example of a transaction using a private-label card, that is, a card accepted only by the merchant that issued it. Examples include cards issued by department stores such as Macy’s and Sears. The transaction begins when the consumer presents the card at the point of sale. The sales clerk enters the purchase amount and, depending on the equipment available, either records the card number and obtains a signature or swipes the card. Depending on the specific merchant, the rest of the transaction cycle is handled either in-house or by a third party such as GE Capital. Sears handled its own processing until 2003, when it sold that part of its business to Citigroup. In this simplified example, the processor bills the cardholder and remits funds to the merchant.

Private-label transactions are relatively simple because only one merchant and one processing entity are involved. For universal cards such as Visa and MasterCard, the situation is more complex not only because many different merchants could have made the sale but also because many different banks could have issued the card. Specialized institutions have evolved to route transactions to the correct business entities, and others have evolved to manage the relationship between the card networks and the merchants.

1. Jeff Johnson, e-mail messages and telephone conversations with author (November and December 2005).
2. An exception is Rochet and Tirole (2002). Their focus differs from this article’s. They develop a theoretical model of optimal interchange fees and the merchants’ decision to accept payment cards.
3. Section 226.13 of Regulation Z addresses credit card “billing errors.” Section 205.11 of Regulation E contains error-resolution procedures for debit cards, and Section 205.6 of Regulation E covers consumers’ obligation for unauthorized transfers.

Table. The Ten Largest U.S. Merchant Acquirers in 2004, Excluding Partnerships and Alliances

| Rank | Ranking by transactions | Ranking by dollar volume |
|---|---|---|
| 1 | First Data | Chase Merchant Services |
| 2 | BA Merchant Services | BA Merchant Services |
| 3 | Chase Merchant Services | First Data |
| 4 | Paymentech | Paymentech |
| 5 | Fifth Third Bank | Nova Information Systems |
| 6 | Global Payments | Fifth Third Bank |
| 7 | Nova Information Systems | Global Payments |
| 8 | Wells Fargo | Wells Fargo |
| 9 | Alliance Data Systems | First National Merchant Solutions |
| 10 | Heartland Payment Systems | Heartland Payment Systems |

Merchant acquirers holding at least 1 percent of U.S. market share in 2004 (by dollar volume), including partnerships and alliances:

1. First Data (including Chase Merchant Services, Paymentech, Wells Fargo, SunTrust, and PNC)
2. BA Merchant Services
3. Nova Information Systems (including KeyCorp)
4. Fifth Third Bank
5. Global Payments
6. First National Merchant Solutions
7. Heartland Payment Systems
8. TransFirst

Source: The Nilson Report (2005b)

Payment Cards: The Industry and Transactions Processing

The industry. The payment card industry comprises many different entities that perform various tasks, and because many of them have formed alliances, the lines between them are often blurred. The card issuer provides the cards to the consumer and, in the case of credit cards, extends credit to the consumer. (See the sidebar for information about different types of cards.) The relationship is business-to-consumer.
The merchant acquirer signs up merchants to accept payment cards for the network. This relationship is business-to-business. These acquirers also arrange processing services for merchants. Processors handle transaction authorization and route a (usually electronic) transaction from the point of sale to the network (front-end processing). Later, they handle the information and payment flows needed to convert the electronic record created at the point of sale into cash for the merchant (back-end processing). Some merchant acquirers perform the processing themselves; others are resellers, which sell the front- and back-end processing services of a third-party processor but do not provide those services themselves. Most of the larger merchant acquirers also function as processors, but almost all of the smaller ones are resellers.

[Figure 2. Parties Involved in a Card Program: A Four-Party Network. Labels: Visa and MasterCard (the associations); merchant acquirers (financial institutions that are members of the associations); card issuers (financial institutions that are members of the associations); service providers; merchants; cardholder.]

The table lists the ten largest merchant acquirers by the number of transactions processed and by dollar volume. Because some acquirers have formed partnerships and alliances, the table also reports the eight groups with more than 1 percent of U.S. market share (by dollar volume). Only a bank may join Visa or MasterCard; as a result, many merchant acquirers and processors form an alliance or partnership with a sponsoring bank. In addition, depending on the needs of the merchant, an acquirer might sell front-end processing from any of several companies and back-end processing from yet another one. These arrangements make the web of relationships messy, complicating the transactions process. The next section clarifies this process.
Transactions processing. Figure 2 illustrates the institutions participating in a transaction involving either of the two major payment card associations, MasterCard and Visa, which are examples of four-party networks.4 The network includes the card issuers and the merchant acquirers/processors plus the cardholders and the merchants. The card issuer distributes cards to consumers, bills them, and collects payment from them. The merchant acquirer recruits merchants to accept cards and provides the front-end service of routing the transaction to the network’s processing facilities. The processor is responsible for delivering the transaction to the appropriate card issuer so that the customer is billed and the merchant receives funds for the purchase. Acquirers often delegate the actual processing to third-party service providers. The sidebar on page 36 provides a brief explanation of the differences between four-party networks and three-party networks (for example, American Express, Discover Card, and Diners Club), in which the card issuer and merchant acquirer are the same entity.

4. The associations, as umbrella organizations, are not counted as a separate group. Neither are service providers because their function is often served by merchant acquirers.

[Sidebar: Types of Payment Cards]

Consumers today can choose from a wide variety of payment cards, and the universe of cards can be partitioned in several ways. For example, one way to differentiate cards is according to the merchants who accept them. Some retailers issue private-label cards that are accepted only in their stores. Examples include Sears and Macy’s. General-purpose cards, by contrast, are accepted by a wide variety of merchants. Visa and MasterCard are the most common examples.

Another way to classify payment cards is by the amount of time consumers have before payment is due. Debit cards enable a direct withdrawal from the user’s savings or checking account, and payment is due much sooner than for a credit or charge card. Debit cards can be used in either online or offline mode. When used in online mode, the card is swiped through a terminal equipped to handle a personal identification number (PIN). In this case, the cardholder enters a PIN instead of signing a transaction slip, and funds are deducted from the user’s account immediately. In offline mode, the card is swiped through a standard terminal, and no PIN is entered. Instead, the merchant obtains the cardholder’s signature. In this case, the customer’s account is debited within two or three days. A debit card user can purchase any amount up to his balance in that account, and some of these cards even come with overdraft protection. In contrast, credit cards and charge cards allow the purchaser a longer period of time before he must deliver funds to cover the purchase, and the card may or may not have a predetermined spending limit. Charge cards require the cardholder to pay the balance in full each month unless special arrangements have been made, while credit cards allow him the option to make only a minimum payment and pay interest on the balance carried from month to month.

Still another way to distinguish payment cards is by the type of issuer. Financial institutions issue bankcards, which may be either charge cards or credit cards. Visa and MasterCard are the most popular examples. Nonfinancial institutions issue non-bankcards. Market participants subdivide these non-bankcards into two subcategories. Nonbank credit cards, such as Discover Card, enable the cardholder to roll over a balance from month to month, while some travel and entertainment cards, such as the American Express Rewards Green Card and the Diners Club Charge Card, are charge cards.

[End of sidebar]

The transactions process has two major parts. The first is authorization, and the second is clearing and settlement.
Authorization is the process of obtaining permission from the bank that issued the card to accept the card for payment. Clearing and settlement is the process of sending transactions through the Visa or MasterCard network so that the merchant can be paid for the sale.

Authorization begins when a consumer presents his card to the merchant for a purchase. Usually, this authorization happens at the point of sale, though an increasing number of transactions are being done in “card not present” situations (for example, online). Merchants usually obtain authorization electronically, either by having the consumer swipe the card through a terminal at the point of sale or by entering the card information manually. However, some transactions still rely on voice authorization, which entails the merchant calling an authorization center to obtain permission to accept the card.5 The terminal sends the merchant’s identification number, the card information, and the transaction amount to the card processor. The processor’s system reads the information and sends the authorization request to the specific issuing bank through the card network. The issuing bank conducts a series of checks for fraud and verifies that the cardholder’s available credit line is sufficient to cover the purchase before returning a response, either granting or denying authorization. The merchant acquirer receives the response and relays it to the merchant. Usually, this process takes no more than a few seconds.

After authorization, the second major part of the transactions process—clearing and settlement—begins. When a consumer purchases an item with a payment card, the consumer and the merchant form a contractual obligation. The merchant agrees to deliver the goods or services, and the consumer agrees to pay for them.
Settlement is the process by which assets are delivered to discharge that obligation. Clearing comprises the series of transaction activities from the moment the trade or purchase occurs until it is settled. Usually, clearing involves the transfer of information rather than assets. Examples include netting numerous trades to reduce the number of deliveries, meeting reporting requirements, or handling failed trades (say, due to an error in recording). In the payment-card industry, the most common example of clearing is the process of transferring transaction information from the merchant to its bank. Clearing, then, includes activities that facilitate settlement.

In practice, clearing and settlement for payment cards is more complicated because several entities are involved. Recall that payment card networks include four distinct parties (Figure 2). Moreover, each of those parties for a transaction could be one of hundreds or thousands of different acquirers or issuers and one of millions of cardholders or merchants. The process differs somewhat depending on the specific merchant acquirer and the type of network. The following discussion outlines the major steps of a typical clearing and settlement process for payment cards.

Figure 3 illustrates a typical transaction cycle. In the first step, the merchant sends its transactions to its merchant acquirer. The merchant acquirer sends this information to the merchant accounting system (MAS) servicing that particular merchant’s account. In some cases, the MAS is a part of the merchant acquirer; in others, it is a different entity.
The MAS distributes the transactions to the appropriate network—Visa transactions to the Visa network, MasterCard transactions to the MasterCard network, and so forth.6 Next, the MAS deducts the appropriate merchant discount fee (to cover the costs of the merchant acquirer’s activities) from the transaction amount and generates instructions to remit the difference to the merchant’s bank for deposit into the merchant’s account. The MAS sends these instructions to the automated clearinghouse (ACH) network, which is a computer-based system used to process electronic transactions between participating depository institutions.7

5. For authorization of card-not-present transactions, merchants must follow procedures designed to minimize error and fraud. For example, merchant acquirers can require use of the Address Verification Service (AVS). AVS offers varying levels of detail, including the cardholder’s ZIP code, street, city, or state. AVS can even verify which bank issued the card; if the buyer can provide that information, then he probably has the card in hand. This verification process helps rule out fraud by someone who has stolen the card number and does not have the card itself.
6. The process for transactions routed to networks other than MasterCard or Visa is somewhat different from the one that follows in the text, particularly regarding the handling of payments.
7. FedACH is part of the Federal Reserve System; the Electronic Payments Network (EPN) is the most notable example of a private ACH network.

To recover these funds, the MAS sends information about the merchant’s transactions to Interchange, which is part of the Visa or MasterCard network. Interchange is the clearing and settlement system that transfers data between the card processor and the issuing bank.
Interchange determines the interchange fee and Visa/MasterCard assessments (to cover the cost of the issuing bank’s services and the network’s costs) and sends the information to the card-issuing bank. In turn, the issuing bank remits the transaction amount, less the interchange fee, to Interchange, which passes it on to the MAS. Finally, the issuing bank bills the cardholder and collects the balance.8

Merchant acquirers provide other services to merchants besides the processing described above, including installing card terminal equipment, recording transactions, providing reports, and handling problems with card processing (Chang 2004). Some acquirers also provide related services such as analyzing the purchasing patterns of the merchant’s customers.

Chargebacks and Fraud

Chargebacks. A merchant acquirer suffers losses if a merchant is unable to make good on credit transactions disputed by customers, called chargebacks. Chargebacks usually occur when a consumer is dissatisfied with a product or service. Beginning with the later of the date on which a transaction is processed or the delivery of the product or service, cardholders have as much as three months to claim a chargeback—sixty days plus up to another month depending on the purchase date relative to the billing cycle.9 The presumption is initially in favor of the customer, and the amount of the chargeback is deducted from the merchant’s account pending the result of a review. If the dispute is resolved in the merchant’s favor, then the merchant recovers the funds. The merchant acquirer is at risk in the event that the merchant fails between the time of the initial sale and the time his account is debited for the chargeback. In this case, according to the card network’s rules, the merchant acquirer is liable and must make restitution to the customer.
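The settlement flows and the chargeback rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an industry implementation: the function names are invented for the example, the 2.5 percent merchant discount and 1.8 percent interchange fee are hypothetical, and Visa/MasterCard assessments and processing costs are ignored.

```python
def settle(amount_cents, discount_bps, interchange_bps):
    """Split one card sale into the flows described in the text.

    Amounts are in cents; rates are in basis points (1 bp = 0.01 percent).
    The rates are hypothetical inputs, not actual industry fees.
    """
    merchant_discount = amount_cents * discount_bps // 10_000
    interchange_fee = amount_cents * interchange_bps // 10_000
    return {
        # The MAS remits the sale amount less the merchant discount via ACH.
        "merchant_receives": amount_cents - merchant_discount,
        # The issuing bank remits the sale amount less the interchange fee.
        "issuer_remits": amount_cents - interchange_fee,
        # What the acquirer keeps before assessments and its own costs.
        "acquirer_gross_margin": merchant_discount - interchange_fee,
    }


def acquirer_chargeback_loss(chargeback_cents, merchant_failed):
    """The acquirer bears a chargeback only if the merchant cannot cover it."""
    return chargeback_cents if merchant_failed else 0


# A $100.00 sale at the assumed rates.
flows = settle(amount_cents=10_000, discount_bps=250, interchange_bps=180)
```

With these assumed rates, a $100.00 sale leaves the merchant $97.50, the issuing bank remits $98.20, and the acquirer keeps 70 cents before assessments and costs. A single $100.00 chargeback that a failed merchant cannot cover would offset the margin on well over a hundred such sales, which is why the acquirer cares about merchant failure.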
Because of this feature, the merchant (and ultimately the merchant acquirer) is at risk of loss for up to several months because the transaction can be reversed. In the language of payments, the transaction is not final. This feature greatly enhances the appeal of credit cards to cardholders, but it also shifts the risk of chargebacks to the merchant acquirer. In essence, the merchant acquirer has insured the issuing bank against an adverse result. The risk of a merchant acquirer’s contingent liability is similar to that of a bank’s guarantee of a debtor’s liabilities or an insurance contract. Merchant acquirers include the cost of this implicit insurance in the price that merchants pay for their services.

Quinn and Roberds (2003) argue that payment-finality rules are essentially loss-allocation rules. The rules determine which party to a transaction absorbs the loss if the transaction is not completed. For example, cash transactions are final when goods or services are exchanged for cash. Absent fraud or a private agreement such as a warranty, neither the buyer nor the seller can cancel the transaction after the exchange. In contrast, because of Visa/MasterCard chargeback provisions, credit card transactions are effectively not final for up to three months after delivery of the good or service. This lack of finality is a key determinant of a merchant acquirer’s risk because, until a transaction is final, the merchant acquirer bears the risk that a merchant cannot cover a chargeback.

The industry attempts to quantify this risk through the closely related concept of delayed delivery. Magazine subscriptions are a good example.
Subscribers pay for subscriptions in advance, and the term of subscriptions can be as much as a few years. If the magazine ceases publication before the term of the contract, then the subscriber has recourse for undelivered issues according to Visa/MasterCard rules. The delay between the sale and the delivery of the goods or services increases the chances that the merchant will fail and be unable to cover the resulting chargeback. The sidebar on page 38 describes an extreme example.

[Figure 3. The Transactions Process. Labels: merchant; merchant acquirers/processor; merchant accounting system; Visa/MasterCard; cardholder; ACH; merchant demand deposit account; issuing bank.]

Fraud. Kahn and Roberds (2005) define fraud risk as the risk that a claim cannot be collected because the identity of the person who incurred the debt cannot be established. They identify three distinct types of fraud. First, existing account fraud is usually traced to stolen account information. For example, a thief who steals a card and orders merchandise commits existing account fraud. The second category is new account fraud, popularly called identity theft. In this case, a thief uses information about a third party to open an account, incurring debts in the name of the victim. Finally, those who commit friendly fraud make legitimate transactions that they later deny having made.

The risk of fraud is especially serious if a merchant takes orders by mail, telephone, or over the Internet. In such card-not-present situations, the Truth in Lending Act frees cardholders from liability—they are not responsible for even the first $50 (association rules provide essentially the same protection for debit card users). This consumer protection shifts the risk to merchants and, in turn, creates a larger contingent liability for merchant acquirers.
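The liability rule just described amounts to a simple split of each fraud loss. The sketch below is only an illustration: the function name is hypothetical, and it assumes the $50 limit applies in full to card-present credit card fraud, whereas the text notes that the limit is an upper bound that holds in almost all cases.

```python
CARDHOLDER_LIABILITY_CAP = 50.0  # dollars, the limit cited in the text


def split_fraud_loss(fraud_amount, card_present):
    """Return (cardholder_share, share_shifted_to_other_parties).

    Assumes the innocent cardholder pays at most the cap when the card
    is present and nothing in a card-not-present transaction.
    """
    cap = CARDHOLDER_LIABILITY_CAP if card_present else 0.0
    cardholder_share = min(fraud_amount, cap)
    return cardholder_share, fraud_amount - cardholder_share
```

For a $400 fraudulent purchase, the cardholder would bear at most $50 at a storefront and nothing for a mail, telephone, or Internet order; the remainder falls on merchants and, contingently, on merchant acquirers.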
8. For debit cards, the billing is done automatically. Put differently, the cardholder’s account is debited, and the cardholder later receives a statement of transactions rather than a bill.
9. Specific details of chargeback terms are complicated because they are governed by law (for example, the Truth in Lending Act), by regulation (Regulation Z for credit cards and Regulation E for debit cards), and by the rules of the card associations and networks. See Furletti and Smith (2005) for more information. The terms in the text are common in the industry.

[Sidebar: Three-Party Networks]

The figure illustrates the three-party analog to the four-party diagram in Figure 2. The only major distinction is that, in three-party networks, the card issuer and the merchant acquirer are the same entity; in four-party networks, they are separate. In four-party networks, banks that are members of Visa and MasterCard issue the payment cards and extend credit to consumers for credit cards. Separate entities are responsible for signing up merchants to accept these cards for payment. In practice, some acquirers are affiliated with or have formed partnerships with card issuers. Most payment cards in three-party networks are nonbank cards, issued by institutions such as American Express rather than by a bank. In almost all cases, the difference between three- and four-party networks is unimportant for cardholders.

[Figure. Parties Involved in a Card Program: A Three-Party Network. Labels: American Express, Discover, and Diners Club; merchant; cardholder; service providers.]

[End of sidebar]

One notorious example involves a merchant that defrauded customers by taking orders with no intention to deliver. Had the merchant been a traditional storefront operation, red flags would have been more apparent. First, customers would have been interacting with the merchant face to face, making it easier to detect suspicious behavior.
Second, customers would have been more likely to benefit from the experiences of other customers; they might have met in the store or overheard conversations and complaints. Finally, either the merchant would have had no inventory or business history at that location (fueling suspicion), or he would have had at least some inventory and other collateral after the firm failed. Either way, the merchant acquirer would have been better off. Instead, because this was a card-not-present situation, the fraudulent merchant was able to collect a large amount over a period of several weeks. When consumers were no longer willing to wait for delivery and filed chargebacks, they were entitled to relief. Because the fraudulent merchant could not pay, the merchant acquirer was forced to make restitution.

Cross-Sectional Risk Factors

Clearly, a merchant acquirer must consider the credit standing of the merchants it services. Merchant acquirers do perform credit analysis, but the analysis is different from that of a more familiar bank loan. A merchant acquirer’s contingent liability is more similar to an insurance contract than to a bank loan. This description fits in part because the acquirer pays only if another entity cannot, but there are other differences. For example, for a bank loan, the bank delivers funds to a borrower. A merchant acquirer, though, advances no funds. Instead, it indemnifies a third party—the card issuer (who in turn indemnifies the cardholder)—in the event that a merchant cannot cover a chargeback.

Another major difference between bank loans and a merchant acquirer’s contingent liability is the term of the contract. Bank loans can have maturities of several years.
In contrast, although consumers can file chargebacks for up to several months after a purchase, the effective term of the contingent liability produced by each transaction is usually measured in a very few days. In addition, merchant acquirers review most accounts at least once a year. Cast in terms of the probability of default times the loss given default, the probability of default is affected in part by the time between account reviews, and the loss given default—again absent delayed delivery—rarely represents more than a few days’ worth of total processing volume at any one time.

Taken together, the merchant acquirer’s annual review of accounts and the short term of the contingent liability have enormous implications for risk. The annual review makes the risk that merchant acquirers face similar to a short-term bond, whereas a bank loan is (sometimes) more similar to a long-term bond. Investors in short-term bonds need not reinvest in the same company when their bonds mature if, for example, a company’s credit quality deteriorates. Long-term investors do not have that option. They can only sell their bonds prior to maturity, likely taking a loss because the credit standing of the bonds has deteriorated. Similarly, if a merchant’s credit quality deteriorates, a merchant acquirer need not renew the relationship, whereas a bank probably cannot cancel a loan unless a covenant has been violated.

Because the merchant acquirer is at risk if the merchant cannot cover a chargeback, the acquirer must evaluate the credit quality of merchants seeking to use the acquirer’s services and monitor the credit quality of the merchants it currently services. The acquirer considers industry effects, firm-specific effects, and even the nature of individual transactions.
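The probability-of-default times loss-given-default framing used above lends itself to a short worked example. The sketch below is illustrative only: the function name and every number in it (a 2 percent chance that the merchant fails between reviews, $10,000 of daily card volume, and exposures of 3 and 90 days) are hypothetical assumptions, not figures from the article.

```python
def expected_loss(prob_default, daily_volume, exposure_days):
    """Expected loss = probability of default x loss given default.

    Loss given default is approximated as the processing volume still
    open to chargeback when the merchant fails.
    """
    loss_given_default = daily_volume * exposure_days
    return prob_default * loss_given_default


# Hypothetical merchant with $10,000 of card sales per day.
typical = expected_loss(0.02, 10_000, exposure_days=3)    # goods delivered at once
delayed = expected_loss(0.02, 10_000, exposure_days=90)   # delivery months later
```

Holding the assumed default probability fixed, stretching the exposure from three days to ninety multiplies the expected loss thirtyfold, which is the sense in which delayed delivery dominates the acquirer’s risk.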
In fact, merchant acquirers charge different fees depending on whether or not a merchant has followed certain procedures for a transaction.

Industry effects. Because customers who regret making a purchase have up to three months to act before their credit card purchases are final, businesses that are susceptible to so-called buyer’s remorse present higher risk to a merchant acquirer. Consider health clubs, which often sell annual memberships at a discount relative to their monthly fee to encourage customers to commit for a longer period. The problem is that many customers regret their commitment after just a few weeks. Although buyer’s remorse alone is not sufficient to win a chargeback dispute, it does give the buyer incentives to try to exploit the process. For example, he might claim that equipment at the club is often broken or that the premises are unsanitary. Because “often” and “unsanitary” are matters of degree, the cardholder has a chance to win the chargeback dispute, putting the acquirer at risk.

Merchants that sell items of high and uncertain value—collectibles are an obvious example—are also prone to customer disputes. Customers can be disappointed in artwork, rare coins, or stamps for any of several reasons. Also, fraud is frequently involved in these types of businesses because the goods may not be genuine or their condition might be exaggerated. Mystics, such as fortune tellers, face high chargebacks due to buyer’s remorse, and one can easily see how customers of gambling establishments could regret a transaction depending on the outcome of a race or sporting event. For this reason, such businesses usually are not authorized to take credit cards for purchases.

[Sidebar: Delayed Delivery in the Extreme]

The nature of airline ticket sales and the industry’s current financial problems combine to form an extreme example of delayed-delivery risk.
Consider a cardholder planning a trip by air. In some cases, the cardholder buys his ticket weeks or even months in advance, and travelers usually pay for their tickets using a credit card. Suppose that the airline fails between the time of purchase and departure. In this case, under credit card association rules, the acquirer must make restitution.

How large can the potential losses be? One merchant acquirer, National City Corporation, reports that as of June 30, 2004, the value of credit card transactions it had acquired for outstanding tickets purchased on United Airlines was $853 million (National City Corporation 2004a). United Airlines is operating under Chapter 11 protection as of this writing. If United Airlines were unable to honor those tickets, then travelers who purchased their tickets using credit cards would be entitled to refunds under Visa and MasterCard rules, and National City held no significant collateral against this potential liability as of June 30, 2004.

The $853 million worth of unflown tickets, of course, represents the potential liability from exposure to United Airlines alone. National City Corporation (2004a) says that it processed over five times that amount—about $5 billion worth of delayed-delivery purchases—during the six months ending June 30, 2004. National City Corporation (2004b) reports that as of December 31, 2004, the value of unflown tickets had been reduced to $547 million.

Of course, the odds are small that National City Corporation would be liable for the full amount of these huge sums. Consider the case of United Airlines. For National City Corporation to be liable for the full amount, three things must happen. First, United Airlines must halt all flights. Second, all ticket holders must file chargebacks within the allotted time limits.
Although this is within their rights, many travelers would instead opt to fly on other airlines, which usually honor the stranded travelers’ tickets on a standby basis (McCartney 2004).1 This provision reduces the number of travelers who file chargebacks. Finally, National City Corporation would have to have a recovery rate of zero in liquidation. This outcome is unlikely because, even as a general creditor, the company could probably recover a portion of its losses from the bankrupt carrier. If National City Corporation anticipates problems, it can also require a security deposit or a line of credit from a bank, or it can delay payment to the merchant.

National City Corporation (2004a) puts the problem in perspective. For the first and second quarters of 2003 and 2004, the company processed about $35 million in chargebacks each quarter, for a total of about $150 million in the four quarters. Actual losses were about $1 million each quarter, for a total of about $4 million. The company had $5 million worth of chargebacks in the process of resolution as of June 30, 2004. The company believes the chance of a “material loss” because of chargeback rules is “unlikely” (National City Corporation 2004a). Still, losses of this size are not trivial, and “unlikely,” of course, does not mean that a material loss is impossible.

1. In November 2005, Congress extended this provision through November 2006. Airlines must honor these tickets but may charge a fee and need only accommodate travelers on a space-available basis.

[End of sidebar]

Instead, customers must get cash advances on their cards and use the cash to make the purchase.

Items that can easily be resold are prone to fraud, so dealers in these products also present higher risks. Consumer electronics and jewelry head the list. Intangible products, particularly downloadable software, tend to attract fraudulent merchants and customers because proof of delivery and the products’ performance are difficult to substantiate.
Timeshare services have high chargeback rates because customers sometimes place deposits months before developers even begin construction, when the suitability of the property is difficult to ascertain. Customer dissatisfaction is more common in such cases.10

Perhaps the best example of an industry effect is the restaurant industry. Most bankers realize that loans to restaurants are very risky. For example, the Cline Group (2003) tracked over 4,000 non-fast-food restaurants in the Dallas area and reported that an average of 23 percent failed during their first year. Yet restaurants are extremely safe customers for merchant acquirers. Why? Consider the nature of a restaurant transaction. The diner finishes the meal, pays using a credit card, and departs. In the vast majority of cases, the consumer is satisfied enough to consider the transaction to be final, and the settling of accounts proceeds normally. Suppose instead that the diner is dissatisfied. Although Visa/MasterCard rules give the diner the right to file a chargeback for several weeks afterward, only in very rare circumstances will the diner pay, leave the premises, and then file a complaint. The diner is more likely to voice his dissatisfaction during the meal, and, almost always, restaurant management accommodates the diner. By the time the consumer uses his credit card, he is satisfied and considers the transaction to be final. The settling of accounts again proceeds normally. Only in very rare circumstances will he still complain after using his credit card. Even then, a complaint does not necessarily imply that the acquirer bears a loss.
For the merchant acquirer to incur a loss, the cardholder must win the chargeback dispute (unlikely in such cases), and the merchant must fail between the time of the sale and the chargeback. Otherwise, the merchant itself and not the acquirer is responsible for the chargeback.

Firm-specific risk. Just as insurers and banks evaluate the credit risk of individual companies, so do merchant acquirers. For example, they study standard measures of financial strength, such as financial ratios of individual firms. For unincorporated businesses, financial statements are often unaudited, so acquirers might use business tax returns to supplement the unaudited statements. For small firms, and especially for unincorporated businesses, acquirers even proceed beyond the firm level and use information about the owners and managers. Acquirers can use credit scores from the Fair Isaac Corporation, commonly known as FICO scores, at the personal level as well as at the business level. Acquirers also use credit report information and the number of years that a potential customer has been in business to gauge risk. Both traditional lenders and merchant acquirers use information that others have already generated about specific firms, for example, whether or not the merchant has existing banking relationships. Almost surely, an international company will receive greater scrutiny than a domestic one. The processing history of a company that already has a relationship with a merchant acquirer is always important, particularly its fraud and chargeback rates.

If the firm’s condition is sufficiently weak, a merchant acquirer might require the owner to offer a personal guarantee; such guarantees are common for small business loans. An acquirer might impose conditions similar to restrictive covenants in business loans. For example, the acquirer might impose a processing limit, which corresponds to a commercial bank’s lending limits. Like a bank, the acquirer might require marginally qualified merchants to provide collateral, usually in the form of a certificate of deposit, cash, or a letter of credit. If the merchant cannot provide collateral, then the acquirer might institute a holdback, or a delayed-payment arrangement. Under such an arrangement, the merchant acquirer withholds payment to the merchant for a predetermined length of time after processing. The duration of the payment delay is usually a function of the delivery delay and, less frequently, the chargeback ratio.

10. For examples of items on restricted lists, see www.internetsecure.com/solutions-faq.htm#2 and www.practicepaysolutions.com/apply/index0007.php.

Transaction-related risks. Banerjee (2004) notes that credit cards were originally designed to be physically present at the point of sale. If merchants followed procedures, then nearly all risks except fraud and delayed delivery declined enormously. This low level of risk is still true for face-to-face transactions. For example, if a merchant swipes a card instead of manually keying the card number, the chance for error drops to near zero. True, the card may have been stolen, but swiping is at least one step toward ensuring legitimacy: A thief must have stolen the card itself and not just the card number. This consideration goes far toward eliminating theft losses from, say, a dishonest waiter who copies the card number while clearing a diner’s tab.

For a growing number of transactions, however, the cardholder and the card are not present. As a result, merchants and merchant acquirers face the challenge of developing new procedures for limiting risk.
Mail-order and telephone-order (MOTO) transactions—and, more recently, Internet transactions—have presented special problems for payment card associations. The most popular approach has been for merchants to have access to increasingly arcane bits of information during authorization. Some help to confirm that the purchaser has possession of the card itself and not just the card number. For example, card associations have long encoded a verification number into the magnetic stripe on the back of the card. Visa calls this code the Card Verification Value (CVV or CVV1); MasterCard’s term is the Card Validation Code (CVC or CVC1). This code, read during the swipe, confirms that the card is actually present at the point of sale. The problem is that this approach cannot help for Internet or MOTO transactions because the card is not present and a swipe is impossible. Associations have had to devise other ways to confirm that the purchaser is in physical possession of the card at the time of the sale. The result is CVV2 and CVC2. These three-digit numbers (different from the magnetically coded CVV or CVC numbers) are printed on the right side of the signature area on the back of the card. Because this number is not embossed on the card, it does not appear on a paper sales slip, making it harder to steal. The customer must have physical possession of the card—or the printed number stolen by some other means—for the buyer to have access to it. CVV2 and CVC2 are only partially effective, though. First, the network merely flags the transaction if the buyer cannot provide the number; it does not refuse it. Second, some situations make it easy to defeat. For example, a dishonest waiter can steal a CVV2 or CVC2 number while clearing a dinner tab just as easily as he can steal a card number. 
Because these procedures are only partially effective, merchant acquirers charge higher fees for card-not-present transactions to compensate for the higher risk.11 Still, CVV2 and CVC2 provide one more layer of protection, and Banerjee (2004) reports that they do help discourage fraud.

Another approach is to prearrange a question and answer or series of questions and answers. Card users might be asked to verify their mother’s maiden name, for example. By allowing cardholders to select from a list of questions, merchants and acquirers make it more difficult for a thief to have the necessary information. The Address Verification Service (AVS) is a good example (see footnote 4). This verification process helps rule out fraud by someone who has stolen the card number and does not have the card itself. These procedures are somewhat effective, but as Banerjee (2004) points out, none of the ways to reduce fraud on the Internet seems to be particularly effective. As evidence, he notes that issuers have not lowered the interchange fees they charge for transactions that follow these procedures. More recent innovations are Visa’s Verified by Visa and MasterCard’s MasterCard SecureCode. Both of these systems use passwords for Internet purchases to ensure that only the cardholder can make such purchases.

These examples illustrate that merchant acquirers can help protect merchants (and therefore themselves) from fraud by setting procedures. After all, most merchants are too small to dedicate resources to designing low-cost, effective fraud-protection procedures, so a merchant acquirer can add value by supplying them. Merchant discount rates provide a means for acquirers to give incentives without mandating a specific procedure for each different merchant.
Merchant acquirers provide these incentives by setting qualification levels for the discount fee that merchants pay; the more hurdles the merchant surmounts for a transaction, the higher the qualification rate and the lower the discount fee. A three-tiered system is common, beginning with the nonqualified rate, which is the lowest acceptable category (with the highest fee); moving to the partially qualified rate; and ending with the qualified rate, which is the highest category. For an example of how these tiers are determined, consider the method of entering the card number. Being hand-keyed without AVS might automatically drop a transaction to the nonqualified rate; adding AVS might move the transaction to the partially qualified rate. Swiping the card could move the transaction into the qualified rate. Different industries sometimes have different qualification criteria. For example, tipping is common in businesses such as restaurants, and the amount of the tip is usually unknown until after the card is swiped or the card number is entered. Therefore, the amount approved is a lower bound on the total amount to be charged. If the final amount including the tip is sufficiently above that lower bound, then the transaction might drop to the partially qualified rate from the qualified rate. A merchant acquirer’s management of individual sales is not limited to the time when the customer places the order. Merchant acquirers often require merchants to follow specific procedures immediately prior to shipping. For example, just before shipping a back-ordered item, a merchant might be required to contact the buyer to verify the customer’s telephone number, mailing and shipping address, or e-mail address. For MOTO or Internet purchases, shippers can insist that products be delivered only to the card’s billing address (rather than delivering to a destination that a would-be thief designates). 
This practice helps reduce fraud because the thief is less likely to attempt fraud in the first place if he knows he may not receive the merchandise.

Finally, card associations have set procedures that force acquirers to cooperate to improve network efficiency. One obvious example is the MATCH list (Member Alert to Control High Risk Merchants), maintained by Visa and MasterCard, which comprises problem companies. If a merchant acquirer denies permission to accept cards to a merchant because of adverse processing behavior and fails to add it to the MATCH list, then the merchant acquirer is liable for losses another provider might suffer from that merchant.

11. For example, see AMS’s Web site at www.merchant-accounts.com/retail-merchant-account.html.

Summary

Consumer and merchant acceptance of payment cards has been phenomenal. Hundreds of millions of cardholders make billions of transactions worth trillions of dollars each year. Yet few cardholders understand how payment networks operate. Most treat them as a Black Box. This article demystifies the transactions process for payment cards, emphasizing the roles of the merchant acquirer and card processor. After outlining the regulations and card association rules that set the boundaries of the Black Box, the article describes a transaction with a private-label card. The discussion then considers the complications introduced by general-purpose cards, such as Visa and MasterCard, and introduces a key participant in the payment card market, the merchant acquirer. The description of the risks borne by merchant acquirers demonstrates that they take losses on these transactions only in rare circumstances, usually when a merchant fails to make good on a chargeback.
The article also delineates some of the risk factors associated with specific industries, merchant types, and transactions that influence the price merchants pay for these transactions services. Finally, the article discusses some ways that merchant acquirers manage the risks that they face, especially the risk of fraud.

REFERENCES

Banerjee, Sankarson. 2004. Credit card security on the Net: Where is it today? Journal of Financial Transformation 12 (December): 21–23.

Chang, Howard H. 2004. Payment card industry primer. Payment Card Economics Review 2 (Winter): 29–46.

Cline Group. 2003. Restaurant start & growth magazine unit start-up and failure study. Cline Group for Specialized Publications, September.

Furletti, Mark, and Stephan Smith. 2005. The laws, regulations, and industry practices that protect consumers who use electronic payment systems: Credit and debit cards. Federal Reserve Bank of Philadelphia Discussion Paper No. 05-01, March.

Gerdes, Geoffrey R., Jack K. Walton II, May X. Liu, and Darrel W. Parke. 2005. Trends in the use of payment instruments in the United States. Federal Reserve Bulletin (Spring): 180–201.

Kahn, Charles M., and William Roberds. 2005. Credit and identity theft. Federal Reserve Bank of Atlanta Working Paper 2005-19, August.

Lucas, Peter. 2004. Why gasoline retailers are fuming. Credit Card Management (August): 20.

McCartney, Scott. 2004. Bill to protect flyers from shutdowns has a surprising beneficiary. Wall Street Journal, October 26.

National City Corporation. 2004a. Form 10-Q: Quarterly report pursuant to section 13 or 15(D) of the Securities Exchange Act of 1934, for the quarterly period ended June 30, 2004, Commission file number 1-10074. Filed August 6, 2004.

National City Corporation. 2004b. Annual report.

The Nilson Report. 2005a. Visa & Mastercard - U.S. 2004. No. 828, February.

The Nilson Report. 2005b. Top U.S. acquirers. No. 831, April.

Quinn, Stephen F., and William Roberds. 2003. Are on-line currencies virtual banknotes? Federal Reserve Bank of Atlanta Economic Review 88, no. 2:1–15.

Rochet, Jean-Charles, and Jean Tirole. 2002. Cooperation among competitors: Some economics of payment card associations. RAND Journal of Economics 33, no. 4:549–70.

International Business Cycles: G7 and OECD Countries

MARCELLE CHAUVET AND CHENGXUAN YU

Chauvet is an associate professor of economics at the University of California, Riverside, and a former research economist at the Atlanta Fed. Yu is a research scientist with the New York State Department of Health.

Monitoring economic activity through the use of composite leading and coincident indicators has been a tradition in the United States for over sixty years, since the seminal book by Arthur Burns and Wesley Mitchell (1946). These indicators are some of the most watched series by the press, businesses, policymakers, and stock market participants. Progressive globalization has sparked a worldwide interest in using economic indicators to analyze cyclical fluctuations. The development of the European Monetary Union and advances in econometric models that explore potential dynamic differences across business cycle phases have given rise to a large recent literature focused on economic indicators and inferences on turning points for European countries.

As markets become more integrated, governments and the private sector seek to conduct their activities in light of both national and international economic conditions. Changes in exchange rates, output, consumption, inflation, and interest rates in different parts of the world can influence the effectiveness of government policies and the competitive position of businesses, even those not directly related to international operations. The benefits of a warning system to detect recessions in major economic partners and in industrialized countries as a whole are considerable.
The more reliable the warning system is, the more efficiently economic policy can be implemented as a pre-emptive action against the negative effects of widespread economic weakness and unemployment. Private businesses also benefit from making decisions based on more complete information regarding demand and supply for their services.

This article constructs an international business cycle indicator using a broad production measure of the G7 countries and the Organisation for Economic Co-operation and Development (OECD) member countries.1 It also builds national business cycle indicators for each of the G7 countries individually using series that comove with their aggregate economic activity. A dynamic factor model with Markov switching (DFMS) is used to combine these macroeconomic series and to estimate probabilities of current business cycle phases for each of the G7 countries and for the aggregate G7 and OECD measures, which can be used as a warning system to monitor country-specific and international business cycles.2

The novelty of this approach is that we extend the DFMS model to include a filter that minimizes the occurrence of false turning points as it sorts out minor contractions and estimates only major economic recessions and expansions. This feature is especially important in situations in which an economy often slows down but does not enter a recession, occurrences that lead to a high rate of false alarms.

The phases of business cycles are well characterized by the model probabilities, which show a clear dichotomy between expansions and recessions for each of the G7 countries and for the aggregate OECD and G7 measures.
The proposed model detects only probabilities of major recessions compared with the probabilities obtained without the filter, which capture several minor contractions for some of the G7 countries. Discerning between major downturns and minor contractions helps to avoid identifying false turning points. This quality is especially important for monetary policy purposes because central banks may want to act only in the event of major recessions affecting several sectors of the economy at the same time, such as employment, sales, output, and income. OECD countries differ in their institutions, monetary and fiscal policies, industrial compositions and structures, and average aggregate growth rates. The results of this study indicate, however, that OECD countries share some common business cycle phases despite their idiosyncrasies. Some economic recessions and expansions were common to the majority of countries studied, characterizing an international business cycle. The results from the probabilities also suggest that the business cycle derived from the OECD and G7 output data coincides with the swings in the euro area. The OECD countries altogether have experienced three major recessions in the period analyzed: during the oil crisis in the mid-1970s, in the early 1980s, and in the early 1990s. Comparing the U.S. business cycle with the international business cycle shows that recessions in the United States are more frequent and of shorter duration than in the aggregate OECD in the sample analyzed. The U.S. economy led the beginning and end of the contractions occurring in the rest of the world in the early 1970s and early 1990s, whereas the 1980s recession started and ended at about the same time in the United States and the OECD countries. Some patterns of lead-lag relationship are also revealed in the business cycle phases among the G7 countries. 
The article begins with an intuitive explanation of the model and then presents the empirical results for the aggregate OECD countries and for each of the G7 countries.

Constructing the Model

This analysis uses a multivariate system to model business cycle fluctuations in G7 and OECD countries. The model is an extension of the DFMS model, which has been successfully applied to represent business cycles worldwide. As in the DFMS model, an unobservable variable is computed as a nonlinear weighted average of the observed coincident macroeconomic series, and it represents the common information related to business cycles contained in these series. This latent variable switches regimes following a two-state Markov process, which represents expansion and contraction phases of the business cycle.

We extend the model by including a self-adjusting variable-bandwidth filter, which enhances the signal-to-noise ratio of the estimated cycles. The advantage of this filter is that it minimizes the occurrence of false turning points because it removes minor economic contractions and estimates only major recessions and expansions (Chauvet 2005). This filtering is especially important in situations with low signal-to-noise ratios, where the detection threshold in Markov-switching models can be low to capture recessions and can thus lead to a high rate of false alarms when the economy slows down but does not enter a recession.

For each of the G7 countries, we apply the model to macroeconomic variables that display simultaneous movements with national gross domestic product (GDP), such as consumption, production, sales, employment, and income, among others. The resulting dynamic factor model characterizes country-specific business cycles.
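To make the regime-switching idea concrete, the following is a minimal Python sketch of a Hamilton-type filter for a two-state Markov-switching mean model. It is not the DFMS model or the variable-bandwidth filter used in this article: it assumes a single observed series, Gaussian noise with a common variance, and parameters that are known rather than estimated. The function name hamilton_filter, the transition matrix, and the sample data are all illustrative assumptions.

```python
import numpy as np

def hamilton_filter(y, mu, sigma, P):
    """Filtered regime probabilities for a two-state switching-mean model.

    y     : observations (for example, quarterly growth rates)
    mu    : regime means, mu[0] for expansion and mu[1] for recession
    sigma : common standard deviation of the Gaussian noise
    P     : 2x2 transition matrix, P[i, j] = Pr(s_t = j | s_{t-1} = i)
    Returns an array of shape (len(y), 2) whose rows sum to one.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    P = np.asarray(P, dtype=float)
    # Initialize with the stationary distribution of the two-state chain.
    prob = np.array([1 - P[1, 1], 1 - P[0, 0]]) / (2 - P[0, 0] - P[1, 1])
    out = np.empty((len(y), 2))
    for t, obs in enumerate(y):
        pred = prob @ P  # regime probabilities before seeing y[t]
        # Gaussian likelihood up to a constant that cancels on normalization.
        like = np.exp(-0.5 * ((obs - mu) / sigma) ** 2)
        post = pred * like
        prob = post / post.sum()  # Bayes update given y[t]
        out[t] = prob
    return out

# Hypothetical inputs: persistent regimes and a short artificial series.
P = np.array([[0.95, 0.05],
              [0.13, 0.87]])
y = [1.0, 0.9, 1.1, -0.8, -1.0, -0.7, 1.0, 1.2]
probs = hamilton_filter(y, mu=[1.0, -1.0], sigma=0.5, P=P)
print(np.round(probs[:, 1], 3))  # filtered probability of recession
```

In this toy setting the second column of probs plays the role of the recession probabilities discussed in the article; the actual model combines several series through a common factor before the regime inference is made.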
We also apply the model to an aggregate measure of output of twenty-nine OECD countries and to the GDP of each of the G7 countries to obtain a broad measure of the international business cycle shared by most industrialized and semi-industrialized countries. The proposed method tracks business cycle fluctuations and generates coincident probabilities of business cycle phases, which can be used to predict business cycle turning points.

The Data

We use quarterly data to build the coincident indicators for each of the G7 countries individually. These data were obtained from the International Financial Statistics database, Datastream Systems Inc., and the OECD database, with different sample ranges. For the United States, we use the same four coincident variables used by the National Bureau of Economic Research (NBER): measures of sales, personal income, industrial production, and employment.3 For the other six countries, we select four series that correspond closely to the same measurement variables used to build the coincident indicators of the U.S. economy (see Table 1). In particular, industrial production and employment are common variables used for all countries. Different measures of income (such as personal income or wages and salaries) are used for all countries except Japan. Other variables used are sales (retail or manufacturing), electricity consumption, GDP, consumption, and manufacturing orders. In order to represent a broad measure of international business cycles, we use the aggregate OECD quarterly industrial production series for its country members combined with the GDP of each G7 country. Table 1 summarizes the information about all the series used.

Table 1
Coincident Variables of G7 and OECD Countries

Country | Series | Sample
OECD countries | Aggregate industrial production for 29 countries | 1960Q1–2000Q1
United States | Industrial production; Total civilian employment; Personal income less transfer payments; Manufacturing and trade sales | 1959Q1–2000Q2
Canada | Industrial production; Employment, business and personal services; Personal income; Personal consumption expenditures | 1967Q2–2000Q1
United Kingdom | Industrial production; Employee jobs; Disposable income; Retail sales | 1980Q1–2000Q2
Japan | Industrial production; Employees nonagricultural ind.; Department stores sales; Electric power consumption | 1973Q1–2000Q2
Germany | Industrial production; Employed persons; Gross wages and salaries; Manufacturing orders | 1962Q2–2000Q2
France | Industrial production; Employment except agriculture; Gross disposable income; Gross domestic product | 1978Q1–2000Q1
Italy | Industrial production; Employment; Wages and salary earnings; Private consumption | 1982Q1–1999Q4

Source: International Financial Statistics database, International Monetary Fund, Datastream Systems Inc., and OECD database

Empirical Results4

Business cycle phases are well characterized by the estimated probabilities, which display a clear dichotomy between expansions and recessions for each G7 country and for the aggregate G7 and OECD measures, as shown in Figure 1. The coincident probabilities of recessions increase substantially during recessions and display low values during expansions.

Figure 1 also compares the probabilities of recession from the DFMS model with and without the self-adjusting variable-bandwidth filter. For the model with the filter, the probabilities of recessions detect only major recessions, but in the model without the filter the probabilities also capture several other minor contractions in addition to the major recessions for some G7 countries. The fact that the probabilities estimated without the filter capture minor contractions is not a disadvantage per se if the goal is in fact to capture them. However, if the aim is to discern between major downturns and minor contractions, then the filter reduces the risk of calling false turning points. This feature is especially important for monetary policy purposes because central banks may want to change the size and direction of changes in interest rates depending on the severity of the economic downturn.

1. G7 members are Canada, France, Germany, Italy, Japan, the United Kingdom, and the United States. OECD members are Australia, Austria, Belgium, Canada, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Luxembourg, Mexico, the Netherlands, New Zealand, Norway, Poland, Portugal, Spain, Sweden, Switzerland, Turkey, the United Kingdom, and the United States. Since the Slovak Republic became a member only in December 2000, the aggregate industrial production series we use does not include this country.

2. See Chauvet and Hamilton (2006) for a detailed explanation of the multivariate DFMS model and the univariate Markov switching model.

3. The NBER’s decisions regarding business cycle dates are considered the official U.S. turning points and are used as the benchmark for model comparison.

4. The model selected by diagnostic and predictive performance tests in identifying turning points is an autoregressive specification of order two for each country and for the aggregate OECD and G7 series.

Figure 1
Coincident Probabilities of Recessions for G7 and Aggregate OECD Countries
Panels: OECD countries, United States, Canada, United Kingdom, Japan, Germany, France, Italy. Each panel plots the probability of recession, 1970–2000, for the DFMS model with filter and the DFMS model without filter.
Note: The graphs show the probabilities of recessions using a DFMS model with a self-adjusting variable-bandwidth filter and a DFMS model without a filter.
Source: Estimated probabilities from the proposed DFMS model with filter

Figure 2
Probabilities of Recessions for the Aggregate OECD Countries and the United States
Panels: CEPR recession dating for the euro area; NBER recession dating for the United States. Each panel plots the probabilities of recession for the OECD and the United States, 1970–2000.
Note: The shaded vertical bars indicate recessions.
Source: Estimated probabilities from the proposed DFMS model with filter

In order to analyze business cycle phases, we define turning points based on whether the probabilities of recessions and expansions are smaller or greater than 50 percent.
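The 50 percent rule can be sketched in a few lines of Python. The function name recession_spells and the sample probabilities are hypothetical; the sketch only marks the periods in which the recession probability is above the threshold and does not reproduce the dating procedure behind the article's tables.

```python
def recession_spells(prob_recession, threshold=0.5):
    """Return a list of (start, end) index pairs for recession spells.

    A spell starts in the first period whose recession probability is above
    the threshold and ends in the last consecutive period above it. If the
    sample ends while the probability is still above the threshold, the end
    is reported as None. A series that begins above the threshold is treated
    as starting a spell at index 0.
    """
    spells = []
    start = None
    for t, p in enumerate(prob_recession):
        in_recession = p > threshold
        if in_recession and start is None:
            start = t  # probability moved above the threshold
        elif not in_recession and start is not None:
            spells.append((start, t - 1))  # last period above the threshold
            start = None
    if start is not None:
        spells.append((start, None))  # recession still ongoing at sample end
    return spells

# Hypothetical quarterly recession probabilities.
probs = [0.1, 0.2, 0.7, 0.9, 0.6, 0.3, 0.1, 0.8]
print(recession_spells(probs))
```

Mapping the indices back to calendar quarters gives the start and end dates of each recession implied by the rule.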
For example, the beginning of a recession occurs when the probability of a recession moves from below 50 percent to above 50 percent. This rule provides a good definition of turning points because the estimated probabilities clearly distinguish times when an expansion is more likely from those when a recession is more likely. OECD countries.5 Figure 2 shows the full-sample probabilities of recession for the aggregate output of the OECD and GDP of each G7 country. The probabilities of recessions and expansions can be interpreted as a representation of business cycle phases for industrialized and semi-industrialized countries given that the analysis includes twenty-nine member countries. Table 2 summarizes some features of the probabilities of recession measure. The average duration of a recession shared by OECD countries is eight quarters, and the average probability that the economy will enter a recession is 87 percent. Expansions last twenty quarters on average, and the average probability of entering an expansion is 95 percent. According to the recession probabilities, OECD countries altogether have experienced three major recessions in the period analyzed: during the oil crisis in the mid-seventies, in the early eighties, and in the early nineties. The results from the probabilities suggest that the business cycle obtained from the broad OECD output measure coincides with the euro area’s business cycle. 
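The durations and probabilities quoted above for the OECD aggregate are consistent with a simple property of two-state Markov chains. As a hedged reading (an assumption, since the text does not define these averages formally), suppose the reported probability p is the chance of remaining in the same phase from one quarter to the next. The length D of a phase is then geometrically distributed, and its expected value follows directly:

```latex
\begin{align*}
\Pr(D = k) &= p^{\,k-1}(1 - p), \qquad k = 1, 2, \dots \\
E[D] &= \sum_{k=1}^{\infty} k \, p^{\,k-1}(1 - p) = \frac{1}{1 - p} \\
\text{Expansions: } p = 0.95 &\;\Rightarrow\; E[D] = \frac{1}{0.05} = 20 \text{ quarters} \\
\text{Recessions: } p = 0.87 &\;\Rightarrow\; E[D] = \frac{1}{0.13} \approx 7.7 \text{ quarters}
\end{align*}
```

Under this reading, the values line up with the twenty-quarter and eight-quarter average durations reported for expansions and recessions.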
The timing of recessions is very close to the euro area’s recessions as dated by the Centre for Economic Policy Research (CEPR) Business Cycle Dating Committee (see the shaded area in the second panel of Figure 2), which is a European counterpart to the NBER Business Cycle Dating Committee.6 During periods that the CEPR classifies as expansions, the probabilities of recessions are generally close to zero. At CEPR peak dates (the onset of recessions), the probabilities of recession increase substantially above 50 percent and stay high until the trough dates (the end of recessions). Table 3 compares the CEPR recession dating for Europe and the recession dating obtained from our model of OECD countries. Of the six estimated turning points in the period studied (three peaks and three troughs), three match exactly, and the other three are off by only one or two quarters. This dating also coincides with the euro area dating by Artis, Marcellino, and Proietti (2003), Dopke (1999), Artis, Krolzig, and Toro (2004), Anas and Ferrara (2004), and Krolzig (2001), among others.

Table 2
Estimated Business Cycles of OECD and G7 Countries

Country | Number of full recessions | Average expansion probability | Average expansion duration (quarters) | Average recession probability | Average recession duration (quarters)
OECD | 3 | 0.95 | 20 | 0.87 | 8
United States | 4 | 0.94 | 17 | 0.84 | 6
Canada | 4 | 0.95 | 20 | 0.88 | 8
United Kingdom | 3 | 0.96 | 25 | 0.86 | 7
Japan | 4 | 0.95 | 20 | 0.83 | 6
Germany | 3 | 0.90 | 10 | 0.89 | 9
France | 3 | 0.96 | 25 | 0.87 | 8
Italy | 3 | 0.95 | 20 | 0.87 | 8

Source: Authors’ calculations based on estimated probabilities from the proposed DFMS model with filter

Table 3
Business Cycle Dating for OECD Countries and the Euro Area

CEPR peak | CEPR trough | Model peak | Model trough
1974Q3 | 1975Q1 | 1974Q3 | 1975Q2
1980Q1 | 1982Q3 | 1980Q1 | 1982Q4
1992Q1 | 1993Q3 | 1991Q3 | 1993Q3

Note: The CEPR columns give the CEPR dating for the euro area; the model columns give the model dating for OECD countries.
Source: CEPR (2003); authors’ calculations based on estimated model probabilities

5. Since G7 members are also OECD members, the results of the combination of aggregate G7 outputs are subsumed in the OECD results.

6. Although the techniques used differ between the NBER and the CEPR, the dating generated by these institutions is similar in the sense that it is often used as a benchmark. The euro area considered by the CEPR includes Austria, Belgium, Finland, France, Germany, Greece, Ireland, Italy, Luxembourg, the Netherlands, Portugal, and Spain.

Figure 2 also compares the probabilities of recession for OECD countries and the U.S. economy, the NBER dating for U.S. recessions, and the CEPR dating of recessions for the euro area. Recessions in the United States are more frequent and of shorter duration than in the aggregate OECD countries during the period studied. The U.S. economy led the beginning and end of contractions occurring in the OECD countries in the recessions of the early 1970s and early 1990s, whereas the 1980s recession started and ended at about the same time in the United States and the OECD. However, the U.S. economy experienced two recessions between 1980 and 1982 while only one long recession occurred in the OECD countries altogether.

G7 countries. Figures 3–9 plot the probabilities of recession for all G7 countries and contrast these probabilities with the NBER dating for the United States (the first panels of Figures 3–9) and the CEPR dating for the euro area (the second panels of Figures 3–9) for each country. The probabilities of recession show some similarities and differences in the business cycles of the G7 countries. The G7 countries also experienced three or four full recessions in the period studied;7 Japan experienced an additional recession in 1997–99. The most similar recession across the G7 countries is the one that took place in the mid-1970s, which hit all economies at about the same time. The recession in the early 1980s was a long one, lasting three or four years for some countries (France, Germany, the United Kingdom, and Japan) and for the aggregate OECD and G7 measures, whereas for a few countries (Italy, the United States, and Canada), two shorter recessions instead occurred close to each other during the same period.

Figures 3–9
Probabilities of Recessions for the United States (Figure 3), Canada (Figure 4), the United Kingdom (Figure 5), Japan (Figure 6), Germany (Figure 7), France (Figure 8), and Italy (Figure 9)
Panels in each figure: NBER recession dating for the United States; CEPR recession dating for the euro area. Each panel plots the probability of recession, 1970–2000.
Note: The shaded vertical bars indicate recessions.
Source: Estimated probabilities from the proposed DFMS model with filter

The main difference in business cycles among these countries concerns the early 1990s recession. This recession started earlier in the United Kingdom, the United States, Canada, and Japan while in Germany and the other G7 countries this recession did not begin until one or two years later. For the aggregate OECD and G7 countries, this recession started and ended at about the same time as the CEPR date for the euro area (Figure 2).
The NBER dates the beginning of this recession in the United States in July 1990 while the CEPR dates the start of the recession in the euro area in the first quarter of 1992.

The closest estimated probabilities of recession are for the United States and Canada. Recessions began and ended at about the same time in these two countries. Italy and France also have very close recession timing. The one difference between these two countries is in the early 1980s: France experienced a single long recession while Italy had two recessions during this period.

The probabilities of recession for Germany, the United Kingdom, and Japan are also very similar for the first two recessions in the sample. The probabilities suggest that recessions in the United Kingdom occurred slightly ahead of those in Germany and Japan and were closer in timing to recessions in the United States and Canada. In the 1990s recession, the U.K. economy contracted even before the U.S. and Canadian economies. Overall, recessions in the United Kingdom occurred earlier than in other European countries, followed by Germany. Recessions in the United Kingdom also lasted longer than those in the United States, Canada, and Germany.

The Japanese economy displays dynamics similar to those of the other G7 and OECD countries in the 1970s and 1980s. However, Japan experienced two severe and long recessions in the 1990s: one in 1991–94 and another in 1997–99 (Figure 6). The earlier recession hit Japan at about the same time that it hit the United States but did not end until much later, coinciding with the trough of the recession in the OECD countries.

7. The sample begins in 1970 and therefore does not include the recessions that occurred in the United States and Canada around 1969–70.
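The turning-point comparison reported in Table 3 can be recomputed directly from the dates in that table. The short Python sketch below does so under simple assumptions: the dates are transcribed from Table 3, each CEPR peak or trough is paired with the model peak or trough in the same row, and the helper names (`quarter_index`, `turning_point_gaps`) are illustrative only and are not part of the authors' model or code.

```python
# Turning points transcribed from Table 3 (quarterly dates written as "YYYYQn").
CEPR = {"peak": ["1974Q3", "1980Q1", "1992Q1"],
        "trough": ["1975Q1", "1982Q3", "1993Q3"]}
MODEL = {"peak": ["1974Q3", "1980Q1", "1991Q3"],
         "trough": ["1975Q2", "1982Q4", "1993Q3"]}


def quarter_index(label):
    """Map 'YYYYQn' to a running count of quarters so dates can be subtracted."""
    year, quarter = label.split("Q")
    return int(year) * 4 + (int(quarter) - 1)


def turning_point_gaps(reference, model):
    """Return (kind, reference date, model date, gap) for each turning point.

    The gap is the model date minus the reference date, in quarters, so a
    negative value means the model dates the turning point earlier.
    """
    gaps = []
    for kind in ("peak", "trough"):
        for ref_date, model_date in zip(reference[kind], model[kind]):
            gap = quarter_index(model_date) - quarter_index(ref_date)
            gaps.append((kind, ref_date, model_date, gap))
    return gaps


gaps = turning_point_gaps(CEPR, MODEL)
for kind, ref_date, model_date, gap in gaps:
    print(f"{kind:6s} CEPR {ref_date}  model {model_date}  gap {gap:+d} quarters")

exact = sum(1 for _, _, _, gap in gaps if gap == 0)
largest = max(abs(gap) for _, _, _, gap in gaps)
print(f"exact matches: {exact} of {len(gaps)}; largest gap: {largest} quarters")
```

Run on the Table 3 dates, the sketch reports three exact matches out of six turning points and a largest gap of two quarters (the early 1990s peak, which the model dates two quarters before the CEPR), in line with the comparison described in the text.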
The Asian financial crisis that started in 1997 marked the beginning of a second 1990s recession in Japan that was not experienced by any of the other G7 countries studied.

Conclusions

This article constructs business cycle indicators for the G7 countries and for an aggregate measure of output for twenty-nine industrialized and semi-industrialized OECD member countries. We extend the Markov-switching dynamic factor model by adding a self-adjusting variable-bandwidth filter. The model produces probabilities of the current business cycle phase for each G7 country and for the aggregate OECD and G7 output measures, which can be used as a warning system to monitor country-specific and international business cycles. As a result of the filter, the probabilities of recession display a clearer distinction between recessions and expansions, reducing the risk of calling false turning points.

We find a common business cycle underlying the twenty-nine OECD countries, characterizing an international business cycle. The probabilities of recession for the aggregate OECD countries indicate that they shared three major recessions in the period analyzed: during the oil crisis in the mid-1970s, in the early 1980s, and in the early 1990s. The most similar recession in terms of timing and duration across countries is the one that took place in the mid-1970s, and the most divergent is the one that occurred in the early 1990s.

REFERENCES

Anas, Jacques, and Laurent Ferrara. 2004. A comparative assessment of parametric and non-parametric turning points detection methods: The case of the Euro-zone economy. In Papers and proceedings of the third Eurostat colloquium on modern tools for business cycle analysis: Statistical methods and business cycle analysis of the Euro zone, edited by Gian Luigi Mazzi and Giovanni Savio. Luxembourg: Office for Official Publications of the European Communities.
Artis, Michael, Hans-Martin Krolzig, and Juan Toro. 2004. The European business cycle. Oxford Economic Papers 56, no. 1:1–44.
Artis, Michael, Massimiliano Marcellino, and Tommaso Proietti. 2003. Dating the euro area business cycle. CEPR Discussion Paper No. 3696, January.
Burns, Arthur F., and Wesley C. Mitchell. 1946. Measuring business cycles. New York: National Bureau of Economic Research.
Centre for Economic Policy Research. 2003. Euro Area Business Cycle Dating Committee. Press release, September 22. Available online at <www.cepr.org/press/dating.pdf>.
Chauvet, Marcelle. 2005. Estimating multivariate models with latent variable Markov switching. University of California, Riverside, photocopy.
Chauvet, Marcelle, and James D. Hamilton. 2006. Dating business cycle turning points in real time. In Nonlinear time series analysis of business cycles, edited by Costas Milas, Philip Rothman, and Dick van Dijk. Vol. 276, Contributions to economic analysis. Amsterdam: Elsevier Science and Technology.
Dopke, Jorg. 1999. Stylised facts of euroland’s business cycle. Jahrbucher fur Nationalokonomie und Statistik 219, nos. 5–6:591–610.
Krolzig, Hans-Martin. 2001. Business cycle measurement in the presence of structural change: International evidence. International Journal of Forecasting 17, no. 3:349–68.