clevelandfed.org/research/workpaper/index.cfm

Working Paper 9517

SOME MONTE CARLO RESULTS ON NONPARAMETRIC CHANGEPOINT TESTS

by Edward Bryden, John B. Carlson, and Ben Craig

Edward Bryden is a statistician at Booz, Allen, and Hamilton, Inc., Cleveland. John B. Carlson is an economist and Ben Craig is an economic advisor at the Federal Reserve Bank of Cleveland. The authors would like to thank Ben Keen for excellent research assistance. This paper was presented at the Midwest Macro Conference, Michigan State University, on September 16, 1995.

Working papers of the Federal Reserve Bank of Cleveland are preliminary materials circulated to stimulate discussion and critical comment. The views expressed herein are those of the authors and not necessarily those of the Federal Reserve Bank of Cleveland or of the Board of Governors of the Federal Reserve System.

December 1995

Abstract

For long periods since 1982, core inflation has behaved as if it were generated by a process with a fixed mean and serially independent error term. Nonparametric changepoint tests proposed by Pettitt (1979) and Lombard (1987) suggest that since 1982, changes in core inflation have been infrequent and rather abrupt. However, little is known about the small-sample properties or the power of these tests, or about their robustness when a series is not i.i.d. This paper uses Monte Carlo analysis to investigate the probabilities of false positive tests under alternative assumptions about the time-series properties of the underlying process.

1. Introduction

In many situations with economic time series, researchers are faced with the question of whether an underlying probability distribution has changed in some distinct way. To illustrate, consider a continuous distribution $F(x, \theta_i)$ on a sequence of independent random variables $x_1, \ldots, x_T$. If $\theta_1 = \cdots = \theta_\tau = \theta$, while $\theta_{\tau+1}, \ldots, \theta_T$ differ from $\theta$ in some unknown way, the sequence is said to have a changepoint at $\tau$.

Often, the changepoint problem is examined under specific assumptions regarding the form of the underlying distribution. The distributional assumptions, however, may be controversial. Moreover, evidence suggests that some established procedures may be quite sensitive to deviations from assumed distributional forms.

In light of these issues, Lombard (1987) argues that initial analysis requires procedures that are robust against deviations from distributional assumptions. He proposes a nonparametric procedure for identifying changepoints. By replacing data with functions of rank statistics, nonparametric techniques provide a distribution-free test of the null hypothesis of no change. Such methods also offer some protection against the potential effects of spurious outliers.

The gains from generality, however, have potential costs, particularly in terms of the power of the test. Without a specific distribution, we have no analytic means for assessing the power lost. For example, how quickly does the Lombard procedure identify a changepoint when one does occur? Moreover, most economic time series exhibit some degree of autocorrelation. It is thus useful to assess how sensitive Lombard's tests are to deviations from the serial independence assumption.

To address these issues, we present some results from Monte Carlo experiments designed to estimate significance levels in small samples, the likelihood that a changepoint will be detected by a given time period, and the probability of a false positive in the presence of serial correlation. Although our analysis is limited to univariate estimators, we offer an application that illustrates their potential usefulness.

The rest of the article is organized as follows. Section 2 describes alternative nonparametric estimators of changepoints.
The design of the Monte Carlo analysis is presented in section 3, while the results are discussed in section 4. To illustrate how our results may aid an initial analysis of the changepoint problem, we provide an application of the nonparametric procedures to a measure of core inflation, the 15 percent trimmed mean. These results are given in section 5. Section 6 presents a discussion of our findings in a more general context. We offer some concluding thoughts in the final section.

2. Nonparametric Changepoint Tests

In his widely cited paper, Pettitt (1979) offers an appealing nonparametric test to detect changepoints based on the Mann-Whitney two-sample test. Using his notation, let

$$U_{t,T} = 2W_t - t(T+1),$$

where $W_t$ is the sum of the ranks of the first $t$ observations and $U_{t,T}$ is equivalent to the Mann-Whitney statistic for testing whether the two samples $x_1, \ldots, x_t$ and $x_{t+1}, \ldots, x_T$ are from the same population. For the hypothesis $H_0$: no change versus $H_A$: change, Pettitt proposes the statistic

$$K_T = \max_{1 \le t < T} |U_{t,T}|.$$

Pettitt shows that the significance value associated with an observed value $k$ of $K_T$ is approximated by

$$p_{OA} = 2\exp\{-6k^2/(T^3 + T^2)\},$$

which is accurate to two decimal places for $p_{OA} \le 0.5$.

Although some situations may dictate that a changepoint occurred rather abruptly, it is often more realistic to assume that a change occurs smoothly over a period of time. For this purpose, Lombard introduces a smooth-change specification:

$$\theta_i = \begin{cases} \xi_1, & i \le \tau_1, \\ \xi_1 + (i - \tau_1)(\xi_2 - \xi_1)/(\tau_2 - \tau_1), & \tau_1 < i \le \tau_2, \\ \xi_2, & \tau_2 < i \le T, \end{cases} \qquad (1.1)$$

where $\xi_1$, $\xi_2$, $\tau_1$, and $\tau_2$ are unknown. Note that the abrupt-change model is a special case where $\tau_2 = \tau_1 + 1$. Moreover, an onset of a trend is a special case characterized by $\tau_2 = T$ and $\tau_1 < \tau_2 - 1$.

Using standard rank statistic notation, the rank of $x_i$ is denoted as $r_i$. The rank score of $x_i$ is given by

$$s(r_i) = [\phi(r_i/(T+1)) - \bar{\phi}]/A \qquad (1 \le i \le T),$$

where $\phi$ is an arbitrary score function, $\bar{\phi}$ is its average value, and $A$ is the standard deviation of that function.
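As a rough illustration of the Pettitt statistic and the rank scores just described, the following Python sketch computes $U_{t,T}$ from ranks, takes $K_T$ as the largest absolute value, and evaluates the approximate significance value $2\exp\{-6K_T^2/(T^3+T^2)\}$. It is a sketch under stated assumptions rather than the authors' code: the series is assumed to have no ties, $W_t$ is taken to be the sum of the ranks of the first $t$ observations, the default score function is the identity $\phi(u) = u$, and $A$ is computed with a $1/T$ normalization. The function names are ours and purely illustrative.

```python
import math

def ranks(x):
    """Ranks 1..T of the observations (assumes a continuous series, so no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def pettitt(x):
    """Return (K_T, estimated changepoint, approximate p-value)."""
    T = len(x)
    r = ranks(x)
    k_max, t_hat, w = 0, 0, 0
    for t in range(1, T):            # t = 1, ..., T-1
        w += r[t - 1]                # W_t: sum of ranks of x_1, ..., x_t
        u = 2 * w - t * (T + 1)      # U_{t,T} = 2 W_t - t (T + 1)
        if abs(u) > k_max:
            k_max, t_hat = abs(u), t
    p = min(1.0, 2.0 * math.exp(-6.0 * k_max ** 2 / (T ** 3 + T ** 2)))
    return k_max, t_hat, p

def rank_scores(x, phi=lambda u: u):
    """Standardized rank scores s(r_i) = (phi(r_i / (T + 1)) - mean) / A.

    Assumption: mean and A are the average and standard deviation of
    phi(i / (T + 1)) over i = 1..T, with a 1/T normalization.
    """
    T = len(x)
    vals = [phi(i / (T + 1)) for i in range(1, T + 1)]
    mean = sum(vals) / T
    sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / T)
    return [(phi(r / (T + 1)) - mean) / sd for r in ranks(x)]

# Illustrative series with an obvious level shift after the fifth observation.
series = [0.1, 0.3, 0.2, 0.5, 0.4, 2.1, 2.3, 2.0, 2.4, 2.2]
k, t_hat, p = pettitt(series)
scores = rank_scores(series)
print(k, t_hat, round(p, 3))
```

For this made-up series the two halves are perfectly separated, so $K_T$ attains its largest possible value $t(T-t) = 25$ at $t = 5$. With only ten observations the approximate significance value is still slightly above 5 percent, which hints at why small samples are a concern for this test.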
When changepoints $\tau_1$ and $\tau_2$ are known (e.g., $\tau_1 = t_1$ and $\tau_2 = t_2$), Lombard suggests

[equation (1.2) missing in the extracted text]

as a rank test statistic to test $H_0$: $\xi_1 = \xi_2$ of the smooth-change model (1.1). When $\tau_1$ and $\tau_2$ are unknown, he suggests rejecting $H_0$ for large values of the statistic

[equation (1.3) missing in the extracted text]

An interesting special case of the smooth-change model is when $\tau_1 = \tau$ and $\tau_2 = T$. Lombard calls this the onset-of-trend model because the parameter $\theta_i$ is initially stable, but slowly increases or decreases after time $\tau$. Under this constraint, (1.3) reduces to

[equation missing in the extracted text]

as a rank test statistic to test the null hypothesis of no change.

For each of these test statistics, Lombard derives the asymptotic distributions under the null hypothesis and provides a table of significance points. Asymptotic significance points are shown to be applicable when sample sizes are at least 30. A method for estimating both $\tau_1$ and $\tau_2$ is also provided.

A single abrupt-change test emerges as a special case where $\tau_1 = \tau$ and $\tau_2 = \tau + 1$. Lombard denotes this statistic as $m_{1,T}$, where

[equation missing in the extracted text]

He shows that under the null hypothesis, $T m_{1,T}$ converges in distribution to the limiting form of the Cramer-von Mises goodness-of-fit criterion, for which significance points are available in Anderson and Darling (1952, p. 203).

In the case of multiple abrupt changes, Lombard suggests

[equation missing in the extracted text]

where the sum is taken over indices $1 \le \tau_1 < \cdots < \tau_k < T$, as a test statistic for the null hypothesis that the level parameters $\xi_j$ are all equal. This statistic converges in distribution. Lombard (1987) provides asymptotic significance points for cases $k = 2$ and $k = 3$ (table 2, p. 609).

3. Experimental Design

Monte Carlo methods are used here to estimate significance levels in small samples, the power of each test, and the sensitivity of the tests to deviations from the assumption of serial independence.
To estimate small-sample significance levels for a given asymptotic critical value of 5 percent, we generate at least 5,000 samples of varying length. The various tests are applied to each of the generated series, and the percentage of trials that reject the null hypothesis is the estimated significance level.

To assess the power of each test, we generate series of varying initial lengths, each normally distributed with zero mean and unit standard deviation. To each of these series, we append additional terms generated by the same distribution, but with a different mean. A battery of tests is applied sequentially, once after each appended term. The first time a test rejects the null hypothesis, its position in the series is tabulated and no further tests are performed. From these data, we obtain the percentage of trials for which the null is first rejected at each additional term. The cumulative sum of the percentages is our estimate of the percentage of tests that detect a change in mean by a given period, i.e., the power of the test.

Finally, to assess the sensitivity of each test to the assumption of serial independence, we generate a set of first-order autocorrelated series for each of several lengths and for alternative values of the autocorrelation coefficient $\rho$. The mean of each series is unchanged. The percentage of times the null is rejected is hence a measure of the percentage of false positives when the assumption of serial independence is violated.

4. Results

Table 1 reports the estimated significance levels for simulated series of a standard normal random variable when the asymptotic critical value of the test is 5 percent. Generally, all tests perform better as the sample size increases, i.e., estimated significance levels tend to approach 5 percent. The smooth-change test seems best for the smallest sample size, while the Pettitt test consistently performs less favorably than the other tests, especially in small samples.
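The significance-level experiment described in section 3 can be sketched in a few lines of Python. This is a minimal sketch, not the authors' program: it uses only the Pettitt test (rejecting when the approximate significance value $2\exp\{-6K_T^2/(T^3+T^2)\}$ is at or below 5 percent), and the sample lengths, seed, and function names are our own illustrative choices. Any of the Lombard statistics could be substituted for `pettitt_p` given its formula and critical value.

```python
import math
import random

def pettitt_p(x):
    """Approximate p-value of Pettitt's K_T, computed from ranks (no ties assumed)."""
    T = len(x)
    order = sorted(range(T), key=lambda i: x[i])
    r = [0] * T
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    k_max, w = 0, 0
    for t in range(1, T):
        w += r[t - 1]                                  # running rank sum W_t
        k_max = max(k_max, abs(2 * w - t * (T + 1)))   # |U_{t,T}|
    return min(1.0, 2.0 * math.exp(-6.0 * k_max ** 2 / (T ** 3 + T ** 2)))

def estimated_size(T, trials=5000, alpha=0.05, seed=0):
    """Share of i.i.d. N(0,1) series of length T for which the null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(T)]
        if pettitt_p(x) <= alpha:
            rejections += 1
    return rejections / trials

# Illustrative sample lengths; the paper's exact grid is an assumption here.
sizes = {T: estimated_size(T) for T in (12, 24, 60)}
for T, rate in sizes.items():
    print(f"T={T:3d}  estimated significance level: {100 * rate:.2f}%")
```

Because the series are generated under the null, every rejection is a false positive, so the rejection share is the estimated significance level to be compared with the nominal 5 percent.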
Figure 1a compares the powers of the Lombard and the Pettitt tests when there is a one-standard-deviation increase in the mean. It is clear that the Lombard test dominates. The Pettitt test statistic is evaluated at its estimated changepoint. If we were to restrict the test to reject the null hypothesis only if the changepoint estimate were exactly correct, the power difference would be even greater.¹ The Lombard test statistic does not depend on an estimate of the changepoint.

¹ For example, by the sixth period the Pettitt test found the correct changepoint only 2.52 percent of the time.

Figure 1b illustrates that the power of both tests improves significantly for a two-standard-deviation change and that the Lombard test still dominates the Pettitt test. It is interesting that the power of all tests is higher in periods immediately following a change when the initial sample size is small. It appears as though power is greatest when the change occurs in the middle of the sample.

Figure 2 illustrates the estimated probabilities of detecting a smooth change with a smooth-change test. Each panel contrasts results for both one- and two-standard-deviation changes that occur smoothly over 12 periods. Because the change is spread out over a year, the smooth-change test takes around 15 months to detect a one-standard-deviation change with a probability of 0.5. A two-standard-deviation change occurring within 12 periods, on the other hand, is detected within 10 months. The power of the smooth-change test also seems highest when a change occurs in the middle of a sample. Nevertheless, initial sample size does not appear to be especially important.

Figure 3 illustrates the estimated probabilities for detecting an onset of trend with an onset-of-trend test. Each panel contrasts results for onsets of trend occurring at both one-half and one standard deviation per 12 periods. Again, initial sample size matters little.
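The sequential power experiment behind these detection curves can be sketched as follows. This is a hedged illustration only: it uses the Pettitt test as a stand-in for the full battery of tests, rejects when the approximate significance value is at or below 5 percent, and the trial count, horizon, seed, and function names are our assumptions rather than the authors' settings.

```python
import math
import random

def pettitt_p(x):
    """Approximate p-value of Pettitt's K_T, computed from ranks (no ties assumed)."""
    T = len(x)
    order = sorted(range(T), key=lambda i: x[i])
    r = [0] * T
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    k_max, w = 0, 0
    for t in range(1, T):
        w += r[t - 1]
        k_max = max(k_max, abs(2 * w - t * (T + 1)))
    return min(1.0, 2.0 * math.exp(-6.0 * k_max ** 2 / (T ** 3 + T ** 2)))

def detection_curve(n0, shift, horizon=24, trials=500, alpha=0.05, seed=0):
    """Cumulative share of trials in which a mean shift is detected by each period.

    Each trial starts with n0 draws from N(0,1); draws from N(shift,1) are then
    appended one at a time and the test is rerun after every new term.
    """
    rng = random.Random(seed)
    first_hit = [0] * horizon
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n0)]
        for h in range(horizon):
            x.append(rng.gauss(shift, 1.0))
            if pettitt_p(x) <= alpha:
                first_hit[h] += 1      # tabulate the first rejection only
                break                  # no further tests for this trial
    curve, total = [], 0
    for count in first_hit:
        total += count
        curve.append(total / trials)
    return curve

curve_1sd = detection_curve(n0=24, shift=1.0)
curve_2sd = detection_curve(n0=24, shift=2.0)
print("1 s.d. shift, detected by period 12:", curve_1sd[11])
print("2 s.d. shift, detected by period 12:", curve_2sd[11])
```

Tabulating only the first rejection and then accumulating the shares mirrors the design in section 3, so each point on a curve is the estimated probability that the change has been detected by that period.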
The powers of both the smooth-change and Lombard one-change tests for detecting an abrupt one-standard-deviation change are compared in figure 4. We find little difference between the two tests when a change occurs after an initial sample of 12. Although the one-change test is slightly better at detecting an abrupt change between 7 and 12 periods, the smooth-change test performs better after 15 periods. We suspect, however, that these differences are due to the limited number of trials.

Similarly, the powers of both the smooth-change and onset-of-trend tests for detecting an onset of trend are compared in figure 5. The simulated trend change occurs at a rate of one standard deviation per year after an initial sample of 24. The onset-of-trend test is slightly better after the tenth period, while the smooth-change test appears to be better immediately after the break.

Finally, table 2 presents estimated significance levels for each of the tests when first-order autocorrelation is present. These values indicate the percentage of false positives associated with each test for five values of $\rho$ and for varying initial sample sizes. Not surprisingly, a high degree of positive serial correlation leads to a high proportion of false positives, especially as sample size increases. For a $\rho$ of 0.2, the estimated significance levels hover around 10 percent for the Lombard one-change, smooth-change, and onset-of-trend tests for all sample sizes. The same value of $\rho$ leads to considerably higher estimated significance levels for the Lombard three-change tests. In the latter, the significance level increases with sample size.

For negative values of $\rho$, all estimates of significance levels are below 5 percent. Thus, negative autocorrelation tends to bias tests against rejecting the null. Although sample size appears to matter little overall, significance levels tend to decline with sample size when $\rho$ equals -0.3 and to increase with sample size when $\rho$ equals -0.1.
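The serial-correlation experiment can be sketched along the same lines. In this illustration, which again uses only the Pettitt test and our own hypothetical function names, each series is a stationary Gaussian AR(1) process with constant mean, so any rejection is a false positive. The stationary starting draw, the series length of 60, and the trial count are assumptions of the sketch.

```python
import math
import random

def pettitt_p(x):
    """Approximate p-value of Pettitt's K_T, computed from ranks (no ties assumed)."""
    T = len(x)
    order = sorted(range(T), key=lambda i: x[i])
    r = [0] * T
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    k_max, w = 0, 0
    for t in range(1, T):
        w += r[t - 1]
        k_max = max(k_max, abs(2 * w - t * (T + 1)))
    return min(1.0, 2.0 * math.exp(-6.0 * k_max ** 2 / (T ** 3 + T ** 2)))

def ar1_series(T, rho, rng):
    """Zero-mean AR(1): x_t = rho * x_{t-1} + e_t, e_t ~ N(0,1), stationary start."""
    x = [rng.gauss(0.0, 1.0 / math.sqrt(1.0 - rho * rho))]
    for _ in range(T - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x

def false_positive_rate(T, rho, trials=2000, alpha=0.05, seed=0):
    """Share of constant-mean AR(1) series for which the no-change null is rejected."""
    rng = random.Random(seed)
    hits = sum(pettitt_p(ar1_series(T, rho, rng)) <= alpha for _ in range(trials))
    return hits / trials

# The five values of rho match the column headings of table 2.
rates = {rho: false_positive_rate(60, rho) for rho in (-0.3, -0.1, 0.2, 0.5, 0.9)}
for rho, rate in rates.items():
    print(f"rho={rho:+.1f}  estimated false-positive rate: {100 * rate:.1f}%")
```

Comparing the rates across values of $\rho$ shows the direction of the distortion: positive autocorrelation produces persistent runs that the rank statistic can mistake for a level shift, while negative autocorrelation makes rejections rarer than the nominal level.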
To summarize, we find that sample size can matter for some nonparametric tests. The smooth-change test suggested by Lombard appears to be the most powerful under most circumstances. Nevertheless, less dominant tests such as the Pettitt can corroborate changepoint dates. Our Monte Carlo results also indicate that serial dependence can matter.

5. An Application

Since 1982, core inflation, as measured by the 15 percent trimmed mean, has behaved much differently than it did over the previous 15 years (see figure 6). Indeed, relative to the earlier period, core inflation appears to have become a stationary process with little or no serial correlation.

Figure 7 illustrates more clearly that around May 1988, inflation appears to have moved higher. In each of the five months following May, the trimmed mean registered persistently above its previous average. The measure varied around this higher level until February 1991, when its mean appears to have moved permanently lower. The abruptness of these changes suggests that the underlying process experienced at least two permanent changes in its mean over the sample period. Thus, as an initial analysis, it would seem suitable to use nonparametric changepoint methods to test this hypothesis.

Table 3 presents test statistics for the various changepoint tests in selected samples. These results confirm that there were at least three changepoints: one abrupt changepoint in May 1988, another in January 1991, and a smooth change between June 1991 and July 1993 (see figure 8). When applied to the whole sample, all changepoint tests indicate significant location changes, with the most likely being an abrupt change of about one and a half standard deviations immediately after January 1991. We split the sample at this point and found the other two changepoints within each subsample.
The Lombard estimates for the smooth-change range also indicated an abrupt change in May 1988, the same date that the Pettitt test estimates. Stratification around these changes failed to produce any evidence of additional changes.

As our Monte Carlo results indicate, the presence of autocorrelation can bias the tests toward rejecting the null of no change. We therefore examine the autocorrelation functions for periods both between and across changepoints. Figure 9a, for example, illustrates the estimated autocorrelation functions for selected periods from January 1983 to January 1991. In no case is there any evidence of autocorrelation. The Ljung-Box-Pierce statistic fails to reject the hypothesis that the error process is white noise. Thus, we conclude that the data support the use of our techniques, and we accept the alternative that a changepoint occurred in May 1988.

Similarly, figure 9b illustrates the estimated autocorrelation functions beginning in February 1991. Although there is some indication of negative serial correlation, we suspect that this may reflect the preliminary nature of the seasonal factors used to adjust the data. Nevertheless, the Ljung-Box-Pierce statistic fails to reject the hypothesis of white noise for either sample. We conclude that the data support the hypothesis of a smooth decline in the mean between June 1991 and July 1993. Since July 1993, the trimmed mean has averaged less than 3 percent.

6. Discussion

We must emphasize that our application of the nonparametric methods is meant to be an initial analysis of the time-series properties of one measure of core inflation. Nevertheless, we are surprised at how far these techniques can take us. Our results raise some important questions about the conventional wisdom concerning the inflation process. For example, many economists believe that inflation is either highly autocorrelated or nonstationary.
The data examined above indicate that for periods as long as 65 months, the trimmed mean appears to have been generated by a process with a fixed mean and no serial correlation.

Although we make no claims about what theoretical models may account for the regularities uncovered, we believe the results may provide some guidance in forming modeling strategies. For example, do the periods of stationarity reflect particular monetary regimes? If so, one would clearly want to consider the relevant restrictions on the policy-reaction function implied by the facts. Moreover, what accounts for the abrupt changes? Perhaps S-s type models could account for adjustment in inflation. Our purpose here is not to provide a basis for what we find, but to illustrate how useful some simple empirical techniques are in an initial investigation.

7. Conclusions

On the basis of our Monte Carlo simulations, we conclude that the smooth-change test suggested by Lombard is the preferred test in most situations, particularly when the researcher has no knowledge about the location of the change. The estimated significance levels in small samples were closest to asymptotic values for this test, while its power was at least as good as that of the other tests in almost all cases. If one suspects that an onset of trend in the series began more than 10 periods earlier, we recommend using the onset-of-trend test.

Although one might expect nonparametric tests to lack power, our experiment reveals that these techniques perform reasonably well. A one-standard-deviation change in mean is generally detectable half the time within 12 periods. The power improves if the initial sample is even smaller. Not surprisingly, the presence of positive serial correlation biases all tests toward rejecting the null.

While the application we propose seems well suited to the techniques applied, it illustrates a need to extend our analysis in several directions.
First, we recognize a need to provide some common parametric changepoint test as a benchmark, particularly for assessing the power lost by choosing nonparametric tests. Second, it would be useful to nest our tests into a richer framework that would enable us to discriminate between a changepoint and a series that is autocorrelated. These issues will be addressed in a forthcoming extension of this study.

References

Anderson, T.W., and D.A. Darling. "Asymptotic Theory of Certain 'Goodness of Fit' Criteria Based on Stochastic Processes," Annals of Mathematical Statistics, vol. 23 (1952), pp. 193-212.

Box, George E.P., and Gwilym M. Jenkins. Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day, 1976.

Bryan, Michael F., and Stephen Cecchetti. "Measuring Core Inflation," Federal Reserve Bank of Cleveland, Working Paper No. 9304, June 1993.

Lombard, F. "Rank Tests for Changepoint Problems," Biometrika, vol. 74, no. 3 (1987), pp. 615-24.

Pettitt, A.N. "A Nonparametric Approach to the Changepoint Problem," Applied Statistics, vol. 28 (1979), pp. 126-35.

Figure 1a: Estimated Probability that a One-Standard-Deviation Change in Mean Will Be Detected by a Specified Period
[Three panels: initial series of 12, 24, and 72 periods. Vertical axis: 0% to 100%. Horizontal axis: found change by period, 1 to 24. Legend: Lombard test; other legend entries not legible.]
Source: Authors' calculations.

Figure 1b: Estimated Probability that a Two-Standard-Deviation Change in Mean Will Be Detected by a Specified Period
[Three panels; only the first panel title, initial series of 12 periods, is legible. Vertical axis: 0% to 100%. Horizontal axis: found change by period, 1 to 24. Legend: Lombard test; other legend entries not legible.]
Source: Authors' calculations.

Figure 2: Estimated Probability that a Smooth Change over 12 Periods Will Be Detected by a Specified Period with a Smooth-Change Test
[Three panels; legible panel titles: initial series of 12 periods and of 36 periods. Vertical axis: 0% to 100%. Horizontal axis: found change by period, 1 to 35. Legend: 1-standard-deviation change, 2-standard-deviation change.]
Source: Authors' calculations.

Figure 3: Estimated Probability that an Onset-of-Trend Change Will Be Detected by a Specified Period with an Onset-of-Trend Test
[Three panels; legible panel titles: initial series of 12 periods and of 24 periods. Vertical axis: 0% to 100%. Horizontal axis: found change by period, 1 to 35. Legend: 1/2-standard-deviation change per 12 periods, 1-standard-deviation change per 12 periods.]
Source: Authors' calculations.

Figure 4: Estimated Probability that a One-Standard-Deviation Abrupt Change Will Be Detected by a Smooth-Change Test Versus a Lombard One-Change Test
[Horizontal axis: found change by period, 1 to 35.]
Source: Authors' calculations.

Figure 5: Estimated Probability that an Onset of Trend of One Standard Deviation per 12 Periods Will Be Detected by a Smooth-Change Test Versus an Onset-of-Trend Test
[Horizontal axis: found change by period, 1 to 24. Legend: onset-of-trend test, smooth-change test.]
Source: Authors' calculations.

Figure 6: Core Inflation as Measured by the 15 Percent Trimmed Mean
[Vertical axis: percent.]
Source: Federal Reserve Bank of Cleveland.

Figure 9a: Autocorrelation Functions of Selected Samples
[Three panels: January 1983 to January 1991; January 1983 to May 1988; June 1988 to January 1991. Vertical axis: coefficient, -0.4 to 0.4. Horizontal axis: lags, 1 to 24.]
Source: Authors' calculations.

Figure 9b: Autocorrelation Functions of Selected Samples
[Two panels: February 1991 to July 1995; June 1993 to May 1995. Vertical axis: coefficient, -0.4 to 0.4. Horizontal axis: lags, 1 to 24.]
Source: Authors' calculations.
Table 1: Estimated Significance Levels for N(0,1) (Percent)
Columns: Pettitt statistic; Lombard test statistics by number of changepoints (One, Two, Three), Smooth, and Trend.
Rows: sample size (12 through 120; intermediate sizes not legible).
Values in the order extracted (row and column alignment not recoverable):
0.72 3.73 1.92 3.36 1.40 4.42 1.95
4.88 4.29 3.36 4.74 3.59 2.47 4.40
4.30 3.30 4.51 4.03 2.92 4.81 4.55
4.42 4.80 4.26 3.02 5.00 4.44 4.54
5.00 4.50 3.30 4.52 4.49 4.96 5.08
4.21 3.85 5.07 4.90 4.42 4.70 4.92
Source: Authors' calculations.

Table 2: Estimated Significance Levels in Presence of First-Order Autocorrelation
Columns: $\rho$ = -0.3, -0.1, 0.2, 0.5, 0.9.
Rows: test (Pettitt, Lombard 1, Lombard 3) by sample size (24, 36, 60, 120).
[Table values are missing in the extracted text.]
Source: Authors' calculations.

Table 3: Changepoint Test Results
Columns: Pettitt statistics (Max U); Lombard test statistics (One, Two, Three, Trend, Smooth, t1, t2).
[Table values are missing in the extracted text.]
* Significant at the 5 percent confidence level.
Source: Authors' calculations.