Small Sample Properties of GMM for Business Cycle Analysis

Lawrence J. Christiano and Wouter den Haan

Working Papers Series, Macroeconomic Issues
Research Department, Federal Reserve Bank of Chicago
February 1995 (WP-95-3)

SMALL SAMPLE PROPERTIES OF GMM FOR BUSINESS CYCLE ANALYSIS*

Lawrence J. Christiano† and Wouter den Haan‡

March, 1995

Abstract

We investigate, by Monte Carlo methods, the finite sample properties of GMM procedures for conducting inference about statistics that are of interest in the business cycle literature. These statistics include the second moments of data filtered using the first difference and Hodrick-Prescott filters, and they include statistics for evaluating model fit. Our results indicate that, for the procedures considered, the existing asymptotic theory is not a good guide in a sample the size of quarterly postwar U.S. data.

JEL numbers: C12, C15, E32.

Keywords: Monte Carlo Simulation, Generalized Method of Moments, Finite Samples, Covariance Matrix Estimation, Spectral Densities, Prewhitening, Hypothesis Testing

*The authors are grateful to Neil Ericsson, Rob Engle, Andrew Levin, Masao Ogaki, Kenneth West, and the referees for comments, and Christiano is grateful to the National Science Foundation for financial support.
†Northwestern University, NBER, and Federal Reserve Banks of Chicago and Minneapolis.
‡University of California, San Diego

1 Introduction

Statistical tools based on the generalized method of moments (GMM) procedures outlined by Hansen (1982) are increasingly being used in the analysis of business cycles.
(See, for example, Backus, Gregory, and Zin 1989; Backus and Kehoe 1992; Backus, Kehoe, and Kydland 1994; Braun 1994; Braun and Evans 1991, 1995; Burnside, Eichenbaum, and Rebelo 1993; Cecchetti, Lam, and Mark 1993; Christiano and Eichenbaum 1992; den Haan 1995; Fisher 1993; Marshall 1992; and Reynolds 1992.) For the most part, the theory available for conducting inference with these tools is asymptotic. Recently, efforts have been made to investigate the finite sample accuracy of the asymptotic theory. Much of this work has focused on the sampling distribution of statistics used in the empirical analysis of asset pricing and inventory investment models. Analyses in the context of studies of asset pricing theories include Burnside (1991); Ferson and Foerster (1991); Kocherlakota (1990); Neely (1993); and Tauchen (1986). West and Wilcox (1992) and Fuhrer, Moore, and Schuh (1993) conduct finite sample studies of inference in the context of inventory investment models. Analyses of the finite sample properties of instrumental variables estimation include Christiano (1989), Ericsson (1991), and Nelson and Startz (1990).

This paper uses Monte Carlo methods to investigate the finite sample properties of statistics often used in the analysis of business cycles. We are particularly interested in the finite sample performance of GMM for conducting inference about correlations, standard deviations, and relative standard deviations of data that have been filtered to induce covariance stationarity. We focus on the two filters most often used in business cycle analysis: the first difference filter and the Hodrick-Prescott (HP) filter. Statistics based on these filters provide different information about the data because the filters emphasize different frequencies.
We also examine the finite sample properties of the GMM procedures implemented in Christiano and Eichenbaum (1992) for testing the null hypothesis that a model's implications for a set of second moments correspond with the actual second moment properties of the data.

To calculate the asymptotic standard errors using GMM, one needs to estimate the zero-frequency spectral density of a particular disturbance process. We estimate this using the heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimators described in Newey and West (1987, 1994), Andrews (1991), and Andrews and Monahan (1992). The estimators differ according to the choice of kernel, the bandwidth parameter, and the order of prewhitening. One factor that distinguishes our investigation from previous ones is our analysis of the HP filter. Our statistical environment (chosen for empirical plausibility) has the property that the HP filter introduces a complicated serial correlation pattern into the relevant GMM disturbance process. As is well known, persistence puts zero-frequency spectral density estimators to a severe test.

We begin our analysis by investigating the coverage probabilities of confidence intervals computed for various second moments. We first discuss these issues thoroughly in the context of a univariate data generating mechanism (dgm) that has proved useful in the analysis of several macroeconomic data series. The advantage of this dgm, aside from its empirical plausibility, is that its simplicity enables one to develop intuition about the reasons that alternative HAC estimators have different finite sample properties. We then analyze the finite sample properties of these HAC estimators using artificial data generated by a multivariate dgm, estimated by fitting a vector autoregression to the set of postwar U.S. macroeconomic data typically analyzed in business cycle studies.
Next, we evaluate the finite sample properties of the chi-square test implemented in Christiano and Eichenbaum (1992) for testing the fit of an equilibrium business cycle model. The test compares a model's implications for second moments with the actual second moments estimated in the data. The test takes into account the joint sampling uncertainty in model parameter estimates and data second moments. Our Monte Carlo study is based on data generated from the Long and Plosser (1982) model, the solution of which may be obtained analytically.

The paper is organized as follows. The next section lays out the asymptotic sampling theory that is relevant to our analysis. Section 3 discusses the different procedures to estimate the spectral density at frequency zero. Section 4 describes the features of the HP filter that are relevant to our analysis. Section 5 presents the analysis of inference about second moment properties. Section 6 analyzes the finite sample properties of the test of an equilibrium model. Finally, section 7 presents a summary of our main findings and concludes.

2 Generalized Method of Moments

In this section we discuss the use of GMM for the estimation of parameters and for testing hypotheses. (For textbook treatments, see Davidson and MacKinnon 1993, Hamilton 1994, and Ogaki 1992.) In the first subsection, we survey the relevant large sample theory. In the second subsection we discuss hypothesis testing.

2.1 Large Sample Theory

Suppose we wish to estimate an n × 1 vector, ψ⁰, of parameters. To do so using GMM, we need to first identify an n × 1 vector, u_t(ψ), which is a strictly stationary stochastic process for each ψ and which has the property

    E u_t(ψ⁰) = 0,   (1)

where ψ⁰ denotes the true values of the parameters of the underlying data generating mechanism. (For other regularity conditions on u_t(ψ), see Hansen (1982).)
We consider the exactly identified case only, so that the dimensions of the GMM error, u_t, and the parameter vector, ψ⁰, coincide. For an analysis of finite sample issues in the over-identified case, see Burnside and Eichenbaum (1994). In the exactly identified case, the GMM estimator of ψ⁰, denoted ψ̂_T, is defined by

    g_T(ψ̂_T) = 0,  g_T(ψ) = (1/T) Σ_{t=1}^{T} u_t(ψ),   (2)

where T denotes the sample size. According to Hansen (1982), ψ̂_T has the following asymptotic distribution:

    √T (ψ̂_T − ψ⁰) ~ N(0, V),   (3)

where

    V = D⁻¹ S (D′)⁻¹.   (4)

Here, D is given by

    D = E [∂u_t(ψ⁰)/∂ψ′],   (5)

and ′ denotes transposition. Also, S is the (positive definite) spectral density at frequency zero of u_t(ψ⁰), defined by

    S = Σ_{l=−∞}^{∞} C_l,   (6)

and

    C_l = E u_t(ψ⁰) u_{t−l}(ψ⁰)′.   (7)

Since V is unknown, in practice it must be replaced by a consistent sample estimate, which is based on replacing D and S in (4) by D_T and S_T. Here, D_T is computed by replacing the expectation operator in (5) by the sample average operator and ψ⁰ by ψ̂_T. Also, S_T is obtained by applying an estimator of the spectral density at frequency zero to u_t(ψ̂_T). Thus, in practice, inference is conducted based on V_T,

    V_T = D_T⁻¹ S_T (D_T′)⁻¹.   (8)

We discuss alternative estimators, S_T, in section 3.

2.2 Hypothesis Testing

To test a hypothesis about the i-th element of ψ⁰, we can make use of the fact that, asymptotically,

    (ψ̂_{i,T} − ψ_i⁰) / √(V_{ii,T}/T) ~ N(0, 1),   (9)

where ψ̂_{i,T} is the i-th element of ψ̂_T and V_{ii,T} is the i-th diagonal element of V_T.

We will also consider tests of joint hypotheses about the elements of ψ⁰. Let F be a differentiable function which maps R^n into the m × 1 zero vector, 0_m. Then F(ψ⁰) = 0_m represents m hypotheses, each of which potentially involves all elements of ψ⁰.
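Before turning to joint tests, the scalar case of the pipeline in (2)-(9) can be sketched in a few lines. The sketch below is purely illustrative (the moment condition, function names, and the fixed Bartlett bandwidth are our choices, not the paper's): it estimates a standard deviation ψ from the moment condition u_t(ψ) = x_t² − ψ² and forms the standard error from V_T = D_T⁻¹ S_T (D_T′)⁻¹.

```python
import numpy as np

def gmm_std(x, max_lag=4):
    """Exactly identified GMM estimate of psi = std(x), with its standard error."""
    T = len(x)
    psi_hat = np.sqrt(np.mean(x**2))        # solves g_T(psi) = 0 exactly
    u = x**2 - psi_hat**2                   # GMM error u_t(psi_hat)
    D = -2.0 * psi_hat                      # D_T: sample mean of d u_t / d psi
    S = np.mean(u * u)                      # Bartlett-weighted long-run variance S_T
    for j in range(1, max_lag + 1):
        w = 1.0 - j / (max_lag + 1.0)       # Bartlett weight
        S += 2.0 * w * np.mean(u[j:] * u[:-j])
    V = S / D**2                            # V_T = D^{-1} S D^{-1} in the scalar case
    return psi_hat, np.sqrt(V / T)          # point estimate and standard error

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=5000)
psi_hat, se = gmm_std(x)                    # psi_hat should be near 2.0 here
```

A t-statistic of the form (9) is then (psi_hat − ψ⁰)/se.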
To test the null hypothesis, F(ψ⁰) = 0_m, we make use of the fact that, if indeed F(ψ⁰) = 0_m, then asymptotically,

    √T F(ψ̂_T) ~ N(0_m, V_F),  V_F = F′(ψ⁰) V F′(ψ⁰)′,   (10)

where the m × n matrix F′(ψ⁰) is defined as follows:

    F′(ψ⁰) = ∂F(ψ⁰)/∂ψ′.

In practice, V_F in (10), which depends on unknown parameters, is estimated by replacing ψ⁰ with ψ̂_T and V with V_T:

    V_{F,T} = F′(ψ̂_T) V_T F′(ψ̂_T)′.   (11)

We base inference on the asymptotic result:

    T F(ψ̂_T)′ [V_{F,T}]⁻¹ F(ψ̂_T) ~ χ²_m.   (12)

The popular 'calibration' procedure of Kydland and Prescott (1982) tests a restriction of the form F(ψ⁰) = 0_m. The procedure calculates the second moments implied by an economic model at the estimated values of its parameters and compares these second moments with the second moments observed in the data. Equation (12) constitutes a formal theory of inference that can potentially be of assistance in making this comparison. It takes into account the joint sampling uncertainty in the model parameter estimates and data second moments.

3 Estimation of a Spectral Density at Frequency Zero

This section discusses the computation of S_T, an estimator of the spectral density of u_t(ψ⁰) at frequency zero. This object is central to econometric analyses using GMM. In section 2 we showed that in the exactly identified case, S_T is required for conducting hypothesis tests on the parameters, ψ. In the over-identified case not considered in this paper, S_T also plays a role in computing the point estimates, ψ̂_T. (See Hansen 1982.) In our simulation analysis we study five zero-frequency spectral density estimators, S_T, and these are described below. It is useful for us to describe our estimators of S by reference to the following baseline class of nonparametric estimators. (See den Haan and Levin 1994 for an analysis of parametric estimators.)
    S_T = Σ_{j=−T+1}^{T−1} κ(j/ℓ) Ĉ_j,   (13)

where κ(·) is a weighting function (kernel) to be discussed below and

    Ĉ_j = (1/(T−n)) Σ_{t=j+1}^{T} u_t(ψ̂_T) u_{t−j}(ψ̂_T)′,  j = 0, ..., T−1,   (14)

with Ĉ_j = Ĉ_{−j}′ for j = −1, −2, ..., −T+1. In (14), the scalar, n, is included as a finite sample correction. In the first subsection below, we discuss a perturbation on the above estimator that was described by Andrews and Monahan (1992) and involves 'prewhitening' u_t(ψ̂_T). In the second two subsections, we discuss aspects of the problem of choosing the kernel, κ.

3.1 Prewhitening

Andrews and Monahan (1992) propose and study a modification to the class of estimators defined by (13), which involves prewhitening u_t(ψ̂_T). Their procedure is executed in three steps. In the prewhitening step, compute u*_t(ψ̂_T), the fitted residual from the following b-th order vector autoregression fit to u_t(ψ̂_T):

    u_t(ψ̂_T) = Σ_{r=1}^{b} Â_r u_{t−r}(ψ̂_T) + u*_t(ψ̂_T),   (15)

where u*_t(ψ̂_T) = u_t(ψ̂_T) when b = 0. The second step applies an estimator of the spectral density at frequency zero to u*_t(ψ̂_T):

    Ŝ*_T = Σ_{j=−T+1}^{T−1} κ(j/ℓ) Ĉ*_j,   (16)

where Ĉ*_j is the j-th autocovariance of u*_t(ψ̂_T), computed using the analog of (14), and κ(·) is a real-valued kernel discussed below. Finally, the prewhitened estimator of S is

    Ŝ_T = (I − Σ_{r=1}^{b} Â_r)⁻¹ Ŝ*_T (I − Σ_{r=1}^{b} Â_r′)⁻¹.   (17)

We define Ŝ_T = Ŝ*_T = S_T when b = 0, where S_T is defined in (13). Andrews and Monahan (1992) give no advice on the appropriate choice of b. In their Monte Carlo studies, they consider b = 0 and b = 1, and we consider b = 0, 1, 2.

3.2 Alternative Choices of the Kernel, κ

We now discuss the choice of the kernel, κ(·). Newey and West (1987) use the Bartlett kernel, that is,

    κ(j/ℓ) = 1 − |j/ℓ|,  0 ≤ |j/ℓ| ≤ 1;  κ(j/ℓ) = 0,  |j/ℓ| > 1.   (18)

We refer to (17) with the Bartlett kernel as the Bartlett estimator of S, with bandwidth ℓ and prewhitening parameter b.
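The three prewhitening steps (15)-(17) with a Bartlett kernel can be sketched as follows. This is a minimal illustration with a fixed bandwidth and our own function and variable names, not the paper's code:

```python
import numpy as np

def bartlett_S(u, ell):
    """Baseline estimator (13) with the Bartlett kernel; u is T x n."""
    T, n = u.shape
    S = u.T @ u / T
    for j in range(1, min(ell, T - 1) + 1):
        w = 1.0 - j / ell                        # Bartlett weight kappa(j/ell)
        C_j = u[j:].T @ u[:-j] / T               # j-th sample autocovariance
        S += w * (C_j + C_j.T)
    return S

def prewhitened_bartlett_S(u, ell, b=1):
    """Prewhiten with a VAR(b) as in (15), estimate S* by (16), recolor via (17)."""
    T, n = u.shape
    if b == 0:
        return bartlett_S(u, ell)
    Y = u[b:]                                    # left-hand side of the VAR(b)
    X = np.hstack([u[b - r: T - r] for r in range(1, b + 1)])
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)    # stacked VAR coefficients
    resid = Y - X @ A                            # u*_t(psi_hat)
    S_star = bartlett_S(resid, ell)
    A_sum = sum(A[(r - 1) * n: r * n].T for r in range(1, b + 1))
    B = np.linalg.inv(np.eye(n) - A_sum)         # (I - sum_r A_r)^{-1}
    return B @ S_star @ B.T                      # recoloring step (17)

rng = np.random.default_rng(0)
u = rng.normal(size=(4000, 2))                   # i.i.d. data: S should be near I
S_hat = prewhitened_bartlett_S(u, ell=8, b=1)
```

For i.i.d. data the fitted VAR coefficients are near zero, so the prewhitened estimate stays close to the sample covariance, as it should.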
An alternative kernel, proposed by Hansen and Hodrick (1980) and White (1984), is (18) with κ(j/ℓ) = 1 for 0 ≤ |j/ℓ| ≤ 1. We refer to this as the unweighted, truncated kernel. An advantage of the Bartlett kernel is that positive definiteness of S_T is guaranteed, while this is not the case for the unweighted, truncated kernel. To accommodate the latter observation, we define the unweighted, truncated estimator of S, with bandwidth ℓ and prewhitening parameter b, as one which uses the truncated kernel to compute Ŝ*_T in (17) when Ŝ*_T is positive definite and the Bartlett estimator otherwise. In the data generating processes considered in this paper, we found that failure of positive definiteness occurs with low probability. In particular, in all of our experiments involving artificial data sets of length 120 observations, the frequency with which positive definiteness fails never exceeds 6% and is typically closer to 1%. We also considered data sets of length 1,000, and in these we never encountered the problem.

We also consider the QS kernel proposed in Andrews (1991):

    κ(j) = κ_QS(j/ℓ),   (19)

where

    κ_QS(x) = (25/(12π²x²)) [sin(6πx/5)/(6πx/5) − cos(6πx/5)],   (20)

with κ_QS(0) = 1. Like the Bartlett kernel, this kernel guarantees a positive definite estimator of S.

3.3 Automatic Bandwidth Selection

In this subsection, we describe data-based ('automatic') methods that select bandwidths for the Bartlett and QS kernels. We denote a bandwidth that is selected as a function of the data by ℓ_T. Andrews (1991) and Newey and West (1994) (Newey-West) each describe methods that can be used to compute ℓ_T for the Bartlett, QS, and other kernels.
Although when n > 1 the efficiency criterion guiding the selection of ℓ_T differs slightly between the two, the primary difference between Andrews and Newey-West lies in their strategies for exploiting the information in the sample autocorrelation function to select a value for ℓ_T. The Andrews procedure, as implemented by Andrews and Monahan, assumes an AR(1) parametric structure for the autocovariance function, which enables the analyst to select the truncation parameter based on just the variance and first order autocovariance. By contrast, Newey and West do not assume a parametric structure, and so their procedure must select the lag length based on a longer list of autocorrelations.

Neither method is entirely automatic in the sense that it completely avoids selecting parameters exogenously. In the case of the Andrews method, a time series model must be selected for u*_t(ψ̂_T). No automatic procedure is offered for doing this, though Andrews and Monahan (1992) do recommend (on computational tractability grounds, it seems) the AR(1) model. Similarly, the Newey-West method requires picking a bandwidth parameter (not ℓ_T itself!) exogenously, and in their work they use two arbitrarily chosen values for it.

For the kernels discussed in the previous subsection, consistency of S_T, i.e., plim S_T = S, is guaranteed if ℓ_T → ∞ as T → ∞, with ℓ_T/T^{1/2} → 0. (See Andrews (1991).) Andrews and Newey-West select {ℓ_T} from this class to optimize asymptotic efficiency criteria. The optimal choice of ℓ_T is

    ℓ_T = 1.1447 [α(1) T]^{1/3}   (21)

for the Bartlett kernel and

    ℓ_T = 1.3221 [α(2) T]^{1/5}   (22)

for the QS kernel. (See Andrews and Newey-West and the references they cite.) In practice, in (21)-(22), α(1) and α(2) must be replaced by sample estimates.
Under regularity conditions satisfied here, Andrews and Newey and West show that (21)-(22) with the α's replaced by particular sample estimates also optimize their respective efficiency criteria when the underlying parameters, ψ, are unknown, and b = 0. (See Andrews and Monahan (1992) for the b > 0 case, where A_1, ..., A_b must be estimated too.) We turn now to a description of the details of the Andrews optimal bandwidth selection procedure and the Newey-West optimal bandwidth selection procedure.

The Andrews Bandwidth Selection Procedure

Andrews proposes estimating the parameters of a time series model for u*_t(ψ̂_T) and provides formulas for estimating α(1) and α(2) based on the parameter estimates. In practice, Andrews and Monahan recommend fitting an AR(1) representation to the a-th component of u*_t(ψ̂_T), a = 1, ..., n. Letting (ρ̂_a, σ̂_a²) denote the associated first order autoregressive and innovation variance parameters,

    α̂(1) = [Σ_{a=1}^{n} w_a 4ρ̂_a² σ̂_a⁴ / ((1 − ρ̂_a)⁶ (1 + ρ̂_a)²)] / [Σ_{a=1}^{n} w_a σ̂_a⁴ / (1 − ρ̂_a)⁴]   (23)

and

    α̂(2) = [Σ_{a=1}^{n} w_a 4ρ̂_a² σ̂_a⁴ / (1 − ρ̂_a)⁸] / [Σ_{a=1}^{n} w_a σ̂_a⁴ / (1 − ρ̂_a)⁴].   (24)

We follow Andrews and Monahan, who suggest setting w_a = 1 for all a.

To summarize, the Andrews bandwidth selection procedure is implemented in the following four steps:

1. Obtain a series, u_t(ψ̂_T), and compute the fitted residuals, u*_t(ψ̂_T), in (15) if b ≥ 1. If b = 0, then u*_t(ψ̂_T) = u_t(ψ̂_T).
2. Fit scalar AR(1) representations to each of the n elements of u*_t(ψ̂_T). Denote the resulting parameter estimates by (ρ̂_a, σ̂_a²), a = 1, ..., n.
3. Select a set of weights, w_a, a = 1, ..., n, and compute α̂(1) and α̂(2) using (23)-(24).
4. Evaluate (21)-(22) with α(q) replaced by α̂(q), q = 1, 2.

We refer to the procedure for computing S_T which uses the QS kernel and the above bandwidth selection method as the Andrews (QS) estimator of S, with prewhitening parameter b.
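The four Andrews steps can be sketched as follows, with w_a = 1 for all a as recommended by Andrews and Monahan. The code assumes the input series has already been prewhitened if b ≥ 1 (step 1), and the names are ours, not the paper's:

```python
import numpy as np

def andrews_bandwidths(u):
    """u is T x n. Returns data-based bandwidths (Bartlett, QS) via (21)-(24)."""
    T, n = u.shape
    num1 = num2 = den = 0.0
    for a in range(n):
        s = u[:, a]
        rho = (s[1:] @ s[:-1]) / (s[:-1] @ s[:-1])    # scalar AR(1) slope
        sig2 = np.mean((s[1:] - rho * s[:-1]) ** 2)   # innovation variance
        num1 += 4 * rho**2 * sig2**2 / ((1 - rho)**6 * (1 + rho)**2)
        num2 += 4 * rho**2 * sig2**2 / (1 - rho)**8
        den += sig2**2 / (1 - rho)**4
    alpha1, alpha2 = num1 / den, num2 / den           # (23) and (24), w_a = 1
    return 1.1447 * (alpha1 * T) ** (1 / 3), 1.3221 * (alpha2 * T) ** (1 / 5)

rng = np.random.default_rng(0)
e = rng.normal(size=3000)
ar = np.zeros(3000)
for t in range(1, 3000):
    ar[t] = 0.9 * ar[t - 1] + e[t]                    # persistent AR(1) series
ell_b_persistent, _ = andrews_bandwidths(ar[:, None])
ell_b_noise, _ = andrews_bandwidths(e[:, None])       # near-white noise series
```

As expected, the persistent series calls for a much wider bandwidth than the near-white-noise series.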
We refer to the procedure which uses the Bartlett kernel and the above bandwidth selection method as the Andrews (Bartlett) estimator of S, with prewhitening parameter b.

The Newey-West Bandwidth Selection Procedure

Newey and West's formulas for α(1) and α(2) are as follows:

    α̂(q) = [w′ F̂^(q) w / (w′ F̂^(0) w)]²,  q = 1, 2,   (25)

and they make w a vector of 1's. Here,

    F̂^(q) = Σ_{j=−ℓ̂}^{ℓ̂} |j|^q Ĉ_j,  ℓ̂ = θ (T/100)^{2/9}.   (26)

Newey and West avoid making parametric assumptions about the time series representation of u_t(ψ̂_T). However, they must choose an exogenous bandwidth parameter, θ. They suggest doing so by trying alternative values and 'then exercising some judgment about sensitivity of results' (p. 7). In practice, they work with values of θ equal to 4 and 12. In our Monte Carlo experiments we used θ = 4. We found very little difference between setting θ equal to 4 or 12.

To summarize, the Newey-West bandwidth selection procedure is implemented in the following four steps:

1. Same as step 1 in the Andrews procedure.
2. Set θ and compute F̂^(q), q = 0, 1, 2, using (26).
3. Select a set of weights, w, and compute α̂(q), q = 1, 2, using (25).
4. Evaluate (21)-(22) with α(q) replaced by α̂(q), q = 1, 2, and retain the integer value of ℓ_T.

We refer to the procedure for computing S_T which uses the Bartlett kernel and the above bandwidth selection method as the Newey-West (Bartlett) estimator of S, with prewhitening parameter b. For convenience, the various zero-frequency spectral estimators and their names are summarized in Table 1.

4 The Hodrick-Prescott Filter

We consider two detrending methods: first differencing and applying the HP filter. We briefly review properties of the HP filter which are relevant to our analysis. Suppose we have a partial realization of length T, Y = [Y_{−T/2+1}, ..., Y_{T/2}]′, of a stochastic process, {Y_t}.
Application of the HP filter to this partial realization first involves computing a T-dimensional trend path, τ = [τ_{−T/2+1}, ..., τ_{T/2}]′, which minimizes

    Σ_{t=−T/2+1}^{T/2} (Y_t − τ_t)² + λ Σ_{t=−T/2+2}^{T/2−1} [(τ_{t+1} − τ_t) − (τ_t − τ_{t−1})]²,   (27)

with λ normally being set to 1600 with quarterly data. The detrended series is Y^d = [Y^d_{−T/2+1}, ..., Y^d_{T/2}]′, where Y^d_t = Y_t − τ_t. As pointed out in Prescott (1986), the solution to this problem can be represented as follows:

    Y^d = A_T Y,   (28)

where A_T is a T × T matrix with elements that depend upon the values of λ and T, but not upon the data, Y. Thus, the weights in the HP filter, the rows of A_T, are a function of T. The weights are graphed in Figure 1a for T = 120 and λ = 1600. The figure displays the entries in rows 2, 10, 25, 60, 95, 110, and 118 of A_{120}. These are the filter weights for Y^d_t, t = −58, −50, −35, 0, 35, 50, and 58, respectively. The filter weights used to get Y^d_0 are essentially the HP filter weights for T = ∞. The figure indicates that these weights extend forward and backward in time a little over 25 periods. For this reason, the 25th and 95th rows of A_{120} show some (slight) evidence of truncation. For observations on Y^d_t that are less than 25 periods from the endpoints of the data set, the HP filter weights are significantly different from their T = ∞ values.

There is a simple representation of the HP filter weights for large T. As shown by King and Rebelo (1993), manipulation of the first order conditions of the above optimization problem shows that, as T → ∞, Y^d_t → g(L)Y_t, where, for λ = 1600,

    g(L) = λ(1 − L)²(1 − L⁻¹)² / [1 + λ(1 − L)²(1 − L⁻¹)²] = 0.7994 (1 − L)²(1 − L⁻¹)² / [h(L)h(L⁻¹)],  h(L) = 1 − 1.7771L + 0.7994L².   (29)

This result provides the sense in which, for large T, the HP filter induces covariance stationarity in processes that require up to fourth differencing to induce stationarity. (Examples include processes that are integrated of order up to four.)
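The representation (28) can be made concrete by constructing A_T directly from the first order conditions of (27): the trend solves τ = (I + λK′K)⁻¹Y, where K is the (T−2) × T second-difference matrix, so A_T = I − (I + λK′K)⁻¹. A brief sketch (a standard construction; variable names are ours):

```python
import numpy as np

def hp_detrending_matrix(T, lam=1600.0):
    """Build A_T in (28): rows are the HP filter weights, depending on lam and T only."""
    K = np.zeros((T - 2, T))
    for t in range(T - 2):
        K[t, t], K[t, t + 1], K[t, t + 2] = 1.0, -2.0, 1.0   # second differences
    return np.eye(T) - np.linalg.inv(np.eye(T) + lam * K.T @ K)

A = hp_detrending_matrix(120)
y = np.linspace(0.0, 1.0, 120)   # a pure linear trend ...
cycle = A @ y                    # ... is annihilated: K y = 0, so tau = y exactly
```

Since Ky = 0 for any linear y, the trend equals the data and the detrended series is zero, confirming that the filter removes deterministic linear trends exactly.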
In addition, the preceding discussion suggests that for λ = 1600 and finite T ≥ 40, HP filtered observations in the middle of a data set virtually coincide with g(L)Y_t, a covariance stationary process. These considerations have led researchers to conclude that the HP filter, as conventionally applied, is equivalent to the application of g(L) to the data, together with a particular strategy for dealing with endpoints. Baxter and King (1994) also discuss the endpoint issue, and they suggest dealing with it by dropping observations at the beginning and the end of the data set. Although it is beyond the scope of this paper to do so, it would be of interest to compare the sampling properties of this strategy for dealing with endpoints with the conventional strategy.

In the introduction we drew attention to the fact that the HP and first difference filters emphasize different frequencies. (See also Singleton 1988.) This is illustrated in Figure 1b, which shows how the first difference and g(L) filters scale the spectrum of a raw time series. The horizontal axis reports ω/π, frequency divided by π, and the vertical axis reports the transfer function of the filter. Cycle periods are given by 2π/ω. Since business cycle analysis is primarily concerned with quarterly data, we think of the unit time period as being one quarter. As is well known, the first difference filter amplifies the high frequency components of the data (cycles of 8 quarters and shorter), while reducing the lower frequency components. The HP filter resembles the high-pass filter also displayed in the figure, which eliminates cycles of period 32 quarters and greater and leaves shorter cycles untouched. The figure reveals why some business cycle analysts prefer the HP filter. To them, using first difference adjusted data to study business cycles is much like using seasonally adjusted data to study the seasonal cycle.
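The transfer functions underlying Figure 1b are easy to compute. The gain of the first difference filter at frequency ω is |1 − e^{−iω}|, and the gain of g(L) in (29) is h/(1 + h) with h = 4λ(1 − cos ω)², since λ(1 − L)²(1 − L⁻¹)² evaluated at L = e^{−iω} equals 4λ(1 − cos ω)². A sketch (our code, not the paper's):

```python
import numpy as np

lam = 1600.0
w = np.linspace(0.01, np.pi, 500)        # frequencies; cycle period = 2*pi/w
gain_fd = np.abs(1.0 - np.exp(-1j * w))  # first difference filter 1 - L
h = 4.0 * lam * (1.0 - np.cos(w)) ** 2   # lam*(1-L)^2*(1-L^{-1})^2 at L = e^{-iw}
gain_hp = h / (1.0 + h)                  # gain of the large-T HP filter g(L)
# gain_hp is near 0 at low frequencies and near 1 at high frequencies,
# while gain_fd exceeds 1 at high frequencies, i.e., it amplifies them.
```

This reproduces the qualitative pattern in the text: g(L) behaves like a high-pass filter, whereas first differencing reweights the spectrum toward the highest frequencies.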
Business cycle frequencies are commonly associated with periods of 8 through 32 quarters (i.e., ω/π = 0.063 to 0.25), and the first difference filter dramatically reduces the relative importance of these.

At the same time, there are reasons that some researchers prefer to work with the first difference filter. For example, for some researchers the variable of interest may be defined in terms of the first difference filter because, say, they are interested in the growth rate of GDP, or inflation, rather than the levels of these variables. In this paper, we simply take it as given that, for a variety of reasons, some researchers work with the HP filter and others with the first difference filter. Our task here is to provide information on the small sample distribution of various statistics computed based on these two filters.

5 Inference About Second Moments

Our Monte Carlo experimental design is motivated by a desire to provide evidence on the finite sample properties of statistics commonly used in the analysis of business cycles. Therefore, it is important to us that (i) we study statistics that are actually in use, and (ii) we employ empirically plausible data generating mechanisms for our experiments. The first subsection below uses a univariate data generating mechanism often used in the analysis of macroeconomic data. An advantage of using this model is that its simplicity allows us to gain intuition into the basic results. In the context of this data generating mechanism, we study the finite sample performance of a standard deviation estimator. We then proceed to analyze a multivariate data generating mechanism, obtained by fitting a four variable vector autoregression to the main macroeconomic data series: consumption, employment, output, and investment. Here, we analyze the finite sample properties of 23 second moments commonly studied in the analysis of business cycles.
The insights obtained in the univariate environment provide intuition into the qualitatively similar results that we get in our multivariate setting.

5.1 A Univariate DGM

We suppose the data, x_t, are generated by

    x_t = x_{t−1} + u_t,   (30)

where

    u_t = ρ u_{t−1} + σ ε_t,  ε_t ~ N(0, 1),  t = −T/2+1, ..., T/2,

|ρ| < 1 and σ > 0. This data generating mechanism has two advantages. First, its simplicity is helpful for diagnosing our results. Second, it is a plausible representation for several U.S. time series. (Christiano 1992 argues that this is a good representation for log GNP. Cooley and Hansen 1989 and den Haan 1995 use this to model money growth.)

We consider the problem of estimating the standard deviation of detrended x_t, denoted x_t^d. Consequently, our parameter vector ψ is composed of a single element (i.e., ψ⁰ = [E(x_t^d)²]^{1/2} and n = 1). It is readily confirmed that, when x_t^d is obtained by first differencing, the following specification of u_t(ψ) satisfies the conditions discussed in the previous sections:

    u_t(ψ) = (x_t^d)² − ψ².   (31)

Note that we commit a slight abuse of notation here, because the value of ψ depends of course on the method of detrending. The value of ψ⁰ is unambiguous for the case of first differenced data. However, this is not so when data have been HP filtered. We confront this first, before reporting the results of our Monte Carlo experiments.

5.1.1 The Standard Deviation of HP Filtered Data

When x_t^d is obtained by HP filtering x_t, (31) does not exactly satisfy the conditions in sections 2 and 3. Endpoint effects associated with the filter have the implication that x_t^d and, hence, u_t(ψ) are not strictly stationary. To quantify this, we computed ψ_t⁰ = [E(x_t^d)²]^{1/2} for t = −T/2+1, ..., T/2 and T = 120. For comparison, we also computed ψ̄_t⁰ = [E(x̄_t^d)²]^{1/2} for t = −T/2+1, ..., T/2 and T = 120, where x̄_t^d = g(L)x_t. We refer to g(L) as the infeasible HP filter, since it requires having an infinite number (actually, 25 or so will do) of data points prior to the first observation and after the last observation in the data set. By contrast, the feasible HP filter requires only the available data.

We computed ψ_t⁰ and ψ̄_t⁰ using a Monte Carlo analysis with 100,000 replications, in which ρ = 0.4 and σ = 0.01. For the experiments with the feasible HP filter, each replication is composed of 220 observations, with the starting value of x_t set to zero and the first 100 observations discarded in order to randomize initial conditions. The feasible HP filter was then computed using the remaining 120 observations. For the experiments related to the infeasible HP filter, 600 observations were generated per replication, with the first 100 deleted from the analysis to randomize initial conditions. To approximate the infeasible HP filter, we applied the feasible HP filter to the remaining 500 observations and then kept the middle 120 observations for analysis. These calculations produced x^d_{t,i} and x̄^d_{t,i} for t = −59, ..., 60 and i = 1, ..., 100,000:

    ψ̂_t⁰ = [(1/100,000) Σ_{i=1}^{100,000} (x^d_{t,i})²]^{1/2},  t = −59, ..., 60,

and the analogous estimate of ψ̄_t⁰ computed from the x̄^d_{t,i}.

These objects are graphed in Figure 1c. In addition, we display plim_{T→∞} ψ̂_T, which is 0.01877 and was computed by inverse-Fourier transforming the spectrum of g(L)x_t. Note the substantial variation in ψ_t⁰ at the beginning and at the end of the data set, revealing that (x_t^d)² is not stationary in the mean. (See Baxter and King 1994 for a similar result using a different data generating mechanism.) By contrast, in the middle 60 observations ψ_t⁰ is roughly constant and equal to ψ̄_t⁰, which in turn essentially coincides with plim_{T→∞} ψ̂_T. These findings are consistent with the observations about the nature of the HP filter made in the previous section.
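A scaled-down version of the experiment behind Figure 1c can be sketched as follows (2,000 replications rather than 100,000, our own variable names, and purely illustrative):

```python
import numpy as np

def feasible_hp_matrix(T, lam=1600.0):
    """A_T = I - (I + lam*K'K)^{-1}, K the second-difference matrix (cf. (28))."""
    K = np.zeros((T - 2, T))
    for t in range(T - 2):
        K[t, t], K[t, t + 1], K[t, t + 2] = 1.0, -2.0, 1.0
    return np.eye(T) - np.linalg.inv(np.eye(T) + lam * K.T @ K)

rng = np.random.default_rng(0)
reps, T, rho, sigma = 2000, 120, 0.4, 0.01
A = feasible_hp_matrix(T)
sq = np.zeros(T)
for i in range(reps):
    e = rng.normal(size=T + 100)
    u = np.zeros(T + 100)
    for t in range(1, T + 100):
        u[t] = rho * u[t - 1] + sigma * e[t]   # AR(1) growth rate
    x = np.cumsum(u)[100:]                     # x_t = x_{t-1} + u_t; drop burn-in
    sq += (A @ x) ** 2                         # accumulate (x_t^d)^2
psi_t = np.sqrt(sq / reps)                     # Monte Carlo [E(x_t^d)^2]^{1/2}
# psi_t is roughly flat in the middle of the sample and varies near the endpoints.
```

Even at this small replication count, the profile of psi_t shows the pattern described in the text: approximate stationarity in the middle of the sample and visible endpoint effects.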
The endpoint effects in x_t^d present a problem for us. The asymptotic theory requires that ψ⁰ in the numerator of (9) be the standard deviation of x_t^d. But there is no such number, independent of t. There are at least three options. The first is to equate ψ⁰ with ψ_t⁰ = [E(x_t^d)²]^{1/2} for observations in the middle of an HP filtered data set. A feature of this option is that it overstates the degree of variation in x_t^d. For example, the mean value of ψ_t⁰ across all values of t is 0.01813. A second option is to equate ψ⁰ with Eψ̂_T, which is 0.01793 for T = 120. (This was approximated by averaging over 1000 Monte Carlo replications of ψ̂_T.) The discussion in the previous paragraph suggests that these two options are asymptotically equivalent. For example, when T = 1000, Eψ̂_T = 0.01864, which is nearly equivalent to plim_{T→∞} ψ̂_T. The third option is to throw away the first and last 25 observations. We did not implement this option because we are interested in implementing the HP filter as it is used in practice. We decided to go with the second option for two reasons. First, this way of selecting ψ⁰ seems closest in spirit to equating ψ⁰ with the 'true' standard deviation of x_t^d. Second, this choice is conservative from the point of view of the conclusions of our analysis. We will show that asymptotic theory does not work well in finite samples. Evidence presented below suggests that, had we pursued the first option, it would have worked even worse. For the sake of comparability, we treat ψ⁰ in the case of the first difference filter analogously.

5.1.2 Results Based on No Prewhitening

Our results for the case b = 0 can be summarized as follows.
We show that there is substantial distortion, in terms of fat tails and skewness, in estimated confidence intervals for the standard deviation of x_t. This reflects two features of the sampling distribution of V_T: it is biased downward and covaries positively with psi_T. The former is the principal reason for the fat tails, while the latter accounts for skewness. The downward bias in V_T principally reflects the persistence in U_t(psi), particularly when the data have been HP filtered. Persistence leads to distortions in part by requiring a large lag bandwidth, which the methods we implement tend to underestimate, accounting in part for the downward bias in V_T. Finally, the performance of our various estimators is very similar. These results reflect six observations, based principally on an examination of the b = 0, rho = 0.4, sigma = 0.01 case. For money and GNP growth, this value of rho is the empirically relevant one. First, for both the HP and first difference filters, there is a substantial amount of skewness in the t-statistic defined in equation (9) when b = 0 and the number of observations, T, is 120. (See Table 2a, top panel.) For each variance estimator, the frequency of the associated t-statistic being in the nominal lower 5% tail is around 19% in the case of the HP filter and around 10% in the first difference case. We examined the impact on our results of identifying psi^0 with Plim psi_T rather than with E psi_T, and consistent with the discussion in section 5.1.1, we found that this exacerbates the skewness problem. For example, the BART(11) row in the top left panel of Table 2a becomes 26.0, 33.4, 9.3, and 6.0.
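The mechanics behind these tail frequencies can be sketched for the first difference filter. Equation (9) is not reproduced in this excerpt; the sketch below assumes a delta-method form of the t-statistic, t = 2*psi_hat*sqrt(T)*(psi_hat - psi0)/sqrt(S), where S is a fixed-bandwidth Bartlett (BART(11)-style) estimate of the long-run variance of x_t^2, and it identifies psi^0 with the population standard deviation. The replication count and the exact statistic differ from the paper's, so the tail frequencies are only qualitatively comparable.

```python
import numpy as np

def bartlett_lrv(u, m):
    """Bartlett-kernel long-run variance of u with fixed lag truncation m."""
    u = u - u.mean()
    T = len(u)
    s = u @ u / T
    for j in range(1, m + 1):
        s += 2.0 * (1.0 - j / (m + 1)) * (u[j:] @ u[:-j]) / T
    return s

rng = np.random.default_rng(1)
rho, sigma, T, burn, reps = 0.4, 0.01, 120, 100, 3000
psi0 = sigma / np.sqrt(1.0 - rho ** 2)   # population std of the AR(1) growth series

tstats = np.empty(reps)
for i in range(reps):
    eps = rng.standard_normal(burn + T)
    x = np.zeros(burn + T)
    for t in range(1, burn + T):
        x[t] = rho * x[t - 1] + sigma * eps[t]
    x = x[burn:]                          # the 'first-difference filtered' data
    psi_hat = np.sqrt(np.mean(x ** 2))
    S = bartlett_lrv(x ** 2, 11)          # BART(11)-style estimate for u_t = x_t^2
    # delta method: sqrt(T)*(psi_hat - psi0) is approx N(0, S/(4*psi^2))
    tstats[i] = 2.0 * psi_hat * np.sqrt(T) * (psi_hat - psi0) / np.sqrt(S)

low = np.mean(tstats < -1.645)    # nominal 5% lower-tail frequency
high = np.mean(tstats > 1.645)    # nominal 5% upper-tail frequency
```

Even in this simplified version, the lower-tail frequency comes out well above its nominal 5% value, in line with the roughly 10% the text reports for the first difference case.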
Second, when T = 120 there is also a substantial fat-tail problem, in that the sum of the probabilities of being in the top and bottom 5% tails substantially exceeds 10%, and similarly for the top and bottom 10% tails. We also investigated the consequences of applying the infeasible HP filter and found that our results regarding skewness and fat tails are essentially unaffected by this change. This suggests that these problems do not reflect the endpoint features of the HP filter. Third, the finite sample distortions appear to reflect problems with V_T. This can be seen by noting that there is almost no fat-tail or skewness problem in the row corresponding to TRUE. This suggests that the distribution of the numerator in (9) is nearly normal and indicates that the skewness and fat-tail problems arise almost entirely from the sampling properties of V_T. Results in column I of Table 3a bear out this view, as it applies to skewness. They show that the numerator and denominator of (9) are positively correlated. Thus, in replications when the numerator in (9) is big, the ratio is not big, since the denominator is typically large too. Similarly, the impact on the ratio of a negative, but large in absolute value, realization in the numerator is typically magnified by a small denominator. Results in columns II and III of Table 3a identify problems in the sampling distribution of V_T which may account for the fat-tail problem evident in Table 2. In particular, they show that when b = 0, the GMM estimator, V_T, is biased downward. Other things the same, this would be expected to blow up tail areas. Fourth, distortions appear to reflect the persistence in U_t(psi). The literature on small sample properties of variance estimators notes that high temporal dependence can lead to coverage probabilities for confidence intervals that are too low, i.e., that lead to excessive rejections.
(See Andrews 1991, Andrews and Monahan 1992, and Ericsson 1991.) The impact of persistence on our results can be seen in two ways: by comparing results based on the HP and first difference filters, and by comparing results based on rho = 0.4 and rho = 0.1. Distortions, in both long and short samples, appear to be substantially lower for computations based on the first difference filter than for computations based on the HP filter. And the HP filter leaves substantially more temporal dependence in x_t^f than does the first difference filter. One way to see this is in the results in Figure 1d. That figure displays four autocorrelation functions for U_t(psi_T), where U_t(psi) is defined in (31). Each autocorrelation function is based on 100,000 observations of artificial data generated from (30) and is differentiated according to whether the data have been HP filtered or first differenced and whether rho = 0.1 or rho = 0.4. The lowest two autocorrelation functions are based on first difference filtering. The higher two are based on HP filtering. From our perspective, the notable feature of this graph is how high, and how relatively insensitive to rho, the autocorrelation of U_t(psi_T) is when the underlying data have been HP filtered. (For related observations, see Cogley and Nason 1995.) Another way to see that persistence accounts for the distortions we find is to compare the results for rho = 0.1 in Table 2b with those based on rho = 0.4 in Table 2a. Note that, with the drop in rho, the coverage probabilities are somewhat closer to their nominal values in the case of first differenced data, while there is less improvement with HP filtered data. This is consistent with the notion that persistence in x_t is an important factor underlying the poor small sample distribution of our test statistic. Recall from Figure 1d that a reduction in rho
substantially reduces the persistence in U_t(psi) when the data have been first differenced, but not when they have been HP filtered. Fifth, high persistence produces a downward bias in V_T in part because our automatic bandwidth selection procedures are themselves downward biased. To see this, we note first that the relatively high persistence in U_t(psi_T) when data have been transformed by the HP filter implies that a high bandwidth parameter is needed to estimate the zero-frequency spectral density. Consider the results in Figures 2a and 2b. They graph S(xi)/S against values of the bandwidth parameter, xi = 1, ..., 101. S(xi) is (6) with the summation truncated at l = xi, using the kernel indicated in the figure. These objects were computed using a single realization of length 1000 generated from (30) with rho = 0.4 and sigma = 0.01. In these calculations, b = 0 and S is approximated by S(101), computed using the unweighted, truncated kernel (i.e., (18) with b = 0). This normalization guarantees that all the curves in Figure 2 eventually converge to unity. The curves marked 'UNWEIGHTED' correspond to the kernel in (18) with b = 0. For the curves marked 'BARTLETT' we set b = 1. The curves marked 'QS' correspond to the case where the kernel is (20). Figures 2a and 2b report results based on filtering data using the HP filter and the first difference filter, respectively. Consistent with the findings in Figure 1d, Figures 2a and 2b suggest that it takes a much higher value of xi to get a variance estimator to converge for data based on the HP filter than for data based on the first difference filter. For example, in the case of 'BARTLETT,' the variance estimator has 90% converged for xi less than 6 when the data have been first differenced, while a value of xi in excess of 31 is needed to get comparable convergence when the data have been HP filtered.
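The role of the bandwidth in these convergence curves can be seen with population autocovariances. The sketch below assumes u_t is an AR(1) with first autocorrelation 0.75 (the value quoted below for the HP case) and evaluates the Bartlett-weighted truncated sum at several bandwidths; the true zero-frequency value is gamma_0*(1+phi)/(1-phi).

```python
import numpy as np

phi = 0.75                          # first autocorrelation reported for the HP case
gamma = phi ** np.arange(200)       # AR(1) autocovariances, normalized so gamma_0 = 1
S_true = (1 + phi) / (1 - phi)      # zero-frequency value: sum of all autocovariances

def S_bartlett(m):
    """Bartlett-weighted truncated sum of autocovariances with bandwidth m."""
    j = np.arange(1, m + 1)
    return 1.0 + 2.0 * np.sum((1.0 - j / (m + 1)) * gamma[j])

ratios = {m: S_bartlett(m) / S_true for m in (5, 11, 40, 100)}
```

With phi = 0.75 the truth is S = 7; the Bartlett-weighted sum recovers only about half of it at bandwidth 5 and about 92% at bandwidth 40, mirroring the slow convergence of the HP-based curves in Figure 2a and the resulting understated standard errors.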
The results in column I of Table 4 indicate that the automatic lag selection procedures detect the need for a higher bandwidth when data have been HP filtered. The procedures based on Andrews and Newey-West select average lag lengths of 10 and 5, respectively. The reason the former selects 10 on average reflects its 'strategy' for guessing the lag of the last significant autocorrelation: it extrapolates the first autocorrelation (0.75, in this case) and assumes U_t(psi) is AR(1). The Newey-West method with beta = 4 and T = 120 guesses the lag of the last significant autocorrelation by 'looking' at the first four autocorrelations. It is then not surprising, based on inspection of Figure 1d, that the Newey-West method selects a lag length of 5 and misses the 'hump' in the autocorrelation function at higher lags. The results in Figure 2a suggest that, particularly for HP filtered data, the chosen bandwidths are not large enough. As the figure indicates, with too small a bandwidth, one expects the standard error estimate to be understated and tails to be blown up. Sixth, results for the various procedures are all very similar. In this example, it makes little difference how exactly the bandwidth or kernel is picked. For example, even though the lag lengths picked by the Andrews and Newey-West methods are different, the results in Figure 2a indicate that the implied estimates of S are not very different. Finally, the skewness and fat-tail problems are reduced when the number of observations is increased to 1000 (see the bottom half of Table 2a). This is to be expected, based on large sample theory. But the reduction is surprisingly small, particularly for results based on the HP filter.

5.1.3 Results Based on Prewhitening

Our results in this subsection can be summarized as follows.
We show that first order prewhitening has a beneficial impact on the fat-tail problem, but relatively less impact on the skewness problem. The impact on the fat-tail problem reflects that first order prewhitening reduces the downward bias in V_T that contributes to fat tails when b = 0. An important objective of a project such as ours is to determine which of the several existing zero-frequency spectral density estimators works best in our setting. And so we initially found it interesting that the Andrews and Andrews and Monahan zero-frequency spectral density estimator appears to outperform the Newey and West procedure when b = 1. However, it turns out that this result does not actually indicate any inherent superiority in the former. These procedures have a variety of potential sources of bias. In our application the results reflect that the former procedure is driven by two sources of bias which tend to cancel, while only one of these sources of bias is present in the Newey-West (NW) procedure. The two sources of bias are (i) misspecification inherent in the relatively parametric Andrews and Andrews and Monahan procedure, which leads to an underestimate of the bandwidth, and (ii) a bias affecting both procedures which leads to an underestimate of S(xi) for any given bandwidth, xi. In contrast with the no prewhitening case, consideration (i) by itself leads to an overestimate of V. This is because first order prewhitening in our context induces high order negative autocorrelation in the prewhitened U_t(psi), u_t^1(psi). Finally, our results based on second order prewhitening are somewhat discouraging, since they are essentially identical to the b = 0 results. Moreover, we are not aware of any algorithm that would lead a researcher to select b = 1. We show that the AIC criterion invariably leads to a selection of b = 2. These results reflect six observations, with the first five being based on b = 1. First, in
the cases when prewhitening does help, it does so mainly by alleviating the problem of fat tails and does little to reduce the skewness problem. (Compare the b = 1 rows with the b = 0 rows in Table 2a.) For example, when T = 120 and data have been HP filtered, the sum of the lower and upper 5% tail area probabilities is 28.0 for QS when b = 0 and falls to 13.3 when b = 1. At the same time, the left tail area exceeds the right by 9.4% when b = 0 and by 10.7% when b = 1. The favorable impact on the fat-tail problem of increasing b is consistent with simulation results in Andrews and Monahan (1992). But they do not analyze skewness. In our example we would clearly overstate the benefits of prewhitening by abstracting from skewness. Second, the beneficial impact on the fat-tail problem of first order prewhitening appears to reflect a rise in the mean of V_T. (See column III, Tables 3a and 3b.) This impact appears to have been greatest for QS and BARTLETT, and not surprisingly, these also exhibit the smallest fat-tail problem. (See Tables 2a and 2b.) Third, the differences between the automatic bandwidth selection methods are greatest with b = 1. For b = 1, NW appears to perform worst, at least relative to the fat-tail problem. For example, the sum of the upper and lower 5% tail areas is equal to 21.8 for NW and to 13.7 and 13.3 for BARTLETT and QS, respectively. We argue that, ironically, the relatively poor performance of NW reflects that it is distorted by fewer sources of bias. To see this, we note first that selecting the bandwidth by the Newey-West procedure delivers about the same results as the much simpler procedure of simply setting xi exogenously to 11. (Compare the BART(11) and NW rows in the HP block of columns in Tables 2a and 2b.) This can be seen in Table 4a, which shows (column I) that the mean value of xi_T chosen by NW is roughly the same as the bandwidth in the BART(11) procedure.
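The prewhitening used in these b = 1 experiments, in the style of Andrews and Monahan, can be sketched generically: fit an AR(b) to the series by OLS, estimate the long-run variance of the residuals with a kernel estimator, and recolor by dividing by (1 - sum of AR coefficients)^2. This is a univariate illustration with an AR(1) series, not the paper's U_t(psi); the bandwidth of 5 is chosen to be deliberately too small so that the benefit of prewhitening is visible.

```python
import numpy as np

def bartlett_lrv(u, m):
    """Bartlett-kernel long-run variance of u with lag truncation m."""
    u = u - u.mean()
    T = len(u)
    s = u @ u / T
    for j in range(1, m + 1):
        s += 2.0 * (1.0 - j / (m + 1)) * (u[j:] @ u[:-j]) / T
    return s

def prewhitened_lrv(u, m, b=1):
    """Fit an AR(b) to u by OLS, estimate the residuals' long-run variance,
    then recolor by 1/(1 - sum of AR coefficients)^2."""
    Y = u[b:]
    X = np.column_stack([np.ones(len(Y))] +
                        [u[b - k: len(u) - k] for k in range(1, b + 1)])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return bartlett_lrv(resid, m) / (1.0 - coef[1:].sum()) ** 2

rng = np.random.default_rng(2)
phi, T = 0.75, 20000
e = rng.standard_normal(T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = phi * u[t - 1] + e[t]

S_true = 1.0 / (1.0 - phi) ** 2    # = 16, the long-run variance of this AR(1)
S_raw = bartlett_lrv(u, 5)         # small bandwidth, no prewhitening: badly low
S_pw = prewhitened_lrv(u, 5, b=1)  # same small bandwidth after prewhitening
```

Because the prewhitened residuals are nearly white, the small bandwidth is no longer costly, which is the mechanism behind the rise in the mean of V_T noted above.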
In contrast, the BARTLETT procedure selects a much lower xi_T on average. The reason that, in this example, Newey-West selects much higher xi_T's on average than does Andrews is instructive about the differences between these procedures. When b = 1, it appears that the autocorrelation function of u_t^1 is positive for the first few lags, after which it turns sharply negative. This can be seen in Figure 3, which is the exact analog of Figure 2a, for b = 1. Note the initial rise in 'UNWEIGHTED,' followed by a sharp fall, as the bandwidth increases above xi = 3. To see the implications of this, recall that the two methods select the bandwidth based on different strategies for extrapolating u_t^1. The AR(1) version of the Andrews procedure used here looks at the lag 0 and 1 autocovariances of u_t^1 and extrapolates based on this. When b = 1, Figure 3 shows that this extrapolation is very misleading. The AR(1) assumption clearly entails specification error, since it completely misses the oscillatory behavior of the actual autocovariance function. Since, in addition, the first order autocorrelation is small, the Andrews procedure picks a small bandwidth. Newey-West looks at more elements in the autocorrelation function, properly detects its complexity, and therefore sets a much higher value of the bandwidth, on average. The following simple example illustrates why, in our setting, first order prewhitening causes the Andrews and Andrews and Monahan lag length selection procedure to understate the bandwidth, while the Newey-West procedure gets it about right. Let u_t = e_t + e_{t-1}, where e_t is uncorrelated over time. The autocorrelation function of u_t, which is relatively high at lag one (rho_1 = 0.5) and then falls to 0.0, resembles qualitatively the autocorrelation for HP filtered data exhibited in Figure 1d.
The autocorrelation function of the first order prewhitened process, u_t^1 = u_t - rho_1 u_{t-1}, is rho_1^1 = 1/6, rho_2^1 = -1/3, and rho_j^1 = 0 for j >= 3, at lags 1, 2, and higher, respectively. The Andrews procedure approximates this autocorrelation function of u_t^1 using an AR(1) functional form and the relatively small rho_1^1. Substituting rho_1^1 = 1/6 into (23) yields alpha(1) = 0.12, or, for T = 120, using (21), xi_T = 2.77. The NW procedure, with beta = 4 and T = 120, selects alpha(1) = 2.25 using (25), or xi_T = 7.40 using (21). This example captures the reasons that, with first order prewhitening, the Andrews method picks a shorter lag length than Newey-West. When b = 0, in contrast, the Andrews method implies that alpha(1) = 1.7778 for T = 120, and, hence, xi_T = 6.8; the Newey-West method implies that alpha(1) = 0.24, and xi_T = 3.6. It is interesting that the Newey and West method does not pick a monotonically declining bandwidth as the order of prewhitening increases. In view of the apparently superior performance of Newey-West over Andrews in selecting the bandwidth, it is ironic that Newey-West nevertheless underperforms relative to Andrews in our Monte Carlo results. The resolution appears to lie in the effects of finite sample bias. In particular, we have found that the mean of the finite sample analog of the curves in Figure 3 lies considerably below those curves. (That is, letting S_T(xi) denote an estimator of S(xi), we found that E S_T(xi) < S(xi) for various values of xi and various kernels.) Given the nature of the hump near the origin, methods which select a small bandwidth, in effect, overcome this small sample bias. Thus, the superior performance of the AR(1) Andrews procedure appears to reflect the offsetting effect of specification error on bias. The Newey-West procedure does relatively poorly because it also suffers from bias, but does not enjoy the compensating effects of specification error. Clearly, this result is specific to the data generating mechanism we have assumed.
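The numbers in this example are consistent with the Andrews (1991) AR(1) plug-in for the Bartlett kernel. Equations (21) and (23) are not reproduced in this excerpt, so the formulas below, alpha(1) = 4*rho^2/(1 - rho^2)^2 and xi_T = 1.1447*(alpha(1)*T)^(1/3), are an assumption, but they can be checked against the values quoted in the text, along with the prewhitened MA autocorrelations:

```python
import numpy as np

# MA example: u_t = e_t + e_{t-1} has rho_1 = 0.5.  Prewhitening with rho_1
# gives u1_t = u_t - 0.5*u_{t-1} = e_t + 0.5*e_{t-1} - 0.5*e_{t-2}.
theta = np.array([1.0, 0.5, -0.5])                     # MA weights of u1_t
gamma = np.correlate(theta, theta, "full")[2:]         # autocovariances, lags 0..2
rho1, rho2 = gamma[1] / gamma[0], gamma[2] / gamma[0]  # should be 1/6 and -1/3

def andrews_bartlett_bandwidth(r, T):
    """Assumed AR(1) plug-in for the Bartlett kernel (Andrews 1991)."""
    alpha = 4.0 * r ** 2 / (1.0 - r ** 2) ** 2
    return alpha, 1.1447 * (alpha * T) ** (1.0 / 3.0)

a_pw, xi_pw = andrews_bartlett_bandwidth(1.0 / 6.0, 120)  # prewhitened case
a_raw, xi_raw = andrews_bartlett_bandwidth(0.5, 120)      # b = 0 case
```

These reproduce the text's values: alpha(1) of about 0.12 and xi_T of about 2.77 after prewhitening, and alpha(1) = 1.7778 with xi_T of about 6.8 when b = 0.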
Still, it illustrates the kind of factors that impact on the small sample performance of alternative zero-frequency spectral density estimators. Fourth, consider the effects of second order prewhitening, i.e., b = 2. Andrews and Monahan (1992) and Newey and West (1994) do not consider this case, but they give no motivation for only considering first order prewhitening. (For a further analysis, see den Haan and Levin 1994.) We had expected second order prewhitening to improve the performance of our test statistic, given the complex behavior of the autocorrelation function when b = 1. Also, with an AIC selection criterion, the AR(2) was chosen 939 times out of 1000 data sets over an AR(1) to model U_t(psi_T), for the case with rho = 0.4, HP filtered data, and T = 120. To our surprise, performance actually deteriorated and closely resembles the b = 0 case. Presumably, this reflects that, as in the b = 0 case, specification error (i.e., choosing too low a value of xi_T; see Figure 4) and bias (for any xi_T, S_T(xi_T) underestimates S(xi_T)) both work in the same direction, towards underestimating V. This accounts for the reappearance of the fat-tail problem when b = 2.

5.2 A Multivariate DGM

We estimated a data generating mechanism for log consumption, c_t, log GNP, y_t, log gross investment, i_t, and log hours, n_t. For this, we use the quarterly postwar U.S. data described in Christiano (1988). We impose that c_t, y_t, i_t are each integrated of order 1 and that y_t - c_t, y_t - i_t, n_t are each covariance stationary. Define

Y_t = [dy_t, y_t - c_t, y_t - i_t, n_t]'.   (32)

We estimated a VAR for Y_t for the period 1957:1-1984:1, and used this to simulate 500 artificial data sets, each of length 115 observations.
We did this in two ways: one by a Monte Carlo procedure of drawing the disturbances from the normal distribution with variance estimated from the data (the rows marked 'N' in Tables 5a and 5b) and the other by bootstrapping the actual fitted residuals (the rows marked 'B' in Tables 5a and 5b). We implement these two procedures as a check on the robustness of our results. In each artificial data set we computed (9) for 23 statistics. For each statistic, we recorded the frequency of times, across data sets, that (9) was less than the nominal 5% critical value and the frequency of times that (9) exceeded the 95% critical value. Our 23 statistics are standard in the business cycle literature. They include sigma_y, sigma_c/sigma_y, sigma_i/sigma_y, sigma_n/sigma_y, sigma_w/sigma_y, sigma_w/sigma_n, where sigma_x denotes the standard deviation of the detrended variable x = y, c, i, n, w, and w denotes labor productivity, that is, w = y - n. In addition, they include 17 correlations: rho_yx(tau), tau = -1, 0, 1, for x = c, i, n, w; rho_yy(tau), tau = -1, 1; and rho_wn(tau), tau = -1, 0, 1. Our analysis was done for each of the two detrending methods. The results are similar to what we found in the univariate analysis in the previous section. Consider Table 5a, which reports findings for sigma_y. Standard error estimates, sqrt(V_T), were based on three measures of the zero-frequency spectral density: BART(11), BARTLETT, and NW. Recall that the choice of kernel for all three procedures is the Bartlett kernel, but that the choice of bandwidth differs across these three estimators. There are three results we would like to emphasize. First, results are very similar for the experiments with Normal and with bootstrapped errors. Second, for both detrending methods there is a skewness and fat-tail problem when the data have not been prewhitened. The problem is less severe for data that have been first differenced.
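The two simulation schemes can be sketched as follows. The series from Christiano (1988) are not available in this excerpt, so Y below is itself a simulated stand-in, the VAR lag length of one is an assumption, and only the mechanics of the 'N' (normal) and 'B' (bootstrap) rows are illustrated.

```python
import numpy as np

rng = np.random.default_rng(3)
T_data, n = 108, 4

# Stand-in 'data': a mildly autocorrelated 4-variable system, simulated here
# purely to illustrate the mechanics (not the Christiano 1988 series).
E = 0.01 * rng.standard_normal((T_data, n))
Y = np.zeros((T_data, n))
for t in range(1, T_data):
    Y[t] = 0.5 * Y[t - 1] + E[t]

# Fit a VAR(1) by OLS: Y_t = c + A*Y_{t-1} + u_t
X = np.column_stack([np.ones(T_data - 1), Y[:-1]])
B, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
U = Y[1:] - X @ B                          # fitted residuals
L = np.linalg.cholesky(U.T @ U / len(U))   # for the normal-error scheme

def simulate(T_sim, how):
    """Artificial data set: how='N' draws normal errors, how='B' bootstraps U."""
    out = np.zeros((T_sim, n))
    y = Y[-1].copy()
    for t in range(T_sim):
        shock = L @ rng.standard_normal(n) if how == "N" else U[rng.integers(len(U))]
        y = B[0] + B[1:].T @ y + shock
        out[t] = y
    return out

art_N = simulate(115, "N")   # the rows marked 'N'
art_B = simulate(115, "B")   # the rows marked 'B'
```

Resampling the fitted residuals preserves their non-normal features, which is exactly what the bootstrap rows are meant to check against the normal-error rows.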
These findings closely resemble those reported for the analysis in the previous section. If anything, the skewness and fat-tail problem is more severe here. Third, when detrending is by HP filter, prewhitening helps the fat-tail problem a great deal, but has little impact on the skewness problem. For example, the average of the left and right tail areas is 16.6 for BART(11) when there is no prewhitening and 5.6 when there is first order prewhitening. The latter is very close to the asymptotically correct value of 5.0. At the same time, the difference between the left and right tail areas is 10.0 in each case. Prewhitening helps little when the underlying data have been first differenced. Here, too, the results closely resemble what we found in the previous subsection. Table 5b contains a summary of the findings for the other statistics. The full set of results, available on request, are too numerous to reproduce here. In any case, the message from these results is fairly simple and corresponds closely to our findings in the previous subsection. The left panel of Table 5b reports results based on HP filtering the data, while the right panel is based on first differencing. The columns labelled 'I' report the absolute deviation from 10% of the sum of the two tail areas, averaged over all 23 statistics. This average does not indicate the typical sign of the deviation, and so we also present columns labelled 'II', which indicate the number of times, out of 23, that the deviation was positive. A positive deviation indicates a 'fat-tail' problem. Columns labelled 'III' report the absolute value of the difference between the left and right tail areas, averaged over the 23 statistics. This is a measure of skewness, although the absolute value operator destroys information about whether the skewness is to the left or right.
Columns labelled 'IV' provide information on this, by reporting the number of times that the skewness is to the left, i.e., the number of times that the deviation is positive. Consider columns I and II. With no prewhitening (b = 0) almost all statistics display a substantial fat-tail problem, which averages from about 14 to 20%, depending upon the exact statistical procedure used. The problem is considerably less severe when data have been first differenced, although it is still substantial, being on the order of from 5 to 10%. Here, as in the example in the previous subsection, prewhitening reduces the fat-tail problem by a substantial amount when the underlying data have been HP filtered and very little when the data have been first differenced. Columns labelled III and IV indicate that, with no prewhitening, there is a skewness problem on the order of from 4 to 7% for results based on HP filtered data. The problem is less severe when the data have been first differenced. Here, as in the previous subsection, prewhitening has little impact on the skewness problem. Column IV indicates that there is little consistency among the underlying results on the direction of skewness. In fact, some statistics do not suffer from a skewness problem. A subset of the results are represented in Figure 5. For each statistic, for BARTLETT and NW, and for b = 0, 1, we display the probability of a statistic exceeding the 95% critical value (height of gray bar) and of being less than the 5% critical value (black bar). Results are reported for each of our 23 statistics, and the numbering code for these statistics (#1-#23) is explained in the note to the figure. Also, the normal distribution was used in simulating the dgm. The fact that there is no pattern in the direction of skewness is evident. In addition, the beneficial impact on the fat-tail problem of raising b from 0 to 1 is also evident.
Finally, whether we use the bootstrap or Normal distribution in simulating our dgm makes little difference to the results. This is consistent with the underlying asymptotic theory.

6 A Wald-Test Example

In this section we study the finite sample properties of the Wald-type statistics proposed by Christiano and Eichenbaum (1992) for testing the null hypothesis that a model's implications for the second moment properties of a set of variables coincide with the second moment properties of those variables in the data. We pursue this in a simple example. This Wald test provides a statistic to evaluate a model's goodness of fit and overcomes an important weakness of the 'calibration' approach. The 'calibration' approach consists of two steps. In the first step the model's structural parameters, psi_1, and some data second moments, psi_2, are estimated. The economic model implies a relation between the structural parameters and the second moments. We denote this relation by psi_2 = g(psi_1), or F(psi) = 0. The second step consists of comparing the estimated second moments, psi_{2,T}, with the ones implied by the economic model, g(psi_{1,T}). Singleton (1988) points out that a disadvantage of the 'calibration' approach is that the metric for evaluating the difference between psi_{2,T} and g(psi_{1,T}) is not made explicit. In section 2, we showed how standard asymptotic theory can be used to construct a formal metric of the difference between the two sets of moments. An alternative estimation and testing strategy imposes the restriction F(psi) = 0 during estimation. For example, the vector psi could be estimated by designating psi_1 as the free parameters and setting psi_2 = g(psi_1). This estimation strategy would typically require the use of a nonlinear search algorithm to optimize the estimation criterion, and this would typically involve evaluating g(psi_1) hundreds, perhaps thousands, of times.
A difficulty with this strategy is that for many interesting models, it is computationally costly to calculate g(psi_1) for a particular value of psi_1. An advantage of the Wald procedure studied here is that it involves estimating psi without imposing the restrictions of the model. Then, for testing purposes, all that is required is the derivative of g(.), and numerical procedures to approximate this only require evaluating g(psi_1) a small number of times.

6.1 The Brock-Mirman Model

We use the Brock-Mirman version of the neoclassical growth model, which has log-utility and complete depreciation. As demonstrated in Long and Plosser (1982), this model has the advantage that its analytic solution is known. The finite sample properties of test statistics are, therefore, not affected by approximation error in the model solution. Den Haan and Marcet (1994) show that small numerical errors can be important for the distribution of the test statistics. More complicated examples are analyzed in Burnside (1991) and Burnside and Eichenbaum (1994). According to the model, a planner selects contingent plans for consumption, c_t, and capital, k_{t+1}, to maximize E_0 sum_t beta^t log(c_t) subject to the resource constraint, c_t + k_{t+1} = k_t^alpha exp(z_t); the exogenous technology shock process, dz_t = rho dz_{t-1} + sigma eps_t; and the given initial conditions, k_0 > 0, z_0, z_{-1}. Here, beta = 0.99, sigma = 0.01, alpha = 0.3, and eps_t ~ N(0,1). We consider two different values for rho. These are rho = 0.1 and rho = 0.5. As is well known, the contingency plan that solves this problem is k_{t+1} = alpha beta y_t, where y_t is output and y_t = k_t^alpha exp(z_t). Some simple algebra shows that dlog k_{t+1} is an AR(2) with parameters equal to alpha + rho and -alpha rho. If rho = 0.1, then the law of motion for dlog k_{t+1} is not very different from the simple example discussed in section 5.1. If rho = 0.5, then the law of motion for dlog k_{t+1} is much more persistent.
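The AR(2) claim can be verified numerically. Substituting the decision rule into the technology process gives dlog k_{t+1} = (alpha + rho) dlog k_t - alpha rho dlog k_{t-1} + sigma eps_t, so an OLS regression of dlog k_{t+1} on its first two lags in a long simulation should recover alpha + rho and -alpha rho. A sketch, with rho = 0.5:

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, beta, sigma, rho = 0.3, 0.99, 0.01, 0.5
n, burn = 50500, 500

# Exact solution: k_{t+1} = alpha*beta*y_t with y_t = k_t^alpha * exp(z_t), so
# log k_{t+1} = log(alpha*beta) + alpha*log k_t + z_t.
eps = rng.standard_normal(n)
dz = np.zeros(n); z = np.zeros(n); lk = np.zeros(n)
lk[0] = np.log(alpha * beta) / (1.0 - alpha)   # deterministic steady state
for t in range(1, n):
    dz[t] = rho * dz[t - 1] + sigma * eps[t]
    z[t] = z[t - 1] + dz[t]
    lk[t] = np.log(alpha * beta) + alpha * lk[t - 1] + z[t - 1]

dlk = np.diff(lk)[burn:]
# regress dlog k_{t+1} on its first two lags: coefficients should be
# alpha + rho = 0.8 and -alpha*rho = -0.15
Xr = np.column_stack([dlk[1:-1], dlk[:-2]])
b, *_ = np.linalg.lstsq(Xr, dlk[2:], rcond=None)
resid_std = (dlk[2:] - Xr @ b).std()           # should be close to sigma
```

With rho = 0.5 the dominant AR root structure (lag polynomial (1 - 0.5L)(1 - 0.3L)) makes the series noticeably more persistent than with rho = 0.1, as the text notes.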
6.2 The Wald Test

We simulated 1000 artificial data sets for this economy, of length 120, 1000, and 5000 observations each. We performed two different experiments. To simplify the Monte Carlo exercise we only estimate one model parameter, sigma or rho, and one second moment, the standard deviation of detrended capital. In the first experiment we take the values of the structural parameters rho, beta, and alpha as given and the value of sigma as unknown. In the second experiment we take the values of sigma, beta, and alpha as given and the value of rho as unknown. We now describe the first experiment. In each data set, we estimated a 2 x 1 vector, psi, where

psi = (psi_1, psi_2)' = (sigma, psi^0)'   (33)

and

psi^0 = [E(x_t*)^2]^{1/2}   (34)

and x_t* denotes detrended log(k_t). As before, x_t* is alternatively the first difference of log(k_t) or HP filtered log(k_t). We specified the following 2 x 1 GMM error vector:

u_t(psi) = ( (dz_t - rho dz_{t-1})^2 - (psi_1)^2,  (x_t*)^2 - (psi_2)^2 )'.   (35)

It is easily established that, when detrending is accomplished by first differencing, E u_t(psi^0) = 0 and u_t satisfies the other conditions in section 2. As before, under HP filtering, u_t(psi) does not satisfy the conditions of section 2 exactly, due to the influence of endpoint effects in the application of the HP filter. However, we proceed as though the asymptotic results in section 2 are valid anyway. Given a value of sigma and of the other model parameters, it is possible to compute the model's implied variance of x_t*. Denote this by g(psi_1). In the case of the HP filter, g(.) is obtained by first applying the inverse-Fourier transform to the spectral density of the HP filtered series and then taking the square root of the result. When the data are detrended by first differencing, then x_t* is an AR(2) and we can calculate g(.) analytically. Then

F(psi) = g(psi_1) - psi_2 = 0.   (36)
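For the first difference filter, the exactly identified estimation in (33)-(35) and the restriction (36) can be sketched as follows. The sketch assumes g(.) is the standard closed-form AR(2) standard deviation, gamma_0 = sigma^2 (1 - a_2) / [(1 + a_2)((1 - a_2)^2 - a_1^2)] with a_1 = alpha + rho and a_2 = -alpha rho; in a long sample, F(psi_T) should be close to zero.

```python
import numpy as np

alpha, beta, sigma, rho = 0.3, 0.99, 0.01, 0.1
a1, a2 = alpha + rho, -alpha * rho    # AR(2) coefficients of x_t* = dlog k

def g(sig):
    """Model-implied std of first-differenced log capital (AR(2) variance)."""
    gamma0 = sig ** 2 * (1 - a2) / ((1 + a2) * ((1 - a2) ** 2 - a1 ** 2))
    return np.sqrt(gamma0)

rng = np.random.default_rng(5)
n, burn = 100500, 500
eps = rng.standard_normal(n)
dz = np.zeros(n); x = np.zeros(n)
for t in range(1, n):
    dz[t] = rho * dz[t - 1] + sigma * eps[t]
    x[t] = alpha * x[t - 1] + dz[t]   # x_t* follows the AR(2) described above

# exactly identified GMM: set the sample analogs of the moments in (35) to zero
psi1_hat = np.sqrt(np.mean((dz[burn + 1:] - rho * dz[burn:-1]) ** 2))  # sigma
psi2_hat = np.sqrt(np.mean(x[burn:] ** 2))                             # psi^0
F = g(psi1_hat) - psi2_hat            # the restriction (36); near zero in truth
```

In the actual Monte Carlo, F(psi_T) is then combined with a HAC estimate of its variance to form the Wald statistic in (12), which is compared with the chi-square(1) distribution.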
We computed F(psi_T) in the artificial data and compared the small sample distribution of the test statistic in (12) with the chi-square distribution with one degree of freedom. We will refer to this experiment as experiment 1. In experiment 2, the value of sigma is considered known and the value of rho is estimated. Here equation (35) is replaced by

u_t(psi) = ( (dz_t - psi_1 dz_{t-1}) dz_{t-1},  (x_t*)^2 - (psi_2)^2 )'.   (37)

The simple nature of this example prevents us from estimating sigma and rho simultaneously and still have a meaningful Wald test. It is not hard to show that in that case the variance of F(psi_T) would be singular. The results for experiments 1 and 2 are presented in Tables 6 and 7, respectively. In section 5.1, we found that results are not very sensitive to whether the Bartlett or quadratic spectral (QS) kernel is used. As a result, we only report results based on the Bartlett kernel in this section. We used the Andrews and Newey-West methods to calculate the optimal bandwidth. Consistent with the notation in section 5, here we refer to the variance estimators (V_T) based on the first and second procedure as BARTLETT and NW, respectively (see Table 1). The structure of Tables 6 and 7 is similar to that of Tables 3a and 3b in section 5. The first two columns indicate the frequency of times that the test statistic is less than the 5% and 10% critical values of the chi-square(1) distribution, respectively. The third and fourth columns indicate the frequency of times that the test statistic is greater than the 90% and 95% critical values. The fifth column contains the average value of the selected optimal bandwidth, xi_T.

6.2.1 Experiment 1: Innovation Variance Estimated

Our results reflect five observations generally worth emphasizing. First, consistent with the findings of section 5, the small sample coverage probabilities are
F iv e c lo s e r to o b s e rv a tio n s th e ir a re a s y m p to tic values when there i l t l persistence, i e , p = 0.1, than when p = 0. . Also, as in section 5 s ite .. 5 , the coverage probabilities are closer to their nominal values when data have been filtered by f r t differencing, rather than when they are H P f l e e . In f c , the results are surprisingly is itrd at good for the f r t difference f l e , even with T = 120. For this reason, to save space we only is itr report the results based on B A R T L E T T for the f r t difference f l e . is itr Second, we found substantial skewness in the distribution of our test s a i t c Moreover, ttsi. this skewness primarily reflects a rightward shift relative to the chi-square distribution. This i a problem because in practice one i only interested in one-sided t s s ests. The third observation i that f r t order prewhitening increases the values of Vj?r, es s is pecially at low values of the bandwidth. This helps reduce the f t tail problem. In this aexperiment, f r t order prewhitening also helps to reduce the skewness problem. Here, as in is the example in section 5. , second order prewhitening does not help. Although f r t order 1 is prewhitening helps, i only does so very l t l when data are H P f l e e . The frequency of t ite itrd falsely rejecting the null hypothesis at the 5 or 10% levels i around twice what i should s t be in a sample of size 120. Fourth, as in the section 5.1 example, f r t order prewhitening is has a bigger effect on the fat t i with B A R T L E T T than with N W . Fifth, as expected, the al asymptotic theory i validated when the number of observations becomes large. s 6.2.2 Experiment 2: Autocorrelation Estimated The results of experiment 2 reported in Tables 7a and 7b, f r similar to those of exper , ie iment 1 Here, too, N W exhibits fatter t i s relative to B A R T L E T T when there i f r t . al s is order prewhitening. 
For example, in Table 7a (T = 120), the sum of the two 5% tail area probabilities is 17.5 percent for NW versus 11.3 for BARTLETT.

The reason for the poor performance of NW relative to BARTLETT when there is first order prewhitening appears to be that there are offsetting biases in the latter, as in the Monte Carlo analysis in section 5.1. To see this, consider first Figure 6. This figure exhibits S(ξ)/S as a function of the bandwidth ξ used in the underlying zero-frequency spectral density calculation. Note how first order prewhitening leads to values of the estimated V_{ψ,T} that are very high at low values of the bandwidth. For example, with the Bartlett kernel and first order prewhitening, a bandwidth of around 3 results in V_{ψ,T} being overestimated by a factor of around 2.8. In addition, in results not reported here, we find that estimates of V_{ψ,T} are downward biased for every fixed bandwidth.

For reasons like those reported in section 5.1, the Andrews optimal bandwidth selection method computes a low value for the bandwidth (see Tables 7a and 7b). In a large sample, this would lead to an upward bias in the estimate of V_{ψ,T}. However, when T = 120, the upward bias roughly cancels the downward bias, allowing Andrews to turn in a tolerable performance. At the same time NW, which accurately recognizes that a much larger bandwidth is needed, does poorly. Here, as in section 5.1, the problem is that NW does not have an upward bias to cancel the downward bias in estimating V_{ψ,T}. This is why NW produces fatter tail area probabilities.

Thus, BARTLETT's good performance relative to NW is a Pyrrhic victory. While it is nice that BARTLETT performed well, the manner in which this was accomplished - by the offsetting effects of two biases - is cause for discomfort. There would of course be no problem if in practice the two biases in BARTLETT were perfectly correlated across applications. But that this is not the case is suggested by the results based on T = 1000 and T = 5000 in Tables 7a and 7b. There we see that BARTLETT with first order prewhitening has thin tails. For example, in Table 7a, the upper 5% tail area is equal to 2.2 for BARTLETT and equal to 7.8 for NW when T = 1000. When T = 5000, these numbers for BARTLETT and NW are 2.6 and 5.2, respectively. The presence of thin tails reflects the fact that the bandwidth parameter increases quite slowly as the number of observations increases. For example, for T = 1000 observations, Table 7a indicates that the bandwidth parameter is only 10.7 for BARTLETT. Figure 6 indicates that a bandwidth parameter of 10.7 could still produce a 30% overestimate of V_{ψ,T}. At the same time, the downward bias in estimates of V_{ψ,T} disappears in a sample of 5000. Then the only source of bias that remains is the upward bias stemming from the overly short bandwidth parameter. This example suggests that the upward and downward sources of bias in this model are not perfectly correlated.

7 Summary and Conclusion

It is common in business cycle analysis to characterize the business cycle in terms of a set of second moment properties of detrended data. Researchers then focus on constructing models that can account for these moments. Recently, a number of researchers have employed versions of the GMM sampling theory developed by Hansen (1982) to assist them in this process. Although the sampling theory is known to be valid in arbitrarily large data sets, little is known about how well it works in realistic settings. Our purpose in this paper is to investigate this.

The results are disappointing. The asymptotic theory appears to provide a poor approximation in finite samples, particularly when the data have been HP filtered. These conclusions are based on a Monte Carlo study of two types of statistics. First, we examine second moment statistics like correlations and standard deviations. We study the coverage probabilities of confidence intervals computed for these statistics. Second, we study the size properties of a statistic proposed in Christiano and Eichenbaum (1992) for the purpose of testing model fit.

Consider first our analysis of second moment statistics. We study these in the context of two dgm's: a univariate time series model often used in the analysis of macroeconomic data and a four variable vector autoregression estimated for the basic macroeconomic variables using postwar U.S. data. Our objective in selecting these dgm's was to build confidence that our Monte Carlo results provide a reasonable indication of statistical performance in actual research applications. In our analysis of confidence intervals we consider two issues: (i) how frequently the 'true' value of the second moment lies outside computed confidence intervals, and (ii) how often the true statistic lies to the left or to the right of the confidence interval. When the statistic lies outside the confidence interval more often than predicted by the (asymptotic) sampling theory, we say it exhibits a 'fat-tail' problem. Also, if it lies on one side of the confidence interval more often than on the other side, then we say the statistic exhibits a 'skewness' problem. We examine a number of different ways of estimating confidence intervals and find that, for the 23 statistics examined, there is almost always a substantial skewness problem. Unfortunately, there is no clear pattern in the direction of this skewness problem. To a somewhat lesser extent, we also encountered a fat tail problem.

We use our univariate dgm to conduct a thorough diagnosis of the causes of the skewness and fat tails. We find that these problems are most severe when the underlying detrended data exhibit substantial persistence.
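The estimator at the center of this diagnosis — a weighted sum of sample autocovariances estimating the zero-frequency spectral density — can be sketched in a few lines. The dgm and bandwidth below (ρ = 0.9, T = 120, bandwidth 5) are our illustrative choices, not values taken from the paper's tables:

```python
import numpy as np

def bartlett_lrv(u, bandwidth):
    """Bartlett-kernel estimate of the zero-frequency spectral density
    (long-run variance) of a scalar series: a generic sketch, not the
    paper's exact estimator."""
    u = u - u.mean()
    T = len(u)
    s = (u @ u) / T                                   # lag-0 autocovariance
    for j in range(1, bandwidth + 1):
        gamma_j = (u[j:] @ u[:-j]) / T                # lag-j autocovariance
        s += 2.0 * (1.0 - j / (bandwidth + 1)) * gamma_j
    return s

# Illustration of the imprecision under persistence: for an AR(1) with
# rho = 0.9 the true long-run variance is 1/(1 - rho)^2 = 100, yet with
# T = 120 and a short bandwidth the estimate falls far short on average.
rng = np.random.default_rng(0)
rho, T, burn, reps = 0.9, 120, 200, 200
estimates = []
for _ in range(reps):
    x = np.zeros(T + burn)
    for t in range(1, T + burn):                      # burn in 200 observations
        x[t] = rho * x[t - 1] + rng.standard_normal()
    estimates.append(bartlett_lrv(x[burn:], bandwidth=5))
print(np.mean(estimates))                             # far below the true 100
```

The shortfall combines kernel truncation with small-sample bias in the autocovariances themselves, the two forces discussed in the diagnosis that follows.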
This is because our procedures for computing confidence intervals require an estimate of the zero-frequency spectral density of the 'GMM error process,' which is a particular function of the detrended data. As is well known, zero-frequency spectral density estimators show considerable imprecision when the data are persistent.

That persistence is an important element of the problem is consistent with our finding that the distortions are greater when we consider statistics based on data transformed using the HP filter, rather than the first difference filter. In our statistical environment, HP filtered data exhibit more persistence than first difference filtered data for two reasons: (i) the data have the property that there is more persistence in the business cycle frequencies than in the higher frequencies, and (ii) the HP filter emphasizes the former and the first difference filter, the latter. Among the various procedures we used, we found none that satisfactorily resolves the persistence problem. For example, there is a zero-frequency spectral density estimator designed specifically to accommodate persistence: the prewhitening procedure of Andrews (1991) and Andrews and Monahan (1992). However, we had only limited success with it. We did find that prewhitening can reduce (without eliminating) the fat-tail problem, but that this depends very sensitively on precisely how the degree of prewhitening is selected. In particular, we find that with first order prewhitening, the fat-tail problem is definitely alleviated relative to the situation when there is no prewhitening. However, this is only of limited interest: in any given empirical application, one does not know what the correct degree of prewhitening is, and some means must be found to estimate it. In our Monte Carlo analysis, we found that when the degree is selected using the AIC criterion, then second order prewhitening was most often indicated. The trouble is that second order prewhitening generates roughly the same results as no prewhitening at all. In addition, although there is some evidence that prewhitening can alleviate the fat-tail problem, it seems to have relatively little impact on skewness.

An important question addressed by our work is: What procedure for computing confidence intervals works best? Although we tried several procedures, none turned out to uniformly dominate the rest. The ones we tried are differentiated according to how the zero-frequency spectral density of the GMM error process is estimated. In each case, we consider the non-parametric approach, which involves computing a weighted average of autocovariances up to some finite lag length. In terms of sampling performance, the differences among the various procedures studied came down to how the lag length was chosen. One lag selection procedure, due to Andrews (1991) and Andrews and Monahan (1992), places structure on the autocovariances and computes the lag length based on an examination of the lag-one autocovariance. The other lag selection procedure, due to Newey and West (1994), examines a longer list of autocovariances. Not surprisingly, we found that the Newey and West (1994) lag-selection procedure works best when the autocovariance function exhibits a complicated pattern and the first order autoregressive assumption underlying the Andrews (1991) and Andrews and Monahan (1992) procedure is misspecified. We expected, therefore, that confidence intervals computed based on the Newey and West procedure would exhibit fewer distortions. We particularly expected this when there is first order prewhitening, since we found in our application that this produces an exotically shaped autocovariance function that is completely unlike the autocovariance function of a first order autoregression. It was to our great surprise, therefore, that with first order prewhitening the Andrews and Andrews and Monahan procedure actually dominates the Newey and West procedure. The reason for this is that small sample biases in the estimation of autocovariances also matter, in addition to misspecification, in determining finite sample performance. In our environment, the impact of these two factors roughly cancels. Since the Andrews and Andrews and Monahan procedure suffers from both problems and Newey and West from only one, this explains why the former does better. Given the reason that the former dominated the latter in this case, we thought this experiment did not constitute a clear basis for preferring the Andrews and Andrews and Monahan procedure. This and other experiments left us with conflicting evidence on whether one or the other of these two procedures dominates.

Our basic findings for the univariate dgm extend to the multivariate dgm. In particular, for virtually all of the statistics examined, there is a skewness problem. Also, researchers who somehow know that first order prewhitening is appropriate can alleviate the fat-tail problems somewhat. Moreover, the direction of skewness is not consistent across the 23 statistics examined.

Finally, we considered a statistic for testing the null hypothesis that a model's second moment implications correspond to the actual second moment properties of the data. We found that this statistic, which according to asymptotic sampling theory has a chi-square distribution, rejects the null hypothesis far too often, even when it is true. Again, the problem is more severe when the second moments are based on HP filtered data.
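The over-rejection just described is easy to reproduce in a stylized setting. In the sketch below (our construction, not the paper's experiment), the null that E[x_t^2] equals its true value is tested with a Wald statistic whose denominator is a Bartlett-kernel long-run variance estimate; the dgm (an AR(1) with ρ = 0.9), sample size, and bandwidth are illustrative assumptions:

```python
import numpy as np

def bartlett_lrv(u, m):
    """Bartlett-kernel long-run variance of a scalar series (sketch)."""
    u = u - u.mean()
    s = (u @ u) / len(u)
    for j in range(1, m + 1):
        s += 2.0 * (1.0 - j / (m + 1)) * (u[j:] @ u[:-j]) / len(u)
    return s

rng = np.random.default_rng(1)
rho, T, burn, reps, bw = 0.9, 120, 200, 500, 4
sigma_x2 = 1.0 / (1.0 - rho**2)            # true variance of x under the dgm
crit = 3.841                               # chi-square(1) 5% critical value

rejects = 0
for _ in range(reps):
    x = np.zeros(T + burn)
    for t in range(1, T + burn):
        x[t] = rho * x[t - 1] + rng.standard_normal()
    u = x[burn:] ** 2 - sigma_x2           # GMM error: mean zero under H0
    wald = T * u.mean() ** 2 / bartlett_lrv(u, bw)
    rejects += wald > crit
print(rejects / reps)                      # well above the nominal 0.05
```

The rejection rate comes out several times the nominal level because the long-run variance in the denominator is, on average, badly underestimated when u_t is persistent — the same mechanism the paper documents for HP-filtered statistics.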
The results reported in this paper are a clear indication that a more reliable sampling theory is required for the statistics used in the analysis of business cycles. The need is less pressing for analysts who use the first difference filter. However, this is little comfort for researchers interested in the frequencies emphasized by the HP filter.

8 References

1. Andrews, Donald W. K., 1991, 'Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation,' Econometrica 59(3): 817-858, May.
2. Andrews, Donald W. K., and J. Christopher Monahan, 1992, 'An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator,' Econometrica 60: 953-966.
3. Backus, Dave, Allan Gregory, and Stanley Zin, 1989, 'Risk Premiums in the Term Structure: Evidence from Artificial Economies,' Journal of Monetary Economics, Vol. 24.
4. Backus, Dave, and Patrick Kehoe, 1992, 'International Evidence on Historical Properties of Business Cycles,' American Economic Review, Vol. 81, September.
5. Backus, Dave K., Patrick J. Kehoe, and Finn E. Kydland, 1994, 'Dynamics of the Trade Balance and the Terms of Trade: The J-Curve?' American Economic Review 84(1): 84-103.
6. Baxter, Marianne, and Robert G. King, 1994, 'Measuring Business Cycles: Approximate Band-Pass Filters for Economic Time Series,' National Bureau of Economic Research Working Paper 5022.
7. Braun, R. Anton, 1994, 'Tax Disturbances and Real Economic Activity in the Postwar United States,' Journal of Monetary Economics 33: 441-462.
8. Braun, R. Anton, and Charles L. Evans, 1991, 'Seasonal Solow Residuals and Christmas: A Case for Labor Hoarding and Increasing Returns,' August, unpublished manuscript.
9. Braun, R. Anton, and Charles L. Evans, 1995, 'Seasonality and Equilibrium Business Cycle Theories,' Journal of Economic Dynamics and Control, Vol. 19, no. 3, pp. 503-531.
10. Burnside, Craig, 1991, 'Small Sample Properties of 2-Step Method of Moments Estimators,' Department of Economics, University of Pittsburgh.
11. Burnside, Craig, and Martin Eichenbaum, 1994, 'Small Sample Properties of Generalized Method of Moments Based Wald Tests,' Working Paper, University of Pittsburgh and Northwestern University.
12. Burnside, Craig, Martin Eichenbaum, and Sergio Rebelo, 1993, 'Labor Hoarding and the Business Cycle,' Journal of Political Economy 101(2): 245-273.
13. Cecchetti, Stephen G., Pok-sang Lam, and Nelson C. Mark, 1993, 'The Equity Premium and the Risk-Free Rate: Matching the Moments,' Journal of Monetary Economics 31(1): 21-46.
14. Christiano, Lawrence J., 1988, 'Why Does Inventory Investment Fluctuate So Much?,' Journal of Monetary Economics 21: 247-280.
15. Christiano, Lawrence J., 1989, 'Comment on Campbell and Mankiw,' Macroeconomics Annual.
16. Christiano, Lawrence J., 1992, 'Searching for a Break in GNP,' Journal of Business and Economic Statistics 10(3): 237-250.
17. Christiano, Lawrence J., and Martin Eichenbaum, 1992, 'Current Real-Business-Cycle Theories and Aggregate Labor-Market Fluctuations,' American Economic Review 82(3): 430-450.
18. Cogley, Timothy, and James M. Nason, 1995, 'Effects of the Hodrick-Prescott Filter on Trend and Difference Stationary Time Series: Implications for Business Cycle Research,' Journal of Economic Dynamics and Control, Vol. 19, no. 1-2, pp. 253-278.
19. Cooley, Thomas F., and Gary D. Hansen, 1989, 'The Inflation Tax in a Real Business Cycle Model,' American Economic Review 79(4): 733-748.
20. Davidson, Russell, and James G. MacKinnon, 1993, Estimation and Inference in Econometrics, Oxford University Press.
21. Den Haan, Wouter J., 1995, 'The Term Structure of Interest Rates in Real and Monetary Economies,' Journal of Economic Dynamics and Control.
22. Den Haan, Wouter J., and Andrew Levin, 1994, 'Inferences from Parametric and Non-Parametric Spectral Density Estimation Procedures,' manuscript, UCSD and Federal Reserve Board.
23. Den Haan, Wouter J., and Albert Marcet, 1994, 'Accuracy in Simulations,' Review of Economic Studies 61(1): 3-18.
24. Ericsson, Neil R., 1991, 'Monte Carlo Methodology and the Finite Sample Properties of Instrumental Variables Statistics for Testing Nested and Non-Nested Hypotheses,' Econometrica 59(5): 1249-1277.
25. Ferson, Wayne, and Stephen E. Foerster, 1991, 'Finite Sample Properties of the Generalized Method of Moments in Tests of Conditional Asset Pricing Models,' Working Paper, University of Chicago Graduate School of Business.
26. Fisher, Jonas D. M., 1993, 'Relative Prices, Complementarities, and Co-movement Among Components of Aggregate Expenditures,' Working Paper, University of Western Ontario.
27. Fuhrer, Jeffrey, George Moore, and Scott Schuh, 1993, 'Estimating the Linear-Quadratic Inventory Model: Maximum Likelihood Versus Generalized Method of Moments,' Finance and Economics Discussion Series 93-11, Board of Governors, Washington, D.C., April.
28. Hamilton, James D., 1994, Time Series Analysis, Princeton University Press.
29. Hansen, Lars P., 1982, 'Large Sample Properties of Generalized Method of Moments Estimators,' Econometrica 50: 1029-1054.
30. Hansen, Lars P., and Robert J. Hodrick, 1980, 'Forward Exchange Rates as Optimal Predictors of Future Spot Rates: An Econometric Analysis,' Journal of Political Economy 88: 829-853.
31. King, Robert G., and Sergio T. Rebelo, 1993, 'Low Frequency Filtering and Real Business Cycles,' Journal of Economic Dynamics and Control 17(1-2): 207-232.
32. Kocherlakota, N. R., 1990, 'On Tests of Representative Consumer Asset Pricing Models,' Journal of Monetary Economics 26(2): 285-304.
33. Kydland, Finn E., and Edward C. Prescott, 1982, 'Time to Build and Aggregate Fluctuations,' Econometrica, Vol. 50, no. 6, pp. 1345-1370.
34. Long, John B., and Charles I. Plosser, 1983, 'Real Business Cycles,' Journal of Political Economy 91(1): 39-69.
35. Marshall, David, 1992, 'Inflation and Asset Returns in a Monetary Economy,' Journal of Finance 47(4): 1315-1342.
36. Neely, Christopher J., 1993, 'A Reconsideration of Representative Consumer Asset Pricing Models,' Department of Economics, University of Iowa.
37. Nelson, Charles R., and Richard Startz, 1990, 'The Distribution of the Instrumental Variables Estimator and its t-ratio When the Instrument is a Poor One,' Journal of Business 63(1, Part 2): S125-S140.
38. Newey, W. K., and K. D. West, 1987, 'A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix,' Econometrica 55: 703-708.
39. Newey, W. K., and K. D. West, 1994, 'Automatic Lag Selection in Covariance Matrix Estimation,' Review of Economic Studies, Vol. 61, no. 4, pp. 631-653.
40. Ogaki, Masao, 1992, 'An Introduction to the Generalized Method of Moments,' Rochester Center for Economic Research Working Paper no. 370.
41. Prescott, Edward, 1986, 'Theory Ahead of Business Cycle Measurement,' Carnegie-Rochester Conference Series on Public Policy 25: 11-66.
42. Reynolds, Patricia, 1992, 'International Co-Movements in Production and Consumption,' Working Paper, University of Southern California.
43. Singleton, K. J., 1988, 'Econometric Issues in the Analysis of Equilibrium Business Cycle Models,' Journal of Monetary Economics 21(2/3): 361-386.
44. Tauchen, G., 1986, 'Statistical Properties of Generalized Method of Moments Estimators of Structural Parameters Obtained from Financial Market Data,' Journal of Business and Economic Statistics 4: 397-425.
45. West, Kenneth, and David W. Wilcox, 1992, 'Some Evidence on Finite Sample Distributions of Instrumental Variables Estimators of a Linear-Quadratic Inventory Model,' Board of Governors, Federal Reserve System.
46. White, H., 1984, Asymptotic Theory for Econometricians, New York: Academic Press.

FIGURE 1a: FILTER WEIGHTS, HP FILTER
[Four panels: detrended observations 2 and 118; 10 and 110; 25 and 95; and 60.]
Notes: The figure displays the HP filter weights used to compute the i-th observation on the detrended variable, as indicated in the header. The sample length is equal to 120 observations.

FIGURE 1b: TRANSFER FUNCTIONS, ALTERNATIVE FILTERS
Note: ω denotes the frequency of oscillation divided by π.

FIGURE 1c: STANDARD DEVIATION OF HP FILTERED DATA
Notes: Plot of the standard deviation of x_t*, where x_t* is obtained by detrending 120 observations using the indicated filter. The dgm is equation (30) with ρ = 0.4 and σ = 0.01. See section 5.1.1 for details.
FIGURE 2a: S(ξ) DIVIDED BY S, HP FILTER AND b = 0
Notes: S(ξ) is the spectral density of the GMM residual associated with data detrended by the indicated filter, using the indicated kernel and bandwidth ξ. S is equal to S(ξ) as ξ → ∞. The dgm is equation (30) with ρ = 0.4 and σ = 0.01. See Table 1 for definitions of the spectral estimators.

FIGURE 2b: S(ξ) DIVIDED BY S, FIRST DIFFERENCE FILTER AND b = 0
Note: See the note to Figure 2a.

FIGURE 3: S(ξ) DIVIDED BY S, HP FILTER AND b = 1
Note: See the note to Figure 2a.

FIGURE 4: S(ξ) DIVIDED BY S, HP FILTER AND b = 2
Note: See the note to Figure 2a.

FIGURE 5: COVERAGE PROBABILITIES FOR VARIOUS STATISTICS (MULTIVARIATE DGM)
[Four panels: BARTLETT (b=0), NEWEY-WEST (b=0), BARTLETT (b=1), NEWEY-WEST (b=1); each plots statistics 1-23.]
Notes: The grey (black) columns report the frequency the t-statistic is higher (less) than the upper (lower) 5% critical value. The value of b indicates the order of prewhitening. For definitions of the spectral estimators see Table 1. The 23 statistics are ordered as standard deviations, relative standard deviations, and cross-correlations at leads and lags, where σ_x stands for the standard deviation of variable x and ρ_xy(τ) stands for the correlation coefficient between x_t and y_{t+τ}.

FIGURE 6: S(ξ) DIVIDED BY S, WALD TEST, EXPERIMENT 2 AND THE HP FILTER
Notes: S(ξ) is the spectral density of the GMM residuals in experiment 2, using the indicated kernel and bandwidth ξ. S is equal to S(ξ) as ξ → ∞. For details see section 6.2.2, and see Table 1 for definitions of the spectral estimators.

TABLE 1: ZERO-FREQUENCY SPECTRAL DENSITY ESTIMATORS

NAME       KERNEL                                  BANDWIDTH
UW(11)     Unweighted (equation (18) with θ = 0)   11
BART(11)   Bartlett (equation (18) with θ = 1)     11
BARTLETT   Bartlett (equation (18) with θ = 1)     Andrews, automatic
QS         QS (equations (19) and (20))            Andrews, automatic
NW         Bartlett (equation (18) with θ = 1)     Newey-West, automatic

TABLE 2a: COVERAGE PROBABILITIES
DGM: Δx_t = 0.4Δx_{t−1} + 0.01ε_t, ε_t ~ N(0,1)
[Panels for T = 120 and T = 1000; columns 5%, 10%, 90%, 95% under HP and Δ detrending; rows TRUE, UW(11), BART(11), BARTLETT, QS, NW at b = 0, 1, 2.]
Notes: The 5% (95%) and 10% (90%) columns report the frequency the t-statistic is less (higher) than the lower (upper) critical value. The row labeled TRUE uses the true standard deviation of the estimate to calculate the t-statistic. The parameter b indicates the order of prewhitening. For definitions of the spectral estimators see Table 1. HP (Δ) refers to Hodrick-Prescott detrending (first-differencing). Based on 1,000 independently simulated data sets, each of length T.

TABLE 2b: COVERAGE PROBABILITIES
DGM: Δx_t = 0.1Δx_{t−1} + 0.01ε_t, ε_t ~ N(0,1)
[Structure as in Table 2a.] For notes see Table 2a.

TABLE 3a: DIAGNOSING FAT-TAIL AND SKEWNESS PROBLEMS IN TABLE 2a
DGM: Δx_t = 0.4Δx_{t−1} + 0.01ε_t, ε_t ~ N(0,1)
[Columns I, II, III for T = 120 and T = 1000, HP and Δ detrending, b = 0, 1, 2.]
Notes: See Table 2a; I: the correlation between the estimated statistic and [V_T]^(1/2); II: the Monte Carlo standard deviation of the estimated statistic; III: the Monte Carlo mean of [V_T/T]^(1/2).

TABLE 3b: DIAGNOSING FAT-TAIL AND SKEWNESS PROBLEMS IN TABLE 2b
DGM: Δx_t = 0.1Δx_{t−1} + 0.01ε_t, ε_t ~ N(0,1)
For notes see Tables 2a and 3a.

TABLE 4a: SAMPLING PROPERTIES OF BANDWIDTHS
DGM: Δx_t = 0.4Δx_{t−1} + 0.01ε_t, ε_t ~ N(0,1)
[Columns I, II for T = 120 and T = 1000, HP and Δ detrending, rows BARTLETT, QS, NW at b = 0, 1, 2.]
Notes: See Table 2a; I: the Monte Carlo mean of the selected bandwidth ξ_T; II: the Monte Carlo standard deviation of ξ_T.

TABLE 4b: SAMPLING PROPERTIES OF BANDWIDTHS
DGM: Δx_t = 0.1Δx_{t−1} + 0.01856ε_t, ε_t ~ N(0,1)
For notes see Tables 2a and 4a.

TABLE 5a: COVERAGE PROBABILITIES FOR σ_y
DGM: US VAR
[Columns 5%, 95% under HP and Δ detrending; rows BART(11), BARTLETT, NW at b = 0, 1 for normal and bootstrapped errors.]
Notes: The 5% (95%) column reports the frequency the t-statistic is less (higher) than the lower (upper) 5% critical value. The value of b indicates the order of prewhitening. The DGM is a VAR described in the text. The generated errors are either normal (DGM = N) or bootstrapped (DGM = B). For definitions of the spectral estimators see Table 1. HP refers to Hodrick-Prescott detrending and Δ refers to first-differencing.

TABLE 5b: COVERAGE PROBABILITIES FOR VARIOUS STATISTICS, SUMMARY RESULTS
DGM: US VAR
[Columns I-IV under HP and Δ detrending; rows as in Table 5a.]
Notes: See Table 5a; I: the average absolute deviation between the sum of the two 5% tail areas and 10%, across all 23 statistics; II: the number of times the sum of the two 5% tail areas is larger than 10%; III: the average absolute value of the difference between the upper and lower 5% tail areas; IV: the number of times the lower 5% tail area is bigger than the upper 5% tail area.

TABLE 6a: COVERAGE PROBABILITIES FOR WALD TEST EXAMPLE: EXPERIMENT 1
ρ = 0.1, HP FILTERED DATA

              T     b    5%    10%    90%    95%    ξ
TRUE         120    -    4.6    8.8   10.2    4.8    -
BARTLETT     120    0    3.5    6.9   28.5   20.5   10.4
NW           120    0    4.3    7.7   24.1   16.4    5.2
BARTLETT     120    1    5.0   10.6   17.7   11.9    3.2
NW           120    1    5.3    9.5   21.1   15.5    9.9
BARTLETT     120    2    3.8    7.0   26.3   19.8    0.81
NW           120    2    4.2    8.0   26.0   17.9    3.1
TRUE        1000    -    5.3    9.0
BARTLETT    1000    0    4.7    8.3
NW          1000    0    3.6    6.9
BARTLETT    1000    1    5.5    9.6
NW          1000    1    3.8    8.3
BARTLETT    1000    2    4.8    8.0
NW          1000    2    3.5    7.1
1 16.6 16.2 11.3 12.3 17.6 16.2 5.6 11.0 10.0 7.0 7.6 11.7 10. 1 — 23.3 12.8 7.0 29.7 .86 6.44 TRUE BARTLETT NW BARTLETT NW BARTLETT NW 5000 5000 5000 5000 5000 5000 5000 0 0 1 1 2 2 3.7 3.6 4.0 3.7 4.3 3.2 3.8 9.1 8.2 8.9 8.7 9.2 7.7 8.4 10.0 12.7 11.9 11.8 10.2 16.8 13.3 4.7 6.7 7.0 5.9 5.7 9.3 8. 1 — 40.3 28.4 12.0 39.0 1.04 19.3 AVE«;t ) SPECTRAL ESTIMATOR AVE(Ct ) FIRST DIFFERENCED DATA SPECTRAL ESTIMATOR T b 5% 10% 97 0. 95% TRUE BARTLETT BARTLETT 120 120 120 _ 0 1 4.5 3.7 4.3 9.4 9.1 9.3 9.9 11.2 11.4 4.9 6. 1 6.3 2.3 0.5 TRUE BARTLETT BARTLETT 1000 1000 1000 _ 0 1 4.6 4.6 4.7 11.0 10.8 10.9 9.5 11.3 10.2 5.3 6.0 5.6 4.7 0.4 TRUE BARTLETT BARTLETT 5000 5000 5000 — 0 1 5.3 5.2 5.3 9.8 9.3 9.6 10.4 10.9 10.8 4.5 5.7 5.6 8.2 0.4 the frequency the report and 10%(90%) ■olumns c 5%(95%) The a ue. i s less(hlgher) than the lower(upper) 5% and 10% critical v l For definitions o f f prewhitening. The value of b Indicates the order o HP refers t o Hodrick-Prescott detrending . spectral estimators see Table 1 AVE(£t) Is the Monte Carlo mean of t he and A refers to first-differencing. NOTES: 2 X “statistic bandwidth parameter. 10 TABLE 6b: COVERAGE PROBABILITIES FOR WALD TEST EXAMPLE: EXPERIMENT p = 0.5 HP FILTERED DATA SPECTRAL ESTIMATOR TRUE BARTLETT NW BARTLETT NW BARTLETT NW 120 120 120 120 120 120 120 TRUE BARTLETT NW BARTLETT NW BARTLETT NW TRUE BARTLETT NW BARTLETT NW BARTLETT NW b T 5% 10% 90% 95% — 0 0 1 1 2 2 4.6 3.5 3.5 5.7 5. 1 3.1 3.4 8.2 6.9 8.0 10.3 10.9 5.9 7.6 9.9 30.4 27.8 19. 1 22.8 30.6 31.9 4.5 23.5 20.3 13.2 17.5 24.8 23.7 1000 1000 1000 1000 1000 1000 1000 0 0 1 1 2 2 4.4 4.2 4.6 4.9 5.4 3.3 4.6 9.6 8.5 8.6 9.8 9.9 6.5 8.3 10.6 15.7 17.6 10.8 14. 1 23.8 18.8 5.0 10.5 11.1 7.0 8.6 16.1 12. 1 5000 5000 5000 5000 5000 5000 5000 0 0 1 1 2 2 5.5 4.9 3.4 5.5 3.9 3.9 3.4 9.6 9.4 7.8 9.9 8.5 7.8 7.7 10.1 12.7 12.2 9.0 9.6 22.3 12.4 4.4 6.4 7.2 4.2 5.4 14.9 7.3 AVE($t ) 15.7 5.79 6 .0 15.4 1.86 3. 14 34.7 13.6 12.8 96.0 4.01 13.2 59.9 29.9 22. 
1 116.6 6.97 33.5 FIRST1 DIFFERENCED DATA SPECTRAL ESTIMATOR T b 5% 10% 90% 95% AVE(^t) TRUE BARTLETT BARTLETT 120 120 120 0 1 4.3 4.3 4.8 9.9 9. 1 10.0 10. 1 14.3 11.5 5.0 8.4 7.5 6. 1 1.0 TRUE BARTLETT BARTLETT 1000 1000 1000 0 1 5.6 5.0 5.8 12.2 11.9 12.6 9.2 12.5 9.3 5.6 6.6 5.0 13.2 TRUE BARTLETT BARTLETT 5000 5000 5000 — 0 1 5.4 5.4 5.7 9.6 9.6 10. 1 10.0 11.1 9.4 5. 1 6.7 4.2 22.7 For notes see Table 6 . a 11 1.6 2.6 TABLE 7a: COVERAGE PROBABILITIES FOR WALD TEST EXAMPLE: EXPERIMENT 2 HP FILTERED DATA p = 0.1 SPECTRAL ESTIMATOR T b 5/ *. 10% 90% 95% TRUE BARTLETT NW BARTLETT NW BARTLETT NW 120 120 120 120 120 120 120 0 0 1 1 2 2 3.3 3.3 3.6 6.5 3.9 3.4 4.3 9.1 7.6 8.9 13.7 9.8 8.4 8.2 11.7 21.3 16.9 7.0 20.4 20.0 21.3 5.1 14.4 9.7 4.8 13.6 14. 1 14.8 — 10.4 4.3 2.8 23.8 .70 3.8 TRUE BARTLETT NW BARTLETT NW BARTLETT NW 1000 1000 1000 1000 1000 1000 1000 0 0 1 1 2 2 5.2 4.8 4.8 6.9 5.0 4.8 5.9 9.4 8.6 8.8 12.5 9.0 8.8 10.5 9.2 14.2 13.6 3.9 13.4 13.3 14.2 5.0 8.2 7.8 2.2 7.8 7.8 8.5 — 23.3 8.7 6.2 78. 1 .80 8.6 TRUE BARTLETT NW BARTLETT NW BARTLETT NW 5000 5000 5000 5000 5000 5000 5000 — 0 0 1 1 2 2 4.8 4.6 4.5 5.6 4.8 4.3 4.6 10.0 9.4 9.3 11.2 9.4 9.4 8.6 9.4 10.4 11.5 6.0 9.6 11.2 11.0 5.0 6.3 6.6 2.6 5.2 6.7 5.4 40.3 23.8 10.7 103.4 1.1 10.9 For notes see Table 6 . a 12 AVE(St ) TABLE 7b: COVERAGE PROBABILITIES FOR WALD TEST EXAMPLE: EXPERIMENT p = 0.5 HP FILTERED DATA SPECTRAL ESTIMATOR T b TRUE BARTLETT NW BARTLETT NW BARTLETT NW 120 120 120 120 120 120 120 0 0 1 1 2 2 TRUE BARTLETT NW BARTLETT NW BARTLETT NW 1000 1000 1000 1000 1000 1000 1000 0 0 1 1 2 2 TRUE BARTLETT NW BARTLETT NW BARTLETT NW 5000 5000 5000 5000 5000 5000 5000 0 0 1 1 2 2 - — - 5X 1*. 0/ 90% 95% 4.2 3.4 4.8 8.5 5.9 3.7 5.1 9.3 7.9 10.5 19.1 13.6 8.8 10.0 10.7 16.3 8.3 3. 1 12. 
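The objects tabulated above can be made concrete with a minimal sketch. The code below is not the paper's implementation (bandwidths are fixed rather than data-dependent, and the UW and QS estimators are omitted): it implements a Bartlett-kernel (Newey-West) long-run variance estimator with optional AR(b) prewhitening and recoloring, then runs a small Monte Carlo coverage check for the mean of an AR(1) resembling the tables' DGMs; the variable names are illustrative, not the paper's.

```python
import numpy as np

def bartlett_lrv(u, bandwidth):
    """Bartlett-kernel (Newey-West) estimate of the long-run variance of u."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    T = len(u)
    lrv = u @ u / T  # lag-0 autocovariance
    for j in range(1, int(bandwidth) + 1):
        weight = 1.0 - j / (bandwidth + 1.0)  # Bartlett weight
        gamma_j = u[j:] @ u[:-j] / T          # lag-j autocovariance
        lrv += 2.0 * weight * gamma_j
    return lrv

def prewhitened_lrv(u, bandwidth, b=1):
    """AR(b)-prewhiten u, estimate the residual LRV, then recolor."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    if b == 0:
        return bartlett_lrv(u, bandwidth)
    # OLS of u_t on u_{t-1}, ..., u_{t-b}
    Y = u[b:]
    X = np.column_stack([u[b - k: -k] for k in range(1, b + 1)])
    rho = np.linalg.lstsq(X, Y, rcond=None)[0]
    resid = Y - X @ rho
    # Recoloring: LRV(u) = LRV(resid) / (1 - sum(rho))^2
    return bartlett_lrv(resid, bandwidth) / (1.0 - rho.sum()) ** 2

# Monte Carlo coverage check for a nominal 5% two-sided t-test on the
# mean of an AR(1) resembling the tables' DGMs (coefficient 0.4, T = 120).
rng = np.random.default_rng(0)
T, reps, ar = 120, 2000, 0.4
rejections = 0
for _ in range(reps):
    e = 0.01 * rng.standard_normal(T)
    x = np.empty(T)
    x[0] = e[0]
    for t in range(1, T):
        x[t] = ar * x[t - 1] + e[t]
    se = np.sqrt(prewhitened_lrv(x, bandwidth=4, b=1) / T)
    if abs(x.mean() / se) > 1.96:  # true mean is zero
        rejections += 1
# rejections / reps is the empirical size of the test; in samples this
# short it typically exceeds the nominal 5%, consistent with the tables.
```

The recoloring step divides by (1 - Σρ̂)², the squared AR polynomial evaluated at frequency zero, which is why prewhitening helps when the series is persistent: the residuals are closer to white noise, so a short-bandwidth kernel estimate of their spectrum at zero is less biased.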
For notes see Table 6a.

Working Paper Series

A series of research studies on regional economic issues relating to the Seventh Federal Reserve District, and on financial and economic topics.

REGIONAL ECONOMIC ISSUES

Estimating Monthly Regional Value Added by Combining Regional Input With National Production Data (WP-92-8)
Philip R. Israilevich and Kenneth N. Kuttner

Local Impact of Foreign Trade Zone (WP-92-9)
David D. Weiss

Trends and Prospects for Rural Manufacturing (WP-92-12)
William A. Testa

State and Local Government Spending--The Balance Between Investment and Consumption (WP-92-14)
Richard H. Mattoon

Forecasting with Regional Input-Output Tables (WP-92-20)
P.R. Israilevich, R. Mahidhara, and G.J.D. Hewings

A Primer on Global Auto Markets (WP-93-1)
Paul D. Ballew and Robert H. Schnorbus

Industry Approaches to Environmental Policy in the Great Lakes Region (WP-93-8)
David R. Allardice, Richard H. Mattoon and William A. Testa

The Midwest Stock Price Index--Leading Indicator of Regional Economic Activity (WP-93-9)
William A. Strauss

Lean Manufacturing and the Decision to Vertically Integrate: Some Empirical Evidence From the U.S. Automobile Industry (WP-94-1)
Thomas H. Klier

Domestic Consumption Patterns and the Midwest Economy (WP-94-4)
Robert Schnorbus and Paul Ballew

To Trade or Not to Trade: Who Participates in RECLAIM? (WP-94-11)
Thomas H. Klier and Richard Mattoon

Restructuring & Worker Displacement in the Midwest (WP-94-18)
Paul D. Ballew and Robert H.
Schnorbus

Financing Elementary and Secondary Education in the 1990s: A Review of the Issues (WP-95-2)
Richard H. Mattoon

ISSUES IN FINANCIAL REGULATION

Incentive Conflict in Deposit-Institution Regulation: Evidence from Australia (WP-92-5)
Edward J. Kane and George G. Kaufman

Capital Adequacy and the Growth of U.S. Banks (WP-92-11)
Herbert Baer and John McElravey

Bank Contagion: Theory and Evidence (WP-92-13)
George G. Kaufman

Trading Activity, Program Trading and the Volatility of Stock Returns (WP-92-16)
James T. Moser

Preferred Sources of Market Discipline: Depositors vs. Subordinated Debt Holders (WP-92-21)
Douglas D. Evanoff

An Investigation of Returns Conditional on Trading Performance (WP-92-24)
James T. Moser and Jacky C. So

The Effect of Capital on Portfolio Risk at Life Insurance Companies (WP-92-29)
Elijah Brewer III, Thomas H. Mondschean, and Philip E. Strahan

A Framework for Estimating the Value and Interest Rate Risk of Retail Bank Deposits (WP-92-30)
David E. Hutchison, George G. Pennacchi

Capital Shocks and Bank Growth--1973 to 1991 (WP-92-31)
Herbert L. Baer and John N. McElravey

The Impact of S&L Failures and Regulatory Changes on the CD Market 1987-1991 (WP-92-33)
Elijah Brewer and Thomas H. Mondschean

Junk Bond Holdings, Premium Tax Offsets, and Risk Exposure at Life Insurance Companies (WP-93-3)
Elijah Brewer III and Thomas H. Mondschean

Stock Margins and the Conditional Probability of Price Reversals (WP-93-5)
Paul Kofman and James T. Moser

Is There Lif(f)e After DTB? Competitive Aspects of Cross Listed Futures Contracts on Synchronous Markets (WP-93-11)
Paul Kofman, Tony Bouwman and James T. Moser

Opportunity Cost and Prudentiality: A Representative-Agent Model of Futures Clearinghouse Behavior
Herbert L. Baer, Virginia G. France and James T.
Moser (WP-93-18)

The Ownership Structure of Japanese Financial Institutions (WP-93-19)
Hesna Genay

Origins of the Modern Exchange Clearinghouse: A History of Early Clearing and Settlement Methods at Futures Exchanges (WP-94-3)
James T. Moser

The Effect of Bank-Held Derivatives on Credit Accessibility (WP-94-5)
Elijah Brewer III, Bernadette A. Minton and James T. Moser

Small Business Investment Companies: Financial Characteristics and Investments (WP-94-10)
Elijah Brewer III and Hesna Genay

Spreads, Information Flows and Transparency Across Trading Systems (WP-95-1)
Paul Kofman and James T. Moser

MACROECONOMIC ISSUES

An Examination of Change in Energy Dependence and Efficiency in the Six Largest Energy Using Countries--1970-1988 (WP-92-2)
Jack L. Hervey

Does the Federal Reserve Affect Asset Prices? (WP-92-3)
Vefa Tarhan

Investment and Market Imperfections in the U.S. Manufacturing Sector (WP-92-4)
Paula R. Worthington

Business Cycle Durations and Postwar Stabilization of the U.S. Economy (WP-92-6)
Mark W. Watson

A Procedure for Predicting Recessions with Leading Indicators: Econometric Issues and Recent Performance (WP-92-7)
James H. Stock and Mark W. Watson

Production and Inventory Control at the General Motors Corporation During the 1920s and 1930s (WP-92-10)
Anil K. Kashyap and David W. Wilcox

Liquidity Effects, Monetary Policy and the Business Cycle (WP-92-15)
Lawrence J. Christiano and Martin Eichenbaum

Monetary Policy and External Finance: Interpreting the Behavior of Financial Flows and Interest Rate Spreads (WP-92-17)
Kenneth N. Kuttner

Testing Long Run Neutrality (WP-92-18)
Robert G. King and Mark W. Watson

A Policymaker's Guide to Indicators of Economic Activity (WP-92-19)
Charles Evans, Steven Strongin, and Francesca Eugeni

Barriers to Trade and Union Wage Dynamics (WP-92-22)
Ellen R. Rissman

Wage Growth and Sectoral Shifts: Phillips Curve Redux (WP-92-23)
Ellen R.
Rissman

Excess Volatility and The Smoothing of Interest Rates: An Application Using Money Announcements (WP-92-25)
Steven Strongin

Market Structure, Technology and the Cyclicality of Output (WP-92-26)
Bruce Petersen and Steven Strongin

The Identification of Monetary Policy Disturbances: Explaining the Liquidity Puzzle (WP-92-27)
Steven Strongin

Earnings Losses and Displaced Workers (WP-92-28)
Louis S. Jacobson, Robert J. LaLonde, and Daniel G. Sullivan

Some Empirical Evidence on the Effects of Monetary Policy Shocks on Exchange Rates (WP-92-32)
Martin Eichenbaum and Charles Evans

An Unobserved-Components Model of Constant-Inflation Potential Output (WP-93-2)
Kenneth N. Kuttner

Investment, Cash Flow, and Sunk Costs (WP-93-4)
Paula R. Worthington

Lessons from the Japanese Main Bank System for Financial System Reform in Poland (WP-93-6)
Takeo Hoshi, Anil Kashyap, and Gary Loveman

Credit Conditions and the Cyclical Behavior of Inventories (WP-93-7)
Anil K. Kashyap, Owen A. Lamont and Jeremy C. Stein

Labor Productivity During the Great Depression (WP-93-10)
Michael D. Bordo and Charles L. Evans

Monetary Policy Shocks and Productivity Measures in the G-7 Countries (WP-93-12)
Charles L. Evans and Fernando Santos

Consumer Confidence and Economic Fluctuations (WP-93-13)
John G. Matsusaka and Argia M. Sbordone

Vector Autoregressions and Cointegration (WP-93-14)
Mark W. Watson

Testing for Cointegration When Some of the Cointegrating Vectors Are Known (WP-93-15)
Michael T. K. Horvath and Mark W. Watson

Technical Change, Diffusion, and Productivity (WP-93-16)
Jeffrey R. Campbell

Economic Activity and the Short-Term Credit Markets: An Analysis of Prices and Quantities (WP-93-17)
Benjamin M. Friedman and Kenneth N. Kuttner

Cyclical Productivity in a Model of Labor Hoarding (WP-93-20)
Argia M. Sbordone

The Effects of Monetary Policy Shocks: Evidence from the Flow of Funds (WP-94-2)
Lawrence J.
Christiano, Martin Eichenbaum and Charles Evans

Algorithms for Solving Dynamic Models with Occasionally Binding Constraints (WP-94-6)
Lawrence J. Christiano and Jonas D.M. Fisher

Identification and the Effects of Monetary Policy Shocks (WP-94-7)
Lawrence J. Christiano, Martin Eichenbaum and Charles L. Evans

Small Sample Bias in GMM Estimation of Covariance Structures (WP-94-8)
Joseph G. Altonji and Lewis M. Segal

Interpreting the Procyclical Productivity of Manufacturing Sectors: External Effects of Labor Hoarding? (WP-94-9)
Argia M. Sbordone

Evidence on Structural Instability in Macroeconomic Time Series Relations (WP-94-13)
James H. Stock and Mark W. Watson

The Post-War U.S. Phillips Curve: A Revisionist Econometric History (WP-94-14)
Robert G. King and Mark W. Watson

The Post-War U.S. Phillips Curve: A Comment (WP-94-15)
Charles L. Evans

Identification of Inflation-Unemployment (WP-94-16)
Bennett T. McCallum

The Post-War U.S. Phillips Curve: A Revisionist Econometric History: Response to Evans and McCallum (WP-94-17)
Robert G. King and Mark W. Watson

Estimating Deterministic Trends in the Presence of Serially Correlated Errors (WP-94-19)
Eugene Canjels and Mark W. Watson

Solving Nonlinear Rational Expectations Models by Parameterized Expectations: Convergence to Stationary Solutions (WP-94-20)
Albert Marcet and David A. Marshall

The Effect of Costly Consumption Adjustment on Asset Price Volatility (WP-94-21)
David A. Marshall and Nayan G. Parekh

The Implications of First-Order Risk Aversion for Asset Market Risk Premiums (WP-94-22)
Geert Bekaert, Robert J. Hodrick and David A. Marshall

Asset Return Volatility with Extremely Small Costs of Consumption Adjustment (WP-94-23)
David A. Marshall

Indicator Properties of the Paper-Bill Spread: Lessons From Recent Experience (WP-94-24)
Benjamin M. Friedman and Kenneth N. Kuttner

Overtime, Effort and the Propagation of Business Cycle Shocks (WP-94-25)
George J.
Hall

Monetary policies in the early 1990s--reflections of the early 1930s (WP-94-26)
Robert D. Laurent

The Returns from Classroom Training for Displaced Workers (WP-94-27)
Louis S. Jacobson, Robert J. LaLonde and Daniel G. Sullivan

Is the Banking and Payments System Fragile? (WP-94-28)
George J. Benston and George G. Kaufman

Small Sample Properties of GMM for Business Cycle Analysis (WP-95-3)
Lawrence J. Christiano and Wouter den Haan