MODELING VOLATILITY DYNAMICS

by Francis X. Diebold and Jose A. Lopez

Federal Reserve Bank of New York Research Paper No. 9522
October 1995

This paper is being circulated for purposes of discussion and comment only. The contents should be regarded as preliminary and not for citation or quotation without permission of the author. The views expressed are those of the author and do not necessarily reflect those of the Federal Reserve Bank of New York or the Federal Reserve System.

Single copies are available on request to: Public Information Department, Federal Reserve Bank of New York, New York, NY 10045

Francis X. Diebold, Department of Economics, University of Pennsylvania, 3718 Locust Walk, Philadelphia, PA 19104-6297

Jose A. Lopez, Research and Market Analysis Group, Federal Reserve Bank of New York, 33 Liberty Street, New York, NY 10045

Print date: October 23, 1995

ABSTRACT: Many economic and financial time series have been found to exhibit dynamics in variance; that is, the second moment of the time series innovations varies over time. Many possible model specifications are available to capture this phenomenon, but to date, the class of models most widely used is that of autoregressive conditional heteroskedasticity (ARCH) models. ARCH models provide parsimonious approximations to volatility dynamics and have found wide use in macroeconomics and finance. The family of ARCH models is the subject of this paper. In section II, we sketch the rudiments of a rather general univariate time-series model, allowing for dynamics in both the conditional mean and variance. In section III, we provide motivation for the models. In section IV, we discuss the properties of the models in depth, and in section V, we discuss issues related to estimation and testing. In section VI, we detail various important extensions and applications of the model. We conclude in section VII with speculations on productive directions for future research.
Acknowledgements: The views expressed here are those of the authors and not those of the Federal Reserve Bank of New York or the Federal Reserve System. Helpful comments were received from Richard Baillie, Tim Bollerslev, Pedro de Lima, Wayne Ferson, Kevin Hoover, Peter Robinson and Til Schuermann. We gratefully acknowledge the support of the National Science Foundation, the Sloan Foundation, and the University of Pennsylvania Research Foundation.

I. Introduction

Good macroeconomic and financial theorists, like all good theorists, want to get the facts straight before theorizing; hence, the explosive growth in the methodology and application of time-series econometrics in the last 25 years. Many factors fueled that growth, ranging from important developments in related fields (e.g., Box and Jenkins, 1970) to dissatisfaction with the "incredible identifying restrictions" associated with traditional macroeconometric models (Sims, 1980) and the associated recognition that many tasks of interest, such as forecasting, simply do not require a structural model (e.g., Granger and Newbold, 1979). A short list of active subfields includes vector autoregressions, index and dynamic factor models, causality, integration and persistence, cointegration, seasonality, unobserved-components models, state-space representations and the Kalman filter, regime-switching models, nonlinear dynamics and optimal nonlinear filtering. Any such list must also include models of volatility dynamics. ARCH models, in particular, provide parsimonious approximations to volatility dynamics and have found wide use in macroeconomics and finance.¹ The family of ARCH models is the subject of this chapter.

Economists are typically introduced to heteroskedasticity in cross-sectional contexts, such as when the variance of a cross-sectional regression disturbance depends on one or more of the regressors.
A classic example is the estimation of Engel curves by weighted least squares, in light of the fact that the variance of the disturbance in an expenditure equation may depend on income. Heteroskedasticity is equally pervasive in the time-series contexts prevalent in macroeconomics and finance. For example, in Figures 1 and 2, we plot the log of daily Deutschemark/Dollar and Swiss Franc/Dollar spot exchange rates, as well as the daily returns and squared returns, 1974-1991. Volatility clustering (that is, contiguous periods of high or low volatility) is apparent. However, models of cross-sectional heteroskedasticity are not useful in such cases because they are not dynamic. ARCH models, on the other hand, were developed to model such time-series volatility fluctuations. Engle (1982) used them to model the variance of inflation, and more recently they have enjoyed widespread use in modeling asset return volatility.

Exhaustive surveys of the ARCH literature already exist, including Engle and Bollerslev (1986), Bollerslev, Chou and Kroner (1992), Bera and Higgins (1993) and Bollerslev, Engle and Nelson (1994), and it is not our intention to produce another. Rather, we shall provide a selective account of certain aspects of conditional volatility modeling that are of particular relevance in macroeconomics and finance. In section II, we sketch the rudiments of a rather general univariate time-series model, allowing for dynamics in both the conditional mean and variance. We introduce the ARCH and generalized ARCH (GARCH) models there. In section III, we provide motivation for the models. In section IV, we discuss the properties of the models in depth, and in section V, we discuss issues related to estimation and testing. In section VI, we detail various important extensions and applications of the model. We conclude in section VII with speculations on productive directions for future research.

II. A Time-series Model with Conditional Mean and Variance Dynamics

Wold's (1938) celebrated decomposition theorem establishes that any covariance stationary stochastic process $\{x_t\}$ may be written as the sum of a linearly deterministic component and a linearly indeterministic component with a square-summable, one-sided moving average representation.² We write $x_t = d_t + y_t$, where $d_t$ is linearly deterministic and $y_t$ is a linearly regular (or indeterministic) covariance stationary stochastic process (LRCSSP) given by

$$y_t = B(L)\varepsilon_t, \qquad B(L) = \sum_{i=0}^{\infty} b_i L^i, \qquad \sum_{i=0}^{\infty} b_i^2 < \infty, \qquad b_0 = 1,$$

$$E[\varepsilon_t \varepsilon_\tau] = \begin{cases} \sigma_\varepsilon^2 < \infty, & \text{if } t = \tau, \\ 0, & \text{otherwise.} \end{cases}$$

The uncorrelated innovation sequence $\{\varepsilon_t\}$ need not be Gaussian and therefore need not be independent. Non-independent innovations are characteristic of nonlinear time series in general and conditionally heteroskedastic time series in particular.

In this section, we introduce the ARCH process within Wold's framework by contrasting the polar extremes of the LRCSSP with independent and identically distributed (i.i.d.) innovations, which allows only conditional mean dynamics, and the pure ARCH process, which allows only conditional variance dynamics. We then combine these extremes to produce a generalized model that permits variation in both the first and second conditional moments. Finally, we introduce the Generalized ARCH (GARCH) process, which is very useful in practice.

A. Conditional Mean Dynamics

Suppose that $y_t$ is a LRCSSP with i.i.d., as opposed to merely white noise,³ innovations. The ability of the LRCSSP to capture conditional mean dynamics is the source of its power. The unconditional mean and variance are

$$E[y_t] = 0 \quad \text{and} \quad E[y_t^2] = \sigma_\varepsilon^2 \sum_{i=0}^{\infty} b_i^2,$$

which are both time-invariant. However, the conditional mean is time-varying and is given by

$$E[y_t \mid \Omega_{t-1}] = \sum_{i=1}^{\infty} b_i \varepsilon_{t-i},$$

where the information set is $\Omega_{t-1} = \{\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots\}$.

Because the volatility of many economic time series seems to vary, one would hope that the LRCSSP could capture conditional variance dynamics as well, but such is not the case for the model as presently specified. The conditional variance of $y_t$ is constant at

$$E\big[(y_t - E[y_t \mid \Omega_{t-1}])^2 \mid \Omega_{t-1}\big] = \sigma_\varepsilon^2.$$

This potentially unfortunate restriction manifests itself in the properties of the k-step-ahead conditional prediction error variance. The least squares forecast is the conditional expectation,

$$E[y_{t+k} \mid \Omega_t] = \sum_{i=0}^{\infty} b_{k+i}\, \varepsilon_{t-i},$$

and the associated prediction error is

$$y_{t+k} - E[y_{t+k} \mid \Omega_t] = \sum_{i=0}^{k-1} b_i\, \varepsilon_{t+k-i},$$

which has a conditional prediction error variance of

$$E\big[(y_{t+k} - E[y_{t+k} \mid \Omega_t])^2 \mid \Omega_t\big] = \sigma_\varepsilon^2 \sum_{i=0}^{k-1} b_i^2.$$

As $k \to \infty$, the conditional prediction error variance converges to the unconditional variance, $\sigma_\varepsilon^2 \sum_{i=0}^{\infty} b_i^2$. Note that for any k, the conditional prediction error variance depends only on k and not on $\Omega_t$; thus, readily available and potentially useful information is discarded.

B. Conditional Variance Dynamics

By way of contrast, we now introduce a pure ARCH process, which displays only conditional variance dynamics. With $y_t = \varepsilon_t$, we write

$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \gamma(L)\varepsilon_t^2,$$

$$\omega > 0, \qquad \gamma(L) = \sum_{i=1}^{\infty} \gamma_i L^i, \qquad \gamma_i \ge 0 \ \ \forall i, \qquad \gamma(1) < 1.$$

The process is parameterized in terms of the conditional density of $\varepsilon_t \mid \Omega_{t-1}$, which is assumed to be normal with a zero conditional mean and a conditional variance that depends linearly on past squared innovations. Note that even though the $\varepsilon_t$'s are serially uncorrelated, they are not independent. The stated conditions are sufficient to ensure that the conditional and unconditional variances are positive and finite as well as that $y_t$ is covariance stationary.

The unconditional moments are constant and are given by

$$E[y_t] = 0 \quad \text{and} \quad E\big[(y_t - E[y_t])^2\big] = \frac{\omega}{1 - \gamma(1)}.$$

As for the conditional moments, by construction, the conditional mean of the process is zero, and the conditional variance is potentially time-varying.
That is,

$$E[y_t \mid \Omega_{t-1}] = 0 \quad \text{and} \quad E\big[(y_t - E[y_t \mid \Omega_{t-1}])^2 \mid \Omega_{t-1}\big] = \omega + \gamma(L)\varepsilon_t^2.$$

C. Conditional Mean and Variance Dynamics

We can incorporate both conditional mean and conditional variance dynamics by introducing ARCH innovations into the standard LRCSSP. We write

$$y_t = B(L)\varepsilon_t, \qquad \varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \gamma(L)\varepsilon_t^2,$$

subject to the conditions discussed earlier. Both the unconditional mean and variance are constant; i.e.,

$$E[y_t] = 0 \quad \text{and} \quad E\big[(y_t - E[y_t])^2\big] = \frac{\omega}{1 - \gamma(1)} \sum_{i=0}^{\infty} b_i^2.$$

However, the conditional mean and variance are time-varying; i.e.,

$$E[y_t \mid \Omega_{t-1}] = \sum_{i=1}^{\infty} b_i \varepsilon_{t-i}, \qquad E\big[(y_t - E[y_t \mid \Omega_{t-1}])^2 \mid \Omega_{t-1}\big] = \omega + \gamma(L)\varepsilon_t^2.$$

Thus, this model treats the conditional mean and variance dynamics in a symmetric fashion by allowing for movement in each, a common characteristic of economic time series.

D. The Generalized ARCH Process

In the previous subsections, we used an infinite-ordered ARCH process to model conditional variance dynamics. We now introduce the GARCH process, which we shall subsequently focus on almost exclusively. The finite-ordered GARCH model approximates infinite-ordered conditional variance dynamics in the same way that finite-ordered ARMA models approximate infinite-ordered conditional mean dynamics.⁴ The GARCH(p,q) process, introduced by Bollerslev (1986), is given by

$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \alpha(L)\varepsilon_t^2 + \beta(L) h_t,$$

where $\alpha(L)$ and $\beta(L)$ are finite-order lag polynomials with no constant term, $\omega > 0$, all coefficients of $\alpha(L)$ and $\beta(L)$ are non-negative, and $\alpha(1) + \beta(1) < 1$. The stated conditions ensure that the conditional variance is positive and that $y_t$ is covariance stationary.⁵ The ARCH model of Engle (1982) emerges when $\beta(L) = 0$. If both $\alpha(L)$ and $\beta(L)$ are zero, then the model is simply i.i.d. noise with variance $\omega$. The GARCH(p,q) model can be represented as a restricted infinite-ordered ARCH model:

$$h_t = \frac{\omega}{1 - \beta(1)} + \frac{\alpha(L)}{1 - \beta(L)}\,\varepsilon_t^2 = \frac{\omega}{1 - \beta(1)} + \sum_{i=1}^{\infty} \delta_i\, \varepsilon_{t-i}^2.$$

The first two unconditional moments of the pure GARCH model are constant and given by

$$E[y_t] = 0 \quad \text{and} \quad E\big[(y_t - E[y_t])^2\big] = \frac{\omega}{1 - \alpha(1) - \beta(1)}.$$

The conditional moments are

$$E[y_t \mid \Omega_{t-1}] = 0 \quad \text{and} \quad E\big[(y_t - E[y_t \mid \Omega_{t-1}])^2 \mid \Omega_{t-1}\big] = \omega + \alpha(L)\varepsilon_t^2 + \beta(L) h_t.$$

III. Motivating GARCH Processes

GARCH models have been used extensively in macroeconomics and finance because of their attractive approximation-theoretic properties. However, these models do not arise directly from economic theory, and various efforts have been made to imbue them with economic rationale. Here, we discuss both approximation-theoretic and economic motivations for the GARCH framework.

A. Approximation-Theoretic Considerations

The primary and most powerful justification for the GARCH model is approximation-theoretic. That is, the GARCH model provides a flexible and parsimonious approximation to conditional variance dynamics, in exactly the same way that ARMA models provide a flexible and parsimonious approximation to conditional mean dynamics. In each case, an infinite-ordered distributed lag is approximated as the ratio of two finite, low-ordered lag operator polynomials. The power and usefulness of ARMA and GARCH models come entirely from the fact that ratios of such lag operator polynomials can accurately approximate a variety of infinite-ordered lag operator polynomials.⁶ In short, ARMA models with GARCH innovations offer a natural, parsimonious, and flexible way to capture the conditional mean and variance dynamics observed in a time series.

B. Economic Considerations

Economic considerations may also lead to GARCH effects, although the precise links have proved difficult to establish. Any of the myriad economic forces that produce persistence in economic dynamics may be responsible for the appearance of GARCH effects in volatility. In such cases, the persistence happens to be in the conditional second moment, rather than the first.

To take one example, conditional heteroskedasticity may arise in situations in which "economic time" and "calendar time" fail to move together. A well-known example from financial economics is the subordinated stochastic process model of Clark (1973).
In this model and its subsequent extensions, the number of trades occurring per unit of calendar time ($I_t$) is a random variable, and the price change per unit of calendar time ($\varepsilon_t$) is the sum of the $I_t$ intra-period price changes ($\delta_i$), which are assumed to be normally distributed:

$$\varepsilon_t = \sum_{i=1}^{I_t} \delta_i, \qquad \delta_i \overset{\text{i.i.d.}}{\sim} N(0, \eta).$$

Using a simple transformation, $\varepsilon_t$ can be written more directly as a function of $I_t$,

$$\varepsilon_t = (\eta I_t)^{1/2} z_t, \qquad z_t \overset{\text{i.i.d.}}{\sim} N(0, 1).$$

Thus, $\varepsilon_t$ is characterized by conditional heteroskedasticity linked to trading volume. If the number of trades per unit of calendar time displays serial correlation, as in Gallant, Hsieh and Tauchen (1991), the serial correlation induced in the conditional variance of returns (measured in calendar time) results in GARCH-like behavior.

Similar ideas arise in macroeconomics. The divergence between economic time and calendar time accords with the tradition of "phase-averaging" (e.g., Friedman and Schwartz, 1963) and is captured by the time-deformation models of Stock (1987, 1988).

Several other explanations for the existence of GARCH effects have been advanced, including parameter variation (Tsay, 1987), differences in the interpretability of information (Diebold and Nerlove, 1989), market microstructure (Bollerslev and Domowitz, 1991), and agents' "slow" adaptation to news (Brock and LeBaron, 1994). Currently, a consensus economic model producing persistence in conditional volatility does not exist, but it would be foolish to deny the existence of such persistence; measurement is simply ahead of theory.

IV. Properties of GARCH Processes

Here we highlight some important properties of GARCH processes. To facilitate the discussion, we generate a realization of a pure GARCH(1,1) process of length 500 that we will use repeatedly for illustration.⁷ The parameter values are $\omega = 1$, $\alpha = .2$ and $\beta = .7$, and the underlying shocks are N(0, 1).⁸ This parameterization delivers a persistent conditional variance and has finite unconditional variance and kurtosis.⁹
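As a rough illustration of how such a realization might be generated, the following Python sketch simulates a conditionally Gaussian GARCH(1,1) path with the parameter values given above. The function names, the burn-in length, the seed, and the choice to start the recursion at the unconditional variance are assumptions made for this sketch; the text does not say how the authors initialized their simulation. The kurtosis expression is the standard closed form for a conditionally Gaussian GARCH(1,1) and is included only as a check that the second and fourth moments are finite for this parameterization.

```python
import math
import random


def simulate_garch11(n, omega=1.0, alpha=0.2, beta=0.7, burn_in=500, seed=0):
    """Simulate eps_t = sqrt(h_t) * z_t, h_t = omega + alpha*eps_{t-1}^2 + beta*h_{t-1}.

    Assumption of this sketch: the recursion starts at the unconditional
    variance and the first `burn_in` draws are discarded.
    """
    rng = random.Random(seed)
    h = omega / (1.0 - alpha - beta)
    eps = math.sqrt(h) * rng.gauss(0.0, 1.0)
    eps_path, h_path = [], []
    for t in range(n + burn_in):
        h = omega + alpha * eps ** 2 + beta * h      # conditional variance for this period
        eps = math.sqrt(h) * rng.gauss(0.0, 1.0)     # N(0, 1) shock scaled by sqrt(h_t)
        if t >= burn_in:
            eps_path.append(eps)
            h_path.append(h)
    return eps_path, h_path


def unconditional_variance(omega, alpha, beta):
    """omega / (1 - alpha - beta), valid when alpha + beta < 1."""
    return omega / (1.0 - alpha - beta)


def unconditional_kurtosis(alpha, beta):
    """Kurtosis of a conditionally Gaussian GARCH(1,1); infinite if the fourth moment fails to exist."""
    persistence = alpha + beta
    denom = 1.0 - persistence ** 2 - 2.0 * alpha ** 2
    if denom <= 0.0:
        return math.inf
    return 3.0 * (1.0 - persistence ** 2) / denom


eps, h = simulate_garch11(500)
print(len(eps), round(unconditional_variance(1.0, 0.2, 0.7), 6),
      round(unconditional_kurtosis(0.2, 0.7), 2))
```

With these parameter values the unconditional variance is 10 and the kurtosis is roughly 5.18, so both moments are finite. A different seed or burn-in changes the simulated path but not these population quantities.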
We plot the realization and its first 25 sample autocorrelations in Figure 3. The sample autocorrelations are indicative of white noise, as expected.

A. The Conditional Variance is a Serially Correlated Random Variable

The conditional variance associated with the GARCH model is

$$h_t = \omega + \alpha(L)\varepsilon_t^2 + \beta(L) h_t.$$

Recall that the unconditional variance of the process is given by

$$\sigma_y^2 = \frac{\omega}{1 - \alpha(1) - \beta(1)}.$$

Replacing $\omega$ with $\sigma_y^2\,(1 - \alpha(1) - \beta(1))$ yields

$$h_t = \sigma_y^2\,(1 - \alpha(1) - \beta(1)) + \alpha(L)\varepsilon_t^2 + \beta(L) h_t,$$

so that

$$h_t - \sigma_y^2 = \alpha(L)\varepsilon_t^2 - \sigma_y^2\alpha(1) + \beta(L) h_t - \sigma_y^2\beta(1) = \alpha(L)\big(\varepsilon_t^2 - \sigma_y^2\big) + \beta(L)\big(h_t - \sigma_y^2\big).$$

Thus, the conditional variance is itself a serially correlated random variable. We plot the conditional variance of the simulated GARCH(1,1) process and its sample autocorrelation function in Figure 4. The high persistence of the conditional variance is due to the large sum of the coefficients, $\alpha + \beta = 0.90$.

B. $\varepsilon_t^2$ Has an ARMA Representation

If $\varepsilon_t$ is a GARCH(p,q) process, $\varepsilon_t^2$ has the ARMA representation

$$\varepsilon_t^2 = \omega + [\alpha(L) + \beta(L)]\varepsilon_t^2 - \beta(L)\nu_t + \nu_t,$$

where $\nu_t = \varepsilon_t^2 - h_t$ is the difference between the squared innovation and the conditional variance at time t. To see this, note that, by supposition, $h_t = \omega + \alpha(L)\varepsilon_t^2 + \beta(L) h_t$. Adding and subtracting $\beta(L)\varepsilon_t^2$ on the right side gives

$$h_t = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)\varepsilon_t^2 - \beta(L)\varepsilon_t^2 + \beta(L) h_t = \omega + [\alpha(L) + \beta(L)]\varepsilon_t^2 - \beta(L)[\varepsilon_t^2 - h_t].$$

Adding $\varepsilon_t^2$ to each side and rearranging gives

$$\varepsilon_t^2 = \omega + [\alpha(L) + \beta(L)]\varepsilon_t^2 - \beta(L)[\varepsilon_t^2 - h_t] + [\varepsilon_t^2 - h_t],$$

so that

$$\varepsilon_t^2 = \omega + [\alpha(L) + \beta(L)]\varepsilon_t^2 - \beta(L)\nu_t + \nu_t.$$

Thus, $\varepsilon_t^2$ is an ARMA([max(p,q)], p) process with innovation $\nu_t$, where $\nu_t \in [-h_t, \infty)$, and it is covariance stationary if the roots of $\alpha(L) + \beta(L) = 1$ are outside the unit circle.

The square of our GARCH(1,1) realization is presented in Figure 5; the persistence in $\varepsilon_t^2$, which is essentially a proxy for the unobservable $h_t$, is apparent. Differences in the behavior of $\varepsilon_t^2$ and $h_t$ are also apparent, however. In particular, $\varepsilon_t^2$ appears "noisy."
To see why, use the multiplicative form of the GARCH model, $\varepsilon_t = h_t^{1/2} z_t$ with $z_t \sim N(0, 1)$. It is easy to see that $\varepsilon_t^2$ is an unbiased estimator of $h_t$,

$$E[\varepsilon_t^2 \mid \Omega_{t-1}] = E[h_t \mid \Omega_{t-1}]\, E[z_t^2 \mid \Omega_{t-1}] = E[h_t \mid \Omega_{t-1}],$$

because $z_t^2 \mid \Omega_{t-1} \sim \chi^2_{(1)}$. However, because the median of a $\chi^2_{(1)}$ is .455,

$$P\big(\varepsilon_t^2 < \tfrac{1}{2} h_t \mid \Omega_{t-1}\big) > \tfrac{1}{2}.$$

Thus, the $\varepsilon_t^2$ proxy introduces a potentially significant error into the analysis of small samples of $h_t$, $t = 1, \ldots, T$, although the error diminishes as T increases.

C. The Conditional Prediction Error Variance Depends on the Conditioning Information Set

Because the conditional variance of a GARCH process is a serially correlated random variable, it is of interest to examine the optimal k-step-ahead prediction, prediction error and conditional prediction error variance. Immediately, the k-step-ahead prediction is $E[y_{t+k} \mid \Omega_t] = 0$, and the prediction error is

$$y_{t+k} - E[y_{t+k} \mid \Omega_t] = \varepsilon_{t+k}.$$

This implies that the conditional variance of the prediction error,

$$E\big[(y_{t+k} - E[y_{t+k} \mid \Omega_t])^2 \mid \Omega_t\big] = E[\varepsilon_{t+k}^2 \mid \Omega_t],$$

depends on both k and $\Omega_t$ because of the dynamics in the conditional variance. Simple calculations reveal that the expression for the GARCH(p,q) process is given by

$$E[\varepsilon_{t+k}^2 \mid \Omega_t] = \omega \sum_{i=0}^{k-2} [\alpha(1) + \beta(1)]^i + [\alpha(1) + \beta(1)]^{k-1} h_{t+1}.$$

In the limit, this conditional variance reduces to the unconditional variance of the process,

$$\lim_{k \to \infty} E[\varepsilon_{t+k}^2 \mid \Omega_t] = \frac{\omega}{1 - \alpha(1) - \beta(1)}.$$

For finite k, the dependence of the prediction error variance on the current information set $\Omega_t$ can be exploited to produce better interval forecasts, as illustrated in Figure 6 for k = 1. We plot the one-step-ahead 90% conditional and unconditional interval forecasts of our simulated GARCH(1,1) process along with the actual realization. We construct the conditional prediction intervals using the conditional variance

$$E[\varepsilon_{t+1}^2 \mid \Omega_t] = h_{t+1} = \omega + \alpha\varepsilon_t^2 + \beta h_t = 1 + .2\,\varepsilon_t^2 + .7\,h_t;$$

thus, the conditional prediction intervals are $\pm 1.64\, h_{t+1}^{1/2}$. The 90% unconditional interval, on the other hand, is simply $[F_{.05}, F_{.95}]$, where $F_\alpha$ denotes the $\alpha$ percentile of the unconditional distribution of the GARCH process. The ability of the conditional prediction intervals to adapt to changes in volatility is clear.

D. The Implied Unconditional Distribution Is Symmetric and Leptokurtic

The moment structure of GARCH processes is a complicated affair. In addition to the earlier-referenced surveys, Milhoj (1985) and Bollerslev (1988) are good sources. However, straightforward calculation reveals that the unconditional distribution of a GARCH process is symmetric and leptokurtic, a characteristic that agrees nicely with a variety of financial market data. The unconditional leptokurtosis of GARCH processes follows from the persistence in conditional variance, which produces the clusters of "low volatility" and "high volatility" episodes associated with observations in the center and in the tails of the unconditional distribution.

GARCH processes are not constrained to have finite unconditional moments, as shown in Bollerslev (1986). In fact, the only conditionally Gaussian GARCH process with unconditional moments of all orders occurs when $\alpha(L) = \beta(L) = 0$, which is the degenerate case of i.i.d. innovations. Otherwise, depending on the precise parameterization, unconditional moments will cease to exist beyond some point. For example, most parameter estimates for financial data indicate an infinite fourth moment, and some even indicate an infinite second moment. Our illustrative process has population mean 0, variance 10, skewness 0, and kurtosis 5.2.

E. Temporal Aggregation Produces Convergence to Normality

Convergence to normality under temporal aggregation is a key feature of much economic data and is also a property of covariance stationary GARCH processes.
The key insight is that a low-frequency change is simply the sum of the corresponding high-frequency changes; for example, an annual change is the sum of the internal quarterly changes, each of which is the sum of its internal monthly changes, and so forth. Thus, if a Gaussian central limit theorem can be invoked for sums of GARCH processes, convergence to normality under temporal aggregation is assured. Such theorems can be invoked so long as the process is covariance stationary, as shown by Diebold (1988) using a central limit argument from White (1984) that requires only the existence of an unconditional second moment. Drost and Nijman (1993) extend Diebold's result by showing that a particular generalization of the GARCH class is closed under temporal aggregation, and by characterizing the precise way in which temporal aggregation leads to reduced GARCH effects.¹⁰

V. Estimation and Testing of GARCH Models

Following the majority of the literature, we focus primarily on maximum-likelihood estimation (MLE) and associated testing procedures.¹¹

A. Approximate Maximum Likelihood Estimation

As always, the likelihood function is simply the joint density of the observations,

$$L(\theta; y_1, \ldots, y_T) = f(y_1, \ldots, y_T; \theta).$$

This joint density is non-Gaussian and does not have a known closed-form expression, but it can be factored into the product of conditional densities,

$$L(\theta; y_1, \ldots, y_T) = f(y_T \mid \Omega_{T-1}; \theta)\, f(y_{T-1} \mid \Omega_{T-2}; \theta) \cdots f(y_{p+1} \mid \Omega_p; \theta)\, f(y_p, \ldots, y_1; \theta),$$

where, if the conditional densities are Gaussian,

$$f(y_t \mid \Omega_{t-1}; \theta) = \frac{1}{\sqrt{2\pi}}\, h_t(\theta)^{-1/2} \exp\!\left(-\frac{1}{2}\, \frac{y_t^2}{h_t(\theta)}\right).$$

The $f(y_p, \ldots, y_1; \theta)$ term is often ignored because a closed-form expression for it does not exist and because its deletion is asymptotically inconsequential. Thus, the approximate log likelihood is

$$\ln L(\theta; y_{p+1}, \ldots, y_T) = -\frac{T-p}{2}\ln(2\pi) - \frac{1}{2}\sum_{t=p+1}^{T} \ln h_t(\theta) - \frac{1}{2}\sum_{t=p+1}^{T} \frac{y_t^2}{h_t(\theta)}.$$

It may be maximized numerically using iterative procedures and is easily generalized to models richer than the pure univariate GARCH process, such as regression models with GARCH disturbances. In that case, the likelihood is the same with $\varepsilon_t = y_t - E[y_t \mid \Omega_{t-1}; \theta]$ in place of $y_t$. The unobserved conditional variances $\{h_t(\theta)\}_{t=p+1}^{T}$ that enter the likelihood function are calculated at iteration j using $\theta^{(j-1)}$, the estimated parameter vector at iteration j-1. The necessary initial values of the conditional variance are set at the first iteration to the sample variance of the observed data and at all subsequent iterations to the sample variance of a simulated realization with parameters $\theta^{(j-1)}$.

The assumption of conditional normality is not always appropriate. Nevertheless, Weiss (1986) and Bollerslev and Wooldridge (1992) show that even when normality is inappropriately assumed, the resulting quasi-MLE estimates are asymptotically normally distributed and consistent if the conditional mean and variance functions are specified correctly. Bollerslev and Wooldridge (1992), moreover, derive asymptotic standard errors for the quasi-MLE estimates that are robust to conditional non-normality and are easily calculated as functions of the estimated parameters and the first derivatives of the conditional mean and variance functions.

B. Exact Maximum Likelihood Estimation

Diebold and Schuermann (1993) propose a numerical procedure for constructing the exact likelihood function of an ARCH process using simulation techniques in conjunction with nonparametric density estimation, thereby retaining the information contained in $\{y_p, \ldots, y_1\}$.¹² Consider the ARCH(p) process, $y_t = \varepsilon_t$, where

$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \alpha_1\varepsilon_{t-1}^2 + \cdots + \alpha_p\varepsilon_{t-p}^2,$$

$\omega > 0$, $\alpha_i \ge 0\ \ \forall\, i = 1, \ldots, p$, and $\sum_{i=1}^{p} \alpha_i < 1$.
The conditional normality assumption is adopted only because it is the most common; alternative distributions can be used with no change in the procedure. Let $\theta = (\omega, \alpha_1, \ldots, \alpha_p)'$.

The initial likelihood term $f(y_p, \ldots, y_1; \theta)$ for any given parameter configuration $\theta$ is simply the unconditional density of the first p observations evaluated at $\{y_p, \ldots, y_1\}$, which can be estimated to any desired degree of accuracy using well-known techniques of simulation and consistent nonparametric density estimation. At any iteration j, a current "best guess" of the parameter vector $\theta^{(j)}$ exists. Therefore, a very long realization of the process with parameter vector $\theta^{(j)}$ can be simulated and the value of the joint unconditional density evaluated at $\{y_p, \ldots, y_1\}$ can be consistently estimated and denoted as $\hat{f}(y_p, \ldots, y_1; \theta^{(j)})$. This estimated unconditional density can then be substituted into the likelihood where the true unconditional density appears. By simulating a large sample, the difference between $\hat{f}(y_p, \ldots, y_1; \theta^{(j)})$ and $f(y_p, \ldots, y_1; \theta^{(j)})$ is made arbitrarily small, given the consistency of the density estimation technique. The full conditionally Gaussian likelihood, evaluated at $\theta^{(j)}$, is then

$$L(\theta^{(j)}; y_1, \ldots, y_T) \approx \hat{f}(y_p, \ldots, y_1; \theta^{(j)}) \prod_{t=p+1}^{T} \frac{1}{\sqrt{2\pi}}\, h_t(\theta^{(j)})^{-1/2} \exp\!\left[-\frac{1}{2}\, \frac{y_t^2}{h_t(\theta^{(j)})}\right],$$

which may be maximized with respect to $\theta$ using standard numerical techniques.

C. Testing

Standard likelihood-ratio procedures may be used to test the hypothesis that no ARCH effects are present in a time series, but the numerical estimation required under the ARCH alternative makes that a rather tedious approach. Instead, the Lagrange-multiplier (LM) approach, which requires estimation only under the null, is preferable. Engle (1982) proposes a simple LM test for ARCH under the assumption of conditional normality that involves only a least-squares regression of squared residuals on an intercept and lagged squared residuals.
Under the null of no ARCH, $TR^2$ from that regression is asymptotically distributed as $\chi^2_{(q)}$, where q is the number of lagged squared residuals included in the regression.

A minor limitation of the LM test for ARCH is the underlying assumption of conditional normality, which is sometimes restrictive.¹³ A more important limitation is that it is difficult to generalize to the GARCH case. Lee (1991) and Lee and King (1993) present such a generalization, but as discussed in Bollerslev, Engle and Nelson (1994), the GARCH parameters cannot be separately identified in models close to the null: the LM test for GARCH(1,1) is the same as that for ARCH(1).

Thus, less formal diagnostics are often used, such as the sample autocorrelation function of squared residuals. McLeod and Li (1983) show that under the null hypothesis of no non-linear dependence among the residuals from an ARMA model, the vector of normalized sample autocorrelations of the squared residuals,

$$\sqrt{T}\,\hat{\rho}_{\varepsilon^2}(\tau) = \sqrt{T}\; \frac{\sum_{t=\tau+1}^{T} (\hat{\varepsilon}_t^2 - \hat{\sigma}^2)(\hat{\varepsilon}_{t-\tau}^2 - \hat{\sigma}^2)}{\sum_{t=1}^{T} (\hat{\varepsilon}_t^2 - \hat{\sigma}^2)^2},$$

where $\hat{\sigma}^2$ is the estimated residual variance and $\tau = 1, \ldots, m$, is asymptotically distributed as a multivariate normal with a zero mean and a unit covariance matrix. Moreover, the associated Ljung-Box statistic,

$$Q_{\varepsilon^2}(m) = T(T+2) \sum_{\tau=1}^{m} \frac{\hat{\rho}_{\varepsilon^2}(\tau)^2}{T - \tau},$$

is asymptotically $\chi^2_{(m)}$ under the null. If the null is rejected, then non-linear dependence, such as GARCH, may be present.¹⁴

After fitting a GARCH model, it is often of interest to test the null hypothesis that the standardized residuals are conditionally homoskedastic. Bollerslev and Mikkelsen (1993) argue that one may use the Ljung-Box statistic on the squared standardized residual autocorrelations, but that the significance of the statistic should be tested using a $\chi^2_{(m-k)}$ distribution, where k is the number of estimated GARCH parameters. This adjustment is necessary due to the deflation associated with fitting the conditional variance model.
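The two diagnostics just described can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, the LM statistic is coded only for the one-lag case (where $R^2$ from a simple regression equals the squared sample correlation), and the toy residual series is a deterministic construction chosen so that volatility clustering is obvious. The critical values quoted in the final comment are the usual 5% points of the $\chi^2_{(5)}$ and $\chi^2_{(1)}$ distributions.

```python
def squared_autocorrelations(resid, m):
    """Sample autocorrelations of squared residuals at lags 1..m."""
    n = len(resid)
    sq = [e * e for e in resid]
    sigma2 = sum(sq) / n                       # estimated residual variance
    dev = [s - sigma2 for s in sq]
    denom = sum(d * d for d in dev)
    if denom == 0.0:                           # constant squares: nothing to correlate
        return [0.0] * m
    return [sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
            for lag in range(1, m + 1)]


def ljung_box_squares(resid, m):
    """Ljung-Box statistic computed from the squared-residual autocorrelations."""
    n = len(resid)
    rho = squared_autocorrelations(resid, m)
    return n * (n + 2) * sum(r * r / (n - lag) for lag, r in enumerate(rho, start=1))


def engle_lm_arch1(resid):
    """T * R^2 from regressing e_t^2 on a constant and e_{t-1}^2 (one lag only)."""
    sq = [e * e for e in resid]
    x, y = sq[:-1], sq[1:]
    n = len(y)                                 # observations used in the regression
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return n * sxy * sxy / (sxx * syy)


# Toy residuals: a calm regime followed by a turbulent one (illustrative only)
calm = [(-1.0) ** t for t in range(100)]
turbulent = [3.0 * (-1.0) ** t for t in range(100)]
resid = calm + turbulent

q_stat = ljung_box_squares(resid, 5)
lm_stat = engle_lm_arch1(resid)
# Compare with the 5% critical values 11.07 (chi-square, 5 df) and 3.84 (chi-square, 1 df)
print(q_stat > 11.07, lm_stat > 3.84)
```

When the same Ljung-Box calculation is applied to squared standardized residuals from a fitted GARCH model, the degrees of freedom used for the critical value would be reduced to m - k, as described above.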
A related testing issue concerns the effect of GARCH innovations on tests for other deviations from classical behavior. Diebold (1987, 1988) examines the impact of GARCH effects on two standard serial correlation diagnostics, the Bartlett standard errors and the Ljung-Box statistic. As is well-known, in the large-sample Gaussian white-noise case,

$$\hat{\rho}(\tau) \overset{\text{i.i.d.}}{\sim} N\!\left(0, \frac{1}{T}\right), \qquad \tau = 1, 2, \ldots$$

and

$$Q(m) = T(T+2) \sum_{\tau=1}^{m} \frac{1}{T - \tau}\, \hat{\rho}(\tau)^2 \overset{a}{\sim} \chi^2_{(m)},$$

where $\hat{\rho}(\tau)$ denotes the sample autocorrelation at lag $\tau$. In the GARCH case, however, an adjustment must be made,

$$\hat{\rho}(\tau) \overset{\text{i.i.d.}}{\sim} N\!\left(0, \frac{1}{T}\left(1 + \frac{\gamma_{y^2}(\tau)}{\sigma^4}\right)\right), \qquad \tau = 1, 2, \ldots,$$

where $\gamma_{y^2}(\tau)$ denotes the autocovariance function of $y_t^2$ at lag $\tau$ and $\sigma^4$ is the squared unconditional variance of $y_t$. The adjustment is largest for small $\tau$ and decreases monotonically as $\tau \to \infty$ if the process is covariance stationary. Similarly, the robust Ljung-Box statistic is

$$Q(m) = T(T+2) \sum_{\tau=1}^{m} \frac{1}{T - \tau} \left(\frac{\sigma^4}{\sigma^4 + \gamma_{y^2}(\tau)}\right) \hat{\rho}(\tau)^2 \overset{a}{\sim} \chi^2_{(m)}.$$

The formulae are made operational by replacing the unknown population parameters with the usual consistent estimators.

It is important to note that the standard error adjustment serves to increase the standard errors; failure to perform the adjustment results in standard error bands that are "too tight." Similarly, failure to adjust the Ljung-Box statistic causes empirical test size to be larger than nominal size, often much larger, due to the cumulation of distortions through summation. Thus, failure to use serial correlation diagnostics that are robust to GARCH effects may produce a spurious impression of serial correlation.

A more general approach that yields robust sample autocovariances and related statistics is obtained by adopting a generalized method of moments (GMM) perspective, as proposed by West and Cho (1994).¹⁵ Define

$$X_t = (\varepsilon_t^2,\ \varepsilon_t\varepsilon_{t-1},\ \ldots,\ \varepsilon_t\varepsilon_{t-m})', \qquad \theta = (E[\varepsilon_t^2],\ E[\varepsilon_t\varepsilon_{t-1}],\ \ldots,\ E[\varepsilon_t\varepsilon_{t-m}])'$$

and $g_t(\theta) = X_t - \theta$ as $((m+1) \times 1)$ vectors and $\hat{\theta}_{GMM}$ as the value of $\theta$ that satisfies the condition

$$\frac{1}{T}\sum_{t=1}^{T} g_t(\hat{\theta}_{GMM}) = 0.$$

Note that, because there are as many parameters being estimated as there are orthogonality conditions, GMM simply yields the standard point estimates of the autocovariances. Their standard errors and related test statistics are asymptotically robust, because as shown by Hansen (1982) under general conditions allowing for heteroskedasticity and serial correlation of unknown form,

$$\sqrt{T}\,(\hat{\theta}_{GMM} - \theta) \overset{a}{\sim} N(0, V),$$

where

$$V = \left\{ E\!\left[\frac{\partial g_t(\hat{\theta}_{GMM})}{\partial \theta'}\right]' S^{-1}\, E\!\left[\frac{\partial g_t(\hat{\theta}_{GMM})}{\partial \theta'}\right] \right\}^{-1}$$

and S is the spectral density matrix of $g_t(\theta)$ at frequency zero. This expression for V is made operational by replacing all population objects with consistent estimates. The GMM-estimated autocovariances of $y_t$ and their standard errors will be robust to possible conditional heteroskedasticity in $\varepsilon_t$, as will the Ljung-Box statistic computed using the GMM-estimated autocovariances.

VI. Applications and Extensions

There are numerous applications and extensions of the basic GARCH model. In this section, we highlight those that we judge most important in macroeconomic and financial contexts. It is natural to discuss applications and extensions simultaneously because many of the extensions are motivated by applications.

A. Functional Form and Density Form

Numerous alternative functional forms for the conditional variance have been suggested in the literature.¹⁶ One of the most interesting is Nelson's (1991) exponential GARCH(p,q) or EGARCH(p,q) model,

$$y_t = \varepsilon_t = h_t^{1/2} z_t, \qquad z_t \overset{\text{i.i.d.}}{\sim} N(0, 1),$$

$$\ln(h_t) = \omega + \sum_{i=1}^{q} \alpha_i\, g(z_{t-i}) + \sum_{i=1}^{p} \beta_i \ln(h_{t-i}),$$

$$g(z_t) = \theta z_t + \gamma\big(|z_t| - E[|z_t|]\big).$$

The log specification ensures that the conditional variance is positive, and the model allows for an asymmetric response to the $z_t$ innovations depending on their sign. Thus, the effect of a negative innovation on volatility may differ from that of a positive innovation.
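The asymmetry built into the EGARCH specification can be made concrete with a short Python sketch. It evaluates the function g(z) for positive and negative shocks of equal size and applies a single EGARCH(1,1) update of the log variance. The parameter values (theta = -0.1, gamma = 0.2) and the helper names are hypothetical choices made for illustration and are not taken from the text; the only fact used beyond the equations above is that E|z| equals the square root of 2/pi for a standard normal z.

```python
import math

E_ABS_Z = math.sqrt(2.0 / math.pi)   # E|z| when z ~ N(0, 1)


def g(z, theta, gamma):
    """EGARCH response function g(z) = theta*z + gamma*(|z| - E|z|)."""
    return theta * z + gamma * (abs(z) - E_ABS_Z)


def egarch11_next_variance(h_prev, z_prev, omega, alpha, beta, theta, gamma):
    """One EGARCH(1,1) step: ln h_t = omega + alpha*g(z_{t-1}) + beta*ln h_{t-1}."""
    log_h = omega + alpha * g(z_prev, theta, gamma) + beta * math.log(h_prev)
    return math.exp(log_h)                     # exponentiating keeps h_t positive


theta, gamma = -0.1, 0.2                       # hypothetical values; theta < 0 mimics a leverage-type asymmetry
for z in (-2.0, -1.0, 1.0, 2.0):
    h_next = egarch11_next_variance(1.0, z, omega=0.0, alpha=1.0, beta=0.9,
                                    theta=theta, gamma=gamma)
    print(z, round(g(z, theta, gamma), 4), round(h_next, 4))
```

For shocks of equal magnitude, g(-z) - g(z) = -2 theta z, so with a negative theta the negative shock raises the next-period variance by more than the positive one. No sign restrictions on the parameters are needed for positivity, since the variance is obtained by exponentiation.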
This allowance for asymmetric response has proved useful for modeling the "leverage effect" in the stock market described by Black (1976).¹⁷

With respect to density form, non-Gaussian conditional distributions are easily incorporated into the GARCH model. This is important, because it is commonly found that the Gaussian GARCH model does not explain all of the leptokurtosis in asset returns. With this in mind, Bollerslev (1987) proposes a conditionally Student-t GARCH model, in which the degrees-of-freedom parameter is treated as another parameter to be estimated. Alternatively, Engle and Gonzalez-Rivera (1991) propose a semiparametric methodology in which the conditional variance function is parametrically specified in the usual fashion, but the conditional density is estimated nonparametrically.

B. GARCH-M: Time-Varying Risk Premia

Consider a regression model with GARCH disturbances of the usual sort, with one additional twist: the conditional variance enters as a regressor, thereby affecting the conditional mean. Write the model as

$$y_t = x_t'\beta + \gamma h_t + \varepsilon_t, \qquad \varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t).$$

This GARCH-in-Mean (GARCH-M) model is useful in modeling the relationship between risk and return when risk (as measured by the conditional variance) varies. Engle, Lilien and Robins (1987) introduce the model and use it to examine time-varying risk premia in the term structure of interest rates.

C. IGARCH: Persistence in Variance

A special case of the GARCH model is the integrated GARCH (IGARCH) model, introduced by Engle and Bollerslev (1986). A GARCH(p,q) process is integrated of order one in variance if $1 - \alpha(L) - \beta(L) = 0$ has a root on the unit circle. The IGARCH process is potentially important because, as an empirical matter, GARCH roots near unity are common in high-frequency financial data. The earlier ARMA result for the squared GARCH process now becomes an ARIMA result for the squared IGARCH process.
As before,

$$\varepsilon_t^2 = \omega + [\alpha(L) + \beta(L)]\,\varepsilon_t^2 - \beta(L)\nu_t + \nu_t;$$

thus,

$$[1 - \alpha(L) - \beta(L)]\,\varepsilon_t^2 = \omega - \beta(L)\nu_t + \nu_t.$$

When the autoregressive polynomial contains a unit root, it can be rewritten as

$$[1 - \alpha(L) - \beta(L)]\,\varepsilon_t^2 = \phi(L)(1 - L)\,\varepsilon_t^2 = \omega - \beta(L)\nu_t + \nu_t.$$

Thus, the differenced squared process is of stationary ARMA form. Unlike the conditional prediction error variance for the covariance stationary GARCH process, the IGARCH conditional prediction error variance does not converge as the forecast horizon lengthens; instead, it grows linearly with the length of the forecast horizon. Formally, in the IGARCH(1,1) case,

$$E[\varepsilon_{t+k}^2 \mid \Omega_t] = (k-1)\,\omega + h_{t+1},$$

so that

$$\lim_{k \to \infty} E[\varepsilon_{t+k}^2 \mid \Omega_t] = \infty.$$

Thus, the IGARCH process has an infinite unconditional variance.

Clearly, a parallel exists between the IGARCH process and the vast literature on unit roots in conditional mean dynamics (see Stock, 1994). This parallel, however, is partly superficial. In particular, Nelson (1990b) shows that the IGARCH(1,1) process (with $\omega > 0$) is nevertheless strictly stationary and ergodic, which leads one to suspect that likelihood-based inference may proceed in the standard fashion. This conjecture is verified in the theoretical and Monte Carlo work of Lee and Hansen (1994) and Lumsdaine (1992, 1995).

Although conditional variance dynamics are often empirically found to be highly persistent, it is difficult to ascertain whether they are actually integrated. (Again, this difficulty parallels the unit root literature.) Circumstantial evidence against IGARCH arises from several sources, such as temporal aggregation. Little is known about the temporal aggregation of IGARCH processes, but due to the infinite unconditional second moment, we conjecture that a Gaussian central limit theorem is unattainable. (To the best of our knowledge, no existing Gaussian central limit theorems are applicable.)
If so, this bodes poorly for the IGARCH model, because actual series displaying GARCH effects seem to approach normality when temporally aggregated. It would then appear likely that highly persistent covariance-stationary GARCH models, not IGARCH models, provide a better approximation to conditional variance dynamics.

The possibility also arises that some findings of IGARCH may be due to misspecification of the conditional variance function. In particular, Diebold (1986) suggests that the appearance of IGARCH could be an artifact resulting from failure to allow for structural breaks in the unconditional variance, if in fact such breaks exist. This is borne out in various contexts by Lastrapes (1989), Lamoureux and Lastrapes (1990), and Hamilton and Susmel (1994). Accordingly, Chu (1993) suggests procedures for testing parameter instability in GARCH models.

D. Stochastic Volatility Models

A simple first-order stochastic volatility model is given by

$$\varepsilon_t = \sigma_t z_t = \exp\left(\frac{h_t}{2}\right) z_t, \qquad z_t \sim N(0, 1),$$

$$h_t = \omega + \beta h_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2).$$

Thus, as opposed to standard GARCH models, $h_t$ is not deterministic conditional on $\Omega_{t-1}$; the (log) variance $h_t$ evolves as a first-order autoregressive process driven by a separate innovation. Moreover, the exponential specification ensures that the conditional variance remains positive. It is clear that the stochastic volatility model is intimately related to Clark's (1973) subordinated stochastic process model; in fact, for all practical purposes, it is Clark's model. For further details, see Harvey, Ruiz and Shephard (1994), and for alternative approaches to estimation, which can be challenging, see Jacquier, Polson and Rossi (1994) and Kim and Shephard (1994). Although there has been substantial recent interest in stochastic volatility models, their empirical success relative to GARCH models has yet to be established.

E. Multivariate GARCH Models

Cross-variable interactions are key in macroeconomics and finance.
Multivariate GARCH models are used to capture cross-variable conditional volatility interactions. The first multivariate GARCH model, developed by Kraft and Engle (1982), is a multivariate generalization of the pure ARCH model. The multivariate GARCH(p,q) model is proposed in Bollerslev, Engle and Wooldridge (1988). The N-dimensional Gaussian GARCH(p,q) process is

$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, H_t),$$

where $H_t$ is the $(N \times N)$ conditional covariance matrix given by

$$\mathrm{vech}(H_t) = W + \sum_{i=1}^{p} A_i\, \mathrm{vech}(\varepsilon_{t-i}\varepsilon_{t-i}') + \sum_{j=1}^{q} B_j\, \mathrm{vech}(H_{t-j}),$$

vech(.) is the vector-half operator that converts $(N \times N)$ matrices into $(N(N+1)/2 \times 1)$ vectors of their lower triangular elements, $W$ is an $(N(N+1)/2 \times 1)$ parameter vector, and $A_i$ and $B_j$ are $((N(N+1)/2) \times (N(N+1)/2))$ parameter matrices.

Likelihood-based estimation and inference are conceptually straightforward and parallel the univariate case. The approximate log likelihood function for the conditionally Gaussian multivariate GARCH(p,q) process, aside from a constant, is

$$\ln L(\theta; \varepsilon_1, \ldots, \varepsilon_T) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \ln|H_t| + \varepsilon_t' H_t^{-1} \varepsilon_t \right].$$

In practice, however, two complications arise. First, the conditions needed to ensure that $H_t$ is positive definite are complex and difficult to verify. Second, the model lacks parsimony; an unrestricted parameterization of $H_t$ is too profligate to be of much empirical use. As written above, the model has $(N(N+1)/2)[1 + (p+q)N(N+1)/2] = O(N^4)$ parameters, which makes numerical maximization of the likelihood function extremely difficult, even for low values of N, p and q.

Various strategies have been proposed to deal with the positive definiteness and parsimony complications. Engle and Kroner (1993) propose restrictions that guarantee positive definiteness without entirely ignoring cross-variable interactions. Bollerslev, Engle and Wooldridge (1988) enforce further parsimony by requiring that the $A_i$ and $B_j$ matrices be diagonal, reducing the number of parameters to $(N(N+1)/2)[1 + p + q] = O(N^2)$.
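To give a sense of scale, the two parameter counts just stated can be evaluated directly. The short sketch below simply codes the formulas from the text; the function names are hypothetical and the values of N, p and q are chosen only for illustration.

```python
def vech_garch_param_count(n, p, q):
    """Unrestricted vech model: (N(N+1)/2) * [1 + (p+q) * N(N+1)/2] parameters."""
    m = n * (n + 1) // 2          # length of vech(H_t)
    return m * (1 + (p + q) * m)

def diagonal_garch_param_count(n, p, q):
    """Diagonal A_i and B_j matrices: (N(N+1)/2) * [1 + p + q] parameters."""
    m = n * (n + 1) // 2
    return m * (1 + p + q)

# GARCH(1,1) dynamics for systems of 2 and 5 series
print(vech_garch_param_count(2, 1, 1), diagonal_garch_param_count(2, 1, 1))  # 21 9
print(vech_garch_param_count(5, 1, 1), diagonal_garch_param_count(5, 1, 1))  # 465 45
```

Even with N = 5 and GARCH(1,1) dynamics, the unrestricted model already carries 465 parameters against 45 for the diagonal model.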
However, the parsimony of this "diagonal" model comes at potentially high cost, because much of the potential cross-variable volatility interaction, a key point of multivariate analysis, is assumed away.

F. Common Volatility Patterns: Multivariate Models With Factor Structure

Multivariate models with factor structure, such as the latent-factor GARCH model (Diebold and Nerlove, 1989) and the factor GARCH model (Engle, 1987 and Bollerslev and Engle, 1993), capture the idea of commonality of volatility shocks, which appears empirically relevant in systems of asset returns in the stock, foreign exchange, and bond markets.¹⁸ Models with factor structure are also parsimonious and are easily constrained to maintain positive definiteness of the conditional covariance matrix.

In the latent-factor model, movements in each of the N time series are driven by an idiosyncratic shock and a set of k < N common latent shocks or "factors".¹⁹ The latent factors display GARCH effects, whereas the idiosyncratic shocks are i.i.d. and orthogonal at all leads and lags. The one-factor model is important in practice, and we describe it in some detail. The model is written as

$$\varepsilon_t = \lambda F_t + \nu_t,$$

where $\varepsilon_t$, $\lambda$ and $\nu_t$ are $(N \times 1)$ vectors and $F_t$ is a scalar. $F_t$ and $\nu_t$ have zero conditional means and are orthogonal at all leads and lags. The factor $F_t$ follows a GARCH(p,q) process,

$$F_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \alpha(L)F_t^2 + \beta(L)h_t,$$

so that the conditional distribution of the observed vector is

$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, H_t), \qquad H_t = \lambda\lambda' h_t + \Gamma,$$

where $\Gamma = \mathrm{cov}(\nu_t) = \mathrm{diag}(\gamma_1, \ldots, \gamma_N)$. Thus, the $j$th time-$t$ conditional variance is

$$H_{jj,t} = \lambda_j^2 h_t + \gamma_j = \lambda_j^2 \left( \omega + \sum_{i=1}^{p} \alpha_i F_{t-i}^2 + \sum_{i=1}^{q} \beta_i h_{t-i} \right) + \gamma_j,$$

and the $(j,k)$th time-$t$ conditional covariance is

$$H_{jk,t} = \lambda_j \lambda_k h_t = \lambda_j \lambda_k \left( \omega + \sum_{i=1}^{p} \alpha_i F_{t-i}^2 + \sum_{i=1}^{q} \beta_i h_{t-i} \right).$$

Note that the latent factor $F_t$ is unobservable and not directly included in $\Omega_{t-1} = \{\varepsilon_{t-1}, \ldots, \varepsilon_1\}$. Effectively, the latent-factor model is a stochastic volatility model.

In general, the number of parameters in the k-factor model is $N(k+1) + k^2(1+p+q) = O(N)$, so the number of parameters in the one-factor case is $2N + (1+p+q)$, a drastic reduction relative to the general multivariate case. Moreover, the conditional covariance matrix is guaranteed to be positive definite, so long as the conditional variances of the common and idiosyncratic factors are constrained to be positive.

A simulated realization from a bivariate model with one common GARCH(1,1) factor is shown in Figures 7-9. The model is parameterized as

$$h_t = 1 + .2F_{t-1}^2 + .7h_{t-1}, \qquad (\nu_{1t}, \nu_{2t})' \overset{i.i.d.}{\sim} N(0, I).$$

The realization of the common factor underlying the system is precisely the one presented in our earlier discussion of univariate GARCH models. The latent-factor GARCH series exhibit the volatility clustering present in the common factor. As before, the squared realizations of the two series indicate a degree of persistence in volatility. Furthermore, as expected, the conditional second moments of the two series are similar to that of $F_t$ because, as shown above, they are simply multiples of $h_t$ (shifted, in the case of the variances, by the constant idiosyncratic variances).

Diebold and Nerlove (1989) suggest a two-step estimation procedure. The first step entails performing a standard factor analysis; i.e., factoring the unconditional covariance matrix as $H = \lambda\lambda'\sigma^2 + \Gamma$, where $\sigma^2$ is the unconditional variance of $F_t$, and extracting an estimate of the time series of factor values $\{\hat F_t\}_{t=1}^{T}$. The second step entails estimating the latent-factor GARCH model treating the extracted factor series $\hat F_t$ as if it were the actual series $F_t$.

The Diebold-Nerlove procedure is clearly suboptimal relative to fully simultaneous maximum likelihood estimation, because the $\hat F_t$ series is not equal to the $F_t$ series, even asymptotically. Harvey, Ruiz and Sentana (1992) provide a better approximation to the exact likelihood function that involves a correction factor to account for the fact that the $F_t$ series is unobservable.²⁰ For example, using an ARCH(1) specification, the conditional variance of the latent factor $F_t$ in the Diebold-Nerlove model is

$$h_t = \mathrm{var}(F_t \mid \Omega_{t-1}) = \omega + \alpha F_{t-1}^2 = \omega + \alpha E[F_{t-1}^2 \mid \Omega_{t-1}].$$

Using the identity $F_{t-1} = \hat F_{t-1} + (F_{t-1} - \hat F_{t-1})$,

$$E[F_{t-1}^2 \mid \Omega_{t-1}] = E\left[\left(\hat F_{t-1} + (F_{t-1} - \hat F_{t-1})\right)^2 \mid \Omega_{t-1}\right] = E[\hat F_{t-1}^2 \mid \Omega_{t-1}] + p_{t-1} = \hat F_{t-1}^2 + p_{t-1},$$

where $p_{t-1}$ is the correction factor. Thus, $h_t$ is expressed as $h_t = \omega + \alpha\{\hat F_{t-1}^2 + p_{t-1}\}$. The correction factor can be constructed using the appropriate elements in the conditional covariance matrix of the state vector estimated by the Kalman filter.

Finally, we note that recently developed Markov-chain Monte Carlo techniques facilitate exact maximum-likelihood estimation of the latent-factor model (or, more precisely, approximate maximum-likelihood estimation with the crucial distinction that the approximation error is under the user's control and can be made as small as desired). For details see Kim and Shephard (1994).

G. Optimal Prediction Under Asymmetric Loss

Volatility forecasts are readily generated from GARCH models and used for a variety of purposes, such as producing improved interval forecasts, as discussed previously. Less obvious but equally true is the fact that, under asymmetric loss, volatility dynamics can be exploited to produce improved point forecasts, as shown by Christoffersen and Diebold (1994). If, for example, $y_{t+k}$ is normally distributed with conditional mean $\mu_{t+k|\Omega_t}$ and conditional variance $h_{t+k|\Omega_t}$, and $L(e_{t+k})$ is any loss function defined on the k-step-ahead prediction error $e_{t+k} = y_{t+k} - \hat y_{t+k}$, then the optimal predictor is $\hat y_{t+k} = \mu_{t+k|\Omega_t} + \alpha_t$, where $\alpha_t$ depends only on the loss function and the conditional prediction error variance $\mathrm{var}(e_{t+k} \mid \Omega_t) = \mathrm{var}(y_{t+k} \mid \Omega_t) = h_{t+k|\Omega_t}$. The optimal predictor under asymmetric loss is not the conditional mean, but rather the conditional mean shifted by a time-varying adjustment that depends on the conditional variance.
The intuition for this is simple: when, for example, positive prediction errors are more costly than negative errors, a negative conditionally expected error is desirable and is induced by setting the bias $\alpha_t > 0$. The optimal amount of bias depends on the conditional prediction error variance of the process. As the conditional variation around $\mu_{t+k|\Omega_t}$ grows, so too does the optimal amount of bias needed to avoid large positive prediction errors.

To illustrate this idea, consider the linlin loss function, so named for its linearity on each side of the origin (albeit with possibly different slopes):

$$L(y_{t+k} - \hat y_{t+k}) = \begin{cases} a\,|y_{t+k} - \hat y_{t+k}|, & \text{if } y_{t+k} - \hat y_{t+k} > 0 \\ b\,|y_{t+k} - \hat y_{t+k}|, & \text{if } y_{t+k} - \hat y_{t+k} \le 0. \end{cases}$$

Christoffersen and Diebold (1994) show that the optimal predictor of $y_{t+k}$ under this loss function is

$$\hat y_{t+k} = \mu_{t+k|\Omega_t} + \sqrt{h_{t+k|\Omega_t}}\; \Phi^{-1}\left(\frac{a}{a+b}\right),$$

where $\Phi$ is the Gaussian cumulative distribution function. In contrast, a pseudo-optimal predictor, which accounts for loss asymmetry but not conditional variance dynamics, is

$$\hat y_{t+k} = \mu_{t+k|\Omega_t} + \sigma_k\, \Phi^{-1}\left(\frac{a}{a+b}\right),$$

where $\sigma_k^2$ is the unconditional variance of $y_{t+k}$.

In Figure 10, we show our GARCH(1,1) realization together with the one-step-ahead linlin-optimal, pseudo-optimal and conditional mean predictors for the loss parameters a = .95 and b = .05. Note that the optimal predictor injects more bias when conditional volatility is high, reflecting the fact that it accounts for both loss asymmetry and conditional heteroskedasticity. This conditionally optimal amount of bias may be more or less than the constant bias associated with the pseudo-optimal predictor. Of course, the conditional-mean predictor injects no bias, as it accounts for neither loss asymmetry nor conditional heteroskedasticity.

H. Evaluating Volatility Forecasts

Although volatility forecast accuracy comparisons are often conducted using mean-squared error, loss functions that explicitly incorporate the forecast user's economic loss function are more relevant and may lead to different rankings of models.
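The point that the ranking of forecasts depends on the loss function can be seen with the linlin predictor of the previous subsection. The sketch below is a hedged illustration, not the authors' computation: it assumes a Gaussian forecast density, uses the loss parameters a = .95 and b = .05 from the text, and picks a hypothetical conditional mean of 0 and conditional variance of 4. The helper names (`linlin_loss`, `linlin_predictor`, `average_loss`) are invented for the example, and equally spaced Gaussian quantiles stand in for simulated outcomes.

```python
import math
from statistics import NormalDist

def linlin_loss(error, a, b):
    """Linlin loss: slope a on positive errors (outcome above forecast), slope b otherwise."""
    return a * abs(error) if error > 0 else b * abs(error)

def linlin_predictor(mean, variance, a, b):
    """mean + sqrt(variance) * Phi^{-1}(a / (a + b)), assuming a Gaussian forecast density."""
    return mean + math.sqrt(variance) * NormalDist().inv_cdf(a / (a + b))

def average_loss(forecast, outcomes, loss):
    """Average loss of a fixed point forecast over a set of outcomes."""
    return sum(loss(y - forecast) for y in outcomes) / len(outcomes)

# Hypothetical setting: conditional mean 0, conditional variance 4.
a, b, cond_mean, cond_var = 0.95, 0.05, 0.0, 4.0
# Deterministic stand-in for draws from N(0, 4): equally spaced Gaussian quantiles.
dist = NormalDist(cond_mean, math.sqrt(cond_var))
outcomes = [dist.inv_cdf((i + 0.5) / 2000) for i in range(2000)]

optimal = linlin_predictor(cond_mean, cond_var, a, b)
mse_mean = average_loss(cond_mean, outcomes, lambda e: e * e)
mse_optimal = average_loss(optimal, outcomes, lambda e: e * e)
linlin_mean = average_loss(cond_mean, outcomes, lambda e: linlin_loss(e, a, b))
linlin_optimal = average_loss(optimal, outcomes, lambda e: linlin_loss(e, a, b))
```

Under squared-error loss the conditional mean has the smaller average loss (`mse_mean < mse_optimal`), whereas under linlin loss the shifted predictor does (`linlin_optimal < linlin_mean`). Passing an unconditional variance to `linlin_predictor` in place of `cond_var` gives the pseudo-optimal predictor.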
West et al. (1993) and Engle et al. (1993) make important contributions along those lines, proposing economic loss functions based on utility maximization and profit maximization, respectively.

Lopez (1994) proposes a volatility forecast evaluation framework that subsumes a variety of economic loss functions. The framework is based on transforming a model's volatility forecasts into probability forecasts by integrating over the distribution of $\varepsilon_t$. By selecting the range of integration corresponding to an event of interest, a forecast user can incorporate elements of her loss function into the probability forecasts.

For example, given $\varepsilon_t \mid \Omega_{t-1} \sim D(0, h_t)$ and a volatility forecast $\hat h_t$, an options trader interested in the event $\varepsilon_t \in [L_{\varepsilon,t}, U_{\varepsilon,t}]$ would generate the probability forecast

$$P_t = \Pr(L_{\varepsilon,t} < \varepsilon_t < U_{\varepsilon,t}) = \Pr\left(\frac{L_{\varepsilon,t}}{\sqrt{\hat h_t}} < z_t < \frac{U_{\varepsilon,t}}{\sqrt{\hat h_t}}\right) = \int_{l_{\varepsilon,t}}^{u_{\varepsilon,t}} f(z_t)\, dz_t,$$

where $z_t$ is the standardized innovation, $f(z_t)$ is the functional form of the distribution $D(0, 1)$, and $[l_{\varepsilon,t}, u_{\varepsilon,t}]$ is the standardized range of integration. In contrast, a forecast user such as a portfolio manager or a central bank interested in the behavior of $y_t = \mu_t + \varepsilon_t$, where $\mu_t = E[y_t \mid \Omega_{t-1}]$, would generate the probability forecast

$$P_t = \Pr(L_{y,t} < y_t < U_{y,t}) = \Pr\left(\frac{L_{y,t} - \hat\mu_t}{\sqrt{\hat h_t}} < z_t < \frac{U_{y,t} - \hat\mu_t}{\sqrt{\hat h_t}}\right) = \int_{l_{y,t}}^{u_{y,t}} f(z_t)\, dz_t,$$

where $\hat\mu_t$ is the forecasted conditional mean and $[l_{y,t}, u_{y,t}]$ is the standardized range of integration.

The probability forecasts so generated can be evaluated using statistical tools tailored to the user's loss function. In particular, probability scoring rules can be used to assess the accuracy of the probability forecasts, and the significance of differences across models can be tested using a generalization of the Diebold-Mariano (1995) procedure.
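A minimal sketch of this transformation follows, under two assumptions that the text leaves open: the standardized distribution D(0, 1) is taken to be Gaussian, and the scoring rule is the quadratic probability score (a Brier-type rule), chosen here only as one familiar example. The function names and the numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist

def event_probability(lower, upper, h_forecast, mean_forecast=0.0):
    """Pr(lower < y_t < upper) for y_t = mean + eps_t, eps_t ~ N(0, h_forecast) (Gaussian D assumed)."""
    scale = math.sqrt(h_forecast)
    l = (lower - mean_forecast) / scale   # standardized lower limit
    u = (upper - mean_forecast) / scale   # standardized upper limit
    z = NormalDist()
    return z.cdf(u) - z.cdf(l)

def quadratic_probability_score(probs, outcomes):
    """Average of 2*(P_t - R_t)^2, where R_t = 1 if the event occurred and 0 otherwise; lower is better."""
    return sum(2.0 * (p - r) ** 2 for p, r in zip(probs, outcomes)) / len(probs)

# The same event [-1.96, 1.96] is less likely under a higher variance forecast.
p_low_vol = event_probability(-1.96, 1.96, h_forecast=1.0)
p_high_vol = event_probability(-1.96, 1.96, h_forecast=4.0)

# Score two hypothetical probability forecasts against realized event indicators.
score = quadratic_probability_score([p_low_vol, p_high_vol], [1, 0])
```

Two volatility models can then be compared by computing such a score for each model's probability forecasts over the same events and sample.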
Moreover, the calibration tests of Seillier-Moiseiwitsch and Dawid (1993) can be used to examine the degree of equivalence between an event's predicted and observed frequencies of occurrence within subsets of the probability forecasts specified by the user.

VII. Directions for Future Research

Fifteen years ago, little attention was paid to conditional volatility dynamics in modeling macroeconomic and financial time series; the situation has since changed dramatically. GARCH and related models have proved tremendously useful in modeling such dynamics. However, perhaps in contrast to the impression we may have created, we believe that the literature on modeling conditional volatility dynamics is far from settled, and that complacency with the ubiquitous GARCH(1,1) model is not justified.

Almost without exception, low-ordered (and hence potentially restrictive) GARCH models are used in applied work. For example, among hundreds of empirical applications of the GARCH model, almost all casually and uncritically adopt the GARCH(1,1) specification. EGARCH applications have followed suit, with the vast majority adopting the EGARCH(1,1) specification. Similarly, applications of the stochastic volatility model typically use an AR(1) specification. However, recent findings suggest that such specifications (as well as the models themselves, regardless of the particular specification) are often too restrictive to maintain fidelity to the data.

It appears, for example, that the conditional volatility dynamics of stock market returns (as well as certain other asset returns) contain long memory. Ding, Engle and Granger (1993) find positive and significant sample autocorrelations for daily S&P 500 returns at up to 2500 lags and that their rate of decay is slower than exponential.
A model consistent with such long-memory volatility findings is the fractionally integrated GARCH (FIGARCH) model developed by Baillie, Bollerslev and Mikkelsen (1993), building on earlier work by Robinson (1991). FIGARCH is a model of fractionally integrated conditional variance dynamics, in parallel to the well-known fractionally integrated ARMA models of conditional mean dynamics (e.g., Granger and Joyeux, 1980). The FIGARCH model implies a hyperbolic rate of decay for the autocorrelations of the squared process, which is slower than exponential.

To motivate the FIGARCH process, begin with the GARCH(p,q) process,

$$\varepsilon_t \mid \Omega_{t-1} \sim N(0, h_t), \qquad h_t = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)h_t.$$

Rearranging the conditional variance into ARMA form, the FIGARCH(p,d,q) equation is

$$[1 - \alpha(L) - \beta(L)]\,\varepsilon_t^2 = \phi(L)(1 - L)^d\, \varepsilon_t^2 = \omega + [1 - \beta(L)]\,\nu_t.$$

That is, the $[1 - \alpha(L) - \beta(L)]$ polynomial can be factored into a stationary ARMA component and a long-memory difference operator. If $0 < d < 1$, the process is FIGARCH(p,d,q). If $d = 0$, then the standard GARCH(p,q) model obtains; if $d = 1$, then the IGARCH(p,q) model obtains. Bollerslev and Mikkelsen (1993) conjecture that the coefficients in the ARCH representation of a FIGARCH process ($d < 1$) are dominated by those of an IGARCH process. If so, then FIGARCH ($d < 1$) would be strictly stationary (though not covariance stationary), because IGARCH is strictly stationary.

Long memory is only one of many previously unnoticed features of volatility. Interestingly, as we study volatility more carefully, more and more anomalies emerge. Volatility patterns turn out to differ across assets, time periods, and transformations of the data. The complacency with the "standard" GARCH model is being shattered, and we think it unlikely that any one consensus model will take its place. The implications of this development are twofold. First, real care must be taken in tailoring volatility models to the relevant data, as in Engle and Ng (1993).
Second, because all volatility models are likely to be misspecified, care should be taken in assessing models' robustness to misspecification.

To illustrate the deviations from classical GARCH models that turn out to be routinely present in real data, we present in Figure 11 the sample autocorrelation functions of the absolute and squared change in the log daily closing value of the S&P 500 stock index, 1928-1990. The autocorrelation functions are shown to displacement $\tau = 200$ in order to assess the evidence for long memory, and dashed lines indicate the Bartlett 95% confidence interval for white noise. Note that substantially more persistence is found in absolute returns than in squared returns, in keeping with Ding, Engle, and Granger (1993), and that both absolute and squared returns appear too persistent to accord with any of the "standard" volatility models.

In addition, these patterns are different over time. In Figure 12, we show squared returns over various subperiods: 1928-1940, 1941-1970, 1971-1980 and 1981-1990. It seems clear that most of the long memory is driven by the 1928-1940 period. To the extent that there is any long memory in the post-1940 period, it seems to be coming from the 1970's. Interestingly, there seem to be no GARCH effects in the 1980's, as shown by the negligible autocorrelations for $\varepsilon_t^2$.

Other assets, including interest rates, foreign exchange rates, and other stock indexes, display a bewildering variety of volatility patterns, as discussed in Mor (1994). Sometimes there seems to be long memory; sometimes not. Sometimes the autocorrelation patterns of $\varepsilon_t^2$ match those of $|\varepsilon_t|$, and sometimes the autocorrelation patterns of $|\varepsilon_t|$ appear much more persistent. The patterns differ across assets and often seem to indicate structural change.
For example, the long memory seemingly present in exchange rate volatility seems concentrated in the 1970's, while long memory in interest rate volatility is typically concentrated in the 1980's. These observed phenomena, as well as occasional long-horizon spikes in autocorrelations and the appearance of oscillatory autocorrelation behavior, are again inconsistent with standard specifications.

An additional illustration of the inadequacies of GARCH models is provided by West and Cho (1994). Using weekly exchange rates, they show that for horizons longer than one week, out-of-sample GARCH volatility forecasts lose their value, even though volatility seems highly persistent. The good in-sample performance of GARCH models breaks down rapidly out-of-sample.²¹ In addition, standard tests of forecast optimality, such as regressions of realized squared returns on an intercept and the GARCH forecast, strongly reject the null of the optimality of the GARCH forecast with respect to available information. West and Cho suggest time-varying parameters and discrete shifts in the mean level of volatility as possible explanations.

In light of the emerging evidence that GARCH models are likely misspecified, and given how unlikely it is that one will happen upon a "correct" specification, it is of interest to consider whether GARCH models might still perform adequately in tracking and forecasting volatility; that is, whether their good properties are robust to misspecification. In a series of papers (Nelson, 1990a, 1992, 1993; Nelson and Foster, 1991, 1994), Nelson and Foster find that the usefulness of GARCH models in volatility tracking and short-term volatility forecasting is robust to a variety of types of misspecification; thus, in spite of misspecification, GARCH models can consistently extract conditional variances from high-frequency time series.
More specifically, if a process is well approximated by a continuous-time diffusion, then broad classes of GARCH models provide consistent estimates of the instantaneous conditional variance as the sampling frequency increases. This occurs because the sequence of GARCH(1,1) models used to form estimates of next period's conditional variance average increasing numbers of squared residuals from the increasingly recent past. In this way, a sequence of GARCH(1,1) models can consistently estimate next period's conditional variance despite potentially severe misspecification.

References

Baillie, R.T., Bollerslev, T. and Mikkelsen, H.O. (1993), "Fractionally Integrated Generalized Autoregressive Conditional Heteroskedasticity," Manuscript, J.L. Kellogg School of Management, Northwestern University.
Bera, A.K. and Higgins, M.L. (1993), "ARCH Models: Properties, Estimation and Testing," Journal of Economic Surveys, 7, 305-362.
Black, F. (1976), "Studies of Stock Price Volatility Changes," Proceedings of the American Statistical Association, Business and Economic Statistics Section, 177-181.
Bollerslev, T. (1986), "Generalized Autoregressive Conditional Heteroskedasticity," Journal of Econometrics, 31, 307-327.
Bollerslev, T. (1987), "A Conditional Heteroskedastic Time Series Model for Speculative Prices and Rates of Return," Review of Economics and Statistics, 69, 542-547.
Bollerslev, T. (1988), "On the Correlation Structure for the Generalized Autoregressive Conditional Heteroskedastic Process," Journal of Time Series Analysis, 9, 121-131.
Bollerslev, T., Chou, R.Y., Kroner, K.F. (1992), "ARCH Modeling in Finance: A Selective Review of the Theory and Empirical Evidence," Journal of Econometrics, 52, 5-59.
Bollerslev, T. and Domowitz, I. (1991), "Price Volatility, Spread Variability and the Role of Alternative Market Mechanisms," Review of Futures Markets, 10, 78-102.
Bollerslev, T. and Engle, R.F. (1993), "Common Persistence in Conditional Variances," Econometrica, 61, 166-187.
Bollerslev, T., Engle, R.F. and Nelson, D.B. (1994), "ARCH Models," in R.F. Engle and D. McFadden (eds.), Handbook of Econometrics, Volume IV. Amsterdam: North-Holland.
Bollerslev, T., Engle, R.F. and Wooldridge, J.M. (1988), "A Capital Asset Pricing Model with Time Varying Covariances," Journal of Political Economy, 95, 116-131.
Bollerslev, T. and Mikkelsen, H.O. (1993), "Modeling and Pricing Long Memory in Stock Market Volatility," Manuscript, J.L. Kellogg School of Management, Northwestern University.
Bollerslev, T. and Wooldridge, J.M. (1992), "Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time-Varying Covariances," Econometric Reviews, 11, 143-179.
Box, G.E.P. and Jenkins, G.M. (1970), Time Series Analysis: Forecasting and Control. Oakland: Holden-Day.
Brock, W.A. and LeBaron, B.D. (1993), "Using Structural Modeling in Building Statistical Models of Volatility and Volume of Stock Market Returns," Manuscript, Department of Economics, University of Wisconsin, Madison.
Christoffersen, P.F. and Diebold, F.X. (1994), "Optimal Prediction under Asymmetric Loss," Technical Working Paper #167, National Bureau of Economic Research.
Chu, C.-S.J. (1993), "Detecting Parameter Shifts in Generalized Autoregressive Conditional Heteroskedasticity Models," Manuscript, Department of Economics, University of Southern California.
Clark, P.K. (1973), "A Subordinated Stochastic Process Model With Finite Variance for Speculative Prices," Econometrica, 41, 135-156.
Demos, A. and Sentana, E. (1991), "An EM-Based Algorithm for Conditionally Heteroskedastic Latent Factor Models," Manuscript, Financial Markets Group, London School of Economics.
Diebold, F.X. (1986), "Modeling the Persistence of Conditional Variances: Comment," Econometric Reviews, 5, 51-56.
Diebold, F.X. (1987), "Testing for Serial Correlation in the Presence of ARCH," Proceedings of the American Statistical Association, Business and Economic Statistics Section, 1986, 323-328. Washington, DC: American Statistical Association.
Diebold, F.X. (1988), Empirical Modeling of Exchange Rate Dynamics. New York: Springer-Verlag.
Diebold, F.X. and Mariano, R.S. (1995), "Comparing Predictive Accuracy," Journal of Business and Economic Statistics, 13, 253-264.
Diebold, F.X. and Nerlove, M. (1989), "The Dynamics of Exchange Rate Volatility: A Multivariate Latent-Factor ARCH Model," Journal of Applied Econometrics, 4, 1-22.
Diebold, F.X. and Schuermann, T. (1993), "Exact Maximum Likelihood Estimation of ARCH Models," Manuscript, Department of Economics, University of Pennsylvania.
Ding, Z., Engle, R.F. and Granger, C.W.J. (1993), "A Long Memory Property of Stock Market Returns and a New Model," Journal of Empirical Finance, 1, 83-106.
Drost, F.C. and Nijman, T.E. (1993), "Temporal Aggregation of GARCH Processes," Econometrica, 61, 909-927.
Engle, R.F. (1982), "Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of U.K. Inflation," Econometrica, 50, 987-1008.
Engle, R.F. (1987), "Multivariate GARCH with Factor Structures - Cointegration in Variance," Manuscript, Department of Economics, University of California, San Diego.
Engle, R.F. and Bollerslev, T. (1986), "Modeling the Persistence of Conditional Variances," Econometric Reviews, 5, 1-50.
Engle, R.F. and Gonzalez-Rivera, G. (1991), "Semiparametric ARCH Models," Journal of Business and Economic Statistics, 9, 345-359.
Engle, R.F., Hendry, D.F., and Trumble, D. (1985), "Small-Sample Properties of ARCH Estimators and Tests," Canadian Journal of Economics, 18, 66-93.
Engle, R.F., Hong, C.-H., Kane, A. and Noh, J. (1993), "Arbitrage Valuation of Variance Forecasts with Simulated Options," in D. Chance and R. Tripp (eds.), Advances in Futures and Options Research. Greenwich, CT: JAI Press.
Engle, R.F. and Kroner, K.F. (1993), "Multivariate Simultaneous Generalized ARCH," Econometric Theory, forthcoming.
Engle, R.F., Lilien, D.M. and Robins, R.P. (1987), "Estimating Time-Varying Risk Premia in the Term Structure: The ARCH-M Model," Econometrica, 55, 391-408.
Engle, R.F. and Ng, V.K. (1993), "Measuring and Testing the Impact of News on Volatility," Journal of Finance, 48, 1749-1778.
Friedman, M. and Schwartz, A.J. (1963), A Monetary History of the United States, 1867-1960. Princeton: Princeton University Press.
Gallant, A.R., Hsieh, D.A. and Tauchen, G. (1991), "On Fitting a Recalcitrant Series: The Pound-Dollar Exchange Rate, 1974-1983," in W.A. Barnett, J. Powell and G. Tauchen (eds.), Nonparametric and Semiparametric Methods in Econometrics and Statistics. Cambridge: Cambridge University Press.
Geweke, J. (1989), "Bayesian Inference in Econometric Models Using Monte Carlo Integration," Econometrica, 57, 1317-1339.
Granger, C.W.J. and Joyeux, R. (1980), "An Introduction to Long-Memory Time Series Models and Fractional Differencing," Journal of Time Series Analysis, 1, 15-39.
Granger, C.W.J. and Newbold, P. (1979), Forecasting Economic Time Series. New York: Academic Press.
Hamilton, J.D. and Susmel, R. (1994), "Autoregressive Conditional Heteroskedasticity and Changes in Regime," Journal of Econometrics, 64, 307-333.
Hansen, L.P. (1982), "Large Sample Properties of the Method of Moment Estimators," Econometrica, 50, 1029-1054.
Harvey, A., Ruiz, E., and Sentana, E. (1992), "Unobserved Component Time Series Models with ARCH Disturbances," Journal of Econometrics, 52, 129-158.
Harvey, A., Ruiz, E. and Shephard, N. (1994), "Multivariate Stochastic Variance Models," Review of Economic Studies, 61, 247-264.
Jacquier, E., Polson, N.G. and Rossi, P.E. (1994), "Bayesian Analysis of Stochastic Volatility Models," Journal of Business and Economic Statistics, 12, 371-389.
Jorgenson, D.W. (1966), "Rational Distributed Lag Functions," Econometrica, 34, 135-149.
Kim, S. and Shephard, N. (1994), "Stochastic Volatility: Likelihood Inference and Comparison with ARCH Models," Manuscript, Nuffield College, Oxford University.
King, M., Sentana, E. and Wadhwani, S. (1994), "Volatility and Links Between National Stock Markets," Econometrica, 62, 901-933.
Kraft, D. and Engle, R.F. (1982), "Autoregressive Conditional Heteroskedasticity in Multiple Time Series Models," Discussion Paper 82-23, Department of Economics, University of California, San Diego.
Lamoureux, C.G. and Lastrapes, W.D. (1990), "Persistence in Variance, Structural Change and the GARCH Model," Journal of Business and Economic Statistics, 8, 225-234.
Lastrapes, W.D. (1989), "Exchange Rate Volatility and U.S. Monetary Policy: An ARCH Application," Journal of Money, Credit and Banking, 21, 66-77.
Lee, J.H.H. (1991), "A Lagrange Multiplier Test for GARCH Models," Economics Letters, 37, 265-271.
Lee, J.H.H. and King, M.L. (1993), "A Locally Most Mean Powerful Based Score Test for ARCH and GARCH Regression Disturbances," Journal of Business and Economic Statistics, 11, 17-27.
Lee, S.-W. and Hansen, B.E. (1994), "Asymptotic Theory for the GARCH(1,1) Quasi-Maximum Likelihood Estimator," Econometric Theory, 10, 29-52.
Lopez, J.A. (1995), "Evaluating the Predictive Accuracy of Volatility Models," Manuscript, Department of Economics, University of Pennsylvania.
Lumsdaine, R.L. (1992), "Asymptotic Properties of the Quasi-Maximum Likelihood Estimator in GARCH(1,1) and IGARCH(1,1) Models," Manuscript, Department of Economics, Princeton University.
Lumsdaine, R.L. (1995), "Finite Sample Properties of the Maximum Likelihood Estimator in GARCH(1,1) and IGARCH(1,1) Models: A Monte Carlo Investigation," Journal of Business and Economic Statistics, 13, 1-10.
McLeod, A.I. and Li, W.K. (1983), "Diagnostic Checking of ARMA Time Series Models Using Squared Residual Autocorrelations," Journal of Time Series Analysis, 4, 269-273.
Milhoj, A. (1985), "The Moment Structure of ARCH Processes," Scandinavian Journal of Statistics, 12, 281-292.
Mor, N.M. (1994), "Essays on Nonlinearity in Exchange Rates," Doctoral dissertation, Department of Economics, University of Pennsylvania.
Nelson, D.B. (1990a), "ARCH Models as Diffusion Approximations," Journal of Econometrics, 45, 7-39.
Nelson, D.B. (1990b), "Stationarity and Persistence in the GARCH(1,1) Model," Econometric Theory, 6, 318-334.
Nelson, D.B. (1991), "Conditional Heteroskedasticity in Asset Returns: A New Approach," Econometrica, 59, 347-370.
Nelson, D.B. (1992), "Filtering and Forecasting with Misspecified ARCH Models: I," Journal of Econometrics, 52, 61-90.
Nelson, D.B. (1993), "Asymptotic Filtering and Smoothing Theory for Multivariate ARCH Models," Manuscript, Graduate School of Business, University of Chicago.
Nelson, D.B. and Cao, C.Q. (1992), "Inequality Constraints in the Univariate GARCH Model," Journal of Business and Economic Statistics, 10, 229-235.
Nelson, D.B. and Foster, D.P. (1991), "Filtering and Forecasting with Misspecified ARCH Models: II," Journal of Econometrics, forthcoming.
Nelson, D.B. and Foster, D.P. (1994), "Asymptotic Filtering Theory for Univariate ARCH Models," Econometrica, 62, 1-41.
Robinson, P.M. (1987), "Adaptive Estimation of Heteroskedastic Econometric Models," Revista de Econometria, 7, 5-28.
Robinson, P.M. (1991), "Testing for Strong Serial Correlation and Dynamic Conditional Heteroskedasticity in Multiple Regression," Journal of Econometrics, 47, 67-84.
Sentana, E. (1992), "Identification and Estimation of Multivariate Conditionally Heteroskedastic Latent Factor Models," Manuscript, Financial Markets Group, London School of Economics.
Seillier-Moiseiwitsch, F., and Dawid, A.P.
(1993), "On Testing the Validity of Sequential Probability Forecasts," Journal of the American Statistical Association, 88, 355-359.
Sims, C.A. (1980), "Macroeconomics and Reality," Econometrica, 48, 1-48.
Stock, J.H. (1987), "Measuring Business Cycle Time," Journal of Political Economy, 95, 1240-1261.
Stock, J.H. (1988), "Estimating Continuous-Time Processes Subject to Time Deformation: An Application to Postwar U.S. GNP," Journal of the American Statistical Association, 83, 77-85.
Stock, J.H. (1994), "Unit Roots and Trend Breaks," in R.F. Engle and D. McFadden (eds.), Handbook of Econometrics, Volume IV. Amsterdam: North-Holland.
Tsay, R.S. (1987), "Conditional Heteroskedastic Time Series Models," Journal of the American Statistical Association, 82, 590-604.
Weiss, A.A. (1984), "ARMA Models with ARCH Errors," Journal of Time Series Analysis, 5, 129-143.
Weiss, A.A. (1986), "Asymptotic Theory for ARCH Models: Estimation and Testing," Econometric Theory, 2, 107-131.
West, K.D. and Cho, D. (1994), "The Predictive Ability of Several Models of Exchange Rate Volatility," Technical Working Paper #152, National Bureau of Economic Research.
West, K.D., Edison, H.J. and Cho, D. (1993), "A Utility-Based Comparison of Some Models of Exchange Rate Volatility," Journal of International Economics, 35, 23-45.
White, H. (1984), Asymptotic Theory for Econometricians. New York: Academic Press.
Wold, H.O. (1938), The Analysis of Stationary Time Series. Uppsala: Almquist and Wicksell.

Figures 1 and 2. Daily Spot DM/$ and SF/$ (1974-1991); Daily DM/$ and SF/$ Returns (1974-1991); Squared DM/$ and SF/$ Returns (1974-1991). Horizontal axis: Time.
Figure 3. Panels: GARCH(1,1) Realization; Sample Autocorrelation Function.
Figure 4. Panels: Conditional Variance; Sample Autocorrelation Function.
Figure 5. Squared GARCH(1,1) Realization.
Figure 6. GARCH(1,1) Realization with One-Step-Ahead 90% Conditional and Unconditional Confidence Intervals. Horizontal axis: Time.
Figure 7. Panels: Factor-GARCH Series 1; Factor-GARCH Series 2.
Figure 8. Panels: Factor-GARCH Series 1 Squared; Factor-GARCH Series 2 Squared.
Figure 9. Panels: Conditional Variance of Series 1; Conditional Variance of Series 2; Conditional Covariance; each with its Sample Autocorrelation Function.
Figure 10. GARCH(1,1) Realization with Linlin Optimal, Pseudo-Optimal, and Conditional Mean Predictors.
Notes to Figure: The linlin loss parameters are set to a = .95 and b = .05, so a/(a+b) = .95. The GARCH(1,1) parameters are set to $\alpha = .2$ and $\beta = .75$. The dotted line is the GARCH(1,1) realization. The horizontal line at zero is the conditional mean predictor, the horizontal line at 1.65 is the pseudo-optimal predictor, and the time-varying solid line is the optimal predictor.
Figure 11. Panels: Autocorrelation Function - S&P $|e|$ - Jan 28 to May 90; Autocorrelation Function - S&P $e^2$ - Jan 28 to May 90.
Figure 12. Panels: Autocorrelation Function - S&P $e^2$ - Jan 28 to Dec 40; Jan 41 to Dec 70; Jan 71 to Dec 80; Jan 81 to May 90.

Endnotes
1. ARCH is short for AutoRegressive Conditional Heteroskedasticity.
2. A process is linearly deterministic if it can be predicted to any desired degree of accuracy by linear projection on sufficiently many past observations.
3. Recall that the defining characteristic of white noise is a lack of serial correlation, which is a weaker condition than serial independence.
4. The obvious empirically useful approximation to an LRCSSP (which is an infinite-ordered moving average) with infinite-ordered ARCH errors is an ARMA process with GARCH errors. See Weiss (1984), who studies ARMA processes with finite-ordered ARCH errors. (The GARCH process had not yet been invented.)
5. Nelson and Cao (1992) show that, for higher order GARCH processes, the nonnegativity constraints are sufficient, but not necessary, for the conditional variance to be positive.
6. See, for example, Jorgenson (1966).
7. Setting $y_0 = 0$ and $h_0 = E(y_t^2)$, we generate 1500 observations, and we discard the first 1000 to eliminate the effects of the start-up values.
8. The parameter values for $\alpha$ and $\beta$ are typical of the parameter estimates reported in the empirical literature.
9. For a precise statement of the necessary and sufficient condition for finite kurtosis, see Bollerslev (1986).
10. Their results, however, require a finite fourth unconditional moment, a condition likely to be violated in financial contexts.
11. Alternative approaches may of course be taken. Geweke (1989), for example, discusses Bayesian procedures.
12. Generalization to the GARCH case has not yet been done.
13. However, Bollerslev and Wooldridge (1992) introduce a modified LM test robust to nonnormal conditional distributions.
14.
As always, rejection of the null does not imply acceptance of the alternative. Tests for conditional heteroskedasticity, for example, often have power against alternatives of serial correlation as well; see Engle, Hendry and Trumble (1985).
15. Robinson (1991) also treats the issue of robustness by proposing general classes of heteroskedasticity-robust serial correlation tests and serial correlation-robust heteroskedasticity tests.
16. In fact, Robinson (1987) goes so far as to propose nonparametric estimation of the conditional variance function, thereby eliminating the need for parametric specification of functional form.
17. Negative shocks appear to contribute more to stock market volatility than do positive shocks. This phenomenon is called the leverage effect, because a negative shock to the market value of equity increases the aggregate debt/equity ratio (other things the same), thereby increasing leverage.
18. Models of "copersistence" in variance and cointegration in variance are based on similar ideas; see Bollerslev and Engle (1993).
19. Despite the similarity in their names, the latent-factor GARCH model discussed here is different from the factor GARCH model. In the latent-factor GARCH case, the observed variables are linear combinations of latent GARCH processes, whereas in the factor GARCH case, linear combinations of the observed variables follow univariate GARCH processes. As pointed out by Sentana (1992), the difference between the two models is similar to the difference between standard factor analysis and principal components analysis.
20. See also King, Sentana and Wadhwani (1994), Demos and Sentana (1991), and Sentana (1992).
21. Note, however, that West and Cho (1994) evaluate volatility forecasts using the mean-squared error criterion, which may not be the most appropriate. For further discussion, see Bollerslev, Engle and Nelson (1994) and Lopez (1995).
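
As a rough illustration of the simulation design described in endnote 7, the following Python sketch generates a GARCH(1,1) realization with start-up values $y_0 = 0$ and $h_0 = E(y_t^2)$, produces 1500 observations, and discards the first 1000. Several things here are assumptions rather than details taken from the paper: the function name `simulate_garch11`, the Gaussian innovations, the intercept value `omega = 1.0`, and the random seed are all hypothetical choices. The values $\alpha = .2$ and $\beta = .75$ are borrowed from the notes to Figure 10 and need not be the ones behind the other simulated figures. The start-up variance uses $E(y_t^2) = \omega / (1 - \alpha - \beta)$, which holds when $\alpha + \beta < 1$.

```python
import numpy as np


def simulate_garch11(omega, alpha, beta, n_keep=500, n_burn=1000, seed=0):
    """Sketch of a GARCH(1,1) simulation (illustrative, not the authors' code).

    Model assumed here:
        y_t = sqrt(h_t) * z_t,            z_t ~ iid N(0, 1)
        h_t = omega + alpha * y_{t-1}^2 + beta * h_{t-1}
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")

    rng = np.random.default_rng(seed)      # seed is an arbitrary choice
    n_total = n_burn + n_keep              # 1500 with the defaults
    z = rng.standard_normal(n_total)

    y = np.empty(n_total)
    h = np.empty(n_total)

    # Start-up values as in endnote 7: y_0 = 0 and h_0 = E(y_t^2).
    y_prev = 0.0
    h_prev = omega / (1.0 - alpha - beta)

    for t in range(n_total):
        h[t] = omega + alpha * y_prev ** 2 + beta * h_prev
        y[t] = np.sqrt(h[t]) * z[t]
        y_prev, h_prev = y[t], h[t]

    # Discard the burn-in observations to reduce the effect of start-up values.
    return y[n_burn:], h[n_burn:]


if __name__ == "__main__":
    y, h = simulate_garch11(omega=1.0, alpha=0.2, beta=0.75)
    print("observations kept:", y.size)
    print("sample variance of y:", y.var())
    print("mean conditional variance:", h.mean())
```

The returned series `y` and `h` correspond to the kind of realization and conditional variance paths that the simulated figures display; squaring `y` gives the squared realization.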