A Series of Occasional Papers in Draft Form Prepared by Members of the Research Department for Review and Comment.

Research Paper No. 75-1

AUTOREGRESSIVE FORM OF TIME-POLYNOMIAL EXTRAPOLATION

By Thomas A. Gittings
Department of Research
Federal Reserve Bank of Chicago

This paper is being circulated for purposes of discussion and comment. The contents should be regarded as preliminary and not for citation or quotation without permission of the author. The views expressed are those of the author and do not necessarily reflect those of the Federal Reserve Bank of Chicago or the Federal Reserve System.

1. INTRODUCTION

Some of the standard methods of forecasting future values of a time series that has a trend include treating the series as an A.R.I.M.A. process, using general exponential smoothing, or extrapolating least-squares fitted time-polynomials. This article primarily investigates some properties of this last method. Given the current and some past values of a single time series, the procedure of estimating the coefficients of a polynomial in time and extrapolating into the future is equivalent to a specific type of autoregressive process. The lag weights in the autoregressive equation are independent of the data and are functions only of the number of observations, the degree of the estimated time-polynomial, the lead time of the forecast, and the weighting function in the least-squares estimation.

This article is concerned with forecasting a time series which can be represented by a simple polynomial or exponential trend without a seasonal component. One method of forecasting such a variable is to form a stationary series from the data, usually by taking the difference between successive observations (or logarithms of the observations), and then estimate the coefficients for an autoregressive model. In a general A.R.I.M.A. model the equation also can include the deviations of some past observations from their estimated values. This approach is advocated especially by Wold [6], Whittle [5], and Box and Jenkins [1]. For cases where the time series has such a simple trend, an alternative approach is to estimate the time trend and extrapolate to get the forecasts for some future dates. This forecasting technique is equivalent to using certain deterministic autoregressive equations where the coefficients are independent of the data. The primary advantage of this latter approach is its computational simplicity; forecasts can be made by just taking a weighted sum of a set of observations. The process of estimating the coefficients of a time-polynomial is embodied in these particular autoregressive coefficients.

In section 2 the autoregressive form for extrapolating a trend estimated by ordinary least-squares is derived and some interesting properties of these coefficients are noted. Section 3 presents the autoregressive schemes which correspond to single-period forecasts using exponentially discounted least-squares and double exponential smoothing. Section 4 shows what constraints on the autoregressive coefficients are necessary if one wants to extrapolate any N-degree time-polynomial trend. The constraints can be used to reduce the number of coefficients that must be specified or estimated in an autoregressive equation. Section 5 examines some properties of forecast functions which extrapolate trend and use the first difference of the observations. Section 6 considers two methods of extending the results for any forecast lead time and shows how to extrapolate any exponential trend.
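As a concrete preview of this equivalence, the following sketch (added here for illustration and not part of the original paper; Python with numpy, with an arbitrarily chosen sample size, polynomial degree, and test series) computes a one-period forecast both by fitting and extrapolating a time-polynomial and by taking the corresponding weighted sum of the observations.

    import numpy as np

    # Arbitrary sample size and polynomial degree for the demonstration.
    T, N = 12, 2
    t = np.arange(1, T + 1)                       # time periods 1, 2, ..., T
    X = np.vander(t, N + 1, increasing=True)      # T x (N+1) matrix, X[t-1, n] = t**n
    tau = (T + 1.0) ** np.arange(N + 1)           # (1, T+1, ..., (T+1)**N)

    # The lag weights a = tau (X'X)^{-1} X' depend only on T and N, not on the data.
    a = tau @ np.linalg.solve(X.T @ X, X.T)

    rng = np.random.default_rng(0)
    z = 3.0 + 0.5 * t - 0.2 * t**2 + rng.normal(size=T)   # any series will do

    # Route 1: estimate the trend coefficients and extrapolate to period T+1.
    beta_hat = np.linalg.lstsq(X, z, rcond=None)[0]
    forecast_ols = tau @ beta_hat

    # Route 2: a weighted sum of the observations with the fixed weights.
    forecast_weighted = a @ z

    print(np.allclose(forecast_ols, forecast_weighted))   # True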
2. ORDINARY LEAST-SQUARES

Extrapolation of trend in a single time series is appropriate when the variable satisfies the usual assumptions of a least-squares model. Given T observations from discrete, equispaced intervals of time, the time-polynomial model of degree N is

    z_t = \beta_0 + \beta_1 t + \cdots + \beta_N t^N + e_t ,    t = 1, 2, \ldots, T .    (2.1)

The time origin has been set so that the first observation is for time period one. The residuals are assumed to be independently distributed with a constant variance and zero expectation. The number of observations must be greater than the degree of the time-polynomial. In matrix notation the model is

    Z = X\beta + e ,    (2.2)

where Z is the T \times 1 vector whose t-th element is the observation for period t, \beta is the (N+1) \times 1 vector of coefficients that are to be estimated, and X is the T \times (N+1) matrix with X_{tn} = t^{n-1}. The ordinary least-squares estimate of the vector of coefficients is

    \hat\beta = (X'X)^{-1} X'Z .    (2.3)

Using these estimated coefficients to extrapolate trend, the expected value of this variable in the next time period (T+1) is

    z^*_{T+1} = \tau \hat\beta ,    (2.4)

where \tau is the 1 \times (N+1) vector with \tau_n = (T+1)^{n-1}. This forecast can be written as

    z^*_{T+1} = \tau (X'X)^{-1} X'Z .    (2.5)

Defining the vector of weights

    a = \tau (X'X)^{-1} X' ,    (2.6)

the forecast can be written as the autoregressive equation

    z^*_{T+1} = \theta_1 z_T + \theta_2 z_{T-1} + \cdots + \theta_T z_1 ,    (2.7)

where the t-th lag weight \theta_t must equal a_{T+1-t} , t = 1, 2, \ldots, T.

In order to determine some interesting properties of these autoregressive coefficients, let the square matrix Y be equal to X'X. This matrix has elements which are sums of the first T positive integers raised to certain powers,

    Y_{mn} = \sum_{t=1}^{T} t^{m+n-2} ,    m, n = 1, 2, \ldots, N+1 .    (2.8)

Since sums of positive powers can be represented as a function of only the variables T and N (and the corresponding Bernoulli numbers), the inverse matrix (X'X)^{-1} is also a function of only T and N. Therefore, the equations which generate the autoregressive terms are specific N-degree polynomials with constant coefficients that are functions of T and N. For example, the polynomial which generates a_t is

    a_t = \tau (X'X)^{-1} (1, t, \ldots, t^N)' = a_0 + a_1 t + \cdots + a_N t^N .    (2.9)

Since \theta_t = a_{T+1-t} , t = 1, 2, \ldots, T, the lag weights are independent of the data when one uses these autoregressive schemes to extrapolate a time-polynomial estimated by ordinary least-squares.

Another interesting property of these weights can be determined by postmultiplying the equation for a by the matrix X:

    aX = \tau ,    (2.10)

or

    \sum_{t=1}^{T} a_t t^n = (T+1)^n ,    n = 0, 1, \ldots, N .    (2.11)

By virtue of the binomial theorem (substitute t = T+1-s in (2.11), expand the powers of T+1-s, and recall \theta_s = a_{T+1-s}), this implies that

    \sum_{t=1}^{T} \theta_t = 1    (2.12)

and

    \sum_{t=1}^{T} t^n \theta_t = 0 ,    n = 1, 2, \ldots, N .    (2.13)

The explicit form of the polynomial that determines the lag weights has been derived for the cases where N = 1, 2, or 3 and t = 1, 2, \ldots, T:

    N = 1:  \theta_t = \frac{2(2T+1 - 3t)}{T(T-1)} ;    (2.14)

    N = 2:  \theta_t = \frac{3(3T^2+3T+2 - 6(2T+1)t + 10t^2)}{T(T-1)(T-2)} ;    (2.15)

    N = 3:  \theta_t = \frac{4(T-4)!}{T!} \left[ 4T^3+6T^2+14T+6 - 5(6T^2+6T+5)t + 30(2T+1)t^2 - 35t^3 \right] .    (2.16)

Some special properties which hold at least for these three cases but which have not been proven for all N are

    \theta_1 = (N+1)^2 / T ;    (2.17)

    \theta_T = (-1)^N (N+1) / T ;    (2.18)

    \theta_t = \left[ (X'X)^{-1} (1, t, \ldots, t^N)' \right]_1 ,    t = 1, 2, \ldots, T .    (2.19)

This last property implies that the coefficients of the polynomial which generates the autoregressive weights are the elements of the first row of the inverse matrix (X'X)^{-1}.
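The closed forms (2.14)-(2.16) and the properties (2.12), (2.13), (2.17), and (2.18) can be checked numerically. The sketch below (an illustrative addition, not from the original text; the sample size T is an arbitrary choice) compares each tabulated polynomial against the matrix expression and asserts the stated properties.

    import numpy as np
    from math import factorial

    def theta_matrix(T, N):
        """Lag weights from a = tau (X'X)^{-1} X', reversed so that
        index t corresponds to lag t (the weight on z_{T+1-t})."""
        t = np.arange(1, T + 1)
        X = np.vander(t, N + 1, increasing=True)
        tau = (T + 1.0) ** np.arange(N + 1)
        a = tau @ np.linalg.solve(X.T @ X, X.T)   # a_t is the weight on z_t
        return a[::-1]

    T = 10                                        # arbitrary sample size, T > N + 1
    t = np.arange(1, T + 1)
    closed_forms = {
        1: 2 * (2*T + 1 - 3*t) / (T * (T - 1)),                              # (2.14)
        2: 3 * (3*T**2 + 3*T + 2 - 6*(2*T + 1)*t + 10*t**2)
           / (T * (T - 1) * (T - 2)),                                        # (2.15)
        3: (4 * factorial(T - 4) / factorial(T))
           * (4*T**3 + 6*T**2 + 14*T + 6 - 5*(6*T**2 + 6*T + 5)*t
              + 30*(2*T + 1)*t**2 - 35*t**3),                                # (2.16)
    }
    for N, th in closed_forms.items():
        assert np.allclose(th, theta_matrix(T, N))
        assert np.isclose(th.sum(), 1.0)                                     # (2.12)
        assert all(np.isclose((t**n * th).sum(), 0.0) for n in range(1, N + 1))  # (2.13)
        assert np.isclose(th[0], (N + 1)**2 / T)                             # (2.17)
        assert np.isclose(th[-1], (-1)**N * (N + 1) / T)                     # (2.18)
    print("all checks pass")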
3. DISCOUNTED LEAST-SQUARES

A logical extension of this approach is to estimate the coefficients of the time-polynomial by minimizing the discounted sum of squared residuals. This approach has been proposed by Brown [2,3] and Box and Jenkins [1]. If this discounted sum is of the form

    S = \sum_{t=1}^{T} w_t e_t^2 ,    (3.1)

then the vector of coefficients estimated by discounted least-squares is

    \hat\beta = (X'WX)^{-1} X'WZ .    (3.2)

The T \times T matrix W is the diagonal matrix whose t-th diagonal element is the discount factor w_t , t = 1, 2, \ldots, T. Now the autoregressive coefficients are a function of T, N, and the discount factors.

Consider the case where N = 1 and the discount weights are determined by

    w_t = \delta^{T+1-t} ,    t = 1, 2, \ldots, T ,    (3.3)

where \delta is some positive fraction less than one. By using the explicit forms for a finite sum of an arithmetic-geometric series, one can derive the corresponding autoregressive weights for one-period extrapolation. The general equation for the a vector is now

    a = \tau (X'WX)^{-1} X'W .    (3.4)

In this example the autoregressive coefficient \theta_t (= a_{T+1-t}) is determined by

    \theta_t = \delta^{t-1} (a_0 - a_1 t) / a_2 ,    t = 1, 2, \ldots, T ,    (3.5)

where

    a_0 = (1+\delta)(1-\delta^T) - T \delta^T (1-\delta)(2 + T(1-\delta)) ;    (3.6)

    a_1 = (1-\delta)(1 - \delta^T (1 + T(1-\delta))) ;    (3.7)

    a_2 = \left[ \delta - \delta^T (T^2 (1-\delta)^2 + \delta(2-\delta^T)) \right] / (1-\delta) .    (3.8)

In the limit as \delta approaches one, equation (3.5) for the autoregressive coefficients is equivalent to equation (2.14).

As proven by D'Esopo [4], the polynomial of degree N obtained by multiple exponential smoothing is equivalent to fitting an N-degree polynomial using exponentially discounted least-squares where the sample size is infinite. Therefore, the autoregressive form of the forecast equation for double exponential smoothing (N = 1) can be determined from equation (3.5) by evaluating the limit as T approaches infinity. For one-period forecasts using double exponential smoothing, the corresponding linear autoregressive forecast function has coefficients which are generated by

    \theta_t = \delta^{t-2} \left( (1-\delta^2) - (1-\delta)^2 t \right) ,    t = 1, 2, \ldots .    (3.9)
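The following sketch (again an illustrative addition; the sample size and discount factor are arbitrary choices) evaluates the closed-form weights (3.5)-(3.8), compares them with the direct matrix expression (3.4), and shows their convergence toward the double exponential smoothing weights (3.9) as the discounted tail of the sample becomes negligible.

    import numpy as np

    T, delta = 15, 0.8                    # arbitrary sample size and discount factor
    t = np.arange(1, T + 1)

    # Direct matrix route: W diagonal with w_t = delta**(T+1-t), equation (3.4).
    X = np.column_stack([np.ones(T), 1.0 * t])
    tau = np.array([1.0, T + 1.0])
    W = np.diag(delta ** (T + 1 - t))
    a = tau @ np.linalg.solve(X.T @ W @ X, X.T @ W)
    theta_direct = a[::-1]                # reversed: theta_t weights z_{T+1-t}

    # Closed-form route, equations (3.5)-(3.8).
    d, dT = delta, delta ** T
    a0 = (1 + d) * (1 - dT) - T * dT * (1 - d) * (2 + T * (1 - d))
    a1 = (1 - d) * (1 - dT * (1 + T * (1 - d)))
    a2 = (d - dT * (T**2 * (1 - d)**2 + d * (2 - dT))) / (1 - d)
    theta_closed = d ** (t - 1) * (a0 - a1 * t) / a2

    print(np.allclose(theta_direct, theta_closed))        # True

    # As T grows, the weights approach the double exponential
    # smoothing weights of equation (3.9).
    theta_des = d ** (t - 2) * ((1 - d**2) - (1 - d)**2 * t)
    print(np.max(np.abs(theta_closed - theta_des)))       # small when delta**T is small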
4. CONSTRAINTS FOR EXTRAPOLATING POLYNOMIAL TRENDS

Another approach to extrapolating polynomial trends is to consider the properties of a T-order linear homogeneous difference equation which has an N-order polynomial solution. These properties can be used as constraints if one wishes to estimate autoregressive coefficients that will ensure the predicted value of a variable will equal its extrapolated value whenever T successive observations lie on any N-degree time-polynomial. In other words, if

    z_t = \beta_0 + \beta_1 t + \cdots + \beta_N t^N ,    t = 1, 2, \ldots, T ,    (4.1)

and

    z^*_{T+1} = \theta_1 z_T + \theta_2 z_{T-1} + \cdots + \theta_T z_1 ,    (4.2)

then

    z^*_{T+1} = \beta_0 + \beta_1 (T+1) + \cdots + \beta_N (T+1)^N .    (4.3)

If this condition holds for any value of the \beta coefficients, simple substitutions yield the following constraints:

    \sum_{t=1}^{T} \theta_t = 1 ;    (4.4)

    \sum_{t=1}^{T} t^n \theta_t = 0 ,    n = 1, 2, \ldots, N .    (4.5)

Notice that these constraints imply that at least one of the autoregressive coefficients must be negative in order for the forecast function to extrapolate a time-polynomial trend. The results derived in the section on extrapolating a time-polynomial fitted by ordinary least-squares are equivalent to imposing these extrapolation constraints and the assumption that an N-degree polynomial generates the lag weights. Alternative autoregressive coefficients that will extrapolate an N-degree time-polynomial trend can be determined by specifying a different generating function and/or by estimating some lag weights subject to the extrapolation constraints. In order for a solution to exist there must be at least N+1 parameters in the function which is assumed to generate the autoregressive coefficients.

For example, assume that one wants a set of lag weights that will extrapolate any linear trend (N = 1) and which are generated by the second-degree polynomial

    \theta_t = a_0 + a_1 t + a_2 t^2 ,    t = 1, 2, \ldots, T .    (4.6)

Using the two extrapolation constraints enables one to solve for a_0 and a_1 as functions of a_2 :

    a_0 = \frac{2(2T+1)}{T(T-1)} + \frac{(T+1)(T+2)}{6} a_2 ;    (4.7)

    a_1 = -\frac{6}{T(T-1)} - (T+1) a_2 .    (4.8)

The value of a_2 can be estimated if one has more than T observations of the variable. Alternatively it could be determined by imposing an additional condition, such as an end-point constraint.

Another example of an autoregressive process that can extrapolate any linear trend is Brown's double moving average method [3]. At time period T the moving average of a variable using the K previous observations is

    M_T = \frac{1}{K} \sum_{i=1}^{K} z_{T+1-i} .    (4.9)

The moving average of the moving average (double moving average) is

    M_T^{[2]} = \frac{1}{K} \sum_{j=1}^{K} M_{T+1-j} .    (4.10)

Brown's equation to forecast the value in the next period can be written as

    z^*_{T+1} = \frac{2K}{K-1} M_T - \frac{K+1}{K-1} M_T^{[2]} .    (4.11)

In order to translate this forecast equation into a linear autoregressive form, first manipulate the equation for the double moving average:

    M_T^{[2]} = \frac{1}{K^2} \sum_{j=1}^{K} \sum_{i=1}^{K} z_{T+2-i-j} .    (4.12)

By expanding and then adding the coefficients for each observation, one finds that

    M_T^{[2]} = \frac{1}{K^2} \left[ \sum_{k=1}^{K} k z_{T+1-k} + \sum_{k=K+1}^{2K-1} (2K-k) z_{T+1-k} \right] .    (4.13)

Substituting equations (4.9) and (4.13) into (4.11) yields the autoregressive equation for this forecasting technique:

    z^*_{T+1} = \sum_{k=1}^{2K-1} \theta_k z_{T+1-k} ,    (4.14)

where

    \theta_k = \frac{2K^2 - (K+1)k}{(K-1)K^2} ,    k = 1, 2, \ldots, K ,    (4.15)

and

    \theta_k = \frac{-(K+1)(2K-k)}{(K-1)K^2} ,    k = K+1, K+2, \ldots, 2K-1 .    (4.16)

Notice that the first K coefficients are positive and decreasing in value whereas the last K-1 coefficients are negative and increasing in value. This implies that there is a relatively large difference between the K-th and the (K+1)-th lag weights.
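A short sketch (an illustrative addition, with an arbitrary window length K and test series) confirms that the double-moving-average weights (4.15) and (4.16) satisfy the linear-trend constraints (4.4) and (4.5) and reproduce the forecast computed directly from the moving averages.

    import numpy as np

    K = 5                                        # arbitrary moving-average length
    k = np.arange(1, 2 * K)                      # lags 1, 2, ..., 2K-1
    theta = np.where(
        k <= K,
        (2 * K**2 - (K + 1) * k) / ((K - 1) * K**2),      # (4.15)
        -(K + 1) * (2 * K - k) / ((K - 1) * K**2),        # (4.16)
    )
    assert np.isclose(theta.sum(), 1.0)          # constraint (4.4)
    assert np.isclose((k * theta).sum(), 0.0)    # constraint (4.5) with n = 1

    # Compare with the forecast built from the moving averages themselves,
    # using T = 2K-1 observations of an arbitrary series.
    rng = np.random.default_rng(1)
    z = 1.0 + 0.3 * np.arange(1, 2 * K) + rng.normal(size=2 * K - 1)
    M = np.array([z[j - K:j].mean() for j in range(K, 2 * K)])  # M_K, ..., M_{2K-1}
    M2 = M.mean()                                # double moving average at T = 2K-1
    forecast_dma = (2 * K / (K - 1)) * M[-1] - ((K + 1) / (K - 1)) * M2  # (4.11)
    forecast_ar = theta @ z[::-1]                # weighted sum, most recent first
    assert np.isclose(forecast_dma, forecast_ar)
    print("double moving average checks pass")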
5. EXTRAPOLATING USING FIRST DIFFERENCES OF THE OBSERVATIONS

As shown in the previous section, in order for a linear autoregressive function to be able to extrapolate all N-degree time-polynomial trends, the coefficients must satisfy the constraints (4.4) and (4.5). An alternative form of the forecast equation is to use the first differences of the observations, so that

    z^*_{T+1} = z_T + \sum_{t=1}^{T-1} \theta_t^{[1]} (z_{T+1-t} - z_{T-t}) .    (5.1)

The notation indicates that these coefficients are for the first difference between successive observations. Since the first difference of an N-degree polynomial is an (N-1)-degree polynomial, the corresponding constraints on the \theta_t^{[1]} coefficients that will enable one to extrapolate any time trend are

    \sum_{t=1}^{T-1} \theta_t^{[1]} = 1    (5.2)

and, if N > 1,

    \sum_{t=1}^{T-1} t^n \theta_t^{[1]} = 0 ,    n = 1, 2, \ldots, N-1 .    (5.3)

The relation between these two sets of autoregressive coefficients can be summarized by

    \theta_t^{[1]} = \sum_{s=1}^{t} \theta_s - 1 ,    t = 1, 2, \ldots, T-1 ,    (5.4)

or

    \theta_1 = 1 + \theta_1^{[1]} ;  \theta_t = \theta_t^{[1]} - \theta_{t-1}^{[1]} ,  t = 2, 3, \ldots, T-1 ;  \theta_T = -\theta_{T-1}^{[1]} .    (5.5)

When one assumes that the \theta_t coefficients are determined by some specific function, the corresponding function which generates the \theta_t^{[1]} coefficients can be determined. For example, if one forecasts one period ahead by estimating a linear time-polynomial, the coefficients of the equivalent autoregressive process are given by equation (2.14). Substituting this equation into (5.4) yields the following function for the corresponding \theta_t^{[1]} coefficients:

    \theta_t^{[1]} = -1 + \frac{(4T-1)t - 3t^2}{T(T-1)} ,    t = 1, 2, \ldots, T-1 .    (5.6)

Notice that this is a second-degree polynomial in t, whereas the \theta_t coefficients are determined by a linear equation. Furthermore, this equation is concave from below.

6. SOME ADDITIONAL OBSERVATIONS

One of the simplifying assumptions of the preceding analysis is that the forecasts are being made for only one period ahead. There are two established approaches one can use when generalizing the results for any lead time L. One may estimate the coefficients of a time-polynomial and solve the estimated equation for any desired time. The coefficients of the corresponding linear autoregressive process can be determined from equation (3.4), where the \tau vector is now defined by

    \tau_n = (T+L)^{n-1} ,    n = 1, 2, \ldots, N+1 .    (6.1)

The second approach is to consider the autoregressive process to be a T-order difference equation where the observations are the initial conditions. Evaluating this difference equation for successive time periods provides forecasts for any desired lead time. When using an autoregressive process for extrapolating an estimated trend, this second approach is equivalent to estimating the trend in the original T observations, extrapolating to get the next-period forecast, using this value and the T-1 previous observations to reestimate the trend, and extrapolating the revised trend one period. This process of estimating and extrapolating one period is repeated until the forecast for each desired lead time is obtained.

When a variable has a simple exponential time trend, the coefficients can be estimated by least-squares after transforming the observations into a linear logarithmic system. The logarithm of the next-period forecast can therefore be determined by the linear autoregressive process

    \ln(z^*_{T+1}) = \sum_{t=1}^{T} \theta_t \ln(z_{T+1-t}) ,    (6.2)

where \theta_t is generated by equation (2.14). The value of this forecast can also be determined by the corresponding multiplicative autoregressive process

    z^*_{T+1} = \prod_{t=1}^{T} (z_{T+1-t})^{\theta_t} .    (6.3)

Using this logarithmic form, any simple exponential time trend can be extrapolated provided the weights satisfy the constraints (4.4) and (4.5) for N = 1.

7. CONCLUSIONS

Assume one wishes to forecast future values of a time series that can be represented by a time-polynomial model where the stochastic terms are independently distributed with zero means and the observations are from discrete, equispaced intervals of time. Instead of estimating the coefficients of the time-polynomial by ordinary or discounted least-squares and extrapolating to get the forecasts, one only needs to take a weighted sum of the observations. The equation which generates the autoregressive coefficients is a function of the number of observations used, the degree of the fitted time-polynomial, the lead time of the forecast, and the function which determines the discount weights.

REFERENCES

[1] Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, 1970.

[2] Brown, R. G., Smoothing, Forecasting and Prediction of Discrete Time Series, New Jersey: Prentice-Hall, 1962.

[3] Brown, R. G. and Meyer, R. F., "The Fundamental Theorem of Exponential Smoothing," Operations Research, 9 (1961), 673-685.

[4] D'Esopo, D. A., "A Note on Forecasting by the Exponential Smoothing Operator," Operations Research, 9 (1961), 686-687.

[5] Whittle, P., Prediction and Regulation by Linear Least-Squares Methods, London: English Universities Press, 1963.

[6] Wold, H., A Study in the Analysis of Stationary Time Series (2nd ed.), Stockholm: Almqvist & Wiksell, 1954.