A Series of Occasional Papers in Draft Form Prepared by Members of the Research Department for Review and Comment.

Research Paper No. 75-1

AUTOREGRESSIVE FORM OF TIME-POLYNOMIAL EXTRAPOLATION

By Thomas A. Gittings
Department of Research
Federal Reserve Bank of Chicago

This paper is being circulated for purposes of discussion and comment. The contents should be regarded as preliminary and not for citation or quotation without permission of the author. The views expressed are those of the author and do not necessarily reflect those of the Federal Reserve Bank of Chicago or the Federal Reserve System.

1. INTRODUCTION

Some of the standard methods of forecasting future values of a time series that has a trend include treating the series as an A.R.I.M.A. process, using general exponential smoothing, or extrapolating least-squares fitted time-polynomials. This article primarily investigates some properties of this last method. Given the current and some past values of a single time series, the procedure of estimating the coefficients of a polynomial in time and extrapolating into the future is equivalent to a specific type of autoregressive process. The lag weights in the autoregressive equation are independent of the data and are functions only of the number of observations, the degree of the estimated time-polynomial, the lead time of the forecast, and the weighting function in the least-squares estimation.

This article is concerned with forecasting a time series which can be represented by a simple polynomial or exponential trend without a seasonal component. One method of forecasting such a variable is to form a stationary series from the data, usually by taking the difference between successive observations (or logarithms of the observations), and then estimate the coefficients for an autoregressive model. In a general A.R.I.M.A. model the equation also can include the deviations of some past observations from their estimated values. This approach is advocated especially by Wold [6], Whittle [5], and Box and Jenkins [1]. For cases where the time series has such a simple trend, an alternative approach is to estimate the time trend and extrapolate to get the forecasts for some future dates. This forecasting technique is equivalent to using certain deterministic autoregressive equations where the coefficients are independent of the data. The primary advantage of this latter approach is its computational simplicity; forecasts can be made by just taking a weighted sum of a set of observations. The process of estimating the coefficients of a time-polynomial is embodied in these particular autoregressive coefficients.

In section 2 the autoregressive form for extrapolating a trend estimated by ordinary least-squares is derived and some interesting properties of these coefficients are noted. Section 3 presents the autoregressive schemes which correspond to single-period forecasts using exponentially discounted least-squares and double exponential smoothing. Section 4 shows what constraints on the autoregressive coefficients are necessary if one wants to extrapolate any N-degree time-polynomial trend. The constraints can be used to reduce the number of coefficients that must be specified or estimated in an autoregressive equation. Section 5 examines some properties of forecast functions which extrapolate trend and use the first difference of the observations. Section 6 considers two methods of extending the results for any forecast lead time and shows how to extrapolate any exponential trend.
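As a concrete preview of this equivalence, the following sketch (added here for illustration and not part of the original paper; Python with numpy, with an arbitrarily chosen sample size, polynomial degree, and test series) computes a one-period forecast both by fitting and extrapolating a time-polynomial and by taking the corresponding weighted sum of the observations.

    import numpy as np

    # Arbitrary sample size and polynomial degree for the demonstration.
    T, N = 12, 2
    t = np.arange(1, T + 1)                       # time periods 1, 2, ..., T
    X = np.vander(t, N + 1, increasing=True)      # T x (N+1) matrix, X[t-1, n] = t**n
    tau = (T + 1.0) ** np.arange(N + 1)           # (1, T+1, ..., (T+1)**N)

    # The lag weights a = tau (X'X)^{-1} X' depend only on T and N, not on the data.
    a = tau @ np.linalg.solve(X.T @ X, X.T)

    rng = np.random.default_rng(0)
    z = 3.0 + 0.5 * t - 0.2 * t**2 + rng.normal(size=T)   # any series will do

    # Route 1: estimate the trend coefficients and extrapolate to period T+1.
    beta_hat = np.linalg.lstsq(X, z, rcond=None)[0]
    forecast_ols = tau @ beta_hat

    # Route 2: a weighted sum of the observations with the fixed weights.
    forecast_weighted = a @ z

    print(np.allclose(forecast_ols, forecast_weighted))   # True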
2. ORDINARY LEAST-SQUARES

Extrapolation of trend in a single time series is appropriate when the variable satisfies the usual assumptions of a least-squares model. Given T observations from discrete, equispaced intervals of time, the time-polynomial model of degree N is

    z_t = \beta_0 + \beta_1 t + \cdots + \beta_N t^N + e_t ,    t = 1, 2, \ldots, T .    (2.1)

The time origin has been set so that the first observation is for time period one. The residuals are assumed to be independently distributed with a constant variance and zero expectation. The number of observations must be greater than the degree of the time-polynomial. In matrix notation the model is

    Z = X\beta + e ,    (2.2)

where Z is the T \times 1 vector whose t-th element is the observation for period t, \beta is the (N+1) \times 1 vector of coefficients that are to be estimated, and X is the T \times (N+1) matrix with X_{tn} = t^{n-1}. The ordinary least-squares estimate of the vector of coefficients is

    \hat\beta = (X'X)^{-1} X'Z .    (2.3)

Using these estimated coefficients to extrapolate trend, the expected value of this variable in the next time period (T+1) is

    z^*_{T+1} = \tau \hat\beta ,    (2.4)

where \tau is the 1 \times (N+1) vector with \tau_n = (T+1)^{n-1}. This forecast can be written as

    z^*_{T+1} = \tau (X'X)^{-1} X'Z .    (2.5)

Defining the vector of weights

    a = \tau (X'X)^{-1} X' ,    (2.6)

the forecast can be written as the autoregressive equation

    z^*_{T+1} = \theta_1 z_T + \theta_2 z_{T-1} + \cdots + \theta_T z_1 ,    (2.7)

where the t-th lag weight \theta_t must equal a_{T+1-t} , t = 1, 2, \ldots, T.

In order to determine some interesting properties of these autoregressive coefficients, let the square matrix Y be equal to X'X. This matrix has elements which are sums of the first T positive integers raised to certain powers,

    Y_{mn} = \sum_{t=1}^{T} t^{m+n-2} ,    m, n = 1, 2, \ldots, N+1 .    (2.8)

Since sums of positive powers can be represented as a function of only the variables T and N (and the corresponding Bernoulli numbers), the inverse matrix (X'X)^{-1} is also a function of only T and N. Therefore, the equations which generate the autoregressive terms are specific N-degree polynomials with constant coefficients that are functions of T and N. For example, the polynomial which generates a_t is

    a_t = \tau (X'X)^{-1} (1, t, \ldots, t^N)' = a_0 + a_1 t + \cdots + a_N t^N .    (2.9)

Since \theta_t = a_{T+1-t} , t = 1, 2, \ldots, T, the lag weights are independent of the data when one uses these autoregressive schemes to extrapolate a time-polynomial estimated by ordinary least-squares.

Another interesting property of these weights can be determined by postmultiplying the equation for a by the matrix X:

    aX = \tau ,    (2.10)

or

    \sum_{t=1}^{T} a_t t^n = (T+1)^n ,    n = 0, 1, \ldots, N .    (2.11)

By virtue of the binomial theorem (substitute t = T+1-s in (2.11), expand the powers of T+1-s, and recall \theta_s = a_{T+1-s}), this implies that

    \sum_{t=1}^{T} \theta_t = 1    (2.12)

and

    \sum_{t=1}^{T} t^n \theta_t = 0 ,    n = 1, 2, \ldots, N .    (2.13)

The explicit form of the polynomial that determines the lag weights has been derived for the cases where N = 1, 2, or 3 and t = 1, 2, \ldots, T:

    N = 1:  \theta_t = \frac{2(2T+1 - 3t)}{T(T-1)} ;    (2.14)

    N = 2:  \theta_t = \frac{3(3T^2+3T+2 - 6(2T+1)t + 10t^2)}{T(T-1)(T-2)} ;    (2.15)

    N = 3:  \theta_t = \frac{4(T-4)!}{T!} \left[ 4T^3+6T^2+14T+6 - 5(6T^2+6T+5)t + 30(2T+1)t^2 - 35t^3 \right] .    (2.16)

Some special properties which hold at least for these three cases but which have not been proven for all N are

    \theta_1 = (N+1)^2 / T ;    (2.17)

    \theta_T = (-1)^N (N+1) / T ;    (2.18)

    \theta_t = \left[ (X'X)^{-1} (1, t, \ldots, t^N)' \right]_1 ,    t = 1, 2, \ldots, T .    (2.19)

This last property implies that the coefficients of the polynomial which generates the autoregressive weights are the elements of the first row of the inverse matrix (X'X)^{-1}.
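The closed forms (2.14)-(2.16) and the properties (2.12), (2.13), (2.17), and (2.18) can be checked numerically. The sketch below (an illustrative addition, not from the original text; the sample size T is an arbitrary choice) compares each tabulated polynomial against the matrix expression and asserts the stated properties.

    import numpy as np
    from math import factorial

    def theta_matrix(T, N):
        """Lag weights from a = tau (X'X)^{-1} X', reversed so that
        index t corresponds to lag t (the weight on z_{T+1-t})."""
        t = np.arange(1, T + 1)
        X = np.vander(t, N + 1, increasing=True)
        tau = (T + 1.0) ** np.arange(N + 1)
        a = tau @ np.linalg.solve(X.T @ X, X.T)   # a_t is the weight on z_t
        return a[::-1]

    T = 10                                        # arbitrary sample size, T > N + 1
    t = np.arange(1, T + 1)
    closed_forms = {
        1: 2 * (2*T + 1 - 3*t) / (T * (T - 1)),                              # (2.14)
        2: 3 * (3*T**2 + 3*T + 2 - 6*(2*T + 1)*t + 10*t**2)
           / (T * (T - 1) * (T - 2)),                                        # (2.15)
        3: (4 * factorial(T - 4) / factorial(T))
           * (4*T**3 + 6*T**2 + 14*T + 6 - 5*(6*T**2 + 6*T + 5)*t
              + 30*(2*T + 1)*t**2 - 35*t**3),                                # (2.16)
    }
    for N, th in closed_forms.items():
        assert np.allclose(th, theta_matrix(T, N))
        assert np.isclose(th.sum(), 1.0)                                     # (2.12)
        assert all(np.isclose((t**n * th).sum(), 0.0) for n in range(1, N + 1))  # (2.13)
        assert np.isclose(th[0], (N + 1)**2 / T)                             # (2.17)
        assert np.isclose(th[-1], (-1)**N * (N + 1) / T)                     # (2.18)
    print("all checks pass")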
3. DISCOUNTED LEAST-SQUARES

A logical extension of this approach is to estimate the coefficients of the time-polynomial by minimizing the discounted sum of squared residuals. This approach has been proposed by Brown [2,3] and Box and Jenkins [1]. If this discounted sum is of the form

    S = \sum_{t=1}^{T} w_t e_t^2 ,    (3.1)

then the vector of coefficients estimated by discounted least-squares is

    \hat\beta = (X'WX)^{-1} X'WZ .    (3.2)

The T \times T matrix W is the diagonal matrix whose t-th diagonal element is the discount factor w_t , t = 1, 2, \ldots, T. Now the autoregressive coefficients are a function of T, N, and the discount factors.

Consider the case where N = 1 and the discount weights are determined by

    w_t = \delta^{T+1-t} ,    t = 1, 2, \ldots, T ,    (3.3)

where \delta is some positive fraction less than one. By using the explicit forms for a finite sum of an arithmetic-geometric series, one can derive the corresponding autoregressive weights for one-period extrapolation. The general equation for the a vector is now

    a = \tau (X'WX)^{-1} X'W .    (3.4)

In this example the autoregressive coefficient \theta_t (= a_{T+1-t}) is determined by

    \theta_t = \delta^{t-1} (a_0 - a_1 t) / a_2 ,    t = 1, 2, \ldots, T ,    (3.5)

where

    a_0 = (1+\delta)(1-\delta^T) - T \delta^T (1-\delta)(2 + T(1-\delta)) ;    (3.6)

    a_1 = (1-\delta)(1 - \delta^T (1 + T(1-\delta))) ;    (3.7)

    a_2 = \left[ \delta - \delta^T (T^2 (1-\delta)^2 + \delta(2-\delta^T)) \right] / (1-\delta) .    (3.8)

In the limit as \delta approaches one, equation (3.5) for the autoregressive coefficients is equivalent to equation (2.14).

As proven by D'Esopo [4], the polynomial of degree N obtained by multiple exponential smoothing is equivalent to fitting an N-degree polynomial using exponentially discounted least-squares where the sample size is infinite. Therefore, the autoregressive form of the forecast equation for double exponential smoothing (N = 1) can be determined from equation (3.5) by evaluating the limit as T approaches infinity. For one-period forecasts using double exponential smoothing, the corresponding linear autoregressive forecast function has coefficients which are generated by

    \theta_t = \delta^{t-2} \left( (1-\delta^2) - (1-\delta)^2 t \right) ,    t = 1, 2, \ldots .    (3.9)
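The following sketch (again an illustrative addition; the sample size and discount factor are arbitrary choices) evaluates the closed-form weights (3.5)-(3.8), compares them with the direct matrix expression (3.4), and shows their convergence toward the double exponential smoothing weights (3.9) as the discounted tail of the sample becomes negligible.

    import numpy as np

    T, delta = 15, 0.8                    # arbitrary sample size and discount factor
    t = np.arange(1, T + 1)

    # Direct matrix route: W diagonal with w_t = delta**(T+1-t), equation (3.4).
    X = np.column_stack([np.ones(T), 1.0 * t])
    tau = np.array([1.0, T + 1.0])
    W = np.diag(delta ** (T + 1 - t))
    a = tau @ np.linalg.solve(X.T @ W @ X, X.T @ W)
    theta_direct = a[::-1]                # reversed: theta_t weights z_{T+1-t}

    # Closed-form route, equations (3.5)-(3.8).
    d, dT = delta, delta ** T
    a0 = (1 + d) * (1 - dT) - T * dT * (1 - d) * (2 + T * (1 - d))
    a1 = (1 - d) * (1 - dT * (1 + T * (1 - d)))
    a2 = (d - dT * (T**2 * (1 - d)**2 + d * (2 - dT))) / (1 - d)
    theta_closed = d ** (t - 1) * (a0 - a1 * t) / a2

    print(np.allclose(theta_direct, theta_closed))        # True

    # As T grows, the weights approach the double exponential
    # smoothing weights of equation (3.9).
    theta_des = d ** (t - 2) * ((1 - d**2) - (1 - d)**2 * t)
    print(np.max(np.abs(theta_closed - theta_des)))       # small when delta**T is small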
4. CONSTRAINTS FOR EXTRAPOLATING POLYNOMIAL TRENDS

Another approach to extrapolating polynomial trends is to consider the properties of a T-order linear homogeneous difference equation which has an N-order polynomial solution. These properties can be used as constraints if one wishes to estimate autoregressive coefficients that will ensure the predicted value of a variable will equal its extrapolated value whenever T successive observations lie on any N-degree time-polynomial. In other words, if

    z_t = \beta_0 + \beta_1 t + \cdots + \beta_N t^N ,    t = 1, 2, \ldots, T ,    (4.1)

and

    z^*_{T+1} = \theta_1 z_T + \theta_2 z_{T-1} + \cdots + \theta_T z_1 ,    (4.2)

then

    z^*_{T+1} = \beta_0 + \beta_1 (T+1) + \cdots + \beta_N (T+1)^N .    (4.3)

If this condition holds for any value of the \beta coefficients, simple substitutions yield the following constraints:

    \sum_{t=1}^{T} \theta_t = 1 ;    (4.4)

    \sum_{t=1}^{T} t^n \theta_t = 0 ,    n = 1, 2, \ldots, N .    (4.5)

Notice that these constraints imply that at least one of the autoregressive coefficients must be negative in order for the forecast function to extrapolate a time-polynomial trend. The results derived in the section on extrapolating a time-polynomial fitted by ordinary least-squares are equivalent to imposing these extrapolation constraints and the assumption that an N-degree polynomial generates the lag weights. Alternative autoregressive coefficients that will extrapolate an N-degree time-polynomial trend can be determined by specifying a different generating function and/or by estimating some lag weights subject to the extrapolation constraints. In order for a solution to exist there must be at least N+1 parameters in the function which is assumed to generate the autoregressive coefficients.

For example, assume that one wants a set of lag weights that will extrapolate any linear trend (N = 1) and which are generated by the second-degree polynomial

    \theta_t = a_0 + a_1 t + a_2 t^2 ,    t = 1, 2, \ldots, T .    (4.6)

Using the two extrapolation constraints enables one to solve for a_0 and a_1 as functions of a_2 :

    a_0 = \frac{2(2T+1)}{T(T-1)} + \frac{(T+1)(T+2)}{6} a_2 ;    (4.7)

    a_1 = -\frac{6}{T(T-1)} - (T+1) a_2 .    (4.8)

The value of a_2 can be estimated if one has more than T observations of the variable. Alternatively it could be determined by imposing an additional condition, such as an end-point constraint.

Another example of an autoregressive process that can extrapolate any linear trend is Brown's double moving average method [3]. At time period T the moving average of a variable using the K previous observations is

    M_T = \frac{1}{K} \sum_{i=1}^{K} z_{T+1-i} .    (4.9)

The moving average of the moving average (double moving average) is

    M_T^{[2]} = \frac{1}{K} \sum_{j=1}^{K} M_{T+1-j} .    (4.10)

Brown's equation to forecast the value in the next period can be written as

    z^*_{T+1} = \frac{2K}{K-1} M_T - \frac{K+1}{K-1} M_T^{[2]} .    (4.11)

In order to translate this forecast equation into a linear autoregressive form, first manipulate the equation for the double moving average:

    M_T^{[2]} = \frac{1}{K^2} \sum_{j=1}^{K} \sum_{i=1}^{K} z_{T+2-i-j} .    (4.12)

By expanding and then adding the coefficients for each observation, one finds that

    M_T^{[2]} = \frac{1}{K^2} \left[ \sum_{k=1}^{K} k z_{T+1-k} + \sum_{k=K+1}^{2K-1} (2K-k) z_{T+1-k} \right] .    (4.13)

Substituting equations (4.9) and (4.13) into (4.11) yields the autoregressive equation for this forecasting technique:

    z^*_{T+1} = \sum_{k=1}^{2K-1} \theta_k z_{T+1-k} ,    (4.14)

where

    \theta_k = \frac{2K^2 - (K+1)k}{(K-1)K^2} ,    k = 1, 2, \ldots, K ,    (4.15)

and

    \theta_k = \frac{-(K+1)(2K-k)}{(K-1)K^2} ,    k = K+1, K+2, \ldots, 2K-1 .    (4.16)

Notice that the first K coefficients are positive and decreasing in value whereas the last K-1 coefficients are negative and increasing in value. This implies that there is a relatively large difference between the K-th and the (K+1)-th lag weights.
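A short sketch (an illustrative addition, with an arbitrary window length K and test series) confirms that the double-moving-average weights (4.15) and (4.16) satisfy the linear-trend constraints (4.4) and (4.5) and reproduce the forecast computed directly from the moving averages.

    import numpy as np

    K = 5                                        # arbitrary moving-average length
    k = np.arange(1, 2 * K)                      # lags 1, 2, ..., 2K-1
    theta = np.where(
        k <= K,
        (2 * K**2 - (K + 1) * k) / ((K - 1) * K**2),      # (4.15)
        -(K + 1) * (2 * K - k) / ((K - 1) * K**2),        # (4.16)
    )
    assert np.isclose(theta.sum(), 1.0)          # constraint (4.4)
    assert np.isclose((k * theta).sum(), 0.0)    # constraint (4.5) with n = 1

    # Compare with the forecast built from the moving averages themselves,
    # using T = 2K-1 observations of an arbitrary series.
    rng = np.random.default_rng(1)
    z = 1.0 + 0.3 * np.arange(1, 2 * K) + rng.normal(size=2 * K - 1)
    M = np.array([z[j - K:j].mean() for j in range(K, 2 * K)])  # M_K, ..., M_{2K-1}
    M2 = M.mean()                                # double moving average at T = 2K-1
    forecast_dma = (2 * K / (K - 1)) * M[-1] - ((K + 1) / (K - 1)) * M2  # (4.11)
    forecast_ar = theta @ z[::-1]                # weighted sum, most recent first
    assert np.isclose(forecast_dma, forecast_ar)
    print("double moving average checks pass")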
5. EXTRAPOLATING USING FIRST DIFFERENCES OF THE OBSERVATIONS

As shown in the previous section, in order for a linear autoregressive function to be able to extrapolate all N-degree time-polynomial trends, the coefficients must satisfy the constraints (4.4) and (4.5). An alternative form of the forecast equation is to use the first differences of the observations, so that

    z^*_{T+1} = z_T + \sum_{t=1}^{T-1} \theta_t^{[1]} (z_{T+1-t} - z_{T-t}) .    (5.1)

The notation indicates that these coefficients are for the first difference between successive observations. Since the first difference of an N-degree polynomial is an (N-1)-degree polynomial, the corresponding constraints on the \theta_t^{[1]} coefficients that will enable one to extrapolate any time trend are

    \sum_{t=1}^{T-1} \theta_t^{[1]} = 1    (5.2)

and, if N > 1,

    \sum_{t=1}^{T-1} t^n \theta_t^{[1]} = 0 ,    n = 1, 2, \ldots, N-1 .    (5.3)

The relation between these two sets of autoregressive coefficients can be summarized by

    \theta_t^{[1]} = \sum_{s=1}^{t} \theta_s - 1 ,    t = 1, 2, \ldots, T-1 ,    (5.4)

or

    \theta_1 = 1 + \theta_1^{[1]} ;  \theta_t = \theta_t^{[1]} - \theta_{t-1}^{[1]} ,  t = 2, 3, \ldots, T-1 ;  \theta_T = -\theta_{T-1}^{[1]} .    (5.5)

When one assumes that the \theta_t coefficients are determined by some specific function, the corresponding function which generates the \theta_t^{[1]} coefficients can be determined. For example, if one forecasts one period ahead by estimating a linear time-polynomial, the coefficients of the equivalent autoregressive process are given by equation (2.14). Substituting this equation into (5.4) yields the following function for the corresponding \theta_t^{[1]} coefficients:

    \theta_t^{[1]} = -1 + \frac{(4T-1)t - 3t^2}{T(T-1)} ,    t = 1, 2, \ldots, T-1 .    (5.6)

Notice that this is a second-degree polynomial in t, whereas the \theta_t coefficients are determined by a linear equation. Furthermore, this equation is concave from below.

6. SOME ADDITIONAL OBSERVATIONS

One of the simplifying assumptions of the preceding analysis is that the forecasts are being made for only one period ahead. There are two established approaches one can use when generalizing the results for any lead time L. One may estimate the coefficients of a time-polynomial and solve the estimated equation for any desired time. The coefficients of the corresponding linear autoregressive process can be determined from equation (3.4), where the \tau vector is now defined by

    \tau_n = (T+L)^{n-1} ,    n = 1, 2, \ldots, N+1 .    (6.1)

The second approach is to consider the autoregressive process to be a T-order difference equation where the observations are the initial conditions. Evaluating this difference equation for successive time periods provides forecasts for any desired lead time. When using an autoregressive process for extrapolating an estimated trend, this second approach is equivalent to estimating the trend in the original T observations, extrapolating to get the next-period forecast, using this value and the T-1 previous observations to reestimate the trend, and extrapolating the revised trend one period. This process of estimating and extrapolating one period is repeated until the forecast for each desired lead time is obtained.

When a variable has a simple exponential time trend, the coefficients can be estimated by least-squares after transforming the observations into a linear logarithmic system. The logarithm of the next-period forecast can therefore be determined by the linear autoregressive process

    \ln(z^*_{T+1}) = \sum_{t=1}^{T} \theta_t \ln(z_{T+1-t}) ,    (6.2)

where \theta_t is generated by equation (2.14). The value of this forecast can also be determined by the corresponding multiplicative autoregressive process

    z^*_{T+1} = \prod_{t=1}^{T} (z_{T+1-t})^{\theta_t} .    (6.3)

Using this logarithmic form, any simple exponential time trend can be extrapolated provided the weights satisfy the constraints (4.4) and (4.5) for N = 1.

7. CONCLUSIONS

Assume one wishes to forecast future values of a time series that can be represented by a time-polynomial model where the stochastic terms are independently distributed with zero means and the observations are from discrete, equispaced intervals of time. Instead of estimating the coefficients of the time-polynomial by ordinary or discounted least-squares and extrapolating to get the forecasts, one only needs to take a weighted sum of the observations. The equation which generates the autoregressive coefficients is a function of the number of observations used, the degree of the fitted time-polynomial, the lead time of the forecast, and the function which determines the discount weights.

REFERENCES

[1] Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, 1970.

[2] Brown, R. G., Smoothing, Forecasting and Prediction of Discrete Time Series, New Jersey: Prentice-Hall, 1962.

[3] Brown, R. G. and Meyer, R. F., "The Fundamental Theorem of Exponential Smoothing," Operations Research, 9 (1961), 673-685.

[4] D'Esopo, D. A., "A Note on Forecasting by the Exponential Smoothing Operator," Operations Research, 9 (1961), 686-687.

[5] Whittle, P., Prediction and Regulation by Linear Least-Squares Methods, London: English Universities Press, 1963.

[6] Wold, H., A Study in the Analysis of Stationary Time Series (2nd ed.), Stockholm: Almqvist & Wiksell, 1954.