
A SIMPLE ESTIMATOR OF COINTEGRATING
VECTORS IN HIGHER ORDER INTEGRATED
SYSTEMS
James H. Stock and Mark W. Watson
Working Paper Series
Macro Economic Issues
Research Department
Federal Reserve Bank of Chicago
February, 1991 (WP-91-3)

A Simple Estimator of Cointegrating Vectors
in Higher Order Integrated Systems

James H. Stock
Department of Economics
University of California, Berkeley
Berkeley, CA 94720
and
Mark W. Watson
Department of Economics
Northwestern University
Evanston, IL 60208
and the Federal Reserve Bank of Chicago
First Draft: September 1989
This revision: January 1991

Abstract
Efficient estimators of cointegrating vectors are presented for systems
involving deterministic components and variables of differing, higher orders of
integration. The estimators are computed using GLS or OLS, and Wald statistics
constructed from these estimators have asymptotic chi-squared distributions. These and
previously proposed estimators of cointegrating vectors are used to study long-run
U.S. money (M1) demand. M1 demand is found to be stable over 1900-1989; the 95%
confidence intervals for the income elasticity and interest rate semielasticity
are (.90, 1.03) and (-.124, -.088), respectively. Estimates based on the postwar
data alone, however, are unstable, with variances that suggest substantial
sampling uncertainty.
Keywords: Error correction models, unit roots, money demand

JEL classification number: 210

The authors thank Manfred Deistler, Robert Engle, Peter Phillips, Danny Quah,
Kenneth West, and the participants in the NBER/FMME Summer Institute Workshop on
New Econometric Methods in Financial Time Series, July 17-20, 1989 for helpful
comments on earlier versions of this paper. Helpful suggestions by Lars Hansen
and two anonymous referees are gratefully acknowledged. The authors also thank
Robert Lucas for kindly providing the annual data analyzed in Section 7. This
research was supported in part by the Sloan Foundation and National Science
Foundation grants SES-86-18984 and SES-89-10601.




1. Introduction

Parameters describing the long-run relation between economic time series, such
as the long-run income and interest elasticities of money demand, often play an
important role in empirical macroeconomics. If these variables are cointegrated
as defined by Engle and Granger (1987), then the task of describing these long-run
relations reduces to the problem of estimating cointegrating vectors. Recent
research on the estimation of cointegrating vectors has focused on the case that
each series is individually integrated of order 1 (is I(1)), typically with no
drift term. Johansen (1988a) and Ahn and Reinsel (1990) independently derived the
asymptotic distribution of the Gaussian MLE when the cointegrated system is
parameterized as a vector error correction model (VECM), and Johansen (1989)
extended this result to the case of nonzero drifts. In a series of papers,
Phillips and coauthors have considered efficient estimation based on a different
model for cointegrated systems, the triangular representation. Phillips (1988a)
studied estimation in a cointegrated model with general I(0) errors; Phillips and
Hansen (1989) considered a two-step zero frequency seemingly unrelated regression
estimator; and Phillips (1988b) used spectral methods to compute efficient
estimators in the frequency domain.
This paper proposes two alternative, computationally simple estimators of
cointegrating vectors, which readily extend to systems with arbitrary
deterministic components and with higher orders of integration and cointegration.
The estimators are motivated as Gaussian MLE's for a particular parameterization
of the triangular representation. However, under more general conditions they are
shown to be asymptotically efficient in Phillips' (1988a) sense, having an
asymptotic distribution that is a random mixture of normals and producing Wald
test statistics with asymptotic chi-squared null distributions. In the I(1) case
with a single cointegrating vector, one simply regresses one of the variables onto
contemporaneous levels of the remaining variables, leads and lags of their first
differences, and a constant, using either ordinary or generalized least squares.
It is argued that the resulting "dynamic OLS" (respectively GLS) estimators are
asymptotically equivalent to the Johansen/Ahn-Reinsel MLE. The proposed
estimators treat the parameters describing the short-run dynamics of the process
as nuisance parameters; the object is to obtain efficient estimates of the
cointegrating vector, which are typically of independent interest. If desired,
the estimates subsequently can be used to study the short-run dynamics, say by
estimating a triangular system (Campbell (1987), Campbell and Shiller (1987,
1989)) or a constrained VECM (King, Plosser, Stock and Watson (1987)).
These estimators are used here to investigate the long-run demand for money
(M1) in the U.S. from 1900 to 1989. Other researchers (recently including Hoffman
and Rasche [1989] and Baba, Hendry and Starr [1990]) have argued either explicitly
or implicitly that long-run money demand can be thought of as a cointegrating
relation among real balances, real income and the interest rate in postwar data.
We find this characterization empirically plausible for the longer annual data as
well, and therefore use these estimators of cointegrating vectors to examine
Lucas's (1988) suggestion that there is a stable long-run M1 money demand relation
spanning the twentieth century.
The paper is organized as follows. The model and estimators are introduced in
Section 2 for I(1) variables, and they are extended to I(d) variables in Section
3. The large-sample properties of the estimators and test statistics are
summarized in Section 4. In Section 5, the proposed estimators are related to
Johansen's (1988a) MLE, and the I(2) case is examined in detail. Monte Carlo
results are presented in Section 6. The application to long-run M1 demand is
given in Section 7. Section 8 concludes. Readers primarily interested in the
empirical results can skip Sections 3-6 with little loss of continuity.

2. Representation and Estimation in I(1) Systems

Suppose that each element of the n-dimensional time series $y_t$ is I(1), that
$E\Delta y_t = 0$, and that the $n \times r$ matrix of r cointegrating vectors $\alpha$ is
$\alpha = (-\theta \;\; I_r)'$, where $\theta$ is the $r \times (n-r)$ submatrix of unknown
parameters to be estimated and $I_r$ is the $r \times r$ identity matrix. The triangular
representation for $y_t$ is,

(2.1a)    $\Delta y_t^1 = u_t^1$
(2.1b)    $y_t^2 = \theta y_t^1 + u_t^2$

where $y_t$ is partitioned as $(y_t^1, y_t^2)$, where $y_t^1$ is $(n-r) \times 1$ and $y_t^2$ is
$r \times 1$, and where $u_t = (u_t^{1\prime} \;\; u_t^{2\prime})'$ is a stationary stochastic process
with full rank spectral density matrix. This representation has been used
extensively in theoretical work by Phillips (1988a, 1988b), typically without
parametric structure on the I(0) process $u_t$, and in applications by Campbell
(1987) and Campbell and Shiller (1987, 1989) (also see Bewley [1979]). For the
moment, $u_t$ is assumed to be Gaussian to permit the development of the Gaussian
MLE for $\theta$.
The parameterization that forms the basis for the proposed estimators is
obtained by making the error in (2.1b) independent of $\{u_t^1\}$. Because $u_t$ is
Gaussian and stationary, $E[u_t^2 \mid \{\Delta y_t^1\}] = E[u_t^2 \mid \{u_t^1\}] = d(L)\Delta y_t^1$,
where $d(L)$ is in general two-sided and $\{\Delta y_t^1\}$ denotes $\{\Delta y_t^1,\ t=1,\ldots,T\}$.
Thus (2.1b) can be written,

(2.2)    $y_t^2 = \theta y_t^1 + d(L)\Delta y_t^1 + \tilde u_t^2$

where $\tilde u_t^2 = u_t^2 - E[u_t^2 \mid \{u_t^1\}]$. By construction, $\{\Delta y_t^1\}$ and $\{\tilde u_t^2\}$ are
independent. In addition, $u_t^1$ and $\tilde u_t^2$ have Wold representations
$u_t^1 = c_{11}(L)\epsilon_t^1$ and $\tilde u_t^2 = c_{22}(L)\epsilon_t^2$, where $\{\epsilon_t^1\}$ and $\{\epsilon_t^2\}$ are independent.
Thus (2.1a) and (2.2) can be written,

(2.3a)    $\Delta y_t^1 = c_{11}(L)\epsilon_t^1$
(2.3b)    $y_t^2 = \theta y_t^1 + d(L)\Delta y_t^1 + c_{22}(L)\epsilon_t^2$

where $\epsilon_t$ is NIID$(0, \Sigma_\epsilon)$, $\Sigma_\epsilon = \mathrm{diag}(\Sigma_{11}, \Sigma_{22})$, so $\{\epsilon_t^2\}$ is independent
of $\{y_t^1\}$.
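The two-sided nature of d(L) can be illustrated with a small simulation. The sketch below is not from the paper: it assumes a hypothetical scalar design in which the second error depends on the next-period value of the first, with an arbitrary coefficient of 0.6, and projects it on one lead, the current value, and one lag. Under these assumptions the weight should fall on the lead, which is why a regression restricted to current and lagged differences would miss it.

```python
import numpy as np

# Hypothetical design (an assumption for illustration, not the paper's model):
# u2_t = 0.6 * u1_{t+1} + e2_t, so the projection of u2 on {u1} needs a lead.
rng = np.random.default_rng(4)
T = 5000
u1 = rng.standard_normal(T)          # plays the role of Delta y^1_t
e2 = rng.standard_normal(T)
u2 = np.empty(T)
u2[:-1] = 0.6 * u1[1:] + e2[:-1]     # dependence on the future value of u1
u2[-1] = e2[-1]

# Project u2_t on (u1_{t+1}, u1_t, u1_{t-1}) by least squares.
t = np.arange(1, T - 1)
X = np.column_stack([u1[t + 1], u1[t], u1[t - 1]])
d_hat, *_ = np.linalg.lstsq(X, u2[t], rcond=None)
lead_coef, current_coef, lag_coef = d_hat
```
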

The two-sided triangular representation (2.3) provides a nonstandard asymptotic
factorization of the Gaussian likelihood. Let $\lambda_1$ denote the parameters of
$c_{11}(L)$ and $\Sigma_{11}$, let $\lambda_2$ denote the parameters of $d(L)$, $c_{22}(L)$, and $\Sigma_{22}$,
and let $Y^i$ denote $(y_1^i, \ldots, y_T^i)$, $i=1,2$. Then (2.3) implies that the likelihood
can be factored as

(2.4)    $f(Y^1, Y^2 \mid \theta, \lambda_1, \lambda_2) = f(Y^2 \mid Y^1, \theta, \lambda_2)\, f(Y^1 \mid \lambda_1)$.

This differs from the usual prediction-error factorization because the conditional
mean of $y_t^2$ involves future as well as past values of $y_t^1$.
The representations (2.3) and (2.4) provide concrete guides to estimation and
inference in these Gaussian systems. If there are no restrictions between $\lambda_1$ and
$[\theta, \lambda_2]$, then $Y^1$ is ancillary (in Engle, Hendry and Richard's [1983] terminology,
weakly exogenous, extended to permit conditioning on both leads and lags of $\Delta y_t^1$)
for $\theta$, so that inference can be carried out conditional on $Y^1$. In this case, the
MLE of $\theta$ can be obtained by maximizing $f(Y^2 \mid Y^1, \theta, \lambda_2)$. This reduces to estimating
the parameters of the regression equation (2.3b) by GLS. Because the regressor
$y_t^1$ is I(1), as is shown in Section 4 an asymptotically equivalent estimator of $\theta$
can be obtained by estimating $\theta$ in (2.3b) by OLS; this will be referred to as the
dynamic OLS (DOLS) estimator, to distinguish it from the static OLS (SOLS)
estimator obtained by regressing $y_t^2$ against only a constant and $y_t^1$.
Similarly, the feasible GLS estimator of $\theta$ in (2.3b) will be referred to as the
dynamic GLS (DGLS) estimator.
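A minimal sketch of the DOLS regression for a single cointegrating vector follows. It is an illustration under stated assumptions rather than the authors' code: the function name `dols`, the symmetric lead/lag length `k`, and the simulated design (true coefficient 2.0, errors correlated with the current difference) are all hypothetical choices.

```python
import numpy as np

def dols(y2, y1, k):
    """Regress y2_t on a constant, y1_t, and k leads and lags of the first
    difference of y1; return the coefficients on the levels of y1."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    if y1.ndim == 1:
        y1 = y1[:, None]
    T = y1.shape[0]
    dy1 = np.diff(y1, axis=0)            # dy1[s - 1] is the difference at time s
    rows = np.arange(k + 1, T - k)       # dates with every lead and lag available
    cols = [np.ones((rows.size, 1)), y1[rows]]
    for j in range(-k, k + 1):           # j > 0 gives lags, j < 0 gives leads
        cols.append(dy1[rows - j - 1])
    X = np.hstack(cols)
    coef, *_ = np.linalg.lstsq(X, y2[rows], rcond=None)
    return coef[1:1 + y1.shape[1]]

# Hypothetical simulated system: y1 is a random walk and u2 is correlated with u1.
rng = np.random.default_rng(0)
T, theta = 400, 2.0
e1 = rng.standard_normal(T)
e2 = rng.standard_normal(T)
y1 = np.cumsum(e1)
y2 = theta * y1 + 0.5 * e1 + e2
theta_hat = dols(y2, y1, k=2)[0]
```

Dropping the lead and lag columns from `X` gives the static OLS (SOLS) regression; the GLS variant would additionally require a model for the serial correlation of the regression error, which this sketch does not attempt.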
The representation (2.3) warrants three remarks. First, Sims' (1972) Theorem 2
implies that the projection $d(L)\Delta y_t^1$ will involve only current and lagged values
of $\Delta y_t^1$ if (and only if) $u_t^2$ does not Granger-cause $\Delta y_t^1$. If so, and if
$c_{22}(L)^{-1}$ has finite order p, then (2.3b) can be rewritten as an r-dimensional
error correction model, i.e. as a regression of $\Delta y_t^2$ onto $\Delta y_t^1$, lagged values
$\Delta y_{t-1}, \Delta y_{t-2}, \ldots$, and the error correction term $y_{t-1}^2 - \theta y_{t-1}^1$. In this case,
the nonlinear least squares estimator (with $\Delta y_t^1$ included as a regressor; Stock
[1987]) is the Gaussian MLE.
Second, the large-sample properties of the OLS and GLS estimators of $\theta$ are
readily deduced from the representation (2.3). Because $\epsilon_t^2$ is uncorrelated with
the regressors at all leads and lags, conditional on $Y^1$ the GLS estimator has a
normal distribution and the Wald statistic testing the hypothesis that $\theta = \theta_0$ (where
rank($\theta$)=r is maintained) has a $\chi^2$ distribution. Because $y_t^1$ is I(1), the
conditional covariance matrix of the GLS estimator differs across realizations of
$Y^1$ even in large samples; thus unconditionally the GLS estimator of $\theta$ has a
large-sample distribution that is a random mixture of normals and the Wald
statistic has a $\chi^2$ distribution. Phillips (1988a) provides an insightful
discussion of the asymptotic mixed normal property of the MLE of $\theta$, the local
asymptotic mixed normal (LAMN) behavior of test statistics, and the efficiency of
the Gaussian maximum likelihood estimator of $\theta$.$^2$

Third, although the interpretation of (2.3) as a factorization of the
likelihood (2.4) assumes Gaussianity, the two-sided triangular representation with
$E\epsilon_t^2 \Delta y_{t-j}^{1\prime} = 0$ for all j can be constructed under the weaker condition that $u_t$
is linearly regular and covariance stationary. This result, summarized in the
following lemma (proven in the Appendix), provides an alternative to the Wold
representation theorem.$^3$ Let $\mathcal{M}_t^1$ and $\mathcal{M}_\infty^1$ denote the linear manifolds spanned by
$\{u_s^1\}$, $-\infty < s \le t$ and $\{u_s^1\}$, $-\infty < s < \infty$ (respectively), closed with respect to
convergence in mean square, and let $P(u_t^2 \mid \mathcal{M}_\infty^1)$ denote the projection of $u_t^2$
onto $\mathcal{M}_\infty^1$.

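Lemma 2.1. Let $u_t = (u_t^{1\prime} \;\; u_t^{2\prime})'$ be a mean zero linearly regular weakly
stationary stochastic process with $E u_t' u_t < \infty$, where $u_t^1$ is $(n-r) \times 1$ and $u_t^2$ is
$r \times 1$. Then $u_t$ has the representation,

(2.5)    $\begin{bmatrix} u_t^1 \\ u_t^2 \end{bmatrix} = \begin{bmatrix} c_{11}(L) & 0 \\ c_{21}(L) & c_{22}(L) \end{bmatrix} \begin{bmatrix} \epsilon_t^1 \\ \epsilon_t^2 \end{bmatrix}$

or $u_t = c(L)\epsilon_t$ (where $c(L)$ and $\epsilon_t$ are partitioned conformably with $u_t$), where
$E\epsilon_t = 0$, $E\epsilon_t\epsilon_t' = \mathrm{diag}(\Sigma_{11}, \Sigma_{22})$ and $E\epsilon_t\epsilon_s' = 0$ for $t \ne s$, and where $\{\epsilon_t^1\}$ are
innovations of $\{u_t^1\}$ and $\{\epsilon_t^2\}$ are innovations of $\{\tilde u_t^2 = u_t^2 - P(u_t^2 \mid \mathcal{M}_\infty^1)\}$. Also,
$c_{11}(L)$ and $c_{22}(L)$ are one-sided, in general $c_{21}(L)$ is two-sided, and $c(L)$ is
square summable.

In the representation (2.5), $c_{21}(z)$ is in general two-sided because $\{\epsilon_t^2\}$ are
constructed to be the innovations of $\{\tilde u_t^2\}$. Note however that $\{\epsilon_t\}$ are not
innovations for $\{u_t\}$.$^4$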

3. Representation in I(d) Systems

This section extends the triangular representation (2.3) to systems in which
variables may be integrated and cointegrated of different orders and in which
there are polynomial time trends. The I(d) generalization of (2.1), derived in
the Appendix from the Wold representation of an I(d) system with drifts and
multiple cointegrating vectors, is

(3.1)
$\Delta^d y_t^1 = \mu_{1,0} + u_t^1$
$\Delta^{d-1} y_t^2 = \mu_{2,0} + \mu_{2,1} t + \theta_{2,1}^{d-1}(\Delta^{d-1} y_t^1) + u_t^2$
$\Delta^{d-2} y_t^3 = \mu_{3,0} + \mu_{3,1} t + \mu_{3,2} t^2 + \theta_{3,1}^{d-1}(\Delta^{d-1} y_t^1) + \theta_{3,1}^{d-2}(\Delta^{d-2} y_t^1) + \theta_{3,2}^{d-2}(\Delta^{d-2} y_t^2) + u_t^3$
$\vdots$
$y_t^{d+1} = \sum_{j=0}^{d} \mu_{d+1,j} t^j + \sum_{j=1}^{d} \sum_{i=j}^{d} \theta_{d+1,j}^{d-i}(\Delta^{d-i} y_t^j) + u_t^{d+1}$

for $t=1,\ldots,T$, where the $y_t^j$ are $k_j \times 1$ vectors which form a partition of $y_t$, i.e.
$y_t = (y_t^{1\prime} \;\; y_t^{2\prime} \;\; \cdots \;\; y_t^{d+1\prime})'$, and $\Delta^2 = (1-L)(1-L)$, etc. By assumption,
$u_t = (u_t^{1\prime} \;\; u_t^{2\prime} \;\; \cdots \;\; u_t^{d+1\prime})'$ is weakly stationary with mean zero. It is assumed that
the highest order of integration of the elements of $y_t$ is I(d). However, not all
elements of $y_t$ need to be I(d) for (3.1) to apply (see the examples in Section
5(B) and the empirical application to long-run money demand in Section 7).
Moreover, some blocks of (3.1) might not appear. For example, with d=2 and n=2,
$y_t$ could be CI(2,1) in Engle and Granger's (1987) terminology, in which case $k_1=1$,
$k_2=1$, and $k_3=0$; if $y_t$ is CI(2,2), then $k_1=1$, $k_2=0$, and $k_3=1$.
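The difference operators in (3.1) are repeated applications of (1-L). As a small illustration (the helper name `difference` and the example series are assumptions for this sketch), differencing a quadratic time trend twice leaves a constant, which is why a block that is differenced fewer times can carry a higher-order polynomial in t.

```python
import numpy as np

def difference(y, d):
    """Apply (1 - L)^d by differencing d times; the result has d fewer observations."""
    out = np.asarray(y, dtype=float)
    for _ in range(d):
        out = out[1:] - out[:-1]
    return out

t = np.arange(6, dtype=float)
quadratic_trend = t ** 2
second_diff = difference(quadratic_trend, 2)   # constant sequence for a quadratic trend
same_as_numpy = np.diff(quadratic_trend, n=2)  # NumPy's built-in repeated differencing
```
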

This representation partitions $y_t$ into components corresponding to stochastic
trends of different orders. Abstracting from the deterministic components, $y_t^1$
is a $k_1$-vector corresponding to the $k_1$ distinct I(d) stochastic trends in the
system. In the second block of $k_2$ equations, $y_t^2 - \theta_{2,1}^{d-1} y_t^1$ corresponds to
the $k_2$ distinct I(d-1) elements in the system; for rows of $\theta_{2,1}^{d-1}$ that equal
zero, $y_t^2$ is I(d-1), while for nonzero rows of $\theta_{2,1}^{d-1}$, $y_t^2$ is I(d) and
$(y_t^1, y_t^2)$ are CI(d,1). The $k_3$ equations in the third block describe the
distinct I(d-2) components, and so forth. It is straightforward to generalize the
representation (3.1) to include higher order polynomials in t, or to specialize it
to the common case in which higher-order polynomials are suppressed.
By assumption, (3.1) requires contemporaneous linear combinations of levels of
$y_t$, or levels of $y_t$ plus differences, to be integrated of at least order zero so
that $u_t$ has a finite nonsingular spectral density matrix. This assumption is made
without loss of generality, and serves to fix the maximum order of integration d.
In practice, this is achieved by replacing $y_t$ by $\Delta^{-1} y_t$ or $\Delta y_t$ as needed for $u_t$ to
be I(0) and not cointegrated.
As in the I(1) case, the errors are orthogonalized by projecting onto leads and
lags of the errors in the preceding equations. By repeated application of Lemma
2.1, $u_t = (u_t^{1\prime} \;\; u_t^{2\prime} \;\; \cdots \;\; u_t^{d+1\prime})'$ has the representation

(3.2)    $u_t = C(L)\epsilon_t$

where $\epsilon_t = (\epsilon_t^{1\prime} \;\; \epsilon_t^{2\prime} \;\; \cdots \;\; \epsilon_t^{d+1\prime})'$ and $E\epsilon_t\epsilon_t' = \Sigma_\epsilon = \mathrm{diag}(\Sigma_{11}, \Sigma_{22}, \ldots, \Sigma_{d+1,d+1})$, and
where $C(L)$ is a block lower triangular matrix partitioned conformably with $u_t$,
with diagonal blocks $c_{ii}(L)$ that are one-sided lag polynomials and with lower
off-diagonal blocks $c_{ij}(L)$ that in general are two-sided.
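We focus our attention on the first $\ell$ blocks of equations, and make the
additional assumption that $\{c_{ii}(L)\}$, $i=1,\ldots,\ell-1$ are invertible, so that the first
$\ell$ blocks of (3.2) can be written,

(3.3)    $\begin{bmatrix} I & 0 & \cdots & 0 \\ -d_{21}(L) & I & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ -d_{\ell 1}(L) & \cdots & -d_{\ell,\ell-1}(L) & I \end{bmatrix} \begin{bmatrix} u_t^1 \\ u_t^2 \\ \vdots \\ u_t^\ell \end{bmatrix} = \begin{bmatrix} c_{11}(L) & 0 & \cdots & 0 \\ 0 & c_{22}(L) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & c_{\ell\ell}(L) \end{bmatrix} \begin{bmatrix} \epsilon_t^1 \\ \epsilon_t^2 \\ \vdots \\ \epsilon_t^\ell \end{bmatrix}$

where in general $d_{ij}(L)$ are two-sided (for example, $d_{21}(L) = c_{21}(L)c_{11}(L)^{-1}$).
Substitution of the $\ell$-th equation in (3.3) into (3.1) then yields,

(3.4)    $\Delta^{d-\ell+1} y_t^\ell = \sum_{j=0}^{\ell-1} \tilde\mu_{\ell,j} t^j + \sum_{j=1}^{\ell-1} \sum_{i=j}^{\ell-1} \theta_{\ell,j}^{d-i}(\Delta^{d-i} y_t^j) + \sum_{m=1}^{\ell-1} d_{\ell m}(L)\Big[\Delta^{d-m+1} y_t^m - \sum_{j=1}^{m-1} \sum_{i=j}^{m-1} \theta_{m,j}^{d-i}(\Delta^{d-i} y_t^j)\Big] + c_{\ell\ell}(L)\epsilon_t^\ell$

where $\{\tilde\mu_{\ell,j},\ j=0,\ldots,\ell-1\}$ are functions of $\{\mu_{m,j},\ j=0,\ldots,m-1,\ m=1,\ldots,\ell\}$. The
lag polynomials $d_{\ell m}(L)$, which generalize $d(L)$ in (2.3), arise from the
projection of $u_t^\ell$ onto $\{u_t^m\}$ for $m=1,\ldots,\ell-1$. When $\{y_t\}$ is Gaussian, $\epsilon_t$ is
NIID$(0, \Sigma_\epsilon)$.$^5$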

The subspaces that cointegrate $y_t^\ell$ with $(y_t^1, \ldots, y_t^{\ell-1})$ and their
differences are determined by the matrices $\{\theta_{\ell,j}^{d-i}\}$ appearing in the second
term on the right hand side of (3.4). Note that the $\ell$-th block of equations
contains all of the cointegrating vectors for $m < \ell$, which appear in the higher
order error correction terms making up the third term on the right hand side of
(3.4). For example, in a system with d=2 the equations describing cointegration
in the levels contain any cointegrating relations between the first differences.
As in the I(1) case, one motivation for considering (3.4) is as the conditional
mean of a nonstandard factorization of the Gaussian likelihood. Let $\lambda_\ell$ denote the
parameters of $c_{\ell\ell}(L)$ and $d_{\ell m}(L)$, $m=1,\ldots,\ell-1$, let $\theta_\ell$ denote
$\{\theta_{\ell,j}^{d-i},\ j=1,\ldots,\ell-1,\ i=j,\ldots,\ell-1\}$, let $\mu_\ell$ denote $\{\mu_{\ell,j},\ j=0,\ldots,\ell-1\}$, and let
$\lambda$, $\theta$, and $\mu$ represent the collection of $\lambda_\ell$, $\theta_\ell$, and $\mu_\ell$.

(3.5)

The Gaussian likelihood can be written as:

f(Y,0 ,M»A) - f

where $Y = (y_1', y_2', \ldots, y_T')'$ and $Y^\ell = (y_1^{\ell\prime}, y_2^{\ell\prime}, \ldots, y_T^{\ell\prime})'$ for $\ell = 1, \ldots, d+1$.

If the parameters $\{\theta_m,\ m < \ell\}$ of the higher-order cointegrating
vectors in (3.1) are known and if there are no restrictions between $(\lambda_\ell, \theta_\ell, \mu_\ell)$ and
$\{(\lambda_j, \theta_j, \mu_j),\ j < \ell\}$, then $(Y^1, \ldots, Y^{\ell-1})$ are ancillary for $\theta_\ell$. In this case the MLE
of $\theta_\ell$ is obtained by estimating the system (3.4) by GLS. If some of $(\theta_2, \ldots, \theta_{\ell-1})$
are unknown then $(Y^1, \ldots, Y^{\ell-1})$ are no longer ancillary (weakly exogenous) for $\theta_\ell$,
and the GLS estimator of $\{\theta_{\ell,j}^{d-i}\}$ in (3.4) is not the exact Gaussian MLE.
However, as is made precise in Section 5 for the I(2) case, the DGLS and DOLS
estimators of $\{\theta_{\ell,j}^{d-i}\}$ in (3.4) still have desirable properties.
We therefore consider regression estimators of (3.4).

Because the regressors in (3.4) can have stochastic or deterministic trends in
common, it is convenient to transform the regressors to isolate these different
trends. Let $x_t$ denote the regressors in each equation in (3.4). It is assumed
that $x_t$ is a known nonrandom (not data-dependent) linear combination of
$Y^1, \ldots, Y^{\ell-1}$. Define $z_t = D x_t$, where D is an invertible matrix of constants
(possibly unknown), chosen so that $z_t$ are the canonical regressors in the sense
of Sims, Stock and Watson (1990). The choice of transformation matrix D depends
on the specific application (see Section 5(B) for examples). In general,
partition $z_t$ as $(z_t^{1\prime} \;\; z_t^{2\prime} \;\; \cdots \;\; z_t^{2\ell\prime})'$, where by construction $z_t^1$ is an I(0)
vector with mean zero ($z_t^1$ contains the required leads and lags of $\{u_t^m,\ m < \ell\}$,
dictated by the polynomials $\{d_{\ell m}(L)\}$), $z_t^2 = 1$, $z_t^3$ is dominated by a
martingale, $z_t^4 = t$, $z_t^5$ is dominated by integrated martingales, $z_t^6 = t^2$, and so
forth. In general $\sum_{t=1}^{T} z_t^i z_t^{i\prime}$ is $O_p(T^{i-1})$ for $i \ge 2$.

From Sims, Stock, and Watson (1990, Section 2), z. can be
u

written as zt * G(L)ut , where G(L) is a block lower triangular matrix and
0

V « t

* i
1

«t

and where ^

C «t—

*

i-lw
)', where

0■=
/(1£ , 1 £ 2,' mmm£ i-l,w
' l ’
')

is defined recursively by ?t“^ s - l ^ s ^ or

Also, let $g_i$ denote the dimension of $z_t^i$, and let $g = \sum_{i=1}^{2\ell} g_i$ be the dimension
of $z_t$. With these definitions, the system (3.4) equivalently can be written as,




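(3.6)    $\Delta^{d-\ell+1} y_t^\ell = (x_t' \otimes I_{k_\ell})\beta + e_t^\ell$

or

(3.7)    $\Delta^{d-\ell+1} y_t^\ell = (z_t' \otimes I_{k_\ell})\delta + e_t^\ell$

where $e_t^\ell = c_{\ell\ell}(L)\epsilon_t^\ell$. The coefficients $\beta$ are the coefficients on the regressors
appearing in (3.4); these are related to the coefficients $\delta$ on the canonical
regressors by $\delta = (D'^{-1} \otimes I_{k_\ell})\beta$. Because the parameters of interest (the
cointegrating parameters) are the coefficients on the integrated elements of
$z_t$, it is convenient to partition the $gk_\ell$-vector $\delta$ as
$\delta = (\delta_1' \;\; \delta_2' \;\; \cdots \;\; \delta_{2\ell}')'$, where $\delta_i$ is the $g_i k_\ell$-vector of coefficients on $z_t^i$.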
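The link between coefficients on the original regressors and on transformed regressors is a purely algebraic property of least squares, which the following sketch checks numerically for a scalar dependent variable. The data, the matrix `D`, and the variable names are hypothetical; the point is only that regressing on z_t = D x_t yields coefficients equal to the inverse of D' applied to the coefficients from regressing on x_t.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
# Hypothetical regressors: a constant, a random walk, and a stationary series.
x = np.column_stack([np.ones(T),
                     np.cumsum(rng.standard_normal(T)),
                     rng.standard_normal(T)])
y = x @ np.array([1.0, 2.0, -0.5]) + rng.standard_normal(T)

# Hypothetical invertible transformation; row t of z holds z_t' = (D x_t)'.
D = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.3, -1.0, 1.0]])
z = x @ D.T

beta, *_ = np.linalg.lstsq(x, y, rcond=None)    # coefficients on x_t
delta, *_ = np.linalg.lstsq(z, y, rcond=None)   # coefficients on z_t
delta_from_beta = np.linalg.solve(D.T, beta)    # inverse of D' applied to beta
```

Because the fitted values are identical in the two regressions, hypotheses stated on one set of coefficients translate directly into hypotheses on the other.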

4. Estimation and Testing

This section examines the least squares estimation of the parameters $\delta$ in
(3.7). It is assumed that $y_t$ has the triangular representation (3.1) with $u_t$
given by (3.3). We consider the case that $z_t$ and $\delta$ are finite dimensional, i.e.
in which $\{d_{\ell m}(L),\ m < \ell\}$ have fixed finite orders. It is assumed that $\{\epsilon_t\}$ in (3.2)
is a martingale difference sequence with
$E[\epsilon_t\epsilon_t' \mid \epsilon_{t-1}, \epsilon_{t-2}, \ldots] = \Sigma_\epsilon = \mathrm{diag}(\Sigma_{11}, \Sigma_{22}, \ldots, \Sigma_{d+1,d+1})$ nonsingular and
$\max_i \sup_t E[(\epsilon_{it})^4 \mid \epsilon_{t-1}, \epsilon_{t-2}, \ldots] < \infty$.
There are two natural estimators of the parameters in (3.6): the feasible GLS
estimator based on an estimator of $c_{\ell\ell}(L)$, suitably parameterized, and the dynamic
OLS estimator (respectively, the DGLS and DOLS estimators). These are

(4.1)
(4.2)

where $\tilde z_t = [z_t \otimes \hat\Phi(L)']$ and $\tilde y_t = \hat\Phi(L)\Delta^{d-\ell+1} y_t^\ell$, where $\hat\Phi(L)$ is an estimator of
$\Phi(L) = \Sigma_{\ell\ell}^{-1/2} c_{\ell\ell}(L)^{-1}$.
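Associated with the DGLS estimator is the Wald statistic testing the h
restrictions $R\delta = r$ (where R and r have dimensions $h \times gk_\ell$ and $h \times 1$, respectively),

(4.3)    $W_{GLS} = (R\hat\delta_{GLS} - r)'\big[R(\textstyle\sum \tilde z_t \tilde z_t')^{-1} R'\big]^{-1}(R\hat\delta_{GLS} - r)$.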

Because the disturbance in (3.6) is serially correlated, the Wald statistic for
$\hat\delta_{OLS}$ must be constructed using a modified covariance matrix. When the hypotheses
of interest do not involve the coefficients on the mean-zero stationary regressors
in (3.6), this is the spectral density matrix of $e_t^\ell$ at frequency zero,
$\Omega_{\ell\ell} = c_{\ell\ell}(1)\Sigma_{\ell\ell}c_{\ell\ell}(1)'$, estimated by $\hat\Omega_{\ell\ell}$. That is,

(4.4)    $W_{OLS} = [R\hat\delta_{OLS} - r]'\big\{R\big[(\textstyle\sum z_t z_t')^{-1} \otimes \hat\Omega_{\ell\ell}\big]R'\big\}^{-1}[R\hat\delta_{OLS} - r]$.

Define the $g \times g$ scaling matrix $\Upsilon_T$ to be a block diagonal matrix partitioned
conformably with $z_t$, with diagonal blocks $T^{1/2} I_{g_1}$ and $T^{(i-1)/2} I_{g_i}$ for $i \ge 2$.
Also, for $w_t$ weakly stationary, define $\Gamma_w(j) = E[w_t - E(w_t)][w_{t-j} - E(w_t)]'$.
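A sketch of a Wald statistic of the form (4.4) in the scalar-equation case follows. It assumes one particular parametric approximation that the text does not prescribe: the long-run variance is estimated by fitting an AR(p) to the OLS residuals and evaluating the implied spectrum at frequency zero. The function names, the AR order, and the simulated data are all hypothetical.

```python
import numpy as np

def ar_long_run_variance(e, p):
    """Long-run variance of a scalar series from a fitted AR(p):
    innovation variance divided by (1 - sum of AR coefficients)^2."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    target = e[p:]
    lags = np.column_stack([e[p - j:len(e) - j] for j in range(1, p + 1)])
    phi, *_ = np.linalg.lstsq(lags, target, rcond=None)
    resid = target - lags @ phi
    sigma2 = resid @ resid / len(target)
    return sigma2 / (1.0 - phi.sum()) ** 2

def wald_ols(Z, y, R, r, p):
    """Wald statistic for R delta = r, with the OLS moment matrix scaled by
    the estimated long-run variance of the regression error."""
    delta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    omega = ar_long_run_variance(y - Z @ delta, p)
    cov = omega * np.linalg.inv(Z.T @ Z)
    gap = R @ delta - r
    return float(gap @ np.linalg.solve(R @ cov @ R.T, gap))

# Hypothetical data: AR(1) error with coefficient 0.5 and an I(1) regressor.
rng = np.random.default_rng(2)
T = 20000
eps = rng.standard_normal(T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = 0.5 * e[t - 1] + eps[t]
trend = np.cumsum(rng.standard_normal(T))
Z = np.column_stack([np.ones(T), trend])
y = 1.0 + 2.0 * trend + e
omega_hat = ar_long_run_variance(e, p=1)   # population value is 1 / (1 - 0.5)^2 = 4
W = wald_ols(Z, y, R=np.array([[0.0, 1.0]]), r=np.array([2.0]), p=1)
```

The restriction tested here involves only the coefficient on the integrated regressor, which is the situation in which the text says the frequency-zero correction is appropriate.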

The next four theorems, proven in the Appendix, summarize the asymptotic
distributions of these statistics.

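Theorem 4.1. Suppose that $y_t$ satisfies (3.4) and (3.6) where $c_{jj}(L)$ is
$(d+1-j)$-summable, $j=1,\ldots,d+1$, that $c_{\ell\ell}(L)^{-1}$ has known order $q < \infty$, and $d_{\ell m}(L)$ has a
known finite order. Then $(\Upsilon_T \otimes I_{k_\ell})(\hat\delta_{GLS} - \delta) \Rightarrow Q^{-1}\phi$, where after
partitioning Q and $\phi$ conformably with $\delta$:

$Q_{11} = E[(z_t^1 \otimes \Phi(L)')(z_t^{1\prime} \otimes \Phi(L))]$, $Q_{1j} = 0$, $j \ge 2$, and
$Q_{ij} = [V_{ij} \otimes \Phi(1)'\Phi(1)]$ for $i,j \ge 2$, where $V_{22} = 1$,

$V_{mp} = G_{mm}(1)\big[\int_0^1 W_1^{(m-1)/2}(s)\, W_1^{(p-1)/2}(s)'\, ds\big] G_{pp}(1)'$,  $m = 3,5,7,\ldots,2\ell-1$, $p = 3,5,7,\ldots,2\ell-1$,
$V_{mp} = G_{mm}(1)\big[\int_0^1 s^{(m-2)/2}\, W_1^{(p-1)/2}(s)'\, ds\big] G_{pp}(1)' = V_{pm}'$,  $m = 2,4,6,\ldots,2\ell$, $p = 3,5,7,\ldots,2\ell-1$,
$V_{mp} = \frac{2}{p+m-2}\, G_{mm}(1) G_{pp}(1)'$,  $m = 2,4,6,\ldots,2\ell$, $p = 2,4,6,\ldots,2\ell$,

$\phi_1 \sim N\big(0, E[(z_t^1 \otimes \Phi(L)')\Sigma_{\ell\ell}(z_t^{1\prime} \otimes \Phi(L))]\big)$,
$\phi_m = \int_0^1 \big(G_{mm}(1) s^{(m-2)/2} \otimes \Phi(1)'\big)\, dW_2(s)$,  $m = 2,4,6,\ldots,2\ell$,
$\phi_m = \int_0^1 \big(G_{mm}(1) W_1^{(m-1)/2}(s) \otimes \Phi(1)'\big)\, dW_2(s)$,  $m = 3,5,7,\ldots,2\ell-1$,

where $W_1$ and $W_2$ are independent standard Wiener processes of dimension
$k_1 + \cdots + k_{\ell-1}$ and $k_\ell$ respectively, where $W_i^m(t) = \int_0^t W_i^{m-1}(s)\, ds$ for
$i = 1,2$ and $m = 2,3,\ldots,g$, and where $\phi_1$ is independent of $\phi_m$, $m > 1$.

Theorem 4.2. Under the assumptions of Theorem 4.1,

(a) $(\Upsilon_T \otimes I_{k_\ell})(\hat\delta_{OLS} - \delta) \Rightarrow (V^{-1} \otimes I_{k_\ell})\omega$, where after partitioning V and $\omega$
conformably with $\delta$:

$\omega_1 \sim N\big(0, \sum_{j=-\infty}^{\infty} \Gamma_{z^1}(j) \otimes \Gamma_{e^\ell}(j)\big)$,
$\omega_m = \int_0^1 \big(G_{mm}(1) s^{(m-2)/2} \otimes \Omega_{\ell\ell}^{1/2}\big)\, dW_2(s)$,  $m = 2,4,6,\ldots,2\ell$,
$\omega_m = \int_0^1 \big(G_{mm}(1) W_1^{(m-1)/2}(s) \otimes \Omega_{\ell\ell}^{1/2}\big)\, dW_2(s)$,  $m = 3,5,7,\ldots,2\ell-1$,

where $\omega_1$ is independent of $\omega_m$, $m > 1$, and where $V = [V_{ij}]$, $i,j = 1,2,\ldots,2\ell$,
where $V_{11} = E z_t^1 z_t^{1\prime}$, $V_{1j} = 0$, $j \ge 2$, and $V_{ij}$, $i,j \ge 2$ are given in Theorem
4.1. This holds even if $c_{\ell\ell}(L)^{-1}$ has infinite order as long as $c_{\ell\ell}(L)$ is
1-summable and $c_{\ell\ell}(1)$ is nonsingular.

(b) Partition $\delta = (\delta_1' \;\; \delta_*')'$ so that $\delta_1$ denotes the $g_1 k_\ell$ elements of $\delta$
corresponding to $z_t^1$ and $\delta_*$ corresponds to the remaining $(g - g_1)k_\ell$ elements
of $\delta$. Similarly partition $\hat\delta_{OLS}$, $\hat\delta_{GLS}$, $z_t = (z_t^{1\prime} \;\; z_t^{*\prime})'$, and
$\Upsilon_T = \mathrm{diag}(\Upsilon_{1T}, \Upsilon_{*T})$. Then $(\Upsilon_{*T} \otimes I_{k_\ell})(\hat\delta_{*OLS} - \hat\delta_{*GLS}) \to_p 0$.

Theorem 4.3. Under the assumptions of Theorem 4.1, $W_{GLS} \Rightarrow \chi_h^2$.

Theorem 4.4. Suppose that the first $g_1 k_\ell$ columns of R equal zero and that
$\hat\Omega_{\ell\ell} \to_p \Omega_{\ell\ell}$. Then under the assumptions of Theorem 4.1,
$W_{OLS} - W_{GLS} \to_p 0$ and $W_{OLS} \Rightarrow \chi_h^2$.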

Note that $c_{\ell\ell}(L)^{-1}$ must be finitely parameterized to implement the DGLS
estimator. Although this is not strictly needed for the DOLS estimator, $\Omega_{\ell\ell}$ and
therefore $c_{\ell\ell}(1)$ must be consistently estimated to construct $W_{OLS}$, which in
practice entails estimating a parametric approximation.
The asymptotic equivalence of the DOLS and DGLS estimators of $\delta_*$ (Theorem
4.2(b)) is a consequence of the trends in $z_t^m$: for $m \ge 2$ the GLS-transformed
regressors are asymptotically collinear with their untransformed counterparts.
This result extends the familiar result for the case of a constant and polynomial
time trend (Grenander and Rosenblatt [1957]) and the results of Phillips and Park
(1986) for I(1) regressors to the general integrated regression model with
regressors of various orders of integration.
Although the theorems are stated in terms of $\delta$, typically the results are
translated into results on the coefficients of interest, $\beta$. The distribution of
$\hat\beta_{GLS}$ is obtained from $\hat\delta_{GLS} = (D'^{-1} \otimes I_{k_\ell})\hat\beta_{GLS}$ and similarly for $\hat\beta_{OLS}$.
Moreover, the Wald statistic testing $R\delta = r$ equivalently tests $P\beta = r$, where
$P = R(D'^{-1} \otimes I_{k_\ell})$. Theorem 4.3 implies that $W_{GLS}$ is asymptotically $\chi^2$ for all R, so
the Wald statistic testing $P\beta = r$ is asymptotically $\chi^2$ for all P. When $P\beta = r$ places
no restrictions on coefficients that can be written as coefficients on mean-zero
stationary regressors, Theorem 4.4 implies that the Wald test of $P\beta = r$ based on
$\hat\beta_{OLS}$ (with a serial correlation-robust covariance matrix) is asymptotically $\chi^2$.
Importantly, the result concerning the asymptotic $\chi^2$ distribution of the Wald
statistic testing restrictions on cointegrating vectors applies whether or not the
integrated regressors have components that are polynomials in time. However, the
limiting distribution of the estimator itself will differ depending on whether
time (say) is included as a regressor and whether some of the regressors have a
time trend component; for specific examples in the I(1) case, see West (1988) and
Hansen (1989).
These theorems apply to the case that there are a fixed number of regressors.
Conceptually, one could view this estimator as semiparametric by embedding this
parametric regression in a sequence of regressions where the number of regressors
increases as a function of the sample size. A formal treatment of this extension
would entail generalizing the univariate I(0) results of Berk (1974) and the
univariate I(1) results of Said and Dickey (1984) to the I(d), vector-valued case,
an extension not undertaken here.$^6$

5. Examples

This section examines two examples in detail. The first compares the model of
Section 2, and the associated dynamic OLS and GLS estimators, with the I(1) VECM
formulation studied in Engle and Granger (1987) and Johansen (1988a, 1988b). The
second example examines various cases of the I(2) specialization of (3.1).

A. Comparison with the I(1) vector error correction model. One representation of
a purely stochastic I(1) cointegrated system is the VECM,

(5.1)    $\Delta y_t = \gamma\alpha' y_{t-1} + A(L)\Delta y_{t-1} + \zeta_t$,  $\zeta_t$ NIID$(0, \Sigma_\zeta)$,  $t = 1, \ldots, T$

where $y_t$ is $n \times 1$, $A(L)$ has finite order and is unrestricted, $\alpha$ is a $n \times r$ matrix of
cointegrating vectors and $\Sigma_\zeta$ is unrestricted. Johansen (1988a) and Ahn and
Reinsel (1990) derived the limiting distribution of the Gaussian MLE for the
unknown parameters of $\alpha$ in (5.1). Here, the MLE for (5.1) is related to the DOLS
estimator.
Because the asymptotic information matrix for $(\hat\alpha_{MLE}, \hat A_{MLE}(z))$ in (5.1) is
block diagonal, let $A(L) = 0$. Partition $y_t = (y_t^{1\prime} \;\; y_t^{2\prime})'$ into a (n-r)- and r-
vector, partition $\gamma = (\gamma_1' \;\; \gamma_2')'$ conformably, where $\gamma_1$ is $(n-r) \times r$ and $\gamma_2$ is $r \times r$, and
normalize $\alpha$ as $\alpha = (-\theta \;\; I_r)'$, where $\theta$ is $r \times (n-r)$. Without loss of generality assume
that $\gamma_2$ is nonsingular. The block triangular form of (5.1) is,

(5.2a)    $\Delta y_t^1 = u_t^1$,  $u_t^1 = \gamma_1 u_{t-1}^2 + \zeta_t^1$
(5.2b)    $y_t^2 = \theta y_t^1 + u_t^2$,  $u_t^2 = (I_r + \alpha'\gamma) u_{t-1}^2 + \alpha'\zeta_t$

Clearly $u_t^2$ depends on $\theta$ and, if $\gamma_1 \ne 0$, so does $u_t^1$, so the factorization of the
likelihood (2.4) results in restrictions between $\theta$ and $\lambda_2$ and between $\lambda_1$ and
$(\theta, \lambda_2)$. In this model, the exact MLE ($\hat\theta_{MLE}$) is the system estimator studied by
Johansen (1988a) and Ahn and Reinsel (1990), not the single equation estimator
examined in Section 4.
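To study the behavior of $\hat\theta_{MLE}$, it is convenient to reparameterize the VECM
(5.1) (with $A(L) = 0$) as:

(5.3a)    $\Delta y_t^1 = \Pi \Delta y_t^2 + \eta_t^1$
(5.3b)    $\Delta y_t^2 = \beta_1 y_{t-1}^1 + \beta_2 y_{t-1}^2 + \eta_t^2$

where $\Pi = \gamma_1\gamma_2^{-1}$, $\beta_1 = -\gamma_2\theta$, $\beta_2 = \gamma_2$, $\eta_t^1 = \zeta_t^1 - \Pi\zeta_t^2$ and $\eta_t^2 = \zeta_t^2$. This
parameterization is convenient because (5.3) is an unconstrained linear triangular
simultaneous equation model so that MLE's correspond to iterated SUR estimators
(see Lahiri and Schmidt [1978]). The MLE's of the cointegrating vectors in (5.1)
can be recovered from the MLE's of (5.3) as $\hat\theta_{MLE} = -\hat\beta_{2,MLE}^{-1}\hat\beta_{1,MLE}$. Isolating
the regressors of different orders of integration, equation (5.3b) can be written
as:

(5.3c)    $\Delta y_t^2 = \delta_1 z_t^1 + \delta_3 z_t^3 + \eta_t^2$

where $z_t^1 = y_{t-1}^2 - \theta y_{t-1}^1 = \alpha' y_{t-1}$, $z_t^3 = y_{t-1}^1$, $\delta_1 = \beta_2$ and $\delta_3 = \beta_1 + \beta_2\theta$. In this
parameterization $z_t^3$ are canonical I(1) regressors, $z_t^1$ are mean zero I(0)
regressors, the true value of $\delta_3$ is zero, and $\hat\theta_{MLE} - \theta = -\hat\delta_{1,MLE}^{-1}\hat\delta_{3,MLE}$. From
(5.3a) and (5.3c), $\hat\delta_{1,MLE}$ and $\hat\delta_{3,MLE}$ can be written as OLS estimators from the
regression of $\Delta y_t^2$ onto $z_t^1$, $z_t^3$ and $\hat\eta_t^1 = \Delta y_t^1 - \hat\Pi_{MLE}\Delta y_t^2$. Because $\hat\Pi_{MLE}$ and
$\hat\delta_{1,MLE}$ are consistent, $\sum y_t^1 y_t^{1\prime}$ is $O_p(T^2)$, and $\sum y_{t-1}^1 \Delta y_t'$ is $O_p(T)$, by direct
calculation,

(5.4)    $T(\hat\theta_{MLE} - \theta) = -\hat\delta_{1,MLE}^{-1}\, T\hat\delta_{3,MLE} = -\gamma_2^{-1}\big(T^{-1}\textstyle\sum_{t=2}^{T} a_t y_{t-1}^{1\prime}\big)\big(T^{-2}\textstyle\sum_{t=2}^{T} y_{t-1}^1 y_{t-1}^{1\prime}\big)^{-1} + o_p(1)$

where $a_t = \eta_t^2 - E(\eta_t^2 \mid \eta_t^1)$.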
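The recovery of the cointegrating coefficient from the coefficients of (5.3b) can be illustrated with a simulated bivariate system. This is a simplified sketch, not the MLE: it estimates (5.3b) by plain OLS without the additional residual regressor from (5.3a), and the parameter values (cointegrating coefficient 2.0, adjustment coefficients 0.2 and -0.5) are hypothetical choices that keep the error correction term stationary.

```python
import numpy as np

rng = np.random.default_rng(3)
T, theta = 500, 2.0
gamma1, gamma2 = 0.2, -0.5           # hypothetical adjustment coefficients

# Simulate the VECM with A(L) = 0; ec is the lagged error correction term.
y1 = np.zeros(T)
y2 = np.zeros(T)
for t in range(1, T):
    ec = y2[t - 1] - theta * y1[t - 1]
    y1[t] = y1[t - 1] + gamma1 * ec + rng.standard_normal()
    y2[t] = y2[t - 1] + gamma2 * ec + rng.standard_normal()

# OLS of the change in y2 on lagged levels, as in (5.3b).
X = np.column_stack([y1[:-1], y2[:-1]])
beta_hat, *_ = np.linalg.lstsq(X, np.diff(y2), rcond=None)
b1, b2 = beta_hat

# Recover the cointegrating coefficient from the two level coefficients.
theta_hat = -b1 / b2
```

In this design the population values are b1 = 1.0 and b2 = -0.5, so the ratio returns the cointegrating coefficient even though neither level coefficient is of interest on its own.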
The single equation estimators are obtained from the regression,

(5.5)    $y_t^2 = \theta y_t^1 + d(L)\Delta y_t^1 + e_t$

For finite order VECM's, $d(L)$ will typically be infinite order, so the finite
approximations used in Sections 2-4 will result in misspecified regression
equations. If, however, the order of $d(L)$ (say q) is such that $q \to \infty$ as $T \to \infty$ and
$q^3/T \to 0$, then the results of Berk (1974) and Said and Dickey (1984) suggest that
this misspecification will vanish asymptotically. With this interpretation, the
single-equation dynamic OLS estimator can be written,

(5.6)

where

T<»0LS-U> -

"^1

+ op<l)

t^ie l°

1 ^ 1 2

n 6

run component of e^, where

0*c(l)Sj.c(l)' is (2n times) the spectral density matrix of

at frequency zero.

A straightforward calculation demonstrates that the long run component of $e_t$
equals $-\gamma_2^{-1} a_{t+1}$, so that $T(\hat\theta_{MLE} - \hat\theta_{DOLS}) \to_p 0$. Thus, even though the VECM
likelihood cannot be factored as in (2.4), the single equation estimator is
asymptotically equivalent to the MLE for this model. An interpretation is that,
even though there are constraints across equations and across parameters that
appear when the triangular system (5.2) is derived from a VECM, these constraints
only involve coefficients that obey conventional $\sqrt{T}$ asymptotics. Thus,
asymptotically they convey no information about $\theta$, beyond that contained in the
unrestricted second equation.
Indeed this is a general property of regressions involving integrated regressors:
the asymptotic distribution of estimators of coefficients on canonical integrated
regressors remains unaltered when efficient estimators of coefficients on zero
mean I(0) regressors are replaced by consistent estimators. One example of this is
given by the MLE of $\theta$ for (5.1); asymptotically equivalent estimators can be
constructed as $-\tilde\beta_2^{-1}\tilde\beta_1$, where $\tilde\beta_1$ and $\tilde\beta_2$ are the OLS estimators from the
regression of $\Delta y_t^2$ onto $y_{t-1}^1$, $y_{t-1}^2$ and $\tilde\eta_t^1 = \Delta y_t^1 - \tilde\Pi\Delta y_t^2$, where $\tilde\Pi$ is any
consistent estimator of $\Pi$ in equation (5.3a).
Their asymptotic equivalence notwithstanding, it is useful to think of DOLS and
$\hat\theta_{MLE}$ as applying to two different models. For finite order VECM's, Johansen's
(1988a) estimator is the MLE, while for models that support the factorization
(2.4) with $d(L)$ finite order, the single equation estimators are the MLE.





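B. Examples of I(2) Systems. The following examples concern specification and
inference in general I(2) systems. To simplify exposition, all deterministic
terms are omitted and their coefficients are taken to be zero. From (3.1), the
general I(2) model is,

(5.7a)    $\Delta^2 y_t^1 = u_t^1$
(5.7b)    $\Delta y_t^2 = \theta_{2,1}^1 \Delta y_t^1 + u_t^2$
(5.7c)    $y_t^3 = \theta_{3,1}^1 \Delta y_t^1 + \theta_{3,1}^0 y_t^1 + \theta_{3,2}^0 y_t^2 + u_t^3$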

Some of the $\theta$'s can have rows of zero, or be zero, and the second block of equations might not be present at all (i.e. $k_2 = 0$). These possibilities are examined here by considering a series of special cases with $k_2 = 0$ or $k_2 = 1$; more general cases can be analyzed by combining these special cases. It is assumed that $\{d_{l,m}(L)\}$ in (3.3) have known finite order.

Case 1: $k_2 = 0$. Then (5.7b) does not appear in the system and $y_t^2$ is omitted from (5.7c). The dynamic OLS and GLS estimators of $(\theta^1_{3,1}, \theta^0_{3,1})$ are asymptotically efficient and inference is $\chi^2$.

Case 2:

k^^l, 0^ ^ ^nown and nonzero.
f

Then the estimation equation (3.4)

becomes,

_ 3
1
~3 2
where Eutus ' and Eutug ' are zero f°r all t9s.

2 1

2

the regressors A yt , Ay t




1
- 0 2

1

anc*

1
Because 0g ^ is known,

leads and lags are 1(0) with

19

1

1

mean zero, so these comprise zt .

2

Because yt and yt are CI(2,1), we can set

3
1 2
1 1
5 1
zt«(Ayt , yt-0 2 iYt)' and zt«yt (other assignments of z^ are possible
3
5
The coefficients on z^ and z^ are respectively

but yield the same results).
f3 /f3 r3N /al
.0
8 **(5-^, ^ 2 ™ ^ 3 i » ^

2

N
, c5 n0
'and o
^ 1

^3 2*^2 an<* ^3 1**^“^2 1^2*
(T,T2), 0^ i>

A
”^ ^ 2

,0
1^3 2 *

*1
*3
^3 i"*®i>

Because (i^, $"*) converge at rates

2 an<* ^3 1 individually converge at the rate T.

^ 3

Jointly, (0® \+&\ 1^3 2*^3 2 ’^3 1^ converge at rates (T2 ,T,T).

The estimators are efficient and inference is $\chi^2$.

Case 3: $k_2 = 1$, $\theta^1_{2,1}$ known to be zero. The estimation equation is (5.8) with $\theta^1_{2,1} = 0$. Leads and lags of $\Delta^2 y_t^1$ and $\Delta y_t^2$ are I(0) with mean zero, and these comprise $z_t^1$. Also set $z_t^3 = (\Delta y_t^1, y_t^2)'$ ($y_t^2$ is I(1)) and $z_t^5 = y_t^1$. Thus $(\hat\theta^0_{3,1}, \hat\theta^0_{3,2}, \hat\theta^1_{3,1})$ converge at $(T^2, T, T)$ and inference is $\chi^2$.

1

Case 4:
( ^ 3

k<£*l, 82 ^ unknown.

1

2

Although (Y ,Y ) are not weakly exogenous for

i » ® 3 2*^3 1^ *-n ^ i s case» the dynamic OLS and GLS estimators

nevertheless have desirable properties.

With 02 ^ unknown, the estimation

equation (5.8) becomes,

y\

(5.9)

-d3,2<1 >*2,l>Ayt + d3,2<1>Ayt + *3 ,lyt + «3,2yt

(d3(l(L)-d3j2(L)^2,l^A2yt + d3,2^L^A2yt +

where
either

2

1

(L)“ (1 *L)

(0 ) (if

0

2

( b )

2

^^'

^ j/0 ) or I(-l) (if

0

Because A^yJ: *-s

1

(0 ) an<* A^y^

^ ]_” 0 ) > and because both have mean zero,

their presence does not affect the asymptotic distribution of the other estimators
and they will be ignored in this discussion.

Whether or not 02 -j~0, a valid

1
2 1
1
3
1 2 1 1
assignment of zfc is zt-(Ayt - 0 2 j A y fc)', zt-(Ayt , yt-0 2 >iyt)'» and




20

zj^-y^.

Evidently 03 ^ is not identified from (5.9) alone; using a
1

consistent estimator of 02

1

from (5.7b) would result in loss of x

(although the resulting estimator would be consistent).
0^

2

2

inference

However, 03 ^ and

are s®Parately identified in (5.9) and individually converge at rate T.

3
5
Together, the coefficients on (zt , z^) have an asymptotic mixed normal
distribution.

Moreover, the distribution of (0®

case 2, when the true value of 0^

1

*-s known.

^3 2^

Thus (0^

t*ie same as *-n
^3

^ are

2

asymptotically efficient even if 0^ ^ is unknown, for general 0^
exception to this is the special case of

0

^ ^ known to be zero, in which case

Ay^ would not enter as a regressor in (5.8) were 0^ ^ known.
inference on (0 ^

2

The

Even here,

•

) Is

6. Monte Carlo Results

This section summarizes a study of the sampling properties of seven estimators of cointegrating vectors in three bivariate probability models. The data were generated by the model:

(6.1a)    $\Delta y_t^1 = u_t^1$

(6.1b)    $y_t^2 = \theta y_t^1 + u_t^2$

with $\Phi(L)u_t = \epsilon_t$, $\Phi(L) = I_2 - \Phi L$, and $\epsilon_t$ NIID$(0, \Sigma_\epsilon)$, where $u_t = (u_t^1 \; u_t^2)'$. The true drift in the series is zero. Because $u_t$ follows a VAR(1), $y_t$ follows a VAR(2). Under (6.1), $T(\hat\theta - \theta)$ is invariant to $\theta$ for all the estimators considered, so without loss of generality $\theta$ is set to zero.

The six estimators considered are the static OLS estimator (SOLS; Engle and Granger [1987], Stock [1987]), the dynamic OLS estimator (DOLS), the dynamic GLS estimator (DGLS), the zero frequency band spectrum estimator of Phillips (1988b) (PBSR), the fully modified estimator of Phillips and Hansen (1989) (PHFM), and Johansen's (1988a) VECM maximum likelihood estimator (JOH). t-statistics for JOH were calculated as described in Section 5(A). Two serial correlation-robust estimators of the covariance matrix of the DOLS estimator were considered, one based on a weighted sum of the autocovariances of the errors (DOLS1), the second using an autoregressive spectral estimator (DOLS2). A constant was included in all estimation procedures. The details of the construction of the estimators are given in the notes to Table 1.
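As a rough illustration of the design (6.1) and of the difference between SOLS and DOLS, the following Python sketch simulates one draw and computes both point estimates. It is a minimal sketch under stated assumptions, not the code used for Table 1: the function names, the burn-in length, the number of leads and lags (k = 2), and any parameter values passed in are illustrative choices.

```python
import numpy as np

def simulate(T, theta, Phi, Sigma, rng, burn=100):
    """One draw from (6.1): u_t = Phi u_{t-1} + eps_t, eps_t ~ N(0, Sigma).

    The burn-in length is an illustrative assumption.
    """
    n = T + burn
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
    u = np.zeros((n, 2))
    for t in range(1, n):
        u[t] = Phi @ u[t - 1] + eps[t]
    u = u[burn:]
    y1 = np.cumsum(u[:, 0])          # (6.1a): Delta y1_t = u1_t
    y2 = theta * y1 + u[:, 1]        # (6.1b): y2_t = theta y1_t + u2_t
    return y1, y2

def sols(y1, y2):
    """Static OLS of y2 on a constant and y1; returns the slope."""
    X = np.column_stack([np.ones_like(y1), y1])
    return np.linalg.lstsq(X, y2, rcond=None)[0][1]

def dols(y1, y2, k=2):
    """Dynamic OLS: adds k leads and k lags of Delta y1 to the static regression."""
    T = len(y1)
    dy1 = np.diff(y1)                # dy1[i] is Delta y1 at time i + 1
    t_idx = np.arange(k + 1, T - k)  # dates with all leads and lags available
    cols = [np.ones(len(t_idx)), y1[t_idx]]
    for j in range(-k, k + 1):
        cols.append(dy1[t_idx + j - 1])   # Delta y1 at time t + j
    X = np.column_stack(cols)
    return np.linalg.lstsq(X, y2[t_idx], rcond=None)[0][1]
```

Calling `simulate` repeatedly with a seeded `np.random.default_rng` and collecting `sols` and `dols` across draws gives a small-scale version of the kind of comparison reported in Table 1; standard errors and the other estimators are deliberately left out of this sketch.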
The design (6.1) parsimoniously nests several important special cases. First (case A), when all elements of $\Phi$ except $\phi_{11}$ equal zero and $\Sigma_\epsilon$ is diagonal, $\Delta y_t^1$ is strictly exogenous in (6.1b) and SOLS is the MLE. In this case, all the efficient estimators are asymptotically equivalent to SOLS, although they estimate nuisance parameters that in fact are zero. Second (case B), if the second column of $\Phi$ is zero and $\phi_{21} \neq 0$ or $\Sigma_\epsilon$ is not diagonal or both, then SOLS is no longer the MLE and does not have an asymptotic mixed normal distribution, but the DOLS, DGLS, and JOH estimators are correctly specified and are asymptotically MLE's (the difference again being the unnecessary estimation of some nuisance parameters). In this case, PBSR and PHFM are efficient if interpreted semiparametrically. Third (case C), for general $\Phi$ and $\Sigma_\epsilon$, JOH with one lag is the exact MLE and DOLS, DGLS, PBSR, and PHFM are asymptotically efficient when interpreted semiparametrically.
Results for cases A, B and C are reported in the respective panels of Table 1 for T=100 and 300. Panel A verifies that the estimation of the nuisance parameters per se in the efficient estimators does not substantially reduce performance in the special case that OLS is the MLE. Panel B explores the performance of the estimators in 22 models in which DOLS, DGLS, and JOH are correctly specified.

Even when ^21~^’ SOLS can have substantial bias; for

example, for T-100 and ^^--.90, the 5%, 50%, and 95% points of the SOLS
distribution are -.001, .076, .196.

The DOLS, DGLS, and JOH estimators eliminate this bias, although when the regressor exhibits strong serial correlation, the DOLS t-statistics tend to have heavier tails than predicted by the asymptotic distribution theory. The PBSR and PHFM estimators tend to have biases comparable to SOLS.

When this bias is small (for example when

21*®^ ’ their t-statistics

have approximately normal distributions.
The final case ($\Phi$, $\Sigma_\epsilon$ unrestricted) introduces two additional parameters, and it is beyond the scope of this investigation to explore this case in detail. Rather, case C is examined by generating data from a model relevant to the empirical analysis in Section 7, namely a bivariate model of log M1 velocity (v) and the commercial paper rate (r), estimated using annual data from 1904-1989 (earlier observations were used for initial lags) imposing a long-run interest semielasticity of -.10.$^{8}$ The estimated VAR(1) for the triangular system ($\Delta r_t$, $v_t + .10 r_t$) is reported in panel C of Table 1. The results for this system indicate large bias in SOLS and, to a lesser extent, in DGLS, PBSR, and PHFM. DOLS exhibits less bias and, not surprisingly since it is the exact MLE in this system, JOH is essentially unbiased. The dispersions of the distributions are comparable, except for the JOH estimator which has some large outliers for T=100. The $\chi^2$ approximation to the Wald statistic (testing $\theta = -.10$) works best for JOH, next best for DOLS2 and DGLS, less well for the remaining efficient estimators.
To interpret DOLS and DGLS results, it is useful to write (6.1) in the triangular form given in (2.3). Write the VAR(1) for $u_t$ as $\Psi(L)u_t = a_t$, where $\Psi(L) = \Sigma_\epsilon^{-1/2}\Phi(L)$ and $a_t = \Sigma_\epsilon^{-1/2}\epsilon_t$, so that $E(a_t a_t') = I$. Then $\Delta y_t^1$ has a univariate ARMA(2,1) representation and $c_{11}(L)$ in (2.3a) is given by $c_{11}(L) = \kappa(L)|\Psi(L)|^{-1}$, where $\kappa(L)$ is the first degree polynomial with roots outside the unit circle that solves $\kappa(L)\kappa(L^{-1}) = \Psi_{12}(L)\Psi_{12}(L^{-1}) + \Psi_{22}(L)\Psi_{22}(L^{-1})$. The projection of $y_t^2 - \theta y_t^1$ onto $\{\Delta y_t^1\}$ is $d(L)\Delta y_t^1$; for this design $d(L) = -[\Psi_{21}(L)\Psi_{22}(L^{-1}) + \Psi_{11}(L)\Psi_{12}(L^{-1})][\kappa(L)\kappa(L^{-1})]^{-1}$. Finally, the residual from this regression, $c_{22}(L)\epsilon_t^2$ in (2.3b), follows an AR(1) with $c_{22}(L) = \kappa(L)^{-1}$. Thus $\kappa(L)$ dictates both how quickly the coefficients on leads and lags of $\Delta y_t^1$ in the DOLS/DGLS regressions die out and the degree of serial correlation in the regression error. In cases A and B, $\kappa(L) = 1$, and the DOLS/DGLS regressions have no omitted variables. In case C, $\kappa(L) = 1 - .66L$ so the true d(L) is infinite order and the DOLS/DGLS regressions omit leads and lags of $\Delta y_t^1$.
The results from the experiments can be summarized as follows. First, SOLS is biased in almost all trials, with nonstandard distributions for the estimator and test statistics. Second, DOLS and DGLS are unbiased for cases A and B, but exhibit bias in case C. The relatively large root of $\kappa(L)$ suggests that the bias is attributable to the truncation of d(L) in the DOLS/DGLS regressions.$^{9}$ Third, in results not shown in the table, doubling the number of leads and lags for DOLS and DGLS and the order of the AR correction for DGLS has little effect in cases A and B and reduces the bias in case C.$^{10}$ Fourth, the PBSR and PHFM bias has the same sign as, but is somewhat less than, the SOLS bias. A possible explanation is that both PBSR and PHFM rely on initial biased SOLS estimates of $\theta$, which result in inaccurate spectral density estimates subsequently used to compute PBSR and PHFM. Fifth, for case C (where the error is highly serially correlated) the autoregressive spectral estimator used in DOLS2 produces a more normally-distributed t-statistic than does the kernel estimator used in DOLS1. Sixth, tripling the sample size noticeably improves the quality of the asymptotic approximations.
This modest Monte Carlo experiment suggests four conclusions. First, all the estimators (except the correctly-specified JOH) exhibit bias in some of the simulations, although the bias is in each case less than for SOLS: no single estimator is a panacea. Second, the distributions of the t-ratios tend to be spread out relative to the normal distribution, suggesting that the usual confidence intervals will overstate precision. Third, in case C each estimator has shortcomings: the DGLS, PBSR, and PHFM estimators are substantially biased, and the JOH estimator, while unbiased, has an empirical distribution with a much greater dispersion than the other efficient estimators; DOLS has the lowest RMSE. Fourth, of the two procedures for computing the covariance matrix, the autoregressive estimator produces t-statistics that are more normally-distributed than does the kernel estimator. For this reason, the DOLS standard errors reported in the empirical analysis in Section 7 are based on the autoregressive covariance estimator.
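The two covariance procedures compared above differ in how they estimate the long-run variance of the regression error. The sketch below contrasts a weighted sum of autocovariances with an autoregressive spectral estimator at frequency zero. It is only an illustration: the Bartlett weights, the truncation lag `m`, and the AR order `p` are assumptions made here, and the exact choices behind DOLS1 and DOLS2 are those described in the notes to Table 1.

```python
import numpy as np

def lrv_kernel(e, m):
    """Long-run variance as a weighted sum of autocovariances.

    Bartlett weights 1 - j/(m+1) are an assumption of this sketch.
    """
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    T = len(e)
    lrv = e @ e / T                              # gamma_0
    for j in range(1, m + 1):
        gamma_j = e[j:] @ e[:-j] / T             # j-th sample autocovariance
        lrv += 2.0 * (1.0 - j / (m + 1.0)) * gamma_j
    return lrv

def lrv_ar(e, p):
    """Long-run variance from an AR(p) fit: sigma^2 / (1 - sum of AR coefficients)^2."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    T = len(e)
    Y = e[p:]
    # Column i holds the i-th lag of e, aligned with Y.
    X = np.column_stack([e[p - i:T - i] for i in range(1, p + 1)])
    a = np.linalg.lstsq(X, Y, rcond=None)[0]
    resid = Y - X @ a
    sigma2 = resid @ resid / len(Y)
    return sigma2 / (1.0 - a.sum()) ** 2
```

When the error is highly persistent, the kernel estimator needs a long truncation lag to capture the slowly decaying autocovariances, whereas the autoregressive estimator captures them through the fitted AR coefficients. This is one plausible reading of why the autoregressive version behaves better in case C.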

7. Application to the Long-Run Demand for Money in the U.S.

This section addresses two questions. First, is there a stable long-run M1 demand equation spanning 1900-1989 in the United States? Second, what are the income elasticity and interest semielasticity, and how precisely are these estimated? The long-run demand for money plays an important role in the quantitative analysis of the effects of monetary policy. Unfortunately, estimates of long-run income and interest elasticities obtained using postwar data have been sensitive to the sample period and specification (see the reviews by Laidler [1977], Judd and Scadding [1982], and Goldfeld and Sichel [1990]). In his review of this research and of early work by Meltzer (1963), Lucas (1988) presented informal but highly suggestive evidence that this apparent sensitivity resulted not from a breakdown of the prewar long-run M1 demand relation, but from a paucity of low frequency information in the postwar data. This section examines Lucas's interpretation using the formal econometric techniques for the analysis of cointegrating relations developed in this paper and elsewhere. Our analysis focuses on the annual data studied by Lucas (1988), extended to cover 1900-1989, although for comparison with other studies selected results using postwar monthly data are presented as well.$^{11}$

A. Results for annual data. The annual time series are M1 (in logarithms, m), real net national product (in logarithms, y), the net national product price deflator (in logarithms, p), and the commercial paper rate (in percent at an annual rate, r).$^{12}$ Real M1 balances (m-p, plotted with y in Figure 1a) grew strongly over the first half of the century, but experienced almost no net growth over most of the postwar period. Over the entire period, velocity (y-m+p) and r (plotted in Figure 1b) exhibit strikingly similar long-run trends, dropping from the 1920's to the 1930's, growing from 1950 to 1980, then declining after 1981.

Inspection of these figures suggests that real balances, output, velocity and interest rates might be well-characterized as being individually integrated, and formal tests support this view. Specifically, the following characterizations appear consistent with the observed series: m-p is I(1) for the full sample (and both halves of the sample), with drift; r is I(1) with no drift; y is I(1) with drift; and (m-p), y and r are cointegrated. Whether p and m are individually I(1) or I(2) is unclear: the inference depends on the subsample and the test specification. The evidence suggests, but is not conclusive, that $r - \Delta p$ is I(0). Because $r_t$ is nonnegative, characterizing $r_t$ as I(1) raises conceptual difficulty. This decision is driven by the empirical evidence that $r_t$ exhibits considerable persistence, and is consistent with interest rate specifications used by other researchers (e.g., Campbell and Shiller [1987] and Hoffman and Rasche [1989]).
The applicability of the DOLS and DGLS estimators to I(1) and I(2) systems makes it possible to estimate $\theta_p$ in the cointegrating relation, $m = \theta_p p + \theta_y y + \theta_r r$, and to test whether $\theta_p = 1$.$^{13}$ We consider three specifications. First, if m and p are I(1), then (m,p,y,r) constitute the I(1) system analyzed in Section 2 with one cointegrating vector, modified for nonzero drifts, and inference is $\chi^2$.

Second,

if m and p are 1(2) and (r,Ap) are not cointegrated, then this is an 1(2) system
with y*-pt , yt~<rt yt^’ and yt“mt ’ where *2 ,1 “° ’ *3 ,1 ~0 ’ fl3 ,l“ (*y
and 0® l"^p-

’

This is case 3 in Section 5(B), and inference on (0y , 0r , 0^)

• Third, if p is 1(2) and r-Ap is 1(0), this is a

using DOLS or DGLS is

1

2

-

combination of cases 2 and 3 in Section 5(B), with yt~Pt > yt=(A
*2,1-(1’0)'’ *3,l“ (*r’0)'’ *3 ,l“V

B\ iAyJ-(rt-Apt,Ayt)'.

and 0°>2 «(O,0y )'.

1

rt ,yt),

Then Ay*-

Here, the cointegrating vector (1,-1) is imposed on ($r_t$, $\Delta p_t$) as implied by the elementary economic hypothesis that the real interest rate is stationary. Again, inference using DOLS or DGLS is $\chi^2$.

Estimates for the four-variable system are reported in Table 2 for these three specifications.$^{14}$ The estimates of $\theta_p$ do not differ from one at the 10% (two-sided) level in any of the specifications. In all but two cases, $\theta_y$ is statistically indistinguishable from 1 at the 5% level, and in the two exceptions $\theta_y$ is estimated imprecisely. To be consistent with theory and with the rest of the money demand literature, we henceforth impose $\theta_p = 1$ and study in more detail the estimation of $\theta_y$ and $\theta_r$.

Estimates of M1 demand cointegrating vectors in the system (m-p,y,r) are presented in panel A of Table 3. The estimators are those studied in the Monte Carlo experiment, plus the single-equation nonlinear least squares estimator (NLLS), which is used by Baba, Hendry and Starr (1990) to estimate their long-run M1 demand equation. The full-sample estimates are similar across estimators and none of the efficient estimators reject the hypothesis that $\theta_y = 1$ at the 10% two-sided level. Using only the first half of the sample, the efficient estimators provide smaller income elasticities and larger interest elasticities, but this difference is modest. In sharp contrast to the first-half estimates, the postwar estimates in Table 3 differ greatly across estimators. The SOLS estimate is close to zero, the NLLS elasticities have the "wrong" sign, and the JOH estimator is highly sensitive to the number of lagged first differences used.$^{15}$
The final set of estimates refers to the system (m-p,y,r*), where r* is the smoothed commercial paper rate.$^{16}$ A smoothed interest rate is used for two reasons. First, the empirical money demand literature is indecisive on whether a long- or short-term interest rate is most appropriate. Because there is no consistent risk-free long-term rate with constant tax treatment over the full sample, r* can be interpreted as a proxy for a long-term rate which, under the risk-neutral theory of the term structure, is an average of current and expected future short-rates. Second, r* can be viewed as (indeed is constructed as) an estimate of the permanent component in interest rates. The cointegrating vector relates the permanent components of m-p, y, and r; to the extent that r is a particularly noisy measure of its permanent component, the cointegrating regressions will suffer from a small-sample version of errors-in-variables bias, and using r* could reduce this bias. The results in Table 3 are for a two-sided smoother, but they are typical of results for other smoothed rates. The full-sample estimates change only slightly using r*. The postwar income elasticities and standard errors are larger with r* than r. The JOH and NLLS estimates are quite sensitive to using r*,$^{17}$ and the differences across point estimates remain.
The differences between the prewar and postwar estimates raise the possibility that there has been a shift in the long-run money demand relation. To evaluate this (and to explore the source of the instability in the postwar estimates), we examine four related pieces of evidence. The first consists of formal tests of the null hypothesis of a constant cointegrating relation, against the alternative of different cointegrating vectors over 1900-1945 and 1946-1989, under the maintained hypothesis that the parameters describing the short-run relations are constant, using the DOLS estimator. The results, given in panel B of Table 3, are not conclusive. Although two of the four specifications reject constancy at the 10% level, only one rejection is at the 5% level and the shift parameters $\delta_y$ and $\delta_r$ are imprecisely estimated.
Second, 95% confidence regions for ($\theta_y$, $\theta_r$) implied by the point estimates in panel A of Table 3 generally overlap, or nearly overlap, near $\theta_y = 1.00$ and $\theta_r = -.11$. These regions are plotted in Figure 2a-2d for, respectively, the DOLS, DGLS, PBSR, and PHFM estimators.$^{18}$ For each estimator, the only nonoverlapping region is for the postwar estimator based on r; the major axes of the prewar and postwar ellipses are approximately orthogonal; and the confidence region for the full sample is much smaller than for either half.$^{19}$

The third piece of evidence concerns the properties of the cointegrating residuals, $z_t = m_t - \theta_y y_t - \theta_r r_t$. These residuals exhibit quite different properties for the different point estimates: residuals constructed using either the full-sample or first-half point estimates are consistent with cointegration, while the residuals based on the postwar estimates are not. As shown in Table 4, $z_t$ based on the full-sample point estimates appears stationary. When the cointegrating vector is estimated over the first half and $\hat\tau_\mu$ is computed over the second half, asymptotically $\hat\tau_\mu$ has the standard univariate Dickey-Fuller (1979) distribution: using $\hat\tau_\mu$, non-cointegration is rejected in the postwar data at the 5% one-sided level for the DOLS prewar cointegrating vector. In contrast, the residuals from the postwar money demand equations exhibit more serial correlation but lower variance than the residuals constructed using the prewar or full-sample estimates.
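The residual-based check described above can be sketched as follows: form the cointegrating residual from a given pair of elasticities, then compute a Dickey-Fuller t-statistic with a constant. This is a simplified illustration with hypothetical function names. It uses no lag augmentation, and the statistic must still be compared with Dickey-Fuller critical values, which are not reproduced here; the statistics actually reported in Table 4 may be constructed differently.

```python
import numpy as np

def coint_residual(m, y, r, theta_y, theta_r):
    """z_t = m_t - theta_y * y_t - theta_r * r_t for given elasticities."""
    return np.asarray(m, float) - theta_y * np.asarray(y, float) - theta_r * np.asarray(r, float)

def df_tstat(z):
    """Dickey-Fuller t-statistic from regressing Delta z_t on a constant and z_{t-1}.

    No lagged differences are included (an assumption of this sketch).
    """
    z = np.asarray(z, dtype=float)
    dz = np.diff(z)
    X = np.column_stack([np.ones(len(dz)), z[:-1]])
    beta = np.linalg.lstsq(X, dz, rcond=None)[0]
    resid = dz - X @ beta
    s2 = resid @ resid / (len(dz) - 2)       # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)        # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])
```

In the setting of the text, the elasticities would be estimated on the first half of the sample and `df_tstat` applied to the residual over the second half, so that the estimation error in the cointegrating vector does not affect the null distribution of the statistic.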
Fourth, the postwar VECM likelihood, concentrated to be a function of ($\theta_y$, $\theta_r$), is bimodal for both the r and r* data sets. Moreover the JOH point estimates are quite sensitive to the number of nuisance parameters estimated (number of lagged first differences included). Inspection of the concentrated likelihood, plotted in Figure 3 for 3 lags (JOH(3) in Table 3A), indicates two conclusions: that the JOH MLE's for 2 and 3 lags lie on a ridge that corresponds to the major axis of the postwar confidence ellipses in Figure 2, and that the likelihood is not well approximated as a quadratic. This explains, in a mechanical sense, the instability of the JOH estimates with respect to the lag length, and suggests that the JOH estimator might be poorly approximated as normally distributed.
These four pieces of evidence lead us to conclude that, despite the apparently large differences in the prewar and postwar point estimates, the evidence against Lucas's (1988) interpretation of a stable long-run money demand relation is weak, and indeed that the best summary of the evidence is that long-run M1 demand has been stable over 1900-1989. Using the postwar data alone, the elasticities are imprecisely estimated. The postwar data are dominated by the 1950-1980 trends in velocity and interest rates; as Lucas (1988) pointed out, this requires the estimates to lie on the "trend line" given by $\Delta(m-p) = \theta_y \Delta y + \theta_r \Delta r$ (where $\Delta y$ is the average annual growth rate of $y_t$, etc.). This line constitutes the major axis of the postwar confidence ellipses in Figure 2 and the ridge in the postwar VECM likelihood in Figure 3. Several such trend lines (or low frequency movements) can be drawn from the prewar sample, resulting in tighter confidence regions when only the 1900-1945 sample is used. When the 1900-1945 and 1946-1989 subsamples are combined, the 1900-1945 and 1946-1989 trend lines solve for point estimates $\theta_y = 1.000$ and $\theta_r = -.145$. Because the efficient estimators of cointegrating vectors exploit this same low-frequency information, albeit in a more sophisticated way, the sampling uncertainty of the full-sample estimates is much smaller than that based on the prewar and especially the postwar data.
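The trend-line argument is simple arithmetic: each subsample contributes one linear equation in ($\theta_y$, $\theta_r$), and two subsamples with different trends pin down both parameters. The sketch below solves such a pair of equations. The function name and the growth rates used in the usage note are purely hypothetical illustrations, not the sample averages from the data studied here.

```python
import numpy as np

def solve_trend_lines(period_a, period_b):
    """Solve two trend-line equations for (theta_y, theta_r).

    Each argument is a tuple
        (avg growth of m-p, avg growth of y, avg change in r)
    and each period imposes
        growth(m-p) = theta_y * growth(y) + theta_r * change(r).
    """
    A = np.array([[period_a[1], period_a[2]],
                  [period_b[1], period_b[2]]], dtype=float)
    b = np.array([period_a[0], period_b[0]], dtype=float)
    return np.linalg.solve(A, b)
```

For example, with made-up averages `period_a = (0.035, 0.03, -0.05)` and `period_b = (0.02, 0.03, 0.10)`, the two lines intersect at an income elasticity of 1.0 and an interest semielasticity of -0.1. A single period gives only one equation, which is why estimates from a sample dominated by one trend are confined to a line rather than a point.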




B. Results for postwar monthly data. Cointegrating vectors estimated using postwar monthly data on M1, real personal income, the personal income price deflator, and a variety of interest rates are reported in Table 5, panel A. Compared to the postwar annual results, the income elasticities estimated over 1949:1-1988:6 are higher and there is somewhat less disagreement across the efficient estimators, with income elasticities ranging from .30 to .89 based on the commercial paper rate. The estimates are stable across the choice of interest rate (the exception is the DGLS estimates, for which GLS effectively first-differences the data, as in the postwar annual estimates). The point estimates agree closely with Baba, Hendry and Starr's (1990) NLLS estimate of .5 obtained over 1960-1988, strikingly so since they used GNP rather than personal income, quarterly rather than monthly data, and several additional regressors designed to account for shifts in the short-run money demand relation.$^{20}$

Although the point estimates are not sensitive to the start date of the regression, they are quite sensitive to the final regression date. For example, JOH estimates of the income elasticity, estimated over 60:1 to the last month in each quarter from March 1984 through June 1988 using the commercial paper rate (8 lags), range from -3.00 to 3.54; for the NLLS estimator, this range is .29 to 1.08. When computed over 60:1-78:12, the JOH, NLLS and DOLS income elasticities are -.27, -.13, and .11. Comparable instability is present for each of the interest rates studied in Table 5, whether estimated in logarithms or in levels.

Because we do not provide uniform critical values for tests based on these "recursive" estimates, this observed instability does not provide formal evidence on the stability of the cointegrating vector estimated with the postwar data. This sensitivity to terminal regression dates is, however, consistent with our interpretation of the annual data. Specifically, the data from 1950 to 1982 are dominated by the single upward trend in real balances, income and interest rates, which results in estimated income and interest elasticities that are strongly negatively correlated and are imprecisely estimated, except that they must fall on the trend line. Only with the most recent data, which reflect the second trend (increasing income, declining velocity and interest rates), are the estimates more precise with values that are comparable across estimators.

C. Discussion and Summary. This analysis is restricted in several regards. Only one monetary aggregate has been considered, M1. Much of the money demand literature has focused on the search for a stable short-run demand function, an issue avoided here altogether. The analysis has relied heavily on asymptotic distribution theory to construct formal confidence intervals and tests, and the estimation procedures typically entail the estimation of many nuisance parameters relative to the sample size. Although we have only limited evidence, this leads us to suspect that the precision of the foregoing results is overstated.
Even with these caveats, these results suggest three conclusions. First, when viewed over 1900-1989, there appears to be a stable long-run M1 demand function. Estimated over the entire sample, 95% confidence intervals based on the DOLS estimator are, for the income elasticity, (.90, 1.03), and for the interest semielasticity, (-.124, -.088). Similar intervals are obtained using the other efficient estimators over the full sample.

Second, our results are consistent with Lucas's (1988) suggestion that there is a stable long-run money demand relation over the pre- and postwar periods. A key piece of evidence for this is the apparent stationarity of the postwar residuals computed using the first-half estimates of the cointegrating vector.
Third, in isolation the postwar evidence says little about the parameters of the cointegrating vector: the estimates have large standard errors and moreover are sensitive to the subsample and estimator used. The main reason for this is that the postwar data are dominated by steadily rising income and interest rates and effectively no growth in real balances; only after 1982 is there a decline in interest rates that reduces multicollinearity sufficiently to estimate the money demand relation. We suspect that the postwar standard errors understate the sampling variability, particularly for the monthly results, because of this sensitivity to terminal dates, some evidence that the large-sample mixed normal distribution provides a poor approximation to the postwar sampling distributions, and the presence of this problem in the Monte Carlo analysis in Section 6.

8. Conclusions

The empirical investigation suggests some observations that might apply more generally beyond this particular application. First, the precise estimation of long-run money demand appears to require a long span of data: estimates over the full span are more precise than over the first half of the century alone, and the data since 1946 contain quite limited information about long-run money demand when viewed in isolation. Second, the use of several efficient estimators is a valuable check of the sensitivity of the estimates to changes that should be asymptotically negligible. In the case of postwar money demand, the sensitivity of the postwar estimates to the choice of estimator and to the estimation period drew attention to the low-frequency multicollinearity between income, velocity and interest rates in the postwar data.

Appendix

Proof of Lemma 2.1.
The proof is a modification of Anderson's (1971, Theorem 7.6.7) and Rozanov's
(1967, Chapter 2.3) proofs of the Wold decomposition.
i—1,2, form Hilbert spaces.
2

Note that

Then u^“cn(L)€^ is the Wold representation of

1

1

1

and by construction ct is the innovation process for ut .

^t ® ^t-1* so t^iat

and

construction

Let

forms a basis for D^.

p<utl*i>“p <utlUs~«Ds)“5 — '
»C2 1 j £t-j“c2

1

Then

(L>€t> where c2 1 j“

Now at_ut-p<uti®i)“ut-c 2 i(L)et is
~2 2
stationary, has E(ut) <», and is linearly regular, and so possesses the Wold

~2

decomposition utsssC2

2

2

2

2

2

^F^ €t ’ where €t~ut-P(ut 1 ^

1

2

2

<D ^t- 1 ^*
- 2

1

2

construction, ct is the innovation process for ut , E€t« 0 , E€tes
all t,s, and Ec^c^'-E.^ for t-s a**d equals 0 otherwise.

' = 0

for

Finally, c(L) is

square summable because Eu^u^'^ by assumption. □

Derivation of (3.1).
Assume that the nxl vector yt has the Wold representation A<*yt — A* + F^(L)at ,
where (i) at is a martingale difference sequence with E(atat '|at ^ ,at_2 »•••) and max^suptE(a^t)<®, (ii) as~0 for s<0, (iii)
Zj«oJd |Fjl<00'

F ^ L ^ ^ J ^ q Fj L^ , with

F^(e’^w ) is nonsingular for t^O, and (v) rank[F^(l) ]=k^<n.

The triangular representation (3.1) is constructed by repeated application of the
following Lemma:

Lemma A.l.
Axt -

Assume that the nxl vector xt is generated by
+ F(L)at , where at satisfies (i) and (ii), F(L) is Z-

summable and satisfies (iv), and rank[F(l)]=k<n.

Without loss of generality

arrange xt so that the upper kxn block of F(l) has full row rank.




Then xt

can be represented as:
Axt -

+ D 1 (L)at

“

4

+ *xt + D 2 (L)at
1

2

1

2

where xt«(xt ' xt ')', where xt is kxl, xt is (n-k)xl, and
D(L)*[D^(L)' D 2 (L)']' is (i-1) summable.

Proof.

When /im lies in the column space of

The result holds trivially for k-n, so consider k<n.

Order xt so that

F(L) can be partitioned as F(L)=[F^(L)' F2 (L)']' where F^(L) is kxn, F 2 (L) is (nk)xn, and F^(l) has full row rank.
for some kxr matrix 6.

(A. 1)

Because F^(l) has full row rank, F2 (l)-0 'F-^(l)

Now partition

as (/*£ ^

so that

Ax* - 0'AxJ; - S _ o O * 2 ,i-*>l,i>ti + [F2 <L>-,9 'Fl<L)]af

Accumulating (A.l) yields x^ - 0'x^ -

+ ^ 2 ^ ^ at ’ w^ere D 2 (L)

- F^CL) - 6 'F*(L) , where F*(L) - (1-L)'1 (Fi(L)-F^l) ) , i-1,2.

"fc

is H summable, F^(L) is (i-1) summable.
then 1*2

m*® so

^ 1

^ 2

m+ 1 *®*

Because Fi(L)

If /im lies in the column space of F(l) ,

T^e Lemma f ° H ° ws by setting / i ^ »

(i=0,...,m) and D 1 (L)=F1 (L). □

To construct the triangular representation (3.1), apply Lemma A.l to xt=A^"^yt
to yield the decomposition:

ad ~ 1

~

A yt
.d-l-

A

^d■1

“ n , o + Fi
2

-

-

/. \

(L)at
„ ^ .d-l/Ad-l-lN

y t " ^2,0 + 1* 2 , 1 *

+

°

2 ,1 ( A

+ Ff‘1 (L)at

y t>

~1

-2

where yt has been partitioned into k^xl and (n-k^)xl components yt and yt .
Now assume that F^’^(1)-[F^ ^(1)' F^ ^(1)']' has rank kj^^^n, and apply the




lemma to xt - [Ad "^y^f (Ad’2 y 2 -0d pAd ~^y\) 1 •

Continuing this process yields the triangular representation (3.1), with $u_t^j = D_j(L)a_t$, j=1,...,d+1, where rank$[D_j(1)] = k_j$ for j=1,...,d. While $D = [D_1(1)' \; D_2(1)' \; \ldots \; D_d(1)']'$ has full row rank by construction, nothing so far ensures that $D_{d+1}(1)$ is linearly independent of the rows of $D$. If it is, then the construction is completed. If it is not, then redefine the variables $y_t$ to be $\Delta^{-1}y_t$ and d to be d+1, and repeat the construction until $[D' \; D_{d+1}(1)']'$ has full rank, so that $u_t$ is I(0) with a full rank spectral density matrix. This yields (3.1) for variables of arbitrary finite orders of integration and cointegration.

Proof of Theorem 4.1.
First consider the infeasible GLS estimator
than $(L) .

constructed using $(L) rather

-1 ,
Note that (Tt ®I)(6 q LS-£) - <T ^T
“•p,
> where Q,

(T^®I)Xtztz£(T^®I) and <f>^ - (T^®I)£tztet , with zt - [zt®$(L)'] and
e^ - $(L)et .

(Unless otherwise stated, the remaining identity matrices have

dimension k^ so this subscript is suppressed.)

The convergence of Qppp- to

follows from a standard application of the weak law of large numbers.
with i or j >

q1jt -

2

For Qpjp

:

aii®i)Et[ZSU(4-m ®v ) [ ig .0(4-h ®
- < T i^ t[2 -o a -o < 4 -» 4 :h ®
- < T ^ i> x t [ 2 ^ I g . 0( 4 4 ’ ® »;% )i(T ji® D + op <i)
-> (Vtj ® 0 ^ )

where the last two lines follow from Lemma 1 of Sims, Stock and Watson (1990)
(SSW) and $(1)
h i

For <f>iT, i > 2:

8

-

- (T‘^®I)Xt(z^ ® $( 1 )')]Z’^




+ Opd)


J /i<Gmm (

1 ) s ( m

' 2 ) / 2

® *<l),)‘W

2

<s). ">-2,4,6.... 2 i

u i (Gmm(1 )Wim "1)/2(s) ® $(l)')dW2 (s), m-3,5,7, ...,2i-l
where the last line follows from Lemma 1 of SSW. The joint convergence and
distribution of ^ follows from SSW Lemmas 1 and 2.

To prove that the feasible GLS

estimator has the same limit, let
qt

- (Tx1®i)lti l % < Zt.m ® *;>n2-o<*t-h ® ^)i'(Ti1®i>

so (Tj1®!)

(SG h s-S G h s)

j=l,...,q.

- 0T (^T -^T ) + (0t -Qt)*t - Assume that

B

for

Because Qrj, - Q ^ 0, Q^, - Q 5 0, Q is a.s. invertible, and

0, GLS and feasible GLS are asymptotically equivalent.

^

□

Proof of Theorem 4.2.
(a)

By assumption, Cj j (L) is d+l-j summable for j-1,2,...,d+l.

This implies that

the diagonal entries Gjj(L) of G(L) corresponding to the stochastic elements,
in

from equation (3.7) are j summable.

The theorem then follows from

Lemma 1 of SSW. □

1
1
^
1
—
Theorems 4.1 and 4.2 imply that TlT^tztzt'T*T * 0 •

(b)

consider the infeasible GLS estimator

6

First

GLS, defined in the proof of Theorem 4.1.

Theorem 4.1 implies that
(T*t ® I)(«*GLS-S*) - B;£[(T;£®I)Xt(zt®S(L)')E‘^ ]
where

B*x - (T^®I) [£t(z*®$(l) ') (z*'®$(l) ) ]( T ^ I ) .

+ op (l)
Now

B*T “ (T;i®IH I(g-gl)ki®$ <1 >'$ <1 > > ^ t (ztzt'®I>l(T*T®I>
SO

K t - [(T*x®I)Et(z*®I)(z*'®I)(T^^sl)]*1 tl®$(l)'$(1 )]_1.
Also,

( t * t®d ( S * o l s - « * ) - [(T ;^® i)X t ( z *® i) ( z *'® i) ( T ;^ ® i) ] * 1





x (T;J»I)Xt(z*®I)Cii(L)«^ + op (l)
- B*£(I®$(1) '$(1) ) (T;^®I)][t;(z*®I)cii(L)e^ + op (l)

- B;^(T;^®i)j;t(z*®$(i)'$(i))cii(L)£^ + opd) .
Thus
(T*t ®I) (^*OLS”^*GLS^ “ ®*T^ *x®^)

'^(1 )

- It<z*®S(L)')2 - ^ }

€fc

+ op (l)

- B;J(T;^®i)5;t{(z*®$(i)')$(i)[cXi(i)£^ + c ^ ( l )a c ^]
- ( z * ® * d ) ' ) S ^ £t + [z*®($(l)-$(L))']S^} + Opd)
“

A£j)

+

O p d )

-H

where the final equality follows from $d)cjjjj(l)-£^, and where
^(D-d-D'^KD-Kl)),

c^LHl-D'^c^aj-c^d)),

and

A 1T - (T;^®I)Xt(z*®$(l)'$(l))c*i(L)A£^

A2t - (T;^®I)Xt[z*®d(l)-$(L))']S^e^
“ -It [(T;^Az*)®$*(L)')S^e^ •

Because

-> Q* (the (*,*) block of Q given in Theorem 4.1), the result follows

if A^t ^ 0 and A 2 ^ ^ 0.

Because $*(L) has a finite order by assumption and

E(€^| {z*} )*=0, standard telescoping arguments imply that A ^ 5 0.

In addition,

^2T ^ 0 as a consequence of the results for

□

in Theorem 4.1.

Proof of Theorem 4.3.
The result follows from Theorem 4.1 above and Theorem 4 of Johansen (1988a) or
alternatively from section 4 of Phillips (1988a). □

Proof of Theorem 4.4.
This follows directly from Theorem 4.3 and the proof of Theorem 4.2. □

Footnotes

1. Since submitting this paper it has come to our attention that the estimator
proposed here was independently developed by Phillips (1988a) (also see Phillips
and Loretan [1989]) and Saikkonen (1989). The earliest reference of which we have
become aware is Hansen (1988). The current paper extends previous results to
higher, differing orders of integration, handles deterministic time trends, and
applies the results to the estimation of long-run money demand.
2. Similar results hold in the Gaussian model with explosive roots, see Domowitz
and Muus (1988).
3. This lemma has antecedents (but to our knowledge no previous formal statement
and proof) in the literature on optimal filtering, for example Whittle (1983)
Chapter 5 or Brillinger (1980) Section 8.3, or Sims' (1972) discussion of Granger
causality.
4. Although the construction leading to (2.5) makes (2.5) unique, alternative
triangular representations exist. For example, it is possible to construct a
one-sided triangular representation analogous to (2.5), except that
will not be
innovations of u£ and in general c-^(z) will not be invertible. Such a
representation was derived by Hansen, Roberds and Sargent (1990) to study
restrictions on the consumption and labor income process implied by the balanced
budget constraint.
5. Johansen (1988b, 1990) studied the restrictions on the coefficients of vector
autoregressions implied by the existence of cointegration in higher order systems.
Johansen (1988b) examined systems with, in Granger and Engle's (1987) terminology,
cointegration of the form CI(d,b), where d^b. As Johansen (1990) points out, this
excludes cointegration of the general form (3.4), which generalizes what Granger
and Lee [1988] term "multicointegration". Johansen (1990) complements our
derivation, since it explicitly handles multicointegration; it relates
multicointegration to restrictions on the parameters of the levels VAR, whereas
the current derivation refers to the moving average representation of the d-th
difference.
6. This result has recently been provided by Saikkonen (1989) in the d=1 case.
7. Equations (5.3) and (5.4) provide an easy way to construct standard errors for

^MLE’ namely from (5.4) or alternatively from the usual NLLS formula from the
2

regression of Ayt onto

1

2

*

1

1

2

yt_^, ^t*"^t‘®MLE^t^ *

extends

directly to the case of general finite-order A(L) by including lags of Ayt in the
regression.

As discussed below, an asymptotically equivalent estimator of 6 and

its standard error can be constructed by replacing
of

by any consistent estimator of II.





usec* *-n

construction

8. See Section 7 for a description of the data.

9. This interpretation is supported by an additional Monte Carlo experiment in
which $\Delta y_t$ was replaced by $[\kappa(L)\kappa(L^{-1})]\Delta y_t$. (Of course in an empirical
application $\kappa(L)$ would be unknown.) This eliminates nearly all of the bias: for
T=100, the bias falls from .026 to -.006 for DOLS and from .045 to -.008 for DGLS.
10. For Model 3 and T=100, the DOLS bias was reduced to .007. The main effect of
doubling the number of leads and lags and order of the autoregression for Models 1
and 2 was to increase the dispersion of the t-statistic; for example, for T=100
$t_{.95} - t_{.05}$ for DOLS1 increased from 3.86 to 4.13. A similar increase in dispersion
of the distribution of JOH t-statistics occurred when the number of lags in the
VECM was doubled.
11. Most empirical analyses of money demand predate the literature on
cointegration. Exceptions are Hoffman and Rasche (1989), who apply Johansen's
(1988a) estimator to monthly U.S. M1 data from 1953 to 1987, and Johansen and
Juselius (1990), who apply Johansen's (1988b) procedure to the long-run demand for
money in Denmark and Finland. Baba, Hendry, and Starr (1990) focus on short-run
U.S. M1 demand (1960-1988, quarterly), but a preliminary step is their estimation
of long-run M1 demand using a single equation error correction model (the "NLLS"
estimator). With the same purpose and methodology, Hendry and Ericsson (1990)
present results for the U.K. as well as the U.S.
12. Data sources and construction: M1: 1947-1989: The monthly Citibase M1 series
(FM1) was used for 1959-1989; the earlier M1 data was formed by splicing the M1
series reported in Banking and Monetary Statistics, 1941-1970, Board of Governors
of the Federal Reserve System to the Citibase data in January 1959. The monthly
data were averaged to obtain the quarterly or annual observations. Data prior to
1947 are those used by Lucas; from 1900-1914 the data are from Historical
Statistics, series X267 and from 1915-1946 they are from Friedman and Schwartz
(1970), pp. 704-718, column 3. Y: U.S. Net National Product 1947-1989, Citibase
GNNP. Prior to 1947, Lucas's (1988) data (Friedman and Schwartz real net national
product (1982 dollars), Table 4.8). For the monthly data 1959-1989, we used
personal income (GMPY). P: Price deflator for NNP. 1947-1989, Citibase GDNNP.
Prior to 1947, Lucas's (1988) data -- same source as NNP. For monthly data we
used the price deflator for GMPY. Interest rates: Commercial Paper rate. 1947-1989,
6-month commercial paper, Federal Reserve Board (FYCP), prior to 1947,
Lucas's (1988) data (Friedman and Schwartz (1982), Table 4.8, column 6).
13. Univariate Dickey-Fuller (1979) $\hat\tau_\mu$ and $\hat\tau_\tau$ statistics, computed with 2 and 4
lags on the full data set, fail to reject a single unit root in each of m, p, y,
r, m-p, and log velocity at the 10% level; the unit root hypothesis is not
rejected for y with 4 lags, but is rejected at the 10% (but not 5%) level with 2
lags. A unit root in each of $\Delta y$, $\Delta r$, and $\Delta(m-p)$ is rejected at the 1% level.
Similar inferences obtain when the sample is split 1900-1945, 1946-1989. Whether
m and p have two unit roots is less clear: for m, two unit roots are rejected in
favor of one at the 5% level for both subsamples, but not the full sample, while
the reverse is true for p. For $r - \Delta p$ ($\Delta p$ in percentages), one unit root is
rejected (vs. zero) for the full sample at the 10% level, but not in either
subsample using the $\hat\tau_\tau$ statistic ($\hat\tau_\mu$ rejects at 10% in both subsamples).





The Stock-Watson (1988) $q_f^\tau(3,1)$ statistic, applied to the system (m-p, y, r)
over the full sample, rejects the hypothesis of three unit roots in favor of one
unit root at the 5% level with 1-4 lags. The evidence on
three vs. two unit roots is less strong: the $q_f^\tau(3,2)$ statistic (2 lags) has a
p-value of .33. However, the Engle-Granger (1987) augmented Dickey-Fuller test
based on the residual from regressing m-p on y and r (with a constant and time
trend in the regression and using the appropriate critical values for a trivariate
detrended system, two lags) rejects non-cointegration at the 5% level over the
full sample. The details are available from the authors on request.
14. The dates in Table 2 and henceforth refer to the dates over which regressions
are run; earlier and later observations are used as initial and terminal
conditions as needed.
15. Likelihood ratio tests of 2 vs. 3 lags in the VECM are, respectively, 12.08,
11.12, and 3^.88 over the periods 1904-1986, 1904-1945, and 1946-1986. With
asymptotic $\chi^2_9$ distributions, these suggest specifying p=3 over 1946-1986.
16. The smoothed interest rate was constructed to be the two-sided estimate of
the permanent component $r_t^*$, calculated using the Kalman smoother for the model
$r_t = r_t^* + \mu_{1t}$, $\Delta r_t^* = \mu_{2t}$, with $(\mu_{1t}, \mu_{2t})$ independent and
$\mathrm{var}(\mu_{1t})/\mathrm{var}(\mu_{2t}) = 3$. Other
filters that yield similar results are a one-sided exponentially weighted moving
average filter with coefficient .95 and the Hodrick-Prescott filter.
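The two-sided estimate described in footnote 16 can be sketched with a standard Kalman filter followed by a fixed-interval (Rauch-Tung-Striebel) smoother for the local-level model. This is a minimal sketch, not the authors' code: the function name `smooth_local_level`, the vague initial prior, and the normalization var($\mu_{2t}$) = 1 are assumptions made here (only the variance ratio affects the smoothed mean).

```python
import numpy as np

def smooth_local_level(r, noise_ratio=3.0):
    """Smoothed (two-sided) estimate of r*_t in r_t = r*_t + mu1_t, r*_t = r*_{t-1} + mu2_t.

    noise_ratio is var(mu1)/var(mu2); footnote 16 uses 3.
    """
    r = np.asarray(r, dtype=float)
    n = r.size
    var_obs, var_state = noise_ratio, 1.0      # assumed normalization: var(mu2) = 1
    a_pred = np.empty(n)                       # predicted state mean
    p_pred = np.empty(n)                       # predicted state variance
    a_filt = np.empty(n)                       # filtered state mean
    p_filt = np.empty(n)                       # filtered state variance
    a, p = r[0], 1e6 * (var_obs + var_state)   # assumed vague prior on the initial level
    for t in range(n):                         # forward (one-sided) Kalman filter
        a_pred[t], p_pred[t] = a, p
        gain = p / (p + var_obs)
        a_filt[t] = a + gain * (r[t] - a)
        p_filt[t] = (1.0 - gain) * p
        a, p = a_filt[t], p_filt[t] + var_state
    a_smooth = a_filt.copy()
    for t in range(n - 2, -1, -1):             # backward smoothing pass
        c = p_filt[t] / p_pred[t + 1]
        a_smooth[t] = a_filt[t] + c * (a_smooth[t + 1] - a_pred[t + 1])
    return a_smooth
```

Calling `smooth_local_level(rate_series)` on an annual interest rate series (a hypothetical input name) returns a series of the same length; each value uses both earlier and later observations, which is what distinguishes it from the one-sided exponentially weighted filter mentioned in the footnote.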
17. The results in Table 3 are robust to changes in the details of the
computation of each of the estimators, in particular: using a Bartlett kernel
with 7 lags for PBSR and PHFM, using 3 rather than 2 leads/lags for DOLS and DGLS,
using 1 or 2 rather than 3 lags for JOH. The only exception is the postwar
instability of the JOH estimates, discussed in more detail below.
18. Because Wald statistics testing hypotheses about $(\theta_y, \theta_r)$ using the efficient
estimators have large-sample $\chi^2$ distributions, the usual approach can be used to
construct confidence regions for $(\theta_y, \theta_r)$. Asymptotically the estimators in the
two subsamples are independent, but for the small samples considered here, the
short-run dependence in the data, the presence of initial and terminal leads and
lags, and possible deviations from the large-sample mixed normal distribution will
result in a lack of independence.
19. The anomalous region is the postwar unsmoothed interest rate region for DGLS.
This is best understood by noting that the estimated GLS transformation for DGLS
approximately differenced the data (the estimated AR(2) filter is $1 - 1.39L + .41L^2$),
so that the DGLS point estimates are in effect determined by covariances between
first differences of the data, not their levels.
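The claim in footnote 19 that the estimated filter approximately differences the data can be checked directly: a filter that contains the factor $(1-L)$ sums to zero at $L = 1$ and has a root at $L = 1$. The sketch below uses only the coefficients quoted in the footnote; the variable names are illustrative.

```python
import numpy as np

# Coefficients of 1 - 1.39 L + 0.41 L^2 in ascending powers of L (from footnote 19).
filter_coeffs = np.array([1.0, -1.39, 0.41])

# Value of the lag polynomial at L = 1; exactly zero for a pure difference filter.
value_at_one = filter_coeffs.sum()

# Roots in L; np.roots expects the highest power first.
roots = np.roots(filter_coeffs[::-1])
smallest_root = np.min(np.abs(roots))

# For comparison: (1 - L)(1 - 0.41 L) = 1 - 1.41 L + 0.41 L^2.
exact_difference_filter = np.polymul([1.0, -1.0], [1.0, -0.41])

print(value_at_one, smallest_root, exact_difference_filter)
```

The polynomial value at one is small and the smaller root lies close to the unit circle, so the estimated filter is close to $(1-L)(1-.41L)$, consistent with the footnote's interpretation.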
20. Using the JOH estimator with monthly data on log real personal income, log
real M1, and the logarithm of the 90-day Treasury bill rate, 1953-1988 (3 lags),
Hoffman and Rasche [1989] estimate the income elasticity to be .78; the difference
between their estimate and the corresponding value from Table 5 (.462) arises from
our use of levels, not logarithms, of interest rates.





References

Ahn, S.K. and G.C. Reinsel (1990), "Estimation for Partially Nonstationary
Autoregressive Models," Journal of the American Statistical Association, 85,
813-823.
Anderson, T.W. (1971), The Statistical Analysis of Time Series. Wiley: New York.

Baba, Y., D.F. Hendry, and R.M. Starr (1990), "The Demand for M1 in the USA,
1960-1988," manuscript, Department of Economics, University of California, San
Diego (revision of University of California, San Diego Working Paper #88-8,
"U.S. Money Demand, 1960-84," December 1987).
Berk, K.N. (1974), "Consistent Autoregressive Spectral Estimates," Annals of
Statistics, 2, 489-502.
Bewley, R.A. (1979), "The Direct Estimation of the Equilibrium Response in a
Linear Model," Economic Letters, 3, 357-361.
Brillinger, D.R. (1980), Time Series, Data Analysis and Theory, Expanded Edition,
Holden-Day, San Francisco.
Campbell, J.Y. (1987), "Does Savings Anticipate Labor Income? An Alternative Test
of the Permanent Income Hypothesis," Econometrica, 55, 1249-1274.
Campbell, J.Y. and R.J. Shiller (1987), "Cointegration and Tests of Present Value
Models," Journal of Political Economy, 95, 1062-1088.
Campbell, J.Y. and R.J. Shiller (1989), "The Dividend-Price Ratio and
Expectations of Future Dividends and Discount Factors," The Review of
Financial Studies, 1, 195-228.
Dickey, D.A. and W.A. Fuller (1979), "Distribution of the Estimators for
Autoregressive Time Series with a Unit Root," Journal of the American
Statistical Association, 74, 427-431.
Domowitz, I.H. and L. Muus (1988), "Likelihood Inference in the Nonlinear
Regression Model with Explosive Linear Dynamics," in W. Barnett, E. Berndt
and H. White (eds), Dynamic Econometric Modelling, Cambridge University
Press, Cambridge.
Engle, R.F. and C.W.J. Granger (1987), "Cointegration and Error Correction:
Representation, Estimation, and Testing," Econometrica, 55, 251-276.
Engle, R.F., D.F. Hendry, and J.F. Richard (1983), "Exogeneity," Econometrica 51,
no. 2, 277-304.
Friedman, M. and A.J. Schwartz (1970), Monetary Statistics of the United States.
New York: Columbia University Press for the National Bureau of Economic
Research.





Friedman, M. and A.J. Schwartz (1982), Monetary Trends in the United States and
the United Kingdom. Chicago: University of Chicago Press for the National
Bureau of Economic Research.
Goldfeld, S.M. and D.E. Sichel (1990), "The Demand for Money," Chapter 8 in B.M.
Friedman and F.H. Hahn (eds.), Handbook of Monetary Economics, vol. 1,
Amsterdam: North-Holland, 299-356.
Granger, C.W.J. and T-H. Lee (1988), "Multicointegration," Discussion Paper #24,
Department of Economics, University of California, San Diego.
Grenander, U. and M. Rosenblatt (1957), Statistical Analysis of Stationary Time
Series, John Wiley and Sons: New York.
Hansen, B.E. (1988), "Robust Inference in General Models of Cointegration,"
manuscript, Yale University.
Hansen, B.E. (1989), "Efficient Estimation of Cointegrating Vectors in the
Presence of Deterministic Trends," manuscript, University of Rochester.
Hansen, B.E. and P.C.B. Phillips (1988), "Estimation and Inference in Models of
Cointegration: A Simulation Study," Cowles Foundation Discussion Paper No.
881.
Hansen, L.P., W. Roberds, and T.J. Sargent (1987), "Time Series Implications of
Present Value Budget Balance and of Martingale Models of Consumption and
Taxes," manuscript, Department of Economics, University of Chicago.
Hendry, D.F. and N.R. Ericsson (1990), "Modeling the Demand for Narrow Money in
the United Kingdom and the United States," International Finance Discussion
Paper no. 383, Board of Governors of the Federal Reserve System, Washington,
D.C.
Hoffman, D. and R.H. Rasche (1989), "Long-Run Income and Interest Elasticities of
Money Demand in the United States," NBER Working Paper no. 2949.
Johansen, S. (1988a), "Statistical Analysis of Cointegration Vectors," Journal of
Economic Dynamics and Control, 12, 231-255.
Johansen, S. (1988b), "The Mathematical Structure of Error Correction Models,"
Contemporary Mathematics, vol. 80: Structural Inference from Stochastic
Processes, N.U. Prabhu (ed.), American Mathematical Society: Providence, RI.
Johansen, S. (1989), "Estimation and Hypothesis Testing of Cointegrating Vectors
in Gaussian Vector Autoregression Models," manuscript, Institute for
Mathematical Statistics, University of Copenhagen.
Johansen, S. (1990), "A Representation of Vector Autoregressive Processes
Integrated of Order 2," Preprint 1990 no. 3, Institute of Mathematical
Statistics, University of Copenhagen.





Johansen, S. and K. Juselius (1990), "Maximum Likelihood Estimation and Inference
on Cointegration -- with Applications to the Demand for Money," Oxford
Bulletin of Economics and Statistics, 52, no. 2, 169-210.
Judd, John P. and J.L. Scadding (1982), "The Search for a Stable Demand Function:
A Survey of the Post-1973 Literature," Journal of Economic Literature, Vol.
XX, pp. 993-1023.
King, R., C. Plosser, J.H. Stock, and M.W. Watson (1987), "Stochastic Trends and
Economic Fluctuations," NBER Discussion Paper No. 2229; forthcoming, American
Economic Review.
Lahiri, K. and P. Schmidt (1978), "On the Estimation of Triangular Structural
Systems," Econometrica 46, 1217-1222.
Laidler, D.E.W. (1977), The Demand for Money: Theories and Empirical Evidence.
New York: Dun-Donnelly.
Lucas, R.E. (1988), "Money Demand in the United States: A Quantitative Review,"
Carnegie-Rochester Conference Series on Public Policy, 29, 137-168.
Meltzer, A.H. (1963), "The Demand for Money: The Evidence from the Time Series,"
Journal of Political Economy, 71, 219-246.
Phillips, P.C.B. (1988a), "Optimal Inference in Cointegrated Systems," Cowles
Foundation Discussion Paper No. 866 (revised August 1989).
Phillips, P.C.B. (1988b), "Spectral Regression for Cointegrated Time Series,"
Cowles Foundation Discussion Paper No. 872.
Phillips, P.C.B. and B.E. Hansen (1989), "Statistical Inference in Instrumental
Variables Regression with I(1) Processes," forthcoming, Review of Economic
Studies.
Phillips, P.C.B. and M. Loretan (1989), "Estimating Long Run Economic Equilibria,"
Cowles Foundation Discussion Paper no. 928, Yale University.
Phillips, P.C.B. and J.Y. Park (1986), "Asymptotic Equivalence of OLS and GLS in
Regression with Integrated Regressors," Cowles Foundation Discussion Paper
No. 802.
Phillips, P.C.B. and P. Perron (1988), "Testing for a Unit Root in Time Series
Regression," Biometrika, 75, 335-346.
Rozanov, Y. (1967), Stationary Random Processes. San Francisco: Holden Day.
Said, S.E. and D.A. Dickey (1984), "Testing for Unit Roots in Autoregressive-Moving
Average Models of Unknown Order," Biometrika, 71, 599-608.
Saikkonen, P. (1989), "Asymptotically Efficient Estimation of Cointegrating
Regressions," manuscript, Department of Statistics, University of Helsinki.
Sims, C.A. (1972), "Money, Income and Causality," American Economic Review. 62,
540-552.
Sims, C.A., J.H. Stock, and M.W. Watson (1990), "Inference in Linear Time Series
Models with Some Unit Roots," Econometrica, Vol. 58, No. 1.
Stock, J.H. (1987), "Asymptotic Properties of Least Squares Estimators of
Cointegrating Vectors," Econometrica, 55, 1035-1056.
Stock, J.H. and M.W. Watson (1988), "Testing for Common Trends," Journal of the
American Statistical Association, 83, 1097-1107.
West, K.D. (1988), "Asymptotic Normality when Regressors Have a Unit Root,"
Econometrica, 56, 1397-1418.
Whittle, P. (1983), "Prediction and Regulation by Least Squares," 2nd Edition,
revised, University of Minnesota Press, Minneapolis.





Table 1

Monte Carlo Results

t: :]• vC a

A.
                              T=100                                        T=300
Estimator   Bias(θ)   σ(θ)    t.05    t.95   P(W>3.84)   Bias(θ)   σ(θ)    t.05    t.95   P(W>3.84)
SOLS         .000     .021   -1.67    1.68     .054       .000     .007   -1.63    1.70     .052
DOLS1        .000     .023   -1.86    1.87     .083       .000     .007   -1.69    1.79     .062
DOLS2        .000     .023   -1.87    1.86     .087       .000     .007   -1.66    1.78     .061
DGLS         .000     .024   -1.80    1.76     .073       .000     .007   -1.64    1.75     .056
PBSR         .000     .021   -1.78    1.81     .073       .000     .007   -1.67    1.76     .060
PHFM         .000     .022   -1.88    1.88     .086       .000     .007   -1.71    2.81     .065
JOH          .000     .025   -1.98    1.96     .077       .000     .007   -1.84    1.67     .057

SOLS

0.8

'.95

SOLS

T=100,

*u

* *

*21 ::]■

0. 0

\05

DOLS1

B.

21

<r(0)

ft

11

DOLS1

bias(0)

t

DOLS2
t

t .95

.05

1
:*
■ € L*5

5'
1

DGLS
t

.95

'.05

PBSR
*.95

'.05

PHFM
t .95

'.05

JOH
t .95

'.05

*. 95

 -.90    .084   -1.80  1.84   -1.84  1.84   -1.77  1.77   -1.46  1.69   -1.06  2.98   -1.95  1.83
 -.80    .092   -1.81  1.84   -1.85  1.86   -1.77  1.76   -1.49  1.74   -1.17  2.74   -1.95  1.83
 -.70    .089   -1.82  1.84   -1.84  1.85   -1.77  1.76   -1.52  1.77   -1.25  2.55   -1.95  1.83
 -.60    .081   -1.83  1.84   -1.85  1.84   -1.77  1.76   -1.56  1.78   -1.31  2.40   -1.94  1.84
 -.50    .071   -1.83  1.84   -1.84  1.84   -1.77  1.76   -1.58  1.79   -1.38  2.32   -1.95  1.86
  .00    .026   -1.85  1.83   -1.86  1.83   -1.77  1.77   -1.76  1.72   -1.67  2.05   -1.97  1.90
  .50    .000   -1.87  1.89   -1.90  1.87   -1.80  1.77   -2.05  1.46   -2.01  1.65   -1.99  2.01
  .60   -.002   -1.88  1.89   -1.89  1.89   -1.81  1.80   -2.15  1.34   -2.11  1.55   -1.97  2.01
  .70   -.003   -1.88  1.91   -1.91  1.91   -1.82  1.82   -2.25  1.25   -2.21  1.42   -1.96  2.05
  .80   -.003   -1.92  1.94   -1.91  1.94   -1.81  1.82   -2.36  1.13   -2.33  1.28   -1.97  2.04
  .90   -.002   -1.90  1.93   -1.93  1.94   -1.85  1.83   -2.45  1.04   -2.42  1.15   -1.99  2.02

 -.90   -.283   -1.80  1.84   -1.84  1.84   -1.77  1.77   -0.58  1.23   -3.80  0.69   -1.90  1.79
 -.80   -.078   -1.81  1.84   -1.85  1.85   -1.77  1.76   -0.77  1.46   -1.80  1.34   -1.90  1.79
 -.70    .007   -1.82  1.84   -1.84  1.85   -1.77  1.75   -0.86  1.62   -1.31  1.75   -1.91  1.79
 -.60    .048   -1.83  1.84   -1.85  1.84   -1.77  1.76   -0.94  1.73   -1.15  2.00   -1.91  1.79
 -.50    .068   -1.83  1.84   -1.84  1.84   -1.77  1.76   -0.98  1.82   -1.09  2.13   -1.92  1.79
  .00    .065   -1.85  1.83   -1.86  1.83   -1.77  1.77   -1.08  2.08   -1.09  2.28   -1.97  1.80
  .50    .028   -1.87  1.89   -1.89  1.87   -1.80  1.77   -1.14  2.18   -1.16  2.30   -2.00  1.84
  .60    .021   -1.88  1.89   -1.89  1.89   -1.81  1.80   -1.15  2.20   -1.18  2.30   -2.01  1.85
  .70    .015   -1.88  1.91   -1.91  1.91   -1.82  1.82   -1.17  2.19   -1.19  2.31   -2.00  1.87
  .80    .010   -1.92  1.94   -1.91  1.94   -1.81  1.82   -1.24  2.16   -1.23  2.30   -2.03  1.84
  .90    .005   -1.90  1.93   -1.93  1.94   -1.85  1.83   -1.45  2.18   -1.38  2.36   -1.98  1.87

.951

-.039
[-.062

,
.643

Z *
€

.499

.499'
1.374

                              T=100                                        T=300
Estimator   Bias(θ)   σ(θ)    t.05    t.95   P(W>3.84)   Bias(θ)   σ(θ)    t.05    t.95   P(W>3.84)
SOLS         .085     .120   -1.95    5.16     .466       .033     .045   -1.90    5.29     .483
DOLS1        .026     .125   -2.10    2.71     .188       .007     .041   -1.79    2.32     .118
DOLS2        .026     .125   -1.72    2.25     .111       .007     .040   -1.55    1.97     .071
DGLS         .045     .131   -1.52    2.35     .111       .012     .042   -1.43    2.08     .076
PBSR         .039     .123   -1.83    2.79     .180       .012     .041   -1.64    2.41     .122
PHFM         .041     .122   -1.91    3.01     .206       .011     .041   -1.69    2.46     .131
JOH          .003     .330   -2.40    2.07     .095      -.001     .044   -1.97    1.75     .064

Notes to Table 1:
Bias($\theta$) and $\sigma(\theta)$ are the Monte Carlo bias and standard deviation of $\hat\theta$,
respectively. $t_{.05}$ and $t_{.95}$ are the empirical 5% and 95% critical values of the
t-ratios, and P(W>3.84) is the rejection rate at the asymptotic 5% level of
the test statistic testing $\theta = \theta_0$ which, for all but JOH, is the square of the
t-statistic, and for JOH is the likelihood ratio statistic. 5000 Monte Carlo
replications were used. The number of observations (100 and 300) refers to the
span of the regressions; additional observations were used for initial and
terminal conditions. All regressions include a constant term.
The estimators are:

SOLS    Static OLS regression of $y_t^1$ on $y_t^2$.

DOLS1   Dynamic OLS regression of $y_t^1$ on $(y_t^2, \Delta y_t^2, \Delta y_{t\pm 1}^2, \ldots, \Delta y_{t\pm k}^2)$,
        where k=2 for T=100, k=3 for T=300. The covariance matrix is estimated
        by averaging the first k error autocovariances using the Bartlett
        kernel, where k=5 for T=100, k=8 for T=300.

DOLS2   Dynamic OLS regression of $y_t^1$ on $(y_t^2, \Delta y_t^2, \Delta y_{t\pm 1}^2, \ldots, \Delta y_{t\pm k}^2)$,
        where k=2 for T=100, k=3 for T=300. The covariance matrix is estimated
        by an autoregressive spectral estimator with 2 lags for T=100, 3 lags
        for T=300.

DGLS    Dynamic GLS regression of $y_t^1$ on $(y_t^2, \Delta y_t^2, \Delta y_{t\pm 1}^2, \ldots, \Delta y_{t\pm k}^2)$,
        where k=2 for T=100, k=3 for T=300. The errors were modeled as an
        AR(2) for T=100 and AR(3) for T=300.

PBSR    Phillips (1988b) band spectral regression, where the spectral density
        at frequency zero was estimated using the Bartlett kernel with 5
        lead/lags for T=100 and 8 lead/lags for T=300.

PHFM    Phillips-Hansen (1989) fully modified estimator using the Bartlett
        kernel with 5 lead/lags for T=100 and 8 lead/lags for T=300.

JOH     Johansen (1988a) VECM MLE based on the estimated model
        $\Delta y_t = \gamma\alpha' y_{t-1} + \sum_{i=1}^{k} A_i \Delta y_{t-i} + a_t$, where k=4 for T=100 and
        k=6 for T=300. Standard errors were computed using the formulas given
        in Section 5(A).

The DOLS and DGLS standard errors were computed using a degrees-of-freedom
adjustment, specifically df = number of periods in the regression - number of
regressors in the DGLS or DOLS regression - number of autoregressive lags in the
GLS transform (DGLS) or AR spectral estimator (DOLS). The JOH standard errors
were computed as described in Section 5(A) with a degrees-of-freedom adjustment
(df = number of periods in the regression - number of regressors in a single
equation of the VECM). The degrees-of-freedom corrections are motivated by
analogy to the classical linear regression model. No such adjustments were made
for PBSR or PHFM.
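The dynamic OLS regressions described in these notes can be sketched as an ordinary least-squares fit of the dependent variable on a constant, the level of the integrated regressor, and k leads and lags of its first difference. The sketch below is illustrative only: the function name `dols`, the scalar-regressor restriction, and the simulated design are assumptions made here, and it returns the point estimate alone (the Bartlett or autoregressive covariance estimators used for DOLS1 and DOLS2 are not implemented).

```python
import numpy as np

def dols(y, x, k):
    """Dynamic OLS point estimate of theta in y_t = c + theta * x_t + error.

    Regressors: constant, x_t, and dx_{t-k}, ..., dx_{t+k}, where dx_s = x_s - x_{s-1}.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = np.diff(x)                      # dx[s - 1] holds dx_s for s = 1, ..., n - 1
    t_idx = np.arange(k + 1, n - k)      # dates with all k leads and lags available
    columns = [np.ones(t_idx.size), x[t_idx]]
    for j in range(-k, k + 1):
        columns.append(dx[t_idx + j - 1])
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y[t_idx], rcond=None)
    return coef[1]                       # coefficient on the level x_t

# Hypothetical simulated example: x is a random walk and the regression error is
# correlated with the innovation in x, which is what the leads and lags absorb.
rng = np.random.default_rng(0)
innov = rng.standard_normal(400)
x_sim = np.cumsum(innov)
y_sim = 2.0 + 1.0 * x_sim + 0.5 * innov + rng.standard_normal(400)
theta_hat = dols(y_sim, x_sim, k=2)
```

Dropping the first k+1 and last k observations mirrors the remark in the notes that additional observations are used for initial and terminal conditions.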




Table 2
Estimated Cointegrating Relations: $m_t = \mu + \theta_p p_t + \theta_y y_t + \theta_r r_t$

Specifications:
I.    For $p_t$ I(1):
      $m_t = \mu + \theta_p p_t + \theta_y y_t + \theta_r r_t + d_p(L)\Delta p_t + d_y(L)\Delta y_t + d_r(L)\Delta r_t + e_t$
II.   For $p_t$ I(2) and $r_t$, $\Delta p_t$ not cointegrated:
      $m_t = \mu + \theta_p p_t + \theta_y y_t + \theta_r r_t + d_p(L)\Delta^2 p_t + d_y(L)\Delta y_t + d_r(L)\Delta r_t + e_t$
III.  For $p_t$ I(2) and $r_t - \Delta p_t$ I(0):
      $m_t = \mu + \theta_p p_t + \theta_y y_t + \theta_r r_t + d_p(L)\Delta^2 p_t + d_y(L)\Delta y_t + d_r(L)(r_t - \Delta p_t) + e_t$

Estimates (Standard Errors)

Specification  Estimator  Period    No. leads/lags   θ_p             θ_y            θ_r
I              DOLS       1903-87   2                1.143 (.185)    .838 (.154)    -.119 (.016)
I              DOLS       1904-86   3                1.205 (.177)    .794 (.145)    -.128 (.014)
I              DGLS       1903-87   2                1.000 (.213)    .322 (.289)    -.042 (.019)
I              DGLS       1904-86   3                1.219 (.152)    .798 (.125)    -.133 (.013)
II             DOLS       1904-87   2                1.183 (.190)    .820 (.159)    -.119 (.016)
II             DOLS       1905-86   3                1.304 (.200)    .732 (.165)    -.132 (.016)
II             DGLS       1904-87   2                1.041 (.166)    .932 (.143)    -.100 (.016)
II             DGLS       1905-86   3                1.292 (.180)    .763 (.147)    -.134 (.016)
III            DOLS       1904-87   2                1.011 (.165)    .949 (.138)    -.096 (.015)
III            DOLS       1905-86   3                1.100 (.145)    .887 (.118)    -.106 (.013)
III            DGLS       1904-87   2                 .982 (.209)    .355 (.289)    -.024 (.017)
III            DGLS       1905-86   3                1.180 (.128)    .842 (.103)    -.115 (.011)

Notes: $d_i(L) = \sum_{j=-k}^{k} d_{ij} L^j$, where k is the number of leads/lags listed in the
third column. Standard errors are in parentheses. An AR(2) error process was
used to implement the GLS transformation for the DGLS estimator and to estimate
the DOLS covariance matrix when k=2, and an AR(3) was used for k=3. The shorter
regression periods for k=3 in panel B relative to k=2 in panel B, and for k=2 in
panel A relative to k=2 in Panel B, allow for necessary initial and terminal
conditions (leads and lags).




Table 3
Money Demand Cointegrating Vectors:
Estimates and Tests, Annual Data

Dynamic OLS/GLS estimation equation:

$m_t - p_t = \mu + \theta_y y_t + \theta_r r_t + d_y(L)\Delta y_t + d_r(L)\Delta r_t + e_t$

A. Point Estimates (standard errors)
1903 -1945

1903 -1987
Estimator

1946-1987
e

#y

#y

e

y

r

#y

1946 -1987
0 r*

SOLS

.929

-.083

.916

-.089

.193

-.016

.412

-.046

NLLS

.898

-.093

1.104

-.093

-.445

.084

.298

-.023

DOLS

.965
(.031)

-.106
(.009)

.859
(.142)

-.117
(.028)

.270
(.213)

-.027
(.025)

.413
(.320)

- .047
(.042)

DGLS

.960
(.037)

-. 1 0 0
(.0 1 0 )

.972
(.170)

.945
(.308)

- . 0 2 0

(.030)

(.009)

1.171
(.132)

- .091
(.013)

.903
(.103)

-.103
(.018)

.216
(.091)

-. 0 2 0
(.0 1 1 )

.367
(.147)

-.042
(.019)

(.008)

.911
(.082)

-. 1 0 2
(.015)

.205
(.054)

- .018
(.006)

.393
(.106)

- .045
(.014)

.971
(.031)

-.109
(.009)

.878
(.094)

35.588
(1787.6)

-5.088
(266.0)

-2.344
(4.581)

.340
(.645)

.976
(.030)

-.115
(.009)

.940
(.119)

-.473
(.390)

.075
(.055)

- .131
(.274)

.033
(.039)

.960
(.033)

PBSR

.956
(.032)

PHFM

J0H(2)

J0H(3)

B.
Interest
rate

- . 1 0 1

(.009)
- . 1 0 0

- . 1 0 0

- . 1 1 1

(.018)
- . 1 2 1

(.0

2 2

)

Tests for Breaks in the Cointegrating Vector Based on DOLS , break date - 1946
no. leads/
lags

r

2

3

]r*




2

3

Wald statistic
(p-value)

Point estimates (standard errors)
ey

6.05
(.05)

.969
(.107)

5.03
(.08)

.983
(.103)

2.90
(.24)
3.82
(.15)

•t

S
)

-.452
(.269)

.061
(.029)

)

-.446
(.280)

.056
(.027)

.862
(.108)

-.141
(.025)

-.197
(.352)

.068
(.048)

.862
(.099)

-.144
(.023)

-.285
(.320)

.076
(.043)

- . 1 1 1

(.0

2 2

- . 1 1 1

(.0

2 2

Notes to Table 3:
Panel A: NLLS is the nonlinear least squares estimator; the other
estimators are defined in the notes to Table 1 (DOLS here and in subsequent
tables is DOLS2 in Table 1). JOH(k) is the JOH estimator evaluated using k
lagged first differences. JOH(3) was computed over regression dates 1904-1986,
1904-1945, and 1946-1986. For the NLLS estimator, $\Delta(m-p)_t$ is regressed
on $(m-p)_{t-1}$, $y_{t-1}$, $r_{t-1}$, and 2 lags each of $\Delta(m-p)_{t-1}$, $\Delta y_{t-1}$, and $\Delta r_{t-1}$;
$\theta_y$ and $\theta_r$ are estimated from the coefficients on the lagged levels. DOLS and
DGLS used 2 leads and lags of the first differences in the regressions and an
AR(2) process for the error. The frequency zero spectral estimators required
for PBSR and PHFM were computed using a Bartlett kernel with 5 lags. All
regressions included a constant.
Panel B: The statistics are based on the regression
$(m-p)_t = \mu + \theta_y y_t + \theta_r r_t + \delta_y (y_t - y_\tau) 1(t>\tau) + \delta_r (r_t - r_\tau) 1(t>\tau) + d_y(L)\Delta y_t + d_r(L)\Delta r_t$,
where $1(\cdot)$ is the indicator function and $d_y(L)$ and $d_r(L)$ have the number of
leads and lags stated in the second column. Regressions with k=2 were run over
1903-1987, with k=3, over 1904-1986. The Wald statistic tests the hypothesis
that $\delta_y = \delta_r = 0$ and has a $\chi^2_2$ distribution. The covariance matrix was computed
using an AR(2) spectral estimator.
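The p-values reported with the break-test Wald statistics in Panel B can be reproduced from the chi-squared distribution with 2 degrees of freedom, whose upper tail probability has the closed form $P(\chi^2_2 > w) = e^{-w/2}$. The short check below uses the four Wald statistics and p-values printed in Panel B; the helper name `chi2_2_pvalue` is illustrative.

```python
import math

def chi2_2_pvalue(wald):
    """Upper tail probability of a chi-squared(2) variable: exp(-w / 2)."""
    return math.exp(-wald / 2.0)

# (Wald statistic, reported p-value) pairs as printed in Table 3, Panel B.
reported = [(6.05, 0.05), (5.03, 0.08), (2.90, 0.24), (3.82, 0.15)]
computed = [(w, chi2_2_pvalue(w)) for w, _ in reported]

for (w, p_reported), (_, p_computed) in zip(reported, computed):
    print(f"W = {w:.2f}: reported p = {p_reported:.2f}, computed p = {p_computed:.3f}")
```

The computed values agree with the reported ones up to the rounding of the printed Wald statistics.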

Table 4
Properties of Error Correction Terms
$z_t = m_t - p_t - \hat\theta_y y_t - \hat\theta_r r_t$

                                           --- 1904-86 ---         --- 1904-45 ---         --- 1946-86 ---
Estimator Est. period      θ_y     θ_r     τ_μ       ρ       σ     τ_μ       ρ       σ     τ_μ       ρ       σ
DOLS      1903-87        0.965  -0.106  -4.646   0.397   0.156  -3.618   0.334   0.136  -3.496   0.360   0.172
DGLS      1903-87        0.960  -0.100  -4.542   0.411   0.151  -3.685   0.301   0.133  -3.314   0.407   0.166
JOH       1903-87        0.971  -0.109  -4.673   0.396   0.159  -3.549   0.361   0.139  -3.553   0.347   0.176
DOLS      1903-45        0.859  -0.116  -3.289   0.674   0.195  -3.654   0.282   0.137  -3.211   0.456   0.192
DGLS      1903-45        0.972  -0.100  -4.531   0.409   0.151  -3.667   0.312   0.134  -3.215   0.432   0.167
JOH       1903-45        0.878  -0.111  -3.592   0.617   0.179  -3.688   0.271   0.134  -3.438   0.389   0.180
DOLS      1946-87        0.270  -0.027  -1.512   0.972   0.468   1.136   1.065   0.347  -3.134   0.463   0.050
DGLS      1946-87        0.945  -0.020  -1.047   0.951   0.240  -1.114   0.848   0.187  -1.526   0.970   0.279
JOH(2)    1946-87       35.588  -5.088  -1.541   0.958  23.401   0.222   1.009  19.997  -3.377   0.444   9.014
JOH(3)    1946-86       -0.473   0.075  -1.327   0.978   1.017  -1.453   0.964   1.017  -0.078   0.998   1.017
DOLS*     1946-87        0.413  -0.047  -1.331   0.968   0.379   0.656   1.052   0.279  -2.969   0.664   0.042
DGLS*     1946-87        1.170  -0.091  -1.740   0.897   0.207  -2.898   0.520   0.157  -1.641   0.938   0.173
JOH(2)*   1946-87       -2.344   0.340  -1.583   0.986   2.152   1.607   1.038   1.761  -2.542   0.811   0.257
JOH(3)*   1946-86       -0.131   0.033  -1.447   0.980   0.773  -1.446   0.964   0.773   0.201   1.001   0.773

Notes:

The point estimates for the indicated estimator and sample period are taken from Table 3. DOLS*, DGLS*, and JOH* refer to these estimators evaluated using the smoothed interest rate $r_t^*$. The summary statistics $\hat\tau_\mu$, $\hat\rho$, and $\hat\sigma$ are respectively the Dickey-Fuller t-statistic testing $\rho = 1$ with a constant and 3 lags in the autoregression, the sum of the autoregressive coefficients in the regression of $z_t$ on a constant and 3 lags, and the standard deviation of $z_t$. The reported entries are these statistics, computed for $z_t$ constructed using the point estimates in the first columns for each row, with regressions run (and statistics computed) over the subsample given in the column heading.
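The three summary statistics defined in these notes can be sketched as follows. This is a minimal illustration under one stated assumption: "3 lags" is read as an AR(3) in the level of $z_t$ with a constant, so the t-statistic on the sum of the autoregressive coefficients is numerically the same as an augmented Dickey-Fuller statistic with two lagged differences. The function name `ec_term_stats` and the simulated series are hypothetical.

```python
import numpy as np

def ec_term_stats(z, p=3):
    """Return (tau_mu, rho_hat, sigma_hat) for a series z.
    tau_mu : t-statistic for H0 that the AR coefficients sum to 1
    rho_hat: sum of the AR(p) coefficients (regression includes a constant)
    sigma_hat: sample standard deviation of z"""
    z = np.asarray(z, dtype=float)
    T = len(z)
    Y = z[p:]
    # Column j-1 holds z lagged j periods, aligned with Y.
    lags = np.column_stack([z[p - j:T - j] for j in range(1, p + 1)])
    Xmat = np.column_stack([np.ones(T - p), lags])
    beta, *_ = np.linalg.lstsq(Xmat, Y, rcond=None)
    resid = Y - Xmat @ beta
    s2 = resid @ resid / (len(Y) - Xmat.shape[1])
    cov = s2 * np.linalg.inv(Xmat.T @ Xmat)
    a = np.r_[0.0, np.ones(p)]           # picks out the sum of AR coefficients
    rho = a @ beta
    tau = (rho - 1.0) / np.sqrt(a @ cov @ a)
    return tau, rho, z.std(ddof=1)

# Simulated stationary error correction term (hypothetical data).
rng = np.random.default_rng(0)
shocks = rng.normal(scale=0.1, size=83)
z_sim = np.zeros(83)
for t in range(1, 83):
    z_sim[t] = 0.4 * z_sim[t - 1] + shocks[t]
print(ec_term_stats(z_sim, p=3))
```

The statistic $\hat\tau_\mu$ is not compared with standard normal critical values; under the unit root null it follows the Dickey-Fuller distribution, and further adjustment is needed when $z_t$ is built from estimated coefficients.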




Table 5
Money Demand Cointegrating Vectors: Estimates, Monthly Data

Period:            49:1-88:6        49:1-88:6        60:1-88:6        60:1-88:6        60:1-88:6
Interest rate:     Comm. Paper      Comm. Paper      Comm. Paper      90-day T-bill    10-yr T-bond
Estimator            θ_y     θ_r      θ_y     θ_r      θ_y     θ_r      θ_y     θ_r      θ_y     θ_r
SOLS                .272   -.016     .398   -.035     .339   -.017     .362   -.021     .480   -.031
NLLS                .539   -.044     .259    .034     .570   -.030    -.483   -.026     .353    .012
DOLS                .326   -.025     .457   -.044     .398   -.027     .415   -.030     .529   -.037
                  (.187)  (.026)   (.136)  (.019)   (.208)  (.025)   (.165)  (.020)   (.210)  (.023)
DGLS                .889   -.008     .525   -.026    1.139   -.009    1.195   -.011    1.046   -.019
                  (.203)  (.003)   (.109)  (.007)   (.289)  (.003)   (.268)  (.003)   (.173)  (.003)
PBSR                .302   -.021     .404   -.036     .367   -.022     .389   -.025     .500   -.034
                  (.037)  (.005)   (.045)  (.006)   (.053)  (.005)   (.049)  (.005)   (.052)  (.005)
PHFM                .302   -.021     .412   -.037     .370   -.022     .393   -.025     .511   -.035
                  (.033)  (.004)   (.042)  (.006)   (.048)  (.004)   (.045)  (.004)   (.047)  (.005)
JOH                 .561   -.068     .629   -.076     .520   -.075     .462   -.060     .631   -.067
                  (.199)  (.032)   (.129)  (.020)   (.202)  (.039)   (.137)  (.024)   (.144)  (.021)

Notes: NLLS and JOH used 8 lagged differences of the variables; DOLS and DGLS used 8 leads and lags of the first differences in the regressions. An AR(6) error was assumed for DGLS and for the calculation of the standard errors for DOLS. The frequency zero spectral estimators required for PBSR and PHFM were computed using a Bartlett kernel with 18 (monthly) lags. All regressions included a constant.
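The Bartlett-kernel frequency zero estimator mentioned in these notes can be illustrated for a single series. The sketch below is a scalar simplification: the PBSR and PHFM estimators require the matrix analogue computed from regression residuals and differenced regressors, which is not shown. The function name `bartlett_lrv` and the simulated data are assumptions for illustration. The value returned is the long-run variance; dividing by $2\pi$ gives the spectral density at frequency zero.

```python
import numpy as np

def bartlett_lrv(u, m):
    """Long-run variance of u using Bartlett weights 1 - j/(m+1)
    on the first m sample autocovariances."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    T = len(u)
    lrv = u @ u / T                      # lag-0 autocovariance
    for j in range(1, m + 1):
        gamma_j = u[j:] @ u[:-j] / T     # lag-j autocovariance
        lrv += 2.0 * (1.0 - j / (m + 1.0)) * gamma_j
    return lrv

# Simulated monthly AR(1) series (hypothetical), 18 lags as in the notes above.
rng = np.random.default_rng(0)
eps = rng.normal(size=474)
u_sim = np.zeros(474)
for t in range(1, 474):
    u_sim[t] = 0.5 * u_sim[t - 1] + eps[t]
print(bartlett_lrv(u_sim, m=18))
```

The linearly declining weights guarantee a nonnegative estimate, which is one common reason for choosing the Bartlett kernel over a simple truncated sum of autocovariances.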

Figure 1
A. U.S. real net national product (solid line) and real M1, 1900-1989
B. U.S. short-term commercial paper rate (solid line; left scale) and the logarithm of M1 velocity, 1900-1989



Figure 2. 95% confidence regions for the income elasticity $\theta_y$ and the interest semielasticity $\theta_r$, estimated over 1903-1987 (solid line), 1903-1945 (dashes), 1946-1987 (short dashes), and, using the smoothed interest rate $r^*$, 1946-1987 (dash-dots), based on the DOLS, DGLS, PBSR, and PHFM estimators.
Panels: A. DOLS; B. DGLS; C. PBSR; D. PHFM. Vertical axis: interest semielasticity.


Figure 3. Concentrated vector error correction model (VECM(3)) likelihood surface in $(\theta_y, \theta_r)$ space, 1946-1986