A Series of Occasional Papers in Draft Form Prepared by Members
of the Research Department for Review and Comment

SM-88-

IMPERFECT INFORMATION AND
THE PERMANENT INCOME HYPOTHESIS

Abhijit V. Banerjee and Kenneth N. Kuttner 1

August 4, 1988

1 The authors are grateful to Gary Chamberlain, Benjamin Friedman, Zvi Griliches, Greg Mankiw,
Knut Mork, Philippe Weil, and to participants in the NBER Consumption Group for comments;
and to Iain Cockburn, Cheri Minton, and Andy Mitrusi for valuable technical assistance.




Abstract
The purpose of this paper is to explore the nature of the information set used by consumers
in making their consumption decisions. Specifically, it re-examines the evidence for the
Permanent Income view of consumption under the assumption that consumers may not always be
able to distinguish transitory income shocks from permanent shocks. For these
‘indistinguishable’ shocks, we assume that consumers use an optimal linear forecast to
calculate the annuity value of the shock. Because this implies that the consumer treats some
portion of each temporary shock as if it were permanent, the resulting response of
consumption would appear, in the ‘distinguishable shocks’ context, to be excessive. This
hypothesis offers an explanation for the excess sensitivity puzzle reported by earlier
econometric studies of consumption.

The contribution of this paper lies in its attempt to estimate a parameter describing the
‘amount’ of information utilized by consumers, i.e., the degree to which income shocks can
be discerned as either transitory or permanent. This methodology is also relevant for the
evaluation of business-cycle theories which rely on agents' confusion between kinds of
shocks to generate output fluctuations.

The first section of the paper discusses a simple example, demonstrating that incorrectly
attributing to consumers the ability to distinguish transitory from lifetime shocks can lead
to erroneous conclusions, such as the spurious finding of excess sensitivity. The next section
proposes a general model, incorporating both distinguishable and indistinguishable shocks,
and discusses its identification. The next section covers the data issues, and the
minimum-distance estimation method. The final two sections summarize our empirical findings,
and draw some general conclusions about the role of information in modeling consumer behavior.
The empirical results we present are inconclusive, but do appear to offer some weak evidence
against the Perfect Information view embodied in much of the literature.

Keywords: Permanent Income Hypothesis, dynamic factor models, minimum-distance estimation.




1 Introduction

The responsiveness of consumer spending to changes in income is of vital importance for
policy questions, particularly those relating to demand management. Empirically assessing
this responsiveness is also important in a larger sense, in that one can ask to what extent it
is ‘rational,’ or consistent with the efficient exploitation of all available information at the
disposal of consumers.

Studies which test the Permanent Income Hypothesis have usually distinguished between two
aspects of this responsiveness. The first is the responsiveness of consumption to anticipated
changes in income; this, according to the rational expectations version of the Permanent
Income Hypothesis, should always equal zero.1 The second concerns the magnitude of the
response of consumption to income ‘surprises’; studies examining this second angle address
themselves to the question of whether the sensitivity of consumption to these income
innovations is appropriate, given plausible values of the prevailing interest rate. Empirical
work attempting to assess this sensitivity tends to find a level of sensitivity far in excess
of what can be justified by Permanent Income theory.2

Our purpose in this paper is to reexamine both of these issues, with particular emphasis
on the second, ‘excess sensitivity’ issue, taking into account alternative assumptions
regarding the consumer’s information set as it affects the measurement of the key sensitivity
parameter.
The issue of how much information to attribute to agents in a rational expectations
model is a thorny one; consumers are assumed to respond rationally to all available
information, but what to include in that information set is usually unclear, and left
unspecified by the theory. Nor is it usually apparent how to determine from the data the
answer to this question. Yet the correct choice is essential in empirical work; if we model
behavior as the response to unanticipated ‘surprises,’ the econometric specification of that
surprise will certainly affect our interpretation of the response.

1 Hall (1978) was the first to emphasize this orthogonality property. Flavin (1981) re-interprets the
orthogonality property in terms of the response to anticipated income shocks.
2 Hall and Mishkin (1982) is the first study to uncover this excess sensitivity phenomenon.
This issue is particularly germane to empirical work in consumption. Those studies
which purport to estimate the response of consumption to income innovations have so
far paid scant attention to this problem and its implications. The original work by Hall
and Mishkin in this area, for example, and its successors3 assert that consumers possess
enough information to discern two distinct kinds of shocks to their income: high persistence
(lifetime), and low persistence (transitory).

While it is easy to think of a few examples of unmistakably transitory income shocks
(e.g., lottery payoffs, temporary tax surcharges), and a few shocks with an identifiably
longer persistence (e.g., the Tax Reform Act of 1986), the majority of changes to households’
incomes would seem very difficult to classify as either lifetime or transitory. Changes in
real income due to movements in the price level, indirect taxes, and ‘temporary’ layoffs are
good examples of what could be called ‘indistinguishable’ shocks.4

One of the goals of this paper is to examine explicitly the nature of the information set
available to consumers when they plan their consumption expenditures. In particular, it
seeks a way of empirically discerning what part of (the variance of) consumption is due to
‘distinguishable’ shocks, as opposed to ‘indistinguishable’ shocks.5
The answer to this question has important implications for policy, and for a proper
appraisal of the Permanent Income Hypothesis. One of the implications of our work, for
example, is that it can provide an explanation for the excess sensitivity puzzle reported by
Hall and Mishkin. Specifically, we show that estimating the permanent income consumption
model under the Perfect Information assumption of Hall and Mishkin, when the Imperfect
Information model is true, will deliver an estimate of the sensitivity parameter (representing
the annuity value of a transitory income shock) which is inconsistent and biased upwards,
spuriously indicating excess sensitivity.

3 Other work incorporating the same kinds of informational assumptions includes papers by
Mork and Smith (1986), and by Altonji, Martins and Siow (1988).
4 Even income movements which are the direct result of announced ‘permanent’ or ‘temporary’ policy
measures might be subject to problems of dynamic inconsistency, and therefore be placed, to some degree,
in the ‘indistinguishable’ class. After all, who knows what a Democrat in the White House might do with
the Tax Reform Act of 1986?
5 Of course, a finding in favor of indistinguishable shocks need not imply that no distinction can be
made regarding the persistence of shocks, simply that most of the observed variation in income is of the
indistinguishable type.
Moreover, if consumers are uninformed about the sources of income shocks, policy measures
that affect income will produce changes in consumption whose dynamics are quite
different from those we would see if consumers could actually discern the nature of their
income shocks, and could use this information to divine their true persistence. In particular,
evidence in favor of the ‘indistinguishability’ hypothesis could be interpreted as indirect
evidence in favor of the sort of price-level versus relative-price confusion which is key to
much of modern business cycle theory, as in Lucas (1972).
In order to examine these issues, we construct several models of consumption which
include features of the Rational Expectations - Permanent Income Hypothesis. Alternative
specifications we consider include the ‘Perfect Information’ version, which embodies the
distinguishability hypothesis implicit in Hall and Mishkin, and an alternative ‘Imperfect
Information’ hypothesis, which drops the distinguishability assumption. We also consider
nested specifications which include both of these models as special cases, and include, in
the tradition of earlier studies, a portion of ‘rule-of-thumb’ consumers, whose consumption
tracks income one-for-one. We use an Optimal Minimum Distance (OMD) technique to
fit these models on family-level food expenditure and labor income data from a subset
of the Panel Study of Income Dynamics (PSID), for the 1978-1984 survey years. This
OMD method has been shown to have advantages over maximum-likelihood methods in
the presence of non-normal disturbances and conditional heteroskedasticity. We estimate
the model both using unweighted data, and using data weighted by each household’s mean
income level to correct for heteroskedasticity.
Our empirical results confirm the Hall-Mishkin finding of excess sensitivity, but only
when we impose the Perfect Information restrictions. In contrast, when we impose the
restrictions implied by the Imperfect Information hypothesis, this excess sensitivity result
vanishes; consumption appears to respond more or less as predicted by the Permanent
Income Hypothesis, although large standard errors make a precise assessment of this
sensitivity difficult.
Actually nesting the Perfect and Imperfect Information specifications proves to be somewhat
problematic, however. Although the respective hypotheses imply distinct covariance
patterns between changes in consumption and subsequent changes in income, the differences
are subtle, and the data are not particularly sympathetic to either specification. The
result is that the two key parameters, the sensitivity of consumption to income
innovations ($\beta$) and the proportion of ‘perfect information’ consumption, are not well
identified separately.
The orthogonality restrictions implied by the Permanent Income Hypothesis are accepted
in the unweighted data, but rejected in the weighted data. Including a fraction of
consumption attributed to rule-of-thumb consumers further muddies the empirical waters,
as the rule-of-thumb component of consumption can also account for patterns of covariance
between changes in income and lagged changes in consumption which are similar to those
implied by both types of Permanent Income consumption.
The general picture to emerge from these results is that the data are slightly more
consistent with Imperfect Information versions of the Permanent Income model than they
are with Perfect Information versions. While the data do not appear to be rich enough to
enable us to effectively discriminate between the two hypotheses, the Imperfect Information
version, at least, yields an estimate of the sensitivity parameter which is more consistent with
the implications of the Permanent Income Hypothesis than the one delivered by the Perfect
Information version. None of the parametric models of consumption we try, including
those which include rule-of-thumb behavior, appears to fit the data very well, however, as
indicated by their $\chi^2$ statistics.

2 An Illustrative Model

A simple model of life-cycle consumption will illustrate the consequences of inappropriately
specifying the consumer’s information set by attributing to the consumer knowledge about
the source of each shock. Specifically, we show how estimating the sensitivity parameter, $\beta$,
from the moment restrictions implied by the Perfect Information model (when the Imperfect
Information model is true) will deliver an estimate of the sensitivity parameter which is
inconsistent and biased upwards. For the sake of illustration, we will discuss the case with a
zero rate of time preference, a zero interest rate, and serially uncorrelated transitory income
shocks. In Section 3, we will drop these assumptions, and cover a somewhat more realistic
case with a constant interest rate (assumed to be equal to the rate of time preference), and
serially correlated transitory income shocks.
As in the Hall and Mishkin paper, there are two kinds of shocks to consumers’ income:
lifetime and transitory. Lifetime income shocks are assumed to exhibit infinite persistence,
while the transitory shocks decay over time. In other words, a lifetime income shock
permanently alters an individual’s earnings prospects, while a transitory income shock reflects
temporary ‘blips’ to earnings. The simplest stochastic specification of such a latent variable
process is to model the lifetime income shocks as innovations in a random walk process,
while the transitory component is simply white noise:

$$x_t = x_{t-1} + \varepsilon_t \tag{1}$$
$$y_t = x_t + \eta_t \tag{2}$$

Here, $x$ is lifetime income, $\varepsilon$ the shocks to lifetime income, $y$ observed income, and $\eta$
the transitory income shocks. The two components are assumed to be serially uncorrelated
(for the time being), and uncorrelated with one another.
Obviously, good permanent-income consumers would, if they were able to discern the two
kinds of shocks, consume the full amount of the lifetime income shock. On the other hand,
rather than consume the full amount of the transitory income shock, rational consumers
would clearly want to consume only the annuity value of the amount of the shock (at some
appropriate interest rate), thereby spreading the windfall over the duration of their lifetimes.

The assumption that these shocks are distinguishable to the consumer is, as we have
argued, inappropriate for many of the income changes we observe. The question we intend
to explore is what the consequences would be of attributing to consumers more information
on the nature of these shocks than they actually have, and estimating the consumption
model as if consumers could separately discern the two components.6
In order to say something specific about the joint behavior of consumption and income,
we need to specify a model of permanent-income consumption. For the purposes of this
example, we will make use of the simplest model imaginable. We assume throughout that
consumers maximize an additively separable, quadratic utility function in discrete time, in
which the consumer knows his lifetime with certainty. For the time being, we also take both
the rate of time preference and the rate of interest to be equal to zero. The maximization
problem is therefore:

$$\max \; E_t \sum_{i=0}^{T} u(c_{t+i}),$$

where

$$u(c) = d_0 + d_1 c + d_2 c^2,$$

subject to the budget constraint:

$$\sum_{i=0}^{T} c_{t+i} = W,$$

where $W$ is the sum of current assets plus the present value of all future income. The budget
constraint is assumed to hold ex post.7
With a zero interest rate, the following consumption rule solves the above maximization
problem:

$$c_t = \frac{1}{1+T}\left(A_t + y_t + \sum_{i=1}^{T} E_t y_{t+i}\right),$$

where $A_t$ is the value of the consumer’s assets at the beginning of period $t$, and the expression
within parentheses is the expected value of lifetime wealth. First differencing and using the
law of motion for $A_t$,

$$A_{t+1} = A_t + y_t - c_t,$$

yields an expression for $\Delta c$ which, because it is a function entirely of the revision in the
consumer’s expectations about future income between period $t$ and period $t+1$, embodies
the random walk principle of Hall (1978) so long as those expectations are formed rationally:

$$\Delta c_{t+1} = \frac{1}{T}\left(\sum_{i=1}^{T} E_{t+1} y_{t+i} - \sum_{i=1}^{T} E_t y_{t+i}\right).$$

6 In the vocabulary of dynamic factor models, of which this is an example, the issue is whether a one-factor
model is more appropriate than a two-factor version.
7 For the purposes of this model, we overlook the complications introduced by allowing the budget
constraint to hold in an expected value sense, enabling the consumer to die with negative net worth. See Hitter
(1988) for a discussion of this issue.
Because lifetime income is a random walk and transitory income is serially uncorrelated,
the consumer’s best forecast for income $j$ periods hence is exactly the same as his
one-step-ahead forecast of income:

$$E_t y_{t+j} = E_t y_{t+1}.$$

Using this fact in the expression for $\Delta c$ yields a simplified expression in terms of revisions
in expectations:

$$\Delta c_{t+1} = \frac{1}{T}\left(y_{t+1} + (T-1)E_{t+1}y_{t+2} - T E_t y_{t+1}\right). \tag{3}$$

This is the point at which the assumption about the nature of the information set
available to the consumer becomes crucial. If the consumer can, in fact, distinguish the two
shocks (thereby observing his own lifetime income, $x$), $E_{t+1}y_{t+2}$ is simply equal to $x_{t+1}$,
and $E_t y_{t+1}$ is just $x_t$. In this Perfect Information case, the expression for $\Delta c_{t+1}$ simplifies
to:

$$\Delta c_{t+1} = \varepsilon_{t+1} + \frac{1}{T}\eta_{t+1}. \tag{4}$$
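The step from Equation 3 to Equation 4 can be spelled out as a short worked derivation. This is a sketch that uses only Equations 1 to 3 and the two perfect-information forecasts stated above, $E_{t+1}y_{t+2} = x_{t+1}$ and $E_t y_{t+1} = x_t$; no further assumptions are introduced.

```latex
\begin{align*}
\Delta c_{t+1}
  &= \frac{1}{T}\bigl(y_{t+1} + (T-1)\,x_{t+1} - T\,x_t\bigr) \\
  &= \frac{1}{T}\bigl(x_{t+1} + \eta_{t+1} + (T-1)\,x_{t+1} - T\,x_t\bigr) \\
  &= (x_{t+1} - x_t) + \frac{1}{T}\,\eta_{t+1} \\
  &= \varepsilon_{t+1} + \frac{1}{T}\,\eta_{t+1}.
\end{align*}
```

The lifetime shock is consumed in full, while only the fraction $1/T$ of the transitory shock is consumed.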

These forecasts are clearly infeasible for consumers who are unable to distinguish one
kind of shock from the other. In a sense, an uninformed consumer is suffering from an
errors-in-variables problem similar to that experienced by econometricians trying to estimate
permanent income models. Because he is unable to use the unobservable lifetime
income to forecast his future income, the consumer must come up with a forecast based
only on those elements in his information set. While each individual clearly has a large
amount of idiosyncratic information on which he can base his forecast of future income
(e.g., education, promotion prospects, etc.), we will model his prediction problem as if he
had only the information in his earnings history to go on.
The construction of an optimal forecast rule for this restricted information set is simplified
by the observation that the second moments of a latent variable time series process such
as Equations 1 and 2 are equal to the second moments of an alternative ARMA process. In
other words, to someone who could not discern the underlying latent variables, $\Delta y$ would
‘look’ just like an ARMA process. An uninformed consumer, who could not separately
discern the latent variables in Equations 1 and 2 could, therefore, construct an optimal linear
forecast of his earnings based on this corresponding ARMA process. In our case, where
lifetime income follows a random walk and transitory income is serially uncorrelated, $\Delta y$
can be written as:

$$\Delta y_t = \varepsilon_t + \eta_t - \eta_{t-1},$$

with autocovariances:

$$E(\Delta y_t^2) = \sigma_\varepsilon^2 + 2\sigma_\eta^2$$
$$E(\Delta y_t \Delta y_{t-1}) = -\sigma_\eta^2$$
$$E(\Delta y_t \Delta y_{t-k}) = 0 \quad \text{for } k \ge 2.$$
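These autocovariances can be checked by simulation. The following is a minimal sketch, assuming Gaussian shocks with unit variances, a fixed seed, and hypothetical helper names (`simulate_income`, `autocov`); none of these choices come from the paper.

```python
import random

def simulate_income(n, sd_eps=1.0, sd_eta=1.0, seed=0):
    """Simulate y_t = x_t + eta_t with x_t = x_{t-1} + eps_t (Equations 1 and 2)."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, sd_eps)            # lifetime (random walk) component
        ys.append(x + rng.gauss(0.0, sd_eta))  # add the white-noise transitory shock
    return ys

def autocov(series, lag):
    """Sample autocovariance at a given lag, treating the mean as zero
    (the change in income has mean zero by construction here)."""
    n = len(series) - lag
    return sum(series[t] * series[t + lag] for t in range(n)) / n

y = simulate_income(200_000)
dy = [y[t] - y[t - 1] for t in range(1, len(y))]
gamma = [autocov(dy, k) for k in range(3)]
# With unit variances the theoretical values are 3, -1 and 0.
print([round(g, 2) for g in gamma])
```

The sample moments should sit close to the theoretical values, with only the first autocovariance different from zero.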

Because the autocovariances of $\Delta y$ are zero beyond the first, one can find some MA(1)
process which will generate exactly the same set of autocovariances as those generated by
our latent variable model. If $b$ is the moving-average parameter of the corresponding MA(1)
process, then $\Delta y$ can be written as:

$$\Delta y_t = e_t + b e_{t-1},$$

with autocovariances:

$$E(\Delta y_t^2) = (1+b^2)\sigma_e^2$$
$$E(\Delta y_t \Delta y_{t-1}) = b\sigma_e^2$$
$$E(\Delta y_t \Delta y_{t-k}) = 0 \quad \text{for } k > 1.$$

Equating the two sets of autocovariances and solving for $b$ as a function of $\sigma_\varepsilon^2$ and $\sigma_\eta^2$
yields the following expression for $b$:8

$$b = -1 - \frac{\sigma_\varepsilon^2}{2\sigma_\eta^2} + \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\eta^2} + \frac{\sigma_\varepsilon^4}{4\sigma_\eta^4}}.$$
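Equating the two sets of autocovariances gives a quadratic in $b$, whose invertible root lies between $-1$ and $0$. A minimal sketch of that mapping follows; the function name `ma1_from_latent` and the unit variances in the example are illustrative assumptions.

```python
import math

def ma1_from_latent(var_eps, var_eta):
    """Return (b, var_e) of the MA(1) whose autocovariances match those of
    dy_t = eps_t + eta_t - eta_{t-1}. Requires var_eta > 0."""
    q = var_eps / var_eta                            # ratio of lifetime to transitory variance
    b = -1.0 - q / 2.0 + math.sqrt(q + q * q / 4.0)  # invertible root, -1 <= b < 0
    var_e = -var_eta / b                             # from b * var_e = -var_eta
    return b, var_e

b, var_e = ma1_from_latent(var_eps=1.0, var_eta=1.0)
print(round(b, 4), round(var_e, 4))
```

A larger variance ratio pushes $b$ toward zero (income changes look almost purely permanent), while a smaller ratio pushes it toward $-1$.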

Using the standard forecast rule for MA(1) processes,

$$E_t y_{t+1} = (1+b)\sum_{i=0}^{\infty}(-b)^i y_{t-i},$$

to substitute for the expectations in Equation 3 yields an error-learning equation for $\Delta c$:

$$\Delta c_{t+1} = \left(1 + b - \frac{b}{T}\right)\left(y_{t+1} - E_t y_{t+1}\right), \tag{5}$$

where the MA forecast error, in terms of the latent variables, is:

$$y_{t+1} - E_t y_{t+1} = e_{t+1} = \sum_{i=0}^{\infty}(-b)^i\varepsilon_{t+1-i} + \eta_{t+1} - (1+b)\sum_{i=0}^{\infty}(-b)^i\eta_{t-i}.$$

While obviously not orthogonal to lagged $\varepsilon$ or $\eta$ individually, this forecast error (and
the corresponding change in consumption) will be uncorrelated with all elements in the
uninformed consumer’s information set: that is, all lagged changes in his observed income.
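The forecast rule and the error-learning equation can be sketched as a simple recursion on observed income. The code below assumes pre-sample values of zero for income and for the innovation, and uses made-up income figures and hypothetical function names; it is an illustration of Equation 5, not the estimation procedure of the paper.

```python
def ma1_forecasts(y, b):
    """One-step-ahead forecasts E_t y_{t+1} = y_t + b * e_t for an MA(1) in dy.

    Assumes y_{-1} = 0 and e_{-1} = 0 (an illustrative initialization).
    """
    forecasts, e_prev, y_prev = [], 0.0, 0.0
    for y_t in y:
        e_t = (y_t - y_prev) - b * e_prev  # innovation from dy_t = e_t + b * e_{t-1}
        forecasts.append(y_t + b * e_t)
        e_prev, y_prev = e_t, y_t
    return forecasts

def consumption_changes(y, b, T):
    """Equation 5: dc_{t+1} = (1 + b - b/T) * (y_{t+1} - E_t y_{t+1})."""
    f = ma1_forecasts(y, b)
    k = 1.0 + b - b / T
    return [k * (y[t + 1] - f[t]) for t in range(len(y) - 1)]

y = [10.0, 12.0, 11.0, 13.0]  # made-up income observations
b = -0.4                      # hypothetical MA(1) parameter
f = ma1_forecasts(y, b)
# The recursion reproduces the geometric-weights form (1 + b) * sum_i (-b)^i y_{t-i}.
weighted = [(1 + b) * sum((-b) ** i * y[t - i] for i in range(t + 1))
            for t in range(len(y))]
dc = consumption_changes(y, b, T=40)
```

Under the zero pre-sample assumption, the recursive forecast and the exponentially weighted sum of past income coincide, which is the sense in which the rule is "error-learning".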

This orthogonality condition places a testable restriction on the covariance matrix between
$\Delta c$ and lagged $\Delta y$:

$$E(\Delta c_t \Delta y_{t-k}) = 0 \quad \text{for } k \ge 1.$$

8 This is identical to the expression derived by Muth (1960) by explicitly minimizing the mean-square
error of a linear forecast of a random walk with measurement error.

In the Imperfect Information case, multiplying the error-learning equation for $\Delta c$
(Equation 5) by (leads of) the expression for $\Delta y$ and taking expectations yields the other
restrictions on the elements of the covariance matrix:

$$E(\Delta c_t \Delta y_t) = \left(1 + b - \frac{b}{T}\right)\left(\sigma_\varepsilon^2 + (2+b)\sigma_\eta^2\right) \tag{6}$$
$$E(\Delta c_t \Delta y_{t+1}) = -\left(1 + b - \frac{b}{T}\right)\sigma_\eta^2. \tag{7}$$

On the other hand, multiplying the Perfect Information consumption rule in Equation 4
by $\Delta y$ and taking expectations yields a different set of covariance restrictions:

$$E(\Delta c_t \Delta y_t) = \sigma_\varepsilon^2 + \frac{1}{T}\sigma_\eta^2 \tag{8}$$
$$E(\Delta c_t \Delta y_{t+1}) = -\frac{1}{T}\sigma_\eta^2. \tag{9}$$

Regardless of the assumptions made about the consumer’s information set, the
autocovariances of the income process are:

$$E(\Delta y_t^2) = \sigma_\varepsilon^2 + 2\sigma_\eta^2 \tag{10}$$
$$E(\Delta y_t \Delta y_{t-1}) = -\sigma_\eta^2. \tag{11}$$

Under either assumption about the nature of the consumer’s information, these four
equations can be used to identify, from the estimated covariances of $\Delta c$ and $\Delta y$, the
parameters of the income process and the consumption model. The task is to show how changing
the informational assumption alters the mapping from the structural parameters of the
consumption model to the moments of the joint distribution of $\Delta c$ and $\Delta y$ in such a way as
to lead to biased and inconsistent estimates of the parameters of the consumption model.

Our concern here is the responsiveness of consumption to income innovations, which we
parameterize with $\beta$. In the context of the Perfect Information model, $\beta$ can be thought
of as the annuity value of a transitory shock, which reflects a combination of the interest
rate used to discount future income, the length of the consumer’s planning horizon, and
the persistence of transitory income shocks. In this illustrative model, $\beta$ is a relatively
uninteresting quantity, simply equal to $1/T$. (In the extended model, $\beta$ will depend not only
on the length of the horizon, but also on the prevailing interest rate, and the persistence of
the transitory shocks.)

The definition of $\beta$ is exactly the same in the pure Imperfect Information case as in
the Perfect Information case. With no distinction between transitory and permanent shocks,
however, its interpretation is somewhat less obvious. The correct interpretation of the $\beta$
parameter is still as an index of the sensitivity of consumption to income innovations. Now,
$\beta$ measures the degree to which this sensitivity exceeds the ‘baseline’ (infinite horizon)
sensitivity, $(1+b)$. Alternatively, the Imperfect Information $\beta$ can be thought of as the
responsiveness to transitory shocks consistent with the observed response to indistinguishable
shocks.
Our plan is to explore the sensitivity issue by estimating the consumption model,
comparing the estimate of $\beta$ with what could be thought of as reasonable values for that
parameter. In the Perfect Information case, one can identify $\beta$ through the covariance of $\Delta c_t$
with $\Delta y_{t+1}$ (Equation 9), and the first autocovariance of $\Delta y$ (Equation 11), since, in the
perfect information case,

$$\beta = \frac{E(\Delta c_t \Delta y_{t+1})}{E(\Delta y_t \Delta y_{t-1})}.$$

Forming the ratio of the sample analogs of these moments should, if the model is correctly
specified, deliver a consistent estimate of $\beta$.

If, on the other hand, the sample covariances are generated by the consumption of
individuals who are unable to distinguish between the two kinds of shocks, then using this
ratio to identify $\beta$ yields an inconsistent $\hat\beta$, a linear combination of the true $\beta$ and 1 with
weights $-b$ and $1+b$ (where $b$ is negative):

$$\hat\beta = (1+b) - b\beta.$$

Thus, in this simple example, the $\hat\beta$ obtained from estimating the model under the
incorrect assumption of perfect information would be subject to a potentially serious
inconsistency problem, leading to an overstatement of the response of consumption to income.9
Such a problem could at least partially account for the Hall-Mishkin finding of excess
sensitivity in the response of consumption to innovations in transitory income.
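The size of this inconsistency is easy to illustrate numerically. The sketch below computes the probability limit of the Perfect Information ratio estimator when the moments are in fact generated by the Imperfect Information rule of Equation 5. The 40-period horizon, the unit variances, and the function name `beta_hat_plim` are illustrative assumptions, not values from the paper.

```python
import math

def beta_hat_plim(beta, var_eps, var_eta):
    """Limit of the ratio E(dc_t dy_{t+1}) / E(dy_t dy_{t-1}) when consumers
    cannot distinguish the two kinds of shocks (illustrative model of Section 2)."""
    q = var_eps / var_eta
    b = -1.0 - q / 2.0 + math.sqrt(q + q * q / 4.0)   # MA(1) parameter
    cov_dc_dy_lead = -(1.0 + b - b * beta) * var_eta  # Equation 7, with beta = 1/T
    autocov_dy_1 = -var_eta                           # Equation 11
    return cov_dc_dy_lead / autocov_dy_1

true_beta = 1.0 / 40.0  # beta = 1/T with a hypothetical 40-period horizon
est = beta_hat_plim(true_beta, var_eps=1.0, var_eta=1.0)
print(true_beta, round(est, 3))
```

With equal shock variances, the ratio converges to roughly 0.63 against a true value of 0.025, so the misspecified estimator would suggest a response more than twenty times too large.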

3 The Extended Model

In this section, we extend the basic model of consumption outlined in the preceding
section to include two additional features: serially correlated transitory income shocks, and a
nonzero (but constant) interest rate. We also expand the specification to allow for advance
information about changes in income, and construct a specification which includes both
distinguishable and indistinguishable shocks, thereby nesting the Hall-Mishkin restrictions
within a more general model. Finally, because the PSID study covers only food expenditures
(rather than total consumption) for most years, we modify the model to describe the
behavior of food consumption.

3.1 Serially Correlated Transitory Income Shocks

In order to approximate the dynamics of the changes in households’ earnings, we model the
transitory income component as following an AR(1) process, while the lifetime income
component continues to be a random walk with uncorrelated errors. In the example above, with
white noise transitory income shocks, the time series process which replicated the
autocovariances of the latent variable process for $\Delta y$ was an ARMA(0,1); here, the equivalent
time series process is an ARMA(1,1). We will briefly sketch the mapping between the two
representations. In terms of the latent variables, the income model is:

$$(1 - L)x_t = \varepsilon_t \tag{12}$$
$$y_t = x_t + \eta_t \tag{13}$$
$$(1 - \phi L)\eta_t = v_t \tag{14}$$

where $\varepsilon$ and $v$ are white noise. $L$ denotes the lag operator.

9 The $\beta$ parameter is not the only one for which bias may be a problem when the information set is
misspecified. Because the PSID data set reports only food expenditures, it is necessary to jointly estimate
both $\beta$ and $\alpha$, the slope of the Engel curve for food. A similar argument can be made that the incorrect
specification will lead to an inflated estimate of $\alpha$, if it is also estimated using sample covariances.
As above, if consumers are unable to make out the lifetime and transitory components
separately, this latent variable process will look to them just like some ARMA process.
Specifically, Equations 12 through 14 can be rewritten in terms of $\Delta y$ as:

$$(1 - \phi L)\Delta y_t = \Delta v_t + (1 - \phi L)\varepsilon_t,$$

which is recognizable as an ARMA(1,1) in $\Delta y$, with a composite error term consisting of
terms in $v$ and $\varepsilon$.
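Calling the autoregressive parameter $a$ and the moving-average parameter $b$, the
corresponding ARMA process can be written as:

$$(1 - aL)\Delta y_t = (1 + bL)e_t.$$

Equating the autocovariances of the two representations and solving for $a$, $b$, and $\sigma_e^2$ in terms
of $\phi$, $\sigma_\varepsilon^2$, and $\sigma_v^2$ yields expressions for the parameters of the ARMA(1,1) representation in
terms of $\phi$ and the ratio of the variances of the lifetime and the transitory shocks:

$$a = \phi$$
$$b = \frac{-1 - \frac{1}{2}(\phi^2 + 1)\frac{\sigma_\varepsilon^2}{\sigma_v^2} + (1 - \phi)\sqrt{\frac{\sigma_\varepsilon^2}{\sigma_v^2}}\sqrt{\frac{1}{4}(1 + \phi)^2\frac{\sigma_\varepsilon^2}{\sigma_v^2} + 1}}{\phi\frac{\sigma_\varepsilon^2}{\sigma_v^2} + 1}$$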
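The mapping from the latent-variable parameters to the ARMA(1,1) parameters can be sketched by matching the variance and first autocovariance of the composite error $\Delta v_t + (1 - \phi L)\varepsilon_t$ with those of $(1 + bL)e_t$. The function name `arma11_from_latent` and the parameter values are hypothetical, and the sketch assumes $\sigma_v^2 > 0$.

```python
import math

def arma11_from_latent(phi, var_eps, var_v):
    """Return (a, b, var_e) of the ARMA(1,1) for dy implied by Equations 12-14."""
    g0 = 2.0 * var_v + (1.0 + phi * phi) * var_eps  # variance of the composite error
    g1 = -(var_v + phi * var_eps)                   # its first autocovariance
    rho = g1 / g0
    # Invertible root of rho * b^2 - b + rho = 0.
    b = (1.0 - math.sqrt(1.0 - 4.0 * rho * rho)) / (2.0 * rho)
    return phi, b, g1 / b

a, b, var_e = arma11_from_latent(phi=0.3, var_eps=1.0, var_v=1.0)
print(a, round(b, 4), round(var_e, 4))
```

Setting `phi=0.0` recovers the MA(1) parameter of the illustrative model, which gives a quick consistency check on the extended mapping.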

Having ascertained the parameters of the appropriate ARMA process, all that remains
is to insert the forecast rules:

$$E_t \Delta y_{t+1} = (\phi + b)\sum_{i=0}^{\infty}(-b)^i \Delta y_{t-i}$$
$$E_t \Delta y_{t+k} = \phi^{k-1}E_t \Delta y_{t+1} = \phi^{k-1}(\phi + b)\sum_{i=0}^{\infty}(-b)^i \Delta y_{t-i}$$

into the error-learning consumption equation of our uninformed consumer, and use the
result to generate the corresponding moment restrictions.

3.2 Nonzero Rate of Interest

While the introduction of a nonzero interest rate and serially correlated transitory income
complicates matters somewhat, the consumer’s decision rule retains the error-learning
structure it had in the simpler version of the model. The main difference is in the definition
of the sensitivity parameter, $\beta$, which now is a function of the interest rate and the serial
correlation parameter, as well as the length of the consumer’s horizon. Specifically, the $\beta$
which describes the ‘correct’ degree of sensitivity to income surprises takes the form:

$$\beta = \omega/\mu, \quad \text{where} \quad \mu = \frac{1 - \left(\frac{1}{1+r}\right)^{T}}{1 - \frac{1}{1+r}} \quad \text{and} \quad \omega = \frac{1 - \left(\frac{\phi}{1+r}\right)^{T}}{1 - \frac{\phi}{1+r}}.$$

As $T$ increases, the $\beta$ parameter, defined in this way, approximates the interest rate, $r$;
in the limit as $T \to \infty$, $\beta \to r/(1 + r - \phi)$. Both the Imperfect Information and the
Perfect Information versions of the consumption rule can now be rewritten in terms of this
$\beta$ parameter.
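The behavior of $\beta$ as the horizon lengthens can be sketched numerically. The code below assumes that $\mu$ and $\omega$ are $T$-term discounted sums, which makes $\beta$ collapse to $1/T$ when $r = 0$ and $\phi = 0$, as in the illustrative model; the exact horizon convention, the function name `annuity_beta`, and the parameter values are assumptions.

```python
def annuity_beta(r, phi, T):
    """beta = omega / mu, using finite sums over an assumed T-period horizon."""
    disc = 1.0 / (1.0 + r)
    mu = sum(disc ** i for i in range(T))             # annuity factor
    omega = sum((phi * disc) ** i for i in range(T))  # discounted AR(1) decay
    return omega / mu

short = annuity_beta(r=0.05, phi=0.3, T=10)
long_run = annuity_beta(r=0.05, phi=0.3, T=2000)
limit = 0.05 / (1.0 + 0.05 - 0.3)  # the limiting value r / (1 + r - phi)
print(round(short, 4), round(long_run, 4), round(limit, 4))
```

Using sums rather than the closed-form ratios avoids a division by zero when the interest rate is exactly zero.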

At this point, it is convenient to introduce the modification required to estimate the
model on food expenditure alone. Going on the assumption that the Engel curve for food is
approximately linear, with slope equal to $\alpha$ and a nonzero intercept, the only modification
required is straightforward, and simply involves inserting this $\alpha$ parameter into the equations
describing the consumption rule. Incorporating these changes, and reinterpreting $\Delta c$ as
referring to food consumption only, yields the following generic expression for $\Delta c$:

$$\Delta c_t = \frac{\alpha}{\mu}\sum_{i=0}^{T-1}\left(\frac{1}{1+r}\right)^{i}\left(E_t y_{t+i} - E_{t-1} y_{t+i}\right).$$

To derive the imperfect information version, we combine this expression with that for
the ARMA(1,1) forecast rule, yielding an error-learning equation for $\Delta c$ as a function of
the period $t$ forecast error:

$$\Delta c_t = \alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\beta\right)\left(y_t - E_{t-1}y_t\right). \tag{15}$$

The perfect information model, on the other hand, implies a very different consumption
rule, specifying a separate response to each component:

$$\Delta c_t = \alpha\varepsilon_t + \alpha\beta v_t. \tag{16}$$

As in the simple example of the preceding section, this error-learning rule for consumption
implies a specific set of restrictions on the elements of the covariance matrix of $\Delta c$ and
the leads and lags of $\Delta y$. Also, as before, it implies the orthogonality condition between
$\Delta c$ and all lagged elements of the information set, including lagged changes in $y$.

The other restrictions on the covariance matrix implied by the imperfect information
model can be found by substituting the ARMA forecast error (in terms of the latent
variables) into the error-learning rule, multiplying by $\Delta y$, and taking expectations. They are:

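$$E(\Delta c_t \Delta y_t) = \alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\beta\right)\left(\sigma_\varepsilon^2 + \frac{2+b-\phi}{1+\phi b}\sigma_v^2\right) \tag{17}$$
$$E(\Delta c_t \Delta y_{t+1}) = -\alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\beta\right)\frac{(1-\phi)^2}{1+\phi b}\sigma_v^2 \tag{18}$$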

where $b$ is defined as above. On the other hand, the perfect information model implies a
distinct pattern of covariance between $\Delta c$ and $\Delta y$:

$$E(\Delta c_t \Delta y_t) = \alpha\sigma_\varepsilon^2 + \alpha\beta\sigma_v^2 \tag{19}$$
$$E(\Delta c_t \Delta y_{t+1}) = -\alpha\beta(1-\phi)\sigma_v^2. \tag{20}$$

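As before, the pattern of autocovariances of $\Delta y$ is independent of the specification of
the consumer’s information set:

$$E(\Delta y_t^2) = \sigma_\varepsilon^2 + \frac{2}{\phi+1}\sigma_v^2 \tag{21}$$
$$E(\Delta y_t \Delta y_{t-1}) = \frac{\phi-1}{\phi+1}\sigma_v^2 \tag{22}$$
$$E(\Delta y_t \Delta y_{t-2}) = \phi\,\frac{\phi-1}{\phi+1}\sigma_v^2. \tag{23}$$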

Comparing Equations 15 and 16, the consumption rules for the Imperfect and the Perfect
Information cases, it is clear that the competing hypotheses imply qualitatively different
reactions in response to a shock to income. One way to see this is to compare the expressions
for the covariance between Δy_{t+1} and Δc_t.
Consider first the perfect information case, and consider a period in which an individual
receives a positive transitory income ‘blip.’ If consumption behaves according to Equation 16,
Δy_{t+1} and Δc_t will be negatively correlated for the following reason: in the current
period, our consumer will adjust his consumption upwards. In the subsequent period, income
will fall, but the consumer knew it was going to, so consumption won’t change. The
correlation between this period’s increase in consumption and next period’s decrease in income
generates the negative correlation. Furthermore, as β approaches zero, this negative
covariance in the Perfect Information case shrinks to zero, since the accompanying change
in consumption also shrinks to zero.
On the other hand, under the Imperfect Information assumption, even as β goes to zero,
we will see a negative covariance between Δy_{t+1} and Δc_t as successive observations of y yield
additional information on whether the initial change in income was a lifetime or transitory
income shock. Thus, it is the covariance of Δc_t with Δy_{t+1}, relative to Cov(Δc_t, Δy_t) and
Var(Δy_t), which will be key in attempting to distinguish the competing hypotheses.
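The contrast between the two information assumptions can be illustrated numerically. The sketch below is an illustration only: it assumes the univariate representation (1 − φL)Δy_t = (1 + bL)e_t, solves for the invertible b implied by φ and the two innovation variances, and uses closed forms for E(Δc_t Δy_{t+1}) derived from that representation, namely −αβ(1 − φ)σ_v² under Perfect Information and −α[(1 + b)/(1 − φ) − β(φ + b)/(1 − φ)](1 − φ)²/(1 + φb)σ_v² under Imperfect Information. These forms and all function names are assumptions of this sketch and should be checked against Equations 17 to 20; 0 ≤ φ < 1 and positive variances are assumed.

```python
import math


def ma_parameter(phi, var_eps, var_v):
    """Invertible MA(1) coefficient b in (1 - phi L) dy_t = (1 + b L) e_t."""
    c0 = (1.0 + phi ** 2) * var_eps + 2.0 * var_v  # variance of (1 - phi L) dy_t
    c1 = -(phi * var_eps + var_v)                  # its first autocovariance
    rho = c0 / c1                                  # equals (1 + b**2) / b
    return (rho + math.sqrt(rho ** 2 - 4.0)) / 2.0  # root with |b| <= 1


def cov_next_perfect(alpha, beta, phi, var_v):
    """E(dc_t dy_{t+1}) when both shocks are observed separately."""
    return -alpha * beta * (1.0 - phi) * var_v


def cov_next_imperfect(alpha, beta, phi, var_eps, var_v):
    """E(dc_t dy_{t+1}) when only the composite innovation e_t is observed."""
    b = ma_parameter(phi, var_eps, var_v)
    response = (1.0 + b) / (1.0 - phi) - (phi + b) / (1.0 - phi) * beta
    return -alpha * response * (1.0 - phi) ** 2 / (1.0 + phi * b) * var_v


if __name__ == "__main__":
    # As beta shrinks, the Perfect Information covariance vanishes,
    # while the Imperfect Information covariance stays negative.
    for beta in (0.4, 0.1, 0.0):
        print(beta,
              cov_next_perfect(0.08, beta, 0.3, 1.0),
              cov_next_imperfect(0.08, beta, 0.3, 1.0, 1.0))
```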

Before we estimate the model, we first discuss two additional ways in which we
augment the basic indistinguishable shocks model of Section 2, and then discuss the question
of identification in that more general model.

3.3 Allowing For Advance Information

A striking feature of the household consumption data is that the correlation between the
current change in consumption and the change in income one period hence is significantly
positive, rather than negative as implied by the Permanent Income Hypothesis under either
informational assumption. The standard explanation of this phenomenon is that consumers,
when they make their ‘current’ consumption decision, already have some advance information
about the subsequent period’s innovations in income. The timing of the PSID survey
corroborates this interpretation. The survey, which is administered in March of every year,
contains income questions which refer to the previous year’s earnings. The consumption
questions, on the other hand, are usually interpreted as pertaining to current consumption.
Therefore, it is only natural that consumption in March of 1984 should already have
responded to some of the earnings news for 1984.
To compensate for this timing problem, we assume, following Hall and Mishkin, that the
‘true’ current change in consumption is a convex combination of the theoretical ‘current’
change in consumption and ‘next period’s’ change in consumption. If γ is the ‘proportion
with no advance information,’ then:

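\[ \Delta c_t = \gamma\alpha \left( \frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\,\beta \right) (y_t - E_{t-1} y_t) + (1-\gamma)\alpha \left( \frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\,\beta \right) (y_{t+1} - E_t y_{t+1}). \]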

One justification for this specification is that it corresponds to a model of information
propagation in which there is a probability 1 − γ every period that the next period’s innovation
in income will be known before this period’s consumption decision is taken.
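A short calculation shows how advance information can turn the covariance of Δc_t with Δy_{t+1} positive. The sketch below is illustrative and rests on stated assumptions: Δy follows (1 − φL)Δy_t = (1 + bL)e_t with innovation variance var_e, so that E(e_t Δy_{t+1}) = (φ + b)·var_e and E(e_{t+1} Δy_{t+1}) = var_e, and consumption responds to e_t with weight γ and to e_{t+1} with weight 1 − γ. The symbol kappa stands for the bracketed response coefficient multiplied by α; it and the function name are hypothetical labels introduced here.

```python
def cov_dc_next_dy(gamma, kappa, phi, b, var_e):
    """E(dc_t dy_{t+1}) when dc_t = gamma*kappa*e_t + (1-gamma)*kappa*e_{t+1}."""
    current_part = gamma * (phi + b)  # e_t predicts a partial reversal when phi + b < 0
    advance_part = 1.0 - gamma        # e_{t+1} moves dy_{t+1} one for one
    return kappa * var_e * (current_part + advance_part)


if __name__ == "__main__":
    # With no advance information (gamma = 1) the covariance is negative;
    # with enough advance information it becomes positive.
    for gamma in (1.0, 0.8, 0.6):
        print(gamma, cov_dc_next_dy(gamma, 1.0, 0.3, -0.55, 1.0))
```

In this parameterization the sign switches at γ = 1/(1 − (φ + b)), so a value of γ well below one is needed to match a significantly positive sample covariance.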

3.4 Nesting The Two Models

While it seems plausible that consumers often cannot distinguish between temporary and
permanent shocks to their incomes, it is unlikely that they could never make this distinction.
To allow for the possibility that consumers face a situation of partial information,
we will assume that the consumer faces two sets of income shocks: one distinguishable set,
and one indistinguishable set. Here, we will describe a fully general (but underidentified)
specification, and will later discuss the restrictions on the model required to achieve
identification. The required restrictions will condense this general specification into a convex
combination of the Hall-Mishkin model and our imperfect information model. We will start
with an unrestricted latent variable specification for the Δy process:

\[ \Delta y_t = \varepsilon_{1,t} + \eta_{1,t} - \eta_{1,t-1} + \varepsilon_{2,t} + \eta_{2,t} - \eta_{2,t-1} \]

where ε_{1,t}, ε_{2,t}, η_{1,t} and η_{2,t} are independent of each other, and are described by the same
stochastic processes as ε and η above (i.e., ε_{1,t} and ε_{2,t} are white noise with variances
σ²_{1,ε} and σ²_{2,ε}, while η_{1,t} and η_{2,t} follow the processes (1 − φ_1 L)η_{1,t} = v_{1,t} and
(1 − φ_2 L)η_{2,t} = v_{2,t}, where the variances of v_{1,t} and v_{2,t} are σ²_{1,v} and σ²_{2,v}).

We now assume that the consumer cannot separately observe ε_{1,t} and η_{1,t}, but can
separately observe ε_{2,t} and η_{2,t}. This makes our augmented consumption model a sum
of two parts: one which corresponds to the Imperfect Information model (Equation 15),
and one which corresponds to the Perfect Information model (Equation 16). Ignoring, for
the time being, the advance information complication, this extension implies the following
equation for Δc:




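\[ \Delta c_t = \alpha \left( \frac{1+b}{1-\phi_1} - \frac{\phi_1+b}{1-\phi_1}\,\beta \right) e_t + \alpha \left( \varepsilon_{2,t} + \beta' v_{2,t} \right) \]

where

\[ e_t = \varepsilon_{1,t} + v_{1,t} - (\phi_1 + b) \sum_{j=0}^{\infty} (-b)^j \varepsilon_{1,t-1-j} - (1 + b) \sum_{j=0}^{\infty} (-b)^j v_{1,t-1-j} \]

and β and β' are defined exactly as before for the two subprocesses.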
This formulation nests the Perfect Information specification of Hall and Mishkin with
the Imperfect Information specification, as special cases of a more general model. The
two special cases correspond to setting ε_{1,t} and v_{1,t}, or ε_{2,t} and v_{2,t}, identically equal to
zero. However, as we will argue in the next section, this most general formulation does
not impose enough restrictions to achieve identification. In order to construct an estimable
specification, we must impose a number of restrictions on the variances of the observed and
unobserved components discussed above.
The system of equations we use to identify the parameters appears in full in the appendix.
What follows is a brief discussion of some of the problems we face in identifying the
nested versions of the model. There are essentially six variances and covariances which provide
independent information about the parameters of this model. They are: E(Δc_t Δy_t),
E(Δc_t Δy_{t+1}), E(Δc_t Δy_{t+2}), E(Δy_t²), E(Δy_t Δy_{t-1}), and E(Δy_t Δy_{t-2}). As the model is
specified above (incorporating both the advance information extension and the two classes
of income innovations), there are nine parameters to identify: γ, β, β', φ_1, φ_2, σ²_{1,ε}, σ²_{2,ε},
σ²_{1,v}, and σ²_{2,v}.

With such an excess of parameters relative to the number of moments, the model as it
stands is clearly underidentified. To identify the model, we proceed by assuming that the
distinguishable and indistinguishable shocks come from populations with identical autoregressive
parameters, and equal variances, up to a constant of proportionality:

\[ \phi_1 = \phi_2 \]

\[ \frac{\sigma^2_{1,\varepsilon}}{1-p} = \frac{\sigma^2_{2,\varepsilon}}{p} \]

\[ \frac{\sigma^2_{1,v}}{1-p} = \frac{\sigma^2_{2,v}}{p} \]

In addition to eliminating a φ and two σ²s, these assumptions also imply that β = β'.
Introducing the constant of proportionality between variances, p, adds a single parameter,
so that we are left with exactly six coefficients to estimate, not counting α, which will be
estimated separately.
These assumptions essentially say that the two pairs of innovations to the income process
are identical, except to the extent that one pair may have a higher variance than the other
pair. One way to interpret this restricted model is the following: consumers receive only two
kinds of shocks, ε and v. Each year, some proportion of consumers, p, receive information
on the source of their shocks, while the rest, 1 − p, receive no information on the source of
their shocks.
This is a non-trivial assumption, since one can think of cases where the distinguishable
shocks come from one kind of population (say, changes in direct taxes) while the
indistinguishable shocks come from a different kind of population (say, changes in indirect taxes);
however, it is unavoidable if we are to achieve identification. In this restricted form, our
model nests both the Perfect and Imperfect Information models as special cases corresponding
to p = 1 and p = 0, respectively.

3.5 Estimating α

One final identification problem remains. The α parameter, which appears in each of the
Δc-Δy elements of the covariance matrix, is not identifiable from covariances alone,
unless the Perfect Information assumption is maintained. In this case, the fact that both
the transitory and the lifetime factors have their own coefficients, αβ and α, respectively,
allows α to be identified from the covariances. This accords with intuition, which suggests
that we can discern α from the response to a lifetime shock. Then, knowing α and observing
the response to a transitory shock, we can ‘back out’ an estimate of β.
In the Imperfect Information situation, this is no longer the case: α and a term containing
β always enter the covariance restrictions as a product, meaning that the two cannot
be disentangled from this information alone. It is possible to verify from the system of
equations in the appendix that even with some non-zero fraction of Perfect Information
consumption, the presence of the p nesting parameter makes it impossible to determine α
from the covariances.





We choose an alternate method of estimating the slope of the Engel curve, involving
an auxiliary regression of consumption on income: in effect, using information from the
levels of consumption and income, rather than the differences, to achieve identification.10
This is just the kind of regression which is susceptible to the errors-in-variables problem
identified by Friedman (1957). The problem is that measured income is the sum of transitory
and permanent components; a naive regression of consumption on income will yield an
attenuated estimate of the marginal propensity to consume food out of permanent income.
We remedy this problem by estimating the Engel curve on time averages of each household’s
data; that is, we regress the average consumption level for household i on the
average income of that household in a regression of the form:

\[ \bar{c}_i = k + \alpha \bar{y}_i . \]

The idea behind this ‘between’ estimator of α is that the measurement errors induced
by transitory income tend to cancel each other out over time, so that ȳ is a relatively
noise-free estimate of permanent income. Such a procedure will not completely eliminate
measurement error, but will at least reduce it.
Table 1 below shows the equation we use to obtain our estimate of α, in which we also
control for the number of household members. The point estimate from the linear version
is 0.08; this is the value we will use in subsequent estimations. The nonlinear specification
in Table 1 shows only a very small amount of curvature in the Engel curve for our sample,
a fact we attribute to the homogeneous composition of the sample we selected for analysis.
The next section describes this sample in greater detail.
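The logic of the ‘between’ estimator can be sketched on toy data. The example below is a minimal illustration, not the regression reported in Table 1: it omits the household-size control, uses made-up numbers, and all names are hypothetical. Each household’s income is a permanent level plus a transitory deviation that happens to average out over the two years, while consumption depends only on the permanent level.

```python
def ols(x, y):
    """Intercept and slope from a simple least-squares regression of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xx = sum((a - x_mean) ** 2 for a in x)
    s_xy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope


def between_estimate(consumption, income):
    """Regress household-average consumption on household-average income."""
    ids = sorted(consumption)
    c_bar = [sum(consumption[i]) / len(consumption[i]) for i in ids]
    y_bar = [sum(income[i]) / len(income[i]) for i in ids]
    return ols(y_bar, c_bar)


def pooled_estimate(consumption, income):
    """Naive regression on all household-year observations."""
    ids = sorted(consumption)
    c_all = [c for i in ids for c in consumption[i]]
    y_all = [y for i in ids for y in income[i]]
    return ols(y_all, c_all)


# Hypothetical data: three households, two years each.
permanent = {1: 1000.0, 2: 2000.0, 3: 3000.0}
income = {1: [900.0, 1100.0], 2: [2100.0, 1900.0], 3: [2950.0, 3050.0]}
consumption = {h: [260.0 + 0.08 * permanent[h]] * 2 for h in permanent}

if __name__ == "__main__":
    print("between:", between_estimate(consumption, income))
    print("pooled: ", pooled_estimate(consumption, income))
```

In this toy sample the pooled slope falls below the between slope, which is the attenuation that the time-averaging is meant to reduce.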

10 An additional benefit of this method is that it is free of the specification bias which, as we argued above,
could contaminate the α and β estimated from the covariances under the incorrect informational assumption.








Table 1: The ‘Between’ Estimate of α

Independent Variable          (1)           (2)
Intercept                   259.89        230.35
                           (30.50)       (51.07)
Y                           0.0803        0.0867
                          (0.0026)      (0.0093)
Y²                             -         -2.7E-7
                                        (3.7E-7)
HHSIZE                      168.47        167.42
                            (7.79)        (7.92)
R²                          0.5122        0.5121

Dependent Variable: Real food consumption
Y = Real disposable income
HHSIZE = Number of household members
Data are 1978-84 averages, in 1967 dollars.
Standard errors are in parentheses.


4 Estimation

4.1 The Data

The data we use come from the University of Michigan’s Panel Study of Income Dynamics
(PSID), from survey years 1978 through 1984.11 We use only family-level data throughout,
taking the household, as defined by PSID conventions, as being the appropriate level of
aggregation for the analysis of consumption decisions. Bearing in mind the timing complication
described above, we take the responses from each year’s survey to refer to the previous
year’s values, but allow for consumption decisions to be made with advance information.
For our income variable, we use the sum of the labor incomes of the Head and the Wife
(or ‘Wife’), adjusted for federal income taxes and FICA payroll taxes.12 For consumption,
we use the sum of the household’s expenditures on food at home and in restaurants. We
deflate the income variable with the consumer price index to put it in terms of constant
1967 dollars. Similarly, using the food price component of the consumer price index, we
express food consumption in terms of 1967 dollars.
Rather than using the entire PSID panel of 6,918 families (as of Wave 17), we choose to
focus our analysis on a carefully selected subset of the panel. We first drop a number of
observations which appear to be ‘bad’ data, outliers, or have some other (observable) problem.
Second, in an attempt to avoid the problems involved in modelling non-Permanent-Income
behavior, we omit observations on families which are likely to be constrained in one way or
another.
The data we drop in the first round are the following:
1. Families which report zero labor income for both Head and Wife
2. Families which report zero food expenditure
3. Very wealthy families (inflation adjusted needs ratio greater than 20)
4. Observations with a major assignment to food or income data
5. Families in which both the Head and the Wife were institutionalized, students, or
non-participants in the labor force for any other reason
6. Observations in which the reported labor income was truncated by the number of
digits on the PSID tape13
7. Families which reported a change in real income greater than 50 per cent in absolute
value, relative to the previous year.

11 Technical constraints imposed by the computer software forced us to use only seven years of data (six
sets of differences). However, because households are being continually added to the survey and because our
method requires a balanced sample, including fewer years increases the number of observations available for
estimation.
12 Ideally, we would also want to adjust for state taxes in arriving at a measure of changes in disposable
income. However, because the PSID does not contain any state tax data, this would involve either adjusting
by some representative marginal state tax rate, or combining the PSID information on the household’s state
of residence with statutory tax rate data to estimate each household’s state taxes.
Second, in order to focus on the informational issue discussed above, we wish to concentrate
our analysis on those families who are most likely to behave according to the
Permanent Income Hypothesis. Accordingly, we drop those observations corresponding to
very poor families whose behavior is likely to be liquidity constrained. Specifically, we discard
observations of families whose inflation adjusted needs ratio was less than unity, and
those of families which received food stamps during the sample period.
Finally, on the grounds that estimating a dynamic earnings structure as in Equations
12 and 13 makes little sense for retired people who generally have little or no labor income,
we eliminate retirees from the sample. The result of this series of cuts is a relatively
homogeneous balanced sample of 1,978 observations.

4.2 Estimation Method

In the most general terms, dynamic factor models of consumption, like those described
above, act to place sets of restrictions on the covariance matrix of (the leads and lags of)
Δy and Δc. Therefore, an estimation method which allows us to impose those restrictions
directly on the elements of the covariance matrix is better adapted to fitting these models
than one which imposes restrictions on the ratios of the off-diagonal elements to the diagonal
elements, as does the regression method.

13 For 1978-83, this amount was $99,999; for 1984, it was $999,999.

The Minimum Distance estimation method is very well adapted to estimating a model
with such a structure.14 The idea is to minimize a quadratic criterion function of the form:

\[ \min_{\theta} \; [g(\theta) - z]' \, A \, [g(\theta) - z] \]

where θ is an m-dimensional vector of parameters to be estimated, z is an n-dimensional
vector of unconstrained estimates, and g is the mapping from the constrained parameter
space to the unconstrained parameter space which incorporates the restrictions on the
moments implied by the behavioral model. The minimum distance estimator, θ̂, is the θ
which solves the first-order condition:

\[ Dg(\hat{\theta})' \, A \, [g(\hat{\theta}) - z] = 0. \]

Gauss-Newton iterations can be used to numerically solve this equation and obtain a value
for θ̂.
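The mechanics of the Gauss-Newton iteration can be sketched on a small problem. The code below is an illustration under stated assumptions, not the estimation code used in the paper: it fits three income-process parameters (the variance of ε, the variance of v, and φ) to four autocovariances of Δy, uses an identity weighting matrix rather than the optimal one, approximates the Jacobian Dg by forward differences, and feeds in moments generated exactly from known parameter values. All function names are hypothetical.

```python
def income_moments(theta):
    """Map (var_eps, var_v, phi) to autocovariances of dy at lags 0..3."""
    var_eps, var_v, phi = theta
    h = (phi - 1.0) / (phi + 1.0)
    return [var_eps + 2.0 * var_v / (1.0 + phi),
            h * var_v, phi * h * var_v, phi ** 2 * h * var_v]


def numerical_jacobian(g, theta, step=1e-6):
    """Forward-difference approximation to Dg(theta)."""
    base = g(theta)
    jac = [[0.0] * len(theta) for _ in base]
    for j in range(len(theta)):
        bumped = list(theta)
        bumped[j] += step
        moved = g(bumped)
        for i in range(len(base)):
            jac[i][j] = (moved[i] - base[i]) / step
    return jac


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def minimum_distance(g, z, weight, theta0, iters=50, tol=1e-10):
    """Gauss-Newton iterations on [g(theta) - z]' A [g(theta) - z]."""
    theta = list(theta0)
    n, m = len(z), len(theta0)
    for _ in range(iters):
        resid = [gi - zi for gi, zi in zip(g(theta), z)]
        jac = numerical_jacobian(g, theta)
        a_resid = [sum(weight[i][k] * resid[k] for k in range(n)) for i in range(n)]
        a_jac = [[sum(weight[i][k] * jac[k][j] for k in range(n))
                  for j in range(m)] for i in range(n)]
        grad = [sum(jac[i][j] * a_resid[i] for i in range(n)) for j in range(m)]
        hess = [[sum(jac[i][j] * a_jac[i][l] for i in range(n))
                 for l in range(m)] for j in range(m)]
        delta = solve(hess, [-v for v in grad])  # Gauss-Newton step
        theta = [t + d for t, d in zip(theta, delta)]
        if max(abs(d) for d in delta) < tol:
            break
    return theta


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
z_exact = income_moments([1.0, 3.0, 0.5])
theta_hat = minimum_distance(income_moments, z_exact, identity, [0.9, 2.8, 0.45])

if __name__ == "__main__":
    print(theta_hat)
```

With four moments and three parameters this toy problem has one overidentifying restriction, matching the n − m count used for the χ² tests.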
It can be shown that, under very general conditions, the Minimum Distance estimator
is consistent and asymptotically normal, regardless of the choice of weighting matrix,
A, used in the minimization. If the inverse of the covariance matrix of z is used as the
weighting matrix, then the minimum-distance method delivers the ‘Optimal’ Minimum
Distance estimator, yielding the most efficient (relative to other choices of A) estimate of
θ.15 The method of Maximum Likelihood is analogous to using the inverse of the matrix of
fourth moments implied by the normal distribution for A; for a non-normal distribution of
disturbances, the ML method would therefore utilize a sub-optimal weighting matrix.

14 Other applications of OMD estimation of covariance structures include Abowd and Card (1986), and
Altonji, Martins and Siow (1987).
15 See Chamberlain (1982) and (1984) for a complete presentation of the Minimum Distance method, and
its optimality properties.





Another benefit of using Cov(z)^{-1} as the weighting matrix is that the minimized criterion
function, multiplied by the number of observations, is distributed asymptotically as a
χ², with degrees of freedom equal to the number of restrictions placed on the model, i.e.,
n − m; in Table 3 we use this fact to perform χ² tests of various restrictions. In the results
that follow, we use a feasible version of the Optimal Minimum Distance estimator, in which
the A matrix is replaced by the inverse of the estimated covariance matrix of z, the matrix
of sample fourth moments.
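The χ² tests work by differencing the minimized criterion values of two nested models and referring the difference to a χ² distribution with degrees of freedom equal to the number of additional restrictions. The sketch below is a small standard-library illustration of that calculation; the function name is hypothetical, and the closed-form tail probability used here applies to integer degrees of freedom only. The example inputs are the unweighted statistics 124.21 (nested model) and 126.31 (Perfect Information only) from Table 2.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-y) * sum_{i < df/2} y**i / i!
        return math.exp(-y) * sum(y ** i / math.factorial(i)
                                  for i in range(df // 2))
    # Odd df: erfc term plus a finite correction sum.
    terms = (df - 1) // 2
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(y ** (i - 0.5) / math.gamma(i + 0.5)
                               for i in range(1, terms + 1))
    return tail


if __name__ == "__main__":
    statistic = 126.31 - 124.21  # restricted minus unrestricted criterion
    print(round(statistic, 2), chi2_sf(statistic, 1))
```

For this pair of models the tail probability comes out at roughly 0.15, in line with the corresponding P-value reported in Table 3.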
In our application, the θ vector consists of the parameters of the income process and
the consumption model discussed above. The g function maps these parameters into our z
vector, which is comprised of the unique elements of the sample covariance matrix, e.g.:

\[ \frac{1}{N} \sum_{i=1}^{N} \Delta y^*_{i,t} \Delta y^*_{i,t-s}, \quad \frac{1}{N} \sum_{i=1}^{N} \Delta c^*_{i,t} \Delta y^*_{i,t-s}, \quad \text{and} \quad \frac{1}{N} \sum_{i=1}^{N} \Delta c^*_{i,t} \Delta c^*_{i,t-s}, \quad \text{for } s < 5, \]

where Δc*_{i,t} = Δc_{i,t} − Δc̄_t and Δy*_{i,t} = Δy_{i,t} − Δȳ_t. Even though our models place no
restrictions on the autocovariances of Δc, we include these moments in the z vector, but
impose only the stationarity restrictions on that subvector.

5 The Results

The results we get are mixed. As should be evident from Tables 2 and 3, the data resist
our attempts to impose stationarity. This should not surprise the reader; most studies based
on these data find the same rejection.16 Orthogonality restrictions imposed by our theory are
rejected in the weighted data but not in the unweighted data; the implications of this will
be examined in the next section. The income process we impose is, however, accepted even
at the 10% level, which suggests that our ARMA(1,1) specification for Δy is not obviously
worse than the ARMA(0,2) specification adopted by Hall and Mishkin.17

16 See, for example, Altonji, Martins and Siow (1987).
17 MaCurdy (1982) also concludes that the data are indifferent between the ARMA(1,1) and the
ARMA(0,2) specifications.





The interpretation of our results should therefore be qualified by the fact that the basic
structure we impose on the data to make estimation possible is not entirely supported by
the data. A further problem arises because the relation between p and β in our model is
highly nonlinear and, as a result, our estimation of the nested model yields two sets
of values for p and β which generate the same values of the objective function. To make
matters worse, one set of these values is typically outside the legitimate range. (Both p
and β get negative point estimates, although in general one cannot reject the hypothesis
that they are both zero.) Our emphasis in interpreting the results will therefore be in the
direction of comparing χ² values for different alternative hypotheses rather than looking at
point estimates.
The strongest support for the Imperfect Information view comes from looking at the
weighted data. The numbers in Tables 2 and 3 show that imposing the pure Imperfect
Information restriction (p = 0) on the nested model causes the χ² value to change very
slightly. On the other hand, the Perfect Information restriction (p = 1) is rejected
quite emphatically by the data. The results in the unweighted data are less clear cut. The
pure Imperfect Information restriction has a somewhat lower χ² value than the Perfect
Information hypothesis, but neither set of restrictions on the nested model can be rejected
even at the 10% level. (At the 15% level we can reject the Perfect Information hypothesis,
but we cannot really insist that this is a very meaningful rejection.)
The point estimates reported in Tables 4 and 5 correspond to the non-negative set of
roots. As discussed above, there is some reason to be sceptical of the information value of
these estimates. The one which seems most reasonable is the estimate for γ, which comes
out to be 0.51 in the unweighted case and 0.62 in the weighted case. At least the value from
the weighted model, suggesting 38% advance information, is not inconsistent with Hall and
Mishkin’s claim that there is a lag of one quarter between the income and the consumption
data. Also, the value of γ corresponding to the negative roots is not very different from
these values, which suggests a degree of robustness in the estimate of this parameter.
The estimates of β and p reported in the tables are hard to interpret. When we estimate








Table 2: Goodness-of-Fit Statistics

                                                    χ² Statistics
Model  Restrictions                         k    Unweighted   Weighted
  0    Unrestricted                        78        0.00        0.00
  1    Stationarity Constrained            23      100.00      114.51
  2    (1) Plus Orthogonality              18      106.51      126.95
  3    (2) Plus Income Process             15      111.44      130.52

       Without Rule-of-Thumb Consumption
  4    Nested Perfect & Imperfect Info     12      124.21      140.60
  5    Perfect Information Only            11      126.31      144.54
  6    Imperfect Information Only          11      125.30      141.55

       With Rule-of-Thumb Consumption
  7    Nested Perfect & Imperfect Info     13      122.70      138.41
  8    Perfect Information Only            12      122.88      139.59
  9    Imperfect Information Only          12      123.60      138.53

Table 3: Tests of Alternative Specifications

   Model             Unweighted Data        Weighted Data
 H_a   H_0    DF    χ² Stat   P-Value     χ² Stat   P-Value
  0     1     55    100.00     0.00       114.51     0.00
  1     2      5      6.51     0.26        12.44     0.03
  2     3      3      4.93     0.18         3.57     0.31
  3     4      3     12.77     0.01        10.08     0.02

  4     5      1      2.10     0.15         3.96     0.05
  4     6      1      1.09     0.30         0.95     0.33

  7     8      1      0.18     0.67         1.18     0.28
  7     9      1      0.90     0.34         0.12     0.73

  7     4      1      1.51     0.22         2.19     0.14
  7     5      2      3.61     0.16         6.13     0.05
  7     6      2      2.60     0.27         3.14     0.21

  8     5      1      1.43     0.23         4.95     0.03
  9     6      1      1.70     0.19         3.02     0.08




Table 4: Parameter Estimates, Unweighted Data

         Shock variances
Model      σ²       σ²        φ        β        γ        p        ζ
  3       1.19     0.96     0.31       -        -        -        -
         (0.10)   (0.09)   (0.05)
  4       1.21     0.94     0.33     0.47     0.51     0.69     0.00
         (0.11)   (0.09)   (0.05)   (0.14)   (0.06)   (0.24)
  5       1.15     0.98     0.30     0.53     0.48     1.00     0.00
         (0.09)   (0.08)   (0.05)   (0.11)   (0.06)
  6       1.21     0.94     0.33     0.08     0.52     0.00     0.00
         (0.11)   (0.09)   (0.05)   (0.13)   (0.06)
  7       1.24     0.92     0.34     0.46     0.54     0.85     0.24
         (0.11)   (0.09)   (0.05)   (0.17)   (0.06)   (0.34)   (0.19)
  8       1.24     0.92     0.33     0.47     0.54     1.00     0.29
         (0.11)   (0.09)   (0.05)   (0.16)   (0.06)            (0.14)
  9       1.25     0.92     0.34     0.03     0.56     0.00     0.26
         (0.11)   (0.09)   (0.05)   (0.20)   (0.05)            (0.18)

Standard Errors are in parentheses.





Table 5: Parameter Estimates, Weighted Data

         Shock variances
Model      σ²       σ²        φ        β        γ        p        ζ
  3      15.18    12.85     0.22       -        -        -        -
         (1.14)   (1.05)   (0.04)
  4      14.93    12.89     0.23     0.44     0.62     0.60     0.00
         (1.14)   (1.05)   (0.04)   (0.17)   (0.08)   (0.22)
  5      14.11    13.44     0.20     0.42     0.63     1.0      0.00
         (1.04)   (0.98)   (0.04)   (0.14)   (0.10)
  6      14.89    12.90     0.23     0.06     0.64     0.0      0.00
         (1.13)   (1.04)   (0.04)   (0.13)   (0.18)
  7      15.41    12.57     0.24     0.34     0.66     0.61     0.29
         (1.19)   (1.07)   (0.04)   (0.29)   (0.07)   (0.37)   (0.19)
  8      15.21    12.69     0.24     0.32     0.69     1.0      0.39
         (1.16)   (1.06)   (0.04)   (0.25)   (0.08)            (0.16)
  9      15.42    12.56     0.24    -0.01     0.66     0.0      0.32
         (1.18)   (1.07)   (0.04)   (0.19)   (0.05)            (0.17)

Standard Errors are in parentheses.

p freely, the estimate comes out to be quite high (between 0.6 and 0.7), and it is accompanied
by high β values (not less than 0.4).18 The same kind of β values are generated
if we impose the Perfect Information restrictions. On the other hand, imposing the pure
Imperfect Information restrictions yields a β value of 0.06, which is perfectly consistent
with the interest rates and lifetimes we consider reasonable, and the decline in the fit, as
we saw above, is very slight. The fact that the deterioration in fit is so slight, in apparent
contradiction to the high point estimates of p, is due to the non-linearity of the model,
and the presence of the second root corresponding to a small, negative value of p. Small
changes in the specification or the selection of an alternative subset of the PSID panel may
make this second set of roots positive, in which case a stronger conclusion in favor of our
hypothesis would be warranted.

18 In contrast, with a rate of interest equal to 4% and a horizon of 25 years, the ‘correct’ value of β is
approximately 0.05.





6 Rule-of-Thumb Consumption?

The most unsatisfactory aspect of our current results is the strong violation of the Euler
equation restrictions in the weighted case. In combination with the fact that the Euler
equation restrictions are not rejected in the unweighted case, this suggests that the violation
may come from the behavior of the members of the sample with relatively low incomes.
Intuitively one expects their influence on the results to be greater in the weighted case.
This suggests that the source of the trouble may be liquidity constraints on low income
families. The trouble is, as Zeldes (1985) points out, that there is no simple rule for
predicting how liquidity constrained agents would behave.19
The same problem also arises with the second candidate for an alternative hypothesis,
namely, that at least some agents are not rational and use rules of thumb to decide their
consumption. There is no obvious candidate for such a rule of thumb. Current fashion
favors the so-called Keynesian consumption function, which simply says that Δc = Δy, but
there seems to be no reason to prefer this over Δc = kΔy where k is less than 1. Keynes
himself preferred the latter and called it the ‘fundamental psychological law.’20 And as long
as k is positive, the version with k < 1 is a priori no worse in explaining violations of the
Euler equation.
In a future extension of this paper we will consider the question of the best specification
for rule-of-thumb behavior. In this paper we limit ourselves to examining what happens if
we add to our model the ‘Keynesian’ alternative mentioned above. Our motive for doing
so is for comparability with other studies like Hall and Mishkin and Campbell and Mankiw
(1988) which make this assumption.
The method we use for incorporating ‘Keynesian’ behavior follows Hall and Mishkin
(1982). We assume that a fraction ζ of consumption is determined by the rule of thumb,
Δc = Δy, and the rest is determined by the permanent income model, as above. We
therefore write:

\[ \Delta c = \zeta \Delta y + (1 - \zeta) \Delta c^p \]

where c^p is the consumption predicted by the permanent income model introduced above.
This method of incorporating rule-of-thumb consumption suffers from the defect that it
actually implies that all agents sometimes follow a rule of thumb and sometimes follow the
permanent income model. By contrast, what we actually want is to be able to model the
fact that some agents follow one of these models most of the time and that others follow
the other model most of the time. As a result, Hall and Mishkin’s interpretation of our ζ
parameter as the fraction of rule-of-thumb consumers is not strictly correct.

19 See Hall (1987) and Hayashi (1987) for surveys of the evidence on liquidity constraints.
20 See Keynes (1936).
We follow Hall and Mishkin in using the moments of consumption with lagged income
to identify ζ. Our results again depend on which data set we use. With both sets we find
that the restriction that ζ = 0 cannot be rejected at the 5% level. The same is true if we
compare the pure Imperfect Information model (ζ = 0, p = 0) to the unrestricted model.
The restriction of Perfect Information (ζ = 0, p = 1) is rejected in the weighted data. If we
allow ζ to be freely estimated, once again one cannot reject the restriction of pure Imperfect
Information, but now one cannot reject Perfect Information either, and in the unweighted
data it actually performs slightly better. To a limited extent therefore, the introduction of
rule-of-thumb consumption does make the Hall-Mishkin assumption perform better relative
to the Imperfect Information hypothesis.
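The role of lagged income in identifying ζ can be made concrete with a two-line calculation. The sketch below is illustrative and assumes that the permanent-income component of Δc_t is orthogonal to Δy_{t-1}, so that in the mixed rule Δc = ζΔy + (1 − ζ)Δc^p only the rule-of-thumb part contributes to E(Δc_t Δy_{t-1}). It also assumes the AR(1) transitory income process, for which E(Δy_t Δy_{t-1}) = (φ − 1)/(φ + 1) times the variance of v. The function names are hypothetical.

```python
def mixed_cov_with_lagged_income(zeta, phi, var_v):
    """E(dc_t dy_{t-1}) under the mixed rule: zeta * E(dy_t dy_{t-1})."""
    return zeta * (phi - 1.0) / (phi + 1.0) * var_v


def implied_zeta(cov_dc_lagged_dy, cov_dy_lagged_dy):
    """Rule-of-thumb share implied by the two lagged-income moments."""
    return cov_dc_lagged_dy / cov_dy_lagged_dy


if __name__ == "__main__":
    phi, var_v = 0.3, 1.0
    cov_dy = (phi - 1.0) / (phi + 1.0) * var_v
    cov_dc = mixed_cov_with_lagged_income(0.29, phi, var_v)
    print(implied_zeta(cov_dc, cov_dy))
```

A value of ζ equal to zero therefore corresponds to the orthogonality restriction holding exactly, which is why tests of ζ = 0 and tests of orthogonality are closely linked.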
However, the conclusion of Hall and Mishkin that the inclusion of rule-of-thumb consumers
is enough to generate reasonable point estimates is not confirmed by our study. In
the Hall-Mishkin case (p = 1, unweighted data) the estimate of ζ we get, 0.29, is not so
different from the estimate they report, 0.2, but the β we get, 0.47, is much larger than their
estimate of 0.17 and cannot be reconciled with rational behavior. The only case where we
obtain an estimate of β consistent with the theory is in the unrestricted case with weighted
data. In this case, we get an estimate of 0.33 with a standard error of 0.28, but the finding
that this is not inconsistent with the theory is mainly driven by the unusually large standard
error. In all other cases the estimate of β is not really changed by the inclusion of
rule-of-thumb consumers. The estimate of p is also surprisingly insensitive to the inclusion of
rule-of-thumb consumers.

7 Conclusions

The most general conclusion to be drawn from this paper is that alternative specifications
of the information set used in inferring the ‘surprise’ movements of a variable can make a
substantive difference in the estimation of rational expectations models.
In the context of consumption, our specific result is that an alternative specification of
consumers’ information sets, which endows them with ‘less’ information than is customary,
performs at least as well (in terms of fit) as the stronger Perfect Information specification.
In addition, this Imperfect Information assumption is able to resolve the excess sensitivity
puzzle found in other studies. In particular, when we impose the weaker informational
restrictions, we obtain a point estimate of the sensitivity parameter which is justifiable
in the presence of plausible interest rates and horizons, although a large standard error
makes it difficult to do precise inference. In this sense, our results can be interpreted as
being favorable to the Permanent Income Hypothesis, although some kind of rule-of-thumb
behavior appears to be marginally relevant.
There are a number of reasons for caution in interpreting our results. First, the power of
the covariances of food consumption with income to distinguish these competing hypotheses
is very low. While the Imperfect Information versions appear to perform slightly better
than the Perfect Information versions, the improvement in fit is marginal. It is therefore
entirely plausible that the Perfect Information model is really the appropriate specification
of consumer behavior, but that consumers simply respond with excess vigor to income
shocks; there is not enough information in our data to reject one hypothesis in favor of the
other. It may, however, be possible to extend our analysis to include additional ‘indicator’
variables, such as hours worked, asset income, or saving, which would yield information on
households’ expectations of future income, and by so doing, improve our ability to discern
one informational hypothesis from the other. This remains a topic for future work.





A second caveat arises from the observation that the $\chi^2$ statistics indicate that all of
the consumption models we try fit the data rather poorly. Even the stationarity restrictions
we impose at the outset fail spectacularly. While conclusions drawn from these misspecified
models must be treated with caution, such models can serve as useful approximations for
divining the structure in the data.
Finally, the problem of measurement error is one we were unable to properly address;
although we made an attempt to discard outliers and data points to which major assignments
were made, a proper measurement-error correction would require more degrees of
freedom than we have at our disposal, using only income and consumption data.
Despite these caveats, we believe that the results presented here are a first step towards
resolving some of the outstanding questions in the study of household-level consumer
behavior.





A Appendix: The Nested Model

The following set of equations defines the mapping, y, from the parameters of the structural
model of consumption and income to the covariances of $\Delta c$ with $\Delta y$, and the autocovariances
of $\Delta y$.
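If lifetime income is a random walk, while transitory income is described by a stationary
AR(1), then the autocovariances of $\Delta y$ are:

$$
\begin{aligned}
E(\Delta y_t^2) &= \frac{2}{\phi+1}\,\sigma_\eta^2 + \sigma_\varepsilon^2 \\
E(\Delta y_t \Delta y_{t-1}) &= \frac{\phi-1}{\phi+1}\,\sigma_\eta^2 \\
E(\Delta y_t \Delta y_{t-2}) &= \phi\,\frac{\phi-1}{\phi+1}\,\sigma_\eta^2 \\
E(\Delta y_t \Delta y_{t-k}) &= \phi^{k-1}\,\frac{\phi-1}{\phi+1}\,\sigma_\eta^2
\end{aligned}
$$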
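As a check on formulas of this kind, the autocovariances of $\Delta y$ implied by a random-walk-plus-AR(1) income process can be computed in closed form and compared against a direct moving-average calculation. The sketch below is a minimal illustration, not the authors' code. The function names and the symbols `phi` (AR(1) coefficient of transitory income), `var_eta` (variance of the transitory innovation), and `var_eps` (variance of the lifetime-income innovation) are assumptions introduced here, as is the independence of the two innovations.

```python
def delta_y_autocov(k, phi, var_eta, var_eps):
    """Closed-form E(dy_t * dy_{t-k}) for y = random walk + stationary AR(1)."""
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly inside (-1, 1)")
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        # Variance: differenced AR(1) part plus the random-walk innovation.
        return 2.0 * var_eta / (1.0 + phi) + var_eps
    # Only the differenced transitory component is serially correlated.
    return phi ** (k - 1) * (phi - 1.0) / (phi + 1.0) * var_eta


def delta_y_autocov_ma(k, phi, var_eta, var_eps, terms=2000):
    """Same moment from a truncated MA(infinity) form of the differenced AR(1)."""
    # The differenced transitory component has MA weights
    # psi_0 = 1 and psi_j = phi**j - phi**(j - 1) for j >= 1.
    psi = [1.0] + [phi ** j - phi ** (j - 1) for j in range(1, terms)]
    cov = var_eta * sum(psi[j] * psi[j + k] for j in range(terms - k))
    if k == 0:
        cov += var_eps  # the random-walk innovation enters only the variance
    return cov
```

For example, `delta_y_autocov(1, 0.5, 1.0, 0.25)` and `delta_y_autocov_ma(1, 0.5, 1.0, 0.25)` should agree up to truncation error, since both describe the same first-order autocovariance.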

If we define:

k

=

b

=

-1 - H P + i)f| + (i -

+ »)’g + i

then the restrictions placed on the covariances of $\Delta c$ and $\Delta y$ can be written:

$$E(\Delta c_t \Delta y_{t-k}) = 0 \quad \text{for } k > 1$$

$E(\Delta c_t \Delta y_t) =$

cry k ( l - p ) (a,

2 | 2 + 6 - ^ 2^1 , „ („2 ,
+ P
+ Z ^ 2)]
1+ #


E ( A c tA y t + 1 )

c,k(l - „) [(1 - 7) (al + 2 ± ^ < r ’) -

+

<*P

[(1 - 7) (of +

fi°£ )

+ 7/5(1 - </>)<r2]

E ( A c tA y t+2)

—a (7 0 + ( 1 - 7 )) k ( l - p ) ^ - k + P0 (l-<l>)
l + <f>b

E ( A c tA y t+k)

-a<j>k 2 (7 </»+ ( 1 — 7 ))





al for k > 3.

References
Abowd, J.M. and D. Card (1986), “O n the Covariance Structure of Earnings and Hours
Changes”, N B E R Working Paper #1832.
Altonji, J.G. and A. Siow (1987), “Testing the Response of Consumption to Income
Changes with (Noisy) Panel Data”, The Quarterly Journal o f E co n o m ics 1 0 2 ,
293-328.
Altonji, J.G., A.P. Martins and A. Siow (1987), “Dynamic Factor Models of Consump­
tion, Hours, and Income”, N B E R Working Paper #2155.
Altonji, J.G., A.P. Martins and A. Siow (1988), “Using Cross-Equation Restrictions
between Asset Income and Non-Asset Income to Estimate the Permanent Income
Hypothesis”, Northwestern University.
Campbell, J.Y. and N.G. Mankiw (1987), “Permanent Income, Current Income, and
Consumption”, N B E R Working Paper #2436.
Chamberlain, G. (1982), “Multivariate Regression Models for Panel Data”, Journal o f
E co n o m etrics 18, 5-46.
Chamberlain, G., 1984, “Panel Data”, in Griliches and Intriligator, eds: Handbook o f
E co n o m etrics Volume II, Amsterdam: North-Holland.
Flavin, M.A. (1981), “The Adjustment of Consumption to Changing Expectations about
Future Income”, Journal o f Political E c o n o m y 89, 974-1009.
Friedman, M., 1957, A T heory o f the C onsum ption Function. Princeton, NJ: Princeton
University Press.
Hall, R.E. (1978), “Stochastic Implications of the Life Cycle-Permanent Income H y ­
pothesis: Theory and Evidence”, Journal o f Political E c o n o m y 8 6 ,971-987.
Hall, R.E. and F. Mishkin (1982), “The Sensitivity of Consumption to Transitory
Income: Estimates from Panel Data on Households”, E conom etrica 50, 461-481.
Hall, R.E. (1987), “Consumption”, N B E R Working Paper #2265.
Hayashi, F., 1987, “Tests for Liquidity Constraints: A Critical Survey”,in Bewley, ed.:
A d va n ces in E con om etrics, Fifth W orld C ongress Volume II, Cambridge: C a m ­
bridge University Press.




38

Keynes, J.M., 1936, The General Theory o f E m ploym en t, In terest, and M o n e y . London:
Macmillan.
Lucas, R.E. (1972), “Expectations and the Neutrality of Money”, Journal o f E con om ic
T heory 4, 103-124.
MaCurdy, T.E. (1982), “The Use of Time Series Processes to Model the Error Structure
of Earnings in a Longitudinal Data Analysis”,Journal o f E con om etrics 18, 83-114.
Mork, K.A. and V.K. Smith (1986), “Testing the Life-Cycle Hypothesis on Panel Data
Using Detailed Consumption Diaries and Income Based on Tax Records”,Vander­
bilt University.
Muth, J.F. (1960), “Optimal Properties of Exponentially Weighted Forecasts”,Journal
o f the A m erica n Statistical A ssocia tion 55, 299-306.
Ritter, J.A. (1988), “Endogenous Borrowing Constraints and Consumption”,University
of Texas.
Zeldes, S. (1985), “Consumption and Liquidity Constraints: A n Empirical Investiga­
tion”,Wharton School, University of Pennsylvania.




39