A Series of Occasional Papers in Draft Form Prepared by Members of the Research Department for Review and Comment

SM-88-

IMPERFECT INFORMATION AND THE PERMANENT INCOME HYPOTHESIS

Abhijit V. Banerjee and Kenneth N. Kuttner^1

August 4, 1988

^1 The authors are grateful to Gary Chamberlain, Benjamin Friedman, Zvi Griliches, Greg Mankiw, Knut Mork, Philippe Weil, and to participants in the NBER Consumption Group for comments; and to Iain Cockburn, Cheri Minton, and Andy Mitrusi for valuable technical assistance.

Abstract

The purpose of this paper is to explore the nature of the information set used by consumers in making their consumption decisions. Specifically, it re-examines the evidence for the Permanent Income view of consumption under the assumption that consumers may not always be able to distinguish transitory income shocks from permanent shocks. For these 'indistinguishable' shocks, we assume that consumers use an optimal linear forecast to calculate the annuity value of the shock. Because this implies that the consumer treats some portion of each temporary shock as if it were permanent, the resulting response of consumption would appear, in the 'distinguishable shocks' context, to be excessive. This hypothesis offers an explanation for the excess sensitivity puzzle reported by earlier econometric studies of consumption.

The contribution of this paper lies in its attempt to estimate a parameter describing the 'amount' of information utilized by consumers, i.e., the degree to which income shocks can be discerned as either transitory or permanent. This methodology is also relevant for the evaluation of business-cycle theories which rely on agents' confusion between kinds of shocks to generate output fluctuations.
The first section of the paper discusses a simple example, demonstrating that incorrectly attributing to consumers the ability to distinguish transitory from lifetime shocks can lead to erroneous conclusions, such as the spurious finding of excess sensitivity. The next section proposes a general model, incorporating both distinguishable and indistinguishable shocks, and discusses its identification. The following section covers the data issues and the minimum-distance estimation method. The final two sections summarize our empirical findings, and draw some general conclusions about the role of information in modeling consumer behavior. The empirical results we present are inconclusive, but do appear to offer some weak evidence against the Perfect Information view embodied in much of the literature.

Keywords: Permanent Income Hypothesis, dynamic factor models, minimum-distance estimation.

1 Introduction

The responsiveness of consumer spending to changes in income is of vital importance for policy questions, particularly those relating to demand management. Empirically assessing this responsiveness is also important in a larger sense, in that one can ask to what extent it is 'rational,' or consistent with the efficient exploitation of all available information at the disposal of consumers.

Studies which test the Permanent Income Hypothesis have usually distinguished between two aspects of this responsiveness.
The first is the responsiveness of consumption to anticipated changes in income; this, according to the rational expectations version of the Permanent Income Hypothesis, should always equal zero.^1 The second concerns the magnitude of the response of consumption to income 'surprises'; studies examining this second angle address themselves to the question of whether the sensitivity of consumption to these income innovations is appropriate, given plausible values of the prevailing interest rate. Empirical work attempting to assess this sensitivity tends to find a level of sensitivity far in excess of what can be justified by Permanent Income theory.^2

^1 Hall (1978) was the first to emphasize this orthogonality property. Flavin (1981) re-interprets the orthogonality property in terms of the response to anticipated income shocks.
^2 Hall and Mishkin (1982) is the first study to uncover this excess sensitivity phenomenon.

Our purpose in this paper is to reexamine both of these issues, with particular emphasis on the second, 'excess sensitivity' issue, taking into account alternative assumptions regarding the consumer's information set as it affects the measurement of the key sensitivity parameter.

The issue of how much information to attribute to agents in a rational expectations model is a thorny one; consumers are assumed to respond rationally to all available information, but what to include in that information set is usually unclear, and left unspecified by the theory. Nor is it usually apparent how to determine from the data the answer to this question. Yet the correct choice is essential in empirical work; if we model behavior as the response to unanticipated 'surprises,' the econometric specification of that surprise will certainly affect our interpretation of the response. This issue is particularly germane to empirical work in consumption.
Those studies which purport to estimate the response of consumption to income innovations have so far paid scant attention to this problem and its implications. The original work by Hall and Mishkin in this area, for example, and its successors^3 assert that consumers possess enough information to discern two distinct kinds of shocks to their income: high persistence (lifetime), and low persistence (transitory). While it is easy to think of a few examples of unmistakably transitory income shocks (e.g., lottery payoffs, temporary tax surcharges), and a few shocks with an identifiably longer persistence (e.g., the Tax Reform Act of 1986), the majority of changes to households' incomes would seem very difficult to classify as either lifetime or transitory. Changes in real income due to movements in the price level, indirect taxes, and 'temporary' layoffs are good examples of what could be called 'indistinguishable' shocks.^4

One of the goals of this paper is to examine explicitly the nature of the information set available to consumers when they plan their consumption expenditures. In particular, it seeks a way of empirically discerning what part of (the variance of) consumption is due to 'distinguishable' shocks, as opposed to 'indistinguishable' shocks.^5 The answer to this question has important implications for policy, and for a proper appraisal of the Permanent Income Hypothesis. One of the implications of our work, for example, is that it can provide an explanation for the excess sensitivity puzzle reported by Hall and Mishkin.
Specifically, we show that estimating the permanent income consumption model under the Perfect Information assumption of Hall and Mishkin, when the Imperfect Information model is true, will deliver an estimate of the sensitivity parameter (representing the annuity value of a transitory income shock) which is inconsistent and biased upwards, spuriously indicating excess sensitivity.

^3 Other work incorporating the same kinds of informational assumptions includes papers by Mork and Smith (1986), and by Altonji, Martins and Siow (1988).
^4 Even income movements which are the direct result of announced 'permanent' or 'temporary' policy measures might be subject to problems of dynamic inconsistency, and therefore be placed, to some degree, in the 'indistinguishable' class. After all, who knows what a Democrat in the White House might do with the Tax Reform Act of 1986?
^5 Of course, a finding in favor of indistinguishable shocks need not imply that no distinction can be made regarding the persistence of shocks, simply that most of the observed variation in income is of the indistinguishable type.

Moreover, if consumers are uninformed about the sources of income shocks, policy measures that affect income will produce changes in consumption whose dynamics are quite different from those we would see if consumers could actually discern the nature of their income shocks, and could use this information to divine their true persistence. In particular, evidence in favor of the 'indistinguishability' hypothesis could be interpreted as indirect evidence in favor of the sort of price-level versus relative-price confusion which is key to much of modern business cycle theory, as in Lucas (1972).

In order to examine these issues, we construct several models of consumption which include features of the Rational Expectations - Permanent Income Hypothesis.
Alternative specifications we consider include the 'Perfect Information' version, which embodies the distinguishability hypothesis implicit in Hall and Mishkin, and an alternative 'Imperfect Information' hypothesis, which drops the distinguishability assumption. We also consider nested specifications which include both of these models as special cases, and include, in the tradition of earlier studies, a portion of 'rule-of-thumb' consumers, whose consumption tracks income one-for-one.

We use an Optimal Minimum Distance (OMD) technique to fit these models on family-level food expenditure and labor income data from a subset of the Panel Study of Income Dynamics (PSID), for the 1978-1984 survey years. This OMD method has been shown to have advantages over maximum-likelihood methods in the presence of non-normal disturbances and conditional heteroskedasticity. We estimate the model both using unweighted data, and using data weighted by each household's mean income level to correct for heteroskedasticity.

Our empirical results confirm the Hall-Mishkin finding of excess sensitivity, but only when we impose the Perfect Information restrictions. In contrast, when we impose the restrictions implied by the Imperfect Information hypothesis, this excess sensitivity result vanishes; consumption appears to respond more or less as predicted by the Permanent Income Hypothesis, although large standard errors make a precise assessment of this sensitivity difficult.

Actually nesting the Perfect and Imperfect Information specifications proves to be somewhat problematic, however. Although the respective hypotheses imply distinct covariance patterns between changes in consumption and subsequent changes in income, the differences are subtle, and the data are not particularly sympathetic to either specification. The result is that the two key parameters, $\beta$ (the sensitivity of consumption to income innovations) and the proportion of 'perfect information' consumption, are not well identified separately.

The orthogonality restrictions implied by the Permanent Income Hypothesis are accepted in the unweighted data, but rejected in the weighted data. Including a fraction of consumption attributed to rule-of-thumb consumers further muddies the empirical waters, as the rule-of-thumb component of consumption can also account for patterns of covariance between changes in income and lagged changes in consumption which are similar to those implied by both types of Permanent Income consumption.

The general picture to emerge from these results is that the data are slightly more consistent with Imperfect Information versions of the Permanent Income model than with Perfect Information versions. While the data do not appear to be rich enough to enable us to discriminate effectively between the two hypotheses, the Imperfect Information version, at least, yields an estimate of the sensitivity parameter which is more consistent with the implications of the Permanent Income Hypothesis than the one delivered by the Perfect Information version. None of the parametric models of consumption we try, including those which include rule-of-thumb behavior, appears to fit the data very well, however, as indicated by their $\chi^2$ statistics.

2 An Illustrative Model

A simple model of life-cycle consumption will illustrate the consequences of inappropriately specifying the consumer's information set by attributing to the consumer knowledge about the source of each shock. Specifically, we show how estimating the sensitivity parameter, $\beta$, from the moment restrictions implied by the Perfect Information model (when the Imperfect Information model is true) will deliver an estimate of the sensitivity parameter which is inconsistent and biased upwards.
For the sake of illustration, we will discuss the case with a zero rate of time preference, a zero interest rate, and serially uncorrelated transitory income shocks. In Section 3, we will drop these assumptions, and cover a somewhat more realistic case with a constant interest rate (assumed equal to the rate of time preference), and serially correlated transitory income shocks.

As in the Hall and Mishkin paper, there are two kinds of shocks to consumers' income: lifetime and transitory. Lifetime income shocks are assumed to exhibit infinite persistence, while the transitory shocks decay over time. In other words, a lifetime income shock permanently alters an individual's earnings prospects, while a transitory income shock reflects temporary 'blips' to earnings. The simplest stochastic specification of such a latent variable process is to model the lifetime income shocks as innovations in a random walk process, while the transitory component is simply white noise:

$$x_t = x_{t-1} + \epsilon_t \tag{1}$$

$$y_t = x_t + \eta_t \tag{2}$$

Here, $x$ is lifetime income, $\epsilon$ the shocks to lifetime income, $y$ observed income, and $\eta$ the transitory income shocks. The two shocks are assumed to be serially uncorrelated (for the time being), and uncorrelated with one another.

Obviously, good permanent-income consumers would, if they were able to discern the two kinds of shocks, consume the full amount of the lifetime income shock. On the other hand, rather than consume the full amount of the transitory income shock, rational consumers would clearly want to consume only the annuity value of the amount of the shock (at some appropriate interest rate), thereby spreading the windfall over the duration of their lifetimes.

The assumption that these shocks are distinguishable to the consumer is, as we have argued, inappropriate for many of the income changes we observe.
The question we intend to explore is what the consequences would be of attributing to consumers more information on the nature of these shocks than they actually have, and estimating the consumption model as if consumers could separately discern the two components.^6

In order to say something specific about the joint behavior of consumption and income, we need to specify a model of permanent-income consumption. For the purposes of this example, we will make use of the simplest model imaginable. We assume throughout that consumers maximize an additively separable, quadratic utility function in discrete time, in which the consumer knows his lifetime with certainty. For the time being, we also take both the rate of time preference and the rate of interest to be equal to zero. The maximization problem is therefore:

$$\max \; E_t \sum_{i=0}^{T-1} u(c_{t+i}),$$

where $u(c) = d_0 + d_1 c + d_2 c^2$, subject to the budget constraint:

$$\sum_{i=0}^{T-1} c_{t+i} = W_t,$$

where $W_t$ is the sum of current assets plus the present value of all future income. The budget constraint is assumed to hold ex post.^7 With a zero interest rate, the following consumption rule solves the above maximization problem:

$$c_t = \frac{1}{T}\left(A_t + y_t + \sum_{i=1}^{T-1} E_t y_{t+i}\right),$$

where $A_t$ is the value of the consumer's assets at the beginning of period $t$, and the expression within parentheses is the expected value of lifetime wealth.

^6 In the vocabulary of dynamic factor models, of which this is an example, the issue is whether a one-factor model is more appropriate than a two-factor version.
^7 For the purposes of this model, we overlook the complications introduced by allowing the budget constraint to hold in an expected value sense, enabling the consumer to die with negative net worth. See Hitter (1988) for a discussion of this issue.

First differencing and using the law of motion for $A_t$, $A_t = A_{t-1} + y_t - c_t$, yields an expression for $\Delta c$ which, because it is a function entirely of the revision in the consumer's expectations about future income between period $t$ and period $t+1$, embodies the random walk principle of Hall (1978) so long as those expectations are formed rationally:

$$\Delta c_{t+1} = \frac{1}{T}\left(\sum_{i=1}^{T} E_{t+1} y_{t+i} - \sum_{i=1}^{T} E_t y_{t+i}\right).$$

Because lifetime income is a random walk and transitory income is serially uncorrelated, the consumer's best forecast for income $j$ periods hence is exactly the same as his one-step-ahead forecast of income: $E_t y_{t+j} = E_t y_{t+1}$. Using this fact in the expression for $\Delta c$ yields a simplified expression in terms of revisions in expectations:

$$\Delta c_{t+1} = \frac{1}{T}\left(y_{t+1} + (T-1)E_{t+1} y_{t+2} - T E_t y_{t+1}\right). \tag{3}$$

This is the point at which the assumption about the nature of the information set available to the consumer becomes crucial. If the consumer can, in fact, distinguish the two shocks (thereby observing his own lifetime income, $x$), $E_{t+1} y_{t+2}$ is simply equal to $x_{t+1}$, and $E_t y_{t+1}$ is just $x_t$. In this Perfect Information case, the expression for $\Delta c_{t+1}$ simplifies to:

$$\Delta c_{t+1} = \epsilon_{t+1} + \frac{1}{T}\eta_{t+1}. \tag{4}$$

These forecasts are clearly infeasible for consumers who are unable to distinguish one kind of shock from the other. In a sense, an uninformed consumer is suffering from an errors-in-variables problem similar to that experienced by econometricians trying to estimate permanent income models. Because he is unable to use the unobservable lifetime income to forecast his future income, the consumer must come up with a forecast based only on those elements in his information set. While each individual clearly has a large amount of idiosyncratic information on which he can base his forecast of future income (e.g., education, promotion prospects, etc.), we will model his prediction problem as if he had only the information in his earnings history to go on.
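For reference, the substitution that takes Equation 3 to Equation 4 in the Perfect Information case can be written out step by step. The sketch below uses only Equations 1 and 2 and the two forecasts $E_{t+1} y_{t+2} = x_{t+1}$ and $E_t y_{t+1} = x_t$ stated above; it adds no new assumptions.

```latex
\begin{aligned}
\Delta c_{t+1}
  &= \frac{1}{T}\bigl( y_{t+1} + (T-1)\,x_{t+1} - T\,x_t \bigr)
     && \text{(forecasts substituted into Eq. 3)} \\
  &= \frac{1}{T}\bigl( x_{t+1} + \eta_{t+1} + (T-1)\,x_{t+1} - T\,x_t \bigr)
     && \text{(Eq. 2: } y_{t+1} = x_{t+1} + \eta_{t+1}\text{)} \\
  &= \frac{1}{T}\bigl( T\,(x_{t+1} - x_t) + \eta_{t+1} \bigr) \\
  &= \epsilon_{t+1} + \frac{1}{T}\,\eta_{t+1}
     && \text{(Eq. 1: } x_{t+1} - x_t = \epsilon_{t+1}\text{)}
\end{aligned}
```

The lifetime shock passes through one-for-one, while the transitory shock is spread evenly over the $T$ periods of the horizon.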
The construction of an optimal forecast rule for this restricted information set is simplified by the observation that the second moments of a latent variable time series process such as Equations 1 and 2 are equal to the second moments of an alternative ARMA process. In other words, to someone who could not discern the underlying latent variables, $\Delta y$ would 'look' just like an ARMA process. An uninformed consumer, who could not separately discern the latent variables in Equations 1 and 2 could, therefore, construct an optimal linear forecast of his earnings based on this corresponding ARMA process.

In our case, where lifetime income follows a random walk and transitory income is serially uncorrelated, $\Delta y$ can be written as:

$$\Delta y_t = \epsilon_t + \eta_t - \eta_{t-1},$$

with autocovariances:

$$E(\Delta y_t^2) = \sigma_\epsilon^2 + 2\sigma_\eta^2$$

$$E(\Delta y_t \Delta y_{t-1}) = -\sigma_\eta^2$$

$$E(\Delta y_t \Delta y_{t-k}) = 0 \quad \text{for } k \geq 2.$$

Because the autocovariances of $\Delta y$ are zero beyond the first, one can find some MA(1) process which will generate exactly the same set of autocovariances as those generated by our latent variable model. If $b$ is the moving-average parameter of the corresponding MA(1) process, then $\Delta y$ can be written as:

$$\Delta y_t = (1 + bL)e_t,$$

with autocovariances:

$$E(\Delta y_t^2) = (1 + b^2)\sigma_e^2$$

$$E(\Delta y_t \Delta y_{t-1}) = b\sigma_e^2$$

$$E(\Delta y_t \Delta y_{t-k}) = 0 \quad \text{for } k > 1.$$
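Matching the two sets of autocovariances above pins down $b$ and $\sigma_e^2$. The following is a minimal numerical sketch of that moment-matching step, not the authors' code: the function name `ma1_from_latent` and the example variances are hypothetical, and the invertible root ($|b| < 1$) of the implied quadratic is assumed to be the relevant one.

```python
import math


def ma1_from_latent(var_eps: float, var_eta: float) -> tuple[float, float]:
    """Return (b, var_e) for the MA(1) in Delta y that reproduces the
    autocovariances of a random walk plus white noise.

    var_eps: variance of the lifetime (random-walk) innovation
    var_eta: variance of the transitory (white-noise) shock, must be > 0
    """
    if var_eta <= 0 or var_eps < 0:
        raise ValueError("need var_eta > 0 and var_eps >= 0")
    rho = var_eps / var_eta
    # Dividing the variance condition by the first-autocovariance condition
    # gives b**2 + (rho + 2) * b + 1 = 0; keep the root with -1 <= b < 0.
    b = -(1.0 + rho / 2.0) + math.sqrt(rho + rho**2 / 4.0)
    var_e = -var_eta / b  # from b * var_e = -var_eta
    return b, var_e


# Hypothetical variances, chosen only for illustration.
var_eps, var_eta = 1.0, 2.0
b, var_e = ma1_from_latent(var_eps, var_eta)

# (variance, first autocovariance) of Delta y under each representation.
latent_autocov = (var_eps + 2.0 * var_eta, -var_eta)
ma1_autocov = ((1.0 + b**2) * var_e, b * var_e)
print(b, var_e, latent_autocov, ma1_autocov)
```

A larger transitory variance pushes $b$ toward $-1$ (income changes are mostly reversed), while a larger lifetime variance pushes $b$ toward $0$ (income changes are mostly permanent).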
Equating the two sets of autocovariances and solving for $b$ as a function of $\sigma_\epsilon^2$ and $\sigma_\eta^2$ yields the following expression for $b$ (taking the invertible root, $|b| < 1$):^8

$$b = -\left(1 + \frac{\sigma_\epsilon^2}{2\sigma_\eta^2}\right) + \sqrt{\frac{\sigma_\epsilon^2}{\sigma_\eta^2} + \frac{\sigma_\epsilon^4}{4\sigma_\eta^4}}.$$

^8 This is identical to the expression derived by Muth (1960) by explicitly minimizing the mean-square error of a linear forecast of a random walk with measurement error.

Using the standard forecast rule for MA(1) processes,

$$E_t y_{t+1} = (1 + b)\sum_{i=0}^{\infty} (-b)^i y_{t-i},$$

to substitute for the expectations in Equation 3 yields an error-learning equation for $\Delta c$:

$$\Delta c_{t+1} = \left(1 + b - \frac{b}{T}\right)\left(y_{t+1} - E_t y_{t+1}\right), \tag{5}$$

where the MA forecast error, in terms of the latent variables, is:

$$y_{t+1} - E_t y_{t+1} = e_{t+1} = \sum_{i=0}^{\infty} (-b)^i \epsilon_{t+1-i} + \eta_{t+1} - (1 + b)\sum_{i=0}^{\infty} (-b)^i \eta_{t-i}.$$

While obviously not orthogonal to lagged $\epsilon$ or $\eta$ individually, this forecast error (and the corresponding change in consumption) will be uncorrelated with all elements in the uninformed consumer's information set: that is, all lagged changes in his observed income. This orthogonality condition places a testable restriction on the covariance matrix between $\Delta c$ and lagged $\Delta y$:

$$E(\Delta c_t \Delta y_{t-k}) = 0 \quad \text{for } k \geq 1.$$

In the Imperfect Information case, multiplying the error-learning equation for $\Delta c$ (Equation 5) by (leads of) the expression for $\Delta y$ and taking expectations yields the other restrictions on the elements of the covariance matrix:

$$E(\Delta c_t \Delta y_t) = \left(1 + b - \frac{b}{T}\right)\left(\sigma_\epsilon^2 + (2 + b)\sigma_\eta^2\right) \tag{6}$$

$$E(\Delta c_t \Delta y_{t+1}) = -\left(1 + b - \frac{b}{T}\right)\sigma_\eta^2. \tag{7}$$

On the other hand, multiplying the Perfect Information consumption rule in Equation 4 by $\Delta y$ and taking expectations yields a different set of covariance restrictions:

$$E(\Delta c_t \Delta y_t) = \sigma_\epsilon^2 + \frac{1}{T}\sigma_\eta^2 \tag{8}$$

$$E(\Delta c_t \Delta y_{t+1}) = -\frac{1}{T}\sigma_\eta^2. \tag{9}$$

Regardless of the assumptions made about the consumer's information set, the autocovariances of the income process are:

$$E(\Delta y_t^2) = \sigma_\epsilon^2 + 2\sigma_\eta^2 \tag{10}$$

$$E(\Delta y_t \Delta y_{t-1}) = -\sigma_\eta^2. \tag{11}$$

Under either assumption about the nature of the consumer's information, these four equations can be used to identify, from the estimated covariances of $\Delta c$ and $\Delta y$, the parameters of the income process and the consumption model. The task is to show how changing the informational assumption alters the mapping from the structural parameters of the consumption model to the moments of the joint distribution of $\Delta c$ and $\Delta y$ in such a way as to lead to biased and inconsistent estimates of the parameters of the consumption model.

Our concern here is the responsiveness of consumption to income innovations, which we parameterize with $\beta$. In the context of the Perfect Information model, $\beta$ can be thought of as the annuity value of a transitory shock, which reflects a combination of the interest rate used to discount future income, the length of the consumer's planning horizon, and the persistence of transitory income shocks. In this illustrative model, $\beta$ is a relatively uninteresting quantity, simply equal to $1/T$. (In the extended model, $\beta$ will depend not only on the length of the horizon, but also on the prevailing interest rate, and the persistence of the transitory shocks.)

The definition of $\beta$ is exactly the same in the pure Imperfect Information case as in the Perfect Information case. With no distinction between transitory and permanent shocks, however, its interpretation is somewhat less obvious. The correct interpretation of the $\beta$ parameter is still as an index of the sensitivity of consumption to income innovations. Now, $\beta$ measures the degree to which this sensitivity exceeds the 'baseline' (infinite horizon) sensitivity, $(1 + b)$. Alternatively, the Imperfect Information $\beta$ can be thought of as the responsiveness to transitory shocks consistent with the observed response to indistinguishable shocks.
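The mapping from structural parameters to moments described above can be made concrete with a short sketch. It evaluates the covariance restrictions of the illustrative model under each informational assumption, writing $\beta$ for $1/T$. The function names, dictionary keys and parameter values are hypothetical and serve only as an illustration; the MA(1) parameter is obtained by matching autocovariances, assuming the invertible root.

```python
import math


def ma1_b(var_eps: float, var_eta: float) -> float:
    """Invertible MA(1) parameter matching random walk + white noise."""
    rho = var_eps / var_eta
    return -(1.0 + rho / 2.0) + math.sqrt(rho + rho**2 / 4.0)


def moments_perfect(var_eps: float, var_eta: float, beta: float) -> dict:
    """Moments implied by the Perfect Information rule (beta = 1/T)."""
    return {
        "dc_dy": var_eps + beta * var_eta,   # E(dc_t dy_t)
        "dc_dy_lead": -beta * var_eta,       # E(dc_t dy_{t+1})
        "dy_dy": var_eps + 2.0 * var_eta,    # E(dy_t^2)
        "dy_dy_lag": -var_eta,               # E(dy_t dy_{t-1})
    }


def moments_imperfect(var_eps: float, var_eta: float, beta: float) -> dict:
    """Moments implied by the Imperfect Information (error-learning) rule."""
    b = ma1_b(var_eps, var_eta)
    k = 1.0 + b - b * beta  # response to the MA(1) forecast error
    return {
        "dc_dy": k * (var_eps + (2.0 + b) * var_eta),
        "dc_dy_lead": -k * var_eta,
        "dy_dy": var_eps + 2.0 * var_eta,    # income moments are unchanged
        "dy_dy_lag": -var_eta,
    }


# Hypothetical parameter values: same income process, same beta.
pi = moments_perfect(1.0, 2.0, 0.1)
ii = moments_imperfect(1.0, 2.0, 0.1)
for key in pi:
    print(key, pi[key], ii[key])
```

With identical structural parameters, the two models share the income autocovariances but differ in both consumption-income covariances, which is what makes the informational assumption matter for estimation.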
Our plan is to explore the sensitivity issue by estimating the consumption model, comparing the estimate of $\beta$ with what could be thought of as reasonable values for that parameter. In the Perfect Information case, one can identify $\beta$ through the covariance of $\Delta c_t$ with $\Delta y_{t+1}$ (Equation 9), and the first autocovariance of $\Delta y$ (Equation 11), since, in the Perfect Information case,

$$\beta = \frac{E(\Delta c_t \Delta y_{t+1})}{E(\Delta y_t \Delta y_{t-1})}.$$

Forming the ratio of the sample analogs of these moments should, if the model is correctly specified, deliver a consistent estimate of $\beta$. If, on the other hand, the sample covariances are generated by the consumption of individuals who are unable to distinguish between the two kinds of shocks, then using this ratio to identify $\beta$ yields an inconsistent $\hat{\beta}$, a linear combination of the true $\beta$ and 1 with weights $-b$ and $1 + b$ (where $b$ is negative):

$$\hat{\beta} = (1 + b) - b\beta.$$

Thus, in this simple example, the $\hat{\beta}$ obtained from estimating the model under the incorrect assumption of perfect information would be subject to a potentially serious inconsistency problem, leading to an overstatement of the response of consumption to income.^9 Such a problem could at least partially account for the Hall-Mishkin finding of excess sensitivity in the response of consumption to innovations in transitory income.

3 The Extended Model

In this section, we extend the basic model of consumption outlined in the preceding section to include two additional features: serially correlated transitory income shocks, and a nonzero (but constant) interest rate. We also expand the specification to allow for advance information about changes in income, and construct a specification which includes both distinguishable and indistinguishable shocks, thereby nesting the Hall-Mishkin restrictions within a more general model.
Finally, because the PSID study covers only food expenditures (rather than total consumption) for most years, we modify the model to describe the behavior of food consumption.

^9 The $\beta$ parameter is not the only one for which bias may be a problem when the information set is misspecified. Because the PSID data set reports only food expenditures, it is necessary to jointly estimate both $\beta$ and $\alpha$, the slope of the Engel curve for food. A similar argument can be made that the incorrect specification will lead to an inflated estimate of $\alpha$, if it is also estimated using sample covariances.

3.1 Serially Correlated Transitory Income Shocks

In order to approximate the dynamics of the changes in households' earnings, we model the transitory income component as following an AR(1) process, while the lifetime income component continues to be a random walk with uncorrelated errors. In the example above, with white noise transitory income shocks, the time series process which replicated the autocovariances of the latent variable process for $\Delta y$ was an ARMA(0,1); here, the equivalent time series process is an ARMA(1,1). We will briefly sketch the mapping between the two representations.

In terms of the latent variables, the income model is:

$$(1 - L)x_t = \epsilon_t \tag{12}$$

$$y_t = x_t + \eta_t \tag{13}$$

$$(1 - \phi L)\eta_t = v_t \tag{14}$$

where $\epsilon$ and $v$ are white noise, and $L$ denotes the lag operator. As above, if consumers are unable to make out the lifetime and transitory components separately, this latent variable process will look to them just like some ARMA process. Specifically, Equations 12 through 14 can be rewritten in terms of $\Delta y$ as:

$$(1 - \phi L)\Delta y_t = \Delta v_t + (1 - \phi L)\epsilon_t,$$

which is recognizable as an ARMA(1,1) in $\Delta y$, with a composite error term consisting of terms in $v$ and $\epsilon$.

Calling the autoregressive parameter $a$ and the moving-average parameter $b$, the corresponding ARMA process can be written as:

$$(1 - aL)\Delta y_t = (1 + bL)e_t.$$
Equating the autocovariances of the two representations and solving for $a$, $b$, and $\sigma_e^2$ in terms of $\phi$, $\sigma_\epsilon^2$, and $\sigma_v^2$ yields expressions for the parameters of the ARMA(1,1) representation in terms of $\phi$ and the ratio of the variances of the lifetime and the transitory shocks, $\lambda = \sigma_\epsilon^2/\sigma_v^2$:

$$a = \phi$$

$$b = \frac{-1 - \frac{1}{2}(\phi^2 + 1)\lambda + (1 - \phi)\sqrt{\lambda}\sqrt{\frac{1}{4}(1 + \phi)^2\lambda + 1}}{1 + \phi\lambda}.$$

Having ascertained the parameters of the appropriate ARMA process, all that remains is to insert the forecast rules:

$$E_t \Delta y_{t+1} = (\phi + b)\sum_{i=0}^{\infty} (-b)^i \Delta y_{t-i}$$

$$E_t \Delta y_{t+k} = \phi^{k-1} E_t \Delta y_{t+1} = \phi^{k-1}(\phi + b)\sum_{i=0}^{\infty} (-b)^i \Delta y_{t-i}$$

into the error-learning consumption equation of our uninformed consumer, and use the result to generate the corresponding moment restrictions.

3.2 Nonzero Rate of Interest

While the introduction of a nonzero interest rate and serially correlated transitory income complicates matters somewhat, the consumer's decision rule retains the error-learning structure it had in the simpler version of the model. The main difference is in the definition of the sensitivity parameter, $\beta$, which now is a function of the interest rate and the serial correlation parameter, as well as the length of the consumer's horizon. Specifically, the $\beta$ which describes the 'correct' degree of sensitivity to income surprises takes the form $\beta = \omega/\mu$, where

$$\omega = \frac{1 - \left(\frac{\phi}{1+r}\right)^T}{1 - \frac{\phi}{1+r}} \quad \text{and} \quad \mu = \frac{1 - \left(\frac{1}{1+r}\right)^T}{1 - \frac{1}{1+r}}.$$

As $T$ increases, the $\beta$ parameter, defined in this way, approximates the interest rate, $r$; in the limit as $T \to \infty$, $\beta \to r/(1 + r - a)$. Both the Imperfect Information and the Perfect Information versions of the consumption rule can now be rewritten in terms of this $\beta$ parameter.

At this point, it is convenient to introduce the modification required to estimate the model on food expenditure alone.
Going on the assumption that the Engel curve for food is approximately linear, with slope equal to $\alpha$ and a nonzero intercept, the only modification required is straightforward, and simply involves inserting this $\alpha$ parameter into the equations describing the consumption rule. Incorporating these changes, and reinterpreting $\Delta c$ as referring to food consumption only, yields the following generic expression for $\Delta c$:

$$\Delta c_t = \frac{\alpha}{\mu}\sum_{i=0}^{T-1}\left(\frac{1}{1+r}\right)^i \left(E_t y_{t+i} - E_{t-1} y_{t+i}\right).$$

To derive the Imperfect Information version, we combine this expression with that for the ARMA(1,1) forecast rule, yielding an error-learning equation for $\Delta c$ as a function of the period $t$ forecast error:

$$\Delta c_t = \alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\beta\right)\left(y_t - E_{t-1} y_t\right). \tag{15}$$

The Perfect Information model, on the other hand, implies a very different consumption rule, specifying a separate response to each component:

$$\Delta c_t = \alpha\epsilon_t + \alpha\beta v_t. \tag{16}$$

As in the simple example of the preceding section, this error-learning rule for consumption implies a specific set of restrictions on the elements of the covariance matrix of $\Delta c$ and the leads and lags of $\Delta y$. Also, as before, it implies the orthogonality condition between $\Delta c$ and all lagged elements of the information set, including lagged changes in $y$.

The other restrictions on the covariance matrix implied by the Imperfect Information model can be found by substituting the ARMA forecast error (in terms of the latent variables) into the error-learning rule, multiplying by $\Delta y$, and taking expectations. They are:

$$E(\Delta c_t \Delta y_t) = \alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\beta\right)\sigma_e^2 \tag{17}$$

$$E(\Delta c_t \Delta y_{t+1}) = \alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\beta\right)(\phi + b)\,\sigma_e^2 \tag{18}$$

where $b$ is defined as above and $\sigma_e^2 = -(\sigma_v^2 + \phi\sigma_\epsilon^2)/b$ is the variance of the ARMA innovation. On the other hand, the Perfect Information model implies a distinct pattern of covariance between $\Delta c$ and $\Delta y$:

$$E(\Delta c_t \Delta y_t) = \alpha\sigma_\epsilon^2 + \alpha\beta\sigma_v^2 \tag{19}$$

$$E(\Delta c_t \Delta y_{t+1}) = -\alpha\beta(1 - \phi)\sigma_v^2. \tag{20}$$

As before, the pattern of autocovariances of $\Delta y$ is independent of the specification of the consumer's information set:

$$E(\Delta y_t^2) = \sigma_\epsilon^2 + \frac{2}{\phi + 1}\sigma_v^2 \tag{21}$$

$$E(\Delta y_t \Delta y_{t-1}) = \frac{\phi - 1}{\phi + 1}\sigma_v^2 \tag{22}$$

$$E(\Delta y_t \Delta y_{t-2}) = \phi\,\frac{\phi - 1}{\phi + 1}\sigma_v^2. \tag{23}$$

Comparing Equations 15 and 16, the consumption rules for the Imperfect and the Perfect Information cases, it is clear that the competing hypotheses imply qualitatively different reactions in response to a shock to income. One way to see this is to compare the expressions for the covariance between $\Delta y_{t+1}$ and $\Delta c_t$.

Consider first the Perfect Information case, and consider a period in which an individual receives a positive transitory income 'blip.' If consumption behaves according to Equation 16, $\Delta y_{t+1}$ and $\Delta c_t$ will be negatively correlated for the following reason: in the current period, our consumer will adjust his consumption upwards. In the subsequent period, income will fall, but the consumer knew it was going to, so consumption won't change. The correlation between this period's increase in consumption and next period's decrease in income generates the negative correlation. Furthermore, as $\beta$ approaches zero, this negative covariance in the Perfect Information case shrinks to zero, since the accompanying change in consumption also shrinks to zero.

On the other hand, under the Imperfect Information assumption, even as $\beta$ goes to zero, we will see a negative covariance between $\Delta y_{t+1}$ and $\Delta c_t$ as successive observations of $y$ yield additional information on whether the initial change in income was a lifetime or transitory income shock. Thus, it is the covariance of $\Delta c_t$ with $\Delta y_{t+1}$, relative to $\mathrm{Cov}(\Delta c_t, \Delta y_t)$ and $\mathrm{Var}(\Delta y_t)$, which will be key in attempting to distinguish the competing hypotheses.
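The contrast drawn above, that the lead covariance vanishes with $\beta$ under Perfect Information but not under Imperfect Information, can be checked numerically. This is a sketch under stated assumptions rather than a transcription of the paper's formulas: the Imperfect Information covariance is derived here from Equation 15 and the ARMA(1,1) representation as (response coefficient) $\times\,(\phi + b)\,\sigma_e^2$, the ARMA parameters come from matching the autocovariances of the composite error, and all function names and parameter values are hypothetical.

```python
import math


def arma11_from_latent(phi: float, var_eps: float, var_v: float):
    """Return (b, var_e) of the ARMA(1,1) for Delta y (AR parameter = phi).

    Assumes 0 <= phi < 1, var_eps >= 0 and var_v > 0; picks the invertible
    moving-average root.
    """
    lam = var_eps / var_v
    b = (
        -1.0
        - 0.5 * (phi**2 + 1.0) * lam
        + (1.0 - phi) * math.sqrt(lam * (1.0 + 0.25 * (1.0 + phi) ** 2 * lam))
    ) / (1.0 + phi * lam)
    # First autocovariance of the composite error: b * var_e = -(var_v + phi * var_eps)
    var_e = -(var_v + phi * var_eps) / b
    return b, var_e


def lead_cov_perfect(alpha, beta, phi, var_v):
    """E(dc_t dy_{t+1}) under Perfect Information (Equation 20)."""
    return -alpha * beta * (1.0 - phi) * var_v


def lead_cov_imperfect(alpha, beta, phi, var_eps, var_v):
    """E(dc_t dy_{t+1}) under Imperfect Information (derived from Eq. 15)."""
    b, var_e = arma11_from_latent(phi, var_eps, var_v)
    k = (1.0 + b) / (1.0 - phi) - (phi + b) / (1.0 - phi) * beta
    return alpha * k * (phi + b) * var_e


# Hypothetical parameter values with a large transitory variance.
phi, var_eps, var_v, alpha = 0.3, 1.0, 2.0, 1.0
for beta in (0.2, 0.1, 0.0):
    print(
        beta,
        lead_cov_perfect(alpha, beta, phi, var_v),
        lead_cov_imperfect(alpha, beta, phi, var_eps, var_v),
    )
```

For these values $\phi + b$ is negative, so the Imperfect Information covariance stays negative at $\beta = 0$; the sign depends on the variance ratio, and the sketch makes no claim about the values relevant to the PSID data.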
Before we estimate the model, we first discuss two additional ways in which we augment the basic indistinguishable shocks model of Section 2, and then discuss the question of identification in that more general model.

3.3 Allowing For Advance Information

A striking feature of the household consumption data is that the correlation between the current change in consumption and the change in income one period hence is significantly positive, rather than negative as implied by the Permanent Income Hypothesis under either informational assumption. The standard explanation of this phenomenon is that consumers, when they make their 'current' consumption decision, already have some advance information about the subsequent period's innovations in income. The timing of the PSID survey corroborates this interpretation. The survey, which is administered in March of every year, contains income questions which refer to the previous year's earnings. The consumption questions, on the other hand, are usually interpreted as pertaining to current consumption. Therefore, it is only natural that consumption in March of 1984 should already have responded to some of the earnings news for 1984.

To compensate for this timing problem, we assume, following Hall and Mishkin, that the 'true' current change in consumption is a convex combination of the theoretical 'current' change in consumption and 'next period's' change in consumption. If $\gamma$ is the 'proportion with no advance information,' then:

$$\Delta c_t = \gamma\,\alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\,\beta\right)\left(y_t - E_{t-1}y_t\right) + (1-\gamma)\,\alpha\left(\frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\,\beta\right)\left(y_{t+1} - E_t y_{t+1}\right).$$

One justification for this specification is that it corresponds to a model of information propagation in which there is a $1-\gamma$ probability every period that the next period's innovation in income will be known before this period's consumption decision is taken.
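To see how advance information can turn the one-period-ahead covariance positive, the sketch below applies the same convex combination to the Perfect Information rule, whose moments have the simple closed forms of Equations 19 and 20. This is an illustration under that assumption, not the paper's estimated specification; the function names and the parameter values in the usage lines are hypothetical.

```python
def cov_dc_next_dy(alpha, beta, phi, gamma, var_eps, var_nu):
    """E(dc_t * dy_{t+1}) when observed dc_t = gamma*dc_t + (1-gamma)*dc_{t+1}.

    Assumes the Perfect Information rule dc_t = alpha*eps_t + alpha*beta*nu_t.
    """
    same_period = alpha * (var_eps + beta * var_nu)     # Eq. 19
    one_lead = -alpha * beta * (1.0 - phi) * var_nu     # Eq. 20
    return gamma * one_lead + (1.0 - gamma) * same_period


def gamma_threshold(beta, phi, var_eps, var_nu):
    """Value of gamma at which the covariance changes sign."""
    same_period = var_eps + beta * var_nu
    one_lead = beta * (1.0 - phi) * var_nu
    return same_period / (same_period + one_lead)


if __name__ == "__main__":
    # Hypothetical values: no advance information, then 38% advance information.
    print(cov_dc_next_dy(0.08, 0.05, 0.3, 1.00, 1.0, 1.0))
    print(cov_dc_next_dy(0.08, 0.05, 0.3, 0.62, 1.0, 1.0))
    print(gamma_threshold(0.05, 0.3, 1.0, 1.0))
```

Under these assumptions the covariance is negative only when $\gamma$ is close to one, so even a modest share of advance information is enough to produce the positive correlation seen in the data.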
3.4 Nesting The Two Models

While it seems plausible that consumers often cannot distinguish between temporary and permanent shocks to their incomes, it is unlikely that they could never make this distinction. To allow for the possibility that consumers face a situation of partial information, we will assume that the consumer faces two sets of income shocks: one distinguishable set, and one indistinguishable set. Here, we will describe a fully general (but underidentified) specification, and will later discuss the restrictions on the model required to achieve identification. The required restrictions will condense this general specification into a convex combination of the Hall-Mishkin model and our imperfect information model.

We will start with an unrestricted latent variable specification for the $\Delta y$ process:

$$\Delta y_t = \varepsilon_{1,t} + \eta_{1,t} - \eta_{1,t-1} + \varepsilon_{2,t} + \eta_{2,t} - \eta_{2,t-1}$$

where $\varepsilon_{1,t}$, $\varepsilon_{2,t}$, $\eta_{1,t}$ and $\eta_{2,t}$ are independent of each other, and are described by the same stochastic processes as $\varepsilon$ and $\eta$ above (i.e., $\varepsilon_{1,t}$ and $\varepsilon_{2,t}$ are white noise with variances $\sigma^2_{1,\varepsilon}$ and $\sigma^2_{2,\varepsilon}$, while $\eta_{1,t}$ and $\eta_{2,t}$ follow the processes $(1-\phi_1 L)\eta_{1,t} = \nu_{1,t}$ and $(1-\phi_2 L)\eta_{2,t} = \nu_{2,t}$, where the variances of $\nu_{1,t}$ and $\nu_{2,t}$ are $\sigma^2_{1,\nu}$ and $\sigma^2_{2,\nu}$).

We now assume that the consumer cannot separately observe $\varepsilon_{1,t}$ and $\eta_{1,t}$, but can separately observe $\varepsilon_{2,t}$ and $\eta_{2,t}$. This makes our augmented consumption model a sum of two parts: one which corresponds to the Imperfect Information model (Equation 15), and one which corresponds to the Perfect Information model (Equation 16). Ignoring, for the time being, the advance information complication, this extension implies the following equation for $\Delta c$:

$$\Delta c_t = \alpha\left(\frac{1+b}{1-\phi_1} - \frac{\phi_1+b}{1-\phi_1}\,\beta\right)\left[\varepsilon_{1,t} + \nu_{1,t} - (\phi_1+b)\sum_{i=0}^{\infty}(-b)^i\,\varepsilon_{1,t-1-i} - (1+b)\sum_{i=0}^{\infty}(-b)^i\,\nu_{1,t-1-i}\right] + \alpha\left(\varepsilon_{2,t} + \beta'\nu_{2,t}\right)$$

and $\beta$ and $\beta'$ are defined exactly as before for the two subprocesses.
This formulation nests the Perfect Information specification of Hall and Mishkin with the Imperfect Information specification, as special cases of a more general model. The two special cases correspond to setting $\varepsilon_{1,t}$ and $\eta_{1,t}$, or $\varepsilon_{2,t}$ and $\eta_{2,t}$, identically equal to zero. However, as we will argue in the next section, this most general formulation does not impose enough restrictions to achieve identification. In order to construct an estimable specification, we must impose a number of restrictions on the variances of the observed and unobserved components discussed above.

The system of equations we use to identify the parameters appears in full in the appendix. What follows is a brief discussion of some of the problems we face in identifying the nested versions of the model. There are essentially six variances and covariances which provide independent information about the parameters of this model. They are: $E(\Delta c_t \Delta y_t)$, $E(\Delta c_t \Delta y_{t+1})$, $E(\Delta c_t \Delta y_{t+2})$, $E(\Delta y_t^2)$, $E(\Delta y_t \Delta y_{t-1})$, and $E(\Delta y_t \Delta y_{t-2})$. As the model is specified above (incorporating both the advance information extension and the two classes of income innovations), there are nine parameters to identify: $\gamma$, $\beta$, $\beta'$, $\phi_1$, $\phi_2$, $\sigma^2_{1,\varepsilon}$, $\sigma^2_{1,\nu}$, $\sigma^2_{2,\varepsilon}$, and $\sigma^2_{2,\nu}$.

With such an excess of parameters relative to the number of moments, the model as it stands is clearly underidentified. To identify the model, we proceed by assuming that the distinguishable and indistinguishable shocks come from populations with identical autoregressive parameters, and equal variances, up to a constant of proportionality:

$$\phi_1 = \phi_2, \qquad \frac{\sigma^2_{1,\varepsilon}}{\sigma^2_{2,\varepsilon}} = \frac{\sigma^2_{1,\nu}}{\sigma^2_{2,\nu}} = \frac{1-p}{p}.$$

In addition to eliminating a $\phi$ and two $\sigma^2$s, these assumptions also imply that $\beta = \beta'$. Introducing the constant of proportionality between variances, $p$, adds a single parameter, so that we are left with exactly six coefficients to estimate, not counting $\alpha$, which will be estimated separately.
These assumptions essentially say that the two pairs of innovations to the income process are identical, except to the extent that one pair may have a higher variance than the other pair. One way to interpret this restricted model is the following: consumers receive only two kinds of shocks, $\varepsilon$ and $\nu$. Each year, some proportion of consumers, $p$, receive information on the source of their shocks, while the rest, $1-p$, receive no information on the source of their shocks. This is a non-trivial assumption, since one can think of cases where the distinguishable shocks come from one kind of population (say, changes in direct taxes) while the indistinguishable shocks come from a different kind of population (say, changes in indirect taxes); however, it is unavoidable if we are to achieve identification. In this restricted form, our model nests both the Perfect and Imperfect Information models as special cases corresponding to $p = 1$ and $p = 0$, respectively.

3.5 Estimating $\alpha$

One final identification problem remains. The $\alpha$ parameter, which appears in each of the $\Delta c$-$\Delta y$ elements of the covariance matrix, is not identifiable from covariances alone, unless the Perfect Information assumption is maintained. In this case, the fact that both the transitory and the lifetime factors have their own coefficients, $\alpha\beta$ and $\alpha$, respectively, allows $\alpha$ to be identified from the covariances. This accords with intuition, which suggests that we can discern $\alpha$ from the response to a lifetime shock. Then, knowing $\alpha$ and observing the response to a transitory shock, we can 'back out' an estimate of $\beta$.

In the Imperfect Information situation, this is no longer the case: $\alpha$ and a term containing $\beta$ always enter the covariance restrictions as a product, meaning that the two cannot be disentangled from this information alone.
It is possible to verify from the system of equations in the appendix that even with some non-zero fraction of Perfect Information consumption, the presence of the $p$ nesting parameter makes it impossible to determine $\alpha$ from the covariances.

We choose an alternate method of estimating the slope of the Engel curve, involving an auxiliary regression of consumption on income; in effect, we use information from the levels of consumption and income, rather than the differences, to achieve identification.[10] This is just the kind of regression which is susceptible to the errors-in-variables problem identified by Friedman (1957). The problem is that measured income is the sum of transitory and permanent components; a naive regression of consumption on income will yield an attenuated estimate of the marginal propensity to consume food out of permanent income.

We remedy this problem by estimating the Engel curve on time averages of each household's data. That is, we regress the average consumption level for household $i$ on the average income of that household in a regression of the form:

$$\bar{c}_i = k + \alpha\,\bar{y}_i.$$

The idea behind this 'between' estimator of $\alpha$ is that the measurement errors induced by transitory income tend to cancel each other out over time, so that $\bar{y}_i$ is a relatively noise-free estimate of permanent income. Such a procedure will not completely eliminate measurement error, but will at least reduce it.

Table 1 below shows the equation we use to obtain our estimate of $\alpha$, in which we also control for the number of household members. The point estimate from the linear version is 0.08; this is the value we will use in subsequent estimations. The nonlinear specification in Table 1 shows only a very small amount of curvature in the Engel curve for our sample, a fact we attribute to the homogeneous composition of the sample we selected for analysis. The next section describes this sample in greater detail.
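The attenuation argument can be illustrated with a small simulation. The sketch below is hypothetical throughout: the data-generating process, the dollar magnitudes, and the function names are assumptions made for illustration and are not taken from the PSID sample. It compares a pooled regression of food consumption on annual income with the 'between' regression on household time averages.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_panel(alpha=0.08, n_households=2000, n_years=7, seed=1):
    """Income = permanent + transitory; food depends on permanent income only."""
    rng = random.Random(seed)
    income, food = [], []
    for _ in range(n_households):
        permanent = rng.gauss(10_000.0, 3_000.0)
        income.append([permanent + rng.gauss(0.0, 3_000.0) for _ in range(n_years)])
        food.append([250.0 + alpha * permanent + rng.gauss(0.0, 50.0)
                     for _ in range(n_years)])
    return income, food


def pooled_and_between(income, food):
    """Return (pooled slope, between slope on household time averages)."""
    pooled = ols_slope([y for row in income for y in row],
                       [c for row in food for c in row])
    y_bar = [sum(row) / len(row) for row in income]
    c_bar = [sum(row) / len(row) for row in food]
    return pooled, ols_slope(y_bar, c_bar)


if __name__ == "__main__":
    pooled, between = pooled_and_between(*simulate_panel())
    print(f"pooled slope:  {pooled:.4f}")
    print(f"between slope: {between:.4f}")
```

In this setup the pooled slope is attenuated to roughly half of the assumed $\alpha$, while the between slope is much closer to it. Averaging over seven years shrinks the transitory variance by a factor of seven, which reduces the bias without removing it.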
[10] An additional benefit of this method is that it is free of the specification bias which, as we argued above, could contaminate the $\alpha$ and $\beta$ estimated from the covariances under the incorrect informational assumption.

Table 1: The 'Between' Estimate of $\alpha$

| Specification | Intercept | Y | Y^2 | HHSIZE | R^2 |
|---|---|---|---|---|---|
| 1 | 259.89 (30.50) | 0.0803 (0.0026) | -- | 168.47 (7.79) | 0.5122 |
| 2 | 230.35 (51.07) | 0.0867 (0.0093) | -2.7E-7 (3.7E-7) | 167.42 (7.92) | 0.5121 |

Dependent Variable: Real food consumption. Y = Real disposable income. HHSIZE = Number of household members. Data are 1978-84 averages, in 1967 dollars. Standard errors are in parentheses.

4 Estimation

4.1 The Data

The data we use come from the University of Michigan's Panel Study of Income Dynamics (PSID), from survey years 1978 through 1984.[11] We use only family-level data throughout, taking the household, as defined by PSID conventions, as the appropriate level of aggregation for the analysis of consumption decisions. Bearing in mind the timing complication described above, we take the responses from each year's survey to refer to the previous year's values, but allow for consumption decisions to be made with advance information.

For our income variable, we use the sum of the labor incomes of the Head and the Wife (or 'Wife'), adjusted for federal income taxes and FICA payroll taxes.[12] For consumption, we use the sum of the household's expenditures on food at home and in restaurants. We deflate the income variable with the consumer price index to put it in terms of constant 1967 dollars. Similarly, using the food price component of the consumer price index, we express food consumption in terms of 1967 dollars.

Rather than using the entire PSID panel of 6,918 families (as of Wave 17), we choose to focus our analysis on a carefully selected subset of the panel. We first drop a number of observations which appear to be 'bad' data, outliers, or have some other (observable) problem.
Second, in an attempt to avoid the problems involved in modelling non-Permanent-Income behavior, we omit observations on families which are likely to be constrained in one way or another. The data we drop in the first round are the following:

1. Families which report zero labor income for both Head and Wife
2. Families which report zero food expenditure
3. Very wealthy families (inflation adjusted needs ratio greater than 20)
4. Observations with a major assignment to food or income data
5. Families in which both the Head and the Wife were institutionalized, students, or non-participants in the labor force for any other reason
6. Observations in which the reported labor income was truncated by the number of digits on the PSID tape[13]
7. Families which reported a change in real income greater than 50 per cent in absolute value, relative to the previous year.

[11] Technical constraints imposed by the computer software forced us to use only seven years of data (six sets of differences). However, because households are being continually added to the survey and because our method requires a balanced sample, including fewer years increases the number of observations available for estimation.

[12] Ideally, we would also want to adjust for state taxes in arriving at a measure of changes in disposable income. However, because the PSID does not contain any state tax data, this would involve either adjusting by some representative marginal state tax rate, or combining the PSID information on the household's state of residence with statutory tax rate data to estimate each household's state taxes.

Second, in order to focus on the informational issue discussed above, we wish to concentrate our analysis on those families who are most likely to behave according to the Permanent Income Hypothesis. Accordingly, we drop those observations corresponding to very poor families whose behavior is likely to be liquidity constrained.
Specifically, we discard observations of families whose inflation adjusted needs ratio was less than unity, and those of families which received food stamps during the sample period. Finally, on the grounds that estimating a dynamic earnings structure as in Equations 12 and 13 makes little sense for retired people who generally have little or no labor income, we eliminate retirees from the sample. The result of this series of cuts is a relatively homogeneous balanced sample of 1,978 observations.

[13] For 1978-83, this amount was $99,999; for 1984, it was $999,999.

4.2 Estimation Method

In the most general terms, dynamic factor models of consumption, like those described above, act to place sets of restrictions on the covariance matrix of (the leads and lags of) $\Delta y$ and $\Delta c$. Therefore, an estimation method which allows us to impose those restrictions directly on the elements of the covariance matrix is better adapted to fitting these models than one which imposes restrictions on the ratios of the off-diagonal elements to the diagonal elements, as does the regression method.

The Minimum Distance estimation method is very well adapted to estimating a model with such a structure.[14] The idea is to minimize a quadratic criterion function of the form:

$$\min_{\theta}\; [g(\theta) - z]'\,A\,[g(\theta) - z]$$

where $\theta$ is an $m$-dimensional vector of parameters to be estimated, $z$ is an $n$-dimensional vector of unconstrained estimates, and $g$ is the mapping from the constrained parameter space to the unconstrained parameter space which incorporates the restrictions on the moments implied by the behavioral model. The minimum distance estimator, $\hat\theta$, is the $\theta$ which solves the first-order condition:

$$Dg(\hat\theta)'\,A\,[g(\hat\theta) - z] = 0.$$

Gauss-Newton iterations can be used to numerically solve this equation and obtain a value for $\hat\theta$.
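The Gauss-Newton iteration for the minimum distance criterion can be sketched as follows. This is a minimal illustration, not the estimation code used for the results in the paper. It fits only the three income-process parameters to the three autocovariances of Equations 21 to 23, uses an identity weighting matrix by default, and approximates the Jacobian by finite differences; all function names and starting values are hypothetical.

```python
def model_moments(theta):
    """g(theta): autocovariances of income growth at lags 0, 1, 2 (Eqs. 21-23)."""
    var_eps, var_nu, phi = theta
    ratio = (phi - 1.0) / (phi + 1.0)
    return [var_eps + 2.0 * var_nu / (1.0 + phi), ratio * var_nu, phi * ratio * var_nu]


def jacobian(g, theta, h=1e-6):
    """Forward-difference approximation to Dg(theta); entry [i][j] = dg_i/dtheta_j."""
    base = g(theta)
    jac = [[0.0] * len(theta) for _ in base]
    for j in range(len(theta)):
        bumped = list(theta)
        bumped[j] += h
        shifted = g(bumped)
        for i in range(len(base)):
            jac[i][j] = (shifted[i] - base[i]) / h
    return jac


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def minimum_distance(g, z, theta0, weight=None, max_iter=50, tol=1e-10):
    """Gauss-Newton iterations on [g(theta) - z]' A [g(theta) - z]."""
    n, m = len(z), len(theta0)
    if weight is None:  # identity weighting matrix (not the optimal choice)
        weight = [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]
    theta = list(theta0)
    for _ in range(max_iter):
        resid = [gi - zi for gi, zi in zip(g(theta), z)]
        jac = jacobian(g, theta)
        a_jac = [[sum(weight[i][k] * jac[k][j] for k in range(n)) for j in range(m)]
                 for i in range(n)]
        a_res = [sum(weight[i][k] * resid[k] for k in range(n)) for i in range(n)]
        lhs = [[sum(jac[k][a] * a_jac[k][b] for k in range(n)) for b in range(m)]
               for a in range(m)]
        rhs = [-sum(jac[k][a] * a_res[k] for k in range(n)) for a in range(m)]
        step = solve_linear(lhs, rhs)
        theta = [t + s for t, s in zip(theta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return theta


if __name__ == "__main__":
    z = model_moments([1.0, 1.0, 0.3])  # pretend these are the sample moments
    print(minimum_distance(model_moments, z, [0.5, 0.5, 0.1]))
```

Because this toy system has as many moments as parameters, the fitted moments match `z` exactly and the criterion is zero at the solution. In an overidentified application, passing the inverse of the estimated covariance matrix of `z` as `weight` would correspond to the feasible optimal weighting.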
It can be shown that, under very general conditions, the Minimum Distance estimator is consistent and asymptotically normal, regardless of the choice of weighting matrix, $A$, used in the minimization. If the inverse of the covariance matrix of $z$ is used as the weighting matrix, then the minimum-distance method delivers the 'Optimal' Minimum Distance (OMD) estimator, yielding the most efficient (relative to other choices of $A$) estimate of $\theta$.[15] The method of Maximum Likelihood is analogous to using the inverse of the matrix of fourth moments implied by the normal distribution for $A$; for a non-normal distribution of disturbances, the ML method would therefore utilize a sub-optimal weighting matrix.

[14] Other applications of OMD estimation of covariance structures include Abowd and Card (1986), and Altonji, Martins and Siow (1987).

[15] See Chamberlain (1982) and (1984) for a complete presentation of the Minimum Distance method, and its optimality properties.

Another benefit of using $\mathrm{Cov}(z)^{-1}$ as the weighting matrix is that the minimized criterion function, multiplied by the number of observations, is distributed asymptotically as a $\chi^2$, with degrees of freedom equal to the number of restrictions placed on the model, i.e., $n - m$; in Table 3 we use this fact to perform $\chi^2$ tests of various restrictions. In the results that follow, we use a feasible version of the Optimal Minimum Distance estimator, in which the $A$ matrix is replaced by the inverse of the estimated covariance matrix of $z$, the matrix of sample fourth moments.

In our application, the $\theta$ vector consists of the parameters of the income process and the consumption model discussed above. The $g$ function maps these parameters into our $z$ vector, which comprises the unique elements of the sample covariance matrix, e.g.:

$$\frac{1}{N}\sum_{i}\Delta y^*_{i,t}\,\Delta y^*_{i,t-s} \quad\text{and}\quad \frac{1}{N}\sum_{i}\Delta c^*_{i,t}\,\Delta c^*_{i,t-s}, \quad\text{for } s \le 5,$$

where $\Delta c^*_{i,t} = \Delta c_{i,t} - \overline{\Delta c}_t$ and $\Delta y^*_{i,t} = \Delta y_{i,t} - \overline{\Delta y}_t$.
Even though our models place no restrictions on the autocovariances of $\Delta c$, we include these moments in the $z$ vector, but impose only the stationarity restrictions on that subvector.

5 The Results

The results we get are mixed. As should be evident from Tables 2 and 3, the data resist our attempts to impose stationarity. This should not surprise the reader; most studies based on these data find the same rejection.[16] Orthogonality restrictions imposed by our theory are rejected in the weighted data but not in the unweighted data; the implications of this will be examined in the next section. The income process we impose is, however, accepted even at the 10% level, which suggests that our ARMA(1,1) specification for $\Delta y$ is not obviously worse than the ARMA(0,2) specification adopted by Hall and Mishkin.[17]

[16] See for example Altonji, Martins and Siow (1987).

[17] MaCurdy (1982) also concludes that the data are indifferent between the ARMA(1,1) and the ARMA(0,2) specifications.

The interpretation of our results should therefore be qualified by the fact that the basic structure we impose on the data to make estimation possible is not entirely supported by the data. A further problem arises because the relation between $p$ and $\beta$ in our model is highly nonlinear; as a result, estimation of the nested model yields two sets of values for $p$ and $\beta$ which generate the same value of the objective function. To make matters worse, one set of these values is typically outside the legitimate range. (Both $p$ and $\beta$ get negative point estimates, although in general one cannot reject the hypothesis that they are both zero.) Our emphasis in interpreting the results will therefore be on comparing $\chi^2$ values for different alternative hypotheses rather than on looking at point estimates.

The strongest support for the Imperfect Information view comes from looking at the weighted data.
The numbers in Tables 2 and 3 show that imposing the pure Imperfect Information restriction ($p = 0$) on the nested model causes the $\chi^2$ value to change very slightly. On the other hand, the Perfect Information restriction ($p = 1$) is rejected quite emphatically by the data.

The results in the unweighted data are less clear cut. The pure Imperfect Information restriction has a somewhat lower $\chi^2$ value than the Perfect Information hypothesis, but neither set of restrictions on the nested model can be rejected even at the 10% level. (At the 15% level we can reject the Perfect Information hypothesis, but we cannot really insist that this is a very meaningful rejection.)

The point estimates reported in Tables 4 and 5 correspond to the non-negative set of roots. As discussed above, there is some reason to be sceptical of the information value of these estimates. The one which seems most reasonable is the estimate for $\gamma$, which comes out to be 0.51 in the unweighted case and 0.62 in the weighted case. At least the value from the weighted model, suggesting 38% advance information, is not inconsistent with Hall and Mishkin's claim that there is a lag of one quarter between the income and the consumption data. Also, the value of $\gamma$ corresponding to the negative roots is not very different from these values, which suggests a degree of robustness in the estimate of this parameter.

The estimates of $\beta$ and $p$ reported in the tables are hard to interpret. When we estimate $p$ freely, the estimate comes out to be quite high (between 0.6 and 0.7), and it is accompanied by high $\beta$ values (not less than 0.4).[18] The same kind of $\beta$ values are generated if we impose the Perfect Information restrictions. On the other hand, imposing the pure Imperfect Information restrictions yields a $\beta$ value of 0.06, which is perfectly consistent with the interest rates and lifetimes we consider reasonable, and the decline in the fit, as we saw above, is very slight. The fact that the deterioration in fit is so slight, in apparent contradiction to the high point estimates of $p$, is due to the non-linearity of the model, and the presence of the second root corresponding to a small, negative value of $p$. Small changes in the specification or the selection of an alternative subset of the PSID panel may make this second set of roots positive, in which case a stronger conclusion in favor of our hypothesis would be warranted.

[18] In contrast, with a rate of interest equal to 4% and a horizon of 25 years, the 'correct' value of $\beta$ is approximately 0.05.

Table 2: Goodness-of-Fit Statistics

| Model | Restrictions | k | $\chi^2$, Unweighted | $\chi^2$, Weighted |
|---|---|---|---|---|
| 0 | Unrestricted | 78 | 0.00 | 0.00 |
| 1 | Stationarity Constrained | 23 | 100.00 | 114.51 |
| 2 | (1) Plus Orthogonality | 18 | 106.51 | 126.95 |
| 3 | (2) Plus Income Process | 15 | 111.44 | 130.52 |
| | Without Rule-of-Thumb Consumption | | | |
| 4 | Nested Perfect & Imperfect Info | 12 | 124.21 | 140.60 |
| 5 | Perfect Information Only | 11 | 126.31 | 144.54 |
| 6 | Imperfect Information Only | 11 | 125.30 | 141.55 |
| | With Rule-of-Thumb Consumption | | | |
| 7 | Nested Perfect & Imperfect Info | 13 | 122.70 | 138.41 |
| 8 | Perfect Information Only | 12 | 122.88 | 139.59 |
| 9 | Imperfect Information Only | 12 | 123.60 | 138.53 |

Table 3: Tests of Alternative Specifications

| Model $H_a$ | Model $H_0$ | DF | Unweighted $\chi^2$ Stat | Unweighted P-Value | Weighted $\chi^2$ Stat | Weighted P-Value |
|---|---|---|---|---|---|---|
| 0 | 1 | 55 | 100.00 | 0.00 | 114.51 | 0.00 |
| 1 | 2 | 5 | 6.51 | 0.26 | 12.44 | 0.03 |
| 2 | 3 | 3 | 4.93 | 0.18 | 3.57 | 0.31 |
| 3 | 4 | 3 | 12.77 | 0.01 | 10.08 | 0.02 |
| 4 | 5 | 1 | 2.10 | 0.15 | 3.96 | 0.05 |
| 4 | 6 | 1 | 1.09 | 0.30 | 0.95 | 0.33 |
| 7 | 8 | 1 | 0.18 | 0.67 | 1.18 | 0.28 |
| 7 | 9 | 1 | 0.90 | 0.34 | 0.12 | 0.73 |
| 7 | 4 | 1 | 1.51 | 0.22 | 2.19 | 0.14 |
| 7 | 5 | 2 | 3.61 | 0.16 | 6.13 | 0.05 |
| 7 | 6 | 2 | 2.60 | 0.27 | 3.14 | 0.21 |
| 8 | 5 | 1 | 1.43 | 0.23 | 4.95 | 0.03 |
| 9 | 6 | 1 | 1.70 | 0.19 | 3.02 | 0.08 |

Table 4: Parameter Estimates, Unweighted Data

| Model | $\sigma^2_\varepsilon$ | $\sigma^2_\nu$ | $\phi$ | $\beta$ | $\gamma$ | $p$ | $\zeta$ |
|---|---|---|---|---|---|---|---|
| 3 | 1.19 (0.10) | 0.96 (0.09) | 0.31 (0.05) | -- | -- | -- | -- |
| 4 | 1.21 (0.11) | 0.94 (0.09) | 0.33 (0.05) | 0.47 (0.14) | 0.51 (0.06) | 0.69 (0.24) | 0.00 |
| 5 | 1.15 (0.09) | 0.98 (0.08) | 0.30 (0.05) | 0.53 (0.11) | 0.48 (0.06) | 1.00 | 0.00 |
| 6 | 1.21 (0.11) | 0.94 (0.09) | 0.33 (0.05) | 0.08 (0.13) | 0.52 (0.06) | 0.00 | 0.00 |
| 7 | 1.24 (0.11) | 0.92 (0.09) | 0.34 (0.05) | 0.46 (0.17) | 0.54 (0.06) | 0.85 (0.34) | 0.24 (0.19) |
| 8 | 1.24 (0.11) | 0.92 (0.09) | 0.33 (0.05) | 0.47 (0.16) | 0.54 (0.06) | 1.00 | 0.29 (0.14) |
| 9 | 1.25 (0.11) | 0.92 (0.09) | 0.34 (0.05) | 0.03 (0.20) | 0.56 (0.05) | 0.00 | 0.26 (0.18) |

Standard errors are in parentheses.

Table 5: Parameter Estimates, Weighted Data

| Model | $\sigma^2_\varepsilon$ | $\sigma^2_\nu$ | $\phi$ | $\beta$ | $\gamma$ | $p$ | $\zeta$ |
|---|---|---|---|---|---|---|---|
| 3 | 15.18 (1.14) | 12.85 (1.05) | 0.22 (0.04) | -- | -- | -- | -- |
| 4 | 14.93 (1.14) | 12.89 (1.05) | 0.23 (0.04) | 0.44 (0.17) | 0.62 (0.08) | 0.60 (0.22) | 0.00 |
| 5 | 14.11 (1.04) | 13.44 (0.98) | 0.20 (0.04) | 0.42 (0.14) | 0.63 (0.10) | 1.0 | 0.00 |
| 6 | 14.89 (1.13) | 12.90 (1.04) | 0.23 (0.04) | 0.06 (0.13) | 0.64 (0.18) | 0.0 | 0.00 |
| 7 | 15.41 (1.19) | 12.57 (1.07) | 0.24 (0.04) | 0.34 (0.29) | 0.66 (0.07) | 0.61 (0.37) | 0.29 (0.19) |
| 8 | 15.21 (1.16) | 12.69 (1.06) | 0.24 (0.04) | 0.32 (0.25) | 0.69 (0.08) | 1.0 | 0.39 (0.16) |
| 9 | 15.42 (1.18) | 12.56 (1.07) | 0.24 (0.04) | -0.01 (0.19) | 0.66 (0.05) | 0.0 | 0.32 (0.17) |

Standard errors are in parentheses.

6 Rule-of-Thumb Consumption?

The most unsatisfactory aspect of our current results is the strong violation of the Euler equation restrictions in the weighted case.
In combination with the fact that the Euler equation restrictions are not rejected in the unweighted case, this suggests that the violation may come from the behavior of the members of the sample with relatively low incomes. Intuitively one expects their influence on the results to be greater in the weighted case. This suggests that the source of the trouble may be liquidity constraints on low income families. The trouble is, as Zeldes (1985) points out, that there is no simple rule for predicting how liquidity constrained agents would behave.[19]

The same problem also arises with the second candidate for an alternative hypothesis, namely, that at least some agents are not rational and use rules of thumb to decide their consumption. There is no obvious candidate for such a rule of thumb. Current fashion favors the so-called Keynesian consumption function, which simply says that $\Delta c = \Delta y$, but there seems to be no reason to prefer this over $\Delta c = k\,\Delta y$ where $k$ is less than 1. Keynes himself preferred the latter and called it the 'fundamental psychological law.'[20] And as long as $k$ is positive, the version with $k < 1$ is a priori no worse in explaining violations of the Euler equation.

In a future extension of this paper we will consider the question of the best specification for rule-of-thumb behavior. In this paper we limit ourselves to examining what happens if we add to our model the 'Keynesian' alternative mentioned above. Our motive for doing so is comparability with other studies, like Hall and Mishkin and Campbell and Mankiw (1988), which make this assumption.

The method we use for incorporating 'Keynesian' behavior follows Hall and Mishkin (1982). We assume that a fraction $\zeta$ of consumption is determined by the rule of thumb, $\Delta c = \Delta y$, and the rest is determined by the permanent income model, as above. We therefore write:

$$\Delta c = \zeta\,\Delta y + (1-\zeta)\,\Delta c^p$$

where $c^p$ is the consumption predicted by the permanent income model introduced above.

[19] See Hall (1987) and Hayashi (1987) for surveys of the evidence on liquidity constraints.

[20] See Keynes (1936).

This method of incorporating rule-of-thumb consumption suffers from the defect that it actually implies that all agents sometimes follow a rule of thumb and sometimes follow the permanent income model. By contrast, what we actually want is to be able to model the fact that some agents follow one of these models most of the time and that others follow the other model most of the time. As a result, Hall and Mishkin's interpretation of the $\zeta$ parameter as the fraction of rule-of-thumb consumers is not strictly correct.

We follow Hall and Mishkin in using the moments of consumption with lagged income to identify $\zeta$. Our results again depend on which data set we use. With both sets we find that the restriction that $\zeta = 0$ cannot be rejected at the 5% level. The same is true if we compare the pure Imperfect Information model ($\zeta = 0$, $p = 0$) to the unrestricted model. The restriction of Perfect Information ($\zeta = 0$, $p = 1$) is rejected in the weighted data. If we allow $\zeta$ to be freely estimated, once again one cannot reject the restriction of pure Imperfect Information, but now one cannot reject Perfect Information either, and in the unweighted data it actually performs slightly better. To a limited extent, therefore, the introduction of rule-of-thumb consumption does make the Hall-Mishkin assumption perform better relative to the Imperfect Information hypothesis.

However, the conclusion of Hall and Mishkin that the inclusion of rule-of-thumb consumers is enough to generate reasonable point estimates is not confirmed by our study. In the Hall-Mishkin case ($p = 1$, unweighted data) the estimate of $\zeta$ we get, 0.29, is not so different from the estimate they report, 0.2, but the $\beta$ we get, 0.47, is much larger than their estimate of 0.17 and cannot be reconciled with rational behavior.
The only case where we obtain an estimate of $\beta$ consistent with the theory is in the unrestricted case with weighted data. In this case, we get an estimate of 0.33 with a standard error of 0.28, but the finding that this is not inconsistent with the theory is mainly driven by the unusually large standard error. In all other cases the estimate of $\beta$ is not really changed by the inclusion of rule-of-thumb consumers. The estimate of $p$ is also surprisingly insensitive to the inclusion of rule-of-thumb consumers.

7 Conclusions

The most general conclusion to be drawn from this paper is that alternative specifications of the information set used in inferring the 'surprise' movements of a variable can make a substantive difference in the estimation of rational expectations models. In the context of consumption, our specific result is that an alternative specification of consumers' information sets, which endows them with 'less' information than is customary, performs at least as well (in terms of fit) as the stronger Perfect Information specification. In addition, this Imperfect Information assumption is able to resolve the excess sensitivity puzzle found in other studies. In particular, when we impose the weaker informational restrictions, we obtain a point estimate of the sensitivity parameter which is justifiable in the presence of plausible interest rates and horizons, although a large standard error makes it difficult to do precise inference. In this sense, our results can be interpreted as being favorable to the Permanent Income Hypothesis, although some kind of rule-of-thumb behavior appears to be marginally relevant.

There are a number of reasons for caution in interpreting our results. First, the power of the covariances of food consumption with income to distinguish these competing hypotheses is very low. While the Imperfect Information versions appear to perform slightly better than the Perfect Information versions, the improvement in fit is marginal.
It is therefore entirely plausible that the Perfect Information model is really the appropriate specification of consumer behavior, but that consumers simply respond with excess vigor to income shocks; there is not enough information in our data to reject one hypothesis in favor of the other. It may, however, be possible to extend our analysis to include additional 'indicator' variables, such as hours worked, asset income, or saving, which would yield information on households' expectations of future income, and by so doing, improve our ability to discern one informational hypothesis from the other. This remains a topic for future work.

A second caveat follows from the observation that the $\chi^2$ statistics indicate that all of the consumption models we try fit the data rather poorly. Even the stationarity restrictions we impose at the outset fail spectacularly. While conclusions drawn from these misspecified models must be treated with caution, such models can serve as useful approximations for divining the structure in the data.

Finally, the problem of measurement error is one we were unable to properly address; although we made an attempt to discard outliers and data points to which major assignments were made, a proper measurement-error correction would require more degrees of freedom than we have at our disposal, using only income and consumption data.

Despite these caveats, we believe that the results presented here are a first step towards resolving some of the outstanding questions in the study of household-level consumer behavior.

A Appendix: The Nested Model

The following set of equations defines the mapping, $g$, from the parameters of the structural model of consumption and income to the covariances of $\Delta c$ with $\Delta y$, and the autocovariances of $\Delta y$.
If lifetime income is a random walk, while transitory income is described by a stationary AR(1), then the autocovariances of $\Delta y$ are:

$$E(\Delta y_t^2) = \frac{2}{\phi+1}\,\sigma_\nu^2 + \sigma_\varepsilon^2$$

$$E(\Delta y_t \Delta y_{t-1}) = \frac{\phi-1}{\phi+1}\,\sigma_\nu^2$$

$$E(\Delta y_t \Delta y_{t-2}) = \phi\,\frac{\phi-1}{\phi+1}\,\sigma_\nu^2$$

$$E(\Delta y_t \Delta y_{t-s}) = \phi^{s-1}\,\frac{\phi-1}{\phi+1}\,\sigma_\nu^2.$$

If we define $k$ as the coefficient on the forecast error in Equation 15,

$$k = \frac{1+b}{1-\phi} - \frac{\phi+b}{1-\phi}\,\beta,$$

with $b$ the moving-average coefficient of the ARMA(1,1) representation of $\Delta y$ defined in the text, then the restrictions placed on the covariances of $\Delta c$ and $\Delta y$ can be written:

$$E(\Delta c_t \Delta y_{t-s}) = 0 \quad\text{for } s \ge 1$$

$$E(\Delta c_t \Delta y_t) = \alpha\gamma\left[k(1-p)\left(\sigma_\varepsilon^2 + \frac{2+b-\phi}{1+\phi b}\,\sigma_\nu^2\right) + p\left(\sigma_\varepsilon^2 + \beta\sigma_\nu^2\right)\right]$$

$$E(\Delta c_t \Delta y_{t+1}) = \alpha k(1-p)\left[(1-\gamma)\left(\sigma_\varepsilon^2 + \frac{2+b-\phi}{1+\phi b}\,\sigma_\nu^2\right) - \gamma\,\frac{(1-\phi)^2}{1+\phi b}\,\sigma_\nu^2\right] + \alpha p\left[(1-\gamma)\left(\sigma_\varepsilon^2 + \beta\sigma_\nu^2\right) - \gamma\beta(1-\phi)\,\sigma_\nu^2\right]$$

$$E(\Delta c_t \Delta y_{t+2}) = -\alpha\left(\gamma\phi + (1-\gamma)\right)\left[k(1-p)\,\frac{(1-\phi)^2}{1+\phi b} + p\beta(1-\phi)\right]\sigma_\nu^2$$

$$E(\Delta c_t \Delta y_{t+s}) = -\alpha\,\phi^{s-2}\left(\gamma\phi + (1-\gamma)\right)\left[k(1-p)\,\frac{(1-\phi)^2}{1+\phi b} + p\beta(1-\phi)\right]\sigma_\nu^2 \quad\text{for } s \ge 3.$$

References

Abowd, J.M. and D. Card (1986), "On the Covariance Structure of Earnings and Hours Changes", NBER Working Paper #1832.

Altonji, J.G. and A. Siow (1987), "Testing the Response of Consumption to Income Changes with (Noisy) Panel Data", The Quarterly Journal of Economics 102, 293-328.

Altonji, J.G., A.P. Martins and A. Siow (1987), "Dynamic Factor Models of Consumption, Hours, and Income", NBER Working Paper #2155.

Altonji, J.G., A.P. Martins and A. Siow (1988), "Using Cross-Equation Restrictions between Asset Income and Non-Asset Income to Estimate the Permanent Income Hypothesis", Northwestern University.

Campbell, J.Y. and N.G. Mankiw (1987), "Permanent Income, Current Income, and Consumption", NBER Working Paper #2436.

Chamberlain, G. (1982), "Multivariate Regression Models for Panel Data", Journal of Econometrics 18, 5-46.

Chamberlain, G. (1984), "Panel Data", in Griliches and Intriligator, eds.: Handbook of Econometrics Volume II, Amsterdam: North-Holland.

Flavin, M.A. (1981), "The Adjustment of Consumption to Changing Expectations about Future Income", Journal of Political Economy 89, 974-1009.

Friedman, M. (1957), A Theory of the Consumption Function. Princeton, NJ: Princeton University Press.

Hall, R.E. (1978), "Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evidence", Journal of Political Economy 86, 971-987.

Hall, R.E. and F. Mishkin (1982), "The Sensitivity of Consumption to Transitory Income: Estimates from Panel Data on Households", Econometrica 50, 461-481.

Hall, R.E. (1987), "Consumption", NBER Working Paper #2265.

Hayashi, F. (1987), "Tests for Liquidity Constraints: A Critical Survey", in Bewley, ed.: Advances in Econometrics, Fifth World Congress Volume II, Cambridge: Cambridge University Press.

Keynes, J.M. (1936), The General Theory of Employment, Interest, and Money. London: Macmillan.

Lucas, R.E. (1972), "Expectations and the Neutrality of Money", Journal of Economic Theory 4, 103-124.

MaCurdy, T.E. (1982), "The Use of Time Series Processes to Model the Error Structure of Earnings in a Longitudinal Data Analysis", Journal of Econometrics 18, 83-114.

Mork, K.A. and V.K. Smith (1986), "Testing the Life-Cycle Hypothesis on Panel Data Using Detailed Consumption Diaries and Income Based on Tax Records", Vanderbilt University.

Muth, J.F. (1960), "Optimal Properties of Exponentially Weighted Forecasts", Journal of the American Statistical Association 55, 299-306.

Ritter, J.A. (1988), "Endogenous Borrowing Constraints and Consumption", University of Texas.

Zeldes, S. (1985), "Consumption and Liquidity Constraints: An Empirical Investigation", Wharton School, University of Pennsylvania.