
Federal Reserve Bank of Richmond Economic Quarterly, Volume 95, Number 3, Summer 2009, Pages 235–267

Distortionary Taxation for Efficient Redistribution

Borys Grochulski

The author would like to thank Kartik Athreya, Leonardo Martinez, Sam Henly, and Ned Prescott for their helpful comments. The views expressed in this article are those of the author and do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mail: borys.grochulski@rich.frb.org.

This article uses a simple model to review the economic theory of efficient redistributive taxation. Three main results are presented.

The first is the classic competitive equilibrium efficiency result: trade in competitive markets leads to an efficient final (i.e., equilibrium) allocation of consumption among the agents in the economy. The equilibrium allocation is determined by market supply and demand forces. In our model economy, the equilibrium allocation is determined uniquely. This efficiency result, known as the First Welfare Theorem, provides a strong argument supporting the view that unobstructed competitive market forces can be relied on to determine the allocation of consumption in the economy.

One must observe, however, that the competitive market equilibrium supports one efficient allocation, i.e., competitive markets support one particular distribution of the total gains from trade that are available in the economy. There are an infinite number of ways in which the total gains from trade can be efficiently divided among the agents. Thus, absent redistribution, almost all efficient divisions of the gains from trade are inconsistent with the competitive market mechanism. In other words, the competitive market mechanism guarantees efficiency but also imposes on the society one particular division of the welfare gains from trade. It is entirely possible that the agents in the economy may prefer to divide the gains from trade differently. In fact, there is no a priori reason to believe that the society’s most preferred division of the gains from trade should happen to coincide with that imposed by the market mechanism. Thus, for distributional reasons, the competitive market allocation will almost surely be suboptimal.

The second result we review describes the classic solution to the distributional problems associated with the competitive market mechanism: wealth transfers. If society prefers a different division of the gains from trade than the one brought about by competitive market forces, it is sufficient to transfer wealth among the agents in order to correct it. Such wealth transfers can be implemented via simple lump-sum transfers and taxes levied by the government. This result, known as the Second Welfare Theorem, however, poses strong requirements on the quality of information available to the government. Lump-sum taxes, by definition, depend only on agents’ types (and not on their actions). In order to use lump-sum taxes, thus, the government must possess sufficient information about agents’ types, on a person-by-person basis. If public information is not sufficiently detailed, the required wealth transfers and taxes cannot be applied because the government is unable to determine which agents should be taxed and which should receive a transfer. In this situation, the Second Welfare Theorem breaks down: Lump-sum taxes are insufficient to achieve any division of the gains from trade other than the one implied by the competitive market mechanism.

The third result we review concerns the problem of efficient redistribution of the total gains from trade in the case of incomplete public information. Here, the inefficacy of lump-sum taxes creates a role for distortionary taxation. A tax is called distortionary if the amount due from an agent depends on his actions.
If an activity is subject to a distortionary tax, then by avoiding the activity the agent can avoid the tax, which distorts his incentive to engage in this activity. The ability to influence agents’ incentives is exactly what makes distortionary taxes useful. The tax-imposed distortions can be designed to offset the distortions resulting from incomplete information. Such corrective distortions, clearly, cannot be generated by lump-sum taxes, which are nondistortionary.

The third main result we review in this article is a version of the Second Welfare Theorem modified to include distortionary taxes. Within our model economy, we fully characterize a distortionary tax system sufficient to achieve any efficient division of the gains from trade available in our economy when public information is incomplete. This tax system consists of a lump-sum-funded subsidy to sufficiently large capital trades. Depending on which among the infinite number of efficient divisions of the gains from trade is to be implemented, the subsidy can go either to those who sell or to those who buy capital in the competitive market.

The model economy is a two-period, deterministic Lucas-tree economy in which income comes from a stock of productive capital. In each period, one unit of the capital stock produces $y$ units of a single, perishable consumption good. The capital stock is fixed, i.e., no physical investment is possible. The size of the economy’s total capital stock is normalized to unity. Agents, who own the capital stock in equal shares, are heterogeneous with respect to their preference for early versus late consumption. In particular, there are two types of agents in this economy: the patient ones, whose marginal utility from consuming in the first period is relatively low, and the impatient ones, whose marginal utility from consuming in the first period is relatively high.
Efficient divisions of the welfare gains from trade are represented by Pareto-efficient allocations of consumption. For the model economy we consider, Pareto-efficient allocations of consumption are characterized in Grochulski (2008). This characterization describes the full set of possibilities for feasible and efficient division of the total gains from trade among the patient and impatient agents, both in the case of complete public information and in the case of agents’ private knowledge of their impatience type. Having this description in hand, we can consider, in the present article, the question of how any given such division can be implemented in a competitive market economy. In particular, we focus on the role the government has in supporting the socially preferred division of total gains from trade through redistributive taxation.

The question of efficient redistribution and its implementation through taxation has long been studied in economics. The definitive treatment of the classical theory of efficiency and distributional properties of competitive markets under full information and with no externalities is given in Debreu (1959).[1] The first two of the three main results we review in this article are simple special cases of the welfare theorems provided in Debreu (1959).[2] In a seminal paper, Mirrlees (1971) takes on the same question while explicitly recognizing that government information may be incomplete. The third main result we present is a version of the optimal distortionary taxation result of Mirrlees (1971). Our model economy differs from that of Mirrlees (1971) in that ours is a pure-capital-income, general equilibrium economy while in the one studied in Mirrlees (1971) all income comes from labor. Mathematically, however, our model economy is a simplified version of the model studied in Mirrlees (1971).
Stiglitz (1987) reviews the literature on optimal redistributive taxation in economies with private information.[3] Kocherlakota (2007) surveys a related literature on tax systems implementing optimal social insurance in dynamic economies with ex-post private information.

The body of this article is organized as follows. Section 1 describes in detail the economy we study. Section 2 defines competitive equilibrium. Section 3 demonstrates the efficiency of competitive equilibrium indirectly as well as through direct computation. Section 4 shows the sufficiency of lump-sum taxes for efficient redistribution under full information. Also, it demonstrates the inefficacy of lump-sum taxes when agents’ types are not observed by the government. Section 5 defines a general class of distortionary tax systems. There, also, it is shown that a simple tax system with a proportional distortionary tax on capital income is incapable of providing any redistribution. Section 6 is devoted to the study of an optimal distortionary tax system in which capital taxes are nonlinear. Section 7 discusses alternative optimal tax systems. Section 8 concludes.

[1] Pigou (1932) initiated the by now extensive, and still actively developing, literature on corrective distortionary taxation in economies with externalities.

[2] Chapter 16 of Mas-Colell, Whinston, and Green (1995) contains an excellent textbook treatment of these results.

[3] Werning (2007) is among recent contributions to this literature.

1. A SIMPLE PURE CAPITAL INCOME ECONOMY

In this article, we will study a private-ownership version of the economic environment also studied in a companion article, Grochulski (2008, referenced hereafter as G08 for short). The economy is populated by a unit mass of agents who live for two periods, $t = 1, 2$.
There is a single consumption good each period, $c_t$, and agents’ preferences over consumption pairs $(c_1, c_2)$ are represented by the utility function

$$\theta u(c_1) + \beta u(c_2), \tag{1}$$

where $\beta$ is a common-to-all discount factor, and $\theta$ is an agent-specific preference parameter. Agents are heterogeneous in their relative preference for consumption at date 1. We assume a two-point support for the population distribution of the impatience parameter, $\theta$. Agents, therefore, are of two types. A fraction, $\mu_H$, of the agents are impatient with a strong preference for consuming in period 1. Denote by $H$ the value of the preference parameter, $\theta$, that represents preferences of the impatient agents. A fraction, $\mu_L = 1 - \mu_H$, are agents of the patient type. Their value of the impatience parameter, $\theta$, denoted by $L$, satisfies $L < H$.

The production side of the economy is represented by the so-called Lucas tree. Each agent is endowed with one unit of productive capital stock (the tree). Each period, one unit of the capital stock produces $y$ units of the consumption good (the fruit of the tree). Given that the total mass of agents is normalized to unity and each agent is endowed with one tree, the aggregate amount of the consumption good available in this economy in each of the two periods is $Y = y$. The consumption good is perishable: it cannot be stored from period 1 to 2. The size of the capital stock, i.e., the number of trees, is fixed: Capital does not depreciate nor can it be accumulated.

Note that there is no uncertainty in this economy. In particular, agents’ impatience parameter, $\theta$, is nonstochastic. The production side of the economy is deterministic as well.

For simplicity and clarity of exposition, as in G08, we will focus our attention on a particular set of values for the preference and technology parameters. In particular, we take

$$u(\cdot) = \log(\cdot), \quad \beta = \tfrac{1}{2}, \quad H = \tfrac{5}{2}, \quad L = \tfrac{1}{2}, \quad \mu_H = \mu_L = \tfrac{1}{2}, \quad y = 1. \tag{2}$$

Roughly, the model period is thought of as being 25 years. The value of the discount factor, $\beta$, of $\tfrac{1}{2}$ corresponds to an annualized discount factor of about 0.973. The fractions of the two patience types are equal; preferences are logarithmic. The per-period product of the capital stock, $y = Y$, is normalized to 1.

An allocation in this economy is a description of how the total output (i.e., the total capital income, $Y$) is distributed among the agents each period. An allocation, therefore, is given by $c = (c_{1H}, c_{1L}, c_{2H}, c_{2L})$, where $c_{t\theta} \ge 0$ denotes the amount of the consumption good in period $t$ assigned to each agent of type $\theta$. To be resource-feasible, allocations must satisfy

$$\sum_{\theta = H, L} \mu_\theta c_{t\theta} \le Y, \tag{3}$$

for $t = 1, 2$, i.e., the aggregate consumption must not exceed the aggregate output.[4]

Given the utility functions (1), an allocation, $c$, gives total utility (or welfare), $\theta u(c_{1\theta}) + \beta u(c_{2\theta})$, to each agent of type $\theta = H, L$. For any $\alpha \in [0, 1]$, the social welfare function is a weighted average of the utilities of the two types of agents:

$$\alpha \left[ H u(c_{1H}) + \beta u(c_{2H}) \right] + (1 - \alpha) \left[ L u(c_{1L}) + \beta u(c_{2L}) \right], \tag{4}$$

where $\alpha$ represents the absolute weight the society attaches to the welfare of the agents of type $H$. Let $\gamma = \alpha / (1 - \alpha)$ denote the relative weight of the agents of type $H$.

An allocation is Pareto efficient if there does not exist a feasible re-allocation that some agents would desire and no agents would oppose. In this sense, Pareto-efficient allocations represent all divisions of the total gains from trade that can be attained in the economy.

As discussed in G08, one can find all Pareto-efficient allocations by solving, for each $\gamma \in [0, +\infty]$, the problem of maximization of the social welfare function (4) subject to feasibility constraints. If all information in the economy is public, these feasibility constraints are simply the resource feasibility constraints (3).
The allocation, $c$, attaining the maximum of the social welfare function (4) for a given value of the relative weight, $\gamma$, is called a First Best Pareto optimum, and is denoted by $c^*(\gamma)$. By adjusting $\gamma$ between 0 and $+\infty$, we can trace out the set of all First Best Pareto optima in this economy. This set is depicted in Figure 1 of G08.

[4] Note that this constraint is independent of how the aggregate output is initially allocated to the agents.

The assumption of complete public information may be too strong. In particular, the government may be unable to observe agents’ preferences. For this reason, we will consider the assumption that each agent’s impatience parameter, $\theta$, is known only to the agent himself and not to anybody else in the economy. This incompleteness of public information imposes additional restrictions on the feasible re-allocations that can be implemented in the economy. As discussed in G08, these restrictions take the form of the so-called incentive compatibility constraints, which are given by

$$H u(c_{1H}) + \beta u(c_{2H}) \ge H u(c_{1L}) + \beta u(c_{2L}) \tag{5}$$

and

$$L u(c_{1L}) + \beta u(c_{2L}) \ge L u(c_{1H}) + \beta u(c_{2H}). \tag{6}$$

Suppose the government presents the agents with an allocation, $c$, and asks them to reveal their impatience parameter. If $c$ satisfies these constraints, the agents will have no incentive to misrepresent their true type.

In the economy with private information, all Pareto-efficient allocations can be found by maximizing, again for each $\gamma \in [0, +\infty]$, the social welfare function (4) subject to resource feasibility constraints (3) and the incentive compatibility constraints (5) and (6). The allocation, $c$, attaining the maximum in this problem for a given value of $\gamma$ is denoted by $c^{**}(\gamma)$ and often called a constrained-Pareto or Second Best Pareto optimum.
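To make the two planning problems concrete, the following Python sketch evaluates a First Best allocation for the parameter values in (2) and then checks the incentive compatibility constraints (5) and (6) at that allocation. It is an illustration under stated assumptions rather than part of the original analysis: the closed form in `first_best` is derived here from the first-order conditions of maximizing (4) subject to (3) with logarithmic utility and equal population shares (it is not quoted from G08), it covers only interior weights $0 < \gamma < \infty$, and all function names are hypothetical.

```python
import math

# Parameter values from (2); mu_H = mu_L = MU and Y = y = 1.
BETA, H, L, MU, Y = 0.5, 2.5, 0.5, 0.5, 1.0


def first_best(gamma):
    """Return (c1H, c1L, c2H, c2L) maximizing (4) subject to (3).

    Assumption: with u = log and equal shares, the first-order conditions
    give c1H / c1L = gamma * H / L and c2H / c2L = gamma, and the resource
    constraints (3) hold with equality.
    """
    total = Y / MU  # c_tH + c_tL implied by (3) at equality
    c1H = total * gamma * H / (gamma * H + L)
    c1L = total * L / (gamma * H + L)
    c2H = total * gamma / (1.0 + gamma)
    c2L = total / (1.0 + gamma)
    return c1H, c1L, c2H, c2L


def utility(theta, c1, c2):
    """Lifetime utility (1) with logarithmic u."""
    return theta * math.log(c1) + BETA * math.log(c2)


def incentive_compatible(c):
    """Return the pair (does (5) hold, does (6) hold) at allocation c."""
    c1H, c1L, c2H, c2L = c
    ic_H = utility(H, c1H, c2H) >= utility(H, c1L, c2L)  # constraint (5)
    ic_L = utility(L, c1L, c2L) >= utility(L, c1H, c2H)  # constraint (6)
    return ic_H, ic_L
```

Under these assumptions, `incentive_compatible(first_best(1.0))` evaluates to `(True, False)`: at the utilitarian First Best the patient type would rather claim the bundle intended for the impatient type, which is why private information restricts the set of attainable allocations.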
This name reflects the fact that $c^{**}(\gamma)$ is efficient in a more narrow sense than the corresponding $c^*(\gamma)$, as $c^{**}(\gamma)$ is constrained by private information while $c^*(\gamma)$ is not. The set of all Second Best Pareto optima for this economy is depicted in Figure 4 of G08.

G08 provides a fairly detailed characterization of the sets of First and Second Best Pareto-optimal allocations. In the present article, we will examine the relation between Pareto-optimal allocations and market equilibrium allocations. We begin by describing the competitive market mechanism and its equilibrium.

2. COMPETITIVE CAPITAL MARKET EQUILIBRIUM

In this article, we study a private-ownership economy in which all agents are initially endowed with one unit of productive capital. Relative to this initial allocation, clearly, there are gains from trade to be exploited (i.e., the initial allocation is not a Pareto optimum). When income generated by the capital stock (i.e., the dividend) is realized in the first period, all agents have the same amount of the consumption good in hand ($y$ units), and the same amount of consumption they will receive in the next period ($y$ units again), but not the same desire to consume now versus next period. Thus, it is natural for them to trade consumption in hand today for capital, i.e., for the dividends, that will be received tomorrow. The relatively impatient agents, i.e., those whose preference type is $\theta = H$, can sell some of their capital to the more patient agents of type $\theta = L$ in return for current consumption. This can be done for the mutual benefit of the two types of agents because their preferences differ.

The terms of this mutually beneficial trade, which will determine the final division of the welfare gains from trade, can depend on many factors. How many units of consumption in the first period will a patient agent be willing to pay for a unit of capital being sold by the impatient agent?
Given the economic environment, a reasonable answer to this question is: the market price. In this environment, we have a large number of sellers of capital (mass $\mu_H$ to be exact) and a large number of buyers (mass $\mu_L$). Also, we do not assume that buyers or sellers face any technological barriers to trading like significant costs of shopping around, communicating, or negotiating with potential trade counterparties. Therefore, no rational agent will trade with a counterparty unless he is confident that he cannot obtain more favorable terms of trade by continuing to shop around. The competitive market price of capital represents the terms of trade that give this confidence to a rational agent. It is reasonable to expect that a competitive market for capital will emerge in this environment.

Let us therefore consider the standard formal model of the competitive market mechanism. After agents collect dividends in period 1, they choose the quantity, $c_1$, that they consume now, the quantity, $a$, of capital they purchase or sell at the market price, $q$, and the quantity they will be consuming in the second period, $c_2$. Their initial endowment of capital and its price, $q$, determine the set of consumption pairs $(c_1, c_2)$ that are affordable. Formally, agents of type $\theta = H, L$ choose $c_1$, $a$, and $c_2$ so as to solve the following individual utility maximization problem:

$$\max_{c_1 \ge 0,\, c_2 \ge 0,\, a} \; \theta u(c_1) + \beta u(c_2),$$

subject to the budget constraints

$$c_1 + qa \le y, \tag{7}$$

$$c_2 \le (1 + a) y. \tag{8}$$

Note that the non-negativity requirement for consumption at the second date implies that $a \ge -1$, i.e., no agent can sell more capital than the one unit he owns.

Let $c^D_{t\theta}(q)$ for $t = 1, 2$ and $a^D_\theta(q)$ for $\theta = H, L$ denote the agents’ demand functions for consumption and capital, respectively, i.e., the solutions to the above individual optimization problem for any given price of capital, $q$.
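The demand functions just defined can be evaluated numerically for any given price. The Python sketch below does so for the parameter values in (2); it is an illustration under stated assumptions, not the method used in the article. It assumes that both budget constraints bind (utility is strictly increasing), so that the choice reduces to picking $c_1$, and it uses a ternary search, which is valid here because the resulting objective is strictly concave in $c_1$. The function name `demand` and the tolerance are hypothetical.

```python
import math

BETA, Y = 0.5, 1.0  # discount factor and dividend from (2)


def demand(theta, q, tol=1e-12):
    """Numerically solve problem (7)-(8) for a capital price q > 0.

    Returns (c1, a, c2). Assumes (7) and (8) hold with equality, so c2 is
    pinned down by c1 and the search is over c1 in (0, Y + q).
    """
    def objective(c1):
        a = (Y - c1) / q        # capital bought, from (7) at equality
        c2 = (1.0 + a) * Y      # period-2 consumption, from (8) at equality
        return theta * math.log(c1) + BETA * math.log(c2)

    lo, hi = 0.0, Y + q
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        # Keep the two thirds of the interval that must contain the maximum.
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    c1 = 0.5 * (lo + hi)
    a = (Y - c1) / q
    return c1, a, (1.0 + a) * Y
```

Calling `demand(2.5, q)` and `demand(0.5, q)` over a range of prices traces out the demand curves of the impatient and the patient type, respectively; a negative value of `a` means the agent sells capital.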
Definition 1 Competitive market equilibrium consists of a consumption allocation, $\hat c = (\hat c_{1H}, \hat c_{1L}, \hat c_{2H}, \hat c_{2L})$; capital trades, $\hat a = (\hat a_H, \hat a_L)$; and a capital price, $\hat q$, such that

(i) agents optimize, i.e., the equilibrium allocation maximizes agents’ utility given the equilibrium price, $\hat q$:

$$\hat c_{t\theta} = c^D_{t\theta}(\hat q), \qquad \hat a_\theta = a^D_\theta(\hat q),$$

for $t = 1, 2$ and $\theta = H, L$;

(ii) the capital market clears:

$$\sum_{\theta = H, L} \mu_\theta \hat a_\theta = 0. \tag{9}$$

Note that the budget constraints and the capital market clearing condition imply that the equilibrium allocation of consumption is resource-feasible, i.e., the sum of all agents’ consumption in every period does not exceed the total amount of output, $Y$:

$$\sum_{\theta = H, L} \mu_\theta \hat c_{t\theta} \le Y \tag{10}$$

for $t = 1, 2$.

3. EFFICIENCY OF CAPITAL MARKET EQUILIBRIUM

Suppose there is no government intervention and agents trade freely. As a result, each agent obtains some final allocation of consumption. As we discussed in the previous section, we expect this allocation to be a competitive equilibrium allocation, $\hat c$. The following is the classic competitive market optimality result.

Theorem 1 Let $(\hat c, \hat a, \hat q)$ be a competitive capital market equilibrium. Then, the equilibrium allocation of consumption, $\hat c$, is Pareto optimal.

Recall that an allocation is Pareto optimal (or Pareto efficient) if it is feasible and not Pareto dominated by another feasible allocation. An allocation $x$ Pareto dominates an allocation $z$ if all agents in the economy prefer $x$ over $z$, and at least one agent in the economy prefers $x$ over $z$ strictly.[5] Clearly, a Pareto-dominated allocation is a waste. If all agents can be made better off including at least one agent strictly, it would be a waste to not exploit this opportunity. The above theorem tells us that competitive equilibrium allocation is free of this failure.
This important result, which holds much more generally than just in our simple capital market model, is often called the First Welfare Theorem.

A General Proof of the First Welfare Theorem

In this subsection, let us present a general, standard argument behind the First Welfare Theorem.[6] We will note that this argument is an indirect one.

[5] G08 provides additional discussion of Pareto dominance and feasibility with full and partial public information.

[6] See also Mas-Colell, Whinston, and Green (1995) for an excellent textbook treatment of this result.

We begin with the following simple implication of agents’ utility maximization: In equilibrium, it must be the case that

$$\hat c_{1\theta} + \hat c_{2\theta} \hat q / y = y + \hat q. \tag{11}$$

To see this, note first that when agents optimize, their budget constraints will be satisfied as equalities because utility is strictly increasing in consumption. Then, eliminate $\hat a_\theta$ from (7) and (8) to obtain (11).

Equation (11) represents the fact that in equilibrium agents will not waste personal wealth. The right-hand side of (11) represents the equilibrium value of each agent’s initial endowment of capital in terms of the units of first-period consumption. In period 1, an agent can collect dividend $y$ and then sell all his capital endowment for $\hat q$. Thus, his total wealth is $y + \hat q$. The left-hand side of (11) represents the cost of the consumption allocation, $\hat c_\theta$. In equilibrium, $\hat q / y$ is the price of one unit of $c_2$ in terms of the units of $c_1$: It takes $1/y$ units of capital to obtain one unit of consumption in period 2, and $\hat q$ units of consumption in period 1 to obtain one unit of capital. Effectively, it takes $\hat q / y$ units of consumption $c_1$ to obtain one unit of consumption $c_2$. Thus, $\hat c_{1\theta} + \hat c_{2\theta} \hat q / y$ is the cost of the consumption pair $\hat c_\theta = (\hat c_{1\theta}, \hat c_{2\theta})$.

Now suppose that a feasible allocation $\bar c$ Pareto dominates the equilibrium allocation $\hat c$.
This means that agents of at least one type strictly prefer $\bar c_\theta$ over $\hat c_\theta$, and both types prefer $\bar c_\theta$ over $\hat c_\theta$ at least weakly. Because utility is strictly increasing in consumption, $\bar c_\theta$ must be strictly unaffordable to all those who strictly prefer it and at best just affordable to those who weakly prefer it (which is everybody).[7] Thus, for both $\theta$,

$$\bar c_{1\theta} + \bar c_{2\theta} \hat q / y \ge y + \hat q \tag{12}$$

with at least one of these two inequalities being strict. Multiplying this inequality for type $\theta$ by $\mu_\theta$ and adding over $\theta$, we obtain

$$\sum_{\theta = H, L} \mu_\theta \bar c_{1\theta} + \sum_{\theta = H, L} \mu_\theta \bar c_{2\theta} \hat q / y > \sum_{\theta = H, L} \mu_\theta (y + \hat q),$$

where the inequality is strict because at least one of the inequalities in (12) is strict. Since $\bar c$ is feasible, it must satisfy the resource constraints (10). Using these, we obtain from the above that

$$Y + Y \hat q / y > y + \hat q.$$

Substituting $Y = y$ we get

$$Y + \hat q > Y + \hat q,$$

which is a contradiction. Thus, a feasible allocation, $\bar c$, that Pareto dominates the equilibrium allocation, $\hat c$, cannot exist.

[7] Note that this argument relies only on the strict monotonicity of preferences (and, in fact, could rely only on local nonsatiation; see Mas-Colell, Whinston, and Green [1995], Section 16.C). In our model, we could actually make a stronger argument based on the strict convexity of preferences. Namely, since agents’ preferences are strictly increasing and strictly convex, $\hat c_\theta$ is a unique maximizer of utility in the budget set. Thus, $\bar c_\theta$ must be strictly unaffordable even to the type that is indifferent between $\bar c_\theta$ and $\hat c_\theta$.

A Direct Proof of the First Welfare Theorem

The First Welfare Theorem tells us that any equilibrium allocation is Pareto optimal and nothing more. In particular, the general, indirect proof of the First Welfare Theorem tells us nothing about which among the infinitely many Pareto-optimal allocations the equilibrium allocation $\hat c$ coincides with.
This question, which strictly speaking is outside the scope of the First Welfare Theorem, may be of independent interest.

In the specific environment that we consider in this article, we can give a direct proof of the First Welfare Theorem. Namely, we can compute the set of competitive equilibrium allocations and compare it against the set of all Pareto-optimal allocations. In this way, we will be able to tell exactly which Pareto optima can be implemented as competitive equilibria.

Solving the individual utility maximization problem

We begin by deriving the agents’ capital demand functions. Similar to (11), we can rewrite the agents’ budget constraints as an equality of the present value of consumption and wealth:

$$c_1 + c_2 q / y = y + q. \tag{13}$$

In this form, it is easy to see the agents’ utility maximization problem has a linear budget set and a strictly concave objective function. Thus, at each price, $q$, it has a unique solution, which we can compute from the budget constraint (13) and the Euler equation[8]

$$\theta u'(c_1) q / y = \beta u'(c_2). \tag{14}$$

For each $\theta$, we solve these two equations to obtain consumption demand functions $c^D_{t\theta}(q)$ for $t = 1, 2$. Using the parameter values in (2), these solutions are

$$c^D_{1\theta}(q) = 2\theta \, \frac{1 + q}{1 + 2\theta}, \qquad c^D_{2\theta}(q) = \frac{1}{q} \, \frac{1 + q}{1 + 2\theta}.$$

From (8) evaluated at equality, we obtain that type $\theta$’s capital demand function is

$$a^D_\theta(q) = \frac{1}{q} \, \frac{1 + q}{1 + 2\theta} - 1.$$

[8] Perhaps the simplest way to obtain the Euler equation (14) is to express the utility maximization problem as $\max_{c_2} \theta u(y + q - c_2 q / y) + \beta u(c_2)$ and take the first-order condition.

Solving for equilibrium price of capital and allocation

Substituting the capital demand functions into the capital market clearing condition (9) and solving for the price that clears this market, we obtain an equilibrium price $\hat q = \tfrac{1}{2}$.
It is easy to see that there are no other prices that clear this market, i.e., the competitive equilibrium is unique in our model.[9] We can now compute equilibrium capital trades:

$$\hat a_L = a^D_L\left(\tfrac{1}{2}\right) = \tfrac{1}{2}, \qquad \hat a_H = a^D_H\left(\tfrac{1}{2}\right) = -\tfrac{1}{2},$$

and the equilibrium allocation of consumption:

$$\hat c_L = (\hat c_{1L}, \hat c_{2L}) = \left(\tfrac{3}{4}, \tfrac{3}{2}\right), \tag{15}$$

$$\hat c_H = (\hat c_{1H}, \hat c_{2H}) = \left(\tfrac{5}{4}, \tfrac{1}{2}\right). \tag{16}$$

[9] Expressed as a function of gross return on capital investment, $R = 1/q$, rather than the price of capital, $q$, agents’ demand for capital is linear in $R$. Namely, $a^D_\theta(R) = \frac{R + 1}{1 + 2\theta} - 1$. Thus, for any two numbers, $\theta$, there can be at most one solution to the capital market clearing condition, so equilibrium is unique.

Figure 1 depicts the agents’ budget constraint at the equilibrium capital price, $\hat q = \tfrac{1}{2}$; equilibrium consumption pairs (15) and (16); and one indifference curve for each type of agent. Clearly, both agent types face the same budget constraint. Since their preferences differ, so do their choices. The indifference curve depicted for each type $\theta$ represents the highest level of utility that each type attains within the budget constraint. Note also that point (1,1) in Figure 1 represents the consumption bundle that agents get if they do not trade. In equilibrium, the impatient agent gives up $\tfrac{1}{2}$ unit of $c_2$ in exchange for an additional $\tfrac{1}{4}$ unit of $c_1$. The patient agent, of course, takes the opposite end of this trade.

[Figure 1: Agents’ Problems’ Solutions at Competitive Equilibrium. Horizontal axis: $c_1$; vertical axis: $c_2$.]

Confronting equilibrium with the set of Pareto-optimal allocations

Our first observation here is that the competitive capital market mechanism delivers a unique equilibrium allocation of consumption. We observe next that, as shown in detail in G08, there is a continuum of Pareto-optimal allocations in our environment.[10] From these two observations, we immediately see that almost all Pareto-optimal allocations are incompatible with competitive equilibrium.
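The equilibrium computed in this section can be reproduced in a few lines. The Python sketch below uses exact rational arithmetic and the closed-form demand functions reported above for the parameter values in (2); the rearrangement of the market clearing condition (9) into an explicit formula for the price is worked out here and stated in the code comments, and the names `demands` and `clearing_price` are hypothetical.

```python
from fractions import Fraction as F

# Types and population shares from (2); beta = 1/2 and y = 1 are already
# built into the closed-form demand functions used below.
THETA = {"H": F(5, 2), "L": F(1, 2)}
MU = {"H": F(1, 2), "L": F(1, 2)}


def demands(theta, q):
    """Closed-form demands (c1, c2, a) of type theta at capital price q."""
    k = (1 + q) / (1 + 2 * theta)
    c1 = 2 * theta * k
    c2 = k / q
    return c1, c2, c2 - 1  # a = c2 / y - 1 with y = 1, from (8) at equality


def clearing_price():
    """Price solving the capital market clearing condition (9).

    Substituting the capital demands into (9) gives ((1 + q) / q) * s = 1,
    where s = sum over theta of mu_theta / (1 + 2 * theta),
    hence q = s / (1 - s).
    """
    s = sum(MU[k] / (1 + 2 * THETA[k]) for k in THETA)
    return s / (1 - s)


q_hat = clearing_price()
allocation = {k: demands(THETA[k], q_hat) for k in THETA}
excess_capital = sum(MU[k] * allocation[k][2] for k in THETA)
```

Because `clearing_price` returns the single solution of a linear equation in $1/q$, the sketch also reflects the uniqueness of the equilibrium price noted above.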
Which one among the continuum is the Pareto optimum consistent with competitive equilibrium? Formulas (8)–(11) in G08 describe the set of all First Best Pareto optima indexed by parameter $\gamma \in [0, \infty]$ representing the relative welfare weight assigned to the impatient agents in the social objective function. If society favors neither of the two types of agents, the welfare weight given to both types is the same, i.e., the relative weight of the impatient type, $\gamma$, is 1. Thus, $\gamma = 1$ represents the so-called utilitarian Pareto optimum. By $\gamma^{CE}$ let us denote the value of the index, $\gamma$, associated with the optimum that is selected by the market mechanism in competitive equilibrium. From formulas (8)–(11) in G08, we obtain immediately that the unique competitive allocation, $\hat c$, given in (15) and (16) is the Pareto optimum corresponding to $\gamma = 1/3$, i.e., $\gamma^{CE} = 1/3$ in our economy. Thus, competitive equilibrium is optimal if and only if society values welfare of the patient type, $L$, three times as much as it values welfare of the impatient type, $H$.

[10] Multiplicity of Pareto-optimal allocations is typical in environments with heterogeneous agents.

Why this Pareto Optimum?

As we have seen, competitive capital market equilibrium implements a rather particular Pareto optimum. Intuitively, we see that the competitive capital market selects a Pareto optimum that “favors” the patient agents. With a large mass of agents whose desire for consumption in the first period is very strong relative to the population average ($H$ exceeds the average $\theta$ by 66 percent), the market is “flooded” with capital, which becomes very affordable to the patient agents.[11] As the impatient agents compete for first-period consumption, the patient agents end up receiving two units of $c_2$ for each unit of $c_1$ in equilibrium.
That rate of exchange is optimal only if society on the whole cares for the welfare of the patient agents more than it cares for the welfare of the impatient consumers.[12]

In conclusion, the competitive market mechanism does two things: it allows the agents to obtain welfare gains from trade, and it also divides these gains among the agents in a particular way. It is entirely possible that society might desire a different division of the welfare gains than the one built into the competitive market allocation mechanism. This problem creates a role for redistributive government policy.

In the remainder of this article, we consider how the government can supplement the competitive market mechanism with a tax system that preserves efficiency but implements other divisions of the welfare gains from trade. In the next section, we consider the situation in which the government has full information on each agent’s preference type and, therefore, can transfer wealth from one type to another as a lump sum. Subsequently, we consider private information, which makes lump-sum wealth transfers infeasible and creates a role for distortionary redistributive taxation.

[11] To see this point more clearly, note that the preferences of the patient type can be alternatively represented by $\log(c_1) + \log(c_2)$ and $\hat q = \tfrac{1}{2}$.

[12] As a simple thought experiment, consider the question of how the competitive equilibrium selection from the Pareto set changes when the relative impatience of the two types of agents changes. In particular, suppose that $L$ is not necessarily $\tfrac{1}{2}$ but can be any real number smaller or equal to $H$. Let $\gamma^{CE}(L)$ denote the index of the Pareto optimum that is implemented by the competitive equilibrium as $L$ is adjusted between 0 and $\tfrac{5}{2}$. It is easy to show that $\gamma^{CE}(L) = (2L + 1)/6$. Thus, competitive equilibrium selects the utilitarian Pareto optimum only if all agents are identical (no trade is optimal in this case). When the impatient agents become very impatient relative to the patient ones, i.e., when $L$ approaches 0, we have that $\gamma^{CE}(0) = \tfrac{1}{6}$, i.e., competitive equilibrium selects the Pareto optimum that would be selected by a society that values welfare of the patient type $L$ six times as much as it values welfare of the impatient type $H$.

4. COMPETITIVE EQUILIBRIUM WITH LUMP-SUM TAXES

We begin by extending the definition of competitive equilibrium (i.e., Definition 1) to allow for lump-sum wealth transfers. A tax on an agent is lump-sum if the amount due is independent of any choices made by this agent. For example, a labor income tax is not lump-sum because the amount due increases with the number of hours the agent chooses to work. Taxes under which the amount due does depend on the taxpayers’ choices are called distortionary.

In our economy, agents choose consumption in periods 1 and 2 and their capital holdings in period 2. Thus, lump-sum taxes must not depend on consumption or capital holdings. Agents’ impatience type $\theta$, however, is not their choice. If the government can observe each agent’s type $\theta$, a lump-sum tax can depend on $\theta$.

In this section, we assume that each agent’s preference type $\theta$ is freely and publicly observable. In particular, the government sees every agent’s type and therefore can impose different lump-sum taxes on the agents of different types. In this setting, a lump-sum tax system consists of two real numbers: $T_H$ and $T_L$, where a negative value of $T_\theta$ means a transfer from the government to the agent of type $\theta$. Under these taxes, the budget constraints of the agents of type $\theta = H, L$ are

$$c_1 + aq \le y - T_\theta, \qquad c_2 \le (1 + a) y,$$

where $q$, as before, is the ex-dividend price of capital in the first period.
Note that the lump-sum taxes $T_\theta$ are levied in period 1 and denominated in the units of consumption at that date. It is entirely possible to levy lump-sum taxes at both dates, but it is easy to see that doing this would not be useful. Treating the budget constraints as equalities and eliminating $a$, we can express the budget constraint in present-value form as follows:
$$c_1 + c_2\, q/y = y + q - T_\theta. \tag{17}$$
From here we see that any lump-sum tax at the second date can be lumped into $T_\theta$.

Competitive equilibrium with lump-sum taxes, $(T_\theta)_{\theta=H,L}$, is defined analogously to the tax-free competitive equilibrium of Definition 1: a price-allocation pair will be an equilibrium if agents optimize, now subject to (17), and markets clear. In addition, the government must break even in equilibrium, i.e., taxes, $(T_\theta)_{\theta=H,L}$, must satisfy the government budget constraint
$$\sum_{\theta=H,L} \mu_\theta T_\theta = 0. \tag{18}$$

Efficient Redistribution with Lump-Sum Taxation Under Full Information

We will say that a lump-sum tax system, $(T_\theta)_{\theta=H,L}$, implements a given allocation, $c$, if $c$ is a competitive equilibrium allocation under taxes, $(T_\theta)_{\theta=H,L}$. The following result is a version of the classic sufficiency result known as the Second Welfare Theorem.

Theorem 2 Every First Best Pareto optimum, $c^*$, can be implemented with a lump-sum tax system, $(T_\theta)_{\theta=H,L}$.

Under this theorem, lump-sum taxes are clearly sufficient to achieve any desired distribution of the total gains from trade available in this economy. We will now provide a proof of this theorem constructed as follows. First, we derive a set of sufficient conditions for an allocation to be an equilibrium allocation. Then, we show that for every First Best Pareto optimum $c^*(\gamma)$, $\gamma \in [0, \infty]$, lump-sum taxes, $(T_\theta)_{\theta=H,L}$, can be set so that $c^*(\gamma)$ satisfies these sufficient conditions.
In order for an allocation, $c$, to be an equilibrium allocation, there must exist a capital price, $q$, at which agents choose to consume $c$ and the capital trades associated with $c$ clear the market. First, let us identify sufficient conditions for agents' optimization at a given price, $q$. Under the present-value budget constraint (17), agents solve a strictly concave optimization problem. Thus, the individual Euler equation (14) and the budget constraint (17) are sufficient for an allocation, $c$, to be individually optimal at the price, $q$. Second, we need to check market clearing. However, as long as we implement a resource-feasible allocation, $c$, the capital purchases associated with $c$ will clear the market.

One of the properties of the First Best Pareto optima (FBPO) is that they are free of the so-called intertemporal wedges (see G08, Section 3). This means that at each FBPO $c^*(\gamma)$ the intertemporal marginal rate of substitution (IMRS) of each agent type is equal to the intertemporal marginal rate of transformation. Denote the IMRS of agent type $\theta$ evaluated at a FBPO allocation $c^*(\gamma)$ by $m^*_\theta(\gamma)$, i.e.,
$$m^*_\theta(\gamma) = \frac{\beta u'(c^*_{2\theta}(\gamma))}{\theta u'(c^*_{1\theta}(\gamma))}. \tag{19}$$
The lack of intertemporal wedges demonstrated in G08 implies that
$$m^*_H(\gamma) = m^*_L(\gamma)$$
for each $\gamma \in [0, \infty]$. This simple property is crucial for the implementation of FBPO as competitive equilibria. Let us denote the two agent types' common IMRS value by $m^*(\gamma)$. Directly from (14) we see that if the price of capital is
$$q = m^*(\gamma)y,$$
then the individual Euler equation holds for both agent types simultaneously. Let us denote this price of capital by $\hat{q}(\gamma)$ for each $\gamma \in [0, \infty]$.

All that remains to be checked is affordability, i.e., that both types' budget constraints are satisfied at the consumption allocation, $c^*(\gamma)$, and price, $\hat{q}(\gamma)$. For that, however, we have the lump-sum taxes, $T_\theta$.
In particular, we can find the lump-sum tax, $T_L$, that will make the FBPO $c^*_L(\gamma)$ affordable for agent $L$. To do that, we solve the budget constraint
$$c^*_{1L}(\gamma) + c^*_{2L}(\gamma)\,\hat{q}(\gamma)/y = y + \hat{q}(\gamma) - T_L \tag{20}$$
for $T_L$. For each $\gamma \in [0, \infty]$, we will denote this solution by $T_L(\gamma)$. Using the formulas for $c^*(\gamma)$ derived in G08, we can compute $T_L(\gamma)$ explicitly. Formulas (9) and (11) in G08 tell us that
$$c^*_{1L}(\gamma) = \frac{2}{1 + 5\gamma}, \tag{21}$$
$$c^*_{2L}(\gamma) = \frac{2}{1 + \gamma} \tag{22}$$
for any $\gamma \in [0, \infty]$. Substituting these expressions into (19), we obtain
$$m^*(\gamma) = \frac{\beta u'(c^*_{2\theta})}{\theta u'(c^*_{1\theta})} = \frac{1 + \gamma}{1 + 5\gamma}.$$
From the Euler equation (14), we therefore have that if $c^*(\gamma)$ is to be an equilibrium allocation, then the price of capital must be
$$\hat{q}(\gamma) = \frac{1 + \gamma}{1 + 5\gamma}.$$
Substituting this price and the consumption values (21) and (22) into (20), we solve for $T_L$ to obtain
$$T_L(\gamma) = -\frac{2 - 6\gamma}{1 + 5\gamma}. \tag{23}$$
From the government budget constraint (18), it is immediate that
$$T_H(\gamma) = -T_L(\gamma) = \frac{2 - 6\gamma}{1 + 5\gamma}. \tag{24}$$
It is easy to verify that with tax $T_H(\gamma)$, the FBPO $c^*_H(\gamma)$ is affordable to agent $H$ under the capital price $\hat{q}(\gamma)$. Thus, for any $\gamma \in [0, \infty]$ with taxes $T(\gamma)$, the optimal allocation $c^*(\gamma)$ satisfies sufficient conditions for equilibrium. Proof of Theorem 2 is therefore complete.

Equilibrium with lump-sum taxes, naturally, reduces to the pure competitive equilibrium of Definition 1 if the government chooses the taxes to be zero.

Figure 2 Agents' Problems' Solutions at Competitive Equilibrium with Lump-Sum Taxes Implementing the Utilitarian Optimum (horizontal axis: $c_1$; vertical axis: $c_2$)

From (23) and (24), it is easy to see that $T_H(1/3) = T_L(1/3) = 0$. Thus, zero taxes are optimal with $\gamma = 1/3$, which exactly replicates the result we obtained in the constructive proof of Theorem 1. When $\gamma = 0$, i.e., when the society puts zero weight on welfare of the type $H$, the lump-sum tax on agents $H$ is $T_H(0) = 2$, i.e., all wealth is taken away from agents $H$.
At the other extreme, $T_H(\infty) = -6/5$ (the limit of (24) as $\gamma \to \infty$), i.e., the government transfers all wealth to agents $H$ when $\gamma = \infty$.

Figure 2 depicts the solution to the agents' utility maximization problems at the equilibrium implementing the utilitarian optimal allocation (i.e., when all agents receive the same welfare weight in the social planning problem). With $\gamma = 1$, we have $T_H(1) = -2/3$ and $T_L(1) = 2/3$. The equilibrium price is $\hat{q}(1) = 1/3$. At this price, the ex-dividend value of each agent's capital in period 1 is $1/3$. The after-tax wealth of type $H$, thus, is $y + \hat{q}(1) - T_H = 2$, while that of type $L$ is $y + \hat{q}(1) - T_L = 2/3$. The budget constraint for type $L$, therefore, is
$$c_1 + c_2/3 = 2/3,$$
and the optimal choice is $\hat{c}_L = (1/3, 1)$. The $H$ type faces the budget constraint
$$c_1 + c_2/3 = 2,$$
and his optimal choice is $\hat{c}_H = (5/3, 1)$. Figure 2 depicts these budget constraints and optimal choices along with two indifference curves representing the maximal utility levels attained by each type in equilibrium with lump-sum taxes $(T_H(1), T_L(1))$. The pale, horizontal, solid line represents the lump-sum tax on the patient types, $T_L(1) = 2/3$. The pale, horizontal, dashed line represents the lump-sum transfer to the impatient type, i.e., the negative of the tax $T_H(1) = -2/3$. Under these transfers and the capital price $\hat{q}(1) = 1/3$, the two types' budget constraints are parallel, with the budget line of type $L$ lying strictly inside (closer to the origin than) the budget line of type $H$. Agents choose $\hat{a}_H = \hat{a}_L = 0$.

With complete information about agents' types, the government can freely redistribute wealth between the two types of agents. With a competitive market for capital, no further government intervention is needed for either efficiency or distributional reasons.
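The closed-form taxes (23)–(24) and the $\gamma = 1$ numbers above can be checked with a few lines of exact arithmetic. The following is only a sketch under stated assumptions: the parameter values ($y = 1$, $\beta = 1/2$, $\theta_H = 5/2$, $\theta_L = 1/2$, equal population masses, logarithmic utility) are inferred from the figures quoted in the text rather than restated in this section, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed parameter values, inferred from the numbers quoted in the text:
# y = 1, beta = 1/2, theta_H = 5/2, theta_L = 1/2, equal masses, log utility.
Y, BETA = F(1), F(1, 2)
THETA_H, THETA_L = F(5, 2), F(1, 2)
MU_H = MU_L = F(1, 2)


def first_best(gamma):
    """Type-L bundle from (21)-(22); type-H bundle from resource feasibility."""
    c_L = (F(2) / (1 + 5 * gamma), F(2) / (1 + gamma))
    c_H = tuple((Y - MU_L * c) / MU_H for c in c_L)
    return c_H, c_L


def imrs(theta, c):
    """beta*u'(c2) / (theta*u'(c1)) with u = log, i.e. beta*c1 / (theta*c2)."""
    return BETA * c[0] / (theta * c[1])


def lump_sum_taxes(gamma):
    """Price and taxes (q, T_H, T_L) that implement the optimum indexed by gamma."""
    _, c_L = first_best(gamma)
    q = imrs(THETA_L, c_L) * Y                 # q = m*(gamma) * y
    T_L = Y + q - c_L[0] - c_L[1] * q / Y      # budget constraint (20) solved for T_L
    T_H = -MU_L * T_L / MU_H                   # government budget constraint (18)
    return q, T_H, T_L


for gamma in (F(0), F(1, 3), F(1)):
    q, T_H, T_L = lump_sum_taxes(gamma)
    c_H, _ = first_best(gamma)
    affordable = c_H[0] + c_H[1] * q / Y == Y + q - T_H   # type H's version of (17)
    print(f"gamma={gamma}: q={q}, T_H={T_H}, T_L={T_L}, H affordable: {affordable}")
```

Under these assumptions, $\gamma = 1$ returns the price $1/3$ and taxes $(-2/3, 2/3)$ used in the Figure 2 discussion, and $\gamma = 1/3$ returns zero taxes. Exact fractions are used so that the affordability check is an equality rather than a floating-point comparison.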
Nondistortionary lump-sum taxes are sufficient to efficiently attain any distributional objective of the government, i.e., they implement any First Best Pareto-optimal allocation of consumption.

Inefficacy of Lump-Sum Taxes Under Incomplete Public Information

Suppose now that the government cannot directly observe agents' types. Can the government implement a wealth transfer from one type to the other when it does not see which agents are of which type? Certainly the lump-sum tax system described above cannot be used because it requires the knowledge of agents' types. Potentially, the government could elicit this information from the agents. However, if the government uses this information to simply transfer wealth, agents will not reveal their type truthfully. This is very intuitive. The larger an agent's after-tax wealth, the better off this agent will be in any competitive equilibrium. With agents themselves being the only source of information about their preference types, any lump-sum tax with $T_H \ne T_L$ will make some agents lie about their type. Clearly, if $T_H < T_L$, everybody will declare themselves to be of type $H$. If $T_L < T_H$, everybody will say they are of type $L$. Therefore, if the government sets lump-sum taxes $T_H \ne T_L$, all agents will end up paying $\min\{T_H, T_L\}$, i.e., all agents will pay the same amount. Given the government budget constraint, this amount must be zero. Thus, when agents' types are their private information, the only lump-sum taxes that the government can feasibly implement are $T_H = T_L = 0$. We see that if the government wants to (or is restricted to) use lump-sum taxes to redistribute social surplus and agents have private information, the government can redistribute nothing.
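The reporting argument can be made concrete with a very small sketch. It assumes only what the paragraph states, namely that an agent's welfare is increasing in after-tax wealth and that the tax depends on the announced type alone; the function names are hypothetical.

```python
def announced_type(T_H, T_L):
    """Report chosen by any agent when the tax depends only on the report."""
    # After-tax wealth is all that matters, so the true type plays no role.
    if T_H < T_L:
        return "H"
    if T_L < T_H:
        return "L"
    return "either"   # equal taxes: nothing to gain from misreporting


def tax_collected_per_agent(T_H, T_L):
    """Every agent ends up paying the smaller of the two taxes."""
    return min(T_H, T_L)


# The full-information taxes for gamma = 1 quoted in the text: (T_H, T_L) = (-2/3, 2/3).
print(announced_type(-2 / 3, 2 / 3), tax_collected_per_agent(-2 / 3, 2 / 3))
```

With the utilitarian full-information taxes, every agent claims to be of type $H$ and collects the transfer, so the government budget constraint fails; only $T_H = T_L = 0$ survives.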
It is worth emphasizing that the competitive equilibrium allocation remains efficient in our economy even when agents have private information about their type, i.e., the First Welfare Theorem holds in our economy with private information. An easy way to see that this indeed is the case is simply to check that the competitive equilibrium allocation, $\hat{c}$, given in (15)–(16) belongs to the set of Second Best Pareto-optimal allocations characterized in G08 (see Figure 4 in particular). In fact, this is quite intuitive. Private information does not interfere with the price mechanism in our model because it does not affect the nature of the commodity that is being traded. Under both complete and private information about agents' preferences, consumption is traded for capital. Preferences of buyers and sellers do not affect the nature of this trade beyond what is captured by agents' demand functions. The competitive price mechanism is thus efficient.¹³

In sum, competitive equilibrium delivers one efficient allocation in our economy, under both complete and incomplete public information. This allocation represents a particular distribution of the gains from trade among the two types of agents. Thus, competitive equilibrium is suboptimal under almost all possible strictly Paretian social preference orderings (represented by the parameter $\gamma \in [0, \infty]$).¹⁴ In the case of complete public information, this distributional problem can be remedied by lump-sum taxes. In the case of private information, however, lump-sum taxes are powerless. In fact, in our economy, the only implementable lump-sum tax is the zero tax on all agents. Motivated by this, we now turn to distortionary taxes.

5. COMPETITIVE EQUILIBRIUM WITH DISTORTIONARY TAXES

For the remainder of this article, we assume that information available to the government is incomplete. In particular, agents' impatience is known only to them.
We assume that the government knows the population distribution of the impatience parameter, $\theta$, but cannot determine the value of $\theta$ on an agent-by-agent basis. Thus, tax systems in which the amount levied on an agent depends directly on the agent's $\theta$ are not feasible for the government.

¹³ In particular, the classic lemons problem of Akerlof (1970) does not appear in this market.

¹⁴ One could also consider non-Paretian social preference orderings (see Mas-Colell, Whinston, and Green [1995], Section 22.C). By considering only strictly Paretian social welfare functions (of the form $\alpha u_H + (1 - \alpha)u_L$, where $u_\theta = \theta u(c_{1\theta}) + \beta u(c_{2\theta})$ for $\theta = H, L$) we pose a reasonably strong restriction on the set of allocations that can be considered optimal. In this restricted set, almost all Second Best Pareto-optimal allocations cannot be supported by competitive equilibrium with lump-sum taxes.

Let us start by defining a general class of feasible tax systems. For dates $t = 1, 2$, let $T_t$ denote the mapping from agents' publicly observable characteristics at $t$ to the tax payments to the government at $t$. In the first period, agents trade current consumption for capital. These trades are observable to the government. Thus, $T_1(c_1, a)$ represents the amount of tax due at the end of date one. At the end of the second date, second-period consumption is also publicly observable, so $T_2(c_1, a, c_2)$ is the second-period tax function. Clearly, the government can use any tax system of this form because the amounts due from each agent depend only on what the government can observe. In particular, $T_1$ and $T_2$ do not depend on the unobservable parameter $\theta$.

Under a tax system $(T_1, T_2)$, agents' budget constraints are given by
$$c_1 + qa = y - T_1(c_1, a),$$
$$c_2 = (1 + a)y - T_2(c_1, a, c_2).$$
Competitive equilibrium is defined, again, analogously to Definition 1: agents optimize, markets clear, and government budget constraints are satisfied.
With taxes $(T_1, T_2)$, these constraints are given by
$$\sum_{\theta=H,L} \mu_\theta T_1(\hat{c}_{1\theta}, \hat{a}_\theta) = 0, \tag{25}$$
$$\sum_{\theta=H,L} \mu_\theta T_2(\hat{c}_{1\theta}, \hat{a}_\theta, \hat{c}_{2\theta}) = 0, \tag{26}$$
where $\hat{c}_{t\theta}$ and $\hat{a}_\theta$ are equilibrium values of agents' consumption and capital trades.

Note that any nonzero feasible tax system $(T_1, T_2)$ will be distortionary. Indeed, if a tax system $(T_1, T_2)$ is not distortionary, then $T_1$ and $T_2$ must be constant (independent of their arguments). In this case, the government budget constraints (25) and (26) imply immediately that $T_1 = T_2 = 0$.

Since agents' actions, but not types, are observable, it is clear that redistribution can be achieved only with taxes that depend on agents' actions and not types. However, it is not obvious what form these taxes should take in order to be effective. The next subsection provides an example of a simple distortionary tax system that is feasible but completely ineffective for implementation of redistribution.

A Simple Distortionary Tax System

In this subsection, we examine a simple tax system with a proportional tax on capital income. In this system, capital income in period $t$ is taxed at a flat rate $\tau_t \in [0, 1]$ and the proceeds are refunded to the agents as a lump-sum transfer $T_t$.¹⁵ In our general notation, this tax system is written as
$$T_1(c_1, a) = \tau_1 y - T_1,$$
$$T_2(c_1, a, c_2) = \tau_2 (1 + a)y - T_2.$$
A tax system of this form consists of four numbers, $(\tau_t, T_t)_{t=1,2}$. Setting $\tau_t = T_t = 0$ for $t = 1, 2$ gives us the competitive equilibrium outcome, i.e., the equilibrium allocation is a Pareto optimum with the relative weight of the high type equal to $\gamma^{CE} = 1/3$. We want to study what other efficient allocations can be achieved in this economy with a tax system of the form $(\tau_t, T_t)_{t=1,2}$. The answer turns out to be: none.
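This neutrality claim can be previewed numerically before the formal argument. The sketch below rests on assumptions that are not restated in this section: logarithmic utility and the parameter values $y = 1$, $\beta = 1/2$, $\theta_H = 5/2$, $\theta_L = 1/2$ with equal population masses, all inferred from numbers quoted in the text. It uses the second-period budget constraint net of the refund, $c_2 = (1 + (1 - \tau_2)a)y$, which the text derives next, and it omits the first-period tax because the refund $T_1 = \tau_1 y$ cancels it exactly. The function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed parameters, inferred from the text: y = 1, beta = 1/2, log utility,
# theta_H = 5/2, theta_L = 1/2, equal population masses.
Y, BETA = F(1), F(1, 2)
THETA = {"H": F(5, 2), "L": F(1, 2)}
MU = {"H": F(1, 2), "L": F(1, 2)}


def capital_demand(theta, q, tau2):
    """Maximizer of theta*log(y - q*a) + beta*log((1 + (1 - tau2)*a)*y) over a."""
    return (BETA * (1 - tau2) * Y - theta * q) / ((1 - tau2) * q * (BETA + theta))


def market_clearing_price(tau2):
    """Price at which sum_theta mu_theta * a_theta = 0 (closed form under log utility)."""
    num = sum(MU[t] * BETA / (BETA + THETA[t]) for t in THETA)
    den = sum(MU[t] * THETA[t] / (BETA + THETA[t]) for t in THETA)
    return (1 - tau2) * Y * num / den


def equilibrium(tau2):
    """Return the price and each type's consumption pair under tax rate tau2 < 1."""
    q = market_clearing_price(tau2)
    allocation = {}
    for t, theta in THETA.items():
        a = capital_demand(theta, q, tau2)
        allocation[t] = (Y - q * a, (1 + (1 - tau2) * a) * Y)
    return q, allocation


for tau2 in (F(0), F(1, 4), F(9, 10)):
    q, allocation = equilibrium(tau2)
    print(f"tau2={tau2}: q={q}, c_H={allocation['H']}, c_L={allocation['L']}")
```

Under these assumptions the price moves one for one with $1 - \tau_2$ while both consumption pairs stay fixed, which is the sense in which the refunded proportional tax redistributes nothing.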
Under taxes, $(\tau_t, T_t)_{t=1,2}$, agents' budget constraints are given by
$$c_1 + qa = y(1 - \tau_1) + T_1,$$
$$c_2 = (1 + a)y(1 - \tau_2) + T_2,$$
and the government budget constraints are
$$\tau_1 y = T_1,$$
$$\sum_{\theta=H,L} \mu_\theta \tau_2 (1 + \hat{a}_\theta)y = T_2.$$
Because agents' equilibrium choices $\hat{a}_\theta$ satisfy capital market clearing, $\sum_{\theta=H,L} \mu_\theta \hat{a}_\theta = 0$, the second-period government budget constraint reduces to
$$T_2 = \tau_2 y \Big(1 + \sum_{\theta=H,L} \mu_\theta \hat{a}_\theta\Big) = \tau_2 y.$$
Thus, in both periods the amount the government refunds to each agent must equal the marginal capital income tax rate times the economy's aggregate amount of capital income, which in our model is fixed at $Y = y$.

Using $\tau_1 y = T_1$, the agents' budget constraint in the first period reduces to
$$c_1 + qa = (1 - \tau_1)y + \tau_1 y = y.$$
Thus, the first-period tax on capital income has no effect on the agents' budgets, as every agent has the same capital income and receives the same lump-sum refund equal to the average capital income tax.

¹⁵ Proportional distortionary taxes have been extensively studied in a vast literature initiated by Ramsey (1927). That literature concentrates on the question of minimization of the distortions resulting from proportional taxes, without addressing the question of optimal taxation. In particular, that literature does not consider situations in which distortionary taxes may have a corrective function, e.g., in economies with externalities or private information.

In the second period, using $\tau_2 y = T_2$, we can simplify the budget constraint as follows:
$$c_2 = (1 + a)y(1 - \tau_2) + T_2 = (1 + a)y(1 - \tau_2) + \tau_2 y = (1 + (1 - \tau_2)a)y.$$
We see that the lump-sum refunded flat tax on capital income, $\tau_2 > 0$, acts simply as a transfer from those who buy capital ($a > 0$) to those who sell it ($a < 0$).
If $\tau_2 < 1$, this transfer is proportional to the amount of capital traded.¹⁶ We note here that in a regular capital market transaction the payment that the buyer makes to the seller is a transfer of the exact same form. In particular, the tax payment, $\tau_2 a$, just like a price payment, is proportional to the amount, $a$, of capital being traded. From this observation, we see that a proportional tax on capital, with $\tau_2 < 1$, does nothing but change the equilibrium price of capital. In particular, under any tax of this form, the equilibrium allocation will coincide with the competitive equilibrium allocation, $\hat{c}$, so no redistribution can be achieved.

To see this point more clearly, let us write the agents' budget constraints again in the present-value form. From the first-period budget constraint we have that $a = (y - c_1)/q$. Substituting into the budget constraint at date two, we obtain
$$c_2 = (1 + (1 - \tau_2)(y - c_1)/q)y,$$
which is equivalently written as
$$c_1 + c_2 \frac{q}{(1 - \tau_2)y} = \frac{q}{1 - \tau_2} + y.$$
Let us now denote $q/(1 - \tau_2)$ by $Q$. This value represents the tax-adjusted price of capital. For any tax rate $\tau_2 < 1$, we can write the present-value budget constraint as
$$c_1 + c_2 Q/y = Q + y,$$
which is the same expression as the budget constraint agents face in the model without taxes, but with the price of capital, $q$, replaced with the tax-adjusted price, $Q$. The solutions to these two models must therefore be the same, i.e., $\hat{Q} = 1/2$. Thus, under a proportional capital tax, the equilibrium price of capital is $\hat{q} = (1 - \tau_2)/2$ and the unique equilibrium allocation is $\hat{c}$ for any tax rate, $\tau_2 < 1$.

This result is intuitive. Absent taxes, $q/y$ is the price of one unit of $c_2$ in terms of $c_1$. With tax, $\tau_2$, on capital purchases, $a$, in order to obtain one extra unit of $c_2$, an agent must purchase $1/((1 - \tau_2)y)$ units of capital at date one. With the price of capital being $q$, this means that it takes $q/((1 - \tau_2)y)$ units of $c_1$ to purchase one unit of $c_2$. The benefit of selling capital in period 1 is symmetrically increased, as selling capital now not only brings in resources for consumption today but also saves capital income taxes tomorrow. By affecting both sides of a capital transaction symmetrically, the tax, $\tau_2$, changes the nominal price of capital but does not change the real tradeoff that agents face in equilibrium.

¹⁶ If $\tau_2 = 1$, the government taxes the proceeds from the sale of capital at the rate of 100 percent. Under this tax, the market for capital is shut down, the tax proceeds are zero, and the only equilibrium is autarky, which is not an efficient allocation in this economy.

Setting aside the case of complete market shutdown, we see that no distortionary tax system of the form $(\tau, T)$ can affect the competitive equilibrium outcome. For any marginal tax rate, $\tau_2 < 1$, the equilibrium allocation is the same as it is for $\tau_2 = 0$. In the next section, we consider distortionary tax systems capable of changing the equilibrium outcome and implementing other efficient allocations.

6. EFFICIENT REDISTRIBUTION WITH DISTORTIONARY TAXATION UNDER INCOMPLETE INFORMATION

In this section, we devise a class of tax systems that are feasible despite agents' private information and capable of implementing any Second Best Pareto-optimal allocation. Similar to the simple system $(\tau, T)$ considered in the previous section, we will have a distortionary tax on capital and a lump-sum component. However, the distortion will not affect both parties to a capital sale/purchase transaction symmetrically.

An Optimal Distortionary Tax System

The tax system we consider in this section consists of two parts. First, there is a lump-sum tax, $T_t$, levied on all agents in period $t = 1, 2$. Second, there are subsidies to sufficiently extreme capital trades. The form these subsidies take is as follows.
The government sets a (negative) threshold, $\underline{a}$, and pays a subsidy, $S_1^-$, in period 1 to all agents whose capital purchases are not greater than $\underline{a}$ (i.e., a subsidy to all who sell a sufficiently large quantity of capital). Alternatively, the government can set a threshold, $\bar{a}$, and pay a subsidy, $S_2^+$, in period 2 to all agents whose capital purchases are not smaller than $\bar{a}$ (i.e., a subsidy to those who buy a lot of capital). In the tax system implementing a given Second Best Pareto optimum, only one of these subsidies will be nonzero.

A tax system of this form is therefore given by six numbers $(T_1, S_1^-, \underline{a}, T_2, S_2^+, \bar{a})$. In the general notation introduced in the previous section, we can express this tax system as follows:
$$T_1(c_1, a) = T_1 - I_{\underline{a}}(a) S_1^-, \tag{27}$$
$$T_2(c_1, a, c_2) = T_2 - I_{\bar{a}}(a) S_2^+, \tag{28}$$
where $I_{\underline{a}}$ and $I_{\bar{a}}$ are indicator functions given by
$$I_{\underline{a}}(a) = \begin{cases} 1 & \text{if } a \le \underline{a}, \\ 0 & \text{otherwise,} \end{cases}$$
and
$$I_{\bar{a}}(a) = \begin{cases} 1 & \text{if } a \ge \bar{a}, \\ 0 & \text{otherwise.} \end{cases}$$
We restrict attention to this class of tax systems because, as we will show, taxes in this class are sufficient for implementation of all Second Best Pareto-optimal allocations. In the next section, we discuss the possibility of implementing Second Best Pareto optima with other tax mechanisms.

Clearly, since taxes (27)–(28) do not depend on the unobservable parameter $\theta$, agents of both types face the same budget constraint, which is given by
$$c_1 + qa \le y - T_1 + I_{\underline{a}}(a) S_1^-,$$
$$c_2 \le (1 + a)y - T_2 + I_{\bar{a}}(a) S_2^+.$$
Also, the government budget constraints (25)–(26) can be expressed as
$$\sum_{\theta=H,L} \mu_\theta S_1^- I_{\underline{a}}(\hat{a}_\theta) = T_1,$$
$$\sum_{\theta=H,L} \mu_\theta S_2^+ I_{\bar{a}}(\hat{a}_\theta) = T_2.$$

As before, competitive capital market equilibrium with taxes $T = (T_1, S_1^-, \underline{a}, T_2, S_2^+, \bar{a})$ consists of a consumption allocation, $\hat{c} = (\hat{c}_{1H}, \hat{c}_{1L}, \hat{c}_{2H}, \hat{c}_{2L})$; capital trades, $\hat{a} = (\hat{a}_H, \hat{a}_L)$; and a capital price, $\hat{q}$, such that (i) agents optimize, i.e., the equilibrium allocation maximizes agents' utility given the price, $\hat{q}$, and taxes, $T$; (ii) the capital market clears; and (iii) the government's budget is balanced in every period. As before, we will say that the tax system, $T$, implements a Second Best Pareto optimum, $c^{**}(\gamma)$, if there exists a competitive equilibrium such that $\hat{c} = c^{**}(\gamma)$.

Analogous to (19), let $m^{**}_\theta(\gamma)$ denote the intertemporal marginal rate of substitution of agents of type $\theta$ at the Second Best Pareto optimum, $c^{**}(\gamma)$, i.e.,
$$m^{**}_\theta(\gamma) = \frac{\beta u'(c^{**}_{2\theta}(\gamma))}{\theta u'(c^{**}_{1\theta}(\gamma))}.$$
The following result is a version of the Second Welfare Theorem with private information.

Theorem 3 Every Second Best Pareto optimum $c^{**}$ can be implemented as a competitive equilibrium with taxes, $T$. In particular, for $\gamma \in [0, \infty]$, the Second Best Pareto optimum $c^{**}(\gamma)$ is implemented by the tax system
$$T(\gamma) = (T_1(\gamma), S_1^-(\gamma), \underline{a}(\gamma), T_2(\gamma), S_2^+(\gamma), \bar{a}(\gamma))$$
given as follows. For $\gamma < \gamma^{CE}$:
$$T_1(\gamma) = S_1^-(\gamma) = \underline{a}(\gamma) = 0,$$
$$T_2(\gamma) = Y - c^{**}_{2H}(\gamma) + m^{**}_H(\gamma)^{-1}\left(Y - c^{**}_{1H}(\gamma)\right),$$
$$S_2^+(\gamma) = c^{**}_{2L}(\gamma) - c^{**}_{2H}(\gamma) + m^{**}_H(\gamma)^{-1}\left(c^{**}_{1L}(\gamma) - c^{**}_{1H}(\gamma)\right),$$
$$\bar{a}(\gamma) = \left(Y m^{**}_H(\gamma)\right)^{-1}\left(Y - c^{**}_{1L}(\gamma)\right).$$
For $\gamma \ge \gamma^{CE}$:
$$T_2(\gamma) = S_2^+(\gamma) = \bar{a}(\gamma) = 0,$$
$$T_1(\gamma) = Y - c^{**}_{1L}(\gamma) + m^{**}_L(\gamma)\left(Y - c^{**}_{2L}(\gamma)\right),$$
$$S_1^-(\gamma) = c^{**}_{1H}(\gamma) - c^{**}_{1L}(\gamma) + m^{**}_L(\gamma)\left(c^{**}_{2H}(\gamma) - c^{**}_{2L}(\gamma)\right),$$
$$\underline{a}(\gamma) = Y^{-1}\left(c^{**}_{2H}(\gamma) - Y\right).$$

Although the expressions for the thresholds and transfers specified in the tax system, $T(\gamma)$, look complicated, the intuition behind them is very simple. Absent taxes, as we have seen, the competitive market mechanism implements the efficient allocation $c^{**}(\gamma^{CE})$.
In order to implement an optimum $c^{**}(\gamma)$ for some $\gamma > \gamma^{CE}$, the government must redistribute from the patient types, $L$, to the impatient types, $H$ (recall that $\gamma$ is the relative weight that the impatient type, $H$, receives in the social welfare objective). How can this redistribution be achieved when the government cannot observe agents' types? In competitive equilibrium without taxes, the impatient types sell capital because of their strong preference for first-period consumption. The patient types buy it. Thus, the government knows ex post who the impatient and patient agents are simply by looking at agents' capital trades. Suppose then that the government, targeting the impatient agents, gives a small subsidy to those who sell a sufficiently large quantity of capital. If the subsidy is small enough, or the minimum sale size requirement is sufficiently large, this subsidy will not cause the patient agents to change their behavior (i.e., to flip from buying to selling capital).¹⁷ Under such a subsidy, patient agents still buy capital and, therefore, do not collect the subsidy. The impatient agents, who were selling capital even without the subsidy, continue to sell it, which now gives them the additional benefit of the subsidy. Thus, the subsidy reaches the targeted type. If this subsidy is funded by lump-sum taxes on all agents, it redistributes from the patient agents to the impatient ones, as intended.

The optimal tax mechanism, $T(\gamma)$, delivers the subsidy to the targeted type precisely in this way. For any $\gamma > \gamma^{CE}$, the optimal tax system, $T(\gamma)$, provides a threshold level, $\underline{a}(\gamma)$, and a subsidy level, $S_1^-(\gamma)$, that achieve in equilibrium the amount of redistribution (relative to the competitive market allocation) required to implement the optimal allocation, $c^{**}(\gamma)$.

¹⁷ In the language of mechanism design, the market mechanism distorted by such a subsidy remains incentive compatible.
Similarly, in order to implement the optimum $c^{**}(\gamma)$ for some $\gamma < \gamma^{CE}$, the government redistributes from the impatient types, $H$, to the patient types, $L$. Taxes, $T(\gamma)$, are again designed to not induce the agents to flip, so the impatient types continue to sell capital and the patient types continue to buy it. For $\gamma < \gamma^{CE}$, the lump-sum-funded subsidy, $S_2^+(\gamma)$, goes to the buyers of capital, that is, types $L$, and thus reaches the targeted type of agent. In this way, tax $T(\gamma)$ achieves the desired redistribution.

Let us now argue slightly more formally that this intuition is consistent with equilibrium. We need to demonstrate that conditions (i)–(iii) defining competitive equilibrium with taxes are satisfied under taxes, $T(\gamma)$, with consumption, $\hat{c} = c^{**}(\gamma)$, along with some prices, $\hat{q}(\gamma)$, and capital trades, $\hat{a}_\theta(\gamma)$. More precisely, we will argue that the equilibrium capital price, $\hat{q}(\gamma)$, can be obtained from the IMRS of the agents who do not receive the subsidy to capital sales/purchases. For $\gamma > \gamma^{CE}$, these are the patient agents, i.e.,
$$\hat{q}(\gamma) = m^{**}_L(\gamma)y \tag{29}$$
for these $\gamma$. For $\gamma < \gamma^{CE}$, the impatient types do not receive the subsidy, thus
$$\hat{q}(\gamma) = m^{**}_H(\gamma)y$$
for all $\gamma$ in this range. The subsidy threshold levels $\underline{a}(\gamma)$ and $\bar{a}(\gamma)$ are such that the following capital trades are optimal in the agents' utility maximization problem:
$$\hat{a}_H(\gamma) = \underline{a}(\gamma), \qquad \hat{a}_L(\gamma) = -\frac{\mu_H}{\mu_L}\underline{a}(\gamma)$$
for each $\gamma > \gamma^{CE}$, and
$$\hat{a}_H(\gamma) = -\frac{\mu_L}{\mu_H}\bar{a}(\gamma), \qquad \hat{a}_L(\gamma) = \bar{a}(\gamma)$$
for each $\gamma < \gamma^{CE}$.

Checking that equilibrium conditions (ii) and (iii) are satisfied amounts to a bit of simple algebra. The crux of the argument is in checking the first equilibrium condition, i.e., in showing that under taxes, $T(\gamma)$, and proposed equilibrium prices, $\hat{q}(\gamma)$, agents of types $H$ and $L$ indeed find it optimal to choose the proposed equilibrium capital trades $\hat{a}_H(\gamma)$ and $\hat{a}_L(\gamma)$, respectively. An algebraic proof of this result would be very tedious.
In particular, note that the algebraic argument we used in the case of lump-sum taxes with full information cannot be used here, as the Euler equations are invalid due to the budget line being given by a non-differentiable function. We will thus proceed differently. For several selected values of $\gamma$, we will demonstrate graphically that the optimal allocation $c^{**}(\gamma)$ is consistent with agents' individual utility maximization under taxes, $T(\gamma)$. Qualitatively, these values will be representative of the whole spectrum of $\gamma$. From our graphical argument, it will be clear that the conclusion holds for all $\gamma \in [0, \infty]$.

Consider the case of $\gamma = 1$ (which represents the utilitarian social welfare objective). Since $1 > \gamma^{CE} = 1/3$, we have that the tax system, $T(1)$, provides a subsidy, $S_1^-(1)$, to agents whose capital purchases, $a$, are not larger than $\underline{a}(1)$. From the closed-form expression for $c^{**}(1)$ given in equations (21)–(22) of G08, we have that the optimal utilitarian allocation has $c^{**}_H(1) = (3/2, 1/2)$ and $c^{**}_L(1) = (1/2, 3/2)$. Substituting these values into the formula for tax parameters $T(1)$ given in the statement of Theorem 3, we have
$$T_2(1) = S_2^+(1) = \bar{a}(1) = 0$$
and
$$T_1(1) = \frac{1}{3}, \qquad S_1^-(1) = \frac{2}{3}, \qquad \underline{a}(1) = -\frac{1}{2}.$$
Under the tax system, $T(1)$, therefore, agents who sell at least half of their initial capital stock receive the subsidy of $2/3$ units of consumption at date one. There is no subsidy to buying assets. All agents pay the lump-sum tax of $1/3$ at date one. From (29) we compute
$$\hat{q}(1) = \frac{1}{3}.$$

The thick crooked line in Figure 3 represents the budget constraint that all agents face in their utility maximization problem under taxes, $T(1)$, and price, $\hat{q}(1)$. The horizontal segment of this budget constraint results from the subsidy, $S_1^-(1)$. The horizontal dashed line represents the lump-sum tax, $T_1$.
The two convex curves in Figure 3 are the highest indifference curves that types $H$ and $L$ attain in their utility maximization problems under taxes, $T(1)$, and price, $\hat{q}(1)$. The indifference curve of type $H$ has exactly one point in common with the budget constraint, $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$. The impatient agents, therefore, maximize their utility by choosing the consumption pair $c_H^{**}(1)$, which is consistent with implementation of the Second Best Pareto optimum $c^{**}(1)$. The indifference curve of type $L$ meets the budget constraint at two points: $c_L^{**}(1) = (\frac{1}{2}, \frac{3}{2})$ and $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$. Thus, $c_L^{**}(1)$ is consistent with the individual utility maximization of the $L$ types as well, although not uniquely.18

18 That this individual optimum is not unique is necessary in the implementation of the optimum $c^{**}(1)$ because the incentive compatibility constraint of type $L$, (6), binds at $c^{**}(1)$. Non-uniqueness for at least one type of agent will appear in any implementation of any Second Best Pareto optimum at which at least one of the incentive compatibility constraints (5)–(6) is binding.

[Figure 3: Individual Optima of the Two Types Under the Budget Constraint Resulting from Taxes τ(1). Horizontal axis: $c_1$; vertical axis: $c_2$.]

Note in Figure 3 that the indifference curve of type $H$ is flatter at $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$ than the downward-sloping segment of the budget constraint at this point. This is a consequence of the so-called intertemporal wedge, which is described in detail in G08. The slope of the budget line, everywhere outside of the horizontal segment, equals $-m_L^{**}(1)^{-1}$. The slope of the indifference curve of the $H$ type at $c_H^{**}(1)$ is $-m_H^{**}(1)^{-1}$. Because of the intertemporal wedge prevailing at the optimal allocation $c^{**}(1)$, these two rates are not equal. In fact, the sloping segment of the budget line is strictly steeper than the indifference curve of the $H$ type at $c_H^{**}(1)$. This implies that the optimal subsidy, $S_1^{-}(1)$, could not be made available with a weaker capital sale requirement than $\underline{a}(1) = -\frac{1}{2}$. Given the intertemporal wedge, which implies that the $H$ type is savings-constrained, a weaker threshold $\underline{a}(1)$ (a smaller required sale) would provide a smaller distortion and benefit the $H$ types. It would, however, also benefit the $L$-type agents, causing them to change their behavior from buying capital and receiving no subsidy to selling capital and qualifying for the subsidy, which would make this tax mechanism miss its subsidy target. For that reason, the $H$-type agents must remain savings-constrained in equilibrium.

That the same construction of equilibrium holds for all $\gamma > \gamma^{CE}$ can be easily checked using the expressions for taxes, $T(\gamma)$, provided in the statement of Theorem 3 and prices, $\hat{q}(\gamma)$, given in (29). One difference appears when we consider the Second Best Pareto optima $c^{**}(\gamma)$ for the values of $\gamma$ for which the incentive constraints do not bind.19 When no incentive constraints bind, the consumption bundle $c_\theta^{**}(\gamma)$ is a unique maximizer in the individual utility maximization problems of both types $\theta = H, L$. The slope of the non-horizontal segment of the budget line at $c_H^{**}(\gamma)$ is equal to the slope of the indifference curve of the $H$ type at this point; the allocation is free of intertemporal wedges. This means that agents of type $H$ would not benefit by selling slightly fewer claims than $\underline{a}(\gamma)$ even if the subsidy, $S_1^{-}(\gamma)$, were available at a slightly weaker threshold. In this sense, the threshold, $\underline{a}(\gamma)$, is not uniquely pinned down by the optimum $c^{**}(\gamma)$ for these values of $\gamma$. Figure 4 depicts this construction for one such value, namely $\gamma = 0.4$.
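The graphical argument for $\gamma = 1$ can also be checked numerically. The following sketch is a rough illustration under assumptions that are not taken from this passage: log utility $\ln c_1 + \beta_\theta \ln c_2$, unit endowments, capital return $y = 1$, and a hypothetical discount factor $\beta_H = 1/2$ for the impatient type. With log utility, $\beta_L = 1$ is the value that makes the IMRS of type $L$ at $(\frac{1}{2}, \frac{3}{2})$ equal to the reported price of $\frac{1}{3}$. Only the tax parameters and the price come from the text.

```python
import math
from fractions import Fraction as F

# Tax parameters and price for gamma = 1, as reported in the text.
T1, S1_MINUS, A_SALE, Q_HAT = F(1, 3), F(2, 3), F(-1, 2), F(1, 3)

# Illustrative assumptions: unit endowments, return y = 1, log utility,
# beta_H = 1/2 (hypothetical) and beta_L = 1.
E1, K0, Y = F(1), F(1), F(1)
BETA = {"H": 0.5, "L": 1.0}


def consumption(a):
    """(c1, c2) implied by capital trade a under the kinked budget constraint."""
    subsidy = S1_MINUS if a <= A_SALE else F(0)
    return E1 - T1 - Q_HAT * a + subsidy, Y * (K0 + a)


def utility(theta, a):
    c1, c2 = consumption(a)
    return math.log(float(c1)) + BETA[theta] * math.log(float(c2))


def best_trades(theta, tol=1e-12):
    """All utility-maximizing trades on a grid with step 1/100."""
    grid = [F(k, 100) for k in range(-99, 200)]
    top = max(utility(theta, a) for a in grid)
    return [a for a in grid if top - utility(theta, a) <= tol]


def imrs(theta, c1, c2):
    """Intertemporal marginal rate of substitution under log utility."""
    return BETA[theta] * float(c1) / float(c2)


for theta in ("H", "L"):
    print(theta, "optimal trades:", [str(a) for a in best_trades(theta)])

budget_slope = -float(Y / Q_HAT)
h_slope = -1.0 / imrs("H", *consumption(A_SALE))
print("budget slope:", budget_slope, "H indifference slope at its optimum:", h_slope)
```

Under these assumptions type $H$ has a single best trade while type $L$ is indifferent between two trades, and the indifference curve of type $H$ is flatter than the sloping part of the budget line at its optimum, which is the pattern described for Figure 3.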
Let us now turn to the Second Best Pareto optimum $c^{**}(0)$, i.e., the worst among all Second Best Pareto-optimal allocations from the point of view of the agents of type $H$. In order to implement this outcome, the government subsidizes capital purchases. Calculating taxes, $T(0)$, from the formulas given in Theorem 3, and pinning down capital price from the IMRS of the agents of type $H$ (who do not receive the subsidy in equilibrium), we construct the budget constraint depicted in Figure 5. The vertical segment of the budget constraint represents the subsidy, $S_2^{+}(0)$. The dashed vertical line represents the lump-sum tax, $T_2(0)$. The maximal indifference curve attained by the agents of type $H$ touches the budget line at two points: $c_H^{**}(0)$ and $c_L^{**}(0)$. The maximal indifference curve of the agents of type $L$ touches the budget line only at $c_L^{**}(0)$. Within this budget set, therefore, both types of agents choose to consume their part of the optimal allocation, $c^{**}(0)$. In this way, the tax system, $T(0)$, implements the Second Best Pareto optimum $c^{**}(0)$.

As before, this construction generalizes for all $\gamma < \gamma^{CE}$. For those $\gamma$ for which the incentive constraint of the $H$ type does not bind, both types' optimal consumption, $c_\theta^{**}(\gamma)$, is the unique maximizer of individual utility under the budget constraints obtained from the equilibrium price, $\hat{q}(\gamma) = m_H^{**}(\gamma)\, y$, and taxes, $T(\gamma)$. In those cases, as well, the optimal threshold, $\bar{a}(\gamma)$, is not uniquely determined by the optimum $c^{**}(\gamma)$. From the above graphical constructions, we can see how the implementation argument extends to all values of $\gamma \in [0, \infty]$.

19 As shown in G08, this is the case for $\gamma$ in the interval $[\gamma^{CE}, \gamma_2]$, where $\gamma_2$ is the threshold value at which the incentive constraint for the $L$ type begins to bind.

[Figure 4: Individual Optima of the Two Types Under the Budget Constraint Resulting from Taxes τ(0.4). Horizontal axis: $c_1$; vertical axis: $c_2$.]

7. OTHER TAX MECHANISMS

In this section, we briefly discuss the question of the uniqueness of the tax system, $T(\gamma)$. The tax system, $T(\gamma)$, is by no means a unique tax system capable of implementation of Second Best Pareto optima. Consider an arbitrary feasible tax system, $T$, and denote by $B(T)$ the set of all consumption pairs $(c_1, c_2)$ that are budget-feasible in the agents' individual utility maximization problem under taxes, $T$. Suppose that (a) $B(T)$ contains the consumption pairs $c_H^{**}(\gamma)$ and $c_L^{**}(\gamma)$, and (b) $B(T)$ is contained in the lower envelope of the indifference curves of the agents of type $\theta$ traced from the optimal consumption bundles $c_\theta^{**}(\gamma)$. It can easily be seen in Figures 3, 4, and 5 that any tax system, $T$, that satisfies (a) and (b) does implement the optimum $c^{**}(\gamma)$. This point goes back to Mirrlees (1971).

[Figure 5: Individual Optima of the Two Types Under the Budget Constraint Resulting from Taxes τ(0). Horizontal axis: $c_1$; vertical axis: $c_2$.]

Nevertheless, the tax system, $T(\gamma)$, used in Theorem 3 has several features that may be appealing (on the basis of out-of-model considerations, however). First, it is simple. Second, it does not crowd out the market completely. Let us discuss these two points by comparing the tax system, $T(\gamma)$, with two alternatives.

As the first alternative, consider a tax system in which the government taxes away all private wealth by setting the lump-sum taxes $T_1 = T_2 = y$ and offers two government welfare programs, with each agent in the economy being eligible to sign up for at most one. The first welfare program hands out consumption $c_H^{**}(\gamma)$ to each agent who signs up for it. The second hands out $c_L^{**}(\gamma)$.20 Clearly, this system can implement any Second Best Pareto optimum $c^{**}(\gamma)$, as well as any resource feasible and incentive compatible allocation. But it may be considered unappealing.
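The two-program alternative works only if each type prefers, at least weakly, the program intended for it. A minimal sketch of that self-selection check for $\gamma = 1$ follows. The bundles $(\frac{3}{2}, \frac{1}{2})$ and $(\frac{1}{2}, \frac{3}{2})$ are the ones reported earlier for $c^{**}(1)$; the log utility specification and the discount factors $\beta_H = 1/2$ and $\beta_L = 1$ are assumptions made for illustration, and the names `PROGRAMS` and `preferred_programs` are hypothetical.

```python
import math

# Assumed preferences: ln(c1) + beta * ln(c2), with an impatient type H.
BETA = {"H": 0.5, "L": 1.0}

# Welfare programs: each hands out one of the bundles of c**(1).
PROGRAMS = {"program_H": (1.5, 0.5), "program_L": (0.5, 1.5)}


def utility(theta, bundle):
    c1, c2 = bundle
    return math.log(c1) + BETA[theta] * math.log(c2)


def preferred_programs(theta, tol=1e-12):
    """Programs that give type theta the highest utility (ties included)."""
    top = max(utility(theta, b) for b in PROGRAMS.values())
    return sorted(name for name, b in PROGRAMS.items()
                  if top - utility(theta, b) <= tol)


for theta in ("H", "L"):
    print("type", theta, "is willing to sign up for:", preferred_programs(theta))
```

Under these assumptions type $H$ strictly prefers its own program, while type $L$ is exactly indifferent between the two, which mirrors a binding incentive compatibility constraint for type $L$.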
Under this tax mechanism, the market is completely shut down: Anticipating the lump-sum tax, $T_2$, agents hold on to their capital and just consume the government handout. All trade is crowded out by the combination of high taxes and generous welfare programs. All transfers in this economy go through the hands of the government. The tax system, $T(\gamma)$, of the previous section is comparatively appealing because it calls for a much smaller government intervention in the market economy. Only a part of the transfers needed to support Pareto-optimal allocations go through the hands of the government, with private markets having a clear role.

Another possible tax system is one under which the budget constraint, $B(T)$, is exactly equal to the lower envelope of the indifference curves traced from the optimal consumption, $c_\theta^{**}(\gamma)$, of the two types of agents. Under this system, the size of the transfers going through the government's hands is minimal. This system, however, is complicated because a high degree of nonlinearity in the implicit tax rates is required to trace out the nonlinear indifference curves of the two types. By comparison, the system, $T(\gamma)$, is simple, with the budget constraint being given by a linear schedule with just one parallel shift (by the amount of subsidy $S_1^{-}$ or $S_2^{+}$).

8. CONCLUSION

Classical general equilibrium analysis of competitive markets provides a strong argument against distortionary government interventions. Market allocations are efficient and all societal needs for redistribution can be efficiently achieved with lump-sum taxes and transfers. There is no reason to use distortionary taxes in the classical general equilibrium model. From the vantage point of the classical theory, distortionary taxes, which in fact are used by governments in many countries, may appear to reflect a failure of government policy.
This appearance is overturned when one recognizes the strong informational requirements imposed by the classical general equilibrium theory. When governments do not possess sufficiently fine information about the agents populating the economy, general equilibrium analysis leads to a completely different view of distortionary taxation. As our simple model illustrates, with incomplete public information, governments must necessarily rely on distortionary taxes in order to efficiently implement the desired level of redistribution.

20 One can see that this tax mechanism is simply a version of the direct revelation mechanism used to define the Social Planning Problem in G08.

REFERENCES

Akerlof, George. 1970. “The Market for Lemons: Qualitative Uncertainty and the Market Mechanism.” Quarterly Journal of Economics 84 (August): 488–500.

Debreu, Gérard. 1959. Theory of Value. New Haven and London: Yale University Press.

Grochulski, Borys. 2008. “Limits to Redistribution and Intertemporal Wedges: Implications of Pareto Optimality with Private Information.” Economic Quarterly 94 (Spring): 173–96.

Kocherlakota, Narayana R. 2007. “Advances in Dynamic Optimal Taxation.” In Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress, Vol. 1, edited by Richard Blundell, Whitney Newey, and Torsten Persson. New York: Cambridge University Press, 269–97.

Mas-Colell, Andreu, Michael D. Whinston, and Jerry R. Green. 1995. Microeconomic Theory. New York: Oxford University Press.

Mirrlees, James A. 1971. “An Exploration in the Theory of Optimum Income Taxation.” Review of Economic Studies 38 (April): 175–208.

Pigou, Arthur C. 1932. The Economics of Welfare. London: Macmillan and Co. Limited.

Ramsey, F. P. 1927. “A Contribution to the Theory of Taxation.” Economic Journal 37: 47–61.

Stiglitz, Joseph E. 1987. “Pareto Efficient and Optimal Taxation and the New Welfare Economics.” In Handbook of Public Economics, vol. 2, edited by Alan J. Auerbach and Martin Feldstein. Amsterdam: North-Holland, 991–1,042.

Werning, Iván. 2007. “Optimal Fiscal Policy with Redistribution.” The Quarterly Journal of Economics 122 (August): 925–67.

Economic Quarterly, Volume 95, Number 3, Summer 2009, Pages 269–288

The Behavior of Household and Business Investment over the Business Cycle

Kausik Gangopadhyay and Juan Carlos Hatchondo

The spillover effects associated with the decline in the housing market during 2007 and 2008 suggest the importance of this market for the overall economy. Yet the decision to purchase a house is only part of a broader plan of production and consumption of goods within the household. The residential services homeowners enjoy from their dwelling, the transportation services they enjoy from their automobiles, the meals prepared at home, the child/adult care services provided within the household, and the entertainment services derived from television and audio equipment are just a few examples of goods that are produced and consumed within the household, as opposed to goods that are purchased in the market. The size of this nonmarket output is quite significant: Benhabib, Wright, and Rogerson (1991) estimate that the output of the household sector in the United States is approximately half of the size of the output in the market sector.1 Furthermore, the production of non-market goods requires the use of capital. Greenwood and Hercowitz (1991) report that the stock of household capital is actually larger than the stock of capital in the market sector. Examples of household capital are the dwellings owned and occupied by the household, automobiles owned and used by the household's members, home appliances, furniture, etc.

Given the size of the household sector, several studies have incorporated this sector into the real business cycle model with the goal of enhancing the understanding of aggregate fluctuations of economic activity.
Even though the real business cycle model has proven to be a powerful tool for explaining basic patterns of business cycle fluctuations in the United States, it has faced several challenges when it has been utilized to account for the behavior of business and household investment.

This article presents a summary of the literature that studies the behavior of household investment decisions over the business cycle. Previous studies have emphasized three stylized facts about the cyclical behavior of household and business investment in the United States: (1) both investment components display a positive co-movement with output, as well as a positive co-movement with each other, (2) household investment is more volatile than business investment, and (3) household investment leads the cycle whereas business investment lags the cycle. With respect to the last finding, household investment is correlated more with future output than with current or past output, while business investment is correlated more with past output than with current or future output. This article discusses the performances of previous studies in terms of their ability to account for these stylized facts within a framework that is broadly consistent with the main properties of business cycles in the United States.

Gangopadhyay is an assistant professor at the Indian Institute of Management Kozhikode. Hatchondo is an economist at the Federal Reserve Bank of Richmond. The views expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mails: kausik@iimk.ac.in; juancarlos.hatchondo@rich.frb.org.

1 Except for the flow of services provided by dwellings to homeowners, the rest of non-market output produced within the household goes unreported in the System of National Accounts.
This article provides a summary of studies that have extended the real business cycle model in order to reach a better understanding of the facts described above. Alternative explanations for the positive co-movement and relative volatilities between the two investment components have relied on different degrees of complementarity between capital and labor in the production of home goods, the presence of alternative uses for labor and/or household capital, and the presence of a more costly adjustment in the stock of market capital compared with the stock of household capital. The leading behavior of household investment has been harder to explain. The two studies that have succeeded in accounting for this fact have relied on household capital as a factor that may enhance the quality of the labor force and on a multiple-sector model in which capital goods are produced in a separate sector.

All the studies reviewed in this article rely on exogenous shocks to productivity levels as the driving force of cyclical fluctuations. This modeling strategy abstracts from explanations for cyclical fluctuations in which market imperfections lead to inefficiently low or high output levels. For example, none of the studies revisited in this article feature residential investment driven by house prices that may be misaligned with fundamentals. This implies that the studies surveyed in this article portray cyclical downturns as an efficient response of the economy to “bad shocks.”

The rest of the paper is organized as follows. Section 1 describes the main characteristics of the business cycle in the United States and the importance of household production. Sections 2 and 3 present a summary of the literature on the cyclical behavior of household and business investment. The conclusions are noted in Section 4.

1. DATA DESCRIPTION

The concept of business cycles refers to fluctuations of economic activity around its long-run growth path. The long-run growth path is commonly referred to as the trend of the time series of an economic variable. The cyclical component of the series is defined as the deviation from the trend. In real business cycle theory, economists study the behavior of the cyclical component. For example, studies of business cycles focus on notions of persistence in the detrended component of economic aggregates, co-movement among various detrended (cyclical) components and the leading or lagging behavior relative to the detrended component of output, and also the relative amplitudes of standard deviation or volatilities of various detrended series. The remarkable feature about fluctuations of aggregate variables over time is that the cyclical components tend to move in a synchronized mode. There has been an extensive literature over the last 30 years aimed at reaching a coherent understanding of the regularities that characterize the business cycle in the U.S. economy. As was pointed out by Lucas (1977), the development of a theoretical explanation for these regularities constitutes a first step toward the design of sound policy measures.

This section does not provide an exhaustive description of the properties of business cycles in the United States. Instead, it focuses on the cyclical behavior of the aggregate variables that are studied in this article. Table 1 presents the behavior of market output, market consumption, household and business investment, and total hours worked in the market sector. The moments are computed using data from the first quarter of 1964 to the second quarter of 2008.2 The second column reports the standard deviation of market output and ratios of the standard deviations of each variable relative to the standard deviation of market output.
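To make the notions of cyclical component, relative volatility, and lead or lag concrete, the sketch below computes them for two artificial series. It is an illustration only: the data are synthetic (a linear trend plus a sine wave, with the second series constructed to lead the first by two periods), the detrending uses a textbook Hodrick-Prescott filter with smoothing parameter 1,600 solved by plain banded elimination, and all function names are hypothetical rather than taken from any library.

```python
import math


def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott filter: returns (trend, cycle).

    Solves (I + lam * D'D) trend = y, where D takes second differences.
    The system is pentadiagonal, so elimination only touches a narrow band.
    """
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        rows, coef = (r, r + 1, r + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(rows, coef):
            for j, cj in zip(rows, coef):
                a[i][j] += lam * ci * cj
    b = [float(v) for v in y]
    # Forward elimination (symmetric positive definite, no pivoting needed).
    for col in range(n):
        for r in range(col + 1, min(col + 3, n)):
            f = a[r][col] / a[col][col]
            for c in range(col, min(col + 3, n)):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    trend = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * trend[c] for c in range(r + 1, min(r + 3, n)))
        trend[r] = (b[r] - s) / a[r][r]
    return trend, [yi - ti for yi, ti in zip(y, trend)]


def stdev(u):
    m = sum(u) / len(u)
    return math.sqrt(sum((v - m) ** 2 for v in u) / len(u))


def corr(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v)) / len(u)
    return cov / (stdev(u) * stdev(v))


def cross_corr(x, y, k):
    """corr(x_{t+k}, y_t); a peak at negative k means x leads y."""
    if k >= 0:
        return corr(x[k:], y[:len(y) - k])
    return corr(x[:len(x) + k], y[-k:])


# Synthetic log series: common trend, x more volatile and two periods ahead.
n = 80
log_y = [0.005 * t + 0.02 * math.sin(2 * math.pi * t / 20) for t in range(n)]
log_x = [0.005 * t + 0.06 * math.sin(2 * math.pi * (t + 2) / 20) for t in range(n)]
_, cyc_y = hp_filter(log_y)
_, cyc_x = hp_filter(log_x)

print("relative standard deviation of x:", round(stdev(cyc_x) / stdev(cyc_y), 2))
for k in range(-4, 5):
    print("k =", k, "corr(x_{t+k}, y_t) =", round(cross_corr(cyc_x, cyc_y, k), 2))
```

A dense matrix is kept for readability; for long series a banded or sparse solver from a numerical library would be the natural choice.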
The remaining columns report the cross-time correlation between each variable and market output. In particular, the seventh column illustrates that there is a significant positive co-movement between all five variables. However, the highest magnitudes of the coefficients of correlations do not necessarily correspond to the contemporaneous correlations. Household investment is more closely correlated with market output one and two quarters ahead than with current market output: $\mathrm{corr}(x_{h,t-2}, y_t) = 0.78$ and $\mathrm{corr}(x_{h,t-1}, y_t) = 0.81$, while $\mathrm{corr}(x_{h,t}, y_t) = 0.73$.3 On the contrary, business investment is correlated more with market output one and two quarters behind than with current market output: $\mathrm{corr}(x_{m,t+1}, y_t) = 0.84$ and $\mathrm{corr}(x_{m,t+2}, y_t) = 0.81$, while $\mathrm{corr}(x_{m,t}, y_t) = 0.78$. In addition, both investment components are significantly more volatile than market output and consumption.

Table 1 Properties of Business Cycles in the United States, Selected Moments

                                   Cross Correlation of Market Output at Period t with:
Variable x             Std. Dev.   x(t−4)  x(t−3)  x(t−2)  x(t−1)  x(t)    x(t+1)  x(t+2)  x(t+3)  x(t+4)
Market Output            1.66       0.26    0.47    0.68    0.86    1.00    0.86    0.68    0.47    0.26
Market Consumption       0.55       0.43    0.61    0.75    0.82    0.79    0.66    0.49    0.30    0.10
Business Investment      2.91      −0.06    0.13    0.37    0.59    0.78    0.84    0.81    0.71    0.54
Household Investment     4.03       0.58    0.68    0.78    0.81    0.73    0.50    0.27    0.04   −0.15
Market Hours             1.11       0.02    0.22    0.46    0.69    0.86    0.89    0.82    0.69    0.51

2 Market output consists of gross domestic product less consumption of housing services. Market consumption consists of personal consumption expenditures in nondurables and services less housing services. Household investment consists of residential fixed investment and expenditures in durable consumption goods. Business investment consists of nonresidential fixed investment. Market hours consists of total hours worked in the private sector. The Bureau of Economic Analysis is the primary source for the first four variables and the Bureau of Labor Statistics is the primary source for the last variable. The moments reported in the table correspond to deviations from the trend of the natural logarithm of each variable. Trends are computed using the Hodrick-Prescott filter with a smoothing parameter of 1,600.

3 The leading behavior of household investment is shared by its two components: household purchases of durable goods and residential investment.

The leading behavior of household investment is also apparent in Figure 1. The graph illustrates the dynamics of household investment, business investment, and output before and after each of the last seven recessions. Except for the 2001 recession, household investment had already peaked and was in decline at the beginning of each recession. On the other hand, except for the recessions that started in 1969 and 2001, business investment peaked either at the beginning of the recession or after that.

[Figure 1: Real Investment and GDP Before and After Each of the Last Seven Recessions. Panels labeled 1960, 1969, 1973, 1980, 1981, 1990, 2001, and 2007. Series: GDP, Household Investment, Business Investment. Horizontal axis: Number of Quarters Before/After Recession Starts. Notes: The indexes take a value of 100 in the first quarter of each recession.]

Even though standard one-sector real business cycle models have been successful in accounting for the cyclical pattern of aggregate investment, the extensions to the one-sector model have been less successful. To some extent, this poses a challenge to the use of transitory shocks to aggregate productivity as the main source of aggregate business fluctuations. The next sections present a summary of the lessons that can be extracted from past work that has studied the cyclical behavior of household and business investment.

2. THE BASELINE NEOCLASSICAL GROWTH MODEL

Kydland and Prescott (1982) and Long and Plosser (1983) are the first studies to quantify the explanatory power of equilibrium theories to account for business cycle fluctuations. They consider different extensions of the stochastic growth model studied in Brock and Mirman (1972) and compare statistical properties of the data generated by their models with actual statistics. In Kydland and Prescott (1982) and Long and Plosser (1983), the only source of fluctuations in the economy is a shock to the aggregate factor productivity. Their work laid down the foundations of a vast literature that shows how equilibrium theories could provide a plausible explanation of aggregate fluctuations of economic activity. The rest of this section is devoted to elaborating on the structure of the one-sector real business cycle model and the different multi-sector models that have been used so far to explain the cyclical patterns of business and household investment.

As a simple case study, consider a closed economy with no government spending and complete markets. There is one good in the economy that can be either consumed or invested. Fluctuations in economic activity are driven by persistent shocks to total factor productivity. In the simple model, there is no disutility of labor implying that the supply of labor is inelastic.
Under a wide range of values for the parameters, a positive shock to productivity generates higher output, consumption, and investment in the shock period, which can account for the positive co-movement of these three economic aggregates. In this economy, there are two effects through which a positive productivity shock may induce higher investment level in the shock period. First, agents become richer and may want to smooth out the current windfall of output. The only aggregate mechanism available to transfer current resources to future periods is capital accumulation. Secondly, if the shock is persistent enough, positive current productivity shocks predict a distribution biased toward positive shocks in the following period, which augments the marginal benefit to invest rather than to consume.4 Additionally, an agent's ability to transfer resources across time by investing or disinvesting enables the model to account for the volatilities of consumption and investment relative to output.

What happens when investment is disaggregated between household and business investment? The answer is that the baseline model faces a hard time accounting for the cyclical pattern of these two components.

3. MODELS WITH HOME PRODUCTION

Greenwood and Hercowitz (1991) constitutes the first attempt to study the cyclical behavior of these two components of investment in a real business cycle model. They consider a two-sector model in which the representative household maximizes its expected lifetime utility, as given by

$$E_0 \sum_{t=0}^{\infty} \beta^t\, u(c_{Mt}, c_{Ht}), \qquad (1)$$

where $c_{Mt}$ denotes the consumption of market goods, and $c_{Ht}$ denotes the consumption of home-produced goods at time period $t$.
The consumption of market goods is identical to the purchases of consumption goods, $c$, namely

$$c_{Mt} = c_t, \qquad (2)$$

while home goods, $c_{Ht}$, are assumed to be a function of the stock of household capital, $k_{Ht}$, and the number of hours allocated to produce home goods, $h_{Ht}$,

$$c_{Ht} = H(k_{Ht}, z_{Ht} h_{Ht}). \qquad (3)$$

Market goods are produced using a technology that depends on the capital stock invested in the market sector, $k_{Mt}$, and the number of hours supplied to the market sector, $h_{Mt}$,

$$y_t = F(k_{Mt}, z_{Mt} h_{Mt}). \qquad (4)$$

In choosing market consumption, $c_{Mt}$, and savings, the household faces the following budget constraint in period $t$:

$$c_t + x_{Mt} + x_{Ht} = (1 - \tau_k)\, r_t k_{Mt} + (1 - \tau_l)\, w_t h_{Mt} + \tau, \qquad (5)$$

where $w_t$ is the wage rate in the market sector, $r_t$ is the rental price of capital in the market sector, $x_{Mt}$ and $x_{Ht}$ are the investment in market and household capital, respectively, $\tau_k$ is the tax rate on capital income, $\tau_l$ is the tax rate on labor income, and $\tau$ is a lump sum transfer. The variables $z_{Mt}$ and $z_{Ht}$ represent labor-augmenting technological progress. In this study, an important assumption is that productivity shocks in the market and household sectors are perfectly correlated, i.e., $z_{Mt} = z_{Ht}$. The endowment of hours in each period is normalized to 1 and it is assumed that all hours that are not used to produce market goods are used to produce home goods. That is,

$$h_{Mt} + h_{Ht} = 1. \qquad (6)$$

Finally, the capital stocks in the market and household sector depreciate at the constant rates $\delta_M$ and $\delta_H$, respectively.

4 Note that there may exist cases where a positive shock induces a decrease in investment in the shock period. This would occur when agents predict that they are going to be sufficiently rich in the future as a consequence of the current shock and thus want to transfer some of those future resources to the current period.
This means that the capital stock in sector $i$ follows the law of motion

$$k_{it+1} = (1 - \delta_i)\, k_{it} + x_{it}, \quad \text{with } i \in \{M, H\}. \qquad (7)$$

Similar investment motives to the ones described in the case of the one-sector model are also present in this environment. The difference is that now there is a tradeoff between the accumulation of business capital and that of household capital. In the baseline calibration of Greenwood and Hercowitz (1991), households respond to a positive productivity shock by increasing business investment and decreasing household investment in the shock period. This behavior explains why the simulated data sets obtained using their baseline calibration feature a strong negative co-movement between business and household investment. The mechanism of this model is summarized by the following passage from Greenwood and Hercowitz (1991, 1,205):

    . . . The negative co-movement of the two investments, which stands in contrast with the positive one displayed by the actual data has to do with the basic asymmetry between the two types of capital. Business capital can be used to produce household capital, but not the other way around. When an innovation to technology occurs, say a positive one, the optimal levels for both capital stocks increase. Given the asymmetry in the nature of the two capital goods, the tendency for the benchmark model is to build business capital first, and only then household capital. . .

Greenwood and Hercowitz (1991) show that a higher degree of complementarity between labor and capital in home technology helps in accounting for the co-movement between household and business capital accumulation. The Euler equation for household capital accumulation is given by

$$u_1(c_M, c_H) = \beta \int u_1(c_M', c_H') \left[ \frac{u_2(c_M', c_H')}{u_1(c_M', c_H')}\, H_1(k_H', z' h_H') + 1 - \delta_H \right] dG(z' \mid z), \qquad (8)$$

where $x'$ denotes the next-period value of variable $x$.
The marginal value of household capital accumulation depends on the future shadow price of household consumption, $\frac{u_2(c_M', c_H')}{u_1(c_M', c_H')}$, and on the future marginal productivity of household capital, $H_1(k_H', z' h_H')$. The Euler equation takes a simple form for the parameterization used in Greenwood and Hercowitz (1991). They assume that the production function for the home good, $H(k_H, z h_H)$, is of the following form:

$$H(k_H, z h_H) = \begin{cases} k_H^{\eta}\, (z h_H)^{1-\eta} & \text{if } \zeta = 0 \\ \left[ \eta k_H^{\zeta} + (1 - \eta)(z h_H)^{\zeta} \right]^{1/\zeta} & \text{if } \zeta \neq 0. \end{cases} \qquad (9)$$

The value of $\zeta$ determines the elasticity of substitution between household capital and labor in the production of home goods. Both inputs are complements when $\zeta < 0$, and are substitutes when $0 < \zeta < 1$.

Greenwood and Hercowitz (1991) assume that the market technology is specified by a standard Cobb-Douglas production function with a labor-augmenting productivity shock. Firms seek to maximize profits given the rental rates for capital and labor. The instantaneous utility function has the following form:

$$u(c_M, c_H) = \frac{C(c_M, c_H)^{1-\gamma} - 1}{1 - \gamma}. \qquad (10)$$

The consumption aggregator, $C(c_M, c_H)$, is given by

$$C(c_M, c_H) = c_M^{\theta}\, c_H^{1-\theta}. \qquad (11)$$

Under this parameterization, the Euler equation simplifies to

$$\frac{u_2(c_M', c_H')}{u_1(c_M', c_H')}\, H_1(k_H', z' h_H') = \frac{1 - \theta}{\theta}\, c_M'\, \frac{\eta\, k_H'^{\,\zeta - 1}}{\eta\, k_H'^{\,\zeta} + (1 - \eta)(z' h_H')^{\zeta}}. \qquad (12)$$

In Greenwood and Hercowitz's (1991) baseline calibration $\zeta = 0$, so the direct role of the future productivity shock, $z'$, on the future shadow price of household consumption and the future marginal productivity of household capital cancel each other out. However, when capital and labor are complements in the production of home goods ($\zeta < 0$), higher future productivity shocks have a direct positive effect on the incentives to accumulate household capital.
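The role of $\zeta$ in equation (12) can be seen by evaluating its right-hand side at two productivity levels. The sketch below does this; the numerical values of $\eta$, $\theta$, $c_M'$, $k_H'$, and $h_H'$ are arbitrary placeholders chosen for illustration and are not a calibration from any of the papers discussed, and the function names are hypothetical.

```python
def home_output(k, h, z, eta, zeta):
    """Home production function of equation (9)."""
    if zeta == 0:
        return k ** eta * (z * h) ** (1 - eta)
    return (eta * k ** zeta + (1 - eta) * (z * h) ** zeta) ** (1 / zeta)


def household_capital_term(c_m, k, h, z, eta, zeta, theta):
    """Right-hand side of equation (12): (u2/u1) * H1 next period."""
    denom = eta * k ** zeta + (1 - eta) * (z * h) ** zeta
    return (1 - theta) / theta * c_m * eta * k ** (zeta - 1) / denom


# Placeholder values, for illustration only.
params = dict(c_m=1.0, k=2.0, h=0.3, eta=0.3, theta=0.5)

for zeta in (0.0, -1.0):
    low = household_capital_term(z=1.0, zeta=zeta, **params)
    high = household_capital_term(z=1.2, zeta=zeta, **params)
    print("zeta =", zeta, "| term at z'=1.0:", low, "| term at z'=1.2:", high)

# The CES form approaches the Cobb-Douglas form as zeta goes to zero.
print(home_output(2.0, 0.3, 1.0, 0.3, 1e-6), home_output(2.0, 0.3, 1.0, 0.3, 0.0))
```

With $\zeta = 0$ the term does not respond to $z'$, whereas with $\zeta = -1$ it rises with $z'$, which is the direct effect described above.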
Thus, when $\zeta < 0$, a positive productivity shock in the current period increases the probability of observing higher shocks in the next period and generates a stronger desire to accumulate household capital in the period of the shock. The intuition is that when the ability to substitute capital for labor decreases, it becomes more costly for households to compensate for a decrease in household capital with an increase in the number of hours devoted to the production of home goods. Greenwood and Hercowitz (1991) show that a value of $\zeta = -1$ suffices to generate a positive reaction of household investment to productivity shocks and hence, a positive co-movement between household and business investment. In addition, a value of $\zeta = -1$ also helps to account for the larger volatility of household investment relative to business investment.

Modifications of the Baseline Model with Home Production

Differential capital adjustment costs in the market and household sector

Gomme, Kydland, and Rupert (2001) point out that the alternative parameterization proposed by Greenwood and Hercowitz (1991) to account for the positive co-movement between the two investment components may be inconsistent with the presence of balanced growth.5 Gomme, Kydland, and Rupert (2001) extend the setup studied in Greenwood and Hercowitz (1991) by introducing a time-to-build technology for the production of market goods as well as utility from leisure. In Gomme, Kydland, and Rupert (2001) the representative household's lifetime utility is represented by

$$E_0 \sum_{t=0}^{\infty} u(c_{Mt}, c_{Ht}, h_{Lt}), \tag{13}$$

where $h_{Lt}$ denotes the number of hours devoted to leisure activities. The inputs required to produce market and home goods are the same as in equations (2)–(4). In Gomme, Kydland, and Rupert (2001), the household allocates its endowment of hours over three possible uses. This means that equation (6) is replaced by

$$h_{Mt} + h_{Ht} + h_{Lt} = 1. \tag{14}$$

5 If the model were extended to account for the decline in the price of durable goods, it would not be able to generate a constant fraction of expenditures in durable goods as observed empirically.

The assumption of time-to-build for market capital implies that an agent decides today the increase in the stock of business capital that will take place four periods ahead (a period refers to a quarter). In addition, the investment projects decided today entail a commitment of investment resources during four periods until the projects can become active. More precisely, when households decide at date $t$ to increase their capital stock in the market sector at date $t + 4$ by one unit, they need to spend 0.25 units per period from date $t$ until $t + 3$. This means that the law of motion for capital in the market sector satisfies the following equation:

$$k_{M,t+1} = (1 - \delta_M)\, k_{Mt} + p_{M,t-3}, \tag{15}$$

where $p_{Mt}$ denotes the number of projects in the market sector started in period $t$. Unlike in Greenwood and Hercowitz (1991), the investment in market capital in a given period depends on the number of projects started in that period as well as on the number of projects started over the last three periods, namely

$$x_{Mt} = \frac{1}{4} \left( p_{Mt} + p_{M,t-1} + p_{M,t-2} + p_{M,t-3} \right). \tag{16}$$

However, Gomme, Kydland, and Rupert (2001) assume that it takes only one period to complete household investment projects. This means that equation (7) still applies for the stock of capital in the household sector. Finally, Gomme, Kydland, and Rupert (2001) relax the strong assumption of perfect correlation between productivity shocks in the household and market sectors.

The main improvement over Greenwood and Hercowitz (1991) is that the model with time-to-build technology manages to replicate the positive co-movement between household and business investment and generates a stronger lag in the reaction of business investment to output.
That result is obtained assuming a unitary elasticity of substitution between capital and labor in the home technology ($\zeta = 0$). In order to assist the intuition, Figure 2 describes the impulse responses to a one-time shock to the productivity level in the market sector ( M ). Figure 2 shows that at the time of the shock, agents respond by starting more investment projects. This accounts for the increase in market investment at date 1 and at the dates that follow the shock. There are fewer investment projects started after date 1, which accounts for the decline in market investment observed after date 5. Even though the productivity level in the household sector remains unchanged throughout the period, the positive wealth effect because of the higher productivity in the market sector induces households to consume more homemade goods and thus to invest more in home capital. The upward pressure on wages triggered by the spike in market productivity induces households to work more hours in the market sector. As a result of the higher supply of labor hours and the increase in factor productivity, market output increases upon the shock. The initial increase in output and labor hours tends to fade away until date 5. At that point, the investment projects started at date 1 become active and market output and hours worked in the market sector jump up again. The results are symmetric in the case of a negative shock to market productivity.

[Figure 2: Impulse Responses to a One-Time Shock to Current Market Productivity. Notes: $x_M$ is business investment, $x_H$ is household investment, $y$ is market output, and $h_M$ is market hours. The deviations are expressed in percentage deviations from the steady-state values for each variable.]
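The time-to-build accounting in equations (15) and (16) can be illustrated with a short Python sketch. The function name, the zero initial history of projects, and the numbers in the usage note are assumptions made for illustration; they are not taken from Gomme, Kydland, and Rupert (2001).

```python
def time_to_build_paths(projects, delta_m, k0, lags=4):
    """Trace market investment and capital under time-to-build.

    projects[t] is p_Mt, the number of projects started in period t.
    Projects started before period 0 are assumed to be zero.
    Returns (investment, capital), where investment[t] is x_Mt from
    equation (16) and capital[t] is k_Mt from equation (15).
    """
    n_periods = len(projects)

    def started(t):
        # Projects started in period t, zero outside the sample
        return projects[t] if 0 <= t < n_periods else 0.0

    # Equation (16): each project absorbs 1/lags units for lags periods
    investment = [
        sum(started(t - j) for j in range(lags)) / lags
        for t in range(n_periods)
    ]

    # Equation (15): a project started at t - 3 adds to capital at t + 1
    capital = [k0]
    for t in range(n_periods):
        capital.append((1 - delta_m) * capital[-1] + started(t - (lags - 1)))
    return investment, capital
```

For instance, a single project started in period 0 with no depreciation, `time_to_build_paths([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], delta_m=0.0, k0=10.0)`, spreads investment of 0.25 over periods 0 through 3 and raises the capital stock by one unit only in period 4, which mirrors the delayed pickup in market output described above.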
The simultaneous rise (fall) in household and business investment that tends to follow a rise (fall) in market productivity plays a key role in explaining the co-movement of both investment components. As explained in Gomme, Kydland, and Rupert (2001; 1,127):

The effect of time to build is to mute the impact effect of the shock on market investment by drawing out the response over the four quarters it takes to build market capital. . . . As a result, home investment need not take such a big hit in the initial period of the shock.

Chang (2000) explores a slightly different setup and provides an alternative mechanism that can explain the co-movement between market and household investment. The household's objective is the same as the one specified in equation (1), with the difference that both consumption goods are produced within the household. That is, Chang (2000) replaces equation (2) with

$$c_{Mt} = M(c_t, z_{Ct} h_{Ct}), \tag{17}$$

where $h_{Ct}$ denotes the number of hours allocated to the production of home goods that do not require durable inputs, and $z_{Ct}$ is a labor-augmenting productivity shock. The production of home goods that require durable inputs satisfies equation (3).6 As in Greenwood and Hercowitz (1991), there is only one market sector in the economy. The market good can be used as a nondurable good, a durable good, or capital to be rented to firms in the market sector. These uses are nonreversible. The household's allocation of time must satisfy

$$h_{Ct} + h_{Mt} + h_{Ht} = 1. \tag{18}$$

Chang (2000) assumes that the accumulation of durable goods and market capital is subject to an adjustment cost, $\phi$, that is,

$$k_{i,t+1} = (1 - \delta_i)\, k_{it} + \phi\!\left( \frac{x_{it}}{k_{it}} \right) k_{it}, \quad \text{for } i \in \{H, M\}. \tag{19}$$

The only source of uncertainty consists of a productivity shock in the market sector ($z_H$ and $z_C$ display a constant and deterministic growth rate).
Chang (2000) shows that when the household technology features a higher degree of substitutability between durable goods and labor than between nondurable goods and labor, a positive productivity shock in the market sector generates a simultaneous increase in the investment of market capital and household stock of durable goods. The intuition is that a positive productivity shock induces households to increase their consumption while it increases their opportunity cost of time allocated to the production of consumption goods, given that the market wage increases. When the production of $c_D$ displays a sufficiently higher degree of substitution compared to the production of $c_N$, households find it optimal to increase their consumption of $c_D$ by using more capital (durable goods) and less labor. This accounts for the increase in the purchases of durable goods upon a positive productivity shock. In addition, Chang (2000) shows that it is the joint presence of a higher elasticity of substitution in the production of $c_D$ and the adjustment cost in the accumulation of durable goods and business capital that helps in generating a positive co-movement of purchases in durable goods and business investment. Once one of these two assumptions is relaxed, the model generates a negative co-movement between the accumulation of durable goods and business investment.

6 Note that in Chang (2000) there are two types of household capital. One is composed of nondurable goods and fully depreciates at the end of each period. The other is composed of durable goods and is subject to partial depreciation.

In contrast to Greenwood and Hercowitz (1991), the environment studied by Chang (2000) suggests that the positive co-movement between the two investment components can be explained by a high degree of substitutability in the production of the home good that requires durable goods.
In addition, Chang (2000) estimates the elasticity of substitution between goods and time in different consumption activities and finds that durable goods seem to be a good substitute for time, a finding that is consistent with previous empirical studies.

Home production as an input to market production

Einarsson and Marquis (1997) are able to explain the co-movement of household and business investment in a setup in which households supply labor hours to the market sector and the non-market sector to accumulate human capital. In Einarsson and Marquis (1997), the household faces the same objective as in equation (1) and it has to satisfy the same restrictions defined in equations (2)–(5) with two differences. First, the term $h_{it}$ in equations (2)–(5) needs to be replaced by $E_t h_{it}$ for $i \in \{H, M\}$. The variable $E_t$ denotes the stock of human capital in period $t$. Second, there are no productivity shocks in the production of home goods. Einarsson and Marquis (1997) assume that households can increase their stock of human capital using the following technology:

$$E_{t+1} = G(E_t, h_{Et}), \tag{20}$$

where $h_{Et}$ is the amount of time allocated in period $t$ to learning activities. That is, human capital has a few nonexclusive uses: it serves as an input in the production of human capital and it affects the quality of hours supplied to the market sector and allocated to the production of home goods. Thus,

$$h_{Mt} + h_{Ht} + h_{Et} = 1. \tag{21}$$

Finally, the law of motion for market and household capital satisfies equation (7).

In Einarsson and Marquis's (1997) baseline calibration, a positive productivity shock in the market sector induces households to work more hours in the market and household sectors and decreases the number of hours devoted to accumulating human capital.
In turn, the increase in hours worked in the household sector increases the marginal return on capital in that sector, which introduces an incentive to invest in household capital upon a positive productivity shock. Unlike Greenwood and Hercowitz (1991), Einarsson and Marquis (1997) do not rely on a high correlation of productivity shocks in the market and non-market sectors. In fact, they assume that only the production of market goods is hit with productivity shocks. Nonetheless, as in Greenwood and Hercowitz (1991), they need to assume that capital and labor in the household sector are complements.

Even though the articles summarized in this section provide different tentative explanations for the positive co-movement of business and household investment, and the relative volatility of these two investment components, they cannot explain the leading behavior of household investment and the lagging behavior of business investment. Fisher (2007) succeeds in this respect after introducing a direct role for household capital as an input in market production. Fisher (2007) extends Gomme, Kydland, and Rupert (2001) by introducing an additional use for household capital: Households can affect total effective hours supplied to business firms ($\tilde{h}_M$). The technology for determining $\tilde{h}_M$ is specified by

$$\tilde{h}_{Mt} = L(k_{HMt}, z_{Ht} h_{Mt}) = k_{HMt}^{\mu}\, (z_{Ht} h_{Mt})^{1-\mu}, \tag{22}$$

where $k_{HMt}$ and $h_{Mt}$ denote the household capital and hours allocated to improve the quality of labor supply to business firms. As in Gomme, Kydland, and Rupert (2001), households produce a home good using household capital and labor:

$$c_{Ht} = H(k_{HHt}, z_{Ht} h_{Ht}), \tag{23}$$

where $k_{HHt}$ and $h_{Ht}$ denote the household capital and hours allocated to produce the home good. Note that unlike in Einarsson and Marquis (1997), households cannot affect the quality of the hours allocated to the production of home goods.
The uses of household capital are constrained by the total stock of household capital in the period, namely

$$k_{HMt} + k_{HHt} = k_{Ht}. \tag{24}$$

In this setup, household capital is not only useful to produce home consumption goods, but it indirectly enhances the ability to produce market goods. In that context, Fisher (2007) shows that the model can replicate the leading behavior of household investment over business investment. When the share of capital in the production of human capital ($\mu$) is below 0.25 (it is 0.19 in Fisher's calibration), the optimal response of households to a positive productivity shock in the market sector is first to increase their investment in household capital. This allows households to increase their effective labor supply over periods following the shock, where higher productivity shocks would tend to push up wages. In turn, the higher labor supply will augment the production of market goods in future periods, which also helps to account for the leading behavior of household investment. The “strong” initial increase in household investment takes place at the expense of market investment, which displays a modest increase in the shock period. The household raises market investment in the periods following the positive shock.

Models with Multiple-Market Sectors

Finally, Davis and Heathcote (2005) and Hornstein and Praschnik (1997) study the cyclical behavior of residential investment and/or purchases of durable consumption goods without resorting to household production. These studies consider a structure in which all goods are produced in the market and in which households derive direct utility from the acquisition of durable goods. That is, in both setups the household maximizes the same objective function defined in equation (13), with the additional restrictions $c_{Mt} = c_t$ and $c_{Ht} = k_{Ht}$.
Unlike the articles surveyed above that study economies with only one market sector, Davis and Heathcote (2005) and Hornstein and Praschnik (1997) consider economies with multiple market sectors. Davis and Heathcote (2005) consider a model with three intermediate inputs: construction ($b$), manufactures ($m$), and services ($s$) that are produced using labor and capital. Formally, let $y_{it}$ denote the production of intermediate good $i$:

$$y_{it} = F_i(k_{it}, z_{it} h_{it}), \quad \text{with } i \in \{b, m, s\}, \tag{25}$$

where $k_{it}$ and $h_{it}$ denote the capital and labor hours used in the production of intermediate input $i$. These three goods are the only inputs in the production of two final goods: a consumption/capital good ($M$) and a residential good ($R$). Thus,

$$y_{jt} = F_j(b_{jt}, m_{jt}, s_{jt}), \quad \text{with } j \in \{M, R\}, \tag{26}$$

where $y_{jt}$ denotes the production of final good $j$, and $b_{jt}$, $m_{jt}$, and $s_{jt}$ denote the quantities of each of the three intermediate goods in the production of $j$. The residential good must be combined with land ($x_{Lt}$) to produce houses ($x_{Ht}$), namely

$$x_{Ht} = F_H(x_{Lt}, x_{Rt}), \tag{27}$$

where the stock of land is constant and equal to 1, i.e., $x_{Lt} \leq 1$. In their setup, houses are the only durable consumption good. In Davis and Heathcote (2005) there are three alternative uses for market capital and four alternative uses for the household's endowment of hours, namely

$$k_{bt} + k_{mt} + k_{st} = k_{Mt}, \quad \text{and} \tag{28}$$

$$h_{bt} + h_{mt} + h_{st} + h_{Lt} = 1. \tag{29}$$

The law of motion for market capital, $k_M$, is the same as in equation (7), while the law of motion for the stock of houses is given by

$$k_{H,t+1} = (1 - \delta_H)^{1-\phi}\, k_{Ht} + x_{Ht}. \tag{30}$$

Finally, the resource constraint for final goods is given by

$$c_t + x_{Mt} + g_t = y_{Mt}, \tag{31}$$

where the government expenditures, $g_t$, are financed by labor and capital income taxes.
Davis and Heathcote (2005) show that the model can account for the co-movement between residential and nonresidential investment and the higher volatility of residential compared to nonresidential investment. The environment studied in Davis and Heathcote (2005) is quite different from the environment considered in previous studies. Davis and Heathcote (2005) carry out different experiments to identify the role of different features of the model. On page 753 they state that

First, although our Solow residual estimates suggest only moderate comovement in productivity shocks across intermediate goods sectors, comovement in effective productivity across final-goods sectors is amplified by the fact that both final-goods sectors use all three intermediate inputs, albeit in different proportions. Second, the production of new housing requires suitable new land, which is relatively expensive during construction booms. We find that land acts like an adjustment cost for residential investment, reducing residential investment volatility, and increasing co-movement. Third, construction and hence residential investment are relatively labor intensive. This increases the volatility of residential investment because following an increase in productivity less additional capital (which takes time to accumulate) is required to efficiently increase the scale of production in the construction sector. Fourth, the depreciation rate for housing is much slower than that for business capital. This increases the relative volatility of residential investment and increases co-movement, since it increases the incentive to concentrate production of new houses in periods of high productivity.

Hornstein and Praschnik (1997) propose a multi-sector economy in which the use of intermediate inputs helps to explain the co-movement of sectoral employment and output. Their article also offers an explanation for the leading pattern of household investment.
They consider a setup with two market sectors: one produces a durable good and the other produces a nondurable good. The durable good ($MX$) can be accumulated either as business capital or household capital. The nondurable good ($MC$) can be used either in consumption or as an input in the production of durable goods. Thus,

$$x_{MXt} + x_{MCt} + x_{Ht} = y_{MXt} = F_{MX}(k_{MXt}, z_{MXt} h_{MXt}, m_t) \quad \text{and} \tag{32}$$

$$c_{Mt} + m_t = F_{MC}(k_{MCt}, z_{MCt} h_{MCt}), \tag{33}$$

where $x_{it}$ denotes the investment in the stock of capital, $k_{it}$, $y_{MXt}$ denotes the production of durable goods, $k_{MXt}$ ($k_{MCt}$) and $h_{MXt}$ ($h_{MCt}$) denote the capital and labor hours used in the production of durable (nondurable) goods, $m_t$ denotes the amount of nondurable goods used as input in the production of durable goods, and $z_{MXt}$ ($z_{MCt}$) denotes a labor-augmenting productivity shock in the durable (nondurable) sector. The resource constraint for labor hours reads

$$h_{MXt} + h_{MCt} + h_{Lt} = 1, \tag{34}$$

while the law of motion for $k_{it}$ is the same as in equation (7), for $i \in \{MX, MC, H\}$. Note that in Hornstein and Praschnik (1997) investment decisions are nonreversible. This setup not only explains the co-movement between household and business investment but it also explains the leading pattern of household investment. We quote Hornstein and Praschnik (1997, 589) below:

Following a productivity increase in either sector, capital becomes more productive and in order to increase the production of capital goods investment in the durable goods sector increases whereas investment in the nondurable goods sector is postponed for one period. The positive wealth effect of a productivity increase raises household consumption of capital services, and household sector investment increases contemporaneously with the productivity shock. Since investment in the nondurable goods sector represents the bulk of business investment, household investment leads business investment.

4. CONCLUSION

A substantial fraction of societal consumption is not purchased in markets but rather is produced and consumed within households. This article describes the main characteristics of the behavior of household and business investment over the business cycle in the United States, and offers a summary of studies that have tried to explain the dynamics of these two investment components. Even though we have reached a better understanding of what economic relationships may help in explaining the behavior of these two investment components, more research is needed. For example, changes in the relative prices of houses could be playing a significant role as a propagation mechanism or as a coordination device across households. However, most existing studies abstract from changes in the relative price of houses, and the ones that allow for that channel generate house price movements that are not aligned with the data.

REFERENCES

Benhabib, Jess, Randall Wright, and Richard Rogerson. 1991. “Homework in Macroeconomics: Household Production and Aggregate Fluctuations.” Journal of Political Economy 99 (December): 1,166–87.

Brock, William A., and Leonard J. Mirman. 1972. “Optimal Economic Growth and Uncertainty: The Discounted Case.” Journal of Economic Theory 4 (June): 479–513.

Chang, Yongsung. 2000. “Comovement, Excess Volatility, and Home Production.” Journal of Monetary Economics 46 (October): 385–96.

Davis, Morris A., and Jonathan Heathcote. 2005. “Housing and the Business Cycle.” International Economic Review 46 (August): 751–84.

Einarsson, Tor, and Milton H. Marquis. 1997. “Home Production with Endogenous Growth.” Journal of Monetary Economics 39 (August): 551–69.

Fisher, Jonas D. M. 2007. “Why Does Household Investment Lead Business Investment over the Business Cycle?” Journal of Political Economy 115: 141–68.

Gomme, Paul, Finn E. Kydland, and Peter Rupert. 2001. “Home Production Meets Time to Build.” Journal of Political Economy 109 (October): 1,115–31.

Greenwood, Jeremy, and Zvi Hercowitz. 1991. “The Allocation of Capital and Time over the Business Cycle.” Journal of Political Economy 99 (December): 1,188–214.

Hornstein, Andreas, and Jack Praschnik. 1997. “Intermediate Inputs and Sectoral Comovement in the Business Cycle.” Journal of Monetary Economics 40 (December): 573–95.

Kydland, Finn E., and Edward C. Prescott. 1982. “Time to Build and Aggregate Fluctuations.” Econometrica 50 (November): 1,345–70.

Long, John B., Jr., and Charles I. Plosser. 1983. “Real Business Cycles.” Journal of Political Economy 91 (February): 39–69.

Lucas, Robert E. 1977. “Understanding Business Cycles.” Carnegie-Rochester Conference Series on Public Policy 5 (January): 7–29.

Economic Quarterly—Volume 95, Number 3—Summer 2009—Pages 289–313

Short-Term Headline-Core Inflation Dynamics

Yash P. Mehra and Devin Reilly

Many analysts contend that the Federal Reserve under Chairmen Alan Greenspan and Ben Bernanke has conducted monetary policy that focuses on core rather than headline inflation. The measure of core inflation used excludes food and energy prices.1 The main argument in favor of using core inflation to implement monetary policy is that core inflation approximates the permanent or trend component of inflation much better than does headline inflation, the latter being influenced more by transitory movements in food and energy prices. The empirical evidence favorable to the use of core inflation in policy is recently reviewed in Mishkin (2007b). This empirical evidence consists of examining short-term dynamics between headline and core inflation measures, indicating that, in samples that start after the early 1980s, headline inflation has reverted more strongly toward core inflation than core inflation has moved toward headline inflation.
However, the research reviewed also shows that the evidence indicating the reversion of headline inflation to core inflation is quite weak in samples that start in the 1960s, suggesting that headline-core inflation dynamics may not be stable over time.2

Thomas Lubik, Roy Webb, and Nadezhda Malysheva provided valuable comments on this article. The views expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mails: yash.mehra@rich.frb.org; devin.reilly@rich.frb.org.

1 The evidence suggesting the Federal Reserve under Chairman Greenspan focused on a core measure of inflation appears in Blinder and Reis (2005), Mehra and Minton (2007), and Mishkin (2007b).

2 See also Clark (2001), Blinder and Reis (2005), Rich and Steindel (2005), and Kiley (2008). These analysts use different empirical methodologies to come to the same conclusion that core inflation is better than headline inflation in gauging the trend in inflation if we focus on the samples that start in the early 1980s. For example, Kiley (2008) uses statistical models to extract directly the trend component of inflation and argues that, in the 1970s and early 1980s, core as well as headline inflation contains information about the trend; however, in the recent data, the trend is best gauged by focusing on core inflation. The evidence in Clark (2001), Blinder and Reis (2005), Rich and Steindel (2005), and Crone et al. (2008) is based on comparing the relative forecast performance of core and headline measures; only in recent data is core inflation the better predictor of future headline inflation.

In this article we re-examine the short-term dynamics between headline and core measures of inflation over a longer sample period of 1959–2007. We offer new evidence that headline-core inflation dynamics have indeed changed during this sample period and that this change in dynamics may be due to a change in the conduct of monetary policy in 1979.3 In particular, we examine such dynamics over three sub-periods: 1959:1–1979:1, 1979:2–2001:2, and 1985:1–2007:2. We consider the sub-sample 1985:1–2007:2, as it spans a period of relatively low and stable inflation. We consider both the consumer price index (CPI) and the personal consumption expenditure (PCE) deflator. The data used are biannual because the structural vector autoregression (VAR) model employed uses the Livingston survey data on the public's expectations of headline CPI inflation, which are available twice a year. However, the basic results on the change in short-term headline-core inflation dynamics are robust to using quarterly data and to including additional determinants of inflation in bivariate headline-core inflation regressions.

3 The evidence indicating that inflation dynamics have changed since 1979 appears in Bernanke (2007); Leduc, Sill, and Stark (2007); and Mishkin (2007a).

The empirical evidence presented here indicates headline and core measures of inflation are co-integrated, suggesting long-run co-movement. However, the ways these two variables adjust to each other in the short run and generate co-movement have changed across these sub-periods. In the pre-1979 sample period, when a positive gap opens up with headline inflation rising above core inflation, the gap is eliminated mainly as a result of headline inflation not reverting and core inflation moving toward headline inflation. This result suggests headline inflation is better than core inflation in assessing the permanent component of inflation. In post-1979 sample periods, however, the positive gap is eliminated as a result of headline inflation reverting more strongly toward core inflation than core inflation moving toward headline inflation. This suggests core inflation would be better than headline inflation in assessing the permanent component of inflation.

Recent research suggests a monetary policy explanation of this change in short-term headline-core inflation dynamics. We focus on a version of the monetary policy explanation suggested by the recent work of Leduc, Sill, and Stark (2007), which attributes the persistently high inflation of the 1970s to a weak monetary policy response to surprise increases in the public's expectations of inflation. In particular, using a structural VAR that includes a direct survey measure of expected (headline CPI) inflation, Leduc, Sill, and Stark show that prior to 1979, the Federal Reserve accommodated exogenous movements in expected inflation, as seen in the result that the short-term real interest rate did not increase in response to such movements, which then led to persistent increases in actual inflation. Such Federal Reserve behavior, however, is absent post-1979, leading to a decline in the persistence of inflation.

We illustrate that such a change in Federal Reserve behavior is also capable of generating the change in headline-core inflation dynamics documented above. In particular, when we consider a variant of the structural VAR model that includes expected headline inflation, actual headline inflation, actual core inflation, and a short-term nominal interest rate, we find that a change in the interest rate response to exogenous movements in expected headline inflation can explain the change in actual headline-core inflation dynamics.
Thus, prior to 1979, when the Federal Reserve accommodated exogenous movements in expected headline inflation, a surprise increase in expected headline inflation (say, due to higher energy and food prices) was not reversed, leading to persistent increases in actual headline inflation with core inflation moving toward headline inflation. A surprise increase in expected headline inflation thus generates co-movement between actual headline and core inflation measures. Since such Federal Reserve accommodation of shocks to expected headline inflation is absent post-1979, surprise increases in expected headline inflation are reversed, with actual headline inflation reverting to core inflation. In the most recent sample period, 1985:1–2007:2, surprise increases in expected headline inflation have no significant effect on core inflation, whereas surprise increases in core inflation do lead to increases in headline inflation, generating co-movement between headline and core CPI inflation measures. Since movements in food and energy prices are likely significant sources of movements in the public's expectations of headline inflation, this empirical work implies that the change in headline-core inflation dynamics may be due to the Federal Reserve having convinced the public it would no longer accommodate food and energy inflation.

The rest of the paper is organized as follows. Section 1 presents the main empirical results on the nature of the change in headline-core inflation dynamics across three sub-periods spanning the sample of 1959–2007. Section 2 presents and discusses results from recent research including a structural VAR model, suggesting a monetary policy explanation of the change in headline-core inflation dynamics documented in Section 1. Section 3 contains concluding observations.

1. EMPIRICAL RESULTS ON HEADLINE-CORE INFLATION DYNAMICS

In this section we present the econometric work consistent with a change in short-term headline-core inflation dynamics.
Figure 1, which charts headline and core measures of PCE and CPI inflation, provides a look at the behavior of these two measures of inflation during the sample period of 1959–2007. Two observations are noteworthy. The first is that headline and core measures of CPI and PCE inflation co-move over the sample period. The lower graph in each panel of Figure 1 charts the “core deviation,” measured as the gap between headline and core inflation rates. This series is mean stationary, consistent with co-movement. The second point to note is that, although Figure 1 shows that headline and core measures of inflation co-move in the long run, it is not clear from the figure how this co-movement arises. This co-movement may be a result of one series adjusting to the other, or both series adjusting to each other. We formally investigate such dynamics in this section.

[Figure 1 PCE and CPI Inflation Rates Since 1959. Panel A: PCE; Panel B: CPI. In each panel, the upper graph plots headline and core inflation and the lower graph plots the core deviation (headline minus core), in percent, 1959–2007.]

One approach to headline-core inflation dynamics uses the co-integration error-correction methodology popularized by Granger (1986) and Engle and Granger (1987), among others. Under this approach, one examines short-term inflation dynamics under the premise that headline and core inflation series may be nonstationary but co-integrated, indicating the presence of a long-run relationship between these two measures.
Using short-term error-correction equations, one can then estimate how these two series adjust if headline inflation moves above or below its long-run value implied by the co-integrating regression. Another approach treats the inflation series as being mean stationary in levels, especially during shorter sample periods.4 One infers short-term headline-core dynamics by examining the near-term responses of headline and core inflation measures to a core deviation. We employ both of these approaches.

4 Many analysts have noted the low power of unit root tests in detecting nonstationarity in series, arguing that inflation may not have a unit root when some more attractive alternative hypotheses are considered. For example, Webb (1995) argues that it is possible to reject the hypothesis of a unit root in inflation when the alternative hypothesis allows for the presence of breaks in monetary policy regimes. As noted in the main text of this article, we also examine short-term headline-core inflation dynamics, treating inflation as being stationary within each subperiod.

Unit Roots, Co-integration, and Short-Term Dynamics

To investigate whether there exists a long-run co-integrating relationship between headline and core measures of inflation, we first examine the unit root properties of these two series. Table 1 presents test results for determining whether headline ($\pi^H_t$) and core ($\pi^C_t$) inflation measures have unit roots. The test used is the t-statistic, implemented by estimating the augmented Dickey-Fuller (1979) regression of the form

$$X_t = m_0 + \rho X_{t-1} + \sum_{s=1}^{k} m_{1s}\,\Delta X_{t-s} + \varepsilon_t, \tag{1}$$

where $X_t$ is the pertinent variable, $\varepsilon_t$ is the disturbance term, and $k$ is the number of lagged first differences needed to make $\varepsilon_t$ serially uncorrelated. If $\rho = 1$, $X_t$ has a unit root. The null hypothesis $\rho = 1$ is tested using the t-statistic. As can be seen, the t-statistics reported in Table 1 are smaller in absolute value than the 5 percent critical value of −2.9 for levels of the inflation series but larger for first differences of these series, suggesting that inflation is nonstationary in levels but stationary in first differences over 1959:1–2007:2.

Table 1 Unit Root Tests

Augmented Dickey-Fuller regressions of the form (1), estimated with $X_t = \pi^H_t, \pi^C_t$ (levels) and $X_t = \Delta\pi^H_t, \Delta\pi^C_t$ (first differences). Biannual data from 1959:1–2007:2.

| Regression | Series | Measure | $\hat{\rho}$ | $t_{\hat{\rho}}$ | $k$ |
|---|---|---|---|---|---|
| Levels | Headline | CPI | 0.8362 | −2.3982 | 4 |
| Levels | Core | CPI | 0.8510 | −2.4537 | 1 |
| Levels | Headline | PCE | 0.8896 | −1.9427 | 4 |
| Levels | Core | PCE | 0.9092 | −2.1164 | 0 |
| First differences | Headline | CPI | −0.4085 | −6.0378 | 3 |
| First differences | Core | CPI | −0.4239 | −8.7399 | 1 |
| First differences | Headline | PCE | −0.4590 | −6.5147 | 3 |
| First differences | Core | PCE | −0.0851 | −10.6229 | 0 |

Notes: $\pi^H_t$ and $\pi^C_t$ are headline and core inflation, respectively, in levels, while $\Delta\pi^H_t$ and $\Delta\pi^C_t$ are first differences of headline and core inflation. $\hat{\rho}$ and the t-statistic, $t_{\hat{\rho}}$, for $\rho = 1$ are from the augmented Dickey-Fuller regressions. The series has a unit root if $\rho = 1$. The 5 percent critical value is −2.9. The number of lagged first differences ($k$) is chosen using the Akaike Information Criterion.

If headline and core inflation measures are nonstationary in levels, there may exist a long-run co-integrating relationship between them. We use a two-step Engle-Granger (1987) procedure to test for the presence of a long-run relationship. In step one of this procedure, we estimate by ordinary least squares (OLS) the regression of the form

$$\pi^H_t = a_0 + a_1 \pi^C_t + \mu_t, \tag{2}$$

where $\mu_t$ is the disturbance term. In step two, we investigate the presence of a unit root in the residuals of regression (2) using the augmented Dickey-Fuller test, implemented by estimating a regression of the form

$$u_t = \delta u_{t-1} + \sum_{s=1}^{k} b_{1s}\,\Delta u_{t-s}, \tag{3}$$

where $u_t$ is the residual. If $\delta = 1$, then there does not exist a long-run relationship between headline and core measures of inflation. The null hypothesis, $\delta = 1$, is tested using the t-statistic.

Table 2 Co-integration Tests

Panel A: Engle-Granger Test

| | $\hat{\alpha}_0$ | $\hat{\alpha}_1$ | $\hat{\delta}$ | $t_{\hat{\delta}}$ | $k$ |
|---|---|---|---|---|---|
| CPI | 0.1790 | 0.9714 | 0.2924 | −4.4380 | 3 |
| PCE | −0.0111 | 1.0408 | 0.3101 | −4.7694 | 3 |

Panel B: Johansen Test

| | $\lambda_1$ | $\lambda_2$ | Co-integrating Vector | LR |
|---|---|---|---|---|
| CPI | 0.2617 | 0.0852 | (−1.3725, 1.4368) | 28.8223** |
| PCE | 0.2554 | 0.0584 | (−1.8244, 1.9375) | 28.0137** |

Panel C: Fully Modified OLS Estimates

| | $\alpha_0$ | $\alpha_1$ | $s_1$ | $s_2$ |
|---|---|---|---|---|
| CPI | 0.0837 | 0.9956 | 0.9326 | 0.8841 |
| PCE | −0.0144 | 1.0427 | 0.3521 | 0.2481 |

Notes: Biannual data from 1959:1–2007:2. *10 percent significance, **5 percent significance. For the Engle-Granger (1987) test, $\hat{\alpha}_0$, $\hat{\alpha}_1$, $\hat{\delta}$, and the t-statistic for $\delta = 1$ in Panel A are from two regressions of the form $\pi^H_t = \alpha_0 + \alpha_1 \pi^C_t + u_t$ and $\tilde{u}_t = \delta \tilde{u}_{t-1} + \sum_{s=1}^{k} b_s\,\Delta\tilde{u}_{t-s}$. Headline and core measures are not co-integrated if the residual series, $\tilde{u}_t$, has a unit root, i.e., if $\delta = 1$. For the Johansen (1988) test, the table shows the two eigenvalues, $\lambda_1$ and $\lambda_2$, used in evaluating Johansen’s likelihood function, the estimated co-integrating vectors, and the likelihood ratio statistic, LR, for testing the null hypothesis of no co-integration. The LR is calculated as $-T \cdot \ln(1 - \lambda_1)$, where $T$ is the number of total observations. Critical values for LR are reported under the heading Case 1 in Hamilton (1994, 768, Table B.11). Panel C shows results from a fully modified OLS regression of the form $\pi^H_t = \alpha_0 + \alpha_1 \pi^C_t + u_t$. The statistic $s_1$ is the significance level of the test of the hypothesis $\alpha_1 = 1$, while $s_2$ is the significance level of the test of the hypothesis $\alpha_0 = 0$ and $\alpha_1 = 1$. See notes from Table 1 for variable definitions.
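The two-step procedure in regressions (2) and (3) can be illustrated with a minimal Python sketch. It is not the authors' code: the data are synthetic, the helper names `ols`, `adf_t_stat`, and `engle_granger` are hypothetical, and the lag length is fixed at one instead of being chosen by the Akaike Information Criterion. Critical values are not computed because the test statistic has a nonstandard distribution.

```python
import random

def ols(X, y):
    """OLS via (X'X)^-1 X'y; returns coefficients, standard errors, residuals."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | I], reduced in place to [I | (X'X)^-1].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [float(i == j) for j in range(k)] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    inv = [row[k:] for row in A]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se, resid

def adf_t_stat(x, k, constant=True):
    """Estimate x_t = m0 + rho*x_{t-1} + sum_s m_s*dx_{t-s} + e_t and return
    (rho_hat, t-statistic for the null rho = 1)."""
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]  # dx[t-1] = x_t - x_{t-1}
    X, y = [], []
    for t in range(k + 1, len(x)):
        row = ([1.0] if constant else []) + [x[t - 1]]
        row += [dx[t - s - 1] for s in range(1, k + 1)]  # lagged differences
        X.append(row)
        y.append(x[t])
    beta, se, _ = ols(X, y)
    i = 1 if constant else 0
    return beta[i], (beta[i] - 1.0) / se[i]

def engle_granger(head, core, k):
    """Step 1: regress headline on core. Step 2: ADF regression on the residuals."""
    (a0, a1), _, resid = ols([[1.0, c] for c in core], head)
    delta, t_delta = adf_t_stat(resid, k, constant=False)
    return a0, a1, delta, t_delta

# Synthetic data (an assumption): core is a random walk and headline equals
# core plus stationary noise, so the two series are co-integrated by design.
random.seed(0)
core = [0.0]
for _ in range(199):
    core.append(core[-1] + random.gauss(0, 1))
head = [c + random.gauss(0, 0.5) for c in core]
a0, a1, delta, t_delta = engle_granger(head, core, k=1)
```

In this constructed example the slope estimate should lie near one and the t-statistic for $\delta = 1$ should be strongly negative, which mirrors the pattern reported in Table 2, Panel A.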
Table 2, Panel A presents the pertinent t-statistic, which is large for both PCE and CPI inflation measures, leading to the rejection of the null hypothesis. These test results suggest headline and core measures of inflation are indeed co-integrated.

The Engle-Granger test is implemented above by assuming a particular normalization, regressing headline inflation on core inflation, and examining the presence of a unit root in the residuals of (2). For robustness with respect to normalization, we also implement the likelihood test of co-integration as in Johansen (1988). Table 2, Panel B reports the likelihood test results and estimated co-integrating vectors. The likelihood ratio statistic that tests the null hypothesis of no co-integrating vector against the alternative of one co-integrating vector is large, leading to the rejection of the null hypothesis.

In order to be able to carry out tests of hypotheses on parameters of the estimated co-integrating vectors, we re-estimate the co-integrating relationship (2) using a fully modified OLS estimator as in Phillips and Hansen (1990) because standard OLS estimates, though consistent, do not have the asymptotic normal distribution. The estimates are reported in Table 2, Panel C. As can be seen, the estimated long-run coefficient, $a_1$, is positive and statistically different from zero, suggesting the presence of a positive relationship between headline and core inflation measures. The estimated long-run coefficient, $a_1$, is not statistically different from unity, suggesting the headline measure of inflation moves one-for-one with the core measure in the long run. The significance level of the statistic that tests the null hypothesis $a_0 = 0$, $a_1 = 1$ is .88 using CPI and .25 using PCE. These significance levels are large, so the null hypothesis cannot be rejected.
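As a quick arithmetic check on the Johansen statistic, the formula in the notes to Table 2, $LR = -T \cdot \ln(1 - \lambda_1)$, can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the effective number of observations $T$ is not stated in the article, so it is backed out from the reported values instead.

```python
import math

def johansen_lr(eigenvalue, n_obs):
    """Likelihood ratio statistic LR = -T * ln(1 - lambda_1)."""
    return -n_obs * math.log(1.0 - eigenvalue)

def implied_n_obs(lr, eigenvalue):
    """Number of observations T implied by a reported LR and eigenvalue."""
    return lr / -math.log(1.0 - eigenvalue)

# Values reported in Table 2, Panel B.
t_cpi = implied_n_obs(28.8223, 0.2617)
t_pce = implied_n_obs(28.0137, 0.2554)
```

Both implied values come out close to 95, so the two rows of Panel B are mutually consistent with a common effective sample size.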
Having established above that headline and core measures of inflation co-move in the long run, we now investigate the sources of this co-movement by estimating short-term error-correction equations of the form given in (4) and (5):

$$\Delta\pi^H_t = b_0 + \lambda_h \mu_{t-1} + \sum_{s=1}^{k} \Delta\pi^H_{t-s} + \upsilon_t, \text{ and} \tag{4.1}$$

$$\Delta\pi^C_t = b_0 + \lambda_c \mu_{t-1} + \sum_{s=1}^{k} \Delta\pi^C_{t-s} + \upsilon_t, \tag{4.2}$$

where $\mu_{t-1}$ is the lagged residual from the co-integrating regression (2). Under the assumptions $a_0 = 0$, $a_1 = 1$, we can re-write (4) as (5):

$$\Delta\pi^H_t = b_0 + \lambda_h (\pi^H - \pi^C)_{t-1} + \sum_{s=1}^{k} \Delta\pi^H_{t-s} + \upsilon_t, \text{ and} \tag{5.1}$$

$$\Delta\pi^C_t = b_0 + \lambda_c (\pi^H - \pi^C)_{t-1} + \sum_{s=1}^{k} \Delta\pi^C_{t-s} + \upsilon_t. \tag{5.2}$$

Regressions (4) and (5) capture short-term dynamics between headline and core inflation measures, and the coefficients $\lambda_h$ and $\lambda_c$ indicate how headline inflation and core inflation adjust if a gap emerges between headline and core inflation rates. If $\lambda_h = 0$ and $\lambda_c > 0$, headline and core inflation stay together mainly by core inflation moving toward headline inflation. If $\lambda_h < 0$ and $\lambda_c = 0$, headline and core inflation stay together mainly by headline inflation moving toward core inflation. If $\lambda_h < 0$ and $\lambda_c > 0$, both series adjust, with headline inflation moving toward core inflation and core inflation moving toward headline inflation. The relative magnitudes of these adjustment coefficients convey information about which series adjusts more in response to a core deviation.

Table 3, Panel A, presents estimates of short-term error-correction (adjustment) coefficients, providing information about the ways these two series adjust over the three sub-samples considered. Focusing first on the adjustment coefficient, $\lambda_h$, that appears in headline inflation regressions, this estimated coefficient is positive and not statistically different from zero in the pre-1979 sample period, but is negative and statistically different from zero in the recent sample period, 1985:1–2007:2. This result holds for headline CPI as well as for headline PCE inflation.
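The interpretation of $\lambda_h$ and $\lambda_c$ in (5.1) and (5.2) can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the authors' estimation code: the data are synthetic and built so that only core inflation adjusts, the helper names are hypothetical, and each lagged difference is given its own coefficient.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[k] for row in A]

def adjustment_coefficient(own, head, core, k=2):
    """Coefficient on (head - core)_{t-1} in a regression of d(own)_t on a
    constant, the lagged gap, and k lags of d(own)."""
    d = [own[t] - own[t - 1] for t in range(1, len(own))]  # d[t-1] = own_t - own_{t-1}
    X, y = [], []
    for t in range(k + 1, len(own)):
        gap = head[t - 1] - core[t - 1]
        X.append([1.0, gap] + [d[t - s - 1] for s in range(1, k + 1)])
        y.append(d[t - 1])
    return ols(X, y)[1]

# Synthetic "pre-1979 style" data (an assumption): headline is a random walk
# that never adjusts, while core closes 80 percent of the gap each period.
random.seed(1)
head, core = [2.0], [2.0]
for _ in range(800):
    h_prev, c_prev = head[-1], core[-1]
    head.append(h_prev + random.gauss(0, 0.5))
    core.append(c_prev + 0.8 * (h_prev - c_prev) + random.gauss(0, 0.5))

lam_h = adjustment_coefficient(head, head, core)  # expected to be near zero
lam_c = adjustment_coefficient(core, head, core)  # expected to be near 0.8
```

With data generated this way, the estimate of $\lambda_h$ should be close to zero and that of $\lambda_c$ clearly positive, the case in which the two series stay together through core inflation moving toward headline inflation.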
These estimates of the adjustment coefficient, $\lambda_h$, suggest that if headline inflation is above core inflation, headline inflation reverts toward core inflation in the recent sample period but not in the pre-1979 sample period.

Focusing now on the adjustment coefficient, $\lambda_c$, that appears in core inflation equations, we see that results differ for CPI and PCE inflation measures. In core PCE inflation equations, the estimated coefficient is positive, large, and statistically significant in the pre-1979 sample period, but it becomes small and not statistically different from zero in the recent sample period, 1985:1–2007:2. This suggests that if headline inflation is above core inflation, core inflation moves toward headline inflation in the pre-1979 sample period but not in the recent sample period. For CPI inflation, the adjustment coefficient, $\lambda_c$, that appears in the core inflation equation declines substantially, from .91 in the pre-1979 sample period to .19 in the recent sample period. However, it remains statistically significant in the recent sample period, suggesting the CPI measure of core inflation has also moved somewhat toward headline inflation. Together, these short-term adjustment coefficients suggest that, whereas in the pre-1979 sample period headline and core measures of inflation stayed together as a result of core inflation moving toward headline inflation, in the recent sample period they have stayed together more as a result of headline inflation moving toward core inflation than core inflation moving toward headline inflation.

As a robustness check (discussed in detail later in this article), we re-estimate the short-term adjustment equations (5), augmented to include two lags of other economic determinants of inflation, such as changes in a short-term nominal interest rate and changes in the unemployment rate. Table 3, Panel B, presents the short-term adjustment coefficients from these augmented regressions.
We can see that these estimates of short-term adjustment coefficients yield qualitatively similar results about the change in headline-core inflation dynamics.5

5 The adjusted R-squared statistics provided in Table 3 appear reasonable given that short-term adjustment equations are estimated using first-differences of inflation measures.

Table 3 Short-Term Headline-Core Inflation Dynamics

Panel A: Bivariable Adjustment Regressions

$$\Delta\pi^H_t = \beta_0 + \lambda_h(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} \Delta\pi^H_{t-s} + v_t$$

$$\Delta\pi^C_t = \beta_0 + \lambda_c(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} \Delta\pi^C_{t-s} + v_t$$

| Sample | CPI $\lambda_h$ | $\bar{R}^2$ | CPI $\lambda_c$ | $\bar{R}^2$ | PCE $\lambda_h$ | $\bar{R}^2$ | PCE $\lambda_c$ | $\bar{R}^2$ |
|---|---|---|---|---|---|---|---|---|
| 1959:1–1979:1 | 0.4745 | −0.027 | 0.9141** | 0.433 | 0.4011 | −0.042 | 0.7734** | 0.406 |
| 1979:2–2001:2 | −0.0881 | 0.144 | 0.2917 | 0.136 | −0.8139** | 0.200 | −0.0483 | 0.107 |
| 1985:1–2007:2 | −0.6471* | 0.365 | 0.1943** | 0.264 | −0.6168** | 0.328 | 0.0763 | 0.203 |

Panel B: Multivariable Adjustment Regressions

$$\Delta\pi^H_t = \beta_0 + \lambda_h(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} (\Delta\pi^H_{t-s} + \Delta\pi^C_{t-s} + \Delta sr_{t-s} + \Delta ur_{t-s}) + v_t$$

$$\Delta\pi^C_t = \beta_0 + \lambda_c(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} (\Delta\pi^C_{t-s} + \Delta\pi^H_{t-s} + \Delta sr_{t-s} + \Delta ur_{t-s}) + v_t$$

| Sample | CPI $\lambda_h$ | $\bar{R}^2$ | CPI $\lambda_c$ | $\bar{R}^2$ | PCE $\lambda_h$ | $\bar{R}^2$ | PCE $\lambda_c$ | $\bar{R}^2$ |
|---|---|---|---|---|---|---|---|---|
| 1959:1–1979:1 | 0.3551 | 0.251 | 1.0793** | 0.520 | 0.2213 | 0.147 | 0.6770* | 0.354 |
| 1979:2–2001:2 | −0.2208 | 0.322 | 0.6665** | 0.601 | −0.2972 | 0.260 | 0.4519* | 0.335 |
| 1985:1–2007:2 | −0.7319** | 0.351 | 0.2701** | 0.465 | −0.5400 | 0.261 | 0.4158** | 0.280 |

Notes: *10 percent significance, **5 percent significance. The coefficients $\lambda_h$ and $\lambda_c$ are estimated using OLS; $\Delta sr_t$ is the first difference in the short-term nominal rate, defined as the three-month Treasury-bill rate; $\Delta ur_t$ is the first difference in the unemployment rate. See notes from Table 1 for the definitions of other variables.

Stationarity and Mean Reversion

We also examine short-term headline-core dynamics by focusing on the influence of core deviation on the longer-horizon behavior of inflation, assuming headline and core inflation measures are likely mean stationary in shorter sample periods.
If headline inflation is above core inflation and if adjustment occurs mainly as a result of headline inflation moving toward core inflation, we should expect headline inflation to decline in the near term. With that in mind, we examine the behavior of inflation over various forecast horizons as in (6.1) and (6.2):6

$$\pi^H_{t+f} - \pi^H_t = b_{0f} + \lambda_{hf}(\pi^H - \pi^C)_t + \sum_{s=1}^{k} b_{1f}\,\pi^H_{t-s} + \mu_{t+f}, \text{ and} \tag{6.1}$$

$$\pi^C_{t+f} - \pi^C_t = c_{0f} + \lambda_{cf}(\pi^H - \pi^C)_t + \sum_{s=1}^{k} c_{1f}\,\pi^C_{t-s} + \mu_{t+f}, \tag{6.2}$$

where $\pi^H_{t+f}$ is the $f$-periods-ahead headline inflation rate, $\pi^H_t$ is the current-period headline inflation rate, $\pi^C_t$ is the current-period core inflation rate, $\pi^H_t - \pi^C_t$ is the contemporaneous core deviation, $f$ is the forecast horizon, and $\mu_{t+f}$ is a mean-zero random disturbance term. Regressions (6.1) and (6.2) relate the change in inflation over the next $f$ (six-month) periods to the contemporaneous gap between headline and core inflation rates. If the coefficients, $\lambda_{hf}$, in (6.1) are generally negative and the coefficients, $\lambda_{cf}$, in (6.2) are zero, then core deviation is eliminated primarily as a result of headline inflation reverting to core inflation. In contrast, if the coefficients, $\lambda_{hf}$, in (6.1) are zero and the coefficients, $\lambda_{cf}$, in (6.2) are positive, core deviation is eliminated mainly as a result of core inflation moving toward headline inflation.

Table 4 presents estimates of the coefficients from regressions given in (6.1) and (6.2). The estimates are presented for forecast horizons of one to four periods in the future. Panel A presents estimates using CPI and Panel B uses PCE. Since results derived using CPI are broadly similar to those derived using PCE inflation, we focus on estimates derived using CPI. As can be seen, in the pre-1979 sample period the estimated coefficients $\lambda_{hf}$, $f = 1, 2, ..., 4$, are not statistically different from zero and $\lambda_{cf}$, $f = 1, 2, ..., 4$, are positive, confirming that the series have stayed together mainly as a result of core inflation moving toward headline inflation.
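The long-horizon regressions (6.1) and (6.2) can be sketched in the same spirit. This is a hedged illustration under stated assumptions: the data are synthetic and constructed so that the core deviation is purely transitory, the function names are hypothetical, each lagged level gets its own coefficient, and no standard errors are computed (overlapping horizons with $f > 1$ would require a serial-correlation correction).

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                g = A[r][c]
                A[r] = [vr - g * vc for vr, vc in zip(A[r], A[c])]
    return [row[k] for row in A]

def horizon_coefficient(own, head, core, f, k=2):
    """Coefficient on the core deviation (head - core)_t in a regression of
    own_{t+f} - own_t on a constant, the deviation, and k lagged levels of own."""
    X, y = [], []
    for t in range(k, len(own) - f):
        X.append([1.0, head[t] - core[t]] + [own[t - s] for s in range(1, k + 1)])
        y.append(own[t + f] - own[t])
    return ols(X, y)[1]

# Synthetic "recent-period style" data (an assumption): core is a persistent
# stationary series and headline equals core plus a transitory deviation.
random.seed(2)
core = [2.0]
for _ in range(399):
    core.append(2.0 + 0.9 * (core[-1] - 2.0) + random.gauss(0, 0.2))
head = [c + random.gauss(0, 1.0) for c in core]

lam_h1 = horizon_coefficient(head, head, core, f=1)  # headline equation
lam_c1 = horizon_coefficient(core, head, core, f=1)  # core equation
```

Because the deviation in this example disappears within one period, the headline coefficient should be strongly negative and the core coefficient close to zero, the pattern the text associates with headline inflation reverting to core inflation.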
In the most recent sample period, 1985:1–2007:2, however, the estimated coefficients $\lambda_{hf}$, $f = 1, 2, ..., 4$, are negative and $\lambda_{cf}$, $f = 1, 2, ..., 4$, are positive, suggesting that both series are adjusting to each other. Nevertheless, the relative magnitudes of the estimated adjustment coefficients suggest headline inflation has moved more toward core inflation than core inflation has moved toward headline inflation.

6 In previous research, analysts have focused only on equation (4.1), examining reversion in headline inflation. See, for example, Clark (2001), Cogley (2002), and Rich and Steindel (2005).

Table 4 Short-Term Headline-Core Inflation Dynamics

Long-Horizon Behavior of Inflation

$$\pi^H_{t+f} - \pi^H_t = b_{0,f} + \lambda_{hf}(\pi^H_t - \pi^C_t) + \sum_{s=1}^{2} \pi^H_{t-s} + \mu_{t+f}$$

$$\pi^C_{t+f} - \pi^C_t = b_{0,f} + \lambda_{cf}(\pi^H_t - \pi^C_t) + \sum_{s=1}^{2} \pi^C_{t-s} + \mu_{t+f}$$

Panel A: CPI (t-values in parentheses)

| $f$ | 1959:1–1979:1 $\hat{\lambda}_{hf}$ | 1959:1–1979:1 $\hat{\lambda}_{cf}$ | 1979:2–2001:2 $\hat{\lambda}_{hf}$ | 1979:2–2001:2 $\hat{\lambda}_{cf}$ | 1985:1–2007:2 $\hat{\lambda}_{hf}$ | 1985:1–2007:2 $\hat{\lambda}_{cf}$ |
|---|---|---|---|---|---|---|
| 1 | 0.2922 (1.4712) | 0.9476 (5.1898) | −0.7230 (−2.3898) | 0.1660 (0.6635) | −0.7101 (−4.9165) | 0.1858 (3.7078) |
| 2 | 0.1799 (0.6308) | 1.0523 (4.0652) | −0.8962 (−3.5360) | 0.2796 (1.1604) | −0.9658 (−5.9644) | 0.1906 (2.5528) |
| 3 | 0.2708 (0.8540) | 0.9571 (3.2132) | −0.6554 (−2.4870) | 0.1431 (0.5892) | −0.8059 (−4.9527) | 0.1478 (1.7234) |
| 4 | −0.4165 (−1.1036) | 0.5683 (1.5071) | −1.2379 (−3.9101) | −0.1730 (−0.6615) | −1.0563 (−6.2716) | 0.0934 (3.6687) |

Panel B: PCE (t-values in parentheses)

| $f$ | 1959:1–1979:1 $\hat{\lambda}_{hf}$ | 1959:1–1979:1 $\hat{\lambda}_{cf}$ | 1979:2–2001:2 $\hat{\lambda}_{hf}$ | 1979:2–2001:2 $\hat{\lambda}_{cf}$ | 1985:1–2007:2 $\hat{\lambda}_{hf}$ | 1985:1–2007:2 $\hat{\lambda}_{cf}$ |
|---|---|---|---|---|---|---|
| 1 | 0.1621 (0.6311) | 0.7294 (4.3905) | −0.9177 (−3.9444) | −0.1905 (−1.1215) | −0.7550 (−4.5498) | 0.0684 (0.6118) |
| 2 | −0.0373 (−0.1061) | 0.7377 (2.7125) | −1.2218 (−5.5721) | −0.0874 (−0.5304) | −0.9640 (−4.9353) | 0.2006 (1.6302) |
| 3 | −0.1051 (−0.2686) | 0.5343 (1.6707) | −0.8718 (−4.0458) | −0.0183 (−0.0981) | −0.8452 (−4.4408) | 0.1519 (1.2177) |
| 4 | −0.9059 (−2.1481) | 0.2232 (0.6207) | −1.5878 (−7.2154) | −0.6071 (−3.1880) | −1.1517 (−5.4391) | −0.0300 (−0.1976) |

Notes: $f$ is the number of periods in the forecasting horizon. Regressions are estimated including levels of lagged inflation. All regressions are estimated using OLS. See notes from Table 1 for variable definitions.

Robustness: Multivariate System, Data Frequency, and Sample Breaks

The changes in short-term headline-core inflation dynamics summarized above are derived using a bivariable framework, biannual data, and three sub-periods generated by breaking the sample in 1979 and 1984. Here, we present some additional evidence indicating that inference about the nature of the change in headline-core inflation dynamics remains robust to several changes in specification.

The first change in specification expands the regressions given in (5.1) and (5.2) to include other possible determinants of inflation, such as changes in a short-term nominal interest rate (capturing the possible influence of monetary policy actions) and changes in the unemployment rate (as a proxy for the influence of the state of the economy). We focus on the sign and statistical significance of the short-term adjustment coefficients in these expanded regressions. As already noted, estimates from these multivariate regressions (Table 3, Panel B) yield qualitatively similar inferences about the nature of the change in short-term headline-core inflation dynamics to those derived using bivariable regressions.

Rather than focus on three sub-periods, we estimate the short-term adjustment coefficients from the multivariate versions of regressions given in (5.1) and (5.2) using rolling regressions over a 19-year window.7 We estimate those regressions using biannual as well as quarterly data.
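The rolling-window idea can be sketched generically. The snippet below is only an illustration under stated assumptions: the data are hypothetical, the helper names are invented for this example, and a simple bivariate slope stands in for the multivariate adjustment regressions. A 19-year window of biannual data corresponds to 38 observations, and each estimate is indexed by the start of its window.

```python
def rolling_estimates(series_list, window, estimator):
    """Apply `estimator` to each consecutive window of the aligned series;
    element i of the result uses observations i, ..., i + window - 1."""
    n = len(series_list[0])
    return [estimator(*[s[start:start + window] for s in series_list])
            for start in range(n - window + 1)]

def slope(x, y):
    """Bivariate OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

WINDOW = 19 * 2  # 19 years of biannual observations

# Hypothetical data: 98 biannual observations in which the response of y to x
# switches from 0.9 to 0.2 halfway through the sample.
x = [((7 * t) % 11) - 5.0 for t in range(98)]
y = [(0.9 if t < 49 else 0.2) * x[t] for t in range(98)]
estimates = rolling_estimates([x, y], WINDOW, slope)
```

Windows that lie entirely in the first half recover the early coefficient, windows entirely in the second half recover the later one, and windows that straddle the break fall in between, which is the kind of gradual transition a rolling-coefficient chart displays.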
Since the results using biannual data are qualitatively similar to those derived using quarterly data and, since the results also appear robust to the use of CPI or PCE inflation, we focus on estimates derived using biannual data and CPI inflation. Panel A in Figure 2 charts estimates of the short-term adjustment coefficient, $\lambda_h$, from headline inflation regressions, and Panel B charts estimates of the short-term adjustment coefficient, $\lambda_c$, from core inflation regressions, with 95 percent confidence bands. In samples that begin in the 1960s or early 1970s, the short-term adjustment coefficient, $\lambda_h$, is usually positive but not statistically different from zero, whereas the short-term adjustment coefficient, $\lambda_c$, is positive and statistically different from zero, suggesting headline inflation does not revert, but rather core inflation moves toward headline inflation. In contrast, in samples that begin in the early 1980s, the short-term adjustment coefficient, $\lambda_h$, is instead negative and statistically significant, whereas the short-term adjustment coefficient, $\lambda_c$, is positive but not always statistically different from zero.

7 In the multivariable versions of (5.1) and (5.2), we include changes in a short-term nominal interest rate and changes in the unemployment rate, besides including lags of headline and core inflation rates.

[Figure 2 Rolling Window Regression: 19-Year Window, Biannual Data, CPI Inflation. First graph: estimate of the adjustment coefficient in the headline equation with 95 percent confidence band. Second graph: estimate of the adjustment coefficient in the core equation with 95 percent confidence band. Notes: Entries on the x-axis represent the start of the sample window for the coefficient estimate.]
This suggests that the gap between headline and core CPI inflation is eliminated as a result of headline inflation reverting toward core inflation rather than core inflation moving toward headline inflation. These results are qualitatively similar to those derived using bivariable regressions estimated across three chosen sample periods.

2. DISCUSSION OF RESULTS

What explains the change in the short-term headline-core inflation dynamics documented above? Recent research suggests a monetary policy explanation. Mishkin (2007a) provides evidence that in recent years inflation persistence has declined and inflation has become less responsive to changes in unemployment and other shocks. He attributes this change in inflation dynamics to the anchoring of inflation expectations as a result of better conduct of monetary policy. In a recent paper, Leduc, Sill, and Stark (2007) attribute the persistently high inflation of the 1970s to a weak monetary policy response to surprise increases in the public’s expectations of inflation. In particular, using a structural VAR that includes a direct survey measure of expected (headline CPI) inflation, Leduc, Sill, and Stark show that, prior to 1979, the Federal Reserve accommodated exogenous movements in expected inflation, as seen in the result that the short-term real interest rate did not increase in response to such movements, which then led to persistent increases in actual inflation. Such behavior, however, is absent post-1979. We argue below that such a change in the Federal Reserve’s accommodation of expected headline inflation is also capable of generating the change in actual headline-core inflation dynamics documented above.
We demonstrate this by using a variant of the structural VAR model that includes actual headline and core inflation measures.8

To explain further, consider a four-variable VAR that contains a direct survey measure of the public’s expectations of headline inflation, represented by the median Livingston survey forecast of the eight-month-ahead headline CPI inflation rate ($\pi^{eH}_t$). The other variables included in the VAR are actual headline CPI inflation ($\pi^H_t$), actual core CPI inflation ($\pi^C_t$), and a short-term nominal interest rate ($sr_t$). Following Leduc, Sill, and Stark (2007), we define and measure variables in such a way that survey participants making forecasts do not observe contemporaneous values of other VAR variables, thereby helping to identify exogenous movements in expected headline inflation.9 Using a recursive identification scheme $\{\pi^{eH}_t, \pi^H_t, \pi^C_t, sr_t\}$ in which expected inflation is ordered first and the short nominal interest rate is last, we examine and compare the impulse responses of actual headline and core inflation measures to surprise increases in expected headline inflation (and core inflation itself).

Figure 3 shows the responses of VAR variables to a one-time surprise increase in expected headline inflation for three sample periods: 1959:1–1979:1 (Panel A), 1979:2–2001:2 (Panel B), and 1985:1–2007:2 (Panel C). Figure 4 shows the responses to a one-time increase in core inflation. In these figures, and those that follow, the solid line indicates the point estimate, while the darker and lighter shaded regions represent 68 percent and 90 percent confidence bands, respectively.

Focusing on Figure 3, we highlight two observations. First, the effects of a surprise increase in expected headline inflation on actual headline and core measures of inflation have changed over time.
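The mechanics of a recursive identification scheme can be sketched as follows. This is a stylized illustration, not the model estimated in the article: the VAR is restricted to one lag, the coefficient and covariance matrices are hypothetical numbers, and the function names are invented for the example. The recursive ordering is imposed through the Cholesky factor of the residual covariance matrix, so a shock to the variable ordered first moves every variable on impact, while a shock to a later variable leaves the earlier ones unchanged on impact.

```python
def cholesky(S):
    """Lower-triangular P with P P' = S (S symmetric positive definite)."""
    n = len(S)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = S[i][j] - sum(P[i][m] * P[j][m] for m in range(j))
            P[i][j] = acc ** 0.5 if i == j else acc / P[j][j]
    return P

def impulse_responses(A, S, shock, horizons):
    """Responses of a VAR(1) y_t = A y_{t-1} + e_t to a one-standard-deviation
    orthogonalized shock to variable `shock` under the recursive ordering."""
    P = cholesky(S)
    resp = [row[shock] for row in P]  # impact response: column `shock` of P
    path = [resp]
    for _ in range(horizons):
        resp = [sum(a * r for a, r in zip(row, resp)) for row in A]
        path.append(resp)
    return path

ORDER = ["expected_headline", "headline", "core", "short_rate"]

# Hypothetical VAR(1) coefficients and residual covariance, for illustration only.
A = [[0.5, 0.2, 0.0, 0.0],
     [0.3, 0.4, 0.1, -0.1],
     [0.1, 0.1, 0.6, -0.1],
     [0.2, 0.1, 0.1, 0.7]]
S = [[1.0, 0.3, 0.2, 0.1],
     [0.3, 1.0, 0.4, 0.2],
     [0.2, 0.4, 1.0, 0.1],
     [0.1, 0.2, 0.1, 1.0]]

irf_expected = impulse_responses(A, S, shock=0, horizons=14)  # expected-inflation shock
irf_core = impulse_responses(A, S, shock=2, horizons=14)      # core-inflation shock
```

Each element of the returned path lists the responses of the four variables, in the order given by `ORDER`, at one horizon.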
In the pre-1979 sample period, a surprise increase in expected headline inflation is not reversed and leads to a persistent increase in actual headline and core inflation measures. However, in post-1979 sample periods, such effects have become weaker. In fact, in the most recent sample period, 1985:1–2007:2, a surprise increase in expected headline inflation is reversed and has no significant effect on actual headline and core inflation measures (compare responses in Panels A and C). These results suggest that, in the pre-1979 sample period, shocks to expected headline inflation can generate co-movement between headline and core measures of inflation and that this co-movement arises as a result of headline inflation not reverting to core inflation and core inflation moving toward headline inflation. In contrast, in the recent sample period, 1985:1–2007:2, a surprise increase in expected headline inflation does not generate co-movement between actual headline and core inflation measures because they are not affected by movements in expected headline inflation. As discussed below, a surprise increase in core inflation, however, can generate co-movement between headline and core measures of inflation in the most recent sample period.

8 For an empirical demonstration of the impact of change in policy on the stability of empirical models (the so-called Lucas critique), see Lubik and Surico (2006).

9 For further details see Leduc, Sill, and Stark (2007) and Mehra and Herrington (2008).

[Figure 3 Shock to Expected Headline Inflation. Panel A: 1959:1–1979:1; Panel B: 1979:2–2001:2; Panel C: 1985:1–2007:2. Each panel charts the headline inflation response, the expected headline response, the core inflation response, and the nominal interest rate response. Notes: Responses to a one standard deviation shock to expected headline CPI inflation. The responses are generated from a VAR with expected headline CPI inflation, actual headline CPI inflation, actual core CPI inflation, and the three-month Treasury bill rate. All responses are in percentage terms. In each chart, the darker area represents the 68 percent confidence interval and the lighter area represents the 90 percent confidence interval. The x-axis denotes six-month periods.]

Second, the interest rate responses shown in Figure 3 suggest monetary policy may be at the source of the above-noted change in the response of actual headline inflation to expected headline inflation shocks.
If we focus on the nominal interest rate response shown in Panel A, we see that the nominal interest rate does increase in response to a surprise increase in expected headline inﬂation, but that this increase in the nominal interest rate approximates the increase in expected headline inﬂation leaving the real interest essentially unchanged.10 The behavior of the real interest rate in response to surprise increases in expected headline inﬂation suggests that the Federal Reserve followed an accommodative monetary policy. However, in the sample period 1979:2–2001:2, the real interest rate rises sharply in response to a surprise increase in expected headline inﬂation, suggesting that the Federal Reserve did not accommodate shocks to expected headline inﬂation. In the most recent sample period, 1985:1–2007:2, there is no signiﬁcant response of the real interest rate to an expected inﬂation shock, because a surprise increase in expected headline inﬂation is reversed, having no signiﬁcant effect on actual headline and core inﬂation measures. Focusing on Figure 4, we see that it is only in the most recent sample period, 1985:1–2007:2, in which a surprise increase in core inﬂation leads to an increase in expected and actual headline inﬂation, generating co-movement between headline and core measures of inﬂation. This co-movement is generated as a result of headline inﬂation moving toward core inﬂation. Furthermore, the real interest rate does rise signiﬁcantly in response to a surprise increase in core inﬂation, suggesting that in conducting monetary policy the Federal Reserve appears to be focused on the core measure of inﬂation. In contrast, in the pre-1979 sample period, a surprise increase in core inﬂation does not lead to an increase in headline inﬂation and there is no signiﬁcant response of the nominal interest rate.11 10 We infer the response of the real interest rate to a shock by comparing the responses of the nominal interest rate and expected headline inﬂation. 
Thus, the expected real interest rate response is simply the short-term nominal interest rate response minus the expected headline inflation response.

11 However, in the pre-1979 sample period, a surprise increase in core inflation is reversed and leads to a decline (not increase) in expected and actual headline inflation. Even though the nominal interest rate does not increase in response to a positive shock to core inflation, the expected real interest rate does increase because of a decline in expected headline inflation. These responses suggest that the Federal Reserve was not as accommodative to shocks to core inflation as it was to shocks to expected headline inflation. As noted by several analysts, the Federal Reserve may have believed that shocks to food and energy prices are likely temporary and would not lead to persistent increases in headline inflation.

[Figure 4: Shock to Core Inflation. Panel A: 1959:1–1979:1; Panel B: 1979:2–2001:2; Panel C: 1985:1–2007:2. Each panel plots the Headline Inflation Response, Expected Headline Response, Core Inflation Response, and Nominal Interest Rate Response over horizons 0 to 14.]

Notes: Responses to one standard deviation shock to core CPI inflation. The responses are generated from a VAR with expected headline CPI inflation, actual headline CPI inflation, actual core CPI inflation, and the three-month Treasury bill rate. All responses are in percentage terms. In each chart, the darker area represents the 68 percent confidence interval and the lighter area represents the 90 percent confidence interval. The x-axis denotes six-month periods.

Together, the responses depicted in Figures 3 and 4 imply that, before 1979, headline and core inflation measures co-move mainly as a result of core inflation moving toward headline inflation, because the Federal Reserve accommodated surprise increases in the public’s expectations of headline inflation. A surprise increase in core inflation is simply reversed and does not lead to higher expected or actual headline inflation. Since 1979, however, the Federal Reserve has not accommodated increases in the public’s expectations of headline inflation, and hence co-movement has mainly arisen as a result of headline inflation moving toward core inflation.

Food and Energy Inflation

Since the measure of core inflation used here is derived excluding food and energy inflation from headline inflation, and since food and energy prices are likely to be a significant source of movements in expected headline inflation, the results discussed above imply that change in monetary policy response to expected headline inflation may reflect change in monetary policy response to movements in expected food and energy prices. Since we do not have any direct survey data on the public’s expectations of food and energy price inflation, we provide some preliminary evidence on this issue by examining responses to movements in actual food and energy inflation. With that in mind, we consider another variant of the structural VAR model that includes expected headline inflation, actual core inflation, the food and energy component of headline CPI inflation, and the short-term nominal interest rate. We continue to assume the baseline identification ordering $\{\pi^{eH}_t, \pi^{C}_t, (\pi^{H}_t - \pi^{C}_t), sr_t\}$, in which expected headline inflation is exogenous but food and energy price inflation is not. Food and energy inflation is measured as the gap between headline and core inflation rates.

[Figure 5: Shock to Food and Energy Component of Headline Inflation. Panel A: 1959:1–1979:1; Panel B: 1979:2–2001:2; Panel C: 1985:1–2007:2. Each panel plots the Core Inflation Response, Expected Headline Response, Food and Energy Inflation Response, and Nominal Interest Rate Response over horizons 0 to 14.]

Notes: Responses to one standard deviation shock to the food and energy component of headline CPI inflation. The responses are generated from a VAR with expected headline CPI inflation, core CPI inflation, food and energy inflation, and the three-month Treasury bill rate. All responses are in percentage terms. In each chart, the darker area represents the 68 percent confidence interval and the lighter area represents the 90 percent confidence interval. The x-axis denotes six-month periods.

Figure 5 shows responses to a surprise increase in the food and energy component of headline inflation over three sample periods: 1959:1–1979:1 (Panel A), 1979:2–2001:2 (Panel B), and 1985:1–2007:2 (Panel C). In the pre-1979 sample period a surprise temporary increase in food and energy prices has a significant effect on expected headline inflation, leading to a persistent increase in expected (and hence actual) headline inflation. Core inflation is also persistently higher in response to a surprise increase in food and energy inflation. These responses suggest that a surprise increase in food and energy inflation can generate co-movement between headline and core measures of inflation, with core inflation moving toward headline inflation. However, in post-1979 sample periods the positive response of expected headline inflation to a surprise increase in food and energy inflation weakens considerably.
More interestingly, in the most recent sample period, 1985:1–2007:2, a surprise increase in food and energy inflation has no significant effect on expected headline inflation, suggesting that the public believes increases in food and energy prices are unlikely to lead to a persistent increase in headline inflation (compare responses across Panels A through C).12

The response of the real interest rate to a surprise increase in food and energy prices implicit in Panels A through C suggests a monetary policy explanation of the decline in the influence of food and energy prices on expected headline inflation. In the pre-1979 period, the real interest rate does not change much because the rise in nominal interest rate matches the rise in expected headline inflation, suggesting an accommodative stance of monetary policy. However, in the sample period 1979:2–2001:2, the real interest rate rises significantly in response to a surprise increase in food and energy prices, suggesting that the Federal Reserve did not accommodate increases in food and energy prices. Hence, the decline in the influence of food and energy inflation on expected headline inflation since 1979 may be due to the Federal Reserve no longer accommodating shocks to food and energy prices. In the most recent sample period, 1985:1–2007:2, however, there is no significant response of the nominal (or real) interest rate to a surprise increase in food and energy prices, because a surprise increase in food and energy inflation has no significant effect on expected headline inflation. One plausible explanation of the absence of any significant effect of movements in food and energy inflation on expected headline inflation is that past Federal Reserve behavior has convinced the public that it would not accommodate food and energy inflation. As a result, surprise increases in food and energy inflation have no significant effect on expected headline inflation, suggesting the Federal Reserve has become credible.

But do shocks to food and energy inflation matter for expected headline inflation? The results of the variance decomposition of expected headline inflation presented in Table 5 are consistent with the decline in the influence of food and energy inflation on expected headline inflation since 1979. In the pre-1979 sample period, shocks to the food and energy component of inflation account for about 35 percent of the variability of expected headline inflation at a two-year horizon, whereas in the recent sample period, 1985:1–2007:2, they account for less than 4 percent of the variability of expected headline inflation at the same two-year horizon.

12 In the recent sample period, 1985:1–2007:2, a surprise increase in food and energy prices does feed into core inflation.

Table 5 Variance Decomposition of Expected Headline CPI Inflation

Ordering (all panels): π^eH, π^C, π^H − π^C, SR

Panel A: 1959:1–1979:1
Steps    π^eH       π^C       π^H − π^C    SR
1        100.000    0.000     0.000        0.000
2        83.939     0.979     11.621       3.461
3        56.768     3.739     32.307       7.186
4        49.345     4.427     35.859       10.370
6        45.326     7.700     36.358       10.616
8        44.895     10.508    35.244       9.353

Panel B: 1979:2–2001:2
Steps    π^eH       π^C       π^H − π^C    SR
1        100.000    0.000     0.000        0.000
2        75.964     12.141    7.771        4.125
3        62.899     22.970    10.382       3.749
4        55.940     29.473    10.809       3.777
6        49.066     35.928    10.363       4.644
8        45.673     39.126    9.858        5.343

Panel C: 1985:1–2007:2
Steps    π^eH       π^C       π^H − π^C    SR
1        100.000    0.000     0.000        0.000
2        66.457     27.758    0.081        5.704
3        55.816     35.223    2.653        6.309
4        50.830     39.187    3.676        6.307
6        49.452     41.004    3.676        5.867
8        49.412     41.533    3.659        5.394

Notes: Entries are in percentage terms, with the exception of those under the column labeled “Steps.” Those entries refer to n-step-ahead forecasts for which decomposition is done. π^eH is expected headline inflation, as measured by the Livingston Survey. See notes from Tables 1 and 3 for the definitions of the other variables.

3. CONCLUDING OBSERVATIONS

This article investigates empirically short-term dynamics between headline and core measures of CPI and PCE inflation over three sample periods: 1959:1–1979:1, 1979:2–2001:2, and 1985:1–2007:2. Headline and core inflation measures are co-integrated, suggesting long-run co-movement. However, the ways in which these two variables adjust to each other in the short run and generate co-movement have changed across these sample periods. In the pre-1979 sample period, when a positive gap opens up with headline inflation rising above core inflation, the gap is eliminated mainly as a result of headline inflation not reverting and core inflation moving toward headline inflation. These dynamics suggest headline inflation would be better than core inflation in assessing the permanent component of inflation.
In post-1979 sample periods, however, the positive gap is eliminated as a result of headline inflation reverting more strongly toward core inflation than core inflation moving toward headline inflation, suggesting core inflation would be better than headline inflation in assessing the permanent component of inflation. Although short-term headline-core inflation dynamics are investigated using biannual data, the basic result on change in inflation dynamics is robust to the use of quarterly data and to the inclusion of additional economic determinants of inflation in the bivariable headline-core inflation regressions. The results are also not sensitive to the precise breakup of the sample in 1979 and 1984.

Recent research suggests a monetary policy explanation of change in inflation dynamics. We focus on a version suggested in Leduc, Sill, and Stark (2007) that attributes the decline in the persistence of actual headline inflation to a change in the accommodative stance of monetary policy in 1979. We illustrate that such a change in monetary policy response to exogenous shocks to the public’s expectations of headline inflation can generate the change in headline-core inflation dynamics documented above. Before 1979, the Federal Reserve accommodated shocks to expected headline inflation: A surprise increase in expected headline inflation is not reversed, leading to a persistent increase in actual headline inflation and co-movement arising as a result of core inflation moving toward headline inflation. Since 1979 that has not been the case: A surprise increase in expected headline inflation is reversed and co-movement arises mainly as a result of headline inflation moving toward core inflation.
Since food and energy prices are likely a significant determinant of expected headline inflation, the results imply that the change in headline-core inflation dynamics may simply be due to the Federal Reserve no longer accommodating food and energy inflation. In the most recent sample period, a surprise increase in food and energy inflation has no significant effect on the public’s expectations of headline inflation. This result suggests that past Federal Reserve behavior has convinced the public that it would no longer accommodate food and energy inflation.

In previous research, analysts have often found that the empirical evidence indicating that core inflation is better than headline inflation at gauging the trend component of inflation is not robust across sample periods. The empirical work in this article explains this lack of robustness; namely, headline-core inflation dynamics changed with a change in the conduct of monetary policy in 1979. Hence, in sample periods beginning in the 1960s and ending in the 1980s or 1990s, the hypothesis that the trend component of inflation is best gauged by focusing only on core inflation may or may not be found consistent with the data.

REFERENCES

Bernanke, Ben S. 2007. “Inflation Expectations and Inflation Forecasting.” Remarks at the Monetary Economics Workshop of the NBER Summer Institute, Cambridge, Massachusetts.

Blinder, Alan S., and Ricardo Reis. 2005. “Understanding the Greenspan Standard.” Proceedings of Jackson Hole Symposium, Federal Reserve Bank of Kansas City.

Clark, Todd E. 2001. “Comparing Measures of Core Inflation.” Federal Reserve Bank of Kansas City Economic Review: 5–31.

Cogley, Timothy. 2002. “A Simple Adaptive Measure of Core Inflation.” Journal of Money, Credit and Banking 34 (February): 94–113.

Crone, Theodore M., N. Neil K. Khettry, Loretta J. Mester, and Jason Novak. 2008. “Core Measures as Predictors of Total Inflation.” Federal Reserve Bank of Philadelphia Working Paper 08-9.

Dickey, D. A., and W. A. Fuller. 1979. “Distribution of the Estimator for Autoregressive Time Series with a Unit Root.” Journal of the American Statistical Association 74: 427–31.

Engle, Robert F., and C. W. J. Granger. 1987. “Co-Integration and Error-Correction: Representation, Estimation and Testing.” Econometrica (March): 251–76.

Granger, C. W. J. 1986. “Development in the Study of Cointegrated Economic Variables.” Oxford Bulletin of Economics and Statistics 48: 213–28.

Hamilton, J. D. 1994. Time Series Analysis. Princeton, N.J.: Princeton University Press.

Johansen, S. 1988. “Statistical Analysis of Cointegrating Vectors.” Journal of Economic Dynamics and Control 12: 231–54.

Kiley, Michael T. 2008. “Estimating the Common Trend Rate of Inflation for Consumer Prices and Consumer Prices Excluding Food and Energy Prices.” Finance and Economics Discussion Series, Division of Research and Statistics and Monetary Affairs, Federal Reserve Board, Washington, D.C.

Leduc, Sylvain, Keith Sill, and Tom Stark. 2007. “Self-Fulfilling Expectations and the Inflation of the 1970s: Evidence from the Livingston Survey.” Journal of Monetary Economics: 433–59.

Lubik, Thomas, and Paolo Surico. 2006. “The Lucas-Critique and the Stability of Empirical Models.” Federal Reserve Bank of Richmond Working Paper 6.

Mehra, Yash P., and Brian D. Minton. 2007. “A Taylor Rule and the Greenspan Era.” Federal Reserve Bank of Richmond Economic Quarterly 93 (Summer): 229–50.

Mehra, Yash P., and Christopher Herrington. 2008. “On Sources of Movements in Inflation Expectations: A Few Insights from a VAR Model.” Federal Reserve Bank of Richmond Economic Quarterly 94 (Spring): 121–46.

Mishkin, Frederick S. 2007a. “Inflation Dynamics.” Working Paper 13147. Cambridge, Mass.: National Bureau of Economic Research (June).

Mishkin, Frederick S. 2007b. “Headline Versus Core Inflation in the Conduct of Monetary Policy.” Speech at the Business Cycles, International Transmission and Macroeconomic Policies Conference, Montreal, Canada, October 20.

Phillips, Peter, and Bruce E. Hansen. 1990. “Statistical Inference in Instrumental Variables Regression with I(1) Processes.” Review of Economic Studies 57 (January): 99–125.

Rich, Robert, and Charles Steindel. 2005. “A Review of Core Inflation and an Evaluation of its Measures.” Federal Reserve Bank of New York Staff Report 236.

Webb, Roy. 1995. “Forecasts of Inflation from VAR Models.” Journal of Forecasting (May): 267–86.

Economic Quarterly, Volume 95, Number 3, Summer 2009, Pages 315–334

Why Could Political Incentives Be Different During Election Times?

Leonardo Martinez

The literature on political cycles argues that the proximity of the next election date affects policy choices (Alesina, Roubini, and Cohen [1997]; Drazen [2000]; and Shi and Svensson [2003] present reviews of this literature).1 Evidence of such cycles is stronger for economies that are less developed, have younger democracies, have less government transparency, have less media freedom, have a larger share of uninformed voters in the electorate, and have a higher re-election value. Brender and Drazen (2005) find evidence of a political deficit cycle in a large cross-section of countries but show that this finding is driven by the experience of “new democracies.” The budget cycle disappears when the new democracies are removed from their sample. Similarly, using a large panel data set, Shi and Svensson (2006) find that, on average, governments’ fiscal deficits increase by almost 1 percent of gross domestic product in election years, and that these political budget cycles are significantly larger and statistically more robust in developing than in developed countries.
Using suitable proxies, they also find that the size of the electoral budget cycles increases with the size of politicians’ rents from remaining in power, and with the share of uninformed voters in the electorate. Akhmedov and Zhuravskaya (2004) use a regional monthly panel from Russia and find a sizable and short-lived political budget cycle (public spending is shifted toward direct monetary transfers to voters). They also find that the magnitude of the cycle decreases over time and with democracy, government transparency, media freedom, and voter awareness. They argue that the short length of the cycle explains underestimation of its size by studies that use lower frequency data.

For helpful comments, the author would like to thank Juan Carlos Hatchondo, Pierre Sarte, Anne Stilwell, and John Weinberg. The views expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System. E-mail: leonardo.martinez@rich.frb.org.

1 Related work studies how political turnover causes movements in the real economy. Partisan cycles are studied, for example, by Alesina (1987), Azzimonti Renzo (2005), Cuadra and Sapriza (2006), and Hatchondo, Martinez, and Sapriza (forthcoming). Hess and Orphanides (1995, 2001) and Besley and Case (1995) study how the presence of term limits introduces electoral cycles between terms (while I focus on cycles within terms).

Why would policymakers prefer to influence economic conditions at the end of their term rather than at the beginning of their term? This article discusses some answers to this question provided by the theoretical literature on political cycles. More generally, this article discusses agency relationships in which an important part of the compensation is decided upon infrequently.
For instance, my framework could be used to discuss incentives when a contract commits the employer to working with a certain employee for a number of periods, but allows the employer to replace this employee after the contract ends. Consider, for example, a professional athlete who signs a multi-year contract with a team, which is free to terminate its relationship with this athlete (or not) after the contract ends. Do athletes have stronger incentives to improve their performance just before their contract expires? Wilczynski (2004) and Stiroh (2007) present empirical evidence of a renegotiation cycle: performance improves in the year before the signing of a multi-year contract, but declines after the contract is signed. Renegotiation cycles resemble the cycles discussed in the political-economy literature. Even though my analysis applies to other employment relationships, for concreteness, this article refers to voters and policymakers.

I study political cycles in a standard three-period political-agency model of career concerns. An incumbent policymaker who starts his political career in period one with an average reputation can exert effort in periods one and two to increase his re-election probability. Each period, the incumbent’s performance depends on his ability, his effort level, and luck. Voters do not observe the incumbent’s ability, effort, and luck; instead, they observe his performance. Good current performance by the incumbent may signal that he is capable of good performance in the future. Voters re-elect the incumbent only if they expect that his performance will be good in the future. Since the incumbent wants to be re-elected, he may exert effort to improve his current performance.2

2 By assuming that the policymaker can influence the beliefs about his future performance, the literature on political cycles does not imply that he can fine-tune the aggregate economic effects of economic policy.
One may think that the policymaker is evaluated on the quality of services he provides. For instance, Brender (2003) finds that “the incremental student success rate during the mayor’s term had a significant positive effect on his reelection chances.” The quality of education depends on economic policy (for example, it depends on the resources the policymaker makes available for education). Thus, the policymaker may decide to make more resources available for education (instead of keeping resources for his favorite interest group or himself) in order to increase his re-election probability.

Earlier theoretical studies of political cycles succeeded in showing that in environments with asymmetric information about the incumbent’s unobservable and stochastically evolving ability (such as the one studied in this article), cycles can arise with forward-looking and rational voters. These studies show that political cycles may arise because the incumbent’s end-of-term performance may be more informative about the quality of his future (post-election) performance than his beginning-of-term performance. Therefore, the incumbent’s end-of-term actions (that influence his end-of-term performance) may be more effective in influencing the election result than his beginning-of-term actions (that influence his beginning-of-term performance). Consequently, the incumbent may have stronger incentives to improve his performance at the end of his term.

For expositional simplicity, these studies model this intuition in its most extreme form. That is, they assume that only the incumbent’s end-of-term action is effective in changing the election result (see, for example, Rogoff [1990], Shi and Svensson [2006], and the references therein). Thus, re-election concerns play a role only at the end of a term, and, therefore, political cycles arise.
These earlier studies make three assumptions that imply that the incumbent only affects his re-election probability by influencing his end-of-term performance. The first assumption is that at the time of the election, only the end-of-term ability is not observable. If beginning-of-term ability is observable, the incumbent cannot influence voters’ beliefs with his beginning-of-term actions and, therefore, cycles arise. The second assumption is that only end-of-term ability is correlated with post-election ability. Consequently, only voters’ inference about end-of-term ability directly influences their re-election decision. The third assumption is that output is a perfect signal of ability. This implies that voters can learn the incumbent’s end-of-term ability (which is correlated with his post-election ability) perfectly from his end-of-term performance, without considering his beginning-of-term performance. Therefore, beginning-of-term actions are not effective in changing the re-election probability.

The three assumptions described above imply strong asymmetries across periods. Political cycles in these earlier studies are a direct result of these asymmetries. In Martinez (2009b), I explain why political cycles may arise even if the incumbent’s end-of-term performance is not more informative about the quality of his future performance, and, consequently, the incumbent’s end-of-term actions are not more effective in influencing the election result.

In the model, the incumbent’s equilibrium effort choice depends on both the proximity of the next election and his reputation (which I refer to as the beliefs about his ability). Recall that we want to study how the proximity of elections affects policy choices. Consequently, by political cycles I mean differences in the incumbent’s choices within a term in office for a given reputation level.
For a given reputation level, why would the incumbent exert more effort closer to the election? If the incumbent’s reputation does not change between periods one and two, why would the incumbent exert more effort in period two than in period one? The key insight to the answer to these questions comes from the characterization of the incumbent’s effort-smoothing decision, which is such that he makes the marginal cost of exerting effort in period one (roughly) equal to the expected marginal cost of exerting effort in period two. This decision presents the typical intertemporal tradeoff in dynamic models: Having less utility in period one allows the incumbent to have more utility in period two. In this case, a lower expected effort level in period two compensates for a higher effort level in period one. In period one, the incumbent (whose reputation is average) knows that his reputation is likely to change and anticipates that this change will lead him to choose an effort level lower than the one he would choose in period two if his reputation remains average—extreme reputations imply low efforts. Consequently, the expected marginal cost of exerting effort in period two is lower than the marginal cost of the equilibrium period-two effort level for an average reputation (the marginal cost is an increasing function). Thus, the incumbent’s effort-smoothing decision implies that the marginal cost of the equilibrium period-one effort level—which is equal to the expected marginal cost of exerting effort in period two—is lower than the marginal cost of the equilibrium period-two effort level for the same (average) reputation. Therefore, for the same reputation, the period-one equilibrium effort level is lower than that of period two. That is, incentives to inﬂuence the re-election probability are stronger closer to the election. 
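The effort-smoothing argument above can be illustrated with a small numerical sketch. Everything in it is assumed for illustration and does not come from the article: the quadratic cost c(a) = a²/2 (so that the marginal cost is c'(a) = a), the three reputation states, their probabilities, and the period-two effort attached to each state. The sketch also ignores discounting and treats the smoothing condition as an exact equality, whereas the text describes it as holding only roughly.

```python
# Hypothetical period-two equilibrium effort for each reputation the incumbent
# may have in period two (illustrative numbers, not taken from the article).
# Effort is highest at the average reputation and low at extreme reputations.
effort_period_two = {"low": 0.10, "average": 0.40, "high": 0.10}

# Hypothetical probabilities, as seen from period one, of each period-two reputation.
reputation_probability = {"low": 0.25, "average": 0.50, "high": 0.25}


def marginal_cost(effort):
    """Marginal cost c'(a) = a under the assumed cost c(a) = a**2 / 2."""
    return effort


def inverse_marginal_cost(value):
    """Effort level whose marginal cost equals `value` (identity for c'(a) = a)."""
    return value


# Expected marginal cost of period-two effort, evaluated in period one.
expected_marginal_cost_two = sum(
    reputation_probability[state] * marginal_cost(effort_period_two[state])
    for state in effort_period_two
)

# Effort smoothing (taken here as an exact equality): the period-one marginal
# cost equals the expected period-two marginal cost.
effort_period_one = inverse_marginal_cost(expected_marginal_cost_two)

print("period-one effort:", effort_period_one)
print("period-two effort at average reputation:", effort_period_two["average"])
```

Because the period-two effort is below its peak whenever the reputation moves away from average, the expected marginal cost is pulled below the marginal cost at the average reputation, and the implied period-one effort is lower than the period-two effort for the same reputation.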
In another context, consider a professional athlete who has an average reputation at the beginning of a multi-year contract with a team and may want to exert effort in order to improve his reputation and obtain a good contract after his current contract ends. The discussion above indicates that the optimal strategy for the athlete is to wait until the end of his current contract to see whether it is worth exerting a high effort level. At the beginning of his current contract, he should choose an intermediate effort level. At the end of his contract, if his reputation remains average, he should choose a higher effort level. If his reputation became either very good or very bad (because his performance was very good or very bad), he should choose a lower effort level. Thus, for the same reputation level, the athlete exerts more effort at the end of his contract and there is a “renegotiation cycle.”

This article first characterizes a model with the three simplifying assumptions adopted in earlier studies. Then, each of the three assumptions described above is relaxed, and yet the model still generates cycles without assuming strong asymmetries across periods because of the effort-smoothing considerations I first described in Martinez (2009b).

The rest of this article is structured as follows. Section 1 presents the main elements of a standard model of political cycles. Section 2 characterizes a benchmark with the three simplifying assumptions adopted in earlier studies. These assumptions are relaxed in Sections 3, 4, and 5. Section 3 assumes that beginning-of-term ability is not observable. It is shown that this does not change the incumbent’s equilibrium decisions but it makes the optimal period-two effort level a function of the period-one effort level. In Section 4, I assume positive correlation between beginning-of-term ability and post-election ability.
I show that the incumbent still chooses to exert zero effort at the beginning of the term, but his end-of-term equilibrium effort level depends on his period-one ability. In Section 5, it is assumed that observing performance in one period is not sufficient to fully learn ability, and it is explained how the incumbent’s optimal effort-smoothing decision generates cycles. Section 6 concludes.

1. THE MODEL

This article presents a three-period political-agency model of career concerns. In period one, there is a new policymaker in office. At the beginning of period three, elections are held: Voters decide whether to re-elect the incumbent policymaker or replace him with a policymaker who was not previously in office.

The amount of public good produced by the incumbent policymaker in period $t$, $y_t$, is a stochastic function of his ability, $\eta_t$, and his effort level, $a_t$. In particular,

$$y_t = a_t + \eta_t + \varepsilon_t, \tag{1}$$

where $\varepsilon_t$ is a random variable.

Each period, the policymaker in office can exert effort to increase the amount of public good he produces. Voters do not observe the effort level (which is, of course, known by the incumbent policymaker). The incumbent and voters do not know the incumbent’s ability. The common belief about the ability of a new incumbent is given by the distribution of abilities in the economy.

The timing of events within each period is as follows. First, the incumbent decides on his effort level, after which $\eta_t$ and $\varepsilon_t$ are realized, and $y_t$ is observed.

Voters’ per-period utility is given by $y_t$. In period three, they decide on re-election in order to maximize the expected value of $y_3$.

A policymaker’s per-period utility is normalized to zero if he is not in office. He receives $R > 0$ in each period during which he is in charge of the production of the public good. The cost of exerting effort is given by $c(a)$, with $c'(a) \geq 0$, $c''(a) > 0$, and $c'(0) = 0$.
Let $\delta \in (0, 1)$ denote the voters’ and the incumbent’s discount factor. I use backward induction to solve for the subgame perfect equilibrium of this game.

2. A BENCHMARK

This section provides a benchmark following earlier studies of political cycles by assuming that only the ability in the last period before the election is not observable at the time of the election, that ability follows a first-order moving average process, and that output is a perfect signal of ability (see, for example, Rogoff [1990], Shi and Svensson [2006], and the references therein).

The first period a policymaker is in office, his ability is given by $\eta_t = \gamma_t$, and in every other period, $\eta_t = \gamma_t + \gamma_{t-1}$, where $\gamma_t$ is an i.i.d. random variable with mean $m_1$, differentiable distribution function $\Phi$, and density function $\phi$. When voters decide on re-election, $\gamma_1$ is known and $\gamma_2$ is not known. The production function is deterministic: $\varepsilon_t = 0$ for all $t$.

Observing output $y_t$ allows voters and the incumbent to compute the values of $\eta_t$ and $\gamma_t$ using their knowledge of the effort exerted by the incumbent and the production function. Let $\eta^v_t$ and $\eta^i_t$ denote the ability computed by voters and by the incumbent, respectively. Let $\gamma^v_t$ and $\gamma^i_t$ denote the value of $\gamma_t$ computed by voters and the incumbent, respectively. The incumbent knows the effort level he chooses and, therefore, he always can compute $\eta_t = y_t - a_t$ correctly (i.e., $\eta^i_t = \eta_t$). Using $\eta_1$, he can compute the value of $\gamma_2$: $\gamma^i_2 = y_2 - \eta_1 - a_2 = \gamma_2$.

Voters compute $\eta_2$ and $\gamma_2$ using equilibrium effort levels. They are rational and understand the game. In particular, they know the incumbent’s equilibrium strategy. At the time the incumbent decides his period-two effort level, he knows $a_1$ and $y_1$.
Recall that the latter is a function of $a_1$ and, therefore, we can summarize the information available to the incumbent by the effort component, $a_1$, and the stochastic component, $\eta_1 = y_1 - a_1$, of $y_1$. For any value of $\eta_1$ and $a_1$, let $\alpha_2(\eta_1, a_1)$ denote the incumbent’s equilibrium period-two effort level. Let $a_1^*$ denote the incumbent’s equilibrium period-one effort level. Voters compute

$$\gamma_{v2} = y_2 - \eta_1 - \alpha_2(\eta_1, a_1^*) = \gamma_2 + a_2 - \alpha_2(\eta_1, a_1^*). \tag{2}$$

In period three, there is no future re-election probability that could be influenced by the incumbent. Therefore, any policymaker would exert zero effort. Consequently, when forward-looking voters decide on re-election, they compare the incumbent’s period-three expected ability with the period-three expected ability of a policymaker who was not previously in office.

The incumbent’s period-three expected ability computed by voters is equal to $\gamma_{v2}$. The expected period-three ability of a policymaker who was not in office before is $m_1$. Consequently, voters re-elect the incumbent if and only if $\gamma_{v2} > m_1$. That is, the incumbent is re-elected if and only if $\gamma_2 + a_2 - \alpha_2(\eta_1, a_1^*) > m_1$, or equivalently $\gamma_2 > m_1 + \alpha_2(\eta_1, a_1^*) - a_2$. Thus, exerting effort in period two decreases the minimum realization of $\gamma_2$ that would allow the incumbent to be re-elected and, therefore, it increases the re-election probability.

The incumbent’s period-two maximization problem reads

$$\max_{a_2 \geq 0} \left\{ \delta R \left[ 1 - \Phi\left(m_1 + \alpha_2(\eta_1, a_1^*) - a_2\right) \right] - c(a_2) \right\}, \tag{3}$$

where $1 - \Phi\left(m_1 + \alpha_2(\eta_1, a_1^*) - a_2\right)$ is the probability of re-election. Note that the incumbent can compute equilibrium effort levels as voters do (all information available to voters is also available to the incumbent) and, therefore, he can compute $\alpha_2(\eta_1, a_1^*)$.
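The fixed-point logic behind problem (3) can be illustrated numerically. The sketch below is not part of the original analysis: it assumes that $\gamma_t$ is normally distributed with mean $m_1$ and standard deviation `sd`, and that the cost of effort is quadratic, $c(a) = k a^2 / 2$. The parameter values and function names are hypothetical. Given the effort level that voters conjecture, the incumbent’s best response is found by grid search over problem (3), and the conjecture is updated until it coincides with the best response. The last lines compare the result with $\delta R \phi(m_1) / k$, the value at which marginal cost equals marginal benefit under these assumptions.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal random variable (assumed distribution for gamma)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, sd):
    """Distribution function of the same normal random variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def objective(a2, conjecture, m1, sd, delta, R, k):
    """Objective in problem (3) with the assumed quadratic cost k * a^2 / 2."""
    threshold = m1 + conjecture - a2  # minimum gamma_2 that wins re-election
    reelection_prob = 1.0 - normal_cdf(threshold, m1, sd)
    return delta * R * reelection_prob - 0.5 * k * a2 ** 2

def best_response(conjecture, m1, sd, delta, R, k, a_max=2.0, steps=20000):
    """Grid search for the effort level that maximizes the objective."""
    grid = [a_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: objective(a, conjecture, m1, sd, delta, R, k))

def equilibrium_effort(m1=0.0, sd=1.0, delta=0.9, R=1.0, k=1.0, tol=1e-6):
    """Iterate on the voters' conjecture until it equals the best response."""
    conjecture = 0.0
    for _ in range(50):
        response = best_response(conjecture, m1, sd, delta, R, k)
        if abs(response - conjecture) < tol:
            break
        conjecture = response
    return conjecture

a2_star = equilibrium_effort()
closed_form = 0.9 * 1.0 * normal_pdf(0.0, 0.0, 1.0) / 1.0  # delta * R * phi(m1) / k
print(a2_star, closed_form)
```

The grid search avoids relying on a first-order condition, so it also works as a check that the objective is single-peaked for the chosen parameters; with a flatter cost function this need not hold.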
In this article, I characterize the incumbent’s equilibrium effort levels through the first-order condition of his maximization problems.³ Note that finding the equilibrium effort level requires solving a fixed-point problem. The effort level that maximizes the incumbent’s expected utility in (3) depends on the effort level voters use to compute the signal, $\alpha_2(\eta_1, a_1^*)$. In equilibrium, the incumbent’s effort level must be equal to the effort level voters use to compute the signal. The optimal period-two effort level satisfies

$$c'\left(\alpha_2(\eta_1, a_1)\right) = \delta R \phi\left(m_1 + \alpha_2(\eta_1, a_1^*) - \alpha_2(\eta_1, a_1)\right). \tag{4}$$

Let $a_2^*$ denote the period-two equilibrium effort level. In equilibrium, $a_1 = a_1^*$ and, therefore, $a_2^*$ satisfies

$$c'(a_2^*) = \delta R \phi(m_1) > 0. \tag{5}$$

³ As in previous models of political agency, assumptions are necessary to guarantee the concavity of these problems, in which the re-election probability may not be a concave function of the incumbent’s decision. For example, the first term in the objective function in (3) may not be globally concave. In order to assure global concavity of the incumbent’s problems, it is sufficient to assume enough convexity in the cost-of-effort function.

Equation (5) shows that the equilibrium effort level is such that the marginal cost of exerting effort is equal to the marginal benefit of exerting effort. The incumbent benefits from exerting effort because this increases the re-election probability. The marginal benefit of exerting effort is given by the change in the probability of re-election multiplied by $R$ (the value of winning the election) and the discount factor, $\delta$.

It should be mentioned that, in models of career concerns, equilibrium effort levels are typically inefficient (for a more thorough discussion of this issue, see Foerster and Martinez [2006]). The efficient effort level is the one a benevolent social planner would force the incumbent to exert (if he could observe the effort exerted by the incumbent). This effort level can be defined as the one at which the social marginal cost of exerting effort (the incumbent’s marginal cost) equals the social marginal benefit of exerting effort (the increase in output implied by an extra unit of effort, which according to the production function in equation (1) is equal to one). Since the incumbent’s marginal benefit of exerting effort, represented by the right-hand side of equation (5), is typically different from the marginal productivity of effort, the equilibrium effort level is typically inefficient. Furthermore, since the social marginal benefit and marginal cost of exerting effort are the same every period, political cycles (differences in effort levels within a term) imply inefficiencies.

Note that $a_2^*$ does not depend on $\eta_1$ or $a_1$. Equation (4) shows that, since the period-two equilibrium effort level does not depend on $\eta_1$ or $a_1$, off the equilibrium path (i.e., when $a_1 \neq a_1^*$) the optimal period-two effort level does not depend on $\eta_1$ or $a_1$ (for a more thorough discussion of how the history of the game affects the agent’s strategy in models of career concerns, see Martinez [2009a]). Furthermore, since $c'(a_2^*) > 0$, $a_2^* > 0$.

In period one, the incumbent anticipates equilibrium play in the subsequent periods. In particular, the incumbent anticipates that the probability of re-election is given by $1 - \Phi(m_1)$ and does not depend on his period-one effort level. Consequently, the period-one equilibrium effort level is given by $a_1^* = 0 < a_2^*$.

Thus, I have shown that, under the standard assumptions in earlier studies of political cycles, the incumbent can affect his re-election probability only with the last effort level prior to the election and, therefore, cycles appear (the incumbent only chooses a positive effort level in period two). In the next sections, I shall discuss the consequences of relaxing these assumptions.

3.
SYMMETRIC OBSERVABILITY

In Section 2, the incumbent’s period-one ability, $\eta_1$, was observable and, therefore, there was nothing the incumbent could do in period one to influence voters’ beliefs about his post-election ability and the re-election probability. In this section, I assume that $\eta_1$ is not observable. I will show that this complicates the analysis, but that not exerting effort in period one is still optimal for the incumbent. The period-two equilibrium effort level is also identical to the one found in Section 2. The assumption on the observability of $\eta_1$ only affects the incumbent’s off-equilibrium period-two optimal effort choices.

Let

$$\eta_{v1} = y_1 - a_1^* = \eta_1 + a_1 - a_1^* \tag{6}$$

denote the period-one ability computed by voters using the equilibrium effort level. Using $\eta_{v1}$ and the equilibrium effort strategies, voters compute

$$\gamma_{v2} = y_2 - \eta_{v1} - \alpha_2(\eta_{v1}, a_1^*) = \gamma_2 + a_2 - a_1 + a_1^* - \alpha_2(\eta_{v1}, a_1^*). \tag{7}$$

As in Section 2, the incumbent is re-elected if and only if $\gamma_{v2} > m_1$. He can compute $a_1^*$ and $\eta_{v1}$ as voters do and, therefore, he can compute $\alpha_2(\eta_{v1}, a_1^*)$. Thus, the incumbent’s period-two maximization problem reads

$$\max_{a_2 \geq 0} \left\{ \delta R \left[ 1 - \Phi\left(m_1 + a_1 - a_1^* + \alpha_2(\eta_{v1}, a_1^*) - a_2\right) \right] - c(a_2) \right\}. \tag{8}$$

The solution of problem (8), $\alpha_2(\eta_1, a_1)$, satisfies

$$c'\left(\alpha_2(\eta_1, a_1)\right) = \delta R \phi\left(m_1 + a_1 - a_1^* + \alpha_2(\eta_{v1}, a_1^*) - \alpha_2(\eta_1, a_1)\right). \tag{9}$$

In equilibrium, $a_1 = a_1^*$ and, therefore, $\alpha_2(\eta_1, a_1) = \alpha_2(\eta_{v1}, a_1^*)$ (see equation 6). Consequently, the period-two equilibrium effort level is the same as in Section 2 (i.e., it is given by $c'(a_2^*) = \delta R \phi(m_1) > 0$).

Note that, as in Section 2, the equilibrium period-two effort level does not depend on $\eta_1$ and $a_1$. However, if $\eta_1$ is not observable, off equilibrium the optimal period-two effort level depends on $a_1$. Let $\hat{\alpha}_2(a_1)$ denote this optimal effort level, which satisfies

$$c'\left(\hat{\alpha}_2(a_1)\right) = \delta R \phi\left(m_1 + a_1 - a_1^* + a_2^* - \hat{\alpha}_2(a_1)\right).$$

At the beginning of period two, the incumbent’s expected utility is given by

$$W_2(a_1) = R - c\left(\hat{\alpha}_2(a_1)\right) + \delta R \left[ 1 - \Phi\left(m_1 + a_1 - a_1^* + a_2^* - \hat{\alpha}_2(a_1)\right) \right]. \tag{10}$$

The period-one incumbent’s maximization problem is given by

$$\max_{a_1 \geq 0} \left\{ \delta W_2(a_1) - c(a_1) \right\}.$$

Recall that, since the incumbent’s period-one ability, $\eta_1$, is not observable, the period-one ability computed by voters, $\eta_{v1}$, is increasing with respect to $a_1$. Thus, in period one, the incumbent could choose a higher effort level in order to make voters believe that he has more ability. However, the incumbent’s continuation utility is lower when voters believe that his period-one ability is higher. There are two reasons for this. First, under the assumptions in this section (and in earlier studies of political cycles), only period-two ability is correlated with period-three ability and, therefore, only period-two ability directly influences the re-election decision. Consequently, the incumbent would only want to influence voters’ period-one inference in order to influence their period-two inference. Second, for any period-two output observation, $y_2$, voters’ inference about the period-two ability, $\gamma_{v2}$, is decreasing with respect to $\eta_{v1}$ (see equation 7). If $\eta_{v1}$ is higher, voters believe that $y_2$ is the result of a higher period-one ability and a lower period-two ability.

Since the incumbent’s continuation utility is lower when voters believe that his period-one ability is higher, $W_2(a_1)$ is decreasing with respect to $a_1$ (recall that equation 6 shows that $\eta_{v1}$ is increasing with respect to $a_1$). That is, the incumbent does not have incentives to exert effort in period one. If he exerted effort, he would both suffer the cost of exerting effort and decrease his continuation utility. Therefore, the period-one equilibrium effort level is given by $a_1^* = 0 < a_2^*$.
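A numerical sketch of equation (10) illustrates why $W_2(a_1)$ falls as $a_1$ rises. It rests on assumptions that are not in the original: $\gamma_t$ is normally distributed with mean $m_1$, the cost of effort is quadratic, $c(a) = k a^2 / 2$, and all parameter values are hypothetical. The off-equilibrium effort level $\hat{\alpha}_2(a_1)$ is obtained by bisection on its first-order condition, using $a_1^* = 0$ and $a_2^* = \delta R \phi(m_1) / k$.

```python
import math

M1, SD = 0.0, 1.0            # assumed mean and standard deviation of gamma
DELTA, R, K = 0.9, 1.0, 1.0  # assumed discount factor, office rent, cost scale

def pdf(x):
    z = (x - M1) / SD
    return math.exp(-0.5 * z * z) / (SD * math.sqrt(2.0 * math.pi))

def cdf(x):
    return 0.5 * (1.0 + math.erf((x - M1) / (SD * math.sqrt(2.0))))

A1_STAR = 0.0                      # period-one equilibrium effort
A2_STAR = DELTA * R * pdf(M1) / K  # solves c'(a2*) = delta * R * phi(m1)

def alpha2_hat(a1):
    """Solve K * x = DELTA * R * pdf(M1 + a1 - A1_STAR + A2_STAR - x) by bisection."""
    def foc(x):
        return K * x - DELTA * R * pdf(M1 + a1 - A1_STAR + A2_STAR - x)
    lo, hi = 0.0, 2.0              # foc(lo) < 0 < foc(hi) for these parameters
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def W2(a1):
    """Equation (10) with the assumed quadratic cost."""
    effort = alpha2_hat(a1)
    threshold = M1 + a1 - A1_STAR + A2_STAR - effort
    return R - 0.5 * K * effort ** 2 + DELTA * R * (1.0 - cdf(threshold))

values = [W2(a1) for a1 in (0.0, 0.2, 0.4, 0.6)]
print(values)
```

Under these assumptions the printed continuation values decline along the list, so a period-one deviation above $a_1^* = 0$ costs the incumbent both the effort itself and part of his continuation utility.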
Thus, equilibrium effort levels are identical to those found in Section 2, and the assumption on the observability of $\eta_1$ only affects the incumbent’s off-equilibrium period-two optimal effort choices.

4. A RANDOM WALK PROCESS FOR ABILITY

In the previous section, I showed that when the incumbent’s period-one ability is not correlated with his post-election ability (and, therefore, his period-one effort cannot directly influence the re-election probability), the incumbent does not want to exert effort in period one. This section studies the effects of allowing for correlation between the period-one ability and the post-election ability.

Following Holmström’s (1999) seminal paper on career concerns, I assume that $\eta_{t+1} = \eta_t + \xi_t$, where $\xi_t$ is normally distributed with mean 0 and precision $h_\xi$ (the variance is $1/h_\xi$), and it is unobservable. The common belief about the ability of a new incumbent is given by the distribution of abilities in the economy, which is normally distributed with mean $m_1$ and precision $h_\eta$ (these are the beliefs about the period-one incumbent’s ability). Thus, results presented in this section are a special case of the results presented in Martinez (2009b). Let $\phi(v; x, z)$ denote the density function for a normally distributed random variable $V$ with mean $x$ and precision $z$, and let $\Phi(v; x, z)$ denote the corresponding cumulative distribution function.

As in previous sections, the incumbent is re-elected if and only if his expected period-three ability is higher than the expected period-three ability of a policymaker who was not previously in office. That is, the incumbent is re-elected if and only if $\eta_{v2} = \eta_2 + a_2 - \alpha_2(\eta_{v1}, a_1^*) > m_1$ (i.e., the incumbent is re-elected if and only if $\eta_2 > m_1 + \alpha_2(\eta_{v1}, a_1^*) - a_2$). Thus, the incumbent’s period-two maximization problem reads

$$\max_{a_2 \geq 0} \left\{ \delta R \left[ 1 - \Phi\left(m_1 + \alpha_2(\eta_{v1}, a_1^*) - a_2; \eta_1, h_\xi\right) \right] - c(a_2) \right\}. \tag{11}$$

The solution of (11), $\alpha_2(\eta_1, a_1)$, satisfies

$$c'\left(\alpha_2(\eta_1, a_1)\right) = \delta R \phi\left(m_1 + \alpha_2(\eta_{v1}, a_1^*) - \alpha_2(\eta_1, a_1); \eta_1, h_\xi\right).$$
In equilibrium, $a_1 = a_1^*$ and, therefore, $\eta_{v1} = \eta_{i1} = \eta_1$ and $\alpha_2(\eta_{v1}, a_1^*) = \alpha_2(\eta_1, a_1^*)$. Let $a_2^*(\eta_1) \equiv \alpha_2(\eta_1, a_1^*)$ denote the period-two equilibrium effort level, which is given by

$$c'\left(a_2^*(\eta_1)\right) = \delta R \phi\left(m_1; \eta_1, h_\xi\right). \tag{12}$$

Note that, in this section, the period-two equilibrium effort level depends on the period-one ability $\eta_1$ (recall this was not the case in previous sections). The realization of the period-one ability affects the distribution of the period-two ability.

At the beginning of period two, the incumbent’s expected utility is given by

$$W_2(\eta_1, a_1) = R - c\left(\alpha_2(\eta_1, a_1)\right) + \delta R \left[ 1 - \Phi\left(m_1 + a_2^*(\eta_{v1}) - \alpha_2(\eta_1, a_1); \eta_1, h_\xi\right) \right],$$

where $\eta_{v1} = \eta_1 + a_1 - a_1^*$ is the period-one ability computed by voters, as in equation (6). The period-one incumbent’s maximization problem is given by

$$\max_{a_1 \geq 0} \left\{ \delta \int W_2(\eta_1, a_1) \phi\left(\eta_1; m_1, h_\eta\right) d\eta_1 - c(a_1) \right\}.$$

Let $a_2^{*\prime}(\eta_1)$ denote the derivative of the period-two equilibrium effort level with respect to the period-one ability. The following proposition presents the incumbent’s effort-smoothing decision (see Appendix A for the proof).

Proposition 1: There exists a unique period-one equilibrium effort level that satisfies

$$c'(a_1^*) = \delta \int -a_2^{*\prime}(\eta_1) \, c'\left(a_2^*(\eta_1)\right) \phi\left(\eta_1; m_1, h_\eta\right) d\eta_1. \tag{13}$$

The Euler equation (13) represents the typical intertemporal tradeoff in dynamic models: Having less utility in period one allows the incumbent to have more utility in period two. In this case, a lower expected effort level in period two compensates for a higher effort level in period one. The incumbent knows that he could affect the re-election probability by exerting effort in periods one and two. He could exert more effort in period one and less effort in period two (or vice versa) and still have the same re-election probability.

Equation (13) shows that the optimal effort-smoothing decision depends on the cost and the effectiveness of exerting effort in each period. In equation (13), $-a_2^{*\prime}(\eta_1)$ represents the relative effectiveness in changing $\eta_{v2}$ (and, therefore, the re-election probability) of $a_1$ (compared with $a_2$). The incumbent’s period-one effort level affects $\eta_{v1}$ directly, and it affects $\eta_{v2}$ through $\eta_{v1}$. His period-two effort level affects $\eta_{v2}$ directly. Thus, the relative effectiveness is the derivative of $\eta_{v2} = y_2 - a_2^*(\eta_{v1})$ with respect to $\eta_{v1}$. For example, if voters expect a lower period-two effort level from an incumbent who is perceived to be better, then, by choosing a higher effort level in period one, and making $\eta_{v1}$ higher, the incumbent would make voters expect a lower period-two effort level. Consequently, voters would think that the period-two outcome is the result of a lower period-two effort level and a higher period-two ability. Thus, the incumbent’s period-one effort would have a positive effect on the voters’ period-two learning.

This section introduces incentives to exert effort at the beginning of a term. These incentives were not present in previous sections, where beginning-of-term ability was not correlated with post-election ability. A positive relative effectiveness implies that period-one effort is effective in changing $\eta_{v2}$ (and, therefore, the re-election probability). Thus, in period one, the incumbent may want to exert effort. Recall that, in Section 2, the relative effectiveness is zero (period-one effort is not effective), and in Section 3 it is negative (with the moving-average assumption, the incumbent’s expected post-election ability is decreasing with respect to the beginning-of-term ability inferred by voters). In this section, the relative effectiveness of period-one effort could be positive. It could even be higher than one (implying that beginning-of-term effort is more effective than end-of-term effort in changing the re-election probability).
However, the next proposition shows that, even though the incumbent could use beginning-of-term effort to increase the re-election probability, under the assumptions in this section, the incumbent chooses to exert zero effort at the beginning of the term because the expected relative effectiveness is equal to zero (see Appendix B for the proof).⁴

Proposition 2: In period one, the incumbent chooses not to exert effort.

Loosely speaking, proposition 2 shows that the incumbent does not expect his period-one effort level to be effective in changing the re-election probability and, therefore, he does not exert effort in period one. There are two reasons for this. First, on average, the effect of period-one effort on period-two learning is zero. Second, period-one learning does not have a direct effect on the re-election probability (i.e., period-one effort may only affect the re-election probability through its effect on period-two learning). Since there is no noise in the production process, learning the incumbent’s period-two performance is enough to perfectly learn his type. Thus, the policymaker’s behavior is different closer to the election because we assume that his actions can only have a direct effect on the re-election probability closer to the election. The next section explains how the model can generate a cycle without this assumption.

5. A STOCHASTIC PRODUCTION FUNCTION

In previous sections, cycles arise because I assume differences across periods (besides the proximity of the election). In particular, in Section 4, I showed that assuming that output is a perfect signal of ability generates a strong asymmetry across periods. In this section I relax this assumption. In particular, as in Holmström (1999), I assume that $\varepsilon_t$ is a normally distributed random variable with expected value 0 and precision $h_\varepsilon$; consequently, I can interpret the results in Section 4 as the limit of the results presented in this section when $h_\varepsilon$ goes to infinity.
Thus, the model studied in this section is the one-election version of the model I study in Martinez (2009b).

⁴ As shown in the proof of proposition 2, the symmetry of the equilibrium effort strategy is necessary to prove this result. In Martinez (2009b), I show that, in a version of the model with more than three periods in which the incumbent can be re-elected more than once, even if the ability distribution is symmetric, the equilibrium effort strategy may not be symmetric.

Since there is noise in production, observing output only allows voters and the incumbent to compute a “signal” of the incumbent’s ability. This is in contrast with previous sections, where observing output allows voters and the incumbent to compute the incumbent’s ability. Define $s_t \equiv \eta_t + \varepsilon_t$. I refer to $s_t$ as the period-$t$ signal of the incumbent’s ability. Voters and the incumbent use the signal they compute to update their beliefs about the incumbent’s ability. From this point forward, belief refers to belief about the incumbent’s ability unless stated otherwise.

Beliefs are Gaussian and, therefore, they can be characterized by their mean and their precision. Depending on the precision of the shock that determines the evolution of the incumbent’s ability, $h_\xi$, the precision of beliefs may be increasing or decreasing with respect to the number of performance observations (see Holmström 1999).⁵ For simplicity, I assume that $h_\xi$ is such that the precision of beliefs is constant. That is, I assume

$$h_\xi = \frac{h_\eta^2 + h_\eta h_\varepsilon}{h_\varepsilon}. \tag{14}$$

By making an assumption that guarantees that the precision of beliefs is constant, I can keep track of their evolution by following the evolution of their mean. This simplifies the analysis.

⁵ In general, the precision of period-$t+1$ beliefs, $h_{t+1}$, is given by $h_{t+1} = \dfrac{(h_t + h_\varepsilon) h_\xi}{h_t + h_\varepsilon + h_\xi}$.

Equation (14) implies that for any $t$, the precision of the period-$t+1$ beliefs about the signal $s_{t+1}$ is equal to the precision of the period-$t$ beliefs about the signal $s_t$. This precision is given by

$$H \equiv \frac{h_\eta h_\varepsilon}{h_\varepsilon + h_\eta}. \tag{15}$$

Since beliefs about the signal are also Gaussian and have a constant precision, the evolution of these beliefs can also be summarized by the evolution of their mean, which is equal to the mean of the beliefs about ability.

As in previous sections, the incumbent is re-elected if and only if his expected period-three ability is higher than the expected period-three ability of a policymaker who was not previously in office. Let $m_{vt}$ and $m_{it}$ denote the mean of the voters’ and the incumbent’s beliefs at the beginning of period $t$ (from here on, at period $t$). I refer to a belief with mean $m$ as belief $m$. The incumbent is re-elected if and only if $m_{v3} > m_1$.

Bayes’ rule implies that the mean of beliefs at $t + 1$ is a weighted sum of the mean at $t$ and the period-$t$ signal. Equation (14) implies that the weight of the period-$t$ mean belief in the period-$t+1$ mean belief does not depend on the number of observations of the incumbent’s performance. This weight is given by

$$\mu = \frac{h_\eta}{h_\eta + h_\varepsilon}. \tag{16}$$

Let $s_{vt}$ and $s_{it}$ denote the period-$t$ signal computed by voters and by the incumbent, respectively. Since the incumbent knows the effort he exerted, he can compute the true signal, i.e., $s_{it} = y_t - a_t = s_t$. Thus,

$$m_{i,t+1} = \mu m_{it} + (1 - \mu) s_{it} = \mu m_{it} + (1 - \mu) s_t.$$

Voters compute the signal using equilibrium effort strategies. In Section 4, I wrote the incumbent’s period-two equilibrium strategy as a function of his period-one ability and effort level. In this section, at the time of the period-two effort decision, the incumbent does not know his period-one ability, but he has learned the signal $s_1$. Instead of writing his period-two equilibrium strategy as a function of $a_1$ and $s_1$, for expositional simplicity, I will write the equilibrium strategy as a function of $a_1$ and $m_2 = \mu m_1 + (1 - \mu) s_1$, that is, as $\alpha_2(m_2, a_1)$.
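The belief-updating formulas above are simple enough to check directly. The sketch below codes equations (14), (15), and (16), the general law of motion for the precision of beliefs given in footnote 5, and the updating rule for the mean. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
def constant_precision_h_xi(h_eta, h_eps):
    """Equation (14): the shock precision that keeps belief precision at h_eta."""
    return (h_eta ** 2 + h_eta * h_eps) / h_eps

def next_precision(h_t, h_eps, h_xi):
    """General law of motion for the precision of beliefs (footnote 5)."""
    return (h_t + h_eps) * h_xi / (h_t + h_eps + h_xi)

def signal_precision(h_eta, h_eps):
    """Equation (15): precision H of beliefs about the signal."""
    return h_eta * h_eps / (h_eps + h_eta)

def prior_weight(h_eta, h_eps):
    """Equation (16): weight mu on the current mean belief."""
    return h_eta / (h_eta + h_eps)

def update_mean(m_t, s_t, mu):
    """Mean belief next period: mu * m_t + (1 - mu) * s_t."""
    return mu * m_t + (1.0 - mu) * s_t

h_eta, h_eps = 2.0, 3.0      # hypothetical precisions
h_xi = constant_precision_h_xi(h_eta, h_eps)
mu = prior_weight(h_eta, h_eps)
m1, s1, s2 = 0.0, 1.0, -0.5  # hypothetical prior mean and signals
m2 = update_mean(m1, s1, mu)
m3 = update_mean(m2, s2, mu)
print(h_xi, next_precision(h_eta, h_eps, h_xi), signal_precision(h_eta, h_eps), mu, m2, m3)
```

Feeding the value of $h_\xi$ from equation (14) into the general law of motion returns $h_\eta$ again, which is the sense in which the precision of beliefs is constant. Applying the updating rule twice gives a period-three mean that places weight $\mu^2$ on $m_1$, $\mu(1 - \mu)$ on the first signal, and $1 - \mu$ on the second.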
Thus, the period-two signal computed by voters is given by

$$s_{v2} \equiv y_2 - \alpha_2(m_{v2}, a_1^*) = s_2 + a_2 - \alpha_2(m_{v2}, a_1^*), \tag{17}$$

where

$$m_{v2} = \mu m_1 + (1 - \mu) s_{v1} = \mu m_1 + (1 - \mu)\left(s_1 + a_1 - a_1^*\right) = m_2 + (1 - \mu)\left(a_1 - a_1^*\right).$$

Consequently,

$$m_{v3} = \mu m_{v2} + (1 - \mu) s_{v2} = \mu m_{v2} + (1 - \mu)\left[s_2 + a_2 - \alpha_2(m_{v2}, a_1^*)\right]. \tag{18}$$

Equation (18) shows how exerting effort helps the incumbent increase the re-election probability. The expected ability in the voters’ belief is increasing with respect to effort, and voters re-elect the incumbent if and only if they expect his ability to be good enough.

Recall that voters and the incumbent have the same period-one belief. Moreover, in any period in which the incumbent exerts the equilibrium effort level, voters and the incumbent compute the same signal. Consequently, in equilibrium, the voters’ and the incumbent’s beliefs coincide ($m_{vt} = m_{it}$).

The incumbent is re-elected if and only if $s_2 > \dfrac{m_1 - \mu m_{v2}}{1 - \mu} + \alpha_2(m_{v2}, a_1^*) - a_2$ (i.e., if and only if $m_{v3} > m_1$). Let $M_{v2}(m_2, a_1) \equiv m_2 + (1 - \mu)(a_1 - a_1^*)$ denote the mean of the voters’ period-two belief when $m_2$ is the mean of the incumbent’s period-two belief and $a_1$ is the period-one effort level. Thus, the incumbent’s period-two maximization problem can be written as

$$\max_{a_2 \geq 0} \left\{ \delta R \left[ 1 - \Phi\left( \frac{m_1 - \mu M_{v2}(m_2, a_1)}{1 - \mu} + \alpha_2\left(M_{v2}(m_2, a_1), a_1^*\right) - a_2; m_2, H \right) \right] - c(a_2) \right\}. \tag{19}$$

The following proposition shows that a unique fixed point that solves for the period-two equilibrium effort strategy exists (see Martinez [2009b] for the proof).⁶

Proposition 3 (Martinez 2009b): Let $m_2$ denote the voters’ and the incumbent’s beliefs at the beginning of period two. The unique period-two equilibrium effort strategy $a_2^*(m_2)$ satisfies

$$c'\left(a_2^*(m_2)\right) = \delta R \phi\left( \frac{m_1 - \mu m_2}{1 - \mu}; m_2, H \right) > 0. \tag{20}$$

Thus, for any reputation $m_2$, the equilibrium period-two effort level $a_2^*(m_2)$ is positive.

⁶ Note that, for $\mu = 0$ (and, therefore, for $m_2 = \eta_1$), the equilibrium effort strategy in equation (20) coincides with the one in equation (12).

Let $M_2(s_1) \equiv \mu m_1 + (1 - \mu) s_1$ denote the mean of the incumbent’s period-two posterior belief when $s_1$ is the signal he uses to update his prior. The period-one incumbent’s maximization problem is given by

$$\max_{a_1 \geq 0} \left\{ \delta \int W_2\left(M_2(s_1), a_1\right) \phi(s_1; m_1, H) \, ds_1 - c(a_1) \right\},$$

where

$$W_2(m_2, a_1) = R - c\left(\alpha_2(m_2, a_1)\right) + \delta R \left[ 1 - \Phi\left( \frac{m_1 - \mu M_{v2}(m_2, a_1)}{1 - \mu} + a_2^*\left(M_{v2}(m_2, a_1)\right) - \alpha_2(m_2, a_1); m_2, H \right) \right]$$

denotes the incumbent’s expected utility at the beginning of period two when his belief is characterized by $m_2$ and he chose $a_1$. The following proposition presents the incumbent’s period-one effort-smoothing decision (Martinez [2009b] presents the proof).

Proposition 4 (Martinez 2009b): There exists a unique and positive period-one equilibrium effort level $a_1^*$ that satisfies

$$c'(a_1^*) = \delta \mu \int_{-\infty}^{\infty} c'\left(\alpha_2(M_2(s_1))\right) \phi(s_1; m_1, H) \, ds_1 > 0. \tag{21}$$

In equation (21), the expected relative effectiveness in changing the re-election probability of the incumbent’s period-one effort (compared with his period-two effort) is represented by $\mu > 0$, which indicates the relative weight of $s_{v1}$ (compared with $s_{v2}$) in

$$m_{v3} = \mu^2 m_1 + (1 - \mu) s_{v2} + \mu (1 - \mu) s_{v1}.$$

Thus, the expected relative effectiveness, $\mu$, indicates the relative importance of the direct effect on the re-election probability of appearing more talented in period one (recall the incumbent is re-elected if and only if $m_{v3} > m_1$).⁷

Since the equilibrium period-two effort level in equation (20) is a function of the incumbent’s period-two reputation, $m_2$, differences in the incumbent’s behavior during his term in office could be the result of changes in his reputation and may not imply that he is deciding differently because the election time is closer.
I want to focus on differences in the incumbent’s behavior that are due to the proximity of the election. Therefore, I refer to differences in behavior across the incumbent’s term for a given reputation level as political cycles. The next proposition shows that the model generates such cycles (I present the proof in Martinez [2009b]).

Proposition 5 (Martinez 2009b): For the same reputation level ($m_1$), the period-two equilibrium effort level is higher than the period-one equilibrium effort level.

Recall that the lessened effectiveness of effort further from the election is the force behind political cycles in previous sections, which present this mechanism in its most extreme form by making assumptions that imply that beginning-of-term effort is not expected to be effective in increasing the re-election probability. In particular, the equilibrium strategy in Section 4 is a special case of the equilibrium strategy presented in this section for which period-one effort is not expected to be effective ($\mu = 0$). In contrast, proposition 5 shows that a standard model can generate cycles for all possible values of $\mu$. In particular, the model can generate cycles if the effectiveness of beginning-of-term actions is arbitrarily close to the effectiveness of end-of-term actions ($\mu$ is arbitrarily close to 1). The proposition also shows that discounting is not necessary for generating cycles in the model: Cycles arise for all values of $\delta$, including $\delta = 1$.

How could political cycles arise in an economy without discounting where manipulating policy is equally effective in every period? As I explain in Martinez (2009b), cycles could still arise in such an economy because at the beginning of his term, the incumbent knows that his reputation is likely to change, and he anticipates that this change will lead him to choose an effort level lower than the one he would choose at the end of his term for his beginning-of-term reputation level. Note first that the period-two equilibrium effort strategy defined in equation (20) is a hump-shaped function of the incumbent’s period-two reputation, $m_2$, as is the signal density function.⁸ That is, in period two, the incumbent exerts less effort when his reputation has more extreme values. Thus, in period one, he anticipates that if his reputation does not change, he will choose $\alpha_2(m_1)$ in period two. He also anticipates that, for example, if his period-one performance turns out to be either very good or very bad (and, therefore, his period-two reputation is either very good or very bad), he will exert a lower effort level in period two. In particular, the expected period-two effort level is lower than $\alpha_2(m_1)$, and the expected marginal cost of exerting effort in period two is lower than $c'(\alpha_2(m_1))$. Therefore, the effort-smoothing rule in (21) implies that $c'(a_1^*) < c'(\alpha_2(m_1))$, and the incumbent chooses $a_1^* < \alpha_2(m_1)$.

⁷ As in Section 4, because of the symmetry of the equilibrium period-two effort strategy, the incumbent does not expect that his period-one effort will affect the re-election probability through the period-two effort level used by voters for their period-two learning. In Martinez (2009a), I present a more thorough discussion of the relative effectiveness and this indirect effect of current-period effort on next-period learning.

In Martinez (2009b), I analyze the multiple-election version of the model presented in this section. That is, I analyze a model with more than three periods in which the incumbent could run for re-election more than once. Such a model allows for the study of situations that do not arise in the one-election version: With multiple elections, the beginning-of-term reputation may be better than the average reputation, and the end-of-term effort may not be maximized at the beginning-of-term reputation.
Recall that in the one-election version of the model, at the beginning of the term, there is a new incumbent with an average reputation, and the proof of proposition 5 (which shows that a political cycle arises in the one-election version of the model) is based on the end-of-term equilibrium effort strategy being such that it is optimal to exert the maximum effort level for the beginning-of-term reputation. In Martinez (2009b), I show that the insight described in the one-election version of the model helps us understand political cycles with multiple elections: For the same reputation, end-of-term effort is higher if, at the beginning of the term, the incumbent anticipates that changes in his reputation will, on average, lead him to choose an end-of-term effort level lower than the one he would choose for his beginning-of-term reputation. I also show that the model can generate expected end-of-term effort levels higher than the beginning-of-term effort level.

⁸ As I explain in Martinez (2009b), one can expect equilibrium effort to be hump-shaped in the incumbent’s belief if better incumbents are less (more) likely to produce bad (good) signals. One can expect equilibrium effort to be hump-shaped in the voters’ belief if extreme signals are less likely than average signals.

6. CONCLUSIONS

Using a career-concern model of political cycles, this article discusses why political incentives could be different in election times. First, I show that cycles could arise if end-of-term political actions are more effective in changing the re-election probability than beginning-of-term actions. Following earlier theoretical studies of political cycles, I model this intuition in its most extreme form.
In particular, I assume that at the time of the election, only the end-of-term ability is not observable; that only the incumbent’s end-of-term performance is correlated with his post-election performance; and that the incumbent’s performance is a perfect signal of his type. Then, I relax each of these assumptions and discuss how doing so affects the results. In particular, I show that the model still generates cycles without assuming strong asymmetries across periods because of the effort-smoothing considerations I first described in Martinez (2009b). The analysis in this article helps one understand other agency relationships in which an important part of the compensation is decided upon infrequently.

APPENDIX A: PROOF OF PROPOSITION 1

In equilibrium, $a_1 = a_1^*$ and, therefore, the first-order condition of the incumbent’s period-one problem reads

$$c'(a_1^*) = \delta \int -\delta R \, a_2^{*\prime}(\eta_1) \, \phi\left(m_1; \eta_1, h_\xi\right) \phi\left(\eta_1; m_1, h_\eta\right) d\eta_1. \tag{22}$$

Equation (12) shows that

$$\delta R \phi\left(m_1; \eta_1, h_\xi\right) = c'\left(a_2^*(\eta_1)\right). \tag{23}$$

Plugging equation (23) into equation (22), we obtain equation (13). Since there is a unique period-two equilibrium strategy, $a_2^*(\eta_1)$, defined by equation (12), there is a unique period-one equilibrium effort level, $a_1^*$, that can easily be obtained from equation (13) (the right-hand side of equation 13 does not depend on the period-one effort level).

APPENDIX B: PROOF OF PROPOSITION 2

Recall that $\phi(m_1; \eta_1, h_\xi)$ is symmetric with respect to $\eta_1$ with the maximum at $\eta_1 = m_1$. Consequently, $c'(a_2^*(\eta_1))$ is a symmetric function with the maximum at $\eta_1 = m_1$ (see equation 12). Moreover, $\phi(\eta_1; m_1, h_\eta)$ is a symmetric function with respect to $\eta_1$ with the maximum at $\eta_1 = m_1$. In addition, $a_2^{*\prime}(m_1) = 0$, and, for any $A \in \mathbb{R}$, $a_2^{*\prime}(m_1 + A) = -a_2^{*\prime}(m_1 - A)$ (see equation 12). Consequently,

$$\int a_2^{*\prime}(\eta_1) \, c'\left(a_2^*(\eta_1)\right) \phi\left(\eta_1; m_1, h_\eta\right) d\eta_1 = 0,$$

and according to equation (13), $a_1^* = 0$.

REFERENCES

Akhmedov, Akhmed, and Ekaterina Zhuravskaya. 2004.
“Opportunistic Political Cycles: Test in a Young Democracy Setting.” Quarterly Journal of Economics 119 (November): 1,301–38.

Alesina, Alberto. 1987. “Macroeconomic Policy in a Two-Party System as a Repeated Game.” Quarterly Journal of Economics 102 (August): 651–78.

Alesina, Alberto, Nouriel Roubini, and Gerald D. Cohen. 1997. Political Cycles and the Macroeconomy. Cambridge, Mass.: MIT Press.

Azzimonti Renzo, Marina. 2005. “On the Dynamic Inefficiency of Governments.” Manuscript, University of Texas at Austin.

Besley, Timothy J., and Anne Case. 1995. “Does Electoral Accountability Affect Economic Policy Choices? Evidence from Gubernatorial Term Limits.” Quarterly Journal of Economics 110 (August): 769–98.

Brender, Adi. 2003. “The Effect of Fiscal Performance on Local Government Election Results in Israel: 1989–1998.” Journal of Public Economics 87 (September): 2,187–205.

Brender, Adi, and Allan Drazen. 2005. “Political Budget Cycles in New Versus Established Democracies.” Journal of Monetary Economics 52 (October): 1,271–95.

Cuadra, Gabriel, and Horacio Sapriza. 2006. “Sovereign Default, Interest Rates and Political Uncertainty in Emerging Markets.” Working Paper 2006-02, Banco de México.

Drazen, Allan. 2000. “The Political Business Cycle After 25 Years.” NBER Macroeconomics Annual 15: 75–117.

Foerster, Andrew, and Leonardo Martinez. 2006. “Are We Working Too Hard or Should We Be Working Harder? A Simple Model of Career Concerns.” Federal Reserve Bank of Richmond Economic Quarterly 92 (Winter): 79–91.

Hatchondo, Juan Carlos, Leonardo Martinez, and Horacio Sapriza. Forthcoming. “Heterogeneous Borrowers in Quantitative Models of Sovereign Default.” International Economic Review.

Hess, Gregory D., and Athanasios Orphanides. 1995. “War Politics: An Economic, Rational-Voter Framework.” American Economic Review 85 (September): 828–46.

Hess, Gregory D., and Athanasios Orphanides. 2001. “Economic Conditions, Elections, and the Magnitude of Foreign Conflicts.” Journal of Public Economics 80 (April): 121–40.

Holmström, Bengt. 1999. “Managerial Incentive Problems: A Dynamic Perspective.” Review of Economic Studies 66: 169–82.

Martinez, Leonardo. 2009a. “Reputation, Career Concerns, and Job Assignments.” The B.E. Journal of Theoretical Economics (Contributions) 9: Article 15.

Martinez, Leonardo. 2009b. “A Theory of Political Cycles.” Journal of Economic Theory 144 (May): 1,166–86.

Rogoff, Kenneth. 1990. “Equilibrium Political Budget Cycles.” American Economic Review 80: 21–36.

Shi, Min, and Jakob Svensson. 2003. “Political Budget Cycles: A Review of Recent Developments.” Nordic Journal of Political Economy 29: 67–76.

Shi, Min, and Jakob Svensson. 2006. “Political Budget Cycles: Do They Differ Across Countries and Why?” Journal of Public Economics 90 (September): 1,367–89.

Stiroh, Kevin J. 2007. “Playing for Keeps: Pay and Performance in the NBA.” Economic Inquiry 45: 145–61.

Wilczynski, Adam. 2004. “Career Concerns and Renegotiation Cycle Effect.” Unpublished manuscript.
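NUMERICAL ILLUSTRATION OF THE SYMMETRY ARGUMENT IN APPENDIX B

The following sketch is not part of the original article. It illustrates the symmetry argument used in the proof of proposition 2 under assumptions introduced only for this purpose: a quadratic effort cost, c(a) = a²/2 (so that c'(a) = a); φ(x; μ, h) taken to be a normal density with mean μ and precision h; and arbitrary values for δ, R, m1, hξ, and hη. Under these assumptions, equation (23) gives the period-two strategy directly, and the integral in Appendix B can be approximated on a grid that is symmetric around m1.

```python
import math


def normal_pdf(x, mean, precision):
    """Normal density with the given mean and precision (1 / variance)."""
    scale = math.sqrt(precision / (2.0 * math.pi))
    return scale * math.exp(-0.5 * precision * (x - mean) ** 2)


def normal_pdf_slope(x, mean, precision):
    """Derivative of normal_pdf with respect to x."""
    return -precision * (x - mean) * normal_pdf(x, mean, precision)


# Hypothetical parameter values, chosen only for illustration.
DELTA, RENT = 0.9, 1.0            # discount factor and R
M1, H_XI, H_ETA = 0.3, 2.0, 1.5   # m1 and the two precisions


def a2_star(eta1):
    # Assumed cost c(a) = a^2 / 2, so c'(a) = a and equation (23) reads
    # a2*(eta1) = delta * R * phi(m1; eta1, h_xi).
    return DELTA * RENT * normal_pdf(M1, eta1, H_XI)


def a2_star_slope(eta1):
    # Derivative of a2* with respect to eta1. The normal density depends
    # only on (m1 - eta1)^2, so phi(m1; eta1, h) = phi(eta1; m1, h).
    return DELTA * RENT * normal_pdf_slope(eta1, M1, H_XI)


def integrand(eta1):
    # a2*'(eta1) * c'(a2*(eta1)) * phi(eta1; m1, h_eta), with c'(a) = a.
    return a2_star_slope(eta1) * a2_star(eta1) * normal_pdf(eta1, M1, H_ETA)


def symmetric_integral(half_width=10.0, steps=20000):
    """Trapezoid rule on [m1 - half_width, m1 + half_width]."""
    step = 2.0 * half_width / steps
    total = 0.5 * (integrand(M1 - half_width) + integrand(M1 + half_width))
    for i in range(1, steps):
        total += integrand(M1 - half_width + i * step)
    return total * step


integral = symmetric_integral()
# Substituting (23) into (22) gives c'(a1*) = -delta * integral;
# with c'(a) = a this is a1* itself.
a1_star = -DELTA * integral

print("integral:", integral)
print("implied a1*:", a1_star)
```

Under these assumptions the integrand is an odd function around m1, so the approximated integral vanishes up to floating-point error and the implied period-one effort is zero, in line with proposition 2. Other symmetric densities or other cost functions with c'(0) = 0 could be substituted in the same way.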