Economic Quarterly—Volume 95, Number 3—Summer 2009—Pages 235–267

Distortionary Taxation for
Efficient Redistribution
Borys Grochulski

This article uses a simple model to review the economic theory of
efficient redistributive taxation. Three main results are presented.
The first is the classic competitive equilibrium efficiency result: trade
in competitive markets leads to an efficient final (i.e., equilibrium) allocation
of consumption among the agents in the economy. The equilibrium allocation
is determined by market supply and demand forces. In our model economy, the
equilibrium allocation is determined uniquely. This efficiency result, known
as the First Welfare Theorem, provides a strong argument supporting the view
that unobstructed competitive market forces can be relied on to determine
the allocation of consumption in the economy. One must observe, however,
that the competitive market equilibrium supports one efficient allocation, i.e.,
competitive markets support one particular distribution of the total gains from
trade that are available in the economy. There are an infinite number of
ways in which the total gains from trade can be efficiently divided among
the agents. Thus, absent redistribution, almost all efficient divisions of the
gains from trade are inconsistent with the competitive market mechanism. In
other words, the competitive market mechanism guarantees efficiency but also
imposes on the society one particular division of the welfare gains from trade.
It is entirely possible that the agents in the economy may prefer to divide the
gains from trade differently. In fact, there is no a priori reason to believe that
the society’s most preferred division of the gains from trade should happen to
coincide with that imposed by the market mechanism. Thus, for distributional
reasons, the competitive market allocation will almost surely be suboptimal.
The second result we review describes the classic solution to the distributional problems associated with the competitive market mechanism: wealth
The author would like to thank Kartik Athreya, Leonardo Martinez, Sam Henly, and Ned
Prescott for their helpful comments. The views expressed in this article are those of the
author and do not necessarily reflect those of the Federal Reserve Bank of Richmond or the
Federal Reserve System. E-mail: borys.grochulski@rich.frb.org.


transfers. If society prefers a different division of the gains from trade than
the one brought about by competitive market forces, it is sufficient to transfer
wealth among the agents in order to correct it. Such wealth transfers can
be implemented via simple lump-sum transfers and taxes levied by the government. This result, known as the Second Welfare Theorem, however, poses
strong requirements on the quality of information available to the government.
Lump-sum taxes, by definition, depend only on agents’ types (and not on their
actions). In order to use lump-sum taxes, thus, the government must possess
sufficient information about agents’ types, on a person-by-person basis. If
public information is not sufficiently detailed, the required wealth transfers
and taxes cannot be applied because the government is unable to determine
which agents should be taxed and which should receive a transfer. In this
situation, the Second Welfare Theorem breaks down: Lump-sum taxes are
insufficient to achieve any division of the gains from trade other than the one
implied by the competitive market mechanism.
The third result we review concerns the problem of efficient redistribution
of the total gains from trade in the case of incomplete public information.
Here, the inefficacy of lump-sum taxes creates a role for distortionary taxation. A tax is called distortionary if the amount due from an agent depends on
his actions. If an activity is subject to a distortionary tax, then by avoiding the
activity the agent can avoid the tax, which distorts his incentive to engage in
this activity. The ability to influence agents’ incentives is exactly what makes
distortionary taxes useful. The tax-imposed distortions can be designed to
offset the distortions resulting from incomplete information. Such corrective
distortions, clearly, cannot be generated by lump-sum taxes, which are nondistortionary. The third main result we review in this article is a version of the
Second Welfare Theorem modified to include distortionary taxes. Within our
model economy, we fully characterize a distortionary tax system sufficient to
achieve any efficient division of the gains from trade available in our economy
when public information is incomplete. This tax system consists of a lump-sum-funded subsidy to sufficiently large capital trades. Depending on which
among the infinite number of efficient divisions of the gains from trade is to
be implemented, the subsidy can go to either those who sell or to those who
buy capital in the competitive market.
The model economy is a two-period, deterministic Lucas-tree economy in
which income comes from a stock of productive capital. In each period, one
unit of the capital stock produces y units of a single, perishable consumption
good. The capital stock is fixed, i.e., no physical investment is possible. The
size of the economy’s total capital stock is normalized to unity. Agents, who
own the capital stock in equal shares, are heterogenous with respect to their
preference for early versus late consumption. In particular, there are two
types of agents in this economy: the patient ones, whose marginal utility from


consuming in the first period is relatively low, and the impatient ones, whose
marginal utility from consuming in the first period is relatively high.
Efficient divisions of the welfare gains from trade are represented by
Pareto-efficient allocations of consumption. For the model economy we
consider, Pareto-efficient allocations of consumption are characterized in
Grochulski (2008). This characterization describes the full set of possibilities for feasible and efficient division of the total gains from trade among the
patient and impatient agents—both in the case of complete public information
and in the case of agents’ private knowledge of their impatience type. Having
this description in hand, we can consider, in the present article, the question
of how any given such division can be implemented in a competitive market
economy. In particular, we focus on the role the government has in supporting
the socially preferred division of total gains from trade through redistributive
taxation.
The question of efficient redistribution and its implementation through
taxation has long been studied in economics. The definitive treatment of
the classical theory of efficiency and distributional properties of competitive
markets under full information and with no externalities is given in Debreu
(1959).1 The first two of the three main results we review in this article are
simple special cases of the welfare theorems provided in Debreu (1959).2 In
a seminal paper, Mirrlees (1971) takes on the same question while explicitly
recognizing that government information may be incomplete. The third main
result we present is a version of the optimal distortionary taxation result of
Mirrlees (1971). Our model economy differs from that of Mirrlees (1971) in
that ours is a pure-capital-income, general equilibrium economy while in the
one studied in Mirrlees (1971) all income comes from labor. Mathematically,
however, our model economy is a simplified version of the model studied in
Mirrlees (1971).
Stiglitz (1987) reviews the literature on optimal redistributive taxation in
economies with private information.3 Kocherlakota (2007) surveys a related
literature on tax systems implementing optimal social insurance in dynamic
economies with ex-post private information.
The body of this article is organized as follows. Section 1 describes
in detail the economy we study. Section 2 defines competitive equilibrium.
Section 3 demonstrates the efficiency of competitive equilibrium indirectly
as well as through direct computation. Section 4 shows the sufficiency of
lump-sum taxes for efficient redistribution under full information. Also, it
1 Pigou (1932) initiated the by now extensive, and still actively developing, literature on
corrective distortionary taxation in economies with externalities.
2 Chapter 16 of Mas-Colell, Whinston, and Green (1995) contains an excellent textbook treatment of these results.
3 Werning (2007) is among recent contributions to this literature.


demonstrates the inefficacy of lump-sum taxes when agents’ types are not observed by the government. Section 5 defines a general class of distortionary
tax systems. There, also, it is shown that a simple tax system with a proportional distortionary tax on capital income is incapable of providing any
redistribution. Section 6 is devoted to the study of an optimal distortionary
tax system in which capital taxes are nonlinear. Section 7 discusses alternative
optimal tax systems. Section 8 concludes.

1. A SIMPLE PURE CAPITAL INCOME ECONOMY
In this article, we will study a private-ownership version of the economic
environment also studied in a companion article, Grochulski (2008, referenced
hereafter as G08 for short). The economy is populated by a unit mass of
agents who live for two periods, t = 1, 2. There is a single consumption good
each period, ct , and agents’ preferences over consumption pairs (c1 , c2 ) are
represented by the utility function
$$\theta u(c_1) + \beta u(c_2), \tag{1}$$

where β is a common-to-all discount factor, and θ is an agent-specific preference parameter. Agents are heterogenous in their relative preference for
consumption at date 1. We assume a two-point support for the population
distribution of the impatience parameter, θ . Agents, therefore, are of two
types. A fraction, μH , of the agents are impatient with a strong preference for
consuming in period 1. Denote by H the value of the preference parameter, θ ,
that represents preferences of the impatient agents. A fraction, μL = 1 − μH ,
are agents of the patient type. Their value of the impatience parameter, θ ,
denoted by L, satisfies L < H .
The production side of the economy is represented by the so-called Lucas
tree. Each agent is endowed with one unit of productive capital stock—the tree.
Each period, one unit of the capital stock produces y units of the consumption
good—the fruit of the tree. Given that the total mass of agents is normalized
to unity and each agent is endowed with one tree, the aggregate amount of
the consumption good available in this economy in each of the two periods is
Y = y. The consumption good is perishable—it cannot be stored from period
1 to 2. The size of the capital stock, i.e., the number of trees, is fixed: Capital
does not depreciate nor can it be accumulated.
Note that there is no uncertainty in this economy. In particular, agents’
impatience parameter, θ , is nonstochastic. The production side of the economy
is deterministic as well.
For simplicity and clarity of exposition, as in G08, we will focus our attention on a particular set of values for the preference and technology parameters.


In particular, we take
$$u(\cdot) = \log(\cdot), \quad \beta = \frac{1}{2}, \quad H = \frac{5}{2}, \quad L = \frac{1}{2}, \quad \mu_H = \mu_L = \frac{1}{2}, \quad y = 1. \tag{2}$$
Roughly, the model period is thought of as being 25 years. The value of
the discount factor, β, of 1/2 corresponds to an annualized discount factor of
about 0.973. The fractions of the two patience types are equal; preferences are
logarithmic. The per-period product of the capital stock, y = Y, is normalized
to 1.
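The annualization quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming (as the text does) a 25-year model period; the variable names are ours.

```python
# Convert the per-period discount factor into an annual one,
# assuming one model period lasts 25 years.
beta_period = 0.5
years_per_period = 25

# The annual factor compounds to beta_period over one model period.
beta_annual = beta_period ** (1 / years_per_period)

print(round(beta_annual, 3))
```

Compounding `beta_annual` over 25 years recovers the per-period value of 1/2.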
An allocation in this economy is a description of how the total output
(i.e., the total capital income, Y) is distributed among the agents each period.
An allocation, therefore, is given by $c = (c_{1H}, c_{1L}, c_{2H}, c_{2L})$, where $c_{t\theta} \ge 0$
denotes the amount of the consumption good in period t assigned to each agent
of type θ. To be resource-feasible, allocations must satisfy
$$\sum_{\theta = H, L} \mu_\theta c_{t\theta} \le Y, \tag{3}$$

for t = 1, 2, i.e., the aggregate consumption must not exceed the aggregate
output.4 Given the utility functions (1), an allocation, c, gives total utility
(or welfare), θ u(c1θ ) + βu(c2θ ), to each agent of type θ = H, L. For any
α ∈ [0, 1], the social welfare function is a weighted average of the utilities of
the two types of agents:
$$\alpha \left[ H u(c_{1H}) + \beta u(c_{2H}) \right] + (1 - \alpha) \left[ L u(c_{1L}) + \beta u(c_{2L}) \right], \tag{4}$$

where α represents the absolute weight the society attaches to the welfare
of the agents of type H . Let γ = α/(1 − α) denote the relative weight of
the agents of type H . An allocation is Pareto efficient if there does not exist
a feasible re-allocation that some agents would desire and no agents would
oppose. In this sense, Pareto-efficient allocations represent all divisions of the
total gains from trade that can be attained in the economy.
As discussed in G08, one can find all Pareto-efficient allocations by solving, for each γ ∈ [0, +∞], the problem of maximization of the social welfare
function (4) subject to feasibility constraints. If all information in the economy is public, these feasibility constraints are simply the resource feasibility
constraints (3). The allocation, c, attaining the maximum of the social welfare
function (4) for a given value of the relative weight, γ , is called a First Best
Pareto optimum, and is denoted by c∗ (γ ). By adjusting γ between 0 and +∞,
we can trace out the set of all First Best Pareto optima in this economy. This
set is depicted in Figure 1 of G08.
The assumption of complete public information may be too strong. In
particular, the government may be unable to observe agents’ preferences. For
4 Note that this constraint is independent of how the aggregate output is initially allocated
to the agents.


this reason, we will consider the assumption that each agent’s impatience parameter, θ , is known only to the agent himself and not to anybody else in
the economy. This incompleteness of public information imposes additional
restrictions on the feasible re-allocations that can be implemented in the economy. As discussed in G08, these restrictions take the form of the so-called
incentive compatibility constraints, which are given by
$$H u(c_{1H}) + \beta u(c_{2H}) \ge H u(c_{1L}) + \beta u(c_{2L}) \quad \text{and} \tag{5}$$
$$L u(c_{1L}) + \beta u(c_{2L}) \ge L u(c_{1H}) + \beta u(c_{2H}). \tag{6}$$

Suppose the government presents the agents with an allocation, c, and asks
them to reveal their impatience parameter. If c satisfies these constraints, the
agents will have no incentive to misrepresent their true type.
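To make constraints (5) and (6) concrete, the following sketch evaluates them at two resource-feasible allocations under the parameter values in (2). Both allocations are hypothetical examples of ours, chosen only for illustration; the function names are likewise our own.

```python
import math

# Parameter values from (2): log utility, beta = 1/2, H = 5/2, L = 1/2.
H, L, BETA = 2.5, 0.5, 0.5

def utility(theta, c1, c2):
    """Lifetime utility (1) of a type-theta agent from the pair (c1, c2)."""
    return theta * math.log(c1) + BETA * math.log(c2)

def incentive_compatible(c1H, c1L, c2H, c2L):
    """Return (whether (5) holds, whether (6) holds) for the allocation."""
    ic_H = utility(H, c1H, c2H) >= utility(H, c1L, c2L)  # constraint (5)
    ic_L = utility(L, c1L, c2L) >= utility(L, c1H, c2H)  # constraint (6)
    return ic_H, ic_L

# Hypothetical allocation: everyone consumes the no-trade bundle (1, 1).
equal_split = incentive_compatible(1.0, 1.0, 1.0, 1.0)

# Hypothetical allocation: type H receives more in both periods.
favor_H = incentive_compatible(1.5, 0.5, 1.5, 0.5)

print(equal_split, favor_H)
```

In the second allocation the patient type would prefer to claim the impatient type's bundle, so (6) fails even though the allocation satisfies the resource constraint (3).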
In the economy with private information, all Pareto-efficient allocations
can be found by maximizing, again for each γ ∈ [0, +∞], the social welfare
function (4) subject to resource feasibility constraints (3) and the incentive
compatibility constraints (5) and (6). The allocation, c, attaining the maximum
in this problem for a given value of γ is denoted by c∗∗ (γ ) and often called
a constrained-Pareto or Second Best Pareto optimum. This name reflects the
fact that c∗∗ (γ ) is efficient in a more narrow sense than the corresponding
c∗ (γ ), as c∗∗ (γ ) is constrained by private information while c∗ (γ ) is not. The
set of all Second Best Pareto optima for this economy is depicted in Figure 4
of G08.
G08 provides a fairly detailed characterization of the sets of First and
Second Best Pareto-optimal allocations. In the present article, we will examine the relation between Pareto-optimal allocations and market equilibrium
allocations. We begin by describing the competitive market mechanism and
its equilibrium.

2. COMPETITIVE CAPITAL MARKET EQUILIBRIUM

In this article, we study a private-ownership economy in which all agents are
initially endowed with one unit of productive capital. Relative to this initial
allocation, clearly, there are gains from trade to be exploited (i.e., the initial
allocation is not a Pareto optimum). When income generated by the capital
stock (i.e., the dividend) is realized in the first period, all agents have the same
amount of the consumption good in hand (y units), and the same amount of
consumption they will receive in the next period (y units again), but not the
same desire to consume now versus next period. Thus, it is natural for them
to trade consumption in hand today for capital, i.e., for the dividends, that
will be received tomorrow. The relatively impatient agents, i.e., those whose
preference type is θ = H , can sell some of their capital to the more patient
agents of type θ = L in return for current consumption. This can be done for
the mutual benefit of the two types of agents because their preferences differ.


The terms of this mutually beneficial trade, which will determine the final
division of the welfare gains from trade, can depend on many factors. How
many units of consumption in the first period will a patient agent be willing to
pay for a unit of capital being sold by the impatient agent? Given the economic
environment, a reasonable answer to this question is: the market price. In this
environment, we have a large number of sellers of capital (mass μH to be
exact) and a large number of buyers (mass μL). Also, we do not assume
that buyers or sellers face any technological barriers to trading like significant
costs of shopping around, communicating, or negotiating with potential trade
counterparties. Therefore, no rational agent will trade with a counterparty
unless he is confident that he cannot obtain more favorable terms of trade by
continuing to shop around. The competitive market price of capital represents
the terms of trade that give this confidence to a rational agent. It is reasonable
to expect that a competitive market for capital will emerge in this environment.
Let us therefore consider the standard formal model of the competitive
market mechanism. After agents collect dividends in period 1, they choose the
quantity, c1 , that they consume now, the quantity, a, of capital they purchase
or sell at the market price, q, and the quantity they will be consuming in
the second period, c2 . Their initial endowment of capital and its price, q,
determine the set of consumption pairs (c1 , c2 ) that are affordable.
Formally, agents of type θ = H, L choose c1 , a, and c2 so as to solve the
following individual utility maximization problem:
$$\max_{c_1 \ge 0,\; c_2 \ge 0,\; a} \; \theta u(c_1) + \beta u(c_2),$$
subject to the budget constraints
$$c_1 + q a \le y, \tag{7}$$
$$c_2 \le (1 + a) y. \tag{8}$$

Note that the non-negativity requirement for consumption at the second date
implies that a ≥ −1, i.e., no agent can sell more capital than the one unit he
owns.
Let $c^D_{t\theta}(q)$ for t = 1, 2 and $a^D_\theta(q)$ for θ = H, L denote the agents’ demand
functions for consumption and capital, respectively, i.e., the solutions to the
above individual optimization problem for any given price of capital, q.
Definition 1 Competitive market equilibrium consists of a consumption allocation, $\hat{c} = (\hat{c}_{1H}, \hat{c}_{1L}, \hat{c}_{2H}, \hat{c}_{2L})$; capital trades, $\hat{a} = (\hat{a}_H, \hat{a}_L)$; and a capital
price, $\hat{q}$, such that
(i) agents optimize, i.e., the equilibrium allocation maximizes agents’
utility given the equilibrium price, $\hat{q}$:
$$\hat{c}_{t\theta} = c^D_{t\theta}(\hat{q}),$$
$$\hat{a}_\theta = a^D_\theta(\hat{q}),$$
for t = 1, 2 and θ = H, L;
(ii) the capital market clears:
$$\sum_{\theta = H, L} \mu_\theta \hat{a}_\theta = 0. \tag{9}$$

Note that the budget constraints and the capital market clearing condition
imply that the equilibrium allocation of consumption is resource-feasible, i.e.,
the sum of all agents’ consumption in every period does not exceed the total
amount of output, Y :
$$\sum_{\theta = H, L} \mu_\theta \hat{c}_{t\theta} \le Y \tag{10}$$
for t = 1, 2.

3. EFFICIENCY OF CAPITAL MARKET EQUILIBRIUM

Suppose there is no government intervention and agents trade freely. As
a result, each agent obtains some final allocation of consumption. As we
discussed in the previous section, we expect this allocation to be a competitive
equilibrium allocation, $\hat{c}$. The following is the classic competitive market
optimality result.
Theorem 1 Let $(\hat{c}, \hat{a}, \hat{q})$ be a competitive capital market equilibrium. Then,
the equilibrium allocation of consumption, $\hat{c}$, is Pareto optimal.
Recall that an allocation is Pareto optimal (or Pareto efficient) if it is
feasible and not Pareto dominated by another feasible allocation. An allocation
x Pareto dominates an allocation z if all agents in the economy prefer x over
z at least weakly, and at least one agent in the economy prefers x over z strictly.5 Clearly,
a Pareto-dominated allocation is wasteful. If every agent can be made at least as
well off, and at least one agent strictly better off, it would be a waste not to exploit
this opportunity. The above theorem tells us that a competitive equilibrium
allocation is free of this failure. This important result, which holds much
more generally than just in our simple capital market model, is often called
the First Welfare Theorem.

A General Proof of the First Welfare Theorem
In this subsection, let us present a general, standard argument behind the First
Welfare Theorem.6 We will note that this argument is an indirect one.
5 G08 provides additional discussion of Pareto dominance and feasibility with full and partial
public information.
6 See also Mas-Colell, Whinston, and Green (1995) for an excellent textbook treatment of
this result.


We begin with the following simple implication of agents’ utility maximization: In equilibrium, it must be the case that
$$\hat{c}_{1\theta} + \hat{c}_{2\theta} \hat{q}/y = y + \hat{q}. \tag{11}$$

To see this, note first that when agents optimize, their budget constraints will
be satisfied as equalities because utility is strictly increasing in consumption.
Then, eliminate $\hat{a}_\theta$ from (7) and (8) to obtain (11).
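The elimination step can be written out explicitly. The lines below are our own spelling-out of the algebra, using only the budget constraints (7) and (8) held with equality at the equilibrium price:

```latex
\begin{align*}
\hat{c}_{2\theta} &= (1 + \hat{a}_\theta)\, y
  && \text{(8) with equality} \\
\hat{a}_\theta &= \hat{c}_{2\theta}/y - 1
  && \text{solve for } \hat{a}_\theta \\
\hat{c}_{1\theta} + \hat{q}\left(\hat{c}_{2\theta}/y - 1\right) &= y
  && \text{substitute into (7) with equality} \\
\hat{c}_{1\theta} + \hat{c}_{2\theta}\,\hat{q}/y &= y + \hat{q}
  && \text{rearrange, which is (11)}
\end{align*}
```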
Equation (11) represents the fact that in equilibrium agents will not waste
personal wealth. The right-hand side of (11) represents the equilibrium value
of each agent’s initial endowment of capital in terms of the units of first-period consumption. In period 1, an agent can collect dividend y and then
sell all his capital endowment for $\hat{q}$. Thus, his total wealth is $y + \hat{q}$. The
left-hand side of (11) represents the cost of the consumption allocation, $\hat{c}_\theta$.
In equilibrium, $\hat{q}/y$ is the price of one unit of c2 in terms of the units of c1: It
takes $1/y$ units of capital to obtain one unit of consumption in period 2, and $\hat{q}$
units of consumption in period 1 to obtain one unit of capital. Effectively, it
takes $\hat{q}/y$ units of consumption c1 to obtain one unit of consumption c2. Thus,
$\hat{c}_{1\theta} + \hat{c}_{2\theta}\hat{q}/y$ is the cost of the consumption pair $\hat{c}_\theta = (\hat{c}_{1\theta}, \hat{c}_{2\theta})$.
Now suppose that a feasible allocation $\bar{c}$ Pareto dominates the equilibrium
allocation $\hat{c}$. This means that agents of at least one type strictly prefer $\bar{c}_\theta$ over
$\hat{c}_\theta$, and both types prefer $\bar{c}_\theta$ over $\hat{c}_\theta$ at least weakly. Because utility is strictly
increasing in consumption, $\bar{c}_\theta$ must be strictly unaffordable to all those who
strictly prefer it and at best just affordable to those who weakly prefer it (which
is everybody).7 Thus, for both θ,
$$\bar{c}_{1\theta} + \bar{c}_{2\theta}\hat{q}/y \ge y + \hat{q} \tag{12}$$
with at least one of these two inequalities being strict. Multiplying this inequality for type θ by μθ and adding over θ, we obtain
$$\sum_{\theta = H, L} \mu_\theta \bar{c}_{1\theta} + \sum_{\theta = H, L} \mu_\theta \bar{c}_{2\theta}\hat{q}/y > \sum_{\theta = H, L} \mu_\theta (y + \hat{q}),$$
where the inequality is strict because at least one of the inequalities in (12) is
strict. Since $\bar{c}$ is feasible, it must satisfy the resource constraints (10). Using
these, we obtain from the above that
$$Y + Y\hat{q}/y > y + \hat{q}.$$
7 Note that this argument relies only on the strict monotonicity of preferences (and, in fact,
could rely only on local nonsatiation; see Mas-Colell, Whinston, and Green [1995], section 16
C). In our model, we could actually make a stronger argument based on the strict convexity of
preferences. Namely, since agents’ preferences are strictly increasing and strictly convex, $\hat{c}_\theta$ is a
unique maximizer of utility in the budget set. Thus, $\bar{c}_\theta$ must be strictly unaffordable even to the
type that is indifferent between $\bar{c}_\theta$ and $\hat{c}_\theta$.


Substituting Y = y we get
$$Y + \hat{q} > Y + \hat{q},$$
which is a contradiction. Thus, a feasible allocation, $\bar{c}$, that Pareto dominates
the equilibrium allocation, $\hat{c}$, cannot exist.

A Direct Proof of the First Welfare Theorem
The First Welfare Theorem tells us that any equilibrium allocation is Pareto
optimal and nothing more. In particular, the general, indirect proof of the
First Welfare Theorem tells us nothing about which among the infinitely many
Pareto-optimal allocations the equilibrium allocation $\hat{c}$ coincides with. This
question, which strictly speaking is outside the scope of the First Welfare
Theorem, may be of independent interest.
In the specific environment that we consider in this article, we can give a
direct proof of the First Welfare Theorem. Namely, we can compute the set
of competitive equilibrium allocations and compare it against the set of all
Pareto-optimal allocations. In this way, we will be able to tell exactly which
Pareto optima can be implemented as competitive equilibria.
Solving the individual utility maximization problem

We begin by deriving the agents’ capital demand functions. Similar to (11),
we can rewrite the agents’ budget constraints as an equality of the present
value of consumption and wealth:
$$c_1 + c_2 q/y = y + q. \tag{13}$$

In this form, it is easy to see the agents’ utility maximization problem has a
linear budget set and a strictly concave objective function. Thus, at each price,
q, it has a unique solution, which we can compute from the budget constraint
(13) and the Euler equation8
$$\theta u'(c_1) q/y = \beta u'(c_2). \tag{14}$$

For each θ, we solve these two equations to obtain consumption demand
functions $c^D_{t\theta}(q)$ for t = 1, 2. Using the parameter values in (2), these solutions
are
$$c^D_{1\theta}(q) = 2\theta \, \frac{1 + q}{1 + 2\theta},$$
$$c^D_{2\theta}(q) = \frac{1}{q} \, \frac{1 + q}{1 + 2\theta}.$$
8 Perhaps the simplest way to obtain the Euler equation (14) is to express the utility maximization problem as $\max_{c_2} \theta u(y + q - c_2 q/y) + \beta u(c_2)$ and take the first-order condition.


From (8) evaluated at equality, we obtain that type θ’s capital demand function
is
$$a^D_\theta(q) = \frac{1}{q} \, \frac{1 + q}{1 + 2\theta} - 1.$$
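As a sanity check on these closed forms, the sketch below evaluates the demand functions at an arbitrary illustrative price and confirms that they satisfy the budget constraint (13), the Euler equation (14) with u'(c) = 1/c, and the first-period budget (7). It assumes the parameter values in (2); the function name `demands` and the test price q = 1/3 are our own choices. Exact rational arithmetic is used so that the equalities can be tested without rounding error.

```python
from fractions import Fraction as F

BETA = F(1, 2)  # discount factor from (2); y = 1 throughout

def demands(theta, q):
    """Closed-form demands (c1, c2, a) of a type-theta agent at price q."""
    wealth = 1 + q                        # y + q with y = 1
    c1 = 2 * theta * wealth / (1 + 2 * theta)
    c2 = wealth / (q * (1 + 2 * theta))
    a = c2 - 1                            # from (8) at equality with y = 1
    return c1, c2, a

q = F(1, 3)  # arbitrary illustrative price, not the equilibrium price
for theta in (F(5, 2), F(1, 2)):          # types H and L
    c1, c2, a = demands(theta, q)
    assert c1 + c2 * q == 1 + q           # budget constraint (13)
    assert theta * q / c1 == BETA / c2    # Euler equation (14), log utility
    assert c1 + q * a == 1                # budget constraint (7) at equality
```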
Solving for equilibrium price of capital and allocation

Substituting the capital demand functions into the capital market clearing
condition (9) and solving for the price that clears this market, we obtain an
equilibrium price $\hat{q} = \frac{1}{2}$. It is easy to see that there are no other prices that
clear this market, i.e., the competitive equilibrium is unique in our model.9
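We can now compute equilibrium capital trades:
$$\hat{a}_L = a^D_L\left(\tfrac{1}{2}\right) = \frac{1}{2},$$
$$\hat{a}_H = a^D_H\left(\tfrac{1}{2}\right) = -\frac{1}{2},$$
and the equilibrium allocation of consumption:
$$\hat{c}_L = (\hat{c}_{1L}, \hat{c}_{2L}) = \left(\frac{3}{4}, \frac{3}{2}\right), \tag{15}$$
$$\hat{c}_H = (\hat{c}_{1H}, \hat{c}_{2H}) = \left(\frac{5}{4}, \frac{1}{2}\right). \tag{16}$$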
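The equilibrium computation can be reproduced in a few lines. The sketch below assumes the parameter values in (2) and the capital demand function derived above; the dictionary names and the shortcut q = S/(1 − S), where S is the mass-weighted sum of 1/(1 + 2θ), are our own rearrangement of the market clearing condition (9), not a formula taken from the article.

```python
from fractions import Fraction as F

TYPES = {"H": F(5, 2), "L": F(1, 2)}   # preference parameters from (2)
MASS = {"H": F(1, 2), "L": F(1, 2)}    # population fractions from (2)

def capital_demand(theta, q):
    """Capital demand of a type-theta agent at price q (y = 1)."""
    return (1 + q) / (q * (1 + 2 * theta)) - 1

def excess_demand(q):
    """Left-hand side of the market clearing condition (9)."""
    return sum(MASS[k] * capital_demand(TYPES[k], q) for k in TYPES)

# Setting excess demand to zero gives (1 + q) * S = q, i.e. q = S / (1 - S).
S = sum(MASS[k] / (1 + 2 * TYPES[k]) for k in TYPES)
q_hat = S / (1 - S)

# Equilibrium trades, then consumption from (7) and (8) held with equality.
a_hat = {k: capital_demand(TYPES[k], q_hat) for k in TYPES}
c_hat = {k: (1 - q_hat * a_hat[k], 1 + a_hat[k]) for k in TYPES}

print(q_hat, a_hat, c_hat)
```

The resulting price, trades, and consumption pairs agree with the values reported in the text, including (15) and (16).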

Figure 1 depicts the agents’ budget constraint at the equilibrium capital
price, $\hat{q} = \frac{1}{2}$; equilibrium consumption pairs (15) and (16); and one indifference curve for each type of agent. Clearly, both agent types face the same
budget constraint. Since their preferences differ, so do their choices. The indifference curve depicted for each type θ represents the highest level of utility
that each type attains within the budget constraint. Note also that point (1,1)
in Figure 1 represents the consumption bundle that agents get if they do not
trade. In equilibrium, the impatient agent gives up 1/2 unit of c2 in exchange for 1/4 unit
of c1. The patient agent, of course, takes the opposite end of this trade.
Confronting equilibrium with the set of Pareto-optimal allocations

Our first observation here is that the competitive capital market mechanism
delivers a unique equilibrium allocation of consumption. We observe next that,
as shown in detail in G08, there is a continuum of Pareto-optimal allocations
9 Expressed as a function of gross return on capital investment, $R = 1/q$, rather than the price
of capital, q, agents’ demand for capital is linear in R. Namely, $a^D_\theta(R) = \frac{R + 1}{1 + 2\theta} - 1$. Thus, for
any two numbers, θ, there can be at most one solution to the capital market clearing condition,
so equilibrium is unique.


Figure 1 Agents’ Problems’ Solutions at Competitive Equilibrium (figure not reproduced; horizontal axis: c1, vertical axis: c2)
in our environment.10 From these two observations, we immediately see
that almost all Pareto-optimal allocations are incompatible with competitive
equilibrium.
Which one among the continuum is the Pareto optimum consistent with
competitive equilibrium? Formulas (8)–(11) in G08 describe the set of all
First Best Pareto optima indexed by parameter γ ∈ [0, ∞] representing the
relative welfare weight assigned to the impatient agents in the social objective
function. If society favors neither of the two types of agents, the welfare weight
given to both types is the same, i.e., the relative weight of the impatient type,
γ , is 1. Thus, γ = 1 represents the so-called utilitarian Pareto optimum.
By $\gamma^{CE}$ let us denote the value of the index, γ, associated with the optimum
that is selected by the market mechanism in competitive equilibrium. From
10 Multiplicity of Pareto-optimal allocations is typical in environments with heterogenous agents.


formulas (8)–(11) in G08, we obtain immediately that the unique competitive
allocation, $\hat{c}$, given in (15) and (16) is the Pareto optimum corresponding to
γ = 1/3, i.e., $\gamma^{CE} = 1/3$ in our economy. Thus, competitive equilibrium is
optimal if and only if society values welfare of the patient type, L, three times
as much as it values welfare of the impatient type, H.
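The value γ = 1/3 can be cross-checked without the G08 formulas. The sketch below uses closed forms that we derive ourselves from the first-order conditions of maximizing (4) subject to (3), assuming log utility, equal type masses, Y = 1, and an interior solution; they are not a reproduction of formulas (8)–(11) in G08. With equal masses, the first-order conditions imply that the ratio of H's to L's first-period consumption is γH/L and the ratio of second-period consumption is γ.

```python
from fractions import Fraction as F

H, L = F(5, 2), F(1, 2)  # preference parameters from (2)

def first_best(gamma):
    """Interior maximizer of (4) subject to (3), for 0 < gamma < infinity.

    Own derivation under log utility, equal masses, and Y = 1:
    c1H / c1L = gamma * H / L and c2H / c2L = gamma, with each
    period's consumption averaging to 1 across the two types.
    """
    c1H = 2 * gamma * H / (gamma * H + L)
    c1L = 2 * L / (gamma * H + L)
    c2H = 2 * gamma / (gamma + 1)
    c2L = 2 / (gamma + 1)
    return c1H, c1L, c2H, c2L

# Allocation at the relative weight selected by the market.
market_selected = first_best(F(1, 3))

# Allocation at equal weights (the utilitarian optimum), for comparison.
utilitarian = first_best(F(1))

print(market_selected, utilitarian)
```

At γ = 1/3 the function returns the consumption pairs in (15) and (16), while the utilitarian weight γ = 1 gives the impatient type more consumption in the first period than the market allocation does.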

Why this Pareto Optimum?
As we have seen, competitive capital market equilibrium implements a rather
particular Pareto optimum. Intuitively, we see that the competitive capital
market selects a Pareto optimum that “favors” the patient agents. With a large
mass of agents whose desire for consumption in the first period is very strong
relative to the population average (H exceeds the average θ by about 67 percent), the
market is “flooded” with capital, which becomes very affordable to the patient
agents.11 As the impatient agents compete for first-period consumption, the
patient agents end up receiving two units of c2 for each unit of c1 in equilibrium.
That rate of exchange is optimal only if society on the whole cares for the
welfare of the patient agents more than it cares for the welfare of the impatient
consumers.12
In conclusion, the competitive market mechanism does two things: it
allows the agents to obtain welfare gains from trade, and it also divides these
gains among the agents in a particular way. It is entirely possible that society
might desire a different division of the welfare gains than the one built into
the competitive market allocation mechanism. This problem creates a role for
redistributive government policy. In the remainder of this article, we consider
how the government can supplement the competitive market mechanism with
a tax system that preserves efficiency but implements other divisions of the
welfare gains from trade.
In the next section, we consider the situation in which the government has
full information on each agent’s preference type and, therefore, can transfer
wealth from one type to another as a lump sum. Subsequently, we consider
11 To see this point more clearly, note that the preferences of the patient type can be alternatively represented by log(c1) + log(c2) and $\hat{q} = \frac{1}{2}$.
12 As a simple thought experiment, consider the question of how the competitive equilibrium
selection from the Pareto set changes when the relative impatience of the two types of agents
changes. In particular, suppose that L is not necessarily 1/2 but can be any real number smaller
than or equal to H. Let $\gamma^{CE}(L)$ denote the index of the Pareto optimum that is implemented by the
competitive equilibrium as L is adjusted between 0 and 5/2. It is easy to show that $\gamma^{CE}(L) = (2L + 1)/6$. Thus, competitive equilibrium selects the utilitarian Pareto optimum only if all agents
are identical (no trade is optimal in this case). When the impatient agents become very impatient
relative to the patient ones, i.e., when L approaches 0, we have that $\gamma^{CE}(0) = \frac{1}{6}$, i.e., competitive equilibrium selects the
Pareto optimum that would be selected by a society that values welfare of the patient type L six
times as much as it values welfare of the impatient type H.

248

Federal Reserve Bank of Richmond Economic Quarterly

private information, which makes lump-sum wealth transfers infeasible and
creates a role for distortionary redistributive taxation.
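The claim in footnote 12 that the competitive equilibrium selects the Pareto optimum with index (2L + 1)/6 can be checked numerically. The Python sketch below is only an illustration under stated assumptions: logarithmic utility, β = 1/2, θ_H = 5/2, y = 1, and equal population shares are inferred from numbers quoted in the text rather than restated in this section, and the planner's weight is recovered from the second-period first-order condition γ/c2H = 1/c2L, which is what these assumptions imply. The function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed parameter values, inferred from numbers quoted in the text.
BETA = F(1, 2)      # weight on second-period utility
THETA_H = F(5, 2)   # impatience of type H
Y = F(1)            # per-capita dividend / endowment
MU = F(1, 2)        # population share of each type


def competitive_equilibrium(theta_l):
    """Return (q, allocation) for utility theta*log(c1) + BETA*log(c2)."""
    thetas = {"H": THETA_H, "L": F(theta_l)}
    # With log utility an agent spends the share theta/(theta+BETA) of wealth on c1.
    share = {k: t / (t + BETA) for k, t in thetas.items()}
    # Clearing the period-1 goods market, sum_k MU*share_k*(Y+q) = Y, pins down q.
    q = Y / sum(MU * s for s in share.values()) - Y
    wealth = Y + q
    alloc = {k: (s * wealth, (1 - s) * wealth * Y / q) for k, s in share.items()}
    return q, alloc


def gamma_ce(theta_l):
    """Planner weight on type H (relative to L) that rationalizes the equilibrium."""
    _, alloc = competitive_equilibrium(theta_l)
    # Planner's period-2 first-order condition with log utility: gamma/c2H = 1/c2L.
    return alloc["H"][1] / alloc["L"][1]
```

Exact rational arithmetic is used so that the comparison with (2L + 1)/6 is an equality check rather than a floating-point approximation.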

4. COMPETITIVE EQUILIBRIUM WITH LUMP-SUM TAXES

We begin by extending the definition of competitive equilibrium (i.e., Definition 1) to allow for lump-sum wealth transfers. A tax on an agent is lump-sum
if the amount due is independent of any choices made by this agent. For example, a labor income tax is not lump-sum because the amount due increases
with the number of hours the agent chooses to work. Taxes under which the
amount due does depend on the taxpayers’ choices are called distortionary.
In our economy, agents choose consumption in periods 1 and 2 and their
capital holdings in period 2. Thus, lump-sum taxes must not depend on consumption or capital holdings. Agents’ impatience type θ , however, is not their
choice. If the government can observe each agent’s type θ , a lump-sum tax
can depend on θ . In this section, we assume that each agent’s preference type
θ is freely and publicly observable. In particular, the government sees every
agent’s type and therefore can impose different lump-sum taxes on the agents
of different types.
In this setting, a lump-sum tax system consists of two real numbers: TH
and TL , where a negative value of Tθ means a transfer from the government
to the agent of type θ . Under these taxes, the budget constraints of the agents
of type θ = H, L are
c1 + aq ≤ y − Tθ ,
c2 ≤ (1 + a)y,
where q, as before, is the ex-dividend price of capital in the first period. Note
that the lump-sum taxes Tθ are levied in period 1 and denominated in the units
of consumption at that date. It is entirely possible to levy lump-sum taxes at
both dates, but it is easy to see that doing this would not be useful. Treating
the budget constraints as equalities, eliminating a, we can express the budget
constraint in the present value as follows:
$$c_1 + c_2\, q/y = y + q - T_\theta. \tag{17}$$

From here we see that any lump-sum tax at the second date can be lumped
into Tθ .
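The elimination of a behind (17) can be written out step by step. In the sketch below, $T'_\theta$ is a hypothetical lump-sum tax levied at the second date; it is introduced only to illustrate the remark that such a tax can be folded into $T_\theta$, and it is not part of the article's notation.

```latex
\begin{align*}
c_2 = (1+a)\,y - T'_\theta
  \;&\Longrightarrow\; a = \frac{c_2 + T'_\theta}{y} - 1, \\
c_1 + q\left(\frac{c_2 + T'_\theta}{y} - 1\right) = y - T_\theta
  \;&\Longrightarrow\; c_1 + c_2\,\frac{q}{y}
     = y + q - \left(T_\theta + \frac{q}{y}\,T'_\theta\right).
\end{align*}
```

Setting $T'_\theta = 0$ gives (17). With $T'_\theta \neq 0$, the agent's problem is the same as under a single first-date tax equal to $T_\theta + (q/y)\,T'_\theta$.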
Competitive equilibrium with lump-sum taxes, (Tθ )θ=H,L , is defined
analogously to the tax-free competitive equilibrium of Definition 1: A price-allocation pair will be an equilibrium if agents optimize, now subject to (17),
and markets clear. In addition, the government must break even in equilibrium,
i.e., taxes, (Tθ )θ=H,L , must satisfy the government budget constraint
$$\sum_{\theta=H,L} \mu_\theta T_\theta = 0. \tag{18}$$


Efficient Redistribution with Lump-Sum Taxation Under Full Information
We will say that a lump-sum tax system, (Tθ )θ=H,L , implements a given allocation, c, if c is a competitive equilibrium allocation under taxes, (Tθ )θ=H,L .
The following result is a version of the classic sufficiency result known as the
Second Welfare Theorem.
Theorem 2 Every First Best Pareto optimum, c∗ , can be implemented with a
lump-sum tax system, (Tθ )θ=H,L .
Under this theorem, lump-sum taxes are clearly sufficient to achieve any
desired distribution of the total gains from trade available in this economy.
We will now provide a proof of this theorem constructed as follows. First,
we derive a set of sufficient conditions for an allocation to be an equilibrium
allocation. Then, we show that for every First Best Pareto optimum c∗ (γ ),
γ ∈ [0, ∞], lump-sum taxes, (Tθ )θ=H,L , can be set so c∗ (γ ) satisfies these
sufficient conditions.
In order for an allocation, c, to be an equilibrium allocation, there must
exist a capital price, q, at which agents choose to consume c and the capital
trades associated with c clear the market. First, let us identify sufficient
conditions for agents’ optimization at a given price, q. Under the present-value
budget constraint (17), agents solve a strictly concave optimization problem.
Thus, the individual Euler equation (14) and the budget constraint (17) are
sufficient for an allocation, c, to be individually optimal at the price, q. Second,
we need to check market clearing. However, as long as we implement a
resource-feasible allocation, c, the capital purchases associated with c will
clear the market.
One of the properties of the First Best Pareto optima (FBPO) is that they are
free of the so-called intertemporal wedges (see G08, Section 3). This means
that at each FBPO c∗ (γ ) the intertemporal marginal rate of substitution (IMRS)
of each agent type is equal to the intertemporal marginal rate of transformation.
Denote the IMRS of agent type θ evaluated at a FBPO allocation c∗ (γ ) by
$m^*_\theta(\gamma)$, i.e.,
$$m^*_\theta(\gamma) = \frac{\beta u'(c^*_{2\theta}(\gamma))}{\theta u'(c^*_{1\theta}(\gamma))}. \tag{19}$$
The lack of intertemporal wedges demonstrated in G08 implies that
$$m^*_H(\gamma) = m^*_L(\gamma)$$
for each γ ∈ [0, ∞]. This simple property is crucial for the implementation
of FBPO as competitive equilibria.
Let us denote the two agent types’ common IMRS value by $m^*(\gamma)$.
Directly from (14) we see that if the price of capital is
$$q = m^*(\gamma)\, y,$$
then the individual Euler equation holds for both agent types simultaneously.
Let us denote this price of capital by $\hat{q}(\gamma)$ for each γ ∈ [0, ∞].
All that remains to be checked is affordability, i.e., that both types’ budget
constraints are satisfied at the consumption allocation, $c^*(\gamma)$, and price, $\hat{q}(\gamma)$.
For that, however, we have the lump-sum taxes, Tθ . In particular, we can find
the lump-sum tax, TL , that will make the FBPO $c^*_L(\gamma)$ affordable for agent L.
To do that, we solve the budget constraint
$$c^*_{1L}(\gamma) + c^*_{2L}(\gamma)\, \hat{q}(\gamma)/y = y + \hat{q}(\gamma) - T_L \tag{20}$$

for TL . For each γ ∈ [0, ∞], we will denote this solution by TL (γ ). Using
the formulas for c∗ (γ ) derived in G08, we can compute TL (γ ) explicitly.
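Formulas (9) and (11) in G08 tell us that
$$c^*_{1L}(\gamma) = \frac{2}{1 + 5\gamma}, \tag{21}$$
$$c^*_{2L}(\gamma) = \frac{2}{1 + \gamma} \tag{22}$$
for any γ ∈ [0, ∞]. Substituting these expressions into (19), we obtain
$$m^*(\gamma) = \frac{\beta u'(c^*_{2\theta})}{\theta u'(c^*_{1\theta})} = \frac{1 + \gamma}{1 + 5\gamma}.$$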
From the Euler equation (14), we therefore have that if c∗ (γ ) is to be an
equilibrium allocation, then the price of capital must be
$$\hat{q}(\gamma) = \frac{1 + \gamma}{1 + 5\gamma}.$$
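The substitution of (21) and (22) into (19) can be reproduced with a few lines of exact arithmetic. The Python sketch below assumes u = log, β = 1/2, θ_H = 5/2, θ_L = 1/2, y = 1, and equal population shares; these values are inferred from numbers quoted in the article and are not restated in this section. Type H's consumption is backed out from resource feasibility under those assumptions, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed parameter values, inferred from numbers quoted in the article.
BETA = F(1, 2)
THETA = {"H": F(5, 2), "L": F(1, 2)}
Y = F(1)  # per-capita dividend; with equal type shares, c_H + c_L = 2*Y each period


def first_best(gamma):
    """First Best allocation {type: (c1, c2)} for a finite weight gamma > 0."""
    gamma = F(gamma)
    c_l = (2 / (1 + 5 * gamma), 2 / (1 + gamma))   # formulas (21)-(22)
    c_h = (2 * Y - c_l[0], 2 * Y - c_l[1])         # resource feasibility
    return {"H": c_h, "L": c_l}


def imrs(theta, c):
    """beta*u'(c2) / (theta*u'(c1)) for u = log, i.e. BETA*c1 / (theta*c2)."""
    return BETA * c[0] / (theta * c[1])


def supporting_price(gamma):
    """Capital price q = m*(gamma) * y; raises if the two IMRS values differ."""
    alloc = first_best(gamma)
    m_h, m_l = (imrs(THETA[k], alloc[k]) for k in ("H", "L"))
    if m_h != m_l:
        raise ValueError("intertemporal wedge detected")
    return m_l * Y
```

The explicit equality check inside `supporting_price` mirrors the role the no-wedge property plays in the proof: a single price can satisfy both types' Euler equations only if their IMRS values coincide.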

Substituting this price and the consumption values (21) and (22) into (20), we
solve for TL to obtain
$$T_L(\gamma) = -\frac{2 - 6\gamma}{1 + 5\gamma}. \tag{23}$$

From the government budget constraint (18), it is immediate that
$$T_H(\gamma) = -T_L(\gamma) = \frac{2 - 6\gamma}{1 + 5\gamma}. \tag{24}$$

It is easy to verify that with tax $T_H(\gamma)$, the FBPO $c^*_H(\gamma)$ is affordable to agent
H under the capital price $\hat{q}(\gamma)$. Thus, for any γ ∈ [0, ∞] with taxes T (γ ), the
optimal allocation $c^*_H(\gamma)$ satisfies sufficient conditions for equilibrium. Proof
of Theorem 2 is therefore complete.
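The construction in the proof can be traced numerically: compute the price, solve each type's present-value budget constraint for the tax, and confirm that the government breaks even. The Python sketch below is a check under assumed values (y = 1 and equal population shares, inferred from the article's numbers); type H's bundle is obtained from resource feasibility, and all function names are hypothetical.

```python
from fractions import Fraction as F

Y = F(1)                             # assumed per-capita dividend
MU = {"H": F(1, 2), "L": F(1, 2)}    # assumed population shares


def first_best(gamma):
    """First Best allocation {type: (c1, c2)} built from formulas (21)-(22)."""
    gamma = F(gamma)
    c_l = (2 / (1 + 5 * gamma), 2 / (1 + gamma))
    c_h = (2 * Y - c_l[0], 2 * Y - c_l[1])      # resource feasibility
    return {"H": c_h, "L": c_l}


def price(gamma):
    """Supporting capital price (1 + gamma)/(1 + 5*gamma) * y."""
    gamma = F(gamma)
    return (1 + gamma) / (1 + 5 * gamma) * Y


def lump_sum_taxes(gamma):
    """Tax on each type that makes its First Best bundle exactly affordable."""
    q = price(gamma)
    # Solve c1 + c2*q/Y = Y + q - T for T, type by type (equations (17) and (20)).
    return {k: Y + q - (c1 + c2 * q / Y) for k, (c1, c2) in first_best(gamma).items()}


def budget_balance(gamma):
    """Left-hand side of the government budget constraint (18)."""
    taxes = lump_sum_taxes(gamma)
    return sum(MU[k] * taxes[k] for k in MU)
```

Computing both taxes from the agents' own budget constraints, instead of imposing T_H = −T_L, makes budget balance a result of the calculation rather than an input.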
Equilibrium with lump-sum taxes, naturally, reduces to the pure competitive equilibrium of Definition 1 if the government chooses the taxes to be zero.

Figure 2 Agents’ Problems’ Solutions at Competitive Equilibrium with
Lump-Sum Taxes Implementing the Utilitarian Optimum
[Figure: plot in the (c1, c2) plane; both axes run from 0 to 3.]

From (23) and (24), it is easy to see that $T_H(1/3) = T_L(1/3) = 0$. Thus, zero
taxes are optimal with $\gamma = 1/3$, which exactly replicates the result we obtained
in the constructive proof of Theorem 1. When γ = 0, i.e., when the society
puts zero weight on welfare of the type H , the lump-sum tax on agents H is
TH (0) = 2, i.e., all wealth is taken away from agents H . At the other extreme,
TH (∞) = −6/5 (the limit of (24) as γ → ∞), i.e., the government transfers all
wealth to agents H when γ = ∞.
Figure 2 depicts the solution to the agents’ utility maximization problems
at the equilibrium implementing the utilitarian optimal allocation (i.e., when
all agents receive the same welfare weight in the social planning problem).
With γ = 1, we have $T_H(1) = -2/3$ and $T_L(1) = 2/3$. The equilibrium price is
$\hat{q}(1) = 1/3$. At this price, the ex-dividend value of each agent’s capital in period
1 is 1/3. The after-tax wealth of type H , thus, is $y + \hat{q}(\gamma) - T_H = 2$, while that
of type L is $y + \hat{q}(\gamma) - T_L = 2/3$. The budget constraint for type L, therefore,
is
$$c_1 + c_2/3 = 2/3,$$
and the optimal choice is $\hat{c}_L = (1/3, 1)$. The H type faces the budget constraint
$$c_1 + c_2/3 = 2,$$
and his optimal choice is $\hat{c}_H = (5/3, 1)$. Figure 2 depicts these budget constraints
and optimal choices along with two indifference curves representing
the maximal utility levels attained by each type in equilibrium with lump-sum
taxes (TH (1), TL (1)). The pale, horizontal, solid line represents the lump-sum
tax on the patient types $T_L(1) = 2/3$. The pale, horizontal, dashed line
represents the lump-sum transfer to the impatient type, i.e., the negative of
the tax $T_H(1) = -2/3$. Under these transfers and the capital price $\hat{q}(1) = 1/3$,
the two types’ budget constraints are parallel, with the budget line of type L
being strictly inside (closer to the origin) the budget of type H . Agents choose
$\hat{a}_H = \hat{a}_L = 0$.
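The optimal choices reported for the utilitarian case follow from the standard logarithmic demand formulas. The Python sketch below is an illustration under assumptions: u = log, β = 1/2, θ_H = 5/2, θ_L = 1/2, and y = 1 are inferred from numbers quoted in the article, while the price and taxes are the values for γ = 1 stated in the text. The function names are hypothetical.

```python
from fractions import Fraction as F

# Assumed preference parameters; price and taxes are the gamma = 1 values in the text.
BETA = F(1, 2)
THETA = {"H": F(5, 2), "L": F(1, 2)}
Y = F(1)
Q = F(1, 3)                            # equilibrium capital price at gamma = 1
TAX = {"H": F(-2, 3), "L": F(2, 3)}    # lump-sum taxes T_H(1), T_L(1)


def demand(theta, wealth, q):
    """Maximizer of theta*log(c1) + BETA*log(c2) subject to c1 + c2*q/Y = wealth."""
    c1 = theta / (theta + BETA) * wealth
    c2 = BETA / (theta + BETA) * wealth * Y / q
    return c1, c2


def equilibrium_choices():
    """After-tax wealth, consumption bundle, and capital trade of each type."""
    out = {}
    for k, theta in THETA.items():
        wealth = Y + Q - TAX[k]
        c1, c2 = demand(theta, wealth, Q)
        # Second-period budget c2 = (1 + a)*Y gives the capital trade a.
        out[k] = {"wealth": wealth, "c": (c1, c2), "a": c2 / Y - 1}
    return out
```

Both types end up with c2 = 1 = y, which is why neither trades capital at this equilibrium: the lump-sum transfer alone delivers the redistribution.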
With complete information about agents’ types, the government can freely
redistribute wealth among the two types of agents. With a competitive market
for capital, no further government intervention is needed for either efficiency
or distributional reasons. Nondistortionary lump-sum taxes are sufficient to
efficiently attain any distributional objective of the government, i.e., they implement any First Best Pareto-optimal allocation of consumption.

Inefficacy of Lump-Sum Taxes Under Incomplete Public Information
Suppose now that the government cannot directly observe agents’ types. Can
the government implement a wealth transfer from one type to the other when
it does not see which agents are of which type? Certainly the lump-sum tax
system described above cannot be used because it requires the knowledge
of agents’ types. Potentially, the government could elicit this information
from the agents. However, if the government uses this information to simply
transfer wealth, agents will not reveal their type truthfully.
This is very intuitive. The larger an agent’s after-tax wealth, the better
off this agent will be in any competitive equilibrium. With agents themselves
being the only source of information about their preference types, any lump-sum
tax with TH ≠ TL will make some agents lie about their type. Clearly, if
TH < TL , everybody will declare themselves to be of type H . If TL < TH ,
everybody will say they are of type L. Therefore, if the government sets
lump-sum taxes TH ≠ TL , all agents will end up paying min{TH , TL }, i.e., all
agents will pay the same amount. Given the government budget constraint, this
amount must be zero. Thus, when agents’ types are their private information,
the only lump-sum taxes that the government can feasibly implement are
TH = TL = 0. We see that if the government wants to (or is restricted
to) use lump-sum taxes to redistribute social surplus and agents have private
information, the government can redistribute nothing.
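The reporting argument can be illustrated with a toy calculation. The Python sketch below is a deliberately minimal rendering, not a model of the full economy: it assumes equal population shares (inferred from the article's numbers) and takes as given that every agent claims whichever type carries the smaller tax. The function names are hypothetical.

```python
from fractions import Fraction as F

MU = {"H": F(1, 2), "L": F(1, 2)}   # assumed population shares


def cheapest_report(taxes):
    """Type that every agent claims when the tax depends only on the claim."""
    return min(taxes, key=taxes.get)


def revenue_with_truthful_reports(taxes):
    """Per-capita revenue if each agent reported its true type."""
    return sum(MU[k] * taxes[k] for k in MU)


def revenue_with_strategic_reports(taxes):
    """Per-capita revenue when all agents pay the smaller of the two taxes."""
    return taxes[cheapest_report(taxes)]
```

For the utilitarian schedule T_H = −2/3, T_L = 2/3, truthful reporting would balance the budget, but strategic reporting has every agent collect the transfer of 2/3, which the government cannot fund. Only the zero schedule survives.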
It is worth emphasizing that the competitive equilibrium allocation remains efficient in our economy even when agents have private information
about their type, i.e., the First Welfare Theorem holds in our economy with
private information. An easy way to see that this indeed is the case is simply
to check that the competitive equilibrium allocation, $\hat{c}$, given in (15)–(16) belongs
to the set of Second Best Pareto-optimal allocations characterized in G08
(see Figure 4 in particular). In fact, this is quite intuitive. Private information
does not interfere with the price mechanism in our model because it does not
affect the nature of the commodity that is being traded. Under both complete
and private information about agents’ preferences, consumption is traded for
capital. Preferences of buyers and sellers do not affect the nature of this trade
beyond what is captured by agents’ demand functions. The competitive price
mechanism is thus efficient.13
In sum, competitive equilibrium delivers one efficient allocation in our
economy—under both complete and incomplete public information. This
allocation represents a particular distribution of the gains from trade among
the two types of agents. Thus, competitive equilibrium is suboptimal under
almost all possible strictly Paretian social preference orderings (represented
by the parameter γ ∈ [0, ∞]).14 In the case of complete public information,
this distributional problem can be remedied by lump-sum taxes. In the case of
private information, however, lump-sum taxes are powerless. In fact, in our
economy, the only implementable lump-sum tax is the zero tax on all agents.
Motivated by this, we now turn to distortionary taxes.

5. COMPETITIVE EQUILIBRIUM WITH DISTORTIONARY TAXES

For the remainder of this article, we assume that information available to the
government is incomplete. In particular, agents’ impatience is known only
to them. We assume that the government knows the population distribution
of the impatience parameter, θ , but cannot determine the value of θ on an
agent-by-agent basis. Thus, tax systems in which the amount levied on an
agent depends directly on the agent’s θ are not feasible to the government.
13 In particular, the classic lemons problem of Akerlof (1970) does not appear in this market.
14 One could also consider non-Paretian social preference orderings (see Mas-Colell,
Whinston, and Green [1995], Section 22.C). By considering only strictly Paretian social welfare
functions (of the form αuH + (1 − α)uL , where uθ = θ u(c1θ ) + βu(c2θ ) for θ = H, L) we impose
a reasonably strong restriction on the set of allocations that can be considered optimal. In this
restricted set, almost all Second Best Pareto-optimal allocations cannot be supported by competitive
equilibrium with lump-sum taxes.

Let us start by defining a general class of feasible tax systems. For dates
t = 1, 2, let Tt denote the mapping from agents’ publicly observable characteristics at t to the tax payments to the government at t. In the first period,
agents trade current consumption for capital. These trades are observable to
the government. Thus, T1 (c1 , a) represents the amount of tax due at the end
of date one. At the end of the second date, second-period consumption is also
publicly available, so T2 (c1 , a, c2 ) is the second-period tax function. Clearly,
the government can use any tax system of this form because the amounts
due from each agent depend only on what the government can observe. In
particular, T1 and T2 do not depend on the unobservable parameter θ.
Under a tax system (T1 , T2 ), agents’ budget constraints are given by
c1 + qa = y − T1 (c1 , a),
c2 = (1 + a)y − T2 (c1 , a, c2 ).
Competitive equilibrium is defined, again, analogously to Definition 1:
agents optimize, markets clear, and government budget constraints are satisfied. With taxes (T1 , T2 ), these constraints are given by
$$\sum_{\theta=H,L} \mu_\theta T_1(\hat{c}_{1\theta}, \hat{a}_\theta) = 0, \tag{25}$$
$$\sum_{\theta=H,L} \mu_\theta T_2(\hat{c}_{1\theta}, \hat{a}_\theta, \hat{c}_{2\theta}) = 0, \tag{26}$$
where $\hat{c}_{t\theta}$ and $\hat{a}_\theta$ are equilibrium values of agents’ consumption and capital
trades.
Note that any nonzero feasible tax system (T1 , T2 ) will be distortionary.
Indeed, if a tax system (T1 , T2 ) is not distortionary, then T1 and T2 must be
constant (independent of their arguments). In this case, the government budget
constraints (25) and (26) imply immediately that T1 = T2 = 0.
Since agents’ actions, but not types, are observable, it is clear that redistribution can be achieved only with taxes that depend on agents’ actions and
not types. However, it is not obvious what form these taxes should take in
order to be effective. The next subsection provides an example of a simple
distortionary tax system that is feasible but completely ineffective for implementation of redistribution.

A Simple Distortionary Tax System
In this subsection, we examine a simple tax system with a proportional tax on
capital income. In this system, capital income in period t is taxed at a flat rate
τ t ∈ [0, 1] and the proceeds are refunded to the agents as a lump-sum transfer
Tt .15 In our general notation, this tax system is written as
T1 (c1 , a) = τ 1 y − T1 ,
T2 (c1 , a, c2 ) = τ 2 (1 + a)y − T2 .
A tax system of this form consists of four numbers, (τ t , Tt )t=1,2 . Setting
τ t = Tt = 0 for t = 1, 2 gives us the competitive equilibrium outcome, i.e.,
the equilibrium allocation is a Pareto optimum with the relative weight of the
high type equal to $\gamma^{CE} = 1/3$. We want to study what other efficient allocations
can be achieved in this economy with a tax system of the form (τ t , Tt )t=1,2 .
The answer turns out to be: none.
Under taxes, (τ t , Tt )t=1,2 , agents’ budget constraints are given by
c1 + qa = y(1 − τ 1 ) + T1 ,
c2 = (1 + a)y(1 − τ 2 ) + T2 ,
and the government budget constraints are
$$\tau_1 y = T_1,$$
$$\sum_{\theta=H,L} \mu_\theta \tau_2 (1 + \hat{a}_\theta) y = T_2.$$
Because agents’ equilibrium choices $\hat{a}_\theta$ satisfy capital market clearing,
$\sum_{\theta=H,L} \mu_\theta \hat{a}_\theta = 0$, the second-period government budget constraint reduces
to
$$T_2 = \tau_2 y \Big(1 + \sum_{\theta=H,L} \mu_\theta \hat{a}_\theta\Big) = \tau_2 y.$$
Thus, in both periods the amount the government refunds to each agent must
equal the marginal capital income tax rate times the economy’s aggregate
amount of capital income, which in our model is fixed at Y = y.
Using τ 1 y = T1 , the agents’ budget constraint in the first period reduces
to
c1 + qa = (1 − τ 1 )y + τ 1 y
= y.
Thus, the first-period tax on capital income has no effect on the agents’ budgets,
as every agent has the same capital income and receives the same lump-sum
refund equal to the average capital income tax.
15 Proportional distortionary taxes have been extensively studied in a vast literature initiated
by Ramsey (1927). That literature concentrates on the question of minimization of the distortions
resulting from proportional taxes, without addressing the question of optimal taxation. In particular, that literature does not consider situations in which distortionary taxes may have a corrective
function, e.g., in economies with externalities or private information.


In the second period, using τ 2 y = T2 , we can simplify the budget constraint as follows
c2 = (1 + a)y(1 − τ 2 ) + T2
= (1 + a)y(1 − τ 2 ) + τ 2 y
= (1 + (1 − τ 2 )a)y.
We see that the lump-sum refunded flat tax on capital income, τ 2 > 0, acts
simply as a transfer from those who buy capital (a > 0) to those who sell it (a <
0). If τ 2 < 1, this transfer is proportional to the amount of capital traded.16
We note here that in a regular capital market transaction the payment that the
buyer makes to the seller is a transfer of the exact same form. In particular,
the tax payment, τ 2 a, just like a price payment, is proportional to the amount,
a, of capital being traded. From this observation, we see that a proportional
tax on capital, with τ 2 < 1, does nothing but change the equilibrium price of
capital. In particular, under any tax of this form, equilibrium allocation will
coincide with the competitive equilibrium allocation, $\hat{c}$, so no redistribution
can be achieved.
To see this point more clearly, let us write the agents’ budget constraints
again in the present-value form. From the first-period budget constraint we
have that a = (y − c1 )/q. Substituting into the budget constraint at date two,
we obtain
c2 = (1 + (1 − τ 2 )(y − c1 )/q)y,
which is equivalently written as
$$c_1 + c_2 \frac{q}{(1 - \tau_2) y} = \frac{q}{1 - \tau_2} + y.$$
Let us now denote q/(1 − τ 2 ) by Q. This value represents the tax-adjusted
price of capital. For any tax rate τ 2 < 1, we can write the present-value budget
constraint as
$$c_1 + c_2\, Q/y = Q + y,$$
which is the same expression as the budget constraint agents face in the model
without taxes, but with the price of capital, q, replaced with the tax-adjusted
price, Q. The solutions to these two models must therefore be the same, i.e.,
$\hat{Q} = 1/2$. Thus, under a proportional capital tax, the equilibrium price of capital
is $\hat{q} = (1 - \tau_2)/2$ and the unique equilibrium allocation is $\hat{c}$ for any tax rate,
τ 2 < 1.
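The neutrality of the refunded flat tax can be confirmed by solving for the equilibrium at several tax rates. The Python sketch below is an illustration under assumptions: u = log, β = 1/2, θ_H = 5/2, θ_L = 1/2, y = 1, and equal population shares are inferred from numbers quoted in the article. It uses the tax-adjusted present-value budget constraint to clear the first-period goods market for Q and then recovers q and the capital trades. The function name is hypothetical.

```python
from fractions import Fraction as F

# Assumed parameter values, inferred from numbers quoted in the article.
BETA = F(1, 2)
THETA = {"H": F(5, 2), "L": F(1, 2)}
MU = {"H": F(1, 2), "L": F(1, 2)}
Y = F(1)


def equilibrium(tau2):
    """Equilibrium (q, allocation, trades) under a refunded flat tax tau2 < 1."""
    tau2 = F(tau2)
    # Log utility: each agent spends the share theta/(theta+BETA) of wealth on c1.
    share = {k: t / (t + BETA) for k, t in THETA.items()}
    # Agents face c1 + c2*Q/Y = Q + Y with Q = q/(1 - tau2); clear the c1 market.
    Q = Y / sum(MU[k] * share[k] for k in THETA) - Y
    q = (1 - tau2) * Q
    alloc = {k: (share[k] * (Q + Y), (1 - share[k]) * (Q + Y) * Y / Q) for k in THETA}
    # First-period budget c1 + q*a = Y gives each type's capital purchase.
    trades = {k: (Y - alloc[k][0]) / q for k in THETA}
    return q, alloc, trades
```

Only the quoted price q and the size of the capital trades respond to the tax rate; the tax-adjusted price Q and the consumption allocation do not.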
This result is intuitive. Absent taxes, q/y is the price of one unit of c2 in
terms of c1 . With tax, τ 2 , on capital purchases, a, in order to obtain one extra
unit of c2 , an agent must purchase 1/((1 − τ 2 )y) units of capital at date one.
With the price of capital being q, this means that it takes q/((1 − τ 2 )y) units
of c1 to purchase one unit of c2 . The benefit of selling capital in period 1 is
symmetrically increased, as selling capital now not only brings in resources
for consumption today but also saves capital income taxes tomorrow. By
affecting both sides of a capital transaction symmetrically, the tax, τ 2 , changes
the nominal price of capital but does not change the real tradeoff that agents
face in equilibrium.
16 If τ 2 = 1, the government taxes the proceeds from the sale of capital at the rate of 100
percent. Under this tax, the market for capital is shut down, the tax proceeds are zero, and the
only equilibrium is autarchy, which is not an efficient allocation in this economy.
Setting aside the case of complete market shutdown, we see that no distortionary tax system of the form (τ , T ) can affect the competitive equilibrium
outcome. For any marginal tax rate, τ 2 < 1, the equilibrium allocation is
the same as it is for τ 2 = 0. In the next section, we consider distortionary
tax systems capable of changing the equilibrium outcome and implementing
other efficient allocations.

6. EFFICIENT REDISTRIBUTION WITH DISTORTIONARY TAXATION UNDER INCOMPLETE INFORMATION

In this section, we devise a class of tax systems that are feasible despite agents’
private information and capable of implementing any Second Best Pareto-optimal allocation. Similar to the simple system (τ , T ) considered in the
previous section, we will have a distortionary tax on capital and a lump-sum
component. However, the distortion will not affect both parties to a capital
sale/purchase transaction symmetrically.

An Optimal Distortionary Tax System
The tax system we consider in this section consists of two parts. First, there
is a lump-sum tax, Tt , levied on all agents in period t = 1, 2. Second, there
are subsidies to sufficiently extreme capital trades. The form these subsidies
take is as follows. The government sets a (negative) threshold, $\underline{a}$, and pays a
subsidy, $S_1^-$, in period 1 to all agents whose capital purchases are not greater
than $\underline{a}$ (i.e., a subsidy to all who sell a sufficiently large quantity of capital).
Alternatively, the government can set a threshold, $\bar{a}$, and pay a subsidy, $S_2^+$,
in period 2 to all agents whose capital purchases are not smaller than $\bar{a}$ (i.e.,
a subsidy to those who buy a lot of capital). In the tax system implementing
a given Second Best Pareto optimum, only one of these subsidies will be
nonzero.
A tax system of this form is therefore given by six numbers
$(T_1, S_1^-, \underline{a}, T_2, S_2^+, \bar{a})$. In the general notation introduced in the previous section, we can
express this tax system as follows:
$$T_1(c_1, a) = T_1 - I_{\underline{a}}(a)\, S_1^-, \tag{27}$$
$$T_2(c_1, a, c_2) = T_2 - I_{\bar{a}}(a)\, S_2^+, \tag{28}$$
where $I_{\underline{a}}$ and $I_{\bar{a}}$ are indicator functions given by
$$I_{\underline{a}}(a) = \begin{cases} 1 & \text{if } a \le \underline{a}, \\ 0 & \text{otherwise,} \end{cases}$$
and
$$I_{\bar{a}}(a) = \begin{cases} 1 & \text{if } a \ge \bar{a}, \\ 0 & \text{otherwise.} \end{cases}$$
We restrict attention to this class of tax systems because, as we will show,
taxes in this class are sufficient for implementation of all Second Best Pareto-optimal
allocations. In the next section, we discuss the possibility of implementing
Second Best Pareto optima with other tax mechanisms.
Clearly, since taxes (27)–(28) do not depend on the unobservable parameter θ,
agents of both types face the same budget constraint, which is given
by
$$c_1 + qa \le y - T_1 + I_{\underline{a}}(a)\, S_1^-,$$
$$c_2 \le (1 + a) y - T_2 + I_{\bar{a}}(a)\, S_2^+.$$
Also, the government budget constraints (25)–(26) can be expressed as
$$\sum_{\theta=H,L} \mu_\theta S_1^- I_{\underline{a}}(\hat{a}_\theta) = T_1,$$
$$\sum_{\theta=H,L} \mu_\theta S_2^+ I_{\bar{a}}(\hat{a}_\theta) = T_2.$$
As before, competitive capital market equilibrium with taxes
$T = (T_1, S_1^-, \underline{a}, T_2, S_2^+, \bar{a})$ consists of a consumption allocation,
$\hat{c} = (\hat{c}_{1H}, \hat{c}_{1L}, \hat{c}_{2H}, \hat{c}_{2L})$; capital trades, $\hat{a} = (\hat{a}_H, \hat{a}_L)$; and a capital price, $\hat{q}$,
such that (i) agents optimize, i.e., the equilibrium allocation maximizes agents’ utility given the price,
$\hat{q}$, and taxes, T ; (ii) the capital market clears; and (iii) the government’s budget
is balanced in every period. As before, we will say that the tax system, T , implements
a Second Best Pareto optimum, $c^{**}(\gamma)$, if there exists a competitive
equilibrium such that $\hat{c} = c^{**}(\gamma)$.
Analogous to (19), let $m^{**}_\theta(\gamma)$ denote the intertemporal marginal rate of
substitution of agents of type θ at the Second Best Pareto optimum, $c^{**}(\gamma)$,
i.e.,
$$m^{**}_\theta(\gamma) = \frac{\beta u'(c^{**}_{2\theta}(\gamma))}{\theta u'(c^{**}_{1\theta}(\gamma))}.$$
The following result is a version of the Second Welfare Theorem with private
information.

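Theorem 3 Every Second Best Pareto optimum $c^{**}$ can be implemented as
a competitive equilibrium with taxes, T . In particular, for γ ∈ [0, ∞], the
Second Best Pareto optimum $c^{**}(\gamma)$ is implemented by the tax system
$$T(\gamma) = (T_1(\gamma), S_1^-(\gamma), \underline{a}(\gamma), T_2(\gamma), S_2^+(\gamma), \bar{a}(\gamma))$$
given as follows.
For $\gamma < \gamma^{CE}$:
$$T_1(\gamma) = S_1^-(\gamma) = \underline{a}(\gamma) = 0,$$
$$T_2(\gamma) = Y - c^{**}_{2H}(\gamma) + m^{**}_H(\gamma)^{-1} \left[ Y - c^{**}_{2L}(\gamma) \right],$$
$$S_2^+(\gamma) = c^{**}_{2L}(\gamma) - c^{**}_{2H}(\gamma) + m^{**}_H(\gamma)^{-1} \left[ c^{**}_{1L}(\gamma) - c^{**}_{1H}(\gamma) \right],$$
$$\bar{a}(\gamma) = \left[ Y m^{**}_H(\gamma) \right]^{-1} \left[ Y - c^{**}_{1H}(\gamma) \right].$$
For $\gamma \ge \gamma^{CE}$:
$$T_2(\gamma) = S_2^+(\gamma) = \bar{a}(\gamma) = 0,$$
$$T_1(\gamma) = Y - c^{**}_{1L}(\gamma) + m^{**}_L(\gamma) \left[ Y - c^{**}_{2L}(\gamma) \right],$$
$$S_1^-(\gamma) = c^{**}_{1H}(\gamma) - c^{**}_{1L}(\gamma) + m^{**}_L(\gamma) \left[ c^{**}_{2H}(\gamma) - c^{**}_{2L}(\gamma) \right],$$
$$\underline{a}(\gamma) = Y^{-1} \left( c^{**}_{2H}(\gamma) - Y \right).$$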

Although the expressions for the thresholds and transfers specified in the
tax system, T (γ ), look complicated, the intuition behind them is very simple.
Absent taxes, as we have seen, the competitive market mechanism implements
the efficient allocation c∗∗ (γ CE ). In order to implement an optimum c∗∗ (γ ) for
some γ > γ CE , the government must redistribute from the patient types, L, to
the impatient types, H , (recall that γ is the relative weight that the impatient
type, H , receives in the social welfare objective). How can this redistribution
be achieved when the government cannot observe agents’ types?
In competitive equilibrium without taxes, the impatient types sell capital
because of their strong preference for first-period consumption. The patient
types buy it. Thus, the government knows ex post who the impatient and patient agents are simply by looking at agents’ capital trades. Suppose then that
the government, targeting the impatient agents, gives a small subsidy to those
who sell a sufficiently large quantity of capital. If the subsidy is small enough,
or the minimum sale size requirement is sufficiently large, this subsidy will
not cause the patient agents to change their behavior (i.e., to flip from buying
to selling capital).17 Under such a subsidy, patient agents still buy capital and,
therefore, do not collect the subsidy. The impatient agents, who were selling
capital even without the subsidy, continue to sell it, which now gives them the
additional benefit of the subsidy. Thus, the subsidy reaches the targeted type.
If this subsidy is funded by lump-sum taxes on all agents, it redistributes from
the patient agents to the impatient ones, as intended. The optimal tax mechanism, T (γ ), delivers the subsidy to the targeted type precisely in this way.
For any γ > γ CE , the optimal tax system, T (γ ), provides a threshold level,
$\underline{a}(\gamma)$, and a subsidy level, $S_1^-(\gamma)$, that achieve in equilibrium the amount of
redistribution (relative to the competitive market allocation) required to implement the optimal allocation, c∗∗ (γ ).
17 In the language of mechanism design, the market mechanism distorted by such a subsidy
remains incentive compatible.
Similarly, in order to implement the optimum c∗∗ (γ ) for some γ < γ CE ,
the government redistributes from the impatient types, H , to the patient types,
L. Taxes, T (γ ), are again designed to not induce the agents to flip, so the
impatient types continue to sell capital and the patient types continue to buy
it. For γ < γ CE , the lump-sum-funded subsidy, $S_2^+(\gamma)$, goes to the buyers
of capital, that is types L, and thus reaches the targeted type of agent. In this
way, tax T (γ ) achieves the desired redistribution.
Let us now argue slightly more formally that this intuition is consistent
with equilibrium. We need to demonstrate that conditions (i)–(iii) defining
competitive equilibrium with taxes are satisfied under taxes, T (γ ), with consumption,
$\hat{c} = c^{**}(\gamma)$, along with some prices, $\hat{q}(\gamma)$, and capital trades, $\hat{a}_\theta(\gamma)$.
More precisely, we will argue that equilibrium capital price, $\hat{q}(\gamma)$, can be obtained
from the IMRS of the agents who do not receive the subsidy to capital
sales/purchases. For γ > γ CE , these are the patient agents, i.e.,
$$\hat{q}(\gamma) = m^{**}_L(\gamma)\, y \tag{29}$$
for these γ . For γ < γ CE , the impatient types do not receive the subsidy, thus
$$\hat{q}(\gamma) = m^{**}_H(\gamma)\, y$$
for all γ in this range. The subsidy threshold levels $\underline{a}(\gamma)$ and $\bar{a}(\gamma)$ are such
that the following capital trades are optimal in the agents’ utility maximization
problem:
$$\hat{a}_H(\gamma) = \underline{a}(\gamma), \qquad \hat{a}_L(\gamma) = -\frac{\mu_H}{\mu_L}\, \underline{a}(\gamma)$$
for each γ > γ CE , and
$$\hat{a}_H(\gamma) = -\frac{\mu_L}{\mu_H}\, \bar{a}(\gamma), \qquad \hat{a}_L(\gamma) = \bar{a}(\gamma)$$
for each γ < γ CE .
Checking that equilibrium conditions (ii) and (iii) are satisfied amounts
to a bit of simple algebra. The crux of the argument is in checking the first
equilibrium condition, i.e., in showing that under taxes, T (γ ), and proposed
equilibrium prices, $\hat{q}(\gamma)$, agents of types H and L indeed find it optimal to
choose the proposed equilibrium capital trades $\hat{a}_H(\gamma)$ and $\hat{a}_L(\gamma)$, respectively.
An algebraic proof of this result would be very tedious. In particular, note
that the algebraic argument we used in the case of lump-sum taxes with full
information cannot be used here, as the Euler equations are invalid due to the
budget line being given by a non-differentiable function.
We will thus proceed differently. For several selected values of γ , we will
demonstrate graphically that the optimal allocation c∗∗ (γ ) is consistent with
agents’ individual utility maximization under taxes, T (γ ). Qualitatively, these
values will be representative of the whole spectrum of γ . From our graphical
argument, it will be clear that the conclusion holds for all γ ∈ [0, ∞].
Consider the case of γ = 1 (which represents the utilitarian social welfare
objective). Since $1 > \gamma^{CE} = \frac{1}{3}$, we have that the tax system, $T(1)$, provides a
subsidy, $S_1^-(1)$, to agents whose capital purchases, a, are not larger than $\underline{a}(1)$.
From the closed-form expression for $c^{**}(1)$ given in equations (21)–(22) of
G08, we have that the optimal utilitarian allocation has $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$ and
$c_L^{**}(1) = (\frac{1}{2}, \frac{3}{2})$. Substituting these values into the formula for tax parameters
$T(1)$ given in the statement of Theorem 3, we have
$$T_2(\gamma) = S_2^+(\gamma) = \bar{a}(\gamma) = 0$$
and
$$T_1(1) = \frac{1}{3}, \qquad S_1^-(1) = \frac{2}{3}, \qquad \underline{a}(1) = -\frac{1}{2}.$$
Under the tax system, $T(1)$, therefore, agents who sell at least half of their
initial capital stock receive the subsidy of $\frac{2}{3}$ units of consumption at date one.
There is no subsidy to buying assets. All agents pay the lump-sum tax of $\frac{1}{3}$ at
date one. From (29) we compute
$$\hat{q}(1) = \frac{1}{3}.$$
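The arithmetic behind these numbers can be checked directly. The sketch below is only illustrative: the tax parameters and the price are those reported above, while the date-one endowment of 1, the initial capital stock of 1 that pays one unit of consumption at date two, and the equal population shares of the two types are assumptions made here so that the budget identities can be written down (they are not restated in this section). The function name `consumption` is hypothetical.

```python
from fractions import Fraction as F

# Tax parameters T(1) and the capital price reported in the text.
T1 = F(1, 3)           # lump-sum tax at date one
S1_MINUS = F(2, 3)     # date-one subsidy for agents with a <= A_SALE
A_SALE = F(-1, 2)      # capital-sale threshold
Q_HAT = F(1, 3)        # price of capital

# Illustrative assumptions (not restated in this section).
Y1 = F(1)              # date-one endowment of the consumption good
K0 = F(1)              # initial capital stock
PAYOFF = F(1)          # date-two consumption per unit of capital
MU_H = MU_L = F(1, 2)  # population shares of the two types


def consumption(a):
    """Return (c1, c2) for an agent whose net capital purchase is a."""
    subsidy = S1_MINUS if a <= A_SALE else F(0)
    c1 = Y1 - T1 + subsidy - Q_HAT * a
    c2 = PAYOFF * (K0 + a)
    return c1, c2


c_H = consumption(F(-1, 2))   # type H sells half of its capital
c_L = consumption(F(1, 2))    # type L buys the capital that type H sells
net_revenue = MU_H * (T1 - S1_MINUS) + MU_L * T1

print(c_H, c_L, net_revenue)
```

Under these assumptions the two trades reproduce the bundles (3/2, 1/2) and (1/2, 3/2), and the date-one taxes collected exactly cover the subsidies paid.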
The thick crooked line in Figure 3 represents the budget constraint that
all agents face in their utility maximization problem under taxes, $T(1)$, and
price, $\hat{q}(1)$. The horizontal segment of this budget constraint results from
the subsidy, $S_1^-(1)$. The horizontal dashed line represents the lump-sum tax,
$T_1$. The two convex curves in Figure 3 are the highest indifference curves
that types H and L attain in their utility maximization problems under taxes,
$T(1)$, and price, $\hat{q}(1)$. The indifference curve of type H has exactly one
point in common with the budget constraint, $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$. The impatient
agents, therefore, maximize their utility by choosing the consumption pair
$c_H^{**}(1)$, which is consistent with implementation of the Second Best Pareto
optimum $c^{**}(1)$. The indifference curve of type L meets the budget constraint
at two points: $c_L^{**}(1) = (\frac{1}{2}, \frac{3}{2})$ and $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$. Thus, $c_L^{**}(1)$ is consistent
with the individual utility maximization of the L types as well, although not
uniquely.18
18 That this individual optimum is not unique is necessary in the implementation of the
optimum $c^{**}(1)$ because the incentive compatibility constraint of type L, (6), binds at $c^{**}(1)$.
Non-uniqueness for at least one type of agent will appear in any implementation of any Second
Best Pareto optimum at which at least one of the incentive compatibility constraints (5)–(6) is
binding.

Figure 3 Individual Optima of the Two Types Under the Budget
Constraint Resulting from Taxes T(1)
[Figure not reproduced: horizontal axis c1, vertical axis c2, both running from 0 to 3.]

Note in Figure 3 that the indifference curve of type H is flatter at $c_H^{**}(1) = (\frac{3}{2}, \frac{1}{2})$
than the downward-sloping segment of the budget constraint at this
point. This is a consequence of the so-called intertemporal wedge, which is
described in detail in G08. The slope of the budget line, everywhere outside of
the horizontal segment, equals $-m_L^{**}(1)^{-1}$. The slope of the indifference curve
of the H type at $c_H^{**}(1)$ is $-m_H^{**}(1)^{-1}$. Because of the intertemporal wedge
prevailing at the optimal allocation $c^{**}(1)$, these two rates are not equal. In
fact, the sloping segment of the budget line is strictly steeper than the indifference
curve of the H type at $c_H^{**}(1)$. This implies that the optimal subsidy,
$S_1^-(1)$, could not be made available with a weaker capital sale requirement than
$\underline{a}(1) = -\frac{1}{2}$. Given the intertemporal wedge, which implies that the H type is
savings-constrained, a weaker requirement (a threshold closer to zero) would
provide a smaller distortion and benefit the H types. It would, however, also
benefit the L-type agents, causing them to change their behavior from buying
capital and receiving no subsidy to selling capital and qualifying for the subsidy,
which would make this tax mechanism miss its subsidy target. For that reason,
the H-type agents must remain savings-constrained in equilibrium.
That the same construction of equilibrium holds for all $\gamma > \gamma^{CE}$ can be
easily checked using the expressions for taxes, $T(\gamma)$, provided in the statement
of Theorem 3 and prices, $\hat{q}(\gamma)$, given in (29). One difference appears when
we consider the Second Best Pareto optima $c^{**}(\gamma)$ for the values of γ for
which the incentive constraints do not bind.19 When no incentive constraints
bind, the consumption bundle $c_\theta^{**}(\gamma)$ is a unique maximizer in the individual
utility maximization problems of both types θ = H, L. The slope of the
non-horizontal segment of the budget line at $c_H^{**}(\gamma)$ is equal to the slope of
the indifference curve of the H type at this point; the allocation is free of
intertemporal wedges. This means that agents of type H would not benefit
by selling slightly fewer claims than $\underline{a}(\gamma)$ even if the subsidy, $S_1^-(\gamma)$, were
available at a slightly lower threshold. In this sense, the threshold, $\underline{a}(\gamma)$, is not
uniquely pinned down by the optimum $c^{**}(\gamma)$ for these values of γ. Figure 4
depicts this construction for one such value, namely γ = 0.4.
Let us now turn to the Second Best Pareto optimum $c^{**}(0)$, i.e., the worst
among all Second Best Pareto-optimal allocations from the point of view of
the agents of type H. In order to implement this outcome, the government
subsidizes capital purchases. Calculating taxes, $T(0)$, from the formulas given
in Theorem 3, and pinning down the capital price from the IMRS of the agents
of type H (who do not receive the subsidy in equilibrium), we construct the
budget constraint depicted in Figure 5. The vertical segment of the budget
constraint represents the subsidy, $S_2^+(0)$. The dashed vertical line represents
the lump-sum tax, $T_2(0)$. The maximal indifference curve attained by the
agents of type H touches the budget line at two points: $c_H^{**}(0)$ and $c_L^{**}(0)$. The
maximal indifference curve of the agents of type L touches the budget line
only at $c_L^{**}(0)$. Within this budget set, therefore, both types of agents choose
to consume their part of the optimal allocation, $c^{**}(0)$. In this way, the tax
system, $T(0)$, implements the Second Best Pareto optimum $c^{**}(0)$.
As before, this construction generalizes to all $\gamma < \gamma^{CE}$. For those γ for
which the incentive constraint of the H type does not bind, both types' optimal
consumption, $c_\theta^{**}(\gamma)$, is the unique maximizer of individual utility under the
budget constraints obtained from the equilibrium price, $\hat{q}(\gamma) = m_H^{**}(\gamma)\, y$,
and taxes, $T(\gamma)$. In those cases, as well, the optimal threshold, $\bar{a}(\gamma)$, is not
uniquely determined by the optimum $c^{**}(\gamma)$.
From the above graphical constructions, we can see how the implementation argument extends to all values of $\gamma \in [0, \infty]$.

19 As shown in G08, this is the case for γ in the interval $[\gamma^{CE}, \gamma_2]$, where $\gamma_2$ is the
threshold value at which the incentive constraint for the L type begins to bind.

Figure 4 Individual Optima of the Two Types Under the Budget
Constraint Resulting from Taxes T(0.4)
[Figure not reproduced: horizontal axis c1, vertical axis c2, both running from 0 to 3.]

7. OTHER TAX MECHANISMS

In this section, we briefly discuss the question of the uniqueness of the tax
system, T(γ). The tax system, T(γ), is by no means the only tax system
capable of implementing Second Best Pareto optima.
Figure 5 Individual Optima of the Two Types Under the Budget
Constraint Resulting from Taxes T(0)
[Figure not reproduced: horizontal axis c1, vertical axis c2, both running from 0 to 3.]

Consider an arbitrary feasible tax system, $T$, and denote by $B(T)$ the set of
all consumption pairs $(c_1, c_2)$ that are budget-feasible in the agents' individual
utility maximization problem under taxes, $T$. Suppose that (a) $B(T)$ contains
the consumption pairs $c_H^{**}(\gamma)$ and $c_L^{**}(\gamma)$, and (b) $B(T)$ is contained in the
lower envelope of the indifference curves of the agents of type θ traced from
the optimal consumption bundles $c_\theta^{**}(\gamma)$. It can easily be seen in Figures 3,
4, and 5 that any tax system, $T$, that satisfies (a) and (b) does implement the
optimum $c^{**}(\gamma)$. This point goes back to Mirrlees (1971).
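Conditions (a) and (b) can also be checked numerically for a given budget set. The sketch below does so for the budget frontier generated by T(1) and the bundles of $c^{**}(1)$ reported earlier. Everything about preferences is an assumption made for illustration: utility is taken to be log(c1) + beta * log(c2) with beta equal to 0.5 for type H and 1 for type L, and agents are assumed to start with a date-one endowment of 1 and one unit of capital paying one unit of date-two consumption. G08 specifies the model's actual preferences and endowments; the function names are hypothetical.

```python
import math

# Hypothetical preferences: u(c1, c2) = log(c1) + beta * log(c2).
BETA = {"H": 0.5, "L": 1.0}
# Bundles of the optimum c**(1) reported in the text.
TARGET = {"H": (1.5, 0.5), "L": (0.5, 1.5)}


def utility(theta, c):
    c1, c2 = c
    return math.log(c1) + BETA[theta] * math.log(c2)


def budget_frontier():
    """Frontier of B(T(1)): one consumption pair per capital trade a."""
    points = []
    for k in range(-99, 200):
        a = k / 100                                # net capital purchase
        subsidy = 2 / 3 if a <= -0.5 else 0.0      # S1^-(1) below the threshold
        c1 = 1 - 1 / 3 + subsidy - a / 3           # endowment - tax + subsidy - q*a
        c2 = 1 + a                                 # payoff of the capital held
        points.append((c1, c2))
    return points


def contains_target(theta, tol=1e-9):
    """Condition (a): the target bundle of type theta is budget-feasible."""
    t1, t2 = TARGET[theta]
    return any(abs(c1 - t1) < tol and abs(c2 - t2) < tol
               for c1, c2 in budget_frontier())


def below_indifference_curve(theta, tol=1e-9):
    """Condition (b): no frontier point beats the target bundle for type theta."""
    best = max(utility(theta, c) for c in budget_frontier())
    return best <= utility(theta, TARGET[theta]) + tol


for theta in ("H", "L"):
    print(theta, contains_target(theta), below_indifference_curve(theta))
```

Checking only the frontier is enough here because utility is increasing in both goods, so any point inside the budget set is dominated by a frontier point.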
Nevertheless, the tax system, T (γ ), used in Theorem 3 has several features
that may be appealing (on the basis of out-of-model considerations, however).
First, it is simple. Second, it does not crowd out the market completely. Let
us discuss these two points by comparing the tax system, T (γ ), with two
alternatives.
As the first alternative, consider a tax system in which the government
taxes away all private wealth by setting the lump-sum taxes $T_1 = T_2 = y$
and offers two government welfare programs, with each agent in the economy
being eligible to sign up for at most one. The first welfare program hands
out consumption $c_H^{**}(\gamma)$ to each agent who signs up for it. The second hands
out $c_L^{**}(\gamma)$.20 Clearly, this system can implement any Second Best Pareto
optimum $c^{**}(\gamma)$, as well as any resource-feasible and incentive-compatible
allocation. But it may be considered unappealing. Under this tax mechanism,
the market is completely shut down: Anticipating the lump-sum tax, $T_2$, agents
hold on to their capital and just consume the government handout. All trade is
crowded out by the combination of high taxes and generous welfare programs.
All transfers in this economy go through the hands of the government. The
tax system, $T(\gamma)$, of the previous section is comparatively appealing because
it calls for a much smaller government intervention in the market economy.
Only a part of the transfers needed to support Pareto-optimal allocations go
through the hands of the government, with private markets having a clear role.
Another possible tax system is one under which the budget constraint,
$B(T)$, is exactly equal to the lower envelope of the indifference curves traced
from the optimal consumption, $c_\theta^{**}(\gamma)$, of the two types of agents. Under this
system, the size of the transfers going through the government's hands is
minimal. This system, however, is complicated because a high degree of
nonlinearity in the implicit tax rates is required to trace out the nonlinear
indifference curves of the two types. By comparison, the system, $T(\gamma)$, is
simple, with the budget constraint being given by a linear schedule with just
one parallel shift (by the amount of subsidy $S_1^-$ or $S_2^+$).

8. CONCLUSION

Classical general equilibrium analysis of competitive markets provides a strong
argument against distortionary government interventions. Market allocations
are efficient and all societal needs for redistribution can be efficiently achieved
with lump-sum taxes and transfers. There is no reason to use distortionary
taxes in the classical general equilibrium model. From the vantage point of
the classical theory, distortionary taxes, which in fact are used by governments
in many countries, may appear to reflect a failure of government policy.
This appearance is overturned when one recognizes the strong informational requirements imposed by the classical general equilibrium theory.
When governments do not possess sufficiently fine information about the agents
populating the economy, general equilibrium analysis leads to a completely
different view of distortionary taxation. As our simple model illustrates,
with incomplete public information, governments must necessarily rely on
distortionary taxes in order to efficiently implement the desired level of
redistribution.
20 One can see that this tax mechanism is simply a version of the direct revelation mechanism
used to define the Social Planning Problem in G08.

REFERENCES
Akerlof, George. 1970. “The Market for Lemons: Qualitative Uncertainty
and the Market Mechanism.” Quarterly Journal of Economics 84
(August): 488–500.
Debreu, Gérard. 1959. Theory of Value. New Haven and London: Yale
University Press.
Grochulski, Borys. 2008. “Limits to Redistribution and Intertemporal
Wedges: Implications of Pareto Optimality with Private Information.”
Economic Quarterly 94 (Spring): 173–96.
Kocherlakota, Narayana R. 2007. “Advances in Dynamic Optimal Taxation.”
In Advances in Economics and Econometrics: Theory and Applications,
Ninth World Congress, Vol. 1 edited by Richard Blundell, Whitney
Newey, and Torsten Persson. New York: Cambridge University Press,
269–97.
Mas-Colell, Andreu, Michael D. Whinston, and Jerry R. Green. 1995.
Microeconomic Theory. New York: Oxford University Press.
Mirrlees, James A. 1971. “An Exploration in the Theory of Optimum Income
Taxation.” Review of Economic Studies 38 (April): 175–208.
Pigou, Arthur C. 1932. The Economics of Welfare. London: Macmillan and
Co. Limited.
Ramsey, F. P. 1927. “A Contribution to the Theory of Taxation.” Economic
Journal 37: 47–61.
Stiglitz, Joseph E. 1987. “Pareto Efficient and Optimal Taxation and the New
Welfare Economics.” In Handbook of Public Economics, vol. 2, edited
by Alan J. Auerbach and Martin Feldstein. Amsterdam: North-Holland,
991–1,042.
Werning, Iván. 2007. “Optimal Fiscal Policy with Redistribution.” The
Quarterly Journal of Economics 122 (August): 925–67.

Economic Quarterly—Volume 95, Number 3—Summer 2009—Pages 269–288

The Behavior of Household
and Business Investment
over the Business Cycle
Kausik Gangopadhyay and Juan Carlos Hatchondo

The spillover effects associated with the decline in the housing market
during 2007 and 2008 suggest the importance of this market for the
overall economy. Yet the decision to purchase a house is only part
of a broader plan of production and consumption of goods within the household. The residential services homeowners enjoy from their dwelling, the
transportation services they enjoy from their automobiles, the meals prepared
at home, the child/adult care services provided within the household, and the
entertainment services derived from television and audio equipment are just a
few examples of goods that are produced and consumed within the household,
as opposed to goods that are purchased in the market. The size of this non-market output is quite significant: Benhabib, Wright, and Rogerson (1991)
estimate that the output of the household sector in the United States is approximately half of the size of the output in the market sector.1 Furthermore, the
production of non-market goods requires the use of capital. Greenwood and
Hercowitz (1991) report that the stock of household capital is actually larger
than the stock of capital in the market sector. Examples of household capital
are the dwellings owned and occupied by the household, automobiles owned
and used by the household’s members, home appliances, furniture, etc.
Given the size of the household sector, several studies have incorporated
this sector into the real business cycle model with the goal of enhancing the
understanding of aggregate fluctuations of economic activity. Even though
the real business cycle model has proven to be a powerful tool for explaining
basic patterns of business cycle fluctuations in the United States, it has faced
several challenges when it has been utilized to account for the behavior of
business and household investment. This article presents a summary of the
literature that studies the behavior of household investment decisions over the
business cycle.

Gangopadhyay is an assistant professor at the Indian Institute of Management Kozhikode.
Hatchondo is an economist at the Federal Reserve Bank of Richmond. The views expressed
in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or
the Federal Reserve System. E-mails: kausik@iimk.ac.in; juancarlos.hatchondo@rich.frb.org.
1 Except for the flow of services provided by dwellings to homeowners, the rest of non-market
output produced within the household goes unreported in the System of National Accounts.
Previous studies have emphasized three stylized facts about the cyclical
behavior of household and business investment in the United States: (1) both
investment components display a positive co-movement with output—as well
as a positive co-movement with each other, (2) household investment is more
volatile than business investment, and (3) household investment leads the cycle
whereas business investment lags the cycle. With respect to the last finding,
household investment is correlated more with future output than with current
or past output, while business investment is correlated more with past output
than with current or future output. This article discusses the performances
of previous studies in terms of their ability to account for these stylized facts
within a framework that is broadly consistent with the main properties of
business cycles in the United States.
This article provides a summary of studies that have extended the real
business cycle model in order to reach a better understanding of the facts
described above. Alternative explanations for the positive co-movement and
relative volatilities between the two investment components have relied on different degrees of complementarity between capital and labor in the production
of home goods, the presence of alternative uses for labor and/or household
capital, and the presence of a more costly adjustment in the stock of market
capital compared with the stock of household capital. The leading behavior of
household investment has been harder to explain. The two studies that have
succeeded in accounting for this fact have relied on household capital as a
factor that may enhance the quality of the labor force and on a multiple-sector
model in which capital goods are produced in a separate sector. All the studies reviewed in this article rely on exogenous shocks to productivity levels
as the driving force of cyclical fluctuations. This modeling strategy abstracts
from explanations for cyclical fluctuations in which market imperfections lead
to inefficiently low or high output levels. For example, none of the studies
revisited in this article feature residential investment driven by house prices
that may be misaligned with fundamentals. This implies that the studies surveyed in this article portray cyclical downturns as an efficient response of the
economy to “bad shocks.”
The rest of the paper is organized as follows. Section 1 describes the main
characteristics of the business cycle in the United States and the importance of
household production. Sections 2 and 3 present a summary of the literature on
the cyclical behavior of household and business investment. The conclusions
are noted in Section 4.

1. DATA DESCRIPTION

The concept of business cycles refers to fluctuations of economic activity
around its long-run growth path. The long-run growth path is commonly referred to as the trend of the time series of an economic variable. The cyclical
component of the series is defined as the deviation from the trend. In real
business cycle theory, economists study the behavior of the cyclical component. For example, studies of business cycles focus on the persistence
of the detrended component of economic aggregates, the co-movement among
various detrended (cyclical) components, their leading or lagging behavior
relative to the detrended component of output, and the relative amplitudes
(standard deviations, or volatilities) of the various detrended series.
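The split of a series into trend and cyclical component can be made concrete. Footnote 2 of this article states that trends are computed with the Hodrick-Prescott filter with a smoothing parameter of 1,600, applied to the natural logarithm of each variable. The sketch below is a minimal, unoptimized version of that filter in plain Python; the function name `hp_filter` and the sample series are hypothetical, and a dense linear solve is used only to keep the example self-contained.

```python
def hp_filter(y, lam=1600.0):
    """Split y into (trend, cycle) with the Hodrick-Prescott filter.

    The trend tau minimizes
        sum (y_t - tau_t)^2 + lam * sum (tau_{t+1} - 2 tau_t + tau_{t-1})^2,
    whose first-order conditions are the linear system (I + lam * D'D) tau = y,
    where D takes second differences.
    """
    n = len(y)
    # Build A = I + lam * D'D, adding one second-difference row of D at a time.
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n - 2):
        d = {i: 1.0, i + 1: -2.0, i + 2: 1.0}
        for r, dr in d.items():
            for c, dc in d.items():
                A[r][c] += lam * dr * dc
    # Solve A * tau = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            if f != 0.0:
                for c in range(col, n):
                    A[r][c] -= f * A[col][c]
                b[r] -= f * b[col]
    tau = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * tau[c] for c in range(r + 1, n))
        tau[r] = (b[r] - s) / A[r][r]
    cycle = [obs - t for obs, t in zip(y, tau)]
    return tau, cycle


# Hypothetical log-level series, for illustration only.
log_series = [4.60, 4.62, 4.61, 4.65, 4.66, 4.64, 4.69, 4.71]
trend, cycle = hp_filter(log_series)
print([round(c, 4) for c in cycle])
```

A series that grows along an exact straight line has zero second differences, so the filter returns it unchanged as trend and the cyclical component is zero.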
The remarkable feature about fluctuations of aggregate variables over time
is that the cyclical components tend to move in a synchronized mode. There
has been an extensive literature over the last 30 years aimed at reaching a
coherent understanding of the regularities that characterize the business cycle
in the U.S. economy. As was pointed out by Lucas (1977), the development
of a theoretical explanation for these regularities constitutes a first step toward
the design of sound policy measures.
This section does not provide an exhaustive description of the properties
of business cycles in the United States. Instead, it focuses on the cyclical
behavior of the aggregate variables that are studied in this article.
Table 1 presents the behavior of market output, market consumption,
household and business investment, and total hours worked in the market
sector. The moments are computed using data from the first quarter of 1964
to the second quarter of 2008.2 The second column reports the standard deviation of market output and ratios of the standard deviations of each variable
relative to the standard deviation of market output. The remaining columns
report the cross-time correlation between each variable and market output.
In particular, the seventh column illustrates that there is a significant positive
co-movement between all five variables. However, the highest magnitudes of
the coefficients of correlations do not necessarily correspond to the contemporaneous correlations. Household investment is more closely correlated with
market output one and two quarters ahead than with current market output:

2 Market output consists of gross domestic product less consumption of housing services.
Market consumption consists of personal consumption expenditures in nondurables and services less
housing services. Household investment consists of residential fixed investment and expenditures in
durable consumption goods. Business investment consists of nonresidential fixed investment. Market
hours consists of total hours worked in the private sector. The Bureau of Economic Analysis is
the primary source for the first four variables and the Bureau of Labor Statistics is the primary
source for the last variable. The moments reported in the table correspond to deviations from the
trend of the natural logarithm of each variable. Trends are computed using the Hodrick-Prescott
filter with a smoothing parameter of 1,600.

Table 1 Properties of Business Cycles in the United States, Selected Moments

                                  Cross Correlation of Market Output at Period t with:
Variable              Std. Dev.   x_{t-4}  x_{t-3}  x_{t-2}  x_{t-1}  x_t    x_{t+1}  x_{t+2}  x_{t+3}  x_{t+4}
Market Output           1.66       0.26     0.47     0.68     0.86    1.00    0.86     0.68     0.47     0.26
Market Consumption      0.55       0.43     0.61     0.75     0.82    0.79    0.66     0.49     0.30     0.10
Business Investment     2.91      −0.06     0.13     0.37     0.59    0.78    0.84     0.81     0.71     0.54
Household Investment    4.03       0.58     0.68     0.78     0.81    0.73    0.50     0.27     0.04    −0.15
Market Hours            1.11       0.02     0.22     0.46     0.69    0.86    0.89     0.82     0.69     0.51

$\mathrm{corr}(x_{h,t-2}, y_t) = 0.78$ and $\mathrm{corr}(x_{h,t-1}, y_t) = 0.81$, while $\mathrm{corr}(x_{h,t}, y_t) = 0.73$.$^{3}$
In contrast, business investment is correlated more with market output
one and two quarters earlier than with current market output: $\mathrm{corr}(x_{m,t+1}, y_t)$
= 0.84 and $\mathrm{corr}(x_{m,t+2}, y_t) = 0.81$, while $\mathrm{corr}(x_{m,t}, y_t) = 0.78$. In addition, both
investment components are significantly more volatile than market output and
consumption.
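The lead/lag reading of Table 1 can be reproduced mechanically: a variable is said to lead the cycle when its correlation with market output at period t peaks at a negative shift, and to lag when it peaks at a positive shift. The sketch below transcribes the cross correlations from Table 1 and applies that rule; the names `peak_shift` and `classify` are hypothetical, and using the single largest coefficient as the criterion is a simplification made for illustration.

```python
# Shifts k in corr(x_{t+k}, y_t), from x_{t-4} to x_{t+4}, as in Table 1.
SHIFTS = list(range(-4, 5))
TABLE_1 = {
    "Market Output":        [0.26, 0.47, 0.68, 0.86, 1.00, 0.86, 0.68, 0.47, 0.26],
    "Market Consumption":   [0.43, 0.61, 0.75, 0.82, 0.79, 0.66, 0.49, 0.30, 0.10],
    "Business Investment":  [-0.06, 0.13, 0.37, 0.59, 0.78, 0.84, 0.81, 0.71, 0.54],
    "Household Investment": [0.58, 0.68, 0.78, 0.81, 0.73, 0.50, 0.27, 0.04, -0.15],
    "Market Hours":         [0.02, 0.22, 0.46, 0.69, 0.86, 0.89, 0.82, 0.69, 0.51],
}


def peak_shift(row):
    """Return the shift k at which corr(x_{t+k}, y_t) is largest."""
    return max(zip(SHIFTS, row), key=lambda pair: pair[1])[0]


def classify(k):
    """Label a peak shift: negative means x moves before output."""
    if k < 0:
        return "leads"
    if k > 0:
        return "lags"
    return "coincident"


timing = {name: classify(peak_shift(row)) for name, row in TABLE_1.items()}
for name, label in timing.items():
    print(f"{name}: peak at k = {peak_shift(TABLE_1[name])} ({label})")
```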
The leading behavior of household investment is also apparent in Figure
1. The graph illustrates the dynamics of household investment, business investment, and output before and after each of the last seven recessions. Except
for the 2001 recession, household investment had already peaked and was in
decline at the beginning of each recession. On the other hand, except for the
recessions that started in 1969 and 2001, business investment peaked either at
the beginning of the recession or after that.
Even though standard one-sector real business cycle models have been
successful in accounting for the cyclical pattern of aggregate investment, the
extensions to the one-sector model have been less successful. To some extent,
this poses a challenge to the use of transitory shocks to aggregate productivity as the main source of aggregate business fluctuations. The next sections
present a summary of the lessons that can be extracted from past work that
has studied the cyclical behavior of household and business investment.

2. THE BASELINE NEOCLASSICAL GROWTH MODEL
Kydland and Prescott (1982) and Long and Plosser (1983) are the first studies to
quantify the explanatory power of equilibrium theories to account for business
cycle fluctuations. They consider different extensions of the stochastic growth
model studied in Brock and Mirman (1972) and compare statistical properties
of the data generated by their models with actual statistics. In Kydland and
Prescott (1982) and Long and Plosser (1983), the only source of fluctuations
in the economy is a shock to the aggregate factor productivity. Their work laid
down the foundations of a vast literature that shows how equilibrium theories
could provide a plausible explanation of aggregate fluctuations of economic
activity. The rest of this section is devoted to elaborating on the structure of
the one-sector real business cycle model and the different multi-sector models
that have been used so far to explain the cyclical patterns of business and
household investment.
As a simple case study, consider a closed economy with no government
spending and complete markets. There is one good in the economy that can be
either consumed or invested. Fluctuations in economic activity are driven by
persistent shocks to total factor productivity. In the simple model, there is no
disutility of labor, implying that the supply of labor is inelastic. Under a wide
range of values for the parameters, a positive shock to productivity generates
higher output, consumption, and investment in the shock period, which can
account for the positive co-movement of these three economic aggregates. In
this economy, there are two effects through which a positive productivity shock
may induce a higher investment level in the shock period. First, agents become
richer and may want to smooth out the current windfall of output. The only
aggregate mechanism available to transfer current resources to future periods
is capital accumulation. Second, if the shock is persistent enough, positive current productivity shocks predict a distribution biased toward positive
shocks in the following period, which raises the marginal benefit of investing
rather than consuming.4 Additionally, an agent's ability to transfer resources
across time by investing or disinvesting enables the model to account for the
volatilities of consumption and investment relative to output.

3 The leading behavior of household investment is shared by its two components: household
purchases of durable goods and residential investment.

Figure 1 Real Investment and GDP Before and After Each of the Last
Seven Recessions
[Figure not reproduced. Panels: 1960, 1969, 1973, 1980, 1981, 1990, 2001, 2007. Each panel plots indexes of GDP, Household Investment, and Business Investment (vertical axis from 75 to 125) against the Number of Quarters Before/After Recession Starts (horizontal axis from −8 to 8).]
Notes: The indexes take a value of 100 in the first quarter of each recession.
What happens when investment is disaggregated between household and
business investment? The answer is that the baseline model has a hard time
accounting for the cyclical pattern of these two components.

3. MODELS WITH HOME PRODUCTION

Greenwood and Hercowitz (1991) constitutes the first attempt to study the
cyclical behavior of these two components of investment in a real business
cycle model. They consider a two-sector model in which the representative
household maximizes its expected lifetime utility, as given by
$$E_0 \sum_{t=0}^{\infty} \beta^t\, u(c_{Mt}, c_{Ht}), \tag{1}$$
where $c_{Mt}$ denotes the consumption of market goods, and $c_{Ht}$ denotes the
consumption of home-produced goods at time period t. The consumption of
market goods is identical to the purchases of consumption goods, c, namely
$$c_{Mt} = c_t, \tag{2}$$
while home goods, $c_{Ht}$, are assumed to be a function of the stock of household
capital, $k_{Ht}$, and the number of hours allocated to produce home goods, $h_{Ht}$,
$$c_{Ht} = H(k_{Ht}, z_{Ht} h_{Ht}). \tag{3}$$
Market goods are produced using a technology that depends on the capital
stock invested in the market sector, $k_{Mt}$, and the number of hours supplied to
the market sector, $h_{Mt}$,
$$y_t = F(k_{Mt}, z_{Mt} h_{Mt}). \tag{4}$$

4 Note that there may exist cases where a positive shock induces a decrease in investment
in the shock period. This would occur when agents predict that they are going to be sufficiently
rich in the future as a consequence of the current shock and thus want to transfer some of those
future resources to the current period.

In choosing market consumption, $c_{Mt}$, and savings, the household faces
the following budget constraint in period t:
$$c_t + x_{Mt} + x_{Ht} = (1 - \tau_k)\, r_t k_{Mt} + (1 - \tau_l)\, w_t h_{Mt} + \tau, \tag{5}$$
where $w_t$ is the wage rate in the market sector, $r_t$ is the rental price of capital
in the market sector, $x_{Mt}$ and $x_{Ht}$ are the investment in market and household
capital, respectively, $\tau_k$ is the tax rate on capital income, $\tau_l$ is the tax rate on
labor income, and τ is a lump-sum transfer.
The variables $z_{Mt}$ and $z_{Ht}$ represent labor-augmenting technological
progress. In this study, an important assumption is that productivity shocks in
the market and household sectors are perfectly correlated, i.e., $z_{Mt} = z_{Ht}$.
The endowment of hours in each period is normalized to 1 and it is assumed
that all hours that are not used to produce market goods are used to produce
home goods. That is,
$$h_{Mt} + h_{Ht} = 1. \tag{6}$$
Finally, the capital stocks in the market and household sectors depreciate
at the constant rates $\delta_M$ and $\delta_H$, respectively. This means that the capital stock
in sector i follows the law of motion
$$k_{i,t+1} = (1 - \delta_i)\, k_{it} + x_{it}, \quad \text{with } i \in \{M, H\}. \tag{7}$$

Similar investment motives to the ones described in the case of the one-sector model are also present in this environment. The difference is that now
there is a tradeoff between the accumulation of business capital and that of
household capital. In the baseline calibration of Greenwood and Hercowitz
(1991), households respond to a positive productivity shock by increasing
business investment and decreasing household investment in the shock period. This behavior explains why the simulated data sets obtained using their
baseline calibration feature a strong negative co-movement between business
and household investment.
The mechanism of this model is summarized by the following passage
from Greenwood and Hercowitz (1991; 1,205):
. . . The negative co-movement of the two investments, which stands in
contrast with the positive one displayed by the actual data has to do with
the basic asymmetry between the two types of capital. Business capital
can be used to produce household capital, but not the other way around.
When an innovation to technology occurs, say a positive one, the optimal
levels for both capital stocks increase. Given the asymmetry in the nature
of the two capital goods, the tendency for the benchmark model is to
build business capital first, and only then household capital. . .


Greenwood and Hercowitz (1991) show that a higher degree of complementarity between labor and capital in home technology helps in accounting
for the co-movement between household and business capital accumulation.
The Euler equation for household capital accumulation is given by
$$u_1(c_M, c_H) = \beta \int u_1(c_M', c_H') \left[ \frac{u_2(c_M', c_H')}{u_1(c_M', c_H')} H_1(k_H', z' h_H') + 1 - \delta_H \right] dG(z' \mid z), \tag{8}$$
where $x'$ denotes the next-period value of variable $x$. The marginal value
of household capital accumulation depends on the future shadow price of
household consumption, $u_2(c_M', c_H') / u_1(c_M', c_H')$, and on the future marginal productivity of
household capital, $H_1(k_H', z' h_H')$.
The Euler equation takes a simple form for the parameterization used in
Greenwood and Hercowitz (1991). They assume that the production function
for the home good, H (kH , zhH ), is of the following form:

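$$H(k_H, z h_H) = \begin{cases} k_H^{\eta} (z h_H)^{1-\eta} & \text{if } \zeta = 0, \\ \left[ \eta k_H^{\zeta} + (1 - \eta)(z h_H)^{\zeta} \right]^{1/\zeta} & \text{if } \zeta \neq 0. \end{cases} \tag{9}$$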

The value of ζ determines the elasticity of substitution between household capital and labor in the production of home goods. Both inputs are complements
when ζ < 0, and are substitutes when 0 < ζ < 1.
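
As a rough illustration of equation (9), the Python sketch below evaluates the home production function and treats $\zeta = 0$ as the Cobb-Douglas case. The function names and the numerical inputs are assumptions made for this example and are not a calibration from Greenwood and Hercowitz (1991). For a CES aggregator of this form, the elasticity of substitution is $1/(1 - \zeta)$.

```python
def home_output(k_h, z, h_h, eta, zeta):
    """Home production H(k_H, z*h_H) as in equation (9).

    zeta == 0 is the Cobb-Douglas case; otherwise the CES form applies.
    """
    eff_hours = z * h_h  # labor input measured in efficiency units
    if zeta == 0:
        return k_h ** eta * eff_hours ** (1.0 - eta)
    inner = eta * k_h ** zeta + (1.0 - eta) * eff_hours ** zeta
    return inner ** (1.0 / zeta)


def elasticity_of_substitution(zeta):
    """Elasticity of substitution implied by a CES aggregator with exponent zeta."""
    return 1.0 / (1.0 - zeta)


# Illustrative (assumed) inputs, not values taken from the paper.
k_h, z, h_h, eta = 2.0, 1.1, 0.3, 0.35
cobb_douglas = home_output(k_h, z, h_h, eta, 0.0)
near_limit = home_output(k_h, z, h_h, eta, 1e-6)  # CES with zeta close to zero
```

In this sketch the CES value with $\zeta$ close to zero is numerically close to the Cobb-Douglas value, which is why the two branches of equation (9) fit together. A value of $\zeta = -1$ corresponds to an elasticity of substitution of one half.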
Greenwood and Hercowitz (1991) assume that the market technology
is specified by a standard Cobb-Douglas production function with a labor-augmenting productivity shock. Firms seek to maximize profits given the
rental rates for capital and labor.
The instantaneous utility function has the following form:
$$u(c_M, c_H) = \frac{C(c_M, c_H)^{1-\gamma} - 1}{1 - \gamma}. \tag{10}$$
The consumption aggregator, $C(c_M, c_H)$, is given by
$$C(c_M, c_H) = c_M^{\theta} c_H^{1-\theta}. \tag{11}$$

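Under this parameterization, the Euler equation simplifies to
$$\frac{u_2(c_M', c_H')}{u_1(c_M', c_H')} H_1(k_H', z' h_H') = \frac{1 - \theta}{\theta} \, c_M' \, \frac{\eta k_H'^{\,\zeta - 1}}{\eta k_H'^{\,\zeta} + (1 - \eta)(z' h_H')^{\zeta}}. \tag{12}$$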
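
The step from the functional forms in equations (9) through (11) to equation (12) can be sketched as follows. This is a reconstruction rather than the authors' own derivation: primes are dropped for readability, the case $\zeta \neq 0$ is shown, and it is assumed that home consumption equals home output, $c_H = H(k_H, z h_H)$.

```latex
\begin{align*}
\frac{u_2(c_M, c_H)}{u_1(c_M, c_H)}
  &= \frac{C_2(c_M, c_H)}{C_1(c_M, c_H)}
   = \frac{1-\theta}{\theta}\,\frac{c_M}{c_H}, \\
H_1(k_H, z h_H)
  &= \eta k_H^{\zeta-1}
     \left[\eta k_H^{\zeta} + (1-\eta)(z h_H)^{\zeta}\right]^{\frac{1}{\zeta}-1}, \\
c_H
  &= \left[\eta k_H^{\zeta} + (1-\eta)(z h_H)^{\zeta}\right]^{\frac{1}{\zeta}}, \\
\frac{u_2}{u_1}\, H_1
  &= \frac{1-\theta}{\theta}\, c_M\,
     \frac{\eta k_H^{\zeta-1}}{\eta k_H^{\zeta} + (1-\eta)(z h_H)^{\zeta}}.
\end{align*}
```

Setting $\zeta = 0$ in the last line makes the denominator equal to one, so the expression reduces to $\frac{1-\theta}{\theta}\, c_M\, \eta / k_H$, which does not depend on $z$.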

In Greenwood and Hercowitz’s (1991) baseline calibration $\zeta = 0$, so the
direct effects of the future productivity shock, $z'$, on the future shadow price
of household consumption and on the future marginal productivity of household
capital cancel each other out. However, when capital and labor are complements in the production of home goods ($\zeta < 0$), higher future productivity


shocks have a direct positive effect on the incentives to accumulate household
capital. Thus, when ζ < 0, a positive productivity shock in the current period
increases the probability of observing higher shocks in the next period and
generates a stronger desire to accumulate household capital in the period of
the shock. The intuition is that when the ability to substitute capital for labor
decreases, it becomes more costly for households to compensate a decrease
in household capital with an increase in the number of hours devoted to the
production of home goods. Greenwood and Hercowitz (1991) show that a
value of ζ = −1 suffices to generate a positive reaction of household investment to productivity shocks and hence, a positive co-movement between
household and business investment. In addition, a value of ζ = −1 also helps
to account for the larger volatility of household investment relative to business
investment.
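
The role of $\zeta$ described above can be checked numerically. The following sketch evaluates the right-hand side of equation (12) for two values of the productivity shock while holding $c_M$, $k_H$, and $h_H$ fixed, so it isolates only the direct effect of the shock. The function name and all numerical inputs are assumptions chosen for illustration.

```python
def household_capital_return(c_m, k_h, z, h_h, eta, zeta, theta):
    """Right-hand side of equation (12): shadow price times marginal product."""
    denom = eta * k_h ** zeta + (1.0 - eta) * (z * h_h) ** zeta
    return (1.0 - theta) / theta * c_m * eta * k_h ** (zeta - 1.0) / denom


# Assumed illustrative values; only the sign of the response to z matters here.
args = dict(c_m=1.0, k_h=2.0, h_h=0.3, eta=0.35, theta=0.4)

# Unit elasticity (zeta = 0): the direct effect of z cancels out.
low_cd = household_capital_return(z=1.0, zeta=0.0, **args)
high_cd = household_capital_return(z=1.2, zeta=0.0, **args)

# Complements (zeta = -1): a higher z raises the return directly.
low_ces = household_capital_return(z=1.0, zeta=-1.0, **args)
high_ces = household_capital_return(z=1.2, zeta=-1.0, **args)
```

Under these assumptions the two Cobb-Douglas values coincide, whereas with $\zeta = -1$ the value at the higher shock exceeds the value at the lower shock. This matches the argument that complementarity strengthens the incentive to accumulate household capital when higher future productivity is expected.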

Modifications of the Baseline Model with Home Production
Differential capital adjustment costs in the market and household sector

Gomme, Kydland, and Rupert (2001) point out that the alternative parameterization proposed by Greenwood and Hercowitz (1991) to account for the
positive co-movement between the two investment components may be inconsistent with the presence of balanced growth.5 Gomme, Kydland, and Rupert
(2001) extend the setup studied in Greenwood and Hercowitz (1991) by introducing a time-to-build technology for the production of market goods as well
as utility from leisure.
In Gomme, Kydland, and Rupert (2001), the representative household's
lifetime utility is represented by
$$E_0 \sum_{t=0}^{\infty} u(c_{Mt}, c_{Ht}, h_{Lt}), \tag{13}$$

where hLt denotes the number of hours devoted to leisure activities. The
inputs required to produce market and home goods are the same as in equations
(2)–(4).
In Gomme, Kydland, and Rupert (2001), the household allocates its endowment of hours over three possible uses. This means that equation (6) is
replaced by
$$h_{Mt} + h_{Ht} + h_{Lt} = 1. \tag{14}$$

5 If the model were extended to account for the decline in the price of durable goods, it would not be able to generate a constant fraction of expenditures in durable goods as observed empirically.


The assumption of time-to-build for market capital implies that an agent
decides today on the increase in the stock of business capital that will take place
four periods ahead (a period refers to a quarter). In addition, the investment projects decided today entail a commitment of investment resources
over four periods until the projects become active. More precisely, when
households decide at date $t$ to increase their capital stock in the market sector
at date $t + 4$ by one unit, they need to spend 0.25 units per period from date
$t$ until $t + 3$. This means that the law of motion for capital in the market sector
satisfies the following equation:
$$k_{Mt+1} = (1 - \delta_M) k_{Mt} + p_{Mt-3}, \tag{15}$$

where $p_{Mt}$ denotes the number of projects in the market sector started in
period $t$. Unlike in Greenwood and Hercowitz (1991), the investment in
market capital in a given period depends on the number of projects started
in that period as well as on the number of projects started over the last three
periods, namely
$$x_{Mt} = \frac{1}{4} \left( p_{Mt} + p_{Mt-1} + p_{Mt-2} + p_{Mt-3} \right). \tag{16}$$
However, Gomme, Kydland, and Rupert (2001) assume that it takes only
one period to complete household investment projects. This means that equation (7) still applies for the stock of capital in the household sector.
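
The timing implied by time-to-build can be traced with a short simulation. The sketch below is a minimal illustration and not the model's solution: the function name, the zero initial capital, the zero depreciation rate, and the assumption that no projects were started before the first period are all choices made for this example.

```python
def simulate_time_to_build(project_starts, k0, delta, build_periods=4):
    """Track market capital and investment when projects take several periods.

    project_starts[t] is the number of projects started in period t.
    Projects started before the sample are assumed to be zero, which is an
    assumption made only to keep the example self-contained.
    """
    share = 1.0 / build_periods  # resources committed per project per period
    lag = build_periods - 1      # projects started `lag` periods ago are completed
    capital = [k0]
    investment = []
    for t in range(len(project_starts)):
        # Investment: spending on every project still under construction.
        active = [project_starts[s] for s in range(max(0, t - lag), t + 1)]
        investment.append(share * sum(active))
        # Capital: only projects started `lag` periods ago add to next period's stock.
        completed = project_starts[t - lag] if t >= lag else 0.0
        capital.append((1.0 - delta) * capital[-1] + completed)
    return capital, investment


# One project started in period 0, no depreciation, for clarity.
capital, investment = simulate_time_to_build(
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0], k0=0.0, delta=0.0
)
```

With these inputs, investment is 0.25 in each of periods 0 through 3 and zero afterward, while capital stays at zero until period 4, when it rises by one unit. This is the pattern described in the text: resources are committed for four periods before the new capital becomes productive.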
Finally, Gomme, Kydland, and Rupert (2001) relax the strong assumption
of perfect correlation between productivity shocks in the household and market
sectors.
The main improvement over Greenwood and Hercowitz (1991) is that
the model with time-to-build technology manages to replicate the positive
co-movement between household and business investment and generates a
stronger lag in the reaction of business investment to output. That result
is obtained assuming a unitary elasticity of substitution between capital and
labor in the home technology ($\zeta = 0$). To aid intuition, Figure
2 shows the impulse responses to a one-time shock to the productivity level
in the market sector ($z_M$).
Figure 2 shows that at the time of the shock, agents respond by starting
more investment projects. This accounts for the increase in market investment
at date 1 and at the dates that follow the shock. There are fewer investment
projects started after date 1, which accounts for the decline in market investment observed after date 5. Even though the productivity level in the household
sector remains unchanged throughout the period, the positive wealth effect of the higher productivity in the market sector induces households to
consume more homemade goods and thus to invest more in home capital. The
upward pressure on wages triggered by the spike in market productivity induces


Figure 2 Impulse Responses to a One-Time Shock to Current Market Productivity

[Figure omitted: plotted series are $x_M$, $x_H$, $y$, and $h_M$; the horizontal axis runs from 0 to 12.]

Notes: $x_M$ is business investment, $x_H$ is household investment, $y$ is market output, and
$h_M$ is market hours. The deviations are expressed in percentage deviations from the
steady-state values for each variable.

households to work more hours in the market sector. As a result of the higher
supply of labor hours and the increase in factor productivity, market output
increases upon the shock. The initial increase in output and labor hours tends
to fade away until date 5. At that point, the investment projects started at date
1 become active and market output and hours worked in the market sector
jump up again.
The results are symmetric in the case of a negative shock to market productivity. The simultaneous rise (fall) in household and business investment
that tends to follow a rise (fall) in market productivity plays a key role in
explaining the co-movement of both investment components.
As explained in Gomme, Kydland, and Rupert (2001; 1,127):


The effect of time to build is to mute the impact effect of the shock on
market investment by drawing out the response over the four quarters it
takes to build market capital. . . . As a result, home investment need not
take such a big hit in the initial period of the shock.

Chang (2000) explores a slightly different setup and provides an alternative
mechanism that can explain the co-movement between market and household
investment. The household’s objective is the same as the one specified in
equation (1), with the difference that both consumption goods are produced
within the household. That is, Chang (2000) replaces equation (2) with
$$c_{Mt} = M(c_t, z_{Ct} h_{Ct}), \tag{17}$$
where $h_{Ct}$ denotes the number of hours allocated to the production of home
goods that require nondurable inputs, and $z_{Ct}$ is a labor-augmenting
productivity shock. The production of home goods that require durable inputs
satisfies equation (3).6 As in Greenwood and Hercowitz (1991), there is only
one market sector in the economy. The market good can be used as a nondurable good, a durable good, or capital to be rented to firms in the market
sector. These uses are nonreversible.
The household’s allocation of time must satisfy
$$h_{Ct} + h_{Mt} + h_{Ht} = 1. \tag{18}$$

Chang (2000) assumes that the accumulation of durable goods and market
capital is subject to an adjustment cost, $\phi$, that is
$$k_{it+1} = (1 - \delta_i) k_{it} + \phi\!\left( \frac{x_{it}}{k_{it}} \right) k_{it} \quad \text{for } i \in \{H, M\}. \tag{19}$$

The only source of uncertainty consists of a productivity shock in the market
sector (zH and zC display a constant and deterministic growth rate).
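
To make the adjustment-cost law of motion in equation (19) concrete, the sketch below uses a hypothetical concave function for $\phi$. The quadratic form, the parameter `kappa`, and all numerical values are assumptions introduced for this example; the text does not report the functional form used by Chang (2000).

```python
def phi(inv_rate, delta, kappa):
    """Hypothetical concave adjustment-cost function with phi(delta) = delta."""
    return inv_rate - 0.5 * kappa * (inv_rate - delta) ** 2


def next_capital(k, x, delta, kappa):
    """Equation (19): k' = (1 - delta) * k + phi(x / k) * k."""
    return (1.0 - delta) * k + phi(x / k, delta, kappa) * k


# Assumed illustrative values.
k, delta, kappa = 10.0, 0.025, 5.0
steady = next_capital(k, delta * k, delta, kappa)        # replacement investment only
boosted = next_capital(k, 2.0 * delta * k, delta, kappa)  # investment doubled
no_cost = (1.0 - delta) * k + 2.0 * delta * k             # same investment, no cost
```

With this assumed form, replacement investment leaves the capital stock unchanged, while doubling investment raises next period's capital by less than it would without adjustment costs. That wedge is what discourages concentrating investment in a single period.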
Chang (2000) shows that when the household technology features a higher
degree of substitutability between durable goods and labor than between nondurable goods and labor, a positive productivity shock in the market sector
generates a simultaneous increase in investment in market capital and in the
household stock of durable goods. The intuition is that a positive productivity
shock induces households to increase their consumption while it increases their
opportunity cost of time allocated to the production of consumption goods,
given that the market wage increases. When the production of cD displays a
6 Note that in Chang (2000) there are two types of household capital. One is composed of nondurable goods and fully depreciates at the end of each period. The other is composed of durable goods and is subject to partial depreciation.


sufficiently higher degree of substitution compared to the production of cN ,
households find it optimal to increase their consumption of cD by using more
capital (durable goods) and less labor. This accounts for the increase in the purchases of durable goods upon a positive productivity shock. In addition, Chang
(2000) shows that it is the joint presence of a higher elasticity of substitution
in the production of cD and the adjustment cost in the accumulation of durable
goods and business capital that helps in generating a positive co-movement of
purchases in durable goods and business investment. Once one of these two
assumptions is relaxed, the model generates a negative co-movement between
the accumulation of durable goods and business investment.
In contrast to Greenwood and Hercowitz (1991), the environment studied
by Chang (2000) suggests that the positive co-movement between the two
investment components can be explained by a high degree of substitutability
in the production of the home good that requires durable goods. In addition,
Chang (2000) estimates the elasticity of substitution between goods and time
in different consumption activities and finds that durable goods seem to be a
good substitute for time, a finding that is consistent with previous empirical
studies.
Home production as an input to market production

Einarsson and Marquis (1997) are able to explain the co-movement of household and business investment in a setup in which households supply labor
hours to the market sector and the non-market sector to accumulate human
capital. In Einarsson and Marquis (1997), the household faces the same objective as in equation (1) and it has to satisfy the same restrictions defined in
equations (2)–(5) with two differences. First, the term $h_{it}$ in equations (2)–(5)
needs to be replaced by $E_t h_{it}$ for $i \in \{H, M\}$. The variable $E_t$ denotes the
stock of human capital in period $t$. Second, there are no productivity shocks
in the production of home goods.
Einarsson and Marquis (1997) assume that households can increase their
stock of human capital using the following technology:
$$E_{t+1} = G(E_t, h_{Et}), \tag{20}$$

where hEt is the amount of time allocated in period t to learning activities.
That is, human capital has a few nonexclusive uses: it serves as an input in
the production of human capital and it affects the quality of hours supplied to
the market sector and allocated to the production of home goods. Thus,
$$h_{Mt} + h_{Ht} + h_{Et} = 1. \tag{21}$$

Finally, the law of motion for market and household capital satisfies
equation (7).


In Einarsson and Marquis’s (1997) baseline calibration, a positive productivity shock in the market sector induces households to work more hours
in the market and household sectors and decreases the number of hours devoted to accumulating human capital. In turn, the increase in hours worked
in the household sector increases the marginal return on capital in that sector,
which introduces an incentive to invest in household capital upon a positive
productivity shock. Unlike Greenwood and Hercowitz (1991), Einarsson and
Marquis (1997) do not rely on a high correlation of productivity shocks in the
market and non-market sectors. In fact, they assume that only the production
of market goods is hit with productivity shocks. Nonetheless, as in Greenwood and Hercowitz (1991), they need to assume that capital and labor in the
household sector are complements.
Even though the articles summarized in this section provide different tentative explanations for the positive co-movement of business and household
investment, and the relative volatility of these two investment components,
they cannot explain the leading behavior of household investment and the
lagging behavior of business investment.
Fisher (2007) succeeds in this respect after introducing a direct role for
household capital as an input in market production. Fisher (2007) extends
Gomme, Kydland, and Rupert (2001) by introducing an additional use for
household capital: Households can affect total effective hours supplied to
business firms ($\tilde{h}_M$). The technology for determining $\tilde{h}_M$ is specified by
$$\tilde{h}_{Mt} = L(k_{HMt}, z_{Ht} h_{Mt}) = k_{HMt}^{\mu} (z_{Ht} h_{Mt})^{1-\mu}, \tag{22}$$

where $k_{HMt}$ and $h_{Mt}$ denote the household capital and hours allocated to improve the quality of labor supply to business firms. As in Gomme, Kydland,
and Rupert (2001), households produce a home good using household capital
and labor:
$$c_{Ht} = H(k_{HHt}, z_{Ht} h_{Ht}), \tag{23}$$

where $k_{HHt}$ and $h_{Ht}$ denote the household capital and hours allocated to produce the home good. Note that unlike in Einarsson and Marquis (1997),
households cannot affect the quality of the hours allocated to the production
of home goods. The uses of household capital are constrained by the total
stock of household capital in the period, namely
$$k_{HMt} + k_{HHt} = k_{Ht}. \tag{24}$$

In this setup, household capital is not only useful to produce home consumption goods, but it indirectly enhances the ability to produce market goods.
In that context, Fisher (2007) shows that the model can replicate the leading
behavior of household investment over business investment. When the share


of capital in the production of human capital (μ) is below 0.25 (it is 0.19 in
Fisher’s calibration), the optimal response of households to a positive productivity shock in the market sector is first to increase their investment in
household capital. This allows households to increase their effective labor
supply over periods following the shock, where higher productivity shocks
would tend to push up wages. In turn, the higher labor supply will augment
the production of market goods in future periods, which also helps to account for the leading behavior of household investment. The “strong” initial
increase in household investment takes place at the expense of market investment, which displays a modest increase in the shock period. The household
raises market investment in the periods following the positive shock.
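
The effective-hours technology in equation (22) can be illustrated with a short calculation. In the sketch below only the value of $\mu$ comes from the text (0.19 in Fisher's calibration); the function name and the other inputs are assumptions made for the example.

```python
def effective_hours(k_hm, z_h, h_m, mu):
    """Equation (22): effective market hours, Cobb-Douglas in capital and hours."""
    return k_hm ** mu * (z_h * h_m) ** (1.0 - mu)


mu = 0.19  # value the text reports for Fisher's calibration

# Assumed illustrative inputs: raise household capital devoted to market work by 10%.
base = effective_hours(k_hm=1.0, z_h=1.0, h_m=0.3, mu=mu)
more_capital = effective_hours(k_hm=1.1, z_h=1.0, h_m=0.3, mu=mu)
gain = more_capital / base - 1.0  # proportional rise in effective hours
```

Under these assumptions, a 10 percent rise in the household capital allocated to market work raises effective hours by a little under 2 percent with raw hours unchanged. This is the channel through which early household investment increases effective labor supply in later periods.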

Models with Multiple-Market Sectors
Finally, Davis and Heathcote (2005) and Hornstein and Praschnik (1997) study
the cyclical behavior of residential investment and/or purchases of durable
consumption goods without resorting to household production. These studies
consider a structure in which all goods are produced in the market and in which
households derive direct utility from the acquisition of durable goods. That is,
in both setups the household maximizes the same objective function defined
in equation (13), with the additional restrictions $c_{Mt} = c_t$ and $c_{Ht} = k_{Ht}$.
Unlike the articles surveyed above that study economies with only one
market sector, Davis and Heathcote (2005) and Hornstein and Praschnik
(1997) consider economies with multiple market sectors.
Davis and Heathcote (2005) consider a model with three intermediate
inputs: construction (b), manufactures (m), and services (s) that are produced
using labor and capital. Formally, let $y_{it}$ denote the production of intermediate
good $i$:
$$y_{it} = F_i(k_{it}, z_{it} h_{it}), \quad \text{with } i \in \{b, m, s\}, \tag{25}$$

where $k_{it}$ and $h_{it}$ denote the capital and labor hours used in the production of
intermediate input $i$. These three goods are the only inputs in the production
of two final goods: a consumption/capital good ($M$) and a residential good
($R$). Thus,
$$y_{jt} = F_j(b_{jt}, m_{jt}, s_{jt}), \quad \text{with } j \in \{M, R\}, \tag{26}$$

where $y_{jt}$ denotes the production of final good $j$, and $b_{jt}$, $m_{jt}$, and $s_{jt}$ denote
the quantities of each of the three intermediate goods in the production of
$j$. The residential good must be combined with land ($x_{Lt}$) to produce houses
($x_{Ht}$), namely
$$x_{Ht} = F_H(x_{Lt}, x_{Rt}), \tag{27}$$


where the stock of land is constant and equal to 1, i.e., $x_{Lt} \leq 1$. In their
setup, houses are the only durable consumption good. In Davis and Heathcote
(2005) there are three alternative uses for market capital and four alternative
uses for the household’s endowment of hours, namely
$$k_{bt} + k_{mt} + k_{st} = k_{Mt}, \text{ and} \tag{28}$$
$$h_{bt} + h_{mt} + h_{st} + h_{Lt} = 1. \tag{29}$$

The law of motion for market capital, $k_M$, is the same as in equation (7),
while the law of motion for the stock of houses is given by
$$k_{Ht+1} = (1 - \delta_H)^{1-\phi} k_{Ht} + x_{Ht}. \tag{30}$$

Finally, the resource constraint for final goods is given by
$$c_t + x_{Mt} + g_t = y_{Mt}, \tag{31}$$
where government expenditures, $g_t$, are financed by labor and capital
income taxes.
Davis and Heathcote (2005) show that the model can account for the co-movement between residential and nonresidential investment and the higher
volatility of residential compared to nonresidential investment. The environment studied in Davis and Heathcote (2005) is quite different from the environment considered in previous studies. Davis and Heathcote (2005) carry out
several experiments to identify the role of different features of the model.
On page 753 they state that
First, although our Solow residual estimates suggest only moderate comovement in productivity shocks across intermediate goods sectors, comovement in effective productivity across final-goods sectors is amplified
by the fact that both final-goods sectors use all three intermediate inputs, albeit in different proportions. Second, the production of new
housing requires suitable new land, which is relatively expensive during construction booms. We find that land acts like an adjustment cost
for residential investment, reducing residential investment volatility, and
increasing co-movement. Third, construction and hence residential investment are relatively labor intensive. This increases the volatility of
residential investment because following an increase in productivity less
additional capital (which takes time to accumulate) is required to efficiently increase the scale of production in the construction sector. Fourth,
the depreciation rate for housing is much slower than that for business
capital. This increases the relative volatility of residential investment and
increases co-movement, since it increases the incentive to concentrate
production of new houses in periods of high productivity.

Hornstein and Praschnik (1997) propose a multi-sector economy in which
the use of intermediate inputs helps to explain the co-movement of sectoral


employment and output. Their article also offers an explanation for the leading pattern of household investment. They consider a setup with two market
sectors: one produces a durable good and the other produces a nondurable
good. The durable good (MX) can be accumulated either as business capital or household capital. The nondurable good (MC) can be used either in
consumption or as an input in the production of durable goods. Thus,
$$x_{MXt} + x_{MCt} + x_{Ht} = y_{MXt} = F_{MX}(k_{MXt}, z_{MXt} h_{MXt}, m_t) \text{ and} \tag{32}$$
$$c_{Mt} + m_t = F_{MC}(k_{MCt}, z_{MCt} h_{MCt}), \tag{33}$$
where $x_{it}$ denotes the investment in the stock of capital, $k_{it}$, $y_{MXt}$ denotes the
production of durable goods, $k_{MXt}$ ($k_{MCt}$) and $h_{MXt}$ ($h_{MCt}$) denote the capital and
labor hours used in the production of durable (nondurable) goods, $m_t$ denotes
the amount of nondurable goods used as input in the production of durable
goods, and $z_{MXt}$ ($z_{MCt}$) denotes a labor-augmenting productivity shock in the
durable (nondurable) sector.
The resource constraint for labor hours reads
$$h_{MXt} + h_{MCt} + h_{Lt} = 1, \tag{34}$$

while the law of motion for kit is the same as in equation (7), for i ∈
{MX, MC, H }. Note that in Hornstein and Praschnik (1997) investment decisions are nonreversible.
This setup not only explains the co-movement between household and
business investment but also explains the leading pattern of household investment. We quote Hornstein and Praschnik (1997, 589) below:
Following a productivity increase in either sector, capital becomes more
productive and in order to increase the production of capital goods
investment in the durable goods sector increases whereas investment in the
nondurable goods sector is postponed for one period. The positive wealth
effect of a productivity increase raises household consumption of capital
services, and household sector investment increases contemporaneously
with the productivity shock. Since investment in the nondurable goods
sector represents the bulk of business investment, household investment
leads business investment.

4. CONCLUSION

A substantial fraction of societal consumption is not purchased in markets but
rather is produced and consumed within households. This article describes
the main characteristics of the cyclical behavior of household and business investment in the United States, and offers a summary of studies


that have tried to explain the dynamics of these two investment components.
Even though we have reached a better understanding of what economic relationships may help in explaining the behavior of these two investment components, more research is needed. For example, changes in the relative prices
of houses could be playing a significant role as a propagation mechanism or
as a coordination device across households. However, most existing studies
abstract from changes in the relative price of houses, and the ones that allow
for that channel generate house price movements that are not aligned with the
data.

REFERENCES
Benhabib, Jess, Randall Wright, and Richard Rogerson. 1991. “Homework
in Macroeconomics: Household Production and Aggregate
Fluctuations.” Journal of Political Economy 99 (December): 1,166–87.
Brock, William A., and Leonard J. Mirman. 1972. “Optimal Economic
Growth and Uncertainty: The Discounted Case.” Journal of Economic
Theory 4 (June): 479–513.
Chang, Yongsung. 2000. “Comovement, Excess Volatility, and Home
Production.” Journal of Monetary Economics 46 (October): 385–96.
Davis, Morris A., and Jonathan Heathcote. 2005. “Housing and the Business
Cycle.” International Economic Review 46 (August): 751–84.
Einarsson, Tor, and Milton H. Marquis. 1997. “Home Production with
Endogenous Growth.” Journal of Monetary Economics 39 (August):
551–69.
Fisher, Jonas D. M. 2007. “Why Does Household Investment Lead Business
Investment over the Business Cycle?” Journal of Political Economy 115:
141–68.
Gomme, Paul, Finn E. Kydland, and Peter Rupert. 2001. “Home Production
Meets Time to Build.” Journal of Political Economy 109 (October):
1,115–31.
Greenwood, Jeremy, and Zvi Hercowitz. 1991. “The Allocation of Capital
and Time over the Business Cycle.” Journal of Political Economy 99
(December): 1,188–214.
Hornstein, Andreas, and Jack Praschnik. 1997. “Intermediate Inputs and
Sectoral Comovement in the Business Cycle.” Journal of Monetary
Economics 40 (December): 573–95.


Kydland, Finn E., and Edward C. Prescott. 1982. “Time to Build and
Aggregate Fluctuations.” Econometrica 50 (November): 1,345–70.
Long, John B., Jr., and Charles I. Plosser. 1983. “Real Business Cycles.”
Journal of Political Economy 91 (February): 39–69.
Lucas, Robert E. 1977. “Understanding Business Cycles.”
Carnegie-Rochester Conference Series on Public Policy 5 (January):
7–29.

Economic Quarterly—Volume 95, Number 3—Summer 2009—Pages 289–313

Short-Term Headline-Core
Inflation Dynamics
Yash P. Mehra and Devin Reilly

Many analysts contend that the Federal Reserve under Chairmen Alan
Greenspan and Ben Bernanke has conducted monetary policy that
focuses on core rather than headline inflation. The measure of
core inflation used excludes food and energy prices.1 The main argument in
favor of using core inflation to implement monetary policy is that core inflation approximates the permanent or trend component of inflation much better
than does headline inflation, the latter being influenced more by transitory
movements in food and energy prices. The empirical evidence favorable to
the use of core inflation in policy was recently reviewed in Mishkin (2007b).
This empirical evidence consists of examining short-term dynamics between
headline and core inflation measures, indicating that, in samples that start after the early 1980s, headline inflation has reverted more strongly toward core
inflation than core inflation has moved toward headline inflation. However,
the research reviewed also shows that the evidence indicating the reversion
of headline inflation to core inflation is quite weak in samples that start in
the 1960s, suggesting that headline-core inflation dynamics may not be stable
over time.2
Thomas Lubik, Roy Webb, and Nadezhda Malysheva provided valuable comments on this
article. The views expressed in this article do not necessarily reflect those of the Federal
Reserve Bank of Richmond or the Federal Reserve System. E-mails: yash.mehra@rich.frb.org;
devin.reilly@rich.frb.org.
1 The evidence suggesting the Federal Reserve under Chairman Greenspan focused on a core
measure of inflation appears in Blinder and Reis (2005), Mehra and Minton (2007), and Mishkin
(2007b).
2 See also Clark (2001), Blinder and Reis (2005), Rich and Steindel (2005), and Kiley (2008).
These analysts use different empirical methodologies to come to the same conclusion that core
inflation is better than headline inflation in gauging the trend in inflation if we focus on the
samples that start in the early 1980s. For example, Kiley (2008) uses statistical models to extract
directly the trend component of inflation and argues that, in the 1970s and early 1980s, core as
well as headline inflation contains information about the trend; however, in the recent data, the
trend is best gauged by focusing on core inflation. The evidence in Clark (2001), Blinder and
Reis (2005), Rich and Steindel (2005), and Crone et al. (2008) is based on comparing the relative


In this article we re-examine the short-term dynamics between headline
and core measures of inflation over a longer sample period of 1959–2007. We
offer new evidence that headline-core inflation dynamics have indeed changed
during this sample period and that this change in dynamics may be due to a
change in the conduct of monetary policy in 1979.3 In particular, we examine
such dynamics over three sub-periods: 1959:1–1979:1, 1979:2–2001:2, and
1985:1–2007:2. We consider the sub-sample 1985:1–2007:2, as it spans a
period of relatively low and stable inflation. We consider both the consumer
price index (CPI) and the personal consumption expenditure (PCE) deflator.
The data used is biannual because the structural vector autoregression (VAR)
model employed uses the Livingston survey data on the public’s expectations
of headline CPI inflation, which is available twice a year. However, the basic
results on the change in short-term headline-core inflation dynamics are robust
to using quarterly data and to including additional determinants of inflation in
bivariate headline-core inflation regressions.
The empirical evidence presented here indicates headline and core measures of inflation are co-integrated, suggesting long-run co-movement. However, the ways these two variables adjust to each other in the short run and
generate co-movement have changed across these sub-periods. In the pre-1979 sample period, when a positive gap opens up with headline inflation
rising above core inflation, the gap is eliminated mainly as a result of headline
inflation not reverting and core inflation moving toward headline inflation.
This result suggests headline inflation is better than core inflation in assessing
the permanent component of inflation. In post-1979 sample periods, however,
the positive gap is eliminated as a result of headline inflation reverting more
strongly toward core inflation than core inflation moving toward headline inflation. This suggests core inflation would be better than headline inflation in
assessing the permanent component of inflation.
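
The two adjustment patterns described above can be illustrated with a stylized error-correction rule. The sketch below is not the model estimated in this article: the function name, the adjustment coefficients, and the starting inflation rates are hypothetical values chosen only to show how the gap closes in each case.

```python
def close_gap(headline, core, adj_headline, adj_core, periods):
    """Iterate a two-variable error-correction rule without shocks.

    Each period headline inflation moves by -adj_headline * gap and core
    inflation moves by +adj_core * gap, where gap = headline - core.
    """
    path = [(headline, core)]
    for _ in range(periods):
        gap = headline - core  # computed before either series is updated
        headline -= adj_headline * gap
        core += adj_core * gap
        path.append((headline, core))
    return path


# Pre-1979 pattern (assumed coefficients): headline does not revert, core catches up.
pre = close_gap(6.0, 4.0, adj_headline=0.0, adj_core=0.5, periods=20)

# Post-1979 pattern (assumed coefficients): headline reverts strongly, core barely moves.
post = close_gap(6.0, 4.0, adj_headline=0.5, adj_core=0.05, periods=20)
```

In the first case the two series converge near the initial headline rate, so headline inflation is the better guide to where inflation settles. In the second case they converge close to the initial core rate, which is the sense in which core inflation better captures the permanent component.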
Recent research suggests a monetary policy explanation of this change in
short-term headline-core inflation dynamics. We focus on a version of monetary policy explanation suggested by the recent work of Leduc, Sill, and Stark
(2007), which attributes the persistently high inflation of the 1970s to a weak
monetary policy response to surprise increases in the public’s expectations of
inflation. In particular, using a structural VAR that includes a direct survey
measure of expected (headline CPI) inflation, Leduc, Sill, and Stark show that
prior to 1979, the Federal Reserve accommodated exogenous movements in
expected inflation seen in the result that the short-term real interest rate did not
increase in response to such movements, which then led to persistent increases
forecast performance of core and headline measures; only in recent data is core inflation the better
predictor of future headline inflation.
3 The evidence indicating that inflation dynamics have changed since 1979 appears in
Bernanke (2007); Leduc, Sill, and Stark (2007); and Mishkin (2007a).

Y. P. Mehra and D. Reilly: Headline-Core Inflation Dynamics


in actual inflation. Such Federal Reserve behavior, however, is absent post-1979, leading to a decline in the persistence of inflation. We illustrate that
such a change in Federal Reserve behavior is also capable of generating the
change in headline-core inflation dynamics documented above.
In particular, when we consider a variant of the structural VAR model
that includes expected headline inflation, actual headline inflation, actual core
inflation, and a short-term nominal interest rate, we find that a change in the
interest rate response to exogenous movements in expected headline inflation
can explain the change in actual headline-core inflation dynamics. Thus, prior
to 1979, when the Federal Reserve accommodated exogenous movements in
expected headline inflation, a surprise increase in expected headline inflation
(say, due to higher energy and food prices) was not reversed, leading to persistent increases in actual headline inflation with core inflation moving toward
headline inflation. A surprise increase in expected headline inflation thus
generates co-movement between actual headline and core inflation measures.
Since such Federal Reserve accommodation of shocks to expected headline
inflation is absent post-1979, surprise increases in expected headline inflation are reversed, with actual headline inflation reverting to core inflation. In
the most recent sample period, 1985:1–2007:2, surprise increases in expected
headline inflation have no significant effect on core inflation, whereas surprise
increases in core inflation do lead to increases in headline inflation, generating
co-movement between headline and core CPI inflation measures. Since movements in food and energy prices are likely significant sources of movements
in the public’s expectations of headline inflation, this empirical work implies
that the change in headline-core inflation dynamics may be due to the Federal
Reserve having convinced the public it would no longer accommodate food
and energy inflation.
The rest of the paper is organized as follows. Section 1 presents the
main empirical results on the nature of the change in headline-core inflation dynamics across three sub-periods spanning the sample of 1959–2007.
Section 2 presents and discusses results from recent research including a structural VAR model, suggesting a monetary policy explanation of the change in
headline-core inflation dynamics documented in Section 1. Section 3 contains
concluding observations.

1. EMPIRICAL RESULTS ON HEADLINE-CORE INFLATION DYNAMICS

In this section we present econometric evidence consistent with a change in
short-term headline-core inflation dynamics. Figure 1, which charts headline
and core measures of PCE and CPI inflation, provides a look at the behavior
of these two measures of inflation during the sample period of 1959–2007.
Two observations are noteworthy. The first is that headline and core measures


Federal Reserve Bank of Richmond Economic Quarterly

Figure 1 PCE and CPI Inflation Rates Since 1959

[Figure: Panel A, PCE; Panel B, CPI. In each panel the upper graph plots headline and core inflation and the lower graph plots the core deviation (headline minus core), both in percent, 1959–2007.]

of CPI and PCE inflation co-move over the sample period. The lower graph in
each panel of Figure 1 charts the “core deviation,” measured as the gap between
headline and core inflation rates. This series is mean stationary, consistent with
co-movement. The second point to note is that, although Figure 1 shows that
headline and core measures of inflation co-move in the long run, it is not clear
from the figure how this co-movement arises. This co-movement may be a
result of one series adjusting to the other, or both series adjusting to each other.
We formally investigate such dynamics in this section.
One approach to headline-core inflation dynamics uses the co-integration error-correction methodology popularized by Granger (1986) and Engle and
Granger (1987), among others. Under this approach, one examines short-term
inflation dynamics under the premise that headline and core inflation series
may be nonstationary but co-integrated, indicating the presence of a long-run
relationship between these two measures. Using short-term error-correction
equations, one can then estimate how these two series adjust if headline
inflation moves above or below its long-run value implied by the co-integrating
regression. Another approach treats the inflation series as being mean stationary in levels, especially during shorter sample periods.4 One infers short-term
headline-core dynamics by examining the near-term responses of headline
and core inflation measures to a core deviation. We employ both of these
approaches.

Unit Roots, Co-integration, and Short-Term Dynamics
To investigate whether there exists a long-run co-integrating relationship between headline and core measures of inflation, we first examine the unit root
properties of these two series. Table 1 presents test results for determining
whether headline ($\pi^H_t$) and core ($\pi^C_t$) inflation measures have unit roots. The
test used is the t-statistic, implemented by estimating the augmented Dickey-Fuller (1979) regression of the form

$$X_t = m_0 + \rho X_{t-1} + \sum_{s=1}^{k} m_{1s}\,\Delta X_{t-s} + \varepsilon_t, \tag{1}$$

where $X_t$ is the pertinent variable, $\varepsilon_t$ is the disturbance term, and $k$ is the
number of lagged first differences needed to make $\varepsilon_t$ serially uncorrelated. If $\rho = 1$,
$X_t$ has a unit root. The null hypothesis $\rho = 1$ is tested using the t-statistic. As
can be seen, the t-statistic reported in Table 1 is small in absolute value for levels of the inflation
series but large for first differences of these series, suggesting that inflation is
nonstationary in levels but stationary in first differences over 1959:1–2007:2.
If headline and core inflation measures are nonstationary in levels, there may
exist a long-run co-integrating relationship between them. We use a two-step Engle-Granger (1987) procedure to test for the presence of a long-run
relationship. In step one of this procedure, we estimate by ordinary least
squares (OLS) the regression of the form

$$\pi^H_t = a_0 + a_1 \pi^C_t + \mu_t, \tag{2}$$

where μt is the disturbance term. In step two, we investigate the presence of a
unit root in the residuals of regression (2) using the augmented Dickey-Fuller

4 Many analysts have noted the low power of unit root tests in detecting nonstationarity
in series, arguing that inflation may not have a unit root when some more attractive alternative
hypotheses are considered. For example, Webb (1995) argues that it is possible to reject the
hypothesis of a unit root in inflation when the alternative hypothesis allows for the presence of
breaks in monetary policy regimes. As noted in the main text of this article, we also examine
short-term headline-core inflation dynamics, treating inflation as being stationary within each sub-period.

Table 1 Unit Root Tests

Augmented Dickey-Fuller Regressions
Biannual Data from 1959:1–2007:2

Levels:
$\pi^H_t = m_0 + \rho\,\pi^H_{t-1} + \sum_{s=1}^{k} m_{1s}\,\Delta\pi^H_{t-s} + \varepsilon_t$
$\pi^C_t = m_0 + \rho\,\pi^C_{t-1} + \sum_{s=1}^{k} m_{1s}\,\Delta\pi^C_{t-s} + \varepsilon_t$

First Differences:
$\Delta\pi^H_t = m_0 + \rho\,(\Delta\pi^H_{t-1}) + \sum_{s=1}^{k} m_{1s}\,(\Delta^2\pi^H_{t-s}) + \varepsilon_t$
$\Delta\pi^C_t = m_0 + \rho\,(\Delta\pi^C_{t-1}) + \sum_{s=1}^{k} m_{1s}\,(\Delta^2\pi^C_{t-s}) + \varepsilon_t$

Levels
                      CPI                             PCE
            ρ̂         t_ρ̂        k         ρ̂         t_ρ̂        k
Headline    0.8362    −2.3982    4         0.8896    −1.9427    4
Core        0.8510    −2.4537    1         0.9092    −2.1164    0

First Differences
                      CPI                             PCE
            ρ̂         t_ρ̂        k         ρ̂         t_ρ̂        k
Headline   −0.4085    −6.0378    3        −0.4590    −6.5147    3
Core       −0.4239    −8.7399    1        −0.0851   −10.6229    0

Notes: $\pi^H$ and $\pi^C$ are headline and core inflation, respectively, in levels, while $\Delta\pi^H$ and $\Delta\pi^C$ are first differences of
headline and core inflation. ρ̂ and the t-statistic, t_ρ̂, for ρ = 1 are from the augmented Dickey-Fuller regressions. The series
has a unit root if ρ = 1. The 5 percent critical value is −2.9. The number of lagged first differences (k) is chosen using the
Akaike Information Criterion.

Table 2 Co-integration Tests

Panel A: Engle-Granger Test
        α̂0         α̂1        δ̂         t_δ̂        k
CPI     0.1790     0.9714    0.2924    −4.4380    3
PCE    −0.0111     1.0408    0.3101    −4.7694    3

Panel B: Johansen Test
        λ1         λ2        Co-integrating Vector    LR
CPI     0.2617     0.0852    (−1.3725, 1.4368)        28.8223**
PCE     0.2554     0.0584    (−1.8244, 1.9375)        28.0137**

Panel C: Fully Modified OLS Estimates
        α0         α1        s1        s2
CPI     0.0837     0.9956    0.9326    0.8841
PCE    −0.0144     1.0427    0.3521    0.2481

Notes: Biannual data from 1959:1–2007:2. *10 percent significance, **5 percent significance. For the Engle-Granger (1987) test, α̂0, α̂1, δ̂, and the t-statistic for δ = 1
in Panel A are from two regressions of the form $\pi^H_t = \alpha_0 + \alpha_1 \pi^C_t + u_t$ and $\tilde{u}_t = \delta \tilde{u}_{t-1} + \sum_{s=1}^{k} b_s\,\Delta\tilde{u}_{t-s}$. Headline and core measures are not co-integrated if the residual series, $\tilde{u}_t$, has a unit root, i.e., if δ = 1. For the Johansen (1988) test, the table
shows the two eigenvalues, λ1 and λ2, used in evaluating Johansen’s likelihood function,
the estimated co-integrating vectors, and the likelihood ratio statistic, LR, for testing the
null hypothesis of no co-integration. The LR is calculated as −T · ln(1 − λ1), where T is
the number of total observations. Critical values for LR are reported under the heading
Case 1 in Hamilton (1994, 768, Table B.11). Panel C shows results from a fully modified
OLS regression of the form $\pi^H_t = \alpha_0 + \alpha_1 \pi^C_t + u_t$. The statistic s1 is the significance
level of the test of the hypothesis α1 = 1, while s2 is the significance level of the test of the
hypothesis α0 = 0 and α1 = 1. See notes from Table 1 for variable definitions.

test implemented by estimating a regression of the form

$$u_t = \delta u_{t-1} + \sum_{s=1}^{k} b_{1s}\,\Delta u_{t-s}, \tag{3}$$

where $u_t$ is the residual. If $\delta = 1$, then there does not exist a long-run relationship between headline and core measures of inflation. The null hypothesis,
$\delta = 1$, is tested using the t-statistic. Table 2, Panel A presents the pertinent
t-statistic, which is large in absolute value for both PCE and CPI inflation measures, leading to
the rejection of the null hypothesis. These test results suggest headline and
core measures of inflation are indeed co-integrated.
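
The two-step procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' code: the data are simulated (a random-walk "core" series and a "headline" series equal to core plus a stationary gap), the lag length k = 1 is arbitrary rather than chosen by the Akaike Information Criterion, and the helper names `ols`, `adf`, and `engle_granger` are hypothetical.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients, conventional standard errors, and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, resid


def adf(x, k, constant=True):
    """Regress x_t on x_{t-1} and k lagged first differences, as in (1) and (3).

    Returns (rho_hat, t-statistic for the null rho = 1).
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                       # dx[i] = x[i+1] - x[i]
    t_idx = np.arange(k + 1, len(x))      # dates t for which every lag exists
    cols = [x[t_idx - 1]]                 # x_{t-1}
    cols += [dx[t_idx - 1 - s] for s in range(1, k + 1)]  # lagged differences
    if constant:
        cols.append(np.ones(len(t_idx)))
    beta, se, _ = ols(x[t_idx], np.column_stack(cols))
    return beta[0], (beta[0] - 1.0) / se[0]


def engle_granger(headline, core, k):
    """Step 1: OLS of headline on a constant and core, as in (2).
    Step 2: regression (3) on the step-1 residuals, with no constant."""
    X = np.column_stack([np.ones(len(core)), core])
    (a0, a1), _, resid = ols(np.asarray(headline, dtype=float), X)
    delta, t_delta = adf(resid, k, constant=False)
    return a0, a1, delta, t_delta


# Simulated (hypothetical) data: core is a random walk, and headline equals
# core plus a stationary AR(1) gap, so the two series are co-integrated.
rng = np.random.default_rng(0)
n = 400
core = np.cumsum(rng.normal(0.0, 0.5, n))
gap = np.zeros(n)
for t in range(1, n):
    gap[t] = 0.3 * gap[t - 1] + rng.normal(0.0, 0.3)
headline = 0.2 + core + gap

rho_core, t_core = adf(core, k=1)
a0, a1, delta, t_delta = engle_granger(headline, core, k=1)
print(f"core level: rho = {rho_core:.3f}, t = {t_core:.2f}")
print(f"step 2: a1 = {a1:.3f}, delta = {delta:.3f}, t = {t_delta:.2f}")
```

As in the text, the t-statistics from these regressions are compared with Dickey-Fuller-type critical values rather than standard normal ones.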
The Engle-Granger test is implemented above by assuming a particular
normalization, regressing headline inflation on core inflation, and examining
the presence of a unit root in the residuals of (2). For robustness with respect
to normalization, we also implement the likelihood test of co-integration as
in Johansen (1988). Table 2, Panel B reports the likelihood test results and
estimated co-integrating vectors. The likelihood ratio statistic that tests the
null hypothesis of no co-integrating vector against the alternative of one co-integrating vector is large, leading to the rejection of the null hypothesis.
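
As a rough check on the arithmetic behind the likelihood ratio statistic, the formula given in the notes to Table 2, LR = −T · ln(1 − λ1), can be evaluated directly from the reported eigenvalues. The article does not state the value of T used, so T = 95 below is purely an assumption (the sample 1959:1–2007:2 contains 98 biannual observations, and some are lost to lags); the function name `johansen_lr` is likewise hypothetical.

```python
import math


def johansen_lr(largest_eigenvalue, n_obs):
    """Likelihood ratio statistic -T * ln(1 - lambda_1) from the Table 2 notes."""
    return -n_obs * math.log(1.0 - largest_eigenvalue)


# Largest eigenvalues reported in Table 2, Panel B.
eigenvalues = {"CPI": 0.2617, "PCE": 0.2554}

# ASSUMPTION: T = 95 is a hypothetical effective sample size, not a value
# reported in the article.
T = 95

lr = {name: johansen_lr(lam, T) for name, lam in eigenvalues.items()}
for name, value in lr.items():
    print(f"{name}: LR = {value:.2f}")
```

Under this assumed T the computed values land close to the LR statistics reported in Table 2; small differences would be expected from rounding of the eigenvalues.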
In order to be able to carry out tests of hypotheses on parameters of the estimated co-integrating vectors, we re-estimate the co-integrating relationship
(2) using a fully modified OLS estimator as in Phillips and Hansen (1990)
because standard OLS estimates, though consistent, do not have the asymptotic normal distribution. The estimates are reported in Table 2, Panel C. As
can be seen, the estimated long-run coefficient, a1, is positive and statistically
different from zero, suggesting the presence of a positive relationship between
headline and core inflation measures. The estimated long-run coefficient, a1,
is not statistically different from unity, suggesting the headline measure of inflation moves
one-for-one with the core measure in the long run. The significance level of
the statistic that tests the null hypothesis a0 = 0, a1 = 1 is .88 using CPI and
.25 using PCE. These significance levels are large, so the null hypothesis
is not rejected.
Having established above that headline and core measures of inflation co-move in the long run, we now investigate the sources of this co-movement by
estimating short-term error-correction equations of the form given in (4) and
(5):

$$\Delta\pi^H_t = b_0 + \lambda_h \mu_{t-1} + \sum_{s=1}^{k} \Delta\pi^H_{t-s} + \upsilon_t, \text{ and} \tag{4.1}$$

$$\Delta\pi^C_t = b_0 + \lambda_c \mu_{t-1} + \sum_{s=1}^{k} \Delta\pi^C_{t-s} + \upsilon_t. \tag{4.2}$$

Under the assumptions a0 = 0, a1 = 1, we can re-write (4) as (5):

$$\Delta\pi^H_t = b_0 + \lambda_h (\pi^H - \pi^C)_{t-1} + \sum_{s=1}^{k} \Delta\pi^H_{t-s} + \upsilon_t, \text{ and} \tag{5.1}$$

$$\Delta\pi^C_t = b_0 + \lambda_c (\pi^H - \pi^C)_{t-1} + \sum_{s=1}^{k} \Delta\pi^C_{t-s} + \upsilon_t. \tag{5.2}$$

Regressions (4) and (5) capture short-term dynamics between headline and
core inflation measures, and the coefficients λh and λc indicate how headline
inflation and core inflation adjust if a gap emerges between headline and
core inflation rates. If λh = 0 and λc > 0, headline and core inflation stay
together mainly by core inflation moving toward headline inflation. If λh < 0
and λc = 0, headline and core inflation stay together mainly by headline
inflation moving toward core inflation. If λh < 0 and λc > 0, both series
adjust, with headline inflation moving toward core inflation and core inflation
moving toward headline inflation. The relative magnitudes of these adjustment
coefficients convey information about which series adjusts more in response
to a core deviation.
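
To make the reading of the adjustment coefficients concrete, the sketch below estimates regressions of the form (5.1) and (5.2) by OLS on simulated data. Everything here is an assumption for illustration: the data are artificial (core is a random walk and headline reverts to it), the lagged differences are given free coefficients `c_s`, k = 2 is arbitrary, and `adjustment_coefficient` is a hypothetical helper rather than the authors' code.

```python
import numpy as np


def adjustment_coefficient(target, headline, core, k=2):
    """Estimate lambda in
        d(target)_t = b0 + lambda * (headline - core)_{t-1}
                      + sum_{s=1..k} c_s * d(target)_{t-s} + v_t
    by OLS. Returns (lambda_hat, t-statistic for lambda = 0)."""
    target = np.asarray(target, dtype=float)
    gap = np.asarray(headline, dtype=float) - np.asarray(core, dtype=float)
    d = np.diff(target)                     # d[i] = target[i+1] - target[i]
    t_idx = np.arange(k + 1, len(target))   # dates t with all lags available
    y = d[t_idx - 1]                        # first difference dated t
    cols = [np.ones(len(t_idx)), gap[t_idx - 1]]
    cols += [d[t_idx - 1 - s] for s in range(1, k + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta[1], beta[1] / se[1]


# Simulated (hypothetical) data in which only headline adjusts: core is a
# random walk and the headline-core gap is a stationary AR(1) process.
rng = np.random.default_rng(0)
n = 2000
core = np.cumsum(rng.normal(0.0, 0.2, n))
gap = np.zeros(n)
for t in range(1, n):
    gap[t] = 0.5 * gap[t - 1] + rng.normal(0.0, 0.3)
headline = core + gap

lam_h, t_h = adjustment_coefficient(headline, headline, core)
lam_c, t_c = adjustment_coefficient(core, headline, core)
print(f"lambda_h = {lam_h:.3f} (t = {t_h:.2f})")
print(f"lambda_c = {lam_c:.3f} (t = {t_c:.2f})")
```

In this artificial setting the estimates should fall in the case λh < 0 and λc = 0 described above, in which the two series stay together mainly through headline inflation moving toward core inflation.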
Table 3, Panel A, presents estimates of short-term error-correction (adjustment) coefficients, providing information about the ways these two series
adjust over three sub-samples considered. Focusing first on the adjustment
coefficient, λh , that appears in headline inflation regressions, this estimated
coefficient is positive and not statistically different from zero in the pre-1979
sample period, but is negative and statistically different from zero in the recent
sample period, 1985:1–2007:2. This result holds for headline CPI as well as
for headline PCE inflation. These estimates of the adjustment coefficient, λh ,
suggest that if headline inflation is above core inflation, headline inflation reverts toward core inflation in the recent sample period but not in the pre-1979
sample period. Focusing now on the adjustment coefficient, λc , that appears
in core inflation equations, we see that results differ for CPI and PCE inflation measures. In core PCE inflation equations, the estimated coefficient is
positive, large, and statistically significant in the pre-1979 sample period but
it becomes small and not statistically different from zero in the recent sample
period, 1985:1–2007:2, suggesting that if headline inflation is above core inflation, core inflation moves toward headline inflation in the pre-1979 sample
period but not in the recent sample period, 1985:1–2007:2. For CPI inflation,
the adjustment coefficient, λc , that appears in the core inflation equation does
decline significantly from .91 in the pre-1979 sample period to .19 in the recent sample period. However, it remains statistically significant in the recent
sample period, suggesting the CPI measure of core inflation has also moved
somewhat toward headline inflation. Together, these short-term adjustment
coefficients suggest that, whereas in the pre-1979 sample period headline and
core measures of inflation stayed together as a result of core inflation moving
toward headline inflation, in the recent sample period they have stayed together more as a result of headline inflation moving toward core inflation than
core inflation moving toward headline inflation. In order to check robustness,
discussed in detail later in this article, we re-estimate short-term adjustment
equations (5) augmented to include two additional lags of other economic determinants of inflation such as changes in a short-term nominal interest rate
and changes in the unemployment rate. Table 3, Panel B, presents the short-term adjustment coefficients from these augmented regressions.
These estimates of the short-term adjustment coefficients yield qualitatively
similar results about the change in headline-core inflation dynamics.5

5 The adjusted R-squared statistics provided in Table 3 appear reasonable given that short-term
adjustment equations are estimated using first-differences of inflation measures.

Table 3 Short-Term Headline-Core Inflation Dynamics

Panel A: Bivariable Adjustment Regressions
$\Delta\pi^H_t = \beta_0 + \lambda_h(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} \Delta\pi^H_{t-s} + v_t$
$\Delta\pi^C_t = \beta_0 + \lambda_c(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} \Delta\pi^C_{t-s} + v_t$

                                   CPI                                            PCE
                  λh          R̄²        λc          R̄²          λh          R̄²        λc          R̄²
1959:1–1979:1     0.4745     −0.027     0.9141**    0.433        0.4011     −0.042     0.7734**    0.406
1979:2–2001:2    −0.0881      0.144     0.2917      0.136       −0.8139**    0.200    −0.0483      0.107
1985:1–2007:2    −0.6471*     0.365     0.1943**    0.264       −0.6168**    0.328     0.0763      0.203

Panel B: Multivariable Adjustment Regressions
$\Delta\pi^H_t = \beta_0 + \lambda_h(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} (\Delta\pi^H_{t-s} + \Delta\pi^C_{t-s} + \Delta sr_{t-s} + \Delta ur_{t-s}) + v_t$
$\Delta\pi^C_t = \beta_0 + \lambda_c(\pi^H_{t-1} - \pi^C_{t-1}) + \sum_{s=1}^{2} (\Delta\pi^C_{t-s} + \Delta\pi^H_{t-s} + \Delta sr_{t-s} + \Delta ur_{t-s}) + v_t$

                                   CPI                                            PCE
                  λh          R̄²        λc          R̄²          λh          R̄²        λc          R̄²
1959:1–1979:1     0.3551      0.251     1.0793**    0.520        0.2213      0.147     0.6770*     0.354
1979:2–2001:2    −0.2208      0.322     0.6665**    0.601       −0.2972      0.260     0.4519*     0.335
1985:1–2007:2    −0.7319**    0.351     0.2701**    0.465       −0.5400      0.261     0.4158**    0.280

Notes: *10 percent significance, **5 percent significance. The coefficients λh and λc are estimated using OLS; $\Delta sr_t$ is the
first difference in the short-term nominal rate, defined as the three-month Treasury-bill rate; $\Delta ur_t$ is the first difference in the
unemployment rate. See notes from Table 1 for the definitions of other variables.

Stationarity and Mean Reversion
We also examine short-term headline-core dynamics by focusing on the influence of core deviation on the longer-horizon behavior of inflation, assuming
headline and core inflation measures are likely mean stationary in shorter
sample periods. If headline inflation is above core inflation and if adjustment
occurs mainly as a result of headline inflation moving toward core inflation,
we should expect headline inflation to decline in the near term. With that in
mind, we examine the behavior of inflation over various forecast horizons as
in (6.1) and (6.2):6

$$\pi^H_{t+f} - \pi^H_t = b_{0f} + \lambda_{hf}(\pi^H - \pi^C)_t + \sum_{s=1}^{k} b_{1f}\,\pi^H_{t-s} + \mu_{t+f}, \text{ and} \tag{6.1}$$

$$\pi^C_{t+f} - \pi^C_t = c_{0f} + \lambda_{cf}(\pi^H - \pi^C)_t + \sum_{s=1}^{k} c_{1f}\,\pi^C_{t-s} + \mu_{t+f}, \tag{6.2}$$

where $\pi^H_{t+f}$ is the f-periods-ahead headline inflation rate, $\pi^H_t$ is the current-period headline inflation rate, $\pi^C_t$ is the current-period core inflation rate,
$(\pi^H - \pi^C)_t$ is the contemporaneous core deviation, f is the forecast horizon,
and $\mu_{t+f}$ is a mean-zero random disturbance term. Regressions (6.1) and
(6.2) relate the change in inflation over the next f (six-month) periods to
the contemporaneous gap between headline and core inflation rates. If the
coefficients, λhf, in (6.1) are generally negative and the coefficients, λcf,
in (6.2) are zero, then core deviation is eliminated primarily as a result of
headline inflation reverting to core inflation. In contrast, if the coefficients,
λhf, in (6.1) are zero and the coefficients, λcf, in (6.2) are positive, core
deviation is eliminated mainly as a result of core inflation moving toward
headline inflation.
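
The horizon regressions (6.1) and (6.2) can be sketched in the same spirit. The example below is a minimal illustration on simulated data, assuming free coefficients `c_s` on the lagged levels, an arbitrary k = 2, and a data-generating process in which only headline inflation adjusts; `horizon_coefficient` is a hypothetical helper, and none of the numbers correspond to the estimates in Table 4.

```python
import numpy as np


def horizon_coefficient(target, headline, core, f, k=2):
    """Estimate lambda_f in
        target_{t+f} - target_t = b0 + lambda_f * (headline - core)_t
                                  + sum_{s=1..k} c_s * target_{t-s} + mu_{t+f}
    by OLS and return lambda_f."""
    target = np.asarray(target, dtype=float)
    gap = np.asarray(headline, dtype=float) - np.asarray(core, dtype=float)
    t_idx = np.arange(k, len(target) - f)     # dates t with k lags and f leads
    y = target[t_idx + f] - target[t_idx]     # change over the next f periods
    cols = [np.ones(len(t_idx)), gap[t_idx]]
    cols += [target[t_idx - s] for s in range(1, k + 1)]  # lagged levels
    beta, *_ = np.linalg.lstsq(np.column_stack(cols), y, rcond=None)
    return beta[1]


# Simulated (hypothetical) data: core is a random walk and the headline-core
# gap is a stationary AR(1), so the gap closes through headline inflation.
rng = np.random.default_rng(0)
n = 2000
core = np.cumsum(rng.normal(0.0, 0.2, n))
gap = np.zeros(n)
for t in range(1, n):
    gap[t] = 0.5 * gap[t - 1] + rng.normal(0.0, 0.3)
headline = core + gap

lam_hf = {f: horizon_coefficient(headline, headline, core, f) for f in range(1, 5)}
lam_cf = {f: horizon_coefficient(core, headline, core, f) for f in range(1, 5)}
for f in range(1, 5):
    print(f"f = {f}: lambda_hf = {lam_hf[f]:.3f}, lambda_cf = {lam_cf[f]:.3f}")
```

Because the left-hand-side changes overlap for f > 1, the residuals are serially correlated; the sketch therefore reports point estimates only and leaves inference aside.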
Table 4 presents estimates of the coefficients from regressions given in
(6.1) and (6.2). The estimates are presented for forecast horizons of one to
four periods in the future. Panel A presents estimates using CPI and Panel
B uses PCE. Since results derived using CPI are broadly similar to those derived using PCE inflation, we focus on estimates derived using CPI. As can be
seen, in the pre-1979 sample period, the estimated coefficients λhf, f = 1, 2, ..., 4,
are not statistically different from zero and λcf, f = 1, 2, ..., 4, are positive, confirming that the series have
stayed together mainly as a result of core inflation moving toward headline
inflation. In the most recent sample period, 1985:1–2007:2, however, the estimated coefficients λhf, f = 1, 2, ..., 4, are negative and λcf, f = 1, 2, ..., 4,
are positive, suggesting that both series are adjusting to each other. However,
6 In previous research, analysts have focused only on equation (6.1), examining reversion in
headline inflation. See, for example, Clark (2001), Cogley (2002), and Rich and Steindel (2005).

Table 4 Short-Term Headline-Core Inflation Dynamics

Long-Horizon Behavior of Inflation
$\pi^H_{t+f} - \pi^H_t = b_{0,f} + \lambda_{hf}(\pi^H_t - \pi^C_t) + \sum_{s=1}^{2} \pi^H_{t-s} + \mu_{t+f}$
$\pi^C_{t+f} - \pi^C_t = b_{0,f} + \lambda_{cf}(\pi^H_t - \pi^C_t) + \sum_{s=1}^{2} \pi^C_{t-s} + \mu_{t+f}$

Panel A: CPI
                  1959:1–1979:1                            1979:2–2001:2                            1985:1–2007:2
         λ̂hf (t-value)       λ̂cf (t-value)       λ̂hf (t-value)       λ̂cf (t-value)       λ̂hf (t-value)       λ̂cf (t-value)
f = 1     0.2922 (1.4712)     0.9476 (5.1898)    −0.7230 (−2.3898)    0.1660 (0.6635)    −0.7101 (−4.9165)    0.1858 (3.7078)
f = 2     0.1799 (0.6308)     1.0523 (4.0652)    −0.8962 (−3.5360)    0.2796 (1.1604)    −0.9658 (−5.9644)    0.1906 (2.5528)
f = 3     0.2708 (0.8540)     0.9571 (3.2132)    −0.6554 (−2.4870)    0.1431 (0.5892)    −0.8059 (−4.9527)    0.1478 (1.7234)
f = 4    −0.4165 (−1.1036)    0.5683 (1.5071)    −1.2379 (−3.9101)   −0.1730 (−0.6615)   −1.0563 (−6.2716)    0.0934 (3.6687)

Panel B: PCE
                  1959:1–1979:1                            1979:2–2001:2                            1985:1–2007:2
         λ̂hf (t-value)       λ̂cf (t-value)       λ̂hf (t-value)       λ̂cf (t-value)       λ̂hf (t-value)       λ̂cf (t-value)
f = 1     0.1621 (0.6311)     0.7294 (4.3905)    −0.9177 (−3.9444)   −0.1905 (−1.1215)   −0.7550 (−4.5498)    0.0684 (0.6118)
f = 2    −0.0373 (−0.1061)    0.7377 (2.7125)    −1.2218 (−5.5721)   −0.0874 (−0.5304)   −0.9640 (−4.9353)    0.2006 (1.6302)
f = 3    −0.1051 (−0.2686)    0.5343 (1.6707)    −0.8718 (−4.0458)   −0.0183 (−0.0981)   −0.8452 (−4.4408)    0.1519 (1.2177)
f = 4    −0.9059 (−2.1481)    0.2232 (0.6207)    −1.5878 (−7.2154)   −0.6071 (−3.1880)   −1.1517 (−5.4391)   −0.0300 (−0.1976)

Notes: f is the number of periods in the forecasting horizon. Regressions are estimated including levels of lagged inflation.
All regressions are estimated using OLS. See notes from Table 1 for variable definitions.

relative magnitudes of the estimated adjustment coefficients suggest headline
inflation has moved more toward core inflation than core inflation has moved
toward headline inflation.

Robustness: Multivariate System, Data Frequency, and Sample Breaks
The changes in short-term headline-core inflation dynamics summarized above
are derived using a bivariable framework, biannual data, and three sub-periods
generated by breaking the sample in 1979 and 1984. Here, we present some
additional evidence indicating that inferences about the nature of the change in headline-core inflation dynamics remain robust to several changes in specification. The
first change in specification expands the regressions given in (5.1) and (5.2) to
include other possible determinants of inflation such as changes in a short-term
nominal interest rate (capturing the possible influence of monetary policy actions)
and changes in the unemployment rate (as a proxy for the influence of the
state of the economy). We focus on the sign and statistical significance of the
short-term adjustment coefficients in these expanded regressions. As already
noted, estimates from these multivariate regressions (Table 3, Panel B) yield
qualitatively similar inferences about the nature of the change in short-term
headline-core inflation dynamics to those derived using bivariable regressions.
Rather than focus on three sub-periods, we estimate the short-term adjustment coefficients from the multivariate versions of regressions given in
(5.1) and (5.2) using rolling regressions over a 19-year window.7 We estimate
those regressions using biannual as well as quarterly data. Since the results
using biannual data are qualitatively similar to those derived using quarterly
data and, since the results also appear robust to the use of CPI or PCE inflation, we focus on estimates derived using biannual data and CPI inflation.
Panel A in Figure 2 charts estimates of the short-term adjustment coefficient,
λh , from headline inflation regressions, and Panel B charts estimates of the
short-term adjustment coefficient, λc , from core inflation regressions, with
95 percent confidence bands. In samples that begin in the 1960s or early
1970s, the short-term adjustment coefficient, λh, is usually positive but not statistically different from zero whereas the short-term adjustment coefficient, λc,
is positive and statistically different from zero, suggesting headline inflation
does not revert, but rather core inflation moves toward headline inflation. In
contrast, in samples that begin in the early 1980s, the short-term adjustment
coefficient, λh , is instead negative and statistically significant whereas the
short-term adjustment coefficient, λc , is positive but not always statistically
7 In the multivariable versions of (5.1) and (5.2), we include changes in a short-term nominal
interest rate and changes in the unemployment rate, besides including lags of headline and core
inflation rates.

Figure 2 Rolling Window Regression: 19-Year Window, Biannual Data, CPI Inflation

[Figure: upper panel, estimate of the adjustment coefficient in the headline equation with 95 percent confidence band; lower panel, estimate of the adjustment coefficient in the core equation with 95 percent confidence band. The x-axis runs from 1960 to 1988.]

Notes: Entries on the x-axis represent the start of the sample window for the coefficient
estimate.

different from zero. This suggests that the gap between headline and core CPI
inflation is eliminated as a result of headline inflation reverting toward core
inflation rather than core inflation moving toward headline inflation. These
results are qualitatively similar to those derived using bivariable regressions
estimated across three chosen sample periods.
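
The rolling-window exercise can be sketched as follows. For brevity this illustration uses the bivariable form of (5.1) and (5.2) rather than the multivariable versions used in the text, simulated data in place of CPI inflation, an arbitrary k = 2, and hypothetical helper names `ecm_lambda` and `rolling_lambdas`. The window of 38 observations corresponds to 19 years of biannual data.

```python
import numpy as np


def ecm_lambda(target, gap, k=2):
    """OLS estimate of lambda in
        d(target)_t = b0 + lambda * gap_{t-1}
                      + sum_{s=1..k} c_s * d(target)_{t-s} + v_t."""
    d = np.diff(target)                     # d[i] = target[i+1] - target[i]
    t_idx = np.arange(k + 1, len(target))   # dates t with all lags available
    cols = [np.ones(len(t_idx)), gap[t_idx - 1]]
    cols += [d[t_idx - 1 - s] for s in range(1, k + 1)]
    beta, *_ = np.linalg.lstsq(np.column_stack(cols), d[t_idx - 1], rcond=None)
    return beta[1]


def rolling_lambdas(headline, core, window=38, k=2):
    """Re-estimate (lambda_h, lambda_c) on every run of `window` observations.

    Returns a list of (start_index, lambda_h, lambda_c) tuples."""
    headline = np.asarray(headline, dtype=float)
    core = np.asarray(core, dtype=float)
    gap = headline - core
    out = []
    for start in range(len(headline) - window + 1):
        sl = slice(start, start + window)
        out.append((start,
                    ecm_lambda(headline[sl], gap[sl], k),
                    ecm_lambda(core[sl], gap[sl], k)))
    return out


# Simulated (hypothetical) data with 98 observations, the number of half-years
# in 1959:1-2007:2.
rng = np.random.default_rng(0)
n = 98
core = np.cumsum(rng.normal(0.0, 0.2, n))
gap = np.zeros(n)
for t in range(1, n):
    gap[t] = 0.5 * gap[t - 1] + rng.normal(0.0, 0.3)
headline = core + gap

results = rolling_lambdas(headline, core)
print(len(results), "windows; first:", results[0], "last:", results[-1])
```

Plotting the second and third elements of each tuple against the window start date would give charts in the style of Figure 2; confidence bands are omitted from this sketch.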

2. DISCUSSION OF RESULTS

What explains the change in the short-term headline-core inflation dynamics
documented above? Recent research suggests a monetary policy explanation.
Mishkin (2007a) provides evidence that in recent years inflation persistence
has declined and inflation has become less responsive to changes in unemployment and other shocks. He attributes this change in inflation dynamics to the anchoring of inflation expectations as a result of better conduct of
monetary policy. In a recent paper, Leduc, Sill, and Stark (2007) attribute the
persistently high inflation of the 1970s to a weak monetary policy response to
surprise increases in the public’s expectations of inflation. In particular, using
a structural VAR that includes a direct survey measure of expected (headline
CPI) inflation, Leduc, Sill, and Stark show that, prior to 1979, the Federal
Reserve accommodated exogenous movements in expected inflation, seen in
the result that the short-term real interest rate did not increase in response
to such movements, which then led to persistent increases in actual inflation.
Such behavior, however, is absent post-1979. We argue below that such a change
in the Federal Reserve’s accommodation of expected headline inflation is also
capable of generating the change in actual headline-core inflation dynamics
documented above. We demonstrate this by using a variant of the structural
VAR model that includes actual headline and core inflation measures.8
To explain further, consider a four-variable VAR that contains a direct
survey measure of the public’s expectations of headline inflation, represented
by the median Livingston survey forecast of the eight-month-ahead headline
CPI inflation rate ($\pi^{eH}_t$). The other variables included in the VAR are actual
headline CPI inflation ($\pi^H_t$), actual core CPI inflation ($\pi^C_t$), and a short-term
nominal interest rate ($sr_t$). Following Leduc, Sill, and Stark (2007), we define
and measure variables in such a way that survey participants making forecasts
do not observe contemporaneous values of other VAR variables, thereby helping to identify exogenous movements in expected headline inflation.9 Using
a recursive identification scheme $\{\pi^{eH}_t, \pi^H_t, \pi^C_t, sr_t\}$ in which expected inflation is ordered first and the short nominal interest rate is last, we examine and
compare the impulse responses of actual headline and core inflation measures
to surprise increases in expected headline inflation (and core inflation itself).
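
A recursive identification scheme of this kind is commonly implemented with a Cholesky factorization of the reduced-form residual covariance matrix. The sketch below shows that mechanism on simulated data. It is an illustration only: the article does not state the lag length used, so a VAR(1) is assumed here, the data are artificial, and `recursive_irf` is a hypothetical helper rather than the authors' procedure.

```python
import numpy as np


def recursive_irf(data, horizons=14):
    """Impulse responses from a VAR(1) with intercept, identified recursively.

    data: (T, n) array whose columns follow the recursive ordering.
    Returns irf with irf[h, i, j] = response of variable i, h periods after
    a one-standard-deviation shock to variable j."""
    data = np.asarray(data, dtype=float)
    n = data.shape[1]
    Y = data[1:]
    X = np.column_stack([np.ones(len(Y)), data[:-1]])
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # shape (n + 1, n)
    A = B[1:].T                                 # y_t = c + A y_{t-1} + u_t
    resid = Y - X @ B
    sigma = resid.T @ resid / (len(Y) - X.shape[1])
    P = np.linalg.cholesky(sigma)               # lower triangular, sigma = P P'
    irf = np.empty((horizons + 1, n, n))
    irf[0] = P                                  # impact responses
    for h in range(1, horizons + 1):
        irf[h] = A @ irf[h - 1]
    return irf


# Simulated (hypothetical) data; the column order mimics the ordering
# {expected headline, headline, core, short rate}.
rng = np.random.default_rng(0)
T, n = 200, 4
data = np.zeros((T, n))
for t in range(1, T):
    data[t] = 0.5 * data[t - 1] + rng.normal(0.0, 1.0, n)

irf = recursive_irf(data, horizons=14)
names = ["expected headline", "headline", "core", "short rate"]
for i, name in enumerate(names):
    print(f"impact response of {name} to an expected-inflation shock: {irf[0, i, 0]:.3f}")
```

Because the impact matrix is lower triangular, the variable ordered first does not respond within the period to shocks in the other variables, which matches the assumption that survey participants do not observe contemporaneous values of the other VAR variables.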
Figure 3 shows the responses of VAR variables to a one-time surprise
increase in expected headline inflation for three sample periods: 1959:1–
1979:1 (Panel A), 1979:2–2001:2 (Panel B), and 1985:1–2007:2 (Panel C).
Figure 4 shows the responses to a one-time increase in core inflation. In these
figures, and those that follow, the solid line indicates the point estimate, while
the darker and lighter shaded regions represent 68 percent and 90 percent
confidence bands, respectively.
Focusing on Figure 3, we highlight two observations. First, the effects of
a surprise increase in expected headline inflation on actual headline and core
measures of inflation have changed over time. In the pre-1979 sample period,
a surprise increase in expected headline inflation is not reversed and leads to
a persistent increase in actual headline and core inflation measures. However,
in post-1979 sample periods, such effects have become weaker. In fact, in the
8 For an empirical demonstration of the impact of change in policy on the stability of empirical models (the so-called Lucas critique), see Lubik and Surico (2006).
9 For further details see Leduc, Sill, and Stark (2007) and Mehra and Herrington (2008).

Figure 3 Shock to Expected Headline Inflation

[Figure: impulse responses of headline inflation, expected headline inflation, core inflation, and the nominal interest rate over horizons 0 to 14. Panel A: 1959:1–1979:1; Panel B: 1979:2–2001:2; Panel C: 1985:1–2007:2.]

Notes: Responses to a one standard deviation shock to expected headline CPI inflation.
The responses are generated from a VAR with expected headline CPI inflation, actual
headline CPI inflation, actual core CPI inflation, and the three-month Treasury bill rate.
All responses are in percentage terms. In each chart, the darker area represents the 68
percent confidence interval and the lighter area represents the 90 percent confidence interval. The x-axis denotes six-month periods.

most recent sample period, 1985:1–2007:2, a surprise increase in expected
headline inflation is reversed and has no significant effect on actual headline
and core inflation measures (compare responses in Panels A and C). These results suggest that, in the pre-1979 sample period, shocks to expected headline
inflation can generate co-movement between headline and core measures of
inflation and that this co-movement arises as a result of headline inflation not
reverting to core inflation and core inflation moving toward headline inflation.
In contrast, in the recent sample period, 1985:1–2007:2, a surprise increase
in expected headline inflation does not generate co-movement between actual
headline and core inflation measures because they are not affected by movements in expected headline inflation. As discussed below, a surprise increase
in core inflation, however, can generate co-movement between headline and
core measures of inflation in the most recent sample period.
Second, the interest rate responses shown in Figure 3 suggest monetary
policy may be at the source of the above-noted change in the response of actual headline inflation to expected headline inflation shocks. If we focus on
the nominal interest rate response shown in Panel A, we see that the nominal
interest rate does increase in response to a surprise increase in expected headline inflation, but that this increase in the nominal interest rate approximates
the increase in expected headline inflation, leaving the real interest rate essentially
unchanged.10 The behavior of the real interest rate in response to surprise
increases in expected headline inflation suggests that the Federal Reserve followed an accommodative monetary policy. However, in the sample period
1979:2–2001:2, the real interest rate rises sharply in response to a surprise
increase in expected headline inflation, suggesting that the Federal Reserve
did not accommodate shocks to expected headline inflation. In the most recent sample period, 1985:1–2007:2, there is no significant response of the
real interest rate to an expected inflation shock, because a surprise increase in
expected headline inflation is reversed, having no significant effect on actual
headline and core inflation measures.
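The real interest rate responses discussed above are inferred rather than estimated directly: the expected real rate response is the nominal rate response minus the expected headline inflation response at each horizon. A minimal sketch of that subtraction follows; the response values are hypothetical placeholders chosen for illustration and are not read off Figure 3.

```python
# Hypothetical impulse responses in percentage points, one entry per six-month horizon.
# These numbers are illustrative assumptions, not estimates from the article.
nominal_rate_response = [0.50, 0.80, 0.90, 0.70]
expected_inflation_response = [0.50, 0.75, 0.85, 0.70]


def real_rate_response(nominal, expected_inflation):
    """Expected real rate response = nominal rate response - expected inflation response."""
    if len(nominal) != len(expected_inflation):
        raise ValueError("responses must cover the same horizons")
    return [n - e for n, e in zip(nominal, expected_inflation)]


real = real_rate_response(nominal_rate_response, expected_inflation_response)
for horizon, value in enumerate(real):
    print(f"horizon {horizon}: real rate response {value:+.2f}")
```

A real rate response near zero at every horizon, as in these made-up numbers, is the pattern the text reads as accommodative policy.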
Focusing on Figure 4, we see that it is only in the most recent sample period, 1985:1–2007:2, that a surprise increase in core inflation leads to an
increase in expected and actual headline inflation, generating co-movement between headline and core measures of inflation. This co-movement is generated
as a result of headline inflation moving toward core inflation. Furthermore,
the real interest rate does rise significantly in response to a surprise increase
in core inflation, suggesting that in conducting monetary policy the Federal
Reserve appears to be focused on the core measure of inflation. In contrast,
in the pre-1979 sample period, a surprise increase in core inflation does not
lead to an increase in headline inflation and there is no significant response of
the nominal interest rate.11
10 We infer the response of the real interest rate to a shock by comparing the responses of
the nominal interest rate and expected headline inflation. Thus, the expected real interest rate response is simply the short-term nominal interest rate response minus the expected headline inflation
response.
11 However, in the pre-1979 sample period, a surprise increase in core inflation is reversed
and leads to a decline (not increase) in expected and actual headline inflation. Even though the

Figure 4 Shock to Core Inflation
[Impulse-response charts omitted. Panel A: 1959:1 - 1979:1, Panel B: 1979:2 - 2001:2, and Panel C: 1985:1 - 2007:2 each contain four charts: Headline Inflation Response, Expected Headline Response, Core Inflation Response, and Nominal Interest Rate Response, plotted over horizons 0 to 14.]

Notes: Responses to a one standard deviation shock to core CPI inflation. The responses
are generated from a VAR with expected headline CPI inflation, actual headline CPI inflation, actual core CPI inflation, and the three-month Treasury bill rate. All responses
are in percentage terms. In each chart, the darker area represents the 68 percent confidence interval and the lighter area represents the 90 percent confidence interval. The
x-axis denotes six-month periods.

nominal interest rate does not increase in response to a positive shock to core inflation, the expected
real interest rate does increase because of a decline in expected headline inflation. These responses

Together, the responses depicted in Figures 3 and 4 imply that, before
1979, headline and core inflation measures co-move mainly as a result of
core inflation moving toward headline inflation, because the Federal Reserve
accommodated surprise increases in the public’s expectations of headline inflation. A surprise increase in core inflation is simply reversed and does not
lead to higher expected or actual headline inflation. Since 1979, however, the
Federal Reserve has not accommodated increases in the public’s expectations
of headline inflation, and hence co-movement has mainly arisen as a result of
headline inflation moving toward core inflation.

Food and Energy Inflation
Since the measure of core inflation used here is derived excluding food and
energy inflation from headline inflation, and since food and energy prices are
likely to be a significant source of movements in expected headline inflation,
the results discussed above imply that the change in the monetary policy response to
expected headline inflation may reflect a change in the monetary policy response
to movements in expected food and energy prices. Since we do not have
any direct survey data on the public’s expectations of food and energy price
inflation, we provide some preliminary evidence on this issue by examining
responses to movements in actual food and energy inflation. With that in mind,
we consider another variant of the structural VAR model that includes expected
headline inflation, actual core inflation, the food and energy component of
headline CPI inflation, and the short-term nominal interest rate. We continue
to assume the baseline identification ordering $\{\pi^{eH}_t, \pi^{C}_t, (\pi^{H}_t - \pi^{C}_t), sr_t\}$ in
which expected headline inflation is exogenous but food and energy price
inflation is not. Food and energy inflation is measured as the gap between
headline and core inflation rates.
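One common way to implement an identification ordering of this kind is a recursive (Cholesky) decomposition of the residual covariance matrix, under which the first-ordered variable responds on impact only to its own shock. The excerpt does not spell out the implementation, so the sketch below is an assumption: it uses a VAR(1), and the coefficient matrix `A` and the impact matrix `L` are hypothetical values chosen only to show the mechanics.

```python
import numpy as np


def orthogonalized_irf(A, Sigma, horizons):
    """Responses to one standard deviation shocks in a VAR(1), y_t = A y_{t-1} + u_t.

    Entry [h, i, j] is the response of variable i at horizon h to shock j,
    with shocks identified by the Cholesky factor of Sigma (recursive ordering).
    """
    P = np.linalg.cholesky(Sigma)      # lower triangular, Sigma = P @ P.T
    power = np.eye(A.shape[0])         # holds A**h
    responses = []
    for _ in range(horizons + 1):
        responses.append(power @ P)
        power = A @ power
    return np.array(responses)


# Assumed variable order: expected headline, core, food and energy gap, short rate.
# All numbers below are hypothetical.
A = np.array([[0.6, 0.1, 0.1, 0.0],
              [0.2, 0.5, 0.0, 0.0],
              [0.1, 0.0, 0.3, 0.0],
              [0.1, 0.1, 0.1, 0.7]])
L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.3, 0.8, 0.0, 0.0],
              [0.5, 0.1, 1.2, 0.0],
              [0.2, 0.1, 0.1, 0.6]])
Sigma = L @ L.T

irf = orthogonalized_irf(A, Sigma, horizons=14)
print(irf[0])   # impact responses; the first row is zero except for the own shock
```

Placing expected headline inflation first is what makes it contemporaneously exogenous in this sketch: shocks to the later variables cannot move it within the period.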
Figure 5 shows responses to a surprise increase in the food and energy
component of headline inflation over three sample periods: 1959:1–1979:1
(Panel A), 1979:2–2001:2 (Panel B), and 1985:1–2007:2 (Panel C). In the pre-1979 sample period, a surprise temporary increase in food and energy prices
has a significant effect on expected headline inflation, leading to a persistent
increase in expected (and hence actual) headline inflation. Core inflation is
also persistently higher in response to a surprise increase in food and energy
inflation. These responses suggest that a surprise increase in food and energy
inflation can generate co-movement between headline and core measures of
inflation, with core inflation moving toward headline inflation. However, in

suggest that the Federal Reserve was not as accommodative to shocks to core inflation as it was
to shocks to expected headline inflation. As noted by several analysts, the Federal Reserve may
have believed that shocks to food and energy prices are likely temporary and would not lead to
persistent increases in headline inflation.

Figure 5 Shock to Food and Energy Component of Headline Inflation
[Impulse-response charts omitted. Panel A: 1959:1 - 1979:1, Panel B: 1979:2 - 2001:2, and Panel C: 1985:1 - 2007:2 each contain four charts: Core Inflation Response, Expected Headline Response, Food and Energy Inflation Response, and Nominal Interest Rate Response, plotted over horizons 0 to 14.]

Notes: Responses to a one standard deviation shock to the food and energy component of
headline CPI inflation. The responses are generated from a VAR with expected headline
CPI inflation, core CPI inflation, food and energy inflation, and the three-month Treasury
bill rate. All responses are in percentage terms. In each chart, the darker area represents the 68 percent confidence interval and the lighter area represents the 90 percent
confidence interval. The x-axis denotes six-month periods.

post-1979 sample periods the positive response of expected headline inflation
to a surprise increase in food and energy inflation weakens considerably. More
interestingly, in the most recent sample period, 1985:1–2007:2, a surprise
increase in food and energy inflation has no significant effect on expected
headline inflation, suggesting that the public believes increases in food and
energy prices are unlikely to lead to a persistent increase in headline inflation
(compare responses across Panels A through C).12
The response of the real interest rate to a surprise increase in food and
energy prices implicit in Panels A through C suggests a monetary policy explanation of the decline in the influence of food and energy prices on expected
headline inflation. In the pre-1979 period, the real interest rate does not
change much because the rise in the nominal interest rate matches the rise in expected headline inflation, suggesting an accommodative stance of monetary
policy. However, in the sample period 1979:2–2001:2, the real interest rate
rises significantly in response to a surprise increase in food and energy prices,
suggesting that the Federal Reserve did not accommodate increases in food
and energy prices. Hence, the decline in the influence of food and energy
inflation on expected headline inflation since 1979 may be due to the Federal
Reserve no longer accommodating shocks to food and energy prices.
In the most recent sample period, 1985:1–2007:2, however, there is no
significant response of the nominal (or real) interest rate to a surprise increase
in food and energy prices, because a surprise increase in food and energy
inflation has no significant effect on expected headline inflation. One plausible explanation of the absence of any significant effect of movements in
food and energy inflation on expected headline inflation is that past Federal
Reserve behavior has convinced the public that it would not accommodate
food and energy inflation. As a result, surprise increases in food and energy
inflation have no significant effect on expected headline inflation, suggesting
the Federal Reserve has become credible.
But do shocks to food and energy inflation matter for expected headline
inflation? The results of the variance decomposition of expected headline
inflation presented in Table 5 are consistent with the decline in the influence
of food and energy inflation on expected headline inflation since 1979. In the
pre-1979 sample period, shocks to the food and energy component of inflation
account for about 35 percent of the variability of expected headline inflation
at a two-year horizon, whereas in the recent sample period, 1985:1–2007:2,
they account for less than 4 percent of the variability of expected headline
inflation at the same two-year horizon.
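A variance decomposition of this kind can be computed from the orthogonalized moving-average coefficients of the VAR: the share of shock j in the n-step-ahead forecast error variance of variable i is the cumulated squared response of i to j, divided by the sum over all shocks. The sketch below assumes a VAR(1) with a Cholesky ordering and uses hypothetical coefficients; it does not reproduce the estimates in Table 5.

```python
import numpy as np


def fevd(A, Sigma, steps):
    """Forecast error variance decomposition for a VAR(1) with recursive identification.

    Entry [s, i, j] is the percentage of the (s + 1)-step-ahead forecast error
    variance of variable i that is attributed to shock j.
    """
    n = A.shape[0]
    theta = np.linalg.cholesky(Sigma)   # horizon-0 orthogonalized responses
    cumulative = np.zeros((n, n))
    shares = []
    for _ in range(steps):
        cumulative += theta ** 2
        shares.append(100.0 * cumulative / cumulative.sum(axis=1, keepdims=True))
        theta = A @ theta               # responses at the next horizon
    return np.array(shares)


# Assumed variable order: expected headline, core, food and energy gap, short rate.
# All numbers below are hypothetical.
A = np.array([[0.6, 0.1, 0.1, 0.0],
              [0.2, 0.5, 0.0, 0.0],
              [0.1, 0.0, 0.3, 0.0],
              [0.1, 0.1, 0.1, 0.7]])
L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.3, 0.8, 0.0, 0.0],
              [0.5, 0.1, 1.2, 0.0],
              [0.2, 0.1, 0.1, 0.6]])
Sigma = L @ L.T

table = fevd(A, Sigma, steps=8)
for step in (1, 2, 3, 4, 6, 8):
    print(step, np.round(table[step - 1, 0], 3))   # decomposition of the first variable
```

With the first variable ordered first, its one-step-ahead variance is attributed entirely to its own shock, which is why the first row of each panel in a table of this type reads 100, 0, 0, 0.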

3. CONCLUDING OBSERVATIONS

This article investigates empirically short-term dynamics between headline
and core measures of CPI and PCE inflation over three sample periods: 1959:1–
1979:1, 1979:2–2001:2, and 1985:1–2007:2. Headline and core inflation
12 In the recent sample period, 1985:1–2007:2, a surprise increase in food and energy prices
does feed into core inflation.

Table 5 Variance Decomposition of Expected Headline CPI Inflation

Panel A: 1959:1–1979:1
Ordering: π^eH, π^C, π^H − π^C, SR
Steps    π^eH       π^C       π^H − π^C    SR
1        100.000    0.000     0.000        0.000
2        83.939     0.979     11.621       3.461
3        56.768     3.739     32.307       7.186
4        49.345     4.427     35.859       10.370
6        45.326     7.700     36.358       10.616
8        44.895     10.508    35.244       9.353

Panel B: 1979:2–2001:2
Ordering: π^eH, π^C, π^H − π^C, SR
Steps    π^eH       π^C       π^H − π^C    SR
1        100.000    0.000     0.000        0.000
2        75.964     12.141    7.771        4.125
3        62.899     22.970    10.382       3.749
4        55.940     29.473    10.809       3.777
6        49.066     35.928    10.363       4.644
8        45.673     39.126    9.858        5.343

Panel C: 1985:1–2007:2
Ordering: π^eH, π^C, π^H − π^C, SR
Steps    π^eH       π^C       π^H − π^C    SR
1        100.000    0.000     0.000        0.000
2        66.457     27.758    0.081        5.704
3        55.816     35.223    2.653        6.309
4        50.830     39.187    3.676        6.307
6        49.452     41.004    3.676        5.867
8        49.412     41.533    3.659        5.394

Notes: Entries are in percentage terms, with the exception of those under the column
labeled “Steps.” Those entries refer to n-step-ahead forecasts for which decomposition
is done. π^eH is expected headline inflation, as measured by the Livingston Survey. See
notes from Tables 1 and 3 for the definitions of the other variables.

measures are co-integrated, suggesting long-run co-movement. However, the
ways in which these two variables adjust to each other in the short run and
generate co-movement have changed across these sample periods. In the pre-1979 sample period, when a positive gap opens up with headline inflation
rising above core inflation, the gap is eliminated mainly as a result of headline
inflation not reverting and core inflation moving toward headline inflation.
These dynamics suggest headline inflation would be better than core inflation in assessing the permanent component of inflation. In post-1979 sample
periods, however, the positive gap is eliminated as a result of headline inflation reverting more strongly toward core inflation than core inflation moving
toward headline inflation, suggesting core inflation would be better than headline inflation in assessing the permanent component of inflation. Although
short-term headline-core inflation dynamics are investigated using biannual
data, the basic result on the change in inflation dynamics is robust to the use of
quarterly data and to the inclusion of additional economic determinants of inflation in
the bivariable headline-core inflation regressions. The results are also not
sensitive to the precise breakup of the sample in 1979 and 1984.
Recent research suggests a monetary policy explanation of change in inflation dynamics. We focus on a version suggested in Leduc, Sill, and Stark
(2007) that attributes the decline in the persistence of actual headline inflation
to a change in the accommodative stance of monetary policy in 1979. We
illustrate that such a change in monetary policy response to exogenous shocks
to the public’s expectations of headline inflation can generate the change in
headline-core inflation dynamics documented above. Before 1979, the Federal Reserve accommodated shocks to expected headline inflation: A surprise
increase in expected headline inflation is not reversed, leading to a persistent
increase in actual headline inflation and co-movement arising as a result of
core inflation moving toward headline inflation. Since 1979 that has not been
the case: A surprise increase in expected headline inflation is reversed and
co-movement arises mainly as a result of headline inflation moving toward
core inflation.
Since food and energy prices are likely a significant determinant of expected headline inflation, the results imply that the change in headline-core
inflation dynamics may simply be due to the Federal Reserve no longer accommodating food and energy inflation. In the most recent sample period,
a surprise increase in food and energy inflation has no significant effect on
the public’s expectations of headline inflation. This result suggests that past
Federal Reserve behavior has convinced the public that it would no longer
accommodate food and energy inflation.
In previous research, analysts have often found that the empirical evidence
indicating that core inflation is better than headline inflation at gauging the
trend component of inflation is not robust across sample periods. The empirical work in this article explains this lack of robustness; namely, headline-core
inflation dynamics changed with a change in the conduct of monetary policy
in 1979. Hence, in sample periods beginning in the 1960s and ending in the
1980s or 1990s, the hypothesis that the trend component of inflation is best
gauged by focusing only on core inflation may or may not be found consistent
with the data.

REFERENCES
Bernanke, Ben S. 2007. “Inflation Expectations and Inflation Forecasting.”
Remarks at the Monetary Economics Workshop of the NBER Summer
Institute, Cambridge, Massachusetts.
Blinder, Alan S., and Ricardo Reis. 2005. “Understanding the Greenspan
Standard.” Proceedings of Jackson Hole Symposium, Federal Reserve
Bank of Kansas City.
Clark, Todd E. 2001. “Comparing Measures of Core Inflation.” Federal
Reserve Bank of Kansas City Economic Review: 5–31.
Cogley, Timothy. 2002. “A Simple Adaptive Measure of Core Inflation.”
Journal of Money, Credit and Banking 34 (February): 94–113.
Crone, Theodore M., N. Neil K. Khettry, Loretta J. Mester, and Jason Novak.
2008. “Core Measures as Predictors of Total Inflation.” Federal Reserve
Bank of Philadelphia Working Paper 08-9.
Dickey, D. A., and W. A. Fuller. 1979. “Distribution of the Estimator for
Autoregressive Time Series with a Unit Root.” Journal of the American
Statistical Association 74: 427–31.
Engle, Robert F., and C. W. J. Granger. 1987. “Co-Integration and
Error-Correction: Representation, Estimation and Testing.”
Econometrica (March): 251–76.
Granger, C. W. J. 1986. “Development in the Study of Cointegrated
Economic Variables.” Oxford Bulletin of Economics and Statistics 48:
213–28.
Hamilton, J. D. 1994. Time Series Analysis. Princeton, N.J.: Princeton
University Press.
Johansen, S. 1988. “Statistical Analysis of Cointegrating Vectors.” Journal
of Economic Dynamics and Control 12: 231–54.
Kiley, Michael T. 2008. “Estimating the Common Trend Rate of Inflation for
Consumer Prices and Consumer Prices Excluding Food and Energy
Prices.” Finance and Economics Discussion Series, Division of
Research and Statistics and Monetary Affairs, Federal Reserve Board,
Washington, D.C.
Leduc, Sylvain, Keith Sill, and Tom Stark. 2007. “Self-Fulfilling
Expectations and the Inflation of the 1970s: Evidence from the
Livingston Survey.” Journal of Monetary Economics: 433–59.
Lubik, Thomas, and Paolo Surico. 2006. “The Lucas-Critique and the
Stability of Empirical Models.” Federal Reserve Bank of Richmond
Working Paper 6.
Mehra, Yash P., and Brian D. Minton. 2007. “A Taylor Rule and the
Greenspan Era.” Federal Reserve Bank of Richmond Economic
Quarterly 93 (Summer): 229–50.
Mehra, Yash P., and Christopher Herrington. 2008. “On Sources of
Movements in Inflation Expectations: A Few Insights from a VAR
Model.” Federal Reserve Bank of Richmond Economic Quarterly 94
(Spring): 121–46.
Mishkin, Frederick S. 2007a. “Inflation Dynamics.” Working Paper 13147.
Cambridge, Mass.: National Bureau of Economic Research (June).
Mishkin, Frederick S. 2007b. “Headline Versus Core Inflation in the Conduct
of Monetary Policy.” Speech at the Business Cycles, International
Transmission and Macroeconomic Policies Conference, Montreal,
Canada, October 20.
Phillips, Peter, and Bruce E. Hansen. 1990. “Statistical Inference in
Instrumental Variables Regression with I(1) Processes.” Review of
Economic Studies 57 (January): 99–125.
Rich, Robert, and Charles Steindel. 2005. “A Review of Core Inflation and
an Evaluation of its Measures.” Federal Reserve Bank of New York Staff
Report 236.
Webb, Roy. 1995. “Forecasts of Inflation from VAR Models.” Journal of
Forecasting (May): 267–86.

Economic Quarterly—Volume 95, Number 3—Summer 2009—Pages 315–334

Why Could Political Incentives Be Different During Election Times?
Leonardo Martinez

The literature on political cycles argues that the proximity of the next
election date affects policy choices (Alesina, Roubini, and Cohen
[1997]; Drazen [2000]; and Shi and Svensson [2003] present reviews
of this literature).1 Evidence of such cycles is stronger for economies that
are less developed, have younger democracies, have less government transparency, have less media freedom, have a larger share of uninformed voters in
the electorate, and have a higher re-election value. Brender and Drazen (2005)
find evidence of a political deficit cycle in a large cross-section of countries
but show that this finding is driven by the experience of “new democracies.”
The budget cycle disappears when the new democracies are removed from
their sample. Similarly, using a large panel data set, Shi and Svensson (2006)
find that, on average, governments’ fiscal deficits increase by almost 1 percent
of gross domestic product in election years, and that these political budget
cycles are significantly larger and statistically more robust in developing than
in developed countries. Using suitable proxies, they also find that the size of
the electoral budget cycles increases with the size of politicians’ rents from
remaining in power, and with the share of uninformed voters in the electorate.
Akhmedov and Zhuravskaya (2004) use a regional monthly panel from Russia
and find a sizable and short-lived political budget cycle (public spending is
For helpful comments, the author would like to thank Juan Carlos Hatchondo, Pierre Sarte,
Anne Stilwell, and John Weinberg. The views expressed in this article do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System.
E-mail: leonardo.martinez@rich.frb.org.
1 Related work studies how political turnover causes movements in the real economy. Partisan
cycles are studied, for example, by Alesina (1987), Azzimonti Renzo (2005), Cuadra and Sapriza
(2006), and Hatchondo, Martinez, and Sapriza (forthcoming). Hess and Orphanides (1995, 2001)
and Besley and Case (1995) study how the presence of term limits introduces electoral cycles
between terms (while I focus on cycles within terms).

shifted toward direct monetary transfers to voters). They also find that the
magnitude of the cycle decreases over time and with democracy, government
transparency, media freedom, and voter awareness. They argue that the short
length of the cycle explains underestimation of its size by studies that use
lower frequency data.
Why would policymakers prefer to influence economic conditions at the
end of their term rather than at the beginning of their term? This article
discusses some answers to this question provided by the theoretical literature
on political cycles.
More generally, this article discusses agency relationships in which an important part of the compensation is decided upon infrequently. For instance,
my framework could be used to discuss incentives when a contract commits
the employer to working with a certain employee for a number of periods, but
allows the employer to replace this employee after the contract ends. Consider, for example, a professional athlete who signs a multi-year contract with
a team, which is free to terminate its relationship with this athlete (or not)
after the contract ends. Do athletes have stronger incentives to improve their
performance just before their contract expires? Wilczynski (2004) and Stiroh
(2007) present empirical evidence of a renegotiation cycle: performance improves in the year before the signing of a multi-year contract, but declines after
the contract is signed. Renegotiation cycles resemble the cycles discussed in
the political-economy literature. Even though my analysis applies to other
employment relationships, for concreteness, this article refers to voters and
policymakers.
I study political cycles in a standard three-period political-agency model of
career concerns. An incumbent policymaker who starts his political career in
period one with an average reputation can exert effort in periods one and two to
increase his re-election probability. Each period, the incumbent’s performance
depends on his ability, his effort level, and luck. Voters do not observe the
incumbent’s ability, effort, and luck; instead, they observe his performance.
Good current performance by the incumbent may signal that he is capable of
good performance in the future. Voters re-elect the incumbent only if they
expect that his performance will be good in the future. Since the incumbent
wants to be re-elected, he may exert effort to improve his current performance.2
2 By assuming that the policymaker can influence the beliefs about his future performance,

the literature on political cycles does not imply that he can fine-tune the aggregate economic effects
of economic policy. One may think that the policymaker is evaluated on the quality of services he
provides. For instance, Brender (2003) finds that “the incremental student success rate during the
mayor’s term had a significant positive effect on his reelection chances.” The quality of education
depends on economic policy (for example, it depends on the resources the policymaker makes
available for education). Thus, the policymaker may decide to make more resources available
for education (instead of keeping resources for his favorite interest group or himself) in order to
increase his re-election probability.

L. Martinez: Political Incentives and Elections

Earlier theoretical studies of political cycles succeeded in showing that
in environments with asymmetric information about the incumbent’s unobservable and stochastically evolving ability (such as the one studied in this article),
cycles can arise with forward-looking and rational voters. These studies show
that political cycles may arise because the incumbent’s end-of-term performance may be more informative about the quality of his future (post-election)
performance than his beginning-of-term performance. Therefore, the incumbent’s end-of-term actions (that influence his end-of-term performance) may
be more effective in influencing the election result than his beginning-of-term
actions (that influence his beginning-of-term performance). Consequently,
the incumbent may have stronger incentives to improve his performance at
the end of his term. For expositional simplicity, these studies model this intuition in its most extreme form. That is, they assume that only the end-of-term
incumbent’s action is effective in changing the election result (see, for example, Rogoff [1990], Shi and Svensson [2006], and the references therein).
Thus, re-election concerns play a role only at the end of a term, and, therefore,
political cycles arise.
These earlier studies make three assumptions that imply that the incumbent only affects his re-election probability by influencing his end-of-term
performance. The first assumption is that at the time of the election, only the
end-of-term ability is not observable. If beginning-of-term ability is observable, the incumbent cannot influence voters’ beliefs with his beginning-of-term
actions and, therefore, cycles arise.
The second assumption is that only end-of-term ability is correlated with
post-election ability. Consequently, only voters’ inference about end-of-term
ability directly influences their re-election decision.
The third assumption is that output is a perfect signal of ability. This
implies that voters can learn the incumbent’s end-of-term ability (which is
correlated with his post-election ability) perfectly from his end-of-term performance, without considering his beginning-of-term performance. Therefore, beginning-of-term actions are not effective in changing the re-election
probability.
The three assumptions described above imply strong asymmetries across
periods. Political cycles in these earlier studies are a direct result of these
asymmetries.
In Martinez (2009b), I explain why political cycles may arise even if
the incumbent’s end-of-term performance is not more informative about the
quality of his future performance, and, consequently, the incumbent’s end-of-term actions are not more effective in influencing the election result. In
the model, the incumbent’s equilibrium effort choice depends on both the
proximity of the next election and his reputation (which I refer to as the
beliefs about his ability). Recall that we want to study how the proximity of
elections affects policy choices. Consequently, with political cycles I refer
to differences in the incumbent’s choices within a term in office for a given
reputation level. For a given reputation level, why would the incumbent exert
more effort closer to the election? If the incumbent’s reputation does not
change between periods one and two, why would the incumbent exert more
effort in period two than in period one?
The key insight to the answer to these questions comes from the characterization of the incumbent’s effort-smoothing decision, which is such that
he makes the marginal cost of exerting effort in period one (roughly) equal
to the expected marginal cost of exerting effort in period two. This decision
presents the typical intertemporal tradeoff in dynamic models: Having less
utility in period one allows the incumbent to have more utility in period two.
In this case, a lower expected effort level in period two compensates for a
higher effort level in period one. In period one, the incumbent (whose reputation is average) knows that his reputation is likely to change and anticipates
that this change will lead him to choose an effort level lower than the one
he would choose in period two if his reputation remains average—extreme
reputations imply low efforts. Consequently, the expected marginal cost of
exerting effort in period two is lower than the marginal cost of the equilibrium
period-two effort level for an average reputation (the marginal cost is an increasing function). Thus, the incumbent’s effort-smoothing decision implies
that the marginal cost of the equilibrium period-one effort level—which is
equal to the expected marginal cost of exerting effort in period two—is lower
than the marginal cost of the equilibrium period-two effort level for the same
(average) reputation. Therefore, for the same reputation, the period-one equilibrium effort level is lower than that of period two. That is, incentives to
influence the re-election probability are stronger closer to the election.
In another context, consider a professional athlete who has an average
reputation at the beginning of a multi-year contract with a team and may want
to exert effort in order to improve his reputation and obtain a good contract
after his current contract ends. The discussion above indicates that the optimal
strategy for the athlete is to wait until the end of his current contract to see
whether it is worth exerting a high effort level. At the beginning of his current
contract, he should choose an intermediate effort level. At the end of his
contract, if his reputation remains average, he should choose a higher effort
level. If his reputation became either very good or very bad (because his
performance was very good or very bad), he should choose a lower effort
level. Thus, for the same reputation level, the athlete exerts more effort at the
end of his contract and there is a “renegotiation cycle.”
This article first characterizes a model with the three simplifying assumptions adopted in earlier studies. Then, each of the three assumptions described
above is relaxed, and yet the model still generates cycles without assuming
strong asymmetries across periods because of the effort-smoothing considerations I first described in Martinez (2009b).

L. Martinez: Political Incentives and Elections

The rest of this article is structured as follows. Section 1 presents the
main elements of a standard model of political cycles. Section 2 characterizes a benchmark with the three simplifying assumptions adopted in earlier
studies. These assumptions are relaxed in Sections 3, 4, and 5. Section 3
assumes that beginning-of-term ability is not observable. It is shown that
this does not change the incumbent’s equilibrium decisions but it makes the
optimal period-two effort level a function of the period-one effort level. In
Section 4, I assume positive correlation between beginning-of-term ability
and post-election ability. I show that the incumbent still chooses to exert zero
effort at the beginning of the term, but his end-of-term equilibrium effort level
depends on his period-one ability. In Section 5, it is assumed that observing performance in one period is not sufficient to fully learn ability, and it is
explained how the incumbent’s optimal effort-smoothing decision generates
cycles. Section 6 concludes.

1. THE MODEL
This article presents a three-period political-agency model of career concerns.
In period one, there is a new policymaker in office. At the beginning of period
three, elections are held: Voters decide whether to re-elect the incumbent
policymaker or replace him with a policymaker who was not previously in
office.
The amount of public good produced by the incumbent policymaker in
period t, yt , is a stochastic function of his ability, ηt , and his effort level, at .
In particular,
\[ y_t = a_t + \eta_t + \varepsilon_t, \tag{1} \]

where εt is a random variable.
Each period, the policymaker in office can exert effort to increase the
amount of public good he produces. Voters do not observe the effort level
(which is, of course, known by the incumbent policymaker).
The incumbent and voters do not know the incumbent’s ability. The common belief about the ability of a new incumbent is given by the distribution
of abilities in the economy.
The timing of events within each period is as follows. First, the incumbent
decides on his effort level, after which ηt and εt are realized, and yt is observed.
Voters’ per-period utility is given by yt . In period three, they decide on
re-election in order to maximize the expected value of y3 .
A policymaker’s per-period utility is normalized to zero if he is not in
office. He receives R > 0 in each period during which he is in charge of the
production of the public good. The cost of exerting effort is given by $c(a)$,
with $c'(a) \geq 0$, $c''(a) > 0$, and $c'(0) = 0$. Let $\delta \in (0, 1)$ denote the voters’
and the incumbent’s discount factor. I use backward induction to solve for the
subgame perfect equilibrium of this game.

2. A BENCHMARK
This section provides a benchmark following earlier studies of political cycles
by assuming that only the ability in the last period before the election is not
observable at the time of the election, that ability follows a first-order moving
average process, and that output is a perfect signal of ability (see, for example,
Rogoff [1990], Shi and Svensson [2006], and the references therein).
The first period a policymaker is in office, his ability is given by $\eta_t = \gamma_t$,
and in every other period, $\eta_t = \gamma_t + \gamma_{t-1}$, where $\gamma_t$ is an i.i.d. random variable
with mean $m_1$, differentiable distribution function $\Phi$, and density function $\phi$.
When voters decide on re-election, $\gamma_1$ is known and $\gamma_2$ is not known. The
production function is deterministic: $\varepsilon_t = 0$ for all $t$.
Observing output yt allows voters and the incumbent to compute the values
of ηt and γ t using their knowledge of the effort exerted by the incumbent and
the production function. Let $\eta_{vt}$ and $\eta_{it}$ denote the ability computed by voters
and by the incumbent, respectively. Let $\gamma_{vt}$ and $\gamma_{it}$ denote the value of $\gamma_t$
computed by voters and the incumbent, respectively. The incumbent knows
the effort level he chooses and, therefore, he can always compute $\eta_t = y_t - a_t$
correctly (i.e., $\eta_{it} = \eta_t$). Using $\eta_1$, he can compute the value of $\gamma_2$:
\[ \gamma_{i2} = y_2 - \eta_1 - a_2 = \gamma_2. \]
Voters compute η2 and γ 2 using equilibrium effort levels. They are rational
and understand the game. In particular, they know the incumbent’s equilibrium
strategy. At the time the incumbent decides his period-two effort level, he
knows a1 and y1 . Recall that the latter is a function of a1 and, therefore,
we can summarize the information available to the incumbent by the effort
component, $a_1$, and the stochastic component, $\eta_1 = y_1 - a_1$, of $y_1$. For any
value of $\eta_1$ and $a_1$, let $\alpha_2(\eta_1, a_1)$ denote the incumbent’s equilibrium period-two effort level. Let $a_1^*$ denote the incumbent’s equilibrium period-one effort
level. Voters compute
\[ \gamma_{v2} = y_2 - \eta_1 - \alpha_2(\eta_1, a_1^*) = \gamma_2 + a_2 - \alpha_2(\eta_1, a_1^*). \tag{2} \]

In period three, there is no future re-election probability that could be
influenced by the incumbent. Therefore, any policymaker would exert zero
effort. Consequently, when forward-looking voters decide on re-election, they
compare the incumbent’s period-three expected ability with the period-three
expected ability of a policymaker who was not previously in office. The
incumbent’s period-three expected ability computed by voters is equal to γ v2 .
The expected period-three ability of a policymaker who was not in office before
is m1 . Consequently, voters re-elect the incumbent if and only if γ v2 > m1 .

That is, the incumbent is re-elected if and only if $\gamma_2 + a_2 - \alpha_2(\eta_1, a_1^*) > m_1$,
or equivalently $\gamma_2 > m_1 + \alpha_2(\eta_1, a_1^*) - a_2$. Thus, exerting effort in period
two decreases the minimum realization of γ 2 that would allow the incumbent
to be re-elected and, therefore, it increases the re-election probability.
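
The re-election rule above can be illustrated with a small numerical sketch. It assumes, purely for illustration, that $\gamma_t$ is normally distributed (the text only requires a differentiable distribution function), and the parameter values and the names `normal_cdf`, `reelection_probability`, and `alpha_voters` are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal random variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def reelection_probability(a2, alpha_voters, m1, sd):
    """1 - Phi(m1 + alpha_voters - a2): chance that gamma_2 clears the voters' cutoff."""
    cutoff = m1 + alpha_voters - a2
    return 1.0 - normal_cdf(cutoff, m1, sd)

# Hypothetical values: mean and standard deviation of gamma_t, and the
# period-two effort level that voters expect (an assumption for this sketch).
m1, sd = 1.0, 1.0
alpha_voters = 0.2

p_expected = reelection_probability(0.2, alpha_voters, m1, sd)   # effort as voters expect
p_deviation = reelection_probability(0.5, alpha_voters, m1, sd)  # effort above expectations
```

Under these assumptions, exerting the effort voters expect leaves the cutoff at $m_1$, while exerting more effort lowers the cutoff and raises the re-election probability.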
The incumbent’s period-two maximization problem reads

\[ \max_{a_2 \geq 0} \left\{ \delta R \left[ 1 - \Phi\left( m_1 + \alpha_2(\eta_1, a_1^*) - a_2 \right) \right] - c(a_2) \right\}, \tag{3} \]
where $1 - \Phi\left( m_1 + \alpha_2(\eta_1, a_1^*) - a_2 \right)$ is the probability of re-election. Note
that the incumbent can compute equilibrium effort levels as voters do (all information available to voters is also available to the incumbent) and, therefore,
he can compute $\alpha_2(\eta_1, a_1^*)$.
In this article, I characterize the incumbent’s equilibrium effort levels
through the first-order conditions of his maximization problems.3 Note that
finding the equilibrium effort level requires solving a fixed-point problem. The effort
level that maximizes the incumbent’s expected utility in (3) depends on the
effort level voters use to compute the signal, $\alpha_2(\eta_1, a_1^*)$. In equilibrium, the
incumbent’s effort level must be equal to the effort level voters use to compute
the signal.
The optimal period-two effort level satisfies
\[ c'\left( \alpha_2(\eta_1, a_1) \right) = \delta R\, \phi\left( m_1 + \alpha_2(\eta_1, a_1^*) - \alpha_2(\eta_1, a_1) \right). \tag{4} \]
Let $a_2^*$ denote the period-two equilibrium effort level. In equilibrium, $a_1 = a_1^*$
and, therefore, $a_2^*$ satisfies
\[ c'(a_2^*) = \delta R\, \phi(m_1) > 0. \tag{5} \]

Equation (5) shows that the equilibrium effort level is such that the marginal
cost of exerting effort is equal to the marginal benefit of exerting effort. The
incumbent benefits from exerting effort because this increases the re-election
probability. The marginal benefit of exerting effort is given by the change
in the probability of re-election multiplied by R (the value of winning the
election) and the discount factor, δ.
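
The fixed-point logic behind equations (4) and (5) can be sketched numerically. The sketch below assumes a quadratic effort cost $c(a) = k a^2 / 2$ (so that $c'(a) = k a$) and a normal distribution for $\gamma_t$; neither assumption is made in the text, and the parameter values and function names are hypothetical. With $k$ large relative to $\delta R$, the first-order condition is monotone in effort, in the spirit of the convexity requirement mentioned in footnote 3.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal random variable."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def best_response(alpha_voters, m1, sd, delta, R, k):
    """Effort solving k*a = delta*R*phi(m1 + alpha_voters - a), found by bisection."""
    def foc(a):
        return k * a - delta * R * normal_pdf(m1 + alpha_voters - a, m1, sd)
    lo = 0.0                                      # foc(0) < 0
    hi = delta * R * normal_pdf(m1, m1, sd) / k   # foc(hi) >= 0: the density peaks at its mean
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if foc(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameter values.
m1, sd, delta, R, k = 1.0, 1.0, 0.9, 1.0, 2.0

# Iterate on the effort level voters use to compute the signal until the
# incumbent's best response coincides with it (the fixed point).
alpha = 0.0
for _ in range(50):
    alpha = best_response(alpha, m1, sd, delta, R, k)

# Equation (5) under the quadratic cost: a2* = delta*R*phi(m1)/k.
closed_form = delta * R * normal_pdf(m1, m1, sd) / k
```

Under these assumptions the iteration settles on the effort level implied by equation (5), where the density is evaluated at $m_1$ because voters' expectations are correct in equilibrium.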
It should be mentioned that, in models of career concerns, equilibrium
effort levels are typically inefficient (for a more thorough discussion of this
issue, see Foerster and Martinez [2006]). The efficient effort level is the one
a benevolent social planner would force the incumbent to exert (if he could
observe the effort exerted by the incumbent). This effort level can be defined
as the one at which the social marginal cost of exerting effort (the incumbent’s
marginal cost) equals the social marginal benefit of exerting effort (the increase
in output implied by an extra unit of effort, which according to the production
function in equation (1) is equal to one). Since the incumbent’s marginal benefit
of exerting effort represented in the right-hand side of equation (5) is typically
different from the marginal productivity of effort, the equilibrium effort level
is typically inefficient. Furthermore, since the social marginal benefit and
marginal cost of exerting effort are the same every period, political cycles
(differences in effort levels within a term) imply inefficiencies.

3 As in previous models of political agency, assumptions are necessary to guarantee the concavity of these problems in which the re-election probability may not be a concave function of
the incumbent’s decision. For example, the first term in the objective function in (3) may not be
globally concave. In order to assure global concavity of the incumbent’s problems, it is sufficient
to assume enough convexity in the effort cost function.
Note that $a_2^*$ does not depend on $\eta_1$ or $a_1$. Equation (4) shows that, since
the period-two equilibrium effort level does not depend on $\eta_1$ or $a_1$, off the
equilibrium path (i.e., when $a_1 \neq a_1^*$) the optimal period-two effort level does
not depend on $\eta_1$ or $a_1$ (for a more thorough discussion of how the history
of the game affects the agent’s strategy in models of career concerns, see
Martinez [2009a]). Furthermore, since $c'(a_2^*) > 0$, $a_2^* > 0$.
In period one, the incumbent anticipates equilibrium play in the subsequent periods. In particular, the incumbent anticipates that the probability of
re-election is given by $1 - \Phi(m_1)$ and does not depend on his period-one
effort level. Consequently, the period-one equilibrium effort level is given by
$a_1^* = 0 < a_2^*$.
Thus, I have shown that, under the standard assumptions in earlier studies
of political cycles, the incumbent can affect his re-election probability only
with the last effort level prior to the election and, therefore, cycles appear
(the incumbent only chooses a positive effort level in period two). In the next
sections, I shall discuss the consequences of relaxing these assumptions.

3. SYMMETRIC OBSERVABILITY

In Section 2, the incumbent’s period-one ability, η1 , was observable and,
therefore, there was nothing the incumbent could do in period one to influence
voters’ beliefs about his post-election ability and the re-election probability. In
this section, I assume that η1 is not observable. I will show that this complicates
the analysis, but that not exerting effort in period one is still optimal for the
incumbent. The period-two equilibrium effort level is also identical to the one
found in Section 2. The assumption on the observability of η1 only affects the
incumbent’s off-equilibrium period-two optimal effort choices.
Let
\[ \eta_{v1} = y_1 - a_1^* = \eta_1 + a_1 - a_1^* \tag{6} \]
denote the period-one ability computed by voters using the equilibrium effort
level. Using $\eta_{v1}$ and the equilibrium effort strategies, voters compute
\[ \gamma_{v2} = y_2 - \eta_{v1} - \alpha_2(\eta_{v1}, a_1^*) = \gamma_2 + a_2 - a_1 + a_1^* - \alpha_2(\eta_{v1}, a_1^*). \tag{7} \]
As in Section 2, the incumbent is re-elected if and only if $\gamma_{v2} > m_1$. He can
compute $a_1^*$ and $\eta_{v1}$ as voters do and, therefore, he can compute $\alpha_2(\eta_{v1}, a_1^*)$.

Thus, the incumbent’s period-two maximization problem reads
\[ \max_{a_2 \geq 0} \; \delta R \left[ 1 - \Phi\left( m_1 + a_1 - a_1^* + \alpha_2(\eta_{v1}, a_1^*) - a_2 \right) \right] - c(a_2). \tag{8} \]
The solution of problem (8), $\alpha_2(\eta_1, a_1)$, satisfies
\[ c'\left( \alpha_2(\eta_1, a_1) \right) = \delta R\, \phi\left( m_1 + a_1 - a_1^* + \alpha_2(\eta_{v1}, a_1^*) - \alpha_2(\eta_1, a_1) \right). \tag{9} \]
In equilibrium, $a_1 = a_1^*$ and, therefore, $\alpha_2(\eta_1, a_1) = \alpha_2(\eta_{v1}, a_1^*)$ (see
equation 6). Consequently, the period-two equilibrium effort level is the same
as in Section 2 (i.e., it is given by $c'(a_2^*) = \delta R\, \phi(m_1) > 0$).
Note that, as in Section 2, the equilibrium period-two effort level does not
depend on $\eta_1$ and $a_1$. However, if $\eta_1$ is not observable, off equilibrium the
optimal period-two effort level depends on $a_1$. Let $\hat{\alpha}_2(a_1)$ denote this optimal
effort level, which satisfies
\[ c'\left( \hat{\alpha}_2(a_1) \right) = \delta R\, \phi\left( m_1 + a_1 - a_1^* + a_2^* - \hat{\alpha}_2(a_1) \right). \]

At the beginning of period two, the incumbent’s expected utility is given
by
\[ W_2(a_1) = R - c\left( \hat{\alpha}_2(a_1) \right) + \delta R \left[ 1 - \Phi\left( m_1 + a_1 - a_1^* + a_2^* - \hat{\alpha}_2(a_1) \right) \right]. \tag{10} \]
The period-one incumbent’s maximization problem is given by
\[ \max_{a_1 \geq 0} \left\{ \delta W_2(a_1) - c(a_1) \right\}. \]

Recall that, since the incumbent’s period-one ability, η1 , is not observable,
the period-one ability computed by voters, ηv1 , is increasing with respect to a1 .
Thus, in period one, the incumbent could choose a higher effort level in order
to make voters believe that he has more ability. However, the incumbent’s
continuation utility is lower when voters believe that his period-one ability is
higher. There are two reasons for this.
First, under the assumptions in this section (and in earlier studies of political cycles), only period-two ability is correlated with period-three ability and,
therefore, only period-two ability directly influences the re-election decision.
Consequently, the incumbent would only want to influence voters’ period-one
inference in order to influence their period-two inference.
Second, for any period-two output observation, y2 , voters’ inference about
the period-two ability, γ v2 , is decreasing with respect to ηv1 (see equation 7).
If ηv1 is higher, voters believe that y2 is the result of a higher period-one ability
and a lower period-two ability.
Since the incumbent’s continuation utility is lower when voters believe
that his period-one ability is higher, W2 (a1 ) is decreasing with respect to a1
(recall that equation 6 shows that ηv1 is increasing with respect to a1 ). That
is, the incumbent does not have incentives to exert effort in period one. If he
exerted effort, he would both suffer the cost of exerting effort and decrease
his continuation utility. Therefore, the period-one equilibrium effort level is
given by $a_1^* = 0 < a_2^*$. Thus, equilibrium effort levels are identical to those
found in Section 2, and the assumption on the observability of $\eta_1$ only affects
the incumbent’s off-equilibrium period-two optimal effort choices.

4. A RANDOM WALK PROCESS FOR ABILITY
In the previous section, I showed that when the incumbent’s period-one ability
is not correlated with his post-election ability (and, therefore, his period-one
effort cannot directly influence the re-election probability), the incumbent
does not want to exert effort in period one. This section studies the effects of
allowing for correlation between the period-one ability and the post-election
ability.
Following Holmström’s (1999) seminal paper on career concerns, I assume that $\eta_{t+1} = \eta_t + \xi_t$, where $\xi_t$ is normally distributed with mean 0 and
precision $h_\xi$ (the variance is $1/h_\xi$), and it is unobservable. The common belief
about the ability of a new incumbent is given by the distribution of abilities
in the economy, which is normally distributed with mean m1 and precision hη
(these are the beliefs about the period-one incumbent’s ability). Thus, results
presented in this section are a special case of the results presented in Martinez
(2009b). Let $\phi(v; x, z)$ denote the density function for a normally distributed
random variable $V$ with mean $x$ and precision $z$, and let $\Phi(v; x, z)$ denote the
corresponding cumulative distribution function.
As in previous sections, the incumbent is re-elected if and only if his
expected period-three ability is higher than the expected period-three ability
of a policymaker who was not previously in office. That is, the incumbent is
re-elected if and only if $\eta_{v2} = \eta_2 + a_2 - \alpha_2(\eta_{v1}, a_1^*) > m_1$ (i.e., the incumbent
is re-elected if and only if $\eta_2 > m_1 + \alpha_2(\eta_{v1}, a_1^*) - a_2$). Thus, the incumbent’s
period-two maximization problem reads
\[ \max_{a_2 \geq 0} \; \delta R \left[ 1 - \Phi\left( m_1 + \alpha_2(\eta_{v1}, a_1^*) - a_2;\, \eta_1, h_\xi \right) \right] - c(a_2). \tag{11} \]

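The solution of (11), $\alpha_2(\eta_1, a_1)$, satisfies
\[ c'\left( \alpha_2(\eta_1, a_1) \right) = \delta R\, \phi\left( m_1 + \alpha_2(\eta_{v1}, a_1^*) - \alpha_2(\eta_1, a_1);\, \eta_1, h_\xi \right). \]
In equilibrium, $a_1 = a_1^*$ and, therefore, $\eta_{v1} = \eta_{i1} = \eta_1$ and $\alpha_2(\eta_{v1}, a_1^*) = \alpha_2(\eta_1, a_1)$. Let $a_2^*(\eta_1) \equiv \alpha_2(\eta_1, a_1^*)$ denote the period-two equilibrium effort
level, which is given by
\[ c'\left( a_2^*(\eta_1) \right) = \delta R\, \phi\left( m_1;\, \eta_1, h_\xi \right). \tag{12} \]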

Note that, in this section, the period-two equilibrium effort level depends
on the period-one ability η1 (recall this was not the case in previous sections). The realization of period-one ability shock affects the distribution of
the period-two ability shock.

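At the beginning of period two, the incumbent’s expected utility is given
by
\[ W_2(\eta_1, a_1) = R - c\left( \alpha_2(\eta_1, a_1) \right) + \delta R \left[ 1 - \Phi\left( m_1 + a_2^*(\eta_1) - \alpha_2(\eta_1, a_1);\, \eta_1, h_\xi \right) \right]. \]

The period-one incumbent’s maximization problem is given by
\[ \max_{a_1 \geq 0} \; \delta \int W_2(\eta_1, a_1)\, \phi(\eta_1; m_1, h_\eta)\, d\eta_1 - c(a_1). \]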

Let $a_2^{*\prime}(\eta_1)$ denote the derivative of the period-two equilibrium effort level
with respect to the period-one ability. The following proposition presents the
incumbent’s effort-smoothing decision (see Appendix A for the proof).

Proposition 1 There exists a unique period-one equilibrium effort level that
satisfies
\[ c'(a_1^*) = \delta \int -a_2^{*\prime}(\eta_1)\, c'\left( a_2^*(\eta_1) \right) \phi(\eta_1; m_1, h_\eta)\, d\eta_1. \tag{13} \]

The Euler equation (13) represents the typical intertemporal tradeoff in
dynamic models: Having less utility in period one allows the incumbent to
have more utility in period two. In this case, a lower expected effort level in
period two compensates for a higher effort level in period one. The incumbent
knows that he could affect the re-election probability by exerting effort in
periods one and two. He could exert more effort in period one and less effort
in period two (or vice versa) and still have the same re-election probability.
Equation (13) shows that the optimal effort-smoothing decision depends
on the cost and the effectiveness of exerting effort in each period. In equation (13), $-a_2^{*\prime}(\eta_1)$ represents the relative effectiveness in changing $\eta_{v2}$ (and,
therefore, the re-election probability) of $a_1$ (compared with $a_2$). The incumbent’s period-one effort level affects $\eta_{v1}$ directly, and it affects $\eta_{v2}$ through
$\eta_{v1}$. His period-two effort level affects $\eta_{v2}$ directly. Thus, the relative effectiveness is the derivative of $\eta_{v2} = y_2 - a_2^*(\eta_{v1})$ with respect to $\eta_{v1}$. For
example, if voters expect a lower period-two effort level from an incumbent
who is perceived to be better, then, by choosing a higher effort level in period
one, and making ηv1 higher, the incumbent would make voters expect a lower
period-two effort level. Consequently, voters would think that the period-two
outcome is the result of a lower period-two effort level and a higher period-two
ability. Thus, the incumbent’s period-one effort would have a positive effect
on the voters’ period-two learning.
This section introduces incentives to exert effort at the beginning of a term.
These incentives were not present in previous sections, where beginning-of-term ability was not correlated with post-election ability. A positive relative
effectiveness implies that period-one effort is effective in changing $\eta_{v2}$ (and,
therefore, the re-election probability). Thus, in period one, the incumbent may
want to exert effort. Recall that, in Section 2, the relative effectiveness is zero
(period-one effort is not effective), and in Section 3 it is negative (with the
moving-average assumption, the incumbent’s expected post-election ability is
decreasing with respect to the beginning-of-term ability inferred by voters). In
this section, the relative effectiveness of period-one effort could be positive.
It could even be higher than one (implying that beginning-of-term effort is
more effective than end-of-term effort in changing the re-election probability).
However, the next proposition shows that, even though the incumbent could
use beginning-of-term effort to increase the re-election probability, under the
assumptions in this section, the incumbent chooses to exert zero effort at the
beginning of the term because the expected relative effectiveness is equal to
zero (see Appendix B for the proof).4
Proposition 2 In period one, the incumbent chooses not to exert effort.
Loosely speaking, proposition 2 shows that the incumbent does not expect
his period-one effort level to be effective in changing the re-election probability
and, therefore, he does not exert effort in period one. There are two reasons for
this. First, on average, the effect of period-one effort on period-two learning
is zero. Second, period-one learning does not have a direct effect on the
re-election probability (i.e., period-one effort may only affect the re-election
probability through its effect on period-two learning). Since there is no noise
in the production process, learning the incumbent’s period-two performance
is enough to perfectly learn his type. Thus, the policymaker’s behavior is
different closer to the election because we assume that his actions can only have
a direct effect on the re-election probability closer to the election. The next
section explains how the model can generate a cycle without this assumption.
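
The symmetry argument behind proposition 2 can be checked numerically. The sketch below is only an illustration under assumptions not made in the text: a quadratic effort cost $c(a) = k a^2 / 2$, hypothetical parameter values, and a simple trapezoid rule for the integral on the right-hand side of equation (13).

```python
import math

def normal_pdf(v, mean, precision):
    """Density of a normal variable parameterized by mean and precision (1/variance)."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(-0.5 * precision * (v - mean) ** 2)

# Hypothetical parameter values.
m1, h_eta, h_xi = 1.0, 2.0, 4.0
delta, R, k = 0.9, 1.0, 2.0

def a2_star(eta1):
    """Period-two equilibrium effort from equation (12) when c'(a) = k*a."""
    return delta * R * normal_pdf(m1, eta1, h_xi) / k

def a2_star_prime(eta1):
    """Derivative of a2_star with respect to the period-one ability eta1."""
    return a2_star(eta1) * h_xi * (m1 - eta1)

def relative_effectiveness(eta1):
    """The term -a2*'(eta1) that appears in equation (13)."""
    return -a2_star_prime(eta1)

def rhs_of_13(n=4000, width=8.0):
    """Trapezoid approximation of the right-hand side of equation (13)."""
    step = 2.0 * width / n
    total = 0.0
    for i in range(n + 1):
        eta1 = m1 + (i - n // 2) * step          # grid symmetric around m1
        weight = 0.5 if i in (0, n) else 1.0
        marginal_cost = k * a2_star(eta1)        # c'(a2*(eta1)) under the quadratic cost
        total += weight * relative_effectiveness(eta1) * marginal_cost * normal_pdf(eta1, m1, h_eta)
    return delta * total * step

expected_marginal_benefit = rhs_of_13()
```

In this sketch the relative effectiveness is positive for abilities above $m_1$ and negative below it, and the two sides cancel in expectation, so the right-hand side of equation (13) is numerically zero and the implied period-one effort is zero.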

5. A STOCHASTIC PRODUCTION FUNCTION
In previous sections, cycles arise because I assume differences across periods
(besides the proximity of the election). In particular, in Section 4, I showed that
assuming that output is a perfect signal of ability generates a strong asymmetry
across periods. In this section I relax this assumption. In particular, as in
Holmström (1999), I assume that $\varepsilon_t$ is a normally distributed random variable
with expected value 0 and precision $h_\varepsilon$. Consequently, I can interpret the
results in Section 4 as the limit of the results presented in this section when
$h_\varepsilon$ goes to infinity. Thus, the model studied in this section is the one-election
version of the model I study in Martinez (2009b).
4 As shown in the proof of proposition 2, the symmetry of the equilibrium effort strategy is
necessary to prove this result. In Martinez (2009b), I show that, in a version of the model with
more than three periods in which the incumbent can be re-elected more than once, even if the
ability distribution is symmetric, the equilibrium effort strategy may not be symmetric.

Since there is noise in production, observing output only allows voters
and the incumbent to compute a “signal” of the incumbent’s ability. This is in
contrast with previous sections, where observing output allows voters and the
incumbent to compute the incumbent’s ability. Define st ≡ ηt +εt . I refer to st
as the period-t signal of the incumbent’s ability. Voters and the incumbent use
the signal they compute to update their beliefs about the incumbent’s ability.
From this point forward, belief refers to belief about the incumbent’s ability
unless stated otherwise.
Beliefs are Gaussian and, therefore, they can be characterized by their
mean and their precision. Depending on the precision of the shock that determines the evolution of the incumbent’s ability, hξ , the precision of beliefs
may be increasing or decreasing with respect to the number of performance
observations (see Holmström 1999).5 For simplicity, I assume that $h_\xi$ is such
that the precision of beliefs is constant. That is, I assume
\[ h_\xi = \frac{h_\eta^2 + h_\eta h_\varepsilon}{h_\varepsilon}. \tag{14} \]
By making an assumption that guarantees that the precision of beliefs is constant, I can keep track of their evolution by following the evolution of their
mean. This simplifies the analysis.
Equation (14) implies that for any $t$, the precision of the period-$(t+1)$
beliefs about the signal $s_{t+1}$ is equal to the precision of the period-$t$ beliefs
about the signal $s_t$. This precision is given by
\[ H \equiv \frac{h_\eta h_\varepsilon}{h_\varepsilon + h_\eta}. \tag{15} \]
Since beliefs about the signal are also Gaussian and have a constant precision,
the evolution of these beliefs can also be summarized by the evolution of their
mean, which is equal to the mean of the beliefs about ability.
As in previous sections, the incumbent is re-elected if and only if his
expected period-three ability is higher than the expected period-three ability
of a policymaker who was not previously in office. Let mvt and mit denote
the mean of the voters’ and the incumbent’s beliefs at the beginning of period
t (from here on, at period t). I refer to a belief with mean m as belief m. The
incumbent is re-elected if and only if mv3 > m1 .
Bayes’ rule implies that the mean of beliefs at $t + 1$ is a weighted sum of
the mean at $t$ and the period-$t$ signal. Equation (14) implies that the weight
of the period-$t$ mean belief in the period-$(t+1)$ mean belief does not depend
on the number of observations of the incumbent’s performance. This weight
is given by
\[ \mu = \frac{h_\eta}{h_\eta + h_\varepsilon}. \tag{16} \]

5 In general, the precision of the period-$(t+1)$ beliefs, $h_{t+1}$, is given by
\[ h_{t+1} = \frac{(h_t + h_\varepsilon)\, h_\xi}{h_t + h_\varepsilon + h_\xi}. \]

Let $s_{vt}$ and $s_{it}$ denote the period-$t$ signal computed by voters and by the
incumbent, respectively. Since the incumbent knows the effort he exerted,
he can compute the true signal, i.e., $s_{it} = y_t - a_t = s_t$. Thus, $m_{it+1} = \mu m_{it} + (1 - \mu) s_{it} = \mu m_{it} + (1 - \mu) s_t$.
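
The constant-precision assumption and the resulting updating weight can be verified with a short numerical sketch. The precision values, the signal value, and the function names below are hypothetical; the formulas are equations (14), (15), and (16) together with the general updating rule stated in footnote 5.

```python
def next_precision(h_t, h_eps, h_xi):
    """General updating rule for the precision of beliefs (the formula in footnote 5)."""
    return (h_t + h_eps) * h_xi / (h_t + h_eps + h_xi)

# Hypothetical precisions of the prior about ability and of the production noise.
h_eta, h_eps = 2.0, 3.0
h_xi = (h_eta ** 2 + h_eta * h_eps) / h_eps    # equation (14)

# Starting from the prior precision h_eta, the precision of beliefs stays put.
precisions = [h_eta]
for _ in range(5):
    precisions.append(next_precision(precisions[-1], h_eps, h_xi))

H = h_eta * h_eps / (h_eps + h_eta)            # equation (15): precision of beliefs about the signal
mu = h_eta / (h_eta + h_eps)                   # equation (16): weight on the previous mean

def update_mean(m_t, s_t):
    """Mean of beliefs at t + 1 as a weighted sum of the mean at t and the period-t signal."""
    return mu * m_t + (1.0 - mu) * s_t

m1 = 1.0
m2 = update_mean(m1, 2.0)   # a signal above the prior mean raises the posterior mean
```

With these values the precision remains at `h_eta` in every period, which is what allows beliefs to be tracked through their mean alone.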
Voters compute the signal using equilibrium effort strategies. In Section
4, I wrote the incumbent’s period-two equilibrium strategy as a function of his
period-one ability and effort level. In this section, at the time of the period-two effort decision, the incumbent does not know his period-one ability, but he
learned the signal $s_1$. Instead of writing his period-two equilibrium strategy as
a function of $a_1$ and $s_1$, for expositional simplicity, I will write the equilibrium
strategy as a function of $a_1$ and $m_2 = \mu m_1 + (1 - \mu) s_1$, that is, as $\alpha_2(m_2, a_1)$. Thus,
the period-two signal computed by voters is given by
\[ s_{v2} \equiv y_2 - \alpha_2(m_{v2}, a_1^*) = s_2 + a_2 - \alpha_2(m_{v2}, a_1^*), \tag{17} \]
where
\[ m_{v2} = \mu m_1 + (1 - \mu) s_{v1} = \mu m_1 + (1 - \mu)\left( s_1 + a_1 - a_1^* \right) = m_2 + (1 - \mu)\left( a_1 - a_1^* \right). \]
Consequently,
\[ m_{v3} = \mu m_{v2} + (1 - \mu) s_{v2} = \mu m_{v2} + (1 - \mu)\left[ s_2 + a_2 - \alpha_2(m_{v2}, a_1^*) \right]. \tag{18} \]

Equation (18) shows how exerting effort helps the incumbent increase the
re-election probability. The expected ability in the voters’ belief is increasing
with respect to effort, and voters re-elect the incumbent if and only if they
expect his ability to be good enough.
Recall that voters and the incumbent have the same period-one belief.
Moreover, in any period in which the incumbent exerts the equilibrium effort
level, voters and the incumbent compute the same signal. Consequently, in
equilibrium, the voters’ and the incumbent’s beliefs coincide (mvt = mit ).
The incumbent is re-elected if and only if $s_2 > \frac{m_1 - \mu m_{v2}}{1 - \mu} + \alpha_2(m_{v2}, a_1^*) - a_2$
(i.e., if and only if $m_{v3} > m_1$). Let $M_{v2}(m_2, a_1) \equiv m_2 + (1 - \mu)(a_1 - a_1^*)$
denote the mean of the voters’ period-two belief when $m_2$ is the mean of the
incumbent’s period-two belief and $a_1$ is the period-one effort level. Thus, the
incumbent’s period-two maximization problem can be written as
\[ \max_{a_2 \geq 0} \; \delta R \left[ 1 - \Phi\left( \frac{m_1 - \mu M_{v2}(m_2, a_1)}{1 - \mu} + \alpha_2\left( M_{v2}(m_2, a_1), a_1^* \right) - a_2;\, m_2, H \right) \right] - c(a_2). \tag{19} \]

The following proposition shows that a unique fixed point that solves for the
period-two equilibrium effort strategy exists (see Martinez [2009b] for the
proof).6
Proposition 3 (Martinez 2009b): Let $m_2$ denote the voters’ and the incumbent’s beliefs at the beginning of period two. The unique period-two equilibrium effort strategy $a_2^*(m_2)$ satisfies
\[ c'\left( a_2^*(m_2) \right) = \delta R\, \phi\left( \frac{m_1 - \mu m_2}{1 - \mu};\, m_2, H \right) > 0. \tag{20} \]
Thus, for any reputation $m_2$, the equilibrium period-two effort level $a_2^*(m_2)$ is
positive.

Let M2 (s1 ) ≡ μm1 + (1 − μ)s1 denote the mean of the incumbent’s
period-two posterior belief when s1 is the signal he uses to update his prior.
The period-one incumbent’s maximization problem is given by
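\[ \max_{a_1 \geq 0} \; \delta \int W_2\left( M_2(s_1), a_1 \right) \phi(s_1; m_1, H)\, ds_1 - c(a_1), \]
where
\[ W_2(m_2, a_1) = R - c\left( \alpha_2(m_2, a_1) \right) + \delta R \left[ 1 - \Phi\left( \frac{m_1 - \mu M_{v2}(m_2, a_1)}{1 - \mu} + a_2^*\left( M_{v2}(m_2, a_1) \right) - \alpha_2(m_2, a_1);\, m_2, H \right) \right] \]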
denotes the incumbent’s expected utility at the beginning of period two when
his belief is characterized by m2 and he chose a1 . The following proposition presents the incumbent’s period-one effort-smoothing decision (Martinez
[2009b] presents the proof).
Proposition 4 (Martinez 2009b): There exists a unique and positive period-one equilibrium effort level $a_1^*$ that satisfies
\[ c'(a_1^*) = \delta \mu \int_{-\infty}^{\infty} c'\left( \alpha_2(M_2(s_1)) \right) \phi(s_1; m_1, H)\, ds_1 > 0. \tag{21} \]


In equation (21), the expected relative effectiveness in changing the re-election probability of the incumbent’s period-one effort (compared with his
period-two effort) is represented by $\mu > 0$, which indicates the relative weight
of $s_{v1}$ (compared with $s_{v2}$) in
\[ m_{v3} = \mu^2 m_1 + (1 - \mu) s_{v2} + \mu (1 - \mu) s_{v1}. \]
Thus, the expected relative effectiveness, $\mu$, indicates the relative importance
of the direct effect on the re-election probability of appearing more talented
in period one (recall the incumbent is re-elected if and only if $m_{v3} > m_1$).7

6 Note that, for $\mu = 0$ (and, therefore, for $m_2 = \eta_1$), the equilibrium effort strategy in
equation (20) coincides with the one in equation (12).

Since the equilibrium period-two effort level in equation (20) is a function
of the incumbent’s period-two reputation, m2 , differences in the incumbent’s
behavior during his term in office could be the result of changes in his reputation and may not imply that he is deciding differently because the election
time is closer. I want to focus on differences in the incumbent’s behavior that
are due to the proximity of the election. Therefore, I refer to differences in
behavior across the incumbent’s term for a given reputation level as political
cycles. The next proposition shows that the model generates such cycles (I
present the proof in Martinez [2009b]).
Proposition 5 (Martinez 2009b): For the same reputation level (m1 ), the
period-two equilibrium effort level is higher than the period-one equilibrium
effort level.
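
Proposition 5 can be illustrated numerically. The sketch below rests on assumptions that are not in the text: a quadratic effort cost $c(a) = k a^2 / 2$, hypothetical parameter values, a trapezoid rule for the integral, and a reading of the strategy inside the integral of equation (21) as the equilibrium strategy $a_2^*$ from equation (20).

```python
import math

def normal_pdf(v, mean, precision):
    """Density of a normal variable parameterized by mean and precision (1/variance)."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(-0.5 * precision * (v - mean) ** 2)

# Hypothetical parameter values.
h_eta, h_eps = 2.0, 3.0
mu = h_eta / (h_eta + h_eps)           # equation (16)
H = h_eta * h_eps / (h_eps + h_eta)    # equation (15)
m1, delta, R, k = 1.0, 0.95, 1.0, 1.0

def a2_star(m2):
    """Period-two equilibrium effort from equation (20) when c'(a) = k*a."""
    cutoff = (m1 - mu * m2) / (1.0 - mu)
    return delta * R * normal_pdf(cutoff, m2, H) / k

def a1_star(n=4000, width=10.0):
    """Period-one effort from equation (21) when c'(a) = k*a (k cancels on both sides)."""
    step = 2.0 * width / n
    total = 0.0
    for i in range(n + 1):
        s1 = m1 - width + i * step
        m2 = mu * m1 + (1.0 - mu) * s1           # M2(s1), the updated reputation
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * a2_star(m2) * normal_pdf(s1, m1, H)
    return delta * mu * total * step

period_one_effort = a1_star()
period_two_effort = a2_star(m1)   # period-two effort at the same reputation level m1
```

In this sketch the period-two effort is largest at the reputation $m_1$, so its expectation over updated reputations is lower, and period-one effort falls short of period-two effort at the same reputation level.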
Recall that the lessened effectiveness of effort further from the election is the force behind political cycles in previous sections, which present this mechanism in its most extreme form by making assumptions that imply that beginning-of-term effort is not expected to be effective in increasing the re-election probability. In particular, the equilibrium strategy in Section 4 is a special case of the equilibrium strategy presented in this section for which period-one effort is not expected to be effective (μ = 0). In contrast, proposition 5 shows that a standard model can generate cycles for all possible values of μ. In particular, the model can generate cycles if the effectiveness of beginning-of-term actions is arbitrarily close to the effectiveness of end-of-term actions (μ is arbitrarily close to 1). The proposition also shows that discounting is not necessary for generating cycles in the model: Cycles arise for all values of δ, including δ = 1.
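The role of μ as a relative weight can be illustrated with a short numerical sketch. It assumes, for illustration only, that beliefs are updated by the linear rule $m_{t+1} = \mu m_t + (1 - \mu) s_t$, which is one rule that reproduces the expression for the voters' election-time belief, $m_3^v = \mu^2 m_1 + (1 - \mu) s_2^v + \mu(1 - \mu) s_1^v$. The function names and the values of μ below are hypothetical.

```python
def update(prior_mean, signal, mu):
    """Assumed linear updating rule: new mean = mu * prior + (1 - mu) * signal."""
    return mu * prior_mean + (1.0 - mu) * signal


def election_belief(m1, s1, s2, mu):
    """Voters' belief at election time after observing s1 and then s2."""
    m2 = update(m1, s1, mu)
    return update(m2, s2, mu)


def closed_form(m1, s1, s2, mu):
    """Expression for the election-time belief stated in the text."""
    return mu**2 * m1 + (1.0 - mu) * s2 + mu * (1.0 - mu) * s1


def weight_ratio(mu):
    """Weight of the period-one signal relative to the period-two signal."""
    # The rule is linear, so a unit signal with a zero prior isolates each weight.
    weight_s1 = election_belief(0.0, 1.0, 0.0, mu)
    weight_s2 = election_belief(0.0, 0.0, 1.0, mu)
    return weight_s1 / weight_s2


for mu in (0.0, 0.5, 0.99):
    print(f"mu = {mu}: relative weight of the period-one signal = {weight_ratio(mu):.4f}")
```

Under this assumed rule, the ratio of the two weights equals μ, so it is zero in the special case of Section 4 and approaches one as μ approaches 1.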
How could political cycles arise in an economy with no discounting
where manipulating policy is equally effective in every period? As I explain
in Martinez (2009b), cycles could still arise in such an economy because at
the beginning of his term, the incumbent knows that his reputation is likely
to change, and he anticipates that this change will lead him to choose an effort level lower than the one he would choose at the end of his term for his
beginning-of-term reputation level. Note first that the period-two equilibrium effort strategy defined in equation (20) is a hump-shaped function of the
7 As in Section 4, because of the symmetry of the equilibrium period-two effort strategy, the
incumbent does not expect that his period-one effort will affect the re-election probability through
the period-two effort level used by voters for their period-two learning. In Martinez (2009a), I
present a more thorough discussion of the relative effectiveness and this indirect effect of current-period effort on next-period learning.

incumbent’s period-two reputation, $m_2$, as is the signal density function.8 That is, in period two, the incumbent exerts less effort when his reputation has more extreme values. Thus, in period one, he anticipates that if his reputation does not change, he will choose $\alpha_2(m_1)$ in period two. He also anticipates that, for example, if his period-one performance turns out to be either very good or very bad (and, therefore, his period-two reputation is either very good or very bad), he will exert a lower effort level in period two. In particular, the expected period-two effort level is lower than $\alpha_2(m_1)$, and the expected marginal cost of exerting effort in period two is lower than $c'(\alpha_2(m_1))$. Therefore, the effort-smoothing rule in (21) implies that $c'(a_1^*) < c'(\alpha_2(m_1))$, and the incumbent chooses $a_1^* < \alpha_2(m_1)$.
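The effort-smoothing argument can be checked numerically in a stylized case. The sketch below is not the model of Martinez (2009b); it assumes a quadratic cost $c(a) = a^2/2$ (so that $c'(a) = a$), a hypothetical hump-shaped period-two strategy $\alpha_2$ proportional to a normal density centered at $m_1$, the linear updating rule $M_2(s_1) = \mu m_1 + (1 - \mu) s_1$, and illustrative parameter values. With these assumptions, the right-hand side of (21) gives $a_1^*$ directly.

```python
import math

# Illustrative parameter values (assumptions, not taken from the article).
DELTA = 1.0   # no discounting
MU = 0.9      # period-one effort nearly as effective as period-two effort
M1 = 0.0      # beginning-of-term reputation
H = 1.0       # precision of the period-one signal density
SCALE = 2.0   # height parameter of the hypothetical period-two effort hump


def normal_pdf(x, mean, precision):
    """Normal density parameterized by its mean and precision."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(-0.5 * precision * (x - mean) ** 2)


def alpha2(m2):
    """Hypothetical hump-shaped period-two effort strategy, maximized at m2 = M1."""
    return SCALE * normal_pdf(m2, M1, 1.0)


def posterior(s1):
    """Assumed updating rule M2(s1) = mu * m1 + (1 - mu) * s1."""
    return MU * M1 + (1.0 - MU) * s1


def expected_marginal_cost(n=20001, width=10.0):
    """Trapezoid approximation of the integral in (21) when c'(a) = a."""
    lo, hi = M1 - width, M1 + width
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        s1 = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * alpha2(posterior(s1)) * normal_pdf(s1, M1, H)
    return total * step


# With c'(a) = a, equation (21) reads a1* = delta * mu * E[alpha2(M2(s1))].
expected_mc = expected_marginal_cost()
a1_star = DELTA * MU * expected_mc
a2_at_m1 = alpha2(M1)

print(f"expected period-two marginal cost: {expected_mc:.4f}")
print(f"period-one effort a1*:             {a1_star:.4f}")
print(f"period-two effort at m1:           {a2_at_m1:.4f}")
```

In this stylized case the expected period-two marginal cost lies below $c'(\alpha_2(m_1))$ because the hump is maximized at $m_1$, and multiplying by δμ ≤ 1 lowers period-one effort further, so $a_1^* < \alpha_2(m_1)$ even with δ = 1.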
In Martinez (2009b), I analyze the multiple-election version of the model
presented in this section. That is, I analyze a model with more than three
periods in which the incumbent could run for re-election more than once. Such
a model allows for the study of situations that do not arise in the one-election
version: With multiple elections, the beginning-of-term reputation may be
better than the average reputation, and the end-of-term effort may not be
maximized at the beginning-of-term reputation. Recall that in the one-election
version of the model, at the beginning of the term, there is a new incumbent
with an average reputation, and the proof of proposition 5 (which shows that a
political cycle arises in the one-election version of the model) is based on the
end-of-term equilibrium effort strategy being such that it is optimal to exert
the maximum effort level for the beginning-of-term reputation. In Martinez
(2009b), I show that the insight described in the one-election version of the
model helps us understand political cycles with multiple elections: For the
same reputation, end-of-term effort is higher if, at the beginning of the term,
the incumbent anticipates that changes in his reputation will, on average, lead
him to choose an end-of-term effort level lower than the one he would choose
for his beginning-of-term reputation. I also show that the model can generate
expected end-of-term effort levels higher than the beginning-of-term effort
level.

6. CONCLUSIONS

Using a career-concern model of political cycles, this article discusses why
political incentives could be different in election times. First, I show that cycles could arise if end-of-term political actions are more effective in changing
the re-election probability than beginning-of-term actions. Following earlier
8 As I explain in Martinez (2009b), one can expect equilibrium effort to be hump-shaped in
the incumbent’s belief if better incumbents are less (more) likely to produce bad (good) signals.
One can expect equilibrium effort to be hump-shaped in the voters’ belief if extreme signals are
less likely than average signals.

theoretical studies of political cycles, I model this intuition in its most extreme form. In particular, I assume that at the time of the election, only the end-of-term ability is not observable; that only the incumbent’s end-of-term performance is correlated with his post-election performance; and that the incumbent’s performance is a perfect signal of his type. Then, I relax each of these assumptions and discuss how doing so affects the results. In particular, I show
that the model still generates cycles without assuming strong asymmetries
across periods because of the effort-smoothing considerations I first described
in Martinez (2009b). The analysis in this article helps one understand other
agency relationships in which an important part of the compensation is decided
upon infrequently.

APPENDIX A: PROOF OF PROPOSITION 1

In equilibrium, $a_1 = a_1^*$ and, therefore, the first-order condition of the incumbent’s period-one problem reads
$$c'(a_1^*) = \delta \int \left[ -\delta R\, a_2^{*\prime}(\eta_1)\, \phi(m_1; \eta_1, h_\xi) \right] \phi(\eta_1; m_1, h_\eta)\, d\eta_1. \tag{22}$$
Equation (12) shows that
$$\delta R\, \phi(m_1; \eta_1, h_\xi) = c'\big(a_2^*(\eta_1)\big). \tag{23}$$
Plugging equation (23) into equation (22), we obtain equation (13). Since there is a unique period-two equilibrium strategy, $a_2^*(\eta_1)$, defined by equation (12), there is a unique period-one equilibrium effort level, $a_1^*$, that can easily be obtained from equation (13) (the right-hand side of equation 13 does not depend on the period-one effort level).

APPENDIX B: PROOF OF PROPOSITION 2

Recall that $\phi(m_1; \eta_1, h_\xi)$ is symmetric with respect to $\eta_1$ with the maximum at $\eta_1 = m_1$. Consequently, $c'(a_2^*(\eta_1))$ is a symmetric function with the maximum at $\eta_1 = m_1$ (see equation 12). Moreover, $\phi(\eta_1; m_1, h_\eta)$ is a symmetric function with respect to $\eta_1$ with the maximum at $\eta_1 = m_1$. In addition, $a_2^{*\prime}(m_1) = 0$, and, for any $A \in \mathbb{R}$, $a_2^{*\prime}(m_1 + A) = -a_2^{*\prime}(m_1 - A)$ (see equation 12). Consequently,
$$\int a_2^{*\prime}(\eta_1)\, c'\big(a_2^*(\eta_1)\big)\, \phi(\eta_1; m_1, h_\eta)\, d\eta_1 = 0,$$
and according to equation (13), $a_1^* = 0$.
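
The symmetry step in this proof can be written out as follows. This is a sketch under two assumptions that are not stated explicitly above: the integral is taken over the whole real line, and the shorthand $g(\eta_1) = c'(a_2^*(\eta_1))\,\phi(\eta_1; m_1, h_\eta)$ is introduced here only for brevity. The function $g$ is symmetric around $m_1$ because it is the product of two functions that are symmetric around $m_1$.

```latex
% g is symmetric around m_1:  g(m_1 + A) = g(m_1 - A).
% The derivative of a_2^* is antisymmetric around m_1:
%   a_2^{*\prime}(m_1 - A) = -a_2^{*\prime}(m_1 + A).
\begin{align*}
\int_{-\infty}^{\infty} a_2^{*\prime}(\eta_1)\, g(\eta_1)\, d\eta_1
  &= \int_{0}^{\infty} a_2^{*\prime}(m_1 + A)\, g(m_1 + A)\, dA
   + \int_{0}^{\infty} a_2^{*\prime}(m_1 - A)\, g(m_1 - A)\, dA \\
  &= \int_{0}^{\infty} \left[ a_2^{*\prime}(m_1 + A) - a_2^{*\prime}(m_1 + A) \right] g(m_1 + A)\, dA \\
  &= 0 .
\end{align*}
```

The first line splits the integral at $m_1$ with the substitutions $\eta_1 = m_1 + A$ and $\eta_1 = m_1 - A$; the second line applies the symmetry of $g$ and the antisymmetry of $a_2^{*\prime}$.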

REFERENCES
Akhmedov, Akhmed, and Ekaterina Zhuravskaya. 2004. “Opportunistic
Political Cycles: Test in a Young Democracy Setting.” Quarterly Journal
of Economics 119 (November): 1,301–38.
Alesina, Alberto. 1987. “Macroeconomic Policy in a Two-Party System as a
Repeated Game.” Quarterly Journal of Economics 102 (August):
651–78.
Alesina, Alberto, Nouriel Roubini, and Gerald D. Cohen. 1997. Political
Cycles and the Macroeconomy. Cambridge, Mass.: MIT Press.
Azzimonti Renzo, Marina. 2005. “On the Dynamic Inefficiency of
Governments.” Manuscript, University of Texas at Austin.
Besley, Timothy J., and Anne Case. 1995. “Does Electoral Accountability
Affect Economic Policy Choices? Evidence from Gubernatorial Term
Limits.” Quarterly Journal of Economics 110 (August): 769–98.
Brender, Adi. 2003. “The Effect of Fiscal Performance on Local Government
Election Results in Israel: 1989–1998.” Journal of Public Economics 87
(September): 2,187–205.
Brender, Adi, and Allan Drazen. 2005. “Political Budget Cycles in New
Versus Established Democracies.” Journal of Monetary Economics 52
(October): 1,271–95.
Cuadra, Gabriel, and Horacio Sapriza. 2006. “Sovereign Default, Interest
Rates and Political Uncertainty in Emerging Markets.” Working Paper
2006-02, Banco de México.
Drazen, Allan. 2000. “The Political Business Cycle After 25 Years.” NBER
Macroeconomics Annual 15: 75–117.
Foerster, Andrew, and Leonardo Martinez. 2006. “Are We Working Too Hard
or Should We Be Working Harder? A Simple Model of Career
Concerns.” Federal Reserve Bank of Richmond Economic Quarterly 92
(Winter): 79–91.

Hatchondo, Juan Carlos, Leonardo Martinez, and Horacio Sapriza.
Forthcoming. “Heterogeneous Borrowers in Quantitative Models of
Sovereign Default.” International Economic Review.
Hess, Gregory D., and Athanasios Orphanides. 1995. “War Politics: An
Economic, Rational-Voter Framework.” American Economic Review 85
(September): 828–46.
Hess, Gregory D., and Athanasios Orphanides. 2001. “Economic Conditions,
Elections, and the Magnitude of Foreign Conflicts.” Journal of Public
Economics 80 (April): 121–40.
Holmström, Bengt. 1999. “Managerial Incentive Problems: A Dynamic
Perspective.” Review of Economic Studies 66: 169–82.
Martinez, Leonardo. 2009a. “Reputation, Career Concerns, and Job
Assignments.” The B.E. Journal of Theoretical Economics
(Contributions) 9: Article 15.
Martinez, Leonardo. 2009b. “A Theory of Political Cycles.” Journal of
Economic Theory 144 (May): 1,166–86.
Rogoff, Kenneth. 1990. “Equilibrium Political Budget Cycles.” American
Economic Review 80: 21–36.
Shi, Min, and Jakob Svensson. 2003. “Political Budget Cycles: A Review of
Recent Developments.” Nordic Journal of Political Economy 29: 67–76.
Shi, Min, and Jakob Svensson. 2006. “Political Budget Cycles: Do They
Differ Across Countries and Why?” Journal of Public Economics 90
(September): 1,367–89.
Stiroh, Kevin J. 2007. “Playing for Keeps: Pay and Performance in the
NBA.” Economic Inquiry 45: 145–61.
Wilczynski, Adam. 2004. “Career Concerns and Renegotiation Cycle
Effect.” Unpublished manuscript.

