Working Paper Series

Market-based Incentives

WP 13-05R

Borys Grochulski
Federal Reserve Bank of Richmond
Yuzhe Zhang
Texas A&M University

This paper can be downloaded without charge from:
http://www.richmondfed.org/publications/

Market-based incentives∗
Borys Grochulski†

Yuzhe Zhang‡

Working Paper No. 13-05R
Abstract
In this paper, we study market-induced, external incentives similar to career concerns jointly
with standard incentives induced by long-term contracts. We consider a dynamic principal-agent problem in which the agent’s outside option is determined endogenously in a competitive labor market. In equilibrium, strong performance increases the agent’s market value.
When this value becomes sufficiently high, the threat of the agent quitting forces the principal to give the agent a raise. The prospect of obtaining this raise gives the agent an
incentive to exert effort, which reduces the need for standard incentives, like performance
bonuses. In fact, whenever the agent’s option to quit is close to being “in the money,” the
market-induced incentive completely eliminates the need for standard incentives.
Keywords: incentives, long-term contracts, career concerns, moral hazard, limited commitment
JEL codes: D82, D86, J33

1 Introduction

Misaligned incentives can induce wasteful individual behavior and lead to disastrous aggregate
outcomes.1 For that reason, understanding how incentives are provided in the economy is one
of the central questions in economics.
∗ The authors would like to thank V.V. Chari, Lukasz Drozd, Piero Gottardi, Hari Govindan, Felipe Iachan,
Boyan Jovanovic, Patrick Kehoe, Dirk Krueger, Marianna Kudlyak, Ramon Marimon, Urvi Neelakantan, Andrew
Owens, Chris Phelan, Ned Prescott, B. Ravikumar, Marek Weretka, and Jan Werner for their helpful comments.
The views expressed herein are those of the authors and not necessarily those of the Federal Reserve Bank of
Richmond or the Federal Reserve System.
† Federal Reserve Bank of Richmond, borys.grochulski@rich.frb.org.
‡ Texas A&M University, yuzhe-zhang@econmail.tamu.edu.
1 To give just one recent example, the Federal Reserve (2011) states that “Risk-taking incentives provided by
incentive compensation arrangements in the financial services industry were a contributing factor to the financial
crisis that began in 2007.”

Two main sources of incentives have been identified in the literature: direct incentives specified explicitly in contracts between the counterparties to a given economic relationship, and
external incentives coming from outside of the relationship. These two sources of incentives,
however, have been studied almost exclusively in separation from one another. The traditional
principal-agent literature studies direct incentives obtained by contractually connecting the
agent’s compensation to her performance. But this literature does not consider the impact
that the agent’s performance may have on her outside options.2 In polar contrast, the literature on career concerns studies the external incentives stemming from the impact the agent’s
performance has on her reputation and hence on wages she can earn in the future. But it
ignores contractual provision of incentives by assuming that spot wages are paid every period.3
In reality, both sources of incentives are present in many, if not most, long-term economic
relationships.4 It is therefore important to study direct and external incentives together and
examine how they jointly affect behavior.
In this paper, we study the interaction between direct and external incentives in a model with
both fully flexible, long-term contracts and the agents’ concern for the value they can obtain
outside the present relationship. We obtain a novel characterization of the optimal mix of
incentives: external incentives are fully sufficient when the agent is new or has produced a
record of strong performance; contractual incentives become needed only after an extended
period of weak performance.
Our model introduces persistence and one-sided commitment into a dynamic moral hazard
contracting problem between a risk-neutral principal/firm and a risk-averse agent/worker. The
worker’s productivity/output follows a random walk with drift. The worker controls the drift
of her productivity by taking a costly action (effort). As in the standard dynamic moral hazard
model (e.g., Rogerson (1985), Phelan and Townsend (1991), Sannikov (2008)), the worker’s
productivity/output is observable, but her effort is not. Unlike in the standard model, effort
has in our model a persistent impact on the worker’s productivity: because productivity is a
random walk, if the worker shirks today, her productivity is reduced in all future periods. A
contract between the firm and the worker specifies an effort recommendation and compensation
to be paid to the worker along any observed history of the worker’s productivity. As in Harris
and Holmstrom (1982), the firm is committed to the contract, but the worker is not. The worker
can quit at any time and move on to work for another firm. If she quits, competition among the
potential new employers ensures that the firm to which she moves offers her the best contract
feasible subject to the firm breaking even. When designing this contract, the firm faces two
constraints: an incentive compatibility constraint resulting from moral hazard, and the worker’s
participation, or quitting, constraint that is due to the worker’s lack of commitment.

2 See, e.g., Rogerson (1985), Phelan and Townsend (1991), and Sannikov (2008).
3 See Fama (1980), Holmstrom (1982), and the large body of literature that has followed.
4 To give one example, consider the CEO of a publicly traded corporation. Typically, her compensation will be
contractually connected to the corporation’s performance via equity grants or options, which gives her a direct
incentive to create shareholder value. But the company’s performance also affects the CEO’s reputation and,
thus, her standing in the broader market for business executives. This indirect exposure to the corporation’s
performance is a second channel through which the CEO is incentivized to create shareholder value.

The value this contract delivers to the worker as of the time she is hired is her market value.
In equilibrium, this value will depend on the level of productivity with which the worker enters
the labor market. More productive workers will, naturally, have a higher market value.
External incentives arise in our model because the worker’s market value increases with the
level of her productivity, and the worker can increase her productivity by putting in effort
on her current job. Despite being under a long-term contract with her current employer, the
worker always cares about her market value because she can quit at any time. In particular,
if the market option becomes more attractive to the worker than her current job, her current
employer will have to give her a raise so as to match her market value, or the worker quits.5
If the worker’s market value is improving but remains below the value of continuing with her
current job, the worker still benefits because she comes closer to getting a raise in the future
(i.e., the option value of quitting increases). Thus, the worker always benefits when her market
value goes up. This motivates the worker to put in effort and perform well on her current job
even if strong performance is not rewarded in terms of her current pay. As we see, this effort
incentive comes from outside of the current relationship between the worker and the firm and is
driven by the worker’s market value considerations. We will call this incentive the market-based
incentive.
The optimal contract between the firm and the worker has the following structure. The contract
has two phases: a rigid-wage phase and a pay-for-performance phase. The worker starts out in
the rigid-wage phase, where her compensation is downward-rigid, as in Harris and Holmstrom
(1982). Namely, the worker earns a constant wage and receives raises only as needed to match
her outside option and keep her from quitting. Strong performance induces frequent raises.
Weak performance means no raise, but compensation never decreases in this phase of the
contract. In this phase, thus, compensation is back-loaded, i.e., expected pay increases over
time, and is not volatile: compensation does not respond to the worker’s performance, unless
she gets a raise. In the pay-for-performance phase, compensation is similar to the optimal
contract from the pure moral hazard model. It is front-loaded, i.e., expected pay decreases over
time, and volatile, i.e., always sensitive to current performance, both on the upside and on the
downside.
5 As in Harris and Holmstrom (1982), there is no economic role for job transitions in our homogeneous-firm
model of the labor market. We thus derive the optimal long-term contract under the assumption that workers
do not quit if indifferent. The alternative assumption leads to the exact same equilibrium processes for effort
and compensation.

Which phase the contract is in at any given time is determined by the slackness in the worker’s
quitting constraint, i.e., by how much the worker prefers her current contract to quitting and
getting her market value from a new employer. When slackness is low, i.e., below a specific
threshold, the contract is in the rigid-wage phase; otherwise, the contract is in the pay-for-performance phase.
The intuition for this threshold pattern, as well as for the structure of optimal compensation
in each phase of the contract, comes from the inverse relation between slackness in the quitting
constraint and the strength of external incentives. It is well-known that moral hazard creates
a trade-off between insurance and incentives: a well-insured agent has little incentive to exert
effort. The firm, however, can provide very limited insurance to the worker when the worker
is about to quit. Limited insurance means the worker’s continuation value is highly sensitive
to her performance, which gives her an incentive to exert effort. The closer the worker is to
quitting, the less insured she is and, thus, the stronger is that incentive. The threshold for
switching from the rigid-wage to the pay-for-performance phase lies at the point at which that
external, market-based incentive becomes too weak to deter shirking by itself. The firm must
at that point supplement the market-based incentive with some contract-based incentives, and
the contract begins to resemble the standard moral hazard contract, where pay is linked to
current performance. When slackness in the quitting constraint goes to infinity, the strength of
market-based incentives goes to zero and the optimal contract converges to the solution of the
standard dynamic moral hazard problem, e.g., Phelan and Townsend (1991), in which there are
no external incentives.
The link between the strength of market-based incentives and the worker’s slackness in the quitting constraint, or her “distance to default,” can be seen easily in the following simple example.
Suppose the firm pays the worker constant compensation for as long as the worker chooses to
stay with the firm. This means that the firm does not provide any pay-for-performance incentives and thus all incentives are market-based under this simple contract. How well this contract
insures the worker depends on how long the worker stays with the firm, which in turn depends
on how good the worker’s market option is. If her market value is just a notch below the value
delivered by the contract, a small positive shock to her productivity is enough to elevate the
market value above the value the contract is providing, i.e., a small shock is enough to make the
worker quit. Since such small shocks happen often, the expected duration of the simple contract
is short, so the contract provides very little insurance to the worker.6 With little insurance,
the worker’s incentive to put in effort is strong. If, however, the worker’s market value is much
worse than the value delivered by the contract, i.e., the quitting constraint is very slack, it takes
a large positive shock to elevate the worker’s market value above the value of the contract. Such
shocks are rare, so the worker is very well-insured under the simple contract. She therefore has
little incentive to put in effort. As we see, the strength of the external, market-based incentive
is indeed decreasing in the slackness of the quitting constraint.

6 Note that it is the upside, not downside, risk that is uninsurable when the insured agent lacks commitment.

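The logic of the simple constant-wage example can be illustrated numerically. The sketch below is not the paper's computation. It assumes, purely for illustration, that the worker quits once her productivity has risen by a fixed amount `distance`, a hypothetical stand-in for slackness in the quitting constraint. It then uses the standard first-passage formula for a Brownian motion with drift to compute the probability that this happens within a given horizon. The function name and all parameter values are assumptions.

```python
import math
from statistics import NormalDist

def prob_hit_within(distance, drift, sigma, horizon):
    """P(max over s <= horizon of drift*s + sigma*z_s >= distance), distance > 0.

    Standard first-passage formula for a Brownian motion with drift.
    """
    cdf = NormalDist().cdf
    scale = sigma * math.sqrt(horizon)
    return (cdf((drift * horizon - distance) / scale)
            + math.exp(2.0 * drift * distance / sigma ** 2)
            * cdf((-distance - drift * horizon) / scale))

# Hypothetical numbers: sigma = 1 and a horizon of one unit of time.
for distance in (0.1, 1.0, 3.0):
    for drift in (0.0, 0.5):
        p = prob_hit_within(distance, drift, sigma=1.0, horizon=1.0)
        print(f"distance={distance:3.1f} drift={drift:3.1f} P(hit)={p:.3f}")
```

Under these assumptions the probability of reaching the quitting point falls as `distance` grows and rises with `drift`. This mirrors the argument in the text: a worker far from quitting is well insured, and a worker whose productivity trends upward is close to quitting more often.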
How frequently market-based incentives are strong in equilibrium depends on how close the
quitting constraint is to binding on average. One important factor determining the average
“distance to default” is the expected change in worker productivity. If productivity tends to
grow over time, the worker’s market value tends to increase, so the quitting constraint binds
often. This makes market-based incentives strong frequently and contract-based incentives
needed rarely. In particular, with a sufficiently large positive trend in worker productivity, the
probability that contract-based incentives are ever used can be arbitrarily small.
As an extension of our model, we study the possibility that not only workers but also firms lack
commitment. In particular, we follow Phelan (1995) in assuming that firms can fire workers
upon incurring a firing cost. In this extension, thus, in addition to the worker’s quitting
constraint, we have a firm’s participation, or firing, constraint. We show that if the firing cost
is not too large, the worker is always exposed to risk and, thus, market-based incentives are
always strong. If slackness in the quitting constraint is low, then, as in our basic model, market-based incentives arise because the upside risk to the worker’s productivity is uninsurable. If
slackness in the worker’s quitting constraint becomes large, the firm’s firing constraint becomes
tight and market-based incentives arise because the downside risk to the worker’s productivity
is not fully insured. In this extension, optimal compensation has a “sticky wage” structure:
small positive or negative performance shocks do not affect the wage; large positive (negative)
shocks increase (decrease) the wage paid to the worker.
In order to characterize the solution to our model analytically, we make several assumptions
widely used in the dynamic contracting literature. Constant absolute risk aversion (CARA)
preferences and Gaussian shocks let us reduce to one the dimension of the state space sufficient for a recursive representation of our contracting problem.7 The optimal contract is then
characterized by solving an ordinary differential equation.8 Although needed for analytical
tractability, these assumptions are not necessary for the existence of market-based incentives.
We briefly consider a version of our model with log preferences and log-normal shocks and show
that there, too, market-based incentives are strong when slackness in the workers’ quitting
constraint is not too large.
Essential for the existence of market-based incentives are workers’ inability to commit to staying
on the job forever and a positive impact of workers’ on-the-job effort on their market value.9
These conditions seem very plausible. The latter condition, in particular, is similar to learning-by-doing. It will be satisfied whenever putting in effort on the job helps a worker acquire any
kind of skill or experience that is valued in the labor market.

7 See, e.g., He et al. (2013) and Prat and Jovanovic (2013) for recent applications of this technique.
8 Because the lower bound is a reflecting rather than an absorbing barrier for the state variable in our model,
the differential equation characterizing the equilibrium contract does not satisfy standard regularity conditions
for the existence of a solution. We develop a change of variable technique to solve this problem. This technique
can be useful in studying other contracting problems with reflective barrier dynamics.

Our model provides several testable predictions, which we discuss in Section 7. In particular,
having both a rigid-wage phase and a pay-for-performance phase, the equilibrium contract from
our model can generate wage change frequencies consistent with empirical evidence.
Relation to the literature. Naturally, our paper is related to the literature on career concerns.
The external incentives that we call market-based incentives in this paper are similar to career
concerns: both exist because the worker’s current effort has a persistent impact on her expected
productivity and, thus, on her future market value. In fact, market-based incentives are a
simpler version of career concerns, and this simplicity lets us study them jointly with long-term
contracts.
In the career concerns model, the worker’s market value increases with her effort because of the
market’s imperfect ability to infer the worker’s talent/productivity from her observed performance. The worker expends effort in an attempt to manipulate the market’s belief about her
productivity or, as Fudenberg and Tirole (1986) put it, to jam the signal about her productivity
received by the market. This gives rise to complicated belief dynamics in the career-concerns
equilibrium with spot wages paid every period. In our model, the market’s belief about the
worker’s talent/productivity is trivially correct at all times, as the worker’s productivity is
public information. It is impossible for the worker to manipulate the market’s belief about
her productivity, so the worker is not motivated by signal-jamming in our model. The persistent impact of effort on the worker’s market value is generated through the persistence in the
worker’s (publicly observable) productivity process. Although the worker enters a long-term
employment contract, she always has an incentive to improve her market value because she
has the right to quit at any time. In sum, the career concerns model has complicated learning
dynamics for the worker’s expected productivity, but uses a restrictive assumption about contracts used in the labor market (spot wages). Our model has simple dynamics for the worker’s
expected productivity, but allows for fully flexible, long-term compensation contracts.
We abstract from learning and instead use a simpler process capturing the impact of effort
on the worker’s future market value because learning dynamics give rise to the problem of
persistent private information: if the worker deviates from the equilibrium level of effort, she
becomes better-informed about her productivity than the firm. This belief discrepancy makes
the provision of incentives complicated because the deviating worker’s optimal effort strategy
in the continuation contract depends on the history of deviations she took to date. He et al.
(2013) and Prat and Jovanovic (2013) study this contracting problem under full commitment,
i.e., without external incentives. These analyses, however, are not easily extendable to the case
with external markets and limited commitment, where external incentives can be studied. In our
model, a deviation by the worker does not make her better-informed about her productivity than
the firm or the outside market, so there is no persistent private information. This simplification
gives us a tractable model in which we include external markets, relax the assumption of full
commitment, and study contract- and market-induced incentives jointly.

9 Note that the latter condition will be satisfied even if workers’ productivity follows a process less persistent
than random walk. The less persistent this process is, the weaker market-based incentives will be. In reality,
workers’ productivity is highly persistent and close to being a random walk; see, e.g., Storesletten et al. (2001).

Gibbons and Murphy (1992) study contract- and market-induced incentives in the learning
environment of Holmstrom (1982). They restrict attention to one-period, linear compensation
contracts. This restriction makes their model tractable because, after any deviation, the deviating agent’s optimal effort strategy coincides with the equilibrium strategy. They show that
pay should be most sensitive to performance for workers close to retirement and workers with
no promotion opportunities.
Our paper builds on the extensive literature on optimal long-term contracts in principal-agent
relationships with moral hazard (private information). Important papers in this literature
include Rogerson (1985), Spear and Srivastava (1987), and Phelan and Townsend (1991). In
particular, we follow Sannikov (2008) in studying dynamic moral hazard in continuous time.
This literature studies contract-based incentives but does not capture external, market-based
incentives because it takes the agent’s outside option parametrically. Our paper shows that
external incentives arise in the dynamic principal-agent problem if (a) strong performance
enhances the agent’s outside value, and (b) the agent cannot contractually commit to not
leaving her current relationship should quitting become beneficial. We show that external
incentives change significantly the structure of the optimal contract. In particular, in addition
to the pay-for-performance phase familiar from dynamic moral hazard models, with external
incentives the contract also has a rigid-wage phase in our model.
Our paper also builds on the literature on optimal contracting with commitment frictions. In
particular, we add a dynamic moral hazard problem to a version of the Harris and Holmstrom
(1982) model of the labor market with risk-neutral firms/principals competing for risk-averse
workers/agents who seek insurance against persistent idiosyncratic productivity shocks.10 Firms
are able to commit to long-term contracts, but workers cannot.11 As in Harris and Holmstrom
(1982), one-sided commitment leads to downward-rigid compensation in our model, but only
when the worker is close enough to quitting. Different from Harris and Holmstrom (1982), due
to moral hazard, downward-rigidity of compensation does not hold when the quitting constraint
is sufficiently slack.

10 Harris and Holmstrom (1982) use an exogenous learning model to generate persistent productivity/talent
shocks.
11 A similar environment is used in Krueger and Uhlig (2006) to study the market for long-term financial
insurance contracts.

There exist a small number of studies that, like we do here, examine optimal contracts under
the two frictions of private information and limited commitment. Two studies closely related
to our paper are Thomas and Worrall (1990, Section 8) and Phelan (1995). In these papers,
however, external incentives do not arise because the agent’s outside option does not depend
on her past performance. In Atkeson (1991), the outside option of the agent (a borrowing
country) does depend on her actions (investment). For this reason, although that paper asks
a different question, we expect that market-based incentives exist in that environment. These
incentives are probably weak in that model because persistence in the impact of the private
action (investment) on the value of the outside option (autarky) is not very strong. In our
model, effort has a permanent effect on the worker’s outside option, which makes market-based
incentives much stronger and easier to identify.
Modeling compensation as part of a long-term employment contract has a long tradition in
the economic theory of employment and wage determination that dates back to Baily (1974),
Azariadis (1975), and Holmstrom (1983). Although in this theory, as in our model, employment
contracts provide insurance to workers, shocks considered there are aggregate or industry-wide, while we consider worker-specific shocks to individual productivity. Also, that literature
abstracts from incentive problems, which are the primary focus of this paper. Our main interest
is in showing the effect of market-based incentives on the structure of the optimal compensation
contract under moral hazard. To this end, we keep the model of the labor market simple. By
assuming frictionless matching between firms and workers, we abstract from search costs and
exogenous separations. All workers in our model are employed at all times.
Organization. The model environment is formally defined in Section 2. Sections 3 and 4 study
single-friction versions of our model, with full commitment in Section 3 and full information
in Section 4. Optimal contracts from these models serve as benchmarks that we use to solve
the full model in Section 5. In particular, the minimum cost functions from these models
provide useful lower bounds on the cost function in the full model. Section 6 considers the
robustness of our results with respect to the functional forms we use, as well as with respect to
the assumption of full commitment on the firm side. Section 7 discusses testable predictions of
our model. Proofs of all results formally stated in the text are relegated to Appendix A.

2 A labor market with long-term contracts

We consider a labor market populated with a large number of workers and a potentially larger
number of firms operating under free entry. For concreteness, we will assume that one firm hires
one worker.12 Matching between workers and firms is frictionless: an unmatched worker can
instantaneously find a match with a new firm entering the market. In a newly formed match,
the firm offers the worker a long-term employment contract. Competition among firms, those
in the market and the potential new entrants, drives all firms’ expected profits to zero.
Workers are heterogeneous in their productivity $y_t$, which changes stochastically over time
following a Brownian motion with drift. Let $z$ be a standard Brownian motion $z = \{z_t, \mathcal{F}_t; t \ge 0\}$
on a probability space $(\Omega, \mathcal{F}, P)$. A worker’s productivity process $y = \{y_t; t \ge 0\}$ is $y_0 \in \mathbb{R}$
at $t = 0$ and evolves according to
$$dy_t = a_t\,dt + \sigma\,dz_t. \tag{1}$$
The drift in a worker’s productivity at $t$, $a_t$, is privately controlled by the worker via a costly
action she takes. Specifically, $a_t \in \{a_l, a_h\}$ with $a_l < a_h$. The volatility of $y_t$ is fixed: $\sigma > 0$
at all $t$. Workers are heterogeneous in the initial level of their productivity $y_0$, in the realized
paths of their productivity shocks $\{z_t; t > 0\}$, and, potentially, in the action path $\{a_t; t \ge 0\}$
they choose. The path of actions $\{a_t; t \ge 0\}$ taken by each worker is her private information.
The structure of the productivity process and each worker’s productivity level $y_t$ are public
information at all times.
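To make the process in (1) concrete, the following minimal sketch simulates it on a discrete time grid. The paper itself works in continuous time. The step size, the parameter values, and the function name are illustrative assumptions, not taken from the paper.

```python
import math
import random

def simulate_productivity(y0, actions, sigma, dt, rng):
    """Simulate dy_t = a_t dt + sigma dz_t step by step.

    actions[k] is the drift a_t the worker chooses on step k.
    Returns the path [y_0, ..., y_N] and the Brownian increments dz.
    """
    path, increments = [y0], []
    for a in actions:
        dz = rng.gauss(0.0, math.sqrt(dt))  # dz ~ N(0, dt)
        increments.append(dz)
        path.append(path[-1] + a * dt + sigma * dz)
    return path, increments

# Hypothetical parameter values, chosen only for illustration.
A_LOW, A_HIGH, SIGMA, DT, N_STEPS = 0.0, 1.0, 0.5, 0.01, 1000

rng = random.Random(0)
y_high, dz_high = simulate_productivity(0.0, [A_HIGH] * N_STEPS, SIGMA, DT, rng)

# Reusing the same shocks with low effort isolates the effect of the action:
# the two paths differ only by the cumulative drift gap.
y_low = [0.0]
for dz in dz_high:
    y_low.append(y_low[-1] + A_LOW * DT + SIGMA * dz)

gap = y_high[-1] - y_low[-1]  # (A_HIGH - A_LOW) * N_STEPS * DT, up to rounding
```

Because the shocks are shared, the gap between the two paths grows linearly in time. This is the sense in which effort has a permanent effect on productivity when productivity follows a random walk with drift.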
We adopt a simple production function in which the revenue the worker generates for the firm
equals the worker’s productivity $y_t$ at all times during her employment with the firm. In a
long-term employment contract, the firm collects revenue $\{y_t; t \ge 0\}$ and pays compensation
$\{c_t; t \ge 0\}$ to the worker. We will identify compensation $c_t$ with the worker’s consumption at
all $t \ge 0$.13
Formally, a long-term contract a firm and a worker enter at $t = 0$ specifies an action process
$a = \{a_t; t \ge 0\}$ for the worker to take, and a compensation/consumption process $c = \{c_t; t \ge 0\}$
the worker receives. Processes $a$ and $c$ must be adapted to the information available to the
firm.
We assume that firms and workers discount future payoffs at a common rate $r$. In a match, the
firm’s expected profit from a contract $(a, c)$ is given by
$$E^a\left[\int_0^\infty r e^{-rt}(y_t - c_t)\,dt\right],$$
where $E^a$ is the expectation operator under the action plan $a$.

12 As long as each worker’s performance is observable, our results would be unchanged if firms in the model
hired multiple workers.
13 We can think of the worker’s savings or financial wealth as being observable and thus contractually controlled
by the firm.

Action $a_t$ represents the worker’s effort at time $t$. If the worker takes the high-effort action
$a_h$, she improves her current productivity and, hence, the revenue she generates for the firm.
Because $y_t$ is persistent, high effort $a_h$ also increases the worker’s expected productivity in the
future. Action $a_h$, however, is costly to the worker in terms of current disutility of effort.
All workers have identical preferences over compensation/consumption processes $c$ and action
processes $a$. These preferences are represented by the expected utility function
$$E^a\left[\int_0^\infty r e^{-rt}\,U(c_t, a_t)\,dt\right].$$

To make our model tractable analytically, we will abstract in this paper from wealth effects in
the provision of incentives. That is, we will assume constant absolute risk aversion (CARA)
with respect to consumption by taking
$$U(c_t, a_t) = u(c_t)\,\phi^{\mathbf{1}_{a_t = a_l}},$$
where $u(c_t) = -\exp(-c_t) < 0$, $0 < \phi < 1$, and $\mathbf{1}_{a_t = a_l}$ is the indicator of the low-effort action
$a_l$ at time $t$. High effort $a_h$ is costly to the agent because $U(c, a_h) = u(c) < u(c)\phi = U(c, a_l)$
for all $c$.14 In Section 6, we discuss the extent to which our results depend on this form of the
utility function.
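A short numerical check may help in reading this utility specification. The sketch below evaluates the flow utility under both actions, compares it with the consumption-equivalent form given in footnote 14, and verifies that the utility ratio between low and high effort does not depend on the consumption level, which is the sense in which there are no wealth effects. The value of `PHI` and the function names are assumptions made for illustration.

```python
import math

PHI = 0.8  # hypothetical value; the paper only requires 0 < phi < 1

def u(c):
    """CARA utility of consumption, u(c) = -exp(-c) < 0."""
    return -math.exp(-c)

def utility(c, low_effort, phi=PHI):
    """Flow utility U(c, a) = u(c) * phi**1{a = a_l}."""
    return u(c) * (phi if low_effort else 1.0)

def utility_leisure_form(c, low_effort, phi=PHI):
    """Equivalent form: low effort adds log(1/phi) units of consumption."""
    return u(c + (math.log(1.0 / phi) if low_effort else 0.0))

for c in (0.0, 1.5, 3.0):
    ratio = utility(c, True) / utility(c, False)
    print(f"c={c:3.1f}  U(c,a_h)={utility(c, False):.4f}  "
          f"U(c,a_l)={utility(c, True):.4f}  ratio={ratio:.2f}")
```

Since $u < 0$, multiplying by $\phi < 1$ raises utility, so low effort is the less costly action at every consumption level.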
Firms can commit to long-term contracts, but workers cannot. A worker has the right to quit
and rejoin the labor market at any point during her employment with a firm. In the market, the
worker is free to enter another long-term contract with a new firm. Any contractual promise
by the worker to not use her market option would not be enforceable. The presence of this
inalienable right to quit restricts firms’ ability to insure workers against the upside risk to their
productivity. In particular, contracts will be restricted by workers’ participation (or quitting)
constraints defined as follows. Denote by $V(y_t)$ the value a worker with productivity $y_t$ can
obtain if she quits and rejoins the labor market. This market value will be determined in
equilibrium. We show later (in Proposition 1) that $V$ is strictly increasing. For a worker with
initial productivity $y_0 \in \mathbb{R}$, a contract $(a, c)$ induces a continuation value process $W = \{W_t; t \ge 0\}$ given by
$$W_t = E^a\left[\int_0^\infty r e^{-rs}\,U(c_{t+s}, a_{t+s})\,ds \,\Big|\, \mathcal{F}_t\right]. \tag{2}$$
Contract $(a, c)$ satisfies the worker’s quitting constraints if at all dates and states
$$W_t \ge V(y_t). \tag{3}$$

14 We can equivalently write $U(c_t, a_t)$ as $u(c_t + \mathbf{1}_{a_t = a_l}\log(\phi^{-1}))$ and interpret $\log(\phi^{-1}) > 0$ as the consumption
equivalent of the utility the agent gets from leisure associated with exerting low effort.


This constraint is standard in models of optimal contracts with limited commitment (e.g.,
Thomas and Worrall (1988)). It also resembles the lower-bound constraint on the continuation
value $W_t$ used in many principal-agent models with private information (e.g., Atkeson and Lucas
(1995) and Sannikov (2008)), but is in an important way different because the lower bound in
these models is given by some fixed value, whereas in (3) the lower bound $V(y_t)$ changes with
the worker’s productivity. Later in the paper, we will see that this difference has important
implications for the provision of incentives to the worker at the lower bound.
In this paper, we adopt the convention that when the quitting constraint (3) binds, i.e., when
the worker is indifferent to quitting, the worker stays. In our model, as in Harris and Holmstrom
(1982), there are no efficiency gains from separations. Since switching employers would not make
the worker more productive, the best continuation contract that the worker’s current employer
can provide is as good as the best contract that the worker can get in the market. Adopting
the convention that workers do not quit when (3) binds is thus without loss of generality, but
lets us avoid additional notation that would be needed to describe job transitions.15
Because action $a_t$ is not observable, contracts will also have to satisfy incentive compatibility
(IC) constraints. A contract is incentive compatible if no deviation from the recommended
action process a can make the worker better off. We will express IC constraints using the
following results of Sannikov (2008).
Let (a, c) be a contract and W the associated continuation utility process as defined in (2).
There exists a (progressively measurable) process Y = {Yt ; t ≥ 0} such that the continuation
utility process W can be represented as
$$dW_t = r\bigl(W_t - U(c_t, a_t)\bigr)\,dt + Y_t\,dz_t^a, \tag{4}$$
where
$$z_t^a = \sigma^{-1}\left(y_t - y_0 - \int_0^t a_s\,ds\right). \tag{5}$$
15 If we follow the alternative convention and suppose that the worker quits when (3) binds, the optimal contract is the same except it ends when (3) binds for the first time and is replaced with a new contract identical to the continuation of the original contract. This interpretation of long-term contracts is equivalent to the no-separation convention we adopt in that it leads to identical production, consumption, and welfare.


Contract (a, c) is IC if and only if for all t and ã ∈ {ah , al },
$$r\bigl(U(c_t, \tilde a) - U(c_t, a_t)\bigr) + \sigma^{-1}(\tilde a - a_t)\,Y_t \le 0. \tag{6}$$

For proof of these results see Sannikov (2008).
In (4), $dz_t^a = \sigma^{-1}(dy_t - a_t\,dt)$ represents the worker’s current on-the-job performance. Performance at t is measured by the change in the worker’s productivity, $dy_t$, relative to what this change is expected to be at t under the recommended action plan, $a_t\,dt$, and normalized by σ. Note that as long as the worker follows the recommended action $a_t$, her (observable) performance $dz_t^a$ will be the same as the (unobservable) innovation term $dz_t$ in her productivity process given in (1).
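As an illustration that is not part of the original analysis, the following Python sketch simulates the performance measure on a discrete time grid. It assumes that the productivity process in (1) has the form $dy_t = a_t\,dt + \sigma\,dz_t$, which is our reading of the surrounding text; the function name and all parameter values are hypothetical.

```python
import math
import random


def simulate_performance(recommended, actual, sigma, dt, n_steps, seed=0):
    """Return (dz, dz_a): true innovations and measured performance increments.

    Assumption: productivity follows dy = actual*dt + sigma*dz (our reading of (1)).
    """
    rng = random.Random(seed)
    dz, dz_a = [], []
    for _ in range(n_steps):
        shock = rng.gauss(0.0, math.sqrt(dt))          # Brownian increment dz_t
        dy = actual * dt + sigma * shock                # observed productivity change
        dz.append(shock)
        dz_a.append((dy - recommended * dt) / sigma)    # performance measure dz^a_t
    return dz, dz_a


# Hypothetical parameter values, chosen only for illustration.
a_h, a_l, sigma, dt = 0.02, -0.05, 0.3, 0.01

# Same seed, hence the same underlying shocks in both runs.
dz, perf_obedient = simulate_performance(a_h, a_h, sigma, dt, 1000)
_, perf_shirking = simulate_performance(a_h, a_l, sigma, dt, 1000)

# Under a deviation to a_l, measured performance is biased by (a_l - a_h)*dt/sigma.
gap = [p - z for p, z in zip(perf_shirking, dz)]
```

When the worker follows the recommendation, measured performance coincides with the true shock. Under the deviation, every increment is shifted down by the same amount, which is the channel through which $Y_t$ penalizes shirking in (6).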
Also in (4), Yt represents the sensitivity of the worker’s continuation value to current performance. Clearly, a larger Yt will imply a stronger response of Wt to any given observed
performance dzta . The IC constraint (6) requires that the total gain the worker can obtain by
deviating from the recommended action at to the alternative action ã be nonpositive. The first
component of this gain shows the direct impact of the deviation on the worker’s current utility.
The second component shows the indirect impact of the deviation on the continuation utility
expressed as the product of the action’s impact on the worker’s performance and the sensitivity
of the continuation value to performance.
If the recommended action at time t is to exert effort, i.e., if $a_t = a_h$, then the IC condition (6) reduces to $r\,u(c_t)(\phi - 1) \le \sigma^{-1}(a_h - a_l)\,Y_t$, or
$$\frac{Y_t}{-u(c_t)} \ge \beta, \tag{7}$$
where $\beta = r\sigma\,\frac{1-\phi}{a_h - a_l} > 0$. Analogously, the low-effort action $a_l$ is IC at t if and only if
$$\frac{Y_t}{-u(c_t)} \le \beta.$$
Written in this form, the IC constraints make it clear that the ratio Yt /(−u(ct )) measures the
strength of effort incentives that contract (a, c) provides to the worker at time t. The high-effort
action ah is incentive compatible at t if and only if this ratio is greater than the constant β. Low
effort is incentive compatible if and only if this ratio is smaller than β. As in Sannikov (2008),
higher sensitivity of the worker’s continuation value to her current on-the-job performance,
Yt , makes effort incentives stronger. Due to non-separability of workers’ preferences between
consumption and leisure, the level of consumption ct also affects the strength of effort incentives in our model.16 In particular, if the contract recommends high effort, the gain in the flow utility
the worker can obtain by shirking is in our model smaller at higher consumption levels.17 For a
given level of sensitivity Yt , thus, higher current consumption ct makes effort incentives stronger.
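A small numerical sketch, offered only as an illustration, can make the equivalence between (6) and the ratio form (7) concrete. It assumes $u(c) = -e^{-c}$ and $U(c, a_l) = \phi\,u(c)$, which is how we read footnote 14 and the later discussion of exponential utility; the parameter values are hypothetical.

```python
import math


def u(c):
    """CARA flow utility with unit risk aversion, u(c) = -exp(-c) (assumed form)."""
    return -math.exp(-c)


def beta(r, sigma, phi, a_h, a_l):
    """Threshold in (7): beta = r*sigma*(1 - phi)/(a_h - a_l)."""
    return r * sigma * (1.0 - phi) / (a_h - a_l)


def incentive_strength(Y, c):
    """The ratio Y_t / (-u(c_t)) that measures effort incentives."""
    return Y / (-u(c))


def deviation_gain(Y, c, r, sigma, phi, a_h, a_l):
    """Left-hand side of (6) when a_h is recommended and the worker deviates to a_l."""
    return r * (phi * u(c) - u(c)) + (a_l - a_h) * Y / sigma


# Hypothetical parameter values.
r, sigma, phi, a_h, a_l = 0.05, 0.3, 0.9, 0.02, -0.05
b = beta(r, sigma, phi, a_h, a_l)

# For a fixed sensitivity Y, higher consumption strengthens incentives.
weak = incentive_strength(0.01, 0.0)
strong = incentive_strength(0.01, 1.0)
```

With these numbers, the same sensitivity $Y_t = 0.01$ fails the high-effort IC condition at $c_t = 0$ but satisfies it at $c_t = 1$, in line with the remark that higher current consumption makes effort incentives stronger.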
We are now ready to define the contract design problem faced by a firm matched with a
worker. We will define this problem generally as a cost minimization problem in which the firm
needs to deliver some present discounted utility value W ∈ [V (y0 ), 0) to a worker whose initial
productivity is y0 . Let Σ(y0 ) denote the set of all contracts (a, c) that at all t satisfy quitting
constraints (3) and IC constraints (6). The firm’s minimum cost function $C(W, y_0)$ is defined as
$$C(W, y_0) = \min_{(a,c)\in\Sigma(y_0)} \mathbb{E}^a\left[\int_0^\infty r e^{-rt}(c_t - y_t)\,dt\right] \tag{8}$$
subject to
$$W_0 = W. \tag{9}$$

The constraint (9) is known as the promise-keeping constraint: the contract must deliver to
the worker the initial value W . In the special case of W = V (y0 ), the value −C(V (y0 ), y0 )
represents the profit the firm attains in a match with a worker of type y0 when the worker’s
outside value function is V .
Next, we define competitive equilibrium in the labor market with long-term contracts.
Definition 1 Competitive equilibrium consists of the workers’ market value function $V : \mathbb{R} \to \mathbb{R}_-$ and a collection of contracts $(a^{y_0}, c^{y_0})_{y_0 \in \mathbb{R}}$ such that, for all $y_0 \in \mathbb{R}$,
(i) $(a^{y_0}, c^{y_0})$ attains the minimum cost $C(V(y_0), y_0)$ in the firm’s problem (8)–(9),
(ii) $C(V(y_0), y_0) = 0$ and $C(W, y_0) > 0$ for any $W > V(y_0)$.
The first equilibrium condition requires that when firms assume (correctly) that the workers’ outside value is their equilibrium market value, then the equilibrium contracts are cost-minimizing (i.e., efficient) and in fact deliver to workers their market value. The second condition comes from perfect competition under free entry: profits attained by firms must be zero in equilibrium and no firm can deliver to a worker a larger value than her market value without incurring a loss.
16 Compare our IC constraint (6) with the IC constraint (21) on page 976 of Sannikov (2008). Consumption $c_t$ does not show up in the IC constraint of that model because preferences considered there are additively separable between consumption and effort.
17 This property is particularly easy to see if we interpret $\log(\phi^{-1}) > 0$ as the consumption equivalent of the utility the agent gets from shirking. Since shirking at t is equivalent to consuming $c_t + \log(\phi^{-1})$ instead of $c_t$, decreasing marginal utility of consumption implies that the gain from shirking is lower when $c_t$ is higher.


2.1 Level-independence of incentives

The following proposition shows a simple relationship between optimal contracts offered to
workers with different productivity levels. This relationship implies a particularly simple functional form for the equilibrium value function V and gives us a partial characterization of the
cost function C.
Proposition 1 If $(a^0, c^0)$ is the optimal contract for $y_0 = 0$, then, for any $y_0 \in \mathbb{R}$, the optimal contract $(a^{y_0}, c^{y_0})$ is given by
$$a^{y_0} = a^0, \tag{10}$$
$$c^{y_0} = c^0 + y_0. \tag{11}$$
The equilibrium value function V satisfies
$$V(y) = e^{-y}V(0) \quad \forall y \in \mathbb{R}. \tag{12}$$
The minimum cost function C satisfies
$$C(W, y) = C(W e^{y}, 0) \quad \forall y \in \mathbb{R},\ W < 0. \tag{13}$$

The independence of the optimal action recommendation from y0 , shown in (10), and the
additivity of the optimal compensation plan with respect to y0 , shown in (11), follow from
the independence of future productivity changes dyt from the initial condition y0 and from the
absence of wealth effects in CARA preferences. With no wealth effects, incentives needed to
induce high or low effort are the same for workers of all productivity levels. The contribution of
changes in a worker’s productivity to a firm’s revenue is also the same for all workers. Thus, the
same effort process is optimally recommended to workers of all productivity levels, and output
produced by a worker with initial productivity y0 = y > 0 is path-by-path larger by exactly y
than output produced by a worker with initial productivity y0 = 0. Competition among firms
implies then that in equilibrium the worker with y0 = y will obtain the same compensation
process as the worker with y0 = 0 plus the constant amount y at all t.
This structure of the compensation plan allows us to pin down the functional form of the
workers’ market value function V (y0 ), as given in (12). Intuitively, if a worker with y0 = 0
obtains $V(0)$ in market equilibrium, then a worker with $y_0 = y$ will obtain $e^{-y}V(0)$ because her consumption is larger by y at all t and the utility function is exponential, so $u(c_t + y) = e^{-y}u(c_t)$ at all t.
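The scaling argument behind (12) can be checked numerically. The sketch below is an illustration only: it uses a deterministic, hypothetical consumption path and a finite Riemann sum in place of the expectation in (2), and it assumes $u(c) = -e^{-c}$.

```python
import math


def u(c):
    """Exponential utility with unit risk aversion (assumed form)."""
    return -math.exp(-c)


def discounted_value(consumption, r, dt):
    """Riemann-sum approximation of the integral of r*exp(-r*s)*u(c_s) ds on a finite grid."""
    return sum(r * math.exp(-r * k * dt) * u(c) * dt
               for k, c in enumerate(consumption))


# Hypothetical deterministic consumption path and productivity shift.
path = [0.5 + 0.1 * math.sin(0.3 * k) for k in range(2000)]
y = 0.7

base = discounted_value(path, r=0.05, dt=0.05)
shifted = discounted_value([c + y for c in path], r=0.05, dt=0.05)

# Raising consumption by y at every date scales the value by exp(-y).
scaling_error = abs(shifted - math.exp(-y) * base)
```

Because every term in the sum is multiplied by the same factor $e^{-y}$, the identity holds path by path, which is why it survives taking expectations in the stochastic model.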


In addition, this structure of optimal contracts implies a particular form of homogeneity for
a firm’s minimum cost function C(W, y), as shown in (13). Suppose some contract efficiently
delivers some value W < 0 to a worker whose initial productivity y0 = y > 0 (i.e., this
contract attains C(W, y)). Then a modified contract with compensation uniformly decreased
by y will efficiently deliver value $e^{y}W < W$ to a worker whose initial productivity $y_0 = 0$ (i.e., the modified contract will attain $C(e^{y}W, 0)$). But these two contracts generate the same
cost/profit for the firm, as in the second case the worker produces less output (uniformly less
by y) and receives less compensation (also less by y).18
The scalability of the contracting problem and the implied homogeneity of the minimum cost
function greatly simplify our analysis in this paper. In order to solve for the equilibrium, it is
sufficient to find one value, V (0), and one contract that supports it, (a0 , c0 ).

2.2 Optimality of high effort

In our analysis, we will focus on the case in which the recommendation of the high-effort action
ah is optimal and therefore always used by firms in equilibrium. In the absence of information
and commitment frictions, i.e., in the first best, the firm provides full insurance to the worker,
i.e., it keeps the worker’s utility constant. To keep U (ct , at ) constant, the firm must pay higher
compensation ct if it requires the high-effort action ah , because effort is costly to the worker.
In particular, under action $a_h$ compensation must be higher by $\log(\phi^{-1})$ than under action $a_l$. High effort at t, however, increases the worker’s output permanently by $a_h - a_l$. In the absence of frictions, therefore, high effort is optimal if and only if
$$a_h - a_l \ge r\log(\phi^{-1}). \tag{14}$$

With limited commitment and moral hazard, there is an additional cost of implementing the
high-effort action ah : under high effort the firm cannot insure the worker as well as under low
effort. We will verify in Section 5 that the following modification of (14) is sufficient for the
action ah to be optimal at all times under moral hazard and one-sided commitment.

Assumption 1 Let $\kappa = \sigma^{-2}\left(\sqrt{a_h^2 + 2r\sigma^2} - a_h\right)$. We assume that
$$\frac{\kappa}{1+\kappa}\,(a_h - a_l) \ge r\log(\phi^{-1}) + \frac{1}{2}\beta\sigma. \tag{15}$$

In (15), the multiplicative factor κ/(1 + κ) represents the additional cost of implementing effort under limited commitment. High effort makes the worker’s productivity grow faster, which increases the worker’s upside risk. Because this risk is not fully insurable under limited commitment, implementing high effort becomes more costly in the presence of this friction. Similarly, βσ/2 represents the additional cost of implementing effort under moral hazard. This cost, as well, is positive because moral hazard restricts the firm’s ability to insure the worker.19
It is not hard to check that (15) holds for low enough $a_l$, so the set of parameter values satisfying Assumption 1 is nonempty. We will maintain this assumption throughout the paper.

18 Similarly, a worker with initial $y_0 = -y < 0$ will produce and receive y units less than a worker with $y_0 = 0$.
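The claim that (15) holds for low enough $a_l$ can be illustrated numerically. The sketch below is not from the paper: it encodes (14) and (15) as we read them from the extracted text, namely $\kappa = \sigma^{-2}(\sqrt{a_h^2 + 2r\sigma^2} - a_h)$, $\beta = r\sigma(1-\phi)/(a_h - a_l)$, and a right-hand side $r\log(\phi^{-1}) + \beta\sigma/2$ in (15). All parameter values are hypothetical.

```python
import math


def kappa(r, sigma, a_h):
    """kappa from Assumption 1; the positive root of 0.5*sigma^2*k^2 + a_h*k - r = 0."""
    return (math.sqrt(a_h ** 2 + 2.0 * r * sigma ** 2) - a_h) / sigma ** 2


def beta(r, sigma, phi, a_h, a_l):
    """IC threshold beta = r*sigma*(1 - phi)/(a_h - a_l)."""
    return r * sigma * (1.0 - phi) / (a_h - a_l)


def first_best_condition(r, phi, a_h, a_l):
    """Condition (14): high effort is optimal absent frictions."""
    return a_h - a_l >= r * math.log(1.0 / phi)


def assumption_1(r, sigma, phi, a_h, a_l):
    """Condition (15), as reconstructed here from the extracted text."""
    k = kappa(r, sigma, a_h)
    lhs = k / (1.0 + k) * (a_h - a_l)
    rhs = r * math.log(1.0 / phi) + 0.5 * beta(r, sigma, phi, a_h, a_l) * sigma
    return lhs >= rhs


# Hypothetical parameters: (15) holds when a_l is low and fails when a_l is close to a_h.
r, sigma, phi, a_h = 0.05, 0.3, 0.9, 0.02
holds_low_al = assumption_1(r, sigma, phi, a_h, a_l=-1.0)
holds_high_al = assumption_1(r, sigma, phi, a_h, a_l=0.019)
```

Lowering $a_l$ raises the left-hand side of (15) and lowers β at the same time, so the inequality eventually holds. Since κ/(1 + κ) < 1 and βσ/2 > 0, any parameter vector satisfying (15) also satisfies (14).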

2.3 Recursive formulation

In order to find the cost function C(Wt , yt ), we will use the methods of Sannikov (2008) to study
a recursive minimization problem with control variables at , ut ≡ u(ct ), and Yt . Scalability and
homogeneity properties of Proposition 1 let us reduce the dimension of the state space in this
recursive problem. Instead of studying this problem in the two-dimensional state vector (Wt , yt ),
we can reduce the state space to a single dimension as follows. Using (13) and (12), we have
$$C(W_t, y_t) = C(W_t e^{y_t}, 0) = C\left(\frac{W_t}{e^{-y_t}V(0)}\,V(0),\ 0\right) = C\left(\frac{W_t}{V(y_t)}\,V(0),\ 0\right). \tag{16}$$

This shows that the minimum cost C(Wt , yt ) is the same for all pairs (Wt , yt ) for which the
ratio Wt /V (yt ) is the same. We will find it convenient to transform this ratio further and define
a single state variable as

$$S_t \equiv \log\left(\frac{V(y_t)}{W_t}\right). \tag{17}$$

Using St , we can express the firm’s cost function as

$$C(W_t, y_t) = C\left(\frac{W_t}{V(y_t)}\,V(0),\ 0\right) = C\left(e^{-S_t}V(0),\ 0\right) = C\left(V(S_t),\ 0\right),$$

where the first equality uses (16), the second uses (17), and the third uses (12). We will denote
C (V (·), 0) by J(·) and solve for this function in the state variable St .
To study the firm’s cost minimization problem in St , we must express not only the objective
function but also the constraints of this problem in terms of St . The IC constraint (7) is not
affected by the change of the state variable because it depends on the control variables only.
Using (17), we can express the worker’s quitting constraint (3) as
$$S_t \ge 0. \tag{18}$$

19
In Sections 3 and 4, we discuss in detail how moral hazard and limited commitment, taken separately, impair
the firm’s provision of insurance.


Thus, St measures how slack the quitting constraint is at time t. When St = 0, the quitting
constraint binds, i.e., the worker is indifferent between continuing with her current contract and
quitting.
St measures slackness in the quitting constraint in units of permanent consumption. Using the
inverse utility function $u^{-1}(x) = -\log(-x)$, $x < 0$, we can express $S_t$ as $u^{-1}(W_t) - u^{-1}(V(y_t))$.
Thus, St is the difference between the worker’s continuation value inside the contract and
her outside market value when both these values are converted to permanent compensation
equivalents. Indeed, if St = S for some S > 0, then the worker is indifferent between giving up
S units of her compensation forever and separating from the firm.20
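The interpretation of $S_t$ as slack measured in permanent consumption units can be verified with a few lines of arithmetic. The sketch below is illustrative; the values of $W_t$ and $V(y_t)$ are hypothetical and the utility function is assumed to be $u(c) = -e^{-c}$.

```python
import math


def u_inv(x):
    """Inverse of u(c) = -exp(-c): the permanent consumption level that delivers value x < 0."""
    return -math.log(-x)


def slack(W, V_y):
    """State variable from (17): S = log(V(y)/W)."""
    return math.log(V_y / W)


# Hypothetical continuation value and outside value, with W >= V(y) (both negative).
W, V_y = -0.4, -0.9
S = slack(W, V_y)

# Cutting consumption by S at every date multiplies the value by exp(S).
value_after_cut = math.exp(S) * W
```

Here `value_after_cut` equals the outside value, so a worker with slack S is exactly indifferent between giving up S units of compensation forever and quitting.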
With the worker equilibrium value function (12) substituted into (17), we can write the state
variable St as
$$S_t = -\log(-W_t) - y_t + \log(-V(0)). \tag{19}$$
Using Ito’s lemma, the law of motion for yt given in (1), and the law of motion for Wt given in
(4), we obtain the law of motion for the state variable St under high effort as
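$$dS_t = \left( r\left(-1 - \frac{u_t}{-W_t}\right) + \frac{1}{2}\left(\frac{Y_t}{-W_t}\right)^{2} - a_h \right) dt + \left(\frac{Y_t}{-W_t} - \sigma\right) dz_t^a. \tag{20}$$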

The expected change in the slackness St consists of three terms. The first term accounts for
the impact of the current utility flow ut on the permanent compensation equivalent of the
worker’s continuation value Wt . In particular, if ut > Wt , then the continuation value owed to
the worker decreases, and so does its permanent compensation equivalent − log(−Wt ). This,
ceteris paribus, reduces slackness in the quitting constraint. The second term in the drift of
$S_t$, $\frac{1}{2}\left(Y_t/(-W_t)\right)^2$, accounts for the expected increase in permanent compensation associated
with risk exposure Yt . For a given continuation value Wt to be delivered to the worker, larger
volatility Yt will require higher expected compensation (in permanent units) because the worker
is risk averse. The third term comes from the expected change in the worker’s productivity.
Faster productivity growth increases the worker’s market value, which decreases slackness in the
quitting constraint, ceteris paribus. The volatility term in (20) is simply the difference between
the risk sensitivities of $u^{-1}(W_t)$ and $u^{-1}(V(y_t))$. We will find it useful to normalize the control variables $u_t$ and $Y_t$ by the absolute value of the worker’s continuation utility. Introducing $\hat u_t \equiv \frac{u_t}{-W_t}$ and $\hat Y_t \equiv \frac{Y_t}{-W_t}$, we express (20) as
$$dS_t = \left( r(-1 - \hat u_t) + \frac{1}{2}\hat Y_t^{2} - a_h \right) dt + \left(\hat Y_t - \sigma\right) dz_t^a. \tag{21}$$
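For readers who want the intermediate step, the law of motion for $S_t$ can be obtained as follows. This is a sketch of the Ito calculation under the assumption that, with high effort recommended and followed, productivity evolves as $dy_t = a_h\,dt + \sigma\,dz_t^a$; it uses (4) and the expression for $S_t$ in (19).

```latex
% Ito's lemma applied to f(W) = -\log(-W), with f'(W) = 1/(-W) and f''(W) = 1/W^2
\begin{aligned}
d\bigl(-\log(-W_t)\bigr)
  &= \frac{1}{-W_t}\,dW_t + \frac{1}{2}\,\frac{1}{W_t^{2}}\,Y_t^{2}\,dt \\
  &= \left[ r\left(-1 - \frac{u_t}{-W_t}\right)
      + \frac{1}{2}\left(\frac{Y_t}{-W_t}\right)^{2} \right] dt
      + \frac{Y_t}{-W_t}\,dz_t^a , \\
dy_t &= a_h\,dt + \sigma\,dz_t^a , \\
dS_t &= d\bigl(-\log(-W_t)\bigr) - dy_t .
\end{aligned}
```

The second line uses $dW_t = r(W_t - u_t)\,dt + Y_t\,dz_t^a$, so that $r(W_t - u_t)/(-W_t) = r\left(-1 - u_t/(-W_t)\right)$. Subtracting $dy_t$ yields the drift and volatility stated above.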

20 To see this, note that if $S_t = S$ and $\{c_{t+s}; s \ge 0\}$ is a compensation process that gives the worker the continuation value $W_t$, then the compensation process $\{c_{t+s} - S; s \ge 0\}$ gives the worker the continuation value exactly equal to the value of her outside option, $V(y_t)$.


The Hamilton-Jacobi-Bellman (HJB) equation for the firm’s cost function J is

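$$rJ(S_t) = rS_t - r\log(-V(0)) + \min_{\hat u_t,\,\hat Y_t}\left\{ r\bigl(-\log(-\hat u_t)\bigr) + J'(S_t)\left( r(-1 - \hat u_t) + \frac{1}{2}\hat Y_t^{2} - a_h \right) + \frac{1}{2}J''(S_t)\left(\hat Y_t - \sigma\right)^{2} \right\}, \tag{22}$$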
where control variables must satisfy Ŷt ≥ −ût β to ensure incentive compatibility of the recommended high-effort action ah . Note that both the HJB equation and its solution J depend on
the value V (0), which in equilibrium is determined by the firm break-even condition J(0) = 0.
The meaning of the terms in the HJB equation is standard. It may be helpful to write the HJB
equation informally as

$$rJ(S_t) = \min\left\{ r(c_t - y_t) + J'(S_t)\,(\text{drift of } S_t) + \frac{1}{2}J''(S_t)\,(\text{volatility of } S_t)^{2} \right\}. \tag{23}$$

Intuitively, the first derivative $J'$ represents the firm’s aversion to the drift of $S_t$ because, as we see in (23), the total cost $rJ(S_t)$ increases by $J'(S_t)$ when the drift of $S_t$ increases by one unit. Similarly, the second derivative $J''$ shows how strongly the cost function will respond to an increase in the volatility of $S_t$, so in this sense it represents the firm’s volatility aversion. Also, using definitions of $S_t$ and $\hat u_t$, it is easy to verify that the first three terms on the right-hand side of (22) represent the firm’s flow cost $r(c_t - y_t)$.
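The last claim, that the first three terms of (22) add up to the flow cost $r(c_t - y_t)$, can be confirmed with a short numerical check. The sketch is illustrative: it assumes $u(c) = -e^{-c}$, uses the expression for $S_t$ in (19) and the normalization $\hat u_t = u_t/(-W_t)$, and all numbers are hypothetical.

```python
import math


def flow_cost_direct(c, y, r):
    """Flow cost r*(c - y) in original variables."""
    return r * (c - y)


def flow_cost_from_state(S, u_hat, V0, r):
    """First three terms on the right-hand side of (22): r*S - r*log(-V0) + r*(-log(-u_hat))."""
    return r * S - r * math.log(-V0) + r * (-math.log(-u_hat))


# Hypothetical values for the discount rate, V(0), consumption, productivity and W.
r, V0 = 0.05, -0.8
c, y, W = 0.6, 0.2, -0.5

u_hat = -math.exp(-c) / (-W)                    # normalized utility flow u(c)/(-W)
S = -math.log(-W) - y + math.log(-V0)           # state variable from (19)

direct = flow_cost_direct(c, y, r)
via_state = flow_cost_from_state(S, u_hat, V0, r)
```

The two numbers agree because $c_t = -\log(-\hat u_t) - \log(-W_t)$ and $-\log(-W_t) - y_t = S_t - \log(-V(0))$.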
In Section 5, we will characterize optimal long-term contracts by finding a unique solution to
the HJB equation subject to appropriate boundary and asymptotic conditions. In the next
two sections, we provide two important benchmarks by finding optimal contracts in simplified
versions of our general environment in which one of the two frictions is absent.

3 Pay-for-performance incentives in equilibrium with private information and full commitment

In this section, we will assume full commitment: not only firms but also workers have the
power to commit to never breaking the contract. As in our general model presented in the
previous section, firms match with workers and offer them long-term contracts at t = 0. At
this time, the worker can reject the offer and move to another match instantaneously. Upon
accepting a contract at t = 0, however, the worker commits to not quitting at any t > 0.
This commitment maximizes the match’s surplus as it allows firms to provide better insurance
against fluctuations in workers’ productivity relative to the case in which the workers would
not commit. In particular, it lets firms insure the upside risk to workers’ productivity. We solve
this version of our model in closed form. In equilibrium, firms provide incentives to workers by
making compensation sensitive to current on-the-job performance.
Let $\Sigma_{FC}(y_0)$ denote the set of all contracts (a, c) that at all t satisfy the IC constraint (6). The contracting problem we study in this section is identical to the cost-minimization problem in (8) but with the quitting constraint (3) removed, i.e., with the set of feasible contracts expanded from $\Sigma(y_0)$ to $\Sigma_{FC}(y_0)$. We will use $C_{FC}(W, y_0)$ to denote the minimum cost function in this problem. The reduced-form cost function $J_{FC}(S)$ is defined analogously. Note that $J_{FC}(S)$ is defined for any S, even negative. Market equilibrium is defined as in the general case but using the cost function $C_{FC}(W, y_0)$ instead of $C(W, y_0)$.
The following proposition gives a continuous-time version of standard characterization results
for optimal contracts with private information and full commitment.21
Proposition 2 In the model with full commitment, workers’ equilibrium compensation is given by
$$c_t = y_0 + \frac{\mu + a_h}{r} - \mu t + \rho\beta z_t^a, \tag{24}$$
where $0 < \rho = \left(\sqrt{1 + 4r^{-1}\beta^{2}} - 1\right)/\left(2r^{-1}\beta^{2}\right) < 1$ and $\mu = r(1 - \rho) - \frac{1}{2}\rho^{2}\beta^{2} > 0$. The sensitivity of the equilibrium continuation value $W_t$ with respect to observed performance $dz_t^a$ is
$$Y_t = -u(c_t)\,\beta \quad \text{at all } t. \tag{25}$$


Proposition 2 shows two main features of optimal compensation schemes in the model with
private information and full commitment: contemporaneous sensitivity of compensation to
performance, represented in (24) by ρβ > 0, and a negative time trend in compensation,
represented in (24) by −µ < 0. The positive contemporaneous sensitivity of compensation with
respect to the worker’s observed performance represents the standard, contract-based, “pay-for-performance” incentive for workers to exert effort. The negative trend in compensation does not
provide effort incentives by itself, but it improves the effectiveness of the pay-for-performance
incentive.
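To get a feel for the magnitudes in Proposition 2, the sketch below evaluates ρ, µ and the compensation rule (24) for hypothetical parameter values. It is an illustration rather than the authors’ computation. The check that $\mu = r(1-\rho)/2$ is our own observation: it follows from the fact that the formula for ρ is the positive root of $r^{-1}\beta^{2}\rho^{2} + \rho - 1 = 0$.

```python
import math


def rho(r, beta):
    """Sensitivity coefficient from Proposition 2."""
    x = beta ** 2 / r
    return (math.sqrt(1.0 + 4.0 * x) - 1.0) / (2.0 * x)


def mu(r, beta):
    """Downward time trend in compensation from Proposition 2."""
    p = rho(r, beta)
    return r * (1.0 - p) - 0.5 * p ** 2 * beta ** 2


def compensation(t, z_a, y0, a_h, r, beta):
    """Compensation rule (24) given cumulative performance z_a at time t."""
    m = mu(r, beta)
    return y0 + (m + a_h) / r - m * t + rho(r, beta) * beta * z_a


# Hypothetical parameter values.
r, b, a_h, y0 = 0.05, 0.1, 0.02, 0.0
p, m = rho(r, b), mu(r, b)

# Better cumulative performance raises pay; the passage of time alone lowers it.
pay_good = compensation(1.0, 0.5, y0, a_h, r, b)
pay_bad = compensation(1.0, -0.5, y0, a_h, r, b)
pay_later = compensation(2.0, 0.5, y0, a_h, r, b)
```

With these numbers ρ is about 0.85 and µ about 0.0036, so compensation loads on performance with coefficient ρβ and drifts down slowly over time.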
Sensitivity $Y_t$ in (25) shows that the IC constraint in (7) binds at all t. This means that incentives, as measured by the ratio $Y_t/(-u(c_t))$, are in equilibrium strong enough to make the recommended high-effort action $a_h$ incentive compatible but not any stronger. Incentives are costly because they reduce insurance. The equilibrium contract is efficient in holding incentives down to a necessary minimum at all times. Because this minimum does not change over time, the strength of incentives provided to the worker is always the same in this model.

21 Spear and Srivastava (1987), Thomas and Worrall (1990), and Phelan (1998) provide characterization results for optimal contracts in discrete-time models with private information and full commitment, similar to the moral hazard model with full commitment we study in this section in continuous time. Atkeson and Lucas (1995) and Sannikov (2008) characterize optimal contracts with private information assuming an exogenous lower bound on the agent’s continuation utility in, respectively, discrete- and continuous-time models.

This section shows that private information requires positive sensitivity Yt . The next section
shows that positive sensitivity Yt can arise completely independently of private information:
if workers lack commitment, their productivity shocks cannot be fully insured and, therefore,
their continuation values must remain sensitive to realizations of these shocks. Thus, in an
environment in which private information and limited commitment coexist, limited commitment
potentially could deliver the positive sensitivity Yt that private information requires. Our main
results in this paper, which we give in Section 5, consider precisely this possibility.

4 Market-based incentives in equilibrium with limited commitment and full information

In this section, we discuss the full-information version of our model. As in the general model
outlined in Section 2, firms match with workers and offer them long-term contracts at t = 0. A
worker who has accepted a contract retains the option to quit and go back to the labor market,
where she can find a new match instantaneously. Unlike in the general model, however, we will
assume in this section that workers’ actions on the job are observable, and that workers can
contractually commit to a prescribed course of action.22 The model we study in this section is
essentially a continuous-time version of the Harris and Holmstrom (1982) model of rigid wages.
This section also generalizes the optimal insurance model studied in Grochulski and Zhang
(2011), where the outside option is assumed to be autarky.
Let $\Sigma_{FI}(y_0)$ denote the set of all contracts (a, c) that at all t satisfy the quitting constraint (3). The contracting problem we study in this section is identical to the cost-minimization problem in (8) but with the IC constraint (7) removed, i.e., with the set of feasible contracts expanded from $\Sigma(y_0)$ to $\Sigma_{FI}(y_0)$. We will use $C_{FI}(W, y_0)$ to denote the minimum cost function in this problem. The reduced-form cost function $J_{FI}(S)$ is defined analogously. Market equilibrium is defined as in the general case but using the cost function $C_{FI}(W, y_0)$ instead of $C(W, y_0)$.
Proposition 3 In the model with full information, workers’ equilibrium compensation is given by
$$c_t = m_t - \psi, \tag{26}$$
where $m_t = \max_{0 \le s \le t} y_s$ and $\psi = \frac{\kappa\sigma^{2}}{2r} > 0$. The sensitivity of the continuation value $W_t$ with respect to observed performance $dz_t^a$ is
$$Y_t = -u(c_t)\,\frac{\kappa}{\kappa + 1}\,e^{-\kappa(m_t - y_t)}\,\sigma > 0. \tag{27}$$

22 In short, workers cannot be punished for quitting but can be punished for shirking on the job.


As in Grochulski and Zhang (2011), the distance between current productivity yt and the
maximum level that productivity has attained to date, mt , is a measure of slackness in the
quitting constraint. The quitting constraint binds whenever productivity attains a new to-date
maximum, i.e., when yt = mt , and is slack whenever productivity is below its to-date maximum,
i.e., when yt < mt .23
As we see in (26), equilibrium compensation follows the downward-rigid wage pattern of Harris
and Holmstrom (1982): compensation is constant unless the quitting constraint binds; when
it binds, compensation increases monotonically. Thus, compensation never decreases, and it
increases faster the faster new all-time high levels of a worker’s productivity are attained. The
mechanism behind this wage structure is the same as in Harris and Holmstrom (1982): if an
outside firm could hire the worker away from the current firm and make a positive profit, the
worker’s wage is bid up; otherwise, the wage is constant as the current employer absorbs all
downside risk in the worker’s productivity process.
Since sample paths of the productivity process are continuous, a worker has a better chance of attaining a new to-date maximum of her productivity (and thus obtaining a permanent increase in her compensation) the closer her current productivity level $y_t$ is to the current to-date maximum $m_t$. The worker’s continuation value in the contract, $W_t$, increases whenever
the chance for the next permanent increase in compensation improves. This means that Wt
increases whenever current productivity yt increases, even during time intervals in which yt
remains strictly below mt , i.e., when current consumption ct does not at all respond to changes
in yt . This everywhere-positive sensitivity of the continuation value to current performance is
shown in (27). Moreover, (27) shows that the continuation value’s performance sensitivity Yt
increases as the distance between yt and mt decreases. Thus, sensitivity Yt is larger the closer
the quitting constraint is to binding.
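The downward-rigid wage pattern in (26) and the behavior of the sensitivity in (27) are easy to trace on a sample path. The sketch below is an illustration with a short, hypothetical, hand-picked productivity path and hypothetical parameter values; κ is computed from the formula in Assumption 1 as we read it.

```python
import math


def kappa(r, sigma, a_h):
    """kappa as defined in Assumption 1 (our reading of the extracted formula)."""
    return (math.sqrt(a_h ** 2 + 2.0 * r * sigma ** 2) - a_h) / sigma ** 2


def compensation_path(y_path, r, sigma, a_h):
    """c_t = m_t - psi from (26), with m_t the running maximum of productivity."""
    psi = kappa(r, sigma, a_h) * sigma ** 2 / (2.0 * r)
    running_max, out = -math.inf, []
    for y in y_path:
        running_max = max(running_max, y)
        out.append(running_max - psi)
    return out


def sensitivity_ratio(gap, r, sigma, a_h):
    """Y_t/(-u(c_t)) implied by (27), as a function of gap = m_t - y_t."""
    k = kappa(r, sigma, a_h)
    return k / (k + 1.0) * math.exp(-k * gap) * sigma


# Hypothetical discretely sampled productivity path.
y_path = [0.0, 0.1, 0.05, -0.2, 0.1, 0.3, 0.25]
c_path = compensation_path(y_path, r=0.05, sigma=0.3, a_h=0.02)
```

Compensation stays flat while productivity is below its running maximum and steps up only when a new maximum is reached, while the normalized sensitivity rises as the gap $m_t - y_t$ shrinks.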
Sensitivity $Y_t$ is positive here and in the model discussed in the last section, but for completely different reasons. There, firms pay for performance in order to elicit effort. Here, firms can directly control workers’ effort, but face the possibility of workers quitting. When the quitting constraint becomes binding, the firm must give the worker a raise in order to retain her. This raise is the source of positive sensitivity of the continuation value to current performance at all times, even when the quitting constraint is slack. Because the source of sensitivity $Y_t$ in this section is the worker’s market option, we will call this $Y_t$ market-induced sensitivity.

23 In fact, the distance $m_t - y_t$ is isomorphic to our state variable $S_t$ with $S_t = m_t - y_t - \log\left(\kappa + 1 - e^{-\kappa(m_t - y_t)}\right) + \log(\kappa)$. Clearly, $S_t$ is strictly increasing in $m_t - y_t$, and $S_t = 0$ if and only if $m_t - y_t = 0$.
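The mapping between the distance $m_t - y_t$ and the state variable $S_t$ given in footnote 23, $S_t = m_t - y_t - \log(\kappa + 1 - e^{-\kappa(m_t - y_t)}) + \log(\kappa)$, can be tabulated directly. The sketch below is illustrative and uses a hypothetical value of κ.

```python
import math


def slack_from_gap(gap, kappa):
    """S_t as a function of gap = m_t - y_t, following the formula in footnote 23."""
    return gap - math.log(kappa + 1.0 - math.exp(-kappa * gap)) + math.log(kappa)


kappa_value = 0.855                        # hypothetical value of kappa
gaps = [0.0, 0.01, 0.1, 0.5, 1.0, 3.0]     # hypothetical distances m_t - y_t
slacks = [slack_from_gap(g, kappa_value) for g in gaps]
```

The resulting values start at zero and increase with the gap, consistent with the footnote. The map is flat at the origin (its slope at a zero gap is zero), so small gaps translate into very small values of $S_t$.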
As we see in the IC constraint (7), incentives are measured in our model by the ratio of Yt
to −u(ct ). Sensitivity Yt is therefore closely related to the notion of incentives. Despite there
being no need for incentives here, as we assume in this section that effort is observable and
contractually controllable, we should note that the contract in Proposition 3 still gives the
worker an effort incentive because the ratio Yt /(−u(ct )) is nonzero. Indeed, if the firm were
to not observe the worker’s effort for a short instant starting at time t, the worker would still
choose to supply effort at t as long as the ratio Yt /(−u(ct )) is larger than β. Thus, even without
moral hazard, an effort incentive exists here just because sensitivity Yt is positive. Since this
sensitivity is market-induced, we will call this incentive the market-based incentive.
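Corollary 1 The ratio $\frac{Y_t}{-u(c_t)}$ is strictly decreasing in $m_t - y_t$. In particular, $\frac{Y_t}{-u(c_t)} \ge \beta$ if and only if $m_t - y_t \le \delta$, where $\delta = \kappa^{-1}\log\left(\frac{\kappa}{\kappa + 1}\,\frac{\sigma}{\beta}\right) > 0$.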

This corollary shows that the equilibrium contract obtained in the full-information model formally satisfies the IC constraint (7) whenever slackness in the quitting constraint (3), as measured by mt − yt , is small. That means that the market-based incentive is strong in this
region.24 The corollary also shows that the full-information contract is not overall incentive
compatible because it fails to satisfy the IC constraint (7) when the quitting constraint is sufficiently slack. Monotonicity of Yt /(−u(ct )) means that the market-based incentive is stronger
when the quitting constraint is tighter (less slack).
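The threshold in Corollary 1 can be checked against the sensitivity formula (27). In the illustrative sketch below, the ratio is $\frac{\kappa}{\kappa+1}e^{-\kappa(m_t - y_t)}\sigma$ and δ is $\kappa^{-1}\log\left(\frac{\kappa}{\kappa+1}\frac{\sigma}{\beta}\right)$; the numerical values of κ, σ and β are hypothetical and are chosen so that δ is positive.

```python
import math


def incentive_ratio(gap, kappa, sigma):
    """Y_t/(-u(c_t)) implied by (27) at distance gap = m_t - y_t."""
    return kappa / (kappa + 1.0) * math.exp(-kappa * gap) * sigma


def delta(kappa, sigma, beta):
    """Largest gap at which the market-based incentive alone satisfies the IC constraint."""
    return math.log(kappa / (kappa + 1.0) * sigma / beta) / kappa


# Hypothetical parameter values.
kappa_value, sigma, beta_value = 0.855, 0.3, 0.02
d = delta(kappa_value, sigma, beta_value)

at_threshold = incentive_ratio(d, kappa_value, sigma)
inside = incentive_ratio(d - 0.1, kappa_value, sigma)
outside = incentive_ratio(d + 0.1, kappa_value, sigma)
```

At the threshold the ratio equals β exactly; closer to the running maximum it exceeds β, and farther away it falls short, which is the region where the full-information contract would fail the IC constraint.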
In this section, there is no need for incentives. Yet, they exist in equilibrium as a by-product
of limited commitment. In the next section, we consider the general version of our model
with both moral hazard and limited commitment, where incentives are needed. There, as here,
the market option improves with the worker’s performance, which generates a market-based
incentive. Similar to Corollary 1, the market-based incentive will be strong (sufficient to induce
high effort) when slackness in the quitting constraint is smaller than a threshold. In that region
of the state space, therefore, the equilibrium contract will rely completely on market-based
incentives and will not use pay-for-performance incentives at all.

4.1 Further properties of equilibrium with full information

Proposition 3 describes equilibrium compensation contracts in the full-information model using two state variables: $m_t$ and $y_t$. In Appendix B, we describe the equilibrium of this model in terms of the single state variable $S_t$, and characterize the cost function $J_{FI}(S_t)$. In particular, Appendix B discusses the following properties of the equilibrium expressed in terms of $S_t$. The drift and the volatility of $S_t$ are strictly decreasing in $S_t$. The possibility of violating the quitting constraint makes the firm infinitely averse to volatility in $S_t$ at $S_t = 0$, hence $J''_{FI}(0) = \infty$ and the volatility of $S_t$ at $S_t = 0$ is zero in equilibrium. Equilibrium drift in $S_t$ at $S_t = 0$ is strictly positive (i.e., zero is a reflective barrier for $S_t$). In the next section, we show that all these properties continue to hold when both private information and limited commitment are present in the model.

24 In particular, the full-information equilibrium contract does satisfy the IC constraint at the onset of every employment relationship because $m_0 - y_0 = 0 < \delta$.

5 Market-based and pay-for-performance incentives in equilibrium with both frictions

In this section, we characterize the optimal contract in our general model, where firms face
both the incentive problem and the quitting constraint.

5.1 Solving the optimal contracting problem

Standard methods for solving second-order differential equations like our HJB equation (22)
require two boundary conditions. Our problem is nonstandard. It has a semi-unbounded
domain (the positive half-line) with only one boundary condition: the second derivative of J
at the boundary St = 0 must be infinite because otherwise the quitting constraint would be
violated immediately after St becomes zero. Despite the lack of a second condition on J at
the boundary, our analysis of the full-information model suggests an asymptotic condition that
can be used to pin down the solution: the cost that the quitting constraint imposes on the
firm must become negligible when St goes to infinity because the (time-discounted) chance of
the constraint binding in the future becomes negligible when St is large. When St goes to
infinity, therefore, the cost function in the model with two frictions, J, must converge to the
cost function from the model with private information and full commitment, $J_{FC}$. In particular, first derivatives of these functions, $J'(S_t)$ and $J'_{FC}(S_t)$, must become close at large values of $S_t$.
We will use this asymptotic convergence condition to pin down the solution.25 26
25 In order to use the cost function $J_{FC}$ from the one-friction model with full commitment as a benchmark (lower bound) for J in the two-friction model, one must shift $J_{FC}$ downward by a constant to account for the fact that a lower level of utility is provided to the worker in equilibrium in the model with two frictions (the value $V(0)$ is lower in this model). It is thus more convenient to express the asymptotic convergence condition in terms of first derivatives rather than levels because a uniform vertical shift of $J_{FC}$ does not affect its derivative.
26 Appendix B discusses the cost the quitting constraint imposes on the firm in the full-information model relative to the environment with no frictions (the first best). In that model, this cost does go to zero when slackness in the quitting constraint goes to infinity: the cost function $J_{FI}$ and its derivatives converge to the first-best cost function and its derivatives, respectively.

Our analysis of the HJB equation gives the following theorem.
Theorem 1 There exists a unique solution to the HJB equation (22) satisfying the boundary
condition J″(0) = ∞ and the convergence condition lim_{St→∞} (J′(St) − J′_FC(St)) = 0. This
solution represents the true minimum cost function for the firm.
The method of proof given in Appendix A is similar to that in Sannikov (2008) with two
technical difficulties stemming from the specific boundary and convergence conditions we have.
First, our HJB equation does not satisfy the Lipschitz condition at St = 0 because J″(0) = ∞.
We overcome this difficulty by using a change of variable technique. Second, the asymptotic
condition requiring convergence of J′(St) to J′_FC(St) does not provide an actual restriction on
the boundary of the state space. We overcome this difficulty as follows. We determine a range
of possible values for the first derivative of J at St = 0 and consider a family of candidate
solutions to the HJB equation, one for each possible value of J′(0) in this range. We show that
the asymptotic condition requiring that J′(St) converge to J′_FC(St) as St → ∞ is violated by
all but one candidate solution. We then confirm that the one candidate solution that satisfies
this asymptotic condition indeed represents the true minimum cost function J.
Lastly, we verify that the recommendation of high effort is optimal at all t. Lemma A.11 in
Appendix A shows that this conclusion follows from our Assumption 1.
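The selection step described above (a family of candidate solutions indexed by the initial slope, of which exactly one satisfies the asymptotic condition) has the structure of a shooting procedure. The following Python sketch illustrates that structure on a toy problem, not on the HJB equation (22): it solves y″ = y on the half-line with y(0) = 1 and the asymptotic condition y′(x) → 0, whose exact solution is y = e^(−x) with y′(0) = −1. The toy equation, the truncation point x_max, the step size h, and the bracketing interval are all assumptions made for illustration.

```python
def shoot(slope0, x_max=10.0, h=0.01):
    """Integrate the toy ODE y'' = y from x = 0 with y(0) = 1, y'(0) = slope0.

    Uses classical fourth-order Runge-Kutta on the system (y' = v, v' = y)
    and returns y'(x_max), which diverges to +inf or -inf for every initial
    slope except the one that satisfies the asymptotic condition.
    """
    y, v = 1.0, slope0
    for _ in range(int(round(x_max / h))):
        k1y, k1v = v, y
        k2y, k2v = v + 0.5 * h * k1v, y + 0.5 * h * k1y
        k3y, k3v = v + 0.5 * h * k2v, y + 0.5 * h * k2y
        k4y, k4v = v + h * k3v, y + h * k3y
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return v


def find_initial_slope(lo=-2.0, hi=0.0, iters=60):
    """Bisect on the initial slope; assumes shoot(lo) < 0 < shoot(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if shoot(mid) > 0:   # candidate overshoots: slope too high
            hi = mid
        else:                # candidate undershoots: slope too low
            lo = mid
    return 0.5 * (lo + hi)


slope = find_initial_slope()  # close to the exact value -1
```

In the toy problem every candidate with the wrong initial slope diverges, so the sign of y′ at the truncation point tells the bisection which way to move. The paper's proof works with the analogous family indexed by J′(0), with convergence of J′ to J′_FC playing the role of the asymptotic condition.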

5.2 Structure of the equilibrium contract

Proposition 4 In the model with two frictions, there exists a unique S∗ > 0 such that the IC
constraint (7) binds whenever St ≥ S∗, but is slack whenever St < S∗. In each time interval in
which St remains strictly above 0 and below S∗, equilibrium compensation ct is constant. When
St hits zero, compensation increases monotonically.
This proposition shows that, despite moral hazard, optimal compensation in our model replicates the rigid-wage compensation structure of Harris and Holmstrom (1982), as long as slackness in the quitting constraint remains below a threshold. Intuitively, when quitting is near,
moral hazard does not matter. The worker’s exposure to her own performance risk is large
enough to guarantee her full effort. In fact, a slack IC constraint means that the worker’s
incentives are too strong (i.e., more than necessary to induce effort) in the rigid-wage region
St < S ∗ . The contract would be more efficient (i.e., of higher value to the worker) if the firm
could provide more insurance, thus weakening her incentives. Doing so, however, is impossible,
because the worker’s right to quit makes her upside performance risk impossible to insure fully.
Standard models in the principal-agent literature with private information, as in Section 3,
predict that pay should be volatile, i.e., at all times responding to the worker’s performance.
Proposition 4 shows that this is no longer true if external, market-based incentives are present.
In fact, whenever the worker’s market value is close to the value she obtains by continuing to
work for the current employer, optimal compensation is constant, i.e., completely unresponsive
to current performance of the worker, and the worker nevertheless chooses to supply effort.27
Key to this result are two facts. First, as we have seen in Section 4, when the quitting constraint
binds, the firm must increase the worker’s compensation in order to retain her. Second, when
the quitting constraint does not quite bind but is close to binding, the worker’s effort has a
strong impact on the chance that the quitting constraint becomes binding. These two facts
imply that when the worker is close to quitting, she will expend effort in order to actually hit
the quitting constraint and obtain a raise. Knowing this, the firm does not need to provide
an additional incentive via performance-dependent compensation; constant compensation is
efficient.28
When the quitting constraint is relatively slack, St > S ∗ , the IC constraint binds. This is
because the impact of the worker’s effort on her chance of hitting the quitting constraint is
smaller when the quitting constraint is more distant. The market option still gives the worker
an incentive to supply effort, but this incentive is weak (i.e., not sufficient to induce effort).
The firm must in this case supplement the market-based incentive with a contract-based pay-for-performance incentive. We study the optimal mix of these incentives in the next subsection.
In the limiting case with St → ∞, the chance of St ever returning to zero becomes negligible
and the strength of market-based incentives goes to zero.
In sum, when St remains below S ∗ the optimal contract looks exactly like the optimal contract
from the model with limited commitment and full information in Section 4. When St goes to
infinity, in contrast, the optimal contract looks like the optimal contract from the model with
private information and full commitment in Section 3.
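The three cases in Proposition 4 can be summarized as a simple classification of the state St. The Python sketch below is only an illustration of that statement; the function name and the returned labels are hypothetical, and the threshold value 0.76 is the one reported for the example in Figure 2.

```python
def incentive_regime(s, s_star):
    """Classify slackness s >= 0 in the quitting constraint (Proposition 4).

    s_star is the threshold S* above which the IC constraint binds.
    The returned labels are illustrative, not the paper's terminology.
    """
    if s < 0:
        raise ValueError("slackness in the quitting constraint cannot be negative")
    if s == 0:
        # Quitting constraint binds: the firm raises compensation to retain the worker.
        return "raise"
    if s < s_star:
        # IC constraint slack: compensation is constant, market-based incentives suffice.
        return "rigid wage"
    # IC constraint binds: the contract adds pay-for-performance incentives.
    return "pay-for-performance"


S_STAR_FIGURE_2 = 0.76  # threshold reported in the caption of Figure 2
regimes = [incentive_regime(s, S_STAR_FIGURE_2) for s in (0.0, 0.05, 0.76, 2.0)]
```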

5.3 Strength of market-based incentives

In our model, the strength of effort incentives provided to a worker at time t is measured by the ratio of Yt to −u(ct). Workers will supply effort if and only if this ratio is at least β. Proposition 4 shows that in equilibrium, the strength of incentives is only just sufficient to induce effort when the quitting constraint is relatively slack (St ≥ S∗), but is more than sufficient when the quitting constraint is relatively tight (St < S∗).

27 He (2012) obtains downward-rigid compensation as a part of an optimal contract in a moral hazard model with Poisson uncertainty, zero probability of success under shirking, and private savings. Also, in Sannikov (2008), flat compensation arises in some specifications of a moral hazard model in which the agent must be inefficiently retired at a lower bound of the set of feasible continuation values. These mechanisms are different from ours as they rely exclusively on contract-based incentives while the source of incentives giving rise to downward-rigid compensation is external in our model. In particular, while the IC constraint binds at all times (prior to termination/retirement) in these two studies, it does not always bind in our model, and compensation is downward-rigid in our equilibrium only when the IC constraint is slack.
28 Since workers are hired in our model at market value, i.e., without any slack in the quitting constraint, optimal compensation for newly hired workers is always free of pay-for-performance incentives. New workers, however, are likely to receive a raise shortly after they are hired.

[Figure 1: strength of incentives plotted against S, with the level β marked on the vertical axis and S∗ marked on the horizontal axis; the two components are labeled “market-based” and “pay-for-performance”.]
Figure 1: Composition of incentives.
We will now decompose incentives into two parts: external, market-based and direct, contract-based. Market-based incentives will be those induced by the worker’s outside option (as in
Section 4). Contract-based incentives will be those not induced by the market option (as in
Section 3). To measure the strength of market-based incentives at t, we need to compute the
ratio Yt /(−u(ct )) that the firm would optimally choose at t if limited commitment were the
only friction, i.e., as if the worker’s effort were observable (and hence controllable) by the firm
locally at t. We compute this ratio as follows. Given the optimal cost function J, we disregard
the IC constraint at t in the HJB equation (22) and use first-order conditions to obtain current
utility u(ct ) and sensitivity Yt that the firm would choose in such a relaxed problem. Denoting
the ratio of Yt to −u(ct ) from this locally relaxed problem by Ỹt /(−ũ(ct )), we have
\[
\frac{\tilde{Y}_t}{-\tilde{u}(c_t)} = \frac{\sigma J'(S_t)\, J''(S_t)}{J'(S_t) + J''(S_t)}.
\]
This ratio gives the portion of the actual Yt /(−u(ct )) that is induced by the presence of the
worker’s market option. Thus, it represents the strength of market-based incentives at t in our
model. The remainder of the actual Yt /(−u(ct )) represents contract-based incentives that the
firm must inject in order to ensure incentive compatibility of high effort at t.
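As a rough numerical illustration of this decomposition, the Python sketch below computes the market-based ratio from given values of J′(St) and J″(St), reading the displayed ratio as σJ′J″/(J′ + J″), and treats the contract-based part as the gap up to β whenever the market-based part falls short. The second step is our reading of Proposition 4 (the actual ratio equals β when the IC constraint binds and equals the market-based ratio when it is slack). The function name and all numerical inputs are hypothetical; they are not taken from the paper's parameterization.

```python
def decompose_incentives(j1, j2, sigma, beta):
    """Split incentive strength into market-based and contract-based parts.

    j1, j2 : values of J'(S_t) and J''(S_t), assumed positive (hypothetical inputs)
    sigma  : volatility parameter of the productivity process
    beta   : minimum ratio Y_t / (-u(c_t)) needed to induce effort
    """
    # Ratio the firm would choose if effort were locally observable.
    market = sigma * j1 * j2 / (j1 + j2)
    # Actual ratio: beta when the IC constraint binds, the market-based
    # ratio when it is slack (our reading of Proposition 4).
    total = max(market, beta)
    contract = total - market
    return market, contract


# Hypothetical values: market-based incentives alone are sufficient here.
strong = decompose_incentives(j1=1.0, j2=1.0, sigma=0.5, beta=0.2)
# Hypothetical values: market-based incentives fall short of beta here.
weak = decompose_incentives(j1=1.0, j2=0.25, sigma=0.5, beta=0.2)
```

In the first call the market-based ratio is 0.25, above β = 0.2, so the contract-based part is zero. In the second call the market-based ratio is 0.1 and the contract must supply the remaining 0.1.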
Figure 1 plots the ratio Ỹt/(−ũ(ct)) against St in a typically parameterized numerical example. The strength of market-based incentives decreases as the quitting constraint becomes more distant. Below S∗, market-based incentives are strong, meaning they are sufficient to induce effort, i.e., Ỹt/(−ũ(ct)) ≥ β, and contract-based incentives are zero. An implication of strong market-based incentives when St < S∗, as we have seen in Proposition 4, is that compensation is flat and workers provide effort without being compensated for current performance. Above S∗, market-based incentives are weak, i.e., not strong enough to induce worker effort, and the optimal contract supplements them with pay-for-performance incentives. This means that compensation does depend on current performance above S∗. Pay-for-performance incentives become stronger as market-based incentives become weaker when the quitting constraint becomes more slack.

[Figure 2, panel (a), Dynamics of compensation: drift of c and volatility of c plotted against S. Panel (b), Dynamics of St: drift of S and volatility of S plotted against S. S∗ is marked on the horizontal axis of both panels.]
Figure 2: Example with ah > 0. Threshold S∗ = 0.76. Stationary point for St is 0.05.

5.4 Dynamics of the equilibrium contract

Unlike the two single-friction models studied in Sections 3 and 4, the model with both frictions
does not admit a closed-form solution. In this section, therefore, we describe the dynamics of
the equilibrium contract by characterizing the drift and the volatility of compensation ct and
the state variable St . We provide a mix of analytical and numerical results in this section.
We start out by presenting in Figure 2 the drift and the volatility of ct and St computed
numerically under the parametrization of our model used earlier to produce Figure 1. In panel
(a), we can identify the region of strong market-based incentives by noting that for all St
above zero and below S ∗ the drift and the volatility of compensation are both zero, which
means that dct = 0, i.e., compensation remains constant in this region, as predicted earlier
in Proposition 4. When St goes to infinity, the impact of the quitting constraint vanishes and
optimal compensation converges to the optimal compensation from the full-commitment model,
where, by Proposition 2, the drift of ct is −µ < 0 and the volatility of ct is ρβ > 0.
In addition to these properties of compensation at low and high values of the state variable,
where market-based incentives are respectively strong and negligible, numerical analysis lets us
characterize the dynamics of ct in the intermediate region of the state space, where market-based
incentives are not strong but are not negligible either. As we see in panel (a), at all St greater
than S ∗ the volatility of compensation is increasing in St but remains smaller than its asymptotic
value of ρβ. The intuition for this follows from the monotonicity of the strength of market-based
incentives in St shown earlier in Figure 1. If at some St > S ∗ the worker’s observed performance
is positive, dz^a_t = dy_t − a_h dt > 0, then both the worker’s continuation value inside the contract
and her outside market value increase. Because the contract provides some insurance to the
worker, the outside market value increases by more than does the continuation value inside
the contract. This means that the quitting constraint becomes less slack (St decreases) and,
thus, the chance of entering the area of constant compensation (below S ∗ ) and eventually
hitting the quitting constraint (when the worker receives a raise) improves. This improvement
provides some incentive for the worker to supply effort. Therefore, even in the region of weak
market-based incentives, compensation can be less sensitive to contemporaneous performance
than what it must be in the standard principal-agent model, or in our model at St approaching
infinity, where market-based incentives are absent. Because, as shown in Figure 1, the market-based incentive is stronger at smaller St, the sensitivity of ct to performance decreases when St
decreases at all St > S ∗ .
Panel (a) of Figure 2 shows that at St = S ∗ (and, by continuity, also right above S ∗ ), the
sensitivity of compensation to observed performance is actually negative. This feature of the
optimal contract is due to the non-separability in the worker’s preferences between consumption
and leisure.29 The intuition for this is as follows. As we see in (7), higher current compensation
ct relaxes the IC constraint in our model. When the IC constraint binds, the firm saves on
incentive costs by paying higher compensation now. If the IC constraint does not bind, this
effect is absent. At the threshold point St = S ∗ , positive and negative worker productivity
shocks dzt have an asymmetric effect on the incentive benefit of high current compensation:
positive shocks decrease St and take it into the region in which the IC constraint does not bind,
where high current compensation is not needed, while negative shocks increase St and take it
into the region where the IC constraint binds, where high current compensation does have a
benefit. This produces negative sensitivity of compensation ct to innovations in zt at St = S ∗ :
a positive shock dzt > 0 will not affect ct and a negative shock dzt < 0 will increase ct .

29 In numerical examples with separable preferences that we computed, the volatility of consumption is everywhere weakly positive. It is zero at all St below S∗ and positive at all St above S∗.

In addition, panel (a) of Figure 2 shows that the drift of compensation is lower than its asymptotic value of −µ at all St above S∗, and is monotonic in St in this region. Similar to the
negative volatility of compensation, these properties of its drift are due to the fact that compensation increases when the state variable crosses the S ∗ threshold and enters the region of
weak market-based incentives. A more strongly negative drift in ct right above S ∗ helps average
out the monotonic increase in ct occurring at S ∗ as the state variable fluctuates around this
threshold level. When St grows and moves away from S ∗ , this need for a more strongly negative
drift vanishes and the drift in ct approaches −µ.
These dynamic properties of compensation are robust in the numerical experiments with the
model we conducted. The discontinuity in the drift and the volatility of ct at S ∗ can be shown
analytically, but we do not have analytical results for the monotonicity of the drift and the
volatility of ct above S ∗ .
Moving on to the dynamics of the state variable, we note in panel (b) of Figure 2 that the
volatility of St is everywhere negative and monotonically increasing toward zero as St decreases
toward the boundary St = 0. The intuition for this, which we already mentioned earlier,
follows from the fact that the optimal contract provides more insurance to a worker who is
further away from quitting. At the boundary itself, the contract cannot provide any insurance,
i.e., the volatility of the continuation value inside the contract has to match the volatility of
the worker’s outside option to ensure that the quitting constraint is not violated immediately
after the state variable hits its lower bound. The further away St is from zero, the less likely
it is that the quitting constraint becomes binding, the more the firm can insure the worker
against her productivity shocks, and, in effect, the more negative the volatility of St becomes.
Asymptotically, the volatility of St converges to its value from the model without quitting
constraints.
The negative volatility of slackness St in the quitting constraint (3) means that this constraint
can become binding only after the worker’s good performance, which is exactly opposite to
the standard moral hazard model without external incentives, e.g., Sannikov (2008). In both
models, poor performance decreases the worker’s continuation value Wt . In Sannikov (2008),
the lower bound on Wt is fixed, so when Wt decreases, the distance between Wt and its lower
bound decreases. In our model, the lower bound on Wt , V (yt ), is not fixed: it is strictly
increasing in yt . In fact, V (yt ) responds to the worker’s performance more strongly than Wt .
When performance is poor, thus, V (yt ) decreases faster than Wt , so the distance between Wt
and its lower bound increases. When performance is strong, V (yt ) increases faster than Wt ,
i.e., the lower bound “catches up” to the continuation value Wt . The closer V (yt ) approaches
Wt , the slower this catching up becomes. When slackness St in the quitting constraint is zero,
Wt and V(yt) respond to good performance in exactly the same way (St has zero volatility), so V(yt)
“pushes up” Wt but never exceeds it.

The drift of St , shown also in panel (b) of Figure 2, is positive at the boundary of the state
space St = 0 and converges to its value from the model without quitting constraints when St
goes to infinity. Clearly, the drift of St must be nonnegative at zero or else the quitting constraint
would be violated shortly after St hits zero. When St is large, the contract behaves as in the
full commitment case, which determines the value of the drift of St in this region of the state
space.
Moreover, note in Figure 2 that the drift in St at St = 0 is not only nonnegative, which is
necessary to avoid violating the quitting constraint, but is actually strictly positive. Combined
with the fact that St has zero volatility at zero, this implies that zero is a reflecting barrier for
the state variable in equilibrium. Although unable to provide insurance to the worker when
St is at its lower bound, by paying compensation ct lower than the worker’s output yt and
increasing the worker’s continuation value Wt , the firm can generate a positive drift in St at
St = 0, which allows it to provide insurance to the worker in the future. This property of our
model with market-based incentives is different from the absorbing lower bound that appears
in many dynamic moral hazard models with a fixed lower bound on the continuation utility,
e.g., in Sannikov (2008).
These properties of the drift and the volatility of the state variable hold not only in the numerical
example presented in Figure 2 but are true in our model in general. Formally, we have the
following result.
Proposition 5 Let α(St) and ζ(St) denote, respectively, the drift and the volatility of the
state variable. In the equilibrium contract, α(St) is strictly decreasing with α(0) > 0 and
lim_{St→∞} α(St) = −µ − ah, and ζ(St) is strictly decreasing with ζ(0) = 0 and
lim_{St→∞} ζ(St) = ρβ − σ.
Note that Proposition 5 implies that the volatility of St is negative at all St > 0, but the sign of the
drift in St is not pinned down. In particular, the direction in which St tends to move when it
is large depends on the sign of −µ − ah . This value represents the drift of the state variable in
the full-commitment version of our model as well as in the model with both frictions at large
St . In the example presented in Figure 2, −µ − ah < 0 and the state variable has a unique
stationary point, where its drift is zero. Because this stationary point is much smaller than S ∗
in this example, St tends to start to decrease toward zero before it reaches S ∗ , and thus it will
only infrequently leave the region of strong market-based incentives.
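Because Proposition 5 makes the drift strictly decreasing with α(0) > 0, a stationary point exists exactly when the asymptotic drift −µ − ah is negative, and it can then be located by bisection. The Python sketch below illustrates this with a hypothetical drift function. The exponential functional form and the parameter values alpha0, alpha_inf, and k are assumptions chosen only to satisfy the properties in Proposition 5; they are not the paper's solution.

```python
import math


def drift(s, alpha0=0.05, alpha_inf=-0.5, k=2.0):
    """Hypothetical drift of S_t: strictly decreasing from alpha0 > 0 at s = 0
    toward alpha_inf (standing in for -mu - a_h) as s grows."""
    return alpha_inf + (alpha0 - alpha_inf) * math.exp(-k * s)


def stationary_point(alpha, lo=0.0, hi=50.0, iters=200):
    """Return the zero of a strictly decreasing drift function on [lo, hi],
    or None when the drift does not change sign on that interval."""
    if alpha(lo) <= 0 or alpha(hi) >= 0:
        return None
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if alpha(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Negative asymptotic drift: a unique stationary point exists.
s_bar = stationary_point(drift)
# Positive asymptotic drift: the drift is positive everywhere, no stationary point.
none_case = stationary_point(lambda s: drift(s, alpha0=0.3, alpha_inf=0.1))
```

For this hypothetical drift the stationary point has the closed form ln((alpha0 − alpha_inf)/(−alpha_inf))/k, which the bisection reproduces.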
Figure 3 modifies the parametrization used in Figure 2 by using a lower value of the drift parameter of the worker’s productivity, ah < 0, resulting in a positive asymptotic value for the drift of the state variable, −µ − ah > 0. Since, by Proposition 5, the drift of St is strictly positive at zero and monotonic in St, −µ − ah > 0 means that St has in this example a positive drift everywhere in the state space. In this modified parametrization, therefore, St tends to drift out of (0, S∗). Over time, it thus becomes less and less likely that market-based incentives are strong: market-based incentives are transient in this parametrization.30 These observations lead us next to investigate more closely where the state variable tends to spend most time in equilibrium.

[Figure 3, panel (a), Dynamics of compensation: drift of c and volatility of c plotted against S. Panel (b), Dynamics of St: drift of S and volatility of S plotted against S. S∗ is marked on the horizontal axis of both panels.]
Figure 3: Example with ah < 0. Threshold S∗ = 0.18. No stationary point for St exists, i.e., the drift in St is everywhere positive.

5.5 Market-based incentives in the long run

This section provides two results. The first result gives a sufficient condition for the existence
of a stationary stochastic steady state (an invariant distribution) for the state variable St .
Theorem 2 If in the model with full commitment the drift of the state variable St is negative,
i.e., if −µ − ah < 0, then in the model with both frictions there exists an invariant distribution
for the state variable St .
This result is intuitive because a negative drift in St when St is large prevents St from diverging.
A strictly positive drift in St at zero makes the lower bound a reflecting barrier for St . These
two forces give rise to a non-degenerate stationary distribution in St in the long run.
The second result uses the stationary distribution for St to examine the fraction of time that
the optimal contract spends in the region with strong market-based incentives. Denote the
invariant distribution of St by π.
30 Panel (a) of Figure 3 shows that dynamic properties of compensation in the parametrization with low ah are qualitatively the same as those presented in panel (a) of Figure 2 for the case of high ah.

Proposition 6 lim_{ah→∞} π([S∗, ∞)) = 0.
This proposition shows that if the worker’s productivity has a sufficiently large drift under high
effort, the optimal compensation contract will be free of pay-for-performance incentives most
of the time. The argument for this result is that when ah is large, the drift of the state variable
is strongly negative at values of St strictly smaller than S ∗ . This makes events in which St
leaves the region of strong market-based incentives very rare, and thus eliminates the need for
pay-for-performance compensation incentives in equilibrium almost always.
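The mechanism behind Theorem 2 and Proposition 6 can be illustrated by simulating a diffusion reflected at zero and recording the fraction of time it spends at or above S∗. The Python sketch below uses an Euler scheme with hypothetical drift and volatility functions that share the qualitative properties in Proposition 5. The exponential functional forms, the parameter values, the horizon T, and the step dt are all assumptions; the output is a finite-horizon occupation fraction, which only approximates π([S∗, ∞)) when an invariant distribution exists.

```python
import math
import random


def fraction_above(s_star, alpha0, alpha_inf, zeta_inf,
                   k=2.0, T=50.0, dt=0.01, seed=0):
    """Simulate dS = alpha(S) dt + zeta(S) dZ, reflected at zero, and return
    the fraction of time steps with S >= s_star.

    alpha(S) decreases from alpha0 to alpha_inf and zeta(S) decreases from 0
    to zeta_inf < 0 (hypothetical exponential forms).
    """
    rng = random.Random(seed)
    steps = int(round(T / dt))
    s, above = 0.0, 0
    for _ in range(steps):
        decay = math.exp(-k * s)
        alpha = alpha_inf + (alpha0 - alpha_inf) * decay
        zeta = zeta_inf * (1.0 - decay)
        shock = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        s = max(s + alpha * dt + zeta * shock, 0.0)  # truncation at the lower bound
        if s >= s_star:
            above += 1
    return above / steps


# Strongly negative asymptotic drift (large a_h): S stays close to zero.
rare = fraction_above(s_star=0.76, alpha0=0.05, alpha_inf=-5.0, zeta_inf=-0.3)
# Positive asymptotic drift (as in Figure 3): S drifts out of (0, S*).
transient = fraction_above(s_star=0.18, alpha0=0.3, alpha_inf=0.1, zeta_inf=-0.3)
```

With a strongly negative asymptotic drift the simulated state hovers near its stationary point close to zero and essentially never reaches S∗, which is the intuition given for Proposition 6. With a positive asymptotic drift the state leaves the region of strong market-based incentives and spends most of the horizon above S∗.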

6 Two extensions

In this paper, we adopt the CARA utility function, a Brownian motion productivity process,
and one-sided limited commitment for the tractability of this framework. In particular, in this
framework we can show that the high effort recommendation is optimal everywhere, and we can
characterize the region of strong market-based incentives analytically. However, our main result
showing that market-based incentives have a strong impact on optimal compensation contracts
is not specific to the CARA-normal model with one-sided lack of commitment. In this section,
we examine robustness of our result by considering two extensions. First, we consider two-sided lack of commitment (i.e., firms can fire workers) in the CARA-normal model. Second, we
consider a model with log preferences and a geometric Brownian motion productivity process
in the one-sided lack of commitment case. The cost of departing from the CARA-normal
framework in this section is that we are only able to provide numerical solutions for these two
extensions.31

6.1 Two-sided lack of commitment

Following Phelan (1995), we assume in this section that firms can fire workers upon incurring
a deadweight cost F ≥ 0. This introduces a participation constraint on the side of the firm:
J(St) ≤ F at all t. This constraint implies that St ≤ S̄ at all t, where S̄ = J^{−1}(F). Our model
in Section 5 is a special case with F = ∞.
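Given a numerically computed increasing cost function J, the firing threshold S̄ = J^{−1}(F) can be obtained by a one-dimensional root search. The Python sketch below illustrates this with a hypothetical cost function J(S) = S + S^1.5, chosen only because it is increasing with J(0) = 0 and an infinite second derivative at zero; it is not the paper's J, and the function names are assumptions.

```python
import math


def cost(s):
    """Hypothetical increasing cost function with J(0) = 0 and J''(0) = inf."""
    return s + s ** 1.5


def firing_threshold(F, J=cost, iters=200):
    """Return S_bar = J^{-1}(F) for an increasing J with J(0) = 0, by bisection."""
    if F < 0:
        raise ValueError("the firing cost F must be nonnegative")
    if math.isinf(F):
        return math.inf  # F = infinity: the one-sided model of Section 5
    hi = 1.0
    while J(hi) < F:     # expand the bracket until J(hi) >= F
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if J(mid) <= F:
            lo = mid
        else:
            hi = mid
    return lo


thresholds = [firing_threshold(F) for F in (0.0, 0.5, 2.0, math.inf)]
```

The two limiting cases match the text: F = 0 gives S̄ = 0 and F = ∞ leaves the state unbounded above. For intermediate values the threshold increases with F.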
The numerical solutions we have obtained under various parameterizations show that market-based incentives become stronger when firm commitment becomes weaker. Figure 4 shows the equilibrium dynamics of compensation and the state variable in a typically parameterized example. In [0, S̄], there are two regions with strong market-based incentives, where compensation is constant, and one region with weak market-based incentives, where pay-for-performance incentives are used.32 In the lower region of constant compensation, as in the baseline model, the worker is motivated by the prospect of the raise that the firm must give her to keep her from quitting when St hits zero. In the upper region of constant compensation, the worker is motivated by the wage cut that she will have to accept in order to keep the firm from firing her when St reaches S̄.

31 Within the CARA-normal framework with one-sided limited commitment, our analytical results can easily be extended to the case in which the absolute risk aversion parameter in the utility function is different from one. In this paper, we keep the absolute risk aversion parameter fixed at one because considering other values would make the notation less clear without adding any insight.

[Figure 4, panel (a), Dynamics of compensation: drift of c and volatility of c plotted against S. Panel (b), Dynamics of S: drift of S and volatility of S plotted against S.]
Figure 4: Example with two-sided lack of commitment.
Like quitting, firing of workers never actually happens in equilibrium. Panel (b) of Figure 4
shows that when St approaches the firing boundary S̄, the drift of St is negative and its volatility
goes to zero. Thus, like zero, S̄ is a reflecting barrier for St .
In addition to the example presented in Figure 4, we have computed examples with different
levels of firing cost F . In these examples, we have examined the structure of equilibrium
compensation. When F decreases, S̄ = J^{−1}(F) decreases, so the interval [0, S̄] shrinks. The
middle region of that interval, where market-based incentives are weak, shrinks as well. In fact,
the middle region shrinks faster than the interval [0, S̄].
For a small enough firing cost F, the region of weak market-based incentives vanishes completely and, hence, equilibrium compensation never uses pay-for-performance incentives. In these cases, compensation is piecewise constant: ct is constant when St fluctuates inside the interval (0, S̄), ct increases when St hits zero, and ct decreases when St hits S̄. Compensation, therefore, has a “sticky wage” structure: small performance shocks do not affect the wage, but large shocks do. Also, we note that with a small enough firing cost compensation is the same as what it would be if workers’ effort were observable, i.e., the commitment friction is strong enough to completely crowd out the private information friction. Figure 5 presents one such example. In this example, F is smaller than it is in Figure 4, but greater than zero.33 As we see, the equilibrium firing threshold S̄ is smaller here than in Figure 4, but remains positive, i.e., the firm still provides insurance to the worker. Panel (a) shows that the drift and the volatility of ct are both zero everywhere inside (0, S̄). As in the previous example, we can see in panel (b) that 0 and S̄ are reflecting barriers for St.

32 Similar to the one-sided case, due to the non-separability of preferences between consumption and leisure, there is a discontinuity in the drift and in the volatility of compensation at the boundaries between the regions of strong and weak market-based incentives.

[Figure 5, panel (a), Dynamics of compensation: drift of c and volatility of c plotted against S. Panel (b), Dynamics of S: drift of S and volatility of S plotted against S.]
Figure 5: Example with two-sided lack of commitment and small firing cost F.
The lower the firing cost F , the less insurance firms provide in equilibrium. In the limiting case
with F = 0, we have S̄ = 0 and firms provide no insurance, i.e., they simply pay to workers the
output workers produce: ct = yt at all t.

6.2 Log preferences and geometric Brownian motion

We have also studied numerically a version of our model with the log utility of consumption
additively separable from the utility of leisure, with a geometric Brownian motion productivity
process, and with one-sided commitment. In that framework, the high-effort action is not
always optimal, but it is when slackness in the worker’s quitting constraint is not too large. We
have examined numerically the solution to the optimal contracting problem, and have found
that strong market-based incentives also exist in this model.34 Figure 6 shows the area of strong market-based incentives in our main CARA-normal model (panel (a)) and in a log-geometric model (panel (b)). The main conclusion of our previous analysis holds in the log-geometric framework: market-based incentives are strong when the quitting constraint is not very slack.

33 All other parameters are the same as in Figure 4.
34 Detailed solutions are available upon request.

[Figure 6, panel (a), CARA-normal, and panel (b), Log-geometric: regions plotted on axes labeled Wt and V(yt).]
Figure 6: Regions of strong market-based incentives.

7 Testable predictions

There is a large body of evidence documenting downward wage rigidity. Consistent with other
studies estimating the frequency of nominal wage changes, Gottschalk (2005) estimates the
annual probability of a wage decline to be between 4 and 5 percent, and the probability of no
wage change to be about 50 percent. That evidence would not be consistent with the career
concerns model, the pure moral hazard model, or the rigid wage model, but it can be consistent
with the predictions of our model. The career concerns model and the pure moral hazard model
do not generate wage constancy, as wages change in these models as soon as new information
carried by the worker’s observed performance becomes available. The rigid wage model of
Harris and Holmstrom (1982) does not predict any wage decreases. Our model can generate
both wage constancy and wage decreases. In particular, the proportion of wage decreases can in
our model be small relative to wage constancy or wage increases if the contract spends a large
fraction of time in the rigid-wage region with occasional excursions into the pay-for-performance
region.35 As shown in Section 5.5, this pattern can be consistent with steady-state properties
of the contract.
Our characterization of the equilibrium contract also provides testable predictions on the likelihood of the use of pay-for-performance compensation across occupations and over the life-cycle. Performance-based incentives should be more frequently observed (a) in occupations in which workers do not acquire much general, transferable human capital but only firm-specific human capital, or none, (b) when the growth of a worker’s general productivity is slower, e.g., later in the life-cycle, (c) when firing workers is costly, and (d) when workers’ past performance is harder for outsiders to observe. Gibbons and Murphy (1992), Loveman and O’Connell (1996), and Lazear (2000) provide evidence consistent with these predictions.

35 In the extension of our model studied in Section 6, wage declines can also be generated by a binding firing constraint for the firm.

8 Conclusion

In this paper, we build a model that lets us study contractual incentives jointly with external,
market-based incentives similar to career concerns. In the model, external incentives arise out
of (a) the persistent impact of effort on the worker’s productivity, and (b) one-sided commitment. We show how external incentives change the structure of the optimal long-term contract:
connecting pay to performance is only needed when performance is weak; it is not needed when
performance is strong.
When we relax the assumption of full commitment on the side of the firm and allow for firing
of workers upon paying a small firing cost, performance pay becomes completely unnecessary.
If firms can fire workers, market-based incentives are stronger because workers are motivated
not only by the prospect of a pay raise but also by the risk of being fired.
Our analysis suggests that market-based incentives exist in principal-agent relationships beyond the particular setting of our model, as long as the agent’s effort (or other action benefiting
the principal) improves the agent’s standing in the market outside the present principal-agent
relationship. For this reason, we expect that market-based incentives play an important role
in many firm-employee and, perhaps particularly so, firm-executive relationships. As well,
market-based incentives may be important in lender-borrower relationships, where the borrower’s outside option (e.g., refinancing terms) can depend on the performance of the loans she
has held in the past.
In our model, maximum worker effort is optimal at all times. The existence of market-based
incentives does not depend on this feature of the model. If less-than-maximum effort were to
be implemented, however, we expect that the impact of market-based incentives on optimal
compensation would be more complicated. In this paper, we abstract from search frictions in
the labor market and aggregate uncertainty. How they affect market-based incentives is another
potentially interesting question for future research.36
36 Rudanko (2011) studies long-term contracts in a search model with idiosyncratic and aggregate uncertainty under full information and full commitment. Cooley et al. (2013) study long-term contracts in a search model with limited commitment.


References
Atkeson, A. (1991). International lending with moral hazard and risk of repudiation. Econometrica 59 (4), 1069–1089.
Atkeson, A. and R. E. Lucas, Jr. (1995). Efficiency and equality in a simple model of efficient
unemployment insurance. Journal of Economic Theory 66 (1), 64–88.
Azariadis, C. (1975). Implicit contracts and underemployment equilibria. Journal of Political
Economy 83 (6), 1183–1202.
Baily, M. N. (1974). Wages and employment under uncertain demand. The Review of Economic
Studies 41 (1), 37–50.
Cooley, T., R. Marimon, and V. Quadrini (2013). Risky investments with limited commitment. Working paper.
Fama, E. F. (1980). Agency problems and the theory of the firm. Journal of Political Economy 88 (2), 288–307.
Federal Reserve (2011). Incentive compensation practices: A report on the horizontal review of
practices at large banking organizations. http://www.federalreserve.gov/publications/otherreports/files/incentive-compensation-practices-report-201110.pdf .
Fudenberg, D. and J. Tirole (1986). A “signal-jamming” theory of predation. The RAND
Journal of Economics 17 (3), 366–376.
Gibbons, R. and K. J. Murphy (1992). Optimal incentive contracts in the presence of career
concerns: Theory and evidence. Journal of Political Economy 100 (3), 468–505.
Gottschalk, P. (2005). Downward nominal-wage flexibility: Real or measurement error? Review
of Economics and Statistics 87 (3), 556–568.
Grochulski, B. and Y. Zhang (2011). Optimal risk sharing and borrowing constraints in a
continuous-time model with limited commitment. Journal of Economic Theory 146 (6), 2356–
2388.
Harris, M. and B. Holmstrom (1982). A theory of wage dynamics. Review of Economic Studies 49 (3), 315–333.
He, Z. (2012). Dynamic compensation contracts with private savings. Review of Financial
Studies 25 (5), 1494–1549.
He, Z., B. Wei, and J. Yu (2013). Optimal long-term contracting with learning. mimeo.
Holmstrom, B. (1982). Essays in Economics and Management in Honor of Lars Wahlbeck,
Chapter Managerial Incentive Problems - A Dynamic Perspective, pp. 210–235. Swedish
School of Economics. Reprinted in Review of Economic Studies, 66, 169-182.
Holmstrom, B. (1983). Equilibrium long-term labor contracts. The Quarterly Journal of Economics 98 (3), 23–54.

Karatzas, I. and S. Shreve (1991). Brownian Motion and Stochastic Calculus, 2nd ed. Springer-Verlag.
Krueger, D. and H. Uhlig (2006). Competitive risk sharing contracts with one-sided commitment. Journal of Monetary Economics 53 (7), 1661–1691.
Lazear, E. P. (2000). Performance pay and productivity. The American Economic Review 90 (5),
1346–1361.
Loveman, G. W. and J. O’Connell (1996). HCL America. Harvard Business Review , Case no.
9–396030.
Phelan, C. (1995). Repeated moral hazard and one-sided commitment. Journal of Economic
Theory 66 (2), 488–506.
Phelan, C. (1998). On the long run implications of repeated moral hazard. Journal of Economic
Theory 79 (2), 174–191.
Phelan, C. and R. Townsend (1991). Computing multi-period, information-constrained optima.
Review of Economic Studies 58 (5), 853–881.
Prat, J. and B. Jovanovic (2013). Dynamic contracts when agent’s quality is unknown. Theoretical Economics, forthcoming.
Rogerson, W. (1985). Repeated moral hazard. Econometrica 53 (1), 69–76.
Rudanko, L. (2011). Aggregate and idiosyncratic risk in a frictional labor market. American
Economic Review 101 (6), 2823–2843.
Sannikov, Y. (2008). A continuous-time version of the principal-agent problem. Review of
Economic Studies 75 (3), 957–984.
Spear, S. and S. Srivastava (1987). On repeated moral hazard with discounting. Review of
Economic Studies 54 (4), 599–617.
Storesletten, K., C. Telmer, and A. Yaron (2001). How important are idiosyncratic shocks?
Evidence from labor supply. American Economic Review Papers and Proceedings 91, 413–417.
Thomas, J. and T. Worrall (1988). Self-enforcing wage contracts. Review of Economic Studies 55 (4), 541–554.
Thomas, J. and T. Worrall (1990). Income fluctuations and asymmetric information: An
example of a repeated principal-agent problem. Journal of Economic Theory 51 (2), 367–
390.


Appendix A: proofs
Proof of Proposition 1
The proof proceeds in several steps.
(i) From the definition of equilibrium, we have $C(W, y) \ge 0$ for $W \ge V(y)$. This property and the quitting constraint $W_t \ge V(y_t)$ imply that the solution to the cost minimization problem (8) satisfies $E^a\left[\int_0^\infty r e^{-rs}(c_{t+s} - y_{t+s})\,ds \,\middle|\, \mathcal{F}_t\right] \ge 0$, $\forall t \ge 0$.
(ii) Let us define $\tilde{C}(W, y)$ for any $y$ and $W \ge V(y)$ as follows:
$$\tilde{C}(W, y) = \min_{(a,c)} E^a\left[\int_0^\infty r e^{-rt}(c_t - y_t)\,dt\right] \tag{28}$$
subject to
$$W_0 = W, \tag{29}$$
$$(a, c) \text{ is incentive compatible}, \tag{30}$$
$$E^a\left[\int_0^\infty r e^{-rs}(c_{t+s} - y_{t+s})\,ds \,\middle|\, \mathcal{F}_t\right] \ge 0, \quad \forall t \ge 0, \tag{31}$$
where the process {yt ; t ≥ 0} starts from the initial condition y0 = y. We now show that
C(W, y) = C̃(W, y) ∀y, ∀W ≥ V (y). Since the solution to (8) satisfies (31), C(W, y) ≥
C̃(W, y) ≥ 0. This implies C̃(V (y), y) = 0. If W > V (y), denote the contract attaining
the solution to (28) as σ̃. Define λ ≡ min{t : Wt = V (yt )}. Then a contract σ that is
equal to σ̃ on [0, λ) but switches to the market contract at λ has the same cost as σ̃, as
both the market contract and the tail of σ̃ have zero cost starting at λ. Since σ satisfies
(3), it is feasible in (8). Hence C(W, y) ≤ C(σ) = C(σ̃) = C̃(W, y).
(iii) If a contract $(a, c)$ delivers utility $W$, then $(a, c + x)$ delivers $W e^{-x}$ for any $x \in \mathbb{R}$. This is because $W = E^a\left[\int_0^\infty r e^{-rt} U(c_t, a_t)\,dt\right]$ if and only if
$$W e^{-x} = e^{-x} E^a\left[\int_0^\infty r e^{-rt} U(c_t, a_t)\,dt\right] = E^a\left[\int_0^\infty r e^{-rt} U(c_t + x, a_t)\,dt\right].$$
(iv) The incentive compatibility of $(a, c)$ is equivalent to the incentive compatibility of $(a, c + x)$. In fact, the incentive compatibility of $(a, c)$ requires that $E^a\left[\int_0^\infty U(c_t, a_t)\,dt\right] \ge E^b\left[\int_0^\infty U(c_t, b_t)\,dt\right]$ for any deviation strategy $b$, which is equivalent to
$$E^a\left[\int_0^\infty U(c_t + x, a_t)\,dt\right] \ge E^b\left[\int_0^\infty U(c_t + x, b_t)\,dt\right].$$

(v) We now verify that $\tilde{C}(W, y) = \tilde{C}(W e^{y}, 0)$. Suppose $(a, c)$ solves the problem in $\tilde{C}(W, y)$. We verify that $(a, c - y)$ is feasible in the minimization problem defining $\tilde{C}(W e^{y}, 0)$. First, parts (iii) and (iv) imply that $(a, c - y)$ delivers utility $W e^{y}$ and is incentive compatible. Second, if $\{y_t; t \ge 0\}$ starts with the initial condition $y_0 = y$ and $E^a\left[\int_0^\infty r e^{-rs}(c_{t+s} - y_{t+s})\,ds \,\middle|\, \mathcal{F}_t\right] \ge 0$, then $y' = \{y'_t; t \ge 0\}$ defined as $y'_t = y_t - y$ $\forall t$ starts with the initial condition $y'_0 = 0$, and
$$E^a\left[\int_0^\infty r e^{-rs}(c_{t+s} - y - y'_{t+s})\,ds \,\middle|\, \mathcal{F}_t\right] = E^a\left[\int_0^\infty r e^{-rs}(c_{t+s} - y - (y_{t+s} - y))\,ds \,\middle|\, \mathcal{F}_t\right] \ge 0.$$
Hence $(a, c - y)$ satisfies (31) in $\tilde{C}(W e^{y}, 0)$ and so it is a feasible contract in this minimization problem.
Feasibility of $(a, c - y)$ in this problem implies that
$$\tilde{C}(W e^{y}, 0) \le E^a\left[\int_0^\infty r e^{-rt}(c_t - y - y'_t)\,dt\right] = E^a\left[\int_0^\infty r e^{-rt}(c_t - y_t)\,dt\right] = \tilde{C}(W, y).$$
By a symmetric argument, we can show $\tilde{C}(W e^{y}, 0) \ge \tilde{C}(W, y)$. Thus, $\tilde{C}(W, y) = \tilde{C}(W e^{y}, 0)$, which by part (ii) implies (13).
(vi) To show $V(y) = V(0)e^{-y}$, suppose the equality does not hold. If $V(0)e^{-y} > V(y)$, then $0 = C(V(0), 0) = C(V(0)e^{-y}, y) > C(V(y), y) = 0$, which is a contradiction. If $V(0) < V(y)e^{y}$, then $0 = C(V(0), 0) < C(V(y)e^{y}, 0) = C(V(y), y) = 0$, which is again a contradiction.
(vii) If $(a, c)$ is optimal in the contracting problem for $C(y, V(y))$ defined in (8), then $(a, c - y)$ is optimal in the contracting problem for $C(0, V(0))$. We first show that it is feasible in this problem. Parts (iii) and (iv) imply that the candidate contract $(a, c - y)$ is incentive compatible and delivers utility $V(y)e^{y} = V(0)$. The candidate contract satisfies the quitting constraint (3) because
$$E^a\left[\int_0^\infty r e^{-rs} U(c_{t+s} - y, a_{t+s})\,ds \,\middle|\, \mathcal{F}_t\right] = \exp(y)\, E^a\left[\int_0^\infty r e^{-rs} U(c_{t+s}, a_{t+s})\,ds \,\middle|\, \mathcal{F}_t\right] \ge \exp(y) V(y_t) = V(y_t - y) = V(y'_t),$$
where, as before, the income process $y_t$ starts at $y$, and $y'_t = y_t - y$ starts at 0. Thus, the candidate contract $(a, c - y)$ satisfies quitting, IC, and promise-keeping constraints, and so it is feasible in the contracting problem in a match with a worker whose initial productivity is 0 and whose market value is $V(0)$.
Next we show that the candidate contract $(a, c - y)$ attains $0 = C(0, V(0))$, and hence is optimal in this problem:
$$E^a\left[\int_0^\infty r e^{-rt}((c_t - y) - y'_t)\,dt\right] = E^a\left[\int_0^\infty r e^{-rt}(c_t - y - (y_t - y))\,dt\right] = E^a\left[\int_0^\infty r e^{-rt}(c_t - y_t)\,dt\right] = 0.$$



Proof of Proposition 2
We first show that
$$J_{FC}(S) = J_{FC}(0) + S, \quad \text{for all } S \in \mathbb{R}. \tag{32}$$
Indeed, if an IC contract $(a, c)$ delivers to the worker initial utility $V_{FC}(0)$, then for any $S \in \mathbb{R}$ the contract $(a, c + S)$ is also IC and delivers to the worker initial utility $V_{FC}(0)\exp(-S) = V_{FC}(S)$. Hence, for any $y$, the principal’s cost function under full commitment satisfies $C_{FC}(V_{FC}(S), y) = C_{FC}(V_{FC}(0), y) + S$. Setting $y = 0$ in this equality and using the definition $J_{FC}(S) = C_{FC}(V_{FC}(S), 0)$, we obtain $J_{FC}(S) = J_{FC}(0) + S$.
Substituting (32) into the HJB equation (22) and using $J'_{FC} = 1$ and $J''_{FC} = 0$, we obtain
$$r(S_t + J_{FC}(0)) = r S_t - r\log(-V_{FC}(0)) + \min_{\hat{u}_t, \hat{Y}_t}\left\{ r(-\log(-\hat{u}_t)) + r(-1 - \hat{u}_t) + \tfrac{1}{2}\hat{Y}_t^2 - a_h \right\}.$$
Canceling $r S_t$ on both sides, we obtain a static minimization problem (controls do not change over time) determining the value of $J_{FC}(0)$:
$$r J_{FC}(0) = -r\log(-V_{FC}(0)) + \min_{\hat{u},\ \hat{Y} \ge -\hat{u}\beta}\left\{ r(-\log(-\hat{u})) + r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2 - a_h \right\}. \tag{33}$$
Since the value minimized is quadratic in $\hat{Y}$ and $-\hat{u}\beta > 0$, the IC constraint will bind and the optimal $\hat{Y} = -\hat{u}\beta$, which implies (25). The optimal $\hat{u}$ solves the convex problem
$$\min_{\hat{u}}\left\{ -r\log(-\hat{u}) + r(-1 - \hat{u}) + \tfrac{1}{2}(-\hat{u})^2\beta^2 \right\}.$$
The first-order condition of this problem is a quadratic equation in $(-\hat{u})$ given by
$$-1 + (-\hat{u}) + r^{-1}\beta^2(-\hat{u})^2 = 0, \tag{34}$$
with a single positive root
$$-\hat{u} = \frac{\sqrt{1 + 4r^{-1}\beta^2} - 1}{2r^{-1}\beta^2}.$$
This root is denoted by $\rho$ in Proposition 2. Because $0 < \frac{\sqrt{1+4x} - 1}{2x} < 1$ for all $x > 0$, we have that $0 < \rho < 1$. Substituting $-\hat{u} = \rho$ and $\hat{Y} = \rho\beta$ into (33) yields
$$r J_{FC}(0) = -r\log(-V_{FC}(0)) - r\log(\rho) + r(-1 + \rho) + \tfrac{1}{2}\rho^2\beta^2 - a_h. \tag{35}$$

To confirm that high effort is always optimal, note that the value of $J_{FC}(0)$ under low effort would be determined by
$$r J_{FC}(0) = -r\log(-V_{FC}(0)) + \min_{\hat{u},\ \hat{Y} \le -\hat{u}\beta}\left\{ r(-\log(-\hat{u})) + r(-1 - \hat{u}\phi) + \tfrac{1}{2}\hat{Y}^2 - a_l \right\}, \tag{36}$$
where the optimal $\hat{Y} = 0$ and the optimal $\hat{u}$ solves $\min_{\hat{u}}\{-r\log(-\hat{u}) + r(-1 - \hat{u}\phi)\}$, which has a unique solution $-\hat{u} = \phi^{-1}$. This implies that $J_{FC}(0)$ under low effort would be
$$r J_{FC}(0) = -r\log(-V_{FC}(0)) - r\log(\phi^{-1}) - a_l.$$
Thus, high effort is optimal if and only if $-r\log(\phi^{-1}) - a_l \ge -r\log(\rho) + r(-1 + \rho) + \tfrac{1}{2}\rho^2\beta^2 - a_h$.
To prove this inequality, note that Assumption 1 implies $\beta < \sigma$ and
$$a_h - a_l - r\log(\phi^{-1}) \ge \tfrac{1}{2}\beta\sigma \ge \tfrac{1}{2}\beta^2 = \left[-r\log(-\hat{u}) + r(-1 - \hat{u}) + \tfrac{1}{2}(-\hat{u})^2\beta^2\right]_{\hat{u} = -1} \ge -r\log(\rho) + r(-1 + \rho) + \tfrac{1}{2}\rho^2\beta^2.$$
We next show (24). From $u(c_t)/W_t = -\hat{u} = \rho$, we have $-\exp(-c_t) = W_t\rho$, which gives us that $dc_t = -d\log(-W_t) = d(S_t + y_t)$. Recalling (21), or using Ito’s lemma again, we have
$$dc_t = \left(r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2\right)dt + \hat{Y}\,dz_t^a = -\left(r(1 - \rho) - \tfrac{1}{2}(\rho\beta)^2\right)dt + \rho\beta\,dz_t^a = -\mu\,dt + \rho\beta\,dz_t^a,$$
where the second equality uses the optimal controls $-\hat{u} = \rho$ and $\hat{Y} = \rho\beta$, and the third uses the definition of $\mu$ in Proposition 2. To see that $\mu > 0$, note that $r(1 - \rho) - \tfrac{1}{2}(\rho\beta)^2 > r(1 - \rho) - (\rho\beta)^2 = 0$, where the equality follows from (34). To obtain initial consumption $c_0$, note that $J_{FC}(0) = 0$ in equilibrium. This and (35) imply that
$$r(\log(-V_{FC}(0)) + \log(\rho)) = r(-1 + \rho) + \tfrac{1}{2}\rho^2\beta^2 - a_h = -\mu - a_h.$$
From $-\exp(-c_0) = W_0\rho = V_{FC}(y_0)\rho = V_{FC}(0)e^{-y_0}\rho$ we have
$$c_0 = y_0 - (\log(-V_{FC}(0)) + \log(\rho)) = y_0 + \frac{\mu + a_h}{r}.$$
Solving $dc_t = -\mu\,dt + \rho\beta\,dz_t^a$ with this initial condition yields (24).
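The closed-form root of (34) and the sign of the drift can be checked numerically. The following is a minimal sketch, not part of the paper: the values of `r` and `beta` are hypothetical, and `mu` is taken to be $r(1-\rho) - \tfrac{1}{2}(\rho\beta)^2$ as it appears in the derivation of $dc_t$ above.

```python
import math

def rho(r: float, beta: float) -> float:
    """Positive root of -1 + v + (beta**2 / r) * v**2 = 0, i.e. (34) with v = -u_hat."""
    x = beta ** 2 / r
    return (math.sqrt(1.0 + 4.0 * x) - 1.0) / (2.0 * x)

def mu(r: float, beta: float) -> float:
    """Drift magnitude r(1 - rho) - (rho * beta)**2 / 2 from dc_t = -mu dt + rho*beta dz_t."""
    p = rho(r, beta)
    return r * (1.0 - p) - 0.5 * (p * beta) ** 2

# Hypothetical parameter values, for illustration only.
r, beta = 0.05, 0.3
p = rho(r, beta)
residual = -1.0 + p + (beta ** 2 / r) * p ** 2   # left-hand side of (34) at the root
print(p, residual, mu(r, beta))
```

Because (34) gives $r(1-\rho) = (\rho\beta)^2$, the drift reduces to $\mu = \tfrac{1}{2}(\rho\beta)^2$, which is positive for any $\beta > 0$.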



Proof of Proposition 3
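We know from Grochulski and Zhang (2011, equations (11) and (12), page 2365) that the optimal compensation at time $t$ is given by $c_t = u^{-1}(\bar{u}(m_t))$, where $\bar{u}$ is a strictly increasing function given by
$$\bar{u}(y) = V_{FI}(y) - \frac{V'_{FI}(y)}{\lim_{\varepsilon \downarrow 0} \frac{d}{d\varepsilon}\left(1 - E_y^{a_h}[e^{-r\tau_{y+\varepsilon}}]\right)},$$
with $\tau_{y+\varepsilon}$ denoting the hitting time of the level $y + \varepsilon$. Because $y_t$ is a Brownian motion with drift $a_h$, we know that $E_y^{a_h}[e^{-r\tau_{y+\varepsilon}}] = e^{-\kappa\varepsilon}$, where $\kappa$ is defined in Assumption 1.37 Thus, $\lim_{\varepsilon \downarrow 0} \frac{d}{d\varepsilon}\left(1 - E_y^{a_h}[e^{-r\tau_{y+\varepsilon}}]\right) = \kappa$ and the function $\bar{u}$ simplifies to
$$\bar{u}(y) = V_{FI}(y) - \frac{-V_{FI}(y)}{\kappa} = \left(1 + \frac{1}{\kappa}\right)V_{FI}(y) = u(y)\left(1 + \frac{1}{\kappa}\right)(-V_{FI}(0)).$$
Therefore, optimal compensation satisfies $u(c_t) = u(m_t)\left(1 + \frac{1}{\kappa}\right)(-V_{FI}(0))$. Applying $u^{-1}$ to both sides, we obtain $c_t = m_t - \log\left(1 + \frac{1}{\kappa}\right) - \log(-V_{FI}(0))$. From $J_{FI}(0) = 0$ we compute $\log(-V_{FI}(0)) = \log\frac{\kappa}{\kappa+1} + \frac{\kappa\sigma^2}{2r}$, which gives us (26).
As in Grochulski and Zhang (2011), the worker’s continuation value process satisfies
$$W_t = \left(1 - e^{-\kappa(m_t - y_t)}\right)\bar{u}(m_t) + e^{-\kappa(m_t - y_t)} V_{FI}(m_t) = \left(1 + \frac{1 - e^{-\kappa(m_t - y_t)}}{\kappa}\right)V_{FI}(m_t),$$
from which we can compute the volatility of $W_t$ as $-V_{FI}(m_t)e^{-\kappa(m_t - y_t)}\sigma$, which, with $u(c_t) = V_{FI}(m_t)\left(1 + \frac{1}{\kappa}\right)$, gives us (27).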


Proof of Corollary 1
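The proof follows immediately from (27). We only need to check that $\delta > 0$, or $\frac{\kappa\sigma}{\kappa+1} > \beta$. Indeed,
$$\frac{\kappa\sigma}{\kappa+1} > \frac{r\sigma\log(\phi^{-1})}{a_h - a_l} > \frac{r\sigma(1 - \phi)}{a_h - a_l} = \beta,$$
where the first inequality follows from Assumption 1.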



Preliminary analysis of the HJB equation
Below, we will often use $\hat{u}$, $\hat{Y}$, $J'$ and $J''$ as shorthand notations for $\hat{u}(S)$, $\hat{Y}(S)$, $J'(S)$ and $J''(S)$, respectively.
Lemma A.1 The IC constraint is slack if and only if $\frac{\sigma J' J''}{J' + J''} > \beta$. When it is slack,
$$\hat{Y} = \frac{\sigma J''}{J' + J''}, \tag{37}$$
$$\hat{u} = -J'^{-1}. \tag{38}$$

37 This expression is different in Grochulski and Zhang (2011), i.e., $E_y^{a_h}\left[e^{-r\tau_{y+\varepsilon}}\right] = \left(\frac{y+\varepsilon}{y}\right)^{-\kappa}$, because the income process considered there is a geometric Brownian motion.


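Proof The first-order conditions for $\hat{Y}$ and $\hat{u}$ are
$$\hat{Y} \ge \frac{\sigma J''}{J' + J''}, \qquad \hat{u} \ge -J'^{-1},$$
with equalities if the IC constraint is slack. Thus, if the IC constraint is slack, then (37) and (38) hold, and $\hat{Y} > -\hat{u}\beta$ implies $\beta < \frac{\hat{Y}}{-\hat{u}} = \frac{\sigma J' J''}{J' + J''}$. If the IC constraint binds, then $\hat{Y} \ge \frac{\sigma J''}{J' + J''}$ and $\hat{u} \ge -J'^{-1}$, and thus $\hat{Y} = -\hat{u}\beta$ implies $\beta = \frac{\hat{Y}}{-\hat{u}} \ge \frac{\sigma J' J''}{J' + J''}$.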
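The first-order conditions in Lemma A.1 can be illustrated numerically. The sketch below is not from the paper: the objective is the minimand of the HJB equation with the terms that do not depend on the controls dropped, and all parameter values (`Jp`, `Jpp`, `r`, `a_h`, `sigma`, `beta`) are hypothetical.

```python
import math

def objective(u_hat, y_hat, Jp, Jpp, r, a_h, sigma):
    """Control-dependent part of the HJB minimand for given scalars J' (Jp) and J'' (Jpp)."""
    return (-r * math.log(-u_hat)
            + Jp * (r * (-1.0 - u_hat) + 0.5 * y_hat ** 2 - a_h)
            + 0.5 * Jpp * (y_hat - sigma) ** 2)

def unconstrained_controls(Jp, Jpp, sigma):
    """Candidate optimum (38) and (37) when the IC constraint is slack."""
    return -1.0 / Jp, sigma * Jpp / (Jp + Jpp)

def ic_is_slack(Jp, Jpp, sigma, beta):
    """Slackness test from Lemma A.1: sigma J' J'' / (J' + J'') > beta."""
    return sigma * Jp * Jpp / (Jp + Jpp) > beta

# Hypothetical values, for illustration only.
Jp, Jpp, r, a_h, sigma, beta = 0.8, 2.0, 0.05, 0.10, 0.40, 0.20
u_star, y_star = unconstrained_controls(Jp, Jpp, sigma)
f_star = objective(u_star, y_star, Jp, Jpp, r, a_h, sigma)
print(u_star, y_star, ic_is_slack(Jp, Jpp, sigma, beta))
```

With these values the slackness test agrees with checking $\hat{Y} > -\hat{u}\beta$ directly at the unconstrained optimum, which is the equivalence the proof establishes.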



Let $H(S, J', J'')$ denote the right-hand side of the HJB equation (22), that is
$$H(S, J', J'') \equiv \min_{\hat{u},\ \hat{Y},\ \hat{Y} \ge -\hat{u}\beta}\left\{ r(S - \log(-V(0)) - \log(-\hat{u})) + J'\left(r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2 - a_h\right) + \tfrac{1}{2}J''(\hat{Y} - \sigma)^2 \right\}, \tag{39}$$
where $J'$ and $J''$ are scalars. Whenever $H(S, J', J'')$ is invertible in $J''$, we may rewrite the HJB equation as a second-order ordinary differential equation (ODE)
$$J''(S) = H^{-1}(S, J, J'). \tag{40}$$
We study the invertibility of $H(S, J', \cdot)$ next.
Lemma A.2 If $J' \ge \frac{\kappa}{\kappa+1}$, then at any $J'' \in [0, \infty)$ the function $H(S, J', J'')$ is strictly increasing in $J''$, and
$$\hat{Y} < \sigma. \tag{41}$$
Proof The Envelope theorem states that $\frac{\partial H}{\partial J''} = \tfrac{1}{2}(\hat{Y} - \sigma)^2$, which implies that $H(S, J', J'')$ strictly increases in $J''$ whenever $\hat{Y} \ne \sigma$. It is then sufficient to show (41). Indeed, if the IC constraint is slack, then $\hat{Y} = \frac{\sigma J''}{J' + J''} < \sigma$. If the IC constraint binds, then
$$\hat{Y} = -\hat{u}\beta \le \beta J'^{-1} \le \frac{r(1 - \phi)\sigma}{(a_h - a_l)\frac{\kappa}{\kappa+1}} < \frac{r(1 - \phi)\sigma}{r\log(\phi^{-1})} < \sigma, \tag{42}$$
where the inequalities follow from $-\hat{u} \le J'^{-1}$, $J' \ge \frac{\kappa}{\kappa+1}$, Assumption 1, and $\frac{1 - \phi}{\log(\phi^{-1})} < 1$.

Lemma A.2 allows us to define the ODE (40) in the region
$$D \equiv \left\{ (S, J, J') \in \mathbb{R}^3 : H(S, J', 0) \le rJ \le H(S, J', \infty) \text{ and } J' \ge \frac{\kappa}{\kappa+1} \right\}.$$


Next we derive an explicit functional form for $H^{-1}(S, J, J')$ when the IC constraint is slack.
Lemma A.3 If the IC constraint is slack, then
$$J'' = \left( \frac{\sigma^2/2}{r(J - S + \log(-V(0)) - \log(J') - 1) + (r + a_h)J'} - \frac{1}{J'} \right)^{-1}. \tag{43}$$

Proof If the IC constraint is slack, substituting (37) and (38) into the HJB equation yields
$$rJ = rS - r\log(-V(0)) + r\log(J') + J'\left( r\left(-1 + \frac{1}{J'}\right) + \frac{1}{2}\left(\frac{\sigma J''}{J' + J''}\right)^2 - a_h \right) + \frac{1}{2}J''\left(\frac{\sigma J''}{J' + J''} - \sigma\right)^2,$$
which simplifies to
$$rJ = rS - r\log(-V(0)) + r\log(J') - rJ' + r - a_h J' + \frac{1}{2}\sigma^2\frac{J' J''}{J' + J''}.$$
Solving for $J''$ in the above yields (43).
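As a quick sanity check on the algebra in Lemma A.3, one can evaluate the simplified HJB equation at a given $(S, J', J'')$ to obtain $J$, and then confirm that (43) recovers the same $J''$. This is an illustrative sketch only; the function names and all numerical values are hypothetical.

```python
import math

def hjb_rhs_slack(S, Jp, Jpp, r, a_h, sigma, V0):
    """Value of r*J from the simplified HJB equation when the IC constraint is slack."""
    return (r * S - r * math.log(-V0) + r * math.log(Jp) - r * Jp + r - a_h * Jp
            + 0.5 * sigma ** 2 * Jp * Jpp / (Jp + Jpp))

def Jpp_from_43(S, J, Jp, r, a_h, sigma, V0):
    """Second derivative J'' implied by equation (43)."""
    denom = r * (J - S + math.log(-V0) - math.log(Jp) - 1.0) + (r + a_h) * Jp
    return 1.0 / ((0.5 * sigma ** 2) / denom - 1.0 / Jp)

# Hypothetical values, for illustration only.
S, Jp, Jpp, r, a_h, sigma, V0 = 0.5, 0.8, 2.0, 0.05, 0.10, 0.40, -1.0
J = hjb_rhs_slack(S, Jp, Jpp, r, a_h, sigma, V0) / r
recovered = Jpp_from_43(S, J, Jp, r, a_h, sigma, V0)
print(Jpp, recovered)
```

The round trip works because the denominator in (43) equals $\tfrac{1}{2}\sigma^2 J'J''/(J'+J'')$ along the slack-IC HJB equation.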



The next lemma studies the HJB equation at the boundary $S = 0$. Let $\alpha(S_t)$ and $\zeta(S_t)$ denote the drift and the volatility of $S_t$ given in (21) evaluated at optimal controls $\hat{u}(S_t)$ and $\hat{Y}(S_t)$.
Lemma A.4 In the model with two frictions,
(i) $\frac{\kappa}{\kappa+1} \le J'(0) \le \frac{r}{r + a_h - \frac{1}{2}\sigma^2}$ and $0 \le \alpha(0) \le \tfrac{1}{2}(\kappa + 1)\sigma^2$,
(ii) $J''(0) = \infty$ and $\zeta(0) = 0$,
(iii) the IC constraint is slack when the quitting constraint binds.
Proof That $\alpha(0) \ge 0$ and $\zeta(0) = 0$ follow from the nonnegativity of $S_t$ at all $t$. In particular, $\zeta(0) \ne 0$ would imply $S_t < 0$ shortly after $S_t = 0$ because a typical Brownian motion sample path has infinite variation. From the law of motion (20) we have that $\alpha(0) = r(-1 - \hat{u}(0)) + \tfrac{1}{2}\sigma^2 - a_h$ and $\zeta(0) = \hat{Y}(0) - \sigma$.
(i) First, we show $J'(0) \ge \frac{\kappa}{\kappa+1}$. Note that $\frac{\kappa}{\kappa+1} = J'_{FI}(0)$ by part (i) of Lemma B.1. We are thus showing here that $J'(0) \ge J'_{FI}(0)$. By contradiction, suppose $J'_{FI}(0) > J'(0)$. Then
$$\begin{aligned}
r J_{FI}(0) + r\log(-V_{FI}(0)) &= \min_{\hat{u}}\left\{ r(-\log(-\hat{u})) + J'_{FI}(0)\left(r(-1 - \hat{u}) - a_h + \tfrac{1}{2}\sigma^2\right) \right\} \\
&= -r\log\frac{\kappa+1}{\kappa} + J'_{FI}(0)\left(r\left(-1 + \frac{\kappa+1}{\kappa}\right) - a_h + \tfrac{1}{2}\sigma^2\right) \\
&> -r\log\frac{\kappa+1}{\kappa} + J'(0)\left(r\left(-1 + \frac{\kappa+1}{\kappa}\right) - a_h + \tfrac{1}{2}\sigma^2\right) \\
&\ge \min_{-\hat{u}\beta \le \sigma}\left\{ r(-\log(-\hat{u})) + J'(0)\left(r(-1 - \hat{u}) - a_h + \tfrac{1}{2}\sigma^2\right) \right\} = rJ(0) + r\log(-V(0)),
\end{aligned}$$
where the first inequality follows from $r\left(-1 + \frac{\kappa+1}{\kappa}\right) - a_h + \tfrac{1}{2}\sigma^2 > 0$, and the second inequality follows from $\frac{\kappa+1}{\kappa}\beta < \sigma$. Because $J_{FI}(0)$ is the minimum cost to deliver utility $V_{FI}(0)$ under one friction (limited commitment), the scalability in Proposition 1 implies that $J_{FI}(0) + \log(-V_{FI}(0)) - \log(-V(0))$ is the minimum cost to deliver utility $V(0)$ under one friction, which must be lower than $J(0)$, the cost to deliver the same utility $V(0)$ under two frictions. This contradicts the above inequality.
Second, since part (iii) shows that the IC constraint is slack, it follows from Lemma A.1 that $-\hat{u} = J'^{-1}$. Under this condition, $J'(0) \le \frac{r}{r + a_h - \frac{1}{2}\sigma^2}$ is equivalent to $r(-1 - \hat{u}(0)) + \tfrac{1}{2}\sigma^2 - a_h = \alpha(0) \ge 0$. Further, under $-\hat{u} = J'^{-1}$, $\frac{\kappa}{\kappa+1} \le J'(0)$ is equivalent to
$$r(-1 - \hat{u}(0)) + \tfrac{1}{2}\sigma^2 - a_h \le r\left(-1 + \frac{\kappa+1}{\kappa}\right) + \tfrac{1}{2}\sigma^2 - a_h = \tfrac{1}{2}(\kappa + 1)\sigma^2.$$
(ii) Suppose $J''(0) < \infty$ so that the assumptions of Lemma A.2 are met. But then (41) contradicts $\zeta(0) = \hat{Y}(0) - \sigma = 0$.
(iii) It follows from Assumption 1 and $J''(0) = \infty$ that
$$\beta = \frac{r\sigma(1 - \phi)}{a_h - a_l} < \frac{r\sigma\log(\phi^{-1})}{a_h - a_l} < \sigma = \frac{\sigma J' J''}{J' + J''}.$$
By Lemma A.1, thus, the IC constraint is slack when $S_t = 0$.
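The last equality in part (i) of the proof, $r\left(-1 + \frac{\kappa+1}{\kappa}\right) + \tfrac{1}{2}\sigma^2 - a_h = \tfrac{1}{2}(\kappa+1)\sigma^2$, holds exactly when $\kappa$ satisfies $\tfrac{1}{2}\sigma^2\kappa^2 + a_h\kappa - r = 0$. The sketch below assumes that $\kappa$ is the positive root of this quadratic; that is an inference from the identity above, since Assumption 1 itself is stated elsewhere in the paper. Parameter values are hypothetical.

```python
import math

def kappa(r: float, a_h: float, sigma: float) -> float:
    """Positive root of 0.5*sigma^2*k^2 + a_h*k - r = 0 (assumed form of kappa)."""
    return (-a_h + math.sqrt(a_h ** 2 + 2.0 * r * sigma ** 2)) / sigma ** 2

# Hypothetical parameter values, for illustration only.
r, a_h, sigma = 0.05, 0.10, 0.40
k = kappa(r, a_h, sigma)

lhs = r * (-1.0 + (k + 1.0) / k) + 0.5 * sigma ** 2 - a_h   # drift bound, left-hand form
rhs = 0.5 * (k + 1.0) * sigma ** 2                          # drift bound, right-hand form

lower = k / (k + 1.0)                        # lower bound on J'(0) in Lemma A.4 (i)
upper = r / (r + a_h - 0.5 * sigma ** 2)     # upper bound on J'(0) in Lemma A.4 (i)
print(k, lhs, rhs, lower, upper)
```

Under these assumptions the two bounds on $J'(0)$ are correctly ordered, so the interval in part (i) is nonempty.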


Discussion. It is useful to briefly discuss the intuition behind Lemma A.4. As in the model
with full information, the binding quitting constraint at St = 0 forces the firm to match the
volatility of the worker’s continuation value to that of her outside option, which implies that
ζ(0) = 0. This, in turn, is consistent with the firm’s cost minimization if and only if the firm is
infinitely averse to volatility in St at zero, hence J 00 (0) = ∞.
Because the firm matches the worker’s continuation value volatility to her outside value volatility, the firm provides no insurance to the worker at St = 0. Because providing insurance is only
feasible when St > 0, the firm induces a positive drift in St at St = 0 in order to be able to
provide insurance. Comparing part (i) of Lemma A.4 with part (i) of Lemma B.1, however,
we see that the firm’s aversion to drift in the state variable, represented by the first derivative
of the cost function, is larger in the two-friction model than in the full-information model.
Accordingly, the positive drift in St at zero is smaller here than in the full-information model.
This difference is due to the cost of future incentives. Part (iii) of Lemma A.4 shows that the
IC constraint is slack when the quitting constraint binds. But we know from our analysis of the
full-commitment model in Section 3 that the IC constraint binds when the quitting constraint
is completely absent. Since the equilibrium contract in the two-friction model approximates the
equilibrium contract of the full-commitment model when St is large, the IC constraint will bind
in the two-friction model at St large enough. Inducing a positive drift in St in the two-friction
model, therefore, has the disadvantage of making it more likely that the quitting constraint
becomes sufficiently slack for the IC constraint to bind. This disadvantage is absent in the
full-information model. The expected cost of future incentives, thus, makes positive drift in St
more costly to the firm in the two-friction model, which is reflected in the firm’s higher drift
aversion $J'(0) \ge J'_{FI}(0)$ and lower drift of $S_t$ at zero, as shown in part (i) of Lemma A.4.38
Closely related is the intuition for why the IC constraint is slack when the quitting constraint
binds. Corollary 1 shows that the IC constraint is slack at St = 0 in the full-information model.
This and the fact that the firm has a higher drift aversion in the two-friction model imply that
the IC constraint must also be slack at St = 0 in the two-friction model. Indeed, a smaller drift
in St at zero implies that the worker in the two-friction model receives a higher utility flow ût .
38

Although conditions in part (i) of Lemma A.4 are given as weak inequalities, we show later that they are
actually strict. Intuitively, the cost of future incentives is strictly positive because the IC constraint binds with
strictly positive probability in equilibrium.


Since the volatility of St is zero at St = 0, the normalized, market-induced sensitivity of the
worker’s continuation value, Ŷt , must equal σ in both models. With higher ût and the same Ŷt ,
the IC constraint is more slack in the two-friction model than in the full-information model.

Proof of Theorem 1
The proof is organized into three lemmas: Lemma A.5, Lemma A.9 and Lemma A.10. Three
auxiliary lemmas are also proved: Lemma A.6, Lemma A.7 and Lemma A.8.
We start out by noting that because $J''(0) = \infty$, the HJB equation at $S = 0$ reduces to
$$J(0) = -\log(-V(0)) + \log(J'(0)) + 1 - J'(0)\frac{r + a_h - \frac{1}{2}\sigma^2}{r}.$$
Treating the right-hand side of this equation as a function of $J'(0)$, denote its value by $h(J'(0))$. Lemma A.4 now implies a range of possible values for $J(0)$, $J'(0)$ and $J''(0)$ given by $(J(0), J'(0), J''(0)) = (h(J'(0)), J'(0), \infty)$ for $J'(0) \in \left[\frac{\kappa}{\kappa+1}, \frac{r}{r + a_h - \frac{1}{2}\sigma^2}\right]$. Thus, the knowledge of $J'(0)$ would be sufficient to pinpoint the values for $J(0)$ and $J''(0)$. Not knowing $J'(0)$, however, we will proceed as follows. Denote by $K(S)$ the function solving the HJB equation starting from an initial condition $K'(0) \in \left[\frac{\kappa}{\kappa+1}, \frac{r}{r + a_h - \frac{1}{2}\sigma^2}\right]$. This gives us a set of candidate solution curves $K(S)$, one for each starting value $K'(0) \in \left[\frac{\kappa}{\kappa+1}, \frac{r}{r + a_h - \frac{1}{2}\sigma^2}\right]$. The true cost function $J$ has to coincide with one of these curves. The asymptotic condition $\lim_{S \to \infty} J'(S) = 1 = J'_{FC}(S)$ will determine which of the candidate solution curves represents the true cost function $J$.
In order to carry out this program, we need to first show that the solution to the HJB equation (43) exists in the neighborhood of zero despite the fact that the HJB does not satisfy the Lipschitz condition at $S = 0$ (because $J''(0) = \infty$).
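Lemma A.5 The HJB equation has a unique candidate solution $K$ in a neighborhood of $S = 0$ with the boundary condition $(K(0), K'(0), K''(0)) = (h(K'(0)), K'(0), \infty)$ for any $K'(0) \in \left[\frac{\kappa}{\kappa+1}, B\right]$, where
$$B = \begin{cases} 1, & \text{if } a_h < \tfrac{1}{2}\sigma^2, \\ \dfrac{r}{r + a_h - \frac{1}{2}\sigma^2}, & \text{if } a_h \ge \tfrac{1}{2}\sigma^2. \end{cases}$$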

Proof Use a change of variable: define $x \equiv K'(S)$ and interpret both $S$ and $K$ as functions of $x$. Since $\frac{dS}{dx} = \frac{1}{K''(S)}$ and $\frac{dK}{dx} = \frac{dK}{dS}\frac{dS}{dx} = \frac{x}{K''(S)}$, we have the differential equation system
$$\frac{dS}{dx} = \frac{1}{2\sigma^{-2}\left(r(K - S + \log(-V(0)) - \log(x) - 1) + x(r + a_h)\right)} - \frac{1}{x},$$
$$\frac{dK}{dx} = \frac{x}{2\sigma^{-2}\left(r(K - S + \log(-V(0)) - \log(x) - 1) + x(r + a_h)\right)} - 1.$$
The solution exists and is unique in a neighborhood of $(x, S, K) = (K'(0), 0, h(K'(0)))$ because the local Lipschitz condition is satisfied. When $x$ is close to $K'(0)$, $S$ and $K$ both strictly increase in $x$ because
$$\left.\frac{dS}{dx}\right|_{x = K'(0)} = 0, \qquad \left.\frac{dK}{dx}\right|_{x = K'(0)} = \left. x\frac{dS}{dx}\right|_{x = K'(0)} = 0,$$
$$\left.\frac{d^2 S}{dx^2}\right|_{x = K'(0)} = \left.\left[\frac{2\sigma^{-2}\left(\frac{r}{x} - r - a_h\right)}{K'(0)^2} + \frac{1}{x^2}\right]\right|_{x = K'(0)} = \frac{2\sigma^{-2}\left(\frac{r}{K'(0)} - r - a_h + \frac{1}{2}\sigma^2\right)}{K'(0)^2} > 0,$$
$$\left.\frac{d^2 K}{dx^2}\right|_{x = K'(0)} = \left.\left[\frac{dS}{dx} + x\frac{d^2 S}{dx^2}\right]\right|_{x = K'(0)} > 0.$$
Because the IC constraint is slack at $S = 0$ and $\frac{K' K''}{K' + K''} = \frac{x}{x\frac{1}{K''} + 1} = \frac{x}{x S'(x) + 1}$ is a continuous function of $x$, the IC constraint remains slack in a neighborhood of $x = K'(0)$.
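The change of variable in the proof of Lemma A.5 turns the singular boundary condition $K''(0) = \infty$ into a regular initial value problem in $x = K'(S)$, which can be integrated with any standard scheme. The sketch below is not the authors' numerical procedure: it uses a textbook fourth-order Runge-Kutta step, and the parameter values, the normalization `V0 = -1`, and the initial slope `x0` are all hypothetical.

```python
import math

# Hypothetical parameter values, chosen only so that a_h >= sigma**2 / 2.
R, A_H, SIGMA, V0 = 0.05, 0.10, 0.40, -1.0

def h(x):
    """Boundary value K(0) implied by the HJB equation at S = 0 when K'(0) = x."""
    return -math.log(-V0) + math.log(x) + 1.0 - x * (R + A_H - 0.5 * SIGMA ** 2) / R

def rhs(x, S, K):
    """Right-hand sides (dS/dx, dK/dx) of the transformed system."""
    g = (2.0 / SIGMA ** 2) * (R * (K - S + math.log(-V0) - math.log(x) - 1.0) + x * (R + A_H))
    return 1.0 / g - 1.0 / x, x / g - 1.0

def integrate(x0, x1, steps=200):
    """Classical RK4 from x0 = K'(0) to x1, starting at (S, K) = (0, h(x0))."""
    S, K, x = 0.0, h(x0), x0
    dx = (x1 - x0) / steps
    for _ in range(steps):
        s1, k1 = rhs(x, S, K)
        s2, k2 = rhs(x + dx / 2, S + dx * s1 / 2, K + dx * k1 / 2)
        s3, k3 = rhs(x + dx / 2, S + dx * s2 / 2, K + dx * k2 / 2)
        s4, k4 = rhs(x + dx, S + dx * s3, K + dx * k3)
        S += dx * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        K += dx * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += dx
    return S, K

x0 = 0.5                                   # hypothetical initial slope K'(0)
S_end, K_end = integrate(x0, x0 + 0.01)
print(rhs(x0, 0.0, h(x0)))                 # both derivatives vanish at the boundary, up to rounding
print(S_end, K_end - h(x0))                # small positive increments in S and K
```

Working in $x$ rather than $S$ is what makes the starting point regular: the denominator in both equations equals $x$ at the boundary, so the right-hand sides are smooth there even though $K''(0)$ is infinite.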


We can now move on to studying global properties of candidate solutions to the HJB equation.
For a given candidate solution $K$, define
$$\bar{S} \equiv \min_S \left\{ S > 0 : K'(S) = 1 \text{ or } K''(S) = 0 \text{ or } K''(S) = \infty \right\},$$
with $\min \emptyset = \infty$.
The next three lemmas are auxiliary and will be used later.
Lemma A.6 If $\bar{S} < \infty$ and $K''(\bar{S}) = 0$, then $K'(\bar{S}) < 1$.
Proof By contradiction, suppose $K''(\bar{S}) = 0$ and $K'(\bar{S}) = 1$. Then the function $K(\cdot)$ that satisfies $K(S) = K(\bar{S}) + S - \bar{S}$ for all $S$ solves the HJB equation. This violates the condition that $K''(0) = \infty$.

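Lemma A.7 If $K$ is a candidate solution with $\bar{S} < \infty$ and $K''(\bar{S}) = \infty$, then $r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h < 0$.
Proof By contradiction, suppose $r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h \ge 0$. The HJB equation at $S = \bar{S}$ is
$$rK(\bar{S}) = r(\bar{S} - \log(-V(0)) - \log(-\hat{u}(\bar{S}))) + K'(\bar{S})\left(r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h\right).$$
Because $r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h \ge 0$ and $K'(\bar{S}) > K'(0)$,
$$\begin{aligned}
rK(\bar{S}) &\ge r(\bar{S} - \log(-V(0)) - \log(-\hat{u}(\bar{S}))) + K'(0)\left(r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h\right) \\
&\ge r\bar{S} + \min_{\hat{u}}\left\{ r(-\log(-V(0)) - \log(-\hat{u})) + K'(0)\left(r(-1 - \hat{u}) + \tfrac{1}{2}\sigma^2 - a_h\right) \right\} \\
&= r\bar{S} + rK(0),
\end{aligned}$$
where the equality follows from the HJB equation at $S = 0$. This contradicts the fact that $K'(S) < 1$ for all $S \in [0, \bar{S}]$.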

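Lemma A.8 If $K$ is a candidate solution with $\bar{S} = \infty$, then $\lim_{S \to \infty} K'(S) = 1$.
Proof Suppose by contradiction $G \equiv \lim_{S \to \infty} K'(S) \ne 1$. Since $K'(S) < 1$ for all $S$, $G < 1$. Then
$$\begin{aligned}
0 &> rK(S) - r(K(0) + GS) \\
&= \min_{\hat{u}, \hat{Y}}\left\{ r(S - \log(-V(0)) - \log(-\hat{u})) + K'(S)\left(r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2 - a_h\right) + \tfrac{1}{2}K''(S)(\hat{Y} - \sigma)^2 \right\} - r(K(0) + GS) \\
&\ge r(1 - G)S + \min_{\hat{u}}\left\{ r(-\log(-V(0)) - \log(-\hat{u})) + K'(S)\left(r(-1 - \hat{u}) - a_h\right) \right\} - rK(0) \\
&\to \infty, \quad \text{as } S \to \infty.
\end{aligned}$$
This is a contradiction.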



We now move on to two key lemmas of this proof.
Lemma A.9 There exists a unique $K'(0) \in \left(\frac{\kappa}{\kappa+1}, B\right)$ such that the candidate solution $K$ satisfies $\bar{S} = \infty$.

Proof Existence: Suppose by contradiction that all candidate solutions have S̄ < ∞. The
rest of the proof proceeds in several steps.
(i) The solution curves starting with different $K'(0)$ are ordered: higher $K'(0)$ leads to permanently higher solution curves. Suppose there are two curves $K_1$ and $K_2$ with initial conditions $K_1(0) < K_2(0)$ and $K'_1(0) < K'_2(0)$; then $K'_1(S) < K'_2(S)$ for all $S \in [0, \min\{\bar{S}_1, \bar{S}_2\}]$. If not, define $\underline{S} \equiv \min\{S : K'_1(S) = K'_2(S)\}$. Because $K'_1(S) < K'_2(S)$ for all $S < \underline{S}$, $K_1(\underline{S}) < K_2(\underline{S})$. Hence the HJB equation and $K'_1(\underline{S}) = K'_2(\underline{S})$ imply that $K''_1(\underline{S}) < K''_2(\underline{S})$, which means that $K'_1(S) > K'_2(S)$ when $\underline{S} - S > 0$ is small. This contradicts the definition of $\underline{S}$.
(ii) Define
$$U \equiv \{K'(0) : \bar{S} < \infty, \text{ either } K'(\bar{S}) = 1 \text{ or } K''(\bar{S}) = \infty\},$$
$$L \equiv \{K'(0) : \bar{S} < \infty,\ K''(\bar{S}) = 0\}.$$
It follows from Lemma A.6 that $U \cap L = \emptyset$. We show below that both $U$ and $L$ are nonempty and open, which generates a contradiction because $\left(\frac{\kappa}{\kappa+1}, B\right) = U \cup L$ is a connected set.
(iii) $U$ is open. Take a $K'(0) \in U$. We will show that there exists a $\delta > 0$ such that if $|K'_1(0) - K'(0)| \le \delta$, then $K'_1(0) \in U$. Since $K'(0) \in U$, $\bar{S} < \infty$. Two cases need to be considered: $K'(\bar{S}) = 1$, and $K''(\bar{S}) = \infty$. In the first case, because $K''(\bar{S}) > 0$, there exists a small $\varepsilon > 0$ such that $K'(\bar{S} + \varepsilon) > 1$. Because the solution of a differential equation depends continuously on its initial condition, there exists a small $\delta > 0$ such that if $|K'_1(0) - K'(0)| \le \delta$, then
$$K'_1(\bar{S} + \varepsilon) > 1, \tag{44}$$
$$\sup_{S \in [0, \bar{S} + \varepsilon]} \frac{1}{K''_1(S)} < \infty. \tag{45}$$
Inequality (44) implies $\bar{S}_1 < \bar{S} + \varepsilon$. It follows from (45) that $K''_1(\bar{S}_1) > 0$, hence $K'_1(0) \notin L$. Thus, $K'_1(0) \in U$.
In the second case, recall that the HJB equation is solved by a change of variable whenever $K''(S) = \infty$. Then
$$\left.\frac{dS}{dx}\right|_{x = K'(\bar{S})} = 0, \qquad \left.\frac{d^2 S}{dx^2}\right|_{x = K'(\bar{S})} = \frac{2\sigma^{-2}\left(r(-1 - \hat{u}) - a_h + \frac{1}{2}\sigma^2\right)}{x^2} < 0,$$
where the inequality follows from Lemma A.7. Hence there exists a small $\varepsilon > 0$ such that $\left.\frac{dS}{dx}\right|_{x = K'(\bar{S}) + \varepsilon} < 0$. Because the solution of a differential equation depends continuously on its initial condition, there exists a small $\delta > 0$ such that if $|K'_1(0) - K'(0)| \le \delta$, then
$$\left.\frac{dS_1}{dx}\right|_{x = K'(\bar{S}) + \varepsilon} < 0, \tag{46}$$
$$\sup_{S \in [0, S_1(K'(\bar{S}) + \varepsilon)]} \frac{1}{K''_1(S)} = \sup_{x \in [K'_1(0), K'(\bar{S}) + \varepsilon]} \frac{dS_1}{dx} < \infty. \tag{47}$$
Inequality (46) implies $\bar{S}_1 < S_1(K'(\bar{S}) + \varepsilon)$. It follows from (47) that $K''_1(\bar{S}_1) > 0$, hence $K'_1(0) \notin L$. Thus, $K'_1(0) \in U$.
(iv) $L$ is open. Recall from Lemma A.6 that if $K'(0) \in L$, then $K''(\bar{S}) = 0$ and $K'(\bar{S}) < 1$. Differentiating the HJB equation and applying the Envelope theorem yield39
$$0 = r + K''\left(r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2 - a_h\right) + \tfrac{1}{2}K'''(\sigma - \hat{Y})^2 - rK'.$$
Hence $K''(\bar{S}) = 0$ and $K'(\bar{S}) < 1$ imply
$$\tfrac{1}{2}K'''(\bar{S})(\sigma - \hat{Y})^2 = r(K' - 1) - K''\left(r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2 - a_h\right) < 0.$$
Therefore $K'''(\bar{S}) < 0$ and there exists a small $\varepsilon > 0$ such that $K''(\bar{S} + \varepsilon) < 0$.
Pick a small $\varepsilon_1 > 0$, such that $K'(\varepsilon_1)$ satisfies $r\left(-1 + \frac{1}{K'(\varepsilon_1)}\right) + \tfrac{1}{2}\sigma^2 - a_h > 0$. Recall that the HJB equation is solved in a neighborhood of $S = 0$ by a change of variable. For convenience, we denote the solution for an initial condition $K'_1(0)$ by $S_1(x)$ when $x \in [K'(0), K'(\varepsilon_1)]$. Because the solution of a differential equation depends continuously on its initial condition, there exists a small $\delta > 0$, such that if $|K'_1(0) - K'(0)| \le \delta$, then
$$K''_1(\bar{S} + \varepsilon) < 0, \tag{48}$$
$$\sup_{S \in [0, \bar{S} + \varepsilon]} K'_1(S) < 1, \tag{49}$$
$$\sup_{S \in [S_1(K'(\varepsilon_1)), \bar{S} + \varepsilon]} K''_1(S) < \infty. \tag{50}$$
Inequality (48) implies $\bar{S}_1 < \bar{S} + \varepsilon$. If $\bar{S}_1 \in (0, S_1(K'(\varepsilon_1))]$ (i.e., $K'_1(\bar{S}_1) \le K'(\varepsilon_1)$), because $r\left(-1 + \frac{1}{K'_1(\bar{S}_1)}\right) + \tfrac{1}{2}\sigma^2 - a_h > 0$, Lemma A.7 implies that $K''_1(\bar{S}_1) < \infty$. If $\bar{S}_1 \in [S_1(K'(\varepsilon_1)), \bar{S} + \varepsilon]$, (50) implies that $K''_1(\bar{S}_1) < \infty$. It follows from (49) and $K''_1(\bar{S}_1) < \infty$ that $K'_1(0) \notin U$. Hence $K'_1(0) \in L$ if $|K'_1(0) - K'(0)| \le \delta$.

39 The third derivative $K'''$ exists because $K''(S) = H^{-1}(S, K, K')$ and $H^{-1}$ is differentiable.

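(v) $L \ne \emptyset$. We will show that $\frac{\kappa}{\kappa+1} \in L$. By contradiction, suppose $\frac{\kappa}{\kappa+1} \in U$. That is, if $K'(0) = \frac{\kappa}{\kappa+1}$, then either $K'(\bar{S}) = 1$ or $K''(\bar{S}) = \infty$. The HJB equations for $J_{FI}$ and $K$ imply that if $J_{FI}(S) + \log(-V_{FI}(0)) \ge K(S) + \log(-V(0))$ and $J'_{FI}(S) = K'(S)$, then $J''_{FI}(S) \ge K''(S)$. Hence, the same argument as in part (i) shows that $J'_{FI}(S) \ge K'(S)$ for all $S \le \bar{S}$. It follows from $J'_{FI}(S) < 1$, $\forall S$, that $K'(\bar{S}) < 1$ and $K''(\bar{S}) = \infty$. A contradiction arises as follows.
$$\begin{aligned}
r J_{FI}(\bar{S}) + r\log(-V_{FI}(0)) &= \min_{\hat{u}, \hat{Y}}\left\{ r(\bar{S} - \log(-\hat{u})) + J'_{FI}(\bar{S})\left(r(-1 - \hat{u}) + \tfrac{1}{2}\hat{Y}^2 - a_h\right) + \tfrac{1}{2}J''_{FI}(\bar{S})(\hat{Y} - \sigma)^2 \right\} \\
&< \min_{\hat{u}}\left\{ r(\bar{S} - \log(-\hat{u})) + J'_{FI}(\bar{S})\left(r(-1 - \hat{u}) + \tfrac{1}{2}\sigma^2 - a_h\right) \right\} \\
&\le r(\bar{S} - \log(-\hat{u}(\bar{S}))) + J'_{FI}(\bar{S})\left(r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h\right),
\end{aligned}$$
where $\hat{u}(\bar{S}) = -(K'(\bar{S}))^{-1}$ is the optimal $\hat{u}$ at $\bar{S}$ for $K(\bar{S})$. Because $J'_{FI}(\bar{S}) \ge K'(\bar{S})$ and $r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h < 0$ (shown by Lemma A.7),
$$\begin{aligned}
& r(\bar{S} - \log(-\hat{u}(\bar{S}))) + J'_{FI}(\bar{S})\left(r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h\right) \\
&\quad \le r(\bar{S} - \log(-\hat{u}(\bar{S}))) + K'(\bar{S})\left(r(-1 - \hat{u}(\bar{S})) + \tfrac{1}{2}\sigma^2 - a_h\right) = rK(\bar{S}) + r\log(-V(0)),
\end{aligned}$$
which is a contradiction as $J_{FI}(0) + \log(-V_{FI}(0)) = K(0) + \log(-V(0))$ and $J'_{FI}(S) \ge K'(S)$ imply $J_{FI}(\bar{S}) + \log(-V_{FI}(0)) \ge K(\bar{S}) + \log(-V(0))$.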
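(vi) $U \neq \emptyset$. First, suppose $a_h < \frac{1}{2}\sigma^2$. If $1 - K'(0) > 0$ is sufficiently small, then $K(\cdot)$ will reach $K' = 1$. Second, suppose $a_h \ge \frac{1}{2}\sigma^2$. If $K'(0) = B$, then we show that $\left.\frac{dS}{dx}\right|_{x=B+\epsilon} < 0$ for small $\epsilon > 0$. To prove this, note that, similar to the proof in Lemma A.5,
$$\left.\frac{dS}{dx}\right|_{x=K'(0)} = 0 = \left.\frac{dK}{dx}\right|_{x=K'(0)},$$
$$\left.\frac{d^2S}{dx^2}\right|_{x=K'(0)} = \frac{2\sigma^{-2}\left(\frac{r}{K'(0)} - r - a_h + \frac{1}{2}\sigma^2\right)}{K'(0)^2} = 0,$$
$$\left.\frac{d^2K}{dx^2}\right|_{x=K'(0)} = \left.\frac{dS}{dx}\right|_{x=K'(0)} + \left. x\,\frac{d^2S}{dx^2}\right|_{x=K'(0)} = 0.$$
The Taylor expansion of $2\sigma^{-2}\left(r(K - S + \log(-V(0)) - \log(x) - 1) + x(r + a_h)\right)$ is
$$\begin{aligned}
& K'(0) + 2\sigma^{-2}\left(r\left(\frac{dK}{dx} - \frac{dS}{dx} - \frac{1}{x}\right) + (r + a_h)\right)(x - K'(0)) \\
&\quad + 2\sigma^{-2}\, r\left(\frac{d^2K}{dx^2} - \frac{d^2S}{dx^2} + \frac{1}{x^2}\right)(x - K'(0))^2 + o((x - K'(0))^2) \\
&= x + \frac{2\sigma^{-2} r}{(K'(0))^2}(x - K'(0))^2 + o((x - K'(0))^2) > x,
\end{aligned}$$
where the inequality holds when $x - K'(0) > 0$ is small, since $\frac{2\sigma^{-2} r}{(K'(0))^2} > 0$. Therefore,
$$\left.\frac{dS}{dx}\right|_{x=B+\epsilon} = \frac{1}{2\sigma^{-2}\left(r(K - S + \log(-V(0)) - \log(x) - 1) + x(r + a_h)\right)} - \frac{1}{x} < 0,$$
for small $\epsilon > 0$. Because the solution of a differential equation depends continuously on its initial condition, there exists a small $\delta > 0$, such that if the initial condition $K_1'(0) \in (B - \delta, B)$, then
$$\left.\frac{dS_1}{dx}\right|_{x=B+\epsilon} < 0, \tag{51}$$
$$\sup_{S\in[0,\,S_1(B+\epsilon)]} \frac{1}{K_1''(S)} = \sup_{x\in[K_1'(0),\,B+\epsilon]} \frac{dS_1}{dx} < \infty. \tag{52}$$
It follows from $\left.\frac{dS_1}{dx}\right|_{x=K_1'(0)} > 0$ and (51) that $\frac{dS_1}{dx} = 0$ for some $\hat S \in (0, S_1(B+\epsilon))$. Because $\frac{dS_1}{dx} = \frac{1}{K_1''(S)}$, we know that $K_1''(\hat S) = \infty$. Hence $\bar S_1 \le \hat S$ must be finite. It follows from (52) that $K_1''(\bar S_1) > 0$, hence $K_1'(0) \notin L$ and $K_1'(0) \in U$.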
Uniqueness: By contradiction, suppose there are two initial conditions $K_1'(0) < K_2'(0)$ with $\bar S_1 = \bar S_2 = \infty$. Subtracting one HJB equation from the other yields
$$\begin{aligned}
r(K_2(S) - K_1(S)) &= \min_{\hat u,\hat Y}\left\{ -r\log(-\hat u) + K_2'(S)\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}K_2''(S)(\hat Y - \sigma)^2 \right\} \\
&\quad - \min_{\hat u,\hat Y}\left\{ -r\log(-\hat u) + K_1'(S)\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}K_1''(S)(\hat Y - \sigma)^2 \right\}.
\end{aligned}$$
The left-hand side is positive at $S = 0$ and is strictly increasing with $S$, as shown in part (i) of the proof of existence. Lemma A.8 implies that $\lim_{S\to\infty} K_1'(S) = \lim_{S\to\infty} K_2'(S) = 1$. For any $\epsilon > 0$, there exists a large $S$ such that $0 < K_1''(S) + K_2''(S) < \epsilon$. Therefore, the right-hand side can be made as small as needed if $S$ is large. This is a contradiction.



Lemma A.10 The candidate solution K with S̄ = ∞ is the true cost function J.
Proof Because the technique of using the HJB equation to verify the optimality of K is
standard, we omit the details of the steps involved. We verify two things:
(i) The cost of any IC contract is weakly higher than K(S).
(ii) There exists an IC contract whose cost equals K(S).
To see (i), pick an IC contract starting at $S_0 = S \ge 0$ and consider the stochastic process $\{S_t;\ t \ge 0\}$ in this contract. Define
$$M_t \equiv \int_0^t (c_s - y_s)\, r e^{-rs}\, ds + e^{-rt} K(S_t). \tag{53}$$
The HJB equation implies that $M_t$ is a submartingale (i.e., it has a nonnegative drift), hence
$$K(S) = M_0 \le E[M_\infty] = E\left[\int_0^\infty (c_s - y_s)\, r e^{-rs}\, ds\right]. \tag{54}$$
To see (ii), construct a stochastic process $\{S_t;\ t \ge 0\}$ using $S_0 = S$ and the policy functions implied by the HJB equation for $K$. Denote the contract generated by $\{S_t;\ t \ge 0\}$ and the policy functions as $\sigma^*$. Then $M_t$ defined in (53) is a martingale, and the inequality in (54) is replaced with an equality. This shows that the cost of $\sigma^*$ is $K(S)$.


Proof of Proposition 4
We will show the existence of a unique $S^* > 0$ such that
$$\frac{\sigma J'J''}{J' + J''} \begin{cases} > \beta, & \text{if } S < S^*, \\ = \beta, & \text{if } S = S^*, \\ < \beta, & \text{if } S > S^*. \end{cases}$$
By Lemma A.1, this will show that the IC constraint is slack if and only if $S_t < S^*$.
Existence of $S^*$: We have shown in the proof of Lemma A.5 that $\frac{\sigma J'(S)J''(S)}{J'(S)+J''(S)} > \beta$ when $S$ is small. If $S^*$ does not exist, then $\frac{\sigma J'(S)J''(S)}{J'(S)+J''(S)} > \beta$ for all $S$. This implies that
$$\sigma J''(S) > \frac{\sigma J'(S)J''(S)}{J'(S) + J''(S)} \ge \beta, \quad \text{for all } S,$$
which contradicts the fact that $J'(S) < 1$ for all $S$.
Uniqueness of $S^*$: It is sufficient to show that if $\frac{\sigma J'(S^*)J''(S^*)}{J'(S^*)+J''(S^*)} = \beta$ for some $S^*$, then $\frac{\sigma J'(S)J''(S)}{J'(S)+J''(S)} > \beta$ for $S < S^*$.
First, we show that if $\frac{\sigma J'J''}{J'+J''} \ge \beta$, then
$$rJ' + a_h\frac{J'J''}{J' + J''} < r. \tag{55}$$
If $a_h \le 0$, then (55) is obvious because $J' < 1$. When $a_h > 0$, by contradiction, suppose that $\frac{\sigma J'J''}{J'+J''} \ge \beta$ and $rJ' + a_h\frac{J'J''}{J'+J''} \ge r$ at some $\hat S$. Starting from $\hat S$, solve the differential equation
$$rJ = r(S - \log(-V(0)) + \log(J')) + J'(-r - a_h) + r + \frac{\sigma^2}{2}\frac{J'J''}{J' + J''}. \tag{56}$$
Differentiating with respect to $S$ in the above yields
$$\frac{\sigma^2}{2}\frac{d}{dS}\left(\frac{J'J''}{J'+J''}\right) = rJ' - r\left(1 + \frac{J''}{J'}\right) + J''(r + a_h) = \frac{J' + J''}{J'}\left(rJ' + a_h\frac{J'J''}{J' + J''} - r\right) \ge 0. \tag{57}$$
Hence either $\frac{d}{dS}\left(\frac{J'J''}{J'+J''}\right) > 0$, or $\frac{d}{dS}\left(\frac{J'J''}{J'+J''}\right) = 0$. In the latter case, $rJ' + a_h\frac{J'J''}{J'+J''} - r = 0$ and it follows from $J'' > 0$ that $\frac{d^2}{dS^2}\left(\frac{J'J''}{J'+J''}\right) > 0$. In both cases, there exists a small $\epsilon > 0$ such that $\frac{J'J''}{J'+J''}$ is strictly increasing in $[\hat S, \hat S + \epsilon]$. Hence the solution $J$ to (56) does satisfy the HJB equation on $[\hat S, \hat S + \epsilon]$ and the IC constraint is slack. If we extend the solution beyond $\hat S + \epsilon$, $\frac{J'J''}{J'+J''}$ is always strictly increasing, because $J'$ is increasing and $a_h$ is positive in (57). Hence,
$$J'' > \frac{J'J''}{J' + J''} > \frac{J'(\hat S)J''(\hat S)}{J'(\hat S) + J''(\hat S)}, \quad \text{for all } S > \hat S,$$
contradicting the fact that $J'(S) < 1$ for all $S$.
Second, we show that $\frac{\sigma J'(S)J''(S)}{J'(S)+J''(S)} > \beta$ for all $S < S^*$. Solve the differential equation (56) backward on $[0, S^*]$. Equations (55) and (57) show that $\frac{J'(S)J''(S)}{J'(S)+J''(S)}$ is strictly decreasing in $S$. Hence the solution $J$ to (56) does satisfy the HJB equation and the IC constraint is slack.

This completes the proof of the first statement in Proposition 4. To prove the second statement, we now show that both the drift and the volatility of compensation are zero when $S \le S^*$. It follows from $\hat u = \frac{u(c)}{-W}$ and $S = \log\left(\frac{V(y)}{W}\right)$ that
$$c = -\log(-\hat u) - \log(-W) = -\log(-\hat u) + S + y - \log(-V(0)).$$
If $S \le S^*$, then $-\hat u = (J')^{-1}$, and $c = \log(J') + S + y - \log(-V(0))$. According to Ito's lemma, the drift of compensation is
$$\begin{aligned}
& \frac{J''}{J'}\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + r(-1-\hat u) + \frac{1}{2}\hat Y^2 + \frac{1}{2}\frac{J'''J' - J''^2}{J'^2}(\hat Y - \sigma)^2 \\
&= \frac{J''}{J'}\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + r(-1-\hat u) + \frac{1}{2}\hat Y^2 + \frac{1}{2}\frac{J'''}{J'}(\hat Y - \sigma)^2 - \frac{1}{2}\frac{J''^2}{J'^2}(\hat Y - \sigma)^2 \\
&= \frac{J''}{J'}\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + r\left(-1 + \frac{1}{J'}\right) + \frac{1}{2}\frac{J'''}{J'}(\hat Y - \sigma)^2 \\
&= \left[J''\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + r + \frac{1}{2}J'''(\hat Y - \sigma)^2 - rJ'\right]J'^{-1},
\end{aligned}$$
where the second equality follows from $\hat Y = \frac{\sigma J''}{J'+J''}$ and $\hat Y^2 = \frac{J''^2}{J'^2}(\hat Y - \sigma)^2$. Differentiating the HJB equation with respect to $S$ and applying the Envelope theorem yield
$$rJ' = r + J''\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J'''(\hat Y - \sigma)^2.$$
Therefore, the drift of compensation is zero. The volatility of compensation is
$$\frac{J' + J''}{J'}(\hat Y - \sigma) + \sigma = \frac{J' + J''}{J'}\left(-\frac{J'\sigma}{J' + J''}\right) + \sigma = 0.$$
Finally, we show that consumption is nondecreasing in any time interval $I$ such that $S_t < S^*$ for all $t \in I$. Suppose by contradiction that $c_t > c_s$ for some $t, s \in I$, $t < s$. Let $\epsilon$ be a small positive number and consider an alternative consumption plan $(\tilde c_t, \tilde c_s)$ given by $u(\tilde c_t) = u(c_t) - \epsilon$ and $u(\tilde c_s) = u(c_s) + \epsilon$. We will show that $(\tilde c_t, \tilde c_s)$ is better than $(c_t, c_s)$, contradicting the optimality of the contract. The variation $(\tilde c_t, \tilde c_s)$ not only reduces the principal's cost (because the agent is risk averse) but also maintains the quitting constraint (because the agent's continuation value at $t$ is unchanged, and her continuation value at $s$ is increased). When $\epsilon > 0$ is small, this variation does not violate any IC constraints because the constraints are slack in the time interval $I$.
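The two algebraic facts behind the zero volatility of compensation in the slack region can be spot-checked numerically. The following is a minimal sketch; the sample triples of first derivative, second derivative, and volatility parameter are hypothetical and are not taken from a solved model.

```python
# Spot check, in the slack-IC region, of the identities used above:
#   Y_hat = sigma * J'' / (J' + J'')
#   volatility of c = (J' + J'') / J' * (Y_hat - sigma) + sigma = 0
#   Y_hat^2 = (J'' / J')^2 * (Y_hat - sigma)^2

def y_hat_slack(j1, j2, sigma):
    """Sensitivity Y_hat when the IC constraint is slack."""
    return sigma * j2 / (j1 + j2)

def compensation_volatility(j1, j2, sigma):
    """Volatility of c = log(J') + S + y - log(-V(0))."""
    y = y_hat_slack(j1, j2, sigma)
    return (j1 + j2) / j1 * (y - sigma) + sigma

def square_identity_gap(j1, j2, sigma):
    """Difference between the two sides of the Y_hat^2 identity."""
    y = y_hat_slack(j1, j2, sigma)
    return y**2 - (j2 / j1) ** 2 * (y - sigma) ** 2

# Hypothetical (J', J'', sigma) triples with J' in (0, 1) and J'' > 0.
samples = [(0.6, 5.0, 0.3), (0.9, 0.2, 1.0), (0.75, 40.0, 0.5)]
vols = [compensation_volatility(*s) for s in samples]
gaps = [square_identity_gap(*s) for s in samples]
```

Both lists are zero up to floating-point rounding for any admissible inputs, because the identities hold algebraically and do not depend on the particular cost function.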


Verification of optimality of high effort
Lemma A.11 Under Assumption 1, it is optimal to implement high effort for all S ≥ 0.
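Proof The law of motion for $S_t$ under low effort is
$$dS_t = \left(r(-1 - \hat u_t\phi) + \frac{1}{2}\hat Y_t^2 - a_l\right)dt + (\hat Y_t - \sigma)\,dz_t^{a_l}.$$
To show that low effort is suboptimal, we need to verify that
$$\begin{aligned}
& \min_{\hat u,\hat Y,\ \hat Y \ge -\hat u\beta}\left\{ r(-\log(-\hat u)) + J'(S)\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J''(S)(\hat Y - \sigma)^2 \right\} \\
&\le \min_{\hat u,\hat Y,\ \hat Y \le -\hat u\beta}\left\{ r(-\log(-\hat u)) + J'(S)\left(r(-1-\hat u\phi) + \frac{1}{2}\hat Y^2 - a_l\right) + \frac{1}{2}J''(S)(\hat Y - \sigma)^2 \right\}.
\end{aligned} \tag{58}$$
First, if $S \le S^*$, then the IC constraint $\hat Y \ge -\hat u\beta$ is slack according to Proposition 4. Inequality (58) is equivalent to
$$J'(S)(a_h - a_l) \ge r\log(\phi^{-1}),$$
which follows from $J'(S) \ge J'(0) > \frac{\kappa}{\kappa+1}$ and Assumption 1.
Second, if $S > S^*$ (i.e., the IC constraint binds), then $\frac{\sigma J'J''}{J'+J''} \le \beta$. We have
$$\begin{aligned}
& \min_{\hat u,\hat Y,\ \hat Y \ge -\hat u\beta}\left\{ r(-\log(-\hat u)) + J'\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J''(\hat Y - \sigma)^2 \right\} \\
&\le \left[ r(-\log(-\hat u)) + J'\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J''(\hat Y - \sigma)^2 \right]_{\hat u = -\frac{1}{J'},\ \hat Y = \frac{\beta}{J'}} \\
&= \Big[ r(-\log(-\hat u)) + J'\left(r(-1-\hat u) - a_h\right) \Big]_{\hat u = -\frac{1}{J'}} + \frac{1}{2J'^2}\left(J'\beta^2 + J''(\sigma J' - \beta)^2\right) \\
&\le \Big[ r(-\log(-\hat u)) + J'\left(r(-1-\hat u) - a_h\right) \Big]_{\hat u = -\frac{1}{J'}} + \frac{1}{2}\beta\sigma,
\end{aligned}$$
where the last inequality follows from $\sigma J' \ge \frac{r\sigma\log(\phi^{-1})}{a_h - a_l} \ge \frac{r\sigma(1-\phi)}{a_h - a_l} = \beta$ and
$$\begin{aligned}
\beta\sigma J'^2 - J'\beta^2 - J''(\sigma J' - \beta)^2 &= (\sigma J' - \beta)\left(J'\beta - J''(\sigma J' - \beta)\right) \\
&= (\sigma J' - \beta)(J' + J'')\left(\beta - \frac{\sigma J'J''}{J' + J''}\right) \ge 0.
\end{aligned}$$
Furthermore,
$$\begin{aligned}
& \Big[ r(-\log(-\hat u)) + J'\left(r(-1-\hat u) - a_h\right) \Big]_{\hat u = -\frac{1}{J'}} + \frac{1}{2}\beta\sigma \\
&\le \min_{\hat u}\left\{ r(-\log(-\hat u)) + J'\left(r(-1-\hat u\phi) - a_h\right) \right\} + r\log(\phi^{-1}) + \frac{1}{2}\beta\sigma \\
&\le \min_{\hat u,\hat Y,\ \hat Y \le -\hat u\beta}\left\{ r(-\log(-\hat u)) + J'\left(r(-1-\hat u\phi) + \frac{1}{2}\hat Y^2 - a_l\right) + \frac{1}{2}J''(\hat Y - \sigma)^2 \right\} \\
&\quad + J'(-a_h + a_l) + r\log(\phi^{-1}) + \frac{1}{2}\beta\sigma \\
&\le \min_{\hat u,\hat Y,\ \hat Y \le -\hat u\beta}\left\{ r(-\log(-\hat u)) + J'\left(r(-1-\hat u\phi) + \frac{1}{2}\hat Y^2 - a_l\right) + \frac{1}{2}J''(\hat Y - \sigma)^2 \right\},
\end{aligned}$$
where the last inequality follows from Assumption 1. Thus (58) is verified.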
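The factorization used in the binding case of the proof is purely algebraic, so it can be checked on random inputs. The sketch below draws hypothetical values for the two derivatives of the cost function, the volatility parameter, and the IC parameter; it confirms that the three expressions coincide and that the sign is nonnegative whenever both conditions used in the proof hold.

```python
import random

def factorization_terms(j1, j2, sigma, beta):
    """Return the three expressions that the proof claims are equal."""
    lhs = beta * sigma * j1**2 - j1 * beta**2 - j2 * (sigma * j1 - beta) ** 2
    mid = (sigma * j1 - beta) * (j1 * beta - j2 * (sigma * j1 - beta))
    rhs = (sigma * j1 - beta) * (j1 + j2) * (beta - sigma * j1 * j2 / (j1 + j2))
    return lhs, mid, rhs

def check(n_draws=10000, seed=0):
    """Largest discrepancy across draws, and whether the sign claim held."""
    rng = random.Random(seed)
    max_gap = 0.0
    sign_ok = True
    for _ in range(n_draws):
        j1 = rng.uniform(0.1, 1.0)      # hypothetical J' in (0, 1)
        j2 = rng.uniform(0.001, 5.0)    # hypothetical J'' > 0
        sigma = rng.uniform(0.1, 2.0)
        beta = rng.uniform(0.01, 1.0)
        lhs, mid, rhs = factorization_terms(j1, j2, sigma, beta)
        max_gap = max(max_gap, abs(lhs - mid), abs(lhs - rhs))
        # Conditions of the binding case: sigma*J' >= beta and
        # sigma*J'*J''/(J'+J'') <= beta.
        binding = sigma * j1 >= beta and sigma * j1 * j2 / (j1 + j2) <= beta
        if binding and lhs < -1e-12:
            sign_ok = False
    return max_gap, sign_ok

max_gap, sign_ok = check()
```

Dividing the nonnegative expression by $2J'^2$ gives the bound $\frac{1}{2J'^2}(J'\beta^2 + J''(\sigma J' - \beta)^2) \le \frac{1}{2}\beta\sigma$ that is used in the chain of inequalities.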

Proof of Proposition 5
We start with the following auxiliary lemma:
Lemma A.12 $J'''(S) < 0$ for all $S \ge 0$. Further, $\lim_{S\to\infty} J''(S) = 0$.
Proof When $S < S^*$, recall that we have shown that $\frac{J'(S)J''(S)}{J'(S)+J''(S)} = \frac{1}{\frac{1}{J'(S)} + \frac{1}{J''(S)}}$ is strictly decreasing in $S$ in the proof of Proposition 4. Hence either $J'(S)$ or $J''(S)$ must be strictly decreasing. Since $J'(S)$ increases with $S$, $J''(S)$ strictly decreases with $S$ when $S < S^*$. If $J''(S)$ is not globally decreasing, then there is a $\bar S \ge S^*$ at which $J'''(\bar S) = 0$. When the IC constraint binds at $S > S^*$, we have $\hat Y = -\hat u\beta$ and the HJB equation takes the form of
$$rJ(S) = rS - r\log(-V(0)) + \min_{\hat u}\left\{ -r\log(-\hat u) + J'(S)\left(r(-1-\hat u) + \frac{1}{2}(-\hat u\beta)^2 - a_h\right) + \frac{1}{2}J''(S)(-\hat u\beta - \sigma)^2 \right\}.$$
The first-order condition for the optimal $\hat u$ is
$$rc'(\hat u) + J''(S)\sigma\beta = rJ'(S) + (J'(S) + J''(S))\beta^2(-\hat u). \tag{59}$$
Because $J'$ increases with $S$ while $J''$ is stationary at $S = \bar S$, equation (59) implies that $(-\hat u)$ and $\hat Y$ decrease with $S$, when $S$ is close to $\bar S$. Differentiating the HJB equation yields
$$0 = r + J''\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J'''(\sigma - \hat Y)^2 - rJ'.$$
Because the term $J''\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) - rJ'$ decreases with $S \in (\bar S - \epsilon, \bar S + \epsilon)$ for a small $\epsilon$, $J'''(S) < 0$ for $S \in (\bar S - \epsilon, \bar S)$ and $J'''(S) > 0$ for $S \in (\bar S, \bar S + \epsilon)$. Because of these two inequalities, $J'''$ cannot be zero again for any $S > \bar S$. That is, $J''' > 0$ for all $S > \bar S$. Then $J''$ increases with $S$ and $J'$ will reach one eventually, a contradiction.

Now we can prove the proposition.
First, we show that $(-\hat u)$ and $\hat Y$ decrease with $S$. If $S \le S^*$, then $-\hat u = \frac{1}{J'(S)}$ and $\hat Y = \frac{\sigma J''}{J'+J''}$ decrease with $S$, because $J'$ increases and $J''$ decreases with $S$. If $S \ge S^*$, then $(-\hat u)$ and $\hat Y$ decrease with $S$, because in the first-order condition (59), $J'$ increases and $J''$ decreases with $S$, and $\sigma > \beta(-\hat u)$. Further, because $\lim_{S\to\infty} J'(S) = 1$ and $\lim_{S\to\infty} J''(S) = 0$, the first-order condition (59) approaches condition (34), which means that $\lim_{S\to\infty}(-\hat u) = \rho$ and $\lim_{S\to\infty}\hat Y = \rho\beta$.
Second, we show the properties of the drift and the volatility of $S$. That $\alpha(S) = r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h$ and $\zeta(S) = \hat Y - \sigma$ are decreasing in $S$ is because $-\hat u$ and $\hat Y$ decrease with $S$. That $\alpha(0) > 0$ follows from $\hat Y(0) = \sigma$, $-\hat u(0) = \frac{1}{J'(0)}$, and $J'(0) < \frac{r}{r + a_h - \frac{1}{2}\sigma^2}$ in Lemma A.9. That $\lim_{S_t\to\infty}\alpha(S_t) = -\mu - a_h$ follows from $\lim_{S\to\infty}(-\hat u) = \rho$, $\lim_{S\to\infty}\hat Y = \rho\beta$, and the definition of $\mu$. That $\lim_{S\to\infty}\zeta(S) = \rho\beta - \sigma$ follows from $\lim_{S\to\infty}\hat Y = \rho\beta$.
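The comparative statics of the first-order condition (59) can be illustrated numerically. The sketch below reads $c'(\hat u)$ as $1/(-\hat u)$, which follows from $c = -\log(-\hat u) - \log(-W)$; under that reading (59) is a quadratic in $v = -\hat u$ with a single positive root. The parameter values and the path of derivative pairs are hypothetical and only mimic the first derivative rising toward one while the second derivative falls toward zero.

```python
import math

def minus_u_hat(j1, j2, r, sigma, beta):
    """Positive root v = -u_hat of (J'+J'') beta^2 v^2 + (r J' - J'' sigma beta) v - r = 0."""
    a = (j1 + j2) * beta**2
    b = r * j1 - j2 * sigma * beta
    return (-b + math.sqrt(b * b + 4.0 * a * r)) / (2.0 * a)

def foc_residual(v, j1, j2, r, sigma, beta):
    """Left side minus right side of (59), with c'(u_hat) read as 1 / (-u_hat)."""
    return r / v + j2 * sigma * beta - r * j1 - (j1 + j2) * beta**2 * v

# Hypothetical parameters and a hypothetical path of (J', J'') pairs.
R, SIGMA, BETA = 0.05, 0.4, 0.1
path = [(0.7, 2.0), (0.8, 1.0), (0.9, 0.3), (0.99, 0.01), (1.0, 0.0)]
roots = [minus_u_hat(j1, j2, R, SIGMA, BETA) for j1, j2 in path]
y_hats = [BETA * v for v in roots]  # binding IC: Y_hat = beta * (-u_hat)
```

Along this path the root falls monotonically, and the last entry solves $\beta^2\rho^2 + r\rho - r = 0$, which is what (59) reduces to at $J' = 1$ and $J'' = 0$ under the stated reading.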



Proof of Theorem 2
First, plug optimal policies $-\hat u_t = \rho$ and $\hat Y_t = \rho\beta$ into (21) to obtain the equilibrium dynamics of the state variable $S_t$ in the full-commitment model:
$$dS_t = -(\mu + a_h)\,dt - (\sigma - \rho\beta)\,dz_t^a. \tag{60}$$
Thus, by assumption we have
$$\mu + a_h > 0. \tag{61}$$
Second, construct a function $f : [J'(0), \infty) \to [0, \infty)$ such that $f'(x) > 0$ for $x > J'(0)$, $f''(x)$ is continuous for $x \ge J'(0)$, and $f(x) = S(x)$ for $x \approx J'(0)$ and $f(x) = x$ for large $x$. We can extend the domain of $f$ to $(-\infty, \infty)$ by defining $f(x) \equiv f(2J'(0) - x)$ for $x < J'(0)$. Because $S'(x)|_{x=J'(0)} = 0$, the left derivative and right derivative of $f$ are equal at $x = J'(0)$. Hence, $f$ is still continuously differentiable after the extension.
Third, construct a diffusion process for $x \in (-\infty, \infty)$ as follows. Because $S$ is a diffusion process, so is $x = f^{-1}(S)$ whenever $x > J'(0)$. The drift $\bar\alpha(x)$ and volatility $\bar\zeta(x)$ of $x$ are, respectively,
$$\bar\alpha(x) = (f^{-1})'(S)\,\alpha(S) + \frac{1}{2}(f^{-1})''(S)\,(\zeta(S))^2,$$
$$\bar\zeta(x) = (f^{-1})'(S)\,\zeta(S),$$
where $S = f(x)$. Symmetrically, if $x < J'(0)$, then $x = 2J'(0) - f^{-1}(S)$ is also a diffusion process.
Fourth, for $S$ to have an invariant distribution it is sufficient to show that $x$ has an invariant distribution. To show that $x$ has an invariant distribution on $(-\infty, \infty)$ we verify the sufficient conditions in Karatzas and Shreve (1991, Exercise 5.40, page 352).
(i) Nondegeneracy. The volatility $\bar\zeta(x) \neq 0$ at $x > J'(0)$ because $f'(x) > 0$ for $x > J'(0)$ and $\zeta(S) \neq 0$ for $S > 0$. Although $\zeta(0) = 0$, $\bar\zeta(J'(0)) \neq 0$ because $f^{-1}(S) = J'(S)$ for $S \approx 0$ and
$$\lim_{x\downarrow J'(0)} \bar\zeta(x) = \lim_{S\downarrow 0} J''(S)(\hat Y - \sigma) = \lim_{S\downarrow 0} \frac{J''(S)J'(S)}{J'(S) + J''(S)} > 0.$$
The volatility $\bar\zeta(x) \neq 0$ at $x < J'(0)$ due to symmetry.
(ii) Local integrability. Because $\bar\zeta(x)$ is continuous in $x$ and is always nonzero, it is bounded away from zero. That is, there exists $\epsilon > 0$ such that $(\bar\zeta(x))^2 \ge \epsilon$ for all $x$.
(iii) $p(-\infty) = -\infty$ and $p(\infty) = \infty$, where the scale function $p(x)$ is defined as
$$p(x) \equiv \int_c^x \exp\left\{-2\int_c^{\xi} \frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2}\,d\theta\right\} d\xi,$$
where $c$ is a fixed number. We will only show $p(\infty) = \infty$ as the proof for $p(-\infty) = -\infty$ is similar. Since $f(\theta) = \theta$ for large $\theta$, $\lim_{\theta\to\infty}\frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2} = \lim_{\theta\to\infty}\frac{\alpha(\theta)}{\zeta(\theta)^2} = \frac{-\mu - a_h}{\sigma^2} < 0$, where the inequality follows from (61). Therefore, $\lim_{\xi\to\infty} -2\int_c^{\xi}\frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2}\,d\theta = \infty$, and $\lim_{x\to\infty} p(x) = \infty$.
(iv) $m(-\infty, \infty) < \infty$, where the speed measure $m$ is defined as
$$m(dx) \equiv \frac{2\,dx}{p'(x)\,\bar\zeta(x)^2}.$$
Because $\lim_{\theta\to\infty}\frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2} = \frac{-\mu - a_h}{\sigma^2} < 0$ and $\lim_{\theta\to\infty}\bar\zeta(\theta)^2 = \sigma^2$, there is a large $\bar x$, such that $\frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2} < \frac{-\mu - a_h}{2\sigma^2}$ and $\bar\zeta(\theta)^2 > \frac{\sigma^2}{2}$ for $\theta \ge \bar x$. Hence, if $x \ge \bar x$, then
$$\begin{aligned}
p'(x)\,\bar\zeta(x)^2 &= \exp\left\{-2\int_c^x \frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2}\,d\theta\right\}\bar\zeta(x)^2 \\
&\ge \exp\left\{-2\int_c^{\bar x} \frac{\bar\alpha(\theta)}{\bar\zeta(\theta)^2}\,d\theta\right\}\exp\left\{\frac{\mu + a_h}{\sigma^2}(x - \bar x)\right\}\frac{\sigma^2}{2},
\end{aligned}$$
which implies that $m([\bar x, \infty)) = \int_{\bar x}^{\infty}\frac{2\,dx}{p'(x)\,\bar\zeta(x)^2}$ is finite. That $m((-\infty, 2J'(0) - \bar x]) < \infty$ follows from symmetry.
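Conditions (iii) and (iv) can be illustrated on the simplest case, a diffusion whose drift and volatility are constant in the right tail. The sketch below is an illustration only: the drift magnitude `K_DRIFT` (standing in for $\mu + a_h$) and the volatility magnitude `S_VOL` are hypothetical, and the integrals are approximated with a cumulative trapezoid rule on a truncated grid.

```python
import math

def scale_and_speed(drift, vol, c, x_max, n):
    """Approximate the scale function p on [c, x_max] and the speed mass m([c, x_max])."""
    h = (x_max - c) / n
    xs = [c + i * h for i in range(n + 1)]
    ratio = [drift(x) / vol(x) ** 2 for x in xs]
    # p'(x) = exp(-2 * integral_c^x drift / vol^2), via cumulative trapezoid sums.
    p_prime = [1.0]
    acc = 0.0
    for i in range(1, n + 1):
        acc += 0.5 * h * (ratio[i - 1] + ratio[i])
        p_prime.append(math.exp(-2.0 * acc))
    # p(x) = integral_c^x p'(xi) d xi.
    p = [0.0]
    for i in range(1, n + 1):
        p.append(p[-1] + 0.5 * h * (p_prime[i - 1] + p_prime[i]))
    # Speed density 2 / (p'(x) vol(x)^2), integrated over the grid.
    dens = [2.0 / (p_prime[i] * vol(xs[i]) ** 2) for i in range(n + 1)]
    mass = sum(0.5 * h * (dens[i - 1] + dens[i]) for i in range(1, n + 1))
    return xs, p, mass

K_DRIFT, S_VOL = 0.5, 0.4  # hypothetical tail values
xs, p, mass = scale_and_speed(lambda x: -K_DRIFT, lambda x: -S_VOL, 0.0, 10.0, 20000)
closed_form_mass = 1.0 / K_DRIFT  # exact value of m([0, infinity)) in this constant case
```

With a negative drift the scale function grows exponentially, while the speed density decays exponentially, so the speed mass on the right tail is finite. This is the mechanism behind the bound in (iv).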
0

2



Proof of Proposition 6
We start with the following auxiliary lemma:
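Lemma A.13 Let $S_1 = 1 - \log\left(1 + \frac{1 - e^{-\kappa}}{\kappa}\right)$ and $S_2 = 2 - \log\left(1 + \frac{1 - e^{-2\kappa}}{\kappa}\right)$. For large $a_h$, $r\left(-1 + \frac{1}{J_{FI}'(S_1)}\right) + \frac{1}{2}\sigma^2 - a_h < -\frac{a_h}{3}$ and $S_1 < S_2 < S^*$.
Proof First, we compute $J_{FI}'(S_1)$ in the full-information model. From the proof of Proposition 3, we know that $u_t = \frac{\kappa+1}{\kappa}V_{FI}(m_t)$ and $W_t = \left(1 + \frac{1 - e^{-\kappa(m_t - y_t)}}{\kappa}\right)V_{FI}(m_t)$. Therefore, $\hat u(S_t) = \frac{u_t}{-W_t} = -\frac{\frac{\kappa+1}{\kappa}}{1 + \frac{1 - e^{-\kappa(m-y)}}{\kappa}}$. The first-order condition in the HJB equation implies $\hat u(S) = -\frac{1}{J_{FI}'(S)}$, which together with $m - y = 1$ at $S_1$ imply that
$$r\left(-1 + \frac{1}{J_{FI}'(S_1)}\right) = r\left(-1 + \frac{\frac{\kappa+1}{\kappa}}{1 + \frac{1 - e^{\kappa(y-m)}}{\kappa}}\right) = \frac{re^{-\kappa}}{\kappa + 1 - e^{-\kappa}}.$$
It follows from $\lim_{a_h\to\infty} a_h\kappa = r$ that
$$\lim_{a_h\to\infty}\frac{\frac{re^{-\kappa}}{\kappa + 1 - e^{-\kappa}}}{a_h} = \lim_{a_h\to\infty}\frac{re^{-\kappa}}{(\kappa + 1 - e^{-\kappa})a_h} = \frac{1}{2}.$$
Therefore, $r\left(-1 + \frac{1}{J_{FI}'(S_1)}\right) + \frac{1}{2}\sigma^2 - a_h < -\frac{a_h}{3}$ for large $a_h$.
Second, $S_1 < S_2$ because $\lim_{a_h\to\infty}(S_1 - S_2) = -1 - \lim_{\kappa\to 0}\log\left(1 + \frac{1 - e^{-\kappa}}{\kappa}\right) + \lim_{\kappa\to 0}\log\left(1 + \frac{1 - e^{-2\kappa}}{\kappa}\right) = -1 + \log\left(\frac{3}{2}\right) < 0$.
Third, $S_2 < S_{FI}^*$, where $S_{FI}^*$ denotes the smallest $S$ at which the IC constraint is violated in the full-information model. At $S_2$, $m - y = 2$. It follows from $e^{\kappa(-2)} > \frac{r(1-\phi)}{a_h - a_l}\frac{\kappa+1}{\kappa}$ that $S_2 < S_{FI}^*$ for large $a_h$.
Fourth, $S_{FI}^* < S^*$. We show that $J'(S_{FI}^*) > J_{FI}'(S_{FI}^*)$ and $J''(S_{FI}^*) > J_{FI}''(S_{FI}^*)$. An argument similar to part (i) in the proof of Lemma A.9 shows the former. To see the latter, suppose by contradiction $J''(S_{FI}^*) \le J_{FI}''(S_{FI}^*)$. The HJB equation for $J_{FI}$ is
$$rJ_{FI}(S_{FI}^*) + r\log(-V_{FI}(0)) = r(S_{FI}^* - \log(-\hat u)) + J_{FI}'(S_{FI}^*)\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J_{FI}''(S_{FI}^*)(\hat Y - \sigma)^2.$$
Hence $J'(S_{FI}^*) > J_{FI}'(S_{FI}^*)$, $J''(S_{FI}^*) \le J_{FI}''(S_{FI}^*)$, and $r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h < 0$ imply that
$$\begin{aligned}
rJ_{FI}(S_{FI}^*) + r\log(-V_{FI}(0)) &= r(S_{FI}^* - \log(-\hat u)) + J_{FI}'(S_{FI}^*)\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J_{FI}''(S_{FI}^*)(\hat Y - \sigma)^2 \\
&> r(S_{FI}^* - \log(-\hat u)) + J'(S_{FI}^*)\left(r(-1-\hat u) + \frac{1}{2}\hat Y^2 - a_h\right) + \frac{1}{2}J''(S_{FI}^*)(\hat Y - \sigma)^2 \\
&\ge rJ(S_{FI}^*) + r\log(-V(0)),
\end{aligned}$$
which contradicts $J_{FI}(S) + \log(-V_{FI}(0)) \le J(S) + \log(-V(0))$ for all $S \ge 0$. That $S_{FI}^* < S^*$ follows from
$$\frac{\sigma J'(S_{FI}^*)J''(S_{FI}^*)}{J'(S_{FI}^*) + J''(S_{FI}^*)} > \frac{\sigma J_{FI}'(S_{FI}^*)J_{FI}''(S_{FI}^*)}{J_{FI}'(S_{FI}^*) + J_{FI}''(S_{FI}^*)} = \beta.$$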
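The large-$a_h$ limits in Lemma A.13 can be illustrated numerically. The sketch below assumes that $\kappa$ is the positive root of $\frac{1}{2}\sigma^2\kappa^2 + a_h\kappa - r = 0$. This is an inference, not a quoted definition: it is the value that makes $r(-1 + \frac{\kappa+1}{\kappa}) + \frac{1}{2}\sigma^2 - a_h$ equal to $\frac{1}{2}(\kappa+1)\sigma^2$, the identity for $\alpha(0)$ stated in Lemma B.1, and it implies $a_h\kappa \to r$. The values of $r$ and $\sigma$ are hypothetical.

```python
import math

def kappa(a_h, r, sigma):
    """Positive root of 0.5*sigma^2*k^2 + a_h*k - r = 0 (assumed definition), in stable form."""
    return 2.0 * r / (a_h + math.sqrt(a_h**2 + 2.0 * sigma**2 * r))

def drift_term_over_ah(a_h, r, sigma):
    """r e^{-k} / ((k + 1 - e^{-k}) a_h), which the proof shows tends to 1/2."""
    k = kappa(a_h, r, sigma)
    return r * math.exp(-k) / ((k - math.expm1(-k)) * a_h)

def s1_minus_s2(a_h, r, sigma):
    """S_1 - S_2 for the two thresholds defined in Lemma A.13."""
    k = kappa(a_h, r, sigma)
    s1 = 1.0 - math.log1p(-math.expm1(-k) / k)
    s2 = 2.0 - math.log1p(-math.expm1(-2.0 * k) / k)
    return s1 - s2

def drift_bound_holds(a_h, r, sigma):
    """Check r(-1 + 1/J_FI'(S_1)) + sigma^2/2 - a_h < -a_h/3."""
    k = kappa(a_h, r, sigma)
    lhs = r * math.exp(-k) / (k - math.expm1(-k)) + 0.5 * sigma**2 - a_h
    return lhs < -a_h / 3.0

R, SIGMA = 0.05, 0.4   # hypothetical parameter values
A_H_LARGE = 1.0e4
ratio = drift_term_over_ah(A_H_LARGE, R, SIGMA)
gap = s1_minus_s2(A_H_LARGE, R, SIGMA)
bound_ok = drift_bound_holds(A_H_LARGE, R, SIGMA)
```

For a large value of the high-effort drift the ratio is close to one half and the gap is close to $-1 + \log(3/2)$, in line with the first two steps of the proof.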

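Now we can prove the proposition.
Because the trend of $S$ is negative in $[S_1, \infty)$, the derivative of the scale function, $p'(S)$, is strictly increasing in $S$. Further,
$$\begin{aligned}
\left(\log(p'(S))\right)' &= -2\left(\frac{r\left(-1 + \frac{1}{J'(S)}\right) + \frac{1}{2}\hat Y(S)^2 - a_h}{(\hat Y(S) - \sigma)^2}\right) \\
&\ge -2\left(\frac{r\left(-1 + \frac{1}{J_{FI}'(S_1)}\right) + \frac{1}{2}\sigma^2 - a_h}{(\hat Y(S) - \sigma)^2}\right) \ge \frac{2a_h}{3\sigma^2}, \quad \text{for } S \ge S_1,
\end{aligned}$$
where the first inequality follows from $J'(S) \ge J'(S_1) > J_{FI}'(S_1)$ and the second inequality follows from $r\left(-1 + \frac{1}{J_{FI}'(S_1)}\right) + \frac{1}{2}\sigma^2 - a_h < -\frac{a_h}{3}$, which is shown in Lemma A.13. This implies that $p'(S) \ge p'(S^*)\exp\left(\frac{2a_h}{3\sigma^2}(S - S^*)\right)$ for $S \ge S^*$. We have
$$\frac{m[S^*, \infty)}{m[S_1, S_2]} = \frac{\int_{S^*}^{\infty}\frac{1}{p'(S)(\hat Y(S) - \sigma)^2}\,dS}{\int_{S_1}^{S_2}\frac{1}{p'(S)(\hat Y(S) - \sigma)^2}\,dS} \le \frac{\int_{S^*}^{\infty}\frac{1}{p'(S)}\,dS}{\int_{S_1}^{S_2}\frac{1}{p'(S)}\,dS},$$
which follows from $\hat Y(S) > \hat Y(\tilde S)$ for all $S < S^* < \tilde S$. This inequality is shown by $\hat Y(S) = \frac{\sigma J''(S)}{J'(S)+J''(S)} = \frac{\sigma J'(S)J''(S)}{J'(S)+J''(S)}(J'(S))^{-1} > \beta(J'(S))^{-1} \ge \beta(J'(\tilde S))^{-1} = \beta(-\hat u(\tilde S)) = \hat Y(\tilde S)$. Further,
$$\frac{\int_{S^*}^{\infty}\frac{1}{p'(S)}\,dS}{\int_{S_1}^{S_2}\frac{1}{p'(S)}\,dS} \le \frac{\int_{S^*}^{\infty}\frac{1}{p'(S)}\,dS}{(S_2 - S_1)\frac{1}{p'(S^*)}} \le \frac{\int_{S^*}^{\infty}\frac{1}{\exp\left(\frac{2a_h}{3\sigma^2}(S - S^*)\right)}\,dS}{S_2 - S_1} = \frac{1}{\frac{2a_h}{3\sigma^2}(S_2 - S_1)}.$$
Hence $\lim_{a_h\to\infty}\pi([S^*, \infty)) = \lim_{a_h\to\infty}\frac{m[S^*, \infty)}{m[0, \infty)} \le \lim_{a_h\to\infty}\frac{m[S^*, \infty)}{m[S_1, S_2]} = 0$.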



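Appendix B: Properties of the cost function $J_{FI}$ and dynamics
of the state variable $S_t$ in the model with full information
Lemma B.1 In the model with full information,
(i) $J_{FI}'$ is everywhere positive and strictly increasing with
$$J_{FI}'(0) = \frac{\kappa}{\kappa+1} \quad\text{and}\quad \lim_{S_t\to\infty} J_{FI}'(S_t) = 1.$$
(ii) $J_{FI}''$ is everywhere positive and strictly decreasing with
$$J_{FI}''(0) = \infty \quad\text{and}\quad \lim_{S_t\to\infty} J_{FI}''(S_t) = 0.$$
(iii) The drift of the state variable, $\alpha$, is strictly decreasing with
$$\alpha(0) = \frac{1}{2}(\kappa+1)\sigma^2 > 0 \quad\text{and}\quad \lim_{S_t\to\infty}\alpha(S_t) = -a_h.$$
(iv) The volatility of the state variable, $\zeta$, is everywhere negative and strictly decreasing with
$$\zeta(0) = 0 \quad\text{and}\quad \lim_{S_t\to\infty}\zeta(S_t) = -\sigma.$$
Proof It is useful to derive policies $\hat u(S_t)$ and $\hat Y(S_t)$ as functions of $m_t$ and $y_t$. From the proof of Proposition 3, we know that $u_t = \frac{\kappa+1}{\kappa}V_{FI}(m_t)$, $Y_t = -V_{FI}(m_t)e^{-\kappa(m_t - y_t)}\sigma$ and $W_t = \left(1 + \frac{1 - e^{-\kappa(m_t - y_t)}}{\kappa}\right)V_{FI}(m_t)$. Therefore,
$$\hat u(S_t) = \frac{u_t}{-W_t} = -\frac{\frac{\kappa+1}{\kappa}}{1 + \frac{1 - e^{-\kappa(m-y)}}{\kappa}}, \qquad \hat Y(S_t) = \frac{Y_t}{-W_t} = \frac{e^{-\kappa(m_t - y_t)}}{1 + \frac{1 - e^{-\kappa(m_t - y_t)}}{\kappa}}\,\sigma.$$
This implies that $\hat u(S_t)$ increases and $\hat Y(S_t)$ decreases in $S_t$. Further, $\hat u(0) = -\frac{\kappa+1}{\kappa}$, $\hat Y(0) = \sigma$, $\lim_{S_t\to\infty}\hat u(S_t) = \lim_{(m_t - y_t)\to\infty}\hat u(S_t) = -1$, and $\lim_{S_t\to\infty}\hat Y(S_t) = \lim_{(m_t - y_t)\to\infty}\hat Y(S_t) = 0$.
(i) Since $J_{FI}'(S_t) = (-\hat u(S_t))^{-1}$, the property of $J_{FI}'(S_t)$ follows from that of $\hat u(S_t)$ in the above.
(ii) It follows from $J_{FI}'(S_t) = (-\hat u(S_t))^{-1} = \left(1 + \frac{1 - e^{-\kappa(m_t - y_t)}}{\kappa}\right)\Big/\frac{\kappa+1}{\kappa}$ and $S_t = m_t - y_t - \log\left(1 + \frac{1 - e^{-\kappa(m_t - y_t)}}{\kappa}\right)$ that
$$J_{FI}''(S_t) = \frac{\kappa}{\kappa+1}\,\frac{e^{-\kappa(m_t - y_t)}}{1 - \frac{\kappa e^{-\kappa(m_t - y_t)}}{\kappa + 1 - e^{-\kappa(m_t - y_t)}}} = \frac{\kappa}{\kappa+1}\,\frac{1}{e^{\kappa(m_t - y_t)} - \frac{\kappa}{\kappa + 1 - e^{-\kappa(m_t - y_t)}}},$$
which decreases in $m_t - y_t$. If $S_t = 0$, then $m_t - y_t = 0$ and clearly $J_{FI}''(0) = \infty$. Moreover, $\lim_{S_t\to\infty} J_{FI}''(S_t) = \lim_{m_t - y_t\to\infty} J_{FI}''(S_t) = 0$.
(iii) It follows from $\alpha(S_t) = r(-1 - \hat u(S_t)) + \frac{1}{2}(\hat Y(S_t))^2 - a_h$ that $\alpha(S_t)$ decreases in $S_t$. Further,
$$\alpha(0) = r\left(-1 + \frac{\kappa+1}{\kappa}\right) + \frac{1}{2}\sigma^2 - a_h = \frac{1}{2}(\kappa+1)\sigma^2,$$
$$\lim_{S_t\to\infty}\alpha(S_t) = r(-1 + 1) + \frac{1}{2}0^2 - a_h = -a_h.$$
(iv) It follows from $\zeta(S_t) = \hat Y(S_t) - \sigma$ that $\zeta(S_t)$ decreases in $S_t$. Further,
$$\zeta(0) = \hat Y(0) - \sigma = 0, \qquad \lim_{S_t\to\infty}\zeta(S_t) = 0 - \sigma = -\sigma.$$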
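The closed forms in the proof of Lemma B.1 are easy to tabulate as functions of the gap $d = m_t - y_t$. The sketch below is illustrative only. It assumes that $\kappa$ is the positive root of $\frac{1}{2}\sigma^2\kappa^2 + a_h\kappa - r = 0$ (inferred from the stated value of $\alpha(0)$, not quoted), and that $S = d - \log(1 + \frac{1 - e^{-\kappa d}}{\kappa})$, which follows from $S = \log(V(y)/W)$ and the expression for $W_t$. Parameter values are hypothetical.

```python
import math

def kappa_from_quadratic(r, a_h, sigma):
    """Positive root of 0.5*sigma^2*k^2 + a_h*k - r = 0 (assumed definition of kappa)."""
    return 2.0 * r / (a_h + math.sqrt(a_h**2 + 2.0 * sigma**2 * r))

def full_info_objects(d, r, a_h, sigma, k):
    """Policies, state, derivatives of J_FI, drift and volatility at gap d = m - y."""
    decay = math.exp(-k * d)
    w_factor = 1.0 + (1.0 - decay) / k           # W_t / V_FI(m_t)
    u_hat = -((k + 1.0) / k) / w_factor
    y_hat = decay * sigma / w_factor
    s = d - math.log(w_factor)
    j1 = -1.0 / u_hat                            # J_FI'
    denom = math.exp(k * d) - k / (k + 1.0 - decay)
    j2 = math.inf if denom <= 0.0 else (k / (k + 1.0)) / denom  # J_FI''
    alpha = r * (-1.0 - u_hat) + 0.5 * y_hat**2 - a_h
    zeta = y_hat - sigma
    return {"s": s, "j1": j1, "j2": j2, "alpha": alpha, "zeta": zeta}

def finite_difference_j2(d, h, r, a_h, sigma, k):
    """Central-difference estimate of dJ_FI'/dS, used to cross-check the closed form."""
    up = full_info_objects(d + h, r, a_h, sigma, k)
    dn = full_info_objects(d - h, r, a_h, sigma, k)
    return (up["j1"] - dn["j1"]) / (up["s"] - dn["s"])

R, A_H, SIGMA = 0.05, 0.1, 0.4                   # hypothetical parameters
K = kappa_from_quadratic(R, A_H, SIGMA)
at_zero = full_info_objects(0.0, R, A_H, SIGMA, K)
far_out = full_info_objects(200.0, R, A_H, SIGMA, K)
grid = [full_info_objects(d, R, A_H, SIGMA, K) for d in (0.1, 0.5, 1.0, 2.0, 5.0)]
```

At $d = 0$ the objects reproduce the boundary values in parts (i) to (iv), and for a large gap they approach the limits stated there. The finite-difference helper gives an independent check of the expression for the second derivative.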


Discussion. The quitting constraint is the only friction in the full-information version of
our model. If this friction were absent, the contracting environment would be the so-called
first best: firms would fully insure workers against fluctuations in their productivity by giving
them permanently constant compensation and workers would be committed to never quitting
or shirking. In the first best, Wt is constant, so, as evident from (19), the dynamics of the
state variable St reduce to dSt = −dyt , which means that α(St ) = −ah and ζ(St ) = −σ at
all St . With the worker’s compensation constant, the firm’s profit simply follows the random
changes in the output produced by the worker. The first-best cost function, denoted as $J_{FB}$, therefore satisfies $J_{FB}'(S_t) = 1$, as a larger drift of the worker's output process would reduce the firm's cost one-to-one.$^{40}$ Also, since firms are risk-neutral and never run into quitting or incentive constraints in the first best, they are indifferent to volatility in $S_t$. This means that $J_{FB}''(S_t) = 0$ for all $S_t$.$^{41}$
Lemma B.1 shows that the equilibrium cost function and the dynamics of the state variable
in the model with the quitting constraint converge to the first best when slackness St in the
quitting constraint becomes large. This convergence is intuitive. When St is large, the expected
time until the quitting constraint binds again is large, and so the equilibrium contract (26) is
expected to provide full insurance to the worker far into the future. Since the equilibrium
contract at this point approximates the first-best contract very closely, its cost is close to the
first-best cost function.
On the other extreme, when the quitting constraint binds (i.e., at St = 0), ensuring that it
continues to be satisfied under all realizations of the shock to the worker’s productivity is only
possible if, first, the volatility of St at St = 0 is zero, and, second, the drift of St at St = 0 is
nonnegative. The optimal contract, as we see in Lemma B.1, does induce ζ(0) = 0. Consistently,
JF00 I (0) = ∞, which reflects the fact that the firm is infinitely averse to the volatility in St when
the quitting constraint binds, as any nonzero volatility would lead to a violation of the quitting
constraint with probability one immediately after St hits zero.
Note that zero volatility of St means that the volatility of the worker’s continuation value inside
the contract is the same as the volatility of her outside option, which means that locally at
St = 0 the firm cannot provide any insurance to the worker. To avoid violating the quitting
constraint, clearly, the drift of St at St = 0 must be nonnegative. A strictly positive drift of
St at St = 0 is beneficial in that it relaxes the quitting constraint, which allows the firm to
provide insurance to the worker as soon as St becomes strictly positive. But positive drift in
St is also costly because in order to obtain it the contract must back-load compensation and
produce a strictly positive drift in the worker’s continuation value Wt . Positive drift in Wt is
costly as it means that intertemporal smoothing of the worker’s consumption is poor. (Recall
that drift of Wt at the first best is zero.) The optimal drift α(0) given in the above lemma
is the outcome of balancing this trade-off. It is strictly positive, so zero is a reflecting rather
than absorbing barrier for the state variable and insurance is provided to the worker. Its size
is limited, however, by the intertemporal inefficiency of excessive compensation back-loading.
Consistently, $J_{FI}'(0) = \frac{\kappa}{\kappa+1} < 1 = J_{FB}'(0)$ reflects the fact that positive drift of $S_t$ has a benefit in the full-information-limited-enforcement model that it does not have in the first best: it helps relax the quitting constraint. As a consequence, the firm is less averse to drift in $S_t$ than it is at the first best, which means that $J_{FI}'$ is everywhere smaller than $J_{FB}' \equiv 1$.

$^{40}$ Recall from (23) that the first derivative of the cost function represents the impact of the state variable's drift on the firm's total cost. In the first best, the drift of the state variable is the negative of the drift of the worker's output.
$^{41}$ Recall again from (23) that the second derivative of the cost function represents the impact of the state variable's volatility on the firm's total cost.
