Full text of Working Papers (Federal Reserve Bank of Richmond) : The Complexity of CEO Compensation, Working Paper 14-16

Working Paper Series

The Complexity of CEO Compensation

WP 14-16

This paper can be downloaded without charge from:
http://www.richmondfed.org/publications/

Arantxa Jarque
Federal Reserve Bank of Richmond

The Complexity of CEO Compensation∗
Arantxa Jarque†
Richmond Fed
September 3, 2014
Working Paper No. 14-16

Abstract
I study firm characteristics that justify the use of options or refresher grants in the optimal compensation packages for CEOs in the presence of moral hazard. I model explicitly the
determination of stock prices as a function of the output realizations of the firm: Symmetric
learning by all parties about the exogenous quality of the firm makes stock prices sensitive
to output observations. Compensation packages are designed to transform this sensitivity of
prices-to-output into the sensitivity of consumption-to-output that is dictated by the optimal
contract. Heterogeneity in the structure of firm uncertainty implies that some firms are able to
implement the optimal contract with very simple schemes that do not include options, refresher
grants, or perks, while others need to use these more complex and potentially less transparent
instruments.
Journal of Economic Literature Classification Numbers: D80, D82, D86, G30.
Key Words: mechanism design; moral hazard; CEO compensation; stock options; repricing;
refresher grants; perks; learning

1 Introduction

It is widely accepted that in order to solve the agency problem between a firm’s CEO and its owners
the compensation of the executive must be tied to the results of the firm. Less well understood,
however, is how the eﬃcient provision of incentives must be implemented in practice. The recent
financial crisis has revived a longstanding debate about compensation practices for CEOs of large,
publicly traded firms. Although the interest in explaining the level of compensation is still present,1
∗ I would like to thank Andy Atkeson, Marco Celentani, Hector Chade, Huberto Ennis, Willie Fuchs, Juan Carlos
Hatchondo, Ángel Hernando, Hugo Hopenhayn, Boyan Jovanovic, Nobu Kiyotaki, Ned Prescott, and audiences at
the University of Iowa, the Feedback group at Carlos III, Yonsei University, the Applied System meetings and the
Midwest Macro Meetings in Spring 2014, and the SED meetings in Toronto for valuable comments. I also thank
Robert Sharp for excellent research assistance.
† Correspondence: Arantxa.Jarque@rich.frb.org. The views expressed in this paper are those of the author and do
not necessarily represent the views of the Federal Reserve Bank of Richmond or those of the Federal Reserve System.
1 Academics have recently proposed explanations for the increase of the level of pay in the past decades, mainly
based on assortative matching combined with a sharp increase in the size of firms during this period (see Gabaix and
Landier (2008) and references therein).

[Figure 1 (column chart). Horizontal axis: years 1993 to 2012. Vertical axis: thousands of 2012$, from 0 to 10,000. Series: Mean salary, Mean Bonus and Inc. Comp., Mean Stock Grant, Mean Option Grant, Mean Other Compensation. Source: Author’s calculations using Execucomp data.]

Figure 1: Relative importance of the diﬀerent components of CEO compensation packages over time.
Column height represents average compensation. Author’s calculations using Execucomp data (CEOs of the
largest 1,500 firms that are listed in the S&P index). CEOs who owned more than 3% of the total stock of
the firm at any point in the sample were considered “owners,” i.e., not subject to a moral hazard problem,
and they were dropped.

there has been a shift toward understanding the form of compensation packages: Is the use of certain
instruments, like stock options, golden handshakes and parachutes, or perks, a sign of captured
compensation boards and misaligned incentives?2 In a similar spirit, concerns that certain pay
practices, like the use of options, may induce excessive risk-taking in financial firms have prompted
regulatory agencies to increase their involvement in overseeing pay practices in the banking sector.3
Given the state of the debate and the increased desire for intervention, it is critical to enhance our
understanding of the form of compensation packages that are consistent with a correct alignment
of the interests of shareholders and the CEO. This is the objective of this paper.
Contract theory informs us about how to implement incentives optimally (Holmstrom, 1979;
Grossman and Hart, 1983; Wang, 1997). But such characterizations are mainly given in a context
that allows the payoff to the agent to depend on signals of performance (e.g., accounting measures,
stock prices) in a very general way (i.e., “unrestricted” transfers). In this paper, I take a different
approach: I look at the problem of implementing the optimal contract with a rich set of real-life
instruments, and I study which instruments are necessary to implement it exactly.
This approach differentiates my paper from important recent contributions in the literature
that studies the implementation of incentives with real-life instruments. Available data on the
2 See Bebchuk, Cohen, and Spamann (2010) and references therein for an elaboration of this argument. Fahlenbrach
and Stulz (2009), however, find no support for it in their data.
3 See, for example, the June 10, 2009 statement by then Treasury Secretary Tim Geithner on executive compensation
in which he provided broad-based principles as a first step in “the process of bringing compensation practices more
tightly in line with the interests of shareholders and reinforcing the stability of firms and the financial system.” The
statement is available at http://www.treasury.gov/press-center/press-releases/Pages/tg163.aspx.

compensation of CEOs of large, publicly traded firms (Execucomp) shows that a fairly limited set of
compensation instruments, like bonus programs and stock and option grants, makes up most of CEO
compensation (see Figure 1).4 Perhaps because of this evidence, a common practice in the literature
that studies the properties of real-life compensation instruments has been to exogenously restrict the
class of instruments that are available to the firm and derive the optimal scheme within that class.
See, for example, Clementi, Cooley, and Wang (2006), Kadan and Swinkles (2007), or Edmans
and Liu (2010). The restriction in these studies on the number and generality of instruments is
necessary for practical purposes, since the complexity of the optimization problem increases very
fast with the number of instruments allowed. Although this approach yields interesting insights
on the way particular instruments may work, it has a potentially important shortcoming: The
results rely on restricting the firm to use an inefficient compensation scheme.5 This is not the case
in my setup, since I allow for a rich enough set of instruments so that the transfers given to the CEO
with the compensation package correspond to those in the optimal contract derived assuming an
unrestricted set of instruments. The issue that I want to study is not whether, for an exogenously
given set of compensation instruments, the optimal contract is feasible, or how closely it can be
approximated, but rather which instruments are necessary to implement it exactly.
My model shows how a firm’s unobserved characteristics (related to the uncertainty about the
moral hazard problem it faces) may affect the “complexity” of its compensation package. For
example, I show that, for some firms, the use of options will be necessary to implement the optimal
contract, while it will be optional for others. This is consistent with existing empirical evidence of
cross-sectional variation in compensation practices. Figure 2 presents publicly available data about
the complexity of compensation packages of CEOs of large, publicly traded firms (Execucomp).6 It
plots the evolution over time of the percentage of firms that include in the compensation package
of their CEO in a given year the following groups of instruments: both stock and options grants
(I = B), only stock (I = S), only options (I = O), or neither stock nor options (I = N).7 Two facts
stand out. First, in any given year there is heterogeneity in the instruments used by firms. Second,
the introduction of mandatory expensing of at-the-money options around 2006, which arguably made
options less attractive for accounting purposes, seems to have pushed some firms to stop using them,
but after the change there is still an important fraction of firms using options. This provides further
evidence for heterogeneity in the cross-section in the value of including options in compensation
schemes.8 My model provides a justification for (unexplained) heterogeneity in the composition of
4 A reason frequently cited for the proliferation of option grants is the tax advantages of compensation contingent
on performance over base salaries. The Omnibus Budget Reconciliation Act (OBRA) resolution 162(m) of 1992
imposed a $1 million cap on the amount of non-performance-based compensation of the top executives of the firm
that qualifies for a tax deduction. Certain restricted stock and option plans are considered performance-based pay.
5 Two important exceptions, which I discuss later in this introduction, are Aseff and Santos (2005) and Edmans
et al. (2012).
6 See also Kole (1997), Kadan and Swinkles (2007), and references therein for previous studies exploring empirically
the relationship between pay features (like the choice of stock versus options or the importance of incentive pay) with
the firm’s observable characteristics (such as size, industry, age, or default risk).
7 These are all firms for which the CEO owns less than 3% of the stock and which have a positive number (regardless
of its magnitude) in either stock or option grants in the Execucomp entry for the compensation of their CEO in a
given year.
8 Moreover, the new accounting standards for expensing options that were only mandatory in 2006 were voluntarily
[Figure 2 (plot). Horizontal axis: years 1993 to 2012. Vertical axis: proportion of firms, from 0 to 1. Series: Firms in I = N, Firms in I = S, Firms in I = O, Firms in I = B. Source: Author’s calculations using Execucomp data.]

Figure 2: Percentage of firms that include in the compensation package of their CEO in a given year the
following groups of instruments: both stock and options grants (I = B), only stock (I = S), only options
(I = O), or neither stock nor options (I = N).
compensation packages.9
I model the moral hazard problem between the owners of the firm and the CEO as a principal-agent
problem. I propose a simple two-period framework in which a risk-averse CEO is asked to
exert an unobservable and costly effort in the first period only. The risk-neutral owners of the
firm coordinate to act as the unique principal who designs the compensation package of the CEO.
I assume commitment to the two-period contract for both the CEO and the firm owners, and I
abstract from firing or quitting decisions. The single effort choice of the CEO determines the distribution
of the firm’s output in both periods. I interpret the first period as an interim stage at which
information about performance is revealed (for example, the company announces its earnings in the
middle of the fiscal year). New grants may be awarded to the CEO at this interim stage (refresher grants),
but no consumption takes place then; the CEO receives and consumes his compensation in the
second period only, after two output realizations have been observed.
With this timing I try to capture the fact that relevant information about performance is
typically revealed between the time at which compensation packages are set and the time when
the CEO collects his payments. Hence, the expected payouts from bonus plans and stock or
option grants may change with interim earnings announcements. In the simplified version of my
framework without learning, however, when the firm is valued according to the expected stream
of future output, stock prices do not change contingent on the value of earnings announced in the
interim stage. This is because, in equilibrium, the recommended level of eﬀort is chosen, and that
alone determines the distribution of output. Hence, the expectations about output in the second
period are independent of the first period realization. In reality, however, stock prices typically
change over the course of a fiscal year with the earnings announcements of the firm. To capture
adhered to by some firms starting in 2002 (Carter, Lynch, and Tuna, 2007).
9 Jarque and Gaines (2013), for example, find that much of this cross-sectional heterogeneity remains unexplained
after controlling for firm characteristics such as size and the level of compensation, as well as for the 2006 changes in
the mandatory expensing of options granted at the money.

this feature, I augment the model by introducing an exogenous source of uncertainty: a stochastic
state that aﬀects the eﬀectiveness of the eﬀort of the CEO.10 This can be interpreted as the quality
of the match between the CEO and the firm, or as idiosyncratic market conditions. I assume that
both the CEO and the owners, as well as buyers of the stock, have a prior about this state, which
they update through Bayes’ rule when they observe a new realization of output. This generates
contingent stock prices.
In my full model with learning, then, the distribution of stock prices contingent on effort is
endogenously generated through the learning process, rather than assumed as a primitive. Importantly,
this implies that consumption in the optimal contract need not be monotonic in output,
even though effort first-order stochastically increases output in each of the states. This is a key
difference with most of the literature, and in particular with Aseff and Santos (2005), who
characterize the optimal contract in a standard static moral hazard problem (Grossman and Hart, 1983)
under the assumption that effort directly affects the distribution over stock prices. When they
calculate numerically the cost of limiting the compensation scheme to include only a wage and
an option grant, their calibrated model using CEO compensation data implies small costs for the
approximation. However, their assumptions imply contracts that are always monotonic in the stock
price, and hence their result is not directly applicable to my setup.
When studying the implementation of the optimal contract with real-life compensation instruments,
the stylized compensation package that I consider tries to include all the standard instruments
that we observe in real life (a salary, a bonus program, stock grants, and option grants),
as well as less standard but often observed types of compensation such as signing bonuses, severance
payments, perks, discounted stock purchases, or below-market-rate loans, which I will refer
to generally as “ad-hoc” payments. An important characteristic of the standard instruments is
that there is a clear tie between their payoff and the performance of the firm. This may facilitate
both the communication of stockholders’ objectives to CEOs, and the transparency of compensation
practices to potential outside investors and regulators. The defining characteristic of ad-hoc
payments, instead, will be that, even though they are understood and anticipated by the CEO,
they cannot be calculated by an outside observer as a function of annual accounting measures or
the stock price realization. That is, even though these transfers may follow from a (non-formulaic,
potentially subjective) evaluation of the performance of the CEO using accounting and stock price
realizations, to an outside observer these remain arbitrary transfers of money.11
10 This is a simple way to make prices respond to output realizations. One could generalize the framework to allow
also for output persistence; even though this (or any other alternative stock pricing rule) would likely affect the
particular characterizations in the core of the paper (and the tractability), the main idea that compensation packages
need to translate the sensitivity of stock-to-output into optimal consumption-to-output is robust to such alternative
specifications.
11 Identifying ad-hoc payments in the Execucomp data is not easy, since some of the “standard” compensation
instruments (like stock and option grants) that we observe in compensation plans may also be granted in an “ad-hoc”
manner, and this difference is not apparent in Execucomp. Gillan, Hartzell, and Parrino (2009) present hand-collected
evidence that slightly less than half of the firms in the S&P 500 in the year 2000 had explicit Employment Agreements
describing compensation agreements. For those that had them, the agreement covered on average three years, and
it specified mostly the initial salary, potential reasons for dismissal, and the compensation that would follow in such
events, as well as in the event of a change in corporate control. About half of the contracts specified a bonus target,

To present my conclusions about the form of real-life compensation packages, I define three
types of pay schemes, according to the instruments that are included in each. The first distinction
is between schemes that include ad-hoc payments and those that do not; I label schemes with
ad-hoc payments as “nontransparent,” trying to capture with this language the difficulty that
I just described for an outsider to understand the link of these payments to performance. In
contrast, “transparent” schemes may include only a wage and instruments that depend on output
or stock prices. Within the transparent category, I distinguish between “simple” schemes (which
may include only stock granted before any realization of output is observed) and “complex” schemes (which
may also include stock options, and stock and option refresher grants issued contingent on the first
period realization). Complexity, then, captures not only the number of different instruments but
also their sophistication.
My results are the following. First, I show analytically that ad—hoc payments are only necessary
for firms whose optimal contract implies consumption that is nonmonotonic in cumulative output.
My characterization shows that there is a nontrivial set of firms for which this will be the case.
On the other hand, I show analytically that when the optimal contract is monotonic in output,
a complex scheme is always suﬃcient, and it only needs to include a wage, a bonus plan, and,
importantly, an option grant that is issued contingent on the interim output realization. Hence, my
framework provides a rationale for the seemingly counterintuitive compensation practice of issuing
refresher grants. Second, I show numerically that, perhaps surprisingly, simple schemes are
suﬃcient for many types of firms. Finally, I show that there is a nonempty set of firms for which
a simple scheme is not suﬃcient but a complex one is, making refresher stock and option grants
instrumental in avoiding ad—hoc compensation.
The result that refresher grants are useful in implementing the optimal contract is due to the fact
that the contract may call for different sensitivity of pay-to-output depending on the interim results
of the firm, and instruments like bonus plans or plain stock awards cannot always implement this
contingent sensitivity. In particular, prices may change very little with output realizations in the
second period if the first output was very informative about the exogenous state (i.e., the sensitivity
of prices-to-output is very different from the sensitivity of pay-to-output that the optimal contract
calls for). To illustrate this, I discuss examples in which awarding new options to a CEO, even after
observing bad results, may be part of the implementation of the optimal contract for firms that
want to avoid the use of ad-hoc payments. What may superficially look like undoing incentives, or
a sign of entrenchment, actually arises as part of the optimal provision of incentives.
The role of refresher grants that I underline in this paper is related to the results in three
previous studies. The first one is Kadan and Swinkles (2007). In their setting, compensation
instruments are restricted to a wage and an option with an exercise price that is derived optimally.
Although their model is static, they model and explicitly study the role of previous outstanding
grants, which imply a lower bound on the compensation that the CEO gets for every current
price realization. They show that only for financially distressed firms or start-ups (firms with
and details on the retirement plan, and about a third discussed initial restricted stock or option grants, as well as
perks (such as car or plane use, or a club membership). For those firms that relied on implicit agreements instead,
they show that incentive pay is nontrivial, averaging at 54% of total pay.

substantial nonviability risk) the optimal exercise price of the new (refresher) grants is zero, i.e.,
stock is a better incentive instrument than options. However, for firms in good financial health,
refresher option grants (positive exercise price) help implement a higher sensitivity of payments
to performance for the relevant range of market prices (the prices that are likely to be realized
going forward given the good financial health of the firm). My results reinforce their conclusions
by allowing for a richer set of compensation instruments that does not restrict the form of the
contract. A crucial contribution of this paper relative to Kadan and Swinkles (2007) is to have
endogenous stock prices, which helps understand an added difficulty in the use of compensation
instruments that are contingent on those prices: the possibility of non-monotonicities in the optimal
compensation contract.
A second related paper is Acharya, John, and Sundaram (2000). The authors study the practice
of repricing options (i.e., changing the exercise price of options previously granted, typically to make
options that are currently out-of-the-money be at-the-money). When outstanding options have
an exercise price that is viewed as unattainable before the expiration of the grant, repricing is,
in practice, equivalent to issuing refresher grants.12 Their two-period problem reduces formally
to a repeated moral hazard with consumption in the final period only. Because they restrict
the compensation instruments available to the principal (in particular, refresher grants are not
allowed), after some realizations of the interim period output the agent cannot be incentivized
to exert (otherwise efficient) effort. Hence, allowing for repricing effectively expands the set of
compensation instruments in their setting, which can be useful to provide incentives, in spite of the
problem of commitment that this practice introduces. In my model, instead, when issuing refresher
grants or repricing is optimal, the principal commits to it ex-ante and only in the appropriate
nodes of the game.13
A third related paper is Edmans et al. (2012). The authors use for their analysis a particular
hidden information repeated moral hazard setup that allows them to provide an elegant closed form
solution for a complete characterization of the optimal contract.14 In their framework, the optimal
contract can be implemented using a scheme that establishes an escrow account for the CEO in
the initial period. Then, the contract dictates what proportion of the account becomes available
for consumption over time. Moreover, the proportion of the balance in the account that must be
invested in stock of the firm needs to be “rebalanced” over time, in response to changes in the value
of the firm given new signal realizations, in order to keep implementing the optimal incentives for
12 One main difference is that options granted outside of a shareholder pre-approved long term option plan are not
tax deductible under current tax laws (see the Omnibus Budget Reconciliation Act resolution 162(m) of 1992); hence,
repricing may be an attractive alternative for companies that have exhausted the options available in their option
plans.
13 It will be clear that the equilibrium of my model is not robust to allowing for ex-post Pareto improving
renegotiations. Once the effort is done, it is optimal to renegotiate to uncontingent payments in the consumption stage.
However, this is anticipated by the agent ex-ante, and implementing high effort is no longer feasible.
14 In Edmans et al. (2012), each period’s signal is equal to the sum of a hidden action and a stochastic noise variable.
In a departure from the standard hidden action moral hazard problem in Grossman and Hart (1983), they assume
that the agent observes the noise realization before he chooses his effort (a hidden information moral hazard model).
This timing simplifies the problem of finding the optimal contract, and it is also necessary for the implementation
with a scheme that uses only stock and cash (see their discussion on page 1630 of their article).

the CEO. Despite the diﬀerences across their setup and mine, the rebalancing of their incentive
account and the refresher grants in my model respond to the same force: the need to adjust the
sensitivity of compensation packages that are contingent on stock prices when new information is
revealed about the value of the firm that aﬀects those prices diﬀerently than it aﬀects the optimal
contract.
The paper is organized as follows. Section 2 presents the model. The equilibrium is analyzed
in section 3, with the results for unconstrained contracts in section 3.2 and those for real life
compensation instruments in section 3.3. A generalization of the model is discussed in section 4,
and section 5 concludes. All proofs are relegated to the Appendix.

2 The model

I model the moral hazard problem that arises between the CEO and the owners of a firm due to the
unobservability of the CEO’s effort. I assume the CEO (agent) is risk averse with $u(c) = \ln(c)$.
The firm is owned by well diversified shareholders that coordinate perfectly to act as the unique
risk-neutral principal of the agent. I also assume that there is a competitive stock market that
prices the stock of the firm according to its expected per-period output. I assume that the owners
of the firm can commit to implement a compensation contract.
The contract lasts for two periods, $t \in \{1, 2\}$. The output of the firm can take two values each
period, $y_L = 0$ and $y_H = 1$. The agent affects the probability distribution over output with his
effort, which can take two values: $e_L$, with disutility of effort $v(e_L) = 0$, or $e_H$, with $v(e_H) = v$. This
effort is done only once, at the beginning of period 1, but it affects the distribution of output both
at period 1 and period 2. Hence, the effect of effort is persistent in time. The agent receives his
payment only at the end of the second period. The first period represents an interim stage at which
new information is revealed (the first period output realization is observed), but no consumption
takes place in it. For simplicity, I also assume that there is no discounting between periods 1 and
2.
The distribution over output is also affected by another parameter: a state that determines the
effectiveness of the effort of the CEO, denoted $\theta \in \{\theta_1, \theta_2\}$. The true state is unknown by the agent,
the principal, and the stock market, and all players attach a prior probability of $\mu_0$ to $\theta = \theta_1$. The
probability of observing a high output contingent on an effort level and a realization of the state is
as follows, for every $t$:

\begin{equation}
\Pr(y_t = y_H \mid e, \theta): \qquad
\begin{array}{l|cc}
 & e_H & e_L \\ \hline
\theta_1 \ (\mu_0) & q_1 & \hat{q}_1 \\
\theta_2 \ (1 - \mu_0) & q_2 & \hat{q}_2
\end{array}
\tag{1}
\end{equation}

For my leading example firm, I assume $q_1 = q$, $\hat{q}_1 = \hat{q}$ and $q_2 = \hat{q}_2 = 1$. In such a firm,
when effort is not effective, the firm always produces high output. This is a simplifying assumption
that I will relax in section 4, when I will consider firms for which output is always low in state
$\theta_2$ ($q_2 = \hat{q}_2 = 0$) and firms for which effort is also effective in state $\theta_2$ but it implements
different probabilities than in state $\theta_1$. To distinguish this type of firm in matrix 1 from other types
introduced later, I will refer to it as a type H firm.
All probabilities are common knowledge. I assume that the prior over $\theta = \theta_1$ satisfies $0 < \mu_0 < 1$.
Also, higher effort ($e_H$) implies higher probability of observing $y_H$ in $\theta = \theta_1$, that is, $1 > q > \hat{q} > 0$.
The timing and the stochastic structure are depicted in Figure 3. The above assumptions on the
probabilities imply that, at time 0, all the nodes of the tree have positive probability of being
reached under both levels of effort.

Figure 3: Timing and probabilities of output realizations. The state is represented by $\theta \in \{\theta_1, \theta_2\}$.
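The claim that every node of the tree is reached with positive probability can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the names `mu0`, `q1`, and `q2`, and all numerical values, are illustrative assumptions. Here `mu0` stands for the prior that effort is effective, `q1` for the per-period probability of high output in the state where effort is effective, and `q2` for that probability in the other state (set to 1 for both effort levels, as in the leading example firm).

```python
from itertools import product


def path_probability(history, mu0, q1, q2):
    """Probability of a sequence of outputs ('H' or 'L'), integrating over the unknown state.

    mu0: prior probability of the state in which effort is effective.
    q1, q2: per-period probability of high output in the effective and
            ineffective state (hypothetical names, for illustration only).
    """
    p1 = p2 = 1.0
    for y in history:
        p1 *= q1 if y == "H" else 1.0 - q1
        p2 *= q2 if y == "H" else 1.0 - q2
    return mu0 * p1 + (1.0 - mu0) * p2


# Illustrative numbers only: prior 0.5; q1 is 0.6 under high effort and 0.3 under
# low effort; in the ineffective state output is always high (q2 = 1).
mu0, q2 = 0.5, 1.0
for label, q1 in (("high effort", 0.6), ("low effort", 0.3)):
    probs = {"".join(h): path_probability(h, mu0, q1, q2)
             for h in product("LH", repeat=2)}
    print(label, probs)
```

In this sketch, even the path with two low outputs, which cannot occur in the state where output is always high, keeps positive probability because the prior puts positive weight on the other state and `q1` lies strictly between 0 and 1.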
Matrix 1 constitutes a very stylized model of a firm’s technology. However, moral hazard
and learning are present in this technology, complicating the analysis of compensation that is the
objective of this paper.

2.1 Learning about the state $\theta$

Each period, after observing the output realization, the principal, the agent, and the stock market
update their prior about the quality of the match. The updating is done using Bayes’ rule, for a
given effort choice. I denote the posteriors in the first period when the agent chooses $e_H$ as $\mu_i$ and
those in the second period as $\mu_{ij}$, for $i = L, H$; $j = L, H$. Similarly, the posteriors when the agent
chooses $e_L$ are denoted $\hat{\mu}_i$ and $\hat{\mu}_{ij}$. To simplify the exposition, I introduce the following notation:

\begin{align*}
\pi_0 &= \mu_0 q_1 + (1 - \mu_0) q_2, \\
\pi_i &= \mu_i q_1 + (1 - \mu_i) q_2, \qquad i = L, H, \\
\pi_{ij} &= \mu_{ij} q_1 + (1 - \mu_{ij}) q_2, \qquad i = L, H;\ j = L, H,
\end{align*}

which, with $q_1$ and $q_2$ the high-effort entries of matrix 1, denote the probability attached by all
players to observing a high output realization in the first period ($\pi_0$), in the second period,
contingent on the first period realization ($\pi_i$), and, in the third period, contingent on both
realizations ($\pi_{ij}$). Similarly, for $e_L$, the corresponding probabilities are denoted by
$\hat{\pi}_0$, $\hat{\pi}_i$, and $\hat{\pi}_{ij}$.
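The updating step described in this subsection can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the paper's own code: `mu` is the belief that effort is effective, `q1` and `q2` are the per-period probabilities of high output in the effective and ineffective state for the effort level under consideration, and the numbers are made up. With `q2 = 1` the sketch corresponds to the leading example firm.

```python
def posterior(mu, q1, q2, high):
    """Bayes update of mu = Pr(effective state) after one output observation.

    high: True if the observed output is high, False if it is low.
    q1, q2: probability of high output in the effective / ineffective state
            (hypothetical names, for illustration only).
    """
    like1 = q1 if high else 1.0 - q1
    like2 = q2 if high else 1.0 - q2
    return mu * like1 / (mu * like1 + (1.0 - mu) * like2)


def prob_high(mu, q1, q2):
    """Probability of high output next period, given the current belief mu."""
    return mu * q1 + (1.0 - mu) * q2


# Illustrative numbers only.
mu0, q1, q2 = 0.5, 0.6, 1.0
mu_H = posterior(mu0, q1, q2, high=True)   # belief after a high first output
mu_L = posterior(mu0, q1, q2, high=False)  # belief after a low first output
print(mu_H, mu_L, prob_high(mu0, q1, q2))
```

Because output is always high in the ineffective state when `q2 = 1`, a single low realization is fully revealing in this sketch: the belief after a low output jumps to one, while a high output shifts weight toward the ineffective state.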

2.2 Valuation of the firm by outside investors

I assume that there are a large number of investors in the economy who are willing to buy the stock
of the firm. Investors value the stock of the firm as a claim to the expected output of the firm.15 I
also assume a large number of shareholders (sellers of the stock), so no individual deviation aﬀects
the equilibrium price. Competition implies a price equal to the expected output. Investors and
shareholders update their beliefs about $\theta$, for a given effort $e$. Hence, the market price for the stock
varies as output realizations become available.
In order to simplify my analysis of the compensation problem, I normalize the price to the
expected per-period value of the firm:16

\[
P(y^t, e) \equiv E_t[y_{t+1} \mid e],
\]

where $E_t[\cdot]$ denotes the expectation taken with all information known at $t$, which is summarized by
the posteriors based on the history of realizations, $y^t$. I introduce the following notation for prices.
The price of the stock corresponds to the expected value of the firm given the history of realizations
and a given effort choice:

\begin{align}
P_0 &\equiv P(\emptyset, e_H) = \pi_0, \tag{2} \\
P_i &\equiv P(y_i, e_H) = \pi_i, \qquad i = L, H, \tag{3} \\
P_{ij} &\equiv P(y_i, y_j, e_H) = \pi_{ij}, \qquad i = L, H;\ j = L, H, \tag{4}
\end{align}

under high effort. Similarly, under low effort, $\hat{P}_0$, $\hat{P}_i$, and $\hat{P}_{ij}$ will be equal to
$\hat{\pi}_0$, $\hat{\pi}_i$, and $\hat{\pi}_{ij}$, correspondingly.
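To see how learning makes prices respond to output, the pricing rule can be traced along the tree. The sketch below is illustrative only and rests on stated assumptions: output takes the values 0 and 1, so the normalized price equals the probability of high output next period; `mu0`, `q1`, and `q2` are hypothetical names for the prior and for the probabilities of high output in the effective and ineffective state; the numbers are made up.

```python
from itertools import product


def update(mu, q1, q2, high):
    """Bayes update of the belief mu = Pr(effective state) after one output."""
    like1 = q1 if high else 1.0 - q1
    like2 = q2 if high else 1.0 - q2
    return mu * like1 / (mu * like1 + (1.0 - mu) * like2)


def price_tree(mu0, q1, q2, periods=2):
    """Price after every output history of length 0..periods.

    Keys are histories such as '', 'H', 'LH'; each value is the expected
    output next period given the belief implied by that history.
    """
    prices = {}
    for n in range(periods + 1):
        for hist in product("LH", repeat=n):
            mu = mu0
            for y in hist:
                mu = update(mu, q1, q2, y == "H")
            prices["".join(hist)] = mu * q1 + (1.0 - mu) * q2
    return prices


# Illustrative numbers only; q2 = 1 mimics a firm whose output is always
# high when effort is not effective.
prices = price_tree(mu0=0.5, q1=0.6, q2=1.0)
for hist, p in prices.items():
    print(repr(hist), round(p, 4))
```

In this example the price rises after each high output, but after a low first output the state is fully revealed, so the price no longer moves with the second realization. Any pay sensitivity to second-period output at that node would then have to come from instruments other than previously granted stock.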

2.3 Compensation Instruments

In this section I define the compensation instruments available to the firm. I allow the compensation
package to include the following elements: an annual wage, a bonus plan, ad-hoc payments,
and long-term performance-based plans that include both stock and option grants. With these
elements, I try to capture the most important features of real-life compensation practices.17 In 2010,
15

By appliying this pricing rule in the second period I implicitly assume that the technology of the firm lasts for
at least 3 periods (i.e., high otput happens with the same probability in  = 3 as in  = 1 or 2). The commitment to
the contract, however, lasts only until  = 2 so the third period realizations cannot be used for compensation. This
simplification allows me to derive the analytical results in the paper.
16
If instead prices where equal to the net present value of output, the price in the first period would represent a
claim to two output realizations, while the price in the second period would be a claim only to one output. This
would imply a diﬀerence in the level of prices that is irrelevant for the economic problem of interest in this paper,
but it complicates the algebra.
17
See Murphy (1999) for a detailed description of compensation instruments based on compensation surveys. See
Hall and Liebman (1998) for details based on hand—picked data from proxy statements. See Clementi and Cooley
(2009) for a recent and careful description of the main facts related to the level and structure of compensation of the
executives of the largest public U.S. firms in the last two decades using Execucomp data.

10

data in Execucomp for the CEOs of the 1,500 largest public companies in the United States shows
that the average pay was $4,371,060, with a minimum of $200,000 and a maximum of $25,761,432.
The median of the highly skewed distribution of pay was of $3,022,000.18 Of this total pay, the
salary represented an average of 25% (or a median of 19%), the bonus and incentive program represented an average of 25% (or a median of 23%), stock grants 28% (median of 25%), option grants
an average of 19% (median of 13%), and perks and other compensation an average of 3% (median
of 1%).
I now present a brief description of each instrument and how it is captured in the model.
Base Salaries In real life, salaries for CEOs are normally negotiated at the time of signing a contract, based on industry benchmarks. The negotiation usually includes a pre-specified annual increase for the duration of the contract, independently of performance. In the model, the salary is a constant payment given in period 2. I denote the salary as \(W\).19
Bonus plans In real life, most companies offer bonus plans paid annually based on the firm’s performance as measured by accounting results. They usually specify a performance target, together with a minimum and maximum limit for bonuses and the sensitivity of the bonus to the performance measure. These performance measures consist mainly of objective measures such as net income, revenue, pre-tax income, or other accounting figures. Typically, about 25% of the total measures used in the evaluation are labeled as “individual performance” measures, which are subjective evaluations. In the model, I summarize these characteristics by making the bonus plan depend only on accumulated annual output, \((y_1 + y_2)\), as follows: \(B(y_1, y_2) = \min\{b, b(y_1 + y_2)\}\). This mimics the structure of bonus programs in real life, where a “pool” available for bonuses determines a “cap” (\(b\) in this case) for annual payments.20
Long-term incentive plans In real life, compensation plans include long-term payments mainly in the form of (i) stock of the company and (ii) options to buy stock at a pre-determined price (the “exercise” price or “strike” price). Both typically come with selling restrictions: They cannot be traded before their “vesting” time. Also, the manager cannot take these grants with him if he leaves the company, and he is not allowed to hedge against the risk in his compensation package. In the model, I assume that all grants vest in period 2 and they are exercised immediately by the CEO. Consistent with real-life practices, I assume an exercise price equal to the market price of the stock at the time of granting. There is evidence that most firms use multiyear stock and option plans to determine the value of annual grants. They also occasionally use out-of-cycle or larger than average grants, often called “refresher” grants; I model these as grants given in the interim period.21 I simplify the vesting restrictions and expirations and assume that all stock and option grants can and must be sold or exercised at time 2. I denote the different grants as follows:
• \(r_0\): restricted stock issued in period 0
• \(r_i\) for \(i = L, H\): refresher restricted stock issued in period 1 contingent on realization \(y_i\) being observed in period 1
• \(s_0\): stock option grant issued at time 0 with exercise price \(p_0\)
• \(s_i\) for \(i = L, H\): refresher stock option grant issued at time 1 contingent on realization \(y_i\) being observed in period 1, with exercise price \(p_i\)

18
This calculation excludes CEOs that at some point in their tenure with their firms owned more than 3% of the total stock of their company; in picking this threshold I follow Clementi and Cooley (2009), who argue that a high ownership is not consistent with a moral hazard problem.
19
Pension payments and life insurance premiums are usually offered to executives as part of their compensation; because their relative importance in the overall compensation is small, I do not explicitly include them in the model. However, given their noncontingent nature, one could assume that they are included in the variable \(W\) in the model.
20
Results for an alternative linear bonus program, defined as \(B(y_1, y_2) = b(y_1 + y_2)\), are similar and are discussed in section 4.
Ad-hoc payments This category includes compensation that is not granted to the executive on a regular basis, and for which the amounts received by the executive are not a previously set function of accounting measures or stock price realizations. Examples are personal benefits (“perks”), subsidized loans, signing bonuses, severance payments, or discounted share purchases. I will denote these ad-hoc payments as \(a_{ij}\) for \(i, j = L, H\), to make clear that these payments may be fully contingent on the history of realizations of output.
For incentive purposes, it is important that the CEO understands and anticipates what the compensation contract implies for his consumption in each possible state of the world, including both the return of standard compensation instruments like bonus plans or grants, as well as ad-hoc payments. In the model, I capture this feature by assuming commitment to the contract. However, I will distinguish between compensation packages that include standard, transparent compensation instruments that are a function of accounting results (like a bonus plan) or stock price realizations (like stock or option grants), versus ad-hoc transfers. I make this distinction formally in the next subsection.

2.4 Compensation Packages and Consumption for the CEO

I denote a compensation package as a vector \(\Omega = \left(W, b, r_0, s_0, \{r_i\}_{i=L,H}, \{s_i\}_{i=L,H}, \{a_{ij}\}_{i,j=L,H}\right)\). Let \(\mathcal{P} \subset \mathbb{R}^{10}_{+}\) be the set of all possible compensation packages. Based on whether ad-hoc payments are used, and on whether the package includes more complicated instruments like options, I define the following strict subsets of \(\mathcal{P}\):
\[\mathcal{S} = \{\Omega \in \mathcal{P} \text{ such that } s_0 = s_i = r_i = a_{ij} = 0 \;\; \forall i, j\},\]
\[\mathcal{C} = \{\Omega \in \mathcal{P} \setminus \mathcal{S} \text{ such that } a_{ij} = 0 \;\; \forall i, j\}.\]
21
Hall (1999) reports that most of the firms in a sample of large publicly traded U.S. firms use option plans that imply either a fixed number of shares to be granted every year for the duration of the plan (about 40% of the sample firms), or a fixed value of grants (where the number is adjusted according to stock price changes). He classifies a firm under one of these plans if he observes the same number or value for two consecutive years. Hall (1999) defines “refresher” grants as “out-of-cycle” or larger than usual grants. See also Hall and Knox (2004), which I discuss later in this paper in relation to my results, for evidence on refresher grants.

Definition 1 A compensation package \(\Omega\) is classified as:
• Transparent: if it does not include any ad-hoc payments, i.e., \(\Omega \in \mathcal{C} \cup \mathcal{S}\)
  - Simple: if it is transparent and it includes only a wage, bonus scheme, and restricted stock granted at time 0, i.e., \(\Omega \in \mathcal{S}\)
  - Complex: if it is transparent and it includes at least one option grant or a refresher stock grant but no ad-hoc payments, i.e., \(\Omega \in \mathcal{C}\)
• Non-transparent: if it includes at least one ad-hoc payment, which has a value that is not a set function of output realizations or stock prices, i.e., \(\Omega \notin \mathcal{C} \cup \mathcal{S}\)
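The classification in Definition 1 can be read as a small decision rule. The following is a minimal sketch, not part of the paper: the dictionary layout and key names (`W`, `b`, `r0`, `s0`, `r`, `s`, `a`) are hypothetical labels chosen here for the wage, the bonus, the grants, and the ad-hoc payments.

```python
def classify_package(pkg, tol=1e-12):
    """Classify a compensation package following Definition 1.

    pkg is a dict with a hypothetical layout chosen for this sketch:
      'W', 'b', 'r0', 's0' : floats (wage, bonus, time-0 stock, time-0 options)
      'r', 's'             : dicts keyed by 'L'/'H' (refresher stock / options)
      'a'                  : dict keyed by history, e.g. 'HH' (ad-hoc payments)
    Missing entries are treated as zero.
    """
    uses_adhoc = any(abs(v) > tol for v in pkg.get("a", {}).values())
    if uses_adhoc:
        return "non-transparent"
    uses_options = abs(pkg.get("s0", 0.0)) > tol or any(
        abs(v) > tol for v in pkg.get("s", {}).values()
    )
    uses_refresher_stock = any(abs(v) > tol for v in pkg.get("r", {}).values())
    if uses_options or uses_refresher_stock:
        return "complex"
    # Only a wage, a bonus, and restricted stock granted at time 0 remain.
    return "simple"
```

A package with only a wage, a bonus, and time-0 restricted stock is labeled simple; adding any option grant or refresher stock makes it complex; any ad-hoc payment makes it non-transparent.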
I assume that the principal can force the agent to sell all stock and options in period 2 when they are profitable and consume all income generated from it. Hence, given any \(\Omega \in \mathcal{P}\), I can calculate the implied consumption of the agent. I introduce the following notation to denote this consumption as a function of the compensation package: \(C(\Omega) = \{c_{ij}\}_{i,j=L,H}\). This function takes the following form:
\[
\begin{aligned}
c_{HH} &= W + b + r_0 p_{HH} + s_0 (p_{HH} - p_0) + r_H p_{HH} + s_H (p_{HH} - p_H) + a_{HH} \\
c_{HL} &= W + b + r_0 p_{HL} + \max\{s_0 (p_{HL} - p_0), 0\} + r_H p_{HL} + a_{HL} \\
c_{LH} &= W + b + r_0 p_{LH} + \max\{s_0 (p_{LH} - p_0), 0\} + r_L p_{LH} + s_L (p_{LH} - p_L) + a_{LH} \\
c_{LL} &= W + r_0 p_{LL} + r_L p_{LL} + a_{LL}
\end{aligned}
\tag{5}
\]
It is important to note that the function \(C : \mathcal{P} \to \mathbb{R}^4_+\) is not injective, that is, different compensation packages may imply the same contingent consumption vector. Using the function \(C\) we can calculate the expected utility of the agent. For a given compensation package \(\Omega \in \mathcal{P}\) and high effort, this expected utility is:
\[U(\Omega, e_H) = \pi_0 \left[\pi_H \ln(c_{HH}) + (1 - \pi_H) \ln(c_{HL})\right] + (1 - \pi_0) \left[\pi_L \ln(c_{LH}) + (1 - \pi_L) \ln(c_{LL})\right] - e.\]
Finally, the cost to the principal of a contract \(\Omega\) that implements \(e_H\) is
\[K(\Omega, e_H) = \pi_0 \left[\pi_H c_{HH} + (1 - \pi_H) c_{HL}\right] + (1 - \pi_0) \left[\pi_L c_{LH} + (1 - \pi_L) c_{LL}\right].\]
The expected utility under low effort, \(U(\Omega, e_L)\), and the cost of implementing low effort, \(K(\Omega, e_L)\), are constructed in a similar manner, changing the probabilities to those corresponding to low effort.
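The mapping from a package to contingent consumption in equation (5) can be sketched in a few lines. This is only an illustration under stated assumptions: the key names (`W`, `b`, `r0`, `s0`, `rL`, `rH`, `sL`, `sH`, `aHH`, ...) are hypothetical, missing instruments count as zero, and every option term is wrapped in a `max` with zero, which coincides with equation (5) whenever prices are monotonic in output.

```python
def consumption(pkg, p):
    """State-contingent consumption implied by a package, as in equation (5).

    pkg: dict of instruments (hypothetical key names, missing keys = 0).
    p:   dict of prices with keys '0', 'L', 'H', 'LL', 'LH', 'HL', 'HH'.
    """
    def g(key):
        return pkg.get(key, 0.0)

    def option(n, price, strike):
        # An option grant pays only when it ends in the money.
        return max(n * (price - strike), 0.0)

    wage_and_bonus = g("W") + g("b")   # the bonus is paid unless both outputs are low
    return {
        "HH": wage_and_bonus + g("r0") * p["HH"] + option(g("s0"), p["HH"], p["0"])
              + g("rH") * p["HH"] + option(g("sH"), p["HH"], p["H"]) + g("aHH"),
        "HL": wage_and_bonus + g("r0") * p["HL"] + option(g("s0"), p["HL"], p["0"])
              + g("rH") * p["HL"] + g("aHL"),
        "LH": wage_and_bonus + g("r0") * p["LH"] + option(g("s0"), p["LH"], p["0"])
              + g("rL") * p["LH"] + option(g("sL"), p["LH"], p["L"]) + g("aLH"),
        "LL": g("W") + g("r0") * p["LL"] + g("rL") * p["LL"] + g("aLL"),
    }
```

Two different packages can return the same dictionary, which is the non-injectivity noted in the text.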

2.5 Incentive problem

With the compensation packages and the consumption function in hand, we are now ready to write the optimization problem of the principal. I assume throughout that parameters are such that it is always profitable to implement \(e_H\). Hence, the optimal compensation package \(\Omega^*\) is the solution to the following cost minimization problem, where \(\bar{U}\) represents the outside utility the agent would obtain if he were not to participate in the contract:
\[\min_{\Omega \in \mathcal{P}} K(\Omega, e_H) \tag{P1}\]
subject to
\[\bar{U} \le U(\Omega, e_H), \tag{PC}\]
\[U(\Omega, e_H) \ge U(\Omega, e_L). \tag{IC}\]
Problem P1 is difficult to solve in general, due to the large number of non-negativity constraints that the domain for \(\Omega\) implies. Moreover, the function \(C(\Omega)\) depends on the equilibrium stock prices, which in turn depend on the choice of effort. I propose, instead, to solve a simplified problem in which the principal chooses directly a tuple \(c = \{c_{ij}\}_{i,j=L,H}\) of transfers contingent on the history of output realizations, as follows:
\[\min_{c} K(c, e_H) \tag{PS}\]
subject to
\[\bar{U} \le U(c, e_H), \tag{PC'}\]
\[U(c, e_H) \ge U(c, e_L), \tag{IC'}\]
\[c_{HH}, c_{HL}, c_{LH}, c_{LL} \ge 0. \tag{NNC'}\]
I denote the solution to PS as \(c^* \equiv \{c^*_{ij}\}_{i,j=L,H}\). The standard arguments valid for a static moral hazard problem (see Grossman and Hart, 1983) justify that both the PC and the IC constraints bind in the optimum. Note that the agent has logarithmic utility, so the non-negativity constraints (which are now in terms of consumption levels) will never bind. Also, with a simple change of choice variables to utility levels, the objective function is linear and the constraint set is compact and convex, so the solution to PS exists and is unique.

Lemma 1 Any solution \(\Omega^*\) to problem P1 implements the same consumption for the agent as the solution \(c^*\) to the simplified problem PS.
The proof is obvious so it is omitted. It is easy to see that the set of available compensation instruments \(\mathcal{P}\) is rich enough to implement any (positive) transfer scheme contingent on the history of output realizations, i.e., any value for the tuple \(\{c_{ij}\}_{i,j=L,H}\). In particular, setting \(a_{ij} = c^*_{ij}\) for all \(i, j\) is always a sufficient implementation. This result implies that I can study the problem of choosing the instruments separately from the determination of contingent consumption in the optimal contract. However, since the function \(C(\Omega)\) is not invertible, there might be several compensation packages that solve problem P1 and satisfy \(C(\Omega) = c^*\).

3 Equilibrium

Recall from section 2.2 that individual deviations of the shareholders and the investors do not affect the stock prices in the equilibrium of the stock market. This implies that there are only two pricing rules that may appear in any equilibrium: one for any contract that implements \(e_H\) and one for any contract that implements \(e_L\). Moreover, when considering a deviation, the CEO realizes that he can only affect the probability distribution over prices but not the prices themselves.
An equilibrium of the above game between the principal, the agent, and the stock market is defined next.
Definition 2 A Perfect Bayesian Equilibrium of this game in which effort \(e_H\) is implemented consists of a compensation contract \(\Omega^*\) and stock prices such that
i) \(C(\Omega^*) = c^*\), where \(c^* = \arg\min_{c} K(c, e_H)\)
ii) The utility of the agent choosing \(e_H\) is equal to that of choosing \(e_L\) and is as large as his outside utility \(\bar{U}\)
iii) Market prices and the beliefs of the stock market participants about the quality of the firm are consistent with the agent choosing \(e_H\), as defined by \(p_0\), \(p_i\), and \(p_{ij}\) in Equations (2)-(4)
iv) Beliefs about the quality of the firm are updated according to Bayes’ rule
Since the probability of observing any history is positive under the equilibrium level of effort, Bayesian updating provides consistent beliefs, and no refinement is necessary.
In the next subsections, I describe the properties of the equilibrium.

3.1 Equilibrium Stock prices

All stock traders anticipate that, in equilibrium, the agent chooses the recommended level of effort, \(e_H\). Hence, they update their beliefs using the probabilities in the above matrix corresponding to \(e_H\). The equilibrium price of the stock corresponds to the expected output of the firm given the history of realizations, given by \(p_i\) and \(p_{ij}\) in Equations (2)-(4). For the rest of the analysis in the paper, it will be useful to keep in mind the following property of stock prices:
Lemma 2 Stock prices are monotonic in the period’s output: \(p_L \le p_H\), and \(p_{LL} \le p_{LH} = p_{HL} \le p_{HH}\).
In particular, for a type H firm, for any history containing at least one \(y_L\), the updated beliefs put probability one on the state in which output depends on effort, i.e., we have that \(q_{ij} = 1\) if \(y_i\) or \(y_j\) equals \(y_L\). If the observed history does not contain any \(y_L\), instead, the state in which output is always high still has positive probability. This is the case for histories \(y_H\) and \((y_H, y_H)\). That is, using Bayes’ rule,
\[q_H = \frac{q_0 \pi}{q_0 \pi + 1 - q_0}, \qquad q_{HH} = \frac{q_0 \pi^2}{q_0 \pi^2 + 1 - q_0}, \qquad q_L = q_{HL} = q_{LL} = 1.\]
A direct implication of this learning is that the stock prices take the simple form:
\[
\begin{aligned}
p_0 &= q_0 \pi + 1 - q_0 \\
p_H &= q_H \pi + 1 - q_H \\
p_{HH} &= q_{HH} \pi + 1 - q_{HH} \\
p_L &= p_{HL} = p_{LL} = \pi
\end{aligned}
\tag{6}
\]

3.2 Equilibrium Consumption

Problem PS is a particular example of a static moral hazard problem, with i.i.d. output revealed over time and exogenous uncertainty about the probability distribution implemented by each effort level. The characterization of the optimal contract with unrestricted instruments follows easily from the standard first order conditions of problem PS. Define the likelihood ratio (LR) of a history of realizations as the ratio of the expected probabilities of that history under low and high effort:
\[
\begin{aligned}
LR_{HH} &= \frac{\hat{\pi}_0 \hat{\pi}_H}{\pi_0 \pi_H} = \frac{q_0 \hat{\pi}^2 + 1 - q_0}{q_0 \pi^2 + 1 - q_0} \\
LR_{HL} &= \frac{\hat{\pi}_0 (1 - \hat{\pi}_H)}{\pi_0 (1 - \pi_H)} = \frac{(1 - \hat{\pi}) \hat{\pi}}{(1 - \pi) \pi} \\
LR_{LH} &= \frac{(1 - \hat{\pi}_0) \hat{\pi}_L}{(1 - \pi_0) \pi_L} = \frac{(1 - \hat{\pi}) \hat{\pi}}{(1 - \pi) \pi} \\
LR_{LL} &= \frac{(1 - \hat{\pi}_0)(1 - \hat{\pi}_L)}{(1 - \pi_0)(1 - \pi_L)} = \frac{(1 - \hat{\pi})^2}{(1 - \pi)^2}
\end{aligned}
\tag{7}
\]
Proposition 1 Consumption levels in the optimal contract \(c^*\) are ranked by likelihood ratios: \(c^*_{ij} > c^*_{kl} \Leftrightarrow LR_{ij} < LR_{kl}\) for \(ij, kl \in \{HH, HL, LH, LL\}\).
Moreover, consumption is linear in the LR:
\[c^*_{ij} = \lambda + \mu (1 - LR_{ij}), \tag{8}\]
where \(\lambda\) is the multiplier of the constraint PC and \(\mu\) that of the constraint IC.
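Because both constraints bind and consumption is linear in the likelihood ratio, the optimal contract of a type H firm can be computed with a one-dimensional search. The sketch below is an illustration under stated assumptions, not the paper's procedure: it assumes logarithmic utility, an effort disutility of `e` under high effort and zero under low effort, and `q0` as the prior weight on the state in which output depends on effort. It writes consumption as \(\lambda (1 + m (1 - LR))\) with \(m = \mu / \lambda\), so that the incentive constraint pins down `m` and the participation constraint then pins down \(\lambda\). All function names are hypothetical.

```python
import math

def history_probs(q0, p):
    """Probabilities of the four histories for a type H firm when high output
    has probability p in the effort-sensitive state (prior weight q0)."""
    return {"HH": q0 * p * p + 1 - q0, "HL": q0 * p * (1 - p),
            "LH": q0 * (1 - p) * p, "LL": q0 * (1 - p) ** 2}

def solve_contract(q0, pi, pi_hat, u_bar, e):
    """Return (consumption by history, lambda, mu) for problem PS."""
    P = history_probs(q0, pi)        # high effort
    Ph = history_probs(q0, pi_hat)   # low effort
    LR = {h: Ph[h] / P[h] for h in P}

    def ic_gap(m):
        # Binding IC: utility gain from high effort equals its disutility.
        return sum((P[h] - Ph[h]) * math.log(1 + m * (1 - LR[h])) for h in P) - e

    # m must keep consumption positive in the history with the largest LR.
    lo, hi = 0.0, (1 - 1e-12) / (max(LR.values()) - 1)
    for _ in range(200):             # bisection; ic_gap is increasing in m
        mid = 0.5 * (lo + hi)
        if ic_gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    m = 0.5 * (lo + hi)

    # Binding PC: expected log consumption minus e equals the outside utility.
    log_lam = u_bar + e - sum(P[h] * math.log(1 + m * (1 - LR[h])) for h in P)
    lam = math.exp(log_lam)
    c = {h: lam * (1 + m * (1 - LR[h])) for h in P}
    return c, lam, lam * m
```

Called with a weight of 0.7 on the effort-sensitive state, \(\pi = .7\), \(\hat{\pi} = .28\), an outside utility of 5, and an effort disutility of .3, the sketch returns consumption of about 79.5 after two low outputs, 205.9 after one high and one low output, and 216.6 after two high outputs.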
It is worth noting that, if the state were known for sure to be the one in which output depends on effort, the above characterization would always imply the same ranking for consumptions for all combinations of parameters, as the next proposition states. Define \(\Delta_H \equiv c^*_{HH} - c^*_{HL}\) and \(\Delta_L \equiv c^*_{LH} - c^*_{LL}\).
Proposition 2 In the absence of learning about the state (certainty case, with the state known to be the one in which output depends on effort) the optimal consumption is monotonic in cumulative output and it satisfies:
\[0 < \Delta_H < \Delta_L.\]

With uncertainty about the state, however, the posterior evolves differently under \(e_L\) than under \(e_H\), changing the weight of each probability in the numerator and denominator of the LR. As stated in the next proposition, this can create non-monotonicities: For a type H firm, it may be the case that, when the first period output has been \(y_H\), the agent’s wage is higher if we observe \(y_L\) in the second period than if we observe \(y_H\).
Proposition 3 When the firm is of type H, consumption spread always satisfies \(\Delta_H < \Delta_L\). Also, non-monotonicities never arise following a low output in the first period, i.e., \(\Delta_L > 0\) always. Moreover, we have
(i) whenever \(\pi + \hat{\pi} \ge 1\), for all \(q_0 \in (0, 1)\), \(\Delta_H > 0\);
(ii) whenever \(\pi + \hat{\pi} < 1\), for \(q_0 \in (0, q^*)\), \(\Delta_H < 0\), and for \(q_0 \in (q^*, 1)\), \(\Delta_H > 0\), where
\[q^* = \frac{1 - \pi - \hat{\pi}}{1 - \pi - \hat{\pi} + \pi \hat{\pi}}.\]
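The threshold in Proposition 3 can be checked directly against the likelihood ratios in (7). The sketch below is an illustration only; the function names are hypothetical, and it uses the fact that, by equation (8), the sign of \(\Delta_H\) is the sign of \(LR_{HL} - LR_{HH}\) when the multiplier on the incentive constraint is positive.

```python
def q_star(pi, pi_hat):
    """Threshold prior from Proposition 3 (it lies in (0, 1) only when pi + pi_hat < 1)."""
    return (1 - pi - pi_hat) / (1 - pi - pi_hat + pi * pi_hat)

def delta_h_sign(q0, pi, pi_hat):
    """Sign of Delta_H for a type H firm: +1, 0, or -1."""
    lr_hh = (q0 * pi_hat**2 + 1 - q0) / (q0 * pi**2 + 1 - q0)
    lr_hl = (1 - pi_hat) * pi_hat / ((1 - pi) * pi)
    gap = lr_hl - lr_hh              # Delta_H = mu * (LR_HL - LR_HH)
    return (gap > 0) - (gap < 0)
```

With \(\pi = .5\) and \(\hat{\pi} = .2\), for example, the threshold is .75: priors below it give a non-monotonic contract and priors above it give a monotonic one.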

The fact that non-monotonicities only arise in \(\Delta_H\), and only when \(\pi + \hat{\pi} < 1\), is related to the interaction of the informativeness of the signal \((y_H, y_L)\) (or, equivalently, \((y_L, y_H)\)) and the learning about the true state. On one hand, \(\pi + \hat{\pi} < 1\) implies that \(LR_{HL}\) is less than one, and hence the optimal contract seeks to reward the agent when observing \((y_H, y_L)\) as well as when observing \((y_H, y_H)\). Punishments are reserved for \((y_L, y_L)\). However, observing \((y_H, y_L)\) reveals perfectly that the true state is the one in which output depends on effort, making \((y_H, y_L)\) a more informative signal about effort than if there were still positive probability on the state in which output is always high (which is the case when we observe \((y_H, y_H)\)). This tends to make \(c^*_{HL}\) large, but not \(c^*_{HH}\), making \(\Delta_L\) large and \(\Delta_H\) small. For low enough \(q_0\), the relative informativeness of \((y_H, y_L)\) and \((y_H, y_H)\) may be reversed and we may get \(\Delta_H < 0\).
Figure 4 presents graphically the analytical characterization in Proposition 3 of the parameters that imply a nonmonotonicity. For this, we assume that, for each combination of \(\pi\) and \(\hat{\pi}\), there is a mass one of firms whose prior \(q_0\) is distributed uniformly between 0 and 1. The vertical axis, then, represents the proportion of firms for which nonmonotonicity is present in the optimal contract. We see that, when \(\pi\) is high enough, or the difference \(\pi - \hat{\pi}\) is large enough (both cases that lead to \(\pi + \hat{\pi} \ge 1\)), consumption is monotonic. For the combinations that imply the possibility of nonmonotonicities, with \(\pi + \hat{\pi} < 1\), consumption will be nonmonotonic for the firms with the smaller priors, \(q_0 < q^*\), as defined in Proposition 3. Hence, figure 4 is, in practice, a graph of the threshold \(q^*(\pi, \hat{\pi})\).
With the properties of the optimal contract in hand, we now turn to the implications for compensation packages.

3.3 Equilibrium compensation packages

The analysis of the properties of equilibrium consumption in the previous section was based on the solution to problem PS, with contingent consumption transfers \(c^*\). In this section, I use the \(C(\Omega^*)\) mapping, together with the properties of \(c^*\), to analyze the characteristics of the solution to the original problem P1 in terms of compensation packages \(\Omega\) in \(\mathcal{P}\).

Figure 4: Type H firm. Probability that non-monotonicities arise in the optimal contract (\(\Delta_H < 0\)). This happens whenever \(q_0 \le q^*\). (Axes: pi, pihat; vertical axis: \(q^*\).)

The first thing to note is that given the richness of the elements of \(\mathcal{P}\), the optimal contract \(c^*\) characterized in the previous subsection is always sufficient in a trivial way: Because of the availability of ad-hoc payments, the firm can simply set \(a_{ij} = c^*_{ij}\) for all \(ij\) pairs. However, there may be other combinations of compensation instruments that implement a given optimal contract. To solve the indeterminacy of the compensation package, I make the following assumption, which makes use of the classification of compensation packages presented in Definition 1:
Assumption 1 The principal, when presented with several choices to implement a given contingent consumption scheme, prefers a simple scheme to a complex one, and prefers not to use ad-hoc payments.
Implicit behind this assumption is the fact that it may be more costly for shareholders to use ad-hoc payments or complex compensation instruments. The costs may include communication to investors of compensation practices or tax deductions that cannot be taken advantage of. In the rest of this section, I ask the following questions: What types of firms are not able to avoid using ad-hoc payments, and which are? Which can make do with just a simple scheme? To answer these questions, I use the following strategy. First, I spell out \(C(\Omega)\) under the restrictions implied by a simple and a complex scheme. Then, I analyze the system of equations resulting from equating \(C(\Omega) = c^*\).

Definition 3 Consider a firm defined by a probability structure of the form of matrix (1). A compensation scheme \(\Omega\) is sufficient if, for the \(c^*\) corresponding to the parameter values that describe the firm, the system of equations resulting from equating \(C(\Omega) = c^*\) has a solution.
Note that although restricting to schemes in the subsets \(\mathcal{S}\) and \(\mathcal{C}\) simplifies the system \(C(\Omega) = c^*\), it is difficult to characterize the solution in general, since all packages \(\Omega\) need to have non-negative elements. In what follows, I present a partial characterization of the choice of compensation packages between simple, complex, or nontransparent. I complement my analysis with a complete numerical characterization of this choice.
The conditions in the propositions in this section inform us about the restrictions that using standard compensation instruments imposes on the sensitivity of the consumption of the agent to changes in stock prices. This sensitivity is, in effect, a reduced form for the composition of the sensitivity of consumption to signals (i.e., output realizations), which is dictated in the optimal contract by the likelihood ratios, and the sensitivity of prices to signals, which is in turn dictated by the pricing rule of outside investors. The results that I present next show that a limited set of compensation instruments like \(\mathcal{S}\) puts severe restrictions on the relationship of these two sensitivities, but still there are many parameter combinations for which these limited instruments are rich enough to implement the optimal contract. Complex schemes, on the other hand, are always sufficient when consumption is monotonic. Also, they do not necessarily include more than three instruments, although one of them will need to be a refresher option grant.
3.3.1 When are firms able to avoid ad-hoc payments? Sufficiency of transparent schemes

By simple inspection of the system \(c^* = C(\Omega)\), we can see that payments to the agent coming from a compensation package that does not include ad-hoc payments are necessarily monotonic in prices, and hence monotonic in output, since prices are themselves monotonic in output. As the following proposition describes, monotonicity of the optimal consumption is both a necessary and a sufficient condition for a complex scheme to be sufficient.
Proposition 4 A complex scheme is sufficient for a type H firm (with a capped bonus) if and only if \(\Delta_H \ge 0\), i.e., if and only if \(q_0 \ge q^* = \frac{1 - \pi - \hat{\pi}}{1 - \pi - \hat{\pi} + \pi \hat{\pi}}\).
The proof of this proposition shows that the system \(c^* = C(\Omega)\) for a complex scheme is underdetermined, i.e., if it has a solution, it has an infinite number of them. However, only solutions that satisfy the non-negativity constraints on all the instruments constitute sufficient complex schemes. In the proof of the sufficiency it is shown that whenever \(\Delta_H \ge 0\) the following scheme is always sufficient:
\[
\begin{aligned}
W &= c^*_{LL} \\
b &= \Delta_L \\
s_H &= \frac{\Delta_H}{p_{HH} - p_H} \\
r_0 &= s_0 = r_H = r_L = s_L = 0
\end{aligned}
\tag{9}
\]
This scheme includes only three instruments, but one of them is a “refresher” option grant that is promised to the agent only if the first period output is high. Hence, the exercise price of this option grant will be \(p_H\), and it will only pay off in the final node \((y_H, y_H)\).
A different way of reading proposition 4 is that, for a firm of type H, compensation schemes must be nontransparent if and only if non-monotonicities arise (\(\Delta_H < 0\)). Figure 4 represented the population of firms for which this was the case. The model implies that only for these firms are ad-hoc payments unavoidable.
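The three-instrument scheme in (9) is straightforward to compute once the optimal consumption and the prices are known. The sketch below is an illustration only; the function name and dictionary keys are hypothetical, and the inputs are taken as given rather than solved for.

```python
def complex_scheme(c, p):
    """Wage, capped bonus, and refresher option grant from equation (9).

    c: optimal consumption with keys 'HH', 'HL', 'LL'.
    p: prices with keys 'H' (after one high output) and 'HH'.
    Returns None when Delta_H < 0, in which case no transparent scheme works.
    """
    delta_l = c["HL"] - c["LL"]
    delta_h = c["HH"] - c["HL"]
    if delta_h < 0:
        return None
    return {
        "W": c["LL"],                            # wage covers the lowest consumption
        "b": delta_l,                            # capped bonus covers the lower spread
        "sH": delta_h / (p["HH"] - p["H"]),      # option pays only after two high outputs
    }
```

With consumption of 48.2, 202.3, and 229.3 and prices of about 0.8139 and 0.8400 after one and two high outputs, the grant comes out a little above one thousand options, in line with the order of magnitude reported in Example 2.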
3.3.2 When are firms able to avoid options and refresher grants? Sufficiency of simple schemes

In this section, I provide necessary and sufficient conditions for the sufficiency of a simple compensation package, which does not include options or “refresher” grants given in period one after the interim output is realized. One caveat is that the conditions will not be, in general, in terms of the primitive parameters of the firm; to gain insight on the sufficiency in terms of primitives, I complement the analytical derivations with numerical characterizations.
Proposition 5 A simple scheme is sufficient if and only if
a) \(\frac{c^*_{LL}}{\Delta_H} \ge \frac{p_{LL}}{p_{HH} - p_{LL}} = \frac{\pi}{(1 - q_{HH})(1 - \pi)}\), and
b) \(\Delta_H \ge 0\) or, equivalently, \(q_0 \ge q^* = \frac{1 - \pi - \hat{\pi}}{1 - \pi - \hat{\pi} + \pi \hat{\pi}}\) (see Proposition 3).
The proposition can be otherwise stated as a simple scheme being sufficient whenever the solution to \(c^* = C(\Omega)\) for a simple scheme satisfies the non-negativity constraint for the three instruments:
\[
\begin{aligned}
W &= c^*_{LL} - \pi \frac{\Delta_H}{(1 - \pi)(1 - q_{HH})} \ge 0 \\
b &= \Delta_L \ge 0 \\
r_0 &= \frac{\Delta_H}{(1 - \pi)(1 - q_{HH})} \ge 0
\end{aligned}
\]
The intuition behind the form of this solution is conveyed next. For a firm of type H any number of low outputs is perfectly informative about the state. Hence, there is no variation in prices in the lower range, implying that the spread \(c^*_{LH} - c^*_{LL}\) needs to be implemented with the bonus payout: \(b = \Delta_L\). Setting the bonus \(b\) to satisfy this constraint is always sufficient, since \(\Delta_L > 0\) always for a firm of type H. Since we also have that \(\Delta_H < \Delta_L\) always, the bonus plan needs to be capped, rather than linear in cumulative output. Hence, the spread \(c^*_{HH} - c^*_{HL}\) needs to be implemented with restricted stock:
\[\Delta_H = r_0 (p_{HH} - p_{HL}).\]
This is feasible only if the optimal consumption is monotonic, so that we have \(\Delta_H \ge 0\) (condition b) in proposition 5). Finally, the level of \(c^*_{LL}\) needs to be implementable, given the restricted stock \(r_0\), with a positive wage:
\[W = c^*_{LL} - r_0 p_{LL}.\]
This is feasible whenever condition a) in proposition 5 is satisfied. Example 1 presents a firm for which a simple scheme is sufficient.

Figure 5: For the firm in Example 1, a simple scheme is feasible. (Axes: stock price, consumption; the plot marks \(W = c_{LL} - r_0 p_{LL}\), \(b = \Delta_L\), and \(r_0 (p_{HH} - p_{HL}) = \Delta_H\).)

Example 1 Consider a type H firm described by the following parameters: \(\pi = .7\), \(\hat{\pi} = .28\), and \(q_0 = .3\). The agent has an outside utility of \(\bar{U} = 5\) and an effort disutility of \(e = .3\). The optimal contract is: \(c^*_{LL} = 79.5\), \(c^*_{HL} = 205.9\), and \(c^*_{HH} = 216.6\). The equilibrium stock prices are: \(p_0 = .79\), \(p_L = p_{HL} = p_{LL} = .7\), \(p_H = .81\), and \(p_{HH} = .84\). For this firm, a simple scheme is sufficient, and it takes the following values: \(W = 25.8\), \(b = 126.3\), and \(r_0 = 76.8\). Figure 5 presents this scheme graphically.
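The construction behind Proposition 5 can be sketched as a short routine that solves for the three instruments and checks their signs. This is an illustration only: the function name and dictionary keys are hypothetical, and the inputs below are the rounded consumption levels and prices reported in the examples, so the outputs match the reported instruments only up to rounding.

```python
def simple_scheme(c, p):
    """Wage, bonus, and time-0 restricted stock that reproduce consumption c.

    c: optimal consumption with keys 'HH', 'HL', 'LL'.
    p: prices with keys 'HH', 'HL', 'LL'.
    Returns (package, feasible), where feasible is True when every
    instrument is non-negative (conditions a) and b) of Proposition 5).
    """
    delta_l = c["HL"] - c["LL"]
    delta_h = c["HH"] - c["HL"]
    r0 = delta_h / (p["HH"] - p["HL"])      # stock implements the upper spread
    pkg = {
        "W": c["LL"] - r0 * p["LL"],        # wage left after the stock's value in the low state
        "b": delta_l,                       # capped bonus implements the lower spread
        "r0": r0,
    }
    feasible = all(value >= 0 for value in pkg.values())
    return pkg, feasible
```

With consumption of 79.5, 205.9, and 216.6 and prices of .7 and .84, the routine returns a wage near 26, a bonus near 126, and about 76 units of restricted stock, all non-negative. If consumption is instead 48.2, 202.3, and 229.3 at the same prices, the implied wage is negative and the simple scheme is not feasible.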
Unfortunately, condition a) in Proposition 5 depends in a nontrivial way on the primitives of the model (through \(c^*_{LL} / \Delta_H\)) and it is not possible to provide an analytical characterization of the combination of \(\bar{U}\), \(e\), \(q_0\), \(\pi\), and \(\hat{\pi}\) for which it is satisfied. As lemma 3 in the appendix states, however, condition a) is independent of \(\bar{U}\).22 This result provides us with a simpler way of checking the condition numerically. Figure 6 plots in the left panel the probability that both condition a) and b) are satisfied for a given \(\pi\) and \(\hat{\pi}\) combination, for \(e = .1\) and assuming a uniform distribution over an evenly spaced grid of \(q_0\). The values that make it most likely are combinations of high values of \(\pi\) with intermediate values of \(\hat{\pi}\). Whenever \(\hat{\pi}\) takes values above 0.5, no combination of \(\pi\) and \(q_0\) makes a simple scheme sufficient. Numerical robustness checks with respect to the level of effort disutility, \(e\), indicate that a simple scheme is sufficient for a smaller set of \((\pi, \hat{\pi})\) combinations when \(e\) is larger; to illustrate this fact, the right panel in Figure 6 plots the sufficiency of a simple scheme for \(e = .4\), which is greatly reduced from that of a firm with \(e = .1\).

Figure 6: Fraction of type H firms for which a simple scheme is sufficient, for \(e = .1\) (left panel) and \(e = .4\) (right panel). There is sufficiency for a smaller set of firms when effort disutility is higher. (Axes: pi, pihat; vertical axis: % of q0.)
3.3.3

The role of options and refresher grants in implementing the optimal contract

Given that condition b) in Proposition 5 is necessary and suﬃcient for a complex scheme to be
suﬃcient, but only necessary in the case of a simple scheme, a natural question is whether there
is a nontrivial role for complex instruments. Can they help in the implementation of the optimal
contract when a simple scheme is not suﬃcient? The answer to these questions is positive.
The next example illustrates the role of this refresher grant for a specific parametrization of a
type H firm. For the firm in the example, a simple scheme is not suﬃcient because condition ) is
violated. However, condition ) is satisfied, and hence a complex scheme is suﬃcient: For this firm,
the possibility of using a refresher grant  allows it to implement the optimal contract without
resorting to ad—hoc payments.
Example 2 Consider a firm of type H described by the following parameters:  = 7 ̂ = 4
and 0 = 3 The agent has an outside utility of ̄ = 5 and an eﬀort disutility of  = 3
22

This property is due to the logarithmic utility and is not likely to be robust for other functional forms of the
utility of the agent.

22

cHH

r (p
0

cHL

−p ) = Δ

HH

HL

H

b=Δ
Consumption

L

cLL

0

W=c

LL

−r p

0 LL

r p

0 LL

pLL = pHL

pHH

Stock price

Figure 7: For the firm in Example 2, a simple scheme is not feasible.
(Note that the only diﬀerence with the firm in example 1 is in the value of ̂) The optimal
contract is:  = 482  = 2023 and  = 2293. The equilibrium stock prices
are: 0 = 79  =  =  = 7  = 81 and  = 84 For this firm, a simple
scheme is not suﬃcient, as can be seen in Figure 7. Since condition ) fails, the level of 0
needed to implement ∆ implies that consumption in the  and  states is too high —
the only way to hit the target  and  would be to make  negative — which is not
feasible. A complex scheme, instead, is suﬃcient. One such scheme takes the following values:
 = 482,  = 1540, and  = 10394 Figure 8 presents this implementation graphically. A
refresher stock grant,   given to the agent after observing a high output in the first period
is instrumental in implementing the optimal scheme. Given that there is no variation in
prices following a low realization in the first period, the bonus needs to be used to implement
∆  Since ∆  ∆  the bonus needs to be capped, rather than linear in the accumulated
output. Hence, an instrument that will only pay oﬀ in the  history is needed: an option
 granted when market price is equal to  

The next proposition illustrates that, more generally, there is a nontrivial role for refresher stock
grants and option grants.
Proposition 6 There is a nonempty set of type H firms for which a simple scheme is not suﬃcient
but a complex one is.
Figure 8: For the firm in Example 2, a complex scheme is feasible. (Axes: stock price, horizontal; consumption, vertical. Labeled relations: $W = c_{LL}$, $b = \Delta_L$, $s_H (p_{HH} - p_H) = \Delta_H$.)
The proof is by example: Figure 9 presents graphically a grid of the combinations of $q_0$, $\pi$, and $\hat{\pi}$ for which a simple scheme is not sufficient but a complex one is. That is, condition a) in Proposition 5 is violated but condition b) is satisfied.

4

Generalization of the model

A more general firm description than the one I have been using as the leading example (the type H firm) would be one as in matrix 1, where output also depends on the effort choice of the agent in the state in which a type H firm always produces high output. When considering this general case, I assume that $\pi_\theta \neq \hat{\pi}_\theta$ for at least one $\theta$ and that the prior over $\theta = G$ satisfies $0 < q_0 < 1$. Also, higher effort implies a higher probability of observing $H$ (for any quality of the firm): $\pi_G \geq \hat{\pi}_G$ and $\pi_B \geq \hat{\pi}_B$, with at least one being a strict inequality. For a firm of this generality, moral hazard and learning interact in more complicated ways; however, I will argue in this section that the intuition behind the properties of optimal consumption will rely on the same forces highlighted earlier for a type H firm.

Figure 9: Proportion of firms for which a simple scheme is not feasible but a complex one is (type H firm). (Axes: $\pi$, $\hat{\pi}$, and % of $q_0$.)

4.1

A type L firm

For the sake of this argument, I now introduce a second special type of firm. Let a firm of type L be described, in every period, by:
\[
\Pr{}^{L}(y = H \mid \theta, \text{effort}): \qquad
\begin{array}{l|cc}
 & \theta = G \;\; (q_0) & \theta = B \;\; (1 - q_0) \\ \hline
\text{high effort} & \pi & 0 \\
\text{low effort} & \hat{\pi} & 0
\end{array}
\qquad (10)
\]
In such a firm, when effort is not effective the firm always produces low output, in contrast to the type H firm, which always produces high output in that case. The complete set of counterpart results for type L firms to the results presented for a type H firm are included in the appendix. Here, I present the discussion of the results, compare them to those of a type H firm, and use both sets of results to frame the discussion about the general firm case.
The first thing to note about a firm of type L is that the properties of consumption in the optimal contract differ from those of a type H firm: Non-monotonicities never arise following a high realization in the first period ($\Delta_H > 0$ always), but we may have $\Delta_L < \Delta_H$ and non-monotonicities following a low realization in the first period ($\Delta_L < 0$) (see Proposition 11 in the Appendix).
As with a type H firm, a complete characterization of the sufficiency of simple schemes in terms of primitives is not possible. The next proposition presents the necessary and sufficient conditions, where condition a) is again stated in terms of endogenous variables of the optimal contract. Figure 10 presents a numerical characterization of the sufficiency of a simple scheme for two different parametrizations of the disutility of effort, $e$.
Proposition 7 For a type L firm, a simple scheme with a capped bonus is never suﬃcient. A
simple scheme with a linear bonus is suﬃcient if and only if:
a)

∗
∆ −∆




 − ,

and

b) ∆  ∆  or 0  1L =

 2 (1−)+2
 −
2 −
2
2
 (−
)

(see Proposition 11).

The insufficiency of a capped bonus is straightforward: Because for an L firm we have that $p_{HL} = p_{HH}$, it follows that under a capped bonus $\Delta_H$ would need to be zero, which is never the case for the nontrivial parametrization that we study, with $\pi \neq \hat{\pi}$ and $q_0 \in (0, 1)$. On the other hand, a linear bonus, which pays $b (y_1 + y_2)$, may be sufficient if the following solution is positive for all three instruments:
\[
W = c^*_{LL} - q_{LL} \frac{\Delta_L - \Delta_H}{1 - q_{LL}}, \qquad
b = \Delta_H, \qquad
r_0 = \frac{\Delta_L - \Delta_H}{(1 - q_{LL})\, \pi}.
\]
The form of this solution is intuitive. Again, because for an L firm $p_{HL} = p_{HH}$, the spread $c_{HH} - c_{HL}$ needs to be implemented through a bonus payout: $b = \Delta_H$. Given the linear bonus program, this is not a problem as long as $\Delta_L \geq \Delta_H$. If this is the case, the quantity of restricted stock is determined to satisfy
\[
\Delta_L - b = r_0 (p_{LH} - p_{LL}).
\]
It is the case that $\Delta_L \geq \Delta_H$ whenever incentives need to be more high powered in the low range of outcomes; since an L firm is more likely to get low output levels out of luck (when the true state is $B$), incentives are high powered in the low range of outcomes only for a high enough prior that the state is $G$ (i.e., $q_0 > q_{1L}$). Finally, it must be the case that the implied wage given $r_0$ is positive (condition a) in Proposition 7):
\[
W = c_{LL} - r_0 p_{LL}.
\]
If this condition is not met, a simple scheme is not sufficient. As was the case for type H firms, the examples in Figure 10 suggest that higher values for effort disutility imply that the sufficiency of a simple scheme is less likely to happen. (Note that by Lemma 3 condition a) in Proposition 7 is independent of the value of $\bar{U}$.)
When turning to the implementation with more complex compensation packages, the case $\Delta_L < \Delta_H$ is the one that prompts the use of ad-hoc compensation, so for this firm nontransparent schemes may be needed even if the optimal contract is monotonic in cumulative output. This is summarized in the following characterization of the sufficiency of complex schemes:
Proposition 8 A complex scheme is sufficient for a type L firm (with a linear bonus) if and only if $\Delta_L \geq \Delta_H$.
Figure 10: Fraction of type L firms for which a simple scheme is sufficient, for $e = 0.1$ (left panel) and $e = 0.4$ (right panel). There is sufficiency for a smaller set of firms when effort disutility is higher. (Axes: $\pi$, $\hat{\pi}$, and % of $q_0$.)
As the proof of this proposition shows, complex schemes that are sufficient when $\Delta_L \geq \Delta_H$ also can take a very simple form, such as:
\[
W = c_{LL}, \qquad
b = \Delta_H, \qquad
r_H = \frac{\Delta_L - \Delta_H}{p_{HL}}, \qquad
s_L = \frac{\Delta_L - \Delta_H}{p_{LH} - p_L},
\]
with all other instruments set to zero, where the bonus is linear in this case. Proposition 11 in Appendix 2 shows that $\Delta_L > \Delta_H$ whenever $q_0 > q_{1L}$. We see that for a type L firm refresher grants may be transferred to the agent even after a bad interim result of the firm (a low first-period output). This was not the case for a type H firm. The reason is that a type L firm is successful only when effort is effective, so prices are not sensitive to a second high output realization, while the optimal incentives are; this implies that the bonus should be linear in output in order to implement $\Delta_H$. Because it is still the case that $\Delta_L \geq \Delta_H$, a grant that is only given after a low output and pays only if the second period output is high is instrumental in avoiding ad-hoc payments. The following example illustrates this role of refresher grants for a firm of type L.
Example 3 The following parameters describe an example of a firm of type L for which a simple scheme is not sufficient, but a complex one is: $\pi = 0.4$, $\hat{\pi} = 0.2$, $q_0 = 0.8$. The optimal contract is: $c_{LL} = 103.9$, $c_{HL} = 346.5$, and $c_{HH} = 474.1$. The equilibrium stock prices are: $p_0 = 0.32$, $p_L = 0.28$, $p_{LL} = 0.24$, $p_{LH} = p_{HL} = p_{HH} = 0.4$. For this firm, a complex scheme is sufficient, and it takes the following values: $W = 103.9$, $b = 127.6$, $s_L = 977.9$, and $r_H = 287.6$. Figure 11 presents this scheme graphically. The refresher grants are needed to provide $\Delta_L - \Delta_H$. After a first period $H$, this difference is achieved by granting stock $r_H$, which will pay the same in the state $HL$ as in the state $HH$ (and hence $b = \Delta_H$ will implement $c_{HH} - c_{HL}$). After a first period $L$, it is achieved by granting $s_L$, which has an exercise price of $p_L$ and hence will only pay in the state $LH$, since $p_{LL} < p_L < p_{LH}$ always.

Figure 11: For the type L firm in Example 3, a simple scheme is not feasible but a complex one including refresher grants ($r_H$ and $s_L$) is feasible. (Axes: stock price, horizontal; consumption, vertical. Labeled relations: $W = c_{LL}$, $b = \Delta_H$, $r_H p_{HL} = \Delta_L - \Delta_H$, $s_L (p_{LH} - p_L) = \Delta_L - \Delta_H$.)
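The complex scheme of Example 3 can be reconstructed from the optimal consumption levels and the Bayesian prices. The following is a minimal sketch under stated assumptions, not the paper's code: decimal points are restored in the example's values, `q0` is read as the prior that effort is effective, the price equals the probability of a high output next period, the linear bonus pays `b` per high output, stock `r_H` is granted after a first-period high output, and the option `s_L` is granted after a first-period low output with exercise price `p_L`. The function name `type_l_price` is hypothetical.

```python
def type_l_price(q0, pi, highs, lows):
    """Stock price of a type L firm after `highs` high and `lows` low outputs.

    Assumption: in the state where effort is not effective output is always
    low, so a single high output reveals that effort is effective.
    """
    if highs > 0:
        posterior = 1.0
    else:
        effective = q0 * (1.0 - pi) ** lows
        posterior = effective / (effective + (1.0 - q0))
    return posterior * pi


pi, q0 = 0.4, 0.8
c_LL, c_LH, c_HH = 103.9, 346.5, 474.1   # c_LH = c_HL in the optimal contract

p_L = type_l_price(q0, pi, 0, 1)
p_LH = type_l_price(q0, pi, 1, 1)
p_HL = p_HH = p_LH                       # all histories with a high output

delta_H = c_HH - c_LH
delta_L = c_LH - c_LL

W = c_LL
b = delta_H
r_H = (delta_L - delta_H) / p_HL
s_L = (delta_L - delta_H) / (p_LH - p_L)

# Consumption implied by the instruments in each history.
implied = {
    "LL": W,                              # option is out of the money, no bonus
    "LH": W + b + s_L * (p_LH - p_L),
    "HL": W + b + r_H * p_HL,
    "HH": W + 2 * b + r_H * p_HH,
}
print(round(r_H, 1), round(s_L, 1), implied)
```

Under these assumptions the instruments reproduce the three consumption levels exactly, and the grant sizes come out close to the values reported in the example.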
In general, we also find a nontrivial role for complex schemes for type L firms, as stated in the
next proposition.
Proposition 9 There is a nonempty set of firms (parameter values) for which a simple scheme is
not suﬃcient but a complex one is.
The proof is by example; for a firm of type L with $\bar{U} = 5$ and $e = 0.3$, Figure 12 presents a set of parameters for which a simple scheme is not sufficient but a complex one is.

4.2

Implications for other types of firms

Firms of type H and L represent particular examples of technologies that we may identify in real
life. For example, we may think of H firms as mature, successful ones in which only the possibility
of a bad match between the CEO and the firm, or an adverse change in technology or in regulations,
triggers bad realizations and the need for high effort of the CEO. Type L firms, instead, may be younger, struggling firms for which a good match, or favorable change in technology or regulations, needs to be paired with high effort to improve outcomes.

Figure 12: For a type L firm (for $\bar{U} = 5$ and $e = 0.3$), sufficiency of a complex scheme whenever a simple one is not sufficient. (Axes: $\pi$, $\hat{\pi}$, and % of $q_0$.)
Consider now a firm for which, in state  a high output is realized with probability ̄ regardless
of the eﬀort choice of the agent. Learning interacts with the moral hazard problem in this setting
in a more general way than in the type H and L firms, since a high output observation is more
or less informative about the moral hazard problem depending on the particular value of ̄. For
this firm, if ̄ is closer to 1 than to 0, stock prices    and  will all be close to    but
not all equal as they were for a type H firm. The low sensitivity of prices to output, however,
will condition the usage of certain instruments in a similar way to how they did for a type H firm.
For example, using a grant 0 to implement ∆ will not be impossible now, but  −  being
small will imply that a very large 0 will be needed, which may imply too large of a diﬀerence in
consumption across the  and  histories. Again, refresher grants may prove useful.
In the most general type of firm, described in matrix 1, the learning will aﬀect prices in a similar
way as we just described for a firm with probability ̄ across the two eﬀort choices. However, the
fact that eﬀort will aﬀect the probability in state  will imply that the optimal contract will have
to accommodate for the incentives in state  as well. This will not change the fact that the
choice of compensation instruments will still be guided to translate the sensitivity of prices into the
sensitivity of consumption, in a similar fashion as the one described for the type H firm.
It is easy to find examples for which the optimal contract will not be monotonic in output (see
Lemma 4 in the Appendix). Because the state  is the only one that can have a likelihood ratio

greater or smaller than 1, the non—monotonicities will only arise following one particular realization
of output in the first period (i.e., only one of $\Delta_H$ or $\Delta_L$ may be negative at a time). In particular, parametrizations for $\pi_G$, $\hat{\pi}_G$, $\pi_B$, and $\hat{\pi}_B$ that are close to a type H firm may have $\Delta_H < 0$, while parametrizations close to a type L firm may have $\Delta_L < 0$. Following the same logic as the one we developed when studying a type H firm, we can show that monotonicity will be both necessary and sufficient for a complex scheme to implement the optimal contract.
Proposition 10 A complex scheme is sufficient for a general type firm (with a capped bonus) if and only if consumption is monotonic in the optimal contract (both $\Delta_H \geq 0$ and $\Delta_L \geq 0$ hold) and $\pi_G \neq \pi_B$.
Necessity of the monotonicity follows from the fact that any complex scheme implies $\Delta_H \geq 0$ and $\Delta_L \geq 0$. Sufficiency follows from the fact that we can provide a scheme that always implements the optimal contract for a firm of a general type; this is the same scheme in 9, which was used to prove sufficiency in Proposition 4. This scheme only uses a wage $W$, a capped bonus, and a refresher grant $s_H$. The knife-edge case $\pi_G = \pi_B$ implies no variation in stock prices, so ad-hoc payments are needed to implement any variation in consumption. Hence, even with the richer stock price dynamics of a general firm, the implementation of the optimal scheme with a complex compensation package needs to use refresher grants.
For a firm of a general type, a numerical characterization of the suﬃciency of simple schemes
is not easily summarized graphically, since there are two extra parameters compared to the special
type firms.23 Figure 13 presents, in the top and bottom left panel, the suﬃciency of simple schemes
for firms with the same fixed value of   and ̂   for two diﬀerent pairs of   and ̂   In both
cases, around the case   =   a simple scheme is not feasible. However, there are many other firm
specifications for which suﬃciency fails. The figures on the right top and bottom panels present
the combinations of parameters for which a nontransparent scheme is needed. For the firm in the
top panel, we have   + ̂   1 and we see that when   and ̂  both take values close to 1 these
general firms have non—monotonicities in their optimal contract; these are firms similar to a type
H firm, for which ∆  0 when 0 is low enough (see Proposition 3). For the firms in the bottom
panel, we have   + ̂   1 and when   and ̂  both take values close to 0 these general firms
also exhibit non—monotonicities; these are firms similar to a type L firm, for which ∆  0 when 0
is low enough (see Proposition 11). As stated in Proposition 10, the right panels show that ad—hoc
payments will also be needed for the knife—edge case of   =   
Again, we find that complex instruments fulfill a nontrivial role, since for some firms a simple
scheme is not suﬃcient, but a complex one is. The next example discusses such a firm.
Example 4 The following parameters describe an example of a firm of a general type for which a simple scheme is not sufficient, but a complex one is: $\pi_G = 0.9$, $\hat{\pi}_G = 0.7$, $\pi_B = 0.3$, $\hat{\pi}_B = 0.1$, $q_0 = 0.8$, $e = 0.075$, $\bar{U} = 5$. The optimal contract is: $c_{LL} = 122.8$, $c_{HL} = 141.1$, and $c_{HH} = 174.2$. The equilibrium stock prices are: $p_0 = 0.78$, $p_L = 0.52$, $p_{LL} = 0.35$, $p_{HL} = 0.68$, $p_H = 0.85$, $p_{HH} = 0.88$. For this firm, a complex scheme is sufficient, and it takes the following values: $W = 122.8$, $b = 18.3$, $s_H = 1103.7$. Figure 14 presents this scheme graphically. The refresher stock grant is needed to provide $\Delta_H$, since trying to use $r_0$ to stick to a simple scheme would imply too high consumption in the state $HL$ (and hence a negative capped bonus would be necessary, which violates the non-negativity constraint).

23 See Propositions 13 and 12 in the Appendix for the three conditions that need to be satisfied for a simple scheme to be sufficient for a general type firm, for a capped and a linear bonus scheme, respectively.

Figure 13: The top two panels correspond to firms with $\pi_G = 0.3$ and $\hat{\pi}_G = 0.1$. The bottom two panels correspond to firms with $\pi_G = 0.9$ and $\hat{\pi}_G = 0.5$. For all firms, $\bar{U} = 5$ and $e = 0.2$. The left panels present the percentage of firms for which a simple scheme is sufficient. The right panels present the percentage of firms for which nontransparent schemes are necessary. (Axes: $\pi_B$, $\hat{\pi}_B$, and % of $q_0$.)
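The prices in Example 4 follow from Bayesian updating over the two firm qualities. The sketch below illustrates this under explicit assumptions rather than reproducing the paper's computation: the two states are labelled `g` and `b` here, the decimal points of the example are restored (high-effort probabilities 0.9 and 0.3, prior 0.8 on the first state), the price equals the probability of a high output next period under high effort, and each unit of the option $s_H$ pays $p_{HH} - p_H$. The function name `general_price` is hypothetical.

```python
def general_price(q0, pi_g, pi_b, history):
    """Price after `history`, a string of 'H'/'L' outputs (empty = no output yet).

    Assumption: the price is the posterior-weighted probability of a high
    output next period under high effort.
    """
    like_g, like_b = q0, 1.0 - q0
    for y in history:
        like_g *= pi_g if y == "H" else 1.0 - pi_g
        like_b *= pi_b if y == "H" else 1.0 - pi_b
    post_g = like_g / (like_g + like_b)
    return post_g * pi_g + (1.0 - post_g) * pi_b


histories = ["", "L", "LL", "HL", "H", "HH"]
prices = {h: general_price(0.8, 0.9, 0.3, h) for h in histories}

# Option grant such that s_H * (p_HH - p_H) equals the spread c_HH - c_HL.
delta_H = 174.2 - 141.1
s_H = delta_H / (prices["HH"] - prices["H"])

for h in histories:
    print(h or "0", round(prices[h], 2))
print(round(s_H, 1))
```

With these inputs the rounded prices match the ones listed in the example, and the implied grant is close to the reported value; the difference is consistent with rounding in the reported consumption levels.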

4.3

Empirical evidence on refresher grants

With the implications for a general type firm in mind, it is worth mentioning some interesting, although scarce, evidence about one particular complex compensation practice: refresher grants.24 Hall and Knox (2004) find evidence that larger than average grants are often used by firms both following a stock price decline and following a stock price increase.25 They interpret refresher grants as a mechanism to restore incentives for the CEO whenever the sensitivity of his compensation to stock price movements decreases, and report that refresher grants following a stock price increase seem puzzling under this logic (since stock price increases tend to increase the sensitivity of pay to firm performance).26 My model provides a rationale for new grants contingent on both good and bad firm performance.

24 The related practice of option "repricing" is fairly uncommon (Brenner, Sundaram, and Yermack, 2000), and not important in adjusting incentives over time (Hall and Knox, 2004). For studies of the explanatory power of observable characteristics on the probability of repricing see also Chance, Kumar, and Todd (2000), Brenner, Sundaram, and Yermack (2000), Carter and Lynch (2001), and Chen (2004).

25 Hall (1999) uses a more inclusive definition of refresher grants (out-of-cycle or larger than average), while Hall and Knox (2004) call refresher grants only out-of-cycle grants (in particular, within the same fiscal year in which a large change in the stock price is observed). In their evidence, they find the importance of larger-than-usual grants to be much greater than that of out-of-cycle grants. For this discussion I use the more comprehensive definition.

Figure 14: For the general type firm in Example 4, a complex scheme with a capped bonus is feasible. (Axes: stock price, horizontal; consumption, vertical. Labeled relations: $W = c_{LL}$, $b = \Delta_L$, $s_H (p_{HH} - p_H) = \Delta_H$.)

26 Core and Guay (1999), without trying explicitly to establish whether a given grant is a refresher grant (out of the ordinary in some dimension), find evidence that firms increase or decrease their annual incentive grants to maintain an optimal incentive level. The results in this paper are consistent with their findings as well.

5

Conclusion

In this paper, I ask what firm characteristics may justify the use of options, refresher grants, or ad-hoc payments such as signing bonuses or discounted purchases of stock in the compensation packages for CEOs. I view compensation packages as particular implementations of the optimal contract that provides incentives to the CEO to exert high effort at a minimum cost in the presence of moral hazard. Working with models of asymmetric information and risk-averse agents is generally difficult. Here, I present a necessarily stark model of a firm. Its simplicity allows me to enrich it with learning about the effectiveness of the CEO's effort in enhancing the output of the firm. This provides me with a model that explains stock prices and compensation jointly from primitives. One lesson emerges from the analysis of the model: The level of uncertainty about (and the priors on) the effectiveness of the CEO's actions is an important factor for the type of compensation instruments that the firm uses.
The results show that a limited set of compensation instruments, such as an uncontingent wage, a bonus plan linear in output, and stock grants, may be too simple to implement the optimal contract. Still, for a nontrivial set of firms this limited set is sufficient. On the other hand, more complex schemes that also include refresher stock or option grants (contingent on new information about the performance of the firm) are generally sufficient. Such a complex scheme is not sufficient when the learning about the effectiveness of the effort of the CEO is very important, which will typically imply that the optimal compensation contract is non-monotonic in output. For this class of firms, ad-hoc payments are necessary to provide incentives in the most cost-efficient way.

6

Appendix

Proof of Lemma 2. The weak monotonicity of prices follows from
¢
¡
     ≡  [ | ] 

the fact that, for each firm, either      or      is true. Then, by Bayes’ rule, the posterior
of the state with higher probability is weakly increasing in the number of high output realizations.

Proof of Proposition 1. With $u(c) = \ln(c)$, the non-negativity constraints on consumption will not bind. With $\lambda \geq 0$ as the multiplier for the binding PC, and $\mu \geq 0$ for the binding IC, the first order conditions of the problem are
\[
\frac{1}{u'(c_{y_1 y_2})} = \lambda + \mu \left(1 - LR_{y_1 y_2}\right),
\]
which imply the ranking because $u'(\cdot)$ is monotonically decreasing in consumption. These simplify to equation 8 in the case of $u(c) = \ln(c)$.
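Proof of Proposition 2. It is easy to see this by looking at the likelihood ratios in this particular case of our framework, which simplify to:
\[
LR_{LL} = \frac{1-\hat{\pi}}{1-\pi} \cdot \frac{1-\hat{\pi}}{1-\pi}, \qquad
LR_{LH} = \frac{1-\hat{\pi}}{1-\pi} \cdot \frac{\hat{\pi}}{\pi}, \qquad
LR_{HL} = \frac{\hat{\pi}}{\pi} \cdot \frac{1-\hat{\pi}}{1-\pi}, \qquad
LR_{HH} = \frac{\hat{\pi}}{\pi} \cdot \frac{\hat{\pi}}{\pi},
\]
where $\frac{\hat{\pi}}{\pi} < 1 < \frac{1-\hat{\pi}}{1-\pi}$ and hence $LR_{HH} < LR_{HL} = LR_{LH} < LR_{LL}$, which implies $c_{HH} > c_{HL} = c_{LH} > c_{LL}$. Moreover,
\[
\Delta_H = \mu \left(LR_{HL} - LR_{HH}\right) = \mu \, \frac{\hat{\pi}}{\pi} \, \frac{\pi - \hat{\pi}}{\pi (1 - \pi)},
\]
\[
\Delta_L = \mu \left(LR_{LL} - LR_{LH}\right) = \mu \, \frac{1-\hat{\pi}}{1-\pi} \, \frac{\pi - \hat{\pi}}{\pi (1 - \pi)}.
\]
Since $\mu$ is the Lagrange multiplier of the incentive constraint in problem (P1), it will satisfy $\mu \geq 0$. Since $\pi > \hat{\pi}$ by assumption, the second result in the proposition follows.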
Proof of Proposition 3. In the first part of the proposition we want to show that, for a firm of type H, $\Delta_L > \Delta_H$, and also that $\Delta_L > 0$. We first show the second inequality holds, and then we use it to prove the first inequality. From the expressions for the likelihood ratios in 7, we have
\[
\Delta_L = \mu \left(LR_{LL} - LR_{LH}\right) = \mu \, \frac{1-\hat{\pi}}{1-\pi} \, \frac{\pi - \hat{\pi}}{\pi (1 - \pi)}.
\]
Since $\mu$ is the Lagrange multiplier of the incentive constraint in problem (P1), it will satisfy $\mu \geq 0$. Since $\pi > \hat{\pi}$ by assumption, $\Delta_L > 0$ follows. Now to establish $\Delta_L > \Delta_H$ we first note that
\[
\frac{\Delta_L - \Delta_H}{\mu} = \left(LR_{LL} - LR_{LH}\right) - \left(LR_{HL} - LR_{HH}\right),
\]
where all likelihood ratios except for $LR_{HH}$ are independent of $q_0$. As we showed in the proof of Proposition 2, we have that $LR_{HL} = LR_{LH}$. In turn, we can show that $LR_{HH}$ decreases monotonically for $q_0 \in (0, 1)$:
\[
\frac{\partial LR_{HH}}{\partial q_0} = \frac{\hat{\pi}^2 - \pi^2}{\left(q_0 \pi^2 + 1 - q_0\right)^2} < 0.
\]
Then, by taking the limit of $LR_{HH}$ with respect to $q_0$ we can bound $\Delta_H$ and compare it to $\Delta_L$ in both extreme cases. When $q_0$ approaches 0, $LR_{HH}$ goes to its maximum, 1, and we have:
\[
\lim_{q_0 \to 0} \frac{\Delta_H}{\mu} = \frac{(1-\hat{\pi})\,\hat{\pi}}{(1-\pi)\,\pi} - 1,
\]
which determines the minimum possible $\Delta_H / \mu$. When $q_0$ approaches 1, instead, $LR_{HH}$ approaches its minimum, which coincides with the expression for $LR_{HH}$ in the no learning case, $\hat{\pi}^2 / \pi^2$:
\[
\lim_{q_0 \to 1} \frac{\Delta_H}{\mu} = \frac{\hat{\pi}}{\pi} \, \frac{\pi - \hat{\pi}}{\pi (1 - \pi)}.
\]
This is the maximum possible value of $\Delta_H / \mu$. For this maximum value of $\Delta_H / \mu$ we have
\[
\frac{\Delta_L - \Delta_H}{\mu} = \frac{1-\hat{\pi}}{1-\pi} \, \frac{\pi - \hat{\pi}}{\pi (1 - \pi)} - \frac{\hat{\pi}}{\pi} \, \frac{\pi - \hat{\pi}}{\pi (1 - \pi)},
\]
which again is positive since $\pi > \hat{\pi}$ by assumption. This implies $\Delta_L > \Delta_H$ for any value of $q_0$. Note that $\mu$ depends on $q_0$, but the proof works because the comparison is for a given common $\mu$ in both $\Delta_L$ and $\Delta_H$. For the second part of the proposition, note that we have that $\Delta_H < 0$ if and only if
\[
\frac{\hat{\pi} (1 - \hat{\pi})}{\pi (1 - \pi)} < \frac{q_0 \hat{\pi}^2 + 1 - q_0}{q_0 \pi^2 + 1 - q_0}.
\]
Rearranging, this condition becomes, for $\hat{\pi} + \pi < 1$,
\[
q_0 < \frac{1 - \pi - \hat{\pi}}{1 - \pi - \hat{\pi} + \pi \hat{\pi}}.
\]
Whenever $\pi + \hat{\pi} \geq 1$ the inequality is never satisfied for a $q_0 \in (0, 1)$.

Proof of Proposition 4. (Necessity) It is useful to write the function C ( ) in 5 as a function of
the consumption diﬀerences ∆ and ∆ defined previously, as well as using the following notation
for the price diﬀerences:  =  −  and  =  −   For a capped bonus and   0 
for example, the system becomes:
 =  + 0  +  
∆

= 0  + 0 ( − 0 ) +   +  ( −  )

∆ =  + 0  +   +  ( −  )

0 =   −   −  ( −  ) 

The fourth restriction comes from the property of the optimal contract that states that  =  
It is easy to see from this system that no values for the instruments will ever be able to implement
∆  0 since all 0  are positive and    always. The argument is similar for the case
of  ≥ 0  and the corresponding two cases with a linear bonus. (Suﬃciency). The following
scheme with a capped bonus is always a solution to the system  ∗ = C ( ) :


= 

 = ∆

0

∆
 − 
= 0 =  =  =  = 0
=

Note that because the solution has 0 = 0 it implements the same consumption regardless of
whether  is greater or smaller than 0 
Proof of Proposition 5. For a type H firm, it is straightforward to see that a linear bonus
is insuﬃcient, since we have  =   or  = 0 This means that the spread ∆ must be
implemented with the bonus  However, a linear bonus implies ∆ ≥  while we saw in Proposition
3 that for this type of firm it is always the case that ∆  ∆  Given this result, the proposition is
∆
 −
≥ 
a corollary of Proposition 13. Note that the condition ∆
− is implied by the condition

∆ ≥ 0 for an H firm, since  −  = 0 for this type of firm.
Lemma 3 When $u(c) = \ln(c)$, consumptions in the optimal contract change proportionally with changes in $\bar{U}$.
Proof of Lemma 3. Denote by $c^*_{ij}$, for $i, j = L, H$, the solutions to problem PS when the outside utility of the agent is equal to $\bar{U}$. We want to show that $c_{ij} = \delta c^*_{ij}$, where $\delta = \exp(\varepsilon)$, is a solution to problem PS when all other parameters remain the same but the outside utility of the agent is equal to $\bar{U}' = \bar{U} + \varepsilon$. We will do this by showing that this proposed solution satisfies the PC, the IC and the first order conditions for the problem with parameter $\bar{U}'$. With a slight abuse of notation, let $P_{ij}$ and $\hat{P}_{ij}$ denote the probabilities of history $(y_1, y_2) = (i, j)$ on and off the equilibrium path. First, the PC is
\[
\bar{U}' = \sum_{i,j} P_{ij} \ln(c_{ij}) - e.
\]
Using the proposed solution, this becomes:
\[
\bar{U}' = \sum_{i,j} P_{ij} \ln\left(\delta c^*_{ij}\right) - e,
\]
\[
\bar{U} + \varepsilon = \ln \delta + \sum_{i,j} P_{ij} \ln\left(c^*_{ij}\right) - e,
\]
which is true since $\varepsilon = \ln \delta$ and $c^*$ satisfies the PC when the outside utility is $\bar{U}$. Second, the IC is
\[
\sum_{i,j} \left(P_{ij} - \hat{P}_{ij}\right) \ln(c_{ij}) = e.
\]
With the proposed solution this becomes
\[
\sum_{i,j} \left(P_{ij} - \hat{P}_{ij}\right) \ln \delta + \sum_{i,j} \left(P_{ij} - \hat{P}_{ij}\right) \ln\left(c^*_{ij}\right) = e,
\]
and, because both sets of probabilities sum to one, the first term is zero:
\[
\sum_{i,j} \left(P_{ij} - \hat{P}_{ij}\right) \ln\left(c^*_{ij}\right) = e,
\]
which is the IC for the problem with $\bar{U}$, and hence it is satisfied by $c^*$. Finally, the first order conditions are
\[
c_{ij} = \lambda' + \mu' \left(1 - LR_{ij}\right), \quad \text{for } i, j = L, H,
\]
where $\lambda'$ and $\mu'$ are the Lagrange multipliers of the problem with $\bar{U}'$. Using the proposed solution we can see that $\lambda' = \delta \lambda$ and $\mu' = \delta \mu$ (where $\lambda$ and $\mu$ are the Lagrange multipliers of the problem with $\bar{U}$) satisfy all the first order conditions as well when $c_{ij} = \delta c^*_{ij}$. Hence, the proposed solution is a true solution.
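The scaling argument can be illustrated numerically. In the sketch below all numbers are illustrative assumptions: the history probabilities are those of a type L firm with $\pi = 0.4$, $\hat{\pi} = 0.2$ and prior 0.8 that effort is effective, the consumption levels are hypothetical, and the function name `expected_log_utility` is not from the paper. It shows that multiplying every consumption level by $\exp(\varepsilon)$ raises expected log utility by exactly $\varepsilon$ and leaves the gap between on-path and off-path expected utility unchanged.

```python
import math


def expected_log_utility(probs, consumption):
    """Expected utility of log consumption over the four histories."""
    return sum(p * math.log(c) for p, c in zip(probs, consumption))


# Histories ordered LL, LH, HL, HH (illustrative numbers, see the lead-in).
P = [0.488, 0.192, 0.192, 0.128]        # probabilities under high effort
P_hat = [0.712, 0.128, 0.128, 0.032]    # probabilities under low effort
c_star = [103.9, 346.5, 346.5, 474.1]

eps = 0.5
scaled = [math.exp(eps) * c for c in c_star]

# Participation side: expected utility shifts by exactly eps.
pc_shift = expected_log_utility(P, scaled) - expected_log_utility(P, c_star)

# Incentive side: the utility gap between effort levels is unchanged.
ic_before = expected_log_utility(P, c_star) - expected_log_utility(P_hat, c_star)
ic_after = expected_log_utility(P, scaled) - expected_log_utility(P_hat, scaled)

print(pc_shift, ic_before, ic_after)
```

The incentive gap is unaffected because the term in $\ln \delta$ is multiplied by the difference of two probability distributions, which sums to zero.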

6.1

Generalizations

6.1.1

Type L firm prices and consumption properties

Consider instead a type L firm, described by matrix (10). The analysis of this case parallels that of case H. For any history containing at least one $H$, the updated beliefs put probability one on $\theta = G$, i.e., we have that $q^L_{y_1 y_2} = 1$ if $y_1$ or $y_2$ equals $H$. If the observed history does not contain any $H$, instead, $\theta = B$ still has positive probability. This is the case for histories $L$ and $(L, L)$. That is,
\[
q^L_0 = q_0, \qquad
q^L_L = \frac{q_0 (1 - \pi)}{q_0 (1 - \pi) + 1 - q_0}, \qquad
q^L_{LL} = \frac{q_0 (1 - \pi)^2}{q_0 (1 - \pi)^2 + 1 - q_0}, \qquad
q^L_H = q^L_{LH} = q^L_{HL} = q^L_{HH} = 1.
\]
The stock prices take the simple form:
\[
p^L_0 = q_0 \pi, \qquad
p^L_L = q^L_L \pi, \qquad
p^L_{LL} = q^L_{LL} \pi, \qquad
p^L_H = p^L_{LH} = p^L_{HL} = p^L_{HH} = \pi. \qquad (11)
\]
For this type of firm we have that $q^L_{LH} = q^L_{HL} = q^L_{HH} = 1$. Learning may in this case also give rise to non-monotonicities. In particular, when the first period output has been $L$, the agent's wage may be lower if we observe $H$ in the second period than if we observe $L$. This is because observing $H$ in the second period reveals that $\theta = G$. In state $G$ the first period observation $L$ makes the history's likelihood ratio be much higher (it is a much more likely history under low effort than it would be if $\theta = B$). Formally, the likelihood ratios are:
\[
LR^L_{LL} = \frac{q_0 (1 - \hat{\pi})^2 + 1 - q_0}{q_0 (1 - \pi)^2 + 1 - q_0}, \qquad (12)
\]
\[
LR^L_{LH} = LR^L_{HL} = \frac{(1 - \hat{\pi})\, \hat{\pi}}{(1 - \pi)\, \pi}, \qquad
LR^L_{HH} = \frac{\hat{\pi}^2}{\pi^2}. \qquad (13)
\]
The likelihood ratios $LR^L_{LH}$ and $LR^L_{HH}$ coincide in this case with the corresponding ones in the benchmark case of no learning characterized in Proposition 2, when the true state is known to be $G$. We can use these likelihood ratios to establish the following properties of consumption in the optimal contract:

Proposition 11 When the firm is of type L as described by matrix (10), consumption spread satisfies:
(i) whenever $\pi + \hat{\pi} > 1$,
for $q_0 \in (0, q_{1L}]$, $\Delta_L < 0 < \Delta_H$,
for $q_0 \in [q_{1L}, q_{2L}]$, $0 < \Delta_L < \Delta_H$,
for $q_0 \in [q_{2L}, 1)$, $0 < \Delta_H < \Delta_L$;
(ii) whenever $\pi + \hat{\pi} \leq 1$,
for $q_0 \in (0, \max\{q_{2L}, 0\}]$, $0 < \Delta_L < \Delta_H$,
for $q_0 \in [\max\{q_{2L}, 0\}, 1)$, $0 < \Delta_H < \Delta_L$,
where
\[
q_{1L} = \frac{\pi + \hat{\pi} - 1}{\pi \hat{\pi}}, \qquad
q_{2L} = \frac{2 \pi \hat{\pi} - \hat{\pi}^2 (\pi + 1) - \pi^2 (1 - \pi)}{2 \pi \hat{\pi} (\pi - \hat{\pi})}.
\]

(The proof is included after this discussion.) Again, we find a diﬀerence depending on whether
L is greater than one and
is greater or smaller than one. When  + 
b  1 we have that 
hence the contract seeks to punish the agent at  as well as at  When the firm is of type L
however, observing a high realization implies that the state is  with probability one. This makes
a low realization a very valuable (and negative) signal about performance, making  but not 
low. This implies a small ∆  For low enough prior of being in state  this can lead to ∆  0
Higher priors diminish the relative informativeness of  with respect to  and hence reestablish
the relationship 0  ∆  ∆ of the no learning benchmark.
L
is smaller than one and hence the contract seeks to
When  + 
b  1 we have that 
reward the agent at  as well as at  The posterior becomes one under either realization, so
the standard ranking of    prevails, implying ∆  0 always. However, since the only
state in which there is punishment is  and this is a very likely outcome when the true state is
 unrelated to eﬀort choice, a small 0 (a high prior that we are in ) implies a very high cost
of utility imposed on the agent with very little incentive benefit (he consumes very low very often
but his incentives are little changed since  is very unlikely in state ). This tends to keep 
not too far from   and hence it can lead to ∆  ∆  However, for  + 
b smaller than one but
with ( − 
b) small the informativeness of signals in state  decreases a lot; this makes  close
to  as well, and more so than  close to  for the same reasons as in the benchmark case
without learning. This is reflected in the case 2L  0 which is more likely when ( − 
b) is small
and implies 0  ∆  ∆ always.
L


Proof of Proposition 11. From the expressions for the likelihood ratios in equation 12, we have
¢
¡ L
L
∆ =  
− 
b

b −

= 
 (1 − )
The expression for ∆ cannot be easily simplified. However, by taking the limit of
∆


∆


with respect

to 0  we can bound it and compare it to
in both extreme cases. The only likelihood ratio in
the diﬀerence that depends on 0 is   and it does so monotonically for 0 ∈ (0 1):
L

(1 − 
b)2 − (1 − )2
=h
i2  0
0
2
0 (1 − ) + 1 − 0

38

L approaches its maximum, which coincides with the expression for
When 0 approaches 1, 
2

)
 Note that  depends on 0 but it is always positive and its
 in the no learning case, (1−
(1−)2
value aﬀects both ∆ and ∆ proportionally, so it does not aﬀect the ranking of these two spreads
of consumption. Then it is easy to see that:

∆
1−
b −
b
∆
=

 0
0 →1 
1 −  (1 − )

lim

L goes to its minimum, 1, and we have:
When 0 approaches 0 instead, 

∆
(1 − 
b) 
b
=1−

0 →0 
(1 − ) 
lim

This is not conclusive; we need to analyze two cases separately, according to whether  and 
b are
such that (i) or (ii) is satisfied. First, we have that ∆  0 if and only if

or

̂ (1 − ̂)
0 (1 − ̂)2 + (1 − 0 )


2

(1 − )
0 (1 − ) + (1 − 0 )

̂ +  − 1

̂
When  + 
b ≤ 1 there is no value of 0 for which ∆  0. Note also that 1L ≤ 1 for any values of
 and 
b since:
0  1L ≡

̂ +  − 1
≤ 1
̂
̂ +  − 1 ≤ ̂

 (1 − ̂) ≤ 1 − ̂
 ≤ 1

which is always true. For the second threshold, we have that ∆ ≤ ∆ if and only if

Simplifying, we get:

0 (1 − ̂)2 + (1 − 0 ) ̂ (1 − ̂)

b −
b
≤

−
2
 (1 − )
0 (1 − ) + (1 − 0 )  (1 − )
0 ≤ 2L ≡

2b
−
b2 ( + 1) −  2 (1 − )

2b
 ( − 
b)

Note that 2L  1 for any combination of probabilities, since the denominator is larger than the
numerator:
2b
 ( − 
b)  2b
−
b2 ( + 1) −  2 (1 − ) 
or

b − b
 2 − 2b
+
b2 +  2 −  3  0
2 2 

2b
 ( − 1) − 
b2 ( − 1) −  2 ( − 1)  0

(1 − )( 2 − 2b
+
b2 )  0

(1 − )( − 
b)2  0

However, whether 2L is strictly positive depends on the sum of  and 
b:
2L  0

2b
−
b2 ( + 1) −  2 (1 − )  0

2b
−
b2 −  2  
b2  −  3
¡ 2
¢
 −
b)2
b2   ( − 

 ( − 
b) ( + 
b)  ( − 
b)2
 ( + 
b)   − 
b

b  1
If  + 
b ≥ 1 then the last inequality is always satisfied, and hence 0  2L  1 If  + 
however, for some pairs of  and 
b there will be no 0 for which ∆ ≤ ∆ .
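The algebra behind the two thresholds can be verified numerically. The following sketch is a check under stated assumptions, not the paper's own computation: `p` and `p_hat` are hypothetical names for the high-effort and low-effort probabilities of high output, and the two cutoff formulas are transcribed from the fractions above as read here.

```python
def threshold_1(p, p_hat):
    """First cutoff for the prior, read as (p_hat + p - 1) / (p * p_hat)."""
    return (p_hat + p - 1.0) / (p * p_hat)


def threshold_2_parts(p, p_hat):
    """Numerator and denominator of the second cutoff for the prior."""
    num = 2.0 * p * p_hat - p_hat ** 2 * (p + 1.0) - p ** 2 * (1.0 - p)
    den = 2.0 * p * p_hat * (p - p_hat)
    return num, den


def threshold_2(p, p_hat):
    num, den = threshold_2_parts(p, p_hat)
    return num / den


if __name__ == "__main__":
    # One pair with p + p_hat >= 1 and one pair with a small sum.
    for p, p_hat in [(0.7, 0.4), (0.3, 0.05)]:
        num, den = threshold_2_parts(p, p_hat)
        # The proof's final line: den - num equals (1 - p)(p - p_hat)^2.
        gap_identity = (1.0 - p) * (p - p_hat) ** 2
        # Positivity condition from the last chain of inequalities.
        positive = p * (p + p_hat) > p - p_hat
        print(p, p_hat, threshold_1(p, p_hat), num / den, den - num,
              gap_identity, positive)
```

For the first pair the second cutoff lies strictly between 0 and 1; for the second pair it is negative, which corresponds to the case in which no admissible prior satisfies the condition.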

Proof of Proposition 7. For a type L firm, it is straightforward to see that a capped bonus
is insufficient, since we have  =   or  = 0. This means that the spread ∆ (which is
always positive, by Proposition 11) must be implemented with the bonus  and hence we cannot
have it capped. Given this result, the proposition is a corollary of Proposition 12 below, where


I consider a linear bonus program:   (1  2 ) =  (1 + 2 ). Note that the conditions ∆∆−∆

 −
∆ −∆
( − )−( − ) and ( − )−( − )  0 for a general firm both simplify to the condition
∆ − ∆  0 for an L firm, since  −  = 0.
Proof of Proposition 8. (Necessity) The argument parallels that of the proof of Proposition 4.
For a firm of type L we have
 =  + 0  +  
∆

= 

∆ =  + 0  + 0 ( − 0 ) +   +  ( −  )
0 =   −   −  ( −  ) 

and it is clear that ∆ ≤ ∆ for any combination of instruments with   0 which is needed to
implement ∆  0. (Sufficiency) The following solution implements any optimal consumption for
a type L firm:







0

= 
= ∆
∆ − ∆
=

∆ − ∆
=
 − 
= 0 =  = 0 = 0


6.1.2 Generalization of results for sufficiency of simple and complex schemes

In this appendix I present the sufficiency results for a general firm as defined, at all  by matrix
1, where I assume that   ≠  b for at least one  and the prior over  =  satisfies 0  0  1.
Also, higher effort ( ) implies higher probability of observing  (for any quality of the firm):
 ≥  b and   ≥  b  with at least one being a strict inequality.
Lemma 4 Optimal consumption is not necessarily monotonic in output, i.e. we may have ∆  0
or ∆  0 Also, both ∆  ∆ and ∆  ∆ may occur.
The proof for this result, included in Miller (1999), simply analyzes the possible ranking of
likelihood ratios for different combinations of probabilities in matrix 1 (see footnote 27).
First, I study the case of a simple scheme with a linear bonus.
Proposition 12 A simple scheme with a linear bonus is sufficient if and only if:
a)

∗
∆ −∆

b)

∆ −∆
( − )−( − )

c)

∆
∆ −∆

≥

≥


( − )−( − ) ,

≥ 0

 −
( − )−( − ) 

Proof of Proposition 12. With a linear bonus, the function C ( ) implies:
∗

=  + 2 + 0 

∗ =  +  + 0 
∗

=  +  + 0 

∗ =  + 0  
It is useful to write these equations as a function of the consumption differences ∆ and ∆ defined
in the previous section. To simplify, I also introduce the following notation for the differences in
prices:  ≡  −  and  ≡  −  .
⎡
⎤ ⎡
⎤⎡
⎤
∆
0 1 

⎢
⎥ ⎢
⎥⎢
⎥
⎣ ∆ ⎦ = ⎣ 0 1  ⎦ ⎣  ⎦ 
0
∗
1 0 

The solution to this system is:

∆ − ∆

 − 
∆ − ∆
 = ∆ − 

 − 
∆ − ∆
0 =

 − 



= ∗ − 

27 See also Celentani and Loveira (2006) for a parallel result in a model of simultaneous (rather than sequential)
signals that can explain the apparent lack of relative performance in executive compensation.


The conditions a)-c) are then derived from the non-negativity constraints imposed on   and 0 
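The structure of the linear-bonus system can be illustrated with a small solver. This is a sketch under assumptions: the names below (`d1`, `d2` for the two consumption spreads, `x1`, `x2` for the two price differences, `c_base` and `p_base` for the consumption level and stock price in the third equation, and `w`, `b`, `s0` for salary, bonus and shares) are hypothetical labels, and the coefficient rows (0, 1, x1), (0, 1, x2), (1, 0, p_base) are taken from the matrix as read here.

```python
def solve_linear_bonus(d1, d2, c_base, x1, x2, p_base):
    """Solve the assumed 3x3 system for (w, b, s0).

    d1     = b + x1 * s0
    d2     = b + x2 * s0
    c_base = w + p_base * s0
    """
    if x1 == x2:
        raise ValueError("price differences must differ to pin down s0")
    s0 = (d1 - d2) / (x1 - x2)   # shares absorb the gap between spreads
    b = d2 - x2 * s0             # bonus covers what shares leave over
    w = c_base - p_base * s0     # salary fixes the consumption level
    return w, b, s0


if __name__ == "__main__":
    w, b, s0 = solve_linear_bonus(d1=3.0, d2=2.0, c_base=10.0,
                                  x1=4.0, x2=2.0, p_base=5.0)
    print(w, b, s0)
    # Substituting back reproduces the targets.
    print(b + 4.0 * s0, b + 2.0 * s0, w + 5.0 * s0)
```

Requiring the three returned values to be non-negative is what generates conditions of the kind listed in the proposition.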

A similar analysis can be pursued for the case of a capped bonus, for a general firm as described
in matrix 1.
Proposition 13 A simple scheme with a capped bonus is sufficient if and only if:
a)


∆

≥


 − ,

b)

∆
∆

≥

 −
 − 

c) ∆ ≥ 0
Proof of Proposition 13. The function C ( ) is, in the case of a capped bonus:

and hence,

∗

=  +  + 0 

∗
∗
∗

=  +  + 0 
=  +  + 0 
=  + 0  

⎡

The solution to this system is

⎤ ⎡
⎤⎡
⎤
∆
0 0 

⎢
⎥ ⎢
⎥⎢
⎥
⎣ ∆ ⎦ = ⎣ 0 1  ⎦ ⎣  ⎦ 
∗
1 0 
0
∆


∆
 = ∆ − 


∆
0 =





= ∗ − 

Proof of Proposition 10 (in text). (Necessity) The case   =   implies no variation in
stock prices, so ad hoc payments are needed to implement any variation in consumption. Provided
  ≠   , for a firm of general type with a capped bonus and   0 , the system
becomes:
 =  + 0  +  
∆

= 0  + 0  +   +  ( −  )

∆ =  + 0  + 0 ( − 0 ) +   +  ( −  )
0 =   −   −  ( −  ) 

and it is clear that monotonicity follows from any combination of instruments. This is trivially also
true for the case  ≤ 0  and for a linear bonus. (Sufficiency) The following solution implements
any optimal consumption for a general-type firm (the same we used in the proof of Proposition 4
for a type H firm):


= 

 = ∆

0

∆
 − 
= 0 =  =  =  = 0
=

References
[1] Acharya, Viral V., Kose John, and Rangarajan K. Sundaram. 2000. “Contract Renegotiation
and the Optimality of Resetting Executive Stock Options.” Journal of Financial Economics
57: 65-101.
[2] Jorge G. Aseff and Manuel S. Santos, “Stock options and managerial optimal contracts,”
Economic Theory 26, 813-837 (2005)
[3] Bebchuk, Lucian A., Alma Cohen, and Holger Spamann, 2010, “The wages of failure: Executive
compensation at Bear Stearns and Lehman 2000-2008,” Yale Journal on Regulation 27, 257-282.
[4] Bolton, P., Hamid Mehran, and Joel Shapiro. “Executive Compensation and Risk Taking”,
Mimeo (2010).
[5] Brenner, M., R. Sundaram and D. Yermack, “Altering the terms of executive stock options,”
Journal of Financial Economics 57, (2000) 103-128
[6] Carter, M. E., L. J. Lynch and I. Tuna (2007), ‘The Role of Accounting in the Design of CEO
Equity Compensation’, The Accounting Review, Vol. 82, No. 2, pp. 327-57.
[7] Carter, Mary Ellen, and Luann J. Lynch. 2001. “An Examination of Executive Stock Option
Repricing.” Journal of Financial Economics 61 (August): 207-25.
[8] Celentani, M. and R. Loveira, “A Simple Explanation of the Relative Performance Evaluation
Puzzle,” Review of Economic Dynamics, vol. 9(3), pages 525-540, July (2006).
[9] Chance, D., R. Kumar, R. Todd, “The ‘repricing’ of executive stock options”. Journal of
Financial Economics 57 129-154 (2000)
[10] Chen, M.A. “Executive Option Repricing, Incentives, and Retention”. The Journal of Finance,
Vol. 59, No. 3, pp. 1167-1199 (2004)

[11] Clementi, Gian Luca, and Thomas F. Cooley, 2009. “Executive Compensation: Facts,” NBER
Working Papers 15426, National Bureau of Economic Research, Inc.
[12] Clementi, Gian Luca, Cooley, Thomas F. and Wang, Cheng, 2006. “Stock grants as a commitment device,” Journal of Economic Dynamics and Control, Elsevier, vol. 30(11), pages
2191-2216, November.
[13] Chance, D. M.; R. Kumar; and R. B. Todd. “The ‘Repricing’ of Executive Stock Options.”
Journal of Financial Economics 57 (2000): 129-54.
[14] Core, J. E., and W. R. Guay. “The Use of Equity Grants to Manage Optimal Equity Incentive
Levels.” Journal of Accounting and Economics 28 (1999): 151-84.
[15] Edmans, Alex, Xavier Gabaix, Tomasz Sadzik, Yuliy Sannikov, “Dynamic CEO Compensation,” The Journal of Finance, 2012, vol. 67(5), p. 1603-1647
[16] Edmans, Alex, and Q. Liu, “Inside Debt”, Review of Finance (2010), 1-28
[17] Fahlenbrach, Rudiger, and Rene M. Stulz, 2009, “Managerial ownership dynamics and firm
value,” Journal of Financial Economics 92, 342-361.
[18] Gabaix, X. and Augustin Landier, “Why Has CEO Pay Increased So Much?”, Quarterly
Journal of Economics, vol. 123(1), 2008, p. 49-100.
[19] Gillan, Stuart L., Jay C. Hartzell and Robert Parrino. “Explicit versus Implicit Contracts:
Evidence from CEO Employment Agreements.” The Journal of Finance, Volume 64, Issue 4,
pages 1629-1655, August 2009
[20] Grossman, Sanford and Oliver D. Hart. “An Analysis of the Principal-Agent Problem.” Econometrica 51, Issue 1 (Jan., 1983), 7-46.
[21] Hall, Brian. “The Design of Multi-Year Option Plans.” Journal of Applied Corporate Finance
12, no. 2 (summer 1999): 97-106.
[22] Hall, Brian J., and Jeffrey B. Liebman. “Are CEOs Really Paid Like Bureaucrats?” Quarterly
Journal of Economics (August 1998): 653-691.
[23] Hall, Brian J. and Thomas A. Knox. “Underwater Options and the Dynamics of Executive
Pay-to-Performance Sensitivities,” Journal of Accounting Research, Blackwell Publishing, vol.
42(2), pages 365-412, 05. (2004)
[24] Holmström, B. “Moral Hazard and Observability,” Bell Journal of Economics, Vol. 10 (1) pp.
74-91. (1979)
[25] Jarque, Arantxa, and Brian Gaines. “Regulation and the Composition of CEO Pay,” FRB
Richmond Economic Quarterly, Vol. 98(4), Fourth Quarter 2012, p. 309-348.


[26] Kadan, O. and J. Swinkels. “Stocks or Options? Moral Hazard, Firm Viability, and the Design
of Compensation.” Review of Financial Studies (2007)
[27] Kole, Stacey R., 1997, “The complexity of compensation contracts,” Journal of Financial
Economics, 43, 79-104.
[28] Miller, Nolan. “Moral Hazard with Persistence and Learning”, Manuscript. (1999)
[29] Murphy, Kevin J. 1999. “Executive Compensation.” In Handbook of Labor Economics, edited
by Orley Ashenfelter and David Card. New York: Elsevier Science North Holland, 2485-563.
[30] Wang, Cheng (1997), “Incentives, CEO Compensation and Shareholder Wealth in a Dynamic
Agency Model,” Journal of Economic Theory 76: 72-105.
