Full text of Working Papers (Federal Reserve Bank of Chicago) : Identification of Search Models with Initial Condition Problems, Working Paper 2006-03

Federal Reserve Bank of Chicago

Identification of Search Models with
Initial Condition Problems
Gadi Barlevy and H. N. Nagaraja

WP 2006-03

Identification of Search Models with Initial Condition Problems∗
Gadi Barlevy
Economic Research Department
Federal Reserve Bank of Chicago
230 South LaSalle
Chicago, IL 60604
e-mail: gbarlevy@frbchi.org

H. N. Nagaraja
Department of Statistics
Ohio State University
1958 Neil Avenue
Columbus, OH 43210
e-mail: hnn@stat.ohio-state.edu

March 30, 2006

Abstract
This paper extends previous work on the identification of search models in which worker
productivity is imperfectly observed. In particular, it establishes that these models
remain identified even when employment histories are left-censored (i.e. we do not get to follow
workers from their initial job out of unemployment), as well as when workers set different
reservation wages from one another. We further show that allowing for heterogeneity in reservation
wages can affect the empirical estimates we obtain, specifically estimates of the rate at which
workers receive job offers.

Key Words: Record Statistics, On-the-Job Search, Job Mobility, Reservation Wages

∗ We would like to thank seminar participants at the Chicago Fed and the University of Pennsylvania for comments.
We also benefited from discussions with Dan Aaronson, Jeff Campbell, Merritt Lyon, and Rob Shimer.

Introduction
This paper derives new results on the identification of job search models with imperfectly measured
worker productivity. Various authors have argued that search models are helpful for understanding
labor markets, especially for young workers who are still in the beginning stages of their careers.
According to these models, a worker’s wage depends on which random employers he happened to
meet, along with his own productivity. We are interested in whether the key elements of these
models can be identified when worker productivity is not perfectly observable. That is, can we
infer the distribution of wages a worker of given productivity could earn on various jobs? Can we
infer the rate at which workers receive oﬀers from how often they choose to change jobs?
There are several applications that require recovering these objects. For example, various search
models imply that the distribution of wages available to workers reflects underlying heterogeneity
in productivity across producers (where less productive employers can still compete given search
frictions). In this case, the distribution of wages reveals the degree of productive ineﬃciency
across employers, which we can then use to evaluate the eﬀects of macro shocks or policy changes
on aggregate productivity. As another example, these estimates make it possible to quantify the
role of mobility in wage growth and distinguish it from the role of experience which makes workers
more productive. We can also examine whether workers in diﬀerent job markets face diﬀerent
wage distributions and opportunities for job mobility.
As previous authors pointed out, identifying search models is straightforward when worker productivity is relatively homogeneous or else perfectly observable. In these cases, we can recover the
distribution of wages for workers of given productivity from the empirical distribution of wages
among such workers on their first job out of unemployment. To infer the rate at which workers
receive oﬀers, we can look at how the duration of a job varies with its wage. Intuitively, since
lower oﬀers are easier to beat, workers who draw low wages will leave more quickly, at a rate that
depends in a precise way on the distribution of wages oﬀered and how frequently oﬀers arrive.
Once we acknowledge that worker productivity is hard to measure, though, this approach no
longer applies. For one thing, we can no longer determine whether dispersion in wages in the first
job out of unemployment reflects dispersion in worker ability or in the prices employers pay. Yet
it is important that we be able to distinguish the two. For example, as noted by Flinn (2002) and
Bowlus and Robin (2004), whether mobility can alleviate long-run earnings inequality ultimately
depends on the source of dispersion: if wage diﬀerentials were due to price dispersion, increasing
mobility would allow workers who drew low initial oﬀers to catch up to those who drew high oﬀers;
if wage diﬀerentials instead reflected diﬀerences in ability, increasing mobility would have no eﬀect
on inequality. Moreover, if we cannot tell whether a worker earning a low wage is less productive
or just happened to meet a low paying employer, we will not be able to use the fact that workers
tend to leave lower paying jobs more quickly to recover the rate at which oﬀers arrive.
One way to cope with unobserved worker productivity is to impose parametric restrictions on
the distribution of prices and unobserved abilities. However, since the conclusions we draw will
be sensitive to these restrictions, we would like to know whether we can estimate these models
without imposing them. At the very least, establishing that the model is identified can suggest
ways of testing particular functional forms before incorporating them in parametric estimation.
Work by Nagaraja and Barlevy (2003) and Barlevy (2005) demonstrated that these models can
be identified without requiring parametric assumptions. The insight in these papers is to use an
approach similar to the one used for nonparametric identification of auction models, e.g. Athey
and Haile (2002). Intuitively, the wages on diﬀerent jobs a worker accepts represent extreme values
among the oﬀers he received, much like winning bids in auctions represent extreme values among
the bids tendered. Fortunately, extreme values often characterize the distributions of variables
from which these extremes are taken. Yet while the approach in the auction literature is static
and concerns extremes among bids already submitted at some point in time, search models require
a dynamic approach that involves following a worker over time and keeping track of how often
he changes jobs and how his wages rise with mobility. Formally, these papers rely on results for
record statistics rather than order statistics. One can show that the distribution of wages available
to a worker of given ability can be identified from the way average wage gains of job changers vary
with past mobility (i.e. the average gaps between diﬀerent consecutive record values) rather than
from the distribution of wages on the first job out of unemployment. Similarly, the rate at which
workers receive oﬀers can be identified from the number of jobs between unemployment spells (i.e.
the number of record values) rather than by relating the duration of a job to its wage.
Unfortunately, the identification results cited above rely on various assumptions on both the
model and the data used for identification. One such assumption is that when we first observe a
worker, his wage represents a random draw from the distribution of wages a worker of his ability
could earn. However, there are various relevant scenarios in which this assumption will be violated.
For example, some datasets interview workers who are already employed at the time the survey
began. In this case, the wage on the first job we observe for a worker is not a random draw from
the distribution of wages a worker of his ability could earn, since some workers who drew low wages
for their first job out of unemployment are likely to have moved on to a higher wage job before
we first observe them. Restricting attention only to workers who leave unemployment within
the survey window dramatically reduces sample sizes and makes identification impractical. As
another example, suppose we do get to observe workers as of their first job out of unemployment,
but workers value leisure diﬀerently and therefore set diﬀerent reservation wages. In that case, a
worker’s wage on his first job will typically be drawn not from the distribution of wages a worker
of his ability could earn, but from a truncated version of this distribution.
The purpose of this paper is twofold. First, we establish that these models remain identified
even in the face of such initial condition problems, i.e. situations in which the wage on the initial
job we observe for a worker is not a random draw from the distribution of wages a worker of
equal ability could earn. Second, using data from the National Longitudinal Survey of Youth
(NLSY), we show that allowing for heterogeneity in reservation wages can aﬀect inference. In
particular, allowing for heterogeneity in reservation prices leads us to estimate a much higher
rate at which employed workers receive oﬀers in the NLSY than if we assume all workers share
the same reservation price. This result is due to certain distinguishing features of the empirical
distribution of the number of consecutive job-to-job transitions, patterns which previous research
has overlooked. Interestingly, our estimate for the rate at which employed workers receive oﬀers
when we accommodate heterogeneity in reservation wages is on par with what previous authors
have found using job duration data, although our estimate for the rate at which unemployed
workers receive oﬀers is much larger than previous authors have estimated.
The paper is organized as follows. Section 1 describes the search model we use. Section 2
derives the analytical results we require for identification. Section 3 applies our results to the case
of left-censored work histories. Section 4 considers the case of unobserved diﬀerences in reservation
prices across workers. In Section 5, we discuss the eﬀects of heterogeneity in reservation prices
on empirical estimates of mobility. Section 6 comments on the implications of heterogeneity in
reservation prices for empirical estimates of the price oﬀer distribution. We conclude in Section 7.

1. A Model of On-the-Job Search
There is already a vast literature on estimating search models using data from worker histories;
for a recent survey of this literature, see Eckstein and van den Berg (2005). We follow previous
authors in modeling search as a process in which workers periodically draw a fixed price from some
distribution and choose optimally among the oﬀers available at each point in time. However, we
also allow a worker’s productivity to vary over time, in a way that the econometrician may not be
able to measure. This feature allows the model to be consistent with two facts that a model with
fixed productivity would be unable to match. First, real wages tend to vary over the course of a
job, which cannot occur in our model if both the price on the job and the worker’s productivity
are constant. Second, workers occasionally earn less on a job they move to voluntarily than on the
job they left. This could occur if the worker’s productivity fell just when he happened to change
jobs, but not if his productivity were constant over time. Our assumption thus plays the same role
as measurement error in previous work.1 But unlike previous work which modeled measurement
error as a sequence of i.i.d. normal random variables, under our interpretation it is less obvious
how to model unmeasured variation in wages, which motivates our nonparametric approach.
We begin with a description of workers. All workers supply a homogeneous labor input, but
the amount they supply varies across workers and over time. Let $\ell_{it}$ denote the amount of labor
worker i can supply per hour at date t. This amount is observable to both the employer and the
worker, but is only imperfectly observable to the econometrician who collects data on this market.
We follow Flinn (1986) by modelling this productivity as a log-linear function:

$$\ell_{it} = \exp(\beta X_{it} + \phi_i + \varepsilon_{it}) \tag{1.1}$$

The first term, Xit , represents characteristics of individual i that are observable to the econometrician, and β represents the returns to these characteristics. The next term, φi , is fixed over time,
reflecting variations in innate ability that make some workers consistently more productive than
others. We do not require this term to be observable to the econometrician. The last term, εit ,
denotes variation in the worker’s productivity over time which cannot be directly measured, but
can also include measurement error due to misreporting.
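We impose the following assumptions on $\varepsilon_{it}$:

Assumption 1.1: $\{\varepsilon_{it}\}_{t=0}^{\infty}$ is independent of $\{X_{it}\}_{t=0}^{\infty}$ for any fixed i

Assumption 1.2: $E\left[E_i[\Delta\varepsilon_{it}]\right] \equiv E\left[\int_i (\varepsilon_{it} - \varepsilon_{i,t-1})\, di\right] = 0$ for all dates t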

Assumption 1.1 states that the observable and unobservable components of an individual's
productivity are independent. The role of this assumption will become clear below. Assumption 1.2
states that if we were to average the change in $\varepsilon_{it}$ across all workers, the unconditional
expectation of this average would equal zero. This assumption is essentially a normalization; in general,
$E\left[\int_i \Delta\varepsilon_{it}\, di\right]$ is a function of t, and we can always include a semiparametric function of t in $X_{it}$
to yield a residual that satisfies Assumption 1.2. Note that we impose no restrictions on the
distribution of $\varepsilon_{it}$, how its variance varies over time or across individuals, or its autocorrelation.

1 That is, in our model a voluntary job changer will report a wage cut solely because of how wages are measured,
not because he moved to a lower paying job. There are search models in which workers choose to accept lower paying
jobs; one example is Postel-Vinay and Robin (2002), in which workers accept a lower wage from a more productive
employer, correctly anticipating higher wage growth with such an employer. But their model also requires wages be
imperfectly measured to accord with the fact that wages both rise and fall over the duration of a job. Of course,
once we allow for mismeasurement, we reintroduce the possibility that most wage declines we observe are spurious.

At each point in time, a worker can be classified as either employed or unemployed. He can
produce $b_i \ell_{it}$ units of output per hour while unemployed, where $b_i$ is fixed over time and represents
either productivity in the home sector or the marginal value of leisure. Assuming this value
is proportional to $\ell_{it}$ allows us to abstract from selection among employed workers. Without
this assumption, workers who became less productive on the job but not at home would opt
out of the labor market to enjoy leisure, and the distribution of changes in productivity $\Delta\ell_{it}$ of
employed workers would not be representative of the changes among all workers. In support of this
assumption, we note that recent work by Low, Meghir, and Pistaferri (2004) allowed for selection
in a similar model but found it to have a negligible impact on overall fit. Nagaraja and Barlevy
(2003) and Barlevy (2005) assume $b_i = b$ for all i. One of the goals of this paper is to relax this
assumption, although for purposes of exposition we occasionally invoke it as well.
While unemployed, a worker receives job oﬀers at rate λ0 per unit time. If he accepts an oﬀer
and becomes employed, he continues to receive oﬀers at a (possibly diﬀerent) rate λ1 . In addition,
he may be forced to leave his job, an event that occurs at rate δ. We assume a worker cannot
recall oﬀers he turned down, so a worker who loses his job becomes unemployed.
While working on a job, a worker supplies all of his labor to produce final goods using a linear
technology specific to that job. Let $z_{ij}$ denote the productivity of worker i on job j. Thus, he
will produce $z_{ij} \ell_{it}$ units of output per hour. We assume $z_{ij}$ on any match is fixed over time. This
assumption effectively rules out match-specific human capital; if a worker becomes more productive
on one job, it must reflect changes in $\ell_{it}$ that by construction carry over to all jobs. Barlevy (2005)
argues this specification is empirically plausible for young workers, since most of the wage growth
of young workers on a job does appear to carry over to other jobs as well.
Let Γi (·) denote the cdf of productivity zij for worker i across all jobs. We assume Γi (·) = Γ (·)
for all i, i.e. all workers face the same production possibilities. This assumption naturally follows
if we assume zij = zj for all i, i.e. productivity is job-specific rather than match-specific. We
appeal to this assumption for simplicity. However, it is not essential to interpret the model this
way. For example, Marimon and Zilibotti (1999), Barlevy (2002), and Gautier, Teulings, and van
Vuuren (2005) develop models in which workers enjoy a comparative advantage on diﬀerent jobs
rather than being all best suited to the same job, but the symmetry of those models ensures the
distribution of productivity across jobs is the same for all workers.
Upon meeting, a worker and employer observe match productivity z, and the latter makes an offer
to the former. The exact nature of the offer will depend on how workers and employers interact.
For the most part, we focus on models in which this interaction implies that all employers with
the same $z_j$ will offer in equilibrium the same constant price $w_j$ per unit labor, independently of
that worker's productivity $\ell_{it}$. The hourly wage for the worker is hence

$$W_{ijt} = w_j \ell_{it} \tag{1.2}$$

Under this assumption, jobs can be ranked according to the price they pay, and workers will seek
out higher paying jobs, just as in the traditional search model with fixed worker productivity. One
model that is consistent with (1.2) is the Lucas and Prescott (1974) model, modified to allow for
on-the-job search. In that model, employers are located on islands. Workers can instantly move
between employers on an island, and all employers on the same island are equally productive.
Competition on an island will drive employers to pay workers their marginal product, i.e. wj = zj .
Productivity is not the same across islands, but workers only get to visit other islands at random
times. In an appendix, we show that our framework also corresponds to a model in which workers
cannot instantly move to other equally productive employers, and must instead bargain with their
employer over the wage they earn according to a particular bargaining protocol.
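
To make the wage structure in (1.1) and (1.2) concrete, the following minimal sketch computes log wages for one hypothetical worker and checks that, for a worker who stays on the same job, the price w_j and the fixed effect φ_i drop out of log wage growth. The function name `log_wage` and every numerical value are illustrative assumptions, not taken from the paper.

```python
import math

def log_wage(w_j, beta, x_it, phi_i, eps_it):
    """Log hourly wage implied by (1.1)-(1.2): ln W = ln w_j + beta*X + phi + eps."""
    labor = math.exp(beta * x_it + phi_i + eps_it)  # labor supplied per hour, eq. (1.1)
    return math.log(w_j * labor)                    # W_ijt = w_j * labor, eq. (1.2)

# Hypothetical numbers chosen only for illustration.
beta, phi_i, w_j = 0.05, 0.3, 12.0
x_prev, x_curr = 4.0, 5.0          # e.g. years of experience at dates t-1 and t
eps_prev, eps_curr = 0.02, -0.01   # unmeasured productivity shocks

growth = (log_wage(w_j, beta, x_curr, phi_i, eps_curr)
          - log_wage(w_j, beta, x_prev, phi_i, eps_prev))

# For a job stayer only beta * dX + d_eps remains: w_j and phi_i cancel.
expected = beta * (x_curr - x_prev) + (eps_curr - eps_prev)
```

Because the price and the fixed effect cancel in differences, log wage growth of stayers carries no information about F (·); that information has to come from job changers.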
Although the distinguishing feature of a job is its productivity, in the models we consider a job
can also be characterized by the wage it pays. Let F (·) denote the distribution of wages across all
jobs. From the worker's perspective, each encounter with an employer represents an independent
draw from F (·) that does not depend on his current productivity $\ell_{it}$. Given our assumptions, all
workers face the same F (·). In practice, workers with different skills may direct their search to
diﬀerent labor markets. However, as long as these skills are observable, we can always carry out
our analysis for each skill group separately. Although we shall focus on identifying F (·), in certain
cases we will be able to map F (·) into a distribution of productivity Γ (·).2

2 Recovering Γ (·) from F (·) is analogous to recovering the distribution of valuations from the distribution of bids
in an auction. The Lucas and Prescott model resembles a second-price auction, since in equilibrium employers bid
$z_j$, the amount at which they value a worker (but unlike the second-price auction, they also pay what they bid).
In other models, a firm's offer will be some function of $z_j$. In that case, recovering Γ (·) from F (·) can be more
involved, just as mapping bids into valuations is more involved for first-price auctions than second-price auctions.

What is the optimal strategy for a risk-neutral worker who discounts the future at rate ρ and
whose objective is to maximize the present discounted value of his earnings? Once he is employed,
it is clearly optimal for him to only accept a new offer if it pays a higher price than his current
job. While unemployed, he must decide at what price it is better to work and continue searching
at rate λ1 than to remain unemployed and search at rate λ0. This problem has a well known
solution. As demonstrated, among others, by Mortensen and Neumann (1988), the optimal policy
for the worker is to set a reservation price $w_i^*$ that solves

$$w_i^* = b_i + (\lambda_0 - \lambda_1) \int_{w_i^*}^{\infty} \frac{1 - F(x)}{\rho + \delta + \lambda_1 (1 - F(x))}\, dx \tag{1.3}$$

If λ0 = λ1, the worker should work at any price that exceeds $b_i$. If instead λ0 > λ1, the worker
should reject offers slightly higher than $b_i$ and retain the option to search at a higher rate λ0.
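
As an illustration of how (1.3) pins down the reservation price, the sketch below solves the equation numerically by bisection. It is only a sketch under stated assumptions: the offer distribution is taken to be a unit-rate exponential (our choice, made because the integral then has a closed form to check against), the integral is approximated by a trapezoid rule on a truncated range, and the code handles only the case λ0 ≥ λ1. All function names and parameter values are hypothetical.

```python
import math

def survival(x):
    """1 - F(x) for an assumed unit-rate exponential offer distribution on [0, inf)."""
    return math.exp(-x) if x > 0 else 1.0

def option_integral(w, rho, delta, lam1, upper=60.0, steps=20000):
    """Trapezoid approximation of the integral in (1.3), truncated at w + upper."""
    h = upper / steps

    def integrand(x):
        s = survival(x)
        return s / (rho + delta + lam1 * s)

    total = 0.5 * (integrand(w) + integrand(w + upper))
    for k in range(1, steps):
        total += integrand(w + k * h)
    return total * h

def reservation_price(b, lam0, lam1, rho, delta, tol=1e-10):
    """Solve (1.3) for w* by bisection; this sketch assumes lam0 >= lam1."""
    if lam0 < lam1:
        raise ValueError("this sketch only brackets the root when lam0 >= lam1")

    def gap(w):
        return w - b - (lam0 - lam1) * option_integral(w, rho, delta, lam1)

    # gap(b) <= 0 and gap(hi) >= 0 because the integral is decreasing in w.
    lo, hi = b, b + (lam0 - lam1) * option_integral(b, rho, delta, lam1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

w_equal = reservation_price(b=0.5, lam0=0.4, lam1=0.4, rho=0.05, delta=0.1)
w_higher = reservation_price(b=0.5, lam0=0.8, lam1=0.4, rho=0.05, delta=0.1)
```

With equal arrival rates the solution collapses to b, and with λ0 > λ1 the reservation price rises above b, matching the two cases described in the text.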
Let us define $w^* = \inf_i w_i^*$ as the lowest reservation price across workers. In what follows, we
will find it convenient to assume that the lowest price on any job, $\underline{w} = \inf_j w_j = F^{-1}(0)$, is higher
than $w^*$. In the wage setting models we described, this condition will automatically hold if $\lambda_0 \ge \lambda_1$
and if the lowest productivity level $\underline{z} = \inf_j z_j$ exceeds the value of leisure for the worker who least
enjoys it, namely $\underline{b} = \inf_i b_i$. This assumption allows us to avoid the issue of recoverability in Flinn
and Heckman (1982), i.e. that it is only possible to recover the distribution of prices below $w^*$ for
particular functional forms. However, this assumption is not essential, and we can talk about a
somewhat modified notion of identification without it.3
This completes the description of the model. We wish to recover its key parameters, namely λ0, λ1, δ,
and F (·), using employment history data, i.e. the hourly wage $\{W_{ijt}\}_{t=0}^{\infty} = \{w_{ij} \ell_{it}\}_{t=0}^{\infty}$ for each
worker, the duration of each job, and any occurrences of unemployment.4 As noted earlier, we
may be able to map F (·) to the distribution Γ (·), but our discussion will focus on recovering F (·).

3 In particular, we can still identify the truncated distribution $F(w \mid w \ge w^*)$ and the arrival rates $\lambda_0 (1 - F(w^*))$
and $\lambda_1 (1 - F(w^*))$ of "viable" offers (i.e. those that some worker might accept) rather than F (w), λ0, and λ1.

4 More recently, economists have assembled matched employer-employee datasets that allow us to track the wages
of all employees at the same workplace. If employers paid the same price per unit labor on all jobs, we could recover
the distribution of prices directly using employer fixed effects as in Abowd, Kramarz, and Margolis (1999). We
assume the econometrician has no access to such data. Even if such data were available, employer fixed effects would
not yield identification if the price varied by job (or by match for the same job) rather than by employer.

Our approach to identification builds on the fact that the jobs the worker accepts correspond to
"records" among the offers he receives. More precisely, following Wolpin (1992), let us partition
the data for each worker into distinct employment cycles, where a cycle is defined as the time
between unemployment spells. Let M denote the number of offers on an employment cycle, and
let $\{y_m\}_{m=1}^{M}$ denote the prices per unit labor on the respective offers he receives. As demonstrated
in Barlevy (2005), M has a geometric distribution, i.e.

$$\Pr(M = m) = \left(\frac{\lambda_1}{\lambda_1 + \delta}\right)^{m-1} \left(\frac{\delta}{\lambda_1 + \delta}\right) \tag{1.4}$$

Define L (1) = 1, and for any integer n > 1 define L (n) recursively as

$$L(n) = \min\{m : y_m > y_{L(n-1)}\} \tag{1.5}$$

To simplify matters, suppose all workers share the same value b for $b_i$ and thus the same reservation
price $w^*$. Since we assumed $\underline{w} \ge w^*$, workers accept their first offer $y_1$, and then any new offer that
exceeds what they currently earn. Define N as the number of actual jobs the worker is employed
on in a given cycle, so that N ≤ M, and let n ∈ {1, ..., N} index these jobs. Since a worker always
chooses the job offering the highest price, the price on the n-th job in an employment cycle, $w_n$,
must be given by

$$w_n = y_{L(n)}$$

In extreme-value theory, L (n) is called the n-th record time and $w_n$ is called the n-th record value.
That is, $w_n$ corresponds to the n-th time in the sequence $\{y_m\}_{m=1}^{M}$ in which a value exceeds all
elements that preceded it. Hence, the prices on accepted jobs correspond to a sequence of record
values. Likewise, the number of jobs in an employment cycle N corresponds to the number of
record values in the sequence $\{y_m\}_{m=1}^{M}$. For comprehensive surveys on record-value theory, see
Arnold, Balakrishnan, and Nagaraja (1998) and Nevzorov and Balakrishnan (1998). An important
distinction between our application and much of the work in these surveys is that our application
involves records from a random number of observations rather than an infinite sequence $\{y_m\}_{m=1}^{\infty}$.
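
The mapping from offers to accepted jobs described above can be sketched in a few lines. The helper below returns the record times L (n) and record values w_n for one employment cycle; the function name and the offer values are hypothetical and serve only to illustrate the definitions.

```python
def record_times_and_values(offers):
    """Return (times, values) of the records in `offers`.

    The first offer is always a record, and times are 1-indexed so that
    L(1) = 1 as in the text.
    """
    times, values = [], []
    for m, y in enumerate(offers, start=1):
        if not values or y > values[-1]:
            times.append(m)
            values.append(y)
    return times, values

# Hypothetical price offers y_1, ..., y_M received during one employment cycle.
offers = [3.0, 1.5, 4.2, 4.0, 5.1, 2.7]
times, prices = record_times_and_values(offers)
n_jobs = len(prices)  # N, the number of jobs held in the cycle
```

Here the worker receives M = 6 offers but holds only N = 3 jobs, since the second, fourth and sixth offers do not beat the price he is already earning.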
Before we turn to what is new in this paper, let us review previous work that exploits the record
structure of search models for purposes of identification. Barlevy (2005) offers the following
multi-step procedure for the special case where employment histories are not censored and all workers
share the same reservation price w∗. The first step involves recovering the ratio κ1 = λ1/δ.
From (1.4), the number of oﬀers M on an employment cycle has a geometric distribution that
depends on κ1 . One can show that this implies the distribution of the number of records N across
employment cycles is itself uniquely determined by the parameter κ1 (and moreover must be a
truncated Poisson). Hence, data on the number of jobs per employment cycle suﬃce for estimating
the relative rate at which employed workers receive oﬀers.
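
The truncated Poisson claim can be checked numerically. A short calculation with the probability generating function of the number of records (our own derivation for the i.i.d. case, not a formula quoted from the paper) gives Pr(N = n) = μ^n / (n! κ1) with μ = ln(1 + κ1). The sketch below compares that formula with a Monte Carlo simulation of employment cycles; the function names, the seed and the value of κ1 are illustrative assumptions.

```python
import math
import random

def records_pmf(n, kappa1):
    """Zero-truncated Poisson pmf for N with parameter mu = ln(1 + kappa1)."""
    mu = math.log(1.0 + kappa1)
    return mu ** n / (math.factorial(n) * kappa1)

def simulate_num_records(kappa1, rng):
    """Draw one employment cycle and count its records (jobs)."""
    p_end = 1.0 / (1.0 + kappa1)   # delta / (lambda1 + delta), as in (1.4)
    best, n_records = -math.inf, 0
    while True:
        y = rng.random()           # any continuous offer distribution gives the same N
        if y > best:
            best, n_records = y, n_records + 1
        if rng.random() < p_end:   # the cycle ends after this offer
            return n_records

rng = random.Random(0)
kappa1 = 3.0
draws = [simulate_num_records(kappa1, rng) for _ in range(50000)]
share_one_job = sum(1 for n in draws if n == 1) / len(draws)
```

Under these assumptions the simulated share of single-job cycles should be close to `records_pmf(1, kappa1)`, and inverting the pmf (or its mean) for κ1 is one way to use job counts alone to estimate the relative offer arrival rate.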
Note that we use neither wage data nor duration data to recover this ratio, in contrast to the
approach outlined in the survey by Eckstein and van den Berg (2005) which identifies κ1 from data
on how the duration of the job varies with its price $w_j$. The fact that we cannot directly infer the
price on a job $w_j$ from the hourly wage $W^{*}_{ijt} = w_j \ell_{it}$ does not pose a problem. Moreover, as we
reiterate below, using duration data to recover κ1 requires that we know F (·), and at this stage
we have yet to recover it. That said, we would need duration data to separately identify λ1 and δ.
In particular, the duration of an employment cycle has an exponential distribution with hazard δ,
allowing us to recover this parameter.
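
A minimal sketch of this last step follows, assuming completed (uncensored) employment cycle durations are available, in which case the maximum-likelihood estimate of an exponential hazard is the reciprocal of the mean duration. The function name, the durations and the value of κ1 are hypothetical.

```python
def split_rates(cycle_durations, kappa1):
    """Recover delta and lambda1 from cycle durations and kappa1 = lambda1 / delta.

    Assumes the durations are uncensored draws from an exponential distribution
    with hazard delta, so that the MLE of delta is 1 / (mean duration).
    """
    delta_hat = len(cycle_durations) / sum(cycle_durations)
    lambda1_hat = kappa1 * delta_hat
    return delta_hat, lambda1_hat

# Hypothetical durations (in years) and a hypothetical first-step estimate of kappa1.
delta_hat, lambda1_hat = split_rates([2.0, 5.0, 1.0, 4.0], kappa1=3.0)
```

Censored cycles would require a likelihood that accounts for censoring; that refinement is outside this sketch.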
The next two steps describe how to recover the offer distribution F (·). We begin by taking the
log difference of the wage in (1.2)

$$\Delta \ln W_{it} = \ln w_{n(t)} - \ln w_{n(t-1)} + \beta \Delta X_{it} + \Delta\varepsilon_{it} \tag{1.6}$$

where n (t) denotes the record number of the job the worker holds at date t. For workers who
remain on the same job at these two points in time, the price w per unit labor will be constant,
and their log-wage growth is given by

$$\Delta \ln W_{it} = \beta \Delta X_{it} + \Delta\varepsilon_{it} \tag{1.7}$$

Given Assumption 1.1, we can recover β by ordinary least squares. Next, we turn to workers who
move from their (n − 1)-th job at date t − 1 to their n-th job at date t. For these workers, we have

$$\Delta \ln W_{it} - \beta \Delta X_{it} = \ln w_n - \ln w_{n-1} + \Delta\varepsilon_{it} \equiv \omega_n - \omega_{n-1} + \Delta\varepsilon_{it}$$

where we use the estimate of β from the previous step. Note that $\omega_n = \ln w_n$ corresponds to
the n-th record in the sequence of log price offers. Thus, net log wage growth for voluntary job
changers is just an error-ridden record gap. Averaging across all workers who switch from their
(n − 1)-th job to their n-th job, and using the fact that $w_j$ is independent of $\ell_{it}$, we obtain

$$E(\Delta \ln W_{it} - \beta \Delta X_{it} \mid N \ge n) = E(\omega_n - \omega_{n-1} \mid N \ge n) \tag{1.8}$$

This average is conditional on there being at least n jobs in the respective employment cycle,
i.e. on the event that the worker made it to his n-th job. The imputed average log wage changes
thus correspond to the conditional expected record gaps $\{E(y_{L(n)} - y_{L(n-1)} \mid N \ge n)\}$ from a
sequence $\{y_m\}_{m=1}^{M}$ that corresponds to log price offers over an employment cycle.5
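
The two steps in (1.7) and (1.8) can be sketched as follows. This is a simplified illustration rather than the paper's estimator: it assumes a single scalar regressor, fits (1.7) by least squares through the origin, and then averages net log wage growth of movers by the number n of the job they move into. The data and function names are hypothetical.

```python
from collections import defaultdict

def estimate_beta(stayers):
    """Least squares through the origin of d ln W on dX for job stayers, as in (1.7).

    `stayers` is a list of (d_log_wage, d_x) pairs with a scalar regressor.
    """
    num = sum(dw * dx for dw, dx in stayers)
    den = sum(dx * dx for _, dx in stayers)
    return num / den

def average_record_gaps(movers, beta):
    """Average net log wage growth by destination job number n, as in (1.8).

    `movers` is a list of (n, d_log_wage, d_x) for voluntary moves into the n-th job.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for n, dw, dx in movers:
        sums[n] += dw - beta * dx   # strip out the part explained by observables
        counts[n] += 1
    return {n: sums[n] / counts[n] for n in sorted(sums)}

# Hypothetical, noise-free data for illustration.
stayers = [(0.05, 1.0), (0.10, 2.0), (0.15, 3.0)]
movers = [(2, 0.25, 1.0), (2, 0.35, 1.0), (3, 0.15, 1.0)]
beta_hat = estimate_beta(stayers)
gaps = average_record_gaps(movers, beta_hat)
```

In real data the individual net wage changes contain Δε and can be negative; only their averages by n are interpreted as expected record gaps.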

5 While the observed log wage change for an individual ∆ω + ∆ε can be negative, average log wage changes
E (∆ω) must be positive, an implication we can test. More generally, not every nonnegative sequence corresponds
to a sequence of average record gaps from some distribution, imposing a stronger restriction on what average wage
gains can be. We do not derive the admissible set here, other than to note it is isomorphic to the set of moments
for which the related Hausdorff Moment Problem described in Shohat and Tamarkin (1943) is solvable.

These average record gaps turn out to suffice for recovering the shape of the offer distribution. As
demonstrated by Nagaraja and Barlevy (2003) and Barlevy (2005), if M has a known geometric
distribution, the average record spacings $\{E(y_{L(n)} - y_{L(n-1)} \mid N \ge n)\}_{n=1}^{\infty}$ from a sequence of
i.i.d. observations $\{y_m\}_{m=1}^{M}$ identify the distribution of $y_m$ up to a location shift. Using our
estimate for κ1 to pin down the distribution of M, we can map the average net wage gains (1.8)
into a unique distribution F (·) up to a scaling parameter. Intuitively, average wage growth over an
employment cycle is informative because the wage gains of workers leaving their first job depend
more on the shape of the offer distribution near its lower support than the wage gains of workers
leaving their sixth job. Tracking wage growth over an employment cycle should therefore allow us
to reconstruct the offer distribution.
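
The intuition that the profile of average record gaps reflects the shape of the offer distribution can be illustrated by simulation. The sketch below estimates E (w_n − w_{n−1} | N ≥ n) for two assumed distributions of log offers: a unit-rate exponential, for which memorylessness implies a gap of one at every n, and a uniform on (0, 1), for which gaps shrink as n grows. The distributions, the value of p and the seed are our own illustrative choices.

```python
import random
from collections import defaultdict

def mean_record_gaps(draw_offer, p_end, n_cycles, rng):
    """Monte Carlo estimate of E(w_n - w_{n-1} | N >= n) for n = 2, 3, ..."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_cycles):
        best, n = None, 0
        while True:
            y = draw_offer(rng)
            if best is None or y > best:
                n += 1
                if best is not None:      # a gap exists only from the second record on
                    sums[n] += y - best
                    counts[n] += 1
                best = y
            if rng.random() < p_end:      # the cycle ends, so M is geometric
                break
    return {n: sums[n] / counts[n] for n in sorted(sums)}

rng = random.Random(1)
p_end = 0.1
expo = mean_record_gaps(lambda r: r.expovariate(1.0), p_end, 40000, rng)
unif = mean_record_gaps(lambda r: r.random(), p_end, 40000, rng)
```

Two distributions with different shapes thus generate different gap profiles, which is the variation the identification argument exploits.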
Several caveats are warranted regarding this identification strategy. First, it assumes all workers
share the same value of leisure $b_i$ so that prices on each employment cycle reflect records from
identically distributed sequences of i.i.d. observations $\{y_m\}_{m=1}^{M}$. But if workers valued leisure
differently, the price on the first job $y_{L(1)}$ would be drawn from a different truncated distribution
$F(w \mid w \ge w_i^*)$ for each worker. Another caveat is that we need to track each employment cycle
from its beginning to determine how many jobs a worker held between his last unemployment spell
and his current job. If employment cycles were censored, this approach would seem doomed. It
is therefore important to determine whether the model can remain identified when either of these
conditions is violated.
At first, heterogeneity in reservation prices and censored employment histories might seem like
distinct problems. However, one of the points of this paper is that in both cases, the wages on an
employment cycle can be viewed as record values from an independent but not identically
distributed (i.n.i.d.) sequence of observations $\{y_m\}_{m=1}^{M}$, specifically one in which the first observation $y_1$
is distributed differently from the remaining observations. It turns out that in this case we can
continue to identify the key aspects of the model, although demonstrating this is more involved
than in the i.i.d. case. In the next section we analyze records in this setup, and in subsequent
sections we discuss censoring and heterogeneous reservation prices, respectively.

2. Identification with a Non-Identically Distributed Initial Observation
This section contains the key mathematical results we use in subsequent sections. For a more
rigorous treatment of this model, see Barlevy and Nagaraja (2005).
Let $\{y_m\}_{m=1}^{M}$ denote a sequence of random variables. Suppose M is a geometrically distributed
random variable that is independent of the $y_m$, i.e. for m ≥ 1,

$$\Pr(M = m) = (1 - p)^{m-1} p \tag{2.1}$$

where 0 < p < 1. Define L (1) = 1 and $L(n) = \min\{k : y_k > y_{L(n-1)}\}$ as the n-th record time,
and define $w_n = y_{L(n)}$ for n ≥ 1 as the n-th record value. The total number of records that occur
within the sequence $\{y_m\}_{m=1}^{M}$ is given by $N = \max\{j : L(j) \le M\}$.

We assume {ym }M
m=1 are mutually independent and y1 is distributed according to some contin-

uous cdf G (·) while ym for m ≥ 2 is distributed according to some other continuous cdf F (·). We

want to allow for the possibility that G (·) and F (·) are related. Thus, let us define a mapping
T from the set of continuous distribution functions into itself such that G = T (F ). A mapping
like this can always be defined. However, we add content to this formulation by imposing the
following assumptions on T :
Assumption 2.1: for any distribution F, if G = T (F), then the composite function $\mathcal{G}(u) \equiv G \circ F^{-1}(u)$ is absolutely continuous in u ∈ (0, 1).

Assumption 2.2: for any distribution F, the support of G = T (F) lies in the support of F.

Assumption 2.3: for any distribution F, if G = T (F), then $F^{-1}(0) = G^{-1}(0)$.
Assumption 2.1 is a technical assumption that allows us to assume without loss of generality
that G (·) has a density function. Assumption 2.2 states that the first observation y1 can only
take on values in the support of all subsequent observations. In our application, this amounts to
the assumption that the price on the first job we observe for a worker must lie in the support of
the price oﬀer distribution. Assumption 2.3 states that the lowest possible value for y1 does not
exceed the lowest possible value for $y_2$. If this were not the case, values from the lower support
of F (·) would never show up in the record sequence $\{y_{L(n)}\}_{n=1}^{N}$, and there would be no hope of
fully recovering F (·) from record data. Note that the identity mapping T (F) = F satisfies these
assumptions. Thus, the case where all observations are identically distributed that was analyzed
in Nagaraja and Barlevy (2003) and Barlevy (2005) is just a special case of the model here.
Our goal is to show that the key features of this model, namely the parameter p in (2.1) and the
distributions F (·) and G (·), can be identified from record data. We begin by providing sufficient
conditions for the expected record gaps $E(y_{L(n)} - y_{L(n-1)} \mid N \geq n)$ to exist, since our results
employ these moments. The proof of this and other propositions is contained in an appendix.
Proposition 1: Consider a sequence of independent random variables $\{y_m\}_{m=1}^{M}$ where
(i) $\Pr(y_m \leq x) = F(x)$ for all m ≥ 2;
(ii) $\Pr(y_1 \leq x) = G(x)$ where G = T (F) satisfies Assumptions 2.1-2.3;
(iii) M is independent of $\{y_m\}$ and $\Pr(M = m) = (1 - p)^{m-1} p$ for some p ∈ (0, 1).
If $E(|y_2|) < \infty$, then $E(w_n - w_{n-1} \mid N \geq n)$ exists for all n ≥ 2, where $w_n = y_{L(n)}$. ∎
Note that the existence of record moments for n ≥ 2 only depends on the distribution F (·), not
G (·). This may seem surprising at first, since the fact that $E(y_2)$ is finite does not imply $E(y_1)$
is finite. Indeed, if the support of F (·) is unbounded above, one can easily construct examples
that satisfy Assumptions 2.1-2.3 where the mean for F (·) exists but the mean for G (·) does
not. The key is that these moments are conditioned on M ≥ 2 and $\max\{y_2, ..., y_M\} \geq y_1$. From
Nagaraja and Barlevy (2003), we know $E(E(\max\{y_2, ..., y_M\} \mid M \geq 2))$ is finite whenever $E(y_2)$
is. Hence, even if the unconditional mean for $y_1$ does not exist, the mean conditional on $y_1$ being
surpassed by a variable whose mean is finite does exist.
We begin with the case in which T is a known mapping, as will be true of the application in
the next section. Moreover, we impose an additional assumption on T , namely that there exists a
function $G_0 : [0, 1] \to [0, 1]$ such that for any function F and any real number x,

$$G(x) = T(F)(x) = G_0(F(x)) \qquad (2.2)$$

where T (F ) (x) represents the value of T (F ) evaluated at x. In words, the probability that y1 < x
under G = T (F ) can be expressed purely in terms of the percentile that x occupies within the
distribution F (·). Again, this will be true for our application in the next section. We now have
Proposition 2: Consider a sequence of independent random variables $\{y_m\}_{m=1}^{M}$ where
(i) $\Pr(y_m \leq x) = F(x)$ for all m ≥ 2;
(ii) $\Pr(y_1 \leq x) = G_0(F(x))$ where $G_0 : [0, 1] \to [0, 1]$ is non-decreasing and onto;
(iii) M is independent of $\{y_m\}$ and $\Pr(M = m) = (1 - p)^{m-1} p$ for some p ∈ (0, 1).
Then we have
a. The distribution $\{\Pr(N = n)\}_{n=1}^{\infty}$ identifies a unique p ∈ [0, 1] (which depends on $G_0$).
b. If $E(|y_2|) < \infty$, then $\{E(w_n - w_{n-1} \mid N \geq n)\}_{n=2}^{\infty}$ uniquely characterizes F (·) for a given
p within the set of continuous distributions, up to a location shift. ∎

Hence, given a known mapping T that satisfies (2.2), we can use record data to recover both the
parameter p and the distribution F (·) from which $y_m$ for m ≥ 2 is drawn (and since T : F → G is
a known mapping, once we know F we can also deduce G). Establishing this result proves to be
considerably more difficult than for the i.i.d. case. In particular, our proof hinges on a relatively
obscure result in convolution theory due to Titchmarsh (1926) that would not be required if the
observations $\{y_m\}_{m=1}^{M}$ were instead i.i.d.⁶

We next consider the case in which the mapping T is unknown, but is still assumed to satisfy
Assumptions 2.1 — 2.3. In this case, we establish the following result:
Proposition 3: Consider a sequence of independent random variables $\{y_m\}_{m=1}^{M}$ where
(i) $\Pr(y_m \leq x) = F(x)$ for all m ≥ 2;
(ii) $\Pr(y_1 \leq x) = G(x)$ where G (x) is compatible with Assumptions 2.1-2.3;
(iii) M is independent of $\{y_m\}$ and $\Pr(M = m) = (1 - p)^{m-1} p$ for some p ∈ (0, 1).
If $E(|y_2|) < \infty$, then $\{\Pr(N = n)\}_{n=1}^{\infty}$ and $\{E(w_n - w_{n-1} \mid N \geq n)\}_{n=2}^{\infty}$ uniquely determine p
and identify F (·) up to a location shift and its associated G = T (F). ∎

In other words, given two arbitrary distributions G (·) and F (·) that comply with Assumptions
2.1-2.3, we can use record data to infer the shape of the two distributions as well as p. More
precisely, the distribution $\{\Pr(N = n)\}_{n=1}^{\infty}$ identifies p and the distribution G (·) relative to F (·),
i.e. the composite function $\mathcal{G}(u) \equiv G \circ F^{-1}(u)$ for all u ∈ (0, 1). Intuitively, the number of records
that we will observe in a typical sequence depends on how many observations there are (hence
p) and how much weight G (·) assigns to values near the upper support of F (·) that are hard to
improve upon. Given p and $\mathcal{G}(u)$, the expected record gaps can then be used to determine the
shape of F (·). By substituting F (·) into $\mathcal{G}(u)$, we can obtain the distribution G (·).
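The intuition that the number of records depends both on p and on how much weight G (·) places near the top of F (·) can be checked with a small simulation. The Python sketch below works entirely in percentiles of F: the first observation has percentile $u_1 = F(y_1)$, and later observations have uniform percentiles. It estimates Pr (N = 1) for two illustrative choices, an i.i.d. first draw and a "top-heavy" first draw whose percentile has cdf $u^2$. Summing $u_1^{m-1}$ against (2.1) gives $\Pr(N = 1 \mid u_1) = p / (1 - (1-p)u_1)$, so in the i.i.d. case Pr (N = 1) = −p ln p / (1 − p); this closed form is our own calculation for the example, and all names and parameter values are assumptions.

```python
import math
import random

def simulate_prob_single_record(p, draw_u1, n_sims=200_000, seed=0):
    """Monte Carlo estimate of Pr(N = 1).

    draw_u1(rng) returns the percentile u1 = F(y_1) of the first observation;
    later observations have uniform percentiles because they are drawn from F.
    """
    rng = random.Random(seed)
    single = 0
    for _ in range(n_sims):
        u1 = draw_u1(rng)
        beaten = False
        # Each pass is one more observation y_m, m >= 2; the sequence stops
        # with probability p after every observation, as in (2.1).
        while rng.random() >= p:
            if rng.random() > u1:
                beaten = True      # a second record occurs, so N >= 2
                break
        single += not beaten
    return single / n_sims

p = 0.4                                                   # illustrative value
iid = simulate_prob_single_record(p, lambda rng: rng.random())
top_heavy = simulate_prob_single_record(p, lambda rng: math.sqrt(rng.random()))
closed_form_iid = -p * math.log(p) / (1 - p)
```

With the same p, the top-heavy first draw is harder to improve upon, so a larger share of sequences should end with a single record than in the i.i.d. case.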

6 In the i.i.d. case, characterization results can instead be derived using the better-known Müntz-Szász theorem. See Kamps (1998) for a survey of applications of the Müntz-Szász theorem to order statistics.

3. Censored Employment Histories
As a first application of our results, we turn to the case of censored employment histories. The
problem of censoring is highly relevant for empirical applications. For example, in applying the
insights of record statistics to analyze actual wage data, Barlevy (2005) was forced to turn to
the National Longitudinal Survey of Youth (NLSY) dataset to obtain sufficiently complete work
histories. One limitation of the NLSY is that it is not a particularly large sample, and thus yields
too few observations to accurately estimate the average net wage gains beyond the third job in an
employment cycle. The limited sample size also precludes implementing the approach separately
for diﬀerent groups of workers, making it impossible to explore such questions as whether the oﬀer
distribution and the rate at which oﬀers arrive diﬀer for blacks and whites. Larger panel datasets
are available, such as the Census Bureau’s Survey of Income and Program Participation (SIPP),
but most respondents in these surveys are already employed at the time of their first interview. In
other words, employment cycles are left-censored. As a result, whenever a worker we track changes
jobs, we don’t know how many jobs he previously worked on. We could of course wait until the
worker happens to become unemployed, but this would force us to throw out much of the data.
Developing a strategy for identification that can use censored employment cycles is essential for
exploiting the larger sample sizes of existing datasets.
For now, let us assume as in Nagaraja and Barlevy (2003) or Barlevy (2005) that all workers
share the same value of leisure b and hence the same reservation price w∗ . But in contrast to
these papers, suppose we only get to observe a random sample of already-employed workers. The
problem with implementing the procedure outlined in Section 1 is that we no longer know what
position in its respective employment cycle any job in the data represents. However, we can make
use of an important feature of the labor market we consider, namely that it converges to a steady
state in which both the employment rate and the distribution of employed workers across prices is
constant over time. Let u denote the fraction of workers who are unemployed in steady state, and
let G (w) denote the fraction of employed workers who are paid no more than w per unit labor.
Since both expressions are constant over time, and using the fact that unemployed workers will
accept any offer they receive given our assumption that $\underline{w} = F^{-1}(0) \geq w^*$, we have

$$\frac{du}{dt} = -\lambda_0 u + \delta(1 - u) = 0$$

$$\frac{dG(w)}{dt} = \lambda_0 F(w)\,\frac{u}{1-u} - [\delta + \lambda_1(1 - F(w))]\,G(w) = 0$$

Solving these two equations yields the following expression for the steady state distribution of
prices across employed workers

$$G(w) = \frac{F(w)}{1 + \kappa_1(1 - F(w))} \qquad (3.1)$$

where recall κ1 = λ1 /δ. This distribution first-order stochastically dominates F (·), reflecting the
fact that workers gravitate to higher paying jobs. Since some already-employed workers will have
moved on to jobs that pay more than the first job they were oﬀered, we will observe more workers
at high price jobs than in a random sample of workers on their first job out of unemployment.
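As a quick numerical check on the steady state distribution (3.1), the Python sketch below verifies that it sets dG (w)/dt to zero once u solves du/dt = 0, and that G (w) ≤ F (w) at every percentile, which is the first-order stochastic dominance noted above. The parameter values and function names are hypothetical and chosen only for illustration.

```python
def steady_state_cdf(F_w, kappa1):
    """Equation (3.1): G(w) as a function of F(w) and kappa1 = lambda1 / delta."""
    return F_w / (1.0 + kappa1 * (1.0 - F_w))

def flow_balance(F_w, lambda0, lambda1, delta):
    """dG(w)/dt evaluated at the steady state; it should equal zero."""
    u = delta / (delta + lambda0)                 # solves du/dt = 0
    G_w = steady_state_cdf(F_w, lambda1 / delta)
    inflow = lambda0 * F_w * u / (1.0 - u)        # hires out of unemployment
    outflow = (delta + lambda1 * (1.0 - F_w)) * G_w
    return inflow - outflow

# Hypothetical parameter values, chosen only for illustration.
lambda0, lambda1, delta = 0.5, 0.2, 0.05
grid = [i / 10 for i in range(11)]                # values of F(w) from 0 to 1
residuals = [flow_balance(F_w, lambda0, lambda1, delta) for F_w in grid]
dominance = [steady_state_cdf(F_w, lambda1 / delta) <= F_w for F_w in grid]
```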

Suppose we chose a worker at random among all employed workers, and denote by $y_1$ the price
on the first job we observe him on. If workers have been active in the labor market for some time,
then $y_1$ represents an independent draw from a distribution approximately equal to the steady
state distribution G (·) above. Let M − 1 denote the number of offers a worker receives from the
time we first observe him until he is next laid off, and let us refer to these offers as $y_2$ through $y_M$
according to the order they arrive. It is easy to show that the number of offers from now until he
is laid off has the same geometric distribution as in (1.4). Hence, M is geometric. The price
offers $y_m$ for m ≥ 2 are drawn from F (·) and are independent of one another and of $y_1$.
It can be verified that the mapping T : F → G defined by (3.1) is consistent with Assumptions
2.1-2.3. Moreover, $G(w) = G_0(F(w))$, where $G_0(y) = y\,[1 + \kappa_1(1 - y)]^{-1}$. As long as the log
price offer distribution has a finite mean, we satisfy all of the requirements of Proposition 2 above.
We can therefore use the distribution of the number of jobs N from when we first observe a worker
until he is eventually laid off to recover κ1.
More precisely, as can be inferred from Appendix A, Pr (N = n) can be expressed as follows:

$$\Pr(N = n) = \int_0^1 \frac{1 + \kappa_1}{(1 + \kappa_1(1 - u))^3}\,\frac{[\ln(1 + \kappa_1(1 - u))]^{n-1}}{(n-1)!}\,du \qquad (3.2)$$

We can therefore recover κ1 from the empirical distribution of the number of jobs per left-censored
employment cycle using maximum likelihood, i.e. by choosing κ1 to maximize the likelihood of
the observed values of N across employment cycles under (3.2). Once we have an estimate for κ1 ,
Proposition 2 tells us we can use the net wage gains of voluntary job changers to recover the oﬀer
distribution F (·) up to a scaling parameter. Let n denote the number of job changes we observe
for the worker since his first survey. The average net wage gains $E(\Delta \ln W_{it} - \beta \Delta X_{it} \mid N \geq n)$
then correspond to the sequence of moments $E(y_{L(n)} - y_{L(n-1)} \mid N \geq n)$ among log price offers
$\{y_m\}_{m=1}^{M}$, and by Proposition 2 these moments characterize the distribution of $y_2$. While this
two-step approach is similar to the one described earlier for complete cycles, the way we map the
underlying data to values for κ1 and F (·) will be diﬀerent and will depend on the function G0 (·)
implicit in (3.1).
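The first step of this procedure, estimating κ1 by maximum likelihood from the number of jobs per left-censored cycle, can be sketched in a few lines of Python. The sketch evaluates (3.2) by Simpson's rule and maximizes the log-likelihood by golden-section search. It is a minimal illustration rather than the estimation code used in the paper: the function names, the search bracket, the assumption that the log-likelihood is unimodal on that bracket, and the job counts are all hypothetical.

```python
import math

def prob_n_jobs(n, kappa1, steps=400):
    """Pr(N = n) from (3.2), integrated over u in [0, 1] by Simpson's rule."""
    def integrand(u):
        t = 1.0 + kappa1 * (1.0 - u)
        return (1.0 + kappa1) * math.log(t) ** (n - 1) / (math.factorial(n - 1) * t ** 3)
    h = 1.0 / steps                      # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def log_likelihood(kappa1, counts):
    """counts maps a number of jobs n to the number of cycles with N = n."""
    return sum(c * math.log(prob_n_jobs(n, kappa1)) for n, c in counts.items())

def estimate_kappa1(counts, lo=0.01, hi=20.0, tol=1e-3):
    """Golden-section search; assumes the log-likelihood is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = log_likelihood(c, counts), log_likelihood(d, counts)
    while b - a > tol:
        if fc > fd:                      # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = log_likelihood(c, counts)
        else:                            # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = log_likelihood(d, counts)
    return 0.5 * (a + b)

# Hypothetical counts of cycles by number of jobs, for illustration only.
counts = {1: 700, 2: 220, 3: 60, 4: 20}
kappa1_hat = estimate_kappa1(counts)
```

For n = 1 the integral in (3.2) has the closed form (κ1 + 2) / (2 (1 + κ1)), which gives a convenient check on the numerical integration.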
At this point, we should mention earlier work by Bontemps, Robin, and van den Berg (2000) who
also argue that this model can be nonparametrically identified when we only get to observe workers
who are already employed. However, unlike this paper, they assumed no measurement error and
no unobserved worker productivity. This allowed them to pursue a diﬀerent identification strategy
from the one described above. It will be useful to compare the two approaches to understand the
diﬃculties posed by the presence of unobserved worker productivity.

When worker productivity is perfectly observable, we can deduce the price w on each job. Given
this, Bontemps et al note that we can directly recover G (·) from the empirical distribution of
prices among already-employed workers, assuming these workers have been active in the market
for a suﬃciently long time. Rearranging (3.1), we have
$$F(w) = \frac{(1 + \kappa_1)\,G(w)}{1 + \kappa_1 G(w)} \qquad (3.3)$$

Just as with our approach, their approach requires an estimate for κ1 to recover F (·). But
rather than estimating κ1 from the number of jobs a worker passes through before he is next laid
oﬀ, Bontemps et al turn to job duration data. Since a job ends if either the worker is sent into
unemployment or receives a better offer, a job offering a price w ends with hazard $\delta + \lambda_1(1 - F(w))$.
Substituting in (3.3), this becomes

$$\frac{\lambda_1(1 + \kappa_1)}{\kappa_1(1 + \kappa_1 G(w))}$$
As long as one can estimate the hazard for at least two diﬀerent values of w, it will be possible to
separately estimate λ1 and κ1 . This provides them with the parameter κ1 of interest, as well as
separate estimates for λ1 and δ = λ1 /κ1 .
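The claim that two hazard observations suffice can be made concrete. Since the hazard times (1 + κ1 G (w)) equals δ (1 + κ1) at every w, two hazards measured at different values of G (w) give two equations that can be solved for κ1 and δ. The Python sketch below does this for noise-free hazards; the parameter values and function names are hypothetical, and with estimated hazards one would expect sampling error that this sketch ignores.

```python
def hazard(G_w, lambda1, delta):
    """Job-ending hazard delta + lambda1 * (1 - F(w)), written in terms of G(w)."""
    kappa1 = lambda1 / delta
    return lambda1 * (1.0 + kappa1) / (kappa1 * (1.0 + kappa1 * G_w))

def recover_rates(G_a, haz_a, G_b, haz_b):
    """Solve the two hazard equations for (lambda1, delta, kappa1)."""
    # haz_a * (1 + kappa1 * G_a) = haz_b * (1 + kappa1 * G_b) = delta * (1 + kappa1)
    kappa1 = (haz_a - haz_b) / (haz_b * G_b - haz_a * G_a)
    delta = haz_a * (1.0 + kappa1 * G_a) / (1.0 + kappa1)
    return kappa1 * delta, delta, kappa1

# Hypothetical true values, used only to generate two exact hazards.
lambda1_true, delta_true = 0.2, 0.05
h_low = hazard(0.25, lambda1_true, delta_true)    # hazard at G(w) = 0.25
h_high = hazard(0.75, lambda1_true, delta_true)   # hazard at G(w) = 0.75
lambda1_hat, delta_hat, kappa1_hat = recover_rates(0.25, h_low, 0.75, h_high)
```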
Once we allow for the possibility that worker productivity cannot be perfectly measured, though,
the approach proposed by Bontemps et al no longer works. First and foremost, the distribution
of observed wages W = w in a random sample of employed workers no longer corresponds to
G (·), but to a convolution of G (·) and the distribution of worker ability. Without additional
assumptions on unobserved worker productivity, we cannot recover G (·) from a cross sectional
wage distribution. Second, even if G (·) were known, the fact that we cannot directly observe the
price w makes it impossible to exploit the variation between job duration and w to identify κ1 .
The virtue of recovering κ1 from longitudinal data on the number of jobs per employment cycle is
that we do not need to know either the distribution G (·) or the true price on each job.
Interestingly, the fact that it is possible to recover κ1 without relying on wage data suggests our
approach can be used in models where the price per unit labor is not constant over the course
of the job as we have assumed. For example, consider the Postel-Vinay and Robin (2002) model,
which uses the same production structure as here but assumes a diﬀerent wage setting mechanism.
They assume employers post wages, but can then increase the wage if their employee receives an
outside oﬀer. An implication of their model is that workers always prefer an employer who is more
productive to one who is less productive. This implies the number of jobs is the number of records
in the sequence $\{Z_m\}_{m=1}^{M}$ where $Z_m$ denotes the productivity of the m-th offer and M denotes the
number of offers in an employment cycle. We can therefore use the number of jobs per employment
cycle to identify κ1. This is true even though wages no longer correspond to record values: a
worker might agree to a lower wage from a more productive employer, correctly anticipating that
wage growth will be higher with this employer. By contrast, it will not be possible to recover κ1
using wage and duration data as described above. However, a similar approach would work if we
could observe z, since a job with productivity z will end with a hazard rate of δ + λ1 (1 − Γ (z)).⁷
Another diﬀerence between our approach and the one proposed by Bontemps et al is that their
strategy requires that equation (3.1) holds, whereas our approach does not. This is because the
approach advocated by Bontemps et al proceeds by first estimating G (·) and then using a known
mapping from G (·) to F (·) to deduce F (·). By contrast, our approach remains valid even if G (·)
is unknown, since we identify F (·) from the evolution of log wage growth over an employment
cycle, not by mapping an estimate of G (·) into an implied estimate of F (·). Formally, according
to Proposition 3 we can identify F (·) when T : F → G is unknown.
While our approach has the advantage that it does not require knowing how G (·) and F (·) are
related, its main drawback is that it is far more data intensive: we need to track workers to the
end of their employment cycle, whereas Bontemps et al only need data on one job per worker.
However, tracking each employment cycle to its very end is only necessitated by our assumption
of unobservable productivity. If worker productivity were observable, we could separately identify
G (·) and F (·) without as much data. Suppose we collected data on w1 and w2 , the price per
unit labor on the first and second jobs we observe for a worker, as well as the duration of the first
job. As in Bontemps et al, we could use the empirical distribution of w1 to recover G (·) and the
relationship between the duration of the job and its price to recover λ1 and δ. But rather than rely
on (3.1) to infer F (·) from G (·), we can recover F (·) directly given estimates of κ1 and G (·) by
using the fact that the empirical distribution of $w_2$ corresponds to the distribution of the second
record from $\{y_m\}_{m=1}^{M}$ conditional on the occurrence of a second record. We would need to track
workers beyond the first job we see them on, but not to the end of their employment cycle.

7 Postel-Vinay and Robin (2002) appeal to precisely this strategy. Using matched employer-employee data, they estimate the productivity of each firm by the average wage of all workers at the same location. This approach is valid so long as all jobs with the same employer are equally productive.

4. Heterogeneity in Reservation Prices
So far, we have maintained the assumption that all workers share the same value of leisure b. In this
section we relax this assumption. Formally, let Υ (·) denote the cdf of $b_i$ across individuals. Given
the implicit formula for $w_i^*$ in (1.3), Υ (·) can be mapped into a distribution H (·) for reservation
prices $w_i^*$ across all workers. As before, we assume that the least selective worker would be willing
to accept any offer, i.e. $\underline{w}^* = H^{-1}(0) \leq F^{-1}(0) = \underline{w}$. Let us define $H_u(\cdot)$ as the distribution of
$w_i^*$ across unemployed workers. This distribution will differ from the distribution of reservation
prices in the population, since workers with high reservation prices are less likely to move into
employment and thus more likely to be unemployed at a given point in time. We show that it
is possible to nonparametrically identify both the oﬀer distribution F (·) and the distribution of
reservation prices among the unemployed Hu (·). Under additional assumptions, it will also be
possible to reconstruct H (·) and Υ (·). Throughout this section, we assume the econometrician
has access to complete employment histories.
Suppose we tracked an unemployed worker chosen at random and recorded the price on the first
job that worker accepted. Let us call this price $y_1$. How is $y_1$ distributed? We can view $y_1$ as
a random draw w from F (·) known to exceed the value of an independent draw $w^*$ from $H_u(\cdot)$.
Hence, its distribution is just the distribution of w conditional on the event that $w \geq w^*$, i.e.

$$\Pr(y_1 \leq x) = \Pr(w \leq x \mid w \geq w^*) = \frac{\Pr(w^* \leq w \leq x)}{\Pr(w^* \leq w)}$$

One can show that $\Pr(w^* \leq w \leq x) = \int_{-\infty}^{x} H_u(w)\,dF(w)$ and $\Pr(w \geq w^*) = \int_{-\infty}^{\infty} H_u(w)\,dF(w)$.
Hence, if we let G (x) denote the cdf for $y_1$, then

$$G(x) = \frac{\int_{-\infty}^{x} H_u(w)\,dF(w)}{\int_{-\infty}^{\infty} H_u(w)\,dF(w)} \qquad (4.1)$$

Appealing to the change in variables y = F (x), we can rewrite (4.1) as

$$G(x) = \frac{\int_0^{F(x)} H_u(F^{-1}(y))\,dy}{\int_0^{1} H_u(F^{-1}(y))\,dy} \qquad (4.2)$$

As before, denote additional price offers over the course of an employment cycle by $\{y_2, ..., y_M\}$.
The prices on the jobs the worker accepts represent records from the sequence $\{y_m\}_{m=1}^{M}$, where
$y_1$ is distributed according to (4.1) while $y_m$ for m ≥ 2 is distributed according to F (·). It is
easy to verify that G (x) satisfies Assumptions 2.1 and 2.2. Assumption 2.3 follows directly from
our assumption that $F^{-1}(0) \geq H^{-1}(0) = \underline{w}^*$. However, as evident from (4.2), G (x) cannot be
expressed as a function of F (x) alone, since G (x) depends on F (y) for all y ≤ x. We therefore
appeal to Proposition 3, which allows G (·) to be an unknown function.
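Formula (4.1) can be checked by simulation in a simple special case. The Python sketch below reads (4.1) literally: it draws independent pairs (w, w*) with w from F and w* from $H_u$, keeps the pairs with w ≥ w*, and compares the empirical cdf of the retained w with the formula. The distributional choices (F uniform on [0, 1] and $H_u(w) = w^2$ on [0, 1], which imply $G(x) = x^3$), the sample size, and all function names are assumptions made purely for illustration.

```python
import math
import random

def simulate_first_price(n_draws, seed=0):
    """Draw independent pairs (w, w*) and keep w whenever w >= w*, as in (4.1)."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        w = rng.random()                   # offer from F = Uniform(0, 1)
        w_star = math.sqrt(rng.random())   # reservation price with cdf H_u(w) = w**2
        if w >= w_star:
            kept.append(w)
    return kept

def empirical_cdf(sample, x):
    """Share of the sample that is no greater than x."""
    return sum(v <= x for v in sample) / len(sample)

def theoretical_cdf(x):
    """(4.1) under the illustrative choices above: G(x) = x**3 on [0, 1]."""
    return x ** 3

sample = simulate_first_price(300_000)
gap = abs(empirical_cdf(sample, 0.5) - theoretical_cdf(0.5))
```

Because G (x) here depends on the whole shape of $H_u$ below x, it is not a function of F (x) alone unless $H_u$ happens to be known, which is why Proposition 3 rather than Proposition 2 is needed.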

According to Proposition 3, we can use the distribution of the number of jobs N per employment cycle to recover κ1 and the composite function $\mathcal{G}(u) \equiv G \circ F^{-1}(u)$. Let n denote the
number of jobs the worker has held since he was last unemployed. The average net wage gains
$E(\Delta \ln W_{it} - \beta \Delta X_{it} \mid N \geq n)$ across all workers correspond to the sequence of expected record
gaps $E(y_{L(n)} - y_{L(n-1)} \mid N \geq n)$ among the log price offers the worker receives. Proposition 3
states that, given κ1 and $\mathcal{G}(u)$, these moments uniquely determine the log price offer distribution
up to a location shift, and hence the offer distribution F (·) up to a scaling parameter. Thus, even
though the sequence of prices over an employment cycle $\{w_n\}_{n=1}^{N}$ will be distributed differently
for workers with different reservation prices, we can still recover F (·) from the average wage gains
across all job changers who are at the same position in their respective employment cycle.
Once we know F (·), we can substitute it into our original estimate for $\mathcal{G}(u)$ to recover G (·),
the distribution of the price on the starting job. Although G (·) itself may not be of direct
interest, recovering G (·) is a first step towards recovering objects that are of inherent interest,
such as the distribution of reservation prices $H_u(\cdot)$ across unemployed workers, the distribution
of reservation prices H (·) across all workers, and the distribution of the value of leisure Υ (·). For
certain applications, it is important that we be able to estimate these distributions. For example,
one feature of U.S. labor markets is that many unemployed workers find jobs relatively quickly.
This could be because most workers are not very choosy and accept any oﬀer that comes their
way, or because search frictions for unemployed workers are small (i.e. λ0 is large) and even choosy
workers can quickly find jobs they are willing to take. The distribution of reservation prices among
unemployed workers Hu (·) can distinguish between these two explanations.
We now describe how to recover these distributions using G (·). We begin with the distribution
of reservation prices Hu (·) across unemployed workers. Intuitively, we should not expect to learn
about reservation values that do not correspond to prices oﬀered by some employers; this is because
we need data on how participation increases with the price to determine how many workers have
a given reservation value. Formally, we can only identify $H_u(x)$ when $x = F^{-1}(u)$ for some
u ∈ (0, 1). Setting $x = F^{-1}(u)$ in (4.2) and differentiating with respect to u yields

$$\frac{d}{du}\,G(F^{-1}(u)) = \frac{H_u(F^{-1}(u))}{\int_0^1 H_u(F^{-1}(u))\,du} \qquad (4.3)$$

The left-hand side of (4.3) is just $\mathcal{G}'(u)$, where recall $\mathcal{G}(u) \equiv G \circ F^{-1}(u)$ is identified from data on the number
of jobs N per employment cycle. Since $H_u(\cdot)$ is a cdf, it follows that $\mathcal{G}'(u)$ must be positive. This
is a testable implication of the model, since not every distribution Pr (N = n) will yield a function
$\mathcal{G}(u)$ that is everywhere nondecreasing.
Equation (4.3) implies that $H_u(F^{-1}(u))$ is proportional to $\mathcal{G}'(u)$ for all u ∈ (0, 1). To get
an expression for $H_u(F^{-1}(u))$, all we need is the constant of proportionality $\int_0^1 H_u(F^{-1}(u))\,du$.
This constant is not always identified, since it depends on the fraction of workers whose reservation
price exceeds the highest offered price $\overline{w} = \sup_j w_j$, which thus far we have not restricted.
If we assume $\overline{w}^* \leq \overline{w}$, an assumption that follows automatically if the offer distribution F (·) has
unbounded support, then we can easily obtain the constant of proportionality. In particular, if all
workers set their reservation price below $\overline{w} = F^{-1}(1)$, then $H_u(F^{-1}(1)) = 1$, and the constant of
proportionality is given by

$$\left[\int_0^1 H_u(F^{-1}(u))\,du\right]^{-1} = \mathcal{G}'(1) \qquad (4.4)$$

Hence, for any u ∈ (0, 1), we have

$$H_u(F^{-1}(u)) = \frac{\mathcal{G}'(u)}{\mathcal{G}'(1)} \qquad (4.5)$$

Even if we do not assume $\overline{w}^* \leq \overline{w}$, i.e. if the F (·) we identify has bounded support, we can still
interpret (4.5) as the conditional distribution $H_u(\cdot \mid w^* \leq \overline{w})$, i.e. as the distribution of reservation
prices among workers whose reservation price is no higher than $\overline{w}$.
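The mapping in (4.5) can be illustrated with a numerical round trip. The Python sketch below tabulates the composite function implied by (4.2) for a hypothetical $H_u(F^{-1}(u)) = u^2$, differentiates it by finite differences, and rescales by the slope in the last grid cell, which stands in for the derivative at u = 1. The grid size, the functional form, and the function names are assumptions; with an estimated composite function one would need to smooth before differentiating, which this sketch does not attempt.

```python
def composite_cdf(Hu_of_Finv, n_grid=1000):
    """Tabulate the composite function from (4.2) on an even grid of u in [0, 1]."""
    us = [i / n_grid for i in range(n_grid + 1)]
    vals = [Hu_of_Finv(u) for u in us]
    cum = [0.0]
    for i in range(1, n_grid + 1):          # trapezoid rule for the running integral
        cum.append(cum[-1] + 0.5 * (vals[i - 1] + vals[i]) / n_grid)
    return us, [c / cum[-1] for c in cum]

def recover_Hu(us, G_comp):
    """Apply (4.5) with finite differences: slope in each cell over slope in the last cell."""
    n = len(us) - 1
    slope = [(G_comp[i + 1] - G_comp[i]) * n for i in range(n)]
    mids = [0.5 * (us[i] + us[i + 1]) for i in range(n)]
    return mids, [s / slope[-1] for s in slope]

# Illustrative choice: H_u(F^{-1}(u)) = u**2, so every worker accepts the top offer.
us, G_comp = composite_cdf(lambda u: u ** 2)
mids, Hu_hat = recover_Hu(us, G_comp)
```

In this example the tabulated composite function is approximately $u^3$, and the recovered values track $u^2$ at the cell midpoints.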

Given an estimate for the distribution of reservation prices Hu (·) among unemployed workers,
we next set out to construct the distribution of reservation prices H (·) for the population as a
whole. For this, we need to impose additional assumptions on how the population of unemployed
workers compares to that of all workers. Let us assume that the market we consider has been
active for quite some time, so that the distribution Hu (·) is close to its steady-state value. In
steady-state, the fraction of workers with reservation price $w^*$ who are unemployed, which we
denote $u_{w^*}$, remains constant over time. Using the law of motion for $u_{w^*}$,

$$\dot{u}_{w^*} = -\lambda_0(1 - F(w^*))\,u_{w^*} + \delta(1 - u_{w^*})$$

and setting the change in $u_{w^*}$ to zero yields the following steady-state unemployment rate:

$$u_{w^*} = \frac{1}{1 + \kappa_0(1 - F(w^*))}$$

Hence, the steady state distribution of reservation prices across unemployed workers $H_u(\cdot)$ will be
related to H (·) by the formula

$$H_u(w)\,u = \int_{\underline{w}^*}^{w} u_{w^*}\,dH(w^*) = \int_{\underline{w}^*}^{w} \frac{dH(x)}{1 + \kappa_0(1 - F(x))} \qquad (4.6)$$

where u denotes the steady state unemployment rate, which one can show is given by

$$u = \frac{1}{1 + \kappa_0 \int_0^1 H_u(F^{-1}(y))\,dy} \qquad (4.7)$$

A similar derivation for this formula can be found in Bontemps, Robin, and van den Berg (1999).
Differentiating (4.6) with respect to w yields

$$h_u(w)\,u = \frac{h(w)}{1 + \kappa_0(1 - F(w))}$$

We can therefore derive H (·) from $H_u(\cdot)$. To do so, we first need to estimate κ0. One way to do
so is to use the steady-state unemployment rate u from equation (4.7) and our existing estimates
for $H_u(\cdot)$ to extract κ0. Next, for any price x that lies in the support of the offer distribution, i.e.
for any x that can be expressed as $F^{-1}(u)$ for some u ∈ (0, 1), we multiply the density function
$h_u(x) = \frac{d}{dx} H_u(x)$ by the factor $1 + \kappa_0(1 - F(x))$. Finally, we multiply our new function by the
unemployment rate to obtain the density function h (x).
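The two steps just described, backing out κ0 from (4.7) and then rescaling $h_u$ to obtain h, can be sketched numerically. In the Python example below, F and H are both taken to be uniform on [0, 1] and κ0 = 3, so that $H_u$ and the unemployment rate follow in closed form from (4.6); the sketch then treats u and $H_u$ as if they were data and checks that the procedure returns κ0 and a population density equal to one. All of these choices, along with the function names, are illustrative assumptions.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def recover_kappa0(u_rate, Hu, F_inv):
    """Invert (4.7): u = 1 / (1 + kappa0 * integral of H_u(F^{-1}(y)) dy)."""
    accept_prob = integrate(lambda y: Hu(F_inv(y)), 0.0, 1.0)
    return (1.0 / u_rate - 1.0) / accept_prob

def population_density(x, hu, F, u_rate, kappa0):
    """h(x) = u * h_u(x) * (1 + kappa0 * (1 - F(x))), from differentiating (4.6)."""
    return u_rate * hu(x) * (1.0 + kappa0 * (1.0 - F(x)))

# Illustrative primitives: F and H uniform on [0, 1], kappa0 = 3.
kappa0_true = 3.0
F = lambda x: x
F_inv = lambda y: y
u_true = math.log(1.0 + kappa0_true) / kappa0_true            # integral of u_{w*} dH
hu = lambda x: 1.0 / ((1.0 + kappa0_true * (1.0 - x)) * u_true)
Hu = lambda x: (math.log((1.0 + kappa0_true) / (1.0 + kappa0_true * (1.0 - x)))
                / math.log(1.0 + kappa0_true))

kappa0_hat = recover_kappa0(u_true, Hu, F_inv)
h_hat = [population_density(x, hu, F, u_true, kappa0_hat) for x in (0.1, 0.5, 0.9)]
```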
Once we recover the distribution H (·), it is fairly straightforward to obtain the distribution of
the value of leisure Υ (·). Recall that the cutoff $w^*$ for each worker is implicitly defined by equation (1.3).
Rearranging yields

$$w^* = b + (\kappa_0 - \kappa_1) \int_{w^*}^{\infty} \frac{1 - F(x)}{\rho/\delta + 1 + \kappa_1(1 - F(x))}\,dx \qquad (4.8)$$

where κ0 = λ0 /δ. We already described how to estimate κ0 and κ1 . The remaining parameter
we need is the ratio of the discount rate ρ to the rate of job loss δ. Recall that we can estimate
δ from the duration of employment cycles. As for ρ, under certain assumptions we should be
able to appeal to interest rate data to recover it, although this is beyond the scope of this paper.
Assuming we can assign a value to ρ, we can appeal to (4.8) to map H (·) into Υ (·). It might be
possible to characterize this mapping analytically, but if not, it is always possible to use simulation
methods to find a numerical approximation for Υ (·). An example of these methods can be found
in the work on identification in auctions by Guerre, Perrigne, and Vuong (2000).
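As a simple numerical illustration of the mapping in (4.8), the Python sketch below solves for the value of leisure b associated with a given reservation price by evaluating the integral with the trapezoid rule. The uniform offer distribution, the parameter values, the truncation point `w_max`, and the function names are all assumptions made for the example.

```python
def value_of_leisure(w_star, F, kappa0, kappa1, rho_over_delta, w_max, n=2000):
    """Solve (4.8) for b given a reservation price w_star.

    w_max is an upper bound on the support of F, beyond which 1 - F(x) = 0
    and the integrand vanishes.
    """
    if w_star >= w_max:
        return w_star
    def integrand(x):
        surv = 1.0 - F(x)
        return surv / (rho_over_delta + 1.0 + kappa1 * surv)
    h = (w_max - w_star) / n
    integral = h * (0.5 * integrand(w_star)
                    + sum(integrand(w_star + i * h) for i in range(1, n))
                    + 0.5 * integrand(w_max))
    return w_star - (kappa0 - kappa1) * integral

# Illustrative setting: offers uniform on [0, 1] and hypothetical parameters.
F_uniform = lambda x: min(max(x, 0.0), 1.0)
b_example = value_of_leisure(0.5, F_uniform, kappa0=2.0, kappa1=0.0,
                             rho_over_delta=1.0, w_max=1.0)
reservation_grid = [i / 10 for i in range(11)]
b_grid = [value_of_leisure(w, F_uniform, 2.0, 1.0, 1.0, 1.0) for w in reservation_grid]
```

Differentiating the implied b with respect to $w^*$ shows that it is strictly increasing in $w^*$, so tabulating pairs of $w^*$ and b on a grid, as in `b_grid`, is enough to carry H (·) over to Υ (·) numerically, since Υ (b ($w^*$)) = H ($w^*$).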

5. Empirical Application: Heterogeneity and Estimates of κ0 and κ1
Although one of our goals is to develop an identification strategy that can be applied to new
datasets where left-censoring is a concern, in our empirical analysis we limit ourselves to the same
National Longitudinal Survey of Youth (NLSY) dataset already explored in previous work. This
is because we wish to gauge the consequences of relaxing the assumption adopted in some of the
aforementioned papers that workers share a common reservation price. In this section we consider
the implications of this assumption for estimates of mobility parameters, and in the next section
we consider the implications for estimates of the oﬀer distribution F (·).
Our analysis suggests that allowing for heterogeneous reservation prices can dramatically aﬀect
the estimated rate at which workers receive oﬀers. As we discuss below, this sensitivity is driven
by two features of the empirical distribution of the number of jobs N per employment cycle: (1)
a majority of employment cycles end after only one job; (2) the distribution of N has a fat tail.
Allowing for heterogeneous reservation values gives more weight to observation (2), which in turn
leads to higher estimates for the rate at which employed workers receive oﬀers. Interestingly, these
estimates are on par with those implied by previous analysis based on duration data. However,
the distribution of reservation prices required to account for observation (1) implies that the rate
at which unemployed workers receive oﬀers is much higher than previous analysis suggests.
The reason we use the NLSY dataset is that it compiles comprehensive employment histories
for each worker in the survey for an extended period. This dataset tracks a cohort of workers
who were between ages 14 and 22 in 1979. We focus on male workers, whom we follow up to
1993, so the oldest person in our sample is 36. Although the NLSY was continued in subsequent
years, we chose not to incorporate that data, for two reasons. First, from 1994 on interviews were
conducted every two years, which as Pierret (2001) shows makes it diﬃcult to construct reliable
work histories. Second, in line with our assumptions, we want to focus on young workers who have
less incentive to invest in match-specific human capital, and age 36 seemed a reasonable cutoﬀ.
According to our analysis, all we need to estimate κ1 is the distribution of N across cycles. As in
previous work, such as Flinn (2002), we use periods of non-employment to demarcate employment
cycles. To avoid counting summer jobs for students as employment cycles, we restrict attention
to employment cycles that end when the worker has at least one year of potential experience, i.e.
he has been at least one year out of school. That is, a worker may begin an employment cycle
while still in school, but we will only include the cycle in our analysis if he remains continuously
employed beyond the point at which he finishes his schooling.
One concern about partitioning the data into employment cycles this way is that workers who
take time oﬀ before voluntarily moving into a new job — or must wait until this job starts (e.g.
teachers starting a job at a new school) — might be misclassified as starting a new employment cycle.
We therefore consider an alternative classification proposed in Barlevy (2005) which combines data
on non-employment spells with the reason the worker provides for ending his job. In particular, a
new employment cycle is said to begin if either the worker is laid oﬀ from his job or if he spends
more than 8 weeks in non-employment. The two approaches yield distributions for N that have
similar qualitative features, and hence lead to similar estimates for κ1 .
After we divide the data into employment cycles, we need to count the number of consecutive
jobs within each cycle. This raises the question of how to count dual job holdings, a relatively
common phenomenon in the data. We follow Barlevy (2005) in ignoring jobs that begin after and
end before another job, on the grounds that these are most likely secondary jobs that supplement
income and which the worker is not interested in taking on as a primary job. Indeed, in the vast
majority of such cases, workers identify these jobs as secondary in response to survey questions.
Before turning to results, a final issue we must contend with is that since the NLSY spans a
fixed time window, we cannot track employment cycles to their end. Formally, let tk denote the
length of the k-th cycle and Tk the time from when the k-th cycle begins until the end of the
sample. Given our data, we can estimate Pr (N = n | tk ≤ Tk ), not Pr (N = n). We will therefore

oversample short employment cycles with fewer jobs. To mitigate this concern, we only consider
cycles that began in the first five years of the sample. Since the first NLSY interview collected

retrospective data on jobs starting from 1977, this includes all cycles that started on or before 1981.
For large values of Tk , the degree of bias should be small. Indeed, among the 14,178 employment
cycles that started prior to the first week of 1982, only 1,837 are censored (13%). Only a quarter
of these are censored because they continue beyond the last year in the survey, 1993. The rest
are censored because of attrition or because the worker did not provide a reason for leaving his
job. Nearly half of all censored cycles (835 out of 1,837) are censored within 2 years of when they
start. Censoring is therefore unlikely to generate a large bias.
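The censoring shares quoted above can be reproduced from the three counts given in the text. The short sketch below only repeats that arithmetic; the variable names are ours and are purely illustrative.

```python
# Counts quoted in the text for cycles that started before the first week of 1982.
total_cycles = 14_178
censored = 1_837
censored_within_2_years = 835

# Cycles that are observed to their end.
complete = total_cycles - censored

share_censored = censored / total_cycles
share_censored_early = censored_within_2_years / censored

print(f"complete cycles: {complete}")
print(f"share of cycles censored: {share_censored:.1%}")
print(f"share of censored cycles cut off within 2 years: {share_censored_early:.1%}")
```

The number of complete cycles implied by these counts matches the sample size reported for Table 1.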
Table 1 reports the distribution of N across the 12,341 complete employment cycles in our data
that began prior to 1982. The first column partitions employment cycles by nonemployment spells.
A stark feature of the data is that a clear majority (61%) of all employment cycles end after one
job. The second column treats workers who quit and find a job within 8 weeks as continuing in
the same employment cycle. In this case, nearly 70% of all cycles end with only one job, and the
tail of the distribution looks nearly identical. Although not reported in Table 1, this feature is
pervasive: even when we break down the analysis into four education groups, we systematically
find that the majority of cycles end after only one job.
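As an illustration, the shares reported in Table 1 can be recomputed from the cycle counts. The sketch below copies the counts from the two columns of Table 1; the function name `empirical_pmf` is ours.

```python
# Cycle counts by number of jobs N = 1, ..., 12, copied from Table 1.
counts_def1 = [7555, 2853, 1125, 471, 172, 87, 38, 20, 13, 4, 0, 3]
counts_def2 = [2732, 748, 250, 126, 48, 25, 13, 7, 6, 0, 1, 2]


def empirical_pmf(counts):
    """Return the share of cycles with N = 1, 2, ... jobs."""
    total = sum(counts)
    return [c / total for c in counts]


pmf_def1 = empirical_pmf(counts_def1)
pmf_def2 = empirical_pmf(counts_def2)

for n, (p1, p2) in enumerate(zip(pmf_def1, pmf_def2), start=1):
    print(f"N = {n:2d}   Definition 1: {p1:.3f}   Definition 2: {p2:.3f}")
```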
Armed with this data, we proceed to estimate κ1 . We first consider what happens when we
assume all workers share the same reservation price. The top panel of Table 2 reports the maximum
likelihood estimates for κ1 under this assumption, which are in line with estimates reported in Barlevy
(2005). The parameter κ1 is tightly estimated around 1.9. The reason for this relatively low value
is the high incidence of employment cycles with only one job. Intuitively, when workers share the
same reservation price, symmetry implies that in cycles with exactly two oﬀers, half of the time
the first oﬀer will be below the second oﬀer. The fact that few cycles result in more than one job
implies there must be few cycles in which the worker makes it to a second oﬀer. Hence, either the
rate λ1 at which oﬀers arrive is low, or the rate δ at which cycles end is high and workers do not
have enough time to accumulate multiple oﬀers.8 By contrast, if κ1 were in the range of 5 to 10,
in line with estimates of κ1 based on duration data that we discuss below, the expected fraction
of cycles that end after one job would be much lower, around 24 − 36%.
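The range quoted above can be checked with a short calculation. The sketch below assumes that the common-reservation-price case corresponds to setting g0(u) = 1 in expression (7.3) of Appendix A, with κ1 = q/(1 − q). Under that assumption the integral has the closed form Pr(N = n) = [ln(1 + κ1)]^n / (n! κ1). This closed form is our own simplification and is not stated in the paper; the function name is illustrative.

```python
import math


def prob_n_jobs_common_reservation(n, kappa1):
    """Pr(N = n) when all workers share a reservation price.

    Assumes g0(u) = 1 in (7.3), which integrates to
    [ln(1 + kappa1)]**n / (n! * kappa1).
    """
    c = math.log1p(kappa1)
    return c ** n / (math.factorial(n) * kappa1)


for kappa1 in (1.963, 5.0, 10.0):
    p_one_job = prob_n_jobs_common_reservation(1, kappa1)
    print(f"kappa1 = {kappa1:6.3f}   Pr(N = 1) = {p_one_job:.3f}")
```

For κ1 equal to 5 and 10 the sketch gives a share of one-job cycles at the two ends of the 24 to 36% range mentioned in the text.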
Next, we introduce heterogeneity in reservation prices. To appreciate why this can aﬀect the
estimates, suppose a significant fraction of unemployed workers were highly selective, i.e. they
demanded a price near the top of the oﬀer distribution to work. Once these workers managed
to find a job within that range, they would be unlikely to improve upon that oﬀer, even if κ1
were high. In other words, allowing for the possibility of heterogeneous reservation prices makes
it possible for a large fraction of employment cycles to end after one job even when κ1 is high.
To integrate heterogeneity into our empirical analysis, we searched for a convenient parametric specification for G (u) to facilitate the estimation. Recall that the model requires G (u) be

nondecreasing. Experimenting with diﬀerent specifications confirmed that the best fit is indeed

upward sloping, and moreover is convex. This prompted us to use an exponential form exp (u/b)
where b is some constant. However, for this specification b plays a dual role: it determines both
the shape of the distribution of reservation prices in the support of the oﬀer distribution and the
fraction of workers whose reservation price is lower than w = F^{-1}(0). We therefore turned to a
two-parameter generalization, scaled so that it is a proper cdf:

\[
\mathcal{G}(u) = a u + (1 - a)\,\frac{\exp(u/b) - 1}{\exp(1/b) - 1} \tag{5.1}
\]

This generalization is natural, since when a → 1 it collapses to the special case where all workers
have a reservation price below w (and is thus equivalent to assuming all workers share a common

reservation w∗ ≤ w as we previously did). From the Appendix, we know we can express Pr (N = n)

as an integral involving G (u) and κ1 , and we choose the three parameters to maximize the likelihood

of the data given these expressions. Since the results proved sensitive to outliers, we eliminated
the extreme tail of the distribution and only included observations where N ≤ 7.
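To make the estimation step concrete, the following is a minimal sketch of how the likelihood could be evaluated. It is not the authors' code. It assumes that Pr(N = n) is given by expression (7.3) in Appendix A with g0(u) equal to the derivative of (5.1) and κ1 = q/(1 − q), that the integral is computed by Simpson's rule, and that the likelihood conditions on N ≤ 7. The last point is our reading of the truncation, since the text does not say how the truncated sample enters the likelihood. All function names are hypothetical.

```python
import math


def g_cdf(u, a, b):
    """Equation (5.1): mixture of a uniform and a scaled exponential cdf on [0, 1]."""
    return a * u + (1 - a) * math.expm1(u / b) / math.expm1(1 / b)


def g_density(u, a, b):
    """Derivative of (5.1) with respect to u."""
    return a + (1 - a) * math.exp(u / b) / (b * math.expm1(1 / b))


def prob_n_jobs(n, kappa1, a, b, steps=4000):
    """Pr(N = n) from (7.3), integrated numerically with composite Simpson's rule."""
    q = kappa1 / (1 + kappa1)  # inverts kappa1 = q / (1 - q)

    def integrand(u):
        ratio = (1 - q * u) / (1 - q)
        return math.log(ratio) ** (n - 1) / ratio * g_density(u, a, b)

    h = 1.0 / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / math.factorial(n - 1)


def log_likelihood(counts, kappa1, a, b):
    """Log-likelihood of cycle counts for N = 1..len(counts).

    Conditions on N <= len(counts); this conditioning is an assumption of the sketch.
    """
    probs = [prob_n_jobs(n, kappa1, a, b) for n in range(1, len(counts) + 1)]
    norm = sum(probs)
    return sum(c * math.log(p / norm) for c, p in zip(counts, probs) if c > 0)


# Table 1, Definition 1, restricted to N <= 7.
counts = [7555, 2853, 1125, 471, 172, 87, 38]

ll_het = log_likelihood(counts, kappa1=10.768, a=0.191, b=0.051)  # Table 2, bottom panel
ll_hom = log_likelihood(counts, kappa1=1.963, a=1.0, b=0.051)     # a = 1: b plays no role

print(f"log-likelihood with heterogeneity:    {ll_het:.1f}")
print(f"log-likelihood without heterogeneity: {ll_hom:.1f}")
print(f"implied slope at u = 1: {g_density(1.0, 0.191, 0.051):.2f}")
```

In practice one would pass `log_likelihood` to a numerical optimizer over (κ1, a, b). The sketch only evaluates it at the point estimates reported in Table 2.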
The bottom panel of Table 2 reports our estimates for κ1 , a, and b. Our estimates for κ1 by

education group range between 8.4 and 16.4. Our estimate for the population as a whole is 10.8.
Moreover, the point estimate for the population as a whole is quite tight, and we can safely reject
the lower estimates for κ1 we obtain when we assume workers share the same reservation price.
8 More precisely, the implication is that the arrival rate of “viable” offers (that is, offers that a worker might
accept) is low. This rate is equal to λ1 (1 − F (w∗ )), where w∗ denotes the common reservation price. Since we
assumed w = F^{-1}(0) ≥ w∗ , this is equal to λ1 . But this distinction hints at why heterogeneous reservation prices
can help: even if the arrival rate of viable offers is low for most workers, implying they won’t make it to a second
job, the rate at which they receive offers that are deemed viable to any worker in the economy could still be high.

Why does heterogeneity in reservation prices lead to higher estimates for κ1 ? Figure 1 sheds
some insight on this. The figure plots the log of Pr (N = n) against n. The solid black line
represents the data. The dashed line represents the fitted values when we assume all workers share
a common reservation price (i.e. when we constrain a = 1). Matching the large fraction of cycles
with only one job requires a low value for κ1 . However, as evident from the figure, a low value
of κ1 implies that the distribution for N has a thin tail, as reflected in the steep negative slope
of the dashed line. This is intuitive: if oﬀers arrive infrequently, few workers will make it to a
second job; of those that do, few will make it to a third job; and so on. Hence, Pr (N = n) should
decline sharply with n. But the empirical distribution has a much fatter tail: a non-negligible
fraction make it to a third, fourth, and fifth job. This requires a high value κ1 , since workers need
to receive enough oﬀers to have enough opportunities for this much upward mobility. When we
introduce heterogeneity in reservation prices, the estimation assigns a high value for κ1 to match
the fat tail of the distribution, and assigns a and b to match the large fraction of cycles with only
one job. In particular, it interprets the data to imply that a large fraction of workers are so
selective that they will be unlikely to receive a more favorable offer than the first offer they are
willing to take, even when the rate at which offers arrive is quite high.
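The thin-tail argument can be illustrated numerically. The sketch below compares the log of the empirical shares from Table 1 (Definition 1) with the log of Pr(N = n) under a common reservation price, evaluated at the top-panel estimate of κ1. It assumes the closed form [ln(1 + κ1)]^n / (n! κ1), which is our simplification of (7.3) with g0(u) = 1 and is not taken from the paper.

```python
import math

counts = [7555, 2853, 1125, 471, 172, 87, 38]  # Table 1, Definition 1, N = 1..7
total = 12341                                   # all complete cycles in Table 1
kappa1 = 1.963                                  # Table 2, top panel, all workers


def prob_common(n, kappa1):
    """Pr(N = n) with a common reservation price (assumes g0(u) = 1 in (7.3))."""
    c = math.log1p(kappa1)
    return c ** n / (math.factorial(n) * kappa1)


log_data = [math.log(c / total) for c in counts]
log_model = [math.log(prob_common(n, kappa1)) for n in range(1, len(counts) + 1)]

for n, (d, m) in enumerate(zip(log_data, log_model), start=1):
    print(f"n = {n}   ln Pr data: {d:6.2f}   ln Pr model: {m:6.2f}")
```

The model line falls away from the data as n grows, which is the steep negative slope described in the text.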
Interestingly, the estimates we obtain for κ1 when we allow for heterogeneity are compatible with
the estimates previous authors have found based on duration data. For example, Flinn (2002),
using the same NLSY dataset we use, estimates κ1 between 3.3 and 7.8 across education groups,
and for the group as a whole at 4.6 (see Table 4, p. 633). Using data from the Netherlands, van den
Berg and Ridder (1996) estimate κ1 between 6.8 and 12.3 across age groups, and for the group as a
whole at 9.4 (see Table VII, p. 1208). Although these estimates are consistent with what we find,
they are based on independent evidence. Our results are driven by the fact that a fair number
of workers are observed in a large succession of jobs without a nonemployment spell, implying
that they must have received oﬀers at a fairly high rate to accumulate enough oﬀers to move this
many times. By contrast, estimates based on duration data are driven by the fact that higher
wage jobs have significantly longer duration. Recall that the hazard rate for a job that pays a
price w corresponds to δ + λ1 (1 − F (w)). The extent to which the average duration varies with
the price paid on the job depends on how large λ1 is relative to δ. Even though previous authors

have imposed diﬀerent assumptions on F (·) and diﬀer in how they account for unobserved worker
productivity, the fact that higher wage jobs last longer leads all to estimate a high value for κ1 .
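The duration argument can be made concrete with a small sketch. With a constant hazard δ + λ1 (1 − F (w)), the expected duration of a job is the reciprocal of the hazard. Writing u = F (w) and κ1 = λ1 /δ, this is 1/(δ (1 + κ1 (1 − u))). The function name and the normalization δ = 1 are ours.

```python
def expected_duration(u, kappa1, delta=1.0):
    """Expected job duration when the hazard is delta * (1 + kappa1 * (1 - u)).

    u is the rank F(w) of the job's price in the offer distribution.
    """
    return 1.0 / (delta * (1.0 + kappa1 * (1.0 - u)))


for kappa1 in (1.963, 10.768):
    bottom = expected_duration(0.0, kappa1)
    top = expected_duration(1.0, kappa1)
    print(f"kappa1 = {kappa1:6.3f}   duration at top / duration at bottom = {top / bottom:.2f}")
```

The ratio of expected durations between the best and the worst job equals 1 + κ1, so a steep duration profile in the data points to a high κ1.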
While allowing for heterogeneous reservation prices leads to estimates of the arrival rate for employed workers that are consistent with what previous work has found, the nature of heterogeneity
in reservation prices we estimate leads to different conclusions regarding the rate at which unemployed
workers encounter offers. In particular, to accord with the large fraction of employment
cycles that end after only one job, our estimation requires that the distribution of reservation
prices be highly skewed towards the upper support of the oﬀer distribution, as implied by the low
value we estimate for b in Table 2. Intuitively, if employed workers receive many oﬀers, as implied
by a high κ1 , it must be that a large fraction of workers begin their first job near the top of their
potential earnings distribution. But if a large fraction of workers are indeed this choosy, the rate
at which workers encounter oﬀers while unemployed must be fairly high to accord with the low
duration and incidence of unemployment we see in the data.9
The reason our estimates diﬀer from those in previous work is that we identify the distribution
of reservation prices from mobility data, whereas previous work has either abstracted from it or
parameterized it in a particular way, e.g. Bontemps, Robin, and van den Berg (1999). Our estimate
reveals a far more skewed distribution, and hence a far higher estimate for κ0 = λ0 /δ, the relative
rate at which workers encounter oﬀers while unemployed. To see this, substitute (4.4) into (4.7)
to obtain the following expression for unemployment:
\[
u = \left[ 1 + \kappa_0 / \mathcal{G}'(1) \right]^{-1}
\]

Rearranging, we get the following expression for κ0 :
\[
\kappa_0 = \mathcal{G}'(1) \left( \frac{1}{u} - 1 \right)
\]

$\mathcal{G}'(1)$ measures how skewed the distribution is towards the upper support of the offer distribution.

If all workers share a common reservation price, $\mathcal{G}'(1) = 1$. By contrast, our estimate for $\mathcal{G}'(1)$,

reported in the final column in Table 2, is much larger: κ0 is almost 16 times as large as when
we assume all workers share the same reservation price. If we set u to 6%, our estimate for
λ0 /λ1 = κ0 /κ1 is 23, i.e. the rate at which oﬀers arrive while workers are unemployed is over
twenty times as large as when they are employed. By contrast, since previous work has either
abstracted from heterogeneity in reservation prices or considered far less skewed distributions
(where the implied $\mathcal{G}'(1)$ is much smaller), it has typically estimated λ0 /λ1 at no more than 2.
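The figure of 23 follows from the expression for κ0 together with the point estimates in the bottom panel of Table 2. The sketch below repeats that calculation; the function name is illustrative.

```python
def kappa0_from_unemployment(u_rate, slope_at_top):
    """kappa0 = G'(1) * (1/u - 1), the rearranged unemployment expression."""
    return slope_at_top * (1.0 / u_rate - 1.0)


u_rate = 0.06          # unemployment rate used in the text
slope_at_top = 15.926  # implied G'(1), Table 2, bottom panel, all workers
kappa1 = 10.768        # Table 2, bottom panel, all workers

kappa0 = kappa0_from_unemployment(u_rate, slope_at_top)
ratio = kappa0 / kappa1

print(f"kappa0 = {kappa0:.1f}")
print(f"lambda0 / lambda1 = kappa0 / kappa1 = {ratio:.1f}")
```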
Why does our approach imply that workers receive offers at a much lower rate when they are
employed than when they are unemployed? This interpretation is needed to explain why in the
data most workers move fairly quickly from unemployment to employment (as evidenced by the low
incidence of unemployment) but do not often move from their first job out of unemployment into
a second job (as evidenced by the high incidence of cycles that end with only one job). Within our
framework, the only possible explanation is that mobility slows down dramatically once workers
are employed. There may be some truth to this, but the magnitude seems rather large to be
plausible. More likely, our framework abstracts from some important consideration, and it is not
obvious how this feature would affect our inference on the degree of labor market mobility.10

9 Note that one could alternatively interpret these findings as saying that a large proportion of employed workers
receive offers at a lower rate than their peers, rather than that a large proportion of workers are choosier than their
peers. If the ratio λ1 /λ0 were constant across workers, this too would require that many unemployed workers receive
offers at an extremely high rate to accord with the low incidence and duration of unemployment.
There are various features one could introduce into this search model that could explain these
facts without implying a dramatic decline in the arrival rate once a worker becomes employed.
One example is moving costs: even if employed workers receive oﬀers at a high rate, they may
not always move to a higher oﬀer. The problem with this explanation is that it also implies a
thin-tailed distribution for N , although this can be overcome if only some workers have a distaste
for moving rather than all. Another possibility is to endogenize search eﬀort, so that workers
who earn a wage closer to the top of the distribution search less intensively. In this case, the
number of offers M would be correlated with the realizations $\{y_m\}_{m=1}^{M}$, resulting in more spells
with only one job (those where the initial draw was high). This modification poses problems for
recovering arrival rates using duration data, since high wage jobs would tend to last longer even if
the “true” arrival rate λ1 were low, simply because workers on high wage jobs search less. Thus,
the fact that high wage jobs have longer durations does not seem to robustly imply κ1 is high. By
contrast, if oﬀers are independent, the fat tail in the distribution of N necessarily implies a high
κ1 : records are suﬃciently rare among i.i.d. draws that the number of oﬀers workers receive over
an employment cycle must be large, and hence so must κ1 .
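The rarity of records among i.i.d. draws can be quantified with a standard result: among M independent draws from a continuous distribution, the expected number of records is the harmonic number 1 + 1/2 + ... + 1/M. The sketch below uses this fact to show how many draws are needed to expect a given number of records. It ignores the reservation price and the randomness of M, so it is only meant to convey orders of magnitude.

```python
def expected_records(m):
    """Expected number of records among m i.i.d. continuous draws (harmonic number)."""
    return sum(1.0 / k for k in range(1, m + 1))


def draws_needed(target):
    """Smallest number of draws whose expected record count reaches the target."""
    m, total = 0, 0.0
    while total < target:
        m += 1
        total += 1.0 / m
    return m


for target in (2, 3, 4, 5):
    print(f"expected records >= {target}: need {draws_needed(target)} draws")
```

Because the expected number of records grows only logarithmically in the number of draws, a non-negligible share of cycles with five or more jobs requires many offers per cycle.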

6. Heterogeneity and Identification of the Oﬀer Distribution
In this section, we briefly note some of the implications of heterogeneity in reservation prices for
estimating the offer distribution F (·). Barlevy (2005) argued that in the NLSY data, average
log wage gains appear to be roughly constant regardless of how many jobs the worker previously
changed. In the absence of heterogeneity in reservation prices, this pattern uniquely characterizes
the exponential distribution, implying that the offer distribution is Pareto (the antilog of the
exponential). By contrast, if the offer distribution were lognormal, as is assumed at times, average
wage gains would decline with n, a feature that is due to the log concave shape of the normal
distribution. However, Barlevy (2005) found that the rate at which average wage gains decline
with n for a lognormal distribution is not sharp enough for the lognormal to be rejected statistically.

10 However, there is some evidence to support the notion of heterogeneity in reservation wages. In particular, such
heterogeneity would imply a positive relationship between the duration of an unemployment spell and the duration
of the first job out of unemployment, since more selective workers will be unemployed for longer on average but are
then less likely to voluntarily leave for another job. Preliminary work we carried out revealed such a pattern in the
NLSY data, although a serious treatment of this analysis is beyond the scope of this paper.
How does our interpretation of this finding change once we allow for heterogeneous reservation
prices? On the one hand, to the extent that log wage growth does not vary with the number of
times the worker has already changed jobs, we would still conclude that the oﬀer distribution is
Pareto. This follows directly from Proposition 3. However, distinguishing the Pareto distribution
and the lognormal distribution becomes more diﬃcult, as can be seen graphically in Figure 2.
The dark line in the figure shows the implied average log wage gains under a Pareto distribution.
Regardless of whether there is heterogeneity, average wage gains will be constant and assume the
same value in both cases. The remaining two lines trace out $E\left(y_{L(n)} - y_{L(n-1)} \mid N \ge n\right)$ against n for
a lognormal distribution that is normalized so that average wage growth across all job changers
is consistent with what Barlevy (2005) estimates from the data. The dashed line is computed
for the estimate of κ1 from the top panel of Table 2 when we assume no heterogeneity, while the

gray line is computed using our estimates for G (u) in the bottom panel of Table 2. According to

these estimates, the average log wage gains decline even less rapidly with n, making it harder to
distinguish these two particular shape restrictions.
To understand why heterogeneity in reservation prices results in a flatter profile of average wage
gains over an employment cycle, note that when we observe a worker move multiple times, we can
infer he is probably not very choosy, or else he would have started with a high wage that he would
be unlikely to improve upon. But less choosy workers tend to earn lower wages on average. Since
under the lognormal distribution, log wage gains are higher on average for workers in lower wage
jobs, this will increase the average wage gains we would observe for workers who have already
moved several times. Hence, average wage gains will not fall as much with n as when there is no
heterogeneity in reservation prices. In fact, for even more skewed functions G (u), wage gains for
workers with some mobility can increase enough to result in a profile that is not monotonic in n.
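The property used in this argument can be checked directly: when log wages are normal, the expected gain from a better offer is larger for workers who start from a lower wage, whereas when log wages are exponential it does not depend on the starting wage. The sketch below computes the expected gain E(X − x | X > x) for a single improving draw. It uses a standard normal and a unit-mean exponential as illustrative parameter choices, and it does not reproduce the full record-gap expectation plotted in Figure 2.

```python
import math


def mean_gain_exponential(x, mean=1.0):
    """E(X - x | X > x) for an exponential: constant, by the memoryless property."""
    return mean


def mean_gain_normal(x):
    """E(X - x | X > x) for a standard normal, using the upper-tail probability."""
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(x / math.sqrt(2.0))
    return pdf / upper_tail - x


for x in (0.0, 1.0, 2.0):
    print(f"current log wage x = {x:.1f}   "
          f"normal: {mean_gain_normal(x):.3f}   "
          f"exponential: {mean_gain_exponential(x):.3f}")
```

Since workers observed after several moves are disproportionately those who started low, their expected gains under the normal case are pulled up, which flattens the profile.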

According to Proposition 3, for any G (u), we can in principle identify the shape of the offer
distribution F (·). But since differences are less pronounced in the face of heterogeneity in reservation
prices, we might need a much larger dataset to estimate average wage growth precisely
enough to rule out certain functional forms. Since one of the goals of this paper is to enable
identification in large datasets where employment histories are left-censored, though, it might be
possible to satisfy these data requirements using other datasets.


7. Conclusion
Standard approaches to estimating search models, as summarized in the recent survey article of
Eckstein and van den Berg (2005), either abstract from unmeasured variation in wages or impose
parametric assumptions to deal with them. This paper exploits the implicit record structure of simple search models and shows that these models remain nonparametrically identified when worker
productivity is measured imperfectly, even in the presence of initial condition problems. These
problems may arise because of censoring problems, or because workers set diﬀerent reservation
wages. Establishing this result required us to derive new results on records from observations that
are independent but not identically distributed, a model which has not been analyzed so far in the
statistics literature. Determining identification in even more realistic search frameworks is likely
to require a more rigorous analysis of records drawn from an underlying sequence of observations
that fails to satisfy the classical i.i.d. assumptions. The particular i.n.i.d. structure analyzed here,
and the tools we use to analyze it, hopefully represent a first step towards this goal.
In addition, this paper documented two new empirical findings: (1) the majority of employment
cycles end after only one job; and (2) the distribution of the number of jobs per employment
cycle has a fat tail. Viewed from the perspective of the standard search model, the first finding
suggests a sluggish labor market, at least among employed searchers, since we would expect to
see a fair number of workers moving on to at least a second job (namely those who drew wages
slightly above their reservation price). By contrast, the second finding suggests a fluid labor
market, since it implies some workers do manage to accumulate a lot of job oﬀers. When we
consider a variation of the standard search model where workers diﬀer in their reservation values,
our estimation gives more weight to the second observation. Thus, we infer that the rate at which
employed workers receive oﬀers is high, in line with estimates that use job duration data. But our
approach also implies a much higher oﬀer arrival rate for unemployed workers than previous work,
perhaps implausibly so. Although we offered some directions for modifying the model, properly
interpreting these facts remains a task for future work.


Table 1: Distribution of N across employment cycles
Cycles that began before 1982 and end with at least one year of potential experience

             Definition 1                 Definition 2
  N     # of cycles   Pr(N = n)      # of cycles   Pr(N = n)
  1        7555         0.612           2732         0.690
  2        2853         0.231            748         0.189
  3        1125         0.091            250         0.063
  4         471         0.038            126         0.032
  5         172         0.014             48         0.012
  6          87         0.007             25         0.006
  7          38         0.003             13         0.003
  8          20         0.002              7         0.002
  9          13         0.001              6         0.002
 10           4         0.000              0         0.000
 11           0         0.000              1         0.000
 12           3         0.000              2         0.001

Definition 1 - cycles are partitioned according to nonemployment spells
Definition 2 - cycles are partitioned according to quit/layoff and time to next job

Table 2: Estimates for κ1 and G(F^{-1}(·))

                  Sample size     κ1          a           b         Implied G′(1)

No heterogeneity in reservation wages
All                 12,341       1.963
                                (0.028)
Educ < 12            3,054       1.911
                                (0.056)
Educ = 12            3,715       1.978
                                (0.052)
Educ ∈ (13,15)       3,273       2.188
                                (0.060)
Educ > 16            2,299       1.707
                                (0.058)

Heterogeneity in reservation wages
All                 12,341      10.768       0.191       0.051        15.926
                                (1.7766)    (0.0350)    (0.0048)      (1.968)
Educ < 12            3,054      10.082       0.1875      0.0554       14.855
                                (3.442)     (0.0752)    (0.0107)      (3.803)
Educ = 12            3,715       9.489       0.224       0.054        14.647
                                (2.6549)    (0.0698)    (0.0079)      (2.942)
Educ ∈ (13,15)       3,273      16.385       0.145       0.042        20.324
                                (6.1542)    (0.0536)    (0.0109)      (6.242)
Educ > 16            2,299       8.442       0.209       0.053        15.083
                                (3.2509)    (0.0937)    (0.0103)      (4.042)

Figure 1: ln[Pr(N = n)] plotted against n, for n = 1 to 7. Series shown: original data, heterogeneity, no heterogeneity.

Figure 2: E(Rn+1 − Rn | N > n) plotted against n, for n = 1 to 5. Series shown: exponential parent; normal parent, heterogeneity; normal parent, no heterogeneity.

Appendix A: Proofs
Proof of Proposition 1: We begin with some necessary preliminaries. Since $G(F^{-1}(u))$ is assumed to
be absolutely continuous under Assumption 2.1, there exists a density function $\frac{d}{du} G(F^{-1}(u))$ for almost
all $u \in (0, 1)$. Without loss of generality, we proceed as if there also exist density functions $g(x) = G'(x)$
and $f(x) = F'(x)$, even though their existence is not implied by Assumption 2.1. If either of these
functions does not exist, we can always define a new set of variables $y'_m = F(y_m)$ so that $y'_1$ has a uniform
distribution (which is absolutely continuous) and $y'_m$ for $m \ge 2$ has an absolutely continuous distribution
under Assumption 2.1. Since $F(\cdot)$ is monotonic, we can compute the likelihood of record events in the
original system using record events in the analogous system where all variables have absolutely continuous
distributions. The only consequence of proceeding this way is a slight abuse of notation; below, we will often
write $\frac{g(F^{-1}(u))}{f(F^{-1}(u))}$ when we should write $\frac{d}{du} G(F^{-1}(u))$, since the former representation may not exist. In
the text, we are careful to refer to $\frac{d}{du} G(F^{-1}(u))$ as $\mathcal{G}'(u)$.
Define q = 1 − p. From Bunge and Nagaraja (1991), we know that for n ≥ 2, the likelihood of at least n
records with values r1 through rn is given by
\[
h(r_1, \ldots, r_n \cap N \ge n) = f(r_n)\, \frac{q\, g(r_1)}{1 - qF(r_1)} \prod_{i=2}^{n-1} \frac{q f(r_i)}{1 - qF(r_i)}
\]

Integrating r2 through rn−1 yields the following expression for the joint likelihood of r1 and rn :
\[
h(r_1, r_n \cap N \ge n) = \frac{1}{(n-2)!} \left[ -\ln\left( \frac{1 - qF(r_{n-1})}{1 - qF(r_1)} \right) \right]^{n-2} \frac{q\, g(r_1)}{1 - qF(r_1)}\, f(r_n)
\]
and so
\[
\begin{aligned}
E(|w_n| \mid N \ge n)
&= \int_{-\infty}^{\infty} \int_{r_1}^{\infty} |r_n|\,
   \frac{\left[ -\ln\left( \frac{1 - qF(r_{n-1})}{1 - qF(r_1)} \right) \right]^{n-2}}{(n-2)!\, \Pr(N \ge n)}\,
   \frac{q\, g(r_1)}{1 - qF(r_1)}\, f(r_n)\, dr_n\, dr_1 \\
&= \int_0^1 \int_0^{u_n} \left| F^{-1}(u_n) \right|\,
   \frac{\left[ -\ln\left( \frac{1 - q u_{n-1}}{1 - q u_1} \right) \right]^{n-2}}{(n-2)!\, \Pr(N \ge n)}\,
   \frac{q}{1 - q u_1}\, \frac{g(F^{-1}(u_1))}{f(F^{-1}(u_1))}\, du_1\, du_n
\end{aligned}
\]

For any pair $(u_1, u_{n-1}) \in [0, 1] \times [0, 1]$, it is always the case that
\[
-\ln\left( \frac{1 - q u_{n-1}}{1 - q u_1} \right) \le -\ln(1 - q)
\quad \text{and} \quad
\frac{q}{1 - q u_1} \le \frac{q}{1 - q}
\]
and so
\[
\begin{aligned}
E(|w_n| \mid N \ge n)
&\le \frac{[-\ln(1 - q)]^{n-2}}{(n-2)!\, P(N \ge n)}\, \frac{q}{1 - q}
     \int_0^1 \int_0^{u_n} \left| F^{-1}(u_n) \right| \frac{g(F^{-1}(u_1))}{f(F^{-1}(u_1))}\, du_1\, du_n \\
&= \frac{[-\ln(1 - q)]^{n-2}}{(n-2)!\, P(N \ge n)}\, \frac{q}{1 - q}
     \int_0^1 \left| F^{-1}(u_n) \right| \left[ \int_0^{u_n} \frac{g(F^{-1}(u_1))}{f(F^{-1}(u_1))}\, du_1 \right] du_n \\
&\le \frac{[-\ln(1 - q)]^{n-2}}{(n-2)!\, P(N \ge n)}\, \frac{q}{1 - q}
     \int_0^1 \left| F^{-1}(u_n) \right| du_n < \infty
\end{aligned}
\tag{7.1}
\]

where the last inequality follows from the assumption that E (|y2 |) < ∞. Hence, E (|wn | | N ≥ n) is
well-defined. But for any random variable Y , if E (|Y |) exists then so does E (Y ). By a similar argument,
E (wn−1 | N ≥ n) can also be shown to exist, and hence so does E (wn − wn−1 | N ≥ n). ∎
Proof of Proposition 2: We begin by proving part (a). Let q = 1 − p. Using Nagaraja and Barlevy
(2003), we can deduce that the likelihood of exactly n observed records with values r1 through rn is given
by
\[
h(r_1, \ldots, r_n \cap N = n) = \frac{(1 - q)\, g(r_1)}{1 - qF(r_1)} \prod_{i=2}^{n} \frac{q f(r_i)}{1 - qF(r_i)}.
\]
Integrating out r2 through rn yields
\[
h(r_1 \cap N = n) = \frac{1}{(n-1)!}\, \frac{1 - q}{1 - qF(r_1)} \left( \ln\left( \frac{1 - qF(r_1)}{1 - q} \right) \right)^{n-1} g(r_1)
\]
Hence, Pr (N = n) is equal to
\[
\Pr(N = n) = \frac{1}{(n-1)!} \int_{-\infty}^{\infty} \frac{1 - q}{1 - qF(r_1)} \left[ \ln\left( \frac{1 - qF(r_1)}{1 - q} \right) \right]^{n-1} g(r_1)\, dr_1 \tag{7.2}
\]

Using the change of variables u = F (r1 ), we can rewrite (7.2) as
\[
\Pr(N = n) = \frac{1}{(n-1)!} \int_0^1 \frac{1 - q}{1 - qu} \left[ \ln\left( \frac{1 - qu}{1 - q} \right) \right]^{n-1} \frac{g(F^{-1}(u))}{f(F^{-1}(u))}\, du \tag{7.3}
\]
Recall that $\frac{g(F^{-1}(u))}{f(F^{-1}(u))}$ represents $\frac{d}{du} G(F^{-1}(u))$. Under the additional assumption that $G(x) = G_0(F(x))$,
we have $G(F^{-1}(u)) = G_0(F(F^{-1}(u))) = G_0(u)$. Hence, for any distribution $F(\cdot)$, $\frac{d}{du} G(F^{-1}(u)) =
G_0'(u) \equiv g_0(u)$, i.e. the function $\frac{g(F^{-1}(u))}{f(F^{-1}(u))}$ in (7.3) is independent of the function $F(\cdot)$. To consider
the most general case, we allow $g_0(u)$ to depend on the parameter q and we will write out $g_0(u, q)$.
Equation (3.2) in the text offers one example of a function $g_0(\cdot)$ that depends on q, specifically
\[
g_0(u) = \frac{d}{du} \left[ u \left( 1 + \kappa_1 (1 - u) \right)^{-1} \right] \quad \text{where} \quad \kappa_1 = \frac{q}{1 - q}.
\]
Suppose there were two values q1 and q2 which gave rise to the same Pr (N = n) for all n, i.e.
\[
\int_0^1 \frac{1 - q_1}{1 - q_1 u} \left[ \ln\left( \frac{1 - q_1 u}{1 - q_1} \right) \right]^{n-1} g_0(u, q_1)\, du
= \int_0^1 \frac{1 - q_2}{1 - q_2 u} \left[ \ln\left( \frac{1 - q_2 u}{1 - q_2} \right) \right]^{n-1} g_0(u, q_2)\, du
\]
for n = 1, 2, 3, ... We will show by contradiction that q1 = q2 , which in turn implies that Pr (N = n) uniquely
determines p = 1 − q.
Set t = ln ((1 − q1 u)/(1 − q1 )) on the left-hand side and t = ln ((1 − q2 u)/(1 − q2 )) on the right-hand side
to rewrite the above equations as
\[
\int_0^{-\ln(1 - q_1)} h_1(t)\, t^{n-1}\, dt = \int_0^{-\ln(1 - q_2)} \frac{q_1}{q_2}\, \frac{1 - q_2}{1 - q_1}\, h_2(t)\, t^{n-1}\, dt
\]
for n = 1, 2, 3, ... Suppose q1 ≠ q2 , and assume wlog that q2 > q1 , so − ln (1 − q2 ) > − ln (1 − q1 ). Define
\[
\hat{h}_1(t) =
\begin{cases}
h_1(t) & \text{if } t \le -\ln(1 - q_1) \\
0 & \text{if } t > -\ln(1 - q_1)
\end{cases}
\]
We can rewrite the above equation as
\[
\int_0^{-\ln(1 - q_1)} h_1(t)\, t^{n-1}\, dt = \int_0^{-\ln(1 - q_2)} \hat{h}_1(t)\, t^{n-1}\, dt
= \int_0^{-\ln(1 - q_2)} \frac{q_1}{q_2}\, \frac{1 - q_2}{1 - q_1}\, h_2(t)\, t^{n-1}\, dt
\]
By the Müntz-Szász theorem (see Kamps (1998)), we can conclude that for almost all t ∈ (0, − ln (1 − q2 ))
\[
\hat{h}_1(t) = \frac{q_1}{q_2}\, \frac{1 - q_2}{1 - q_1}\, h_2(t)
\]
Hence, $h_2(t) = 0$ for almost all t ∈ (− ln (1 − q1 ) , − ln (1 − q2 )). But this implies $g_0(u, q_2) = 0$ for almost
all $u \in \left[ F_2^{-1}(0),\, F_2^{-1}\left( (q_2 - q_1)/[q_2 (1 - q_1)] \right) \right]$, which violates Assumption 2.3. It follows that q1 and q2
must be equal.
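The change of variables used in part (a) can be verified numerically. The sketch below takes an arbitrary illustrative density g0(u) = 2u, which is our choice and not from the paper, and checks that the integral in u equals (1 − q)/q times the corresponding moment integral in t, where u = (1 − (1 − q) e^t )/q inverts t = ln ((1 − qu)/(1 − q)). Simpson's rule stands in for exact integration.

```python
import math


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def integral_in_u(n, q, g0):
    """Integral over u in (7.3), without the 1/(n-1)! factor."""
    def integrand(u):
        ratio = (1 - q * u) / (1 - q)
        return math.log(ratio) ** (n - 1) / ratio * g0(u)
    return simpson(integrand, 0.0, 1.0)


def integral_in_t(n, q, g0):
    """Same quantity after substituting t = ln((1 - q u) / (1 - q))."""
    c = -math.log(1 - q)

    def integrand(t):
        u = (1 - (1 - q) * math.exp(t)) / q  # inverse of the substitution
        return g0(u) * t ** (n - 1)

    return (1 - q) / q * simpson(integrand, 0.0, c)


def g0_example(u):
    """Illustrative density on [0, 1]; an assumption of this sketch."""
    return 2.0 * u


for n in range(1, 5):
    lhs = integral_in_u(n, 0.6, g0_example)
    rhs = integral_in_t(n, 0.6, g0_example)
    print(f"n = {n}   u-integral: {lhs:.8f}   t-integral: {rhs:.8f}")
```

The agreement for every n illustrates why the sequence Pr (N = n) amounts to a sequence of power moments on the interval (0, − ln (1 − q)), which is the form the Müntz-Szász theorem requires.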
We next establish (b). Using the likelihood function (7.1), changing variables and integrating out r2
through rn−2 , we can express the average record gap E (wn − wn−1 | N ≥ n) for any n ≥ 3 as the following
integral:
\[
\int_0^1 \int_0^{u_{n-1}} \int_{u_{n-1}}^1 \left[ F^{-1}(u_n) - F^{-1}(u_{n-1}) \right]
\frac{q^2 \left[ -\ln\left( \frac{1 - q u_{n-1}}{1 - q u_1} \right) \right]^{n-3} g_0(u_1)}
     {(n-3)!\, \Pr(N \ge n)\, (1 - q u_1)(1 - q u_{n-1})}\,
du_n\, du_1\, du_{n-1}
\]
For a given value of q, the function $g_0(u, q)$ is known, as are the values of q and Pr (N ≥ n). Define
\[
\phi_F(u_{n-1}) = \int_{u_{n-1}}^1 \left[ F^{-1}(u_n) - F^{-1}(u_{n-1}) \right] du_n
\]
and introduce the change of variables
\[
t = -\ln(1 - q u_{n-1}), \qquad s = -\ln(1 - q u_1), \qquad c = -\ln(1 - q)
\]
so the average record gap is given by
\[
E(w_n - w_{n-1} \mid N \ge n) = \frac{1}{(n-3)!\, \Pr(N \ge n)}
\int_0^c \int_{t=s}^c g_0\left( \frac{1 - e^{-s}}{q} \right) \phi_F\left( \frac{1 - e^{-t}}{q} \right) (t - s)^{n-3}\, dt\, ds
\]
Finally, we set ω = t − s and define
\[
\eta_F(\omega) = \int_{t=\omega}^c g_0\left( \frac{1 - e^{-(t-\omega)}}{q} \right) \phi_F\left( \frac{1 - e^{-t}}{q} \right) dt
\]
to get
\[
E(w_n - w_{n-1} \mid N \ge n) = \frac{1}{(n-3)!\, P(N \ge n)} \int_{\omega=0}^c \eta_F(\omega)\, \omega^{n-3}\, d\omega
\]

Now, suppose there exist two functions F1 and F2 that give rise to the same sequence of expected record
gaps E (wn − wn−1 | N ≥ n). Then for n = 3, 4, 5, ... it must be the case that
\[
\int_{\omega=0}^c \eta_{F_1}(\omega)\, \omega^{n-3}\, d\omega = \int_{\omega=0}^c \eta_{F_2}(\omega)\, \omega^{n-3}\, d\omega
\]
The Müntz-Szász theorem then implies that for almost all ω ∈ (0, c),
\[
\eta_{F_1}(\omega) = \eta_{F_2}(\omega)
\]
If we define $\phi(t) \equiv \phi_{F_2}\left( \frac{1 - e^{-t}}{q} \right) - \phi_{F_1}\left( \frac{1 - e^{-t}}{q} \right)$, then this implies that for almost all ω ∈ (0, c),
\[
\int_{t=\omega}^c g_0\left( \frac{1 - e^{-(t-\omega)}}{q} \right) \phi(t)\, dt = 0 \tag{7.4}
\]
We next argue that (7.4) requires φ (t) = 0 almost surely. We first appeal to yet another change in variables,
w = c − t and z = c − ω, to rewrite (7.4) as
\[
\int_0^z a(z - w)\, b(w)\, dw = 0 \quad \text{for almost all } z \in (0, c) \tag{7.5}
\]
where $a(x) = g_0\left( \frac{1 - e^{-x}}{q} \right)$ and $b(x) = \phi(c - x)$. Applying Theorem VII in Titchmarsh (1926), which is
identical to Theorem 151 from the more accessible Titchmarsh (1948, p. 324-5), there exists a c∗ such that
a (x) = 0 for all x ∈ (0, c∗ ) and b (x) = 0 for all x ∈ (0, c − c∗ ). But by Assumption 2.3, there exists an ε > 0
such that $g_0(u) > 0$ for all u ∈ (0, ε), and so there exists an ε′ such that a (x) > 0 for all x ∈ (0, ε′). It follows
that c∗ must equal 0, and hence b (z) = 0 for almost all z ∈ (0, c), which in turn implies φ (t) = b (c − t) = 0
for almost all t ∈ (0, c).
Thus far, we have shown that for any two distributions F1 and F2 that give rise to the same sequence
of expected record gaps E (wn − wn−1 | N ≥ n), it must be the case that $\phi_{F_1}(u) = \phi_{F_2}(u)$ for almost all
u ∈ (0, 1). The last step is to show that this implies F1 and F2 are identical almost surely up to a location
shift, i.e. there exists some constant C such that for almost all u ∈ (0, 1),
\[
F_1^{-1}(u) = F_2^{-1}(u) + C
\]
Our argument follows Nagaraja and Barlevy (2003). Since $\phi_{F_1}(u) = \phi_{F_2}(u)$ almost surely, then for almost
all u ∈ (0, 1), we have
\[
\int_u^1 \left[ F_1^{-1}(x) - F_1^{-1}(u) \right] dx = \int_u^1 \left[ F_2^{-1}(x) - F_2^{-1}(u) \right] dx
\]
or, rearranging,
\[
\int_u^1 \left[ F_1^{-1}(x) - F_2^{-1}(x) \right] dx = \int_u^1 \left[ F_1^{-1}(u) - F_2^{-1}(u) \right] dx
= (1 - u) \left[ F_1^{-1}(u) - F_2^{-1}(u) \right]
\]
Define $H(x) = F_2^{-1}(x) - F_1^{-1}(x)$. Then the above equation implies that for almost all u ∈ (0, 1),
\[
\int_u^1 H(x)\, dx = (1 - u) H(u) \tag{7.6}
\]
Next, we observe that
\[
\frac{d}{du} \left[ \ln \int_u^1 H(x)\, dx \right] = - \frac{H(u)}{\int_u^1 H(v)\, dv}
\]
Hence, (7.6) implies that
\[
\frac{d}{du} \left[ \ln \int_u^1 H(x)\, dx \right] = - \frac{1}{1 - u}
\]
Integrating both sides over u ∈ (s, t) ⊂ (0, 1) yields
\[
\ln \int_t^1 H(x)\, dx - \ln \int_s^1 H(x)\, dx = \ln(1 - t) - \ln(1 - s)
\]
Since this is true for any s and any t, it follows that there exists a constant c such that for all t ∈ (0, 1),
\[
-\log \int_t^1 H(v)\, dv + \log(1 - t) = c
\]
or
\[
\int_t^1 H(x)\, dx = e^{-c} (1 - t), \quad t \in (0, 1).
\]
Differentiating with respect to t, we obtain $H(t) = e^{-c}$ for all t ∈ (0, 1), that is $F_2^{-1}(t) - F_1^{-1}(t) = e^{-c}$ for
almost all t ∈ (0, 1), as we need to show. ∎
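As a quick sanity check on the last step, condition (7.6) can be evaluated numerically for candidate functions H. The sketch below confirms that a constant H satisfies (7.6) at several points, while a non-constant choice such as H(x) = x does not. Both candidates are illustrative, and Simpson's rule stands in for exact integration.

```python
def simpson(f, lo, hi, steps=1000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def gap_in_7_6(H, u):
    """Left-hand side minus right-hand side of (7.6) at the point u."""
    return simpson(H, u, 1.0) - (1.0 - u) * H(u)


def constant_shift(x):
    """H for two quantile functions that differ by a location shift of 0.7."""
    return 0.7


def linear_difference(x):
    """H for two quantile functions whose difference varies with x."""
    return x


for u in (0.1, 0.3, 0.8):
    print(f"u = {u:.1f}   constant H: {gap_in_7_6(constant_shift, u): .6f}   "
          f"linear H: {gap_in_7_6(linear_difference, u): .6f}")
```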
Proof of Proposition 3: Starting with (7.2) and using the change of variable u = F (r), we obtain the
following expression:
¢
¶¸n−1 ¡ −1
· µ
Z 1
g F (u)
1 − q1 u
1 − q1
Pr (N = n) =
ln
du
1 − q1
f (F −1 (u))
0 1 − q1 u
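Suppose there were two triplets $\{q_1, F_1, G_1\}$ and $\{q_2, F_2, G_2\}$ that gave rise to the same $\Pr(N = n)$ for all
n, i.e.
$$\int_0^1 \frac{1 - q_1}{1 - q_1 u} \left[ \ln \left( \frac{1 - q_1 u}{1 - q_1} \right) \right]^{n-1} \frac{g_1\left(F_1^{-1}(u)\right)}{f_1\left(F_1^{-1}(u)\right)}\, du = \int_0^1 \frac{1 - q_2}{1 - q_2 u} \left[ \ln \left( \frac{1 - q_2 u}{1 - q_2} \right) \right]^{n-1} \frac{g_2\left(F_2^{-1}(u)\right)}{f_2\left(F_2^{-1}(u)\right)}\, du$$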

for all n = 1, 2, 3, ... As in the proof of Proposition 2, set t = ln ((1 − q1 u)/(1 − q1 )) on the left-hand side
and t = ln ((1 − q2 u)/(1 − q2 )) on the right-hand side to rewrite the above equations as
$$\int_0^{-\ln(1 - q_1)} h_1(t)\, t^{n-1}\, dt = \frac{q_1}{q_2} \, \frac{1 - q_2}{1 - q_1} \int_0^{-\ln(1 - q_2)} h_2(t)\, t^{n-1}\, dt$$

for all n = 1, 2, 3, ... Just as in the proof of Proposition 2, it follows that
q1 = q2
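The substitution $t = \ln((1 - qu)/(1 - q))$ used above can be checked numerically. The sketch below is an illustration under stated assumptions: `example_ratio` is a hypothetical stand-in for the ratio $g(F^{-1}(u))/f(F^{-1}(u))$, the value q = 0.4 is arbitrary, and the midpoint rule is used only for simplicity. It compares the integral written in u with the same integral written in t, where the change of variable contributes the factor $(1 - q)/q$.

```python
import math


def midpoint(func, lower, upper, steps=20000):
    """Composite midpoint rule on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


def integral_in_u(ratio, q, n):
    """Integral over u in (0, 1) of (1-q)/(1-qu) * [ln((1-qu)/(1-q))]^(n-1) * ratio(u)."""
    def integrand(u):
        odds = (1.0 - q * u) / (1.0 - q)
        return (1.0 / odds) * math.log(odds) ** (n - 1) * ratio(u)
    return midpoint(integrand, 0.0, 1.0)


def integral_in_t(ratio, q, n):
    """(1-q)/q times the integral over t in (0, -ln(1-q)) of ratio(u(t)) * t^(n-1)."""
    upper = -math.log(1.0 - q)

    def integrand(t):
        # Invert t = ln((1 - q u) / (1 - q)) to recover u.
        u = (1.0 - (1.0 - q) * math.exp(t)) / q
        return ratio(u) * t ** (n - 1)

    return (1.0 - q) / q * midpoint(integrand, 0.0, upper)


def example_ratio(u):
    """Hypothetical density ratio, chosen only to exercise the identity."""
    return 1.0 + u


for n in (1, 2, 3):
    in_u = integral_in_u(example_ratio, 0.4, n)
    in_t = integral_in_t(example_ratio, 0.4, n)
    print(f"n={n}  u-form={in_u:.8f}  t-form={in_t:.8f}")
```

Writing both sides in terms of t places the unknown q in the upper limit of integration, which is the form the argument for $q_1 = q_2$ works with.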

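Next, since $q_1 = q_2$, the fact that $\{q, F_1, G_1\}$ and $\{q, F_2, G_2\}$ both give rise to the same distribution
$\Pr(N = n)$ implies that
$$\int_0^1 \frac{1 - q}{1 - qu} \left[ \ln \left( \frac{1 - qu}{1 - q} \right) \right]^{n-1} \frac{g_1\left(F_1^{-1}(u)\right)}{f_1\left(F_1^{-1}(u)\right)}\, du = \int_0^1 \frac{1 - q}{1 - qu} \left[ \ln \left( \frac{1 - qu}{1 - q} \right) \right]^{n-1} \frac{g_2\left(F_2^{-1}(u)\right)}{f_2\left(F_2^{-1}(u)\right)}\, du$$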

Appealing to the Müntz-Szász theorem implies that for almost all $u \in (0, 1)$,
$$\frac{g_1\left(F_1^{-1}(u)\right)}{f_1\left(F_1^{-1}(u)\right)} = \frac{g_2\left(F_2^{-1}(u)\right)}{f_2\left(F_2^{-1}(u)\right)}$$

Note that this function corresponds to $\mathcal{G}'(u) = \frac{d}{du} G\left(F^{-1}(u)\right)$, and hence $\Pr(N = n)$ uniquely identifies
$\mathcal{G}(u)$, as noted in the text.

Define $g_0(u) = \mathcal{G}'(u)$. We then repeat the steps of the proof of Proposition 2 to argue that $F(\cdot)$ is
uniquely determined up to an affine transformation. Since $\mathcal{G}(u) = G\left(F^{-1}(u)\right)$, it follows that for any
$u \in (0, 1)$, we have $G^{-1}(u) = F^{-1}\left(\mathcal{G}^{-1}(u)\right)$. Hence, $G(\cdot)$ can be identified up to the same constant as
$F(\cdot)$. ∎

Appendix B: Bargaining
In this section, we describe a particular bargaining game whose reduced form corresponds to the model in
our paper. We consider an alternating-oﬀer bargaining game along the lines first proposed by Rubinstein
(1982): the worker and the firm alternate in proposing a schedule of payments the worker should receive
over the course of the job. More precisely, in line with the model we described, we require that the proposed
schedule assumes the form of a function w (z, it ) that specifies the worker receive a payment that is entirely
a function of his ability it , although the amount can vary with the productivity z of the match. If a party
proposes a schedule and the other party accepts, production takes place and the worker is paid according
to this schedule. If the worker proposes a schedule that is rejected, the two parties must wait ∆w units of
time, after which the employer gets to propose his own schedule. If the employer proposes a schedule that
is rejected, the two parties must wait ∆e units of time, at which point the worker gets to propose an oﬀer.
Let
$$\beta = \frac{\Delta_w}{\Delta_e + \Delta_w}$$
We will consider taking the limit as ∆w and ∆e tend towards zero while holding the ratio ∆e /∆w , and
hence β, fixed. In the limit, it doesn’t matter whether the worker or the employer makes the first oﬀer.
As emphasized by Binmore, Rubinstein, and Wolinsky (1986), the outcome of the bargain depends crucially on what we assume occurs while the parties wait between oﬀers. We assume the worker and the
employer bargain in real time, and thus discount the future at rate ρ. While they wait, the two may be hit
by a shock that causes them to separate (which recall arrives at a rate δ per unit time), and the worker
may continue to encounter other employers. Hence, if ∆ units of time have passed, where ∆ is small, there
is a probability of roughly δ∆ the two will have separated. Moreover, if we assume that a worker will
only change employers if the employer he encounters is more productive (i.e. bargaining leads to eﬃcient
mobility decisions), then there is a probability of roughly λ1 (1 − Γ (z)) ∆ that the worker would leave for
a more productive employer. Correspondingly, the probability the two will remain together and continue
with a counteroﬀer is approximately 1 − [δ + λ1 (1 − Γ (z))] ∆. Finally, we assume that during the period
∆, the worker is unable to enjoy leisure. This assumption implies that all workers, regardless of their value
of leisure, will negotiate to the same wage. This is a reasonable assumption: workers who value their leisure
more will certainly be more choosy, but will probably not be able to use their higher value of leisure to
extract higher wages. In reality this is probably because the value of leisure is hard to verify, but rather than
use a model of bargaining with private information, it is simpler to assume workers cannot enjoy leisure
while bargaining. Hall and Milgrom (2005) propose a similar scheme and argue it provides a plausible
description of actual wage bargaining.
Binmore, Rubinstein, and Wolinsky (1986) show that in the limit as ∆w and ∆e tend to 0, the schedule
w (z, it ) solves the Nash bargaining problem
$$\max_{w(z, i_t)} \left[ J(z, i_t) - J_0 \right]^{1-\beta} \left[ W(z, i_t) - W_0 \right]^{\beta} \qquad (7.7)$$

where $J(z, i_t)$ denotes the expected utility of an employer who employs a worker of ability $i_t$ on a job with
productivity z, $W(z, i_t)$ denotes the expected utility of a worker of ability $i_t$ when working on a job with
productivity z, and $J_0$ and $W_0$ denote the expected utility of the employer and the worker, respectively, if
the two fail to come to agreement. Turning first to $J_0$, if the parties fail to agree, either the two will remain
together after ∆ units of time, in which case the utility of the employer will be J (z, it ), or else the two will
have separated by then, in which case we assume the employer has a utility of zero (as would be implied by
a free entry condition). Discounting the future at the rate ρ, we have
$$J_0 \approx \frac{\left( 1 - (\delta + \lambda_1 [1 - \Gamma(z)]) \Delta \right) J(z, i_{t+\Delta})}{1 + \rho \Delta}$$
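To see how this threat point behaves as ∆ shrinks, the following sketch evaluates the approximation above under one simplifying assumption made only for this illustration: the employer's value is treated as unchanged over the short interval, so $J(z, i_{t+\Delta})$ is replaced by a fixed number J. Under that assumption the gap satisfies $J - J_0 = J (\rho + \delta + \lambda_1 [1 - \Gamma(z)]) \Delta / (1 + \rho \Delta)$, so the gap divided by ∆ should approach $J (\rho + \delta + \lambda_1 [1 - \Gamma(z)])$. All parameter values and function names below are hypothetical.

```python
def employer_threat_point(value, dt, rho, sep_rate, offer_rate, prob_better):
    """Discounted value of waiting dt when the match survives, as in the J0 approximation.

    prob_better stands for 1 - Gamma(z), the chance an outside employer is more productive.
    """
    exit_hazard = sep_rate + offer_rate * prob_better
    return (1.0 - exit_hazard * dt) * value / (1.0 + rho * dt)


def scaled_gap(value, dt, rho, sep_rate, offer_rate, prob_better):
    """(J - J0) / dt, which settles down to a finite limit as dt goes to zero."""
    j0 = employer_threat_point(value, dt, rho, sep_rate, offer_rate, prob_better)
    return (value - j0) / dt


# Hypothetical parameter values, chosen only to exercise the formula.
J, RHO, SEP, OFFER, BETTER = 10.0, 0.05, 0.1, 0.5, 0.4
LIMIT = J * (RHO + SEP + OFFER * BETTER)

for dt in (1e-1, 1e-3, 1e-6):
    print(f"dt={dt:g}  scaled gap={scaled_gap(J, dt, RHO, SEP, OFFER, BETTER):.6f}  limit={LIMIT:.6f}")
```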

Similarly, after ∆ units of time, the worker will either be unemployed, in which case his utility is given
by the utility of an unemployed worker U, employed on a better job, which is associated with a utility of
$E[W(z', i_{t+\Delta}) \mid z' \ge z]$, or else he will remain on the same match, which yields a utility of $W(z, i_{t+\Delta})$.
Hence,
$$W_0 \approx \frac{\left( 1 - (\delta + \lambda_1 [1 - \Gamma(z)]) \Delta \right) W(z, i_{t+\Delta}) + \delta \Delta U + \lambda_1 (1 - \Gamma(z)) \Delta \, E\left[ W(z', i_{t+\Delta}) \mid z' \ge z \right]}{1 + \rho \Delta}$$

Substituting these expressions in, dividing by $\Delta^2$ and taking the limit as $\Delta \to 0$ implies that $w(z, i_t)$ solves
$$\max_{w(z, i_t)} (z - w)^{1-\beta} w^{\beta}$$
which implies
$$w(z, i_t) = \beta z$$

Hence, the wage oﬀered to a worker under this particular bargaining protocol will be proportional to his
output, and all workers face the same potential offer distribution. Note that under this outcome a worker
would indeed switch jobs if and only if the employer he encounters has a higher productivity z, in line with
our assumption. Hence, the wage schedule above represents a proper equilibrium. Since the wage it implies
is proportional to productivity, the distribution F (·) is once again identically equal to Γ (·) up to a scaling
factor, and so we can easily identify Γ (·) from F (·).
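The solution $w = \beta z$ follows from the first-order condition of the limiting problem: taking logs of $(z - w)^{1-\beta} w^{\beta}$ and differentiating gives $-(1 - \beta)/(z - w) + \beta/w = 0$. As a quick numerical check, the sketch below maximizes the same objective by ternary search, which is valid here because the log of the objective is strictly concave in w on (0, z). The function names and the parameter values are assumptions for illustration only.

```python
def nash_product(wage, z, beta):
    """Objective (z - w)^(1 - beta) * w^beta from the limiting bargaining problem."""
    return (z - wage) ** (1.0 - beta) * wage ** beta


def maximize_nash_product(z, beta, iterations=200):
    """Ternary search for the maximizer on [0, z]; relies on the objective being single-peaked."""
    low, high = 0.0, z
    for _ in range(iterations):
        left = low + (high - low) / 3.0
        right = high - (high - low) / 3.0
        if nash_product(left, z, beta) < nash_product(right, z, beta):
            low = left
        else:
            high = right
    return 0.5 * (low + high)


# Hypothetical (z, beta) pairs used to compare the numerical maximizer with beta * z.
for z, beta in ((2.0, 0.3), (5.0, 0.5), (1.0, 0.8)):
    wage = maximize_nash_product(z, beta)
    print(f"z={z}  beta={beta}  numerical w={wage:.6f}  beta*z={beta * z:.6f}")
```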
Finally, we should note that in recent work, Shimer (2005) examined bargaining in a similar model
with on-the-job search. His formulation, which borrows from Binmore, Rubinstein, and Wolinsky (1986),
assumes the agents do not discount between offers; rather, there is a constant rate at which negotiations break
down, independently of how the worker ranks his current employer relative to other employers. Under this
alternative formulation, he shows that the wage schedule solves an analogous problem to (7.7), but with J0
replaced by 0 and W0 replaced by U . Shimer shows that when Γ (·) is a continuous distribution, there exist
equilibria in which all firms with the same productivity z end up agreeing with their workers on the same
wage. However, in his formulation worker productivity it is assumed to be fixed over time. Once we allow
for it to vary over time, the solution w (z, it ) that solves this alternative problem is typically not linear in
it or separable in z and it . Thus, his proposed scheme will typically not yield the model we describe as a
reduced form (except for particular processes it ). This underscores that while our model is consistent with
some models of wage determination, it is inconsistent with others.

References
[1] Abowd, John, Francis Kramarz, and David Margolis, 1999. “High Wage Workers and High
Wage Firms” Econometrica, March, 67(2), p251-333.
[2] Arnold, Barry, N. Balakrishnan and H. Nagaraja, 1998. Records. New York: John Wiley and
Sons.
[3] Athey, Susan and Philip Haile, 2002. “Identification in Standard Auction Models” Econometrica, November, 70(6), p2107-40.
[4] Barlevy, Gadi, 2002. “The Sullying Eﬀect of Recessions” Review of Economic Studies, January,
69 (1), p65-96.
[5] Barlevy, Gadi, 2005. “Identification of Job Search Models using Record Statistics” Federal
Reserve Bank of Chicago Working Paper.
[6] Barlevy, Gadi and H. N. Nagaraja, 2005. “Characterizations in a random record model with a
non-identically distributed initial record” Federal Reserve Bank of Chicago Working Paper.
[7] Binmore, Kenneth, Ariel Rubinstein, and Asher Wolinsky, 1986. “The Nash Bargaining Solution in Economic Modelling” Rand Journal of Economics, Summer, 17(2), p176-88.
[8] Bontemps, Christian, Jean-Marc Robin, and Gerard van den Berg, 1999. “An Empirical
Equilibrium Job Search Model with Search on the Job and Heterogeneous Workers and Firms”
International Economic Review, November, 40(4), p1039-1074.
[9] Bontemps, Christian, Jean-Marc Robin, and Gerard van den Berg, 2000. “Equilibrium Search
with Continuous Productivity Dispersion: Theory and Nonparametric Estimation” International Economic Review, May, 41(2), p305-358.
[10] Bowlus, Audra and Jean-Marc Robin, 2004. “Twenty Years of Rising Inequality in US Lifetime
Labor Income Values” Review of Economic Studies, July, 71(3), p709-742.
[11] Bunge, John and H. Nagaraja, 1991. “The Distributions of Certain Record Statistics from a
Random Number of Observations” Stochastic Processes and Their Applications, 38, p167-83.
[12] Burdett, Kenneth and Dale Mortensen, 1998. “Wage Diﬀerentials, Employer Size, and Unemployment” International Economic Review, 39, p257-273.
[13] Eckstein, Zvi and Gerard van den Berg, 2005. “Empirical Labor Search: a Survey” Tel Aviv
University Working paper and Journal of Econometrics (forthcoming).

[14] Flinn, Christopher, 1986. “Wages and Job Mobility of Young Workers” Journal of Political
Economy 94(3, Part 2), pS88-S110.
[15] Flinn, Christopher, 2002. “Labour Market Structure and Inequality: a Comparison of Italy
and the U.S.” Review of Economic Studies, July, 69 (3), p611-45.
[16] Flinn, Christopher and James Heckman, 1982. “New Methods for Analyzing Structural Models
of Labor Force Dynamics” Journal of Econometrics, January, 18(1), p115-68.
[17] Gautier, Peter, Coen Teulings, and Aico van Vuuren, 2005. “On-the-Job Search and Sorting”
Tinbergen Institute Discussion Paper.
[18] Guerre, Emmanuel, Isabelle Perrigne, and Quang Vuong, 2000. “Optimal Nonparametric
Estimation of First-Price Auctions” Econometrica, May, 68(3), p525-74.
[19] Kamps, Udo, 1998. Characterizations of distributions by recurrence relations and identities
for moments of order statistics. Handbook of Statistics, Vol. 16, eds. N. Balakrishnan and C.
R. Rao, Elsevier, Amsterdam. 291-311.
[20] Low, Hamish, Costas Meghir, and Luigi Pistaferri, 2004. “Wage Risk and Employment Risk
over the Life Cycle” Stanford University Working Paper.
[21] Lucas, Robert and Edward Prescott, 1974. “Equilibrium Search and Unemployment” Journal
of Economic Theory February, 7(2), p188-209.
[22] Marimon, Ramon and Fabrizio Zilibotti, 1999. “Unemployment vs. Mismatch of Talents:
Reconsidering Unemployment Benefits” Economic Journal, April, 109(127), p266-91.
[23] Mortensen, Dale and George Neumann, 1988. “Estimating Structural Models of Unemployment
and Job Duration” Dynamic Econometric Modeling: Proceedings of the Third International
Symposia in Economic Theory and Econometrics, eds. W. A. Barnett, E. Berndt, and H.
White. Cambridge: Cambridge University Press.
[24] Nagaraja, H. N. and Gadi Barlevy, 2003. “Characterizations Using Record Moments in a
Random Record Model and Applications” Journal of Applied Probability, September, 40(3),
p826-33.
[25] Nevzorov, V. B. and N. Balakrishnan, 1998. “A Record of Records” Handbook of Statistics:
Order Statistics, Theory and Methods, v16, eds. N. Balakrishnan and C. R. Rao, Amsterdam:
North-Holland, p515-70.

[26] Pierret, Charles, 2001. “Event History Data and Survey Recall: an Analysis of the National
Longitudinal Survey of Youth 1979 Recall Experiment” Journal of Human Resources, Summer, 36(3), p439-66.
[27] Postel-Vinay, Fabien and Jean-Marc Robin, 2002. “Equilibrium Wage Dispersion with Worker
and Employer Heterogeneity” Econometrica, November, 70(6), p2295-2350.
[28] Rubinstein, Ariel, 1982. “Perfect Equilibrium in a Bargaining Model” Econometrica, January,
50(1), p97-109.
[29] Shimer, Robert, 2005. “On-the-Job Search, Bargaining, and Wage Dispersion” Working Paper,
University of Chicago.
[30] Shohat J. A. and J. D. Tamarkin, 1943. The Problem of Moments. New York: American
Mathematical Society.
[31] Titchmarsh, E. C., 1926. “The zeros of certain integral functions” Proceedings of the London
Mathematical Society, Series 2, 25, p283-302.
[32] Titchmarsh, E. C., 1948. Introduction to the Theory of Fourier Integrals. 2nd ed. Oxford:
Clarendon Press.
[33] van den Berg, Gerard and Geert Ridder 1998. “An Empirical Equilibrium Search Model of
the Labor Market” Econometrica, September, 66(5), p1183-1221.
[34] Wolpin, Kenneth, 1992. “The Determinants of Black-White Diﬀerences in Early Employment
Careers: Search, Layoﬀs, Quits, and Endogenous Wage Growth” Journal of Political Economy, June, 100(3), p535-60.
