Full text of Working Papers (Federal Reserve Bank of Chicago) : Estimating Models of On-The-Job Search Using Record Statistics, Working Paper 2003-18

Federal Reserve Bank of Chicago

Estimating Models of On-the-Job
Search using Record Statistics
Gadi Barlevy

WP 2003-18

Estimating Models of On-the-Job Search using Record Statistics∗
Gadi Barlevy
Economic Research Department
Federal Reserve Bank of Chicago
230 South LaSalle
Chicago, IL 60604
e-mail: gbarlevy@frbchi.org
November 13, 2003

Abstract
This paper proposes a methodology for estimating job search models that does not require
either functional form assumptions or ruling out the presence of unobserved variation in worker
ability. In particular, building on existing results from record-value theory, a branch of statistics
that deals with the timing and magnitude of extreme values in sequences of random variables,
I show how we can use wage data to identify the distribution from which workers search.
Applying this insight to wage data in the NLSY dataset, I show that the data supports the
hypothesis that the wage oﬀer distribution is Pareto, but not that it is lognormal.

Key Words: Job Search, Record Statistics,
Hausdorﬀ Moment Problem, Non-Parametric Estimation

∗ I would like to thank Joe Altonji, Susan Athey, Marco Bassetto, Jeff Campbell, Kim-Sau Chung, Chuck Manski,
Francesca Molinari, Chris Taber, Nick Williams, and especially Eric French and H. N. Nagaraja for helpful discussions. I also benefitted from comments at seminars in Arizona State, Michigan State, the Federal Reserve Bank
of Minneapolis, the Federal Reserve Bank of Chicago, the Society of Economic Dynamics, and the NBER Summer
Institute. The views expressed here do not necessarily reflect the position of the Federal Reserve Bank of Chicago
or the Federal Reserve System.

Introduction
In recent years, economists have made significant progress in estimating models of job search using
data on wages and job durations. These estimates can be applied to analyze a host of questions
that are of interest to economists. For example, we can use them to identify the sources of wage
growth among labor market entrants, which in turn can be used to determine why wage growth
varies by race and education, as was done in Wolpin (1992) and Bowlus, Kiefer, and Neumann
(2001); or we can use these estimates to infer the degree to which current wage inequality among
a given group of workers is likely to translate into inequality over lifetime earnings, as was done in
Flinn (2002) and Bowlus and Robin (2003); or we can use them to predict the eﬀects of changes in
the minimum wage and other government policies on equilibrium wages and employment, as was
done in van den Berg and Ridder (1998); or we can use them to calibrate macroeconomic models
in order to gauge the eﬀects of cyclical fluctuations on the labor market, as was done in Barlevy
(2002). Clearly, the ability to estimate job search models is an important tool for economists who
wish to better understand the operation of labor markets.
While previous authors have made significant progress in estimating such models, the procedures
they suggest have some important shortcomings. For example, most of the aforementioned papers
employ maximum likelihood using particular functional forms for the wage oﬀer distribution (or
alternatively for the distribution of firm productivities that gives rise to the wage oﬀer distribution
in equilibrium). As such, these papers assume rather than identify the shape of the wage oﬀer
distribution. But since there is no consensus as to what is an appropriate functional form, this
seems unsatisfactory without some a priori evidence on the shape of this distribution. Indeed,
identifying the exact shape of the wage oﬀer distribution will be important for some of the questions
above. For example, the eﬀects of changes in the minimum wage depend on how many firms
set their wage close to the minimum wage level, and a functional form that fits the data well
along some dimensions may do poorly in matching this region of the distribution. Although
some nonparametric procedures for estimating the wage offer distribution have been proposed,
most notably in Bontemps, Robin, and van den Berg (2000), these procedures do not allow for
unobserved earnings heterogeneity, i.e. they require that diﬀerences in wages across workers be
driven entirely by diﬀerences in wages paid across firms. In practice, though, even within narrowly
defined groups, diﬀerences in wages are likely to reflect unobserved diﬀerences in ability beyond
just diﬀerences in pay scales across firms. This paper proposes an alternative approach that allows
us to estimate these models in the presence of unobserved and time-varying ability, even without
having to specify the distribution of the unobserved ability across agents or over time.

The innovation of my approach is that rather than using data on wages and job duration to
estimate the model, as previous work has done, I use data on wages and individual work histories,
specifically the number of voluntary and involuntary job changes a worker experienced before
settling into each of his jobs. Work history data allows us to treat jobs as records, where a record
corresponds to an observation in a sequence that exceeds all of the observations that preceded
it in the sequence. Statisticians have studied record processes as a particular branch of extreme
value theory, and have applied them to study various phenomena such as record temperatures,
record athletic performances, road congestion, optimal tolerance testing, and ruin probabilities.1
A similar connection exists between record theory and models of on-the-job search: a worker’s job
at any point in time can be viewed as the record most attractive oﬀer he received since his last
involuntary job change. It turns out that we can exploit the implicit record structure of search
models to make inference about the search problem workers face.
In what follows, I follow many of the above cited papers in focusing on the Burdett and Mortensen
(1998) model of on-the-job search. That model implies an equilibrium wage oﬀer distribution that
depends on the distribution of productivity across all firms. I show that we can identify the shape
of this wage oﬀer distribution even when there is unobserved variation in workers’ ability. Exact
identification requires us to estimate an infinite list of record moments. Although we can obtain a
consistent estimator for the oﬀer distribution based on a finite number of moments, this estimator
may not be very precise when we use a small number of moments. Nevertheless, even a small
number of moments can yield sharp tests for particular hypotheses about the shape of the oﬀer
distribution. Using data from the National Longitudinal Survey of Youth (NLSY), I argue that
the oﬀer distribution for this sample is consistent with a Pareto distribution, which coincides with
the functional-form assumptions in some of the papers above, e.g. Flinn (2002). At the same time,
I can reject the hypothesis that the wage oﬀer distribution is lognormal, which has also been cited
as a plausible form for the wage oﬀer distribution.
The paper is organized as follows. Section 1 introduces the concept of record statistics. Section
2 describes the model and shows how to use record statistics to estimate it non-parametrically.
Section 3 describes data from the NLSY that can be used to implement this approach. Section
4 reports the results. Section 5 comments on the applicability of my approach to more general
search models in which wages may not correspond to record statistics but where there is still an
underlying record structure inherent to the model. Section 6 concludes.
1 An entertaining survey on the various applications of record statistics is provided in Glick (1978).

1. Record Statistics
Although statisticians have written extensively on record theory, it has attracted scant attention
from economists.2 I therefore begin with a quick overview of record statistics. More comprehensive reviews are available in Arnold, Balakrishnan, and Nagaraja (1992, 1998) and Nevzorov and
Balakrishnan (1998).
Consider a sequence of real numbers $\{X_m\}_{m=1}^{M}$. An element in the sequence is a record if it
exceeds the realized value of all observations that preceded it in the sequence. Formally, let $M_1 = 1$,
and for any integer $n > 1$ define the observation number of the $n$-th record $M_n$ recursively as

$$M_n = \min \{ m : X_m > X_{M_{n-1}} \} \tag{1.1}$$

The $n$-th record, denoted $R_n$, is just the value of $X_m$ at the $n$-th record time, i.e.

$$R_n = X_{M_n} \tag{1.2}$$

As an illustration, suppose we recorded the daily average temperatures in a particular city on the
same date each year, and obtained the following sequence:
{65, 61, 68, 69, 63, 67, 71, 66, ...}

(1.3)

The first observation is trivially a record, so M1 = 1 and R1 = 65. The next observation that
exceeds this value is the third one, so M2 = 3 and R2 = 68. The very next observation exceeds
this value, so M3 = 4, and R3 = 69. Similarly, M4 = 7 and R4 = 71. Thus, we can construct a
sequence of records {Rn } from the original sequence {Xm } in (1.3):
{65, 68, 69, 71, ...}
Note that {Rn } is a subsequence of {Xm }, and as such will be less informative. For example, we
cannot use the sequence {Rn } to infer the number of observations between consecutive records,
i.e. how many years transpired between when any two consecutive record temperatures were set,
whereas we could infer this information from the original sequence {Xm }. Formally, we cannot
deduce Mn from the sequence {Rn }, an observation I return to below.
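
The definitions in (1.1) and (1.2) can be sketched in a few lines of Python. The helper name `record_times_and_values` is hypothetical and serves only to illustrate the recursion; it is applied here to the temperature sequence in (1.3).

```python
def record_times_and_values(xs):
    """Return (record times M_n, record values R_n) for a finite sequence.

    Record times are 1-indexed, matching the convention M_1 = 1 in (1.1).
    """
    times, values = [], []
    for m, x in enumerate(xs, start=1):
        # The first observation is trivially a record; afterwards an
        # observation is a record only if it strictly exceeds the last record.
        if not values or x > values[-1]:
            times.append(m)
            values.append(x)
    return times, values


temperatures = [65, 61, 68, 69, 63, 67, 71, 66]
M, R = record_times_and_values(temperatures)
print(M)  # [1, 3, 4, 7]
print(R)  # [65, 68, 69, 71]
```

Discarding `M` and keeping only `R` reproduces the loss of information described above: the record values alone do not reveal how many observations separated them.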
2 Exceptions are Kortum (1997) and Munasinghe, O’Flaherty, and Danninger (2001). Kortum remarks on the
connection between his model of innovation and record theory. However, most of his analysis does not make
use of the underlying record structure, since he conditions on time elapsed rather than the number of previously
successful innovations. Munasinghe et al. analyze the number of track and field records in national and international
competitions to gauge the effects of globalization, and remark on the likely applicability of record theory in economics.

In applications, the sequence $\{X_m\}_{m=1}^{M}$ is assumed to follow some stochastic process, so the
occurrence of records and the values they assume are well-defined probabilistic events. For example,
suppose M = ∞ and the individual observations Xm are i.i.d. random variables. This case was
examined by Chandler (1952), who first introduced the notion of record statistics, and has come
to be known as the classical record model. Various results for this special case have been derived,
including the distribution of record times Mn ; the distribution of the number of records within a
given sample size; the distribution of the n-th record value given the distribution of any individual
observation in the sequence (which is known as the parent distribution); the distribution of the
original parent distribution given various properties of the record statistics; and the asymptotic
distribution of the n-th record value as n tends to ∞ (if it exists). If Xm are not i.i.d., the analysis
becomes much more diﬃcult, although some results have been developed for special cases; see
Arnold, Balakrishnan, and Nagaraja (1998) for a summary of recent developments. As we shall
see below, the conventional search model does not quite correspond to the classical record model,
which introduces some complications in the analysis.
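
One of the classical-model results alluded to above can be checked by brute force. With i.i.d. draws from a continuous distribution, every ordering of the first m observations is equally likely, so the m-th observation is a record with probability 1/m and the expected number of records in a sample of size m is 1 + 1/2 + ... + 1/m. The sketch below (the function names are illustrative and do not come from the paper) confirms this by enumerating all orderings of a small sample.

```python
from fractions import Fraction
from itertools import permutations


def count_records(xs):
    """Number of records (strict running maxima) in a finite sequence."""
    count, best = 0, None
    for x in xs:
        if best is None or x > best:
            count, best = count + 1, x
    return count


def expected_records_exact(m):
    """Average number of records over all m! equally likely orderings."""
    total, orderings = 0, 0
    for perm in permutations(range(m)):
        total += count_records(perm)
        orderings += 1
    return Fraction(total, orderings)


def harmonic(m):
    """Partial sum 1 + 1/2 + ... + 1/m as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, m + 1))


for m in range(1, 7):
    assert expected_records_exact(m) == harmonic(m)
print(expected_records_exact(4))  # 25/12
```

Exact fractions are used instead of simulation so that the check does not depend on a random seed.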
Finally, given the frequent occurrence of order statistics in economic applications, it is worth
commenting on the connection between order statistics and record statistics. The n-th maximal order statistic, denoted Xn:n , corresponds to the maximum of n random variables, max {X1 , ..., Xn }.

By contrast, the n-th record statistic Rn corresponds to the maximum of a random number Mn of
random variables, max {X1 , ..., XMn }. Thus, if we had to characterize the distribution of the n-th
record Rn , we would not be able to describe it as an order statistic, since the record sample comes
from a sample whose size is not known in advance and cannot be recovered from the sequence of
records {Rn }. Note that we could still express a record as a mixture of order statistics. Specifically,
since the number of observations Mn must be equal to an integer greater than or equal to n, the
probability that the n-th record is equal to x can be expressed as

$$\mathrm{Prob}(R_n = x) = \sum_{m=n}^{\infty} \mathrm{Prob}(M_n = m) \times \mathrm{Prob}(X_{m:m} = x) \tag{1.4}$$

Since we can often compute the distribution of the n-th record directly without computing the
mixing probabilities Prob(Mn = m), appealing to the order statistics structure that underlies
record data may unnecessarily complicate the analysis.

2. Job Search and Record Statistics
We can now turn our attention to the task of estimating job search models. In line with most
of the previous literature on structural estimation of search models, I focus on the Burdett and
Mortensen (1998) model for my theoretical framework. I begin with a brief summary of the model.
I then illustrate why previous approaches to estimating the model either require us to make explicit
functional-form assumptions, or, if not, are incapable of dealing with unobserved heterogeneity. I
then show that appealing to the implicit record structure of the model allows us to identify the
shape of the wage oﬀer distribution without any assumptions on the distribution of unobserved
heterogeneity. Although I focus exclusively on the Burdett and Mortensen model in my discussion,
I briefly comment on alternative models below in Section 5.

2.1. The Burdett and Mortensen Model
The original Burdett and Mortensen (1998) model is an equilibrium model in which firms post
wages and workers sample these wages at random, choosing optimally among the oﬀers they receive.
Given a distribution of productivity across firms, the model makes a precise prediction as to the
wage oﬀer distribution that will arise in equilibrium. Among the papers cited above, some, such as
van den Berg and Ridder (1998), Bontemps, Robin, and van den Berg (2000) and Bowlus, Kiefer,
and Neumann (2001), explicitly estimate the distribution of firm productivities. Others, such as
Flinn (2002), estimate the wage offer distribution without directly relating it to an underlying
distribution of firm productivity. I will pursue the latter strategy, i.e. I take the distribution of
wages posted as a primitive. As I observe below, we will still be able to apply the analysis of
Bontemps, Robin, and van den Berg (2000) to back out the original productivity distribution of
firms from the wage oﬀer distribution if we are so inclined.
Given a wage oﬀer distribution, the Burdett and Mortensen model can be summarized as follows.
At any point in time, individuals can be either employed or unemployed. An unemployed worker
receives a utility payoﬀ of b per unit time, and faces a constant hazard λ0 per unit time of receiving
a job oﬀer when he is unemployed. An oﬀer, when it does arrive, specifies a fixed wage drawn
from the wage oﬀer distribution denoted by F (·). Employed workers face a constant hazard λ1
of receiving an oﬀer, which is also drawn from the same oﬀer distribution F (·), and each draw
is independent of all previous draws. In addition, employed workers face a probability δ per unit
time of losing their job. This rate is assumed constant and independent of the wage on a worker’s
current job.
Each worker has to decide whether to accept a new oﬀer when one arrives or to stay on his
previous job (or remain unemployed if he doesn’t yet have a job). The optimal policy for an
unemployed worker is to set a reservation wage W ∗ that depends on b, δ, λ0 , λ1 , and F (·), and
to accept job oﬀers if and only if they oﬀer a wage of at least W ∗ . For an employed worker, the
optimal policy dictates accepting any wage oﬀer that exceeds the worker’s current wage W and
turning down any offer below it. The optimal behavior of firms imposes at least two restrictions
on the equilibrium offer distribution F (·), both of which I make use of in my analysis. First,
the equilibrium offer distribution F (·) must be continuous, i.e. it cannot exhibit any mass points.
Second, F (W ∗ ) = 0, i.e. no firm will offer a wage below the common reservation wage W ∗ .
Note that the model makes a clear distinction between voluntary job changes that the worker initiates
upon receiving a better oﬀer and involuntary job changes in which the worker is forced to leave his
job to unemployment and resume searching from scratch. Following Wolpin (1992), we can break
down each worker’s employment history in the model into distinct employment cycles, where a
cycle is defined as the time between involuntary job changes. That is, a cycle begins the instant
the worker leaves his job involuntarily, continues on through his unemployment spell and each
job he subsequently takes on, and ends the next time he is laid oﬀ. Of course, in empirical
applications it will be important to classify job changes in a way that accords with the model
so that we correctly identify distinct employment cycles, a point I return to in my data analysis.
In principle, we should index observations by the employment cycle they are associated with. For
ease of notation, though, I will omit this subscript in what follows.
Let $M$ denote the (random) number of job offers a worker receives on a given employment cycle,
and let $m \in \{1, 2, ..., M\}$ index these offers. Let $\{X_m\}_{m=1}^{M}$ denote the list of wages on the
respective job offers that the worker encounters over his employment cycle. The fact that $F(W^*) = 0$
implies that workers will always accept the first offer they receive out of unemployment. Thus,
$X_1$ represents both the wage on the first job offer the worker receives and the wage on the first
job the worker is employed on. This will not be true more generally; for example, by a simple
exchangeability argument, half of the time when a worker draws a second wage offer it will
be lower than his first offer, so the second offer will often not be the second job the worker is
employed on. As such, let us define $N \le M$ as the number of actual jobs the worker is employed
on in a given cycle, and let $n \in \{1, 2, ..., N\}$ index these jobs. Let $\{W_n, t_n\}_{n=1}^{N}$ denote the wages
and durations of all the jobs the worker is employed on over the cycle. Using the definition of $M_n$
in (1.1), the optimal search strategy for a worker implies that

$$W_n = X_{M_n}$$

i.e. the wage on the $n$-th job in the cycle must be the $n$-th record from the sequence $\{X_m\}_{m=1}^{M}$,
and $N$ is the total number of records set within this sequence. The fact that wages correspond to
record statistics will eventually figure prominently in my analysis.
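
The mapping from offers to jobs within a cycle can be sketched with a short simulation. Everything specific here is an assumption made for illustration rather than part of the paper: the Pareto form of the offer distribution, the parameter values, and the name `simulate_cycle`. The only facts used are that the first offer is always accepted and that, with constant hazards λ1 and δ, the next event for an employed worker is an offer with probability λ1/(λ1 + δ) and a layoff otherwise.

```python
import random


def simulate_cycle(lam1, delta, alpha, w_star=1.0, rng=None):
    """Simulate the offers and accepted wages in one employment cycle.

    Assumptions (illustrative only): offers are Pareto with lower bound
    w_star and shape alpha; while employed, the next event is an offer with
    probability lam1 / (lam1 + delta) and a layoff otherwise.
    """
    rng = rng or random.Random()
    p_offer = lam1 / (lam1 + delta)
    offers = [w_star * rng.paretovariate(alpha)]  # first offer is always accepted
    while rng.random() < p_offer:  # another offer arrives before the layoff
        offers.append(w_star * rng.paretovariate(alpha))
    # The worker moves only when an offer beats his current wage, so the
    # wages on successive jobs are the records of the offer sequence.
    wages = []
    for x in offers:
        if not wages or x > wages[-1]:
            wages.append(x)
    return offers, wages


rng = random.Random(0)
offers, wages = simulate_cycle(lam1=0.3, delta=0.1, alpha=3.0, rng=rng)
print(len(offers), len(wages))  # M and N for this simulated cycle
```

In actual data only `wages` (and the position of each job in the cycle) would be observed; `offers`, and hence M, would not.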

2.2. Previous Approaches to Estimation
Having described the model, I now turn to the question of how to use data to identify its key
parameters. One approach is to assume that the oﬀer distribution F (·) lies in a class of functions
parameterized by some finite-dimensional vector ξ, i.e. F (·) = F (· ; ξ). Thus, identifying the
model amounts to estimating a finite vector {λ0 , λ1 , δ, ξ}. In the full-fledged version of the Burdett
and Mortensen model, ξ corresponds to the ratio λ1 /δ and any parameters that characterize the
distribution of productivity across firms. For example, if we assume all firms in a given labor market
are equally productive, as in van den Berg and Ridder (1998), estimating ξ amounts to estimating
a single productivity parameter for each market.3 Alternatively, if we assume the distribution of
productivity across firms in each market has finite support, as in Bowlus, Kiefer, and Neumann
(2001), ξ amounts to the diﬀerent levels of productivity and their respective probabilities. In
the reduced-form version of the model where we treat the wage oﬀer distribution as a given, ξ
corresponds to the parameters that characterize the particular functional form imposed on the
wage oﬀer distribution. For example, Flinn (2002) assumes F (·) is Pareto, and then estimates the
curvature parameter associated with this distribution.
Given an expression for $F(\cdot\,; \xi)$, we can write down the likelihood of a given cycle, i.e. the
likelihood of there being exactly $N$ jobs in the cycle and that the particular sequence of wages and
job durations for these $N$ jobs is given by $\{W_1, t_1, ..., W_N, t_N\}$. As I show below, this likelihood is
a function of $\lambda_1$, $\delta$, and $\xi$, so we can estimate these parameters using maximum likelihood. Armed
with these estimates, we can proceed to estimate $\lambda_0$ using data on unemployment durations,
although I ignore this step in my discussion since I have nothing novel to say about it.4 To obtain
the likelihood function $L(W_1, t_1, ..., W_N, t_N, N)$, note that the likelihood for any sequence that
fails to satisfy the condition that

$$W_1 \le W_2 \le \cdots \le W_N$$

must be zero. For sequences that satisfy this requirement, we can compute the likelihood of
$\{W_1, t_1, ..., W_N, t_N, N\}$ as follows. First, we compute the likelihood of the wage and duration on
each job and whether it ends voluntarily or involuntarily conditional on the same information for
all of the jobs that preceded it in the cycle. Then, we multiply all of these terms to obtain the joint
likelihood of the entire cycle. Formally, for each job except the last job in the cycle, we compute
the conditional likelihood

$$L(W_n, t_n, N \ge n + 1 \mid W_{n-1}, t_{n-1}, ..., W_1, t_1, N \ge n) \tag{2.1}$$

For the last job on the cycle, we compute the conditional likelihood

$$L(W_n, t_n, N = n \mid W_{n-1}, t_{n-1}, ..., W_1, t_1, N \ge n) \tag{2.2}$$

3 More precisely, van den Berg and Ridder assume a worker can participate in only one market. Different workers
(as distinguished by observable and unobservable characteristics) operate in different markets, and the productivity
of any worker is assumed the same across all firms in the market in which he sells his labor services. Thus, there is
only one parameter to estimate per market, but the number of markets is large.

4 As previous authors have noted, estimating $\lambda_0$ in this model is simplified by the fact that $F(W^*) = 0$ in the
Burdett and Mortensen model, which avoids the question of recoverability raised in Flinn and Heckman (1982).

To compute (2.1) and (2.2), we again break down these likelihoods into products of conditional
likelihoods, i.e.

$$L(W_n, t_n, N \mid W_{n-1}, ...) = L(W_n \mid W_{n-1}, ...) \times L(t_n \mid W_n, W_{n-1}, ...) \times L(N \mid W_n, t_n, W_{n-1}, ...)$$

Turning to the first term, since the wage on the $n$-th job is a random draw from the offer distribution
truncated at $W_{n-1}$, the likelihood of observing $W_n$ on the $n$-th job given $W_{n-1}$ (as well as all other
past data, which are independent of $W_n$ given $W_{n-1}$) corresponds to

$$L(W_n \mid W_{n-1}, ...) = \frac{f(W_n)}{\bar{F}(W_{n-1})} \tag{2.3}$$

where $\bar{F}(\cdot) = 1 - F(\cdot)$. Adopting the convention that $W_0 \equiv W^*$, this expression also characterizes
the distribution of the wage on the very first job in the cycle. Next, the duration of the $n$-th
job given the wage $W_n$ will be exponential with an arrival rate equal to $\delta + \lambda_1 \bar{F}(W_n)$. Thus, the
conditional likelihood of $t_n$ given $W_n$ and all other past data corresponds to

$$L(t_n \mid W_n, W_{n-1}, ...) = \left( \delta + \lambda_1 \bar{F}(W_n) \right) e^{-[\delta + \lambda_1 \bar{F}(W_n)] t_n} \tag{2.4}$$

Finally, the conditional probabilities that the $n$-th job ends because of a quit and a layoff, respectively, correspond to

$$L(N \ge n + 1 \mid W_n, t_n, W_{n-1}, ...) = \frac{\lambda_1 \bar{F}(W_n)}{\delta + \lambda_1 \bar{F}(W_n)}$$

$$L(N = n \mid W_n, t_n, W_{n-1}, ...) = \frac{\delta}{\delta + \lambda_1 \bar{F}(W_n)}$$

Multiplying (2.3) through (2.5), we get

$$L(W_n, t_n, N \ge n + 1 \mid W_{n-1}, ...) = \lambda_1 \bar{F}(W_n) \frac{f(W_n)}{\bar{F}(W_{n-1})} e^{-[\delta + \lambda_1 \bar{F}(W_n)] t_n}$$

$$L(W_n, t_n, N = n \mid W_{n-1}, ...) = \delta \frac{f(W_n)}{\bar{F}(W_{n-1})} e^{-[\delta + \lambda_1 \bar{F}(W_n)] t_n} \tag{2.5}$$

Multiplying the above for all jobs $n = 1, ..., N$ and cancelling redundant terms yields the following
expression for the likelihood of a given employment cycle:

$$L(W_1, t_1, ..., W_N, t_N, N) = \frac{\delta}{\lambda_1} \prod_{n=1}^{N} \left[ \lambda_1 f(W_n) e^{-[\delta + \lambda_1 \bar{F}(W_n)] t_n} \right] \tag{2.6}$$

Maximum likelihood then amounts to choosing the values of $\{\lambda_1, \delta, \xi\}$ that maximize (2.6) evaluated
at the sample data from different employment cycles. As long as we correctly specified the
function $F(\cdot\,; \xi)$, these values are consistent estimates of the true $\{\lambda_1, \delta, \xi\}$.
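
As a rough illustration of how (2.6) would be evaluated, the sketch below computes the log-likelihood of a single cycle under an assumed Pareto offer distribution, so that ξ reduces to a shape parameter `alpha` and a lower bound `w_star`. The Pareto choice, the function names, and the sample numbers are assumptions made for the example. A second function rebuilds the same quantity job by job from the conditional terms in (2.5), which checks the cancellation of the survival-function terms numerically.

```python
import math


def pareto_pdf(w, alpha, w_star):
    """Density f(w) of a Pareto distribution with lower bound w_star."""
    return alpha * w_star**alpha / w ** (alpha + 1)


def pareto_sf(w, alpha, w_star):
    """Survival function F-bar(w) = 1 - F(w) for w >= w_star."""
    return (w_star / w) ** alpha


def cycle_loglik(wages, durations, lam1, delta, alpha, w_star):
    """Log of (2.6) for one cycle under an assumed Pareto offer distribution."""
    if wages[0] < w_star or any(b < a for a, b in zip(wages, wages[1:])):
        return -math.inf  # wage sequences that ever decrease have zero likelihood
    total = math.log(delta / lam1)
    for w, t in zip(wages, durations):
        hazard = delta + lam1 * pareto_sf(w, alpha, w_star)
        total += math.log(lam1 * pareto_pdf(w, alpha, w_star)) - hazard * t
    return total


def cycle_loglik_stepwise(wages, durations, lam1, delta, alpha, w_star):
    """Same quantity built job by job from the conditional terms in (2.5)."""
    total, prev = 0.0, w_star  # convention W_0 = W*
    last = len(wages) - 1
    for n, (w, t) in enumerate(zip(wages, durations)):
        sf = pareto_sf(w, alpha, w_star)
        exit_rate = lam1 * sf if n < last else delta  # quit vs. layoff
        density = pareto_pdf(w, alpha, w_star) / pareto_sf(prev, alpha, w_star)
        total += math.log(exit_rate * density) - (delta + lam1 * sf) * t
        prev = w
    return total


wages, durations = [1.2, 1.5, 2.4], [0.8, 2.0, 3.5]
args = dict(lam1=0.3, delta=0.1, alpha=3.0, w_star=1.0)
a = cycle_loglik(wages, durations, **args)
b = cycle_loglik_stepwise(wages, durations, **args)
print(abs(a - b) < 1e-9)  # the two constructions agree up to rounding
```

Summing `cycle_loglik` over cycles and passing the negative of that sum to a numerical optimizer would give maximum likelihood estimates under the assumed functional form; that step is omitted here.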
We can easily amend the model above to allow for heterogeneity in ability across workers, a
feature most economists would agree is important for explaining actual data. Suppose the offers
$\{X_m\}_{m=1}^{M}$ reflect price offers of how much a firm is willing to pay per unit of effective labor rather
than wage offers, and that workers vary in the amount of effective labor they can supply.5 All
workers, regardless of their productivity, face the same distribution of prices and the same
rates $\lambda_1$ and $\delta$. Let $\ell_{it}$ denote the amount of effective labor per hour that worker $i$ can supply at date
$t$, and let $\widehat{W}_{it}^{n}$ denote the wage we observe for worker $i$ at date $t$ on the $n$-th job in his employment
cycle. Then the wages we would observe for worker $i$ on his various jobs in an employment cycle
correspond to

$$\widehat{W}_{it}^{n} = W_n \ell_{it} \tag{2.7}$$

A virtue of maximum likelihood estimation is that it does not require worker ability to be observable.
For example, if we assume that there is a finite number of productivity levels $\ell_{it}$ a worker
could operate at, we can add these values and their respective frequency in the population as
parameters we need to estimate.6

The main drawback of the approach outlined above is that the choice of the offer distribution
$F(\cdot\,; \xi)$ is arbitrary. Since the answers to the questions raised in the Introduction often hinge on the
exact shape of the relevant distribution, assuming rather than estimating the shape of this function
runs counter to the original spirit of this undertaking. Although we can conceptually overcome
this concern by allowing for a sufficiently large parametric class, in practice the papers cited in
the Introduction focus on mutually exclusive parametric families; the wage offer distribution is
sometimes assumed to be lognormal, sometimes Pareto, sometimes a power distribution, and so
on. At the very least, we would like some procedure to determine what shape restrictions on the
relevant distributions are valid before we carry out maximum likelihood.

5 If the production of output is linear in effective labor, this is equivalent to assuming firms post piece rates.

6 To ensure workers of different ability choose the same cutoff price $W^*$ in the distribution of prices, we would
have to further assume that the value of leisure is proportional to productivity, i.e. $b_{it} = b \ell_{it}$. This interpretation
could reflect the fact that the value of leisure is really the value of home production, and individuals are just as
productive at home as they are in the market sector.
In recent work, Bontemps, Robin, and van den Berg (2000) attempt to do just that by devising
a non-parametric estimator for the wage offer distribution. They suggest the following two-step
procedure in the case where there is no unobserved ability. First, since the first job in the cycle
is a random draw from $F(\cdot)$, they advocate using the empirical distribution of wages across all
workers on their first job to estimate $F(\cdot)$. Next, given this estimate of the offer distribution,
they use data on job duration to identify $\lambda_1$ and $\delta$. Recall from (2.4) that the duration of a job
is distributed as an exponential with rate $\delta + \lambda_1 \bar{F}(W)$. One can therefore identify $\lambda_1$ and $\delta$ from
the way the duration of a job $t$ varies with the wage paid on the job $W$. Formally, they choose
$\lambda_1$ and $\delta$ to maximize the likelihood in (2.6) where they substitute in the estimate for $F(\cdot)$ from
the first stage.7 This approach has the virtue that it does not impose any shape restrictions on
$F(\cdot)$, or alternatively on the distribution of productivity across firms that gives rise to this distribution.
However, this approach cannot accommodate unobserved differences in worker ability. For suppose
once again that the wages we observe for worker $i$ on his various jobs over the cycle are given by

$$\widehat{W}_{it}^{n} = W_n \ell_{it} \tag{2.8}$$

where $\ell_{it}$ denotes the productivity of worker $i$ at date $t$. The empirical distribution of wages on the
first job workers accept out of unemployment, $\widehat{W}_{it}^{1}$, is a convolution of the true offer distribution
$F(\cdot)$ and the distribution of $\ell_{it}$ across workers. Improperly attributing dispersion in $\ell_{it}$ to dispersion
in offers would overstate the dispersion of the true offer distribution $F(\cdot)$, and this misspecification
could then further contaminate estimates for $\lambda_1$ and $\delta$.
While we could in principle try to estimate $\ell_{it}$ from observed worker characteristics, e.g. age
and education, it is unlikely that we could ever capture all of the unobserved variation in wages.
In particular, an important feature of the data is that a non-negligible fraction of workers who
change jobs voluntarily report a lower wage on their new job than on their previous job. Thus, we
need to allow for the possibility that $\widehat{W}_{it}^{n+1} < \widehat{W}_{it}^{n}$ even when $W_{n+1} \ge W_n$, and it is hard to come up with
observable characteristics that can account for such fluctuations in individual earnings over time.
Bontemps, Robin, and van den Berg (2000) avoid this issue by only using wage data for the first
job they observe for a worker, so they never have to consider wage changes across jobs. But both
van den Berg and Ridder (1998) and Flinn (2002), who use multiple jobs from each employment
cycle, need to introduce an $\ell_{it}$ term to reconcile the model with the data. They interpret $\ell_{it}$
as measurement error in wages, but as the discussion above suggests it could just as well reflect
variability in a worker’s productivity over time. Either way, unless we knew the distribution of $\ell_{it}$,
we could not recover the distribution of $W_n$ from the distribution of observed wages $\widehat{W}_{it}^{n}$.

7 Since Bontemps et al. cannot observe when a worker began his employment cycle in their data, they cannot
isolate workers who are on their first job. Instead, they assume that any given worker in their sample is drawn
from the steady-state distribution of wages across all workers, denoted by $G(\cdot)$. The distribution $G(\cdot)$ can then be
directly related to $F(\cdot)$ given a ratio $\lambda_1/\delta$. They estimate $G(\cdot)$ non-parametrically, and then maximize a variant
of (2.6) in which $f(\cdot)$ is replaced with the appropriate expression in terms of $g(\cdot)$ and the ratio $\lambda_1/\delta$. They then
choose $\lambda_1$ and $\delta$ to maximize the implied likelihood function.


2.3. Non-Parametric Estimation using Record Moments
I now propose a way to estimate the wage offer distribution non-parametrically even in the presence
of unobserved heterogeneity $\ell_{it}$. My approach exploits the implicit record structure of the model.
As noted briefly in Section 1, given an i.i.d. sequence $\{X_m\}_{m=1}^{\infty}$, we can sometimes characterize
the distribution of $X_m$ from observations on the records $\{R_n\}_{n=1}^{\infty}$ in the sequence $\{X_m\}_{m=1}^{\infty}$. For
example, Kirmani and Beg (1984) show that the list of record moments $\{E(R_n)\}_{n=1}^{\infty}$, assuming
these moments exist, uniquely characterizes the distribution of $X_m$ within the set of continuous
distributions. To appreciate why this result might be useful, consider the case where $\ell_{it}$ reflects
pure measurement error, i.e. $\ell_{it}$ is i.i.d. across workers and over time and is independent of $W_n$.
In this case, the average wage across all workers on their $n$-th job converges to

$$E(\widehat{W}_{it}^{n}) = E(W_n) E(\ell_{it}) = E(W_n) \times \text{constant}$$

Thus, we can empirically estimate the sequence $\{E(W_n)\}_{n=1}^{\infty}$ up to a scalar. Since $\{W_n\}$ corresponds
to the list of records in a sequence of i.i.d. draws from the underlying wage offer distribution,
the list $\{E(W_n)\}_{n=1}^{\infty}$ should allow us to identify the underlying wage offer distribution without
having to characterize the distribution of $\ell_{it}$. The approach outlined below adapts this argument
for richer specifications of $\ell_{it}$ that capture differences in productivity as well as just pure measurement
error. Note that this approach is conceptually different from the approaches above, since
it relies on different data to identify the fundamental parameters of the model. Whereas the approaches
above rely on wage and duration data, i.e. on $\{\widehat{W}_{it}^{n}, t_n\}_{n=1}^{N}$, my approach relies on wage
data and the position in the cycle of each job, i.e. $\{\widehat{W}_{it}^{n}, n\}_{n=1}^{N}$.

While the discussion above conveys the basic intuition, it is also somewhat imprecise. Kirmani
and Beg (1984) assume records come from an infinite sequence $\{X_m\}_{m=1}^{\infty}$, whereas in the model
wages correspond to records from a sequence $\{X_m\}_{m=1}^{M}$ in which M is a random variable. Why
is this relevant? Since the occurrence of a record is a random event, the expected value of the
n-th record is conditional on the event that there are at least n records in the original sequence,
and should more properly be denoted as $E(R_n \mid N \geq n)$ where N denotes the total number of
records in the sequence $\{X_m\}$. If the distribution for each $X_m$ is continuous, the infinite sequence
$\{X_m\}_{m=1}^{\infty}$ will contain infinitely many records almost surely, i.e. Prob(N ≥ n) = 1 for any finite
n, and it is common to simply omit the conditioning event and write $E(R_n)$ for $E(R_n \mid N \geq n)$.
But when we only get to observe $\{X_m\}_{m=1}^{M}$ where M can be finite, it will no longer be the case
that every sequence we observe will contain n records. Thus, when we compute an average value
for the n-th record $R_n$, we can only average over those employment cycles in which the worker
makes it to his n-th job. But this is hardly a random sample; quite to the contrary, on those cycles
where the worker makes it to his n-th job before being laid off it is more likely that the first few
offers the worker received were low enough that the worker could still find a better offer before
being laid off. As a result, the moments $E(R_n \mid N \geq n)$ for records drawn from the sequence
$\{X_m\}_{m=1}^{M}$ will typically be smaller than the corresponding moments $E(R_n \mid N \geq n)$ from the
infinite sequence $\{X_m\}_{m=1}^{\infty}$. We will therefore not be able to rely on Kirmani and Beg's results,
and must independently confirm that record moments for records from a sequence of random length
still uniquely identify the parent distribution of the individual observations.
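
The selection effect described above can be illustrated with a small simulation. The sketch below is not part of the estimation procedure; it assumes a unit exponential parent distribution (for which the n-th record from an infinite sequence has mean n) and an arbitrary stopping probability p = 0.5. The function name `mean_records` and all parameter values are hypothetical choices made for illustration.

```python
import random

def mean_records(p, max_n=3, cycles=200_000, seed=0):
    """Estimate E(R_n | N >= n) when the number of draws M is geometric.

    Illustrative assumptions: draws are unit exponential, and after each
    draw the sequence ends with probability p, so Prob(M = m) = q^(m-1) p.
    """
    rng = random.Random(seed)
    sums = [0.0] * max_n
    counts = [0] * max_n
    for _ in range(cycles):
        records = []
        while True:
            x = rng.expovariate(1.0)
            if not records or x > records[-1]:
                records.append(x)        # a new record is set
            if rng.random() < p:         # the sequence ends after this draw
                break
        # Only sequences with at least n records contribute to the n-th mean.
        for n, r in enumerate(records[:max_n]):
            sums[n] += r
            counts[n] += 1
    return [s / c for s, c in zip(sums, counts)]

means = mean_records(p=0.5)
for n, m in enumerate(means, start=1):
    print(f"n={n}: simulated E(R_n | N >= n) = {m:.3f}; infinite-sequence value = {n}")
```

Under these assumptions the first record mean matches the parent mean, while the simulated means for n = 2 and n = 3 fall below the infinite-sequence values, in line with the argument in the text.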

I begin by laying out my assumptions on $\ell_{it}$. My key assumption is that $\ell_{it}$ is independent of
both the job worker i is employed on at date t and the price per unit labor that it pays. This
assumption preserves the basic search structure of the model, since a worker should seek out the
jobs offering the highest price per unit labor. The prices per unit labor on the jobs the worker
is employed on $\{W_n\}_{n=1}^{N}$ thus still correspond to record values among the price offers $\{X_m\}_{m=1}^{M}$
the worker encounters over the cycle. Note that this assumption rules out match-specific wage
dynamics, since any change in the wage a worker earns on his current job must also affect the
wages he would earn on all other jobs. This is hardly innocuous; for example, my framework does
not allow for the possibility that a worker becomes more skilled at only one particular job, i.e. any
human capital a worker accumulates must be general in nature. Nevertheless, in the next section
I argue that this assumption is reasonable for the sample of workers I consider.
Following Flinn (1986), I proceed to impose the following form for $\ell_{it}$:
$$\ell_{it} = \exp(\phi_i + \beta Z_{it} + \varepsilon_{it})$$
The first term, φi , is fixed over time and serves to capture variations in innate ability that make
some workers consistently more productive than others, all other things equal. The next term,
Zit , represents observable time-varying characteristics for individual i that aﬀect his productivity.
In my application, this will correspond to the time since the worker first entered the market,
i.e. the worker’s potential experience. Lastly, εit denotes unobserved productivity shocks. It
serves to capture a combination of any shocks to productivity not accounted for by Zit as well
as multiplicative measurement error that may appear in reported wages. I assume εit follows a
stochastic process with the sole restriction that
$$E(\Delta\varepsilon_{it}) = 0 \tag{2.9}$$

i.e. the unconditional mean change in εit is zero. Note that (2.9) does not restrict the shape of
the distribution of εit or its autocorrelation; in particular, my approach allows for the presence of
serial correlation in earnings over the duration of a job.
It will be easier to work from now on with log wages than with wages. Let $w_n = \ln W_n$ denote
the log price per unit labor on the worker's n-th job. It follows that $w_n$ represents the n-th record
in the sequence of log price offers $\{x_m\}_{m=1}^{M}$ where $x_m = \ln X_m$, so we can still hope to exploit the
record structure of $w_n$ to identify the distribution of $x_m$ (from which we can deduce the distribution
of $X_m$). After substituting in for $\ell_{it}$, we obtain the following equation for the log wage:
$$\ln \widehat{W}^n_{it} = w_n + \phi_i + \beta Z_{it} + \varepsilon_{it} \tag{2.10}$$
We next first-difference equation (2.10) to get rid of the fixed effect term $\phi_i$. Let ∆ denote the
difference in a particular variable between two distinct points in time. Then we have
$$\Delta \ln \widehat{W}_{it} = \Delta w + \beta \Delta Z_{it} + \Delta\varepsilon_{it} \tag{2.11}$$
For a worker who is employed on the same job at these two points in time, $\Delta w = 0$, implying wage
growth on the job is given by
$$\Delta \ln \widehat{W}_{it} = \beta \Delta Z_{it} + \Delta\varepsilon_{it} \tag{2.12}$$
It follows that we can use ordinary least squares on (2.12) to estimate β, i.e. to estimate the
contribution of observable characteristics to productivity growth. This estimate will be important
in what follows.
Next, consider the wage growth of workers across jobs. Using our estimate for β, we can net out
the role of observable productivity growth in these cases. Thus, for a worker who moves from his
n-th job to his (n+1)-th job, the net wage gain from changing jobs is given by
$$\Delta \ln \widehat{W}_{it} - \beta \Delta Z_{it} = (w_{n+1} - w_n) + \Delta\varepsilon_{it} \tag{2.13}$$

In other words, the net wage gain for a voluntary job changer who leaves his n-th job is the sum
of the gap between the n-th record and the (n+1)-th record from a sequence of i.i.d. draws from
the log price offer distribution and a noise term $\Delta\varepsilon_{it}$. Averaging these net wage gains across all
such workers, we have
$$E(\Delta \ln \widehat{W}_{it} - \beta \Delta Z_{it} \mid N > n) = E(w_{n+1} - w_n \mid N > n) + E(\Delta\varepsilon_{it} \mid N > n) = E(w_{n+1} - w_n \mid N > n)$$
where we use the fact that $\Delta\varepsilon_{it}$ is independent of N, so that by (2.9) its conditional mean is zero.
Thus, using observations on wage growth $\Delta \ln \widehat{W}_{it}$ both within jobs and across jobs, we can estimate
the expected record gaps
$$\{E(R_{n+1} - R_n \mid N > n)\}_{n=1}^{\infty} \tag{2.14}$$
for records that are drawn from the sequence $\{x_m\}_{m=1}^{M}$ where the $x_m$ are i.i.d. random variables
with the same distribution as the log wage oﬀer distribution. As anticipated by the previous
discussion, these record moments may be enough to identify the parent distribution of each xm . In
the next subsection, I show that the sequence of moments in (2.14) is indeed enough to characterize
the shape of the parent distribution of xm within the set of continuous distribution functions, and
show how to recover the parent distribution from this list.
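
The two-step calculation in (2.12) through (2.14) can be sketched in a few lines. This is a minimal illustration rather than the estimation code used in the paper: it assumes a scalar $\Delta Z_{it}$ and a regression through the origin, and the function name `estimate_record_gaps`, the data layout, and the toy numbers are all hypothetical.

```python
from collections import defaultdict

def estimate_record_gaps(stayers, movers):
    """Sketch of (2.12)-(2.14) with a scalar Z (a simplifying assumption).

    stayers: list of (dlnw, dz) for workers on the same job at both dates.
    movers:  list of (n, dlnw, dz) for voluntary moves from job n to job n+1.
    Returns the OLS estimate of beta and the mean net wage gain for each n.
    """
    # Step 1: OLS of within-job wage growth on dz without intercept, as in (2.12).
    beta = sum(dz * dy for dy, dz in stayers) / sum(dz * dz for _, dz in stayers)
    # Step 2: net out beta * dz for job changers and average by n, as in (2.13).
    total, count = defaultdict(float), defaultdict(int)
    for n, dy, dz in movers:
        total[n] += dy - beta * dz
        count[n] += 1
    gaps = {n: total[n] / count[n] for n in sorted(count)}
    return beta, gaps

# Made-up numbers, purely to show the mechanics.
stayers = [(0.05, 1.0), (0.07, 1.0), (0.12, 2.0)]
movers = [(1, 0.30, 1.0), (1, 0.20, 1.0), (2, 0.16, 1.0)]
beta_hat, gap_hat = estimate_record_gaps(stayers, movers)
print(beta_hat, gap_hat)
```

With a vector of observables, step 1 would become a multivariate regression, but the logic of netting out observable growth before averaging by n is unchanged.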

2.4. Recovering the Distribution from Record Moments
I begin by establishing that the sequence of expected record gaps in (2.14) uniquely identifies
the shape of the parent distribution. First, though, I need to characterize the distribution of
the number of oﬀers M in each employment cycle. The proof of the next lemma, as well as all
remaining propositions, are contained in an appendix.
Lemma 1: The unconditional number of offers on a cycle M is distributed as a geometric, i.e.
$$\text{Prob}(M = m) = q^{m-1} p$$
where $q = \dfrac{\lambda_1}{\lambda_1 + \delta}$ and $p = \dfrac{\delta}{\lambda_1 + \delta}$.

The next lemma provides a suﬃcient condition for the moments in (2.14) to be well-defined:
Lemma 2: Consider a sequence of i.i.d. random variables $\{X_m\}_{m=1}^{M}$ where M is independent of
the realizations of $\{X_m\}$ and Prob(M = m) = $q^{m-1} p$, for p + q = 1. Let $\{R_n\}_{n=1}^{N}$ denote the records
of this sequence. If $E(|X_m|) < \infty$, then the conditional expectation $E(R_{n+1} - R_n \mid N > n)$ is
finite for n = 1, 2, 3, ...

Assuming the moments in (2.14) exist, we have the following result:8
Proposition 1: Consider a sequence of i.i.d. random variables $\{X_m\}_{m=1}^{M}$ where Prob(M = m) =
$q^{m-1} p$. If $E(|X_m|) < \infty$, the sequence
$$\{E(R_{n+1} - R_n \mid N > n)\}_{n=1}^{\infty}$$
uniquely characterizes the distribution of $X_m$ in the set of continuous distributions up to a
location shift. That is, if $\widehat{F}(\cdot)$ is a continuous function that gives rise to the same sequence
$\{E(R_{n+1} - R_n \mid N > n)\}_{n=1}^{\infty}$ as $F(\cdot)$, then $\widehat{F}^{-1}(\cdot) = F^{-1}(\cdot) + \text{constant}$.

Remark: Following up on the present paper, Nagaraja and Barlevy (2003) analyze record
moments when the number of observations M is geometric in more detail. Interestingly, they show
that characterization results based on record moments from a geometric number of observations are
stronger than those that are based on record moments from an infinite number of observations, i.e.
moment sequences that are not enough to uniquely identify the parent distribution in the classical
model can identify the parent distribution when M is geometric.
To gain some insight as to why record moments allow us to identify the parent distribution,
consider the expression $E(X \mid X > x) - x$, i.e. the amount we expect to rise above a given
number x when we sample at random from the parent distribution truncated at x. The n-th
average record gap E (Rn+1 − Rn | N > n) is just a weighted average of all these expected gains,
where the weight on the expected gain over a particular x corresponds to the probability that the
n-th record is equal to x. At low values of n, more weight is put on the expected gain starting
at low values of x, while at higher values of n, more weight is put on the expected gain starting
at high values of x. Taking all of these moments together, we can essentially deduce how much
a worker would expect to gain from moving to a better job starting at any initial wage; for low
initial wages, these gains will depend more on the average record gap for low values of n, while for
high initial wages, these gains will depend on the average record gap for higher values of n. Given
two diﬀerent wage oﬀer distributions, the expected gain from mobility will have to diﬀer at some
starting wage, implying at least some of the expected record gaps will be diﬀerent.

8 I am indebted to H. N. Nagaraja for suggesting the proof of this proposition.

To make practical use of Proposition 1, we need to be able to invert the set of moments to
obtain the underlying parent distribution that gives rise to these moments. Since the moments in
(2.14) are all functions of the parameter p in Lemma 1, we first need to estimate this parameter.
Since p is just a function of the ratio $\lambda_1/\delta$, it would seem natural to appeal to existing methods
for estimating these parameters. Unfortunately, the way previous authors have estimated these
arrival rates is not applicable. Recall that previous work estimated these parameters using the fact
that job duration is exponential with rate δ + λ1 F (W ), so the way job duration varies with wages
allows us to identify these two parameters. However, this approach requires that we know the
shape of the function F(·), which we are still in the process of trying to determine. Fortunately,
p can be estimated directly from data on employment history. In particular, Bunge and Nagaraja
(1991) show that when M is geometric, the number of records N in an i.i.d. sequence $\{X_m\}_{m=1}^{M}$
will be distributed according to the truncated Poisson
$$\Pr(N = n) = \frac{p\,(-\ln p)^n}{q\,n!} \tag{2.15}$$

Since the number of records N on each employment cycle corresponds to the number of jobs a
worker was employed on before he was laid oﬀ, we can estimate p and q directly from mobility
data, without using information on job duration. Formally, we have
Proposition 2: Let $Q_n$ denote the number of employment cycles with exactly n records. A
consistent estimator for p is given by
$$\widehat{p} = \arg\max_{p} \prod_{n=1}^{\infty} \left[ \frac{p\,(-\ln p)^n}{(1-p)\,n!} \right]^{Q_n}$$
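
The estimator in Proposition 2 is straightforward to compute. The sketch below is one possible implementation, not the code used in the paper: writing $\mu = -\ln p$, the first-order condition of the likelihood reduces to equating the average number of records per cycle to $\mu / (1 - e^{-\mu})$, which is solved here by bisection. The function names and the illustrative counts are hypothetical.

```python
import math

def record_count_pmf(n, p):
    """Truncated Poisson probability Pr(N = n) from (2.15)."""
    return p * (-math.log(p)) ** n / ((1.0 - p) * math.factorial(n))

def estimate_p(Q):
    """Maximize the likelihood in Proposition 2 for counts Q = {n: Q_n}.

    With mu = -ln p, the first-order condition is
    mean(n) = mu / (1 - exp(-mu)); the left side is increasing in mu,
    so bisection applies. The bracket assumes mean(n) is below about 50.
    """
    cycles = sum(Q.values())
    n_bar = sum(n * c for n, c in Q.items()) / cycles
    if n_bar <= 1.0:
        return 1.0                      # every cycle has one record: boundary case
    lo, hi = 1e-12, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - math.exp(-mid)) < n_bar:
            lo = mid
        else:
            hi = mid
    return math.exp(-0.5 * (lo + hi))

# Hypothetical counts generated from (2.15) itself at p = 0.4, as a consistency check.
Q = {n: 1_000_000 * record_count_pmf(n, 0.4) for n in range(1, 30)}
p_hat = estimate_p(Q)
print(round(p_hat, 4))
```

Because the counts are built from the model at p = 0.4, the estimator should return a value very close to 0.4; with real data the counts $Q_n$ would come from the classified employment cycles.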
Armed with this estimate, I now tackle the problem of constructing a function from the list of
expected record gaps in (2.14). To do this, note from the proof of Proposition 1 in the Appendix
that the n-th element of the list $E(R_{n+1} - R_n \mid N > n)$ can be expressed as
$$E(R_{n+1} - R_n \mid N > n) = \frac{(-\ln p)^n}{(n-1)!\,\Pr(N > n)} \int_0^1 g(x)\,x^{n-1}\,dx \tag{2.16}$$

where g (x) is a function that depends on p and the parent distribution for each observation in
the sequence. To recover the parent distribution F (·), we proceed in two steps. First, we use the
moments E (Rn+1 − Rn | N > n) and our estimate for p to determine the function g (x). Second,
we invert the function g (x) to recover the distribution F (·).
To determine the function g(x), rewrite (2.16) as
$$\int_0^1 g(x)\,x^{n-1}\,dx = E(R_{n+1} - R_n \mid N > n)\,\frac{(n-1)!\,\Pr(N > n)}{(-\ln p)^n} \equiv \mu_{n-1} \tag{2.17}$$

Given estimates for p and E (Rn+1 − Rn | N > n) for all n, we can determine each of the µn−1 .
The task of finding a function g (x) that solves the infinite system of equations above is known
as the Hausdorﬀ moment problem. That is, in a variety of applications, we will often want to
find a function g (x) defined over the unit interval whose moments (i.e. integrals of the function
multiplied by diﬀerent powers of x) are equal to a particular sequence of values. Shohat and
Tamarkin (1943) oﬀer a set of suﬃcient conditions for this problem to have a solution (which
proves to be unique) and provide an analytical representation for this solution g (x) in terms of
µn . Essentially, we use the expressions for µn to construct coeﬃcients for an infinite polynomial
expansion, which allows us to reconstruct any continuous function. A little algebra then allows us
to recover the distribution function of interest F (·) from the function g (x) we just constructed.
This procedure is summarized in the next proposition:
Proposition 3: Given the complete sequence of moments $\{E(R_{n+1} - R_n \mid N > n)\}_{n=1}^{\infty}$, the
inverse parent distribution $F^{-1}(x)$ can be constructed according to the following procedure:
1. Let $\{P_n(x)\}_{n=0}^{\infty}$ denote the set of Legendre polynomials defined on (−1, 1). Define a new set
of polynomials $\{\widetilde{P}_n(x)\}_{n=0}^{\infty}$ on (0, 1) as
$$\widetilde{P}_n(x) = \frac{P_n(2x - 1)}{\int_0^1 P_n^2(2x - 1)\,dx} \equiv \sum_{j=0}^{n} c_{nj}\,x^j \tag{2.18}$$
2. Define a sequence $\{\mu_n\}_{n=0}^{\infty}$ where
$$\mu_{n-1} = E(R_{n+1} - R_n \mid N > n) \times \frac{(n-1)!\,\Pr(N > n)}{(-\ln p)^n}$$
Using the coefficients $c_{nj}$ from (2.18), construct a new sequence $\{\lambda_n\}_{n=0}^{\infty}$ where
$$\lambda_n = \sum_{j=0}^{n} c_{nj}\,\mu_j$$
Define a function g(x) over (0, 1) as the sum of polynomials $\widetilde{P}_n(x)$ with coefficients $\lambda_n$, i.e.
$$g(x) = \sum_{n=0}^{\infty} \lambda_n \widetilde{P}_n(x) \tag{2.19}$$
3. The inverse parent distribution function $F^{-1}(x)$ over (0, 1) can be constructed from g(x) as
follows:
$$F^{-1}(x) = \int_0^x \frac{q\,g'\!\left(\dfrac{\ln(1 - qz)}{\ln p}\right)}{(1 - z)(1 - qz)\ln p}\,dz + \text{constant}$$
where the constant of integration denotes the unidentified location parameter.
Remark: Bontemps, Robin, and van den Berg (2000) offer an explicit way to obtain the distribution
of productivity in the Burdett and Mortensen model given a wage offer distribution. The
inversion requires knowing the ratio $\lambda_1/\delta$. Since Proposition 2 establishes that we can identify p,
one can extend Proposition 3 to recover the implied distribution of productivities across firms from
the distribution F(·).
In practice, we can only estimate finitely many moments µn required to implement the method
laid out in Proposition 3, and even those moments are only imprecisely estimated due to sampling
error. Talenti (1987) examines a variant of the Hausdorﬀ moment problem with finitely many
moments measured with error. He suggests replacing the infinite sum in (2.19) with the finite sum
$$g(x) = \sum_{n=0}^{J} \lambda_n \widetilde{P}_n(x) \tag{2.20}$$
where $\widetilde{P}_n$ denotes the polynomials on (0, 1) defined in (2.18) and J denotes the number of moments
we can observe. Talenti shows that the problem is stable, in the sense that the inaccuracy from using
a finite set of moments is bounded by an expression that depends on the number of moments J and the
magnitude of the sampling error, and this bound converges to zero as the number of moments goes to
infinity and the error term goes to zero. Thus, the estimator
for F (·) in Proposition 3 is consistent as long as in the limit we can precisely estimate all of the
relevant moments. But with only a small number of moments, the approximation in (2.20) is likely
to be quite poor. Thus, while Proposition 3 yields a consistent non-parametric estimator for the
wage oﬀer distribution, this estimator may not be reliable if we can only estimate a small number
of moments in the list (2.14).
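
The truncated expansion in (2.20) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the code uses orthonormal shifted Legendre polynomials, i.e. it scales each polynomial by the square root of $2n + 1$, which is the normalization under which coefficients of the form $\lambda_n = \sum_j c_{nj}\mu_j$ reproduce g(x). That normalization, the function names, and the test function g(x) = x are assumptions made for this example.

```python
import math

def shifted_legendre_coeffs(n):
    """Power-series coefficients c_nj of the orthonormal shifted Legendre
    polynomial of degree n on (0, 1) (normalization assumed, see lead-in)."""
    scale = math.sqrt(2 * n + 1)
    return [scale * (-1) ** (n + j) * math.comb(n, j) * math.comb(n + j, j)
            for j in range(n + 1)]

def reconstruct_g(mu):
    """Build the finite expansion from the moments mu[0], ..., mu[J],
    where mu[j] is the integral of g(x) * x**j over (0, 1)."""
    J = len(mu) - 1
    coeffs = [shifted_legendre_coeffs(n) for n in range(J + 1)]
    # lambda_n = sum_j c_nj * mu_j
    lam = [sum(c * mu[j] for j, c in enumerate(cn)) for cn in coeffs]

    def g(x):
        return sum(lam[n] * sum(c * x ** j for j, c in enumerate(coeffs[n]))
                   for n in range(J + 1))
    return g

# Check on a known function: for g(x) = x the moments are 1 / (j + 2).
mu = [1.0 / (j + 2) for j in range(4)]
g_hat = reconstruct_g(mu)
print(g_hat(0.3), g_hat(0.9))
```

Since g(x) = x is itself a polynomial, four exact moments recover it up to rounding error. With noisy moments the alternating binomial coefficients grow quickly in n, which is one way to see why the approximation can be unreliable when only a few imprecisely estimated moments are available.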
While a precise estimate for F (·) may be hard to come by in practice, a small number of moments
may still suﬃce to check if a particular functional form is consistent with data. As an illustration,
Figure 1 displays the expected record gaps E (Rn+1 − Rn | N > n) for two diﬀerent log wage
oﬀer distributions, an exponential and a normal (which correspond to Pareto and lognormal wage
oﬀers, respectively, both of which have been suggested in the previous literature). The moments
are computed for a geometric M with the probability p that is estimated in Section 4, and both
distributions are normalized to yield the same average wage gain across voluntary job changers as
in the data when we use the empirical distribution of job changes across n. As Figure 1 reveals,
the two distributions can be easily distinguished from one another even with only a small number
of moments. In particular, the average net wage gain does not depend on n for the exponential
distribution, reflecting the memoryless property of this distribution, while the average wage gain
declines rapidly with n for the normal distribution. I will return to this observation below, where
I argue that we can in fact reject the null hypothesis that the wage oﬀer distribution is lognormal.
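
The contrast between the two distributions can be reproduced qualitatively by simulation. The sketch below is not the calculation behind Figure 1: it assumes a unit exponential and a standard normal for log offers, without the normalization to the average wage gain in the data, and it sets p = 0.48, the pooled estimate reported in Section 4. The function name `mean_record_gaps` and the simulation settings are hypothetical.

```python
import random
from collections import defaultdict

def mean_record_gaps(draw, p, cycles=200_000, seed=0):
    """Simulate E(R_{n+1} - R_n | N > n) when M is geometric with parameter p.

    draw: function taking a random.Random instance and returning one offer.
    After each offer the cycle ends with probability p, so Prob(M = m) = q^(m-1) p.
    """
    rng = random.Random(seed)
    sums, counts = defaultdict(float), defaultdict(int)
    for _ in range(cycles):
        records = []
        while True:
            x = draw(rng)
            if not records or x > records[-1]:
                records.append(x)
            if rng.random() < p:
                break
        # records[n] - records[n - 1] is the gap between record n+1 and record n.
        for n in range(1, len(records)):
            sums[n] += records[n] - records[n - 1]
            counts[n] += 1
    return {n: sums[n] / counts[n] for n in sorted(counts)}

expo = mean_record_gaps(lambda rng: rng.expovariate(1.0), p=0.48)
norm = mean_record_gaps(lambda rng: rng.gauss(0.0, 1.0), p=0.48)
for n in (1, 2, 3):
    print(f"n={n}: exponential gap = {expo[n]:.3f}, normal gap = {norm[n]:.3f}")
```

Under these assumptions the simulated gaps for the exponential stay close to one for every n, while the gaps for the normal fall as n rises.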
3. Data
To implement the above methodology, I need a dataset with detailed work-history data that can
be used to assign n for each job. Since job mobility is highest when workers first enter the labor
market, it also seems wise to focus on young workers. Not only will this allow for larger samples
of job changers, but the fact that young workers are so mobile makes them less likely to invest in
match-specific human capital, in line with my assumption that human capital is general. These
considerations lead me to data from the early part of the National Longitudinal Survey of Youth
(NLSY) dataset. The NLSY follows a single cohort of individuals who were between 14 and 22
years old in 1979. To avoid using observations where workers are already well established in
their careers, I only use data through 1993, at which point the oldest worker in the sample is 36.
Each year, respondents were asked questions about all jobs they worked on since their previous
interview, including starting and stopping dates, the wage paid, and the reason for leaving the job.
To mitigate the influence of mobility due to non-wage considerations, e.g. pregnancy or child-care,
I restrict attention to male workers.
Most of the variables that I use are standard. For the wage, I use the hourly wage as reported by
the worker for each job, divided by the GDP deflator (with base year 1992). I also experimented
with the CPI, but the results were similar. To minimize the eﬀect of extreme outliers on means, I
removed observations for which the reported hourly wage was less than or equal to $0.10 or greater
than or equal to $1000. This eliminated 0.1% of all wage observations. Many of these outliers
appear to be coding errors, since these wages appear to be out of line with what these same workers
report at other times, including on the same job. For my measure of potential experience, I follow
previous literature in dating entry into the labor market at the worker’s birthyear plus 6 plus his
reported years of schooling (highest grade completed). However, if an individual reported working
on a job prior to that year, I date his entry at the year in which he reports his first job. Table 1
provides summary statistics for the original sample of all jobs.
The more unconventional variable in my analysis is the position n each job represents in its
respective cycle. To construct this statistic, we first need to distinguish voluntary job-to-job changes from involuntary job changes in order to delineate employment cycles. One approach
is to use individual self-reports on the reason they left each of their jobs, i.e. whether they quit
voluntarily or were laid oﬀ. Alternatively, we can use the time lapsed between jobs to gauge
whether a move was voluntary or not, since a voluntary job changer would immediately move
into a new job while a worker who changed jobs involuntarily would spend some time unemployed.
According to the model, these approaches should coincide. But in practice, the two agree only 60%
of the time. More precisely, it is true that the vast majority of workers who report an involuntary
job loss spend at least one week unemployed. Moreover, the majority of workers who do not spend
any time unemployed between jobs indeed report leaving their previous job voluntarily. The main
discrepancy is that nearly half of all workers who report leaving their job voluntarily do not start
their next job until weeks or even months later. Although some of these cases are probably due
to planned delays, it seems as if workers often report leaving a job voluntarily without having
another job already lined up. This could be because workers are embarrassed to admit they were
laid oﬀ, or because they decided to quit for reasons that are not captured by the model. I assume
these reasons are independent of the wage, and so can be considered as an involuntary job loss
in the model, i.e. workers who quit without lining up another job must resume searching from
scratch. Thus, if a worker reports leaving a job voluntarily, I classify that job as having ended
involuntarily if his next job began more than two months (8 weeks) after his previous job ended.
But if a worker reports leaving a job involuntarily, I continue to classify the job as having ended
involuntarily regardless of how long it took him to start a new job. If the worker oﬀers no reason
for leaving his job, I classify his job change as voluntary if he starts his next job immediately and
involuntary if he starts it over two months later, but otherwise do not classify the job.9
With this classification complete, we can assign n as follows. After the first involuntary job
change, we set n = 1. From then on, we either increment n by 1 if the worker leaves his job
voluntarily, or reset n to 1 if he did so involuntarily. One complication is that a non-trivial
fraction of workers simultaneously hold more than one job. To deal with this, I draw on Paxson
and Sicherman (1996), who argue that the primary reason workers hold multiple jobs is that they
are constrained to work a maximum number of hours on each job. Suppose then that workers
are constrained and can work on only one job full time. However, workers can receive additional
draws from the distribution F (·) and work on those jobs on a part time basis. If we observe a
worker employed in job A take on a second job B, we treat job B as a second draw from F (·) that
is available for part-time work. If the worker leaves job B before he leaves his original job A, job
B provides us with no information on the price of labor on job A. I therefore ignore job B in my
analysis, i.e. it is as if the worker had never taken on a secondary job. Alternatively, if the worker
leaves job A and remains in job B, then a full-time position must have opened up on job B. Since
the wages on these jobs are assumed to be drawn from the same distribution F (·), we can treat it
in the same way as a new job that started only after job A ended.
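
The classification rule and the assignment of n described above can be sketched as follows. This is a simplified illustration, not the code used to build the sample: it ignores secondary jobs, it interprets "immediately" as a gap of zero weeks, and it leaves n unassigned after an unclassified job change until the next involuntary change. Those choices, along with the function names, are assumptions made for this example.

```python
def classify_exit(reported, gap_weeks):
    """Classify how a job ended.

    reported:  "voluntary", "involuntary", or None if no reason was given.
    gap_weeks: weeks between the end of this job and the start of the next.
    """
    if reported == "involuntary":
        return "involuntary"
    if reported == "voluntary":
        # A reported quit followed by more than 8 weeks without a job
        # is treated as an involuntary separation.
        return "involuntary" if gap_weeks > 8 else "voluntary"
    # No reason given: classify only the clear-cut cases.
    if gap_weeks == 0:            # assumption: "immediately" means no gap
        return "voluntary"
    if gap_weeks > 8:
        return "involuntary"
    return None

def assign_positions(exits):
    """Return the position n of each job in its employment cycle.

    exits[k] is the classified exit from job k. The result has one more
    entry than exits, because the last job listed has not ended yet.
    """
    positions = [None]                 # n is unknown before the first involuntary change
    for exit_type in exits:
        prev = positions[-1]
        if exit_type == "involuntary":
            positions.append(1)        # a new cycle begins
        elif exit_type == "voluntary" and prev is not None:
            positions.append(prev + 1)
        else:
            positions.append(None)     # assumption: wait for the next involuntary change
    return positions

exits = ["voluntary", "involuntary", "voluntary", "voluntary",
         "involuntary", None, "involuntary"]
print(assign_positions(exits))
```

In this example the first two jobs cannot be assigned a position, the next three form a cycle with n = 1, 2, 3, and each later involuntary change starts a new cycle at n = 1.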
9 I experimented with cutoffs other than two months. These had very little impact on the first few record moments
(i.e. n = 1, 2, and 3) that can be estimated quite precisely, although they did affect the less precisely estimated
moments for higher values of n.

Out of the 52,827 distinct jobs in my original sample, the procedure above throws out 8,234 as
secondary jobs that the worker was always employed on in addition to some other job. As a check,
we can use the fact that the NLSY asks workers to rank their jobs each year in terms of which
is their primary job. Of the 8,234 jobs I identify as secondary jobs, 72% are never ranked as the
primary job in any year, and only 9% are ranked as the primary job each year the job is reported.
Figure 2 reports the distribution of n across the remaining 44,593 jobs. Figure 2a shows the
fraction of all jobs each year for which a value for n cannot be assigned. Since we can only assign n
following the first involuntary job change, we will have to wait some time before we can assign n for
any one worker. Thus, in the first few years of the sample, we can assign n to only a small fraction
of jobs. However, by 1993, n was assigned to 87% of all the jobs respondents reported working on
in that year. Figure 2b shows the distribution of n where a value for n could be assigned. Not
surprisingly, most jobs early on in the sample that can be classified are associated with n = 1. But
over time, a larger share of workers is observed on higher levels of n. The distribution of n appears
to settle down after about 10 years, with roughly half of all jobs associated with n = 1, a quarter
with n = 2, 12% with n = 3, 6% with n = 4, and 3% with n = 5. Note that very high values of
n are uncommon, in line with the known result that records from a sequence of i.i.d. draws are
relatively rare.
Before I use this data to estimate the average wage gain from leaving the n-th job net of growth in
observable characteristics ∆Zit , a few issues remain to be settled. First, we need to decide the
horizon at which to compute the diﬀerences in equation (2.13). Since the NLSY provides one wage
on each job per interview, we can only measure within-job wage growth at one year diﬀerences.
However, when Topel and Ward (1992) study a similar sample of young workers using quarterly
data, they report a “strong tendency for within-job earnings changes to occur at annual intervals.”
Thus, it seems that little is lost by focusing on annual wage growth. Since my estimates involve the
diﬀerence between wage growth across jobs and within jobs, consistency would suggest restricting
attention to wage growth across jobs that is also computed at one year horizons. To ensure this,
I only use wage data for jobs the worker reported working on within two weeks of the interview
date, which is carried out on a yearly basis. My approach thus abstracts from wage growth for
jobs that fall between interviews. Although I ignore intervening jobs over the year in estimating
average wage growth, I do use these jobs for constructing n. My ultimate sample consists of 40,370
observations in which a wage is reported in both the current year and previous year. Of these,
28,015 observations involve the same job in both the current year and the previous year, and 12,355
observations involve a change in jobs between the previous interview and the current one.

Next, I need to specify the vector of observable characteristics $Z_{it}$. I assume $Z_{it}$ is quadratic in
potential experience, i.e.
$$Z_{it} = \beta_1 EXP_{it} + \beta_2 EXP_{it}^2 \tag{3.1}$$
Since at annual horizons $EXP_{it} = EXP_{i,t-1} + 1$, it follows that
$$\Delta Z_{it} = \beta_1 + \beta_2 (2 EXP_{it} - 1)$$
As noted above, the fact that (3.1) depends on the time the worker spent on all jobs rather than on
any one job rules out the possibility of match-specific human capital. To gauge the plausibility of
this assumption, let $TEN_{it}$ denote the tenure of worker i on the job he holds at date t. Suppose
we amended (3.1) to include $TEN_{it}$, i.e.
$$Z_{it} = \beta_1 EXP_{it} + \beta_2 EXP_{it}^2 + \gamma\,TEN_{it} \tag{3.2}$$
Under this alternative specification, a part of the wage growth the worker achieves on a given job
would disappear if he were to move to a new job where $TEN_{it} = 0$. This would invalidate my analysis,
since a worker would no longer find it optimal to accept an oﬀer if and only if it pays a higher
price per unit labor than his previous job. Thus, it is important to confirm that the coeﬃcient γ
in (3.2) is negligible for my sample.
In what follows, I use the approach advocated by Topel (1991) for estimating γ. This approach
tends to produce the largest estimates for returns to tenure in other samples, so finding small
returns to tenure using this methodology would be more compelling. Topel’s approach uses the
fact that $EXP_{it} = EXP0_{it} + TEN_{it}$, where $EXP0_{it}$ is the experience of the worker when he first
started working on the job he holds at date t. Thus, the observed log wage can be written as
$$\ln \widehat{W}^n_{it} = w_n + \phi_i + \beta_1 EXP0_{it} + \beta_2 EXP_{it}^2 + (\beta_1 + \gamma)\,TEN_{it} + \varepsilon_{it} \tag{3.3}$$
To estimate γ, we use the following two-step procedure. First, wage growth over a one-year interval
on a given job will equal
$$\Delta \ln \widehat{W}_{it} = (\beta_1 + \gamma) + \beta_2 (2 EXP_{it} - 1) + \Delta\varepsilon_{it}$$
Hence, we can estimate $(\beta_1 + \gamma)$ and $\beta_2$ by ordinary least squares. Next, we use these estimates
to construct the difference
$$\ln \widehat{W}^n_{it} - (\beta_1 + \gamma)\,TEN_{it} - \beta_2 EXP_{it}^2 = w_n + \phi_i + \beta_1 EXP0_{it} + \varepsilon_{it}$$
We then regress this difference on $EXP0_{it}$ and individual fixed effects to estimate $\beta_1$ and $\phi_i$. The
estimate for γ is the difference between the estimates for $\beta_1 + \gamma$ and $\beta_1$. Table 2 reports the
results of this two-step procedure. The point estimates for $\beta_1 + \gamma$ and $\beta_1$ are 0.0794 and 0.0740,
respectively, implying γ = 0.0054. The implied point estimate for γ is significantly diﬀerent from
zero at the 5% level, but its magnitude is so small relative to the wage gains from job changing I
estimate in the next section that it seems safe to ignore it.10
In comparing my results to those of Topel (1991), the returns to experience in the two papers
are quite consistent. In fact, his point estimate for β 1 of 0.0713 is quite close to mine, implying
that returns to experience will be nearly identical at short horizons. The main diﬀerence is that
wage growth on the job is considerably smaller in my sample than in Topel’s sample. In particular,
he estimates β 1 + γ at 0.1258, compared to my estimate of 0.0794. Thus, wages in my sample
appear to grow on-the-job at nearly the same rate as wages rise with initial experience across jobs,
while wages in his sample grow on-the-job by much more than wages rise with initial experience.
This finding appears to be robust to variations in the functional form for the returns to tenure.
The bottom panel of Table 2 allows for tenure to enter as a quadratic, i.e. $\gamma_1 TEN_{it} + \gamma_2 TEN_{it}^2$.
The estimated returns remain small; although not reported, returns to tenure attain a maximum
of only 0.0433 log points at 5 years. The fact that returns to tenure are so much smaller in my
sample may reflect the fact that young workers have little incentive to invest in match-specific
skills given their high degrees of mobility, in contrast with older workers in Topel’s sample for
whom this incentive is presumably greater.

4. Empirical Results
In implementing the methodology above, we should keep in mind that diﬀerent worker groups, e.g.
high school graduates and college graduates, might sample from diﬀerent oﬀer distributions and
draw oﬀers at diﬀerent rates. We could carry out the estimation separately for each such group,
but since the number of observations in my sample is already relatively small, I resort to grouping
together all workers and assuming they all face identical search problems.
Since relating the moment sequence {E (Rn+1 − Rn | N > n)}∞
n=1 back to an underlying parent

distribution requires knowing the parameter p = (1 + λ1 /δ)−1 , I begin with estimates for this
10
As noted in Topel (1991), this two-step procedure is likely to overstate the true $\beta_1$, since initial experience
$EXP^0_{it}$ is likely to be positively correlated with $n$ and hence $w_n$. As a result, the estimated returns to tenure should
really be viewed as a lower bound for the true $\gamma$. However, with involuntary job changes, $EXP^0_{it}$ is likely to be
weakly correlated with $n$, since workers with high experience can still be on the first job in their employment cycle.
Hence, the lower bound we estimate is likely to be close to the true estimate. Indeed, when Altonji and Williams
(1998) estimate returns to tenure in the same NLSY sample using an alternative instrumental variables approach,
they find returns to tenure that are only slightly larger than those reported here.

parameter. Recall that Proposition 2 oﬀers a maximum likelihood estimator for p based on the
distribution of n across jobs that ended in an involuntary job change. Of the 44,593 jobs in
my sample, 22,135 are classified as ending involuntarily. Among these, the distribution of n is
heavily skewed towards n = 1. This would suggest a relatively high probability of involuntary job
termination, i.e. a relatively large value of p. The exact point estimates are reported in Table
3. Grouping all workers together, p is estimated at 0.48, implying that the ratio λ1 /δ ≈ 1. To
check whether grouping workers together overlooks important diﬀerences in p across identifiable
subgroups, I also estimated p separately for diﬀerent education groups. The point estimates do
not seem to diﬀer much from one another, confirming a similar result in van den Berg and Ridder
(1998). The implied ratio for λ1 /δ of 1 is smaller than the ratio of 10 reported in several previous
papers using job duration data as opposed to employment history data (including some that use
the same NLSY dataset). However, not all papers find this ratio to be as large. In particular,
Bowlus, Kiefer, and Neumann (2001), who use job duration data from the NLSY to estimate
these parameters, also report a ratio $\lambda_1/\delta \approx 1$.
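
The mapping from $p$ to the implied ratio follows from inverting $p = (1 + \lambda_1/\delta)^{-1}$, which gives $\lambda_1/\delta = 1/p - 1$. As a quick check of this arithmetic, here is a minimal Python sketch applied to the point estimates by education group in Table 3. The function name `implied_offer_ratio` is introduced here purely for illustration and is not part of the estimation procedure.

```python
def implied_offer_ratio(p: float) -> float:
    """Invert p = 1 / (1 + lambda_1/delta) to recover lambda_1/delta."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return 1.0 / p - 1.0


# Point estimates of p by education group, as reported in Table 3.
table3_p = {
    "Educ < 12": 0.5008,
    "Educ = 12": 0.4797,
    "Educ in (13,15)": 0.4504,
    "Educ > 16": 0.5049,
}

for group, p in table3_p.items():
    print(f"{group:>16}: lambda_1/delta = {implied_offer_ratio(p):.3f}")
```

Because $p$ is close to one half in every group, the implied ratio is close to 1 in every group.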
Next, I turn to estimating (2.13), which should yield estimates of the average record gaps.
Once again, to mitigate the effect of outliers, I eliminated the most extreme 0.1% of my sample,
specifically those observations where $|\Delta \ln \widehat{W}_{it}| > 2$. Most of these deletions appear to be due to
coding errors, since these wage changes are typically followed by equally large wage changes in
the opposite direction. Since there are very few observations for high values of $n$, I also confine
my analysis to job changers who leave their $n$-th job for $n \le 5$. Let $D_{it}^{n,n+1}$ represent a dummy
variable which equals 1 if worker $i$ moved from his $n$-th job in date $t-1$ to his $(n+1)$-th in date $t$.
Rather than estimating returns to experience from a separate first-stage regression as suggested
in Section 2, I combine job stayers and job changers to run a single regression of the form

$$\Delta \ln \widehat{W}_{it} = \beta_1 \Delta EXP_{it} + \beta_2 \Delta EXP_{it}^2 + \sum_{n=1}^{\infty} \pi_n D_{it}^{n,n+1} + \Delta\varepsilon_{it} \tag{4.1}$$


The coeﬃcients πn in this regression correspond to the estimates of the expected moment gaps
E (Rn+1 − Rn | N > n). Combining the two allows the wage growth of job changers to help in
identifying the coeﬃcient β 2 , and should therefore be more eﬃcient.
The results of this regression are reported in Table 4. The number of workers who are observed
to change from the n-th job to the n + 1-th job in the previous year for each n is reported next to
the corresponding dummy variable. The estimated coeﬃcients for (4.1) are reported in the second
column of Table 4. The first column in the table reports the estimated coeﬃcients β 1 and β 2
omitting job changers, confirming that estimating β 1 and β 2 from job stayers alone would have
negligible eﬀects on my point estimates. The estimated coeﬃcients πn are all clustered around
8%, with the exception of π 4 . However, this coeﬃcient (as well as those with even higher values
of n) is rather imprecisely estimated given the small number of job changers for this value of n.
With only three moments that are estimated with any reasonable degree of precision, using the
coeﬃcients πn to construct the estimator in Proposition 3 is likely to provide a poor approximation
of the true log wage oﬀer distribution. However, as noted earlier, we can still use a small number of
moments to test particular functional form restrictions. Recall from Figure 1 that if the wage oﬀer
is Pareto, in which case the log wage oﬀer distribution is exponential, the coeﬃcients πn should be
constant for all n. Thus, testing this particular functional form restriction amounts to testing a set
of linear restrictions on the coeﬃcients πn in (4.1), namely that they are all equal. Note that this
is a test of an entire family of distributions rather than a test of one particular distribution. To
the extent that we fail to reject that the πn are equal, we can then estimate the exact exponential
distribution (specifically, the inverse of the hazard rate that characterizes this distribution) from
the implied common value of the πn .
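
To see why constant $\pi_n$ is the signature of an exponential log offer distribution, one can simulate the record structure directly. What follows is only an illustrative Python sketch and not part of the estimation in this paper. The function name `simulate_record_gaps`, the number of simulated cycles, and the seed are assumptions of the example. The number of offers per cycle is drawn so that Prob$(M = m) = q^{m-1}p$, and a job change occurs whenever an offer exceeds the current record. With exponential draws, memorylessness implies that the simulated gaps $E(R_{n+1} - R_n \mid N > n)$ should be close to the inverse of the hazard for every $n$.

```python
import random
from collections import defaultdict


def simulate_record_gaps(p, draw, n_cycles=200_000, seed=0):
    """Monte Carlo estimate of E(R_{n+1} - R_n | N > n) for each n.

    p    : probability that any given offer is the last one in the cycle,
           so that Prob(M = m) = (1 - p)**(m - 1) * p.
    draw : function taking a random.Random and returning one log offer X_m.
    """
    rng = random.Random(seed)
    gap_sum = defaultdict(float)
    gap_cnt = defaultdict(int)
    for _ in range(n_cycles):
        records = [draw(rng)]            # the first offer is always a record
        while rng.random() >= p:         # another offer arrives w.p. q = 1 - p
            x = draw(rng)
            if x > records[-1]:          # new record: the worker changes jobs
                records.append(x)
        # records[n - 1] is R_n, so this loop visits every n with N > n.
        for n in range(1, len(records)):
            gap_sum[n] += records[n] - records[n - 1]
            gap_cnt[n] += 1
    return {n: gap_sum[n] / gap_cnt[n] for n in sorted(gap_cnt)}


# Log offers exponential with hazard 1, i.e. Pareto wage offers.
gaps = simulate_record_gaps(p=0.48, draw=lambda rng: rng.expovariate(1.0))
for n in (1, 2, 3):
    print(n, round(gaps[n], 3))
```

Passing a different `draw` function (for example a normal draw) gives the corresponding gap sequence for another candidate distribution, which is what makes the equality of the $\pi_n$ a testable restriction.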
The first row in the bottom panel of Table 4 reports the results for the test that all of the
coeﬃcients πn are equal. The probability of observing these average wage gains under the null
that the log wage oﬀer distribution is exponential equals 0.264. Thus, we fail to reject the null that
the wage oﬀer distribution is Pareto at conventional significance levels. The third column of Table
4 then reports the estimate for πn assuming these coeﬃcients are equal. The average net wage
growth from voluntarily moving jobs is 0.0806, which is consistent with the estimated wage growth
for young workers reported in Topel and Ward (1992). This estimate implies that the underlying
log wage offer distribution is exponential with hazard $0.0806^{-1} = 12.41$. This is larger than the
estimate of 4.17 Flinn (2002) reports for this hazard using the same NLSY dataset (Table 4, p. 633).
But Flinn abstracts from on-the-job wage growth out of concern for sample selection, and instead
attributes all of the growth between the starting wage on the $n$-th job in the cycle to the starting
wage on the $(n+1)$-th job in the cycle to a better price from the underlying offer distribution. Since
the average wage gain from the start of one job to the start of the next job is 0.2400 in his sample,
the implied hazard rate of the underlying offer distribution, $0.2400^{-1} = 4.17$, will be smaller.
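
The two hazard figures above are simply reciprocals of the respective average wage gains, since an exponential distribution with hazard $\lambda$ has constant expected record gap $\lambda^{-1}$. A short Python check of the arithmetic follows, where the helper name `implied_hazard` is hypothetical and used only here:

```python
def implied_hazard(mean_gap: float) -> float:
    """Hazard of an exponential log offer distribution with the given mean record gap."""
    if mean_gap <= 0.0:
        raise ValueError("mean_gap must be positive")
    return 1.0 / mean_gap


print(round(implied_hazard(0.0806), 2))  # net wage gain of voluntary job changers
print(round(implied_hazard(0.2400), 2))  # start-to-start wage gain in Flinn's sample
```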
The fact that the estimated coefficients $\pi_n = E(R_{n+1} - R_n \mid N > n)$ are roughly constant naturally suggests the exponential distribution as a candidate for the log wage offer distribution. But
we could similarly test other functional forms that have been proposed in the literature. The
second row in the bottom panel of Table 4 reports the results for an analogous test of whether the
log wage oﬀer distribution is normal. Once again, we can devise a test against the entire class of
normal distributions as opposed to a single candidate distribution, and then proceed to estimate
the exact normal distribution to the extent that we fail to reject this hypothesis. To see why,
suppose log wage offers are distributed as $N(\mu, \sigma^2)$. One can verify that the average net wage
growth among workers who move from their $n$-th job to their $(n+1)$-th job would equal

$$\sigma E\left(R'_{n+1} - R'_n \mid N > n\right) \tag{4.2}$$

where $E(R'_{n+1} - R'_n \mid N > n)$ denotes the average moment gap from the sequence $\{X'_m\}_{m=1}^{M}$ in
which the $X'_m$ are i.i.d. standard normals, i.e. $X'_m \sim N(0, 1)$. Thus, the sequence $\{\pi_n\}_{n=1}^{\infty}$ is
uniquely determined for any normal distribution up to a scalar. To the extent that we fail to
reject the hypothesis of a normal distribution, we can estimate $\sigma$ from the exact values of $\pi_n$.11
As can be seen from the last line of Table 4, the probability that we would observe these average
wage gains under the null of any lognormal distribution is only 0.014, so we can reject this null
at conventional levels of significance. This is not quite precise, since the null hypothesis depends
on p, but p is an estimated parameter. However, since the moment sequence in (4.2) is sharply
declining for a variety of diﬀerent p, and since p is relatively tightly estimated, the rejection of the
normal is likely to be robust to incorporating sampling error. Interestingly, the expected record
gaps for the exponential distribution are independent of p, so the confidence interval in Table 4
does not need to be adjusted to reflect sampling error in the estimate of p.
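
The two properties of the normal case used above, namely that the gap sequence scales with $\sigma$ and that it declines in $n$, can be illustrated by simulation. The sketch below is a hedged illustration rather than the procedure behind Table 4: the function name `normal_record_gaps`, the number of cycles, and the seed are assumptions of the example, and the number of offers per cycle is drawn with Prob$(M = m) = q^{m-1}p$.

```python
import random
from collections import defaultdict


def normal_record_gaps(sigma, p, n_cycles=200_000, seed=0):
    """Simulated E(R_{n+1} - R_n | N > n) when log offers are N(0, sigma^2)."""
    rng = random.Random(seed)
    gap_sum = defaultdict(float)
    gap_cnt = defaultdict(int)
    for _ in range(n_cycles):
        records = [sigma * rng.gauss(0.0, 1.0)]   # first offer is a record
        while rng.random() >= p:                  # next offer w.p. q = 1 - p
            x = sigma * rng.gauss(0.0, 1.0)
            if x > records[-1]:
                records.append(x)
        for n in range(1, len(records)):          # records[n - 1] is R_n
            gap_sum[n] += records[n] - records[n - 1]
            gap_cnt[n] += 1
    return {n: gap_sum[n] / gap_cnt[n] for n in sorted(gap_cnt)}


unit = normal_record_gaps(sigma=1.0, p=0.48)
scaled = normal_record_gaps(sigma=2.0, p=0.48)
for n in (1, 2, 3):
    # Second column: standard normal gap; third column: ratio, which equals sigma.
    print(n, round(unit[n], 3), round(scaled[n] / unit[n], 3))
```

Because both runs use the same seed, the ratio in the last column isolates the scaling by $\sigma$, while the middle column shows the decline in $n$ that distinguishes the normal from the exponential case.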
Since the small sample sizes involved in my estimation make it hard to estimate the exact offer distribution with great confidence, it would be useful to look for additional tests that can be used to
further gauge whether the wage offer distribution is indeed Pareto, i.e. whether the log wage offer
distribution is indeed exponential. Since my estimation relies only on the wage growth of voluntary job changers, a natural overidentifying restriction to consider would involve the wage losses of
involuntary job changers, data that I have so far ignored in my analysis. If a worker leaves his $n$-th
job involuntarily, it follows that the total number of records in the employment cycle he
just ended is exactly $n$, i.e. $N = n$ (recall that employment cycles are defined as the time between
involuntary job changes). Thus, the average log price per unit labor on the job the worker left
involuntarily is given by $E(w_n \mid N = n)$, i.e. it is the expected value of the $n$-th record conditional
on the event that there are exactly $n$ records in the sequence $\{X_m\}_{m=1}^{M}$. On his next job, all we
know is that he started a new cycle, so the number of records in his next employment cycle is at
least one. Hence, the average log price per unit labor on his new job is given by E (w1 | N ≥ 1),

which we can denote E (w1 ) since the event N ≥ 1 is true by default. Hence, the average net wage
11

By contrast, the parameter µ is not identified, since recall from Proposition 1 that we can only identify a
distribution up to a location parameter. Similarly, the exponential distribution is really defined by two parameters,
its hazard and its lower support, but only the hazard is identified.

loss suffered by a worker who leaves his $n$-th job involuntarily can again be expressed as a difference of record moments, namely

$$E\left(\left|\Delta \ln \widehat{W}_{it} - \beta \Delta Z_{it}\right| \,\Big|\, N_{t-1} = n\right) = E(R_n \mid N = n) - E(R_1)$$

Now, suppose the wage oﬀer distribution were exponential with a hazard rate λ. Then these
average losses can be written as
$$\lambda^{-1}\left[E\left(R'_n \mid N = n\right) - E\left(R'_1\right)\right] \tag{4.3}$$

where $R'_n$ denotes the $n$-th record in the sequence $\{X'_m\}_{m=1}^{M}$ in which the $X'_m$ are i.i.d. standard
exponentials, i.e. with hazard rate 1. Let $D_{it}^{n,1}$ denote a dummy which equals 1 if worker $i$ moved
from his $n$-th job in date $t-1$ to a job where $n$ is reset to 1 by date $t$. Then if the log wage offer
distribution were indeed exponential, the coefficients $\pi_n$ in the regression

$$\Delta \ln \widehat{W}_{it} = \beta_1 \Delta EXP_{it} + \beta_2 \Delta EXP_{it}^2 - \sum_{n=1}^{\infty} \pi_n D_{it}^{n,1} + \Delta\varepsilon_{it} \tag{4.4}$$

should line up with (4.3) up to a scale parameter. More precisely, the scaling parameter $\lambda^{-1}$
should be the inverse of the hazard for the exponential distribution we estimated from voluntary
job changers, i.e. $\lambda^{-1}$ should equal 0.0806.
The first column in Table 5 reports the estimates for the coefficients $\pi_n$. Using the estimate
for $p$ of 0.48 from Table 3, one can compute the moments $E(R'_n \mid N = n) - E(R'_1)$ numerically,
yielding

$$\{0.197, 0.762, 1.127, 1.396, 1.616, \ldots\} \tag{4.5}$$

The first row in the bottom panel of Table 5 reports the results of the test that the coefficients
$\pi_n$ are equal to this sequence up to a scale parameter. The probability of observing these values
under the null of an exponential distribution is given by 0.291, so we indeed fail to reject the
null hypothesis. The second column of Table 5 reports the implied value of the constant $\lambda^{-1}$
if we impose the restriction that the wage offer distribution is exponential. That is, we replace
$\sum_{n=1}^{\infty} \pi_n D_{it}^{n,1}$ in (4.4) with the expression

$$\lambda^{-1}\left(0.197 D_{it}^{1,1} + 0.762 D_{it}^{2,1} + 1.127 D_{it}^{3,1} + 1.396 D_{it}^{4,1} + 1.616 D_{it}^{5,1}\right)$$

and estimate $\lambda^{-1}$ by ordinary least squares. The estimated coefficient is 0.0816, nearly identical
to the value of 0.0806 implied by the wage gains of voluntary job changers. In other words, the
relationship between the wage gains of voluntary job changers and the wage losses of involuntary
job changers implied by the model appears to be satisfied in the data.
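
The moments $E(R'_n \mid N = n) - E(R'_1)$ used in this comparison can also be approximated by simulation rather than numerical integration. The following Python sketch is an illustration under stated assumptions, not the computation behind (4.5): the function name `simulate_loss_moments`, the number of cycles, and the seed are hypothetical choices, offers are standard exponentials, and the number of offers per cycle satisfies Prob$(M = m) = q^{m-1}p$ with $p = 0.48$. The simulated values should lie close to the sequence in (4.5), up to sampling error and the rounding of $p$.

```python
import random
from collections import defaultdict


def simulate_loss_moments(p, n_cycles=200_000, seed=0):
    """Monte Carlo estimate of E(R_n | N = n) - E(R_1) for standard exponential offers."""
    rng = random.Random(seed)
    first_sum = 0.0
    last_sum = defaultdict(float)
    last_cnt = defaultdict(int)
    for _ in range(n_cycles):
        first = rng.expovariate(1.0)     # R_1: first offer of the new cycle
        best = first
        n_records = 1
        while rng.random() >= p:         # another offer arrives w.p. q = 1 - p
            x = rng.expovariate(1.0)
            if x > best:                 # new record
                best = x
                n_records += 1
        first_sum += first
        last_sum[n_records] += best      # best is R_n on a cycle with N = n
        last_cnt[n_records] += 1
    mean_first = first_sum / n_cycles
    return {n: last_sum[n] / last_cnt[n] - mean_first for n in sorted(last_cnt)}


losses = simulate_loss_moments(p=0.48)
for n in (1, 2, 3):
    print(n, round(losses[n], 3))
```

Multiplying these simulated moments by $\lambda^{-1}$ gives the average wage losses implied by an exponential log offer distribution with hazard $\lambda$.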
As a final remark, note that so far I have only used the fact that if the log wage oﬀer distribution is exponential, the average wage losses would have to be proportional to the sequence
in (4.5). However, a simple adaptation of an argument in Nagaraja and Barlevy (2003) shows
that the converse is also true, i.e. if wage losses are proportional to (4.5), then the offer distribution has to be exponential. Formally, the sequence $\{E(R_n \mid N = n) - E(R_1)\}_{n=1}^{\infty}$ uniquely
characterizes the parent distribution within the class of continuous distribution functions up to a
location parameter. As such, we could potentially use wage losses to similarly discriminate between different candidate wage offer distributions. Unfortunately, this approach proves to be less
powerful than using the wage gains of voluntary job changers. For example, once again we can test
whether the wage change data is consistent with a normal distribution by computing the sequence
$\{E(R_n \mid N = n) - E(R_1)\}_{n=1}^{\infty}$ for the normal distribution and testing whether the coefficients $\pi_n$
are consistent with this series. As reported in the second row of the bottom panel of Table 5, we
cannot reject the null hypothesis that the log wage oﬀer distribution is normal as we could from
wage data for voluntary job changers. The reason for this is illustrated in Figure 3, which shows the
actual estimated net wage loss and the best-fitting moment diﬀerences E (Rn | N = n) − E (R1 )

for the normal and the exponential distribution. Although the two sequences are distinguishable
(the average wage losses decay more rapidly for the normal distribution), it is more difficult to
tell them apart given that both sequences are necessarily increasing, in contrast to the sequence of
moments for the difference between consecutive records, which need not be monotonic. Thus, the
fact that wage losses for involuntary job changers can be reconciled with the exponential distribution serves more to aﬃrm the internal consistency of the model than to further pin down the
exact functional form of the wage oﬀer distribution.

5. Alternative Models of Search
The estimation strategy described in this paper relies heavily on the fact that according to the
model, observed wages over an employment cycle $\{\widehat{W}^n_{it}\}$ correspond to a sequence of records
$\{W_n\}_{n=1}^{N}$ from an i.i.d. sequence that are contaminated by a noise term $\ell_{it}$. In more general
search models, this may no longer be the case. Nevertheless, I now argue that even when the
underlying sequence $\{W_n\}_{n=1}^{N}$ does not correspond to a list of record values, it will often be the
case that a model with on-the-job search has some implicit record structure, and in some cases
we might still be able to exploit this fact for purposes of identification. This section oﬀers two
suggestive examples. A more comprehensive treatment is clearly necessary to deal with each of
these variations, but this is beyond the scope of this paper.

As my first example, suppose a job offer specifies both a wage $W$ and a number of hours $H$ that
the worker is required to work. Workers draw job offers from a fixed distribution over $(W, H)$ and
choose the job that maximizes their utility. Thus, on a job offering the pair $(W, H)$, an individual
would earn an hourly wage of $\widehat{W}_{it} = W \ell_{it}$, and an income $\widehat{I}_{it} = W H \ell_{it}$. Once again, define an
employment cycle as the time between the beginning of adjacent unemployment spells, and let
$\{W_n, H_n\}_{n=1}^{N}$ denote the wages and hours on the different jobs over each such cycle. The sequence
$\{W_n\}_{n=1}^{N}$ will no longer correspond to a sequence of records; in fact, it need not even be monotonic,
since a worker might voluntarily move to a job that offers lower $W$ if it is more attractive in terms
of the hours it offers. Nevertheless, the $n$-th job in the cycle still corresponds to the $n$-th record
in utility space, i.e. it is the $n$-th time the worker encounters a job he prefers to all of his previous
job offers. Formally, the sequence $\{U(W_n, H_n)\}_{n=1}^{\infty}$ corresponds to records from the set $\{U_m\}_{m=1}^{M}$
where $U_m$ denotes the utility the worker derives from the $m$-th job offer. If we know the function
$U(\cdot, \cdot)$, or could estimate it from observed choices, we might be able to use data on wages and
hours to identify the distribution of utility across job offers, and possibly even use this to back
out the joint distribution of $(W, H)$. As an illustration, consider the case where agents do not
care about leisure. Then they would always choose the job that offers the greatest income, i.e.
$U(W, H) = WH$. In this case, the income on the $n$-th job corresponds to a record from i.i.d. draws
from the implied distribution for income across all offers, and we can easily adapt the argument
above to identify the distribution of income levels across job offers from observations on income
$\widehat{I}^n_{it} = (W_n H_n) \times \ell_{it}$. One might be able to extend a similar argument to other utility functions.

As another example, suppose once again that a job oﬀer only specifies a wage W , and workers
draw oﬀers from a fixed oﬀer distribution. However, the productivity of the worker is in part
match-specific. Formally, the log wage on the n-th job is given by
$$\ln \widehat{W}^n_{it} = w_n + \phi_i + \beta_1 EXP + \beta_2 EXP^2 + \gamma(TEN_{it}) + \varepsilon_{it}$$

where γ (·) is a strictly increasing function. Although we could rule out this case for the NLSY
sample, returns to tenure might be more relevant for workers who are at a more advanced stage
of their career. In this case, an individual might deliberately choose a job that pays a low wage
today because of the promise of higher growth in the future. That is, a worker will move from
his $n$-th job to his $(n+1)$-th job even though $\widehat{W}^n_{it} > \widehat{W}^{n+1}_{it}$ because he anticipates that the wage
on the $(n+1)$-th job will be higher at some future date, i.e. that $\widehat{W}^n_{i\tau} < \widehat{W}^{n+1}_{i\tau}$ for some $\tau > t$.12,13

12
This would not be true in the Burdett and Mortensen model, i.e. it would never be the case that $\widehat{W}^n_{it} > \widehat{W}^{n+1}_{it}$.
However, it is possible in that model that $\widehat{W}^n_{i,t-1} > \widehat{W}^{n+1}_{it}$, i.e. the worker might voluntarily move to a job that pays
a lower wage than he earned in the past.

13
A closely related model is developed by Postel-Vinay and Robin (2002). In their model, wages rise on the job not

At first glance, the fact that workers accept wage cuts would suggest the record structure of the
model breaks down. But this turns out to be incorrect. In particular, a worker would never accept
a job that oﬀers a lower log price w than his current match, since the monotonicity of γ (·) implies
the wage on such a job would always lag behind his current job. Hence, the sequence $\{w_n\}$ is still
a monotonically increasing sequence. More precisely, the set $\{w_n\}$ forms a sequence of records in
which the threshold for setting a record evolves over time. That is, an observation forms a record
if it exceeds the previous record by at least some (time-varying) amount, in much the same way
that in athletic competitions a record is set only if it beats the previous record by more than the
degree of precision by which performances are measured (which conceivably changes over time as
technology makes it possible to measure time at higher precision). More generally, the diﬀerent
jobs in a worker’s employment cycle correspond to records in value function space as opposed to
wage space or instantaneous utility space. Using this insight for identification involves a non-trivial
modification to the record model explored here, and it is not obvious whether one can still obtain
as strong of a characterization result. But to the extent that such a result can be established,
the fact that we can directly estimate the function γ (·) using Topel’s method above suggests we
should still be able to estimate the average underlying record gaps ∆wn .

6. Conclusion
In many applications that use search models, we would like to estimate the structural parameters
of the model to learn about the underlying search process. Previous authors have indeed obtained
important insights on various interesting questions on labor markets by proceeding to estimate
such models. However, the methods they used require fairly stringent assumptions, either on the
functional form of the wage oﬀer distribution or on the presence of unobserved earnings heterogeneity. This paper proposes a way to estimate this distribution that can avoid these assumptions
by exploiting the underlying record structure of the standard search model. While the number of
observations in my dataset is too small to provide very precise estimates of the underlying wage
oﬀer distribution, my approach can still rule out certain functional forms, including some that have
been used in applications such as the lognormal distribution. At a first pass, the evidence appears
consistent with a Pareto wage oﬀer distribution, i.e. the wages a worker could expect to earn in
the various jobs available to him have a Pareto distribution. Note that this observation is distinct
because the worker becomes more productive, but because his employer is forced to match outside oﬀers the worker
receives. Since outside oﬀers arrive at random, the wage would no longer be a deterministic function of tenure on
the job. Ironically, in this case the wage on the job forms a record process, since the current wage reflects the record
outside oﬀer the worker encountered since he started working for his current employer.

from the oft-noted fact that the cross-sectional distribution of wages exhibits a Pareto tail.14 For
one thing, the cross-sectional distribution is a convolution of the distribution of prices firms pay
and the distribution of ability across agents. In addition, selection from workers moving to higher
wage jobs would tend to put more mass on higher values of this distribution. In the simple search
model explored here, the record structure is reflected in wages and can be used to uncover the
underlying wage oﬀer distribution. But more generally, the jobs on each employment cycle correspond to records in utility space, not wage space. Whether this still yields useful restrictions for
observable data (e.g. wages, hours, etc.) remains an open question.
While this paper only examines search applications, record theory is potentially applicable in
a variety of contexts. Record statistics arise whenever we get to observe the extremes from an
unknown number of observations. This structure characterizes a variety of economic environments.
For example, in the Postel-Vinay and Robin (2002) model, the wage a worker earns on his job
is the maximum of the outside oﬀers the worker receives, but most panel surveys do not record
the number of times the worker receives a matching outside oﬀer. A related example is the
problem of optimal contracting with one-sided commitment in Beaudry and DiNardo (1991) in
which the optimal strategy for the firm is to pay its worker a wage that reflects the record economic
conditions since the employment relationship began, which may be only imperfectly observable to
the econometrician. Another example involves auctions in which we do not observe how many
potential bidders there are (e.g. internet auctions where we don’t know whether those who visit
the site are seriously intent on bidding), and all we get to observe are those bids that already
exceed previous bids without knowing how many bidders had the opportunity to oﬀer a higher bid
but chose not to. Yet another application that is discussed at some length in Arnold, Balakrishnan,
and Nagaraja (1998) involves optimal stopping problems, since the event that we reach a point
at which we exceed some threshold can be translated into the statement that the record value
exceeds some cutoﬀ. Record statistics could thus serve as an important tool for economists in
both empirical and theoretical applications.

14

On the presence of a Pareto tail in cross-sectional earnings distributions, see Neal and Rosen (2000).

7. Appendix
Proof of Lemma 1: To derive the expression for Prob$(M = m)$, let us condition on the time between
the first offer and the end of the cycle, which is distributed as an exponential with rate $\delta$. Then
the probability that there are exactly $m$ offers on an employment cycle can be expressed as

$$\text{Prob}(M = m) = \int_0^{\infty} \text{Prob}(m - 1 \text{ offers arrive by date } t)\, \delta e^{-\delta t}\, dt$$

Since offers arrive at rate $\lambda_1$, the number of offers that arrive within $t$ units of time is Poisson with
parameter $\lambda_1 t$, so that

$$\text{Prob}(M = m) = \int_0^{\infty} \frac{e^{-\lambda_1 t} (\lambda_1 t)^{m-1}}{(m-1)!}\, \delta e^{-\delta t}\, dt = \frac{\delta}{\lambda_1 + \delta} \left(\frac{\lambda_1}{\lambda_1 + \delta}\right)^{m-1}$$

To solve for these integrals, we use an induction argument together with the fact that for any
positive integer $k$

$$\lim_{t \to 0} t^k e^{-(\lambda_1 + \delta) t} = 0, \qquad \lim_{t \to \infty} t^k e^{-(\lambda_1 + \delta) t} = 0$$

This establishes the claim. $\blacksquare$
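
As a sanity check on Lemma 1, the integral can be evaluated numerically and compared with the closed form $p q^{m-1}$, where $p = \delta/(\lambda_1 + \delta)$ and $q = \lambda_1/(\lambda_1 + \delta)$. The Python sketch below uses a plain trapezoid rule on a truncated range. The function names, the truncation point `t_max`, the step count, and the parameter values $\lambda_1 = \delta = 1$ are all assumptions chosen for illustration.

```python
import math


def prob_m_closed_form(m, lam1, delta):
    """Closed form from Lemma 1: p * q**(m - 1)."""
    p = delta / (lam1 + delta)
    q = lam1 / (lam1 + delta)
    return p * q ** (m - 1)


def prob_m_numeric(m, lam1, delta, t_max=60.0, steps=60_000):
    """Trapezoid-rule approximation of the integral defining Prob(M = m)."""
    h = t_max / steps

    def integrand(t):
        # Poisson probability of m - 1 arrivals by t, times the density of
        # the exponential cycle length with rate delta.
        poisson = math.exp(-lam1 * t) * (lam1 * t) ** (m - 1) / math.factorial(m - 1)
        return poisson * delta * math.exp(-delta * t)

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h


for m in (1, 2, 3, 4):
    print(m, prob_m_closed_form(m, 1.0, 1.0), prob_m_numeric(m, 1.0, 1.0))
```

The two columns should agree up to the discretization error of the trapezoid rule.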
Proof of Lemma 2: Given an i.i.d. sequence $\{X_m\}_{m=1}^{M}$ where $X_m \sim F(\cdot)$ and where
Prob$(M = m) = q^{m-1} p$, Theorem 4.1 in Bunge and Nagaraja (1991) implies that the probability density for the first $n + 1$ records when there are at least $n + 1$ records in the sequence is
given by

$$h(r_1, r_2, \ldots, r_{n+1} \cap N > n) = f(r_{n+1}) \prod_{i=1}^{n} \frac{q f(r_i)}{1 - qF(r_i)} \tag{7.1}$$

where $f(\cdot) = dF(\cdot)$. Integrating out $r_1$ through $r_n$ in (7.1) and using an induction argument, we
can show that the marginal density for $r_{n+1}$ where there are at least $n + 1$ records is given by

$$h(r_{n+1} \cap N > n) = \frac{[-\ln(1 - qF(r_{n+1}))]^n}{n!} f(r_{n+1})$$

Define the inverse cdf $F^{-1}(x)$ for $x \in (0, 1)$ as $\sup\{y : F(y) \le x\}$. Then using the change of
variables $u = F(r_{n+1})$ and $du = f(r_{n+1})\, dr_{n+1}$, the expected value of $|R_{n+1}|$ conditional on
$N > n$ is given by

$$\begin{aligned}
E\left(|R_{n+1}| \mid N > n\right) &= \int_0^1 \left|F^{-1}(u)\right| \frac{[-\ln(1 - qu)]^n}{n! \Pr(N > n)}\, du \\
&\le \frac{[-\ln(1 - q)]^n}{n! \Pr(N > n)} \int_0^1 \left|F^{-1}(u)\right| du \\
&= \frac{[-\ln(1 - q)]^n}{n! \Pr(N > n)} E(|X_m|) < \infty
\end{aligned}$$

Since $E(|R_n| \mid N > n) < E(|R_{n+1}| \mid N > n)$, the former is also finite. Lastly, since $E(|R_n|) < \infty$
implies $E(R_n) < \infty$, the lemma follows. $\blacksquare$
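Proof of Proposition 1: Integrating out (7.1) yields the following densities:

$$h(r_{n+1}, r_n \cap N > n) = f(r_{n+1}) \frac{[-\ln(1 - qF(r_n))]^{n-1}}{(n-1)!} \frac{q f(r_n)}{1 - qF(r_n)}$$

$$h(r_n \cap N > n) = \frac{q - qF(r_n)}{1 - qF(r_n)} \frac{[-\ln(1 - qF(r_n))]^{n-1}}{(n-1)!} f(r_n)$$

Define $\Delta = r_{n+1} - r_n$. By construction, $\Delta \ge 0$. Using the law of iterated expectations, we have

$$E(\Delta \mid N > n) = E\left(E(\Delta \mid r_n, N > n)\right) = E\left(\int_0^{\infty} \Delta\, h(\Delta \mid r_n, N > n)\, d\Delta\right)$$

where $h(\Delta \mid r_n, N > n)$ is the density of the difference between the $n$-th record and the $(n+1)$-th
record conditional on $r_n$, and is given by

$$h(\Delta \mid r_n, N > n) = \frac{f(r_n + \Delta)}{1 - F(r_n)}$$

Hence, the conditional expectation of $\Delta$ is given by

$$E(\Delta \mid r_n, N > n) = \int_0^{\infty} \Delta \frac{f(r_n + \Delta)}{1 - F(r_n)}\, d\Delta \equiv \mathcal{F}(r_n)$$

If we integrate the above expression over $r_n$, we have

$$\begin{aligned}
E(\Delta \mid N > n) &= E\left(\mathcal{F}(r_n) \mid N > n\right) \\
&= \int_{-\infty}^{\infty} \mathcal{F}(r_n) \frac{h(r_n \cap N > n)}{\Pr(N > n)}\, dr_n \\
&= \int_{-\infty}^{\infty} \mathcal{F}(r_n) \frac{q - qF(r_n)}{1 - qF(r_n)} \frac{[-\ln(1 - qF(r_n))]^{n-1}}{(n-1)! \Pr(N > n)} f(r_n)\, dr_n \\
&= \int_{-\infty}^{\infty} \left[\int_0^{\infty} [1 - F(r_n + \Delta)]\, d\Delta\right] \frac{q}{1 - qF(r_n)} \frac{[-\ln(1 - qF(r_n))]^{n-1}}{(n-1)! \Pr(N > n)} f(r_n)\, dr_n
\end{aligned} \tag{7.2}$$

where the last line uses integration by parts to write $\int_0^{\infty} \Delta f(r_n + \Delta)\, d\Delta = \int_0^{\infty} [1 - F(r_n + \Delta)]\, d\Delta$.
Now, suppose we have two functions $F_1$ and $F_2$ such that

$$E\left(R^{(1)}_{n+1} - R^{(1)}_n \mid N > n\right) = E\left(R^{(2)}_{n+1} - R^{(2)}_n \mid N > n\right)$$

for $n = 1, 2, 3, \ldots$ Then, cancelling the common factor $\Pr(N > n)$, which depends only on $q$ and $n$, we have

$$\begin{aligned}
&\int_{-\infty}^{\infty} \left[\int_0^{\infty} [1 - F_1(r_n + \Delta)]\, d\Delta\right] \frac{(-\ln(1 - qF_1(r_n)))^{n-1}}{(n-1)!} \frac{q f_1(r_n)}{1 - qF_1(r_n)}\, dr_n \\
&\quad = \int_{-\infty}^{\infty} \left[\int_0^{\infty} [1 - F_2(r_n + \Delta)]\, d\Delta\right] \frac{(-\ln(1 - qF_2(r_n)))^{n-1}}{(n-1)!} \frac{q f_2(r_n)}{1 - qF_2(r_n)}\, dr_n
\end{aligned}$$

Rewrite both integrals using the change of variables $u = F(r_n)$ to get

$$\begin{aligned}
&\int_0^1 \left[\int_0^{\infty} \left[1 - F_1\left(F_1^{-1}(u) + \Delta\right)\right] d\Delta\right] \frac{(-\ln(1 - qu))^{n-1}}{(n-1)!} \frac{q}{1 - qu}\, du \\
&\quad = \int_0^1 \left[\int_0^{\infty} \left[1 - F_2\left(F_2^{-1}(u) + \Delta\right)\right] d\Delta\right] \frac{(-\ln(1 - qu))^{n-1}}{(n-1)!} \frac{q}{1 - qu}\, du
\end{aligned}$$

Applying Lemma 3 in Lin (1987), we know that given a function $\psi(\cdot)$,

$$\int_0^1 \psi(x) (-\ln(1 - x))^n\, dx = 0$$

for all $n = 1, 2, 3, \ldots$ if and only if $\psi(x) = 0$ almost surely. By a simple contradiction argument,
one can show that this implies that $\psi(x) = 0$ almost surely if and only if

$$\int_0^1 \psi(x) (-\ln(1 - qx))^n\, dx = 0$$

Hence, for any $u$, it follows that

$$\int_0^{\infty} \left[1 - F_1\left(F_1^{-1}(u) + \Delta\right)\right] d\Delta = \int_0^{\infty} \left[1 - F_2\left(F_2^{-1}(u) + \Delta\right)\right] d\Delta$$

Let $t = F_1^{-1}(u) + \Delta$ on the left-hand side, and similarly $t = F_2^{-1}(u) + \Delta$ on the right-hand side. Then it follows that for any $u$,

$$\int_{F_1^{-1}(u)}^{\infty} [1 - F_1(t)]\, dt = \int_{F_2^{-1}(u)}^{\infty} [1 - F_2(t)]\, dt$$

Since $F_1(\cdot)$ and $F_2(\cdot)$ are continuous, nondecreasing, and bounded, it follows that they are both
differentiable almost everywhere. This, in turn, implies that $F_1^{-1}(u)$ and $F_2^{-1}(u)$ are differentiable
for almost every $u \in (0, 1)$. Differentiating with respect to such $u$ yields

$$\left[1 - F_1\left(F_1^{-1}(u)\right)\right] \frac{d}{du} F_1^{-1}(u) = \left[1 - F_2\left(F_2^{-1}(u)\right)\right] \frac{d}{du} F_2^{-1}(u)$$

Since $F_1(F_1^{-1}(u)) = F_2(F_2^{-1}(u)) = u$, it follows that for almost all $u \in (0, 1)$,

$$\frac{d}{du} F_1^{-1}(u) = \frac{d}{du} F_2^{-1}(u)$$

Integrating out yields

$$F_1^{-1}(u) = F_2^{-1}(u) + c$$

for some constant $c$, which establishes the claim. $\blacksquare$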
Proof of Proposition 2: The Proposition is an immediate implication of the consistency of the
maximum likelihood estimator. $\blacksquare$
Proof of Proposition 3: From equation (7.2), we have

$$\begin{aligned}
E(R_{n+1} - R_n \mid N > n) &= \int_{-\infty}^{\infty} \left(\int_0^{\infty} (1 - F(r_n + \Delta))\, d\Delta\right) \frac{[-\ln(1 - qF(r_n))]^{n-1}}{(n-1)! \Pr(N > n)} \frac{q f(r_n)}{1 - qF(r_n)}\, dr_n \\
&\equiv \int_{-\infty}^{\infty} G(r_n) \frac{[-\ln(1 - qF(r_n))]^{n-1}}{(n-1)! \Pr(N > n)} \frac{q f(r_n)}{1 - qF(r_n)}\, dr_n
\end{aligned}$$

Note that $G(r_n) \ge 0$. Using the change in variables

$$1 - qF(r_n) = p^x$$

we can rewrite the above as

$$\begin{aligned}
E(R_{n+1} - R_n \mid N > n) &= \int_0^1 G\left(F^{-1}\left(\frac{1 - p^x}{q}\right)\right) \frac{(-\ln p)^n x^{n-1}}{(n-1)! \Pr(N > n)}\, dx \\
&\equiv \frac{(-\ln p)^n}{(n-1)! \Pr(N > n)} \int_0^1 g(x)\, x^{n-1}\, dx
\end{aligned}$$

where again $g(x) \ge 0$. Set

$$\mu_{n-1} = E(R_{n+1} - R_n \mid N > n) \times \frac{(n-1)! \Pr(N > n)}{(-\ln p)^n}$$

so that $0 < \mu_n < \infty$ for all $n$. The task of recovering $g(\cdot)$ from the system of equations

$$\int_0^1 g(x)\, x^n\, dx = \mu_n \tag{7.3}$$

for $n = 0, 1, 2, \ldots$ is just a case of the Hausdorff moment problem, which asks for an arbitrary
sequence $\{\mu_n\}_{n=0}^{\infty}$ with $\mu_0 = 1$ if (1) there exists a function $g(\cdot) \ge 0$ that satisfies (7.3); (2) if the
solution is unique; and (3) a closed form expression for any solution $g(\cdot)$. Shohat and Tamarkin
(1943) provide a rigorous treatment of this and related moment problems. They prove that if a
solution exists to the Hausdorff moment problem, it is unique (as it would indeed have to be from
Proposition 1). Moreover, by Theorem 3.7 of Shohat and Tamarkin (p. 91), they show that the
solution is given by

$$g(x) = \sum_{n=0}^{\infty} \lambda_n P_n(x)$$

where

$$\lambda_n = \int_0^1 g(x) P_n(x)\, dx$$

But if we let $P_n(x) = \sum_{j=0}^{n} c_{nj} x^j$, then

$$\begin{aligned}
\int_0^1 g(x) P_n(x)\, dx &= \int_0^1 g(x) \left(\sum_{j=0}^{n} c_{nj} x^j\right) dx \\
&= \sum_{j=0}^{n} c_{nj} \int_0^1 g(x)\, x^j\, dx \\
&= \sum_{j=0}^{n} c_{nj} \mu_j
\end{aligned}$$

Hence, steps (1) and (2) allow us to recover the function $g(x)$ where

$$\begin{aligned}
g(x) &= G\left(F^{-1}\left(\frac{1 - p^x}{q}\right)\right) \\
&= \int_0^{\infty} \left(1 - F\left(F^{-1}\left(\frac{1 - p^x}{q}\right) + \Delta\right)\right) d\Delta \\
&= \int_{F^{-1}\left(\frac{1 - p^x}{q}\right)}^{\infty} (1 - F(t))\, dt
\end{aligned}$$

where the last step uses the change in variables $t = F^{-1}\left(\frac{1 - p^x}{q}\right) + \Delta$. Integration by parts and
a little algebra reveals that we can rewrite the above integral as

$$g(x) = \int_{\frac{1 - p^x}{q}}^{1} \left[F^{-1}(u) - F^{-1}\left(\frac{1 - p^x}{q}\right)\right] du$$

Differentiating both sides with respect to $x$ yields

$$g'(x) = \left(1 - \frac{1 - p^x}{q}\right) {F^{-1}}'\left(\frac{1 - p^x}{q}\right) \frac{p^x}{q} \ln p$$

Using the change of variables

$$z = \frac{1 - p^x}{q}$$

and integrating, we have

$$\int_0^x \frac{q\, g'\left(\frac{\ln(1 - qz)}{\ln p}\right)}{(1 - z)(1 - qz) \ln p}\, dz = \int_0^x {F^{-1}}'(z)\, dz$$

or

$$F^{-1}(x) = \int_0^x \frac{q\, g'\left(\frac{\ln(1 - qz)}{\ln p}\right)}{(1 - z)(1 - qz) \ln p}\, dz + \text{constant}$$

which completes the proof. $\blacksquare$

Table 1: Summary Statistics for Entire Sample

# of individuals                          6,284

individual characteristics:               mean      median
  age                                     24.6      25.0
  years of potential experience            8.3       9.0
  years of education                      12.7      12.0

# of jobs                                 44,593

job characteristics:
  % jobs ending voluntarily               0.35
  % jobs ending involuntarily             0.50
  % jobs censored/not classified          0.15

  average job tenure (uncensored)         1.05
  average wage (1992 dollars)             $7.00
  median wage (1992 dollars)              $5.40

Source: National Longitudinal Survey of Youth, author tabulations. Statistics above are for the original
sample, i.e. for all jobs reported in each year.

Table 2: Estimating Returns to Tenure γ

linear returns to tenure

                    within-job       experience       tenure
                    wage growth      effect           effect
                    β1 + γ           β1               γ
  estimate          0.0794           0.0740           0.0054
  (std. error)      (0.0065)         (0.0061)         (0.0024)

                                   1 year     2 years    5 years    7 years    10 years
  implied returns to tenure        0.0054     0.0108     0.0271     0.0380     0.0542
                                   (0.0024)   (0.0049)   (0.0122)   (0.0171)   (0.0245)
  implied returns to experience    0.0723     0.1411     0.3270     0.4337     0.5680
                                   (0.0058)   (0.0109)   (0.0226)   (0.0274)   (0.0300)

quadratic returns to tenure

                    within-job       experience       tenure           tenure
                    wage growth      effect           effect           squared
                    β1 + γ1          β1               γ1               γ2
  estimate          0.0826           0.0661           0.0165           -0.0016
  (std. error)      (0.0065)         (0.0067)         (0.0024)         (0.00048)

The regressions above follow the two-step method outlined in Topel (1991). The first stage regresses annual within-job real
wage growth (in 1992 dollars using the implicit GDP deflator) on ∆EXP (= constant) and ∆EXP². This is the same
regression as in column (1) of Table 4, where β1 + γ corresponds to the coefficient on ∆EXP. The second stage regresses the log
real wage net of the estimated (β1 + γ)TEN + β2 EXP² on initial experience and individual fixed-effects. The coefficient on
initial experience corresponds to the estimate of β1, and the difference between the first-stage estimate of β1 + γ and this
estimate of β1 corresponds to the estimate of γ above. Standard errors for β1 and γ are adjusted to reflect estimation error in
the first-stage regressor, using the stacking and weighting procedure in Altonji and Williams (1998). Returns to tenure and
experience in the middle of the table are based on estimates for γ, β1, and β2. In the bottom panel, the first stage regression
is amended to allow for a ∆TEN² term, which is then subtracted from the log real wage at the second stage.
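The implied returns in the middle panel can be approximately reproduced from the reported point estimates. The sketch below assumes that the cumulative return to T years of tenure is γT and that the cumulative return to T years of experience is β1 T + β2 T², with β2 taken to be the coefficient on ∆EXP² in column (1) of Table 4. These functional forms are an inference from the table notes rather than a statement of the exact calculation used, and small discrepancies arise because the inputs are rounded.

```python
# Point estimates as reported in Table 2 (top panel) and Table 4, column (1).
gamma = 0.0054    # tenure effect
beta1 = 0.0740    # linear experience effect
beta2 = -0.0017   # coefficient on the change in squared experience

def implied_tenure_return(years):
    """Cumulative log-wage return to tenure, assuming linear returns."""
    return gamma * years

def implied_experience_return(years):
    """Cumulative log-wage return to experience, assuming a quadratic profile."""
    return beta1 * years + beta2 * years ** 2

for years in (1, 2, 5, 7, 10):
    print(years,
          round(implied_tenure_return(years), 4),
          round(implied_experience_return(years), 4))
```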

Table 3: Estimates for p

                    Sample                 Standard     Implied
                    size        p          error        λ1/δ
All                 22,135      0.4823     0.0031       1.074

Educ < 12           6,515       0.5008     0.0055       0.997
Educ = 12           6,648       0.4797     0.0058       1.085
Educ ∈ (13,15)      5,436       0.4504     0.0062       1.220
Educ > 16           3,536       0.5049     0.0082       0.981

Estimates for p are derived using maximum likelihood in accordance with Proposition 2 in the text.
Sample size corresponds to the number of jobs that end in an involuntary job change used to estimate p.
The standard error is the asymptotic standard error. The implied ratio in the last column is computed
according to the formula p = (1 + λ1/δ)^(-1).
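The last column follows from inverting the formula in the note: λ1/δ = 1/p - 1. A minimal sketch, using the point estimates of p from the table (the dictionary and function names are hypothetical):

```python
# Point estimates of p as reported in Table 3.
p_estimates = {
    "All": 0.4823,
    "Educ < 12": 0.5008,
    "Educ = 12": 0.4797,
    "Educ in (13,15)": 0.4504,
    "Educ > 16": 0.5049,
}

def implied_ratio(p):
    """Invert p = 1 / (1 + lambda1/delta) to obtain lambda1/delta."""
    return 1.0 / p - 1.0

for group, p in p_estimates.items():
    print(f"{group:16s} p = {p:.4f}  lambda1/delta = {implied_ratio(p):.3f}")
```

The results agree with the reported column up to rounding in the last digit, which is to be expected since the inputs here are the four-digit values of p rather than the unrounded estimates.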

Table 4: The Wage Gains of Voluntary Job Changers, by n

                 sample size    (1)          (2)          (3)
                                                          exponential
∆EXP             --             0.0794       0.0809       0.0816
                                (0.0065)     (0.0050)     (0.0050)
∆EXP²            --             -0.0017      -0.0018      -0.0018
                                (0.0003)     (0.0002)     (0.0002)
D12              2,473                       0.0900
                                             (0.0094)
D23              993                         0.0711
                                             (0.0137)
D34              459                         0.0799       0.0806
                                             (0.0200)     (0.0072)
D45              206                         0.0168
                                             (0.0331)
D56              78                          0.0799
                                             (0.0520)

# obs                           28,015       31,868       31,868
  stayers                       28,015       28,015       28,015
  changers                      0            3,853        3,853

Test of particular functional forms:
  Exponential    F(4, 31861) = 1.31    Prob > F = 0.2639
  Normal         F(4, 31861) = 3.12    Prob > F = 0.0140

The dependent variable is the annual growth rate of real wages. The independent variables are
∆EXP, which is identically equal to 1, ∆EXP², which is equal to 2 EXP - 1, and a
set of dummy variables Dn,n+1 equal to 1 if the worker moved from his n-th job to his (n+1)-th
job. The column labeled sample size denotes the number of workers in my sample who
voluntarily left their n-th job for each value of n. Column (1) estimates the coefficients on
∆EXP and ∆EXP² using job stayers only. Column (2) adds job changers and estimates the
coefficients on the dummy variables as well. Column (3) estimates the same regression as in
column (2) assuming the coefficients on all the dummy variables are equal, which from the text
is true if and only if the log wage offer distribution is exponential. The numbers below each
coefficient, in parentheses, denote robust standard errors. The F-statistics in the bottom panel are the robust
Wald-statistics that test constraints on the coefficients on the dummy variables in column (2).
The exponential case compares column (3) to column (2), while the normal case involves an
alternative set of linear restrictions on the coefficients on the dummy variables.
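The construction of the regressors described in the note can be sketched as follows. The function name, argument names, and the choice to represent each observation as a plain list are hypothetical; the sketch only illustrates that ∆EXP is always 1, that ∆EXP² = EXP² - (EXP - 1)² = 2 EXP - 1, and that at most one dummy Dn,n+1 is switched on per observation.

```python
def wage_growth_regressors(exp, move_from=None, max_n=5):
    """Regressors for one annual wage-growth observation.

    exp       -- experience in the current year (the previous year is exp - 1)
    move_from -- n if the worker voluntarily moved from the n-th to the
                 (n+1)-th job during the year, None for a job stayer
    max_n     -- number of dummies D12, D23, ..., included in the regression
    """
    delta_exp = 1                            # experience rises by one year
    delta_exp2 = exp ** 2 - (exp - 1) ** 2   # equals 2 * exp - 1
    dummies = [1 if move_from == n else 0 for n in range(1, max_n + 1)]
    return [delta_exp, delta_exp2] + dummies

# A stayer with 4 years of experience, and a worker with 6 years of
# experience who moved from a second to a third job.
print(wage_growth_regressors(4))
print(wage_growth_regressors(6, move_from=2))
```

Under the exponential restriction in column (3), the five dummies would be replaced by a single indicator for any voluntary job change, since all of their coefficients are constrained to be equal.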

Table 5: The Wage Losses of Involuntary Job Changers, by n

                 sample size    (1)          (2)
                                             exponential
∆EXP             --             0.0837       0.0849
                                (0.0062)     (0.0050)
∆EXP²            --             -0.0020      -0.0020
                                (0.0002)     (0.0002)
D11              2,767          0.0029
                                (0.0094)
D21              873            0.0843
                                (0.0153)
D31              305            0.0904       0.0816
                                (0.0278)     (0.0130)
D41              137            0.0942
                                (0.0432)
D51              50             0.0754
                                (0.0726)

# obs                           31,844       31,844
  stayers                       28,015       28,015
  changers                      3,829        3,829

Test of particular functional forms:
  Exponential    F(4, 31837) = 1.24    Prob > F = 0.2905
  Normal         F(4, 31837) = 1.08    Prob > F = 0.3629

The dependent variable is the annual growth rate of real wages. The independent variables are
∆EXP and ∆EXP² as in Table 4, and a set of dummy variables Dn1 equal to 1 if the worker
involuntarily left his n-th job. The column labeled sample size denotes the number
of workers who involuntarily left their n-th job for each value of n. Column (1) reports the
results of this regression, while column (2) estimates the same regression as in column (1) with
a particular set of linear restrictions on the coefficients of the dummy variables that are true if
and only if the log wage offer distribution is exponential. The numbers below each coefficient, in parentheses,
denote robust standard errors. The F-statistics in the bottom panel are the robust Wald-statistics
that test constraints on the coefficients on the dummy variables in column (1). The
exponential case compares column (2) to column (1), while the normal case involves an
alternative set of linear restrictions on the coefficients on the dummy variables.

Figure 1: Expected Record Gaps for Different Parent Distributions
[Figure: E(Rn+1 - Rn | N > n) plotted against n = 1, ..., 5 for an exponential parent and for a normal parent; vertical axis from 0.00 to 0.11.]

Figure 2: Summary Statistics for n
[Figure, two panels plotted by year; vertical axes show proportions.]
Figure 1a: Proportion of observations where no value for n was assigned (1978 to 1993)
Figure 1b: Share of all observations with n > 1 for each level of n (1979 to 1993; series for n = 1, ..., 5)

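Figure 3: Actual vs. Predicted Wage Loss for Involuntary Job Changers
[Figure: wage loss plotted against n = 1, ..., 5, with series for the exponential distribution ("exp dist"), the normal distribution ("normal dist"), and the upper and lower 95% confidence bounds; vertical axis from -0.10 to 0.25.]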

References
[1] Altonji, Joseph and Nicolas Williams, 1997. “Do Wages Rise with Job Seniority? A Reassessment” Mimeo, Northwestern University.
[2] Altonji, Joseph and Nicolas Williams, 1998. “The Effects of Labor Market Experience, Job
Seniority, and Job Mobility on Wage Growth” Research in Labor Economics, 17, p233-276.
[3] Arnold, Barry, N. Balakrishnan and H. Nagaraja, 1992. A First Course in Order Statistics.
New York: John Wiley and Sons.
[4] Arnold, Barry, N. Balakrishnan and H. Nagaraja, 1998. Records. New York: John Wiley and
Sons.
[5] Barlevy, Gadi, 2002. “The Sullying Effect of Recessions” Review of Economic Studies, 69 (1),
January, p65-96.
[6] Beaudry, Paul and John DiNardo, 1991. “The Effect of Implicit Contracts on the Movement
of Wages over the Business Cycle: Evidence from Micro Data” Journal of Political Economy,
August, 99(4), p665-88.
[7] Bontemps, Christian, Jean-Marc Robin, and Gerard van den Berg, 2000. “Equilibrium Search
with Continuous Productivity Dispersion: Theory and Nonparametric Estimation” International Economic Review, May, 41(2), p305-358.
[8] Bowlus, Audra, Nicholas Kiefer, and George Neumann, 2001. “Equilibrium Search Models and
the Transition from School to Work” International Economic Review, May, 42(2), p317-43.
[9] Bowlus, Audra and Jean-Marc Robin, 2003. “Twenty Years of Rising Inequality in US Lifetime
Labor Income Values” Review of Economic Studies, forthcoming.
[10] Bunge, John and H. Nagaraja, 1991. “The Distributions of Certain Record Statistics from a
Random Number of Observations” Stochastic Processes and Their Applications, 38, p167-83.
[11] Burdett, Kenneth and Dale Mortensen, 1998. “Wage Differentials, Employer Size, and Unemployment” International Economic Review, 39, p257-273.
[12] Chandler, K. N., 1952. “The Distribution and Frequency of Record Values” Journal of the
Royal Statistical Society, Series B, 14, p220-8.
[13] Flinn, Christopher, 1986. “Wages and Job Mobility of Young Workers” Journal of Political
Economy 94(3, Part 2), pS88-S110.

[14] Flinn, Christopher, 2002. “Labour Market Structure and Inequality: a Comparison of Italy
and the U.S.” Review of Economic Studies, July, 69 (3), p611-45.
[15] Flinn, Christopher and James Heckman, 1982. “New Methods for Analyzing Structural Models
of Labor Force Dynamics” Journal of Econometrics, January, 18(1), p115-68.
[16] Glick, Ned, 1978. “Breaking Records and Breaking Boards” American Mathematical Monthly,
85(1), p2-26.
[17] Kirmani, S. N. U. A. and Beg, M. I., 1984. “On characterization of distributions by expected
records” Sankhyā A, 46(3), p463-465.
[18] Kortum, Samuel, 1997. “Research, Patenting, and Technological Change” Econometrica,
65(6), November, p1389-1419.
[19] Lin, G. D., 1987. “On Characterizations of Distributions via Moments on Record Values”
Probability Theory and Related Fields, 74, p479-83.
[20] Munasinghe, Lalith, Brendan O’Flaherty, and Stephan Danninger, 2001. “Globalization and
the Rate of Technological Progress: What Track and Field Records Show” Journal of Political
Economy, 109(5), October, p1132-49.
[21] Nagaraja, H. N. and Gadi Barlevy, 2003. “Characterizations Using Record Moments in a
Random Record Model and Applications” Journal of Applied Probability, September, 40(3),
p826-33.
[22] Neal, Derek and Sherwin Rosen, 2000. “Theories of the Distribution of Earnings” Handbook
of Income Distribution, 1, Amsterdam: Elsevier Science, North-Holland, p379-427.
[23] Nevzorov, V. B. and N. Balakrishnan, 1998. “A Record of Records” Handbook of Statistics:
Order Statistics, Theory and Methods, v16, eds. N. Balakrishnan and C. R. Rao, Amsterdam:
North-Holland, p515-70.
[24] Paxson, Christina and Nachum Sicherman, 1996. “The Dynamics of Dual-Job Holding and
Job Mobility” Journal of Labor Economics, July, 14(3), p357-93.
[25] Postel-Vinay, Fabien and Jean-Marc Robin, 2002. “Equilibrium Wage Dispersion with Worker
and Employer Heterogeneity” Econometrica, November, 70(6), p2295-2350.
[26] Shohat J. A. and J. D. Tamarkin, 1943. The Problem of Moments. New York: American
Mathematical Society.
[27] Talenti, Giorgio, 1987. “Recovering a Function from a Finite Number of Moments” Inverse
Problems, August, 3(3), p501-17.

[28] Topel, Robert, 1991. “Specific Capital, Mobility, and Wages: Wages Rise with Job Seniority”
Journal of Political Economy, February, 99(1), p145-76.
[29] Topel, Robert and Michael Ward, 1992. “Job Mobility and the Careers of Young Men” Quarterly Journal of Economics, May, 107(2), p439-79.
[30] van den Berg, Gerard and Geert Ridder, 1998. “An Empirical Equilibrium Search Model of
the Labor Market” Econometrica, September, 66(5), p1183-1221.
[31] Wolpin, Kenneth, 1992. “The Determinants of Black-White Differences in Early Employment
Careers: Search, Layoffs, Quits, and Endogenous Wage Growth” Journal of Political Economy, June, 100(3), p535-60.
