Incarceration, Earnings, and Race

WP 21-11

Grey Gordon
Federal Reserve Bank of Richmond
John Bailey Jones
Federal Reserve Bank of Richmond
Urvi Neelakantan
Federal Reserve Bank of Richmond
Kartik Athreya
Federal Reserve Bank of Richmond

Incarceration, Earnings, and Race∗
Grey Gordon†

John Bailey Jones‡

Urvi Neelakantan§

Kartik Athreya¶
July 2, 2021

Abstract
We study the implications of incarceration for the earnings and employment
of different groups, characterized by their race, gender, and education. Our
hidden Markov model distinguishes between first-time and repeat incarceration,
along with other persistent and transitory nonemployment and earnings risks,
and accounts for nonresponse bias. We estimate the model using the National
Longitudinal Survey of Youth 1979 (NLSY79), one of the few panel datasets
that includes incarcerated individuals. The consequences of incarceration are
enormous: First-time incarceration reduces expected lifetime earnings by 39%
(59%) and employment by 8 (13) years for black (white) men with a high
school degree. Conversely, nonemployment and adverse earnings shocks increase
expected years in jail. Among less-educated men, differences in incarceration
and nonemployment can explain a significant portion of the black-white gap in
lifetime earnings—44% of the gap for high school graduates and 52% of the gap
for high school dropouts.
Keywords: earnings dynamics, incarceration, racial inequality
JEL Codes: C23, D31, J15
∗ We are grateful to Amanda Michaud for helpful comments and to Thomas Lubik and Mark
Watson for early guidance. Emily Emick and Luna Shen provided excellent research assistance. The
views expressed here are those of the authors and should not be interpreted as reflecting the views
of the Federal Reserve Bank of Richmond or the Federal Reserve System.
† Federal Reserve Bank of Richmond
‡ Federal Reserve Bank of Richmond
§ Federal Reserve Bank of Richmond
¶ Federal Reserve Bank of Richmond

1 Introduction

Between 1980 and 2016, the incarceration rate in the United States rose from 0.22% to
0.67% (US Department of Justice, Bureau of Justice Statistics, 2018). The impact of
this three-fold increase has fallen disproportionately on those who are male, black, or
less educated: The imprisoned population is overwhelmingly (91-93%) male and less
educated (Ewert and Wildhagen, 2011), and the imprisonment rate for black men
(2.2%) is nearly six times that for white men (US Department of Justice, Bureau
of Justice Statistics, 2019, Table 10).1 Within certain groups, incarceration is now
pervasive: Western and Pettit (2010, Table 1) find that among black high school
dropouts born between 1975 and 1979, 68% had been incarcerated at least once.
Despite the prevalence and growth of incarceration in the United States, much
remains unknown about the relationship between incarceration, employment, earnings, and demographics.2 Our goals in this paper are to (1) quantify the dynamic
relationship between incarceration, employment, and earnings and (2) measure the
contribution of incarceration and other forms of nonemployment to the earnings gap
between black and white men.3 To this end, we estimate a statistical model of incarceration, employment, and earnings over the life cycle, using a flexible framework
that controls semi-parametrically for race, education, and gender. Our hidden Markov
model allows for transitory and persistent nonemployment spells, movements up and
down the (positive) earnings distribution, and arbitrarily long-lasting effects of incarceration. Transition probabilities depend on age, gender, race, education, and previous incarceration. We estimate the model using the National Longitudinal Survey of
Youth 1979 (NLSY79), one of the few panel datasets that reports incarceration. We
explicitly account for missing data and allow for the possibility that its incidence is
not random.

1 The majority of style guides currently recommend that neither “black” nor “white” be capitalized, so we follow this convention. The incarceration rate and the imprisonment rate are distinct
measures, as only the former includes people held in local jails. While the measure we use in our
analyses is incarceration, many national statistics are available only for the imprisoned.
2 As Neal and Rick (2014) observe, there is a need for more research on the effects of incarceration on “the employment and earnings prospects of less-skilled men, and less-skilled black men in
particular.”
3 Although we study both men and women, we focus on reporting results for men because they
comprise the vast majority of the incarcerated. We report a few results and statistics for women in
Section 5 and in the Appendix. Additional results are available upon request.

Our estimates show that the income losses associated with incarceration and long-term nonemployment are enormous. A typical 25-year-old black (white) male high
school graduate entering jail for the first time will, relative to an otherwise identical
man, suffer a lifetime income loss of $121,000 ($273,000) in 1982-1984 dollars, a 40%
(54%) drop. For those without a high school degree, the losses amount to $103,000
for black men and $170,000 for white men, drops of 54% and 49%, respectively.
The large lifetime effects are a consequence of essentially permanent
reductions in flow earnings after an incarceration spell. The earnings losses from a
transition to persistent nonemployment are also significant. For black male high school
graduates, the loss ($113,000) is comparable to the loss from incarceration, while the
loss for white men ($154,000) is somewhat smaller.
Although the effects of incarceration and nonemployment are profound for both
black and white men, their incidence differs markedly. For high school graduates, our
estimates indicate that while 24% of black men will eventually be incarcerated, only
3% of white men will.4 Differences in nonemployment outside of incarceration are
also quite large. Between ages 22 and 57, black men with a high school degree will on
average experience 8.5 years of nonemployment, 4.5 years more than whites.
To summarize, the likelihood of nonemployment and incarceration is higher for
black men than for their white counterparts, while the effects on earnings are typically larger for white men. What, then, is the net effect of these forces on the lifetime
earnings gap between the two groups?5 One way we answer this question is to eliminate incarceration and/or nonemployment and recalculate the gap. In the baseline,
white male high school graduates earn 65% more than black male high school graduates over their lifetimes. Eliminating incarceration alone would reduce this to 59%,
while eliminating nonemployment alone would reduce it to 44%. If both incarceration
and nonemployment were eliminated, the lifetime earnings of white male high school
graduates would exceed those of black males by 37%. Alternatively, a formal decomposition suggests that 46% of the lifetime earnings gap for high school graduates is
attributable to nonemployment and/or incarceration. This fraction is higher (67%)
for high school dropouts and lower (20%) for college graduates.

4 Because the NLSY79 cohort came of age before the height of the incarceration boom, their
incarceration rates, while high, fall below those realized at the height of the boom.
5 We again use numbers for male high school graduates; similar patterns hold across
all education levels.

In addition to our substantive findings, our paper introduces a rich yet relatively
tractable framework for earnings processes. Our framework integrates nonemployment, incarceration and earnings, imposes few distributional assumptions, and builds
on a well-established statistical literature. Whenever incarceration, or more generally any discrete outcome, is important to understanding earnings, our framework
provides a flexible way to account for it.

1.1 Related literature

Our paper contributes to three bodies of work: the study of the impact of incarceration
on employment and earnings, the study of the black-white earnings gap, and the study
of earnings processes in general.
The data show unambiguously that “labor market prospects after prison are bleak”
(Travis et al., 2014, page 233). In their review (and borrowing from Pager, 2008),
Travis et al. (2014) discuss three potential explanations. The first is selection: Individuals with poor job market prospects are more likely to acquire a criminal record.6
The second is transformation: Time spent in jail or prison changes individuals in ways
undesirable to employers. The third is labeling: A history of imprisonment in and of
itself makes an individual less desirable to employers. There are legal restrictions
(and/or liability concerns) regarding what positions those with a criminal record can
fill. Moreover, consistent with the first two mechanisms, a criminal record may signal
undesirable traits.
The leading empirical issue in this literature is controlling for the first mechanism, non-random selection into incarceration. Travis et al. (2014) describe several
methodological responses. Among studies using survey data, the leading strategy is
to construct “control groups” of nonincarcerated individuals who otherwise resemble
the incarcerated. This has many parallels with our approach, where we condition on
an individual’s incarceration and earnings history, as well as their education, gender,
and race. These studies generally find that incarceration depresses subsequent labor
market outcomes. Among studies using administrative data, a popular strategy is to
exploit exogenous variation in incarceration due to the random assignment of judges
(Kling 2006, Loeffler 2013, Mueller-Smith 2015). As a whole, studies that use administrative data, with or without the judge instrument, provide mixed support for a
causal interpretation.7

6 Throughout the document, we use “criminal record” to mean a record that includes time spent
in jail or prison.

Like most models of earnings processes, our framework is statistical, and episodes
of incarceration therein are not strictly exogenous. On the other hand, most individuals in our data go to jail or prison after we first observe them, allowing us to show
that individuals with low earnings are more likely to transition into incarceration.
Moreover, the NLSY79 cohort happened to live through a period where aggregate
incarceration rates increased dramatically, implying that much of the variation in
incarceration is exogenous to the individual.
Irrespective of whether incarceration is driven by worker characteristics or by
chance, it is valuable to know how labor market outcomes change in its aftermath,
and our framework allows us to do this. In particular, our framework allows us to track
earnings and employment for decades, enabling us to study the long-run dynamics
and cumulative effects of incarceration.
In addition to the purely empirical literature discussed by Travis et al. (2014),
there are a number of structural studies that incorporate incarceration, including
Lochner (2004), Fella and Gallipoli (2014), Fu and Wolpin (2018), and Guler and
Michaud (2018). These generate earnings losses in various ways. For example, Guler
and Michaud (2018) assume that incarceration leads to human capital depreciation
and a higher proclivity for crime. Relative to these structural studies, our approach
allows for a more flexible specification, with a rich set of age and demographic controls.
Our results can also complement structural analyses by providing estimation targets
like those used in Guler and Michaud (2018).
Our paper also contributes to the empirical literature on the black-white earnings
gap. As Bayer and Charles (2018) document, this gap has proven remarkably persistent: As a proportion of the median earnings of white men, the median earnings
of black men are no higher today than they were in 1950. They attribute much of
the difference to a large and expanding gap in employment; the gap in median earnings among male workers has in fact narrowed considerably.8 As the growth of the
employment gap has coincided with the surge in incarceration, it is natural to ask
whether the two are related.

7 This may reflect limitations of administrative data; although accurate, most administrative
sources have fewer variables, explanatory or outcome, than do surveys.

The third literature to which our paper contributes is the estimation and analysis of earnings processes. This literature is huge; an incomplete list of papers includes Abowd and Card (1989), Meghir and Pistaferri (2004), Bonhomme and Robin
(2009), Guvenen (2009), Bonhomme and Robin (2010), Altonji et al. (2013), Hu
et al. (2019), De Nardi et al. (2020) and Guvenen et al. (2020). Our paper adds to
this literature in three ways. The first is that it explicitly accounts for incarceration.
Earlier earnings process studies have not differentiated between incarceration and
other forms of nonemployment. Because of data limitations—many data sets exclude
the institutionalized—they might not have had the capacity to do so. Second, many
earnings process studies have focused on the continuously employed. Our approach
combines incarceration, nonemployment, and positive earnings in a unified framework. This allows us to account for the possibility that incarceration is likely to be
preceded, as well as followed, by low earnings.9
We also make a methodological contribution to the literature. Like Arellano et al.
(2017), we define transition probabilities in terms of quantiles, rather than levels,
which allows for nonnormal shocks and variable persistence. We target a different
set of quantiles, however, which allows us to utilize existing work on latent Markov
chains (e.g., Bartolucci et al. 2010, Bartolucci et al. 2012). One advantage of our
framework is that it allows us to differentiate between short- and long-term spells of
nonemployment. Hence, we can capture varying levels of labor market attachment.
Our framework also lets us deal with missing data flexibly, allowing its incidence to
be nonrandom and persistent over time.
The rest of the paper is organized as follows. In section 2, we introduce our statistical model, and in section 3, we describe the data. In section 4, we interpret our
parameter estimates. In section 5, we discuss the model’s implications for employment,
incarceration, and earnings over the life cycle and calculate the changes in lifetime
earnings and employment that follow an episode of incarceration. In section 6, we
assess the contributions of incarceration and other forms of nonemployment to the
racial gap in lifetime earnings. We conclude in section 7.

8 Bayer and Charles (2018) also emphasize the role of race-neutral increases in the returns to
education, which have amplified the effects of education differences.
9 In estimating the earnings process for their structural model, Caucutt et al. (2018) assume that
ex-convicts transition into either unemployment or the lowest possible positive earnings quintile. On
the other hand, they assume that the probability that an individual becomes incarcerated in the
future depends only on whether the individual is incarcerated at present.

2 Statistical Model and Methodology

Our model of earnings contains two variables: an unobserved latent state that follows
a Markov chain; and a discrete-valued observed outcome, the distribution of which
depends only on the current latent state. This is a variant of the ubiquitous state-space
framework, arguably most akin to Hamilton’s (1989) regime-switching model.10

2.1 Latent States and Observed Outcomes

Let $\ell_{n,t} \in \mathcal{L} = \{L_0, L_1, \ldots, L_{I-1}\}$ denote individual $n$’s underlying, latent labor market state at date $t$, and let $m_{n,t} \in \mathcal{M} = \{M_0, M_1, \ldots, M_{J-1}\}$ denote the earnings
outcome observed by the researcher. The set of latent states, $\mathcal{L}$, consists of incarceration, long-term nonemployment, and $Q^*$ earnings potential bins.11 This set of latent
outcomes is then interacted with a {0, 1} criminal record flag, so that $\mathcal{L}$ contains
$I = 2(Q^* + 2)$ elements. The set of observed outcomes, $\mathcal{M}$, consists of incarceration,
current nonemployment, $Q$ positive earnings bins, and not interviewed/missing. The
nonmissing outcomes are also interacted with the criminal record flag, so that $\mathcal{M}$
contains $J = 2(Q + 2) + 1$ elements.
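
The counting argument above can be checked with a minimal Python sketch. The state labels and function names are hypothetical and chosen only for illustration; the sketch simply interacts each labor market state with the {0, 1} criminal record flag and, for observed outcomes, appends a single not-interviewed outcome.

```python
from itertools import product

Q_STAR = 10  # latent earnings potential bins (deciles, as in the text)
Q = 10       # observed positive earnings bins


def latent_states(q_star=Q_STAR):
    """Latent set: each labor state paired with a {0, 1} criminal record flag."""
    labor = ["incarcerated", "long_term_nonemployed"]
    labor += [f"potential_bin_{q}" for q in range(1, q_star + 1)]
    return [(state, flag) for flag, state in product((0, 1), labor)]


def observed_outcomes(q=Q):
    """Observed set: nonmissing outcomes times the flag, plus 'not interviewed'."""
    labor = ["incarcerated", "nonemployed"]
    labor += [f"earnings_bin_{k}" for k in range(1, q + 1)]
    nonmissing = [(outcome, flag) for flag, outcome in product((0, 1), labor)]
    return nonmissing + [("not_interviewed", None)]


I = len(latent_states())      # equals 2 * (Q_STAR + 2)
J = len(observed_outcomes())  # equals 2 * (Q + 2) + 1
print(I, J)
```

With deciles for both latent and observed earnings, this gives I = 24 latent states and J = 25 observed outcomes.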
We discretize the distributions of both earnings potential and observed earnings
(when positive). This both simplifies the estimation process and produces estimates
that port directly into dynamic structural models. As we show below, we can increase
the number of earnings bins without increasing the number of model parameters. It
also bears noting that the bins represent quantile rank (conditional on race, gender, education, and age), rather than level, groupings. As the extensive literature on
copulas (see, e.g., Trivedi and Zimmer 2007) has shown, working in quantile space
is an effective way to model non-normal shocks and variable persistence.12 Let $p_q$,
where $q = 0, 1, \ldots, Q$, denote the probability cutoffs for the earnings bins; in a modest
abuse of notation we will also use $q$ to index the bin given by the interval $(p_{q-1}, p_q)$,
$q = 1, 2, \ldots, Q$. We partition earnings into deciles, so that $p_q \in \{0.0, 0.1, \ldots, 0.9, 1\}$, and
there are $Q = 10$ bins. We assume further that the bins for latent earnings potential
are the same as those for observed earnings, so that $Q^* = Q$; this is straightforward if
tedious to relax. We estimate the deciles for observed earnings, semi-parametrically,
in a separate procedure.

10 See also Farmer (2020). Bartolucci et al. (2010) provide an introduction.
11 An individual’s earnings potential is his or her unobserved earnings capacity.

Our model is based on two key assumptions. The first is that $\ell_{n,t}$ is conditionally
Markov, with the $I \times I$ transition matrix $A_x$:

$$A_{j,k \mid x} = \Pr(\ell_{n,t+1} = L_k \mid \ell_{n,t} = L_j, x_{n,t}) = \Pr(\ell_{n,t+1} = L_k \mid \mathcal{F}_t), \tag{1}$$

where $x_{n,t}$ is a vector of exogenous variables, $\mathcal{F}_t$ denotes the time-$t$ information set,
and $A_{j,k \mid x}$ denotes the element in row $j$ and column $k$ of $A_x$. In our case, $x_{n,t}$ contains an individual’s
age (which enters parametrically) and their race, gender, and education level (which
enter semi-parametrically, as there are separate sets of parameters for each group).13
The second assumption is that the distribution of the observed outcome $m_{n,t}$ depends
on only the contemporaneous realization of $\ell_{n,t}$. We place the probabilities that map
$\ell_{n,t}$ to $m_{n,t}$ in the $I \times J$ matrix $B_z$:

$$B_{j,k \mid z} = \Pr(m_{n,t} = M_k \mid \ell_{n,t} = L_j, z_{n,t}) = \Pr(m_{n,t} = M_k \mid \mathcal{F}_{t-1}, \ell_{n,t}). \tag{2}$$

The vector $z_{n,t}$ is the concatenation of $x_{n,t}$ and an indicator of whether the individual
was interviewed in period $t - 1$, which captures the persistence of nonresponse. The
final element of our model is the $1 \times I$ row vector $\mu_1$, which gives the unconditional
distribution of the initial latent state $\ell_{n,1}$ conditional on $x_{n,1}$.
For the remainder of the section, we will drop the individual index $n$ and suppress
the probabilities’ dependence on $x$ and $z$.

12 Arellano et al. (2017) also rely heavily on quantiles, for similar reasons. As we discuss in Appendix A, however, the structure of their approach is very different from ours.
13 Recall that $\ell_{n,t}$ includes whether the individual has been previously incarcerated.

Figure 1: Earnings process transitions
[Diagram. Top half, transition matrix A (l_t → l_t+1): the current latent state l_t is one of {Jail, Not Employed, bin 1, bin 2, ..., bin Q*}; logit transition probabilities lead to Jail, Not Employed, or bins 1 through Q*, and a Kumaraswamy CDF that depends on l_t governs bins 1 through Q*. Bottom half, observation matrix B (l_t+1 → m_t+1): the observed outcome is one of {Not Interviewed, Jail, Not Employed, bin 1, bin 2, ..., bin Q}.]

2.2 Latent State Transitions

As the top half of Figure 1 shows, we populate the transition matrix $A$ in two steps.
First, we use a multinomial logit regression to determine the one-period-ahead probabilities of incarceration, long-term nonemployment, or employment (bins 1 to $Q^*$).
We assume that an incarceration record is backward-looking and permanent, so that
once a person is incarcerated, he will have an incarceration record in all subsequent
periods.
The variables in this regression include the current state, age, and interactions.
Appendix A presents our exact specification. An important simplification is that we
characterize the earnings potential bins by their midpoint rank, $\tilde{p}_q := (p_q + p_{q-1})/2$. By
way of example, when earnings are partitioned into deciles, $\tilde{p}_q \in \{0.05, 0.15, \ldots, 0.95\}$.
Because we treat $\tilde{p}_q$ as continuous rather than categorical, the number of variables in
the logistic regression is invariant to the number of bins.
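
The role of the midpoint rank can be illustrated with a short Python sketch. The regressors (constant, midpoint rank, age) and all coefficient values below are hypothetical; the exact specification is the one in Appendix A, which this sketch does not reproduce. The point is only that the coefficient matrix has the same shape whether earnings are split into 10 bins or 100.

```python
import numpy as np


def midpoint_ranks(cutoffs):
    """Midpoint rank of each bin: (p_q + p_{q-1}) / 2."""
    c = np.asarray(cutoffs, dtype=float)
    return 0.5 * (c[1:] + c[:-1])


def multinomial_logit_probs(features, coefs):
    """Softmax over outcomes; coefs has one row per outcome."""
    scores = coefs @ features
    scores = scores - scores.max()  # guard against overflow
    weights = np.exp(scores)
    return weights / weights.sum()


deciles = np.linspace(0.0, 1.0, 11)
p_mid = midpoint_ranks(deciles)  # 0.05, 0.15, ..., 0.95

# Hypothetical coefficients. Rows: incarceration, long-term nonemployment,
# employment (base outcome). Columns: constant, midpoint rank, age.
coefs = np.array([
    [-3.0, -2.0, -0.02],
    [-2.0, -1.5, 0.01],
    [0.0, 0.0, 0.0],
])

# Someone currently in the third earnings potential bin at age 30.
x = np.array([1.0, p_mid[2], 30.0])
probs = multinomial_logit_probs(x, coefs)
print(probs)
```

Because the current bin enters through the single continuous regressor `p_mid[q]`, refining the grid changes the length of `p_mid` but not the number of coefficients.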
Second, we estimate the distribution of next period’s earnings potential, conditional on being employed, across the bins. To do this, we assume that the conditional
distribution of ranks follows the Kumaraswamy (1980) distribution. Like the Beta
distribution, the Kumaraswamy distribution is a flexible function defined over the
[0, 1] interval; however, its CDF is much simpler:

$$K(p; \alpha, \beta) = \Pr(y \le p; \alpha, \beta) = 1 - (1 - p^{\alpha})^{\beta}.$$

The parameters $\alpha$ and $\beta$ are both strictly positive. It follows that if earnings bin $q$
covers quantiles $p_{q-1}$ to $p_q$,

$$\Pr(\text{bin } q) = K(p_q; \alpha, \beta) - K(p_{q-1}; \alpha, \beta). \tag{3}$$

We allow $\alpha$ and $\beta$ to depend on the current state $\ell_t$ and the explanatory vector $x_t$, so
that $\alpha = \alpha(\ell_t, x_t)$ and $\beta = \beta(\ell_t, x_t)$. When the current state is the earnings bin $q_t$, we
characterize it by its midpoint value, $\tilde{p}_{q_t}$. Appendix A presents the full specification.
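
As a minimal sketch of equation (3), the following Python code evaluates the Kumaraswamy CDF at the bin cutoffs and differences it to obtain bin probabilities. The function names and the parameter values are illustrative assumptions; in the model, α and β would come from the specification in Appendix A.

```python
import numpy as np


def kumaraswamy_cdf(p, alpha, beta):
    """K(p; alpha, beta) = 1 - (1 - p**alpha)**beta on [0, 1]."""
    p = np.asarray(p, dtype=float)
    return 1.0 - (1.0 - p**alpha) ** beta


def bin_probabilities(cutoffs, alpha, beta):
    """Equation (3): Pr(bin q) = K(p_q) - K(p_{q-1}) for each bin."""
    return np.diff(kumaraswamy_cdf(cutoffs, alpha, beta))


cutoffs = np.linspace(0.0, 1.0, 11)  # decile cutoffs p_0, ..., p_10

# Illustrative parameter values only.
probs = bin_probabilities(cutoffs, alpha=2.0, beta=3.0)
print(probs.round(4), probs.sum())
```

Since K(0) = 0 and K(1) = 1, the bin probabilities sum to one for any strictly positive α and β, and α = β = 1 reduces to the uniform distribution across bins.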
Our functional forms place relatively few restrictions on the earnings transitions. As Jones (2009) argues, the Kumaraswamy distribution appears well-suited
for “quantile-based” statistical modeling, permitting a wide variety of shapes. Moreover, given enough terms, $\alpha(\cdot)$ and $\beta(\cdot)$ can vary in arbitrarily complicated ways,
allowing the conditional CDF $K\big(p; \alpha(\ell_t, x_t), \beta(\ell_t, x_t)\big)$ to vary in arbitrarily complicated ways. Strictly speaking, our approach is valid only when the true conditional
distribution of earnings potential is smooth. This is a standard assumption, however,
and we have separate nonemployment and incarceration states that absorb the mass
of zero-earnings outcomes. In Appendix B, we assess the ability of the Kumaraswamy
distribution to approximate a standard Gaussian AR(1) process and show that the
approximation works well.
Because we use the midpoint value $\tilde{p}_{q_t}$ to characterize the current earnings bin,
the number of parameters in $\alpha(\cdot)$ and $\beta(\cdot)$ need not increase with the number of bins
(see Appendix A). Even if we treat $\alpha(\cdot)$ and $\beta(\cdot)$ as sieve estimators, the number of
parameters will grow more slowly than the sample size. As the number of bins grows
large, we get the conditional CDF $K\big(p_{t+1}; \alpha(p_t, x_t), \beta(p_t, x_t)\big)$, where $p_t$ and $p_{t+1}$ are
both quantile ranks; at this point, $K(\cdot)$ is a copula. The probability difference in equation (3), appropriately deflated, likewise converges to the density of the underlying
Kumaraswamy distribution.
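
The limiting density mentioned above follows from differentiating the CDF. The short derivation below is a sketch that takes "appropriately deflated" to mean division by the bin width $p_q - p_{q-1}$, which is an assumption about the intended normalization.

```latex
\begin{align*}
K(p;\alpha,\beta) &= 1 - (1 - p^{\alpha})^{\beta}, \\
k(p;\alpha,\beta) &= \frac{\partial K(p;\alpha,\beta)}{\partial p}
  = \alpha\beta\, p^{\alpha-1}\,(1 - p^{\alpha})^{\beta-1}, \\
\frac{K(p_q;\alpha,\beta) - K(p_{q-1};\alpha,\beta)}{p_q - p_{q-1}}
  &\longrightarrow k(p;\alpha,\beta)
  \quad \text{as } p_{q-1},\, p_q \to p .
\end{align*}
```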

2.3 Observation Dynamics

The bottom half of Figure 1 shows how we populate the observation matrix B. The
first step of the process is determining the probability that an individual is interviewed
by the NLSY at time t.14 We use a logistic specification. The explanatory variables
include the current latent state, age, and an indicator of whether the individual was
interviewed in the previous wave. Including these variables helps us control for nonrandom attrition.
Conditional on being observed, we impose the following mapping from latent states
to measured outcomes. We assume that the NLSY79 measures incarceration accurately, so that the latent incarceration state maps directly into the incarceration
outcome. Because our latent nonemployment state is meant to capture long-term
disengagement from the labor force, we assume the persistent nonemployment state
maps directly into nonemployment (again, conditional on being observed). Finally,
each earnings potential bin can map into nonemployment and any of the observed
earnings bins. The probability of nonemployment is logistic. Conditional on being
employed, the distribution of earnings across bins follows a formula akin to equation (3), the main difference being that the Kumaraswamy distribution is replaced by
a truncated univariate logistic distribution. Because multiple combinations of A and
B can produce similar patterns of observed outcomes, we seek a specification where
the distribution of observed earnings shifts rightward in earnings potential. Using the
symmetric logistic distribution, which we further center around the earnings potential
rank p̃q , ensures that the mapping from the latent states to the observed outcomes
has this property.
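
A minimal Python sketch of this mapping, for a single earnings potential bin and conditional on being interviewed, is given below. The scale parameter and the nonemployment probability are hypothetical inputs; in the model they would be estimated, and the nonemployment probability would itself come from a logistic regression. The sketch truncates a logistic distribution to [0, 1], centers it at the midpoint rank, and differences the truncated CDF across the bin cutoffs.

```python
import numpy as np


def logistic_cdf(x, loc, scale):
    """CDF of a logistic distribution with location loc and scale scale."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-(x - loc) / scale))


def truncated_logistic_bin_probs(cutoffs, loc, scale):
    """Bin probabilities from a logistic distribution truncated to [0, 1]."""
    cdf = logistic_cdf(cutoffs, loc, scale)
    truncated = (cdf - cdf[0]) / (cdf[-1] - cdf[0])
    return np.diff(truncated)


def observation_row(p_mid, cutoffs, scale, prob_nonemployed):
    """Probabilities of [nonemployed, bin 1, ..., bin Q], given an interview."""
    bins = truncated_logistic_bin_probs(cutoffs, p_mid, scale)
    return np.concatenate(([prob_nonemployed], (1.0 - prob_nonemployed) * bins))


cutoffs = np.linspace(0.0, 1.0, 11)
ranks = 0.5 * (cutoffs[1:] + cutoffs[:-1])  # observed-bin midpoint ranks

# Hypothetical scale and nonemployment probability, for illustration only.
low = observation_row(p_mid=0.25, cutoffs=cutoffs, scale=0.1, prob_nonemployed=0.05)
high = observation_row(p_mid=0.75, cutoffs=cutoffs, scale=0.1, prob_nonemployed=0.05)

# Mean observed rank among the employed rises with earnings potential.
print(low[1:] @ ranks / low[1:].sum(), high[1:] @ ranks / high[1:].sum())
```

Centering the distribution on the latent rank is what makes the observed earnings distribution shift rightward as earnings potential increases.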
In the standard earnings model, transitory shocks capture both short-term earnings shocks and measurement error. A similar sort of ambiguity applies here. We
believe that transitions from latent earnings to nonemployment reflect short-term
spells of nonemployment. Transitions between latent and observed employment bins
may reflect measurement error as well.

14 Individuals who die are dropped from the likelihood function at their date of death, rather
than treated as missing. We view attrition via death as qualitatively distinct from nonresponse. For
similar reasons, we also remove individuals when they are dropped from the NLSY79’s Supplemental
Sample in 1991.

2.4 Initial Probabilities

We construct the initial distribution of latent states, $\mu_1(\ell_1 \mid \tilde{x}_1)$, in much the same
way we found their transition probabilities. (Here, $\tilde{x}_1$ consists of race, gender, and
education.) First, we find the probability that the individual is incarcerated, nonemployed (long-term), or in one of the earnings potential bins. Conditional on having
positive earnings, we find the distribution across initial earnings potential bins using
the Kumaraswamy distribution; the calculations parallel those in equation (3). The
final step is to estimate the probability that the individual has a criminal record, conditional on the other latent states, using a logistic regression. The product of these
two probabilities gives us our initial distribution.

2.5 Likelihood

We estimate our model using maximum likelihood, utilizing the forward recursion
described in, e.g., Bartolucci et al. (2010) and Scott (2002). This is quite similar to
the methodology for the regime-switching model presented in Hamilton (1994, chapter
22). Appendix A provides a more detailed description. We estimate separate sets of
transition and observation probabilities for each race-gender-education combination.
We weight each individual’s log-likelihood using the NLSY79 sampling weights for
1979.15
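
The forward recursion can be sketched in a few lines of Python. This is a simplified illustration rather than the estimation code: it holds A and B fixed over time, whereas in the model they vary with the covariates, and the function and argument names are hypothetical. Rescaling the forward vector each period avoids numerical underflow in long panels.

```python
import numpy as np


def hmm_log_likelihood(mu1, A, B, obs):
    """Log-likelihood of one individual's outcome sequence.

    mu1: (I,) initial latent distribution; A: (I, I) transition matrix;
    B: (I, J) observation matrix; obs: sequence of outcome indices.
    """
    forward = mu1 * B[:, obs[0]]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the latent state, then weight by the observed outcome.
            forward = (forward @ A) * B[:, obs[t]]
        scale = forward.sum()
        loglik += np.log(scale)
        forward = forward / scale  # rescale to prevent underflow
    return loglik


def weighted_log_likelihood(mu1, A, B, panels, weights):
    """Sum of individual log-likelihoods, each multiplied by its weight."""
    return sum(w * hmm_log_likelihood(mu1, A, B, obs)
               for obs, w in zip(panels, weights))


# Tiny illustrative example with 2 latent states and 3 outcomes.
mu1 = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(hmm_log_likelihood(mu1, A, B, [0, 2, 1]))
```

A maximum likelihood routine would evaluate `weighted_log_likelihood` at candidate parameter values, rebuilding `mu1`, `A`, and `B` from the parameters at each step.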
For some groups—white men with a college degree, white women with at least a
high school diploma, and black women with either a high school or a college degree—
the incidence of incarceration is so low that their incarceration-related parameters
cannot be estimated with any precision. In these cases, we drop individuals with
a criminal record and estimate a simplified model of employment and earnings. To
this set of parameters, we add incarceration-related parameters estimated for other,
similar groups, namely white men with some college education, white women without
a high school diploma, or black women with some college experience. In making
these imputations, we adjust the constant terms for the incarceration probabilities
to match the ever-incarcerated rate observed for that group in the NLSY79;16 with a
logistic formulation, this is simple to do. Appendix F describes the adjustments. These
imputations are somewhat ad hoc, but the groups to which they are applied have
very low rates of incarceration, implying that any imputation error will be relatively
unimportant in the aggregate.

15 Within each race-gender-education group, the weights are scaled to have an average value of 1.
16 For white men and black women with college degrees, and white women with a high school
degree, the fraction of individuals with a criminal record is implausibly low, 0.03% or less. In these

2.6 Quantiles and Conditional Means

To complete our model, we need to delineate the earnings bins and assign a level of
earnings to each bin. We estimate bin cutoffs by age, for each race-gender-education
group, using quantile regression. Using these cutoffs, we then assign individuals to
bins and take averages by age. While the estimation procedure works with any set
of cutoffs, to reduce sampling error we estimate the cutoffs and within-bin conditional means from the Current Population Survey (CPS), which contains far more
observations than the NLSY79.
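
The binning step can be illustrated with a simplified Python sketch on synthetic data. It departs from the procedure described above in two labeled ways: it uses plain empirical quantiles for a single group at a single age rather than quantile regression by age, and the earnings are simulated draws rather than CPS observations. All function names are hypothetical.

```python
import numpy as np


def bin_cutoffs(earnings, probs):
    """Empirical earnings cutoffs at the given probability levels."""
    return np.quantile(earnings, probs)


def assign_bins(earnings, cutoffs):
    """Bin index 0..Q-1 for each observation, using the interior cutoffs."""
    return np.searchsorted(cutoffs[1:-1], earnings, side="right")


def within_bin_means(earnings, bins, n_bins):
    """Average earnings within each bin."""
    return np.array([earnings[bins == b].mean() for b in range(n_bins)])


rng = np.random.default_rng(0)
# Synthetic positive earnings for one group at one age (illustration only).
earnings = rng.lognormal(mean=2.5, sigma=0.6, size=5000)

probs = np.linspace(0.0, 1.0, 11)  # decile cutoffs
cutoffs = bin_cutoffs(earnings, probs)
bins = assign_bins(earnings, cutoffs)
means = within_bin_means(earnings, bins, n_bins=10)
print(means.round(2))
```

The resulting vector of within-bin means is what assigns an earnings level to each quantile bin in the model.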

3 Data

We now describe how we use our two data sources, the NLSY79 and the CPS.

3.1 The NLSY79

Our primary source of data is the 1979 cohort of the National Longitudinal Survey of
Youth (NLSY79), a nationally representative panel survey of young men and women
born between 1957 and 1964. From 1979 to 1994, respondents were interviewed every
year; since 1994, interviews have occurred every other year. The NLSY collects information
about education, employment, family, and finances. It is also one of the few nationally
representative surveys that enables us to observe an individual’s incarceration status.
Specifically, the variable that reports a person’s residence status and location allows
“jail/prison” as a response.17 Coupled with the available earnings and employment
data, this information makes the NLSY79 well-suited for our study and enables us to
carry out our analysis largely using this single dataset.
cases we impute the rates using data from other race and education groups.
17 We use “jail” henceforth to refer to either jail or prison.

The NLSY79 has three subsamples: the (core) cross-sectional sample, a supplemental sample of minority and/or disadvantaged individuals, and a military sample.
We exclude the military sample, as earnings for this group are hard to interpret, and
we drop Hispanic respondents. This leaves us with roughly 9,600 individuals, of which
4,747 are male. We include both workers and the self-employed; Appendix C describes
our employment and earnings measures in some detail.
We have four education categories: less than a high school diploma, high school
diploma, some college, and bachelor’s degree or higher. Exploiting the panel design of
the NLSY79, we classify individuals on the basis of their highest observed attainment,
treating education as a permanent characteristic. We categorize individuals by years
of schooling, except for GED recipients, whom we classify as high school dropouts. As
Heckman et al. (2011) and others have noted, GED recipients on average have worse
labor market outcomes than those receiving high school diplomas. It is also the case
that many recipients earn their GEDs while incarcerated.
Table 1 shows summary statistics for men, the main focus of our analysis.18 Our
data cover the years 1980-2014. The first panel illustrates the education gradient in
earnings and shows that at every education level, black men earn significantly less than
their white counterparts. By way of example, median earnings for a black high school
dropout are 41% (4.93/12.00) of those of his white counterpart. The second panel shows
incarceration rates. For most groups, incarceration rates are highest for men in their
30s. This may reflect to some extent a conflation of time and age effects: The national
transition toward mass incarceration in the 1980s and 1990s occurred at the same time
the NLSY79 cohort aged out of their 20s and into their 30s. Another notable feature
is that men with some college experience are more likely to be incarcerated than high
school graduates; recall that we classify GED recipients as high school dropouts. As
expected, incarceration rates differ markedly by race. The largest absolute differences
are among high school dropouts: The difference across all ages is over 6 percentage
points (pp). The largest proportional differences, however, are among those with at
least a high school degree.
The third panel of Table 1 shows our measure of a “criminal record,” namely a
personal history of at least one previous incarceration spell.19 The fourth panel shows
18 Appendix D shows the summary statistics for women. Statistics calculated using 1979 weights.
19 Because our measure of a criminal record is backward-looking, our estimation sample starts in

Table 1: Summary Statistics by Race and Education for Men, NLSY79

                                            Black Men                       White Men
                                    LTHS      HS      SC      CG    LTHS      HS      SC      CG
Earnings (in $1,000s)
  Mean                              7.73   11.90   14.35   24.00   13.45   19.65   22.53   32.44
  10th percentile                      0       0       0    0.93       0    4.29    2.91    3.01
  25th percentile                      0    2.74    4.99   10.11    4.40   11.01   10.90   12.32
  50th percentile                   4.93   10.50   13.20   19.21   12.00   17.57   18.69   24.11
  75th percentile                  12.32   17.06   20.48   30.60   18.55   25.08   28.07   38.42
  90th percentile                  19.35   24.90   28.80   45.93   26.57   34.21   40.38   65.22
Currently Incarcerated (%)
  All ages                          9.61    2.54    3.40    0.41    3.30    0.26    0.53    0.01
  22-29                            10.87    2.10    3.13    0.69    3.60    0.25    0.59       0
  30-39                            12.39    4.03    4.59    0.24    3.23    0.41    0.58       0
  40-49                             5.93    1.84    2.74       0    3.26    0.09    0.55       0
  50 and older                      3.02    0.67    1.69    0.66    1.98    0.07       0    0.11
Previously Incarcerated (%)
  All ages                         27.52    9.75    9.21    2.80   13.18    0.95    2.44    0.02
  22-29                            18.72    4.40    5.89    1.99    9.23    0.45    1.38       0
  30-39                            31.11   10.95   10.13    3.01   13.29    0.99    2.82       0
  40-49                            34.55   14.65   13.17    3.85   19.11    1.50    3.32       0
  50 and older                     35.87   16.94   11.36    3.14   19.74    1.87    3.89    0.21
Fraction Employed (%)
  All                              55.00   69.90   71.21   78.51   68.56   78.25   77.48   81.44
  Previously incarcerated          32.17   41.60   33.62   47.89   46.09   43.66   53.31   50.00
  Not previously incarcerated      63.55   72.98   75.03   79.45   71.91   78.56   78.04   81.43
Mean Values
  Year of birth                   1960.4  1960.4  1960.4  1960.3  1960.5  1960.2  1960.1  1960.3
  Age                              29.76   30.05   28.80   29.77   27.62   27.86   28.09   28.99
Fraction of male population (%)     4.67    4.76    3.01    2.12   15.61   28.12   17.37   24.34
Observations                       8,475   9,071   5,829   3,935  10,035  16,081   9,755  14,315
Individuals                          479     499     322     209     781   1,016     603     838

Note: [LTHS,HS,SC,CG] denote less than high school/high school/some college/college
graduate.

employment. The first row of this panel, which shows aggregate results, reveals that
the earnings gaps found in the first panel are to some extent employment gaps. This
is consistent with the findings of Bayer and Charles (2018) described earlier: The
earnings gaps among the fully employed, although still significant, are smaller. The
second and third rows of this panel show that the employment rates for men with
a criminal record are 25-40pp lower than those of men without. There may also be
incarceration-related differences in the earnings of those who work.
The final panel shows the distribution of respondents by race and education. The
first line shows proportions calculated using the NLSY79 sample weights, while the
last two lines show unweighted counts. Including the supplemental sample provides
us with a large number of black respondents.

3.2 The Current Population Survey (CPS)

Although the NLSY79 is our principal data source, to calculate the cutoffs that delineate the earnings quantiles, we make use of the larger sample available in the CPS
(downloaded from IPUMS; Ruggles et al. 2020). CPS data are available from 1962 to
2019; however, we limit our sample to 1976 onward because data on hours worked,
which we need for our measure of employment, are not available prior to that year.
Since the CPS is not a panel, we employ a synthetic cohort approach. Ideally, we
would limit the sample to those born in the same years as our NLSY79 cohort (1957
to 1964) but, to have a sufficient number of observations, we include individuals born
between 1941 and 1980 (i.e., the NLSY79 +/- two cohorts) and use cohort dummies
to account for any cohort effects within this group. This consists of around 4.2 million observations. We restrict the sample to white and black individuals, who together
make up about 94% of the sample. We also exclude those for whom educational attainment is not reported. We limit observations to those aged between 22 and 66.
Starting the sample at age 22 helps ensure that those who chose to attend college
have entered the workforce. We choose 66 as the upper limit since that is the normal Social Security retirement age for the NLSY79 cohort. After applying these age
restrictions, around 3 million individuals remain in the sample.
The CPS elicits income information for the year prior to the survey year. Our
focus is on earnings, which we define broadly to include not just wage and salary income, but also the labor portions of farm and business income. Since we have separate
categories for the nonemployed and incarcerated, we limit ourselves to those who were
employed in the previous year. Those who remain (about 2.4 million) form our sample
of employed individuals. Appendix C describes our employment and earnings measures in more detail. We weight the data to ensure that the sample is representative
of the population.

19 (continued) 1980, allowing us to use the 1979 incarceration measure.

Within this sample, we estimate earnings bin cutoffs (deciles) separately for each
race-gender-education group. In particular, in each group, for each quantile q, we run
the following quantile regression:
\[
y_t = \beta_{q,0} + \beta_{q,1} a_t + \beta_{q,2} a_t^2 + \beta_{q,3} a_t^3 + \sum_{m=2}^{5} \gamma_{q,m}\, \mathit{cohort}_m + \varepsilon_{q,t}. \tag{4}
\]
Here $y_t$ denotes earnings, $a_t$ is age, and $\mathit{cohort}_m$ is a dummy variable for one of the
five 8-year cohorts contained in our CPS sample.
With the results of Equation (4) in hand, we use post-estimation procedures to
obtain the decile cutpoints for each race-gender-education-cohort group at each age.
These are shown in Figures E.1 and E.2 for women and men, respectively, in Appendix E.
Applying the cutpoints to the CPS data, we calculate within-decile mean earnings
at each age for each group. We then fit a cubic polynomial in age with cohort dummies
through these age-specific means. Applying post-estimation procedures to the results
of this regression yields life-cycle profiles of within-decile mean earnings for our cohort
of interest. Figures E.3–E.4 show mean earnings and the fitted life-cycle profiles for
women and men from the 1957-64 birth cohort.
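
The binning step can be illustrated with a deliberately simplified sketch. It assumes a single age-cohort cell and uses raw sample deciles in place of the quantile-regression cutpoints from Equation (4); the function name `within_decile_means` and the toy earnings values are hypothetical and serve only to show how cutpoints map earnings into ten bins with one mean per bin.

```python
import bisect
import statistics


def within_decile_means(earnings):
    """Return (cutpoints, bin means) for one hypothetical age-cohort cell.

    Assumption: sample deciles stand in for the regression-based cutpoints.
    """
    cuts = statistics.quantiles(earnings, n=10)  # nine cutpoints between deciles
    bins = [[] for _ in range(10)]
    for y in earnings:
        # bisect_right counts the cutpoints at or below y, giving a bin index 0..9
        bins[bisect.bisect_right(cuts, y)].append(y)
    # Assumes every bin is non-empty, which holds for the toy data below
    return cuts, [statistics.mean(b) for b in bins]


# Toy data: earnings of 1, 2, ..., 100 (hypothetical units)
earnings = list(range(1, 101))
cuts, means = within_decile_means(earnings)
print(means)
```

In the procedure described above, these bin means would then be computed at each age and smoothed with a cubic polynomial in age; that step is omitted here.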

4 Estimation Results

Given the nonlinear nature of the underlying model, the parameter estimates and
standard errors, displayed in Tables F.1 and F.2 of Appendix F, are hard to interpret.
Consequently, we instead highlight a few of the implied transition matrices (A) and
observation matrices (B).


4.1 Latent Transition Matrices

We focus our discussion of the transition matrices on men without a high school
degree, where the dynamics of incarceration are easiest to see, but all education
groups display similar patterns. Tables 2 and 3 present the latent state transition
probabilities for a 25-year-old black man and white man, respectively, without a high
school degree. Rows index the current state $\ell_t$, while columns index the future state
$\ell_{t+1}$.20
Table 2: Latent transition probabilities, 25-year-old black men without a high school
diploma

                         Future state, Jail Flag = 0                      Future state, Jail Flag = 1
Current State       N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail      N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail
JF = 0, N        0.73   0.17   0.03   0.02   0.01   0.01   0.02
JF = 0, Q1       0.14   0.52   0.18   0.06   0.01   0.00   0.08
JF = 0, Q3       0.05   0.27   0.53   0.11   0.00   0.00   0.04
JF = 0, Q5       0.02   0.01   0.23   0.63   0.09   0.00   0.02
JF = 0, Q6       0.01   0.00   0.07   0.53   0.37   0.00   0.01
JF = 0, Q8       0.01   0.00   0.01   0.12   0.55   0.31   0.01
JF = 0, Q10      0.00   0.00   0.00   0.02   0.08   0.89   0.00
JF = 0, Jail                                                      0.16   0.31   0.10   0.06   0.03   0.01   0.34
JF = 1, N                                                         0.56   0.18   0.03   0.02   0.01   0.01   0.18
JF = 1, Q1                                                        0.07   0.35   0.13   0.05   0.01   0.00   0.39
JF = 1, Q3                                                        0.03   0.20   0.44   0.10   0.00   0.00   0.23
JF = 1, Q5                                                        0.01   0.01   0.19   0.58   0.09   0.00   0.12
JF = 1, Q6                                                        0.01   0.00   0.06   0.48   0.37   0.00   0.09
JF = 1, Q8                                                        0.00   0.00   0.01   0.11   0.52   0.31   0.05
JF = 1, Q10                                                       0.00   0.00   0.00   0.02   0.08   0.87   0.03
JF = 1, Jail                                                      0.04   0.10   0.04   0.02   0.01   0.00   0.79

Note: Rows are indexed by the latent state at age 25, columns by the latent state at age 26. JF or
Jail Flag indicates a history of incarceration. “Jail” denotes currently incarcerated. N indicates not
employed but not currently incarcerated. Qi denotes earnings potential decile i. Some transitions
omitted.

The first of these general patterns is that men who are nonemployed or have low
earnings potential are much more likely to transit to jail. A 25-year-old black man
with no criminal history (JF = 0) in the bottom decile of the earnings potential
distribution (Q1) has a roughly 8% chance of becoming incarcerated at age 26. The
incarceration probability for an otherwise identical man in the 8th earnings potential
decile (Q8) is 1%. White men follow the same pattern, though their corresponding
chances of being incarcerated are lower than for black men.

20 To avoid presenting $24^2$ numbers, we condense the matrix A in two ways: We present transition
probabilities for only a subset of the current states, reducing the number of rows; and we combine
future states by summing probabilities, reducing the number of columns.
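
The condensation described in footnote 20 (keeping a subset of rows and summing probabilities over grouped future states) can be sketched as follows. The 4-state matrix and the helper `condense` are hypothetical illustrations, not the estimated matrix A.

```python
def condense(A, keep_rows, col_groups):
    """Keep selected rows of A and sum each row's probabilities within column groups."""
    return [[sum(A[r][c] for c in group) for group in col_groups]
            for r in keep_rows]


# Hypothetical 4-state transition matrix; each row sums to 1
A = [
    [0.50, 0.30, 0.15, 0.05],
    [0.20, 0.50, 0.20, 0.10],
    [0.05, 0.15, 0.60, 0.20],
    [0.02, 0.08, 0.30, 0.60],
]

# Report only rows 0 and 3, and merge future states {0, 1} and {2, 3}
condensed = condense(A, keep_rows=[0, 3], col_groups=[[0, 1], [2, 3]])
row_sums = [sum(row) for row in condensed]
print(condensed, row_sums)
```

Because columns are combined by summation, each reported row still sums to one (up to rounding), which is a convenient check on any condensed table.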
Table 3: Latent transition probabilities, 25-year-old white men without a high school
diploma

                         Future state, Jail Flag = 0                      Future state, Jail Flag = 1
Current State       N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail      N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail
JF = 0, N        0.69   0.17   0.06   0.04   0.02   0.01   0.01
JF = 0, Q1       0.05   0.71   0.18   0.02   0.00   0.00   0.04
JF = 0, Q3       0.03   0.28   0.63   0.04   0.00   0.00   0.02
JF = 0, Q5       0.02   0.00   0.17   0.76   0.05   0.00   0.01
JF = 0, Q6       0.01   0.00   0.02   0.52   0.44   0.00   0.00
JF = 0, Q8       0.01   0.00   0.00   0.03   0.64   0.31   0.00
JF = 0, Q10      0.00   0.00   0.00   0.00   0.06   0.93   0.00
JF = 0, Jail                                                      0.04   0.45   0.12   0.04   0.01   0.00   0.34
JF = 1, N                                                         0.71   0.11   0.05   0.03   0.03   0.03   0.05
JF = 1, Q1                                                        0.05   0.52   0.18   0.05   0.01   0.00   0.18
JF = 1, Q3                                                        0.03   0.25   0.56   0.07   0.00   0.00   0.08
JF = 1, Q5                                                        0.02   0.00   0.17   0.70   0.07   0.00   0.04
JF = 1, Q6                                                        0.01   0.00   0.03   0.48   0.46   0.00   0.02
JF = 1, Q8                                                        0.01   0.00   0.00   0.04   0.56   0.38   0.01
JF = 1, Q10                                                       0.00   0.00   0.00   0.00   0.05   0.94   0.00
JF = 1, Jail                                                      0.02   0.14   0.05   0.03   0.01   0.00   0.75

Note: Rows are indexed by the latent state at age 25, columns by the latent state at age 26. JF or
Jail Flag indicates previous incarceration. N indicates not employed but not currently incarcerated.
Qi denotes earnings potential decile i. Some transitions omitted.

The second is that recidivism is prevalent. A 25-year-old black (white) man currently in the bottom decile of earnings potential with a criminal record has a 39%
(18%) chance of being in jail the following year, an increase of 31pp (14pp) over
that for a man with no record. Moreover, men who are currently jailed, should they
exit, are most likely to exit to nonemployment or to the bottom decile of earnings
potential, where the odds of reincarceration are the highest. A man who is currently
incarcerated and in possession of a criminal record will remain incarcerated nearly
80% of the time.


A third feature is that men with low earnings potential are more likely to transit
to nonemployment than those with high earnings potential. On the other hand, men
who stay employed are most likely to remain in their current earnings potential bin, as
the large numbers on the diagonals indicate. For example, a 25-year-old man with no
criminal record in the top earnings potential decile has around a 90% chance of being
in the top two deciles in the following year. It also bears noting that the transition
probabilities are not symmetric. It is much more common for a man in the bottom
decile of earnings potential to transit to higher deciles than it is for a man at the top
decile to transition down.
The patterns described above largely hold for both black and white men. In addition, the newly incarcerated in both groups face identical chances (34%) of remaining incarcerated the following year. There are, however, differences between the two
groups, most notably that white men are much less likely to become incarcerated
than their black counterparts. White men are also less likely to transition to nonemployment.21

4.2 Measurement Matrices

We turn next to the probabilities mapping from the latent states to observed outcomes, embodied in the matrix B. Table 4 presents the observation probabilities for
a 25-year-old black man without a high school degree. Rows index the latent state $\ell_t$,
while columns index the observed outcome $m_t$. We condense the results in much the
same way that we condensed those for the transition matrix A.
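
To make the role of B concrete, the sketch below maps a distribution over latent states into a distribution over observed outcomes. It assumes a toy model with three latent states and three outcomes; all probabilities and the name `observed_distribution` are hypothetical, not estimates from the paper.

```python
def observed_distribution(latent_probs, B):
    """Probability of each observed outcome m: sum over latent states l of P(l) * B[l][m]."""
    n_outcomes = len(B[0])
    return [sum(latent_probs[l] * B[l][m] for l in range(len(B)))
            for m in range(n_outcomes)]


# Hypothetical latent states: 0 = N (nonemployed), 1 = low potential, 2 = high potential
# Hypothetical outcomes:      0 = nonemployed, 1 = low earnings, 2 = high earnings
B = [
    [1.00, 0.00, 0.00],  # latent nonemployment is always observed as nonemployment
    [0.40, 0.50, 0.10],  # low potential: 40% chance of an observed nonemployment spell
    [0.05, 0.15, 0.80],
]
latent = [0.2, 0.3, 0.5]

obs = observed_distribution(latent, B)
persistent = latent[0] * B[0][0]  # nonemployment arising from the latent N state
transitory = obs[0] - persistent  # nonemployment observed while the latent state is working
print(obs, persistent, transitory)
```

The split of observed nonemployment into a part from the latent N state and a part from working latent states mirrors the distinction between persistent and transitory nonemployment used in the text.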
Perhaps the most notable feature of Table 4 is the high likelihood that a worker
with low earnings potential will be nonemployed. For example, a man in the bottom
earnings potential decile will be nonemployed 40% of the time if he has no criminal
record and 60% of the time if he has one. Recall that this nonemployment spell is
completely transitory. Conditional on latent earnings potential, realizing such a spell
has no effect whatsoever on the probability of future nonemployment or, for that
matter, any future outcome. Nonetheless, in every period, black men with low earnings
potential face a significant risk of nonemployment. In addition to nonemployment, for
most earnings potential deciles, the distribution of observed outcomes spans a wide
range of positive earnings realizations. The one exception is the top earnings potential
decile, where 99% of realized earnings fall in the top two outcome deciles. This may
reflect the rightward skew of the earnings distribution. At the upper tail, large changes
in earnings levels need not produce large changes in earnings ranks; see the figures in
Appendix E.

21 The one seeming exception involves nonemployed men with a criminal record, where the increased risk of nonemployment facing white men (71% vs. 56%) is almost completely offset by a
decreased risk of incarceration (5% vs. 18%).

Table 4: Observation probabilities, 25-year-old black men without a high school diploma

                       Observed outcome, Jail Flag = 0                  Observed outcome, Jail Flag = 1
Latent State        N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail      N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail
JF = 0, N        1.00
JF = 0, Q1       0.40   0.60   0.00   0.00   0.00   0.00
JF = 0, Q3       0.17   0.22   0.57   0.04   0.00   0.00
JF = 0, Q5       0.07   0.13   0.25   0.29   0.19   0.08
JF = 0, Q6       0.04   0.11   0.20   0.27   0.24   0.15
JF = 0, Q8       0.02   0.00   0.00   0.09   0.59   0.30
JF = 0, Q10      0.01   0.00   0.00   0.00   0.00   0.99
JF = 0, Jail                                               1.00
JF = 1, N                                                         1.00
JF = 1, Q1                                                        0.61   0.39   0.00   0.00   0.00   0.00
JF = 1, Q3                                                        0.31   0.22   0.33   0.11   0.02   0.00
JF = 1, Q5                                                        0.14   0.15   0.20   0.21   0.18   0.12
JF = 1, Q6                                                        0.09   0.10   0.19   0.25   0.22   0.14
JF = 1, Q8                                                        0.05   0.03   0.09   0.22   0.33   0.28
JF = 1, Q10                                                       0.03   0.00   0.00   0.00   0.01   0.96
JF = 1, Jail                                                                                                1.00

Note: Rows are indexed by the latent state at age 25, columns by the observed state at the same age.
JF or Jail Flag indicates previous incarceration. “Jail” denotes currently incarcerated. N indicates
not employed but not currently incarcerated. For rows, Qi denotes earnings potential decile i. For
columns, Qj denotes observed earnings decile j. Some transitions omitted.

Table 5 presents the observation probabilities for a 25-year-old white man without
a high school degree. Transitory nonemployment is less common among white men
than among black men. For example, a man in the bottom earnings potential decile
with no criminal record will be nonemployed 29% of the time if he is white and
40% of the time if he is black. Black men are not only more likely to be persistently
nonemployed, but also more likely to experience temporary nonemployment spells.
Taking stock, we see that nonemployment and incarceration pose significant risks
for less educated men, especially those with low latent earnings potential. It is also
clear that the effects of incarceration are profound. Men with criminal records face
markedly higher odds of nonemployment and (re-)incarceration. All of these risks are
greater for black men.
Table 5: Observation probabilities, 25-year-old white men without a high school diploma

                       Observed outcome, Jail Flag = 0                  Observed outcome, Jail Flag = 1
Latent State        N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail      N  Q1+Q2  Q3+Q4  Q5+Q6  Q7+Q8 Q9+Q10   Jail
JF = 0, N        1.00
JF = 0, Q1       0.29   0.71   0.01   0.00   0.00   0.00
JF = 0, Q3       0.05   0.21   0.21   0.20   0.18   0.14
JF = 0, Q5       0.02   0.03   0.30   0.56   0.10   0.01
JF = 0, Q6       0.01   0.01   0.13   0.50   0.31   0.04
JF = 0, Q8       0.01   0.00   0.02   0.16   0.48   0.32
JF = 0, Q10      0.01   0.00   0.00   0.00   0.00   0.99
JF = 0, Jail                                               1.00
JF = 1, N                                                         1.00
JF = 1, Q1                                                        0.42   0.58   0.00   0.00   0.00   0.00
JF = 1, Q3                                                        0.09   0.18   0.18   0.18   0.18   0.18
JF = 1, Q5                                                        0.03   0.04   0.30   0.49   0.12   0.01
JF = 1, Q6                                                        0.02   0.03   0.16   0.42   0.30   0.07
JF = 1, Q8                                                        0.01   0.00   0.01   0.12   0.54   0.32
JF = 1, Q10                                                       0.02   0.00   0.00   0.00   0.00   0.98
JF = 1, Jail                                                                                                1.00

Note: Rows are indexed by the latent state at age 25, columns by the observed state at the same
age. JF or Jail Flag indicates previous incarceration. N indicates not employed but not currently
incarcerated. For rows, Qi denotes earnings potential decile i. For columns, Qj denotes observed
earnings decile j. Some transitions omitted.

5 Simulations: Incarceration, Nonemployment, and Earnings

We now turn to our model’s predictions for the longer-term behavior of incarceration, employment, and earnings. We look first at the life-cycle profiles of these three
variables. We then provide a sense of the long-run effect of these states in two ways:
by looking at a measure of their persistence; and then by generating impulse responses to incarceration, nonemployment, and earnings shocks. Taken together, these
results address our first objective, quantifying the relationship between incarceration,
employment, and earnings.

5.1 Age Profiles

5.1.1 Incarceration

Figure 2 presents age-incarceration profiles by race for less than high school (L) and
high school (H) men. The first-time incarceration rates, depicted in the bottom left
panel, are monotonically declining in age, as one might expect. The fraction of the
population with a history of incarceration (top right panel) thus rises most quickly at
younger ages. Tables 2 and 3 showed that men with criminal records are more likely
to be (re-)incarcerated in the future, and when incarcerated, more likely to spend
consecutive years in jail. This is reflected in the average incarceration spell length
(bottom right panel), which increases early in life. The number of repeat offenders
in jail thus rises for a while, before slowly falling. This causes the total incarceration
rate (top left panel) to have a hump shape.
Figure 2 also highlights the large disparities in incarceration rates by race and
education. Within race, incarceration rates decrease sharply with education. Across
races, incarceration rates are markedly higher for blacks than whites. Putting the two
together, the rates for white men without a high school degree are comparable to
rates for black men with a high school degree.
The patterns for years spent incarcerated are quite distinct from those for incarceration rates. The average incarceration spells of white men are, if anything, longer
than those of blacks. Conditional on being incarcerated, middle-aged white men with
a high school diploma have longer spells than those of any other group. Black men
have higher incarceration rates not because they serve longer spells, but because they
are far more likely to be sent to jail.22
Figure G.1, found in Appendix G, compares the current and ever-incarcerated
rates predicted by the model to those in the data. Because we allow for nonrandom
attrition, the data and the model need not align perfectly.23 The fit is nonetheless
quite good.

22 Our finding that white defendants serve longer spells appears at odds with the tendency of black
defendants to receive longer sentences in federal courts (Rehavi and Starr 2014; Light 2021). There
appears to be very little difference in felony sentence lengths at the state level (Rosenmerkel et al.,
2009, Table 3.6), however, and incarceration stints need not involve felony convictions at all.

Figure 2: Age-Incarceration Profiles

Note: [B,W][L,H] denote black/white, less than high school/high school.

5.1.2 Nonemployment

Figure 3 displays nonemployment profiles. Recall that the model has two types of
nonemployment: persistent nonemployment, where the latent state is nonemployment,
which automatically results in measured nonemployment; and transitory nonemployment, where the latent state is working (in particular, one of the earnings potential
deciles), but the observed outcome is nonemployment.24 These are given in the top
and bottom left panels, respectively. The top right panel shows total measured nonemployment, the sum of persistent and transitory nonemployment. Figure G.2, found
in Appendix G, compares the total nonemployment rates predicted by the model to
those in the data. As with incarceration, the fits are good.

23 Figure G.3 shows the effects of this observation bias on the incarceration and nonemployment
profiles.

Figure 3: Nonemployment by age and type

Note: [B,W][L,H,S,C] denote black/white, less than high school/high school/some
college/bachelor’s degree.

The age profiles for persistent nonemployment generally rise with age, the one
exception being a modest decline for young men without a high school degree. These
upward slopes are consistent with the tendency of older workers to exit the workforce. The profiles for transitory nonemployment behave very differently, sometimes
displaying a hump shape. But even when transitory nonemployment is rising, persistent nonemployment rises more quickly. As men age, an increasing fraction of their
nonemployment is persistent. This is one reason why the duration of nonemployment
(bottom right panel) rises with age.

24 Because of the annual (or biennial) frequency of the NLSY79, there is a time aggregation issue
regarding how to treat individuals who are nonemployed for periods of less than a year: Under our
coding, individuals who work for only part of a year are classified as employed. This is one likely
reason why the dynamics of the lowest earnings potential deciles, which include many part-timers,
are somewhat distinct from those higher up.

The profiles for total (measured) nonemployment show that black men who did not
attend college are much more likely to be nonemployed than their white counterparts.
While education-related differences in nonemployment are significant, race-related
differences are arguably larger. For example, the nonemployment rate for black high
school graduates rises from 11% at age 22 to 53% by age 57; nonemployment for their white
counterparts rises from 4% to 27%. In fact, at any age, a black man with a high school degree
is more likely to be nonemployed than a white man without one.
5.1.3 Earnings

Our model’s predictions for earnings (for men), disaggregated by race and education,
are shown in Figure 4. The top left and bottom right panels include both incarcerated and nonincarcerated men. They show that the canonical hump shape over
the life-cycle is maintained. As expected, earnings increase sharply with educational
attainment, while significant racial differences within education groups remain.
The impact of incarceration on earnings can be seen by comparing the earnings of
those who have never been incarcerated (top right) with those who have (bottom left).
The bottom left panel shows that incarceration compresses the education gradient
of earnings to a striking degree. The figure suggests that because people with an
incarceration history have similar earnings, it is those with high initial earnings—
whites and the more highly educated—who suffer the bigger earnings loss.

5.2 Persistence

Figure 4: Age-earnings profiles for men by race and education

Note: [B,W][L,H,S,C] denote black/white, less than high school/high
school/some college/bachelor’s degree.

One of the strengths of our statistical framework is that it allows the intertemporal
persistence of earnings to vary across the earnings distribution. To construct a simple
state-specific measure of persistence, we begin with the deviation
\[
z_{k,t+j} := E[y_{t+j} \mid \ell_t = L_k] - E[y_{t+j}],
\]
where $y$ denotes measured earnings, and $\ell$ denotes the latent state. We then measure
state $k$-conditional persistence as:
\[
\rho_{k,t} := \left( \frac{\frac{1}{5} \sum_{j=5}^{9} z_{k,t+j}}{\frac{1}{5} \sum_{j=0}^{4} z_{k,t+j}} \right)^{1/5}.
\]


$\rho_{k,t}$ measures how quickly earnings return to their unconditional mean value when the
age-$t$ latent state is $k$. When $y$ follows a simple AR(1), $y_{t+1} = \rho y_t + e_t$, $\rho_{k,t}$ reduces
to $\rho$. We take five-year averages to remove the highest-frequency dynamics.25
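
The AR(1) claim can be checked numerically. The sketch below assumes a mean-zero AR(1), so that the deviation from the unconditional mean decays as $z_{t+j} = \rho^j z_t$; the function name `persistence` and the parameter values are hypothetical, and the sketch assumes the ratio of the two averages is positive.

```python
def persistence(z):
    """Persistence measure from ten deviations z[0], ..., z[9] (horizons j = 0, ..., 9)."""
    early = sum(z[0:5]) / 5   # average deviation over j = 0..4
    late = sum(z[5:10]) / 5   # average deviation over j = 5..9
    # Assumes late / early > 0; a negative ratio would not have a real fifth root here
    return (late / early) ** (1 / 5)


rho = 0.9   # hypothetical AR(1) coefficient
z0 = 2.0    # hypothetical initial deviation from the unconditional mean
z = [z0 * rho ** j for j in range(10)]

rho_hat = persistence(z)
print(rho_hat)
```

The ratio of the two five-year averages equals $\rho^5$ in this case, so taking the fifth root recovers $\rho$ regardless of the size of the initial deviation.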
Figure 5: Earnings persistence by latent state, men with a high school degree or less

Note: [B,W]M[L,H] denote black/white, men, less than high school/high school; J
indicates current incarceration; N indicates not employed but not currently incarcerated; Qi denotes earnings potential decile i.

Figure 5 shows age-averaged earnings persistence by state for men who are high
school dropouts or have a high school degree.26 Across the latent states Q1-Q10, the
earnings of white men are considerably more persistent. While $\rho_{k,t}$ is typically above
1 for white men with a high school degree and around 0.98 for white high school
dropouts, for black men 0.96 is a more representative number. The earnings effects of
nonemployment or jail are similarly more persistent for white men than black men.
For both races, persistence is lower at the tails of the earnings potential distribution
(e.g., Q1 and Q10) than in the middle.
25 Our persistence measure is similar in spirit to the quantile-based measure proposed by Arellano
et al. (2017), which, if adapted to scaled means, would equal $\rho_{k,t} = \left. \frac{\partial E[y_{t+1} \mid \ell_t]}{\partial \ell_t} \right|_{\ell_t = L_k}$.
26 To avoid small denominators, we estimate $\rho_{k,t}$ only for the states where
$\left| \frac{1}{5} \sum_{j=0}^{4} z_{k,t+j} / \sigma(y_{t+j}) \right| \geq 0.25$ for $\sigma(y_{t+j})$ the unconditional standard deviation of $y_{t+j}$. Values for these states are indicated by markers on the plot.


Figure 5 also highlights the importance of treating incarceration differently from
other forms of nonemployment: the effects of jail are considerably more persistent
than those of nonemployment, even though both states have zero earnings.

5.3 Lifetime Totals

To assess the cumulative effects of incarceration and earnings, it is useful to construct
lifetime totals. We convert the stream of pre-tax earnings $\{y_t\}_{t=1}^{T}$ into a net present
value,
\[
E \sum_{t=1}^{T} R^{1-t} y_t,
\]

setting the risk-free rate R to 1.02, a standard value (e.g., McGrattan and Prescott,
2000). We will refer to this total as lifetime earnings. To avoid extrapolating beyond
the NLSY79 sample period, the terminal period T corresponds to age 57. Although
our measure of lifetime earnings is only partial, it covers more than three decades.
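
As an illustration of the discounting, the sketch below computes the present value for a single earnings path; the expectation in the formula would be taken by averaging over many simulated paths. The function name `lifetime_earnings` and the flat earnings path are hypothetical.

```python
def lifetime_earnings(y, R=1.02):
    """Present value of the stream y_1, ..., y_T, discounting period t by R**(1 - t)."""
    return sum(R ** (1 - t) * y_t for t, y_t in enumerate(y, start=1))


# Hypothetical path: 10 (thousand) per year for the 36 years from age 22 through age 57
flat = [10.0] * 36
npv = lifetime_earnings(flat)
print(round(npv, 1))
```

Because the first period is undiscounted ($R^{0} = 1$), earnings at age 22 enter at face value, while earnings at age 57 are discounted by $R^{-35}$.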
Table 6 summarizes the distribution of lifetime earnings as of age 22 for all race-gender-education combinations. It also reports the average total time in years that
individuals spend incarcerated, employed, or nonemployed. The top panel of the table
shows results for men. While black men have lower lifetime earnings at any level, the
differences are most stark for the least educated. Among those without a high school
diploma, white men will on average earn $346,000 over their lives, 83% more than
the $189,000 earned by black men. For those with a college degree, the gap is 51%.
The differences are even larger at the 10th percentile: a gap of 250% for high school
dropouts vs. 51% for college graduates. This is consistent with the findings of Bayer
and Charles (2018), who show that the racial earnings gap is smaller at the top of the
earnings distribution. The low absolute earnings of black high school dropouts are
also notable. 25% of black men without a high school degree earn $67,000 or less over
their lifetimes, and 10% earn $28,000 or less. The top panel further shows that the
higher incarceration rates of black men lead them to spend considerably more time
in jail, an additional two years for high school dropouts.
The second panel of Table 6 presents statistics for women. Although our focus is
on men, who are far more likely to be incarcerated, there are some notable differences
between the genders. Women are much more likely to be nonemployed; a black woman
without a high school diploma is nonemployed for an average of about 20 years, a
white woman for about 15 years. The racial gaps are also considerably smaller for more
educated groups.

Table 6: Lifetime totals by race, education, and gender

Variable                    BML    WML    BMH    WMH    BMS    WMS    BMC    WMC
Lifetime earnings avg.      189    346    306    506    378    561    631    950
Lifetime earnings p10        28     98     83    193    125    220    256    387
Lifetime earnings p25        67    174    147    323    212    321    371    500
Lifetime earnings p50       151    335    270    501    353    539    567    819
Lifetime earnings p75       276    493    427    656    505    724    811   1198
Lifetime earnings p90       417    629    598    835    670    994   1142   1912
Expected years E           21.3   27.8   26.4   31.9   28.6   31.7   30.3   33.1
Expected years J            3.2    1.3    1.1    0.1    1.2    0.2    0.2    0.0
Expected years N           11.5    7.0    8.5    4.0    6.2    4.1    5.6    2.9
Ever-J rate, old           0.44   0.23   0.24   0.03   0.17   0.05   0.06   0.02

Variable                    BFL    WFL    BFH    WFH    BFS    WFS    BFC    WFC
Lifetime earnings avg.       94    158    197    240    263    295    414    461
Lifetime earnings p10         5     29     41     66     87    102    160    152
Lifetime earnings p25        22     68     79    121    134    161    234    243
Lifetime earnings p50        57    135    155    212    223    264    372    407
Lifetime earnings p75       139    227    294    329    367    407    548    614
Lifetime earnings p90       247    325    422    462    498    539    730    859
Expected years E           15.6   21.4   23.4   26.4   26.8   28.4   30.0   29.8
Expected years J            0.2    0.1    0.0    0.0    0.0    0.0    0.0    0.0
Expected years N           20.2   14.5   12.6    9.6    9.2    7.6    6.0    6.2
Ever-J rate, old           0.06   0.03   0.01   0.01   0.02   0.01   0.01   0.01

Note: [B,W][M,F][L,H,S,C] denote black/white, male/female, less than high
school/high school/some college/bachelor’s degree; E indicates employed; J indicates
jailed or incarcerated; N indicates nonemployed but not currently incarcerated; earnings are pre-tax thousands of 1982-1984 dollars.

5.4 GIRFs

When an individual becomes incarcerated or nonemployed, what happens to his future
earnings and employment? To answer this question within the context of our nonlinear
model, we calculate generalized impulse response functions (GIRFs). To construct
the GIRFs, we first identify a time-0 information set $\mathcal{F}$ such that all individuals with
a common value of $\mathcal{F}$ have the same expected future outcomes. We then simulate
forward a large number of individuals for $T$ periods. This gives us, for each simulated
individual $i$ and each variable of interest $x$, the history $\{x_{i,t}\}_{t=1}^{T}$. Let the indicator
function $\delta_i$ equal 1 when a particular shock, say incarceration, is realized at time
1. The effects of this particular shock at time $t \geq 1$ are then given by the sample
analogue of
\[
\Delta[x_t \mid \mathcal{F}] := E[x_{i,t} \mid \delta_i = 1, \mathcal{F}] - E[x_{i,t} \mid \mathcal{F}].
\]
When we compute this, we also compute a bootstrapped standard error. In the GIRFs
we present below, the conditioning set $\mathcal{F}$ always includes being age 22 at $t = 0$, being
male, and initially residing in the fifth latent earnings potential decile. We calculate
separate sets of GIRFs for each race-education combination and for each value of the
criminal record flag.
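
The simulation logic behind a GIRF can be sketched with a toy example. The sketch assumes a hypothetical three-state chain (working, nonemployed, jail) with made-up transition probabilities and earnings, starts everyone in the working state at time 0, and omits the bootstrapped standard errors; it is not the estimated model.

```python
import random

# Hypothetical 3-state chain: 0 = working, 1 = nonemployed, 2 = jail
P = [
    [0.90, 0.07, 0.03],
    [0.30, 0.65, 0.05],
    [0.20, 0.30, 0.50],
]
EARNINGS = [10.0, 0.0, 0.0]  # hypothetical earnings in each state


def simulate_path(rng, start, T):
    """Simulate the states visited at times 1..T, starting from `start` at time 0."""
    path, s = [], start
    for _ in range(T):
        s = rng.choices([0, 1, 2], weights=P[s])[0]
        path.append(s)
    return path


def girf(n=20000, T=10, seed=0):
    """Sample analogue of E[x_t | shock at time 1, F] - E[x_t | F] for earnings x."""
    rng = random.Random(seed)
    paths = [simulate_path(rng, 0, T) for _ in range(n)]
    shocked = [p for p in paths if p[0] == 2]  # delta_i = 1: in jail at time 1
    out = []
    for t in range(T):
        cond = sum(EARNINGS[p[t]] for p in shocked) / len(shocked)
        uncond = sum(EARNINGS[p[t]] for p in paths) / len(paths)
        out.append(cond - uncond)
    return out


response = girf()
print([round(r, 2) for r in response])
```

In this toy chain the response is most negative at impact and shrinks toward zero as the shocked and unconditional groups converge to the same long-run distribution.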
5.4.1 Incarceration Shocks

Figure 6 plots the GIRFs generated by an incarceration episode among male high
school dropouts. The first row of the figure shows that at impact, a jail shock reduces
expected earnings for black (white) men by roughly $6,000 ($8,000). The earnings
loss wears off slowly, particularly for first-time offenders. The second and third rows
show that some of the earnings loss is due to higher rates of future incarceration or
nonemployment.
The top panel of Table 7 reports the lifetime impacts of the jail shocks. The impact
of first-time incarceration on lifetime earnings is a loss of $103,400 for black men and
$169,700 for white men with less than a high school diploma. These are massive
amounts. To put them in perspective, unconditional lifetime earnings are $189,000
and $346,000 for the two groups, respectively (Table 6). Thus, for those with less than
a high school diploma, first-time incarceration reduces earnings by more than 45%. A
large part of the decline in earnings results from the fact that first-time incarceration
leads to a reduction of 36% or more in the number of years spent working due to more
years spent nonemployed or in jail: 9.6 years (out of 21.3—see Table 6) and 10.1 (out
of 27.8) for black men and white men, respectively, without a high school diploma.
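The percentages in this paragraph can be checked directly from the quoted figures. The short sketch below only redoes that arithmetic; the dictionary names are illustrative.

```python
# Figures quoted in the text (thousands of 1982-1984 dollars; years).
loss = {"black": 103.4, "white": 169.7}          # lifetime earnings loss, Table 7
lifetime = {"black": 189.0, "white": 346.0}      # unconditional lifetime earnings, Table 6
years_lost = {"black": 9.6, "white": 10.1}       # reduction in years worked, Table 7
years_worked = {"black": 21.3, "white": 27.8}    # unconditional years worked, Table 6

earn_share = {g: loss[g] / lifetime[g] for g in loss}
year_share = {g: years_lost[g] / years_worked[g] for g in loss}

for g in loss:
    print(f"{g}: earnings -{earn_share[g]:.1%}, years worked -{year_share[g]:.1%}")
```

The earnings shares come out at roughly 55% and 49%, and the shares of working years at roughly 45% and 36%, consistent with the "more than 45%" and "36% or more" statements above.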

Figure 6: GIRFs for jail shocks, by race and incarceration history, men without a high
school diploma

Note: [B,W][L,H][,r] denote black/white, less than high school/high school, no
criminal record/criminal record; Qi denotes earnings potential decile i. Earnings are measured in thousands of 1982-1984 dollars.
Figure 6 and Table 7 both show that high school graduates experience larger
earnings losses after an incarceration spell than high school dropouts and that white
men experience larger losses than blacks. This in part follows mechanically from the
employment channel. If incarceration reduced male employment in every group by the
same number of years, white men and high school graduates, who earn more when
employed, would lose more income. Table 7 shows that the employment effects of
incarceration are in fact larger for white men, perhaps because they are more likely
to be employed in its absence. Moreover, the third panel of Figure 4 shows that men
with a criminal record have similar earnings across races and education levels.

Table 7: GIRF statistics by shock, group type, and response variable

GIRF for a transition to jail: Q5 to J
Response variable     BML     WML    BMLr    WMLr     BMH     WMH    BMHr    WMHr
Earnings             -6.1    -9.2    -5.5    -8.9    -9.1   -12.7    -9.5   -12.6
Lifetime earnings  -103.4  -169.7   -51.0  -159.1  -121.0  -273.1  -143.7  -324.5
Future years E       -9.6   -10.1    -4.4    -7.5    -8.0   -12.1    -6.9   -14.7
Future years N        1.8     3.6     0.8     2.3     2.1     4.9     2.3     5.1
Future years J        7.7     6.5     3.5     5.2     5.9     7.1     4.7     9.6

GIRF for a transition to nonemployment: Q5 to N
Response variable     BML     WML    BMLr    WMLr     BMH     WMH    BMHr    WMHr
Earnings             -6.1    -9.2    -5.5    -8.9    -9.1   -12.7    -9.5   -12.6
Lifetime earnings   -78.2  -145.8   -46.8  -119.3  -109.8  -163.2  -111.3  -209.7
Future years E       -6.4    -6.8    -3.9    -6.3    -7.3    -6.9    -6.3    -9.9
Future years N        5.9     6.1     2.9     5.0     6.9     6.5     4.7     5.5
Future years J        0.4     0.7     0.9     1.2     0.5     0.4     1.6     4.4

GIRF for a bad latent earnings transition: Q5 to Q3
Response variable     BML     WML    BMLr    WMLr     BMH     WMH    BMHr    WMHr
Earnings             -3.4     0.1    -2.9     0.8    -4.0    -1.4    -4.1    -4.4
Lifetime earnings   -57.6  -118.4   -36.1  -114.7   -83.8  -177.4   -85.3  -195.0
Future years E       -2.8    -3.4    -2.0    -3.9    -2.3    -2.9    -2.5    -3.6
Future years N        2.3     2.9     1.1     2.2     2.0     2.7     1.6     2.3
Future years J        0.6     0.5     0.9     1.7     0.3     0.1     0.9     1.3

GIRF for a good latent earnings transition: Q5 to Q7
Response variable     BML     WML    BMLr    WMLr     BMH     WMH    BMHr    WMHr
Earnings              2.8     2.0     2.0     1.4     3.1     3.9     2.6     3.7
Lifetime earnings    49.6    70.2    44.5    84.0    63.5   103.1    62.1   120.7
Future years E        2.2     1.5     2.5     2.3     1.4     0.5     1.5     0.8
Future years N       -1.5    -1.2    -1.0    -1.2    -1.2    -0.5    -0.9    -0.6
Future years J       -0.6    -0.3    -1.5    -1.1    -0.2     0.0    -0.6    -0.2
Note: [B,W][M,F][L,H][,r] denote black/white, male/female, less than high school/high
school, no criminal record/criminal record; J indicates jailed or incarcerated; N indicates
nonemployed but not currently incarcerated; earnings are pre-tax thousands of 1982-1984
dollars.

Comparing the results for men with and without a criminal record shows that
for high school dropouts, the impact of a return to jail (indicated by an ‘r’ in the
heading) is smaller than the impact of an initial incarceration. This is not the case
for men with a high school degree, however, where repeat offenders experience larger
earnings losses. The differences across education levels likely reflect differences in
recidivism. The GIRF for any incarceration stint captures the increased risk of future
incarceration. For high school dropouts, the reincarceration risk is so high that at any
point a realized return to jail has (somewhat) modest effects. The reincarceration risk
of high school graduates, while still significant, is lower, making a return to jail more
costly.
Figure 7: GIRFs for latent nonemployment shocks, by race and incarceration history,
men without a high school diploma

Note: [B,W][L,H][,r] denote black/white, less than high school/high school, no
criminal record/criminal record. Earnings are measured in thousands of 1982-1984 dollars.

5.4.2 Nonemployment Shocks

Figure 7 plots the GIRFs generated by a latent nonemployment shock. Although
the initial impact of a nonemployment shock on earnings is identical to that of an
incarceration shock, its effects wear off more quickly. The second panel of Table 7 thus
shows that the lifetime earnings loss following the nonemployment shock is smaller
than the one following incarceration. The lifetime impact is still quite large, ranging
from $47,000 to $210,000. The reader should keep in mind that at young ages, a
nontrivial portion of nonemployment is transitory; the lifetime effects reported here
are for a shock to the persistent component.
Nonemployment appears to be an important pathway to incarceration. The transition matrices in Tables 2 and 3 imply that nonemployed men are especially likely
to become incarcerated. Table 7 likewise shows that a spell of nonemployment raises
future jail time by 0.4 to 4.4 years. To put this in perspective, note that a white
high school dropout with no criminal record would on average spend 1.3 years in jail
(Table 6). The additional 0.7 expected years of incarceration due to nonemployment
(Table 7) is an increase of over 50%. Even after conditioning on education, race, and
gender, nonemployment significantly contributes to incarceration.
5.4.3 Q5 to Q3 Shocks and Q5 to Q7 Shocks

Tables 2 and 3 also show that men with low latent earnings potential are more likely
to transition to incarceration or nonemployment. We examine these dynamics more
carefully in Figures 8 and 9, which plot the effects of moving down (from decile 5 to
decile 3) or up (from decile 5 to decile 7) the distribution of earnings potential.
Figure 8 shows the GIRFs that result when a man’s latent earnings potential
falls from the fifth to the third decile. Among high school dropouts, the dynamic
effects of this shock on earnings differ markedly by race. For blacks, the shock has
a large initial impact that shrinks monotonically. For whites, the initial effect of the
shock is small—in fact it is slightly positive—but the earnings loss expands rapidly
in subsequent years. This may reflect heterogeneous age dynamics, especially in the
mapping from latent states to observed outcomes (see Table 5). Table 7 shows that the
cumulative earnings losses from this shock are comparable, if in most cases smaller,
to those from a nonemployment shock. For example, a white high school dropout
with no criminal record who is hit by a nonemployment shock will on average suffer
a lifetime earnings loss of $146,000; the latent earnings potential shock generates an
average loss of $118,000.
Figure 8: GIRFs for a latent Q5 to Q3 shock, by race and incarceration history, men
without a high school diploma

Note: [B,W][L,H][,r] denote black/white, less than high school/high school, no
criminal record/criminal record. Earnings are measured in thousands of 1982-1984 dollars.

Figure 9: GIRFs for a Q5 to Q7 shock, by race and incarceration history, men without
a high school diploma

Note: [B,W][L,H][,r] denote black/white, less than high school/high school, no
criminal record/criminal record. Earnings are measured in thousands of 1982-1984 dollars.
Figure 8 and Table 7 also show that a decline in latent earnings potential leads
to higher rates of incarceration. Here too the effects of a shock to latent earnings
are similar to those of a shock to nonemployment. For a black high school dropout
with a criminal record, a negative earnings shock implies an additional 0.9 years in
jail, while a nonemployment shock also implies an additional 0.9 years. A negative
earnings shock also implies higher future nonemployment, although the increases are
smaller than those following the nonemployment shock itself.
Figure 9 shows the GIRFs for a positive transition, an increase in latent earnings potential from the fifth to the seventh decile. While the GIRFs for an increase
in earnings potential qualitatively mirror those for a decrease in earnings potential,
quantitatively the effects are often asymmetric. This can be seen most easily in Table 7, where, for example, the increase in years of employment after a positive earnings
shock is smaller than the decrease in years of employment after a negative earnings
shock. This sort of asymmetry, assumed away in many statistical models of earnings,
arises naturally in our flexible Hidden Markov Model specification.

6 Accounting for the Black-White Earnings Gap

We have seen that incarceration and nonemployment shocks have large and persistent effects on earnings, and that black individuals are in general more likely to be
incarcerated or nonemployed than whites. It is thus natural to ask how the racial
earnings gap would change if we eliminated racial differences in incarceration and
nonemployment. The first way we answer this question is by constructing counterfactual simulations where episodes of incarceration and/or nonemployment no longer
occur and comparing the racial gaps generated by these counterfactual exercises to
those generated by the full model.
The first and second panels of Table 8 report summary statistics for the benchmark model and the no-incarceration counterfactual, respectively. For high school
dropouts, the benchmark model produces a gap in mean lifetime earnings of $157,000
($346,000 − $189,000). In the absence of incarceration, this gap falls to $145,000
($366,000 − $221,000), a decline of 7.6%. The contribution of incarceration to the
racial earnings gap is relatively small for two main reasons. The first reason is that
even though incarceration is far more prevalent among blacks, its effects, when measured in levels, are larger for whites. For example, Table 7 shows that among dropouts
and high school graduates, the loss of lifetime earnings following an incarceration
shock is roughly twice as large for white men as it is for black men. The second is
mechanical: We have expressed the gaps in levels rather than proportional changes.
For example, in the benchmark model, the gap for high school dropouts equals 83%
of black earnings (157/189). When incarceration is eliminated, the fraction falls to
66% (145/221). In proportional terms, this is a reduction of 21%, as opposed
to the 7.6% reduction in levels.
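The difference between the two ways of measuring the gap can be reproduced from the Table 8 means for high school dropouts. The sketch below is only that arithmetic, with illustrative variable names.

```python
# Mean lifetime earnings for high school dropouts, thousands of 1982-1984 dollars.
black_bench, white_bench = 189.0, 346.0   # benchmark model
black_noj, white_noj = 221.0, 366.0       # no-incarceration counterfactual

# Gap in levels and its decline.
gap_bench = white_bench - black_bench
gap_noj = white_noj - black_noj
level_change = (gap_bench - gap_noj) / gap_bench

# Gap as a fraction of black earnings and its decline.
ratio_bench = gap_bench / black_bench
ratio_noj = gap_noj / black_noj
ratio_change = (ratio_bench - ratio_noj) / ratio_bench

print(f"levels: {gap_bench:.0f} -> {gap_noj:.0f} ({level_change:.1%} smaller)")
print(f"fraction of black earnings: {ratio_bench:.0%} -> {ratio_noj:.0%} "
      f"({ratio_change:.0%} smaller)")
```

The levels gap shrinks by about 7.6%, while the gap relative to black earnings shrinks by about 21%, because the denominator (black earnings) rises at the same time as the numerator falls.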
An informative exercise is to compare the change in average earnings induced by
eliminating incarceration to the change in average years of work. For black high school
dropouts, removing incarceration causes years of employment to increase 14.6%, from
21.3 to 24.4. Lifetime earnings increase by a similar amount, 16.9%. Given that low-earnings individuals are more likely to become incarcerated, one might expect the
elimination of incarceration to have a larger effect on aggregate employment than
aggregate earnings. One reason this does not happen is that incarceration has a
scarring effect on earnings. As Tables 2-5 show, men with criminal records are, when
employed, more likely to experience low earnings.
The third panel of Table 8 shows results for the no-nonemployment counterfactuals. In the absence of nonemployment, the gap for dropouts actually rises, to $157,000.
The proportional gap moves in the opposite direction, however, falling to 59%. The
bottom panel shows results for the full employment counterfactual. Eliminating both
incarceration and nonemployment causes the gap to shrink substantially, to $128,000,
about 40% (128/322) of black earnings. This is about 18% smaller than the benchmark gap for dropouts in levels, and as a fraction of black earnings, smaller by 52%.
The effects of eliminating incarceration and nonemployment are, as expected,
smaller for more highly educated men. By way of example, for high school graduates,
the original lifetime earnings gap is $200,000, 65% of the lifetime earnings of black
men. Eliminating incarceration reduces the gap to $188,000 (59% of black lifetime
earnings); eliminating nonemployment reduces the gap to $172,000 (44%); at full
employment, the gap falls to $152,000 (37%). The effects are also smaller at the top
of the lifetime earnings distribution. For high school dropouts, the lifetime earnings
gap at the 90th percentile is $212,000, or 51% of the black earnings total of $417,000.
At full employment, the gap falls to $163,000, or 31% of black earnings.
Table 8: Summary statistics by race and education, benchmark model and
counterfactual experiments

Benchmark
Variable                  BML    WML    BMH    WMH    BMS    WMS    BMC    WMC
Lifetime earnings avg.    189    346    306    506    378    561    631    950
Lifetime earnings p10      28     98     83    193    125    220    256    387
Lifetime earnings p50     151    335    270    501    353    539    567    819
Lifetime earnings p90     417    629    598    835    670    994   1142   1912
Expected years E         21.3   27.8   26.4   31.9   28.6   31.7   30.3   33.1
Expected years J          3.2    1.3    1.1    0.1    1.2    0.2    0.2    0.0
Expected years N         11.5    7.0    8.5    4.0    6.2    4.1    5.6    2.9
Ever-J rate, old         0.44   0.23   0.24   0.03   0.17   0.05   0.06   0.02

No incarceration
Variable                  BML    WML    BMH    WMH    BMS    WMS    BMC    WMC
Lifetime earnings avg.    221    366    321    509    401    567    644    954
Lifetime earnings p10      40    120     95    199    156    229    275    388
Lifetime earnings p50     197    357    289    503    378    543    578    823
Lifetime earnings p90     450    636    612    837    691    999   1152   1917
Expected years E         24.4   29.3   27.6   32.1   30.1   32.0   30.7   33.2
Expected years N         11.6    6.7    8.4    3.9    5.9    4.0    5.3    2.8

No nonemployment
Variable                  BML    WML    BMH    WMH    BMS    WMS    BMC    WMC
Lifetime earnings avg.    268    425    388    560    460    652    768   1095
Lifetime earnings p10      76    180    156    260    218    333    346    451
Lifetime earnings p50     253    422    364    542    442    609    666    929
Lifetime earnings p90     491    675    673    894    758   1093   1410   2103
Expected years E         30.4   34.2   34.2   35.8   34.4   35.8   35.8   36.0
Expected years J          5.6    1.8    1.8    0.2    1.6    0.2    0.2    0.0
Ever-J rate, old         0.46   0.22   0.24   0.02   0.15   0.03   0.05   0.02

No incarceration or nonemployment
Variable                  BML    WML    BMH    WMH    BMS    WMS    BMC    WMC
Lifetime earnings avg.    322    450    412    564    487    657    780   1098
Lifetime earnings p10     166    226    189    265    247    337    357    450
Lifetime earnings p50     297    439    384    544    463    612    676    931
Lifetime earnings p90     519    682    685    896    779   1097   1418   2106
Expected years E         36.0   36.0   36.0   36.0   36.0   36.0   36.0   36.0

Note: [B,W][M,F][L,H,S,C] denote black/white, male/female, less than high school/high
school/some college/college graduate; E means employed; J means jailed or incarcerated;
N means nonemployed; earnings are pre-tax thousands of 1982-1984 dollars.

The counterfactual exercises in Table 8 provide one way to measure the effects
of incarceration and/or nonemployment on the racial earnings gap. An alternative
approach is the following decomposition. Let $m^T = \{m_t\}_{t=1}^{T}$ denote a history of bin
realizations (jail, nonemployment, and earnings deciles), and let $p(m^T, race)$, $race \in \{W, B\}$,
denote the probability function for these histories. (The function $p_t(\cdot)$ also
depends on gender and education, which we will ignore for now.) Let $y_t(m_t, race)$
denote the time-$t$ earnings associated with outcome bin $m_t$, with lifetime earnings
given by $PDV(m^T, race) := \sum_{t=1}^{T} R^{1-t} y_t(m_t, race)$. It follows that for either race,
any summary statistic (mean, median, etc.) for lifetime earnings can be written as
$\varsigma(y(\cdot, race), p(\cdot, race))$ or, more compactly, as $\varsigma(y^{race}, p^{race})$.27
The racial gap in earnings is then given by:
\begin{align}
\varsigma(y^W, p^W) - \varsigma(y^B, p^B)
  &= \left[ \varsigma(y^W, p^W) - \varsigma(y^B, p^W) \right] + \left[ \varsigma(y^B, p^W) - \varsigma(y^B, p^B) \right] \tag{5} \\
  &= \left[ \varsigma(y^W, p^B) - \varsigma(y^B, p^B) \right] + \left[ \varsigma(y^W, p^W) - \varsigma(y^W, p^B) \right]. \tag{6}
\end{align}

The first bracketed term in equations (5) and (6) measures the effects of racial differences in earnings, holding fixed the distribution of outcome bins. The second bracketed term measures the effects of racial differences in the distribution of outcome bins,
holding fixed the earnings values associated with each outcome bin. The second term
can be viewed as capturing the effects of racial gaps in incarceration and nonemployment, although it captures distributional differences of every sort.28 The ratio of this
term to the entire gap provides a relative measure. Equations (5) and (6) are equally
valid; in practice, we calculate the ratio both ways and take the average.
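A small numerical sketch may help fix ideas about the decomposition for the mean. Everything below is hypothetical: a three-period horizon, three outcome bins, a discount factor `R`, and made-up earnings, initial distributions, and transition matrices that stand in for $y^{race}$ and $p^{race}$. Histories are generated by a simple first-order chain purely for convenience.

```python
import itertools
import numpy as np

R, T = 1.04, 3                  # hypothetical discount factor and horizon
BINS = ("J", "N", "E")          # jail, nonemployment, one earnings bin

# Made-up inputs standing in for y^race and p^race.
y = {"W": {"J": 0.0, "N": 0.0, "E": 12.0},
     "B": {"J": 0.0, "N": 0.0, "E": 9.0}}
init = {"W": np.array([0.02, 0.18, 0.80]),
        "B": np.array([0.08, 0.27, 0.65])}
trans = {"W": np.array([[0.30, 0.30, 0.40],
                        [0.03, 0.37, 0.60],
                        [0.01, 0.09, 0.90]]),
         "B": np.array([[0.40, 0.30, 0.30],
                        [0.08, 0.42, 0.50],
                        [0.03, 0.12, 0.85]])}

def history_probs(race):
    """p(m^T, race) for every possible history of bin indices."""
    probs = {}
    for h in itertools.product(range(len(BINS)), repeat=T):
        p = init[race][h[0]]
        for a, b in zip(h, h[1:]):
            p *= trans[race][a, b]
        probs[h] = p
    return probs

def pdv(h, race):
    """PDV(m^T, race) = sum over t = 1..T of R^(1-t) * y(m_t, race)."""
    return sum(R ** (-t) * y[race][BINS[m]] for t, m in enumerate(h))

def mean_stat(y_race, p_race):
    """Mean lifetime earnings using y from one race and p from another."""
    return sum(pdv(h, y_race) * p for h, p in history_probs(p_race).items())

s = {(a, b): mean_stat(a, b) for a in "WB" for b in "WB"}
gap = s["W", "W"] - s["B", "B"]
earn_5, dist_5 = s["W", "W"] - s["B", "W"], s["B", "W"] - s["B", "B"]  # eq. (5)
earn_6, dist_6 = s["W", "B"] - s["B", "B"], s["W", "W"] - s["W", "B"]  # eq. (6)
share = 0.5 * (dist_5 / gap + dist_6 / gap)   # average of the two ratios
```

Both pairs of terms add up to the same total gap, and `share` is the fraction of that gap attributed to differences in the distribution of outcome bins. For statistics other than the mean, `mean_stat` would be replaced by the corresponding functional of the PDV distribution.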
We apply this decomposition to lifetime earnings in Table 9, which presents the
share of the lifetime earnings gap attributable to racial differences in the distribution
of outcome bins.29 The first row shows that for male high school dropouts, 64% of
the difference in average lifetime earnings is attributable to differences in the distribution of outcome bins, much of which is due to differences in incarceration or
nonemployment. By way of comparison, recall that for high school dropouts, eliminating incarceration and nonemployment in their entirety reduces the levels gap by
18% and the fractional gap by 52%.
Continuing along the first row, the ratios for high school graduates, men with some
college education, or college graduates, are 46%, 48%, and 21%, respectively. The decomposition exercises thus suggest that, for most education levels, racial differences in
nonemployment and incarceration constitute a significant portion of the earnings gap.
The remaining rows of Table 9 show results for various lifetime earnings percentiles.
In general, the ratios are larger at lower percentiles, suggesting that differences in
employment histories matter more at the bottom of the earnings distribution.

27 To fix ideas, for means we have $\varsigma(y^{race}, p^{race}) = \sum_{m^T} PDV(m^T, race)\, p(m^T, race)$.
28 This decomposition does not allow us to separately measure the effects of incarceration and
nonemployment.
29 The numbers underlying Table 9 can be found in Appendix H.

Table 9: Decomposition: Fraction of lifetime earnings gap explained by differences in
the distribution across bins

Variable                   ML     MH     MS     MC
Lifetime earnings avg.   63.7   46.1   47.5   20.5
Lifetime earnings p10    76.6   61.0   59.3   29.2
Lifetime earnings p50    72.4   55.0   50.4   15.9
Lifetime earnings p90    47.3   22.8   48.0   31.2

Note: [M][L,H,S,C] denote male, less than high
school/high school/some college/college graduate. Analysis follows equations (5) and (6), as
described in the text.

7 Conclusion

Despite the prevalence and growth of incarceration in the United States, much remains
unknown about the relationship between incarceration, employment, earnings, and
demographics. In this paper, we exploit the rich panel structure of the NLSY79, one of
the few datasets that tracks incarceration, to estimate the dynamics of incarceration,
employment, and earnings. We deploy a Hidden Markov Model that distinguishes
between first-time and repeat incarceration, allows for both persistent and transitory
employment and earnings shocks, and allows for nonresponse bias.
The estimated effects of first-time incarceration on earnings are substantial, reducing lifetime earnings by at least a third and—for some subgroups—a half. This
reduction in lifetime earnings is due both to the extensive margin, through fewer
years employed, and the intensive margin, through lower earnings while working. A
positive link between nonemployment and jail is also apparent: Low latent earnings
imply higher incarceration risk. All of the shocks we consider have highly persistent
effects.
Relative to their white counterparts, black men earn less and are more likely
to be nonemployed or incarcerated. Decomposition exercises with our model show
that among less-educated men, differences in incarceration and nonemployment can
explain a significant portion of the black-white gap in lifetime earnings. A promising
avenue for future research, which we are currently pursuing, is to use our estimates in
a consumption-savings model that can quantify the role of incarceration (among other
factors) in the large wealth differences observed across race, gender, and education
groups.

References
Abowd, J. M. and D. Card (1989): “On the Covariance Structure of Earnings
and Hours Changes,” Econometrica, 57, 411–445.
Altonji, J. G., A. A. Smith Jr, and I. Vidangos (2013): “Modeling Earnings
Dynamics,” Econometrica, 81, 1395–1454.
Arellano, M., R. Blundell, and S. Bonhomme (2017): “Earnings and Consumption Dynamics: A Nonlinear Panel Data Framework,” Econometrica, 85, 693–
734.
Bartolucci, F., A. Farcomeni, and F. Pennoni (2010): “An Overview
of Latent Markov Models for Longitudinal Categorical Data,” arXiv preprint
arXiv:1003.2804.
——— (2012): Latent Markov Models for Longitudinal Data, Chapman and
Hall/CRC.
Bayer, P. and K. K. Charles (2018): “Divergent Paths: A New Perspective on
Earnings Differences Between Black and White Men Since 1940,” Quarterly Journal
of Economics, 133, 1459–1501.
Bonhomme, S. and J.-M. Robin (2009): “Assessing the Equalizing Force of Mobility using Short Panels: France, 1990–2000,” Review of Economic Studies, 76,
63–92.
——— (2010): “Generalized Non-Parametric Deconvolution with an Application to
Earnings Dynamics,” Review of Economic Studies, 77, 491–533.
Caucutt, E. M., N. Guner, and C. Rauh (2018): “Is Marriage for White People?
Incarceration, Unemployment, and the Racial Marriage Divide,” Tech. rep., CEPR,
Discussion Paper No. DP13275.
De Nardi, M., G. Fella, and G. Paz-Pardo (2020): “Nonlinear Household
Earnings Dynamics, Self-insurance, and Welfare,” Journal of the European Economic Association, 18, 890–926.
Ewert, S. and T. Wildhagen (2011): “Educational Characteristics of Prisoners:
Data from the ACS,” Presentation at the Population Association of America.
Farmer, L. (2020): “The Discretization Filter: A Simple Way to Estimate Nonlinear
State Space Models,” Tech. rep., University of Virginia, available at SSRN 2780166.
Fella, G. and G. Gallipoli (2014): “Education and Crime over the Life Cycle,”
Review of Economic Studies, 81, 1484–1517.
Fu, C. and K. I. Wolpin (2018): “Structural Estimation of a Becker-Ehrlich Equilibrium Model of Crime: Allocating Police across Cities to Reduce Crime,” Review
of Economic Studies, 85, 2097–2138.
Guler, B. and A. Michaud (2018): “Dynamics of Deterrence: A Macroeconomic
Perspective on Punitive Justice Policy,” CHCP working papers, 2018-6, Centre for
Human Capital and Productivity, University of Western Ontario.
Guvenen, F. (2009): “An Empirical Investigation of Labor Income Processes,” Review of Economic Dynamics, 12, 58–79.
Guvenen, F., F. Karahan, S. Ozkan, and J. Song (2020): “What Do Data
on Millions of US Workers Reveal about Life-Cycle Earnings Risk?” Tech. rep.,
University of Toronto.
Hamilton, J. D. (1989): “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle,” Econometrica, 57, 357–384.
——— (1994): Time Series Analysis, Princeton University Press.

Heckman, J. J., J. E. Humphries, and N. S. Mader (2011): “The GED,” in
Handbook of the Economics of Education, Elsevier, vol. 3, 423–483.
Hu, Y., R. Moffitt, and Y. Sasaki (2019): “Semiparametric Estimation of the
Canonical Permanent-Transitory Model of Earnings Dynamics,” Quantitative Economics, 10, 1495–1536.
Jones, M. C. (2009): “Kumaraswamy’s Distribution: A Beta-Type Distribution with
Some Tractability Advantages,” Statistical Methodology, 6, 70–81.
Kling, J. R. (2006): “Incarceration Length, Employment, and Earnings,” American
Economic Review, 96, 863–876.
Kumaraswamy, P. (1980): “A Generalized Probability Density Function for
Double-Bounded Random Processes,” Journal of Hydrology, 46, 79–88.
Light, M. T. (2021): “The Declining Significance of Race in Criminal Sentencing:
Evidence from US Federal Courts,” Social Forces.
Lochner, L. (2004): “Education, Work, and Crime: A Human Capital Approach,”
International Economic Review, 45, 811–843.
Loeffler, C. E. (2013): “Does Imprisonment Alter the Life Course? Evidence on
Crime and Employment from a Natural Experiment,” Criminology, 51, 137–166.
McGrattan, E. R. and E. C. Prescott (2000): “Is the Stock Market Overvalued?” Federal Reserve Bank of Minneapolis Quarterly Review, 24, 20–40.
Meghir, C. and L. Pistaferri (2004): “Income Variance Dynamics and Heterogeneity,” Econometrica, 72, 1–32.
Mueller-Smith, M. (2015): “The Criminal and Labor Market Impacts of Incarceration,” Unpublished.
Neal, D. and A. Rick (2014): “The Prison Boom and the Lack of Black Progress
after Smith and Welch,” NBER working paper no. 20283, National Bureau of Economic Research.
Pager, D. (2008): Marked: Race, Crime, and Finding Work in an Era of Mass
Incarceration, University of Chicago Press.
Rehavi, M. M. and S. B. Starr (2014): “Racial Disparity in Federal Criminal
Sentences,” Journal of Political Economy, 122, 1320–1354.
Rosenmerkel, S. P., M. R. Durose, and D. Farole Jr (2009): “Felony Sentences in State Courts, 2006-statistical tables,” US Department of Justice, Bureau
of Justice Statistics, 1–45.
Ruggles, S., S. Flood, R. Goeken, J. Grover, E. M. J. Pacas,
and M. Sobek (2020): “IPUMS USA: Version 10.0 [Dataset],”
https://doi.org/10.18128/D010.V10.0.
Scott, S. L. (2002): “Bayesian Methods for Hidden Markov Models: Recursive
Computing in the 21st Century,” Journal of the American Statistical Association,
97, 337–351.
Survey Research Center (1992): “A Panel Study of Income Dynamics: Procedures and Tape Codes, 1989 Interviewing Year; Wave XXII: A Supplement,” Tech.
rep., Institute for Social Research, University of Michigan.
Travis, J., B. Western, and F. S. Redburn, eds. (2014): The Growth of Incarceration in the United States: Exploring Causes and Consequences, The National
Academies Press.
Trivedi, P. K. and D. M. Zimmer (2007): Copula Modeling: An Introduction for
Practitioners, Now Publishers Inc.
US Department of Justice, Bureau of Justice Statistics (2018): “Key
Statistic: Incarceration Rate,” https://www.bjs.gov/index.cfm?ty=kfdetail&iid=493.

——— (2019): “Number of Sentenced State and Federal Prisoners per 100,000 U.S.
Residents of Corresponding Sex, Race, Hispanic Origin, and Age Groups.”
Western, B. and B. Pettit (2010): “Incarceration & Social Inequality,” Daedalus,
139, 8–19.

For Online Publication: Appendices
A Specification Details

Let $\ell_{n,t} \in \mathcal{L} = \{L_0, L_1, ..., L_{I-1}\}$ denote individual $n$’s underlying, latent labor market
state at date $t$, and let $m_{n,t} \in \mathcal{M} = \{M_0, M_1, ..., M_{J-1}\}$ denote the earnings outcome
observed by the researcher. The set of latent states, $\mathcal{L}$, consists of incarceration, long-term
nonemployment, and $Q^*$ earnings potential bins. This set of outcomes is then
interacted with a $\{0, 1\}$ criminal record flag, so that $\mathcal{L}$ contains $2(Q^* + 2)$ elements.
The set of observed outcomes, $\mathcal{M}$, consists of incarceration, nonemployment (short-
or long-term), $Q$ positive earnings bins, and not interviewed/missing. The nonmissing
outcomes are also interacted with the criminal record flag, so that $\mathcal{M}$ contains $2Q + 5$
elements.
Let $p_q$, $q = 0, 1, ..., Q$ denote the probability cutoffs for the earnings bins. In
practice, we partition earnings into deciles, so that $p_q \in \{0.0, 0.1, ..., 0.9, 1\}$ and $Q = 10$.
We also assume that the bins for latent earnings potential are the same as those
for observed earnings, so that $Q^* = Q$. To streamline the notation, we will proceed
under this assumption.
Our model is based on two key assumptions. The first is that $\ell_{n,t}$ is conditionally
Markov, with the $I \times I$ transition matrix, $A_x$:
$$A_{j,k \mid x} = \Pr(\ell_{n,t+1} = L_k \mid \ell_{n,t} = L_j, x_{n,t}) = \Pr(\ell_{n,t+1} = L_k \mid \mathcal{F}_t), \tag{7}$$
where $x_{n,t}$ is a vector of exogenous variables, and $\mathcal{F}_t$ denotes the time-$t$ information
set. The second is that the distribution of the observed outcome $m_{n,t}$ depends on only
the contemporaneous realization of $\ell_{n,t}$. We can place the probabilities that map $\ell_{n,t}$
to $m_{n,t}$ in the $I \times J$ matrix $B_z$:
$$B_{j,k \mid z} = \Pr(m_{n,t} = M_k \mid \ell_{n,t} = L_j, z_{n,t}) = \Pr(m_{n,t} = M_k \mid \mathcal{F}_{t-1}, \ell_{n,t}). \tag{8}$$
The final element of our model is the $1 \times I$ row vector $\mu_1(\ell_{n,1} \mid x_{n,1})$, which gives
the unconditional distribution of the initial latent state $\ell_{n,1}$.
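The three objects just defined (the initial distribution, the transition matrix, and the observation matrix) are the usual ingredients of a hidden Markov model likelihood. As an illustration only, the sketch below runs the standard scaled forward recursion on a toy model with two latent states and three observed outcomes. It assumes time-invariant matrices, whereas the model above lets them depend on $x$ and $z$; all numbers are hypothetical, and this is not presented as the estimation routine used in the paper.

```python
import itertools
import numpy as np

# Hypothetical toy model: 2 latent states, 3 observed outcomes.
mu1 = np.array([0.7, 0.3])                 # initial latent distribution
A = np.array([[0.9, 0.1],                  # latent transition matrix
              [0.2, 0.8]])
B = np.array([[0.8, 0.1, 0.1],             # observation probabilities
              [0.1, 0.3, 0.6]])

def forward_loglik(mu1, A, B, obs):
    """Log-likelihood of an observed sequence via the scaled forward recursion."""
    alpha = mu1 * B[:, obs[0]]             # joint prob. of state and first outcome
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c                      # filtered state probabilities
    for m in obs[1:]:
        alpha = (alpha @ A) * B[:, m]      # predict one step, then weight by outcome
        c = alpha.sum()
        loglik += np.log(c)
        alpha = alpha / c
    return loglik

# Sanity check: the likelihoods of all length-3 sequences sum to one.
total = sum(np.exp(forward_loglik(mu1, A, B, seq))
            for seq in itertools.product(range(B.shape[1]), repeat=3))
```

Rescaling `alpha` at each step avoids numerical underflow on long panels. A missing interview could be handled in this setup by giving every latent state its own probability of producing the "missing" outcome, which is how a not-interviewed category in the observed set would enter `B`.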
We estimate separate sets of transition and observation probabilities for each
race-gender-education combination. We now turn to describing how these probability
matrices are populated. To simplify the notation, we drop the individual index n, and
suppress the dependence on race, gender, and education.
Our framework has many similarities to Arellano et al. (2017), who also rely
heavily on quantiles. A fundamental difference is that they work with conditional
quantiles, in order to construct inverse conditional CDFs, while we work with unconditional quantiles, in the spirit of copulas. In Arellano et al.’s (2017) framework,
the conditional distribution of quantile ranks is always uniform, and thus independent of the current latent state, because the quantiles themselves are conditional and
thus depend on the latent state. In contrast, we have a single set of cross-sectional
(given race, gender, education, and age) quantiles for all values of the latent state;
in our framework, the quantiles are independent of the current latent state, but the
distribution of outcomes across quantile ranks depends on the current latent state.

A.1 Latent State Transitions

We define a person as having a criminal record if he has been incarcerated in any
previous period. Once a person is incarcerated, he will have a criminal record in all
subsequent periods. Figure 1 illustrates the remainder of the process for populating
the matrices A and B. As the top half of this figure shows, we find the elements
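of the transition matrix $A_x$ in two steps. First, we use a multinomial logit regression to determine the one-period-ahead probabilities of incarceration ($IC$), long-term
nonemployment ($NE$), or employment ($Q^*$):
\begin{align}
\Pr(\ell_{t+1} \in k \mid \ell_t = j, x_t) &= \lambda_{j,k} \Big/ \sum_{m \in \{NE, Q^*, IC\}} \lambda_{j,m}, \tag{9} \\
& j \in \mathcal{L}, \quad k \in \{NE, Q^*, IC\}, \notag \\
\lambda_{j,NE} &\equiv 1, \quad \forall j, \notag \\
\lambda_{j,m} &= \exp\big(x(a_t, \ell_t)\zeta_m\big), \quad m \in \{Q^*, IC\}, \notag
\end{align}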
where $\{\zeta_m\}_{m \in \{Q^*, IC\}}$ are coefficient vectors for future states. Nonemployment is the
benchmark state. In an abuse of notation, $x(a_t, \ell_t)$ denotes the explanatory variables
in the logit regression. The elements of this vector include a polynomial in current
age ($a_t$), indicators for the current state, and interactions:
$$x(a_t, \ell_t) = \left[\; 1 \quad a_t \quad \tfrac{a_t^2}{100} \quad I_t^{NE} \quad I_t^{NE} a_t \quad \tilde{p}_j \quad \tilde{p}_j a_t \quad \tilde{p}_j^2 \quad \tilde{p}_j^2 a_t \quad I_t^{IC} \quad I_t^{CR} \;\right],$$
where: $a_t$ denotes the individual’s age at calendar year $t$; $I_t^{NE}$ and $I_t^{IC}$ are 0-1 indicators for long-term nonemployment or incarceration, respectively; $I_t^{Q} = 1 - I_t^{NE} - I_t^{IC}$
indicates positive earnings; $I_t^{CR}$ is the 0-1 indicator for a criminal record (previous
incarceration); and $\tilde{p}_j$ gives the individual’s (approximate) earnings rank. In particular, when state $j$ corresponds to earnings bin $q_j$, $\tilde{p}_j = [p_{q_j} + p_{q_j - 1}]/2$. By way of
example, when earnings are partitioned into deciles, $\tilde{p}_j \in \{0.05, 0.15, ..., 0.85, 0.95\}$.
When $j$ indicates incarceration or persistent nonemployment, $\tilde{p}_j$ is set to 0. Because
we treat $\tilde{p}_j$ as continuous rather than categorical, the number of variables in the
logistic regression is invariant to the number of bins.
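The first step can be sketched in a few lines. The function names and the coefficient values below are hypothetical placeholders, not estimates; the regressor ordering follows the vector $x(a_t, \ell_t)$ given above, and nonemployment is the base category with $\lambda \equiv 1$.

```python
import numpy as np

def x_vector(age, state, p_rank=0.0, record=0):
    """Regressors x(a_t, l_t); state is 'NE', 'IC', or 'Q' (an earnings bin).

    p_rank is the midpoint rank of the current earnings bin and is set to 0
    when the current state is incarceration or long-term nonemployment.
    """
    i_ne = 1.0 if state == "NE" else 0.0
    i_ic = 1.0 if state == "IC" else 0.0
    p = p_rank if state == "Q" else 0.0
    return np.array([1.0, age, age ** 2 / 100.0, i_ne, i_ne * age,
                     p, p * age, p ** 2, p ** 2 * age, i_ic, float(record)])

def logit_probs(x, zeta_q, zeta_ic):
    """One-period-ahead probabilities of NE, employment (Q), and IC."""
    lam = np.array([1.0, np.exp(x @ zeta_q), np.exp(x @ zeta_ic)])
    return dict(zip(("NE", "Q", "IC"), lam / lam.sum()))

# Hypothetical coefficient vectors (11 entries each), for illustration only.
zeta_q = np.array([1.0, 0.02, -0.05, -1.5, 0.0, 2.0, 0.0, 0.0, 0.0, -2.0, -0.5])
zeta_ic = np.array([-3.0, 0.0, 0.0, 0.5, 0.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.5])

# A 22-year-old in the fifth decile (midpoint rank 0.45) with no criminal record.
x = x_vector(age=22, state="Q", p_rank=0.45, record=0)
probs = logit_probs(x, zeta_q, zeta_ic)
```

Because the rank enters as a continuous variable, the same eleven coefficients per outcome apply whether earnings are split into deciles or into a finer grid.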
Second, we estimate the distribution of next period’s earnings potential, conditional on being employed, across the bins. To do this, we assume that the conditional
distribution of ranks follows the Kumaraswamy (1980) distribution. Like the Beta
distribution, the Kumaraswamy distribution is a flexible function defined over the
[0, 1] interval; however, its cdf is much simpler:
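\[
K(p; \alpha, \beta) = \Pr(y \le p; \alpha, \beta) = 1 - (1 - p^{\alpha})^{\beta}.
\]
The parameters $\alpha$ and $\beta$ are both strictly positive. It follows that if bin $q$ covers quantiles $p_{q-1}$ to $p_q$,
\[
\Pr\big(\ell_{t+1} = \text{bin } q \mid \ell_t = j, x_t\big) = \Pr\big(\ell_{t+1} \in Q^* \mid \ell_t = j, x_t\big) \times \Big[ K\big(p_q; \alpha(x(a_t, \ell_t)), \beta(x(a_t, \ell_t))\big) - K\big(p_{q-1}; \alpha(x(a_t, \ell_t)), \beta(x(a_t, \ell_t))\big) \Big], \tag{10}
\]
\[
j \in \mathcal{L}, \quad q \in \{1, 2, ..., Q\},
\]
\[
\alpha\big(x(a_t, \ell_t)\big) = \exp\big(x(a_t, \ell_t)\,\zeta_a\big),
\]
\[
\beta\big(x(a_t, \ell_t)\big) = \exp\big(x(a_t, \ell_t)\,\zeta_b\big).
\]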


$K(0; \alpha, \beta) = 0$ and $K(1; \alpha, \beta) = 1$ by definition. The Kumaraswamy parameters $\alpha$ and $\beta$ are functions of the current state and the vector $x$; $\zeta_a$ and $\zeta_b$ are the associated coefficient vectors. Because we use the midpoint value $\tilde{p}_{q_j}$ to characterize the current earnings bins, as the number of bins grows large, our discretized distribution converges to a continuous one. It is natural to view both the binning of the data and the expressions for $\alpha(x(a_t, \ell_t))$ and $\beta(x(a_t, \ell_t))$ as semiparametric approximations that will become more complicated as the sample size grows.
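The bin probabilities in equation (10) are differences of the Kumaraswamy cdf at consecutive cutoffs. The sketch below shows this for decile bins; the parameter values are arbitrary illustrations rather than estimates, and the function names are hypothetical.

```python
import numpy as np

def kumaraswamy_cdf(p, alpha, beta):
    """K(p; alpha, beta) = 1 - (1 - p^alpha)^beta on the unit interval."""
    return 1.0 - (1.0 - p**alpha) ** beta

def bin_probs(cutoffs, alpha, beta):
    """Probability mass in each bin [p_{q-1}, p_q]; cutoffs must run from 0 to 1."""
    return np.diff(kumaraswamy_cdf(np.asarray(cutoffs, dtype=float), alpha, beta))

# Decile bins with illustrative (not estimated) parameters.
cutoffs = np.linspace(0.0, 1.0, 11)
probs_k = bin_probs(cutoffs, alpha=2.0, beta=3.0)
```

Because $K(0) = 0$ and $K(1) = 1$, the bin probabilities sum to one for any strictly positive $\alpha$ and $\beta$; multiplying them by the probability of employment from equation (9) gives the entries of the transition matrix for the earnings bins.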
Recall that once a person is incarcerated, he will have a criminal record for the rest of his life. As a result, $A$ will be approximately block diagonal (as in Tables 2 and 3). The first $Q + 1$ rows of this matrix denote cases where at time $t$ the individual has no criminal record ($I_t^{CR} = 0$) and is not currently incarcerated; his current state is long-term nonemployment (state 0) or one of the earnings potential bins. Such a person will not have a criminal record at time $t + 1$; even if he becomes incarcerated at $t + 1$, he will not have a prior conviction at that point. The first $Q + 1$ rows thus have (potentially) nonzero values in the first $Q + 2$ columns and zeros for the remainder. (Given that we start our indexing at 0, this corresponds to rows 0 through $Q$ and columns 0 through $Q + 1$.) The final $Q + 2$ rows are for people who have a criminal record at time $t$, $I_t^{CR} = 1$. These rows will have zeros in the first $Q + 2$ columns and (potentially) nonzero values for the remainder. This leaves one row to consider, namely the one for a person who at time $t$ has no criminal record (he has not been incarcerated in the past) but is currently in jail. This person will have a criminal record at time $t + 1$, so row $Q + 1$ is configured like the rows for those with a criminal record at time $t$. The transition probabilities for this person will differ, however, from those of a person who at time $t$ is both in jail and in possession of a criminal record.
To summarize, the matrix $A$ has a $(Q + 1) \times (Q + 2)$ block in the upper left corner, a $(Q + 3) \times (Q + 2)$ block in the lower right corner, and zeros everywhere else.
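The block structure just described can be written down as a Boolean mask. The sketch below assumes the zero-based state ordering implied by the text (state 0 is nonemployment without a record, states 1 through $Q$ are earnings bins without a record, state $Q + 1$ is first-time incarceration, and the remaining $Q + 2$ states repeat the pattern for people with a record); the function name is hypothetical.

```python
import numpy as np

def transition_mask(Q):
    """True where the transition matrix A may be nonzero, for Q earnings bins."""
    n = 2 * (Q + 2)                      # total number of latent states
    mask = np.zeros((n, n), dtype=bool)
    # No criminal record and not in jail: can reach columns 0 .. Q+1 only.
    mask[: Q + 1, : Q + 2] = True
    # In jail for the first time, or already holding a record:
    # can reach only the criminal-record states, columns Q+2 .. 2Q+3.
    mask[Q + 1 :, Q + 2 :] = True
    return mask

mask = transition_mask(Q=10)
```

A mask of this kind could be used to confirm that an estimated matrix respects the absorbing nature of the criminal record.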

A.2 Observation Probabilities

Let $O_t$ indicate whether the individual was interviewed in the current survey wave. The set of observed outcomes, $\mathcal{M}$, consists of the set of latent states, $\mathcal{L}$, and missing ($O_t = 0$). It is of course not necessary that $\mathcal{M}$ and $\mathcal{L}$ align so closely, but it simplifies the analysis. With this assumption, $\mathcal{M}$ contains $2(Q + 2) + 1 = 2Q + 5$ elements, and the observation matrix $B$ is $(2Q + 4) \times (2Q + 5)$.
To populate $B$, we first find the probability that the individual is not interviewed at time $t$, using a logit specification:
\[
B_{j,0} \mid z = \Pr\big(O_t = 0 \mid \ell_t = j, z_t\big) = 1 - \frac{\exp\big(z_0(a_t, \ell_t, O_{t-1})\,\gamma_0\big)}{1 + \exp\big(z_0(a_t, \ell_t, O_{t-1})\,\gamma_0\big)}, \quad j \in \mathcal{L}, \tag{11}
\]
where
\[
z_0(a_t, \ell_t, O_{t-1}) = \Big[\; 1 \quad a_t \quad \tfrac{a_t^2}{100} \quad I_t^{NE} \quad I_t^{CR} \quad O_{t-1} \;\Big].
\]

The interview probability depends on the presence of earnings but not their rank.
Next, we find the distribution of states for individuals who are interviewed. We
assume that the individual’s criminal record is measured accurately, so that the latent
incarceration state maps 1-for-1 into observed incarceration. Likewise, we assume that
the latent nonemployment state maps 1-for-1 into observed nonemployment, consistent with our view that the state represents long-term unemployment. We assume
further that criminal records are reported accurately. With these assumptions, we get
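\[
B_{j,j+1} \mid z = \frac{\exp\big(z_0(a_t, \ell_t, O_{t-1})\,\gamma_0\big)}{1 + \exp\big(z_0(a_t, \ell_t, O_{t-1})\,\gamma_0\big)}, \quad j \in \{0, Q + 1, Q + 2, 2Q + 3\}, \tag{12}
\]
\[
B_{j,k} \mid z = 0, \quad j \in \{0, Q + 1, Q + 2, 2Q + 3\}, \quad k \notin \{0, j + 1\}. \tag{13}
\]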

To populate the remaining elements of $B$, we assume that when the latent state $\ell_t$ is the earnings potential bin $q$ the individual’s observed outcome may be any observed earnings bin, $q_k \in Q$, or nonemployment. Nonemployment realized in these circumstances is purely transitory, and has no effect on the individual’s latent earnings prospects. Transitory incarceration shocks are ruled out, and the corresponding elements of $B$ are set to zero. Our approach for finding these probabilities is similar to the one employed for the latent states. First, we find the probability that the individual will be nonemployed or working, using a logit:
\[
\Pr\big(m_t \in k \mid \ell_t = j, O_t = 1, z_t\big) = \lambda_{j,k} \Big/ \sum_{h \in \{NE, Q\}} \lambda_{j,h}, \tag{14}
\]
\[
j \in \{\text{bin } 1, \text{bin } 2, ..., \text{bin } Q\} \times \{0, 1\}, \quad k \in \{NE, Q\},
\]
\[
\lambda_{j,NE} \equiv 1, \quad \forall j,
\]
\[
\lambda_{j,Q} = \exp\big(z_1(a_t, \ell_t)\,\gamma_Q\big),
\]
with the conditioning vector $z_1$ given by
\[
z_1(a_t, \ell_t) = \Big[\; 1 \quad a_t \quad \tfrac{a_t^2}{100} \quad \tilde{p}_j \quad \tilde{p}_j a_t \quad \tilde{p}_j^2 \quad \tilde{p}_j^2 a_t \quad I_t^{CR} \quad I_t^{CR} a_t \;\Big].
\]

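Note that $M_0 = NE$ is again the benchmark state. We continue to assume that criminal records are reported accurately.
Next, the probabilities for the individual earnings bins are found using a univariate logistic distribution:
\[
\Pr\big(m_t = \text{bin } q \mid \ell_t = j, O_t = 1, z_t\big) = \Pr\big(m_t \in Q \mid \ell_t = j, O_t = 1, z_t\big) \times \Big[ L^*\big(p_q, \tilde{p}_j; \sigma_{z_1}\big) - L^*\big(p_{q-1}, \tilde{p}_j; \sigma_{z_1}\big) \Big], \tag{15}
\]
\[
j \in \{\text{bin } 1, \text{bin } 2, ..., \text{bin } Q\} \times \{0, 1\}, \quad q \in \{1, 2, ..., Q\},
\]
\[
L^*\big(p, \tilde{p}; \sigma\big) = \frac{\exp(\sigma(p - \tilde{p}))}{1 + \exp(\sigma(p - \tilde{p}))} \Bigg/ \left[ \frac{\exp(\sigma(1 - \tilde{p}))}{1 + \exp(\sigma(1 - \tilde{p}))} - \frac{\exp(-\sigma \tilde{p})}{1 + \exp(-\sigma \tilde{p})} \right], \tag{16}
\]
\[
\sigma_{z_1} = \exp\big(z_1(a_t, \ell_t)\,\gamma_E\big).
\]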


Our decision to center the distribution $L^*(p, \tilde{p}_j; \sigma)$ around the latent earnings rank $\tilde{p}_j$ is an identifying assumption. Given that the logistic density is symmetric around zero, our assumption implies that if the earnings bins are also symmetric, the most likely observed earnings level is the current latent state. The denominator in equation (16) ensures that $\sum_q \Pr(\text{bin } q \mid \ell_j, O = 1, z_t) = \Pr(Q \mid \ell_j, O = 1, z_t)$, $\forall j$.
Multiplying the probabilities in equations (14) and (15) by the observation probability given in (12) completes the process.
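To make the role of the normalization concrete, the sketch below evaluates a logistic cdf centered at the latent midpoint rank and rescaled by its mass on the unit interval, then differences it across bin cutoffs. It is an illustration under assumptions: the function names are hypothetical, the dispersion value is arbitrary, and the normalization is written so that the bin probabilities sum to one, which is the property the text attributes to the denominator of equation (16).

```python
import numpy as np

def logistic_cdf(v):
    """Standard logistic cdf, exp(v) / (1 + exp(v))."""
    return 1.0 / (1.0 + np.exp(-v))

def l_star(p, p_mid, sigma):
    """Logistic cdf centered at p_mid, divided by its mass on [0, 1]."""
    mass = logistic_cdf(sigma * (1.0 - p_mid)) - logistic_cdf(-sigma * p_mid)
    return logistic_cdf(sigma * (p - p_mid)) / mass

def observed_bin_probs(cutoffs, p_mid, sigma):
    """Probability of each observed earnings bin, conditional on positive earnings."""
    return np.diff(l_star(np.asarray(cutoffs, dtype=float), p_mid, sigma))

# Latent state in the fifth decile (midpoint 0.45), illustrative dispersion.
cutoffs_obs = np.linspace(0.0, 1.0, 11)
probs_obs = observed_bin_probs(cutoffs_obs, p_mid=0.45, sigma=20.0)
```

In this example the most likely observed bin is the fifth decile, matching the statement that the most likely observed earnings level is the current latent state when the bins are symmetric around the midpoint.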

A.3 Initial Probabilities

We construct the initial distribution of latent states, $\mu_1(\ell_1 \mid \tilde{x}_1)$, in much the same way we found their transition probabilities. First, we find the probability that the individual is incarcerated, nonemployed (long-term), or in one of the positive earnings potential bins. Conditional on having positive earnings, we find the distribution across the earnings potential bins using the Kumaraswamy distribution $K(p; \alpha_0, \beta_0)$; the calculations parallel those in equation (10). The parameters $\alpha_0$ and $\beta_0$ are scalars to be estimated. The final step is to estimate the probability that the individual has a criminal record, conditional on the other latent states. Here we use a logistic distribution, allowing the probability of a criminal record to depend on $I_0^{NE}$, $I_0^{IC}$, and $q_0$. The product of these two probabilities gives us our initial distribution.

A.4 Likelihood

We estimate our model using maximum likelihood. Suppose that an individual has the sequence of observed outcomes $\{m_t\}_{t=1}^{T}$. The likelihood of this sequence can be found via forward recursion (Bartolucci et al. 2010; Scott 2002; see also Hamilton 1994, chapter 22, and Farmer 2020):
1. Begin with the $1 \times I$ vector of initial latent state probabilities, $\mu_1$.
2. Letting $j_1$ index the realization of $m_1$, calculate the $1 \times I$ vector $\eta_1 = \mu_1 \odot \big(\iota_{j_1} B_1'\big)$, where $\odot$ denotes the Hadamard product, element by element multiplication. Here $\iota_j$ is the $1 \times J$ row vector with 1 at position $j$ and zeros elsewhere; the product $\iota_{j_1} B_1'$ returns (transposed) column $j_1$ of $B_1$. Element $i$ of $\eta_1$ thus gives the joint time-1 probability of latent state $i$ and the observed outcome $m_1$.
3. For $t = 1, 2, ..., T - 1$, calculate $\eta_{t+1} = \Big(\frac{1}{\eta_t \mathbf{1}}\, \eta_t A_t\Big) \odot \big(\iota_{j_{t+1}} B_{t+1}'\big)$.[30] The $1 \times I$ vector $\eta_{t+1}$ gives the joint probability distribution of the latent state $\ell_{t+1}$ and the outcome observed at time $t + 1$ ($m_{t+1}$), conditional on all outcomes observed through time $t$. The sum $\eta_t \mathbf{1}$ is the probability of the outcome observed at time $t$, conditional on all prior observed outcomes. The ratio $\eta_t / (\eta_t \mathbf{1})$ thus gives the distribution of the latent states at time $t$, conditional on all outcomes observed through time $t$, and $A_t$ updates the distribution to time $t + 1$. The right-hand term of the Hadamard product accounts for $m_{t+1}$.
4. Calculate the cumulative probability, $\Pr(m_1, m_2, ..., m_T) = \prod_{t=1}^{T} (\eta_t \mathbf{1})$.
[30] When the NLSY79 moves to two-year frequency, we simply multiply successive transition matrices.
As Scott (2002) observes, if $T$ is large, the product in item (4) may be very small. We thus find the sum of the logged probabilities rather than the log of the product.
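The four steps above, together with the log-sum device, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name and argument layout are hypothetical, and the matrices in the usage lines are small made-up examples.

```python
import numpy as np

def log_likelihood(mu1, A_list, B_list, outcomes):
    """Log-likelihood of one individual's outcome sequence by forward recursion.

    mu1      : (I,) initial latent-state probabilities
    A_list   : T-1 transition matrices, each (I, I); A_list[t] moves period t to t+1
    B_list   : T observation matrices, each (I, J)
    outcomes : length-T sequence of observed outcome indices (0-based)
    """
    # Step 2: joint probability of each latent state and the first outcome.
    eta = mu1 * B_list[0][:, outcomes[0]]
    loglik = np.log(eta.sum())
    # Step 3: filter, propagate, and weight by the next observation.
    for t in range(len(outcomes) - 1):
        filtered = eta / eta.sum()
        eta = (filtered @ A_list[t]) * B_list[t + 1][:, outcomes[t + 1]]
        # Step 4, in logs: add log(eta_t 1) instead of multiplying probabilities.
        loglik += np.log(eta.sum())
    return loglik

# Made-up two-state, two-outcome example with time-invariant matrices.
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.3], [0.4, 0.6]])
mu = np.array([0.5, 0.5])
obs = [0, 1, 0]
ll = log_likelihood(mu, [A, A], [B, B, B], obs)
```

Selecting column `outcomes[t]` of the observation matrix plays the role of the product $\iota_{j} B'$ in the text. For a short sequence like this one, the result can be checked against brute-force summation over all latent paths.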

B The Kumaraswamy Approximation

Assuming that the distribution of earnings ranks is Kumaraswamy with parameters $\alpha(x(a_t, \ell_t))$ and $\beta(x(a_t, \ell_t))$ will always introduce some approximation error (unless that happens to be the exact distribution). In this section, we investigate how much error there is. We begin by running a long simulation of an AR(1) for earnings,
\[
z_t = \rho_z z_{t-1} + \sigma_z \varepsilon_t, \quad \varepsilon_t \sim N(0, 1),
\]
constructing bins (which define our latent state $\ell$) and a transition matrix from the simulation, and then fitting the simulated data with our Kumaraswamy approximation. We then conduct a number of tests to assess the goodness of fit between our target and fitted transition matrices.
In terms of details, we use $\rho_z = 0.95$, in line with most estimates, and fix $\sigma_z = 1$, which is just a normalization since we work in quantile space. We use 10 bins with the 9 cutoffs $\{0.1, 0.2, ..., 0.9\}$. We let $x$ (for $\alpha$ and $\beta$) be a cubic in $\tilde{p}_q$, the bin midpoints. This leaves us with 8 parameters (4 each for $\alpha$ and $\beta$). We choose the parameters by maximum likelihood. In particular, letting $\theta$ denote the coefficient vector, the log-likelihood is
\[
L(\theta) = \sum_{b=1}^{10} \sum_{q=1}^{10} N^S\big(\ell_{t+1} = \text{bin } q \text{ and } \ell_t = \text{bin } b\big) \cdot \ln \Pr\nolimits^K\big(\ell_{t+1} = \text{bin } q \mid \ell_t = \text{bin } b\,; \theta\big),
\]
where $N^S(\cdot)$ denotes simulation counts, and $\Pr^K(\cdot)$ denotes probabilities generated by our Kumaraswamy specification.
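The exercise can be sketched in a few lines. The code below simulates the AR(1), assigns each draw to a decile bin, tabulates transition counts, and evaluates the count-weighted log-likelihood for a given coefficient vector. It is an illustrative sketch only: the function names, the simulation length, the seed, and the use of empirical quantiles as bin cutoffs are assumptions, and the maximization step (which would require a numerical optimizer) is omitted.

```python
import numpy as np

def simulate_ar1_bins(T, rho=0.95, num_bins=10, seed=0):
    """Simulate z_t = rho z_{t-1} + eps_t and return each period's bin index."""
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal(T)
    z = np.empty(T)
    z[0] = shocks[0] / np.sqrt(1.0 - rho**2)      # draw from the stationary law
    for t in range(1, T):
        z[t] = rho * z[t - 1] + shocks[t]
    # Interior empirical quantiles (0.1, ..., 0.9 for deciles) define the bins.
    cuts = np.quantile(z, np.linspace(0.0, 1.0, num_bins + 1)[1:-1])
    return np.searchsorted(cuts, z)               # values in 0 .. num_bins-1

def transition_counts(bins, num_bins=10):
    """counts[b, q] = number of moves from bin b to bin q."""
    counts = np.zeros((num_bins, num_bins))
    np.add.at(counts, (bins[:-1], bins[1:]), 1.0)
    return counts

def kumaraswamy_loglik(theta, counts):
    """Log-likelihood with alpha and beta each an exponentiated cubic in the midpoint."""
    num_bins = counts.shape[0]
    cutoffs = np.linspace(0.0, 1.0, num_bins + 1)
    p_mid = 0.5 * (cutoffs[1:] + cutoffs[:-1])
    X = np.vander(p_mid, 4, increasing=True)      # columns 1, p, p^2, p^3
    alpha = np.exp(X @ theta[:4])
    beta = np.exp(X @ theta[4:])
    cdf = 1.0 - (1.0 - cutoffs[None, :] ** alpha[:, None]) ** beta[:, None]
    probs = np.diff(cdf, axis=1)                  # row b: distribution over next bins
    return np.sum(counts * np.log(np.maximum(probs, 1e-300)))

sim_bins = simulate_ar1_bins(T=5000)
sim_counts = transition_counts(sim_bins)
ll_uniform = kumaraswamy_loglik(np.zeros(8), sim_counts)   # alpha = beta = 1
```

With all coefficients at zero the fitted distribution is uniform across bins, which provides a simple baseline against which fitted parameter values can be compared.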
The top left and right panels of Figure B.1 contain the target and fitted transition matrices, respectively. The current state is given by rows, while the next period state is given by columns. Brighter colors represent large probability transitions, while dark blue colors are close to zero. Visually, one can see the fitted and target matrices are quite close to each other. The difference between the fitted and target transition rates is presented in the middle left panel labeled “Error.” The errors are quite close to zero, but can be as high as 0.05 or as low as -0.05 at some points. Notably, these high and low errors occur close to one another, allowing for the possibility that they average out in some sense. As discussed immediately below, this idea is confirmed by long simulations of the transition matrices.
Figure B.1: Error from the Kumaraswamy approximation


The middle right panel plots a 500-period simulation using both the target and fitted transition matrices. To do this, we start both simulations with the same initial state and then use the same sequence of $U[0, 1]$ realizations to draw from $F(\ell_{t+1} \mid \ell_t)$. This is quite a demanding test, as it allows errors to accumulate over time. And, indeed, some slight errors can be seen. However, in practice the errors do not accumulate: deviations in either direction are corrected shortly afterward. This is consistent with the balance of high and low errors in the fitted transition matrices.
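A paired simulation of this kind can be sketched as below: both chains start in the same state and consume the same uniform draws, so any divergence between the paths is attributable to differences between the two matrices. The function name and the small example matrix are hypothetical.

```python
import numpy as np

def simulate_chain(P, start, uniforms):
    """Simulate a Markov chain from transition matrix P by inverse-cdf draws."""
    cum = np.cumsum(P, axis=1)
    path = np.empty(len(uniforms) + 1, dtype=int)
    path[0] = start
    for t, u in enumerate(uniforms):
        # First state whose cumulative probability exceeds u; the min() guards
        # against cumulative sums that fall slightly short of one.
        nxt = np.searchsorted(cum[path[t]], u, side="right")
        path[t + 1] = min(nxt, P.shape[0] - 1)
    return path

# The same draws would be fed to the target and the fitted matrix.
draws = np.random.default_rng(1).uniform(size=500)
P_example = np.array([[0.9, 0.1], [0.3, 0.7]])   # made-up matrix
path_example = simulate_chain(P_example, start=0, uniforms=draws)
```

Calling `simulate_chain` once with the target matrix and once with the fitted matrix, using the same `draws`, gives the two paths compared in the panel.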
When we run an even longer (100,000-period) simulation to look at the target and fitted invariant distribution (bottom left panel) and innovation distribution (bottom right panel), we again see some error. However, these errors again seem to be balanced. For example, while the probability of being in the lowest state is too high in the fitted distribution, the probability of being in the second-lowest bin is too small. Similar statements can be made in terms of the innovation distribution. Hence, while the Kumaraswamy approximation is not perfect, it seems to do a good job of approximating earnings dynamics provided they are reasonably captured by a persistent AR(1) process. Stated differently, if the earnings dynamics conditional on job-to-job transitions roughly follow the most common assumption in the literature (a persistent AR(1) process with normal innovations), then our Kumaraswamy functional form allows for a good approximation of earnings dynamics.

C Earnings and Employment Measures

In this appendix, we describe how we measure employment and earnings in the data.

C.1 NLSY79

We measure earnings as the sum of wage income, salary income and the labor portion of farm and business income, with the latter computed using the approach of the Panel Study of Income Dynamics (PSID) (Survey Research Center, 1992). We will refer to individuals with no farm or business income as “workers.” The NLSY79 reports both total hours and total weeks of work. We include military hours in the total: these are fairly small, as we do not use the NLSY79’s military subsample.
In general, employed individuals have positive (80 or more) hours of work, positive (2 or more) weeks of work, and positive ($250 or more in 1980 dollars) earnings. When the three measures conflict, we define employment as follows.
1. If a person has positive earnings and either positive hours or positive weeks, she is employed.
2. A person with no hours and no weeks of work is nonemployed, regardless of earnings.
3. A worker with no earnings is nonemployed, regardless of hours or weeks.
4. A nonworker with positive hours and positive weeks is employed, regardless of earnings.
5. If a nonworker has no earnings and either no hours or no weeks, she is nonemployed.
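The five rules can be combined into a single classification function. The sketch below is one way to encode them; the function and argument names are hypothetical, and the thresholds are the ones stated above (80 hours, 2 weeks, and $250 in 1980 dollars).

```python
def is_employed(earnings, hours, weeks, is_worker):
    """Classify one person-year as employed (True) or nonemployed (False).

    earnings  : annual labor earnings in 1980 dollars
    hours     : total annual hours of work
    weeks     : total annual weeks of work
    is_worker : True if the person has no farm or business income
    """
    pos_earnings = earnings >= 250
    pos_hours = hours >= 80
    pos_weeks = weeks >= 2
    if not pos_hours and not pos_weeks:
        return False                    # rule 2
    if is_worker:
        return pos_earnings             # rule 1 if earnings are positive, else rule 3
    if pos_hours and pos_weeks:
        return True                     # rule 4
    return pos_earnings                 # one of hours/weeks positive: rule 1 or rule 5

examples = [
    is_employed(1000, 100, 0, True),    # positive earnings and hours
    is_employed(1000, 0, 0, True),      # no hours and no weeks
    is_employed(0, 2000, 50, True),     # worker with no earnings
    is_employed(0, 2000, 50, False),    # nonworker with hours and weeks
    is_employed(0, 2000, 0, False),     # nonworker, no earnings, no weeks
]
```

Ordering the checks this way makes the rules mutually exclusive: every combination of the three measures and worker status falls under exactly one branch.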

C.2 CPS

Consistent with our approach in the NLSY, our CPS earnings measure includes not
just wage and salary income, but also the labor component of farm and business
income, again applying the PSID (Survey Research Center, 1992) methodology.
The CPS reports both the number of weeks worked in the prior year and the usual number of hours worked each week. We consider individuals who worked for fewer than two weeks to be nonemployed, along with those who worked fewer than 80 hours over the entire year (i.e., the product of usual hours worked each week and the number of weeks worked was less than 80). Workers (no business or farm income) with less than $250 (in 1980 dollars) of earnings are also considered to be nonemployed.


D Summary Statistics for Women

This section presents CPS summary statistics for women in Table D.1.
Table D.1: Summary Statistics by Race and Education for Women, NLSY79

                                                Black Women                         White Women
                                         LTHS       HS       SC       CG     LTHS       HS       SC       CG
Earnings (in $1,000s)
  Mean                                   3.23     7.34     9.69    14.79     5.69     9.09    11.09    17.19
  10th percentile                           0        0        0        0        0        0        0        0
  25th percentile                           0        0     0.64     4.56        0     0.63     2.22     4.56
  50th percentile                           0     5.77     8.80    13.54     2.70     7.92     9.60    14.29
  75th percentile                        5.20    11.96    14.87    21.74     9.32    13.64    16.13    23.37
  90th percentile                       11.11    17.74    22.01    30.01    15.01    20.11    23.33    34.60
Currently Incarcerated (%)
  All ages                               0.79     0.10     0.12        0     0.17     0.01     0.07     0.02
  22-29                                  0.42     0.03     0.04        0     0.11     0.00     0.01     0.02
  30-39                                  1.49     0.11     0.16        0     0.41     0.02     0.19     0.02
  40-49                                  0.60     0.21     0.19        0        0     0.03     0.04     0.01
  50 and older                           0.27     0.18     0.10        0        0        0        0        0
Previously Incarcerated (%)
  All ages                               2.82     0.45     0.80        0     1.48     0.03     0.30     0.29
  22-29                                  0.55        0     0.15        0     0.73     0.01     0.05     0.20
  30-39                                  4.06     0.24     0.67        0     1.41        0     0.20     0.26
  40-49                                  4.49     1.24     1.63        0     2.60     0.07     0.77     0.41
  50 and older                           3.56     1.24     1.84        0     2.77     0.14     0.67     0.44
Fraction Employed (%)
  All                                   38.39    61.22    67.62    76.60    53.80    65.38    69.87    76.87
  Previously incarcerated               25.78    18.81    47.87      N/A    23.89    19.96    42.56    96.08
  Not previously incarcerated           38.76    61.41    67.74    76.66    54.22    65.37    69.94    76.80
Mean Values
  Year of birth                        1960.2   1960.4   1960.2   1960.3   1960.4   1960.1   1960.2   1960.3
  Age                                   31.24    31.04    30.85    30.80    29.57    28.99    28.99    30.59
Fraction of female population (%)        3.36     3.78     4.73     3.04    12.10    27.19    20.34    25.45
Observations                            6,206    7,464    8,768    5,788    8,665   16,584   12,160   16,086
Individuals                               325      392      452      301      660    1,058      724      934

Note: [LTHS,HS,SC,CG] denote less than high school/high school/some college/college graduate.

E Decile Cutpoints and Within-Decile Means

This section gives the estimated decile cutpoints in Figures E.1 and E.2, and the
estimated within-decile means in Figures E.3 and E.4.
Figure E.1: Decile Cutpoints for Earnings by Group (Females)
[Figure: eight panels, one per group (black female lths, hs, sc, cg; white female lths, hs, sc, cg). Each panel plots the 10th through 90th percentile earnings cutpoints; horizontal axes run from 20 to 70.]

Figure E.2: Decile Cutpoints for Earnings by Group (Males)
[Figure: eight panels, one per group (black male lths, hs, sc, cg; white male lths, hs, sc, cg). Each panel plots the 10th through 90th percentile earnings cutpoints; horizontal axes run from 20 to 70.]

Figure E.3: Within-Decile Mean Earnings by Group: Data and Estimates (Females)
[Figure: eight panels, one per group (black female lths, hs, sc, cg; white female lths, hs, sc, cg). Each panel plots mean earnings within each decile, from the bottom 10th to the top 10th; horizontal axes run from 20 to 70.]

Figure E.4: Within-Decile Mean Earnings by Group: Data and Estimates (Males)
[Figure: eight panels, one per group (black male lths, hs, sc, cg; white male lths, hs, sc, cg). Each panel plots mean earnings within each decile, from the bottom 10th to the top 10th; horizontal axes run from 20 to 70.]

F Coefficient Estimates

Tables F.1 and F.2 show coefficient estimates for, respectively, men and women. The
associated standard errors are found using the information matrix.
For a number of groups—white men with a college degree, white women with at
least a high school diploma, and black women with either a high school or a college degree—the incidence of incarceration is so low that their incarceration-related
parameters cannot be estimated accurately. In these cases, we use the data to estimate a simplified model that omits incarceration. To this set of parameters, we add
incarceration-related parameters estimated for other, similar groups, namely white
men with some college education, or white women without a high school diploma,
or black women with some college experience. These coefficients are identified by an
entry of “NA” in the standard error slot.
When making these imputations, we adjust the intercepts for the incarceration
probabilities to match the probabilities observed in the NLSY for the group. The logic
of our adjustments is the following. Consider a simple static logistic model, where
\[
u = \frac{1}{1 + E + IC}, \qquad e = \frac{E}{1 + E + IC}, \qquad ic = \frac{IC}{1 + E + IC},
\]
give the probabilities of being nonemployed, employed, or incarcerated, respectively. Suppose the incarceration constant for group $i$ is known to be $\ln(IC_i)$, and we want to build off this constant to impute $\ln(IC_j)$ for group $j$. If we know the probabilities $ic_i$, $ic_j$, $u_i$, $u_j$, we have
\[
ic_i = \exp\big(\ln(IC_i)\big) \cdot u_i, \qquad ic_j = \exp\big(\ln(IC_j)\big) \cdot u_j,
\]
which can be rearranged to yield
\[
\exp\big(\ln(IC_j) - \ln(IC_i)\big) = \frac{ic_j}{ic_i} \cdot \frac{u_i}{u_j}
\;\Rightarrow\; \ln(IC_j) = \ln(IC_i) + \big[\ln(ic_j) - \ln(ic_i)\big] - \big[\ln(u_j) - \ln(u_i)\big]. \tag{17}
\]

Letting $\ln(IC)$ be the intercept for the incarceration-related expression, the latter two terms of (17) comprise our adjustment. In practice, we calculate $ic$ and $u$ by averaging across all sample periods, and we estimate $ic$ as the average probability of a criminal record, which is somewhat less noisy than incarceration itself. A further
complication is that for white men and black women with a college degree, and white
women with a high school degree, the fraction of individuals with a criminal record
is very low, 0.03% or less. In these cases, we have a second layer of imputation, for
the fraction ic itself. Black female college graduates are assigned the criminal record
fraction observed for black female high school graduates; white female high school
graduates are assigned the fraction for white female college graduates; and white
male college graduates are given the fraction for white men with some college, scaled
downward using the fractions observed for black men. All of these imputations are
admittedly ad hoc, but the groups to which they are applied have very low rates of
incarceration, implying that the imputations have small quantitative effects.
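The intercept adjustment in equation (17) is a two-term shift in logs. The sketch below applies it in the simple static model used to motivate it; the function name and all numerical values are hypothetical and serve only to show that the formula recovers the target group's intercept exactly in that model.

```python
import math

def imputed_intercept(ln_ic_i, ic_i, ic_j, u_i, u_j):
    """Equation (17): shift group i's incarceration intercept to fit group j."""
    return (ln_ic_i
            + (math.log(ic_j) - math.log(ic_i))
            - (math.log(u_j) - math.log(u_i)))

# Made-up static model for the donor group i and the target group j.
E_i, IC_i = 3.0, 0.20
E_j, IC_j = 5.0, 0.05
u_i, ic_i = 1.0 / (1.0 + E_i + IC_i), IC_i / (1.0 + E_i + IC_i)
u_j, ic_j = 1.0 / (1.0 + E_j + IC_j), IC_j / (1.0 + E_j + IC_j)

ln_ic_j = imputed_intercept(math.log(IC_i), ic_i, ic_j, u_i, u_j)
```

In the dynamic model the adjustment is only approximate, because the probabilities are averages across periods and covariates rather than outcomes of a single static logit.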
One final complication is that at the baseline estimates for white men with some college education and black men with college degrees, older men cycle between nonemployment and the bottom earnings decile on an annual basis. Because the NLSY79 switches to a biennial frequency after 1994, such behavior is consistent with the data, as individuals have a high probability of returning to their initial state two years later. To prevent such behavior, we either set a time trend coefficient to zero, or use estimates from a local maximum where the time trend is small.


0.32
0.01
0.02
0.36
0.86
0.02
0.86
0.02
0.04
0.96
0.05
0.07
0.33
2.96
0.10
3.06
0.10
0.16
1.02
0.06
0.07
0.18
0.17
0.06

Latent Earnings Decile, Kumaraswamy Parameter α
Constant
-1.65
0.37
-1.71
Age
0.06
0.02
0.07
2
Age /100
-0.06
0.02
-0.06
Not employed
-0.79
0.47
-0.92
p̃
6.51
0.98
5.90
p̃× age
0.02
0.02
0.02
2
p̃
-5.01
0.98
-4.24
p̃2 × age
0.00
0.03
0.00
Criminal record
0.02
0.03
-0.03

Latent Earnings Decile, Kumaraswamy Parameter β
Constant
0.26
0.90
-0.04
Age
0.02
0.05
0.05
2
Age /100
0.00
0.07
-0.02
Not employed
-0.51
0.36
-1.51
p̃
-3.28
3.20
-4.94
p̃× age
0.67
0.10
0.63
p̃2
1.64
3.37
3.73
p̃2 × age
-0.72
0.10
-0.70
Criminal record
-0.01
0.12
-0.21

Probability that Individual Is Interviewed
Constant
2.24
1.04
Age
-0.12
0.06
2
Age /100
0.12
0.07
Not employed
0.25
0.19
Criminal record
0.01
0.09
Observed prior wave
3.50
0.07
2.53
-0.14
0.14
0.05
0.00
3.59

1.99
0.11
0.14
0.65
0.67
0.44
0.21

1.45
0.08
0.11
0.53
0.64
0.40
0.17

Probability of Incarceration Next Period
Constant
1.52
Age
-0.09
2
Age /100
0.03
Not employed
-2.84
p̃
1.24
In jail
1.28
Criminal record
2.25
-1.83
0.11
-0.25
-3.17
0.06
1.53
1.93

Ability State Next Year
1.30
2.53
1.29
0.07
0.01
0.07
0.09
-0.11
0.08
0.74
-4.44
0.78
0.02
0.03
0.02
1.22
7.62
1.25
1.40
-4.49
1.33
0.41
-0.79
0.52
0.16
0.20
0.18

4.25
-0.23
0.24
-0.32
0.06
3.57

-2.58
0.15
-0.11
-0.92
-6.13
0.79
6.90
-0.90
0.55

-2.83
0.12
-0.11
-1.36
6.52
0.00
-4.13
0.00
0.08

0.74
-0.10
0.04
-1.99
0.79
1.66
2.47

1.88
0.01
-0.08
-3.30
0.02
11.11
-8.04
-1.03
-0.01

1.32
0.07
0.10
0.27
0.19
0.09

1.09
0.06
0.09
0.44
3.60
0.12
3.72
0.12
0.27

0.34
0.02
0.02
0.65
0.98
0.02
0.98
0.02
0.05

3.59
0.21
0.28
0.69
1.53
0.53
0.37

1.74
0.09
0.11
0.78
0.02
1.67
1.78
0.59
0.23

Black Men
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

Probability of a Positive Latent Earnings
Constant
2.26
Age
-0.03
2
Age /100
0.00
Not employed
-1.18
Not employed x age
-0.05
p̃
6.06
p̃2
-1.38
In jail
-0.30
Criminal record
0.34

Less than HS
coeff.
(s.e)

5.57
-0.34
0.44
-1.95
0.38
4.33

-1.70
0.13
-0.11
-2.75
-18.01
1.10
18.21
-1.20
0.65

-2.79
0.12
-0.11
-2.97
5.47
0.01
-3.38
0.00
0.07

0.00
-0.14
0.20
-6.34
-2.51
12.19
0.35

4.01
-0.10
0.09
-2.97
0.00
9.17
-7.64
7.85
-0.40

2.40
0.13
0.16
0.31
99.19
0.15

1.33
0.08
0.09
0.56
3.82
0.14
4.07
0.14
2.2e02

0.43
0.02
0.02
1.19
1.22
0.03
1.32
0.03
36.90

2.0e02
10.34
12.34
5.8e02
1.2e02
3.6e05
2.5e02

2.15
0.13
0.17
0.46
NA
2.34
2.12
3.6e05
1.4e02

Bachelors +
coeff.
(s.e)

2.45
-0.13
0.12
0.44
0.13
3.55

1.27
0.00
0.01
-0.89
-5.79
0.83
5.30
-0.90
-0.42

-1.09
0.04
-0.03
-0.36
5.94
0.03
-3.25
-0.03
-0.08

0.02
0.01
-0.07
-4.00
-1.32
2.50
1.56

2.94
0.01
-0.08
-2.54
-0.04
3.35
-0.38
0.10
-0.22

0.62
0.03
0.04
0.17
0.08
0.04

0.61
0.03
0.05
0.20
2.40
0.07
2.62
0.07
0.14

0.21
0.01
0.01
0.20
0.59
0.02
0.61
0.02
0.03

1.27
0.07
0.09
0.54
0.59
0.49
0.14

1.08
0.06
0.07
0.66
0.02
1.07
1.05
0.55
0.12

Less than HS
coeff.
(s.e)

Table F.1: Parameter Estimates, Men

2.20
-0.16
0.15
-1.06
0.00
4.55

8.48
0.00
0.03
-9.31
-7.34
0.60
0.00
-0.67
-0.07

1.13
0.00
0.00
-3.02
2.55
0.04
-0.73
-0.03
-0.01

0.35
-0.12
0.07
-1.63
0.01
2.21
2.79

3.45
-0.06
0.00
-2.96
0.00
12.38
-9.15
-2.27
0.30

0.74
0.04
0.05
0.14
0.38
0.04

1.04
0.04
0.05
0.69
3.23
0.08
2.97
0.08
0.55

0.13
0.00
0.01
0.46
0.34
0.01
0.29
0.01
0.06

13.19
0.76
1.01
1.24
1.55
1.33
1.27

1.16
0.06
0.08
0.59
0.01
1.32
1.32
1.04
0.67

3.01
-0.21
0.23
-1.05
0.39
4.58

-3.85
0.14
0.06
-0.99
3.21
0.59
0.94
-0.83
0.62

-3.02
0.11
-0.07
-4.48
9.27
-0.06
-6.26
0.05
0.12

0.06
-0.11
0.04
-1.56
-0.11
2.99
2.11

2.80
-0.05
0.02
-1.44
-0.03
11.90
-9.22
-1.17
-0.32

3.91
-0.25
0.27
-1.44
0.39
4.80

1.20
-0.12
0.26
-0.72
-28.95
1.64
31.34
-1.80
0.62

-4.66
0.17
-0.13
-2.05
9.47
-0.04
-6.84
0.04
0.12

-0.95
-0.11
0.04
-1.56
-0.11
2.99
2.11

1.86
0.01
-0.01
0.02
-0.08
8.97
-7.18
-1.17
-0.32

0.93
0.05
0.07
0.17
NA
0.05

0.37
0.02
0.04
0.23
1.45
0.05
1.61
0.06
NA

0.13
0.01
0.01
0.65
0.47
0.01
0.50
0.01
NA

NA
NA
NA
NA
NA
NA
NA

0.82
0.05
0.07
0.29
0.01
1.05
1.09
NA
NA

Bachelors +
coeff.
(s.e)

table continues on next page

0.94
0.05
0.07
0.18
1.28
0.05

0.53
0.03
0.05
0.57
2.25
0.07
2.31
0.07
1.16

0.15
0.01
0.01
3.35
0.44
0.01
0.48
0.01
0.16

12.40
0.74
1.11
1.88
6.29
2.24
0.79

0.97
0.05
0.07
0.47
0.01
1.30
1.39
2.12
0.33

White Men
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

1.41
727.76
9.60e03
3.39

0.92
4.59
0.12
0.13

15.75
0.53
0.42
60.47
1.79
60.19
1.79
1.61

2.41
0.12
0.16
10.33
0.37
10.19
0.34
0.57

-6.01
4.10
4.46
0.00

3.83
-0.51
-0.13
0.96

-0.02
-2.59
3.25
50.00
3.06
29.48
-5.93
1.54

-4.88
0.28
-0.23
29.44
-0.60
-20.09
0.40
-1.82

2.3e02
6.4e02
1.2e03
1.8e03

1.1e02
2.5e02
0.35
0.66

24.24
0.82
0.65
88.06
2.48
85.71
2.48
1.9e03

2.22
0.15
0.25
12.80
0.31
13.28
0.29
2.2e02

Bachelors +
coeff.
(s.e)

-2.19
0.32
2.45
-3.35

2.76
-0.57
0.26
0.30

3.57
1.50
0.73
0.30
-10.49
5.94
11.23
2.27

4.10
-0.26
0.44
24.79
-0.50
-19.66
0.46
-0.58

0.51
0.93
0.69
1.19

0.21
0.28
0.07
0.07

6.47
0.23
0.23
30.54
0.98
32.11
1.03
0.80

1.16
0.07
0.10
4.86
0.14
5.33
0.15
0.13

Less than HS
coeff.
(s.e)

-3.13
0.00
0.00
-50.00

3.71
-3.58
0.28
0.27

18.10
1.14
1.25
-39.51
-11.22
50.00
11.09
-11.55

-0.01
0.04
0.10
30.25
-0.73
-25.71
0.64
-0.43

6.1e03
6.1e03
4.7e03
1.2e05

0.35
83.88
0.14
0.09

4.73
0.17
0.14
23.11
0.72
22.95
0.72
2.26

1.23
0.06
0.08
5.14
0.15
4.94
0.14
1.29

-14.10
10.53
16.68
10.89

3.15
-2.80
-0.37
-0.04

-39.38
2.80
-1.72
-12.11
-9.63
50.00
11.20
-22.59

13.71
-0.51
0.69
0.00
-0.01
0.09
-0.01
-1.14

24.83
25.17
33.92
27.67

0.26
8.05
0.14
0.10

2.84
0.14
0.19
22.78
0.77
29.49
0.99
9.63

5.75
0.21
0.21
16.00
0.44
12.55
0.34
2.21

White Men
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

-15.12
10.53
16.68
10.89

2.15
-3.81
-0.83
0.35

-0.01
1.03
-3.26
-3.09
7.22
-50.00
-6.68
-22.59

19.84
-0.67
0.77
-0.02
-0.05
0.00
0.00
-1.14

NA
NA
NA
NA

0.12
NA
0.12
0.09

2.93
0.15
0.20
18.73
0.49
19.88
0.51
NA

8.61
0.28
0.24
24.37
0.58
18.86
0.45
NA

Bachelors +
coeff.
(s.e)

Note: “NA” indicates that the coefficient in question was not estimated, but was based on coefficients for another race-gender-education group. “NaN” indicates that numerical gradients could not
be calculated. Coefficients bounded by ±50.
†
Coefficient fixed at zero.

1.25
1.96
1.63
6.19

Probability of a Criminal Record, Initial Distribution
Constant
-1.25
0.69
-2.20
Not employed
-6.36
1.73e02
-0.73
In jail
1.25
0.79
1.76
p̃
-4.04
2.03
-7.52
-2.86
-4.02
10.70
-3.05

3.12
-0.81
0.25
0.60

0.38
0.56
0.10
0.10

0.25
0.28
0.12
0.13

3.22
-0.68
0.22
0.50

1.82
-0.37
0.12
0.23

Initial Distribution
Working
In jail
Kumaraswamy α
Kumaraswamy β

-5.83
0.23
0.24
50.00
-1.67
-41.08
1.42
-0.73
0.00
2.22
-1.17
-2.73
-5.99
0.00
6.22
-22.93

Generates Positive Earnings
1.17
0.29
1.08
0.06
-0.08
0.06
0.08
0.29
0.09
4.68
28.07
4.88
0.13
-0.73
0.15
5.19
-20.85
5.50
0.14
0.63
0.16
0.13
-0.59
0.23

Black Men
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

Distribution of Observed Earnings, Dispersion Parameter σ
Constant
17.96
19.30
-17.26
12.95
Age
1.86
0.66
2.41
0.43
Age2 /100
-0.78
0.40
-1.06
0.25
p̃
-33.73
71.39
25.00
49.58
p̃× age
-7.08
2.22
-6.01
1.54
2
p̃
0.00
68.60
7.92
47.42
2
p̃ × age
8.07
2.17
5.18
1.46
Criminal record
-9.31
0.97
-4.27
0.89

Probability that Latent Working State
Constant
2.22
Age
-0.16
2
Age /100
0.28
p̃
13.10
p̃× age
-0.25
p̃2
-5.87
p̃2 × age
0.14
Criminal record
-0.82

Less than HS
coeff.
(s.e)

Table F.1: Parameter Estimates, Men (continued)


0.29
0.01
0.02
0.34
0.95
0.02
1.03
0.03
NA
0.83
0.05
0.07
0.27
4.15
0.11
4.59
0.12
NA
1.63
0.09
0.11
0.19
NA
0.08

Latent Earnings Decile, Kumaraswamy Parameter α
Constant
0.27
0.63
-1.81
Age
0.05
0.03
0.06
2
Age /100
-0.09
0.04
-0.03
Not employed
-2.58
1.07
-0.70
p̃
-8.14
2.25
7.29
p̃× age
0.15
0.06
0.00
2
p̃
10.55
2.25
-3.88
p̃2 × age
-0.11
0.06
-0.01
Criminal record
-0.83
2.7e05
0.01

Latent Earnings Decile, Kumaraswamy Parameter β
Constant
2.88
1.30
0.47
Age
0.05
0.06
-0.02
2
Age /100
-0.10
0.09
0.09
Not employed
-3.34
0.85
-0.25
p̃
-16.74
4.75
-0.33
p̃× age
0.24
0.12
0.70
p̃2
13.98
5.25
1.54
p̃2 × age
-0.12
0.13
-0.81
Criminal record
2.51
5.0e05
7.94

Probability that Individual Is Interviewed
Constant
2.66
1.38
Age
-0.15
0.07
2
Age /100
0.16
0.09
Not employed
0.04
0.23
Criminal record
0.49
4.86
Observed prior wave
4.13
0.10
4.83
-0.28
0.32
-0.31
1.78
4.46

NA
NA
NA
NA
NA
NA
NA

17.64
1.16
2.15
11.18
20.44
4.32
2.77

Probability of Incarceration Next Period
Constant
-7.57
Age
0.31
2
Age /100
-0.49
Not employed
-3.44
p̃
-2.13
In jail
2.74
Criminal record
2.19
-12.82
0.45
-0.63
-2.18
-5.73
0.25
3.87

Ability State Next Year
1.46
-0.80
1.02
0.08
0.12
0.05
0.10
-0.17
0.07
1.02
-1.06
0.64
0.02
-0.05
0.02
2.66
8.44
1.35
2.60
-5.55
1.42
6.57
-1.78
NA
0.57
0.39
NA

2.42
-0.15
0.16
-0.16
1.78
4.19

0.03
-0.01
0.06
-0.12
-7.57
0.89
8.17
-0.98
7.94

-2.31
0.07
-0.05
-1.15
7.45
0.00
-4.59
0.00
0.01

-12.01
0.45
-0.63
-2.18
-5.73
0.25
3.87

-0.11
0.05
-0.08
-0.73
-0.04
10.03
-7.25
-1.78
0.39

1.25
0.07
0.09
0.22
3.6e02
0.07

0.75
0.04
0.06
0.25
2.94
0.09
3.18
0.09
1.7e06

0.31
0.01
0.01
0.35
1.06
0.02
1.05
0.02
1.3e05

3.9e02
19.38
25.58
1.2e02
7.3e02
6.1e02
1.7e02

1.71
0.12
0.16
0.41
0.01
1.20
1.29
3.7e02
20.17

Black Women
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

Probability of a Positive Latent Earnings
Constant
-4.44
Age
0.28
2
Age /100
-0.37
Not employed
0.55
Not employed × age
-0.10
p̃
9.65
p̃²
-7.20
In jail
-0.91
Criminal record
0.40

Less than HS
coeff.
(s.e)

3.86
-0.22
0.25
-0.23
1.78
4.18

-0.88
0.10
-0.11
-1.32
-18.77
1.08
20.71
-1.21
7.94

-2.26
0.10
-0.10
-1.95
3.00
0.09
0.00
-0.10
0.01

-12.03
0.45
-0.63
-2.18
-5.73
0.25
3.87

2.58
-0.05
0.03
-1.94
-0.01
9.61
-6.85
-1.78
0.39

1.70
0.09
0.12
0.30
NA
0.08

0.80
0.05
0.06
0.25
2.58
0.08
2.99
0.09
NA

0.22
0.01
0.01
0.47
0.80
0.02
0.93
0.02
NA

NA
NA
NA
NA
NA
NA
NA

0.97
0.05
0.07
0.54
0.01
1.23
1.49
NA
NA

Bachelors +
coeff.
(s.e)

4.03
-0.19
0.18
-0.27
1.87
3.95

3.75
-0.18
0.18
0.04
-16.26
1.04
14.63
-1.06
-0.08

-0.63
0.01
-0.05
-1.40
0.17
0.23
0.94
-0.19
-0.34

-11.60
0.42
-0.67
-0.56
-0.15
-18.49
3.88

-1.21
0.11
-0.15
-0.21
-0.06
5.71
-2.90
-1.27
-0.62

1.00
0.05
0.06
0.11
6.39
0.05

0.66
0.03
0.04
0.24
2.51
0.08
2.65
0.08
2.56

0.33
0.01
0.01
0.56
1.24
0.03
1.21
0.03
0.48

22.95
1.50
2.08
5.78
7.84
1.1e09
2.26

0.63
0.03
0.05
0.37
0.01
0.67
0.67
4.05
0.61

Less than HS
coeff.
(s.e)

Table F.2: Parameter Estimates, Women

2.44
-0.16
0.15
-0.48
1.87
4.59

2.14
-0.07
0.10
-1.31
-2.89
0.63
0.03
-0.63
-0.08

-1.25
0.04
-0.03
-2.31
6.34
0.01
-4.78
0.01
-0.34

-12.79
0.42
-0.67
-0.56
-0.15
-18.49
3.88

-0.78
0.10
-0.14
-1.24
-0.04
7.69
-5.35
-1.27
-0.62

0.66
0.04
0.05
0.10
NA
0.04

0.41
0.02
0.03
0.13
1.36
0.04
1.48
0.05
NA

0.15
0.01
0.01
0.32
0.41
0.01
0.41
0.01
NA

NA
NA
NA
NA
NA
NA
NA

0.49
0.03
0.04
0.27
0.01
0.54
0.54
NA
NA

2.26
-0.16
0.15
-0.20
1.87
4.70

-0.50
0.06
-0.06
-1.90
-8.07
0.65
8.36
-0.74
-0.08

-2.57
0.08
-0.08
-6.24
6.09
0.04
-3.86
-0.03
-0.34

-12.49
0.42
-0.67
-0.56
-0.15
-18.49
3.88

1.77
0.01
-0.03
-2.76
-0.01
8.09
-6.08
-1.27
-0.62

2.06
-0.14
0.13
-0.59
1.87
4.91

-2.18
0.15
-0.16
-1.39
-16.45
0.99
18.74
-1.14
-0.08

-2.29
0.10
-0.10
-1.67
2.78
0.10
-0.01
-0.10
-0.34

-12.28
0.42
-0.67
-0.56
-0.15
-18.49
3.88

4.70
-0.16
0.20
-2.55
-0.02
8.09
-6.30
-1.27
-0.62

0.76
0.04
0.05
0.12
NA
0.05

0.33
0.02
0.03
0.13
1.12
0.04
1.21
0.04
NA

0.12
0.01
0.01
0.22
0.37
0.01
0.40
0.01
NA

NA
NA
NA
NA
NA
NA
NA

0.60
0.03
0.04
0.31
0.01
0.60
0.64
NA
NA

Bachelors +
coeff.
(s.e)

table continues on next page

0.94
0.05
0.07
0.17
NA
0.05

0.37
0.02
0.03
0.36
1.20
0.04
1.32
0.04
NA

0.16
0.01
0.01
2.72
0.60
0.01
0.63
0.02
NA

NA
NA
NA
NA
NA
NA
NA

0.67
0.04
0.05
0.37
0.01
0.79
0.84
NA
NA

White Women
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

41.28
57.54
NaN
2.0e02

0.25
4.3e04
0.19
0.12

5.57
0.22
0.23
39.18
1.24
49.46
1.55
1.4e02

2.28
0.10
0.13
9.07
0.25
8.66
0.23
4.6e04

-21.38
-0.62
-27.20
-0.46

1.81
-10.44
-0.06
1.24

39.08
1.04
-0.04
-50.00
-5.85
0.01
7.37
-32.07

-1.57
0.00
0.58
50.00
-1.64
-39.38
1.29
3.76

NA
NA
NA
NA

0.27
NA
0.11
0.18

19.14
0.60
0.40
72.88
2.13
73.60
2.20
NA

3.07
0.19
0.32
16.77
0.55
16.46
0.52
NA

Bachelors +
coeff.
(s.e)

-4.05
-0.28
-15.88
-5.33

1.26
-11.45
-0.09
0.25

44.34
-1.44
1.42
-50.00
3.03
-38.69
-2.80
-30.41

-3.23
-0.01
0.48
31.93
-1.09
-16.23
0.74
0.72

41.28
57.54
NaN
2.0e02

0.28
2.4e04
0.21
0.12

9.84
0.40
0.42
32.67
0.87
37.26
1.01
64.83

1.22
0.06
0.11
5.34
0.17
5.68
0.16
17.00

Less than HS
coeff.
(s.e)

-5.24
-0.28
-15.88
-5.33

2.11
-12.64
0.22
0.42

7.67
1.47
0.63
1.63
-6.94
50.00
5.92
-30.41

-1.21
-0.06
0.36
35.01
-0.83
-23.75
0.59
0.72

NA
NA
NA
NA

0.13
NA
0.06
0.05

10.17
0.35
0.22
39.57
1.26
39.76
1.26
NA

0.99
0.06
0.09
3.54
0.11
3.62
0.10
NA

-4.93
-0.28
-15.88
-5.33

2.84
-12.34
-0.36
0.24

-2.82
-0.80
0.36
50.00
4.80
50.00
-5.82
-30.41

4.69
-0.39
0.81
46.71
-1.04
-36.56
0.82
0.72

NA
NA
NA
NA

0.24
NA
0.09
0.07

3.85
0.17
0.21
26.20
0.77
29.07
0.79
NA

1.44
0.11
0.20
9.39
0.26
9.14
0.25
NA

White Women
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

-4.73
-0.28
-15.88
-5.33

3.21
-12.13
-0.41
0.73

-0.38
-2.90
-0.47
-35.80
11.82
0.28
-10.62
-30.41

4.18
-0.28
0.65
30.25
-0.71
-17.76
0.41
0.72

NA
NA
NA
NA

0.24
NA
0.06
0.06

15.50
0.49
0.24
59.96
1.84
57.47
1.76
NA

1.23
0.09
0.16
5.73
0.17
5.57
0.15
NA

Bachelors +
coeff.
(s.e)

Note: “NA” indicates that the coefficient in question was not estimated but was instead based on coefficients for another race-gender-education group. “NaN” indicates that numerical gradients could not be calculated. Coefficients are bounded by ±50.

NA
NA
NA
NA

Probability of a Criminal Record, Initial Distribution
Constant
-24.10
6.3e07
-22.17
Not employed
-0.16
NaN
-0.62
In jail
-21.71
NaN
-27.20
p̃
0.61
NaN
-0.46
-21.36
-0.62
-27.20
-0.46

1.22
-10.42
-0.44
0.22

0.20
NA
0.16
0.15

0.33
NaN
0.26
0.26

1.19
-11.23
-0.07
0.32

0.54
-46.21
-0.01
0.81

Initial Distribution
Working
In jail
Kumaraswamy α
Kumaraswamy β

3.49
-0.12
0.33
16.34
-0.56
-9.89
0.40
3.76
5.61
0.85
0.15
-47.75
-8.52
50.00
10.86
-32.07

Generates Positive Earnings
1.68
-2.36
1.21
0.08
0.07
0.07
0.11
0.02
0.10
8.40
26.92
5.96
0.23
-0.51
0.16
9.50
-19.61
6.29
0.25
0.39
0.16
6.36
3.76
NA

Black Women
HS Diploma
Some College
coeff.
(s.e)
coeff.
(s.e)

Distribution of Observed Earnings, Dispersion Parameter σ
Constant
28.19
25.41
7.50
11.98
Age
0.93
0.85
1.71
0.42
Age²/100
-0.81
0.70
0.85
0.32
p̃
0.23
81.13
3.08
60.24
p̃× age
-6.18
2.12
-13.18
1.87
p̃²
0.05
74.48
-0.24
66.25
p̃² × age
6.26
1.97
14.61
2.06
Criminal record
-43.54
5.00
-32.07
NA

Probability that Latent Working State
Constant
-3.78
Age
0.02
Age²/100
0.14
p̃
24.04
p̃× age
-0.45
p̃²
-13.95
p̃² × age
0.29
Criminal record
1.69

Less than HS
coeff.
(s.e)

Table F.2: Parameter Estimates, Women (continued)
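
Tables F.1 and F.2 report coefficients for Kumaraswamy parameters α and β. For readers unfamiliar with that distribution, the following is a minimal sketch of its standard CDF on (0, 1), F(x) = 1 − (1 − x^a)^b, and of inverse-CDF sampling. The parameter values, the function names, and the rule that maps a draw to a decile are assumptions made for illustration; the link between the tabulated coefficients and α and β is not reproduced here.

```python
import random


def kumaraswamy_cdf(x, a, b):
    """Standard Kumaraswamy CDF on (0, 1) with shape parameters a, b > 0."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_quantile(u, a, b):
    """Inverse of the CDF above: returns x such that F(x) = u."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def draw_decile(a, b, rng):
    """Draw x by inverse-CDF sampling and bin it into deciles 1..10.

    The equal-width binning is a hypothetical discretization, not
    necessarily the one used in the paper.
    """
    x = kumaraswamy_quantile(rng.random(), a, b)
    return min(int(x * 10) + 1, 10)


rng = random.Random(0)
draws = [draw_decile(2.0, 5.0, rng) for _ in range(1000)]  # illustrative a, b
print(sorted(set(draws)))
```

With a = b = 1 the distribution reduces to the uniform, which gives a quick sanity check on both functions.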

G Model Fits and Observation Bias

This appendix gives the model fits for incarceration (Figure G.1) and nonemployment
(Figure G.2), as well as the bias induced by conditioning on observed outcomes (Figure
G.3).
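
To illustrate how conditioning on observed outcomes can bias measured rates, consider a minimal sketch with hypothetical numbers (none are estimates from the paper). If individuals in a given state are less likely to be interviewed than others, that state is underrepresented among respondents. The function name and the probabilities below are assumptions for illustration only.

```python
def observed_rate(true_rate, p_obs_in_state, p_obs_out_of_state):
    """Share of respondents who are in the state, when the probability of
    being observed depends on whether the individual is in the state."""
    seen_in = true_rate * p_obs_in_state
    seen_out = (1.0 - true_rate) * p_obs_out_of_state
    return seen_in / (seen_in + seen_out)


# Hypothetical values: 10% of the population is in the state; those in the
# state are observed with probability 0.6, everyone else with probability 0.9.
true_rate = 0.10
biased = observed_rate(true_rate, 0.6, 0.9)
# With equal response probabilities the observed rate equals the true rate.
unbiased = observed_rate(true_rate, 0.9, 0.9)
print(f"true={true_rate:.3f} biased={biased:.3f} unbiased={unbiased:.3f}")
```

The gap between the two rates depends only on the ratio of the two response probabilities, so any state-dependent nonresponse shifts the observed rate away from the true one.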
Figure G.1: Incarceration Rates, Model and Data, Men
[Figure not reproduced in this text version. Panel titles: Incarceration Rates for BML, WML, BMH, WMH, BMS, WMS, and BMC. The horizontal axis of each panel runs from 20 to 60. Series: ever incarcerated (data), currently incarcerated (data), ever incarcerated (model), and currently incarcerated (model).]

Figure G.2: Nonemployment Rates, Model and Data, Men
[Figure not reproduced in this text version. Panel titles: Nonemployment Rates for BML, WML, BMH, WMH, BMS, WMS, BMC, and WMC. The horizontal axis of each panel runs from 20 to 60. Series: data and model.]

Figure G.3: Model-predicted Incarceration and Nonemployment Rates, with and without Observation Bias, Men
[Figure not reproduced in this text version. Panel titles: Incarceration and Nonemp. Rates for BML, WML, BMH, WMH, BMS, WMS, and BMC; Nonemployment Rates, WMC. The horizontal axis of each panel runs from 20 to 60. Series: ever incarcerated (unbiased, biased) and nonemployed (unbiased, biased); the WMC panel lists only the nonemployed series.]

H Decomposition Analysis, Details

This appendix provides the numbers underlying the decomposition exercise in Section 6.
Table H.1: Summary statistics by race and education, decomposition exercises

Benchmark
Variable                   BML   WML   BMH   WMH   BMS   WMS   BMC   WMC
Lifetime earnings avg.     189   346   306   506   378   561   631   950
Lifetime earnings p10       28    98    83   193   125   220   256   387
Lifetime earnings p50      151   335   270   501   353   539   567   819
Lifetime earnings p90      417   629   598   835   670   994  1142  1912
Expected years E          21.3  27.8  26.4  31.9  28.6  31.7  30.3  33.1
Expected years J           3.2   1.3   1.1   0.1   1.2   0.2   0.2   0.0
Expected years N          11.5   7.0   8.5   4.0   6.2   4.1   5.6   2.9
Ever-J rate, old          0.44  0.23  0.24  0.03  0.17  0.05  0.06  0.02

Earnings bin values switched across races
Variable                   BML   WML   BMH   WMH   BMS   WMS   BMC   WMC
Lifetime earnings avg.     236   279   402   386   465   456   871   683
Lifetime earnings p10       38    75   116   140   158   175   342   288
Lifetime earnings p50      190   272   357   380   436   437   770   598
Lifetime earnings p90      516   504   776   647   820   806  1618  1329
Expected years E          21.3  27.8  26.4  31.9  28.6  31.7  30.3  33.1
Expected years J           3.2   1.3   1.1   0.1   1.2   0.2   0.2   0.0
Expected years N          11.5   7.0   8.5   4.0   6.2   4.1   5.6   2.9
Ever-J rate, old          0.44  0.23  0.24  0.03  0.17  0.05  0.06  0.02

Earnings shocks switched across races
Variable                   BML   WML   BMH   WMH   BMS   WMS   BMC   WMC
Lifetime earnings avg.     279   236   386   402   456   465   683   871
Lifetime earnings p10       75    38   140   116   175   157   288   342
Lifetime earnings p50      272   190   380   357   437   436   598   770
Lifetime earnings p90      504   515   647   775   806   819  1328  1618
Expected years E          27.8  21.3  31.9  26.4  31.7  28.6  33.1  30.3
Expected years J           1.3   3.2   0.1   1.1   0.2   1.2   0.0   0.2
Expected years N           7.0  11.5   4.0   8.5   4.1   6.2   2.9   5.6
Ever-J rate, old          0.23  0.44  0.03  0.24  0.05  0.17  0.02  0.06

Note: [B,W][M,F][L,H,S,C] denote black/white, male/female, less than high school/high
school/some college/college graduate; E means employed; J means jailed or incarcerated;
N means nonemployed; earnings are pre-tax thousands of 1982-1984 dollars.
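
As a reading aid, the sketch below uses the average lifetime earnings in Table H.1 to compute the share of the benchmark black-white gap that closes for black men under each switch. It assumes that the BM columns of the two "switched" panels are counterfactuals for black men with everything else held at benchmark. The "share of gap closed" summary and all names in the code are illustrative choices, not necessarily the statistic reported in Section 6.

```python
# Average lifetime earnings for men (thousands of 1982-1984 dollars), taken
# from Table H.1. Keys are education groups: L, H, S, C.
benchmark = {"L": (189, 346), "H": (306, 506), "S": (378, 561), "C": (631, 950)}
# Black men's averages in the "bin values switched" and "shocks switched"
# panels (BML, BMH, BMS, BMC columns); this reading is an assumption.
black_bins_switched = {"L": 236, "H": 402, "S": 465, "C": 871}
black_shocks_switched = {"L": 279, "H": 386, "S": 456, "C": 683}


def share_of_gap_closed(counterfactual, black, white):
    """Fraction of the benchmark white-minus-black gap closed by a switch."""
    return (counterfactual - black) / (white - black)


shares = {}
for edu, (black, white) in benchmark.items():
    via_bins = share_of_gap_closed(black_bins_switched[edu], black, white)
    via_shocks = share_of_gap_closed(black_shocks_switched[edu], black, white)
    shares[edu] = (via_bins, via_shocks)
    print(f"{edu}: bin values {via_bins:.2f}, shocks {via_shocks:.2f}")
```

The two shares need not sum to one, because bin values and shocks interact in determining lifetime earnings.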
